Wan 2.2 VBVR in ComfyUI | Video-Based Video Reasoning

Workflow Name: RunComfy/Wan-2.2-VBVR-in-ComfyUI
Workflow ID: 0000...1378
This advanced workflow uses the Wan 2.2 VBVR model to generate context-aware, coherent video sequences with built-in reasoning. You can produce videos that maintain temporal stability, logical motion, and well-structured scene interactions. It helps you turn complex prompts into expressive narrative animations and AI-driven video simulations. Designed for creators and developers, the system enhances realism, coordination, and video clarity. Perfect for storytelling, simulation, or visual reasoning projects, it drastically improves generation control and output quality.

Wan 2.2 VBVR in ComfyUI: reasoning‑aware image‑to‑video generation

Wan 2.2 VBVR in ComfyUI is a production‑ready workflow that brings Video‑Based Video Reasoning to Wan 2.2 image‑to‑video. It augments the standard Wan 2.2 Mixture‑of‑Experts pipeline with a reasoning‑tuned model path and an optional VBVR LoRA path, so your videos track objects, actions, and causal events with stronger temporal logic and scene consistency.

Built for creative direction, simulation, and story beats that need more than pretty frames, this ComfyUI workflow aligns complex prompts with structured motion and multi‑object interactions. You can choose between the pure VBVR model route or layer VBVR and motion LoRAs on Wan 2.2 for speed, then export ready‑to‑edit MP4s.

Key models in the ComfyUI Wan 2.2 VBVR workflow

  • Wan2.2‑I2V‑A14B (MoE backbone). Two experts specialize in high‑noise and low‑noise phases and switch by SNR during denoising, giving higher capacity without extra per‑step cost. This is the primary generator the workflow extends and blends. Model card • Technical details
  • VBVR‑Wan2.2. A fine‑tune of Wan2.2‑I2V‑A14B on a large video‑reasoning suite to improve temporal, causal, and multi‑object reasoning while keeping the architecture unchanged. Use this when you want the strongest reasoning alignment. Model card • Paper
  • Wan 2.x VAE. High‑compression video autoencoder that reconstructs frames efficiently for 480p–720p workflows; Wan 2.2 describes a compression design that enables fast 720p generation. Overview
  • uMT5‑XXL text encoder. Robust multilingual T5‑family encoder used to derive prompt embeddings for Wan 2.2 text and image‑to‑video pipelines. Model card
  • Motion and reasoning LoRAs for Wan 2.2. The workflow can load VBVR LoRA for reasoning bias and LightX2V step‑distilled LoRAs for stronger motion amplitude and camera moves. VBVR LoRA example • LightX2V collection
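The SNR‑gated expert handoff in the MoE backbone can be sketched as follows. This is an illustrative approximation only: the expert names and the fixed split ratio are placeholders, since the real model switches on a signal‑to‑noise threshold during denoising rather than a hard‑coded step index.

```python
# Illustrative sketch of Wan 2.2's two-expert MoE handoff, assuming a
# fixed-step approximation of the SNR threshold. Expert names and the
# default split ratio are placeholders, not the model's actual values.

def pick_expert(step: int, total_steps: int, split_ratio: float = 0.5) -> str:
    """Route a denoising step to an expert.

    Early steps are high-noise (low SNR), so they go to the layout-oriented
    high-noise expert; later steps go to the detail-oriented low-noise expert.
    """
    split_step = int(total_steps * split_ratio)
    return "high_noise_expert" if step < split_step else "low_noise_expert"

schedule = [pick_expert(s, total_steps=8) for s in range(8)]
print(schedule)
# The first half of the schedule routes to the high-noise expert,
# the second half to the low-noise expert.
```

Because only one expert is active at each step, the handoff adds model capacity without raising the per‑step compute cost, which is the design point the workflow's two‑sampler split mirrors.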

How to use the ComfyUI Wan 2.2 VBVR workflow

This workflow offers three complementary routes. Each route runs self‑contained, from prompts (and an optional start image) through to the video output, so you can test all three and keep the best take.

  • VBVR Model route
    • Purpose. Use this when you want the strongest video reasoning. It runs a high‑noise and low‑noise pair of VBVR‑calibrated Wan 2.2 models, splitting the denoising schedule between a “layout” stage and a “detail” stage.
    • How it works. The high‑noise stage runs first in WanVideoSampler (#173), then its latents flow into the low‑noise stage WanVideoSampler (#172) that refines motion logic and details. Switching is controlled by the samplers’ start_step and end_step, mirroring Wan 2.2’s SNR‑gated expert handoff.
    • What you set. Provide a start image if desired via LoadImage (#67), and write your prompt in the Text node near the encoder feeding WanVideoTextEncode (#170). Tweak frame count with the small integer node next to the samplers (Int (#168)).
    • Output. Frames are decoded by WanVideoDecode (#164) and assembled as MP4 with VHS_VideoCombine (#176).
  • Wan 2.2 + PainterI2V route
    • Purpose. A fast, general path for cinematic motion. It keeps stock Wan 2.2 I2V models but upgrades the image conditioning with PainterI2VforKJ to correct slow‑motion artifacts common in 4‑step LoRAs.
    • How it works. Your start image is resized for the model, embedded with PainterI2VforKJ (#181), then sampled by WanVideoSampler (#129, #130). The route uses a high‑noise then low‑noise handoff similar to VBVR but sticks to standard Wan 2.2 weights.
    • What you set. Supply your prompt in the Text node that feeds WanVideoTextEncode (#152). If you chain a LightX2V LoRA, this route’s conditioning helps the motion feel more intentional. See the node’s readme for its design goal. PainterI2VforKJ
    • Output. Frames are decoded by WanVideoDecode (#142) and saved with VHS_VideoCombine (#154).
  • Wan 2.2 + VBVR LoRA route
    • Purpose. A hybrid for quick iterations. It layers a VBVR LoRA on the high‑noise Wan 2.2 model and a motion LoRA on the low‑noise model, giving you a reasoning nudge early and clean motion polish late.
    • How it works. The high‑noise stream runs through WanVideoSampler (#27) and the low‑noise stream through WanVideoSampler (#90); both receive text embeddings from WanVideoTextEncode (#16) and optional image conditioning from PainterI2VforKJ (#179). LoRAs are applied per stage to match the expert’s role.
    • What you set. Drop your prompt into the Text node near WanVideoTextEncode (#16) and, if needed, adjust the stage split with the nearby integer widgets labeled “Split_step.” VBVR and LightX2V LoRAs are selectable from the LoRA nodes in this group. VBVR LoRA
    • Output. Frames decode via WanVideoDecode (#28) and export through VHS_VideoCombine (#60).
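All three routes share the same two‑stage latent handoff: the high‑noise sampler produces rough layout latents, which the low‑noise sampler then refines. A minimal sketch of that data flow, with a stand‑in for the sampler body (in the workflow these are WanVideoSampler nodes whose start_step/end_step widgets define each stage's range):

```python
# Sketch of the two-stage latent handoff used by every route. The
# "denoise" update here is a placeholder; in the workflow each stage
# is a WanVideoSampler node covering [start_step, end_step) of one
# shared denoising schedule.

def run_stage(latents: list[float], start_step: int, end_step: int) -> list[float]:
    """Stand-in for one sampler stage over steps [start_step, end_step)."""
    for _ in range(start_step, end_step):
        latents = [x * 0.5 for x in latents]  # placeholder denoising update
    return latents

total_steps, split_step = 8, 4
latents = [1.0, 1.0]                                # stand-in noisy latents
latents = run_stage(latents, 0, split_step)         # high-noise "layout" stage
latents = run_stage(latents, split_step, total_steps)  # low-noise "detail" stage
```

Shifting split_step earlier gives the detail stage more steps (crisper refinement); shifting it later gives the layout stage more steps (stronger structure and motion logic).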

Key nodes in the ComfyUI Wan 2.2 VBVR workflow

  • WanVideoModelLoader (#165, #162)
    • Loads the VBVR‑calibrated high‑noise and low‑noise Wan 2.2 models that map to the early and late denoising experts. Keep the pair consistent so the SNR‑based handoff remains stable. See Wan 2.2’s MoE and SNR switch design for context. Details
  • WanVideoSampler (#173, #172, #129, #130, #27, #90)
    • Drives generation and controls the expert split. Adjust steps to balance detail vs speed, and tune start_step or end_step to shift work between the early layout expert and the late detail expert. Use cfg to trade adherence for motion freedom. Reference implementation is in the wrapper’s sampler nodes. Wrapper repo
  • PainterI2VforKJ (#178, #181, #179)
    • Replaces vanilla image‑to‑video conditioning with a motion‑amplified variant designed to fix slow‑motion when using 4‑step LightX2V LoRAs. It strengthens camera prompts and action beats while keeping subject identity. Node readme
  • WanVideoTextEncode (#170, #152, #16)
    • Encodes positive and negative prompts via a uMT5‑XXL encoder so the samplers receive rich, multilingual semantics. Keep prompt structure clear; separating scene, subject, action, and camera intent typically improves alignment. uMT5
  • VHS_VideoCombine (#176, #154, #60)
    • Assembles decoded frames into an MP4 with metadata for reproducibility. If you pass audio, the node muxes it with the video. This comes from VideoHelperSuite’s utility nodes. VideoHelperSuite
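Beyond the graph editor, the workflow can also be driven programmatically through ComfyUI's stock HTTP API (POST /prompt with a graph exported via “Save (API format)”). The sketch below is hedged: the endpoint and payload shape follow standard ComfyUI, but the export file name, the node id for the frame‑count Int node, and the server address are assumptions for illustration.

```python
import json
import urllib.request

# Hedged sketch: queue an API-format workflow graph through ComfyUI's
# HTTP API. The /prompt endpoint and {"prompt": ..., "client_id": ...}
# payload shape are stock ComfyUI; file name, node id, and host are
# placeholder assumptions.

def build_payload(graph: dict, client_id: str = "vbvr-demo") -> bytes:
    """Wrap an API-format workflow graph for POST /prompt."""
    return json.dumps({"prompt": graph, "client_id": client_id}).encode("utf-8")

def queue_workflow(graph: dict, host: str = "127.0.0.1:8188") -> None:
    """Submit the graph to a running ComfyUI server."""
    req = urllib.request.Request(
        f"http://{host}/prompt",
        data=build_payload(graph),
        headers={"Content-Type": "application/json"},
    )
    urllib.request.urlopen(req)  # raises if the server rejects the graph

# Example: tweak the frame-count Int node before queueing.
# graph = json.load(open("wan22_vbvr_api.json"))  # hypothetical export name
# graph["168"]["inputs"]["value"] = 81            # Int node id from the VBVR route
# queue_workflow(graph)
```

This is handy for batch runs: loop over prompts or seeds, edit the graph dict in place, and queue each variant without touching the UI.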

Optional extras

  • When to pick each route
    • VBVR Model: complex interactions, cause‑effect scenes, or multi‑object choreography.
    • Wan 2.2 + PainterI2V: dynamic motion or camera‑first storytelling with LightX2V LoRAs.
    • Wan 2.2 + VBVR LoRA: quick previews that still benefit from reasoning bias early in the denoising.
  • Prompting tips for Wan 2.2 VBVR in ComfyUI
    • Use short sections like [SCENE], [SUBJECT], [ACTION], [CAMERA], [LIGHTING]. This helps the text encoder separate intent.
    • For object interactions, spell out who does what to whom and in what order.
  • LoRA stacking
    • Keep reasoning‑oriented LoRA on the early expert and motion LoRA on the late expert to avoid over‑regularizing detail. VBVR and LightX2V sets compatible with Wan 2.2 are available here: VBVR LoRA • LightX2V
  • Frame size and aspect
    • The ImageResizeKJv2 nodes near inputs ensure clean divisibility for the VAE and reduce artifacts. Match your start image aspect to the target video for smoother motion propagation.
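The divisibility fix those resize nodes perform can be sketched as below. The divisor of 16 is an assumed value; match it to your VAE's actual spatial compression.

```python
# Minimal sketch of snapping a start image to VAE-friendly dimensions,
# in the spirit of the ImageResizeKJv2 nodes. The divisor of 16 is an
# assumption; use whatever your VAE's spatial compression requires.

def snap_to_divisible(width: int, height: int, divisor: int = 16) -> tuple[int, int]:
    """Round both dimensions down to the nearest clean multiple."""
    return (width // divisor) * divisor, (height // divisor) * divisor

print(snap_to_divisible(1283, 725))  # -> (1280, 720)
```

Snapping before encoding avoids edge padding inside the VAE, which is one common source of border artifacts and shimmer in generated frames.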

Acknowledgements

This workflow implements and builds upon the following works and resources. We gratefully acknowledge @Ai Verse, the author of Wan 2.2 VBVR in ComfyUI Source, for their contributions and maintenance. For authoritative details, please refer to the original documentation and repositories linked below.

Resources

  • YouTube/Wan 2.2 VBVR in ComfyUI Source
    • Docs / Release Notes: Wan 2.2 VBVR in ComfyUI Source @Ai Verse

Note: Use of the referenced models, datasets, and code is subject to the respective licenses and terms provided by their authors and maintainers.
