ComfyUI>Workflows>Wan 2.1 | Revolutionary Video Generation

Wan 2.1 | Revolutionary Video Generation

Workflow Name: RunComfy/Wan-2.1

Workflow ID: 0000...1199

Updated 6/16/2025: ComfyUI version updated to v0.3.39 for improved stability and compatibility. Wan 2.1 surpasses all competitors in video creation benchmarks. Its 1.3B model needs only 8.19GB VRAM, supporting both Text-to-Video and Image-to-Video workflows to produce 480P videos in 4 minutes on standard hardware. The Wan 2.1 14B model offers enhanced 720P quality via RunComfy's cloud. As the first model generating both Chinese and English text in videos, Wan 2.1 expands creative options while its Wan-VAE backend efficiently handles 1080P video processing with temporal consistency.

ComfyUI Wan 2.1 Workflow Description

1. What is Wan 2.1?

The ComfyUI Wan 2.1 workflow is a cutting-edge video generation pipeline that leverages the latest Wan 2.1 models to create high-quality videos from text prompts or/and base images. Wan 2.1 supports Text-to-Video (T2V) and Image-to-Video (I2V) generation, producing 5-second videos with natural motion and professional-grade quality. Wan 2.1 sets a new benchmark for AI video generation, outperforming open-source and commercial alternatives. The Wan 2.1 14B model pushes the limits further, delivering exceptional results up to 720P.

2. Benefits and Capabilities of Wan 2.1

High-quality output: Generates 480P to 720P videos with realistic motion and high-fidelity textures.
Hardware accessibility: The lightweight Wan 2.1 1.3B model requires only 8.19GB VRAM, making it compatible with most modern GPUs (which are provided by RunComfy here!).
Versatile generation: Wan 2.1 Supports both Text-to-Video (T2V) and Image-to-Video (I2V) workflows.
Multilingual support: Wan 2.1 is the first video model capable of generating both Chinese and English text within videos.
VAE efficiency: The Wan-VAE backend efficiently handles 1080P videos while preserving temporal consistency.
Fast processing: The Wan 2.1 1.3B model delivers quick results while maintaining quality.

3. How to Use Wan 2.1

3.1 Wan 2.1 Generation Methods

Wan 2.1

Primary Wan 2.1 Generation Method (disabled by default): Text-to-Video

Inputs: Text prompt
Best for: Creating videos from scratch using textual descriptions
Characteristics:
- Uses the Wan 2.1 1.3B model for faster generation
- Creates 33-frame (5-second) videos at 480P resolution
- Optimized for smooth motion in short clips

Wan 2.1

Advanced Wan 2.1 Method (enabled by default): Image-to-Video with Text Prompt

Inputs: Base image + text prompt
Best for: Animating still images while guiding motion with a prompt
Characteristics:
- Preserves visual elements of the input image
- Allows text control over motion direction
- Uses the Wan 2.1 14B model for higher fidelity
- Creates 33-frame videos at 512x512 resolution

Example Workflow:

In CLIPTextEncode (Positive Prompt / Negative Prompt): Enter your scene description (e.g., "a fox moving quickly in a beautiful winter landscape with trees and mountains during daytime, tracking camera").
In Load Image: Upload your base image.
For further refinement (optional):
- In KSampler: Adjust steps (default: 30) for a quality vs. speed balance.
- In ModelSamplingSD3: Modify scale value (default: 8) for prompt adherence.
Click Queue Prompt to start the generation.
In SaveAnimatedWEBP find your output preview (also saved in ComfyUI > Output folder).

3.2 Parameter Reference for Wan 2.1

KSampler:
- steps: 20-30 (higher values improve quality but increase time)
- cfg: 6.0 (controls prompt adherence strength)
- scheduler: "simple" (determines noise scheduling approach)
- sampler_name: "uni_pc" (recommended sampler for Wan 2.1)
WanImageToVideo:
- width/height: 512 (output resolution)
- length: 33 (frames per video)
- batch_size: 1 (number of videos per run)
ModelSamplingSD3:
- scale: 8 (controls guidance adherence)
EmptyHunyuanLatentVideo:
- width/height: 832/480 (T2V output resolution)
- length: 33 (frames per video)
- batch_size: 1 (number of videos per run)

3.3 Advanced Optimization with Wan 2.1

Memory Optimization:
- Use the Wan 2.1 1.3B model for faster generation with lower VRAM requirements.
- Reduce resolution (e.g., 512x320) for quicker processing.
- Decrease frame count for shorter and faster renders.
Quality Optimization:
- Use the Wan 2.1 14B model for higher-quality output.
- Increase KSampler steps to 30-40 for more refined results.
- Utilize Image-to-Video with a high-quality base image for the best fidelity.

More Information

For additional details on Wan 2.1, visit the Wan-Video GitHub repository.

Credits

The Wan 2.1 model was developed by the Wan Team, and the ComfyUI integration was created by the original developers. Full credit goes to these innovators for advancing AI-powered video generation.

Want More ComfyUI Workflows?

VACE 14B: All-in-One Video Creation & Editing

Create, edit and transform videos with the powerful VACE Wan2.1 14B.

Wan 2.1 Fun | I2V + T2V

Empower your AI videos with Wan 2.1 Fun.

Wan 2.1 Fun | ControlNet Video Generation

Generate videos with ControlNet-style visual passes like Depth, Canny, and OpenPose.

Flux Kontext LoRA | Multiple Views

Generate 5-pose character turnaround sheets from single image

AnimateDiff + IPAdapter V1 | Image to Video

With IPAdapter, you can efficiently control the generation of animations using reference images.

Mesh Graphormer ControlNet | Fix Hands

Mesh Graphormer ControlNet corrects malformed hands in images while preserving the rest.

Omni Kontext | Seamless Scene Integration

Perfect scene fits. Unique style. Identity stays. Kontext keeps it real.

IPAdapter Plus (V2) | Change Clothes

Use IPAdapter Plus for your fashion model creation, easily changing outfits and styles

Support

Resources

Legal

RunComfy

RunComfy is the premier ComfyUI platform, offering ComfyUI online environment and services, along with ComfyUI workflows featuring stunning visuals. RunComfy also provides AI Playground, enabling artists to harness the latest AI tools to create incredible art.