Wan 2.1 is Alibaba's open-source video AI model designed to create high-quality AI video content. It uses advanced diffusion techniques to generate realistic motion and can render readable on-screen text in both English and Chinese. The model family includes several versions: T2V-1.3B and T2V-14B for text-to-video, plus I2V-14B checkpoints for image-to-video at 480p and 720p.
Using Wan 2.1 online via RunComfy AI Playground: Accessing Wan 2.1 through the RunComfy AI Playground is straightforward. Visit the RunComfy AI Playground, select the Wan 2.1 playground, and enter a text prompt or upload an image. Customize settings such as resolution and duration as desired, then start the video generation. Once it completes, you can preview and download your video. This intuitive interface makes creating high-quality videos with Wan 2.1 both easy and efficient.
Using Wan 2.1 online via RunComfy ComfyUI: To generate videos with the Wan 2.1 workflow in ComfyUI, visit the RunComfy Wan 2.1 Workflow page. There you'll find a fully operational workflow ready for immediate use, with all necessary environments and models pre-configured, so you can create high-quality videos from text prompts or images with minimal effort.
Using Wan 2.1 Locally: clone the official Wan2.1 repository, install its dependencies, download the model weights you need from Hugging Face, and run the generation script from the command line with your chosen task, resolution, and prompt. A setup sketch is shown below.
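As a rough sketch, a first local text-to-video run with the 1.3B model might look like the following. The repository URL, checkpoint name, and flags are taken from the official Wan2.1 README at the time of writing; verify them against the version you install, and treat the prompt as purely illustrative.
git clone https://github.com/Wan-Video/Wan2.1.git && cd Wan2.1
pip install -r requirements.txt
# Fetch the 1.3B text-to-video weights from Hugging Face
huggingface-cli download Wan-AI/Wan2.1-T2V-1.3B --local-dir ./Wan2.1-T2V-1.3B
# Generate a 480p clip from a text prompt
python generate.py --task t2v-1.3B --size 832*480 --ckpt_dir ./Wan2.1-T2V-1.3B --prompt "A red fox trotting across fresh snow at sunrise"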
Visit the RunComfy AI Playground and log in. After accessing your account, select the Wan 2.1 model. For Text-to-Video (T2V) generation, input your descriptive text prompt. For Image-to-Video (I2V) generation, upload your base image and optionally add a guiding text prompt. Configure your video settings, such as resolution (480p or 720p) and duration, then initiate the video generation process. Once completed, you can preview and download your video.
Choose either the Wan 2.1 Workflow or the Wan 2.1 LoRA workflow depending on your needs and log in. The ComfyUI interface allows easy customization. Enter a text prompt or upload an image, or apply LoRA models to adjust style, then set your video preferences. Once you're ready, start the video generation and download the final video when it's done.
LoRA allows you to fine-tune the Wan 2.1 video model with extra parameters to customize style, motion, or other artistic details without retraining the entire model.
Training a Wan 2.1 LoRA model follows a process similar to LoRA training for other diffusion models: in broad terms, you gather and caption a small dataset of reference images or clips, choose LoRA hyperparameters such as rank and learning rate, train the adapter against the base Wan 2.1 checkpoint using a community training toolkit, and then load the resulting LoRA weights alongside the base model at inference time.
Community-created LoRA models for Wan 2.1 are available on Hugging Face. For example: Wan2.1 14B 480p I2V LoRAs
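As an illustrative sketch of putting a community LoRA to work, the file can be downloaded from Hugging Face and placed in ComfyUI's LoRA folder, where the Wan 2.1 LoRA workflow can load it. The repository and file names below are placeholders, not a real LoRA.
# Placeholder repo and file name -- substitute a real Wan 2.1 LoRA from Hugging Face
huggingface-cli download some-author/wan21-example-lora wan21_example_lora.safetensors --local-dir ComfyUI/models/loras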
The Wan 2.1 14B models (both T2V-14B and I2V-14B) typically require a high-end GPU, such as an NVIDIA RTX 4090, to generate high-resolution video efficiently. Under standard settings they produce 5-second 720p videos; with optimizations like model offloading and quantization, they can generate up to 8-second 480p videos using approximately 12 GB of VRAM.
In contrast, the Wan 2.1 T2V-1.3B model is more resource-efficient, requiring around 8.19 GB of VRAM, which makes it well suited to consumer-grade GPUs. It generates a 5-second 480p video on an RTX 4090 in about 4 minutes without additional optimizations, trading slightly lower resolution and speed for the reduced VRAM usage compared with the 14B models.
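When VRAM is the bottleneck, the offloading options of the official generate.py script are the usual first step. The flag names below are assumed from the Wan2.1 repository README and the prompt is illustrative; check your installed version before relying on them.
# Offload weights between steps and keep the T5 text encoder on the CPU to reduce peak VRAM
python generate.py --task t2v-14B --size 1280*720 --ckpt_dir ./Wan2.1-T2V-14B --offload_model True --t5_cpu --prompt "A slow pan across a rainy neon city street"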
The NVIDIA RTX 3090, with its 24 GB of VRAM, is well-suited for running the Wan 2.1 T2V-1.3B model in inference mode. This version typically uses around 8.19 GB of VRAM during inference, making it compatible with the RTX 3090's memory capacity.
However, running the more demanding Wan 2.1 T2V-14B model on the RTX 3090 may pose challenges. While the GPU’s 24 GB of VRAM is substantial, users have reported that generating videos with the 14B model requires significant memory and processing power. Some have managed to run the 14B model on GPUs with as little as 10 GB of VRAM, but this often involves trade-offs in performance and may not be practical for all users.
The hardware requirements for Wan 2.1 AI video vary depending on the model. The T2V-1.3B version is optimized for efficiency and works well on consumer GPUs with around 8GB of VRAM, producing 480p videos quickly. On the other hand, the T2V-14B model offers higher-quality 720p videos but requires more VRAM to handle its 14 billion parameters.
If you want to try Wan 2.1 AI video without investing in high-end hardware, you can use the RunComfy AI Playground, which offers free credits and an online environment to explore Wan 2.1 and many other AI tools.
To run Wan 2.1 cost-effectively in the cloud, RunComfy offers two primary methods:
1. RunComfy AI Playground: This platform allows you to run Wan 2.1 along with a variety of AI tools. New users receive free credits, enabling them to explore and experiment without initial investment.
2. RunComfy ComfyUI: For a more streamlined experience, RunComfy provides a pre-configured ComfyUI workflow for Wan 2.1 and Wan 2.1 LoRA. All the necessary environments and models are set up, so you can start generating videos right away after logging in.
Additionally, for further cost savings, you can use the more efficient 1.3B model with optimization techniques such as quantization or model offloading (using flags like --offload_model True) to reduce VRAM usage and lower operational costs.
Wan 2.1 supports both text-to-video and image-to-video (I2V) generation. To create an image-to-video clip, provide a still image along with a text prompt describing the desired animation or transformation; the model then animates the image according to the prompt while preserving the content of the source frame.
Locally: Run Wan 2.1 from the command line with the flag --task i2v-14B and specify the image path (e.g., --image examples/i2v_input.JPG) along with your prompt; a full command sketch follows this list.
RunComfy ComfyUI: Use the Wan 2.1 workflow through RunComfy ComfyUI for seamless image-to-video generation.
RunComfy Playground: Simply select the image-to-video mode to get started.
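Putting the local option together, an image-to-video run might look roughly like the command below. The checkpoint directory and flag names follow the official Wan2.1 README, the example image path matches the repository's sample input, and the prompt is illustrative.
# Animate a still image with the 720p image-to-video checkpoint
python generate.py --task i2v-14B --size 1280*720 --ckpt_dir ./Wan2.1-I2V-14B-720P --image examples/i2v_input.JPG --prompt "Summer beach vacation style, gentle waves rolling onto the shore"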
The default, and effectively maximum, video length for Wan 2.1 is 81 frames. In practice, at a typical frame rate of around 16 FPS, that works out to roughly a 5-second clip.
A few additional details: the model's design requires the total number of frames to follow the form 4n + 1 (81 = 4 × 20 + 1, so it fits this rule).
Although some users have experimented with longer sequences (such as 100 frames), the standard (and most stable) configuration is 81 frames, which balances quality with temporal consistency.
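As a quick worked example: 81 frames at 16 fps runs 81 / 16 ≈ 5.06 seconds; the nearest frame counts that also satisfy 4n + 1 are 77 (n = 19) and 85 (n = 21); and an 8-second clip at 16 fps would need 129 frames (n = 32), which is the kind of length the optimized 480p runs mentioned above target.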
Wan 2.1 video is versatile for a range of creative projects. It handles text-to-video and image-to-video generation and even supports video editing. Whether you're creating social media clips, educational content, or promotional videos, Wan 2.1 offers a practical solution. Its ability to generate dynamic visuals and readable text makes it especially useful for content creators and marketers looking to produce high-quality AI video content without a complex setup.
You can easily use Wan 2.1 in ComfyUI for both text-to-video and image-to-video projects. Detailed guides are available on the RunComfy Wan 2.1 Workflow and Wan 2.1 LoRA Workflow pages.
RunComfy offers a pre-configured environment with all necessary models already downloaded. This means you can start generating high-quality Wan 2.1 AI video content immediately—no additional setup required.
RunComfy is the premier ComfyUI platform, offering ComfyUI online environment and services, along with ComfyUI workflows featuring stunning visuals. RunComfy also provides AI Models, enabling artists to harness the latest AI tools to create incredible art.