
Wan2.2 Animate | Photo to Realistic Motion Video

Workflow Name: RunComfy/Wan2.2-Animate
Workflow ID: 0000...1292
This workflow helps you animate static images into complete motion videos that preserve character identity. By combining body pose transfer and facial mocap, it produces natural movement and expressive realism. You can take a driving video and a reference image to create lifelike character animations. It's especially useful for generating avatars, recreating performances, or storytelling projects. The workflow ensures seamless synchronization between reference identity and dynamic movements. With precise facial expressions and smooth body actions, outputs feel true to life. The process is efficient, creative, and designed for high-quality results.

Wan2.2 Animate: full‑motion reference‑to‑video animation in ComfyUI

Wan2.2 Animate turns a single reference image into a lifelike performance that follows a driving video’s full‑body motion and facial expressions. This ComfyUI Wan2.2 Animate workflow fuses pose transfer, face mocap, background control, and LoRA add‑ons so characters move naturally while identity stays intact.

Designed for avatars, performance re‑creations, music videos, and story beats, Wan2.2 Animate produces clean, temporally stable clips with optional audio passthrough, quality upscaling, and interpolation. It ships as a guided graph with sensible defaults, so you can focus on creative choices rather than plumbing.

Key models in the ComfyUI Wan2.2 Animate workflow

  • Wan 2.2 Animate 14B (I2V) fp8 scaled. The core video model that interprets pose, face, image, and text guidance to synthesize the motion track with identity preservation. Model set
  • Wan 2.1 VAE bf16. The matching VAE used to encode/decode latents for the Wan family, ensuring color fidelity and sharpness. VAE
  • UMT5‑XXL text encoder. Provides robust multilingual text conditioning for positive and negative prompts. Encoder
  • CLIP ViT‑H/14 vision encoder. Extracts visual embeddings from the reference image to preserve identity and style. Paper
  • Optional Wan LoRAs. Lightweight adapters for lighting and I2V behavior control, such as Lightx2v I2V 14B and Relight. Lightx2v • Relight
  • Segment Anything 2 (SAM 2). High‑quality image/video segmentation used to isolate the subject or background. Paper
  • DWPose. Accurate 2D pose estimation used for face/pose‑aware crops and masks. Repo
  • RIFE. Fast video frame interpolation to boost playback smoothness. Paper

How to use the ComfyUI Wan2.2 Animate workflow

Overall flow. The graph ingests a driving video and a single reference image, prepares a clean subject/background and a face‑aware crop, then feeds pose, face, image, and text embeds into Wan2.2 Animate for sampling and decode. A final stage upscales details and optionally interpolates frames before export.

  • Models
    • This group loads the Wan2.2 Animate base, matching VAE, text/vision encoders, and any selected LoRAs. The WanVideoModelLoader (#22) and WanVideoSetLoRAs (#48) wire the model and adapters, while WanVideoVAELoader (#38) and CLIPLoader (#175) provide VAE and text backbones.
    • If you plan to adjust LoRAs (e.g., relight or I2V style), keep only one or two active at a time to avoid conflicts, then preview with the provided collage nodes.

Size

  • Set your target width and height in the size group and confirm the frame_count matches the frames you plan to load from the driving video. VHS_LoadVideo (#63) reports the count; keep the sampler’s num_frames consistent to avoid tail truncation.
  • The PixelPerfectResolution (#152) helper reads the driving clip to suggest stable generation sizing.
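The sizing and frame-count advice above can be sketched in a few lines. These helpers are illustrative only (they are not ComfyUI nodes); the parameter names `frame_count` and `num_frames` follow the workflow text.

```python
# Hypothetical sizing helpers mirroring the advice above; not part of ComfyUI.

def snap_to_multiple(value: int, multiple: int = 16) -> int:
    """Round a dimension down to the nearest multiple (minimum one multiple)."""
    return max(multiple, (value // multiple) * multiple)

def target_size(width: int, height: int, multiple: int = 16) -> tuple[int, int]:
    """Snap both dimensions so the latent grid divides evenly."""
    return snap_to_multiple(width, multiple), snap_to_multiple(height, multiple)

def clamp_num_frames(frame_count: int, num_frames: int) -> int:
    """Clamp the sampler's num_frames to what the driving video actually
    provides, avoiding tail truncation or padded black frames."""
    return min(num_frames, frame_count)
```

For example, a 1081x607 request snaps to 1072x592, and asking for 145 frames from a 120-frame clip clamps back to 120.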

Background Masking

  • Load your driving video in VHS_LoadVideo (#63); audio is extracted automatically for later passthrough. Use PointsEditor (#107) to place a few positive points on the subject and run Sam2Segmentation (#104) to generate a clean mask.
  • GrowMask (#100) and BlockifyMask (#108) stabilize and expand edges, and DrawMaskOnImage (#99) gives a quick sanity check. This mask lets Wan2.2 Animate focus on the performer while respecting the original background.
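Conceptually, GrowMask is a morphological dilation: each pass extends the mask by one pixel so edges stabilize before sampling. A minimal pure-Python sketch on nested lists (the real node operates on tensors and is far faster):

```python
# Illustrative dilation, conceptually similar to what GrowMask does;
# not the actual node implementation.

def grow_mask(mask: list[list[int]], grow: int = 1) -> list[list[int]]:
    h, w = len(mask), len(mask[0])
    out = [row[:] for row in mask]
    for _ in range(grow):
        src = [row[:] for row in out]
        for y in range(h):
            for x in range(w):
                if src[y][x]:
                    continue
                # turn a pixel on if any 4-neighbor is inside the mask
                if any(
                    0 <= y + dy < h and 0 <= x + dx < w and src[y + dy][x + dx]
                    for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1))
                ):
                    out[y][x] = 1
    return out
```

A single center pixel grows into a plus shape after one pass and fills a 3x3 block after two, which is why a small grow value is usually enough to close edge gaps without haloing.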

Reference Image

  • Drop in a single, well‑lit portrait or full‑body still. ImageResizeKJv2 (#64) matches it to your working resolution, and the output is stored for the animation stage.
  • For best identity retention, pick a reference image with a clear face and minimal occlusions.

Face Images

  • The pipeline builds a face‑aware crop to drive micro‑expressions. DWPreprocessor (#177) finds pose keypoints, FaceMaskFromPoseKeypoints (#120) isolates the face region, and ImageCropByMaskAndResize (#96) produces aligned face crops. A small preview exporter is included for quick QA (VHS_VideoCombine (#112)).
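The face-crop step can be approximated as: take the detected face keypoints, find their bounding box, pad it, and square it. The sketch below assumes normalized (x, y) keypoints as a pose estimator like DWPose might emit; the function name and padding scheme are hypothetical, not the actual node's API.

```python
# Hypothetical face-crop math; the real ImageCropByMaskAndResize node
# works from a mask, but the geometry is the same idea.

def face_crop_box(keypoints, img_w, img_h, pad=0.25):
    """Return a padded, square (x0, y0, x1, y1) crop box around
    normalized face keypoints, clamped to the image bounds."""
    xs = [x * img_w for x, y in keypoints]
    ys = [y * img_h for x, y in keypoints]
    cx, cy = (min(xs) + max(xs)) / 2, (min(ys) + max(ys)) / 2
    side = max(max(xs) - min(xs), max(ys) - min(ys)) * (1 + 2 * pad)
    half = side / 2
    x0 = max(0, int(cx - half)); y0 = max(0, int(cy - half))
    x1 = min(img_w, int(cx + half)); y1 = min(img_h, int(cy + half))
    return x0, y0, x1, y1
```

Generous padding matters here: a crop that clips the jawline or cheeks is a common cause of muted expressions downstream.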

Sampling & Decode

  • The reference image is embedded via WanVideoClipVisionEncode (#70), prompts are encoded with CLIPTextEncode (#172, #182, #183), and everything is fused by WanVideoAnimateEmbeds (#62).
  • WanVideoSampler (#27) runs the core Wan2.2 Animate diffusion. You can work in “context window” mode for very long clips or use the original long‑gen path; the included note explains when to match the context window to the frame count for stability. The sampler’s output is decoded by WanVideoDecode (#28) and saved with optional audio passthrough (VHS_VideoCombine (#30)).
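Context-window mode for long clips typically follows a sliding-window pattern: fixed-length windows that overlap so adjacent segments stay temporally coherent. The sketch below assumes that common pattern; the exact parameter names in WanVideoSampler may differ.

```python
# Illustrative sliding-window schedule, assuming window + overlap
# semantics; not WanVideoSampler's actual implementation.

def context_windows(num_frames: int, window: int = 81, overlap: int = 16):
    """Return (start, end) frame ranges covering the clip, each window
    overlapping the previous one to preserve temporal coherence."""
    if num_frames <= window:
        return [(0, num_frames)]
    stride = window - overlap
    windows = []
    start = 0
    while start + window < num_frames:
        windows.append((start, start + window))
        start += stride
    windows.append((num_frames - window, num_frames))  # final full window
    return windows
```

This also shows why matching the context window to the frame count matters for short clips: at 81 frames or fewer, a single window covers the whole clip and no stitching is needed.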

Result collage

  • ImageConcatMulti (#77, #66) and GetImageSizeAndCount (#42) assemble a side‑by‑side panel of reference, face, pose, and output. Use it to spot‑check identity and motion alignment before final export.

Upscale and Interpolate

  • UltimateSDUpscaleNoUpscale (#180) refines edges and textures with the provided UNet (UNETLoader (#181)) and VAE (VAELoader (#184)); positive/negative prompts can gently steer detail.
  • RIFEInterpolation (#188) optionally doubles motion smoothness, and VHS_VideoCombine (#189) writes the final Wan2.2 Animate clip.
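The frame-count arithmetic behind interpolation is worth keeping in mind when planning export settings. Inserting in-between frames multiplies the transitions, not the frames, so both the frame total and the fps change as sketched below (illustrative math, not a node API):

```python
def interpolated_clip(frame_count: int, fps: float, factor: int = 2):
    """Interpolation inserts (factor - 1) in-between frames per original
    transition, so N frames become (N - 1) * factor + 1, and fps scales
    by the same factor to keep the clip duration unchanged."""
    return (frame_count - 1) * factor + 1, fps * factor
```

For example, an 81-frame clip at 16 fps becomes 161 frames at 32 fps after 2x interpolation, with identical playback duration.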

Key nodes in the ComfyUI Wan2.2 Animate workflow

  • VHS_LoadVideo (#63)

    • Role. Loads the driving video, outputs frames, extracts audio, and reports the frame count for downstream consistency.
    • Tip. Keep the reported frame total aligned with the sampler’s generation length to prevent early cutoff or black frames.
  • Sam2Segmentation (#104) + PointsEditor (#107)

    • Role. Interactive subject masking that helps Wan2.2 Animate focus on the performer and avoid background entanglement.
    • Tip. A few well‑placed positive points plus a modest GrowMask expansion typically stabilizes complex backgrounds without haloing. See SAM 2 for video‑aware segmentation guidance. Paper
  • DWPreprocessor (#177) + FaceMaskFromPoseKeypoints (#120)

    • Role. Derive robust face masks and aligned crops from detected keypoints to improve lip, eye, and jaw fidelity.
    • Tip. If expressions look muted, verify the face mask covers the full jawline and cheeks; re‑run the crop after adjusting points. Repo
  • WanVideoModelLoader (#22) and WanVideoSetLoRAs (#48)

    • Role. Load Wan2.2 Animate and apply optional LoRAs for relighting or I2V bias.
    • Tip. Activate one LoRA at a time when diagnosing lighting or motion artifacts; stack sparingly to avoid over‑constraint. Models • LoRAs
  • WanVideoAnimateEmbeds (#62) and WanVideoSampler (#27)

    • Role. Fuse image, face, pose, and text conditioning into video latents and sample the sequence with Wan2.2 Animate.
    • Tip. For very long clips, switch to context‑window mode and keep its length synchronized with the intended frame count to preserve temporal coherence. Wrapper repo
  • UltimateSDUpscaleNoUpscale (#180)

    • Role. Lightweight detail pass after decode with tiling support to keep memory steady.
    • Tip. If you see tile seams, modestly increase overlap and keep prompt steering very soft to avoid off‑model textures. KJNodes
  • RIFEInterpolation (#188)

    • Role. Smooths motion by inserting in‑between frames without re‑rendering the clip.
    • Tip. Apply interpolation after upscaling so optical flow sees the final detail profile. Paper
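The "increase overlap" tip for tile seams comes down to simple tiling math: more overlap means a smaller stride and more tiles, giving the upscaler more shared context to blend across seams. A sketch under that assumption (parameter names are illustrative, not UltimateSDUpscale's actual API):

```python
import math

# Illustrative tiling math; not the node's real implementation.

def tiles_along(length: int, tile: int = 512, overlap: int = 64) -> int:
    """Number of tiles needed to cover one dimension, given a tile size
    and the overlap between adjacent tiles."""
    if length <= tile:
        return 1
    stride = tile - overlap
    return math.ceil((length - tile) / stride) + 1
```

For a 1024-pixel dimension with 512-pixel tiles, zero overlap needs 2 tiles while a 64-pixel overlap needs 3, which is the memory-for-quality trade the tip refers to.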

Optional extras

  • For the cleanest identity, choose a sharp, front‑facing reference and keep accessories consistent with the driving video.
  • If background flicker appears, refine the SAM 2 mask and re‑run; masking is often the fastest fix for scene leakage.
  • Keep width and height aligned with your target platform and the input’s aspect ratio; square‑pixel dimensions in multiples of 16 work well in Wan2.2 Animate.
  • Audio from the driving video can be passed through at export; if you prefer silence, disable audio in the save node.
  • Start with one LoRA; if you add relight and I2V together, test each separately first to understand their influence.
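If you want to batch-run this workflow headlessly, ComfyUI exposes a standard HTTP API: export the graph in API format and POST it to the `/prompt` endpoint. The payload shape below is ComfyUI's standard `{"prompt": ..., "client_id": ...}` envelope; the node id and host details are placeholders you should take from your own export and server.

```python
import json
import urllib.request

def build_prompt_payload(workflow: dict, client_id: str = "wan22-animate") -> bytes:
    """Wrap an API-format workflow dict in ComfyUI's /prompt envelope."""
    return json.dumps({"prompt": workflow, "client_id": client_id}).encode()

def queue_prompt(workflow: dict, host: str = "127.0.0.1", port: int = 8188):
    """POST the workflow to a running ComfyUI server (default port 8188)."""
    req = urllib.request.Request(
        f"http://{host}:{port}/prompt",
        data=build_prompt_payload(workflow),
        headers={"Content-Type": "application/json"},
    )
    return urllib.request.urlopen(req)  # server replies with the queued prompt id
```

Before queueing, you can edit inputs (reference image path, driving video path, LoRA strengths) directly in the exported workflow dict, keyed by the node ids shown throughout this guide.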

Links you may find useful:

  • Wan2.2 Animate model and assets by Kijai: WanAnimate models, Wan 2.1 VAE, UMT5 encoder, Lightx2v
  • ComfyUI wrappers and nodes used: ComfyUI‑WanVideoWrapper, ComfyUI‑KJNodes

Acknowledgements

This workflow implements and builds upon the following works and resources. We gratefully acknowledge the Wan2.2 team and @ArtOfficialLabs (Wan2.2 Animate Demo) for their contributions and maintenance. For authoritative details, please refer to the original documentation and repositories linked below.

Resources

  • Wan2.2/Wan2.2 Animate Demo
    • Docs / Release Notes: Wan2.2 Animate Demo @ArtOfficialLabs

Note: Use of the referenced models, datasets, and code is subject to the respective licenses and terms provided by their authors and maintainers.
