Wan2.2 Fun Inp: First-to-Last Frame Video Generation in ComfyUI
Wan2.2 Fun Inp turns two still images into a coherent video by guiding the model from a first frame to a last frame with natural interpolation in between. It is designed for artists, animators, and filmmakers who want cinematic consistency while retaining prompt control. The workflow ships with two parallel presets so you can prioritize either ultra-fast 4-step synthesis or more general fp8-scaled generation, both powered by Wan 2.2 Fun Inpaint.
Key models in the ComfyUI Wan2.2 Fun Inp workflow
- Wan 2.2 Fun Inpaint 14B (fp8 scaled): the main diffusion backbone specialized for “Fun Inpaint” video generation. Two variants are included: high noise for larger motion and creative transitions, and low noise when you need tighter fidelity to your start/end frames.
  - High noise: wan2.2_fun_inpaint_high_noise_14B_fp8_scaled.safetensors
  - Low noise: wan2.2_fun_inpaint_low_noise_14B_fp8_scaled.safetensors
- Lightning 4-Step LoRA for I2V: an optional LoRA that compresses the sampling schedule to just four steps for rapid iteration, ideal for previews and quick drafts.
  - Low noise LoRA: wan2.2_i2v_lightx2v_4steps_lora_v1_low_noise.safetensors
  - High noise LoRA: wan2.2_i2v_lightx2v_4steps_lora_v1_high_noise.safetensors
- Wan VAE: handles latent–pixel conversions used by Wan models; it preserves detail and tone during decode/encode. See the Wan 2.2 package on Hugging Face.
- CLIP text encoder: encodes your positive and negative prompts into conditioning vectors that steer the visual narrative. Reference implementation: openai/CLIP.
- ComfyUI Video Helper Suite (export): combines generated frames into an MP4 at your chosen frame rate. Repo: ComfyUI-VideoHelperSuite.
How to use the ComfyUI Wan2.2 Fun Inp workflow
Step 1 — Choose a branch
The graph contains two parallel groups you can toggle depending on speed versus generality. Enable only one at a time for clean runs.
Group: Wan2.2_fun_Inp fp8_scaled + 4 steps LoRA
Use this for very fast previews. The group loads the Wan 2.2 backbone plus a Lightning 4-Step LoRA and routes your prompts through the short sampler path. Provide your start and end images, then adjust the high-level parameters as needed. Internally, WanFunInpaintToVideo (#111) seeds the trajectory from first to last frame, while a short sampler refines motion and structure in a handful of steps.
Group: Wan2.2_fun_Inp fp8_scaled
Choose this when you want a broader operating range without the 4-step constraint. This path uses the fp8-scaled Wan 2.2 model directly, maintaining the same first-to-last frame guidance but with a standard sampler budget for more nuanced detail recovery and motion shaping. The node WanFunInpaintToVideo (#148) anchors the trajectory and hands off to the downstream sampler for refinement.
Step 2 — Upload start and end images
Both groups include an Upload start and end images section. Plug in a start image that sets the opening composition and an end image that defines the final pose or scene. The workflow interpolates the motion and appearance between them while respecting your text prompts. For best results, keep the aspect ratio consistent across both images.
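If the two stills are framed differently, it can help to normalize them before loading. Below is a minimal Pillow sketch that center-crops and resizes both images to the default 576×1024 frame; the file names are hypothetical, and you can achieve the same result inside ComfyUI with its image scale/crop nodes.

```python
from PIL import Image

def crop_to_aspect(path, target_w=576, target_h=1024):
    """Center-crop an image to the target aspect ratio, then resize it."""
    img = Image.open(path).convert("RGB")
    src_w, src_h = img.size
    target_ratio = target_w / target_h

    if src_w / src_h > target_ratio:
        # Source is wider than the target: trim the sides.
        new_w = int(src_h * target_ratio)
        left = (src_w - new_w) // 2
        img = img.crop((left, 0, left + new_w, src_h))
    else:
        # Source is taller than the target: trim top and bottom.
        new_h = int(src_w / target_ratio)
        top = (src_h - new_h) // 2
        img = img.crop((0, top, src_w, top + new_h))

    return img.resize((target_w, target_h), Image.LANCZOS)

# Hypothetical file names; replace with your own inputs.
crop_to_aspect("start.png").save("start_576x1024.png")
crop_to_aspect("end.png").save("end_576x1024.png")
```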
Step 3 — Prompt
Write what you want to see in the Positive Prompt and what to avoid in the Negative Prompt. The nodes CLIP Text Encode (Positive Prompt) and CLIP Text Encode (Negative Prompt) transform your text into conditioning that steers content, style, and dynamics. Use concise, scene-oriented phrases (actions, camera cues, materials, mood) rather than long lists.
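For example, a positive prompt like “a woman in a red coat turns toward the camera, slow dolly-in, soft window light, gentle fabric motion” paired with a negative prompt like “blurry, distorted hands, flicker, text, watermark” gives the model clear scene intent. These strings are only illustrative, not the workflow’s saved defaults.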
Step 4 — Video size & length
Set width, height, and length in the WanFunInpaintToVideo node to define spatial resolution and frame count. Defaults are tuned for a tall 576×1024 video with about 3–4 seconds of motion at 24 fps. Longer sequences generally benefit from the fp8-scaled path; short previews are great with the 4-step LoRA group.
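Wan-family video nodes generally expect length to be of the form 4k + 1 (for example 49, 65, or 81 frames) because of the VAE’s temporal compression. If you prefer to think in seconds, a small helper like the sketch below converts a duration into a valid frame count; the function name is hypothetical, and the 4k + 1 constraint is an assumption carried over from other Wan workflows.

```python
def wan_length(seconds: float, fps: int = 24) -> int:
    """Return a frame count of the form 4k + 1 close to seconds * fps.

    Assumes the common Wan constraint that (length - 1) is divisible by 4.
    """
    frames = round(seconds * fps)
    k = max(1, round((frames - 1) / 4))
    return 4 * k + 1

print(wan_length(3.5))    # ~3.5 s at 24 fps -> 85 frames
print(wan_length(5, 16))  # 5 s at 16 fps    -> 81 frames
```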
Export to MP4
VHS_VideoCombine assembles frames into an MP4 with a default 24 fps and a quality-friendly CRF. The file names are prefixed for each branch (for example, Fun_Inp and Fun_Inp_4_Step) so you can compare outputs easily. Adjust the frame rate if you need slower or faster playback.
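If you ever need to re-encode outside ComfyUI, the export is roughly equivalent to running ffmpeg over a frame sequence. The sketch below assumes frames named frame_00001.png onward and a CRF of 19; it is a stand-in, not the exact pipeline VHS_VideoCombine uses.

```python
import subprocess

# Rough stand-in for the MP4 export: 24 fps, H.264, CRF 19, yuv420p for broad
# player compatibility. Frame naming and CRF here are assumptions.
subprocess.run([
    "ffmpeg",
    "-framerate", "24",
    "-i", "frame_%05d.png",
    "-c:v", "libx264",
    "-crf", "19",
    "-pix_fmt", "yuv420p",
    "Fun_Inp.mp4",
], check=True)
```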
Running only one branch
Box-select a group and use Ctrl+B to enable or disable it. If you enable the fp8_scaled group, disable the fp8_scaled + 4 steps LoRA group, and vice versa. You can also use ComfyUI’s partial execution features to run just the sections you are tweaking.
Key nodes in the ComfyUI Wan2.2 Fun Inp workflow
WanFunInpaintToVideo (#111 and #148)
The core engine that blends your start_image and end_image into a continuous latent trajectory. It accepts width, height, and length to set video size and duration, then emits a latent sequence plus updated positive/negative conditioning. Start here when tuning continuity, pacing, or composition across the shot.
UNETLoader (#101, #102)
Chooses the Wan 2.2 Fun Inpaint model variant. Use high noise for bolder motion and more transformative interpolations. Use low noise when preserving the start and end frame identity and texture is the priority. Run either variant with or without the 4-step LoRA, depending on your speed needs.
ModelSamplingSD3 (#93)
Configures the sampling shift that shapes the sigma schedule used downstream. Keep it aligned with the chosen LoRA or fp8 path. If you see temporal flicker, modest adjustments to the shift value (or to the downstream sampler’s steps) can smooth transitions without over-sharpening details.
KSamplerAdvanced (#150)
Applies a refinement pass to the latent sequence. Increase steps slightly if you need crisper micro-detail on faces, hands, or thin structures; reduce steps for softer, dreamier motion. Avoid extreme CFG or step counts that can destabilize temporal consistency.
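As a rough guide, Lightning-style 4-step LoRAs are usually sampled with CFG near 1.0, while the plain fp8-scaled path takes a normal step budget and moderate CFG. The values below are common community starting points, not necessarily the defaults saved in this workflow.

```python
# Illustrative starting points only; tune against your own footage.
SAMPLER_PRESETS = {
    "fp8_scaled + 4 steps LoRA": {"steps": 4,  "cfg": 1.0, "sampler": "euler", "scheduler": "simple"},
    "fp8_scaled":                {"steps": 20, "cfg": 4.0, "sampler": "euler", "scheduler": "simple"},
}
```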
VHS_VideoCombine (#159)
Merges rendered frames to MP4. Adjust frame_rate for motion feel and playback speed, and keep the default pix_fmt for broad player compatibility. Lower CRF yields larger files with finer gradients; higher CRF compresses more aggressively.
Optional extras
- Match the aspect ratio of your start and end images to the selected width×height to reduce unwanted cropping or warping.
- For character shots, keep clothing, lighting, and camera angle broadly consistent between the first and last frames to encourage stable identity.
- Start with a short Wan2.2 Fun Inp preview using the 4-step LoRA group, then switch to the fp8-scaled group for your final render.
- If the middle of the clip feels too static, try the high noise model; if transitions look chaotic, try low noise and simplify the prompt.
- Keep prompts focused on scene intent (action, atmosphere, camera moves) rather than long adjective chains; Wan2.2 Fun Inp responds best to clear direction.
Acknowledgements
The Wan 2.2 Fun Inp workflow expands the creative possibilities of AI video generation by bridging start-to-end frame control with natural interpolation. It’s a versatile tool for artists, animators, and filmmakers who want cinematic consistency in their AI-driven projects.
Special thanks to the ComfyUI and Wan teams for enabling seamless Fun Inp workflow integration into next-gen creative pipelines.
