RunComfy


ComfyUI Node: ARVideoI2V

Class Name

ARVideoI2V

Category
conditioning/video_models

Author
ComfyAnonymous (Account age: 763 days)

Extension
ComfyUI

Latest Updated
2026-05-13

Github Stars
112.77K


Table of Contents

Description
ARVideoI2V:
ARVideoI2V Input Parameters:
ARVideoI2V Output Parameters:
ARVideoI2V Usage Tips:
ARVideoI2V Common Errors and Solutions:
Related Nodes

How to Install ComfyUI

Install this extension via the ComfyUI Manager by searching for ComfyUI:

1. Click the Manager button in the main menu.
2. Select the Custom Nodes Manager button.
3. Enter ComfyUI in the search bar.

After installation, click the Restart button to restart ComfyUI. Then, manually refresh your browser to clear the cache and access the updated list of nodes.


ARVideoI2V Description

Transform images into videos with autoregressive models that use Causal Forcing and Self-Forcing techniques.

ARVideoI2V:

The ARVideoI2V node turns a starting image into a video using autoregressive video models, specifically those built on techniques such as Causal Forcing and Self-Forcing. It encodes the start image with a Variational Autoencoder (VAE) and stores the result in the model's transformer options. That stored latent seeds the Key-Value (KV) cache, which initializes the denoising process in autoregressive video models. Because the node uses the same Text-to-Video (T2V) model checkpoint, no separate Image-to-Video (I2V) architecture is needed, which keeps the workflow simple and the output consistent with the T2V model.
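The seeding step described above can be pictured as attaching the encoded image to a copy of the model's options. The following is a minimal sketch in plain Python, not ComfyUI's actual implementation: the `seed_start_latent` helper and the `ar_start_latent` key are hypothetical names chosen for illustration, and a nested list stands in for the latent tensor.

```python
import copy

def seed_start_latent(model_options, start_latent, key="ar_start_latent"):
    """Return a copy of model_options with the start latent stored
    under "transformer_options". The key name is hypothetical."""
    new_options = copy.copy(model_options)
    # Copy the nested dict too, so the caller's options stay untouched.
    transformer_options = dict(new_options.get("transformer_options", {}))
    transformer_options[key] = start_latent
    new_options["transformer_options"] = transformer_options
    return new_options

# Stand-in for a VAE-encoded start image (a real latent would be a tensor).
fake_latent = [[0.0, 0.1], [0.2, 0.3]]

original = {"transformer_options": {}}
seeded = seed_start_latent(original, fake_latent)

print(sorted(seeded["transformer_options"]))  # ['ar_start_latent']
print(original["transformer_options"])        # {}
```

Working on a copy rather than mutating the input mirrors the node's contract of returning a new MODEL output instead of altering the model it received.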

ARVideoI2V Input Parameters:

model

The model used for video generation. It determines the architecture and capabilities available to the node. It has no minimum or maximum value; it is a pre-trained model chosen for the desired output characteristics.

vae

The VAE (Variational Autoencoder) that encodes the start image into latent space. The encoded image is then used to seed the KV cache and initialize autoregressive generation. Like the model parameter, the VAE is a pre-trained component with no value constraints.

start_image

The initial image to be turned into a video. It is encoded by the VAE and serves as the starting point for generation, so its quality and content strongly affect the resulting video.

width

The width of the video frames in pixels. It affects the resolution and aspect ratio of the generated video. Minimum: 16. Maximum: 8192. Default: 832.

height

The height of the video frames in pixels. Like width, it affects the resolution and aspect ratio of the video. Minimum: 16. Maximum: 8192. Default: 480.

length

The number of frames in the generated video, which determines its duration. Minimum: 1. Maximum: 1024. Default: 81.

batch_size

The number of video sequences processed at the same time. Larger values increase the computational load. Minimum: 1. Maximum: 64. Default: 1.
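The numeric ranges and defaults listed above can be checked before a workflow is queued. This is a small sketch, not part of ComfyUI: `ARVideoSettings` and `LIMITS` are hypothetical names, and only the ranges and defaults themselves come from the parameter descriptions.

```python
from dataclasses import dataclass

# (minimum, maximum) for each numeric input, as documented above.
LIMITS = {
    "width": (16, 8192),
    "height": (16, 8192),
    "length": (1, 1024),
    "batch_size": (1, 64),
}

@dataclass(frozen=True)
class ARVideoSettings:
    # Defaults match the documented defaults of the node.
    width: int = 832
    height: int = 480
    length: int = 81
    batch_size: int = 1

    def __post_init__(self):
        # Reject any value outside its documented range.
        for name, (low, high) in LIMITS.items():
            value = getattr(self, name)
            if not low <= value <= high:
                raise ValueError(f"{name}={value} is outside [{low}, {high}]")

defaults = ARVideoSettings()
print(defaults)

try:
    ARVideoSettings(length=0)
except ValueError as error:
    print(error)  # length=0 is outside [1, 1024]
```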

ARVideoI2V Output Parameters:

MODEL

The model after the start image has been encoded and the KV cache has been seeded. Pass this output to the subsequent steps in the video generation workflow; it carries the information they need to continue the autoregressive process.

LATENT

The encoded representation of the start image in latent space. It is used to initialize the video generation process and helps keep the resulting video consistent with the start image.
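To get a feel for how width, height, length, and batch_size translate into latent dimensions, the sketch below computes a latent shape. The compression factors are assumptions, not values stated on this page: it assumes a Wan-style video VAE with 16 latent channels, 8x spatial compression, and 4x temporal compression. Other VAEs use different factors, and `latent_shape` is a hypothetical helper.

```python
def latent_shape(width=832, height=480, length=81, batch_size=1,
                 channels=16, spatial_factor=8, temporal_factor=4):
    """Latent shape as (batch, channels, frames, height, width).

    channels, spatial_factor and temporal_factor are assumed values
    for a Wan-style video VAE; adjust them for the VAE you load.
    """
    # The first frame maps to one latent frame; every further group of
    # temporal_factor frames adds one more.
    latent_frames = (length - 1) // temporal_factor + 1
    return (batch_size, channels, latent_frames,
            height // spatial_factor, width // spatial_factor)

print(latent_shape())  # (1, 16, 21, 60, 104)
```

Under these assumptions, the default 81 frames become 21 latent frames, which is why memory use grows with length more slowly than the raw frame count suggests.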

ARVideoI2V Usage Tips:

Ensure that the start image is of high quality and relevant to the desired video content, as it significantly influences the final output.
Adjust the width and height parameters to match the desired resolution and aspect ratio of your video, keeping in mind the computational resources available.
Experiment with different lengths to achieve the desired video duration, balancing between smooth transitions and computational efficiency.
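When matching width and height to a start image, one option is to keep the image's aspect ratio while holding the pixel count near the default 832 x 480 budget. The helper below is a sketch under stated assumptions: `fit_resolution` is a hypothetical name, and snapping to multiples of 16 is an assumed convention that this page does not specify. Only the 16 to 8192 range comes from the parameter descriptions.

```python
import math

def fit_resolution(src_width, src_height, target_area=832 * 480,
                   multiple=16, low=16, high=8192):
    """Pick a width and height near target_area pixels that keep the
    source aspect ratio, snapped to a multiple (assumed to be 16)."""
    aspect = src_width / src_height
    height = math.sqrt(target_area / aspect)
    width = height * aspect

    def snap(value):
        # Round to the nearest multiple, then clamp to the documented range.
        snapped = round(value / multiple) * multiple
        return max(low, min(high, snapped))

    return snap(width), snap(height)

print(fit_resolution(1920, 1080))  # (848, 480)
```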

ARVideoI2V Common Errors and Solutions:

"Invalid model or VAE input"

Explanation: This error occurs when the model or VAE inputs are not correctly specified or incompatible with the node's requirements.
Solution: Verify that you have selected the correct pre-trained model and VAE compatible with the ARVideoI2V node. Ensure that they are properly loaded and initialized before execution.

"Start image encoding failed"

Explanation: This error indicates a problem with encoding the start image, possibly due to incorrect image format or size.
Solution: Check that the start image is in a supported format and within the acceptable size range. Ensure that the image is correctly loaded and accessible to the node.

ARVideoI2V Related Nodes

Go back to the extension to check out more related nodes.

ComfyUI
