LTX 2 Pro: Advanced Image-to-Video with 4K Motion Control

ltx/ltx-2/pro/image-to-video

Generate high-fidelity, cinematic videos from text and image inputs using Lightricks' LTX 2 Pro, featuring up to 2160p resolution and integrated audio generation.

The rate is $0.06 per second for 1080p, $0.12 per second for 1440p, and $0.24 per second for 2160p.
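
As a worked example of the pricing above, the following sketch multiplies the per-second rate by the clip length. It assumes the cost depends only on resolution and duration, as the rate line states; the function names are hypothetical and not part of any RunComfy API. Rates are stored in integer cents to avoid floating-point rounding.

```python
# Per-second rates from the pricing line, in integer cents.
RATE_CENTS_PER_SECOND = {"1080p": 6, "1440p": 12, "2160p": 24}


def estimate_cost_cents(resolution: str, duration_s: int) -> int:
    """Return the estimated cost of one clip in cents (hypothetical helper)."""
    if resolution not in RATE_CENTS_PER_SECOND:
        raise ValueError(f"unknown resolution: {resolution!r}")
    if duration_s <= 0:
        raise ValueError("duration_s must be positive")
    return RATE_CENTS_PER_SECOND[resolution] * duration_s


def format_usd(cents: int) -> str:
    """Format an integer number of cents as a dollar string."""
    return f"${cents // 100}.{cents % 100:02d}"


if __name__ == "__main__":
    # Cost of a 10-second clip at each resolution.
    for res in RATE_CENTS_PER_SECOND:
        print(res, format_usd(estimate_cost_cents(res, 10)))
```

Under these assumptions, a 10-second clip costs $0.60 at 1080p, $1.20 at 1440p, and $2.40 at 2160p.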

Introduction to LTX 2 Pro

LTX 2 Pro Image-to-Video animates a still image into a cinematic, high-fidelity clip with stable motion and optional synchronized audio. It is the better choice when quality matters more than generation speed. The main model page is [LTX 2 Pro Text-to-Video](https://www.runcomfy.com/models/ltx/ltx-2/pro/text-to-video).

What this mode does

LTX 2 Pro Image-to-Video (I2V) turns a single reference image plus a prompt into a coherent video. It is built for maximum fidelity: it preserves subject identity, adds controlled camera and subject motion, and can optionally generate synchronized audio.

Speed profile (Pro)

Optimized for quality, not raw speed.
Higher resolutions (1440p-2160p) and 50 FPS increase latency.
For quick previews, start at 1080p, 6s, 25 FPS; scale up after you lock the look.
Generation time is typically longer than fast text-to-video generation baselines due to higher step counts and 4K capability.
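
To get a rough feel for why higher resolution and FPS raise latency, the sketch below counts the pixels in a clip relative to the suggested preview settings (1080p, 6s, 25 FPS). Several things here are assumptions: the frame sizes are the standard 16:9 dimensions for each label, the frame count is approximated as duration times FPS, and actual generation time is not stated in the source and may not scale linearly with pixel count.

```python
# Standard 16:9 frame sizes for each resolution label (assumed, not
# confirmed as the model's exact output dimensions).
FRAME_SIZE = {
    "1080p": (1920, 1080),
    "1440p": (2560, 1440),
    "2160p": (3840, 2160),
}


def total_pixels(resolution: str, duration_s: int, fps: int) -> int:
    """Approximate pixels in a clip: width * height * frames."""
    width, height = FRAME_SIZE[resolution]
    return width * height * duration_s * fps


def relative_workload(resolution: str, duration_s: int, fps: int) -> float:
    """Pixel count relative to the 1080p / 6s / 25 FPS preview settings."""
    baseline = total_pixels("1080p", 6, 25)
    return total_pixels(resolution, duration_s, fps) / baseline


if __name__ == "__main__":
    for settings in [("1080p", 6, 25), ("1440p", 8, 25), ("2160p", 10, 50)]:
        print(settings, f"{relative_workload(*settings):.2f}x")
```

By this proxy, a 2160p, 10s, 50 FPS clip contains roughly 13 times as many pixels as the preview configuration, which is consistent with the advice to lock the look at low settings first.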

How Image-to-Video works (I2V mechanism)

Image conditioning anchors composition and identity from your image_url.
Your prompt describes motion, atmosphere, and camera moves (e.g., "slow dolly in", "pan right").
The model preserves key visual details from the image while synthesizing temporally consistent motion.

Retake workflow

Use I2V to create a strong first pass, then refine with targeted edits:

1) Generate your base clip here.

2) Send it to Retake to adjust motion intensity, re-time segments, or refine specific regions without starting over: LTX 2 Retake Video

Inputs

image_url (required): JPG/PNG/WEBP/... reference image.
prompt (required): Describe subject, motion, environment, and camera.
duration: 6 / 8 / 10 seconds.
resolution: 1080p / 1440p / 2160p.
aspect_ratio: 16:9.
fps: 25 or 50.
generate_audio: true/false (default true).
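
The inputs above can be collected and checked before submission with a small helper like the one below. This is a minimal sketch, not the official client: `build_inputs` is a hypothetical name, the value types (integer duration and FPS, string resolution) are assumptions, and the defaults for `duration`, `resolution`, and `fps` are this sketch's own choice of the preview settings. Only the `generate_audio` default of true comes from the list above. No endpoint is called, since the request URL and authentication are not described here.

```python
ALLOWED_DURATIONS = (6, 8, 10)
ALLOWED_RESOLUTIONS = ("1080p", "1440p", "2160p")
ALLOWED_ASPECT_RATIOS = ("16:9",)
ALLOWED_FPS = (25, 50)


def build_inputs(
    image_url: str,
    prompt: str,
    *,
    duration: int = 6,            # sketch default, not a documented API default
    resolution: str = "1080p",    # sketch default, not a documented API default
    aspect_ratio: str = "16:9",
    fps: int = 25,                # sketch default, not a documented API default
    generate_audio: bool = True,  # documented default
) -> dict:
    """Validate the documented inputs and return them as a dict."""
    if not image_url or not image_url.strip():
        raise ValueError("image_url is required")
    if not prompt or not prompt.strip():
        raise ValueError("prompt is required")
    if duration not in ALLOWED_DURATIONS:
        raise ValueError(f"duration must be one of {ALLOWED_DURATIONS}")
    if resolution not in ALLOWED_RESOLUTIONS:
        raise ValueError(f"resolution must be one of {ALLOWED_RESOLUTIONS}")
    if aspect_ratio not in ALLOWED_ASPECT_RATIOS:
        raise ValueError(f"aspect_ratio must be one of {ALLOWED_ASPECT_RATIOS}")
    if fps not in ALLOWED_FPS:
        raise ValueError(f"fps must be one of {ALLOWED_FPS}")
    return {
        "image_url": image_url,
        "prompt": prompt,
        "duration": duration,
        "resolution": resolution,
        "aspect_ratio": aspect_ratio,
        "fps": fps,
        "generate_audio": generate_audio,
    }


if __name__ == "__main__":
    # example.com is a placeholder image location.
    print(build_inputs("https://example.com/portrait.png", "slow dolly in, soft rain"))
```

Validating locally catches an unsupported value such as `fps=30` before any billed generation is attempted.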

Recommended settings

Cinematic look: 25 FPS, 1080p or 1440p, describe subtle camera motion.
High action: 50 FPS, 6s duration to maximize coherence.
Final delivery: 2160p (4K) once creative choices are locked.
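
One way to keep these recommendations handy is a table of overrides applied on top of a draft configuration. The preset names and the `apply_preset` helper below are hypothetical, chosen for illustration. The "cinematic" preset picks 1080p, although 1440p is equally recommended above.

```python
# Overrides taken from the recommended settings. Keys not listed in a
# preset are left as they are in the draft.
PRESETS = {
    "cinematic": {"fps": 25, "resolution": "1080p"},  # 1440p also recommended
    "high_action": {"fps": 50, "duration": 6},
    "final_delivery": {"resolution": "2160p"},
}


def apply_preset(settings: dict, preset: str) -> dict:
    """Return a copy of settings with the preset's overrides applied."""
    if preset not in PRESETS:
        raise KeyError(f"unknown preset: {preset!r}")
    return {**settings, **PRESETS[preset]}


if __name__ == "__main__":
    draft = {"duration": 6, "resolution": "1080p", "fps": 25}
    print(apply_preset(draft, "high_action"))
    print(apply_preset(draft, "final_delivery"))
```

Because `apply_preset` returns a new dict, the draft stays unchanged and can be reused: iterate on the draft, then apply `final_delivery` once the look is locked.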

Related Models

hailuo-02/pro/image-to-video

Animate an image into a smooth 6s video with Hailuo 02 Pro.

pikascenes

Build a scene from 1–6 images and animate it into a video.

veo-3-1/fast/text-to-video

Create cinematic clips in seconds with Veo 3.1 Fast, built for instant text-driven motion and creative control.

kling-3.0/pro/image-to-video

Premium image-to-video with the highest visual fidelity and motion realism in the Kling V3.0 family.

ltx-2/fast/image-to-video

Transform visuals into smooth 4K motion clips with sync audio and rapid rendering.

wan-2-2/first-last-frame

Streamline scene design with high-fidelity, auto-interpolated video.

Frequently Asked Questions

What does LTX 2 Pro Image-to-Video do?

It animates a single reference image into a coherent video clip guided by your prompt. Use it to animate images with LTX 2 when you need cinematic motion, strong subject preservation, and optional synchronized audio.

Why is an image required for this workflow?

This Pro I2V workflow is image-conditioned. The image_url anchors composition and identity, while the prompt specifies motion and camera. This mechanism yields higher fidelity and stability than text-only inputs.

How fast is Pro Image-to-Video?

Pro prioritizes fidelity over raw speed. Latency increases with resolution (1080p < 1440p < 2160p) and FPS (25 < 50). Expect longer runtimes than fast text-to-video generation baselines, especially at 4K and 50 FPS.

What are the limits for resolution, duration, and aspect ratio?

Resolution is 1080p, 1440p, or 2160p (4K); duration is 6, 8, or 10 seconds; aspect ratio is fixed at 16:9; and FPS is 25 or 50, chosen to balance motion smoothness and coherence.

How do I iterate or make targeted changes (Retake)?

First generate a solid base clip here, then refine specific regions, motion intensity, or timing in Retake: LTX 2 Retake Video. This avoids starting over and speeds up creative iteration.

Does audio generation work with I2V?

Yes. Set generate_audio=true (default) to receive a video with synchronized ambience and sound effects that match on-screen events.
