LTX 2 Retake Video modifies the video according to the prompt.

| Parameter | Type | Default/Range | Description |
|---|---|---|---|
| prompt | string | "" | Text instruction describing how the scene should continue from the provided video’s last second. |
| video_url | string (video_uri) | "" | URL of the source Veo video to extend. Must be 720p or 1080p and 16:9 or 9:16. |
| aspect_ratio | enum | auto, 16:9, 9:16 (default: auto) | Aspect ratio for the output hop. Use auto to inherit the source video; if set manually, it must match the source AR. |
| duration | enum | 7s (default: 7s) | Length per extension hop. Chain multiple hops for longer total duration. |
| resolution | enum | 720p (default: 720p) | Output resolution per hop. Inputs can be 720p or 1080p; extension output is 720p in this mode. |
| generate_audio | boolean | true | If true, generates native audio synchronized with the extended visuals; set false for silent output. |
| auto_fix | boolean | false | Attempts to auto-rewrite prompts that fail policy or validation checks. |
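The parameters listed above can be checked client-side before a request is sent. The following is a minimal sketch, not RunComfy's own validation: the function name `build_payload` is hypothetical, and treating `prompt` and `video_url` as required non-empty strings is an assumption (the table lists `""` as their default). The allowed enum values are copied from the table.

```python
# Allowed values as listed in the parameter table.
ALLOWED_ASPECT_RATIOS = {"auto", "16:9", "9:16"}
ALLOWED_DURATIONS = {"7s"}
ALLOWED_RESOLUTIONS = {"720p"}


def build_payload(prompt, video_url, aspect_ratio="auto", duration="7s",
                  resolution="720p", generate_audio=True, auto_fix=False):
    """Return a JSON-serializable dict for one extension hop (hypothetical helper)."""
    # Assumption: an empty prompt or video_url is rejected before sending.
    if not isinstance(prompt, str) or not prompt.strip():
        raise ValueError("prompt must be a non-empty string")
    if not isinstance(video_url, str) or not video_url.strip():
        raise ValueError("video_url must be a non-empty string")
    if aspect_ratio not in ALLOWED_ASPECT_RATIOS:
        raise ValueError(f"aspect_ratio must be one of {sorted(ALLOWED_ASPECT_RATIOS)}")
    if duration not in ALLOWED_DURATIONS:
        raise ValueError(f"duration must be one of {sorted(ALLOWED_DURATIONS)}")
    if resolution not in ALLOWED_RESOLUTIONS:
        raise ValueError(f"resolution must be one of {sorted(ALLOWED_RESOLUTIONS)}")
    if not isinstance(generate_audio, bool) or not isinstance(auto_fix, bool):
        raise ValueError("generate_audio and auto_fix must be booleans")
    return {
        "prompt": prompt,
        "video_url": video_url,
        "aspect_ratio": aspect_ratio,
        "duration": duration,
        "resolution": resolution,
        "generate_audio": generate_audio,
        "auto_fix": auto_fix,
    }


# Example with a placeholder URL; only the defaults from the table are used.
payload = build_payload("The camera slowly pans left", "https://example.com/clip.mp4")
```

Leaving `aspect_ratio` at `auto` avoids a mismatch with the source video, since a manually set value must match the source aspect ratio.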
Developers can integrate Veo 3.1 Video Extend using the RunComfy API with standard HTTP requests and JSON payloads. Provide the source video_url, a continuation prompt, and your dimensions/settings to generate 7s extension hops; chain calls programmatically to build longer videos with consistent output.
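Chaining hops programmatically might look like the sketch below. It deliberately makes no HTTP call, because the endpoint URL, authentication, and response shape are not given here: `submit` is a hypothetical callable that you would implement against the real API, and it is assumed to take the JSON payload and return the URL of the resulting video. Feeding each hop's output back in as the next `video_url` is also an assumption about how chaining works.

```python
from typing import Callable, Dict, List


def chain_extensions(video_url: str, prompts: List[str],
                     submit: Callable[[Dict], str]) -> List[str]:
    """Run one extension hop per prompt and return the output URL of each hop."""
    outputs = []
    current_url = video_url
    for prompt in prompts:
        payload = {
            "prompt": prompt,
            "video_url": current_url,   # previous hop's output (assumption)
            "aspect_ratio": "auto",     # inherit from the source video
            "duration": "7s",
            "resolution": "720p",
            "generate_audio": True,
            "auto_fix": False,
        }
        current_url = submit(payload)   # hypothetical: performs the real request
        outputs.append(current_url)
    return outputs


# Stand-in for a real client, used only to show the data flow.
def fake_submit(payload: Dict) -> str:
    return payload["video_url"] + "+7s"


urls = chain_extensions("clip", ["She opens the door", "She steps outside"], fake_submit)
# urls == ["clip+7s", "clip+7s+7s"]
```

Passing `submit` in as a parameter keeps the chaining logic testable without network access and lets you swap in polling or retry behaviour later.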
Note: API Endpoint for Veo 3.1 Video Extend
If you need faster generation when extending, use Veo 3.1 Fast Extend Video.
Veo 3.1 Video Extend supports video-to-video generation up to 1080p resolution (usually 720p for extensions) with 16:9 or 9:16 aspect ratios. Each extension segment typically lasts 4–8 seconds, and up to three reference images can be used. Prompt text is limited to around 1,200 tokens per request for stable results.
Yes, Veo 3.1 Video Extend currently allows 1–3 reference images or frames for style and character consistency during video-to-video generation. These inputs act like ControlNet conditioning to maintain over 95% visual continuity, but adding more references may exceed memory limits in the API layer.
Compared to Veo 3.0, Veo 3.1 Video Extend dramatically improves video-to-video continuity and introduces native spatial audio with dialogue. It adds support for multiple aspect ratios and smoother first- and last-frame transitions, and it maintains consistent subjects throughout extended clips. Visual realism and prompt adherence are notably higher.
Veo 3.1 Video Extend performs best for cinematic storytelling, branded content, social media series, or teaching videos that require continuous scenes. The video-to-video system ensures that lighting, background, and character appearance match across extended sequences with minimal flicker or drift.
After experimenting in the RunComfy Playground, developers can move Veo 3.1 Video Extend projects to production via the RunComfy or Vertex AI API. Export the session parameters (prompt, seed, mode, reference frames) from the playground, then call the API endpoint with similar payloads. Make sure billing is configured so that purchased credits replace the free trial USD.
Veo 3.1 Video Extend generally produces more stable video-to-video results over longer durations. Unlike Wan 2.5 or Seedance, which focus on short-form clips, Veo maintains smoother cross-clip transitions and consistent audio ambience. This makes it superior for longer multi-scene narratives with minimal temporal artifacts.
Veo 3.1 Video Extend integrates synchronized ambient and dialogue audio directly with its video-to-video model, allowing scene continuation without abrupt sound changes. This audio continuity is a key differentiator that makes extended sequences feel cinematic and coherent throughout chained segments.
Yes. By chaining multiple 4–8 second segments, Veo 3.1 Video Extend can produce cumulative durations exceeding 60 seconds (up to about 148 seconds). Each segment references the final second of the previous one to preserve visual flow and sound continuity across the video-to-video chain.
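The arithmetic behind chaining can be sketched as follows. The helper `plan_hops` is hypothetical; it assumes a fixed hop length (7 seconds by default, matching the `duration` parameter), that each hop adds its full length to the running total, and that the roughly 148-second ceiling applies to the total including the source clip.

```python
import math


def plan_hops(source_seconds, target_seconds, hop_seconds=7.0, max_total_seconds=148.0):
    """Return (hops, total_seconds) needed to reach target_seconds without passing the cap."""
    if source_seconds <= 0 or hop_seconds <= 0:
        raise ValueError("source_seconds and hop_seconds must be positive")
    # Never plan past the assumed overall ceiling.
    capped_target = min(target_seconds, max_total_seconds)
    needed = max(0.0, capped_target - source_seconds)
    hops = math.ceil(needed / hop_seconds)
    # Drop hops that would push the total beyond the ceiling.
    while hops > 0 and source_seconds + hops * hop_seconds > max_total_seconds:
        hops -= 1
    return hops, source_seconds + hops * hop_seconds


print(plan_hops(8, 60))    # (8, 64.0): eight 7s hops take an 8s clip past 60s
print(plan_hops(8, 1000))  # (20, 148.0): capped at the assumed 148s ceiling
```

Because hops have a fixed length, the planned total can overshoot the requested target by up to one hop, as in the first example.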
The Veo 3.1 Video Extend engine uses alignment embeddings and pixel-space similarity across last-frame conditioning, plus up to three reference images. Within video-to-video mode, it reproduces motion dynamics without style drift, keeping facial structure, colors, and backgrounds persistent across sequences.
Commercial use of Veo 3.1 Video Extend for video-to-video generation generally follows the licensing guidelines from Google DeepMind. Users should confirm rights under the Veo or Vertex AI terms before distributing or monetizing extended clips, as local and platform licensing may vary depending on content type and jurisdiction.