LTX 2.3 Prompt Relay | Scene-Controlled Video Maker

Workflow Name: RunComfy/LTX-2.3-Prompt-Relay
Workflow ID: 0000...1405
This workflow helps you generate cinematic clips where each scene flows naturally into the next. Designed for creators who need more precise control over temporal prompts, it delivers smooth transitions between actions and segments. You can route multiple text prompts across video beats for refined storytelling. The integrated helper refines story cues before generation, making the process efficient and coherent. Ideal for designers seeking higher narrative consistency and time-saving video creation.

LTX 2.3 Prompt Relay: multi‑beat image‑to‑video generation in ComfyUI#

LTX 2.3 Prompt Relay is a ComfyUI workflow for directing image‑to‑video with segmented prompt routing across multiple beats in one clip. It uses PromptRelayEncode as a training‑free, inference‑time controller to assign different text instructions to different time spans, so you can script camera moves and actions per beat while preserving subject continuity and smooth transitions. A Qwen VLM helper can auto‑draft or refine the story beats from a reference image before generation.

This ComfyUI LTX 2.3 Prompt Relay workflow is ideal for cinematic shorts, product shots, and narrative teasers where you want scene‑by‑scene control without fine‑tuning. It produces a synced video with decoded audio and writes an H.264 MP4 with metadata preserved.

Key models in the ComfyUI LTX 2.3 Prompt Relay workflow#

  • LTX‑Video 2.3 base checkpoint. The generative backbone that synthesizes temporally consistent video from text and an optional reference frame. See the community build and weights context on Hugging Face for ComfyUI users. Kijai/LTX2.3_comfy
  • LTX‑Video 2.3 Video VAE and Audio VAE. Decoders that turn the model’s latent video and latent audio into RGB frames and a waveform for muxing, used here to export an MP4. Kijai/LTX2.3_comfy
  • Qwen VLM (Instruct). A vision‑language model that reads the reference image and drafts multi‑beat action lines the workflow uses as local prompts. Integrated via the ComfyUI‑QwenVL extension. 1038lab/ComfyUI-QwenVL
  • Optional LTX 2.3 LoRAs. Style or efficiency adapters such as a distilled LoRA and a crisp‑enhance LoRA are pre‑wired for easy toggling to change texture and sharpness without altering your prompts. Kijai/LTX2.3_comfy

How to use the ComfyUI LTX 2.3 Prompt Relay workflow#

Overall flow#

The workflow reads a single image as the opening frame, gathers a global prompt plus beat‑specific local prompts, encodes them with Prompt Relay, samples a joint audio‑video latent, then decodes and combines frames and audio into an MP4. Groups are organized as Models, Input Video Setting, VLM, Conditioning, Create Latent, Sampling, and Decoding.

Models#

The base LTX‑Video 2.3 checkpoint loads first, then two optional LoRAs are applied in sequence to tune crispness and efficiency. Attention patching is enabled to improve fidelity under long prompts. You can keep both LoRAs, disable one, or bypass them entirely if you prefer a neutral baseline look.

Input Video Setting#

Choose width, height, total seconds, and FPS for the clip. The workflow computes the frame count automatically as a product of seconds and FPS, keeping image and audio lengths in sync. Set these before writing prompts so you know how many beats will comfortably fit.
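The frame math reduces to a single multiplication, which you can use to budget beats before writing prompts. The helper below is illustrative, not an actual workflow node:

```python
def frame_count(seconds: float, fps: int) -> int:
    """Total frames for a clip of the given duration, as the
    workflow computes it: seconds multiplied by FPS."""
    return int(seconds * fps)

frames = frame_count(5, 24)  # a 5 s clip at 24 FPS yields 120 frames
```

With four beats, a 120-frame clip gives each beat roughly 30 frames, so very short clips leave little room for many distinct actions.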

VLM#

Load or drop a reference image. The image is preprocessed and sent to a Qwen VLM that follows a short instruction template to propose four concise beat lines separated by the pipe character “|”. You can review and edit the generated text in the on‑screen viewer before it moves on, or skip the VLM and write your own lines.

Conditioning with Prompt Relay#

PromptRelayEncode takes a global prompt for style and setting plus your local prompts for per‑beat actions. Separate beats with “|” in local prompts; the encoder routes each segment to its time span and blends between them for smooth handoffs. The node outputs prompt conditioning and a patched model so the sampler follows your beat script faithfully. Reference and usage are provided by the ComfyUI‑PromptRelay project. kijai/ComfyUI-PromptRelay
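The routing idea can be sketched as follows. This is an illustrative approximation of segmented routing, not the actual PromptRelayEncode implementation, and `beat_spans` is a hypothetical helper:

```python
def beat_spans(local_prompts: str, total_frames: int):
    """Split a pipe-separated prompt into beats and give each an
    even share of the frame timeline. Span boundaries are where
    the encoder would blend between adjacent beats."""
    beats = [b.strip() for b in local_prompts.split("|") if b.strip()]
    share = total_frames / len(beats)
    return [(beat, round(i * share), round((i + 1) * share))
            for i, beat in enumerate(beats)]

spans = beat_spans(
    "push in on the door | hand turns the key | door swings open", 121
)
# Each beat gets ~40 of 121 frames, in order.
```

Because adjacent spans share a boundary, wording neighboring beats so they are semantically compatible makes the blend at each handoff look natural.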

Create Latent#

An empty video latent is initialized to your chosen resolution and length. The preprocessed reference image is written into the timeline’s first frame to anchor identity, pose, and lighting. An empty audio latent with matching duration is created so decoding produces a ready‑to‑mux waveform alongside the frames.

Sampling#

A scheduler creates the noise schedule, a visualizer previews it, and the sampler runs on the concatenated audio‑video latent using the patched LTX 2.3 model and Prompt Relay conditioning. You can change the sampler type if you prefer a different trade‑off between sharpness and stability. The result is a single latent that already encodes both video and audio.

Decoding and export#

The latent is split into video and audio branches, then decoded by the LTX 2.3 Video VAE and Audio VAE. VideoHelperSuite combines the frames and waveform into an H.264 MP4 with a standard pixel format for wide player compatibility and saves the metadata for reproducibility. ComfyUI-VideoHelperSuite

Key nodes in the ComfyUI LTX 2.3 Prompt Relay workflow#

PromptRelayEncode (#605)#

The core controller that applies segmented prompt routing at inference time. Use global_prompt for style, setting, subject, and lens language that should persist, and use local_prompts for beat‑specific actions separated by |. Keep beats concise and focused; 3 to 6 beats usually read cleanly. If you want to hand‑time transitions, keep adjacent beats semantically compatible so the blend is natural. Reference: kijai/ComfyUI-PromptRelay
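For reference, a well-formed input pair might look like the following. The field names match the node inputs described above; the scene content itself is purely illustrative:

```python
# Persistent art direction and optics go in the global prompt.
global_prompt = (
    "Cinematic night street, 35mm lens, shallow depth of field, "
    "rain-slick asphalt, a woman in a red coat"
)
# One action per beat, separated by "|", each led by a camera verb.
local_prompts = (
    "push in as she opens her umbrella | "
    "pan right as she crosses the street | "
    "handheld drift as she pauses under a neon sign | "
    "slow tilt up as she looks into the rain"
)
num_beats = local_prompts.count("|") + 1  # 4 beats, within the 3-6 range
```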

AILab_QwenVL_Advanced (#610)#

A VLM assistant that reads the reference image and expands your idea into beat lines using a short instruction prompt. Edit the instruction text to nudge tone or camera vocabulary, then review the generated beats in the viewer. The output feeds directly into local_prompts, and you can override it with your own writing at any time. Reference: 1038lab/ComfyUI-QwenVL

LTXVImgToVideoInplaceKJ (#582)#

Seeds the first frame of the latent video with your input image, promoting identity and lighting stability across beats. For pure text‑to‑video, bypass this node and start from an empty video latent. For stronger adherence to the seed frame, keep your global prompt consistent with the image content.
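Conceptually, the anchoring step overwrites frame 0 of the empty latent with the encoded reference image. The pure-Python sketch below uses illustrative shapes and names, not the node's real internals:

```python
def seed_first_frame(video_latent, ref_latent):
    """video_latent: list of per-frame latent tensors (stand-in:
    flat lists of floats). Returns a copy with the encoded
    reference written into frame 0 to anchor the clip."""
    out = list(video_latent)   # shallow copy of the frame list
    out[0] = ref_latent        # overwrite frame 0 with the reference
    return out

empty = [[0.0] * 16 for _ in range(25)]   # 25 frames of zeroed latent
ref = [1.0] * 16                          # stand-in encoded reference
seeded = seed_first_frame(empty, ref)
```

Bypassing the node corresponds to skipping this overwrite, so sampling starts from a fully empty latent (pure text-to-video).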

BasicScheduler (#514) and VisualizeSigmasKJ (#358)#

Control and preview the denoising schedule used by the sampler. Use the visualizer to sanity‑check the curve shape when switching samplers or step counts. A smoother schedule often yields steadier motion, while more aggressive schedules push detail.

VHS_VideoCombine (#604)#

Muxes decoded frames and audio into a single MP4 with a widely compatible pixel format. Make sure its frame rate matches your Input Video Setting group for accurate sync. Disconnect the audio input here if you want a silent export. Reference: ComfyUI-VideoHelperSuite

Optional extras#

  • Beat writing tips: write in present tense, keep each beat to one action, add short dialogue only when it advances the beat, and begin with a camera verb such as “push in,” “pan right,” or “handheld drift.”
  • Use the global prompt for art direction and optics (lighting, lens, mood); use local prompts for motion, gestures, and framing changes.
  • For faster iteration, keep resolution modest while drafting beats, then raise it for the final render.
  • If LoRAs oversharpen or shift color, lower their weights or disable one of them to recover neutrality.

Acknowledgements#

This workflow implements and builds upon the following works and resources. We gratefully acknowledge the contributions and maintenance of gordonchen19 (Prompt-Relay), kijai (ComfyUI-PromptRelay), Kijai (LTX2.3_comfy, ComfyUI model context), 1038lab (ComfyUI-QwenVL), and Benji of Innovate Futures (the Patreon workflow source). For authoritative details, please refer to the original documentation and repositories linked below.

Resources#

  • Patreon/Workflow source
    • Docs / Release Notes: post @Benji
  • gordonchen19/Prompt-Relay
    • GitHub: gordonchen19/Prompt-Relay
    • Docs / Release Notes: site
  • kijai/ComfyUI-PromptRelay
    • GitHub: kijai/ComfyUI-PromptRelay
  • Kijai/LTX2.3_comfy
    • Hugging Face: Kijai/LTX2.3_comfy
    • Docs / Release Notes: discussion #51
  • 1038lab/ComfyUI-QwenVL
    • GitHub: 1038lab/ComfyUI-QwenVL

Note: Use of the referenced models, datasets, and code is subject to the respective licenses and terms provided by their authors and maintainers.

