Generate cinematic 4K clips from prompts with audio sync and pro control
LTX-2 19B Text-to-Video LoRA is a 19B-parameter, LoRA-adaptable diffusion transformer for generating short videos with synchronized audio directly from text. It accepts prompts and optional LoRA adapters and outputs coherent, styled video clips with audio in multiple resolutions, durations, and aspect ratios suitable for brand content, social media, animation, and rapid prototyping.
Output format: 480p–1080p / duration 5–20s / aspect ratio 16:9 or 9:16 / audio included
| Parameter | Required | Type | Default | Range / Options | Description |
|---|---|---|---|---|---|
| prompt | Yes | string | — | — | Text description of the scene, action, and audio cues |
| resolution | No | enum | 720p | 480p, 720p, 1080p | Output resolution |
| aspect_ratio | No | enum | 16:9 | 16:9, 9:16 | Output aspect ratio |
| duration | No | number | 5 | 5–20 (seconds) | Video length in seconds |
| loras | No | list | — | up to 3 | List of LoRA adapters to apply |
| seed | No | integer | -1 | -1 for random | Random seed for reproducibility |
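The constraints in the table above can be checked client-side before a job is submitted. The following is a minimal sketch, not RunComfy's own client: the `GenerationRequest` class and its methods are hypothetical names introduced for illustration, and the shape of each `loras` entry is not specified in the table, so entries are treated here as opaque values.

```python
from dataclasses import dataclass, field, asdict

# Allowed values copied from the parameter table.
RESOLUTIONS = {"480p", "720p", "1080p"}
ASPECT_RATIOS = {"16:9", "9:16"}
MAX_LORAS = 3


@dataclass
class GenerationRequest:
    """Hypothetical container for the documented parameters and defaults."""
    prompt: str
    resolution: str = "720p"
    aspect_ratio: str = "16:9"
    duration: float = 5          # seconds, 5 to 20
    loras: list = field(default_factory=list)  # entry format is an assumption
    seed: int = -1               # -1 requests a random seed

    def validate(self) -> None:
        if not self.prompt.strip():
            raise ValueError("prompt is required")
        if self.resolution not in RESOLUTIONS:
            raise ValueError(f"unsupported resolution: {self.resolution}")
        if self.aspect_ratio not in ASPECT_RATIOS:
            raise ValueError(f"unsupported aspect ratio: {self.aspect_ratio}")
        if not 5 <= self.duration <= 20:
            raise ValueError("duration must be between 5 and 20 seconds")
        if len(self.loras) > MAX_LORAS:
            raise ValueError(f"at most {MAX_LORAS} LoRA adapters are allowed")

    def to_payload(self) -> dict:
        """Validate, then return a plain dict keyed by the table's parameter names."""
        self.validate()
        return asdict(self)


request = GenerationRequest(prompt="A fox runs through fresh snow, wind howling")
payload = request.to_payload()
```

Only `prompt` has to be supplied; every other field falls back to the default listed in the table.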
Usage of LTX-2 19B Text-to-Video LoRA is billed per generated second, at a rate set by the output resolution.
| Resolution | Price per second | Billing unit |
|---|---|---|
| 480p | $0.015 | Per generated second |
| 720p | $0.02 | Per generated second |
| 1080p | $0.03 | Per generated second |
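The cost of a clip follows directly from the table: price per second multiplied by duration. Here is a small sketch of that arithmetic; `estimate_cost` is a hypothetical helper, and it assumes whole-second durations because the page does not say how fractional seconds are billed.

```python
from decimal import Decimal

# Prices per generated second, copied from the pricing table (USD).
PRICE_PER_SECOND = {
    "480p": Decimal("0.015"),
    "720p": Decimal("0.02"),
    "1080p": Decimal("0.03"),
}


def estimate_cost(resolution: str, duration_seconds: int) -> Decimal:
    """Return the estimated USD cost of one clip."""
    if resolution not in PRICE_PER_SECOND:
        raise ValueError(f"unknown resolution: {resolution}")
    if not 5 <= duration_seconds <= 20:
        raise ValueError("duration must be between 5 and 20 seconds")
    # Decimal avoids binary floating-point rounding in currency math.
    return PRICE_PER_SECOND[resolution] * duration_seconds


print(estimate_cost("1080p", 10))  # 0.30
print(estimate_cost("480p", 5))    # 0.075
```

At these rates, the most expensive single clip (20 seconds at 1080p) comes to $0.60.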
1) Select the model on RunComfy: Choose LTX-2 19B Text-to-Video LoRA from the Models catalog.
2) Write your prompt: Describe the subject, actions, setting, camera movement, lighting, and key audio cues (ambience, effects, or dialogue).
3) Add adapters (optional): In the loras field, reference up to three LoRA adapters to steer style or identity; LTX-2 19B Text-to-Video LoRA will blend them during generation.
4) Set output controls: Pick resolution (480p/720p/1080p), aspect ratio (16:9 or 9:16), and duration (5–20s) to match the target channel.
5) Reproducibility: Set seed to a fixed integer to recreate a result, or -1 for exploration with new variations.
6) Generate: Submit the job and preview the clip with synchronized audio; LTX-2 19B Text-to-Video LoRA outputs a single video file.
7) Review and iterate: Adjust the prompt or LoRA list, tweak duration/ratio if framing is off, and re-run with the same seed for controlled changes.
8) API-friendly: Use RunComfy’s API endpoints to automate batch jobs without managing infrastructure or GPU provisioning.
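Steps 5, 7, and 8 combine naturally in a batch: hold the seed and output settings fixed and vary only the prompt, so differences between clips come from the wording alone. The sketch below only builds the job payloads. `build_batch` is a hypothetical helper, and the endpoint, authentication, and submission call are left out because this page does not specify them.

```python
import json


def build_batch(prompts, seed, resolution="720p", aspect_ratio="16:9",
                duration=5, loras=None):
    """Build one payload per prompt, sharing a fixed seed and output settings."""
    if seed == -1:
        # -1 means "random", which would defeat a controlled comparison.
        raise ValueError("use a fixed seed, not -1, for controlled iteration")
    loras = loras or []
    if len(loras) > 3:
        raise ValueError("at most 3 LoRA adapters are allowed")
    return [
        {
            "prompt": prompt,
            "resolution": resolution,
            "aspect_ratio": aspect_ratio,
            "duration": duration,
            "loras": list(loras),  # copy so jobs do not share one list object
            "seed": seed,
        }
        for prompt in prompts
    ]


jobs = build_batch(
    [
        "Slow dolly-in on a lighthouse at dusk, waves crashing, distant gulls",
        "Slow dolly-in on a lighthouse at dawn, calm sea, soft wind",
    ],
    seed=42,
    duration=8,
)
# Serialize for whatever submission mechanism the pipeline uses.
body = json.dumps(jobs, indent=2)
```

Each dictionary uses the parameter names from the table above, so it can be passed to whichever client or HTTP call the pipeline already uses.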
If LTX-2 19B Text-to-Video LoRA isn’t a fit, consider:
Precise prompts, lifelike motion, vivid video quality.
Turn static images into vivid motion with precise text and 2K detail.
Convert visuals to cinematic videos quickly with Veo 3.1 Fast image-to-video for seamless creative control.
Create high quality videos from text prompts using Pika 2.2.
Turn images and text into motion-accurate HD videos fast.
LTX-2 19B Text-to-Video LoRA supports output up to native 4K (3840×2160) at 50 fps, with user-selectable 480p, 720p, and 1080p presets. Supported aspect ratios include 16:9 and 9:16. Prompts are currently limited to approximately 512 tokens per generation.
Up to three LoRA modules can be loaded simultaneously in LTX-2 19B Text-to-Video LoRA. This includes optional IC-LoRAs for control signals such as pose, depth, or edge guidance, which improve structural coherence in text-to-video composition.
Start by prototyping in the browser-based RunComfy Models playground to refine your LTX-2 19B Text-to-Video LoRA prompts and settings. When you are ready for production, switch to the RunComfy API, which lets you automate text-to-video generation, style adapter loading, and post-processing in your pipeline. API pricing uses the same USD-based credit system as the playground.
LTX-2 19B Text-to-Video LoRA excels at marketing clips, educational explainers, animated character dialogues, and stylized short-form videos where high frame rate and synchronized audio enhance quality. Its text-to-video coherence and configurable LoRAs make it ideal for brand consistency and creative media production.
LTX-2 19B Text-to-Video LoRA uses a Diffusion Transformer (DiT) backbone that jointly models visual frames and the corresponding audio waveform. Because video and audio are generated together in a single inference pass, dialogue timing, lip sync, and ambient sound stay aligned without a separate audio synthesis stage.
Yes, LTX-2 19B Text-to-Video LoRA supports lightweight LoRA training pipelines, letting developers fine-tune style, motion, or character representations efficiently. These adapters plug into the main pipeline, preserving text-to-video consistency while enabling unique creative identities or brand looks. You can use RunComfy Trainer to train your own LoRAs.
RunComfy is the premier ComfyUI platform, offering a ComfyUI online environment and services, along with ComfyUI workflows featuring stunning visuals. RunComfy also provides AI Models, enabling artists to harness the latest AI tools to create incredible art.





