
LTX 2.3 IC-LoRA | Motion Track AI Video Generator

Workflow Name: LTX 2.3 IC-LoRA
Workflow ID: 0000...1372
With the latest in-context control model, this workflow helps you achieve precise motion-guided video generation. Transfer movements from reference clips, lock camera paths, and adjust scene structure with accuracy. Style control stays independent, letting you change aesthetics freely through prompts or LoRAs. Ideal for designers needing predictable, production-ready results, it bridges motion consistency with creative freedom. Perfect for motion retargeting, animation, and scene direction, this system streamlines complex video transformations.

LTX 2.3 IC-LoRA: Motion Track video generation in ComfyUI

This workflow brings the LTX 2.3 IC-LoRA system to ComfyUI so you can guide motion and scene structure while freely styling with prompts or extra LoRAs. It conditions the LTX-2.3 video generator on reference signals like depth, pose, and edges, enabling motion transfer, camera locking, and predictable composition.

Creators working on video-to-video, motion retargeting, and controlled AI animation will find that LTX 2.3 IC-LoRA separates motion control from visual style. You steer the look with text and style LoRAs, and steer the movement with structured guides, all inside a single ComfyUI graph.

Key models in the ComfyUI LTX 2.3 IC-LoRA workflow

  • LTX-2.3 by Lightricks. A high-fidelity latent video diffusion transformer that generates temporally consistent sequences and supports conditioning for structure and motion control. Hugging Face: Lightricks/LTX-2.3
  • LTX 2.3 IC-LoRA union-control weights. In-context LoRA weights designed to inject structured guidance signals into LTX-2.3 for precise motion and geometry control. Provided with the workflow’s model chain and loaded before generation.
  • LTX-2.3 VAEs for video and audio. Latent encoders/decoders paired with LTX-2.3 to compress and reconstruct video and audio features used during sampling. Preconfigured in the graph and switchable when using quantized builds. Examples of split packages are available here: Hugging Face: unsloth/LTX-2.3-GGUF
  • Depth Anything V2. Robust monocular depth estimation used to lock camera movement or preserve scene layout during generation. Hugging Face: LiheYoung/Depth-Anything-V2
  • DWPose. Lightweight multi-person pose estimator used to retarget or preserve character motion via keypoints. Hugging Face: yzd-v/DWPose

How to use the ComfyUI LTX 2.3 IC-LoRA workflow

The graph is organized into clear groups. You prepare prompts and a reference video, choose one or more structural guides, then generate and export.

Set prompts

Use CLIP Text Encode (Positive Prompt) (#2483) and CLIP Text Encode (Negative Prompt) (#2612) to describe the visual style and to exclude unwanted traits. The text encoders are loaded in the model group and routed into LTXVConditioning (#1241), which also receives the working frame rate so the conditioning matches your clip timing. Keep prompts focused on appearance because LTX 2.3 IC-LoRA will handle motion and structure.
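If you drive the graph programmatically rather than from the UI, the two text encoders can be addressed by their node IDs in an API-format export of the workflow. The sketch below is an assumption-laden illustration: it presumes the workflow was saved with ComfyUI's "Save (API Format)" option and that a local server is listening on the default 127.0.0.1:8188.

```python
import json
import urllib.request

def set_node_text(workflow: dict, node_id: str, text: str) -> dict:
    """Overwrite the `text` input of a CLIP Text Encode node in an
    API-format workflow dict (node IDs are the dictionary keys)."""
    workflow[node_id]["inputs"]["text"] = text
    return workflow

def queue_prompt(workflow: dict, host: str = "127.0.0.1:8188") -> None:
    """Submit the workflow to a running ComfyUI server's /prompt endpoint."""
    data = json.dumps({"prompt": workflow}).encode("utf-8")
    req = urllib.request.Request(f"http://{host}/prompt", data=data)
    urllib.request.urlopen(req)

# Node IDs follow the graph described above (#2483 positive, #2612 negative);
# a real export would contain the full graph, not just these two nodes.
wf = {
    "2483": {"class_type": "CLIPTextEncode", "inputs": {"text": ""}},
    "2612": {"class_type": "CLIPTextEncode", "inputs": {"text": ""}},
}
set_node_text(wf, "2483", "watercolor style, soft palette, film grain")
set_node_text(wf, "2612", "blurry, low quality, watermark")
```

Note that the prompts stay appearance-only; motion and structure come from the guide sequences, not the text.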

Preprocess

Load or pass a reference clip into VHS_LoadVideo (#5182). Frames are resized in ImageResizeKJv2 (#5080) and fed into the guide extractors: DepthAnythingV2Preprocessor (#5064) for depth, DWPreprocessor (#4986) for pose, and CannyEdgePreprocessor (#4991) for edges. A downstream resize node ensures the guide maps match the model-friendly multiples, and GetImageSize (#5029) records width, height, and frame count for the rest of the pipeline. The resulting guide image sequence is stored by Set_video_controlnet (#5100) for IC-LoRA to consume.
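The resize step matters because video diffusion backbones expect dimensions divisible by the model's patch and latent multiples. A minimal sketch of the rounding logic, assuming a multiple of 32 (a common choice for illustration; the real value comes from the loaded model):

```python
def round_to_multiple(x: float, m: int = 32) -> int:
    """Round x to the nearest multiple of m, never below m itself."""
    return max(m, round(x / m) * m)

def fit_dims(src_w: int, src_h: int, target_w: int = 768, m: int = 32):
    """Scale (src_w, src_h) so width hits target_w, preserving aspect
    ratio, then snap both sides to model-friendly multiples."""
    scale = target_w / src_w
    return round_to_multiple(src_w * scale, m), round_to_multiple(src_h * scale, m)

print(fit_dims(1920, 1080))  # a 16:9 source snapped to (768, 448)
```

Keeping the source close to the target aspect ratio minimizes how far this snapping distorts the guide maps.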

Load Models

The base model and LoRAs are assembled in this group. CheckpointLoaderSimple (#3940) loads LTX-2.3; LoraLoaderModelOnly (#4922) applies a distilled LTX LoRA for quality and speed; LTXICLoRALoaderModelOnly (#5011) adds the LTX 2.3 IC-LoRA weights and publishes the required latent downscale factor. VAEs for video and audio are loaded, and Boolean - Use GGUF? (#5158) can switch to a quantized GGUF build via GGUFLoaderKJ (#5150) with compatible text encoders and VAEs when VRAM is tight.

Load Image (set bypass=True if t2v)

If you want to anchor composition with a still reference or first frame, use LoadImage (#2004). It is resized by ImageResizeKJv2 (#5076) and previewed for quick checks. The boolean bypass_i2v controls whether the image is used at all; set it to True for pure text-to-video with LTX 2.3 IC-LoRA.

Generate

EmptyLTXVLatentVideo (#3059) creates the latent canvas. If image anchoring is enabled, LTXVImgToVideoConditionOnly (#3159) injects only structural information from your image without baking style. The core step happens in LTXAddVideoICLoRAGuide (#5012), which attaches your chosen guide sequence to the model using the latent downscale factor from the IC-LoRA loader. Audio conditioning also flows into the latent through LTXVEmptyLatentAudio (#3980) or the custom audio path. CFGGuider (#4828), KSamplerSelect (#4831), ManualSigmas (#5025), and SamplerCustomAdvanced (#4829) then perform denoising to synthesize the final latent video while respecting both prompts and LTX 2.3 IC-LoRA controls.
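The guide injector needs the latent downscale factor because guides are drawn in pixel space while sampling happens in latent space. As an illustration only (the factors below are assumptions for the sketch, not confirmed LTX-2.3 values), the latent canvas shape might be derived like this:

```python
def latent_shape(width: int, height: int, frames: int,
                 spatial: int = 32, temporal: int = 8):
    """Map a pixel-space clip to latent-space dimensions.
    `spatial` and `temporal` are illustrative downscale factors."""
    assert width % spatial == 0 and height % spatial == 0, "snap dims first"
    # Video VAEs commonly keep the first frame and compress the rest.
    latent_frames = (frames - 1) // temporal + 1
    return width // spatial, height // spatial, latent_frames

print(latent_shape(768, 448, 97))  # → (24, 14, 13)
```

This is why the IC-LoRA loader publishes the factor explicitly: the guide attachment node must resample guides to exactly these latent dimensions.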

Decode

LTXVSeparateAVLatent (#4845) splits the generated audio and video latents for decoding. LTXVCropGuides (#5013) aligns and crops if needed, then VAEDecodeTiled (#4851) reconstructs frames efficiently. VHS_VideoCombine (#5070) muxes frames into an MP4, using the reference clip’s audio by default. You can also decode the generated audio latent with LTXVAudioVAEDecode (#4848) if you want to audition it separately.
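Tiled decoding trades a little compute for much lower peak VRAM by decoding overlapping tiles and blending the seams. A sketch of the tile-placement logic (illustrative only, not the node's actual implementation):

```python
def tile_spans(size: int, tile: int = 64, overlap: int = 16):
    """Return (start, end) spans of width `tile` that cover `size`
    with at least `overlap` shared pixels between neighbours."""
    if size <= tile:
        return [(0, size)]
    step = tile - overlap
    starts = list(range(0, size - tile + 1, step))
    if starts[-1] + tile < size:
        starts.append(size - tile)  # final tile flush with the edge
    return [(s, s + tile) for s in starts]

print(tile_spans(100, tile=64, overlap=16))  # → [(0, 64), (36, 100)]
```

The overlap region is where blending hides tile boundaries; too small an overlap can leave visible seams in decoded frames.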

Prepare Reference Video

This helper area shows the reference frame pipeline. VHS_VideoInfoLoaded (#5073) extracts fps and duration, which are propagated to the conditioning nodes and to exporters so timing stays in sync. A small combine node provides a quick visual preview of the source sequence for sanity checks.

Custom Audio

If you would like audio-aware generation, the reference audio is encoded with LTXVAudioVAEEncode (#5146) and a simple mask is applied in SetLatentNoiseMask (#5148). The switch titled Switch - Custom Audio? (#5149) selects between empty or encoded audio latents before concatenation in LTXVConcatAVLatent (#4528). Final export still uses the reference audio by default; if you prefer the decoded audio from the model, route the LTXVAudioVAEDecode output to the exporter’s audio input.

LTX Official Sigma Setting

The schedule node ManualSigmas (#5025) defines a concise sigma profile tuned for LTX-2.3, and SigmasPreview (#5142) visualizes it so you can reason about noise allocation over time. This lets you trade speed for detail while maintaining the characteristic temporal stability of LTX 2.3 IC-LoRA.
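A sigma schedule is just a decreasing list of noise levels, and the preview node shows where your steps are spent. One common way to construct such a profile is Karras-style spacing — shown here as an assumption for illustration; the workflow's actual ManualSigmas values may differ:

```python
def karras_sigmas(n: int, sigma_max: float = 1.0,
                  sigma_min: float = 0.002, rho: float = 7.0):
    """Karras-style schedule: steps cluster at low noise levels,
    where fine detail is resolved, and thin out at high noise."""
    lo, hi = sigma_min ** (1 / rho), sigma_max ** (1 / rho)
    sig = [(hi + i / (n - 1) * (lo - hi)) ** rho for i in range(n)]
    return sig + [0.0]  # samplers expect a trailing zero

s = karras_sigmas(8)
# Schedules must decrease monotonically, or sampling misbehaves.
assert all(a > b for a, b in zip(s, s[1:]))
```

Fewer steps buy speed; more steps near `sigma_min` buy detail. Either way the list must stay smoothly decreasing.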

Key nodes in the ComfyUI LTX 2.3 IC-LoRA workflow

  • LTXICLoRALoaderModelOnly (#5011). Loads the LTX 2.3 IC-LoRA weights and outputs the latent downscale factor required by the guide injector. If you add extra style LoRAs, place them before this loader to keep motion guidance dominant.
  • LTXAddVideoICLoRAGuide (#5012). The point where depth, pose, or edge sequences enter the model as in-context guidance. Tune its strength to balance between strict structural adherence and stylistic freedom from your prompt and style LoRAs.
  • LTXVImgToVideoConditionOnly (#3159). Provides optional image-to-video conditioning that transfers only composition and coarse structure from a still image. Use its bypass toggle when switching between i2v and pure text-to-video.
  • CFGGuider (#4828). Controls how strongly the model follows your prompts relative to the LTX 2.3 IC-LoRA guide. Increase guidance when style fidelity matters most, decrease it to preserve motion and geometry with minimal drift.
  • SamplerCustomAdvanced (#4829) with ManualSigmas (#5025). A compact schedule and multistep sampler pairing that delivers good temporal coherence for LTX-2.3. If you modify the schedule, keep it smoothly decreasing and test short clips before longer renders.
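The prompt-versus-guide trade-off in CFGGuider follows the standard classifier-free guidance formula: the model is evaluated with and without the prompt, and the two predictions are blended. A minimal illustration, with scalar lists standing in for latent tensors:

```python
def cfg_combine(cond, uncond, scale: float):
    """Classifier-free guidance: push the prediction away from the
    unconditional branch by `scale` times the conditional delta."""
    return [u + scale * (c - u) for c, u in zip(cond, uncond)]

# scale > 1 amplifies prompt adherence; scale = 1 is pure conditional.
print(cfg_combine([2.0, 0.5], [1.0, 1.0], 3.0))  # → [4.0, -0.5]
```

Raising the scale strengthens the styled look from your prompt; lowering it leaves more headroom for the IC-LoRA guides to dictate geometry and motion.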

Optional extras

  • Choose the right guide. Use depth to lock camera and layout, pose for character motion, and edges for rigid objects or clean silhouettes. Mixing two guides is possible if they describe different aspects.
  • Keep dimensions sampler friendly. The preprocessors already round sizes to model-friendly multiples; keep your source close to the target aspect ratio to minimize padding.
  • Style without breaking motion. Add a light style LoRA before the IC-LoRA loader and keep its weight moderate so LTX 2.3 IC-LoRA can maintain geometry and timing.
  • Low VRAM mode. Toggle Use GGUF to run the quantized distilled model and matching text encoders/VAEs from the GGUF package if your GPU is constrained. Hugging Face: unsloth/LTX-2.3-GGUF
  • Stable timing. The frame rate read from the reference video is injected into conditioning and exporters so motion and audio stay aligned. If you override fps, do it consistently across conditioning and export.
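On the last point: if you override fps, the override must reach everywhere timing is derived, or audio and motion drift apart over the clip. A tiny helper sketch (names are hypothetical) that derives every timing value from a single source:

```python
def timing(fps: float, duration_s: float, sample_rate: int = 48000):
    """Derive all timing values from one fps/duration pair so the
    conditioning, sampler, and exporter can never disagree."""
    frames = round(fps * duration_s)
    return {
        "fps": fps,
        "frames": frames,
        "audio_samples": round(frames / fps * sample_rate),
    }

print(timing(24, 5.0))  # → {'fps': 24, 'frames': 120, 'audio_samples': 240000}
```

Feeding one such record to LTXVConditioning and VHS_VideoCombine alike is the simplest way to keep the exported MP4 in sync.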

Acknowledgements

This workflow implements and builds upon the following works and resources. We gratefully acknowledge @Benji’s AI Playground for providing the source materials and guidance behind LTX 2.3 IC-LoRA. For authoritative details, please refer to the original documentation and repositories linked below.

Resources

  • LTX 2.3 IC-LoRA Source
    • Docs / Release Notes: YouTube @Benji’s AI Playground

Note: Use of the referenced models, datasets, and code is subject to the respective licenses and terms provided by their authors and maintainers.
