
LatentSync | Lip Sync Model

Workflow Name: RunComfy/LatentSync
Workflow ID: 0000...1182
Updated 6/16/2025: ComfyUI has been updated to v0.3.40 for improved stability and compatibility, and LatentSync has been updated to v1.6. LatentSync redefines lip syncing with audio-conditioned latent diffusion models, bypassing intermediate motion representations for direct audio-visual alignment. Built on Stable Diffusion, it captures intricate audio-visual correlations, while its Temporal REPresentation Alignment (TREPA) module provides the temporal consistency that pixel-based approaches often lack, delivering accurate and realistic results.

LatentSync is a state-of-the-art end-to-end lip sync framework that harnesses the power of audio-conditioned latent diffusion models for realistic lip sync generation. What sets LatentSync apart is its ability to directly model the intricate correlations between audio and visual components without relying on any intermediate motion representation, revolutionizing the approach to lip sync synthesis.

At the core of LatentSync's pipeline is the integration of Stable Diffusion, a powerful generative model renowned for its exceptional ability to capture and generate high-quality images. By leveraging Stable Diffusion's capabilities, LatentSync can effectively learn and reproduce the complex dynamics between speech audio and corresponding lip movements, resulting in highly accurate and convincing lip sync animations.

One of the key challenges in diffusion-based lip sync methods is maintaining temporal consistency across generated frames, which is crucial for realistic results. LatentSync tackles this issue head-on with its groundbreaking Temporal REPresentation Alignment (TREPA) module, specifically designed to enhance the temporal coherence of lip sync animations. TREPA employs advanced techniques to extract temporal representations from the generated frames using large-scale self-supervised video models. By aligning these representations with the ground truth frames, LatentSync's framework ensures a high degree of temporal coherence, resulting in remarkably smooth and convincing lip sync animations that closely match the audio input.
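The TREPA idea described above can be sketched in a few lines. This is a toy illustration only: the real TREPA module extracts temporal representations with a large-scale self-supervised video model, whereas here the "temporal feature" is simply the delta between consecutive frames (scalars stand in for latent frames), and the loss measures how far the generated sequence's dynamics drift from the ground truth's.

```python
# Conceptual sketch of the TREPA objective (not the actual LatentSync code).
# Real TREPA uses features from a self-supervised video model; here a toy
# "temporal representation" is the frame-to-frame difference sequence.

def temporal_representation(frames):
    """Toy temporal feature: differences between consecutive frames."""
    return [b - a for a, b in zip(frames, frames[1:])]

def trepa_loss(generated, ground_truth):
    """Mean squared distance between the two temporal representations.

    A loss of 0 means the generated sequence changes over time exactly
    like the ground truth, even if absolute frame values differ.
    """
    g = temporal_representation(generated)
    t = temporal_representation(ground_truth)
    return sum((x - y) ** 2 for x, y in zip(g, t)) / len(g)
```

Note that a constant offset between sequences yields zero loss: only the motion (the change between frames) is aligned, which is exactly the temporal-coherence intuition behind TREPA.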

1.1 How to Use the LatentSync Workflow

Note: The LatentSync node has been updated to version 1.6 (latest version).


This is the LatentSync workflow: the nodes on the left are inputs for uploading your video, the middle contains the LatentSync processing nodes, and the right holds the output nodes.

  • Upload your video in the input node.
  • Upload your audio input (the dialogue).
  • Click Render.
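Under the hood, these three steps correspond to a small node graph. The sketch below builds a ComfyUI-style prompt dictionary for it; the node class names (`LoadVideo`, `LoadAudio`, `LatentSync`, `SaveVideo`) are illustrative placeholders, not the actual class names registered by the LatentSync custom nodes.

```python
# Hypothetical sketch of the graph behind the three steps above.
# ComfyUI prompt graphs are dicts of node-id -> {class_type, inputs},
# where an input like ["1", 0] references output slot 0 of node "1".
# Node class names here are placeholders, not the real node names.

def build_latentsync_graph(video_path, audio_path):
    """Return a minimal prompt-style graph: video in, audio in,
    LatentSync processing, video out."""
    return {
        "1": {"class_type": "LoadVideo", "inputs": {"path": video_path}},
        "2": {"class_type": "LoadAudio", "inputs": {"path": audio_path}},
        "3": {"class_type": "LatentSync",
              "inputs": {"video": ["1", 0], "audio": ["2", 0]}},
        "4": {"class_type": "SaveVideo", "inputs": {"frames": ["3", 0]}},
    }

graph = build_latentsync_graph("face.mp4", "speech.wav")
```

Clicking Render is what traverses such a graph: the output node pulls frames from LatentSync, which in turn pulls from the two loaders.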

1.2 Video Input


  • Click to upload your reference video, which must contain a face.

The video is automatically adjusted to 25 FPS so it syncs properly with the audio model.
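A rough sketch of what that adjustment means: each output frame on the 25 FPS timeline is drawn from the nearest source frame. This assumes simple nearest-frame resampling for illustration; the actual node may drop, duplicate, or interpolate frames differently.

```python
# Sketch of frame-rate conversion to 25 FPS via nearest-frame sampling.
# Output frame i sits at time i / dst_fps; we pick the closest source
# frame to that time. (Illustrative only; the real node may differ.)

def resample_indices(n_src_frames, src_fps, dst_fps=25):
    """Map output frame indices at dst_fps to source frame indices."""
    duration = n_src_frames / src_fps       # clip length in seconds
    n_out = int(duration * dst_fps)         # frames on the 25 FPS timeline
    return [min(round(i * src_fps / dst_fps), n_src_frames - 1)
            for i in range(n_out)]
```

For a one-second 30 FPS clip this yields 25 indices with every sixth source frame skipped (`[0, 1, 2, 4, 5, ...]`), while a clip already at 25 FPS maps to itself unchanged.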

1.3 Audio Input


  • Click to upload your audio here.
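Because the video runs at 25 FPS, each frame covers 40 ms of audio, so the length of the uploaded dialogue determines how many frames are generated. A quick back-of-the-envelope helper (this assumes 16 kHz mono samples, a common rate for speech models; the actual node may use a different sample rate):

```python
# How many 25 FPS video frames does a dialogue clip need?
# Assumes 16 kHz mono audio for illustration; the real pipeline's
# sample rate may differ.
import math

def frames_for_audio(n_samples, sample_rate=16000, fps=25):
    """Number of 25-FPS video frames spanning the audio clip."""
    duration_s = n_samples / sample_rate
    return math.ceil(duration_s * fps)
```

One second of audio (16,000 samples) thus needs 25 frames; half a second rounds up to 13 frames, since a partial frame still has to be generated.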

LatentSync sets a new benchmark for lip sync with its innovative approach to audio-visual generation. By combining precision, temporal consistency, and the power of Stable Diffusion, LatentSync transforms the way we create synchronized content. Redefine what's possible in lip sync with LatentSync.

