| Parameter | Type | Default/Range | Description |
|---|---|---|---|
| video_url | string (video_uri) | Required | Publicly accessible URL to source footage; must be 15s or shorter. |
| audio_url | string (audio_uri) | Required | Publicly accessible URL to the driving audio; must be 15s or shorter. |
| emotion | string (choice) | neutral | One of: happy, angry, sad, neutral, disgusted, surprised. Drives the emotional performance. |
| model_mode | string (choice) | face | Edit scope control: lips (mouth-only), face (full-face), head (includes head motion). |
| lipsync_mode | string (choice) | bounce | Handling when audio/video durations differ: cut_off, loop, bounce, silence, remap (see the sketch after this table). |
| temperature | float | 0.5 | Controls expressiveness; higher values yield more animated delivery. |
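The duration-mismatch options for lipsync_mode are easiest to see in code. The sketch below is one plausible interpretation of how each mode could reconcile tracks of different lengths; the actual behavior is defined by the model, so treat these semantics (truncate, repeat, ping-pong, pad, retime) as assumptions rather than documented behavior.

```python
# Illustrative interpretation of the lipsync_mode options when the driving
# audio and the source video have different lengths. The real behavior is
# defined by the React-1 model; the semantics below are assumptions.

def plan_audio_timeline(audio_len: float, video_len: float, mode: str):
    """Return (output_len, segments), where each segment is an
    (audio_start, audio_end, reversed) chunk laid back-to-back."""
    if mode == "cut_off":
        # Stop at whichever track ends first; the rest is dropped.
        out = min(audio_len, video_len)
        return out, [(0.0, out, False)]
    if mode == "silence":
        # Keep the full video; any uncovered tail is driven by silence.
        return video_len, [(0.0, min(audio_len, video_len), False)]
    if mode == "remap":
        # Time-stretch the audio so both tracks end together.
        return video_len, [(0.0, audio_len, False)]
    # loop and bounce repeat the audio until the video is covered.
    segments, covered, forward = [], 0.0, True
    while covered < video_len:
        chunk = min(audio_len, video_len - covered)
        if mode == "loop":
            segments.append((0.0, chunk, False))                      # restart from 0
        elif mode == "bounce":
            if forward:
                segments.append((0.0, chunk, False))                  # play forward
            else:
                segments.append((audio_len - chunk, audio_len, True))  # play backward
            forward = not forward
        else:
            raise ValueError(f"unknown lipsync_mode: {mode}")
        covered += chunk
    return video_len, segments

# Example: 5 s of audio driving a 12 s clip in bounce mode.
print(plan_audio_timeline(5.0, 12.0, "bounce"))
```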
Developers can integrate React-1 lipsync video generation into their own pipelines through the RunComfy API, submitting the source video, driving audio, and control parameters with standard HTTP requests.
Note: see the RunComfy API docs for the lipsync video endpoint.
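As a rough illustration, such a request could look like the Python sketch below using the `requests` library. The endpoint URL, authentication header, and response handling are placeholders rather than the documented RunComfy contract; only the parameter names and example values mirror the tables above, so check the API docs before adapting it.

```python
# Minimal sketch of submitting a React-1 lipsync job over HTTP.
# The endpoint URL, auth header, and response shape are assumptions;
# copy the real values from the RunComfy API docs.
import requests

API_KEY = "YOUR_RUNCOMFY_API_KEY"                             # from your RunComfy account
ENDPOINT = "https://<runcomfy-api-host>/<react-1-endpoint>"   # placeholder; see the API docs

payload = {
    "video_url": "https://example.com/source-clip.mp4",    # public URL, 15s or shorter
    "audio_url": "https://example.com/driving-audio.wav",  # public URL, 15s or shorter
    "emotion": "happy",        # happy | angry | sad | neutral | disgusted | surprised
    "model_mode": "face",      # lips | face | head
    "lipsync_mode": "bounce",  # cut_off | loop | bounce | silence | remap
    "temperature": 0.5,        # higher values give a more animated delivery
}

response = requests.post(
    ENDPOINT,
    json=payload,
    headers={"Authorization": f"Bearer {API_KEY}"},  # assumed bearer-token auth
    timeout=60,
)
response.raise_for_status()
print(response.json())  # job metadata / output reference, per the API docs
```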
If you only need straightforward lip alignment without emotion-guided re-animation, consider Sync Lipsync-2-Pro (audio-to-video), which emphasizes accurate mouth shapes and high-fidelity detail without altering broader facial performance. For generating entirely new scenes instead of editing existing footage, use a text-to-video model designed for content creation rather than performance editing.

Explore More Lipsync Playgrounds Here
Turn static visuals into smooth motion with Hailuo 2.3 for rapid, realistic video creation.
Transform still images and voice tracks into lifelike talking avatars with precise motion control.
Convert photos into expressive talking avatars with precise motion and HD detail.
Turns static visuals into cinematic motion with synced audio and natural camera flow.
Cinematic motion model for fluid scene creation and adaptive visual editing.
Transforms input clips into synced animated characters with precise motion replication.
React-1 transforms a lipsync video through advanced video-to-video and audio-to-video processing. It not only syncs lip motion but also regenerates the full facial performance, including micro-expressions and head and eye movements, based on emotional cues. Compared to the earlier Lipsync-2 and Lipsync-2-Pro models, it excels in emotional realism and identity preservation.
Yes. React-1 is specifically designed for editing existing footage via video-to-video reanimation. Upload a source lipsync video along with a separate audio track (the audio-to-video input) and select one of the six emotion presets; the model then regenerates the facial performance while preserving the original identity and scene style.
The current research preview of React-1 supports output up to 4K resolution but works most efficiently at 1080p. The maximum aspect ratio is 16:9, and it typically accepts one video-to-video input and one audio-to-video input per generation. Additional control sources such as ControlNet or IP-Adapter are not yet supported.
Yes. When producing a long lipsync video via the React-1 API or RunComfy playground, you may encounter memory limits beyond approximately 90–120 seconds of footage, depending on resolution. Exceeding this duration may result in partial frame drops during audio-to-video alignment.
After refining your lipsync video workflow in the RunComfy Playground, you can export the configuration parameters (emotion, audio, video-to-video reference) and replicate them using RunComfy’s public API. The API mirrors the playground’s model settings and supports scripted generation for automated audio-to-video pipelines. Consult the API docs or contact hi@runcomfy.com for integration keys.
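As a loose sketch of what such an automated pipeline could look like, the Python snippet below submits playground-tuned settings for a batch of clips and polls for results. The endpoint paths, response fields, and job states are hypothetical placeholders; only the parameter names come from the tables above.

```python
# Loose sketch of automating playground-tuned settings through the API.
# Endpoint paths, response fields, and job states below are hypothetical;
# the real request/response contract is described in the RunComfy API docs.
import time
import requests

API_KEY = "YOUR_RUNCOMFY_API_KEY"
BASE = "https://<runcomfy-api-host>"  # placeholder host; see the API docs

# Settings copied from a tuned playground run.
PRESET = {
    "emotion": "surprised",
    "model_mode": "head",
    "lipsync_mode": "remap",
    "temperature": 0.7,
}

def submit_job(video_url: str, audio_url: str) -> str:
    """Submit one generation and return its job id (assumed response field)."""
    r = requests.post(
        f"{BASE}/<submit-endpoint>",  # placeholder path
        json={"video_url": video_url, "audio_url": audio_url, **PRESET},
        headers={"Authorization": f"Bearer {API_KEY}"},
        timeout=60,
    )
    r.raise_for_status()
    return r.json()["job_id"]  # assumed field name

def wait_for_result(job_id: str, interval: float = 10.0) -> str:
    """Poll until the job finishes and return the output video URL (assumed)."""
    while True:
        r = requests.get(
            f"{BASE}/<status-endpoint>/{job_id}",  # placeholder path
            headers={"Authorization": f"Bearer {API_KEY}"},
            timeout=30,
        )
        r.raise_for_status()
        status = r.json()
        if status.get("state") == "completed":  # assumed state values
            return status["output_url"]         # assumed field name
        if status.get("state") == "failed":
            raise RuntimeError(f"job {job_id} failed: {status}")
        time.sleep(interval)

# Re-run the tuned preset across a batch of clips.
clips = [("https://example.com/clip1.mp4", "https://example.com/take1.wav")]
for video_url, audio_url in clips:
    print(wait_for_result(submit_job(video_url, audio_url)))
```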
React-1 employs fine-grained diffusion-based video-to-video transformation, aligning frame-level facial geometry with the given emotional prompt. During audio-to-video synthesis, emotional consistency in lip tension, eye focus, and head motion is maintained, ensuring realism in the reanimated performance.
React-1 performs best in dialogue-driven scenes and front-facing footage where facial details are visible, which enables precise audio-to-video mapping and expressive reanimation. It may struggle with heavy occlusion, extreme lighting, or sharp profile angles that disrupt the video-to-video pipeline.
While most models generate entirely new content, React-1 enhances existing lipsync video performances through targeted video-to-video emotion editing. It focuses on identity retention, accurate timing, and emotional fidelity, whereas other generators prioritize scene diversity or stylization.
Commercial usage of any Lipsync or React model, including lipsync video outputs generated via video-to-video or audio-to-video processing, depends on Sync Labs’ official license terms. Always verify usage and attribution requirements on sync.so before deploying results in paid or public productions.
Traditional lipsync video models synchronize mouth movements via audio-to-video alignment only. React-1, however, applies emotion-driven video-to-video transformation, modifying subtle facial aspects such as brows and gaze to reflect emotional tone, offering a richer and more natural performance.