Seedance 2.0 Pro: Cinematic Text-to-Video with 2K Lip-Sync on playground and API

bytedance/seedance-2.0/pro

Generate cinematic 2K videos from text and media inputs with native audio, precise lip-sync, and smooth storytelling control for ads, film previz, and branded visual content.

Prompt *

Text prompt for the video. Recommended length: up to about 500 characters in Chinese, or about 1000 words in English.

Images

Reference images for multimodal reference mode (0–9). Supported formats: jpeg, png, webp, bmp, tiff, gif.

Videos

Reference videos for multimodal reference mode (0–3). Supported formats: mp4, mov. Each video must be between 2 and 15 seconds long.

Audio URLs

Reference audio for multimodal reference mode (0–3). Supported formats: wav, mp3. Each clip must be between 2 and 15 seconds long and smaller than 15 MB.

Aspect Ratio (W:H)

The default is adaptive: the model picks the closest ratio, and the actual ratio is returned by the task query. Fixed options are 16:9, 9:16, 4:3, 3:4, 1:1, and 21:9.

Duration (seconds)

Whole number of seconds from 4 to 15 (default 5).

Resolution

Output resolution preset: 480p or 720p (default).

Generate audio

When true, the model outputs video with synchronized audio (speech, SFX, music).

Seed

Random seed for the video generation.

Web Search

When `web_search` is included, the model may run an online search depending on the prompt (e.g. specific products, current weather), which can improve factual freshness but increases latency.

The rate is $0.25 per second.
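
As a rough illustration of the rate above, the sketch below estimates the cost of a clip. It assumes the rate is billed per second of generated video; the page does not say whether it applies to output length or to compute time, so treat the result as an estimate only. The function name `estimate_cost` is hypothetical.

```python
from decimal import Decimal

RATE_PER_SECOND = Decimal("0.25")  # USD per second, taken from the listed rate


def estimate_cost(duration_s: int) -> Decimal:
    """Estimate cost in USD, ASSUMING billing per second of output video."""
    if not 4 <= duration_s <= 15:
        raise ValueError("duration must be an integer from 4 to 15 seconds")
    return RATE_PER_SECOND * duration_s


print(estimate_cost(5))   # 1.25
print(estimate_cost(15))  # 3.75
```

Under that assumption, the default 5-second clip would cost about $1.25 and the longest 15-second clip about $3.75.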

Introduction To Seedance 2.0 Pro Video Creation

ByteDance's Seedance 2.0 Pro turns text and reference media into cinematic videos at up to 2K resolution, with native audio and precise lip-sync. It accepts multimodal inputs (up to 9 images, 3 videos, and 3 audio clips) and offers camera path control and multi-shot storytelling. For marketing teams, agencies, film previz, and game studios, this replaces manual frame-by-frame editing, masking, rotoscoping, separate dubbing, and audio alignment. For developers, Seedance 2.0 Pro on RunComfy is available both in the browser and via an HTTP API, so you don't need to host or scale the model yourself.
Ideal for: High-Conversion Video Ads | Shot-Accurate Film Previsualization | Brand-Consistent Multi-language Lip-Synced Narratives

ByteDance Seed / Seedance 2.0 Pro

Seedance 2.0 Pro is a multimodal text-to-video model from ByteDance Seed that turns scene descriptions and optional references into short cinematic clips. On RunComfy you drive generation with a prompt plus optional images (up to 9), videos (up to 3), and audio (up to 3) for multimodal reference mode, and you can set aspect ratio, duration, resolution, generate audio, seed, and optional tools (e.g. [{ "type": "web_search" }] to allow online search when the model chooses).

Output (this playground): selectable 480p / 720p · 4–15 seconds · aspect ratios including adaptive (default), 16:9, 9:16, 4:3, 3:4, 1:1, 21:9 · optional native audio when Generate audio is enabled

Highlights

Multimodal references: Up to 9 images, 3 reference videos (2–15 s each, mp4/mov), and 3 audio references (2–15 s, wav/mp3, under 15 MB each), together with a strong text prompt.
Flexible framing: Default adaptive ratio lets the model pick the closest format; fixed ratios are available for platform-specific delivery.
Audio-aware video: Toggle Generate audio for synchronized speech, SFX, and music with the clip.
Controllable length: Duration is an integer from 4 to 15 seconds (default 5).
Reproducibility: Set Seed when you need repeatable results while refining prompts or references.
Optional web search: Pass tools with type: web_search so the model can search the web when needed; check usage.tool_usage.web_search on the task query response for how many searches ran.

Parameters of Seedance 2.0 Pro

Inputs match the RunComfy OpenAPI Input schema for this template.

Parameter	Required	Type	Default	Range / Options	Description
prompt	Yes	string	—	Chinese up to ~500 characters, English up to ~1000 words (recommended)	Text prompt for the video
image_url (Images)	No	array (image URIs)	[]	0–9	jpeg, png, webp, bmp, tiff, gif
video_url (Videos)	No	array (video URIs)	[]	0–3	mp4, mov; duration 2–15 s per clip
audio_url (Audio URLs)	No	array (audio URIs)	[]	0–3	wav, mp3; 2–15 s, < 15 MB
aspect_ratio	No	string	adaptive	adaptive, 16:9, 9:16, 4:3, 3:4, 1:1, 21:9	Adaptive: model picks closest ratio; check task result for actual ratio
duration	No	integer	5	4–15 (seconds)	Clip length in whole seconds
resolution	No	string	720p	480p, 720p	Output resolution preset
generate_audio	No	boolean	true	true / false	When true, outputs video with synchronized audio (speech, SFX, music)
seed	No	integer	—	—	Random seed for video generation
tools	No	array of objects	[]	`type`: web_search only	Declares allowed tools; with web_search, model may search per prompt; see `usage.tool_usage.web_search` on task query
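
The limits in the table can be checked on the client before a job is submitted. The following is a minimal sketch, not RunComfy's own validation: the field names, defaults, and ranges come from the table above, while the helper `build_input` and its `web_search` flag are hypothetical conveniences.

```python
from typing import Any, Dict, Optional, Sequence

ASPECT_RATIOS = ("adaptive", "16:9", "9:16", "4:3", "3:4", "1:1", "21:9")
RESOLUTIONS = ("480p", "720p")


def build_input(
    prompt: str,
    image_url: Sequence[str] = (),
    video_url: Sequence[str] = (),
    audio_url: Sequence[str] = (),
    aspect_ratio: str = "adaptive",
    duration: int = 5,
    resolution: str = "720p",
    generate_audio: bool = True,
    seed: Optional[int] = None,
    web_search: bool = False,
) -> Dict[str, Any]:
    """Check arguments against the table's limits and return an input dict."""
    if not prompt.strip():
        raise ValueError("prompt is required")
    if len(image_url) > 9:
        raise ValueError("at most 9 reference images")
    if len(video_url) > 3:
        raise ValueError("at most 3 reference videos")
    if len(audio_url) > 3:
        raise ValueError("at most 3 reference audio clips")
    if aspect_ratio not in ASPECT_RATIOS:
        raise ValueError(f"aspect_ratio must be one of {ASPECT_RATIOS}")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(duration, bool) or not isinstance(duration, int):
        raise ValueError("duration must be an integer")
    if not 4 <= duration <= 15:
        raise ValueError("duration must be from 4 to 15 seconds")
    if resolution not in RESOLUTIONS:
        raise ValueError(f"resolution must be one of {RESOLUTIONS}")

    payload: Dict[str, Any] = {
        "prompt": prompt,
        "image_url": list(image_url),
        "video_url": list(video_url),
        "audio_url": list(audio_url),
        "aspect_ratio": aspect_ratio,
        "duration": duration,
        "resolution": resolution,
        "generate_audio": generate_audio,
        "tools": [{"type": "web_search"}] if web_search else [],
    }
    if seed is not None:  # the table lists no default seed, so omit it when unset
        payload["seed"] = seed
    return payload


payload = build_input(
    "A slow push-in on a lighthouse at dusk, waves crashing below",
    aspect_ratio="16:9",
    duration=8,
    seed=42,
)
print(payload["duration"], payload["resolution"])  # 8 720p
```

Omitting `seed` when it is unset is a design choice of this sketch; whether the service treats a missing seed and an explicit one differently is not stated in the table.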

How to Use Seedance 2.0 Pro

1) Describe your scene — Cover subject, action, setting, mood, lighting, and camera. For prompt length, Chinese ~≤500 characters or English ~≤1000 words is recommended.

2) Add references (optional) — Upload images for look or identity; add short reference videos or audio. Respect duration and file-size limits for each modality.

3) Choose aspect ratio — Use adaptive for general exploration, or a fixed ratio (e.g. 9:16 or 16:9) for a known deliverable.

4) Set duration — Any integer from 4 to 15 seconds.

5) Pick resolution — 480p for fast drafts, 720p as the default balance.

6) Generate audio — Leave enabled for dialogue, SFX, or music; disable if you only need silent video.

7) Optional seed — Fix the seed while iterating so changes come from prompt or media, not randomness.

8) Optional web search — Add tools [{ "type": "web_search" }] when you want the model to be able to look up timely facts (adds latency when used); after the job completes, read usage.tool_usage.web_search on the task query.

9) Generate and iterate — Refine wording, swap references, or adjust ratio, duration, and resolution.
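
Step 8 above mentions reading `usage.tool_usage.web_search` after the job completes. A minimal sketch of that lookup follows, assuming the task query response has already been parsed from JSON into a dictionary. Only the `usage.tool_usage.web_search` path comes from this page; the `status` field in the sample and the helper name `web_search_count` are hypothetical.

```python
from typing import Any, Dict


def web_search_count(task_response: Dict[str, Any]) -> int:
    """Return how many web searches ran, or 0 if the field is absent."""
    usage = task_response.get("usage") or {}
    tool_usage = usage.get("tool_usage") or {}
    return int(tool_usage.get("web_search", 0))


# Illustrative response shape only; the real response may contain other fields.
sample = {"status": "completed", "usage": {"tool_usage": {"web_search": 2}}}
print(web_search_count(sample))  # 2
print(web_search_count({}))      # 0
```

Returning 0 for a missing field keeps the helper safe to call on jobs submitted without `tools`.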

Prompt & Reference Tips for Seedance 2.0 Pro

Be specific about camera and motion (e.g. medium close-up, slow push-in, handheld vs locked-off).
Use images for what must stay stable (face, costume, logo); use text for what should evolve (action, mood).
Reference videos and audio must be 2–15 seconds; keep reference audio under 15 MB.
With generate_audio on, mention dialogue tone or ambient sound you want to hear.
If results feel noisy or contradictory, simplify the prompt or reduce conflicting style cues.
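
The duration and size limits for reference clips can be checked locally before upload. The sketch below assumes you already know each clip's duration and file size (for example from your media tooling) and that "15 MB" means 15,000,000 bytes; the page does not say whether decimal or binary megabytes are meant. The `ReferenceClip` type and `clip_problems` helper are hypothetical.

```python
from dataclasses import dataclass
from typing import List

MAX_AUDIO_BYTES = 15 * 1000 * 1000  # ASSUMPTION: decimal megabytes


@dataclass
class ReferenceClip:
    kind: str          # "video" or "audio"
    duration_s: float  # clip length in seconds
    size_bytes: int    # file size in bytes


def clip_problems(clip: ReferenceClip) -> List[str]:
    """List the documented limits that a reference clip violates."""
    problems = []
    if not 2 <= clip.duration_s <= 15:
        problems.append("duration must be between 2 and 15 seconds")
    if clip.kind == "audio" and clip.size_bytes >= MAX_AUDIO_BYTES:
        problems.append("audio must be under 15 MB")
    return problems


print(clip_problems(ReferenceClip("audio", 12.0, 4_000_000)))  # []
print(clip_problems(ReferenceClip("video", 20.0, 80_000_000)))
```

The second call reports only the duration problem, because the size limit on this page is stated for audio references.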

How Seedance 2.0 Pro compares to other models

Seedance 1.5 Pro: Seedance 2.0 Pro extends multimodal reference options, duration control, and cinematic motion for short clips; compare outputs on the same prompt and references.
Wan 2.5 / Kling Video 2.6: Pick based on whether you need this template’s mix of image, video, and audio references, lip-sync-friendly audio generation, and the parameter set above.

More Models to Try

Seedance 1.5 Pro — Solid prior-generation lip-sync and short-form results.
Seedance 1.0 — Simple text-to-video baselines.
Wan 2.5 — Strong general-purpose text-to-video alternative.
Kling Video 2.6 — Motion editing and reference-driven action strengths.

In short, Seedance 2.0 Pro on RunComfy supports text plus up to nine images, three videos, and three audio references, 4–15 s clips, 480p and 720p presets, flexible aspect ratios including adaptive, and optional native audio, all matching the live playground fields.

Related Playgrounds

sora-2/text-to-video

Generate realistic videos with synced audio from text using OpenAI Sora 2.

wan-2-2/image-to-video

Refined AI visuals, real-time control, and pro FX for creators

pikaffects

Add instant visual effects to a single image and export as a video.

scail

Delivers consistent face animation from a single image using motion-driven synthesis for design and game visualization.

kling-video-o1/standard/text-to-video

Create lifelike cinematic video clips from prompts with motion control.

happyhorse-1.0/text-to-video

HappyHorse 1.0 with native 1080p output, cinematic motion, and multi-shot consistency.

Frequently Asked Questions

What resolution and aspect ratio options does Seedance 2.0 Pro text-to-video expose on RunComfy?

On this playground, resolution is either 480p or 720p (default), matching the parameter table. Aspect ratio can be adaptive (the default, where the model picks the closest ratio and the task result reflects the actual output) or fixed: 16:9, 9:16, 4:3, 3:4, 1:1, or 21:9.

What are the limits for prompts and reference media (images, video, audio)?

Prompt: up to about 500 characters in Chinese or about 1000 words in English is recommended. Images: up to 9 files (jpeg, png, webp, bmp, tiff, gif). Reference videos: up to 3 (mp4, mov), each 2–15 seconds. Reference audio: up to 3 (wav, mp3), each 2–15 seconds and under 15 MB. Only prompt is required; all reference fields are optional.

How do I move from testing in the RunComfy Playground to production API integration with Seedance 2.0 Pro text-to-video?

Use the RunComfy API with the same Input fields as the playground (prompt, image_url, video_url, audio_url, aspect_ratio, duration, resolution, generate_audio, seed, and optionally tools). Validate prompts and media limits in the UI first, then use your account API key and credits for automated jobs.
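
As a sketch of what an automated job submission might look like, the code below assembles (but does not send) an HTTP request using only the Python standard library. The endpoint URL is a placeholder and the bearer-token header is an assumed authentication scheme; neither is taken from this page, so check the RunComfy API documentation for the real endpoint, authentication, and request envelope. Only the input field names come from the playground.

```python
import json
import urllib.request

# HYPOTHETICAL placeholder, not a real endpoint; replace with the documented URL.
API_URL = "https://api.example.invalid/bytedance/seedance-2.0/pro"


def make_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build a JSON POST request; the caller decides when to send it."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",  # ASSUMED auth scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = make_request(
    "YOUR_API_KEY",
    {
        "prompt": "Medium close-up of a barista pouring latte art, slow push-in",
        "aspect_ratio": "9:16",
        "duration": 5,
        "resolution": "720p",
        "generate_audio": True,
    },
)
print(req.get_method(), req.full_url)
```

Once the endpoint and authentication are confirmed, the request could be sent with `urllib.request.urlopen(req)`; the task query step for fetching results is not shown because its shape is not described here.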

What has Seedance 2.0 Pro text-to-video improved compared to Seedance 1.5 Pro?

Seedance 2.0 Pro targets cinematic short clips with multimodal references (many images plus optional video and audio), 4–15 second duration control, flexible aspect ratios including adaptive, and native audio when generate_audio is on—useful for lip-sync and synced SFX or music. Exact benchmarks depend on your content; compare outputs on identical prompts and references.

How does Seedance 2.0 Pro text-to-video compare to models like Wan 2.5 or Kling Video 2.6?

The choice depends on your workflow. On RunComfy, this Seedance 2.0 Pro template offers up to nine images, three reference videos, three audio references, 480p and 720p presets, and a toggle for generated audio. Wan 2.5 and Kling 2.6 differ in pricing, limits, and strengths, so run parallel tests on your typical prompts and reference sets.

Can Seedance 2.0 Pro text-to-video keep characters or style consistent across a clip?

Yes. In practice, reference images (and optional reference video or audio) plus a clear prompt help anchor identity, wardrobe, and tone. There is no special “@” syntax documented in the API schema; consistency comes from aligned text and reference media within the supported limits.

Does Seedance 2.0 Pro text-to-video support native audio generation and lip-sync?

When Generate audio (generate_audio) is true (the default), the model is described as outputting video with synchronized audio (speech, SFX, music). Set it to false if you only want silent video. Lip-sync quality still depends on prompt clarity and the scene you describe.

How long can each generated video be, and how is duration set?

Duration is an integer from 4 to 15 seconds (default 5). Pick any whole-second value in that range per generation.

Can I use Seedance 2.0 Pro text-to-video output commercially?

Commercial use depends on ByteDance’s licensing terms for the model and RunComfy’s terms of service. Review the official model license and RunComfy documentation, or contact hi@runcomfy.com before using generated footage in paid campaigns or public distribution.

Who benefits most from Seedance 2.0 Pro text-to-video on RunComfy?

Teams and creators who need short cinematic clips with optional image, video, and audio references, platform-specific aspect ratios, 480p or 720p exports on this playground, and optional built-in audio. Typical uses include ads, social video, previsualization, and branded narrative tests.

RunComfy

RunComfy is the premier ComfyUI platform, offering an online ComfyUI environment and services, along with ComfyUI workflows featuring stunning visuals. RunComfy also provides AI Models, enabling artists to harness the latest AI tools to create incredible art.
