Kling Video O3 Pro Text To Video: Cinematic Text-to-Video Generation on RunComfy Models and API

kling/kling-video-o3/pro/text-to-video

Write a prompt, choose 5 or 10 seconds, and Kling Video O3 Pro Text To Video returns a Pro-grade cinematic clip with optional synchronized sound, on RunComfy models and HTTP API.

  • prompt: Describe the scene, characters, camera move, lighting, and mood you want the model to render.
  • aspect_ratio: Output frame ratio. 16:9 for landscape, 9:16 for vertical social, 1:1 for square.
  • duration: Length of the generated clip in seconds. Pricing scales linearly with the selected duration.
  • sound: When enabled, synthesize matching ambient audio and sound effects alongside the video. Adds ~25% to the per-second cost.
  • shot_type: Editing mode. Use intelligent for auto-decided scope, or customize for prompt-driven manual control.
  • multi_prompt: Additional prompt segments that guide scene transitions and progressions within the clip. The sum of segment durations must equal the total video duration.
The rate is $0.112 per second without sound, and $0.14 per second with sound.

Introduction To Kling Video O3 Pro Text To Video

Kuaishou's Kling Video O3 Pro Text To Video renders cinematic 5 or 10 second clips from a single prompt at $0.112 per second without sound, or $0.14 per second when synchronized audio is generated.

Trading shot lists, on-set crews, and color sessions for a written brief, the model gives directors, motion designers, agency creatives, and product teams broadcast-grade footage with physics-aware motion and rights-managed delivery.

For developers, Kling Video O3 Pro Text To Video on RunComfy can be used both in the browser and via an HTTP API, so you don't need to host or scale the model yourself.

Ideal for: Hero Brand Films | Cinematic Concept Reels | Premium Social Spots

Kuaishou / Kling Video O3 Pro Text To Video


This is Kuaishou's flagship O3-generation text-to-video model, tuned for final-render fidelity. Send a single written description of the scene and the model returns a 5 or 10 second clip with physics-aware motion, controlled framing, and an optional matching audio track.


It fits teams that need broadcast-grade footage from natural language — no shoot day, no compositing pass, no model hosting.


Highlights


  • Cinematic-quality renders: Targets the highest fidelity in the O3 family, with controlled lighting, composition, and motion blur suitable for hero spots.
  • Physics-aware motion: Fluids, fabric, hair, and rigid-body interactions read naturally, so Kling Video O3 Pro Text To Video holds up under close inspection.
  • Synchronized audio option: Flip the sound toggle to layer matching ambient audio and effects on the same generation pass.
  • Multi-prompt sequencing: Chain prompt segments to drive scene progression and on-clip transitions inside a single run.
  • Element list control: Reference specific characters, props, or stylistic elements to keep them consistent across the duration of the clip.
  • Flexible aspect ratios: Output in 16:9, 9:16, or 1:1 to fit cinema, vertical social, and square placements without a re-render.
  • Direct shot-scope control: Choose intelligent for automatic edit scoping or customize for tight prompt-led control over what changes per beat.
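
The multi-prompt rule (segment durations must add up to the clip length, with at most 20 segments) can be checked before a request is sent. The following is a minimal sketch, not RunComfy's own validation: the `Segment` class and its `prompt` / `duration` field names are assumptions made for illustration, since this page does not show the segment schema.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    # Hypothetical shape: the page only says each segment carries a prompt and a duration.
    prompt: str
    duration: int  # seconds


def check_segments(segments: list[Segment], total_duration: int) -> None:
    """Raise ValueError if the segments break the rules stated on this page."""
    if total_duration not in (5, 10):
        raise ValueError("total duration must be 5 or 10 seconds")
    if len(segments) > 20:
        raise ValueError("at most 20 segments are allowed")
    if any(s.duration <= 0 for s in segments):
        raise ValueError("each segment needs a positive duration")
    # An empty list is the documented default, so only non-empty lists are summed.
    if segments and sum(s.duration for s in segments) != total_duration:
        raise ValueError("segment durations must sum to the total duration")


beats = [
    Segment("Wide shot of a harbor at dawn", 3),
    Segment("Push in on a fishing boat leaving the dock", 4),
    Segment("Close-up of rope hitting the wet deck", 3),
]
check_segments(beats, total_duration=10)  # passes silently: 3 + 4 + 3 == 10
```

Checking locally like this avoids paying for a generation that the service would reject or interpret differently than intended.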

Parameters


| Parameter | Required | Type | Default | Range / Options | Description |
|---|---|---|---|---|---|
| prompt | Yes | string | — | Free text | Scene description covering subject, action, camera, lighting, and mood. |
| aspect_ratio | No | string | 16:9 | 16:9, 9:16, 1:1 | Output frame ratio. |
| duration | No | integer | 5 | 5, 10 | Clip length in seconds; billing scales linearly. |
| sound | No | boolean | false | true / false | Generate matching synchronized audio with the video. |
| shot_type | No | string | customize | customize, intelligent | Editing mode; intelligent auto-decides scope, customize follows the prompt. |
| multi_prompt | No | array | [] | Up to 20 segments | Additional prompt segments with per-segment duration to drive scene transitions. |
| element_list | No | array | [] | Up to 7 IDs | Kling Elements reference IDs that should stay consistent across the clip. |
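
As an illustration of how the defaults and allowed values above fit together, here is a small sketch that assembles a parameter dictionary and rejects out-of-range values. The `build_input` helper is hypothetical and not part of any RunComfy SDK; only the parameter names, defaults, and limits come from the table.

```python
ASPECT_RATIOS = ("16:9", "9:16", "1:1")
DURATIONS = (5, 10)
SHOT_TYPES = ("customize", "intelligent")


def build_input(prompt, aspect_ratio="16:9", duration=5, sound=False,
                shot_type="customize", multi_prompt=None, element_list=None):
    """Return a parameter dict, raising ValueError on values outside the table."""
    if not isinstance(prompt, str) or not prompt.strip():
        raise ValueError("prompt is required")
    if aspect_ratio not in ASPECT_RATIOS:
        raise ValueError(f"aspect_ratio must be one of {ASPECT_RATIOS}")
    if duration not in DURATIONS:
        raise ValueError(f"duration must be one of {DURATIONS}")
    if not isinstance(sound, bool):
        raise ValueError("sound must be true or false")
    if shot_type not in SHOT_TYPES:
        raise ValueError(f"shot_type must be one of {SHOT_TYPES}")
    multi_prompt = list(multi_prompt or [])
    element_list = list(element_list or [])
    if len(multi_prompt) > 20:
        raise ValueError("multi_prompt accepts up to 20 segments")
    if len(element_list) > 7:
        raise ValueError("element_list accepts up to 7 IDs")
    return {
        "prompt": prompt,
        "aspect_ratio": aspect_ratio,
        "duration": duration,
        "sound": sound,
        "shot_type": shot_type,
        "multi_prompt": multi_prompt,
        "element_list": element_list,
    }


params = build_input("A glass of water tipping over in slow motion, macro lens")
print(params["duration"], params["shot_type"])  # prints: 5 customize
```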

Pricing


Kling Video O3 Pro Text To Video bills per second of generated output on RunComfy. Enabling sound adds roughly 25% to the per-second rate.


| Mode | Rate per second |
|---|---|
| Without sound | $0.112 |
| With sound | $0.140 |

Estimated cost per generation


| Duration | Without sound | With sound |
|---|---|---|
| 5 s | $0.56 | $0.70 |
| 10 s | $1.12 | $1.40 |
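
The estimates are simply the per-second rate multiplied by the clip length. A minimal sketch of that arithmetic follows; the `estimate_cost` function name is made up for this example, and the rates are the ones listed on this page, which may change.

```python
from decimal import Decimal

# Per-second rates as listed on this page.
RATE_SILENT = Decimal("0.112")
RATE_SOUND = Decimal("0.140")


def estimate_cost(duration_s: int, sound: bool = False) -> Decimal:
    """Return the estimated cost in USD for one generation, rounded to cents."""
    if duration_s not in (5, 10):
        raise ValueError("duration must be 5 or 10 seconds")
    rate = RATE_SOUND if sound else RATE_SILENT
    return (rate * duration_s).quantize(Decimal("0.01"))


print(estimate_cost(5))               # prints: 0.56
print(estimate_cost(10, sound=True))  # prints: 1.40
print(RATE_SOUND / RATE_SILENT)       # prints: 1.25 (the 25% sound surcharge)
```

`Decimal` is used instead of floats so that the cent values match the table exactly.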

Related Models

wan-2-2/lora/text-to-video

Use WAN 2.2 LoRA for realistic video creation from text.

wan-2-2/fun-camera

Create smooth motion clips from stills with custom camera moves.

kling-2-6/motion-control-pro

Cinematic motion model for fluid scene creation and adaptive visual editing.

kling-3.0/pro/text-to-video

Premium cinematic text-to-video with the highest visual fidelity in the Kling V3.0 family.

wan-2.7/text-to-video

Create 1080p clips with multi-reference and frame control.

wan-2-1/text-to-video

Generate cinematic videos from text prompts with Wan 2.1.

Frequently Asked Questions

What is Kling Video O3 Pro Text To Video best used for?

Kling Video O3 Pro Text To Video is Kuaishou's flagship text-to-video model, tuned for cinematic 5 or 10 second renders from a single prompt. It is a strong fit for hero brand films, premium social spots, and concept reels where physics-aware motion, controlled lighting, and broadcast-grade composition matter.

How does Kling Video O3 Pro Text To Video compare to the O3 Standard tier?

Kling Video O3 Pro Text To Video targets the highest fidelity in the O3 family, with stronger detail, motion coherence, and lighting control suited to final renders. Based on publicly available information, the Standard tier offers a lower per-second price and is aimed at drafts and high-volume iteration.

Does Kling Video O3 Pro Text To Video generate sound with the clip?

Yes — Kling Video O3 Pro Text To Video has a sound toggle that synthesizes matching ambient audio and effects in the same generation pass. Sound is off by default and adds roughly 25% to the per-second rate when enabled.

How well does Kling Video O3 Pro Text To Video follow detailed prompts?

Kling Video O3 Pro Text To Video reads structured prompts well: subject, action, camera move, lighting, era, and mood all influence the result. Use multi-prompt segments for scene progression and an element list to keep specific characters, props, or styles consistent across the clip.

What aspect ratios and durations does Kling Video O3 Pro Text To Video support?

Kling Video O3 Pro Text To Video supports 16:9, 9:16, and 1:1 aspect ratios for cinema, vertical social, and square placements. Clip duration is 5 or 10 seconds; pricing scales linearly with the selected length. Check the current RunComfy parameter panel for the exact limits.

What input limits should I know before using Kling Video O3 Pro Text To Video?

Only the prompt field is required; aspect_ratio, duration, sound, shot_type, multi_prompt, and element_list are optional. Please follow Kuaishou's content usage policies when crafting prompts, and check the RunComfy panel for any provider-side limits that may apply.

Can developers use Kling Video O3 Pro Text To Video through the RunComfy API?

Yes — prototype Kling Video O3 Pro Text To Video in the RunComfy model UI, then call the same model from your backend over the RunComfy HTTP API with identical parameters. No GPU hosting or model scaling work is required on your side.
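
To make the backend call concrete, here is a rough sketch of how such a request could be assembled with Python's standard library. This page does not give the endpoint URL, authentication scheme, or response format, so `API_BASE`, the bearer-token header, and the `make_request` helper are all placeholders; consult the RunComfy API Docs for the real values. Only the model ID and parameter names come from this page.

```python
import json
import urllib.request

# Placeholder: the real base URL is not stated on this page.
API_BASE = "https://api.example.invalid/v1/models"
MODEL_ID = "kling/kling-video-o3/pro/text-to-video"


def make_request(api_key: str, model_input: dict) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request carrying the model parameters."""
    body = json.dumps(model_input).encode("utf-8")
    return urllib.request.Request(
        url=f"{API_BASE}/{MODEL_ID}",
        data=body,
        method="POST",
        headers={
            # Assumed auth scheme; check the API docs for the actual header.
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


req = make_request(
    "YOUR_API_KEY",
    {"prompt": "A paper boat drifting down a rain-soaked street", "duration": 5, "sound": False},
)
# urllib.request.urlopen(req) would send it; omitted here because the URL is a placeholder.
```

Building the request separately from sending it keeps the parameters easy to inspect and log before any billable generation starts.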

How much does it cost to generate with Kling Video O3 Pro Text To Video on RunComfy?

Kling Video O3 Pro Text To Video bills $0.112 per second of output without sound and $0.140 per second with sound, so a 5-second silent clip is $0.56 and a 10-second clip with sound is $1.40. Generations are deducted from your RunComfy USD / credit balance, and new users typically receive a free trial amount to test.

Copyright 2026 RunComfy. All Rights Reserved.

