Lifelike characters, realistic physics, and stunning effects.
This is the 4K-tier reference-driven entry in Kuaishou's O3 family, tuned for final-render fidelity while keeping the subject from the references locked across the whole shot. Provide up to seven reference images (or up to four when also pairing a reference video) plus a written description, and the model returns a 3 to 15 second 4K clip with physics-aware motion and an optional matching audio track.
It fits teams that need broadcast-grade 4K footage of a specific person, product, or stylistic motif — without a shoot day, manual rotoscoping, or self-hosted GPUs.
| Parameter | Required | Type | Default | Range / Options | Description |
|---|---|---|---|---|---|
| prompt | Yes | string | — | Free text | Scene, subject, action, camera move, and lighting. |
| images | No | array of URLs | — | Up to 7 without reference video, up to 4 with | Reference images that lock subject identity and style. |
| aspect_ratio | No | string | 16:9 | 16:9, 9:16, 1:1 | Output frame ratio. |
| duration | No | integer | 5 | 3 to 15 | Clip length in seconds; billing scales linearly. |
| sound | No | boolean | false | true / false | Generate matching audio (available only when no reference video is supplied). |
| shot_type | No | string | customize | customize, intelligent | Editing mode; intelligent auto-decides scope, customize follows the prompt. |
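
The constraints in the table above can be checked client-side before a request is submitted. The following is a minimal sketch, not an official client: the field names and limits come from the table, while the `build_payload` helper and the `has_reference_video` flag are hypothetical names introduced here for illustration (the page does not name the reference-video field).

```python
from typing import Any, Dict, List, Optional

ASPECT_RATIOS = {"16:9", "9:16", "1:1"}
SHOT_TYPES = {"customize", "intelligent"}


def build_payload(
    prompt: str,
    images: Optional[List[str]] = None,
    aspect_ratio: str = "16:9",
    duration: int = 5,
    sound: bool = False,
    shot_type: str = "customize",
    has_reference_video: bool = False,  # hypothetical flag, not a documented field
) -> Dict[str, Any]:
    """Validate inputs against the documented ranges and return a payload dict."""
    images = list(images or [])
    if not prompt.strip():
        raise ValueError("prompt is required")
    # Up to 7 images without a reference video, up to 4 with one.
    max_images = 4 if has_reference_video else 7
    if len(images) > max_images:
        raise ValueError(f"at most {max_images} reference images allowed")
    if aspect_ratio not in ASPECT_RATIOS:
        raise ValueError(f"aspect_ratio must be one of {sorted(ASPECT_RATIOS)}")
    if isinstance(duration, bool) or not isinstance(duration, int):
        raise ValueError("duration must be an integer number of seconds")
    if not 3 <= duration <= 15:
        raise ValueError("duration must be between 3 and 15 seconds")
    if shot_type not in SHOT_TYPES:
        raise ValueError(f"shot_type must be one of {sorted(SHOT_TYPES)}")
    # Audio generation is only available when no reference video is used.
    if sound and has_reference_video:
        raise ValueError("sound requires that no reference video is supplied")

    payload: Dict[str, Any] = {
        "prompt": prompt,
        "aspect_ratio": aspect_ratio,
        "duration": duration,
        "sound": sound,
        "shot_type": shot_type,
    }
    if images:
        payload["images"] = images
    return payload
```

For example, `build_payload("A chef plating dessert, slow dolly-in", images=["https://example.com/chef.png"], duration=8)` returns a dict with the table defaults filled in, while passing five images together with `has_reference_video=True` raises `ValueError`.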
Kling Video O3 4k Reference To Video is Kuaishou's 4K reference-driven entry in the O3 family, tuned for cinematic 3 to 15 second clips that lock the subject's identity from your reference images. It is a strong fit for character-driven brand films, premium product reels, and talent-consistent spots where physics-aware motion and 4K-grade detail matter.
Kling Video O3 4k Reference To Video targets the highest resolution and final-render detail in the O3 reference-to-video lineup. Based on publicly available information, the Pro tier produces HD-grade output at a lower per-second price, and the Standard tier is cheaper still, which suits drafts and high-volume iteration.
Kling Video O3 4k Reference To Video accepts up to seven reference images when no reference video is provided, and up to four images when a reference video is also used. Adding more clean references from different angles generally improves identity preservation across the clip.
When no reference video is supplied, Kling Video O3 4k Reference To Video can synthesize matching ambient audio in the same generation pass via the sound toggle. Sound is off by default, and pricing stays the same whether sound is on or off.
Kling Video O3 4k Reference To Video supports 16:9, 9:16, and 1:1 aspect ratios for cinema, vertical social, and square placements. Clip duration can be set from 3 to 15 seconds in one-second steps. Check the current RunComfy parameter panel for the exact limits.
Kling Video O3 4k Reference To Video extracts subject features from multiple reference viewpoints and carries them through every frame, which is the main advantage of a reference-to-video flow over pure text-to-video. Clean, well-lit references from different angles give the strongest consistency.
Only the prompt field is required for Kling Video O3 4k Reference To Video; images, aspect_ratio, duration, sound, shot_type, multi-prompt, and element_list are optional. Reference URLs must be publicly accessible. Limits may vary by mode or provider settings, so check the RunComfy panel.
You can prototype Kling Video O3 4k Reference To Video in the RunComfy model UI, then call the same model from your backend over the RunComfy HTTP API with identical parameters. No GPU hosting or model scaling work is required on your side.
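As a rough illustration of the backend side, the sketch below builds (but does not send) a JSON POST request using only the Python standard library. The endpoint URL, the bearer-token header, and the `build_request` helper are placeholders and assumptions made for this example; this page does not document the actual RunComfy endpoint or authentication scheme, so take those from the RunComfy API documentation.

```python
import json
import urllib.request

# Placeholder only: substitute the real endpoint from the RunComfy API docs.
API_URL = "https://api.example.invalid/v1/generate"


def build_request(payload: dict, api_token: str) -> urllib.request.Request:
    """Wrap a parameter dict in a JSON POST request (not yet sent)."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Content-Type": "application/json",
            # Assumed auth scheme; the real API may use a different header.
            "Authorization": f"Bearer {api_token}",
        },
        method="POST",
    )


request = build_request(
    {
        "prompt": "A ceramic mug rotating on a marble counter, soft window light",
        "images": ["https://example.com/mug-front.png"],
        "aspect_ratio": "16:9",
        "duration": 5,
    },
    api_token="YOUR_TOKEN",
)
```

Once a real endpoint and credentials are substituted, the request could be sent with `urllib.request.urlopen(request)`. The payload keys mirror the parameters used in the UI.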
Kling Video O3 4k Reference To Video bills a flat $0.42 per second of generated video, regardless of whether sound is on or off. A 5-second clip costs $2.10 and a 15-second clip costs $6.30. Generations are deducted from your RunComfy USD/credit balance, and new users typically receive a free trial amount to test.
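The linear pricing can be worked out with a few lines of code. This sketch assumes the flat $0.42 per second rate and the 3 to 15 second range quoted on this page; `estimate_cost` is a hypothetical helper name, and `Decimal` is used so the dollar amounts stay exact.

```python
from decimal import Decimal

PRICE_PER_SECOND = Decimal("0.42")  # flat rate quoted above; sound does not change it


def estimate_cost(duration_seconds: int) -> Decimal:
    """Return the cost in USD of one clip of the given length."""
    if not 3 <= duration_seconds <= 15:
        raise ValueError("duration must be between 3 and 15 seconds")
    return PRICE_PER_SECOND * duration_seconds


print(estimate_cost(5))   # 2.10
print(estimate_cost(15))  # 6.30
```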
RunComfy is the premier ComfyUI platform, offering an online ComfyUI environment and services, along with ComfyUI workflows featuring stunning visuals. RunComfy also provides AI Models, enabling artists to harness the latest AI tools to create incredible art.





