Kling Video O3 4k Reference To Video: 4K Reference-Driven Video Generation on Models and API | RunComfy

kling/kling-video-o3/4K/reference-to-video

Upload up to seven reference images, write a prompt, and Kling Video O3 4k Reference To Video returns a 3-15s cinematic 4K clip with identity consistency, on RunComfy models and HTTP API.

Describe the scene, characters, action, camera move, and mood you want the 4K clip to render.
Public image URLs that lock the subject identity, props, or style. Up to 7 without a reference video, up to 4 when a reference video is also used.
Output frame ratio. 16:9 for landscape, 9:16 for vertical social, 1:1 for square.
Length of the generated 4K clip in seconds. Pricing scales linearly with the selected duration.
When enabled, synthesize matching ambient audio alongside the 4K video. Only available when no reference video is provided. Pricing is unchanged.
Editing mode. Use intelligent for auto-decided scope, or customize for prompt-driven manual control.
Additional prompt segments to guide scene transitions and progressions. The sum of the durations in multi_prompt must equal the total video duration.
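The multi_prompt duration rule can be checked before submitting a job. The sketch below is illustrative only: this page does not document the segment shape, so the `prompt` and `duration` keys and the helper name `check_multi_prompt` are assumptions.

```python
def check_multi_prompt(segments, total_duration):
    """Return True when the segment durations add up to the clip duration."""
    # ASSUMPTION: each segment is a dict with "prompt" and "duration" (seconds);
    # the real multi_prompt schema is not shown on this page.
    return sum(seg["duration"] for seg in segments) == total_duration


segments = [
    {"prompt": "Wide shot: the character enters the room", "duration": 4},
    {"prompt": "Close-up: she picks up the product", "duration": 6},
]

# 4 + 6 seconds matches a 10-second clip, so this check passes.
print(check_multi_prompt(segments, 10))
```

If the check fails, adjust either the segment durations or the overall duration so the two agree.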
The rate is $0.42 per second of video, regardless of whether audio is on or off.

Introduction To Kling Video O3 4k Reference To Video

Kuaishou's Kling Video O3 4k Reference To Video turns up to seven reference images and a written brief into cinematic 3 to 15 second 4K clips at a flat $0.42 per second, with optional synthesized sound at no extra charge.

Trading reshoots, hand-keyed character rigs, and frame-by-frame compositing for one guided generation, the model gives film teams, ad agencies, brand studios, and product groups identity-consistent 4K footage that holds the subject across the entire clip.

For developers, Kling Video O3 4k Reference To Video on RunComfy can be used both in the browser and via an HTTP API, so you don't need to host or scale the model yourself.

Ideal for: Character-Driven 4K Spots | Premium Product Reels | Talent-Consistent Brand Films

Kuaishou / Kling Video O3 4K Reference To Video


This is the 4K-tier reference-driven entry in Kuaishou's O3 family, tuned for final-render fidelity while keeping the subject from the references locked across the whole shot. Provide up to seven reference images (or up to four when also pairing a reference video) plus a written description, and the model returns a 3 to 15 second 4K clip with physics-aware motion and an optional matching audio track.


It fits teams that need broadcast-grade 4K footage of a specific person, product, or stylistic motif — without a shoot day, manual rotoscoping, or self-hosted GPUs.


Highlights


  • 4K final-render fidelity: Kling Video O3 4k Reference To Video targets the highest resolution tier of the O3 line for delivery on large screens and premium spots.
  • Identity preservation: The model extracts subject features from multiple viewpoints and carries them coherently across every frame of the clip.
  • Multi-reference inputs: Combine up to seven image references — or four images plus a reference video for guided motion.
  • Optional reference video guidance: Layer in a reference clip when you want the new render to inherit a specific motion or staging.
  • Sound generation toggle: Synthesize matching ambient audio in the same pass when no reference video is supplied; pricing stays flat.
  • Multi-prompt sequencing: Chain prompt segments to script transitions and scene beats within a single 4K generation.
  • Element list control: Lock specific named visual elements so they stay consistent from the first frame to the last.
  • Flexible aspect ratios: Output 16:9, 9:16, or 1:1 to fit cinema, vertical social, and square placements without re-rendering.

Parameters


| Parameter | Required | Type | Default | Range / Options | Description |
|---|---|---|---|---|---|
| prompt | Yes | string | - | Free text | Scene, subject, action, camera move, and lighting. |
| images | No | array of URLs | - | Up to 7 without reference video, up to 4 with | Reference images that lock subject identity and style. |
| aspect_ratio | No | string | 16:9 | 16:9, 9:16, 1:1 | Output frame ratio. |
| duration | No | integer | 5 | 3 to 15 | Clip length in seconds; billing scales linearly. |
| sound | No | boolean | false | true / false | Generate matching audio (only when no reference video). |
| shot_type | No | string | customize | customize, intelligent | Editing mode; intelligent auto-decides scope, customize follows the prompt. |
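As a rough illustration of the constraints in the parameter table, the following sketch validates inputs client-side and assembles a request body. The field names come from the table; the `build_payload` helper and the `has_reference_video` flag are hypothetical, since the table does not list a field name for the reference video.

```python
ASPECT_RATIOS = {"16:9", "9:16", "1:1"}
SHOT_TYPES = {"customize", "intelligent"}


def build_payload(prompt, images=(), aspect_ratio="16:9", duration=5,
                  sound=False, shot_type="customize",
                  has_reference_video=False):
    """Check inputs against the documented ranges and return a body dict."""
    if not prompt:
        raise ValueError("prompt is required")
    # Up to 7 images alone, up to 4 when a reference video is also used.
    max_images = 4 if has_reference_video else 7
    if len(images) > max_images:
        raise ValueError(f"at most {max_images} reference images allowed")
    if aspect_ratio not in ASPECT_RATIOS:
        raise ValueError(f"aspect_ratio must be one of {sorted(ASPECT_RATIOS)}")
    if not isinstance(duration, int) or not 3 <= duration <= 15:
        raise ValueError("duration must be an integer from 3 to 15")
    if shot_type not in SHOT_TYPES:
        raise ValueError(f"shot_type must be one of {sorted(SHOT_TYPES)}")
    # Sound is only available when no reference video is provided.
    if sound and has_reference_video:
        raise ValueError("sound requires that no reference video is used")
    return {
        "prompt": prompt,
        "images": list(images),
        "aspect_ratio": aspect_ratio,
        "duration": duration,
        "sound": sound,
        "shot_type": shot_type,
    }


payload = build_payload(
    "A chef plates a dessert, slow dolly-in, warm kitchen light",
    images=["https://example.com/chef-front.png"],  # placeholder URL
    duration=8,
    sound=True,
)
print(payload)
```

Catching out-of-range values locally avoids a failed generation request; the service may enforce additional limits not listed here.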

Related Models

luma-ray-2/image-to-video

Lifelike characters, realistic physics, and stunning effects.

dreamina-3-0/pro/text-to-video

Turn text into detailed cinematic scenes with Dreamina 3.0 precision.

fantasy-portrait/image-to-video

Cinematic portrait video maker with prompt control and emotion-rich motion

elevenlabs/music-generation

Prompt-driven song creation with 44.1 kHz WAV control and section editing

ai-avatar/v2/pro

Turn static photos into lifelike videos with style, motion, and full creative control.

kling-2-1/standard/image-to-video

Animate a single image into a smooth video with Kling 2.1 Standard.

Frequently Asked Questions

What is Kling Video O3 4k Reference To Video best used for?

Kling Video O3 4k Reference To Video is Kuaishou's 4K reference-driven entry in the O3 family, tuned for cinematic 3 to 15 second clips that lock the subject's identity from your reference images. It is a strong fit for character-driven brand films, premium product reels, and talent-consistent spots where physics-aware motion and 4K-grade detail matter.

How does Kling Video O3 4k Reference To Video compare to the O3 Pro and Standard reference tiers?

Kling Video O3 4k Reference To Video targets the highest resolution and final-render detail in the O3 reference-to-video lineup. The Pro tier renders at a lower per-second price for HD-grade output, and the Standard tier is cheaper still for drafts and high-volume iteration, based on publicly available information.

How many reference images can I use with Kling Video O3 4k Reference To Video?

Kling Video O3 4k Reference To Video accepts up to seven reference images when no reference video is provided, and up to four images when a reference video is also used. Adding more clean references from different angles generally improves identity preservation across the clip.

Does Kling Video O3 4k Reference To Video generate sound with the clip?

Yes — when no reference video is supplied, Kling Video O3 4k Reference To Video can synthesize matching ambient audio in the same generation pass via the sound toggle. Sound is off by default, and pricing stays the same whether sound is on or off.

What aspect ratios and durations does Kling Video O3 4k Reference To Video support?

Kling Video O3 4k Reference To Video supports 16:9, 9:16, and 1:1 aspect ratios for cinema, vertical social, and square placements. Clip duration can be set from 3 to 15 seconds in one-second steps. Check the current RunComfy parameter panel for the exact limits.

How well does Kling Video O3 4k Reference To Video maintain character identity across frames?

Kling Video O3 4k Reference To Video extracts subject features from multiple reference viewpoints and carries them through every frame, which is the main advantage of a reference-to-video flow over pure text-to-video. Clean, well-lit references from different angles give the strongest consistency.

What input limits should I know before using Kling Video O3 4k Reference To Video?

Only the prompt field is required for Kling Video O3 4k Reference To Video; images, aspect_ratio, duration, sound, shot_type, multi_prompt, and element_list are optional. Reference URLs must be publicly accessible. Limits may vary by mode or provider settings, so check the RunComfy panel.

Can developers use Kling Video O3 4k Reference To Video through the RunComfy API?

Yes — prototype Kling Video O3 4k Reference To Video in the RunComfy model UI, then call the same model from your backend over the RunComfy HTTP API with identical parameters. No GPU hosting or model scaling work is required on your side.
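A backend call might look roughly like the sketch below, which only constructs the HTTP request and does not send it. The endpoint URL is a placeholder, and the JSON body plus bearer-token header are assumptions rather than the documented RunComfy request format; consult the RunComfy API Docs for the actual endpoint and authentication scheme.

```python
import json
import urllib.request


def build_request(endpoint, api_key, payload):
    """Build (but do not send) a JSON POST request for a generation job."""
    # ASSUMPTION: JSON body and "Authorization: Bearer <key>" header.
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


payload = {
    "prompt": "A runner crosses a rain-soaked bridge at dawn, tracking shot",
    "images": ["https://example.com/runner-front.png"],  # placeholder URL
    "aspect_ratio": "16:9",
    "duration": 5,
}

# Placeholder endpoint: replace with the URL given in the API Docs.
req = build_request(
    "https://api.example.invalid/kling/kling-video-o3/4K/reference-to-video",
    "YOUR_API_KEY",
    payload,
)
print(req.get_method(), req.full_url)
```

Keeping request construction separate from sending makes it easy to reuse the parameters you settled on in the browser UI and to unit-test the body before spending credits.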

How much does it cost to generate with Kling Video O3 4k Reference To Video on RunComfy?

Kling Video O3 4k Reference To Video bills a flat $0.42 per second of generated video, regardless of whether sound is on or off. A 5-second clip costs $2.10 and a 15-second clip costs $6.30. Generations are deducted from your RunComfy USD / credit balance, and new users typically receive a free trial amount to test.
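The per-second billing can be estimated with a few lines of code. This is a minimal sketch using the $0.42 rate and the 3 to 15 second range stated on this page; the helper name `estimate_cost` is illustrative, and actual charges are determined by RunComfy.

```python
from decimal import Decimal

RATE_PER_SECOND = Decimal("0.42")  # USD per second, as quoted on this page


def estimate_cost(duration_seconds):
    """Estimate the charge in USD for a clip of the given whole-second length."""
    if not 3 <= duration_seconds <= 15:
        raise ValueError("duration must be between 3 and 15 seconds")
    # Decimal keeps the cents exact instead of accumulating float error.
    return RATE_PER_SECOND * duration_seconds


print(estimate_cost(5))   # 2.10
print(estimate_cost(15))  # 6.30
```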

RunComfy
Copyright 2026 RunComfy. All Rights Reserved.

