Kling Video O3 Standard Reference to Video: Character-Consistent Video Generation on Models and API

kling/kling-video-o3/standard/reference-to-video

Combine reference images, an optional reference video, and a prompt into a 3-15s clip with consistent characters using Kling Video O3 Standard Reference to Video, on RunComfy models and HTTP API.

Prompt (required)

Describe the scene and action. Refer to references by position, e.g. 'The man in Figure 2 walks with the woman in Figure 1 through a sunlit park.'

Reference Images

Reference images of characters, props, or styles. Up to 7 without a reference video, or up to 4 when a reference video is provided.

Reference Video URL

Optional reference video for motion guidance, style transfer, or scene continuity. When provided, billing switches to the with-reference-video tier.

Aspect Ratio (W:H)

Output frame ratio. 16:9 for landscape, 9:16 for vertical social, 1:1 for square.

Duration (seconds)

Length of the generated clip in seconds (3-15).

Generate Sound

When enabled, synthesize synchronized audio with the video. Adds ~33% to the per-second cost (no effect when a reference video is provided).

Keep Original Sound

When a reference video is provided, retain its original audio track in the generated output.

Shot Type

Editing scope. Use intelligent for auto-decided pacing and cuts, or customize for prompt-driven manual control.


The rate is $0.084 per second without sound, and $0.112 per second with sound.

Introduction To Kling Video O3 Standard Reference to Video

Kuaishou's Kling Video O3 Standard Reference to Video turns reference images, an optional reference video, and a prompt into 3 to 15 second clips at $0.084 per second without sound, or $0.112 per second with sound.

By replacing reshoots, casting calls, and frame-by-frame compositing with a single guided generation, the model gives social creators, marketing teams, ad designers, and product studios consistent characters and styles across new scenes.

For developers, Kling Video O3 Standard Reference to Video on RunComfy can be used both in the browser and via an HTTP API, so you don't need to host or scale the model yourself.

Ideal for: Character-Driven Short Videos | Brand And Spokesperson Clips | Style-Consistent Concept Reels

Kuaishou / Kling Video O3 Standard Reference to Video

This is Kuaishou's reference-driven entry in the O3 family, built to generate new clips while preserving the identity of the people, props, or styles you supply as references. The Standard tier targets a practical price point and keeps the O3 visual language across the full 3 to 15 second range.

It fits teams that want short-form video featuring specific characters, brand elements, or art directions — without a shoot, a green screen, or manual rotoscoping.

Highlights

Identity preservation: Lock subject features from your reference images so faces, props, or styles stay consistent across frames.
Multi-reference composition: Combine up to 7 images (or up to 4 alongside a reference video) to mix characters, items, and looks in one scene.
Optional reference video: Feed a clip for motion guidance, style transfer, or scene continuity instead of starting from stills alone.
Sound options: Keep the original audio from the reference clip, or synthesize new synchronized sound for silent generations.
Multi-format output: 16:9, 9:16, and 1:1 cover landscape, vertical, and square placements from a single model.
Flexible duration: Any whole-second length from 3 to 15 seconds works for hooks, beats, or full short-form posts.

Related Models

pikadditions

Add a person or object into an existing video with smart compositing.

wan-2-2/lora/image-to-video

Transform stills into cinematic motion with open-source precision tools.

happyhorse-1.0/reference-to-video

HappyHorse 1.0 Reference to Video fuses up to 9 reference images and a prompt into a coherent multi-character clip with stable identity.

veo-3-1/text-to-video

Generate cinematic motion clips with precise control and audio sync.

kling-1-6/pro/image-to-video

Precise prompts, lifelike motion, vivid video quality.

kling-video-o1/image-to-video/reference

Generate cinematic shots guided by reference images with unified control and realistic motion.

Frequently Asked Questions

What is Kling Video O3 Standard Reference to Video and what does it do?

Kling Video O3 Standard Reference to Video is Kuaishou's reference-driven entry in the O3 family. It generates a 3 to 15 second video from a prompt while preserving the identity of the people, props, or styles you supply as reference images, with an optional reference video for motion or style guidance. The output keeps consistent characters and looks across frames without manual masking or compositing.

What kinds of references can I use with Kling Video O3 Standard Reference to Video?

You can attach reference images of characters, props, or art styles, plus an optional reference video for motion or style transfer. Without a reference video, up to 7 image references are supported; with a reference video, image references are capped at 4. In the prompt, you point at specific images by position — for example, "Figure 1 walks toward Figure 2" — so Kling Video O3 Standard Reference to Video knows which subject to place where.
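Because the prompt points at references by position, a quick pre-flight check can catch a prompt that mentions a figure you did not attach. The sketch below is a hypothetical helper, not part of any RunComfy or Kling SDK; it assumes figures are numbered from 1 in the order the images are supplied and that the prompt uses the literal wording "Figure N".

```python
import re


def check_figure_references(prompt: str, num_images: int) -> list[int]:
    """Return the Figure numbers in the prompt that have no matching reference image."""
    # Collect every "Figure N" mention (assumed wording, numbered from 1).
    mentioned = {int(n) for n in re.findall(r"\bFigure\s+(\d+)\b", prompt)}
    return sorted(n for n in mentioned if n < 1 or n > num_images)


prompt = "The man in Figure 2 walks with the woman in Figure 1 through a sunlit park."
print(check_figure_references(prompt, num_images=2))  # []
print(check_figure_references(prompt, num_images=1))  # [2]
```

An empty list means every figure mentioned in the prompt has an image behind it.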

How does Kling Video O3 Standard Reference to Video compare to the O3 Pro reference tier?

Compared to the Pro reference-to-video tier, Kling Video O3 Standard Reference to Video targets a lower per-second rate while keeping the O3 visual language, which is helpful for iteration and higher-volume social or marketing work. Pro is positioned for top-end fidelity on final renders based on available provider information, while Standard suits drafts, variants, and short-form output. The control surface — prompt, references, aspect ratio, duration, sound, shot type — is the same, so prompts transfer between tiers.

Which teams and use cases benefit most from Kling Video O3 Standard Reference to Video?

Social creators, ad and marketing teams, e-commerce video producers, and design studios use Kling Video O3 Standard Reference to Video to spin character-consistent variants from existing photos — different scenes, props, or moods without reshoots. It also fits brand spokesperson clips, concept art-direction reels, and multi-character storytelling shots. Developers integrate it into automated pipelines that turn a brief plus a few stills into a finished short clip.

What input limits should I know before using Kling Video O3 Standard Reference to Video?

The model requires a prompt; references and reference video are optional but recommended. Image references are capped at 7 (or 4 alongside a reference video), aspect_ratio accepts 16:9, 9:16, or 1:1, duration is an integer between 3 and 15 seconds, and shot_type accepts customize or intelligent. Sound and keep_original_sound are booleans. For other constraints such as resolution or file format, check the current RunComfy parameter panel for the exact limits, since they may vary by provider settings.
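The limits above can be checked locally before a request is sent. This is a minimal sketch under stated assumptions: `validate_inputs`, `image_urls`, and `video_url` are hypothetical names chosen for illustration, while `aspect_ratio`, `duration`, and `shot_type` follow the names used in this answer. It covers only the limits listed here, not resolution or file-format constraints.

```python
ALLOWED_ASPECT_RATIOS = {"16:9", "9:16", "1:1"}
ALLOWED_SHOT_TYPES = {"customize", "intelligent"}


def validate_inputs(prompt, aspect_ratio, duration, shot_type,
                    image_urls=(), video_url=None):
    """Return a list of problems; an empty list means the inputs pass these checks."""
    problems = []
    if not isinstance(prompt, str) or not prompt.strip():
        problems.append("prompt is required")
    # The image cap drops from 7 to 4 when a reference video is supplied.
    max_images = 4 if video_url else 7
    if len(image_urls) > max_images:
        problems.append(f"at most {max_images} reference images are allowed")
    if aspect_ratio not in ALLOWED_ASPECT_RATIOS:
        problems.append("aspect_ratio must be 16:9, 9:16, or 1:1")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(duration, bool) or not isinstance(duration, int) or not 3 <= duration <= 15:
        problems.append("duration must be an integer from 3 to 15")
    if shot_type not in ALLOWED_SHOT_TYPES:
        problems.append("shot_type must be customize or intelligent")
    return problems


print(validate_inputs("Figure 1 waves at the camera.", "9:16", 5, "intelligent"))
print(validate_inputs("", "4:3", 20, "auto", image_urls=["a"] * 5, video_url="v"))
```

The first call prints an empty list. The second reports all five problems at once, which is usually friendlier in a pipeline than failing on the first one.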

Can developers use Kling Video O3 Standard Reference to Video through the RunComfy API?

Yes. You can prototype Kling Video O3 Standard Reference to Video in the RunComfy AI Playground Web UI — dialing in references, prompt phrasing, aspect ratio, duration, and audio toggles — and then call the same model via the RunComfy API with identical parameters. This keeps creative iteration in the browser while production runs in code, without changing how the model behaves.
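One way to carry settings from the browser into code is to keep them as a plain dictionary and serialize it to JSON. The sketch below only assembles a request body; it deliberately omits the endpoint, authentication, and response handling, which should come from the RunComfy API documentation. The helper name and the `image_urls` and `video_url` field names are assumptions for illustration, and the real schema may differ.

```python
import json


def build_request_body(prompt, image_urls=(), video_url=None, **options):
    """Assemble a JSON body, leaving out any field the caller did not set."""
    body = {"prompt": prompt}
    if image_urls:
        body["image_urls"] = list(image_urls)  # hypothetical field name
    if video_url:
        body["video_url"] = video_url          # hypothetical field name
    # Remaining settings (e.g. aspect_ratio, duration, shot_type) pass through as given.
    body.update({key: value for key, value in options.items() if value is not None})
    return json.dumps(body, indent=2)


print(build_request_body(
    "The man in Figure 2 walks with the woman in Figure 1 through a sunlit park.",
    image_urls=["https://example.com/woman.png", "https://example.com/man.png"],
    aspect_ratio="16:9",
    duration=5,
    shot_type="intelligent",
))
```

Dropping unset fields lets the service apply its own defaults instead of hard-coding guesses in client code.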

How much does it cost to generate with Kling Video O3 Standard Reference to Video on RunComfy?

Generations draw on the USD credit balance in your RunComfy account. Kling Video O3 Standard Reference to Video bills $0.084 per second without sound and $0.112 per second with sound, based on available provider information; supplying a reference video switches to a 1.5× rate of $0.126 per second and overrides the sound multiplier. For example, 5 seconds without sound comes to $0.42, 10 seconds with sound to $1.12, and 15 seconds with a reference video to $1.89. New users typically receive a free trial balance; refer to the Generation section of the model page for the latest rates.
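The pricing arithmetic can be sketched as a small estimator. It hard-codes the per-second rates quoted in this answer, which may change, and `estimate_cost` is a hypothetical helper rather than anything RunComfy provides. `Decimal` is used so the totals come out exact instead of picking up floating-point noise.

```python
from decimal import Decimal

# Per-second rates in USD, as quoted in the answer above.
RATE_SILENT = Decimal("0.084")
RATE_WITH_SOUND = Decimal("0.112")
RATE_WITH_REFERENCE_VIDEO = Decimal("0.126")


def estimate_cost(duration_s: int, sound: bool = False,
                  has_reference_video: bool = False) -> Decimal:
    """Estimate the USD cost of one generation."""
    if not 3 <= duration_s <= 15:
        raise ValueError("duration must be a whole number of seconds from 3 to 15")
    if has_reference_video:
        rate = RATE_WITH_REFERENCE_VIDEO  # overrides the sound rate
    elif sound:
        rate = RATE_WITH_SOUND
    else:
        rate = RATE_SILENT
    return rate * duration_s


print(estimate_cost(5))                             # 0.420
print(estimate_cost(10, sound=True))                # 1.120
print(estimate_cost(15, has_reference_video=True))  # 1.890
```

Note that turning on sound has no effect on the estimate once a reference video is attached, matching the billing rule described above.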

What prompting style works best with Kling Video O3 Standard Reference to Video?

Kling Video O3 Standard Reference to Video responds best to clear, specific prompts that bind references to roles using "Figure 1", "Figure 2", and so on, then describe the action, environment, and camera move. Concrete cues like "slow tracking shot", "golden hour rim light", or "neon-lit street" anchor look and motion better than vague mood words. For complex scenes, use multi_prompt segments to separate beats so transitions stay clean within a single clip.

