kling/kling/lipsync/text-to-video
Kling Lipsync generates speech-synced visuals from text or video, preserving pose, identity, and scene layout.
1. Get started
Use RunComfy's API to run kling/kling/lipsync/text-to-video. For accepted inputs and outputs, see the model's schema.
curl --request POST \
  --url https://model-api.runcomfy.net/v1/models/kling/kling/lipsync/text-to-video \
  --header "Content-Type: application/json" \
  --header "Authorization: Bearer <token>" \
  --data '{
    "video_url": "https://playgrounds-storage-public.runcomfy.net/tools/7078/media-files/usecase1-1-output.mp4",
    "text": "Hello everyone, Kling Lipsync is available on RunComfy platform now, come and experience it !",
    "voice_id": "genshin_vindi2"
  }'
2. Authentication
Set the YOUR_API_TOKEN environment variable to your API key (manage keys in your Profile) and send it with every request as a Bearer token in the Authorization header: Authorization: Bearer $YOUR_API_TOKEN.
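As a minimal sketch of the header described above, the following Python helper reads the token from the environment and builds the headers used by the curl examples. The function name auth_headers is hypothetical and not part of any RunComfy SDK.

```python
import os


def auth_headers(env_var: str = "YOUR_API_TOKEN") -> dict:
    """Build request headers with the Bearer token read from the environment."""
    token = os.environ.get(env_var)
    if not token:
        # Fail early rather than sending an empty Bearer token.
        raise RuntimeError(f"{env_var} is not set")
    return {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
```

Failing before the request is sent makes a missing key easier to diagnose than an authorization error returned by the server.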
3. API reference
Submit a request
Submit an asynchronous generation job and immediately receive a request_id plus URLs to check status, fetch results, and cancel.
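A rough Python equivalent of the submit call, using only the standard library, might look like the sketch below. The helper names build_submit_request and submit are hypothetical. The length and speed checks mirror the limits in the input schema. The source names only request_id in the response, so the field names of the status, result, and cancel URLs are not assumed here.

```python
import json
import urllib.request

MODEL_URL = "https://model-api.runcomfy.net/v1/models/kling/kling/lipsync/text-to-video"


def build_submit_request(token, video_url, text, voice_id, **optional):
    """Return a urllib Request for the submit endpoint (not yet sent)."""
    if len(text) > 120:
        raise ValueError("text exceeds the 120-character limit in the schema")
    speed = optional.get("voice_speed")
    if speed is not None and not 0.8 <= speed <= 2:
        raise ValueError("voice_speed must be between 0.8 and 2")
    # Optional fields such as voice_language or voice_speed are passed through.
    payload = {"video_url": video_url, "text": text, "voice_id": voice_id, **optional}
    return urllib.request.Request(
        MODEL_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def submit(request):
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(request) as resp:
        return json.load(resp)
```

Calling submit(build_submit_request(...)) should return a JSON object that includes request_id. The same request as a raw curl call: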
curl --request POST \
  --url https://model-api.runcomfy.net/v1/models/kling/kling/lipsync/text-to-video \
  --header "Content-Type: application/json" \
  --header "Authorization: Bearer <token>" \
  --data '{
    "video_url": "https://playgrounds-storage-public.runcomfy.net/tools/7078/media-files/usecase1-1-output.mp4",
    "text": "Hello everyone, Kling Lipsync is available on RunComfy platform now, come and experience it !",
    "voice_id": "genshin_vindi2"
  }'
Monitor request status
Fetch the current state for a request_id ("in_queue", "in_progress", "completed", or "cancelled").
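Since the job is asynchronous, a client typically polls this endpoint until a terminal state is reached. The sketch below shows one way to do that in Python. It assumes the status response is a JSON object with a "status" field, which the source does not confirm. The names fetch_status and wait_until_done are hypothetical.

```python
import json
import time
import urllib.request

API_BASE = "https://model-api.runcomfy.net/v1"
TERMINAL_STATES = {"completed", "cancelled"}


def fetch_status(request_id, token):
    """GET the status endpoint and return the state string.

    Assumption: the response body is JSON with a "status" field.
    """
    req = urllib.request.Request(
        f"{API_BASE}/requests/{request_id}/status",
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["status"]


def wait_until_done(request_id, token, interval=5.0, timeout=600.0,
                    fetch=fetch_status, sleep=time.sleep):
    """Poll until the job is completed or cancelled, or the timeout elapses."""
    waited = 0.0
    while True:
        state = fetch(request_id, token)
        if state in TERMINAL_STATES:
            return state
        if waited >= timeout:
            raise TimeoutError(f"request {request_id} still {state!r} after {timeout}s")
        sleep(interval)
        waited += interval
```

The fetch and sleep parameters are injectable so the loop can be exercised without network access. A single status check as a raw curl call: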
curl --request GET \
  --url https://model-api.runcomfy.net/v1/requests/{request_id}/status \
  --header "Authorization: Bearer <token>"
Retrieve request results
Retrieve the final outputs and metadata for the given request_id; if the job is not complete, the response returns the current state so you can continue polling.
curl --request GET \
  --url https://model-api.runcomfy.net/v1/requests/{request_id}/result \
  --header "Authorization: Bearer <token>"
Cancel a request
Cancel a queued job by request_id. In-progress jobs cannot be cancelled.
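Because only queued jobs can be cancelled, a client may want to check the last known state before calling the endpoint. A minimal Python sketch follows. The helper build_cancel_request is hypothetical and skips the call when the job is not in the "in_queue" state.

```python
import urllib.request

API_BASE = "https://model-api.runcomfy.net/v1"


def build_cancel_request(request_id, token, current_state):
    """Return a POST Request for the cancel endpoint, or None if the job is not queued."""
    if current_state != "in_queue":
        # Nothing to send: only queued jobs can be cancelled.
        return None
    return urllib.request.Request(
        f"{API_BASE}/requests/{request_id}/cancel",
        headers={"Authorization": f"Bearer {token}"},
        method="POST",
    )
```

A non-None result can be sent with urllib.request.urlopen. The same call with curl: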
curl --request POST \
  --url https://model-api.runcomfy.net/v1/requests/{request_id}/cancel \
  --header "Authorization: Bearer <token>"
4. File inputs
Hosted file (URL)
Provide a publicly reachable HTTPS URL. Ensure the host allows server‑side fetches (no login/cookies required) and isn't rate‑limited or blocking bots. Recommended limits: images ≤ 50 MB (~4K), videos ≤ 100 MB (~2–5 min @ 720p). Prefer stable or pre‑signed URLs for private assets.
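Some of these requirements can be checked locally before submitting. The Python sketch below verifies the scheme, host, and file extension of a video URL. The function check_video_url is hypothetical, and the .mp4/.mov list comes from the video_url description in the input schema. It does not test reachability, file size, or duration.

```python
import posixpath
from urllib.parse import urlparse


def check_video_url(url, allowed_ext=(".mp4", ".mov")):
    """Return a list of problems found with the URL (empty if none)."""
    problems = []
    parts = urlparse(url)
    if parts.scheme != "https":
        problems.append("URL must use https")
    if not parts.netloc:
        problems.append("URL has no host")
    # The extension is read from the path only, so query strings on
    # pre-signed URLs do not interfere with the check.
    ext = posixpath.splitext(parts.path)[1].lower()
    if ext not in allowed_ext:
        problems.append(f"unexpected file extension {ext!r}")
    return problems
```

An empty list means the URL passed these basic checks. The server may still fail to fetch the file if the host blocks automated requests.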
5. Schema
Input schema
{
  "type": "object",
  "title": "Input",
  "required": [
    "video_url",
    "text",
    "voice_id"
  ],
  "properties": {
    "video_url": {
      "title": "Video URL",
      "description": "Supports .mp4/.mov, size less than or equal to 100MB, duration between 2 and 10 seconds, resolution 720p/1080p only, width/height between 720 and 1920 pixels.",
      "type": "string",
      "default": "https://playgrounds-storage-public.runcomfy.net/tools/7078/media-files/usecase1-1-output.mp4"
    },
    "text": {
      "title": "Text",
      "description": "Text content for lip-sync video generation. Maximum 120 characters.",
      "type": "string",
      "maxLength": 120,
      "default": "Hello everyone, Kling Lipsync is available on RunComfy platform now, come and experience it !"
    },
    "voice_id": {
      "title": "Voice ID",
      "description": "Voice ID to use for speech synthesis.",
      "type": "string",
      "enum": [
        "genshin_vindi2",
        "zhinen_xuesheng",
        "AOT",
        "ai_shatang",
        "genshin_klee2",
        "genshin_kirara",
        "ai_kaiya",
        "oversea_male1",
        "ai_chenjiahao_712",
        "girlfriend_4_speech02",
        "chat1_female_new-3",
        "chat_0407_5-1",
        "cartoon-boy-07",
        "uk_boy1",
        "cartoon-girl-01",
        "PeppaPig_platform",
        "ai_huangzhong_712",
        "ai_huangyaoshi_712",
        "ai_laoguowang_712",
        "chengshu_jiejie",
        "you_pingjing",
        "calm_story1",
        "uk_man2",
        "laopopo_speech02",
        "heainainai_speech02",
        "reader_en_m-v1",
        "commercial_lady_en_f-v1"
      ],
      "default": "genshin_vindi2"
    },
    "voice_language": {
      "title": "Voice Language",
      "description": "The voice language corresponding to the Voice ID.",
      "type": "string",
      "enum": [
        "zh",
        "en"
      ],
      "default": "en"
    },
    "voice_speed": {
      "title": "Voice Speed",
      "description": "Speech rate for Text to Video generation.",
      "type": "float",
      "default": 1,
      "minimum": 0.8,
      "maximum": 2
    }
  }
}
Output schema
{
  "output": {
    "type": "object",
    "properties": {
      "image": {
        "type": "string",
        "format": "uri",
        "description": "single image URL"
      },
      "video": {
        "type": "string",
        "format": "uri",
        "description": "single video URL"
      },
      "images": {
        "type": "array",
        "description": "multiple image URLs",
        "items": { "type": "string", "format": "uri" }
      },
      "videos": {
        "type": "array",
        "description": "multiple video URLs",
        "items": { "type": "string", "format": "uri" }
      }
    }
  }
}
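Given the output schema above, a client can gather whichever URL fields are present into one list. The Python sketch below assumes the result payload carries these fields under a top-level "output" key, as the schema suggests. The exact response envelope is not shown in the source, and collect_output_urls is a hypothetical helper.

```python
def collect_output_urls(result):
    """Gather every URL found under result["output"] into one flat list."""
    output = result.get("output") or {}
    urls = []
    # Single-value fields come first, then the array fields.
    for key in ("image", "video"):
        if output.get(key):
            urls.append(output[key])
    for key in ("images", "videos"):
        urls.extend(output.get(key) or [])
    return urls
```

For this model the lip-synced clip would presumably appear in the video field. A payload without an "output" key, such as one returned while the job is still running, yields an empty list.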