
ComfyUI Node: FL CosyVoice3 Zero-Shot Clone

Class Name: FL_CosyVoice3_ZeroShot
Category: 🔊FL CosyVoice3/Synthesis
Author: filliptm (Account age: 2386 days)
Extension: ComfyUI_FL-CosyVoice3
Last Updated: 2026-03-21
GitHub Stars: 0.11K

How to Install ComfyUI_FL-CosyVoice3

Install this extension via the ComfyUI Manager by searching for ComfyUI_FL-CosyVoice3:
  1. Click the Manager button in the main menu.
  2. Select the Custom Nodes Manager button.
  3. Enter ComfyUI_FL-CosyVoice3 in the search bar.
After installation, click the Restart button to restart ComfyUI. Then manually refresh your browser to clear the cache and load the updated list of nodes.


FL CosyVoice3 Zero-Shot Clone Description

FL_CosyVoice3_ZeroShot enables zero-shot voice cloning from a reference audio sample.

FL CosyVoice3 Zero-Shot Clone:

FL_CosyVoice3_ZeroShot performs zero-shot voice cloning: given a short reference audio sample, it synthesizes new speech that mimics the original speaker's tone, pitch, and style. No extensive training data for the target voice is required, so a single clip is enough to start generating voice content. The node transcribes and analyzes the reference clip to capture the speaker's characteristics, then synthesizes the supplied text in that voice, even when the text differs from what is spoken in the clip. This is useful for AI art, content creation, and personalized voice applications that need varied voice output.

FL CosyVoice3 Zero-Shot Clone Input Parameters:

reference_audio

The reference_audio parameter is the audio sample from which the voice characteristics are extracted; it serves as the template for the cloning process. Its quality and clarity significantly affect the accuracy of the cloned voice, so use a clean recording with minimal background noise. The maximum duration for the reference audio is 30 seconds.
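The 30-second limit can be checked before a clip is wired into the node. The following is a minimal sketch, assuming the clip is available as a ComfyUI-style AUDIO dictionary with a "waveform" entry whose last dimension is the sample axis and a "sample_rate" entry. The helper names are hypothetical and are not part of the extension.

```python
from types import SimpleNamespace

MAX_REFERENCE_SECONDS = 30.0  # limit stated in the parameter description


def reference_duration_seconds(audio: dict) -> float:
    """Return the clip length in seconds for a ComfyUI-style AUDIO dict."""
    # Assumption: the last dimension of the waveform is the sample axis.
    num_samples = audio["waveform"].shape[-1]
    return num_samples / audio["sample_rate"]


def is_within_limit(audio: dict, limit: float = MAX_REFERENCE_SECONDS) -> bool:
    """True when the clip is no longer than the stated limit."""
    return reference_duration_seconds(audio) <= limit


# Stand-in for a real tensor: only the .shape attribute is needed here.
clip = {"waveform": SimpleNamespace(shape=(1, 1, 16000 * 12)), "sample_rate": 16000}
print(reference_duration_seconds(clip))  # 12.0
print(is_within_limit(clip))  # True
```

With a real workflow the "waveform" value would be a tensor rather than the stand-in object, but the same shape-based arithmetic applies.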

seed

The seed parameter initializes the random number generators for reproducibility. Setting a specific seed value makes the voice cloning process yield the same results across runs, which is useful for debugging or when you need consistent outputs. If the seed is negative, no fixed seed is applied, so results may differ on each run.
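The seed convention described here (a fixed seed for repeatable results, a negative value for uncontrolled randomness) can be illustrated with Python's standard library. This is only a sketch of the convention, not the node's code; the node presumably seeds its own model libraries, and the make_rng helper is hypothetical.

```python
import random


def make_rng(seed: int) -> random.Random:
    """Seeded generator for seed >= 0; unseeded generator otherwise."""
    if seed >= 0:
        return random.Random(seed)
    return random.Random()  # no fixed seed: values vary between runs


a = make_rng(1234)
b = make_rng(1234)
print(a.random() == b.random())  # True
```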

text

The text parameter is the content that you want to synthesize using the cloned voice. This text will be converted into speech using the voice characteristics extracted from the reference audio. The length and complexity of the text can affect the processing time and the final output quality. Ensure that the text is clear and concise for optimal synthesis.

speed

The speed parameter controls the rate at which the synthesized speech is generated. A value greater than 1.0 will speed up the speech, while a value less than 1.0 will slow it down. Adjusting this parameter allows you to match the pace of the synthesized speech to your specific needs or preferences.
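As a rough guide, if speed acts as a simple rate multiplier, the output length scales inversely with it. The sketch below assumes exactly that linear relationship, which this page does not confirm; estimated_duration is a hypothetical helper.

```python
def estimated_duration(base_seconds: float, speed: float) -> float:
    """Estimate output length, assuming speed is a plain rate multiplier."""
    if speed <= 0:
        raise ValueError("speed must be positive")
    return base_seconds / speed


print(estimated_duration(10.0, 1.25))  # 8.0
```

Under this assumption, a clip that lasts 10 seconds at speed 1.0 would last about 20 seconds at speed 0.5.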

use_cross_lingual_fallback

The use_cross_lingual_fallback parameter determines whether to use a cross-lingual approach when a transcript is not available. This mode allows the node to extract voice characteristics without needing a text transcript, which can be useful in multilingual contexts or when the reference audio is in a different language than the text.
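The decision this flag controls can be pictured as a small mode selection. The sketch below is hypothetical: the function name, the mode labels, and the behavior when no transcript exists and the fallback is disabled (here, raising an error) are assumptions for illustration, not the node's documented logic.

```python
from typing import Optional


def choose_mode(transcript: Optional[str], use_cross_lingual_fallback: bool) -> str:
    """Pick a cloning mode from transcript availability and the fallback flag."""
    if transcript and transcript.strip():
        return "zero_shot"  # a transcript of the reference clip is available
    if use_cross_lingual_fallback:
        return "cross_lingual"  # clone from the audio alone, without a transcript
    # Assumption: without a transcript or the fallback, cloning cannot proceed.
    raise ValueError("no transcript and cross-lingual fallback is disabled")


print(choose_mode("Hello there.", False))  # zero_shot
print(choose_mode(None, True))  # cross_lingual
```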

FL CosyVoice3 Zero-Shot Clone Output Parameters:

audio

The audio output parameter is the synthesized audio generated by the node. This audio is in the ComfyUI AUDIO format and contains the voice cloned from the reference audio, speaking the provided text. The quality of this output depends on the reference audio and the parameters set during the process. The sample rate of the audio is determined by the model, typically 24000 Hz for CosyVoice3.
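Outside ComfyUI, the synthesized samples can be written to a standard WAV file. The following is a minimal sketch using only the Python standard library, assuming the output has already been converted to a flat list of mono float samples in the range -1.0 to 1.0. The write_wav_16bit helper and the file name are hypothetical, and the test tone stands in for real node output.

```python
import math
import os
import struct
import tempfile
import wave


def write_wav_16bit(path: str, samples: list, sample_rate: int = 24000) -> None:
    """Write mono float samples in [-1.0, 1.0] as 16-bit PCM WAV."""
    frames = bytearray()
    for s in samples:
        clipped = max(-1.0, min(1.0, s))  # guard against out-of-range values
        frames += struct.pack("<h", int(clipped * 32767))
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit
        wav.setframerate(sample_rate)
        wav.writeframes(bytes(frames))


# One second of a 440 Hz tone as placeholder data at 24000 Hz.
tone = [0.5 * math.sin(2 * math.pi * 440 * n / 24000) for n in range(24000)]
out_path = os.path.join(tempfile.gettempdir(), "cosyvoice3_demo.wav")
write_wav_16bit(out_path, tone)

with wave.open(out_path, "rb") as check:
    info = (check.getframerate(), check.getnframes())
print(info)  # (24000, 24000)
```

Inside a workflow, a save or preview audio node is usually the simpler route; this sketch only shows what the sample rate means for the stored file.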

FL CosyVoice3 Zero-Shot Clone Usage Tips:

  • Ensure your reference audio is of high quality and free from background noise to achieve the best voice cloning results.
  • Use a consistent seed value if you need reproducible results across different runs of the node.
  • Experiment with the speed parameter to find the optimal speech rate that suits your specific application or artistic vision.
  • Consider using the use_cross_lingual_fallback option if your reference audio and text are in different languages, as it allows for more flexible voice cloning without needing a transcript.
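The tips above can be combined into a single workflow description in ComfyUI's API (JSON) format. This is a sketch under stated assumptions: the input names are the ones listed on this page, the file name, text, and parameter values are placeholders, and the node may require further inputs (for example a loaded model) that are not documented here.

```python
import json

# Node ids are arbitrary strings; ["1", 0] means "output 0 of node 1".
workflow = {
    "1": {"class_type": "LoadAudio", "inputs": {"audio": "reference.wav"}},
    "2": {
        "class_type": "FL_CosyVoice3_ZeroShot",
        "inputs": {
            "reference_audio": ["1", 0],
            "text": "Hello, this is a cloned voice.",
            "speed": 1.0,
            "seed": 42,  # fixed seed for reproducible output
            "use_cross_lingual_fallback": True,
            # Assumption: any additional required inputs would be added here.
        },
    },
    "3": {
        "class_type": "SaveAudio",
        "inputs": {"audio": ["2", 0], "filename_prefix": "cosyvoice3_clone"},
    },
}

# Body that would be posted to a ComfyUI server's /prompt endpoint.
payload = json.dumps({"prompt": workflow})
print(len(workflow))  # 3
```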

FL CosyVoice3 Zero-Shot Clone Common Errors and Solutions:

Error cloning voice: <error_message>

  • Explanation: This error occurs when there is an issue during the voice cloning process, which could be due to an invalid reference audio file, incorrect parameter settings, or internal processing errors.
  • Solution: Check the quality and format of your reference audio file to ensure it meets the requirements. Verify that all input parameters are correctly set and within their valid ranges. If the problem persists, consult the traceback for more detailed error information and consider adjusting your inputs or settings accordingly.
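Some of these checks can be run before the node executes. The sketch below is a hypothetical pre-flight helper: only the 30-second reference limit comes from this page, while the other rules (non-empty text, positive speed) are common-sense assumptions rather than the node's documented valid ranges.

```python
def preflight_problems(text: str, speed: float, reference_seconds: float) -> list:
    """Return a list of human-readable problems; an empty list means no issue found."""
    problems = []
    if not text.strip():
        problems.append("text is empty")
    if speed <= 0:
        problems.append("speed must be greater than 0")
    if reference_seconds <= 0:
        problems.append("reference audio is empty")
    elif reference_seconds > 30:
        problems.append("reference audio is longer than 30 seconds")
    return problems


print(preflight_problems("Hello", 1.0, 12.0))  # []
print(len(preflight_problems(" ", 0, 45.0)))  # 3
```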

Copyright 2025 RunComfy. All Rights Reserved.
