ComfyUI Node: FL VoxCPM V2 TTS

Class Name: FL_VoxCPM_V2_TTS
Category: FL/VoxCPM
Author: filliptm (Account age: 2446 days)
Extension: ComfyUI-FL-VoxCPM
Latest Updated: 2026-05-21
GitHub Stars: 0.03K

How to Install ComfyUI-FL-VoxCPM

Install this extension through the ComfyUI Manager:
  1. Click the Manager button in the main menu.
  2. Click the Custom Nodes Manager button.
  3. Enter ComfyUI-FL-VoxCPM in the search bar and install the matching result.
After installation, click the Restart button to restart ComfyUI. Then manually refresh your browser to clear the cache and load the updated list of nodes.

FL VoxCPM V2 TTS Description

Sophisticated text-to-speech node with advanced voice cloning features for high-quality speech synthesis.

FL VoxCPM V2 TTS:

FL VoxCPM V2 TTS generates speech with the VoxCPM V2 model. It offers four modes: Voice Design, Voice Cloning, Controllable Cloning, and Ultimate Cloning. Together they let you either create a unique character voice or replicate an existing voice from reference audio, which makes the node useful for AI artists and developers who need customizable voice synthesis in their projects.

FL VoxCPM V2 TTS Input Parameters:

model_name

Selects the VoxCPM model used for speech generation. The available options are the models the node supports, and the choice affects the quality and style of the output. This node expects a V2 model; selecting a V1 model raises an error (see Common Errors and Solutions).

text

The text to convert into speech. The field accepts multiline input, and each line is processed as a separate chunk. The default text is "VoxCPM is an innovative TTS model designed to generate highly expressive speech."
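To illustrate the line-per-chunk behavior, here is a minimal sketch of how a multiline script could be split before synthesis. The helper name split_into_chunks is hypothetical, and skipping blank lines is an assumption; the node's own splitting logic may differ.

```python
def split_into_chunks(text: str) -> list[str]:
    """Return one chunk per non-empty line, with surrounding whitespace removed.

    Assumption: blank lines are skipped. The node may treat them differently.
    """
    return [line.strip() for line in text.splitlines() if line.strip()]


script = "Hello there.\n\nThis is the second line.\n"
chunks = split_into_chunks(script)
for index, chunk in enumerate(chunks, start=1):
    print(f"chunk {index}: {chunk}")
```

Putting each sentence or paragraph on its own line is a simple way to control where chunk boundaries fall.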

prompt_audio

This optional parameter allows you to provide reference audio for voice cloning. By supplying a sample of the desired voice, the node can more accurately replicate the voice characteristics in the generated speech.

prompt_text

The transcript of the reference audio. The input itself is optional, but supply it together with prompt_audio when cloning a voice: knowing what is said in the reference audio helps the node clone the voice more accurately.

cfg_value

The guidance scale parameter, with a default value of 2.0, influences how closely the generated speech adheres to the provided prompt. Higher values result in speech that is more faithful to the prompt but may sound less natural. The range is from 1.0 to 10.0.

inference_timesteps

This parameter determines the number of diffusion steps used during speech generation. Higher values can improve the quality of the output but will increase processing time. The default is 10, with a range from 1 to 100.

min_tokens

Specifies the minimum length of generated audio tokens, ensuring that the output meets a certain duration. The default is 2, with a range from 1 to 100.

max_tokens

Defines the maximum length of generated audio tokens, controlling the upper limit of the speech duration. The default is 2048, with a range from 64 to 8192.
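The defaults and ranges listed above can be collected into a small pre-flight check, for example when building workflows programmatically. This is a sketch only: the names TTSSettings, RANGES, and validate are hypothetical and not part of the node, and the min_tokens versus max_tokens cross-check is an assumption about sensible usage rather than documented behavior.

```python
from dataclasses import dataclass

# (minimum, maximum) pairs copied from the parameter descriptions above.
RANGES = {
    "cfg_value": (1.0, 10.0),
    "inference_timesteps": (1, 100),
    "min_tokens": (1, 100),
    "max_tokens": (64, 8192),
}


@dataclass
class TTSSettings:
    """Hypothetical container holding the documented default values."""
    cfg_value: float = 2.0
    inference_timesteps: int = 10
    min_tokens: int = 2
    max_tokens: int = 2048


def validate(settings: TTSSettings) -> list[str]:
    """Return a list of human-readable problems; an empty list means all is well."""
    problems = []
    for name, (low, high) in RANGES.items():
        value = getattr(settings, name)
        if not low <= value <= high:
            problems.append(f"{name}={value} is outside [{low}, {high}]")
    # Assumption: a minimum above the maximum is never intended.
    if settings.min_tokens > settings.max_tokens:
        problems.append("min_tokens exceeds max_tokens")
    return problems


print(validate(TTSSettings()))
print(validate(TTSSettings(cfg_value=12.0)))
```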

FL VoxCPM V2 TTS Output Parameters:

waveform

The generated audio as a tensor. It contains the synthesized speech samples and can be passed to downstream nodes for further processing, playback, or saving.

sample_rate

The sample rate of the generated audio, in Hz. Downstream nodes and audio players need this value to play the waveform back at the correct speed and pitch.
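The two outputs only make sense together: the number of samples divided by the sample rate gives the duration, and both are needed to write a playable file. The sketch below shows this with the Python standard library only. A plain list of floats stands in for the waveform tensor, the 16000 Hz rate is a placeholder rather than the node's actual output rate, and the helper names are hypothetical.

```python
import io
import math
import struct
import wave


def duration_seconds(num_samples: int, sample_rate: int) -> float:
    """Duration is the sample count divided by samples per second."""
    return num_samples / sample_rate


def to_wav_bytes(samples: list[float], sample_rate: int) -> bytes:
    """Encode mono float samples in [-1.0, 1.0] as a 16-bit PCM WAV file."""
    clipped = [max(-1.0, min(1.0, s)) for s in samples]
    pcm = struct.pack(f"<{len(clipped)}h", *(int(s * 32767) for s in clipped))
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(1)            # mono
        wav_file.setsampwidth(2)            # 2 bytes = 16-bit samples
        wav_file.setframerate(sample_rate)  # must match the generated audio
        wav_file.writeframes(pcm)
    return buffer.getvalue()


sample_rate = 16000  # placeholder value, not the node's actual rate
# Half a second of a 440 Hz sine tone stands in for generated speech.
samples = [math.sin(2 * math.pi * 440 * n / sample_rate)
           for n in range(sample_rate // 2)]

print(duration_seconds(len(samples), sample_rate))
wav_bytes = to_wav_bytes(samples, sample_rate)
```

Writing the file with a different rate than the one the audio was generated at would change its playback speed and pitch, which is why the node reports sample_rate alongside waveform.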

FL VoxCPM V2 TTS Usage Tips:

  • Experiment with different model_name options to find the best fit for your project's voice characteristics.
  • Use prompt_audio and prompt_text for accurate voice cloning, especially when replicating specific voices.
  • Adjust cfg_value to balance between naturalness and adherence to the prompt, depending on your needs.
  • Increase inference_timesteps for higher quality output, but be mindful of the increased processing time.

FL VoxCPM V2 TTS Common Errors and Solutions:

Model 'model_name' not found.

  • Explanation: This error occurs when the specified model name is not available in the node's supported models.
  • Solution: Ensure that you select a model name from the available options provided by the node.

'model_name' is a V1 model. Use the FL VoxCPM TTS node instead.

  • Explanation: This error indicates that a V1 model was selected, which is not compatible with the V2 node.
  • Solution: Switch to using the FL VoxCPM TTS node for V1 models or select a V2 model for this node.
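The two errors above can be pictured as a lookup followed by a version check. The sketch below reproduces the documented messages, but everything else is an assumption: the MODEL_VERSIONS registry, its placeholder model names, and the check_v2_model helper are hypothetical and do not reflect the extension's actual code.

```python
# Hypothetical registry mapping model names to major versions.
# The names are placeholders, not the node's real model list.
MODEL_VERSIONS = {
    "example-v1-model": 1,
    "example-v2-model": 2,
}


def check_v2_model(model_name: str, registry: dict[str, int] = MODEL_VERSIONS) -> None:
    """Raise ValueError with the documented message if the model cannot be used."""
    if model_name not in registry:
        raise ValueError(f"Model '{model_name}' not found.")
    if registry[model_name] == 1:
        raise ValueError(
            f"'{model_name}' is a V1 model. Use the FL VoxCPM TTS node instead."
        )


for name in ("example-v2-model", "example-v1-model", "missing-model"):
    try:
        check_v2_model(name)
        print(f"{name}: ok")
    except ValueError as error:
        print(f"{name}: {error}")
```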
