
ComfyUI Node: FL VoxCPM TTS

Class Name: FL_VoxCPM_TTS
Category: FL/VoxCPM
Author: filliptm (Account age: 2446 days)
Extension: ComfyUI-FL-VoxCPM
Last Updated: 2026-05-21
GitHub Stars: 0.03K

How to Install ComfyUI-FL-VoxCPM

Install this extension via the ComfyUI Manager by searching for ComfyUI-FL-VoxCPM:
  1. Click the Manager button in the main menu.
  2. Click the Custom Nodes Manager button.
  3. Enter ComfyUI-FL-VoxCPM in the search bar and install the matching extension.
After installation, click the Restart button to restart ComfyUI. Then, manually refresh your browser to clear the cache and access the updated list of nodes.

FL VoxCPM TTS Description

Generate speech or clone voices with VoxCPM model version 1.5, with LoRA support for style enhancement.

FL VoxCPM TTS:

FL_VoxCPM_TTS generates speech or clones voices using VoxCPM model version 1.5, with LoRA (Low-Rank Adaptation) support for style adjustment and fine-tuning. It is aimed at AI artists and developers who want expressive, natural-sounding speech from text input. The node can mimic a specific voice from reference audio or produce new vocal styles, and its input parameters control guidance strength, the number of diffusion steps, and the length of the generated audio.

FL VoxCPM TTS Input Parameters:

model_name

This parameter allows you to select the VoxCPM model to use for speech generation. The choice of model can affect the quality and characteristics of the generated speech. The default model is the first option in the list of available models.

lora_name

This parameter lets you choose a LoRA to apply for style or fine-tuning purposes. LoRA can modify the speech style, adding a layer of customization to the generated audio. The rank of the LoRA is automatically detected, and the default option is "None."

text

This is the main text input for synthesis. You can enter the text you want to convert into speech, with each line processed as a separate chunk. The default text is "VoxCPM is an innovative TTS model designed to generate highly expressive speech."
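
The line-by-line chunking described above can be pictured with a small sketch. This is not the node's actual code: the helper name split_into_chunks is hypothetical, and stripping whitespace and skipping blank lines are assumptions about how empty lines might be treated.

```python
def split_into_chunks(text: str) -> list[str]:
    """Return one chunk per line of input text.

    Assumption: surrounding whitespace is stripped and blank lines are
    skipped. The node itself may treat empty lines differently.
    """
    return [line.strip() for line in text.splitlines() if line.strip()]


script = "Hello there.\n\nThis is the second line.\n  And a third.  "
for index, chunk in enumerate(split_into_chunks(script), start=1):
    print(f"chunk {index}: {chunk}")
```

In practice this means a long script is best entered with one sentence or short paragraph per line, so that each chunk stays a manageable length.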

prompt_audio

An optional parameter where you can provide reference audio for voice cloning. This helps the model to mimic the voice characteristics of the provided audio.

prompt_text

An optional parameter holding the transcript of the reference audio. Provide it whenever you supply prompt_audio for voice cloning, so that the model can align the reference voice with the words spoken in it.

cfg_value

The guidance scale, which ranges from 1.0 to 10.0 with a default of 2.0. Higher values make the output adhere more closely to the prompt but may result in less natural-sounding speech.

inference_timesteps

This parameter determines the number of diffusion steps used during generation. It ranges from 1 to 100, with a default of 10. More steps can improve quality but increase processing time.

min_tokens

Specifies the minimum length of generated audio tokens, ranging from 1 to 100, with a default of 2. This ensures a baseline length for the audio output.

max_tokens

Defines the maximum length of generated audio tokens, with a range from 64 to 8192 and a default of 2048. This controls the upper limit of the audio duration.
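
The numeric ranges and defaults listed above can be collected into a small pre-flight check, for example when building workflows from a script. This is a minimal sketch: the RANGES table and check_settings helper are hypothetical names, not part of the extension, and the final min_tokens versus max_tokens comparison is an assumption rather than documented behaviour.

```python
# (minimum, maximum, default) as listed in the parameter descriptions.
RANGES = {
    "cfg_value": (1.0, 10.0, 2.0),
    "inference_timesteps": (1, 100, 10),
    "min_tokens": (1, 100, 2),
    "max_tokens": (64, 8192, 2048),
}


def check_settings(**overrides):
    """Return defaults merged with overrides, raising on out-of-range values."""
    unknown = set(overrides) - set(RANGES)
    if unknown:
        raise KeyError(f"unknown parameter(s): {sorted(unknown)}")
    settings = {name: default for name, (_, _, default) in RANGES.items()}
    settings.update(overrides)
    for name, (low, high, _) in RANGES.items():
        if not low <= settings[name] <= high:
            raise ValueError(f"{name}={settings[name]} is outside [{low}, {high}]")
    # Assumption, not documented behaviour: the minimum should not exceed the maximum.
    if settings["min_tokens"] > settings["max_tokens"]:
        raise ValueError("min_tokens must not exceed max_tokens")
    return settings


print(check_settings(cfg_value=3.0, inference_timesteps=20))
```

Calling check_settings() with no arguments returns the documented defaults, which makes it easy to see exactly which values a run deviates from.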

FL VoxCPM TTS Output Parameters:

waveform

The waveform output is a tensor holding the generated audio signal. It contains the actual sound data produced by the node and is what you pass on for playback or further processing.

sample_rate

This parameter indicates the sample rate of the generated audio, which is essential for ensuring the audio is played back at the correct speed and quality. It matches the sample rate used by the VoxCPM model.
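
To illustrate how the two outputs relate, the sketch below computes a clip's duration from its sample count and sample rate, then encodes mono samples as a 16-bit WAV using only the Python standard library. It assumes the waveform has already been flattened to a plain list of floats in the range -1.0 to 1.0; the helper names and the 16000 Hz rate in the example are hypothetical and do not come from the node.

```python
import io
import struct
import wave


def duration_seconds(num_samples: int, sample_rate: int) -> float:
    """Duration of a mono clip: number of samples divided by samples per second."""
    return num_samples / sample_rate


def to_wav_bytes(samples: list[float], sample_rate: int) -> bytes:
    """Encode mono float samples in [-1.0, 1.0] as 16-bit PCM WAV data."""
    pcm = b"".join(
        # Clamp each sample, then scale it to the signed 16-bit range.
        struct.pack("<h", int(max(-1.0, min(1.0, s)) * 32767))
        for s in samples
    )
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)   # mono
        wav.setsampwidth(2)   # 2 bytes per sample = 16-bit
        wav.setframerate(sample_rate)
        wav.writeframes(pcm)
    return buffer.getvalue()


# Hypothetical values for illustration only.
example_rate = 16000
silence = [0.0] * example_rate
print(duration_seconds(len(silence), example_rate), "seconds")
print(len(to_wav_bytes(silence, example_rate)), "bytes")
```

Writing the file with the wrong sample rate changes both the pitch and the duration on playback, which is why the node reports the rate alongside the waveform.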

FL VoxCPM TTS Usage Tips:

  • Experiment with different cfg_value settings to balance between adherence to the prompt and naturalness of the speech. Lower values may sound more natural, while higher values stick closely to the input prompt.
  • Use prompt_audio and prompt_text for voice cloning to achieve more personalized and accurate voice synthesis, especially when trying to mimic a specific voice.
  • Adjust inference_timesteps to find a sweet spot between quality and processing time. More steps can enhance quality but will take longer to process.
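
One way to follow the tips above systematically is to list a small grid of settings and render the same text once per combination. The sketch below only builds the grid; the candidate values and the build_sweep helper are hypothetical, and running each combination through the node is left to your workflow.

```python
from itertools import product


def build_sweep(cfg_values, timestep_values):
    """Return one settings dict per (cfg_value, inference_timesteps) pair."""
    return [
        {"cfg_value": cfg, "inference_timesteps": steps}
        for cfg, steps in product(cfg_values, timestep_values)
    ]


# Hypothetical candidate values inside the documented ranges.
grid = build_sweep([1.5, 2.0, 3.0], [5, 10, 20])
for settings in grid:
    print(settings)
```

Keeping the text, reference audio, and model fixed across the grid makes it easier to attribute differences in the results to these two parameters.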

FL VoxCPM TTS Common Errors and Solutions:

Generation error: <error_message>

  • Explanation: This error occurs when there is an issue during the audio generation process, possibly due to incorrect input parameters or model configuration.
  • Solution: Check all input parameters for correctness, ensure the model and LoRA are properly selected, and verify that any optional inputs like prompt_audio and prompt_text are correctly provided if used.

Force offloading VoxCPM model '<model_name>' from VRAM...

  • Explanation: This message indicates that the model is being offloaded from VRAM to free up resources, which can happen if force_offload is enabled.
  • Solution: If you encounter performance issues, consider disabling force_offload unless necessary, or ensure your system has sufficient VRAM to handle the model without offloading.

Copyright 2025 RunComfy. All Rights Reserved.
