RunComfy

ComfyUI Node: FL VoxCPM TTS

Class Name: FL_VoxCPM_TTS
Category: FL/VoxCPM
Author: filliptm (Account age: 2446 days)
Extension: ComfyUI-FL-VoxCPM
Latest Updated: 2026-05-21
GitHub Stars: 0.03K

Table of Contents

Description
FL_VoxCPM_TTS:
FL_VoxCPM_TTS Input Parameters:
FL_VoxCPM_TTS Output Parameters:
FL_VoxCPM_TTS Usage Tips:
FL_VoxCPM_TTS Common Errors and Solutions:
Related Nodes

How to Install ComfyUI-FL-VoxCPM

Install this extension through the ComfyUI Manager by searching for ComfyUI-FL-VoxCPM:

1. Click the Manager button in the main menu.
2. Select the Custom Nodes Manager button.
3. Enter ComfyUI-FL-VoxCPM in the search bar and install the extension from the results.

After installation, click the Restart button to restart ComfyUI. Then, manually refresh your browser to clear the cache and access the updated list of nodes.

FL VoxCPM TTS Description

Generate speech or clone voices with the VoxCPM model version 1.5. LoRA support lets you adjust the speaking style. Aimed at AI artists and developers.

FL VoxCPM TTS:

FL_VoxCPM_TTS generates speech from text, or clones a voice, using the VoxCPM model version 1.5. It supports LoRA (Low-Rank Adaptation) for style changes and fine-tuned variants. The node is aimed at AI artists and developers who need expressive, natural-sounding speech: it can mimic a specific voice from reference audio or produce new vocal styles. Its input parameters control the guidance strength, the number of diffusion steps, and the length of the generated audio.

FL VoxCPM TTS Input Parameters:

model_name

This parameter allows you to select the VoxCPM model to use for speech generation. The choice of model can affect the quality and characteristics of the generated speech. The default model is the first option in the list of available models.

lora_name

This parameter lets you choose a LoRA to apply for style or fine-tuning purposes. LoRA can modify the speech style, adding a layer of customization to the generated audio. The rank of the LoRA is automatically detected, and the default option is "None."

text

This is the main text input for synthesis. You can enter the text you want to convert into speech, with each line processed as a separate chunk. The default text is "VoxCPM is an innovative TTS model designed to generate highly expressive speech."
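As a rough illustration of the "one chunk per line" behavior described above, the sketch below splits a text input into line-based chunks. The function name `split_into_chunks` is hypothetical, and dropping blank lines and trimming whitespace are assumptions; the node's actual handling of empty lines is not documented here.

```python
def split_into_chunks(text: str) -> list[str]:
    """Split the input text into one chunk per non-empty line.

    Assumption: blank lines are skipped and surrounding whitespace is trimmed.
    The real node may treat these cases differently.
    """
    return [line.strip() for line in text.splitlines() if line.strip()]


if __name__ == "__main__":
    sample = "First sentence.\n\nSecond sentence on its own line.\n"
    for index, chunk in enumerate(split_into_chunks(sample), start=1):
        print(index, chunk)
```

In practice this suggests putting each sentence or paragraph you want synthesized separately on its own line.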

prompt_audio

An optional parameter where you can provide reference audio for voice cloning. This helps the model to mimic the voice characteristics of the provided audio.

prompt_text

An optional parameter that holds the transcript of the reference audio. Provide it whenever you use prompt_audio for voice cloning; it helps the generated speech match the intended voice characteristics.
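Because prompt_audio and prompt_text are meant to be used together for voice cloning, a small pre-flight check can catch a half-filled pair before generation starts. The sketch below is an assumption about a sensible check, not the node's own validation logic; `cloning_mode` is a hypothetical helper name.

```python
from typing import Optional


def cloning_mode(prompt_audio: Optional[object], prompt_text: Optional[str]) -> bool:
    """Return True when both cloning inputs are present, False for plain TTS.

    Assumption: supplying only one of the two inputs is treated as a mistake.
    The node itself may be more lenient.
    """
    has_audio = prompt_audio is not None
    has_text = bool(prompt_text and prompt_text.strip())
    if has_audio != has_text:
        raise ValueError(
            "Voice cloning needs both prompt_audio and its transcript in prompt_text."
        )
    return has_audio


if __name__ == "__main__":
    print(cloning_mode(None, None))          # plain text-to-speech
    print(cloning_mode(object(), "Hello."))  # voice cloning
```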

cfg_value

The guidance scale parameter, which ranges from 1.0 to 10.0, with a default value of 2.0. Higher values make the output adhere more closely to the prompt, but may result in less natural-sounding speech.

inference_timesteps

This parameter determines the number of diffusion steps used during generation. It ranges from 1 to 100, with a default of 10. More steps can improve quality but increase processing time.

min_tokens

Specifies the minimum length of generated audio tokens, ranging from 1 to 100, with a default of 2. This ensures a baseline length for the audio output.

max_tokens

Defines the maximum length of generated audio tokens, with a range from 64 to 8192 and a default of 2048. This controls the upper limit of the audio duration.
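The ranges and defaults listed above can be collected into a small table and checked before a workflow is queued. The sketch below builds a node entry in the `class_type` / `inputs` layout used by ComfyUI API-format workflows. The helper `build_inputs` is hypothetical, the model name is a placeholder rather than a real file name, and the check that min_tokens does not exceed max_tokens is an assumption rather than documented behavior.

```python
# (low, high, default) for each numeric parameter, as listed in this section.
PARAM_SPECS = {
    "cfg_value": (1.0, 10.0, 2.0),
    "inference_timesteps": (1, 100, 10),
    "min_tokens": (1, 100, 2),
    "max_tokens": (64, 8192, 2048),
}


def build_inputs(text, model_name, lora_name="None", **overrides):
    """Return a node entry with documented defaults and range-checked overrides."""
    inputs = {"model_name": model_name, "lora_name": lora_name, "text": text}
    for name, (low, high, default) in PARAM_SPECS.items():
        value = overrides.pop(name, default)
        if not low <= value <= high:
            raise ValueError(f"{name}={value} is outside [{low}, {high}]")
        inputs[name] = value
    if overrides:
        raise TypeError(f"unknown parameters: {sorted(overrides)}")
    # Assumption: a minimum above the maximum is never intended.
    if inputs["min_tokens"] > inputs["max_tokens"]:
        raise ValueError("min_tokens must not exceed max_tokens")
    return {"class_type": "FL_VoxCPM_TTS", "inputs": inputs}


if __name__ == "__main__":
    # "<model_name>" is a placeholder; use a name from the node's dropdown.
    node = build_inputs("Hello from VoxCPM.", "<model_name>", cfg_value=3.0)
    print(node)
```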

FL VoxCPM TTS Output Parameters:

waveform

The waveform output is a tensor representing the generated audio signal. It is crucial for playback or further processing, as it contains the actual sound data produced by the node.

sample_rate

This output gives the sample rate of the generated audio, which is needed to play the audio back at the correct speed and pitch. It matches the sample rate used by the VoxCPM model.
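To show why both outputs are needed together, the sketch below computes a clip's duration from its sample count and sample rate, then encodes mono samples as a 16-bit PCM WAV using only the Python standard library. It uses a plain list of floats as a stand-in for the waveform tensor, whose exact shape is not specified here, and 16000 is an arbitrary placeholder rate rather than the model's actual value.

```python
import io
import struct
import wave


def duration_seconds(num_samples: int, sample_rate: int) -> float:
    """Duration of a clip: number of samples divided by samples per second."""
    return num_samples / sample_rate


def to_wav_bytes(samples, sample_rate: int) -> bytes:
    """Encode mono float samples in [-1.0, 1.0] as 16-bit PCM WAV bytes."""
    pcm = bytearray()
    for sample in samples:
        clipped = max(-1.0, min(1.0, sample))  # guard against out-of-range values
        pcm += struct.pack("<h", int(round(clipped * 32767)))
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(1)   # mono
        wav_file.setsampwidth(2)   # 2 bytes per sample = 16-bit
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(bytes(pcm))
    return buffer.getvalue()


if __name__ == "__main__":
    rate = 16000  # placeholder; use the node's sample_rate output
    silence = [0.0] * rate
    print(duration_seconds(len(silence), rate), "seconds")
    print(len(to_wav_bytes(silence, rate)), "bytes")
```

Writing the file with the wrong sample rate would change the playback speed, which is why sample_rate travels alongside waveform.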

FL VoxCPM TTS Usage Tips:

Experiment with different cfg_value settings to balance between adherence to the prompt and naturalness of the speech. Lower values may sound more natural, while higher values stick closely to the input prompt.
Use prompt_audio and prompt_text for voice cloning to achieve more personalized and accurate voice synthesis, especially when trying to mimic a specific voice.
Adjust inference_timesteps to find a sweet spot between quality and processing time. More steps can enhance quality but will take longer to process.
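One way to act on the tips above is to try a small grid of cfg_value and inference_timesteps settings and compare the results by ear. The sketch below only enumerates the combinations; it does not run the node. `sweep_settings` is a hypothetical helper and the example values are arbitrary picks inside the documented ranges.

```python
from itertools import product


def sweep_settings(cfg_values, timestep_values):
    """Return every combination of guidance scale and diffusion step count."""
    return [
        {"cfg_value": cfg, "inference_timesteps": steps}
        for cfg, steps in product(cfg_values, timestep_values)
    ]


if __name__ == "__main__":
    # Arbitrary example values within the documented ranges.
    for settings in sweep_settings([1.5, 2.0, 3.0], [10, 20]):
        print(settings)
```

Keeping the text and reference audio fixed while changing one setting at a time makes the differences easier to attribute.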

FL VoxCPM TTS Common Errors and Solutions:

Generation error: `<error_message>`

Explanation: This error occurs when there is an issue during the audio generation process, possibly due to incorrect input parameters or model configuration.
Solution: Check all input parameters for correctness, ensure the model and LoRA are properly selected, and verify that any optional inputs like prompt_audio and prompt_text are correctly provided if used.

Force offloading VoxCPM model '`<model_name>`' from VRAM...

Explanation: This message indicates that the model is being offloaded from VRAM to free up resources, which can happen if force_offload is enabled.
Solution: If you encounter performance issues, consider disabling force_offload unless necessary, or ensure your system has sufficient VRAM to handle the model without offloading.

FL VoxCPM TTS Related Nodes

Go back to the extension to check out more related nodes.

ComfyUI-FL-VoxCPM
