RunComfy

ComfyUI Extension: ComfyUI_RH_VoxCPM

Repo Name

ComfyUI_RH_VoxCPM

Author

HM-RunningHub (Account age: 489 days)

Nodes

4

Latest Updated

2026-04-15

Github Stars

0.03K

Table of Contents

Description
ComfyUI_RH_VoxCPM Introduction
How ComfyUI_RH_VoxCPM Works
ComfyUI_RH_VoxCPM Features
ComfyUI_RH_VoxCPM Models
What's New with ComfyUI_RH_VoxCPM
Troubleshooting ComfyUI_RH_VoxCPM
Learn More about ComfyUI_RH_VoxCPM
Related Nodes

How to Install ComfyUI_RH_VoxCPM

Install this extension via the ComfyUI Manager by searching for ComfyUI_RH_VoxCPM

1. Click the Manager button in the main menu
2. Click the Custom Nodes Manager button
3. Enter ComfyUI_RH_VoxCPM in the search bar and install the extension from the results

After installation, click the Restart button to restart ComfyUI. Then, manually refresh your browser to clear the cache and access the updated list of nodes.
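To confirm that the extension landed on disk after the restart, a small check like the following can help. It is a minimal sketch that assumes the usual ComfyUI layout, in which custom nodes live in a custom_nodes folder under the ComfyUI root, and that the folder is named after the repository. The helper name find_extension_dir is hypothetical and is not part of ComfyUI or this extension.

```python
from pathlib import Path
from typing import Optional


def find_extension_dir(comfy_root: str, repo_name: str) -> Optional[Path]:
    """Return the extension's folder under custom_nodes, or None if absent.

    Assumption: the folder is named after the repository. The comparison is
    case-insensitive because the folder's casing may differ from the repo name.
    """
    custom_nodes = Path(comfy_root) / "custom_nodes"
    if not custom_nodes.is_dir():
        return None
    for entry in custom_nodes.iterdir():
        if entry.is_dir() and entry.name.lower() == repo_name.lower():
            return entry
    return None
```

Calling find_extension_dir with your ComfyUI root path and "ComfyUI_RH_VoxCPM" returns the folder if it exists, and None otherwise, which usually means the installation did not complete.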

ComfyUI_RH_VoxCPM Description

ComfyUI_RH_VoxCPM is an extension for ComfyUI that integrates VoxCPM, a tokenizer-free text-to-speech (TTS) system. It adds nodes for generating speech from text, designing new voices from text descriptions, and cloning voices from reference audio.

ComfyUI_RH_VoxCPM Introduction

ComfyUI_RH_VoxCPM is an extension that integrates the VoxCPM system into ComfyUI. It generates high-quality, context-aware speech without a tokenizer, and it supports both creative voice design and high-fidelity voice cloning, which makes it useful for AI artists who want to add audio creation to their workflows. You can design new voices from textual descriptions or clone existing voices from reference audio.

How ComfyUI_RH_VoxCPM Works

At its core, ComfyUI_RH_VoxCPM leverages the VoxCPM system, which is a tokenizer-free text-to-speech (TTS) technology. This means it can generate speech directly from text without converting the audio into discrete tokens. The system uses a diffusion autoregressive architecture to produce continuous audio representations, resulting in natural and expressive speech synthesis. Imagine it as a painter who creates a masterpiece directly on canvas without sketching first; similarly, VoxCPM crafts audio directly from text, ensuring fluidity and expressiveness.
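The shape of that idea can be illustrated with a toy loop. This is a sketch under heavy assumptions and is not VoxCPM's actual architecture or code: there is no learned model, the "denoiser" is a fixed interpolation, and every name here (denoise_step, generate, text_features) is hypothetical. It only shows the two nested loops the description implies: an outer autoregressive loop in which each frame depends on the previous one, and an inner refinement loop that starts from noise and produces a continuous vector rather than a discrete token ID.

```python
import random
from typing import List


def denoise_step(x: List[float], cond: List[float], strength: float = 0.5) -> List[float]:
    """Move each coordinate part of the way from the noisy value to the condition."""
    return [xi + strength * (ci - xi) for xi, ci in zip(x, cond)]


def generate(text_features: List[float], n_frames: int, dim: int = 4,
             n_denoise: int = 8, seed: int = 0) -> List[List[float]]:
    """Produce n_frames continuous vectors, one after another."""
    rng = random.Random(seed)
    frames: List[List[float]] = []
    prev = [0.0] * dim
    for t in range(n_frames):
        # Outer loop (autoregressive): condition on the text and the previous frame.
        feature = text_features[t % len(text_features)]
        cond = [0.5 * feature + 0.5 * p for p in prev]
        # Inner loop (diffusion-like): start from noise and refine step by step.
        x = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        for _ in range(n_denoise):
            x = denoise_step(x, cond)
        frames.append(x)  # a vector of floats, never a token ID
        prev = x
    return frames
```

In a real system the interpolation would be replaced by a trained network and the output vectors would be decoded into a waveform; the toy keeps only the control flow.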

ComfyUI_RH_VoxCPM Features

Voice Design: Create entirely new voices by describing characteristics such as gender, age, tone, emotion, and speed. This feature allows you to bring your creative visions to life by simply using descriptive text.
Controllable Cloning: Upload a reference audio to clone its voice characteristics while using text instructions to control style, emotion, and speed. This feature is perfect for artists who want to maintain the essence of a voice while adding their unique touch.
Ultimate Cloning: For those who need to replicate every detail of a voice, this mode allows the model to continue from a reference audio, capturing every nuance. This is ideal for high-fidelity voice cloning projects.
LoRA Fine-Tuning: Customize voice generation by loading your own LoRA weights, enabling personalized voice synthesis.
Automatic Speech Recognition (ASR): If the reference audio text is empty, the system automatically uses FunASR SenseVoiceSmall to recognize the speech.
Reference Audio Denoising: Optionally use ZipEnhancer to reduce noise in reference audio, ensuring cleaner input for cloning.
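The relationship between these features and the inputs you supply can be sketched as a small decision helper. This is an illustration only: the class SpeechRequest, its fields, the mode strings, and both functions are hypothetical names chosen for the example, not the extension's node inputs or API. It assumes that voice design applies when no reference audio is given, and it encodes the ASR rule stated above, where recognition runs when reference audio is present but its text is empty.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SpeechRequest:
    text: str                                # text to speak
    voice_description: Optional[str] = None  # e.g. "calm, middle-aged, slow"
    reference_audio: Optional[str] = None    # path to a reference clip
    reference_text: Optional[str] = None     # transcript of the reference clip
    replicate_every_detail: bool = False     # ask for the highest-fidelity clone


def pick_mode(req: SpeechRequest) -> str:
    """Map the supplied inputs to one of the three modes described above."""
    if req.reference_audio is None:
        return "voice_design"
    if req.replicate_every_detail:
        return "ultimate_cloning"
    return "controllable_cloning"


def needs_asr(req: SpeechRequest) -> bool:
    """ASR is needed when there is reference audio but no usable transcript."""
    has_text = bool((req.reference_text or "").strip())
    return req.reference_audio is not None and not has_text
```

For example, a request with only text and a voice description maps to voice design, while a request with a reference clip and an empty transcript maps to controllable cloning and would trigger ASR.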

ComfyUI_RH_VoxCPM Models

ComfyUI_RH_VoxCPM supports several models, each catering to different needs:

VoxCPM2: With 2 billion parameters, this model offers the best quality and is recommended for projects requiring the highest fidelity.
VoxCPM1.5: A balanced choice with 800 million parameters, suitable for general use.
VoxCPM-0.5B: A lightweight model with 640 million parameters, ideal for projects where resource efficiency is a priority.

Each model can significantly impact the quality and performance of the generated audio, so choose based on your project's requirements.
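One way to make that choice explicit is to treat the parameter counts listed above as a lookup table and pick the largest model that fits a budget. The sketch below is hypothetical: the dictionary and the function largest_model_within are illustrative names, the budget is expressed in parameters rather than VRAM, and the extension itself does not expose such a helper.

```python
from typing import Dict, Optional

# Parameter counts as stated in the model list above.
MODEL_PARAMS: Dict[str, int] = {
    "VoxCPM2": 2_000_000_000,
    "VoxCPM1.5": 800_000_000,
    "VoxCPM-0.5B": 640_000_000,
}


def largest_model_within(budget_params: int) -> Optional[str]:
    """Return the largest listed model whose parameter count fits the budget."""
    fitting = [(params, name) for name, params in MODEL_PARAMS.items()
               if params <= budget_params]
    if not fitting:
        return None
    # max() compares the parameter count first, so the biggest model wins.
    return max(fitting)[1]
```

With a budget of one billion parameters this selects VoxCPM1.5, and with a budget below 640 million it returns None because no listed model fits.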

What's New with ComfyUI_RH_VoxCPM

The latest updates to ComfyUI_RH_VoxCPM include enhanced voice cloning capabilities and improved support for multi-speaker dialogues. These updates allow for more dynamic and expressive audio generation, providing AI artists with greater creative freedom and control over their projects.

Troubleshooting ComfyUI_RH_VoxCPM

If you encounter issues while using ComfyUI_RH_VoxCPM, here are some common solutions:

Problem: The generated audio does not match the expected style or emotion.
Solution: Double-check your text instructions for clarity and specificity. Ensure that the control instructions are correctly formatted and relevant to the desired output.
Problem: Reference audio is noisy or unclear.
Solution: Enable the denoise option using ZipEnhancer to clean up the reference audio before processing.
Problem: The system fails to recognize the reference audio text.
Solution: Automatic ASR runs only when the reference audio text is left empty, so clear that field to trigger it. If the transcription is still missing or wrong, enter the text manually.

Learn More about ComfyUI_RH_VoxCPM

To further explore the capabilities of ComfyUI_RH_VoxCPM, consider visiting the following resources:

VoxCPM GitHub Repository for technical details and updates.
VoxCPM2 on HuggingFace for model downloads and community discussions.
RunningHub (https://www.runninghub.cn) for online usage and additional support.

These resources provide valuable insights and community support, helping you make the most of ComfyUI_RH_VoxCPM in your creative projects.

ComfyUI_RH_VoxCPM Related Nodes

RunningHub VoxCPM Generate Speech

RunningHub VoxCPM Load Model

RunningHub VoxCPM Multi-Speaker (Dynamic Audio)

RunningHub VoxCPM Multi-Speaker
