RunComfy

ComfyUI Node: 🔄 ChatterBox Voice Conversion (diogod)

Class Name: ChatterBoxVoiceVCDiogod
Category: ChatterBox Voice
Author: diodiogod (Account age: 768 days)
Extension: ComfyUI_ChatterBox_SRT_Voice
Latest Updated: 2026-03-21
GitHub Stars: 0.08K

Table of Contents

Description
ChatterBoxVoiceVCDiogod:
ChatterBoxVoiceVCDiogod Input Parameters:
ChatterBoxVoiceVCDiogod Output Parameters:
ChatterBoxVoiceVCDiogod Usage Tips:
ChatterBoxVoiceVCDiogod Common Errors and Solutions:
Related Nodes

How to Install ComfyUI_ChatterBox_SRT_Voice

Install this extension through the ComfyUI Manager by searching for ComfyUI_ChatterBox_SRT_Voice:

1. Click the Manager button in the main menu.
2. Click the Custom Nodes Manager button.
3. Enter ComfyUI_ChatterBox_SRT_Voice in the search bar and install the extension from the results.

After installation, click the Restart button to restart ComfyUI. Then manually refresh your browser to clear the cache and load the updated list of nodes.

🔄 ChatterBox Voice Conversion (diogod) Description

Facilitates voice conversion in ComfyUI, enabling audio input transformation and customization.

🔄 ChatterBox Voice Conversion (diogod):

ChatterBoxVoiceVCDiogod is a voice conversion node for ComfyUI. It transforms an audio input into a different vocal output, which supports uses such as voice modulation and character voice creation. The node analyzes the input audio and converts it, so you can adjust and customize voice characteristics in your audio projects.

🔄 ChatterBox Voice Conversion (diogod) Input Parameters:

tag_audio_events

This Boolean parameter annotates non-verbal sounds, such as laughter or music, in the transcript. When it is enabled, the node tags these sounds, which adds context to the audio. The default value is False.

diarize

This Boolean parameter annotates which speaker is talking in a multi-speaker audio file. Enable it when several voices are present so that conversations and dialogues are easier to follow in the output. The default value is False.

diarization_threshold

This Float parameter controls the sensitivity of speaker separation. The default value is 0.22, and it can be set between 0.1 and 0.4 in steps of 0.01. Lower values make the system more sensitive to speaker changes, which helps with audio where speakers change frequently. Tune the threshold to balance sensitivity against accuracy in speaker identification.
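The three inputs above can be pictured in the form of ComfyUI's usual INPUT_TYPES declaration. The sketch below is an illustration only, assuming the parameters are declared as required inputs; the class name VoiceConversionInputsSketch and the helper default_inputs are hypothetical and are not taken from the extension's source.

```python
class VoiceConversionInputsSketch:
    """Hypothetical stand-in for the node's input declaration (not the real source)."""

    @classmethod
    def INPUT_TYPES(cls):
        # Types, defaults, and ranges mirror the parameter descriptions above.
        # Declaring them under "required" is an assumption.
        return {
            "required": {
                "tag_audio_events": ("BOOLEAN", {"default": False}),
                "diarize": ("BOOLEAN", {"default": False}),
                "diarization_threshold": (
                    "FLOAT",
                    {"default": 0.22, "min": 0.1, "max": 0.4, "step": 0.01},
                ),
            }
        }


def default_inputs(node_cls):
    """Collect the default value declared for each required input."""
    required = node_cls.INPUT_TYPES()["required"]
    return {name: spec[1]["default"] for name, spec in required.items()}


if __name__ == "__main__":
    print(default_inputs(VoiceConversionInputsSketch))
```

Reading the defaults back from the declaration keeps the documented values (False, False, 0.22) in one place, which makes it easy to check them against what the node shows in the ComfyUI interface.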

🔄 ChatterBox Voice Conversion (diogod) Output Parameters:

ConvertedVoice

ConvertedVoice is the transformed audio produced by the voice conversion process. It is the node's final result, with the voice modifications applied, and can be used in creative projects, character development, or other audio applications.

🔄 ChatterBox Voice Conversion (diogod) Usage Tips:

To achieve the best results in multi-speaker environments, enable the diarize parameter to clearly distinguish between different speakers.
Adjust the diarization_threshold to find the optimal sensitivity for your specific audio content, especially if you notice inaccuracies in speaker separation.

🔄 ChatterBox Voice Conversion (diogod) Common Errors and Solutions:

"Audio input not recognized"

Explanation: This error occurs when the node fails to identify the provided audio format.
Solution: Ensure that the audio file is in a supported format such as WAV or MP3 before inputting it into the node.
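As a quick pre-check before loading a file into the node, the extension of the file can be compared against the formats named above. This is a minimal sketch: the function name has_known_audio_extension and the set KNOWN_AUDIO_EXTENSIONS are hypothetical, the set lists only the two formats mentioned in this section, and the node may accept others. It checks the file name only, not the file's contents.

```python
from pathlib import Path

# Only the formats named in this section; the real supported list may be longer.
KNOWN_AUDIO_EXTENSIONS = {".wav", ".mp3"}


def has_known_audio_extension(path):
    """Return True if the file name ends in one of the known audio extensions."""
    return Path(path).suffix.lower() in KNOWN_AUDIO_EXTENSIONS


if __name__ == "__main__":
    for name in ("voice.WAV", "clip.mp3", "notes.txt"):
        print(name, has_known_audio_extension(name))
```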

"Diarization threshold out of range"

Explanation: The specified diarization_threshold value is outside the acceptable range.
Solution: Adjust the threshold value to be within the range of 0.1 to 0.4.
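When the threshold is set from a script rather than from the node's widget, one way to avoid this error is to clamp the value to the documented range and snap it to the 0.01 step before passing it on. The helper below is a sketch under that assumption; clamp_diarization_threshold is a hypothetical name and not part of the extension.

```python
def clamp_diarization_threshold(value, low=0.1, high=0.4, step=0.01):
    """Clamp value to [low, high] and snap it to the nearest multiple of step."""
    clamped = min(max(float(value), low), high)
    # Count whole steps above the lower bound, then rebuild the value from that count.
    steps = round((clamped - low) / step)
    # Round to two decimals to remove floating-point noise for a 0.01 step.
    return round(low + steps * step, 2)


if __name__ == "__main__":
    for raw in (0.05, 0.223, 0.5):
        print(raw, "->", clamp_diarization_threshold(raw))
```

With the documented range, 0.05 becomes 0.1, 0.223 becomes 0.22, and 0.5 becomes 0.4.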

"Tag audio events failed"

Explanation: The node encountered an issue while attempting to tag audio events.
Solution: Verify that the audio file is clear and free of excessive noise, and try enabling the tag_audio_events parameter again.

🔄 ChatterBox Voice Conversion (diogod) Related Nodes

Go back to the extension to check out more related nodes.

ComfyUI_ChatterBox_SRT_Voice
