RunComfy

ComfyUI Node: 🎤 ChatterBox Voice TTS (diogod)

Class Name: ChatterBoxVoiceTTSDiogod
Category: ChatterBox Voice
Author: diodiogod (Account age: 768 days)
Extension: ComfyUI_ChatterBox_SRT_Voice
Latest Updated: 2026-03-21
GitHub Stars: 0.08K

Table of Contents

Description
ChatterBoxVoiceTTSDiogod:
ChatterBoxVoiceTTSDiogod Input Parameters:
ChatterBoxVoiceTTSDiogod Output Parameters:
ChatterBoxVoiceTTSDiogod Usage Tips:
ChatterBoxVoiceTTSDiogod Common Errors and Solutions:
Related Nodes

How to Install ComfyUI_ChatterBox_SRT_Voice

Install this extension via the ComfyUI Manager by searching for ComfyUI_ChatterBox_SRT_Voice:

1. Click the Manager button in the main menu.
2. Select the Custom Nodes Manager button.
3. Enter ComfyUI_ChatterBox_SRT_Voice in the search bar.

After installation, click the Restart button to restart ComfyUI. Then, manually refresh your browser to clear the cache and access the updated list of nodes.

🎤 ChatterBox Voice TTS (diogod) Description

ChatterBoxVoiceTTSDiogod enables natural, expressive TTS for AI projects with customizable voices.

🎤 ChatterBox Voice TTS (diogod):

ChatterBoxVoiceTTSDiogod is a text-to-speech (TTS) node that converts input text into natural, expressive audio. It is aimed at AI artists and developers who need voice synthesis for virtual characters, narration, or interactive media. The generated speech is intended to mimic human speech patterns, including intonation and emotion. The node supports multiple languages and exposes parameters for expressiveness, randomness, voice reference, chunking of long text, and audio caching.

🎤 ChatterBox Voice TTS (diogod) Input Parameters:

t

This parameter represents the text input that you want to convert into speech. It is the primary content that the node will process to generate audio. The quality and clarity of the output audio are directly influenced by the text provided.

language

This parameter specifies the language in which the text is written. It ensures that the TTS engine applies the correct phonetic and linguistic rules to produce accurate and natural-sounding speech. Supported languages may vary, so it's important to select the appropriate one for your text.

device

This parameter determines the computational device used for processing, such as a CPU or GPU. Selecting the right device can impact the speed and efficiency of the TTS conversion process, with GPUs typically offering faster performance.

exaggeration

This parameter controls the level of expressiveness in the generated speech. Higher values can make the speech sound more animated or emotional, which can be useful for specific character voices or dramatic readings.

temperature

This parameter influences the randomness of the speech generation process. A higher temperature can result in more varied and creative outputs, while a lower temperature produces more consistent and predictable speech.

cfg_weight

This parameter adjusts the balance between following the input text closely and introducing creative variations. It allows you to fine-tune the adherence to the original text, which can be useful for achieving the desired level of fidelity in the output.

seed

This parameter sets the random seed for the TTS generation process so that results are reproducible. With the same seed, input text, and parameter settings, you can regenerate the same audio across different sessions.
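
As a generic illustration of why a fixed seed gives repeatable output, the sketch below uses Python's standard random module. It is not the node's generation code; sample_noise is a hypothetical stand-in for any sampling step that draws random numbers.

```python
import random


def sample_noise(seed: int, count: int = 3) -> list[float]:
    """Draw `count` pseudo-random values from a generator seeded with `seed`."""
    # A private generator keeps the result independent of global random state.
    rng = random.Random(seed)
    return [rng.random() for _ in range(count)]


# The same seed reproduces the same sequence; a different seed does not.
first = sample_noise(42)
second = sample_noise(42)
other = sample_noise(43)
print(first == second, first == other)
```

Any other source of variation (different text, temperature, or reference audio) changes the result even when the seed is unchanged.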

reference_audio

This optional parameter allows you to provide a reference audio file to guide the TTS engine in mimicking a specific voice or style. It can be useful for achieving consistency with existing audio content or for character-specific voice synthesis.

audio_prompt_path

This parameter specifies the file path to an audio prompt that can be used to influence the style or tone of the generated speech. It provides an additional layer of customization for the TTS output.

enable_chunking

This boolean parameter enables or disables the chunking of long text segments into smaller parts for processing. Chunking can help manage memory usage and improve processing efficiency for lengthy texts.

max_chars_per_chunk

This parameter sets the maximum number of characters allowed in each text chunk when chunking is enabled. It helps control the size of the chunks and can impact the smoothness and coherence of the generated speech.
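
One plausible way to apply a character limit is to pack whole sentences into chunks greedily, as sketched below. This is an assumption for illustration, not the node's actual splitting logic; split_into_chunks and its sentence-boundary rule are hypothetical.

```python
import re


def split_into_chunks(text: str, max_chars_per_chunk: int) -> list[str]:
    """Greedily pack whole sentences into chunks of at most max_chars_per_chunk."""
    if max_chars_per_chunk <= 0:
        raise ValueError("max_chars_per_chunk must be positive")
    # Split after sentence-ending punctuation that is followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        # A single sentence longer than the limit is hard-split at the limit.
        while len(sentence) > max_chars_per_chunk:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars_per_chunk])
            sentence = sentence[max_chars_per_chunk:].lstrip()
        if not sentence:
            continue
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chars_per_chunk:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


print(split_into_chunks("Hello world. This is a test. Another sentence here!", 30))
```

Splitting at sentence boundaries, rather than at a fixed character offset, tends to keep pauses and intonation in natural places, which is why the chunk size can affect coherence.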

chunk_combination_method

This parameter determines the method used to combine audio chunks back into a single output. Different methods may affect the continuity and naturalness of the final audio.

silence_between_chunks_ms

This parameter specifies the duration of silence, in milliseconds, to be inserted between audio chunks. It can help create natural pauses in the speech, enhancing the overall listening experience.
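
The relationship between a silence duration in milliseconds and the number of audio samples it occupies can be sketched as follows. The function join_with_silence, the use of plain Python lists for mono samples, and the simple concatenation strategy are all assumptions for illustration; the node's own combination methods are not documented here.

```python
def join_with_silence(
    chunks: list[list[float]], sample_rate: int, silence_ms: int
) -> list[float]:
    """Concatenate mono sample lists, inserting silence_ms of zeros between them."""
    # Convert milliseconds to a whole number of samples at this sample rate.
    silence_samples = sample_rate * silence_ms // 1000
    gap = [0.0] * silence_samples
    combined: list[float] = []
    for index, chunk in enumerate(chunks):
        if index > 0:
            # Silence goes only between chunks, never before the first one.
            combined.extend(gap)
        combined.extend(chunk)
    return combined


# Two tiny chunks at 1000 Hz with 3 ms of silence give a 3-sample gap.
print(join_with_silence([[0.1, 0.2], [0.3]], sample_rate=1000, silence_ms=3))
```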

crash_protection_template

This parameter provides a template for padding short text segments to prevent crashes during sequential generation. It ensures stability in the TTS process, especially for very short inputs.
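
The padding idea can be sketched as below. The "{seg}" placeholder, the min_chars threshold, and the function pad_short_text are hypothetical choices made for this example; the template syntax the node actually expects is not stated here.

```python
def pad_short_text(text: str, template: str, min_chars: int = 20) -> str:
    """Wrap text in the template when it is shorter than min_chars."""
    stripped = text.strip()
    if len(stripped) >= min_chars:
        # Long enough already: leave the text as it is.
        return stripped
    # str.replace avoids str.format errors when the text itself contains braces.
    return template.replace("{seg}", stripped)


print(pad_short_text("Hi.", "well, {seg} indeed", min_chars=10))
print(pad_short_text("This sentence is long enough.", "well, {seg} indeed", min_chars=10))
```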

enable_audio_cache

This boolean parameter enables or disables the caching of generated audio segments. Caching can improve performance by reusing previously generated audio for identical inputs, reducing processing time.
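
A cache of this kind is typically keyed on everything that affects the output. The sketch below shows the general pattern under that assumption; cache_key, AudioCache, and the generate callback are hypothetical names, and the node's real cache may be organized differently.

```python
import hashlib
import json


def cache_key(text: str, settings: dict) -> str:
    """Build a stable key from the text and every setting that affects the audio."""
    # sort_keys makes the key independent of dictionary insertion order.
    payload = json.dumps({"text": text, "settings": settings}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class AudioCache:
    """In-memory cache that generates audio only for unseen inputs."""

    def __init__(self) -> None:
        self._store: dict[str, object] = {}
        self.hits = 0

    def get_or_generate(self, text: str, settings: dict, generate):
        key = cache_key(text, settings)
        if key in self._store:
            self.hits += 1
            return self._store[key]
        # Cache miss: run the (expensive) generator and remember the result.
        audio = generate(text, settings)
        self._store[key] = audio
        return audio
```

Because the seed and other settings are part of the key, changing any of them causes a fresh generation rather than a stale cache hit.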

🎤 ChatterBox Voice TTS (diogod) Output Parameters:

segment_audio_chunks

This output parameter contains the generated audio chunks for each segment of the input text. These chunks are the building blocks of the final speech output and can be combined to form a continuous audio stream.

natural_duration

This parameter provides the natural duration of the generated audio, measured in seconds. It reflects the length of the speech output and can be useful for synchronization with other media elements.
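
For synchronization, a duration in seconds follows from the sample count and sample rate, and can then be converted to a frame count. The helper names and the example sample rate and frame rate below are assumptions for illustration only.

```python
import math


def duration_seconds(num_samples: int, sample_rate: int) -> float:
    """Length of an audio clip in seconds."""
    if sample_rate <= 0:
        raise ValueError("sample_rate must be positive")
    return num_samples / sample_rate


def frames_needed(duration: float, fps: float) -> int:
    """Number of video frames required to cover the whole clip."""
    # Round up so the last partial frame still covers the end of the audio.
    return math.ceil(duration * fps)


clip_seconds = duration_seconds(num_samples=48000, sample_rate=24000)
print(clip_seconds, frames_needed(clip_seconds, fps=30))
```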

🎤 ChatterBox Voice TTS (diogod) Usage Tips:

To achieve the most natural-sounding speech, experiment with the exaggeration and temperature parameters to find the right balance for your specific application.
Utilize the reference_audio and audio_prompt_path parameters to guide the TTS engine in mimicking specific voices or styles, enhancing the consistency and quality of the output.
Enable enable_chunking for long texts to manage memory usage effectively and ensure smooth processing without sacrificing audio quality.

🎤 ChatterBox Voice TTS (diogod) Common Errors and Solutions:

"Text too short for processing"

Explanation: The input text is too short, which may cause issues in the TTS generation process.
Solution: Use the crash_protection_template parameter to pad the text, ensuring stability during processing.

"Unsupported language"

Explanation: The specified language is not supported by the TTS engine.
Solution: Verify the list of supported languages and select an appropriate one for your text input.

"Device not available"

Explanation: The specified computational device (CPU/GPU) is not available for processing.
Solution: Check your system configuration and ensure the selected device is properly set up and accessible.

🎤 ChatterBox Voice TTS (diogod) Related Nodes

Go back to the extension to check out more related nodes.

ComfyUI_ChatterBox_SRT_Voice
