ComfyUI Node: 🎤 ChatterBox Voice TTS (diogod)

Class Name: ChatterBoxVoiceTTSDiogod
Category: ChatterBox Voice
Author: diodiogod (Account age: 768 days)
Extension: ComfyUI_ChatterBox_SRT_Voice
Last Updated: 2026-03-21
GitHub Stars: 0.08K

How to Install ComfyUI_ChatterBox_SRT_Voice

Install this extension through the ComfyUI Manager by searching for ComfyUI_ChatterBox_SRT_Voice:
  1. Click the Manager button in the main menu.
  2. Select the Custom Nodes Manager button.
  3. Enter ComfyUI_ChatterBox_SRT_Voice in the search bar.
After installation, click the Restart button to restart ComfyUI. Then manually refresh your browser to clear the cache and load the updated list of nodes.

🎤 ChatterBox Voice TTS (diogod) Description

ChatterBoxVoiceTTSDiogod enables natural, expressive TTS for AI projects with customizable voices.

ChatterBoxVoiceTTSDiogod converts text to speech, with a focus on natural and expressive audio. It is aimed at AI artists and developers who need voice synthesis for virtual characters, narration, or interactive media. The node exposes controls for expressiveness (exaggeration), sampling randomness (temperature), and adherence to the input text (cfg_weight). It also accepts an optional reference voice, supports multiple languages, and can split long text into chunks for processing.

🎤 ChatterBox Voice TTS (diogod) Input Parameters:

t

This parameter represents the text input that you want to convert into speech. It is the primary content that the node will process to generate audio. The quality and clarity of the output audio are directly influenced by the text provided.

language

This parameter specifies the language in which the text is written. It ensures that the TTS engine applies the correct phonetic and linguistic rules to produce accurate and natural-sounding speech. Supported languages may vary, so it's important to select the appropriate one for your text.

device

This parameter determines the computational device used for processing, such as a CPU or GPU. Selecting the right device can impact the speed and efficiency of the TTS conversion process, with GPUs typically offering faster performance.
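
As a rough illustration of device selection, the sketch below resolves a requested device name to one that is actually usable. The function name resolve_device and the "auto" option are assumptions made for this example, not confirmed options of the node; only torch.cuda.is_available() is a real PyTorch call.

```python
def resolve_device(requested="auto"):
    """Return a usable device name, falling back to CPU when CUDA is missing.

    Hypothetical helper: the node's own selection logic may differ.
    """
    try:
        import torch  # optional dependency; absent means CPU only
        cuda_ok = torch.cuda.is_available()
    except ImportError:
        cuda_ok = False

    if requested == "auto":
        return "cuda" if cuda_ok else "cpu"
    if requested == "cuda" and not cuda_ok:
        # Requested GPU is not available, so fall back instead of failing.
        return "cpu"
    return requested


print(resolve_device("auto"))
```

Falling back silently is one design choice; raising an error instead would surface a misconfigured GPU earlier.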

exaggeration

This parameter controls the level of expressiveness in the generated speech. Higher values can make the speech sound more animated or emotional, which can be useful for specific character voices or dramatic readings.

temperature

This parameter influences the randomness of the speech generation process. A higher temperature can result in more varied and creative outputs, while a lower temperature produces more consistent and predictable speech.
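
In sampling-based generators, temperature usually works by rescaling the model's scores before they are turned into probabilities. The sketch below shows that general technique; whether the engine behind this node applies temperature in exactly this way is an assumption.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert raw scores to probabilities, sharpened or flattened by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.5]
cool = softmax_with_temperature(logits, 0.5)  # concentrates on the top score
warm = softmax_with_temperature(logits, 1.5)  # spreads probability more evenly
print(cool, warm)
```

A lower temperature puts more probability on the most likely choice, which is why the output becomes more consistent.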

cfg_weight

This parameter adjusts the balance between following the input text closely and introducing creative variations. It allows you to fine-tune the adherence to the original text, which can be useful for achieving the desired level of fidelity in the output.

seed

This parameter sets the random seed for the TTS generation process so that results can be reproduced. Using the same seed with the same input text and the same settings should generate identical audio across different sessions.
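
The effect of a fixed seed can be sketched with Python's standard random module. The function sample_tokens is a hypothetical stand-in for the engine's sampling step, not part of the node.

```python
import random


def sample_tokens(seed, n=5, vocab_size=100):
    """Draw n pseudo-random token ids from a generator seeded with `seed`."""
    rng = random.Random(seed)  # private generator, independent of global state
    return [rng.randrange(vocab_size) for _ in range(n)]


first = sample_tokens(42)
second = sample_tokens(42)
print(first == second)  # same seed, same sequence
```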

reference_audio

This optional parameter allows you to provide a reference audio file to guide the TTS engine in mimicking a specific voice or style. It can be useful for achieving consistency with existing audio content or for character-specific voice synthesis.

audio_prompt_path

This parameter specifies the file path to an audio prompt that can be used to influence the style or tone of the generated speech. It provides an additional layer of customization for the TTS output.

enable_chunking

This boolean parameter enables or disables the chunking of long text segments into smaller parts for processing. Chunking can help manage memory usage and improve processing efficiency for lengthy texts.

max_chars_per_chunk

This parameter sets the maximum number of characters allowed in each text chunk when chunking is enabled. It helps control the size of the chunks and can impact the smoothness and coherence of the generated speech.
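
One plausible way to chunk text under a character limit is to pack whole sentences into each chunk. The sketch below assumes sentence-boundary splitting; the node's actual chunking strategy is not documented here, and split_into_chunks is a hypothetical name.

```python
import re


def split_into_chunks(text, max_chars):
    """Pack whole sentences into chunks of at most max_chars characters."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        # A single sentence longer than the limit is cut at the limit.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        if not sentence:
            continue
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


text = "Hello there. This is a test! Is it working? Yes."
print(split_into_chunks(text, 30))
```

Splitting at sentence ends keeps each chunk a complete thought, which tends to help the prosody of each generated piece.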

chunk_combination_method

This parameter determines the method used to combine audio chunks back into a single output. Different methods may affect the continuity and naturalness of the final audio.

silence_between_chunks_ms

This parameter specifies the duration of silence, in milliseconds, to be inserted between audio chunks. It can help create natural pauses in the speech, enhancing the overall listening experience.
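
A minimal sketch of joining chunks with a silent gap, assuming plain concatenation and audio represented as lists of float samples. The combination methods the node actually offers are not listed here, and join_with_silence is a hypothetical helper.

```python
def join_with_silence(chunks, sample_rate, silence_ms):
    """Concatenate audio chunks, inserting silence_ms of zeros between them."""
    gap = [0.0] * int(sample_rate * silence_ms / 1000)
    combined = []
    for index, chunk in enumerate(chunks):
        if index > 0:
            combined.extend(gap)  # silence only between chunks, not at the ends
        combined.extend(chunk)
    return combined


audio = join_with_silence([[0.1, 0.2], [0.3]], sample_rate=1000, silence_ms=5)
print(len(audio))
```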

crash_protection_template

This parameter provides a template for padding short text segments to prevent crashes during sequential generation. It ensures stability in the TTS process, especially for very short inputs.
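
The padding idea can be sketched as a simple string substitution. The `{seg}` placeholder, the example template, and the min_chars threshold are all assumptions for illustration; check the node's default value for the real template format.

```python
def pad_short_text(text, template, min_chars=20):
    """Wrap very short text in a template; leave longer text unchanged.

    Hypothetical helper: placeholder name and threshold are assumed.
    """
    stripped = text.strip()
    if len(stripped) >= min_chars or "{seg}" not in template:
        return text
    return template.replace("{seg}", stripped)


print(pad_short_text("Hi.", "hmm ,, {seg} hmm ,,"))
```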

enable_audio_cache

This boolean parameter enables or disables the caching of generated audio segments. Caching can improve performance by reusing previously generated audio for identical inputs, reducing processing time.
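
Caching of this kind is typically keyed on everything that affects the result. The sketch below hashes the text together with the generation settings; the key layout and the names cache_key and generate_cached are assumptions, not the node's implementation.

```python
import hashlib
import json

_cache = {}


def cache_key(text, **params):
    """Build a stable key from the text and every setting that affects output."""
    payload = json.dumps({"text": text, "params": params}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def generate_cached(text, synthesize, **params):
    """Call synthesize only when this exact input has not been seen before."""
    key = cache_key(text, **params)
    if key not in _cache:
        _cache[key] = synthesize(text, **params)
    return _cache[key]


def fake_synthesize(text, **params):
    # Stand-in for a real TTS call: one silent sample per character.
    return [0.0] * len(text)


print(len(generate_cached("hello", fake_synthesize, seed=1)))
```

Including the seed and other settings in the key matters: reusing audio generated with a different seed or temperature would silently return the wrong result.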

🎤 ChatterBox Voice TTS (diogod) Output Parameters:

segment_audio_chunks

This output parameter contains the generated audio chunks for each segment of the input text. These chunks are the building blocks of the final speech output and can be combined to form a continuous audio stream.

natural_duration

This parameter provides the natural duration of the generated audio, measured in seconds. It reflects the length of the speech output and can be useful for synchronization with other media elements.
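
Duration in seconds follows from the sample count and the sample rate. The sketch below shows that arithmetic; the 24000 Hz rate is only an example value, not a confirmed property of this node's output.

```python
def duration_seconds(num_samples, sample_rate):
    """Length of an audio clip in seconds."""
    if sample_rate <= 0:
        raise ValueError("sample_rate must be positive")
    return num_samples / sample_rate


print(duration_seconds(48000, 24000))  # 2.0
```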

🎤 ChatterBox Voice TTS (diogod) Usage Tips:

  • For the most natural-sounding speech, experiment with the exaggeration and temperature parameters until you find the right balance for your application.
  • Use the reference_audio and audio_prompt_path parameters to guide the TTS engine toward a specific voice or style, which helps keep the output consistent.
  • Turn on enable_chunking for long texts to keep memory usage manageable, and adjust max_chars_per_chunk and silence_between_chunks_ms if the joins between chunks sound abrupt.

🎤 ChatterBox Voice TTS (diogod) Common Errors and Solutions:

"Text too short for processing"

  • Explanation: The input text is too short, which may cause issues in the TTS generation process.
  • Solution: Use the crash_protection_template parameter to pad the text, ensuring stability during processing.

"Unsupported language"

  • Explanation: The specified language is not supported by the TTS engine.
  • Solution: Verify the list of supported languages and select an appropriate one for your text input.

"Device not available"

  • Explanation: The specified computational device (CPU/GPU) is not available for processing.
  • Solution: Check your system configuration and ensure the selected device is properly set up and accessible.
