RunComfy

ComfyUI Node: 📺 ChatterBox SRT Voice TTS

Class Name: ChatterBoxSRTVoiceTTS
Category: ChatterBox Voice
Author: diodiogod (Account age: 768 days)
Extension: ComfyUI_ChatterBox_SRT_Voice
Latest Updated: 2026-03-21
Github Stars: 0.08K

Table of Content

Description
ChatterBoxSRTVoiceTTS:
ChatterBoxSRTVoiceTTS Input Parameters:
ChatterBoxSRTVoiceTTS Output Parameters:
ChatterBoxSRTVoiceTTS Usage Tips:
ChatterBoxSRTVoiceTTS Common Errors and Solutions:
Related Nodes

How to Install ComfyUI_ChatterBox_SRT_Voice

Install this extension via the ComfyUI Manager by searching for ComfyUI_ChatterBox_SRT_Voice:

1. Click the Manager button in the main menu.
2. Select the Custom Nodes Manager button.
3. Enter ComfyUI_ChatterBox_SRT_Voice in the search bar.

After installation, click the Restart button to restart ComfyUI. Then, manually refresh your browser to clear the cache and access the updated list of nodes.

📺 ChatterBox SRT Voice TTS Description

ChatterBoxSRTVoiceTTS converts text to expressive, high-quality speech with customizable features.

📺 ChatterBox SRT Voice TTS:

ChatterBoxSRTVoiceTTS converts text into speech. It is aimed at AI artists and developers who need expressive voice synthesis, and it can produce audio for different characters and languages. The node supports pause tags, which insert pauses into the speech for more natural pacing, and it exposes exaggeration and temperature settings that adjust the expressiveness and variability of the generated speech.

📺 ChatterBox SRT Voice TTS Input Parameters:

text

The text parameter is the primary input for the node: the content that you want to convert into speech. It accepts a string of text in any supported language. Longer texts take longer to process, especially when chunking is enabled. No explicit minimum or maximum length is listed for the parameter itself, but text that exceeds the per-chunk character limit triggers the "Text input too long" error described under Common Errors and Solutions.

audio_prompt

The audio_prompt parameter allows you to provide a reference audio file that the TTS system can use to match the voice characteristics. This can be particularly useful if you want the generated speech to mimic a specific voice or style. The parameter accepts an audio file path or a reference to an audio tensor.

exaggeration

The exaggeration parameter controls the expressiveness of the generated speech. It is a float value that adjusts how much the speech deviates from a neutral tone. Higher values result in more exaggerated speech, which can be useful for creating dramatic or emotional audio outputs. The default value is typically set to a moderate level to balance naturalness and expressiveness.

temperature

The temperature parameter influences the variability and creativity of the speech synthesis. It is a float value where lower values result in more deterministic and stable outputs, while higher values introduce more randomness and variation. This parameter is crucial for achieving the desired level of spontaneity in the generated speech.
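
As a rough illustration of what a temperature setting usually does in sampling-based generators, the sketch below rescales a toy list of scores before turning them into probabilities. The function name and the scores are made up for illustration; the node's actual sampling code is not shown on this page and may work differently.

```python
import math

def softmax_with_temperature(scores, temperature):
    """Convert raw scores to probabilities, sharpened or flattened by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [s / temperature for s in scores]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for three candidate outputs.
toy_scores = [2.0, 1.0, 0.5]
low = softmax_with_temperature(toy_scores, 0.5)   # lower temperature
high = softmax_with_temperature(toy_scores, 1.5)  # higher temperature
print(low)
print(high)
```

With the lower temperature, most of the probability lands on the top-scoring candidate, so repeated draws look alike. With the higher temperature, the probabilities are closer together, so draws vary more.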

cfg_weight

The cfg_weight parameter is a float value that adjusts the influence of the conditioning factors on the TTS model. It helps in fine-tuning the balance between the input text and the reference audio characteristics. This parameter is essential for achieving a coherent and contextually appropriate audio output.
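
If cfg_weight behaves like a classifier-free guidance weight (an assumption; this page only says that it balances the conditioning), one common formulation blends an unconditioned prediction with a conditioned one as sketched below. The function and the numbers are hypothetical and are not taken from the node's source.

```python
def apply_guidance(unconditioned, conditioned, cfg_weight):
    """Move each value from the unconditioned prediction toward the conditioned one.

    Assumed formulation: result = u + cfg_weight * (c - u).
    """
    return [u + cfg_weight * (c - u) for u, c in zip(unconditioned, conditioned)]

# Toy predictions, chosen only to make the arithmetic easy to follow.
uncond = [0.25, 0.5]
cond = [0.75, 0.0]

no_guidance = apply_guidance(uncond, cond, 0.0)    # equals the unconditioned values
half_guidance = apply_guidance(uncond, cond, 0.5)  # halfway between the two
full_guidance = apply_guidance(uncond, cond, 1.0)  # equals the conditioned values
print(no_guidance, half_guidance, full_guidance)
```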

language

The language parameter specifies the language in which the text should be synthesized. It accepts a string value naming the language, such as "English" or "Spanish". This parameter ensures that the TTS system uses the correct phonetic and linguistic rules for the specified language.

enable_pause_tags

The enable_pause_tags parameter is a boolean that determines whether pause tags should be used in the speech synthesis. When enabled, the system inserts pauses in the speech to create more natural and human-like audio. This is particularly useful for longer texts or when simulating conversational speech.
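
This page does not document the pause tag syntax, so the sketch below assumes a hypothetical form, [pause:SECONDS], purely to show how text could be split into spoken segments and silences. Check the extension's own documentation for the real syntax.

```python
import re

# Hypothetical tag syntax, assumed for illustration: [pause:SECONDS], e.g. "[pause:1.5]".
PAUSE_TAG = re.compile(r"\[pause:(\d+(?:\.\d+)?)\]")

def split_pause_tags(text):
    """Split text into ("text", str) and ("pause", seconds) segments, in order."""
    segments = []
    position = 0
    for match in PAUSE_TAG.finditer(text):
        spoken = text[position:match.start()].strip()
        if spoken:
            segments.append(("text", spoken))
        segments.append(("pause", float(match.group(1))))
        position = match.end()
    tail = text[position:].strip()
    if tail:
        segments.append(("text", tail))
    return segments

segments = split_pause_tags("Hello there. [pause:1.5] How are you?")
print(segments)
```

Each text segment would then be synthesized on its own, with silence of the requested length placed between the resulting clips.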

character

The character parameter allows you to specify the character or voice persona that should be used for the speech synthesis. It accepts a string value representing the character's name or role, such as "narrator" or "villain". This parameter is crucial for projects that require distinct voice identities.

seed

The seed parameter is an integer that sets the random seed for the TTS generation process. By providing a specific seed value, you can ensure that the generated speech is reproducible and consistent across different runs. This is useful for debugging or when you need to maintain consistency in audio outputs.
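
The effect of a fixed seed can be shown with a stand-in generator. The sketch below uses Python's standard random module rather than the node's real generation code, and fake_generate is a hypothetical name.

```python
import random

def fake_generate(seed, length=5):
    """Stand-in for a generator: draws pseudo-random values from a seeded RNG."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(length)]

first = fake_generate(seed=42)
second = fake_generate(seed=42)  # same seed, so the same values
other = fake_generate(seed=43)   # different seed, so different values

print(first == second)
print(first == other)
```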

enable_cache

The enable_cache parameter is a boolean that determines whether caching should be used to store intermediate audio results. Enabling caching can significantly speed up the processing time for repeated or similar text inputs, as it avoids redundant computations.

crash_protection_template

The crash_protection_template parameter is a string that provides a template for handling potential crashes during the TTS process. It typically includes placeholder text that can be used to fill in segments of the text that might cause issues, ensuring that the system can recover gracefully from errors.
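
One way such a template could be applied is sketched below. It assumes that very short segments are the ones at risk and that the template contains a {seg} placeholder; both are assumptions, as are the template text and the length threshold, none of which are stated on this page.

```python
def protect_segment(segment, template="hmm ,, {seg} hmm ,,", min_chars=10):
    """Wrap very short segments in the template; leave longer ones untouched.

    The template text, the {seg} placeholder, and min_chars are all assumed
    values for illustration.
    """
    stripped = segment.strip()
    if len(stripped) >= min_chars:
        return stripped
    return template.replace("{seg}", stripped)

short_result = protect_segment("Yes.")
long_result = protect_segment("This sentence is long enough.")
print(short_result)
print(long_result)
```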

stable_audio_component

The stable_audio_component parameter allows you to specify a stable audio component that can be used to enhance the consistency of the generated speech. This parameter is optional and can be used to maintain a uniform audio quality across different segments.

📺 ChatterBox SRT Voice TTS Output Parameters:

audio_output

The audio_output parameter is the node's primary output: the synthesized speech as an audio tensor. It can be processed further or integrated directly into your applications.
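
As an example of further processing, the sketch below encodes audio samples as a WAV file in memory using only the Python standard library. It assumes the tensor has already been converted to a flat list of mono float samples between -1.0 and 1.0; the 24000 Hz sample rate and the test tone are placeholders, not values taken from the node.

```python
import io
import math
import struct
import wave

def floats_to_wav_bytes(samples, sample_rate):
    """Encode mono float samples in [-1.0, 1.0] as 16-bit PCM WAV bytes."""
    frames = bytearray()
    for sample in samples:
        clipped = max(-1.0, min(1.0, sample))  # guard against out-of-range values
        frames += struct.pack("<h", int(round(clipped * 32767)))
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(1)   # mono
        wav_file.setsampwidth(2)   # 2 bytes = 16-bit samples
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(bytes(frames))
    return buffer.getvalue()

sample_rate = 24000  # placeholder rate, assumed for this example
# One second of a 440 Hz tone standing in for synthesized speech.
tone = [0.5 * math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(sample_rate)]
wav_bytes = floats_to_wav_bytes(tone, sample_rate)
print(len(wav_bytes))
```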

📺 ChatterBox SRT Voice TTS Usage Tips:

To achieve the most natural-sounding speech, experiment with the exaggeration and temperature parameters to find the right balance for your use case.
Set enable_pause_tags to true to insert natural pauses in the speech, especially for longer texts or when simulating dialogue.
If you need consistent audio outputs across different runs, set a specific seed value.
Set enable_cache to true to speed up processing when the same text inputs are repeated.

📺 ChatterBox SRT Voice TTS Common Errors and Solutions:

"Text input too long"

Explanation: This error occurs when the input text exceeds the maximum character limit for a single chunk.
Solution: Enable chunking by setting enable_chunking to true, and set max_chars_per_chunk to a value that splits the text into manageable segments. Both settings are referenced here but are not described in the input parameter list above.
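
To illustrate what chunking does, the sketch below packs whole sentences into chunks that stay under a character limit. It is a simplified stand-in, not the node's own splitter; only the max_chars_per_chunk name comes from this page.

```python
import re

def chunk_text(text, max_chars_per_chunk):
    """Greedily pack whole sentences into chunks no longer than the limit.

    A single sentence longer than the limit is kept whole rather than cut mid-word.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks = []
    current = ""
    for sentence in sentences:
        if not sentence:
            continue
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chars_per_chunk or not current:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks

chunks = chunk_text("One two three. Four five six. Seven eight nine.", 30)
print(chunks)
```

Splitting at sentence boundaries keeps each chunk a natural unit of speech, which tends to sound better than cutting at an arbitrary character position.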

"Unsupported language code"

Explanation: The specified language code is not supported by the TTS system.
Solution: Verify that the language parameter is set to a supported value, such as "English" or "Spanish".

"Audio prompt not found"

Explanation: The specified audio prompt file could not be located or accessed.
Solution: Ensure that the audio_prompt parameter points to a valid file path or audio tensor and that the file is accessible.
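
When the audio prompt is a file path, a quick check before running the workflow can rule out the most common causes. The helper below is a hypothetical sketch using Python's pathlib and is not part of the extension.

```python
import tempfile
from pathlib import Path

def check_audio_prompt(path_text):
    """Return the resolved path if the file exists and is non-empty, else raise."""
    path = Path(path_text).expanduser()
    if not path.is_file():
        raise FileNotFoundError(f"Audio prompt not found: {path}")
    if path.stat().st_size == 0:
        raise ValueError(f"Audio prompt is empty: {path}")
    return path.resolve()

# Demonstration with a temporary file standing in for a reference recording.
with tempfile.TemporaryDirectory() as folder:
    sample = Path(folder) / "reference.wav"
    sample.write_bytes(b"RIFF")  # placeholder bytes, not a valid recording
    found = check_audio_prompt(str(sample))

    missing_raised = False
    try:
        check_audio_prompt(str(Path(folder) / "missing.wav"))
    except FileNotFoundError:
        missing_raised = True

print(found.name, missing_raised)
```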

"Character voice not available"

Explanation: The specified character voice is not available in the current TTS model.
Solution: Check the available character voices and ensure that the character parameter is set to a valid option.

📺 ChatterBox SRT Voice TTS Related Nodes

Go back to the extension to check out more related nodes.

ComfyUI_ChatterBox_SRT_Voice
