
ComfyUI Node: IndexTTS2 Simple

Class Name: IndexTTS2Simple
Category: Audio/IndexTTS
Author: snicolast (Account age: 2913 days)
Extension: ComfyUI-IndexTTS2
Latest Updated: 2025-10-13
GitHub Stars: 0.14K

How to Install ComfyUI-IndexTTS2

Install this extension through the ComfyUI Manager:
  1. Click the Manager button in the main menu.
  2. Click the Custom Nodes Manager button.
  3. Enter ComfyUI-IndexTTS2 in the search bar and install the matching result.
After installation, click the Restart button to restart ComfyUI, then manually refresh your browser to clear the cache and load the updated list of nodes.


IndexTTS2 Simple Description

Converts text to natural-sounding speech through a simplified interface, for use in multimedia projects.

IndexTTS2 Simple:

IndexTTS2Simple is a node that converts text into speech using the IndexTTS2 system. It is part of a suite of text-to-speech nodes and is the one aimed at simplicity: its interface hides most of the complexity of speech synthesis, so you can focus on the content rather than on technical details. Use it to turn written content into natural-sounding audio, for example to add voiceovers or interactive audio elements to multimedia projects.

IndexTTS2 Simple Input Parameters:

spk_audio_prompt

This parameter specifies the path to the speaker audio prompt, a reference recording that guides the voice characteristics of the generated speech. It determines the tone and style of the voice, so you can customize the output by changing the audio sample. The input should be a valid file path to an audio file; it has no numeric minimum or maximum.

text

The text parameter is the core input for the node, representing the written content you wish to convert into speech. It directly influences the spoken output, as the node will synthesize audio based on this text. There are no explicit constraints on length, but longer texts may require more processing time.

emo_audio_prompt

This optional parameter allows you to provide an emotional audio prompt, which can be used to infuse the generated speech with specific emotional characteristics. It enhances the expressiveness of the output by mimicking the emotions present in the provided audio sample. The input should be a valid file path to an audio file.

emo_alpha

emo_alpha controls how strongly the emo_audio_prompt influences the generated speech. It ranges from 0 to 1, where 0 means no emotional influence and 1 means full influence. Adjust this value to fine-tune the emotional expression in the synthesized speech.
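One way to picture the 0-to-1 scale is as a linear blend between a neutral style and the emotion reference. The sketch below is illustrative only: the helper name `blend_emotion` and the plain-list feature vectors are hypothetical, and the node's actual mixing may work differently.

```python
def blend_emotion(base, emotion, emo_alpha):
    """Linearly interpolate between two equal-length feature lists.

    Hypothetical helper for illustration; not the node's internal code.
    """
    if not 0.0 <= emo_alpha <= 1.0:
        raise ValueError("emo_alpha must be between 0 and 1")
    if len(base) != len(emotion):
        raise ValueError("base and emotion must have the same length")
    # emo_alpha = 0 returns base unchanged; emo_alpha = 1 returns emotion.
    return [(1.0 - emo_alpha) * b + emo_alpha * e for b, e in zip(base, emotion)]
```

Under this assumed model, a value of 0.5 lands halfway between the two references, which is a reasonable starting point when experimenting.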

emo_vector

This parameter is an alternative to emo_audio_prompt: it specifies the emotional characteristics as a vector instead of an audio sample. It offers more granular control over the emotional tone of the output, allowing for complex emotional expressions. The input must be a valid vector.

use_random_style

A boolean parameter that, when set to true, enables the use of random style variations in the generated speech. This can add diversity and uniqueness to the output, making it less predictable and more dynamic.

interval_silence

interval_silence specifies the duration of silence, in milliseconds, inserted between segments of speech. It affects the pacing and naturalness of the output, with longer silences creating more deliberate pauses. The default value is 200 milliseconds.
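To make the unit concrete, the sketch below converts a millisecond interval into a sample count and joins segments with that much silence between them. The helper names and the use of plain Python lists as audio buffers are assumptions for illustration; the node's own implementation is not shown in this document.

```python
def silence_samples(interval_ms, sample_rate):
    """Number of samples covering interval_ms at the given sample rate."""
    if interval_ms < 0 or sample_rate <= 0:
        raise ValueError("interval_ms must be >= 0 and sample_rate > 0")
    return int(round(sample_rate * interval_ms / 1000))


def join_with_silence(segments, interval_ms, sample_rate):
    """Concatenate mono segments, inserting silence between neighbors."""
    gap = [0.0] * silence_samples(interval_ms, sample_rate)
    joined = []
    for index, segment in enumerate(segments):
        if index > 0:
            joined.extend(gap)  # silence goes between segments, not at the ends
        joined.extend(segment)
    return joined
```

For example, at an assumed sample rate of 22050 Hz the default 200 milliseconds corresponds to 4410 samples of silence per gap.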

max_text_tokens_per_segment

This parameter defines the maximum number of text tokens processed per segment. It keeps longer inputs manageable by breaking them into chunks: lower values produce more, shorter segments, while higher values keep more text together in each segment. Set the value according to the balance you want between processing efficiency and output coherence.
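The effect of the limit can be sketched with a simple chunker. This is an approximation under stated assumptions: it treats whitespace-separated words as tokens and ignores sentence boundaries, whereas the real system uses its own tokenizer and splitting rules. The function name `split_into_segments` is hypothetical.

```python
def split_into_segments(text, max_tokens):
    """Split text into chunks of at most max_tokens whitespace tokens.

    Illustrative only: whitespace words stand in for model tokens.
    """
    if max_tokens < 1:
        raise ValueError("max_tokens must be at least 1")
    tokens = text.split()
    return [
        " ".join(tokens[start:start + max_tokens])
        for start in range(0, len(tokens), max_tokens)
    ]
```

Each resulting segment would be synthesized separately, which is why a lower limit can introduce more joins (and more interval_silence gaps) in the final audio.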

IndexTTS2 Simple Output Parameters:

AUDIO

The AUDIO output is the synthesized speech generated from the input text. It is a waveform representation of the spoken content, ready for playback or for further processing by downstream nodes, for example in voiceovers or interactive media.

STRING

The STRING output provides a textual representation of the synthesis process, which can include metadata or status information. This output is useful for debugging or logging purposes, offering insights into the node's operation and performance.

IndexTTS2 Simple Usage Tips:

  • Ensure that the spk_audio_prompt and emo_audio_prompt are high-quality audio files to achieve the best results in voice and emotional expression.
  • Experiment with the emo_alpha parameter to find the right balance of emotional influence for your specific application, enhancing the expressiveness of the output.

IndexTTS2 Simple Common Errors and Solutions:

IndexTTS2 returned an unexpected result format

  • Explanation: This error occurs when the output from the IndexTTS2 system does not match the expected format, which should be a tuple or list with two elements.
  • Solution: Verify that all input parameters are correctly specified and that the input files are accessible and valid. If the problem persists, check for updates or patches to the IndexTTS2 system that might address compatibility issues.
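The check behind this error can be sketched as follows. The function name `unpack_result` is hypothetical and the sketch deliberately makes no assumption about what the two elements contain; it only mirrors the condition described in the explanation above.

```python
def unpack_result(result):
    """Return the two elements of result, or raise if the shape is wrong."""
    if not isinstance(result, (tuple, list)) or len(result) != 2:
        raise RuntimeError("IndexTTS2 returned an unexpected result format")
    first, second = result
    return first, second
```

If you see this error in practice, logging the type and length of the returned value is a quick way to tell a missing input file apart from a version mismatch.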
