ComfyUI Node: Edge TTS

Class Name: MontagenEdgeTTSNode
Category: Montagen/Generator
Author: MontagenAI (Account age: 495 days)
Extension: ComfyUI-Montagen
Last Updated: 2025-05-29
GitHub Stars: 0.03K

How to Install ComfyUI-Montagen

Install this extension through the ComfyUI Manager:
  1. Click the Manager button in the main menu.
  2. Click the Custom Nodes Manager button.
  3. Enter ComfyUI-Montagen in the search bar and install the extension.
After installation, click the Restart button to restart ComfyUI. Then, manually refresh your browser to clear the cache and access the updated list of nodes.

Edge TTS Description

MontagenEdgeTTSNode converts text to customizable speech, enhancing multimedia projects.

Edge TTS:

MontagenEdgeTTSNode converts text into speech using Edge TTS. It is part of the Montagen suite, in the Montagen/Generator category. You can adjust the volume, speed, and pitch of the synthesized speech, choose the voice, and offset or trim the resulting audio so that it lines up with other media in your project. Typical uses include interactive applications, audio narratives, and multimedia presentations.

Edge TTS Input Parameters:

text

The text parameter is a string containing the text you want to convert into speech. It supports multiline input, so you can provide long passages. The text must not be empty: a blank value produces the "Input text cannot be empty" error. Clear, correctly spelled text gives the most accurate synthesis.

timeRangeList

The timeRangeList parameter is of type MONTAGENTIMERANGETYPE and is used to specify the time range for the speech output. This parameter helps in defining the duration or specific segments of the audio that you want to generate or manipulate.

action

The action parameter, of type MONTAGENACTIONTYPE, determines the specific action to be performed during the text-to-speech conversion. It allows you to modify or customize the speech synthesis process according to your requirements.

volume

The volume parameter is a float that controls the loudness of the speech output. It ranges from 0.0 to 5.0, with a default value of 1.0. Raise or lower it to suit the listening environment or to balance the speech against other audio.

speed

The speed parameter is a float value that adjusts the rate of speech. It ranges from 0.5 to 2.0, with a default value of 1.0. This parameter is useful for controlling how fast or slow the speech is delivered, enabling you to match the pace of the audio to your specific needs.

pitch

The pitch parameter is an integer that shifts the pitch of the voice, in Hz. It ranges from -20 to +20, with a default value of 0. Negative values lower the voice and positive values raise it, which is useful for distinguishing character voices or matching a stylistic preference.
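As a rough illustration of how the three values above could be prepared for a direct call to the edge-tts Python package, the sketch below clamps each value to the range documented here and formats it as a signed string such as "+50%" or "-5Hz", which is the form that package accepts for its rate, volume, and pitch options. The mapping from the node's multipliers to percentages (1.0 becomes "+0%") and the helper names clamp and to_edge_tts_options are assumptions for illustration; the node's actual conversion is not documented on this page.

```python
def clamp(value, low, high):
    """Restrict value to the closed interval [low, high]."""
    return max(low, min(high, value))


def to_edge_tts_options(volume=1.0, speed=1.0, pitch=0):
    """Format node-style values as signed strings.

    Assumption: a multiplier of 1.0 corresponds to "+0%", 1.5 to "+50%",
    and so on. This is a hypothetical mapping, not the node's confirmed logic.
    """
    volume = clamp(float(volume), 0.0, 5.0)   # documented range 0.0 to 5.0
    speed = clamp(float(speed), 0.5, 2.0)     # documented range 0.5 to 2.0
    pitch = clamp(int(pitch), -20, 20)        # documented range -20 to +20 Hz
    return {
        "volume": f"{round((volume - 1.0) * 100):+d}%",
        "rate": f"{round((speed - 1.0) * 100):+d}%",
        "pitch": f"{pitch:+d}Hz",
    }


options = to_edge_tts_options(volume=1.0, speed=1.5, pitch=-5)
print(options)
```

Out-of-range values are clamped rather than rejected in this sketch, so a speed of 3.0 is treated as 2.0.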

voice

The voice parameter selects the voice used for text-to-speech conversion. It offers a list of default voices, and the first entry in the list is selected by default. Choose the voice whose language and vocal characteristics fit your content.
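When a voice list is long, it can help to narrow it by language first. The sketch below filters a list of voice records by locale prefix. It assumes records shaped like the dictionaries returned by the edge-tts Python package's list_voices() coroutine, which include "ShortName" and "Locale" keys; whether the node builds its own list from that call is an assumption. The helper name voices_for_locale and the three sample records are illustrative only.

```python
def voices_for_locale(voices, locale_prefix):
    """Return the sorted short names of voices whose locale starts with the prefix."""
    return sorted(
        voice["ShortName"]
        for voice in voices
        if voice["Locale"].startswith(locale_prefix)
    )


# Illustrative sample data; a real list would come from the TTS service.
sample_voices = [
    {"ShortName": "en-US-AriaNeural", "Locale": "en-US"},
    {"ShortName": "en-GB-SoniaNeural", "Locale": "en-GB"},
    {"ShortName": "de-DE-KatjaNeural", "Locale": "de-DE"},
]

english_voices = voices_for_locale(sample_voices, "en-")
print(english_voices)
```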

offset

The offset parameter is a float that specifies an offset, in seconds, to apply to the audio. It has a default value of 0.0 and a minimum value of 0.0, so it can only delay the audio, not advance it. Use it to synchronize the speech with other media elements by pushing its start time later.

trim

The trim parameter is a float value that determines the amount of audio trimming to apply. It ranges from 0.0 to 2.0, with a default value of 0.2. This parameter helps in removing unwanted silence or noise from the beginning or end of the audio, ensuring a clean and precise output.
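To make the interaction between offset and trim concrete, the sketch below computes where a synthesized clip would sit on a timeline. It is only a sketch under stated assumptions: the function name place_clip is hypothetical, and it assumes that trim shortens the clip at its end and that offset delays its start. The node may apply trimming differently, for example at both ends.

```python
def place_clip(duration, offset=0.0, trim=0.2):
    """Return (start, end) in seconds for a clip of the given duration.

    Assumptions: offset delays the start, and trim is removed from the end.
    The ranges checked here are the ones documented for the node.
    """
    if duration < 0:
        raise ValueError("duration must be non-negative")
    if offset < 0.0:
        raise ValueError("offset must be at least 0.0")
    if not 0.0 <= trim <= 2.0:
        raise ValueError("trim must be between 0.0 and 2.0")
    kept = max(0.0, duration - trim)  # never produce a negative length
    return (offset, offset + kept)


start, end = place_clip(duration=3.0, offset=1.5, trim=0.5)
print(start, end)
```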

Edge TTS Output Parameters:

promptList

The promptList output is a string that contains the processed text prompts used for generating the speech. This output is important for verifying the text content that was converted into audio, ensuring that the correct prompts were used.

timeRangeList

The timeRangeList output, of type MONTAGENTIMERANGETYPE, provides the time range information associated with the generated speech. This output is useful for understanding the duration and timing of the audio segments.

action

The action output, of type MONTAGENACTIONTYPE, indicates the action that was performed during the text-to-speech conversion. This output helps in confirming the specific modifications or customizations applied to the speech synthesis process.

resourceList

The resourceList output, of type MONTAGENRESOURCESTYPE, contains a list of resources related to the generated speech. This output is valuable for accessing additional information or assets associated with the audio content.

Edge TTS Usage Tips:

  • Ensure that the text input is clear and free of errors to achieve accurate speech synthesis.
  • Experiment with different voice options to find the one that best suits your project's needs.
  • Adjust the volume, speed, and pitch parameters to create a more natural and engaging audio output.
  • Use the offset parameter to synchronize the speech with other media elements in your project.

Edge TTS Common Errors and Solutions:

Input text cannot be empty

  • Explanation: This error occurs when the text input is left blank.
  • Solution: Ensure that you provide a valid text input for conversion into speech.

NoAudioReceived

  • Explanation: This error indicates that no audio was generated during the text-to-speech process.
  • Solution: Check the text input and parameters to ensure they are correctly configured. Retry the operation with different settings if necessary.
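If you drive speech synthesis from your own script, both errors above can be handled in one place: reject blank text before calling the service, and retry a limited number of times when no audio comes back. The sketch below is a generic pattern, not the node's implementation. The function synthesize_with_retry and its synth argument are hypothetical, and the NoAudioReceived class is a local stand-in so that the example runs without any TTS package installed (the edge-tts Python package defines an exception with the same name).

```python
class NoAudioReceived(Exception):
    """Local stand-in for the error of the same name reported by the node."""


def synthesize_with_retry(text, synth, attempts=3, retry_on=(NoAudioReceived,)):
    """Validate text, then call synth(text), retrying on the given errors.

    synth is any callable that takes the text and returns audio data.
    """
    if not text or not text.strip():
        raise ValueError("Input text cannot be empty")
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for _ in range(attempts):
        try:
            return synth(text)
        except retry_on as err:
            last_error = err  # remember the failure and try again
    raise last_error


# Usage with a fake synthesizer that fails twice before succeeding.
calls = {"count": 0}


def flaky_synth(text):
    calls["count"] += 1
    if calls["count"] < 3:
        raise NoAudioReceived("no audio was received")
    return b"audio-bytes"


audio = synthesize_with_retry("Hello there", flaky_synth)
print(len(audio), calls["count"])
```

Retrying helps with transient failures only. If the error persists, changing the text or the voice, as suggested above, is more likely to help than further attempts.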

Copyright 2025 RunComfy. All Rights Reserved.
