Edge TTS:
The MontagenEdgeTTSNode converts text into speech using Edge TTS. It belongs to the generator category of the Montagen suite. You can select a voice, adjust the volume, speed, and pitch of the synthesized speech, and apply an offset and trim to the resulting audio. Typical uses include interactive applications, audio narratives, and multimedia presentations.
Edge TTS Input Parameters:
text
The text parameter is a string input containing the text you wish to convert into speech. It supports multiline input, so you can provide long passages. The text must not be empty; a blank value produces the "Input text cannot be empty" error. Clear, error-free text gives the most accurate speech synthesis.
timeRangeList
The timeRangeList parameter is of type MONTAGENTIMERANGETYPE and is used to specify the time range for the speech output. This parameter helps in defining the duration or specific segments of the audio that you want to generate or manipulate.
action
The action parameter, of type MONTAGENACTIONTYPE, determines the specific action to be performed during the text-to-speech conversion. It allows you to modify or customize the speech synthesis process according to your requirements.
volume
The volume parameter is a float value that controls the loudness of the speech output. It ranges from 0 to 5.0, with a default value of 1.0. Adjusting this parameter allows you to increase or decrease the volume of the generated speech, making it suitable for different listening environments.
speed
The speed parameter is a float value that adjusts the rate of speech. It ranges from 0.5 to 2.0, with a default value of 1.0. This parameter is useful for controlling how fast or slow the speech is delivered, enabling you to match the pace of the audio to your specific needs.
pitch
The pitch parameter is an integer that modifies the pitch of the voice. It ranges from -20 to +20 Hz, with a default value of 0. This parameter allows you to alter the tonal quality of the speech, making it higher or lower in pitch to suit different character voices or stylistic preferences.
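As a rough illustration of how these three values could be handed to Edge TTS, the sketch below converts them into signed strings such as "+50%" and "-20Hz", which is the format the edge-tts Python package accepts for its rate, volume, and pitch options. The mapping itself (a multiplier of 1.0 meaning "+0%") and the helper names to_percent, to_hz, and prosody_args are assumptions made for this example, not the node's confirmed implementation.

```python
def to_percent(multiplier: float) -> str:
    """Convert a multiplier (1.0 = unchanged) to a signed percent string.

    Assumed mapping: 1.5 becomes "+50%", 0.5 becomes "-50%".
    """
    percent = round((multiplier - 1.0) * 100)
    return f"{percent:+d}%"


def to_hz(pitch: int) -> str:
    """Convert an integer pitch shift to a signed Hz string, e.g. "-20Hz"."""
    return f"{pitch:+d}Hz"


def prosody_args(volume: float = 1.0, speed: float = 1.0, pitch: int = 0) -> dict:
    """Build the three prosody strings from the node-style parameters.

    The keys are hypothetical; they mirror the option names used by edge-tts.
    """
    return {
        "volume": to_percent(volume),
        "rate": to_percent(speed),
        "pitch": to_hz(pitch),
    }


if __name__ == "__main__":
    # Defaults leave the voice unchanged.
    print(prosody_args())
    # Quieter, faster, and lower-pitched speech.
    print(prosody_args(volume=0.5, speed=1.5, pitch=-20))
```

Under this assumed mapping, the default values of 1.0, 1.0, and 0 produce "+0%", "+0%", and "+0Hz", meaning the selected voice is used without modification.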
voice
The voice parameter allows you to select the voice used for text-to-speech conversion. It offers a list of default voices, with the first one being the default selection. This parameter is essential for choosing the desired vocal characteristics for the speech output.
offset
The offset parameter is a float value that specifies the offset, in seconds, to apply to the audio. It has a default value of 0.0 and a minimum value of 0.0, so negative offsets are not accepted. This parameter is useful for synchronizing the speech output with other media elements.
trim
The trim parameter is a float value that determines the amount of audio trimming to apply. It ranges from 0.0 to 2.0, with a default value of 0.2. This parameter helps in removing unwanted silence or noise from the beginning or end of the audio, ensuring a clean and precise output.
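The numeric ranges listed above can be collected in one place and checked before the node runs. The following is a minimal sketch, assuming out-of-range values are clamped to the nearest bound; the node may instead reject them or rely on the UI to enforce the limits. The names RANGES and clamp are hypothetical.

```python
# Documented (minimum, maximum) bounds; None means no upper bound is stated.
RANGES = {
    "volume": (0.0, 5.0),
    "speed": (0.5, 2.0),
    "pitch": (-20, 20),
    "offset": (0.0, None),
    "trim": (0.0, 2.0),
}


def clamp(name: str, value):
    """Return value limited to the documented range for the named parameter."""
    low, high = RANGES[name]
    if value < low:
        return low
    if high is not None and value > high:
        return high
    return value


if __name__ == "__main__":
    print(clamp("speed", 3.0))    # above the maximum, limited to 2.0
    print(clamp("offset", -1.0))  # below the minimum, limited to 0.0
    print(clamp("trim", 0.2))     # already in range, returned unchanged
```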
Edge TTS Output Parameters:
promptList
The promptList output is a string that contains the processed text prompts used for generating the speech. This output is important for verifying the text content that was converted into audio, ensuring that the correct prompts were used.
timeRangeList
The timeRangeList output, of type MONTAGENTIMERANGETYPE, provides the time range information associated with the generated speech. This output is useful for understanding the duration and timing of the audio segments.
action
The action output, of type MONTAGENACTIONTYPE, indicates the action that was performed during the text-to-speech conversion. This output helps in confirming the specific modifications or customizations applied to the speech synthesis process.
resourceList
The resourceList output, of type MONTAGENRESOURCESTYPE, contains a list of resources related to the generated speech. This output is valuable for accessing additional information or assets associated with the audio content.
Edge TTS Usage Tips:
- Ensure that the text input is clear and free of errors to achieve accurate speech synthesis.
- Experiment with different voice options to find the one that best suits your project's needs.
- Adjust the volume, speed, and pitch parameters to create a more natural and engaging audio output.
- Use the offset parameter to synchronize the speech with other media elements in your project.
Edge TTS Common Errors and Solutions:
Input text cannot be empty
- Explanation: This error occurs when the text input is left blank.
- Solution: Ensure that you provide a valid text input for conversion into speech.
NoAudioReceived
- Explanation: This error indicates that no audio was generated during the text-to-speech process.
- Solution: Check the text input and parameters to ensure they are correctly configured. Retry the operation with different settings if necessary.
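If you drive the synthesis from your own script, both errors above can be handled in a small wrapper. This is only a sketch: the NoAudioReceived class here is a local stand-in for the error of the same name, synthesize is a hypothetical callable that you supply, and treating whitespace-only text as empty is an assumption.

```python
import time


class NoAudioReceived(Exception):
    """Local stand-in for the error named above; not imported from a library."""


def validate_text(text: str) -> str:
    """Reject empty input before any synthesis request is made.

    Assumption: whitespace-only text is treated as empty.
    """
    if not text or not text.strip():
        raise ValueError("Input text cannot be empty")
    return text


def synthesize_with_retry(synthesize, text: str, attempts: int = 3, delay: float = 0.0):
    """Call synthesize(text), retrying when no audio is returned.

    synthesize is a hypothetical callable that returns audio data or
    raises NoAudioReceived.
    """
    validate_text(text)
    last_error = None
    for _ in range(max(1, attempts)):
        try:
            return synthesize(text)
        except NoAudioReceived as err:
            last_error = err
            time.sleep(delay)  # optional pause before the next attempt
    raise last_error
```

Retrying unchanged input only helps with transient failures. If the error persists, change the text or the voice settings as described in the solution above.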
