Fish Audio TTS:
The MontagenFishAudioTTSNode converts text into speech using the Fish Audio Text-to-Speech (TTS) service. It is part of the Montagen suite, which focuses on generating audio content from textual input. The node lets you select a specific voice, trim and offset the resulting audio, and adjust the sampling parameters that shape the speech output. It is aimed at AI artists and creators who want customizable voiceovers that can be synchronized with other multimedia elements in their projects.
Fish Audio TTS Input Parameters:
text
This parameter accepts a list of strings, each representing a piece of text to be converted into speech. The text can be multiline, so long narratives can be processed. The text must not be empty; empty input produces the error "Input text cannot be empty". This parameter is the core input for the TTS process, since its content is what gets spoken.
trim
This float parameter specifies the amount of time, in seconds, to trim from the start of the audio output. It ranges from 0.0 to 2.0 seconds, with a default value of 0.0. Trimming can be useful for removing unwanted silence or noise at the beginning of the audio file, ensuring a cleaner and more professional result.
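As a rough illustration of what trimming from the start means at the sample level, the sketch below drops the first trim seconds from a list of audio samples. The helper name trim_start, the plain-list sample representation, and the 8 kHz sample rate are assumptions made for this example; the node's actual implementation is not shown in this document.

```python
def trim_start(samples, sample_rate, trim_seconds):
    """Return samples with the first trim_seconds removed (hypothetical helper)."""
    # The documented range for trim is 0.0 to 2.0 seconds.
    if not 0.0 <= trim_seconds <= 2.0:
        raise ValueError("trim must be between 0.0 and 2.0 seconds")
    # Convert the duration to a whole number of samples.
    n = int(round(trim_seconds * sample_rate))
    return samples[n:]


# One second of dummy mono audio at an assumed 8 kHz sample rate.
audio = list(range(8000))
trimmed = trim_start(audio, 8000, 0.25)
print(len(trimmed))
```

With a 0.25 second trim at 8 kHz, the first 2000 samples are discarded and the rest are returned unchanged.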
voice
The voice parameter is a required string input that selects the voice used for the TTS conversion. Choose a voice that matches the tone and style of your project, since it largely determines how the speech sounds. If no voice is provided, the node reports "Voice is required for Fish Audio TTS."
offset
This float parameter defines the offset in seconds to apply to the audio, with a default value of 0.0. The offset can be adjusted in increments of 0.1 seconds. It is useful for synchronizing the audio with other media elements, ensuring that the speech aligns perfectly with visual or other auditory components.
unique_id
This parameter is a unique identifier for the workflow or project. It helps manage and organize different TTS tasks, ensuring that each task is distinct and can be tracked or referenced independently within the Montagen system.
prompt
The prompt parameter is a string input that provides additional context or instructions for the TTS process. It can be used to guide the voice synthesis, influencing factors such as tone, emphasis, and pacing.
extra_pnginfo
This parameter allows for the inclusion of additional metadata or information in the form of a string. It can be used to embed context or instructions that may affect the TTS output or its integration with other media elements.
apiKey
This optional string parameter supplies the API key used to authenticate with the Fish Audio TTS service. If it is not provided, the system attempts to retrieve a key automatically. If no key is available from either source, the node reports "API Key is required for Fish Audio TTS."
timeRangeList
This optional parameter is a list of dictionaries that define specific time ranges for the TTS process. It allows for precise control over when certain text segments are converted to speech, facilitating complex audio timelines and synchronization.
action
This optional string parameter specifies the action to be taken during the TTS process. It can influence how the text is processed and converted into speech, providing flexibility in handling different audio production scenarios.
normalize
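This boolean parameter, which defaults to True, determines whether normalization is applied. It is described here as audio normalization, which adjusts levels to keep volume consistent across segments. The Fish Audio API also exposes an option named normalize that applies to the input text (for example, to make numbers read more reliably), so if the node simply forwards this flag, it may control text normalization rather than loudness.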
top_p
This float parameter, ranging from 0.0 to 1.0 with a default value of 0.7, controls the randomness of the TTS output. A higher value results in more diverse and creative speech variations, while a lower value produces more predictable and stable results.
temperature
Similar to top_p, this float parameter ranges from 0.0 to 1.0 and has a default value of 0.7. It influences the creativity and variability of the TTS output, with higher values leading to more dynamic and expressive speech.
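To make the difference between the two settings more concrete, the sketch below shows how temperature and top_p typically behave in sampling-based generators: temperature rescales the scores before they become probabilities, and top_p keeps only the most likely candidates whose cumulative probability reaches the threshold. This is a generic illustration with made-up scores and hypothetical function names, not Fish Audio's implementation.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities; lower temperature sharpens them."""
    # The documented range includes 0.0, but this sketch needs a positive value.
    if temperature <= 0:
        raise ValueError("temperature must be positive in this sketch")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_p_filter(probs, top_p):
    """Keep the smallest set of most likely indices whose mass reaches top_p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in order:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break
    # Renormalize the surviving probabilities so they sum to 1.
    return {i: probs[i] / mass for i in kept}


logits = [2.0, 1.0, 0.1]  # made-up scores for three candidates
probs = softmax_with_temperature(logits, 0.7)
narrow = top_p_filter(probs, 0.7)
wide = top_p_filter(probs, 0.9)
print(sorted(narrow), sorted(wide))
```

In this toy example the lower top_p keeps only the single most likely candidate, while the higher one admits a second candidate, which is why higher values tend to produce more varied output.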
Fish Audio TTS Output Parameters:
promptList
This output is a list of strings representing the processed prompts used in the TTS conversion. It provides a record of the input prompts, allowing you to verify and review the text that was converted into speech.
timeRangeList
The timeRangeList output is a list of time ranges that were applied during the TTS process. It reflects the timing and synchronization of the audio segments, ensuring that the speech aligns with other media elements as intended.
action
This output indicates the action that was executed during the TTS process. It provides insight into the processing decisions made by the node, helping you understand how the text was converted into speech.
resourceList
The resourceList is a list of file paths to the generated audio files. These files contain the speech output and can be used in various multimedia projects. The list allows you to easily access and manage the audio resources produced by the node.
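One way to consume these outputs downstream is to pair each prompt with its audio file. The sketch below assumes that promptList and resourceList are index-aligned, which this document does not confirm; the helper name and the example paths are made up for illustration.

```python
from pathlib import Path


def pair_prompts_with_audio(prompt_list, resource_list):
    """Pair each prompt with its audio path, assuming index-aligned lists."""
    if len(prompt_list) != len(resource_list):
        raise ValueError("promptList and resourceList differ in length")
    return [(prompt, Path(path)) for prompt, path in zip(prompt_list, resource_list)]


pairs = pair_prompts_with_audio(
    ["Hello there.", "Welcome back."],
    ["out/clip_0.mp3", "out/clip_1.mp3"],  # made-up example paths
)
for prompt, path in pairs:
    print(f"{path.name}: {prompt}")
```

The length check is there so that a mismatch is reported instead of silently dropping entries, which zip would otherwise do.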
Fish Audio TTS Usage Tips:
- Ensure that the voice parameter is set to a suitable option that matches the style and tone of your project for optimal results.
- Use the trim and offset parameters to fine-tune the audio output, removing unwanted silence and synchronizing the speech with other media elements.
- Experiment with the top_p and temperature parameters to achieve the desired level of creativity and variability in the speech output.
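When experimenting with these settings from a script, it can help to keep values inside the documented ranges. The sketch below clamps trim to 0.0 to 2.0 seconds and top_p and temperature to 0.0 to 1.0, and snaps offset to 0.1 second steps. The helper name snap_settings is hypothetical, and the node's interface may already enforce these limits on its own.

```python
def snap_settings(trim=0.0, offset=0.0, top_p=0.7, temperature=0.7):
    """Clamp values to the documented ranges and snap offset to 0.1 s steps."""
    def clamp(value, low, high):
        return max(low, min(high, value))

    return {
        "trim": clamp(trim, 0.0, 2.0),
        # Round to the nearest 0.1 s, then tidy floating-point noise.
        "offset": round(round(offset / 0.1) * 0.1, 1),
        "top_p": clamp(top_p, 0.0, 1.0),
        "temperature": clamp(temperature, 0.0, 1.0),
    }


settings = snap_settings(trim=2.5, offset=0.34, top_p=1.2)
print(settings)
```

Defaults in the function signature mirror the documented defaults, so omitted arguments fall back to the same values the node would use.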
Fish Audio TTS Common Errors and Solutions:
Voice is required for Fish Audio TTS.
- Explanation: This error occurs when the voice parameter is not provided, which is essential for the TTS process.
- Solution: Ensure that you specify a valid voice option in the voice parameter before executing the node.
API Key is required for Fish Audio TTS.
- Explanation: This error indicates that the apiKey parameter is missing, preventing access to the Fish Audio TTS service.
- Solution: Provide a valid API key in the apiKey parameter or ensure that the system can retrieve it automatically.
Input text cannot be empty
- Explanation: This error arises when the text parameter is empty, as the TTS process requires text input to generate speech.
- Solution: Make sure to input non-empty text in the text parameter to proceed with the TTS conversion.
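The three errors above can be caught before a workflow runs with a small pre-flight check. The sketch below is a hypothetical helper that raises the same messages; the order of the checks, the use of ValueError, and the treatment of blank strings as empty are assumptions rather than the node's documented behavior.

```python
def check_inputs(text, voice, api_key):
    """Raise the documented error messages for missing inputs (hypothetical helper)."""
    if not voice:
        raise ValueError("Voice is required for Fish Audio TTS.")
    if not api_key:
        raise ValueError("API Key is required for Fish Audio TTS.")
    # text is documented as a list of strings; reject empty or blank entries.
    if not text or any(not item.strip() for item in text):
        raise ValueError("Input text cannot be empty")


try:
    check_inputs(["Hello"], voice="", api_key="key-placeholder")
except ValueError as err:
    message = str(err)
    print(message)
```

Because the node can also retrieve the API key automatically, a missing api_key here does not necessarily mean the node itself would fail; the check only covers the case where the key is passed explicitly.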
