ComfyUI Node: Fish Audio TTS

Class Name

MontagenFishAudioTTSNode

Category
Montagen/Generator
Author
MontagenAI (Account age: 495 days)
Extension
ComfyUI-Montagen
Latest Updated
2025-05-29
Github Stars
0.03K

How to Install ComfyUI-Montagen

Install this extension through the ComfyUI Manager:
  1. Click the Manager button in the main menu.
  2. Click the Custom Nodes Manager button.
  3. Enter ComfyUI-Montagen in the search bar and install the extension from the results.
After installation, click the Restart button to restart ComfyUI. Then manually refresh your browser to clear the cache and load the updated list of nodes.

Fish Audio TTS Description

Converts text to speech using Fish Audio TTS, with customizable voice and audio settings.

Fish Audio TTS:

The MontagenFishAudioTTSNode converts text into speech using the Fish Audio text-to-speech (TTS) service. It is part of the ComfyUI-Montagen extension and sits in the Montagen/Generator category. You choose a voice, optionally trim the start of the generated audio and offset it in time, and tune the variability of the speech through top_p and temperature. The node returns the paths of the generated audio files together with their time ranges, so the speech can be synchronized with other media elements in a project.

Fish Audio TTS Input Parameters:

text

This parameter accepts a list of strings, each a piece of text to convert into speech. Multiline text is supported, so long narratives can be processed. The text must not be empty: empty input raises the error "Input text cannot be empty". The text is the core input of the TTS process and directly determines what is spoken.

trim

This float parameter specifies the amount of time, in seconds, to trim from the start of the audio output. It ranges from 0.0 to 2.0 seconds, with a default value of 0.0. Trimming can be useful for removing unwanted silence or noise at the beginning of the audio file, ensuring a cleaner and more professional result.
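
The effect of trim can be pictured with a minimal sketch. It assumes the audio is a plain list of mono samples at a known sample rate; the function name trim_start and the data layout are hypothetical and are not taken from the node's source.

```python
def trim_start(samples, sample_rate, trim_seconds):
    """Drop the first `trim_seconds` of a mono sample list (illustrative only)."""
    # Clamp to the documented 0.0 to 2.0 second range.
    trim_seconds = max(0.0, min(2.0, trim_seconds))
    # Convert seconds to a whole number of samples.
    count = int(round(trim_seconds * sample_rate))
    return samples[count:]


# One second of dummy audio at 10 samples per second.
audio = list(range(10))
trimmed = trim_start(audio, sample_rate=10, trim_seconds=0.3)
```

Values outside the documented range are clamped here rather than rejected; how the node itself treats out-of-range values is not stated in this page.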

voice

The voice parameter is a required string that selects the voice used for the TTS conversion. If it is missing, the node fails with "Voice is required for Fish Audio TTS." Choose a voice that matches the tone and style of your project, since it largely determines how the speech sounds.

offset

This float parameter defines the offset in seconds to apply to the audio, with a default value of 0.0. The offset can be adjusted in increments of 0.1 seconds. It is useful for synchronizing the audio with other media elements, ensuring that the speech aligns perfectly with visual or other auditory components.
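
As a rough illustration of offset, the sketch below snaps a value to the 0.1-second step and shifts a clip's time range by it. This page does not say how the node applies the offset internally, so treating it as a shift of a (start, end) range is an assumption, and both helper names are hypothetical.

```python
def snap_offset(offset, step=0.1):
    """Round an offset to the nearest multiple of `step` seconds."""
    # The outer round() removes floating-point noise such as 0.30000000000000004.
    return round(round(offset / step) * step, 10)


def shift_range(start, end, offset):
    """Shift a (start, end) time range, in seconds, by `offset` (assumed behavior)."""
    return (round(start + offset, 10), round(end + offset, 10))


offset = snap_offset(0.26)
shifted = shift_range(0.0, 1.5, offset)
```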

unique_id

A unique identifier that keeps each TTS task distinct so it can be tracked or referenced independently within the Montagen system. The name matches ComfyUI's hidden UNIQUE_ID input, which ComfyUI fills in with the ID of the node itself; if the node declares it that way, it is supplied automatically and is not something you set by hand.

prompt

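The prompt parameter is described as a string that provides additional context or instructions for the TTS process. Note, however, that the name matches ComfyUI's hidden PROMPT input, which carries the complete workflow prompt sent by the client to the server; if the node declares it that way, it is supplied automatically and does not steer tone, emphasis, or pacing.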

extra_pnginfo

This parameter carries additional metadata alongside the TTS request. The name matches ComfyUI's hidden EXTRA_PNGINFO input, a dictionary of extra metadata that ComfyUI passes to nodes (it is normally copied into saved .png files); if the node declares it that way, it is supplied automatically.

apiKey

The apiKey is a string used to authenticate with the Fish Audio TTS service. Supplying it on the node is optional: if it is not provided, the node attempts to retrieve a key automatically. A key is still required overall, so if none is supplied and none can be retrieved, the node fails with "API Key is required for Fish Audio TTS."

timeRangeList

This optional parameter is a list of dictionaries that define specific time ranges for the TTS process. It allows for precise control over when certain text segments are converted to speech, facilitating complex audio timelines and synchronization.

action

An optional string parameter that specifies the action to be taken during the TTS process. It can influence how the text is processed and converted into speech, providing flexibility in handling different audio production scenarios.

normalize

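A boolean parameter with a default value of True. As described here, normalize determines whether the audio output is normalized so that volume stays consistent across segments. Note that Fish Audio's own TTS request has a normalize field with the same default that refers to text normalization (for example, making numbers read more reliably) rather than audio levels, so it is worth checking the node source to confirm which behavior applies.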

top_p

This float parameter, ranging from 0.0 to 1.0 with a default value of 0.7, controls the randomness of the TTS output. A higher value results in more diverse and creative speech variations, while a lower value produces more predictable and stable results.

temperature

Similar to top_p, this float parameter ranges from 0.0 to 1.0 and has a default value of 0.7. It influences the creativity and variability of the TTS output, with higher values leading to more dynamic and expressive speech.
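
To make the difference between the two settings concrete, here is a generic sketch of temperature scaling followed by top-p (nucleus) filtering over a small set of candidate scores. It illustrates the general sampling technique these parameter names usually refer to; it is not Fish Audio's implementation, and the function names are hypothetical.

```python
import math


def apply_temperature(logits, temperature):
    """Turn raw scores into probabilities; lower temperature sharpens them."""
    if temperature <= 0.0:
        # Treat zero as greedy: all probability goes to the best-scoring option.
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_p_filter(probs, top_p):
    """Keep the smallest set of options whose cumulative probability reaches top_p."""
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in ranked:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    total = sum(probs[i] for i in kept)
    # Renormalize so the surviving options sum to 1.
    return {i: probs[i] / total for i in kept}


probs = apply_temperature([2.0, 1.0, 0.0], temperature=0.7)
candidates = top_p_filter(probs, top_p=0.7)
```

In this sketch, temperature reshapes the whole distribution, while top_p cuts off the unlikely tail before sampling. That is why lowering either one tends to give more stable, repeatable speech.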

Fish Audio TTS Output Parameters:

promptList

This output is a list of strings representing the processed prompts used in the TTS conversion. It provides a record of the input prompts, allowing you to verify and review the text that was converted into speech.

timeRangeList

The timeRangeList output is a list of time ranges that were applied during the TTS process. It reflects the timing and synchronization of the audio segments, ensuring that the speech aligns with other media elements as intended.

action

This output indicates the action that was executed during the TTS process. It provides insight into the processing decisions made by the node, helping you understand how the text was converted into speech.

resourceList

The resourceList is a list of file paths to the generated audio files. These files contain the speech output and can be used in various multimedia projects. The list allows you to easily access and manage the audio resources produced by the node.

Fish Audio TTS Usage Tips:

  • Ensure that the voice parameter is set to a suitable option that matches the style and tone of your project for optimal results.
  • Use the trim and offset parameters to fine-tune the audio output, removing unwanted silence and synchronizing the speech with other media elements.
  • Experiment with the top_p and temperature parameters to achieve the desired level of creativity and variability in the speech output.

Fish Audio TTS Common Errors and Solutions:

Voice is required for Fish Audio TTS.

  • Explanation: This error occurs when the voice parameter is not provided, which is essential for the TTS process.
  • Solution: Ensure that you specify a valid voice option in the voice parameter before executing the node.

API Key is required for Fish Audio TTS.

  • Explanation: This error indicates that the apiKey parameter is missing, preventing access to the Fish Audio TTS service.
  • Solution: Provide a valid API key in the apiKey parameter or ensure that the system can retrieve it automatically.

Input text cannot be empty

  • Explanation: This error arises when the text parameter is empty, as the TTS process requires text input to generate speech.
  • Solution: Make sure to input non-empty text in the text parameter to proceed with the TTS conversion.
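
The three checks above can be sketched as a small pre-flight validation. The error messages are the ones listed in this section; everything else is an assumption made for illustration: the function names, the order of the checks, the rule that text counts as empty when every string is blank, and the FISH_AUDIO_API_KEY lookup standing in for the unspecified automatic retrieval.

```python
def validate_inputs(text, voice, api_key=None, env=None):
    """Return the API key to use, or raise with one of the documented messages."""
    env = env if env is not None else {}
    if not voice:
        raise ValueError("Voice is required for Fish Audio TTS.")
    # Hypothetical fallback: the real retrieval mechanism is not documented here.
    key = api_key or env.get("FISH_AUDIO_API_KEY")
    if not key:
        raise ValueError("API Key is required for Fish Audio TTS.")
    # `text` is a list of strings; treat it as empty if every entry is blank.
    if not any(t.strip() for t in text):
        raise ValueError("Input text cannot be empty")
    return key


def first_error(text, voice, api_key=None, env=None):
    """Return the error message for the given inputs, or None if they pass."""
    try:
        validate_inputs(text, voice, api_key=api_key, env=env)
    except ValueError as exc:
        return str(exc)
    return None


message = first_error(["Hello there"], voice="", api_key="example-key")
```

Running a check like this before queueing a long workflow surfaces a missing voice or key immediately instead of partway through generation.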
