RunComfy

ComfyUI Node: MegaTTS3

Class Name

MegaTTS3

Category
🧪AILab/🔊Audio

Author
1038lab (Account age: 774 days)

Extension
ComfyUI-MegaTTS

Latest Updated
2025-04-13

Github Stars
0.03K

Table of Content

Description
MegaTTS3:
MegaTTS3 Input Parameters:
MegaTTS3 Output Parameters:
MegaTTS3 Usage Tips:
MegaTTS3 Common Errors and Solutions:
Related Nodes

How to Install ComfyUI-MegaTTS

Install this extension via the ComfyUI Manager by searching for ComfyUI-MegaTTS:

1. Click the Manager button in the main menu.
2. Select the Custom Nodes Manager button.
3. Enter ComfyUI-MegaTTS in the search bar and install the extension from the results.

After installation, click the Restart button to restart ComfyUI. Then, manually refresh your browser to clear the cache and access the updated list of nodes.

MegaTTS3 Description

Text-to-speech node that synthesizes natural-sounding speech from text in the style of a reference voice.

MegaTTS3:

MegaTTS3 is a text-to-speech (TTS) node that converts written text into natural-sounding speech. It uses machine learning models to generate audio that closely mimics human speech patterns. Given a reference voice, it produces speech that follows that voice's characteristics, which makes the output more personal and authentic. This suits applications that need realistic voice synthesis, such as virtual assistants, audiobooks, and interactive media.

MegaTTS3 Input Parameters:

reference_voice

The reference_voice parameter specifies the voice data used as the reference for the speech output. It determines the vocal characteristics of the synthesized speech, such as tone and pitch, so choose a reference that matches the style and quality you want. The parameter has no minimum or maximum value, but it must be a valid voice data file that the system can process.

input_text

The input_text parameter is the text to convert into speech and defines the content of the output. Audio quality and clarity depend significantly on this text, so it should be well structured and free of errors. There is no specific length limit, but longer texts may take longer to process.

language

The language parameter indicates the language of the input text. The node supports multiple languages, and selecting the correct one ensures that the text is processed correctly and that pronunciation and intonation suit it.

generation_quality

The generation_quality parameter, called time_step in the code, controls the quality of speech generation. Higher values typically give better audio quality but may increase processing time, so the setting trades speed against quality. The default value is 32.

pronunciation_strength

The pronunciation_strength parameter, called p_w in the code, influences the clarity and emphasis of pronunciation in the generated speech. Higher values can produce more pronounced articulation. The default value is 1.6.

voice_similarity

The voice_similarity parameter, called t_w in the code, affects how closely the generated speech matches the reference voice. Higher values make the output sound more like the reference. The default value is 2.5; adjust it to balance similarity against naturalness.
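
The three tuning inputs above can be summarized in a short Python sketch. It is illustrative only: the MegaTTS3Settings class and its to_internal method are hypothetical names invented here, not part of the extension. The sketch takes only the defaults (32, 1.6, 2.5) and the internal names (time_step, p_w, t_w) from the descriptions above.

```python
from dataclasses import dataclass


@dataclass
class MegaTTS3Settings:
    """Hypothetical container for the tuning inputs described above."""

    generation_quality: int = 32         # called time_step in the node's code
    pronunciation_strength: float = 1.6  # called p_w in the node's code
    voice_similarity: float = 2.5        # called t_w in the node's code

    def to_internal(self) -> dict:
        """Map the UI-facing names to the internal names the page mentions."""
        if self.generation_quality < 1:
            raise ValueError("generation_quality must be a positive integer")
        return {
            "time_step": self.generation_quality,
            "p_w": self.pronunciation_strength,
            "t_w": self.voice_similarity,
        }


# Defaults as documented, plus a faster draft setting with fewer steps.
defaults = MegaTTS3Settings().to_internal()
fast_draft = MegaTTS3Settings(generation_quality=16).to_internal()
print(defaults)
print(fast_draft)
```

Lowering generation_quality for drafts and restoring the default for final renders is one way to use the speed and quality trade-off described above.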

MegaTTS3 Output Parameters:

audio_output

The audio_output parameter is the node's primary output and contains the synthesized speech: the input text spoken in the style of the reference voice. It can be used directly in multimedia projects or processed further as needed.
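
This page does not state the data type of audio_output. As a sketch only, assume the node emits ComfyUI's usual AUDIO dictionary: a waveform tensor shaped [batch, channels, samples] plus an integer sample_rate. Under that assumption, a downstream script could compute the clip length as follows. The audio_duration_seconds helper is a hypothetical name, and the sample values are made up for illustration.

```python
from types import SimpleNamespace


def audio_duration_seconds(audio: dict) -> float:
    """Length of a clip in seconds, assuming ComfyUI's AUDIO dictionary layout."""
    # Assumed layout: audio["waveform"] has shape [batch, channels, samples].
    samples = audio["waveform"].shape[-1]
    return samples / audio["sample_rate"]


# Stand-in for a real tensor so the sketch runs without torch installed.
fake_audio = {"waveform": SimpleNamespace(shape=(1, 1, 48000)), "sample_rate": 24000}
print(audio_duration_seconds(fake_audio))
```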

MegaTTS3 Usage Tips:

To achieve the best results, ensure that the reference_voice is of high quality and closely matches the desired vocal characteristics for your project.
Experiment with the generation_quality, pronunciation_strength, and voice_similarity parameters to find the optimal balance between audio quality, clarity, and processing time for your specific application.
When working with longer texts, consider breaking them into smaller segments to improve processing efficiency and maintain audio quality.
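
The last tip, breaking long text into smaller segments, can be sketched in a few lines of Python. This is a minimal example under stated assumptions rather than anything the extension provides: split_into_segments is a hypothetical helper, the 200-character default is an arbitrary choice, and the sentence splitting only recognizes ".", "!" and "?" followed by whitespace.

```python
import re


def split_into_segments(text: str, max_chars: int = 200) -> list[str]:
    """Group whole sentences into segments of at most max_chars characters.

    A single sentence longer than max_chars is kept intact as its own segment.
    """
    # Split on whitespace that follows sentence-ending punctuation.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    segments: list[str] = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            # Adding this sentence would exceed the limit, so start a new segment.
            segments.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        segments.append(current)
    return segments


segments = split_into_segments("One. Two three. Four five six.", max_chars=15)
print(segments)
```

Each segment can then be passed as input_text in turn, and the resulting clips joined afterwards.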

MegaTTS3 Common Errors and Solutions:

TTS generation failed: `<error_message>`

Explanation: This error occurs when the text-to-speech generation process encounters an issue, such as missing model files or incorrect input parameters.
Solution: Ensure that all necessary model files are present and correctly configured. Verify that the input parameters, such as reference_voice and input_text, are valid and properly formatted. If the problem persists, try restarting the node and clearing the cache to resolve any temporary issues.
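
Some of the input problems named above can be caught before the node runs. The following Python sketch is hypothetical: check_tts_inputs is not part of the extension, and it assumes that reference_voice is available as a file path, which this page does not confirm.

```python
from pathlib import Path


def check_tts_inputs(reference_voice: str, input_text: str) -> list[str]:
    """Return human-readable problems; an empty list means the inputs look usable."""
    problems = []
    if not input_text or not input_text.strip():
        problems.append("input_text is empty")
    voice_path = Path(reference_voice)  # assumption: the reference voice is a file on disk
    if not voice_path.is_file():
        problems.append(f"reference voice not found: {voice_path}")
    elif voice_path.stat().st_size == 0:
        problems.append(f"reference voice file is empty: {voice_path}")
    return problems


# A missing file and blank text are both reported.
print(check_tts_inputs("definitely_missing_voice_file.wav", "   "))
```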

MegaTTS3 Related Nodes

Go back to the extension to check out more related nodes.

ComfyUI-MegaTTS
