RunComfy

ComfyUI Extension: ComfyUI-MegaTTS

Repo Name: ComfyUI-MegaTTS
Author: 1038lab (Account age: 774 days)
Nodes: 3
Latest Updated: 2025-04-13
GitHub Stars: 0.03K

Table of Contents

Description
ComfyUI-MegaTTS Introduction
How ComfyUI-MegaTTS Works
ComfyUI-MegaTTS Features
ComfyUI-MegaTTS Models
What's New with ComfyUI-MegaTTS
Troubleshooting ComfyUI-MegaTTS
Learn More about ComfyUI-MegaTTS
Related Nodes

How to Install ComfyUI-MegaTTS

Install this extension via the ComfyUI Manager by searching for ComfyUI-MegaTTS:

1. Click the Manager button in the main menu.
2. Select the Custom Nodes Manager button.
3. Enter ComfyUI-MegaTTS in the search bar and install the extension from the results.

After installation, click the Restart button to restart ComfyUI. Then, manually refresh your browser to clear the cache and access the updated list of nodes.

ComfyUI-MegaTTS Description

ComfyUI-MegaTTS is a custom node for ComfyUI that uses ByteDance MegaTTS3 to deliver high-quality text-to-speech synthesis with voice cloning in Chinese and English.

ComfyUI-MegaTTS Introduction

ComfyUI-MegaTTS is an extension that brings high-quality text-to-speech (TTS) to AI artists. Built on ByteDance's MegaTTS3, it converts text into natural-sounding speech in both English and Chinese. It also offers voice cloning, which replicates a voice from a short audio sample. This makes it useful for artists who want to add a vocal element to their projects, such as voiceovers, character voices, or other creative audio content.

How ComfyUI-MegaTTS Works

At its core, ComfyUI-MegaTTS uses machine learning models to transform written text into spoken words. It relies on a diffusion transformer, a type of neural network that produces its output by refining it over several steps and is well suited to generating high-quality audio. Think of it as an artist who reads your text and renders it as sound, capturing the nuances of human speech. The extension also includes a voice cloning feature, which analyzes a short audio sample to capture the unique characteristics of a voice and then reproduces that voice in new speech output.

ComfyUI-MegaTTS Features

High-Quality Speech Synthesis: Converts text into smooth, natural-sounding speech.
Voice Cloning: Clone a voice from a short sample. Each voice requires both a WAV file and an NPY file.
Bilingual Support: Seamlessly switch between English and Chinese, with code-switching capabilities.
Advanced Parameter Control: Fine-tune the quality, pronunciation accuracy, and voice similarity to suit your needs.
Memory Management: Optimizes GPU resource usage to prevent memory shortages, especially for users with limited GPU memory.
Automatic Model Download: Automatically downloads necessary models when needed, simplifying the setup process.

ComfyUI-MegaTTS Models

ComfyUI-MegaTTS utilizes a modified version of the MegaTTS3 model, which is organized into several components:

Diffusion Transformer: Handles the main TTS process.
WavVAE: Compresses and reconstructs audio, though currently unavailable for direct use.
Duration and Aligner Models: Ensure accurate timing and alignment of speech.
G2P (Grapheme-to-Phoneme): Converts written text into phonetic representations.

Each model plays a crucial role in ensuring the generated speech is both accurate and natural.

What's New with ComfyUI-MegaTTS

Version 1.0.2

Code and custom nodes have been restructured for better performance and GPU resource management.
Enhanced memory management to prevent memory shortages for users with lower GPU memory.
Added internationalization support for English and Chinese.

Version 1.0.1

Bug fixes to improve stability and performance.

Troubleshooting ComfyUI-MegaTTS

If you encounter issues while using ComfyUI-MegaTTS, here are some common problems and solutions:

Model Download Issues: Ensure you have a stable internet connection. If automatic downloads fail, manually download models from Hugging Face.
Voice Cloning Errors: Make sure your WAV and NPY files are correctly placed in the Voices folder and named consistently.
Memory Errors: Try reducing the generation quality or using a GPU with more memory.
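
The voice cloning check above can be automated with a short script. This is only a sketch: it assumes that "named consistently" means a WAV file and its NPY file share the same base name (for example speaker.wav and speaker.npy). That reading, the function name find_unpaired_voices, and any folder path you pass in are assumptions for illustration, not part of the extension.

```python
from pathlib import Path


def find_unpaired_voices(voices_dir):
    """Return (wav_without_npy, npy_without_wav) as sorted lists of base names.

    Assumes a voice is a pair of files with the same base name,
    one ending in .wav and one ending in .npy (an assumption, see above).
    """
    folder = Path(voices_dir)
    wav_stems = set()
    npy_stems = set()
    for path in folder.iterdir():
        if not path.is_file():
            continue
        suffix = path.suffix.lower()
        if suffix == ".wav":
            wav_stems.add(path.stem)
        elif suffix == ".npy":
            npy_stems.add(path.stem)
    # Set differences give the files that lack a partner.
    return sorted(wav_stems - npy_stems), sorted(npy_stems - wav_stems)
```

Calling find_unpaired_voices on your Voices folder returns two lists: WAV files with no matching NPY file, and NPY files with no matching WAV file. Two empty lists mean every voice has both files under this naming assumption.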

Frequently Asked Questions

How do I improve voice similarity? Adjust the voice_similarity parameter to a higher value for closer resemblance to the reference voice.
Can I use my own voice samples? Yes, you can submit your samples to the Voice Submission Queue for processing.

Learn More about ComfyUI-MegaTTS

For further learning and support, consider exploring the following resources:

MegaTTS3 GitHub Repository: ByteDance/MegaTTS3
Hugging Face Model Page: ByteDance/MegaTTS3
Community Forums: Engage with other AI artists and developers to share tips and solutions.

These resources provide a wealth of information to help you make the most of ComfyUI-MegaTTS in your creative projects.
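
If the automatic model download fails, the files on the Hugging Face model page listed above (ByteDance/MegaTTS3) can be fetched with the huggingface_hub package. The sketch below is one possible approach, not the extension's own downloader. The target folder models/TTS/MegaTTS3 and both function names are hypothetical, so check the extension's documentation for the folder it actually reads.

```python
from pathlib import Path

REPO_ID = "ByteDance/MegaTTS3"  # Hugging Face model page named in this article


def megatts3_target_dir(comfyui_root):
    """Build the download folder path.

    Hypothetical location: the folder the extension really uses may differ.
    """
    return Path(comfyui_root) / "models" / "TTS" / "MegaTTS3"


def download_megatts3(comfyui_root):
    """Download the full model repository into the target folder."""
    # Imported here so this module still loads when huggingface_hub
    # is not installed (pip install huggingface_hub).
    from huggingface_hub import snapshot_download

    target = megatts3_target_dir(comfyui_root)
    target.mkdir(parents=True, exist_ok=True)
    snapshot_download(repo_id=REPO_ID, local_dir=str(target))
    return target
```

Calling download_megatts3 with the path to your ComfyUI installation downloads the repository contents and returns the folder they were written to. This requires network access and enough free disk space for the model files.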

ComfyUI-MegaTTS Related Nodes

MegaTTS3

MegaTTS3 (Simple)

MegaTTS Voice Maker
