ComfyUI Extension: ComfyUI-FL-Qwen3TTS

Repo Name: ComfyUI-FL-Qwen3TTS
Author: filliptm (Account age: 2372 days)
Nodes: 10
Latest Updated: 2026-03-18
Github Stars: 0.12K

How to Install ComfyUI-FL-Qwen3TTS

Install this extension via the ComfyUI Manager by searching for ComfyUI-FL-Qwen3TTS:
  1. Click the Manager button in the main menu.
  2. Select the Custom Nodes Manager button.
  3. Enter ComfyUI-FL-Qwen3TTS in the search bar.
After installation, click the Restart button to restart ComfyUI. Then, manually refresh your browser to clear the cache and access the updated list of nodes.

ComfyUI-FL-Qwen3TTS Description

ComfyUI-FL-Qwen3TTS is an extension for ComfyUI that integrates Qwen3-TTS, a text-to-speech system. It converts text inputs into natural-sounding speech inside a ComfyUI workflow, which also makes text content more accessible as audio.

ComfyUI-FL-Qwen3TTS Introduction

ComfyUI-FL-Qwen3TTS is a text-to-speech (TTS) extension for ComfyUI that integrates Alibaba's Qwen3-TTS model family. It transforms written text into natural-sounding speech across multiple languages and dialects, and offers voice cloning, voice design from text descriptions, and predefined speaker profiles. Whether you're an AI artist looking to add voice to your creations or someone interested in experimenting with speech synthesis, the extension covers both cases from within a ComfyUI workflow.

How ComfyUI-FL-Qwen3TTS Works

At its core, ComfyUI-FL-Qwen3TTS leverages the Qwen3-TTS models to convert text into speech. The process involves several steps:

  1. Model Loading: The extension downloads and caches the necessary Qwen3-TTS models from HuggingFace, ensuring you have access to the latest speech synthesis capabilities.
  2. Text Processing: The input text is processed and transformed into a format that the model can understand.
  3. Speech Generation: Using the selected model, the text is converted into speech. This can involve cloning a voice from a short audio sample, designing a new voice based on a text description, or using one of the predefined speaker profiles.
  4. Audio Encoding/Decoding: The generated speech is encoded and decoded using the Qwen3-TTS tokenizer, ensuring high-quality audio output.
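
The model-loading step follows a download-once, reuse-afterwards pattern. The following is a minimal Python sketch of that pattern, not the extension's actual code: `ensure_model`, the `fetch` callback, the folder naming scheme, and the `Qwen/` organization prefix used in the comment are all assumptions made for illustration.

```python
from pathlib import Path
from typing import Callable

# Hypothetical callback type: downloads `repo_id` into the folder `target`.
Fetch = Callable[[str, Path], None]


def ensure_model(repo_id: str, cache_root: Path, fetch: Fetch) -> Path:
    """Return the local folder for `repo_id`, downloading only on a cache miss."""
    # Assumed naming scheme, e.g. "Qwen/Some-Model" -> "Qwen--Some-Model".
    target = cache_root / repo_id.replace("/", "--")
    if target.is_dir() and any(target.iterdir()):
        return target  # cache hit: reuse files from an earlier run
    target.mkdir(parents=True, exist_ok=True)
    fetch(repo_id, target)  # cache miss: download once
    return target
```

If `huggingface_hub` is installed, one plausible `fetch` is `lambda repo, dst: snapshot_download(repo_id=repo, local_dir=dst)`; keeping the downloader as a parameter lets the caching logic be exercised without network access.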

ComfyUI-FL-Qwen3TTS Features

  • Voice Cloning: Clone any voice using a 5-15 second audio sample. This feature is perfect for creating personalized voiceovers or replicating a specific voice for artistic projects.
  • Voice Design: Create custom voices from natural language descriptions. For example, you can specify "a warm British female voice" to generate a unique voice profile.
  • Predefined Speakers: Choose from 9 ready-to-use voices across languages like Chinese, English, Japanese, and Korean. Each speaker has a distinct style and tone.
  • Fine-Tuning UI: Train custom voice models with a real-time dashboard that displays progress, loss charts, and validation audio. This feature is ideal for users who want to refine their models for specific applications.
  • Multi-Language Support: Generate speech in 10 languages, including Chinese, English, Japanese, Korean, German, French, Russian, Portuguese, Spanish, and Italian.
  • Auto Transcription: Integrated Whisper technology allows for automatic transcription of audio to text, aiding in the creation of reference text for voice cloning.
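
Because voice cloning expects a 5-15 second sample, it can help to check a clip's length before feeding it to a workflow. The sketch below uses only Python's standard `wave` module, so it handles uncompressed PCM WAV files only. The function names and constants are hypothetical and are not part of the extension.

```python
import wave

# Bounds taken from the 5-15 second guidance for voice cloning samples.
MIN_SECONDS = 5.0
MAX_SECONDS = 15.0


def wav_duration_seconds(path: str) -> float:
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def is_usable_reference(path: str) -> bool:
    """True if the clip length falls inside the recommended range."""
    return MIN_SECONDS <= wav_duration_seconds(path) <= MAX_SECONDS
```

A clip that fails the check can be trimmed or re-recorded before it is used as reference audio.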

ComfyUI-FL-Qwen3TTS Models

The extension supports several models, each tailored for different use cases:

  • Qwen3-TTS-12Hz-1.7B-Base: A versatile base model suitable for voice cloning and fine-tuning.
  • Qwen3-TTS-12Hz-1.7B-CustomVoice: Offers 9 predefined speakers with style control, allowing for nuanced voice customization.
  • Qwen3-TTS-12Hz-1.7B-VoiceDesign: Enables voice creation from text descriptions, perfect for designing unique voice profiles.
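
One way to read the list above is as a lookup from task to model variant. The sketch below encodes that lookup together with the 10 supported languages. The task labels (`voice_clone`, `fine_tune`, `preset_speaker`, `voice_design`) and the `pick_model` helper are hypothetical names chosen for illustration; only the model names and languages come from this page.

```python
# Task labels are illustrative; model names are the ones listed on this page.
MODEL_FOR_TASK = {
    "voice_clone": "Qwen3-TTS-12Hz-1.7B-Base",
    "fine_tune": "Qwen3-TTS-12Hz-1.7B-Base",
    "preset_speaker": "Qwen3-TTS-12Hz-1.7B-CustomVoice",
    "voice_design": "Qwen3-TTS-12Hz-1.7B-VoiceDesign",
}

SUPPORTED_LANGUAGES = {
    "Chinese", "English", "Japanese", "Korean", "German",
    "French", "Russian", "Portuguese", "Spanish", "Italian",
}


def pick_model(task: str, language: str) -> str:
    """Return the model variant for a task, rejecting unsupported input."""
    if language not in SUPPORTED_LANGUAGES:
        raise ValueError(f"unsupported language: {language}")
    if task not in MODEL_FOR_TASK:
        raise ValueError(f"unknown task: {task}")
    return MODEL_FOR_TASK[task]
```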

Troubleshooting ComfyUI-FL-Qwen3TTS

Here are some common issues and solutions:

  • Model Loading Errors: Ensure you have a stable internet connection for downloading models. If issues persist, try clearing the cache and re-downloading the models.
  • Audio Quality Issues: Check your input text for errors and ensure the reference audio for cloning is clear and of good quality.
  • Performance Issues: Ensure your system meets the recommended requirements, such as having sufficient RAM and a compatible GPU.
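
For the cache-clearing suggestion above, a small helper along these lines can remove one model folder so that it is fetched again on the next load. This page does not state where the extension stores its models, so the folder is passed in explicitly; `clear_cached_model` is a hypothetical name, not part of the extension.

```python
import shutil
from pathlib import Path


def clear_cached_model(model_dir: Path) -> bool:
    """Delete one cached model folder; return True if something was removed."""
    if not model_dir.is_dir():
        return False  # nothing cached at this location
    shutil.rmtree(model_dir)  # removes the folder and everything inside it
    return True
```

Double-check the path before calling it, since the deletion is recursive and cannot be undone.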
