RunComfy

ComfyUI Extension: VibeVoice ComfyUI

Repo Name: VibeVoice-ComfyUI
Author: Fabio Sarracino (Account age: 110 days)
Nodes: 5
Latest Updated: 2025-10-02
Github Stars: 1.25K

Table of Contents

Description
VibeVoice-ComfyUI Introduction
How VibeVoice-ComfyUI Works
VibeVoice-ComfyUI Features
VibeVoice-ComfyUI Models
What's New with VibeVoice-ComfyUI
Troubleshooting VibeVoice-ComfyUI
Learn More about VibeVoice-ComfyUI
Related Nodes

How to Install VibeVoice ComfyUI

Install this extension via the ComfyUI Manager by searching for VibeVoice ComfyUI:

1. Click the Manager button in the main menu.
2. Select the Custom Nodes Manager button.
3. Enter VibeVoice ComfyUI in the search bar and install the extension from the results.

After installation, click the Restart button to restart ComfyUI. Then, manually refresh your browser to clear the cache and access the updated list of nodes.

VibeVoice ComfyUI Description

VibeVoice ComfyUI is a ComfyUI wrapper for the Microsoft VibeVoice TTS model. It supports single-speaker and multi-speaker speech generation and can load text from files.

VibeVoice-ComfyUI Introduction

VibeVoice-ComfyUI is an extension that integrates Microsoft's VibeVoice text-to-speech (TTS) model into ComfyUI workflows. It generates expressive speech from text and supports both single-speaker and multi-speaker scenarios. Whether you're creating podcasts, voiceovers, or other audio content, VibeVoice-ComfyUI lets you synthesize natural-sounding speech directly within your creative projects. It addresses common challenges in TTS, such as maintaining speaker consistency and handling long-form content, which makes it useful for AI artists who want to add speech to their productions.

How VibeVoice-ComfyUI Works

At its core, VibeVoice-ComfyUI leverages the VibeVoice model, which uses advanced speech tokenizers and a diffusion framework to generate speech. Think of it as a sophisticated storyteller that can read your text and bring it to life with voices that sound natural and engaging. The model understands the context and flow of dialogue, allowing it to produce speech that feels coherent and dynamic. By using a combination of language models and acoustic processing, VibeVoice-ComfyUI can handle complex tasks like multi-speaker conversations and voice cloning, where it mimics the characteristics of a given voice sample.

VibeVoice-ComfyUI Features

Core Functionality

Single Speaker TTS: Generate speech from text using a single voice, with the option to clone a specific voice from an audio sample.
Multi-Speaker Conversations: Create dialogues with up to four distinct speakers, each with their own voice.
Voice Cloning: Capture the essence of a voice from an audio sample and use it to generate new speech.
LoRA Support: Fine-tune voices with custom Low-Rank Adaptation (LoRA) adapters for personalized voice characteristics.
Voice Speed Control: Adjust the speaking rate to match your desired pace.
Text File Loading: Easily load scripts from text files for processing.
Automatic Text Chunking: Seamlessly handle long texts by breaking them into manageable chunks.
Custom Pause Tags: Insert pauses in speech to control pacing and emphasis.
Node Chaining: Connect multiple nodes to create complex workflows.
Interruption Support: Cancel operations at any point during the generation process.
Flexible Configuration: Customize parameters like temperature, sampling, and guidance scale to suit your needs.
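The automatic text chunking mentioned above can be illustrated with a minimal sketch. This is not the extension's actual implementation: the function name `chunk_text`, the `max_words` parameter, and the sentence-boundary splitting rule are all assumptions chosen for illustration.

```python
import re


def chunk_text(text: str, max_words: int = 250) -> list[str]:
    """Group sentences into chunks of roughly max_words words each.

    Hypothetical helper: the real extension may split differently.
    A single sentence longer than max_words is kept whole.
    """
    # Split after '.', '!' or '?' when followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chunks: list[str] = []
    current: list[str] = []
    count = 0
    for sentence in sentences:
        n = len(sentence.split())
        # Close the current chunk if this sentence would exceed the budget.
        if current and count + n > max_words:
            chunks.append(" ".join(current))
            current, count = [], 0
        current.append(sentence)
        count += n
    if current:
        chunks.append(" ".join(current))
    return chunks
```

Breaking at sentence ends rather than at a fixed character count keeps each chunk a natural unit of speech, which is the usual reason to chunk this way for TTS.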

Performance & Optimization

Attention Mechanisms: Choose from various attention types to optimize performance.
Diffusion Steps: Balance quality and speed by adjusting the number of processing steps.
Memory Management: Efficiently manage VRAM usage with automatic cleanup options.
Apple Silicon Support: Enjoy native GPU acceleration on Apple devices with M1/M2/M3 chips.
Quantization Options: Reduce VRAM usage with 8-bit and 4-bit quantization, with little loss in audio quality.

VibeVoice-ComfyUI Models

VibeVoice-ComfyUI supports several models, each suited for different use cases:

VibeVoice-1.5B: A smaller model ideal for quick prototyping and single-speaker tasks, requiring around 6GB of VRAM.
VibeVoice-Large: Offers the highest quality for multi-speaker conversations, but requires more VRAM (~20GB).
VibeVoice-Large-Q8: Provides production-quality audio with reduced VRAM usage (~12GB), perfect for GPUs with 12GB VRAM.
VibeVoice-Large-Q4: Maximizes VRAM savings with minimal quality loss, suitable for lower-end GPUs.

Each model can be downloaded from HuggingFace, and they are automatically detected and managed within the ComfyUI environment.
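As a rough aid for choosing among these models, the sketch below filters them by the approximate VRAM figures listed above. The helper `models_that_fit` and the table `MODEL_VRAM_GB` are hypothetical names, not part of the extension, and no figure is stated for VibeVoice-Large-Q4, so it is left without one and never returned.

```python
from typing import Optional

# Approximate VRAM needs in GB, copied from the model list above.
# VibeVoice-Large-Q4 has no stated figure, so it is recorded as None.
MODEL_VRAM_GB: dict[str, Optional[float]] = {
    "VibeVoice-Large": 20.0,
    "VibeVoice-Large-Q8": 12.0,
    "VibeVoice-1.5B": 6.0,
    "VibeVoice-Large-Q4": None,
}


def models_that_fit(available_gb: float) -> list[str]:
    """Return the models whose stated VRAM need fits in available_gb."""
    return [
        name
        for name, need in MODEL_VRAM_GB.items()
        if need is not None and need <= available_gb
    ]
```

For example, `models_that_fit(12)` would list VibeVoice-Large-Q8 and VibeVoice-1.5B. Actual memory use also depends on text length, attention type, and other settings, so treat the figures as starting points.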

What's New with VibeVoice-ComfyUI

Recent updates have introduced several enhancements:

Version 1.8.1: Fixed a critical bug in the bitsandbytes library affecting the Q8 model.
Version 1.8.0: Introduced the VibeVoice-Large-Q8 model, offering production-quality audio with significant VRAM savings.
Version 1.7.0: Added dynamic 4-bit quantization for language models, improving speed and reducing VRAM usage.
Version 1.6.0: Removed automatic model downloading, giving users more control over model management.

These updates improve the extension's performance and flexibility, making it easier for AI artists to create high-quality audio content.

Troubleshooting VibeVoice-ComfyUI

If you encounter issues while using VibeVoice-ComfyUI, here are some common solutions:

Installation Issues: Ensure you're using the correct Python environment and restart ComfyUI after installation.
Generation Problems: For unstable voices, try using deterministic mode. Ensure multi-speaker text is formatted correctly with sequential speaker numbers.
Memory Constraints: Use smaller models like VibeVoice-1.5B for systems with limited VRAM.

For more detailed troubleshooting, refer to the ComfyUI logs and ensure all dependencies are correctly installed.
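The advice about sequential speaker numbers can be checked before generation with a small script. The sketch below assumes each line of a multi-speaker script starts with a `[N]:` prefix, where N is the speaker number. That line format, the function name `check_speakers`, and the rule that every non-empty line needs a prefix are assumptions for illustration, so verify the expected syntax against the extension's own documentation.

```python
import re

# Assumed line format: "[N]: text", where N is the speaker number.
SPEAKER_LINE = re.compile(r"^\[(\d+)\]:\s*\S")
MAX_SPEAKERS = 4  # the feature list mentions up to four speakers


def check_speakers(script: str) -> list[str]:
    """Return a list of problems; an empty list means none were found."""
    problems: list[str] = []
    seen: set[int] = set()
    for lineno, line in enumerate(script.splitlines(), start=1):
        stripped = line.strip()
        if not stripped:
            continue  # blank lines are ignored
        match = SPEAKER_LINE.match(stripped)
        if match is None:
            problems.append(f"line {lineno}: missing '[N]:' speaker prefix")
            continue
        seen.add(int(match.group(1)))
    numbers = sorted(seen)
    # Sequential means 1, 2, ..., k with no gaps.
    if numbers != list(range(1, len(numbers) + 1)):
        problems.append(f"speaker numbers {numbers} are not sequential from 1")
    if len(numbers) > MAX_SPEAKERS:
        problems.append(f"{len(numbers)} speakers used, limit is {MAX_SPEAKERS}")
    return problems
```

A script that uses speakers 1 and 3 but skips 2 would be reported as non-sequential, which matches the kind of formatting mistake the troubleshooting note warns about.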

Learn More about VibeVoice-ComfyUI

To further explore VibeVoice-ComfyUI, consider the following resources:

Video Demo: Watch a demonstration of the extension in action.
Project Page: Visit the VibeVoice Project Page for more examples and technical details.
Community Forums: Join discussions and seek support from other AI artists and developers.

These resources provide valuable insights and support to help you make the most of VibeVoice-ComfyUI in your creative projects.

VibeVoice ComfyUI Related Nodes

VibeVoice Load Text From File

VibeVoice Free Memory

VibeVoice LoRA

VibeVoice Multiple Speakers

VibeVoice Single Speaker
