ComfyUI Extension: ComfyUI-KugelAudio

Repo Name: ComfyUI-KugelAudio
Author: Saganaki22 (Account age: 0 days)
Nodes: 4
Latest Updated: 2026-02-28
GitHub Stars: 0.03K


Table of Contents

Description
ComfyUI-KugelAudio Introduction
How ComfyUI-KugelAudio Works
ComfyUI-KugelAudio Features
ComfyUI-KugelAudio Models
What's New with ComfyUI-KugelAudio
Troubleshooting ComfyUI-KugelAudio
Learn More about ComfyUI-KugelAudio
Related Nodes

How to Install ComfyUI-KugelAudio

Install this extension through the ComfyUI Manager:

1. Click the Manager button in the main menu.
2. Select the Custom Nodes Manager button.
3. Enter ComfyUI-KugelAudio in the search bar.
4. Click Install on the ComfyUI-KugelAudio entry.

After installation, click the Restart button to restart ComfyUI. Then, manually refresh your browser to clear the cache and access the updated list of nodes.

ComfyUI-KugelAudio Description

ComfyUI-KugelAudio is an extension for ComfyUI that adds text-to-speech (TTS) with voice cloning, so users can generate spoken audio from text inside ComfyUI workflows.

ComfyUI-KugelAudio Introduction

ComfyUI-KugelAudio adds text-to-speech (TTS) to ComfyUI. It is built on an open-source model that combines an autoregressive (AR) component with a diffusion component, and it supports voice cloning across 24 European languages. It is aimed at AI artists who want realistic voiceovers for their projects and at anyone who needs natural-sounding speech generated from text.

How ComfyUI-KugelAudio Works

ComfyUI-KugelAudio turns written text into speech with a model that combines AR and diffusion techniques. The AR component generates the speech sequence step by step, with each step conditioned on the input text and on what has already been generated (it predicts the next element of the speech sequence, not the next word of text). The diffusion component then refines the audio output for clarity and naturalness. Together, the two stages produce speech that closely follows human intonation and rhythm. When given a reference audio sample, the extension can also clone a voice, reproducing that speaker's vocal characteristics in the TTS output.

ComfyUI-KugelAudio Features

Single Speaker TTS: Converts text into speech with a single voice, ideal for narrations or monologues.
Voice Cloning: Allows you to clone any voice using a short audio sample (5-30 seconds), making it possible to personalize the TTS output with unique vocal traits.
Multi-Speaker Conversations: Supports up to 6 speakers, enabling the creation of dynamic dialogues with configurable pauses between speakers for natural pacing.
Watermark Detection: All generated audio contains an inaudible watermark, and the Watermark Check node can test whether a clip carries it, which helps identify audio produced by the model.
Language Support: Offers TTS in 24 European languages, including English, German, French, and Spanish, among others.
4-bit Quantization: Reduces VRAM usage from approximately 19 GB to approximately 8 GB, making the model usable on GPUs with less memory.
Multiple Attention Types: Provides various attention mechanisms like Auto, SageAttention, and FlashAttention to optimize performance and quality.
Progress Tracking: Displays real-time progress bars for long text generations, keeping you informed of the process.
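
The multi-speaker feature above can be pictured as a script split into per-speaker turns, with a pause inserted between turns. The following is a minimal sketch of that idea, not the extension's own code: the `Speaker N: text` line format, the `parse_script` function, and the `pause_seconds` parameter are all hypothetical names chosen for illustration, and the real node's input format may differ. Only the limit of 6 speakers comes from the feature list.

```python
import re
from dataclasses import dataclass

MAX_SPEAKERS = 6  # limit stated in the feature list

# Hypothetical line format, assumed for illustration: "Speaker 2: some text"
TURN_PATTERN = re.compile(r"^Speaker\s+(\d+)\s*:\s*(.+)$")


@dataclass
class Turn:
    speaker: int
    text: str
    pause_after: float  # seconds of silence before the next turn


def parse_script(script: str, pause_seconds: float = 0.5) -> list[Turn]:
    """Split a script into speaker turns; every turn but the last gets a pause."""
    turns = []
    for raw_line in script.splitlines():
        line = raw_line.strip()
        if not line:
            continue  # skip blank lines
        match = TURN_PATTERN.match(line)
        if match is None:
            raise ValueError(f"Unrecognized line: {line!r}")
        speaker = int(match.group(1))
        if not 1 <= speaker <= MAX_SPEAKERS:
            raise ValueError(f"Speaker {speaker} is outside 1..{MAX_SPEAKERS}")
        turns.append(Turn(speaker, match.group(2), pause_seconds))
    if turns:
        turns[-1].pause_after = 0.0  # no trailing silence after the final turn
    return turns


example_script = """
Speaker 1: Hello there.
Speaker 2: Hi, how are you?
"""
example_turns = parse_script(example_script, pause_seconds=0.3)
for turn in example_turns:
    print(turn.speaker, repr(turn.text), turn.pause_after)
```

Each turn would then be synthesized with its speaker's voice, and the resulting clips joined with the given amount of silence between them.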

ComfyUI-KugelAudio Models

ComfyUI-KugelAudio uses the kugelaudio-0-open model, which has 7 billion parameters. The model downloads automatically on first use, so no manual setup is required.
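
A rough estimate shows why quantization matters for a model with 7 billion parameters. The sketch below counts weight storage only and assumes 16-bit weights as the unquantized baseline, which is an assumption since the page does not state the precision. The helper `weight_memory_gb` is a hypothetical name. Actual VRAM usage is higher than these figures because it also includes activations, caches, and any parts of the model kept at higher precision.

```python
def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


PARAMS = 7e9  # parameter count stated for kugelaudio-0-open

fp16_gb = weight_memory_gb(PARAMS, 16)  # assumed 16-bit baseline
int4_gb = weight_memory_gb(PARAMS, 4)   # 4-bit quantized weights

print(f"16-bit weights: {fp16_gb:.1f} GB")
print(f"4-bit weights:  {int4_gb:.1f} GB")
```

Under these assumptions the weights alone shrink by a factor of four, from about 14 GB to about 3.5 GB.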

What's New with ComfyUI-KugelAudio

Recent updates add multi-speaker support for more complex audio productions and 4-bit quantization to reduce VRAM requirements, making the extension usable on a wider range of hardware.

Troubleshooting ComfyUI-KugelAudio

Common Issues and Solutions

Voice Cloning Errors: If you encounter an error related to 'Qwen2Config', ensure you run the install_portable.bat script in the ComfyUI-KugelAudio directory.
Out of Memory (OOM) Errors: Enable 4-bit quantization to reduce VRAM usage, use SDPA or Eager attention types, and consider reducing the max_words_per_chunk setting.
Model Download Failures: Verify your internet connection and try downloading the model manually using the Hugging Face CLI.
Audio Quality Issues: Adjust the cfg_scale setting to improve clarity and reduce distortion. For static or noise, disable 4-bit quantization.
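
Lowering max_words_per_chunk helps with memory because long text is then generated in smaller pieces. The sketch below shows one way such chunking could work, assuming a simple split on sentence-ending punctuation. The function `chunk_text` is hypothetical and written for illustration; the extension's own splitting logic may differ.

```python
import re


def chunk_text(text: str, max_words_per_chunk: int) -> list[str]:
    """Group sentences into chunks of at most max_words_per_chunk words."""
    if max_words_per_chunk < 1:
        raise ValueError("max_words_per_chunk must be at least 1")

    # Split after '.', '!' or '?' when followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())

    chunks: list[str] = []
    current: list[str] = []  # words collected for the chunk in progress
    for sentence in sentences:
        words = sentence.split()
        if not words:
            continue
        # A sentence longer than the limit is cut into fixed-size pieces.
        while len(words) > max_words_per_chunk:
            if current:
                chunks.append(" ".join(current))
                current = []
            chunks.append(" ".join(words[:max_words_per_chunk]))
            words = words[max_words_per_chunk:]
        # Start a new chunk if this sentence would overflow the current one.
        if len(current) + len(words) > max_words_per_chunk:
            chunks.append(" ".join(current))
            current = []
        current.extend(words)
    if current:
        chunks.append(" ".join(current))
    return chunks


sample = "One two three. Four five. Six seven eight nine ten."
sample_chunks = chunk_text(sample, max_words_per_chunk=5)
for chunk in sample_chunks:
    print(chunk)
```

Keeping sentences whole where possible avoids cutting speech mid-phrase; only sentences that exceed the limit on their own are split by word count.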

Learn More about ComfyUI-KugelAudio

To further explore the capabilities of ComfyUI-KugelAudio, consider visiting the GitHub Repository for detailed documentation and updates. Additionally, the Hugging Face Model Page provides access to the model and related resources. Engaging with community forums and tutorials can also offer valuable insights and support as you integrate this extension into your creative projects.

ComfyUI-KugelAudio Related Nodes

KugelAudio Multi-Speaker

KugelAudio TTS

KugelAudio Voice Clone

KugelAudio Watermark Check
