ComfyUI Extension: ComfyUI-HiggsAudio

Repo Name: ComfyUI-HiggsAudio
Author: Yuan-ManX (Account age: 2090 days)
Nodes: 6
Latest Updated: 2025-07-26
GitHub Stars: 0.02K

How to Install ComfyUI-HiggsAudio

Install this extension via the ComfyUI Manager by searching for ComfyUI-HiggsAudio:
  1. Click the Manager button in the main menu.
  2. Select the Custom Nodes Manager button.
  3. Enter ComfyUI-HiggsAudio in the search bar.
After installation, click the Restart button to restart ComfyUI. Then manually refresh your browser to clear the cache and load the updated list of nodes.

ComfyUI-HiggsAudio Description

ComfyUI-HiggsAudio integrates Higgs Audio v2, a text-to-audio foundation model by Boson AI, into ComfyUI, enhancing audio generation capabilities.

ComfyUI-HiggsAudio Introduction

ComfyUI-HiggsAudio is an extension that integrates the Higgs Audio model into ComfyUI. Developed by Boson AI, Higgs Audio is a text-to-audio foundation model for generating expressive, high-fidelity audio. The extension lets AI artists use Higgs Audio directly within ComfyUI to create audio from text. Typical uses include generating natural-sounding speech, creating multi-speaker dialogues, and exploring new audio styles.

How ComfyUI-HiggsAudio Works

At its core, ComfyUI-HiggsAudio uses the Higgs Audio model, which is trained on over 10 million hours of audio together with diverse text data. The model converts text inputs into audio outputs, using techniques such as Group Relative Policy Optimization (GRPO) and an audio tokenizer that captures both semantic and acoustic features. The aim is generated audio that is coherent and contextually appropriate while preserving acoustic detail and nuance.

ComfyUI-HiggsAudio Features

ComfyUI-HiggsAudio offers a range of features designed to enhance your audio generation experience:

  • Expressive Audio Generation: Create audio that captures the emotional tone and style of the input text, making it ideal for storytelling and artistic projects.
  • Multi-Speaker Dialogues: Generate dialogues with multiple speakers, each with distinct voices, to create dynamic and engaging audio scenes.
  • Voice Cloning: Clone voices from reference audio clips to generate new content that matches the style and tone of the original speaker.
  • Style Control: Fine-tune the audio output by adjusting parameters like temperature and top-p, allowing for greater creative control over the final result.
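
The temperature and top-p parameters mentioned above are standard sampling controls for token-based generative models. The following is a minimal, generic sketch of how they typically interact when picking the next token. It is not the extension's or the model's actual code; the function name `sample_top_p` and its defaults are hypothetical and chosen only for illustration.

```python
import math
import random


def sample_top_p(logits, temperature=1.0, top_p=0.95, rng=None):
    """Pick one index from `logits` using temperature scaling and top-p filtering.

    Hypothetical helper for illustration only; not part of ComfyUI-HiggsAudio.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    if not 0 < top_p <= 1:
        raise ValueError("top_p must be in (0, 1]")
    rng = rng or random.Random()

    # Temperature: values below 1 sharpen the distribution, values above 1 flatten it.
    scaled = [x / temperature for x in logits]

    # Numerically stable softmax.
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Top-p: keep the most likely tokens until their cumulative probability reaches top_p.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break

    # Sample among the kept tokens, weighted by their probabilities.
    return rng.choices(kept, weights=[probs[i] for i in kept], k=1)[0]
```

In this sketch, lowering either value makes the output more predictable: a small top-p discards unlikely tokens entirely, while a low temperature concentrates probability on the most likely ones. Raising them increases variety at the cost of stability.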

ComfyUI-HiggsAudio Models

The extension supports different versions of the Higgs Audio model, each tailored for specific use cases:

  • Higgs Audio V2: This version is optimized for expressive audio generation and excels in tasks like emotional speech synthesis and multi-speaker dialogues.
  • Higgs Audio V2.5: The latest iteration, offering improved efficiency and stability with a reduced model size of 1B parameters. It is ideal for production environments where speed and accuracy are crucial.

What's New with ComfyUI-HiggsAudio

The latest updates to ComfyUI-HiggsAudio include the integration of Higgs Audio V2.5, which brings several enhancements:

  • Improved Efficiency: The model architecture has been condensed to 1B parameters, resulting in faster processing times without compromising quality.
  • Enhanced Voice Cloning: New alignment strategies improve the accuracy and naturalness of cloned voices.
  • Finer-Grained Style Control: Users can now achieve more precise control over the audio style, allowing for more personalized and creative outputs.

Troubleshooting ComfyUI-HiggsAudio

Here are some common issues you might encounter while using ComfyUI-HiggsAudio, along with solutions:

  • Audio Quality Issues: If the generated audio sounds distorted or unnatural, try adjusting the temperature and top-p settings to find a balance that suits your needs.
  • Model Loading Errors: Ensure that all dependencies are correctly installed and that the model files are in the appropriate directory.
  • Performance Lag: If you experience slow performance, consider running the extension on a machine with a GPU to take advantage of accelerated processing.
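
When diagnosing model loading errors or slow performance, a quick environment check can help. The sketch below reports whether a set of Python packages is importable and whether PyTorch can see a CUDA GPU. The package list is an assumption (the source does not list the extension's dependencies), and `check_environment` is a hypothetical helper, so adjust both to match the extension's actual requirements file.

```python
import importlib.util


def check_environment(packages=("torch", "torchaudio", "transformers")):
    """Return a dict of package availability plus a CUDA visibility flag.

    The default package names are assumptions, not the extension's confirmed dependencies.
    """
    # find_spec returns None when a top-level package cannot be located.
    report = {name: importlib.util.find_spec(name) is not None for name in packages}

    cuda_available = False
    if report.get("torch"):
        try:
            import torch

            cuda_available = torch.cuda.is_available()
        except Exception:
            # A broken PyTorch install is treated as "no GPU available".
            cuda_available = False
    report["cuda_available"] = cuda_available
    return report


if __name__ == "__main__":
    for key, value in check_environment().items():
        print(f"{key}: {value}")
```

Run it with the same Python interpreter that ComfyUI uses; a missing package or `cuda_available: False` points to the dependency or GPU issues described above.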

Learn More about ComfyUI-HiggsAudio

To further explore the capabilities of ComfyUI-HiggsAudio, consider the following resources:

  • Higgs Audio V2 Blogpost (https://boson.ai/blog/higgs-audio-v2): Learn about the development and features of Higgs Audio V2.
  • Higgs Audio V2.5 Blogpost (https://www.boson.ai/blog/higgs-audio-v2.5): Discover the improvements and new features in the latest version.
  • Boson AI Playground (https://boson.ai/demo/tts): Experiment with the model in an interactive environment.
  • Hugging Face Space Playground: Access additional tools and resources for working with Higgs Audio.

These resources provide insights and practical examples to help you make the most of ComfyUI-HiggsAudio in your creative projects.

RunComfy
Copyright 2025 RunComfy. All Rights Reserved.
