ComfyUI Node: Vocals using MDX

Class Name
AudioSeparateVocals

Category
audio/separation
Author
set-soft (Account age: 3460 days)
Extension
Audio Separation (Demix)
Last Updated
2026-02-11
Github Stars
0.02K

How to Install Audio Separation (Demix)

Install this extension via the ComfyUI Manager by searching for Audio Separation (Demix):
  1. Click the Manager button in the main menu.
  2. Select the Custom Nodes Manager button.
  3. Enter Audio Separation (Demix) in the search bar.
After installation, click the Restart button to restart ComfyUI. Then manually refresh your browser to clear the cache and load the updated list of nodes.

Vocals using MDX Description

Isolates vocals from audio using MDX-Net models, for remixing and analysis.

Vocals using MDX:

The AudioSeparateVocals node isolates the vocal track from an audio mix using MDX-Net models. It is aimed at audio engineers, music producers, and AI artists who want to extract vocals for remixing, analysis, or other creative work. Because it relies on pre-trained machine learning models, it makes vocal separation accessible without deep technical expertise in audio processing. The quality of the extracted vocals depends on the input audio and the selected model.

Vocals using MDX Input Parameters:

input_sound

This parameter is the audio from which you want to separate the vocals. It is the primary input and should be in a compatible audio format. The quality and clarity of the input audio can significantly affect the separation results.

model

The model parameter selects one of the available audio separation models. These models are pre-trained and optimized for vocal separation. The choice of model affects the quality and character of the separated vocals. The default model is "Kim_Vocal_2.safetensors", but you can choose another to suit your needs.

segments

This parameter sets the number of segments the audio is divided into during processing. It is an integer with a default of 1, a minimum of 1, and a maximum of 64. Increasing the number of segments may improve results for longer audio files, at the cost of a higher computational load.
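
To make the idea of dividing audio into segments concrete, here is a minimal sketch that computes near-equal sample ranges for a given segment count. It is an illustration only: the helper name segment_bounds is hypothetical, and the extension's actual chunking (for example, any overlap or padding between segments) is not described on this page and may differ.

```python
def segment_bounds(num_samples: int, segments: int = 1) -> list[tuple[int, int]]:
    """Return (start, end) sample ranges that cover the audio in `segments` parts.

    Hypothetical helper for illustration; not the extension's own code.
    """
    # The documented range for the node's `segments` input is 1 to 64.
    if not 1 <= segments <= 64:
        raise ValueError("segments must be between 1 and 64")
    if num_samples < 0:
        raise ValueError("num_samples must be non-negative")

    base, extra = divmod(num_samples, segments)
    bounds = []
    start = 0
    for i in range(segments):
        # Spread the remainder over the first `extra` segments.
        length = base + (1 if i < extra else 0)
        bounds.append((start, start + length))
        start += length
    return bounds


# Ten seconds of 44.1 kHz audio split into four parts.
print(segment_bounds(441_000, 4))
```

Each range ends where the next begins, so the ranges cover every sample exactly once.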

target_device

The target_device parameter specifies the computational device (CPU or CUDA) used for processing. The default is determined by the system's configuration. CUDA is generally faster on compatible hardware.
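
For readers who drive ComfyUI through its API-format workflow JSON, the four inputs above might be assembled as in the sketch below. The class name and input names come from this page; the id of the upstream audio node, the helper name vocals_node, and the "cuda" device string are assumptions, so check the values the node actually offers in your installation.

```python
import json


def vocals_node(audio_source_id: str,
                model: str = "Kim_Vocal_2.safetensors",
                segments: int = 1,
                target_device: str = "cuda") -> dict:
    """Build one API-format node entry for AudioSeparateVocals.

    Hypothetical helper. `audio_source_id` is the id of whichever node
    supplies the audio; output slot 0 of that node is assumed.
    """
    if not 1 <= segments <= 64:
        raise ValueError("segments must be between 1 and 64")
    return {
        "class_type": "AudioSeparateVocals",
        "inputs": {
            # Links are written as [source_node_id, output_index].
            "input_sound": [audio_source_id, 0],
            "model": model,
            "segments": segments,
            # Assumed spelling; the page only says "CPU or CUDA".
            "target_device": target_device,
        },
    }


workflow = {"2": vocals_node("1", segments=4)}
print(json.dumps(workflow, indent=2))
```

Validating segments before building the entry surfaces an out-of-range value in the script rather than at queue time.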

Vocals using MDX Output Parameters:

Vocals

This output parameter provides the isolated vocal track from the input audio. The separated vocals are delivered as an audio file, allowing you to use them for further processing, remixing, or analysis. The quality of the output depends on the input audio and the selected model.

Complement

The Complement output contains the remaining audio elements after the vocals have been separated. This includes instruments and other non-vocal sounds, providing a complementary track to the isolated vocals. This output is useful for creating instrumental versions or further audio manipulation.
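
The relationship between the two outputs can be pictured with a small sketch. It assumes the complement is the sample-wise difference between the mix and the vocals, a common approach in stem separation. This page does not state how the node computes Complement, so treat the hypothetical complement_of function below as an illustration rather than the extension's method.

```python
def complement_of(mix: list[float], vocals: list[float]) -> list[float]:
    """Subtract the vocal estimate from the mix, sample by sample.

    Assumption for illustration: complement = mix - vocals.
    """
    if len(mix) != len(vocals):
        raise ValueError("mix and vocals must have the same length")
    return [m - v for m, v in zip(mix, vocals)]


mix = [0.5, -0.25, 0.75]
vocals = [0.25, -0.25, 0.5]
instrumental = complement_of(mix, vocals)

# With these exactly representable values, the two stems sum back to the mix.
rebuilt = [v + c for v, c in zip(vocals, instrumental)]
print(instrumental, rebuilt == mix)
```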

Vocals using MDX Usage Tips:

  • For optimal results, ensure that the input audio is of high quality and free from excessive noise or distortion.
  • Experiment with different models to find the one that best suits your audio material and desired outcome.
  • Use the segments parameter to tune processing of longer audio files, trading off performance against accuracy.

Vocals using MDX Common Errors and Solutions:

"Model not found"

  • Explanation: This error occurs when the specified model is not available in the system.
  • Solution: Ensure that the model is correctly installed and listed in the available models. You may need to refresh the model database or download the model again.

"Invalid input audio format"

  • Explanation: The input audio file is not in a supported format or is corrupted.
  • Solution: Convert the audio file to a supported format such as WAV or MP3 and ensure it is not corrupted before retrying.

"Device not supported"

  • Explanation: The selected computational device is not available or not compatible with the current setup.
  • Solution: Check your system's hardware and software configuration to ensure compatibility with the selected device. Switch to a different device if necessary.
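
One way to avoid this error in scripted workflows is to check for the device before requesting it. The sketch below is an illustration under stated assumptions: choose_device is a hypothetical helper, the "cuda" and "cpu" strings are assumed spellings, and the availability check uses PyTorch's torch.cuda.is_available() only if PyTorch can be imported.

```python
def cuda_available() -> bool:
    """Report whether PyTorch sees a CUDA device; False if PyTorch is missing."""
    try:
        import torch
    except ImportError:
        return False
    return torch.cuda.is_available()


def choose_device(requested: str, has_cuda: bool) -> str:
    """Return the requested device, falling back to CPU when CUDA is absent.

    Hypothetical helper; device names are assumed spellings.
    """
    requested = requested.lower()
    if requested not in ("cpu", "cuda"):
        raise ValueError(f"unknown device: {requested!r}")
    if requested == "cuda" and not has_cuda:
        return "cpu"
    return requested


# In a real script, pass has_cuda=cuda_available() instead of a literal.
print(choose_device("cuda", has_cuda=False))
```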
