VocalSeparationNode:
The VocalSeparationNode is designed to facilitate the separation of vocal and instrumental components from a given audio track. This node leverages advanced machine learning models to accurately distinguish and extract vocals from the rest of the audio, providing you with two distinct audio outputs: one containing only the vocals and the other containing the instrumental parts. This capability is particularly beneficial for music producers, remix artists, and audio engineers who need to isolate vocals for remixing, analysis, or other creative purposes. By utilizing state-of-the-art separation techniques, the node ensures high-quality audio outputs, making it an essential tool for anyone working with music and audio processing.
VocalSeparationNode Input Parameters:
music
The music parameter is the input audio track that you wish to separate into vocal and instrumental components. It is expected to be in the form of an audio waveform, which the node will process to perform the separation. This parameter is crucial as it serves as the source material for the node's operations.
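Concretely, an audio waveform is usually a float array of samples per channel. As a rough illustration (the exact tensor layout and dtype depend on the host framework; the shapes here are assumptions), a stereo waveform at the node's 44100 Hz target rate can be built like this:

```python
import numpy as np

SAMPLE_RATE = 44100  # the target rate mentioned in the error section below

# Generate two seconds of a 440 Hz test tone as a stereo waveform.
duration_s = 2.0
t = np.linspace(0.0, duration_s, int(SAMPLE_RATE * duration_s), endpoint=False)
tone = 0.5 * np.sin(2 * np.pi * 440.0 * t).astype(np.float32)
music = np.stack([tone, tone])  # shape: (2 channels, 88200 samples)

print(music.shape)  # (2, 88200)
```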
model_type
The model_type parameter specifies the machine learning model to be used for the separation process. Available options include htdemucs, mdx23c, segm_models, mel_band_roformer, and bs_roformer, with bs_roformer being the default choice. Each model type offers different separation characteristics and performance, allowing you to choose the one that best suits your needs.
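Because an unrecognized model name is a common source of failures (see the errors section below), it can help to validate the choice up front. The option list comes from this section; the helper function itself is illustrative, not part of the node's API:

```python
VALID_MODEL_TYPES = {"htdemucs", "mdx23c", "segm_models",
                     "mel_band_roformer", "bs_roformer"}

def resolve_model_type(model_type: str = "bs_roformer") -> str:
    """Return a validated model_type, raising on unknown names."""
    if model_type not in VALID_MODEL_TYPES:
        raise ValueError(
            f"Unknown model_type {model_type!r}; "
            f"expected one of {sorted(VALID_MODEL_TYPES)}"
        )
    return model_type

print(resolve_model_type())            # bs_roformer (the default)
print(resolve_model_type("htdemucs"))  # htdemucs
```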
batch_size
The batch_size parameter determines the number of audio samples processed in a single batch during the separation process. The default value is 4, which balances processing speed and memory usage. Adjusting this parameter can impact the node's performance, with larger batch sizes potentially speeding up processing at the cost of increased memory usage.
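Separation models typically process fixed-length chunks of the waveform, and batch_size controls how many chunks run through the model at once. A rough sketch of that batching logic (the chunk length and grouping policy are assumptions, not the node's actual internals):

```python
import numpy as np

def make_batches(chunks, batch_size=4):
    """Group a list of equal-length chunks into batches of at most batch_size."""
    return [np.stack(chunks[i:i + batch_size])
            for i in range(0, len(chunks), batch_size)]

# Ten 1-second mono chunks at 44.1 kHz -> three batches of 4, 4, and 2 chunks.
chunks = [np.zeros(44100, dtype=np.float32) for _ in range(10)]
batches = make_batches(chunks, batch_size=4)
print([b.shape[0] for b in batches])  # [4, 4, 2]
```

Doubling batch_size halves the number of model invocations but doubles the peak memory held by each batch, which is the speed/memory trade-off described above.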
if_mirror
The if_mirror parameter is a boolean option that, when enabled, applies a mirroring technique during the separation process. This can enhance the quality of the separation by providing additional context to the model. The default value is True, indicating that mirroring is applied by default.
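The documentation does not spell out the mirroring technique. One plausible interpretation (purely a sketch, not the node's verified implementation) is to also run the model on a time-reversed copy of the audio and average the two re-aligned passes, which can smooth artifacts:

```python
import numpy as np

def separate_with_mirror(audio, separate_fn, if_mirror=True):
    """Run separate_fn on audio and, when if_mirror is True, also on its
    time-reversed copy, averaging the re-aligned results."""
    forward = separate_fn(audio)
    if not if_mirror:
        return forward
    mirrored = separate_fn(audio[::-1])[::-1]  # reverse, separate, reverse back
    return 0.5 * (forward + mirrored)

# With an identity "model", mirroring leaves the signal unchanged.
audio = np.arange(8, dtype=np.float32)
out = separate_with_mirror(audio, lambda x: x)
print(np.allclose(out, audio))  # True
```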
VocalSeparationNode Output Parameters:
vocals_AUDIO
The vocals_AUDIO output parameter provides the isolated vocal component of the input audio track. This output is a waveform containing only the vocal elements, making it ideal for remixing, vocal analysis, or other creative applications where vocals are needed separately from the instrumental parts.
instrumental_AUDIO
The instrumental_AUDIO output parameter delivers the instrumental component of the input audio track, with the vocals removed. This output is useful for creating karaoke tracks, instrumental analysis, or any application where the instrumental part is required without the interference of vocals.
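Once you have the two output waveforms, you will typically write them to disk. A minimal sketch using only the standard library, assuming mono float32 waveforms in [-1, 1] at 44.1 kHz (the variable names are illustrative, not the node's output names):

```python
import wave
import numpy as np

def save_wav(path, waveform, sample_rate=44100):
    """Write a mono float32 waveform in [-1, 1] as a 16-bit PCM WAV file."""
    pcm = (np.clip(waveform, -1.0, 1.0) * 32767).astype(np.int16)
    with wave.open(path, "wb") as f:
        f.setnchannels(1)
        f.setsampwidth(2)   # 16-bit samples
        f.setframerate(sample_rate)
        f.writeframes(pcm.tobytes())

# Save a one-second placeholder "vocals" signal.
vocals = 0.3 * np.sin(np.linspace(0.0, 100.0, 44100)).astype(np.float32)
save_wav("vocals.wav", vocals)
```

For stereo outputs or higher bit depths, a dedicated library such as soundfile is the more practical choice.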
VocalSeparationNode Usage Tips:
- Experiment with different model_type options to find the one that best suits your audio material and desired separation quality.
- Adjust the batch_size parameter based on your system's memory capacity to optimize processing speed without exceeding available resources.
VocalSeparationNode Common Errors and Solutions:
"Model not loaded"
- Explanation: This error occurs when the specified model type is not properly loaded or initialized before processing.
- Solution: Ensure that the model files are correctly installed and accessible by the node. Verify that the model_type parameter is set to a valid option.
"Audio sample rate mismatch"
- Explanation: This error arises when the input audio sample rate does not match the expected target sample rate of 44100 Hz.
- Solution: Use an audio processing tool to resample your input audio to 44100 Hz before feeding it into the node. Alternatively, ensure that the node's internal resampling function is correctly configured.
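If your source audio is at a different rate (say 48 kHz), you can resample it before feeding it to the node. Dedicated tools (ffmpeg, librosa, torchaudio) give better quality, but a minimal linear-interpolation sketch looks like this:

```python
import numpy as np

def resample_linear(waveform, orig_sr, target_sr=44100):
    """Resample a 1-D waveform to target_sr via linear interpolation.
    Fine as a sketch; prefer a polyphase/sinc resampler for production audio."""
    n_out = int(round(len(waveform) * target_sr / orig_sr))
    old_t = np.arange(len(waveform)) / orig_sr
    new_t = np.arange(n_out) / target_sr
    return np.interp(new_t, old_t, waveform).astype(waveform.dtype)

# One second of 48 kHz audio becomes 44100 samples at 44.1 kHz.
audio_48k = np.random.default_rng(0).standard_normal(48000).astype(np.float32)
audio_44k = resample_linear(audio_48k, orig_sr=48000)
print(len(audio_44k))  # 44100
```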
