ComfyUI Node: Seed Voice Conversion

Class Name: SeedVCRun
Category: 🎤MW/MW-Seed-VC
Author: billwuhao (Account age: 2576 days)
Extension: ComfyUI_Seed-VC
Latest Updated: 2026-03-24
GitHub Stars: 0.06K

How to Install ComfyUI_Seed-VC

Install this extension through the ComfyUI Manager:
  1. Click the Manager button in the main menu.
  2. Select the Custom Nodes Manager button.
  3. Enter ComfyUI_Seed-VC in the search bar and install the matching result.
After installation, click the Restart button to restart ComfyUI, then manually refresh your browser to clear the cache and load the updated list of nodes.

Seed Voice Conversion Description

SeedVCRun converts source audio so that it matches the voice style of a reference sample.

Seed Voice Conversion:

SeedVCRun is a voice conversion node. It takes a source audio waveform and a reference audio sample, and transforms the source so that it carries the vocal characteristics of the reference. The reference guides the transformation, so the output adopts the target voice while aiming to keep audio quality high. The node is aimed at AI artists and audio engineers experimenting with voice synthesis, and suits projects that need voice modulation, character voice creation, or any other change of voice identity.

Seed Voice Conversion Input Parameters:

source_audio

The source_audio parameter is the audio you want to convert. It is the base input for the conversion, so its quality and characteristics carry through to the final output. Provide it in a format that includes both the waveform and the sample rate.

ref_audio

The ref_audio parameter is the reference sample that defines the target voice. The source audio is transformed to adopt the vocal characteristics of this sample, so the choice of reference dictates the voice style of the output.

steps

The steps parameter sets the number of diffusion steps used during conversion. More steps can produce a more refined output, at the cost of longer processing time. The range and default value are not specified here.
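As a rough illustration of why more steps can refine the result, the sketch below integrates a toy equation with a fixed-step Euler solver and reports how the error shrinks as the step count grows. It assumes the node's sampler behaves broadly like a fixed-step solver, which this page does not confirm; `euler_integrate` and the toy equation are hypothetical and are not part of the node.

```python
import math


def euler_integrate(n_steps: int, x0: float = 1.0, t_end: float = 1.0) -> float:
    """Integrate dx/dt = -x from t = 0 to t_end using n_steps equal Euler steps."""
    dt = t_end / n_steps
    x = x0
    for _ in range(n_steps):
        x += dt * (-x)  # one Euler update
    return x


# The exact solution at t = 1 is exp(-1); compare it against each step count.
exact = math.exp(-1.0)
errors = {n: abs(euler_integrate(n) - exact) for n in (5, 10, 30, 100)}

for n, err in errors.items():
    print(f"steps={n:3d}  error={err:.5f}")
```

The error falls as the step count rises, but each extra step costs one more model evaluation, which is the time-versus-quality trade-off to keep in mind when tuning steps.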

speed

The speed parameter adjusts the length of the output relative to the source audio, speeding up or slowing down the converted result. Use it to meet specific timing requirements or for artistic effect.

inference_cfg_rate

The inference_cfg_rate parameter sets a guidance rate applied during inference. The "cfg" in the name most likely stands for classifier-free guidance, a common control in diffusion-style samplers, although this page does not confirm it. The parameter affects the model's behavior and the quality of the conversion. Its range and default value are not specified here; adjust it to suit different audio characteristics.
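The sketch below shows one common form of classifier-free guidance, on the assumption that "cfg" in inference_cfg_rate refers to that technique. The blending formula, the `apply_cfg` helper, and the sample numbers are illustrative assumptions, not the node's documented behavior.

```python
from typing import List


def apply_cfg(cond: List[float], uncond: List[float], cfg_rate: float) -> List[float]:
    """Blend a conditional and an unconditional prediction.

    cfg_rate = 0 returns the conditional prediction unchanged; larger values
    push the result further away from the unconditional prediction.
    """
    return [(1.0 + cfg_rate) * c - cfg_rate * u for c, u in zip(cond, uncond)]


# Hypothetical model outputs with and without the reference-voice condition.
cond = [0.8, -0.2, 0.5]
uncond = [0.5, 0.0, 0.5]

print(apply_cfg(cond, uncond, 0.0))  # no guidance
print(apply_cfg(cond, uncond, 0.7))  # moderate guidance
```

Under this reading, a higher rate makes the output follow the reference more strongly, and very high values may introduce artifacts.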

f0_condition

The f0_condition parameter toggles whether the fundamental frequency (f0) is taken into account during conversion. Enabling it helps maintain pitch accuracy and natural-sounding intonation in the converted voice.

auto_f0_adjust

The auto_f0_adjust parameter automatically adjusts the fundamental frequency to better match the target voice. It helps align the pitch of the source audio with the reference audio, which makes the conversion sound more natural.

pitch_shift

The pitch_shift parameter manually raises or lowers the pitch of the converted audio, either for artistic effect or to better match the target voice style.
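This page does not state the unit of pitch_shift. If it is measured in semitones (an assumption), the relationship between a shift value and the resulting frequency can be sketched as follows; `semitones_to_ratio` and `shifted_f0` are hypothetical helpers, not part of the node.

```python
def semitones_to_ratio(semitones: float) -> float:
    """Frequency ratio for a shift of `semitones` in 12-tone equal temperament."""
    return 2.0 ** (semitones / 12.0)


def shifted_f0(f0_hz: float, semitones: float) -> float:
    """Fundamental frequency after shifting f0_hz by the given number of semitones."""
    return f0_hz * semitones_to_ratio(semitones)


# Show how a 220 Hz fundamental moves for a few shift values.
for shift in (-12, -3, 0, 3, 12):
    print(f"{shift:+3d} semitones: 220.0 Hz -> {shifted_f0(220.0, shift):.1f} Hz")
```

A shift of 12 semitones doubles the frequency (one octave up), and a shift of -12 halves it.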

unload_model

The unload_model parameter determines whether the voice conversion model is unloaded from memory after processing. Set it to true to free memory once the conversion completes, which helps on systems with limited memory.
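The node's internals are not shown on this page, but unloading a model typically follows a pattern like the sketch below: drop the cached reference, run garbage collection, and release cached GPU memory if PyTorch is present. The cache dictionary, the key name, and the `unload_model` helper are hypothetical.

```python
import gc
from typing import Any, Dict


def unload_model(cache: Dict[str, Any], key: str) -> bool:
    """Remove a cached model and release the memory it held.

    Returns True if a model was removed, False if nothing was cached under key.
    """
    model = cache.pop(key, None)
    if model is None:
        return False
    del model      # drop the last local reference
    gc.collect()   # reclaim Python-side memory
    try:
        import torch  # optional: only needed to release cached GPU memory
        if torch.cuda.is_available():
            torch.cuda.empty_cache()
    except ImportError:
        pass
    return True


# Hypothetical cache holding a stand-in object in place of a real model.
model_cache: Dict[str, Any] = {"seed_vc": object()}
print(unload_model(model_cache, "seed_vc"))  # a model was cached, so it is removed
print(unload_model(model_cache, "seed_vc"))  # nothing left to remove
```

The trade-off is reload time: with unloading enabled, the next run has to load the model again before it can start.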

Seed Voice Conversion Output Parameters:

waveform

The waveform output is the converted audio: the source content rendered with the vocal characteristics of the reference audio. It is the node's primary result and can be passed to further audio processing or played back directly.

sample_rate

The sample_rate output is the sample rate of the converted waveform. Downstream tools need it to interpret and play back the waveform correctly, so check that it matches the requirements of your intended use case.
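A waveform and its sample rate together are enough to write a playable file. The sketch below uses only the Python standard library and assumes the waveform has already been flattened to a mono sequence of floats in the range -1.0 to 1.0; the sine-wave data and the 22050 Hz rate are placeholders, not values produced by the node.

```python
import io
import math
import struct
import wave
from typing import BinaryIO, Sequence


def write_wav(dest: BinaryIO, samples: Sequence[float], sample_rate: int) -> None:
    """Write mono float samples in [-1.0, 1.0] as a 16-bit PCM WAV stream."""
    frames = bytearray()
    for s in samples:
        clipped = max(-1.0, min(1.0, s))  # guard against out-of-range values
        frames += struct.pack("<h", int(round(clipped * 32767)))
    with wave.open(dest, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit audio
        wav.setframerate(sample_rate)
        wav.writeframes(bytes(frames))


# Placeholder data: one second of a 440 Hz tone at an assumed 22050 Hz rate.
sample_rate = 22050
samples = [0.3 * math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(sample_rate)]

buffer = io.BytesIO()
write_wav(buffer, samples, sample_rate)
print(f"duration: {len(samples) / sample_rate:.2f} s, bytes written: {len(buffer.getvalue())}")
```

Writing the file with the wrong sample rate changes both playback speed and pitch, which is why the two outputs should always travel together.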

Seed Voice Conversion Usage Tips:

  • Experiment with different ref_audio samples to achieve a wide range of voice styles and characteristics in your converted audio.
  • Adjust the steps parameter to find the optimal balance between processing time and audio quality, as more steps can lead to a more refined output.
  • Use the pitch_shift parameter creatively to explore unique vocal effects and to better match the target voice style.

Seed Voice Conversion Common Errors and Solutions:

"CUDA out of memory"

  • Explanation: This error occurs when the GPU does not have enough memory to process the voice conversion task.
  • Solution: Try reducing the steps parameter, or set unload_model to true so that the model is released from memory after each run.

"Invalid audio format"

  • Explanation: This error indicates that the input audio files are not in the expected format or do not include necessary information like waveform and sample rate.
  • Solution: Ensure that both source_audio and ref_audio are provided with the correct waveform and sample rate information.
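A quick structural check before running the node can catch this error early. The sketch below assumes each audio input is a dictionary holding a "waveform" with three dimensions (batch, channels, samples) and an integer "sample_rate"; that layout is an assumption about the AUDIO type, not something this page states. `check_audio` and `FakeWaveform` are hypothetical names.

```python
from typing import Any, Tuple


class FakeWaveform:
    """Stand-in for a tensor: only carries a shape, for demonstration."""

    def __init__(self, shape: Tuple[int, ...]) -> None:
        self.shape = shape


def check_audio(audio: Any, name: str) -> None:
    """Raise ValueError if audio lacks the fields an audio input is assumed to carry."""
    if not isinstance(audio, dict):
        raise ValueError(f"{name}: expected a dict, got {type(audio).__name__}")
    for key in ("waveform", "sample_rate"):
        if key not in audio:
            raise ValueError(f"{name}: missing '{key}'")
    rate = audio["sample_rate"]
    if isinstance(rate, bool) or not isinstance(rate, int) or rate <= 0:
        raise ValueError(f"{name}: sample_rate must be a positive integer")
    shape = getattr(audio["waveform"], "shape", None)
    if shape is None or len(shape) != 3:
        raise ValueError(f"{name}: waveform must have 3 dimensions (batch, channels, samples)")


source_audio = {"waveform": FakeWaveform((1, 1, 44100)), "sample_rate": 44100}
ref_audio = {"waveform": FakeWaveform((1, 1, 88200))}  # sample_rate left out on purpose

check_audio(source_audio, "source_audio")  # passes silently
try:
    check_audio(ref_audio, "ref_audio")
except ValueError as err:
    print(err)
```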

"Model not loaded"

  • Explanation: This error suggests that the voice conversion model was not properly initialized or has been unloaded before processing.
  • Solution: Check that the model is correctly loaded before starting the conversion process and avoid setting unload_model to true prematurely.
