ComfyUI Node: Reference Audio

Class Name: ReferenceTimbreAudio
Category: advanced/conditioning/audio
Author: ComfyAnonymous (Account age: 763 days)
Extension: ComfyUI
Last Updated: 2026-05-13
GitHub Stars: 112.77K

How to Install ComfyUI

Install this extension via the ComfyUI Manager by searching for ComfyUI:
  1. Click the Manager button in the main menu.
  2. Select the Custom Nodes Manager button.
  3. Enter ComfyUI in the search bar.
After installation, click the Restart button to restart ComfyUI. Then, manually refresh your browser to clear the cache and access the updated list of nodes.

Reference Audio Description

Node that adds a reference audio clip to the conditioning, so the model can transfer the clip's timbre to the generated audio.

Reference Audio:

The ReferenceTimbreAudio node sets the reference audio for conditioning, specifically for ACE-Step 1.5. The node is experimental. It adds an encoded reference audio clip to the conditioning, which is useful for tasks that require audio identity transfer or timbre matching. With the reference attached, the model can maintain specific audio characteristics, such as timbre, or transfer them from the reference clip to the target audio. This matters in audio synthesis and transformation, where the unique audio signature of the reference needs to be preserved.

Reference Audio Input Parameters:

conditioning

The conditioning parameter is a required input. It is the existing conditioning to which the reference audio's characteristics are attached, and it forms the baseline that the reference audio then influences.

latent

The latent parameter is an optional input that accepts pre-encoded audio latents, that is, the compressed representation of the reference audio clip. When a latent is provided, the node appends it to the conditioning as reference audio. When it is not provided, the node relies solely on the conditioning input.
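
The optional behavior described above can be sketched in plain Python. This is a minimal illustration, not ComfyUI's actual implementation. It assumes that conditioning is a list of `[tensor, options]` pairs and that the latent is a dict holding the encoded audio under `samples`. The function name `set_reference_latent` and the key `reference_timbre_latents` are hypothetical names chosen for this example.

```python
from copy import copy

# Hypothetical key name, used only for illustration.
REFERENCE_KEY = "reference_timbre_latents"


def set_reference_latent(conditioning, latent=None):
    """Return conditioning with the latent appended under REFERENCE_KEY.

    Assumes conditioning is a list of [tensor, options_dict] pairs and
    latent is a dict holding the encoded audio under "samples".
    """
    if latent is None:
        # No reference supplied: pass the conditioning through untouched.
        return conditioning

    updated = []
    for tensor, options in conditioning:
        new_options = copy(options)  # shallow copy so the input is not mutated
        # Build a fresh list: earlier references are kept, the new one is appended.
        previous = list(options.get(REFERENCE_KEY, []))
        new_options[REFERENCE_KEY] = previous + [latent["samples"]]
        updated.append([tensor, new_options])
    return updated
```

Calling the function without a latent returns the conditioning unchanged, which mirrors the fallback described above. Calling it repeatedly accumulates references in order.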

Reference Audio Output Parameters:

conditioning

The output conditioning is the input conditioning with the reference audio's timbre latents incorporated. Pass it to subsequent nodes so that the reference audio's characteristics are preserved or transferred during generation.

Reference Audio Usage Tips:

  • Ensure that the reference audio clip is of sufficient length to capture the desired audio characteristics. A clip that is too short may not provide enough data for effective conditioning.
  • Utilize the latent input to provide additional audio features if you have pre-encoded latents available. This can enhance the conditioning process and result in a more accurate audio transformation.

Reference Audio Common Errors and Solutions:

Reference audio is too short

  • Explanation: This error occurs when the reference audio clip is shorter than the minimum required duration of 1.8 seconds.
  • Solution: Ensure that your reference audio clip is at least 1.8 seconds long to provide sufficient data for the conditioning process.
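
A clip can be checked against this minimum before it reaches the node. The sketch below is an assumption-laden illustration rather than the node's own validation: it derives the duration from a sample count and a sample rate, and the names `clip_duration_seconds` and `check_min_duration` are hypothetical.

```python
MIN_REFERENCE_SECONDS = 1.8  # minimum duration stated in the explanation above


def clip_duration_seconds(num_samples, sample_rate):
    """Duration of a clip given its sample count and sample rate in Hz."""
    if sample_rate <= 0:
        raise ValueError("sample_rate must be positive")
    return num_samples / sample_rate


def check_min_duration(num_samples, sample_rate):
    """Return the duration, or raise ValueError if the clip is too short."""
    duration = clip_duration_seconds(num_samples, sample_rate)
    if duration < MIN_REFERENCE_SECONDS:
        raise ValueError(
            f"Reference audio is too short: {duration:.2f}s < {MIN_REFERENCE_SECONDS}s"
        )
    return duration
```

For example, 88,200 samples at 44,100 Hz is a 2-second clip and passes, while 44,100 samples at the same rate is a 1-second clip and is rejected.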

Total reference audio duration exceeds limit

  • Explanation: This error is raised when the combined duration of all reference audio clips exceeds the maximum allowed duration of 15.1 seconds.
  • Solution: Reduce the number of reference audio clips or shorten their durations to ensure the total duration does not exceed 15.1 seconds.
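
One way to follow this advice is to keep clips in order until the next one would push the total past the limit. The sketch below assumes the clip durations are already known in seconds. The function name `select_clips_within_limit` is hypothetical, and the node itself may enforce the limit differently.

```python
MAX_TOTAL_SECONDS = 15.1  # maximum combined duration stated in the explanation above


def select_clips_within_limit(durations, max_total=MAX_TOTAL_SECONDS):
    """Keep clips in order until adding the next one would exceed max_total.

    Returns the indices of the kept clips and their combined duration.
    """
    kept = []
    total = 0.0
    for index, duration in enumerate(durations):
        if total + duration > max_total:
            break  # this clip would push the total over the limit
        kept.append(index)
        total += duration
    return kept, total
```

With four 5-second clips, the first three are kept for a total of 15 seconds and the fourth is dropped.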

Copyright 2025 RunComfy. All Rights Reserved.
