ComfyUI Node: Audio Align (XCorr)

Class Name

Audio Align (XCorr)

Category
Egregora/Analysis
Author
lucasgattas (Account age: 2973 days)
Extension
ComfyUI · Egregora Audio Super‑Resolution
Last Updated
2025-10-15
GitHub Stars
0.04K

How to Install ComfyUI · Egregora Audio Super‑Resolution

Install this extension via the ComfyUI Manager by searching for ComfyUI · Egregora Audio Super‑Resolution:
  1. Click the Manager button in the main menu.
  2. Select the Custom Nodes Manager button.
  3. Enter ComfyUI · Egregora Audio Super‑Resolution in the search bar.
After installation, click the Restart button to restart ComfyUI. Then manually refresh your browser to clear the cache and load the updated list of nodes.


Audio Align (XCorr) Description

Synchronize audio signals with precision using cross-correlation techniques for accurate alignment in audio processing tasks.

Audio Align (XCorr):

The Audio Align (XCorr) node synchronizes two audio signals by aligning them in time. This matters wherever a processed signal must be compared against its reference sample by sample, as in audio restoration, enhancement, or analysis. The node uses cross-correlation, by default the Generalized Cross-Correlation with Phase Transform (GCC-PHAT), to estimate the time delay between the reference and processed audio, then shifts the processed audio to match the timing of the reference. Downstream steps such as gain matching and null testing depend on this: even a small residual offset prevents two otherwise identical signals from cancelling. Support for fractional (sub-sample) delays lets the node correct offsets smaller than one sample, which is useful for audio engineers and AI artists comparing model outputs against source material.
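The delay estimation described above can be illustrated with a minimal NumPy sketch of GCC-PHAT. This is not the node's actual source: the function name `gcc_phat_delay`, its arguments, and the sign convention (positive means the processed signal is late) are assumptions made for illustration, and it handles mono 1-D arrays only.

```python
import numpy as np

def gcc_phat_delay(ref, proc, sample_rate, max_shift_ms=200.0):
    """Estimate the integer lag of `proc` relative to `ref` (hypothetical helper).

    A positive result means `proc` arrives later than `ref`.
    """
    ref = np.asarray(ref, dtype=np.float64)
    proc = np.asarray(proc, dtype=np.float64)

    # Zero-pad so the circular correlation computed via FFT behaves like a linear one.
    n = len(ref) + len(proc)
    nfft = 1 << (n - 1).bit_length()

    R = np.fft.rfft(ref, nfft)
    P = np.fft.rfft(proc, nfft)

    # Cross-spectrum with PHAT weighting: keep the phase, discard the magnitude.
    cross = P * np.conj(R)
    cross /= np.abs(cross) + 1e-12
    cc = np.fft.irfft(cross, nfft)

    # Restrict the search to +/- max_shift samples around zero lag.
    max_shift = min(int(sample_rate * max_shift_ms / 1000.0), nfft // 2 - 1)
    window = np.roll(cc, max_shift)[: 2 * max_shift + 1]  # index 0 is lag -max_shift
    return int(np.argmax(window)) - max_shift
```

Because the PHAT weighting normalizes every frequency bin to unit magnitude, the correlation peak stays sharp even when the processed audio has a different spectral balance from the reference.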

Audio Align (XCorr) Input Parameters:

audio_ref

This parameter is the reference audio signal to which the processed audio will be aligned. It serves as the timing benchmark: the reference is left untouched, and the delay is measured relative to it.

audio_proc

This parameter is the processed audio signal that needs to be aligned with the reference audio. The node will adjust this audio's timing to match that of the reference audio, ensuring synchronization.

align_max_shift_ms

This parameter defines the maximum allowable time shift, in milliseconds, between the two signals. It bounds the alignment so that the node cannot lock onto a spurious match far from the true offset. Set it slightly larger than the largest delay you expect. The default value is 200 ms.

align_method

This parameter specifies the method used for alignment, with "gcc-phat" being the default. GCC-PHAT normalizes the cross-spectrum so that the correlation peak depends on phase rather than magnitude, which keeps delay estimates robust when the processed audio differs in level or spectral balance from the reference.

fractional

This boolean parameter determines whether fractional delays should be considered during alignment. Enabling this option allows for more precise alignment by accounting for sub-sample delays. The default value is True.

fir_len

This parameter sets the length of the Finite Impulse Response (FIR) filter used for fractional delay processing. A longer filter approximates the ideal sub-sample delay more accurately at the cost of more computation. The default value is 64.
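To show how an FIR filter can realize a sub-sample shift, here is a minimal sketch using a Hamming-windowed sinc. The node's actual filter design is not documented here, so the window choice, the function names `fractional_delay_fir` and `apply_fractional_delay`, and the way the filter's bulk delay is trimmed are all assumptions.

```python
import numpy as np

def fractional_delay_fir(frac, fir_len=64):
    """Windowed-sinc taps delaying by (fir_len - 1) // 2 + frac samples (illustrative)."""
    n = np.arange(fir_len)
    bulk = (fir_len - 1) // 2                 # integer part of the filter's own delay
    taps = np.sinc(n - bulk - frac) * np.hamming(fir_len)
    return taps / taps.sum()                  # normalize to unity gain at DC

def apply_fractional_delay(x, frac, fir_len=64):
    """Delay the 1-D signal `x` by `frac` samples, removing the filter's bulk delay."""
    x = np.asarray(x, dtype=np.float64)
    taps = fractional_delay_fir(frac, fir_len)
    bulk = (fir_len - 1) // 2
    y = np.convolve(x, taps)                  # full convolution
    return y[bulk : bulk + len(x)]            # same length as the input
```

With `frac=0.0` the taps collapse to a single unit impulse, so the signal passes through unchanged. For non-zero `frac`, the first and last `fir_len` samples or so are affected by edge effects.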

Audio Align (XCorr) Output Parameters:

ap_aligned

This output is the processed audio signal after alignment, time-shifted so that it lines up with the reference audio.

delay_samples

This output represents the calculated delay, in samples, between the reference and processed audio signals. It indicates how much the processed audio was shifted to achieve alignment.

delay_ms

This output provides the calculated delay in milliseconds, offering a more intuitive understanding of the time shift applied during alignment.
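The two delay outputs are related through the sample rate. A small sketch of the conversion, with `samples_to_ms` as a hypothetical helper name rather than part of the node:

```python
def samples_to_ms(delay_samples, sample_rate):
    """Convert a delay in samples (possibly fractional) to milliseconds."""
    return 1000.0 * delay_samples / sample_rate

# Example: a 480-sample delay at 48 kHz corresponds to 10 ms.
print(samples_to_ms(480, 48000))
```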

Audio Align (XCorr) Usage Tips:

  • To achieve optimal alignment, ensure that the reference and processed audio signals have similar characteristics, such as sample rate and duration, before using the node.
  • When working with audio signals that have significant noise or distortion, consider preprocessing the audio to enhance clarity and improve alignment accuracy.

Audio Align (XCorr) Common Errors and Solutions:

Sample rate mismatch after alignment stage

  • Explanation: This error occurs when the sample rates of the reference and processed audio signals do not match after the alignment process.
  • Solution: Ensure that both audio signals have the same sample rate before alignment. If necessary, resample the processed audio to match the reference audio's sample rate.
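As a sketch of the resampling step, the integer up/down factors needed to convert between two sample rates can be derived with a greatest common divisor. The helper name `resample_ratio` is hypothetical; the resulting factors could then be handed to a polyphase resampler of your choice (for example `scipy.signal.resample_poly`), or you can use a resampling node upstream in the workflow instead.

```python
from math import gcd

def resample_ratio(sr_from, sr_to):
    """Return (up, down) integer factors that convert sr_from to sr_to."""
    g = gcd(sr_from, sr_to)
    return sr_to // g, sr_from // g

# Example: converting 44.1 kHz audio to 48 kHz needs up=160, down=147.
up, down = resample_ratio(44100, 48000)
print(up, down)
```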

Unsupported AUDIO object for this node

  • Explanation: This error indicates that the input audio object does not meet the expected format or lacks necessary attributes like sample rate or waveform.
  • Solution: Verify that the input audio objects are correctly formatted and contain all required attributes, such as sample rate and waveform data.
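A quick way to verify an input is to check it against the standard ComfyUI AUDIO layout: a dictionary with a `waveform` tensor shaped `[batch, channels, samples]` and an integer `sample_rate`. The sketch below assumes that layout is what this node expects; `check_audio` is a hypothetical helper and is not part of the extension.

```python
def check_audio(audio):
    """Validate a ComfyUI-style AUDIO dict and return (shape, sample_rate)."""
    if not isinstance(audio, dict):
        raise TypeError("AUDIO must be a dict")

    missing = [key for key in ("waveform", "sample_rate") if key not in audio]
    if missing:
        raise ValueError(f"AUDIO is missing keys: {missing}")

    # Duck-typed check so this works for torch tensors and NumPy arrays alike.
    shape = getattr(audio["waveform"], "shape", None)
    if shape is None or len(shape) != 3:
        raise ValueError("waveform must have shape [batch, channels, samples]")

    sample_rate = audio["sample_rate"]
    if not isinstance(sample_rate, int) or sample_rate <= 0:
        raise ValueError("sample_rate must be a positive integer")

    return tuple(shape), sample_rate
```

Running this on both inputs before the node makes it easier to tell which of the two connections carries the malformed object.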

Copyright 2025 RunComfy. All Rights Reserved.