
ComfyUI Node: Egregora Metrics (LSD + SI-SDR)

Class Name

Metrics (LSD + SI-SDR)

Category
Egregora/Analysis
Author
lucasgattas (Account age: 2973 days)
Extension
ComfyUI · Egregora Audio Super‑Resolution
Latest Updated
2025-10-15
Github Stars
0.04K

How to Install ComfyUI · Egregora Audio Super‑Resolution

Install this extension via the ComfyUI Manager by searching for ComfyUI · Egregora Audio Super‑Resolution:
  1. Click the Manager button in the main menu.
  2. Click the Custom Nodes Manager button.
  3. Enter ComfyUI · Egregora Audio Super‑Resolution in the search bar and install the matching result.
After installation, click the Restart button to restart ComfyUI, then manually refresh your browser to clear the cache and load the updated list of nodes.

Egregora Metrics (LSD + SI-SDR) Description

Evaluate audio quality using Log-Spectral Distance (LSD) and Scale-Invariant Signal-to-Distortion Ratio (SI-SDR).

Metrics (LSD + SI-SDR):

The Metrics (LSD + SI-SDR) node evaluates audio quality by comparing a processed signal against a reference and reporting two metrics: Log-Spectral Distance (LSD) and Scale-Invariant Signal-to-Distortion Ratio (SI-SDR). LSD measures how far the log-magnitude spectra of the two signals differ, frame by frame, so it shows how closely the processed audio matches the reference in frequency content; lower values mean a closer match. SI-SDR measures, in the time domain, how much of the processed signal is explained by the reference versus how much is residual distortion, and it is insensitive to overall gain differences between the two signals; higher values mean less distortion. Together they give audio engineers and AI artists a quantitative way to check whether a processing step, such as super-resolution or enhancement, improved or degraded the audio.
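To make the SI-SDR idea concrete, here is a minimal sketch using the commonly cited definition: project the processed signal onto the reference, treat the projection as the target and the remainder as distortion. The function name `si_sdr_db`, the mean removal, and the `eps` guard are assumptions made for illustration; the node's own implementation may differ in these details.

```python
import numpy as np

def si_sdr_db(ref, est, eps=1e-12):
    """Scale-invariant SDR in dB for two equal-length mono signals (illustrative)."""
    ref = np.asarray(ref, dtype=np.float64)
    est = np.asarray(est, dtype=np.float64)
    # Assumption: remove the mean so a DC offset does not count as signal.
    ref = ref - ref.mean()
    est = est - est.mean()
    # Optimal scaling of the reference that best explains the estimate.
    alpha = np.dot(est, ref) / (np.dot(ref, ref) + eps)
    target = alpha * ref      # part of the estimate aligned with the reference
    residual = est - target   # everything else is treated as distortion
    return 10.0 * np.log10((np.dot(target, target) + eps) /
                           (np.dot(residual, residual) + eps))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    t = np.arange(16000) / 16000.0
    ref = np.sin(2 * np.pi * 440.0 * t)
    est = ref + 0.1 * rng.standard_normal(16000)
    # Halving the gain of the estimate leaves the score essentially unchanged.
    print(si_sdr_db(ref, est), si_sdr_db(ref, 0.5 * est))
```

Because the reference is rescaled before the residual is measured, a processed file that is simply louder or quieter than the reference is not penalized; only changes in waveform shape (and time offsets) lower the score.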

Metrics (LSD + SI-SDR) Input Parameters:

audio_ref

This parameter represents the reference audio signal against which the processed audio will be compared. It is crucial for establishing a baseline to evaluate the quality of the processed audio. The reference audio should be a high-quality version of the audio you are trying to enhance or process.

audio_proc

This parameter is the processed audio signal that you want to evaluate. It is compared against the reference audio to determine the effectiveness of the audio processing techniques applied. The goal is to have this audio closely match the reference audio in terms of quality and clarity.

n_fft

The n_fft parameter sets the number of points in the Fast Fourier Transform (FFT) used to compute the spectrogram, and therefore the frequency resolution of the analysis. The default is 2048, with a minimum of 512, a maximum of 8192, and a step of 128. A larger n_fft gives finer frequency resolution (the bin spacing is the sample rate divided by n_fft), but each frame covers a longer stretch of audio, so short events are localized less precisely in time, and each frame costs more to compute.

hop

The hop parameter specifies the number of samples between successive frames in the spectrogram calculation. It influences the time resolution of the analysis, with a default value of 512. The minimum value is 64, and the maximum is 4096, with a step of 64. A smaller hop size offers better time resolution but increases the number of frames to process.
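The trade-off between n_fft and hop can be worked out with simple arithmetic. The sketch below is illustrative only: the helper name `stft_resolution` is hypothetical, the sample rate is an input you supply, and the frame count assumes plain non-centered framing with no padding, which may not match how the node frames the signal.

```python
def stft_resolution(num_samples, sample_rate, n_fft=2048, hop=512):
    """Return (bin spacing in Hz, hop duration in ms, number of full frames)."""
    bin_hz = sample_rate / n_fft          # frequency resolution
    hop_ms = 1000.0 * hop / sample_rate   # time step between frames
    # Assumption: frames are taken without padding, so only full frames count.
    n_frames = 1 + max(0, num_samples - n_fft) // hop
    return bin_hz, hop_ms, n_frames

if __name__ == "__main__":
    # One second of 48 kHz audio with the node's default settings.
    print(stft_resolution(48000, 48000))
    # Larger FFT and smaller hop: finer bins, more frames to process.
    print(stft_resolution(48000, 48000, n_fft=4096, hop=256))
```

With the defaults and an assumed 48 kHz sample rate, the bin spacing is 23.4375 Hz and one second of audio yields 90 full frames under this framing; doubling n_fft halves the bin spacing, while halving hop roughly doubles the frame count.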

compute_lsd

This boolean parameter determines whether the Log-Spectral Distance (LSD) should be computed. It is set to True by default, indicating that LSD will be calculated to assess the spectral similarity between the reference and processed audio.

compute_si_sdr

This boolean parameter indicates whether the Scale-Invariant Signal-to-Distortion Ratio (SI-SDR) should be computed. It is set to True by default, meaning that SI-SDR will be calculated to evaluate the distortion level in the processed audio relative to the reference.

Metrics (LSD + SI-SDR) Output Parameters:

metrics

The metrics output is a dictionary containing the values of the metrics that were computed. It includes lsd_mean_db, the Log-Spectral Distance in decibels averaged over frames, and lsd_p95_db, the 95th percentile of the per-frame LSD values, which indicates how bad the spectral deviation gets in the worst 5% of frames rather than the single worst frame. If SI-SDR computation is enabled, si_sdr_db is also included: the Scale-Invariant Signal-to-Distortion Ratio in decibels. Lower LSD values and higher SI-SDR values both indicate that the processed audio is closer to the reference.
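As an illustration of how per-frame LSD values can be reduced to a mean and a 95th percentile, here is a minimal NumPy sketch. It uses one common LSD definition (root-mean-square difference of log-power spectra per frame); the Hann window, the `eps` floor, the framing without padding, and the helper names are assumptions, so the numbers will not necessarily match what the node reports. Only the dictionary keys are taken from the description above.

```python
import numpy as np

def log_power_spec_db(x, n_fft=2048, hop=512, eps=1e-10):
    """Log-power spectrogram in dB, shape (frames, n_fft // 2 + 1)."""
    x = np.asarray(x, dtype=np.float64)
    if len(x) < n_fft:
        x = np.pad(x, (0, n_fft - len(x)))  # guarantee at least one frame
    n_frames = 1 + (len(x) - n_fft) // hop
    window = np.hanning(n_fft)
    frames = np.stack([x[i * hop:i * hop + n_fft] * window
                       for i in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    return 10.0 * np.log10(power + eps)

def lsd_stats(ref, proc, n_fft=2048, hop=512):
    """Mean and 95th percentile of per-frame LSD (illustrative definition)."""
    ref = np.asarray(ref, dtype=np.float64)
    proc = np.asarray(proc, dtype=np.float64)
    n = min(len(ref), len(proc))  # compare only the overlapping part
    s_ref = log_power_spec_db(ref[:n], n_fft, hop)
    s_proc = log_power_spec_db(proc[:n], n_fft, hop)
    # RMS over frequency bins gives one LSD value per frame.
    per_frame = np.sqrt(np.mean((s_ref - s_proc) ** 2, axis=1))
    return {
        "lsd_mean_db": float(per_frame.mean()),
        "lsd_p95_db": float(np.percentile(per_frame, 95)),
    }

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.standard_normal(16000)
    print(lsd_stats(x, x + 0.5 * rng.standard_normal(16000)))
```

Reporting a high percentile alongside the mean is useful because a processed file can match the reference well on average while still containing a few badly damaged frames.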

Metrics (LSD + SI-SDR) Usage Tips:

  • Use a high-quality reference that contains the same content as the processed audio and is time-aligned with it. SI-SDR in particular is sensitive to even small time offsets, so a delay introduced by processing can lower the score without any audible loss of quality.
  • Adjust n_fft and hop to suit the audio and the available compute: larger n_fft favors frequency resolution, smaller hop favors time resolution, and both increase processing cost.

Metrics (LSD + SI-SDR) Common Errors and Solutions:

"Audio length mismatch"

  • Explanation: This error occurs when the reference and processed audio signals have different lengths, which can lead to incorrect metric calculations.
  • Solution: Ensure that both audio signals are trimmed or padded to the same length before processing.
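One way to bring two signals to the same length before wiring them into the node is sketched below. The helper `match_length` is hypothetical and assumes 1-D mono arrays; it only handles a length difference at the end of the signal and does not correct a time offset at the start.

```python
import numpy as np

def match_length(ref, proc, mode="trim"):
    """Return both signals at equal length by trimming or zero-padding the end."""
    ref = np.asarray(ref, dtype=np.float64)
    proc = np.asarray(proc, dtype=np.float64)
    if mode == "trim":
        n = min(len(ref), len(proc))   # cut the longer signal
        return ref[:n], proc[:n]
    if mode == "pad":
        n = max(len(ref), len(proc))   # zero-pad the shorter signal
        return (np.pad(ref, (0, n - len(ref))),
                np.pad(proc, (0, n - len(proc))))
    raise ValueError("mode must be 'trim' or 'pad'")

if __name__ == "__main__":
    a, b = match_length(np.ones(1000), np.ones(1200), mode="trim")
    print(len(a), len(b))
```

Trimming is usually the safer choice for metrics, since zero-padding adds silent frames where one signal has content and the other does not, which inflates the measured difference.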

"Invalid n_fft value"

  • Explanation: This error arises when the n_fft parameter is set outside the allowed range.
  • Solution: Set n_fft to a value within the allowed range of 512 to 8192. A power of two (512, 1024, 2048, 4096, or 8192) is preferable for FFT performance.
