
ComfyUI Node: (SP) Spectogram

Class Name

SignalProcessingSpectrogram

Category
Signal Processing
Author
c0ffymachyne (Account age: 5179 days)
Extension
ComfyUI Signal Processing
Latest Updated
2025-05-14
Github Stars
0.02K

How to Install ComfyUI Signal Processing

Install this extension via the ComfyUI Manager by searching for ComfyUI Signal Processing:
  1. Click the Manager button in the main menu.
  2. Select the Custom Nodes Manager button.
  3. Enter ComfyUI Signal Processing in the search bar.
After installation, click the Restart button to restart ComfyUI. Then manually refresh your browser to clear the cache and load the updated list of nodes.


(SP) Spectogram Description

Transforms audio into customizable spectrograms for intuitive visualization and analysis.

(SP) Spectogram:

The SignalProcessingSpectrogram node converts audio data into a spectrogram image, a plot of the signal's frequency content over time. This makes it useful for audio analysis, music visualization, and sound design. Judging by its parameters, the node computes a Mel-scaled spectrogram on a decibel scale and colorizes it with a matplotlib colormap. The FFT size, hop length, number of Mel bands, dynamic range, and colormap are all configurable.

(SP) Spectogram Input Parameters:

audio_input

The audio_input parameter is a dictionary that contains the audio data to be processed. It must include a key "waveform" whose value is a torch.Tensor holding the audio samples, and a key "sample_rate" whose value is an integer giving the sample rate of the audio. This parameter provides the raw audio that is transformed into a spectrogram. The waveform tensor may have different numbers of dimensions, and the node converts multi-channel audio to a mono signal if necessary.
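
As an illustration of this structure, the sketch below builds an audio_input dictionary and downmixes it to mono. The [batch, channels, samples] layout follows ComfyUI's usual AUDIO convention. The to_mono helper and the noise waveform are hypothetical and only approximate what the node may do internally.

```python
import torch

# Hypothetical example data: one second of stereo noise at 44.1 kHz.
sample_rate = 44100
waveform = torch.randn(1, 2, sample_rate)  # assumed layout: [batch, channels, samples]

audio_input = {"waveform": waveform, "sample_rate": sample_rate}


def to_mono(w: torch.Tensor) -> torch.Tensor:
    """Illustrative helper: reduce a waveform to a 1-D mono signal."""
    if w.dim() == 3:   # [batch, channels, samples] -> keep the first batch item
        w = w[0]
    if w.dim() == 2:   # [channels, samples] -> average over channels
        w = w.mean(dim=0)
    return w


mono = to_mono(audio_input["waveform"])
print(tuple(mono.shape))
```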

color_map

The color_map parameter specifies the colormap used to colorize the spectrogram. It accepts string values corresponding to colormaps available in matplotlib, such as "viridis" or "inferno". This parameter affects the visual appearance of the spectrogram, allowing you to choose a color scheme that best highlights the features of the audio data. The default value is "viridis".

n_fft

The n_fft parameter determines the number of FFT (Fast Fourier Transform) points used in the spectrogram calculation. It affects the frequency resolution of the spectrogram, with higher values providing more detailed frequency information. The default value is 2048, and it should be a power of two for optimal performance.
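
To make the effect of n_fft concrete, the following sketch computes how many linear frequency bins an FFT of a given size produces and how wide each bin is in hertz. The function name is made up for illustration, and the 44100 Hz sample rate is an assumed example value. These bins exist before any Mel mapping is applied.

```python
def frequency_resolution(sample_rate: int, n_fft: int):
    """Return (number of one-sided FFT bins, width of each bin in Hz)."""
    n_bins = n_fft // 2 + 1            # real-valued input: bins from 0 Hz to Nyquist
    bin_width_hz = sample_rate / n_fft
    return n_bins, bin_width_hz


for n_fft in (512, 2048, 8192):
    bins, width = frequency_resolution(44100, n_fft)
    print(f"n_fft={n_fft}: {bins} bins, {width:.2f} Hz per bin")
```

Larger n_fft values give narrower bins, but each frame then covers a longer stretch of audio.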

hop_length

The hop_length parameter defines the number of audio samples between the starts of successive frames in the spectrogram. It controls the time resolution: smaller values give finer time detail but produce more frames, which means a wider image and more computation. The default value is 512, which is one quarter of the default n_fft and gives 75% overlap between neighboring frames.
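
The relationship between hop_length and the width of the resulting spectrogram can be sketched as follows. This assumes centered, padded frames (the default behavior of torch.stft), which the node may or may not use. The function name is hypothetical.

```python
def frame_layout(num_samples: int, sample_rate: int,
                 hop_length: int = 512, n_fft: int = 2048):
    """Return (frame count, time step in ms, fractional overlap between frames)."""
    n_frames = 1 + num_samples // hop_length   # assumes centered, padded frames
    step_ms = 1000.0 * hop_length / sample_rate
    overlap = 1.0 - hop_length / n_fft
    return n_frames, step_ms, overlap


# One second of audio at 44.1 kHz with the default settings.
frames, step_ms, overlap = frame_layout(44100, 44100)
print(f"{frames} frames, {step_ms:.2f} ms per step, {overlap:.0%} overlap")
```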

n_mels

The n_mels parameter specifies the number of Mel bands to generate in the spectrogram. It determines the number of frequency bins in the Mel scale, which is a perceptual scale of pitches. The default value is 128, providing a good balance between detail and computational efficiency.
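
For intuition about how Mel bands are spaced, the sketch below uses the common HTK-style formula to place band centers between 0 Hz and an assumed Nyquist frequency of 22050 Hz. The node's actual Mel variant and frequency limits are not documented here, so treat both as assumptions.

```python
import math


def hz_to_mel(f_hz: float) -> float:
    """HTK-style Hz-to-Mel conversion."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel: float) -> float:
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_band_centers(n_mels: int, f_min: float, f_max: float):
    """Center frequencies (Hz) of n_mels bands spaced evenly on the Mel scale."""
    lo, hi = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (hi - lo) / (n_mels + 1)   # n_mels + 2 edge points, interior ones are centers
    return [mel_to_hz(lo + step * (i + 1)) for i in range(n_mels)]


centers = mel_band_centers(128, 0.0, 22050.0)
print(f"first gap: {centers[1] - centers[0]:.1f} Hz, "
      f"last gap: {centers[-1] - centers[-2]:.1f} Hz")
```

The bands are packed densely at low frequencies and spread out at high frequencies, which mirrors how pitch differences are perceived.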

top_db

The top_db parameter sets the dynamic range of the spectrogram in decibels. Values more than top_db below the loudest point are clipped to that floor, which improves contrast and keeps near-silent regions from dominating the color scale. The default value is 80.0, which is suitable for most audio signals.
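
A minimal sketch of this kind of clipping, following the convention used by librosa.power_to_db and torchaudio's AmplitudeToDB, is shown below. Whether the node uses exactly this computation is an assumption. The amin floor and the sample values are illustrative.

```python
import math


def power_to_db(power, top_db: float = 80.0, amin: float = 1e-10):
    """Convert power values to dB, then clip to top_db below the peak."""
    db = [10.0 * math.log10(max(p, amin)) for p in power]  # amin avoids log(0)
    floor = max(db) - top_db
    return [max(v, floor) for v in db]


# A loud bin, a moderate bin, and a near-silent bin.
print(power_to_db([1.0, 1e-3, 1e-12], top_db=80.0))
```

With top_db set to 80, the near-silent bin is raised to 80 dB below the peak instead of stretching the color scale further down.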

(SP) Spectogram Output Parameters:

image

The image output parameter is a torch.Tensor representing the generated spectrogram image. This tensor is normalized to the range [0, 1] and includes a batch dimension, making it ready for further processing or visualization. The spectrogram image provides a visual representation of the audio's frequency content over time, which can be used for analysis, artistic purposes, or as input to other machine learning models.
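
The normalization and batch dimension described above can be pictured with plain nested lists. Min-max scaling is an assumption here, since the node's exact normalization is not documented, and the function name and the tiny 2x2 grid are hypothetical.

```python
def normalize_grid(db_rows):
    """Scale a 2-D grid of dB values into the range [0, 1] (min-max scaling)."""
    flat = [v for row in db_rows for v in row]
    lo, hi = min(flat), max(flat)
    span = hi - lo
    if span == 0.0:                      # constant input: avoid division by zero
        return [[0.0] * len(row) for row in db_rows]
    return [[(v - lo) / span for v in row] for row in db_rows]


db = [[-80.0, -40.0], [-20.0, 0.0]]      # rows: frequency bands, columns: frames
image = normalize_grid(db)
batch = [image]                          # leading batch dimension of size 1
print(batch)
```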

(SP) Spectogram Usage Tips:

  • To achieve a higher frequency resolution in your spectrogram, consider increasing the n_fft parameter, but be aware that this may reduce time resolution.
  • Experiment with different color_map options to find a visual style that best highlights the features of your audio data, especially if you are using the spectrogram for artistic purposes.

(SP) Spectogram Common Errors and Solutions:

The 'waveform' key is missing or None in 'audio_input'.

  • Explanation: This error occurs when the audio_input dictionary does not contain a valid "waveform" key or the value is None.
  • Solution: Ensure that the audio_input dictionary includes a "waveform" key with a valid torch.Tensor representing the audio waveform.

Expected 'waveform' to be a torch.Tensor, got <type>.

  • Explanation: This error indicates that the "waveform" key in audio_input is not of type torch.Tensor.
  • Solution: Verify that the waveform data is correctly converted to a torch.Tensor before passing it to the node.

The 'sample_rate' key is missing or None in 'audio_input'.

  • Explanation: This error occurs when the audio_input dictionary does not contain a valid "sample_rate" key or the value is None.
  • Solution: Ensure that the audio_input dictionary includes a "sample_rate" key with an integer value representing the audio's sample rate.

Unexpected spectrogram shape: <shape>.

  • Explanation: This error suggests that the generated spectrogram does not have the expected shape, possibly due to incorrect input dimensions or processing parameters.
  • Solution: Check the dimensions of the input waveform and ensure that the parameters such as n_fft, hop_length, and n_mels are set correctly.
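
The input checks behind the first three errors can be approximated with a small pre-flight function, shown below as a sketch. The messages mirror the ones listed above, but the exception types and the function name are assumptions rather than the node's actual code.

```python
import torch


def validate_audio_input(audio_input: dict):
    """Hypothetical pre-flight check that mirrors the documented error messages."""
    waveform = audio_input.get("waveform")
    if waveform is None:
        raise ValueError("The 'waveform' key is missing or None in 'audio_input'.")
    if not isinstance(waveform, torch.Tensor):
        raise TypeError(
            f"Expected 'waveform' to be a torch.Tensor, got {type(waveform)}."
        )
    sample_rate = audio_input.get("sample_rate")
    if sample_rate is None:
        raise ValueError("The 'sample_rate' key is missing or None in 'audio_input'.")
    return waveform, int(sample_rate)


# A list is not accepted as a waveform; convert it with torch.tensor first.
samples = [0.0, 0.1, -0.1]
waveform, rate = validate_audio_input(
    {"waveform": torch.tensor(samples), "sample_rate": 16000}
)
print(tuple(waveform.shape), rate)
```

Running a check like this before the node executes helps surface malformed inputs early in a workflow.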

Copyright 2025 RunComfy. All Rights Reserved.