(SP) Spectogram:
The SignalProcessingSpectrogram node is designed to transform audio data into a visual representation known as a spectrogram. This node is particularly useful for AI artists and developers who wish to analyze or visualize audio signals in a more intuitive and graphical format. By converting audio waveforms into spectrograms, you can easily observe the frequency content of the audio over time, which is beneficial for tasks such as audio analysis, music visualization, and sound design. The node leverages advanced signal processing techniques to generate a detailed and colorful spectrogram, which can be customized using various parameters to suit specific needs. The primary goal of this node is to provide a seamless and efficient way to visualize audio data, making it accessible and useful for creative and analytical purposes.
(SP) Spectogram Input Parameters:
audio_input
The audio_input parameter is a dictionary that contains the audio data to be processed. It must include a key "waveform" with a value of type torch.Tensor, representing the audio waveform, and a key "sample_rate" with an integer value indicating the sample rate of the audio. This parameter is crucial as it provides the raw audio data that will be transformed into a spectrogram. The waveform can be in various dimensions, and the node will handle converting it to a mono signal if necessary.
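The expected structure can be sketched as follows. This is a minimal illustration, not the node's internal code; the 440 Hz test tone and the channel-averaging step are assumptions based on the description above.

```python
import torch

# Hypothetical example of the dict this node expects:
# a 1-second stereo sine wave at 44.1 kHz, shape (channels, samples).
sample_rate = 44100
t = torch.arange(sample_rate) / sample_rate
waveform = torch.stack([torch.sin(2 * torch.pi * 440 * t)] * 2)  # (2, 44100)

audio_input = {"waveform": waveform, "sample_rate": sample_rate}

# Multi-channel audio can be collapsed to mono by averaging channels,
# as the node is described to do when necessary.
mono = audio_input["waveform"].mean(dim=0)  # (44100,)
```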
color_map
The color_map parameter specifies the colormap used to colorize the spectrogram. It accepts string values corresponding to colormaps available in matplotlib, such as "viridis" or "inferno". This parameter affects the visual appearance of the spectrogram, allowing you to choose a color scheme that best highlights the features of the audio data. The default value is "viridis".
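Colorization can be pictured as mapping each normalized spectrogram value through a matplotlib colormap. This is only a sketch of the general technique, assuming a spectrogram already scaled to [0, 1]; the stand-in data is hypothetical.

```python
import numpy as np
import matplotlib.pyplot as plt

# "viridis" is the documented default; any matplotlib colormap name works.
cmap = plt.get_cmap("viridis")

# Stand-in for a normalized spectrogram, shape (freq_bins, time_frames).
spec = np.linspace(0.0, 1.0, 128 * 256).reshape(128, 256)

rgba = cmap(spec)       # (128, 256, 4) RGBA floats in [0, 1]
rgb = rgba[..., :3]     # drop the alpha channel to get an RGB image
```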
n_fft
The n_fft parameter determines the number of FFT (Fast Fourier Transform) points used in the spectrogram calculation. It affects the frequency resolution of the spectrogram, with higher values providing more detailed frequency information. The default value is 2048, and it should be a power of two for optimal performance.
hop_length
The hop_length parameter defines the number of audio samples between successive frames in the spectrogram. It influences the time resolution of the spectrogram, with smaller values providing finer time detail. The default value is 512, and it should be chosen based on the desired balance between time and frequency resolution.
n_mels
The n_mels parameter specifies the number of Mel bands to generate in the spectrogram. It determines the number of frequency bins in the Mel scale, which is a perceptual scale of pitches. The default value is 128, providing a good balance between detail and computational efficiency.
top_db
The top_db parameter sets the threshold for the dynamic range of the spectrogram in decibels. It clips the spectrogram to this range to enhance contrast and visibility of features. The default value is 80.0, which is suitable for most audio signals.
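How n_fft, hop_length, and top_db interact can be seen in a plain-PyTorch sketch of a power spectrogram with a clipped dynamic range. This is not the node's implementation: it omits the Mel projection (which would map the n_fft//2 + 1 frequency bins down to n_mels bands via a Mel filterbank) and uses an assumed 440 Hz test signal.

```python
import torch

n_fft, hop_length, top_db = 2048, 512, 80.0  # the documented defaults
sr = 44100
t = torch.arange(sr) / sr
wave = torch.sin(2 * torch.pi * 440 * t)     # 1 s of a 440 Hz tone

# STFT -> power spectrogram; rows are frequency bins, columns are frames.
window = torch.hann_window(n_fft)
stft = torch.stft(wave, n_fft=n_fft, hop_length=hop_length,
                  window=window, return_complex=True)
power = stft.abs() ** 2                      # (n_fft // 2 + 1, frames)

# Convert to decibels and clip the dynamic range to top_db below the peak.
db = 10.0 * torch.log10(power.clamp(min=1e-10))
db = db.clamp(min=db.max() - top_db)
```

A larger n_fft adds rows (finer frequency resolution), while a smaller hop_length adds columns (finer time resolution), which is the trade-off the parameter descriptions above refer to.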
(SP) Spectogram Output Parameters:
image
The image output parameter is a torch.Tensor representing the generated spectrogram image. This tensor is normalized to the range [0, 1] and includes a batch dimension, making it ready for further processing or visualization. The spectrogram image provides a visual representation of the audio's frequency content over time, which can be used for analysis, artistic purposes, or as input to other machine learning models.
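The described output convention (values normalized to [0, 1], with a leading batch dimension) can be reproduced like this. The raw dB image here is random stand-in data, and the min-max normalization is an assumption about how the scaling is done.

```python
import torch

# Stand-in for a raw dB spectrogram image, shape (H, W, 3), values in [-80, 0).
img = torch.rand(128, 87, 3) * 80.0 - 80.0

# Min-max normalize to [0, 1] and add a batch dimension.
norm = (img - img.min()) / (img.max() - img.min() + 1e-8)
image = norm.unsqueeze(0)   # (1, H, W, 3), ready for downstream nodes
```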
(SP) Spectogram Usage Tips:
- To achieve higher frequency resolution in your spectrogram, consider increasing the n_fft parameter, but be aware that this may reduce time resolution.
- Experiment with different color_map options to find a visual style that best highlights the features of your audio data, especially if you are using the spectrogram for artistic purposes.
(SP) Spectogram Common Errors and Solutions:
The 'waveform' key is missing or None in 'audio_input'.
- Explanation: This error occurs when the audio_input dictionary does not contain a valid "waveform" key or the value is None.
- Solution: Ensure that the audio_input dictionary includes a "waveform" key with a valid torch.Tensor representing the audio waveform.
Expected 'waveform' to be a torch.Tensor, got <type>.
- Explanation: This error indicates that the "waveform" key in audio_input is not of type torch.Tensor.
- Solution: Verify that the waveform data is correctly converted to a torch.Tensor before passing it to the node.
The 'sample_rate' key is missing or None in 'audio_input'.
- Explanation: This error occurs when the audio_input dictionary does not contain a valid "sample_rate" key or the value is None.
- Solution: Ensure that the audio_input dictionary includes a "sample_rate" key with an integer value representing the audio's sample rate.
Unexpected spectrogram shape: <shape>.
- Explanation: This error suggests that the generated spectrogram does not have the expected shape, possibly due to incorrect input dimensions or processing parameters.
- Solution: Check the dimensions of the input waveform and ensure that parameters such as n_fft, hop_length, and n_mels are set correctly.
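The first three errors above amount to input validation, which can be mirrored with a small hypothetical checker. The function name and messages below are illustrative, chosen to match the errors documented in this section; they are not the node's actual source.

```python
import torch

def validate_audio_input(audio_input: dict) -> None:
    """Raise the kinds of errors this node is documented to report."""
    waveform = audio_input.get("waveform")
    if waveform is None:
        raise ValueError("The 'waveform' key is missing or None in 'audio_input'.")
    if not isinstance(waveform, torch.Tensor):
        raise TypeError(
            f"Expected 'waveform' to be a torch.Tensor, got {type(waveform)}."
        )
    if audio_input.get("sample_rate") is None:
        raise ValueError("The 'sample_rate' key is missing or None in 'audio_input'.")

# A well-formed input passes silently.
validate_audio_input({"waveform": torch.zeros(1, 100), "sample_rate": 44100})
```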
