FLOAT Apply Audio Projection (VA):
The FloatApplyAudioProjection node transforms high-dimensional audio features into a compact form suitable for the motion latent space. As the final step in the audio pipeline, it applies a pre-loaded audio projection layer to the features extracted by the Wav2Vec model, reducing their dimensionality so they are compatible with downstream processes that expect a lower-dimensional representation. This step is crucial for synchronizing audio with motion data, because it puts the audio features into a form that motion models can consume directly. The node's output is the final audio conditioning tensor, wa_latent, which serves as the bridge between raw audio data and motion-related tasks.
FLOAT Apply Audio Projection (VA) Input Parameters:
wav2vec_features
The wav2vec_features parameter is a tensor containing the batch of interpolated feature tensors output by the Wav2Vec feature extraction node. This parameter represents the high-dimensional audio features that have been processed and interpolated to match the target video frames per second (FPS). The tensor is expected to have three dimensions, typically representing the batch size, number of frames, and feature dimension. The correct dimensionality is crucial for the projection layer to function properly, as it ensures that the features are aligned with the expected input size of the projection layer. There are no specific minimum, maximum, or default values for this parameter, but it must match the dimensionality expected by the projection layer.
projection_layer
The projection_layer parameter is an audio projection layer module, which is a neural network module responsible for transforming the high-dimensional audio features into a lower-dimensional space. This module is pre-loaded and should be compatible with the features provided by the wav2vec_features parameter. The projection layer processes the last dimension of the input tensor, effectively reducing its dimensionality to produce the final audio conditioning tensor. The correct configuration and compatibility of this module are essential for the successful execution of the node, as it directly impacts the quality and accuracy of the output.
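The relationship between the two inputs can be sketched as below. The dimensions (768 audio features, a 512-dimensional latent) and the use of a plain Linear layer are illustrative assumptions; in the actual workflow the projection layer is supplied pre-loaded by another node, not constructed by hand:

```python
import torch
import torch.nn as nn

# Hypothetical sizes: wav2vec2-base emits 768-dim features; 512 is an
# assumed motion-latent size chosen only for illustration.
batch_size, num_frames, audio_dim, latent_dim = 1, 50, 768, 512

# Stand-in for the pre-loaded projection layer. In the real workflow this
# module comes from the model loader node, already configured.
projection_layer = nn.Linear(audio_dim, latent_dim)

# Batch of interpolated Wav2Vec features: (batch, frames, features).
wav2vec_features = torch.randn(batch_size, num_frames, audio_dim)

with torch.no_grad():
    # The projection acts on the last dimension only, so the batch and
    # frame dimensions pass through unchanged.
    wa_latent = projection_layer(wav2vec_features)

print(wa_latent.shape)  # torch.Size([1, 50, 512])
```

Because only the last dimension is transformed, the number of frames set during FPS interpolation is preserved, which is what keeps the resulting wa_latent aligned frame-for-frame with the target video.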
FLOAT Apply Audio Projection (VA) Output Parameters:
wa_latent
The wa_latent parameter is the output tensor produced by the FloatApplyAudioProjection node. It represents the final audio conditioning tensor, which is a lower-dimensional representation of the original high-dimensional audio features. This tensor is crucial for applications that require the integration of audio data with motion models, as it provides a compact and efficient representation of the audio features. The wa_latent tensor is typically used in subsequent processes that involve synchronizing audio with motion data, ensuring that the audio features are in a format that can be easily utilized by motion-related tasks.
FLOAT Apply Audio Projection (VA) Usage Tips:
- Ensure that the wav2vec_features tensor has the correct dimensionality and matches the expected input size of the projection_layer to avoid errors during execution.
- Verify that the projection_layer is properly configured and compatible with the features extracted from the Wav2Vec model to ensure accurate and efficient transformation of audio features.
FLOAT Apply Audio Projection (VA) Common Errors and Solutions:
Input 'wav2vec_features' must be a torch.Tensor.
- Explanation: This error occurs when the wav2vec_features input is not provided as a PyTorch tensor.
- Solution: Ensure that the input is a valid PyTorch tensor with the correct dimensions before passing it to the node.
Input 'projection_layer' must be a torch.nn.Module.
- Explanation: This error indicates that the projection_layer input is not a valid neural network module.
- Solution: Verify that the projection_layer is a properly loaded and configured neural network module compatible with the node's requirements.
Input 'wav2vec_features' must contain 3 dimensions
- Explanation: This error arises when the wav2vec_features tensor does not have the expected three dimensions.
- Solution: Check the dimensionality of the wav2vec_features tensor and ensure it matches the expected format of (batch size, number of frames, feature dimension).
Input 'wav2vec_features' wrong size has <actual_size>, expected <expected_size>. only_last_features mismatch?
- Explanation: This error occurs when the feature dimension of the wav2vec_features tensor does not match the expected input size of the projection_layer.
- Solution: Confirm that the feature dimension of the wav2vec_features tensor aligns with the expected input size of the projection_layer, and adjust the configuration if necessary.
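The checks behind these error messages can be sketched as follows. This is an illustrative reconstruction, not the node's actual implementation; in particular, reading the expected feature size from an in_features attribute assumes a Linear-style projection layer:

```python
import torch
import torch.nn as nn

def apply_audio_projection(wav2vec_features, projection_layer):
    """Sketch of the node's input validation and projection step."""
    if not isinstance(wav2vec_features, torch.Tensor):
        raise TypeError("Input 'wav2vec_features' must be a torch.Tensor.")
    if not isinstance(projection_layer, nn.Module):
        raise TypeError("Input 'projection_layer' must be a torch.nn.Module.")
    if wav2vec_features.dim() != 3:
        raise ValueError("Input 'wav2vec_features' must contain 3 dimensions")

    # Assumption: a Linear-like layer exposing in_features. The real node
    # may determine the expected size differently.
    expected = projection_layer.in_features
    actual = wav2vec_features.shape[-1]
    if actual != expected:
        raise ValueError(
            f"Input 'wav2vec_features' wrong size has {actual}, "
            f"expected {expected}. only_last_features mismatch?"
        )

    with torch.no_grad():
        return projection_layer(wav2vec_features)
```

Running these checks on your own tensors before wiring them into the node is a quick way to reproduce and diagnose each of the errors listed above.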
