
ComfyUI Node: FLOAT Extract Emotion (Dynamic) (VA)

Class Name: FloatExtractEmotionWithCustomModelDyn
Category: FLOAT/Very Advanced
Author: set-soft (account age: 3450 days)
Extension: ComfyUI-FLOAT_Optimized
Last Updated: 2026-03-20
GitHub Stars: 0.03K

How to Install ComfyUI-FLOAT_Optimized

Install this extension via the ComfyUI Manager by searching for ComfyUI-FLOAT_Optimized:
  • 1. Click the Manager button in the main menu.
  • 2. Select the Custom Nodes Manager button.
  • 3. Enter ComfyUI-FLOAT_Optimized in the search bar.
After installation, click the Restart button to restart ComfyUI, then manually refresh your browser to clear the cache and load the updated list of nodes.


FLOAT Extract Emotion (Dynamic) (VA) Description

Extracts time-varying emotion vectors from audio, capturing how the emotional content changes over the course of a clip.

FLOAT Extract Emotion (Dynamic) (VA):

The FloatExtractEmotionWithCustomModelDyn node extracts emotions from audio features over time, producing a sequence of emotion vectors that can vary across the duration of a clip. This is useful when the emotional content of audio is not static, as in expressive speech or music. By processing the audio in chunks and predicting an emotion for each segment, the node builds a nuanced, time-varying emotional representation. This dynamic approach enables more sophisticated emotion recognition and can enrich applications such as interactive media, virtual reality, and AI-driven storytelling with a more detailed emotional context.
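The chunked, per-segment prediction described above can be sketched roughly as follows. This is a minimal illustration, not the extension's actual code: NumPy stands in for the torch tensors the node really uses, and predict_emotion is a toy stand-in for the loaded emotion model.

```python
import numpy as np

# Toy stand-in for the emotion classifier; the real node runs the
# torch model loaded via emotion_model_pipe.
def predict_emotion(chunk: np.ndarray, num_emotions: int = 7) -> np.ndarray:
    """Return a probability vector over emotions for one audio chunk."""
    logits = chunk.mean(axis=0)[:num_emotions]   # pool frames, take toy logits
    exp = np.exp(logits - logits.max())
    return exp / exp.sum()                       # softmax over emotions

def extract_dynamic_emotions(features: np.ndarray, chunk_len: int) -> np.ndarray:
    """Split [T, D] features into chunks and predict one emotion vector each."""
    chunks = [features[i:i + chunk_len]
              for i in range(0, len(features), chunk_len)]
    return np.stack([predict_emotion(c) for c in chunks])  # [num_chunks, E]

features = np.random.rand(100, 16)               # 100 frames, 16-dim features
we = extract_dynamic_emotions(features, chunk_len=25)
print(we.shape)                                  # (4, 7): one vector per chunk
```

The resulting stack of per-chunk vectors is what makes the output "dynamic": each row describes the emotion of one segment rather than a single average over the whole clip.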

FLOAT Extract Emotion (Dynamic) (VA) Input Parameters:

processed_audio_features

This parameter represents a batch of preprocessed audio features, typically output by a feature extractor like FloatAudioPreprocessAndFeatureExtract. It is a TORCH_TENSOR that contains the audio data after it has been processed to a suitable format for emotion recognition. The quality and accuracy of the emotion extraction depend significantly on the quality of these features, as they serve as the primary input for the emotion model.

emotion_model_pipe

This parameter is a tuple containing the loaded emotion recognition model pipeline, which includes the emotion model itself, a reference to the feature extractor used for the emotion model, and its configuration. It is crucial for the node's operation as it defines the model that will be used to predict emotions from the audio features. The model's accuracy and configuration will directly impact the results of the emotion extraction process.
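The three-element layout described above might be modeled as below. The field names and order are illustrative assumptions based on this description, not the extension's actual types:

```python
from typing import Any, NamedTuple

# Hypothetical pipe layout; the real tuple's exact order and contents
# come from the loader node and may differ.
class EmotionModelPipe(NamedTuple):
    model: Any               # the emotion classifier
    feature_extractor: Any   # preprocessor matching the model
    config: dict             # e.g. label set, sampling rate

pipe = EmotionModelPipe(model=object(), feature_extractor=object(),
                        config={"labels": ["angry", "happy", "neutral", "sad"]})
model, extractor, cfg = pipe     # plain tuple-style unpacking still works
print(cfg["labels"][1])          # happy
```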

emotion

This parameter allows you to specify a particular emotion or choose 'none' to let the model predict the emotion from the audio features. It offers options from a predefined set of emotions, with 'none' as the default. Selecting a specific emotion will generate a one-hot encoded tensor for that emotion, while choosing 'none' will enable the model to dynamically predict the emotion based on the input features. This flexibility allows for both targeted emotion extraction and more general emotion recognition.
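The one-hot-versus-predict behavior can be illustrated like this. The label list and helper function are hypothetical; the real node uses the model's own emotion set and torch tensors:

```python
from typing import Optional
import numpy as np

# Illustrative label set; the actual list is defined by the loaded model.
EMOTIONS = ["angry", "disgust", "fear", "happy", "neutral", "sad", "surprise"]

def emotion_tensor(choice: str, num_frames: int) -> Optional[np.ndarray]:
    """One-hot vector repeated per frame, or None to fall back to prediction."""
    if choice == "none" or choice not in EMOTIONS:
        if choice != "none":
            print(f"Failed to map '{choice}'. Predicting emotion from audio features.")
        return None                              # caller lets the model predict
    one_hot = np.zeros(len(EMOTIONS))
    one_hot[EMOTIONS.index(choice)] = 1.0
    return np.tile(one_hot, (num_frames, 1))     # same emotion for every frame

print(emotion_tensor("happy", 4).shape)          # (4, 7)
print(emotion_tensor("none", 4))                 # None
```

Returning None here models the fallback path: when 'none' is selected (or the name cannot be mapped), the emotion is predicted dynamically from the audio features instead of being fixed.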

FLOAT Extract Emotion (Dynamic) (VA) Output Parameters:

we_latent

The we_latent output is a TORCH_TENSOR that represents the dynamic, time-varying emotion latent vectors extracted from the audio features. This output provides a sequence of emotion vectors that correspond to different segments of the audio, capturing the emotional dynamics over time. It is essential for applications that require a detailed emotional analysis of audio content, as it allows for the representation of changing emotions throughout the clip.

emotion_model_pipe_out

This output is the EMOTION_MODEL_PIPE, the same object as the input emotion_model_pipe, passed through unchanged. It keeps the emotion model pipeline available for downstream nodes, maintaining continuity in workflows with multiple stages of emotion processing.

FLOAT Extract Emotion (Dynamic) (VA) Usage Tips:

  • Ensure that the processed_audio_features are correctly preprocessed and compatible with the emotion model to achieve accurate emotion predictions.
  • When using the node for dynamic emotion extraction, consider the length and segmentation of the audio to optimize the granularity of emotion changes captured.

FLOAT Extract Emotion (Dynamic) (VA) Common Errors and Solutions:

Failed to map '<emotion>'. Predicting emotion from audio features.

  • Explanation: This error occurs when a specified emotion cannot be mapped to a valid index in the model's emotion set.
  • Solution: Verify that the specified emotion is correctly spelled and available in the model's emotion set. If unsure, use 'none' to allow the model to predict the emotion automatically.

Predicting emotion from audio features using custom model.

  • Explanation: This message indicates that the node is defaulting to predicting emotions from the audio features because 'none' was selected or the specified emotion was not valid.
  • Solution: If this behavior is unintended, ensure that the correct emotion is specified and that it is supported by the model. Otherwise, this is the expected behavior when 'none' is selected.
