
ComfyUI Node: StepAudioEditX - Edit ✏️

Class Name

StepAudio_AudioEdit

Category
audio/step_audio
Author
saganaki22 (Account age: 1683 days)
Extension
Step Audio EditX TTS
Latest Updated
2025-12-04
Github Stars
0.05K

How to Install Step Audio EditX TTS

Install this extension via the ComfyUI Manager by searching for Step Audio EditX TTS:
  1. Click the Manager button in the main menu
  2. Select the Custom Nodes Manager button
  3. Enter Step Audio EditX TTS in the search bar
After installation, click the Restart button to restart ComfyUI. Then, manually refresh your browser to clear the cache and access the updated list of nodes.


StepAudioEditX - Edit ✏️ Description

StepAudio_AudioEdit edits audio by changing its emotion, style, or speed, adding paralinguistic effects, or denoising it. The node is implemented in Python.

StepAudioEditX - Edit ✏️:

The StepAudio_AudioEdit node edits existing audio by changing its emotion, style, or speed, adding paralinguistic effects, or denoising it. It is a native ComfyUI implementation written entirely in Python and requires no JavaScript. Its purpose is to change how a recording sounds while preserving the original voice identity and spoken content, which suits AI artists who want to experiment with different audio styles and effects. The node supports iterative editing, so an edit can be refined over several passes.

StepAudioEditX - Edit ✏️ Input Parameters:

audio_text

This parameter represents the textual content of the audio that you want to edit. It serves as a reference for the AI to understand the context and content of the audio, ensuring that the modifications align with the intended message or narrative.

edit_type

The type of edit you wish to apply to the audio. Options include emotion, style, speed, and paralinguistic. Each type focuses on a different aspect of the audio, allowing for targeted modifications that can dramatically alter the audio's presentation and impact.
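
As a rough illustration of how edit_type could select which of the other inputs matters, the sketch below returns the single value that matches the chosen edit type. It assumes that only the input named by edit_type is used for a given edit. The helper name `select_edit_info` and its layout are hypothetical and not taken from the node's source; only the parameter names come from this page.

```python
# Hypothetical helper: not part of the Step Audio EditX TTS extension.
EDIT_TYPES = ("emotion", "style", "speed", "paralinguistic")  # options listed above


def select_edit_info(edit_type, emotion=None, style=None, speed=None, paralinguistic=None):
    """Return the one setting that corresponds to the chosen edit_type."""
    if edit_type not in EDIT_TYPES:
        raise ValueError(f"unknown edit_type: {edit_type!r}")
    values = {
        "emotion": emotion,
        "style": style,
        "speed": speed,
        "paralinguistic": paralinguistic,
    }
    chosen = values[edit_type]
    if chosen is None:
        raise ValueError(f"edit_type {edit_type!r} was selected but no value was given for it")
    return chosen


# Both emotion and style are set, but only the one named by edit_type is returned.
print(select_edit_info("emotion", emotion="happy", style="formal"))
```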

emotion

Specifies the emotional tone you want to infuse into the audio. This can range from happy to sad, angry to calm, and more. The emotion parameter helps in setting the mood and emotional context of the audio, making it more engaging and relatable.

style

Defines the stylistic approach for the audio. This could include different genres or artistic styles, such as formal, casual, or narrative. The style parameter allows you to tailor the audio to fit specific themes or audiences.

speed

Adjusts the playback speed of the audio. This can be used to make the audio faster or slower, depending on the desired effect. Speed adjustments can impact the pacing and energy of the audio, influencing how it is perceived by listeners.

paralinguistic

This parameter allows you to add paralinguistic effects, which are non-verbal elements that convey meaning, such as intonation, pitch, and stress. These effects can enhance the expressiveness and clarity of the audio.

denoising

A feature that reduces background noise and enhances the clarity of the audio. Denoising is crucial for improving audio quality, especially in recordings with unwanted ambient sounds.

paralinguistic_text

Text that specifies additional paralinguistic effects to be applied. This parameter is used to fine-tune the non-verbal elements of the audio, ensuring they align with the intended message.

n_edit_iterations

The number of iterations for the editing process. More iterations can lead to more refined results, as the AI has more opportunities to adjust and improve the audio.
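
One way to picture iterative editing is as a loop in which each pass receives the output of the previous pass. The sketch below assumes that behaviour; `run_iterations` and `fake_edit` are hypothetical stand-ins, and a real pass would call the model instead of appending a marker.

```python
def run_iterations(audio, edit_once, n_edit_iterations):
    """Apply edit_once repeatedly, feeding each result into the next pass."""
    if n_edit_iterations < 1:
        raise ValueError("n_edit_iterations must be at least 1")
    for _ in range(n_edit_iterations):
        audio = edit_once(audio)
    return audio


def fake_edit(audio):
    """Stand-in for one model pass: it only records that an edit happened."""
    return audio + ["edited"]


print(run_iterations(["source"], fake_edit, 3))
```

Because every pass builds on the last, each extra iteration also adds roughly one more model pass of processing time.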

model_path

The file path to the AI model used for editing. This parameter is essential for loading the correct model that will perform the audio modifications.

device

Specifies the hardware device to be used for processing, such as cpu or cuda. This parameter helps in optimizing performance based on the available hardware resources.

torch_dtype

Defines the data type for PyTorch operations, which can impact the precision and performance of the model. Common options include float32 and float16.

quantization

Reduces the model's memory footprint by storing its weights at lower precision. This can help the model fit on GPUs with less VRAM, though lower precision may cost some output quality.
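
To get a feel for how torch_dtype and quantization affect memory, the following sketch estimates the space needed for the weights alone at several precisions. The precision names and the parameter count are illustrative assumptions, not the node's actual option list or the real size of the model, and the estimate ignores activations and other runtime overhead.

```python
# Bytes needed to store one weight at each precision.
BYTES_PER_WEIGHT = {
    "float32": 4.0,
    "float16": 2.0,
    "bfloat16": 2.0,
    "int8": 1.0,
    "int4": 0.5,
}


def weight_memory_gib(num_parameters, precision):
    """Approximate memory for the weights alone, in GiB."""
    return num_parameters * BYTES_PER_WEIGHT[precision] / (1024 ** 3)


# 3 billion parameters is an illustrative figure, not a claim about this model.
for precision in ("float32", "float16", "int8", "int4"):
    print(f"{precision}: {weight_memory_gib(3_000_000_000, precision):.2f} GiB")
```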

attention_mechanism

Specifies the attention mechanism to be used in the model, which can affect how the model focuses on different parts of the audio during editing.

temperature

A parameter that controls the randomness of the model's output. Lower values make the output more deterministic, while higher values introduce more variability.
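
The effect of temperature can be sketched with a temperature-scaled softmax, a common way to apply this setting in generative models. Whether this node applies it in exactly this way is an assumption, and the scores below are made up.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities, sharpened or flattened by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.5]  # made-up scores for three candidate tokens
for t in (0.5, 1.0, 2.0):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

At lower temperatures the highest-scoring candidate takes a larger share of the probability, and at higher temperatures the distribution flattens.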

do_sample

A boolean parameter that determines whether sampling is used during the editing process. Sampling can introduce variability and creativity in the output.

max_new_tokens

The maximum number of new tokens the model may generate during the editing process. This caps the length of the generated output, so a value that is too low may cut the result short.

seed

A random seed for reproducibility. With the same inputs and settings, the same seed should produce the same output across runs.
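
The interaction between do_sample and seed can be shown with a toy generator: greedy selection ignores randomness entirely, while sampling repeats exactly only when the seed repeats. This is a sketch using Python's standard `random` module with a made-up distribution; the function names are hypothetical and the node's own sampling code may differ.

```python
import random


def pick_token(probabilities, do_sample, rng):
    """Greedy pick when do_sample is False, otherwise a weighted random draw."""
    indices = range(len(probabilities))
    if not do_sample:
        return max(indices, key=probabilities.__getitem__)
    return rng.choices(indices, weights=probabilities, k=1)[0]


def generate(seed, do_sample, steps=8):
    """Draw a short sequence of token indices from a fixed toy distribution."""
    rng = random.Random(seed)  # a private generator, seeded once per run
    probabilities = [0.5, 0.3, 0.2]  # made-up distribution
    return [pick_token(probabilities, do_sample, rng) for _ in range(steps)]


# Same seed, same settings: the two sampled sequences match.
print(generate(seed=42, do_sample=True) == generate(seed=42, do_sample=True))
# With sampling off, the most likely index is chosen every time.
print(generate(seed=1, do_sample=False))
```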

keep_model_in_vram

A boolean parameter that determines whether the model should be kept in VRAM between iterations. This can improve performance by reducing loading times.

input_audio

The source audio file to be edited. The audio should be between 0.5 and 30 seconds long. This parameter provides the base content for the editing process.
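
A clip's length can be checked before it reaches the node by dividing its sample count by its sample rate. The sketch below applies the 0.5 to 30 second range stated above; the function names are hypothetical, and it assumes you can read the sample count and sample rate from your audio (for example from the last dimension of the waveform and the sample_rate field of a ComfyUI audio input).

```python
MIN_SECONDS = 0.5   # lower bound stated above
MAX_SECONDS = 30.0  # upper bound stated above


def duration_seconds(num_samples, sample_rate):
    """Length of a clip in seconds."""
    if sample_rate <= 0:
        raise ValueError("sample_rate must be positive")
    return num_samples / sample_rate


def is_supported_length(num_samples, sample_rate):
    """True when the clip falls inside the documented duration range."""
    return MIN_SECONDS <= duration_seconds(num_samples, sample_rate) <= MAX_SECONDS


print(is_supported_length(24_000 * 10, 24_000))  # a 10 second clip
print(is_supported_length(24_000 * 45, 24_000))  # a 45 second clip
```

Clips longer than the upper bound would need to be trimmed or split before editing.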

StepAudioEditX - Edit ✏️ Output Parameters:

audio

The edited audio file. This output contains the modified version of the input audio, reflecting the applied edits such as changes in emotion, style, speed, and other effects. The output is designed to maintain the original voice identity and content while incorporating the desired modifications, resulting in a polished and enhanced audio experience.

StepAudioEditX - Edit ✏️ Usage Tips:

  • Ensure that the input audio is clear and free from excessive background noise to achieve the best results with the denoising feature.
  • Experiment with different combinations of emotion, style, and speed to discover unique audio presentations that suit your creative projects.
  • Use the n_edit_iterations parameter to refine the audio output progressively, especially for complex edits that require subtle adjustments.

StepAudioEditX - Edit ✏️ Common Errors and Solutions:

Step Audio not available: <error_msg>

  • Explanation: This error occurs when the Step Audio installation is not detected or is incomplete.
  • Solution: Verify that the Step Audio package is correctly installed and accessible. Reinstall the package if necessary and ensure all dependencies are met.

Model not found: <model_path>

  • Explanation: The specified model path is incorrect or the model file is missing.
  • Solution: Check the model path for typos or errors. Ensure that the model file exists at the specified location and is accessible by the node.
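
A quick way to rule out a bad path is to test it outside ComfyUI first. This minimal sketch uses only `pathlib`; `check_model_path` is a hypothetical helper, and its error text merely mirrors the message above rather than reproducing the node's own check.

```python
import tempfile
from pathlib import Path


def check_model_path(model_path):
    """Return the resolved path, or raise with a message naming the missing path."""
    path = Path(model_path).expanduser()
    if not path.exists():
        raise FileNotFoundError(f"Model not found: {path}")
    return path.resolve()


# A temporary directory stands in for a real model location.
with tempfile.TemporaryDirectory() as tmp:
    print(check_model_path(tmp).exists())
```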

Auto-appending '<edit_info>' to end of audio: '<audio_text>'

  • Explanation: This message indicates that the paralinguistic effect is being automatically appended to the audio text.
  • Solution: Ensure that the paralinguistic text is correctly specified if you want to customize this effect. Otherwise, the default behavior will apply.
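
One plausible reading of this message is that the effect is appended only when the text does not already say where it belongs. The sketch below illustrates that reading; `ensure_tag` is a hypothetical helper, the `[Laughter]` tag is an illustrative placeholder, and the node's actual logic may differ.

```python
def ensure_tag(audio_text, tag):
    """Append tag to audio_text unless the text already contains it."""
    if tag in audio_text:
        return audio_text
    print(f"Auto-appending '{tag}' to end of audio: '{audio_text}'")
    return f"{audio_text} {tag}"


# No tag in the text: it is appended and the message is printed.
print(ensure_tag("That was unexpected.", "[Laughter]"))
# Tag already placed by the user: the text is returned unchanged.
print(ensure_tag("That was [Laughter] unexpected.", "[Laughter]"))
```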
