RunComfy

ComfyUI Extension: ComfyUI_Step_Audio_EditX_SM

Repo Name: ComfyUI_Step_Audio_EditX_SM
Author: smthemex (Account age: 901 days)
Nodes: 2
Latest Updated: 2025-11-15
GitHub Stars: 0.02K

Table of Contents

Description
ComfyUI_Step_Audio_EditX_SM Introduction
How ComfyUI_Step_Audio_EditX_SM Works
ComfyUI_Step_Audio_EditX_SM Features
ComfyUI_Step_Audio_EditX_SM Models
What's New with ComfyUI_Step_Audio_EditX_SM
Troubleshooting ComfyUI_Step_Audio_EditX_SM
Learn More about ComfyUI_Step_Audio_EditX_SM
Related Nodes

How to Install ComfyUI_Step_Audio_EditX_SM

Install this extension through the ComfyUI Manager:

1. Click the Manager button in the main menu.
2. Select the Custom Nodes Manager button.
3. Enter ComfyUI_Step_Audio_EditX_SM in the search bar and install the matching result.

After installation, click the Restart button to restart ComfyUI. Then, manually refresh your browser to clear the cache and access the updated list of nodes.

ComfyUI_Step_Audio_EditX_SM Description

ComfyUI_Step_Audio_EditX_SM is a ComfyUI extension for Step-Audio-EditX, an open-source LLM-based audio model designed for expressive and iterative audio editing. The model focuses on emotion, speaking style, and paralinguistics, and it also offers strong zero-shot text-to-speech capabilities.

ComfyUI_Step_Audio_EditX_SM Introduction

ComfyUI_Step_Audio_EditX_SM brings AI-based audio editing to ComfyUI. The extension is built on the Step-Audio-EditX model, described as the first open-source audio model based on a large language model (LLM) that excels at expressive and iterative audio editing. It lets you modify audio by adjusting emotions, speaking styles, and paralinguistic features, and it also offers robust zero-shot text-to-speech (TTS). Typical uses include cloning voices, editing audio styles, and creating expressive audio content.

How ComfyUI_Step_Audio_EditX_SM Works

At its core, ComfyUI_Step_Audio_EditX_SM leverages a sophisticated audio model that uses reinforcement learning to process and edit audio files. The model works by converting audio into discrete tokens using a dual-codebook audio tokenizer. These tokens are then processed by an audio LLM, which generates new token sequences based on the desired edits. Finally, an audio decoder converts these sequences back into audio waveforms. This process allows for precise control over various audio attributes, enabling users to iteratively refine the emotional tone, speaking style, and paralinguistic elements of their audio projects.
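The data flow described above (tokenize, edit tokens, decode) can be sketched with a toy example. This is only an illustration of the three stages, not the model's actual code: the two-index quantizer below merely stands in for the dual-codebook tokenizer, the `reverse_editor` function stands in for the audio LLM, and every name and codebook size here is hypothetical.

```python
from typing import Callable, List, Tuple

Token = Tuple[int, int]  # (coarse index, fine index): one index per toy codebook

COARSE_LEVELS = 16  # hypothetical size of the first toy codebook
FINE_LEVELS = 16    # hypothetical size of the second toy codebook
TOTAL_LEVELS = COARSE_LEVELS * FINE_LEVELS


def tokenize(samples: List[float]) -> List[Token]:
    """Stage 1: map each sample in [-1, 1] to a (coarse, fine) index pair."""
    tokens = []
    for s in samples:
        clipped = max(-1.0, min(1.0, s))
        level = min(TOTAL_LEVELS - 1, int((clipped + 1.0) / 2.0 * TOTAL_LEVELS))
        tokens.append(divmod(level, FINE_LEVELS))
    return tokens


def decode(tokens: List[Token]) -> List[float]:
    """Stage 3: map index pairs back to samples (centre of each quantization bin)."""
    return [
        ((coarse * FINE_LEVELS + fine + 0.5) / TOTAL_LEVELS) * 2.0 - 1.0
        for coarse, fine in tokens
    ]


def edit_audio(
    samples: List[float],
    token_editor: Callable[[List[Token]], List[Token]],
) -> List[float]:
    """Run the full pipeline: tokenize, let the editor rewrite tokens, decode."""
    return decode(token_editor(tokenize(samples)))


def reverse_editor(tokens: List[Token]) -> List[Token]:
    """Stage 2 stand-in for the audio LLM: it only reverses the sequence."""
    return list(reversed(tokens))


if __name__ == "__main__":
    print(edit_audio([-1.0, 0.0, 0.5, 1.0], reverse_editor))
```

The point of the sketch is that all editing happens on the discrete token sequence, which is what lets a language model act as the editor between the tokenizer and the decoder.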

ComfyUI_Step_Audio_EditX_SM Features

Zero-Shot TTS: Effortlessly clone voices in multiple languages, including Mandarin, English, Sichuanese, and Cantonese. Simply add language tags like [Sichuanese] or [Cantonese] to your text to switch languages.
Emotion and Speaking Style Editing: Modify audio to express a wide range of emotions (e.g., happy, sad, angry) and speaking styles (e.g., whisper, serious, childlike). This feature supports iterative editing, allowing for gradual refinement of the audio's emotional and stylistic qualities.
Paralinguistic Editing: Add natural, human-like expressions to your audio with tags for breathing, laughter, surprise, and more. This feature enhances the expressiveness and realism of synthetic audio.
Customizable Settings: Adjust the audio normalization peak value to set the output level, and adjust the LLM temperature to make the model's output more creative or more conservative.
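As a small illustration of the language tags mentioned in the feature list, the helper below prepends a tag such as [Sichuanese] to the input text. It is a sketch under assumptions: the function name is hypothetical, only the two tags named above are included, and placing the tag at the very start of the text with no separator is a guess rather than documented behavior.

```python
from typing import Optional

# Only the tags named in the feature list; other tags may exist.
KNOWN_LANGUAGE_TAGS = {"Sichuanese", "Cantonese"}


def with_language_tag(text: str, language: Optional[str] = None) -> str:
    """Return text prefixed with a [Language] tag, or unchanged if none is given.

    Assumption: the tag goes at the start of the text, directly before it.
    """
    if language is None:
        return text
    if language not in KNOWN_LANGUAGE_TAGS:
        raise ValueError(f"unknown language tag: {language!r}")
    return f"[{language}]{text}"


if __name__ == "__main__":
    print(with_language_tag("今天天气很好。", "Cantonese"))
```

Validating the tag before sending text to the node avoids silently falling back to the default language because of a typo.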

ComfyUI_Step_Audio_EditX_SM Models

The extension utilizes the Step-Audio-EditX model, which is available for download from platforms like Hugging Face and ModelScope. Additionally, the Step-Audio-Tokenizer is used to process audio tokens, available from the same sources. These models are essential for the extension's functionality, providing the necessary data and algorithms to perform advanced audio editing tasks.
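One possible way to fetch both models from Hugging Face is the `huggingface_hub` package, sketched below. The repository ids are assumptions and should be confirmed on the model pages, and the target folder is left as a parameter because this page does not say where the extension expects the files.

```python
from pathlib import Path
from typing import Dict

# Assumed Hugging Face repo ids; confirm them on the model pages before use.
MODEL_REPOS = {
    "Step-Audio-EditX": "stepfun-ai/Step-Audio-EditX",
    "Step-Audio-Tokenizer": "stepfun-ai/Step-Audio-Tokenizer",
}


def plan_downloads(target_root: str) -> Dict[str, Path]:
    """Map each repo id to the local folder it would be downloaded into."""
    root = Path(target_root)
    return {repo_id: root / name for name, repo_id in MODEL_REPOS.items()}


def download_models(target_root: str) -> None:
    """Download every repo in the plan (needs network access)."""
    # Imported here so the planning helper works without the package installed.
    from huggingface_hub import snapshot_download  # pip install huggingface_hub

    for repo_id, local_dir in plan_downloads(target_root).items():
        snapshot_download(repo_id=repo_id, local_dir=str(local_dir))


if __name__ == "__main__":
    # Hypothetical target folder; adjust to wherever the extension looks for models.
    for repo_id, folder in plan_downloads("models").items():
        print(repo_id, "->", folder)
```

Calling `download_models("models")` would perform the actual download; the planning step is kept separate so the folder layout can be checked first.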

What's New with ComfyUI_Step_Audio_EditX_SM

Recent updates to the extension have introduced several enhancements:

Externalized Audio Normalization and Temperature Settings: Users can now adjust the audio normalization peak value and LLM temperature externally, providing greater control over the audio editing process.
Improved Paralinguistic Mode: New prompts allow for more nuanced paralinguistic editing, enhancing the expressiveness of audio outputs.
Expanded Language Support: The model now supports Japanese and Korean, broadening the range of languages available for zero-shot TTS.
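To show what an LLM temperature setting usually does, the sketch below applies temperature to a softmax over three made-up logits. It assumes the extension's temperature behaves like standard sampling temperature, which this page does not confirm; the function name and the logits are illustrative only.

```python
import math
from typing import List


def softmax_with_temperature(logits: List[float], temperature: float) -> List[float]:
    """Turn logits into probabilities after dividing them by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    largest = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(x - largest) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    logits = [2.0, 1.0, 0.0]  # made-up scores for three candidate tokens
    for t in (0.5, 1.0, 2.0):
        probs = softmax_with_temperature(logits, t)
        print(t, [round(p, 3) for p in probs])
```

Lower temperatures concentrate probability on the highest-scoring token, which gives more conservative output, while higher temperatures flatten the distribution and give more varied output.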

Troubleshooting ComfyUI_Step_Audio_EditX_SM

If you encounter issues while using the extension, consider the following solutions:

Audio Clipping: If your audio exceeds the normalization peak value, adjust the max_amplitude setting to prevent clipping.
Model Performance: Ensure that your system meets the GPU memory requirements (at least 12GB) for optimal performance. If you experience memory issues, consider using the 'offload' option for systems with less than 16GB of VRAM.
Editing Iterations: For best results, set the number of editing iterations (n_edit_iter) to 2 or 3, as this typically yields high-quality outputs.
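The clipping advice above can be illustrated with a minimal peak limiter. This is a sketch of one common approach, not the extension's implementation: it assumes samples are floats where 1.0 is full scale, and that audio is only scaled down when its peak exceeds `max_amplitude`. The function name is hypothetical.

```python
from typing import List


def normalize_peak(samples: List[float], max_amplitude: float) -> List[float]:
    """Scale samples down so that the largest absolute value is max_amplitude.

    Assumption: audio that is already below the peak value is left untouched.
    """
    if not 0.0 < max_amplitude <= 1.0:
        raise ValueError("max_amplitude must be in (0, 1]")
    peak = max((abs(s) for s in samples), default=0.0)
    if peak <= max_amplitude:
        return list(samples)
    gain = max_amplitude / peak
    return [s * gain for s in samples]


if __name__ == "__main__":
    loud = [0.5, -2.0, 1.0]  # the -2.0 sample would clip at full scale
    print(normalize_peak(loud, 0.8))
```

Because every sample is multiplied by the same gain, the waveform keeps its shape and only the overall level changes.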

Learn More about ComfyUI_Step_Audio_EditX_SM

To further explore the capabilities of ComfyUI_Step_Audio_EditX_SM, you can access additional resources and community support:

Demo Page: Try out the model's features in an interactive web demo.
Technical Report: Gain deeper insights into the model's architecture and capabilities.
Community Forums: Join discussions and seek advice from other AI artists and developers in the GitHub Discussions section.

By leveraging these resources, you can maximize your creative potential with ComfyUI_Step_Audio_EditX_SM and stay informed about the latest developments in audio editing technology.

ComfyUI_Step_Audio_EditX_SM Related Nodes

Step_Audio_EditX_SM_KSampler

Step_Audio_EditX_SM_Model
