RunComfy

ComfyUI Extension: Step Audio EditX TTS

Repo Name: ComfyUI-Step_Audio_EditX_TTS
Author: saganaki22 (Account age: 1683 days)
Nodes: 2
Latest Updated: 2025-12-04
Github Stars: 0.05K

Table of Contents

Description
ComfyUI-Step_Audio_EditX_TTS Introduction
How ComfyUI-Step_Audio_EditX_TTS Works
ComfyUI-Step_Audio_EditX_TTS Features
ComfyUI-Step_Audio_EditX_TTS Models
Troubleshooting ComfyUI-Step_Audio_EditX_TTS
Learn More about ComfyUI-Step_Audio_EditX_TTS
Related Nodes

How to Install Step Audio EditX TTS

Install this extension via the ComfyUI Manager by searching for Step Audio EditX TTS:

1. Click the Manager button in the main menu.
2. Click the Custom Nodes Manager button.
3. Enter Step Audio EditX TTS in the search bar, then install the extension from the results.

After installation, click the Restart button to restart ComfyUI. Then, manually refresh your browser to clear the cache and access the updated list of nodes.

Step Audio EditX TTS Description

Step Audio EditX TTS is a voice cloning and audio editing extension for ComfyUI that provides text-to-speech and audio manipulation nodes.

ComfyUI-Step_Audio_EditX_TTS Introduction

ComfyUI-Step_Audio_EditX_TTS is an extension that adds zero-shot voice cloning and audio editing to ComfyUI. It is aimed at AI artists who need voiceovers for their projects and at developers who want to build audio manipulation into their applications. Its editing features cover emotion and style, speed control, and paralinguistic effects, so you can reshape existing audio rather than only generate new speech.

How ComfyUI-Step_Audio_EditX_TTS Works

ComfyUI-Step_Audio_EditX_TTS uses machine learning models to analyze and manipulate audio. Its workflow is modular: voice cloning and audio editing are handled by separate nodes. Given a short reference audio clip, the clone node reproduces the voice and generates new speech in that voice from any text you provide. The edit node adjusts the emotional tone, speaking style, and speed of the audio, and can add effects such as laughter or breathing. You connect and configure these nodes in the ComfyUI interface to build audio workflows without writing code.

ComfyUI-Step_Audio_EditX_TTS Features

Zero-Shot Voice Cloning: Clone any voice using just a 3-30 second audio sample. This feature is perfect for creating consistent character voices across different projects.
Advanced Audio Editing: Modify the emotion, style, and speed of audio clips. Add paralinguistic effects such as laughter or sighs, and remove background noise with denoising tools.
Native ComfyUI Integration: Seamlessly integrates with ComfyUI, allowing you to use its powerful node-based interface for audio processing.
Modular Workflow Design: Separate nodes for cloning and editing enable flexible and customizable audio workflows.
Longform Support: Smart chunking handles long texts by splitting them automatically and stitching the generated audio back together.
Iterative Editing: Apply multiple iterations of edits to achieve stronger and more pronounced effects.
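
The chunking idea behind Longform Support can be illustrated with a short Python sketch. This is not the extension's actual implementation: the function names `chunk_text` and `_split_words` and the `max_chars` limit are hypothetical, chosen only to show one plausible way to split long text at sentence boundaries before each chunk is synthesized.

```python
import re
from typing import List


def _split_words(sentence: str, max_chars: int) -> List[str]:
    """Split one over-long sentence on word boundaries (hypothetical helper)."""
    parts: List[str] = []
    current = ""
    for word in sentence.split():
        candidate = f"{current} {word}".strip()
        # A single word longer than max_chars is kept whole rather than cut.
        if len(candidate) <= max_chars or not current:
            current = candidate
        else:
            parts.append(current)
            current = word
    if current:
        parts.append(current)
    return parts


def chunk_text(text: str, max_chars: int = 200) -> List[str]:
    """Group sentences into chunks of at most max_chars characters.

    Assumption: the limit is expressed in characters; the real extension
    may use a different unit or strategy.
    """
    # Split after sentence-ending punctuation that is followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chunks: List[str] = []
    current = ""
    for sentence in sentences:
        if len(sentence) <= max_chars:
            pieces = [sentence]
        else:
            pieces = _split_words(sentence, max_chars)
        for piece in pieces:
            candidate = f"{current} {piece}".strip()
            if len(candidate) <= max_chars:
                current = candidate
            else:
                if current:
                    chunks.append(current)
                current = piece
    if current:
        chunks.append(current)
    return chunks
```

Each chunk would then be synthesized separately and the resulting audio segments concatenated in order. Breaking at sentence ends keeps the joins at natural pauses, which is the usual reason to prefer this over fixed-length splits.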

ComfyUI-Step_Audio_EditX_TTS Models

The extension utilizes two main models: the Step-Audio-EditX model and the Step-Audio-Tokenizer. The Step-Audio-EditX model is responsible for the core audio processing tasks, while the Step-Audio-Tokenizer helps in managing and processing the audio data efficiently. These models work together to provide high-quality audio cloning and editing capabilities.

Troubleshooting ComfyUI-Step_Audio_EditX_TTS

Common Issues and Solutions

Garbled or Distorted Speech: Ensure that all dependencies are up to date. You can update the transformers library to version 4.53.3 and verify that librosa and hyperpyyaml are installed.
Out of Memory Errors: Try enabling quantization or reducing the max_new_tokens parameter. You can also disable the keep_model_in_vram option to free up VRAM.
Poor Voice Quality: Make sure the prompt_text matches the reference audio transcript exactly. Use high-quality reference audio and consider increasing the temperature setting for more natural variation.
Edit Node Not Working: Check that the audio length is between 0.5 and 30 seconds. Ensure the audio_text matches the input audio transcript and that the correct edit type is selected.
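
Two of the checks above can be scripted before you open ComfyUI. The following is a minimal sketch, not part of the extension: the helper names `installed_versions` and `duration_in_range` are hypothetical, and the package names passed to the version check are assumed to match the names used in the list above. It relies only on the Python standard library.

```python
from importlib import metadata
from typing import Dict, Iterable, Optional


def installed_versions(packages: Iterable[str]) -> Dict[str, Optional[str]]:
    """Return the installed version of each package, or None if it is missing."""
    versions: Dict[str, Optional[str]] = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def duration_in_range(
    num_samples: int,
    sample_rate: int,
    min_seconds: float = 0.5,
    max_seconds: float = 30.0,
) -> bool:
    """Check that a clip's duration falls inside the 0.5 to 30 second window
    that the troubleshooting list gives for the edit node."""
    if sample_rate <= 0:
        raise ValueError("sample_rate must be positive")
    duration = num_samples / sample_rate
    return min_seconds <= duration <= max_seconds


if __name__ == "__main__":
    # Package names are taken from the troubleshooting list above.
    report = installed_versions(["transformers", "librosa", "hyperpyyaml"])
    for name, version in report.items():
        print(f"{name}: {version or 'NOT INSTALLED'}")
```

Run the script with the same Python environment that ComfyUI uses; otherwise the reported versions may not reflect what the extension actually loads. For the duration check, pass the sample count and sample rate of the clip you intend to feed to the node.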

Learn More about ComfyUI-Step_Audio_EditX_TTS

For further learning and support, you can explore the following resources:

Step Audio EditX Model on HuggingFace
ComfyUI GitHub Repository
ComfyUI Examples
ComfyUI Discord Community

These resources provide tutorials, community support, and additional documentation to help you make the most of the ComfyUI-Step_Audio_EditX_TTS extension.

Step Audio EditX TTS Related Nodes

StepAudioEditX - Edit ✏️

StepAudioEditX - Clone 🎤
