
ComfyUI Node: AudioX Text to Audio

Class Name

AudioXTextToAudio

Category
AudioX/Generation
Author
lum3on (Account age: 314 days)
Extension
ComfyUI-AudioX
Latest Updated
2025-06-24
Github Stars
0.04K

How to Install ComfyUI-AudioX

Install this extension through the ComfyUI Manager:
  1. Click the Manager button in the main menu.
  2. Select the Custom Nodes Manager button.
  3. Enter ComfyUI-AudioX in the search bar and install the extension.
After installation, click the Restart button to restart ComfyUI. Then refresh your browser manually to clear the cache and load the updated list of nodes.

AudioX Text to Audio Description

Generate audio from text prompts with the AudioX model, so AI artists can create custom audio elements.

AudioX Text to Audio:

The AudioXTextToAudio node generates audio from a text description using the AudioX model. It is aimed at AI artists and creators who want to produce soundscapes, sound effects, or narrative audio pieces directly from text prompts and integrate them into multimedia projects. The node interprets the text input and outputs the corresponding audio.

AudioX Text to Audio Input Parameters:

model

This parameter specifies the model used for audio generation. It determines how the text prompt is interpreted and what kind of audio can be produced. The model must be compatible with the AudioX framework.

text_prompt

The text_prompt parameter is a string input where you describe the audio you want to generate. It can be a simple phrase or a detailed description, and it supports multiline input for more complex prompts. The default value is "Typing on a keyboard", but you can replace it with any description that fits your needs.

steps

The steps parameter defines the number of processing steps the model takes to generate the audio. More steps can lead to higher-quality audio but increase processing time. The value ranges from 1 to 1000, with a default of 250.

cfg_scale

The cfg_scale parameter is a float that controls the guidance scale for the model. It influences how closely the generated audio should adhere to the text prompt. A higher value means stricter adherence, while a lower value allows for more creative freedom. The range is from 0.1 to 20.0, with a default of 7.0.
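
The page does not describe how cfg_scale is applied internally. If the node follows the usual classifier-free guidance convention (an assumption, not something the source confirms), the effect of the scale can be sketched as below. The function apply_guidance and its arguments are hypothetical names used only for illustration.

```python
def apply_guidance(uncond, cond, cfg_scale):
    """Blend unconditional and prompt-conditioned predictions.

    Assumes the standard classifier-free guidance formula;
    the AudioX node may implement guidance differently.
    """
    return [u + cfg_scale * (c - u) for u, c in zip(uncond, cond)]


# Toy predictions for three values (not real model output).
uncond = [0.0, 1.0, 2.0]
cond = [1.0, 1.0, 0.0]

# cfg_scale = 1.0 reproduces the conditioned prediction;
# larger values push the result further from the unconditional one.
same_as_cond = apply_guidance(uncond, cond, 1.0)
pushed = apply_guidance(uncond, cond, 7.0)
```

Under this assumption, a scale near the lower bound keeps the result close to the unconditioned prediction, which matches the "more creative freedom" behavior described above.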

seed

The seed parameter is an integer used to initialize the random number generator, ensuring reproducibility of results. A value of -1 indicates that a random seed will be used, while any other integer within the range of -1 to 2^32 - 1 can be specified for consistent outputs.
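
The -1 convention can be illustrated with a minimal sketch. The helper resolve_seed is a hypothetical name, not part of the extension, and the choice to draw random seeds from 0 to 2^32 - 1 is an assumption based on the documented range.

```python
import random

MAX_SEED = 2**32 - 1  # upper bound stated for the seed parameter


def resolve_seed(seed: int) -> int:
    """Return a concrete seed; -1 means 'pick one at random'."""
    if seed == -1:
        return random.randint(0, MAX_SEED)
    if not 0 <= seed <= MAX_SEED:
        raise ValueError(f"seed must be -1 or in [0, {MAX_SEED}], got {seed}")
    return seed


fixed = resolve_seed(1234)     # reused as-is, so runs are repeatable
randomized = resolve_seed(-1)  # a fresh value on each call
```

Recording the resolved value alongside the output is one way to reproduce a result later, even when the run started with -1.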

duration_seconds

This parameter specifies the length of the generated audio in seconds. It allows you to control the duration of the output, with a range from 1.0 to 30.0 seconds and a default value of 10.0 seconds.
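
For reference, the documented inputs can be summarized in the style of a ComfyUI node definition. This is a sketch reconstructed from the ranges and defaults listed above, not the extension's actual source code. The class name AudioXTextToAudioSketch is hypothetical, the AUDIO return type is an assumption, and no default is given for seed because the page does not state one.

```python
class AudioXTextToAudioSketch:
    """Illustrative only: mirrors the documented inputs, not the real node."""

    @classmethod
    def INPUT_TYPES(cls):
        return {
            "required": {
                "model": ("AUDIOX_MODEL",),
                "text_prompt": ("STRING", {"multiline": True,
                                           "default": "Typing on a keyboard"}),
                "steps": ("INT", {"default": 250, "min": 1, "max": 1000}),
                "cfg_scale": ("FLOAT", {"default": 7.0, "min": 0.1, "max": 20.0}),
                # No default listed in the documentation, so none is set here.
                "seed": ("INT", {"min": -1, "max": 2**32 - 1}),
                "duration_seconds": ("FLOAT", {"default": 10.0,
                                               "min": 1.0, "max": 30.0}),
            }
        }

    RETURN_TYPES = ("AUDIO",)  # assumed output type
    RETURN_NAMES = ("audio",)
    CATEGORY = "AudioX/Generation"


spec = AudioXTextToAudioSketch.INPUT_TYPES()["required"]
```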

AudioX Text to Audio Output Parameters:

audio

The audio output is the audio generated from the text prompt by the AudioX model. It can be passed to other nodes or used in multimedia projects as a custom sound.

AudioX Text to Audio Usage Tips:

  • Experiment with different text_prompt inputs to explore the range of audio outputs the model can generate. Descriptive and detailed prompts can lead to more nuanced audio results.
  • Adjust the steps parameter to balance between audio quality and processing time. For quick iterations, use fewer steps, and for final outputs, consider increasing the steps for better quality.
  • Use the cfg_scale to fine-tune how closely the audio should match the text prompt. Higher values ensure the audio closely follows the description, while lower values allow for more creative interpretation.
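
The tips above can be tried programmatically by building a workflow in ComfyUI's API format, where each node is keyed by an id and links are written as [source node id, output index]. The sketch below only builds the payload and does not send it. The loader class name HypotheticalAudioXModelLoader is a placeholder, since this page does not name the node that provides the model. The step counts and prompt are illustrative choices.

```python
import json


def build_workflow(text_prompt, steps, cfg_scale, seed=-1, duration_seconds=10.0):
    """Return an API-format workflow fragment as a plain dict."""
    return {
        # Hypothetical loader node; replace class_type with the real
        # AudioX model loader from your installation.
        "1": {"class_type": "HypotheticalAudioXModelLoader", "inputs": {}},
        "2": {
            "class_type": "AudioXTextToAudio",
            "inputs": {
                "model": ["1", 0],  # link: output 0 of node "1"
                "text_prompt": text_prompt,
                "steps": steps,
                "cfg_scale": cfg_scale,
                "seed": seed,
                "duration_seconds": duration_seconds,
            },
        },
    }


prompt = "Rain on a tin roof with distant thunder"

# Fewer steps for a quick draft, the default 250 for the final render.
# A fixed seed keeps the two runs comparable.
draft = build_workflow(prompt, steps=50, cfg_scale=7.0, seed=42)
final = build_workflow(prompt, steps=250, cfg_scale=7.0, seed=42)

# ComfyUI's HTTP endpoint for queuing work expects {"prompt": <workflow>}.
payload = json.dumps({"prompt": final})
```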

AudioX Text to Audio Common Errors and Solutions:

Invalid model type

  • Explanation: The specified model is not compatible with the AudioX framework.
  • Solution: Ensure that the model parameter is set to a valid AUDIOX_MODEL type.

Text prompt too long

  • Explanation: The text prompt exceeds the maximum allowed length.
  • Solution: Shorten the text prompt to fit within the acceptable length for processing.

Steps out of range

  • Explanation: The number of steps specified is outside the allowed range.
  • Solution: Adjust the steps parameter to be within the range of 1 to 1000.

Seed value invalid

  • Explanation: The seed value is not within the valid range.
  • Solution: Ensure the seed is set between -1 and 2^32 - 1, or use -1 for a random seed.
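
The range errors above can be caught before a workflow is queued. The following is a minimal sketch using the documented ranges. The function check_inputs is a hypothetical helper, and its messages for cfg_scale and duration_seconds are illustrative rather than errors listed by the node. The text prompt length is not checked because the page does not state the maximum.

```python
MAX_SEED = 2**32 - 1  # upper bound stated for the seed parameter


def check_inputs(steps, cfg_scale, seed, duration_seconds):
    """Return a list of problems; an empty list means all values are in range."""
    problems = []
    if not 1 <= steps <= 1000:
        problems.append("Steps out of range")
    if not 0.1 <= cfg_scale <= 20.0:
        problems.append("cfg_scale out of range")
    if not -1 <= seed <= MAX_SEED:
        problems.append("Seed value invalid")
    if not 1.0 <= duration_seconds <= 30.0:
        problems.append("duration_seconds out of range")
    return problems


ok = check_inputs(steps=250, cfg_scale=7.0, seed=-1, duration_seconds=10.0)
bad = check_inputs(steps=0, cfg_scale=7.0, seed=2**32, duration_seconds=10.0)
```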

Copyright 2025 RunComfy. All Rights Reserved.
