Ace Step 1.5: Text-to-Music with Vocals, Lyrics & Style Tag Control

acestep-ai/ace-step-1.5/text-to-audio

Generate songs up to 4 minutes from style tags and optional lyrics with original vocals and high acoustic fidelity, on RunComfy models and HTTP API.


Introduction To Ace Step 1.5

ACE Studio's Ace Step 1.5 transforms text style tags and optional structured lyrics into complete songs up to 4 minutes long at $0.0003 per second, with support for 50+ languages, coherent vocals, and high acoustic fidelity. By replacing manual scoring sessions, vocalist bookings, and multi-track production with tag-driven generation, the model speeds up music ideation for media teams, game studios, content creators, and advertising producers. Developers can use Ace Step 1.5 on RunComfy both in the browser and via an HTTP API, so there is no need to host or scale the model yourself.
Ideal for: Music Demo Prototyping | Cinematic and Game Scoring | Short-Form Ad Music

ACE Studio / Ace Step 1.5


Ace Step 1.5 is a text-to-music generation model that turns comma-separated style tags and optional structured lyrics into full songs with vocals, instrumentation, and synchronized lyric phrasing. It supports 50+ languages, runs efficiently, and is built for fast iteration with durations from a few seconds up to 4 minutes (240 seconds).


Output format: Audio only / duration 5–240 seconds / stereo / provider-defined sample rate.


Parameters


| Parameter | Required | Type | Default | Range / Options | Description |
|---|---|---|---|---|---|
| tags | Yes | string | (none) | Free text | Comma-separated list of genre, mood, and instrument tags. |
| lyrics | No | string | (none) | Free text or [inst] / [instrumental] | Vocal content; use section markers like [Verse], [Chorus], [Bridge] to structure the song. |
| duration | No | integer | 60 | 5–240 | Audio length in seconds. |
| seed | No | integer | -1 | -1 to 2147483647 | Random seed for reproducibility; -1 randomizes. |
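
The parameter table can be checked in code before a job is submitted. The following is a minimal sketch, assuming the request body uses the parameter names above as JSON keys; the function name `build_payload` is hypothetical and not part of any RunComfy SDK.

```python
from typing import Optional

# Ranges taken from the parameter table above.
MIN_DURATION, MAX_DURATION = 5, 240
MAX_SEED = 2147483647


def build_payload(tags: str, lyrics: Optional[str] = None,
                  duration: int = 60, seed: int = -1) -> dict:
    """Validate inputs against the documented ranges and return a request body.

    Hypothetical helper: the key names mirror the parameter table, which is
    an assumption about the wire format.
    """
    if not tags.strip():
        raise ValueError("tags is required and must not be empty")
    if not MIN_DURATION <= duration <= MAX_DURATION:
        raise ValueError(
            f"duration must be between {MIN_DURATION} and {MAX_DURATION} seconds")
    if not -1 <= seed <= MAX_SEED:
        raise ValueError(f"seed must be between -1 and {MAX_SEED}")
    payload = {"tags": tags, "duration": duration, "seed": seed}
    if lyrics:  # leave the field out when no lyrics are supplied
        payload["lyrics"] = lyrics
    return payload


payload = build_payload("lofi, hiphop, chill, mellow piano", duration=30)
```

Validating locally keeps out-of-range values from consuming a request; the defaults (60 seconds, seed -1) match the table.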

Pricing


Ace Step 1.5 on RunComfy uses time-based billing for generated audio.


| Billing unit | Rate |
|---|---|
| Per second of generated audio | $0.0003 |

Estimated cost examples


| Duration | Approx. cost |
|---|---|
| 30 s | ~$0.009 |
| 60 s (default) | ~$0.018 |
| 120 s | ~$0.036 |
| 240 s (4 min) | ~$0.072 |
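
The estimates above are simply duration multiplied by the per-second rate. A small sketch that reproduces them follows; the function name `estimate_cost` is hypothetical, and `Decimal` is used only to avoid floating-point rounding noise in the printed figures.

```python
from decimal import Decimal

# USD per second of generated audio, from the pricing table.
RATE_PER_SECOND = Decimal("0.0003")


def estimate_cost(duration_seconds: int) -> Decimal:
    """Return the approximate cost in USD for one generation."""
    if not 5 <= duration_seconds <= 240:
        raise ValueError("duration must be between 5 and 240 seconds")
    return RATE_PER_SECOND * duration_seconds


for seconds in (30, 60, 120, 240):
    print(f"{seconds:>3} s -> ${estimate_cost(seconds)}")
```

Treat the result as an estimate; the rate shown on the model page is authoritative.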

How to Use


1) Open the Ace Step 1.5 model in RunComfy and reveal the generation panel.

2) Enter style tags such as "lofi, hiphop, chill, mellow piano" to define genre, mood, and instrumentation.

3) Optionally add lyrics; keep [Verse], [Chorus], and [Bridge] sections clearly separated, or use [inst] for an instrumental.

4) Set duration in seconds (5–240); start short to test direction before committing to a full 4-minute render.

5) Lock the seed when you want to compare the impact of tag or lyric changes, or leave it at -1 for variety.

6) Run the generation, preview the result, and download the audio file from your job history.

7) For API use, send the same fields to the Ace Step 1.5 endpoint on RunComfy; no self-hosting is required.

8) Save promising seeds and tag combinations as presets to keep your sonic direction consistent across a project.
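
Steps 3, 5, and 8 can be sketched in code: keep lyric sections clearly separated, pin a seed, and store the combination as a preset. This is an illustrative sketch only. The helper `format_lyrics`, the preset layout, the placeholder lyric lines, and the seed value 12345 are all hypothetical, and whether the model needs a blank line between sections is an assumption.

```python
import json


def format_lyrics(sections):
    """Join (marker, text) pairs into lyrics, one blank line between sections."""
    blocks = [f"[{marker}]\n{text.strip()}" for marker, text in sections]
    return "\n\n".join(blocks)


lyrics = format_lyrics([
    ("Verse", "Placeholder verse line"),
    ("Chorus", "Placeholder chorus line"),
])

# A preset pins the seed so that only tag or lyric changes alter the result.
preset = {
    "tags": "lofi, hiphop, chill, mellow piano",
    "lyrics": lyrics,
    "duration": 30,   # start short to test direction
    "seed": 12345,    # hypothetical fixed seed; -1 would randomize
}
preset_json = json.dumps(preset, indent=2)
print(preset_json)
```

Storing presets as JSON makes it easy to reuse the same fields in the browser and in API calls.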

Related Models

veo-3-1/fast/text-to-video

Create cinematic clips in seconds with Veo 3.1 Fast, built for instant text-driven motion and creative control.

ltx-2/fast/text-to-video

Next-gen tool turning prompts into cinematic 4K video clips with audio

hailuo-02/image-to-video

Produces crisp 1080p AI videos with smart motion logic and speed

pikadditions

Add a person or object into an existing video with smart compositing.

wan-2-2/image-to-video

Refined AI visuals, real-time control, and pro FX for creators

kling-2-5/turbo/text-to-video

Generate fast, high quality videos from text with Kling 2.5 Turbo.

Frequently Asked Questions

What is Ace Step 1.5 and what does it do in a text-to-audio workflow?

Ace Step 1.5 is a text-to-music model from acestep-ai that turns style tags and optional structured lyrics into full audio tracks with melody, rhythm, and vocals. In a text-to-audio workflow on RunComfy, you describe the genre, mood, and song structure, and Ace Step 1.5 generates a coherent musical piece with synchronized lyric phrasing. It is designed for creators who want fast, prompt-driven music generation without manual composition.

What kinds of generation tasks is Ace Step 1.5 best suited for?

Ace Step 1.5 is best suited for text-to-audio tasks such as background music for videos, short song demos, ambient loops, ad jingles, and reference tracks for game scenes. It handles tag-based styling well, so you can steer genre, instrumentation, and energy with a few descriptors. Lyric and vocal generation also makes Ace Step 1.5 useful for songwriting drafts and creative prototyping.

How does Ace Step 1.5 compare to the original Ace Step and other music models?

Compared to the original Ace Step, version 1.5 keeps the same tag-driven control and 4-minute maximum duration while expanding multilingual lyric support to 50+ languages and refining structured-lyric handling. Compared to instrumental-only systems, Ace Step 1.5 natively produces vocals, instrumentation, and synchronized phrasing in a single pass. Reproducibility through a seed parameter helps developers iterate consistently on a chosen direction.

Which teams and use cases benefit most from Ace Step 1.5 in production?

Designers, technical artists, video creators, and product teams can use Ace Step 1.5 for trailers, social content, prototype game audio, e-commerce videos, and ad creatives. Developers can wrap it into pipelines that need on-demand soundtracks tied to scene metadata or campaign briefs. Because Ace Step 1.5 supports both vocals and instrumentals across many languages, it covers a wide range of audio needs from a single interface.

What input and output limits should I know before using Ace Step 1.5?

Ace Step 1.5 generates between 5 and 240 seconds (4 minutes) of audio per request, with a single required tags field and optional structured lyrics. Other constraints, such as supported audio formats and tag combinations, depend on the current provider configuration and may vary by mode, so check the RunComfy parameter panel for exact limits before building around them.

How do I move from testing Ace Step 1.5 in the model UI to using it in production via the RunComfy API?

You can prototype Ace Step 1.5 in the RunComfy AI Playground Web UI by adjusting style tags, lyrics, duration, and seed until the text-to-audio output matches your target. Once the configuration is stable, call the same Ace Step 1.5 model through the RunComfy API with identical parameters to automate generation from your backend or content pipeline. This keeps creative iteration in the browser and production runs in code, without changing the underlying model behavior.
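
As a rough sketch of what the move to code might look like, the snippet below builds (but does not send) an HTTP request carrying the same fields used in the browser. The endpoint URL is a placeholder, and the POST method, JSON body, and Bearer authorization header are assumptions; none of these are specified on this page, so consult the RunComfy API docs for the actual values.

```python
import json
import urllib.request

# Placeholder only: the real endpoint URL is not given on this page.
API_URL = "https://example.invalid/acestep-ai/ace-step-1.5/text-to-audio"


def make_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build a JSON POST request; the auth scheme shown is an assumption."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",  # assumed auth header
        },
        method="POST",
    )


req = make_request("YOUR_API_KEY", {
    "tags": "lofi, hiphop, chill, mellow piano",
    "duration": 30,
    "seed": -1,
})
# Once API_URL points at the real endpoint, urllib.request.urlopen(req)
# would submit the job.
```

Building the request separately from sending it lets you log or inspect the exact fields before any billable generation runs.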

How is pricing handled when generating audio with Ace Step 1.5 on RunComfy?

Ace Step 1.5 generations draw on your RunComfy balance (USD or credits), and based on available provider information the model is billed at $0.0003 per second of generated audio. New users typically receive a free trial balance to experiment with, after which usage follows the Generation rules shown on the model page. For the most current rates and any mode-specific differences, refer to the Generation section of the Ace Step 1.5 page on RunComfy.

Copyright 2026 RunComfy. All Rights Reserved.