ComfyUI Extension: ComfyUI_ASR

Repo Name: ComfyUI_ASR
Author: billwuhao (Account age: 2576 days)
Nodes: 4
Latest Updated: 2026-03-11
GitHub Stars: 0.03K

How to Install ComfyUI_ASR

Install this extension via the ComfyUI Manager by searching for ComfyUI_ASR:
  1. Click the Manager button in the main menu.
  2. Select the Custom Nodes Manager button.
  3. Enter ComfyUI_ASR in the search bar.
After installation, click the Restart button to restart ComfyUI. Then, manually refresh your browser to clear the cache and access the updated list of nodes.

ComfyUI_ASR Description

ComfyUI_ASR is an extension for ComfyUI that adds automatic speech recognition (ASR). It converts spoken audio into text with timestamps, which the subtitle nodes can then render onto video.

ComfyUI_ASR Introduction

ComfyUI_ASR is a collection of custom nodes for ComfyUI that adds speech recognition and subtitle processing to video workflows. It transcribes speech in English, Chinese, or other languages and renders the result as subtitles, so you do not have to add them by hand. Subtitles can be static or dynamic, with customizable font settings and color options.

How ComfyUI_ASR Works

ComfyUI_ASR uses a speech recognition model to transcribe a video's audio into text with timestamps, then uses that timestamped text to generate subtitles. It offers two subtitle types: static, where an entire sentence appears at once, and dynamic, where words appear one by one in a typewriter-style effect. You can choose whichever style best fits your video.
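
The difference between the two subtitle types can be sketched in a few lines of Python. This is an illustration only, not the extension's code: it assumes word timestamps arrive as (text, start, end) tuples grouped by sentence, and the names static_cues and dynamic_cues are hypothetical.

```python
from typing import List, Tuple

Word = Tuple[str, float, float]  # (text, start_seconds, end_seconds); assumed format
Cue = Tuple[float, float, str]   # (start_seconds, end_seconds, displayed text)


def static_cues(sentences: List[List[Word]]) -> List[Cue]:
    """One cue per sentence: the whole sentence is shown from the start of
    its first word to the end of its last word."""
    cues = []
    for words in sentences:
        if not words:
            continue
        text = " ".join(w[0] for w in words)
        cues.append((words[0][1], words[-1][2], text))
    return cues


def dynamic_cues(sentences: List[List[Word]]) -> List[Cue]:
    """One cue per word: each cue shows the sentence typed so far and lasts
    until the next word starts (or until the last word ends)."""
    cues = []
    for words in sentences:
        for i, (_, start, end) in enumerate(words):
            shown = " ".join(w[0] for w in words[: i + 1])
            cue_end = words[i + 1][1] if i + 1 < len(words) else end
            cues.append((start, cue_end, shown))
    return cues


sentence = [("hello", 0.0, 0.4), ("brave", 0.5, 0.9), ("world", 1.0, 1.5)]
print(static_cues([sentence]))
print(dynamic_cues([sentence]))
```

Words are joined with spaces here, which suits English; for Chinese text an empty separator would be the more natural choice.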

ComfyUI_ASR Features

Speech Recognition

  • ASRMW Node: Converts audio into text and timestamps. You can choose from various models, such as Belle-whisper-large-v3-zh-punct-ct2, to suit your language needs. The node outputs plain text and timestamped words or sentences.

Subtitle Generation

  • StaticSubtitlesToVideoMW Node: Adds static subtitles to your video, displaying complete sentences. Customize font size, color, background, and alignment to match your video's aesthetic.
  • DynamicSubtitlesToVideoMW Node: Creates dynamic subtitles that appear word by word. This feature is perfect for creating engaging, typewriter-style effects.

Customization Options

  • Font and Color Settings: Adjust font size, color, background color, and transparency. You can also add outlines to your text for better visibility.
  • Alignment and Positioning: Subtitles can be aligned left, center, or right, and positioned anywhere on the screen to ensure they complement your video content.
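
As a rough illustration of what left, center, and right alignment mean in pixel terms, the sketch below computes the x coordinate of a subtitle's left edge. The function name subtitle_x and the margin parameter are hypothetical and are not taken from the extension; the nodes may compute positions differently.

```python
def subtitle_x(align: str, frame_width: int, text_width: int, margin: int = 20) -> int:
    """Return the x coordinate (in pixels) of the subtitle's left edge."""
    if align == "left":
        return margin
    if align == "center":
        return (frame_width - text_width) // 2
    if align == "right":
        return frame_width - text_width - margin
    raise ValueError(f"unknown alignment: {align!r}")


# Where a 600-pixel-wide subtitle lands on a 1920-pixel-wide frame.
for align in ("left", "center", "right"):
    print(align, subtitle_x(align, frame_width=1920, text_width=600))
```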

Color Picker

  • ColorPickerMW Node: A simple tool to select colors for your subtitles, ensuring they stand out against your video background.
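
A color plus a transparency setting is commonly combined into a single RGBA value before drawing. The following sketch assumes the color is given as a hex string such as "#FFCC00"; the format ColorPickerMW actually outputs is not documented here, and hex_to_rgba is a hypothetical helper.

```python
from typing import Tuple


def hex_to_rgba(color: str, opacity: float = 1.0) -> Tuple[int, int, int, int]:
    """Convert '#RRGGBB' or '#RGB' plus an opacity in [0, 1] to an RGBA tuple."""
    value = color.lstrip("#")
    if len(value) == 3:
        # Expand shorthand such as 'fc0' to 'ffcc00'.
        value = "".join(ch * 2 for ch in value)
    if len(value) != 6:
        raise ValueError(f"expected a 3- or 6-digit hex color, got {color!r}")
    r, g, b = (int(value[i:i + 2], 16) for i in (0, 2, 4))
    # Clamp opacity to [0, 1] before scaling to the 0-255 alpha range.
    alpha = round(max(0.0, min(1.0, opacity)) * 255)
    return (r, g, b, alpha)


print(hex_to_rgba("#FFCC00"))       # opaque subtitle text color
print(hex_to_rgba("#000", 0.0))     # fully transparent background
```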

ComfyUI_ASR Models

ComfyUI_ASR supports several models for speech recognition, each tailored for different languages and performance needs:

  • Belle-whisper-large-v3-zh-punct-ct2: Ideal for Chinese language recognition.
  • Belle-whisper-large-v3-zh-punct-ct2-float32: Offers a balance between performance and precision.
  • faster-whisper-large-v3-turbo-ct2: Provides faster processing times, suitable for large projects.

Choosing the right model depends on your specific requirements, such as language and processing speed.

What's New with ComfyUI_ASR

  • [2026-02-09]: Added SRT subtitle output. Video players can load an SRT file quickly, which addresses the slow process of adding subtitles to long videos.
  • [2025-11-02]: Version 1.0.2 fixed non-integer stroke width issues and added automatic model download for first-time users.
  • [2025-11-01]: Initial release of version 1.0.0, bringing comprehensive subtitle and speech recognition features.
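
For readers unfamiliar with SRT, each entry is a running index, a time range written as HH:MM:SS,mmm --> HH:MM:SS,mmm, and the subtitle text, followed by a blank line. The sketch below shows how timestamped sentences could be written in that layout. It is not the extension's implementation; the (start, end, text) tuple format and the function names are assumptions.

```python
from typing import Iterable, Tuple


def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as the SRT timestamp HH:MM:SS,mmm."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def to_srt(cues: Iterable[Tuple[float, float, str]]) -> str:
    """Render (start, end, text) cues as SRT text, numbering entries from 1."""
    blocks = []
    for index, (start, end, text) in enumerate(cues, start=1):
        blocks.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"
        )
    # Entries are separated by a blank line.
    return "\n".join(blocks)


print(to_srt([(0.0, 1.5, "Hello world"), (2.0, 3.25, "Second line")]))
```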

Troubleshooting ComfyUI_ASR

Common Issues and Solutions

  1. Model Download Issues: If models do not download automatically, manually download them and place them in the ComfyUI/models/TTS directory.
  2. Subtitle Alignment Problems: Ensure that font files are placed in the fonts directory within the node folder.
  3. Performance Lag: For large videos, consider using the faster-whisper-large-v3-turbo-ct2 model for improved processing speed.
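
When checking a manual model download, a quick script can confirm which models are present. The sketch below assumes each model sits in its own subfolder of ComfyUI/models/TTS named after the model; that layout is an assumption, so adjust it to match your installation. The helper missing_models is hypothetical.

```python
from pathlib import Path
from typing import List, Sequence

# Model names as listed in the Models section of this page.
MODEL_NAMES = [
    "Belle-whisper-large-v3-zh-punct-ct2",
    "Belle-whisper-large-v3-zh-punct-ct2-float32",
    "faster-whisper-large-v3-turbo-ct2",
]


def missing_models(comfyui_root: str, names: Sequence[str] = MODEL_NAMES) -> List[str]:
    """Return the model names that have no folder under <root>/models/TTS.

    Assumes one subfolder per model; this layout is not confirmed.
    """
    tts_dir = Path(comfyui_root) / "models" / "TTS"
    return [name for name in names if not (tts_dir / name).is_dir()]


# Replace "ComfyUI" with the path to your ComfyUI installation.
print(missing_models("ComfyUI"))
```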

FAQs

  • Can I use my own fonts? Yes, place your font files in the fonts directory within the node folder.
  • How do I adjust subtitle timing? Use the timestamped outputs from the ASRMW node to fine-tune subtitle timing.
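
One common timing fix is shifting every subtitle earlier or later by a fixed offset. The sketch below shows the idea on (start, end, text) tuples; that tuple format is an assumption about how you hold the timestamped output, and shift_cues is a hypothetical helper rather than part of the extension.

```python
from typing import List, Tuple

Cue = Tuple[float, float, str]  # (start_seconds, end_seconds, text); assumed format


def shift_cues(cues: List[Cue], offset: float) -> List[Cue]:
    """Shift all cues by `offset` seconds (negative means earlier).

    Start times are clamped at zero, and an end time never precedes its start.
    """
    shifted = []
    for start, end, text in cues:
        new_start = max(0.0, start + offset)
        new_end = max(new_start, end + offset)
        shifted.append((new_start, new_end, text))
    return shifted


cues = [(1.0, 2.0, "first"), (2.5, 4.0, "second")]
print(shift_cues(cues, 0.5))    # subtitles appear half a second later
print(shift_cues(cues, -1.5))   # subtitles appear earlier, clamped at zero
```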

Learn More about ComfyUI_ASR

To further explore the capabilities of ComfyUI_ASR, consider visiting community forums and tutorials that provide insights and tips on maximizing the extension's potential. Engaging with other AI artists can also offer new perspectives and creative ideas for your projects.
