RunComfy

ComfyUI Node: 🎙️ ChatterBox Voice Capture (diogod)

Class Name: ChatterBoxVoiceCaptureDiogod
Category: ChatterBox Voice
Author: diodiogod (Account age: 768 days)
Extension: ComfyUI_ChatterBox_SRT_Voice
Latest Updated: 2026-03-21
Github Stars: 0.08K

Table of Contents

Description
ChatterBoxVoiceCaptureDiogod:
ChatterBoxVoiceCaptureDiogod Input Parameters:
ChatterBoxVoiceCaptureDiogod Output Parameters:
ChatterBoxVoiceCaptureDiogod Usage Tips:
ChatterBoxVoiceCaptureDiogod Common Errors and Solutions:
Related Nodes

How to Install ComfyUI_ChatterBox_SRT_Voice

Install this extension via the ComfyUI Manager by searching for ComfyUI_ChatterBox_SRT_Voice:

1. Click the Manager button in the main menu.
2. Select the Custom Nodes Manager button.
3. Enter ComfyUI_ChatterBox_SRT_Voice in the search bar.

After installation, click the Restart button to restart ComfyUI. Then, manually refresh your browser to clear the cache and access the updated list of nodes.

🎙️ ChatterBox Voice Capture (diogod) Description

Facilitates voice capture and processing in ComfyUI, enabling seamless recording and conversion.

🎙️ ChatterBox Voice Capture (diogod):

ChatterBoxVoiceCaptureDiogod is a node for capturing and processing voice data within ComfyUI. Its primary purpose is voice recording and conversion, which makes it useful for AI artists who want to integrate voice elements into their projects. It is particularly relevant for workflows that rely on features such as voice conversion and speaker diarization. Because it runs as a regular ComfyUI node, recording and processing are configured through the node interface, which keeps voice data manipulation accessible to users with limited technical expertise.

🎙️ ChatterBox Voice Capture (diogod) Input Parameters:

tag_audio_events

This parameter allows you to annotate sounds such as laughter or music within the transcript. It is a Boolean input, meaning it can be set to either True or False. When enabled, it enhances the transcript by providing context about non-verbal audio events, which can be particularly useful for creating more engaging and informative audio content. The default value is False.

diarize

The diarize parameter is used to annotate which speaker is talking during the audio recording. This is also a Boolean input, with a default value of False. Enabling this feature allows for the separation of different speakers in the transcript, which is crucial for projects involving multiple voices or characters. It helps in maintaining clarity and understanding in dialogues or interviews.

diarization_threshold

This parameter controls the sensitivity of speaker separation. It is a Float input with a default value of 0.22, and it can range from 0.1 to 0.4. The threshold determines how sensitive the system is to changes in speakers; lower values make it more sensitive, which can be useful in environments with frequent speaker changes. Adjusting this parameter allows for fine-tuning the balance between sensitivity and accuracy in speaker identification.
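As a rough illustration of how the three inputs described above map onto ComfyUI's custom-node conventions, the sketch below declares them in an `INPUT_TYPES` classmethod. The class name `VoiceCaptureInputsSketch` and the `step` value are assumptions made for illustration; this is not the extension's actual source, and the real node may declare additional inputs.

```python
class VoiceCaptureInputsSketch:
    """Hypothetical stand-in; not the extension's real class."""

    @classmethod
    def INPUT_TYPES(cls):
        # Defaults and range follow the parameter descriptions above.
        return {
            "required": {
                "tag_audio_events": ("BOOLEAN", {"default": False}),
                "diarize": ("BOOLEAN", {"default": False}),
                "diarization_threshold": (
                    "FLOAT",
                    # "step" is an assumed value, not taken from the documentation.
                    {"default": 0.22, "min": 0.1, "max": 0.4, "step": 0.01},
                ),
            }
        }


# List each declared input with its type and default.
spec = VoiceCaptureInputsSketch.INPUT_TYPES()["required"]
for name, (type_name, options) in spec.items():
    print(name, type_name, options["default"])
```

Declaring `min` and `max` on the float input lets the ComfyUI front end restrict the widget to the documented range.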

🎙️ ChatterBox Voice Capture (diogod) Output Parameters:

transcript

The transcript output provides a text representation of the recorded audio, including any annotations for audio events and speaker diarization if those features are enabled. This output is crucial for users who need a textual version of their audio content for further processing or analysis.

audio_segments

This output consists of segmented audio data, which can be used for detailed analysis or further processing. Each segment corresponds to a portion of the audio that has been identified as distinct, either by speaker or by audio event, depending on the input parameters set.

🎙️ ChatterBox Voice Capture (diogod) Usage Tips:

To optimize the node for projects with multiple speakers, enable the diarize parameter and adjust the diarization_threshold to a lower value for environments with frequent speaker changes.
Use the tag_audio_events parameter to enhance transcripts with contextual information about non-verbal sounds, which can improve the overall quality and engagement of your audio content.
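The tips above could be applied when building a workflow programmatically. The following is a minimal sketch that assembles a node entry in ComfyUI's API-style prompt format (a mapping of node IDs to `class_type` and `inputs`). The helper name `build_voice_capture_entry`, the node ID `"1"`, and the lowered threshold of `0.15` are illustrative assumptions; the real node may require inputs that are not listed on this page.

```python
import json


def build_voice_capture_entry(multi_speaker, frequent_changes=False, tag_events=False):
    """Hypothetical helper that fills in the three documented inputs."""
    # 0.22 is the documented default; 0.15 is an arbitrary lower value
    # inside the documented 0.1 to 0.4 range, chosen only as an example.
    threshold = 0.15 if (multi_speaker and frequent_changes) else 0.22
    return {
        "class_type": "ChatterBoxVoiceCaptureDiogod",
        "inputs": {
            "tag_audio_events": tag_events,
            "diarize": multi_speaker,
            "diarization_threshold": threshold,
        },
    }


entry = build_voice_capture_entry(
    multi_speaker=True, frequent_changes=True, tag_events=True
)
# "1" is an arbitrary node ID used for this example.
print(json.dumps({"1": entry}, indent=2))
```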

🎙️ ChatterBox Voice Capture (diogod) Common Errors and Solutions:

"Audio input not detected"

Explanation: This error occurs when the node fails to detect any audio input, possibly due to incorrect microphone settings or permissions.
Solution: Ensure that your microphone is properly connected and configured in your system settings. Check that the application has the necessary permissions to access the microphone.

"Diarization threshold out of range"

Explanation: This error indicates that the diarization_threshold value is set outside the acceptable range of 0.1 to 0.4.
Solution: Adjust the diarization_threshold parameter to a value within the specified range to ensure proper speaker separation functionality.
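When the threshold is set from a script rather than the node widget, one way to avoid this error is to clamp the value before passing it on. This is a small sketch under that assumption; `clamp_diarization_threshold` is a hypothetical helper and not part of the extension.

```python
# Documented range for diarization_threshold.
THRESHOLD_MIN = 0.1
THRESHOLD_MAX = 0.4


def clamp_diarization_threshold(value):
    """Return value limited to the documented 0.1 to 0.4 range."""
    return max(THRESHOLD_MIN, min(THRESHOLD_MAX, float(value)))


print(clamp_diarization_threshold(0.05))  # prints 0.1
print(clamp_diarization_threshold(0.22))  # prints 0.22
print(clamp_diarization_threshold(0.9))   # prints 0.4
```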

🎙️ ChatterBox Voice Capture (diogod) Related Nodes

Go back to the extension to check out more related nodes.

ComfyUI_ChatterBox_SRT_Voice
