ComfyUI Node: 🎙️ Voice Capture

Class Name: ChatterBoxVoiceCapture
Category: TTS Audio Suite/🎵 Audio Processing
Author: diogod (Account age: 667 days)
Extension: TTS Audio Suite
Last Updated: 2025-12-13
GitHub Stars: 0.46K

How to Install TTS Audio Suite

Install this extension via the ComfyUI Manager by searching for TTS Audio Suite:
  1. Click the Manager button in the main menu.
  2. Select the Custom Nodes Manager button.
  3. Enter TTS Audio Suite in the search bar.
After installation, click the Restart button to restart ComfyUI. Then, manually refresh your browser to clear the cache and access the updated list of nodes.

🎙️ Voice Capture Description

Sophisticated node for voice recording and processing with customizable settings, ideal for precise audio input in various applications.

🎙️ Voice Capture:

ChatterBoxVoiceCapture records voice from a microphone inside the TTS Audio Suite. It is intended for workflows that need precise audio input, such as voice conversion or speech synthesis. Configurable parameters include the input device, sample rate, maximum recording duration, volume gain, silence detection, and automatic normalization. The recorded audio is returned as a tensor that can be passed on for further audio manipulation or analysis.

🎙️ Voice Capture Input Parameters:

voice_device

This parameter specifies the audio input device to be used for recording. If left empty, the system's default audio input device will be used. Selecting the correct device is crucial for ensuring the desired audio source is captured.
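
To find a valid device to select here, it can help to list the input-capable devices on the machine. The sketch below assumes the node relies on the sounddevice Python package (the PortAudio message in the errors section suggests this, but it is not confirmed here). The helper names input_devices and list_system_inputs are illustrative and not part of the node.

```python
def input_devices(devices):
    """Return (index, name) pairs for devices that can record audio.

    `devices` is any iterable of dict-like entries with "name" and
    "max_input_channels" keys, which is the shape returned by
    sounddevice.query_devices().
    """
    return [
        (index, dev["name"])
        for index, dev in enumerate(devices)
        if dev["max_input_channels"] > 0
    ]


def list_system_inputs():
    """Query the real hardware; requires sounddevice and PortAudio."""
    import sounddevice as sd  # imported lazily so the filter works without it

    return input_devices(sd.query_devices())
```

Calling list_system_inputs() from the same Python environment that runs ComfyUI shows which devices that environment can actually see.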

voice_sample_rate

This parameter determines the number of audio samples captured per second, with a default value of 44100 Hz. A higher sample rate can result in better audio quality but may increase processing requirements. The sample rate should be chosen based on the desired balance between audio fidelity and system performance.

voice_max_recording_time

This parameter sets the maximum duration for the audio recording in seconds, with a default of 10.0 seconds. It limits the length of the captured audio, which is useful for managing file sizes and processing times. Adjust this parameter based on the specific needs of your project.
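
Together, voice_sample_rate and voice_max_recording_time bound the size of the captured audio. The following sketch estimates that size; it assumes mono audio stored as 32-bit floats, which is an assumption for illustration and not a documented detail of the node. The function name recording_size is hypothetical.

```python
def recording_size(sample_rate, max_seconds, channels=1, bytes_per_sample=4):
    """Return (frames, bytes) for an uncompressed recording buffer.

    channels and bytes_per_sample default to mono float32, which is an
    assumption made for this estimate.
    """
    frames = int(sample_rate * max_seconds)
    return frames, frames * channels * bytes_per_sample


frames, size_bytes = recording_size(44100, 10.0)
print(frames, size_bytes)  # 441000 1764000
```

With the defaults (44100 Hz, 10.0 seconds) this comes to 441,000 frames, or about 1.76 MB under the stated assumptions. Doubling either parameter doubles the buffer.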

voice_volume_gain

This parameter controls the amplification of the recorded audio, with a default gain of 1.0x. Increasing the gain can make quieter recordings more audible, but excessive gain may introduce distortion. It is important to adjust this setting to achieve the desired audio level without compromising quality.
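
The distortion mentioned above comes from clipping: once amplified samples exceed the representable range, their peaks are flattened. A minimal sketch, assuming samples are floats in the range -1.0 to 1.0 and that out-of-range values are hard-clipped (how the node itself handles overflow is not documented here):

```python
def apply_gain(samples, gain=1.0):
    """Scale each sample by `gain`, hard-clipping the result to [-1.0, 1.0]."""
    return [max(-1.0, min(1.0, s * gain)) for s in samples]


print(apply_gain([0.25, -0.5, 0.8], gain=2.0))  # [0.5, -1.0, 1.0]
```

In this example the third sample would reach 1.6 at 2.0x gain and is clipped to 1.0, which is the kind of distortion to watch for when raising the gain.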

voice_silence_threshold

This parameter sets the volume level below which audio is treated as silence, with a default value of 0.02. It is used to detect pauses or silent segments in the audio, which is useful for trimming or further processing. Raise the threshold in noisy environments so that background noise still counts as silence; lower it if quiet speech is being mistaken for silence.

voice_silence_duration

This parameter specifies the minimum duration of silence required to be considered a pause, with a default of 2.0 seconds. It works in conjunction with the silence threshold to identify significant pauses in the recording. This setting is useful for applications that need to segment audio based on pauses.
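
The interaction between voice_silence_threshold and voice_silence_duration can be sketched as a search for the first run of quiet samples that lasts long enough. This is an illustration only: it compares each sample's absolute amplitude to the threshold, whereas a real implementation may measure volume differently (for example, RMS over short frames). The function name find_pause is hypothetical.

```python
def find_pause(samples, sample_rate, threshold=0.02, min_silence=2.0):
    """Return the index where the first qualifying pause starts, or None.

    A pause is a run of consecutive samples whose absolute value stays
    below `threshold` for at least `min_silence` seconds.
    """
    needed = int(sample_rate * min_silence)
    run_start = None
    for i, s in enumerate(samples):
        if abs(s) < threshold:
            if run_start is None:
                run_start = i  # a new quiet run begins here
            if i - run_start + 1 >= needed:
                return run_start
        else:
            run_start = None  # a loud sample resets the run
    return None
```

A short quiet gap that ends before min_silence seconds have elapsed is ignored, which is why both parameters need to be tuned together.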

voice_auto_normalize

This boolean parameter determines whether the recorded audio should be automatically normalized to a standard volume level, with a default setting of True. Normalization ensures consistent audio levels across recordings, which is beneficial for maintaining uniformity in audio output.
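
One common way to normalize audio is peak normalization, where the whole recording is scaled so its loudest sample reaches a chosen level. The sketch below shows that technique; whether the node uses peak normalization or another method, and what target level it uses, is not stated in this documentation, so target_peak is a hypothetical parameter.

```python
def peak_normalize(samples, target_peak=0.9):
    """Scale samples so the largest absolute value equals `target_peak`.

    Silent input (all zeros) is returned unchanged to avoid dividing by zero.
    """
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return list(samples)
    scale = target_peak / peak
    return [s * scale for s in samples]
```

Because every sample is multiplied by the same factor, the relative dynamics of the recording are preserved; only the overall level changes.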

voice_trigger

This parameter is used to specify a trigger mechanism for starting the recording process. It can be configured to respond to specific events or conditions, providing flexibility in how recordings are initiated.

🎙️ Voice Capture Output Parameters:

audio_tensor

The node outputs the recorded audio as a tensor. This tensor can be passed to other nodes or systems as input for further audio processing or analysis.

🎙️ Voice Capture Usage Tips:

  • Ensure that the correct audio input device is selected to capture the desired audio source effectively.
  • Adjust the sample rate and volume gain settings to balance audio quality with system performance and avoid distortion.
  • Use the silence threshold and duration parameters to accurately detect and handle pauses in the audio recording.

🎙️ Voice Capture Common Errors and Solutions:

❌ ChatterBox Voice Capture error: <SOUNDDEVICE_ERROR>

  • Explanation: This error indicates that there is an issue with the sound device configuration or availability.
  • Solution: Ensure that the correct audio input device is selected and properly configured. Check if the necessary drivers or software for the audio device are installed and up to date.

📋 Install PortAudio to enable voice recording

  • Explanation: This message suggests that the PortAudio library, which is required for voice recording, is not installed.
  • Solution: Install PortAudio using the appropriate method for your operating system. On Linux (Debian/Ubuntu), run sudo apt-get install portaudio19-dev; on macOS, run brew install portaudio. On Windows, PortAudio is normally bundled with the sounddevice package, so reinstalling sounddevice is usually enough.
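
To tell a missing Python package apart from a missing PortAudio library, a quick check like the one below can be run in the Python environment that ComfyUI uses. It assumes the node depends on the sounddevice package; check_sounddevice is an illustrative helper, not part of the extension.

```python
def check_sounddevice():
    """Return (ok, message) describing whether sounddevice can be loaded."""
    try:
        import sounddevice  # noqa: F401
    except ImportError:
        return False, "sounddevice is not installed (try: pip install sounddevice)"
    except OSError as exc:
        # sounddevice raises OSError when the PortAudio library cannot be found
        return False, f"PortAudio could not be loaded: {exc}"
    return True, "sounddevice and PortAudio are available"


ok, message = check_sounddevice()
print(message)
```

If the message reports a PortAudio problem, the operating-system install steps above apply; if it reports that sounddevice is missing, installing the Python package is the first thing to try.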
