ComfyUI Node: Dots TTS Voice Clone

Class Name: DotsTTSVoiceClone
Category: Dots TTS
Author: Saganaki22 (account age: 1867 days)
Extension: Dots-TTS-ComfyUI
Latest Updated: 2026-06-23
GitHub Stars: 0.03K

Table of Contents

Description
DotsTTSVoiceClone:
DotsTTSVoiceClone Input Parameters:
DotsTTSVoiceClone Output Parameters:
DotsTTSVoiceClone Usage Tips:
DotsTTSVoiceClone Common Errors and Solutions:
Related Nodes

How to Install Dots-TTS-ComfyUI

Install this extension via the ComfyUI Manager by searching for Dots-TTS-ComfyUI:

1. Click the Manager button in the main menu.
2. Select the Custom Nodes Manager button.
3. Enter Dots-TTS-ComfyUI in the search bar and install the extension from the results.

After installation, click the Restart button to restart ComfyUI. Then, manually refresh your browser to clear the cache and access the updated list of nodes.

Dots TTS Voice Clone Description

Generate realistic synthetic speech by cloning a reference speaker's voice with the Dots TTS model.

Dots TTS Voice Clone:

The DotsTTSVoiceClone node generates speech with the Dots TTS model in a voice cloned from a short reference clip. You supply the reference audio and the text to speak, and the node produces speech that sounds like the reference speaker while keeping the natural flow and intonation of human speech. This is useful wherever a consistent, recognizable voice is needed, such as virtual assistants, audiobooks, and other multimedia content.

Dots TTS Voice Clone Input Parameters:

dotstts_model

This parameter specifies the Dots TTS model to be used for voice cloning. It is crucial as it determines the underlying architecture and capabilities of the voice synthesis process. The model should be pre-loaded and compatible with the Dots TTS framework.

reference_audio

The reference_audio parameter provides a sample of the speaker's voice you wish to clone. It should be a clean audio clip, typically 3 to 15 seconds long, to ensure accurate voice cloning. This audio serves as the basis for capturing the unique vocal characteristics of the speaker.
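The length guideline above can be checked before a clip is sent to the node. The following is a minimal sketch, assuming you know the clip's sample count and sample rate (ComfyUI audio is usually passed as a waveform plus a sample rate, but how this node reads it internally is not documented here). The function names are hypothetical helpers, not part of the extension.

```python
def reference_duration_seconds(num_samples: int, sample_rate: int) -> float:
    """Duration of a clip given its sample count and sample rate."""
    if sample_rate <= 0:
        raise ValueError("sample_rate must be positive")
    return num_samples / sample_rate


def is_recommended_reference_length(num_samples: int, sample_rate: int,
                                    min_seconds: float = 3.0,
                                    max_seconds: float = 15.0) -> bool:
    """True when the clip falls inside the suggested 3 to 15 second range."""
    duration = reference_duration_seconds(num_samples, sample_rate)
    return min_seconds <= duration <= max_seconds


# A 10-second clip at 24 kHz is inside the range; a 1-second clip is not.
print(is_recommended_reference_length(240_000, 24_000))
print(is_recommended_reference_length(24_000, 24_000))
```

The 24 kHz rate is only an example value; use whatever rate your clip actually has.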

text

This parameter represents the text that you want to be converted into speech. The text input is essential as it defines the content of the generated speech. The node uses this text to produce a voice output that mimics the reference speaker's voice.

reference_text

The reference_text parameter is used in conjunction with the reference audio to enhance the accuracy of the voice cloning process. It provides the textual content of the reference audio, allowing the model to better align the audio characteristics with the intended speech content.

steps

This parameter controls the number of steps the model takes during the voice cloning process. A higher number of steps can lead to more refined and accurate voice synthesis, but it may also increase the processing time.

CFG

The CFG (Classifier-Free Guidance) parameter influences the balance between adhering to the reference audio's characteristics and the model's generalization capabilities. Adjusting this parameter can help fine-tune the voice output to be more or less similar to the reference speaker.
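For intuition, classifier-free guidance is commonly written as a blend of a conditional and an unconditional prediction, scaled by the guidance value. The sketch below shows that generic formula on plain Python lists. Whether Dots TTS applies guidance in exactly this form is an assumption, and apply_cfg is a hypothetical name used only for illustration.

```python
def apply_cfg(conditional, unconditional, scale):
    """Generic CFG blend: unconditional + scale * (conditional - unconditional)."""
    if len(conditional) != len(unconditional):
        raise ValueError("predictions must have the same length")
    return [u + scale * (c - u) for c, u in zip(conditional, unconditional)]


cond = [1.0, 2.0]    # prediction that uses the conditioning
uncond = [0.0, 1.0]  # prediction that ignores the conditioning

print(apply_cfg(cond, uncond, 0.0))  # scale 0 returns the unconditional prediction
print(apply_cfg(cond, uncond, 1.0))  # scale 1 returns the conditional prediction
print(apply_cfg(cond, uncond, 2.0))  # larger scales push further toward the conditioning
```

In this formulation, higher values follow the conditioning more strongly, which is why very high settings can trade naturalness for similarity.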

seed

The seed parameter is used to initialize the random number generator, ensuring reproducibility of the voice cloning results. By setting a specific seed value, you can achieve consistent outputs across different runs with the same input parameters.
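The reproducibility idea can be illustrated with Python's standard random module. This is a stand-in only: the node presumably seeds the random number generator of its own framework, and the noise function here is a hypothetical example, not the node's code.

```python
import random


def noise(seed: int, count: int = 3):
    """Draw a few Gaussian samples from a generator initialized with the seed."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(count)]


# The same seed reproduces the same samples; a different seed does not.
print(noise(42) == noise(42))
print(noise(42) == noise(43))
```

The same principle applies to the node: identical inputs plus an identical seed should give the same audio, while changing only the seed gives a different take.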

language

This parameter specifies the language of the text input. It is important for ensuring that the generated speech adheres to the phonetic and linguistic rules of the specified language, thereby enhancing the naturalness of the voice output.

normalize_text

The normalize_text parameter determines whether the input text should be normalized before processing. Normalization can involve converting numbers to words, expanding abbreviations, and other text preprocessing steps to improve the clarity and accuracy of the generated speech.
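To make the kind of preprocessing described above concrete, here is a toy normalizer that expands a few abbreviations and spells out digits one at a time. It is an illustrative sketch only; the abbreviation table and the digit-by-digit rule are assumptions and do not reflect the normalizer the extension actually uses.

```python
import re

# Tiny illustrative tables; a real normalizer covers far more cases.
ABBREVIATIONS = {"Dr.": "Doctor", "St.": "Street", "etc.": "et cetera"}
ONES = ["zero", "one", "two", "three", "four",
        "five", "six", "seven", "eight", "nine"]


def spell_digits(match: re.Match) -> str:
    """Spell a run of digits one digit at a time, e.g. '42' -> 'four two'."""
    return " ".join(ONES[int(d)] for d in match.group())


def normalize(text: str) -> str:
    """Expand known abbreviations, spell out digits, and collapse whitespace."""
    for abbr, full in ABBREVIATIONS.items():
        text = text.replace(abbr, full)
    text = re.sub(r"[0-9]+", spell_digits, text)
    return re.sub(r"\s+", " ", text).strip()


print(normalize("Dr. Lee lives at 42 Main St."))
```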

max_audio_patches

This parameter sets the maximum audio budget for the voice cloning process. Each audio patch corresponds to approximately 0.32 seconds of audio. By adjusting this parameter, you can control the maximum duration of the generated speech, which is particularly useful for longer texts.
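The relationship between patches and duration is simple arithmetic. The sketch below uses the approximate figure of 0.32 seconds per patch quoted above; the helper names are hypothetical, and the results are estimates because the per-patch duration is itself approximate.

```python
import math

SECONDS_PER_PATCH = 0.32  # approximate figure quoted in the description above


def max_duration_seconds(max_audio_patches: int) -> float:
    """Upper bound on generated audio length for a given patch budget."""
    return max_audio_patches * SECONDS_PER_PATCH


def patches_for_duration(seconds: float) -> int:
    """Smallest patch budget that covers the requested duration."""
    # Round first so floating-point noise does not add a spurious extra patch.
    return math.ceil(round(seconds / SECONDS_PER_PATCH, 9))


print(max_duration_seconds(100))  # roughly 32 seconds
print(patches_for_duration(60))   # patches needed for about one minute
```

If speech for a long text is being cut off, estimating the expected duration and converting it with patches_for_duration gives a starting value for max_audio_patches.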

Dots TTS Voice Clone Output Parameters:

audio

The audio output parameter provides the generated speech audio as a result of the voice cloning process. This audio output is the synthesized voice that mimics the reference speaker, delivering the input text in a natural and personalized manner. The quality and accuracy of this output depend on the input parameters and the capabilities of the Dots TTS model used.

Dots TTS Voice Clone Usage Tips:

Ensure that the reference audio is clean and free from background noise to achieve the best voice cloning results.
Experiment with different CFG values to find the optimal balance between voice similarity and naturalness for your specific application.
Use a consistent seed value if you need to reproduce the same voice output across multiple runs.
Adjust the max_audio_patches parameter if you are working with longer texts to prevent the model from prematurely stopping the audio generation.
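The tips above can be combined into a single workflow description. The sketch below builds a ComfyUI API-format prompt (node ids mapped to class_type and inputs, with links written as [node_id, output_index]). Several things here are assumptions: the loader class name DotsTTSModelLoader is hypothetical, the input keys simply mirror the parameter names listed on this page and may differ in the real node, and every value is illustrative.

```python
import json


def build_voice_clone_prompt(text, reference_text, reference_file, seed=0):
    """Assemble an API-format prompt dict for a voice clone run (illustrative)."""
    return {
        # Hypothetical loader node; check the extension for the real class name.
        "1": {"class_type": "DotsTTSModelLoader", "inputs": {}},
        # Core ComfyUI node that loads an audio file from the input folder.
        "2": {"class_type": "LoadAudio", "inputs": {"audio": reference_file}},
        "3": {
            "class_type": "DotsTTSVoiceClone",
            "inputs": {
                "dotstts_model": ["1", 0],    # link to the loader output
                "reference_audio": ["2", 0],  # link to the loaded clip
                "text": text,
                "reference_text": reference_text,
                "steps": 32,                  # illustrative value
                "CFG": 2.0,                   # illustrative value
                "seed": seed,                 # fixed seed for repeatable output
                "language": "en",             # illustrative value
                "normalize_text": True,
                "max_audio_patches": 200,     # roughly 64 s at 0.32 s per patch
            },
        },
    }


prompt = build_voice_clone_prompt(
    text="Hello from a cloned voice.",
    reference_text="This is the transcript of the reference clip.",
    reference_file="speaker.wav",
    seed=1234,
)
print(json.dumps(prompt, indent=2))
```

Keeping the seed and other inputs in one place like this makes it easier to rerun the same configuration while changing a single setting at a time.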

Dots TTS Voice Clone Common Errors and Solutions:

"decoder window size must be larger than chunk_size"

Explanation: This error occurs when the size of the decoder window is smaller than the chunk size of the audio being processed.
Solution: Ensure that the decoder window size is appropriately configured to be larger than the chunk size. Adjust the model settings or input parameters to resolve this issue.

"Invalid reference audio length"

Explanation: This error indicates that the reference audio provided is either too short or too long for effective voice cloning.
Solution: Provide a reference audio clip that is 3 to 15 seconds long to ensure optimal voice cloning performance.

"Model not loaded"

Explanation: This error suggests that the Dots TTS model has not been properly loaded or initialized before attempting voice cloning.
Solution: Verify that the model is correctly loaded and compatible with the Dots TTS framework before executing the voice cloning process.

Dots TTS Voice Clone Related Nodes

Go back to the extension to check out more related nodes.

Dots-TTS-ComfyUI
