RunComfy

ComfyUI Node: ElevenLabs Speech to Speech

Class Name
ElevenLabsSpeechToSpeech

Category
api node/audio/ElevenLabs

Author
ComfyAnonymous (Account age: 763 days)

Extension
ComfyUI

Latest Updated
2026-05-13

Github Stars
112.77K

Table of Contents

Description
ElevenLabsSpeechToSpeech:
ElevenLabsSpeechToSpeech Input Parameters:
ElevenLabsSpeechToSpeech Output Parameters:
ElevenLabsSpeechToSpeech Usage Tips:
ElevenLabsSpeechToSpeech Common Errors and Solutions:
Related Nodes

How to Install ComfyUI

Install this extension through the ComfyUI Manager by searching for ComfyUI:

1. Click the Manager button in the main menu.
2. Click the Custom Nodes Manager button.
3. Enter ComfyUI in the search bar.

After installation, click the Restart button to restart ComfyUI. Then, manually refresh your browser to clear the cache and access the updated list of nodes.

ElevenLabs Speech to Speech Description

Convert speech from one voice to another while preserving the original content and emotion.

ElevenLabs Speech to Speech:

The ElevenLabsSpeechToSpeech node transforms speech from one voice to another while preserving the original content and emotion. It suits voice-conversion work such as dubbing, voiceovers, and personalized voice experiences. Because the emotional nuance and intent of the source recording carry over to the target voice, the result sounds natural and keeps the impact of the original performance.

ElevenLabs Speech to Speech Input Parameters:

voice

The voice parameter specifies the target voice for the transformation and determines the voice characteristics of the output. Connect it from the Voice Selector or Instant Voice Clone to use a predefined or custom voice.

audio

The audio parameter is the source audio to transform. It supplies the original speech content for voice conversion. The quality and clarity of the source audio significantly affect the result, so use high-quality recordings.

stability

The stability parameter controls voice stability during transformation. It ranges from 0.0 to 1.0, with a default of 0.5. Lower values allow a broader emotional range; higher values produce more consistent but potentially monotonous speech.

model

The model parameter selects the model used for the speech-to-speech transformation. Options include eleven_multilingual_sts_v2 and eleven_english_sts_v2. The choice determines the capabilities of the transformation, such as language support and voice synthesis quality, so it matters most for multilingual content or specific voice characteristics.
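
The four inputs above can be sketched as a node entry in ComfyUI's API-format workflow JSON. This is a minimal sketch under stated assumptions, not the node's actual implementation: the helper name build_sts_node is hypothetical, the link values (["1", 0] and ["2", 0]) assume upstream nodes with those IDs exist in the same workflow, and the real node may expose inputs that are not listed on this page. The range and option checks only mirror the values documented above.

```python
import json

# Values taken from the parameter descriptions on this page.
DOCUMENTED_MODELS = ("eleven_multilingual_sts_v2", "eleven_english_sts_v2")


def build_sts_node(voice_link, audio_link, stability=0.5,
                   model="eleven_multilingual_sts_v2"):
    """Return an API-format node entry for ElevenLabsSpeechToSpeech.

    Hypothetical helper: voice_link and audio_link are [node_id, output_index]
    pairs pointing at upstream nodes in the same workflow.
    """
    if not 0.0 <= stability <= 1.0:
        raise ValueError(f"stability must be within 0.0-1.0, got {stability}")
    if model not in DOCUMENTED_MODELS:
        raise ValueError(f"model must be one of {DOCUMENTED_MODELS}, got {model!r}")
    return {
        "class_type": "ElevenLabsSpeechToSpeech",
        "inputs": {
            "voice": list(voice_link),
            "audio": list(audio_link),
            "stability": float(stability),
            "model": model,
        },
    }


if __name__ == "__main__":
    # Placeholder links: node "1" supplies the voice, node "2" the audio.
    node = build_sts_node(["1", 0], ["2", 0], stability=0.35)
    print(json.dumps(node, indent=2))
```

Validating before the workflow is queued keeps an out-of-range stability or a mistyped model name from reaching the API call.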

ElevenLabs Speech to Speech Output Parameters:

audio_output

The audio_output parameter is the transformed audio: the original speech content rendered in the selected target voice.

ElevenLabs Speech to Speech Usage Tips:

Experiment with different stability settings to achieve the desired emotional expressiveness in the transformed voice. Lower stability values can add more emotional depth, while higher values ensure consistency.
Choose the appropriate model based on the language and voice characteristics required for your project. This can significantly impact the quality and naturalness of the voice transformation.
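
One way to apply the stability tip systematically is to queue the same workflow several times with different stability values and compare the results by ear. The sketch below only builds the variants and does not contact any server. It assumes an API-format workflow dictionary in which node ID "3" is the ElevenLabsSpeechToSpeech node; that ID, the placeholder upstream links, and the helper name stability_variants are all hypothetical.

```python
import copy


def stability_variants(workflow, node_id, values):
    """Yield (stability, workflow_copy) pairs, one per stability value.

    The input workflow is never mutated; each variant is a deep copy.
    """
    for value in values:
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"stability must be within 0.0-1.0, got {value}")
        variant = copy.deepcopy(workflow)
        variant[node_id]["inputs"]["stability"] = value
        yield value, variant


# Illustrative workflow fragment; the upstream links are placeholders.
base = {
    "3": {
        "class_type": "ElevenLabsSpeechToSpeech",
        "inputs": {
            "voice": ["1", 0],
            "audio": ["2", 0],
            "stability": 0.5,
            "model": "eleven_multilingual_sts_v2",
        },
    }
}

variants = list(stability_variants(base, "3", [0.2, 0.5, 0.8]))
for value, wf in variants:
    print(value, wf["3"]["inputs"]["stability"])
```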

ElevenLabs Speech to Speech Common Errors and Solutions:

Unknown voice: `<voice_name>`

Explanation: This error occurs when the specified voice is not recognized by the system, possibly due to a typo or an unsupported voice selection.
Solution: Verify that the voice name is correct and matches one of the available options in the Voice Selector or Instant Voice Clone. Ensure that the voice is supported by the selected model.

Invalid audio input

Explanation: This error indicates that the provided audio input is not in a compatible format or is corrupted.
Solution: Ensure that the audio file is in a supported format and is not corrupted. Re-upload the audio file if necessary and check for any issues with the file integrity.
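
A typo in the voice name can be caught before the workflow is queued. The following is a minimal sketch of such a pre-flight check, not how the node itself validates voices: check_voice_name is a hypothetical helper, and the names in voices are placeholders for whatever the Voice Selector offers in your setup.

```python
import difflib


def check_voice_name(name, available):
    """Return name if it is in available, else raise with close suggestions."""
    if name in available:
        return name
    # Suggest near matches so a simple typo is easy to spot.
    suggestions = difflib.get_close_matches(name, available, n=3, cutoff=0.6)
    hint = f" Did you mean: {', '.join(suggestions)}?" if suggestions else ""
    raise ValueError(f"Unknown voice: {name}.{hint}")


voices = ["Rachel", "Adam", "Bella"]  # placeholder names

print(check_voice_name("Adam", voices))
try:
    check_voice_name("Rachl", voices)
except ValueError as err:
    print(err)
```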

ElevenLabs Speech to Speech Related Nodes

Go back to the extension to check out more related nodes.
