RunComfy

ComfyUI Node: qwen3 / voiceDesign

Class Name

CivitaiTextToSpeechVllmOmniQwen3VoiceDesign

Category
Civitai/Audio/qwen3

Author
civitai (Account age: 1322 days)

Extension
civitai-comfy-nodes

Latest Updated
2026-06-18

GitHub Stars
0.02K

Table of Contents

Description
CivitaiTextToSpeechVllmOmniQwen3VoiceDesign:
CivitaiTextToSpeechVllmOmniQwen3VoiceDesign Input Parameters:
CivitaiTextToSpeechVllmOmniQwen3VoiceDesign Output Parameters:
CivitaiTextToSpeechVllmOmniQwen3VoiceDesign Usage Tips:
CivitaiTextToSpeechVllmOmniQwen3VoiceDesign Common Errors and Solutions:
Related Nodes

How to Install civitai-comfy-nodes

Install this extension via the ComfyUI Manager by searching for civitai-comfy-nodes:

1. Click the Manager button in the main menu.
2. Select the Custom Nodes Manager button.
3. Enter civitai-comfy-nodes in the search bar.

After installation, click the Restart button to restart ComfyUI. Then, manually refresh your browser to clear the cache and access the updated list of nodes.

qwen3 / voiceDesign Description

A text-to-speech node that uses the vllm-omni engine, with a focus on voice design.

qwen3 / voiceDesign:

CivitaiTextToSpeechVllmOmniQwen3VoiceDesign converts text into speech using the vllm-omni engine within the qwen3 ecosystem. It is part of the Civitai Orchestration suite and focuses on voice design: producing natural-sounding audio whose voice can be tailored to a project's needs. It is aimed at AI artists and developers who want to add realistic voice synthesis to their projects.

qwen3 / voiceDesign Input Parameters:

text

The text parameter is the core input for the node: the content you want converted into speech. There is no specific minimum or maximum length for this parameter, but longer or more complex text can increase processing time and affect the resulting audio quality.

language

The language parameter specifies the language in which the text is synthesized, which determines pronunciation and intonation. The available options typically include a range of supported languages, which allows for multilingual projects. Choose the language that matches the text to get accurate, natural-sounding speech.

max_new_tokens

The max_new_tokens parameter sets the maximum number of tokens the model may generate for the audio output. Tokens are the model's generation units and do not correspond one-to-one with words. This cap controls the length of the synthesized speech: higher values allow longer outputs, while a value that is too low can cut the audio off before the text is complete.
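As a rough illustration of how this cap relates to audio length, the sketch below assumes the model emits audio tokens at a fixed rate. That assumption, the helper name max_audio_seconds, and the rate used in the example are all hypothetical; this page does not state the node's actual token rate.

```python
def max_audio_seconds(max_new_tokens: int, tokens_per_second: float) -> float:
    """Upper bound on audio length if tokens are emitted at a fixed rate.

    tokens_per_second is an assumed, model-specific value; check the
    documentation of the model you actually run.
    """
    if max_new_tokens <= 0:
        raise ValueError("max_new_tokens must be positive")
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return max_new_tokens / tokens_per_second


# Example with a made-up rate of 10 tokens per second of audio.
print(max_audio_seconds(1000, 10.0))
```

If the real rate is known, the same division can be inverted to pick a max_new_tokens value that comfortably covers the longest clip a project needs.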

instruct

The instruct parameter provides additional instructions or context for the text-to-speech conversion process. This can include specific guidelines on tone, style, or emphasis, helping to tailor the audio output to meet particular requirements. The use of this parameter can enhance the expressiveness and customization of the generated speech.
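The four inputs above can be pictured as one node entry in a ComfyUI API-format workflow, where each node is a dictionary with a class_type and an inputs mapping. The following is a minimal sketch under stated assumptions: the helper voice_design_node, the node id "1", and every input value (including the language string "English") are placeholders, and the node may require inputs or connections that this page does not list.

```python
import json


def voice_design_node(text, language, max_new_tokens, instruct):
    """Return one node entry in ComfyUI's API-format workflow layout."""
    return {
        "class_type": "CivitaiTextToSpeechVllmOmniQwen3VoiceDesign",
        "inputs": {
            "text": text,
            "language": language,
            "max_new_tokens": max_new_tokens,
            "instruct": instruct,
        },
    }


# Node id "1" and all input values are placeholders for illustration.
workflow = {
    "1": voice_design_node(
        text="Welcome back. Today's episode starts now.",
        language="English",  # assumed option string; check the node's dropdown
        max_new_tokens=2048,
        instruct="A calm, warm narrator speaking at a relaxed pace.",
    )
}
print(json.dumps(workflow, indent=2))
```

Keeping the inputs in one helper makes it easy to vary only the instruct string when comparing voice descriptions for the same text.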

qwen3 / voiceDesign Output Parameters:

audio_blob

The audio_blob output is the primary result of the node, containing the synthesized audio data. This output is crucial as it represents the final speech generated from the input text, ready for use in various applications such as voiceovers, narrations, or interactive media.

model_type

The model_type output provides information about the specific model used for the text-to-speech conversion. This can be useful for understanding the characteristics and capabilities of the generated audio, as different models may offer varying levels of quality and naturalness.

speaker

The speaker output indicates the voice or persona used in the audio synthesis. This can be important for projects requiring consistent voice characteristics or when multiple voices are involved in a single application.

workflow_id

The workflow_id output is a unique identifier for the specific text-to-speech conversion process. This can be helpful for tracking and managing multiple audio generation tasks, ensuring that each output is correctly associated with its corresponding input.

raw_json

The raw_json output contains the raw data and metadata associated with the text-to-speech process. This can include detailed information about the conversion parameters and results, providing insights for debugging or further analysis.
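When debugging, a quick first step is to see which top-level fields raw_json carries. The sketch below assumes raw_json arrives as a JSON string holding an object; the helper summarize_raw_json and the sample payload are made up for illustration, and the real field names may differ.

```python
import json


def summarize_raw_json(raw_json: str) -> dict:
    """Parse raw_json and map each top-level key to its value's type name."""
    data = json.loads(raw_json)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object at the top level")
    return {key: type(value).__name__ for key, value in data.items()}


# Made-up payload; the node's actual raw_json fields are not documented here.
sample = '{"workflow_id": "abc123", "steps": [], "cost": 1.5}'
print(summarize_raw_json(sample))
```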

qwen3 / voiceDesign Usage Tips:

Experiment with different language settings to achieve the most natural-sounding speech for your target audience.
Use the instruct parameter to add specific emotional tones or emphasis to the speech, enhancing the expressiveness of the audio output.
Adjust the max_new_tokens parameter to control the length of the audio, ensuring it fits within your project's requirements.

qwen3 / voiceDesign Common Errors and Solutions:

Invalid language selection

Explanation: The chosen language is not supported by the node.
Solution: Verify the list of supported languages and select an appropriate option.

Exceeded max_new_tokens limit

Explanation: Synthesizing the input text requires more tokens than the max_new_tokens limit allows.
Solution: Reduce the length of the input text or increase the max_new_tokens parameter if possible.

Missing text input

Explanation: No text was provided for conversion.
Solution: Ensure that the text parameter is populated with the desired content before executing the node.
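Two of these errors can be caught before the node runs. The sketch below is a hypothetical pre-flight check, not part of the extension: preflight_errors is an invented helper, and the caller must supply the set of supported languages, since this page does not list them. The max_new_tokens case is left out because checking it would require knowing how the model tokenizes its output.

```python
def preflight_errors(text, language, supported_languages):
    """Return the names of the documented errors these inputs would trigger."""
    problems = []
    # An empty or whitespace-only string counts as missing text.
    if not text or not text.strip():
        problems.append("Missing text input")
    if language not in supported_languages:
        problems.append("Invalid language selection")
    return problems


# The supported set here is a placeholder, not the node's real list.
print(preflight_errors("   ", "Klingon", {"English", "Chinese"}))
print(preflight_errors("Hello there.", "English", {"English", "Chinese"}))
```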

qwen3 / voiceDesign Related Nodes

Go back to the extension to check out more related nodes.

civitai-comfy-nodes

Support

Resources

Legal

RunComfy

RunComfy is the premier ComfyUI platform, offering ComfyUI online environment and services, along with ComfyUI workflows featuring stunning visuals. RunComfy also provides AI Models, enabling artists to harness the latest AI tools to create incredible art.
