RunComfy

ComfyUI Node: qwen3 / base

Class Name: CivitaiTextToSpeechVllmOmniQwen3Base
Category: Civitai/Audio/qwen3
Author: civitai (Account age: 1322 days)
Extension: civitai-comfy-nodes
Latest Updated: 2026-06-18
GitHub Stars: 0.02K

Table of Contents

Description
CivitaiTextToSpeechVllmOmniQwen3Base:
CivitaiTextToSpeechVllmOmniQwen3Base Input Parameters:
CivitaiTextToSpeechVllmOmniQwen3Base Output Parameters:
CivitaiTextToSpeechVllmOmniQwen3Base Usage Tips:
CivitaiTextToSpeechVllmOmniQwen3Base Common Errors and Solutions:
Related Nodes

How to Install civitai-comfy-nodes

Install this extension through the ComfyUI Manager:

1. Click the Manager button in the main menu.
2. Click the Custom Nodes Manager button.
3. Enter civitai-comfy-nodes in the search bar, then install the extension from the results.

After installation, click the Restart button to restart ComfyUI. Then, manually refresh your browser to clear the cache and access the updated list of nodes.

qwen3 / base Description

Converts text to speech using the vllm-omni engine, as part of the Civitai Orchestration suite.

qwen3 / base:

CivitaiTextToSpeechVllmOmniQwen3Base converts text into speech using the vllm-omni engine with the qwen3 model family. It is part of the Civitai Orchestration suite. The node takes written text and returns synthesized audio, which makes it suited to voice-synthesis tasks such as virtual assistants, audiobooks, and interactive media. Optional reference inputs (ref_audio_url and ref_text) let you steer the generated voice toward a sample speaker.

qwen3 / base Input Parameters:

text

The text parameter is the core input for this node, representing the written content you wish to convert into speech. It directly influences the audio output, as the node will synthesize speech based on the text provided. There are no specific minimum or maximum values for this parameter, but the length of the text may affect processing time and the resulting audio's duration.

language

The language parameter specifies the language in which the text is written. This is crucial for ensuring that the speech synthesis engine correctly interprets and pronounces the text. The choice of language can significantly impact the accuracy and naturalness of the generated speech. While specific language options are not detailed, it is important to select the appropriate language for your text to achieve the best results.

max_new_tokens

The max_new_tokens parameter sets an upper bound on the number of tokens the model generates for a single request. Because the generated tokens are what becomes audio, this value caps the length of the synthesized speech rather than the length of the input text. Tokens are model units, not words. Raise the value if long texts come out truncated; lower it to bound processing time.

ref_audio_url

The ref_audio_url parameter allows you to provide a reference audio URL, which the node can use to match the style or tone of the generated speech. This can be particularly useful if you want the synthesized voice to mimic a specific speaker or audio sample. The URL should point to an accessible audio file that the node can analyze.

ref_text

The ref_text parameter is an optional reference text that accompanies ref_audio_url. In reference-based (voice-cloning) speech synthesis, this is typically the transcript of the reference audio, which helps the model relate the sample voice to the words spoken.

x_vector_only_mode

The x_vector_only_mode parameter is a toggle. An x-vector is a fixed-length speaker embedding extracted from audio. When this mode is enabled, the node generates speech using only the x-vector, presumably the one derived from the reference audio, instead of the full set of reference inputs. The default setting is typically disabled.
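The six inputs above can be collected and sanity-checked before a workflow is queued. The following is a minimal sketch, not the node's actual implementation: the class name Qwen3BaseInputs, the default values, and the checks themselves are assumptions made for illustration; only the field names come from the parameter list above.

```python
from dataclasses import dataclass, asdict
from typing import List


@dataclass
class Qwen3BaseInputs:
    """Hypothetical container mirroring the node's six input parameters."""
    text: str
    language: str = "English"      # assumed value; supported languages are not listed here
    max_new_tokens: int = 2048     # assumed default, not taken from the node
    ref_audio_url: str = ""
    ref_text: str = ""
    x_vector_only_mode: bool = False

    def problems(self) -> List[str]:
        """Return human-readable issues; an empty list means no issue was found."""
        issues = []
        if not self.text.strip():
            issues.append("text is empty")
        if not self.language.strip():
            issues.append("language is empty")
        if self.max_new_tokens <= 0:
            issues.append("max_new_tokens must be positive")
        return issues


inputs = Qwen3BaseInputs(text="Hello there.", ref_audio_url="https://example.com/voice.wav")
print(inputs.problems())   # list of detected issues
print(asdict(inputs))      # plain dict, convenient for logging
```

Keeping the inputs in one typed object makes it easier to log exactly what was sent alongside the workflow_id that comes back.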

qwen3 / base Output Parameters:

audio_blob

The audio_blob output is the primary result of the node, containing the synthesized speech in audio format. This output is crucial for any application that requires audio playback, as it represents the final product of the text-to-speech conversion process.

model_type

The model_type output provides information about the type of model used for the speech synthesis. This can be useful for understanding the characteristics and capabilities of the generated speech, especially if you are comparing outputs from different models.

speaker

The speaker output indicates the voice or speaker profile used in the synthesis process. This information can be important if you are using multiple speaker profiles or need to ensure consistency across different audio outputs.

workflow_id

The workflow_id output is a unique identifier for the specific text-to-speech conversion process. This can be helpful for tracking and managing multiple synthesis tasks, especially in complex workflows or batch processing scenarios.

raw_json

The raw_json output provides a detailed JSON representation of the synthesis process, including metadata and configuration details. This output is valuable for debugging, analysis, and record-keeping, as it offers insights into the node's operation and settings.
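For record-keeping, the raw_json string can be parsed and reduced to a few fields. This is a sketch under a loud assumption: the structure of raw_json is not documented here, so the key names used below (workflow_id, model_type, speaker) are hypothetical and simply reuse the names of the node's other outputs. The helper summarize_raw_json is likewise illustrative.

```python
import json
from typing import Any, Dict

# Hypothetical keys: assumed to mirror the node's other outputs.
SUMMARY_KEYS = ("workflow_id", "model_type", "speaker")


def summarize_raw_json(raw_json: str) -> Dict[str, Any]:
    """Parse raw_json and keep only the summary keys (missing keys map to None)."""
    data = json.loads(raw_json)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object at the top level")
    return {key: data.get(key) for key in SUMMARY_KEYS}


sample = '{"workflow_id": "abc-123", "model_type": "base", "extra": {"n": 1}}'
print(summarize_raw_json(sample))
```

Using data.get rather than indexing means a response that lacks one of the assumed keys yields None instead of raising.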

qwen3 / base Usage Tips:

Ensure that the language parameter matches the language of your input text to achieve the most accurate and natural speech synthesis.
Use ref_audio_url, together with ref_text, to steer the voice of the generated speech when you need a specific speaker or tone.
Adjust the max_new_tokens parameter to control the length of the generated speech, which can help manage processing time and ensure the output meets your needs.

qwen3 / base Common Errors and Solutions:

Invalid audio URL

Explanation: The ref_audio_url provided is not accessible or does not point to a valid audio file.
Solution: Verify that the URL is correct and points to a publicly accessible audio file. Ensure that the file format is supported by the node.
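A quick local check can catch obviously malformed reference URLs before the node is run. The sketch below only inspects the URL string; it does not confirm that the file is reachable. The function name looks_like_audio_url and the extension list are assumptions, since the formats the node accepts are not listed here.

```python
import os
from urllib.parse import urlparse

# Assumed set of extensions; the node's supported formats are not documented here.
AUDIO_EXTENSIONS = {".wav", ".mp3", ".flac", ".ogg"}


def looks_like_audio_url(url: str) -> bool:
    """Return True if url is http(s), has a host, and its path ends in an audio extension."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        return False
    extension = os.path.splitext(parsed.path)[1].lower()
    return extension in AUDIO_EXTENSIONS


for candidate in ("https://example.com/voice.wav", "ftp://example.com/voice.wav", "voice.wav"):
    print(candidate, looks_like_audio_url(candidate))
```

A URL that passes this check can still fail at run time if the file is private or has moved, so treat it as a first filter only.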

Language not supported

Explanation: The specified language is not supported by the speech synthesis engine.
Solution: Check the list of supported languages and select an appropriate one for your text. If necessary, adjust the text to match a supported language.

Exceeded token limit

Explanation: The speech needed for the input text requires more generated tokens than max_new_tokens allows, so the output is cut short or the request fails.
Solution: Reduce the length of the input text or increase the max_new_tokens value to accommodate longer texts.
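One way to reduce the input length is to split a long text at sentence boundaries and synthesize each piece separately. The sketch below is an illustration only: the helper split_into_chunks and the 200-character budget are assumptions, and a character count is just a rough stand-in for tokens.

```python
import re
from typing import List


def split_into_chunks(text: str, max_chars: int = 200) -> List[str]:
    """Group whole sentences into chunks of at most max_chars characters.

    A single sentence longer than max_chars is kept intact as its own chunk.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: List[str] = []
    current = ""
    for sentence in sentences:
        if not sentence:
            continue
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            chunks.append(current)   # close the current chunk and start a new one
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


for chunk in split_into_chunks("One. Two. Three.", max_chars=9):
    print(chunk)
```

Each chunk can then be passed as the text input in its own run, and the resulting audio clips joined afterwards.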

qwen3 / base Related Nodes

Go back to the extension to check out more related nodes.

civitai-comfy-nodes
