qwen3 / customVoice:
CivitaiTextToSpeechVllmOmniQwen3CustomVoice converts text into speech through the Civitai Orchestration platform, using the vllm-omni engine within the qwen3 ecosystem. You pick one of the built-in speakers, optionally set a language and a style instruction, and the node returns the synthesized audio together with metadata about the run. It is aimed at AI artists and content creators who want to add custom voiceovers to their projects.
qwen3 / customVoice Input Parameters:
text
This parameter is the text you wish to convert into speech. It is a required input and must be provided as a string. Multiline text is accepted, so longer passages can be synthesized in a single run. The audio is generated directly from this text, so proofread it before synthesis.
language
The language parameter allows you to specify the target language for the speech synthesis. It accepts a string value, such as "English" or "Chinese," and defaults to "Auto" if not specified. This parameter ensures that the synthesized speech matches the linguistic characteristics of the desired language.
max_new_tokens
This optional parameter caps the number of tokens generated during speech synthesis. It is an integer with a default of 0, which means no cap is applied. The minimum value is 0, and the maximum is 2,147,483,647. Set a positive value to bound how much speech can be generated.
speaker
The speaker parameter allows you to choose from a list of built-in speaker names, such as "aiden," "dylan," "eric," and others. This selection determines the voice characteristics used in the CustomVoice mode, enabling you to personalize the audio output to suit your project's needs.
instruct
This optional parameter provides style instructions for the speech synthesis, such as "speak slowly and clearly." It accepts a string value and allows you to influence the delivery style of the generated speech, adding an extra layer of customization to the audio output.
api_config
The api_config parameter is optional and is used to configure the Civitai Auth connection. It defaults to using the CIVITAI_API_TOKEN or a stored OAuth login. This configuration is necessary for authenticating and authorizing access to the Civitai platform's resources.
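The inputs above can be summarized as a small data structure. The following is a minimal sketch in Python, not the node's actual implementation: the class `CustomVoiceInputs` and the helper `build_inputs` are hypothetical names, and treating an empty `instruct` string as "no instruction" is an assumption. Only the defaults and the range for max_new_tokens come from the parameter descriptions above.

```python
from dataclasses import dataclass, asdict

INT32_MAX = 2_147_483_647  # documented upper bound for max_new_tokens


@dataclass
class CustomVoiceInputs:
    """Hypothetical container mirroring the documented node inputs."""

    text: str                 # required, may be multiline
    speaker: str              # one of the built-in speaker names
    language: str = "Auto"    # documented default
    max_new_tokens: int = 0   # documented default, 0 means no cap
    instruct: str = ""        # assumption: empty string means no instruction

    def validate(self) -> None:
        # Reject values that fall outside what the documentation allows.
        if not self.text.strip():
            raise ValueError("text is required and must not be empty")
        if not 0 <= self.max_new_tokens <= INT32_MAX:
            raise ValueError("max_new_tokens must be between 0 and 2147483647")


def build_inputs(text: str, speaker: str, **optional) -> dict:
    """Build and validate a plain dict of inputs (hypothetical helper)."""
    inputs = CustomVoiceInputs(text=text, speaker=speaker, **optional)
    inputs.validate()
    return asdict(inputs)


example = build_inputs(
    "Hello, and welcome.",
    "aiden",
    instruct="speak slowly and clearly",
)
```

Keeping the defaults in one place makes it easier to see which values you have overridden for a given run.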
qwen3 / customVoice Output Parameters:
audio_blob
This output parameter contains the synthesized audio in a binary format. It represents the actual speech generated from the input text, ready for playback or further processing.
model_type
The model_type output provides a string indicating the type of model used for the speech synthesis. It lets you confirm which model produced the audio, for example when comparing results across workflows.
speaker
This output returns the name of the speaker used in the synthesis process, so you can verify that the intended voice was applied to the audio output.
workflow_id
The workflow_id is a string that uniquely identifies the workflow instance used for the text-to-speech conversion. It can be helpful for tracking and managing different synthesis tasks within the Civitai platform.
raw_json
This output provides a raw JSON string containing detailed information about the synthesis process. It includes metadata and other relevant data that can be useful for debugging or analyzing the text-to-speech operation.
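If you post-process the outputs in your own script, a minimal Python sketch could look like the one below. It assumes that audio_blob is available as a `bytes` object and that raw_json is a JSON-encoded string. The function name `save_result` is hypothetical, and because the audio container format is not specified here, the caller chooses the file name and extension.

```python
import json
from pathlib import Path


def save_result(audio_blob: bytes, raw_json: str, out_path: str) -> dict:
    """Write the audio bytes to disk and return the parsed metadata.

    Hypothetical helper: assumes audio_blob is raw bytes and raw_json
    is a JSON string, as the output descriptions suggest.
    """
    Path(out_path).write_bytes(audio_blob)
    try:
        meta = json.loads(raw_json)
    except json.JSONDecodeError:
        # Keep going with empty metadata if the string is not valid JSON.
        return {}
    # Wrap non-object JSON (lists, numbers) so callers always get a dict.
    return meta if isinstance(meta, dict) else {"value": meta}
```

The keys inside raw_json are not documented in this section, so inspect the parsed dictionary before relying on any particular field.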
qwen3 / customVoice Usage Tips:
- Experiment with different speaker options to find the voice that best suits your project's tone and style.
- Use the instruct parameter to fine-tune the delivery style of the speech, such as adding pauses or emphasizing certain words.
- Adjust the max_new_tokens parameter to control the length of the generated speech, especially for longer texts.
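To compare voices and delivery styles systematically, you can prepare one set of inputs per combination and run them one after another. The Python sketch below only builds the combinations. The function `voice_variants` and the dictionary layout are hypothetical, and how each variant is submitted to the node depends on your setup.

```python
from itertools import product


def voice_variants(text, speakers, instructions):
    """Return one hypothetical input dict per (speaker, instruct) pair."""
    return [
        {"text": text, "speaker": speaker, "instruct": instruct}
        for speaker, instruct in product(speakers, instructions)
    ]


variants = voice_variants(
    "Welcome to the gallery.",
    speakers=["aiden", "serena"],
    instructions=["", "speak slowly and clearly"],
)

for v in variants:
    # Print a short label for each combination to be synthesized.
    print(v["speaker"], "|", v["instruct"] or "(no instruction)")
```

Listening to the same sentence across all combinations makes differences between speakers easier to judge than comparing unrelated clips.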
qwen3 / customVoice Common Errors and Solutions:
"Invalid API Configuration"
- Explanation: This error occurs when the api_config parameter is not correctly set, preventing authentication with the Civitai platform.
- Solution: Ensure that the api_config is properly configured with a valid CIVITAI_API_TOKEN or OAuth login credentials.
"Unsupported Language"
- Explanation: The specified language is not supported by the text-to-speech engine.
- Solution: Verify that the language parameter is set to a supported language, such as "English" or "Chinese," or use the default "Auto" setting.
"Speaker Not Found"
- Explanation: The chosen speaker name does not exist in the list of available options.
- Solution: Double-check the speaker parameter to ensure it matches one of the available speaker names, such as "aiden" or "serena."
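The three errors above can often be caught before a run is submitted. The following Python sketch is a hypothetical pre-flight check, not part of the node. The function name `preflight` is an assumption, and the lists of supported speakers and languages are passed in by the caller because this page names only a few examples of each.

```python
def preflight(inputs, known_speakers, known_languages, has_credentials):
    """Return a list of problems found in the inputs (empty if none).

    Hypothetical helper: the caller supplies the allowed speaker and
    language names and says whether credentials are available.
    """
    problems = []
    if not has_credentials:
        problems.append("api_config: no CIVITAI_API_TOKEN or OAuth login available")

    language = inputs.get("language", "Auto")  # "Auto" is the documented default
    if language not in known_languages:
        problems.append(f"language: {language!r} is not in the supported list")

    speaker = inputs.get("speaker")
    if speaker not in known_speakers:
        problems.append(f"speaker: {speaker!r} is not a built-in speaker")
    return problems


issues = preflight(
    {"text": "Hello", "speaker": "aiden", "language": "English"},
    known_speakers={"aiden", "dylan", "eric", "serena"},  # partial list from this page
    known_languages={"Auto", "English", "Chinese"},       # partial list from this page
    has_credentials=True,
)
```

How you determine `has_credentials` depends on your setup. Checking an environment variable named CIVITAI_API_TOKEN is one option, on the assumption that the token is supplied that way.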
