千问TTS:
QwenTTS is a ComfyUI node that converts text to speech (TTS) with Qwen TTS models. It turns written content into natural-sounding audio for voiceovers, narration, or any other application that needs synthesized speech. The node supports multiple languages and voices, so you can choose the tone and style of the audio output.
千问TTS Input Parameters:
model_id
The model_id parameter specifies which version of the QwenTTS model generates the speech. The available options are qwen-tts-latest, qwen-tts-2025-05-22, qwen-tts-2025-04-10, and qwen-tts. The default is qwen-tts-latest. Model versions can differ in the quality and characteristics of the audio they produce, so choose the one that fits your needs.
content
The content parameter holds the text you want to convert into speech. It accepts multiline text, so you can enter longer passages or scripts. The default text is "你好,千问!" ("Hello, Qwen!"), and the placeholder reads "TTS text". This text is exactly what the node speaks, so clear, well-structured input gives the best results.
voice
The voice parameter allows you to choose from a variety of voice options, such as Cherry, Serena, Ethan, Chelsie, Dylan, Jada, and Sunny. The default voice is Sunny. Each voice option offers a unique tone and style, enabling you to tailor the audio output to match the desired emotional or contextual tone of your project. Selecting the right voice can significantly enhance the listener's experience.
千问TTS Output Parameters:
音频
The 音频 (audio) output provides the generated audio file resulting from the text-to-speech conversion. It is the primary output of the QwenTTS node and contains the spoken version of the input text, ready for use in podcasts, video narration, or interactive media.
采样率
The 采样率 (sample rate) output reports the sample rate of the generated audio, that is, the number of audio samples per second. Higher sample rates generally allow better sound fidelity. Downstream tools need this value to play back or save the audio at the correct speed and pitch.
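As a rough illustration of how a sample rate relates to audio data, the sketch below computes a clip's duration from its sample count and writes mono 16-bit PCM to an in-memory WAV using Python's standard wave module. The 24000 Hz value, the generated sine tone, and the helper names duration_seconds and write_wav are assumptions for illustration only; the node reports its actual rate through the 采样率 output.

```python
import io
import math
import struct
import wave


def duration_seconds(num_samples: int, sample_rate: int) -> float:
    """Duration of a mono clip: sample count divided by samples per second."""
    return num_samples / sample_rate


def write_wav(samples, sample_rate: int) -> bytes:
    """Encode float samples in [-1, 1] as mono 16-bit PCM WAV bytes."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)            # mono
        wav.setsampwidth(2)            # 2 bytes per sample = 16-bit
        wav.setframerate(sample_rate)  # must match the rate the audio was generated at
        frames = b"".join(
            struct.pack("<h", int(max(-1.0, min(1.0, s)) * 32767)) for s in samples
        )
        wav.writeframes(frames)
    return buf.getvalue()


SAMPLE_RATE = 24000  # assumed value, not taken from the node
# Half a second of a 440 Hz tone stands in for generated speech.
tone = [math.sin(2 * math.pi * 440 * n / SAMPLE_RATE) for n in range(SAMPLE_RATE // 2)]
wav_bytes = write_wav(tone, SAMPLE_RATE)
```

Writing the file with a different rate than the one the audio was generated at would change both playback speed and pitch, which is why the sample rate travels alongside the audio.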
千问TTS Usage Tips:
- Experiment with different voice options to find the one that best suits the tone and style of your project. Each voice has unique characteristics that can enhance the emotional impact of your audio.
- Use the model_id parameter to select the most appropriate model version for your needs. Newer models may offer improved audio quality and naturalness, so consider using the latest version for the best results.
- Ensure that the content parameter is well-structured and free of errors, as this directly affects the clarity and coherence of the generated speech.
千问TTS Common Errors and Solutions:
Invalid model_id
- Explanation: This error occurs when an unsupported or incorrect model ID is specified in the model_id parameter.
- Solution: Verify that the model ID is one of the supported options: qwen-tts-latest, qwen-tts-2025-05-22, qwen-tts-2025-04-10, or qwen-tts.
Unsupported voice selection
- Explanation: This error arises when a voice option not listed in the voice parameter is selected.
- Solution: Ensure that the voice selection is one of the available options: Cherry, Serena, Ethan, Chelsie, Dylan, Jada, or Sunny.
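The model_id and voice checks can be sketched as a small pre-flight validation run before a workflow is queued. The option lists are copied from this page; the function name validate_inputs and the error messages are hypothetical and are not part of the node itself.

```python
# Options as listed on this page.
SUPPORTED_MODELS = (
    "qwen-tts-latest",
    "qwen-tts-2025-05-22",
    "qwen-tts-2025-04-10",
    "qwen-tts",
)
SUPPORTED_VOICES = ("Cherry", "Serena", "Ethan", "Chelsie", "Dylan", "Jada", "Sunny")


def validate_inputs(model_id: str = "qwen-tts-latest", voice: str = "Sunny") -> None:
    """Raise ValueError if model_id or voice is not one of the documented options.

    Hypothetical helper: the defaults mirror the node's documented defaults.
    """
    if model_id not in SUPPORTED_MODELS:
        raise ValueError(
            f"Invalid model_id {model_id!r}; expected one of {', '.join(SUPPORTED_MODELS)}"
        )
    if voice not in SUPPORTED_VOICES:
        raise ValueError(
            f"Unsupported voice {voice!r}; expected one of {', '.join(SUPPORTED_VOICES)}"
        )


validate_inputs(model_id="qwen-tts-2025-05-22", voice="Cherry")  # passes silently
```

The comparison is case-sensitive, so a value such as "sunny" would be rejected; whether the node itself is that strict is not stated here.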
Text input too long
- Explanation: This error may occur if the text input in the content parameter exceeds the maximum allowed length.
- Solution: Break down the text into smaller segments and process them separately to avoid exceeding the input limit.
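One way to break long text into segments is to split on sentence boundaries and pack sentences into chunks under a size limit, as in the minimal sketch below. This page does not state the actual limit, so max_chars=200 is a placeholder assumption, and split_text is a hypothetical helper rather than part of the node.

```python
import re


def split_text(text: str, max_chars: int = 200) -> list[str]:
    """Split text into chunks of at most max_chars, preferring sentence boundaries.

    max_chars is an assumed placeholder; check the service's real limit.
    """
    # Split after Latin sentence punctuation followed by whitespace,
    # or directly after CJK sentence punctuation (which has no trailing space).
    pattern = r"(?<=[.!?])\s+|(?<=[。！？])"
    sentences = [s.strip() for s in re.split(pattern, text) if s.strip()]

    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        # A single sentence longer than the limit is hard-wrapped.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        # Sentences are rejoined with a single space.
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


segments = split_text("One. Two. Three.", max_chars=10)
```

Each segment can then be fed to the content parameter in its own run, and the resulting audio clips concatenated in order.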
