Install this extension via the ComfyUI Manager by searching for ComfyUI-Gemini_TTS:
1. Click the Manager button in the main menu
2. Select the Custom Nodes Manager button
3. Enter ComfyUI-Gemini_TTS in the search bar
After installation, click the Restart button to restart ComfyUI. Then manually refresh your browser to clear the cache and load the updated list of nodes.
ComfyUI-Gemini_TTS is a custom node for ComfyUI that integrates Google's Gemini TTS, enabling high-quality speech generation with over 30 voices, available in both free and paid tiers.
ComfyUI-Gemini_TTS Introduction
ComfyUI-Gemini_TTS is an extension that integrates Google's Gemini Text-to-Speech (TTS) into ComfyUI workflows. It generates high-quality speech with over 30 distinct voices and works with both the free and paid usage tiers. Whether you are an AI artist adding a voice to digital creations or a developer adding natural-sounding speech to an application, the extension converts text to speech with a range of voice options to suit different artistic needs.
How ComfyUI-Gemini_TTS Works
At its core, ComfyUI-Gemini_TTS functions by taking text input and converting it into speech using Google's advanced TTS models. Imagine it as a digital storyteller that reads your script aloud, using a voice of your choice. The extension connects to Google's Gemini TTS API, which processes the text and returns an audio file that you can use in your projects. This process is akin to having a virtual narrator who can switch between different characters and emotions, depending on the voice and settings you choose. The extension handles the technical details, allowing you to focus on the creative aspects of your work.
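The last step of that pipeline, turning the returned audio into a file you can use, can be sketched with the Python standard library alone. This is a minimal illustration, not the extension's actual code. It assumes the API hands back raw 16-bit mono PCM at 24 kHz, which this page does not state, so check the format against Google's documentation. The function names are hypothetical.

```python
import io
import wave

# Assumed audio format (not stated on this page): 16-bit mono PCM at 24 kHz.
SAMPLE_RATE_HZ = 24000
SAMPLE_WIDTH_BYTES = 2
CHANNELS = 1


def pcm_to_wav_bytes(pcm: bytes, rate: int = SAMPLE_RATE_HZ) -> bytes:
    """Wrap raw PCM samples in a WAV container and return the file contents."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(CHANNELS)
        wav_file.setsampwidth(SAMPLE_WIDTH_BYTES)
        wav_file.setframerate(rate)
        wav_file.writeframes(pcm)
    return buffer.getvalue()


def duration_seconds(pcm: bytes, rate: int = SAMPLE_RATE_HZ) -> float:
    """Length of the clip implied by the byte count and the assumed format."""
    return len(pcm) / (SAMPLE_WIDTH_BYTES * CHANNELS * rate)


# One second of silence stands in for audio returned by the API.
silence = b"\x00\x00" * SAMPLE_RATE_HZ
wav_data = pcm_to_wav_bytes(silence)
```

Writing `wav_data` to a file with a `.wav` extension gives a clip that standard players can open, provided the assumed sample rate and width match what the API actually returned.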
ComfyUI-Gemini_TTS Features
30+ Premium Voices: Choose from a wide range of male and female voices, each with unique characteristics. This diversity allows you to find the perfect voice to match the tone and style of your project.
Dual Tier Support: Start with the free tier, which offers generous usage limits, and upgrade to the paid tier for higher quotas and production-level performance.
Smart Fallback: Automatically switches to a different model if you reach your usage quota, ensuring uninterrupted service.
Voice Characteristics: Provides detailed descriptions of each voice's personality, helping you select the most suitable option for your needs.
Flexible Configuration: Customize settings through environment variables, node parameters, or a configuration file, making it easy to adapt the extension to your workflow.
Robust Error Handling: Offers clear error messages and automatic retry logic to handle any issues that may arise.
Real-time Pricing: Displays cost estimates for paid tier usage, allowing you to manage your budget effectively.
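To make the Flexible Configuration feature concrete, here is one way a setting such as the API key could be resolved from the three sources listed above. It is a sketch under stated assumptions: the precedence order (node parameter, then environment variable, then configuration file), the variable name `GEMINI_API_KEY`, the JSON file format, and the `api_key` field are all hypothetical and may differ from what the extension actually uses.

```python
import json
from pathlib import Path
from typing import Mapping, Optional

ENV_VAR = "GEMINI_API_KEY"  # hypothetical environment variable name
CONFIG_FIELD = "api_key"    # hypothetical field in a JSON config file


def resolve_api_key(
    node_value: Optional[str],
    env: Mapping[str, str],
    config_path: Optional[Path],
) -> Optional[str]:
    """Return the first non-empty key, checking node, environment, then file."""
    # 1. A value typed into the node wins (assumed precedence).
    if node_value and node_value.strip():
        return node_value.strip()
    # 2. Otherwise fall back to the environment.
    env_value = env.get(ENV_VAR, "").strip()
    if env_value:
        return env_value
    # 3. Finally, look in the config file if it exists.
    if config_path is not None and config_path.is_file():
        data = json.loads(config_path.read_text(encoding="utf-8"))
        file_value = str(data.get(CONFIG_FIELD, "")).strip()
        if file_value:
            return file_value
    return None
```

In real use you would pass `os.environ` as `env`. Taking the mapping as a parameter keeps the function easy to test without touching the process environment.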
ComfyUI-Gemini_TTS Models
The extension offers two main models for text-to-speech conversion:
Gemini 2.5 Pro Preview TTS: This model provides higher quality audio output, ideal for projects where sound quality is paramount. It may take a bit longer to process but delivers superior results.
Gemini 2.5 Flash Preview TTS: Designed for faster processing, this model offers good quality audio and is suitable for projects with tight deadlines or when quick iterations are needed.
Choosing between these models depends on your specific requirements for quality and speed. For instance, if you're working on a detailed animation that requires precise voice acting, the Pro model might be the best choice. On the other hand, for rapid prototyping or less critical applications, the Flash model could be more appropriate.
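The Smart Fallback feature described earlier amounts to trying the preferred model first and moving to the other one when the quota is exhausted. The sketch below shows that control flow only. The `QuotaExceeded` exception, the `synthesize` callable, and the stub are hypothetical stand-ins for the real API call, and the model identifiers are assumptions derived from the model names above.

```python
from typing import Callable, Optional, Sequence, Tuple


class QuotaExceeded(Exception):
    """Hypothetical error standing in for a 'Rate limit exceeded' response."""


# Assumed model identifiers, derived from the model names above.
PRO_MODEL = "gemini-2.5-pro-preview-tts"
FLASH_MODEL = "gemini-2.5-flash-preview-tts"


def synthesize_with_fallback(
    text: str,
    synthesize: Callable[[str, str], bytes],
    models: Sequence[str] = (PRO_MODEL, FLASH_MODEL),
) -> Tuple[str, bytes]:
    """Try each model in order; move to the next one only on a quota error."""
    last_error: Optional[QuotaExceeded] = None
    for model in models:
        try:
            return model, synthesize(model, text)
        except QuotaExceeded as error:
            last_error = error
    raise RuntimeError("every model reported an exhausted quota") from last_error


def fake_synthesize(model: str, text: str) -> bytes:
    """Stub for demonstration: the Pro model is out of quota, Flash succeeds."""
    if model == PRO_MODEL:
        raise QuotaExceeded(model)
    return text.encode("utf-8")


used_model, audio = synthesize_with_fallback("Hello", fake_synthesize)
print(used_model)  # gemini-2.5-flash-preview-tts
```

Only quota errors trigger the fallback here. Other failures, such as an invalid key, propagate immediately, since switching models would not fix them.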
Troubleshooting ComfyUI-Gemini_TTS
Here are some common issues you might encounter and how to resolve them:
"API key not valid" Error: Ensure your API key starts with AIza and is approximately 39 characters long. Double-check that the key hasn't expired or been deleted.
"Rate limit exceeded" Error: If you're on the free tier, wait 60 seconds or switch to the Flash model. Consider enabling the paid tier for higher quotas.
"Billing project not found" Error: Make sure you're using the correct Project ID, not the project name, and verify that billing is enabled.
"Permission denied" Error: Confirm that the Generative Language API is enabled and that your API key has the necessary permissions.
For more detailed troubleshooting, check the console output for specific error messages and ensure your Google Cloud project settings are correctly configured.
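The API key checks from the first troubleshooting item are easy to run before sending a request. The helper below is a hypothetical pre-flight check, not part of the extension. It tests only the format described above (the AIza prefix and a length near 39 characters). The tolerance of 3 characters is an assumption, and a key that passes can still be expired or deleted.

```python
from typing import List

KEY_PREFIX = "AIza"
EXPECTED_LENGTH = 39
LENGTH_TOLERANCE = 3  # assumption: how far "approximately 39" may stretch


def api_key_problems(key: str) -> List[str]:
    """Return a list of format problems; an empty list means the key looks plausible."""
    problems = []
    stripped = key.strip()
    if stripped != key:
        problems.append("key has leading or trailing whitespace")
    if not stripped.startswith(KEY_PREFIX):
        problems.append("key does not start with AIza")
    if abs(len(stripped) - EXPECTED_LENGTH) > LENGTH_TOLERANCE:
        problems.append(
            f"key length {len(stripped)} is far from {EXPECTED_LENGTH}"
        )
    return problems
```

Stray whitespace from copy and paste is reported separately because it is a common cause of an otherwise valid key being rejected.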
Learn More about ComfyUI-Gemini_TTS
To further explore the capabilities of ComfyUI-Gemini_TTS, consider visiting community forums and online tutorials where you can find additional tips and tricks. Engaging with other AI artists can provide valuable insights and inspiration for your projects. Additionally, reviewing the official documentation and Google's terms of service can help you understand the full potential and limitations of the Gemini TTS API.