ComfyUI-SoulX-Singer Introduction
ComfyUI-SoulX-Singer is an extension that brings high-quality, zero-shot singing voice synthesis to AI artists. Developed by Soul-AILab, it generates realistic singing in the voices of singers the model has never seen during training. It integrates with ComfyUI through a node-based interface and supports both melody-conditioned (F0 contour) and score-conditioned (MIDI notes) control, so you can precisely manage pitch, rhythm, and expression in the synthesized vocals. That makes it a useful tool for creative projects that need unique vocal performances.
How ComfyUI-SoulX-Singer Works
ComfyUI-SoulX-Singer uses machine learning models to synthesize singing voices. It works zero-shot: the model does not need to be trained or fine-tuned on the target singer. Instead, it takes a reference audio sample at generation time, captures the characteristics of that singer's voice, and applies them to a new melody or score. With melody-conditioned input you supply an F0 (fundamental frequency) contour; with score-conditioned input you supply MIDI notes. Either way, you control the pitch and rhythm of the synthesized voice.
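The two control modes describe the same information at different levels of detail. As a rough illustration only, the sketch below turns a list of MIDI notes into a flat frame-level F0 contour using the standard equal-temperament formula (A4 = MIDI note 69 = 440 Hz). The `Note` class, the `notes_to_f0` helper, and the frame rate of 100 frames per second are assumptions made for this example; they are not part of the extension, and the real model's preprocessing may differ.

```python
from dataclasses import dataclass

A4_MIDI = 69    # MIDI note number of A4
A4_HZ = 440.0   # standard tuning reference in Hz


@dataclass
class Note:
    """Hypothetical score event, used only for this illustration."""
    pitch: int       # MIDI note number
    start: float     # onset time in seconds
    duration: float  # length in seconds


def midi_to_hz(pitch: int) -> float:
    """Equal-temperament conversion from MIDI note number to frequency."""
    return A4_HZ * 2.0 ** ((pitch - A4_MIDI) / 12.0)


def notes_to_f0(notes, frame_rate=100.0):
    """Render notes as a flat F0 contour; 0.0 marks frames with no note."""
    end = max(n.start + n.duration for n in notes)
    n_frames = round(end * frame_rate)
    f0 = [0.0] * n_frames
    for n in notes:
        first = round(n.start * frame_rate)
        last = round((n.start + n.duration) * frame_rate)
        for i in range(first, min(last, n_frames)):
            f0[i] = midi_to_hz(n.pitch)
    return f0


# Two half-second notes: A4 followed by C5.
score = [Note(69, 0.0, 0.5), Note(72, 0.5, 0.5)]
contour = notes_to_f0(score)
print(len(contour), contour[0], round(contour[-1], 2))  # 100 440.0 523.25
```

A contour extracted from a real recording would also carry vibrato and pitch glides, which is the extra detail melody conditioning can preserve and a plain score cannot.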
ComfyUI-SoulX-Singer Features
- Zero-Shot Singing: Create high-fidelity singing voices for new singers using just a reference sample.
- Dual Control Modes: Choose between melody (F0 contour) and score (MIDI notes) conditioning to control the singing voice's pitch and rhythm.
- Native ComfyUI Integration: Enjoy a seamless experience with audio inputs, progress bars, and interruption support.
- Optimized Performance: Supports bf16 and fp32 data types and attention backends such as SDPA and SageAttention.
- Smart Auto-Download: Automatically downloads only the necessary components from Hugging Face, saving storage and download time.
- Smart Caching: Optionally cache models to improve performance, with automatic detection of changes in data type or attention settings.
- MIDI Editor Support: Use advanced nodes for manual metadata editing, enhancing your creative control.
- Improved Compatibility: Uses soundfile and scipy for better cross-platform support, avoiding issues with torchaudio.
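The caching behavior described above, where a cached model is reused until the data type or attention setting changes, can be sketched as follows. This is a minimal illustration of the idea, not the extension's actual code: `ModelCache`, `fake_loader`, and the setting strings are hypothetical names chosen for the example.

```python
from typing import Any, Callable, Optional, Tuple


class ModelCache:
    """Keep one loaded model and reload it only when its settings change."""

    def __init__(self, loader: Callable[[str, str], Any]) -> None:
        self._loader = loader
        self._key: Optional[Tuple[str, str]] = None
        self._model: Any = None

    def get(self, dtype: str, attention: str) -> Any:
        key = (dtype, attention)
        if key != self._key:
            # Settings differ from the cached ones: load a fresh model.
            self._model = self._loader(dtype, attention)
            self._key = key
        return self._model

    def clear(self) -> None:
        """Drop the cached model so its memory can be reclaimed."""
        self._key = None
        self._model = None


load_calls = []


def fake_loader(dtype: str, attention: str) -> dict:
    """Stand-in for an expensive model load; records each call."""
    load_calls.append((dtype, attention))
    return {"dtype": dtype, "attention": attention}


cache = ModelCache(fake_loader)
a = cache.get("bf16", "sdpa")
b = cache.get("bf16", "sdpa")  # same settings: served from the cache
c = cache.get("fp32", "sdpa")  # data type changed: triggers a reload
print(len(load_calls))  # 2
```

Keying the cache on the settings tuple means a stale model is never reused after a change, at the cost of holding only one configuration in memory at a time.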
ComfyUI-SoulX-Singer Models
The extension offers two main models:
- SoulX-Singer_model_bf16: This model favors speed while keeping good quality, making it ideal for quick iterations and testing.
- SoulX-Singer_model_fp32: This model provides the best quality output, suitable for final productions where quality is paramount.
Each model can be automatically downloaded and is designed to work seamlessly with the ComfyUI interface.
Troubleshooting ComfyUI-SoulX-Singer
If you encounter issues while using ComfyUI-SoulX-Singer, here are some common solutions:
- Models Not Downloading: Ensure you have the latest version of the huggingface_hub package installed. If automatic download still fails, download the models manually.
- Missing Dependencies: Install all required dependencies using the provided requirements.txt file.
- Out of Memory Errors: Consider using the bf16 model instead of fp32, reduce the number of diffusion steps, or close other applications to free up memory.
- Slow Synthesis: Install the SageAttention package for optimized performance and ensure you are using a GPU with CUDA support.
- Preprocessing Pipeline Fails: Verify that all preprocessing models are correctly downloaded and placed in the specified directory.
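Before working through the list above, it can help to confirm which packages are importable in the Python environment that ComfyUI runs in. The sketch below uses only the standard library. The module lists are assumptions based on the packages named in this document plus `torch`; the extension's requirements.txt remains the authoritative list.

```python
import importlib.util

# Assumed import names; check requirements.txt for the real list.
REQUIRED = ["torch", "soundfile", "scipy", "huggingface_hub"]
OPTIONAL = ["sageattention"]  # only needed for the SageAttention backend


def missing_modules(names):
    """Return the top-level module names that cannot be found."""
    return [n for n in names if importlib.util.find_spec(n) is None]


def report():
    """Build a human-readable list of missing packages."""
    lines = []
    for name in missing_modules(REQUIRED):
        lines.append(f"missing required package: {name}")
    for name in missing_modules(OPTIONAL):
        lines.append(f"missing optional package: {name}")
    return lines or ["all checked packages are importable"]


if __name__ == "__main__":
    print("\n".join(report()))
```

Run it with the same interpreter that launches ComfyUI, since a package installed in a different environment will still show up as missing.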
Learn More about ComfyUI-SoulX-Singer
To further explore the capabilities of ComfyUI-SoulX-Singer, you can access a variety of resources:
- Online Demos: Try the SoulX-Singer Online Demo to see the extension in action.
- MIDI Editor: Use the MIDI Editor for detailed control over your projects.
- Official Documentation: Visit the Soul-AILab GitHub Repository for comprehensive documentation and updates.
- Research Paper: Gain insights into the underlying technology by reading the SoulX-Singer Paper.
These resources will help you maximize the potential of ComfyUI-SoulX-Singer in your creative endeavors.
