ComfyUI Custom Dia integrates the Dia TTS model into ComfyUI, adding text-to-speech capabilities. The extension builds on the Dia model from nari-labs to provide TTS features within the ComfyUI framework.
Welcome to comfyUI-customDia, an extension that integrates the Dia Text-to-Speech (TTS) model into the ComfyUI environment. The extension uses the Dia model, created by Nari Labs, to generate realistic dialogue from text. Whether you're an AI artist looking to add voice to your creations or someone interested in exploring TTS technology, comfyUI-customDia lets you create dialogues with multiple speakers, incorporate nonverbal cues, and clone voices, all within the ComfyUI framework.
At its core, comfyUI-customDia uses the Dia model to transform written text into spoken dialogue. The extension functions as an output node within ComfyUI, meaning it can operate independently or as part of a larger workflow. You can input text with speaker tags like [S1] and [S2] to designate different speakers, and the model will generate corresponding audio. Additionally, you can include nonverbal tags such as (laughs) or (sighs) to enrich the audio with realistic expressions. The extension also supports voice cloning by allowing you to input an audio sample and its transcript, enabling the model to mimic the voice in the sample.
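As a rough illustration of the tagged input described above, the sketch below assembles a two-speaker script in plain Python. The helper name build_script is hypothetical and not part of the extension, and the spacing between tags and text is an assumption; only the [S1] and [S2] speaker tags and nonverbal tags such as (laughs) come from the description above.

```python
def build_script(turns):
    """Join (speaker_number, text) pairs into one tagged string.

    Hypothetical helper for illustration only; it is not part of
    comfyUI-customDia. Only speakers 1 and 2 are accepted because
    the description mentions just the [S1] and [S2] tags.
    """
    parts = []
    for speaker, text in turns:
        if speaker not in (1, 2):
            raise ValueError(f"unsupported speaker number: {speaker}")
        # Assumption: one space separates the tag from the spoken text.
        parts.append(f"[S{speaker}] {text.strip()}")
    return " ".join(parts)


script = build_script([
    (1, "Did you hear the demo?"),
    (2, "I did. (laughs) It sounded real."),
])
print(script)
```

The resulting string can then be pasted into the node's text input. Nonverbal tags are written inline, inside the turn of the speaker who produces them.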
With the librosa package, you can maintain the pitch of the original audio, enhancing the naturalness of the cloned voice.
The extension uses the Dia model, a 1.6 billion parameter TTS model designed for generating realistic dialogue. The model supports English and can produce a wide range of vocal expressions. By conditioning the output on audio, you can control the emotion and tone of the speech.
Here are some common issues you might encounter while using comfyUI-customDia and how to resolve them:
Ensure that the descript-audio-codec and soundfile Python packages are installed. If the installation of descript-audio-codec downgrades protobuf to version 3.19.6, causing other nodes to crash, upgrade protobuf by running pip install protobuf --upgrade in the ComfyUI terminal.
Use the [S1] and [S2] tags correctly and keep the input text length moderate.
To further explore the capabilities of comfyUI-customDia, consider visiting the following resources: