ComfyUI Node: Kokoro Run

Class Name

Kokoro Run

Category
🎤MW/MW-KokoroTTS
Author
mw (Account age: 2267 days)
Extension
ComfyUI_KokoroTTS_MW
Latest Updated
2025-04-27
Github Stars
0.02K

How to Install ComfyUI_KokoroTTS_MW

Install this extension via the ComfyUI Manager by searching for ComfyUI_KokoroTTS_MW:
  1. Click the Manager button in the main menu.
  2. Click the Custom Nodes Manager button.
  3. Enter ComfyUI_KokoroTTS_MW in the search bar and install the extension from the results.
After installation, click the Restart button to restart ComfyUI. Then manually refresh your browser to clear the cache and load the updated list of nodes.

Kokoro Run Description

Converts text to speech using the Kokoro TTS model, producing high-quality synthesized audio for AI projects.

Kokoro Run:

Kokoro Run is a node that converts text to speech (TTS) with the Kokoro TTS model. It is part of a custom implementation that uses pre-trained models to generate high-quality audio from text input. The node takes the text, a voice, and a speed setting, and produces natural-sounding speech that AI artists and developers can integrate into their projects. It handles a variety of text inputs and always outputs audio at the same sample rate, which keeps the result compatible with a wide range of applications.

Kokoro Run Input Parameters:

text

The text parameter is the primary input of the Kokoro Run node: the written content to convert into speech. It accepts a string with no explicit minimum or maximum length, but longer texts take more processing time, so concise input performs best. The quality and clarity of the generated speech depend directly on the text, so make sure it is well structured and free of errors.
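One way to keep each input concise is to split long text at sentence boundaries before passing it to the node. The following is a minimal sketch, not part of the extension: the helper name split_into_chunks and the 300-character default are assumptions chosen purely for illustration.

```python
import re


def split_into_chunks(text: str, max_chars: int = 300) -> list[str]:
    """Group sentences into chunks of at most max_chars characters.

    Hypothetical helper (not part of ComfyUI_KokoroTTS_MW). A single
    sentence longer than max_chars is kept whole rather than cut mid-word.
    """
    # Split after '.', '!' or '?' when followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            # Adding this sentence would overflow: close the current chunk.
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks
```

Each returned chunk could then be synthesized separately and the resulting audio concatenated, which also helps when a single long input fails to generate.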

voice

The voice parameter selects the voice model used for speech synthesis, which determines characteristics of the generated speech such as tone, pitch, and accent. The available options are the pre-loaded voice models stored in the voices directory. The choice of voice has a significant effect on how natural and expressive the output sounds.

speed

The speed parameter controls the rate at which the text is spoken in the generated audio. The default value of 1.0 represents normal speed, and the speed is calculated based on the length of the phoneme sequence. Adjust this parameter to get more natural pacing, especially for longer texts or for use cases where timing is critical.
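When timing matters, a rough estimate of how speed changes clip length can help pick a starting value. The sketch below assumes speed behaves like a simple rate multiplier, so duration scales by 1 / speed. That is an approximation for planning purposes, not a statement of how the node computes durations internally, and estimated_duration is a hypothetical helper.

```python
def estimated_duration(base_seconds: float, speed: float) -> float:
    """Estimate clip length at a given speed.

    Assumption: speed acts as a plain rate multiplier, so a clip that
    lasts base_seconds at speed 1.0 lasts about base_seconds / speed.
    """
    if speed <= 0:
        raise ValueError("speed must be positive")
    return base_seconds / speed


def speed_for_target(base_seconds: float, target_seconds: float) -> float:
    """Invert the estimate: which speed would roughly hit target_seconds?"""
    if target_seconds <= 0:
        raise ValueError("target_seconds must be positive")
    return base_seconds / target_seconds
```

For example, under this assumption a clip that runs 10 seconds at the default speed would need a speed of about 1.25 to fit into 8 seconds. The actual result should still be checked by ear and by measuring the generated audio.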

Kokoro Run Output Parameters:

waveform

The waveform output is a tensor containing the raw audio generated from the input text. It is structured as a multi-dimensional array whose dimensions correspond to the audio channels and the sample points. The waveform can be played directly or passed on for further processing by any application that needs the generated speech.
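If the raw samples need to leave ComfyUI, for example to be stored as 16-bit audio, they usually have to be converted from floating point first. The sketch below assumes the samples are floats in the range [-1.0, 1.0] and that one channel has already been flattened to a plain Python list (with PyTorch, something like tensor.flatten().tolist()). Both points are assumptions to verify against the actual output, and float_to_pcm16 is a hypothetical helper.

```python
def float_to_pcm16(samples: list[float]) -> list[int]:
    """Convert float samples to signed 16-bit PCM integers.

    Assumption: valid samples lie in [-1.0, 1.0]. Values outside that
    range are clipped so they cannot overflow the 16-bit range.
    """
    pcm = []
    for sample in samples:
        clipped = max(-1.0, min(1.0, sample))
        # Scale to the symmetric 16-bit range [-32767, 32767].
        pcm.append(int(round(clipped * 32767)))
    return pcm
```

Clipping is a deliberate choice here: an occasional out-of-range value produces slight distortion rather than a wrapped-around sample, which would sound like a loud click.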

sample_rate

The sample_rate output is the number of audio samples per second in the generated waveform. For Kokoro Run it is always 24000 Hz, which is suitable for most applications. Pass this value along whenever the audio is played, saved, or integrated into other systems or media, because playing the waveform at a different rate changes the speed and pitch of the speech.
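To illustrate why the sample rate must travel with the audio, here is a minimal sketch that writes 16-bit mono samples into a WAV container using only the Python standard library. The function name pcm16_to_wav_bytes is hypothetical, and the input is assumed to be a list of integers already in the signed 16-bit range.

```python
import io
import struct
import wave

SAMPLE_RATE = 24000  # the rate this page states for Kokoro Run output


def pcm16_to_wav_bytes(pcm: list[int], sample_rate: int = SAMPLE_RATE) -> bytes:
    """Pack signed 16-bit mono samples into an in-memory WAV file."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(1)      # mono
        wav_file.setsampwidth(2)      # 2 bytes per sample = 16-bit
        wav_file.setframerate(sample_rate)
        # Little-endian signed shorts, as the WAV format expects.
        wav_file.writeframes(struct.pack(f"<{len(pcm)}h", *pcm))
    return buffer.getvalue()


def duration_seconds(num_samples: int, sample_rate: int = SAMPLE_RATE) -> float:
    """Clip length implied by a sample count at the given rate."""
    return num_samples / sample_rate
```

The rate is written into the WAV header, so a player reading the file knows to consume 24000 samples per second. Writing the same samples with a header rate of 48000 would make the speech play twice as fast and at a higher pitch.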

Kokoro Run Usage Tips:

  • Ensure that the input text is clear and well-structured to achieve the best speech synthesis results.
  • Experiment with different voice models to find the one that best suits your project's needs and desired speech characteristics.
  • Adjust the speed parameter to fine-tune the pacing of the generated speech, especially for longer texts or specific timing requirements.

Kokoro Run Common Errors and Solutions:

Generation failed: <error_message>

  • Explanation: This error occurs when the text-to-speech generation process encounters an issue, which could be due to an invalid input, model loading failure, or resource limitations.
  • Solution: Check the input text for any errors or unsupported characters. Ensure that the necessary models and resources are correctly loaded and available. If the problem persists, try reducing the input text length or freeing up system resources.

Model cache is None

  • Explanation: This error indicates that the model required for speech synthesis has not been loaded into the cache, possibly due to a missing or incorrect model path.
  • Solution: Verify that the model paths are correctly specified and that the required models are present in the designated directories. Reload the node to ensure that the models are properly cached.
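Checking that the expected files exist can be done with a few lines of Python before reloading the node. This is only a sketch: find_missing is a hypothetical helper, and the directory and file names in the usage comment are placeholders, since the exact paths the extension expects are not listed on this page.

```python
from pathlib import Path


def find_missing(base_dir: str, relative_paths: list[str]) -> list[str]:
    """Return the entries of relative_paths that do not exist under base_dir."""
    base = Path(base_dir)
    return [rel for rel in relative_paths if not (base / rel).exists()]


# Usage with placeholder names (replace with the paths your install uses):
# missing = find_missing("path/to/models", ["voices/example_voice.pt"])
# if missing:
#     print("Missing model files:", missing)
```

An empty result means every listed file was found. Any names that come back point to the files to download or move before restarting ComfyUI.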

Kokoro Run Related Nodes

Go back to the extension to check out more related nodes.
ComfyUI_KokoroTTS_MW
RunComfy
Copyright 2025 RunComfy. All Rights Reserved.