ComfyUI Extension: ComfyUI-SoulX-Podcast

Repo Name: ComfyUI-SoulX-Podcast
Author: flybirdxx (account age: 3194 days)
Nodes: 3
Latest Updated: 2025-10-31
GitHub Stars: 0.08K

How to Install ComfyUI-SoulX-Podcast

Install this extension through the ComfyUI Manager by searching for ComfyUI-SoulX-Podcast:
  1. Click the Manager button in the main menu.
  2. Select the Custom Nodes Manager button.
  3. Enter ComfyUI-SoulX-Podcast in the search bar and install it.
After installation, click the Restart button to restart ComfyUI. Then manually refresh your browser to clear the cache and load the updated list of nodes.

ComfyUI-SoulX-Podcast Description

ComfyUI-SoulX-Podcast is a custom node plugin for ComfyUI, enabling visual node workflows for SoulX-Podcast's core features, including long-form, multi-speaker, and multi-dialect podcast voice generation.

ComfyUI-SoulX-Podcast Introduction

ComfyUI-SoulX-Podcast brings the features of SoulX-Podcast into ComfyUI. It lets you create long-form, multi-speaker, and multi-dialect podcast audio through a visual node-based workflow. It is aimed at AI artists and content creators who want to turn a written dialogue into spoken audio.

How ComfyUI-SoulX-Podcast Works

ComfyUI-SoulX-Podcast turns a text script into spoken dialogue between speakers. The extension provides a set of nodes within ComfyUI, each responsible for one part of the audio generation process: loading the model, parsing the input script, and generating the audio. You connect these nodes into a workflow and adjust their parameters to suit your needs.

ComfyUI-SoulX-Podcast Features

  • Two-Person Podcast Generation: Create dialogues between two distinct speakers, allowing for interactive and engaging audio content.
  • Multi-Dialect Support: Generate audio in multiple Chinese dialects, enhancing the diversity and authenticity of your content. This feature requires specific dialect models.
  • Flexible Dialogue Scripts: Define your podcast's dialogue using a simple script format, making it easy to structure conversations.
  • Prompt Audio Driven: Clone the voice characteristics of speakers using reference audio, ensuring that each speaker's voice is unique and consistent.
  • Long-Form Generation: Produce extended podcast content without compromising on quality or coherence.
  • Visual Workflow: Utilize ComfyUI's node-based interface to manage the entire audio generation process visually, making it accessible even for those with minimal technical expertise.

ComfyUI-SoulX-Podcast Models

The extension supports different models tailored for various audio generation needs:

  • Standard Model (e.g., SoulX-Podcast-1.7B): Ideal for generating standard Mandarin podcasts.
  • Dialect Model (e.g., SoulX-Podcast-1.7B-dialect): Supports multiple Chinese dialects, such as Henan, Sichuan, and Cantonese. To use dialect features, ensure you select the appropriate model in the SoulX Podcast Loader node.
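
Before selecting a model in the loader node, it can help to confirm that its folder exists and is not empty. The following is a minimal sketch, not part of the extension: it assumes the ComfyUI/models/TTS/[model_name]/ layout given in the troubleshooting notes, and the function name check_model_dir is hypothetical. It does not check for specific files, because the required file list is not stated here.

```python
from pathlib import Path


def check_model_dir(comfy_root: str, model_name: str) -> Path:
    """Return the model folder if it exists and contains at least one entry.

    Hypothetical helper: assumes models live under
    <comfy_root>/models/TTS/<model_name>/.
    """
    model_dir = Path(comfy_root) / "models" / "TTS" / model_name
    if not model_dir.is_dir():
        raise FileNotFoundError(f"Model directory not found: {model_dir}")
    # An empty folder usually means the download did not complete.
    if not any(model_dir.iterdir()):
        raise FileNotFoundError(f"Model directory is empty: {model_dir}")
    return model_dir
```

For example, check_model_dir("ComfyUI", "SoulX-Podcast-1.7B-dialect") raises FileNotFoundError if the dialect model has not been placed in the expected folder.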

Troubleshooting ComfyUI-SoulX-Podcast

Here are some common issues you might encounter and their solutions:

  • Model Loading Failed: Ensure that model files are correctly placed in the ComfyUI/models/TTS/[model_name]/ directory and that all necessary files are present.
  • Unstable Voice Characteristics: Use longer and clearer prompt audio, ideally around 10 seconds, to improve voice consistency.
  • Slow Generation Speed: Consider using the vllm engine if supported, enable fp16_flow to reduce VRAM usage, and adjust the max_tokens value to optimize performance.
  • Dialogue Script Format Error: Ensure your script follows the correct format, with speaker identifiers enclosed in brackets, e.g., [S1] Hello.
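
To catch script format errors before running a workflow, you can validate the script outside ComfyUI. The sketch below is illustrative only and is not the extension's own parser: it assumes every turn starts with a bracketed tag of the form [S1], [S2], and so on, as in the example above, and the names TURN_PATTERN and parse_script are hypothetical.

```python
import re
from typing import List, Tuple

# Assumed tag format: "S" followed by digits, enclosed in square brackets.
TURN_PATTERN = re.compile(r"\[(S\d+)\]\s*")


def parse_script(script: str) -> List[Tuple[str, str]]:
    """Split a dialogue script into (speaker, text) turns."""
    # With one capture group, re.split returns
    # [text_before_first_tag, speaker1, text1, speaker2, text2, ...].
    parts = TURN_PATTERN.split(script)
    if parts[0].strip():
        raise ValueError("Script must start with a speaker tag such as [S1]")
    turns = []
    for speaker, text in zip(parts[1::2], parts[2::2]):
        text = text.strip()
        if not text:
            raise ValueError(f"Speaker {speaker} has no text")
        turns.append((speaker, text))
    if not turns:
        raise ValueError("No speaker turns found")
    return turns


if __name__ == "__main__":
    for speaker, text in parse_script("[S1] Hello.\n[S2] Hi there."):
        print(speaker, text)
```

A script that begins with untagged text, or that contains a tag with nothing after it, raises ValueError, which points to the line that needs fixing.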

Learn More about ComfyUI-SoulX-Podcast

To further explore the capabilities of ComfyUI-SoulX-Podcast, consider accessing additional resources such as tutorials, community forums, and detailed documentation. Engaging with the community can provide valuable insights and support as you experiment with the extension's features.
