MMAudio | Video-to-Audio Model

The ComfyUI-MMAudio nodes and their associated workflow were developed entirely by Kijai. We give all due credit to Kijai for this innovative work. On the RunComfy platform, we are simply presenting Kijai’s contributions to the community. Note that there is currently no formal connection or partnership between RunComfy and Kijai. We deeply appreciate Kijai’s work!

MMAudio

MMAudio generates synchronized audio from video and text inputs. It is trained jointly on diverse audio-visual and audio-text datasets, which helps it adapt to a wide range of content. A dedicated synchronization module aligns the generated audio with the video frames, so creators can add a soundtrack to a clip without a separate sound-design pass.

1.1 How to Use the MMAudio Workflow

The MMAudio workflow is laid out from left to right: the nodes on the left are inputs for uploading video, the nodes in the middle run the MMAudio processing, and the node on the right is the output.

Upload your video in the input nodes.
Write your audio generation prompts.
Click Render.

1.2 Video Input

Click and upload your reference video.

The input node is set to downscale the video to ?*512 resolution, because processing HD or longer videos may run out of memory.
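To give a feel for what this downscaling does to the frame size, here is a minimal sketch. It assumes that "?*512" means the height is fixed at 512 pixels and the width is derived from the aspect ratio; the function name, the rounding to an even width, and the choice not to upscale smaller videos are all illustrative assumptions, not the behavior of the actual node.

```python
def downscaled_size(width: int, height: int,
                    target_height: int = 512,
                    multiple: int = 2) -> tuple[int, int]:
    """Return (width, height) after scaling the frame down to target_height.

    Assumption: "?*512" is read as "derived width x 512 height".
    The width is rounded to a multiple of `multiple` (hypothetical choice).
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    # Only downscale: leave frames that are already small enough untouched.
    if height <= target_height:
        return width, height
    scale = target_height / height
    new_width = max(multiple, round(width * scale / multiple) * multiple)
    return new_width, target_height


if __name__ == "__main__":
    for w, h in [(1920, 1080), (1280, 720), (640, 360)]:
        print((w, h), "->", downscaled_size(w, h))
```

Under these assumptions, a 1920x1080 frame becomes 910x512, which is roughly 4.5 times fewer pixels per frame. That reduction is why downscaling helps with memory on HD input.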

1.3 MMAudio Processing

Positive: Enter the prompt describing the audio you want generated for the video.
Negative: Enter what you don't want to hear.
Steps: More steps may improve audio quality.
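If you want to keep track of the settings you try, the three fields above can be recorded in a small structure like the sketch below. The class name, field names, default step count, and example prompts are hypothetical and chosen for illustration only; they are not the node's actual parameter names or defaults.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AudioPromptSettings:
    """Hypothetical record of the three fields described above."""
    positive: str = ""   # what the audio should sound like
    negative: str = ""   # what you don't want to hear
    steps: int = 25      # placeholder value, not the node's documented default

    def __post_init__(self) -> None:
        # A sampler needs at least one step to produce any output.
        if self.steps < 1:
            raise ValueError("steps must be at least 1")


if __name__ == "__main__":
    settings = AudioPromptSettings(
        positive="waves crashing on a beach, seagulls",
        negative="music, speech",
        steps=50,
    )
    print(settings)
```

Keeping one such record per render makes it easier to compare results when you change only the prompts or only the step count.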

1.4 MMAudio Models

These are the model downloader nodes. They automatically download the models into your ComfyUI installation in about 2-3 minutes.

MMAudio models: https://github.com/hkchengrex/MMAudio

With its multimodal training and synchronization module, MMAudio produces audio that follows the action on screen. Whether you're crafting videos, animations, or immersive experiences, MMAudio gives you a quick way to add fitting, high-quality audio to your projects.
