
IDM-VTON | Virtual Try-on

Workflow Name: RunComfy/IDM-VTON
Workflow ID: 0000...1135
IDM-VTON, or Improving Diffusion Models for Authentic Virtual Try-on in the Wild, is a groundbreaking diffusion model that allows for realistic virtual garment try-on. By preserving the unique details and identity of garments, IDM-VTON generates incredibly authentic results. The model utilizes an image prompt adapter (IP-Adapter) to extract high-level garment semantics and a parallel UNet (GarmentNet) to encode low-level features. In ComfyUI, the IDM-VTON node powers the virtual try-on process, requiring inputs such as a human image, pose representation, clothing mask, and garment image.


1. Understanding IDM-VTON

At its core, IDM-VTON is a diffusion model that's been specifically engineered for virtual try-on. To use it, you simply need a representation of a person and a garment you want to try on. IDM-VTON then works its magic, rendering a result that looks like the person is actually wearing the garment. It achieves a level of garment fidelity and authenticity that surpasses previous diffusion-based virtual try-on methods.

2. The Inner Workings of IDM-VTON

So, how does IDM-VTON pull off such realistic virtual try-on? The secret lies in its two main modules that work together to encode the semantics of the garment input:

  1. The first is an image prompt adapter, or IP-Adapter for short. This clever component extracts the high-level semantics of the garment - essentially, the key characteristics that define its appearance. It then fuses this information into the cross-attention layer of the main UNet diffusion model.
  2. The second module is a parallel UNet called GarmentNet. Its job is to encode the low-level features of the garment - the nitty-gritty details that make it unique. These features are then fused into the self-attention layer of the main UNet.

But that's not all! IDM-VTON also makes use of detailed textual prompts for both the garment and the person inputs. These prompts provide additional context that enhances the authenticity of the final virtual try-on result.
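To make the fusion idea above concrete, here is a minimal pure-Python sketch of scaled dot-product attention over a concatenated token set. All names and values are illustrative stand-ins for the model's learned embeddings, not IDM-VTON's actual implementation.

```python
import math

def attention(query, keys, values):
    """Scaled dot-product attention for a single query vector."""
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
              for key in keys]
    peak = max(scores)
    weights = [math.exp(s - peak) for s in scores]
    total = sum(weights)
    weights = [w / total for w in weights]
    out = [0.0] * len(values[0])
    for w, value in zip(weights, values):
        out = [o + w * v for o, v in zip(out, value)]
    return out

# Cross-attention fusion: the IP-Adapter's garment tokens are appended to
# the text tokens, so the query attends to both at once. (GarmentNet fuses
# into self-attention the same way, with low-level feature tokens instead.)
text_tokens = [[1.0, 0.0], [0.0, 1.0]]   # stand-in text embeddings
garment_tokens = [[0.5, 0.5]]            # stand-in IP-Adapter output
fused = text_tokens + garment_tokens     # concatenate along the token axis

print(attention([1.0, 1.0], fused, fused))  # ≈ [0.5, 0.5]
```

The point of the sketch is the concatenation step: neither module replaces the UNet's attention, they simply extend the set of keys and values it can attend to.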

3. Putting IDM-VTON to Work in ComfyUI

3.1 The Star of the Show: The IDM-VTON Node

In ComfyUI, the "IDM-VTON" node is the powerhouse that runs the IDM-VTON diffusion model and generates the virtual try-on output.

For the IDM-VTON node to work its magic, it needs a few key inputs:

  1. Pipeline: This is the loaded IDM-VTON diffusion pipeline that powers the whole virtual try-on process.
  2. Human Input: An image of the person who will be virtually trying on the garment.
  3. Pose Input: A preprocessed DensePose representation of the human input, which helps IDM-VTON understand the person's pose and body shape.
  4. Mask Input: A binary mask that indicates which parts of the human input are clothing. This mask needs to be converted to an image before it can be connected to the node.
  5. Garment Input: An image of the garment to be virtually tried on.

3.2 Getting Everything Ready

To get the IDM-VTON node up and running, there are a few preparation steps:

  1. Loading the Human Image: A LoadImage node is used to load the image of the person.
  2. Generating the Pose Image: The human image is passed through a DensePosePreprocessor node, which computes the DensePose representation that IDM-VTON needs.
  3. Obtaining the Mask Image: There are two ways to get the clothing mask:

a. Manual Masking (Recommended)

  • Right-click on the loaded human image and choose "Open in Mask Editor."
  • In the mask editor UI, manually mask the clothing regions.

b. Automatic Masking

  • Use a GroundingDinoSAMSegment node to automatically segment the clothing.
  • Prompt the node with a text description of the garment (like "t-shirt").

Whichever method you choose, the obtained mask needs to be converted to an image using a MaskToImage node, which is then connected to the "Mask Image" input of the IDM-VTON node.

  4. Loading the Garment Image: Another LoadImage node is used to load the image of the garment.
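As a rough illustration of the mask conversion step above, the sketch below scales a 2-D binary mask into 8-bit grayscale pixel rows, which is conceptually what a MaskToImage-style conversion does. This is an assumption-laden toy, not the actual ComfyUI node code.

```python
def mask_to_image(mask):
    """Scale a 2-D mask with values in [0, 1] to 8-bit grayscale rows."""
    return [[int(round(value * 255)) for value in row] for row in mask]

# A tiny 2x2 mask: 1 marks clothing pixels, 0 marks everything else.
clothing_mask = [[0, 1],
                 [1, 0]]
print(mask_to_image(clothing_mask))  # [[0, 255], [255, 0]]
```

Whichever masking method you used, it is this image form of the mask, not the raw mask, that gets wired into the IDM-VTON node.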

For a deeper dive into the IDM-VTON model, don't miss the original paper, "Improving Diffusion Models for Authentic Virtual Try-on in the Wild". And if you're interested in using IDM-VTON in ComfyUI, be sure to check out the dedicated ComfyUI nodes. Huge thanks to the researchers and developers behind these incredible resources.

