BAGEL | T2I + I2T + I2I

ComfyUI BAGEL is an open-source multimodal foundation model with 7B active parameters (14B total) and a Mixture-of-Transformer-Experts (MoT) design. It handles text-to-image generation, image editing, and visual question answering in one model. According to its authors, it outperforms leading open VLMs such as Qwen2.5-VL and InternVL-2.5 on standard multimodal understanding benchmarks and delivers text-to-image quality competitive with specialist generators such as SD3. With natural language prompting, multi-step reasoning, and an optional view of the model's reasoning trace, it covers generation, editing, and understanding in a single ComfyUI workflow.

ComfyUI BAGEL Description

ComfyUI BAGEL: Multimodal Foundation Model for ComfyUI

ComfyUI BAGEL is a multimodal foundation model designed for both image generation and vision-language understanding. It uses a 14B-parameter Mixture-of-Transformer-Experts (MoT) architecture, with 7B parameters active at inference, and handles text-to-image generation, image editing, and image understanding in a single model.

Integrated directly into ComfyUI, ComfyUI BAGEL lets creators generate detailed images from natural language prompts, edit visuals with textual instructions, and run multimodal tasks such as visual Q&A, captioning, and step-by-step reasoning. Its authors report image quality competitive with specialist generators such as Stable Diffusion 3 and understanding benchmark results ahead of open VLMs such as Qwen2.5-VL and InternVL-2.5.

Why Use ComfyUI BAGEL?

The ComfyUI BAGEL workflow offers:

  • Text-to-Image Generation: Create high-quality images from natural language prompts
  • Image Editing via Text: Modify existing images using descriptive instructions
  • Image Understanding: Perform image captioning, Q&A, and visual analysis tasks
  • Multimodal Reasoning: Get step-by-step explanations or analyses of visual inputs
  • All-in-One Foundation Model: Use a single 14B MoT-based architecture for all of these tasks

With ComfyUI BAGEL, artists, researchers, and developers can explore both the generative and analytical capabilities of multimodal AI through a unified, extensible ComfyUI interface.

1 - Text-to-Image Generation with ComfyUI BAGEL

Generate Images Using Natural Language Prompts

ComfyUI BAGEL creates high-quality images directly from text inputs. To get started:

  1. Enter a detailed text prompt into the Prompt input node.
  2. Optionally configure parameters such as seed, aspect ratio, or decoding steps.
  3. Run the workflow to generate a new image.

This mode is well suited to concept art, visual ideation, storytelling, and rapid prototyping from natural language descriptions alone.
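
The three steps above can also be driven from a script. The following is a minimal sketch, assuming a locally running ComfyUI server on its default address and a workflow exported in API format. The node id "1", the class name HypotheticalBagelTextToImage, and the input keys "prompt" and "seed" are placeholders, not the real BAGEL node names; a real export will carry its own ids and keys.

```python
import json
import urllib.request
import uuid

# Hypothetical node id; check the exported API-format JSON for the real one.
PROMPT_NODE_ID = "1"


def build_payload(workflow: dict, prompt: str, seed: int) -> dict:
    """Return a /prompt request body with the text prompt and seed filled in."""
    # Deep-copy via JSON so the template workflow is left unchanged.
    patched = json.loads(json.dumps(workflow))
    inputs = patched[PROMPT_NODE_ID]["inputs"]
    inputs["prompt"] = prompt  # assumed input key
    inputs["seed"] = seed      # assumed input key
    return {"prompt": patched, "client_id": uuid.uuid4().hex}


def queue_prompt(payload: dict, server: str = "http://127.0.0.1:8188") -> dict:
    """POST the payload to the /prompt endpoint of a running ComfyUI server."""
    request = urllib.request.Request(
        f"{server}/prompt",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())


# Placeholder template standing in for an exported BAGEL workflow.
template = {
    "1": {
        "class_type": "HypotheticalBagelTextToImage",
        "inputs": {"prompt": "", "seed": 0},
    }
}
payload = build_payload(template, "a lighthouse at dawn, watercolor", seed=42)
# queue_prompt(payload)  # uncomment when a ComfyUI server is running
```

Fixing the seed makes a run repeatable, which helps when comparing prompt variations.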

2 - Image Understanding and Visual Q&A with ComfyUI BAGEL

Analyze and Understand Images Using Language

ComfyUI BAGEL includes multimodal reasoning and comprehension features, which make it suitable for image captioning, analysis, and Q&A:

  1. Upload the image you want to analyze.
  2. Type a question or prompt about the image (e.g., "What is the man holding?", "Describe this scene.").
  3. The model returns a text answer or reasoning trace based on the image content.

This feature is particularly useful for education, content tagging, accessibility workflows, and AI agents that need visual grounding.

3 - Image Editing with Textual Instructions in ComfyUI BAGEL

Modify Existing Images via Prompt-Based Editing

ComfyUI BAGEL also supports prompt-based image editing. Here's how to use it:

  1. Upload your original image in the image input node.
  2. Provide a text instruction describing the modification you want (e.g., "add a sunset background", "make it snow").
  3. Run the image editing node group to apply the edit.

This lets artists and designers transform images non-destructively through simple text, without manual photo editing.
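
As a rough illustration of how the editing steps map onto a node graph, the sketch below builds an API-format workflow in which an image loader feeds an edit node, which feeds an image saver. LoadImage and SaveImage are ComfyUI's built-in nodes; HypotheticalBagelImageEdit and its input keys are placeholders, since the actual BAGEL node names are not given here.

```python
def build_edit_workflow(image_name: str, instruction: str) -> dict:
    """Return an API-format graph: load image -> edit by instruction -> save."""
    return {
        # Built-in loader; image_name must already exist in ComfyUI's input folder.
        "1": {"class_type": "LoadImage", "inputs": {"image": image_name}},
        # Placeholder edit node; ["1", 0] links to output 0 of node "1".
        "2": {
            "class_type": "HypotheticalBagelImageEdit",
            "inputs": {"image": ["1", 0], "prompt": instruction},
        },
        # Built-in saver; writes a new file, so the source image is not overwritten.
        "3": {
            "class_type": "SaveImage",
            "inputs": {"images": ["2", 0], "filename_prefix": "bagel_edit"},
        },
    }


workflow = build_edit_workflow("portrait.png", "make it snow")
```

Because the result is saved under a separate filename prefix, the same source image can be reused with different instructions.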

Acknowledgement

The ComfyUI BAGEL workflow is based on the open-source BAGEL-7B-MoT model by ByteDance Seed.
The ComfyUI integration and workflow setup were developed by neverbiasu, providing access to image generation, editing, and understanding within a single unified interface.

GitHub Repository:

ComfyUI BAGEL Model Information

  • Model Name: BAGEL-7B-MoT
  • Architecture: Mixture-of-Transformer-Experts (MoT)
  • Total Parameters: 14B (7B Active)
  • ComfyUI Path: models/bagel/ComfyUI-BAGEL-7B-MoT/
  • Automatic Download: Enabled
  • Manual Download:
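
For capacity planning, the parameter counts above translate into a rough memory estimate for the weights. The sketch below assumes all 14B parameters are held in memory (even though only 7B are active per step) and 2 bytes per parameter, as with bf16 or fp16 storage. Both are assumptions, and activations, caches, and the image autoencoder are not counted.

```python
def weight_memory_gib(num_params: float, bytes_per_param: int) -> float:
    """Approximate memory needed for model weights alone, in GiB."""
    return num_params * bytes_per_param / 2**30


# 14B total parameters at an assumed 2 bytes each (bf16/fp16).
total_gib = weight_memory_gib(14e9, 2)
print(f"approx. weight memory: {total_gib:.1f} GiB")
```

Under these assumptions the weights alone come to roughly 26 GiB, so actual requirements will be higher unless a quantized or offloaded configuration is used.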

RunComfy
Copyright 2025 RunComfy. All Rights Reserved.