
BAGEL AI | T2I + I2T + I2I

BAGEL AI is an open-source multimodal foundation model with 14B total parameters, of which 7B are active at inference, built on a Mixture-of-Transformer-Experts (MoT) design. It handles multimodal tasks such as text-to-image generation, image editing, and visual question answering, outperforming top-tier open VLMs such as Qwen2.5-VL and InternVL-2.5 on standard benchmarks while delivering generative quality on par with specialist models like SD3. With natural language prompting, complex reasoning, and an optional mode that exposes the model's intermediate reasoning, BAGEL AI offers an all-in-one solution for advanced multimodal workflows in ComfyUI.

ComfyUI BAGEL AI Workflow

BAGEL AI | Advanced Text-to-Image & Visual Chat

ComfyUI BAGEL AI Examples

(Seven example images produced with the BAGEL AI workflow.)

ComfyUI BAGEL AI Description

BAGEL AI: Multimodal Foundation Model for ComfyUI

BAGEL (BAndwidth-efficient Generalist Expert Learner) AI is a powerful multimodal foundation model designed for both image generation and vision-language understanding. Based on a 14B parameter Mixture-of-Transformer-Experts (MoT) architecture—with 7B active at inference—BAGEL AI delivers state-of-the-art performance across text-to-image generation, image editing, and image understanding tasks.
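
As a rough conceptual illustration of the MoT idea (not BAGEL's actual implementation), the sketch below routes each token to one of two transformer experts by modality, while a single self-attention layer sees the whole multimodal sequence. All names and shapes are illustrative assumptions.

```python
# Conceptual sketch only: per-token routing between two transformer "experts"
# that share one self-attention over the combined sequence. This is not the
# BAGEL codebase, just an illustration of the Mixture-of-Transformer-Experts idea.
import torch
import torch.nn as nn

class MoTBlock(nn.Module):
    def __init__(self, dim: int = 512, heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        # Two expert feed-forward stacks: index 0 = understanding, 1 = generation.
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
            for _ in range(2)
        )

    def forward(self, x: torch.Tensor, expert_id: torch.Tensor) -> torch.Tensor:
        # Shared self-attention sees the full multimodal token sequence.
        h, _ = self.attn(x, x, x)
        x = x + h
        # Each token is then processed by the expert matching its modality.
        out = torch.zeros_like(x)
        for i, expert in enumerate(self.experts):
            mask = expert_id == i                # (batch, seq) boolean mask
            out[mask] = expert(x[mask])
        return x + out

# Example: 6 "understanding" tokens followed by 10 "generation" tokens.
x = torch.randn(1, 16, 512)
expert_id = torch.cat(
    [torch.zeros(1, 6, dtype=torch.long), torch.ones(1, 10, dtype=torch.long)], dim=1
)
print(MoTBlock()(x, expert_id).shape)  # torch.Size([1, 16, 512])
```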

Integrated directly into ComfyUI, BAGEL AI allows creators to generate detailed images from natural language prompts, edit visuals with textual instructions, and perform multimodal tasks like visual Q&A, captioning, and step-by-step reasoning. BAGEL AI combines the quality of diffusion models (like Stable Diffusion 3) with the analytical power of leading VLMs (outperforming models like Qwen2.5-VL and InternVL-2.5).

Why Use BAGEL AI?


The BAGEL AI workflow offers:

  • Text-to-Image Generation: Create high-quality images from natural language prompts
  • Image Editing via Text: Modify existing images with descriptive instructions
  • Image Understanding: Perform image captioning, Q&A, and visual analysis
  • Multimodal Reasoning: Get step-by-step explanations or analysis of visual inputs
  • All-in-One Foundation Model: Use a single 14B MoT-based architecture for all of these multimodal tasks

With BAGEL AI, artists, researchers, and developers can explore both the generative and analytical sides of multimodal AI through a unified and extensible ComfyUI interface.

1 - Text-to-Image Generation with BAGEL AI


Generate Images Using Natural Language Prompts

BAGEL AI allows you to create high-quality images directly from text inputs. To get started:

  1. Enter a detailed text prompt into the Prompt input node.
  2. Optionally configure parameters such as seed, aspect ratio, or decoding steps.
  3. Run the workflow to generate a new image with the BAGEL model.

This function is ideal for concept art, visual ideation, storytelling, or rapid prototyping driven purely by natural language descriptions; a minimal scripting sketch follows below.
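
If you prefer to drive this step from code, the sketch below queues a BAGEL text-to-image run through ComfyUI's standard HTTP API. It assumes a local ComfyUI server on the default port and a workflow exported in API format; the file name bagel_t2i_api.json, the node ID "6", and the input name "prompt" are placeholders, not fixed values of the BAGEL nodes.

```python
# Minimal sketch: queue an exported BAGEL text-to-image workflow via ComfyUI's
# HTTP API. Inspect your own exported JSON to find the correct node ID and
# input name for the prompt text.
import json
import urllib.request

COMFYUI_URL = "http://127.0.0.1:8188"   # default local ComfyUI address

def queue_bagel_t2i(prompt_text: str, workflow_path: str = "bagel_t2i_api.json") -> dict:
    with open(workflow_path, "r", encoding="utf-8") as f:
        workflow = json.load(f)

    # Overwrite the text input of the prompt node (placeholder node ID "6").
    workflow["6"]["inputs"]["prompt"] = prompt_text

    payload = json.dumps({"prompt": workflow}).encode("utf-8")
    req = urllib.request.Request(f"{COMFYUI_URL}/prompt", data=payload,
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)   # contains the prompt_id used to track the job

if __name__ == "__main__":
    print(queue_bagel_t2i("a cozy cabin in a snowy forest at dusk, warm light"))
```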

2 - Image Understanding and Visual Q&A with BAGEL AI


Analyze and Understand Images Using Language

BAGEL AI includes advanced multimodal reasoning and comprehension features, making it well suited to image captioning, analysis, and Q&A:

  1. Upload an image to analyze.
  2. Type a question or prompt about the image (e.g., "What is the man holding?", "Describe this scene.").
  3. BAGEL AI returns an answer or reasoning trace based on the image content; the sketch below shows one way to read that result back programmatically.

This feature is particularly useful for education, content tagging, accessibility workflows, or AI agents that need visual grounding.
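
When the Q&A workflow is queued over ComfyUI's HTTP API (as in the text-to-image sketch above), the finished run can be read back from the standard /history endpoint. The snippet below is a sketch of that polling step; the exact layout of the BAGEL node's text output inside the returned record depends on the node and is an assumption here.

```python
# Minimal sketch: poll ComfyUI's /history endpoint until a queued job finishes,
# then return its record so the answer text can be inspected.
import json
import time
import urllib.request

COMFYUI_URL = "http://127.0.0.1:8188"

def wait_for_result(prompt_id: str, poll_seconds: float = 1.0) -> dict:
    """Poll ComfyUI until the queued job appears in its history, then return it."""
    while True:
        with urllib.request.urlopen(f"{COMFYUI_URL}/history/{prompt_id}") as resp:
            history = json.load(resp)
        if prompt_id in history:
            return history[prompt_id]   # contains per-node "outputs" for the run
        time.sleep(poll_seconds)

# Example usage (prompt_id comes from the /prompt response when the workflow
# was queued):
# record = wait_for_result("your-prompt-id")
# print(json.dumps(record["outputs"], indent=2))   # locate the answer text here
```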

3 - Image Editing with Textual Instructions in BAGEL AI


Modify Existing Images via Prompt-Based Editing

BAGEL AI also supports prompt-based image editing. Here's how to use it:

  1. Upload your original image in the input node.
  2. Provide a text instruction describing the modification you want (e.g., "add a sunset background", "make it snow").
  3. Run the node group to apply the edit.

This lets artists and designers transform images non-destructively through simple text, without manual photo editing; a small scripting sketch for this flow follows below.
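
For scripted editing, a minimal sketch along the same lines: upload the source image through ComfyUI's /upload/image endpoint, fill the uploaded file name and the edit instruction into an exported workflow, and queue it. The node IDs ("1" and "4") and the workflow file name are placeholders, and the third-party requests package is assumed.

```python
# Minimal sketch: prompt-based editing through ComfyUI's HTTP API.
import json
import requests

COMFYUI_URL = "http://127.0.0.1:8188"

def upload_image(path: str) -> str:
    """Upload an image to ComfyUI's input folder and return its stored name."""
    with open(path, "rb") as f:
        resp = requests.post(f"{COMFYUI_URL}/upload/image", files={"image": f})
    resp.raise_for_status()
    return resp.json()["name"]

def queue_bagel_edit(image_path: str, instruction: str,
                     workflow_path: str = "bagel_edit_api.json") -> dict:
    with open(workflow_path, "r", encoding="utf-8") as f:
        workflow = json.load(f)
    workflow["1"]["inputs"]["image"] = upload_image(image_path)   # image-loading node (placeholder ID)
    workflow["4"]["inputs"]["prompt"] = instruction               # edit-instruction node (placeholder ID)
    resp = requests.post(f"{COMFYUI_URL}/prompt", json={"prompt": workflow})
    resp.raise_for_status()
    return resp.json()

if __name__ == "__main__":
    print(queue_bagel_edit("portrait.png", "add a sunset background"))
```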

Acknowledgement

The BAGEL AI workflow for ComfyUI is based on the open-source BAGEL-7B-MoT model by ByteDance Seed. The ComfyUI integration and workflow setup were developed by neverbiasu, providing access to image generation, editing, and understanding within a single unified interface.

GitHub Repository:

BAGEL AI Model Information

  • Model Name: ComfyUI BAGEL-7B-MoT
  • Architecture: Mixture-of-Transformer-Experts (MoT)
  • Total Parameters: 14B (7B active at inference)
  • ComfyUI Path: models/bagel/ComfyUI-BAGEL-7B-MoT/
  • Automatic Download: Enabled
  • Manual Download:
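
If automatic download is unavailable, the weights can also be fetched by hand. A minimal sketch with huggingface_hub, assuming the model is published under the Hugging Face repo ByteDance-Seed/BAGEL-7B-MoT (verify the repo id before running) and that your installation expects the ComfyUI path listed above:

```python
# Sketch of a manual download into the model path listed above.
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="ByteDance-Seed/BAGEL-7B-MoT",                      # assumed repo id
    local_dir="ComfyUI/models/bagel/ComfyUI-BAGEL-7B-MoT",      # path from the model info above
)
```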
