Install this extension via the ComfyUI Manager by searching for Comfy-UmiAI:
1. Click the Manager button in the main menu
2. Select the Custom Nodes Manager button
3. Enter Comfy-UmiAI in the search bar
After installation, click the Restart button to restart ComfyUI. Then manually refresh your browser to clear the cache and load the updated list of nodes.
Comfy-UmiAI enhances ComfyUI by converting static prompts into dynamic, context-aware workflows. It features persistent variables, conditional logic, native LoRA loading, and external data fetching.
Comfy-UmiAI Introduction
Comfy-UmiAI is an extension designed to enhance your experience with ComfyUI by transforming static prompts into dynamic, context-aware workflows. This extension acts as a "Multimodal Brain" for your prompts, integrating advanced logic, persistent variables, and native LoRA loading. It also incorporates local vision and large language model (LLM) intelligence directly into your prompt text box. Whether you're looking to generate high-quality, natural language prompts or streamline your creative process, Comfy-UmiAI offers a suite of tools to help you achieve your artistic goals.
How Comfy-UmiAI Works
At its core, Comfy-UmiAI works by combining various models and logic systems to process and enhance your prompts. Imagine it as a smart assistant that not only understands your input but also refines and enriches it using advanced AI capabilities. The extension uses a two-stage pipeline: the first stage involves vision models that analyze and describe images, while the second stage uses language models to refine and expand on these descriptions, turning them into detailed and creative prompts. This process allows you to generate more nuanced and contextually relevant outputs without needing to delve into complex technical details.
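The two-stage flow described above can be sketched in plain Python. This is an illustrative sketch only, not the extension's actual API: every function name, prompt string, and model call stand-in here is hypothetical.

```python
# Hypothetical sketch of a two-stage vision-to-prompt pipeline,
# mirroring the flow Comfy-UmiAI describes. The function names and
# returned strings are illustrative stand-ins, not real API calls.

def vision_stage(image_path: str) -> str:
    """Stage 1: a local vision model (e.g. JoyCaption or Llava)
    produces a plain-language caption of the input image."""
    # Stand-in for a real vision-model inference call.
    return f"A photo of a lighthouse at dusk ({image_path})"

def refiner_stage(caption: str) -> str:
    """Stage 2: a local LLM (e.g. Qwen2.5-1.5B) expands the caption
    into a richer, more detailed prompt."""
    # Stand-in for a real LLM inference call.
    return caption + ", dramatic golden-hour lighting, highly detailed"

def build_prompt(image_path: str) -> str:
    caption = vision_stage(image_path)   # describe the image
    return refiner_stage(caption)        # refine the description

print(build_prompt("input.png"))
```

In the real extension this all happens inside the prompt text box; the sketch only shows why the output of the vision stage feeds the refiner stage rather than going straight to the sampler.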
Comfy-UmiAI Features
Vision-to-Text: Automatically caption images using local models like JoyCaption or Llava. Simply connect an image and use the [VISION] tag in your prompt to generate descriptions.
Dual LLM Pipeline: Combine vision models with text refiners to create high-quality prompts. This feature allows for seamless integration of visual and textual data.
Native LoRA Loading: Easily load LoRA models by typing <lora:filename:1.0> directly in your text, eliminating the need for external loader nodes.
Auto-Updater: Keep your system up-to-date with a built-in switch for auto-updating and patching llama-cpp-python for CUDA compatibility.
Advanced Logic Engine: Utilize logical operators like AND, OR, NOT, and XOR to filter wildcards or conditionally change prompt text.
Persistent Variables: Define variables once and reuse them throughout your prompts for consistency.
Z-Image Support: Automatically detect and fix Z-Image format LoRAs on the fly.
Danbooru Integration: Fetch visual tags from Danbooru by typing char:name.
Resolution Control: Set image dimensions contextually within your prompts.
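To illustrate, several of these tags could appear together in a single prompt. Only the individual tag forms ([VISION], char:name, and <lora:filename:weight>) are documented above; the particular names and the way they are combined here are illustrative assumptions, not verified syntax.

```
[VISION], char:hatsune_miku, <lora:detail_tweaker:0.8>, cinematic lighting
```

Here [VISION] would be replaced by the caption of the connected image, char:hatsune_miku would pull visual tags from Danbooru, and the <lora:...> tag would load the named LoRA at weight 0.8 without a separate loader node.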
Comfy-UmiAI Models
Comfy-UmiAI offers a variety of models tailored to different tasks:
Vision Models:
JoyCaption-Alpha-2: Known for its accuracy in image captioning.
Llava-v1.5-7b: A reliable standard for vision tasks.
Refiner Models:
Qwen2.5-1.5B: Fast and efficient, ideal for low VRAM usage.
Dolphin-Llama3.1-8B: Creative and uncensored, perfect for following complex instructions.
Wingless Imp 8B: Excellent for creative roleplay descriptions.
These models can be selected based on your specific needs, whether you require speed, creativity, or detailed descriptions.
What's New with Comfy-UmiAI
Recent updates have brought significant improvements to Comfy-UmiAI:
Complete Rewrite of the Internal Logic Engine: This update requires users to recreate the UmiAI Node in their workflows to access new inputs and logic operators.
Vision-to-Text Feature: A new addition that allows for automatic image captioning using local models.
Enhanced LLM Pipeline: Improved integration of vision and text models for more natural language prompt generation.
These updates enhance the functionality and user experience, making it easier for AI artists to create and refine their work.
Troubleshooting Comfy-UmiAI
If you encounter issues while using Comfy-UmiAI, here are some common solutions:
Node Not Loading New Inputs: If updating from an older version, right-click the UmiAI Node in your workflow and select Fix Node (Recreate) or delete and re-add it.
Vision Projector or CUDA Errors: Toggle the update_llama_cpp widget to True and queue a prompt to reinstall the correct wheels for your CUDA version.
Model Download Issues: Ensure that your internet connection is stable and that you have sufficient storage space for model downloads.
For further assistance, consider joining the Umi AI Discord community where you can ask questions and share experiences with other users.
Learn More about Comfy-UmiAI
To deepen your understanding of Comfy-UmiAI and explore its full potential, consider the following resources:
Tutorials and Documentation: Explore detailed guides and documentation to help you get started and make the most of the extension.
Community Forums: Join the Umi AI Discord server to connect with other AI artists, share your work, and get support from the community.
By leveraging these resources, you can enhance your skills and creativity, making the most of what Comfy-UmiAI has to offer.