Enhances AI image processing with language models for prompt generation, bridging visual and textual elements in AI art.
Auto-LLM-Vision is a node designed to extend AI-driven image processing within the ComfyUI framework. It uses a language model to interpret visual inputs and generate prompts from them, integrating text- and vision-based AI functionality. With this node you can automate the generation of detailed, contextually relevant prompts from images, which can significantly streamline the creative process in AI art generation. The main goal of Auto-LLM-Vision is to bridge the gap between visual data and language models, giving artists a tool to explore new dimensions of creativity by combining visual and textual elements in their work.
The image input serves as the basis for prompt generation: the node extracts relevant features from the image and translates them into text prompts, enabling a deeper interpretation of the visual content.
The integer llm_vision_max_token parameter defines the maximum number of tokens the language model can generate for a given image input. It controls the length of the generated text, with a default of 50, a minimum of 10, and a maximum of 1024. Adjusting this parameter affects the detail and complexity of the generated prompts.
The float llm_vision_tempture parameter controls the randomness of the language model's output. Its default is 0.8, with an allowed range of -2.0 to 2.0. Lower values produce more deterministic output, while higher values introduce more variability and creativity.
The llm_vision_system_prompt string sets a system-level prompt that guides the language model's interpretation of the image. It supports multiline and dynamic prompts, and ships with a default template that can be customized to fit specific needs or themes.
The llm_vision_ur_prompt string defines a user-level prompt that further refines the language model's output. Like the system prompt, it supports multiline and dynamic prompts, allowing personalized, context-specific text generation.
A boolean parameter that determines whether the generated text is appended to existing results. It defaults to True, enabling continuous, cumulative text generation; toggle it off if a standalone output is preferred.
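Taken together, the inputs above can be sketched as a ComfyUI `INPUT_TYPES` schema. This is a hypothetical reconstruction from the documented defaults and ranges only, not the node's actual source; the field names `llm_vision_image` and `llm_vision_result_append_enabled` are assumptions (the other names appear in this page's own text, including the node's "tempture" spelling).

```python
class AutoLLMVisionSketch:
    """Hypothetical sketch of the Auto-LLM-Vision input schema,
    reconstructed from the documented defaults and ranges."""

    @classmethod
    def INPUT_TYPES(cls):
        return {
            "required": {
                # Image used as the basis for prompt generation (name assumed).
                "llm_vision_image": ("IMAGE",),
                # Maximum tokens the LLM may generate: default 50, range 10-1024.
                "llm_vision_max_token": ("INT", {
                    "default": 50, "min": 10, "max": 1024,
                }),
                # Sampling temperature: default 0.8, range -2.0 to 2.0.
                # "tempture" matches the parameter spelling used on this page.
                "llm_vision_tempture": ("FLOAT", {
                    "default": 0.8, "min": -2.0, "max": 2.0,
                }),
                # System- and user-level prompts; both multiline and dynamic.
                "llm_vision_system_prompt": ("STRING", {
                    "multiline": True, "dynamicPrompts": True,
                }),
                "llm_vision_ur_prompt": ("STRING", {
                    "multiline": True, "dynamicPrompts": True,
                }),
                # Whether generated text is appended to existing results
                # (name assumed; the documented default is True).
                "llm_vision_result_append_enabled": ("BOOLEAN", {
                    "default": True,
                }),
            }
        }
```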
The positive output parameter contains conditioning values derived from the image input, which guide the language model toward relevant and contextually appropriate prompts. It plays a crucial role in ensuring that the generated text aligns with the intended interpretation of the visual content.
The negative output parameter provides conditioning values that help the language model avoid generating irrelevant or undesirable prompts. By balancing positive and negative conditioning, the node ensures a more accurate and focused text generation process.
This output parameter includes the latent representations of the image, which are essential for further processing and analysis. The latent data can be used to refine the generated prompts or to feed into other nodes for additional AI-driven tasks.
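The three outputs above map naturally onto a ComfyUI return declaration. Again a hedged sketch rather than the node's actual code; the helper function name is illustrative.

```python
# Hypothetical output declaration for Auto-LLM-Vision: positive and
# negative conditioning plus the latent image representation.
RETURN_TYPES = ("CONDITIONING", "CONDITIONING", "LATENT")
RETURN_NAMES = ("positive", "negative", "latent")

def node_outputs(positive, negative, latent):
    """Bundle the three documented outputs in declaration order,
    ready to be consumed by downstream nodes (e.g. a KSampler)."""
    return (positive, negative, latent)
```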
Experiment with the llm_vision_tempture parameter to find the right balance between creativity and accuracy in the generated prompts: lower values will produce more predictable results, while higher values can introduce creative variations. Use the llm_vision_system_prompt and llm_vision_ur_prompt parameters to tailor the language model's output to specific themes or styles; customizing these prompts can significantly enhance the relevance and quality of the generated text. Adjust the llm_vision_max_token setting when working with complex images that require detailed descriptions: increasing the token limit can provide more comprehensive prompts, but be mindful of the potential for overly verbose outputs.
If the generated text exceeds the limit set by the llm_vision_max_token parameter, increase the llm_vision_max_token value to accommodate longer text outputs, or simplify the image input to reduce the complexity of the generated prompts. If the llm_vision_tempture parameter is set outside the allowable range of -2.0 to 2.0, adjust the llm_vision_tempture value to fall within the specified range to ensure proper functioning of the language model.
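The temperature-range error described above can be caught before the node runs by validating the value against the documented bounds. A minimal sketch, assuming a standalone helper (the function name is hypothetical, not part of the node):

```python
def check_llm_vision_tempture(value: float) -> float:
    """Reject temperature values outside the documented -2.0 to 2.0 range.

    Hypothetical validation helper; the node itself may clamp or error
    differently.
    """
    if not -2.0 <= value <= 2.0:
        raise ValueError(
            f"llm_vision_tempture must be between -2.0 and 2.0, got {value}"
        )
    return value
```

Calling it with the default, `check_llm_vision_tempture(0.8)`, simply returns the value, while an out-of-range setting such as 3.0 raises a ValueError before the language model is invoked.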