RunComfy

FlashVSR | Real-Time Video Upscaler

Upscale videos fast, smooth, and super clear—no detail lost.

LatentSync| Lip Sync Model

Advanced audio-driven lip sync technology.

LTX-2 ComfyUI | Real-Time Video Generator

Create real-time videos instantly, faster than any other generator.

FLUX Inpainting | Seamless Image Editing

Effortlessly fill, remove, and refine images, seamlessly integrating new content.

ComfyUI > Nodes > COMFYUI_PROMPTMODELS > Google AI - Vision Analyzer

ComfyUI Node: Google AI - Vision Analyzer

Class Name

GoogleAI_TextVisionNode

Category
Google AI/Text

Author
cdanielp (Account age: 0days) Extension
COMFYUI_PROMPTMODELS Latest Updated
2026-03-17 Github Stars
0.02K

Github Ask cdanielp Current Questions Past Questions

Table of Content

Description
GoogleAI_TextVisionNode:
GoogleAI_TextVisionNode Input Parameters:
GoogleAI_TextVisionNode Output Parameters:
GoogleAI_TextVisionNode Usage Tips:
GoogleAI_TextVisionNode Common Errors and Solutions:
Related Nodes

How to Install COMFYUI_PROMPTMODELS

Install this extension via the ComfyUI Manager by searching for COMFYUI_PROMPTMODELS

1. Click the Manager button in the main menu
2. Select Custom Nodes Manager button
3. Enter COMFYUI_PROMPTMODELS in the search bar

After installation, click the Restart button to restart ComfyUI. Then, manually refresh your browser to clear the cache and access the updated list of nodes.

Visit ComfyUI Online for ready-to-use ComfyUI environment

Free trial available
16GB VRAM to 80GB VRAM GPU machines
400+ preloaded models/nodes
Freedom to upload custom models/nodes
200+ ready-to-run workflows
100% private workspace with up to 200GB storage
Dedicated Support

Run ComfyUI Online

Google AI - Vision Analyzer Description

Analyzes images using AI for text-based insights, leveraging Google's Gemini Vision technology.

Google AI - Vision Analyzer:

The GoogleAI_TextVisionNode is a powerful tool designed to analyze images using advanced AI models, specifically tailored for text-based interpretation of visual content. This node leverages Google's Gemini Vision technology to provide detailed descriptions and insights from images, making it an invaluable asset for AI artists and creators who wish to extract meaningful information from visual data. By integrating multiple images, it allows for comprehensive comparisons and sequence analyses, enhancing the depth and context of the analysis. The node's primary goal is to transform visual inputs into rich textual outputs, enabling users to understand and utilize image content in a more profound way. Its seamless integration with Google's AI models ensures high accuracy and relevance in the generated descriptions, making it an essential component for projects that require sophisticated image analysis.

Google AI - Vision Analyzer Input Parameters:

image_1

This is the primary image input and is mandatory for the node's operation. It serves as the main subject for analysis, and its content will be described in detail by the AI model. There are no specific minimum or maximum values, but the image should be clear and relevant to the intended analysis.

prompt

The prompt is a string input that guides the AI model on what aspects of the image to focus on. It can be a detailed question or a simple instruction, such as "Describe this image in detail." The prompt helps tailor the analysis to specific needs or interests, enhancing the relevance of the output.

model

This parameter specifies the AI model to be used for the analysis. The default model is "gemini-3.1-pro-preview," which is optimized for high-quality text generation from images. Users can select other models if available, depending on their specific requirements and the desired output quality.

api_key

The API key is an optional string input that authenticates the user's access to Google's AI services. While it is not mandatory, providing a valid API key ensures that the node can access the latest features and capabilities of the AI models.

system_prompt

An optional string input that provides additional instructions or context for the AI model. It can be used to set the tone or style of the analysis, or to include specific guidelines that the model should follow during the image interpretation process.

image_2

This optional parameter allows users to input a second image for comparative analysis. It can be used to highlight differences or similarities between the primary image and this additional image, providing a richer context for the analysis.

image_3

Similar to image_2, this optional parameter accepts a third image for further comparison or sequence analysis. Including multiple images can enhance the depth of the analysis by allowing the AI to consider a broader range of visual data.

image_4

This optional parameter allows for the inclusion of a fourth image, further expanding the scope of the analysis. It is particularly useful for projects that require a comprehensive examination of multiple related images.

image_5

The fifth optional image input, which can be used to complete a sequence or provide additional context for the analysis. Including up to five images allows for a detailed and nuanced interpretation of complex visual scenarios.

Google AI - Vision Analyzer Output Parameters:

analysis

The output parameter is a string that contains the detailed analysis of the input image(s). This analysis is generated by the AI model based on the provided prompt and any additional images. It offers insights, descriptions, and interpretations that can be used for various creative or analytical purposes. The quality and relevance of the output depend on the clarity of the input images and the specificity of the prompt.

Google AI - Vision Analyzer Usage Tips:

Ensure that the primary image (image_1) is clear and relevant to the analysis to achieve the best results.
Use a well-defined prompt to guide the AI model's focus and enhance the relevance of the output.
Consider including additional images for comparative analysis to provide a richer context and more comprehensive insights.
If available, use a valid API key to access the latest features and capabilities of Google's AI models.

Google AI - Vision Analyzer Common Errors and Solutions:

❌ Error: Invalid API Key

Explanation: This error occurs when the provided API key is incorrect or expired.
Solution: Verify that the API key is correct and active. If necessary, obtain a new key from the Google Cloud Console.

❌ Error: Image Not Found

Explanation: This error indicates that one or more of the specified image inputs could not be located or accessed.
Solution: Ensure that all image paths are correct and that the images are accessible from the node's environment.

❌ Error: Model Not Supported

Explanation: This error arises when an unsupported or unavailable model is specified in the model parameter.
Solution: Check the available models and select one that is supported by the node, such as the default "gemini-3.1-pro-preview."

❌ Error: Prompt Too Long

Explanation: The prompt provided exceeds the maximum allowable length for processing.
Solution: Shorten the prompt to fit within the character limit, focusing on the most critical aspects of the analysis.

Google AI - Vision Analyzer Related Nodes

Go back to the extension to check out more related nodes.

COMFYUI_PROMPTMODELS

Table of Content

Description
GoogleAI_TextVisionNode:
GoogleAI_TextVisionNode Input Parameters:
GoogleAI_TextVisionNode Output Parameters:
GoogleAI_TextVisionNode Usage Tips:
GoogleAI_TextVisionNode Common Errors and Solutions:
Related Nodes

Z Image Turbo | Ultra-Fast Photorealistic Generator

Generate ultra-clear visuals fast with unmatched real-time detail.

Fantasy Portrait | Expressive Photo Animation

Photo → expressive cinematic face animation, fast and identity-accurate.

InstantCharacter

One photo, endless characters. Perfect identity preservation.

Flux UltraRealistic LoRA V2

Create stunningly lifelike image with Flux UltraRealistic LoRA V2

Support

Resources

Legal

RunComfy

RunComfy is the premier ComfyUI platform, offering ComfyUI online environment and services, along with ComfyUI workflows featuring stunning visuals. RunComfy also provides AI Models, enabling artists to harness the latest AI tools to create incredible art.

Support

Resources

Legal

RunComfy

Save 4 hours! We auto-setup your workflow! Free!

ComfyUI Node: Google AI - Vision Analyzer

GoogleAI_TextVisionNode

How to Install COMFYUI_PROMPTMODELS

Google AI - Vision Analyzer Description

Google AI - Vision Analyzer:

Google AI - Vision Analyzer Input Parameters:

image_1

prompt

model

api_key

system_prompt

image_2

image_3

image_4

image_5

Google AI - Vision Analyzer Output Parameters:

analysis

Google AI - Vision Analyzer Usage Tips:

Google AI - Vision Analyzer Common Errors and Solutions:

❌ Error: Invalid API Key

❌ Error: Image Not Found

❌ Error: Model Not Supported

❌ Error: Prompt Too Long

Google AI - Vision Analyzer Related Nodes