ComfyUI Node: Ollama Vision Style Planner

Class Name

OllamaVisionStylePlanner

Category
Ollama/Planner
Author
GeekatplayStudio (Account age: 4275 days)
Extension
ComfyUI-cluster
Last Updated
2026-02-13
Github Stars
0.02K

How to Install ComfyUI-cluster

Install this extension via the ComfyUI Manager by searching for ComfyUI-cluster:
  1. Click the Manager button in the main menu.
  2. Select the Custom Nodes Manager button.
  3. Enter ComfyUI-cluster in the search bar.
After installation, click the Restart button to restart ComfyUI. Then manually refresh your browser to clear the cache and load the updated list of nodes.

Ollama Vision Style Planner Description

Uses a local Ollama vision model to analyze an image and plan generation, helping AI artists maintain stylistic coherence.

Ollama Vision Style Planner:

The OllamaVisionStylePlanner node uses a local Ollama vision model to analyze an image and plan a subsequent image generation task. It is aimed at AI artists who want new work to stay stylistically consistent with a reference image. A vision prompt directs the model to describe the style, composition, lighting, and key elements of the image. The node then combines that analysis with a separate, user-provided generation prompt to plan the new image. Keeping the two prompts separate lets you control what is extracted from the reference independently of what is generated.

Ollama Vision Style Planner Input Parameters:

image

This parameter accepts an image input that the Ollama vision model will analyze. The image serves as the basis for style extraction, which influences the subsequent generation process.

vision_prompt

A string input that guides the vision model in analyzing the image. It typically includes instructions to describe the style, composition, lighting, and key elements of the image. This prompt is crucial for extracting the desired stylistic features from the image.

generation_prompt

This string input is used to guide the generation of new images. It allows users to specify the creative direction for the new image, separate from the style extracted from the original image.

ollama_model

Specifies the model to be used for vision analysis. The default value is "llava:7b". This parameter determines the capabilities and performance of the vision analysis.
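To make the roles of image, vision_prompt, and ollama_model concrete, here is a minimal sketch of how an image and a prompt can be sent to a vision model through Ollama's `/api/generate` HTTP endpoint. It is not taken from the node's source: the function names `build_vision_payload` and `analyze_image` are hypothetical, and the node's actual request format may differ.

```python
import base64
import json
import urllib.request


def build_vision_payload(image_bytes: bytes, vision_prompt: str,
                         model: str = "llava:7b") -> dict:
    """Build a request body for Ollama's /api/generate endpoint.

    Hypothetical helper for illustration; not the node's own code.
    """
    return {
        "model": model,
        "prompt": vision_prompt,
        # Ollama expects images as a list of base64-encoded strings.
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,  # request one JSON object instead of a stream
    }


def analyze_image(image_bytes: bytes, vision_prompt: str,
                  model: str = "llava:7b", host: str = "localhost",
                  port: int = 11434, timeout: float = 120.0) -> str:
    """Send the image to a running Ollama server and return its text reply.

    Assumes the server is reachable over plain HTTP.
    """
    payload = build_vision_payload(image_bytes, vision_prompt, model)
    request = urllib.request.Request(
        f"http://{host}:{port}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))["response"]
```

Calling `analyze_image(open("reference.png", "rb").read(), "Describe the style, composition, lighting, and key elements.")` would return the model's description as a string, provided the model has already been pulled into the local Ollama instance.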

registry_path

A string that indicates the path to the model registry file, with a default value of "model_registry.json". This file contains information about available models and their configurations.

task_hint

This parameter provides a hint about the type of task to be performed, with options including "auto", "img2img", and "text2img". It helps the node determine the appropriate processing pathway.

user_negative

A string input for specifying negative prompts, which are used to exclude certain elements or styles from the generated image. This can help refine the output by avoiding unwanted features.

aspect_ratio

Defines the aspect ratio of the generated image. Options include "1:1", "3:2", "2:3", "4:3", "3:4", "16:9", "9:16", "21:9", "9:21", "2:1", "1:2", "5:3", "3:5", "4:5", and "5:4". This parameter affects the composition and framing of the output image.

base_size

An integer that sets the base size for the generated image, with a default of 1024 and a range from 256 to 2048, adjustable in steps of 64. This parameter influences the resolution and detail of the output.
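The page does not say how aspect_ratio and base_size are combined into pixel dimensions. One common approach, sketched below as an assumption rather than the node's confirmed behavior, keeps the pixel area close to base_size × base_size and snaps each side to a multiple of 64. The name `plan_dimensions` and the snapping rule are hypothetical.

```python
import math

# Aspect ratios listed for the node's aspect_ratio input.
SUPPORTED_RATIOS = {
    "1:1", "3:2", "2:3", "4:3", "3:4", "16:9", "9:16", "21:9",
    "9:21", "2:1", "1:2", "5:3", "3:5", "4:5", "5:4",
}


def plan_dimensions(aspect_ratio: str, base_size: int = 1024,
                    multiple: int = 64) -> tuple[int, int]:
    """Return (width, height) with roughly base_size**2 pixels.

    Hypothetical formula for illustration; the node may compute sizes
    differently.
    """
    if aspect_ratio not in SUPPORTED_RATIOS:
        raise ValueError(f"Invalid aspect ratio: {aspect_ratio!r}")
    if not 256 <= base_size <= 2048:
        raise ValueError("base_size must be between 256 and 2048")

    w, h = (int(part) for part in aspect_ratio.split(":"))
    # Scaling width up and height down by the same factor preserves area.
    scale = math.sqrt(w / h)

    def snap(value: float) -> int:
        # Round to the nearest multiple, never below one multiple.
        return max(multiple, round(value / multiple) * multiple)

    return snap(base_size * scale), snap(base_size / scale)
```

Under this scheme, `plan_dimensions("16:9")` gives `(1344, 768)` and `plan_dimensions("1:1", 512)` gives `(512, 512)`.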

ollama_host

Specifies the host address of the Ollama server, defaulting to "localhost". Change it only if Ollama runs on a different machine than ComfyUI.

ollama_port

An integer that sets the port number of the Ollama server, with a default of 11434 and a range from 1 to 65535. It must match the port the server is actually listening on.

max_vram

Indicates the maximum VRAM available for processing, with options of 24, 16, 12, 8, and 6, and a default of 24. This parameter affects the model's performance and the complexity of tasks it can handle.

Ollama Vision Style Planner Output Parameters:

plan

The output is a detailed plan that includes the selected model, LoRAs, and parameters for generating the new image. This plan is based on the analysis of the input image and the specified prompts, ensuring that the generated image aligns with the desired style and content.

Ollama Vision Style Planner Usage Tips:

  • Use a detailed vision prompt to ensure the model accurately captures the style and key elements of the input image, which will enhance the quality of the generated image.
  • Adjust the aspect ratio and base size parameters to match the intended use of the generated image, whether for print, digital display, or other purposes.
  • Experiment with different generation prompts to explore various creative directions while maintaining the stylistic coherence provided by the vision analysis.

Ollama Vision Style Planner Common Errors and Solutions:

ConnectionError: Unable to connect to Ollama model

  • Explanation: This error occurs when the node cannot establish a connection to the Ollama model server, possibly due to incorrect host or port settings.
  • Solution: Verify that the ollama_host and ollama_port parameters are correctly set and that the server is running and accessible.
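A quick way to check the host and port outside ComfyUI is to query Ollama's `/api/tags` endpoint, which lists the locally installed models. The sketch below assumes the server is reachable over plain HTTP; the helper names `ollama_base_url` and `list_ollama_models` are hypothetical and not part of the extension.

```python
import json
import urllib.error
import urllib.request


def ollama_base_url(host: str = "localhost", port: int = 11434) -> str:
    """Build the server URL from the node's host and port settings."""
    if not 1 <= port <= 65535:
        raise ValueError("ollama_port must be between 1 and 65535")
    return f"http://{host}:{port}"  # assumes plain HTTP


def list_ollama_models(host: str = "localhost", port: int = 11434,
                       timeout: float = 5.0):
    """Return installed model names, or None if the server is unreachable."""
    url = ollama_base_url(host, port) + "/api/tags"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            data = json.loads(response.read().decode("utf-8"))
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, DNS failure, or a non-JSON reply.
        return None
    return [model["name"] for model in data.get("models", [])]
```

If `list_ollama_models()` returns `None`, the server is not reachable at that address. If it returns a list that does not include the value of ollama_model, the model likely still needs to be pulled.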

ValueError: Invalid aspect ratio

  • Explanation: This error indicates that an unsupported aspect ratio was provided.
  • Solution: Ensure that the aspect_ratio parameter is set to one of the supported options listed in the input parameters section.

MemoryError: Insufficient VRAM

  • Explanation: This error occurs when the task exceeds the available VRAM, often due to high-resolution settings or complex models.
  • Solution: Reduce base_size, or set max_vram to an option that does not exceed the memory actually available on your GPU so the plan fits within it.

Copyright 2025 RunComfy. All Rights Reserved.