Ollama Vision Dual Planner:
The OllamaVisionDualPlanner node uses a local Ollama vision model to analyze an input image and then plan an image generation task. It takes two prompts: a vision prompt that directs the analysis of the image's style, composition, lighting, and key elements, and a separate generation prompt that describes the image to create. From the analysis and the prompts, the node selects model checkpoints, LoRAs, and parameters, so the resulting plan stays close to both the input image and the user's intent. This is useful for AI artists who want to automate checkpoint and parameter selection in their creative workflows.
Ollama Vision Dual Planner Input Parameters:
image
The image parameter is the input image that the Ollama vision model will analyze. It serves as the foundation for both the vision analysis and the subsequent image generation planning. The quality and content of this image significantly influence the node's output, as it determines the style and elements that will be emphasized in the generated image.
vision_prompt
The vision_prompt is a descriptive text input that guides the analysis of the input image. It typically includes instructions to describe the style, composition, lighting, and key elements of the image. This prompt helps the model focus on specific aspects of the image, ensuring that the analysis aligns with the user's artistic goals. The default value is "Describe the style, composition, lighting, and key elements of this image."
generation_prompt
The generation_prompt is a text input that directs the image generation process. It allows users to specify the desired characteristics and themes for the new image, ensuring that the output aligns with their creative vision. This prompt is crucial for tailoring the generated image to meet specific artistic or thematic requirements.
ollama_model
The ollama_model parameter specifies the Ollama model used for image analysis and generation planning. The default is "llava:7b", a vision-language model that accepts both image and text input. Users can select a different model to suit their needs and the complexity of the task.
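The image, vision_prompt, and ollama_model inputs map naturally onto a request to Ollama's /api/generate endpoint, which accepts base64-encoded images next to a text prompt. The following is a minimal sketch of how such a request body could be assembled; the function name build_vision_request is hypothetical, and the node's actual implementation may differ.

```python
import base64
import json


def build_vision_request(image_bytes, vision_prompt, model="llava:7b"):
    """Build a JSON-serializable body for Ollama's /api/generate endpoint.

    Hypothetical helper for illustration; not taken from the node's source.
    """
    return {
        "model": model,
        "prompt": vision_prompt,
        # Ollama expects each image as a base64-encoded string.
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        # Request one complete response instead of a stream of chunks.
        "stream": False,
    }


# Placeholder bytes stand in for a real encoded image file.
payload = build_vision_request(
    b"placeholder image bytes",
    "Describe the style, composition, lighting, and key elements of this image.",
)
body = json.dumps(payload).encode("utf-8")
```

The encoded body would then be sent with an HTTP POST to the server identified by ollama_host and ollama_port.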
registry_path
The registry_path is the file path to the model registry, which contains information about available models, checkpoints, and their configurations. The default value is "model_registry.json". The node needs this file to find the models and resources it can select for a plan.
task_hint
The task_hint parameter provides a hint about the type of task being performed, such as "auto", "img2img", or "text2img". This hint helps the node tailor its plan to the intended application.
user_negative
The user_negative parameter allows users to specify negative prompts, which are aspects or elements they wish to avoid in the generated image. This input helps refine the output by excluding unwanted features, ensuring that the final image aligns more closely with the user's preferences.
aspect_ratio
The aspect_ratio parameter defines the desired aspect ratio for the generated image. Options are "1:1", "3:2", "2:3", "4:3", "3:4", "16:9", "9:16", "21:9", "9:21", "2:1", "1:2", "5:3", "3:5", "4:5", and "5:4". Choose the ratio that matches the intended display or publication format.
base_size
The base_size parameter sets the base resolution for the generated image, with a default value of 1024. It can range from 256 to 2048, in increments of 64. This parameter affects the level of detail and clarity in the output image, with larger sizes providing higher resolution and more detail.
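How aspect_ratio and base_size combine into a final width and height is not documented here. One common approach, shown below purely as an assumption, keeps the pixel count close to base_size squared and snaps each side to a multiple of 64. The function name plan_dimensions and the rounding rule are illustrative, not the node's confirmed behavior.

```python
def plan_dimensions(aspect_ratio, base_size=1024, multiple=64):
    """Return (width, height) for a ratio string such as "16:9".

    Assumption: total area stays near base_size * base_size and both
    sides are rounded to the nearest multiple of `multiple`.
    """
    w_ratio, h_ratio = (int(part) for part in aspect_ratio.split(":"))
    # Scale factor that makes (w_ratio * scale) * (h_ratio * scale)
    # equal to base_size squared.
    scale = (base_size * base_size / (w_ratio * h_ratio)) ** 0.5

    def snap(value):
        # Round to the nearest multiple, but never below one multiple.
        return max(multiple, int(round(value / multiple)) * multiple)

    return snap(w_ratio * scale), snap(h_ratio * scale)


print(plan_dimensions("1:1"))   # (1024, 1024)
print(plan_dimensions("16:9"))  # (1344, 768)
```

Under this assumption a wide ratio trades height for width while the overall pixel budget, and therefore the memory cost, stays roughly constant.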
ollama_host
The ollama_host parameter specifies the host address of the Ollama vision model server, with a default value of "localhost". The node uses it together with ollama_port to connect to the server where the model is hosted.
ollama_port
The ollama_port parameter defines the port number for connecting to the Ollama vision model server, with a default value of 11434. It can range from 1 to 65535. This parameter is necessary for ensuring proper communication with the server.
max_vram
The max_vram parameter sets the maximum amount of VRAM available for the node's operations, with options including 24, 16, 12, 8, and 6, and a default value of 24. This parameter is important for managing the computational resources required for image analysis and generation.
Ollama Vision Dual Planner Output Parameters:
plan
The plan output parameter is a comprehensive set of instructions and configurations generated by the node. It includes details such as the selected model checkpoint, LoRAs, and other parameters necessary for generating the desired image. This output is crucial for guiding the image generation process and ensuring that the final output aligns with the user's artistic and technical requirements.
Ollama Vision Dual Planner Usage Tips:
- To achieve the best results, ensure that the vision_prompt is detailed and specific, as this will guide the model to focus on the most relevant aspects of the input image.
- Experiment with different generation_prompt inputs to explore a variety of creative outcomes and find the one that best matches your artistic vision.
- Adjust the aspect_ratio and base_size parameters to fit the intended display format and resolution requirements, ensuring that the generated image meets your needs.
Ollama Vision Dual Planner Common Errors and Solutions:
URLError
- Explanation: This error occurs when there is a problem connecting to the Ollama vision model server, possibly due to incorrect host or port settings.
- Solution: Verify that the ollama_host and ollama_port parameters are correctly configured and that the server is running and accessible.
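To confirm the server is reachable before running the node, a quick check along the following lines may help. It is a sketch, not part of the node: the helper name check_ollama is hypothetical, and it assumes the server exposes Ollama's standard /api/tags endpoint, which lists locally installed models.

```python
import urllib.error
import urllib.request


def check_ollama(host="localhost", port=11434, timeout=5.0,
                 opener=urllib.request.urlopen):
    """Return (ok, message) describing whether the Ollama server responds.

    `opener` can be swapped for a stub so the check is testable offline.
    """
    if not 1 <= port <= 65535:
        return False, f"port {port} is outside the range 1-65535"
    url = f"http://{host}:{port}/api/tags"
    try:
        with opener(url, timeout=timeout) as response:
            return True, f"server answered with HTTP {response.status}"
    except urllib.error.URLError as exc:
        # Raised for refused connections, unknown hosts, and HTTP errors.
        return False, f"cannot reach {url}: {exc.reason}"


ok, message = check_ollama()
print(ok, message)
```

If the check fails with the default values, confirm that Ollama is running on the machine and that no firewall blocks the configured port.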
Plan is None
- Explanation: This error indicates that the node was unable to generate a valid plan, possibly due to issues with the input parameters or model availability.
- Solution: Check the input parameters for accuracy and ensure that the specified model is available and correctly configured in the registry_path.
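When troubleshooting a missing plan, it can help to rule out the registry file first. The sketch below only checks that the file exists, parses as JSON, and is not empty. The registry's schema is not described in this document, so nothing about its keys is assumed, and the helper name load_registry is hypothetical.

```python
import json
import os


def load_registry(registry_path="model_registry.json"):
    """Load the model registry, raising a descriptive error on failure."""
    if not os.path.isfile(registry_path):
        raise FileNotFoundError(f"registry not found: {registry_path}")
    with open(registry_path, "r", encoding="utf-8") as handle:
        # json.load raises json.JSONDecodeError if the file is malformed.
        registry = json.load(handle)
    if not registry:
        raise ValueError(f"registry is empty: {registry_path}")
    return registry


try:
    registry = load_registry()
    print(f"registry loaded with {len(registry)} top-level entries")
except (OSError, ValueError) as exc:
    print(f"registry problem: {exc}")
```

A relative registry_path resolves against the current working directory of the process, which may differ from the node's own folder, so an absolute path is the safer choice when the file cannot be found.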
