ComfyUI Node: Florence2 Describe Image 🐑

Class Name: Florence2DescribeImage|Mie
Category: 🐑 Florence2Caption
Author: mie (Account age: 1888 days)
Extension: ComfyUI_CaptionThis
Last Updated: 2025-04-22
GitHub Stars: 0.05K

How to Install ComfyUI_CaptionThis

Install this extension through the ComfyUI Manager:
  1. Click the Manager button in the main menu.
  2. Select the Custom Nodes Manager button.
  3. Enter ComfyUI_CaptionThis in the search bar.
After installation, click the Restart button to restart ComfyUI, then manually refresh your browser to clear the cache and load the updated list of nodes.

Florence2 Describe Image 🐑 Description

Generates a descriptive text caption for an input image using a Florence2 model.

Florence2 Describe Image 🐑 passes an image through a pre-loaded Florence2 model and returns a text description whose style and level of detail depend on the selected task. It is aimed at AI artists and creators who want captions or textual annotations for their images, for example in digital art and content creation workflows, or in any setting where a written description makes an image easier to search, understand, or access.

Florence2 Describe Image 🐑 Input Parameters:

model

The model parameter specifies the Florence2 model used to generate image descriptions. The choice of model determines the quality of the output. The model is supplied pre-loaded and bundles both the processor and the model itself.

image

The image parameter is the input image that you want to describe. It supplies the visual content that the model analyzes to generate the description, and it must be in a format the processor can handle.

task

The task parameter defines the specific type of description you want the model to generate. It influences the style and detail level of the output. The default task is "more_detailed_caption," but you can choose from a list of predefined tasks to suit your needs.
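
As a rough illustration of how a task name might be turned into a model prompt, the sketch below maps task names to the task tokens documented for Florence-2 models. The mapping table and the `task_to_prompt` helper are hypothetical, and only "more_detailed_caption" is named on this page; the node's actual task list and internal handling may differ.

```python
# Hypothetical mapping from node task names to Florence-2 task tokens.
# Only "more_detailed_caption" appears on this page; the other two keys
# are assumptions added for illustration.
TASK_PROMPTS = {
    "caption": "<CAPTION>",
    "detailed_caption": "<DETAILED_CAPTION>",
    "more_detailed_caption": "<MORE_DETAILED_CAPTION>",
}


def task_to_prompt(task: str = "more_detailed_caption") -> str:
    """Return the prompt token for a task name, rejecting unknown names."""
    if task not in TASK_PROMPTS:
        raise ValueError(
            f"Unknown task {task!r}; expected one of {sorted(TASK_PROMPTS)}"
        )
    return TASK_PROMPTS[task]


print(task_to_prompt())           # <MORE_DETAILED_CAPTION>
print(task_to_prompt("caption"))  # <CAPTION>
```

Rejecting unknown names early gives a clearer error than passing an unsupported prompt to the model.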

seed

The seed parameter is an integer used to initialize the random number generator so that results are reproducible: the same seed with the same inputs and settings should produce the same output. It matters chiefly when do_sample is enabled, because sampling is the step that draws random numbers. The default value is 42, with a minimum of 1 and a maximum of 0xffffffffffffffff.

max_new_tokens

The max_new_tokens parameter sets the maximum number of tokens that the model can generate for the description. It controls the length of the output text, with a default value of 1024, a minimum of 1, and a maximum of 4096.

num_beams

The num_beams parameter sets the number of beams used in beam search, a decoding technique that keeps several candidate sequences in parallel and returns the highest-scoring one. A higher number of beams can improve the quality of the description but increases computation time and memory use. The default is 3, with a minimum of 1 and a maximum of 64.
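
To make the effect of num_beams concrete, here is a minimal beam search sketch over a toy model. The probability table is made up for illustration and the `beam_search` function is hypothetical; a real Florence2 run computes token probabilities from the image and also handles end-of-sequence tokens, which this sketch omits.

```python
import math

# Toy next-token probabilities keyed by the previous token ("" = start).
# These numbers are invented for illustration only.
NEXT = {
    "":  {"a": 0.5, "b": 0.4, "c": 0.1},
    "a": {"a": 0.4, "b": 0.3, "c": 0.3},
    "b": {"a": 0.9, "b": 0.05, "c": 0.05},
    "c": {"a": 0.4, "b": 0.3, "c": 0.3},
}


def beam_search(num_beams: int, max_new_tokens: int):
    """Return (tokens, probability) of the best sequence found."""
    beams = [([], 0.0)]  # (token list, cumulative log-probability)
    for _ in range(max_new_tokens):
        candidates = []
        for tokens, score in beams:
            prev = tokens[-1] if tokens else ""
            for tok, p in NEXT[prev].items():
                candidates.append((tokens + [tok], score + math.log(p)))
        # Keep only the num_beams highest-scoring partial sequences.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:num_beams]
    best_tokens, best_score = beams[0]
    return best_tokens, math.exp(best_score)


print(beam_search(num_beams=1, max_new_tokens=2))  # ['a', 'a'], probability about 0.20
print(beam_search(num_beams=3, max_new_tokens=2))  # ['b', 'a'], probability about 0.36
```

With a single beam the search commits to the locally best first token and misses the more probable sequence overall; with three beams it keeps the alternative alive long enough to find it. Every extra beam multiplies the work per step, which is why higher values cost more time.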

do_sample

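The do_sample parameter is a boolean that indicates whether to use sampling during text generation. When set to true, tokens are drawn at random from the model's predicted distribution, which allows more varied and creative outputs; when set to false, decoding always follows the highest-scoring candidates and is deterministic. The default value is true.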
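
The interaction between do_sample and seed can be sketched with a toy generator. The word distribution and the `generate` function below are hypothetical stand-ins and use Python's standard random module rather than the node's actual generation code.

```python
import random

# Made-up distribution over candidate next words; a real model produces
# a fresh distribution at every step.
PROBS = {"cat": 0.6, "kitten": 0.3, "feline": 0.1}


def generate(n_tokens: int, do_sample: bool, seed: int = 42) -> list:
    """Pick n_tokens words, either greedily or by seeded sampling."""
    rng = random.Random(seed)  # the same seed reproduces the same draws
    words = list(PROBS)
    weights = list(PROBS.values())
    out = []
    for _ in range(n_tokens):
        if do_sample:
            # Sampling: draw a word in proportion to its probability.
            out.append(rng.choices(words, weights=weights, k=1)[0])
        else:
            # Greedy: always take the most likely word; the seed is unused.
            out.append(max(PROBS, key=PROBS.get))
    return out


print(generate(5, do_sample=False))          # always ['cat', 'cat', 'cat', 'cat', 'cat']
print(generate(5, do_sample=True, seed=42))  # varied, but identical on every run
print(generate(5, do_sample=True, seed=42) == generate(5, do_sample=True, seed=42))  # True
```

Sampling trades determinism for variety, and fixing the seed restores repeatability for a given input and settings.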

keep_model_loaded

The keep_model_loaded parameter is a boolean that specifies whether to keep the model loaded in memory after execution. This can be useful for batch processing multiple images without reloading the model each time. The default value is true.
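
The defaults and ranges listed above can be collected into a small settings object, which is handy when driving the node from a script. The `DescribeSettings` class and `check_range` helper are hypothetical and not part of the extension; the values come from this page.

```python
from dataclasses import dataclass

MAX_SEED = 0xFFFFFFFFFFFFFFFF


def check_range(name, value, low, high):
    """Raise ValueError if value falls outside [low, high]."""
    if not low <= value <= high:
        raise ValueError(f"{name}={value} is outside [{low}, {high}]")


@dataclass
class DescribeSettings:  # hypothetical helper, not part of the extension
    task: str = "more_detailed_caption"
    seed: int = 42
    max_new_tokens: int = 1024
    num_beams: int = 3
    do_sample: bool = True
    keep_model_loaded: bool = True

    def __post_init__(self):
        # Enforce the documented minimum and maximum of each numeric input.
        check_range("seed", self.seed, 1, MAX_SEED)
        check_range("max_new_tokens", self.max_new_tokens, 1, 4096)
        check_range("num_beams", self.num_beams, 1, 64)


settings = DescribeSettings(num_beams=5, max_new_tokens=256)
print(settings)
```

Constructing `DescribeSettings(num_beams=65)` raises a ValueError, which catches out-of-range values before they reach the node.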

Florence2 Describe Image 🐑 Output Parameters:

text

The text output parameter is the generated description of the input image. Its style and level of detail follow the specified task, and it can be passed on to other nodes or saved as a caption for the image.

Florence2 Describe Image 🐑 Usage Tips:

  • To get reproducible results, keep the seed value and all other settings unchanged between runs.
  • Experiment with different task options to find the most suitable description style for your project.
  • Adjust the max_new_tokens and num_beams parameters to balance description length, quality, and computation time.

Florence2 Describe Image 🐑 Common Errors and Solutions:

Model not found

  • Explanation: This error occurs when the specified model is not available in the local directory.
  • Solution: Ensure that the model is correctly downloaded and available in the specified path. You may need to download it from the appropriate repository.

Image format not supported

  • Explanation: The input image is in a format that the processor cannot handle.
  • Solution: Convert the image to a supported format, such as JPEG or PNG, before processing.

Out of memory

  • Explanation: The model requires more memory than is available on the device.
  • Solution: Reduce the max_new_tokens or num_beams parameters, or consider using a device with more memory.

Copyright 2025 RunComfy. All Rights Reserved.
