ComfyUI Node: Qwen3.5 VL Caption (Inverse Prompt)

Class Name: Qwen35Caption
Category: image/caption
Author: WingeD123 (Account age: 1221 days)
Extension: ComfyUI_QwenVL_PromptCaption
Latest Updated: 2026-03-23
GitHub Stars: 0.04K

How to Install ComfyUI_QwenVL_PromptCaption

Install this extension through the ComfyUI Manager:
  1. Click the Manager button in the main menu.
  2. Click the Custom Nodes Manager button.
  3. Enter ComfyUI_QwenVL_PromptCaption in the search bar and install the extension.
After installation, click the Restart button to restart ComfyUI, then manually refresh your browser to clear the cache and load the updated list of nodes.

Qwen3.5 VL Caption (Inverse Prompt) Description

Generates a descriptive text caption for an image using a Qwen vision-language model.

Qwen3.5 VL Caption (Inverse Prompt):

Qwen35Caption generates a descriptive caption for an image using a Qwen vision-language model. It analyzes the image and returns a coherent, contextually relevant text description. This makes it useful for adding automated captioning to creative workflows, and for describing and categorizing images in large-scale image management and content creation tasks. The node uses caching to reduce processing time and resource usage on repeated runs.

Qwen3.5 VL Caption (Inverse Prompt) Input Parameters:

image

The image parameter is the image to be captioned, supplied as a tensor. It is the node's primary input and determines the content of the generated caption. The image must be in a tensor format compatible with the model's requirements, which is the case when it comes from a standard ComfyUI image output. There are no minimum or maximum values.

model_path

The model_path parameter specifies the directory path where the Qwen model components are stored. This path is essential for loading the model and processor required for caption generation. Providing an incorrect path will result in a failure to load the model, thus preventing the node from functioning.

lang

The lang parameter selects the language in which the caption is generated. The choice of language can also affect the style and structure of the generated text.

dtype

The dtype parameter sets the numeric data type used to load and run the model. Lower-precision types generally use less memory and run faster, while higher-precision types use more memory. Choose a type that your hardware supports and that fits in the available memory.
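
As a rough illustration of why the data type matters, the sketch below estimates the memory needed for model weights alone at different precisions. The labels (fp32, fp16, bf16) and the 4-billion parameter count are assumptions for illustration only; check the node's dtype options and your model's documentation for the real values.

```python
# Bytes per parameter for common floating-point precisions.
# The keys are illustrative labels, not necessarily the node's option names.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "bf16": 2}


def weight_memory_gib(num_params: int, dtype: str) -> float:
    """Estimate memory for model weights only, in GiB (no activations or cache)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1024**3


# Hypothetical 4-billion-parameter checkpoint.
for name in ("fp32", "bf16"):
    print(f"{name}: {weight_memory_gib(4_000_000_000, name):.1f} GiB")
```

Halving the bytes per parameter halves the weight footprint; actual usage during captioning will be higher because of activations and generation state.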

max_side

The max_side parameter sets the maximum length of the image's longer side; the image is resized to fit this limit before it is passed to the model. This keeps the input size within the model's processing capabilities. Smaller values reduce memory use and processing time, but fine detail in the image may be lost.
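
A minimal sketch of this kind of resize calculation is shown below. It assumes the limit applies to the longer side, that the aspect ratio is preserved, and that smaller images are left untouched; the node's actual resize logic is not documented here and may differ. The function name fit_within is hypothetical.

```python
def fit_within(width: int, height: int, max_side: int) -> tuple[int, int]:
    """Scale (width, height) down so the longer side is at most max_side.

    Aspect ratio is preserved; images already small enough are returned unchanged.
    """
    longest = max(width, height)
    if longest <= max_side:
        return width, height
    scale = max_side / longest
    # Round to the nearest pixel, but never collapse a side to zero.
    return max(1, round(width * scale)), max(1, round(height * scale))


print(fit_within(4000, 3000, 1024))  # (1024, 768)
```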

keep_model_loaded

The keep_model_loaded parameter is a boolean that determines whether the model should remain loaded in memory after processing. Keeping the model loaded can speed up subsequent operations by avoiding repeated loading times, but it may increase memory usage.

instruction

The instruction parameter is an optional string that provides additional guidance or context for the caption generation process. This can be used to tailor the output to specific requirements or themes, enhancing the relevance and creativity of the captions.
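
The input parameters above could be combined in a ComfyUI API-format workflow roughly as follows. This is a sketch: it assumes the node is registered under the class name listed on this page, and every input value (file name, path, language label, dtype label, max_side, instruction text) is a placeholder rather than a documented default.

```python
import json

# API-format workflow fragment: node ids map to a class_type and its inputs.
# All input VALUES below are placeholders; only the input names come from this page.
workflow = {
    "1": {"class_type": "LoadImage", "inputs": {"image": "example.png"}},
    "2": {
        "class_type": "Qwen35Caption",
        "inputs": {
            "image": ["1", 0],  # link: output 0 of node "1"
            "model_path": "/path/to/qwen-model",  # placeholder path
            "lang": "English",  # assumed option label
            "dtype": "bf16",  # assumed option label
            "max_side": 1024,  # assumed value
            "keep_model_loaded": True,
            "instruction": "Describe the image as a text-to-image prompt.",
        },
    },
}

print(json.dumps(workflow, indent=2))
```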

Qwen3.5 VL Caption (Inverse Prompt) Output Parameters:

text

The text parameter is the output of the node, providing the generated caption as a string. This caption describes the content of the input image, offering insights and context that can be used for various applications such as content creation, image indexing, and accessibility enhancement. The quality and relevance of the caption depend on the input parameters and the model's capabilities.

Qwen3.5 VL Caption (Inverse Prompt) Usage Tips:

  • Ensure that the image is pre-processed correctly into a tensor format to avoid errors and ensure accurate captioning.
  • Use the instruction parameter to guide the caption generation process, especially if you have specific themes or contexts in mind.
  • Consider the trade-off between keeping the model loaded for faster processing and the increased memory usage it may entail.

Qwen3.5 VL Caption (Inverse Prompt) Common Errors and Solutions:

"Failed to load model, 模型加载失败"

  • Explanation: This error occurs when the model components cannot be loaded from the specified model_path. This could be due to an incorrect path or missing files.
  • Solution: Verify that the model_path is correct and that all necessary model files are present in the specified directory.

"no image, 无图像"

  • Explanation: This error indicates that no image was provided as input, which is essential for the captioning process.
  • Solution: Ensure that the image parameter is correctly set with a valid image tensor before executing the node.
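
Both errors can be caught before a long run with a small pre-flight check like the sketch below. The function name is hypothetical, and the config.json check is an assumption based on the common Hugging Face checkpoint layout; the files this node actually requires may differ.

```python
from pathlib import Path


def find_input_problems(image, model_path: str) -> list[str]:
    """Return human-readable problems found before running the node."""
    problems = []
    if image is None:
        problems.append("no image connected")
    folder = Path(model_path)
    if not folder.is_dir():
        problems.append(f"model_path is not a directory: {folder}")
    elif not (folder / "config.json").is_file():
        # Assumption: Hugging Face style checkpoints ship a config.json file.
        problems.append(f"config.json not found in {folder}")
    return problems
```

An empty list means neither of the two obvious failure causes was detected; it does not guarantee that the model will load.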
