Qwen3.5 VL Caption (Inverse Prompt):
Qwen35Caption is a node that generates descriptive captions for images using a visual-language model. It analyzes an input image and produces a coherent, contextually relevant text description, improving the interpretability and accessibility of visual content. The node leverages the Qwen model's ability to process images and generate text, making it useful for AI artists who want to integrate automated captioning into their creative workflows. Detailed captions also help with understanding and categorizing images, which is especially valuable in large-scale image management and content-creation tasks. The node caches loaded model components to reduce processing time and resource usage, keeping the experience responsive across repeated runs.
Qwen3.5 VL Caption (Inverse Prompt) Input Parameters:
image
The image parameter is a tensor representing the image to be captioned. It is crucial as it serves as the primary input for the node, determining the content and context of the generated caption. The image should be pre-processed into a tensor format compatible with the model's requirements. There are no specific minimum or maximum values, but the image should be correctly formatted to ensure accurate captioning.
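As a minimal sketch of that pre-processing step, the following converts an image file into the batched float tensor layout ComfyUI nodes conventionally use ([batch, height, width, channels], float32 in [0, 1]); whether this node requires exactly that layout is an assumption:

```python
import numpy as np
import torch
from PIL import Image

def image_to_tensor(path: str) -> torch.Tensor:
    """Load an image file as a [1, H, W, C] float32 tensor in [0, 1]."""
    img = Image.open(path).convert("RGB")
    arr = np.asarray(img).astype(np.float32) / 255.0  # H x W x 3, scaled to 0..1
    return torch.from_numpy(arr).unsqueeze(0)         # prepend batch dimension
```

In a ComfyUI graph this conversion is normally done for you by a Load Image node; the sketch is only for standalone use.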
model_path
The model_path parameter specifies the directory path where the Qwen model components are stored. This path is essential for loading the model and processor required for caption generation. Providing an incorrect path will result in a failure to load the model, thus preventing the node from functioning.
lang
The lang parameter indicates the language in which the caption should be generated. This allows the node to produce captions in different languages, catering to a diverse user base. The choice of language can impact the style and structure of the generated text.
dtype
The dtype parameter defines the data type used for model processing, affecting the precision and performance of the caption generation. It is important to select a data type that balances computational efficiency with the desired level of detail in the captions.
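The trade-off can be pictured as a small mapping from the node's dtype option to a torch dtype; the exact option names accepted by this node are an assumption:

```python
import torch

# Hypothetical mapping from the node's dtype option to torch dtypes;
# the actual option strings exposed by the node are an assumption.
DTYPE_MAP = {
    "fp32": torch.float32,   # full precision: most memory, slowest
    "fp16": torch.float16,   # half precision: fast on most GPUs
    "bf16": torch.bfloat16,  # fp16-sized, but with fp32's exponent range
}

def resolve_dtype(name: str) -> torch.dtype:
    """Translate a dtype option string into a torch dtype."""
    try:
        return DTYPE_MAP[name]
    except KeyError:
        raise ValueError(f"unsupported dtype: {name!r}")
```

On modern GPUs, fp16 or bf16 roughly halves memory use versus fp32 with little visible effect on caption quality.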
max_side
The max_side parameter sets the maximum dimension for resizing the image, ensuring that it fits within the model's processing capabilities. This helps in maintaining a consistent input size, which is crucial for accurate and efficient captioning.
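The usual way to apply such a limit is to scale the image down so its longer side equals max_side while preserving aspect ratio; this sketch illustrates that arithmetic (the node's exact rounding behavior is an assumption):

```python
def fit_max_side(width: int, height: int, max_side: int) -> tuple[int, int]:
    """Scale (width, height) down so the longer side equals max_side.

    Images already within the limit are returned unchanged.
    """
    longest = max(width, height)
    if longest <= max_side:
        return width, height
    scale = max_side / longest
    return max(1, round(width * scale)), max(1, round(height * scale))
```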
keep_model_loaded
The keep_model_loaded parameter is a boolean that determines whether the model should remain loaded in memory after processing. Keeping the model loaded can speed up subsequent operations by avoiding repeated loading times, but it may increase memory usage.
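The pattern behind this option can be sketched as a module-level cache; this is illustrative only, not the node's actual implementation:

```python
# Illustrative cache pattern: with keep_model_loaded=True the loaded model
# stays in a module-level dict, so later runs skip the expensive load step.
_MODEL_CACHE: dict[str, object] = {}

def get_model(model_path: str, keep_model_loaded: bool, loader):
    """Return a cached model if present, otherwise load it via `loader`."""
    if model_path in _MODEL_CACHE:
        return _MODEL_CACHE[model_path]
    model = loader(model_path)            # expensive: reads weights from disk
    if keep_model_loaded:
        _MODEL_CACHE[model_path] = model  # held in memory for reuse
    return model
```

With keep_model_loaded=False the model is reloaded on every run, trading speed for a smaller steady-state memory footprint.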
instruction
The instruction parameter is an optional string that provides additional guidance or context for the caption generation process. This can be used to tailor the output to specific requirements or themes, enhancing the relevance and creativity of the captions.
Qwen3.5 VL Caption (Inverse Prompt) Output Parameters:
text
The text parameter is the output of the node, providing the generated caption as a string. This caption describes the content of the input image, offering insights and context that can be used for various applications such as content creation, image indexing, and accessibility enhancement. The quality and relevance of the caption depend on the input parameters and the model's capabilities.
Qwen3.5 VL Caption (Inverse Prompt) Usage Tips:
- Ensure that the image is pre-processed correctly into a tensor format to avoid errors and ensure accurate captioning.
- Use the instruction parameter to guide the caption generation process, especially if you have specific themes or contexts in mind.
- Consider the trade-off between keeping the model loaded for faster processing and the increased memory usage it may entail.
Qwen3.5 VL Caption (Inverse Prompt) Common Errors and Solutions:
"Failed to load model, 模型加载失败"
- Explanation: This error occurs when the model components cannot be loaded from the specified model_path. This could be due to an incorrect path or missing files.
- Solution: Verify that the model_path is correct and that all necessary model files are present in the specified directory.
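A quick sanity check like the following can catch a bad path before the load is attempted; the list of required files is an assumption (a typical Hugging Face-style model directory contains at least a config.json):

```python
import os

# Minimal required-file list; the node's actual expectations are an assumption.
REQUIRED_FILES = ("config.json",)

def check_model_path(model_path: str) -> None:
    """Raise FileNotFoundError if model_path is missing or incomplete."""
    if not os.path.isdir(model_path):
        raise FileNotFoundError(f"model_path is not a directory: {model_path}")
    missing = [f for f in REQUIRED_FILES
               if not os.path.isfile(os.path.join(model_path, f))]
    if missing:
        raise FileNotFoundError(f"missing model files in {model_path}: {missing}")
```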
"no image, 无图像"
- Explanation: This error indicates that no image was provided as input, which is essential for the captioning process.
- Solution: Ensure that the image parameter is correctly set with a valid image tensor before executing the node.
