
ComfyUI Node: Qwen3.5 VL Batch Caption

Class Name: Qwen35CaptionBatch
Category: image/caption
Author: WingeD123 (account age: 1221 days)
Extension: ComfyUI_QwenVL_PromptCaption
Last Updated: 2026-03-23
GitHub Stars: 0.04K

How to Install ComfyUI_QwenVL_PromptCaption

Install this extension via the ComfyUI Manager by searching for ComfyUI_QwenVL_PromptCaption:
  1. Click the Manager button in the main menu.
  2. Select the Custom Nodes Manager button.
  3. Enter ComfyUI_QwenVL_PromptCaption in the search bar.
After installation, click the Restart button to restart ComfyUI. Then manually refresh your browser to clear the cache and load the updated list of nodes.


Qwen3.5 VL Batch Caption Description

Qwen35CaptionBatch captions every image in a directory in one run using the Qwen3.5 model.

Qwen3.5 VL Batch Caption:

Qwen35CaptionBatch generates captions for a directory of images using the Qwen3.5 model. You point the node at a folder (image_path), choose the caption language and an optional instruction, and it writes the captions to save_path or, if that is left empty, alongside the images. Options for precision (dtype), image downscaling (max_side), and model unloading (keep_model_loaded) let you trade memory use against quality and load time, which makes the node suited to AI artists and content creators who need to caption large image sets without doing it by hand.

Qwen3.5 VL Batch Caption Input Parameters:

model_path

The model_path parameter selects the model files used for captioning. Its value is chosen from the list of filenames in the text_encoders directory. The node cannot load a model, and therefore cannot run, unless this points to the correct files.

dtype

The dtype parameter sets the precision used when loading and running the model. The options are "auto", "4bit", and "8bit", with "4bit" as the default. Lower-bit options reduce memory use and can affect processing speed, at some potential cost in caption accuracy. Choose the option that fits the available hardware.
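As a rough illustration of why dtype matters for memory, the weight footprint scales with the number of bits stored per parameter. The sketch below is back-of-the-envelope arithmetic only. The 16-bit figure for "auto" and the 4-billion parameter count are assumptions chosen for illustration, not values taken from the extension, and real usage adds activations, caches, and quantization overhead on top.

```python
# Bits per weight for each dtype option.
# The value for "auto" is an assumption (16-bit weights); the node may choose differently.
BITS_PER_WEIGHT = {"auto": 16, "8bit": 8, "4bit": 4}


def estimate_weight_gib(num_params: int, dtype: str) -> float:
    """Approximate memory needed for the model weights alone, in GiB."""
    bits = BITS_PER_WEIGHT[dtype]
    return num_params * bits / 8 / 2**30


if __name__ == "__main__":
    params = 4_000_000_000  # hypothetical model size, for illustration only
    for option in ("auto", "8bit", "4bit"):
        print(f"{option}: ~{estimate_weight_gib(params, option):.1f} GiB")
```

Halving the bits halves the weight footprint, which is why "4bit" is the natural choice on GPUs with limited VRAM.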

keep_model_loaded

The keep_model_loaded parameter is a boolean setting that dictates whether the model should remain loaded in memory after processing. By default, it is set to False, meaning the model will be unloaded to free up memory. Keeping the model loaded can be beneficial for consecutive tasks that require the same model, reducing loading times.
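The effect of keep_model_loaded can be pictured as a small cache around the model loader. The following is a minimal sketch of that pattern, not the extension's implementation: the ModelHolder class and the loader callable are hypothetical names introduced here.

```python
from typing import Any, Callable, Optional, Tuple


class ModelHolder:
    """Keeps at most one loaded model, keyed by (model_path, dtype)."""

    def __init__(self, loader: Callable[[str, str], Any]) -> None:
        self._loader = loader  # hypothetical function that actually loads a model
        self._key: Optional[Tuple[str, str]] = None
        self._model: Any = None

    def get(self, model_path: str, dtype: str) -> Any:
        key = (model_path, dtype)
        if self._key != key:  # nothing cached yet, or the settings changed
            self._model = self._loader(model_path, dtype)
            self._key = key
        return self._model

    def finish(self, keep_model_loaded: bool) -> None:
        """Call after a run; drops the model unless asked to keep it."""
        if not keep_model_loaded:
            self._model = None
            self._key = None
```

With keep_model_loaded set to True, a second run with the same model_path and dtype reuses the cached model and skips the load. With the default False, each run pays the load cost again but releases memory in between.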

lang

The lang parameter sets the language of the generated captions. It supports "中文" (Chinese) and "English", with "中文" as the default.

max_side

The max_side parameter defines the maximum dimension (in pixels) for resizing images before processing. It has a default value of 512, with a minimum of 256 and a maximum of 2240, adjustable in steps of 32. This parameter helps manage memory usage and processing time by ensuring images are not excessively large.
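To make the resizing rule concrete, here is a minimal sketch of fitting an image within max_side while preserving its aspect ratio. It assumes that max_side bounds the longer side and that images already small enough are left untouched. Neither detail is confirmed by the node's documentation, and fit_within is a hypothetical helper name.

```python
from typing import Tuple


def fit_within(width: int, height: int, max_side: int = 512) -> Tuple[int, int]:
    """Scale (width, height) so that the longer side is at most max_side."""
    longest = max(width, height)
    if longest <= max_side:
        return width, height  # assumption: small images are not upscaled
    scale = max_side / longest
    # Round to whole pixels and never return a zero-sized dimension.
    return max(1, round(width * scale)), max(1, round(height * scale))


if __name__ == "__main__":
    print(fit_within(2048, 1024, 512))  # (512, 256)
    print(fit_within(400, 300, 512))    # (400, 300)
```

Larger max_side values preserve more detail for the model to describe, at the cost of more memory and time per image.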

image_path

The image_path parameter is a string giving the directory that contains the images to be captioned. The node needs a valid, readable directory here; otherwise it finds no images to process.
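A quick way to check what a directory would offer the node is to list its image files before running the workflow. The sketch below is a pre-flight helper, not part of the extension. The set of extensions is an assumption, since the formats the node actually supports are not listed here, and list_images is a hypothetical name.

```python
from pathlib import Path
from typing import List

# Assumed extensions; the node's real list of supported formats may differ.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".webp", ".bmp"}


def list_images(image_path: str) -> List[Path]:
    """Return the image files directly inside image_path, sorted by name."""
    folder = Path(image_path)
    if not folder.is_dir():
        return []  # missing or non-directory path: nothing to caption
    return sorted(
        p for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
    )
```

An empty result from a helper like this corresponds to the situation in which the node reports that zero images were captioned.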

save_path

The save_path parameter is an optional string that determines where the generated captions will be saved. If left empty, the captions will be saved in the same directory as the images. This flexibility allows users to organize output files according to their preferences.
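The fallback described above can be sketched as a small path computation. This is an illustration under stated assumptions: the caption file naming (the image's name with a .txt extension) is a common convention for caption datasets but is not confirmed for this node, and caption_file_for is a hypothetical helper.

```python
from pathlib import Path


def caption_file_for(image_file: str, save_path: str = "") -> Path:
    """Work out where the caption for one image would be written."""
    image = Path(image_file)
    # Empty save_path: write next to the image, as the parameter description states.
    target_dir = Path(save_path) if save_path.strip() else image.parent
    # Assumption: one text file per image, named after the image.
    return target_dir / f"{image.stem}.txt"


if __name__ == "__main__":
    print(caption_file_for("photos/cat.png"))         # photos/cat.txt on POSIX systems
    print(caption_file_for("photos/cat.png", "out"))  # out/cat.txt on POSIX systems
```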

instruction

The instruction parameter is an optional multiline string that allows users to provide specific instructions or prompts to guide the caption generation process. This can be used to tailor the captions to specific requirements or themes.
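For readers who want to see how the inputs above fit together, the following sketch declares them in the dictionary format ComfyUI custom nodes use for INPUT_TYPES. It is reconstructed from the parameter descriptions on this page and is not the extension's source. The class name Qwen35CaptionBatchSketch, the MODEL_FILES placeholder, and the empty-string defaults for the string inputs are assumptions.

```python
class Qwen35CaptionBatchSketch:
    """Illustrative stand-in built from the documented parameters."""

    # Placeholder; a real node would list the files in the text_encoders directory.
    MODEL_FILES = ["<model file in text_encoders>"]

    @classmethod
    def INPUT_TYPES(cls):
        return {
            "required": {
                "model_path": (cls.MODEL_FILES,),
                "dtype": (["auto", "4bit", "8bit"], {"default": "4bit"}),
                "keep_model_loaded": ("BOOLEAN", {"default": False}),
                "lang": (["中文", "English"], {"default": "中文"}),
                "max_side": ("INT", {"default": 512, "min": 256, "max": 2240, "step": 32}),
                "image_path": ("STRING", {"default": ""}),  # default is assumed
            },
            "optional": {
                "save_path": ("STRING", {"default": ""}),  # default is assumed
                "instruction": ("STRING", {"multiline": True, "default": ""}),
            },
        }

    RETURN_TYPES = ("STRING",)
    RETURN_NAMES = ("summary",)
    CATEGORY = "image/caption"
```

The split between required and optional follows the descriptions above, which mark only save_path and instruction as optional.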

Qwen3.5 VL Batch Caption Output Parameters:

summary

The summary output returns the generated captions as a string, giving a textual description of the images that were processed.

Qwen3.5 VL Batch Caption Usage Tips:

  • Ensure that the model_path is correctly set to avoid errors related to model loading. Double-check the path to the text encoder files.
  • Use the dtype parameter to balance memory use and speed against accuracy. Opt for "4bit" when hardware resources are limited and precision is not critical.
  • Consider setting keep_model_loaded to True if you plan to process multiple batches consecutively, as this can save time by avoiding repeated model loading.
  • Adjust the max_side parameter based on your hardware capabilities to optimize memory usage and processing time.

Qwen3.5 VL Batch Caption Common Errors and Solutions:

"0 image captioned, 共处理0张图片"

  • Explanation: This message (the Chinese part means "0 images processed in total") appears when the specified image_path is invalid or does not contain any images.
  • Solution: Verify that the image_path is correct and points to a directory containing valid image files. Ensure the directory is accessible and contains supported image formats.

"Model loading error"

  • Explanation: This error indicates an issue with loading the model from the specified model_path.
  • Solution: Check that the model_path is correct and that the necessary model files are present in the specified directory. Ensure there are no permission issues preventing access to the files.

RunComfy
Copyright 2025 RunComfy. All Rights Reserved.