Two-Round VLM Prompter:
The TwoRoundVLMPrompter is a node that generates detailed, contextually rich prompts for video generation models in two distinct rounds. In the first round, a Vision Language Model (VLM) analyzes an image and produces a comprehensive description of its visual details, colors, and composition elements. In the second round, the Qwen2.5 model rewrites that description as a cinematic prompt tailored for video generation, emphasizing movement, atmosphere, and visual style. By using a specialized model for each task, the node produces output that is both precise and creatively inspiring, making it a valuable tool for AI artists generating high-quality video prompts.
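The two-round flow described above can be sketched roughly as follows. This is an illustrative outline only, not the node's actual implementation: run_vlm and run_qwen are hypothetical stand-ins for the model calls the node performs internally, and the prompt strings paraphrase the documented defaults.

```python
def run_vlm(image, system_prompt, user_prompt):
    # Round 1 (stub): the real VLM would return a detailed observation
    # covering every visual detail, color, and composition element.
    return f"A detailed description of {image}."

def run_qwen(system_prompt, user_prompt):
    # Round 2 (stub): Qwen2.5 would rewrite the observation as a
    # cinematic prompt focused on movement, atmosphere, and style.
    return f"Cinematic rewrite: {user_prompt}"

def two_round_prompt(image):
    # Round 1: gather comprehensive observational data from the image.
    observation = run_vlm(
        image,
        system_prompt="Describe the image in exhaustive detail.",
        user_prompt="Describe every visual element and notable feature.",
    )
    # Round 2: transform the observation into a video-generation prompt.
    final_prompt = run_qwen(
        system_prompt="You are an expert prompt engineer for video generation.",
        user_prompt=f"Rewrite as a cinematic video prompt: {observation}",
    )
    return observation, final_prompt
```

Separating the two rounds like this lets each model do what it is best at: the VLM observes, and the language model stylizes.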
Two-Round VLM Prompter Input Parameters:
round1_context
This parameter specifies the context for the first round of processing, where a Vision Language Model (VLM) analyzes the image. It is crucial for setting the environment in which the model operates, ensuring that the observations are accurate and relevant to the task at hand.
round1_system_prompt
This is a string parameter that provides the system prompt for the first round. It is designed to guide the VLM in its observational task, with a default prompt encouraging detailed and comprehensive descriptions of the image. The prompt is multiline and can be customized to suit specific needs.
round1_user_prompt
Similar to the system prompt, this string parameter allows the user to input a custom prompt for the first round. It defaults to a request for a detailed description of the image, including all visual elements and notable features. This prompt is also multiline, providing flexibility in how the task is framed.
round2_context
This parameter sets the context for the second round, where the Qwen2.5 model rewrites the description into a cinematic prompt. It ensures that the rewriting process is aligned with the intended use case, focusing on video generation.
round2_system_prompt
A string parameter that provides the system prompt for the second round. It defaults to a prompt that positions the model as an expert in prompt engineering for video generation, guiding the transformation of the description into a cinematic format.
round2_user_prompt
This string parameter allows the user to input a custom prompt for the second round. It defaults to a request for rewriting the description as a cinematic prompt, emphasizing movement, atmosphere, and visual style. The prompt is multiline, allowing for detailed instructions.
max_tokens
An integer parameter that defines the maximum number of tokens the model can generate in its output. It ranges from 1 to 32000, with a default value of 512. This parameter controls the length of the generated text, impacting the level of detail and complexity in the output.
temperature
A float parameter that influences the randomness of the model's output. It ranges from 0.0 to 2.0, with a default value of 0.7. A higher temperature results in more creative and diverse outputs, while a lower temperature produces more deterministic results.
top_p
This float parameter, ranging from 0.0 to 1.0 with a default of 0.9, determines the cumulative probability for token selection. It helps in controlling the diversity of the output by limiting the token pool to those with the highest probabilities, ensuring a balance between creativity and coherence.
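To make the top_p behavior concrete, here is a minimal sketch of nucleus (top-p) token filtering, the standard technique this parameter controls. The function and token probabilities below are illustrative, not part of the node's API.

```python
def nucleus_filter(probs, top_p=0.9):
    """Keep the smallest set of tokens whose cumulative probability
    reaches top_p; sampling is then restricted to this pool."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    kept, total = [], 0.0
    for token, p in ranked:
        kept.append(token)
        total += p
        if total >= top_p:
            break  # pool is large enough; discard the long tail
    return kept

# Example: with top_p=0.9, low-probability tail tokens are excluded,
# trading a little diversity for more coherent output.
pool = nucleus_filter({"a": 0.5, "b": 0.3, "c": 0.15, "d": 0.05}, top_p=0.9)
```

A lower top_p shrinks the pool (more deterministic); a higher top_p admits more of the tail (more diverse).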
Two-Round VLM Prompter Output Parameters:
context
This output parameter provides the updated context after both rounds of processing. It includes information about the models used in each round and the lengths of the observation and final prompt, offering insights into the processing workflow.
final_prompt
The final prompt is the result of the second round of processing, where the initial observation is transformed into a cinematic prompt suitable for video generation. It encapsulates the creative and stylistic elements necessary for dynamic video content.
round1_observation
This output contains the detailed description generated in the first round. It serves as the foundational observation that informs the subsequent rewriting process, capturing all relevant visual details of the image.
debug_info
The debug information provides insights into the processing steps, including model details and response lengths. It is particularly useful for troubleshooting and understanding the node's behavior during execution.
Two-Round VLM Prompter Usage Tips:
- Customize the round1_user_prompt to focus on specific visual elements or themes you want to emphasize in the observation phase.
- Adjust the temperature and top_p parameters to fine-tune the creativity and coherence of the final prompt, depending on whether you want a more exploratory or focused output.
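The tips above might translate into settings like the following. These values are illustrative, not recommendations from the node's authors, and the dict layout is a hypothetical way of grouping the documented parameters.

```python
# Hypothetical parameter settings for a more focused, tightly framed output.
settings = {
    # Steer round 1 toward the visual elements you care about.
    "round1_user_prompt": (
        "Describe the lighting, color palette, and camera framing "
        "of this image in detail."
    ),
    "temperature": 0.4,  # below the 0.7 default: more deterministic
    "top_p": 0.85,       # below the 0.9 default: narrower token pool
    "max_tokens": 512,   # the documented default length cap
}
```

For a more exploratory result, you would instead raise temperature toward 1.0 and leave top_p near its 0.9 default.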
Two-Round VLM Prompter Common Errors and Solutions:
Model Not Found
- Explanation: This error occurs when the specified model for either round is not available or incorrectly specified in the context.
- Solution: Ensure that the model names in round1_context and round2_context are correctly specified and that the models are available in your environment.
Invalid Token Range
- Explanation: This error arises when the max_tokens parameter is set outside the allowed range.
- Solution: Verify that the max_tokens value is between 1 and 32000 and adjust it accordingly.
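A simple guard like the following catches this error before the model runs. The function name is hypothetical; only the 1 to 32000 range and the 512 default come from the parameter documentation above.

```python
def validate_max_tokens(value, low=1, high=32000):
    """Raise ValueError if max_tokens falls outside the documented range."""
    if not isinstance(value, int):
        raise ValueError(f"max_tokens must be an integer, got {type(value).__name__}")
    if not (low <= value <= high):
        raise ValueError(f"max_tokens must be between {low} and {high}, got {value}")
    return value

# The documented default of 512 passes validation unchanged.
checked = validate_max_tokens(512)
```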
Prompt Length Exceeded
- Explanation: This error occurs when the generated prompt exceeds the maximum token limit.
- Solution: Reduce the complexity of the prompts or increase the max_tokens parameter to accommodate longer outputs.
