RunComfy


ComfyUI Node: MiniCPM VQA Polished

Class Name: MiniCPM_VQA_Polished
Category: Comfyui_MiniCPM-V-4_5
Author: IuvenisSapiens (Account age: 1056 days)
Extension: ComfyUI_MiniCPM-V-4_5
Latest Updated: 2025-08-29
Github Stars: 0.26K


Table of Contents

Description
MiniCPM_VQA_Polished:
MiniCPM_VQA_Polished Input Parameters:
MiniCPM_VQA_Polished Output Parameters:
MiniCPM_VQA_Polished Usage Tips:
MiniCPM_VQA_Polished Common Errors and Solutions:
Related Nodes

How to Install ComfyUI_MiniCPM-V-4_5

Install this extension via the ComfyUI Manager by searching for ComfyUI_MiniCPM-V-4_5:

1. Click the Manager button in the main menu.
2. Select the Custom Nodes Manager button.
3. Enter ComfyUI_MiniCPM-V-4_5 in the search bar.

After installation, click the Restart button to restart ComfyUI. Then, manually refresh your browser to clear the cache and access the updated list of nodes.


MiniCPM VQA Polished Description

Runs video question answering (VQA) with the MiniCPM-V-4_5 model.

MiniCPM VQA Polished:

The MiniCPM_VQA_Polished node performs video question answering (VQA) with the MiniCPM-V-4_5 model. It samples frames from a video input, passes them to the pre-trained model together with your question, and returns the model's answer as text. Because video encoding and model inference sit behind a single node, it is useful for AI artists and developers who need to analyze or interpret video content inside a ComfyUI workflow.

MiniCPM VQA Polished Input Parameters:

text

This parameter accepts a string input, which represents the question you want to ask about the video content. It supports multiline text, allowing for complex queries. The default value is an empty string. The text input directly influences the type of information the model will attempt to extract and answer from the video.

model

This parameter selects the model variant used for inference. The options are MiniCPM-V-4_5-int4 and MiniCPM-V-4_5, with the default being MiniCPM-V-4_5-int4. The choice affects both performance and accuracy: the int4 variant is a quantized build aimed at lower memory use and faster inference, while the full MiniCPM-V-4_5 variant needs more VRAM and favors precision.

keep_model_loaded

A boolean parameter that determines whether the model should remain loaded in memory after execution. The default value is False. Keeping the model loaded can reduce initialization time for subsequent inferences but may consume more memory resources.
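The trade-off behind keep_model_loaded can be sketched in a few lines. This is a minimal illustration, not the node's code: `ModelHolder`, its `loader` callable, and `load_count` are hypothetical names, and the "model" here is just a placeholder object.

```python
class ModelHolder:
    """Toy cache that mimics the keep_model_loaded behaviour (hypothetical helper)."""

    def __init__(self, loader):
        self._loader = loader      # callable that builds a model from its name
        self._name = None
        self._model = None
        self.load_count = 0        # counts how often the expensive load happened

    def run(self, name, keep_model_loaded=False):
        # Load only when nothing is cached or a different variant is requested.
        if self._model is None or self._name != name:
            self._model = self._loader(name)
            self._name = name
            self.load_count += 1
        result = f"ran {name}"     # stand-in for real inference
        if not keep_model_loaded:
            # Drop the reference so the memory can be reclaimed.
            self._model = None
            self._name = None
        return result


holder = ModelHolder(loader=lambda name: object())
holder.run("MiniCPM-V-4_5-int4", keep_model_loaded=True)
holder.run("MiniCPM-V-4_5-int4", keep_model_loaded=True)
print(holder.load_count)
```

With keep_model_loaded=True the second call reuses the cached object, so the loader runs once; with False it would run on every call.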

top_p

This float parameter, with a default value of 0.8, controls the nucleus sampling strategy during inference. It determines the cumulative probability threshold for token selection, influencing the diversity of the generated answers. A higher value allows for more diverse outputs.
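To make the cumulative-probability threshold concrete, here is a minimal sketch of nucleus (top-p) filtering over a toy distribution. The function name `top_p_filter` and the toy tokens are illustrative assumptions; the node's actual sampling happens inside the model's generation code.

```python
def top_p_filter(probs, top_p=0.8):
    """Keep the smallest set of most-likely tokens whose cumulative probability reaches top_p."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    kept, cumulative = {}, 0.0
    for token, p in ranked:
        kept[token] = p
        cumulative += p
        if cumulative >= top_p:
            break
    # Renormalize so the surviving probabilities sum to 1 again.
    total = sum(kept.values())
    return {token: p / total for token, p in kept.items()}


toy = {"cat": 0.5, "dog": 0.25, "bird": 0.125, "fish": 0.125}
print(sorted(top_p_filter(toy, top_p=0.8)))   # tokens that survive at 0.8
print(sorted(top_p_filter(toy, top_p=0.5)))   # tokens that survive at 0.5
```

Raising top_p lets more low-probability tokens into the candidate set, which is why higher values give more diverse answers.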

top_k

An integer parameter with a default value of 100 that limits sampling to the top_k highest-probability tokens at each step. Higher values admit more candidate tokens and lead to more varied outputs.
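A minimal sketch of top-k filtering on a toy distribution follows. `top_k_filter` and the toy tokens are hypothetical names used only for illustration.

```python
def top_k_filter(probs, top_k=100):
    """Keep only the top_k most likely tokens and renormalize their probabilities."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:top_k]
    total = sum(p for _, p in ranked)
    return {token: p / total for token, p in ranked}


toy = {"cat": 0.5, "dog": 0.25, "bird": 0.125, "fish": 0.125}
print(top_k_filter(toy, top_k=2))
```

With a vocabulary of tens of thousands of tokens, the default of 100 removes the long tail of very unlikely tokens while leaving room for variation.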

temperature

This float parameter, ranging from 0 to 1 with a default of 0.7, adjusts the randomness of the model's predictions. Lower values make the model's output more deterministic, while higher values increase variability and creativity in the responses.
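The effect of temperature is easiest to see on a softmax over toy logits. This is a sketch under stated assumptions: `softmax_with_temperature` is an illustrative name, and because the text does not say how the node treats a temperature of exactly 0, the sketch simply rejects it.

```python
import math


def softmax_with_temperature(logits, temperature=0.7):
    """Convert logits to probabilities after dividing them by the temperature."""
    if temperature <= 0:
        # Assumption: treat non-positive temperatures as invalid in this sketch.
        raise ValueError("temperature must be greater than 0")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)                       # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.0]
print(softmax_with_temperature(logits, temperature=0.2))  # sharply peaked
print(softmax_with_temperature(logits, temperature=1.0))  # flatter
```

Lower temperatures concentrate probability on the highest-scoring token, which is what makes the output more deterministic.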

repetition_penalty

A float parameter with a default value of 1.05 that penalizes tokens the model has already generated, which discourages loops and repeated phrases. Values above 1.0 strengthen the penalty; a value of 1.0 typically means no penalty.
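One widely used formulation of the repetition penalty divides positive logits and multiplies negative logits of already-generated tokens by the penalty. The sketch below shows that rule on a toy list; whether this node applies exactly this rule is an assumption, and `apply_repetition_penalty` is an illustrative name.

```python
def apply_repetition_penalty(logits, generated_ids, penalty=1.05):
    """Lower the scores of tokens that already appear in the generated sequence."""
    adjusted = list(logits)
    for token_id in set(generated_ids):
        score = adjusted[token_id]
        # Dividing a positive score or multiplying a negative one both make the token less likely.
        adjusted[token_id] = score * penalty if score < 0 else score / penalty
    return adjusted


logits = [2.0, -1.0, 0.5]
print(apply_repetition_penalty(logits, generated_ids=[0, 1], penalty=2.0))
```

Tokens that have not been generated yet (index 2 here) keep their original score.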

max_new_tokens

An integer parameter with a default value of 2048 that sets the maximum number of new tokens the model can generate in response to the input question. It caps the length of the answer; generation stops earlier if the model finishes its response first.

video_max_num_frames

This integer parameter, with a default value of 64, specifies the maximum number of frames to be sampled from the video for analysis. Reducing this number can help avoid out-of-memory (OOM) errors, especially with high-resolution videos.
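A common way to honor a frame cap is to pick evenly spaced frame indices across the clip. The sketch below assumes uniform sampling; `sample_frame_indices` is an illustrative name and the node's exact sampling strategy is not stated in this page.

```python
def sample_frame_indices(total_frames, max_frames=64):
    """Return at most max_frames evenly spaced frame indices from a clip."""
    if total_frames <= 0 or max_frames <= 0:
        return []
    if total_frames <= max_frames:
        return list(range(total_frames))     # short clip: keep every frame
    step = total_frames / max_frames
    # Take the frame nearest the middle of each of the max_frames equal segments.
    return [int(step * i + step / 2) for i in range(max_frames)]


indices = sample_frame_indices(total_frames=1000, max_frames=64)
print(len(indices), indices[:4], indices[-1])
```

Halving max_frames roughly halves the number of images the model has to encode, which is why lowering it is the first thing to try when memory runs out.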

video_max_slice_nums

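An integer parameter with a default value of 2 that limits how many slices the visual input is divided into for processing. In the upstream MiniCPM-V models, the corresponding max_slice_nums setting caps the number of tiles each frame is split into before encoding, so lowering it reduces memory usage and processing time, particularly for high-resolution or longer videos.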

seed

An integer parameter with a default value of -1 that sets the random seed. Using a fixed seed makes sampling reproducible: the same input and settings should produce the same output across runs, which is useful for debugging and consistency.
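The role of the seed can be illustrated with Python's standard random module. Two assumptions are made here: that -1 means "pick a random seed" (a common convention that this page does not confirm), and that `resolve_seed` and `sample_tokens` are hypothetical helpers standing in for the model's sampler.

```python
import random


def resolve_seed(seed=-1):
    """Return the seed to use; -1 is assumed to mean 'choose one at random'."""
    if seed == -1:
        return random.SystemRandom().randrange(2**32)
    return seed


def sample_tokens(seed, n=5):
    """Stand-in for sampling: draw n pseudo-random token ids from a seeded generator."""
    rng = random.Random(seed)
    return [rng.randrange(1000) for _ in range(n)]


fixed = resolve_seed(42)
print(sample_tokens(fixed) == sample_tokens(fixed))   # same seed, same draws
```

Recording the resolved seed alongside the output makes it possible to reproduce a result later even when the seed was chosen randomly.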

MiniCPM VQA Polished Output Parameters:

STRING

The output is a string that contains the model's response to the input question based on the video content. This output provides insights and answers derived from the video, reflecting the model's understanding and interpretation of the visual data.

MiniCPM VQA Polished Usage Tips:

To optimize performance, consider using the MiniCPM-V-4_5-int4 model for faster inference times, especially when working with large datasets or requiring quick responses.
Adjust the video_max_num_frames parameter to a lower value if you encounter memory issues, particularly with high-resolution videos, to ensure smooth processing.
Utilize the temperature and top_p parameters to fine-tune the creativity and diversity of the model's responses, depending on whether you need more deterministic or varied outputs.
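The tips above can be captured as reusable settings. The dictionary below lists the documented defaults; the two presets and the `build_settings` helper are hypothetical and their values are illustrative, not recommendations from the extension.

```python
# Defaults as listed in the input parameter descriptions.
DEFAULTS = {
    "model": "MiniCPM-V-4_5-int4",
    "keep_model_loaded": False,
    "top_p": 0.8,
    "top_k": 100,
    "temperature": 0.7,
    "repetition_penalty": 1.05,
    "max_new_tokens": 2048,
    "video_max_num_frames": 64,
    "video_max_slice_nums": 2,
    "seed": -1,
}

# Hypothetical presets for illustration only.
PRESETS = {
    "more_deterministic": {"temperature": 0.2, "top_p": 0.5, "seed": 1234},
    "low_memory": {"video_max_num_frames": 16, "video_max_slice_nums": 1},
}


def build_settings(preset=None, **overrides):
    """Start from the defaults, apply an optional preset, then explicit overrides."""
    settings = dict(DEFAULTS)
    if preset is not None:
        settings.update(PRESETS[preset])
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise KeyError(f"unknown parameter(s): {sorted(unknown)}")
    settings.update(overrides)
    return settings


print(build_settings("low_memory", temperature=0.5))
```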

MiniCPM VQA Polished Common Errors and Solutions:

CUDA out of memory

Explanation: This error occurs when the GPU runs out of memory during processing, often due to high-resolution videos or large frame counts.
Solution: Reduce the video_max_num_frames or video_max_slice_nums parameters to decrease memory usage. Alternatively, consider using a model variant with lower memory requirements.
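If you drive the model from your own script, the same remedy can be automated by retrying with fewer frames. This is a sketch only: `run_with_frame_backoff` and `fake_inference` are hypothetical, and MemoryError stands in for the GPU error. With PyTorch you would likely pass torch.cuda.OutOfMemoryError as `oom_error` instead.

```python
def run_with_frame_backoff(run_inference, max_frames=64, min_frames=4, oom_error=MemoryError):
    """Call run_inference(frames), halving the frame count after each out-of-memory error."""
    frames = max_frames
    while True:
        try:
            return run_inference(frames), frames
        except oom_error:
            if frames <= min_frames:
                raise                      # nothing left to reduce; surface the error
            frames = max(min_frames, frames // 2)


def fake_inference(frames):
    """Simulated model call that only fits in memory with 16 frames or fewer."""
    if frames > 16:
        raise MemoryError("simulated out of memory")
    return f"answer from {frames} frames"


print(run_with_frame_backoff(fake_inference))
```

Here the call fails at 64 and 32 frames and succeeds at 16, so the helper returns the answer together with the frame count that worked.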

Model not loaded

Explanation: This error may arise if the model is not kept loaded between inferences, leading to delays or failures in processing.
Solution: Set the keep_model_loaded parameter to True if you plan to run multiple inferences in succession to avoid reloading the model each time.

Invalid input text

Explanation: This error can occur if the input text is not properly formatted or is empty, leading to issues in generating a response.
Solution: Ensure that the text parameter contains a valid question or query related to the video content, and check for any formatting issues.
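When questions are generated programmatically, a small check before they reach the node avoids this class of error. `validate_question` is a hypothetical helper; it only enforces that the text is a non-empty string and keeps multiline content intact.

```python
def validate_question(text):
    """Return the question with surrounding whitespace removed, or raise if it is unusable."""
    if not isinstance(text, str):
        raise TypeError("question must be a string")
    cleaned = text.strip()
    if not cleaned:
        raise ValueError("question must not be empty")
    return cleaned


print(validate_question("  What happens in the first scene?\nWho appears?  "))
```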

MiniCPM VQA Polished Related Nodes

Go back to the extension to check out more related nodes.

ComfyUI_MiniCPM-V-4_5

