InfiniteTalk | Lip-Synced Avatar Generator

Photo + Voice = Perfectly Synced Talking Avatar in Minutes

Flux 2 Dev | Photoreal Text-to-Image Generator

Next-level image realism with advanced generation control power

Qwen Image 2512 LoRA Inference | AI Toolkit ComfyUI

Use an AI Toolkit-trained LoRA with Qwen Image 2512 in ComfyUI via one RCQwenImage2512 node for preview-aligned generations.

Qwen Image Edit 2509 | Multi-Image Editor

Turn 2–3 images into one seamless, edited masterpiece instantly.

ComfyUI > Nodes > ComfyUI ModelScope API Node > ModelScope-Vision 图生文节点

ComfyUI Node: ModelScope-Vision 图生文节点

Class Name

ModelScopeVisionNode

Category
ModelScopeAPI

Author
hujuying (Account age: 1426days) Extension
ComfyUI ModelScope API Node Latest Updated
2025-12-31 Github Stars
0.06K

Github Ask hujuying Current Questions Past Questions

Table of Content

Description
ModelScopeVisionNode:
ModelScopeVisionNode Input Parameters:
ModelScopeVisionNode Output Parameters:
ModelScopeVisionNode Usage Tips:
ModelScopeVisionNode Common Errors and Solutions:
Related Nodes

How to Install ComfyUI ModelScope API Node

Install this extension via the ComfyUI Manager by searching for ComfyUI ModelScope API Node

1. Click the Manager button in the main menu
2. Select Custom Nodes Manager button
3. Enter ComfyUI ModelScope API Node in the search bar

After installation, click the Restart button to restart ComfyUI. Then, manually refresh your browser to clear the cache and access the updated list of nodes.

Visit ComfyUI Online for ready-to-use ComfyUI environment

Free trial available
16GB VRAM to 80GB VRAM GPU machines
400+ preloaded models/nodes
Freedom to upload custom models/nodes
200+ ready-to-run workflows
100% private workspace with up to 200GB storage
Dedicated Support

Run ComfyUI Online

ModelScope-Vision 图生文节点 Description

Facilitates image analysis via ModelScope API, enabling AI-driven visual content interpretation.

ModelScope-Vision 图生文节点:

The ModelScopeVisionNode is designed to facilitate the analysis of images by leveraging the capabilities of the ModelScope API. This node serves as a bridge between your visual data and advanced AI models, enabling you to extract meaningful insights and descriptions from images. By integrating with the ModelScope API, it allows you to send image data along with textual prompts to a specified model, which then processes the input and returns a detailed analysis or description. This functionality is particularly beneficial for AI artists and developers who wish to automate the interpretation of visual content, making it easier to generate descriptive text from images without requiring deep technical expertise in machine learning or computer vision.

ModelScope-Vision 图生文节点 Input Parameters:

api_token

The api_token is a crucial input parameter that serves as your authentication key for accessing the ModelScope API. It ensures that your requests to the API are authorized and secure. Without a valid API token, the node will not be able to communicate with the ModelScope service, thus preventing any image analysis from taking place. There are no specific minimum or maximum values for this parameter, but it must be a valid token string provided by ModelScope.

model

The model parameter specifies the AI model to be used for processing the image and text input. This parameter determines the type of analysis or description that will be generated based on the capabilities of the chosen model. The default value is "stepfun-ai/step3", but you can select other models available in the ModelScope platform to suit your specific needs.

max_tokens

The max_tokens parameter defines the maximum number of tokens that the model can generate in its response. This parameter impacts the length and detail of the output description. The default value is 1000 tokens, which provides a balance between detail and brevity. Adjusting this value allows you to control the verbosity of the output.

temperature

The temperature parameter controls the randomness of the model's output. A lower temperature value results in more deterministic and focused responses, while a higher value introduces more variability and creativity. The default temperature is set to 0.7, which offers a good mix of coherence and diversity in the generated descriptions.

prompt

The prompt is a text input that guides the model in generating the desired output. It provides context or specific instructions for the type of analysis or description you want from the image. This parameter is essential for tailoring the model's response to your particular requirements.

image_url

The image_url parameter is the URL of the image you wish to analyze. This parameter is critical as it provides the visual data that the model will process. The image must be accessible via the provided URL for the node to function correctly.

ModelScope-Vision 图生文节点 Output Parameters:

description

The description output parameter contains the textual analysis or description generated by the model based on the provided image and prompt. This output is valuable for understanding the content and context of the image, offering insights that can be used for various applications such as content creation, data annotation, or automated reporting.

ModelScope-Vision 图生文节点 Usage Tips:

Ensure that your api_token is valid and up-to-date to maintain uninterrupted access to the ModelScope API.
Experiment with different temperature settings to find the right balance between creativity and coherence in the model's output.
Use specific and detailed prompts to guide the model towards generating more relevant and accurate descriptions.

ModelScope-Vision 图生文节点 Common Errors and Solutions:

图像分析失败: `<error_message>`

Explanation: This error indicates that the image analysis process encountered an issue, which could be due to an invalid image URL, network problems, or an incorrect API token.
Solution: Verify that the image URL is correct and accessible, ensure your API token is valid, and check your network connection. If the problem persists, consult the ModelScope API documentation for further troubleshooting steps.

ModelScope-Vision 图生文节点 Related Nodes

Go back to the extension to check out more related nodes.

ComfyUI ModelScope API Node

Table of Content

Description
ModelScopeVisionNode:
ModelScopeVisionNode Input Parameters:
ModelScopeVisionNode Output Parameters:
ModelScopeVisionNode Usage Tips:
ModelScopeVisionNode Common Errors and Solutions:
Related Nodes

Wan 2.1 LoRA

Enhance Wan 2.1 video generation with LoRA models for improved style and customization.

OmniGen2 | Text-to-Image & Editing

Powerful unified model for image generation and editing

HiDream E1.1 | AI Image Editing

Edit images with natural language using HiDream E1.1 model

Flux Upscaler - Ultimate 32k | Image Upscaler

Flux Upscaler – Achieve 4k, 8k, 16k, and Ultimate 32k Resolution!

Support

Resources

Legal

RunComfy

RunComfy is the premier ComfyUI platform, offering ComfyUI online environment and services, along with ComfyUI workflows featuring stunning visuals. RunComfy also provides AI Models, enabling artists to harness the latest AI tools to create incredible art.