Vidu Q2: Precision Image-to-Image Generator with 4K Output

vidu/q2/reference-to-image

Transform reference images into high-quality 4K visuals with precise editing, consistent styles, and fast, production-ready output for film, design, and advertising workflows.

The rate is $0.04 per image.
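
As a quick budgeting aid, the per-image rate can be turned into a batch estimate. This is a minimal sketch that assumes every generated image is billed at the flat $0.04 rate quoted above. The function name `estimate_cost_usd` is illustrative and not part of any RunComfy tooling.

```python
from decimal import Decimal

# Flat per-image rate quoted on this page (USD).
PRICE_PER_IMAGE_USD = Decimal("0.04")


def estimate_cost_usd(image_count: int) -> Decimal:
    """Estimate the cost of a batch, assuming flat per-image billing."""
    if image_count < 0:
        raise ValueError("image_count must be non-negative")
    return PRICE_PER_IMAGE_USD * image_count


if __name__ == "__main__":
    for count in (1, 25, 500):
        print(f"{count} images -> ${estimate_cost_usd(count):.2f}")
```

Decimal arithmetic is used here so that currency totals are not affected by binary floating-point rounding.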

Introduction to Vidu Q2 Image Generation

Developed by ShengShu Technology, Vidu Q2 is an image-to-image model built for creators, designers, and production teams who need consistent, high-quality visuals quickly. It offers professional-level editing, reference-based generation, and 4K output, which makes it well suited to film, advertising, and creative media workflows. For developers, Vidu Q2 on RunComfy is available both in the browser and through an HTTP API, so you don’t need to host or scale the model yourself.

Model overview

Provider: Vidu
Task: image-to-image
Architecture: Diffusion-based image-to-image pipeline with cross-attention over reference images
Resolution/Specs: Up to 4K still images
Key strengths:

- Consistent subject and style transfer from multiple reference images

- High-fidelity 4K output with sharp detail retention

- Strong prompt adherence for fine-grained edits

- Fast, production-ready inference on managed GPUs

- Deterministic reproduction via seeding

Vidu Q2 is an image-to-image model that turns reference images and prompts into high-quality 4K visuals with precise, controllable edits. It is optimized for production pipelines that need speed, consistency, and repeatability on RunComfy.

How Vidu Q2 runs on RunComfy

RunComfy provides a managed environment for Vidu Q2 so you can prototype in minutes and scale to production without wrangling infrastructure. You get consistent performance, simple versioning, and reproducible runs.

Playground UI: Experience the model directly in your browser without installation.
Playground API: Developers can integrate Vidu Q2 via a scalable HTTP API. Endpoint: https://www.runcomfy.com/models/vidu/q2/reference-to-image/api

Input parameters

Below are the supported inputs for Vidu Q2.

Core prompts

Parameter	Type	Default/Range	Description
prompt	string	"" (max 1500 chars)	Natural language instruction guiding how the model should transform and stylize the output. Be explicit about subject, style, lighting, camera, and materials for best control.
reference_image_urls	array of image URLs	[]	One or more https URLs to reference images for identity/style consistency. Use 1-3 high-quality images of the same subject; ensure clear, well-lit framing for reliable transfer.

Dimensions and settings

Parameter	Type	Default/Range	Description
aspect_ratio	enum string	16:9 (choices: 16:9, 9:16, 1:1)	Target aspect ratio for the generated image. Match this to your composition (e.g., 9:16 for vertical portraits, 16:9 for landscapes, 1:1 for square creatives).
seed	integer	0 (random)	Controls determinism. Use a fixed seed to reproduce a result; set 0 to randomize each run during exploration.
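
Because aspect_ratio accepts only three presets, it can help to pick the preset closest to the framing of your main reference image. The sketch below is one possible way to do that; the helper name `closest_aspect_ratio` and the log-ratio distance are illustrative choices, not something the model or RunComfy prescribes.

```python
import math

# The three presets listed in the parameter table above.
SUPPORTED_ASPECT_RATIOS = {"16:9": 16 / 9, "9:16": 9 / 16, "1:1": 1.0}


def closest_aspect_ratio(width: int, height: int) -> str:
    """Return the supported preset nearest to width/height.

    Distances are compared on a log scale so that "twice as wide" and
    "twice as tall" are treated symmetrically.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    source = math.log(width / height)
    return min(
        SUPPORTED_ASPECT_RATIOS,
        key=lambda name: abs(math.log(SUPPORTED_ASPECT_RATIOS[name]) - source),
    )


if __name__ == "__main__":
    print(closest_aspect_ratio(1920, 1080))
    print(closest_aspect_ratio(1080, 1350))
```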

Required parameter: prompt
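
The parameters above can be assembled and checked client-side before a request is sent. The following is a minimal sketch that assumes the request body is a flat JSON object keyed by the parameter names in the tables. The actual request envelope used by the RunComfy API is not shown on this page, so treat the structure and the `build_payload` helper as hypothetical.

```python
from urllib.parse import urlparse

# Constraints taken from the parameter tables above.
ALLOWED_ASPECT_RATIOS = ("16:9", "9:16", "1:1")
MAX_PROMPT_CHARS = 1500


def build_payload(prompt, reference_image_urls=(), aspect_ratio="16:9", seed=0):
    """Validate inputs against the documented constraints and return a dict.

    Assumption: the API accepts a flat object with these four keys.
    """
    if not isinstance(prompt, str) or not prompt.strip():
        raise ValueError("prompt is required")
    if len(prompt) > MAX_PROMPT_CHARS:
        raise ValueError(f"prompt exceeds {MAX_PROMPT_CHARS} characters")
    urls = list(reference_image_urls)
    for url in urls:
        if urlparse(url).scheme != "https":
            raise ValueError(f"reference image URL must use https: {url}")
    if aspect_ratio not in ALLOWED_ASPECT_RATIOS:
        raise ValueError(f"aspect_ratio must be one of {ALLOWED_ASPECT_RATIOS}")
    if isinstance(seed, bool) or not isinstance(seed, int):
        raise ValueError("seed must be an integer")
    return {
        "prompt": prompt,
        "reference_image_urls": urls,
        "aspect_ratio": aspect_ratio,
        "seed": seed,
    }


if __name__ == "__main__":
    payload = build_payload(
        "Studio portrait, soft key light, editorial style, no text",
        ["https://example.com/subject-front.jpg"],  # placeholder URL
        aspect_ratio="9:16",
    )
    print(payload)
```

Validating locally catches an oversized prompt or an unsupported aspect ratio before a paid request is made.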

Recommended settings

These tips help get the best results from Vidu Q2 for image-to-image workflows:

Use 1-3 high-resolution reference images of the same subject with consistent lighting and pose for the strongest identity transfer.
Choose an aspect_ratio that matches your framing from the start to avoid unwanted cropping or stretching.
Keep prompts concise but specific: include subject, scene, style descriptors (e.g., cinematic, editorial, painterly), lighting, and any constraints (e.g., no text, no watermark).
Start with seed = 0 while exploring variations; lock a specific seed to finalize and reproduce a chosen look.
Avoid conflicting style terms in the prompt (e.g., photorealistic and watercolor) unless you are intentionally blending styles.
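
One way to make the explore-then-lock seed workflow reproducible is to choose explicit non-zero seeds on the client during exploration, so that any variation you like can be regenerated later with the same seed. This is a sketch under assumptions: this page does not say whether a run with seed = 0 reports the seed it actually used, and the upper bound on seed values below is a guess rather than a documented limit.

```python
import random

# Assumed upper bound for seed values; not documented on this page.
MAX_SEED = 2**31 - 1


def exploration_seeds(count, rng=None):
    """Return `count` distinct non-zero seeds (0 means "randomize")."""
    if count < 0:
        raise ValueError("count must be non-negative")
    rng = rng or random.Random()
    seeds = []
    while len(seeds) < count:
        candidate = rng.randint(1, MAX_SEED)
        if candidate not in seeds:
            seeds.append(candidate)
    return seeds


if __name__ == "__main__":
    prompt = "Cinematic close-up, warm rim light, no watermark"
    # One request body per variation; keep the seed of the one you pick.
    variations = [{"prompt": prompt, "seed": s} for s in exploration_seeds(4)]
    for body in variations:
        print(body)
```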

Output quality and performance

Output: A single high-resolution image (PNG/JPEG) generated from your references and prompt. Vidu Q2 targets up to 4K stills.
Performance: On RunComfy you can expect latency on the order of seconds for most runs, with consistent p95 latency because there are no cold starts and capacity autoscales. Concurrency is handled transparently for production workloads.
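
If you want to check the latency figures against your own workload, you can record the wall-clock time of each request and compute the 95th percentile yourself. The sketch below uses the nearest-rank method. The sample values are made up for illustration and are not measurements of Vidu Q2.

```python
def p95(latencies_seconds):
    """Return the 95th-percentile latency using the nearest-rank method."""
    ordered = sorted(latencies_seconds)
    if not ordered:
        raise ValueError("at least one sample is required")
    # Nearest rank = ceil(0.95 * n), computed with integer arithmetic.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


if __name__ == "__main__":
    # Illustrative samples only.
    samples = [3.1, 3.4, 2.9, 4.0, 3.3, 5.8, 3.2, 3.6, 3.0, 3.5]
    print(f"p95 latency: {p95(samples):.1f}s")
```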

Recommended use cases

Film, TV, and previsualization: maintain character/prop consistency while exploring new looks.
Advertising and brand design: rapid creative variations and style-locked campaigns.
Product and industrial design: material/finish iterations with consistent lighting and perspective.
E-commerce and catalog imaging: restyle or upscale product shots while preserving identity.

Related Models

seedream-4-5/edit-sequential

Create consistent visual stories with advanced image editing and multi-scene control.

nano-banana/pro/edit

Turn sketches into precise 2K-4K visuals with smart correction and seamless creative control.

qwen-edit-2509/multi-image-edit-plus

Advanced image editing model for detailed, consistent visual creation and precise design workflows.

qwen-image-layered

Transforms images into editable RGBA layers for precise object isolation and seamless design control.

seedream-4-5/sequential

Create cohesive 4K visuals with stable subjects and refined scene alignment.

qwen-edit-2509/lora/next-scene

Create seamless cinematic sequences with smooth framing and stable lighting for coherent story visuals.

Frequently Asked Questions

What kind of license does Vidu Q2 image-to-image use, and does RunComfy change it?

Vidu Q2 image-to-image is governed by its original license from ShengShu Technology, often aligned with permissive Open RAIL or similar frameworks allowing research and possible commercial use under stated conditions. RunComfy simply provides API and cloud hosting access — it does not alter or override the model’s original licensing structure. Users must review and adhere to Vidu Q2’s stated license before selling or distributing generated content.

What are the performance considerations of Vidu Q2 image-to-image on RunComfy?

Vidu Q2 image-to-image operates on cloud-hosted GPUs managed by RunComfy, offering low latency (around 3–6 seconds for 1080p) and stable concurrency through automatic GPU scaling. RunComfy manages GPU load balancing so users can run multiple concurrent generations without setup. Local execution isn’t supported due to high hardware demands.

Are there any technical limitations when using Vidu Q2 image-to-image via the RunComfy Playground or API?

Yes. Vidu Q2 image-to-image supports outputs up to 4K resolution, although the free tier is geared toward 1080p. Each prompt is limited to roughly 512 tokens (the input parameter table lists a 1500-character maximum), and reference inputs via ControlNet or IP-Adapter are capped at four. Aspect ratios are restricted to the 1:1, 16:9, and 9:16 presets to ensure stability across generations.

How can I transition from testing Vidu Q2 image-to-image in RunComfy Playground to a production setup?

You can prototype in the RunComfy Playground (https://www.runcomfy.com/playground) using Vidu Q2 image-to-image, then integrate your workflow via the RunComfy API. The API mirrors the Playground parameters for inputs, references, and post-processing options. For production use, you’ll need to purchase USD credits, authenticate with your RunComfy API key, and configure concurrency or callback URLs for automated processing.
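
To illustrate the move from Playground to API, the sketch below prepares an authenticated JSON request with Python's standard library but does not send it. Several details are assumptions: the URL is a placeholder (the real request URL is in the RunComfy API documentation linked earlier), the Bearer authorization scheme is a common convention rather than something confirmed on this page, and the `RUNCOMFY_API_KEY` environment variable name is hypothetical.

```python
import json
import os
import urllib.request

# Placeholder only; replace with the request URL from the RunComfy API docs.
HYPOTHETICAL_API_URL = "https://api.example.com/vidu/q2/reference-to-image"


def build_request(api_key, payload, api_url=HYPOTHETICAL_API_URL):
    """Prepare (but do not send) a JSON POST request.

    Assumption: the API key is passed as a Bearer token.
    """
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        api_url,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    # Hypothetical variable name; keep keys out of source control.
    key = os.environ.get("RUNCOMFY_API_KEY", "missing-key")
    request = build_request(key, {"prompt": "Product shot on white", "seed": 0})
    print(request.get_method(), request.full_url)
    # Sending would be: urllib.request.urlopen(request, timeout=60)
```

Keeping request construction separate from sending makes it easy to unit-test the payload and headers without spending credits.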

What makes Vidu Q2 image-to-image different from Vidu Q1 or other competitors?

Vidu Q2 image-to-image vastly improves consistency, handling multiple references with refined spatial and identity control. It’s up to 2× faster than Vidu Q1 and handles both text-to-image and full-image editing in the same unified architecture. This makes it reliable for professional pipelines across advertising, animation, and concept art where visual continuity is critical.

What are the GPU infrastructure details behind Vidu Q2 image-to-image on RunComfy?

RunComfy runs Vidu Q2 image-to-image entirely on managed GPU clusters in the cloud, eliminating the need for users to provision hardware. Resources are dynamic—users experience minimal latency and autoscaled throughput even at high demand. This managed approach separates inference performance from local computing limitations.

Does using Vidu Q2 image-to-image via RunComfy offer any free trial?

Yes, new RunComfy users receive free trial USD credits that can be used with Vidu Q2 image-to-image to generate outputs up to 1080p. This lets users evaluate quality and performance before upgrading to paid usage for higher resolution or production-scale workflows.

Where can I get help or report issues related to Vidu Q2 image-to-image on RunComfy?

For support or feedback regarding Vidu Q2 image-to-image usage, integration questions, or GPU availability, you can contact RunComfy directly at hi@runcomfy.com. The platform team assists with both technical troubleshooting and billing inquiries related to USD credit consumption or API access.
