Z Image Turbo ControlNet: Photoreal Image-to-Image with Depth & Pose Control on playground and API | RunComfy

tongyi-mai/z-image/turbo/controlnet/lora

The most powerful version of Z Image Turbo. Combine ControlNet (Canny, Depth, Pose) for structure locking with custom LoRAs for style transfer in a single high-speed workflow.

URL of the input image used for ControlNet-based generation.
URL or Hugging Face repo ID (owner/repo) of the LoRA weights.
Scale of the LoRA model.
List of LoRAs to apply (maximum 3).
Controls how strongly the ControlNet conditions affect the output.
Specifies the start point of the ControlNet conditioning during generation.
Specifies the end point of the ControlNet conditioning during generation.
Defines what preprocessing (if any) will be applied to the ControlNet input image.
Specifies the number of inference steps during generation.
Enables automatic prompt expansion to improve results; increases cost by 0.0025 credits per request.
Specifies the output image format.
The rate is $0.01 per image.
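
To get a feel for how the pricing above adds up over a batch, here is a minimal sketch. It assumes the prompt-expansion surcharge is in the same unit as the per-image rate (the FAQ on this page describes credits in USD) and that the surcharge applies once per request. The helper name `estimate_cost` and its arguments are hypothetical and not part of any RunComfy API.

```python
from decimal import Decimal

# Figures quoted on this page. Treating "credits" as USD is an assumption.
RATE_PER_IMAGE = Decimal("0.01")
PROMPT_EXPANSION_FEE = Decimal("0.0025")  # assumed to be charged once per request


def estimate_cost(num_requests: int, images_per_request: int = 1,
                  prompt_expansion: bool = False) -> Decimal:
    """Rough cost estimate for a batch of requests (hypothetical helper)."""
    if num_requests < 0 or images_per_request < 1:
        raise ValueError("num_requests must be >= 0 and images_per_request >= 1")
    per_request = RATE_PER_IMAGE * images_per_request
    if prompt_expansion:
        per_request += PROMPT_EXPANSION_FEE
    return per_request * num_requests


if __name__ == "__main__":
    print(estimate_cost(100))                         # plain generation
    print(estimate_cost(100, prompt_expansion=True))  # with prompt expansion
```

`Decimal` is used instead of floats so that small per-request amounts add up without rounding drift.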

Introduction to Z Image Turbo ControlNet LoRA

Z Image Turbo ControlNet LoRA is built for creators who need both structural control and stylistic freedom. It combines the fast generation of Z Image Turbo with ControlNet's geometric guidance (Canny, Depth, Pose) and LoRA-based customization. The playground targets advanced workflows: upload a reference image to lock the composition, load custom LoRAs to define the art style, and generate high-fidelity results that follow your layout while adopting your chosen aesthetic, all in seconds.

Key Capabilities

  • Precise Structure Control: Use ControlNet to lock pose, depth, or edges from your input image. Perfect for keeping a character's posture or a room's layout identical while changing everything else.
  • Style Injection via LoRA: Load up to 3 custom LoRA models (via URL or Hugging Face repo ID) to apply specific art styles or character details on top of your controlled structure.
  • Advanced Conditioning: Choose from canny (edge detection), depth (3D distance), or pose (human keypoints) preprocessors to tell the model exactly what to respect from your reference image.
  • Fine-Grained Influence: Independently adjust Control Scale (how much the reference image matters) and LoRA Scale (how much the style matters) for the perfect balance.

How to use Z Image Turbo ControlNet LoRA

  1. Upload Reference: Upload an image to the Image slot. This will serve as the structural guide.
  2. Select Preprocess:
     - Canny: For keeping strict outlines and details.
     - Depth: For architectural renders or maintaining 3D volume.
     - Pose: For changing a character's outfit or background while keeping their position fixed.
  3. Load LoRAs: Add your desired style or character LoRAs in the LoRAs list.
  4. Tune Control: Adjust Control Scale (Default 0.9). Lower it if you want the model to be more creative; raise it for strict adherence.
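
The steps above can be sketched as a small request builder. This is only an illustration: the field names (`image_url`, `preprocess`, `control_scale`, `control_end`, `loras`, `path`, `scale`) and the `"none"` preprocess option are assumptions made for the example, not the documented API schema, so check the API Docs for the real parameter names before using it.

```python
# Hypothetical request builder; field names are illustrative assumptions.
ALLOWED_PREPROCESS = {"none", "canny", "depth", "pose"}  # "none" is assumed
MAX_LORAS = 3  # this page states a maximum of 3 LoRAs


def build_request(image_url, prompt, preprocess="depth", loras=(),
                  control_scale=0.9, control_end=0.4):
    """Assemble a payload dict mirroring the playground steps."""
    if preprocess not in ALLOWED_PREPROCESS:
        raise ValueError(f"unknown preprocess: {preprocess!r}")
    if len(loras) > MAX_LORAS:
        raise ValueError(f"at most {MAX_LORAS} LoRAs are allowed")
    if not 0.0 <= control_end <= 1.0:
        raise ValueError("control_end must be between 0 and 1")
    return {
        "image_url": image_url,          # step 1: structural reference
        "prompt": prompt,
        "preprocess": preprocess,        # step 2: canny, depth, or pose
        "loras": [{"path": path, "scale": scale} for path, scale in loras],  # step 3
        "control_scale": control_scale,  # step 4: lower = more creative
        "control_end": control_end,
    }


if __name__ == "__main__":
    payload = build_request(
        "https://example.com/room.png",  # placeholder URL
        "a cozy loft in watercolor style",
        preprocess="depth",
        loras=[("owner/repo", 0.8)],     # placeholder repo ID
    )
    print(payload)
```

Validating the LoRA count and preprocess mode before sending a request catches the most common mistakes locally instead of spending a generation on them.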

Pro Tips

  • Control Step Timing: Use Control End (Default 0.4) to let the structure be defined early, but allow the LoRA style to take over the details in the later steps.
  • Aspect Ratio: Ensure your Aspect Ratio matches your input image's shape to avoid stretching or cropping.
  • Magic Prompt: Enable Magic Prompt if you want the model to add more detail to your scene automatically, useful when your manual prompt is simple.
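
To make the Control End and Aspect Ratio tips concrete, the sketch below shows which steps fall inside a ControlNet window and how to reduce an input image's dimensions to a simple ratio. The windowing rule (a step is active when it lies fully between the start and end fractions) follows the convention used by common open-source ControlNet pipelines. Whether this endpoint applies exactly the same rule is an assumption, and both function names are hypothetical.

```python
from math import gcd


def control_active_steps(num_steps, control_start=0.0, control_end=0.4):
    """Return the step indices where ControlNet guidance would be applied.

    Assumes step i is active when i/num_steps >= control_start and
    (i + 1)/num_steps <= control_end.
    """
    if num_steps < 1:
        raise ValueError("num_steps must be at least 1")
    return [
        i for i in range(num_steps)
        if i / num_steps >= control_start and (i + 1) / num_steps <= control_end
    ]


def reduced_aspect_ratio(width, height):
    """Reduce pixel dimensions to the smallest whole-number ratio."""
    divisor = gcd(width, height)
    return width // divisor, height // divisor


if __name__ == "__main__":
    # With 8 steps and Control End at 0.4, structure is guided only early on.
    print(control_active_steps(8))
    print(reduced_aspect_ratio(1920, 1080))
```

Under this rule, an 8-step run with Control End at 0.4 applies structural guidance to the first three steps and leaves the remaining five to the prompt and LoRA style.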

Related Tools

  • For simple text-to-image with styles (no structure control needed), use Z Image Turbo LoRA.
  • For pure text-to-image speed, use the base model: Z Image Turbo Text to Image.

Related Playgrounds

flux-1-kontext/max/text-to-image

Advanced model with fast text control, precision edits, and consistent visual fidelity.

flux-2/turbo/edit

Delivers refined image remastering and brand-consistent visual edits with scalable control.

qwen-image/text-to-image

Precise text rendering & multilingual edits for visual pros

qwen-image-layered

Transforms images into editable RGBA layers for precise object isolation and seamless design control.

z-image/turbo/inpainting/lora

Fast, photorealistic image repair and refinements for product visuals.

flux-2/lora/edit

Refine images with adaptive style control, LoRA merging, and high-res rendering for consistent design output.

Frequently Asked Questions

Can I use Z Image Turbo ControlNet for commercial text-to-image projects on RunComfy?

Yes, Z Image Turbo ControlNet is distributed under the Apache 2.0 open-source license, which generally allows commercial use. However, using it on RunComfy does not override or bypass the model’s original license terms. If you plan to deploy Z Image Turbo ControlNet commercially for text-to-image generation at scale, review the official license from the model creators to ensure proper compliance.

Are there technical limitations when using Z Image Turbo ControlNet for text-to-image generation?

Yes. Z Image Turbo ControlNet currently supports output resolutions up to about 1536×1536 pixels. The prompt is limited to approximately 200–250 tokens, and up to 4 simultaneous reference conditions can be applied through the Fun ControlNet Union (Canny, HED, Depth, Pose, or MLSD). These constraints balance quality, speed, and GPU resource efficiency.
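
If your reference image is larger than the stated limit, one option is to scale the target size down while preserving its shape. The sketch below assumes the limit applies to the longer side and uses the approximate 1536-pixel figure quoted above. The function name `fit_within` is hypothetical.

```python
MAX_SIDE = 1536  # approximate limit quoted in this FAQ


def fit_within(width, height, max_side=MAX_SIDE):
    """Scale (width, height) down so the longer side is at most max_side.

    Sizes already within the limit are returned unchanged.
    """
    if width < 1 or height < 1:
        raise ValueError("width and height must be positive")
    scale = min(1.0, max_side / max(width, height))
    return max(1, round(width * scale)), max(1, round(height * scale))


if __name__ == "__main__":
    print(fit_within(3072, 2048))  # oversized input is scaled down
    print(fit_within(1024, 768))   # already within the limit
```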

What are the main strengths of Z Image Turbo ControlNet compared to earlier text-to-image models?

Z Image Turbo ControlNet stands out for its bilingual English and Chinese comprehension, improved prompt fidelity, and text rendering within generated images. Using a 6-billion-parameter Single-Stream DiT structure, it performs image generation in just 8 inference steps, delivering fast, high-quality text-to-image outcomes while being more VRAM-efficient than larger competitors such as Flux.

Can I fine-tune or customize Z Image Turbo ControlNet in my text-to-image pipeline?

Yes, Z Image Turbo ControlNet supports the Base and Edit variants for fine-tuning and image editing tasks. Developers can adapt these for domain-specific text-to-image generation while still benefiting from the core DiT efficiency. Note that fine-tuned derivatives must also respect the original Apache 2.0 licensing conditions.

What happens after my free trial of Z Image Turbo ControlNet ends on RunComfy?

After your free trial credits (in USD) are used, you’ll need to purchase additional credits to continue generating text-to-image outputs with Z Image Turbo ControlNet. Pricing is listed under the 'Generation' section of your account. You can monitor usage and costs directly within your RunComfy dashboard.

What kind of support is available for Z Image Turbo ControlNet users on RunComfy?

If you encounter issues using Z Image Turbo ControlNet for text-to-image generation, you can reach RunComfy’s support team via hi@runcomfy.com. They assist with API integration, usage limits, and troubleshooting performance or licensing-related questions.

Copyright 2025 RunComfy. All Rights Reserved.
