
Z-Image Character LoRA Dataset Guide: How Many Images, Angles, Captions, and Steps?

This guide explains how to build a strong Z-Image character LoRA dataset with better images, angle coverage, captions, and step planning. It is written for users who care about stable likeness and reusable character assets rather than generic fine-tuning.


If you are preparing a Z-Image character LoRA training dataset, you probably want one thing:

A character LoRA that still looks like the same person on new prompts, new angles, and new expressions.

Not a vague "sort of the same person" LoRA.

Not a pretty sample grid that falls apart the moment you test it properly.

This guide is for building a Z-Image character LoRA training dataset that actually works — getting the image count, angle mix, captions, and training dose right from the start.

By the end, you will know:

  • how many images you really need for a Z-Image character LoRA
  • how to distribute angles, crops, and expressions
  • when captions help and when they make likeness worse
  • how to think about steps and training dose without guessing
  • how to run a smoke test before spending more GPU time
For the base workflow itself, see the main Z-Image Base LoRA training guide.

Table of contents

  • 1. What a strong Z-Image character LoRA training dataset must teach
  • 2. How many images for a Z-Image character LoRA training dataset?
  • 3. Best crop and angle mix for Z-Image character likeness
  • 4. Best caption style for a Z-Image character LoRA
  • 5. How many training steps does a Z-Image character LoRA need?
  • 6. Best smoke-test workflow before a full run
  • 7. Why Z-Image likeness becomes unstable
  • 8. Bottom line

1. What a strong Z-Image character LoRA training dataset must teach

A good Z-Image character LoRA should learn:

  • who the person is
  • what should remain stable across prompts
  • what is allowed to vary

That is why the dataset matters so much.

If your data is too repetitive, the LoRA overfits.

If your data is too chaotic, the likeness gets weak.

If your captions are too noisy, the identity signal gets diluted.

This is especially important because Z-Image can learn strongly from a relatively small dataset. That is useful, but it also means bad dataset decisions show up quickly.


2. How many images for a Z-Image character LoRA training dataset?

A practical starting band is:

  • 15-30 images for a focused character LoRA
  • 20-40 images if you want more robustness across prompts

You do not need hundreds of images to begin.

What matters more is whether the images cover the identity clearly.

What to optimize for

Prefer:

  • clean, high-quality images
  • visible face detail
  • meaningful variation in angle and lighting
  • low redundancy

Avoid:

  • 20 near-identical selfies
  • blurry or compression-damaged images
  • giant datasets full of weak duplicates

Quality beats raw quantity for this task.


3. Best crop and angle mix for Z-Image character likeness

If your goal is strong character consistency, your Z-Image character LoRA training dataset should not be all one crop type.

A strong practical mix

  • 40-60% close-up or head-and-shoulders shots
  • 25-40% medium shots
  • 10-20% full-body or wider shots

Why this works:

  • close-ups teach facial identity
  • medium shots help pose and clothing generalization
  • a small amount of wide framing prevents the LoRA from becoming "face only"

Angle coverage

Aim to include:

  • front view
  • three-quarter view
  • side view
  • different expressions

If all images are front-facing, the LoRA often weakens badly on profile or expressive prompts.

Background strategy

Do not make the background the only thing that changes.

You want enough background variety that the model learns the person, but not so much chaos that the subject signal becomes weak.


4. Best caption style for a Z-Image character LoRA

Captions should help the model separate:

  • what is the identity
  • what is clothing, expression, lighting, or pose

Keep captions short and consistent

A good starting pattern:

  • use a unique trigger word
  • keep captions short
  • describe only the variables you want to remain controllable

Examples:

  • photo of [trigger], smiling, red jacket
  • photo of [trigger], side view, studio lighting

Do not write essay captions unless you have a strong reason

Long captions often create more noise than value for character likeness.

Caption the changing parts

If expression, outfit, or environment should remain flexible, caption those.

That helps the trigger absorb the stable identity while the captions absorb the variables.


5. How many training steps does a Z-Image character LoRA need?

Do not choose steps by vibes.

The better way is to think in terms of training dose per image.

A good working band for character training is:

  • 50-100 effective repeats per image

That is not a law, but it is a useful frame.

Practical starting point

For a 20-40 image character dataset:

  • run a short smoke test first
  • then plan a fuller run in the 2000-4000 step band

What matters most is what the previews show:

  • if likeness is still weak, you may need more dose
  • if outputs start looking "fried," too rigid, or always the same, you may have gone too far

Base vs Turbo reminder

If you are training on Z-Image Base, evaluate with Base-style sampling.

Do not judge a Base LoRA at Turbo-like settings.


6. Best smoke-test workflow before a full run

This is one of the best ways to save time.

Smoke-test recipe

  1. Use a smaller dataset subset or the full dataset at conservative settings.
  2. Train for a short run, roughly 1000-1500 steps.
  3. Evaluate with fixed prompts and fixed seed.
  4. Decide whether the dataset logic is working before scaling up.

What you are checking

  • Is the identity starting to appear?
  • Does the LoRA still respond to prompts?
  • Are expressions, clothing, and angles still flexible?
  • Are the previews improving or becoming more rigid?

This is much better than committing immediately to a long run without knowing whether your Z-Image character LoRA training dataset is actually working.


7. Why Z-Image likeness becomes unstable

7.1 Too many duplicates

The LoRA learns one angle too hard and stops generalizing.

7.2 Too many full-body images

Wide shots are useful, but if they dominate your Z-Image character LoRA training dataset, face quality usually suffers.

7.3 Captions are too long or inconsistent

This weakens the identity signal and adds noise.

7.4 You changed too many things at once

If you change:

  • dataset
  • captions
  • steps
  • sampling
  • rank

all together, it becomes hard to diagnose why likeness is weak.

7.5 You are evaluating the wrong way

This is especially important on Z-Image Base.

If you preview with the wrong sampling assumptions, you can think the LoRA is worse than it really is.


8. Bottom line

A strong Z-Image character LoRA dataset is not about maximizing image count.

It is about:

  • enough images to cover identity
  • enough angle variety to survive prompt changes
  • enough caption discipline to keep identity strong
  • enough training dose to lock likeness without frying the LoRA

That is the real job of this page.

You are not trying to train a more general model.

You are trying to produce a character LoRA you can keep using in real work on top of Z-Image.
