FLUX.2 LoRA ComfyUI Inference: training-matched AI Toolkit LoRA output with the FLUX.2 Dev pipeline
This production-ready RunComfy workflow runs FLUX.2 Dev LoRA inference in ComfyUI through the RC FLUX.2 Dev (Flux2Pipeline) node, giving pipeline-level alignment rather than a generic sampler graph. RunComfy built and open-sourced this custom node (see the runcomfy-com repositories), and you control adapter application with `lora_path` and `lora_scale`.
Note: This workflow requires a 3XL machine to run.
Why FLUX.2 LoRA ComfyUI Inference often looks different in ComfyUI
AI Toolkit training previews are rendered through a model-specific FLUX.2 pipeline, where text encoding, scheduling, and LoRA injection are designed to work together. In ComfyUI, rebuilding FLUX.2 with a different graph (or a different LoRA loader path) can change those interactions, so copying the same prompt, steps, CFG, and seed still produces visible drift. The RunComfy RC pipeline nodes close that gap by executing FLUX.2 end-to-end in Flux2Pipeline and applying your LoRA inside that pipeline, keeping inference aligned with preview behavior. Source: RunComfy open-source repositories.
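To make the alignment concrete, here is a minimal sketch of what pipeline-level LoRA inference looks like in plain diffusers code. This is illustrative only, not the RC node's actual source, and it assumes a recent diffusers build that ships Flux2Pipeline with the standard load_lora_weights() mixin:

```python
# Minimal sketch of pipeline-level LoRA inference (illustrative only, not the
# RC node's source). Assumes a recent diffusers build that ships Flux2Pipeline
# and the standard load_lora_weights() LoRA mixin.
import torch
from diffusers import Flux2Pipeline

pipe = Flux2Pipeline.from_pretrained(
    "black-forest-labs/FLUX.2-dev",
    torch_dtype=torch.bfloat16,  # keep LoRA math in BF16 (see Troubleshooting)
).to("cuda")

# The adapter is applied inside the same pipeline that handles text encoding
# and scheduling -- the behavior the RC node's lora_path / lora_scale wraps.
pipe.load_lora_weights("path/to/my_flux2_lora.safetensors")

image = pipe(
    prompt="trigger_word, a portrait photo",
    num_inference_steps=25,
    guidance_scale=4.0,
    generator=torch.Generator("cuda").manual_seed(42),
).images[0]
image.save("preview_match.png")
```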
How to use the FLUX.2 LoRA ComfyUI Inference workflow
Step 1: Get the LoRA path and load it into the workflow (2 options)
⚠️ Important · FLUX.2 access & Hugging Face token required
FLUX.2 Dev models require explicit access authorization on Hugging Face.
Before running this workflow:
- Make sure your Hugging Face account has been granted access to FLUX.2 (Dev)
- Create a Hugging Face access token
- Paste your token into the `hf_token` field on the Load Pipeline node
Without a valid token and proper model access, the workflow will not run. For step-by-step instructions, see Hugging Face token for FLUX.2.
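If you want to confirm the token actually has FLUX.2 access before pasting it into the node, a quick sanity check with huggingface_hub looks roughly like this (the token string is a placeholder):

```python
# Illustrative pre-flight check: can this token see the gated FLUX.2 Dev repo?
from huggingface_hub import model_info
from huggingface_hub.utils import GatedRepoError, HfHubHTTPError

try:
    model_info("black-forest-labs/FLUX.2-dev", token="hf_xxx")  # your token here
    print("Token OK: FLUX.2 Dev access is granted.")
except GatedRepoError:
    print("No access yet: request access on the FLUX.2 Dev model page first.")
except HfHubHTTPError as err:
    print(f"Token problem: {err}")
```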
Option A — RunComfy training result → download to local ComfyUI:
- Go to Trainer → LoRA Assets
- Find the LoRA you want to use
- Click the ⋮ (three-dot) menu on the right → select Copy LoRA Link
- In the ComfyUI workflow page, paste the copied link into the Download input field at the top-right corner of the UI
- Before clicking Download, make sure the target folder is set to ComfyUI > models > loras
- Click Download; this ensures the LoRA file is saved into the correct `models/loras` directory
- After the download finishes, refresh the page
- The LoRA now appears in the LoRA select dropdown in the workflow — select it

Option B — Direct LoRA URL (overrides Option A):
- Paste the direct `.safetensors` download URL into the `path / url` input field of the LoRA node
- When a URL is provided here, it overrides Option A: the workflow loads the LoRA directly from the URL at runtime
- No local download or file placement is required
Tip: confirm the URL resolves to the actual .safetensors file (not a landing page or redirect).
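One hedged way to check this, outside the workflow, is to fetch the first bytes and the content type; the URL below is a placeholder:

```python
# Illustrative check that a LoRA URL serves raw .safetensors bytes rather than
# an HTML landing page (a common cause of Option B load failures).
import requests

url = "https://example.com/my_flux2_lora.safetensors"  # placeholder URL
resp = requests.get(url, stream=True, allow_redirects=True, timeout=30)
resp.raise_for_status()

content_type = resp.headers.get("content-type", "")
first_chunk = next(resp.iter_content(chunk_size=16))
# A safetensors file starts with a little-endian header length; an HTML page
# starts with "<". Either signal below means this is not the direct file link.
if first_chunk[:1] == b"<" or "text/html" in content_type:
    print("Looks like a web page, not a .safetensors file.")
else:
    print(f"OK: {content_type or 'binary'} payload, final URL {resp.url}")
```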

Step 2: Match inference parameters with your training sample settings
In the LoRA node, select your adapter in lora_path (Option A), or paste a direct .safetensors link into path / url (Option B overrides the dropdown). Then set lora_scale to the same strength you used during training previews and adjust from there.
Remaining parameters are on the Generate node (and, depending on the graph, the Load Pipeline node):
- `prompt`: your text prompt (include trigger words if you trained with them)
- `width` / `height`: output resolution; match your training preview size for the cleanest comparison (multiples of 16 are recommended for FLUX.2)
- `sample_steps`: number of inference steps (25 is a common default)
- `guidance_scale`: CFG/guidance value (4.0 is a common default)
- `seed`: fixed seed to reproduce; change it to explore variations
- `seed_mode` (only if present): choose `fixed` or `randomize`
- `negative_prompt` (only if present): FLUX.2 is guidance-distilled in this workflow, so negative prompts are ignored
- `hf_token`: Hugging Face access token; required for the FLUX.2 Dev model download (paste it on the Load Pipeline node)
Training alignment tip: if you customized sampling values during training (seed, guidance_scale, sample_steps, trigger words, resolution), mirror those exact values here. If you trained on RunComfy, open Trainer → LoRA Assets → Config to view the resolved YAML and copy preview/sample settings into the workflow nodes.
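As a worked illustration, the snippet below mirrors resolved sample values onto the Generate node fields. The values are hypothetical stand-ins, and the field names follow this guide rather than an official API:

```python
# Hypothetical sample settings copied from Trainer → LoRA Assets → Config.
sample = {
    "prompt": "trigger_word, portrait photo",  # include your trigger words
    "width": 1024,                             # multiples of 16 for FLUX.2
    "height": 1024,
    "sample_steps": 25,
    "guidance_scale": 4.0,
    "seed": 42,                                # keep fixed while comparing
}

assert sample["width"] % 16 == 0 and sample["height"] % 16 == 0, \
    "FLUX.2 resolutions should be multiples of 16"

# Mirror these values 1:1 onto the Generate node before queueing.
for field, value in sample.items():
    print(f"Generate.{field} = {value}")
```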

Step 3: Run FLUX.2 LoRA ComfyUI Inference
Click Queue/Run — the SaveImage node writes results to your ComfyUI output folder.
Quick checklist:
- ✓ LoRA is either downloaded into `ComfyUI/models/loras` (Option A) or loaded via a direct `.safetensors` URL (Option B)
- ✓ Page refreshed after local download (Option A only)
- ✓ Inference parameters match your training `sample` config (if customized)
If everything above is correct, the inference results here should closely match your training previews.
Troubleshooting FLUX.2 LoRA ComfyUI Inference
Most FLUX.2 “training preview vs ComfyUI inference” gaps come from pipeline-level differences (how the model is loaded, scheduled, and how the LoRA is merged), not from a single wrong knob. This RunComfy workflow restores the closest “training-matched” baseline by running inference through RC FLUX.2 Dev (Flux2Pipeline) end-to-end and applying your LoRA inside that pipeline via lora_path / lora_scale (instead of stacking generic loader/sampler nodes).
(1) FLUX.2 with LoRA error: "mul_cuda" not implemented for 'Float8_e4m3fn'
Why this happens: This typically occurs when FLUX.2 is loaded with Float8/FP8 weights (or mixed-precision quantization) and the LoRA is applied through a generic ComfyUI LoRA path. The LoRA merge can force unsupported Float8 ops (or mixed Float8 + BF16 promotions), which triggers the `mul_cuda` Float8 runtime error.
How to fix (recommended)
- Run inference through RC FLUX.2 Dev (Flux2Pipeline) and load the adapter only via `lora_path` / `lora_scale`, so the LoRA merge happens in the AI Toolkit–aligned pipeline, not via a generic LoRA loader stacked on top.
- If you are debugging in a non-RC graph: avoid applying a LoRA on top of Float8/FP8 diffusion weights. Use a BF16/FP16-compatible loading path for FLUX.2 before adding the LoRA; a minimal dtype guard is sketched below.
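For the non-RC debugging case, the guard can be as simple as checking weight dtypes before the merge. This is a sketch that assumes a diffusers-style pipeline exposing a .transformer module:

```python
# Illustrative guard for the Float8 "mul_cuda" error: refuse to apply a LoRA
# on top of FP8 diffusion weights.
import torch

def assert_lora_safe_dtype(pipe) -> None:
    dtypes = {p.dtype for p in pipe.transformer.parameters()}
    fp8 = dtypes & {torch.float8_e4m3fn, torch.float8_e5m2}
    if fp8:
        raise RuntimeError(
            f"Transformer holds {fp8} weights; reload FLUX.2 in BF16/FP16 "
            "(or use the RC Flux2Pipeline path) before adding the LoRA."
        )

# assert_lora_safe_dtype(pipe)  # call before pipe.load_lora_weights(...)
```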
(2) LoRA shape mismatches should fail fast instead of corrupting GPU state and causing OOM/system instability
Why this happens: This is almost always a base mismatch; the LoRA was trained for a different model family (for example, FLUX.1) but is being applied to FLUX.2 Dev. You'll often see many `lora key not loaded` lines and then shape mismatches; in the worst case the session can become unstable and end in OOMs.
How to fix (recommended)
- Make sure the LoRA was trained specifically for `black-forest-labs/FLUX.2-dev` with AI Toolkit (FLUX.1 / FLUX.2 / Klein variants are not interchangeable); a fail-fast file check is sketched after this list.
- Keep the graph "single-path" for LoRA: load the adapter only through the workflow's `lora_path` input and let Flux2Pipeline handle the merge. Don't stack an additional generic LoRA loader in parallel.
- If you already hit a mismatch and ComfyUI starts producing unrelated CUDA/OOM errors afterward, restart the ComfyUI process to fully reset the GPU and model state, then retry with a compatible LoRA.
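The fail-fast check mentioned above can be as simple as inspecting the adapter file before anything touches the GPU. The key prefixes below are an assumption based on common diffusers/AI Toolkit naming, so treat a pass as necessary rather than sufficient:

```python
# Coarse pre-load inspection of a LoRA file; a real compatibility check would
# also compare tensor shapes against the loaded FLUX.2 base model.
from safetensors import safe_open

def inspect_lora(path: str) -> bool:
    with safe_open(path, framework="pt") as f:
        keys = list(f.keys())
    # Diffusers/AI Toolkit FLUX LoRAs typically target the transformer blocks.
    hits = sum(k.startswith(("transformer.", "diffusion_model.")) for k in keys)
    print(f"{len(keys)} tensors, {hits} transformer-prefixed keys")
    return hits > 0

# if not inspect_lora("my_lora.safetensors"): refuse to load it
```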
(3) FLUX.2 Dev: Using LoRAs more than doubles inference time
Why this happens: A LoRA can make FLUX.2 Dev much slower when the LoRA path forces extra patching/dequantization work or applies weights through a slower code path than the base model alone.
How to fix (recommended)
- Use this workflow's RC FLUX.2 Dev (Flux2Pipeline) path and pass your adapter through `lora_path` / `lora_scale`. In this setup, the LoRA is merged once during pipeline load (AI Toolkit-style), so the per-step sampling cost stays close to the base model; the fuse-once pattern is sketched below.
- When you're chasing preview-matching behavior, avoid stacking multiple LoRA loaders or mixing loader paths. Keep it to one `lora_path` + one `lora_scale` until the baseline matches.
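The "merge once" idea maps onto the standard diffusers LoRA mixin roughly like this (a sketch, assuming the FLUX.2 pipeline supports fuse_lora(); pipe is a loaded pipeline as in the earlier example):

```python
# Illustrative one-time merge: fusing the adapter into the base weights keeps
# the per-step sampling cost close to the base model.
pipe.load_lora_weights("my_flux2_lora.safetensors")
pipe.fuse_lora(lora_scale=0.8)  # bake the strength in once, like lora_scale
# ... generate many images at near base-model per-step cost ...
# pipe.unfuse_lora()            # undo the merge before swapping adapters
```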
Note: In this FLUX.2 Dev workflow, FLUX.2 is guidance-distilled, so `negative_prompt` may be ignored by the pipeline even if a UI field exists. Match previews using prompt wording + `guidance_scale` + `lora_scale` first.
Run FLUX.2 LoRA ComfyUI Inference now
Open the workflow, set lora_path, and click Queue/Run to get FLUX.2 Dev LoRA results that stay close to your AI Toolkit training previews.




