Step1X-Edit | AI Image Editing Tool

Workflow Name: RunComfy/Step1X-Edit

Workflow ID: 0000...1221

Step1X-Edit is an image editing model that takes a reference image and a written instruction and produces a precisely edited output. This unified framework combines the semantic reasoning of Multimodal Large Language Models with a diffusion architecture, supporting 11 editing operations including subject addition/removal, style transfer, and text modification. Provide your image and describe the changes you want; Step1X-Edit aims for results comparable to leading proprietary models.

This ComfyUI Step1X-Edit Workflow was created by the Step1X-Image Team at StepFun. All credit goes to their innovative work in developing this powerful image editing framework!

ComfyUI Step1X-Edit Description

1. What is Step1X-Edit?

Step1X-Edit is an advanced image editing model developed by StepFun AI that aims to match the performance of closed-source models such as GPT-4o and Gemini 2 Flash. The Step1X-Edit framework combines the semantic reasoning capabilities of Multimodal Large Language Models (MLLM) with a Diffusion in Transformer (DiT) architecture to deliver high-quality instruction-based image editing.

Step1X-Edit excels at understanding natural language instructions and applying precise edits while maintaining image fidelity. The Step1X-Edit model was trained on over 1 million high-quality instruction-image pairs covering 11 distinct editing categories, making it extraordinarily versatile for various editing tasks.

2. Benefits of ComfyUI Step1X-Edit

Great Instruction Understanding: Step1X-Edit leverages MLLM technology to comprehend complex editing requests with nuanced understanding of both text and visual content.
Comprehensive Editing Capabilities: Step1X-Edit handles 11 different editing categories including subject addition/removal, background changes, color alterations, material modifications, motion changes, and more.
High Fidelity Results: Step1X-Edit maintains a good balance between reference image reconstruction and editing prompt following, preserving image quality.
Simplified Workflow: Edits are driven by the instruction alone, so no masks are needed during the editing process.
Open Source Alternative: Step1X-Edit provides comparable results to proprietary models while being fully open-source.

3. Quick Start Guide

3.1 System Requirements

Step1X-Edit is a resource-intensive model that performs best with:

VRAM: Recommended 80GB for optimal performance at 1024×1024 resolution
Note: RunComfy's cloud GPU service provides all the necessary computational power for Step1X-Edit without any installation required. Simply select a machine with sufficient VRAM from the options available.

3.2 Workflow Options

Step1X-Edit offers two primary workflow configurations:

Regular Workflow (Non-Real Person Version)

Best for: General purpose editing of objects, scenes, and non-human subjects with Step1X-Edit
Characteristics:
- Simple 3-step process: Load Image → Edit with Step1X-Edit → Save Result
- Excellent performance for text modification, subject addition/removal, style transfers, background changes, etc.
- Direct editing without additional face processing

Real Person Workflow (Extended Version)

Best for: Editing images containing human faces where facial identity preservation is crucial
Characteristics:
- Combines Step1X-Edit with additional face consistency preservation
- Uses Face Bounding Box and simple person description to enhance identity preservation
- Preserves identity features better than the standard Step1X-Edit workflow

3.3 Parameter Reference

Main Step1X-Edit Node Parameters:

cfg: Guidance scale, typically around 6.0 (higher = more adherence to prompt)
size_level: Controls output resolution (512, 768, or 1024)
num_steps: Number of diffusion steps (typically 20-31)
mllm_model: The vision language model (default: Qwen2.5-VL-7B-Instruct)
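
The parameter ranges above can be captured in a small settings object that flags unusual values before a run. This is only an illustrative sketch: the `EditSettings` class, its `problems` method, and the chosen defaults for `size_level` and `num_steps` are assumptions made for this example, not part of the Step1X-Edit node itself.

```python
from dataclasses import dataclass

# Resolutions listed in the parameter reference.
ALLOWED_SIZE_LEVELS = (512, 768, 1024)


@dataclass(frozen=True)
class EditSettings:
    """Hypothetical container mirroring the main node parameters."""
    cfg: float = 6.0            # guidance scale; ~6.0 is the suggested value
    size_level: int = 512       # assumed default: the smaller "testing" size
    num_steps: int = 20         # assumed default: low end of the 20-31 range
    mllm_model: str = "Qwen2.5-VL-7B-Instruct"

    def problems(self) -> list:
        """Return warnings as strings; an empty list means nothing looks off."""
        issues = []
        if self.size_level not in ALLOWED_SIZE_LEVELS:
            issues.append(f"size_level {self.size_level} not in {ALLOWED_SIZE_LEVELS}")
        if self.cfg <= 0:
            issues.append("cfg should be positive")
        if not 20 <= self.num_steps <= 31:
            issues.append(f"num_steps {self.num_steps} is outside the typical 20-31 range")
        return issues


draft = EditSettings()                                  # quick test settings
final = EditSettings(size_level=1024, num_steps=31)     # slower, higher quality
print(draft.problems(), final.problems())
```

Values outside the typical range are reported rather than rejected, since the ranges in this guide are recommendations and not hard limits.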

For Real Person Workflow Additional Parameters:

Face Bounding Box Node (from FaceAnalysis):
- Index: Selects which detected face(s) to use
  - -1: Detect all faces (default)
  - 0: Select largest face only
  - 1: Select second-largest face
  - When an image contains multiple faces, verify that the chosen index matches the intended person
- padding: Additional space around face (default: 0)
- padding_percent: Percentage-based padding (default: 0.30)
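
To make the index and padding settings concrete, here is a rough sketch of how such a selection could work. It is not the FaceAnalysis node's actual code: the function names, the `(x, y, width, height)` box format, and the rule that `padding` and `padding_percent` are summed on each side are all assumptions made for illustration.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # assumed format: (x, y, width, height)


def box_area(box: Box) -> int:
    return box[2] * box[3]


def select_faces(boxes: List[Box], index: int = -1) -> List[Box]:
    """-1 keeps every face; 0 is the largest, 1 the second-largest, and so on."""
    if index == -1:
        return list(boxes)
    ranked = sorted(boxes, key=box_area, reverse=True)
    return [ranked[index]] if 0 <= index < len(ranked) else []


def pad_box(box: Box, image_size: Tuple[int, int],
            padding: int = 0, padding_percent: float = 0.30) -> Box:
    """Grow a box on every side, clamped to the image bounds.

    Assumption: fixed pixel padding and percentage padding are added together.
    """
    x, y, w, h = box
    img_w, img_h = image_size
    pad_x = padding + round(w * padding_percent)
    pad_y = padding + round(h * padding_percent)
    left, top = max(0, x - pad_x), max(0, y - pad_y)
    right, bottom = min(img_w, x + w + pad_x), min(img_h, y + h + pad_y)
    return (left, top, right - left, bottom - top)


faces = [(10, 10, 20, 20), (100, 100, 50, 50)]
largest = select_faces(faces, index=0)[0]
print(pad_box(largest, image_size=(1024, 1024)))
```

With the default `padding_percent` of 0.30, a 50-pixel-wide face gains 15 pixels on each side, which gives the model some surrounding context such as hair and jawline.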

3.4 Editing Task Categories

Step1X-Edit has been specifically optimized for these 11 editing categories:

Subject Addition: Add new objects or people to a scene
Subject Removal: Remove unwanted elements from an image
Subject Replacement: Swap one object for another
Background Change: Modify or replace the background while preserving foreground elements
Color Alteration: Change specific colors within the image
Material Modification: Transform the material properties of objects (e.g., glass to metal)
Motion Change: Alter the position or pose of subjects
Portrait Beautification: Enhance or modify portraits with natural improvements
Style Transfer: Apply artistic styles to images
Text Modification: Edit or replace text within images
Tone Transformation: Adjust overall image tone, lighting, or atmosphere

3.5 Step-by-Step Usage Guide

Regular Workflow (Non-Real Person Version)

Upload Your Image using the Load Image node
Enter Your Editing Instructions in the Step1X-Edit Node
Adjust Parameters if needed:
- cfg: 6.0 is a good default for Step1X-Edit
- size_level: 512 for testing, 1024 for final results
- num_steps: 20-31 (more steps = better quality but slower)
Click Run to process your edit with Step1X-Edit
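
For readers who prefer scripting over clicking, the same adjustments can be made on a workflow exported in ComfyUI's API format, where each node is an entry with a `class_type` and an `inputs` mapping. The sketch below only edits such a dictionary in memory. The node ids, the `Step1XEditNode` class name, and the `prompt` input name are placeholders assumed for this example; check the exported workflow for the real names.

```python
import copy
import json


def set_node_inputs(workflow: dict, class_type: str, **inputs) -> dict:
    """Return a copy of the workflow with inputs updated on nodes of class_type."""
    patched = copy.deepcopy(workflow)
    matched = False
    for node in patched.values():
        if node.get("class_type") == class_type:
            node.setdefault("inputs", {}).update(inputs)
            matched = True
    if not matched:
        raise KeyError(f"no node with class_type {class_type!r}")
    return patched


# Toy three-node workflow: load, edit, save. Names are illustrative only.
workflow = {
    "1": {"class_type": "LoadImage", "inputs": {"image": "input.png"}},
    "2": {"class_type": "Step1XEditNode",  # hypothetical class name
          "inputs": {"image": ["1", 0], "prompt": "", "cfg": 6.0,
                     "size_level": 512, "num_steps": 20}},
    "3": {"class_type": "SaveImage", "inputs": {"images": ["2", 0]}},
}

final_run = set_node_inputs(
    workflow, "Step1XEditNode",
    prompt="Replace the text on the sign with 'OPEN'",
    size_level=1024,
)
print(json.dumps(final_run["2"]["inputs"], indent=2))
```

Working on a copy keeps the 512-pixel test configuration intact, so the same base workflow can be reused for both quick previews and final 1024-pixel renders.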

Real Person Workflow (Face Editing)

Upload Your Image using the Load Image node
Enter a Simple Person Description in the CR Prompt Text node
- Just use basic terms like "young woman" or "man"
- This helps the Step1X-Edit model understand who's in the image
Enter Your Editing Instructions in the Step1X-Edit Node
- Be specific about what you want to change about the person
Adjust Parameters if needed:
- Same as regular workflow, plus face detection settings if needed
Click Run to process your edit with Step1X-Edit
View and Download the result

3.6 Tips for Best Results

Clear Instructions: Be specific and concise in your Step1X-Edit prompts
Size Considerations: Larger sizes (1024) produce better quality but take longer to process
Face Handling: Use the Real Person workflow when editing human faces with Step1X-Edit
Multiple Edits: For complex edits, consider breaking them down into separate steps
Workflow Selection: Choose the appropriate Step1X-Edit workflow based on your subject matter
Machine Selection: Opt for 2X Large (80GB VRAM) or 2XL Plus (80GB VRAM) for optimal Step1X-Edit performance
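
The "Multiple Edits" tip amounts to feeding each result back in as the next input. A minimal sketch of that loop follows, assuming a caller-supplied `edit_fn` that performs one edit; both `apply_edits` and `fake_edit` are hypothetical names, and `fake_edit` merely records instructions as text so the example runs without a model.

```python
from typing import Callable, List, Sequence, TypeVar

Image = TypeVar("Image")


def apply_edits(image: Image, instructions: Sequence[str],
                edit_fn: Callable[[Image, str], Image]) -> List[Image]:
    """Apply instructions one at a time, passing each output to the next step.

    Returns every intermediate result so a bad step can be inspected or redone.
    """
    results = []
    current = image
    for instruction in instructions:
        current = edit_fn(current, instruction)
        results.append(current)
    return results


def fake_edit(image: str, instruction: str) -> str:
    # Stand-in for a real edit call; it just appends the instruction.
    return f"{image} -> [{instruction}]"


steps = apply_edits(
    "photo.png",
    ["remove the lamp", "change the wall color to blue"],
    fake_edit,
)
print(steps[-1])
```

Keeping the intermediate results makes it easy to spot which instruction caused an unwanted change and rerun only from that point.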

4. Acknowledgements

This implementation is based on the Step1X-Edit model developed by the StepFun AI team. The ComfyUI integration of Step1X-Edit makes this technology accessible within the ComfyUI environment.

RunComfy has integrated the Step1X-Edit technology into an easy-to-use cloud workflow, making advanced AI image editing accessible without the need for local installation or high-end hardware.

Sincere thanks to the original authors and the ComfyUI integration developer for making this tool available to the community.
