
GPT Image 1.5: Identity-Preserving Image Editing & Generation on playground and API | RunComfy

openai/gpt-image-1-5/image-to-image

Generate and edit realistic images from text or photos with 4x faster renders, precise multi-step edits, consistent lighting, and accurate small-text for design, e-commerce, and marketing visuals.

The rate is $0.009 per image for low quality, $0.034 per image for medium quality, and $0.133 per image for high quality.
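As a rough illustration of how these per-image rates add up across a batch, the sketch below totals the cost for a mix of quality levels. The function name `estimate_cost` is hypothetical, and the sketch assumes the rates above apply per output image with no other fees.

```python
from decimal import Decimal

# Per-image rates in USD, copied from the pricing line above.
RATE_PER_IMAGE = {
    "low": Decimal("0.009"),
    "medium": Decimal("0.034"),
    "high": Decimal("0.133"),
}


def estimate_cost(images_by_quality):
    """Return the total USD cost for a mapping of quality -> image count."""
    total = Decimal("0")
    for quality, count in images_by_quality.items():
        if quality not in RATE_PER_IMAGE:
            raise ValueError(f"unknown quality: {quality!r}")
        if count < 0:
            raise ValueError("image count must be non-negative")
        total += RATE_PER_IMAGE[quality] * count
    return total


# Example: 20 low-quality drafts followed by 3 high-quality finals.
print(estimate_cost({"low": 20, "high": 3}))  # prints 0.579
```

Decimal arithmetic is used so that small per-image amounts sum exactly instead of accumulating floating-point error.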

Introduction to GPT Image 1.5 Capabilities

OpenAI's GPT Image 1.5 generates and edits images from text and existing photos. Pricing starts at $0.009 per image, renders are up to 4x faster than GPT Image 1, and outputs default to 1024x1024. The model handles precise multi-step edits and renders small text faithfully. Instead of manual masking and round-trips between apps, it applies context-aware, identity-preserving transformations with add, remove, and combine controls, keeping lighting, composition, and text consistent across edits. It is built for e-commerce teams, designers, marketers, and enterprise content pipelines. For developers, GPT Image 1.5 on RunComfy can be used both in the browser and via an HTTP API, so you don't need to host or scale the model yourself.
Ideal for: Identity-Consistent Product Photography Edits | Photorealistic Try-On and Style Variations | Campaign-Ready Visuals with Accurate Text

Examples Created with GPT Image 1.5

Hyper-realistic winter portrait of a woman in a snow-covered fur-lined hood.
Silhouetted fisherman on a bridge at sunset near a mosque skyline, reflecting soft golden tones, inspired by GPT Image 1.5.
Overhead view of a vintage typewriter, journal, pencil, and coffee cup floating on fabric in a pool, crafted using GPT Image 1.5.
Two men wearing Santa hats sharing Coca-Cola in a festive setting, created using GPT Image 1.5.
Golden retriever, tabby cat, and pet rat snuggled on a couch watching TV, depicted using GPT Image 1.5 Image model.
Man reading a newspaper featuring a comparison between a humanoid robot and a banana with a computer chip, highlighting GPT Image 1.5 model.


Model Overview


  • Provider: OpenAI
  • Task: image-to-image
  • Max Resolution/Duration: Up to 1536×1024 (or 1024×1536); default 1024×1024
  • Summary: GPT Image 1.5 is a high-fidelity image-to-image model built for precise, multi-step edits and fast iteration. It preserves lighting, composition, and subject likeness while following detailed instructions, and it renders small, dense text with greater clarity. GPT Image 1.5 is optimized for production workflows in design, e-commerce, and marketing that demand consistent visual quality and speed.

Key Capabilities


Precision edits that preserve lighting, composition, and likeness

  • GPT Image 1.5 performs targeted transformations on existing images (style changes, add/remove elements, apparel/hairstyle adjustments) while maintaining scene lighting, camera composition, and subject identity.
  • Results stay consistent across iterative edits, enabling reliable multi-step workflows without degradation.

4× faster generation for iterative creative cycles

  • GPT Image 1.5 delivers up to four times faster renders than GPT Image 1, reducing turnaround for review-and-revise loops.
  • Faster sampling makes A/B exploration and fine control over edits practical at scale.

Stronger prompt adherence and clearer small text

  • GPT Image 1.5 follows complex, instruction-heavy prompts more reliably than prior GPT Image / DALL·E models.
  • It improves the rendering of small and dense text (labels, UI elements, packaging), critical for e-commerce and brand assets.

Input Parameters


Core Prompts

Parameter | Type | Default/Range | Description
prompt | string | default: "" | Required. Instruction text describing the generation or the edit to apply.
image_urls | array[string] | default: [] | One or more image URLs to use as sources or references for image-to-image edits.

Dimensions & Settings

Parameter | Type | Default/Range | Description
image_size | string (enum) | auto, 1024x1024, 1536x1024, 1024x1536 (default: auto) | Target aspect/size. Use auto to let the model choose; specify exact dimensions for square, landscape, or portrait.
background | string (enum) | auto, transparent, opaque (default: auto) | Background handling. Transparent enables export-ready assets; opaque keeps a solid background.
quality | string (enum) | low, medium, high (default: high) | Rendering quality/performance tradeoff. High emphasizes fidelity.
input_fidelity | string (enum) | low, high (default: high) | Degree to preserve content from the first input image; high maintains stronger likeness and layout.

Output & Delivery

Parameter | Type | Default/Range | Description
output_format | string (enum) | jpeg, png, webp (default: png) | Output file format. Use PNG for transparency, JPEG for smaller size, WEBP for modern compression.
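The parameters listed above can be collected into a single request body and checked before anything is sent. The sketch below is one way to do that in Python. The helper name `build_payload` is hypothetical, and the flat JSON layout is an assumption: the page lists the parameter names and allowed values but does not show how they are wrapped on the wire.

```python
# Allowed values and defaults, copied from the parameter tables above.
ALLOWED = {
    "image_size": {"auto", "1024x1024", "1536x1024", "1024x1536"},
    "background": {"auto", "transparent", "opaque"},
    "quality": {"low", "medium", "high"},
    "input_fidelity": {"low", "high"},
    "output_format": {"jpeg", "png", "webp"},
}
DEFAULTS = {
    "image_size": "auto",
    "background": "auto",
    "quality": "high",
    "input_fidelity": "high",
    "output_format": "png",
}


def build_payload(prompt, image_urls=(), **settings):
    """Merge settings over the documented defaults and validate each value."""
    if not prompt or not prompt.strip():
        raise ValueError("prompt is required")
    unknown = set(settings) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    payload = {"prompt": prompt, "image_urls": list(image_urls)}
    payload.update(DEFAULTS)
    payload.update(settings)
    for name, allowed in ALLOWED.items():
        if payload[name] not in allowed:
            raise ValueError(f"{name} must be one of {sorted(allowed)}")
    # JPEG has no alpha channel, so it cannot carry a transparent background.
    if payload["background"] == "transparent" and payload["output_format"] == "jpeg":
        raise ValueError("transparent background needs png or webp output")
    return payload


payload = build_payload(
    "Swap the jacket for a red raincoat",
    ["https://example.com/model.png"],  # placeholder URL
    quality="low",
)
```

Validating locally keeps typos such as an unsupported `image_size` from costing a round trip to the API.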

How GPT Image 1.5 compares to other models

  • Vs GPT Image 1 (gpt-image-1): Compared to GPT Image 1, GPT Image 1.5 delivers up to 4× faster generation, stronger adherence to complex prompts, clearer small text, and better preservation of lighting, composition, and likeness across edits. Ideal when iterative precision and turnaround speed are both critical.
  • Vs DALL·E 3: Compared to DALL·E 3, GPT Image 1.5 emphasizes instruction-following and image-to-image editing fidelity, maintaining scene integrity during object additions/removals and style shifts. Choose GPT Image 1.5 for multi-step edits that must retain identity and layout.
  • Vs Flux 2: Flux 2 can target very high native resolutions and local deployment scenarios, but GPT Image 1.5 focuses on end-to-end speed, consistent editing, realistic transformations (e.g., try-ons), and streamlined UI/API integration. Use GPT Image 1.5 when enterprise-ready workflows and fast, repeatable edits matter most.

Detailed Research: GPT Image 1.5 vs. Google Nano Banana Pro


0. Executive Takeaways


  • Both models are production-grade image generation systems, but they optimize for different workflows:

- GPT Image 1.5 focuses on strong instruction-following, fast iteration, and precise image editing (OpenAI claims up to 4× faster generation).

- Nano Banana Pro emphasizes studio-style control, higher output resolution (up to 4K), multi-reference composition (up to 14 images), and optional Search grounding for factual visuals.


  • Neither OpenAI nor Google publicly discloses full architecture details or parameter counts. What is available and reliable are their interfaces, modalities, limits, and workflow primitives.

  • LMArena human-preference tests rank GPT Image 1.5 #1 in Text-to-Image (as of Dec 16, 2025), with Nano Banana Pro close behind.

  • Microsoft Foundry benchmarks show GPT Image 1.5 outperforming Nano Banana Pro on prompt alignment and diagram/flowchart tasks.

  • Community feedback suggests: GPT Image 1.5 excels at prompt adherence and reference-image conditioning. Nano Banana Pro excels at design-heavy outputs (text-in-image, infographics) but can show occasional artifacts or style drift.



1. Research Methodology


1.1 Official Sources


  • OpenAI: ChatGPT Images release notes, Images API documentation, and prompting guides.
  • Google / DeepMind: Nano Banana Pro (Gemini 3 Pro Image) launch posts, Gemini API docs, and Google Cloud announcements.

1.2 Community Sources (Qualitative)


  • Reddit and X (Twitter) discussions focusing on generation quality, prompt control, and editing behavior.

1.3 Third-Party Benchmarks


  • LMArena (human preference leaderboards).
  • Microsoft Azure AI Foundry published benchmark tables.
  • Open benchmarks and research projects (GenExam, RISEBench), where applicable.



2. Official Technical Comparison


2.1 Model Naming & Release Context


  • OpenAI: gpt-image-1.5 (snapshot: gpt-image-1.5-2025-12-16), marketed in ChatGPT as ChatGPT Images.
  • Google: Nano Banana Pro, also referred to as Gemini 3 Pro Image or gemini-3-pro-image-preview.



2.2 Architecture & Parameter Disclosure


  • Neither model publicly discloses: (1) core generative architecture (e.g., diffusion vs. autoregressive internals); (2) training recipe; (3) parameter count.

  • GPT Image 1.5 is described as a natively multimodal language model capable of image generation and editing.
  • Nano Banana Pro is built on Gemini 3, integrating reasoning, real-world knowledge, and optional Search grounding.
  • Google applies SynthID watermarking to generated images for provenance.



2.3 Inputs, Outputs, and Limits


2.3.1 GPT Image 1.5 (OpenAI)


Limits & Formats

  • Images per request: 1–10
  • Edit inputs: up to 16 images, ≤50MB each
  • Supported formats: PNG, JPEG, WEBP
  • Output sizes: 1024×1024, 1536×1024, 1024×1536, auto
  • Prompt length: up to 32,000 characters
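The limits above lend themselves to a pre-flight check before an edit request is submitted. The following is a minimal sketch, not an official validator: the function name `check_edit_request` is hypothetical, "50MB" is read as 50 MiB (an assumption), and file type is judged by filename suffix only.

```python
from pathlib import PurePath

# Limits copied from the list above.
MAX_EDIT_IMAGES = 16
MAX_IMAGE_BYTES = 50 * 1024 * 1024  # assumption: "50MB" interpreted as 50 MiB
MAX_PROMPT_CHARS = 32_000
ALLOWED_SUFFIXES = {".png", ".jpg", ".jpeg", ".webp"}


def check_edit_request(prompt, images):
    """Return a list of problems; an empty list means the request looks valid.

    `images` is a list of (filename, size_in_bytes) tuples.
    """
    problems = []
    if len(prompt) > MAX_PROMPT_CHARS:
        problems.append(f"prompt has {len(prompt)} characters (max {MAX_PROMPT_CHARS})")
    if len(images) > MAX_EDIT_IMAGES:
        problems.append(f"{len(images)} input images (max {MAX_EDIT_IMAGES})")
    for name, size in images:
        if PurePath(name).suffix.lower() not in ALLOWED_SUFFIXES:
            problems.append(f"{name}: unsupported format")
        if size > MAX_IMAGE_BYTES:
            problems.append(f"{name}: {size} bytes exceeds the per-image limit")
    return problems


print(check_edit_request("Remove the lamp", [("room.png", 2_000_000)]))  # prints []
```

Returning every problem at once, rather than raising on the first, lets a caller report all issues with a batch in one pass.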

Workflow Characteristics

  • Strong preservation of lighting, composition, and subject identity during edits
  • Emphasis on fast iteration and controllable edits



2.3.2 Nano Banana Pro (Gemini 3 Pro Image)


Limits & Formats

  • Maximum resolution: up to 4K
  • Reference images: up to 14
  • Input formats: PNG, JPEG, WEBP, HEIC, HEIF
  • Inline image payload limit: <20MB (File API recommended for larger inputs)

Workflow Characteristics

  • Strong studio-style controls for layout, typography, and composition
  • Optional Search grounding for factual and real-world accuracy
  • SynthID watermarking applied to outputs



3. User Community Feedback


3.1 GPT Image 1.5


Common Praise

  • Strong prompt adherence
  • Reliable use of reference images
  • Predictable behavior during iterative edits

Common Criticism

  • Occasional fine-detail artifacts when zoomed in



3.2 Nano Banana Pro


Common Praise

  • Excellent text-in-image and infographic generation
  • Strong layout and design-oriented outputs

Common Criticism

  • Style fidelity issues when matching references
  • Occasional unexpected or inconsistent edits



3.3 Production Risk Notes


  • Public discussions highlight potential bias or stereotyping risks in certain Nano Banana Pro generations, which may be relevant for production pipelines.



4. Benchmarks & Comparative Evaluations


4.1 Human Preference (LMArena)


  • Text-to-Image: GPT Image 1.5 ranked #1; Nano Banana Pro ranked slightly lower.
  • Image Editing: GPT Image 1.5 marginally outperforms Nano Banana Pro.



4.2 Microsoft Foundry Benchmarks


  • Prompt Alignment: GPT Image 1.5 > Nano Banana Pro
  • Diagram / Flowchart Accuracy: GPT Image 1.5 slightly higher

These results are based on Microsoft’s internal datasets and evaluation criteria.




4.3 Open Benchmarks


  • GenExam and RISEBench evaluations show Nano Banana Pro performing strongly relative to earlier Gemini and GPT-Image-1 models.
  • These benchmarks do not yet directly evaluate GPT Image 1.5 and should be interpreted as contextual signals.



4.4 Metrics Availability


  • FID: No authoritative public FID comparison exists for these two proprietary models.
  • Prompt Adherence: Supported by Microsoft Foundry metrics and LMArena rankings.
  • Generation Speed: OpenAI and Microsoft report up to 4× faster generation for GPT Image 1.5; Google does not publish an equivalent speed multiplier.



5. Practical Selection Guide


Choose GPT Image 1.5 When:


  • Tight prompt adherence is critical
  • Fast iteration and precise edits are required
  • A simple, production-friendly Images API is preferred

Choose Nano Banana Pro When:


  • High-resolution (4K) output is required
  • Workflows involve typography, infographics, or UI-style visuals
  • Grounded, real-world knowledge improves output quality
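The selection criteria above can be encoded as a small helper. This is only a sketch: the `Needs` fields, the function name `suggest_model`, and the tallying rule are all assumptions made for illustration. The one hard rule, that a 4K requirement points to Nano Banana Pro, follows from the output sizes listed in section 2.3.

```python
from dataclasses import dataclass


@dataclass
class Needs:
    # Criteria from the "Choose GPT Image 1.5 When" list.
    tight_prompt_adherence: bool = False
    fast_iterative_edits: bool = False
    prefers_simple_images_api: bool = False
    # Criteria from the "Choose Nano Banana Pro When" list.
    needs_4k: bool = False
    typography_heavy: bool = False
    needs_search_grounding: bool = False


def suggest_model(needs):
    """Return a model suggestion by tallying which list matches more needs."""
    # Hard constraint: GPT Image 1.5 output sizes top out at 1536x1024.
    if needs.needs_4k:
        return "Nano Banana Pro"
    gpt_votes = sum([
        needs.tight_prompt_adherence,
        needs.fast_iterative_edits,
        needs.prefers_simple_images_api,
    ])
    nano_votes = sum([needs.typography_heavy, needs.needs_search_grounding])
    if gpt_votes > nano_votes:
        return "GPT Image 1.5"
    if nano_votes > gpt_votes:
        return "Nano Banana Pro"
    return "no clear preference"


print(suggest_model(Needs(fast_iterative_edits=True, tight_prompt_adherence=True)))
```

A simple vote count treats every criterion as equally important; a real pipeline would likely weight them and test both models on its own prompts.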



6. Licensing & Usage Notes


  • GPT Image 1.5: Proprietary; usage governed by OpenAI API and platform terms.
  • Nano Banana Pro: Proprietary; usage governed by Google Cloud / Gemini API terms; SynthID watermarking applied.


API Integration

  • Developers can integrate GPT Image 1.5 through the RunComfy API using standard HTTP requests. Send prompts plus optional image URLs, select size and quality, and receive rendered outputs in common formats. Integration is streamlined for both synchronous responses and typical job histories.
  • Note: API Endpoint for GPT Image 1.5
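To make the "standard HTTP requests" description concrete, the sketch below builds (but does not send) a POST request using only the Python standard library. Everything about the wire format here is an assumption: the endpoint URL is a placeholder that will not resolve, the Bearer-token header is a common convention rather than a documented RunComfy requirement, and the JSON field names simply reuse the parameter names from the Input Parameters section. Check the RunComfy API Docs for the actual endpoint and authentication scheme.

```python
import json
import urllib.request

# Hypothetical placeholder: the real endpoint URL is not given on this page.
ENDPOINT = "https://api.example.invalid/openai/gpt-image-1-5/image-to-image"


def make_request(api_key, prompt, image_urls, endpoint=ENDPOINT):
    """Build a POST request carrying the prompt, image URLs, and settings."""
    body = json.dumps({
        "prompt": prompt,
        "image_urls": list(image_urls),
        "image_size": "1024x1024",
        "quality": "medium",
        "output_format": "png",
    }).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        method="POST",
        headers={
            # Assumption: Bearer-token authentication.
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


req = make_request(
    "YOUR_API_KEY",
    "Replace the background with a plain studio backdrop",
    ["https://example.com/product.png"],  # placeholder URL
)
# Sending is left to the caller, e.g. urllib.request.urlopen(req).
```

Separating request construction from sending makes the payload easy to inspect and unit-test without network access.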

Official resources

  • Official Website: https://openai.com/blog/chatgpt-images-gpt-image-1-5
  • Official Documentation: https://platform.openai.com/docs/guides/images/image-generation
  • License: Proprietary (OpenAI Terms). Commercial use is permitted via the OpenAI API under applicable terms; some enterprise uses may require a separate agreement.

Explore Related Capabilities

  • If you need to generate images from scratch rather than edit an existing image, use the same model configured for text-to-image: GPT Image 1.5 Text to Image. It is optimized for prompt-driven creation while retaining the instruction-following strengths of GPT Image 1.5.

Related Playgrounds

nano-banana/pro/edit

Turn sketches into precise 2K-4K visuals with smart correction and seamless creative control.

flux-2/lora/edit

Refine images with adaptive style control, LoRA merging, and high-res rendering for consistent design output.

sam-3/image-to-image

Advanced concept-driven image editing with unified segmentation and detection for creators.

gemini-3-pro-image-preview/text-to-image

Create precise, consistent visuals with 4K detail and adaptive text-to-image rendering for design and production needs.

flux-2/max/edit

Precision visual editing tool for consistent, photorealistic brand assets

nano-banana/text-to-image

Seamlessly craft, edit, and fuse images for storytelling, branding, and beyond

Frequently Asked Questions

What are the main capabilities of GPT Image 1.5 in image-to-image generation?

GPT Image 1.5 can create original visuals from text or modify existing images using image-to-image workflows. It excels in preserving fine details, lighting, and texture across multiple edits, offering up to 4× faster generation compared to GPT Image 1. This makes it ideal for creative professionals who need consistency and realism in iterative edits.

How does GPT Image 1.5 differ from earlier models like GPT Image 1 in image-to-image editing?

Compared to GPT Image 1, GPT Image 1.5 introduces improved prompt adherence, more realistic lighting and composition, and richer texture handling in image-to-image transformations. It also provides smoother iterative editing and better text fidelity, which helps developers and technical artists retain visual consistency through complex editing workflows.

What technical limitations should developers know about when working with GPT Image 1.5 image-to-image generation?

GPT Image 1.5 outputs 1024×1024 by default and at most 1536×1024 (landscape) or 1024×1536 (portrait). Prompts can be up to 32,000 characters. An edit request accepts up to 16 input images of at most 50MB each, and the input_fidelity setting governs how closely the first input image is preserved. Developers who need larger outputs should consider alternate workflows.

Are there aspect ratio constraints or format restrictions in GPT Image 1.5 image-to-image outputs?

Yes. GPT Image 1.5 supports square (1024×1024, 1:1), landscape (1536×1024, 3:2), and portrait (1024×1536, 2:3) sizes, plus auto. Nonstandard aspect ratios are auto-cropped or padded. Supported formats are PNG, JPEG, and WEBP for both input and output in image-to-image editing sessions.
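When a source image does not match one of the sizes listed under Input Parameters, it can help to request the closest one explicitly rather than rely on cropping or padding. The sketch below picks the supported `image_size` whose aspect ratio is nearest to the source. The function name `pick_image_size` is hypothetical, and comparing ratios on a logarithmic scale is a design choice of this example, not documented model behavior.

```python
import math

# Fixed output sizes from the image_size parameter (excluding "auto").
SUPPORTED_SIZES = {
    "1024x1024": (1024, 1024),
    "1536x1024": (1536, 1024),
    "1024x1536": (1024, 1536),
}


def pick_image_size(width, height):
    """Return the supported size whose aspect ratio is closest to width:height."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    source = math.log(width / height)

    def distance(name):
        w, h = SUPPORTED_SIZES[name]
        # Log scale treats 2:1 and 1:2 as equally far from square.
        return abs(source - math.log(w / h))

    return min(SUPPORTED_SIZES, key=distance)


print(pick_image_size(1920, 1080))  # prints 1536x1024
print(pick_image_size(1080, 1920))  # prints 1024x1536
```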

How can I transition from testing GPT Image 1.5 in the RunComfy Playground to full production via API?

Once your prototype using GPT Image 1.5 works as expected in the RunComfy Playground, you can migrate to the RunComfy API, which mirrors the playground's parameters, including image-to-image calls. You'll authenticate with your API key, use the 'generation' endpoint, and manage USD credits or paid tiers for production-level scalability.

What makes GPT Image 1.5 superior to competitors in the image-to-image editing space?

GPT Image 1.5 stands out for its balanced blend of image quality, speed, and consistency across edits. While rivals like Flux 2 may offer higher resolution, GPT Image 1.5 provides more stable identity preservation, coherent lighting, and semantic prompt accuracy—especially useful in image-to-image editing scenarios for commercial applications.

Does GPT Image 1.5 handle text rendering inside images better than earlier versions during image-to-image edits?

Yes. GPT Image 1.5 improves legibility of small or dense text elements embedded in generated graphics. When performing image-to-image edits involving logos or signage, the model retains crisp outlines and consistent font rendering, surpassing GPT Image 1 and many competing systems in text fidelity.

Can GPT Image 1.5 be used for commercial image-to-image projects?

In general, you may use GPT Image 1.5 outputs commercially, but always confirm the applicable licensing terms on the official OpenAI platform or RunComfy policy pages. Commercial workflows involving image-to-image editing should verify output rights and data policies, as these may differ depending on API integration modes.

How does GPT Image 1.5 ensure consistent visual identity in multi-step image-to-image processes?

OpenAI does not publish the underlying mechanism, but GPT Image 1.5 is designed to preserve facial likeness, textures, and lighting over successive edits, and setting input_fidelity to high strengthens preservation of the first input image. This helps developers or technical artists perform multi-stage image-to-image transformations such as character or product retexturing without introducing visual drift.

Is there a way to optimize generation cost while using GPT Image 1.5 image-to-image features?

Yes. Efficient prompting and batching can reduce USD consumption in RunComfy's GPT Image 1.5 API. Drafting at low quality ($0.009 per image) and rendering only final images at high quality ($0.133 per image) also lowers spend. Reusing masked edits for image-to-image tasks instead of full regenerations preserves credits and lowers processing costs while maintaining control over fine visual adjustments.

Copyright 2025 RunComfy. All Rights Reserved.
