Awesome Image Generation

A curated list of AI image generation APIs, SDKs, and production-ready tools. Focused on services developers can integrate today.

Last verified: March 2026

Related Lists

Text-to-Image APIs

OpenAI GPT Image – gpt-image-1, gpt-image-1.5, gpt-image-1-mini. Natively multimodal generation, editing, and inpainting. DALL-E 2 and DALL-E 3 are scheduled for deprecation in May 2026. API Reference | SDK: Python, Node
Black Forest Labs (FLUX Pro) – FLUX 1.1 Pro and FLUX.2 (32B params) via REST API. Founded by researchers behind the original Stable Diffusion. Docs | Also on Replicate, fal.ai, Together AI
Stability AI – Stable Diffusion 3.5 and Stable Image via REST API. Text-to-image, image-to-image, upscaling, inpainting. Docs
Google Imagen (Vertex AI) – Imagen 4 via Vertex AI. Text-to-image, editing, outpainting, inpainting, customization. Docs | API Reference | SDK: Python (google-cloud-aiplatform), Node
Adobe Firefly API – Image generation, editing, Photoshop automation, and Lightroom operations. Part of Firefly Services platform. Docs | SDK: JS/TS (official)
Ideogram – Known for high-quality text rendering in images. Ideogram 3.0 supports generation, remix, edit, and character reference. OpenAI-compatible interface. Docs
Recraft AI – Raster and vector image generation. V4 model (Feb 2026). Background removal, inpainting, outpainting, vectorization. OpenAI-compatible interface. Docs | ComfyUI Plugin
Midjourney – Official API released late 2025. Enterprise/Pro plan holders only; no public self-service access. Docs
Amazon Titan Image Generator – Text-to-image via AWS Bedrock. Image conditioning, color palette guidance, background removal, and variations. Docs | SDK: Python (boto3), Java, PHP
Leonardo AI – Text-to-image, image-to-image, and image-to-video. Webhooks, LoRA models, and "Get API Code" export from web UI. Docs | SDK: TypeScript, Python
fal.ai – Serverless inference hosting 1000+ image models. Positions itself as the fastest diffusion inference engine. Hosts FLUX, SD, and more. SOC 2 compliant. Docs | SDK: Python, JS
PixelAPI – Pay-per-use AI image API running SDXL, FLUX Schnell, and FLUX Dev on owned GPUs. Background removal, 4x upscaling, text-to-image, and image-to-image at $0.003-$0.005/image. Docs | SDK: pip install pixelapi / npm install pixelapi
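
Several entries above (Ideogram, Recraft AI) advertise an OpenAI-compatible interface. A minimal sketch of what such a request might look like, using only the Python standard library: the JSON fields `model`, `prompt`, `n`, and `size` follow the OpenAI Images API, while the base URL, model name, and API key are placeholders rather than real values for any provider. The function name `build_image_request` is hypothetical. Check each provider's docs for the actual endpoint path and supported fields.

```python
import json
import urllib.request


def build_image_request(base_url, api_key, model, prompt, size="1024x1024", n=1):
    """Build (but do not send) a POST request for an OpenAI-style images endpoint."""
    payload = {"model": model, "prompt": prompt, "n": n, "size": size}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/images/generations",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder values: substitute a real provider base URL, model, and key.
req = build_image_request(
    base_url="https://api.example.com/v1",
    api_key="YOUR_API_KEY",
    model="example-model",
    prompt="a lighthouse at dusk, watercolor",
)
print(req.full_url)
```

Passing the request to `urllib.request.urlopen` would send it; that step is left out so the sketch can be inspected without network access or credentials.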

Open Source Models

FLUX Family (Black Forest Labs)

FLUX.1 [schnell] – 12B param rectified flow transformer. 1-4 step generation. Fully open for commercial use. Apache 2.0. GitHub
FLUX.1 [dev] – 12B param guidance-distilled model. High quality, competitive with closed-source. Non-commercial license.
FLUX.2 [dev] – 32B param model with generation, editing, and multi-reference combining.

Stable Diffusion Family (Stability AI)

Stable Diffusion 1.5 – 860M UNet, runs on consumer GPUs. Foundation for massive community ecosystem of LoRAs, fine-tunes, and extensions.
Stable Diffusion XL (SDXL) – Native 1024x1024. Improved text-in-image and limb generation. Base + refiner pipeline.
Stable Diffusion 3.5 Large – MMDiT architecture with three text encoders (including T5-XXL). Highest-quality Stability open model. Also available as a Turbo variant.
SDXL-Turbo – Adversarial distillation of SDXL enabling single-step generation.

Other Open Models

PixArt-Alpha / PixArt-Sigma – DiT-based T2I trained in 10.8% of SD1.5's training time. Near-commercial quality. HuggingFace
Kandinsky 3 – Open-source T2I from AI Forever. 2x larger U-Net and 10x larger text encoder vs v2.x. HuggingFace
DeepFloyd IF – Cascaded pixel-space diffusion (64px → 256px → 1024px). Strong text rendering. Zero-Shot FID 6.66 on COCO.
Playground v2.5 – Aesthetic-focused model fine-tuned on SDXL architecture.
LCM / LCM-LoRA – Latent Consistency Models enabling 2-4 step generation. LCM-LoRA is a lightweight ~100MB adapter for any SDXL model. Diffusers Docs

Open Source Frameworks and UIs

ComfyUI – Node-based graph UI and backend for diffusion models. Highly customizable, API-accessible. Supports SD, SDXL, Flux, and modern models. GPL-3.0. Docs
AUTOMATIC1111 WebUI – Most widely used Gradio-based SD web UI. 161k+ stars. Extensive extension ecosystem. AGPL-3.0. Wiki
Fooocus – Midjourney-inspired SDXL UI. Prompt-only workflow, no manual parameter tweaking. GPL-3.0.
InvokeAI – Creative engine for SD models targeting professionals. Self-described industry-leading WebUI. Apache 2.0. Website
Forge – Fork of AUTOMATIC1111 with improved GPU memory management and performance. Compatible with A1111 extensions. AGPL-3.0.
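
ComfyUI is described above as API-accessible. A minimal sketch, assuming a local ComfyUI server on its default address (127.0.0.1:8188) and a workflow exported in API format: the server queues a job when it receives a JSON body with a `prompt` key at `POST /prompt`. The helper name and the one-node workflow below are placeholders for illustration, not a runnable graph.

```python
import json
import urllib.request


def build_comfyui_prompt_request(workflow, server="http://127.0.0.1:8188"):
    """Build (but do not send) a request that queues a workflow on a ComfyUI server."""
    body = json.dumps({"prompt": workflow}).encode("utf-8")
    return urllib.request.Request(
        url=server.rstrip("/") + "/prompt",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder graph: a real workflow comes from ComfyUI's API-format export
# and contains many connected nodes.
placeholder_workflow = {"1": {"class_type": "ExampleNode", "inputs": {}}}
request = build_comfyui_prompt_request(placeholder_workflow)
print(request.full_url)
```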

Image Editing and Enhancement

ControlNet – Precise structural control for diffusion models via edge maps, depth, pose, normals. Available for SD1.5, SDXL, and Flux. A1111 Extension
IP-Adapter – Lightweight adapter (~100MB) for image-based prompting. New cross-attention layers for image feature conditioning. Diffusers Docs
Real-ESRGAN – Image and video upscaler, up to 8x. Handles real-world blind super-resolution with noise/artifact removal. BSD 3-Clause. Replicate
GFPGAN – Face restoration from Tencent ARC. Restores facial details from degraded images. Often paired with Real-ESRGAN.

SDKs and Developer Tooling

HuggingFace Diffusers – The canonical PyTorch library for diffusion models. SD 1.5, SDXL, SD3, Flux, ControlNet, IP-Adapter, and more. Docs | pip install diffusers
Replicate SDK – Python/JS client for 50,000+ hosted ML models. Pay-per-second, no GPU management. Docs | pip install replicate / npm install replicate
OpenAI SDK – Official SDK for GPT Image generation and editing. client.images.generate() and client.images.edit(). pip install openai / npm install openai
fal.ai SDK – Python and JS SDKs for serverless inference. Also a Vercel AI SDK provider. Docs | pip install fal-client / npm install @fal-ai/client
Gradio – Python library for building interactive ML demos and web UIs. Foundation for AUTOMATIC1111, Fooocus, and HuggingFace Spaces. Includes gradio-client for programmatic access. GitHub | pip install gradio
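
The `client.images.generate()` call listed for the OpenAI SDK can be sketched as follows. This assumes `pip install openai`, an `OPENAI_API_KEY` environment variable, and that the model returns base64 image data in `result.data[0].b64_json`, as gpt-image-1 does. The helper names, output path, and prompt are illustrative only.

```python
import base64
from pathlib import Path


def save_b64_image(b64_data, out_path):
    """Decode a base64-encoded image payload and write it to disk."""
    path = Path(out_path)
    path.write_bytes(base64.b64decode(b64_data))
    return path


def generate_image(prompt, out_path="output.png", model="gpt-image-1"):
    """Generate one image with the OpenAI SDK and save it locally."""
    # Imported lazily so save_b64_image stays usable without the SDK installed.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    result = client.images.generate(model=model, prompt=prompt, size="1024x1024")
    # The response carries base64 data rather than a URL for this model.
    return save_b64_image(result.data[0].b64_json, out_path)
```

Calling `generate_image("a paper-cut style fox in a forest")` would write `output.png` to the working directory. The call is billed per image, so it is not invoked in the block itself.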

Infrastructure and Deployment

GPU Cloud Providers

RunPod – GPU pods and serverless endpoints. 48% of serverless cold starts under 200ms. Docs
Modal – Serverless Python GPU cloud. Sub-second cold starts. Docs | pip install modal
Lambda Labs – On-demand A100 and H100 GPUs. Competitive pricing (~$1.10/hr A100 80GB). Docs
Replicate – Serverless model hosting for open-source image models. Docs
fal.ai – Positions itself as the fastest diffusion inference engine. 1000+ hosted models. Docs
Together AI – Inference API for 200+ open models. Docs

Image Storage and Delivery

Cloudinary – Enterprise image/video CDN with AI-powered transformations. Docs | SDK: Python, Node, Ruby, PHP, Java, .NET
Imgix – Real-time image processing CDN. URL-parameter-based transforms. Connects to existing S3/GCS storage. Docs
Cloudflare Images – Image CDN on Cloudflare's global network. Pre-defined variants for transformations.
Backblaze B2 – S3-compatible object storage at ~$0.006/GB/month. Free egress via Cloudflare. Docs
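
To put the ~$0.006/GB/month figure in context, here is a rough storage-cost estimate. It is a sketch under stated assumptions: decimal units (1 GB = 1000 MB), no egress or API-call fees, and an average image size that you supply. The function name is made up for this example.

```python
def monthly_storage_cost_usd(num_images, avg_image_mb, price_per_gb_month=0.006):
    """Estimate monthly object-storage cost for a set of images.

    Uses decimal units (1 GB = 1000 MB) and ignores egress and API-call fees.
    """
    total_gb = num_images * avg_image_mb / 1000
    return total_gb * price_per_gb_month


# 100,000 generated images at an assumed 2 MB each is 200 GB of storage.
cost = monthly_storage_cost_usd(num_images=100_000, avg_image_mb=2.0)
print(f"${cost:.2f} per month")
```

At the listed rate, 200 GB works out to roughly $1.20 per month, so storage is usually a small line item next to generation cost.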

Evaluation and Observability

pytorch-fid – PyTorch FID (Fréchet Inception Distance) implementation. Measures distribution similarity between real and generated images. pip install pytorch-fid
torch-fidelity – High-fidelity ISC, FID, KID, and PRC metrics. Supports InceptionV3, CLIP, DINOv2, VGG16 feature extractors. Docs | pip install torch-fidelity
ImageReward – First general-purpose human preference reward model for T2I (NeurIPS 2023). Trained on 137k expert comparison pairs. Paper
IQA-PyTorch – Comprehensive image quality toolbox. PSNR, SSIM, LPIPS, FID, NIQE, MUSIQ, TOPIQ, NIMA, BRISQUE, and more.
CLIP Score – Measures semantic alignment between text prompts and generated images using CLIP embeddings. Available via torchmetrics.multimodal.CLIPScore.
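
The CLIP Score entry above reduces to a short formula once embeddings are available: 100 times the cosine similarity between the image and text embeddings, clamped at zero. A dependency-free sketch of that formula follows. The vectors are toy values rather than real CLIP outputs; in practice `torchmetrics.multimodal.CLIPScore` computes the embeddings and the score for you.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def clip_score(image_embedding, text_embedding):
    """CLIP score = max(100 * cos(E_image, E_text), 0)."""
    return max(100.0 * cosine_similarity(image_embedding, text_embedding), 0.0)


# Toy embeddings for illustration only.
print(clip_score([0.2, 0.9, 0.1], [0.25, 0.8, 0.0]))
```

Higher values indicate closer alignment between prompt and image. Negative similarities are clamped to zero, so the score ranges from 0 to 100.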

Templates and Example Projects

OpenAI Cookbook (GPT Image) – Official notebooks for image generation and editing with gpt-image-1.
HuggingFace Diffusers Examples – Official scripts for DreamBooth, LoRA fine-tuning, ControlNet training, and more.
Replicate Text-to-Image Collection – Curated runnable models with inline API code examples.
B2 Image Generation Prompt Flow – Image generation pipeline with prompt flow and Backblaze B2 cloud storage integration.
B2 Background Removal with Transformers.js – Browser-based background removal using Transformers.js with Backblaze B2 storage.
HuggingFace Spaces – Free hosting for Gradio and Streamlit ML demos. Thousands of image generation demos. Docs

Learning Resources

Contributing

Contributions welcome! Please read the contribution guidelines first. PRs for new tools, corrections, and updates are appreciated.
