A curated list of AI image generation APIs, SDKs, and production-ready tools. Focused on services developers can integrate today.
Last verified: March 2026
- Text-to-Image APIs
- Open Source Models
- Open Source Frameworks and UIs
- Image Editing and Enhancement
- SDKs and Developer Tooling
- Infrastructure and Deployment
- Evaluation and Observability
- Templates and Example Projects
- Learning Resources
- OpenAI GPT Image – `gpt-image-1`, `gpt-image-1.5`, `gpt-image-1-mini`. Natively multimodal generation, editing, and inpainting. DALL-E 2/3 deprecated May 2026. API Reference | SDK: Python, Node
- Black Forest Labs (FLUX Pro) – FLUX 1.1 Pro and FLUX.2 (32B params) via REST API. From the creators of FLUX and Stable Diffusion. Docs | Also on Replicate, fal.ai, Together AI
- Stability AI – Stable Diffusion 3.5 and Stable Image via REST API. Text-to-image, image-to-image, upscaling, inpainting. Docs
- Google Imagen (Vertex AI) – Imagen 4 via Vertex AI. Text-to-image, editing, outpainting, inpainting, customization. Docs | API Reference | SDK: Python (google-cloud-aiplatform), Node
- Adobe Firefly API – Image generation, editing, Photoshop automation, and Lightroom operations. Part of Firefly Services platform. Docs | SDK: JS/TS (official)
- Ideogram – Known for high-quality text rendering in images. Ideogram 3.0 supports generation, remix, edit, and character reference. OpenAI-compatible interface. Docs
- Recraft AI – Raster and vector image generation. V4 model (Feb 2026). Background removal, inpainting, outpainting, vectorization. OpenAI-compatible interface. Docs | ComfyUI Plugin
- Midjourney – Official API released late 2025. Enterprise/Pro plan holders only; no public self-service access. Docs
- Amazon Titan Image Generator – Text-to-image via AWS Bedrock. Image conditioning, color palette guidance, background removal, and variations. Docs | SDK: Python (boto3), Java, PHP
- Leonardo AI – Text-to-image, image-to-image, and image-to-video. Webhooks, LoRA models, and "Get API Code" export from web UI. Docs | SDK: TypeScript, Python
- fal.ai – Serverless inference hosting 1000+ image models, with an inference engine optimized for diffusion models. Hosts FLUX, SD, and more. SOC 2 compliant. Docs | SDK: Python, JS
- PixelAPI – Pay-per-use AI image API running SDXL, FLUX Schnell, and FLUX Dev on owned GPUs. Background removal, 4x upscaling, text-to-image, and image-to-image at $0.003-$0.005/image. Docs | SDK: `pip install pixelapi` / `npm install pixelapi`
- FLUX.1 [schnell] – 12B param rectified flow transformer. 1-4 step generation. Fully open for commercial use. Apache 2.0. GitHub
- FLUX.1 [dev] – 12B param guidance-distilled model. High quality, competitive with closed-source. Non-commercial license.
- FLUX.2 [dev] – 32B param model with generation, editing, and multi-reference combining.
- Stable Diffusion 1.5 – 860M UNet, runs on consumer GPUs. Foundation for massive community ecosystem of LoRAs, fine-tunes, and extensions.
- Stable Diffusion XL (SDXL) – Native 1024x1024. Improved text-in-image and limb generation. Base + refiner pipeline.
- Stable Diffusion 3.5 Large – MMDiT architecture with three text encoders (including T5-XXL). Highest-quality Stability open model. Turbo variant also available.
- SDXL-Turbo – Adversarial distillation of SDXL enabling single-step generation.
- PixArt-Alpha / PixArt-Sigma – DiT-based T2I at 10.8% of SD1.5 training cost. Near-commercial quality. HuggingFace
- Kandinsky 3 – Open-source T2I from AI Forever. 2x larger U-Net and 10x larger text encoder vs v2.x. HuggingFace
- DeepFloyd IF – Cascaded pixel-space diffusion (64px → 256px → 1024px). Strong text rendering. Zero-Shot FID 6.66 on COCO.
- Playground v2.5 – Aesthetic-focused model fine-tuned on SDXL architecture.
- LCM / LCM-LoRA – Latent Consistency Models enabling 2-4 step generation. LCM-LoRA is a lightweight ~100MB adapter for any SDXL model. Diffusers Docs
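For the few-step models listed above (FLUX.1 [schnell], SDXL-Turbo), loading them through HuggingFace Diffusers might look roughly like the sketch below. The repo IDs, step counts, and guidance values are assumptions based on the public model cards and should be checked against current docs. `FEW_STEP_PRESETS`, `pipeline_kwargs`, and `generate` are hypothetical helper names, and calling `generate` assumes `torch`, `diffusers`, and a CUDA GPU are available.

```python
# Hypothetical presets; all values are assumptions taken from public model cards.
FEW_STEP_PRESETS = {
    "flux-schnell": {
        "repo_id": "black-forest-labs/FLUX.1-schnell",
        "dtype": "bfloat16",
        "num_inference_steps": 4,   # schnell targets 1-4 steps
        "guidance_scale": 0.0,      # timestep-distilled, no CFG
    },
    "sdxl-turbo": {
        "repo_id": "stabilityai/sdxl-turbo",
        "dtype": "float16",
        "num_inference_steps": 1,   # single-step generation
        "guidance_scale": 0.0,
    },
}


def pipeline_kwargs(model_key):
    """Return only the sampling arguments for a preset (no heavy imports)."""
    preset = FEW_STEP_PRESETS[model_key]
    return {k: preset[k] for k in ("num_inference_steps", "guidance_scale")}


def generate(model_key, prompt):
    """Load the pipeline and return a PIL image. Needs torch, diffusers, and a GPU."""
    import torch  # imported lazily so the helpers above work without a GPU stack
    from diffusers import DiffusionPipeline

    preset = FEW_STEP_PRESETS[model_key]
    pipe = DiffusionPipeline.from_pretrained(
        preset["repo_id"], torch_dtype=getattr(torch, preset["dtype"])
    )
    pipe = pipe.to("cuda")
    return pipe(prompt=prompt, **pipeline_kwargs(model_key)).images[0]
```

Usage, on a machine with a suitable GPU: `generate("sdxl-turbo", "a lighthouse at dusk").save("out.png")`.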
- ComfyUI – Node-based graph UI and backend for diffusion models. Highly customizable, API-accessible. Supports SD, SDXL, Flux, and modern models. GPL-3.0. Docs
- AUTOMATIC1111 WebUI – Most widely used Gradio-based SD web UI. 161k+ GitHub stars. Extensive extension ecosystem. AGPL-3.0. Wiki
- Fooocus – Midjourney-inspired SDXL UI. Prompt-only workflow, no manual parameter tweaking. GPL-3.0.
- InvokeAI – Creative engine for SD models targeting professionals. Polished web UI. Apache 2.0. Website
- Forge – Fork of AUTOMATIC1111 with improved GPU memory management and performance. Compatible with A1111 extensions. AGPL-3.0.
- ControlNet – Precise structural control for diffusion models via edge maps, depth, pose, normals. Available for SD1.5, SDXL, and Flux. A1111 Extension
- IP-Adapter – Lightweight adapter (~100MB) for image-based prompting. New cross-attention layers for image feature conditioning. Diffusers Docs
- Real-ESRGAN – Image and video upscaler, up to 8x. Handles real-world blind super-resolution with noise/artifact removal. BSD 3-Clause. Replicate
- GFPGAN – Face restoration from Tencent ARC. Restores facial details from degraded images. Often paired with Real-ESRGAN.
- HuggingFace Diffusers – The canonical PyTorch library for diffusion models. SD 1.5, SDXL, SD3, Flux, ControlNet, IP-Adapter, and more. Docs | `pip install diffusers`
- Replicate SDK – Python/JS client for 50,000+ hosted ML models. Pay-per-second, no GPU management. Docs | `pip install replicate` / `npm install replicate`
- OpenAI SDK – Official SDK for GPT Image generation and editing: `client.images.generate()` and `client.images.edit()`. `pip install openai` / `npm install openai`
- fal.ai SDK – Python and JS SDKs for serverless inference. Also a Vercel AI SDK provider. Docs | `pip install fal-client` / `npm install @fal-ai/client`
- Gradio – Python library for building interactive ML demos and web UIs. Foundation for AUTOMATIC1111, Fooocus, and HuggingFace Spaces. Includes `gradio-client` for programmatic access. GitHub | `pip install gradio`
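As a rough illustration of the `client.images.generate()` call mentioned in the OpenAI SDK entry, here is a minimal sketch. It assumes the `openai` package is installed, `OPENAI_API_KEY` is set in the environment, and the model returns base64-encoded image data. The size allow-list is an assumption about `gpt-image-1` and may not match the current API reference. `build_generate_kwargs` and `generate_png` are hypothetical helper names.

```python
import base64

# Assumed set of sizes for gpt-image-1; verify against the API reference.
ALLOWED_SIZES = {"1024x1024", "1536x1024", "1024x1536", "auto"}


def build_generate_kwargs(prompt, size="1024x1024", model="gpt-image-1"):
    """Validate inputs locally and return keyword arguments for images.generate()."""
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    if size not in ALLOWED_SIZES:
        raise ValueError(f"unsupported size: {size}")
    return {"model": model, "prompt": prompt, "size": size}


def generate_png(prompt, out_path="out.png"):
    """Call the Images API and write the decoded image to disk."""
    from openai import OpenAI  # imported lazily; requires `pip install openai`

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    result = client.images.generate(**build_generate_kwargs(prompt))
    image_bytes = base64.b64decode(result.data[0].b64_json)
    with open(out_path, "wb") as f:
        f.write(image_bytes)
    return out_path
```

Validating arguments before the network call keeps obviously bad requests from being billed or rate-limited; `generate_png("a watercolor fox")` would then write `out.png` if the request succeeds.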
- RunPod – GPU pods and serverless endpoints. 48% of serverless cold starts under 200ms. Docs
- Modal – Serverless Python GPU cloud. Sub-second cold starts. Docs | `pip install modal`
- Lambda Labs – On-demand A100 and H100 GPUs. Competitive pricing (~$1.10/hr A100 80GB). Docs
- Replicate – Serverless model hosting for open-source image models. Docs
- fal.ai – Serverless inference engine optimized for diffusion models. 1000+ hosted models. Docs
- Together AI – Inference API for 200+ open models. Docs
- Cloudinary – Enterprise image/video CDN with AI-powered transformations. Docs | SDK: Python, Node, Ruby, PHP, Java, .NET
- Imgix – Real-time image processing CDN. URL-parameter-based transforms. Connects to existing S3/GCS storage. Docs
- Cloudflare Images – Image CDN on Cloudflare's global network. Pre-defined variants for transformations.
- Backblaze B2 – S3-compatible object storage at ~$0.006/GB/month. Free egress via Cloudflare. Docs
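The "URL-parameter-based transforms" offered by CDNs such as Imgix can be sketched with the standard library alone. The domain `example.imgix.net` and the helper name `imgix_url` are hypothetical; `w`, `fit`, and `auto` are commonly documented Imgix rendering parameters, but check the Imgix reference before relying on them. URL signing is not covered here.

```python
from urllib.parse import quote, urlencode


def imgix_url(domain, path, **params):
    """Build a transform URL: the source image path plus query-string parameters."""
    # Percent-encode the path but keep "/" separators intact.
    base = f"https://{domain}/{quote(path.lstrip('/'))}"
    # Sort parameters so the same transform always yields the same, cacheable URL.
    query = urlencode(sorted(params.items()))
    return f"{base}?{query}" if query else base


# Request a 400px-wide crop of a generated image in an auto-negotiated format.
print(imgix_url("example.imgix.net", "gen/cat 1.png", w=400, fit="crop", auto="format"))
```

Because the transform lives entirely in the URL, generated images can be stored once (for example in S3, GCS, or B2) and resized on demand without re-running the model.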
- pytorch-fid – PyTorch FID (Fréchet Inception Distance) implementation. Measures distribution similarity between real and generated images. `pip install pytorch-fid`
- torch-fidelity – High-fidelity ISC, FID, KID, and PRC metrics. Supports InceptionV3, CLIP, DINOv2, VGG16 feature extractors. Docs | `pip install torch-fidelity`
- ImageReward – First general-purpose human preference reward model for T2I (NeurIPS 2023). Trained on 137k expert comparison pairs. Paper
- IQA-PyTorch – Comprehensive image quality toolbox. PSNR, SSIM, LPIPS, FID, NIQE, MUSIQ, TOPIQ, NIMA, BRISQUE, and more.
- CLIP Score – Measures semantic alignment between text prompts and generated images using CLIP embeddings. Available via `torchmetrics.multimodal.CLIPScore`.
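To make the two most common metrics above concrete, here is a simplified pure-Python sketch of the arithmetic behind them. It is not how the listed libraries are implemented. Real FID uses the full covariance of InceptionV3 features, while this version assumes a diagonal covariance, so the Fréchet distance reduces to a per-dimension sum. The CLIP-style score assumes the embeddings have already been computed by a CLIP model and applies the `max(100 * cos, 0)` form. Both function names are hypothetical.

```python
import math
from statistics import fmean, pstdev


def clip_style_score(image_emb, text_emb):
    """Scaled cosine similarity between two embeddings, clipped at zero."""
    dot = sum(a * b for a, b in zip(image_emb, text_emb))
    norm = math.sqrt(sum(a * a for a in image_emb)) * math.sqrt(sum(b * b for b in text_emb))
    if norm == 0:
        raise ValueError("embeddings must be non-zero")
    return max(100.0 * dot / norm, 0.0)


def diagonal_frechet_distance(feats_a, feats_b):
    """Frechet distance between two feature sets, assuming diagonal covariance.

    Per dimension: (mean_a - mean_b)^2 + (std_a - std_b)^2, summed over dimensions.
    """
    total = 0.0
    # zip(*feats) transposes a list of feature vectors into per-dimension columns.
    for col_a, col_b in zip(zip(*feats_a), zip(*feats_b)):
        total += (fmean(col_a) - fmean(col_b)) ** 2
        total += (pstdev(col_a) - pstdev(col_b)) ** 2
    return total


real = [[0.0, 0.0], [2.0, 4.0]]
fake = [[3.0, 0.0], [5.0, 4.0]]  # same spread, first dimension shifted by 3
print(diagonal_frechet_distance(real, fake))
print(clip_style_score([3.0, 4.0], [3.0, 4.0]))
```

Lower Fréchet distance means the generated feature distribution sits closer to the real one; a higher CLIP-style score means the image embedding points in a direction closer to the prompt embedding.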
- OpenAI Cookbook (GPT Image) – Official notebooks for image generation and editing with gpt-image-1.
- HuggingFace Diffusers Examples – Official scripts for DreamBooth, LoRA fine-tuning, ControlNet training, and more.
- Replicate Text-to-Image Collection – Curated runnable models with inline API code examples.
- B2 Image Generation Prompt Flow – Image generation pipeline with prompt flow and Backblaze B2 cloud storage integration.
- B2 Background Removal with Transformers.js – Browser-based background removal using Transformers.js with Backblaze B2 storage.
- HuggingFace Spaces – Free hosting for Gradio and Streamlit ML demos. Thousands of image generation demos. Docs
- OpenAI Images API Guide
- HuggingFace Diffusers Documentation
- ComfyUI Documentation
- Black Forest Labs API Docs
- Google Imagen on Vertex AI Guide
- Adobe Firefly API Tutorial
Contributions welcome! Please read the contribution guidelines first. PRs for new tools, corrections, and updates are appreciated.
