
Commit 079a3a7

ConalMullan and claude committed
Add FLUX.2 Klein 4B image generation tool with scene presets (v0.11.0)

- New tools/flux2.py: text-to-image, image editing, multi-image compositing
- 8 scene presets for video production (title-bg, problem, solution, etc.)
- Brand-aware generation reads brand.json colors into prompts
- Docker image with baked model weights (ghcr.io/conalmullan/video-toolkit-flux2)
- Rebuilt qwen3-tts Docker image with baked models
- "The Space Between" showcase video (all AI-generated)
- Updated README, registry, and changelog

Closes #6

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

1 parent 3bdad4d commit 079a3a7

7 files changed

Lines changed: 1525 additions & 10 deletions

README.md

Lines changed: 10 additions & 1 deletion
@@ -22,6 +22,7 @@ An AI-native video production workspace for [Claude Code](https://claude.ai/code
 > - `ghcr.io/conalmullan/video-toolkit-propainter` — Video watermark removal (ProPainter)
 > - `ghcr.io/conalmullan/video-toolkit-sadtalker` — Talking head generation (SadTalker)
 > - `ghcr.io/conalmullan/video-toolkit-qwen3-tts` — Text-to-speech (Qwen3-TTS)
+> - `ghcr.io/conalmullan/video-toolkit-flux2` — Text-to-image & editing (FLUX.2 Klein 4B)
 >
 > My motto: **Be brave. Experiment.** And please share any videos you create or ideas you have back with the project — it helps me keep improving this toolkit for everyone.

@@ -149,6 +150,7 @@ See `examples/` for finished projects you can learn from (oldest first, showing
 | 2026-01-22 | [ds-remote-mcp](https://demos.digitalsamba.com/video/ds-remote-mcp.mp4) | Remote MCP server demo *(the jazz background music is a joke)* |
 | 2026-01-25 | [schlumbergera](https://demos.digitalsamba.com/video/schlumbergera.mp4) | Android sprint review video |
 | 2026-02-23 | [cortina](https://demos.digitalsamba.com/video/sprint-review.mp4) | Mobile platforms sprint review |
+| 2026-03-15 | [the-space-between](https://demos.digitalsamba.com/video/the-space-between.mp4) | AI-generated video essay — flux2 avatar, Qwen3-TTS voice, SadTalker animation |

 ### Scene Transitions

@@ -250,6 +252,12 @@ python tools/locate_watermark.py --input video.mp4 --grid --output-dir ./review/
 # Creates animated presenter/narrator from a static portrait + voiceover audio
 # Use with NarratorPiP component for picture-in-picture presenter overlays
 python tools/sadtalker.py --image portrait.png --audio voiceover.mp3 --output talking.mp4
+
+# AI image generation (FLUX.2 Klein 4B — text-to-image + editing)
+python tools/flux2.py --prompt "A sunset over mountains"
+python tools/flux2.py --preset title-bg --brand digital-samba
+python tools/flux2.py --input photo.jpg --prompt "Add sunglasses"
+python tools/flux2.py --list-presets
 ```

 **Tool Categories:**
@@ -258,7 +266,7 @@ python tools/sadtalker.py --image portrait.png --audio voiceover.mp3 --output ta
 |------|-------|---------|
 | **Project** | voiceover, music, sfx | Used during video creation workflow |
 | **Utility** | redub, addmusic, notebooklm_brand, locate_watermark | Quick transformations, no project needed |
-| **Cloud GPU** | image_edit, upscale, dewatermark, sadtalker, qwen3_tts | AI processing via RunPod (see below) |
+| **Cloud GPU** | image_edit, upscale, dewatermark, sadtalker, qwen3_tts, flux2 | AI processing via RunPod (see below) |

 See [docs/runpod-setup.md](docs/runpod-setup.md) for Cloud GPU tool setup.

@@ -273,6 +281,7 @@ Cloud GPU tools use pre-built Docker images deployed to RunPod serverless:
 | dewatermark | `ghcr.io/conalmullan/video-toolkit-propainter:latest` | 24GB (RTX 3090/4090) |
 | sadtalker | `ghcr.io/conalmullan/video-toolkit-sadtalker:latest` | 24GB (RTX 4090) |
 | qwen3_tts | `ghcr.io/conalmullan/video-toolkit-qwen3-tts:latest` | 24GB (ADA) |
+| flux2 | `ghcr.io/conalmullan/video-toolkit-flux2:latest` | 24GB (ADA) |

 Dockerfiles and handlers are in `docker/`. Run `python tools/<tool>.py --setup` to auto-deploy.

_internal/CHANGELOG.md

Lines changed: 26 additions & 0 deletions
@@ -6,6 +6,32 @@ All notable changes to claude-code-video-toolkit.
 ---

+## 2026-03-15 (v0.11.0)
+
+### Added
+- **`tools/flux2.py`** — AI image generation and editing using FLUX.2 Klein 4B
+  - Text-to-image generation from prompts (~2.5s fast mode, ~8s quality mode)
+  - Image editing with reference images (`--input photo.jpg --prompt "..."`)
+  - Multi-image compositing (up to 3 reference images)
+  - **8 scene presets** for video production: `title-bg`, `problem`, `solution`, `demo-bg`, `stats-bg`, `cta`, `thumbnail`, `portrait-bg`
+  - **Brand-aware generation** `--brand digital-samba` reads brand.json and injects color palette into prompts
+  - Preset + prompt layering — preset provides style/mood, `--prompt` adds subject context
+  - `--setup` creates RunPod template + endpoint via GraphQL
+  - `--list-presets` shows all available scene presets
+  - Docker image: `ghcr.io/conalmullan/video-toolkit-flux2:latest` (baked model weights, ~15GB)
+  - Apache 2.0 licensed model (commercial OK)
+- **"The Space Between"** [showcase video](https://demos.digitalsamba.com/video/the-space-between.mp4) demonstrating end-to-end AI video creation
+  - Avatar generated with flux2, voiced with Qwen3-TTS, animated with SadTalker, composed in Remotion
+  - Concept, script, voice, visuals — all AI-generated
+  - New video format: video essay (beyond sprint reviews and product demos)
+
+### Changed
+- **Baked Docker images rebuilt** — both `flux2` and `qwen3-tts` images now include model weights baked in, eliminating cold-start model downloads (~30s cold start vs minutes)
+- Updated `_internal/toolkit-registry.json` with flux2 tool entry, cloud endpoint, scene presets, and showcase example
+- Updated README with flux2 tool, Docker image, demo table entry, and Cloud GPU tools table
+
+---
+
 ## 2026-02-25 (v0.10.1)

 ### Added
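
Reviewer note: the changelog entry above describes brand-aware generation and preset + prompt layering, but `tools/flux2.py` itself is not shown in this view. The following is only a minimal sketch of how such layering could work. The preset style strings, the `colors` key assumed in `brand.json`, and the helpers `load_brand_colors` and `build_prompt` are all hypothetical and are not taken from the commit.

```python
import json
from pathlib import Path
from typing import List, Optional

# Hypothetical style text per preset; the real presets live in tools/flux2.py.
PRESET_STYLES = {
    "title-bg": "abstract cinematic background, soft depth of field, room for title text",
    "thumbnail": "bold high-contrast composition, single clear focal point",
}


def load_brand_colors(brand_json: Path) -> List[str]:
    """Read hex colors from brand.json (assumed shape: {"colors": {name: hex}})."""
    data = json.loads(brand_json.read_text(encoding="utf-8"))
    return list(data.get("colors", {}).values())


def build_prompt(
    preset: Optional[str] = None,
    prompt: Optional[str] = None,
    brand_colors: Optional[List[str]] = None,
) -> str:
    """Layer subject (prompt), style (preset), and palette (brand) into one string."""
    parts = []
    if prompt:
        parts.append(prompt.strip())
    if preset:
        parts.append(PRESET_STYLES[preset])
    if brand_colors:
        parts.append("color palette: " + ", ".join(brand_colors))
    if not parts:
        raise ValueError("need at least a preset or a prompt")
    return ", ".join(parts)
```

Under these assumptions, `build_prompt(preset="title-bg", prompt="two offices on a video call", brand_colors=["#112233"])` yields the subject first, then the preset style, then the palette hint, which mirrors the "preset provides style/mood, `--prompt` adds subject context" description.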

_internal/toolkit-registry.json

Lines changed: 41 additions & 2 deletions
@@ -1,6 +1,6 @@
 {
   "name": "claude-code-video-toolkit",
-  "version": "0.10.1",
+  "version": "0.11.0",
   "description": "AI-native video production workspace for Claude Code",
   "repository": "https://github.com/digitalsamba/claude-code-video-toolkit",

@@ -309,6 +309,30 @@
       "requires": "ffprobe (from ffmpeg)",
       "created": "2026-02-24",
       "updated": "2026-02-24"
+    },
+    "flux2": {
+      "path": "tools/flux2.py",
+      "description": "AI image generation and editing using FLUX.2 Klein 4B - text-to-image, image editing, and scene presets",
+      "usage": "python tools/flux2.py --preset title-bg --brand digital-samba",
+      "status": "beta",
+      "category": "image-generation",
+      "backend": "flux2-klein-4b",
+      "requires": "RunPod account",
+      "options": {
+        "modes": ["generate", "edit"],
+        "presets": ["title-bg", "problem", "solution", "demo-bg", "stats-bg", "cta", "thumbnail", "portrait-bg"],
+        "brand": true,
+        "width": [512, 768, 1024, 1280, 1920],
+        "height": [512, 768, 1024, 1080, 1280],
+        "steps": "4 (fast) to 50 (quality)",
+        "guidance": "1.0 (fast) to 4.0 (quality)",
+        "seed": true,
+        "multiImage": true
+      },
+      "envVars": ["RUNPOD_API_KEY", "RUNPOD_FLUX2_ENDPOINT_ID"],
+      "estimatedCost": "$0.01-0.03 per image",
+      "created": "2026-03-14",
+      "updated": "2026-03-15"
     }
   },

@@ -371,10 +395,17 @@
           "envVar": "RUNPOD_QWEN3_TTS_ENDPOINT_ID",
           "operations": ["qwen3_tts"],
           "estimatedCost": "$0.005-0.10 per generation"
+        },
+        "flux2": {
+          "image": "ghcr.io/conalmullan/video-toolkit-flux2:latest",
+          "dockerfile": "docker/runpod-flux2/",
+          "envVar": "RUNPOD_FLUX2_ENDPOINT_ID",
+          "operations": ["flux2"],
+          "estimatedCost": "$0.01-0.03 per image"
         }
       },
       "created": "2025-12-30",
-      "updated": "2026-01-12"
+      "updated": "2026-03-14"
     }
   },

@@ -561,6 +592,14 @@
       "template": "sprint-review",
       "complexity": "intermediate",
       "contributor": "Digital Samba"
+    },
+    "the-space-between": {
+      "path": "examples/the-space-between/",
+      "description": "AI-generated video essay — flux2 avatar, Qwen3-TTS voice, SadTalker animation",
+      "template": "custom",
+      "complexity": "advanced",
+      "demo": "https://demos.digitalsamba.com/video/the-space-between.mp4",
+      "created": "2026-03-15"
     }
   },
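
Reviewer note: the registry entry names two environment variables and gives the fast and quality ends of the `steps` and `guidance` ranges. A minimal sketch of how a client might turn those into a request follows. It assumes RunPod's serverless `runsync` URL and `{"input": ...}` wrapper; the field names inside `input`, the `MODE_SETTINGS` table, and the `build_request` helper are hypothetical, since the handler's real schema is not part of this view. The request is only built, never sent.

```python
import json
import os
import urllib.request

# Range endpoints from the registry entry: steps 4 (fast) to 50 (quality),
# guidance 1.0 (fast) to 4.0 (quality).
MODE_SETTINGS = {
    "fast": {"steps": 4, "guidance": 1.0},
    "quality": {"steps": 50, "guidance": 4.0},
}


def build_request(prompt, mode="fast", width=1024, height=1024, seed=None, env=os.environ):
    """Build (but do not send) a runsync request for the flux2 endpoint."""
    required = ("RUNPOD_API_KEY", "RUNPOD_FLUX2_ENDPOINT_ID")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))

    # Field names inside "input" are assumptions; the handler defines the real schema.
    job_input = {"prompt": prompt, "width": width, "height": height}
    job_input.update(MODE_SETTINGS[mode])
    if seed is not None:
        job_input["seed"] = seed

    return urllib.request.Request(
        url="https://api.runpod.ai/v2/%s/runsync" % env["RUNPOD_FLUX2_ENDPOINT_ID"],
        data=json.dumps({"input": job_input}).encode("utf-8"),
        headers={
            "Authorization": "Bearer %s" % env["RUNPOD_API_KEY"],
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing `env` explicitly keeps the sketch testable without touching real credentials; a caller would hand the returned object to `urllib.request.urlopen` once the payload matches what the handler expects.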

docker/runpod-flux2/Dockerfile

Lines changed: 91 additions & 0 deletions
@@ -0,0 +1,91 @@
+# RunPod Serverless handler for FLUX.2 Klein 4B (models baked in)
+#
+# Build: docker buildx build --platform linux/amd64 -t ghcr.io/conalmullan/video-toolkit-flux2:latest --push .
+#
+# Image size: ~15GB (includes ~8GB model weights)
+# Cold start: ~30s (no model download needed)
+#
+# GPU Requirements:
+# - Minimum: 16GB VRAM (RTX 4070 Ti SUPER, A4000)
+# - Optimal: 24GB VRAM (RTX 4090, ADA_24)
+
+FROM nvidia/cuda:12.4.1-cudnn-devel-ubuntu22.04
+
+ENV DEBIAN_FRONTEND=noninteractive
+
+# Install Python and system dependencies
+RUN apt-get update && apt-get install -y \
+    python3.11 \
+    python3-pip \
+    python3.11-venv \
+    python3.11-dev \
+    git \
+    git-lfs \
+    curl \
+    && rm -rf /var/lib/apt/lists/* \
+    && ln -sf /usr/bin/python3.11 /usr/bin/python3 \
+    && ln -sf /usr/bin/python3.11 /usr/bin/python
+
+RUN git lfs install
+
+WORKDIR /app
+
+# CUDA environment
+ENV CUDA_HOME=/usr/local/cuda
+ENV PATH="${CUDA_HOME}/bin:${PATH}"
+ENV LD_LIBRARY_PATH="${CUDA_HOME}/lib64:${LD_LIBRARY_PATH}"
+
+# Install PyTorch with CUDA 12.4 support
+RUN pip3 install --no-cache-dir \
+    torch==2.5.1 \
+    torchvision==0.20.1 \
+    --index-url https://download.pytorch.org/whl/cu124
+
+# Install diffusers from git (required for Flux2KleinPipeline)
+RUN pip3 install --no-cache-dir git+https://github.com/huggingface/diffusers
+
+# Install dependencies (specifiers are quoted so the shell does not parse ">" as a redirect)
+RUN pip3 install --no-cache-dir \
+    "transformers>=4.45.0" \
+    "accelerate>=0.30.0" \
+    "safetensors>=0.4.0" \
+    "sentencepiece>=0.2.0" \
+    "protobuf>=4.25.0" \
+    "Pillow>=10.0.0" \
+    "numpy>=1.26.0"
+
+# Install RunPod SDK and utilities
+RUN pip3 install --no-cache-dir \
+    "runpod>=1.7.0" \
+    "requests>=2.31.0" \
+    "boto3>=1.34.0" \
+    "huggingface_hub>=0.25.0"
+
+# Set HF cache location
+ENV HF_HOME=/root/.cache/huggingface
+
+# === BAKE MODEL INTO IMAGE ===
+# Download FLUX.2 Klein 4B during build (~8GB)
+RUN python3 -c "\
+from huggingface_hub import snapshot_download; \
+import os; \
+os.makedirs('/root/.cache/huggingface', exist_ok=True); \
+print('Downloading FLUX.2 Klein 4B model...'); \
+snapshot_download('black-forest-labs/FLUX.2-klein-4B', cache_dir='/root/.cache/huggingface'); \
+print('Model downloaded successfully'); \
+"
+
+# Copy handler
+COPY handler.py /app/handler.py
+
+ENV PYTHONUNBUFFERED=1
+
+# Health check — verify torch and diffusers
+RUN python3 -c "\
+import torch; \
+from diffusers import Flux2KleinPipeline; \
+print(f'PyTorch {torch.__version__}, CUDA available: {torch.cuda.is_available()}'); \
+print('Flux2KleinPipeline: available'); \
+"
+
+CMD ["python3", "-u", "/app/handler.py"]
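
Reviewer note: the bake step passes `cache_dir='/root/.cache/huggingface'`, which is the same path as `HF_HOME`. `huggingface_hub` stores each model repo in a `models--<org>--<name>` folder directly under the `cache_dir` it is given, while its default cache when no `cache_dir` is passed is `$HF_HOME/hub`. Whether `handler.py` looks in the baked location therefore depends on the `cache_dir` it passes, which this view does not show. The sketch below is one way to check both locations at startup; the helper names `repo_cache_dir` and `find_baked_snapshot` are hypothetical.

```python
from pathlib import Path


def repo_cache_dir(cache_dir, repo_id):
    """Folder the hub cache uses for a model repo: <cache_dir>/models--<org>--<name>."""
    return Path(cache_dir) / ("models--" + repo_id.replace("/", "--"))


def find_baked_snapshot(repo_id, candidates):
    """Return the first existing snapshot directory for repo_id, or None.

    candidates is an ordered list of cache directories to probe.
    """
    for cache_dir in candidates:
        snapshots = repo_cache_dir(cache_dir, repo_id) / "snapshots"
        if snapshots.is_dir():
            revisions = sorted(p for p in snapshots.iterdir() if p.is_dir())
            if revisions:
                return revisions[0]
    return None
```

Inside the container, a handler could call `find_baked_snapshot("black-forest-labs/FLUX.2-klein-4B", ["/root/.cache/huggingface", "/root/.cache/huggingface/hub"])` and log a warning when it returns `None`, since a miss would mean the weights get downloaded again on cold start.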
