Skip to content

refactor: Claude visualizer + DI-driven ThemeImageService#64

Merged
meninoebom merged 1 commit into
mainfrom
feature/tune-image-prompt-literal
Apr 22, 2026
Merged

refactor: Claude visualizer + DI-driven ThemeImageService#64
meninoebom merged 1 commit into
mainfrom
feature/tune-image-prompt-literal

Conversation

@meninoebom
Copy link
Copy Markdown
Owner

Why

The first version passed theme body_md verbatim into the Flux prompt. For concrete themes ("notes on a good afternoon") that worked; for abstract themes ("the way we talk about AI agents") Flux defaulted to contemporary-figurative-painting tropes (groups of diverse people) because it had no visual referent for meta-language.

Fix: insert a Claude visualization step. Claude translates theme text into a concrete, renderable scene sentence; Flux renders the scene.

Architecture

While touching the module, restructured for simplicity, robustness, and testability per the stated design principles:

app/images/
├── __init__.py       # public API + default_theme_image_service() factory
├── errors.py         # ImageGenerationError, ImageCommitError
├── service.py        # Visualizer/ImageGenerator/ImageStore protocols,
│                     # ThemeImageService orchestrator, STYLE_SUFFIX, compose_prompt
└── providers.py      # ClaudeVisualizer, ReplicateImageGenerator, R2ImageStore
  • Service layer is pure orchestration. No imports from providers. All deps injected via constructor.
  • Providers encapsulate vendor APIs. Each wraps vendor-specific errors as ImageGenerationError / ImageCommitError, so the service and API layers never see anthropic.APIError or replicate.exceptions.*.
  • Providers accept their clients/callables as constructor args (runner, fetch, put, client), so tests can inject doubles without monkeypatching module globals.
  • API layer uses Depends(default_theme_image_service). Endpoints know nothing about Claude/Replicate/R2. Tests override with app.dependency_overrides[default_theme_image_service] = lambda: fake_service.

Before / after

Before: flat app/images.py with free functions (build_theme_prompt, generate_candidate_images, commit_image_to_r2). API layer imported and called them directly. Tests monkeypatched app.images.replicate.run and app.api.commit_image_to_r2 to stub externals.

After: ThemeImageService with three injected providers behind Protocols. API calls service.generate_candidates(theme_body, tag_names) and service.commit_candidate(source_url). Tests at each layer see only what they need.

Smoke test on the failing theme

Input theme: "The way we talk about AI models and AI agents support an image of them as self-contained objects that function autonomously in their own virtual spaces."

Claude's scene description (what Flux actually renders):

"A sleek black box sits alone on a white pedestal in an empty, minimalist room with soft overhead light, its surface smooth and featureless except for a single glowing input slot and output port, while streams of data flow silently through invisible channels only it can perceive. The surrounding space is pristine and untouched, suggesting this object operates entirely within its own sealed logic, disconnected from the hands or minds that built it."

That's the translation gap. Abstract → concrete, which is a task LLMs excel at and image models don't.

Cost + latency

  • Haiku 4.5 per visualization: ~$0.0005 (~200 input tokens + ~60 output tokens)
  • Added latency: ~1-2s before Flux kicks in
  • Per-generation total: ~$0.0125 and ~6-10s

Tests

  • 49 new tests (175 total, all pass)
  • Service tests use FakeVisualizer / FakeImageGenerator / FakeImageStore — no network
  • Provider tests mock at the client boundary (fake anthropic client, fake runner, fake fetch/put callables)
  • Endpoint tests install a _FakeService via dependency_overrides

Test plan

  • Backend: uv run pytest → 175 passed
  • Live check: generate on the AI theme in the writer UI, confirm on-topic imagery
  • Regression: generate on a concrete theme that worked before, confirm it still works
  • Error case: temporarily break ANTHROPIC_API_KEY or REPLICATE_API_TOKEN, confirm writer sees a clean error

🤖 Generated with Claude Code

Flux couldn't render abstract themes (e.g. "the way we talk about AI
agents") because image models have no referent for meta-language. Fixed
by inserting a Claude visualization step that translates theme text into
a concrete scene sentence, which then feeds Flux.

Architectural pass to make the module simple, robust, and testable:

- Replace flat app/images.py with app/images/ package:
  - service.py — protocols (Visualizer, ImageGenerator, ImageStore),
    ThemeImageService orchestrator, pure compose_prompt + STYLE_SUFFIX
  - providers.py — ClaudeVisualizer, ReplicateImageGenerator, R2ImageStore
    concrete implementations; each wraps vendor errors as ImageGeneration/
    ImageCommitError
  - errors.py — error hierarchy, decoupled from service and providers
  - __init__.py — default_theme_image_service() factory (lru_cached)

- API layer uses Depends(default_theme_image_service); all vendor
  details stay inside providers. Endpoints now trivially swappable in
  tests via dependency_overrides.

- Providers accept their external clients/callables as constructor
  args, so tests mock at construction rather than monkeypatching module
  globals.

- Tests rewritten into three layers: service orchestration with fakes
  (no network), provider internals with mocked clients, HTTP endpoints
  with service dependency override. 49 new tests, 175 total pass.

The Claude visualizer uses Haiku 4.5. Extra cost per generation is
~$0.0005 (Haiku input/output tokens) on top of ~$0.012 for the Flux
grid-of-4. Latency adds ~1-2s.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@railway-app
Copy link
Copy Markdown

railway-app Bot commented Apr 21, 2026

🚅 Deployed to the breadcrumbs-pr-64 environment in Breadcrumbs

Service Status Web Updated (UTC)
Breadcrumbs Web App Server ✅ Success (View Logs) Web Apr 21, 2026 at 8:07 am

@railway-app railway-app Bot temporarily deployed to Breadcrumbs / breadcrumbs-pr-64 April 21, 2026 08:03 Destroyed
@meninoebom meninoebom merged commit cc4c6c8 into main Apr 22, 2026
3 checks passed
@meninoebom meninoebom deleted the feature/tune-image-prompt-literal branch April 22, 2026 06:38
@meninoebom meninoebom mentioned this pull request Apr 22, 2026
2 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant