perf(docker): single-stage with cache-friendly layer ordering #139

Merged
georgeh0 merged 1 commit into main from g/docker-layer-cache-v2
Apr 15, 2026

@georgeh0
Member

Replaces the multi-stage / two-COPY layout introduced in #138 with a single-stage Dockerfile that actually achieves the user-pull-cost optimization. The previous attempt bloated the image to 10 GB without reducing per-release downloads — BuildKit's COPY --from emits the full copied tree as a layer rather than a diff vs. the destination.

Summary

  • Single-stage runtime image with cache-friendly layer ordering. Heavy stable installs (sentence-transformers, model bake, base setup) come first; per-release cocoindex + cocoindex-code install is the last layer. Each RUN uv pip install produces its own distinct layer with a content-addressable digest.
  • Stable layers persist across releases. The sentence-transformers install is keyed on the RUN command string (no source-tree dependency). Subsequent releases reuse the same digest, so users who run docker pull for an upgrade reuse the ~5 GB layer they already have locally.
  • Per-release layer is small: ~470 MB, containing cocoindex + cocoindex-code + their non-ST transitive deps (LiteLLM stack, MCP, typer, pydantic, etc.). Future option: move litellm into the stable layer to shrink it further.
  • RUN --mount=type=bind,source=.,target=/ccc-src,rw=true instead of COPY . /ccc-src — gives hatch-vcs a writable overlay for _version.py during the PEP 517 build without persisting the source tree as a layer in the final image.
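
The layer ordering described in the bullets above can be sketched roughly as follows. This is a minimal illustration, not the Dockerfile from this PR: the base image, the way uv gets installed, and the `--system` flag are assumptions, and the model-bake step is left as a comment because its command is not shown here. Only the `RUN --mount=type=bind,source=.,target=/ccc-src,rw=true` form is taken from the PR text.

```dockerfile
# syntax=docker/dockerfile:1
# Illustrative sketch only; base image and uv bootstrap are assumptions.
FROM python:3.12-slim

# Assumption: uv is bootstrapped via pip; the real image may do this differently.
RUN pip install --no-cache-dir uv

# Stable layer: the cache key is this command string, with no dependency on
# the source tree, so the layer can be reused across releases.
RUN uv pip install --system sentence-transformers

# (Model bake and other stable setup would go here, still before any
# source-tree-dependent step.)

# Per-release layer, kept last. The bind mount exposes the build context at
# /ccc-src with a writable overlay, so the build backend can write files such
# as _version.py; those writes are discarded and the source tree itself is
# not stored as an image layer.
RUN --mount=type=bind,source=.,target=/ccc-src,rw=true \
    uv pip install --system /ccc-src
```

With this ordering, a change to the source tree invalidates only the final RUN, which is what keeps the per-release download small.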

Numbers

| | Total image | Per-release pull |
|---|---|---|
| Before (single COPY) | ~5 GB | ~5 GB |
| #138 (two-COPY split) | 10.1 GB | ~5 GB (no improvement) |
| This PR | 5.77 GB (full) / 534 MB (slim) | ~470 MB (full) |

Test plan

  • Local builds for both variants succeed end-to-end.
  • uv run pytest -m docker_e2e — 6 passed, 2 Linux-only PUID tests skipped on macOS.
  • The next workflow_dispatch run with test_docker=true will populate the GHA cache; the release after that should show short build times.

🤖 Generated with Claude Code

Reshape the Dockerfile so heavy deps live in a stable early layer (digest
reproducible across releases, users cache it) and per-release cocoindex +
cocoindex-code installs land in their own small layer at the end. Cuts
the per-release `docker pull` from ~5 GB to ~470 MB.

Specifically:
- Drop the multi-stage builder/model_cache layout; do everything in one
  runtime image so each install RUN produces its own distinct layer.
  BuildKit COPY in a multi-stage build emits the full copied tree as a layer
  (not a diff) — that's what made the previous two-COPY split bloat the
  image to ~10 GB without saving any pull cost.
- Order layers so per-release content (the source-tree-dependent install)
  is last; everything before reuses across releases.
- Use `RUN --mount=type=bind,source=.,target=/ccc-src,rw=true` instead of
  `COPY . /ccc-src` so hatch-vcs can write `_version.py` during the PEP 517
  build without persisting the source tree as a layer in the final image.

Image sizes: slim 534 MB (was 598 MB), full 5.77 GB (was 5.83 GB).
Per-release layer: 468 MB (uv install on top of pre-installed ST).
Verified: docker E2E suite passes (6 passed, 2 Linux-only skipped on macOS).
@georgeh0 georgeh0 merged commit 00ae2d2 into main Apr 15, 2026
9 of 10 checks passed
@georgeh0 georgeh0 deleted the g/docker-layer-cache-v2 branch April 15, 2026 00:38