
feat(docker): slim/full image variants + cached deps layer + [full] extra rename #138

Merged
georgeh0 merged 2 commits into main from g/docker-layer-cache on Apr 14, 2026

georgeh0 commented on Apr 14, 2026

Follow-up to #135, #136, #137. Restructures the Docker image for fast incremental builds and publishes two variants per release.

Summary

Dockerfile — layer split for cache reuse across releases:

  • Stage 1 (deps): installs only the heavy, slow-changing base deps per variant. No cocoindex or cocoindex-code here, because they bump too often. The cache key is the RUN command string, so this layer is reused across releases until we bump it manually.
  • Stage 2 (builder): installs cocoindex + cocoindex-code per release. Fast because transitive deps are already in place.

Two image variants published per release:

| Tag | Size | Backend | Notes |
|---|---|---|---|
| cocoindex/cocoindex-code:latest (slim, default) | ~450 MB | LiteLLM-only | Cloud embeddings. Most users. |
| cocoindex/cocoindex-code:full | ~5 GB | sentence-transformers + LiteLLM | Offline-ready, includes torch + pre-baked default model. |

The release workflow runs a build matrix over {slim, full}; each variant has its own GHA BuildKit cache scope so they don't evict each other's layers. A :slim / :full tag pair is also published to GHCR.

Packaging rename: cocoindex-code[default] → cocoindex-code[full] to match the Docker variant name. [embeddings-local] remains the canonical primary extra; [full] is the umbrella alias that may bundle more optional niceties over time. CLI hints that point at missing sentence-transformers continue to reference [embeddings-local] (the specific pointer).

README: new "Choosing an image" table documents both variants. Mac-on-Docker MPS note narrowed to the :full case only — slim + LiteLLM users are unaffected because inference happens on the provider's side.

Test plan

  • Local docker build for both variants succeeds (verified: slim ≈ 431 MB, full ≈ 5.38 GB).
  • mypy + default pytest green.
  • Next workflow_dispatch with test_docker=true will push :test (slim) and :test-full to both registries; once confirmed, the following real release should show dramatically shorter per-release build time (base deps layer cached).

🤖 Generated with Claude Code

…d GHA cache

Dockerfile previously installed cocoindex-code, cocoindex, torch,
sentence-transformers, and all transitive deps in one RUN. Any change to
the source tree (via COPY . /ccc-src) invalidated that single layer,
forcing a full re-install — ~1 GB of wheels for torch + friends — on
every release. Under QEMU for the arm64 cross-build this was slow
enough to be painful.

Split into two stages:
- `deps`: install cocoindex + cocoindex-code[default] from PyPI. Cache
  key is just the RUN command string, so this layer is reused across
  releases until we bump the pins.
- `builder`: overlay the release version via
  `CCC_INSTALL_SPEC=/ccc-src[default]` with `--no-deps
  --force-reinstall` — only the cocoindex-code package is touched; the
  heavy deps layer stays untouched.

Also add BuildKit layer cache (`type=gha`) to the publish-docker job so
the deps layer persists across workflow runs, not just within a single
build.
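
The effect of the split can be illustrated with a toy model of layer caching. This is a simplified sketch, not BuildKit's actual algorithm: it assumes each step's cache key is a hash of its parent key, its instruction text, and (for `COPY`) a digest of the copied files. The `FROM python` base, the `layer_keys` / `cache_hits` helpers, and the step lists are hypothetical stand-ins for the real Dockerfile.

```python
import hashlib


def layer_keys(steps):
    """Return one cache key per step, chaining each key to its parent."""
    keys, parent = [], ""
    for instruction, content_digest in steps:
        raw = f"{parent}|{instruction}|{content_digest}"
        key = hashlib.sha256(raw.encode()).hexdigest()
        keys.append(key)
        parent = key
    return keys


def cache_hits(previous, current):
    """For each step in `current`, report whether a build of `previous` cached it."""
    seen = set(layer_keys(previous))
    return [key in seen for key in layer_keys(current)]


def single_layer(src_digest):
    # Old layout (assumed shape): source is copied first, then everything installs in one RUN.
    return [
        ("FROM python", ""),
        ("COPY . /ccc-src", src_digest),
        ("RUN pip install /ccc-src[default]", ""),
    ]


def split_layers(src_digest):
    # New layout (assumed shape): heavy deps install before the source tree is copied.
    return [
        ("FROM python", ""),
        ("RUN pip install cocoindex cocoindex-code[default]", ""),
        ("COPY . /ccc-src", src_digest),
        ("RUN pip install --no-deps --force-reinstall /ccc-src[default]", ""),
    ]


# A new release changes the source tree, so the COPY digest differs.
print(cache_hits(single_layer("release-1"), single_layer("release-2")))
# [True, False, False]
print(cache_hits(split_layers("release-1"), split_layers("release-2")))
# [True, True, False, False]
```

In the split layout the second step, which stands in for the expensive install, stays cached across releases because nothing above it depends on the source tree.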
…ull] extra

Build two Docker image variants per release:
- slim (:latest, default) — ~450 MB. LiteLLM-only. cocoindex + cocoindex-code
  without sentence-transformers. Targets cloud-backed embeddings.
- full (:full)            — ~5 GB. Bundles sentence-transformers + torch +
  a pre-baked default model. Targets offline-ready local embeddings.

Dockerfile gains a CCC_VARIANT build arg that gates stage 1's
sentence-transformers install and stage 3's model bake. Release workflow
matrices on {slim, full}; each variant has its own GHA cache scope so
layer reuse works across releases without the variants evicting each
other.
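
Why the per-variant scope matters can be sketched with another toy model. It assumes a cache backend in which each export replaces whatever was previously stored under the same scope, which is how the BuildKit `type=gha` documentation describes building multiple images without setting `scope`. The `ScopedCache` class, the scope names, and the layer labels are all hypothetical.

```python
class ScopedCache:
    """Toy cache backend: each export replaces the contents of its scope."""

    def __init__(self):
        self._by_scope = {}

    def export(self, scope, layers):
        # Overwrites any earlier export under the same scope.
        self._by_scope[scope] = set(layers)

    def has(self, scope, layer):
        return layer in self._by_scope.get(scope, set())


# One shared scope: the full build's export replaces the slim build's layers.
shared = ScopedCache()
shared.export("docker", {"deps-slim"})
shared.export("docker", {"deps-full"})
print(shared.has("docker", "deps-slim"))  # False

# One scope per variant: both deps layers survive for the next release.
scoped = ScopedCache()
scoped.export("docker-slim", {"deps-slim"})
scoped.export("docker-full", {"deps-full"})
print(scoped.has("docker-slim", "deps-slim"))  # True
print(scoped.has("docker-full", "deps-full"))  # True
```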

Also rename the PyPI `[default]` umbrella extra to `[full]` so pip and
Docker names match. `[embeddings-local]` remains the canonical primary
extra (the one that specifically pulls in sentence-transformers); `[full]`
is its umbrella alias that may bundle additional optional niceties later.
CLI hints that point at missing sentence-transformers continue to name
`[embeddings-local]` directly — the most specific pointer for that case.

README documents both image variants with a comparison table and narrows
the Mac-on-Docker MPS note to only :full users (slim + LiteLLM is
unaffected).
georgeh0 changed the title from "perf(docker): split into reusable deps layer + add GHA build cache" to "feat(docker): slim/full image variants + cached deps layer + [full] extra rename" on Apr 14, 2026
georgeh0 merged commit 6f84edc into main on Apr 14, 2026
9 of 10 checks passed
georgeh0 deleted the g/docker-layer-cache branch on April 14, 2026 23:25