chore(docker): reduce size between docker builds #7571

Open
wants to merge 1 commit into
base: main

keturn
Contributor

@keturn keturn commented Jan 18, 2025

by adding a layer with all the pytorch dependencies that don't change most of the time.

Summary

Every time the main docker images are rebuilt and I pull main-cuda, the pull downloads another 3+ GB, which seems like far too much, since most things don't change from one commit on main to the next.

This is an attempt to follow the guidance in Using uv in Docker: Intermediate Layers, so that there's one layer that installs all the dependencies (including PyTorch with its bundled NVIDIA libraries) before the project's own frequently-changing files are copied into the image.
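
For illustration, here is a minimal sketch of that intermediate-layer pattern, adapted from the uv documentation rather than copied from this PR's Dockerfile. It assumes a builder that supports RUN --mount, a uv.lock checked in next to pyproject.toml, and a placeholder base image; the real image's base, index configuration for CUDA wheels, and sync flags may differ.

```dockerfile
# Sketch only: the base image and the presence of uv.lock are assumptions.
FROM python:3.11-slim

# Copy the uv binaries from the published uv image.
COPY --from=ghcr.io/astral-sh/uv:latest /uv /uvx /bin/

# uv creates the project environment at <workdir>/.venv by default,
# which here gives /opt/invokeai/.venv.
WORKDIR /opt/invokeai

# Copy files out of the cache mount instead of hard-linking across filesystems.
ENV UV_LINK_MODE=copy

# Layer 1: third-party dependencies only (PyTorch and friends).
# The layer is reused until pyproject.toml or uv.lock changes.
RUN --mount=type=cache,target=/root/.cache/uv \
    --mount=type=bind,source=uv.lock,target=uv.lock \
    --mount=type=bind,source=pyproject.toml,target=pyproject.toml \
    uv sync --frozen --no-install-project

# Layer 2: the project's own frequently-changing files.
COPY . /opt/invokeai
RUN --mount=type=cache,target=/root/.cache/uv \
    uv sync --frozen
```

With this ordering, a commit that touches only project source should invalidate just the second, much smaller layer, so a pull after such a commit would not need to fetch the multi-gigabyte dependency layer again.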

Related Issues / Discussions

QA Instructions

Hopefully the CI system building the docker images is sufficient.

But there is one change to pyproject.toml related to xformers, so it'd be worth checking that python -m xformers.info still says it has triton on the platforms that expect it.

Merge Plan

I don't expect this to be a disruptive merge. Though I did take the liberty of moving /opt/venv to the uv-default /opt/invokeai/.venv, which someone might notice.

Checklist

  • The PR has a short but descriptive title, suitable for a changelog
  • Tests added / updated (if applicable)
  • Documentation added / updated (if applicable)
  • Updated What's New copy (if doing a release after this PR)

@github-actions github-actions bot added the docker, Root, and python-deps (PRs that change python dependencies) labels Jan 18, 2025
Comment on lines -104 to -105
# Auxiliary dependencies, pinned only if necessary.
"triton; sys_platform=='linux'",
Contributor Author


As far as I can tell, CUDA builds of torch 2.4.1 have a dependency on triton, which has two consequences:

  1. triton is installed without us declaring the dependency here.
  2. some versions of triton conflict with that torch dependency. For example, torch 2.4.1+cu124 requires triton 3.0.0 and will conflict with triton 3.1.0.

If we had an exact dependency on torch==2.4.1+cu124, uv's resolver would figure out which version of triton works, and it would be fine.

However, since the torch version is left ambiguous, it's possible for the resolver to decide that non-CUDA torch==2.4.1 is a better solution since that doesn't conflict with the latest version of xformers. 😖
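
One way to catch that failure mode at image build time is a sanity-check step like the sketch below. It is not part of this PR; it assumes it is appended to a stage where the environment already exists at /opt/invokeai/.venv, and that the image is meant to ship a CUDA build of torch.

```dockerfile
# Hypothetical build-time check, not taken from this PR.
# torch.version.cuda is None for CPU-only builds, so the assert fails the
# build if the resolver picked non-CUDA torch. No GPU is needed to run it.
RUN /opt/invokeai/.venv/bin/python -c "import torch; from importlib.metadata import version; assert torch.version.cuda is not None, 'resolver picked a CPU-only torch build'; print('torch', torch.__version__, 'triton', version('triton'))"
```

The printed versions also show in the build log which triton release ended up alongside torch.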

@keturn
Contributor Author

keturn commented Jan 18, 2025

This Dockerfile is also quirky in that it separates builder and runtime stages, but then it puts all the build-deps in the runtime stage anyway (to build patchmatch?), which kinda defeats the purpose. But I think we can leave that alone for now as an independent concern.

There is one thing I haven't confirmed for the space savings: my test builds have been with podman (buildah), not buildkit. buildah doesn't support COPY --link, so the interaction between the stages and layers isn't exactly the same…
