feature: add Dockerfile.maximal to optimize CI (#1252)#1339
feature: add Dockerfile.maximal to optimize CI (#1252)#1339tirthpatel90 wants to merge 16 commits intooraios:mainfrom
Conversation
… to drastically reduce build time
…fix 6-hour infinite hang
|
Hey @tirthpatel90 , thanks for the PR. This just adds a new dockerfile, I don't see how this is accelerating CI. The dockerfile won't be built into an image or used anywhere in CI, or am I missing something? |
|
Hi @MischaPanch, you are absolutely right! My main focus initially was to successfully build this massive Now that the core image builds perfectly, I'm ready to wire it up to the CI! To actually accelerate the CI, what is your preferred strategy? Should we add a workflow to publish this image to the GitHub Container Registry ( Let me know how you'd like to handle the image hosting, and I'll push the necessary CI workflow changes to this PR right away! |
|
Yes, let's build it, push it to the GH container registry, adjust the pytest workflow to use it and see how much it accelerates. Would it be possible for you to test this out in your fork and link the actions here? I suppose you can't write to the right place in the container registry from the actions triggered in a PR, right? |
|
I would prefer to review the dockerfile after things are running and acceleration is visible :). That's why I'm asking |
|
Btw, setting up OCaml eats up a lot of time in CI, if it's possible to include it here, that would be great. If not, we might just disable it at some point |
|
Sounds like a plan, @MischaPanch! I'll add OCaml to the maximal image, set up a workflow on my fork to build and publish it to my personal GHCR, and then adjust the I'll ping you here with the action run links showing the acceleration metrics once it's running smoothly on my end. Working on it now! |
|
Hi @MischaPanch, I have successfully published the maximal image to GHCR and wired it up to a test workflow on my fork. The good news: The environment works perfectly and immediately breezes through ~65% of the test suite (including C++, Go, Java, Rust, etc.) without any setup overhead! However, the test execution hangs/slows down significantly after the 65% mark. Based on the logs, I suspect two reasons for this:
Does the test suite automatically attempt to install missing language servers during execution? If so, is there an environment variable or a |
|
Closing this PR in favor of the parallelized CI architecture discussed in #1362. The monolithic maximal image successfully proved that we can drastically cut down test times by pre-baking dependencies. However, moving forward, we will pivot to segmenting the tests into dynamic matrix batches and running them in parallel using a leaner base image. This will be more scalable and maintainable. Thanks for the feedback, everyone! I'll be opening a new PR for the parallel matrix workflows soon. |
This PR introduces Dockerfile.maximal to resolve #1252. It successfully ports the heaviest dependencies (R, Julia, Rust, Go, C++, Node, Ruby, etc.) into a native Docker image, bypassing OS security restrictions and memory limits.
Note: For stability and to avoid GitHub Actions OOM limits, extremely niche toolchains (like Swift, Haskell, Lean4) are deferred to a Phase 2 update. This Phase 1 image immediately resolves the primary CI compilation bottlenecks.