Implement a dynamic work-stealing test fork runner (1500USD Bounty) #4590

Closed
lihaoyi opened this issue Feb 19, 2025 · 1 comment · Fixed by #4614

@lihaoyi
Member

lihaoyi commented Feb 19, 2025


From the maintainer Li Haoyi: I'm putting a 1500USD bounty on this issue, payable by bank transfer on a merged PR implementing this.

See https://github.com/orgs/com-lihaoyi/discussions/6 for other bounties and the terms and conditions that bounties operate under


The goal of this ticket is to implement a variant of testForkGrouping that performs work-stealing. Rather than splitting test classes into fixed groups, with each group running serially in its own JVM, we want to spawn one or more JVMs per test group. The JVMs in a group work-steal from that group's test classes and shut down once all of the group's test classes have completed.
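A minimal sketch of the claiming behaviour described above, not Mill's implementation: worker threads stand in for forked JVMs, and a shared in-memory queue stands in for whatever cross-process mechanism a real runner would need. The names `WorkStealingSketch` and `runGroup` are hypothetical, and "running" a test class is reduced to recording who claimed it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical illustration: threads play the role of forked test JVMs.
final class WorkStealingSketch {

    // Processes every test class of ONE group and returns, per worker,
    // the classes that worker claimed.
    static Map<Integer, List<String>> runGroup(List<String> testClasses, int jobs) {
        ConcurrentLinkedQueue<String> pending = new ConcurrentLinkedQueue<>(testClasses);
        Map<Integer, List<String>> claimed = new ConcurrentHashMap<>();
        // Assumption: never start more workers than there are classes or jobs.
        int workers = Math.max(1, Math.min(jobs, testClasses.size()));

        List<Thread> threads = new ArrayList<>();
        for (int i = 0; i < workers; i++) {
            List<String> mine = new ArrayList<>();
            claimed.put(i, mine);
            Thread t = new Thread(() -> {
                String next;
                // poll() is atomic, so each class is claimed by exactly one worker.
                while ((next = pending.poll()) != null) {
                    mine.add(next); // placeholder for actually running the test class
                }
                // Queue is empty: the worker "shuts down" by returning.
            });
            threads.add(t);
            t.start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
        return claimed;
    }

    public static void main(String[] args) {
        System.out.println(runGroup(List.of("FooTests", "BarTests", "BazTests", "QuxTests"), 2));
    }
}
```

Which worker ends up with which class varies between runs; the only guarantees are that every class is claimed exactly once and that workers stop when the queue is empty.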

Currently, testForkGrouping only allows static allocation of test classes to subprocesses. This means it cannot easily be turned on by default: there will always be codebases with many fast test classes, where forking is a net loss due to JVM startup overhead, and others with fewer, slower test classes, where that overhead matters less. A work-stealing test runner as described above would avoid this problem by letting fast test classes share a JVM, forking only as many parallel JVMs as are needed to saturate Mill's thread count config (--jobs), which defaults to NUM_CORES. This would likely be self-tuning enough to turn on by default, so everyone can benefit from running test classes in parallel regardless of the runtime characteristics of their test classes.
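To make the difference concrete, here is a small simulation sketch comparing the two strategies on made-up durations. It rests on several assumptions that are not from the issue: static allocation is modelled as round-robin, JVM startup cost is ignored, and `ForkSchedulingSketch` and its methods are hypothetical names.

```java
import java.util.PriorityQueue;

// Hypothetical illustration: compares total wall-clock time (makespan)
// of static allocation versus dynamic claiming of test classes.
final class ForkSchedulingSketch {

    // Static allocation (assumed round-robin): class i always runs on fork i % forks.
    static long staticMakespan(long[] durations, int forks) {
        long[] load = new long[forks];
        for (int i = 0; i < durations.length; i++) {
            load[i % forks] += durations[i];
        }
        long max = 0;
        for (long l : load) {
            max = Math.max(max, l);
        }
        return max;
    }

    // Dynamic claiming: whichever fork becomes free first takes the next class.
    static long dynamicMakespan(long[] durations, int forks) {
        PriorityQueue<Long> freeAt = new PriorityQueue<>();
        for (int i = 0; i < forks; i++) {
            freeAt.add(0L);
        }
        long max = 0;
        for (long d : durations) {
            long finish = freeAt.poll() + d; // earliest-free fork runs this class
            max = Math.max(max, finish);
            freeAt.add(finish);
        }
        return max;
    }

    public static void main(String[] args) {
        long[] seconds = {10, 1, 1, 1, 10, 1, 1, 1}; // made-up test class durations
        System.out.println("static=" + staticMakespan(seconds, 2)
                + " dynamic=" + dynamicMakespan(seconds, 2));
        // prints "static=22 dynamic=13"
    }
}
```

With these numbers, the round-robin split happens to put both slow classes on the same fork, while dynamic claiming lets the idle fork pick up the remaining work.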

As described above, work stealing would only occur within each test group. Thus a user would still be able to use testForkGrouping to separate test classes that should never run in the same JVM, but if there are no such restrictions they could stick with the default "everything in one group" config and Mill's test runners will work-steal the test classes to complete them as soon as possible.

See earlier discussion in #4419

@HollandDM
Contributor

HollandDM commented Feb 23, 2025

Hello, I've submitted a PR in an attempt to handle this. It's currently a PoC, as it uses a simple cached thread pool. I'd like to clarify the direction first, then follow up with the work-stealing thread pool implementation if everything is good to go. Please take a look so we can discuss the approach.

@lefou lefou added this to the 0.13.0 milestone Mar 10, 2025