The goal of this ticket is to implement a variant of testForkGrouping that performs work-stealing: rather than splitting test classes into fixed groups, with each group running serially in a separate JVM, we want to spawn one or more JVMs for each test group, with those JVMs work-stealing the group's test classes and shutting down once all of that group's test classes have been completed.
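A minimal sketch of the idea, where plain in-process threads stand in for the forked JVMs and every name (runGroup, runTestClass) is hypothetical rather than anything in Mill today: each worker keeps pulling test classes off the group's shared queue until it is empty, then exits.

```scala
import java.util.concurrent.ConcurrentLinkedQueue

// Hypothetical sketch only: threads stand in for forked JVMs, and the
// in-process queue stands in for whatever cross-process mechanism the
// real runner would use to hand out test classes.
object WorkStealingSketch {
  def runGroup(testClasses: Seq[String], jvmCount: Int): Unit = {
    val queue = new ConcurrentLinkedQueue[String]()
    testClasses.foreach(queue.add)

    val workers = (1 to jvmCount).map { id =>
      new Thread(() => {
        var cls = queue.poll()
        while (cls != null) {    // keep stealing until the group's queue is drained
          runTestClass(id, cls)  // placeholder for "run this test class in this JVM"
          cls = queue.poll()
        }
      })
    }
    workers.foreach(_.start())
    workers.foreach(_.join())    // the group is done once every worker has exited
  }

  private def runTestClass(workerId: Int, cls: String): Unit =
    println(s"worker $workerId ran $cls")
}
```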
Currently, testForkGrouping only allows static allocation of test classes to the various subprocesses. This means it cannot easily be turned on by default: there will always be codebases with many fast test classes where forking is a net negative due to JVM startup overhead, and others with fewer slow test classes where the JVM startup overhead matters less. A work-stealing test runner as described above would avoid this problem by letting fast test classes share a JVM, forking only as many parallel JVMs as necessary to saturate Mill's thread-count config (--jobs), which defaults to NUM_CORES. This would likely be self-tuning enough to turn on by default, so everyone can benefit from running test classes in parallel regardless of the runtime characteristics of their test classes.
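One way to read that sizing rule, expressed as a hypothetical helper rather than anything Mill currently exposes: never fork more JVMs for a group than it has test classes, and never more than the configured parallelism. (A real self-tuning runner might spawn JVMs lazily as work remains; this only captures the upper bound.)

```scala
// Hypothetical sizing helper: `jobs` stands for Mill's --jobs value.
def jvmsForGroup(groupSize: Int, jobs: Int): Int =
  math.max(1, math.min(groupSize, jobs))

// e.g. a group of 3 test classes under --jobs=8 forks at most 3 JVMs, not 8:
// jvmsForGroup(3, 8) == 3
```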
As described above, work-stealing would only occur within each test group. Thus a user would still be able to use testForkGrouping to separate test classes that should never run in the same JVM, but if there are no such restrictions they could stick with the default "everything in one group" config and Mill's test runners would work-steal the test classes to complete them as soon as possible. See earlier discussion in #4419.
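For illustration, a build.sc sketch of that usage, assuming Mill's existing testForkGrouping: T[Seq[Seq[String]]] and discoveredTestClasses signatures (the module layout, Scala/utest versions, and the "DatabaseTests" suffix are all placeholders): suites that must not share a JVM each get a single-class group, and everything else stays in one group for the work-stealing runner to spread across JVMs.

```scala
// build.sc (sketch, not a verified configuration)
import mill._, scalalib._

object foo extends ScalaModule {
  def scalaVersion = "2.13.14"

  object test extends ScalaTests with TestModule.Utest {
    def ivyDeps = Agg(ivy"com.lihaoyi::utest:0.8.3")

    // Isolate suites that must not share a JVM; leave the rest in one group
    // so the work-stealing runner can spread them across JVMs as needed.
    def testForkGrouping = T {
      val (isolated, shared) =
        discoveredTestClasses().partition(_.endsWith("DatabaseTests"))
      isolated.map(Seq(_)) ++ Seq(shared)
    }
  }
}
```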
Hello, I've submitted my PR in an attempt to handle this. It's currently a PoC, as it uses a simple cached thread pool. I want to clarify the direction first and then follow up with the work-stealing thread pool implementation if everything is good to go. Please take a look and we can have some conversation about the approach.
From the maintainer Li Haoyi: I'm putting a 1500 USD bounty on this issue, payable by bank transfer on a merged PR implementing this.
See https://github.com/orgs/com-lihaoyi/discussions/6 for other bounties and the terms and conditions that bounties operate under