Configurable core count #2363
Codecov Report
Attention: Patch coverage is

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2363      +/-   ##
==========================================
- Coverage   71.58%   66.60%   -4.99%
==========================================
  Files          65       64       -1
  Lines       36214    34169    -2045
==========================================
- Hits        25923    22757    -3166
- Misses      10291    11412    +1121

    Some(SocketAddr::from(([0, 0, 0, 0], server_addr.port()))),
    self.scheduler_url().to_url(),
    token,
    4,
It would be great if the test could verify that, indeed, 4 cores are being used.
I guess so, but that would add a lot more work than the actual change. You are also seeing some of the first Rust code that I've written here, so I really don't know the tradeoffs well enough to make a good decision about how to pull it off.
I have some ideas about scheduling improvements, though absolutely no guarantees that I'll get to them. That would need a corresponding test harness (maybe with a kind of mock compiler that takes a configurable amount of CPU time and memory and writes some kind of tracing output) with one of the more trivial checks being that the max number of jobs is not exceeded.
I mean actually verify, not just asking the scheduler what it thinks its limit is, which seems a bit pointless because it "obviously works" (right?)
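The kind of check discussed in this thread (mock jobs that take some time, with an assertion that the maximum number of jobs is never exceeded) could be sketched roughly as follows. This is a minimal standalone sketch using only the standard library: `JobSlots` and `peak_concurrency` are hypothetical names for illustration, not sccache types, and the "mock compiler" is just a short sleep.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Condvar, Mutex};
use std::thread;
use std::time::Duration;

/// Hypothetical stand-in for a builder's job slots (not sccache's actual type).
struct JobSlots {
    free: Mutex<usize>,
    freed: Condvar,
}

impl JobSlots {
    fn new(limit: usize) -> Self {
        JobSlots { free: Mutex::new(limit), freed: Condvar::new() }
    }

    /// Blocks until a slot is free, then takes it.
    fn acquire(&self) {
        let mut free = self.free.lock().unwrap();
        while *free == 0 {
            free = self.freed.wait(free).unwrap();
        }
        *free -= 1;
    }

    /// Returns a slot and wakes one waiting job.
    fn release(&self) {
        *self.free.lock().unwrap() += 1;
        self.freed.notify_one();
    }
}

/// Runs `jobs` mock compile jobs against `limit` slots and returns the
/// highest number of jobs that were observed running at the same time.
fn peak_concurrency(limit: usize, jobs: usize) -> usize {
    let slots = Arc::new(JobSlots::new(limit));
    let running = Arc::new(AtomicUsize::new(0));
    let peak = Arc::new(AtomicUsize::new(0));

    let handles: Vec<_> = (0..jobs)
        .map(|_| {
            let slots = Arc::clone(&slots);
            let running = Arc::clone(&running);
            let peak = Arc::clone(&peak);
            thread::spawn(move || {
                slots.acquire();
                // Record how many jobs are running, including this one.
                let now = running.fetch_add(1, Ordering::SeqCst) + 1;
                peak.fetch_max(now, Ordering::SeqCst);
                // Mock compiler: pretend to do some work.
                thread::sleep(Duration::from_millis(5));
                // Decrement before releasing so the count never overshoots.
                running.fetch_sub(1, Ordering::SeqCst);
                slots.release();
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }
    peak.load(Ordering::SeqCst)
}
```

A test along these lines would then assert something like `peak_concurrency(4, 16) <= 4`. It measures what actually ran concurrently rather than asking the limiter for its configured value, which is the distinction raised above.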
76945e4 to 5934244
5934244 to 49036b7
49036b7 to bd88a2b
bd88a2b to 515168d
  
    It seems to be a bad deal: increases line count and obscures the origin of values in a pretty long function.
515168d to 5e15d3f
  
    Also move the slight inflation of CPU core count ("overcommit" to
make up for various latencies) to the builder in order to enable
setting an exact maximum number of cores to use which will never be
exceeded. That introduces a small problem in the scheduling
protocol (excess overcommit if the builder is new and the scheduler
is old) that seems pretty acceptable to me and, anyway, does not
occur if both builder and scheduler are of the same version.
As another side effect, it shouldn't occur anymore that the
scheduler reports more running jobs than available slots.
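As a rough illustration of the idea in this commit message, the builder-side calculation might look something like the sketch below. Everything here is an assumption for illustration: the function name `advertised_cores`, the `OVERCOMMIT` factor of 1.25, and the choice to skip overcommit entirely when a core count is configured are not taken from the PR's code.

```rust
/// Hypothetical overcommit factor; the real value is not stated in this PR page.
const OVERCOMMIT: f64 = 1.25;

/// Number of job slots a builder would report to the scheduler.
///
/// Assumption: an explicitly configured count is treated as an exact upper
/// bound, while an auto-detected count is slightly inflated to make up for
/// various latencies.
fn advertised_cores(configured: Option<usize>, detected: usize) -> usize {
    match configured {
        // Exact maximum requested by the operator: never exceed it.
        Some(n) => n,
        // No explicit limit: inflate the detected count, rounding up.
        None => ((detected as f64) * OVERCOMMIT).ceil() as usize,
    }
}
```

Under these assumptions, `advertised_cores(Some(4), 16)` yields 4, while `advertised_cores(None, 8)` yields 10. Doing the inflation on the builder means the scheduler can take the reported number at face value.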
5e15d3f to 3d47eb4
    
Some nodes can run out of memory if all of their cores are used. There may be other reasons to limit the number of cores used.