scheduler: fixed a bug where the bandwidth of reserved cores were not taken into account #26768
Conversation
Force-pushed from 505ac99 to ab9cb3f
@mvegter before you get too much further with this and/or we get into a detailed review, could you help us out by describing the bug in detail in the PR description?
Hey @tgross, I like the speed at which you're picking up on draft PRs 😉 I have added some rudimentary description covering the basics of the problem at hand and a reproducer for latest main (but also some initial tests). I'm changing work laptops by the end of the week, hence my premature push, so I probably won't be able to continue working on this anyhow.
Hey @tgross, before I pick this up again, do you have a preference for first creating a GitHub issue, or perhaps already any feedback on the direction of this solution?
Sorry, I got sidetracked and didn't get a chance to re-review your updated description here. No need to open a new GitHub issue for it, we can discuss here. The overall problem you're describing makes sense. It looks like the node is fingerprinting the usable compute as I'd expect. Ex. this instance has 22 cores and a total of 25400 MHz (mix of pCores and eCores). If I reserve one core:

```hcl
client {
  reserved {
    cores = "0"
  }
}
```

And as your
It looks like your approach is to tweak the subtraction of comparable resources after the fact. And it looks like this has broken some tests. Wouldn't it be better to make sure that the comparable resources actually include the usable CPU (less the reserved cores) to begin with, rather than patching that up?
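The "include usable CPU to begin with" idea can be sketched as follows. The `core` type and `usableMHz` helper here are simplified stand-ins invented for illustration, not Nomad's actual structs:

```go
package main

import "fmt"

// Simplified stand-ins for illustration; Nomad's real NodeResources
// and CPU topology types are considerably richer than this.
type core struct {
	id  int
	mhz int
}

// usableMHz totals bandwidth while excluding reserved cores up front,
// instead of subtracting a reserved total after the fact.
func usableMHz(cores []core, reserved map[int]bool) int {
	total := 0
	for _, c := range cores {
		if reserved[c.id] {
			continue // a reserved core contributes no schedulable bandwidth
		}
		total += c.mhz
	}
	return total
}

func main() {
	cores := []core{{0, 2245}, {1, 2245}, {2, 2245}}
	// Reserving core 0 leaves two cores' worth of bandwidth.
	fmt.Println(usableMHz(cores, map[int]bool{0: true})) // 4490
}
```

The point of the sketch is that the reserved cores never enter the total, so there is no separate subtraction step that callers can forget.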
Force-pushed from 2296998 to b8e7428
@mvegter I saw that you pushed an update. I'm going to try to get this reviewed this week.
Hi @mvegter! This is looking pretty good. In addition to my comments inline, I think the test changes aren't actually exercising the behavior you expect. If I drop the `nomad/structs/funcs_test.go` into `main`, it still passes all the tests, which suggests there's a gap there.
`nomad/structs/funcs.go` (outdated)

```go
	nodeMem -= float64(reserved.Flattened.Memory.MemoryMB)
}
available := node.NodeResources.Comparable()
available.Subtract(node.ReservedResources.Comparable())
```
As I've noted in my comment on `nomad/structs/structs.go`, we've changed the return value of `NodeResources.Comparable()` so that it now includes the unreserved resources. But then we have to subtract the `ReservedResources.Comparable()` here and on line 195?
Thanks for the review @tgross, I fixed up most of your comments. As we are looping on this one, would you prefer the fix in `node.NodeResources.Comparable()` to reflect the usable resources, dropping `Comparable` on the `ReservedResources`? This would be a slightly big change, as the `NodeResources` has no view of the reserved resources (except the leaking CPU details in `Processors.Topology`). Or, create a new method, e.g. `node.Comparable()`, that would essentially convert the `NodeResources` and `ReservedResources` into a single `ComparableResources`?
> If I drop the `nomad/structs/funcs_test.go` into `main`, it still passes all the tests, which suggests there's a gap there.

Not sure I see the gap you are referring to, as `funcs_test.go` covers the cases after fingerprinting; the change in `CpuShares` behaviour is more accurately captured in `client_test.go`.
> This would be a slightly big change, as the `NodeResources` has no view of the reserved resources (except the leaking CPU details in `Processors.Topology`). Or, create a new method, e.g. `node.Comparable()`, that would essentially convert the `NodeResources` and `ReservedResources` into a single `ComparableResources`?

Oof, this data model is ugly, isn't it? 😁 Another alternative would be to have `NodeResources.Comparable()` take a parameter for the reserved resources:

```go
func (n *NodeResources) Comparable(reserved *ComparableResources) *ComparableResources {
```

That way we'd force the caller to make sure they're correctly handling the reserved resources, while making it clear this doesn't include resources for allocations. That's still pretty ugly though.
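A rough sketch of how that parameterized signature would push the decision onto callers. The struct fields and `Subtract` body here are illustrative stand-ins, not Nomad's real types:

```go
package main

import "fmt"

// Illustrative stand-ins; the fields are invented for this sketch
// and do not match Nomad's actual structs.
type ComparableResources struct{ CPUShares int64 }

func (c *ComparableResources) Subtract(o *ComparableResources) {
	c.CPUShares -= o.CPUShares
}

type NodeResources struct{ TotalCPUShares int64 }

// Comparable takes the reserved resources as an explicit parameter,
// so every caller must decide how reservations are handled.
func (n *NodeResources) Comparable(reserved *ComparableResources) *ComparableResources {
	out := &ComparableResources{CPUShares: n.TotalCPUShares}
	if reserved != nil {
		out.Subtract(reserved)
	}
	return out
}

func main() {
	node := &NodeResources{TotalCPUShares: 25400}
	reserved := &ComparableResources{CPUShares: 2245}
	fmt.Println(node.Comparable(reserved).CPUShares) // 23155
}
```

Passing `nil` would still be possible, which is one reason this variant stays "pretty ugly": the compiler forces an argument, not a correct one.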
Tell you what, none of this impacts the functionality and is just my aesthetic preferences, so we can land this PR without worrying about that.
> Not sure I see the gap you are referring to, as `funcs_test.go` covers the cases after fingerprinting; the change in `CpuShares` behaviour is more accurately captured in `client_test.go`.

I guess I'm asking why bother changing the tests in `funcs_test.go` if all the tests in `nomad/structs` pass even if we dropped those test changes in the main branch? What are we testing here? Is this just belt-and-suspenders to make sure we don't introduce nil pointer bugs later on? (That's fine if so.)
> Tell you what, none of this impacts the functionality and is just my aesthetic preferences, so we can land this PR without worrying about that.

Given that `NodeResources.Comparable` and `ReservedResources.Comparable` are always called sequentially, essentially merging the two into a `Node.Comparable` and dropping their respective methods may be the cleanest approach for now.

> What are we testing here? Is this just belt-and-suspenders to make sure we don't introduce nil pointer bugs later on? (That's fine if so.)

These are the first usages of "disabled" cores; the `Disabled` field is currently there only for cosmetic reasons, as none of the tests check `Topology` directly, but the actual bandwidth and the reserved resources are now tied together and will affect test results.
> Essentially merging the two into a `Node.Comparable` and dropping their respective methods may be the cleanest approach for now.

Go for it.
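A minimal sketch of what a merged `node.Comparable()` could look like, assuming simplified stand-in types (Nomad's actual `Node`, `NodeResources`, and `ComparableResources` are far richer than this):

```go
package main

import "fmt"

// Illustrative stand-ins only; Nomad's real types are richer.
type ComparableResources struct{ CPUShares, MemoryMB int64 }

func (c *ComparableResources) Subtract(o *ComparableResources) {
	c.CPUShares -= o.CPUShares
	c.MemoryMB -= o.MemoryMB
}

type Node struct {
	NodeResources     ComparableResources // pretend these flatten directly
	ReservedResources ComparableResources
}

// Comparable merges node and reserved resources into a single view,
// replacing the NodeResources.Comparable / ReservedResources.Comparable
// pair that was always called sequentially.
func (n *Node) Comparable() *ComparableResources {
	out := n.NodeResources // copy, so the node's fields stay untouched
	out.Subtract(&n.ReservedResources)
	return &out
}

func main() {
	n := &Node{
		NodeResources:     ComparableResources{CPUShares: 25400, MemoryMB: 32768},
		ReservedResources: ComparableResources{CPUShares: 2245, MemoryMB: 1024},
	}
	c := n.Comparable()
	fmt.Println(c.CPUShares, c.MemoryMB) // 23155 31744
}
```

Since the `Node` owns both resource sets, callers can no longer forget the subtraction or perform it twice.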
Force-pushed from b8e7428 to 7f90a6f
I've verified this impacts back to 1.8.x+ent, so I've added the appropriate backport labels.
Force-pushed from 957d379 to 51ece85

Force-pushed from 51ece85 to f7854bc
Description
Configuring `cores` as part of the client `reserved` resources only limits scheduling on those cores, but isn't reflected in the available MHz bandwidth, thus affecting logic in places such as bin spread.

Testing & Reproduction steps
versus
So on a VM with 124 cores, leaving the last core available for scheduling, we allegedly have 1 core or 2245 MHz/cpu available. While testing:
Mimic of the situation: when we reserve the same amount of bandwidth but using `cpu` instead of `cores`, we do fail:
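The arithmetic behind the reproduction can be sketched as follows; the figures (124 cores at 2245 MHz each, all but the last core reserved) come from the description above, while the `availableMHz` helper is purely illustrative:

```go
package main

import "fmt"

// availableMHz is an illustrative helper: schedulable bandwidth is the
// per-core bandwidth times the number of cores not reserved.
func availableMHz(totalCores, reservedCores, mhzPerCore int) int {
	return (totalCores - reservedCores) * mhzPerCore
}

func main() {
	// Pre-fix: the reservation limited placement on those cores, but
	// the advertised bandwidth still counted every core.
	buggy := availableMHz(124, 0, 2245)
	// Post-fix: only the single unreserved core's bandwidth remains.
	fixed := availableMHz(124, 123, 2245)
	fmt.Println(buggy, fixed) // 278380 2245
}
```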
Links
Contributor Checklist
- Changelog entry using the `make cl` command.
- Tests that ensure regressions will be caught.
- … and job configuration: please update the Nomad website documentation to reflect this. Refer to the website README for docs guidelines. Please also consider whether the change requires notes within the upgrade guide.
Reviewer Checklist
- … backporting document.
- … in the majority of situations. The main exceptions are long-lived feature branches or merges where history should be preserved.
- … within the public repository.
Changes to Security Controls
Are there any changes to security controls (access controls, encryption, logging) in this pull request? If so, explain.