-
-
Notifications
You must be signed in to change notification settings - Fork 732
Stealing balance not accounting for recent decisions #7002
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
Restoring the link to the actual scheduler dictionary again is not that difficult diff --git a/distributed/stealing.py b/distributed/stealing.py
index b9858f465..4d2abc044 100644
--- a/distributed/stealing.py
+++ b/distributed/stealing.py
@@ -406,13 +413,13 @@ class WorkStealing(SchedulerPlugin):
occ_thief,
)
)
- s.check_idle_saturated(victim, occ=occ_victim)
- s.check_idle_saturated(thief, occ=occ_thief)
+ s.check_idle_saturated(thief, occ=combined_occupancy(thief))
+ s.check_idle_saturated(victim, occ=combined_occupancy(victim))
with log_errors():
i = 0
# Paused and closing workers must never become thieves
- idle = [ws for ws in s.idle.values() if ws.status == Status.running]
+ idle = s.idle.values()
if not idle or len(idle) == len(s.workers):
return
@@ -429,8 +436,6 @@ class WorkStealing(SchedulerPlugin):
]
elif len(saturated) < 20:
saturated = sorted(saturated, key=combined_occupancy, reverse=True)
- if len(idle) < 20:
- idle = sorted(idle, key=combined_occupancy)
for level, cost_multiplier in enumerate(self.co However, even if done correctly, |
Assuming |
interesting. Typically occ is much larger though, it's rather |
The Stealing.balance implementation is currently using a greedy for-loop with multiple early termination conditions. These break statements are de-facto dead code causing too greedy and potentially destructive work stealing.
Stealing is highly sensitive to the definition of idle and saturated workers which is defined in
Scheduler.check_idle_saturated
.Explanation of saturated
See also https://github.com/dask/distributed/pull/6614/files/36a60a5e358ea2a5d16597651126ac5892203b01#r952608704
I think the new definition of idle to incorporate whether there are free slots makes a lot of sense.
off-topic but I spent some time on this before and it's worth sharing
you can rewrite this statement to
i.e.
pending
is "average occupancy per thread that is in stateready
/ not, yet executing"or put into context of stealing, which is the only place that
saturated
matters, "occupancy per thread that could be stolen"I guess what's written there is more stable in terms of floating point arithmetic.
So, the definition of saturated is spelled out something like
A worker is classified as saturated if it's
ready
occupancy is at least 0.4s per thread and at least 1.9 times the average cluster wide occupancy.i.e. "there is something queued up / in ready and it is about two times more than average"
Simplified, this is how the algorithm is intended to work:
Stealing.balance
iterates over saturated workers in descending occupancy and victimizes them for stealing. An idle worker will act as the thief and work will be rebalanced. This continues until there are no longer saturated workers or there are no longer idle workers.However, this pseudo algorithm is no longer implemented. The current code only shows hints at how this was formerly achieved and it likely strongly depended on dicts, sets, etc. being passed by reference such that any changes to the fundamental collections are immediately propagated to the balance loop which then allows for early termination.
Stealing.maybe_move_task
is callingScheduler. check_idle_saturated
with an additional keywordocc
which is only used here. This keyword allows to override the actual occupancy of the worker which would allow us to reclassify thieves and victims based on theircombined_occupancy
, i.e. theScheduler.idle
andScheduler.saturated
collections are updated under the assumption that in-flight tasks are stolen successfully.distributed/distributed/stealing.py
Lines 409 to 410 in b133009
Assuming that
idle
in the balance for-loop was referencingScheduler.idle
and not be a derivative, this would allow us to break out of the loop at various places, e.g. here, here or here. However, the loop variableidle
is always defined as a list since #5665 and even before is sometimes a list due to sorting ever since #754This effectively renders any
if not idle; break
statements dead code. This can be confirmed by codecoverage measurementsGoing back to #750 it appears that a similar mechanism existed for early termination if saturated became empty. The implementation changed drastically and I do not see any leftover termination mechanism that ensures a potential victim is still saturated if one accounts for combined_occupancy.
I suspect this lack of early termination to be a major contributor to at least the following tickets
The text was updated successfully, but these errors were encountered: