Poor adaptive target for empty clusters #6962
Labels: adaptive, enhancement, scheduling
If a cluster hasn't run any work yet, it will only recommend 1 worker initially, regardless of how many tasks are queued on the scheduler:
The scheduler's adaptive target is based on looking at its `total_occupancy`. But occupancy is only updated once tasks are scheduled (into `processing`). So if there are no workers, no tasks can be scheduled, and occupancy remains 0 even with tons of tasks in `unrunnable`.

I would expect the `total_occupancy` to also include the expected runtime of all unrunnable/queued tasks. That would usually result in faster scale-up from zero. Some deployment systems might be quite slow to scale. You might have to wait a few minutes to get 1 worker, to realize you then need more, and then wait a few minutes again. It would be better to ask for more up front.

This is what ensures we at least get one worker, otherwise we'd never scale up at all:
distributed/distributed/scheduler.py
Lines 7287 to 7288 in 16748b7
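
To make the difference concrete, here is a rough sketch of the two behaviors. This is a simplified model, not the actual scheduler code: the function names, the 5 second target duration, and the 0.5 second per-task estimate are all made up for illustration, and the real target takes more into account than this.

```python
import math


def target_from_occupancy(total_occupancy, target_duration, has_unrunnable, n_workers):
    """Simplified model of the behavior described above (hypothetical helper).

    The target is driven by occupancy alone, with a floor of 1 when tasks
    are waiting and no workers exist yet.
    """
    cpu = math.ceil(total_occupancy / target_duration)
    if has_unrunnable and n_workers == 0:
        # Without this floor an empty cluster would never scale up at all.
        cpu = max(1, cpu)
    return cpu


def target_with_queued(total_occupancy, target_duration, n_queued, est_task_duration):
    """Proposed behavior (hypothetical helper): count queued work too.

    est_task_duration is an assumed per-task runtime estimate in seconds.
    """
    expected = total_occupancy + n_queued * est_task_duration
    floor = 1 if n_queued else 0
    return max(floor, math.ceil(expected / target_duration))


# Empty cluster, 1000 queued tasks, nothing in processing yet.
current = target_from_occupancy(0.0, 5.0, has_unrunnable=True, n_workers=0)
proposed = target_with_queued(0.0, 5.0, n_queued=1000, est_task_duration=0.5)
print(current, proposed)
```

With these made-up numbers, the occupancy-only target asks for a single worker, while counting the queued tasks asks for enough capacity to work through the backlog within the target duration.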