design: Solver refactor design #1574
Conversation
Signed-off-by: itsomri <[email protected]>
> Typical podgroups are usually:
> - Single pod workloads, which are the easiest
> - Distributed training, inference, or data processing jobs — a leader plus several (typically one, sometimes a few, generally less than 10) worker templates, ranging from very few to thousands of pods that are *replicas* of those few templates: identical resource requests, predicates, and affinity rules. The solver treats every pod independently today, but template-level equivalence classes can shrink the effective candidate space substantially.
Can we actually assume that? We also see movement toward workloads with several different worker types, with separate autoscaling for the different types and replica counts, which leads to different resource requests.
Several different worker types is still far fewer than the number of replicas. We need to optimize for pod templates, not pods. The intention here is that, for example, when we implement bin-packing approximations (such as in the GPU scenario pre-filter), we can bucket by pod template and get better results in both correctness and performance.
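To make the bucketing idea concrete, here is a minimal Go sketch. The `podInfo` type and `TemplateKey` field are illustrative stand-ins, not the scheduler's actual types; the point is that a pre-filter or bin-packing pass can run once per template and be reused by all of its replicas:

```go
package main

import "fmt"

// podInfo is a hypothetical stand-in for the scheduler's internal pod representation.
type podInfo struct {
	Name        string
	TemplateKey string // e.g. a hash of resource requests, predicates, and affinity rules
	GPUs        int
}

// bucketByTemplate groups pods that are replicas of the same template into
// equivalence classes, so feasibility checks run per template rather than per pod.
func bucketByTemplate(pods []podInfo) map[string][]podInfo {
	buckets := make(map[string][]podInfo)
	for _, p := range pods {
		buckets[p.TemplateKey] = append(buckets[p.TemplateKey], p)
	}
	return buckets
}

func main() {
	pods := []podInfo{
		{Name: "worker-0", TemplateKey: "worker", GPUs: 8},
		{Name: "worker-1", TemplateKey: "worker", GPUs: 8},
		{Name: "leader-0", TemplateKey: "leader", GPUs: 1},
	}
	for key, group := range bucketByTemplate(pods) {
		// One pre-filter / bin-packing pass per template, reused by all replicas.
		fmt.Printf("template %q: %d replicas\n", key, len(group))
	}
}
```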
> - **Jobs with topology requirements** is a common use case that requires its own optimizations
> - Busy, multi-tenant, highly utilized clusters that serve dozens of teams
> ### Best solutions for reclaim, by multiple criteria
Rather than "multiple criteria", we should talk about queue, job, and subgroup order in general, as the criteria and their scores can be changed.
The intention here is to explain the complexity of what can be considered a "good" solution, and to show how naive today's assumptions are. For example:
- Who says that 1,000 victims from priority 49 are preferable to one victim from priority 50? (That is the state today.)
- Maybe it's worth evicting 1,001 eligible victims instead of 1,000 if it gives the preemptor a better topology placement?
- What is better: an unfair but 100% allocated cluster (bin-packing optimized), or an 80% allocated, completely fair one?

The refactor will not provide knobs to address these issues, but we will be able to start having this conversation if we implement some scenario scoring mechanism.
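As a purely illustrative sketch (the `scenario` struct, its fields, and the weights below are assumptions made for the example, not a proposed API), a scenario scoring mechanism could collapse several criteria into one comparable value, which is what would let us debate trade-offs like the ones above explicitly:

```go
package main

import "fmt"

// scenario summarizes one candidate reclaim solution (hypothetical fields).
type scenario struct {
	VictimCount       int
	MaxVictimPriority int     // highest priority among the evicted pods
	TopologyScore     float64 // 0..1, higher means a tighter placement for the preemptor
	FairnessScore     float64 // 0..1, higher means closer to queue fair share
}

// score combines the criteria into a single value; the weights are arbitrary
// placeholders. The point is that "fewest victims from the lowest priority"
// stops being the only axis.
func score(s scenario) float64 {
	return -0.5*float64(s.VictimCount) -
		2.0*float64(s.MaxVictimPriority) +
		10.0*s.TopologyScore +
		10.0*s.FairnessScore
}

func main() {
	a := scenario{VictimCount: 1000, MaxVictimPriority: 49, TopologyScore: 0.9, FairnessScore: 0.7}
	b := scenario{VictimCount: 1, MaxVictimPriority: 50, TopologyScore: 0.4, FairnessScore: 0.7}
	fmt.Printf("scenario a: %.1f, scenario b: %.1f\n", score(a), score(b))
}
```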
> While out of scope for the initial refactor, it's worth considering that different job classes warrant different scenario-generation strategies. The refactor should take into account that scenario generation could be **adaptive** to job type and cluster state.
> - **Strict topology gangs** — enumerate viable placement domains first, then derive the minimum victim set per domain. The default victim-set-first generator could be suboptimal here, both for performance and for finding the optimal solution.
> - **Single-task reclaimers** — one pod can only land on one node, so a single-pod reclaimer requires single-node sub-scenario evaluation. This can be generalized further: each reclaimer's set of pods has a theoretical minimum and maximum number of nodes that need to be evaluated.
Assuming a max number of nodes might be problematic when consolidating reclaim is enabled.
That's correct; I wanted to add a note on that.
> The legacy gang loop probed `k = 1, 2, 4, ..., N` and could retain the largest feasible `k` as a partial allocation. The new simulator is all-or-nothing.
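For context, a simplified reconstruction of that legacy probing loop might look like the following; `canPlace` is a hypothetical stand-in for a per-`k` feasibility simulation, not an existing function:

```go
package main

import "fmt"

// canPlace stands in for simulating whether k gang members fit on the cluster.
func canPlace(k, capacity int) bool { return k <= capacity }

// largestFeasible mimics the legacy loop: probe k = 1, 2, 4, ... and finally N,
// keeping the largest k that fit as a partial allocation.
func largestFeasible(n, capacity int) int {
	best := 0
	for k := 1; k < n; k *= 2 {
		if canPlace(k, capacity) {
			best = k
		}
	}
	if canPlace(n, capacity) {
		best = n
	}
	return best
}

func main() {
	// A 12-member gang with capacity for 5: the legacy loop retains k=4 as a
	// partial allocation, while the all-or-nothing simulator rejects the gang.
	fmt.Println(largestFeasible(12, 5)) // prints 4
}
```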
> - **Strict gang semantics is correct** for min-member=N jobs — partial allocations don't help a gang that requires N tasks running together.
I believe we did use this data as part of our unschedulable explanation.
We can solve that without adding this complexity to the scheduler loop; for example, a CLI tool that simulates this on demand.
Description
This PR proposes a refactor for the job solver stack.
Related Issues
Fixes #
Checklist
Breaking Changes
Additional Notes