From d1859a762eb6d8876407645177092ad8ffd03c36 Mon Sep 17 00:00:00 2001
From: Gavin Guo
Date: Mon, 1 Dec 2025 12:19:14 +0800
Subject: [PATCH 1/3] scx_lavd: Update stale migration roles when no domain is overloaded
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

In the current implementation, plan_x_cpdom_migration() returns early
when no domain is overloaded, without updating the per-domain migration
roles (cpdomc->is_stealer and cpdomc->is_stealee) or sys_stat.nr_stealee.

Under normal conditions these fields are updated in the classification
loop, which marks each domain as stealer, stealee, or neutral according
to its current load. When the function returns early, the loop never
runs and the roles left over from the previous round still influence
the load balancer's decisions.

The following places make migration decisions based on these roles:

1. consume_task() checks is_stealer before calling try_to_steal_task()
2. try_to_steal_task() only steals from stealee domains
3. the idle donation path in pick_idle_cpu() uses is_stealer/is_stealee
   to decide cross-domain task placement

As a result, a domain marked as stealer or stealee in a previous round
can keep that role even after plan_x_cpdom_migration() determines that
no migration is needed, so cross-domain migrations continue based on
stale decisions.

Fix this by jumping to the classification loop instead of returning
early, so that the roles are always recalculated.
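
For illustration only (not part of the patch), the stale-role problem can
be sketched with a heavily simplified model. struct dom, struct sys,
plan(), and the early_return flag are hypothetical stand-ins; the real
code also considers overflow_running and computes the thresholds itself,
both of which are omitted here.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for cpdom_ctx and sys_stat. */
struct dom {
	unsigned int load;
	bool is_stealer;
	bool is_stealee;
};

struct sys {
	unsigned int nr_stealee;
};

/*
 * Simplified planning round. With early_return set, it mimics the old
 * behavior: bail out before classification when no domain is overloaded.
 */
static void plan(struct dom *doms, size_t n, struct sys *sys,
		 unsigned int stealer_thr, unsigned int stealee_thr,
		 bool early_return)
{
	unsigned int max_load = 0;
	size_t i;

	for (i = 0; i < n; i++)
		if (doms[i].load > max_load)
			max_load = doms[i].load;

	/* Old behavior: roles from the previous round are left untouched. */
	if (stealee_thr > max_load && early_return)
		return;

	/* Classification loop: every role is recomputed from current load. */
	sys->nr_stealee = 0;
	for (i = 0; i < n; i++) {
		doms[i].is_stealer = doms[i].load <= stealer_thr;
		doms[i].is_stealee = !doms[i].is_stealer &&
				     doms[i].load >= stealee_thr;
		if (doms[i].is_stealee)
			sys->nr_stealee++;
	}
}
```

In this model, with two domains at loads 10 and 100 and thresholds 20/80,
the first round marks the second domain as a stealee. If its load then
drops to 30, a round with early_return set leaves is_stealee set and
nr_stealee at 1, while a round without it clears both.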

Fixes: 17c7b2754442 ("scx_lavd: Make the load balancing core compaction- and capacity-aware.")
Signed-off-by: Gavin Guo
---
 scheds/rust/scx_lavd/src/bpf/balance.bpf.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/scheds/rust/scx_lavd/src/bpf/balance.bpf.c b/scheds/rust/scx_lavd/src/bpf/balance.bpf.c
index 0edc19c0fc..5d527631b5 100644
--- a/scheds/rust/scx_lavd/src/bpf/balance.bpf.c
+++ b/scheds/rust/scx_lavd/src/bpf/balance.bpf.c
@@ -108,11 +108,15 @@ int plan_x_cpdom_migration(void)
 	if ((stealee_threshold > max_sc_load) && !overflow_running) {
 		/*
 		 * If there is no overloaded domain, do not try to steal.
+		 * However, the classification must still run to update
+		 * stealer/stealee roles. Otherwise, stale roles from previous
+		 * rounds will cause the load balancer to incorrectly migrate
+		 * tasks from stale stealees to stale stealers.
 		 * <~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~>
 		 * [stealer_threshold ... avg_sc_load ... max_sc_load ... stealee_threshold]
 		 * -------------------------------------->
 		 */
-		return 0;
+		goto calc_stealer_stealee;
 	}
 	if ((stealee_threshold <= max_sc_load || overflow_running) &&
 	    (stealer_threshold < min_sc_load)) {
@@ -125,6 +129,7 @@ int plan_x_cpdom_migration(void)
 		stealer_threshold = min_sc_load;
 	}
 
+calc_stealer_stealee:
 	/*
 	 * Determine stealer and stealee domains.
 	 */

From 8ef76f156e6a2a2efafe566f9299fb085e66c15c Mon Sep 17 00:00:00 2001
From: Gavin Guo
Date: Mon, 1 Dec 2025 12:51:48 +0800
Subject: [PATCH 2/3] scx_lavd: Remove redundant condition in plan_x_cpdom_migration

The first clause of this condition:

  (stealee_threshold <= max_sc_load || overflow_running)

is the negation of the condition in the previous if statement:

  (stealee_threshold > max_sc_load) && !overflow_running

So, if we reach this point, the previous check has already failed and
the first clause is always true. Only the remaining clause,
(stealer_threshold < min_sc_load), needs to be tested.
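
For illustration only (not part of the patch), the equivalence can be
checked exhaustively over the three boolean inputs. The helper names
below are hypothetical; over_thr stands for
(stealee_threshold <= max_sc_load) and below_min for
(stealer_threshold < min_sc_load).

```c
#include <stdbool.h>

/* Condition of the previous if statement (the branch that jumps away). */
static bool first_cond(bool over_thr, bool overflow_running)
{
	return !over_thr && !overflow_running;
}

/* Second condition as written before this patch. */
static bool second_cond_old(bool over_thr, bool overflow_running,
			    bool below_min)
{
	return (over_thr || overflow_running) && below_min;
}

/* Second condition as written after this patch. */
static bool second_cond_new(bool below_min)
{
	return below_min;
}

/*
 * Returns true if the old and new forms agree on every input for which
 * the first branch was not taken, i.e. whenever the second if is reached.
 */
static bool simplification_is_safe(void)
{
	int bits;

	for (bits = 0; bits < 8; bits++) {
		bool over_thr = bits & 1;
		bool overflow_running = bits & 2;
		bool below_min = bits & 4;

		if (first_cond(over_thr, overflow_running))
			continue; /* the second if is never reached */

		if (second_cond_old(over_thr, overflow_running, below_min) !=
		    second_cond_new(below_min))
			return false;
	}
	return true;
}
```

simplification_is_safe() returns true, which is De Morgan's law applied
to the first condition: its negation is exactly the removed clause.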

Signed-off-by: Gavin Guo
---
 scheds/rust/scx_lavd/src/bpf/balance.bpf.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/scheds/rust/scx_lavd/src/bpf/balance.bpf.c b/scheds/rust/scx_lavd/src/bpf/balance.bpf.c
index 5d527631b5..3cd8f77353 100644
--- a/scheds/rust/scx_lavd/src/bpf/balance.bpf.c
+++ b/scheds/rust/scx_lavd/src/bpf/balance.bpf.c
@@ -118,8 +118,7 @@ int plan_x_cpdom_migration(void)
 		 */
 		goto calc_stealer_stealee;
 	}
-	if ((stealee_threshold <= max_sc_load || overflow_running) &&
-	    (stealer_threshold < min_sc_load)) {
+	if (stealer_threshold < min_sc_load) {
 		/*
 		 * If there is a overloaded domain, always try to steal.
 		 * <~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~>

From 5b0e002ce4a390c2b5ad5be10ab66402b8d9399c Mon Sep 17 00:00:00 2001
From: Gavin Guo
Date: Mon, 1 Dec 2025 13:24:26 +0800
Subject: [PATCH 3/3] scx_lavd: Skip stealing when no stealees

Add an early exit in try_to_steal_task() when sys_stat.nr_stealee is
zero. This avoids the probabilistic calculation and the nested loop
traversal when no domain is marked as a stealee.

Together with the commit that ensures the classification loop runs even
when no domain's scaled load exceeds stealee_threshold, this guarantees
that sys_stat.nr_stealee accurately reflects the current state and
avoids a redundant search for tasks to steal.

Signed-off-by: Gavin Guo
---
 scheds/rust/scx_lavd/src/bpf/balance.bpf.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/scheds/rust/scx_lavd/src/bpf/balance.bpf.c b/scheds/rust/scx_lavd/src/bpf/balance.bpf.c
index 3cd8f77353..56bf68e3d2 100644
--- a/scheds/rust/scx_lavd/src/bpf/balance.bpf.c
+++ b/scheds/rust/scx_lavd/src/bpf/balance.bpf.c
@@ -257,6 +257,12 @@ static bool try_to_steal_task(struct cpdom_ctx *cpdomc)
 	if (!cpdomc->nr_active_cpus)
 		return false;
 
+	/*
+	 * No stealee, nothing to steal.
+	 */
+	if (!sys_stat.nr_stealee)
+		return false;
+
 	/*
 	 * Probabilistically make a go or no go decision to avoid the
 	 * thundering herd problem. In other words, one out of nr_cpus