allocator/mmaprototype: WB normalization classifies some stores as overloaded when WB is small #164539

@angeladietz

Problem

MMA's WriteBandwidth (WB) normalization divides raw bytes/s by 128 KiB and truncates to an integer (load.go:565-567).

On clusters with low store-level WriteBandwidth, this normalization gives stores with nearly identical WB very different load summaries. For example, a store at 920 KB/s (normalized load=7) and one at 910 KB/s (normalized load=6) are 16.7% apart in fractionAbove despite a roughly 1% difference in actual throughput. With the 10% mean-fraction threshold, load=7 is overloadSlow while load=6 is loadNormal.
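
The arithmetic behind this example can be reproduced with a small sketch. It is not the code in load.go: `normalizeWB` and `relativeGap` are hypothetical names, KB is assumed to mean 1000 bytes, and the gap is computed between the two stores directly, as in the example, rather than against a cluster mean.

```go
package main

import "fmt"

// wbUnit is the 128 KiB quantization unit described in this issue.
const wbUnit = 128 << 10

// normalizeWB is a hypothetical stand-in for the normalization step:
// integer division, so the fractional part is dropped.
func normalizeWB(bytesPerSec int64) int64 {
	return bytesPerSec / wbUnit
}

// relativeGap returns how far a is above b, as a fraction of b.
func relativeGap(a, b float64) float64 {
	return (a - b) / b
}

func main() {
	// Assumption: KB means 1000 bytes.
	a, b := int64(920_000), int64(910_000)
	na, nb := normalizeWB(a), normalizeWB(b)

	fmt.Printf("normalized loads: %d vs %d\n", na, nb)
	fmt.Printf("raw gap:          %.1f%%\n", 100*relativeGap(float64(a), float64(b)))
	fmt.Printf("normalized gap:   %.1f%%\n", 100*relativeGap(float64(na), float64(nb)))
}
```

The two stores straddle the boundary at 7 × 128 KiB = 917,504 bytes/s, which is why a difference of about 1% in throughput becomes a one-level difference in normalized load.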

This has two consequences:

  1. Stores are falsely classified as overloaded, entering the shedding pool every tick despite having nearly identical throughput to their peers.
  2. Moves can't fix the overload. When per-range WB is small relative to the 128 KiB quantization unit, shedding a range doesn't change the store's integer-truncated load level, and it's essentially impossible to reach a state where no stores are overloaded on WB, regardless of where replicas are moved.
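
To illustrate the second consequence, the sketch below counts how many ranges a store would have to shed before its truncated load level changes at all. The starting load (1,040,000 bytes/s) and the per-range WB (2,000 bytes/s) are made-up numbers chosen for illustration, and `movesUntilLevelChange` is a hypothetical helper, not an MMA function.

```go
package main

import "fmt"

// wbUnit is the 128 KiB quantization unit described in this issue.
const wbUnit = 128 << 10

// movesUntilLevelChange counts how many ranges of perRange bytes/s must be
// shed before the integer-truncated load level drops below its starting value.
// Hypothetical helper for illustration only.
func movesUntilLevelChange(load, perRange int64) int {
	if perRange <= 0 {
		return 0
	}
	start := load / wbUnit
	moves := 0
	for load > 0 && load/wbUnit == start {
		load -= perRange
		moves++
	}
	return moves
}

func main() {
	// Assumed values: a store at 1,040,000 bytes/s (level 7) whose ranges
	// each carry 2,000 bytes/s of write bandwidth.
	fmt.Println(movesUntilLevelChange(1_040_000, 2_000))
}
```

Under these assumed numbers, the first 61 moves leave the store at level 7, so none of them changes its classification. Each range that lands on another store can also push that store across its own boundary.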

This can result in indefinite thrashing: MMA continuously moves replicas, each incurring raft snapshot and disk I/O cost, with zero WB benefit. On a test cluster, this produced 13,260 successful but useless replica moves over 6 hours. See write-up for the full investigation.

Possible approaches

We should address this related TODO:

// TODO(sumeer): consider adding a summaryUpperBound for small
// WriteBandwidth values too.

In addition/instead, we could consider:

  • Using a much smaller divisor (e.g., 8 KiB) so that per-range deltas meaningfully change the load level
  • Performing the fractionAbove comparison on raw bytes/s, keeping quantization only for display or bucketing. This would eliminate the cliff entirely.
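
A rough comparison of the two options above against the current behavior is sketched below. The five store loads are invented for illustration, `quantize` and `aboveMean` are hypothetical helpers, and the mean is computed in floating point, which may differ from what the real code does. The 10% threshold is the one mentioned in the Problem section.

```go
package main

import "fmt"

// quantize truncates each raw bytes/s value to an integer number of units.
func quantize(raw []float64, unit int64) []float64 {
	out := make([]float64, len(raw))
	for i, r := range raw {
		out[i] = float64(int64(r) / unit)
	}
	return out
}

// aboveMean reports, per store, whether its load exceeds the mean of all
// loads by more than threshold (expressed as a fraction of the mean).
func aboveMean(loads []float64, threshold float64) []bool {
	var sum float64
	for _, l := range loads {
		sum += l
	}
	mean := sum / float64(len(loads))
	out := make([]bool, len(loads))
	for i, l := range loads {
		out[i] = mean > 0 && (l-mean)/mean > threshold
	}
	return out
}

func main() {
	// Assumed store loads in bytes/s, all within about 2% of each other.
	raw := []float64{920_000, 910_000, 910_000, 910_000, 905_000}

	fmt.Println("128 KiB units:", aboveMean(quantize(raw, 128<<10), 0.10))
	fmt.Println("8 KiB units:  ", aboveMean(quantize(raw, 8<<10), 0.10))
	fmt.Println("raw bytes/s:  ", aboveMean(raw, 0.10))
}
```

With these assumed loads, only the 128 KiB quantization flags the first store as more than 10% above the mean. An 8 KiB divisor narrows each quantization step to about 1% at this load, while comparing raw bytes/s removes the step altogether.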

Epic: CRDB-56265
Jira issue: CRDB-60867
