@alex alex commented Dec 4, 2025

The blocking pool's task queue was protected by a single mutex, causing severe contention when many threads spawn blocking tasks concurrently. This resulted in nearly linear degradation: 16 concurrent threads took ~18x longer than a single thread.

Replace the single-mutex queue with a sharded queue that distributes tasks across 16 lock-protected shards. The implementation adapts to concurrency levels by using fewer shards when thread count is low, maintaining cache locality while avoiding contention at scale.
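
To make the idea concrete, here is a minimal sketch of a sharded queue using only the standard library. It is not the code from this PR: the names `ShardedQueue`, `NUM_SHARDS`, and the `active_shards` parameter are hypothetical, and round-robin shard selection is an assumption made for illustration. The `active_shards` argument stands in for the "use fewer shards at low concurrency" behavior, which the actual implementation may decide quite differently.

```rust
use std::collections::VecDeque;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Mutex;

/// Hypothetical shard count, chosen to match the 16 shards mentioned above.
const NUM_SHARDS: usize = 16;

/// Illustrative sharded queue: every shard has its own mutex, so concurrent
/// pushes usually lock different shards instead of one global lock.
pub struct ShardedQueue<T> {
    shards: Vec<Mutex<VecDeque<T>>>,
    /// Round-robin cursor used to pick the shard for the next push.
    next_push: AtomicUsize,
}

impl<T> ShardedQueue<T> {
    pub fn new() -> Self {
        Self {
            shards: (0..NUM_SHARDS)
                .map(|_| Mutex::new(VecDeque::new()))
                .collect(),
            next_push: AtomicUsize::new(0),
        }
    }

    /// Push into one of the first `active_shards` shards (clamped to
    /// 1..=NUM_SHARDS). A small value keeps tasks close together when few
    /// threads are spawning; a large value spreads out lock traffic.
    pub fn push(&self, item: T, active_shards: usize) {
        let n = active_shards.clamp(1, NUM_SHARDS);
        let idx = self.next_push.fetch_add(1, Ordering::Relaxed) % n;
        self.shards[idx].lock().unwrap().push_back(item);
    }

    /// Pop from the first non-empty shard, scanning all shards starting at
    /// `start`. Ordering is FIFO within a shard, not across shards.
    pub fn pop(&self, start: usize) -> Option<T> {
        let start = start % NUM_SHARDS;
        for i in 0..NUM_SHARDS {
            let idx = (start + i) % NUM_SHARDS;
            if let Some(item) = self.shards[idx].lock().unwrap().pop_front() {
                return Some(item);
            }
        }
        None
    }
}
```

In this sketch a worker would call `pop` with its own index as `start`, so different workers begin scanning at different shards. The trade-off is visible in `pop`: an empty queue costs up to 16 lock acquisitions rather than one, which is one plausible source of single-threaded overhead.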

Benchmark results (spawning 100 batches of 16 tasks per thread):

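| Concurrency | Before   | After   | Improvement |
|-------------|----------|---------|-------------|
| 1 thread    | 13.3ms   | 17.8ms  | +34%        |
| 2 threads   | 26.0ms   | 20.1ms  | -23%        |
| 4 threads   | 45.4ms   | 27.5ms  | -39%        |
| 8 threads   | 111.5ms  | 20.3ms  | -82%        |
| 16 threads  | 247.8ms  | 22.4ms  | -91%        |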
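
For reading the last column: the figures are consistent with the relative change in wall time, (After - Before) / Before, so a negative value means the new queue is faster and the +34% at 1 thread is a slowdown. The helper below is a hypothetical illustration of that arithmetic, not part of the benchmark itself.

```rust
/// Relative change in wall time, in percent, rounded to the nearest integer.
/// Negative means `after_ms` is faster than `before_ms`.
/// (Hypothetical helper; assumes the column is (after - before) / before.)
pub fn percent_change(before_ms: f64, after_ms: f64) -> i64 {
    ((after_ms - before_ms) / before_ms * 100.0).round() as i64
}
```

Applied to the rows above, `percent_change(13.3, 17.8)` gives 34 and `percent_change(247.8, 22.4)` gives -91.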

The 34% overhead at 1 thread comes from the sharding infrastructure. This is acceptable given the dramatic improvement at higher concurrency, where the original design suffered from lock contention.

(Notwithstanding that this shows as a commit from claude, every line is human reviewed. If there's a mistake, it's Alex's fault.)

@alex alex force-pushed the claude/improve-spawn-blocking-perf-01A5VqgjoFsxUcvmP6eAjdTf branch 3 times, most recently from 9537dda to 016f6ca Compare December 4, 2025 00:19
@alex alex (Contributor, Author) commented Dec 4, 2025

(FreeBSD failures look unrelated.)

@ADD-SP ADD-SP added A-tokio Area: The main tokio crate M-blocking Module: tokio/task/blocking T-performance Topic: performance and benchmarks labels Dec 4, 2025
@ADD-SP ADD-SP added S-waiting-on-author Status: awaiting some action (such as code changes) from the PR or issue author. and removed S-waiting-on-author Status: awaiting some action (such as code changes) from the PR or issue author. labels Dec 4, 2025
@alex alex force-pushed the claude/improve-spawn-blocking-perf-01A5VqgjoFsxUcvmP6eAjdTf branch 2 times, most recently from f4416fb to 21ff5ce Compare December 4, 2025 17:36
@martin-g martin-g (Member) commented Dec 4, 2025

Please rebase to latest master to get the fix for the FreeBSD failures.

@alex alex force-pushed the claude/improve-spawn-blocking-perf-01A5VqgjoFsxUcvmP6eAjdTf branch from 21ff5ce to 694fa6b Compare December 4, 2025 17:39
@alex alex force-pushed the claude/improve-spawn-blocking-perf-01A5VqgjoFsxUcvmP6eAjdTf branch from 694fa6b to edd5e10 Compare December 5, 2025 12:37
@alex alex force-pushed the claude/improve-spawn-blocking-perf-01A5VqgjoFsxUcvmP6eAjdTf branch from edd5e10 to 126cb78 Compare December 6, 2025 02:06
@ADD-SP ADD-SP added S-waiting-on-author Status: awaiting some action (such as code changes) from the PR or issue author. and removed S-waiting-on-author Status: awaiting some action (such as code changes) from the PR or issue author. labels Dec 7, 2025
4 participants