pkg/executor/sortexec: stabilize flaky TestInterruptedDuringSpilling by flaky-claw · Pull Request #67891 · pingcap/tidb

flaky-claw · 2026-04-19T01:12:46Z

What problem does this PR solve?

Issue Number: close #50799

Problem Summary:
The test TestInterruptedDuringSpilling in pkg/executor/sortexec fails intermittently. This PR stabilizes it.

What changed and how does it work?

Root Cause

The flakiness came from a brittle <1s wall-clock assumption in the spill interruption test. That bound is sensitive to when spill checkpoints occur, and checkpoint timing varies with runtime pressure.

Fix

Forcing small spill chunks only in this test makes interruption checkpoints dense and deterministic, preserving the original assertion intent without product-code changes.
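
To illustrate why chunk size affects how quickly an interruption is noticed, here is a minimal sketch. It is not the sortexec implementation: it assumes the interrupt flag is checked only at chunk boundaries, and `rowsUntilStop`, the row counts, and the chunk sizes are all hypothetical.

```go
package main

import "fmt"

// rowsUntilStop simulates spilling totalRows rows in chunks of chunkSize.
// Assumption: the interrupt is only observed at a checkpoint between chunks,
// so an interrupt raised once interruptAt rows are done is noticed at the
// next chunk boundary. It returns how many rows were processed before stopping.
func rowsUntilStop(totalRows, chunkSize, interruptAt int) int {
	processed := 0
	for processed < totalRows {
		// Checkpoint: runs once per chunk, never in the middle of one.
		if processed >= interruptAt {
			return processed
		}
		n := chunkSize
		if remaining := totalRows - processed; remaining < n {
			n = remaining
		}
		processed += n
	}
	return processed
}

func main() {
	// Large chunks: few checkpoints, so the interrupt is noticed late.
	fmt.Println(rowsUntilStop(10000, 1024, 100)) // 1024
	// Small chunks: dense checkpoints, so the overshoot is small.
	fmt.Println(rowsUntilStop(10000, 16, 100)) // 112
}
```

In this toy model the work done after the interrupt shrinks from 924 rows to 12 rows when the chunk size drops from 1024 to 16, which is the effect the fix relies on to stay inside the test's time bound.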

Verification

Spec:

target: pkg/executor/sortexec :: TestInterruptedDuringSpilling
strategy: tidb.go_flaky.default
plan mode: BASELINE_ONLY
requirements: required case must execute; no skip; repeat count = 1
baseline gates: required_flaky_gate, build_safety_gate, intent_guard_gate

Observed result:

status: passed
required case executed: yes
submission decision: ALLOWED
scope debt present: yes

Gate checklist:

Required flaky gate: PASS
Build safety gate: PASS
Intent guard gate: PASS
Repo-wide advisory gate: SKIPPED
Feedback specific gate: SKIPPED

Commands:

go test -json ./pkg/executor/sortexec -run '^TestInterruptedDuringSpilling$' -count=1
go test -json ./pkg/executor/sortexec -count=1
make build

Check List

Tests

Unit test
Integration test
Manual test (add detailed scripts or steps below)
No need to test
- I checked and no code files have been changed.

Side effects

Performance regression: Consumes more CPU
Performance regression: Consumes more Memory
Breaking backward compatibility

Documentation

Release note

Please refer to Release Notes Language Style Guide to write a quality release note.

None

Fixes #50799

Summary by CodeRabbit

Tests
- Improved test reliability by configuring a smaller chunk size during spill testing to better validate interrupt handling behavior.

pantheon-ai · 2026-04-19T01:12:51Z

@flaky-claw I've received your pull request and will start the review. I'll conduct a thorough review covering code quality, potential issues, and implementation details.

⏳ This process typically takes 10-30 minutes depending on the complexity of the changes.

ℹ️ Learn more details on Pantheon AI.

tiprow · 2026-04-19T01:13:03Z

Hi @flaky-claw. Thanks for your PR.

PRs from untrusted users cannot be marked as trusted with /ok-to-test in this repo meaning untrusted PR authors can never trigger tests themselves. Collaborators can still trigger tests on the PR using /test all.

I understand the commands that are listed here.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

coderabbitai · 2026-04-19T01:13:08Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: dd9e9a5a-19ee-4aed-b3c3-8d59ba339831

📥 Commits

Reviewing files that changed from the base of the PR and between ce92298 and e4aa40a.

📒 Files selected for processing (1)

pkg/executor/sortexec/sortexec_pkg_test.go

📝 Walkthrough

Walkthrough

The test TestInterruptedDuringSpilling is modified to temporarily adjust the global spillChunkSize during test execution by capturing the original value, setting a smaller size for testing, and restoring the original via deferred cleanup.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Test Spill Configuration `pkg/executor/sortexec/sortexec_pkg_test.go` | Adds test setup code to override `spillChunkSize` for `TestInterruptedDuringSpilling` using `SetSmallSpillChunkSizeForTest()` with deferred restoration of the original value. |
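
The capture, override, and deferred-restore pattern described in the walkthrough can be sketched as follows. This is an illustration under stated assumptions, not the code in the PR: the real signature of `SetSmallSpillChunkSizeForTest()` is not shown on this page, so the lowercase helper below, its `small` parameter, and the values 1024 and 16 are hypothetical.

```go
package main

import "fmt"

// spillChunkSize stands in for the package-level setting named in the
// walkthrough. The default of 1024 is a placeholder, not the real value.
var spillChunkSize = 1024

// setSmallSpillChunkSizeForTest is a hypothetical helper. It captures the
// current value, installs a smaller one, and returns a function that
// restores the original.
func setSmallSpillChunkSizeForTest(small int) (restore func()) {
	orig := spillChunkSize
	spillChunkSize = small
	return func() { spillChunkSize = orig }
}

// runWithSmallChunks plays the role of the test body: the override is
// active only until the deferred restore runs on return.
func runWithSmallChunks() int {
	restore := setSmallSpillChunkSizeForTest(16)
	defer restore()
	return spillChunkSize
}

func main() {
	fmt.Println(runWithSmallChunks(), spillChunkSize) // 16 1024
}
```

Because the override mutates a package-level variable, a pattern like this is only safe when no other test reads or writes that variable concurrently, for example tests marked with `t.Parallel()`.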

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~3 minutes

Suggested labels

size/XS, ok-to-test, approved, lgtm

Suggested reviewers

windtalker
YangKeao
zanmato1984

Poem

🐰 A flaky test, oh how it did stall,
With spilling chunks that grew far too tall,
Now we shrink them down, make haste,
No more timeouts to waste,
The test runs swift, no more to appall! ✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The pull request title accurately summarizes the main change: stabilizing a flaky test by adjusting spill chunk size configuration within the test itself. |
| Description check | ✅ Passed | The pull request description comprehensively covers all required sections including the problem statement (linked issue `#50799`), root cause analysis, the fix applied, and verification results with test execution details. |
| Linked Issues check | ✅ Passed | The code changes directly address issue `#50799` by stabilizing the flaky TestInterruptedDuringSpilling test through deterministic spill checkpoint handling without altering production code. |
| Out of Scope Changes check | ✅ Passed | All changes are limited to the test file and exclusively focus on stabilizing the failing test by capturing and restoring spillChunkSize, with no unrelated modifications. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


codecov · 2026-04-19T01:31:45Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 77.4278%. Comparing base (e3f45e4) to head (e4aa40a).
⚠️ Report is 8 commits behind head on master.

Additional details and impacted files

@@               Coverage Diff                @@
##             master     #67891        +/-   ##
================================================
- Coverage   77.7969%   77.4278%   -0.3692%     
================================================
  Files          1983       1966        -17     
  Lines        548948     549053       +105     
================================================
- Hits         427065     425120      -1945     
- Misses       120962     123931      +2969     
+ Partials        921          2       -919

| Flag | Coverage Δ | |
|---|---|---|
| integration | `40.8885% <ø> (+1.0913%)` | ⬆️ |
| unit | `76.6562% <ø> (+0.3066%)` | ⬆️ |

Flags with carried forward coverage won't be shown. Click here to find out more.

| Components | Coverage Δ | |
|---|---|---|
| dumpling | `61.5065% <ø> (ø)` | |
| parser | `∅ <ø> (∅)` | |
| br | `50.0872% <ø> (-13.0244%)` | ⬇️ |


yinsustart · 2026-04-20T09:52:54Z

/test check-dev2

tiprow · 2026-04-20T09:53:18Z

@yinsustart: PRs from untrusted users cannot be marked as trusted with /ok-to-test in this repo meaning untrusted PR authors can never trigger tests themselves. Collaborators can still trigger tests on the PR using /test.

Details

In response to this:

/test check-dev2

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

ti-chi-bot · 2026-04-20T09:58:05Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: wshwsh12
Once this PR has been reviewed and has the lgtm label, please assign zanmato1984 for approval. For more information see the Code Review Process.
Please ensure that each of them provides their approval before proceeding.

The full list of commands accepted by this bot can be found here.

Details

Needs approval from an approver in each of these files:

pkg/executor/sortexec/OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

ti-chi-bot · 2026-04-20T09:58:07Z

[LGTM Timeline notifier]

Timeline:

2026-04-20 09:58:07.114851865 +0000 UTC: ☑️ agreed by wshwsh12.

hawkingrei · 2026-04-21T05:51:57Z

/hold

fix: stabilize flaky issue pingcap#50799

e4aa40a

ti-chi-bot bot added the release-note-none (Denotes a PR that doesn't merit a release note.) and do-not-merge/needs-triage-completed labels Apr 19, 2026

ti-chi-bot bot added the size/XS (Denotes a PR that changes 0-9 lines, ignoring generated files.) label Apr 19, 2026

wshwsh12 approved these changes Apr 20, 2026

ti-chi-bot bot added the needs-1-more-lgtm (Indicates a PR needs 1 more LGTM.) label Apr 20, 2026

ti-chi-bot bot added the do-not-merge/hold (Indicates that a PR should not merge because someone has issued a /hold command.) label Apr 21, 2026

flaky-claw wants to merge 1 commit into pingcap:master from flaky-claw:flakyfixer/case_959d028a3d36-a1

4 participants
