Skip to content

fix(blockstm): remove duplicate unlock in Resume() causing double unlock panic#26214

Open
fx0x55 wants to merge 6 commits intocosmos:mainfrom
fx0x55:fix-double-unlock-status
Open

fix(blockstm): remove duplicate unlock in Resume() causing double unlock panic#26214
fx0x55 wants to merge 6 commits intocosmos:mainfrom
fx0x55:fix-double-unlock-status

Conversation

@fx0x55
Copy link
Copy Markdown
Contributor

@fx0x55 fx0x55 commented Apr 2, 2026

Description

Fixes a critical concurrency bug in internal/blockstm/status.go where the Resume() method could cause a double unlock panic.

Problem

The Resume() method had the following issue:

  1. Used defer s.Unlock() at the beginning (line 100)
  2. Had redundant condition checks with manual s.Unlock() calls before panic() (lines 107-109, 113-115)
  3. When panic occurred, the deferred unlock would still execute, resulting in a double unlock panic

Changes

  • Removed the redundant duplicate condition checks and manual unlock calls (12 lines removed)
  • The condition is already properly checked at the beginning of the function with early return

Before

s.Lock()
defer s.Unlock()

if s.status != StatusSuspended || s.cond == nil {
    return
}

// status must be SUSPENDED and cond != nil
if s.status != StatusSuspended || s.cond == nil {
    s.Unlock()  // BUG: manual unlock before panic
    panic(fmt.Sprintf("invalid resume: status=%v", s.status))
}

After

s.Lock()
defer s.Unlock()

if s.status != StatusSuspended || s.cond == nil {
    return
}

s.status = StatusExecuting
s.cond.Notify()
s.cond = nil

Testing

All existing tests in internal/blockstm/... pass:

go test ./internal/blockstm/...

…ock panic

The Resume() method had defer s.Unlock() followed by manual s.Unlock()
calls before panic. When panic occurred, the deferred unlock would
execute, resulting in a double unlock panic.

Remove the redundant condition checks and manual unlock calls since
the condition is already checked at the beginning of the function.
@fx0x55 fx0x55 marked this pull request as ready for review April 2, 2026 07:35
@fx0x55 fx0x55 requested a review from a team as a code owner April 15, 2026 01:31
@aljo242
Copy link
Copy Markdown
Contributor

aljo242 commented Apr 23, 2026

Thanks for the fix — removing the redundant branches (and the manual Unlock under defer) is clearly the right move.

There don’t appear to be any targeted tests for StatusEntry.Resume / the suspended → executing path in internal/blockstm. It would help to add a small test file (or cases in an existing test) that cover at least:

  • SuspendResume: status becomes executing, cond is cleared, notify behavior is exercised if that’s observable from the test harness.
  • Cancellation / wake ordering: e.g. TryCancel (or equivalent) clears cond / moves state, then Resume is a no-op (matching the comment on Resume).
  • Idempotent / double Resume when already not suspended (no panic, no double-unlock).

That would lock in the intended semantics and guard against regressions if the early-return guard ever changes.

@fx0x55
Copy link
Copy Markdown
Contributor Author

fx0x55 commented Apr 24, 2026

Thanks for the fix — removing the redundant branches (and the manual Unlock under defer) is clearly the right move.

There don’t appear to be any targeted tests for StatusEntry.Resume / the suspended → executing path in internal/blockstm. It would help to add a small test file (or cases in an existing test) that cover at least:

  • SuspendResume: status becomes executing, cond is cleared, notify behavior is exercised if that’s observable from the test harness.
  • Cancellation / wake ordering: e.g. TryCancel (or equivalent) clears cond / moves state, then Resume is a no-op (matching the comment on Resume).
  • Idempotent / double Resume when already not suspended (no panic, no double-unlock).

That would lock in the intended semantics and guard against regressions if the early-return guard ever changes.

Thanks! Added status_test.go covering:

  • Suspend → Resume (status/cond behavior)
  • TryCancel → Resume is no-op
  • Resume when not suspended (no panic, no-op)
  • Full state machine transitions

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants