Skip to content

Conversation

@kim
Copy link
Contributor

@kim kim commented Oct 28, 2025

The commitlog creates new segments atomically, returning EEXIST if the segment already exists. This is to break a retry loop in case the filesystem becomes unwritable.

This error did not contain any context about what does not exist, so this patch adds some.

Also, an unhandled edge case has been discovered:

When opening an existing log, the commitlog will try to resume the last segment for writing. If it finds a corrupt commit in that segment, it won't resume, but instead create a new segment at the corrupt commit's offset + 1.

However, if the first commit in the last segment is corrupted, the offset will be that of the last segment -- trying to start a new segment will thus fail with EEXIST.

Without additional recovery mechanisms, it is not obvious what to do in this case: the segment could contain valid data after the initial commit, so we certainly don't want to throw it away.

Instead, we now detect this case and return InvalidData with some context.

Expected complexity level and risk

1

Testing

  • A (regression) test is included

The commitlog creates new segments atomically, returning EEXIST if the
segment already exists. This is to break a retry loop in case the
filesystem becomes unwritable.

This error did not contain any context about what does not exist, so
this patch adds some.

Also, an unhandled edge case has been discovered:

When opening an existing log, the commitlog will try to resume the last
segment for writing. If it finds a corrupt commit in that segment, it
won't resume, but instead create a new segment at the corrupt commit's
offset + 1.

However, if the first commit in the last segment is corrupted, the
offset will be that of the last segment -- trying to start a new segment
will thus fail with EEXIST.

Without additional recovery mechanisms, it is not obvious what to do in
this case: the segment could contain valid data after the initial
commit, so we certainly don't want to throw it away.

Instead, we now detect this case an return `InvalidData` with some
context.
@kim kim requested review from Kasama and Shubham8287 October 28, 2025 11:46
@kim kim added this pull request to the merge queue Oct 29, 2025
Merged via the queue into master with commit 798852e Oct 29, 2025
27 checks passed
@kim kim deleted the kim/commitlog/error-context branch October 29, 2025 13:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants