
mysqlcdc: checkpoint at transaction boundary#4396

Merged: josephwoodward merged 1 commit into main from jw/mysqlxid on May 6, 2026

@josephwoodward josephwoodward commented May 6, 2026

The problem:

Err: table id 1302: invalid table id, no corresponding table map event

Currently we checkpoint at the row level. A transaction in the MySQL binlog looks like the following, with go-mysql keeping an internal cache of table map events keyed by the table ID that's set in the TABLE_MAP_EVENT:

  TABLE_MAP_EVENT  pos=100 → LogPos=150
  ROWS_EVENT #1    pos=150 → LogPos=250   ← batch flushed here, checkpoint saved as 250
  ROWS_EVENT #2    pos=250 → LogPos=350
  XID_EVENT        pos=350 → LogPos=380

If the connector restarts mid-transaction, it is unable to recover: it resumes from the saved position (250 above), which is after the TABLE_MAP_EVENT, so it no longer has the table ID mapping and we see the above error.

This change:

This change listens for a sentinel event (XID_EVENT) that is written after every transaction is committed. Checkpointing only at this position means we capture clean transaction boundaries and, on restart, always recover from the beginning of a transaction.

Attn: This does mean that in the event of a connector crash, we may redeliver already-processed messages, since we restart from the beginning of the transaction (satisfying at-least-once delivery).


claude Bot commented May 6, 2026

Commits
LGTM

Review
Single-commit PR that switches MySQL CDC checkpointing to advance only on transaction boundaries (XID events) so that on restart canal.RunFrom always resumes at the start of a new transaction and never misses the preceding TABLE_MAP_EVENT. The new OnXID handler is consistent with the existing OnRow/onMessage channel-send pattern (SoftStopChan() + context.Canceled), and the latestXIDPos plumbing through flushBatch correctly preserves at-least-once semantics (batches with rows from an as-yet-uncommitted transaction will be replayed from the prior XID on restart).

LGTM

@josephwoodward josephwoodward changed the title from "mysqlcdc: switch to storing xid for checkpoint" to "mysqlcdc: ensure we also checkpoint at transaction boundary" May 6, 2026

claude Bot commented May 6, 2026

Commits
LGTM

Review
Reviewed the change to checkpoint at transaction boundaries (XID events) instead of arbitrary row positions. The new OnXID handler emits a sentinel MessageEvent that updates latestXIDPos in readMessages, and flushBatch now tracks the most recent transaction-boundary position rather than the last message's binlog position. The integration test loosening from == to >= correctly reflects the new at-least-once semantics on restart.

LGTM

Ensure we only checkpoint at the transaction boundary as opposed to the
row boundary. This way we don't risk recovering mid transaction in the
event of a crash.

claude Bot commented May 6, 2026

Commits
LGTM

Review
The change moves checkpoint advancement from row-level binlog positions to transaction-boundary (XID) positions, ensuring that on restart canal.RunFrom always begins at a transaction start and the corresponding TABLE_MAP_EVENT is delivered before any row events. The new OnXID handler propagates the post-XID position via rawMessageEvents, readMessages tracks the latest XID position, and flushBatch is simplified to use the passed-in checkpoint position instead of reading binlog metadata from the last batch message. The integration test relaxation (== to >=) correctly reflects at-least-once semantics on restart.

LGTM

@josephwoodward josephwoodward marked this pull request as ready for review May 6, 2026 11:34

claude Bot commented May 6, 2026

Commits
LGTM

Review
Small, focused change that wires OnXID into the canal handler and threads a latestXIDPos *position through readMessages so checkpoints only advance to transaction boundaries. Snapshot path remains correct (snapshot completion uses its own me.Position, and snapshot rows never carried binlog_position metadata, so passing latestXIDPos == nil during snapshot flushes is equivalent to the prior behavior). The integration test relaxation from == 1000 to >= 1000 matches the at-least-once replay semantics introduced by transaction-boundary checkpointing.

LGTM

@josephwoodward josephwoodward changed the title from "mysqlcdc: ensure we also checkpoint at transaction boundary" to "mysqlcdc: checkpoint at transaction boundary" May 6, 2026
@josephwoodward josephwoodward merged commit 4c25d02 into main May 6, 2026
8 checks passed
@josephwoodward josephwoodward deleted the jw/mysqlxid branch May 6, 2026 15:00