V2 checkpoints migration performance improvements #1147
Merged
Replace the per-session WriteCommittedWithSessionIndex call with a new WriteCommittedMainBatch path, accumulated alongside pendingFull and flushed at every /full pack boundary plus once at the end. Cuts /main ref-CAS overhead from one update per session to one per generation batch. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Entire-Checkpoint: 8d771ca24516
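The batching idea in this commit can be sketched as follows. This is a minimal illustration, not the code from `migrate.go`: the `sessionWrite` and `batcher` types and the `runMigration` helper are hypothetical, and a simple count threshold stands in for the /full pack boundary that triggers the real flush.

```go
package main

import "fmt"

// sessionWrite is a hypothetical stand-in for one buffered per-session write.
type sessionWrite struct {
	CheckpointID string
	SessionIndex int
}

// batcher buffers writes and hands them to flush in groups, so the
// expensive ref update happens once per batch instead of once per session.
type batcher struct {
	pending []sessionWrite
	size    int
	flush   func([]sessionWrite) error
	flushes int
}

// Add buffers w and flushes when the batch is full.
func (b *batcher) Add(w sessionWrite) error {
	b.pending = append(b.pending, w)
	if len(b.pending) >= b.size {
		return b.Flush()
	}
	return nil
}

// Flush writes all pending entries in one call. It is a no-op when nothing
// is buffered. The callback must not retain the slice, because the backing
// array is reused for the next batch.
func (b *batcher) Flush() error {
	if len(b.pending) == 0 {
		return nil
	}
	if err := b.flush(b.pending); err != nil {
		return err
	}
	b.flushes++
	b.pending = b.pending[:0]
	return nil
}

// runMigration feeds total writes through a batcher and returns how many
// flushes (ref updates) were needed, including the final partial batch.
func runMigration(total, batchSize int) (int, error) {
	b := &batcher{size: batchSize, flush: func(ws []sessionWrite) error { return nil }}
	for i := 0; i < total; i++ {
		if err := b.Add(sessionWrite{CheckpointID: fmt.Sprintf("cp-%04d", i)}); err != nil {
			return 0, err
		}
	}
	if err := b.Flush(); err != nil {
		return 0, err
	}
	return b.flushes, nil
}

func main() {
	n, err := runMigration(250, 100)
	if err != nil {
		panic(err)
	}
	fmt.Printf("250 writes, %d ref updates\n", n) // prints "250 writes, 3 ref updates"
}
```

The trailing `Flush` after the loop corresponds to the "once at the end" flush described above; without it the last partial batch would be dropped.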
…-improvements # Conflicts: # cmd/entire/cli/migrate.go
Adds perf.Annotate, which attaches a synthetic child span with a pre-computed duration to the surrounding context. Lets the migrate loop surface cumulative time spent in migrate_one_checkpoint vs. flush_main vs. pack_full_generation without paying the per-iteration span cost (4k+ iterations would blow past trace.go's 1MB limit). Each batch flush + archive pack also gets its own span, so a doctor trace reader can tell whether a slow run is uniformly slow or bursting on certain batches. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Entire-Checkpoint: 428acebd2ac0
Adds compact.WithOffset, which produces the full compact bytes plus the checkpoint-start line offset in one parse. For JSONL inputs (Claude / Cursor — the migration's hot path) the second compact pass becomes a count-only walk over the shared parsed entries, skipping the json.Marshal of every output line. For non-line-oriented formats (OpenCode, Gemini, Codex) and the merge-heavy line formats (Copilot, Droid) we fall back to running Compact twice so the stored offset stays byte-identical to the prior `lines(full) - lines(scoped)` calculation — the user explicitly ruled out any drift in start lines as a regression risk. A property test pins the equivalence across every format fixture in the package. Migration's per-checkpoint loop now goes through compact.WithOffset; the tryCompactTranscript / computeCompactOffset helpers stay for the resume path's UpdateCommitted flow. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Entire-Checkpoint: 229bf9284c3e
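To illustrate why a count-only walk can replace the second compact pass for line-oriented inputs, here is a simplified sketch. It assumes each parsed entry yields either zero or exactly one output line and that entry text contains no newlines; the `entry` type and all function names are hypothetical and are not the `compact` package's API.

```go
package main

import (
	"fmt"
	"strings"
)

// entry is a hypothetical parsed transcript entry. Keep reports whether
// compaction emits an output line for it.
type entry struct {
	Text string
	Keep bool
}

// compact renders kept entries, one output line each.
func compact(entries []entry) string {
	var sb strings.Builder
	for _, e := range entries {
		if e.Keep {
			sb.WriteString(e.Text)
			sb.WriteByte('\n')
		}
	}
	return sb.String()
}

// offsetTwoPass is the prior calculation: lines(full) - lines(scoped),
// which renders the transcript twice.
func offsetTwoPass(entries []entry, start int) int {
	full := strings.Count(compact(entries), "\n")
	scoped := strings.Count(compact(entries[start:]), "\n")
	return full - scoped
}

// compactWithOffset renders the full output once and counts the kept
// entries before start without rendering them a second time.
func compactWithOffset(entries []entry, start int) (string, int) {
	offset := 0
	for _, e := range entries[:start] {
		if e.Keep {
			offset++
		}
	}
	return compact(entries), offset
}

func main() {
	entries := []entry{
		{"user: hi", true},
		{"progress tick", false},
		{"assistant: hello", true},
		{"user: next", true},
	}
	for start := 0; start <= len(entries); start++ {
		_, fast := compactWithOffset(entries, start)
		fmt.Println(start, fast, offsetTwoPass(entries, start))
	}
}
```

The equivalence holds here only because output lines map one-to-one onto kept entries. Formats that merge entries or are not line-oriented break that assumption, which is consistent with the fallback to two full passes described above.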
This reverts commit 545c381. Entire-Checkpoint: da90343bf9ae
Entire-Checkpoint: 671e0161dd29
Pull request overview
This PR targets faster v1→v2 checkpoint migration in the Entire CLI by eliminating repeated blob/transcript work and reducing repeated /main tree rewrites during migration.
Changes:
- Adds `perf.Annotate` to surface cumulative (summed) timings in perf traces without per-iteration spans.
- Speeds migration by batching `/main` writes (`WriteCommittedMainBatch`), reusing v1 transcript blob hashes when packing v2 `/full/*`, and caching compact-transcript offset calculations.
- Adjusts generation metadata packing to prefer already-loaded checkpoint `CreatedAt` instead of rescanning raw transcripts, with corresponding test updates.
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| perf/span.go | Adds Annotate for synthetic child spans with precomputed duration. |
| perf/span_test.go | Adds tests for Annotate behavior (duration + no-parent no-op). |
| cmd/entire/cli/migrate.go | Refactors migration loop for batching, caching, and reduced redundant work; adds perf aggregation annotations. |
| cmd/entire/cli/migrate_test.go | Adds coverage for raw blob reuse and updates migration-related expectations. |
| cmd/entire/cli/checkpoint/v2_store_test.go | Adds correctness + perf invariants tests for WriteCommittedMainBatch. |
| cmd/entire/cli/checkpoint/v2_committed.go | Introduces WriteCommittedMainBatch and supporting subtree-building helpers. |
| cmd/entire/cli/checkpoint/committed.go | Captures v1 transcript blob hashes during session reads for reuse during migration. |
| cmd/entire/cli/checkpoint/checkpoint.go | Extends SessionContent with TranscriptBlobHashes to support blob reuse. |
Entire-Checkpoint: 78a51d5b7eef
…-performance-improvements
Entire-Checkpoint: e3a22cff9ec6
Entire-Checkpoint: 2d0a5f727cb7
pfleidi reviewed May 8, 2026
pfleidi approved these changes May 8, 2026
https://entire.io/gh/entireio/cli/trails/321
Summary
Tested with a repo with ~2500 checkpoints. Reduced migration time to ~3.5 minutes (`migrate_checkpoints 216869ms`), down from ~5.5 minutes (`migrate_checkpoints 337738ms`).
Validated with a fresh local migration of the repo, compared against the version pushed to GitHub, to confirm there were no regressions in the pushed data. See the comparison report here: https://gist.github.com/computermode/1c06c434317fb8fe7df18b4598913ab8
Script to compare repos: https://gist.github.com/computermode/599fea82d7ae0147716997de8f19576a
Biggest Redundancies
- `CreatedAt` first
- `/main` flush
Where It Showed Up In Trace
- `pack_full_generation_total`: duplicate raw transcript packing
- `pack_full.generation_timestamps_total`: raw transcript timestamp rescans
- `migrate_one.compact_transcript_total`: duplicate compact passes
- `flush_main_total`: repeated `/main` tree rewrites
Key Point
The batch size was already `100`; the slowdown was inside each batch from repeated blob work, transcript parsing, and Git tree rewrites.
Note
Medium Risk
Touches checkpoint migration and v2 git ref-writing logic; while covered by new invariants/tests, mistakes could lead to missing or mis-indexed checkpoint data during migration.
Overview
Speeds up v1→v2 checkpoint migration by batching v2 `/main` writes: migration now buffers per-session `WriteCommittedOptions` and flushes them via a new `WriteCommittedMainBatch` that updates `/main` with a single commit/CAS per batch.
Avoids redundant work during `/full/*` packing by reusing existing v1 transcript blob hashes (surfaced via `SessionContent.TranscriptBlobHashes`) and by preferring checkpoint `CreatedAt` when generating `generation.json`, falling back to transcript timestamp scans only when needed.
Adds a compact-transcript offset cache and cumulative perf annotation (`perf.Annotate`) to reduce repeated offset computation and make phase totals visible, alongside new tests validating batch-vs-sequential tree equality, single-commit behavior, and raw-blob reuse.
Reviewed by Cursor Bugbot for commit 961c195.
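The blob-reuse idea mentioned in the overview can be sketched with a toy content-addressed store. This is an illustration under assumptions, not the checkpoint package's code: `blobStore`, `write`, and `resolve` are hypothetical names, and the store is an in-memory map rather than a Git object database. The hash follows Git's blob object format (SHA-1 over a `blob <size>\0` header plus the content).

```go
package main

import (
	"crypto/sha1"
	"encoding/hex"
	"fmt"
)

// blobStore is a toy in-memory content-addressed store.
type blobStore struct {
	blobs  map[string][]byte
	writes int // number of times content was hashed and stored
}

// write hashes content the way Git hashes a blob object and stores it.
func (s *blobStore) write(content []byte) string {
	h := sha1.New()
	fmt.Fprintf(h, "blob %d\x00", len(content))
	h.Write(content)
	id := hex.EncodeToString(h.Sum(nil))
	s.blobs[id] = content
	s.writes++
	return id
}

// resolve returns the known hash when the store already holds that blob,
// skipping both the content load and the re-hash. Otherwise it loads the
// content and writes it.
func (s *blobStore) resolve(known string, load func() []byte) string {
	if known != "" {
		if _, ok := s.blobs[known]; ok {
			return known
		}
	}
	return s.write(load())
}

func main() {
	s := &blobStore{blobs: map[string][]byte{}}
	id := s.write([]byte("transcript line\n"))

	loads := 0
	got := s.resolve(id, func() []byte {
		loads++
		return []byte("transcript line\n")
	})
	// The second lookup reuses the hash: no load, no extra write.
	fmt.Println(got == id, loads, s.writes)
}
```

The saving comes from never reading or hashing transcript bytes that are already present under a known hash, which matters when transcripts are large and appear in many checkpoints.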