Skip to content

docs: snapshot & parallel-worktree audit (omnigraph comparison)#999

Merged
carlos-alm merged 6 commits into
mainfrom
docs/snapshot-worktree-audit
Apr 22, 2026
Merged

docs: snapshot & parallel-worktree audit (omnigraph comparison)#999
carlos-alm merged 6 commits into
mainfrom
docs/snapshot-worktree-audit

Conversation

@carlos-alm
Copy link
Copy Markdown
Contributor

Summary

Key takeaway

The 3 filed bugs are the concrete action items. The omnigraph-inspired architectural rewrites (copy-on-write branches, commit-equals-snapshot storage model) don't fit codegraph because they couple graph freshness to git commits, which breaks the "graph tracks your working tree live" invariant.

Test plan

Research pass comparing codegraph to omnigraph.dev. Documents three
filed correctness bugs (#995 snapshot TOCTOU, #996 unlocked journal
appends, #997 watcher never updates journal header) and records why
a content-addressed parse cache was considered and rejected: parsing
is ~10% of full-build time, so the cache would save ~10% on first-
build-in-new-worktree and near-zero on incremental/watch.

Also corrects the initial "structural failure in multi-worktree
incremental updates" framing — each linked worktree gets its own DB
(verified), and incremental builds within a worktree work correctly.
The real gap is that first-build-in-new-worktree doesn't amortize
work other worktrees already did, which is an optimization
opportunity, not a structural failure.
@claude
Copy link
Copy Markdown

claude Bot commented Apr 21, 2026

Claude finished @carlos-alm's task —— View job


PR Review: docs/snapshot-worktree-audit

Todo List:

  • Read the added documentation file
  • Phase 1: Root cause analysis
  • Phase 2: Configuration change detection
  • Phase 3: Test integrity
  • Phase 4: Code quality (documentation)
  • Phase 5: Backlog compliance assessment
  • Provide final assessment with rating

Root Cause Analysis

Problem Being Solved: Addresses the framing of "structural failure in multi-worktree incremental updates" by conducting a systematic comparison with omnigraph.dev to identify actual vs. perceived issues with codegraph's parallel worktree behavior.

Why This Approach: The audit approach is methodical and well-justified. Rather than making architectural assumptions, the author:

  1. Verified actual behavior through local testing (each worktree gets its own DB)
  2. Filed concrete bugs found during investigation (bug(snapshot): TOCTOU race on --force save path #995, bug(journal): concurrent appends are not atomic — corrupted under parallel builds #996, bug(journal): header timestamp lags entries during watch → Tier 0 invalidation forces expensive rescan #997)
  3. Evaluated a specific optimization (content-addressed parse cache) against real performance data
  4. Rejected omnigraph's full model due to incompatibility with codegraph's "live working tree" invariant

Risk Assessment: Very low risk - documentation-only change with no behavioral impact.


Phase 2-3: Configuration/Test Changes

Clean - No configuration changes, no test modifications, no code touched.


Phase 4: Documentation Quality

Strengths:

  • Excellent structure: Clear TL;DR, verification vs. inference table, specific line number references
  • Honest limitations: Section 6 acknowledges gaps (didn't read omnigraph source, limited profile data)
  • Actionable outcomes: Filed 3 specific bugs, provided clear recommendations table
  • Technical depth: Detailed analysis of snapshot TOCTOU race, journal concurrency issues, advisory lock behavior

Technical accuracy verified through code references:

  • src/features/snapshot.ts:27-61 - VACUUM INTO operation ✓
  • src/domain/graph/journal.ts:62-105 - Journal append/header race ✓
  • src/db/connection.ts:112-130 - Advisory lock implementation ✓

Minor areas for improvement:

  • Section 4 conclusion could be stronger - "bad trade" is subjective, but the 10% performance impact analysis supports it
  • Registry collision handling (Section 2.5) could link to specific code

Backlog Compliance

  • Zero-dep: ✓ - No new dependencies, documentation only
  • Foundation-aligned: ✓ - Maintains P1 (always-current) by rejecting omnigraph's commit-coupled model; supports P8 (tool not agent) by focusing on technical analysis over decision-making
  • Problem-fit: 4 - Directly addresses potential architectural concerns that could impact AI agent accuracy/performance with parallel worktrees. While not core dependency-graph functionality, it prevents misguided refactoring that could harm the tool's reliability
  • Breaking: No - Documentation addition only, no API/behavior changes
  • Tier: 1 (zero-dep ✓ + foundation-aligned ✓)

Critical Concerns

None identified. This is exemplary technical documentation that:

  1. Uses evidence-based analysis over speculation
  2. Files actionable bugs rather than working around issues
  3. Maintains architectural integrity by rejecting incompatible designs
  4. Provides specific code references for verification

Final Recommendation

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented Apr 21, 2026

Greptile Summary

This PR adds a documentation audit report comparing codegraph's snapshot and incremental-build design against omnigraph, recording 3 correctness bugs (#995, #996, #997), evaluating and rejecting a content-addressed parse cache optimization, and correcting an earlier "structural failure" framing to a more precise "amortization gap." The document is well-reasoned, appropriately caveated about omnigraph internals, and the previously flagged missing-R2 gap is resolved in this version (table now reads R1 / R2 / R3 / R4 with §4 back-reference).

Confidence Score: 5/5

Docs-only change, safe to merge — no code behavior is affected.

All findings are documentation artifacts with no runtime impact. The previously flagged R2 numbering issue is addressed in this version. The audit content is accurate, well-caveated, and internally consistent.

No files require special attention.

Important Files Changed

Filename Overview
docs/reports/snapshot-and-worktree-audit.md New audit report documenting 3 filed bugs and rejecting a parse-cache optimization; well-structured with appropriate confidence caveats on omnigraph internals, and the previously noted R2 numbering gap is resolved.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[snapshotSave called] --> B{existsSync?}
    B -- yes --> C[unlinkSync existing .db]
    C --> D[VACUUM INTO new .db]
    B -- no --> D
    D --> E[Done]
    style C fill:#ffcccc,stroke:#cc0000
    style D fill:#ffcccc,stroke:#cc0000

    F[Watcher fires] --> G[appendJournalEntries fs.appendFileSync — unlocked]
    H[Manual build fires] --> G
    G --> I{Interleaved writes?}
    I -- yes --> J[Corrupt journal entries]
    I -- no --> K[Clean entries]
    style G fill:#ffcccc,stroke:#cc0000
    style J fill:#ffcccc,stroke:#cc0000

    L[New linked worktree created] --> M[git rev-parse --show-toplevel returns worktree path]
    M --> N[Own .codegraph/graph.db]
    N --> O[First build — full cold parse]
    O --> P[~10% of time is parsing — optimization gap not structural failure]
    style N fill:#ccffcc,stroke:#00aa00
Loading

Reviews (6): Last reviewed commit: "Merge branch 'main' into docs/snapshot-w..." | Re-trigger Greptile

Comment on lines +161 to +169
| ID | Action | Status |
|---|---|---|
| **A1** | Fix #995 — snapshot save TOCTOU race. Use `<name>.db.tmp-<pid>` + atomic rename, or a real file lock. | **Do** |
| **A2** | Fix #996 — lock journal mutations (or replace with a per-line append-only log). | **Do** |
| **A3** | Fix #997 — watcher updates journal header in the same critical section as entry appends. | **Do** |
| ~~R1~~ | ~~Content-addressed parse cache.~~ | **Rejected** — 10% parse share |
| R3 | Bind snapshot names to git identity + metadata + GC. | Nice-to-have. Not urgent. |
| R4 | Share more than parses across worktrees (resolved edges, embeddings) by content hash. | Speculative — conditional on verifying content-purity. Not recommended now. |
| R5 | Replace SQLite-as-materialized-graph with append-only versioned store. | Long-horizon, out of scope. |
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Missing R2 in recommendations table

The recommendations table jumps from R1 directly to R3, with no mention of R2 anywhere in the document. Readers will naturally wonder what R2 was — whether it was merged into another item, deliberately removed, or an accidental numbering skip. A brief parenthetical (e.g. "R2 was merged into A2" or renumbering R3→R2, R4→R3, R5→R4) would prevent future confusion.

Fix in Claude Code

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in dcf3705 — renumbered R3→R2, R4→R3, R5→R4 so the table reads R1 (struck, rejected) / R2 / R3 / R4 with no visible gap. Also added a parenthetical on the R1 row pointing to §4 for the rejection rationale. Updated the one back-reference in §6 (was 'R4', now 'R3').

@carlos-alm
Copy link
Copy Markdown
Contributor Author

@greptileai

@carlos-alm
Copy link
Copy Markdown
Contributor Author

Addressed Greptile's orphaned-source finding (review #999 (review)):

  • Anchored the .claude/hooks/enrich-context.sh:32-34 source citation in §2.4. The hook derives the DB path via git rev-parse --show-toplevel, which is exactly the evidence behind the "linked worktree reads its own DB" claim — so the reference now has a body-text anchor instead of being an orphan.

Fix in e1fbd60.

@carlos-alm
Copy link
Copy Markdown
Contributor Author

@greptileai

@carlos-alm carlos-alm merged commit 6fb43d3 into main Apr 22, 2026
21 checks passed
@carlos-alm carlos-alm deleted the docs/snapshot-worktree-audit branch April 22, 2026 17:10
@github-actions github-actions Bot locked and limited conversation to collaborators Apr 22, 2026
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant