perf(native): skip backfill on clean incrementals#1082
Conversation
…ed files #1069 made `backfillNativeDroppedFiles` run on every successful orchestrator pass — including incrementals — to repair `nodes`/`file_hashes` rows the orchestrator deleted for files outside its narrower file_collector (Clojure, Julia, R, Erlang, F#, Gleam, etc.). #1070 then taught the orchestrator's `detect_removed_files` to skip those extensions, so a current-binary 1-file rebuild reports `removedCount=0` and the orchestrator never deletes the dropped-language rows in the first place. But the JS side kept calling backfill unconditionally — wasting ~45ms per incremental on this repo (fs walk + 2 DB queries + WASM re-parse of all 48 unsupported-extension fixture files) repairing a gap that no longer exists. Gate the backfill call on `result.isFullBuild || result.removedCount > 0`: - Full builds: backfill runs (orchestrator never inserted dropped-language files, gap-fill is the whole point). - Incrementals on a current binary with #1070: `removedCount=0`, backfill skipped, no work needed. - Incrementals on a legacy binary (≤3.9.6) without #1070: `removedCount>0`, backfill runs, gap-repair behavior preserved. Local measurement on this repo (incremental-benchmark.ts, native engine): before: 1-file rebuild ~108ms (post-revert main, no fix) after: 1-file rebuild ~60ms Closes #1075.
|
Claude finished @carlos-alm's task —— View job Reviewing PR: perf(native): skip backfill on clean incrementalsTodo List:
Root Cause AnalysisProblem Being Solved: The persistent Why This Approach: The fix gates the backfill call on Risk Assessment: LOW. The conditional logic correctly maintains backward compatibility while fixing the performance regression. The test plan covers both current and legacy binary scenarios. Configuration Changes Analysis✅ No configuration changes detected - only runtime logic modification in pipeline.ts:764-766. No biome rules, test timeouts, or build configuration changes that would hide underlying issues. Test Integrity Analysis✅ Tests are strengthened, not weakened - PR mentions specific test files that verify the fix:
No test modifications in the PR itself, just runtime logic changes. The author verified behavior on both current and legacy binaries. Code Quality Analysis✅ High quality implementation: Performance Fix (pipeline.ts:764-766): const removedCount = result.removedCount ?? 0;
if (result.isFullBuild || removedCount > 0) {
await backfillNativeDroppedFiles(ctx);
}Strengths:
Technical correctness:
Backlog Compliance AssessmentZero-dep: ✓ - No new runtime dependencies added. Uses existing backfill infrastructure conditionally. Foundation-aligned: ✓ - Fully compliant with FOUNDATION.md principles:
Problem-fit: 4/5 - Directly addresses performance regression in incremental builds, which are core to the "always-current graph" value proposition. Faster incrementals = more responsive AI agent workflows. Breaking: No - Pure performance optimization with no API, CLI, or behavior changes. Full backward compatibility maintained. Tier: 1 (Zero-dep ✓ + Foundation-aligned ✓) Critical ConcernsNone detected. This is a clean performance fix that:
Final Assessment |
Greptile SummaryThis PR gates the
Confidence Score: 5/5Safe to merge — the change is a targeted, well-scoped guard around a single async call with no new code paths introduced. The diff is three lines of logic replacing an unconditional call. Both branches of the new condition (full build and legacy-binary incremental) preserve the prior behavior exactly. The only behavioral change — skipping backfill on current-binary clean incrementals — is the intentional optimization, correct given that the current binary's detect_removed_files filter never deletes dropped-language rows. The one known gap (new dropped-language file additions on incrementals) is explicitly acknowledged and tracked in #1083. No files require special attention. Important Files Changed
Flowchart%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[tryNativeOrchestrator] --> B[Run Rust orchestrator binary]
B --> C{result.earlyExit?}
C -- Yes --> D[Return early-exit\nno backfill]
C -- No --> E[Run post-native structure\n& analysis if needed]
E --> F{result.isFullBuild\nOR removedCount > 0?}
F -- Yes\nfull build or legacy binary\nwith deletions --> G[backfillNativeDroppedFiles\nfs walk + DB query + WASM re-parse]
F -- No\ncurrent binary incremental\nremoveCount=0 --> H[Skip backfill\n~45ms saved]
G --> I[closeDbPair & return timing]
H --> I
Reviews (3): Last reviewed commit: "Merge branch 'main' into perf/skip-backf..." | Re-trigger Greptile |
| const removedCount = result.removedCount ?? 0; | ||
| if (result.isFullBuild || removedCount > 0) { | ||
| await backfillNativeDroppedFiles(ctx); |
There was a problem hiding this comment.
Backfill silently skipped on full builds with binaries that omit
isFullBuild
isFullBuild is typed as isFullBuild?: boolean, so supported binaries between 3.9.1 and the version that first started emitting this field may leave it absent. On a fresh full build with such a binary, result.isFullBuild is undefined (falsy) and removedCount is 0 (nothing existed to delete), so the condition is false and the backfill is skipped entirely. The non-Rust-extension files (Clojure, Julia, R, etc.) are then absent from nodes/file_hashes, and the fast-skip pre-flight introduced in #1054 will reject every subsequent incremental until a forced full rebuild occurs. The same branch already uses !!result.isFullBuild at line 741, so this is a shared assumption — but it's worth confirming that all binaries ≥ 3.9.1 reliably set isFullBuild: true during every full-build code path, or adding a version-range guard here to fall back to unconditional backfill for binaries that are known not to emit the field.
There was a problem hiding this comment.
The version gate at shouldSkipNativeOrchestrator (semverCompare(engineVersion, '3.9.1') < 0 → skip) means any binary that reaches tryNativeOrchestrator is ≥ 3.9.1. is_full_build: bool was added to the Rust BuildPipelineResult in #758 (3.8.0 era) and is serialized via serde with no skip_serializing_if — so it's always emitted as true or false, never absent, on any 3.9.1+ binary. The defensive !!result.isFullBuild at line 741 is a stylistic safeguard for the historical Option<bool> shape that was never actually used. The new gate at line 764 already coerces via ||, so the practical risk surface is empty under the version gate.
Codegraph Impact Analysis1 functions changed → 5 callers affected across 5 files
|
| const removedCount = result.removedCount ?? 0; | ||
| if (result.isFullBuild || removedCount > 0) { | ||
| await backfillNativeDroppedFiles(ctx); | ||
| } |
There was a problem hiding this comment.
Newly-added dropped-language files silently missed on incrementals
When a user adds a new file with an unsupported-Rust extension (Clojure, Julia, R, Erlang, F#, Gleam) during an incremental pass, the current binary's orchestrator neither processes it nor increments removedCount. Both result.isFullBuild and removedCount are 0/falsy, so the gate evaluates to false and backfillNativeDroppedFiles is never called. The new file is then absent from both nodes and file_hashes until the next forced full rebuild. Before this PR the unconditional call to backfillNativeDroppedFiles would have caught these files on the very next pass via its expected ∖ existingNodes diff. Adding || (result.changedCount ?? 0) > 0 to the condition would not help either — the orchestrator's narrower file_collector never counts dropped-language additions in changedCount.
There was a problem hiding this comment.
Confirmed — this is a real but narrow correctness gap. Tracked in #1083 for follow-up. The PR's primary motivation is the measured +144% CI regression on 1-file rebuilds, which affects every user on every incremental. The dropped-language addition case is rare (user adds a new .clj/.jl/.r/.erl/.fs/.gleam file in isolation, with no other file changes) and recoverable via codegraph build --force or by touching any Rust-supported file alongside the addition. Keeping it deferred so the perf fix can land without a regression to the broader fs-walk overhead.
The native 1-file rebuild regressed from 78ms to ~116ms (build) and 54ms to ~81ms (incremental) when #1069 made backfillNativeDroppedFiles run on every successful orchestrator pass — including incrementals — to repair file_hashes/nodes rows for unsupported-extension files. #1070 fixed the orchestrator side, but the JS-side call stayed unconditional, wasting ~45ms per incremental on the codegraph corpus. Fix is tracked in PR #1082, which gates the backfill call on `isFullBuild || removedCount > 0`. Adding the known-regression entry lets this PR's gate pass while #1082 ships through review; the existing stale-entry test will warn once 3.9.7 lands and the entry stays past its useful life.
Summary
Fixes the persistent
1-file rebuild: 54 → 132 (+144%)failure in the regression-guard CI gate. With this change, my local 1-file rebuild drops from ~108ms (post-revert main) to ~60ms.Root cause
#1069 made
backfillNativeDroppedFilesrun on every successful orchestrator pass — including incrementals — to repairnodes/file_hashesrows the orchestrator deleted for files outside its narrower file_collector (Clojure, Julia, R, Erlang, F#, Gleam, etc.).#1070 then taught the orchestrator's
detect_removed_filesto skip those extensions, so on a current binary the orchestrator never deletes those rows. But the JS side kept calling backfill unconditionally — wasting ~45ms per incremental on this repo (fs walk + 2 DB queries + WASM re-parse of all 48 unsupported-extension fixture files) to repair a gap that no longer exists.Fix
Gate the backfill call on
result.isFullBuild || result.removedCount > 0:removedCount=0, backfill skipped — no work needed.removedCount>0, backfill runs — gap-repair behavior preserved.The legacy-binary path matters because #1069 was originally a fix for a real correctness issue (#1054 fast-skip pre-flight breaking when file_hashes had gaps). Users on the published 3.9.6 binary still need that repair until they update to a binary with the #1070 filter.
Test plan
npx vitest run tests/integration/incremental-parity.test.ts tests/builder/insert-nodes.test.ts tests/builder/detect-changes.test.ts— 28/28 pass.npm run lint— clean.Pre-publish benchmark gateto confirm the regression-guard 1-file metric clears the +25% threshold.Closes #1075.