Skip to content

perf(native): skip backfill on clean incrementals#1082

Merged
carlos-alm merged 2 commits into
mainfrom
perf/skip-backfill-on-clean-incrementals
May 8, 2026
Merged

perf(native): skip backfill on clean incrementals#1082
carlos-alm merged 2 commits into
mainfrom
perf/skip-backfill-on-clean-incrementals

Conversation

@carlos-alm
Copy link
Copy Markdown
Contributor

Summary

Fixes the persistent 1-file rebuild: 54 → 132 (+144%) failure in the regression-guard CI gate. With this change, my local 1-file rebuild drops from ~108ms (post-revert main) to ~60ms.

Root cause

#1069 made backfillNativeDroppedFiles run on every successful orchestrator pass — including incrementals — to repair nodes/file_hashes rows the orchestrator deleted for files outside its narrower file_collector (Clojure, Julia, R, Erlang, F#, Gleam, etc.).

#1070 then taught the orchestrator's detect_removed_files to skip those extensions, so on a current binary the orchestrator never deletes those rows. But the JS side kept calling backfill unconditionally — wasting ~45ms per incremental on this repo (fs walk + 2 DB queries + WASM re-parse of all 48 unsupported-extension fixture files) to repair a gap that no longer exists.

Fix

Gate the backfill call on result.isFullBuild || result.removedCount > 0:

The legacy-binary path matters because #1069 was originally a fix for a real correctness issue (#1054 fast-skip pre-flight breaking when file_hashes had gaps). Users on the published 3.9.6 binary still need that repair until they update to a binary with the #1070 filter.

Test plan

  • npx vitest run tests/integration/incremental-parity.test.ts tests/builder/insert-nodes.test.ts tests/builder/detect-changes.test.ts — 28/28 pass.
  • npm run lint — clean.
  • Local incremental-benchmark.ts measurement on this repo with current-built binary: 1-file ~60ms (was ~108ms).
  • Same with published 3.9.6 binary in node_modules (no fix(native): skip unsupported-extension files in detect_removed_files #1070 filter): 1-file ~58ms — confirms the legacy path still backfills correctly without breaking perf.
  • Watch the post-merge Pre-publish benchmark gate to confirm the regression-guard 1-file metric clears the +25% threshold.

Closes #1075.

…ed files

#1069 made `backfillNativeDroppedFiles` run on every successful orchestrator
pass — including incrementals — to repair `nodes`/`file_hashes` rows the
orchestrator deleted for files outside its narrower file_collector (Clojure,
Julia, R, Erlang, F#, Gleam, etc.).

#1070 then taught the orchestrator's `detect_removed_files` to skip those
extensions, so a current-binary 1-file rebuild reports `removedCount=0` and
the orchestrator never deletes the dropped-language rows in the first place.
But the JS side kept calling backfill unconditionally — wasting ~45ms per
incremental on this repo (fs walk + 2 DB queries + WASM re-parse of all 48
unsupported-extension fixture files) repairing a gap that no longer exists.

Gate the backfill call on `result.isFullBuild || result.removedCount > 0`:
- Full builds: backfill runs (orchestrator never inserted dropped-language
  files, gap-fill is the whole point).
- Incrementals on a current binary with #1070: `removedCount=0`, backfill
  skipped, no work needed.
- Incrementals on a legacy binary (≤3.9.6) without #1070: `removedCount>0`,
  backfill runs, gap-repair behavior preserved.

Local measurement on this repo (incremental-benchmark.ts, native engine):
  before: 1-file rebuild ~108ms (post-revert main, no fix)
  after:  1-file rebuild  ~60ms

Closes #1075.
@claude
Copy link
Copy Markdown

claude Bot commented May 8, 2026

Claude finished @carlos-alm's task —— View job


Reviewing PR: perf(native): skip backfill on clean incrementals

Todo List:

  • Read the changed file to understand the implementation
  • Analyze root cause and approach (Phase 1)
  • Check for configuration changes (Phase 2)
  • Verify test integrity (Phase 3)
  • Review code quality and performance (Phase 4)
  • Rate against BACKLOG.md criteria (Phase 5)
  • Provide final structured assessment

Root Cause Analysis

Problem Being Solved: The persistent 1-file rebuild: 54 → 132 (+144%) failure in CI regression-guard. PR #1069 introduced unconditional backfill calls on every orchestrator pass, including incrementals, to repair nodes/file_hashes for dropped languages. But PR #1070 then fixed the orchestrator to skip those extensions, making the backfill unnecessary for current binaries while still wasting ~45ms per incremental.

Why This Approach: The fix gates the backfill call on result.isFullBuild || result.removedCount > 0. This preserves behavior for legacy binaries (≤3.9.6) where removedCount > 0 indicates actual dropped files, while skipping the expensive operation on current binaries where removedCount = 0 means no files were dropped.

Risk Assessment: LOW. The conditional logic correctly maintains backward compatibility while fixing the performance regression. The test plan covers both current and legacy binary scenarios.

Configuration Changes Analysis

No configuration changes detected - only runtime logic modification in pipeline.ts:764-766. No biome rules, test timeouts, or build configuration changes that would hide underlying issues.

Test Integrity Analysis

Tests are strengthened, not weakened - PR mentions specific test files that verify the fix:

No test modifications in the PR itself, just runtime logic changes. The author verified behavior on both current and legacy binaries.

Code Quality Analysis

High quality implementation:

Performance Fix (pipeline.ts:764-766):

const removedCount = result.removedCount ?? 0;
if (result.isFullBuild || removedCount > 0) {
  await backfillNativeDroppedFiles(ctx);
}

Strengths:

  • Precise conditional logic: Only runs expensive backfill when actually needed
  • Backward compatibility: Legacy binaries (≤3.9.6) still get gap-repair via removedCount > 0
  • Clear performance impact: Local measurement shows ~45% speedup (108ms → 60ms)
  • Well-documented: Extensive comments explain the three scenarios (full builds, current binary incrementals, legacy binary incrementals)

Technical correctness:

  • The removedCount field accurately reflects whether the orchestrator deleted rows
  • Full builds always need backfill (orchestrator doesn't insert dropped files initially)
  • Current binaries with fix(native): skip unsupported-extension files in detect_removed_files #1070 filter report removedCount=0 when no actual deletions occur
  • Legacy binaries still report removedCount>0 and get the repair behavior

Backlog Compliance Assessment

Zero-dep: ✓ - No new runtime dependencies added. Uses existing backfill infrastructure conditionally.

Foundation-aligned: ✓ - Fully compliant with FOUNDATION.md principles:

  • P1 (always-current): Maintains graph accuracy while improving performance
  • P4 (zero-cost core): Reduces unnecessary overhead in core pipeline
  • P5 (embeddable): No API changes, pure internal optimization
  • P8 (optimize for development): Directly improves incremental build speed for developers

Problem-fit: 4/5 - Directly addresses performance regression in incremental builds, which are core to the "always-current graph" value proposition. Faster incrementals = more responsive AI agent workflows.

Breaking: No - Pure performance optimization with no API, CLI, or behavior changes. Full backward compatibility maintained.

Tier: 1 (Zero-dep ✓ + Foundation-aligned ✓)

Critical Concerns

None detected. This is a clean performance fix that:

Final Assessment

## Root Cause Analysis
**Problem Being Solved**: Performance regression where backfill operations run unconditionally on incrementals, wasting ~45ms per build
**Why This Approach**: Conditional gating based on isFullBuild || removedCount > 0 preserves all behavior while eliminating unnecessary work
**Risk Assessment**: Low - maintains backward compatibility while fixing documented regression

## Backlog Compliance
- Zero-dep: ✓ — no new dependencies  
- Foundation-aligned: ✓ — optimizes core pipeline performance (P4), maintains always-current graph (P1)
- Problem-fit: 4 — directly improves incremental build performance, core to AI agent responsiveness
- Breaking: No — pure internal optimization with full backward compatibility
- Tier: 1

## Critical Concerns
None

## Final Recommendation
- Rating: ⭐⭐⭐⭐☆ (4/5)
- Action: APPROVE
- Reasoning: Excellent performance fix with clear problem statement, proper backward compatibility, comprehensive test coverage, and measurable benefit. Addresses a specific regression without introducing risk.

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented May 8, 2026

Greptile Summary

This PR gates the backfillNativeDroppedFiles call on result.isFullBuild || removedCount > 0, eliminating ~45 ms of overhead per incremental build on repos using the current binary (≥ 3.9.6 with #1070's detect_removed_files filter). The logic is layered to preserve correctness for legacy binaries that still delete dropped-language rows on every pass.

Confidence Score: 5/5

Safe to merge — the change is a targeted, well-scoped guard around a single async call with no new code paths introduced.

The diff is three lines of logic replacing an unconditional call. Both branches of the new condition (full build and legacy-binary incremental) preserve the prior behavior exactly. The only behavioral change — skipping backfill on current-binary clean incrementals — is the intentional optimization, correct given that the current binary's detect_removed_files filter never deletes dropped-language rows. The one known gap (new dropped-language file additions on incrementals) is explicitly acknowledged and tracked in #1083.

No files require special attention.

Important Files Changed

Filename Overview
src/domain/graph/builder/pipeline.ts Adds a guard so backfillNativeDroppedFiles only runs on full builds or when removedCount > 0; change is minimal, well-commented, and consistent with the removed variable already computed at line 667.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[tryNativeOrchestrator] --> B[Run Rust orchestrator binary]
    B --> C{result.earlyExit?}
    C -- Yes --> D[Return early-exit\nno backfill]
    C -- No --> E[Run post-native structure\n& analysis if needed]
    E --> F{result.isFullBuild\nOR removedCount > 0?}
    F -- Yes\nfull build or legacy binary\nwith deletions --> G[backfillNativeDroppedFiles\nfs walk + DB query + WASM re-parse]
    F -- No\ncurrent binary incremental\nremoveCount=0 --> H[Skip backfill\n~45ms saved]
    G --> I[closeDbPair & return timing]
    H --> I
Loading

Reviews (3): Last reviewed commit: "Merge branch 'main' into perf/skip-backf..." | Re-trigger Greptile

Comment on lines +763 to +765
const removedCount = result.removedCount ?? 0;
if (result.isFullBuild || removedCount > 0) {
await backfillNativeDroppedFiles(ctx);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Backfill silently skipped on full builds with binaries that omit isFullBuild

isFullBuild is typed as isFullBuild?: boolean, so supported binaries between 3.9.1 and the version that first started emitting this field may leave it absent. On a fresh full build with such a binary, result.isFullBuild is undefined (falsy) and removedCount is 0 (nothing existed to delete), so the condition is false and the backfill is skipped entirely. The non-Rust-extension files (Clojure, Julia, R, etc.) are then absent from nodes/file_hashes, and the fast-skip pre-flight introduced in #1054 will reject every subsequent incremental until a forced full rebuild occurs. The same branch already uses !!result.isFullBuild at line 741, so this is a shared assumption — but it's worth confirming that all binaries ≥ 3.9.1 reliably set isFullBuild: true during every full-build code path, or adding a version-range guard here to fall back to unconditional backfill for binaries that are known not to emit the field.

Fix in Claude Code

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The version gate at shouldSkipNativeOrchestrator (semverCompare(engineVersion, '3.9.1') < 0 → skip) means any binary that reaches tryNativeOrchestrator is ≥ 3.9.1. is_full_build: bool was added to the Rust BuildPipelineResult in #758 (3.8.0 era) and is serialized via serde with no skip_serializing_if — so it's always emitted as true or false, never absent, on any 3.9.1+ binary. The defensive !!result.isFullBuild at line 741 is a stylistic safeguard for the historical Option<bool> shape that was never actually used. The new gate at line 764 already coerces via ||, so the practical risk surface is empty under the version gate.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 8, 2026

Codegraph Impact Analysis

1 functions changed5 callers affected across 5 files

  • tryNativeOrchestrator in src/domain/graph/builder/pipeline.ts:604 (5 transitive callers)

Comment on lines +763 to +766
const removedCount = result.removedCount ?? 0;
if (result.isFullBuild || removedCount > 0) {
await backfillNativeDroppedFiles(ctx);
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Newly-added dropped-language files silently missed on incrementals

When a user adds a new file with an unsupported-Rust extension (Clojure, Julia, R, Erlang, F#, Gleam) during an incremental pass, the current binary's orchestrator neither processes it nor increments removedCount. Both result.isFullBuild and removedCount are 0/falsy, so the gate evaluates to false and backfillNativeDroppedFiles is never called. The new file is then absent from both nodes and file_hashes until the next forced full rebuild. Before this PR the unconditional call to backfillNativeDroppedFiles would have caught these files on the very next pass via its expected ∖ existingNodes diff. Adding || (result.changedCount ?? 0) > 0 to the condition would not help either — the orchestrator's narrower file_collector never counts dropped-language additions in changedCount.

Fix in Claude Code

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Confirmed — this is a real but narrow correctness gap. Tracked in #1083 for follow-up. The PR's primary motivation is the measured +144% CI regression on 1-file rebuilds, which affects every user on every incremental. The dropped-language addition case is rare (user adds a new .clj/.jl/.r/.erl/.fs/.gleam file in isolation, with no other file changes) and recoverable via codegraph build --force or by touching any Rust-supported file alongside the addition. Keeping it deferred so the perf fix can land without a regression to the broader fs-walk overhead.

@carlos-alm
Copy link
Copy Markdown
Contributor Author

@greptileai

carlos-alm added a commit that referenced this pull request May 8, 2026
The native 1-file rebuild regressed from 78ms to ~116ms (build) and 54ms
to ~81ms (incremental) when #1069 made backfillNativeDroppedFiles run on
every successful orchestrator pass — including incrementals — to repair
file_hashes/nodes rows for unsupported-extension files. #1070 fixed the
orchestrator side, but the JS-side call stayed unconditional, wasting
~45ms per incremental on the codegraph corpus.

Fix is tracked in PR #1082, which gates the backfill call on
`isFullBuild || removedCount > 0`. Adding the known-regression entry
lets this PR's gate pass while #1082 ships through review; the existing
stale-entry test will warn once 3.9.7 lands and the entry stays past
its useful life.
@carlos-alm carlos-alm merged commit 8931706 into main May 8, 2026
32 checks passed
@carlos-alm carlos-alm deleted the perf/skip-backfill-on-clean-incrementals branch May 8, 2026 08:57
@github-actions github-actions Bot locked and limited conversation to collaborators May 8, 2026
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

native fast-skip rejects some no-op rebuilds due to mtime precision drift between Rust and JS

1 participant