Skip to content

revert: "fix(native): read mtime via BigInt nanoseconds (#1079)"#1080

Merged
carlos-alm merged 2 commits intomainfrom
revert/1079-mtime-bigint
May 8, 2026
Merged

revert: "fix(native): read mtime via BigInt nanoseconds (#1079)"#1080
carlos-alm merged 2 commits intomainfrom
revert/1079-mtime-bigint

Conversation

@carlos-alm
Copy link
Copy Markdown
Contributor

Summary

  • Reverts fix(native): read mtime via BigInt nanoseconds to match Rust truncation #1079.
  • The fix targeted a fast-skip optimization miss caused by f64 rounding of stat.mtimeMs. Investigation showed it was not a correctness bug — when fast-skip misses, the file is re-hashed from contents and the resulting graph is byte-identical either way. The fix only changed the optimization hit rate.
  • The miss-rate window is ~0.026% of fresh-file mtimes (the ~256 ns range where Number(mtimeNs) / 1e6 rounds across a millisecond boundary). The fix paid a constant-cost fs.statSync(path, { bigint: true }) overhead on 100% of stat calls in the change-detection hot path to recover that 0.026% miss rate.
  • The post-merge benchmark gate confirmed the trade was net-negative on the most-affected workload — no-op rebuild went from +70% over baseline (warning) to +108% (hard fail in regression-guard). See run https://github.com/optave/ops-codegraph-tool/actions/runs/25532744746/job/74943479391.

Reopens #1075. The right path forward is to document that fast-skip is a best-effort optimization (the on-disk hash is the source of truth for graph state) rather than chase exact integer parity with Rust's Duration::as_millis().

Test plan

… truncation (#1079)"

This reverts commit 3575880.

#1079 paid a real perf cost (BigInt-mode statSync on every file in change
detection) for what is in practice a no-op: at large epoch values, JS
Math.floor(stat.mtimeMs) can read N+1 where Rust wrote N, busting the
fast-skip optimization. But fast-skip only gates whether to re-hash the
file — when it misses, the hash is recomputed from contents and matches
what's stored, so the graph is byte-identical either way. The fix changes
optimization hit rate, not correctness.

The 0.026% miss-rate window the fix targeted does not justify the
constant-cost BigInt overhead applied to 100% of stat calls on the
change-detection hot path. Post-merge benchmark gate confirmed the
regression: no-op rebuild went from +70% (warning) to +108% (fail).

Reopens #1075. The right path is documenting that fast-skip is
best-effort (the on-disk hash is the source of truth for graph
correctness) rather than chasing exact integer parity with Rust.
@claude
Copy link
Copy Markdown

claude Bot commented May 8, 2026

Claude finished @carlos-alm's task —— View job


Reviewing PR #1080

Conducting thorough review of this revert PR following the 5-phase review process.

Todo List

  • Phase 1: Root cause analysis - understand what problem this revert solves
  • Phase 2: Configuration change detection - check for any rule relaxations
  • Phase 3: Test integrity - verify tests aren't weakened
  • Phase 4: Code quality review - examine the technical changes
  • Phase 5: Backlog compliance assessment
  • Final Assessment: Provide rating and recommendation

Root Cause Analysis

Problem Being Solved: The revert addresses a performance regression introduced in #1079. The original fix used BigInt nanosecond precision to match Rust's Duration::as_millis() truncation, but paid a significant performance cost (~38% regression in no-op rebuilds) for minimal correctness benefit.

Why This Approach: The author correctly identifies that the original issue (#1075) was an optimization problem, not a correctness bug. When the fast-skip optimization misses due to f64 rounding, files are re-hashed from contents and produce identical graph results. The fix targeted ~0.026% of cases but imposed overhead on 100% of stat calls.

Risk Assessment: LOW RISK - This is a clean revert to a known-good state. The benchmark gate confirmed the performance regression, making this a data-driven decision.

Backlog Compliance

  • Zero-dep: ✓ — No new dependencies, removes BigInt overhead
  • Foundation-aligned: ✓ — Improves performance without breaking core principles
  • Problem-fit: 4 — Directly addresses performance concerns that affect AI agent efficiency
  • Breaking: No — Purely internal optimization change, no API impact
  • Tier: 1

Critical Concerns

None. This is an exemplary revert with:

Code Quality Review

The changes are mechanically sound:

  1. fileStat() function: Correctly reverts from { bigint: true } back to standard fs.statSync(), changing return type from { mtime: number; size: number } to { mtimeMs: number; size: number }

  2. Timestamp comparisons: All usage sites properly updated to use Math.floor(stat.mtimeMs) instead of direct stat.mtime

  3. Test integrity: Tests properly reverted without weakening assertions. The removed test specifically targeted the BigInt precision edge case that's no longer relevant.

  4. Performance consideration: The revert eliminates the BigInt conversion overhead on every file stat call, which was the core performance bottleneck.

Technical Assessment

The original #1079 fix targeted a narrow edge case where JavaScript's f64 mtimeMs could round differently than Rust's integer millisecond truncation (affecting ~0.026% of timestamps). However:

  • Impact scope: Only affected the fast-skip optimization hit rate, not graph correctness
  • Fallback behavior: When fast-skip misses, content hashing ensures identical results
  • Performance cost: BigInt operations on 100% of files vs. benefit on 0.026% of files

The cost/benefit analysis clearly favors the revert. The author's proposal to document fast-skip as "best-effort" is the right architectural approach.

Final Recommendation

  • Rating: ⭐⭐⭐⭐⭐ (5/5)
  • Action: APPROVE
  • Reasoning: Exemplary data-driven performance fix with thorough analysis, clean implementation, and proper issue tracking. This represents exactly the kind of evidence-based optimization decision-making that should be encouraged.

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented May 8, 2026

Greptile Summary

This PR reverts #1079, which had switched fileStat to read mtime via BigInt nanoseconds (fs.statSync({ bigint: true })) to eliminate f64-rounding mismatches with Rust's Duration::as_millis(). Benchmarks showed the constant { bigint: true } overhead on every stat call in the change-detection hot path pushed the no-op rebuild cost from +70% (warning) to +108% (hard fail), while the correctness impact was negligible (0.026% fast-skip miss rate; misses only cause re-hashing, not graph divergence).

  • helpers.ts: fileStat reverts to Math.floor(stat.mtimeMs), now flooring once inside the helper so every consumer gets an already-integer mtime without needing to apply Math.floor themselves.
  • insert-nodes.ts: Removes the now-redundant Math.floor(item.stat.mtime) wrappers in the metadataUpdates loops, since item.stat.mtime is guaranteed to be a floor-ed integer from fileStat.
  • tests/builder/detect-changes.test.ts: Drops the BigInt-specific regression tests and stubs added in fix(native): read mtime via BigInt nanoseconds to match Rust truncation #1079 and updates seed helpers to use the standard Math.floor(stat.mtimeMs) path.

Confidence Score: 5/5

Safe to merge — this is a clean, well-scoped revert with consistent internal logic and no correctness regression.

The change is a straightforward revert confirmed by benchmark data. fileStat now floors once internally, all write paths in the diff consume that pre-floored value correctly, and the test suite is updated to match. The only asymmetry — detect-changes.ts still calling Math.floor on an already-integer mtime — is a pre-existing redundancy outside the diff and is entirely harmless.

No files require special attention.

Important Files Changed

Filename Overview
src/domain/graph/builder/helpers.ts Reverts fileStat from BigInt nanosecond reads to Math.floor(stat.mtimeMs); floor is now applied once inside the helper, making all return values guaranteed integers.
src/domain/graph/builder/stages/insert-nodes.ts Correctly removes the now-redundant Math.floor wrappers around item.stat.mtime in both buildFileHashes and updateFileHashes metadata-update loops; the value is pre-floored by fileStat.
tests/builder/detect-changes.test.ts Removes the BigInt stub tests for fileStat and updates seed helpers to use Math.floor(stat.mtimeMs), consistent with the reverted implementation.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[fileStat called] --> B["fs.statSync(filePath)"]
    B --> C["Math.floor(s.mtimeMs)"]
    C --> D["return { mtime: integer, size: number }"]
    D --> E{Consumer}
    E --> F["detect-changes.ts\nstat.mtime used directly\n(already integer)"]
    E --> G["insert-nodes.ts buildFileHashes\nitem.stat.mtime used directly\n(Math.floor removed)"]
    E --> H["insert-nodes.ts updateFileHashes\nitem.stat.mtime used directly\n(Math.floor removed)"]
    F --> I[(file_hashes DB\ninteger mtime column)]
    G --> I
    H --> I
Loading

Reviews (2): Last reviewed commit: "refactor(builder): floor mtime inside fi..." | Re-trigger Greptile

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 8, 2026

Codegraph Impact Analysis

3 functions changed15 callers affected across 5 files

  • fileStat in src/domain/graph/builder/helpers.ts:232 (15 transitive callers)
  • buildFileHashes in src/domain/graph/builder/stages/insert-nodes.ts:105 (3 transitive callers)
  • updateFileHashes in src/domain/graph/builder/stages/insert-nodes.ts:338 (3 transitive callers)

Encapsulate `Math.floor(stat.mtimeMs)` inside the `fileStat` helper so
every consumer of the integer DB column gets a pre-floored value by
default. Eliminates the risk that a future call site reads
`stat.mtimeMs` and stores it un-floored, which would silently write a
non-integer (or rounded-up integer) into the DB and cause spurious
fast-skip misses on the next build.

All six existing call sites simplified from `Math.floor(stat.mtimeMs)`
to `stat.mtime`. Behaviour unchanged.

Addresses Greptile P2 feedback on the revert PR.
@carlos-alm
Copy link
Copy Markdown
Contributor Author

Addressed Greptile P2 feedback ("Math.floor responsibility pushed to every call site"):

  • fileStat now floors mtimeMs inside the helper and returns { mtime: number; size: number } (already-integer mtime that matches the DB column).
  • All six call sites simplified from Math.floor(stat.mtimeMs) to stat.mtime.
  • Behaviour unchanged; pushed as commit bf3f4b4 (refactor — separate from the mechanical revert).

This eliminates the forward-looking concern that a future call site could store an un-floored mtimeMs and cause spurious fast-skip misses.

@carlos-alm
Copy link
Copy Markdown
Contributor Author

@greptileai

@carlos-alm carlos-alm merged commit 21b0b6f into main May 8, 2026
20 checks passed
@carlos-alm carlos-alm deleted the revert/1079-mtime-bigint branch May 8, 2026 07:04
@github-actions github-actions Bot locked and limited conversation to collaborators May 8, 2026
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant