Skip to content
Merged
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
20 changes: 17 additions & 3 deletions scripts/benchmark.ts
Original file line number Diff line number Diff line change
Expand Up @@ -133,9 +133,23 @@ const buildStart = performance.now();
const buildResult = await buildGraph(root, { engine, incremental: false });
const buildTimeMs = performance.now() - buildStart;

const queryStart = performance.now();
fnDepsData('buildGraph', dbPath);
const queryTimeMs = performance.now() - queryStart;
// Warmed median of QUERY_RUNS samples with `noTests: true` to match the
// methodology used by query-benchmark.ts and the per-target `queries.*Ms`
// block below. Earlier versions of this script measured a single cold call,
// which conflated steady-state query latency with NAPI/rusqlite/OS-page-cache
// init costs (~65ms on macOS) and inflated growth from test-fixture files
// pulled in by new native extractors. See #1113 for the methodology rationale.
const QUERY_WARMUP_RUNS = 3;
for (let i = 0; i < QUERY_WARMUP_RUNS; i++) {
fnDepsData('buildGraph', dbPath, { depth: 3, noTests: true });
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Comment slightly overstates benchQuery parity

The comment says the new methodology matches "the per-target queries.*Ms block below", but benchQuery (used for all queries.*Ms values) has no explicit warmup loop of its own. The implicit warming works because the queryTimeMs warmup runs execute before benchQuery is ever called, but that coupling is fragile — if the order of operations ever changes, queries.*Ms loses its warm-up silently. Consider either adding a warmup inside benchQuery, or rephrasing the comment to clarify that the warmup runs here also prime the DB/NAPI cache for the subsequent queries.*Ms block.

Fix in Claude Code

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in de2212c. Added an explicit warmup loop inside benchQuery itself (3 runs, gated by QUERY_WARMUP_RUNS), so each call site primes its own NAPI/rusqlite/page-cache state. The methodology no longer depends on call ordering, and queryTimeMs now uses benchQuery directly — the comment describing parity with the queries.*Ms block is now accurate.

const queryTimings: number[] = [];
for (let i = 0; i < QUERY_RUNS; i++) {
const start = performance.now();
fnDepsData('buildGraph', dbPath, { depth: 3, noTests: true });
queryTimings.push(performance.now() - start);
}
const queryTimeMs = median(queryTimings);

const stats = statsData(dbPath);
const totalFiles = stats.files.total;
Expand Down
23 changes: 13 additions & 10 deletions tests/benchmarks/regression-guard.test.ts
Original file line number Diff line number Diff line change
Expand Up @@ -188,16 +188,19 @@ const SKIP_VERSIONS = new Set(['3.8.0']);
* one-time bump as the cost of supporting Verilog. Tracked separately;
* exempt this release.
*
* - 3.10.0:Query time — cumulative effect of adding two native extractors
* (Solidity #1100 + R #1102) in quick succession. Neither tripped the
* threshold individually (Solidity PR's Query time stayed at 49ms, R PR
* showed no warning), but the combined +110% (49.6 → ~105ms) on the
* `fnDepsData('buildGraph', dbPath)` measurement reflects natural graph
* growth: ~1100 LoC of new extractor code + 9 fixture files added to the
* self-build benchmark expand `buildGraph`'s transitive callee count and
* DB row counts. Tracked in #1113 — exempt this release; remove once
* 3.11.0+ data captures the new steady-state and the per-language
* fixture footprint has been evaluated.
* - 3.10.0:Query time — methodology artifact, not a real regression. The
* metric was a single-shot cold call to `fnDepsData('buildGraph', dbPath)`
* with no warmup, no median, and `noTests: false` — so it captured ~65ms
* of NAPI/rusqlite/OS-page-cache init plus the cost of walking through
* fixture files added by new language extractors. Local v3.9.6 vs HEAD
* on the same corpus measured 78.8ms vs 67.5ms single-shot (HEAD faster),
* while the warmed `queries.fnDepsMs` in the same benchmark showed 4.0ms
* vs 2.8ms — confirming no underlying regression. Methodology fixed in
* #1113: queryTimeMs now uses 3 warmup runs + median of 5 with
* `noTests: true`, matching query-benchmark.ts hygiene. Exemption kept
* in place until 3.11.0+ data captures the new steady-state under the
* updated methodology (expected ~36ms native on this corpus); remove
* the entry then.
*
* - 3.10.0:fnDeps depth 5 — same cause as Query time above. Merging main
* into #1102 added the Erlang extractor (#1103) on top of the existing
Expand Down
Loading