Per-function complexity metadata is now stored on graph nodes and queryable,
alongside several indexing performance and correctness fixes developed and
validated together (3704 tests, ASan/UBSan clean).
Bottleneck metrics (query via query_graph):
- Tier A (in the extraction AST walk): cyclomatic (complexity), cognitive
(nesting-weighted), loop_count, loop_depth (max nested-loop depth),
param_count, max_access_depth.
- Tier B (new pre-dump pass, pass_complexity.c): transitive_loop_depth
propagated along CALLS edges + a recursive flag (direct self-recursion and
mutual-recursion cycles), plus the call-context signals linear_scan_in_loop,
alloc_in_loop, recursion_in_loop and unguarded_recursion.
- query_graph and get_architecture tool descriptions document the metrics and
the Leiden community clusters.
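One possible reading of the Tier B propagation is sketched below. It assumes that transitive_loop_depth is a function's own loop_depth plus the deepest transitive depth among its callees, that CALLS edges inside a recursion cycle add no depth, and that the recursive flag means "can reach itself over CALLS edges". CallGraph, MAX_FUNCS and analyze_call_graph are hypothetical names for illustration; pass_complexity.c may define and compute these values differently.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_FUNCS 16  /* hypothetical cap, small enough for a dense matrix */

/* Hypothetical in-memory view of Function nodes and their CALLS edges. */
typedef struct {
    int  n;                                /* number of functions */
    bool calls[MAX_FUNCS][MAX_FUNCS];      /* calls[i][j]: i CALLS j */
    int  loop_depth[MAX_FUNCS];            /* Tier A input */
    bool reach[MAX_FUNCS][MAX_FUNCS];      /* transitive closure of calls */
    bool recursive[MAX_FUNCS];             /* Tier B output */
    int  transitive_loop_depth[MAX_FUNCS]; /* Tier B output */
    bool done[MAX_FUNCS];                  /* memoization marker */
} CallGraph;

/* Warshall closure: reach[i][j] is true if i can reach j in one or more calls. */
static void compute_reachability(CallGraph *g) {
    for (int i = 0; i < g->n; i++)
        for (int j = 0; j < g->n; j++)
            g->reach[i][j] = g->calls[i][j];
    for (int k = 0; k < g->n; k++)
        for (int i = 0; i < g->n; i++)
            for (int j = 0; j < g->n; j++)
                if (g->reach[i][k] && g->reach[k][j])
                    g->reach[i][j] = true;
    /* Covers direct self-recursion and mutual-recursion cycles alike. */
    for (int i = 0; i < g->n; i++)
        g->recursive[i] = g->reach[i][i];
}

/* Own depth plus the deepest callee, skipping edges that stay inside a cycle. */
static int depth_of(CallGraph *g, int i) {
    if (g->done[i]) return g->transitive_loop_depth[i];
    int deepest_callee = 0;
    for (int j = 0; j < g->n; j++) {
        if (!g->calls[i][j]) continue;
        if (g->reach[i][j] && g->reach[j][i]) continue; /* same cycle as i */
        int d = depth_of(g, j);
        if (d > deepest_callee) deepest_callee = d;
    }
    g->transitive_loop_depth[i] = g->loop_depth[i] + deepest_callee;
    g->done[i] = true;
    return g->transitive_loop_depth[i];
}

void analyze_call_graph(CallGraph *g) {
    assert(g->n >= 0 && g->n <= MAX_FUNCS);
    compute_reachability(g);
    for (int i = 0; i < g->n; i++) g->done[i] = false;
    for (int i = 0; i < g->n; i++) depth_of(g, i);
}
```

Skipping in-cycle edges keeps the recursion finite and makes the result independent of visit order. A dense matrix is only practical for a toy graph; a real pass over millions of nodes would need adjacency lists and a linear-time cycle search.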
Cypher engine:
- node_prop exposes arbitrary persisted node properties to WHERE/RETURN.
- Fix projection aliasing: multi-property rows shared a single static buffer, so
every column returned the last value read; columns now use per-column/rotating
buffers.
- Fix a stack-use-after-scope in aggregate RETURN (caller-owned value buffers).
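The projection aliasing bug has a familiar shape, sketched here in isolation. All names (format_value_shared, format_value_rotating, project_row, ROT_SLOTS, ROT_SIZE) are hypothetical and not taken from the Cypher engine; the sketch only shows why a single static buffer makes every column read the same, and how a small ring of buffers avoids it.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define ROT_SLOTS 8   /* hypothetical: max columns alive at once */
#define ROT_SIZE  64  /* hypothetical: bytes per formatted value */

/* Buggy shape: every call writes into the same static buffer, so all
 * pointers returned earlier now show the most recent value. */
const char *format_value_shared(int v) {
    static char buf[ROT_SIZE];
    snprintf(buf, sizeof buf, "%d", v);
    return buf;
}

/* Rotating shape: each call takes the next slot in a ring, so the last
 * ROT_SLOTS results stay valid at the same time. */
const char *format_value_rotating(int v) {
    static char slots[ROT_SLOTS][ROT_SIZE];
    static unsigned next;
    char *buf = slots[next++ % ROT_SLOTS];
    snprintf(buf, ROT_SIZE, "%d", v);
    return buf;
}

/* Fill one output pointer per column; only safe while count <= ROT_SLOTS. */
void project_row(const int *values, int count, const char **out) {
    assert(count >= 0 && count <= ROT_SLOTS);
    for (int i = 0; i < count; i++)
        out[i] = format_value_rotating(values[i]);
}
```

A rotating ring still has a limit: a row wider than the ring wraps around and aliases again. Caller-owned buffers, as used for the aggregate RETURN fix, remove that limit.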
Indexing performance:
- Gate C/C++ #define Macro-node extraction to full mode (it is ~49% of nodes on
the Linux kernel); moderate/fast skip it.
- Emit the complexity property block only for Function/Method nodes so the
millions of Macro/Field/Variable/Class/Enum nodes no longer carry zeroed
fields — large RAM reduction at scale.
- Classify node types via tree-sitter TSSymbol bitsets in cbm_kind_in_set
instead of per-node strcmp scans (thread-local cache, strcmp fallback;
behaviour-identical).
- Subsample frequent (Zipfian) tokens in the semantic co-occurrence finalize;
~14x faster finalize on the kernel, output unchanged.
- pass_lsp_cross: replace O(n^2) linear dedup with hash-set dedup.
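The bitset classification mentioned above can be illustrated with a minimal membership test. tree-sitter defines TSSymbol as a 16-bit unsigned integer, so the sketch uses uint16_t directly instead of including the tree-sitter headers. SymbolSet, SYMBOL_MAX and the two helpers are hypothetical names; the actual cbm_kind_in_set, its thread-local cache and its strcmp fallback are not reproduced here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SYMBOL_MAX 1024  /* hypothetical upper bound on symbol ids in a grammar */

/* One bit per symbol id: membership is a shift and a mask, not a string scan. */
typedef struct {
    uint64_t words[SYMBOL_MAX / 64];
} SymbolSet;

/* Mark a symbol id as belonging to the set (done once per grammar). */
void symbol_set_add(SymbolSet *s, uint16_t sym) {
    assert(sym < SYMBOL_MAX);
    s->words[sym / 64] |= (uint64_t)1 << (sym % 64);
}

/* Constant-time check used per node during the AST walk. */
bool symbol_set_contains(const SymbolSet *s, uint16_t sym) {
    if (sym >= SYMBOL_MAX) return false;  /* out of range: not a member */
    return ((s->words[sym / 64] >> (sym % 64)) & 1u) != 0;
}
```

The set would be filled once, by resolving each node-type name to its symbol id, and then consulted for every node, which is where the saving over repeated strcmp scans would come from.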
Windows:
- Canonicalize drive-letter case during path normalization so "c:/repo" and
"C:/repo" derive the same project key and cache file (#394/#227/#367).
Tests: extraction, pipeline and cypher regressions covering all of the above.
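The Windows drive-letter canonicalization can be sketched as a small in-place step. The commit does not say which case is treated as canonical, so upper case is an assumption here, and canonicalize_drive_letter is a hypothetical name, not the function used in the path normalizer.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Upper-case a leading drive letter in place, e.g. "c:/repo" -> "C:/repo".
 * Paths without a "<letter>:" prefix are left untouched. */
void canonicalize_drive_letter(char *path) {
    assert(path != NULL);
    /* path[1] is only read when path[0] is a letter, so "" is safe. */
    if (isalpha((unsigned char)path[0]) && path[1] == ':') {
        path[0] = (char)toupper((unsigned char)path[0]);
    }
}
```

Running this before the project key is derived means both spellings of the same drive produce identical input, so they hash to the same key and cache file.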