feat(native): port Gleam extractor to Rust #1105
Claude finished @carlos-alm's task.
Reviewing PR: feat(native): port Gleam extractor to Rust
Phase 1: Root Cause Analysis ✅
Problem Being Solved: This PR implements native Gleam extractor support to complete the dual-engine architecture. Issue #1071 requested Gleam support, and this adds the missing native Rust implementation to match the existing WASM extractor.
Why This Approach: The approach follows the established dual-engine pattern - both native (Rust) and WASM engines must produce identical results for all supported languages. The PR properly mirrors the existing
Risk Assessment: Low risk - this is additive functionality that doesn't modify existing behavior for other languages.

Phase 2: Configuration Change Detection ✅
Configuration Changes Analyzed:
Assessment: Clean addition - no concerning configuration relaxation found.

Phase 3: Test Integrity Verification ✅
Test Modifications Analyzed:
Assessment: Test changes are appropriate and strengthen coverage.

Phase 4: Code Quality Assessment ✅
Native Extractor Implementation (
Strengths:
Code Quality Details:
Integration Points:
Assessment: High-quality implementation that correctly ports WASM functionality to native Rust.

Phase 5: Backlog Compliance Evaluation ✅
Root Cause Analysis
Problem Being Solved: Completes dual-engine architecture parity for Gleam language support (native Rust + WASM)
Backlog Compliance
Critical Concerns
None identified. This is a clean implementation that follows established patterns.
Final Recommendation
Greptile Summary
This PR ports the Gleam symbol extractor from the WASM/JS engine to a native Rust implementation, adding
Confidence Score: 5/5
Safe to merge: the Rust extractor faithfully mirrors the JS engine for all Gleam constructs, previously-flagged issues have been addressed, and the infrastructure changes are self-consistent.
All handler functions have been cross-checked against the JS extractor. The named_child(0) fix is correctly applied in both engines. The function_call | call dual-dispatch is in place. Change detection, file collection, and extension registration are updated atomically. The nine unit tests plus the updated classification test cover the full extractor surface.
No files require special attention.
Important Files Changed
Reviews (4): Last reviewed commit: "fix(gleam): match both function_call and..."
    };

    symbols.definitions.push(Definition {
        name: node_text(&name_node, source).to_string(),
        kind: "function".to_string(),
        line: start_line(node),
        end_line: Some(end_line(node)),
        decorators: None,
        complexity: None,
        cfg: None,
        children: None,
    });
}

fn handle_type_definition(node: &Node, source: &[u8], symbols: &mut FileSymbols) {
handle_external_function drops parameter children
handle_function extracts parameters and stores them as children, but handle_external_function hard-codes children: None. External Gleam functions still have a full parameter list in their signatures, so callers that rely on children to understand arity or parameter names will get nothing for external functions. This creates a silent asymmetry: two functions with identical signatures produce different output depending on whether they are external.
Tracking as a follow-up in #1110. Both engines (native Rust and WASM/JS) currently drop the parameter list for external functions — the Rust port faithfully mirrors existing WASM/JS behavior to keep dual-engine parity, but the silent asymmetry between regular and external Gleam functions is real and worth fixing in both engines together. Deferred to keep this PR scoped to "port to native" rather than "port + change extraction semantics across engines".
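One possible shape for the follow-up is a single helper that both the regular and the external handler call, so the two kinds of function cannot diverge. The sketch below is only an illustration under stated assumptions: `Definition` is a trimmed stand-in for the real struct (the extra fields shown in the diff are omitted), `parameter_children` and `function_definition` are hypothetical names, the "parameter" kind string is assumed, and parameter names are passed in as plain strings instead of being read from tree-sitter nodes.

```rust
/// Trimmed stand-in for the extractor's `Definition` (assumption: the real
/// struct has more fields, such as `end_line`, `decorators`, and `cfg`).
#[derive(Debug, Clone, PartialEq)]
pub struct Definition {
    pub name: String,
    pub kind: String,
    pub line: usize,
    pub children: Option<Vec<Definition>>,
}

/// Hypothetical shared helper: turns parameter names into child definitions.
/// Returns `None` for an empty parameter list (an assumption about the
/// desired output shape, not confirmed by the PR).
pub fn parameter_children(params: &[&str], line: usize) -> Option<Vec<Definition>> {
    if params.is_empty() {
        return None;
    }
    Some(
        params
            .iter()
            .map(|param| Definition {
                name: param.to_string(),
                // "parameter" is an assumed kind string for illustration.
                kind: "parameter".to_string(),
                // Simplification: reuse the function's line for each parameter.
                line,
                children: None,
            })
            .collect(),
    )
}

/// Hypothetical constructor that both a regular and an external handler
/// could call, which would give the two identical `children` output.
pub fn function_definition(name: &str, params: &[&str], line: usize) -> Definition {
    Definition {
        name: name.to_string(),
        kind: "function".to_string(),
        line,
        children: parameter_children(params, line),
    }
}
```

Calling `function_definition("add", &["a", "b"], 3)` from either handler would yield two parameter children, which is the symmetry the review comment asks for.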
    let record = func_node
        .child_by_field_name("record")
        .or_else(|| func_node.child(0));
child(0) fallback for record may pick up anonymous punctuation nodes
func_node.child(0) returns the first child regardless of whether it is named or anonymous. In the Gleam tree-sitter grammar a field_access node's children include the . punctuation token, so the fallback could capture . as the receiver text instead of the module identifier. Prefer func_node.named_child(0) to skip anonymous punctuation tokens.
Suggested change:
-    let record = func_node
-        .child_by_field_name("record")
-        .or_else(|| func_node.child(0));
+    let record = func_node
+        .child_by_field_name("record")
+        .or_else(|| func_node.named_child(0));
Fixed in b971244. Replaced the func_node.child(0) fallback for the record field with func_node.named_child(0) to skip anonymous punctuation tokens. Applied the same fix to the JS extractor (src/extractors/gleam.ts) to keep dual-engine parity.
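The difference between the two fallbacks can be illustrated without tree-sitter at all. The sketch below uses a toy `ToyNode` type (hypothetical, not the tree-sitter `Node` API) whose `child` and `named_child` methods imitate the all-children versus named-children indexing described in the thread, and a hand-built malformed `field_access` whose first child is the `.` token.

```rust
/// Toy stand-in for a syntax-tree node. This is NOT the tree-sitter `Node`
/// type; it only imitates the named/anonymous distinction.
#[derive(Debug, Clone)]
pub struct ToyNode {
    pub kind: &'static str,
    pub text: &'static str,
    pub is_named: bool,
    pub children: Vec<ToyNode>,
}

impl ToyNode {
    pub fn leaf(kind: &'static str, text: &'static str, is_named: bool) -> Self {
        ToyNode { kind, text, is_named, children: Vec::new() }
    }

    /// Analogue of `child(i)`: indexes every child, punctuation included.
    pub fn child(&self, index: usize) -> Option<&ToyNode> {
        self.children.get(index)
    }

    /// Analogue of `named_child(i)`: indexes only the named children.
    pub fn named_child(&self, index: usize) -> Option<&ToyNode> {
        self.children.iter().filter(|c| c.is_named).nth(index)
    }
}

/// Hypothetical malformed `field_access` with no receiver, so the anonymous
/// `.` token ends up as the first child.
pub fn malformed_field_access() -> ToyNode {
    ToyNode {
        kind: "field_access",
        text: ".println",
        is_named: true,
        children: vec![
            ToyNode::leaf(".", ".", false),
            ToyNode::leaf("label", "println", true),
        ],
    }
}
```

On this toy input `child(0)` yields the `.` token while `named_child(0)` skips it and yields the `label` node. Neither is a real receiver here, but the named variant at least never reports punctuation as the receiver text.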
}

fn handle_call(node: &Node, source: &[u8], symbols: &mut FileSymbols) {
    let func_node = match node.child_by_field_name("function").or_else(|| node.child(0)) {
handle_call first-child fallback also uses unnamed child
Same concern at the top-level func_node selection: node.child(0) can return anonymous tokens. node.named_child(0) is consistent with how other extractors resolve this.
Suggested change:
-    let func_node = match node.child_by_field_name("function").or_else(|| node.child(0)) {
+    let func_node = match node.child_by_field_name("function").or_else(|| node.named_child(0)) {
Fixed in b971244. Replaced node.child(0) with node.named_child(0) in handle_call to skip anonymous punctuation tokens. Same fix applied to the JS extractor to keep dual-engine parity.
Codegraph Impact Analysis
29 functions changed → 14 callers affected across 3 files
Replaces child(0) fallbacks in handleCall / handle_call with named_child(0) in both the native Rust and WASM/JS Gleam extractors. The Gleam tree-sitter grammar's field_access node includes the '.' punctuation token as a child, so child(0) on field_access could return '.' as the receiver text on malformed input. named_child(0) skips anonymous tokens and is consistent across both engines. The field accessors always succeed on valid Gleam, so this only affects the defensive fallback path, but it removes a silent asymmetry that Greptile flagged in review of #1105.
        "import" => handle_import(node, source, symbols),
        "function_call" => handle_call(node, source, symbols),
        _ => {}
    }
Missing call node type in match_gleam_node
The JS extractor's walkGleamNode dispatches on both 'function_call' and 'call' node types, but the Rust match_gleam_node only matches "function_call". Any Gleam file parsed with a grammar version that emits call nodes (or if the grammar ever uses both interchangeably) will have its call sites silently skipped by the native extractor while the WASM engine extracts them correctly — breaking the dual-engine parity that this PR aims to guarantee.
Suggested change:
-    }
+    "function_call" | "call" => handle_call(node, source, symbols),
Fixed in a726c3a. Updated match_gleam_node to match both function_call and call node types, matching the JS walkGleamNode dispatch in src/extractors/gleam.ts. The 9 Gleam unit tests still pass.
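The dual dispatch relies on a Rust or-pattern in the `match` arm. The following is a minimal standalone sketch of that idea; the `Handler` enum and `dispatch` function are hypothetical stand-ins that return a tag instead of calling the real `handle_import` / `handle_call` functions.

```rust
/// Hypothetical tag for which handler a node kind would be routed to.
#[derive(Debug, PartialEq, Eq)]
pub enum Handler {
    Import,
    Call,
    Ignored,
}

/// Routes a node kind string to a handler tag. The or-pattern sends both
/// "function_call" and "call" to the same arm, so either spelling is treated
/// as a call site.
pub fn dispatch(kind: &str) -> Handler {
    match kind {
        "import" => Handler::Import,
        "function_call" | "call" => Handler::Call,
        _ => Handler::Ignored,
    }
}
```

Before the fix, the equivalent of `dispatch("call")` would have fallen through to the catch-all arm, which is how call sites could be skipped silently.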
Summary
- tree-sitter-gleam dependency and a native Gleam extractor in crates/codegraph-core/src/extractors/gleam.rs.
- .gleam with LanguageKind::Gleam and the Rust file_collector, adds Gleam to NATIVE_SUPPORTED_EXTENSIONS on the JS side, and wires GLEAM_AST_TYPES / GLEAM_AST_CONFIG on both the native and JS sides so the two engines extract identical ast_nodes for Gleam source.
- extractGleamSymbols: module-level function definitions as function (with parameter children), type definitions as type / record / enum (mapped from the Gleam node kind), constants as variable, import declarations, and function-application call extraction.

Closes #1071
Test plan
- cargo build --release -p codegraph-core (clean build)
- cargo test -p codegraph-core --lib (193/193)
- npx tree-sitter build --wasm regenerates tree-sitter-gleam.wasm
- npx vitest run tests/parsers/gleam.test.ts (4/4)
- npx vitest run tests/parsers/native-drop-classification.test.ts (13/13)