
Lambe 0.4.0: drop XML, bare pipeline ops, MCP surface fixes #1

Merged
hakimjonas merged 3 commits into main from lambe-0.4.0
Apr 23, 2026

Conversation

@hakimjonas
Owner

Summary

  • Drop XML support. The xmlToNative projection silently collapsed repeated sibling elements (last-wins map semantics) and dropped attributes entirely, producing wrong query results. Rather than ship a footgun, XML is removed pending a proper projection design (auto-arrayify siblings, preserve attributes). The rumil XML parser itself is unchanged and spec-compliant.
  • Pipeline ops are now valid bare expressions with an implicit . input: has("k"), length, keys, sum, filter(...), and map(...) all work standalone or inside map/filter. Each is equivalent to . | op. One-line parser change; 22 new tests.
  • MCP surface improvements. Version now read from pubspec.yaml via build-time-generated lib/src/_version.dart (was hardcoded 0.1.0). CSV and TSV exposed in all three format enums (library already supported them). New output_format parameter on lambe_query for yaml/toml/csv/tsv/hcl output, matching the --to CLI flag. Tool descriptions rewritten to document real pitfalls (&&/||, bracket syntax for hyphenated keys, group_by shape).
  • Doc-example test coverage. test/doc_examples_test.dart extracts every lam '...' from AI.md and every embedded query from the MCP server's Dart string literals (34 expressions), and asserts each parses. Prevents phantom-feature drift where LLM-drafted examples advertise syntax the parser doesn't implement.
  • Phantom .. (recursive descent) removed from docs. It was advertised as a Markdown query operator but was never implemented.
  • Fixed broken filter(has("resources") == false) example — which now works on its own merits thanks to the bare-op change.
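
The doc-example check described above can be sketched roughly as follows. This is a minimal illustration in Python, not the contents of test/doc_examples_test.dart: the regular expression, the sample text, and `toy_parse` (a stand-in that only rejects the phantom `..` operator) are all assumptions made for the example.

```python
import re

# Assumed pattern: a `lam` invocation followed by a single-quoted query.
LAM_CALL = re.compile(r"lam '([^']*)'")


def extract_queries(markdown_text):
    """Return every single-quoted query that follows a `lam` invocation."""
    return LAM_CALL.findall(markdown_text)


def check_all_parse(queries, parse):
    """Run parse on each query and collect (query, error) for failures."""
    failures = []
    for query in queries:
        try:
            parse(query)
        except ValueError as err:
            failures.append((query, str(err)))
    return failures


def toy_parse(query):
    # Hypothetical stand-in for the real parser: it only knows that
    # recursive descent ('..') is not implemented.
    if ".." in query:
        raise ValueError("recursive descent '..' is not implemented")
    return query


# Made-up documentation snippet, for illustration only.
sample = """
Count users: lam '.users | length'
Walk the tree: lam '.. | length'
"""

queries = extract_queries(sample)
failures = check_all_parse(queries, toy_parse)
print(queries)
print(failures)
```

A test built this way fails as soon as the docs advertise syntax the parser rejects, which is the drift the bullet above is guarding against.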

Breaking

  • Format.xml, OutputFormat.xml, and XML extension detection (.xml, .pom, .csproj, .svg) removed from the library.
  • --format xml / --to xml CLI flags and the :to xml REPL command no longer accepted.
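
The projection problem that motivated the removal can be reproduced with a few lines of Python. This is a sketch of the failure mode only, not the actual xmlToNative code: `naive_project` is a hypothetical projection that keys a map by child tag and ignores attributes, which is the behavior the summary describes.

```python
import xml.etree.ElementTree as ET


def naive_project(elem):
    """Project an element to a dict keyed by child tag; attributes are ignored."""
    children = list(elem)
    if not children:
        return elem.text
    result = {}
    for child in children:
        # A repeated tag overwrites the earlier entry: last-wins.
        result[child.tag] = naive_project(child)
    return result


doc = ET.fromstring(
    '<deps><dep scope="test">a</dep><dep>b</dep><dep>c</dep></deps>'
)
projected = naive_project(doc)
print(projected)  # {'dep': 'c'}
```

Three `dep` elements and one `scope` attribute go in; a single `dep` entry comes out, with no error to signal the loss. Any query over the projected value would then return a plausible but wrong answer.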

Non-breaking additions

  • lambeVersion exported from package:lambe/lambe.dart.
  • output_format parameter on the lambe_query MCP tool.
  • csv, tsv in all three MCP format enums.
  • tool/gen_version.dart + release workflow step.
  • Bare pipeline ops as expressions — every existing query still works; the change is purely additive.
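
The version-generation step can be pictured as below. This is a Python sketch of the idea, not tool/gen_version.dart itself: the regular expression, the header comment, and the exact shape of the generated Dart line are assumptions; only the lambeVersion name and the two file paths come from this PR.

```python
import re


def read_version(pubspec_text):
    """Return the value of the top-level `version:` field in pubspec text."""
    match = re.search(r"^version:\s*(\S+)\s*$", pubspec_text, re.MULTILINE)
    if match is None:
        raise ValueError("no top-level version field found")
    return match.group(1)


def render_version_file(version):
    """Render Dart source for the generated version file (assumed layout)."""
    return (
        "// GENERATED FILE. Do not edit by hand.\n"
        f"const String lambeVersion = '{version}';\n"
    )


# Made-up pubspec content, for illustration only.
pubspec = "name: lambe\nversion: 0.4.0\n"
generated = render_version_file(read_version(pubspec))
print(generated)
```

Generating the constant at build time keeps pubspec.yaml as the single source of truth, so the MCP server cannot report a stale hardcoded version again.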

Test plan

  • dart analyze lib bin test tool — no issues
  • dart test — 547 pass (was 491 on main; added 22 bare-op tests + 34 doc-example tests)
  • MCP server compiles and reports correct version via stdio initialize
  • 13 ambiguity probes for the bare-op change ({length}, .length, keys[0], "\(length)", length + 1, etc.) all resolve per prior semantics
  • Release workflow end-to-end (will run when tag is pushed after merge)
  • MCP registry publish step no longer fails on duplicate version (will run on tag)

Out of scope

  • Re-introducing XML with a proper projection (future release, needs design discussion for attribute representation and sibling-list semantics).
  • A generic Format.delimited to unify CSV/TSV under the delimited parser (deferred — the current csv/tsv split preserves a useful dialect-detection escape hatch).

Breaking:
- Remove XML input/output support. The xmlToNative projection silently
  collapsed repeated sibling elements (last-wins map semantics) and dropped
  attributes entirely, producing wrong query results with no indication.
  Rather than ship a footgun, XML is dropped pending a proper projection
  design. The rumil XML parser itself is unchanged and remains spec-
  compliant.

MCP surface:
- Add output_format parameter to lambe_query so agents can request
  yaml/toml/csv/tsv/hcl output, matching the --to CLI flag.
- Expose csv, tsv in all three MCP format enums (library already supported
  them; MCP surface was the gap).
- Rewrite tool descriptions and instructions to document common pitfalls:
  && / || for boolean logic, bracket syntax for hyphenated keys, pipeline
  ops requiring a leading |, group_by returning [{key, values}].
- MCP server now reports its actual version via build-time-generated
  lib/src/_version.dart (was hardcoded 0.1.0).

Quality gates:
- tool/gen_version.dart reads pubspec.yaml and writes lib/src/_version.dart.
  Release workflow runs it before compile.
- test/doc_examples_test.dart extracts every lambe expression from AI.md
  code blocks/tables and from every Dart string literal in mcp_server.dart,
  then parses and evaluates each against a fixture. Catches phantom
  features in docs (e.g., LLM-drafted examples advertising unimplemented
  syntax) at CI time.

Bug fixes:
- Remove phantom `..` (recursive descent) from docs. The operator was
  advertised in AI.md and MCP instructions as a Markdown pattern but was
  never implemented.
- Fix broken AI.md example: filter(has("resources") == false) had to be
  written as filter((. | has("resources")) == false), since has is a
  pipeline op.

Tests: 525 pass (491 + 34 extracted doc examples).

Admit _pipeOp into _atom so has("k"), length, keys, filter(...), map(...)
and every other pipe op can appear as standalone expressions.
Semantically equivalent to ". | op" — the evaluator already treats ops as
LamExpr subtypes; the parser was the only thing blocking it. Placed last
in the _atom alternation so existing constructs (object shorthand
{length}, field access .length, string interpolation "\(length)") keep
their prior meaning.

Unblocks shapes agents and users reach for naturally:
  has("users")                       ≡ . | has("users")
  length                             ≡ . | length
  .users | map(has("email"))         ≡ .users | map(. | has("email"))
  .lists | filter(length > 2)        ≡ .lists | filter(. | length > 2)
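
The equivalences above follow from ops being ordinary expressions that evaluate against the current input. A toy evaluator in Python shows the idea; the tuple-based AST and the `evaluate` function are invented for this sketch and do not mirror the LamExpr classes.

```python
def evaluate(expr, data):
    """Evaluate a tiny tuple-encoded expression against the current input."""
    kind = expr[0]
    if kind == "identity":
        return data
    if kind == "field":
        return data[expr[1]]
    if kind == "pipe":
        # Left side produces the input for the right side.
        return evaluate(expr[2], evaluate(expr[1], data))
    if kind == "length":
        return len(data)
    if kind == "has":
        return expr[1] in data
    if kind == "map":
        return [evaluate(expr[1], item) for item in data]
    raise ValueError(f"unknown node: {kind}")


IDENTITY = ("identity",)
data = {"users": [{"email": "a@example.com"}, {"name": "b"}]}

# .users | map(has("email"))
bare = ("pipe", ("field", "users"), ("map", ("has", "email")))
# .users | map(. | has("email"))
explicit = (
    "pipe",
    ("field", "users"),
    ("map", ("pipe", IDENTITY, ("has", "email"))),
)

print(evaluate(bare, data))      # [True, False]
print(evaluate(explicit, data))  # [True, False]
```

Because an op node already reads its input from whatever value it is evaluated against, wrapping it in `. | op` adds nothing at evaluation time, which is why only the parser needed to change.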

MCP instruction block updated: the "pipeline ops must follow |"
prohibition is gone (it was never true at the evaluator level, now not
true at the parser level either). The AI.md workaround
filter((. | has("resources")) == false) reverts to the cleaner
filter(has("resources") == false).

Housekeeping pass on the prior commit's code:
- drop the // ---- Tool: X ---- section banners from mcp_server.dart
  (not used elsewhere in the codebase)
- trim inline "what this does" comments in doc_examples_test.dart; add
  /// docs to every private helper per repo convention
- update the _atom grammar comment in parser.dart to reflect the new
  alternatives

Tests: 547 pass (525 + 22 new bare-op tests).
@hakimjonas hakimjonas marked this pull request as ready for review April 23, 2026 14:22
@hakimjonas hakimjonas merged commit ecc988e into main Apr 23, 2026
3 checks passed