@robin-aws robin-aws commented Dec 5, 2025

Background

  • What do these changes do?
    • Adds the ability to evaluate JmespathExpressions
      • Defines a JmespathRuntime<T> interface to provide the necessary basic operations on JSON values at runtime.
      • The actual Evaluator is largely ported over from smithy-java's JMESPathDocumentQuery, but also takes some inspiration from jmespath-java, which likewise has a generic interface for adapting any existing JSON value representation.
    • Adds JmespathRuntime<T> implementations for both LiteralExpression and Node.
    • Adds a separate smithy-jmespath-tests module with the compliance tests from https://github.com/jmespath/jmespath.test, and instantiates them for both runtime implementations.
  • Why are they important?
    • The existing smithy-java evaluator only works on its own Document type, and evaluating expressions on generated types such as SerializableStruct subtypes currently requires first converting to a Document, which is slow. I've also written up implementations of JmespathRuntime for Document and generated types: see smithy-java link below.
    • JMESPath expressions are very easy to get wrong. The existing type checker/linter helps catch egregious errors but is far from complete. Having an interpreter opens the door to features where models can provide examples/tests of their expressions, possibly using the pending @shapeExamples trait (Add new shapeExamples trait that communicates and enforces allowed and disallowed values #2851)
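To illustrate the idea of a generic runtime interface, here is a minimal sketch, not the PR's actual code. `MiniRuntime`, `JavaObjectRuntime`, and `MiniEvaluator` are hypothetical names, and the operation set is an assumption; the real `JmespathRuntime<T>` is not shown in this conversation and very likely differs. The sketch only shows how an evaluator can stay independent of the JSON value representation by going through an adapter.

```java
import java.util.List;
import java.util.Map;

// Hypothetical, simplified stand-in for a generic runtime interface.
// The real JmespathRuntime<T> is not shown here and likely has more operations.
interface MiniRuntime<T> {
    T createNull();
    boolean isObject(T value);
    boolean isArray(T value);
    T field(T object, String name);   // yields the runtime's null when the field is absent
    int length(T array);
    T element(T array, int index);    // caller guarantees 0 <= index < length(array)
}

// Adapter for plain Java collections: Map = object, List = array, null = JSON null.
final class JavaObjectRuntime implements MiniRuntime<Object> {
    public Object createNull() { return null; }
    public boolean isObject(Object value) { return value instanceof Map; }
    public boolean isArray(Object value) { return value instanceof List; }
    public Object field(Object object, String name) { return ((Map<?, ?>) object).get(name); }
    public int length(Object array) { return ((List<?>) array).size(); }
    public Object element(Object array, int index) { return ((List<?>) array).get(index); }
}

// Handles only two JMESPath forms (field access and array index) against any runtime.
final class MiniEvaluator<T> {
    private final MiniRuntime<T> runtime;

    MiniEvaluator(MiniRuntime<T> runtime) { this.runtime = runtime; }

    T field(T current, String name) {
        return runtime.isObject(current) ? runtime.field(current, name) : runtime.createNull();
    }

    T index(T current, int index) {
        if (!runtime.isArray(current)) {
            return runtime.createNull();
        }
        int length = runtime.length(current);
        int resolved = index < 0 ? length + index : index; // JMESPath allows negative indexes
        return resolved >= 0 && resolved < length
                ? runtime.element(current, resolved)
                : runtime.createNull();
    }
}
```

Under these assumptions, supporting another value representation means writing another `MiniRuntime` implementation; the evaluator itself does not change and no intermediate conversion is needed.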

Testing

  • How did you test these changes?
    • Mainly existing regression tests and the new smithy-jmespath-tests module.

Links

Note for reviewers: I'm not fully satisfied with how I've arranged the code into packages. I want to minimize how much API is exposed publicly, but didn't want to add all the new types into the base package. At the same time I don't want to expose the executable version of the Function interface publicly yet - I want to support custom functions in the future, but I'm not yet sure the current definition is optimal for that.

  • Edit: I dumped all of the Function-related types into the evaluation package and made them package-private. This can be revisited in a future change to support custom functions.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@robin-aws robin-aws requested a review from a team as a code owner December 5, 2025 23:19
@robin-aws robin-aws changed the title from "Smithy jmespath evaluator" to "JMESPath evaluation" Dec 5, 2025

github-actions bot commented Dec 5, 2025

This pull request does not contain a staged changelog entry. To create one, use the ./.changes/new-change command. For example:

./.changes/new-change --pull-requests "#2878" --type feature --description "Smithy jmespath evaluator"

Make sure that the description is appropriate for a changelog entry and that the proper feature type is used. See ./.changes/README or run ./.changes/new-change -h for more information.

github-actions bot commented Dec 6, 2025

This pull request does not contain a staged changelog entry. To create one, use the ./.changes/new-change command. For example:

./.changes/new-change --pull-requests "#2878" --type feature --description "JMESPath evaluation"

Make sure that the description is appropriate for a changelog entry and that the proper feature type is used. See ./.changes/README or run ./.changes/new-change -h for more information.


@Override
public Node createBoolean(boolean b) {
    return new BooleanNode(b, SourceLocation.none());
Member

Use a cached value. I think Node might already handle this for you.

Contributor Author

Node.nullNode() and Node.from(boolean) were not in fact already using cached values, but I fixed that. :)
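As a rough illustration of the caching discussed here, the following sketch shares one immutable instance per value instead of allocating on every call. `MiniNodes`, `MiniBoolean`, and `MiniNull` are hypothetical types invented for this example; they are not Smithy's `Node` classes, and the actual fix in the PR may look different.

```java
// Hypothetical miniature node types; these are NOT Smithy's Node classes.
interface MiniNode {}

final class MiniBoolean implements MiniNode {
    private final boolean value;

    MiniBoolean(boolean value) { this.value = value; }

    boolean getValue() { return value; }
}

final class MiniNull implements MiniNode {}

// Factory that hands out shared, immutable instances instead of allocating each time.
final class MiniNodes {
    private static final MiniBoolean TRUE = new MiniBoolean(true);
    private static final MiniBoolean FALSE = new MiniBoolean(false);
    private static final MiniNull NULL = new MiniNull();

    private MiniNodes() {}

    static MiniBoolean from(boolean value) {
        return value ? TRUE : FALSE;
    }

    static MiniNull nullNode() {
        return NULL;
    }
}
```

Sharing is only safe because the instances are immutable and carry no per-instance state. A node that records where it was parsed from could not be shared this way, which is presumably why the snippet under review passes `SourceLocation.none()`.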


@Override
public Node element(Node array, Node index) {
    return array.expectArrayNode()
            .get(index.expectNumberNode().getValue().intValue())
            .orElseGet(this::createNull);
Member

This is pretty rough for array indexing. I think we should add a method to ArrayNode to get a value at an index or return null if not found to remove indirection and allocations here.

Contributor Author

As in a getOrNull method that returns a possibly-null Node instead of an Optional<Node> as get does?

Contributor Author

I ended up adding an elementAt method that always returns a non-null value (since the elements of an ArrayNode should use NullNode rather than null) and throws if the index is out of range. That works better here because the evaluator already checks the range before indexing. Let me know if that's not what you had in mind.
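The trade-off between the two accessor styles can be sketched as follows. `MiniArray` is a hypothetical wrapper written for this example, not Smithy's `ArrayNode`, and its method bodies are assumptions about the general shape rather than the code added in the PR.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical array wrapper; not Smithy's ArrayNode.
final class MiniArray {
    private final List<String> elements;

    MiniArray(List<String> elements) {
        // List.copyOf rejects null elements, so lookups never yield null.
        this.elements = List.copyOf(elements);
    }

    int size() {
        return elements.size();
    }

    // Optional-returning accessor: safe for any index, but each hit may allocate a wrapper.
    Optional<String> get(int index) {
        return index >= 0 && index < elements.size()
                ? Optional.of(elements.get(index))
                : Optional.empty();
    }

    // Direct accessor: no wrapper and never null, but throws IndexOutOfBoundsException
    // for an out-of-range index, so the caller is expected to check bounds first.
    String elementAt(int index) {
        return elements.get(index);
    }
}
```

When the caller has already validated the index, as the evaluator is said to do, the direct accessor avoids both the `Optional` and the fallback lambda.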

extra["moduleName"] = "software.amazon.smithy.model"

dependencies {
    api(project(":smithy-jmespath"))
Member

I don't love the artifact size hit to smithy-model to unconditionally pull in jmespath. Do we need Node-based evaluation? If so, can it be a bridge package?

Contributor

Yeah I'd rather move this out

Contributor Author

@robin-aws robin-aws Dec 10, 2025

Yes I plan to leverage the Node-based evaluation in the next PR, which will add a new trait that uses JMESPath and wants to validate that positive/negative examples are actually correct.

I've moved the NodeJmespathRuntime class and tests to a separate smithy-model-jmespath package now though.


@Override
public T visitFunction(FunctionExpression functionExpression) {
    Function function = FunctionRegistry.lookup(functionExpression.getName());
Member

It'd be nice if FunctionRegistry also included a way to get all registered functions, and then that could be used to turn runtime function lookup into primitive array index access rather than the current hashmap access. Not sure how though... maybe all builtin functions get assigned a fixed index and FunctionExpression returns that index, or -1 if it's custom. Then the parser can set up the right index, the FunctionExpression getter returns that index, and then maybe add a method to JmespathRuntime called getFunction(FunctionExpression)? Or something like that. This would also remove the global function registry and support custom functions.

Contributor Author

Yeah I was thinking through this as well. The AST would have to change somehow, perhaps to add a FunctionIndexExpression that you can compile FunctionExpressions to.

I will move all of the Function related types into the same package as the evaluator so that all of them can be package-private, so we can figure this out and support custom functions in a later PR.
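One possible shape for the index-based lookup discussed in this thread is sketched below. Everything here is hypothetical: `MiniFunction`, `IndexedFunctions`, the `CUSTOM` sentinel, and the two toy built-ins are invented for illustration, and nothing in this conversation says the eventual design will look like this.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical executable function type; not the PR's Function interface.
interface MiniFunction {
    Object apply(List<Object> arguments);
}

final class IndexedFunctions {
    // Sentinel for names that are not built in (custom functions are out of scope here).
    static final int CUSTOM = -1;

    // Built-ins in a fixed order; a function's position in these arrays is its index.
    private static final String[] NAMES = {"length", "to_string"};
    private static final MiniFunction[] BUILTINS = {
        args -> ((String) args.get(0)).length(),   // toy version: strings only
        args -> String.valueOf(args.get(0)),
    };
    private static final Map<String, Integer> INDEX_BY_NAME = new HashMap<>();

    static {
        for (int i = 0; i < NAMES.length; i++) {
            INDEX_BY_NAME.put(NAMES[i], i);
        }
    }

    private IndexedFunctions() {}

    // Parse time: a single hash lookup resolves the name to a fixed index.
    static int resolve(String name) {
        return INDEX_BY_NAME.getOrDefault(name, CUSTOM);
    }

    // Evaluation time: plain array access, with no hashing of the function name.
    static Object call(int index, List<Object> arguments) {
        if (index == CUSTOM) {
            throw new UnsupportedOperationException("custom functions are not covered by this sketch");
        }
        return BUILTINS[index].apply(arguments);
    }
}
```

In this arrangement the hash lookup happens once, when the expression is parsed or compiled, and the stored index is reused on every evaluation. How the index would be carried in the AST is left open, as it is in the discussion above.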
