Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clarify validation rules of canonical options #466

Merged
merged 1 commit into from
Mar 10, 2025

Conversation

alexcrichton
Copy link
Collaborator

  • Refactor the spec a bit to have a dedicated section for general-purpose validation of canonical options with additional rules per-use-site.
  • Each option is present at most once in a list of options.
  • Validation is required if an option is present, for example memory must always be a subtype of (memory 1) even if the lifting/lowering doesn't require a memory.
  • At most one string encoding can be specified.
  • The callback option is specified what type the function must have.
  • The callback option requires the async option.
  • The async option is explicitly disallowed on error-context.* builtins.
  • The error-context.* builtins require memory and realloc as appropriate.
  • The requirement of memory and realloc on {stream,future}.{read,write} is documented (although "required by" is currently a bit vague, that's left to a future refactoring).

* Refactor the spec a bit to have a dedicated section for
  general-purpose validation of canonical options with additional rules
  per-use-site.
* Each option is present at most once in a list of options.
* Validation is required if an option is present, for example `memory`
  must always be a subtype of `(memory 1)` even if the lifting/lowering
  doesn't require a `memory`.
* At most one string encoding can be specified.
* The `callback` option is specified what type the function must have.
* The `callback` option requires the `async` option.
* The `async` option is explicitly disallowed on `error-context.*`
  builtins.
* The `error-context.*` builtins require `memory` and `realloc` as
  appropriate.
* The requirement of `memory` and `realloc` on
  `{stream,future}.{read,write}` is documented (although "required by"
  is currently a bit vague, that's left to a future refactoring).
* a `realloc` is present if required by lifting and has type `(func (param i32 i32 i32 i32) (result i32))`
* there is no `post-return` in `$opts`
* a `memory` is present if required by lowering
* a `realloc` is present if required by lowering
* if `contains_async_value($ft)`, then `$opts.async` must be set
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lukewagner @dicej I want to call attention to this point, I didn't change this here but this is a mismatch between the spec and wasm-tools. Currently wasm-tools does not validate this (and thus Wasmtime doesn't validate this).

@lukewagner is this a historical holdout from a previous version of things? Or should this still be required? If so it's another thing to keep track of @dicej (and I can work on wasm-tools/etc updates)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch; I think we can delete this rule now.

@alexcrichton
Copy link
Collaborator Author

The main motivation behind this was I wasn't able to find the one-liner of "what is the type of callback" (although I probably missed it) and that kind of snowballed into more formally specifying what wasmparser does today in terms of validation.

Copy link
Member

@lukewagner lukewagner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great, thanks for doing this. A few quick questions:

* a `realloc` is present if required by lifting and has type `(func (param i32 i32 i32 i32) (result i32))`
* there is no `post-return` in `$opts`
* a `memory` is present if required by lowering
* a `realloc` is present if required by lowering
* if `contains_async_value($ft)`, then `$opts.async` must be set
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch; I think we can delete this rule now.

@@ -3034,6 +3039,26 @@ elimination of string operations on the labels of records and variants) as well
as post-MVP [adapter functions].


### `canon $opts`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: could you move this section to be before canon lift and call it "canonopt validation"?

Comment on lines +3046 to +3051
specifying `string-encoding=utf8` twice is an error. Each individual option, if
present, is validated as such:

* `string-encoding=utf8` - cannot be combined with `utf16` or `latin1+utf16`
* `string-encoding=utf16` - cannot be combined with `utf8` or `latin1+utf16`
* `string-encoding=latin1+utf16` - cannot be combined with `utf8` or `utf16`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we collapse this sentence and 3 bullets down to:

  • At most one string-encoding=X option can be set

Comment on lines +3495 to +3497
* `memory` is required for `stream.write` if required by lowering
* `memory` is required for `stream.read` if required by lifting
* `realloc` is required for `stream.read` if required by lifting
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would a string-encoding bullet also be required along the same lines? Instead of listing memory, realloc and string-encoding out for each built-in, could we factor out these bullets into the new canonopt section?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the idea of collapsing things yeah, I'll work on adding that.

Otherwise though I don't think we'd have a bullet for string-encoding because it's implicitly always set even when not mentioned, so at least currently in wasmparser there's no validation of string-encoding other than at most one is present.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh right, good point; thanks!

@alexcrichton alexcrichton merged commit 0ed3af4 into WebAssembly:main Mar 10, 2025
1 check passed
@alexcrichton
Copy link
Collaborator Author

Sorry that was a push gone wrong. I've reverted the changes on main but I can't seem to reopen this PR any more. I've opened a new one at #467 addressing your comments Luke

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants