-
Notifications
You must be signed in to change notification settings - Fork 9
RFC: typed array conveniences #47
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Open
dherman
wants to merge
2
commits into
main
Choose a base branch
from
typed-array-conveniences
base: main
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
Changes from 1 commit
Commits
Show all changes
2 commits
Select commit
Hold shift + click to select a range
File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,312 @@ | ||
| - Feature Name: typed_array_conveniences | ||
| - Start Date: 2022-07-23 | ||
| - RFC PR: (leave this empty) | ||
| - Neon Issue: (leave this empty) | ||
|
|
||
| # Summary | ||
| [summary]: #summary | ||
|
|
||
| Core support for reading and writing typed arrays was added in [RFC 39](https://github.com/neon-bindings/rfcs/blob/main/text/0039-node-api-borrow.md). This RFC adds APIs for conveniently **constructing and accessing metadata from** typed arrays. | ||
|
|
||
| # Motivation | ||
| [motivation]: #motivation | ||
|
|
||
| Constructing typed arrays can be done in JavaScript or by accessing the constructors from `cx.global()`, but this requires a lot of boilerplate and is slower than calling the direct Node-API functions for doing the same. | ||
|
|
||
| Reading metadata such as the byte length can be performed by borrowing a typed array's contents and doing arithmetic on the slice size. But this is error-prone and not particularly discoverable. | ||
|
|
||
| Having a richer API for working with typed arrays also makes for a better learning experience in the API docs. | ||
|
|
||
| # Guide-level explanation | ||
| [guide-level-explanation]: #guide-level-explanation | ||
|
|
||
| This example creates a typed array of unsigned 32-bit integers with a user-specified length: | ||
|
|
||
| ```rust | ||
| fn create_int_array(mut cx: FunctionContext) -> JsResult<JsTypedArray<u32>> { | ||
| let len = cx.argument::<JsNumber>(0)?.value(&mut cx) as usize; | ||
| JsTypedArray::new(&mut cx, len) | ||
| } | ||
| ``` | ||
|
|
||
| This example creates a typed array as a view over a region of an existing buffer: | ||
|
|
||
| ```rust | ||
| // Allocate a 16-byte ArrayBuffer and a uint32 array of length 2 (i.e., 8 bytes) | ||
| // starting at byte offset 4 of the buffer: | ||
| // | ||
| // 0 4 8 12 16 | ||
| // +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | ||
| // buf: | | | | | | | | | | | | | | | | | | ||
| // +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | ||
| // ^ ^ | ||
| // | | | ||
| // +-------+-------+ | ||
| // arr: | | | | ||
| // +-------+-------+ | ||
| // 0 1 2 | ||
| let buf = cx.array_buffer(16); | ||
| let arr = JsUint32Array::from_region(&mut cx, buf.region(4, 2))?; | ||
| ``` | ||
|
|
||
| # Reference-level explanation | ||
| [reference-level-explanation]: #reference-level-explanation | ||
|
|
||
| ## Convenience types | ||
|
|
||
| The `neon::types` module contains the following convenient type aliases: | ||
|
|
||
| | Rust Type | Convenience Type | JavaScript Type | | ||
| | --------------------| ------------------ | ---------------------------------- | | ||
| | `JsTypedArray<u8>` | `JsUint8Array` | [`Uint8Array`][Uint8Array] | | ||
| | `JsTypedArray<i8>` | `JsInt8Array` | [`Int8Array`][Int8Array] | | ||
| | `JsTypedArray<u16>` | `JsUint16Array` | [`Uint16Array`][Uint16Array] | | ||
| | `JsTypedArray<i16>` | `JsInt16Array` | [`Int16Array`][Int16Array] | | ||
| | `JsTypedArray<u32>` | `JsUint32Array` | [`Uint32Array`][Uint32Array] | | ||
| | `JsTypedArray<i32>` | `JsInt32Array` | [`Int32Array`][Int32Array] | | ||
| | `JsTypedArray<u64>` | `JsBigUint64Array` | [`BigUint64Array`][BigUint64Array] | | ||
| | `JsTypedArray<i64>` | `JsBigInt64Array` | [`BigInt64Array`][BigInt64Array] | | ||
| | `JsTypedArray<f32>` | `JsFloat32Array` | [`Float32Array`][Float32Array] | | ||
| | `JsTypedArray<f64>` | `JsFloat64Array` | [`Float64Array`][Float64Array] | | ||
|
|
||
| These can be convenient to type, and also create a discoverable place for documentation and examples in the API docs. | ||
|
|
||
| ## Constructing typed arrays | ||
|
|
||
| This RFC defines a number of APIs for constructing typed arrays. | ||
|
|
||
| ### `JsTypedArray` | ||
|
|
||
| The `JsTypedArray` type gets three new constructor methods for creating a new typed array. | ||
|
|
||
| ```rust | ||
| impl<T: Binary> JsTypedArray<T> { | ||
| // Constructs a new typed array from a length, allocating a new backing buffer. | ||
| pub fn new<'cx, C>( | ||
| cx: &mut C, | ||
| len: usize, | ||
| ) -> JsResult<'cx, Self> | ||
| where | ||
| C: Context<'cx>; | ||
|
|
||
| // Constructs a new typed array from an existing buffer. | ||
| pub fn from_buffer<'cx, 'b: 'cx, C>( | ||
| cx: &mut C, | ||
| buffer: Handle<'b, JsArrayBuffer>, | ||
| ) -> JsResult<'cx, Self> | ||
| where | ||
| C: Context<'cx>; | ||
|
|
||
| // Constructs a new typed array from a region of an existing buffer. | ||
| pub fn from_region<'c, 'r, C>( | ||
| cx: &mut C, | ||
| region: Region<'r, T>, | ||
| ) -> JsResult<'c, Self> | ||
| where | ||
| C: Context<'c>; | ||
| } | ||
| ``` | ||
|
|
||
| ### `JsArrayBuffer` | ||
|
|
||
| The `JsArrayBuffer` type gets a new method for constructing a `Region` struct, which describes a region of a buffer (see below). | ||
|
|
||
| ```rust | ||
| impl Handle<'cx, JsArrayBuffer> { | ||
| pub fn region<T: Binary>(self, byte_offset: usize, len: usize) -> Region<'cx, T>; | ||
| } | ||
| ``` | ||
|
|
||
| ### Helper types | ||
|
|
||
| #### `neon::types::binary::Binary` | ||
kjvalencik marked this conversation as resolved.
Outdated
Show resolved
Hide resolved
|
||
|
|
||
| The constructor functions require passing a type tag to Node-API functions, so this RFC defines a sealed `Binary` trait to define on all the primitive element types of typed arrays (`i8`, `u8`, `i16`, `u16`, etc) in order to define the type tag as an associated constant. This is exposed for documentation and to allow userland generic helper functions that work on all typed array types. | ||
|
|
||
| ```rust | ||
| trait Binary: Sealed + Copy + Debug { | ||
| const TYPE_TAG: /* hidden */; | ||
| } | ||
| ``` | ||
|
|
||
| #### `neon::types::binary::Region` | ||
|
|
||
| The `Region` struct defines a region of an underlying buffer. For performance, it's not actually checked to be a valid region until actually constructing a typed array object. (Otherwise, every method on the type would have to re-validate every time it's called, because of the possibility of the buffer getting detached.) | ||
|
|
||
| ```rust | ||
| #[derive(Clone, Copy)] | ||
| struct Region<'cx, T: Binary> { /* hidden */ } | ||
|
|
||
| impl<'cx, T: Binary> Region<'cx, T> { | ||
| // Constructs a new typed array for this region. | ||
| pub fn to_typed_array<'c, C>(self, cx: &mut C) -> JsResult<'c, JsTypedArray<T>> | ||
kjvalencik marked this conversation as resolved.
Outdated
Show resolved
Hide resolved
|
||
| where C: Context<'c>; | ||
| } | ||
kjvalencik marked this conversation as resolved.
Show resolved
Hide resolved
|
||
| ``` | ||
|
|
||
| ## Convenient queries | ||
|
|
||
| This RFC also defines a number of methods that query typed arrays for metadata. | ||
|
|
||
| ### `JsTypedArray` | ||
|
|
||
| ```rust | ||
| impl<T: Binary> JsTypedArray<T> { | ||
| // Returns the region of the backing buffer for this typed array. | ||
| pub fn to_region<'cx, C>( | ||
| &self, | ||
| cx: &mut C, | ||
| ) -> Region<'cx, T> | ||
| where | ||
| C: Context<'cx>; | ||
|
|
||
| // Returns the backing buffer. | ||
| pub fn buffer<'cx, C>( | ||
| &self, | ||
| cx: &mut C, | ||
| ) -> Handle<'cx, JsArrayBuffer> | ||
| where | ||
| C: Context<'cx>; | ||
|
|
||
| // Returns the byte offset into the backing buffer. | ||
| pub fn byte_offset<'cx, C>( | ||
| &self, | ||
| cx: &mut C, | ||
| ) -> usize | ||
| where | ||
| C: Context<'cx>; | ||
|
|
||
| // Returns the number of elements in the typed array. | ||
| pub fn len<'cx, C>( | ||
| &self, | ||
| cx: &mut C, | ||
| ) -> usize | ||
| where | ||
| C: Context<'cx>; | ||
| } | ||
| ``` | ||
|
|
||
| ### `TypedArray` | ||
|
|
||
| This RFC adds a convenience method to the `TypedArray` trait: | ||
|
|
||
| ```rust | ||
| pub trait TypedArray: Sealed { | ||
| /* as before... */ | ||
|
|
||
| // Returns the number of bytes in this typed array. | ||
| fn byte_length<'cx, C>(&self, cx: &mut C) -> usize | ||
| where | ||
| C: Context<'cx>; | ||
| } | ||
| ``` | ||
|
|
||
| ### `Region` | ||
|
|
||
| The `Region` struct also defines a number of convenient queries. | ||
|
|
||
| ```rust | ||
| impl<'cx, T: Binary> Region<'cx, T> { | ||
|
|
||
| pub fn buffer(self) -> Handle<'cx, JsArrayBuffer>; | ||
|
|
||
| pub fn byte_offset(self) -> usize; | ||
|
|
||
| pub fn len(self) -> usize; | ||
|
|
||
| pub fn byte_length(self) -> usize; | ||
| } | ||
| ``` | ||
|
|
||
|
|
||
| # Drawbacks | ||
| [drawbacks]: #drawbacks | ||
|
|
||
| There aren't any real drawbacks to exposing this functionality, since it's provided by Node-API. There are, however, a number of tradeoffs in the particulars of the design. | ||
|
|
||
| The main drawbacks of this RFC's proposed design are: | ||
|
|
||
| ## Naming inconsistency: `len()` vs `byte_length()` | ||
|
|
||
| This RFC goes with `len()` to match a strong Rust convention, and `byte_length()` to match the typed array naming convention (using snake case, of course). These does lead to an inconsistency between the two. A couple alternatives would be `byte_len()` or `size()`. But then `byte_len()` ends up feeling a bit inconsistent with `byte_offset()`. We could then choose `byte_off()` but that's a bit opaque. | ||
kjvalencik marked this conversation as resolved.
Outdated
Show resolved
Hide resolved
|
||
|
|
||
| ## Unfortunate docs placement of `region()` | ||
|
|
||
| In order to make the lifetimes work out, `region()` is a method of `Handle<JsArrayBuffer>` instead of `JsArrayBuffer`. This has the unfortunate effect of placing the method in the API docs for [`neon::handle::Handle`][neon::handle::Handle] instead of [`neon::types::JsArrayBuffer`][neon::types::JsArrayBuffer], which makes them less discoverable. The other docs should make an effort to mention the method prominently, use it in examples, and link to its docs in order to mitigate this issue. | ||
dherman marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
|
||
| We could eliminate the underlying buffer from the `Region` data structure. Conversion to a typed array would require passing the buffer in as an additional parameter, which ends up more verbose and redundant: | ||
| ```rust | ||
| buffer.region(4, 2).to_typed_array(&mut cx, buffer) | ||
| ``` | ||
|
|
||
| ## Cost of multiple FFI calls | ||
|
|
||
| The `.buffer()`, `.byte_offset()`, `.len()`, and `.byte_length()` methods all make FFI calls to the same underlying Node-API function ([`napi_get_typedarray_info`][napi_get_typedarray_info]), which incurs unnecessary FFI overhead. Unfortunately, the semantics of JavaScript buffer detachment means that `.byte_offset()`, `.len()`, and `.byte_length()` can all change their value over time, which means it's not safe to cache the results of this FFI call. | ||
|
|
||
| The `.to_region()` method mitigates this issue by allowing performance-conscious Neon users to incur a single FFI call to retrieve all these values at once. | ||
|
|
||
| # Rationale and alternatives | ||
| [alternatives]: #alternatives | ||
|
|
||
| ## Conversion via `From`/`Into` | ||
|
|
||
| Making use of the standard conversion APIs of Rust to convert between buffers and typed arrays, or between regions and typed arrays, isn't really an option since all conversions need a context. | ||
|
|
||
| ## Range syntax | ||
|
|
||
| It's appealing to imagine using Rust's [range syntax][std::ops::Range] for regions, and even to use [index overloading][std::ops::Index] to enable syntax like: | ||
|
|
||
| ```rust | ||
| JsUint32Array::from_range(&mut cx, buffer[4..12]) | ||
| ``` | ||
|
|
||
| This runs into two problems: | ||
|
|
||
| 1. **Impedance mismatch:** The JavaScript range APIs all take a byte offset and a typed length, whereas the Rust range syntax takes a start offset and end offset. | ||
| 2. **`Index` incompatibility:** The signature of [`Index::index()`][std::ops::Index::index] returns a reference, so it's not allowed to allocate a custom struct. This makes it [effectively incompatible](https://users.rust-lang.org/t/how-to-implement-index-range-usize-for-a-custom-slice-type-with-additional-data/66201) with custom slice-like types such as `Region`. | ||
|
|
||
| ## Untyped regions | ||
|
|
||
| It might seem nonobvious why `Region` is typed. An untyped alternative would lose track of the meaning of the `len()` value. Another alternative would be to use byte length instead, but this would be less consistent with the JavaScript typed array APIs. | ||
|
|
||
| It's also possible to leverage type inference to avoid having to actually state the type explicitly: | ||
| ```rust | ||
| let arr = JsUint32Array::from_region(&mut cx, buf.region(4, 2))?; | ||
| ``` | ||
|
|
||
| ## Flat constructor for `from_region` | ||
|
|
||
| This RFC's design makes `from_region` and `to_region` symmetric signatures with `Region` as an input type vs. return type, respectively. An alternative approach would be to make `from_region` take the `byte_offset` and `len` as immediate arguments of `from_region`, and have `to_region` return a tuple. However, this would risk being more error-prone: users would have to remember what the two different `usize` elements of the tuple represent, and it would be easy to confuse the `len` integer for a byte length. It would also be untyped (see above). | ||
|
|
||
| ## Builder pattern for `Region` | ||
|
|
||
| Instead of `buffer.region(offset, len)`, we could expose builder methods for constructing regions, like `buffer.byte_offset(offset).len(len)`. This would have the benefit of making the semantics of the integers more unambiguous, but doesn't seem worth the extra verbosity. | ||
|
|
||
| ## Alternative names for `Region` | ||
dherman marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
|
||
| One option would be to use "slice" terminology, since a `Region` is similar to a Rust slice. But a slice is a particular datatype in Rust, whereas this is a custom type. Also, a slice is a live view, and a region is more like a _description_ of a slice but it's not a live view until it's converted to a typed array. | ||
|
|
||
| Another option would be "range" terminology, but again, a [range][std::ops::Range] is a specific thing in Rust. | ||
|
|
||
|
|
||
| # Unresolved questions | ||
| [unresolved]: #unresolved-questions | ||
|
|
||
| None. | ||
|
|
||
| [std::ops::Index]: https://doc.rust-lang.org/stable/std/ops/trait.Index.html | ||
| [std::ops::Range]: https://doc.rust-lang.org/stable/std/ops/struct.Range.html | ||
| [std::ops::Index::index]: https://doc.rust-lang.org/stable/std/ops/trait.Index.html#tymethod.index | ||
| [neon::handle::Handle]: https://docs.rs/neon/latest/neon/handle/struct.Handle.html | ||
| [neon::types::JsArrayBuffer]: https://docs.rs/neon/latest/neon/types/struct.JsArrayBuffer.html | ||
| [napi_get_typedarray_info]: https://nodejs.org/api/n-api.html#napi_get_typedarray_info | ||
| [Uint8Array]: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Uint8Array | ||
| [Int8Array]: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Int8Array | ||
| [Uint16Array]: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Uint16Array | ||
| [Int16Array]: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Int16Array | ||
| [Uint32Array]: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Uint32Array | ||
| [Int32Array]: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Int32Array | ||
| [BigUint64Array]: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/BigUint64Array | ||
| [BigInt64Array]: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/BigInt64Array | ||
| [Float32Array]: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Float32Array | ||
| [Float64Array]: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Float64Array | ||
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Uh oh!
There was an error while loading. Please reload this page.