can streams be transferred via postMessage()? #276
I don't know if there are plans for this, but one thought I had: ...
Yeah, that sounds about right. I was hoping to work out the details, and at some point I had an ambitious plan to first figure out how we clone/transfer promises and then do the same for streams (see dslomov/ecmascript-structured-clone#5). But maybe we can just jump to the end if this is important for a particular use case that an implementer wants to implement.
I was talking with @sicking today and he raised an interesting point. What do we do in this situation:
Ideally the worker would have access to the stream data without all the buffers going through the original thread, but that seems impossible if .read() is implemented in JS in the original context. Can we make the .read() and .write() methods unforgeable? Or should postMessage() just throw in this situation?
I would imagine we just do what we do for all other platform objects (including e.g. promises), and define the behavior of how other code interacts with them not in terms of their public API but in terms of abstract operations that apply to them. So e.g. the postMessage algorithm references TeeReadableStream(s) instead of s.tee().
(Sometimes this is also stated in specs as "using the original value of ...".)
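A user-land sketch of the difference (the spec actually operates on internal slots, which JS can't express directly; capturing the built-in method up front is the closest analogy):

```js
// Tamper-prone: goes through the public API, so a monkey-patched
// s.tee() would be observed.
const [a, b] = s.tee();

// Closer to "using the original value": capture the built-in method
// before author code can overwrite it, and always call that.
const originalTee = ReadableStream.prototype.tee;
const [c, d] = originalTee.call(s);
```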
And because the tee concept operates on the source, it effectively bypasses any overrides on the outer object?
I mean, I'd say that it operates on the stream, but does so by reaching into its innards instead of through the public API, in order to be more tamper-proof.
So that's true even for piping? I.e. if I do

```js
var writable = nativeFileSystemAPI.appendToFile(filename);
writable.write = function(buffer) {
  console.log(buffer);
  WritableStream.prototype.write.call(this, buffer);
}
readable.pipeTo(writable);
```

then nothing will get logged, since ...?
@sicking That's a good question. My initial answer was no: ... However, I of course recognize the impulse behind it; you want fast off-main-thread piping to be the norm for UA-created streams talking to each other. I see a few perspectives on your question in that light: ...
I'd be quite interested in an implementer perspective on which of these seems more attractive (from everyone, @tyoshino and @yutakahirano included!).
I asked @bzbarsky and this option is possible, but it would probably not be our first choice. It would require adding back features that were removed from SpiderMonkey for being a bit hacky. It seems we should go with other options if possible.
I should note that option 1 requires carefully defining exactly what happens if the deopt happens in the middle of the copy operation somewhere; you have to synchronize somehow so that the writes that need to be observable after the deopt are actually observable or something. |
We've redesigned the readable stream to have the reader+stream structure and are going to also introduce the writer+stream structure to the writable stream. Given that change, the example by Jonas (#276 (comment)) doesn't work, since pipeTo() obtains a writer by itself. There's no point where we can substitute ...
So, to implement the 1st option in Domenic's comment (#276 (comment)), for example, we should make overriding getReader() deopt the special piping algorithm for the readable stream / writable stream pair, I guess.
But ...
@tyoshino, sorry, I don't understand what you meant at #276 (comment). Can you explain?
It's not the streams but the reader and the writer that matter for pipeTo. If the reader and the writer obtained via getReader() and getWriter() are authentic, we can enable the optimization.
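For concreteness, the kind of check being proposed might look roughly like this (a hedged sketch, not spec text; it assumes the reader and writer classes are reachable globally for comparison):

```js
// Hypothetical deopt check inside pipeTo(): after acquiring the reader
// and writer, only take the fast off-main-thread path when both are
// the real, unmodified built-ins.
const reader = source.getReader();
const writer = dest.getWriter();
const authentic =
  Object.getPrototypeOf(reader) === ReadableStreamDefaultReader.prototype &&
  Object.getPrototypeOf(writer) === WritableStreamDefaultWriter.prototype &&
  reader.read === ReadableStreamDefaultReader.prototype.read &&
  writer.write === WritableStreamDefaultWriter.prototype.write;
if (!authentic) {
  // fall back to the slow, fully observable JS path
}
```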
OK, that seems to push the issue off to how pipeTo interacts with the reader and writer. That is, all of #276 (comment) applies but now to the reader and writer, right?
I think it helps reduce the surface area a little bit. In particular, one strategy would be that you only have to check if ... Except... what if someone overwrites ...

Another strategy would be that the first thing the algorithm does is grab all the methods off the writer, and then use them forevermore. E.g.

```js
const [write, close] = [writer.write, writer.close];
// we can unobservably insert a check here that write and close are the expected ones...
// later:
write.call(writer) // instead of writer.write()
```

Except ... this doesn't work for our ...

I'm starting to feel that we need to program what we want more directly into the spec, instead of trying to make it an unobservable optimization. That implies one of the other two strategies. (Or perhaps less-extensible versions of them, at least in the short term...)

I hope to think on this more productively next Monday, when my brain has had a chance to un-fry itself after a TC39 week. I feel like I need to take a step back and say what the high-level goals and constraints are here. (I've left a lot implicit, I think.) But the above is kind of where I'm at right now; hope it helps people understand that we're taking this seriously, at least :)
You don't have to check dest.getWriter, right? You have to check something about the object it returns, as you note. I assume getWriter is only called once, up front. But yeah, my point was that we have the same problem with the writer. :(
Well, you could do either, but yeah, both fail.
It seems we can avoid most of the issues if we just make it so you can postMessage() the stream, but not the reader. When you postMessage() the stream, it locks the stream and transfers the underlying source. This doesn't give third-party JS any surface area to override, as far as I can tell.
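Under this proposal, the observable behavior would presumably be something like the following sketch:

```js
const rs = new ReadableStream({ /* some underlying source */ });
worker.postMessage(rs, [rs]);

// After the transfer, the sender's stream is locked, so there is no
// reader surface left for third-party JS to reach:
console.log(rs.locked); // true
rs.getReader();         // throws TypeError: the stream is already locked
```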
Right, that does address the OP, although we did get into the question of how to make pipeTo unobservable (independent of postMessage).
Maybe we should first discuss the overwrite detection issue at #321.
This issue is really old now, but what happened to the alternative from #97 (comment) that just uses an explicit 'socketpair' connecting to the other page? Issue #244 never had any discussion about it, and it sidesteps any transfer-of-state issues.
I've thought about special-casing double-transfer a bit more and now understand why it's hard. The problem is setting up the initial state of the transferred stream. A normal transfer involves setting up a special kind of identity transform that communicates over a message pipe. We pipe the original stream into the local end, and the remote end becomes the transferred stream. There are two key features of this approach: ...
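A user-land sketch of that mechanism, with MessageChannel standing in for the spec's message pipe (all names are illustrative, and backpressure and error details are simplified):

```js
function transferReadable(readable, target) {
  const { port1, port2 } = new MessageChannel();
  // Local end: pipe the original stream into a writable that forwards
  // every chunk over the pipe.
  readable.pipeTo(new WritableStream({
    write(chunk) { port1.postMessage({ type: 'chunk', chunk }); },
    close() { port1.postMessage({ type: 'close' }); },
    abort(reason) { port1.postMessage({ type: 'error', reason: `${reason}` }); },
  })).catch(() => {});
  // Remote end: the port travels with the message; the receiver wraps
  // it in a new ReadableStream (see receiveReadable below).
  target.postMessage({ port: port2 }, { transfer: [port2] });
}

function receiveReadable(port) {
  return new ReadableStream({
    start(controller) {
      port.onmessage = ({ data }) => {
        if (data.type === 'chunk') controller.enqueue(data.chunk);
        else if (data.type === 'close') controller.close();
        else controller.error(data.reason);
      };
    },
  });
}
```

The real algorithm also sends messages in the other direction so that backpressure propagates, which is part of the state that makes double-transfer awkward.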
Now in the double-transfer case we'd like to reuse the message pipe embedded in the transferred stream, transfer it again, and use it to create the doubly-transferred stream in the target realm. That works fine if the transferred stream happens to be in its default state, but if it's in some other state, that state won't be correctly reflected in the doubly-transferred stream. So this second transfer involves transferring not just a message pipe, but the complete current state of the stream. So we have two choices: ...
Neither of these options is particularly attractive. I would prefer to ship the first version of transferable streams without special-casing double-transfer at all, but if we are going to change the whole approach then we should probably decide that now.
Make readable, writable, and transform streams transferable via postMessage(stream, [stream]). The streams themselves must be transferred, but the chunks written to or read from the streams are cloned, not transferred. Support for transferring chunks will require API changes and is expected to be added in a future update. There is no reference implementation of this functionality, as jsdom does not support transferable objects, and so it wouldn't be testable. Closes whatwg#276.
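A minimal usage sketch of what that enables (the worker script and chunk contents are illustrative):

```js
// main page: transfer a ReadableStream to a worker; the stream object
// is transferred, while chunks read from it are cloned
const rs = new ReadableStream({
  start(c) { c.enqueue('hello'); c.close(); }
});
const worker = new Worker('worker.js');
worker.postMessage(rs, [rs]);

// worker.js: read from the transferred stream
self.onmessage = async ({ data: stream }) => {
  const reader = stream.getReader();
  const { value } = await reader.read();
  console.log(value); // "hello" (a clone of the original chunk)
};
```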
Could we build something on top of the current draft, by reestablishing the state with a special message whose contents are [[state]], [[storedError]], and maybe [[disturbed]]? I guess it gets a decent bit uglier if we also have to do [[queue]] and [[queueTotalSize]], but I think that'd be all of it...
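Something like this, perhaps (purely illustrative; the variable names stand in for internal slots that have no real JS accessors):

```js
// Hypothetical: before detaching, snapshot the internal slots and send
// them as one special message for the receiving realm to replay.
// state, storedError, and disturbed stand in for [[state]],
// [[storedError]], and [[disturbed]].
port.postMessage({ type: 'state', state, storedError, disturbed });
```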
We don't have the other end of the message port. It's still in the original realm. So our only communication channel is the |dataHolder|. Then add in the fact that there may be a message in transit from the original realm that's trying to change our state at the same time we're trying to clone it, and the whole thing becomes very painful.
Sorry, I'm pretty behind on the discussion. Please allow me to ask some stupid questions to try and get caught up... From what I understand, we're considering a situation like this:

```js
// in window
const s = new ReadableStream({ start(c) { c.error(); } });
frames[0].postMessage(s, [s]);

// in frames[0]
worker.postMessage(s, [s]);
```

I guess what I'm envisioning is that in ... (I was initially thinking that the transfer-receiving steps would dispatch a synthetic message event on the MessagePort that was something like ...)
I feel like this should all be handled by HTML's (and implementations') existing infrastructure. As long as we reestablish the state that was last seen before any such state-changing messages arrive, everything would be consistent, and HTML gives us the ordering guarantees needed to make that possible.
I think you might be on to something. Something like:

ReadableStream transfer steps, given value and dataHolder: ...

ReadableStream transfer-receiving steps, given dataHolder and value: ...

We only ever construct and pipe to a ... Of course, in order for the next realm to know how to continue, we have to tell it where we left off. That's why we save and restore the state of the internal slots before and after transferring. 🙂
I thought this wouldn't work because both ...
The really difficult case is WritableStream, because the queue can be non-empty, and you shouldn't clone the chunks if you're going to pipe them.
Hmm, okay, this might be a bit trickier than I thought. 😅 If we're going to transfer the port that we were using for the cross-realm transform readable/writable, we have to be absolutely sure that it is no longer needed inside the current realm. I can see this being a problem with the current writeAlgorithm for ...

Also, the mere fact that we were waiting for backpressure is actually part of the stream's state, so we need to somehow transfer that as well. We could keep a boolean flag to indicate whether there is backpressure or not, send that flag when transferring the stream, and use that when we transfer-receive the stream to initialize backpressurePromise to either a pending or a resolved promise.

I still think we can do this with the current "elegant" approach. We just have to be very careful to re-construct the cross-realm transform readable/writable in exactly the same state as it was in the realm from which we transferred it.
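A sketch of the boolean-flag option (hypothetical names; dataHolder as in the transfer steps above):

```js
// While the cross-realm transform writable is alive, track backpressure
// in a plain flag alongside the promise:
let backpressure = true; // until the first 'pull' message arrives
port.onmessage = ({ data }) => {
  if (data.type === 'pull') backpressure = false;
};

// Transfer steps would serialize the flag...
dataHolder.backpressure = backpressure;

// ...and transfer-receiving steps would rebuild backpressurePromise from it:
const backpressurePromise = dataHolder.backpressure
  ? new Promise(resolve => { /* resolved by the next 'pull' message */ })
  : Promise.resolve();
```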
Yes, I see what you mean. I think implementations are a lot more complex than the standard's platonic ideal, but they must be providing the same ordering guarantees for interoperability.
I didn't think of that. It seems doable, but ugly.
Yeah, it only occurred to me while I was looking at how we use the message port inside the cross-realm transform streams. I'm gonna have to think about this some more; perhaps there is more "hidden state" in there somewhere.
It occurs to me that it might be a bit strange that for author-created duplex streams, the syntax is

```js
destination.postMessage(duplex, { transfer: [duplex.readable, duplex.writable] });
```

whereas for transform streams it is

```js
destination.postMessage(transform, { transfer: [transform] });
```

I can't think of any nice solution for this, though. Unfortunately, the structured-serialize algorithm will throw for any platform object not explicitly marked as serializable (like a ...).
So #1053 is nearing completion. The remaining open issue, which I think we could use community and implementer input on, is whether to support double-transfers (e.g. window -> window -> worker) now or later.

I'm a bit hesitant about putting double-transfers off until later, given that we had developers try this out behind a flag and become sad about them not working. (See upthread.) If we do put them off, I'd like us to add an explicit transfer-time error to avoid the confusing developer experience. E.g. each transferred stream gets a bit saying "was created by transferring" that prevents further transfers by throwing a "NotImplemented" DOMException. EDIT: see the next couple of comments; the double-transfer issue is a bit more subtle than this comment alone indicates.

However, it also sounds like there are developers (who haven't found their way to this thread, but have shown up in e.g. the origin trial feedback) who would benefit from this feature with only single-transfers supported.

My impression is that recent discussions have started to discover how double-transfers could be supported, without too much effort. And it sounds like supporting double-transfers would not cause any backward-incompatible changes. But the actual mechanics are still up in the air and uncertain, and we'd need a good amount of time for formalizing them, testing them, and implementing them to work out the kinks caused by the increased complexity.

Given this, I'd recommend that we merge #1053 and then work on double-transfers in the Chromium implementation and spec ASAP afterward. We could open a new issue distilling the various discussions that have happened here so far.

If folks disagree, either web developers planning to use this feature or implementers with concerns about merging #1053, I'd love to hear about it.
I'm not in favour of making double-transfers fail. They're useful in scenarios where you don't tear down the middle context. For example, I use them in the tests 😄

Synchronously transferring the state of a WritableStream is looking increasingly difficult to me. For example:

```js
writer.write(chunk);
writer.releaseLock();
worker.postMessage(writable, [writable]);
```

PipeTo doesn't have a problem with this because it operates asynchronously.
I hadn't fully appreciated the distinction between "double transfers are broken" and "double transfers are broken if you tear down the middle context". I see now that we are in the latter situation, not the former. To properly warn developers in that case we'd need to do something complicated, like, on middle-context teardown, erroring all streams that are currently being passed through the context. Do you think that'd be reasonably implementable? I can think of how to spec it, but the spec isn't concerned with efficiency.

In spec-land (and implementation-land?) you can synchronously observe the state of a promise. It's something we try to avoid, but this seems like a special case.
I agree that it would be the best behaviour, but I think it falls under the category of "things we could implement, but probably wouldn't". It would involve messing around in the internals of MessagePort, which is unlikely to be pleasant and quite likely to break other things. I can imagine other browsers might have implemented MessagePort in a way that makes it impossible to efficiently observe disentanglement. In practice people will usually send data through their streams, so maybe they'll find out soon enough?
Okay, the following is more of a brain-dump, sorry in advance. 😛
It gets even worse. You can hold on to a promise returned by write():

```js
let writer = writable.getWriter();
let promise = writer.write(chunk);
writer.releaseLock();
worker.postMessage(writable, [writable]);
await promise; // can happen immediately, or much later
```

However, after we've transferred the writable to another context, there's no way for us to get the result of any of our queued writes. So... will we need to resolve any pending write() promises ...?
We need to maintain enough bookkeeping as part of creating a cross-realm transform readable/writable, such that we can transfer that bookkeeping to another realm and have it re-create the cross-realm transform readable/writable in exactly the same state. This state needs to include the state of the backpressurePromise. We could manually keep track of it with a boolean backpressure flag as part of the cross-realm transform writable, or we could just synchronously observe the promise's state as Domenic suggested. It's a matter of preference.

One tricky bit is the chunk argument inside writeAlgorithm. If we start writing the chunk but we're waiting for backpressure to be relieved, then we have not yet sent a ... I guess this does have an edge case: we might have just sent the ...

One problem with transferring the queue in its entirety, though, is that it is closely tied to writable.[[writeRequests]]. Each entry in [[queue]] must correspond to an entry in [[writeRequests]] (or to the [[inFlightWriteRequest]]). If we add a bunch of chunks to the transferred stream's queue, we could "fix" things by adding an equal number of "dummy" promises to [[writeRequests]]. That way, if the user calls ...

...I don't know if I'm explaining this properly in prose. Maybe I should just make a draft PR to try out a few ideas? 😅
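For concreteness, the dummy-promise idea might look roughly like this (a user-land mock with hypothetical names, not spec text):

```js
// When rebuilding the transferred stream's queue in the receiving realm,
// pair every restored chunk with a dummy write request, preserving the
// invariant that [[queue]] and [[writeRequests]] stay in sync.
function restoreQueue(stream, transferredChunks) {
  for (const { chunk, size } of transferredChunks) {
    stream.queue.push({ chunk, size }); // [[queue]]
    stream.queueTotalSize += size;      // [[queueTotalSize]]
    // Nobody in this realm holds the original write() promise, so this
    // entry exists only to keep the bookkeeping consistent.
    stream.writeRequests.push(new Promise(() => {}));
  }
}
```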
I've closed this issue since transferable streams have landed. I've created a new issue for discussion of the double-transfer problem: #1063.
Are there plans to support transferring streams between workers or iframes using postMessage?
This question came up in https://bugzilla.mozilla.org/show_bug.cgi?id=1128959#c2