Ipc failure while streaming #346
c9f2d96 to 2aed7dd
Lol.. ok so it's 5dc07f0. No idea why yet, but I'm dropping that commit from this history and gonna rebase all dependents.
2aed7dd to f7cc993
Lul, and turns out ba6a2b1 causes the [...]. So gonna drop that one for now as well..
f7cc993 to 6d34893
6895172 to b722238
  ):
  # if we can't propagate the error that's a big boo boo
- log.error(
+ log.exception(
This is important for showing which traceback could not be propagated to the far-end actor's caller task.
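To illustrate the difference this change makes, here is a minimal stdlib-only sketch (not tractor's code; the logger name and the failing function are hypothetical). `log.exception()` logs at ERROR level and appends the active traceback, whereas `log.error()` on its own records only the message.

```python
import io
import logging

# Hypothetical logger wired to an in-memory buffer so the output can be inspected.
buf = io.StringIO()
log = logging.getLogger("demo.propagation")
log.setLevel(logging.DEBUG)
log.propagate = False
log.addHandler(logging.StreamHandler(buf))


def fail_to_propagate() -> None:
    # Stand-in for an error that could not be shipped to the far end.
    raise RuntimeError("remote task failed")


try:
    fail_to_propagate()
except RuntimeError:
    # Message only: the traceback is lost.
    log.error("Failed to ship error to caller (no traceback)")
    # Message plus the traceback of the exception being handled.
    log.exception("Failed to ship error to caller (with traceback)")

output = buf.getvalue()
```

Only the second call leaves a traceback in `output`, which is the information you want when the error never reaches the caller.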
  log.warning(text)
- await send_chan.send(msg)
+ try:
+     await send_chan.send(msg)
When using backpressure on a stream we need to be sure we don't crash when the local feed memchan has been broken.
Need a test for this, though I can't remember whether this was the original issue that caused the ipc_failure_during_stream.py example to hang forever before?
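A rough sketch of the guarded send described here, using only the stdlib so it runs without trio. Everything in it is a stand-in: `FeedSendChannel`, `BrokenResourceError` and `deliver()` are hypothetical names, and the real code awaits a memory channel inside tractor rather than this toy class.

```python
import asyncio
import logging

log = logging.getLogger("demo.feed")


class BrokenResourceError(Exception):
    """Stand-in for the error a memory channel raises once its receiver is gone."""


class FeedSendChannel:
    """Hypothetical feeder channel; `broken` simulates a terminated consumer task."""

    def __init__(self) -> None:
        self.broken = False
        self.delivered = []

    async def send(self, msg: dict) -> None:
        if self.broken:
            raise BrokenResourceError("receiver side already closed")
        self.delivered.append(msg)


async def deliver(send_chan: FeedSendChannel, msg: dict) -> bool:
    """Push `msg` with backpressure; warn instead of crashing on a broken feed."""
    try:
        await send_chan.send(msg)
        return True
    except BrokenResourceError:
        log.warning("Feed channel broken, dropping msg: %r", msg)
        return False


async def main() -> list:
    chan = FeedSendChannel()
    results = [await deliver(chan, {"yield": 1})]
    chan.broken = True  # the consumer task terminated mid-stream
    results.append(await deliver(chan, {"yield": 2}))
    return results


results = asyncio.run(main())
```

The second delivery is dropped with a warning rather than raising, which mirrors the behaviour already used for the non-backpressure path.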
tractor/_streaming.py
Outdated
  # far end.
- await self.send_stop()
+ try:
+     await self.send_stop()
Also need a test for this:
- one end of a stream is closed
- the other end tries to close before any other stream methods (`.receive()`/`.send()`) are called (thus detecting the closure before trying to send a stop msg)
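The scenario in the list above could be exercised roughly like this. It is a stdlib-only sketch with a hypothetical `DemoStream` class and a stand-in `ClosedResourceError`; it is not tractor's stream type, and a real test would drive actual actors over IPC.

```python
import asyncio


class ClosedResourceError(Exception):
    """Stand-in for the error raised when the transport is already closed."""


class DemoStream:
    """Hypothetical stream used only to illustrate the close path."""

    def __init__(self, far_end_closed: bool) -> None:
        self.far_end_closed = far_end_closed
        self.closed = False
        self.stop_sent = False

    async def send_stop(self) -> None:
        if self.far_end_closed:
            raise ClosedResourceError("far end already closed")
        self.stop_sent = True

    async def aclose(self) -> None:
        if self.closed:
            return
        try:
            await self.send_stop()
        except ClosedResourceError:
            # The far end is gone, so there is nobody left to notify.
            pass
        finally:
            self.closed = True


async def main():
    healthy = DemoStream(far_end_closed=False)
    orphaned = DemoStream(far_end_closed=True)
    # Close both without ever calling receive()/send() first.
    await healthy.aclose()
    await orphaned.aclose()
    return healthy, orphaned


healthy, orphaned = asyncio.run(main())
```

Both streams end up closed; only the healthy one manages to send its stop message, and the orphaned one does not raise.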
finally:
    # NOTE: this is ABSOLUTELY REQUIRED to avoid
    # the following wacky bug:
    # <tractorbugurlhere>
I don't think I can remember the case that caused this unfortunately 😂 though pretty sure it was in piker.
Unfortunately this may just go in without a false positive case ..
When backpressure is used and a feeder mem chan breaks during msg delivery (usually because the IPC allocating task already terminated) instead of raising we simply warn as we do for the non-backpressure case. Also, add a proper `Actor.is_arbiter` test inside `._invoke()` to avoid doing an arbiter-registry lookup if the current actor **is** the registrar.
Use a task nursery in the subactor to spawn tasks which cancel the IPC channel mid stream to simulate the most concurrent case we're likely to see. Make `main()` accept a `debug_mode: bool` for parametrization. Fill out detailed comments/docs on this example.
With the new fancy `_pytest.pathlib.import_path()` we can do real parametrization of the example-script-module code and thus configure whether the child, parent, or both silently break the IPC connection. Parametrize the test for all the above mentioned cases as well as the case where the IPC never breaks but we still simulate the user hammering ctl-c / SIGINT to terminate the actor tree. Adjust expected errors based on each case and heavily document each of these.
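A minimal sketch of the idea of loading an example script as a module and calling its `main()` with different break configurations. Note the assumptions: it swaps in the stdlib `importlib` loader instead of `_pytest.pathlib.import_path()`, the script body is a toy stand-in that spawns no actors, and the parameter names `break_parent_ipc` / `break_child_ipc` are hypothetical.

```python
import importlib.util
import tempfile
from pathlib import Path

# Toy stand-in for the example script; the real one spawns an actor tree.
EXAMPLE_SRC = '''
def main(break_parent_ipc: bool, break_child_ipc: bool, debug_mode: bool = False) -> str:
    if break_parent_ipc and break_child_ipc:
        return "both"
    if break_parent_ipc:
        return "parent"
    if break_child_ipc:
        return "child"
    return "none"
'''


def load_module(path: Path, name: str = "example_mod"):
    """Import a script from a file path so its functions can be called directly."""
    spec = importlib.util.spec_from_file_location(name, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


# (break_parent_ipc, break_child_ipc) pairs covering every combination.
CASES = [
    (False, False),
    (True, False),
    (False, True),
    (True, True),
]

with tempfile.TemporaryDirectory() as tmp:
    script = Path(tmp) / "ipc_failure_demo.py"
    script.write_text(EXAMPLE_SRC)
    mod = load_module(script)
    outcomes = [mod.main(parent, child) for parent, child in CASES]
```

In a pytest suite the `CASES` list would typically feed `pytest.mark.parametrize`, with the expected error adjusted per case.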
We weren't doing this originally, I *think* just because of the path-dependent nature of the way the code was developed (originally being mega pedantic about one-way vs. bidirectional streams), but it doesn't seem like there's any issue with just calling the stream's `.aclose()`; it also has the benefit of being less code and fewer logic checks B)
710dee0 to 13c9ead
Variety of streaming related teardown fixes for when IPC goes down before streams are terminated gracefully. Mostly discovered during dev of initial draft of pikers/piker#420.
TODO:
- `OSError: [Errno 9] Bad file descriptor` in cluster test #345 => dropping 5dc07f0 did it