-
Notifications
You must be signed in to change notification settings - Fork 39
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
vm: rework the exception handling #1061
vm: rework the exception handling #1061
Conversation
Summary ======= Implement a different approach to exception handling in the VM, fixing many (all?) of the issues around `raise`, `except`, and `finally`. This makes the VM's exception handling implementation the most complete out of the existing backends. Details ======= Exception and `finally` handling was very incomplete, with many non- simple cases not working (e.g., `finally` not intercepting `break`s, the current exception not being reset once handled, etc.). **New exception handling implementation:** * instead of the safepoint-based approach (`opcTry`), a separate instruction-to-exception-handler lookup table is used * each call frame uses its own lookup table; setting the lookup table is done via the new `SetEh` instruction * the table stores mappings from instruction position to the position of special-purpose exception handling (=EH) instructions * when an instruction raises, it is looked up in the table. If no associated EH instruction is found, the call instruction in the above frame is looked up, then the call instruction in the frame, etc. * if no EH instruction is found even after unwinding the call stack, the exception is treated as unhandled and reported to the supervisor * if an EH instruction is found, an internal EH thread is spawned for the raised exception and the EH instruction(s) are executed * the EH instructions describe which `except` and `finally` clauses are visited and in what order * entering an `except` or `finally` clause from an EH thread sets the current exception (the one returned by `getCurrentException`) to the thread's associated exception This affords for a large amount of flexibility with exception handling. **New `finally` implementation:** * the `Finally` and `FinallyEnd` opcodes stay, but they now work differently * each finally section is associated with a *control register* * the control register stores the information necessary for knowing what to do when the end of the section (`FinallyEnd`) is reached * how execution continues at the end of a finally section depends on how the `Finally` instruction was reached: * if reached by normal control-flow, execution continues at the instruction designated by the `Finally` instruction * if reached via an `Enter` instruction, execution continues at the instruction following the `Enter` instruction * if reached from exception handling, the EH thread is resumed * jumping from outside a finally section to within one is forbidden The `Enter` and `Leave` are two new instructions: * `Enter` is used for redirecting to finally sections and works as described above * `Leave` is used for terminating open EH threads when exiting an `except` clause or when exiting a `finally` clause through unstructured control-flow (e.g., `break`) When a `break` or `return` exits one or more `try` clauses with attached `finally` clauses, `vmgen` emits an `Enter` instruction targeting each, prior to the final jump. Similarly, `vmgen` emits an `Leave` instruction for each `finally` and `except` clause exited through unstructured control-flow. Considered alternatives ----------------------- Much of the mentioned changes could have also been implemented on-top of the safe-point mechanism. This, however, would not allow supporting clean-up-only control-flow (that is, all exception handler being skipped and only certain finally section being executed), which, while not used at the moment, could in the future become useful for implementing panics. Code generator details ---------------------- * a `SetEh` instruction is always emitted, even if a code fragment / procedure, has no instruction-to-EH mappings. This is a temporary limitation * all `IndCall` and `IndCallAsgn` instructions (i.e., calls that weren't lowered into dedicated instruction) are treated as raising * the EH instructions are emitted together with the normal bytecode; no separate code generation pass is used Future direction ---------------- The MIR (and subsequently the `CgNode` IR) is planned to use goto- based control-flow primitives, similar to the ones that the VM uses. This means that much of the semantics-related decision-making (e.g., where to insert `Leave` instructions) is going to move out of `vmgen`.
`vm.setCurrentException` always incremented the provided refs ref- counter, even if the ref was nil (which is a valid case). An `isNotNil` guard is added to prevent the ref-counter being incremented.
PR-local bug: exiting a finally section via `FinallyEnd` didn't continue stack frame unwinding, which would result in execution being aborted, even though there were exception handlers present. `opRaise` is split up into two procedures: `findEh` (find the EH code associated with an instruction) and `resumeEh` (execute the EH thread and unwind). This is so that `FinallyEnd` can use `resumeEh` to unwind correctly.
The tests succeed with the VM now.
The `SetEh` instruction patching was implemented incorrectly, resulting in incorrect operand value when the lower bound is >= 256.
It doesn't work at the moment, but it's also not entirely clear whether it actually should.
`initIntReg` changed to doing the register cleanup itself.
There's is an exception-related situation that none of the backends handle correctly: re-raising the active exception and immediately catching it within a
Handling the active exception within the My personal opinion is that re-raising outside of an proc reraise() = raise
try:
raise CatchableError.newException("")
except:
reraise() would stop working too, but I don't think that this would be a bad thing. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test case coverage expansion is especially awesome.
I have a few minor suggestions, mostly soft.
''' | ||
""" | ||
|
||
# XXX: it's questionable whether this really should work / be allowed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm thinking this through myself:
- on line 17 we decide to 'cancel the current computation', targeting the handler on line 29
- on line 20 we decide to 'retarget the cancellation handler', now on line 21
- upon leaving the line 21 cancellation handler, we decide to continue the rest of the computation
Reading through my characterization, I think the 'precise' question is, should a finally
be allowed re-parent an existing exception (cancellation)? I believe the answer is "yes, it should" because if not then, the property we'd be after is that finally
cannot raise
, which seems infeasible/not very useful. I consider this a variant of raising an entirely new exception that overrides the old one, except in this case our 'new exception' 'adopts' the data of the old one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, I'm not sure I completely follow. A finally
could be allowed to raise without having to allow re-targeting an existing exception to a handler within the finally
(e.g., by disallowing re-raise outside of except
branches.
In your model, would line 28 be executed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In your model, would line 28 be executed?
That's a great question, I want to say yes. But that would mean we entered the finally
through exceptional means, it handled the exception internally, then the treatment would have to be flipped to non-exceptional.
Which seems to defer the determination of how to leave a finally
that much more. Where each control flow branch within a finally
can change the handling... ooph.
Perhaps a static restriction is a better bet, because we can always lift the restriction later when/if we're willing to do that extra analysis and think through all the implications (extra accumulated experience in the future will also be beneficial).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Which seems to defer the determination of how to leave a
finally
that much more. Where each control flow branch within afinally
can change the handling... ooph.
Yeah, this is the reason why I'd prefer not supporting the re-targeting.
Perhaps a static restriction is a better bet, because we can always lift the restriction later when/if we're willing to do that extra analysis and think through all the implications (extra accumulated experience in the future will also be beneficial).
Okay, in that case I would make a follow up that statically disallows a re-raise outside of except
branches.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome, and thanks for working through the reasoning with me.
Co-authored-by: Saem Ghani <[email protected]>
Co-authored-by: Saem Ghani <[email protected]>
* reword the description to better highlight the uncertainty * change the test's success condition to all three steps * fix the test running forever due the signal handler being entered over and over
I've changed the test description of There's was also the problem of the test running forever (an infinite loop caused by the SIGSEGV signal handler triggering itself), which I fixed by disabling the custom signal handlers ( |
Thank you for the review, Saem. |
/merge |
Merge requested by: @saem Contents after the first section break of the PR description has been removed and preserved below:
|
Summary
Implement a different approach to exception handling in the VM, fixing
most of the issues around
raise
,except
, andfinally
. This makesthe VM's exception handling implementation the most complete out of the
existing backends.
Details
Exception and
finally
handling was very incomplete, with many non-simple cases not working (e.g.,
finally
not interceptingbreak
s,the current exception not being reset once handled, etc.).
New exception handling implementation:
opcTry
), a separateinstruction-to-exception-handler lookup table is used
is done via the new
SetEh
instructionof special-purpose exception handling (=EH) instructions
associated EH instruction is found, the call instruction in the above
frame is looked up, then the call instruction in the frame, etc.
the exception is treated as unhandled and reported to the supervisor
the raised exception and the EH instruction(s) are executed
except
andfinally
clauses arevisited and in what order
except
orfinally
clause from an EH thread sets thecurrent exception (the one returned by
getCurrentException
) to thethread's associated exception
This affords for a large amount of flexibility with exception handling.
New
finally
implementation:Finally
andFinallyEnd
opcodes stay, but they now workdifferently
what to do when the end of the section (
FinallyEnd
) is reachedhow the
Finally
instruction was reached:instruction designated by the
Finally
instructionEnter
instruction, execution continues at theinstruction following the
Enter
instructionThe
Enter
andLeave
are two new instructions:Enter
is used for redirecting to finally sections and works asdescribed above
Leave
is used for terminating open EH threads when exiting anexcept
clause or when exiting afinally
clause throughunstructured control-flow (e.g.,
break
)When a
break
orreturn
exits one or moretry
clauses withattached
finally
clauses,vmgen
emits anEnter
instructiontargeting each, prior to the final jump. Similarly,
vmgen
emits anLeave
instruction for eachfinally
andexcept
clause exitedthrough unstructured control-flow.
Considered alternatives
Much of the mentioned changes could have also been implemented on-top
of the safe-point mechanism. This, however, would not allow supporting
clean-up-only control-flow (that is, all exception handler being
skipped and only certain finally section being executed), which, while
not used at the moment, could in the future become useful for
implementing panics.
Code generator details
SetEh
instruction is always emitted, even if a code fragment /procedure has no instruction-to-EH mappings. This is a temporary
limitation
IndCall
andIndCallAsgn
instructions (i.e., calls thatweren't lowered into dedicated instruction) are treated as raising
no separate code generation pass is used
Future direction
The MIR (and subsequently the
CgNode
IR) is planned to use goto-based control-flow primitives, similar to the ones that the VM uses.
This means that much of the semantics-related decision-making (e.g.,
where to insert
Leave
instructions) is going to move out ofvmgen
.Tests
Multiple tests for things that previously didn't work with the VM are
added. In addition, the
knownIssue
marker is removed from multipleexception
tests that now succeed.