-
Notifications
You must be signed in to change notification settings - Fork 1.5k
CFI improvements to the AArch64 fiber implementation #4195
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems reasonable to me!
To confirm, have you had the chance to test this on macOS arm hardware as well? If not I'm sure @cfallin would be able to test this out.
I can confirm that |
Thanks for testing my changes, @cfallin! Otherwise my main testing platform is an AWS Graviton3 instance that supports pointer authentication natively and that runs Ubuntu 20.04. |
Hm CI timed out here for qemu emulation of AArch64 which we haven't seen before AFAIK, so that may actually be related to the changes here as well? |
Now the fiber implementation on AArch64 authenticates function return addresses and includes the relevant BTI instructions, except on macOS. Also, change the locations of the saved FP and LR registers on the fiber stack to make them compliant with the Procedure Call Standard for the Arm 64-bit Architecture. Copyright (c) 2022, Arm Limited.
@alexcrichton I have tried to elaborate on the most important technical details in the comments - please, let me know if you feel that there is anything else that warrants further explanation. Also, now this PR fixes #4226 and, finally, I have added a fix for a minor issue that I have noticed previously - I have changed the locations of the saved FP and LR registers on the fiber stack to improve compliance with the procedure call standard. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks again for this!
This resolves two issues from recent changes in bytecodealliance#6649: * First the s390x calling convention for wasm functions is changed back to `WasmtimeSystemV` from `Fast`. This was an accidental omission from bytecodealliance#6649 where the conclusion was that s390x will continue using a calling convention with little-endian lane order for lane arguments. The only calling convention that supports this today is `WasmtimeSystemV`, although the `Tail` calling convention will likely use it in the future as well. * Second the apple-aarch64 platform now uses the `Fast` calling convention instead of `AppleAarch64` calling convention. That convention was specified in bytecodealliance#4195 but local testing shows that is not necessary in the sense that tests all pass with the `Fast` calling convention. This means that the prior comment why the `AppleAarch64` calling convention is required is no longer accurate and in the absence of a reason not to I went ahead and switched it to `Fast`. In the near future all wasm functions will unconditionally use the `Tail` calling convention and at that time the lane order can be specified on s390x to be little-endian to satisfy all the constraints here. Additionally any unwinding directives, if necessary for aarch64, can be specified as needed.
This resolves two issues from recent changes in bytecodealliance#6649: * First the s390x calling convention for wasm functions is changed back to `WasmtimeSystemV` from `Fast`. This was an accidental omission from bytecodealliance#6649 where the conclusion was that s390x will continue using a calling convention with little-endian lane order for lane arguments. The only calling convention that supports this today is `WasmtimeSystemV`, although the `Tail` calling convention will likely use it in the future as well. * Second the apple-aarch64 platform now uses the `Fast` calling convention instead of `AppleAarch64` calling convention. That convention was specified in bytecodealliance#4195 but local testing shows that is not necessary in the sense that tests all pass with the `Fast` calling convention. This means that the prior comment why the `AppleAarch64` calling convention is required is no longer accurate and in the absence of a reason not to I went ahead and switched it to `Fast`. In the near future all wasm functions will unconditionally use the `Tail` calling convention and at that time the lane order can be specified on s390x to be little-endian to satisfy all the constraints here. Additionally any unwinding directives, if necessary for aarch64, can be specified as needed.
* Update calling conventions for wasm functions slightly This resolves two issues from recent changes in #6649: * First the s390x calling convention for wasm functions is changed back to `WasmtimeSystemV` from `Fast`. This was an accidental omission from #6649 where the conclusion was that s390x will continue using a calling convention with little-endian lane order for lane arguments. The only calling convention that supports this today is `WasmtimeSystemV`, although the `Tail` calling convention will likely use it in the future as well. * Second the apple-aarch64 platform now uses the `Fast` calling convention instead of `AppleAarch64` calling convention. That convention was specified in #4195 but local testing shows that is not necessary in the sense that tests all pass with the `Fast` calling convention. This means that the prior comment why the `AppleAarch64` calling convention is required is no longer accurate and in the absence of a reason not to I went ahead and switched it to `Fast`. In the near future all wasm functions will unconditionally use the `Tail` calling convention and at that time the lane order can be specified on s390x to be little-endian to satisfy all the constraints here. Additionally any unwinding directives, if necessary for aarch64, can be specified as needed. * Fix compile
Now the fiber implementation on AArch64 authenticates function return addresses and includes the relevant BTI instructions, except on macOS, following the guidelines presented in the bytecodealliance/rfcs#17 RFC proposal.
This PR has been spun off from #3606 and fixes #3183.