-
Notifications
You must be signed in to change notification settings - Fork 3.1k
powerpc: Fix multiple issues with FP/VSX save/restore #1756
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Thank you for taking the time to contribute to FreeBSD! |
According to the ELF ABI v2, two scratch regions are reserved below the stack pointer, one 288 byte general region and one 224 byte compiler region. FreeBSD only reserved the 288 byte region. Follow the ELV v2 ABI and reserve the full 512 byte region as specified. Signed-off-by: Timothy Pearson <[email protected]>
Multiple issues existed within the powerpc FP/VSX save/restore functionality, leading to register corruption and loss of register contents in specific scenarios involving high signal load and use of both floating point and VSX instructions. Issue freebsd#1 On little endian systems the PCB used the wrong location for the shadowed FP register within the larger VSX register. This appears to have been an attempt to correct issue freebsd#2 without understanding how the vector load/store instructions actually operate. Issue freebsd#2 On little endian systems, the VSX state save/restore routines swapped 32-bit words within the 64-bit aliased double word for the associated floating point register. This was due to the use of a word-oriented load/store vs. doubleword oriented load/store. Issue freebsd#3 The FPU was turned off in the PCB but not in hardware, leading to a potential race condition if the same thread was scheduled immediately after sigreturn. The triggering codebase for this is Go, which makes heavy use of signals and and generates an unusual mix of floating point and VSX assembler. As a result, when combined with th powerpc lazy FPU restore, a condition was repeatedly hit whereby the thread was interrupted in FP+VSX mode, then restored in FP only mode, thus reliably triggering the issues above. Also clean up the associated asm() style issue flagged by GitHub Actions. Signed-off-by: Timothy Pearson <[email protected]>
|
@bdragon28 @chmeeedalf Adding to CC. This wasn't easy to track down and I would like to get it merged and backported if possible, considering the risk of random data corruption. This probably explains the sporadic segfaults I've seen on ppc64le FreeBSD boxes for a couple of years now... Thanks! |
|
Hi, sorry for the delay, meant to get to this earlier. The changes look good to me, so I'll merge them in the next few days as I have time. |
|
@chmeeedalf No problem. As a procedural matter, is it preferred to upload here or to Phabricator? I don't have a preference either way, just looking to make things easier. There's a companion PR here #1759 that needs merging as well. Also, any chance we can get https://reviews.freebsd.org/D44274 merged? 😉 Thanks! |
|
I added a comment to the Differential on Friday listing what I think Mark is asking for, if you have any questions on it, feel free to ask me further. |
According to the ELF ABI v2, two scratch regions are reserved below the stack pointer, one 288 byte general region and one 224 byte compiler region. FreeBSD only reserved the 288 byte region. Follow the ELV v2 ABI and reserve the full 512 byte region as specified. Signed-off-by: Timothy Pearson <[email protected]> MFC after: 1 week Pull Request: #1756
Multiple issues existed within the powerpc FP/VSX save/restore functionality, leading to register corruption and loss of register contents in specific scenarios involving high signal load and use of both floating point and VSX instructions. Issue #1 On little endian systems the PCB used the wrong location for the shadowed FP register within the larger VSX register. This appears to have been an attempt to correct issue #2 without understanding how the vector load/store instructions actually operate. Issue #2 On little endian systems, the VSX state save/restore routines swapped 32-bit words within the 64-bit aliased double word for the associated floating point register. This was due to the use of a word-oriented load/store vs. doubleword oriented load/store. Issue #3 The FPU was turned off in the PCB but not in hardware, leading to a potential race condition if the same thread was scheduled immediately after sigreturn. The triggering codebase for this is Go, which makes heavy use of signals and and generates an unusual mix of floating point and VSX assembler. As a result, when combined with th powerpc lazy FPU restore, a condition was repeatedly hit whereby the thread was interrupted in FP+VSX mode, then restored in FP only mode, thus reliably triggering the issues above. Also clean up the associated asm() style issue flagged by GitHub Actions. Signed-off-by: Timothy Pearson <[email protected]> MFC after: 1 week Pull Request: #1756
Understood, I think we were hoping not to have two approvals (which could easily be two years at the current pace) in the chain. Is there any way to get a faster response if we do break it up as suggested? Just trying to allocate appropriate resources. Thanks! 😄 |
|
OpenSSL 3.5 is a blocker for FreeBSD 15, and someone is working on that. If you rebase your changes against OpenSSL 3.5 (should be able to regenerate however you did the current review), then it should be a smoother process; I imagine the diff against what you've already posted shouldn't be too large, so shouldn't be hard to review (might take 2 rounds, first as a cut to make it work, second as a regen once 3.5 is officially in the tree). Alternatively, if you give me instructions on how to regen, we can do the initial review of a regen against 3.5, and I can regen myself once 3.5 is officially in the tree, which would save you some resource effort. From what I can tell the only hangup currently is the reproducibility of the asm generation with internal OpenSSL, the actual code is fine. I'll do my best to get it marshalled in before 15.0 (we should have some time to do that, as long as everything is planned ahead of time). |
|
@chmeeedalf That sounds reasonable, appreciate the suggestion. Is there a Git tree or patchset with the OpenSSL 3.5 work in progress I can pull from? Will need to get a 15 development VM spun up, which will be a good test of the current kernel as well. Thanks again! |
According to the ELF ABI v2, two scratch regions are reserved below the stack pointer, one 288 byte general region and one 224 byte compiler region. FreeBSD only reserved the 288 byte region. Follow the ELV v2 ABI and reserve the full 512 byte region as specified. Signed-off-by: Timothy Pearson <[email protected]> MFC after: 1 week Pull Request: #1756 (cherry picked from commit 645bb3e)
Multiple issues existed within the powerpc FP/VSX save/restore functionality, leading to register corruption and loss of register contents in specific scenarios involving high signal load and use of both floating point and VSX instructions. Issue #1 On little endian systems the PCB used the wrong location for the shadowed FP register within the larger VSX register. This appears to have been an attempt to correct issue #2 without understanding how the vector load/store instructions actually operate. Issue #2 On little endian systems, the VSX state save/restore routines swapped 32-bit words within the 64-bit aliased double word for the associated floating point register. This was due to the use of a word-oriented load/store vs. doubleword oriented load/store. Issue #3 The FPU was turned off in the PCB but not in hardware, leading to a potential race condition if the same thread was scheduled immediately after sigreturn. The triggering codebase for this is Go, which makes heavy use of signals and and generates an unusual mix of floating point and VSX assembler. As a result, when combined with th powerpc lazy FPU restore, a condition was repeatedly hit whereby the thread was interrupted in FP+VSX mode, then restored in FP only mode, thus reliably triggering the issues above. Also clean up the associated asm() style issue flagged by GitHub Actions. Signed-off-by: Timothy Pearson <[email protected]> MFC after: 1 week Pull Request: #1756 (cherry picked from commit 077e30e)
Multiple issues existed within the powerpc FP/VSX save/restore functionality, leading to register corruption and loss of register contents in specific scenarios involving high signal load and use of both floating point and VSX instructions.
Issue #1
On little endian systems the PCB used the wrong location for the shadowed FP register within the larger VSX register. This appears to have been an attempt to correct issue #2 without understanding how the vector load/store instructions actually operate.
Issue #2
On little endian systems, the VSX state save/restore routines swapped 32-bit words within the 64-bit aliased double word for the associated floating point register. This was due to the use of a word-oriented load/store vs. doubleword oriented load/store.
Issue #3
The FPU was turned off in the PCB but not in hardware, leading to a potential race condition if the same thread was scheduled immediately after sigreturn.
The triggering codebase for this is Go, which makes heavy use of signals and and generates an unusual mix of floating point and VSX assembler. As a result, when combined with th powerpc lazy FPU restore, a condition was repeatedly hit whereby the thread was interrupted in FP+VSX mode, then restored in FP only mode, thus reliably triggering the issues above.