@madscientist159
Contributor

@madscientist159 madscientist159 commented Jul 8, 2025

On PowerPC platforms a valid link to the Table of Contents (TOC) is required for PLT lookups to function. This TOC pointer is stored in a dedicated register, and is used along with the stack pointer by both C prologue and PLT lookup code.

When calling swapcontext() with uc_link != NULL, a PLT lookup to setcontext(3) is attempted from within the _ctx_done context. The exiting process has usually trashed both r1 and r2 at this point[1], leading to a crash within the PLT lookup before setcontext(2) is reached to restore the linked context.

Restore r1 and r2 from the incoming context to ensure the PLT lookup to setcontext(3) succeeds. As this subsequently calls setcontext(2), which overwrites r1 and r2 from the same context a second time, this should be safe.

[1] For clarity, since the interaction between compiler and context assembly is not obvious, the text of the comment from the patch is reproduced below:

...the current stack and TOC have almost certainly been trashed by the callback function. Since the compiler "knows" (incorrectly) that the callback in _ctx_start() is noreturn, it does not reserve a stack frame in the function prologue, and does not emit an epilogue. Since the prologue always reloads the TOC based on the global entry point, corruption at return is all but guaranteed.

This is not a compiler bug; the compiler is free to use that optimization in this situation (we are literally changing context underneath the compiler here). It is simply something we need to be aware of due to the immediate PLT call before setcontext(2) is reached.

@madscientist159
Contributor Author

madscientist159 commented Jul 8, 2025

Here's a simple reproducer for those with ppc64 systems. This is largely taken from one of the Go test suites that highlighted the problem, but is straight C and nicely shows the crash.

This same code works perfectly on a stock Linux/ppc64le host.

#include <stdio.h>
#include <stdlib.h>
#include <ucontext.h>

#define STACK_SIZE      (64ull << 10)

static ucontext_t uctx_save, uctx_switch;

/* Runs on the alternate stack; returning from it triggers the uc_link switch. */
static void stackSwitchCallback(void) {
	printf("stackSwitchCallback() fired\n");
}

int main(void) {
	char *stack1 = malloc(STACK_SIZE);
	if (stack1 == NULL) {
		perror("malloc");
		exit(1);
	}

	if (getcontext(&uctx_switch) == -1) {
		perror("getcontext");
		exit(1);
	}
	uctx_switch.uc_stack.ss_sp = stack1;
	uctx_switch.uc_stack.ss_size = STACK_SIZE;
	uctx_switch.uc_link = &uctx_save;	/* resume here when the callback returns */
	makecontext(&uctx_switch, stackSwitchCallback, 0);

	/* The crash happens on the way back from the callback, not on the way in. */
	if (swapcontext(&uctx_save, &uctx_switch) == -1) {
		perror("swapcontext");
		exit(1);
	}

	free(stack1);
	return 0;
}

@madscientist159
Contributor Author

Looks like the CI is failing for some reason completely unrelated to this patch. ⚡

@bsdimp
Member

bsdimp commented Jul 9, 2025

Yeah, something weird is up with CI.

Any chance you can turn your 'reproducer' into a kyua test so we don't accidentally break this in the future?

@madscientist159 madscientist159 force-pushed the main-libc branch 2 times, most recently from a583889 to eda3aaf Compare July 9, 2025 17:00
@madscientist159
Contributor Author

@bsdimp Sure! Looks like this was never tested on any arch, should be good to go now.
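
For readers who want to see what a self-checking version of the reproducer might look like, here is a minimal sketch in plain C. It is not the kyua/ATF test that was added to the tree; the names `uc_link_roundtrip`, `check_callback`, and `check_uc_link` are hypothetical, and the only behaviour it assumes is the documented one: when the callback returns, execution resumes in the context named by `uc_link`.

```c
#include <assert.h>
#include <stdlib.h>
#include <ucontext.h>

#define CHECK_STACK_SIZE (64u << 10)

static ucontext_t check_main_ctx, check_cb_ctx;
static volatile int check_callback_runs;

/* Hypothetical callback: just records that it ran on the alternate stack. */
static void check_callback(void)
{
	check_callback_runs++;
}

/* Returns 0 on success, or a nonzero step number identifying the failure. */
static int uc_link_roundtrip(void)
{
	char *stack = malloc(CHECK_STACK_SIZE);
	if (stack == NULL)
		return 1;

	check_callback_runs = 0;
	if (getcontext(&check_cb_ctx) == -1) {
		free(stack);
		return 2;
	}
	check_cb_ctx.uc_stack.ss_sp = stack;
	check_cb_ctx.uc_stack.ss_size = CHECK_STACK_SIZE;
	check_cb_ctx.uc_link = &check_main_ctx;
	makecontext(&check_cb_ctx, check_callback, 0);

	/* On an affected system the process crashes before this call returns. */
	if (swapcontext(&check_main_ctx, &check_cb_ctx) == -1) {
		free(stack);
		return 3;
	}

	/* Back on the original stack, so the alternate stack can be released. */
	free(stack);
	return (check_callback_runs == 1) ? 0 : 4;
}

/* Test-style wrapper: aborts if the round trip did not complete cleanly. */
static void check_uc_link(void)
{
	assert(uc_link_roundtrip() == 0);
}
```

Calling `check_uc_link()` from a test body should pass silently on a working libc; on a system with the bug described in this PR, the expected symptom is a crash rather than a failed assertion.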

@chmeeedalf
Contributor

chmeeedalf commented Jul 11, 2025

I think the "correct" route would be to restore the TOC upon return from the context call. Also, r12 should be saved (for ELFv2) in __makecontext(), so it can be restored (recalculate TOC as necessary).

@madscientist159
Contributor Author

madscientist159 commented Jul 11, 2025

> I think the "correct" route would be to restore the TOC upon return from the context call. Also, r12 should be saved (for ELFv2) in __makecontext(), so it can be restored (recalculate TOC as necessary).

I had thought so too, but it doesn't work. The compiler generates a prologue on _ctx_start that blows away the required r1/r2 values before we even reach the assembly code in that function. Restoring from the context seems to be the only sane way to do this.

@chmeeedalf
Contributor

r1 isn't touched by _ctx_start(), and r2 is loaded by a standard prologue, which requires a sane r12. It looks like makecontext() is not saving off r12, so the context is bogus at the start of _ctx_start(). Something like (we're dying or switching, so all callee-saved registers are available, let's blow away r16 here):

diff --git a/lib/libc/powerpc64/gen/_ctx_start.S b/lib/libc/powerpc64/gen/_ctx_start.S
index c2f8abfd6486..bfec814417fd 100644
--- a/lib/libc/powerpc64/gen/_ctx_start.S
+++ b/lib/libc/powerpc64/gen/_ctx_start.S
@@ -37,8 +37,10 @@
        /* Load global entry point */
        mr      %r12,%r14
 #endif
+       mr      %r16,%r2
        mtlr    %r14
        blrl                    /* branch to start function */
+       mr      %r2,%r16
        mr      %r3,%r15        /* pass pointer to ucontext as argument */
        nop
        bl      CNAME(_ctx_done) /* branch to ctxt completion func */
diff --git a/lib/libc/powerpc64/gen/makecontext.c b/lib/libc/powerpc64/gen/makecontext.c
index 11ddc985e1e9..8a9618c87e87 100644
--- a/lib/libc/powerpc64/gen/makecontext.c
+++ b/lib/libc/powerpc64/gen/makecontext.c
@@ -143,7 +143,7 @@ __makecontext(ucontext_t *ucp, void (*start)(void), int argc, ...)
        va_end(ap);
 
        /*
-        * Use caller-saved regs 14/15 to hold params that _ctx_start
+        * Use callee-saved regs 14/15 to hold params that _ctx_start
         * will use to invoke the user-supplied func
         */
 #if !defined(_CALL_ELF) || _CALL_ELF == 1
@@ -151,6 +151,8 @@ __makecontext(ucontext_t *ucp, void (*start)(void), int argc, ...)
        mc->mc_srr0 = *(uintptr_t *)_ctx_start;
 #else
        mc->mc_srr0 = (uintptr_t) _ctx_start;
+       mc->mc_gpr[12] = (uintptr_t) _ctx_start;
+       mc->mc_gpr[16] = (uintptr_t) _ctx_start;
 #endif
        mc->mc_gpr[1] = (uintptr_t) sp;         /* new stack pointer */
        mc->mc_gpr[14] = (uintptr_t) start;     /* r14 <- start */

@madscientist159
Contributor Author

madscientist159 commented Jul 11, 2025

Right, but we know nothing about the previous program that was running when swapcontext() was called. We don't know if r2 is actually derived from r12, or even how it might have been derived previously. The prologue in question is literally:

addis r2,r12,24
addi r2,r2,r2

Only after that runs do we start executing inside _ctx_start from mr r12,r14 -- but now r2 has already been blown away. Sure, we can guess at what it might have been before the prologue ran (based on r12), but that seems kinda dangerous, especially when we have a known-good r2 stored in the context.

If we try to recompute r2 from r12 based on assumptions about the executing program, I can tweak my example reproducer pretty easily to cause another crash. According to the ELF v2 ABI r12 is volatile, so AFAICS all I need to do is instruct the compiler to use r12 for some other purpose right before swapcontext().

@chmeeedalf
Contributor

r12 is the entry point of the target. Since we already know the address of _ctx_start() we can save that off again just to be safe. Actually, my saving _ctx_start() into r16 in makecontext() is silly, since I'm saving off the calculated r2 into r16 in _ctx_start directly, and restoring it, so it's always safe. Now we're not touching r1, and we have our known-good (if ABI is followed...) r2 (saved off in r16). If the ABI isn't followed, then all bets are off anyway. Even Go needs to follow the ELF ABI in order to play well with others.

You mentioned in your comment that your change is kind of a hack that works, and I agree that there is a problem to solve, so I'm proposing a solution that's less of a hack, and more of an ABI certainty. A normal function call would result in r2 being restored from the stack upon return, but we don't need to do that in _ctx_start, we can just abuse the "safe" registers instead, and let the linker decide if it needs to do any fixups for r12 in the function calls.

We don't know what the target ucontext is, so it's unsafe to switch to it; the TOC might be found later on (think an asm prologue stub to start a lightweight thread), so the only thing we can be certain of is the context that we have, assuming the ABI is followed.

@madscientist159
Contributor Author

I'm definitely not saying my solution is the absolute correct way to do this, and you do have valid points. I've been thinking, and I wonder whether these issues go away if I can just coax the compiler not to emit that prologue on _ctx_start. Let me try a couple of things on this end and see what I can do.

@chmeeedalf
Contributor

The prologue is generated by the ENTRY() macro, defined in <machine/asm.h>, but I think we do want to generate the prologue, and provide a sane TOC pointer. Keep in mind that when it calls into _ctx_done(), it's an intra-library call, so it bypasses the r2 setup prologue. By generating the right thing in _ctx_start(), everything should be happy.

@madscientist159
Contributor Author

madscientist159 commented Jul 11, 2025

@chmeeedalf While it wasn't quite as simple as the proposed patch, the basic concepts appear to have worked. Thanks for the feedback, much appreciated. Does this look better?

@chmeeedalf
Contributor

chmeeedalf commented Jul 11, 2025

mc_gpr[2] is never set up in makecontext() (mc_gpr[3 + N] are, as is 1, but others are not). If you set mc->mc_gpr[12] to _ctx_start, then save r2 after the calculation, I think we're good. Creating the stack frame as you did is a good idea, too, for backtracing.

@madscientist159
Contributor Author

madscientist159 commented Jul 11, 2025

@chmeeedalf Correct me if I'm wrong, but isn't r2 initialized (along with all the other registers) from the currently executing context by the kernel in the earlier getcontext() syscall? After that call, the value in r2 is exactly what we need to restore prior to _ctx_done; i.e. what is in r2 prior to _ctx_start prologue execution.

If I try to use the TOC subsequently calculated in the _ctx_start prologue it just crashes.

@chmeeedalf
Contributor

Ah, I'm wrong, because mc_gpr[2] would've been set up by getcontext(), so it is valid, or valid for libsys, that is. However, %r2 is calculated in _ctx_start() from %r12, so we should make sure that's valid, otherwise it's garbage for any libc calls made from _ctx_start. So I think a hybrid approach between our patches is correct (make sure %r12 is valid going in, set up the stack frame, and push our now-valid %r2 to it, pop it off after the call)

@madscientist159
Contributor Author

@chmeeedalf Now this is all starting to make more sense. Two bugs in the same general code combining to make a bigger mess. Let me crank on it a bit and update the patch set. 👍

@chmeeedalf
Contributor

%r2 is calculated from %r12, which is supposed to point to the function entry point in ELFv2. When making intra-library calls you can skip over that, because %r2 is valid for the entire library, but when passing function pointers around, the function pointer points to that prologue, so it's expecting %r12 to be that entry point. Garbage in, garbage out (mc->mc_gpr[12] is likely pointing to getcontext() at this point, so we end up with a bogus TOC pointer after calculating it in _ctx_start()'s prologue).
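
The arithmetic behind "garbage in, garbage out" can be modelled in a few lines of C. This is only a toy model of an ELFv2 global-entry prologue of the form `addis r2,r12,ha` followed by `addi r2,r2,lo`; the function names, the addresses, and the low immediate `0x100` are made up for illustration and are not taken from libc.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Toy model of the global-entry prologue:
 *     addis r2,r12,ha    ; r2 = r12 + (ha << 16)
 *     addi  r2,r2,lo     ; r2 = r2 + lo   (lo is sign-extended)
 * Unsigned arithmetic is used so negative immediates wrap as the CPU would.
 */
static uint64_t prologue_r2(uint64_t r12, int16_t ha, int16_t lo)
{
	return r12 + (uint64_t)((int64_t)ha * 65536) + (uint64_t)(int64_t)lo;
}

/* Illustration: the same prologue fed a stale r12 yields a different "TOC". */
static void stale_r12_demo(void)
{
	/* Hypothetical addresses: the real entry point and some other function. */
	uint64_t good = prologue_r2(0x10000000u, 24, 0x100);
	uint64_t bad = prologue_r2(0x10004000u, 24, 0x100);

	assert(good == 0x10180100u);
	assert(bad != good);
}
```

The immediates are fixed at link time relative to the function's own entry point, so the result is only meaningful when r12 really holds that entry point. Any other value in r12 shifts the computed r2 by the same amount.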

@madscientist159
Contributor Author

@chmeeedalf Indeed! I'm slowly wrapping my head around this obscure part of the C library...

Hopefully this looks good. It works fine in testing on my end.

@chmeeedalf
Contributor

Overall looks good, thanks for humoring me on this. One minor nit: Don't restore %r1 before calling _ctx_done, just leave the frame as-is (it's _ctx_start()'s frame, after all). Aside from that, I approve.

@madscientist159
Contributor Author

@chmeeedalf No worries at all, thanks for the thorough review! I think we now both know this section of the code far better than before.

Since we're keeping our stack frame around, I also added the remainder of the ABI required save fields. Technically not needed, but also technically correct. 😉

@chmeeedalf
Contributor

You might hate me for this, but I don't think we can create a new stack frame for _ctx_start(). makecontext() has to create the full stack frame, and _ctx_start() should just use it, in order to allow for overflow arguments. I've just been reading through the makecontext() source, and I think it's broken with regard to how it sets up the stack frame: the 'sizeof(uintptr_t)*(stackargs + 2)' should probably be 'stackargs + 6' instead, in order to maintain alignment for both ELFv1 and ELFv2 (see down below in the 'if (argc > 8)' block). If we size it right, then we just populate the frame as we normally would if _ctx_start() were making the actual call.

So for ELFv2 we need space, allocated in makecontext(), for:

  • chain pointer
  • CR
  • TOC
  • LR

I don't think we're caring about ELFv1 anymore, so we can let it rot for now. If anyone is interested they can do the work to fix that side :)

Does the above make sense? We're ~95% there fixing this for all use cases.
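
As a rough illustration of the sizing argument, the sketch below computes a frame size from the four ELFv2 header doublewords listed above plus the overflow arguments, rounded up to the 16-byte stack alignment the ABI requires. It is not the code that was committed; `ctx_frame_bytes` and the enum names are hypothetical, and it rounds up explicitly where the thread settles on the simpler worst-case `stackargs + 6`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum {
	ELFV2_HEADER_DWORDS = 4,	/* back chain, CR save, LR save, TOC save */
	STACK_ALIGN = 16		/* the stack pointer stays 16-byte aligned */
};

/* Bytes needed for the frame header plus `stackargs` overflow doublewords. */
static size_t ctx_frame_bytes(int stackargs)
{
	assert(stackargs >= 0);

	size_t raw = sizeof(uint64_t) * ((size_t)stackargs + ELFV2_HEADER_DWORDS);

	/* Round up to the next multiple of STACK_ALIGN. */
	return (raw + STACK_ALIGN - 1) & ~(size_t)(STACK_ALIGN - 1);
}
```

With no overflow arguments this gives the 32-byte minimum header; one or two overflow doublewords both land on 48 bytes because of the rounding.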

On PowerPC platforms a valid link to the Table of Contents (TOC) is
required for PLT lookups to function.  This TOC pointer is stored in
a dedicated register, and is used along with the stack pointer by both
C prologue and PLT lookup code.

When calling swapcontext() with uc_link != NULL, a PLT lookup to
setcontext(3) is attempted from within the _ctx_done context.  The
exiting process has usually trashed both r1 and r2 at this point,
leading to a crash within the PLT lookup before setcontext(2) is
reached to restore the linked context.

Save and restore r1 and r2, using r16 as a scratch register to bypass
the prologue trampling r2.  This ensures the subsequent PLT lookup to
setcontext(3) succeeds.

Signed-off-by: Timothy Pearson <[email protected]>
@madscientist159
Contributor Author

@chmeeedalf Hey, I'd rather get it right than push something that causes problems down the line...

It does make sense, and no, I don't care about ELF v1. If we don't have v1 users we should probably start looking at dropping the code, honestly, but that's for another time.

I'll buy the stackargs + 6 argument (+4, then another +2 for worst-case alignment), seems sane enough. With the existing code we ended up with this little mini-frame on the root of the stack, which was definitely in violation of the v2 ABI, and this explains why I needed to add another frame to store r2.

@madscientist159
Contributor Author

Looks like we managed to confuse the style checker -- it thinks the pointer cast is a multiplication. 🤣

@chmeeedalf
Contributor

Looks good to me, thanks! I'll push it soon (tonight, probably).

@chmeeedalf
Contributor

And now this, and PR #1756, are pushed. Thanks for your work!

freebsd-git pushed a commit that referenced this pull request Jul 13, 2025
On PowerPC platforms a valid link to the Table of Contents (TOC) is
required for PLT lookups to function.  This TOC pointer is stored in
a dedicated register, and is used along with the stack pointer by both
C prologue and PLT lookup code.

When calling swapcontext() with uc_link != NULL, a PLT lookup to
setcontext(3) is attempted from within the _ctx_done context.  The
exiting process has usually trashed both r1 and r2 at this point,
leading to a crash within the PLT lookup before setcontext(2) is
reached to restore the linked context.

Save and restore r2 as in a regular function.  This ensures the
subsequent PLT lookup to setcontext(3) succeeds.

Signed-off-by: Timothy Pearson <[email protected]>

MFC after:	1 week
Pull Request:	#1759
@bsdimp
Member

bsdimp commented Jul 13, 2025

Landed. Thanks

@bsdimp bsdimp closed this Jul 13, 2025
@bsdimp bsdimp added the "merged" label (closed commit that's been merged) Jul 13, 2025
freebsd-git pushed a commit that referenced this pull request Jul 21, 2025

(cherry picked from commit 8efa35f)