Skip to content

Conversation

@maribu
Copy link
Member

@maribu maribu commented Nov 18, 2025

Contribution description

The previous implementation relied on thread_flag_set() to defer the context switch when called with IRQs disabled until irq_restore() is called. This however can only be the case when thread_yield_higher triggers an IRQ and performs the context switch within the ISR. This allowed the previous implementation to continue calling thread_flag_set() for the remaining group members.

This however is an implementation detail that is not part of the API contract. Platforms that do not have a service request IRQ may have to use other means for context switching that do not get deferred until an irq_restore() is called. In that case, the first call to thread_flags_set() even with IRQs disabled may directly trigger a context switch to the unblocked thread, even if other group members would also be unblocked and have a higher priority.

This changes the implementation to manually set the flags and update the thread status without yielding and keep track whether any thread has been awoken. Only once the states of all threads have been updated, the adapted implementation will now call thread_yield() (unless no thread was awoken).

Testing procedure

In master

$ make BOARD=arduino-mega2560 -C tests/core/thread_flags_group flash test
make: Entering directory '/home/[email protected]/Repos/software/RIOT/master/tests/core/thread_flags_group'
Building application "tests_thread_flags_group" for "arduino-mega2560" with CPU "atmega2560".

[...]

Help: Press s to start test, r to print it is ready
READY
s
START
main(): This is RIOT! (Version: 2026.01-devel-144-gbcd09d)
START
waiting forever-waiter
waiting waiter
waiting waiter
waking up!
woken up waiter
woken up waiter
/home/[email protected]/Repos/software/RIOT/master/tests/core/thread_flags_group/main.c:73 => failed condition
*** RIOT kernel panic:
CONDITION FAILED.

*** halted.

In this PR

$ make BOARD=arduino-mega2560 -C tests/core/thread_flags_group flash test
make: Entering directory '/home/[email protected]/Repos/software/RIOT/master/tests/core/thread_flags_group'
Building application "tests_thread_flags_group" for "arduino-mega2560" with CPU "atmega2560".

[...]

Help: Press s to start test, r to print it is ready
READY
s
START
main(): This is RIOT! (Version: 2026.01-devel-137-g8c63a-core/thread_flags_group/fix)
START
waiting forever-waiter
waiting waiter
waiting waiter
waking up!
woken up waiter
woken up waiter
SUCCESS

make: Leaving directory '/home/[email protected]/Repos/software/RIOT/master/tests/core/thread_flags_group'

Issues/PRs references

None

@maribu maribu requested a review from kaspar030 as a code owner November 18, 2025 13:45
@maribu maribu added Type: bug The issue reports a bug / The PR fixes a bug (including spelling errors) CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR Process: needs backport Integration Process: The PR is required to be backported to a release or feature branch labels Nov 18, 2025
@github-actions github-actions bot added the Area: core Area: RIOT kernel. Handle PRs marked with this with care! label Nov 18, 2025
@maribu
Copy link
Member Author

maribu commented Nov 18, 2025

@derMihai: Would you mind to also take a look?

@maribu maribu force-pushed the core/thread_flags_group/fix branch from 8c63a9b to 150db5b Compare November 18, 2025 13:48
@riot-ci
Copy link

riot-ci commented Nov 18, 2025

Murdock results

✔️ PASSED

55f7494 fixup! core/thread_flags: change internal API

Success Failures Total Runtime
10932 0 10932 19m:28s

Artifacts

@crasbe
Copy link
Contributor

crasbe commented Nov 18, 2025

Is this a 2 ACK kind of PR?

@mguetschow
Copy link
Contributor

Is this a 2 ACK kind of PR?

I'd say so, as it touches core. I though CI would label accordingly 🤔

}

irq_restore(irq_state);
if (yield) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can't this be called unconditionally?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it will be faster in case all threads of the group either where already pending or not currently actively waiting for one of the flags in mask.

@derMihai
Copy link
Contributor

Huh, I thought this was sorted out: #21123 (comment)

But if not, then you're totally right and this PR looks fine.

@maribu
Copy link
Member Author

maribu commented Nov 18, 2025

I would assume that @kaspar030 was referring to the crash with "sorted out". And indeed, that crash did not happen anymore.

Just sending out PendSV (or similar) in thread_yield_higher() is nothing you can do on platforms other than Cortex M. At least AVR and MSP430 do not have any service IRQs. I don't know how the situation is on RISC-V and Xtensa.

We certainly cannot assume that each and every architecture has a software IRQ usable for context switching without limiting portability.

@maribu maribu added the Process: needs >1 ACK Integration Process: This PR requires more than one ACK label Nov 18, 2025
@github-actions github-actions bot added the Process: missing approvals Integration Process: PR needs more ACKS (handled by action) label Nov 18, 2025
@maribu maribu force-pushed the core/thread_flags_group/fix branch from d705c34 to aa010b8 Compare November 19, 2025 08:28
Copy link
Contributor

@mguetschow mguetschow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for finding and fixing!

There is quite a lot of code that is just needed for debugging, but which makes the function look more complex than it is. I don't have a good suggestion how, but maybe you have an idea how to leave it more concise?

All uses of thread_flags_wake() also had to set the flags anyway, so
we can just combine those operations into a new
thread_flags_set_internal() and update the users to use that instead.
The previous implementation relied on `thread_flag_set()` to defer the
context switch when called with IRQs disabled until `irq_restore()` is
called. This however can only be the case when `thread_yield_higher`
triggers an IRQ and performs the context switch within the ISR. This
allowed the previous implementation to continue calling
`thread_flag_set()` for the remaining group members.

This however is an implementation detail that is not part of the API
contract. Platforms that do not have a service request IRQ may have to
use other means for context switching that do not get deferred until
an `irq_restore()` is called. In that case, the first call to
`thread_flags_set()` even with IRQs disabled may directly trigger a
context switch to the unblocked thread, even if other group members
would also be unblocked and have a higher priority.

This changes the implementation to manually set the flags and update
the thread status without yielding and keep track whether any thread
has been awoken. Only once the states of all threads have been updated,
the adapted implementation will now call `thread_yield()` (unless no
thread was awoken).
@maribu maribu force-pushed the core/thread_flags_group/fix branch from e6c7979 to d873852 Compare November 20, 2025 10:19
Comment on lines +145 to +149
bool thread_flags_set_internal(thread_t *thread, thread_flags_t mask)
{
thread->flags |= mask;
return _thread_flags_wake(thread);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that's a nice solution, thanks! Maybe remove thread_flags_wake though - it's not exposed through the header file anymore anyways.

RIOT/core/thread_flags.c

Lines 56 to 59 in a23948c

int thread_flags_wake(thread_t *thread)
{
return _thread_flags_wake(thread);
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That function has multiple callers...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_thread_flags_wake still has, but thread_flags_wake does not, does it? At least its not exposed through the headerfile anymore and all calls inside thread_flags.c should use the static (_-prefixed) function anyways?

So I'd say thread_flags_wake should be dead code as it stands now.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, OK. I could have sworn I dropped it. Maybe reordering the commits didn't work as expected

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Area: core Area: RIOT kernel. Handle PRs marked with this with care! CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR Process: missing approvals Integration Process: PR needs more ACKS (handled by action) Process: needs backport Integration Process: The PR is required to be backported to a release or feature branch Process: needs >1 ACK Integration Process: This PR requires more than one ACK Type: bug The issue reports a bug / The PR fixes a bug (including spelling errors)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants