-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sched: Fix nx_waitid(), nx_waitpid(), nxtask_exithook() for SMP #2298
Conversation
Summary: - I noticed waitpid_test stops with lc823450-xgevk:rndis - The condition was CONFIG_DEBUG_ASSERTION=y - Actually, the child task sent SIGCHILD but the parent couldn't catch the signal - Then, I found that nx_waitid(), nx_waitpid() use sched_lock() - However, a parent task and a child task are running on different CPUs - So, sched_lock() is not enough and need to use a critical section - Also, signal handling in nxtask_exithook() must be done in a critical section Impact: - SMP only Testing: - Tested with ostest with the following configurations - lc823450-xgevk:rndis (CONFIG_DEBUG_ASSERTION=y and n) - spresense:smp - spresense:wifi_smp (NCPUS=2 and 4) - sabre-6quad:smp (QEMU) - esp32-core:smp (QEMU) - maix-bit:smp (QEMU) - sim:smp Signed-off-by: Masayuki Ishikawa <[email protected]>
*/ | ||
|
||
#ifdef CONFIG_SMP | ||
irqstate_t flags = enter_critical_section(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't the call to sched_(un)lock
in line 156 not needed then?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Ouss4
Thanks for your comment. I will modify the code and do the regression test again.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Ouss4
I tried the following change in sched_waitid.c and sched_waitpid.c
The ostest passed for the all SMP configurations.
However, Wi-Fi streaming plus stress testing with spresense:wifi_smp (NCPUS=4, DEBUG_ASSERTIONS=y)
failed with DEBUGASSERTION after 3.5 hours. (FYI, without the following changes, the test continued over 9 hours)
I think replacing sched_lock() and sched_unlock() with a critical section should be a future task.
What do you think?
#ifdef CONFIG_SMP
/* NOTE: sched_lock() is not enough for SMP
* because the child task is running on another CPU
*/
irqstate_t flags = enter_critical_section();
#else
/* Disable pre-emption so that nothing changes while the loop executes */
sched_lock();
#endif
...
#ifdef CONFIG_SMP
leave_critical_section(flags);
#else
sched_unlock();
#endif
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See issue #1137 |
Summary
Impact
Testing