You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
commit ade8147 upstream.
A soft lockup issue was found in the product with about 56,000 tasks were
in the OOM cgroup, it was traversing them when the soft lockup was
triggered.
watchdog: BUG: soft lockup - CPU#2 stuck for 23s! [VM Thread:1503066]
CPU: 2 PID: 1503066 Comm: VM Thread Kdump: loaded Tainted: G
Hardware name: Huawei Cloud OpenStack Nova, BIOS
RIP: 0010:console_unlock+0x343/0x540
RSP: 0000:ffffb751447db9a0 EFLAGS: 00000247 ORIG_RAX: ffffffffffffff13
RAX: 0000000000000001 RBX: 0000000000000000 RCX: 00000000ffffffff
RDX: 0000000000000000 RSI: 0000000000000004 RDI: 0000000000000247
RBP: ffffffffafc71f90 R08: 0000000000000000 R09: 0000000000000040
R10: 0000000000000080 R11: 0000000000000000 R12: ffffffffafc74bd0
R13: ffffffffaf60a220 R14: 0000000000000247 R15: 0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f2fe6ad91f0 CR3: 00000004b2076003 CR4: 0000000000360ee0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
vprintk_emit+0x193/0x280
printk+0x52/0x6e
dump_task+0x114/0x130
mem_cgroup_scan_tasks+0x76/0x100
dump_header+0x1fe/0x210
oom_kill_process+0xd1/0x100
out_of_memory+0x125/0x570
mem_cgroup_out_of_memory+0xb5/0xd0
try_charge+0x720/0x770
mem_cgroup_try_charge+0x86/0x180
mem_cgroup_try_charge_delay+0x1c/0x40
do_anonymous_page+0xb5/0x390
handle_mm_fault+0xc4/0x1f0
This is because thousands of processes are in the OOM cgroup, it takes a
long time to traverse all of them. As a result, this lead to soft lockup
in the OOM process.
To fix this issue, call 'cond_resched' in the 'mem_cgroup_scan_tasks'
function per 1000 iterations. For global OOM, call
'touch_softlockup_watchdog' per 1000 iterations to avoid this issue.
Link: https://lkml.kernel.org/r/[email protected]
Fixes: 9cbb78b ("mm, memcg: introduce own oom handler to iterate only over its own threads")
Signed-off-by: Chen Ridong <[email protected]>
Acked-by: Michal Hocko <[email protected]>
Cc: Roman Gushchin <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: Shakeel Butt <[email protected]>
Cc: Muchun Song <[email protected]>
Cc: Michal Koutný <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
0 commit comments