Greeting,
FYI, we noticed the following commit (built with gcc-9):
commit: b93acee44e1a2b541ffe5ae1dda6be3d6f65987f ("mm/oom_kill: wake futex waiters before annihilating victim shared mutex")
url: https://github.com/0day-ci/linux/commits/Yang-Li/net-phy-micrel-use-min-macro-instead-of-doing-it-manually/20211224-171618
in testcase: boot
on test machine: qemu-system-x86_64 -enable-kvm -cpu Icelake-Server -smp 4 -m 16G
caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):
If you fix the issue, kindly add following tag
Reported-by: kernel test robot <[email protected]>
[ 61.631030][ T538] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:584
[ 61.632850][ T538] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 538, name: kworker/u8:5
[ 61.634593][ T538] preempt_count: 1, expected: 0
[ 61.635532][ T538] RCU nest depth: 0, expected: 0
[ 61.636479][ T538] 2 locks held by kworker/u8:5/538:
[ 61.637447][ T538] #0: c5515ef8 (oom_lock){+.+.}-{3:3}, at: __alloc_pages_slowpath+0x4f7/0xe40
[ 61.639340][ T538] #1: f591cafc (&p->alloc_lock){+.+.}-{2:2}, at: find_lock_task_mm (mm/oom_kill.c:145)
[ 61.641023][ T538] Preemption disabled at:
[ 61.641037][ T538] find_lock_task_mm (mm/oom_kill.c:145)
[ 61.642909][ T538] CPU: 2 PID: 538 Comm: kworker/u8:5 Not tainted 5.16.0-rc6-00051-gb93acee44e1a #2
[ 61.644587][ T538] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
[ 61.646331][ T538] Call Trace:
[ 61.646977][ T538] dump_stack_lvl (lib/dump_stack.c:107)
[ 61.647873][ T538] dump_stack (lib/dump_stack.c:114)
[ 61.648692][ T538] __might_resched.cold (kernel/sched/core.c:9539)
[ 61.649673][ T538] ? find_lock_task_mm (mm/oom_kill.c:145)
[ 61.650669][ T538] __might_sleep (kernel/sched/core.c:9468 (discriminator 24))
[ 61.651544][ T538] __mutex_lock (include/linux/kernel.h:101 kernel/locking/mutex.c:584 kernel/locking/mutex.c:740)
[ 61.652306][ T538] ? check_preemption_disabled (lib/smp_processor_id.c:16)
[ 61.655532][ T538] ? check_preemption_disabled (lib/smp_processor_id.c:16)
[ 61.656539][ T538] mutex_lock_nested (kernel/locking/mutex.c:793)
[ 61.657421][ T538] ? futex_cleanup_begin (kernel/futex/core.c:1084)
[ 61.658371][ T538] futex_cleanup_begin (kernel/futex/core.c:1084)
[ 61.659339][ T538] futex_exit_release (kernel/futex/core.c:1124)
[ 61.660237][ T538] __oom_kill_process (mm/oom_kill.c:899)
[ 61.661238][ T538] oom_kill_process (mm/oom_kill.c:1008)
[ 61.662178][ T538] out_of_memory (mm/oom_kill.c:1131 (discriminator 4))
[ 61.663049][ T538] __alloc_pages_slowpath+0xd7f/0xe40
[ 61.664287][ T538] __alloc_pages (mm/page_alloc.c:5382)
[ 61.665183][ T538] cache_grow_begin+0x247/0x320
[ 61.666307][ T538] cache_alloc_refill (mm/slab.c:2965)
[ 61.667261][ T538] ? check_preemption_disabled (lib/smp_processor_id.c:16)
[ 61.668392][ T538] kmem_cache_alloc (mm/slab.c:3048 mm/slab.c:3031 mm/slab.c:3292 mm/slab.c:3316 mm/slab.c:3507)
[ 61.669315][ T538] getname_kernel (fs/namei.c:227)
[ 61.670208][ T538] kernel_execve (include/linux/err.h:36 fs/exec.c:1940)
[ 61.671077][ T538] call_usermodehelper_exec_async (kernel/umh.c:116)
[ 61.672208][ T538] ? umh_complete (kernel/umh.c:67)
[ 61.673026][ T538] ret_from_fork (arch/x86/entry/entry_32.S:775)
[ 61.673926][ T538]
[ 61.674318][ T538] =============================
[ 61.675084][ T538] [ BUG: Invalid wait context ]
[ 61.675857][ T538] 5.16.0-rc6-00051-gb93acee44e1a #2 Tainted: G W
[ 61.677049][ T538] -----------------------------
[ 61.677824][ T538] kworker/u8:5/538 is trying to lock:
[ 61.678698][ T538] f591d154 (&tsk->futex_exit_mutex){+.+.}-{3:3}, at: futex_cleanup_begin (kernel/futex/core.c:1084)
[ 61.680223][ T538] other info that might help us debug this:
[ 61.681201][ T538] context-{4:4}
[ 61.681786][ T538] 2 locks held by kworker/u8:5/538:
[ 61.682660][ T538] #0: c5515ef8 (oom_lock){+.+.}-{3:3}, at: __alloc_pages_slowpath+0x4f7/0xe40
[ 61.684302][ T538] #1: f591cafc (&p->alloc_lock){+.+.}-{2:2}, at: find_lock_task_mm (mm/oom_kill.c:145)
[ 61.685872][ T538] stack backtrace:
[ 61.686492][ T538] CPU: 2 PID: 538 Comm: kworker/u8:5 Tainted: G W 5.16.0-rc6-00051-gb93acee44e1a #2
[ 61.688196][ T538] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
[ 61.689731][ T538] Call Trace:
[ 61.690294][ T538] dump_stack_lvl (lib/dump_stack.c:107)
[ 61.691025][ T538] dump_stack (lib/dump_stack.c:114)
[ 61.691730][ T538] __lock_acquire.cold (kernel/locking/lockdep.c:4680 kernel/locking/lockdep.c:4739 kernel/locking/lockdep.c:4977)
[ 61.692585][ T538] ? check_preemption_disabled (lib/smp_processor_id.c:16)
[ 61.693533][ T538] lock_acquire (kernel/locking/lockdep.c:438 kernel/locking/lockdep.c:5639 kernel/locking/lockdep.c:5602)
[ 61.694275][ T538] ? futex_cleanup_begin (kernel/futex/core.c:1084)
[ 61.695139][ T538] ? __might_resched.cold (kernel/sched/core.c:9539)
[ 61.696045][ T538] ? find_lock_task_mm (mm/oom_kill.c:145)
[ 61.696868][ T538] __mutex_lock (arch/x86/include/asm/atomic.h:29 include/linux/atomic/atomic-long.h:523 include/linux/atomic/atomic-instrumented.h:1184 kernel/locking/mutex.c:104 kernel/locking/mutex.c:149 kernel/locking/mutex.c:609 kernel/locking/mutex.c:740)
[ 61.697523][ T538] ? futex_cleanup_begin (kernel/futex/core.c:1084)
[ 61.698389][ T538] ? check_preemption_disabled (lib/smp_processor_id.c:16)
[ 61.699287][ T538] ? check_preemption_disabled (lib/smp_processor_id.c:16)
[ 61.700190][ T538] mutex_lock_nested (kernel/locking/mutex.c:793)
[ 61.700975][ T538] ? futex_cleanup_begin (kernel/futex/core.c:1084)
[ 61.701816][ T538] futex_cleanup_begin (kernel/futex/core.c:1084)
[ 61.702621][ T538] futex_exit_release (kernel/futex/core.c:1124)
[ 61.703363][ T538] __oom_kill_process (mm/oom_kill.c:899)
[ 61.704227][ T538] oom_kill_process (mm/oom_kill.c:1008)
[ 61.705003][ T538] out_of_memory (mm/oom_kill.c:1131 (discriminator 4))
[ 61.705784][ T538] __alloc_pages_slowpath+0xd7f/0xe40
[ 61.706829][ T538] __alloc_pages (mm/page_alloc.c:5382)
[ 61.707608][ T538] cache_grow_begin+0x247/0x320
[ 61.708602][ T538] cache_alloc_refill (mm/slab.c:2965)
[ 61.709435][ T538] ? check_preemption_disabled (lib/smp_processor_id.c:16)
[ 61.710385][ T538] kmem_cache_alloc (mm/slab.c:3048 mm/slab.c:3031 mm/slab.c:3292 mm/slab.c:3316 mm/slab.c:3507)
[ 61.711194][ T538] getname_kernel (fs/namei.c:227)
[ 61.711964][ T538] kernel_execve (include/linux/err.h:36 fs/exec.c:1940)
[ 61.712710][ T538] call_usermodehelper_exec_async (kernel/umh.c:116)
[ 61.713660][ T538] ? umh_complete (kernel/umh.c:67)
[ 61.714361][ T538] ret_from_fork (arch/x86/entry/entry_32.S:775)
[ 61.715202][ T538] Out of memory: Killed process 458 (systemd-journal) total-vm:15160kB, anon-rss:344kB, file-rss:3536kB, shmem-rss:1384kB, UID:0 pgtables:20kB oom_score_adj:0
[ 61.831730][ T41] oom_reaper: reaped process 458 (systemd-journal), now anon-rss:0kB, file-rss:0kB, shmem-rss:1384kB
[ 62.198664][ T539] _warn_unseeded_randomness: 3 callbacks suppressed
[ 62.198675][ T539] random: get_random_u32 called from arch_rnd+0x19/0x30 with crng_init=1
[ 62.198691][ T539] random: get_random_u32 called from randomize_stack_top+0x35/0x50 with crng_init=1
[ 62.198703][ T539] random: get_random_u32 called from arch_align_stack+0x35/0x50 with crng_init=1
Stopping Flush Journal to Persistent Storage...
[ OK ] Stopped Journal Service.
Starting Journal Service...
[ 71.972603][ T540] random: get_random_u32 called from arch_rnd+0x19/0x30 with crng_init=1
[ 71.972628][ T540] random: get_random_u32 called from randomize_stack_top+0x35/0x50 with crng_init=1
[ 71.972635][ T540] random: get_random_u32 called from arch_align_stack+0x35/0x50 with crng_init=1
[ OK ] Started Journal Service.
Starting Flush Journal to Persistent Storage...
[ OK ] Reached target System Initialization.
[ OK ] Listening on D-Bus System Message Bus Socket.
[ OK ] Reached target Sockets.
[ OK ] Started Daily apt download activities.
[ OK ] Started Daily Cleanup of Temporary Directories.
[ OK ] Started Daily apt upgrade and clean activities.
[ OK ] Reached target Timers.
[ OK ] Reached target Basic System.
Starting LKP bootstrap...
Startin[ 77.266134][ T544] _warn_unseeded_randomness: 17 callbacks suppressed
g /etc/rc.local [ 77.266144][ T544] random: get_random_u32 called from arch_rnd+0x19/0x30 with crng_init=1
Compatibility...[ 77.266160][ T544] random: get_random_u32 called from randomize_stack_top+0x35/0x50 with crng_init=1
[ 77.266173][ T544] random: get_random_u32 called from arch_align_stack+0x35/0x50 with crng_init=1
Starting LSB: Execute the kexec -e command to reboot system...
Starting LSB: Start and stop bmc-watchdog...
Starting OpenBSD Secure Shell server...
[ 87.529761][ T560] _warn_unseeded_randomness: 70 callbacks suppressed
[ 87.529769][ T560] random: get_random_u32 called from arch_rnd+0x19/0x30 with crng_init=1
[ 87.529779][ T560] random: get_random_u32 called from randomize_stack_top+0x35/0x50 with crng_init=1
[ 87.529787][ T560] random: get_random_u32 called from arch_align_stack+0x35/0x50 with crng_init=1
Starting System Logging Service...
[ 87.475599] rc.local[545]: PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/lkp/lkp/src/bin
[ 87.588377][ T523] systemd-udevd invoked oom-killer: gfp_mask=0x40cc0(GFP_KERNEL|__GFP_COMP), order=0, oom_score_adj=0
[ 87.590235][ T523] CPU: 1 PID: 523 Comm: systemd-udevd Tainted: G W 5.16.0-rc6-00051-gb93acee44e1a #2
[ 87.592034][ T523] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
[ 87.593451][ T523] Call Trace:
[ 87.593965][ T523] dump_stack_lvl (lib/dump_stack.c:107)
[ 87.594666][ T523] dump_stack (lib/dump_stack.c:114)
[ 87.595338][ T523] dump_header (mm/oom_kill.c:465)
[ 87.596051][ T523] ? ___ratelimit (lib/ratelimit.c:69)
[ 87.596794][ T523] oom_kill_process.cold (mm/oom_kill.c:994)
[ 87.597533][ T523] out_of_memory (mm/oom_kill.c:1131 (discriminator 4))
[ 87.598258][ T523] __alloc_pages_slowpath+0xd7f/0xe40
[ 87.599300][ T523] __alloc_pages (mm/page_alloc.c:5382)
[ 87.600063][ T523] cache_grow_begin+0x247/0x320
[ 87.600974][ T523] cache_alloc_refill (mm/slab.c:2965)
[ 87.601757][ T523] ? check_preemption_disabled (lib/smp_processor_id.c:16)
[ 87.602703][ T523] kmem_cache_alloc (mm/slab.c:3048 mm/slab.c:3031 mm/slab.c:3292 mm/slab.c:3316 mm/slab.c:3507)
[ 87.603512][ T523] ? import_iovec (lib/iov_iter.c:2015)
[ 87.604274][ T523] getname_flags (include/linux/audit.h:323 include/linux/audit.h:363 fs/namei.c:134)
[ 87.605141][ T523] ? kfree (mm/slab.c:3790)
[ 87.605789][ T523] getname_flags (fs/namei.c:204)
[ 87.606502][ T523] getname (fs/namei.c:218)
[ 87.607132][ T523] do_sys_openat2 (include/linux/err.h:36 fs/open.c:1207)
[ 87.607931][ T523] ? dput_to_list (include/linux/rcupdate.h:689 fs/dcache.c:909)
[ 87.608737][ T523] do_sys_open (fs/open.c:1229)
[ 87.609451][ T523] __ia32_sys_open (fs/open.c:1232)
[ 87.610228][ T523] do_int80_syscall_32 (arch/x86/entry/common.c:112 arch/x86/entry/common.c:132)
[ 87.611061][ T523] ? check_preemption_disabled (lib/smp_processor_id.c:16)
[ 87.612021][ T523] ? rcu_read_lock_held_common (kernel/rcu/update.c:104)
[ 87.612977][ T523] ? rcu_read_lock_sched_held (kernel/rcu/update.c:123)
[ 87.613916][ T523] ? trace_irq_enable (include/trace/events/preemptirq.h:40)
[ 87.614765][ T523] ? syscall_exit_to_user_mode (kernel/entry/common.c:303)
[ 87.615723][ T523] ? do_int80_syscall_32 (arch/x86/entry/common.c:136)
[ 87.616495][ T523] ? do_int80_syscall_32 (arch/x86/entry/common.c:136)
[ 87.617348][ T523] ? do_int80_syscall_32 (arch/x86/entry/common.c:136)
[ 87.618209][ T523] ? do_int80_syscall_32 (arch/x86/entry/common.c:136)
[ 87.619046][ T523] ? sysvec_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:1097)
[ 87.620016][ T523] entry_INT80_32 (arch/x86/entry/entry_32.S:981)
[ 87.620774][ T523] EIP: 0xb7f33a02
[ 87.621392][ T523] Code: 95 01 00 05 25 36 02 00 83 ec 14 8d 80 e8 99 ff ff 50 6a 02 e8 1f ff 00 00 c7 04 24 7f 00 00 00 e8 7e 87 01 00 66 90 90 cd 80 <c3> 8d b6 00 00 00 00 8d bc 27 00 00 00 00 8b 1c 24 c3 8d b6 00 00
All code
========
0: 95 xchg %eax,%ebp
1: 01 00 add %eax,(%rax)
3: 05 25 36 02 00 add $0x23625,%eax
8: 83 ec 14 sub $0x14,%esp
b: 8d 80 e8 99 ff ff lea -0x6618(%rax),%eax
11: 50 push %rax
12: 6a 02 pushq $0x2
14: e8 1f ff 00 00 callq 0xff38
19: c7 04 24 7f 00 00 00 movl $0x7f,(%rsp)
20: e8 7e 87 01 00 callq 0x187a3
25: 66 90 xchg %ax,%ax
27: 90 nop
28: cd 80 int $0x80
2a:* c3 retq <-- trapping instruction
2b: 8d b6 00 00 00 00 lea 0x0(%rsi),%esi
31: 8d bc 27 00 00 00 00 lea 0x0(%rdi,%riz,1),%edi
38: 8b 1c 24 mov (%rsp),%ebx
3b: c3 retq
3c: 8d .byte 0x8d
3d: b6 00 mov $0x0,%dh
...
Code starting with the faulting instruction
===========================================
0: c3 retq
1: 8d b6 00 00 00 00 lea 0x0(%rsi),%esi
7: 8d bc 27 00 00 00 00 lea 0x0(%rdi,%riz,1),%edi
e: 8b 1c 24 mov (%rsp),%ebx
11: c3 retq
12: 8d .byte 0x8d
13: b6 00 mov $0x0,%dh
To reproduce:
# build kernel
cd linux
cp config-5.16.0-rc6-00051-gb93acee44e1a .config
make HOSTCC=gcc-9 CC=gcc-9 ARCH=i386 olddefconfig prepare modules_prepare bzImage modules
make HOSTCC=gcc-9 CC=gcc-9 ARCH=i386 INSTALL_MOD_PATH=<mod-install-dir> modules_install
cd <mod-install-dir>
find lib/ | cpio -o -H newc --quiet | gzip > modules.cgz
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp qemu -k <bzImage> -m modules.cgz job-script # job-script is attached in this email
# if come across any failure that blocks the test,
# please remove ~/.lkp and /lkp dir to run from a clean state.
---
0DAY/LKP+ Test Infrastructure Open Source Technology Center
https://lists.01.org/hyperkitty/list/[email protected] Intel Corporation
Thanks,
Oliver Sang