Hello,
kernel test robot noticed "WARNING:at_kernel/workqueue.c:#worker_enter_idle" on:
commit: 6a97cb750bf8a8f4b6e514355ab2b80a6866c542 ("workqueue: Automatically mark CPU-hogging work items CPU_INTENSIVE")
https://git.kernel.org/cgit/linux/kernel/git/tj/wq.git auto-cpu-intensive-v4
in testcase: rcutorture
version:
with following parameters:
runtime: 300s
test: cpuhotplug
torture_type: busted_srcud
compiler: gcc-12
test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G
(please refer to attached dmesg/kmsg for entire log/backtrace)
this does not always happen, but not a small rate, and clean on parent in our
tests:
=========================================================================================
tbox_group/testcase/rootfs/kconfig/compiler/runtime/test/torture_type:
vm-snb/rcutorture/debian-11.1-i386-20220923.cgz/x86_64-randconfig-a003-20230531/gcc-12/300s/cpuhotplug/busted_srcud
52185b931a87084c 6a97cb750bf8a8f4b6e514355ab
---------------- ---------------------------
fail:runs %reproduction fail:runs
| | |
:20 45% 9:20 dmesg.RIP:worker_enter_idle
:20 45% 9:20 dmesg.WARNING:at_kernel/workqueue.c:#worker_enter_idle
If you fix the issue, kindly add following tag
| Reported-by: kernel test robot <[email protected]>
| Closes: https://lore.kernel.org/oe-lkp/[email protected]
[ 383.839791][ T66] ------------[ cut here ]------------
[ 383.840594][ T66] WARNING: CPU: 0 PID: 66 at kernel/workqueue.c:1903 worker_enter_idle (workqueue.c:?)
[ 383.841389][ T66] Modules linked in: rcutorture torture
[ 383.841872][ T66] CPU: 0 PID: 66 Comm: kworker/0:2 Tainted: G W 6.4.0-rc1-00007-g6a97cb750bf8 #1
[ 383.842740][ T66] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
[ 383.843617][ T66] Workqueue: 0x0 (rcu_gp)
[ 383.844027][ T66] RIP: 0010:worker_enter_idle (workqueue.c:?)
[ 383.844546][ T66] Code: 48 89 fa 48 c1 e0 2a 48 c1 ea 03 8a 14 02 48 89 f8 83 e0 07 83 c0 03 38 d0 7c 09 84 d2 74 05 e8 78 4c 40 00 83 7d 5c 00 74 02 <0f> 0b 5b 5d 41 5c 41 5d c3 41 54 49 89 d4 55 48 89 f5 53 48 89 fb
All code
========
0: 48 89 fa mov %rdi,%rdx
3: 48 c1 e0 2a shl $0x2a,%rax
7: 48 c1 ea 03 shr $0x3,%rdx
b: 8a 14 02 mov (%rdx,%rax,1),%dl
e: 48 89 f8 mov %rdi,%rax
11: 83 e0 07 and $0x7,%eax
14: 83 c0 03 add $0x3,%eax
17: 38 d0 cmp %dl,%al
19: 7c 09 jl 0x24
1b: 84 d2 test %dl,%dl
1d: 74 05 je 0x24
1f: e8 78 4c 40 00 callq 0x404c9c
24: 83 7d 5c 00 cmpl $0x0,0x5c(%rbp)
28: 74 02 je 0x2c
2a:* 0f 0b ud2 <-- trapping instruction
2c: 5b pop %rbx
2d: 5d pop %rbp
2e: 41 5c pop %r12
30: 41 5d pop %r13
32: c3 retq
33: 41 54 push %r12
35: 49 89 d4 mov %rdx,%r12
38: 55 push %rbp
39: 48 89 f5 mov %rsi,%rbp
3c: 53 push %rbx
3d: 48 89 fb mov %rdi,%rbx
Code starting with the faulting instruction
===========================================
0: 0f 0b ud2
2: 5b pop %rbx
3: 5d pop %rbp
4: 41 5c pop %r12
6: 41 5d pop %r13
8: c3 retq
9: 41 54 push %r12
b: 49 89 d4 mov %rdx,%r12
e: 55 push %rbp
f: 48 89 f5 mov %rsi,%rbp
12: 53 push %rbx
13: 48 89 fb mov %rdi,%rbx
[ 383.846260][ T66] RSP: 0018:ffffc90000e5fe80 EFLAGS: 00210086
[ 383.846801][ T66] RAX: 0000000000000007 RBX: 0000000000000003 RCX: 0000000000000000
[ 383.847458][ T66] RDX: 1ffff11075e47200 RSI: ffff8883af239638 RDI: ffff8883af23961c
[ 383.848151][ T66] RBP: ffff8883af2395c0 R08: fffff520001cbfad R09: 0000000000000003
[ 383.864919][ T66] R10: 0000000000000001 R11: ffffffff81292f65 R12: ffff8883af239634
[ 383.865648][ T66] R13: ffff8883af239638 R14: ffffe8ffffc05fa8 R15: ffff8881bfde8000
[ 383.866360][ T66] FS: 0000000000000000(0000) GS:ffff8883af200000(0000) knlGS:0000000000000000
[ 383.867160][ T66] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 383.867736][ T66] CR2: 0000000056625160 CR3: 00000001bf680000 CR4: 00000000000406f0
[ 383.868439][ T66] Call Trace:
[ 383.868737][ T66] <TASK>
[ 383.869006][ T66] worker_thread (workqueue.c:?)
[ 383.869428][ T66] ? rescuer_thread (workqueue.c:?)
[ 383.869858][ T66] kthread (kthread.c:?)
[ 383.870216][ T66] ? kthread_complete_and_exit (kthread.c:?)
[ 383.870710][ T66] ret_from_fork (??:?)
[ 383.871118][ T66] </TASK>
[ 383.871392][ T66] irq event stamp: 372430
[ 383.871770][ T66] hardirqs last enabled at (372429): _raw_spin_unlock_irq (??:?)
[ 383.872627][ T66] hardirqs last disabled at (372430): _raw_spin_lock_irq (??:?)
[ 383.873463][ T66] softirqs last enabled at (372426): srcu_invoke_callbacks (srcutree.c:?)
[ 383.874357][ T66] softirqs last disabled at (372416): srcu_invoke_callbacks (srcutree.c:?)
[ 383.875230][ T66] ---[ end trace 0000000000000000 ]---
[ 383.911576][ T19] numa_remove_cpu cpu 0 node 0: mask now 1
[ 383.913220][ T778] smpboot: CPU 0 is now offline
[ 383.937638][ T778] smpboot: Booting Node 0 Processor 0 APIC 0x0
[ 383.956429][ T0] masked ExtINT on CPU#0
[ 383.957038][ T0] numa_add_cpu cpu 0 node 0: mask now 0-1
[ 384.048832][ T19] numa_remove_cpu cpu 0 node 0: mask now 1
[ 384.053689][ T778] smpboot: CPU 0 is now offline
[ 384.087434][ T778] smpboot: Booting Node 0 Processor 0 APIC 0x0
[ 384.088421][ T0] masked ExtINT on CPU#0
[ 384.088903][ T0] numa_add_cpu cpu 0 node 0: mask now 0-1
[ 384.161716][ T19] numa_remove_cpu cpu 0 node 0: mask now 1
[ 384.165632][ T778] smpboot: CPU 0 is now offline
[ 384.201629][ T778] smpboot: Booting Node 0 Processor 0 APIC 0x0
[ 384.202678][ T0] masked ExtINT on CPU#0
[ 384.203223][ T0] numa_add_cpu cpu 0 node 0: mask now 0-1
[ 384.293717][ T24] numa_remove_cpu cpu 1 node 0: mask now 0
[ 384.303187][ T778] smpboot: CPU 1 is now offline
[ 384.325618][ T778] x86: Booting SMP configuration:
[ 384.326196][ T778] smpboot: Booting Node 0 Processor 1 APIC 0x1
[ 384.330912][ T0] masked ExtINT on CPU#1
[ 384.331459][ T0] numa_add_cpu cpu 1 node 0: mask now 0-1
[ 384.452543][ T19] numa_remove_cpu cpu 0 node 0: mask now 1
[ 384.470688][ T778] smpboot: CPU 0 is now offline
[ 384.508469][ T778] smpboot: Booting Node 0 Processor 0 APIC 0x0
[ 384.516760][ T0] masked ExtINT on CPU#0
[ 384.524642][ T0] numa_add_cpu cpu 0 node 0: mask now 0-1
[ 384.600928][ T24] numa_remove_cpu cpu 1 node 0: mask now 0
[ 384.604489][ T778] smpboot: CPU 1 is now offline
[ 384.634772][ T778] x86: Booting SMP configuration:
[ 384.635438][ T778] smpboot: Booting Node 0 Processor 1 APIC 0x1
[ 384.638672][ T0] masked ExtINT on CPU#1
[ 384.639204][ T0] numa_add_cpu cpu 1 node 0: mask now 0-1
[ 384.711713][ T24] numa_remove_cpu cpu 1 node 0: mask now 0
[ 384.715687][ T778] smpboot: CPU 1 is now offline
[ 384.756551][ T778] x86: Booting SMP configuration:
[ 384.757172][ T778] smpboot: Booting Node 0 Processor 1 APIC 0x1
[ 384.761578][ T0] masked ExtINT on CPU#1
[ 384.762137][ T0] numa_add_cpu cpu 1 node 0: mask now 0-1
[ 384.822854][ T19] numa_remove_cpu cpu 0 node 0: mask now 1
[ 384.826733][ T778] smpboot: CPU 0 is now offline
[ 384.854352][ T778] smpboot: Booting Node 0 Processor 0 APIC 0x0
[ 384.855284][ T0] masked ExtINT on CPU#0
[ 384.856355][ T0] numa_add_cpu cpu 0 node 0: mask now 0-1
[ 384.907586][ T24] numa_remove_cpu cpu 1 node 0: mask now 0
[ 384.911114][ T778] smpboot: CPU 1 is now offline
[ 384.930329][ T778] x86: Booting SMP configuration:
[ 384.930879][ T778] smpboot: Booting Node 0 Processor 1 APIC 0x1
[ 384.934490][ T0] masked ExtINT on CPU#1
[ 384.935036][ T0] numa_add_cpu cpu 1 node 0: mask now 0-1
[ 384.988713][ T24] numa_remove_cpu cpu 1 node 0: mask now 0
[ 384.992191][ T778] smpboot: CPU 1 is now offline
[ 385.014302][ T778] x86: Booting SMP configuration:
[ 385.014842][ T778] smpboot: Booting Node 0 Processor 1 APIC 0x1
[ 385.018360][ T0] masked ExtINT on CPU#1
[ 385.018910][ T0] numa_add_cpu cpu 1 node 0: mask now 0-1
[ 385.070898][ T19] numa_remove_cpu cpu 0 node 0: mask now 1
[ 385.074682][ T778] smpboot: CPU 0 is now offline
[ 385.110766][ T778] smpboot: Booting Node 0 Processor 0 APIC 0x0
[ 385.111739][ T0] masked ExtINT on CPU#0
[ 385.112241][ T0] numa_add_cpu cpu 0 node 0: mask now 0-1
[ 385.190845][ T24] numa_remove_cpu cpu 1 node 0: mask now 0
[ 385.194656][ T778] smpboot: CPU 1 is now offline
[ 385.220409][ T778] x86: Booting SMP configuration:
[ 385.220980][ T778] smpboot: Booting Node 0 Processor 1 APIC 0x1
[ 385.224337][ T0] masked ExtINT on CPU#1
[ 385.224883][ T0] numa_add_cpu cpu 1 node 0: mask now 0-1
[ 385.280190][ T19] numa_remove_cpu cpu 0 node 0: mask now 1
[ 385.285294][ T778] smpboot: CPU 0 is now offline
[ 385.309929][ T778] smpboot: Booting Node 0 Processor 0 APIC 0x0
[ 385.310916][ T0] masked ExtINT on CPU#0
[ 385.311385][ T0] numa_add_cpu cpu 0 node 0: mask now 0-1
[ 385.389121][ T19] numa_remove_cpu cpu 0 node 0: mask now 1
[ 385.392753][ T778] smpboot: CPU 0 is now offline
[ 385.425664][ T778] smpboot: Booting Node 0 Processor 0 APIC 0x0
[ 385.426672][ T0] masked ExtINT on CPU#0
[ 385.427199][ T0] numa_add_cpu cpu 0 node 0: mask now 0-1
[ 385.515315][ T24] numa_remove_cpu cpu 1 node 0: mask now 0
[ 385.519172][ T778] smpboot: CPU 1 is now offline
[ 385.554725][ T778] x86: Booting SMP configuration:
To reproduce:
# build kernel
cd linux
cp config-6.4.0-rc1-00007-g6a97cb750bf8 .config
make HOSTCC=gcc-12 CC=gcc-12 ARCH=x86_64 olddefconfig prepare modules_prepare bzImage modules
make HOSTCC=gcc-12 CC=gcc-12 ARCH=x86_64 INSTALL_MOD_PATH=<mod-install-dir> modules_install
cd <mod-install-dir>
find lib/ | cpio -o -H newc --quiet | gzip > modules.cgz
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp qemu -k <bzImage> -m modules.cgz job-script # job-script is attached in this email
# if come across any failure that blocks the test,
# please remove ~/.lkp and /lkp dir to run from a clean state.
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki