2022-10-14 02:45:21

by kernel test robot

Subject: [cpumask] 78e5a33994: WARNING:at_include/linux/cpumask.h:#wq_select_unbound_cpu


Hi Yury Norov,

We reported
"[cpumask] b9a7ecc71f: WARNING:at_include/linux/cpumask.h:#__is_kernel_percpu_address"
on
https://lore.kernel.org/all/[email protected]/
when this commit was on linux-next/master.
We noticed there was some discussion and patch review there.

This commit has now been merged into mainline and similar issues still
exist. We also ran further tests on mainline and linux-next/master and confirmed
that there is no fix so far. Since the conclusion and the next step from the
discussion of the "b9a7ecc71f" report are not clear to us, we are reporting this
again, just as an FYI on what we observed in our tests.

8173aa26260e6d01 78e5a3399421ad79fc024e6d78e
---------------- ---------------------------
       fail:runs  %reproduction    fail:runs
           |             |             |
           :6          100%           6:6     dmesg.EIP:c_start
           :6          100%           6:6     dmesg.EIP:clocksource_watchdog
           :6          100%           6:6     dmesg.EIP:smp_call_function_many_cond
           :6          100%           6:6     dmesg.EIP:wq_select_unbound_cpu
           :6          100%           6:6     dmesg.WARNING:at_include/linux/cpumask.h:#c_start
           :6          100%           6:6     dmesg.WARNING:at_include/linux/cpumask.h:#clocksource_watchdog
           :6          100%           6:6     dmesg.WARNING:at_include/linux/cpumask.h:#smp_call_function_many_cond
           :6          100%           6:6     dmesg.WARNING:at_include/linux/cpumask.h:#wq_select_unbound_cpu

below is the full report.

Greetings,

FYI, we noticed the following commit (built with gcc-11):

commit: 78e5a3399421ad79fc024e6d78e2deb7809d26af ("cpumask: fix checking valid cpu range")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

in testcase: boot

on test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G

caused the changes below (please refer to the attached dmesg/kmsg for the entire log/backtrace):


If you fix the issue, kindly add the following tags
| Reported-by: kernel test robot <[email protected]>
| Link: https://lore.kernel.org/r/[email protected]


[ 2.927894][ T1] ------------[ cut here ]------------
[ 2.927894][ T1] WARNING: CPU: 1 PID: 1 at include/linux/cpumask.h:110 wq_select_unbound_cpu+0x1b8/0x1c0
[ 2.927894][ T1] Modules linked in:
[ 2.927894][ T1] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 6.0.0-rc4-00024-g78e5a3399421 #1
[ 2.927894][ T1] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.0-debian-1.16.0-4 04/01/2014
[ 2.927894][ T1] EIP: wq_select_unbound_cpu+0x1b8/0x1c0
[ 2.927894][ T1] Code: 89 c3 89 c7 a1 e0 6e 7b c2 89 45 f0 39 d8 b8 40 a5 71 c2 0f 96 c2 31 c9 e8 a5 62 16 00 39 5d f0 77 8c e9 e6 fe ff ff 8d 76 00 <0f> 0b e9 27 ff ff ff 90 55 89 e5 57 56 89 c6 53 83 ec 28 89 55 d0
[ 2.927894][ T1] EAX: c271a8a0 EBX: 00000002 ECX: 00000000 EDX: 00000001
[ 2.927894][ T1] ESI: 00000001 EDI: 00000002 EBP: c34c1ea4 ESP: c34c1e90
[ 2.927894][ T1] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 EFLAGS: 00010046
[ 2.927894][ T1] CR0: 80050033 CR2: 00000000 CR3: 028cb000 CR4: 000406d0
[ 2.927894][ T1] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[ 2.927894][ T1] DR6: fffe0ff0 DR7: 00000400
[ 2.927894][ T1] Call Trace:
[ 2.927894][ T1] ? __queue_work+0x82/0x720
[ 2.927894][ T1] __queue_work+0x41b/0x720
[ 2.927894][ T1] ? __queue_work+0x82/0x720
[ 2.927894][ T1] queue_work_on+0x67/0x70
[ 2.927894][ T1] trace_eval_init+0xfc/0x10e
[ 2.927894][ T1] do_one_initcall+0x95/0x470
[ 2.927894][ T1] ? eval_map_work_func+0x3b/0x3b
[ 2.927894][ T1] ? rdinit_setup+0x3d/0x3d
[ 2.927894][ T1] ? rcu_read_lock_sched_held+0x41/0x70
[ 2.927894][ T1] do_initcalls+0xf9/0x11b
[ 2.927894][ T1] kernel_init_freeable+0x103/0x136
[ 2.927894][ T1] ? rest_init+0x170/0x170
[ 2.927894][ T1] kernel_init+0x1a/0x110
[ 2.927894][ T1] ? schedule_tail_wrapper+0x9/0xc
[ 2.927894][ T1] ret_from_fork+0x1c/0x28
[ 2.927894][ T1] irq event stamp: 38076
[ 2.927894][ T1] hardirqs last enabled at (38075): [<c1d39c54>] _raw_spin_unlock_irqrestore+0x34/0x50
[ 2.927894][ T1] hardirqs last disabled at (38076): [<c1097293>] queue_work_on+0x53/0x70
[ 2.927894][ T1] softirqs last enabled at (37886): [<c1d3b912>] __do_softirq+0x3b2/0x66e
[ 2.927894][ T1] softirqs last disabled at (37877): [<c101f0cc>] call_on_stack+0x4c/0x60
[ 2.927894][ T1] ---[ end trace 0000000000000000 ]---



To reproduce:

# build kernel
cd linux
cp config-6.0.0-rc4-00024-g78e5a3399421 .config
make HOSTCC=gcc-11 CC=gcc-11 ARCH=i386 olddefconfig prepare modules_prepare bzImage modules
make HOSTCC=gcc-11 CC=gcc-11 ARCH=i386 INSTALL_MOD_PATH=<mod-install-dir> modules_install
cd <mod-install-dir>
find lib/ | cpio -o -H newc --quiet | gzip > modules.cgz


git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp qemu -k <bzImage> -m modules.cgz job-script # job-script is attached in this email

# if you come across any failure that blocks the test,
# please remove the ~/.lkp and /lkp directories to run from a clean state.
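
# to see which functions hit the cpumask.h warning in a captured boot log,
# a small helper along these lines may be useful. It is only a sketch: the
# function name list_cpumask_warnings and the file name dmesg.txt are
# placeholders of ours (not part of lkp-tests), and it assumes the log is
# plain text with WARNING lines in the same format as the trace above.

```shell
# Print the unique function names reported in
# "WARNING: ... at include/linux/cpumask.h:<line> <func>+<off>/<len>" lines.
# $1: path to a plain-text dmesg log (hypothetical name, e.g. dmesg.txt)
list_cpumask_warnings() {
    grep -E 'WARNING: .* at include/linux/cpumask\.h:[0-9]+ ' "$1" \
        | sed -E 's/.* at include\/linux\/cpumask\.h:[0-9]+ ([A-Za-z0-9_.]+)\+.*/\1/' \
        | sort -u
}

# Possible usage, assuming the attached dmesg.xz has been downloaded:
#   xz -dc dmesg.xz > dmesg.txt
#   list_cpumask_warnings dmesg.txt
```

# an empty result means no cpumask.h warning line was found in that log.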



--
0-DAY CI Kernel Test Service
https://01.org/lkp



Attachments:
(No filename) (5.39 kB)
config-6.0.0-rc4-00024-g78e5a3399421 (147.35 kB)
job-script (4.93 kB)
dmesg.xz (31.09 kB)