2021-03-03 00:47:22

by kernel test robot

[permalink] [raw]
Subject: [task_work] 35d0b389f3: BUG:workqueue_lockup-pool


Greeting,

FYI, we noticed the following commit (built with gcc-9):

commit: 35d0b389f3b23439ad15b610d6e43fc72fc75779 ("task_work: unconditionally run task_work from get_signal()")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master


in testcase: trinity
version: trinity-static-i386-x86_64-f93256fb_2019-08-28
with following parameters:

runtime: 300s

test-description: Trinity is a linux system call fuzz tester.
test-url: http://codemonkey.org.uk/projects/trinity/


on test machine: qemu-system-i386 -enable-kvm -cpu SandyBridge -smp 2 -m 8G

caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):


+---------------------------------------------------------+-----------+------------+
| | v5.11-rc2 | 35d0b389f3 |
+---------------------------------------------------------+-----------+------------+
| BUG:workqueue_lockup-pool | 0 | 7 |
| INFO:rcu_preempt_detected_stalls_on_CPUs/tasks | 0 | 2 |
| BUG:scheduling_while_atomic | 0 | 1 |
| BUG:spinlock_recursion_on_CPU | 0 | 2 |
| EIP:preempt_schedule_common | 0 | 1 |
| EIP:_raw_spin_unlock_irq | 0 | 1 |
| EIP:preempt_count_sub | 0 | 1 |
+---------------------------------------------------------+-----------+------------+


If you fix the issue, kindly add following tag
Reported-by: kernel test robot <[email protected]>


[ 60.721590] BUG: workqueue lockup - pool cpus=0 node=0 flags=0x0 nice=0 stuck for 45s!
[ 60.722087] Showing busy workqueues and worker pools:
[ 60.722384] workqueue events: flags=0x0
[ 60.722612] pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=3/256 refcnt=4
[ 60.723012] pending: vmstat_shepherd, cache_reap, regulator_init_complete_work_function
[ 60.723511] workqueue rcu_gp: flags=0x8
[ 60.723740] pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=1/256 refcnt=2
[ 60.724147] pending: strict_work_handler
[ 60.724405] workqueue mm_percpu_wq: flags=0x8
[ 60.724671] pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=1/256 refcnt=2
[ 60.725071] pending: vmstat_update
[ 90.801593] BUG: workqueue lockup - pool cpus=0 node=0 flags=0x0 nice=0 stuck for 75s!
[ 90.802096] Showing busy workqueues and worker pools:
[ 90.802394] workqueue events: flags=0x0
[ 90.802623] pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=3/256 refcnt=4
[ 90.803030] pending: vmstat_shepherd, cache_reap, regulator_init_complete_work_function
[ 90.803528] workqueue rcu_gp: flags=0x8
[ 90.803757] pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=1/256 refcnt=2
[ 90.804157] pending: strict_work_handler
[ 90.804415] workqueue mm_percpu_wq: flags=0x8
[ 90.804675] pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=1/256 refcnt=2
[ 90.805088] pending: vmstat_update
[ 115.721589] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
[ 115.721985] (detected by 1, t=10002 jiffies, g=6965, q=871)
[ 115.722324] rcu: All QSes seen, last rcu_preempt kthread activity 10002 (-18436--28438), jiffies_till_next_fqs=1, root ->qsmask 0x0
[ 115.723004] rcu: rcu_preempt kthread starved for 10002 jiffies! g6965 f0x0 RCU_GP_ONOFF(3) ->state=0x0 ->cpu=0
[ 115.723581] rcu: Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior.
[ 115.724106] rcu: RCU grace-period kthread stack dump:
[ 115.724402] task:rcu_preempt state:R running task stack: 0 pid: 10 ppid: 2 flags:0x00004008
[ 115.724975] Call Trace:
[ 115.725127] __schedule (kbuild/src/consumer/kernel/sched/core.c:4316 kbuild/src/consumer/kernel/sched/core.c:5064)
[ 115.725340] ? __queue_work (kbuild/src/consumer/kernel/workqueue.c:1470)
[ 115.725570] ? sysvec_call_function_single (kbuild/src/consumer/arch/x86/kernel/apic/apic.c:1096)
[ 115.725864] preempt_schedule_irq (kbuild/src/consumer/arch/x86/include/asm/irqflags.h:49 (discriminator 1) kbuild/src/consumer/arch/x86/include/asm/irqflags.h:89 (discriminator 1) kbuild/src/consumer/kernel/sched/core.c:5327 (discriminator 1))
[ 115.726123] irqentry_exit_cond_resched (??:?)
[ 115.726437] irqentry_exit (kbuild/src/consumer/kernel/entry/common.c:435)
[ 115.726651] sysvec_apic_timer_interrupt (kbuild/src/consumer/arch/x86/kernel/apic/apic.c:1096)
[ 115.726934] handle_exception (kbuild/src/consumer/arch/x86/entry/entry_32.S:1172)
[ 115.727174] EIP: preempt_schedule_common (kbuild/src/consumer/arch/x86/include/asm/bitops.h:207 kbuild/src/consumer/include/asm-generic/bitops/instrumented-non-atomic.h:135 kbuild/src/consumer/include/linux/thread_info.h:102 kbuild/src/consumer/include/linux/sched.h:1917 kbuild/src/consumer/kernel/sched/core.c:5232)
[ 115.727457] Code: f5 cb 89 de 64 ff 05 ac 77 f5 cb 64 a1 ac 77 f5 cb 25 ff ff ff 7f 48 74 2b b8 01 00 00 00 e8 c9 f4 ff ff 64 ff 0d ac 77 f5 cb <8b> 06 a8 08 75 d4 8b 5d f4 8b 75 f8 8b 7d fc 89 ec 5d c3 8d b4 26
All code
========
0: f5 cmc
1: cb lret
2: 89 de mov %ebx,%esi
4: 64 ff 05 ac 77 f5 cb incl %fs:-0x340a8854(%rip) # 0xffffffffcbf577b7
b: 64 a1 ac 77 f5 cb 25 movabs %fs:0xffffff25cbf577ac,%eax
12: ff ff ff
15: 7f 48 jg 0x5f
17: 74 2b je 0x44
19: b8 01 00 00 00 mov $0x1,%eax
1e: e8 c9 f4 ff ff callq 0xfffffffffffff4ec
23: 64 ff 0d ac 77 f5 cb decl %fs:-0x340a8854(%rip) # 0xffffffffcbf577d6
2a:* 8b 06 mov (%rsi),%eax <-- trapping instruction
2c: a8 08 test $0x8,%al
2e: 75 d4 jne 0x4
30: 8b 5d f4 mov -0xc(%rbp),%ebx
33: 8b 75 f8 mov -0x8(%rbp),%esi
36: 8b 7d fc mov -0x4(%rbp),%edi
39: 89 ec mov %ebp,%esp
3b: 5d pop %rbp
3c: c3 retq
3d: 8d .byte 0x8d
3e: b4 26 mov $0x26,%ah

Code starting with the faulting instruction
===========================================
0: 8b 06 mov (%rsi),%eax
2: a8 08 test $0x8,%al
4: 75 d4 jne 0xffffffffffffffda
6: 8b 5d f4 mov -0xc(%rbp),%ebx
9: 8b 75 f8 mov -0x8(%rbp),%esi
c: 8b 7d fc mov -0x4(%rbp),%edi
f: 89 ec mov %ebp,%esp
11: 5d pop %rbp
12: c3 retq
13: 8d .byte 0x8d
14: b4 26 mov $0x26,%ah
[ 115.728529] EAX: d7ed9940 EBX: c107c280 ECX: 00000000 EDX: 00000002
[ 115.728895] ESI: c107c280 EDI: cb40d425 EBP: c107ff20 ESP: c107ff14
[ 115.729259] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 EFLAGS: 00000246
[ 115.729653] ? preempt_schedule (kbuild/src/consumer/kernel/sched/core.c:5250)
[ 115.729891] ? sysvec_call_function_single (kbuild/src/consumer/arch/x86/kernel/apic/apic.c:1096)
[ 115.730186] ? sysvec_call_function_single (kbuild/src/consumer/arch/x86/kernel/apic/apic.c:1096)
[ 115.730479] ? preempt_schedule_common (kbuild/src/consumer/arch/x86/include/asm/bitops.h:207 kbuild/src/consumer/include/asm-generic/bitops/instrumented-non-atomic.h:135 kbuild/src/consumer/include/linux/thread_info.h:102 kbuild/src/consumer/include/linux/sched.h:1917 kbuild/src/consumer/kernel/sched/core.c:5232)
[ 115.730751] preempt_schedule (kbuild/src/consumer/kernel/sched/core.c:5250)
[ 115.730979] preempt_schedule_thunk (kbuild/src/consumer/arch/x86/entry/thunk_32.S:33)
[ 115.731227] ? _raw_spin_unlock (kbuild/src/consumer/kernel/locking/spinlock.c:184)
[ 115.731465] rcu_gp_kthread (kbuild/src/consumer/kernel/rcu/tree.c:1769 kbuild/src/consumer/kernel/rcu/tree.c:2103)
[ 115.731694] kthread (kbuild/src/consumer/kernel/kthread.c:292)
[ 115.731888] ? dump_blkd_tasks+0x40/0x40
[ 115.732180] ? kthread_create_worker_on_cpu (kbuild/src/consumer/kernel/kthread.c:245)
[ 115.732484] ret_from_fork (kbuild/src/consumer/arch/x86/entry/entry_32.S:856)
[ 115.732723] BUG: spinlock recursion on CPU#1, udevd/275
[ 115.733029] lock: rcu_state+0x0/0x2aa0, .magic: dead4ead, .owner: udevd/275, .owner_cpu: 1
[ 115.733514] CPU: 1 PID: 275 Comm: udevd Not tainted 5.11.0-rc2-00001-g35d0b389f3b2 #2
[ 115.733968] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
[ 115.734449] Call Trace:
[ 115.734596] dump_stack (kbuild/src/consumer/lib/dump_stack.c:122)
[ 115.734796] spin_dump.cold (kbuild/src/consumer/kernel/locking/spinlock_debug.c:68 (discriminator 3))
[ 115.735014] do_raw_spin_lock (kbuild/src/consumer/kernel/locking/spinlock_debug.c:75 kbuild/src/consumer/kernel/locking/spinlock_debug.c:84 kbuild/src/consumer/kernel/locking/spinlock_debug.c:112)
[ 115.735242] _raw_spin_lock_irqsave (kbuild/src/consumer/kernel/locking/spinlock.c:160)
[ 115.735498] ? rcu_force_quiescent_state (kbuild/src/consumer/kernel/rcu/tree.c:2664)
[ 115.735779] rcu_force_quiescent_state (kbuild/src/consumer/kernel/rcu/tree.c:2664)
[ 115.736049] rcu_sched_clock_irq.cold (kbuild/src/consumer/kernel/rcu/tree_stall.h:536 kbuild/src/consumer/kernel/rcu/tree_stall.h:652 kbuild/src/consumer/kernel/rcu/tree.c:3751 kbuild/src/consumer/kernel/rcu/tree.c:2580)
[ 115.736326] ? _raw_spin_unlock_irqrestore (kbuild/src/consumer/arch/x86/include/asm/preempt.h:102 kbuild/src/consumer/include/linux/spinlock_api_smp.h:161 kbuild/src/consumer/kernel/locking/spinlock.c:191)
[ 115.736617] ? hrtimer_run_queues (kbuild/src/consumer/kernel/time/hrtimer.c:1760)
[ 115.736864] update_process_times (kbuild/src/consumer/arch/x86/include/asm/preempt.h:26 kbuild/src/consumer/kernel/time/timer.c:1784)
[ 115.737109] tick_periodic (kbuild/src/consumer/kernel/time/tick-common.c:101)
[ 115.737329] ? sysvec_call_function_single (kbuild/src/consumer/arch/x86/kernel/apic/apic.c:1096)
[ 115.737619] ? debug_smp_processor_id (kbuild/src/consumer/lib/smp_processor_id.c:65)
[ 115.737886] tick_handle_periodic (kbuild/src/consumer/kernel/time/tick-common.c:124)
[ 115.738136] ? sysvec_call_function_single (kbuild/src/consumer/arch/x86/kernel/apic/apic.c:1096)
[ 115.738435] __sysvec_apic_timer_interrupt (kbuild/src/consumer/include/asm-generic/irq_regs.h:28 kbuild/src/consumer/arch/x86/kernel/apic/apic.c:1105)
[ 115.738727] ? perf_iterate_sb (kbuild/src/consumer/include/linux/rcupdate.h:695 (discriminator 3) kbuild/src/consumer/kernel/events/core.c:7460 (discriminator 3))
[ 115.738966] sysvec_apic_timer_interrupt (kbuild/src/consumer/arch/x86/kernel/apic/apic.c:1096)
[ 115.739247] handle_exception (kbuild/src/consumer/arch/x86/entry/entry_32.S:1172)
[ 115.739484] EIP: _raw_spin_unlock_irq (kbuild/src/consumer/include/linux/spinlock_api_smp.h:169 kbuild/src/consumer/kernel/locking/spinlock.c:199)
[ 115.739751] Code: 00 00 00 00 8d b4 26 00 00 00 00 90 55 89 e5 53 89 c3 8b 55 04 83 c0 10 e8 2f 53 69 ff 89 d8 e8 48 90 69 ff fb b8 01 00 00 00 <e8> 3d c7 66 ff 64 a1 ac 77 f5 cb 85 c0 74 03 5b 5d c3 e8 1a dc 5e
All code
========
0: 00 00 add %al,(%rax)
2: 00 00 add %al,(%rax)
4: 8d b4 26 00 00 00 00 lea 0x0(%rsi,%riz,1),%esi
b: 90 nop
c: 55 push %rbp
d: 89 e5 mov %esp,%ebp
f: 53 push %rbx
10: 89 c3 mov %eax,%ebx
12: 8b 55 04 mov 0x4(%rbp),%edx
15: 83 c0 10 add $0x10,%eax
18: e8 2f 53 69 ff callq 0xffffffffff69534c
1d: 89 d8 mov %ebx,%eax
1f: e8 48 90 69 ff callq 0xffffffffff69906c
24: fb sti
25: b8 01 00 00 00 mov $0x1,%eax
2a:* e8 3d c7 66 ff callq 0xffffffffff66c76c <-- trapping instruction
2f: 64 a1 ac 77 f5 cb 85 movabs %fs:0x374c085cbf577ac,%eax
36: c0 74 03
39: 5b pop %rbx
3a: 5d pop %rbp
3b: c3 retq
3c: e8 .byte 0xe8
3d: 1a dc sbb %ah,%bl
3f: 5e pop %rsi

Code starting with the faulting instruction
===========================================
0: e8 3d c7 66 ff callq 0xffffffffff66c742
5: 64 a1 ac 77 f5 cb 85 movabs %fs:0x374c085cbf577ac,%eax
c: c0 74 03
f: 5b pop %rbx
10: 5d pop %rbp
11: c3 retq
12: e8 .byte 0xe8
13: 1a dc sbb %ah,%bl
15: 5e pop %rsi
[ 115.740821] EAX: 00000001 EBX: d7ef7940 ECX: 00000000 EDX: cb9e33fa
[ 115.741184] ESI: dd75c240 EDI: 00000000 EBP: dce43d54 ESP: dce43d50
[ 115.741547] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 EFLAGS: 00000246
[ 115.741940] ? sysvec_call_function_single (kbuild/src/consumer/arch/x86/kernel/apic/apic.c:1096)
[ 115.742233] ? sysvec_call_function_single (kbuild/src/consumer/arch/x86/kernel/apic/apic.c:1096)
[ 115.742524] ? _raw_spin_unlock_irq (kbuild/src/consumer/include/linux/spinlock_api_smp.h:169 kbuild/src/consumer/kernel/locking/spinlock.c:199)
[ 115.742788] finish_task_switch (kbuild/src/consumer/arch/x86/include/asm/current.h:15 kbuild/src/consumer/kernel/sched/core.c:4091 kbuild/src/consumer/kernel/sched/core.c:4192)
[ 115.743032] ? finish_task_switch (kbuild/src/consumer/kernel/sched/core.c:3965 kbuild/src/consumer/kernel/sched/core.c:4181)
[ 115.743285] __schedule (kbuild/src/consumer/kernel/sched/core.c:5072)
[ 115.743492] ? __queue_work (kbuild/src/consumer/kernel/workqueue.c:1470)
[ 115.743719] ? sysvec_call_function_single (kbuild/src/consumer/arch/x86/kernel/apic/apic.c:1096)
[ 115.744011] preempt_schedule_irq (kbuild/src/consumer/arch/x86/include/asm/irqflags.h:49 (discriminator 1) kbuild/src/consumer/arch/x86/include/asm/irqflags.h:89 (discriminator 1) kbuild/src/consumer/kernel/sched/core.c:5327 (discriminator 1))
[ 115.744258] irqentry_exit_cond_resched (??:?)
[ 115.744569] irqentry_exit (kbuild/src/consumer/kernel/entry/common.c:435)
[ 115.744782] sysvec_apic_timer_interrupt (kbuild/src/consumer/arch/x86/kernel/apic/apic.c:1096)
[ 115.745064] handle_exception (kbuild/src/consumer/arch/x86/entry/entry_32.S:1172)
[ 115.745302] EIP: preempt_count_sub (kbuild/src/consumer/kernel/sched/core.c:4758)
[ 115.745555] Code: 15 10 22 15 cc 85 d2 75 18 64 8b 15 ac 77 f5 cb 81 e2 ff ff ff 7f 39 c2 7c 17 3d fe 00 00 00 76 40 f7 d8 64 01 05 ac 77 f5 cb <c9> c3 8d 74 26 00 90 e8 23 a0 98 00 85 c0 90 74 ef 8b 0d 0c d5 ea
All code
========
0: 15 10 22 15 cc adc $0xcc152210,%eax
5: 85 d2 test %edx,%edx
7: 75 18 jne 0x21
9: 64 8b 15 ac 77 f5 cb mov %fs:-0x340a8854(%rip),%edx # 0xffffffffcbf577bc
10: 81 e2 ff ff ff 7f and $0x7fffffff,%edx
16: 39 c2 cmp %eax,%edx
18: 7c 17 jl 0x31
1a: 3d fe 00 00 00 cmp $0xfe,%eax
1f: 76 40 jbe 0x61
21: f7 d8 neg %eax
23: 64 01 05 ac 77 f5 cb add %eax,%fs:-0x340a8854(%rip) # 0xffffffffcbf577d6
2a:* c9 leaveq <-- trapping instruction
2b: c3 retq
2c: 8d 74 26 00 lea 0x0(%rsi,%riz,1),%esi
30: 90 nop
31: e8 23 a0 98 00 callq 0x98a059
36: 85 c0 test %eax,%eax
38: 90 nop
39: 74 ef je 0x2a
3b: 8b .byte 0x8b
3c: 0d .byte 0xd
3d: 0c d5 or $0xd5,%al
3f: ea (bad)

Code starting with the faulting instruction
===========================================
0: c9 leaveq
1: c3 retq
2: 8d 74 26 00 lea 0x0(%rsi,%riz,1),%esi
6: 90 nop
7: e8 23 a0 98 00 callq 0x98a02f
c: 85 c0 test %eax,%eax
e: 90 nop
f: 74 ef je 0x0
11: 8b .byte 0x8b
12: 0d .byte 0xd
13: 0c d5 or $0xd5,%al
15: ea (bad)


To reproduce:

# build kernel
cd linux
cp config-5.11.0-rc2-00001-g35d0b389f3b2 .config
make HOSTCC=gcc-9 CC=gcc-9 ARCH=i386 olddefconfig prepare modules_prepare bzImage

git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp qemu -k <bzImage> job-script # job-script is attached in this email



Thanks,
Oliver Sang


Attachments:
(No filename) (16.79 kB)
config-5.11.0-rc2-00001-g35d0b389f3b2 (133.64 kB)
job-script (4.25 kB)
dmesg.xz (16.34 kB)
Download all attachments