2022-03-11 21:40:29

by kernel test robot

[permalink] [raw]
Subject: [sched/cpuacct] 3eba0505d0: EIP:finish_task_switch



Greeting,

FYI, we noticed the following commit (built with gcc-9):

commit: 3eba0505d03a9c1eb30d40c2330c0880b22d1b3a ("sched/cpuacct: Remove redundant RCU read lock")
https://git.kernel.org/cgit/linux/kernel/git/tip/tip.git sched/core

in testcase: boot

on test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G

caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):


+-----------------------------------------------------------------+------------+------------+
| | dc6e0818bc | 3eba0505d0 |
+-----------------------------------------------------------------+------------+------------+
| boot_successes | 0 | 0 |
| boot_failures | 8 | 10 |
| WARNING:suspicious_RCU_usage | 8 | 10 |
| include/linux/cgroup.h:#suspicious_rcu_dereference_check()usage | 8 | 10 |
| WARNING:possible_circular_locking_dependency_detected | 8 | 10 |
| EIP:finish_task_switch | 0 | 10 |
+-----------------------------------------------------------------+------------+------------+

please be noted other issues we found for this commit,
suspicious_RCU_usage or possible_circular_locking_dependency_detected
are also observed in parent.

EIP:finish_task_switch only happens upon this commit, so though it follows
the Call Trace for other issues, we still report this FYI.


If you fix the issue, kindly add following tag
Reported-by: kernel test robot <[email protected]>


[ 9.002853][ T2] WARNING: suspicious RCU usage
[ 9.002853][ T2] 5.17.0-rc5-00051-g3eba0505d03a #1 Not tainted
[ 9.002853][ T2] -----------------------------
[ 9.002853][ T2] include/linux/cgroup.h:494 suspicious rcu_dereference_check() usage!
[ 9.002853][ T2]
[ 9.002853][ T2] other info that might help us debug this:
[ 9.002853][ T2]
[ 9.002853][ T2] rcu_scheduler_active = 2, debug_locks = 1
[ 9.002853][ T2] 1 lock held by kthreadd/2:
[ 9.002853][ T2] #0: c24b61d0 (&rq->__lock){-...}-{2:2}, at: __schedule (kernel/sched/sched.h:1539 (discriminator 22) kernel/sched/sched.h:1617 (discriminator 22) kernel/sched/core.c:6220 (discriminator 22))
[ 9.002853][ T2]
[ 9.002853][ T2] stack backtrace:
[ 9.002853][ T2] CPU: 0 PID: 2 Comm: kthreadd Not tainted 5.17.0-rc5-00051-g3eba0505d03a #1 43d8d413b42d4c183bf4cd8750f3ef2f56ba06d1
[ 9.002853][ T2] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
[ 9.002853][ T2] Call Trace:
[ 9.002853][ T2] dump_stack_lvl (lib/dump_stack.c:108)
[ 9.002853][ T2] dump_stack (lib/dump_stack.c:114)
[ 9.002853][ T2] lockdep_rcu_suspicious (kernel/locking/lockdep.c:6564)
[ 9.002853][ T2] cpuacct_charge (include/linux/cgroup.h:494 kernel/sched/cpuacct.c:40 kernel/sched/cpuacct.c:342)
[ 9.002853][ T2] update_curr (include/linux/cgroup.h:481 include/linux/cgroup.h:550 include/linux/cgroup.h:794 kernel/sched/fair.c:877)
[ 9.002853][ T2] dequeue_entity (kernel/sched/fair.c:3250 kernel/sched/fair.c:4168 kernel/sched/fair.c:4402)
[ 9.002853][ T2] dequeue_task_fair (kernel/sched/fair.c:5747)
[ 9.002853][ T2] __schedule (kernel/sched/core.c:6262)
[ 9.002853][ T2] ? trace_hardirqs_on (kernel/trace/trace_preemptirq.c:51 (discriminator 22))
[ 9.002853][ T2] schedule (arch/x86/include/asm/preempt.h:85 (discriminator 1) kernel/sched/core.c:6380 (discriminator 1))
[ 9.002853][ T2] ? kthread_is_per_cpu (kernel/kthread.c:719)
[ 9.002853][ T2] kthreadd (kernel/kthread.c:735)
[ 9.002853][ T2] ? kthread_is_per_cpu (kernel/kthread.c:719)
[ 9.002853][ T2] ret_from_fork (arch/x86/entry/entry_32.S:772)
[ 9.002853][ T2]
[ 9.002853][ T2] =============================
[ 9.002853][ T2] WARNING: suspicious RCU usage
[ 9.002853][ T2] 5.17.0-rc5-00051-g3eba0505d03a #1 Not tainted
[ 9.002853][ T2] -----------------------------
[ 9.002853][ T2] include/linux/cgroup.h:481 suspicious rcu_dereference_check() usage!
[ 9.002853][ T2]
[ 9.002853][ T2] other info that might help us debug this:
[ 9.002853][ T2]
[ 9.002853][ T2] rcu_scheduler_active = 2, debug_locks = 1
[ 9.002853][ T2] 1 lock held by kthreadd/2:
[ 9.002853][ T2] #0: c24b61d0 (&rq->__lock){-...}-{2:2}, at: __schedule (kernel/sched/sched.h:1539 (discriminator 22) kernel/sched/sched.h:1617 (discriminator 22) kernel/sched/core.c:6220 (discriminator 22))
[ 9.002853][ T2]
[ 9.002853][ T2] stack backtrace:
[ 9.002853][ T2] CPU: 0 PID: 2 Comm: kthreadd Not tainted 5.17.0-rc5-00051-g3eba0505d03a #1 43d8d413b42d4c183bf4cd8750f3ef2f56ba06d1
[ 9.002853][ T2] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
[ 9.002853][ T2] Call Trace:
[ 9.002853][ T2] dump_stack_lvl (lib/dump_stack.c:108)
[ 9.002853][ T2] dump_stack (lib/dump_stack.c:114)
[ 9.002853][ T2] lockdep_rcu_suspicious (kernel/locking/lockdep.c:6564)
[ 9.002853][ T2] update_curr (include/linux/cgroup.h:481 include/linux/cgroup.h:550 include/linux/cgroup.h:794 kernel/sched/fair.c:877)
[ 9.002853][ T2] dequeue_entity (kernel/sched/fair.c:3250 kernel/sched/fair.c:4168 kernel/sched/fair.c:4402)
[ 9.002853][ T2] dequeue_task_fair (kernel/sched/fair.c:5747)
[ 9.002853][ T2] __schedule (kernel/sched/core.c:6262)
[ 9.002853][ T2] ? trace_hardirqs_on (kernel/trace/trace_preemptirq.c:51 (discriminator 22))
[ 9.002853][ T2] schedule (arch/x86/include/asm/preempt.h:85 (discriminator 1) kernel/sched/core.c:6380 (discriminator 1))
[ 9.002853][ T2] ? kthread_is_per_cpu (kernel/kthread.c:719)
[ 9.002853][ T2] kthreadd (kernel/kthread.c:735)
[ 9.002853][ T2] ? kthread_is_per_cpu (kernel/kthread.c:719)
[ 9.002853][ T2] ret_from_fork (arch/x86/entry/entry_32.S:772)
[ 9.002853][ C0]
[ 9.002853][ C0] =============================
[ 9.002853][ C0] WARNING: suspicious RCU usage
[ 9.002853][ C0] 5.17.0-rc5-00051-g3eba0505d03a #1 Not tainted
[ 9.002853][ C0] -----------------------------
[ 9.002853][ C0] include/linux/cgroup.h:481 suspicious rcu_dereference_check() usage!
[ 9.002853][ C0]
[ 9.002853][ C0] other info that might help us debug this:
[ 9.002853][ C0]
[ 9.002853][ C0] rcu_scheduler_active = 2, debug_locks = 1
[ 9.002853][ C0] no locks held by swapper/1.
[ 9.002853][ C0]
[ 9.002853][ C0] stack backtrace:
[ 9.002853][ C0] CPU: 0 PID: 1 Comm: swapper Not tainted 5.17.0-rc5-00051-g3eba0505d03a #1 43d8d413b42d4c183bf4cd8750f3ef2f56ba06d1
[ 9.002853][ C0] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
[ 9.002853][ C0] Call Trace:
[ 9.002853][ C0] <IRQ>
[ 9.002853][ C0] dump_stack_lvl (lib/dump_stack.c:108)
[ 9.002853][ C0] dump_stack (lib/dump_stack.c:114)
[ 9.002853][ C0] lockdep_rcu_suspicious (kernel/locking/lockdep.c:6564)
[ 9.002853][ C0] account_system_index_time (include/linux/cgroup.h:481 include/linux/cgroup.h:550 include/linux/cgroup.h:807 kernel/sched/cputime.c:110 kernel/sched/cputime.c:173)
[ 9.002853][ C0] account_system_time (kernel/sched/cputime.c:202)
[ 9.002853][ C0] account_process_tick (kernel/sched/cputime.c:495)
[ 9.002853][ C0] update_process_times (kernel/time/timer.c:1759 kernel/time/timer.c:1784)
[ 9.002853][ C0] tick_periodic+0x28/0xf0
[ 9.002853][ C0] ? tick_handle_periodic (kernel/time/tick-common.c:120)
[ 9.002853][ C0] tick_handle_periodic (kernel/time/tick-common.c:120)
[ 9.002853][ C0] timer_interrupt (arch/x86/kernel/time.c:59)
[ 9.002853][ C0] __handle_irq_event_percpu (kernel/irq/handle.c:158)
[ 9.002853][ C0] handle_irq_event_percpu (kernel/irq/handle.c:193)
[ 9.002853][ C0] handle_irq_event (kernel/irq/handle.c:210)
[ 9.002853][ C0] ? handle_fasteoi_nmi (kernel/irq/chip.c:632)
[ 9.002853][ C0] handle_level_irq (include/linux/irq.h:346 kernel/irq/chip.c:617 kernel/irq/chip.c:653)
[ 9.002853][ C0] __handle_irq (arch/x86/kernel/irq_32.c:100 arch/x86/kernel/irq_32.c:158)
[ 9.002853][ C0] </IRQ>
[ 9.002853][ C0] __common_interrupt (arch/x86/kernel/irq.c:264 (discriminator 19))
[ 9.002853][ C0] ? rcu_read_lock_sched_held (kernel/rcu/update.c:125)
[ 9.002853][ C0] common_interrupt (arch/x86/kernel/irq.c:240)
[ 9.002853][ C0] asm_common_interrupt (arch/x86/include/asm/idtentry.h:629)
[ 9.002853][ C0] EIP: finish_task_switch (kernel/sched/core.c:4783 kernel/sched/core.c:4885)
[ 9.002853][ C0] Code: 08 31 d2 89 4c 24 04 31 c9 c7 04 24 00 00 00 00 e8 b2 08 02 00 b8 c0 61 4b c2 e8 78 ae a8 00 e8 53 1d 09 00 fb a1 d4 a9 4a c2 <8b> b8 f0 13 00 00 85 ff 0f 85 27 01 00 00 85 db 74 16 a1 d4 a9 4a
All code
========
0: 08 31 or %dh,(%rcx)
2: d2 89 4c 24 04 31 rorb %cl,0x3104244c(%rcx)
8: c9 leaveq
9: c7 04 24 00 00 00 00 movl $0x0,(%rsp)
10: e8 b2 08 02 00 callq 0x208c7
15: b8 c0 61 4b c2 mov $0xc24b61c0,%eax
1a: e8 78 ae a8 00 callq 0xa8ae97
1f: e8 53 1d 09 00 callq 0x91d77
24: fb sti
25: a1 d4 a9 4a c2 8b b8 movabs 0x13f0b88bc24aa9d4,%eax
2c: f0 13
2e:* 00 00 add %al,(%rax) <-- trapping instruction
30: 85 ff test %edi,%edi
32: 0f 85 27 01 00 00 jne 0x15f
38: 85 db test %ebx,%ebx
3a: 74 16 je 0x52
3c: a1 .byte 0xa1
3d: d4 (bad)
3e: a9 .byte 0xa9
3f: 4a rex.WX

Code starting with the faulting instruction
===========================================
0: 8b b8 f0 13 00 00 mov 0x13f0(%rax),%edi
6: 85 ff test %edi,%edi
8: 0f 85 27 01 00 00 jne 0x135
e: 85 db test %ebx,%ebx
10: 74 16 je 0x28
12: a1 .byte 0xa1
13: d4 (bad)
14: a9 .byte 0xa9
15: 4a rex.WX
[ 9.002853][ C0] EAX: c3598040 EBX: 00000000 ECX: ffffffff EDX: 00000000
[ 9.002853][ C0] ESI: c3599b00 EDI: c3598040 EBP: c3581f90 ESP: c3581f70
[ 9.002853][ C0] DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068 EFLAGS: 00000202
[ 9.002853][ C0] ? finish_task_switch (kernel/sched/core.c:4783 kernel/sched/core.c:4885)
[ 9.002853][ C0] ? finish_task_switch (kernel/sched/core.c:4755 kernel/sched/core.c:4875)
[ 9.002853][ C0] ? rest_init (init/main.c:1494)
[ 9.002853][ C0] schedule_tail (arch/x86/include/asm/current.h:15 kernel/sched/core.c:4936)
[ 9.002853][ C0] ? rest_init (init/main.c:1494)
[ 9.002853][ C0] schedule_tail_wrapper (arch/x86/entry/entry_32.S:737)
[ 9.002853][ C0] ret_from_fork (arch/x86/entry/entry_32.S:755)
[ 9.005199][ T1] cblist_init_generic: Setting adjustable number of callback queues.
[ 9.005860][ T1] cblist_init_generic: Setting shift to 0 and lim to 1.
[ 9.006994][ T1] cblist_init_generic: Setting shift to 0 and lim to 1.
[ 9.007969][ T1] cblist_init_generic: Setting shift to 0 and lim to 1.
[ 9.008960][ T1] Running RCU-tasks wait API self tests
[ 9.012886][ T1] Performance Events: unsupported p6 CPU model 42 no PMU driver, software events only.
[ 9.016749][ T1] devtmpfs: initialized
[ 9.018657][ T1] random: get_random_u32 called from bucket_table_alloc+0x94/0x180 with crng_init=0
[ 9.019923][ T10] Callback from call_rcu_tasks_trace() invoked.
[ 9.022381][ T1] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 1911260446275000 ns
[ 9.022879][ T1] futex hash table entries: 256 (order: 1, 12288 bytes, linear)
[ 9.025016][ T1] pinctrl core: initialized pinctrl subsystem
[ 9.025862][ T1] pinctrl core: failed to create debugfs directory
[ 9.027694][ T1]
[ 9.027861][ T1] *************************************************************
[ 9.028863][ T1] ** NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE **
[ 9.029863][ T1] ** **
[ 9.030862][ T1] ** IOMMU DebugFS SUPPORT HAS BEEN ENABLED IN THIS KERNEL **
[ 9.031862][ T1] ** **
[ 9.032861][ T1] ** This means that this kernel is built to expose internal **
[ 9.033864][ T1] ** IOMMU data structures, which may compromise security on **
[ 9.035861][ T1] ** your system. **
[ 9.036866][ T1] ** **
[ 9.037861][ T1] ** If you see this message and you are not debugging the **
[ 9.038862][ T1] ** kernel, report this immediately to your vendor! **
[ 9.039860][ T1] ** **
[ 9.040861][ T1] ** NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE **
[ 9.041865][ T1] *************************************************************
[ 9.044111][ T1] NET: Registered PF_NETLINK/PF_ROUTE protocol family
[ 9.048749][ T1] thermal_sys: Registered thermal governor 'step_wise'
[ 9.048789][ T1] cpuidle: using governor ladder
[ 9.052307][ T1] PCI: PCI BIOS area is rw and x. Use pci=nobios if you want it NX.
[ 9.052863][ T1] PCI: PCI BIOS revision 2.10 entry at 0xfd1bc, last bus=0
[ 9.079034][ T1] kprobes: kprobe jump-optimization is enabled. All kprobes are optimized if possible.
[ 9.081139][ T1] cryptd: max_cpu_qlen set to 1000
[ 9.084322][ T1] ACPI: Added _OSI(Module Device)
[ 9.084863][ T1] ACPI: Added _OSI(Processor Device)
[ 9.085863][ T1] ACPI: Added _OSI(3.0 _SCP Extensions)
[ 9.086884][ T1] ACPI: Added _OSI(Processor Aggregator Device)
[ 9.087877][ T1] ACPI: Added _OSI(Linux-Dell-Video)
[ 9.088878][ T1] ACPI: Added _OSI(Linux-Lenovo-NV-HDMI-Audio)
[ 9.089875][ T1] ACPI: Added _OSI(Linux-HPI-Hybrid-Graphics)
[ 9.110451][ T1] ACPI: 1 ACPI AML tables successfully acquired and loaded
[ 9.116752][ T1] ACPI: Interpreter enabled
[ 9.117891][ T1] ACPI: PM: (supports S0 S5)
[ 9.118938][ T9] Callback from call_rcu_tasks_rude() invoked.
[ 9.119891][ T1] ACPI: Using PIC for interrupt routing
[ 9.121986][ T1] PCI: Using host bridge windows from ACPI; if necessary, use "pci=nocrs" and report a bug
[ 9.124609][ T1] ACPI: Enabled 2 GPEs in block 00 to 0F
[ 9.165558][ T1] ACPI: PCI Root Bridge [PCI0] (domain 0000 [bus 00-ff])
[ 9.165893][ T1] acpi PNP0A03:00: _OSC: OS supports [ASPM ClockPM Segments HPX-Type3]
[ 9.166868][ T1] acpi PNP0A03:00: _OSC: not requesting OS control; OS requires [ExtendedConfig ASPM ClockPM MSI]
[ 9.172054][ T1] PCI host bridge to bus 0000:00
[ 9.172871][ T1] pci_bus 0000:00: root bus resource [io 0x0000-0x0cf7 window]
[ 9.173873][ T1] pci_bus 0000:00: root bus resource [io 0x0d00-0xffff window]
[ 9.174874][ T1] pci_bus 0000:00: root bus resource [mem 0x000a0000-0x000bffff window]
[ 9.176868][ T1] pci_bus 0000:00: root bus resource [mem 0xc0000000-0xfebfffff window]
[ 9.177876][ T1] pci_bus 0000:00: root bus resource [mem 0x440000000-0x4bfffffff window]
[ 9.178876][ T1] pci_bus 0000:00: root bus resource [bus 00-ff]
[ 9.180043][ T1] pci 0000:00:00.0: [8086:1237] type 00 class 0x060000
[ 9.182528][ T1] pci 0000:00:01.0: [8086:7000] type 00 class 0x060100
[ 9.184489][ T1] pci 0000:00:01.1: [8086:7010] type 00 class 0x010180
[ 9.189281][ T1] pci 0000:00:01.1: reg 0x20: [io 0xc040-0xc04f]
[ 9.191862][ T1] pci 0000:00:01.1: legacy IDE quirk: reg 0x10: [io 0x01f0-0x01f7]
[ 9.192864][ T1] pci 0000:00:01.1: legacy IDE quirk: reg 0x14: [io 0x03f6]
[ 9.193866][ T1] pci 0000:00:01.1: legacy IDE quirk: reg 0x18: [io 0x0170-0x0177]
[ 9.194864][ T1] pci 0000:00:01.1: legacy IDE quirk: reg 0x1c: [io 0x0376]
[ 9.197285][ T1] pci 0000:00:01.3: [8086:7113] type 00 class 0x068000


To reproduce:

# build kernel
cd linux
cp config-5.17.0-rc5-00051-g3eba0505d03a .config
make HOSTCC=gcc-9 CC=gcc-9 ARCH=i386 olddefconfig prepare modules_prepare bzImage modules
make HOSTCC=gcc-9 CC=gcc-9 ARCH=i386 INSTALL_MOD_PATH=<mod-install-dir> modules_install
cd <mod-install-dir>
find lib/ | cpio -o -H newc --quiet | gzip > modules.cgz


git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp qemu -k <bzImage> -m modules.cgz job-script # job-script is attached in this email

# if come across any failure that blocks the test,
# please remove ~/.lkp and /lkp dir to run from a clean state.



---
0-DAY CI Kernel Test Service
https://lists.01.org/hyperkitty/list/[email protected]

Thanks,
Oliver Sang


Attachments:
(No filename) (16.96 kB)
config-5.17.0-rc5-00051-g3eba0505d03a (143.21 kB)
job-script (4.67 kB)
dmesg.xz (14.73 kB)
Download all attachments