2022-02-14 02:53:56

by Guanghui Feng

[permalink] [raw]
Subject: [PATCH] printk: fix softlockup/rcu stall warning without setting CONFIG_PREEMPTION

If without setting CONFIG_PREEMPTION, and there are too many messages
only in one cpu which will set console_may_schedule to 0 without cond_resched,
we will find softlockup/rcu stall warning. Although the timer interrupt
can be raised in the routine of console_unlock, the irq interrupt return
to user/kernel will't schedule other tasks(without setting CONFIG_PREEMPTION).
So I add need_resched() for cond_schedule to avoid it.

Call trace:
console_unlock+0x220/0x420
vprintk_emit+0x17c/0x1ac
vprintk_default+0x3c/0x44
vprintk+0x38/0x70
printk+0x64/0x88
dump_task.part.0+0xc4/0xe0
dump_task+0x70/0x74
dump_tasks+0x78/0x90
dump_global_header+0xcc/0xe8
oom_kill_process+0x258/0x274
out_of_memory.part.0+0xb0/0x33c
out_of_memory+0x4c/0xa0
__alloc_pages_may_oom+0x11c/0x1a0
__alloc_pages_slowpath.constprop.0+0x4c0/0x75c
__alloc_pages_nodemask+0x2b4/0x310
alloc_pages_current+0x8c/0x140
get_zeroed_page+0x20/0x50
__pud_alloc+0x40/0x190
copy_pud_range+0x264/0x280
copy_page_range+0xe8/0x204
dup_mmap+0x334/0x434
dup_mm+0x64/0x11c
copy_process+0x5e0/0x11a0
kernel_clone+0x94/0x364
__do_sys_clone+0x54/0x80
__arm64_sys_clone+0x24/0x30
el0_svc_common.constprop.0+0x7c/0x210
do_el0_svc+0x74/0x90
el0_svc+0x24/0x60
el0_sync_handler+0xa8/0xb0
el0_sync+0x140/0x180

Signed-off-by: Guanghui Feng <[email protected]>
---
kernel/printk/printk.c | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 82abfaf..54b6cf2 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -2716,7 +2716,11 @@ void console_unlock(void)
if (handover)
return;

+#ifndef CONFIG_PREEMPTION
+ if (do_cond_resched || need_resched())
+#else
if (do_cond_resched)
+#endif
cond_resched();
}

--
1.8.3.1


2022-02-14 08:33:05

by Oliver Sang

[permalink] [raw]
Subject: [printk] 94fb1f214c: BUG:sleeping_function_called_from_invalid_context_at_kernel/printk/printk.c



Greeting,

FYI, we noticed the following commit (built with gcc-9):

commit: 94fb1f214c6f04ef6e7395a31dc0e22862593982 ("[PATCH] printk: fix softlockup/rcu stall warning without setting CONFIG_PREEMPTION")
url: https://github.com/0day-ci/linux/commits/Guanghui-Feng/printk-fix-softlockup-rcu-stall-warning-without-setting-CONFIG_PREEMPTION/20220212-212908
base: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git 83e396641110663d3c7bb25b9bc0c6a750359ecf
patch link: https://lore.kernel.org/lkml/[email protected]

in testcase: blktests
version: blktests-x86_64-f51ee53-1_20220128
with following parameters:

disk: 1HDD
test: block-group-03
ucode: 0xe2



on test machine: 4 threads Intel(R) Core(TM) i5-6500 CPU @ 3.20GHz with 32G memory

caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):


+-----------------------------------------------------------------------------+------------+------------+
| | 83e3966411 | 94fb1f214c |
+-----------------------------------------------------------------------------+------------+------------+
| boot_successes | 27 | 0 |
| boot_failures | 0 | 8 |
| BUG:sleeping_function_called_from_invalid_context_at_kernel/printk/printk.c | 0 | 8 |
+-----------------------------------------------------------------------------+------------+------------+


If you fix the issue, kindly add following tag
Reported-by: kernel test robot <[email protected]>


[ 3.106625][ T1] BUG: sleeping function called from invalid context at kernel/printk/printk.c:2724
[ 3.106626][ T1] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 1, name: swapper/0
[ 3.106628][ T1] preempt_count: 1, expected: 0
[ 3.106630][ T1] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.17.0-rc3-00248-g94fb1f214c6f #1
[ 3.106633][ T1] Hardware name: Dell Inc. OptiPlex 7040/0Y7WYT, BIOS 1.8.1 12/05/2017
[ 3.106634][ T1] Call Trace:
[ 3.106635][ T1] <TASK>
[ 3.106637][ T1] dump_stack_lvl (lib/dump_stack.c:107)
[ 3.106642][ T1] __might_resched.cold (kernel/sched/core.c:9577 kernel/sched/core.c:9530)
[ 3.106646][ T1] console_unlock (include/linux/sched.h:2031 kernel/printk/printk.c:2724)
[ 3.106649][ T1] ? do_syslog (kernel/printk/printk.c:2088)
[ 3.106653][ T1] vprintk_emit (arch/x86/include/asm/preempt.h:85 kernel/printk/printk.c:2246)
[ 3.106655][ T1] _printk (kernel/printk/printk.c:2261)
[ 3.106658][ T1] ? record_print_text.cold (kernel/printk/printk.c:2261)
[ 3.106662][ T1] ? _raw_read_unlock_irqrestore (kernel/locking/spinlock.c:161)
[ 3.106666][ T1] x86_pmu_show_pmu_cap (arch/x86/events/core.c:2044)
[ 3.106670][ T1] ? map_vsyscall (arch/x86/events/core.c:2072)
[ 3.106674][ T1] init_hw_perf_events (arch/x86/events/core.c:2137)
[ 3.106677][ T1] ? map_vsyscall (arch/x86/events/core.c:2072)
[ 3.106680][ T1] do_one_initcall (init/main.c:1300)
[ 3.106683][ T1] ? perf_trace_initcall_level (init/main.c:1291)
[ 3.106686][ T1] ? migrate_swap_stop (kernel/sched/core.c:3985)
[ 3.106689][ T1] ? cpumask_next (lib/cpumask.c:23)
[ 3.106692][ T1] ? cblist_init_generic (kernel/rcu/tasks.h:254)
[ 3.106696][ T1] kernel_init_freeable (init/main.c:1416 init/main.c:1602)
[ 3.106700][ T1] ? console_on_rootfs (init/main.c:1584)
[ 3.106703][ T1] ? _raw_spin_lock_irq (arch/x86/include/asm/atomic.h:202 include/linux/atomic/atomic-instrumented.h:543 include/asm-generic/qspinlock.h:82 include/linux/spinlock.h:185 include/linux/spinlock_api_smp.h:120 kernel/locking/spinlock.c:170)
[ 3.106705][ T1] ? _raw_spin_lock (kernel/locking/spinlock.c:169)
[ 3.106707][ T1] ? rest_init (init/main.c:1494)
[ 3.106711][ T1] kernel_init (init/main.c:1504)
[ 3.106714][ T1] ret_from_fork (arch/x86/entry/entry_64.S:301)
[ 3.106717][ T1] </TASK>
[ 3.138625][ T1] ... bit width: 48
[ 3.139625][ T1] ... generic registers: 8
[ 3.140625][ T1] ... value mask: 0000ffffffffffff
[ 3.141625][ T1] ... max period: 00007fffffffffff
[ 3.142625][ T1] ... fixed-purpose events: 3
[ 3.143625][ T1] ... event mask: 00000007000000ff
[ 3.144779][ T1] rcu: Hierarchical SRCU implementation.
[ 3.147112][ T5] NMI watchdog: Enabled. Permanently consumes one hw-PMU counter.
[ 3.147800][ T1] smp: Bringing up secondary CPUs ...
[ 3.148897][ T1] x86: Booting SMP configuration:
[ 3.149628][ T1] .... node #0, CPUs: #1
[ 1.042475][ T0] masked ExtINT on CPU#1
[ 3.158925][ T1] #2
[ 1.042475][ T0] masked ExtINT on CPU#2
[ 3.165913][ T1] #3
[ 1.042475][ T0] masked ExtINT on CPU#3
[ 3.172883][ T1] smp: Brought up 1 node, 4 CPUs
[ 3.173626][ T1] smpboot: Max logical packages: 1
[ 3.174625][ T1] smpboot: Total of 4 processors activated (25599.84 BogoMIPS)
[ 3.263787][ T32] node 0 deferred pages initialised in 87ms
[ 3.271136][ T1] devtmpfs: initialized
[ 3.271866][ T1] x86/mm: Memory block size: 128MB
[ 3.290254][ T1] ACPI: PM: Registering ACPI NVS region [mem 0xcb120000-0xcb120fff] (4096 bytes)
[ 3.290631][ T1] ACPI: PM: Registering ACPI NVS region [mem 0xdb773000-0xdbf25fff] (8073216 bytes)
[ 3.294342][ T1] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 1911260446275000 ns
[ 3.294629][ T1] futex hash table entries: 1024 (order: 4, 65536 bytes, linear)
[ 3.295764][ T1] pinctrl core: initialized pinctrl subsystem
[ 3.297107][ T1] NET: Registered PF_NETLINK/PF_ROUTE protocol family
[ 3.298360][ T1] audit: initializing netlink subsys (disabled)
[ 3.298652][ T39] audit: type=2000 audit(1512774778.254:1): state=initialized audit_enabled=0 res=1
[ 3.299161][ T1] thermal_sys: Registered thermal governor 'fair_share'
[ 3.299628][ T1] thermal_sys: Registered thermal governor 'bang_bang'
[ 3.300625][ T1] thermal_sys: Registered thermal governor 'step_wise'
[ 3.301625][ T1] thermal_sys: Registered thermal governor 'user_space'
[ 3.302649][ T1] cpuidle: using governor menu
[ 3.304870][ T1] ACPI FADT declares the system doesn't support PCIe ASPM, so disable it
[ 3.305627][ T1] acpiphp: ACPI Hot Plug PCI Controller Driver version: 0.5
[ 3.306912][ T1] PCI: MMCONFIG for domain 0000 [bus 00-3f] at [mem 0xf8000000-0xfbffffff] (base 0xf8000000)
[ 3.307626][ T1] PCI: MMCONFIG at [mem 0xf8000000-0xfbffffff] reserved in E820
[ 3.310078][ T1] PCI: Using configuration type 1 for base access
[ 3.323930][ T1] kprobes: kprobe jump-optimization is enabled. All kprobes are optimized if possible.
[ 3.324706][ T1] HugeTLB registered 1.00 GiB page size, pre-allocated 0 pages
[ 3.325648][ T1] HugeTLB registered 2.00 MiB page size, pre-allocated 0 pages
[ 3.326768][ T1] cryptd: max_cpu_qlen set to 1000
[ 3.327890][ T1] ACPI: Added _OSI(Module Device)
[ 3.328768][ T1] ACPI: Added _OSI(Processor Device)
[ 3.329625][ T1] ACPI: Added _OSI(3.0 _SCP Extensions)
[ 3.330625][ T1] ACPI: Added _OSI(Processor Aggregator Device)
[ 3.331630][ T1] ACPI: Added _OSI(Linux-Dell-Video)
[ 3.332629][ T1] ACPI: Added _OSI(Linux-Lenovo-NV-HDMI-Audio)
[ 3.333629][ T1] ACPI: Added _OSI(Linux-HPI-Hybrid-Graphics)
[ 3.693287][ T1] ACPI: 8 ACPI AML tables successfully acquired and loaded
[ 3.713410][ T1] ACPI: [Firmware Bug]: BIOS _OSI(Linux) query ignored
[ 3.739712][ T1] ACPI: Dynamic OEM Table Load:
[ 3.740633][ T1] ACPI: SSDT 0xFFFF888102ABF000 0006A2 (v02 PmRef Cpu0Ist 00003000 INTL 20120913)
[ 3.751345][ T1] ACPI: _PR_.CPU0: _OSC native thermal LVT Acked
[ 3.771178][ T1] ACPI: Dynamic OEM Table Load:
[ 3.771631][ T1] ACPI: SSDT 0xFFFF888101D28800 00037F (v02 PmRef Cpu0Cst 00003001 INTL 20120913)
[ 3.777097][ T1] ACPI: Dynamic OEM Table Load:
[ 3.777630][ T1] ACPI: SSDT 0xFFFF888102053300 00008E (v02 PmRef Cpu0Hwp 00003000 INTL 20120913)
[ 3.780935][ T1] ACPI: Dynamic OEM Table Load:
[ 3.781629][ T1] ACPI: SSDT 0xFFFF88810123EC00 000130 (v02 PmRef HwpLvt 00003000 INTL 20120913)
[ 3.789228][ T1] ACPI: Dynamic OEM Table Load:
[ 3.789640][ T1] ACPI: SSDT 0xFFFF888102B06000 0005AA (v02 PmRef ApIst 00003000 INTL 20120913)
[ 3.803937][ T1] ACPI: Dynamic OEM Table Load:
[ 3.804630][ T1] ACPI: SSDT 0xFFFF88810123D800 000119 (v02 PmRef ApHwp 00003000 INTL 20120913)
[ 3.808782][ T1] ACPI: Dynamic OEM Table Load:
[ 3.809630][ T1] ACPI: SSDT 0xFFFF88810123E000 000119 (v02 PmRef ApCst 00003000 INTL 20120913)
[ 3.849846][ T1] ACPI: Interpreter enabled
[ 3.850722][ T1] ACPI: PM: (supports S0 S3 S4 S5)
[ 3.851626][ T1] ACPI: Using IOAPIC for interrupt routing
[ 3.852833][ T1] PCI: Using host bridge windows from ACPI; if necessary, use "pci=nocrs" and report a bug
[ 3.867137][ T1] ACPI: Enabled 7 GPEs in block 00 to 7F
[ 3.893176][ T1] ACPI: PM: Power Resource [PG00]
[ 3.899374][ T1] ACPI: PM: Power Resource [PG01]
[ 3.905014][ T1] ACPI: PM: Power Resource [PG02]
[ 3.939648][ T1] ACPI: PM: Power Resource [WRST]
[ 3.944407][ T1] ACPI: PM: Power Resource [WRST]
[ 3.948404][ T1] ACPI: PM: Power Resource [WRST]
[ 3.952401][ T1] ACPI: PM: Power Resource [WRST]
[ 3.956397][ T1] ACPI: PM: Power Resource [WRST]
[ 3.960402][ T1] ACPI: PM: Power Resource [WRST]
[ 3.964395][ T1] ACPI: PM: Power Resource [WRST]
[ 3.968403][ T1] ACPI: PM: Power Resource [WRST]
[ 3.972400][ T1] ACPI: PM: Power Resource [WRST]
[ 3.976402][ T1] ACPI: PM: Power Resource [WRST]
[ 3.980396][ T1] ACPI: PM: Power Resource [WRST]
[ 3.984401][ T1] ACPI: PM: Power Resource [WRST]
[ 3.988400][ T1] ACPI: PM: Power Resource [WRST]
[ 3.992398][ T1] ACPI: PM: Power Resource [WRST]
[ 3.996395][ T1] ACPI: PM: Power Resource [WRST]
[ 4.000397][ T1] ACPI: PM: Power Resource [WRST]
[ 4.004394][ T1] ACPI: PM: Power Resource [WRST]
[ 4.008395][ T1] ACPI: PM: Power Resource [WRST]
[ 4.012397][ T1] ACPI: PM: Power Resource [WRST]
[ 4.016395][ T1] ACPI: PM: Power Resource [WRST]
[ 4.186388][ T1] ACPI: PM: Power Resource [FN00]
[ 4.187769][ T1] ACPI: PM: Power Resource [FN01]
[ 4.189766][ T1] ACPI: PM: Power Resource [FN02]
[ 4.191758][ T1] ACPI: PM: Power Resource [FN03]
[ 4.193765][ T1] ACPI: PM: Power Resource [FN04]
[ 4.206063][ T1] ACPI: PCI Root Bridge [PCI0] (domain 0000 [bus 00-3e])
[ 4.206643][ T1] acpi PNP0A08:00: _OSC: OS supports [ExtendedConfig ASPM ClockPM Segments MSI HPX-Type3]
[ 4.208336][ T1] acpi PNP0A08:00: _OSC: OS requested [PCIeHotplug SHPCHotplug PME AER PCIeCapability LTR]
[ 4.208627][ T1] acpi PNP0A08:00: _OSC: platform willing to grant [PCIeHotplug SHPCHotplug PME AER PCIeCapability LTR]
[ 4.209625][ T1] acpi PNP0A08:00: _OSC: platform retains control of PCIe features (AE_ERROR)
[ 4.223535][ T1] PCI host bridge to bus 0000:00
[ 4.223629][ T1] pci_bus 0000:00: root bus resource [io 0x0000-0x0cf7 window]
[ 4.224628][ T1] pci_bus 0000:00: root bus resource [io 0x0d00-0xffff window]
[ 4.225627][ T1] pci_bus 0000:00: root bus resource [mem 0x000a0000-0x000bffff window]
[ 4.226627][ T1] pci_bus 0000:00: root bus resource [mem 0xe0000000-0xf7ffffff window]
[ 4.227627][ T1] pci_bus 0000:00: root bus resource [mem 0xfd000000-0xfe7fffff window]
[ 4.228628][ T1] pci_bus 0000:00: root bus resource [bus 00-3e]
[ 4.230460][ T1] pci 0000:00:00.0: [8086:191f] type 00 class 0x060000
[ 4.230879][ T1] pci 0000:00:01.0: [8086:1901] type 01 class 0x060400
[ 4.231699][ T1] pci 0000:00:01.0: PME# supported from D0 D3hot D3cold
[ 4.241259][ T1] pci 0000:00:02.0: [8086:1912] type 00 class 0x030000
[ 4.241632][ T1] pci 0000:00:02.0: reg 0x10: [mem 0xf6000000-0xf6ffffff 64bit]
[ 4.242629][ T1] pci 0000:00:02.0: reg 0x18: [mem 0xe0000000-0xefffffff 64bit pref]


To reproduce:

git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
sudo bin/lkp install job.yaml # job file is attached in this email
bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run
sudo bin/lkp run generated-yaml-file

# if come across any failure that blocks the test,
# please remove ~/.lkp and /lkp dir to run from a clean state.



---
0DAY/LKP+ Test Infrastructure Open Source Technology Center
https://lists.01.org/hyperkitty/list/[email protected] Intel Corporation

Thanks,
Oliver Sang


Attachments:
(No filename) (12.91 kB)
config-5.17.0-rc3-00248-g94fb1f214c6f (181.74 kB)
job-script (5.79 kB)
dmesg.xz (25.44 kB)
blktests (1.27 kB)
job.yaml (4.69 kB)
reproduce (94.00 B)
Download all attachments

2022-02-14 08:57:45

by Sergey Senozhatsky

[permalink] [raw]
Subject: Re: [PATCH] printk: fix softlockup/rcu stall warning without setting CONFIG_PREEMPTION

On (22/02/12 21:27), Guanghui Feng wrote:
> console_unlock+0x220/0x420
> vprintk_emit+0x17c/0x1ac
> vprintk_default+0x3c/0x44
> vprintk+0x38/0x70
> printk+0x64/0x88
> dump_task.part.0+0xc4/0xe0
> dump_task+0x70/0x74
> dump_tasks+0x78/0x90
> dump_global_header+0xcc/0xe8
> oom_kill_process+0x258/0x274
> out_of_memory.part.0+0xb0/0x33c
> out_of_memory+0x4c/0xa0
> __alloc_pages_may_oom+0x11c/0x1a0
> __alloc_pages_slowpath.constprop.0+0x4c0/0x75c
> __alloc_pages_nodemask+0x2b4/0x310
> alloc_pages_current+0x8c/0x140
> get_zeroed_page+0x20/0x50
> __pud_alloc+0x40/0x190
> copy_pud_range+0x264/0x280
> copy_page_range+0xe8/0x204
> dup_mmap+0x334/0x434
> dup_mm+0x64/0x11c
> copy_process+0x5e0/0x11a0
> kernel_clone+0x94/0x364
> __do_sys_clone+0x54/0x80
> __arm64_sys_clone+0x24/0x30
> el0_svc_common.constprop.0+0x7c/0x210
> do_el0_svc+0x74/0x90
> el0_svc+0x24/0x60
> el0_sync_handler+0xa8/0xb0
> el0_sync+0x140/0x180

[..]

> @@ -2716,7 +2716,11 @@ void console_unlock(void)
> if (handover)
> return;
>
> +#ifndef CONFIG_PREEMPTION
> + if (do_cond_resched || need_resched())
> +#else
> if (do_cond_resched)
> +#endif
> cond_resched();

console_unlock() can be called from atomic context, it should not schedule
when it cannot do so. That's what console_may_schedule indicates.