2021-07-12 14:16:34

by kernel test robot

[permalink] [raw]
Subject: [serial] 75f4e830fa: WARNING:inconsistent_lock_state



Greeting,

FYI, we noticed the following commit (built with clang-13):

commit: 75f4e830fa9c47637054a3b7201765f2a314bda2 ("serial: do not restore interrupt state in sysrq helper")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master


in testcase: boot

on test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G

caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):


+-------------------------------------------------+------------+------------+
| | 86eb032675 | 75f4e830fa |
+-------------------------------------------------+------------+------------+
| boot_failures | 0 | 10 |
| WARNING:inconsistent_lock_state | 0 | 10 |
| inconsistent{IN-HARDIRQ-W}->{HARDIRQ-ON-W}usage | 0 | 10 |
+-------------------------------------------------+------------+------------+


If you fix the issue, kindly add following tag
Reported-by: kernel test robot <[email protected]>


[ 207.990323] WARNING: inconsistent lock state
[ 207.990819] 5.12.0-rc6+ #1 Not tainted
[ 207.991260] --------------------------------
[ 207.991754] inconsistent {IN-HARDIRQ-W} -> {HARDIRQ-ON-W} usage.
[ 207.992461] rngd/341 [HC0[0]:SC1[1]:HE1:SE0] takes:
[ 207.993040] ffffffff90c37a50 (&port_lock_key){?.-.}-{2:2}, at: serial8250_handle_irq+0x1a/0xdb
[ 207.994026] {IN-HARDIRQ-W} state was registered at:
[ 207.994592] lock_acquire+0x1a1/0x2ab
[ 207.995032] _raw_spin_lock_irqsave+0x6b/0xbd
[ 207.995540] serial8250_console_write+0x78/0x367
[ 207.996083] console_unlock+0x426/0x4fe
[ 207.996530] vprintk_emit+0xc7/0x12f
[ 207.996949] printk+0x54/0x6e
[ 207.997318] print_local_APIC+0x55/0x2cd
[ 207.997770] flush_smp_call_function_queue+0xeb/0x297
[ 207.998348] __sysvec_call_function_single+0xf0/0x1f4
[ 207.998920] sysvec_call_function_single+0x87/0xa3
[ 207.999477] asm_sysvec_call_function_single+0x12/0x20
[ 208.000060] default_idle+0x12/0x13
[ 208.000469] default_idle_call+0x4f/0x86
[ 208.000926] do_idle+0xd3/0x1ec
[ 208.001312] cpu_startup_entry+0x14/0x16
[ 208.001767] start_kernel+0x3e1/0x45f
[ 208.002196] secondary_startup_64_no_verify+0xb0/0xbb
[ 208.002769] irq event stamp: 20208
[ 208.003153] hardirqs last enabled at (20208): [<ffffffff8e745b95>] _raw_spin_unlock_irq+0x1f/0x3e
[ 208.004143] hardirqs last disabled at (20207): [<ffffffff8e745a10>] _raw_spin_lock_irq+0x3b/0xb4
[ 208.005145] softirqs last enabled at (20186): [<ffffffff8ea004be>] __do_softirq+0x4be/0x50b
[ 208.006073] softirqs last disabled at (20203): [<ffffffff8dce512b>] __irq_exit_rcu+0xb7/0xbe
[ 208.007012]
[ 208.007012] other info that might help us debug this:
[ 208.007748] Possible unsafe locking scenario:
[ 208.007748]
[ 208.008411] CPU0
[ 208.008682] ----
[ 208.008955] lock(&port_lock_key);
[ 208.009398] <Interrupt>
[ 208.009703] lock(&port_lock_key);
[ 208.010122]
[ 208.010122] *** DEADLOCK ***
[ 208.010122]
[ 208.010790] 1 lock held by rngd/341:
[ 208.011190] #0: ffff9e8ec0e6fd80 ((&up->timer)){+.-.}-{0:0}, at: call_timer_fn+0x48/0x2eb
[ 208.012101]
[ 208.012101] stack backtrace:
[ 208.012579] CPU: 1 PID: 341 Comm: rngd Not tainted 5.12.0-rc6+ #1
[ 208.013253] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
[ 208.014128] Call Trace:
[ 208.014406] dump_stack+0x8b/0xdc
[ 208.014776] mark_lock_irq+0x5b9/0x6f3
[ 208.015175] ? stack_trace_save+0x50/0x6e
[ 208.015615] ? save_trace+0x4d/0x30d
[ 208.016001] mark_lock+0x121/0x1a3
[ 208.016378] __lock_acquire+0x645/0x2ca8
[ 208.016807] ? lockdep_unlock+0x8f/0x113
[ 208.017256] ? __lock_acquire+0x133e/0x2ca8
[ 208.017707] ? lock_is_held_type+0x102/0x15d
[ 208.018175] lock_acquire+0x1a1/0x2ab
[ 208.018590] ? serial8250_handle_irq+0x1a/0xdb
[ 208.019072] ? serial8250_handle_irq+0x1a/0xdb
[ 208.019560] _raw_spin_lock+0x34/0x67
[ 208.019956] ? serial8250_handle_irq+0x1a/0xdb
[ 208.020444] ? univ8250_console_match+0x130/0x130
[ 208.020951] serial8250_handle_irq+0x1a/0xdb
[ 208.021444] ? univ8250_console_match+0x130/0x130
[ 208.021950] serial8250_default_handle_irq+0x3b/0x4a
[ 208.022492] serial8250_timeout+0x17/0x42
[ 208.022933] call_timer_fn+0x145/0x2eb
[ 208.023349] ? univ8250_console_match+0x130/0x130
[ 208.023854] run_timer_softirq+0x281/0x33e
[ 208.024308] __do_softirq+0x28f/0x50b
[ 208.024716] ? asm_sysvec_apic_timer_interrupt+0xa/0x20
[ 208.025389] __irq_exit_rcu+0xb7/0xbe
[ 208.025817] irq_exit_rcu+0x5/0x19
[ 208.026211] sysvec_apic_timer_interrupt+0x3e/0xa3
[ 208.026781] asm_sysvec_apic_timer_interrupt+0x12/0x20
[ 208.027378] RIP: 0033:0x40673a
[ 208.027731] Code: 8b 40 48 01 d0 8d 50 ff 48 8b 45 d8 89 50 40 48 8b 45 d8 8b 40 40 ba 00 00 00 00 f7 75 e4 48 8b 45 d8 89 50 40 48 83 45 e8 01 <48> 8b 45 d8 8b 40 4c 89 c2 48 8b 45 f0 48 01 d0 48 3b 45 e8 77 8b
[ 208.029784] RSP: 002b:00007ffc3aa56810 EFLAGS: 00000206
[ 208.030352] RAX: 0000000000c15040 RBX: 0000000000000000 RCX: 0000000000000000
[ 208.031132] RDX: 00000000000004d8 RSI: 0000000000000007 RDI: 000000002f97259d
[ 208.031925] RBP: 00007ffc3aa56840 R08: 0000003166008564 R09: 00007ffc3abdf080
[ 208.032719] R10: 00007ffc3abdf0a8 R11: 00000000000050bc R12: 0000000000000000
[ 208.033508] R13: 00007ffc3aa56a40 R14: 0000000000000000 R15: 0000000000000000
[ 208.105369] Kernel tests: Boot OK!
[ 208.105380]
[ 208.634936] /lkp/lkp/src/bin/run-lkp
[ 208.634945]
[ 208.806327] RESULT_ROOT=/result/boot/300/vm-snb/aliyun-x86_64-20190626.cgz/x86_64-randconfig-a004-20210711/clang-13/75f4e830fa9c47637054a3b7201765f2a314bda2/15
[ 208.806339]
[ 208.837522] random: crng init done
[ 208.837945] random: 7 urandom warning(s) missed due to ratelimiting
[ 208.844355] job=/lkp/jobs/scheduled/vm-snb-170/boot-300-aliyun-x86_64-20190626.cgz-75f4e830fa9c47637054a3b7201765f2a314bda2-20210712-32373-9ohlkk-13.yaml
[ 208.844367]
[ 209.168997] process 'local/aegis/aegis_update/AliYunDunUpdate' started with executable stack
[ 209.274626] result_service: raw_upload, RESULT_MNT: /internal-lkp-server/result, RESULT_ROOT: /internal-lkp-server/result/boot/300/vm-snb/aliyun-x86_64-20190626.cgz/x86_64-randconfig-a004-20210711/clang-13/75f4e830fa9c47637054a3b7201765f2a314bda2/15, TMP_RESULT_ROOT: /tmp/lkp/result
[ 209.274995]
[ 209.327376] run-job /lkp/jobs/scheduled/vm-snb-170/boot-300-aliyun-x86_64-20190626.cgz-75f4e830fa9c47637054a3b7201765f2a314bda2-20210712-32373-9ohlkk-13.yaml
[ 209.327388]
[ 210.153970] /usr/bin/wget -q --timeout=1800 --tries=1 --local-encoding=UTF-8 http://internal-lkp-server:80/~lkp/cgi-bin/lkp-jobfile-append-var?job_file=/lkp/jobs/scheduled/vm-snb-170/boot-300-aliyun-x86_64-20190626.cgz-75f4e830fa9c47637054a3b7201765f2a314bda2-20210712-32373-9ohlkk-13.yaml&job_state=running -O /dev/null
[ 210.153982]


To reproduce:

# build kernel
cd linux
cp config-5.12.0-rc6+ .config
make HOSTCC=clang-13 CC=clang-13 ARCH=x86_64 olddefconfig prepare modules_prepare bzImage

git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp qemu -k <bzImage> job-script # job-script is attached in this email



---
0DAY/LKP+ Test Infrastructure Open Source Technology Center
https://lists.01.org/hyperkitty/list/[email protected] Intel Corporation

Thanks,
Oliver Sang


Attachments:
(No filename) (7.48 kB)
config-5.12.0-rc6+ (219.76 kB)
job-script (4.50 kB)
dmesg.xz (16.94 kB)
Download all attachments

2021-07-12 14:41:11

by Johan Hovold

[permalink] [raw]
Subject: Re: [serial] 75f4e830fa: WARNING:inconsistent_lock_state

On Mon, Jul 12, 2021 at 10:34:14PM +0800, kernel test robot wrote:

> Greeting,
>
> FYI, we noticed the following commit (built with clang-13):
>
> commit: 75f4e830fa9c47637054a3b7201765f2a314bda2 ("serial: do not restore interrupt state in sysrq helper")
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

> [ 208.007748] Possible unsafe locking scenario:
> [ 208.007748]
> [ 208.008411] CPU0
> [ 208.008682] ----
> [ 208.008955] lock(&port_lock_key);
> [ 208.009398] <Interrupt>
> [ 208.009703] lock(&port_lock_key);
> [ 208.010122]
> [ 208.010122] *** DEADLOCK ***
> [ 208.010122]
> [ 208.010790] 1 lock held by rngd/341:
> [ 208.011190] #0: ffff9e8ec0e6fd80 ((&up->timer)){+.-.}-{0:0}, at: call_timer_fn+0x48/0x2eb
> [ 208.012101]
> [ 208.012101] stack backtrace:
> [ 208.012579] CPU: 1 PID: 341 Comm: rngd Not tainted 5.12.0-rc6+ #1
> [ 208.013253] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
> [ 208.014128] Call Trace:
> [ 208.014406] dump_stack+0x8b/0xdc
> [ 208.014776] mark_lock_irq+0x5b9/0x6f3
> [ 208.015175] ? stack_trace_save+0x50/0x6e
> [ 208.015615] ? save_trace+0x4d/0x30d
> [ 208.016001] mark_lock+0x121/0x1a3
> [ 208.016378] __lock_acquire+0x645/0x2ca8
> [ 208.016807] ? lockdep_unlock+0x8f/0x113
> [ 208.017256] ? __lock_acquire+0x133e/0x2ca8
> [ 208.017707] ? lock_is_held_type+0x102/0x15d
> [ 208.018175] lock_acquire+0x1a1/0x2ab
> [ 208.018590] ? serial8250_handle_irq+0x1a/0xdb
> [ 208.019072] ? serial8250_handle_irq+0x1a/0xdb
> [ 208.019560] _raw_spin_lock+0x34/0x67
> [ 208.019956] ? serial8250_handle_irq+0x1a/0xdb
> [ 208.020444] ? univ8250_console_match+0x130/0x130
> [ 208.020951] serial8250_handle_irq+0x1a/0xdb
> [ 208.021444] ? univ8250_console_match+0x130/0x130
> [ 208.021950] serial8250_default_handle_irq+0x3b/0x4a

Bah, the interrupt handler can end up being called by a timer callback
in some setups.

I'll prepare a patch.

> [ 208.022492] serial8250_timeout+0x17/0x42
> [ 208.022933] call_timer_fn+0x145/0x2eb
> [ 208.023349] ? univ8250_console_match+0x130/0x130
> [ 208.023854] run_timer_softirq+0x281/0x33e
> [ 208.024308] __do_softirq+0x28f/0x50b
> [ 208.024716] ? asm_sysvec_apic_timer_interrupt+0xa/0x20
> [ 208.025389] __irq_exit_rcu+0xb7/0xbe
> [ 208.025817] irq_exit_rcu+0x5/0x19
> [ 208.026211] sysvec_apic_timer_interrupt+0x3e/0xa3
> [ 208.026781] asm_sysvec_apic_timer_interrupt+0x12/0x20

Thanks for the report.

Johan