2020-10-13 16:09:57

by Naresh Kamboju

[permalink] [raw]
Subject: WARNING: kernel/irq/chip.c:242 __irq_startup+0xa8/0xb0

On stable rc 5.8.15 the following kernel warning was noticed once
while boot and this is hard to reproduce.

metadata:
git branch: linux-5.8.y
git repo: https://gitlab.com/Linaro/lkft/mirrors/stable/linux-stable-rc
git commit: f4ed6fb8f1680de812daf362f28342e6bf19fdcc
git describe: v5.8.14-125-gf4ed6fb8f168
make_kernelversion: 5.8.15-rc1
kernel-config:
https://builds.tuxbuild.com/5nFtomB9FDlQVjafKdyR9A/kernel.config

warning log:
----------------
[ 43.512935] ------------[ cut here ]------------
[ 43.517563] WARNING: CPU: 0 PID: 424 at kernel/irq/chip.c:242
__irq_startup+0xa8/0xb0
[ 43.517564] Modules linked in: caam error crct10dif_ce ina2xx lm90
qoriq_thermal fuse
[ 43.533218] CPU: 0 PID: 424 Comm: (agetty) Not tainted 5.8.15-rc1 #1
[ 43.533219] Hardware name: Freescale Layerscape 2088A RDB Board (DT)
[ 43.533223] pstate: 60000085 (nZCv daIf -PAN -UAO BTYPE=--)
[ 43.533225] pc : __irq_startup+0xa8/0xb0
[ 43.533229] lr : irq_startup+0x64/0x140
[ 43.559218] sp : ffff800010b838c0
[ 43.559219] x29: ffff800010b838c0 x28: ffff0082cd3bab00
[ 43.559222] x27: ffffb8c772ed8ec0 x26: 0000000000020902
[ 43.573133] x25: ffffb8c771d39e68 x24: ffffb8c771d39b08
[ 43.573135] x23: 0000000000000000 x22: 0000000000000001
[ 43.573138] x21: 0000000000000001 x20: ffff0082edf42c40
[ 43.573140] x19: ffff0082ede29200 x18: 0000000000000001
[ 43.573142] x17: 0000000000000000 x16: 0000000000000000
[ 43.573145] x15: ffff0082edf430c0 x14: ffff800010bf5000
[ 43.573147] x13: ffff800010bed000 x12: ffff800010be9000
[ 43.573149] x11: 0000000000000040 x10: ffffb8c772cdb940
[ 43.573153] x9 : ffffb8c770b7fe0c x8 : ffff0082ee000270
[ 43.620866] x7 : 0000000000000000 x6 : ffffb8c772cbaa08
[ 43.620869] x5 : ffffb8c772cba000 x4 : 0000000000000504
[ 43.620871] x3 : ffff0082ede29200 x2 : 0000000003032004
[ 43.636780] x1 : 0000000003032004 x0 : ffff0082ede29258
[ 43.636783] Call trace:
[ 43.636786] __irq_startup+0xa8/0xb0
[ 43.636789] irq_startup+0x64/0x140
[ 43.651569] __enable_irq+0x78/0x88
[ 43.651571] enable_irq+0x54/0xa8
[ 43.651577] serial8250_do_startup+0x670/0x718
[ 43.662791] serial8250_startup+0x30/0x40
[ 43.666793] uart_startup.part.0+0x12c/0x2e0
[ 43.671055] uart_port_activate+0x68/0xa0
[ 43.675058] tty_port_open+0x98/0x250
[ 43.678712] uart_open+0x24/0x38
[ 43.681932] tty_open+0x100/0x480
[ 43.685240] chrdev_open+0xac/0x1a8
[ 43.688721] do_dentry_open+0x130/0x3d0
[ 43.692548] vfs_open+0x34/0x40
[ 43.695681] path_openat+0x888/0xde8
[ 43.699247] do_filp_open+0x80/0x108
[ 43.702814] do_sys_openat2+0x1ec/0x2a8
[ 43.706642] do_sys_open+0x60/0xa8
[ 43.710035] __arm64_sys_openat+0x2c/0x38
[ 43.714037] el0_svc_common.constprop.0+0x7c/0x198
[ 43.718820] do_el0_svc+0x2c/0x98
[ 43.722128] el0_sync_handler+0x9c/0x1b8
[ 43.726041] el0_sync+0x158/0x180
[ 43.729347] ---[ end trace 434ed7c8787a1d1f ]---

full log,
https://qa-reports.linaro.org/lkft/linux-stable-rc-linux-5.8.y/build/v5.8.14-125-gf4ed6fb8f168/testrun/3297947/suite/linux-log-parser/test/check-kernel-warning-93863/log


--
Linaro LKFT
https://lkft.linaro.org


2020-11-18 06:16:32

by Naresh Kamboju

[permalink] [raw]
Subject: Re: WARNING: kernel/irq/chip.c:242 __irq_startup+0xa8/0xb0

On Tue, 13 Oct 2020 at 11:09, Naresh Kamboju <[email protected]> wrote:
>
> On stable rc 5.8.15 the following kernel warning was noticed once
> while boot and this is hard to reproduce.

This is now reproduciable on arm64 NXP ls2088 device

[ 19.980839] ------------[ cut here ]------------
[ 19.985468] WARNING: CPU: 1 PID: 441 at kernel/irq/chip.c:242
__irq_startup+0x9c/0xa8
[ 19.985472] Modules linked in: rfkill lm90 ina2xx crct10dif_ce
qoriq_thermal fuse
[ 20.000773] CPU: 1 PID: 441 Comm: (agetty) Not tainted 5.4.78-rc1 #2
[ 20.000775] Hardware name: Freescale Layerscape 2088A RDB Board (DT)
[ 20.000779] pstate: 60000085 (nZCv daIf -PAN -UAO)
[ 20.018253] pc : __irq_startup+0x9c/0xa8
[ 20.018256] lr : irq_startup+0x64/0x130
[ 20.018259] sp : ffff80001122f8e0
[ 20.029303] x29: ffff80001122f8e0 x28: ffff0082c242d400
[ 20.029306] x27: ffffdd0f47234768 x26: 0000000000020902
[ 20.029309] x25: ffffdd0f461a6f10 x24: ffffdd0f461a6bc8
[ 20.029311] x23: 0000000000000000 x22: 0000000000000001
[ 20.029314] x21: 0000000000000001 x20: ffff0082c22f8780
[ 20.029316] x19: ffff0082c1060800 x18: 0000000000000001
[ 20.029318] x17: 0000000000000000 x16: ffff8000114a0000
[ 20.029321] x15: 0000000000000000 x14: ffff0082c0e92f90
[ 20.071738] x13: ffff0082c0e93080 x12: ffff800011460000
[ 20.071741] x11: dead000000000100 x10: 0000000000000040
[ 20.071743] x9 : ffffdd0f47093ba8 x8 : ffffdd0f47093ba0
[ 20.087653] x7 : ffff0082a00002b0 x6 : ffffdd0f47074958
[ 20.087655] x5 : ffffdd0f47074000 x4 : ffff800011230000
[ 20.087657] x3 : 0000000000000504 x2 : 0000000000000001
[ 20.103567] x1 : 0000000003032004 x0 : ffff0082c1060858
[ 20.103570] Call trace:
[ 20.103573] __irq_startup+0x9c/0xa8
[ 20.103577] irq_startup+0x64/0x130
[ 20.118359] __enable_irq+0x7c/0x88
[ 20.118362] enable_irq+0x54/0xa8
[ 20.118367] serial8250_do_startup+0x658/0x718
[ 20.118371] serial8250_startup+0x38/0x48
[ 20.133589] uart_startup.part.0+0x12c/0x2b8
[ 20.133592] uart_port_activate+0x64/0x98
[ 20.133595] tty_port_open+0x94/0x200
[ 20.133599] uart_open+0x2c/0x40
[ 20.148730] tty_open+0x108/0x438
[ 20.148734] chrdev_open+0xa8/0x1a0
[ 20.148737] do_dentry_open+0x118/0x3b8
[ 20.159348] vfs_open+0x38/0x48
[ 20.159350] path_openat+0x4c8/0x1290
[ 20.159353] do_filp_open+0x84/0x108
[ 20.159357] do_sys_open+0x180/0x228
[ 20.173271] __arm64_sys_openat+0x2c/0x38
[ 20.173274] el0_svc_handler+0x88/0x1c8
[ 20.173278] el0_svc+0x8/0x1bc
[ 20.184148] ---[ end trace 736144791ac25035 ]---



ref:
https://qa-reports.linaro.org/lkft/linux-stable-rc-linux-5.4.y/build/v5.4.77-152-ga3746663c347/testrun/3452654/suite/linux-log-parser/test/check-kernel-warning-139363/log

2020-11-18 09:10:47

by Marc Zyngier

[permalink] [raw]
Subject: Re: WARNING: kernel/irq/chip.c:242 __irq_startup+0xa8/0xb0

Naresh,

On 2020-11-18 06:11, Naresh Kamboju wrote:
> On Tue, 13 Oct 2020 at 11:09, Naresh Kamboju
> <[email protected]> wrote:
>>
>> On stable rc 5.8.15 the following kernel warning was noticed once
>> while boot and this is hard to reproduce.
>
> This is now reproduciable on arm64 NXP ls2088 device

How reproducible? On demand? Once in a while?

>
> [ 19.980839] ------------[ cut here ]------------
> [ 19.985468] WARNING: CPU: 1 PID: 441 at kernel/irq/chip.c:242
> __irq_startup+0x9c/0xa8
> [ 19.985472] Modules linked in: rfkill lm90 ina2xx crct10dif_ce
> qoriq_thermal fuse
> [ 20.000773] CPU: 1 PID: 441 Comm: (agetty) Not tainted 5.4.78-rc1 #2

Can you please try and reproduce this on mainline?

> [ 20.000775] Hardware name: Freescale Layerscape 2088A RDB Board (DT)
> [ 20.000779] pstate: 60000085 (nZCv daIf -PAN -UAO)
> [ 20.018253] pc : __irq_startup+0x9c/0xa8
> [ 20.018256] lr : irq_startup+0x64/0x130
> [ 20.018259] sp : ffff80001122f8e0
> [ 20.029303] x29: ffff80001122f8e0 x28: ffff0082c242d400
> [ 20.029306] x27: ffffdd0f47234768 x26: 0000000000020902
> [ 20.029309] x25: ffffdd0f461a6f10 x24: ffffdd0f461a6bc8
> [ 20.029311] x23: 0000000000000000 x22: 0000000000000001
> [ 20.029314] x21: 0000000000000001 x20: ffff0082c22f8780
> [ 20.029316] x19: ffff0082c1060800 x18: 0000000000000001
> [ 20.029318] x17: 0000000000000000 x16: ffff8000114a0000
> [ 20.029321] x15: 0000000000000000 x14: ffff0082c0e92f90
> [ 20.071738] x13: ffff0082c0e93080 x12: ffff800011460000
> [ 20.071741] x11: dead000000000100 x10: 0000000000000040
> [ 20.071743] x9 : ffffdd0f47093ba8 x8 : ffffdd0f47093ba0
> [ 20.087653] x7 : ffff0082a00002b0 x6 : ffffdd0f47074958
> [ 20.087655] x5 : ffffdd0f47074000 x4 : ffff800011230000
> [ 20.087657] x3 : 0000000000000504 x2 : 0000000000000001
> [ 20.103567] x1 : 0000000003032004 x0 : ffff0082c1060858
> [ 20.103570] Call trace:
> [ 20.103573] __irq_startup+0x9c/0xa8
> [ 20.103577] irq_startup+0x64/0x130
> [ 20.118359] __enable_irq+0x7c/0x88
> [ 20.118362] enable_irq+0x54/0xa8
> [ 20.118367] serial8250_do_startup+0x658/0x718
> [ 20.118371] serial8250_startup+0x38/0x48

Looking at the DT:

serial0: serial@21c0500 {
interrupts = <0 32 0x4>; /* Level high type */

serial1: serial@21c0600 {
interrupts = <0 32 0x4>; /* Level high type */

serial2: serial@21d0500 {
interrupts = <0 33 0x4>; /* Level high type */

serial3: serial@21d0600 {
interrupts = <0 33 0x4>; /* Level high type */


The UART interrupt lines are shared. Braindead, 1980 style.

Which UART is agetty trying to use? Is there any other process using
another UART concurrently? We could have a race between the line being
shut down on one device, and activated from the other, but I can't
spot it right away.

If you can reproduce it easily enough, it shouldn't be too hard to trace
what happens around the activation of the shared interrupt (assuming
that
this is where the problem is).

M.
--
Jazz is not dead. It just smells funny...

2020-11-18 17:00:57

by Naresh Kamboju

[permalink] [raw]
Subject: Re: WARNING: kernel/irq/chip.c:242 __irq_startup+0xa8/0xb0

On Wed, 18 Nov 2020 at 14:38, Marc Zyngier <[email protected]> wrote:
>
> Naresh,
>
> On 2020-11-18 06:11, Naresh Kamboju wrote:
> > On Tue, 13 Oct 2020 at 11:09, Naresh Kamboju
> > <[email protected]> wrote:
> >>
> >> On stable rc 5.8.15 the following kernel warning was noticed once
> >> while boot and this is hard to reproduce.
> >
> > This is now reproduciable on arm64 NXP ls2088 device
>
> How reproducible? On demand? Once in a while?

Once in a while.

>
> >
> > [ 19.980839] ------------[ cut here ]------------
> > [ 19.985468] WARNING: CPU: 1 PID: 441 at kernel/irq/chip.c:242
> > __irq_startup+0x9c/0xa8
> > [ 19.985472] Modules linked in: rfkill lm90 ina2xx crct10dif_ce
> > qoriq_thermal fuse
> > [ 20.000773] CPU: 1 PID: 441 Comm: (agetty) Not tainted 5.4.78-rc1 #2
>
> Can you please try and reproduce this on mainline?

yes on mainline kernel.

[ 18.284440] ------------[ cut here ]------------
[ 18.289063] WARNING: CPU: 3 PID: 441 at kernel/irq/chip.c:242
__irq_startup+0xa8/0xb0
[ 18.296891] Modules linked in: rfkill caam crct10dif_ce error lm90
ina2xx qoriq_thermal fuse
[ 18.296914] CPU: 3 PID: 441 Comm: (agetty) Not tainted 5.10.0-rc3 #2
[ 18.311693] Hardware name: Freescale Layerscape 2088A RDB Board (DT)
[ 18.311697] pstate: 60000085 (nZCv daIf -PAN -UAO -TCO BTYPE=--)
[ 18.311700] pc : __irq_startup+0xa8/0xb0
[ 18.311702] lr : irq_startup+0x64/0x140
[ 18.311708] sp : ffff800011b2f8f0
[ 18.335095] x29: ffff800011b2f8f0 x28: ffff724881a03800
[ 18.335100] x27: ffffa9adf65ff200 x26: 0000000000020902
[ 18.335104] x25: ffffa9adf522bb40 x24: ffffa9adf522b7f8
[ 18.351023] x23: 0000000000000000 x22: 0000000000000001
[ 18.351027] x21: 0000000000000001 x20: ffff724883dbc940
[ 18.351030] x19: ffff7248812a0800 x18: 0000000000000010
[ 18.351035] x17: 0000000000000000 x16: 0000000000000000
[ 18.366952] x15: ffff724883dbcdc0 x14: ffffffffffffffff
[ 18.366956] x13: ffff800011de0000 x12: ffff800011de0000
[ 18.366959] x11: 0000000000000040 x10: ffffa9adf63c64e0
[ 18.388178] x9 : ffffa9adf3f90c3c x8 : ffff7248a0000270
[ 18.388181] x7 : 0000000000000000 x6 : ffffa9adf6334aa8
[ 18.388185] x5 : ffffa9adf6334000 x4 : 0000000000000504
[ 18.388189] x3 : ffff7248812a0800 x2 : 0000000013032004
[ 18.404106] x1 : 0000000013032004 x0 : ffff7248812a0860
[ 18.404111] Call trace:
[ 18.404114] __irq_startup+0xa8/0xb0
[ 18.404116] irq_startup+0x64/0x140
[ 18.404123] __enable_irq+0x78/0x88
[ 18.427685] enable_irq+0x54/0xa8
[ 18.427689] serial8250_do_startup+0x670/0x718
[ 18.427692] serial8250_startup+0x30/0x40
[ 18.427696] uart_startup.part.0+0x12c/0x2e0
[ 18.427701] uart_port_activate+0x68/0xa0
[ 18.447701] tty_port_open+0x98/0x250
[ 18.447704] uart_open+0x24/0x38
[ 18.447707] tty_open+0x120/0x518
[ 18.447712] chrdev_open+0xac/0x1a8
[ 18.447718] do_dentry_open+0x134/0x3a0
[ 18.465200] vfs_open+0x34/0x40
[ 18.465203] path_openat+0x85c/0xde0
[ 18.465207] do_filp_open+0x80/0x108
[ 18.465209] do_sys_openat2+0x1ec/0x2a0
[ 18.465215] do_sys_open+0x60/0xa8
[ 18.482695] __arm64_sys_openat+0x2c/0x38
[ 18.482699] do_el0_svc+0xe0/0x210
[ 18.482702] el0_sync_handler+0x180/0x18c
[ 18.482705] el0_sync+0x174/0x180
[ 18.482710] ---[ end trace 12c4809389e4277f ]---

>
> Looking at the DT:
>
> serial0: serial@21c0500 {
> interrupts = <0 32 0x4>; /* Level high type */
>
> serial1: serial@21c0600 {
> interrupts = <0 32 0x4>; /* Level high type */
>
> serial2: serial@21d0500 {
> interrupts = <0 33 0x4>; /* Level high type */
>
> serial3: serial@21d0600 {
> interrupts = <0 33 0x4>; /* Level high type */
>
>
> The UART interrupt lines are shared. Braindead, 1980 style.
>
> Which UART is agetty trying to use? Is there any other process using
> another UART concurrently? We could have a race between the line being
> shut down on one device, and activated from the other, but I can't
> spot it right away.

From the boot log,

Created slice system-getty.slice.
Created slice system-serial\x2dgetty.slice.
<>
Found device /dev/ttyS0.
Found device /dev/ttyS1.
<>
Started Getty on tty1.
Started Serial Getty on ttyS1.
Started Serial Getty on ttyS0.
Started Authorization Manager.

And the warning triggered.

>
> If you can reproduce it easily enough, it shouldn't be too hard to trace
> what happens around the activation of the shared interrupt (assuming
> that
> this is where the problem is).

We do not find easy steps to reproduce it.
This warning was noticed while booting once in 20 runs.

ref:
https://qa-reports.linaro.org/lkft/linux-mainline-master/build/v5.10-rc3-391-g9cfd9c45994b/testrun/3443291/suite/linux-log-parser/test/check-kernel-warning-138789/log

- Naresh