2022-08-21 16:00:19

by Igor Kononenko

[permalink] [raw]
Subject: [PATCH 1/3] ipmi:kcs_bmc: Add cleanup regmap(interrupt-regs) on a shutdown.

The bmc might be rebooted while the host is still reachable and the host
might send requests through configured lpc-kcs channels. That leads to
raise lpc-snoop interrupts that haven't adjusted IRQ handlers on next
early kernel boot, because on the aspeed-chip reboot it might keep
unclear registers on a state that is configured on the last boot.

The described case get a `nobody cared` warning about more than 100.000
requests when another one driver configures the same shared IRQ(e.g. 35)
and the interrupt will be passed through for lpc-kcs IRQ handler by
`IRQ_NONE` state flag.

The kernel output of described way is bellow:
```
[ 1.360110] irq 35: nobody cared (try booting with the "irqpoll" option)
[ 1.360145] CPU: 0 PID: 1 Comm: swapper Not tainted 5.4.43-c109de3-24cc5b6 #1
[ 1.360158] Hardware name: Generic DT based system
[ 1.360168] Backtrace:
[ 1.360228] [<80107f5c>] (dump_backtrace) from [<80108184>] (show_stack+0x20/0x24)
[ 1.360250] r7:00000023 r6:00000000 r5:00000000 r4:9d12d560
[ 1.360283] [<80108164>] (show_stack) from [<8084ae54>] (dump_stack+0x20/0x28)
[ 1.360316] [<8084ae34>] (dump_stack) from [<80156790>] (__report_bad_irq+0x40/0xc0)
[ 1.360344] [<80156750>] (__report_bad_irq) from [<801566c0>] (note_interrupt+0x238/0x290)
[ 1.360366] r9:9d0ae000 r8:9d00c600 r7:00000023 r6:00000000 r5:00000000 r4:9d12d560
[ 1.360408] [<80156488>] (note_interrupt) from [<80153594>] (handle_irq_event+0xb4/0xc4)
[ 1.360429] r10:00000000 r9:9d0ae000 r8:9d00c600 r7:00000001 r6:00000000 r5:00000000
[ 1.360440] r4:9d12d560 r3:00000000
[ 1.360466] [<801534e0>] (handle_irq_event) from [<8015788c>] (handle_level_irq+0xac/0x180)
[ 1.360480] r5:80c7d35c r4:9d12d560
[ 1.360503] [<801577e0>] (handle_level_irq) from [<80152a5c>] (__handle_domain_irq+0x6c/0xc8)
[ 1.360519] r5:80c7d35c r4:9d12d560
[ 1.360545] [<801529f0>] (__handle_domain_irq) from [<801021cc>] (avic_handle_irq+0x68/0x70)
[ 1.360568] r9:9d0ae000 r8:9d12d608 r7:9d0afc84 r6:ffffffff r5:9d0afc50 r4:9d002380
[ 1.360587] [<80102164>] (avic_handle_irq) from [<80101a6c>] (__irq_svc+0x6c/0x90)
[ 1.360603] Exception stack(0x9d0afc50 to 0x9d0afc98)
[ 1.360620] fc40: 00000000 00000100 00000000 00000000
[ 1.360640] fc60: 9d12d560 98bbd0c0 00000000 00000023 9d12d608 60000053 00000000 9d0afcd4
[ 1.360657] fc80: 9d0afc80 9d0afca0 801570ec 80154cdc 40000053 ffffffff
[ 1.360670] r5:40000053 r4:80154cdc
[ 1.360693] [<801549e0>] (__setup_irq) from [<801555e4>] (request_threaded_irq+0xdc/0x15c)
[ 1.360715] r9:98bbb340 r8:00000023 r7:9d12d570 r6:9d12d560 r5:00000000 r4:98bbd0c0
[ 1.360741] [<80155508>] (request_threaded_irq) from [<801589d8>] (devm_request_threaded_irq+0x70/0xc4)
[ 1.360762] r10:80a7353c r9:00000000 r8:98bbb340 r7:9d130e10 r6:00000023 r5:804f4a70
[ 1.360774] r4:98bbd020 r3:00000080
[ 1.360800] [<80158968>] (devm_request_threaded_irq) from [<804f4ebc>] (aspeed_lpc_snoop_probe+0x100/0x2ac)
[ 1.360821] r10:00000000 r9:9d130e10 r8:98bbd040 r7:00000000 r6:9d130e00 r5:98bbb340
[ 1.360830] r4:00000000
[ 1.360851] [<804f4dbc>] (aspeed_lpc_snoop_probe) from [<8056a5c4>] (platform_drv_probe+0x44/0x80)
[ 1.360873] r9:80c5ef90 r8:00000000 r7:80cc2938 r6:00000000 r5:80c5ef90 r4:9d130e10
[ 1.360910] [<8056a580>] (platform_drv_probe) from [<80568420>] (really_probe+0x26c/0x498)
[ 1.360924] r5:80cc282c r4:9d130e10
[ 1.360949] [<805681b4>] (really_probe) from [<80568c28>] (driver_probe_device+0x138/0x184)
[ 1.360970] r10:80b0050c r9:80adadb0 r8:00000007 r7:80c5ef90 r6:9d130e10 r5:00000000
[ 1.360981] r4:80c5ef90
[ 1.361004] [<80568af0>] (driver_probe_device) from [<80568fe4>] (device_driver_attach+0xb8/0xc0)
[ 1.361020] r7:80c5ef90 r6:9d130e54 r5:00000000 r4:9d130e10
[ 1.361046] [<80568f2c>] (device_driver_attach) from [<80569070>] (__driver_attach+0x84/0x16c)
[ 1.361063] r7:80c61128 r6:9d130e10 r5:00000001 r4:80c5ef90
[ 1.361088] [<80568fec>] (__driver_attach) from [<80566068>] (bus_for_each_dev+0x84/0xcc)
[ 1.361106] r7:80c61128 r6:80568fec r5:80c5ef90 r4:00000000
[ 1.361130] [<80565fe4>] (bus_for_each_dev) from [<80569180>] (driver_attach+0x28/0x30)
[ 1.361147] r6:00000000 r5:9d1a3d40 r4:80c5ef90
[ 1.361169] [<80569158>] (driver_attach) from [<80566a08>] (bus_add_driver+0x114/0x200)
[ 1.361195] [<805668f4>] (bus_add_driver) from [<80569814>] (driver_register+0x98/0x128)
[ 1.361214] r7:00000000 r6:80ca0ca0 r5:00000000 r4:80c5ef90
[ 1.361241] [<8056977c>] (driver_register) from [<8056b528>] (__platform_driver_register+0x40/0x54)
[ 1.361256] r5:000000b8 r4:80b2575c
[ 1.361294] [<8056b4e8>] (__platform_driver_register) from [<80b2577c>] (aspeed_lpc_snoop_driver_init+0x20/0x28)
[ 1.361331] [<80b2575c>] (aspeed_lpc_snoop_driver_init) from [<80b01318>] (do_one_initcall+0x84/0x184)
[ 1.361356] [<80b01294>] (do_one_initcall) from [<80b01540>] (kernel_init_freeable+0x128/0x1ec)
[ 1.361375] r7:80b3f858 r6:80ca0ca0 r5:000000b8 r4:80b61914
[ 1.361412] [<80b01418>] (kernel_init_freeable) from [<80864060>] (kernel_init+0x18/0x11c)
[ 1.361435] r10:00000000 r9:00000000 r8:00000000 r7:00000000 r6:00000000 r5:80864048
[ 1.361444] r4:00000000
[ 1.361470] [<80864048>] (kernel_init) from [<801010e8>] (ret_from_fork+0x14/0x2c)
[ 1.361483] Exception stack(0x9d0affb0 to 0x9d0afff8)
[ 1.361500] ffa0: 00000000 00000000 00000000 00000000
[ 1.361518] ffc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[ 1.361535] ffe0: 00000000 00000000 00000000 00000000 00000013 00000000
[ 1.361547] r5:80864048 r4:00000000
[ 1.361555] handlers:
[ 1.361592] [<(ptrval)>] aspeed_lpc_snoop_irq
[ 1.361609] Disabling IRQ #35

```

Note:
* The lpc-snoop driver found on the same 1e789080 address and have
same IRQ#35 as for lpc-kcs
* The lpc-snoop initizalied earlier than lpc-kcs.

The patch brings a way to masking lpc-kcs interrupt all through
a bmc-rebooting time.

Tested on the YADRO VEGMAN N110 stand.

Signed-off-by: Igor Kononenko <[email protected]>
---
drivers/char/ipmi/kcs_bmc_aspeed.c | 8 ++++++++
1 file changed, 8 insertions(+)

diff --git a/drivers/char/ipmi/kcs_bmc_aspeed.c b/drivers/char/ipmi/kcs_bmc_aspeed.c
index cdc88cde1e9a..440f31d96bd3 100644
--- a/drivers/char/ipmi/kcs_bmc_aspeed.c
+++ b/drivers/char/ipmi/kcs_bmc_aspeed.c
@@ -636,6 +636,13 @@ static int aspeed_kcs_remove(struct platform_device *pdev)
return 0;
}

+static void aspeed_kcs_shutdown(struct platform_device *pdev)
+{
+ struct kcs_bmc *kcs_bmc = dev_get_drvdata(&pdev->dev);
+
+ aspeed_kcs_irq_mask_update(kcs_bmc, (KCS_BMC_EVENT_TYPE_IBF), 0);
+}
+
static const struct of_device_id ast_kcs_bmc_match[] = {
{ .compatible = "aspeed,ast2400-kcs-bmc-v2" },
{ .compatible = "aspeed,ast2500-kcs-bmc-v2" },
@@ -651,6 +658,7 @@ static struct platform_driver ast_kcs_bmc_driver = {
},
.probe = aspeed_kcs_probe,
.remove = aspeed_kcs_remove,
+ .shutdown = aspeed_kcs_shutdown,
};
module_platform_driver(ast_kcs_bmc_driver);

--
2.25.1


2022-08-21 20:01:25

by kernel test robot

[permalink] [raw]
Subject: Re: [PATCH 1/3] ipmi:kcs_bmc: Add cleanup regmap(interrupt-regs) on a shutdown.

Hi Igor,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on cminyard-ipmi/for-next]
[also build test ERROR on soc/for-next linus/master v6.0-rc1 next-20220819]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url: https://github.com/intel-lab-lkp/linux/commits/Igor-Kononenko/aspeed-lpc-Fix-lpc-snoop-probe-exception/20220822-000836
base: https://github.com/cminyard/linux-ipmi for-next
config: riscv-randconfig-r042-20220821 (https://download.01.org/0day-ci/archive/20220822/[email protected]/config)
compiler: clang version 16.0.0 (https://github.com/llvm/llvm-project 01ffe31cbb54bfd8e38e71b3cf804a1d67ebf9c1)
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# install riscv cross compiling tool for clang build
# apt-get install binutils-riscv64-linux-gnu
# https://github.com/intel-lab-lkp/linux/commit/9c523bc00c11d0e9499bf6e3d3c5cc2fcf3fff8f
git remote add linux-review https://github.com/intel-lab-lkp/linux
git fetch --no-tags linux-review Igor-Kononenko/aspeed-lpc-Fix-lpc-snoop-probe-exception/20220822-000836
git checkout 9c523bc00c11d0e9499bf6e3d3c5cc2fcf3fff8f
# save the config file
mkdir build_dir && cp config build_dir/.config
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=riscv SHELL=/bin/bash

If you fix the issue, kindly add following tag where applicable
Reported-by: kernel test robot <[email protected]>

All errors (new ones prefixed by >>):

In file included from drivers/char/ipmi/kcs_bmc_aspeed.c:10:
In file included from include/linux/interrupt.h:11:
In file included from include/linux/hardirq.h:11:
In file included from ./arch/riscv/include/generated/asm/hardirq.h:1:
In file included from include/asm-generic/hardirq.h:17:
In file included from include/linux/irq.h:20:
In file included from include/linux/io.h:13:
In file included from arch/riscv/include/asm/io.h:136:
include/asm-generic/io.h:464:31: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
val = __raw_readb(PCI_IOBASE + addr);
~~~~~~~~~~ ^
include/asm-generic/io.h:477:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
val = __le16_to_cpu((__le16 __force)__raw_readw(PCI_IOBASE + addr));
~~~~~~~~~~ ^
include/uapi/linux/byteorder/little_endian.h:37:51: note: expanded from macro '__le16_to_cpu'
#define __le16_to_cpu(x) ((__force __u16)(__le16)(x))
^
In file included from drivers/char/ipmi/kcs_bmc_aspeed.c:10:
In file included from include/linux/interrupt.h:11:
In file included from include/linux/hardirq.h:11:
In file included from ./arch/riscv/include/generated/asm/hardirq.h:1:
In file included from include/asm-generic/hardirq.h:17:
In file included from include/linux/irq.h:20:
In file included from include/linux/io.h:13:
In file included from arch/riscv/include/asm/io.h:136:
include/asm-generic/io.h:490:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
val = __le32_to_cpu((__le32 __force)__raw_readl(PCI_IOBASE + addr));
~~~~~~~~~~ ^
include/uapi/linux/byteorder/little_endian.h:35:51: note: expanded from macro '__le32_to_cpu'
#define __le32_to_cpu(x) ((__force __u32)(__le32)(x))
^
In file included from drivers/char/ipmi/kcs_bmc_aspeed.c:10:
In file included from include/linux/interrupt.h:11:
In file included from include/linux/hardirq.h:11:
In file included from ./arch/riscv/include/generated/asm/hardirq.h:1:
In file included from include/asm-generic/hardirq.h:17:
In file included from include/linux/irq.h:20:
In file included from include/linux/io.h:13:
In file included from arch/riscv/include/asm/io.h:136:
include/asm-generic/io.h:501:33: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
__raw_writeb(value, PCI_IOBASE + addr);
~~~~~~~~~~ ^
include/asm-generic/io.h:511:59: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
__raw_writew((u16 __force)cpu_to_le16(value), PCI_IOBASE + addr);
~~~~~~~~~~ ^
include/asm-generic/io.h:521:59: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
__raw_writel((u32 __force)cpu_to_le32(value), PCI_IOBASE + addr);
~~~~~~~~~~ ^
include/asm-generic/io.h:1024:55: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
return (port > MMIO_UPPER_LIMIT) ? NULL : PCI_IOBASE + port;
~~~~~~~~~~ ^
>> drivers/char/ipmi/kcs_bmc_aspeed.c:643:29: error: incompatible pointer types passing 'struct kcs_bmc *' to parameter of type 'struct kcs_bmc_device *' [-Werror,-Wincompatible-pointer-types]
aspeed_kcs_irq_mask_update(kcs_bmc, (KCS_BMC_EVENT_TYPE_IBF), 0);
^~~~~~~
drivers/char/ipmi/kcs_bmc_aspeed.c:399:63: note: passing argument to parameter 'kcs_bmc' here
static void aspeed_kcs_irq_mask_update(struct kcs_bmc_device *kcs_bmc, u8 mask, u8 state)
^
7 warnings and 1 error generated.


vim +643 drivers/char/ipmi/kcs_bmc_aspeed.c

638
639 static void aspeed_kcs_shutdown(struct platform_device *pdev)
640 {
641 struct kcs_bmc *kcs_bmc = dev_get_drvdata(&pdev->dev);
642
> 643 aspeed_kcs_irq_mask_update(kcs_bmc, (KCS_BMC_EVENT_TYPE_IBF), 0);
644 }
645

--
0-DAY CI Kernel Test Service
https://01.org/lkp

2022-08-22 06:45:32

by Igor Kononenko

[permalink] [raw]
Subject: [PATCH v2 1/3] ipmi:kcs_bmc: Add cleanup regmap(interrupt-regs) on a shutdown.

The previos v1 [PATCH 1/3] have error of getting `struct kcs_bmc_device`
pointer from `struct platform_device *`. Replaced to retriveing pointer by
`platform_get_drvdata()`

Reported-by: kernel test robot <[email protected]>
Signed-off-by: Igor Kononenko <[email protected]>
---
drivers/char/ipmi/kcs_bmc_aspeed.c | 9 +++++++++
1 file changed, 9 insertions(+)

diff --git a/drivers/char/ipmi/kcs_bmc_aspeed.c b/drivers/char/ipmi/kcs_bmc_aspeed.c
index cdc88cde1e9a..8437f3cfe3f4 100644
--- a/drivers/char/ipmi/kcs_bmc_aspeed.c
+++ b/drivers/char/ipmi/kcs_bmc_aspeed.c
@@ -636,6 +636,14 @@ static int aspeed_kcs_remove(struct platform_device *pdev)
return 0;
}

+static void aspeed_kcs_shutdown(struct platform_device *pdev)
+{
+ struct aspeed_kcs_bmc *priv = platform_get_drvdata(pdev);
+ struct kcs_bmc_device *kcs_bmc = &priv->kcs_bmc;
+
+ aspeed_kcs_irq_mask_update(kcs_bmc, (KCS_BMC_EVENT_TYPE_IBF), 0);
+}
+
static const struct of_device_id ast_kcs_bmc_match[] = {
{ .compatible = "aspeed,ast2400-kcs-bmc-v2" },
{ .compatible = "aspeed,ast2500-kcs-bmc-v2" },
@@ -651,6 +659,7 @@ static struct platform_driver ast_kcs_bmc_driver = {
},
.probe = aspeed_kcs_probe,
.remove = aspeed_kcs_remove,
+ .shutdown = aspeed_kcs_shutdown,
};
module_platform_driver(ast_kcs_bmc_driver);

--
2.25.1

2022-08-23 00:44:11

by Joel Stanley

[permalink] [raw]
Subject: Re: [PATCH v2 1/3] ipmi:kcs_bmc: Add cleanup regmap(interrupt-regs) on a shutdown.

On Mon, 22 Aug 2022 at 06:36, Igor Kononenko <[email protected]> wrote:
>
> The previos v1 [PATCH 1/3] have error of getting `struct kcs_bmc_device`
> pointer from `struct platform_device *`. Replaced to retriveing pointer by
> `platform_get_drvdata()`

Can we get a v3 with your original commit message added? You had a
good write up in v1 but it was dropped in v2.

This change looks like the right thing to do. We should get Andrew to
review too, as he spends more time with the KCS controllers.

The missed irq issue was seen with the other LPC sub drivers because
the clock wasn't enabled. We ended up with this patch:

https://lore.kernel.org/all/[email protected]/

Do we need to do something similar for KCS?

>
> Reported-by: kernel test robot <[email protected]>
> Signed-off-by: Igor Kononenko <[email protected]>
> ---
> drivers/char/ipmi/kcs_bmc_aspeed.c | 9 +++++++++
> 1 file changed, 9 insertions(+)
>
> diff --git a/drivers/char/ipmi/kcs_bmc_aspeed.c b/drivers/char/ipmi/kcs_bmc_aspeed.c
> index cdc88cde1e9a..8437f3cfe3f4 100644
> --- a/drivers/char/ipmi/kcs_bmc_aspeed.c
> +++ b/drivers/char/ipmi/kcs_bmc_aspeed.c
> @@ -636,6 +636,14 @@ static int aspeed_kcs_remove(struct platform_device *pdev)
> return 0;
> }
>
> +static void aspeed_kcs_shutdown(struct platform_device *pdev)
> +{
> + struct aspeed_kcs_bmc *priv = platform_get_drvdata(pdev);
> + struct kcs_bmc_device *kcs_bmc = &priv->kcs_bmc;
> +
> + aspeed_kcs_irq_mask_update(kcs_bmc, (KCS_BMC_EVENT_TYPE_IBF), 0);
> +}
> +
> static const struct of_device_id ast_kcs_bmc_match[] = {
> { .compatible = "aspeed,ast2400-kcs-bmc-v2" },
> { .compatible = "aspeed,ast2500-kcs-bmc-v2" },
> @@ -651,6 +659,7 @@ static struct platform_driver ast_kcs_bmc_driver = {
> },
> .probe = aspeed_kcs_probe,
> .remove = aspeed_kcs_remove,
> + .shutdown = aspeed_kcs_shutdown,
> };
> module_platform_driver(ast_kcs_bmc_driver);
>
> --
> 2.25.1
>