I2C interrupts may be triggered unexpectedly,
such as programs that directly access I2C registers,
bus conflicts caused by hardware design defects, etc.
These can cause null pointer reference errors and kernel panic.
kernel log:
[ 52.676442] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
...
[ 52.816536] Workqueue: efi_rts_wq efi_call_rts
[ 52.820968] pstate: 60000085 (nZCv daIf -PAN -UAO)
[ 52.825753] pc : i2c_dw_isr+0x36c/0x5e0 [i2c_designware_core]
[ 52.831487] lr : i2c_dw_isr+0x88/0x5e0 [i2c_designware_core]
[ 52.837134] sp : ffff8020fff17650
[ 52.924451] Call trace:
[ 52.926888] i2c_dw_isr+0x36c/0x5e0 [i2c_designware_core]
...
[ 52.957394] gic_handle_irq+0x7c/0x178
[ 52.961130] el1_irq+0xb0/0x140
[ 52.964259] 0x21291d30
[ 52.983729] 0x21160938
[ 52.986164] __efi_rt_asm_wrapper+0x28/0x44
[ 52.990335] efi_call_rts+0x78/0x448
[ 53.019021] Kernel panic - not syncing: Fatal exception in interrupt
Signed-off-by: huangbibo <[email protected]>
---
drivers/i2c/busses/i2c-designware-master.c | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)
diff --git a/drivers/i2c/busses/i2c-designware-master.c b/drivers/i2c/busses/i2c-designware-master.c
index 2871cf2ee8b4..6af1ede38253 100644
--- a/drivers/i2c/busses/i2c-designware-master.c
+++ b/drivers/i2c/busses/i2c-designware-master.c
@@ -631,8 +631,14 @@ static int i2c_dw_irq_handler_master(struct dw_i2c_dev *dev)
if (stat & DW_IC_INTR_RX_FULL)
i2c_dw_read(dev);
- if (stat & DW_IC_INTR_TX_EMPTY)
- i2c_dw_xfer_msg(dev);
+ if (stat & DW_IC_INTR_TX_EMPTY) {
+ if (dev->msgs) {
+ i2c_dw_xfer_msg(dev);
+ } else { //null pointer
+ i2c_dw_disable_int(dev);
+ return 0;
+ }
+ }
/*
* No need to modify or disable the interrupt mask here.
--
2.20.1
Hi
On 11/11/21 8:57 AM, huangbibo wrote:
> I2C interrupts may be triggered unexpectedly,
> such as programs that directly access I2C registers,
> bus conflicts caused by hardware design defects, etc.
> These can cause null pointer reference errors and kernel panic.
>
> kernel log:
> [ 52.676442] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
> ...
> [ 52.816536] Workqueue: efi_rts_wq efi_call_rts
> [ 52.820968] pstate: 60000085 (nZCv daIf -PAN -UAO)
> [ 52.825753] pc : i2c_dw_isr+0x36c/0x5e0 [i2c_designware_core]
> [ 52.831487] lr : i2c_dw_isr+0x88/0x5e0 [i2c_designware_core]
> [ 52.837134] sp : ffff8020fff17650
> [ 52.924451] Call trace:
> [ 52.926888] i2c_dw_isr+0x36c/0x5e0 [i2c_designware_core]
> ...
> [ 52.957394] gic_handle_irq+0x7c/0x178
> [ 52.961130] el1_irq+0xb0/0x140
> [ 52.964259] 0x21291d30
> [ 52.983729] 0x21160938
> [ 52.986164] __efi_rt_asm_wrapper+0x28/0x44
> [ 52.990335] efi_call_rts+0x78/0x448
> [ 53.019021] Kernel panic - not syncing: Fatal exception in interrupt
>
> Signed-off-by: huangbibo <[email protected]>
> ---
> drivers/i2c/busses/i2c-designware-master.c | 10 ++++++++--
> 1 file changed, 8 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/i2c/busses/i2c-designware-master.c b/drivers/i2c/busses/i2c-designware-master.c
> index 2871cf2ee8b4..6af1ede38253 100644
> --- a/drivers/i2c/busses/i2c-designware-master.c
> +++ b/drivers/i2c/busses/i2c-designware-master.c
> @@ -631,8 +631,14 @@ static int i2c_dw_irq_handler_master(struct dw_i2c_dev *dev)
> if (stat & DW_IC_INTR_RX_FULL)
> i2c_dw_read(dev);
>
> - if (stat & DW_IC_INTR_TX_EMPTY)
> - i2c_dw_xfer_msg(dev);
> + if (stat & DW_IC_INTR_TX_EMPTY) {
> + if (dev->msgs) {
> + i2c_dw_xfer_msg(dev);
> + } else { //null pointer
> + i2c_dw_disable_int(dev);
> + return 0;
> + }
> + }
This feels to me we are masking the problem. It's common to have
i2c-designware device suspended (and registers might read all bits zero)
and receive interrupts from another device if interrupt line is shared.
Also while dev->msgs is NULL.
I'd like to understand the issue more. Could you add
"i2c_designware_core.dyndbg=+p" into command line in order to see
dev_dbg() prints in dmesg and with following patch?
--- a/drivers/i2c/busses/i2c-designware-master.c
+++ b/drivers/i2c/busses/i2c-designware-master.c
@@ -720,6 +720,7 @@ static int i2c_dw_irq_handler_master(struct
dw_i2c_dev *dev)
u32 stat;
stat = i2c_dw_read_clear_intrbits(dev);
+ dev_dbg(dev->dev, "stat %#x\n", stat);
if (stat & DW_IC_INTR_TX_ABRT) {
dev->cmd_err |= DW_IC_ERR_TX_ABRT;
dev->status = STATUS_IDLE;
Jarkko
On Thu, Nov 11, 2021 at 02:57:59PM +0800, huangbibo wrote:
> I2C interrupts may be triggered unexpectedly,
> such as programs that directly access I2C registers,
Wow! What programs you are talking about? How comes they are doing this
weirdness in parallel with the OS running?
Can you elaborate, please?
> bus conflicts caused by hardware design defects, etc.
> These can cause null pointer reference errors and kernel panic.
--
With Best Regards,
Andy Shevchenko
On Fri, Nov 12, 2021 at 03:35:07AM +0000, 黄碧波 wrote:
> Hi Andy,
First of all, please fix your email client (it mangled the message
in a bad way) and do not top post!
> This patch is to make the kernel more stable & stronger,
> Even if there is an unexpected I2C interrupt, kernel will not crash
As far as I can see this is, as Jarkko said, the change to hide the real issue.
> Let me elaborate on this issue:
> The BIOS support EFI RTC feature and add the I2C bus descr in the ACPI table (RTC chip is connected to I2C bus), The OS matches and registers the I2C bus driver.
>
> When OS get RTC time by BIOS interface (Runtime Server), crash occurs.
> The BIOS interface direct acces registers to sends and receives data, which conflicts with the I2C driver
>
> This is a BIOS error and the root cause of this issue. The final solution is to delete the I2C device node in the ACPI table
Oh, yeah, yet another brain damaged design. If somebody wants to have driver of
the I²C peripheral in the ASL, it's not gonna work in Linux (in 99.99% cases).
What you should do is go and fix BIOS that it won't do two things together, i.e.
ASL based driver and exposure of I²C host controller in ACPI.
On the constructive way, you need to use DMI quirks and somehow make EFI and
I²C code to be communicating nicely in the kernel.
The patch makes no sense to me, the problem is obviously somewhere else.
NAK.
>
>
> ------------------ Original ------------------
> From: "Jarkko Nikula"<[email protected]>;
> Date: Thu, Nov 11, 2021 02:06 PM
> To: "huangbibo"<[email protected]>; "linux-i2c"<[email protected]>;
> Cc: "linux-kernel"<[email protected]>; "mika.westerberg"<[email protected]>; "andriy.shevchenko"<[email protected]>; "p.zabel"<[email protected]>;
> Subject: Re: [PATCH] i2c: designware: I2C unexpected interrupt handling will cause kernel panic
>
>
>
> Hi
>
> On 11/11/21 8:57 AM, huangbibo wrote:
> > I2C interrupts may be triggered unexpectedly,
> > such as programs that directly access I2C registers,
> > bus conflicts caused by hardware design defects, etc.
> > These can cause null pointer reference errors and kernel panic.
> >
> > kernel log:
> > [ 52.676442] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
> > ...
> > [ 52.816536] Workqueue: efi_rts_wq efi_call_rts
> > [ 52.820968] pstate: 60000085 (nZCv daIf -PAN -UAO)
> > [ 52.825753] pc : i2c_dw_isr+0x36c/0x5e0 [i2c_designware_core]
> > [ 52.831487] lr : i2c_dw_isr+0x88/0x5e0 [i2c_designware_core]
> > [ 52.837134] sp : ffff8020fff17650
> > [ 52.924451] Call trace:
> > [ 52.926888] i2c_dw_isr+0x36c/0x5e0 [i2c_designware_core]
> > ...
> > [ 52.957394] gic_handle_irq+0x7c/0x178
> > [ 52.961130] el1_irq+0xb0/0x140
> > [ 52.964259] 0x21291d30
> > [ 52.983729] 0x21160938
> > [ 52.986164] __efi_rt_asm_wrapper+0x28/0x44
> > [ 52.990335] efi_call_rts+0x78/0x448
> > [ 53.019021] Kernel panic - not syncing: Fatal exception in interrupt
--
With Best Regards,
Andy Shevchenko