2023-11-22 07:36:18

by Daniel Mack

Subject: [PATCH v4] serial: sc16is7xx: address RX timeout interrupt errata

This device has a silicon bug that makes it report a timeout interrupt
but no data in the FIFO.

The datasheet states the following in the errata section 18.1.4:

"If the host reads the receive FIFO at the same time as a
time-out interrupt condition happens, the host might read 0xCC
(time-out) in the Interrupt Indication Register (IIR), but bit 0
of the Line Status Register (LSR) is not set (means there is no
data in the receive FIFO)."

The errata does not mention it explicitly, but tests have shown, and
the vendor has confirmed, that the RXLVL register is equally affected.

This bug has hit us on production units. When it does, sc16is7xx_irq()
spins forever: sc16is7xx_port_irq() keeps seeing an interrupt in the
IIR register that is never cleared, because the driver does not call
into sc16is7xx_handle_rx() unless the RXLVL register reports at least
one byte in the FIFO.

Fix this by always reading one byte when this condition is detected
in order to clear the interrupt. This approach was confirmed to be
correct by NXP through their support channels.

Signed-off-by: Daniel Mack <[email protected]>
Co-Developed-by: Maxim Popov <[email protected]>
Cc: [email protected]
---
Meanwhile, NXP has confirmed this fix to be correct.

v4: NXP has confirmed the fix; update the commit log accordingly
v3: re-added the additional Co-Developed-by and stable@ tags
v2: reworded the commit log a bit for more context.

drivers/tty/serial/sc16is7xx.c | 12 ++++++++++++
1 file changed, 12 insertions(+)

diff --git a/drivers/tty/serial/sc16is7xx.c b/drivers/tty/serial/sc16is7xx.c
index 289ca7d4e566..76f76e510ed1 100644
--- a/drivers/tty/serial/sc16is7xx.c
+++ b/drivers/tty/serial/sc16is7xx.c
@@ -765,6 +765,18 @@ static bool sc16is7xx_port_irq(struct sc16is7xx_port *s, int portno)
 		case SC16IS7XX_IIR_RTOI_SRC:
 		case SC16IS7XX_IIR_XOFFI_SRC:
 			rxlen = sc16is7xx_port_read(port, SC16IS7XX_RXLVL_REG);
+
+			/*
+			 * There is a silicon bug that makes the chip report a
+			 * time-out interrupt but no data in the FIFO. This is
+			 * described in errata section 18.1.4.
+			 *
+			 * When this happens, read one byte from the FIFO to
+			 * clear the interrupt.
+			 */
+			if (iir == SC16IS7XX_IIR_RTOI_SRC && !rxlen)
+				rxlen = 1;
+
 			if (rxlen)
 				sc16is7xx_handle_rx(port, rxlen, iir);
 			break;
--
2.41.0
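
For readers following the thread, the failure mode and the workaround
described in the commit log can be sketched as a small user-space model.
This is only an illustration under stated assumptions: it is not driver
code, and every name in it (struct model_chip, model_port_irq(),
model_irq_loop() and so on) is hypothetical. The model assumes, as the
commit log states, that reading the RX FIFO is what clears the time-out
interrupt.

```c
#include <stdbool.h>

/* Hypothetical model of the chip state; not the real register layout. */
struct model_chip {
	bool timeout_pending;	/* RX time-out interrupt is asserted */
	unsigned int rxlvl;	/* value the RXLVL register reports */
};

enum model_irq { MODEL_IRQ_NONE, MODEL_IRQ_RX_TIMEOUT };

/* Model of reading IIR: reports the pending interrupt source. */
enum model_irq model_read_iir(const struct model_chip *c)
{
	return c->timeout_pending ? MODEL_IRQ_RX_TIMEOUT : MODEL_IRQ_NONE;
}

/* Assumption: any read from the RX FIFO clears the time-out interrupt. */
void model_read_fifo(struct model_chip *c, unsigned int len)
{
	if (!len)
		return;
	c->rxlvl = (len >= c->rxlvl) ? 0 : c->rxlvl - len;
	c->timeout_pending = false;
}

/* One pass of the per-port handler; true means an interrupt was seen. */
bool model_port_irq(struct model_chip *c, bool workaround)
{
	enum model_irq iir = model_read_iir(c);
	unsigned int rxlen;

	if (iir == MODEL_IRQ_NONE)
		return false;

	rxlen = c->rxlvl;

	/* Errata case: time-out reported while RXLVL reads zero. */
	if (workaround && iir == MODEL_IRQ_RX_TIMEOUT && !rxlen)
		rxlen = 1;

	if (rxlen)
		model_read_fifo(c, rxlen);

	return true;
}

/*
 * Outer loop: keep handling until no interrupt is pending. The pass
 * count is capped so the model terminates; a return value equal to
 * max_passes means the real loop would still be spinning.
 */
unsigned int model_irq_loop(struct model_chip *c, bool workaround,
			    unsigned int max_passes)
{
	unsigned int passes = 0;

	while (passes < max_passes && model_port_irq(c, workaround))
		passes++;

	return passes;
}
```

Starting from a chip with timeout_pending set and rxlvl equal to zero,
model_irq_loop() without the workaround runs until it hits the cap,
whereas with the workaround it handles the interrupt in a single pass
and then sees nothing pending.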


2023-11-22 19:38:06

by Hugo Villeneuve

Subject: Re: [PATCH v4] serial: sc16is7xx: address RX timeout interrupt errata

On Wed, 22 Nov 2023 08:35:41 +0100
Daniel Mack <[email protected]> wrote:

> This device has a silicon bug that makes it report a timeout interrupt
> but no data in the FIFO.
>
> The datasheet states the following in the errata section 18.1.4:
>
> "If the host reads the receive FIFO at the same time as a
> time-out interrupt condition happens, the host might read 0xCC
> (time-out) in the Interrupt Indication Register (IIR), but bit 0
> of the Line Status Register (LSR) is not set (means there is no
> data in the receive FIFO)."
>
> The errata does not mention it explicitly, but tests have shown, and
> the vendor has confirmed, that the RXLVL register is equally affected.

Hi Daniel,
thank you for the feedback from NXP.

I would suggest replacing this paragraph with something like this:

------
The errata description seems to indicate it affects only polled mode of
operation when reading bit 0 of the LSR register. But when using
interrupt mode (IRQ) like this driver does, reading RXLVL gives a value
of zero even if there is data in the Rx FIFO (confirmed by tests and
NXP).
------

> This bug has hit us on production units. When it does, sc16is7xx_irq()
> spins forever: sc16is7xx_port_irq() keeps seeing an interrupt in the
> IIR register that is never cleared, because the driver does not call
> into sc16is7xx_handle_rx() unless the RXLVL register reports at least
> one byte in the FIFO.
>
> Fix this by always reading one byte when this condition is detected

Change "reading one byte" to "reading one byte from the Rx FIFO".


> in order to clear the interrupt. This approach was confirmed to be
> correct by NXP through their support channels.
>
> Signed-off-by: Daniel Mack <[email protected]>
> Co-Developed-by: Maxim Popov <[email protected]>
> Cc: [email protected]

I tested your patch for the last few days, and I was not able to
reproduce the problem (I put a trace to detect the condition). But
at the same time, it has not caused any regressions.

With the above changes, feel free to add:

Tested-by: Hugo Villeneuve <[email protected]>

Hugo.


> ---
> Meanwhile, NXP has confirmed this fix to be correct.
>
> v4: NXP has confirmed the fix; update the commit log accordingly
> v3: re-added the additional Co-Developed-by and stable@ tags
> v2: reworded the commit log a bit for more context.
>
> drivers/tty/serial/sc16is7xx.c | 12 ++++++++++++
> 1 file changed, 12 insertions(+)
>
> diff --git a/drivers/tty/serial/sc16is7xx.c b/drivers/tty/serial/sc16is7xx.c
> index 289ca7d4e566..76f76e510ed1 100644
> --- a/drivers/tty/serial/sc16is7xx.c
> +++ b/drivers/tty/serial/sc16is7xx.c
> @@ -765,6 +765,18 @@ static bool sc16is7xx_port_irq(struct sc16is7xx_port *s, int portno)
>  		case SC16IS7XX_IIR_RTOI_SRC:
>  		case SC16IS7XX_IIR_XOFFI_SRC:
>  			rxlen = sc16is7xx_port_read(port, SC16IS7XX_RXLVL_REG);
> +
> +			/*
> +			 * There is a silicon bug that makes the chip report a
> +			 * time-out interrupt but no data in the FIFO. This is
> +			 * described in errata section 18.1.4.
> +			 *
> +			 * When this happens, read one byte from the FIFO to
> +			 * clear the interrupt.
> +			 */
> +			if (iir == SC16IS7XX_IIR_RTOI_SRC && !rxlen)
> +				rxlen = 1;
> +
>  			if (rxlen)
>  				sc16is7xx_handle_rx(port, rxlen, iir);
>  			break;
> --
> 2.41.0
>
>