2023-09-18 12:29:16

by Biju Das

Subject: [PATCH 0/3] Fix IRQ storm with GPIO interrupts

The following issues were observed while adding IRQ support for RTC:
* irq_disable does not clear the interrupt source properly.
* The driver does not follow the procedure in the hardware manual for
changing interrupt settings.
* An IRQ storm occurs due to a phantom interrupt when we select the TINT
source. Here the IRQ handler disables the interrupt using
disable_irq_nosync() and schedules a work item, and the work item
re-enables the interrupt with enable_irq().

Biju Das (3):
irqchip: renesas-rzg2l: Fix logic to clear TINT interrupt source
irqchip: renesas-rzg2l: Mask interrupts for changing interrupt
settings
irqchip: renesas-rzg2l: Fix irq storm with edge trigger detection for
TINT

drivers/irqchip/irq-renesas-rzg2l.c | 14 +++++++++++++-
1 file changed, 13 insertions(+), 1 deletion(-)

--
2.25.1


2023-09-18 12:37:16

by Biju Das

Subject: [PATCH 1/3] irqchip: renesas-rzg2l: Fix logic to clear TINT interrupt source

The logic to clear the TINT interrupt source in rzg2l_irqc_irq_disable()
is wrong, as the mask is correct only for the least significant byte of
the TSSR register. This issue was found when testing with two TINT
interrupt sources. So fix the logic for all TINTs by using the macro
TSSEL_SHIFT() to multiply tssr_offset by 8.

Fixes: 3fed09559cd8 ("irqchip: Add RZ/G2L IA55 Interrupt Controller driver")
Signed-off-by: Biju Das <[email protected]>
Tested-by: Claudiu Beznea <[email protected]>
---
drivers/irqchip/irq-renesas-rzg2l.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/irqchip/irq-renesas-rzg2l.c b/drivers/irqchip/irq-renesas-rzg2l.c
index 4bbfa2b0a4df..2cee5477be6b 100644
--- a/drivers/irqchip/irq-renesas-rzg2l.c
+++ b/drivers/irqchip/irq-renesas-rzg2l.c
@@ -118,7 +118,7 @@ static void rzg2l_irqc_irq_disable(struct irq_data *d)

raw_spin_lock(&priv->lock);
reg = readl_relaxed(priv->base + TSSR(tssr_index));
- reg &= ~(TSSEL_MASK << tssr_offset);
+ reg &= ~(TSSEL_MASK << TSSEL_SHIFT(tssr_offset));
writel_relaxed(reg, priv->base + TSSR(tssr_index));
raw_spin_unlock(&priv->lock);
}
--
2.25.1
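
To see why the unshifted mask only works for the first TINT field, here is a minimal user-space sketch. It assumes each TSSR register packs 8-bit TINT fields and that TSSEL_MASK is 0xff; only the "multiply by 8" behaviour of TSSEL_SHIFT() comes from the commit message, the other definitions and the helper names are assumptions made for illustration, not copied from the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed definitions, mirroring the names used in the patch. */
#define TSSEL_SHIFT(n)	(8u * (n))	/* per the commit message */
#define TSSEL_MASK	0xffu		/* assumption: 8-bit field */

/* Old logic: shifts the mask by the field index, not the bit position. */
static uint32_t clear_tint_old(uint32_t reg, unsigned int tssr_offset)
{
	return reg & ~(TSSEL_MASK << tssr_offset);
}

/* Fixed logic: shifts the mask to the start of the selected field. */
static uint32_t clear_tint_new(uint32_t reg, unsigned int tssr_offset)
{
	return reg & ~(TSSEL_MASK << TSSEL_SHIFT(tssr_offset));
}

/* Hypothetical self-check with two TINT fields programmed. */
static void tssr_mask_demo(void)
{
	uint32_t reg = 0x0000a5c3;	/* field 0 = 0xc3, field 1 = 0xa5 */

	/* Field 0: both variants agree, which hides the bug. */
	assert(clear_tint_old(reg, 0) == 0x0000a500);
	assert(clear_tint_new(reg, 0) == 0x0000a500);

	/* Field 1: the old mask covers bits 1..8 and damages both fields. */
	assert(clear_tint_old(reg, 1) == 0x0000a401);

	/* Field 1: the fixed mask clears only bits 8..15. */
	assert(clear_tint_new(reg, 1) == 0x000000c3);
}
```

With a single TINT source in field 0 the two variants behave identically, which is consistent with the bug only showing up when two TINT sources were tested.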

2023-09-18 12:37:53

by Biju Das

Subject: [PATCH 3/3] irqchip: renesas-rzg2l: Fix irq storm with edge trigger detection for TINT

In case of edge trigger detection, enabling the TINT source causes a
phantom interrupt that leads to an IRQ storm. So clear the phantom
interrupt in rzg2l_irqc_irq_enable().

This issue is observed when the IRQ handler disables the interrupt using
disable_irq_nosync() and schedules a work item, and the work item
re-enables the interrupt with enable_irq().

Fixes: 3fed09559cd8 ("irqchip: Add RZ/G2L IA55 Interrupt Controller driver")
Signed-off-by: Biju Das <[email protected]>
Tested-by: Claudiu Beznea <[email protected]>
---
drivers/irqchip/irq-renesas-rzg2l.c | 6 ++++++
1 file changed, 6 insertions(+)

diff --git a/drivers/irqchip/irq-renesas-rzg2l.c b/drivers/irqchip/irq-renesas-rzg2l.c
index 33a22bafedcd..78a9e90512a6 100644
--- a/drivers/irqchip/irq-renesas-rzg2l.c
+++ b/drivers/irqchip/irq-renesas-rzg2l.c
@@ -144,6 +144,12 @@ static void rzg2l_irqc_irq_enable(struct irq_data *d)
reg = readl_relaxed(priv->base + TSSR(tssr_index));
reg |= (TIEN | tint) << TSSEL_SHIFT(tssr_offset);
writel_relaxed(reg, priv->base + TSSR(tssr_index));
+ /*
+ * In case of edge trigger detection, enabling the TINT source
+ * causes a phantom interrupt that leads to an IRQ storm. So clear
+ * the phantom interrupt.
+ */
+ rzg2l_tint_eoi(d);
raw_spin_unlock(&priv->lock);
irq_chip_unmask_parent(d);
}
--
2.25.1

2023-09-18 12:50:56

by Geert Uytterhoeven

Subject: Re: [PATCH 1/3] irqchip: renesas-rzg2l: Fix logic to clear TINT interrupt source

On Mon, Sep 18, 2023 at 2:24 PM Biju Das <[email protected]> wrote:
> The logic to clear the TINT interrupt source in rzg2l_irqc_irq_disable()
> is wrong as the mask is correct only for LSB on the TSSR register.
> This issue is found when testing with two TINT interrupt sources. So fix
> the logic for all TINTs by using the macro TSSEL_SHIFT() to multiply
> tssr_offset with 8.
>
> Fixes: 3fed09559cd8 ("irqchip: Add RZ/G2L IA55 Interrupt Controller driver")
> Signed-off-by: Biju Das <[email protected]>
> Tested-by: Claudiu Beznea <[email protected]>

Reviewed-by: Geert Uytterhoeven <[email protected]>

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds

2023-09-19 09:09:00

by claudiu beznea

Subject: Re: [PATCH 1/3] irqchip: renesas-rzg2l: Fix logic to clear TINT interrupt source



On 18.09.2023 15:24, Biju Das wrote:
> The logic to clear the TINT interrupt source in rzg2l_irqc_irq_disable()
> is wrong as the mask is correct only for LSB on the TSSR register.
> This issue is found when testing with two TINT interrupt sources. So fix
> the logic for all TINTs by using the macro TSSEL_SHIFT() to multiply
> tssr_offset with 8.
>
> Fixes: 3fed09559cd8 ("irqchip: Add RZ/G2L IA55 Interrupt Controller driver")
> Signed-off-by: Biju Das <[email protected]>
> Tested-by: Claudiu Beznea <[email protected]>

Reviewed-by: Claudiu Beznea <[email protected]>

> ---
> drivers/irqchip/irq-renesas-rzg2l.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/irqchip/irq-renesas-rzg2l.c b/drivers/irqchip/irq-renesas-rzg2l.c
> index 4bbfa2b0a4df..2cee5477be6b 100644
> --- a/drivers/irqchip/irq-renesas-rzg2l.c
> +++ b/drivers/irqchip/irq-renesas-rzg2l.c
> @@ -118,7 +118,7 @@ static void rzg2l_irqc_irq_disable(struct irq_data *d)
>
> raw_spin_lock(&priv->lock);
> reg = readl_relaxed(priv->base + TSSR(tssr_index));
> - reg &= ~(TSSEL_MASK << tssr_offset);
> + reg &= ~(TSSEL_MASK << TSSEL_SHIFT(tssr_offset));
> writel_relaxed(reg, priv->base + TSSR(tssr_index));
> raw_spin_unlock(&priv->lock);
> }

2023-09-19 14:48:19

by Marc Zyngier

Subject: Re: [PATCH 3/3] irqchip: renesas-rzg2l: Fix irq storm with edge trigger detection for TINT

On Mon, 18 Sep 2023 13:24:11 +0100,
Biju Das <[email protected]> wrote:
>
> In case of edge trigger detection, enabling the TINT source causes a
> phantum interrupt that leads to irq storm. So clear the phantum interrupt
> in rzg2l_irqc_irq_enable().
>
> This issue is observed when the irq handler disables the interrupts using
> disable_irq_nosync() and scheduling a work queue and in the work queue,
> re-enabling the interrupt with enable_irq().
>
> Fixes: 3fed09559cd8 ("irqchip: Add RZ/G2L IA55 Interrupt Controller driver")
> Signed-off-by: Biju Das <[email protected]>
> Tested-by: Claudiu Beznea <[email protected]>
> ---
> drivers/irqchip/irq-renesas-rzg2l.c | 6 ++++++
> 1 file changed, 6 insertions(+)
>
> diff --git a/drivers/irqchip/irq-renesas-rzg2l.c b/drivers/irqchip/irq-renesas-rzg2l.c
> index 33a22bafedcd..78a9e90512a6 100644
> --- a/drivers/irqchip/irq-renesas-rzg2l.c
> +++ b/drivers/irqchip/irq-renesas-rzg2l.c
> @@ -144,6 +144,12 @@ static void rzg2l_irqc_irq_enable(struct irq_data *d)
> reg = readl_relaxed(priv->base + TSSR(tssr_index));
> reg |= (TIEN | tint) << TSSEL_SHIFT(tssr_offset);
> writel_relaxed(reg, priv->base + TSSR(tssr_index));
> + /*
> + * In case of edge trigger detection, enabling the TINT source
> + * cause a phantum interrupt that leads to irq storm. So clear
> + * the phantum interrupt.
> + */
> + rzg2l_tint_eoi(d);

This looks incredibly unsafe. disable_irq()+enable_irq() with an
interrupt being made pending in the middle, and you've lost that
interrupt.

What prevents this scenario?

M.

--
Without deviation from the norm, progress is not possible.
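
The concern raised above can be illustrated with a toy user-space model. It is only a sketch: the struct, the function names and the "glitch on select" flag are hypothetical and do not model the real IA55 hardware. It shows that an unconditional clear in the enable path discards a latched status bit whether it came from a phantom event or from a real edge.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of one edge-triggered line; all names are hypothetical. */
struct toy_tint {
	bool selected;	/* a TINT source is selected and enabled */
	bool status;	/* latched edge status bit */
};

/* Selecting the source may latch a phantom event (glitch == true). */
static void toy_select_source(struct toy_tint *t, bool glitch)
{
	t->selected = true;
	if (glitch)
		t->status = true;
}

/* A real device edge is latched only while a source is selected. */
static void toy_edge(struct toy_tint *t)
{
	if (t->selected)
		t->status = true;
}

/* Stands in for the unconditional clear done in the enable path. */
static void toy_eoi(struct toy_tint *t)
{
	t->status = false;
}

static void toy_lost_edge_demo(void)
{
	struct toy_tint t = { false, false };

	/* Phantom case: the clear removes the unwanted event. */
	toy_select_source(&t, true);
	toy_eoi(&t);
	assert(!t.status);

	/* Real edge landing between the select and the clear. */
	t.selected = false;
	toy_select_source(&t, false);
	toy_edge(&t);
	assert(t.status);
	toy_eoi(&t);
	assert(!t.status);	/* the real edge is gone as well */
}
```

In both cases the enable path sees the same status bit, so the clear by itself has no way to tell them apart.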

2023-09-19 16:28:09

by Marc Zyngier

Subject: Re: [PATCH 3/3] irqchip: renesas-rzg2l: Fix irq storm with edge trigger detection for TINT

On Tue, 19 Sep 2023 16:24:53 +0100,
Biju Das <[email protected]> wrote:
>
> Hi Marc Zyngier,
>
> Thanks for the feedback.
>
> > Subject: Re: [PATCH 3/3] irqchip: renesas-rzg2l: Fix irq storm with edge
> > trigger detection for TINT
> >
> > On Mon, 18 Sep 2023 13:24:11 +0100,
> > Biju Das <[email protected]> wrote:
> > >
> > > In case of edge trigger detection, enabling the TINT source causes a
> > > phantum interrupt that leads to irq storm. So clear the phantum
> > > interrupt in rzg2l_irqc_irq_enable().
> > >
> > > This issue is observed when the irq handler disables the interrupts
> > > using
> > > disable_irq_nosync() and scheduling a work queue and in the work
> > > queue, re-enabling the interrupt with enable_irq().
> > >
> > > Fixes: 3fed09559cd8 ("irqchip: Add RZ/G2L IA55 Interrupt Controller
> > > driver")
> > > Signed-off-by: Biju Das <[email protected]>
> > > Tested-by: Claudiu Beznea <[email protected]>
> > > ---
> > > drivers/irqchip/irq-renesas-rzg2l.c | 6 ++++++
> > > 1 file changed, 6 insertions(+)
> > >
> > > diff --git a/drivers/irqchip/irq-renesas-rzg2l.c
> > > b/drivers/irqchip/irq-renesas-rzg2l.c
> > > index 33a22bafedcd..78a9e90512a6 100644
> > > --- a/drivers/irqchip/irq-renesas-rzg2l.c
> > > +++ b/drivers/irqchip/irq-renesas-rzg2l.c
> > > @@ -144,6 +144,12 @@ static void rzg2l_irqc_irq_enable(struct irq_data
> > *d)
> > > reg = readl_relaxed(priv->base + TSSR(tssr_index));
> > > reg |= (TIEN | tint) << TSSEL_SHIFT(tssr_offset);
> > > writel_relaxed(reg, priv->base + TSSR(tssr_index));
> > > + /*
> > > + * In case of edge trigger detection, enabling the TINT source
> > > + * cause a phantum interrupt that leads to irq storm. So clear
> > > + * the phantum interrupt.
> > > + */
> > > + rzg2l_tint_eoi(d);
> >
> > This looks incredibly unsafe. disable_irq()+enable_irq() with an interrupt
> > being made pending in the middle, and you've lost that interrupt.
>
> In this driver that will never happen as it clears the TINT source
> during disable(), so there won't be any TINT source for interrupt
> detection after disable().

So you mean that you *already* lose interrupts across a disable
followed by an enable? I'm slightly puzzled...

M.

--
Without deviation from the norm, progress is not possible.

2023-09-19 16:58:59

by Marc Zyngier

Subject: Re: [PATCH 3/3] irqchip: renesas-rzg2l: Fix irq storm with edge trigger detection for TINT

On Tue, 19 Sep 2023 17:32:05 +0100,
Biju Das <[email protected]> wrote:

[...]

> > So you mean that you *already* lose interrupts across a disable followed by
> > an enable? I'm slightly puzzled...
>
> There is no interrupt lost at all.
>
> Currently this patch addresses 2 issues.
>
> Scenario 1: Extra interrupt when we select TINT source on enable_irq()
>
> Getting an extra interrupt, when client drivers calls enable_irq()
> during probe()/resume(). In this case, the irq handler on the Client
> driver just clear the interrupt status bit.
>
> Issue 2: IRQ storm when we select TINT source on enable_irq()
>
> Here as well, we are getting an extra interrupt, when client drivers
> calls enable_irq() during probe() and this Interrupts getting
> generated infinitely, when the client driver calls disable_irq() in
> irq handler and in in work queue calling enable_irq().

How do you know this is a spurious interrupt? For all you can tell,
you are just consuming an edge. I absolutely don't buy this
workaround, because you have no context that allows you to
discriminate between a real spurious interrupt and a normal interrupt
that lands while the interrupt line was masked.

> Currently we are not loosing interrupts, but we are getting additional
> Interrupt(phantom) which is causing the issue.

If you get an interrupt at probe time in the endpoint driver, that's
probably because the device is not in a quiescent state when the
interrupt is requested. And it is probably this that needs addressing.

M.

--
Without deviation from the norm, progress is not possible.

2023-09-19 17:01:42

by Biju Das

Subject: RE: [PATCH 3/3] irqchip: renesas-rzg2l: Fix irq storm with edge trigger detection for TINT

Hi Marc Zyngier,

> Subject: Re: [PATCH 3/3] irqchip: renesas-rzg2l: Fix irq storm with edge
> trigger detection for TINT
>
> On Tue, 19 Sep 2023 16:24:53 +0100,
> Biju Das <[email protected]> wrote:
> >
> > Hi Marc Zyngier,
> >
> > Thanks for the feedback.
> >
> > > Subject: Re: [PATCH 3/3] irqchip: renesas-rzg2l: Fix irq storm with
> > > edge trigger detection for TINT
> > >
> > > On Mon, 18 Sep 2023 13:24:11 +0100,
> > > Biju Das <[email protected]> wrote:
> > > >
> > > > In case of edge trigger detection, enabling the TINT source causes
> > > > a phantum interrupt that leads to irq storm. So clear the phantum
> > > > interrupt in rzg2l_irqc_irq_enable().
> > > >
> > > > This issue is observed when the irq handler disables the
> > > > interrupts using
> > > > disable_irq_nosync() and scheduling a work queue and in the work
> > > > queue, re-enabling the interrupt with enable_irq().
> > > >
> > > > Fixes: 3fed09559cd8 ("irqchip: Add RZ/G2L IA55 Interrupt
> > > > Controller
> > > > driver")
> > > > Signed-off-by: Biju Das <[email protected]>
> > > > Tested-by: Claudiu Beznea <[email protected]>
> > > > ---
> > > > drivers/irqchip/irq-renesas-rzg2l.c | 6 ++++++
> > > > 1 file changed, 6 insertions(+)
> > > >
> > > > diff --git a/drivers/irqchip/irq-renesas-rzg2l.c
> > > > b/drivers/irqchip/irq-renesas-rzg2l.c
> > > > index 33a22bafedcd..78a9e90512a6 100644
> > > > --- a/drivers/irqchip/irq-renesas-rzg2l.c
> > > > +++ b/drivers/irqchip/irq-renesas-rzg2l.c
> > > > @@ -144,6 +144,12 @@ static void rzg2l_irqc_irq_enable(struct
> > > > irq_data
> > > *d)
> > > > reg = readl_relaxed(priv->base + TSSR(tssr_index));
> > > > reg |= (TIEN | tint) << TSSEL_SHIFT(tssr_offset);
> > > > writel_relaxed(reg, priv->base + TSSR(tssr_index));
> > > > + /*
> > > > + * In case of edge trigger detection, enabling the TINT
> source
> > > > + * cause a phantum interrupt that leads to irq storm. So
> clear
> > > > + * the phantum interrupt.
> > > > + */
> > > > + rzg2l_tint_eoi(d);
> > >
> > > This looks incredibly unsafe. disable_irq()+enable_irq() with an
> > > interrupt being made pending in the middle, and you've lost that
> interrupt.
> >
> > In this driver that will never happen as it clears the TINT source
> > during disable(), so there won't be any TINT source for interrupt
> > detection after disable().
>
> So you mean that you *already* lose interrupts across a disable followed by
> an enable? I'm slightly puzzled...

No interrupt is lost at all.

Currently this patch addresses 2 issues.

Issue 1: Extra interrupt when we select the TINT source on enable_irq()

We get an extra interrupt when a client driver calls enable_irq() during
probe()/resume(). In this case, the IRQ handler in the client driver
just clears the interrupt status bit.

Issue 2: IRQ storm when we select the TINT source on enable_irq()

Here as well, we get an extra interrupt when a client driver calls
enable_irq() during probe(), and the interrupt is generated endlessly
when the client driver calls disable_irq() in the IRQ handler and
enable_irq() in the work queue.

Currently we are not losing interrupts, but we are getting an additional
(phantom) interrupt, which is causing the issue.

Cheers,
Biju


2023-09-19 22:31:13

by Biju Das

Subject: RE: [PATCH 3/3] irqchip: renesas-rzg2l: Fix irq storm with edge trigger detection for TINT

Hi Marc Zyngier,

Thanks for the feedback.

> Subject: Re: [PATCH 3/3] irqchip: renesas-rzg2l: Fix irq storm with edge
> trigger detection for TINT
>
> On Mon, 18 Sep 2023 13:24:11 +0100,
> Biju Das <[email protected]> wrote:
> >
> > In case of edge trigger detection, enabling the TINT source causes a
> > phantum interrupt that leads to irq storm. So clear the phantum
> > interrupt in rzg2l_irqc_irq_enable().
> >
> > This issue is observed when the irq handler disables the interrupts
> > using
> > disable_irq_nosync() and scheduling a work queue and in the work
> > queue, re-enabling the interrupt with enable_irq().
> >
> > Fixes: 3fed09559cd8 ("irqchip: Add RZ/G2L IA55 Interrupt Controller
> > driver")
> > Signed-off-by: Biju Das <[email protected]>
> > Tested-by: Claudiu Beznea <[email protected]>
> > ---
> > drivers/irqchip/irq-renesas-rzg2l.c | 6 ++++++
> > 1 file changed, 6 insertions(+)
> >
> > diff --git a/drivers/irqchip/irq-renesas-rzg2l.c
> > b/drivers/irqchip/irq-renesas-rzg2l.c
> > index 33a22bafedcd..78a9e90512a6 100644
> > --- a/drivers/irqchip/irq-renesas-rzg2l.c
> > +++ b/drivers/irqchip/irq-renesas-rzg2l.c
> > @@ -144,6 +144,12 @@ static void rzg2l_irqc_irq_enable(struct irq_data
> *d)
> > reg = readl_relaxed(priv->base + TSSR(tssr_index));
> > reg |= (TIEN | tint) << TSSEL_SHIFT(tssr_offset);
> > writel_relaxed(reg, priv->base + TSSR(tssr_index));
> > + /*
> > + * In case of edge trigger detection, enabling the TINT source
> > + * cause a phantum interrupt that leads to irq storm. So clear
> > + * the phantum interrupt.
> > + */
> > + rzg2l_tint_eoi(d);
>
> This looks incredibly unsafe. disable_irq()+enable_irq() with an interrupt
> being made pending in the middle, and you've lost that interrupt.

In this driver that will never happen as it clears the TINT source
during disable(), so there won't be any TINT source for interrupt detection after disable().

Cheers,
Biju

> What prevents this scenario?

2023-09-20 03:43:39

by Biju Das

Subject: RE: [PATCH 3/3] irqchip: renesas-rzg2l: Fix irq storm with edge trigger detection for TINT

Hi Marc Zyngier,

> Subject: Re: [PATCH 3/3] irqchip: renesas-rzg2l: Fix irq storm with edge
> trigger detection for TINT
>
> On Tue, 19 Sep 2023 17:32:05 +0100,
> Biju Das <[email protected]> wrote:
>
> [...]
>
> > > So you mean that you *already* lose interrupts across a disable
> > > followed by an enable? I'm slightly puzzled...
> >
> > There is no interrupt lost at all.
> >
> > Currently this patch addresses 2 issues.
> >
> > Scenario 1: Extra interrupt when we select TINT source on enable_irq()
> >
> > Getting an extra interrupt, when client drivers calls enable_irq()
> > during probe()/resume(). In this case, the irq handler on the Client
> > driver just clear the interrupt status bit.
> >
> > Issue 2: IRQ storm when we select TINT source on enable_irq()
> >
> > Here as well, we are getting an extra interrupt, when client drivers
> > calls enable_irq() during probe() and this Interrupts getting
> > generated infinitely, when the client driver calls disable_irq() in
> > irq handler and in in work queue calling enable_irq().
>
> How do you know this is a spurious interrupt?

We have a PMOD on the RZ/G2L SMARC EVK, so I connected it to a GPIO pin
and the other end to ground. During boot, I get an interrupt when the
IRQ is set up in probe(), even though there is no high-to-low
transition. From this I conclude that it is a spurious interrupt.

> For all you can tell, you are
> just consuming an edge. I absolutely don't buy this workaround, because you
> have no context that allows you to discriminate between a real spurious
> interrupt and a normal interrupt that lands while the interrupt line was
> masked.
>
> > Currently we are not loosing interrupts, but we are getting additional
> > Interrupt(phantom) which is causing the issue.
>
> If you get an interrupt at probe time in the endpoint driver, that's
> probably because the device is not in a quiescent state when the interrupt
> is requested. And it is probably this that needs addressing.

Any pointers for addressing this issue?

Thanks for your help.

Cheers,
Biju

2023-09-21 18:02:38

by Marc Zyngier

Subject: Re: [PATCH 3/3] irqchip: renesas-rzg2l: Fix irq storm with edge trigger detection for TINT

On Tue, 19 Sep 2023 18:06:54 +0100,
Biju Das <[email protected]> wrote:
>
> Hi Marc Zyngier,
>
> > Subject: Re: [PATCH 3/3] irqchip: renesas-rzg2l: Fix irq storm with edge
> > trigger detection for TINT
> >
> > On Tue, 19 Sep 2023 17:32:05 +0100,
> > Biju Das <[email protected]> wrote:
> >
> > [...]
> >
> > > > So you mean that you *already* lose interrupts across a disable
> > > > followed by an enable? I'm slightly puzzled...
> > >
> > > There is no interrupt lost at all.
> > >
> > > Currently this patch addresses 2 issues.
> > >
> > > Scenario 1: Extra interrupt when we select TINT source on enable_irq()
> > >
> > > Getting an extra interrupt, when client drivers calls enable_irq()
> > > during probe()/resume(). In this case, the irq handler on the Client
> > > driver just clear the interrupt status bit.
> > >
> > > Issue 2: IRQ storm when we select TINT source on enable_irq()
> > >
> > > Here as well, we are getting an extra interrupt, when client drivers
> > > calls enable_irq() during probe() and this Interrupts getting
> > > generated infinitely, when the client driver calls disable_irq() in
> > > irq handler and in in work queue calling enable_irq().
> >
> > How do you know this is a spurious interrupt?
>
> We have PMOD on RZ/G2L SMARC EVK. So I connected it to GPIO pin
> and other end to ground. During the boot, I get an interrupt
> even though there is no high to low transition, when the IRQ is setup
> in the probe(). From this it is a spurious interrupt.

That doesn't really handle my question. At the point of enabling the
interrupt and consuming the edge (which is what this patch does), how
do you know you can readily discard this signal? This is a genuine
question.

Spurious interrupts at boot are common. The HW resets in a funky,
unspecified state, and it's SW's job to initialise it before letting
other agents in the system use interrupts.

>
> > For all you can tell, you are
> > just consuming an edge. I absolutely don't buy this workaround, because you
> > have no context that allows you to discriminate between a real spurious
> > interrupt and a normal interrupt that lands while the interrupt line was
> > masked.
> >
> > > Currently we are not loosing interrupts, but we are getting additional
> > > Interrupt(phantom) which is causing the issue.
> >
> > If you get an interrupt at probe time in the endpoint driver, that's
> > probably because the device is not in a quiescent state when the interrupt
> > is requested. And it is probably this that needs addressing.
>
> Any pointer for addressing this issue?

Nothing but the most basic stuff: you should make sure that the
interrupt isn't enabled before you can actually handle it, and triage
it as spurious.

M.

--
Without deviation from the norm, progress is not possible.

2023-09-23 14:26:52

by Biju Das

Subject: RE: [PATCH 3/3] irqchip: renesas-rzg2l: Fix irq storm with edge trigger detection for TINT

Hi Marc Zyngier,

Thanks for the feedback.

> Subject: Re: [PATCH 3/3] irqchip: renesas-rzg2l: Fix irq storm with edge
> trigger detection for TINT
>
> On Tue, 19 Sep 2023 18:06:54 +0100,
> Biju Das <[email protected]> wrote:
> >
> > Hi Marc Zyngier,
> >
> > > Subject: Re: [PATCH 3/3] irqchip: renesas-rzg2l: Fix irq storm with
> > > edge trigger detection for TINT
> > >
> > > On Tue, 19 Sep 2023 17:32:05 +0100,
> > > Biju Das <[email protected]> wrote:
> > >
> > > [...]
> > >
> > > > > So you mean that you *already* lose interrupts across a disable
> > > > > followed by an enable? I'm slightly puzzled...
> > > >
> > > > There is no interrupt lost at all.
> > > >
> > > > Currently this patch addresses 2 issues.
> > > >
> > > > Scenario 1: Extra interrupt when we select TINT source on
> > > > enable_irq()
> > > >
> > > > Getting an extra interrupt, when client drivers calls enable_irq()
> > > > during probe()/resume(). In this case, the irq handler on the
> > > > Client driver just clear the interrupt status bit.
> > > >
> > > > Issue 2: IRQ storm when we select TINT source on enable_irq()
> > > >
> > > > Here as well, we are getting an extra interrupt, when client
> > > > drivers calls enable_irq() during probe() and this Interrupts
> > > > getting generated infinitely, when the client driver calls
> > > > disable_irq() in irq handler and in in work queue calling
> enable_irq().
> > >
> > > How do you know this is a spurious interrupt?
> >
> > We have PMOD on RZ/G2L SMARC EVK. So I connected it to GPIO pin and
> > other end to ground. During the boot, I get an interrupt even though
> > there is no high to low transition, when the IRQ is setup in the
> > probe(). From this it is a spurious interrupt.
>
> That doesn't really handle my question. At the point of enabling the
> interrupt and consuming the edge (which is what this patch does), how do
> you know you can readily discard this signal? This is a genuine question.
>
> Spurious interrupts at boot are common. The HW resets in a funky,
> unspecified state, and it's SW's job to initialise it before letting other
> agents in the system use interrupts.

I got your point about losing interrupts.

Now I can detect spurious interrupts for edge trigger.

The pin controller driver has a read-only register for monitoring the
input values of GPIO input pins. Using that register's value before and
after rzg2l_irq_enable(), together with the TINT Status Control Register
(TSCR) in the IRQ controller, we can detect the spurious interrupt.

E.g.:
1) Check the PIN_43_0 value (e.g. low) in the pinctrl driver
2) Enable the IRQ using rzg2l_irq_enable()/irq_chip_enable_parent() in
the pinctrl driver
3) Check the PIN_43_0 value (e.g. low) in the pinctrl driver
4) Check the TINT Status Control Register (TSCR) in the IRQ controller
driver

If the values in 1 and 3 are the same and the status in 4 is set, then
there is a spurious interrupt.
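
The check in steps 1 to 4 can be sketched as a small predicate. This is a user-space illustration only: the function name and its boolean inputs are hypothetical stand-ins for the pin level read in steps 1 and 3 and the TSCR status bit read in step 4, and no real register access is shown.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper: pin_before and pin_after are the pin levels sampled
 * in steps 1 and 3, tscr_status is the status bit observed in step 4.
 */
static bool tint_status_is_spurious(bool pin_before, bool pin_after,
				    bool tscr_status)
{
	/* Status latched although the sampled level did not change. */
	return tscr_status && pin_before == pin_after;
}

static void tint_spurious_demo(void)
{
	/* Pin stayed low, status set: treated as spurious. */
	assert(tint_status_is_spurious(false, false, true));

	/* Pin level changed, status set: treated as a real edge. */
	assert(!tint_status_is_spurious(false, true, true));

	/* No status latched: nothing to classify. */
	assert(!tint_status_is_spurious(false, false, false));
}
```

Two level samples cannot see a pulse that starts and ends between them, so this predicate would classify such a pulse as spurious.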

>
> >
> > > For all you can tell, you are
> > > just consuming an edge. I absolutely don't buy this workaround,
> > > because you have no context that allows you to discriminate between
> > > a real spurious interrupt and a normal interrupt that lands while
> > > the interrupt line was masked.
> > >
> > > > Currently we are not loosing interrupts, but we are getting
> > > > additional
> > > > Interrupt(phantom) which is causing the issue.
> > >
> > > If you get an interrupt at probe time in the endpoint driver, that's
> > > probably because the device is not in a quiescent state when the
> > > interrupt is requested. And it is probably this that needs addressing.
> >
> > Any pointer for addressing this issue?
>
> Nothing but the most basic stuff: you should make sure that the interrupt
> isn't enabled before you can actually handle it, and triage it as spurious.

For the GPIO interrupt case, I have:

RTC driver (endpoint) --> Pin controller driver --> IRQ controller driver --> GIC controller.

1) I have configured the pin as a GPIO interrupt in the pin controller driver
2) Set the IRQ detection in the IRQ controller to edge trigger
3) The moment I set the IRQ source in the IRQ controller,
I get an interrupt, even though there is no voltage transition.

Here the system is set up properly, but there is a spurious interrupt.
I currently don't know how to handle it.

Any pointers for handling this issue?

Note:
Currently the pin controller driver does not configure the pin as GPIO
input in the Port Mode Register for GPIO interrupts; instead it uses
the reset value, which is "Hi-Z". I will send a patch to fix it.

Cheers,
Biju

Subject: [irqchip: irq/irqchip-fixes] irqchip: renesas-rzg2l: Fix logic to clear TINT interrupt source

The following commit has been merged into the irq/irqchip-fixes branch of irqchip:

Commit-ID: 9b8df572ba3f4e544366196820a719a40774433e
Gitweb: https://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms/9b8df572ba3f4e544366196820a719a40774433e
Author: Biju Das <[email protected]>
AuthorDate: Mon, 18 Sep 2023 13:24:09 +01:00
Committer: Marc Zyngier <[email protected]>
CommitterDate: Sun, 24 Sep 2023 10:18:19 +01:00

irqchip: renesas-rzg2l: Fix logic to clear TINT interrupt source

The logic to clear the TINT interrupt source in rzg2l_irqc_irq_disable()
is wrong, as the mask is correct only for the least significant byte of
the TSSR register. This issue was found when testing with two TINT
interrupt sources. So fix the logic for all TINTs by using the macro
TSSEL_SHIFT() to multiply tssr_offset by 8.

Fixes: 3fed09559cd8 ("irqchip: Add RZ/G2L IA55 Interrupt Controller driver")
Signed-off-by: Biju Das <[email protected]>
Tested-by: Claudiu Beznea <[email protected]>
Reviewed-by: Geert Uytterhoeven <[email protected]>
Reviewed-by: Claudiu Beznea <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
---
drivers/irqchip/irq-renesas-rzg2l.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/irqchip/irq-renesas-rzg2l.c b/drivers/irqchip/irq-renesas-rzg2l.c
index 4bbfa2b..2cee547 100644
--- a/drivers/irqchip/irq-renesas-rzg2l.c
+++ b/drivers/irqchip/irq-renesas-rzg2l.c
@@ -118,7 +118,7 @@ static void rzg2l_irqc_irq_disable(struct irq_data *d)

raw_spin_lock(&priv->lock);
reg = readl_relaxed(priv->base + TSSR(tssr_index));
- reg &= ~(TSSEL_MASK << tssr_offset);
+ reg &= ~(TSSEL_MASK << TSSEL_SHIFT(tssr_offset));
writel_relaxed(reg, priv->base + TSSR(tssr_index));
raw_spin_unlock(&priv->lock);
}

2023-10-06 10:46:49

by Biju Das

Subject: RE: [PATCH 3/3] irqchip: renesas-rzg2l: Fix irq storm with edge trigger detection for TINT

Hi Marc,

> Subject: RE: [PATCH 3/3] irqchip: renesas-rzg2l: Fix irq storm with edge
> trigger detection for TINT
>
> Hi Marc Zyngier,
>
> Thanks for the feedback.
>
> > Subject: Re: [PATCH 3/3] irqchip: renesas-rzg2l: Fix irq storm with
> > edge trigger detection for TINT
> >
> > On Tue, 19 Sep 2023 18:06:54 +0100,
> > Biju Das <[email protected]> wrote:
> > >
> > > Hi Marc Zyngier,
> > >
> > > > Subject: Re: [PATCH 3/3] irqchip: renesas-rzg2l: Fix irq storm
> > > > with edge trigger detection for TINT
> > > >
> > > > On Tue, 19 Sep 2023 17:32:05 +0100, Biju Das
> > > > <[email protected]> wrote:
> > > >
> > > > [...]
> > > >
> > > > > > So you mean that you *already* lose interrupts across a
> > > > > > disable followed by an enable? I'm slightly puzzled...
> > > > >
> > > > > There is no interrupt lost at all.
> > > > >
> > > > > Currently this patch addresses 2 issues.
> > > > >
> > > > > Scenario 1: Extra interrupt when we select TINT source on
> > > > > enable_irq()
> > > > >
> > > > > Getting an extra interrupt, when client drivers calls
> > > > > enable_irq() during probe()/resume(). In this case, the irq
> > > > > handler on the Client driver just clear the interrupt status bit.
> > > > >
> > > > > Issue 2: IRQ storm when we select TINT source on enable_irq()
> > > > >
> > > > > Here as well, we are getting an extra interrupt, when client
> > > > > drivers calls enable_irq() during probe() and this Interrupts
> > > > > getting generated infinitely, when the client driver calls
> > > > > disable_irq() in irq handler and in in work queue calling
> > enable_irq().
> > > >
> > > > How do you know this is a spurious interrupt?
> > >
> > > We have PMOD on RZ/G2L SMARC EVK. So I connected it to GPIO pin and
> > > other end to ground. During the boot, I get an interrupt even though
> > > there is no high to low transition, when the IRQ is setup in the
> > > probe(). From this it is a spurious interrupt.
> >
> > That doesn't really handle my question. At the point of enabling the
> > interrupt and consuming the edge (which is what this patch does), how
> > do you know you can readily discard this signal? This is a genuine
> question.
> >
> > Spurious interrupts at boot are common. The HW resets in a funky,
> > unspecified state, and it's SW's job to initialise it before letting
> > other agents in the system use interrupts.
>
> I got your point related to loosing interrupts.
>
> Now I can detect spurious interrupts for edge trigger.
>
> Pin controller driver has a read-only register to monitor input values of
> GPIO input pins, use that register values before/after rzg2l_irq_enable()
> with TINT Status Control Register (TSCR) in IRQ controller to detect the
> spurious interrupt.
>
> Eg:
> 1) Check PIN_43_0 value (ex: low)in pinctrl driver
> 2) Enable the IRQ using rzg2l_irq_enable()/ irq_chip_enable_parent()in
> pinctrl driver
> 3) Check PIN_43_0 value (ex: low) in pinctrl driver
> 4) Check the TINT Status Control Register(TSCR) in IRQ controller driver
>
> If the values in 1 and 3 are same and the status in 4 is set, then
> there is a spurious interrupt.
>
> >
> > >
> > > > For all you can tell, you are
> > > > just consuming an edge. I absolutely don't buy this workaround,
> > > > because you have no context that allows you to discriminate
> > > > between a real spurious interrupt and a normal interrupt that
> > > > lands while the interrupt line was masked.
> > > >
> > > > > Currently we are not loosing interrupts, but we are getting
> > > > > additional
> > > > > Interrupt(phantom) which is causing the issue.
> > > >
> > > > If you get an interrupt at probe time in the endpoint driver,
> > > > that's probably because the device is not in a quiescent state
> > > > when the interrupt is requested. And it is probably this that needs
> addressing.
> > >
> > > Any pointer for addressing this issue?
> >
> > Nothing but the most basic stuff: you should make sure that the
> > interrupt isn't enabled before you can actually handle it, and triage it
> as spurious.
>
> For the GPIO interrupt case I have,
>
> RTC driver(endpoint)--> Pin controller driver -->IRQ controller driver--
> >GIC controller.
>
> 1) I have configured the pin as GPIO interrupts in pin controller driver
> 2) Set the IRQ detection in IRQ controller for edge trigger
> 3) The moment I set the IRQ source in IRQ controller
> I get an interrupt, even though there is no voltage transition.
>
> Here the system is setup properly, but there is a spurious interrupt.
> Currently don't know how to handle it?
>
> Any pointers for handling this issue?
>
> Note:
> Currently the pin controller driver is not configuring GPIO as GPIO input
> in Port Mode Register for the GPIO interrupts instead it is using reset
> value which is "Hi-Z". I will send a patch to fix it.

An update: I have found a way to fix the spurious interrupt issue.

A spurious interrupt is generated if we write the TINT source selection
and the TINT source enable to the TSSRx register simultaneously.

If we write the register in the correct order, there is no issue,
i.e., first set the TINT source selection and only after that enable it.

It looks like a HW race condition. I am checking this issue with the HW team.

Cheers,
Biju
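
The ordering described above can be sketched in user space against a fake register that records each write. This is a sketch under assumptions, not the eventual driver change: the fake_tssr type and helper names are hypothetical, TIEN is assumed to be bit 7 of each 8-bit field, and TSSEL_SHIFT() follows the "multiply by 8" description from patch 1.

```c
#include <assert.h>
#include <stdint.h>

#define TSSEL_SHIFT(n)	(8u * (n))	/* per the patch 1 commit message */
#define TIEN		0x80u		/* assumption: enable bit of a field */

#define MAX_WRITES	4

/* Hypothetical stand-in for one TSSR register plus a write log. */
struct fake_tssr {
	uint32_t value;
	uint32_t log[MAX_WRITES];
	unsigned int nwrites;
};

/* Stands in for writel_relaxed(); records the value written. */
static void fake_writel(struct fake_tssr *r, uint32_t v)
{
	if (r->nwrites < MAX_WRITES)
		r->log[r->nwrites++] = v;
	r->value = v;
}

/* Select the source first, then enable it with a second write. */
static void tint_enable_ordered(struct fake_tssr *r,
				unsigned int tssr_offset, uint32_t tint)
{
	uint32_t reg = r->value;

	/* Step 1: source selection only, enable bit still clear. */
	reg |= tint << TSSEL_SHIFT(tssr_offset);
	fake_writel(r, reg);

	/* Step 2: set the enable bit in a separate write. */
	reg |= TIEN << TSSEL_SHIFT(tssr_offset);
	fake_writel(r, reg);
}
```

Compared with the single `reg |= (TIEN | tint) << TSSEL_SHIFT(tssr_offset)` write in the current enable path, the sketch issues two writes per enable. Whether real hardware needs anything more between them is not stated in the thread.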