2022-04-27 14:54:32

by Daniel Thompson

[permalink] [raw]
Subject: [PATCH] irqchip/exiu: Fix acknowledgment of edge triggered interrupts

Currently the EXIU uses the fasteoi interrupt flow that is configured by
it's parent (irq-gic-v3.c). With this flow the only chance to clear the
interrupt request happens during .irq_eoi() and (obviously) this happens
after the interrupt handler has run. EXIU requires edge triggered
interrupts to be acked prior to interrupt handling. Without this we
risk incorrect interrupt dismissal when a new interrupt is delivered
after the handler reads and acknowledges the peripheral but before the
irq_eoi() takes place.

Fix this by clearing the interrupt request from .irq_ack() instead. This
requires switching to the fasteoi-ack flow instead of the fasteoi flow.
This approach is inspired by the nmi code found in irq-sun6i-r.c .

Note: It *is* intentional not to call irq_chip_ack_parent() from
exiu_irq_ack(). Parents that implement the fasteoi-ack flow don't
want us to do that (and we'd crash if we tried).

This problem was discovered through code review rather then reproducing
missed interrupts in a real platform. Nevertheless the change has been
tested for regression on a Developerbox/SC2A11 using the power button
(which is edge triggered and is the only thing connected to the EXIU on
this platform).

Fixes: 706cffc1b912 ("irqchip/exiu: Add support for Socionext Synquacer EXIU controller")
Signed-off-by: Daniel Thompson <[email protected]>
---
arch/arm64/Kconfig.platforms | 1 +
drivers/irqchip/irq-sni-exiu.c | 16 ++++++++++++----
2 files changed, 13 insertions(+), 4 deletions(-)

diff --git a/arch/arm64/Kconfig.platforms b/arch/arm64/Kconfig.platforms
index 30b123cde02c..aaeaf57c8222 100644
--- a/arch/arm64/Kconfig.platforms
+++ b/arch/arm64/Kconfig.platforms
@@ -253,6 +253,7 @@ config ARCH_INTEL_SOCFPGA

config ARCH_SYNQUACER
bool "Socionext SynQuacer SoC Family"
+ select IRQ_FASTEOI_HIERARCHY_HANDLERS

config ARCH_TEGRA
bool "NVIDIA Tegra SoC Family"
diff --git a/drivers/irqchip/irq-sni-exiu.c b/drivers/irqchip/irq-sni-exiu.c
index abd011fcecf4..1cc2ec272ebd 100644
--- a/drivers/irqchip/irq-sni-exiu.c
+++ b/drivers/irqchip/irq-sni-exiu.c
@@ -37,12 +37,11 @@ struct exiu_irq_data {
u32 spi_base;
};

-static void exiu_irq_eoi(struct irq_data *d)
+static void exiu_irq_ack(struct irq_data *d)
{
struct exiu_irq_data *data = irq_data_get_irq_chip_data(d);

writel(BIT(d->hwirq), data->base + EIREQCLR);
- irq_chip_eoi_parent(d);
}

static void exiu_irq_mask(struct irq_data *d)
@@ -104,7 +103,8 @@ static int exiu_irq_set_type(struct irq_data *d, unsigned int type)

static struct irq_chip exiu_irq_chip = {
.name = "EXIU",
- .irq_eoi = exiu_irq_eoi,
+ .irq_ack = exiu_irq_ack,
+ .irq_eoi = irq_chip_eoi_parent,
.irq_enable = exiu_irq_enable,
.irq_mask = exiu_irq_mask,
.irq_unmask = exiu_irq_unmask,
@@ -148,6 +148,7 @@ static int exiu_domain_alloc(struct irq_domain *dom, unsigned int virq,
struct irq_fwspec parent_fwspec;
struct exiu_irq_data *info = dom->host_data;
irq_hw_number_t hwirq;
+ int i, ret;

parent_fwspec = *fwspec;
if (is_of_node(dom->parent->fwnode)) {
@@ -165,7 +166,14 @@ static int exiu_domain_alloc(struct irq_domain *dom, unsigned int virq,
irq_domain_set_hwirq_and_chip(dom, virq, hwirq, &exiu_irq_chip, info);

parent_fwspec.fwnode = dom->parent->fwnode;
- return irq_domain_alloc_irqs_parent(dom, virq, nr_irqs, &parent_fwspec);
+ ret = irq_domain_alloc_irqs_parent(dom, virq, nr_irqs, &parent_fwspec);
+ if (ret)
+ return ret;
+
+ for (i = 0; i < nr_irqs; i++)
+ irq_set_handler(virq+i, handle_fasteoi_ack_irq);
+
+ return 0;
}

static const struct irq_domain_ops exiu_domain_ops = {

base-commit: b2d229d4ddb17db541098b83524d901257e93845
--
2.35.1


2022-04-27 15:56:50

by Ard Biesheuvel

[permalink] [raw]
Subject: Re: [PATCH] irqchip/exiu: Fix acknowledgment of edge triggered interrupts

On Wed, 27 Apr 2022 at 16:27, Daniel Thompson
<[email protected]> wrote:
>
> Currently the EXIU uses the fasteoi interrupt flow that is configured by
> it's parent (irq-gic-v3.c). With this flow the only chance to clear the
> interrupt request happens during .irq_eoi() and (obviously) this happens
> after the interrupt handler has run. EXIU requires edge triggered
> interrupts to be acked prior to interrupt handling. Without this we
> risk incorrect interrupt dismissal when a new interrupt is delivered
> after the handler reads and acknowledges the peripheral but before the
> irq_eoi() takes place.
>
> Fix this by clearing the interrupt request from .irq_ack() instead. This
> requires switching to the fasteoi-ack flow instead of the fasteoi flow.
> This approach is inspired by the nmi code found in irq-sun6i-r.c .
>

How are level triggered EXIU interrupts affected by this change?

> Note: It *is* intentional not to call irq_chip_ack_parent() from
> exiu_irq_ack(). Parents that implement the fasteoi-ack flow don't
> want us to do that (and we'd crash if we tried).
>
> This problem was discovered through code review rather then reproducing
> missed interrupts in a real platform. Nevertheless the change has been
> tested for regression on a Developerbox/SC2A11 using the power button
> (which is edge triggered and is the only thing connected to the EXIU on
> this platform).
>

This is not entirely true. The PHY interrupt is also wired to the
EXIU, but this never worked reliably, so we don't expose this in the
DT or ACPI tables.
I wonder if this change would help, but I don't remember exactly what
the problem was.

> Fixes: 706cffc1b912 ("irqchip/exiu: Add support for Socionext Synquacer EXIU controller")
> Signed-off-by: Daniel Thompson <[email protected]>
> ---
> arch/arm64/Kconfig.platforms | 1 +
> drivers/irqchip/irq-sni-exiu.c | 16 ++++++++++++----
> 2 files changed, 13 insertions(+), 4 deletions(-)
>
> diff --git a/arch/arm64/Kconfig.platforms b/arch/arm64/Kconfig.platforms
> index 30b123cde02c..aaeaf57c8222 100644
> --- a/arch/arm64/Kconfig.platforms
> +++ b/arch/arm64/Kconfig.platforms
> @@ -253,6 +253,7 @@ config ARCH_INTEL_SOCFPGA
>
> config ARCH_SYNQUACER
> bool "Socionext SynQuacer SoC Family"
> + select IRQ_FASTEOI_HIERARCHY_HANDLERS
>
> config ARCH_TEGRA
> bool "NVIDIA Tegra SoC Family"
> diff --git a/drivers/irqchip/irq-sni-exiu.c b/drivers/irqchip/irq-sni-exiu.c
> index abd011fcecf4..1cc2ec272ebd 100644
> --- a/drivers/irqchip/irq-sni-exiu.c
> +++ b/drivers/irqchip/irq-sni-exiu.c
> @@ -37,12 +37,11 @@ struct exiu_irq_data {
> u32 spi_base;
> };
>
> -static void exiu_irq_eoi(struct irq_data *d)
> +static void exiu_irq_ack(struct irq_data *d)
> {
> struct exiu_irq_data *data = irq_data_get_irq_chip_data(d);
>
> writel(BIT(d->hwirq), data->base + EIREQCLR);
> - irq_chip_eoi_parent(d);
> }
>
> static void exiu_irq_mask(struct irq_data *d)
> @@ -104,7 +103,8 @@ static int exiu_irq_set_type(struct irq_data *d, unsigned int type)
>
> static struct irq_chip exiu_irq_chip = {
> .name = "EXIU",
> - .irq_eoi = exiu_irq_eoi,
> + .irq_ack = exiu_irq_ack,
> + .irq_eoi = irq_chip_eoi_parent,
> .irq_enable = exiu_irq_enable,
> .irq_mask = exiu_irq_mask,
> .irq_unmask = exiu_irq_unmask,
> @@ -148,6 +148,7 @@ static int exiu_domain_alloc(struct irq_domain *dom, unsigned int virq,
> struct irq_fwspec parent_fwspec;
> struct exiu_irq_data *info = dom->host_data;
> irq_hw_number_t hwirq;
> + int i, ret;
>
> parent_fwspec = *fwspec;
> if (is_of_node(dom->parent->fwnode)) {
> @@ -165,7 +166,14 @@ static int exiu_domain_alloc(struct irq_domain *dom, unsigned int virq,
> irq_domain_set_hwirq_and_chip(dom, virq, hwirq, &exiu_irq_chip, info);
>
> parent_fwspec.fwnode = dom->parent->fwnode;
> - return irq_domain_alloc_irqs_parent(dom, virq, nr_irqs, &parent_fwspec);
> + ret = irq_domain_alloc_irqs_parent(dom, virq, nr_irqs, &parent_fwspec);
> + if (ret)
> + return ret;
> +
> + for (i = 0; i < nr_irqs; i++)
> + irq_set_handler(virq+i, handle_fasteoi_ack_irq);
> +
> + return 0;
> }
>
> static const struct irq_domain_ops exiu_domain_ops = {
>
> base-commit: b2d229d4ddb17db541098b83524d901257e93845
> --
> 2.35.1
>

2022-04-29 16:26:55

by Daniel Thompson

[permalink] [raw]
Subject: Re: [PATCH] irqchip/exiu: Fix acknowledgment of edge triggered interrupts

On Wed, Apr 27, 2022 at 05:29:33PM +0200, Ard Biesheuvel wrote:
> On Wed, 27 Apr 2022 at 16:27, Daniel Thompson
> <[email protected]> wrote:
> >
> > Currently the EXIU uses the fasteoi interrupt flow that is configured by
> > it's parent (irq-gic-v3.c). With this flow the only chance to clear the
> > interrupt request happens during .irq_eoi() and (obviously) this happens
> > after the interrupt handler has run. EXIU requires edge triggered
> > interrupts to be acked prior to interrupt handling. Without this we
> > risk incorrect interrupt dismissal when a new interrupt is delivered
> > after the handler reads and acknowledges the peripheral but before the
> > irq_eoi() takes place.
> >
> > Fix this by clearing the interrupt request from .irq_ack() instead. This
> > requires switching to the fasteoi-ack flow instead of the fasteoi flow.
> > This approach is inspired by the nmi code found in irq-sun6i-r.c .
> >
>
> How are level triggered EXIU interrupts affected by this change?

Functionally they should not be affected simply because the controller
does not care when (or even if) software writes to IREQCLR... and
testing on SC2A11/Developerbox with a hacked gpio-keys driver does
back this up.

Since the controller doesn't really care either way I originally
decided to change the flow globally. We could perhaps optimize things
slightly by switching to the fasteoi flow for level triggered
interrupts... they would then be cheaper than they are today (because
we could avoid the spurious ack entirely).


> > Note: It *is* intentional not to call irq_chip_ack_parent() from
> > exiu_irq_ack(). Parents that implement the fasteoi-ack flow don't
> > want us to do that (and we'd crash if we tried).
> >
> > This problem was discovered through code review rather then reproducing
> > missed interrupts in a real platform. Nevertheless the change has been
> > tested for regression on a Developerbox/SC2A11 using the power button
> > (which is edge triggered and is the only thing connected to the EXIU on
> > this platform).
>
> This is not entirely true. The PHY interrupt is also wired to the
> EXIU, but this never worked reliably, so we don't expose this in the
> DT or ACPI tables.
> I wonder if this change would help, but I don't remember exactly what
> the problem was.

Fair point. Although I'm afraid that, in the short term, I will have to
take the simple approach to fixing this: s/connected/enabled by default/


Daniel.


> > Signed-off-by: Daniel Thompson <[email protected]>
> > ---
> > arch/arm64/Kconfig.platforms | 1 +
> > drivers/irqchip/irq-sni-exiu.c | 16 ++++++++++++----
> > 2 files changed, 13 insertions(+), 4 deletions(-)
> >
> > diff --git a/arch/arm64/Kconfig.platforms b/arch/arm64/Kconfig.platforms
> > index 30b123cde02c..aaeaf57c8222 100644
> > --- a/arch/arm64/Kconfig.platforms
> > +++ b/arch/arm64/Kconfig.platforms
> > @@ -253,6 +253,7 @@ config ARCH_INTEL_SOCFPGA
> >
> > config ARCH_SYNQUACER
> > bool "Socionext SynQuacer SoC Family"
> > + select IRQ_FASTEOI_HIERARCHY_HANDLERS
> >
> > config ARCH_TEGRA
> > bool "NVIDIA Tegra SoC Family"
> > diff --git a/drivers/irqchip/irq-sni-exiu.c b/drivers/irqchip/irq-sni-exiu.c
> > index abd011fcecf4..1cc2ec272ebd 100644
> > --- a/drivers/irqchip/irq-sni-exiu.c
> > +++ b/drivers/irqchip/irq-sni-exiu.c
> > @@ -37,12 +37,11 @@ struct exiu_irq_data {
> > u32 spi_base;
> > };
> >
> > -static void exiu_irq_eoi(struct irq_data *d)
> > +static void exiu_irq_ack(struct irq_data *d)
> > {
> > struct exiu_irq_data *data = irq_data_get_irq_chip_data(d);
> >
> > writel(BIT(d->hwirq), data->base + EIREQCLR);
> > - irq_chip_eoi_parent(d);
> > }
> >
> > static void exiu_irq_mask(struct irq_data *d)
> > @@ -104,7 +103,8 @@ static int exiu_irq_set_type(struct irq_data *d, unsigned int type)
> >
> > static struct irq_chip exiu_irq_chip = {
> > .name = "EXIU",
> > - .irq_eoi = exiu_irq_eoi,
> > + .irq_ack = exiu_irq_ack,
> > + .irq_eoi = irq_chip_eoi_parent,
> > .irq_enable = exiu_irq_enable,
> > .irq_mask = exiu_irq_mask,
> > .irq_unmask = exiu_irq_unmask,
> > @@ -148,6 +148,7 @@ static int exiu_domain_alloc(struct irq_domain *dom, unsigned int virq,
> > struct irq_fwspec parent_fwspec;
> > struct exiu_irq_data *info = dom->host_data;
> > irq_hw_number_t hwirq;
> > + int i, ret;
> >
> > parent_fwspec = *fwspec;
> > if (is_of_node(dom->parent->fwnode)) {
> > @@ -165,7 +166,14 @@ static int exiu_domain_alloc(struct irq_domain *dom, unsigned int virq,
> > irq_domain_set_hwirq_and_chip(dom, virq, hwirq, &exiu_irq_chip, info);
> >
> > parent_fwspec.fwnode = dom->parent->fwnode;
> > - return irq_domain_alloc_irqs_parent(dom, virq, nr_irqs, &parent_fwspec);
> > + ret = irq_domain_alloc_irqs_parent(dom, virq, nr_irqs, &parent_fwspec);
> > + if (ret)
> > + return ret;
> > +
> > + for (i = 0; i < nr_irqs; i++)
> > + irq_set_handler(virq+i, handle_fasteoi_ack_irq);
> > +
> > + return 0;
> > }
> >
> > static const struct irq_domain_ops exiu_domain_ops = {
> >
> > base-commit: b2d229d4ddb17db541098b83524d901257e93845
> > --
> > 2.35.1
> >

2022-04-30 13:01:31

by Daniel Thompson

[permalink] [raw]
Subject: Re: [PATCH] irqchip/exiu: Fix acknowledgment of edge triggered interrupts

On Fri, Apr 29, 2022 at 02:28:58PM +0100, Daniel Thompson wrote:
> On Wed, Apr 27, 2022 at 05:29:33PM +0200, Ard Biesheuvel wrote:
> > On Wed, 27 Apr 2022 at 16:27, Daniel Thompson
> > <[email protected]> wrote:
> > >
> > > Currently the EXIU uses the fasteoi interrupt flow that is configured by
> > > it's parent (irq-gic-v3.c). With this flow the only chance to clear the
> > > interrupt request happens during .irq_eoi() and (obviously) this happens
> > > after the interrupt handler has run. EXIU requires edge triggered
> > > interrupts to be acked prior to interrupt handling. Without this we
> > > risk incorrect interrupt dismissal when a new interrupt is delivered
> > > after the handler reads and acknowledges the peripheral but before the
> > > irq_eoi() takes place.
> > >
> > > Fix this by clearing the interrupt request from .irq_ack() instead. This
> > > requires switching to the fasteoi-ack flow instead of the fasteoi flow.
> > > This approach is inspired by the nmi code found in irq-sun6i-r.c .
> > >
> >
> > How are level triggered EXIU interrupts affected by this change?
>
> Functionally they should not be affected simply because the controller
> does not care when (or even if) software writes to IREQCLR... and
> testing on SC2A11/Developerbox with a hacked gpio-keys driver does
> back this up.

If, and only if, you do the experiment properly... and this afternoon
the penny dropped and I fixed that.

In summary, level triggered interrupts require IREQCLR to happen at EOI
(to avoid spurious re-entry) whilst edge triggered interrupts require
IREQCLR to be acked before ISR is entered (to avoid spurious dismissal).

v2 is on its way...


Daniel.