LinuxLists.cc - [PATCH V4 1/3] irqchip/sifive-plic: Add thead,c900-plic support

Subject: Re: [PATCH V4 1/3] irqchip/sifive-plic: Add thead,c900-plic support

On Mon, Oct 18, 2021 at 10:47 AM Samuel Holland <[email protected]> wrote:
>
> On 10/15/21 10:21 PM, [email protected] wrote:
> > From: Guo Ren <[email protected]>
> >
> > 1) The irq_mask/unmask() is used by handle_fasteoi_irq() is mostly
> > for ONESHOT irqs and there is no limitation in the RISC-V PLIC driver
> > due to use of irq_mask/unmask() callbacks. In fact, a lot of irqchip
> > drivers using handle_fasteoi_irq() also implement irq_mask/unmask().
> >
> > 2) The C9xx PLIC does not comply with the interrupt claim/completion
> > process defined by the RISC-V PLIC specification because C9xx PLIC
> > will mask an IRQ when it is claimed by PLIC driver (i.e. readl(claim)
> > and the IRQ will be unmasked upon completion by PLIC driver (i.e.
> > writel(claim). This behaviour breaks the handling of IRQS_ONESHOT by
> > the generic handle_fasteoi_irq() used in the PLIC driver.
> >
> > 3) This patch adds an errata fix for IRQS_ONESHOT handling on
> > C9xx PLIC by using irq_enable/disable() callbacks instead of
> > irq_mask/unmask().
> >
> > Signed-off-by: Guo Ren <[email protected]>
> > Cc: Anup Patel <[email protected]>
> > Cc: Thomas Gleixner <[email protected]>
> > Cc: Marc Zyngier <[email protected]>
> > Cc: Palmer Dabbelt <[email protected]>
> > Cc: Atish Patra <[email protected]>
> >
> > ---
> >
> > Changes since V4:
> > - Update comment by Anup
> >
> > Changes since V3:
> > - Rename "c9xx" to "c900"
> > - Add sifive_plic_chip and thead_plic_chip for difference
> >
> > Changes since V2:
> > - Add a separate compatible string "thead,c9xx-plic"
> > - set irq_mask/unmask of "plic_chip" to NULL and point
> > irq_enable/disable of "plic_chip" to plic_irq_mask/unmask
> > - Add a detailed comment block in plic_init() about the
> > differences in Claim/Completion process of RISC-V PLIC and C9xx
> > PLIC.
> > ---
> > drivers/irqchip/irq-sifive-plic.c | 34 +++++++++++++++++++++++++++++--
> > 1 file changed, 32 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/irqchip/irq-sifive-plic.c b/drivers/irqchip/irq-sifive-plic.c
> > index cf74cfa82045..960b29d02070 100644
> > --- a/drivers/irqchip/irq-sifive-plic.c
> > +++ b/drivers/irqchip/irq-sifive-plic.c
> > @@ -166,7 +166,7 @@ static void plic_irq_eoi(struct irq_data *d)
> > writel(d->hwirq, handler->hart_base + CONTEXT_CLAIM);
> > }
> >
> > -static struct irq_chip plic_chip = {
> > +static struct irq_chip sifive_plic_chip = {
> > .name = "SiFive PLIC",
> > .irq_mask = plic_irq_mask,
> > .irq_unmask = plic_irq_unmask,
> > @@ -176,12 +176,32 @@ static struct irq_chip plic_chip = {
> > #endif
> > };
> >
> > +/*
> > + * The C9xx PLIC does not comply with the interrupt claim/completion
> > + * process defined by the RISC-V PLIC specification because C9xx PLIC
> > + * will mask an IRQ when it is claimed by PLIC driver (i.e. readl(claim)
> > + * and the IRQ will be unmasked upon completion by PLIC driver (i.e.
> > + * writel(claim). This behaviour breaks the handling of IRQS_ONESHOT by
> > + * the generic handle_fasteoi_irq() used in the PLIC driver.
> > + */
> > +static struct irq_chip thead_plic_chip = {
> > + .name = "T-Head PLIC",
> > + .irq_disable = plic_irq_mask,
> > + .irq_enable = plic_irq_unmask,
> > + .irq_eoi = plic_irq_eoi,
> > +#ifdef CONFIG_SMP
> > + .irq_set_affinity = plic_set_affinity,
> > +#endif
> I tested this, and it doesn't work. Without IRQCHIP_EOI_THREADED,
> .irq_eoi is called at the end of the hard IRQ handler. This unmasks the
> IRQ before the irqthread has a chance to run, so it causes an interrupt
> storm for any threaded level IRQ (I saw this happen for sun8i_thermal).
>
> With IRQCHIP_EOI_THREADED, .irq_eoi is delayed until after the irqthread
> runs. This is good. Except that the call to unmask_threaded_irq() is
> inside a check for IRQD_IRQ_MASKED. And IRQD_IRQ_MASKED will never be
> set because .irq_mask is NULL. So the end result is that the IRQ is
> never EOI'd and is masked permanently.
>
> If you set .flags = IRQCHIP_EOI_THREADED, and additionally set .irq_mask
> and .irq_unmask to a dummy function that does nothing, the IRQ core will
> properly set/unset IRQD_IRQ_MASKED, and the IRQs will flow as expected.
> But adding dummy functions seems not so ideal, so I am not sure if this
> is the best solution.

This series only tries to optimize a particular case in handle_fasteoi_irq()
for T-HEAD PLIC. I am not sure about this series either.

We do, however, need separate compatible strings for the T-HEAD PLIC,
because it is not compliant with the RISC-V PLIC specification.
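
As a rough illustration of Samuel's analysis quoted above, the chip
configurations he tested can be modelled in plain userspace C. This is only
a sketch under stated assumptions: the model_* names and structs are
hypothetical, and they merely mimic how the IRQ core records
IRQD_IRQ_MASKED and gates the deferred EOI on it. It is not kernel code.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Userspace model only: hypothetical stand-ins for the kernel's
 * irq_chip / irq_desc state, reduced to what the analysis needs.
 */
struct model_chip {
	bool has_irq_mask;	/* models chip->irq_mask != NULL */
	bool eoi_threaded;	/* models IRQCHIP_EOI_THREADED in chip->flags */
};

struct model_desc {
	struct model_chip chip;
	bool masked;		/* models IRQD_IRQ_MASKED */
	int eoi_count;		/* number of .irq_eoi calls so far */
};

/* The "masked" state is only recorded when the chip provides .irq_mask. */
void model_mask_irq(struct model_desc *d)
{
	assert(d);
	if (d->chip.has_irq_mask)
		d->masked = true;
}

/* Hard IRQ path of a oneshot threaded IRQ: mask, wake thread, maybe EOI. */
void model_hardirq(struct model_desc *d)
{
	model_mask_irq(d);
	if (!d->chip.eoi_threaded)
		d->eoi_count++;	/* EOI issued before the irqthread has run */
}

/* Thread completion: deferred EOI and unmask sit behind the masked check. */
void model_thread_done(struct model_desc *d)
{
	assert(d);
	if (!d->masked)
		return;
	if (d->chip.eoi_threaded)
		d->eoi_count++;
	d->masked = false;
}

/* EOIs issued before the thread ran, for a given chip configuration. */
int model_early_eois(bool has_irq_mask, bool eoi_threaded)
{
	struct model_desc d = { { has_irq_mask, eoi_threaded }, false, 0 };

	model_hardirq(&d);
	return d.eoi_count;
}

/* Total EOIs after one hard IRQ followed by one thread run. */
int model_total_eois(bool has_irq_mask, bool eoi_threaded)
{
	struct model_desc d = { { has_irq_mask, eoi_threaded }, false, 0 };

	model_hardirq(&d);
	model_thread_done(&d);
	return d.eoi_count;
}
```

In this model, a chip with neither .irq_mask nor IRQCHIP_EOI_THREADED issues
its EOI before the thread runs (the storm case on C9xx, where EOI unmasks),
a chip with IRQCHIP_EOI_THREADED but no .irq_mask never issues it, and only
the combination with a (possibly dummy) .irq_mask issues exactly one EOI
after the thread.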

Regards,
Anup

>
> Regards,
> Samuel
>
> > +};
> > +
> > +static struct irq_chip *def_plic_chip = &sifive_plic_chip;
> > +
> > static int plic_irqdomain_map(struct irq_domain *d, unsigned int irq,
> > irq_hw_number_t hwirq)
> > {
> > struct plic_priv *priv = d->host_data;
> >
> > - irq_domain_set_info(d, irq, hwirq, &plic_chip, d->host_data,
> > + irq_domain_set_info(d, irq, hwirq, def_plic_chip, d->host_data,
> > handle_fasteoi_irq, NULL, NULL);
> > irq_set_noprobe(irq);
> > irq_set_affinity(irq, &priv->lmask);
> > @@ -390,5 +410,15 @@ static int __init plic_init(struct device_node *node,
> > return error;
> > }
> >
> > +static int __init thead_c900_plic_init(struct device_node *node,
> > + struct device_node *parent)
> > +{
> > + def_plic_chip = &thead_plic_chip;
> > +
> > + return plic_init(node, parent);
> > +}
> > +
> > IRQCHIP_DECLARE(sifive_plic, "sifive,plic-1.0.0", plic_init);
> > IRQCHIP_DECLARE(riscv_plic0, "riscv,plic0", plic_init); /* for legacy systems */
> > +IRQCHIP_DECLARE(thead_c900_plic, "thead,c900-plic", thead_c900_plic_init);
> > +IRQCHIP_DECLARE(allwinner_sun20i_d1_plic, "allwinner,sun20i-d1-plic", thead_c900_plic_init);
> >
>

2021-10-18 07:09:54

Subject: Re: [PATCH V4 1/3] irqchip/sifive-plic: Add thead,c900-plic support

Hi Samuel,

On Mon, Oct 18, 2021 at 1:17 PM Samuel Holland <[email protected]> wrote:
>
> On 10/15/21 10:21 PM, [email protected] wrote:
> > From: Guo Ren <[email protected]>
> >
> > 1) The irq_mask/unmask() is used by handle_fasteoi_irq() is mostly
> > for ONESHOT irqs and there is no limitation in the RISC-V PLIC driver
> > due to use of irq_mask/unmask() callbacks. In fact, a lot of irqchip
> > drivers using handle_fasteoi_irq() also implement irq_mask/unmask().
> >
> > 2) The C9xx PLIC does not comply with the interrupt claim/completion
> > process defined by the RISC-V PLIC specification because C9xx PLIC
> > will mask an IRQ when it is claimed by PLIC driver (i.e. readl(claim)
> > and the IRQ will be unmasked upon completion by PLIC driver (i.e.
> > writel(claim). This behaviour breaks the handling of IRQS_ONESHOT by
> > the generic handle_fasteoi_irq() used in the PLIC driver.
> >
> > 3) This patch adds an errata fix for IRQS_ONESHOT handling on
> > C9xx PLIC by using irq_enable/disable() callbacks instead of
> > irq_mask/unmask().
> >
> > Signed-off-by: Guo Ren <[email protected]>
> > Cc: Anup Patel <[email protected]>
> > Cc: Thomas Gleixner <[email protected]>
> > Cc: Marc Zyngier <[email protected]>
> > Cc: Palmer Dabbelt <[email protected]>
> > Cc: Atish Patra <[email protected]>
> >
> > ---
> >
> > Changes since V4:
> > - Update comment by Anup
> >
> > Changes since V3:
> > - Rename "c9xx" to "c900"
> > - Add sifive_plic_chip and thead_plic_chip for difference
> >
> > Changes since V2:
> > - Add a separate compatible string "thead,c9xx-plic"
> > - set irq_mask/unmask of "plic_chip" to NULL and point
> > irq_enable/disable of "plic_chip" to plic_irq_mask/unmask
> > - Add a detailed comment block in plic_init() about the
> > differences in Claim/Completion process of RISC-V PLIC and C9xx
> > PLIC.
> > ---
> > drivers/irqchip/irq-sifive-plic.c | 34 +++++++++++++++++++++++++++++--
> > 1 file changed, 32 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/irqchip/irq-sifive-plic.c b/drivers/irqchip/irq-sifive-plic.c
> > index cf74cfa82045..960b29d02070 100644
> > --- a/drivers/irqchip/irq-sifive-plic.c
> > +++ b/drivers/irqchip/irq-sifive-plic.c
> > @@ -166,7 +166,7 @@ static void plic_irq_eoi(struct irq_data *d)
> > writel(d->hwirq, handler->hart_base + CONTEXT_CLAIM);
> > }
> >
> > -static struct irq_chip plic_chip = {
> > +static struct irq_chip sifive_plic_chip = {
> > .name = "SiFive PLIC",
> > .irq_mask = plic_irq_mask,
> > .irq_unmask = plic_irq_unmask,
> > @@ -176,12 +176,32 @@ static struct irq_chip plic_chip = {
> > #endif
> > };
> >
> > +/*
> > + * The C9xx PLIC does not comply with the interrupt claim/completion
> > + * process defined by the RISC-V PLIC specification because C9xx PLIC
> > + * will mask an IRQ when it is claimed by PLIC driver (i.e. readl(claim)
> > + * and the IRQ will be unmasked upon completion by PLIC driver (i.e.
> > + * writel(claim). This behaviour breaks the handling of IRQS_ONESHOT by
> > + * the generic handle_fasteoi_irq() used in the PLIC driver.
> > + */
> > +static struct irq_chip thead_plic_chip = {
> > + .name = "T-Head PLIC",
> > + .irq_disable = plic_irq_mask,
> > + .irq_enable = plic_irq_unmask,
> > + .irq_eoi = plic_irq_eoi,
> > +#ifdef CONFIG_SMP
> > + .irq_set_affinity = plic_set_affinity,
> > +#endif
> I tested this, and it doesn't work. Without IRQCHIP_EOI_THREADED,
> .irq_eoi is called at the end of the hard IRQ handler. This unmasks the
> IRQ before the irqthread has a chance to run, so it causes an interrupt
> storm for any threaded level IRQ (I saw this happen for sun8i_thermal).

devm_request_threaded_irq(struct device *dev, unsigned int irq,
                          irq_handler_t handler, irq_handler_t thread_fn, ...)

I think you should pull down the level-triggered IRQ signal in "handler"
and put the back-end processing into "thread_fn".

Could you share your driver code?

>
> With IRQCHIP_EOI_THREADED, .irq_eoi is delayed until after the irqthread
> runs. This is good. Except that the call to unmask_threaded_irq() is
> inside a check for IRQD_IRQ_MASKED. And IRQD_IRQ_MASKED will never be
> set because .irq_mask is NULL. So the end result is that the IRQ is
> never EOI'd and is masked permanently.
I don't think we should use IRQCHIP_EOI_THREADED, because it makes the
IRQ path more complex. We need to let drivers separate their "handler"
and "thread_fn" properly.

What do you think?

>
> If you set .flags = IRQCHIP_EOI_THREADED, and additionally set .irq_mask
> and .irq_unmask to a dummy function that does nothing, the IRQ core will
> properly set/unset IRQD_IRQ_MASKED, and the IRQs will flow as expected.
> But adding dummy functions seems not so ideal, so I am not sure if this
> is the best solution.
That's understandable; we need to find a better way.

Thanks for the test and the question.

>
> Regards,
> Samuel
>
> > +};
> > +
> > +static struct irq_chip *def_plic_chip = &sifive_plic_chip;
> > +
> > static int plic_irqdomain_map(struct irq_domain *d, unsigned int irq,
> > irq_hw_number_t hwirq)
> > {
> > struct plic_priv *priv = d->host_data;
> >
> > - irq_domain_set_info(d, irq, hwirq, &plic_chip, d->host_data,
> > + irq_domain_set_info(d, irq, hwirq, def_plic_chip, d->host_data,
> > handle_fasteoi_irq, NULL, NULL);
> > irq_set_noprobe(irq);
> > irq_set_affinity(irq, &priv->lmask);
> > @@ -390,5 +410,15 @@ static int __init plic_init(struct device_node *node,
> > return error;
> > }
> >
> > +static int __init thead_c900_plic_init(struct device_node *node,
> > + struct device_node *parent)
> > +{
> > + def_plic_chip = &thead_plic_chip;
> > +
> > + return plic_init(node, parent);
> > +}
> > +
> > IRQCHIP_DECLARE(sifive_plic, "sifive,plic-1.0.0", plic_init);
> > IRQCHIP_DECLARE(riscv_plic0, "riscv,plic0", plic_init); /* for legacy systems */
> > +IRQCHIP_DECLARE(thead_c900_plic, "thead,c900-plic", thead_c900_plic_init);
> > +IRQCHIP_DECLARE(allwinner_sun20i_d1_plic, "allwinner,sun20i-d1-plic", thead_c900_plic_init);
> >
>

--
Best Regards
Guo Ren

ML: https://lore.kernel.org/linux-csky/

2021-10-18 07:23:29

Subject: Re: [PATCH V4 1/3] irqchip/sifive-plic: Add thead,c900-plic support

On 2021-10-18 06:17, Samuel Holland wrote:
> On 10/15/21 10:21 PM, [email protected] wrote:
>> From: Guo Ren <[email protected]>
>>
>> 1) The irq_mask/unmask() is used by handle_fasteoi_irq() is mostly

Drop this useless numbering.

>> for ONESHOT irqs and there is no limitation in the RISC-V PLIC driver
>> due to use of irq_mask/unmask() callbacks. In fact, a lot of irqchip
>> drivers using handle_fasteoi_irq() also implement irq_mask/unmask().

This paragraph doesn't provide any useful information in the context
of this patch. That's at best cover-letter material.

>> 2) The C9xx PLIC does not comply with the interrupt claim/completion
>> process defined by the RISC-V PLIC specification because C9xx PLIC
>> will mask an IRQ when it is claimed by PLIC driver (i.e. readl(claim)
>> and the IRQ will be unmasked upon completion by PLIC driver (i.e.
>> writel(claim). This behaviour breaks the handling of IRQS_ONESHOT by
>> the generic handle_fasteoi_irq() used in the PLIC driver.
>>
>> 3) This patch adds an errata fix for IRQS_ONESHOT handling on

s/fix/workaround/

>> C9xx PLIC by using irq_enable/disable() callbacks instead of
>> irq_mask/unmask().

From Documentation/process/submitting-patches.rst:

<quote>
Describe your changes in imperative mood, e.g. "make xyzzy do frotz"
instead of "[This patch] makes xyzzy do frotz" or "[I] changed xyzzy
to do frotz", as if you are giving orders to the codebase to change
its behaviour.
</quote>

>>
>> Signed-off-by: Guo Ren <[email protected]>
>> Cc: Anup Patel <[email protected]>
>> Cc: Thomas Gleixner <[email protected]>
>> Cc: Marc Zyngier <[email protected]>
>> Cc: Palmer Dabbelt <[email protected]>
>> Cc: Atish Patra <[email protected]>
>>
>> ---
>>
>> Changes since V4:
>> - Update comment by Anup
>>
>> Changes since V3:
>> - Rename "c9xx" to "c900"
>> - Add sifive_plic_chip and thead_plic_chip for difference
>>
>> Changes since V2:
>> - Add a separate compatible string "thead,c9xx-plic"
>> - set irq_mask/unmask of "plic_chip" to NULL and point
>> irq_enable/disable of "plic_chip" to plic_irq_mask/unmask
>> - Add a detailed comment block in plic_init() about the
>> differences in Claim/Completion process of RISC-V PLIC and C9xx
>> PLIC.
>> ---
>> drivers/irqchip/irq-sifive-plic.c | 34
>> +++++++++++++++++++++++++++++--
>> 1 file changed, 32 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/irqchip/irq-sifive-plic.c
>> b/drivers/irqchip/irq-sifive-plic.c
>> index cf74cfa82045..960b29d02070 100644
>> --- a/drivers/irqchip/irq-sifive-plic.c
>> +++ b/drivers/irqchip/irq-sifive-plic.c
>> @@ -166,7 +166,7 @@ static void plic_irq_eoi(struct irq_data *d)
>> writel(d->hwirq, handler->hart_base + CONTEXT_CLAIM);
>> }
>>
>> -static struct irq_chip plic_chip = {
>> +static struct irq_chip sifive_plic_chip = {
>> .name = "SiFive PLIC",
>> .irq_mask = plic_irq_mask,
>> .irq_unmask = plic_irq_unmask,
>> @@ -176,12 +176,32 @@ static struct irq_chip plic_chip = {
>> #endif
>> };
>>
>> +/*
>> + * The C9xx PLIC does not comply with the interrupt claim/completion
>> + * process defined by the RISC-V PLIC specification because C9xx PLIC
>> + * will mask an IRQ when it is claimed by PLIC driver (i.e.
>> readl(claim)
>> + * and the IRQ will be unmasked upon completion by PLIC driver (i.e.
>> + * writel(claim). This behaviour breaks the handling of IRQS_ONESHOT
>> by
>> + * the generic handle_fasteoi_irq() used in the PLIC driver.
>> + */
>> +static struct irq_chip thead_plic_chip = {
>> + .name = "T-Head PLIC",
>> + .irq_disable = plic_irq_mask,
>> + .irq_enable = plic_irq_unmask,
>> + .irq_eoi = plic_irq_eoi,
>> +#ifdef CONFIG_SMP
>> + .irq_set_affinity = plic_set_affinity,
>> +#endif
> I tested this, and it doesn't work. Without IRQCHIP_EOI_THREADED,
> .irq_eoi is called at the end of the hard IRQ handler. This unmasks the
> IRQ before the irqthread has a chance to run, so it causes an interrupt
> storm for any threaded level IRQ (I saw this happen for sun8i_thermal).
>
> With IRQCHIP_EOI_THREADED, .irq_eoi is delayed until after the
> irqthread
> runs. This is good. Except that the call to unmask_threaded_irq() is
> inside a check for IRQD_IRQ_MASKED. And IRQD_IRQ_MASKED will never be
> set because .irq_mask is NULL. So the end result is that the IRQ is
> never EOI'd and is masked permanently.
>
> If you set .flags = IRQCHIP_EOI_THREADED, and additionally set
> .irq_mask
> and .irq_unmask to a dummy function that does nothing, the IRQ core
> will
> properly set/unset IRQD_IRQ_MASKED, and the IRQs will flow as expected.
> But adding dummy functions seems not so ideal, so I am not sure if this
> is the best solution.

This series is totally broken indeed, because it assumes that
enable/disable are a substitute for mask/unmask. Nothing could be further
from the truth. mask/unmask must be implemented, and enable/disable
supplement them if the HW requires something different at startup time.

If you have an 'automask' behaviour and yet the HW doesn't record this
in a separate bit, then you need to track this by yourself in the
irq_eoi() callback instead. I guess that you would skip the write to
the CLAIM register in this case, though I have no idea whether this
breaks the HW interrupt state or not.

There is an example of this in the Apple AIC driver.
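
One possible reading of the suggestion above, sketched as a userspace model
rather than driver code: software remembers the mask request, and the EOI
step skips the completion write while the interrupt is masked, deferring it
to unmask. All model_* names are hypothetical, the automask behaviour is
taken from the patch comment (claim masks, completion unmasks), and whether
real hardware tolerates a deferred completion is exactly the open question
raised above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one interrupt source; not driver code. */
struct model_irq {
	bool hw_automasked;	/* set by claim, cleared by completion */
	bool sw_masked;		/* mask state tracked in software only */
	bool eoi_deferred;	/* completion skipped while sw_masked */
};

/* Claim: the hardware masks the source by itself. */
void model_claim(struct model_irq *irq)
{
	assert(irq);
	irq->hw_automasked = true;
}

/* Completion write: the hardware unmasks the source again. */
void model_complete(struct model_irq *irq)
{
	assert(irq);
	irq->hw_automasked = false;
	irq->eoi_deferred = false;
}

/* Mask during handling: nothing to write, just remember the request. */
void model_mask(struct model_irq *irq)
{
	irq->sw_masked = true;
}

/* EOI: keep the automask in place if the core asked for a mask. */
void model_eoi(struct model_irq *irq)
{
	if (irq->sw_masked)
		irq->eoi_deferred = true;
	else
		model_complete(irq);
}

/* Unmask: perform the completion that EOI skipped earlier. */
void model_unmask(struct model_irq *irq)
{
	irq->sw_masked = false;
	if (irq->eoi_deferred)
		model_complete(irq);
}

/* With no separate mask bit, only the automask blocks delivery. */
bool model_can_fire(const struct model_irq *irq)
{
	return !irq->hw_automasked;
}

/* Oneshot hard IRQ path: claim, mask, EOI. Can the IRQ fire again? */
bool model_fires_before_thread(bool track_in_eoi)
{
	struct model_irq irq = { false, false, false };

	model_claim(&irq);
	model_mask(&irq);
	if (track_in_eoi)
		model_eoi(&irq);
	else
		model_complete(&irq);	/* unconditional completion */
	return model_can_fire(&irq);
}

/* Same path followed by the unmask issued once the thread is done. */
bool model_fires_after_thread(void)
{
	struct model_irq irq = { false, false, false };

	model_claim(&irq);
	model_mask(&irq);
	model_eoi(&irq);
	model_unmask(&irq);
	return model_can_fire(&irq);
}
```

The model only covers masking while an interrupt is being handled; masking
an idle source would still need a real hardware write.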

M.
--
Jazz is not dead. It just smells funny...

2021-10-19 09:39:29

Subject: Re: [PATCH V4 1/3] irqchip/sifive-plic: Add thead,c900-plic support

Thx Marc,

On Mon, Oct 18, 2021 at 3:21 PM Marc Zyngier <[email protected]> wrote:
>
> On 2021-10-18 06:17, Samuel Holland wrote:
> > On 10/15/21 10:21 PM, [email protected] wrote:
> >> From: Guo Ren <[email protected]>
> >>
> >> 1) The irq_mask/unmask() is used by handle_fasteoi_irq() is mostly
>
> Drop this useless numbering.
Okay

>
> >> for ONESHOT irqs and there is no limitation in the RISC-V PLIC driver
> >> due to use of irq_mask/unmask() callbacks. In fact, a lot of irqchip
> >> drivers using handle_fasteoi_irq() also implement irq_mask/unmask().
>
> This paragraph doesn't provide any useful information in the context
> of this patch. That's at best cover-letter material.
Okay, I will rework this paragraph.

>
> >> 2) The C9xx PLIC does not comply with the interrupt claim/completion
> >> process defined by the RISC-V PLIC specification because C9xx PLIC
> >> will mask an IRQ when it is claimed by PLIC driver (i.e. readl(claim)
> >> and the IRQ will be unmasked upon completion by PLIC driver (i.e.
> >> writel(claim). This behaviour breaks the handling of IRQS_ONESHOT by
> >> the generic handle_fasteoi_irq() used in the PLIC driver.
> >>
> >> 3) This patch adds an errata fix for IRQS_ONESHOT handling on
>
> s/fix/workaround/
Okay

>
> >> C9xx PLIC by using irq_enable/disable() callbacks instead of
> >> irq_mask/unmask().
>
> From Documentation/process/submitting-patches.rst:
>
> <quote>
> Describe your changes in imperative mood, e.g. "make xyzzy do frotz"
> instead of "[This patch] makes xyzzy do frotz" or "[I] changed xyzzy
> to do frotz", as if you are giving orders to the codebase to change
> its behaviour.
> </quote>
I will use this style in the next version of the patch.

>
> >>
> >> Signed-off-by: Guo Ren <[email protected]>
> >> Cc: Anup Patel <[email protected]>
> >> Cc: Thomas Gleixner <[email protected]>
> >> Cc: Marc Zyngier <[email protected]>
> >> Cc: Palmer Dabbelt <[email protected]>
> >> Cc: Atish Patra <[email protected]>
> >>
> >> ---
> >>
> >> Changes since V4:
> >> - Update comment by Anup
> >>
> >> Changes since V3:
> >> - Rename "c9xx" to "c900"
> >> - Add sifive_plic_chip and thead_plic_chip for difference
> >>
> >> Changes since V2:
> >> - Add a separate compatible string "thead,c9xx-plic"
> >> - set irq_mask/unmask of "plic_chip" to NULL and point
> >> irq_enable/disable of "plic_chip" to plic_irq_mask/unmask
> >> - Add a detailed comment block in plic_init() about the
> >> differences in Claim/Completion process of RISC-V PLIC and C9xx
> >> PLIC.
> >> ---
> >> drivers/irqchip/irq-sifive-plic.c | 34
> >> +++++++++++++++++++++++++++++--
> >> 1 file changed, 32 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/drivers/irqchip/irq-sifive-plic.c
> >> b/drivers/irqchip/irq-sifive-plic.c
> >> index cf74cfa82045..960b29d02070 100644
> >> --- a/drivers/irqchip/irq-sifive-plic.c
> >> +++ b/drivers/irqchip/irq-sifive-plic.c
> >> @@ -166,7 +166,7 @@ static void plic_irq_eoi(struct irq_data *d)
> >> writel(d->hwirq, handler->hart_base + CONTEXT_CLAIM);
> >> }
> >>
> >> -static struct irq_chip plic_chip = {
> >> +static struct irq_chip sifive_plic_chip = {
> >> .name = "SiFive PLIC",
> >> .irq_mask = plic_irq_mask,
> >> .irq_unmask = plic_irq_unmask,
> >> @@ -176,12 +176,32 @@ static struct irq_chip plic_chip = {
> >> #endif
> >> };
> >>
> >> +/*
> >> + * The C9xx PLIC does not comply with the interrupt claim/completion
> >> + * process defined by the RISC-V PLIC specification because C9xx PLIC
> >> + * will mask an IRQ when it is claimed by PLIC driver (i.e.
> >> readl(claim)
> >> + * and the IRQ will be unmasked upon completion by PLIC driver (i.e.
> >> + * writel(claim). This behaviour breaks the handling of IRQS_ONESHOT
> >> by
> >> + * the generic handle_fasteoi_irq() used in the PLIC driver.
> >> + */
> >> +static struct irq_chip thead_plic_chip = {
> >> + .name = "T-Head PLIC",
> >> + .irq_disable = plic_irq_mask,
> >> + .irq_enable = plic_irq_unmask,
> >> + .irq_eoi = plic_irq_eoi,
> >> +#ifdef CONFIG_SMP
> >> + .irq_set_affinity = plic_set_affinity,
> >> +#endif
> > I tested this, and it doesn't work. Without IRQCHIP_EOI_THREADED,
> > .irq_eoi is called at the end of the hard IRQ handler. This unmasks the
> > IRQ before the irqthread has a chance to run, so it causes an interrupt
> > storm for any threaded level IRQ (I saw this happen for sun8i_thermal).
> >
> > With IRQCHIP_EOI_THREADED, .irq_eoi is delayed until after the
> > irqthread
> > runs. This is good. Except that the call to unmask_threaded_irq() is
> > inside a check for IRQD_IRQ_MASKED. And IRQD_IRQ_MASKED will never be
> > set because .irq_mask is NULL. So the end result is that the IRQ is
> > never EOI'd and is masked permanently.
> >
> > If you set .flags = IRQCHIP_EOI_THREADED, and additionally set
> > .irq_mask
> > and .irq_unmask to a dummy function that does nothing, the IRQ core
> > will
> > properly set/unset IRQD_IRQ_MASKED, and the IRQs will flow as expected.
> > But adding dummy functions seems not so ideal, so I am not sure if this
> > is the best solution.
>
> This series is totally broken indeed, because it assumes that
> enable/disable are a substitute to mask/unmask. Nothing could be further
> from the truth. mask/unmask must be implemented, and enable/disable
> supplement them if the HW requires something different at startup time.
After re-studying the irqchip code, I agree that you are right. The
csky-mpintc driver also needs to be corrected; I will send patches ASAP.
I hope you can continue to help with the review.

I had thought that handle_fasteoi_irq() itself avoided mask/unmask, so my
understanding was wrong. The mask/unmask design can prevent "rogue
interrupts" from damaging the system. The C-SKY developers ran into the
same threaded-IRQ interrupt storm problem. The solution at that time was
to pull down the interrupt signal in the handler and put the rest in
thread_fn. If we had implemented mask/unmask correctly in csky-mpintc,
that would have been unnecessary.

>
> If you have an 'automask' behavior and yet the HW doesn't record this
> in a separate bit, then you need to track this by yourself in the
> irq_eoi() callback instead. I guess that you would skip the write to
> the CLAIM register in this case, though I have no idea whether this
> breaks
> the HW interrupt state or not.
The problem is that when the enable bit is 0 for that IRQ number,
"writel(d->hwirq, handler->hart_base + CONTEXT_CLAIM)" does not affect
the HW state machine. The IRQ then stays in the ack (claimed) state and
no further IRQs can come in.

>
> There is an example of this in the Apple AIC driver.
Thanks for the tip. I think your suggestion is:

+++ b/drivers/irqchip/irq-sifive-plic.c
@@ -163,7 +163,12 @@ static void plic_irq_eoi(struct irq_data *d)
 {
 	struct plic_handler *handler = this_cpu_ptr(&plic_handlers);
 
-	writel(d->hwirq, handler->hart_base + CONTEXT_CLAIM);
+	if (irqd_irq_masked(d)) {
+		plic_irq_unmask(d);
+		writel(d->hwirq, handler->hart_base + CONTEXT_CLAIM);
+		plic_irq_mask(d);
+	} else {
+		writel(d->hwirq, handler->hart_base + CONTEXT_CLAIM);
+	}
 }

The above could solve the problem; I've tested it on QEMU and our HW platform.
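
To make the reasoning behind the change easier to follow, here is a minimal
userspace model of a single PLIC source. It is a sketch only: the model_*
names are hypothetical, and the one rule it encodes is the one described
earlier in this mail, namely that a completion write has no effect while
the enable bit is 0.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one PLIC source for one context; not driver code. */
struct model_plic_src {
	bool enabled;	/* per-context enable bit, toggled by mask/unmask */
	bool claimed;	/* claimed and not yet completed */
};

/* Completion write: has no effect unless the source is enabled. */
void model_write_complete(struct model_plic_src *s)
{
	assert(s);
	if (s->enabled)
		s->claimed = false;
}

/* Plain EOI: a single completion write, as in the current driver. */
void model_eoi_plain(struct model_plic_src *s)
{
	model_write_complete(s);
}

/* EOI with the workaround: enable briefly so the completion lands. */
void model_eoi_workaround(struct model_plic_src *s)
{
	assert(s);
	if (!s->enabled) {
		s->enabled = true;
		model_write_complete(s);
		s->enabled = false;	/* restore the masked state */
	} else {
		model_write_complete(s);
	}
}

/* Claim, mask during handling, then EOI. Is the source stuck claimed? */
bool model_stuck_after_masked_eoi(bool use_workaround)
{
	struct model_plic_src s = { true, false };

	s.claimed = true;	/* read of the claim register */
	s.enabled = false;	/* mask while the IRQ is being handled */
	if (use_workaround)
		model_eoi_workaround(&s);
	else
		model_eoi_plain(&s);
	return s.claimed;
}
```

With the plain EOI the source stays claimed after a masked completion, so
no further interrupts arrive from it. With the toggle the completion takes
effect and the enable bit ends up back at 0. The model does not capture the
window in which the source is briefly enabled.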

>
> M.
> --
> Jazz is not dead. It just smells funny...

--
Best Regards
Guo Ren

ML: https://lore.kernel.org/linux-csky/

2021-10-19 10:21:25

Subject: Re: [PATCH V4 1/3] irqchip/sifive-plic: Add thead,c900-plic support

On Tue, 19 Oct 2021 10:33:49 +0100,
Guo Ren <[email protected]> wrote:

> > If you have an 'automask' behavior and yet the HW doesn't record this
> > in a separate bit, then you need to track this by yourself in the
> > irq_eoi() callback instead. I guess that you would skip the write to
> > the CLAIM register in this case, though I have no idea whether this
> > breaks
> > the HW interrupt state or not.
> The problem is when enable bit is 0 for that irq_number,
> "writel(d->hwirq, handler->hart_base + CONTEXT_CLAIM)" wouldn't affect
> the hw state machine. Then this irq would enter in ack state and no
> continues irqs could come in.

Really? This means that you cannot mask an interrupt while it is being
handled? How great...

> >
> > There is an example of this in the Apple AIC driver.
> Thx for the tip, I think your suggestion is:
> +++ b/drivers/irqchip/irq-sifive-plic.c
> @@ -163,7 +163,12 @@ static void plic_irq_eoi(struct irq_data *d)
> {
> struct plic_handler *handler = this_cpu_ptr(&plic_handlers);
>
> - writel(d->hwirq, handler->hart_base + CONTEXT_CLAIM);
> + if (irqd_irq_masked(d)) {
> + plic_irq_unmask(d);
> + writel(d->hwirq, handler->hart_base + CONTEXT_CLAIM);
> + plic_irq_mask(d);

This looks pretty dodgy. You are relying on interrupts being globally
masked on the CPU, I guess. It probably works today, but man, what a
terrible HW implementation.

You'll definitely have to move this into a c900-specific callback.

M.

--
Without deviation from the norm, progress is not possible.

2021-10-19 13:28:49

Subject: Re: [PATCH V4 1/3] irqchip/sifive-plic: Add thead,c900-plic support

On Tue, Oct 19, 2021 at 6:18 PM Marc Zyngier <[email protected]> wrote:
>
> On Tue, 19 Oct 2021 10:33:49 +0100,
> Guo Ren <[email protected]> wrote:
>
> > > If you have an 'automask' behavior and yet the HW doesn't record this
> > > in a separate bit, then you need to track this by yourself in the
> > > irq_eoi() callback instead. I guess that you would skip the write to
> > > the CLAIM register in this case, though I have no idea whether this
> > > breaks
> > > the HW interrupt state or not.
> > The problem is when enable bit is 0 for that irq_number,
> > "writel(d->hwirq, handler->hart_base + CONTEXT_CLAIM)" wouldn't affect
> > the hw state machine. Then this irq would enter in ack state and no
> > continues irqs could come in.
>
> Really? This means that you cannot mask an interrupt while it is being
> handled? How great...
If the completion ID does not match an interrupt source that is
currently enabled for the target, the completion is silently ignored.
So C9xx completion depends on the enable bit.

>
> > >
> > > There is an example of this in the Apple AIC driver.
> > Thx for the tip, I think your suggestion is:
> > +++ b/drivers/irqchip/irq-sifive-plic.c
> > @@ -163,7 +163,12 @@ static void plic_irq_eoi(struct irq_data *d)
> > {
> > struct plic_handler *handler = this_cpu_ptr(&plic_handlers);
> >
> > - writel(d->hwirq, handler->hart_base + CONTEXT_CLAIM);
> > + if (irqd_irq_masked(d)) {
> > + plic_irq_unmask(d);
> > + writel(d->hwirq, handler->hart_base + CONTEXT_CLAIM);
> > + plic_irq_mask(d);
>
> This looks pretty dodgy. You are relying on interrupts being globally
> masked on the CPU, I guess. It probably works today, but man, what a
> terrible HW implementation.

>
> You'll definitely have to move this into a c900-specific callback.
Yes, it's an erratum.

>
> M.
>
> --
> Without deviation from the norm, progress is not possible.

--
Best Regards
Guo Ren

ML: https://lore.kernel.org/linux-csky/

2021-10-20 13:37:32

Subject: Re: [PATCH V4 1/3] irqchip/sifive-plic: Add thead,c900-plic support

On Tue, 19 Oct 2021 14:27:02 +0100,
Guo Ren <[email protected]> wrote:
>
> On Tue, Oct 19, 2021 at 6:18 PM Marc Zyngier <[email protected]> wrote:
> >
> > On Tue, 19 Oct 2021 10:33:49 +0100,
> > Guo Ren <[email protected]> wrote:
> >
> > > > If you have an 'automask' behavior and yet the HW doesn't record this
> > > > in a separate bit, then you need to track this by yourself in the
> > > > irq_eoi() callback instead. I guess that you would skip the write to
> > > > the CLAIM register in this case, though I have no idea whether this
> > > > breaks
> > > > the HW interrupt state or not.
> > > The problem is when enable bit is 0 for that irq_number,
> > > "writel(d->hwirq, handler->hart_base + CONTEXT_CLAIM)" wouldn't affect
> > > the hw state machine. Then this irq would enter in ack state and no
> > > continues irqs could come in.
> >
> > Really? This means that you cannot mask an interrupt while it is being
> > handled? How great...
> If the completion ID does not match an interrupt source that is
> currently enabled for the target, the completion is silently ignored.
> So, C9xx completion depends on enable-bit.

Is that what the PLIC spec says? Or what your implementation does? I
can understand that one implementation would be broken, but if the
PLIC architecture itself is broken, that's far more concerning.

M.

--
Without deviation from the norm, progress is not possible.

2021-10-20 14:22:12

Subject: Re: [PATCH V4 1/3] irqchip/sifive-plic: Add thead,c900-plic support

On Wed, Oct 20, 2021 at 9:34 PM Marc Zyngier <[email protected]> wrote:
>
> On Tue, 19 Oct 2021 14:27:02 +0100,
> Guo Ren <[email protected]> wrote:
> >
> > On Tue, Oct 19, 2021 at 6:18 PM Marc Zyngier <[email protected]> wrote:
> > >
> > > On Tue, 19 Oct 2021 10:33:49 +0100,
> > > Guo Ren <[email protected]> wrote:
> > >
> > > > > If you have an 'automask' behavior and yet the HW doesn't record this
> > > > > in a separate bit, then you need to track this by yourself in the
> > > > > irq_eoi() callback instead. I guess that you would skip the write to
> > > > > the CLAIM register in this case, though I have no idea whether this
> > > > > breaks
> > > > > the HW interrupt state or not.
> > > > The problem is when enable bit is 0 for that irq_number,
> > > > "writel(d->hwirq, handler->hart_base + CONTEXT_CLAIM)" wouldn't affect
> > > > the hw state machine. Then this irq would enter in ack state and no
> > > > continues irqs could come in.
> > >
> > > Really? This means that you cannot mask an interrupt while it is being
> > > handled? How great...
> > If the completion ID does not match an interrupt source that is
> > currently enabled for the target, the completion is silently ignored.
> > So, C9xx completion depends on enable-bit.
>
> Is that what the PLIC spec says? Or what your implementation does? I
> can understand that one implementation would be broken, but if the
> PLIC architecture itself is broken, that's far more concerning.

Here is the description of Interrupt Completion in PLIC spec [1]:

The PLIC signals it has completed executing an interrupt handler by
writing the interrupt ID it received from the claim to the claim/complete
register. The PLIC does not check whether the completion ID is the same
as the last claim ID for that target. If the completion ID does not match
an interrupt source that is currently enabled for the target, the
                         ^^ ^^^^^^^^^ ^^^^^^^
completion is silently ignored.

[1] https://github.com/riscv/riscv-plic-spec/blob/master/riscv-plic.adoc

Did we misunderstand the PLIC spec?
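
For anyone following along, the clause quoted above can be written down as a toy C model. This is only a sketch of how the text reads, not driver code and not a statement about any particular hardware; the struct, the function name and MODEL_NR_SOURCES are made up for illustration, and treating ID 0 as "never enabled" is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_NR_SOURCES 32u /* hypothetical size, for illustration only */

/* Hypothetical per-target (context) state; not the kernel's data structure. */
struct plic_ctx_model {
	uint32_t enable;     /* bit n set: source n is enabled for this target */
	uint32_t in_service; /* bit n set: source n was claimed, not yet completed */
};

/*
 * Model of a completion write as the quoted clause reads: the ID is only
 * checked against the enable bits, not against the last claimed ID.
 * Returns true if the completion took effect, false if it was ignored.
 */
static bool model_complete(struct plic_ctx_model *ctx, uint32_t id)
{
	assert(id < MODEL_NR_SOURCES);

	/* Assumption: ID 0 ("no interrupt") never matches an enabled source. */
	if (id == 0 || !(ctx->enable & (UINT32_C(1) << id)))
		return false; /* silently ignored */

	ctx->in_service &= ~(UINT32_C(1) << id);
	return true;
}
```

With source 5 enabled and source 7 disabled, model_complete(&ctx, 5) clears the in-service bit while model_complete(&ctx, 7) leaves the state untouched, which is the case being debated here.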

>
> M.
>
> --
> Without deviation from the norm, progress is not possible.

--
Best Regards
Guo Ren

ML: https://lore.kernel.org/linux-csky/

2021-10-20 14:36:37

Subject: Re: [PATCH V4 1/3] irqchip/sifive-plic: Add thead,c900-plic support

On Wed, Oct 20, 2021 at 7:04 PM Marc Zyngier <[email protected]> wrote:
>
> On Tue, 19 Oct 2021 14:27:02 +0100,
> Guo Ren <[email protected]> wrote:
> >
> > On Tue, Oct 19, 2021 at 6:18 PM Marc Zyngier <[email protected]> wrote:
> > >
> > > On Tue, 19 Oct 2021 10:33:49 +0100,
> > > Guo Ren <[email protected]> wrote:
> > >
> > > > > If you have an 'automask' behavior and yet the HW doesn't record this
> > > > > in a separate bit, then you need to track this by yourself in the
> > > > > irq_eoi() callback instead. I guess that you would skip the write to
> > > > > the CLAIM register in this case, though I have no idea whether this
> > > > > breaks the HW interrupt state or not.
> > > > The problem is when enable bit is 0 for that irq_number,
> > > > "writel(d->hwirq, handler->hart_base + CONTEXT_CLAIM)" wouldn't affect
> > > > the hw state machine. Then this irq would enter in ack state and no
> > > > continues irqs could come in.
> > >
> > > Really? This means that you cannot mask an interrupt while it is being
> > > handled? How great...
> > If the completion ID does not match an interrupt source that is
> > currently enabled for the target, the completion is silently ignored.
> > So, C9xx completion depends on enable-bit.
>
> Is that what the PLIC spec says? Or what your implementation does? I
> can understand that one implementation would be broken, but if the
> PLIC architecture itself is broken, that's far more concerning.

Yes, we are dealing with a broken/non-compliant PLIC
implementation.

The RISC-V PLIC spec defines a very different behaviour for
interrupt claim (i.e. readl(claim)) and interrupt completion (i.e.
writel(claim)). The T-HEAD PLIC implementation deviates from
the RISC-V PLIC spec because it masks an interrupt upon interrupt
claim, whereas the PLIC spec says a claim should only clear the
interrupt pending bit (not mask the interrupt).

Quoting the interrupt claim process (chapter 9) from the PLIC spec:
"The PLIC can perform an interrupt claim by reading the claim/complete
register, which returns the ID of the highest priority pending interrupt or
zero if there is no pending interrupt. A successful claim will also atomically
clear the corresponding pending bit on the interrupt source."

Refer to https://github.com/riscv/riscv-plic-spec/blob/master/riscv-plic.adoc
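
As a rough illustration of the claim behaviour quoted above, here is a small C sketch. It is a toy model under stated assumptions, not the driver or any real hardware: the struct and function names are hypothetical, priority 0 is treated as "never interrupt", ties go to the lowest ID, and the priority threshold is left out. The point it tries to show is that a claim clears the pending bit and leaves the enable bits alone.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_NR_SOURCES 8u /* hypothetical size, for illustration only */

/* Hypothetical single-target model; not the kernel's data structure. */
struct plic_claim_model {
	uint8_t priority[MODEL_NR_SOURCES]; /* assumption: 0 means "never interrupt" */
	uint32_t pending;                   /* bit n set: source n is pending */
	uint32_t enable;                    /* bit n set: source n is enabled */
};

/*
 * Model of a read of the claim register: return the ID of the highest
 * priority pending (and enabled) source, or 0 if there is none, and clear
 * that source's pending bit. The enable bits are deliberately not touched.
 */
static uint32_t model_claim(struct plic_claim_model *p)
{
	uint32_t best = 0;

	for (uint32_t id = 1; id < MODEL_NR_SOURCES; id++) {
		uint32_t bit = UINT32_C(1) << id;

		if (!(p->pending & bit) || !(p->enable & bit) || p->priority[id] == 0)
			continue;
		if (best == 0 || p->priority[id] > p->priority[best])
			best = id; /* strict '>' keeps the lowest ID on a tie */
	}

	if (best)
		p->pending &= ~(UINT32_C(1) << best);
	return best;
}
```

A claim that also cleared the enable bit for the returned ID would be the automask behaviour described for the C9xx; in this model that line is simply absent.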

Regards,
Anup

>
> M.
>
> --
> Without deviation from the norm, progress is not possible.

2021-10-20 15:05:44

by Darius Rad

Subject: Re: [PATCH V4 1/3] irqchip/sifive-plic: Add thead,c900-plic support

On Wed, Oct 20, 2021 at 10:19:06PM +0800, Guo Ren wrote:
> On Wed, Oct 20, 2021 at 9:34 PM Marc Zyngier <[email protected]> wrote:
> >
> > On Tue, 19 Oct 2021 14:27:02 +0100,
> > Guo Ren <[email protected]> wrote:
> > >
> > > On Tue, Oct 19, 2021 at 6:18 PM Marc Zyngier <[email protected]> wrote:
> > > >
> > > > On Tue, 19 Oct 2021 10:33:49 +0100,
> > > > Guo Ren <[email protected]> wrote:
> > > >
> > > > > > If you have an 'automask' behavior and yet the HW doesn't record this
> > > > > > in a separate bit, then you need to track this by yourself in the
> > > > > > irq_eoi() callback instead. I guess that you would skip the write to
> > > > > > the CLAIM register in this case, though I have no idea whether this
> > > > > > breaks the HW interrupt state or not.
> > > > > The problem is when enable bit is 0 for that irq_number,
> > > > > "writel(d->hwirq, handler->hart_base + CONTEXT_CLAIM)" wouldn't affect
> > > > > the hw state machine. Then this irq would enter in ack state and no
> > > > > continues irqs could come in.
> > > >
> > > > Really? This means that you cannot mask an interrupt while it is being
> > > > handled? How great...
> > > If the completion ID does not match an interrupt source that is
> > > currently enabled for the target, the completion is silently ignored.
> > > So, C9xx completion depends on enable-bit.
> >
> > Is that what the PLIC spec says? Or what your implementation does? I
> > can understand that one implementation would be broken, but if the
> > PLIC architecture itself is broken, that's far more concerning.
>
> Here is the description of Interrupt Completion in PLIC spec [1]:
>
> The PLIC signals it has completed executing an interrupt handler by
> writing the interrupt ID it received from the claim to the claim/complete
> register. The PLIC does not check whether the completion ID is the same
> as the last claim ID for that target. If the completion ID does not match
> an interrupt source that is currently enabled for the target, the
>                          ^^ ^^^^^^^^^ ^^^^^^^
> completion is silently ignored.
>
> [1] https://github.com/riscv/riscv-plic-spec/blob/master/riscv-plic.adoc
>
> Did we misunderstand the PLIC spec?
>

That clause sounds to me like it is due to the SiFive implementation, which
the RISC-V PLIC specification is based on. Since the PLIC spec is still a
draft I would expect it to change before release.

2021-10-20 15:13:10

Subject: Re: [PATCH V4 1/3] irqchip/sifive-plic: Add thead,c900-plic support

On Wed, 20 Oct 2021 15:33:49 +0100,
Anup Patel <[email protected]> wrote:
>
> On Wed, Oct 20, 2021 at 7:04 PM Marc Zyngier <[email protected]> wrote:
> >
> > On Tue, 19 Oct 2021 14:27:02 +0100,
> > Guo Ren <[email protected]> wrote:
> > >
> > > On Tue, Oct 19, 2021 at 6:18 PM Marc Zyngier <[email protected]> wrote:
> > > >
> > > > On Tue, 19 Oct 2021 10:33:49 +0100,
> > > > Guo Ren <[email protected]> wrote:
> > > >
> > > > > > If you have an 'automask' behavior and yet the HW doesn't record this
> > > > > > in a separate bit, then you need to track this by yourself in the
> > > > > > irq_eoi() callback instead. I guess that you would skip the write to
> > > > > > the CLAIM register in this case, though I have no idea whether this
> > > > > > breaks the HW interrupt state or not.
> > > > > The problem is when enable bit is 0 for that irq_number,
> > > > > "writel(d->hwirq, handler->hart_base + CONTEXT_CLAIM)" wouldn't affect
> > > > > the hw state machine. Then this irq would enter in ack state and no
> > > > > continues irqs could come in.
> > > >
> > > > Really? This means that you cannot mask an interrupt while it is being
> > > > handled? How great...
> > > If the completion ID does not match an interrupt source that is
> > > currently enabled for the target, the completion is silently ignored.
> > > So, C9xx completion depends on enable-bit.
> >
> > Is that what the PLIC spec says? Or what your implementation does? I
> > can understand that one implementation would be broken, but if the
> > PLIC architecture itself is broken, that's far more concerning.
>
> Yes, we are dealing with a broken/non-compliant PLIC
> implementation.
>
> The RISC-V PLIC spec defines a very different behaviour for the
> interrupt claim (i.e. readl(claim)) and interrupt completion (i.e.
> writel(claim)). The T-HEAD PLIC implementation does things
> different from what the RISC-V PLIC spec says because it will
> mask an interrupt upon interrupt claim whereas PLIC spec says
> it should only clear the interrupt pending bit (not mask the interrupt).
>
> Quoting interrupt claim process (chapter 9) from PLIC spec:
> "The PLIC can perform an interrupt claim by reading the claim/complete
> register, which returns the ID of the highest priority pending interrupt or
> zero if there is no pending interrupt. A successful claim will also atomically
> clear the corresponding pending bit on the interrupt source."
>
> Refer, https://github.com/riscv/riscv-plic-spec/blob/master/riscv-plic.adoc

That's not the point I'm making. According to Guo, the PLIC (any
implementation of it) will ignore a write to claim on a masked
interrupt.

If that's indeed correct, then a sequence such as:

(1) irq = read(claim)
(2) mask from the interrupt handler with the right flags so that it
isn't done lazily
(3) write(irq, claim)

will result in an interrupt blocked in ack state (and probably no more
interrupts for this CPU at this priority). That would be an interesting
bug in the current code, but also a pretty bad architectural choice.
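
The three-step sequence above can be played through on a toy model. The sketch below is only an illustration under assumptions, not the driver or a description of real hardware: all names are hypothetical, it tracks a single source on a single target, and it assumes a new request is not forwarded while the source is still in service.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical state of one interrupt source as seen by one target. */
struct plic_src_model {
	bool enabled;    /* enable bit for this source on this target */
	bool pending;    /* a request has been forwarded to the PLIC core */
	bool in_service; /* claimed, completion not yet accepted ("ack state") */
};

/* The device asserts its line. Assumption: nothing is forwarded while in service. */
static void model_raise(struct plic_src_model *s)
{
	if (!s->in_service)
		s->pending = true;
}

/* Step (1): read(claim). Succeeds only for a pending, enabled source. */
static bool model_claim(struct plic_src_model *s)
{
	if (!s->enabled || !s->pending || s->in_service)
		return false;
	s->pending = false;
	s->in_service = true;
	return true;
}

/* Step (2): mask from the handler, i.e. clear the enable bit. */
static void model_mask(struct plic_src_model *s)
{
	s->enabled = false;
}

static void model_unmask(struct plic_src_model *s)
{
	s->enabled = true;
}

/* Step (3): write(irq, claim). Ignored when the source is not enabled. */
static bool model_complete(struct plic_src_model *s)
{
	if (!s->enabled)
		return false;
	s->in_service = false;
	return true;
}
```

Running raise, claim, mask, complete on this model leaves in_service set, and a later unmask followed by another raise never makes the source pending again, which is the blocked state described above.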

M.

--
Without deviation from the norm, progress is not possible.

2021-10-20 16:11:54

Subject: Re: [PATCH V4 1/3] irqchip/sifive-plic: Add thead,c900-plic support

On Wed, Oct 20, 2021 at 8:38 PM Marc Zyngier <[email protected]> wrote:
>
> On Wed, 20 Oct 2021 15:33:49 +0100,
> Anup Patel <[email protected]> wrote:
> >
> > On Wed, Oct 20, 2021 at 7:04 PM Marc Zyngier <[email protected]> wrote:
> > >
> > > On Tue, 19 Oct 2021 14:27:02 +0100,
> > > Guo Ren <[email protected]> wrote:
> > > >
> > > > On Tue, Oct 19, 2021 at 6:18 PM Marc Zyngier <[email protected]> wrote:
> > > > >
> > > > > On Tue, 19 Oct 2021 10:33:49 +0100,
> > > > > Guo Ren <[email protected]> wrote:
> > > > >
> > > > > > > If you have an 'automask' behavior and yet the HW doesn't record this
> > > > > > > in a separate bit, then you need to track this by yourself in the
> > > > > > > irq_eoi() callback instead. I guess that you would skip the write to
> > > > > > > the CLAIM register in this case, though I have no idea whether this
> > > > > > > breaks the HW interrupt state or not.
> > > > > > The problem is when enable bit is 0 for that irq_number,
> > > > > > "writel(d->hwirq, handler->hart_base + CONTEXT_CLAIM)" wouldn't affect
> > > > > > the hw state machine. Then this irq would enter in ack state and no
> > > > > > continues irqs could come in.
> > > > >
> > > > > Really? This means that you cannot mask an interrupt while it is being
> > > > > handled? How great...
> > > > If the completion ID does not match an interrupt source that is
> > > > currently enabled for the target, the completion is silently ignored.
> > > > So, C9xx completion depends on enable-bit.
> > >
> > > Is that what the PLIC spec says? Or what your implementation does? I
> > > can understand that one implementation would be broken, but if the
> > > PLIC architecture itself is broken, that's far more concerning.
> >
> > Yes, we are dealing with a broken/non-compliant PLIC
> > implementation.
> >
> > The RISC-V PLIC spec defines a very different behaviour for the
> > interrupt claim (i.e. readl(claim)) and interrupt completion (i.e.
> > writel(claim)). The T-HEAD PLIC implementation does things
> > different from what the RISC-V PLIC spec says because it will
> > mask an interrupt upon interrupt claim whereas PLIC spec says
> > it should only clear the interrupt pending bit (not mask the interrupt).
> >
> > Quoting interrupt claim process (chapter 9) from PLIC spec:
> > "The PLIC can perform an interrupt claim by reading the claim/complete
> > register, which returns the ID of the highest priority pending interrupt or
> > zero if there is no pending interrupt. A successful claim will also atomically
> > clear the corresponding pending bit on the interrupt source."
> >
> > Refer, https://github.com/riscv/riscv-plic-spec/blob/master/riscv-plic.adoc
>
> That's not the point I'm making. According to Guo, the PLIC (any
> implementation of it) will ignore a write to claim on a masked
> interrupt.

Yes, a write to claim on a masked interrupt is certainly ignored, but
a read of claim does not automatically mask the interrupt.

>
> If that's indeed correct, then a sequence such as:
>
> (1) irq = read(claim)

This will return the highest-priority pending interrupt and clear the
pending bit, as per the RISC-V PLIC spec.

> (2) mask from the interrupt handler with the right flags so that it
> isn't done lazily
> (3) write(irq, claim)
>
> will result in an interrupt blocked in ack state (and probably no more
> interrupt for this CPU at this priority). That would be an interesting
> bug in the current code, but also a pretty bad architectural choice.

Interrupt claim/completion is per interrupt, not per CPU, so if an
interrupt is masked then only that interrupt is blocked (for all CPUs);
other interrupts can still be raised.

Regards,
Anup

>
> M.
>
> --
> Without deviation from the norm, progress is not possible.

2021-10-20 16:23:54

Subject: Re: [PATCH V4 1/3] irqchip/sifive-plic: Add thead,c900-plic support

On Wed, Oct 20, 2021 at 8:29 PM Darius Rad <[email protected]> wrote:
>
> On Wed, Oct 20, 2021 at 10:19:06PM +0800, Guo Ren wrote:
> > On Wed, Oct 20, 2021 at 9:34 PM Marc Zyngier <[email protected]> wrote:
> > >
> > > On Tue, 19 Oct 2021 14:27:02 +0100,
> > > Guo Ren <[email protected]> wrote:
> > > >
> > > > On Tue, Oct 19, 2021 at 6:18 PM Marc Zyngier <[email protected]> wrote:
> > > > >
> > > > > On Tue, 19 Oct 2021 10:33:49 +0100,
> > > > > Guo Ren <[email protected]> wrote:
> > > > >
> > > > > > > If you have an 'automask' behavior and yet the HW doesn't record this
> > > > > > > in a separate bit, then you need to track this by yourself in the
> > > > > > > irq_eoi() callback instead. I guess that you would skip the write to
> > > > > > > the CLAIM register in this case, though I have no idea whether this
> > > > > > > breaks the HW interrupt state or not.
> > > > > > The problem is when enable bit is 0 for that irq_number,
> > > > > > "writel(d->hwirq, handler->hart_base + CONTEXT_CLAIM)" wouldn't affect
> > > > > > the hw state machine. Then this irq would enter in ack state and no
> > > > > > continues irqs could come in.
> > > > >
> > > > > Really? This means that you cannot mask an interrupt while it is being
> > > > > handled? How great...
> > > > If the completion ID does not match an interrupt source that is
> > > > currently enabled for the target, the completion is silently ignored.
> > > > So, C9xx completion depends on enable-bit.
> > >
> > > Is that what the PLIC spec says? Or what your implementation does? I
> > > can understand that one implementation would be broken, but if the
> > > PLIC architecture itself is broken, that's far more concerning.
> >
> > Here is the description of Interrupt Completion in PLIC spec [1]:
> >
> > The PLIC signals it has completed executing an interrupt handler by
> > writing the interrupt ID it received from the claim to the claim/complete
> > register. The PLIC does not check whether the completion ID is the same
> > as the last claim ID for that target. If the completion ID does not match
> > an interrupt source that is currently enabled for the target, the
> >                          ^^ ^^^^^^^^^ ^^^^^^^
> > completion is silently ignored.
> >
> > [1] https://github.com/riscv/riscv-plic-spec/blob/master/riscv-plic.adoc
> >
> > Did we misunderstand the PLIC spec?
> >
>
> That clause sounds to me like it is due to the SiFive implementation, which
> the RISC-V PLIC specification is based on. Since the PLIC spec is still a
> draft I would expect it to change before release.

The SiFive PLIC has been adopted by various RISC-V platforms (including
SiFive themselves). Almost all existing RISC-V boards have a PLIC as the
interrupt controller.

Considering the wide usage of the PLIC across existing platforms, RISC-V
International has adopted it as an official RISC-V non-ISA spec. Of course,
the RISC-V PLIC spec needs to follow the process for RISC-V non-ISA specs,
but changing the RISC-V PLIC spec now would mean all existing RISC-V
platforms become non-compliant.

The RISC-V AIA spec is intended to replace the RISC-V PLIC spec as the
new interrupt controller spec for future RISC-V platforms.

Regards,
Anup

2021-10-20 16:54:02

Subject: Re: [PATCH V4 1/3] irqchip/sifive-plic: Add thead,c900-plic support

On Wed, 20 Oct 2021 17:08:36 +0100,
Anup Patel <[email protected]> wrote:
>
> On Wed, Oct 20, 2021 at 8:38 PM Marc Zyngier <[email protected]> wrote:
> >
> > On Wed, 20 Oct 2021 15:33:49 +0100,
> > Anup Patel <[email protected]> wrote:
> > >
> > > On Wed, Oct 20, 2021 at 7:04 PM Marc Zyngier <[email protected]> wrote:
> > > >
> > > > On Tue, 19 Oct 2021 14:27:02 +0100,
> > > > Guo Ren <[email protected]> wrote:
> > > > >
> > > > > On Tue, Oct 19, 2021 at 6:18 PM Marc Zyngier <[email protected]> wrote:
> > > > > >
> > > > > > On Tue, 19 Oct 2021 10:33:49 +0100,
> > > > > > Guo Ren <[email protected]> wrote:
> > > > > >
> > > > > > > > If you have an 'automask' behavior and yet the HW doesn't record this
> > > > > > > > in a separate bit, then you need to track this by yourself in the
> > > > > > > > irq_eoi() callback instead. I guess that you would skip the write to
> > > > > > > > the CLAIM register in this case, though I have no idea whether this
> > > > > > > > breaks the HW interrupt state or not.
> > > > > > > The problem is when enable bit is 0 for that irq_number,
> > > > > > > "writel(d->hwirq, handler->hart_base + CONTEXT_CLAIM)" wouldn't affect
> > > > > > > the hw state machine. Then this irq would enter in ack state and no
> > > > > > > continues irqs could come in.
> > > > > >
> > > > > > Really? This means that you cannot mask an interrupt while it is being
> > > > > > handled? How great...
> > > > > If the completion ID does not match an interrupt source that is
> > > > > currently enabled for the target, the completion is silently ignored.
> > > > > So, C9xx completion depends on enable-bit.
> > > >
> > > > Is that what the PLIC spec says? Or what your implementation does? I
> > > > can understand that one implementation would be broken, but if the
> > > > PLIC architecture itself is broken, that's far more concerning.
> > >
> > > Yes, we are dealing with a broken/non-compliant PLIC
> > > implementation.
> > >
> > > The RISC-V PLIC spec defines a very different behaviour for the
> > > interrupt claim (i.e. readl(claim)) and interrupt completion (i.e.
> > > writel(claim)). The T-HEAD PLIC implementation does things
> > > different from what the RISC-V PLIC spec says because it will
> > > mask an interrupt upon interrupt claim whereas PLIC spec says
> > > it should only clear the interrupt pending bit (not mask the interrupt).
> > >
> > > Quoting interrupt claim process (chapter 9) from PLIC spec:
> > > "The PLIC can perform an interrupt claim by reading the claim/complete
> > > register, which returns the ID of the highest priority pending interrupt or
> > > zero if there is no pending interrupt. A successful claim will also atomically
> > > clear the corresponding pending bit on the interrupt source."
> > >
> > > Refer, https://github.com/riscv/riscv-plic-spec/blob/master/riscv-plic.adoc
> >
> > That's not the point I'm making. According to Guo, the PLIC (any
> > implementation of it) will ignore a write to claim on a masked
> > interrupt.
>
> Yes, write to claim on a masked interrupt is certainly ignored but
> read to claim does not automatically mask the interrupt.
>
> >
> > If that's indeed correct, then a sequence such as:
> >
> > (1) irq = read(claim)
>
> This will return highest priority pending interrupt and clear the
> pending bit as-per RISC-V PLIC spec.
>
> > (2) mask from the interrupt handler with the right flags so that it
> > isn't done lazily
> > (3) write(irq, claim)
> >
> > will result in an interrupt blocked in ack state (and probably no more
> > interrupt for this CPU at this priority). That would be an interesting
> > bug in the current code, but also a pretty bad architectural choice.
>
> The interrupt claim/completion is for each interrupt and not at CPU
> level so if an interrupt is masked then only that interrupt is blocked
> for all CPUs but other interrupts can still be raised.

Do you mean that another interrupt of the same priority will be able
to be taken on *this* CPU, despite the completion being silently
ignored?

M.

--
Without deviation from the norm, progress is not possible.

2021-10-20 18:06:51

by Darius Rad

Subject: Re: [PATCH V4 1/3] irqchip/sifive-plic: Add thead,c900-plic support

On Wed, Oct 20, 2021 at 09:48:36PM +0530, Anup Patel wrote:
> On Wed, Oct 20, 2021 at 8:29 PM Darius Rad <[email protected]> wrote:
> >
> > On Wed, Oct 20, 2021 at 10:19:06PM +0800, Guo Ren wrote:
> > > On Wed, Oct 20, 2021 at 9:34 PM Marc Zyngier <[email protected]> wrote:
> > > >
> > > > On Tue, 19 Oct 2021 14:27:02 +0100,
> > > > Guo Ren <[email protected]> wrote:
> > > > >
> > > > > On Tue, Oct 19, 2021 at 6:18 PM Marc Zyngier <[email protected]> wrote:
> > > > > >
> > > > > > On Tue, 19 Oct 2021 10:33:49 +0100,
> > > > > > Guo Ren <[email protected]> wrote:
> > > > > >
> > > > > > > > If you have an 'automask' behavior and yet the HW doesn't record this
> > > > > > > > in a separate bit, then you need to track this by yourself in the
> > > > > > > > irq_eoi() callback instead. I guess that you would skip the write to
> > > > > > > > the CLAIM register in this case, though I have no idea whether this
> > > > > > > > breaks the HW interrupt state or not.
> > > > > > > The problem is when enable bit is 0 for that irq_number,
> > > > > > > "writel(d->hwirq, handler->hart_base + CONTEXT_CLAIM)" wouldn't affect
> > > > > > > the hw state machine. Then this irq would enter in ack state and no
> > > > > > > continues irqs could come in.
> > > > > >
> > > > > > Really? This means that you cannot mask an interrupt while it is being
> > > > > > handled? How great...
> > > > > If the completion ID does not match an interrupt source that is
> > > > > currently enabled for the target, the completion is silently ignored.
> > > > > So, C9xx completion depends on enable-bit.
> > > >
> > > > Is that what the PLIC spec says? Or what your implementation does? I
> > > > can understand that one implementation would be broken, but if the
> > > > PLIC architecture itself is broken, that's far more concerning.
> > >
> > > Here is the description of Interrupt Completion in PLIC spec [1]:
> > >
> > > The PLIC signals it has completed executing an interrupt handler by
> > > writing the interrupt ID it received from the claim to the claim/complete
> > > register. The PLIC does not check whether the completion ID is the same
> > > as the last claim ID for that target. If the completion ID does not match
> > > an interrupt source that is currently enabled for the target, the
> > >                          ^^ ^^^^^^^^^ ^^^^^^^
> > > completion is silently ignored.
> > >
> > > [1] https://github.com/riscv/riscv-plic-spec/blob/master/riscv-plic.adoc
> > >
> > > Did we misunderstand the PLIC spec?
> > >
> >
> > That clause sounds to me like it is due to the SiFive implementation, which
> > the RISC-V PLIC specification is based on. Since the PLIC spec is still a
> > draft I would expect it to change before release.
>
> The SiFive PLIC has been adopted by various RISC-V platforms (including
> SiFive themselves). Almost all existing RISC-V boards have PLIC as the
> interrupt controller.
>
> Considering the wide usage of PLIC across existing platforms, the RISC-V
> International has adopted it as an official RISC-V non-ISA spec. ...

You mean is in the process of adopting it, right?

> ... Of course,
> the RISC-V PLIC spec needs to follow the process for RISC-V non-ISA spec
> but changing the RISC-V PLIC spec now would mean all existing RISC-V
> platforms will become non-compliant.
>

I would expect the review process to produce a proper specification, rather
than a verbatim copy of the SiFive datasheet, and to clarify some ambiguous
and implementation-specific language. Clarifying the specification does
not necessarily make all existing implementations non-compliant, as this
has been done numerous times with other RISC-V specifications.

> The RISC-V AIA spec is intended to replace the RISC-V PLIC spec as the
> new interrupt controller spec for future RISC-V platforms.
>
> Regards,
> Anup

2021-10-21 01:49:33

Subject: Re: [PATCH V4 1/3] irqchip/sifive-plic: Add thead,c900-plic support

On Thu, Oct 21, 2021 at 12:08 AM Anup Patel <[email protected]> wrote:
>
> On Wed, Oct 20, 2021 at 8:38 PM Marc Zyngier <[email protected]> wrote:
> >
> > On Wed, 20 Oct 2021 15:33:49 +0100,
> > Anup Patel <[email protected]> wrote:
> > >
> > > On Wed, Oct 20, 2021 at 7:04 PM Marc Zyngier <[email protected]> wrote:
> > > >
> > > > On Tue, 19 Oct 2021 14:27:02 +0100,
> > > > Guo Ren <[email protected]> wrote:
> > > > >
> > > > > On Tue, Oct 19, 2021 at 6:18 PM Marc Zyngier <[email protected]> wrote:
> > > > > >
> > > > > > On Tue, 19 Oct 2021 10:33:49 +0100,
> > > > > > Guo Ren <[email protected]> wrote:
> > > > > >
> > > > > > > > If you have an 'automask' behavior and yet the HW doesn't record this
> > > > > > > > in a separate bit, then you need to track this by yourself in the
> > > > > > > > irq_eoi() callback instead. I guess that you would skip the write to
> > > > > > > > the CLAIM register in this case, though I have no idea whether this
> > > > > > > > breaks the HW interrupt state or not.
> > > > > > > The problem is when enable bit is 0 for that irq_number,
> > > > > > > "writel(d->hwirq, handler->hart_base + CONTEXT_CLAIM)" wouldn't affect
> > > > > > > the hw state machine. Then this irq would enter in ack state and no
> > > > > > > continues irqs could come in.
> > > > > >
> > > > > > Really? This means that you cannot mask an interrupt while it is being
> > > > > > handled? How great...
> > > > > If the completion ID does not match an interrupt source that is
> > > > > currently enabled for the target, the completion is silently ignored.
> > > > > So, C9xx completion depends on enable-bit.
> > > >
> > > > Is that what the PLIC spec says? Or what your implementation does? I
> > > > can understand that one implementation would be broken, but if the
> > > > PLIC architecture itself is broken, that's far more concerning.
> > >
> > > Yes, we are dealing with a broken/non-compliant PLIC
> > > implementation.
> > >
> > > The RISC-V PLIC spec defines a very different behaviour for the
> > > interrupt claim (i.e. readl(claim)) and interrupt completion (i.e.
> > > writel(claim)). The T-HEAD PLIC implementation does things
> > > different from what the RISC-V PLIC spec says because it will
> > > mask an interrupt upon interrupt claim whereas PLIC spec says
> > > it should only clear the interrupt pending bit (not mask the interrupt).
> > >
> > > Quoting interrupt claim process (chapter 9) from PLIC spec:
> > > "The PLIC can perform an interrupt claim by reading the claim/complete
> > > register, which returns the ID of the highest priority pending interrupt or
> > > zero if there is no pending interrupt. A successful claim will also atomically
> > > clear the corresponding pending bit on the interrupt source."
> > >
> > > Refer, https://github.com/riscv/riscv-plic-spec/blob/master/riscv-plic.adoc
> >
> > That's not the point I'm making. According to Guo, the PLIC (any
> > implementation of it) will ignore a write to claim on a masked
> > interrupt.
>
> Yes, write to claim on a masked interrupt is certainly ignored but
> read to claim does not automatically mask the interrupt.
>
> >
> > If that's indeed correct, then a sequence such as:
> >
> > (1) irq = read(claim)
>
> This will return highest priority pending interrupt and clear the
> pending bit as-per RISC-V PLIC spec.
>
> > (2) mask from the interrupt handler with the right flags so that it
> > isn't done lazily
> > (3) write(irq, claim)
> >
> > will result in an interrupt blocked in ack state (and probably no more
> > interrupt for this CPU at this priority). That would be an interesting
> > bug in the current code, but also a pretty bad architectural choice.
>
> The interrupt claim/completion is for each interrupt and not at CPU
> level so if an interrupt is masked then only that interrupt is blocked
> for all CPUs but other interrupts can still be raised.

1. I think the PLIC can only accept a new incoming IRQ after the
previous one has been completed:

claim IRQ-0
complete IRQ-0
claim IRQ-1
complete IRQ-1
claim IRQ-2
complete IRQ-2

Any recursion would break the PLIC, right? That's why we need to mask
the IRQ before entering its thread_fn.

2. The current flow for a ONESHOT IRQ is:

plic_handle_irq -> readl(claim)
handle_fasteoi_irq -> if (desc->istate & IRQS_ONESHOT) mask_irq(desc);
handle_fasteoi_irq -> chip->irq_eoi(&desc->irq_data); // failed: the IRQ is masked, so the completion is ignored

It seems all ONESHOT IRQs would be broken, right?

>
> Regards,
> Anup
>
> >
> > M.
> >
> > --
> > Without deviation from the norm, progress is not possible.
--
Best Regards
Guo Ren

ML: https://lore.kernel.org/linux-csky/

2021-10-21 02:03:05

Subject: Re: [PATCH V4 1/3] irqchip/sifive-plic: Add thead,c900-plic support

On Wed, Oct 20, 2021 at 11:08 PM Marc Zyngier <[email protected]> wrote:
>
> On Wed, 20 Oct 2021 15:33:49 +0100,
> Anup Patel <[email protected]> wrote:
> >
> > On Wed, Oct 20, 2021 at 7:04 PM Marc Zyngier <[email protected]> wrote:
> > >
> > > On Tue, 19 Oct 2021 14:27:02 +0100,
> > > Guo Ren <[email protected]> wrote:
> > > >
> > > > On Tue, Oct 19, 2021 at 6:18 PM Marc Zyngier <[email protected]> wrote:
> > > > >
> > > > > On Tue, 19 Oct 2021 10:33:49 +0100,
> > > > > Guo Ren <[email protected]> wrote:
> > > > >
> > > > > > > If you have an 'automask' behavior and yet the HW doesn't record this
> > > > > > > in a separate bit, then you need to track this by yourself in the
> > > > > > > irq_eoi() callback instead. I guess that you would skip the write to
> > > > > > > the CLAIM register in this case, though I have no idea whether this
> > > > > > > breaks the HW interrupt state or not.
> > > > > > The problem is when enable bit is 0 for that irq_number,
> > > > > > "writel(d->hwirq, handler->hart_base + CONTEXT_CLAIM)" wouldn't affect
> > > > > > the hw state machine. Then this irq would enter in ack state and no
> > > > > > continues irqs could come in.
> > > > >
> > > > > Really? This means that you cannot mask an interrupt while it is being
> > > > > handled? How great...
> > > > If the completion ID does not match an interrupt source that is
> > > > currently enabled for the target, the completion is silently ignored.
> > > > So, C9xx completion depends on enable-bit.
> > >
> > > Is that what the PLIC spec says? Or what your implementation does? I
> > > can understand that one implementation would be broken, but if the
> > > PLIC architecture itself is broken, that's far more concerning.
> >
> > Yes, we are dealing with a broken/non-compliant PLIC
> > implementation.
> >
> > The RISC-V PLIC spec defines a very different behaviour for the
> > interrupt claim (i.e. readl(claim)) and interrupt completion (i.e.
> > writel(claim)). The T-HEAD PLIC implementation does things
> > different from what the RISC-V PLIC spec says because it will
> > mask an interrupt upon interrupt claim whereas PLIC spec says
> > it should only clear the interrupt pending bit (not mask the interrupt).
> >
> > Quoting interrupt claim process (chapter 9) from PLIC spec:
> > "The PLIC can perform an interrupt claim by reading the claim/complete
> > register, which returns the ID of the highest priority pending interrupt or
> > zero if there is no pending interrupt. A successful claim will also atomically
> > clear the corresponding pending bit on the interrupt source."
> >
> > Refer, https://github.com/riscv/riscv-plic-spec/blob/master/riscv-plic.adoc
>
> That's not the point I'm making. According to Guo, the PLIC (any
> implementation of it) will ignore a write to claim on a masked
> interrupt.
>
> If that's indeed correct, then a sequence such as:
>
> (1) irq = read(claim)
> (2) mask from the interrupt handler with the right flags so that it
> isn't done lazily
> (3) write(irq, claim)

How about changing the generic IRQ chip code instead, so that the
unmask happens before the EOI?

diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c
index a98bcfc4be7b..ed6ace1058ac 100644
--- a/kernel/irq/chip.c
+++ b/kernel/irq/chip.c
@@ -444,10 +444,10 @@ void unmask_threaded_irq(struct irq_desc *desc)
 {
 	struct irq_chip *chip = desc->irq_data.chip;

+	unmask_irq(desc);
+
 	if (chip->flags & IRQCHIP_EOI_THREADED)
 		chip->irq_eoi(&desc->irq_data);
-
-	unmask_irq(desc);
 }

 /*
@@ -673,8 +673,8 @@ static void cond_unmask_eoi_irq(struct irq_desc *desc, struct irq_chip *chip)
 	 */
 	if (!irqd_irq_disabled(&desc->irq_data) &&
 	    irqd_irq_masked(&desc->irq_data) && !desc->threads_oneshot) {
-		chip->irq_eoi(&desc->irq_data);
 		unmask_irq(desc);
+		chip->irq_eoi(&desc->irq_data);
 	} else if (!(chip->flags & IRQCHIP_EOI_THREADED)) {
 		chip->irq_eoi(&desc->irq_data);
 	}

>
> will result in an interrupt blocked in ack state (and probably no more
> interrupt for this CPU at this priority). That would be an interesting
> bug in the current code, but also a pretty bad architectural choice.
>
> M.
>
> --
> Without deviation from the norm, progress is not possible.

--
Best Regards
Guo Ren

ML: https://lore.kernel.org/linux-csky/

2021-10-21 08:36:10

Subject: Re: [PATCH V4 1/3] irqchip/sifive-plic: Add thead,c900-plic support

On Thu, 21 Oct 2021 03:00:43 +0100,
Guo Ren <[email protected]> wrote:
>
> On Wed, Oct 20, 2021 at 11:08 PM Marc Zyngier <[email protected]> wrote:
> >
> > On Wed, 20 Oct 2021 15:33:49 +0100,
> > Anup Patel <[email protected]> wrote:
> > >
> > > On Wed, Oct 20, 2021 at 7:04 PM Marc Zyngier <[email protected]> wrote:
> > > >
> > > > On Tue, 19 Oct 2021 14:27:02 +0100,
> > > > Guo Ren <[email protected]> wrote:
> > > > >
> > > > > On Tue, Oct 19, 2021 at 6:18 PM Marc Zyngier <[email protected]> wrote:
> > > > > >
> > > > > > On Tue, 19 Oct 2021 10:33:49 +0100,
> > > > > > Guo Ren <[email protected]> wrote:
> > > > > >
> > > > > > > > If you have an 'automask' behavior and yet the HW doesn't record this
> > > > > > > > in a separate bit, then you need to track this by yourself in the
> > > > > > > > irq_eoi() callback instead. I guess that you would skip the write to
> > > > > > > > the CLAIM register in this case, though I have no idea whether this
> > > > > > > > breaks the HW interrupt state or not.
> > > > > > > The problem is when enable bit is 0 for that irq_number,
> > > > > > > "writel(d->hwirq, handler->hart_base + CONTEXT_CLAIM)" wouldn't affect
> > > > > > > the hw state machine. Then this irq would stay in the ack state and no
> > > > > > > further irqs could come in.
> > > > > >
> > > > > > Really? This means that you cannot mask an interrupt while it is being
> > > > > > handled? How great...
> > > > > If the completion ID does not match an interrupt source that is
> > > > > currently enabled for the target, the completion is silently ignored.
> > > > > So, C9xx completion depends on enable-bit.
> > > >
> > > > Is that what the PLIC spec says? Or what your implementation does? I
> > > > can understand that one implementation would be broken, but if the
> > > > PLIC architecture itself is broken, that's far more concerning.
> > >
> > > Yes, we are dealing with a broken/non-compliant PLIC
> > > implementation.
> > >
> > > The RISC-V PLIC spec defines a very different behaviour for the
> > > interrupt claim (i.e. readl(claim)) and interrupt completion (i.e.
> > > writel(claim)). The T-HEAD PLIC implementation does things
> > > > differently from what the RISC-V PLIC spec says because it will
> > > mask an interrupt upon interrupt claim whereas PLIC spec says
> > > it should only clear the interrupt pending bit (not mask the interrupt).
> > >
> > > Quoting interrupt claim process (chapter 9) from PLIC spec:
> > > "The PLIC can perform an interrupt claim by reading the claim/complete
> > > register, which returns the ID of the highest priority pending interrupt or
> > > zero if there is no pending interrupt. A successful claim will also atomically
> > > clear the corresponding pending bit on the interrupt source."
> > >
> > > Refer, https://github.com/riscv/riscv-plic-spec/blob/master/riscv-plic.adoc
> >
> > That's not the point I'm making. According to Guo, the PLIC (any
> > implementation of it) will ignore a write to claim on a masked
> > interrupt.
> >
> > If that's indeed correct, then a sequence such as:
> >
> > (1) irq = read(claim)
> > (2) mask from the interrupt handler with the right flags so that it
> > isn't done lazily
> > (3) write(irq, claim)
>
> How about changing the generic IRQ chip code instead, so that the unmask
> happens before the EOI?
>
> diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c
> index a98bcfc4be7b..ed6ace1058ac 100644
> --- a/kernel/irq/chip.c
> +++ b/kernel/irq/chip.c
> @@ -444,10 +444,10 @@ void unmask_threaded_irq(struct irq_desc *desc)
>  {
>  	struct irq_chip *chip = desc->irq_data.chip;
>
> +	unmask_irq(desc);
> +
>  	if (chip->flags & IRQCHIP_EOI_THREADED)
>  		chip->irq_eoi(&desc->irq_data);
> -
> -	unmask_irq(desc);
>  }
>
>  /*
> @@ -673,8 +673,8 @@ static void cond_unmask_eoi_irq(struct irq_desc *desc, struct irq_chip *chip)
>  	 */
>  	if (!irqd_irq_disabled(&desc->irq_data) &&
>  	    irqd_irq_masked(&desc->irq_data) && !desc->threads_oneshot) {
> -		chip->irq_eoi(&desc->irq_data);
>  		unmask_irq(desc);
> +		chip->irq_eoi(&desc->irq_data);
>  	} else if (!(chip->flags & IRQCHIP_EOI_THREADED)) {
>  		chip->irq_eoi(&desc->irq_data);
>  	}

No, I don't think that's acceptable, and I strongly suspect that other
irqchips have the opposite requirement. You'll have to keep the
workaround in the PLIC code and track the EOI vs unmask to do the
right thing in both callbacks.
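
As an illustration of what tracking EOI versus unmask in both callbacks
could look like, here is a minimal standalone C model. It is not the
actual PLIC driver: all names (struct toy_irq, toy_irq_eoi(), the
eoi_deferred flag) are hypothetical, and the hardware is assumed to drop
a completion write while the source's enable bit is clear, as discussed
in this thread. The idea sketched is that an EOI arriving while the
interrupt is masked is remembered, and the completion is issued from the
unmask path once the enable bit is set again.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of one interrupt source on one context (not driver code). */
struct toy_irq {
	bool hw_enabled;	/* enable bit in the controller */
	bool hw_in_service;	/* claimed, completion not yet accepted */
	bool eoi_deferred;	/* software state: EOI arrived while masked */
};

/* Assumed hardware rule: completion is ignored if the source is disabled. */
void hw_complete(struct toy_irq *irq)
{
	if (irq->hw_enabled)
		irq->hw_in_service = false;
}

/* Claim: only an enabled, idle source can be claimed in this model. */
void toy_irq_claim(struct toy_irq *irq)
{
	assert(irq->hw_enabled && !irq->hw_in_service);
	irq->hw_in_service = true;
}

void toy_irq_mask(struct toy_irq *irq)
{
	irq->hw_enabled = false;
}

/* EOI callback: defer the completion if it would be ignored right now. */
void toy_irq_eoi(struct toy_irq *irq)
{
	if (!irq->hw_enabled) {
		irq->eoi_deferred = true;
		return;
	}
	hw_complete(irq);
}

/* Unmask callback: re-enable first, then replay a deferred completion. */
void toy_irq_unmask(struct toy_irq *irq)
{
	irq->hw_enabled = true;
	if (irq->eoi_deferred) {
		irq->eoi_deferred = false;
		hw_complete(irq);
	}
}

/* Oneshot-style ordering: claim, mask, EOI, then unmask from the thread. */
bool toy_oneshot_cycle_leaves_irq_usable(void)
{
	struct toy_irq irq = { .hw_enabled = true };

	toy_irq_claim(&irq);
	toy_irq_mask(&irq);
	toy_irq_eoi(&irq);	/* deferred: source is masked */
	toy_irq_unmask(&irq);	/* completion is replayed here */

	return irq.hw_enabled && !irq.hw_in_service && !irq.eoi_deferred;
}
```

In this model the generic code keeps its existing EOI-then-unmask
ordering, and the source still ends up enabled and idle. A real driver
would also need locking and per-context state, which the sketch leaves
out.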

M.

--
Without deviation from the norm, progress is not possible.

2021-10-21 08:52:13

Subject: Re: [PATCH V4 1/3] irqchip/sifive-plic: Add thead,c900-plic support

On Wed, Oct 20, 2021 at 11:32 PM Darius Rad <[email protected]> wrote:
>
> On Wed, Oct 20, 2021 at 09:48:36PM +0530, Anup Patel wrote:
> > On Wed, Oct 20, 2021 at 8:29 PM Darius Rad <[email protected]> wrote:
> > >
> > > On Wed, Oct 20, 2021 at 10:19:06PM +0800, Guo Ren wrote:
> > > > On Wed, Oct 20, 2021 at 9:34 PM Marc Zyngier <[email protected]> wrote:
> > > > >
> > > > > On Tue, 19 Oct 2021 14:27:02 +0100,
> > > > > Guo Ren <[email protected]> wrote:
> > > > > >
> > > > > > On Tue, Oct 19, 2021 at 6:18 PM Marc Zyngier <[email protected]> wrote:
> > > > > > >
> > > > > > > On Tue, 19 Oct 2021 10:33:49 +0100,
> > > > > > > Guo Ren <[email protected]> wrote:
> > > > > > >
> > > > > > > > > If you have an 'automask' behavior and yet the HW doesn't record this
> > > > > > > > > in a separate bit, then you need to track this by yourself in the
> > > > > > > > > irq_eoi() callback instead. I guess that you would skip the write to
> > > > > > > > > the CLAIM register in this case, though I have no idea whether this
> > > > > > > > > > breaks the HW interrupt state or not.
> > > > > > > > The problem is when enable bit is 0 for that irq_number,
> > > > > > > > "writel(d->hwirq, handler->hart_base + CONTEXT_CLAIM)" wouldn't affect
> > > > > > > > > the hw state machine. Then this irq would stay in the ack state and no
> > > > > > > > > further irqs could come in.
> > > > > > >
> > > > > > > Really? This means that you cannot mask an interrupt while it is being
> > > > > > > handled? How great...
> > > > > > If the completion ID does not match an interrupt source that is
> > > > > > currently enabled for the target, the completion is silently ignored.
> > > > > > So, C9xx completion depends on enable-bit.
> > > > >
> > > > > Is that what the PLIC spec says? Or what your implementation does? I
> > > > > can understand that one implementation would be broken, but if the
> > > > > PLIC architecture itself is broken, that's far more concerning.
> > > >
> > > > Here is the description of Interrupt Completion in PLIC spec [1]:
> > > >
> > > > The PLIC signals it has completed executing an interrupt handler by
> > > > writing the interrupt ID it received from the claim to the claim/complete
> > > > register. The PLIC does not check whether the completion ID is the same
> > > > as the last claim ID for that target. If the completion ID does not match
> > > > an interrupt source that is currently enabled for the target, the
> > > > ^^ ^^^^^^^^^ ^^^^^^^
> > > > completion is silently ignored.
> > > >
> > > > [1] https://github.com/riscv/riscv-plic-spec/blob/master/riscv-plic.adoc
> > > >
> > > > Did we misunderstand the PLIC spec?
> > > >
> > >
> > > That clause sounds to me like it is due to the SiFive implementation, which
> > > the RISC-V PLIC specification is based on. Since the PLIC spec is still a
> > > draft I would expect it to change before release.
> >
> > The SiFive PLIC has been adopted by various RISC-V platforms (including
> > SiFive themselves). Almost all existing RISC-V boards have PLIC as the
> > interrupt controller.
> >
> > Considering the wide usage of PLIC across existing platforms, the RISC-V
> > International has adopted it as an official RISC-V non-ISA spec. ...
>
> You mean is in the process of adopting it, right?

Yes, it is in the process.

>
> > ... Of course,
> > the RISC-V PLIC spec needs to follow the process for RISC-V non-ISA spec
> > but changing the RISC-V PLIC spec now would mean all existing RISC-V
> > platforms will become non-compliant.
> >
>
> I would expect the review process to produce a proper specification, rather
> than a verbatim copy of the SiFive datasheet, and clarify some ambiguous
> and implementation specific language. Clarifying the specification does
> not necessarily make all existing implementations non-compliant, as this
> has been done numerous times with other RISC-V specifications.

Yes, clarification can definitely be done.

Regards,
Anup

>
> > The RISC-V AIA spec is intended to replace the RISC-V PLIC spec as the
> > new interrupt controller spec for future RISC-V platforms.
> >
> > Regards,
> > Anup
> >
> > _______________________________________________
> > linux-riscv mailing list
> > [email protected]
> > http://lists.infradead.org/mailman/listinfo/linux-riscv

2021-10-21 08:55:42

Subject: Re: [PATCH V4 1/3] irqchip/sifive-plic: Add thead,c900-plic support

On Wed, Oct 20, 2021 at 10:18 PM Marc Zyngier <[email protected]> wrote:
>
> On Wed, 20 Oct 2021 17:08:36 +0100,
> Anup Patel <[email protected]> wrote:
> >
> > On Wed, Oct 20, 2021 at 8:38 PM Marc Zyngier <[email protected]> wrote:
> > >
> > > On Wed, 20 Oct 2021 15:33:49 +0100,
> > > Anup Patel <[email protected]> wrote:
> > > >
> > > > On Wed, Oct 20, 2021 at 7:04 PM Marc Zyngier <[email protected]> wrote:
> > > > >
> > > > > On Tue, 19 Oct 2021 14:27:02 +0100,
> > > > > Guo Ren <[email protected]> wrote:
> > > > > >
> > > > > > On Tue, Oct 19, 2021 at 6:18 PM Marc Zyngier <[email protected]> wrote:
> > > > > > >
> > > > > > > On Tue, 19 Oct 2021 10:33:49 +0100,
> > > > > > > Guo Ren <[email protected]> wrote:
> > > > > > >
> > > > > > > > > If you have an 'automask' behavior and yet the HW doesn't record this
> > > > > > > > > in a separate bit, then you need to track this by yourself in the
> > > > > > > > > irq_eoi() callback instead. I guess that you would skip the write to
> > > > > > > > > the CLAIM register in this case, though I have no idea whether this
> > > > > > > > > > breaks the HW interrupt state or not.
> > > > > > > > The problem is when enable bit is 0 for that irq_number,
> > > > > > > > "writel(d->hwirq, handler->hart_base + CONTEXT_CLAIM)" wouldn't affect
> > > > > > > > > the hw state machine. Then this irq would stay in the ack state and no
> > > > > > > > > further irqs could come in.
> > > > > > >
> > > > > > > Really? This means that you cannot mask an interrupt while it is being
> > > > > > > handled? How great...
> > > > > > If the completion ID does not match an interrupt source that is
> > > > > > currently enabled for the target, the completion is silently ignored.
> > > > > > So, C9xx completion depends on enable-bit.
> > > > >
> > > > > Is that what the PLIC spec says? Or what your implementation does? I
> > > > > can understand that one implementation would be broken, but if the
> > > > > PLIC architecture itself is broken, that's far more concerning.
> > > >
> > > > Yes, we are dealing with a broken/non-compliant PLIC
> > > > implementation.
> > > >
> > > > The RISC-V PLIC spec defines a very different behaviour for the
> > > > interrupt claim (i.e. readl(claim)) and interrupt completion (i.e.
> > > > writel(claim)). The T-HEAD PLIC implementation does things
> > > > > differently from what the RISC-V PLIC spec says because it will
> > > > mask an interrupt upon interrupt claim whereas PLIC spec says
> > > > it should only clear the interrupt pending bit (not mask the interrupt).
> > > >
> > > > Quoting interrupt claim process (chapter 9) from PLIC spec:
> > > > "The PLIC can perform an interrupt claim by reading the claim/complete
> > > > register, which returns the ID of the highest priority pending interrupt or
> > > > zero if there is no pending interrupt. A successful claim will also atomically
> > > > clear the corresponding pending bit on the interrupt source."
> > > >
> > > > Refer, https://github.com/riscv/riscv-plic-spec/blob/master/riscv-plic.adoc
> > >
> > > That's not the point I'm making. According to Guo, the PLIC (any
> > > implementation of it) will ignore a write to claim on a masked
> > > interrupt.
> >
> > Yes, write to claim on a masked interrupt is certainly ignored but
> > read to claim does not automatically mask the interrupt.
> >
> > >
> > > If that's indeed correct, then a sequence such as:
> > >
> > > (1) irq = read(claim)
> >
> > This will return highest priority pending interrupt and clear the
> > pending bit as-per RISC-V PLIC spec.
> >
> > > (2) mask from the interrupt handler with the right flags so that it
> > > isn't done lazily
> > > (3) write(irq, claim)
> > >
> > > will result in an interrupt blocked in ack state (and probably no more
> > > interrupt for this CPU at this priority). That would be an interesting
> > > bug in the current code, but also a pretty bad architectural choice.
> >
> > The interrupt claim/completion is for each interrupt and not at CPU
> > level so if an interrupt is masked then only that interrupt is blocked
> > for all CPUs but other interrupts can still be raised.
>
> Do you mean that another interrupt of the same priority will be able
> to be taken on *this* CPU, despite the completion being silently
> ignored?

This part is not clear in the RISC-V PLIC spec, so I will request that a
clarification be added.
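
To make the sequence under discussion concrete, here is a toy C model of
a single PLIC target. It assumes the completion rule quoted from the
spec earlier in the thread (a completion for a source that is not
enabled is silently ignored). The in_service bitmap, the omission of
priorities, and all function names are assumptions made for
illustration. This is not driver code, and it does not settle the open
question about other interrupts at the same priority.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_IRQS 32u	/* hypothetical number of sources; ID 0 is "none" */

struct plic_model {
	uint32_t enable;	/* per-source enable bits for one target */
	uint32_t pending;	/* per-source pending bits */
	uint32_t in_service;	/* claimed, completion not yet accepted */
};

/* A source raises its line; it cannot become pending while in service. */
bool plic_raise(struct plic_model *p, unsigned int irq)
{
	assert(irq > 0 && irq < NR_IRQS);
	if (p->in_service & (1u << irq))
		return false;
	p->pending |= 1u << irq;
	return true;
}

/* Claim: return an enabled pending source and clear its pending bit. */
unsigned int plic_claim(struct plic_model *p)
{
	uint32_t ready = p->pending & p->enable;

	for (unsigned int irq = 1; irq < NR_IRQS; irq++) {
		if (ready & (1u << irq)) {
			p->pending &= ~(1u << irq);
			p->in_service |= 1u << irq;
			return irq;
		}
	}
	return 0;	/* nothing to claim */
}

/* Complete: silently ignored if the source is not currently enabled. */
void plic_complete(struct plic_model *p, unsigned int irq)
{
	assert(irq > 0 && irq < NR_IRQS);
	if (!(p->enable & (1u << irq)))
		return;
	p->in_service &= ~(1u << irq);
}

/* (1) claim, (2) mask, (3) complete, then unmask: can the irq fire again? */
bool sequence_mask_then_complete(void)
{
	struct plic_model p = { .enable = 1u << 5 };
	unsigned int irq;

	plic_raise(&p, 5);
	irq = plic_claim(&p);		/* (1) */
	p.enable &= ~(1u << irq);	/* (2) mask */
	plic_complete(&p, irq);		/* (3) ignored: source is disabled */
	p.enable |= 1u << irq;		/* later unmask */
	return plic_raise(&p, irq);	/* false: still in service */
}

/* Same steps, but the unmask happens before the completion write. */
bool sequence_unmask_then_complete(void)
{
	struct plic_model p = { .enable = 1u << 5 };
	unsigned int irq;

	plic_raise(&p, 5);
	irq = plic_claim(&p);
	p.enable &= ~(1u << irq);	/* mask */
	p.enable |= 1u << irq;		/* unmask first */
	plic_complete(&p, irq);		/* accepted */
	return plic_raise(&p, irq);	/* true: source is idle again */
}
```

In this model, sequence_mask_then_complete() leaves the source stuck in
service, while sequence_unmask_then_complete() does not, which is the
ordering dependency being debated.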

Regards,
Anup

>
> M.
>
> --
> Without deviation from the norm, progress is not possible.

2021-10-21 09:46:03