2019-09-10 19:02:33

by Greg Kurz

Subject: [PATCH] powerpc/xive: Fix bogus error code returned by OPAL

There's a bug in skiboot that causes the OPAL_XIVE_ALLOCATE_IRQ call
to return the 32-bit value 0xffffffff when OPAL has run out of IRQs.
Unfortunately, OPAL return values are signed 64-bit entities and
errors are supposed to be negative. If that happens, the Linux code
confusingly treats 0xffffffff as a valid IRQ number and panics at some
point.

A fix was recently merged in skiboot:

e97391ae2bb5 ("xive: fix return value of opal_xive_allocate_irq()")

but we need a workaround anyway to support older skiboots already
on the field.

Internally convert 0xffffffff to OPAL_RESOURCE which is the usual error
returned upon resource exhaustion.

Signed-off-by: Greg Kurz <[email protected]>
---
arch/powerpc/sysdev/xive/native.c | 13 +++++++++++--
1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/sysdev/xive/native.c b/arch/powerpc/sysdev/xive/native.c
index 37987c815913..c35583f84f9f 100644
--- a/arch/powerpc/sysdev/xive/native.c
+++ b/arch/powerpc/sysdev/xive/native.c
@@ -231,6 +231,15 @@ static bool xive_native_match(struct device_node *node)
 	return of_device_is_compatible(node, "ibm,opal-xive-vc");
 }
 
+static int64_t opal_xive_allocate_irq_fixup(uint32_t chip_id)
+{
+	s64 irq = opal_xive_allocate_irq(chip_id);
+
+#define XIVE_ALLOC_NO_SPACE 0xffffffff /* No possible space */
+	return
+		irq == XIVE_ALLOC_NO_SPACE ? OPAL_RESOURCE : irq;
+}
+
 #ifdef CONFIG_SMP
 static int xive_native_get_ipi(unsigned int cpu, struct xive_cpu *xc)
 {
@@ -238,7 +247,7 @@ static int xive_native_get_ipi(unsigned int cpu, struct xive_cpu *xc)
 
 	/* Allocate an IPI and populate info about it */
 	for (;;) {
-		irq = opal_xive_allocate_irq(xc->chip_id);
+		irq = opal_xive_allocate_irq_fixup(xc->chip_id);
 		if (irq == OPAL_BUSY) {
 			msleep(OPAL_BUSY_DELAY_MS);
 			continue;
@@ -259,7 +268,7 @@ u32 xive_native_alloc_irq(void)
 	s64 rc;
 
 	for (;;) {
-		rc = opal_xive_allocate_irq(OPAL_XIVE_ANY_CHIP);
+		rc = opal_xive_allocate_irq_fixup(OPAL_XIVE_ANY_CHIP);
 		if (rc != OPAL_BUSY)
 			break;
 		msleep(OPAL_BUSY_DELAY_MS);
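
To make the signedness problem from the commit message concrete, here is a minimal standalone C sketch (plain userspace C, not kernel code). All `demo_` names are hypothetical, and `DEMO_OPAL_RESOURCE` is only a stand-in: its value (-10) is assumed to mirror the kernel's `OPAL_RESOURCE`. The sketch shows that 0xffffffff stored in a signed 64-bit variable is a large positive number, so a "negative means error" check does not catch it until it is mapped to a real error code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for the kernel's OPAL_RESOURCE; the value is an assumption. */
#define DEMO_OPAL_RESOURCE ((int64_t)-10)

/* OPAL errors are negative when read as a signed 64-bit value. */
bool demo_is_opal_error(int64_t rc)
{
	return rc < 0;
}

/*
 * Map the bogus 32-bit "no space" value to a proper negative error.
 * Any other value (valid IRQ number or genuine error) passes through.
 */
int64_t demo_fixup_alloc_rc(int64_t rc)
{
	return rc == 0xffffffff ? DEMO_OPAL_RESOURCE : rc;
}
```

With these helpers, `demo_is_opal_error(0xffffffff)` is false (the value is 4294967295 as an `int64_t`), while `demo_is_opal_error(demo_fixup_alloc_rc(0xffffffff))` is true.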


2019-09-11 14:28:46

by Michael Ellerman

Subject: Re: [PATCH] powerpc/xive: Fix bogus error code returned by OPAL

Hi Greg,

Couple of comments ...

Greg Kurz <[email protected]> writes:
> There's a bug in skiboot that causes the OPAL_XIVE_ALLOCATE_IRQ call
> to return the 32-bit value 0xffffffff when OPAL has run out of IRQs.
> Unfortunately, OPAL return values are signed 64-bit entities and
> errors are supposed to be negative. If that happens, the Linux code
> confusingly treats 0xffffffff as a valid IRQ number and panics at some
> point.
>
> A fix was recently merged in skiboot:
>
> e97391ae2bb5 ("xive: fix return value of opal_xive_allocate_irq()")
>
> but we need a workaround anyway to support older skiboots already
> on the field.
  ^
  in

>
> Internally convert 0xffffffff to OPAL_RESOURCE which is the usual error
> returned upon resource exhaustion.

This should go to stable, any idea what versions it should go back to?
Probably whenever the xive code was introduced?

> Signed-off-by: Greg Kurz <[email protected]>
> ---
> arch/powerpc/sysdev/xive/native.c | 13 +++++++++++--
> 1 file changed, 11 insertions(+), 2 deletions(-)
>
> diff --git a/arch/powerpc/sysdev/xive/native.c b/arch/powerpc/sysdev/xive/native.c
> index 37987c815913..c35583f84f9f 100644
> --- a/arch/powerpc/sysdev/xive/native.c
> +++ b/arch/powerpc/sysdev/xive/native.c
> @@ -231,6 +231,15 @@ static bool xive_native_match(struct device_node *node)
>  	return of_device_is_compatible(node, "ibm,opal-xive-vc");
>  }
>
> +static int64_t opal_xive_allocate_irq_fixup(uint32_t chip_id)
         ^                                     ^
         Can you use s64 here                  and u32 here ....

Instead of calling this opal_xive_allocate_irq_fixup() and relying on
all callers to call the fixup, can we rename the opal wrapper and leave
this function's name unchanged, eg:

-OPAL_CALL(opal_xive_allocate_irq, OPAL_XIVE_ALLOCATE_IRQ);
+OPAL_CALL(opal_xive_allocate_irq_raw, OPAL_XIVE_ALLOCATE_IRQ);


> +{
> +	s64 irq = opal_xive_allocate_irq(chip_id);
> +
> +#define XIVE_ALLOC_NO_SPACE 0xffffffff /* No possible space */
> +	return
> +		irq == XIVE_ALLOC_NO_SPACE ? OPAL_RESOURCE : irq;
> +}

I don't really like the #define and the weird indenting and so on, can
we instead do it like:

	/*
	 * Old versions of skiboot can incorrectly return 0xffffffff to
	 * indicate no space, fix it up here.
	 */
	return irq == 0xffffffff ? OPAL_RESOURCE : irq;

cheers
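
The shape proposed in this review (rename the firmware wrapper to a `_raw` variant and keep the original name for the function that applies the fixup) can be sketched in standalone C as follows. This is only an illustration under stated assumptions: `demo_allocate_irq_raw()` is a hypothetical stub that simulates an old skiboot running out of IRQs for chip 0, the IRQ numbers it returns are made up, and `DEMO_OPAL_RESOURCE` (-10) is assumed to mirror the kernel's `OPAL_RESOURCE`.

```c
#include <stdint.h>

/* Stand-in for the kernel's OPAL_RESOURCE; the value is an assumption. */
#define DEMO_OPAL_RESOURCE ((int64_t)-10)

/*
 * Hypothetical stub for the renamed firmware call. It pretends that
 * chip 0 has no IRQs left and reports this the buggy way (0xffffffff);
 * other chips get a made-up IRQ number.
 */
int64_t demo_allocate_irq_raw(uint32_t chip_id)
{
	if (chip_id == 0)
		return (int64_t)0xffffffff;
	return (int64_t)0x100 + chip_id;
}

/*
 * Keeps the name callers already use, so no call site has to remember
 * to go through a separate fixup helper.
 */
int64_t demo_allocate_irq(uint32_t chip_id)
{
	int64_t irq = demo_allocate_irq_raw(chip_id);

	/*
	 * Old firmware can incorrectly return 0xffffffff to indicate
	 * no space; turn it into a proper negative error here.
	 */
	return irq == 0xffffffff ? DEMO_OPAL_RESOURCE : irq;
}
```

Because the fixup lives behind the unchanged name, existing callers pick it up without modification, and a future caller cannot accidentally bypass it short of calling the `_raw` variant directly.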

2019-09-11 16:03:51

by Greg Kurz

Subject: Re: [PATCH] powerpc/xive: Fix bogus error code returned by OPAL

On Thu, 12 Sep 2019 00:26:19 +1000
Michael Ellerman <[email protected]> wrote:

> Hi Greg,
>

Bom dia ! :)

> Couple of comments ...
>
> Greg Kurz <[email protected]> writes:
> > There's a bug in skiboot that causes the OPAL_XIVE_ALLOCATE_IRQ call
> > to return the 32-bit value 0xffffffff when OPAL has run out of IRQs.
> > Unfortunately, OPAL return values are signed 64-bit entities and
> > errors are supposed to be negative. If that happens, the Linux code
> > confusingly treats 0xffffffff as a valid IRQ number and panics at some
> > point.
> >
> > A fix was recently merged in skiboot:
> >
> > e97391ae2bb5 ("xive: fix return value of opal_xive_allocate_irq()")
> >
> > but we need a workaround anyway to support older skiboots already
> > on the field.
>   ^
>   in
>
> >
> > Internally convert 0xffffffff to OPAL_RESOURCE which is the usual error
> > returned upon resource exhaustion.
>
> This should go to stable, any idea what versions it should go back to?
> Probably whenever the xive code was introduced?
>

Yes I guess so. This would mean v4.12. I'll add the appropriate stable
tag before re-posting, and address all the other remarks of course.

> > Signed-off-by: Greg Kurz <[email protected]>
> > ---
> > arch/powerpc/sysdev/xive/native.c | 13 +++++++++++--
> > 1 file changed, 11 insertions(+), 2 deletions(-)
> >
> > diff --git a/arch/powerpc/sysdev/xive/native.c b/arch/powerpc/sysdev/xive/native.c
> > index 37987c815913..c35583f84f9f 100644
> > --- a/arch/powerpc/sysdev/xive/native.c
> > +++ b/arch/powerpc/sysdev/xive/native.c
> > @@ -231,6 +231,15 @@ static bool xive_native_match(struct device_node *node)
> >  	return of_device_is_compatible(node, "ibm,opal-xive-vc");
> >  }
> >
> > +static int64_t opal_xive_allocate_irq_fixup(uint32_t chip_id)
>          ^                                     ^
>          Can you use s64 here                  and u32 here ....
>
> Instead of calling this opal_xive_allocate_irq_fixup() and relying on
> all callers to call the fixup, can we rename the opal wrapper and leave
> this function's name unchanged, eg:
>
> -OPAL_CALL(opal_xive_allocate_irq, OPAL_XIVE_ALLOCATE_IRQ);
> +OPAL_CALL(opal_xive_allocate_irq_raw, OPAL_XIVE_ALLOCATE_IRQ);
>
>
> > +{
> > +	s64 irq = opal_xive_allocate_irq(chip_id);
> > +
> > +#define XIVE_ALLOC_NO_SPACE 0xffffffff /* No possible space */
> > +	return
> > +		irq == XIVE_ALLOC_NO_SPACE ? OPAL_RESOURCE : irq;
> > +}
>
> I don't really like the #define and the weird indenting and so on, can
> we instead do it like:
>
> 	/*
> 	 * Old versions of skiboot can incorrectly return 0xffffffff to
> 	 * indicate no space, fix it up here.
> 	 */
> 	return irq == 0xffffffff ? OPAL_RESOURCE : irq;
>
> cheers

2019-09-24 16:50:36

by Greg KH

Subject: Re: [PATCH] powerpc/xive: Fix bogus error code returned by OPAL

On Mon, Sep 23, 2019 at 08:29:40AM +0200, Greg Kurz wrote:
> There's a bug in skiboot that causes the OPAL_XIVE_ALLOCATE_IRQ call
> to return the 32-bit value 0xffffffff when OPAL has run out of IRQs.
> Unfortunately, OPAL return values are signed 64-bit entities and
> errors are supposed to be negative. If that happens, the Linux code
> confusingly treats 0xffffffff as a valid IRQ number and panics at some
> point.
>
> A fix was recently merged in skiboot:
>
> e97391ae2bb5 ("xive: fix return value of opal_xive_allocate_irq()")
>
> but we need a workaround anyway to support older skiboots already
> in the field.
>
> Internally convert 0xffffffff to OPAL_RESOURCE which is the usual error
> returned upon resource exhaustion.
>
> Cc: [email protected] # v4.12+
> Signed-off-by: Greg Kurz <[email protected]>
> Reviewed-by: Cédric Le Goater <[email protected]>
> Signed-off-by: Michael Ellerman <[email protected]>
> Link: https://lore.kernel.org/r/[email protected]
> (cherry picked from commit 6ccb4ac2bf8a35c694ead92f8ac5530a16e8f2c8,
> groug: fix arch/powerpc/platforms/powernv/opal-wrappers.S instead of
> non-existing arch/powerpc/platforms/powernv/opal-call.c)
> Signed-off-by: Greg Kurz <[email protected]>
> ---
>
> This is for 4.14 and 4.19.

Thanks for the backport, now queued up.

greg k-h