2017-08-03 01:38:30

by Sukadev Bhattiprolu

Subject: [PATCH] powerpc: xive: ensure active irqd when setting affinity

From fd0abf5c61b6041fdb75296e8580b86dc91d08d6 Mon Sep 17 00:00:00 2001
From: Benjamin Herrenschmidt <[email protected]>
Date: Tue, 1 Aug 2017 20:54:41 -0500
Subject: [PATCH] powerpc: xive: ensure active irqd when setting affinity

Ensure irqd is active before attempting to set affinity. This should
make the set affinity code more robust. For instance, this prevents
these messages seen on a 4.12 based kernel when taking cpus offline:

[ 123.053037264,3] XIVE[ IC 00 ] ISN 2 lead to invalid IVE !
[ 77.885859] xive: Error -6 reconfiguring irq 17
[ 77.885862] IRQ17: set affinity failed(-6).

The underlying problem with taking cpus offline was fixed in 4.13-rc1 by:

commit 91f26cb4cd3c ("genirq/cpuhotplug: Do not migrated shutdown irqs")

Signed-off-by: Sukadev Bhattiprolu <[email protected]>
Signed-off-by: Benjamin Herrenschmidt <[email protected]>
---
arch/powerpc/sysdev/xive/common.c | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/arch/powerpc/sysdev/xive/common.c b/arch/powerpc/sysdev/xive/common.c
index 6595462..2708d42 100644
--- a/arch/powerpc/sysdev/xive/common.c
+++ b/arch/powerpc/sysdev/xive/common.c
@@ -672,6 +672,10 @@ static int xive_irq_set_affinity(struct irq_data *d,
 	if (cpumask_any_and(cpumask, cpu_online_mask) >= nr_cpu_ids)
 		return -EINVAL;

+	/* Don't do anything if the interrupt isn't started */
+	if (!irqd_is_started(d))
+		return IRQ_SET_MASK_OK;
+
 	/*
 	 * If existing target is already in the new mask, and is
 	 * online then do nothing.
--
1.8.3.1
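
To make the ordering of the checks in the hunk above concrete, here is a minimal user-space sketch. It is not the kernel code: `struct model_irq`, `model_set_affinity()`, `first_cpu_in()` and the bitmask stand-in for a cpumask are all hypothetical names invented for illustration, and picking the lowest usable CPU as the new target is an assumption, not something the patch states.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in: the kernel's IRQ_SET_MASK_OK is also 0. */
enum { MODEL_IRQ_SET_MASK_OK = 0 };

/* Hypothetical model of the per-interrupt state the guard looks at. */
struct model_irq {
	bool started;      /* models irqd_is_started(d) */
	int  target;       /* current target CPU, -1 if none */
	int  reconfigured; /* how many times we "touched the hardware" */
};

/* CPU masks are modelled as plain bitmasks of up to 32 CPUs. */
static int first_cpu_in(unsigned int mask)
{
	for (int cpu = 0; cpu < 32; cpu++)
		if (mask & (1u << cpu))
			return cpu;
	return -1;
}

static int model_set_affinity(struct model_irq *irq,
			      unsigned int mask, unsigned int online)
{
	unsigned int usable = mask & online;

	/* No online CPU in the requested mask. */
	if (!usable)
		return -EINVAL;

	/* The check the patch adds: leave a non-started interrupt alone. */
	if (!irq->started)
		return MODEL_IRQ_SET_MASK_OK;

	/* Existing target is already in the new mask and online. */
	if (irq->target >= 0 && irq->target < 32 &&
	    (usable & (1u << irq->target)))
		return MODEL_IRQ_SET_MASK_OK;

	/* Assumption: retarget to the lowest usable CPU. */
	irq->target = first_cpu_in(usable);
	irq->reconfigured++;
	return MODEL_IRQ_SET_MASK_OK;
}
```

In this model, calling `model_set_affinity()` on an interrupt with `started == false` reports success without ever reaching the reconfiguration step, which is the behaviour the added check is meant to give.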


2017-08-08 10:40:37

by Michael Ellerman

Subject: Re: [PATCH] powerpc: xive: ensure active irqd when setting affinity

Sukadev Bhattiprolu <[email protected]> writes:

> From fd0abf5c61b6041fdb75296e8580b86dc91d08d6 Mon Sep 17 00:00:00 2001
> From: Benjamin Herrenschmidt <[email protected]>
> Date: Tue, 1 Aug 2017 20:54:41 -0500
> Subject: [PATCH] powerpc: xive: ensure active irqd when setting affinity
>
> Ensure irqd is active before attempting to set affinity. This should
> make the set affinity code more robust. For instance, this prevents
> these messages seen on a 4.12 based kernel when taking cpus offline:
>
> [ 123.053037264,3] XIVE[ IC 00 ] ISN 2 lead to invalid IVE !
> [ 77.885859] xive: Error -6 reconfiguring irq 17
> [ 77.885862] IRQ17: set affinity failed(-6).
>
> The underlying problem with taking cpus offline was fixed in 4.13-rc1 by:
>
> commit 91f26cb4cd3c ("genirq/cpuhotplug: Do not migrated shutdown irqs")

So do we still need this? Or is the above only a partial fix?

I'm a bit confused.

cheers

2017-08-09 00:00:42

by Sukadev Bhattiprolu

Subject: Re: [PATCH] powerpc: xive: ensure active irqd when setting affinity

Michael Ellerman [[email protected]] wrote:
> Sukadev Bhattiprolu <[email protected]> writes:
>
> > From fd0abf5c61b6041fdb75296e8580b86dc91d08d6 Mon Sep 17 00:00:00 2001
> > From: Benjamin Herrenschmidt <[email protected]>
> > Date: Tue, 1 Aug 2017 20:54:41 -0500
> > Subject: [PATCH] powerpc: xive: ensure active irqd when setting affinity
> >
> > Ensure irqd is active before attempting to set affinity. This should
> > make the set affinity code more robust. For instance, this prevents
> > these messages seen on a 4.12 based kernel when taking cpus offline:
> >
> > [ 123.053037264,3] XIVE[ IC 00 ] ISN 2 lead to invalid IVE !
> > [ 77.885859] xive: Error -6 reconfiguring irq 17
> > [ 77.885862] IRQ17: set affinity failed(-6).
> >
> > The underlying problem with taking cpus offline was fixed in 4.13-rc1 by:
> >
> > commit 91f26cb4cd3c ("genirq/cpuhotplug: Do not migrated shutdown irqs")
>
> So do we still need this? Or is the above only a partial fix?

It would be good to have this fix.

Commit 91f26cb4cd3c fixes the problem, so we won't see the errors with
that commit applied. But if such a problem were to show up again, xive
will handle it earlier, before hitting those errors.

Sukadev

>
> I'm a bit confused.
>
> cheers

2017-08-09 06:15:38

by Michael Ellerman

Subject: Re: [PATCH] powerpc: xive: ensure active irqd when setting affinity

Sukadev Bhattiprolu <[email protected]> writes:
> Michael Ellerman [[email protected]] wrote:
>> Sukadev Bhattiprolu <[email protected]> writes:
>> > From fd0abf5c61b6041fdb75296e8580b86dc91d08d6 Mon Sep 17 00:00:00 2001
>> > From: Benjamin Herrenschmidt <[email protected]>
>> > Date: Tue, 1 Aug 2017 20:54:41 -0500
>> > Subject: [PATCH] powerpc: xive: ensure active irqd when setting affinity
>> >
>> > Ensure irqd is active before attempting to set affinity. This should
>> > make the set affinity code more robust. For instance, this prevents
>> > these messages seen on a 4.12 based kernel when taking cpus offline:
>> >
>> > [ 123.053037264,3] XIVE[ IC 00 ] ISN 2 lead to invalid IVE !
>> > [ 77.885859] xive: Error -6 reconfiguring irq 17
>> > [ 77.885862] IRQ17: set affinity failed(-6).
>> >
>> > The underlying problem with taking cpus offline was fixed in 4.13-rc1 by:
>> >
>> > commit 91f26cb4cd3c ("genirq/cpuhotplug: Do not migrated shutdown irqs")
>>
>> So do we still need this? Or is the above only a partial fix?
>
> It would be good to have this fix.
>
> Commit 91f26cb4cd3c fixes the problem, so we won't see the errors with
> that commit applied. But if such a problem were to show up again, xive
> will handle it earlier, before hitting those errors.

I'm not sure I'm convinced. We can't handle every possible case of the
higher level code calling us in situations we don't expect.

For example irq_data could be NULL, but we trust the higher level code
not to do that to us.

Also I don't see any other driver doing this check.

$ git grep irqd_is_started
include/linux/irq.h:static inline bool irqd_is_started(struct irq_data *d)
kernel/irq/chip.c: if (irqd_is_started(d)) {
kernel/irq/chip.c: if (irqd_is_started(&desc->irq_data)) {
kernel/irq/cpuhotplug.c: if (irqd_is_per_cpu(d) || !irqd_is_started(d) || !irq_needs_fixup(d)) {


cheers
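
For reference, the core-side filter that appears in the `kernel/irq/cpuhotplug.c` line of the grep output can be modelled as below. This is only a sketch under stated assumptions: `struct model_irqd` and `model_should_migrate()` are hypothetical names, and the three booleans stand in for `irqd_is_per_cpu()`, `irqd_is_started()` and `irq_needs_fixup()` without reproducing what those helpers actually compute.

```c
#include <stdbool.h>

/* Hypothetical model of the three properties the hotplug code tests. */
struct model_irqd {
	bool per_cpu;     /* stands in for irqd_is_per_cpu(d) */
	bool started;     /* stands in for irqd_is_started(d) */
	bool needs_fixup; /* stands in for irq_needs_fixup(d) */
};

/*
 * Mirrors the shape of the condition in the grep output: an interrupt
 * is skipped if it is per-CPU, not started, or needs no fixup.
 */
static bool model_should_migrate(const struct model_irqd *d)
{
	if (d->per_cpu || !d->started || !d->needs_fixup)
		return false;
	return true;
}
```

With this filter in the core, a non-started interrupt never reaches the driver's set-affinity callback from the hotplug path, so a driver-level check would only matter for callers that do not go through it.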

2017-08-09 07:36:25

by Benjamin Herrenschmidt

Subject: Re: [PATCH] powerpc: xive: ensure active irqd when setting affinity

On Wed, 2017-08-09 at 16:15 +1000, Michael Ellerman wrote:
> I'm not sure I'm convinced. We can't handle every possible case of the
> higher level code calling us in situations we don't expect.
>
> For example irq_data could be NULL, but we trust the higher level code
> not to do that to us.
>
> Also I don't see any other driver doing this check.
>
> $ git grep irqd_is_started
> include/linux/irq.h:static inline bool irqd_is_started(struct irq_data *d)
> kernel/irq/chip.c: if (irqd_is_started(d)) {
> kernel/irq/chip.c: if (irqd_is_started(&desc->irq_data)) {
> kernel/irq/cpuhotplug.c: if (irqd_is_per_cpu(d) || !irqd_is_started(d) || !irq_needs_fixup(d)) {

irqd_is_started is brand new so you won't find any :-)

For most cases the problem is a non-issue. Due to how xive works, it's
more of a problem for us because a non-started interrupt has no
targeting information at all.

So this is *somewhat* related to xive internals and I'd rather have
that sanity check in there.

Cheers,
Ben.

2017-08-11 12:19:59

by Michael Ellerman

Subject: Re: powerpc: xive: ensure active irqd when setting affinity

On Thu, 2017-08-03 at 01:38:22 UTC, Sukadev Bhattiprolu wrote:
> From fd0abf5c61b6041fdb75296e8580b86dc91d08d6 Mon Sep 17 00:00:00 2001
> From: Benjamin Herrenschmidt <[email protected]>
> Date: Tue, 1 Aug 2017 20:54:41 -0500
> Subject: [PATCH] powerpc: xive: ensure active irqd when setting affinity
>
> Ensure irqd is active before attempting to set affinity. This should
> make the set affinity code more robust. For instance, this prevents
> these messages seen on a 4.12 based kernel when taking cpus offline:
>
> [ 123.053037264,3] XIVE[ IC 00 ] ISN 2 lead to invalid IVE !
> [ 77.885859] xive: Error -6 reconfiguring irq 17
> [ 77.885862] IRQ17: set affinity failed(-6).
>
> The underlying problem with taking cpus offline was fixed in 4.13-rc1 by:
>
> commit 91f26cb4cd3c ("genirq/cpuhotplug: Do not migrated shutdown irqs")
>
> Signed-off-by: Sukadev Bhattiprolu <[email protected]>
> Signed-off-by: Benjamin Herrenschmidt <[email protected]>

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/cffb717ceb8e2ca0316e89d908db54

cheers