2015-04-03 00:50:25

by Jesse Brandeburg

Subject: [PATCH] irq: revert non-working patch to affinity defaults

I've seen a couple of reports of issues since commit e2e64a932556
("genirq: Set initial affinity in irq_set_affinity_hint()") where the
affinity programmed for an interrupt via /proc/irq/<nnn>/smp_affinity
does not stick: it changes back to some previous value at the next
interrupt on that IRQ.

The original intent was to fix the broken default behavior where all of
a device's IRQs start out on CPU0. With a network card with 64 or more
queues, all 64 queues' interrupt vectors end up on CPU0, which can have
bad side effects and has to be fixed by the irqbalance daemon, or by
the user at every boot with some kind of affinity script.

The symptom is that after a driver calls irq_set_affinity_hint(), the
affinity will be set for that interrupt (and is readable via /proc/...),
but on the first irq for that vector, the affinity resets back to the
default (CPU0 or CPU1). The rest of the irq affinities seem to work and
everything is fine.

Impact if we don't fix this for 4.0.0:
Some users won't be able to set irq affinity as expected, on
some cpus.

I've spent a chunk of time trying to debug this with no luck, and
suggest that we revert the change if no one else can help me debug what
is going wrong; we can pick up the change later.

This commit would also revert commit 4fe7ffb7e17ca ("genirq: Fix null
pointer reference in irq_set_affinity_hint()"), which was a bug fix to
the original patch.

Signed-off-by: Jesse Brandeburg <[email protected]>
CC: Ingo Molnar <[email protected]>
CC: Thomas Gleixner <[email protected]>
CC: John <[email protected]>
---
kernel/irq/manage.c | 3 ---
1 file changed, 3 deletions(-)

diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c
index 886d09e..c166a16 100644
--- a/kernel/irq/manage.c
+++ b/kernel/irq/manage.c
@@ -243,9 +243,6 @@ int irq_set_affinity_hint(unsigned int irq, const struct cpumask *m)
 		return -EINVAL;
 	desc->affinity_hint = m;
 	irq_put_desc_unlock(desc, flags);
-	/* set the initial affinity to prevent every interrupt being on CPU0 */
-	if (m)
-		__irq_set_affinity(irq, m, false);
 	return 0;
 }
 EXPORT_SYMBOL_GPL(irq_set_affinity_hint);


2015-04-03 06:56:07

by Ingo Molnar

Subject: Re: [PATCH] irq: revert non-working patch to affinity defaults


* Jesse Brandeburg <[email protected]> wrote:

> I've seen a couple of reports of issues since commit e2e64a932556
> ("genirq: Set initial affinity in irq_set_affinity_hint()") where the
> affinity programmed for an interrupt via /proc/irq/<nnn>/smp_affinity
> does not stick: it changes back to some previous value at the next
> interrupt on that IRQ.
>
> The original intent was to fix the broken default behavior where all of
> a device's IRQs start out on CPU0. With a network card with 64 or more
> queues, all 64 queues' interrupt vectors end up on CPU0, which can have
> bad side effects and has to be fixed by the irqbalance daemon, or by
> the user at every boot with some kind of affinity script.
>
> The symptom is that after a driver calls irq_set_affinity_hint(), the
> affinity will be set for that interrupt (and is readable via /proc/...),
> but on the first irq for that vector, the affinity resets back to the
> default (CPU0 or CPU1). The rest of the irq affinities seem to work and
> everything is fine.
>
> Impact if we don't fix this for 4.0.0:
> Some users won't be able to set irq affinity as expected, on
> some cpus.
>
> I've spent a chunk of time trying to debug this with no luck, and
> suggest that we revert the change if no one else can help me debug what
> is going wrong; we can pick up the change later.
>
> This commit would also revert commit 4fe7ffb7e17ca ("genirq: Fix null
> pointer reference in irq_set_affinity_hint()"), which was a bug fix to
> the original patch.

So the original commit also has the problem that it unnecessarily
drops/retakes the descriptor lock:

> irq_put_desc_unlock(desc, flags);
> - /* set the initial affinity to prevent every interrupt being on CPU0 */
> - if (m)
> - __irq_set_affinity(irq, m, false);


i.e. why not just call into irq_set_affinity_locked() while we still
have the descriptor locked?
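
Something like this, perhaps (untested sketch, just to illustrate the
idea; irq_set_affinity_locked() and the desc helpers are the existing
ones in kernel/irq/manage.c):

	int irq_set_affinity_hint(unsigned int irq, const struct cpumask *m)
	{
		unsigned long flags;
		struct irq_desc *desc = irq_get_desc_lock(irq, &flags,
							  IRQ_GET_DESC_CHECK_GLOBAL);

		if (!desc)
			return -EINVAL;
		desc->affinity_hint = m;
		/* apply the hint while we still hold the descriptor lock */
		if (m)
			irq_set_affinity_locked(irq_desc_get_irq_data(desc), m, false);
		irq_put_desc_unlock(desc, flags);
		return 0;
	}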

Now this is just a small annoyance that should not really matter - it
would be nice to figure out the real reason for why the irqs move back
to CPU#0.

In theory the same could happen to 'irqbalanced' as well, if it calls
shortly after an irq was registered - so this is not a bug we want to
ignore.

Also, worst case we are back to where v3.19 was, right? So could we
try to analyze this a bit more?

Thanks,

Ingo

2015-04-04 00:13:40

by Jesse Brandeburg

Subject: Re: [PATCH] irq: revert non-working patch to affinity defaults

On Fri, 3 Apr 2015 08:55:57 +0200
Ingo Molnar <[email protected]> wrote:
> So the original commit also has the problem that it unnecessarily
> drops/retakes the descriptor lock:
>
> > irq_put_desc_unlock(desc, flags);
> > - /* set the initial affinity to prevent every interrupt being on CPU0 */
> > - if (m)
> > - __irq_set_affinity(irq, m, false);
>
>
> i.e. why not just call into irq_set_affinity_locked() while we still
> have the descriptor locked?

I had tried that, but it didn't help much. I also tried kzalloc'ing a
new descriptor like the proc functionality does, and that changes the
behavior a little, but doesn't fix it AFAICS.
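
(For reference, the proc path I mean is write_irq_affinity() in
kernel/irq/proc.c, which as far as I recall parses the user string into
a freshly allocated cpumask and applies that, roughly:

	/* buffer/count come from the proc write, irq from the proc entry */
	cpumask_var_t new_value;

	if (!alloc_cpumask_var(&new_value, GFP_KERNEL))
		return -ENOMEM;

	err = cpumask_parse_user(buffer, count, new_value);
	if (!err)
		irq_set_affinity(irq, new_value);

	free_cpumask_var(new_value);

That allocation pattern is what I was trying to imitate.)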

> Now this is just a small annoyance that should not really matter - it
> would be nice to figure out the real reason for why the irqs move back
> to CPU#0.
>
> In theory the same could happen to 'irqbalanced' as well, if it calls
> shortly after an irq was registered - so this is not a bug we want to
> ignore.

Let me know if I can do something to help; the IRQ code is a bit of a
steep learning curve, so the chances of me fixing it are small.

> Also, worst case we are back to where v3.19 was, right? So could we
> try to analyze this a bit more?

Yes, 3.19 shipped with this issue. Again, let me know if I can help.

Thanks,
Jesse

2015-04-04 09:34:08

by Ingo Molnar

Subject: Re: [PATCH] irq: revert non-working patch to affinity defaults


* Jesse Brandeburg <[email protected]> wrote:

> > Now this is just a small annoyance that should not really matter -
> > it would be nice to figure out the real reason for why the irqs
> > move back to CPU#0.
> >
> > In theory the same could happen to 'irqbalanced' as well, if it
> > calls shortly after an irq was registered - so this is not a bug
> > we want to ignore.
>
> Let me know if I can do something to help; the IRQ code is a bit of
> a steep learning curve, so the chances of me fixing it are small.

Well, as a starter, if you can reproduce it on a system (I cannot),
then try to stick a few printks in there to print out the affinity
mask as it gets changed, plus a dump_stack(), and see who changes it
back?

Chances are it's irqbalanced? If not, the stack dump will tell. It
shouldn't be too chatty.

(trace_printk() if you prefer traces.)
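
E.g. something like this, untested, at the top of irq_do_set_affinity()
in kernel/irq/manage.c ('data' and 'mask' below are that function's
existing arguments):

	/* debug: who is changing this IRQ's affinity, and to what? */
	pr_info("irq %u: affinity request, first cpu in mask: %u\n",
		data->irq, cpumask_first(mask));
	dump_stack();

If the mask gets rewritten by some arch path that doesn't go through
irq_do_set_affinity(), the printk won't fire for the bad change - which
would itself be a useful hint.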

Thanks,

Ingo

2015-04-04 10:00:46

by John

Subject: Re: [PATCH] irq: revert non-working patch to affinity defaults

On 4/4/2015 2:34 AM, Ingo Molnar wrote:
> Well, as a starter, if you can reproduce it on a system (I cannot),

How many cores does your current test system have, Ingo? I remember
that I did not see this on a machine with 24 cores, but did see it on
one with 36. My initial thought was that it might be related to the
fact that 24 cores can be described with a traditional 32-bit mask, but
I wasn't able to find any place where a larger mask would be handled
incorrectly.
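
(For what it's worth, the affinity is carried in a struct cpumask,
which is just a bitmap sized by NR_CPUS rather than a single 32-bit
word, roughly:

	typedef struct cpumask { DECLARE_BITMAP(bits, NR_CPUS); } cpumask_t;

so masks wider than 32 CPUs are stored as additional words and should
go through the same code paths.)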

In my case, it was happening with ixgbe and i40e cards.

-John