2015-08-29 13:02:10

by Yang Yingliang

Subject: [PATCH] arm64: fix a migrating irq bug when hotplug cpu

From: Yang Yingliang <[email protected]>

When a CPU is disabled, all IRQs are migrated to another CPU.
In some cases the new affinity differs from the old one and needs to be
copied to the IRQ's affinity mask. But if the IRQ is an LPI, its affinity
is not copied because of irq_set_affinity's return value.
So copy the affinity when the return value is IRQ_SET_MASK_OK_DONE too.

Cc: Jiang Liu <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Marc Zyngier <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Will Deacon <[email protected]>
---
arch/arm64/kernel/irq.c | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/kernel/irq.c b/arch/arm64/kernel/irq.c
index 463fa2e..2acc8ec 100644
--- a/arch/arm64/kernel/irq.c
+++ b/arch/arm64/kernel/irq.c
@@ -78,10 +78,13 @@ static bool migrate_one_irq(struct irq_desc *desc)
 	}

 	c = irq_data_get_irq_chip(d);
-	if (!c->irq_set_affinity)
+	if (!c->irq_set_affinity) {
 		pr_debug("IRQ%u: unable to set affinity\n", d->irq);
-	else if (c->irq_set_affinity(d, affinity, false) == IRQ_SET_MASK_OK && ret)
-		cpumask_copy(irq_data_get_affinity_mask(d), affinity);
+	} else {
+		int r = c->irq_set_affinity(d, affinity, false);
+		if ((r == IRQ_SET_MASK_OK || r == IRQ_SET_MASK_OK_DONE) && ret)
+			cpumask_copy(irq_data_get_affinity_mask(d), affinity);
+	}

 	return ret;
 }
--
2.5.0


2015-08-29 15:12:48

by Jiang Liu

Subject: Re: [PATCH] arm64: fix a migrating irq bug when hotplug cpu

On 2015/8/29 21:00, Yang Yingliang wrote:
> From: Yang Yingliang <[email protected]>
>
> When a CPU is disabled, all IRQs are migrated to another CPU.
> In some cases the new affinity differs from the old one and needs to be
> copied to the IRQ's affinity mask. But if the IRQ is an LPI, its affinity
> is not copied because of irq_set_affinity's return value.
> So copy the affinity when the return value is IRQ_SET_MASK_OK_DONE too.
Hi Yingliang,
If irq_set_affinity callback returns IRQ_SET_MASK_OK_DONE,
it means that irq_set_affinity has copied the new CPU mask to irq
affinity mask. It would be better to change irq_set_affinity for LPI
to follow this rule.
Thanks!
Gerry
>
> Cc: Jiang Liu <[email protected]>
> Cc: Thomas Gleixner <[email protected]>
> Cc: Marc Zyngier <[email protected]>
> Cc: Mark Rutland <[email protected]>
> Cc: Will Deacon <[email protected]>
> ---
> arch/arm64/kernel/irq.c | 9 ++++++---
> 1 file changed, 6 insertions(+), 3 deletions(-)
>
> diff --git a/arch/arm64/kernel/irq.c b/arch/arm64/kernel/irq.c
> index 463fa2e..2acc8ec 100644
> --- a/arch/arm64/kernel/irq.c
> +++ b/arch/arm64/kernel/irq.c
> @@ -78,10 +78,13 @@ static bool migrate_one_irq(struct irq_desc *desc)
>  	}
>
>  	c = irq_data_get_irq_chip(d);
> -	if (!c->irq_set_affinity)
> +	if (!c->irq_set_affinity) {
>  		pr_debug("IRQ%u: unable to set affinity\n", d->irq);
> -	else if (c->irq_set_affinity(d, affinity, false) == IRQ_SET_MASK_OK && ret)
> -		cpumask_copy(irq_data_get_affinity_mask(d), affinity);
> +	} else {
> +		int r = c->irq_set_affinity(d, affinity, false);
> +		if ((r == IRQ_SET_MASK_OK || r == IRQ_SET_MASK_OK_DONE) && ret)
> +			cpumask_copy(irq_data_get_affinity_mask(d), affinity);
> +	}
>
>  	return ret;
>  }

2015-08-29 18:14:45

by Marc Zyngier

Subject: Re: [PATCH] arm64: fix a migrating irq bug when hotplug cpu

On 2015-08-29 16:12, Jiang Liu wrote:
> On 2015/8/29 21:00, Yang Yingliang wrote:
>> From: Yang Yingliang <[email protected]>
>>
>> When a CPU is disabled, all IRQs are migrated to another CPU.
>> In some cases the new affinity differs from the old one and needs to be
>> copied to the IRQ's affinity mask. But if the IRQ is an LPI, its affinity
>> is not copied because of irq_set_affinity's return value.
>> So copy the affinity when the return value is IRQ_SET_MASK_OK_DONE too.
> Hi Yingliang,
> If irq_set_affinity callback returns IRQ_SET_MASK_OK_DONE,
> it means that irq_set_affinity has copied the new CPU mask to irq
> affinity mask. It would be better to change irq_set_affinity for LPI
> to follow this rule.

The main issue here seems to be that we do not call irq_set_affinity, but
that we directly call into the top-level irqchip method, which relies on
the core code to do the copy (see irq_do_set_affinity). Too bad.

It feels like the arm/arm64 code would probably be better consolidated into
kernel/irq/migration.c, which already deals with some of this for x86
and ia64. It would save us the duplication and will make sure we don't
miss things next time we add a new return code, as irq_do_set_affinity
would handle this properly.
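
To make the difference between the two paths concrete, here is a small
userspace model, not kernel code. It assumes the core helper copies the
mask for both IRQ_SET_MASK_OK and IRQ_SET_MASK_OK_DONE and skips the copy
for IRQ_SET_MASK_OK_NOCOPY. Every mock_* name and the bitmask type are
hypothetical stand-ins invented for this sketch.

```c
#include <stdbool.h>

/* Return codes named after the kernel's; the values are local to this model. */
enum {
	IRQ_SET_MASK_OK = 0,
	IRQ_SET_MASK_OK_NOCOPY,
	IRQ_SET_MASK_OK_DONE,
};

/* Hypothetical stand-in for a cpumask: one bit per CPU. */
typedef unsigned long mock_cpumask;

/* Hypothetical stand-in for the irq_data/irq_chip pair. */
struct mock_irq {
	mock_cpumask affinity;	/* the recorded affinity mask */
	int (*set_affinity)(struct mock_irq *irq, mock_cpumask mask, bool force);
};

/*
 * Hypothetical chip callback modelled on the LPI case from the report:
 * it returns IRQ_SET_MASK_OK_DONE and does not update irq->affinity.
 */
static int mock_lpi_set_affinity(struct mock_irq *irq, mock_cpumask mask,
				 bool force)
{
	(void)irq;
	(void)mask;
	(void)force;
	return IRQ_SET_MASK_OK_DONE;
}

/* Arch-style path before the patch: only IRQ_SET_MASK_OK triggers the copy. */
static void mock_arch_migrate(struct mock_irq *irq, mock_cpumask mask)
{
	if (irq->set_affinity(irq, mask, false) == IRQ_SET_MASK_OK)
		irq->affinity = mask;
}

/* Core-style path: one switch handles every success code in one place. */
static int mock_core_set_affinity(struct mock_irq *irq, mock_cpumask mask)
{
	int ret = irq->set_affinity(irq, mask, false);

	switch (ret) {
	case IRQ_SET_MASK_OK:
	case IRQ_SET_MASK_OK_DONE:
		irq->affinity = mask;	/* the caller-side copy */
		ret = 0;
		break;
	case IRQ_SET_MASK_OK_NOCOPY:
		ret = 0;		/* success, mask handled elsewhere */
		break;
	}
	return ret;
}
```

With a callback that returns IRQ_SET_MASK_OK_DONE, the arch-style path
leaves the recorded mask stale while the core-style path updates it. A new
return code would then only need to be taught to the one switch.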

Thoughts?

M.
--
Fast, cheap, reliable. Pick two.

2015-08-30 13:16:07

by Hanjun Guo

Subject: Re: [PATCH] arm64: fix a migrating irq bug when hotplug cpu

On 08/30/2015 02:12 AM, Marc Zyngier wrote:
> On 2015-08-29 16:12, Jiang Liu wrote:
>> On 2015/8/29 21:00, Yang Yingliang wrote:
>>> From: Yang Yingliang <[email protected]>
>>>
>>> When a CPU is disabled, all IRQs are migrated to another CPU.
>>> In some cases the new affinity differs from the old one and needs to be
>>> copied to the IRQ's affinity mask. But if the IRQ is an LPI, its affinity
>>> is not copied because of irq_set_affinity's return value.
>>> So copy the affinity when the return value is IRQ_SET_MASK_OK_DONE too.
>> Hi Yingliang,
>> If irq_set_affinity callback returns IRQ_SET_MASK_OK_DONE,
>> it means that irq_set_affinity has copied the new CPU mask to irq
>> affinity mask. It would be better to change irq_set_affinity for LPI
>> to follow this rule.
>
> The main issue here seems to be that we do not call irq_set_affinity, but
> that we directly call into the top-level irqchip method, which relies on
> the core code to do the copy (see irq_do_set_affinity). Too bad.
>
> It feels like the arm/arm64 code would probably be better consolidated into
> kernel/irq/migration.c, which already deals with some of this for x86
> and ia64. It would save us the duplication and will make sure we don't
> miss things next time we add a new return code, as irq_do_set_affinity
> would handle this properly.
>
> Thoughts?

I agree. The IRQ migration code in arch/arm64/kernel/irq.c is the same
as on ARM32, so it is duplicated. But this is a bugfix; can we fix it in
a simple way and refactor the code later?

Thanks
Hanjun

2015-08-31 12:20:44

by Marc Zyngier

Subject: Re: [PATCH] arm64: fix a migrating irq bug when hotplug cpu

On Sun, 30 Aug 2015 21:15:56 +0800
Hanjun Guo <[email protected]> wrote:

> On 08/30/2015 02:12 AM, Marc Zyngier wrote:
> > On 2015-08-29 16:12, Jiang Liu wrote:
> >> On 2015/8/29 21:00, Yang Yingliang wrote:
> >>> From: Yang Yingliang <[email protected]>
> >>>
> >>> When a CPU is disabled, all IRQs are migrated to another CPU.
> >>> In some cases the new affinity differs from the old one and needs to be
> >>> copied to the IRQ's affinity mask. But if the IRQ is an LPI, its affinity
> >>> is not copied because of irq_set_affinity's return value.
> >>> So copy the affinity when the return value is IRQ_SET_MASK_OK_DONE too.
> >> Hi Yingliang,
> >> If irq_set_affinity callback returns IRQ_SET_MASK_OK_DONE,
> >> it means that irq_set_affinity has copied the new CPU mask to irq
> >> affinity mask. It would be better to change irq_set_affinity for LPI
> >> to follow this rule.
> >
> > The main issue here seems to be that we do not call irq_set_affinity, but
> > that we directly call into the top-level irqchip method, which relies on
> > the core code to do the copy (see irq_do_set_affinity). Too bad.
> >
> > It feels like the arm/arm64 code would probably be better consolidated into
> > kernel/irq/migration.c, which already deals with some of this for x86
> > and ia64. It would save us the duplication and will make sure we don't
> > miss things next time we add a new return code, as irq_do_set_affinity
> > would handle this properly.
> >
> > Thoughts?
>
> I agree. The IRQ migration code in arch/arm64/kernel/irq.c is the same
> as on ARM32, so it is duplicated. But this is a bugfix; can we fix it in
> a simple way and refactor the code later?

I'm not buying this.

I really can't see how adding more duplication can be beneficial. It is
not so much that there is duplication between arm and arm64 that
bothers me (as if that was the only thing...). The real issue is that
there is duplication between the arch code and the core code.

Migrating interrupts is a core code matter, and that's where it should
be handled IMHO. Plus, we're in the merge window, and there is plenty
of time to get this fixed the proper way.

Thanks,

M.
--
Jazz is not dead. It just smells funny.