2011-05-05 08:15:47

by Tian, Kevin

[permalink] [raw]
Subject: [PATCH] x86: skip migrating percpu irq in fixup_irqs

x86: skip migrating percpu irq in fixup_irqs

IRQF_PER_CPU is used by Xen for a set of virtual interrupts binding to
a specific vcpu, but it's not recognized in fixup_irqs which simply
atempts to migrate it to other vcpus. In most cases this just works,
because Xen event channel chip silently fails the set_affinity ops for
irqs marked as IRQF_PER_CPU. But there's a special category (like used
for pv spinlock) also adds a IRQ_DISABLED flag, used as polled only event
channels which don't expect any instance injected. However fixup_irqs
absolutely masks/unmasks irqs which makes such special type irq injected
unexpectely while there's no handler for it.

This error is triggered on some box when doing reboot. The fix is to
recognize IRQF_PER_CPU flag early before the affinity change, which
actually matches the rationale of IRQF_PER_CPU.

Signed-off-by: Fengzhe Zhang <[email protected]>
Signed-off-by: Kevin Tian <[email protected]>
CC: Thomas Gleixner <[email protected]>
CC: Ingo Molnar <[email protected]>
CC: H. Peter Anvin <[email protected]>
CC: Ian Campbell <[email protected]>
CC: Jan Beulich <[email protected]>

diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c
index 1cb0b9f..544efe2 100644
--- a/arch/x86/kernel/irq.c
+++ b/arch/x86/kernel/irq.c
@@ -249,7 +249,7 @@ void fixup_irqs(void)

data = irq_desc_get_irq_data(desc);
affinity = data->affinity;
- if (!irq_has_action(irq) ||
+ if (!irq_has_action(irq) || irqd_is_per_cpu(data) ||
cpumask_subset(affinity, cpu_online_mask)) {
raw_spin_unlock(&desc->lock);
continue;


2011-05-05 08:49:07

by Ian Campbell

[permalink] [raw]
Subject: Re: [PATCH] x86: skip migrating percpu irq in fixup_irqs

On Thu, 2011-05-05 at 09:15 +0100, Tian, Kevin wrote:
> x86: skip migrating percpu irq in fixup_irqs
>
> IRQF_PER_CPU is used by Xen for a set of virtual interrupts binding to
> a specific vcpu, but it's not recognized in fixup_irqs which simply
> atempts to migrate it to other vcpus. In most cases this just works,
> because Xen event channel chip silently fails the set_affinity ops for
> irqs marked as IRQF_PER_CPU. But there's a special category (like used
> for pv spinlock) also adds a IRQ_DISABLED flag, used as polled only event
> channels which don't expect any instance injected. However fixup_irqs
> absolutely masks/unmasks irqs which makes such special type irq injected
> unexpectely while there's no handler for it.
>
> This error is triggered on some box when doing reboot. The fix is to
> recognize IRQF_PER_CPU flag early before the affinity change, which
> actually matches the rationale of IRQF_PER_CPU.

Skipping affinity fixup for PER_CPU IRQs seems logical enough (so I
suspect this patch is good in its own right) but if the real issue you
are fixing is that IRQ_DISABLED IQRs are getting unmasked should that
issue be addressed directly rather than relying on it being a
side-effect of PER_CPU-ness?

Ian.

> Signed-off-by: Fengzhe Zhang <[email protected]>
> Signed-off-by: Kevin Tian <[email protected]>
> CC: Thomas Gleixner <[email protected]>
> CC: Ingo Molnar <[email protected]>
> CC: H. Peter Anvin <[email protected]>
> CC: Ian Campbell <[email protected]>
> CC: Jan Beulich <[email protected]>
>
> diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c
> index 1cb0b9f..544efe2 100644
> --- a/arch/x86/kernel/irq.c
> +++ b/arch/x86/kernel/irq.c
> @@ -249,7 +249,7 @@ void fixup_irqs(void)
>
> data = irq_desc_get_irq_data(desc);
> affinity = data->affinity;
> - if (!irq_has_action(irq) ||
> + if (!irq_has_action(irq) || irqd_is_per_cpu(data) ||
> cpumask_subset(affinity, cpu_online_mask)) {
> raw_spin_unlock(&desc->lock);
> continue;

2011-05-05 09:03:17

by Tian, Kevin

[permalink] [raw]
Subject: RE: [PATCH] x86: skip migrating percpu irq in fixup_irqs

> From: Ian Campbell [mailto:[email protected]]
> Sent: Thursday, May 05, 2011 4:49 PM
>
> On Thu, 2011-05-05 at 09:15 +0100, Tian, Kevin wrote:
> > x86: skip migrating percpu irq in fixup_irqs
> >
> > IRQF_PER_CPU is used by Xen for a set of virtual interrupts binding to
> > a specific vcpu, but it's not recognized in fixup_irqs which simply
> > atempts to migrate it to other vcpus. In most cases this just works,
> > because Xen event channel chip silently fails the set_affinity ops for
> > irqs marked as IRQF_PER_CPU. But there's a special category (like used
> > for pv spinlock) also adds a IRQ_DISABLED flag, used as polled only
> > event channels which don't expect any instance injected. However
> > fixup_irqs absolutely masks/unmasks irqs which makes such special type
> > irq injected unexpectely while there's no handler for it.
> >
> > This error is triggered on some box when doing reboot. The fix is to
> > recognize IRQF_PER_CPU flag early before the affinity change, which
> > actually matches the rationale of IRQF_PER_CPU.
>
> Skipping affinity fixup for PER_CPU IRQs seems logical enough (so I suspect this
> patch is good in its own right) but if the real issue you are fixing is that
> IRQ_DISABLED IQRs are getting unmasked should that issue be addressed
> directly rather than relying on it being a side-effect of PER_CPU-ness?

actually this is one thing I'm not sure and would like to hear more suggestions.
imo affinity and percpu are same type of attributes, which describe the
constraints to host a given irq, while disabled/masked are another type of
attributes which describe whether to accept an irq instance (mask is for
short period). even without the IRQ_DISABLED issue, this patch is also
required though it doesn't expose observable failure. I didn't bake the 2nd
patch to avoid unmask for IRQ_DISABLED, because I want to know whether
this is desired, e.g. IRQ_DISABLED is set in the fly when there can be still
one instance pending?

Thanks
Kevin

>
> > Signed-off-by: Fengzhe Zhang <[email protected]>
> > Signed-off-by: Kevin Tian <[email protected]>
> > CC: Thomas Gleixner <[email protected]>
> > CC: Ingo Molnar <[email protected]>
> > CC: H. Peter Anvin <[email protected]>
> > CC: Ian Campbell <[email protected]>
> > CC: Jan Beulich <[email protected]>
> >
> > diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c index
> > 1cb0b9f..544efe2 100644
> > --- a/arch/x86/kernel/irq.c
> > +++ b/arch/x86/kernel/irq.c
> > @@ -249,7 +249,7 @@ void fixup_irqs(void)
> >
> > data = irq_desc_get_irq_data(desc);
> > affinity = data->affinity;
> > - if (!irq_has_action(irq) ||
> > + if (!irq_has_action(irq) || irqd_is_per_cpu(data) ||
> > cpumask_subset(affinity, cpu_online_mask)) {
> > raw_spin_unlock(&desc->lock);
> > continue;
>

????{.n?+???????+%?????ݶ??w??{.n?+????{??G?????{ay?ʇڙ?,j??f???h?????????z_??(?階?ݢj"???m??????G????????????&???~???iO???z??v?^?m???? ????????I?

2011-05-05 09:34:06

by Thomas Gleixner

[permalink] [raw]
Subject: RE: [PATCH] x86: skip migrating percpu irq in fixup_irqs

On Thu, 5 May 2011, Tian, Kevin wrote:
> > From: Ian Campbell [mailto:[email protected]]
> > Sent: Thursday, May 05, 2011 4:49 PM
> >
> > On Thu, 2011-05-05 at 09:15 +0100, Tian, Kevin wrote:
> > > x86: skip migrating percpu irq in fixup_irqs
> > >
> > > IRQF_PER_CPU is used by Xen for a set of virtual interrupts binding to
> > > a specific vcpu, but it's not recognized in fixup_irqs which simply
> > > atempts to migrate it to other vcpus. In most cases this just works,
> > > because Xen event channel chip silently fails the set_affinity ops for
> > > irqs marked as IRQF_PER_CPU. But there's a special category (like used
> > > for pv spinlock) also adds a IRQ_DISABLED flag, used as polled only

I guess you mean IRQF_DISABLED, right?

> > > event channels which don't expect any instance injected. However
> > > fixup_irqs absolutely masks/unmasks irqs which makes such special type
> > > irq injected unexpectely while there's no handler for it.

IRQF_DISABLED has absolutely nothing to do with polling and
mask/unmask. IRQF_DISABLED is actually a NOOP and totally irrelevant.

> > > This error is triggered on some box when doing reboot. The fix is to
> > > recognize IRQF_PER_CPU flag early before the affinity change, which
> > > actually matches the rationale of IRQF_PER_CPU.
> >
> > Skipping affinity fixup for PER_CPU IRQs seems logical enough (so I suspect this
> > patch is good in its own right) but if the real issue you are fixing is that
> > IRQ_DISABLED IQRs are getting unmasked should that issue be addressed
> > directly rather than relying on it being a side-effect of PER_CPU-ness?
>
> actually this is one thing I'm not sure and would like to hear more suggestions.
> imo affinity and percpu are same type of attributes, which describe the
> constraints to host a given irq, while disabled/masked are another type of

They are similar, but percpu says explicitely: This interrupt can
never be moved away from the cpu it is bound to. Affinity ist just the
current binding which is allowed to be changed.

> attributes which describe whether to accept an irq instance (mask is for
> short period). even without the IRQ_DISABLED issue, this patch is also
> required though it doesn't expose observable failure. I didn't bake the 2nd
> patch to avoid unmask for IRQ_DISABLED, because I want to know whether
> this is desired, e.g. IRQ_DISABLED is set in the fly when there can be still
> one instance pending?

What you probably mean is the core internal disabled state of the
interrupt line. Yes, we should not unmask such interrupts in the fixup
move.

Thanks,

tglx

2011-05-06 01:12:47

by Tian, Kevin

[permalink] [raw]
Subject: RE: [PATCH] x86: skip migrating percpu irq in fixup_irqs

> From: Thomas Gleixner [mailto:[email protected]]
> Sent: Thursday, May 05, 2011 5:34 PM
>
> On Thu, 5 May 2011, Tian, Kevin wrote:
> > > From: Ian Campbell [mailto:[email protected]]
> > > Sent: Thursday, May 05, 2011 4:49 PM
> > >
> > > On Thu, 2011-05-05 at 09:15 +0100, Tian, Kevin wrote:
> > > > x86: skip migrating percpu irq in fixup_irqs
> > > >
> > > > IRQF_PER_CPU is used by Xen for a set of virtual interrupts
> > > > binding to a specific vcpu, but it's not recognized in fixup_irqs
> > > > which simply atempts to migrate it to other vcpus. In most cases
> > > > this just works, because Xen event channel chip silently fails the
> > > > set_affinity ops for irqs marked as IRQF_PER_CPU. But there's a
> > > > special category (like used for pv spinlock) also adds a
> > > > IRQ_DISABLED flag, used as polled only
>
> I guess you mean IRQF_DISABLED, right?

yes. :-)

>
> > > > event channels which don't expect any instance injected. However
> > > > fixup_irqs absolutely masks/unmasks irqs which makes such special
> > > > type irq injected unexpectely while there's no handler for it.
>
> IRQF_DISABLED has absolutely nothing to do with polling and mask/unmask.
> IRQF_DISABLED is actually a NOOP and totally irrelevant.

this is just due to the way how Xen pvops implements the event polling, with
corresponding irq marked as both IRQF_PER_CPU and IRQ_DISABLED. This
way when the event polled by the pvops guest comes, Xen simply resumes
the guest at next instruction after the poll hypercall but no need to make this
instance aware into the guest.

>
> > > > This error is triggered on some box when doing reboot. The fix is
> > > > to recognize IRQF_PER_CPU flag early before the affinity change,
> > > > which actually matches the rationale of IRQF_PER_CPU.
> > >
> > > Skipping affinity fixup for PER_CPU IRQs seems logical enough (so I
> > > suspect this patch is good in its own right) but if the real issue
> > > you are fixing is that IRQ_DISABLED IQRs are getting unmasked should
> > > that issue be addressed directly rather than relying on it being a side-effect
> of PER_CPU-ness?
> >
> > actually this is one thing I'm not sure and would like to hear more
> suggestions.
> > imo affinity and percpu are same type of attributes, which describe
> > the constraints to host a given irq, while disabled/masked are another
> > type of
>
> They are similar, but percpu says explicitely: This interrupt can never be moved
> away from the cpu it is bound to. Affinity ist just the current binding which is
> allowed to be changed.

yes.

>
> > attributes which describe whether to accept an irq instance (mask is
> > for short period). even without the IRQ_DISABLED issue, this patch is
> > also required though it doesn't expose observable failure. I didn't
> > bake the 2nd patch to avoid unmask for IRQ_DISABLED, because I want to
> > know whether this is desired, e.g. IRQ_DISABLED is set in the fly when
> > there can be still one instance pending?
>
> What you probably mean is the core internal disabled state of the interrupt line.
> Yes, we should not unmask such interrupts in the fixup move.
>

OK, I'll send an updated patch to cover both IRQF_PER_CPU and IRQF_DISABLED
case.

Thanks
Kevin