2011-05-06 06:45:18

by Tian, Kevin

[permalink] [raw]
Subject: [PATCH v2 2/2] x86: don't unmask disabled irqs when migrating them

x86: don't unmask disabled irqs when migrating them

it doesn't make sense to mask/unmask a disabled irq when migrating it
from offlined cpu to another, because it's not expected to handle
any instance of it. Current mask/set_affinity/unmask steps may trigger
unexpected instance on disabled irq which then simply bug on when there
is no handler for it. One failing example is observed in Xen. Xen pvops
guest marks a special type of irqs as disabled, which are simply used
as a notification mechanism to wake up blocked guest in event polling
state. Absolutely unmask them may cause the notification instance
instead injected into the guest and then cause trouble.

Signed-off-by: Fengzhe Zhang <[email protected]>
Signed-off-by: Kevin Tian <[email protected]>
CC: Thomas Gleixner <[email protected]>
CC: Ingo Molnar <[email protected]>
CC: H. Peter Anvin <[email protected]>
CC: Ian Campbell <[email protected]>
CC: Jan Beulich <[email protected]>

--- linux-2.6.39-rc6.orig/arch/x86/kernel/irq.c 2011-05-04 10:59:13.000000000 +0800
+++ linux-2.6.39-rc6/arch/x86/kernel/irq.c 2011-05-06 09:20:25.563963000 +0800
@@ -237,6 +237,7 @@ void fixup_irqs(void)
for_each_irq_desc(irq, desc) {
int break_affinity = 0;
int set_affinity = 1;
+ int do_mask;
const struct cpumask *affinity;

if (!desc)
@@ -268,7 +269,9 @@ void fixup_irqs(void)
}

chip = irq_data_get_irq_chip(data);
- if (!irqd_can_move_in_process_context(data) && chip->irq_mask)
+ do_mask = !irqd_irq_disabled(data) &&
+ !irqd_can_move_in_process_context(data) && chip->irq_mask;
+ if (do_mask)
chip->irq_mask(data);

if (chip->irq_set_affinity)
@@ -276,7 +279,7 @@ void fixup_irqs(void)
else if (!(warned++))
set_affinity = 0;

- if (!irqd_can_move_in_process_context(data) && chip->irq_unmask)
+ if (do_mask)
chip->irq_unmask(data);

raw_spin_unlock(&desc->lock);


2011-05-06 10:00:06

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [PATCH v2 2/2] x86: don't unmask disabled irqs when migrating them

On Fri, 6 May 2011, Tian, Kevin wrote:
> x86: don't unmask disabled irqs when migrating them
>
> it doesn't make sense to mask/unmask a disabled irq when migrating it
> from offlined cpu to another, because it's not expected to handle
> any instance of it. Current mask/set_affinity/unmask steps may trigger
> unexpected instance on disabled irq which then simply bug on when there
> is no handler for it. One failing example is observed in Xen. Xen pvops

So there is no handler, why the heck is there an irq action?

if (!irq_has_action(irq) ....
continue;

Should have caught an uninitialized interrupt. If Xen abuses
interrupts that way, then it rightfully explodes. And we do not fix it
by magic somewhere else.

> guest marks a special type of irqs as disabled, which are simply used

As I explained before several times, IRQF_DISABLED has absolutely
nothing to do with it and pvops _CANNOT_ mark an interrupt disabled.

>
> chip = irq_data_get_irq_chip(data);
> - if (!irqd_can_move_in_process_context(data) && chip->irq_mask)
> + do_mask = !irqd_irq_disabled(data) &&
> + !irqd_can_move_in_process_context(data) && chip->irq_mask;
> + if (do_mask)
> chip->irq_mask(data);

This is completely wrong. irqd_irq_disabled() is a status information
which does not tell you whether the interrupt is actually masked at
the hardware level because we do lazy interrupt hardware masking. So
your change would keep the line unmasked at the hardware level for all
interrupts which are in the lazy disabled state.

The only conditional which is interesting is the unmask path and
that's a simple optimization and not a correctness problem.

Thanks,

tglx

2011-05-06 12:54:47

by Tian, Kevin

[permalink] [raw]
Subject: RE: [PATCH v2 2/2] x86: don't unmask disabled irqs when migrating them

> From: Thomas Gleixner
> Sent: Friday, May 06, 2011 6:00 PM
>
> On Fri, 6 May 2011, Tian, Kevin wrote:
> > x86: don't unmask disabled irqs when migrating them
> >
> > it doesn't make sense to mask/unmask a disabled irq when migrating it
> > from offlined cpu to another, because it's not expected to handle any
> > instance of it. Current mask/set_affinity/unmask steps may trigger
> > unexpected instance on disabled irq which then simply bug on when
> > there is no handler for it. One failing example is observed in Xen.
> > Xen pvops
>
> So there is no handler, why the heck is there an irq action?
>
> if (!irq_has_action(irq) ....
> continue;
>
> Should have caught an uninitialized interrupt. If Xen abuses interrupts that way,
> then it rightfully explodes. And we do not fix it by magic somewhere else.

sorry that my bad description here. there does be a dummy handler registered
on such irqs which simply throws out a BUG_ON when hit. I should just say such
injection is not expected instead of no handler. :-)

>
> > guest marks a special type of irqs as disabled, which are simply used
>
> As I explained before several times, IRQF_DISABLED has absolutely nothing to
> do with it and pvops _CANNOT_ mark an interrupt disabled.

I have to admit that I need more study about whole interrupt sub-system, to better
understand your explanation here. Also here again my description is not accurate
enough. I meant that Xen pvops request the special irq with below flags:
IRQF_DISABLED|IRQF_PERCPU|IRQF_NOBALANCING
and then later explicitly disable it with disable_irq(). As you said that IRQF_DISABLED
itself has nothing to do with it, and it's the later disable_irq() which takes real
effect because Xen event chip hooks this callback to mask the irq from the chip level.

>
> >
> > chip = irq_data_get_irq_chip(data);
> > - if (!irqd_can_move_in_process_context(data) && chip->irq_mask)
> > + do_mask = !irqd_irq_disabled(data) &&
> > + !irqd_can_move_in_process_context(data) && chip->irq_mask;
> > + if (do_mask)
> > chip->irq_mask(data);
>
> This is completely wrong. irqd_irq_disabled() is a status information which does
> not tell you whether the interrupt is actually masked at the hardware level
> because we do lazy interrupt hardware masking. So your change would keep
> the line unmasked at the hardware level for all interrupts which are in the lazy
> disabled state.

Got it.

>
> The only conditional which is interesting is the unmask path and that's a simple
> optimization and not a correctness problem.
>

So what's your suggestion based on my updated information? Is there any
interface I may take to differentiate above exception with normal case? Basically
in Xen usage we want such irqs permanently disabled at the chip level. Or
could we only do mask/unmask for irqs which are unmasked atm if as you said
it's just an optimization step? :-)


Thanks
Kevin

2011-05-06 13:15:02

by Tian, Kevin

[permalink] [raw]
Subject: RE: [Xen-devel] RE: [PATCH v2 2/2] x86: don't unmask disabled irqs when migrating them

> From: Tian, Kevin
> Sent: Friday, May 06, 2011 8:55 PM
>
> > From: Thomas Gleixner
> > Sent: Friday, May 06, 2011 6:00 PM
> >
> > On Fri, 6 May 2011, Tian, Kevin wrote:
> > > x86: don't unmask disabled irqs when migrating them
> > >
> > > it doesn't make sense to mask/unmask a disabled irq when migrating
> > > it from offlined cpu to another, because it's not expected to handle
> > > any instance of it. Current mask/set_affinity/unmask steps may
> > > trigger unexpected instance on disabled irq which then simply bug on
> > > when there is no handler for it. One failing example is observed in Xen.
> > > Xen pvops
> >
> > So there is no handler, why the heck is there an irq action?
> >
> > if (!irq_has_action(irq) ....
> > continue;
> >
> > Should have caught an uninitialized interrupt. If Xen abuses
> > interrupts that way, then it rightfully explodes. And we do not fix it by magic
> somewhere else.
>
> sorry that my bad description here. there does be a dummy handler registered
> on such irqs which simply throws out a BUG_ON when hit. I should just say such
> injection is not expected instead of no handler. :-)
>
> >
> > > guest marks a special type of irqs as disabled, which are simply
> > > used
> >
> > As I explained before several times, IRQF_DISABLED has absolutely
> > nothing to do with it and pvops _CANNOT_ mark an interrupt disabled.
>
> I have to admit that I need more study about whole interrupt sub-system, to
> better understand your explanation here. Also here again my description is not
> accurate enough. I meant that Xen pvops request the special irq with below
> flags:
> IRQF_DISABLED|IRQF_PERCPU|IRQF_NOBALANCING
> and then later explicitly disable it with disable_irq(). As you said that
> IRQF_DISABLED itself has nothing to do with it, and it's the later disable_irq()
> which takes real effect because Xen event chip hooks this callback to mask the
> irq from the chip level.
>
> >
> > >
> > > chip = irq_data_get_irq_chip(data);
> > > - if (!irqd_can_move_in_process_context(data) && chip->irq_mask)
> > > + do_mask = !irqd_irq_disabled(data) &&
> > > + !irqd_can_move_in_process_context(data) &&
> chip->irq_mask;
> > > + if (do_mask)
> > > chip->irq_mask(data);
> >
> > This is completely wrong. irqd_irq_disabled() is a status information
> > which does not tell you whether the interrupt is actually masked at
> > the hardware level because we do lazy interrupt hardware masking. So
> > your change would keep the line unmasked at the hardware level for all
> > interrupts which are in the lazy disabled state.
>
> Got it.
>
> >
> > The only conditional which is interesting is the unmask path and
> > that's a simple optimization and not a correctness problem.
> >
>
> So what's your suggestion based on my updated information? Is there any
> interface I may take to differentiate above exception with normal case?
> Basically in Xen usage we want such irqs permanently disabled at the chip level.
> Or could we only do mask/unmask for irqs which are unmasked atm if as you
> said it's just an optimization step? :-)
>

After another thought, I just got myself messed. This is not required by Xen
actually, as the 1st patch to handle IRQF_PERCPU has been enough to cover
the special requirement. Could you ack the 1st one if you agree?

This 2nd patch is cooked upon the question whether generally a irq disabled at
chip level should be unmasked at migration. If there's no action registered,
it won't be migrated. The only question is if there does be action registered,
and this irq is disabled at chip level, will there be any issue to force unmask it
at migration, or should we only mask/unmask for a irq which is unmasked at
the migration?

Thanks
Kevin

2011-05-06 13:24:13

by Thomas Gleixner

[permalink] [raw]
Subject: RE: [PATCH v2 2/2] x86: don't unmask disabled irqs when migrating them

On Fri, 6 May 2011, Tian, Kevin wrote:
> > From: Thomas Gleixner
> > Sent: Friday, May 06, 2011 6:00 PM
> >
> > On Fri, 6 May 2011, Tian, Kevin wrote:
> > > x86: don't unmask disabled irqs when migrating them
> > >
> > > it doesn't make sense to mask/unmask a disabled irq when migrating it
> > > from offlined cpu to another, because it's not expected to handle any
> > > instance of it. Current mask/set_affinity/unmask steps may trigger
> > > unexpected instance on disabled irq which then simply bug on when
> > > there is no handler for it. One failing example is observed in Xen.
> > > Xen pvops
> >
> > So there is no handler, why the heck is there an irq action?
> >
> > if (!irq_has_action(irq) ....
> > continue;
> >
> > Should have caught an uninitialized interrupt. If Xen abuses interrupts that way,
> > then it rightfully explodes. And we do not fix it by magic somewhere else.
>
> sorry that my bad description here. there does be a dummy handler registered
> on such irqs which simply throws out a BUG_ON when hit. I should just say such
> injection is not expected instead of no handler. :-)

So can please someone point me to that particular incarnation of
nonsense and provide a reasonable explanation for this abuse?

What is the point of an interrupt, which is permanently disabled, has
a handler with a BUG() inside and an irqaction assigned ?

What's the purpose of this? Why is the irqaction there in the first
place? To be called by some other weird means than by the irq
handling code?

> > The only conditional which is interesting is the unmask path and that's a simple
> > optimization and not a correctness problem.
> >
>
> So what's your suggestion based on my updated information? Is there any
> interface I may take to differentiate above exception with normal case? Basically
> in Xen usage we want such irqs permanently disabled at the chip level. Or
> could we only do mask/unmask for irqs which are unmasked atm if as you said
> it's just an optimization step? :-)

No we can make the unmask conditional on !irqd_irq_disabled() because
that's not violating any of the semantics. The interrupt would be
masked anyway when it arrives and the handler code sees that it is
lazy disabled. I mean real handler code, not the Xen abomination.

The only valid reason why I'd apply that patch is that it avoids a
potential extra interrupt, but not to prevent screwed up handlers from
exploding.

Thanks,

tglx

2011-05-06 14:05:00

by Ian Campbell

[permalink] [raw]
Subject: RE: [PATCH v2 2/2] x86: don't unmask disabled irqs when migrating them

On Fri, 2011-05-06 at 14:24 +0100, Thomas Gleixner wrote:
> On Fri, 6 May 2011, Tian, Kevin wrote:
> > > From: Thomas Gleixner
> > > Sent: Friday, May 06, 2011 6:00 PM
> > >
> > > On Fri, 6 May 2011, Tian, Kevin wrote:
> > > > x86: don't unmask disabled irqs when migrating them
> > > >
> > > > it doesn't make sense to mask/unmask a disabled irq when migrating it
> > > > from offlined cpu to another, because it's not expected to handle any
> > > > instance of it. Current mask/set_affinity/unmask steps may trigger
> > > > unexpected instance on disabled irq which then simply bug on when
> > > > there is no handler for it. One failing example is observed in Xen.
> > > > Xen pvops
> > >
> > > So there is no handler, why the heck is there an irq action?
> > >
> > > if (!irq_has_action(irq) ....
> > > continue;
> > >
> > > Should have caught an uninitialized interrupt. If Xen abuses interrupts that way,
> > > then it rightfully explodes. And we do not fix it by magic somewhere else.
> >
> > sorry that my bad description here. there does be a dummy handler registered
> > on such irqs which simply throws out a BUG_ON when hit. I should just say such
> > injection is not expected instead of no handler. :-)
>
> So can please someone point me to that particular incarnation of
> nonsense and provide a reasonable explanation for this abuse?
>
> What is the point of an interrupt, which is permanently disabled, has
> a handler with a BUG() inside and an irqaction assigned ?
>
> What's the purpose of this? Why is the irqaction there in the first
> place? To be called by some other weird means than by the irq
> handling code?

The Xen PV spinlock code (arch/x86/xen/spinlock.c) allocates an IRQ
(per-cpu lock_kicker_irq). I think it is there purely in order to have
the associated underlying evtchn to use as the thing to poll (Xen has an
evtchn poll hypercall, see xen_poll_irq()) on the slow path and kick on
release. There is never any need to call a handler for that evtchn --
just notifying the evtchn is enough to wake the poller.

The irq is setup using request_irq(). Is there a different API to
register an IRQ without attaching a handler/action to it? (I can't think
why such a thing would exist).

I'm not really sure why these can't just be an evtchn without an
associated IRQ since it doesn't really have any interrupt-like
semantics. Perhaps just a general desire to keep event channels
abstracted into the core Xen event subsystem with IRQs as the public
facing API? Jeremy?

Ian.

> > > The only conditional which is interesting is the unmask path and that's a simple
> > > optimization and not a correctness problem.
> > >
> >
> > So what's your suggestion based on my updated information? Is there any
> > interface I may take to differentiate above exception with normal case? Basically
> > in Xen usage we want such irqs permanently disabled at the chip level. Or
> > could we only do mask/unmask for irqs which are unmasked atm if as you said
> > it's just an optimization step? :-)
>
> No we can make the unmask conditional on !irqd_irq_disabled() because
> that's not violating any of the semantics. The interrupt would be
> masked anyway when it arrives and the handler code sees that it is
> lazy disabled. I mean real handler code, not the Xen abomination.
>
> The only valid reason why I'd apply that patch is that it avoids a
> potential extra interrupt, but not to prevent screwed up handlers from
> exploding.
>
> Thanks,
>
> tglx

2011-05-06 14:27:17

by Stefano Stabellini

[permalink] [raw]
Subject: RE: [PATCH v2 2/2] x86: don't unmask disabled irqs when migrating them

On Fri, 6 May 2011, Tian, Kevin wrote:
> > From: Thomas Gleixner
> > Sent: Friday, May 06, 2011 6:00 PM
> >
> > On Fri, 6 May 2011, Tian, Kevin wrote:
> > > x86: don't unmask disabled irqs when migrating them
> > >
> > > it doesn't make sense to mask/unmask a disabled irq when migrating it
> > > from offlined cpu to another, because it's not expected to handle any
> > > instance of it. Current mask/set_affinity/unmask steps may trigger
> > > unexpected instance on disabled irq which then simply bug on when
> > > there is no handler for it. One failing example is observed in Xen.
> > > Xen pvops
> >
> > So there is no handler, why the heck is there an irq action?
> >
> > if (!irq_has_action(irq) ....
> > continue;
> >
> > Should have caught an uninitialized interrupt. If Xen abuses interrupts that way,
> > then it rightfully explodes. And we do not fix it by magic somewhere else.
>
> sorry that my bad description here. there does be a dummy handler registered
> on such irqs which simply throws out a BUG_ON when hit. I should just say such
> injection is not expected instead of no handler. :-)

I don't think this patch is necessary anymore after the event channel
handling cleanup patches I have just sent to the list.
Could you please try the following two patches:

http://marc.info/?l=linux-kernel&m=130468120032172&w=2
http://marc.info/?l=linux-kernel&m=130468178200468&w=2

and let me know if you still need this patch?

2011-05-06 21:43:18

by Tian, Kevin

[permalink] [raw]
Subject: RE: [PATCH v2 2/2] x86: don't unmask disabled irqs when migrating them

> From: Stefano Stabellini [mailto:[email protected]]
> Sent: Friday, May 06, 2011 10:29 PM
>
> On Fri, 6 May 2011, Tian, Kevin wrote:
> > > From: Thomas Gleixner
> > > Sent: Friday, May 06, 2011 6:00 PM
> > >
> > > On Fri, 6 May 2011, Tian, Kevin wrote:
> > > > x86: don't unmask disabled irqs when migrating them
> > > >
> > > > it doesn't make sense to mask/unmask a disabled irq when migrating
> > > > it from offlined cpu to another, because it's not expected to
> > > > handle any instance of it. Current mask/set_affinity/unmask steps
> > > > may trigger unexpected instance on disabled irq which then simply
> > > > bug on when there is no handler for it. One failing example is observed in
> Xen.
> > > > Xen pvops
> > >
> > > So there is no handler, why the heck is there an irq action?
> > >
> > > if (!irq_has_action(irq) ....
> > > continue;
> > >
> > > Should have caught an uninitialized interrupt. If Xen abuses
> > > interrupts that way, then it rightfully explodes. And we do not fix it by magic
> somewhere else.
> >
> > sorry that my bad description here. there does be a dummy handler
> > registered on such irqs which simply throws out a BUG_ON when hit. I
> > should just say such injection is not expected instead of no handler.
> > :-)
>
> I don't think this patch is necessary anymore after the event channel handling
> cleanup patches I have just sent to the list.
> Could you please try the following two patches:
>
> http://marc.info/?l=linux-kernel&m=130468120032172&w=2
> http://marc.info/?l=linux-kernel&m=130468178200468&w=2
>
> and let me know if you still need this patch?

thanks, and I'll take a look at them.

Thanks
Kevin

2011-05-08 01:44:59

by Jeremy Fitzhardinge

[permalink] [raw]
Subject: Re: [PATCH v2 2/2] x86: don't unmask disabled irqs when migrating them

On 05/07/2011 12:04 AM, Ian Campbell wrote:
> I'm not really sure why these can't just be an evtchn without an
> associated IRQ since it doesn't really have any interrupt-like
> semantics. Perhaps just a general desire to keep event channels
> abstracted into the core Xen event subsystem with IRQs as the public
> facing API? Jeremy?

It doesn't really need to be an irq. The main reason was so that it
would appear in /proc/interrupts so I could use the counter as a "number
of times a spinlock was kicked" counter. That could be exposed in some
other way if being part of the interrupt infrastructure brings too much
baggage with it.

J

2011-05-09 00:45:05

by Tian, Kevin

[permalink] [raw]
Subject: RE: [PATCH v2 2/2] x86: don't unmask disabled irqs when migrating them

> From: Jeremy Fitzhardinge [mailto:[email protected]]
> Sent: Sunday, May 08, 2011 9:45 AM
>
> On 05/07/2011 12:04 AM, Ian Campbell wrote:
> > I'm not really sure why these can't just be an evtchn without an
> > associated IRQ since it doesn't really have any interrupt-like
> > semantics. Perhaps just a general desire to keep event channels
> > abstracted into the core Xen event subsystem with IRQs as the public
> > facing API? Jeremy?
>
> It doesn't really need to be an irq. The main reason was so that it would
> appear in /proc/interrupts so I could use the counter as a "number of times a
> spinlock was kicked" counter. That could be exposed in some other way if
> being part of the interrupt infrastructure brings too much baggage with it.
>

Perhaps we don't need an irq binding here. Just like a local APIC interrupt
source which only needs vector. Somehow the virq or vipi concept in Xen
context is similar.

Thanks
Kevin
????{.n?+???????+%?????ݶ??w??{.n?+????{??G?????{ay?ʇڙ?,j??f???h?????????z_??(?階?ݢj"???m??????G????????????&???~???iO???z??v?^?m???? ????????I?

2011-05-09 01:45:35

by Jeremy Fitzhardinge

[permalink] [raw]
Subject: Re: [PATCH v2 2/2] x86: don't unmask disabled irqs when migrating them

On 05/09/2011 10:44 AM, Tian, Kevin wrote:
>> It doesn't really need to be an irq. The main reason was so that it would
>> appear in /proc/interrupts so I could use the counter as a "number of times a
>> spinlock was kicked" counter. That could be exposed in some other way if
>> being part of the interrupt infrastructure brings too much baggage with it.
>>
> Perhaps we don't need an irq binding here. Just like a local APIC interrupt
> source which only needs vector. Somehow the virq or vipi concept in Xen
> context is similar.

An event channel is logically equivalent to a vector, so that would make
sense. We currently allocate irqs for cross-cpu call and reschedule
event channels, whereas native x86 simply uses a naked vector for
those. But they are real interrupts, so an irq at least makes some
logical sense in those cases.

For spinlocks, the event channel is more like a usermode-level signal
which is always blocked and only ever tested with sigsuspend (or is it
sigpoll? something like that).

J

2011-05-09 02:11:32

by Tian, Kevin

[permalink] [raw]
Subject: RE: [PATCH v2 2/2] x86: don't unmask disabled irqs when migrating them

> From: Stefano Stabellini [mailto:[email protected]]
> Sent: Friday, May 06, 2011 10:29 PM
>
> On Fri, 6 May 2011, Tian, Kevin wrote:
> > > From: Thomas Gleixner
> > > Sent: Friday, May 06, 2011 6:00 PM
> > >
> > > On Fri, 6 May 2011, Tian, Kevin wrote:
> > > > x86: don't unmask disabled irqs when migrating them
> > > >
> > > > it doesn't make sense to mask/unmask a disabled irq when migrating
> > > > it from offlined cpu to another, because it's not expected to
> > > > handle any instance of it. Current mask/set_affinity/unmask steps
> > > > may trigger unexpected instance on disabled irq which then simply
> > > > bug on when there is no handler for it. One failing example is observed in
> Xen.
> > > > Xen pvops
> > >
> > > So there is no handler, why the heck is there an irq action?
> > >
> > > if (!irq_has_action(irq) ....
> > > continue;
> > >
> > > Should have caught an uninitialized interrupt. If Xen abuses
> > > interrupts that way, then it rightfully explodes. And we do not fix it by magic
> somewhere else.
> >
> > sorry that my bad description here. there does be a dummy handler
> > registered on such irqs which simply throws out a BUG_ON when hit. I
> > should just say such injection is not expected instead of no handler.
> > :-)
>
> I don't think this patch is necessary anymore after the event channel handling
> cleanup patches I have just sent to the list.
> Could you please try the following two patches:
>
> http://marc.info/?l=linux-kernel&m=130468120032172&w=2
> http://marc.info/?l=linux-kernel&m=130468178200468&w=2
>
> and let me know if you still need this patch?

yes, with your patch this issue disappears, since you explicitly make mask/unmask as
a nop for xen_percpu_chip, which effectively avoids them from undesired unmask
when doing the migration. Though it works, it's not intuitive as to me it's an
workaround to make Xen chip implementation adapting to specific fixup_irqs logic.
If the logic within fixup_irqs is changed in the future, it's still possible to expose this
issue again. If we explicitly avoid the percpu irqs in the 1st place in fixup_irqs like
what my 1st patch does, it'd be clearer.

But based on your referred patchesand Jeremy's info about possible unbinding xenn
event from irq namespace, my patches are not that necessary now as before. So I'll leave
to Thomas to decide whether he wants to take my 1st patch. the 2nd one talked in this
thread could be dropped.

Thanks
Kevin

2011-05-09 12:01:17

by Stefano Stabellini

[permalink] [raw]
Subject: RE: [PATCH v2 2/2] x86: don't unmask disabled irqs when migrating them

On Mon, 9 May 2011, Tian, Kevin wrote:
> yes, with your patch this issue disappears, since you explicitly make mask/unmask as
> a nop for xen_percpu_chip, which effectively avoids them from undesired unmask
> when doing the migration. Though it works, it's not intuitive as to me it's an
> workaround to make Xen chip implementation adapting to specific fixup_irqs logic.

I have been tring to follow the example of existing supported drivers.
The only x86 driver I could find that uses handle_percpu_irq is uv_irq
that does exatly the same thing.

2011-05-09 12:37:05

by Thomas Gleixner

[permalink] [raw]
Subject: RE: [PATCH v2 2/2] x86: don't unmask disabled irqs when migrating them

On Mon, 9 May 2011, Stefano Stabellini wrote:

> On Mon, 9 May 2011, Tian, Kevin wrote:
> > yes, with your patch this issue disappears, since you explicitly make mask/unmask as
> > a nop for xen_percpu_chip, which effectively avoids them from undesired unmask
> > when doing the migration. Though it works, it's not intuitive as to me it's an
> > workaround to make Xen chip implementation adapting to specific fixup_irqs logic.
>
> I have been tring to follow the example of existing supported drivers.
> The only x86 driver I could find that uses handle_percpu_irq is uv_irq
> that does exatly the same thing.

Which is a good enough argument to make that change at the common code
level instead of having fancy workarounds here and there.

Thanks,

tglx

2011-05-10 03:24:39

by Tian, Kevin

[permalink] [raw]
Subject: RE: [PATCH v2 2/2] x86: don't unmask disabled irqs when migrating them

> From: Thomas Gleixner [mailto:[email protected]]
> Sent: Monday, May 09, 2011 8:37 PM
>
> On Mon, 9 May 2011, Stefano Stabellini wrote:
>
> > On Mon, 9 May 2011, Tian, Kevin wrote:
> > > yes, with your patch this issue disappears, since you explicitly
> > > make mask/unmask as a nop for xen_percpu_chip, which effectively
> > > avoids them from undesired unmask when doing the migration. Though
> > > it works, it's not intuitive as to me it's an workaround to make Xen chip
> implementation adapting to specific fixup_irqs logic.
> >
> > I have been tring to follow the example of existing supported drivers.
> > The only x86 driver I could find that uses handle_percpu_irq is uv_irq
> > that does exatly the same thing.
>
> Which is a good enough argument to make that change at the common code
> level instead of having fancy workarounds here and there.
>

So Thomas, what's your suggestion to continue here? Is my original patch to skip
percpu irq in common code a good option to go, or you want a cleaner code in
other form? Once it's clear I'll discuss with Stefano e.g. possibly merge with his
cleanup patch series. :-)

Thanks
Kevin

2011-05-18 23:50:08

by Tian, Kevin

[permalink] [raw]
Subject: RE: [PATCH v2 2/2] x86: don't unmask disabled irqs when migrating them

> From: Tian, Kevin
> Sent: Tuesday, May 10, 2011 11:24 AM
>
> > From: Thomas Gleixner [mailto:[email protected]]
> > Sent: Monday, May 09, 2011 8:37 PM
> >
> > On Mon, 9 May 2011, Stefano Stabellini wrote:
> >
> > > On Mon, 9 May 2011, Tian, Kevin wrote:
> > > > yes, with your patch this issue disappears, since you explicitly
> > > > make mask/unmask as a nop for xen_percpu_chip, which effectively
> > > > avoids them from undesired unmask when doing the migration. Though
> > > > it works, it's not intuitive as to me it's an workaround to make
> > > > Xen chip
> > implementation adapting to specific fixup_irqs logic.
> > >
> > > I have been tring to follow the example of existing supported drivers.
> > > The only x86 driver I could find that uses handle_percpu_irq is
> > > uv_irq that does exatly the same thing.
> >
> > Which is a good enough argument to make that change at the common code
> > level instead of having fancy workarounds here and there.
> >
>
> So Thomas, what's your suggestion to continue here? Is my original patch to
> skip percpu irq in common code a good option to go, or you want a cleaner code
> in other form? Once it's clear I'll discuss with Stefano e.g. possibly merge with
> his cleanup patch series. :-)
>

Hi, Thomas,

any response on this? Sorry that you may explain your comments clearly in earlier
thread, but I may not catch all of them in one place.

I sent out two fixes here:
[1/2] don't move irq when it's percpu type
[2/2] don't unmask irq when it's disabled at irqchip level

for [2/2], as you explained it's legitimate to mask/unmask a disabled irq since from
irqchip level mask/disable actually means same thing. The only possible gain to
do that is to avoid a potential extra interrupt, which is a different purpose as what
I originally target. So for [2/2] I think it's not required now.

for [1/2] I think it's still necessary as it's meaningless to migrate a percpu type irq.
However Stefano has sent out a cleanup patch for Xen percpu irqchip which uses
nop mask/unmask hack borrowed from uv machine to work around the issue. As
you suggested it's better to consolidate into the common place instead of scattering
in different places. My view on this common logic is what [1/2] tries to address, is
it correct? If yes, would you consider taking this patch? Stefano told me that his
patches will go in in next merge window. So I think either you can take [1/2] now and
then I'll do cleanup after Stefano's patch is in, or I can rebase my [1/2] after Stefano's
patch to clean both xen and uv parts.

Let me know your thought.

Thanks
Kevin

2011-05-19 12:06:04

by Stefano Stabellini

[permalink] [raw]
Subject: RE: [PATCH v2 2/2] x86: don't unmask disabled irqs when migrating them

On Thu, 19 May 2011, Tian, Kevin wrote:
> for [1/2] I think it's still necessary as it's meaningless to migrate a percpu type irq.
> However Stefano has sent out a cleanup patch for Xen percpu irqchip which uses
> nop mask/unmask hack borrowed from uv machine to work around the issue. As
> you suggested it's better to consolidate into the common place instead of scattering
> in different places. My view on this common logic is what [1/2] tries to address, is
> it correct? If yes, would you consider taking this patch? Stefano told me that his
> patches will go in in next merge window. So I think either you can take [1/2] now and
> then I'll do cleanup after Stefano's patch is in, or I can rebase my [1/2] after Stefano's
> patch to clean both xen and uv parts.

Actually I think Kevin's generic patch is better too.
If you ack it I'll remove my patch right away from the queue (maybe I
should remove it anyway?).
Kevin probably needs to write a cleanup patch to remove the equivalent
hack from the uv_irq.

2011-05-19 16:19:22

by Konrad Rzeszutek Wilk

[permalink] [raw]
Subject: Re: [Xen-devel] RE: [PATCH v2 2/2] x86: don't unmask disabled irqs when migrating them

On Thu, May 19, 2011 at 01:08:07PM +0100, Stefano Stabellini wrote:
> On Thu, 19 May 2011, Tian, Kevin wrote:
> > for [1/2] I think it's still necessary as it's meaningless to migrate a percpu type irq.
> > However Stefano has sent out a cleanup patch for Xen percpu irqchip which uses
> > nop mask/unmask hack borrowed from uv machine to work around the issue. As
> > you suggested it's better to consolidate into the common place instead of scattering
> > in different places. My view on this common logic is what [1/2] tries to address, is
> > it correct? If yes, would you consider taking this patch? Stefano told me that his
> > patches will go in in next merge window. So I think either you can take [1/2] now and
> > then I'll do cleanup after Stefano's patch is in, or I can rebase my [1/2] after Stefano's
> > patch to clean both xen and uv parts.
>
> Actually I think Kevin's generic patch is better too.
> If you ack it I'll remove my patch right away from the queue (maybe I
> should remove it anyway?).

I dropped your patch.