2007-05-31 13:33:57

by Eric W. Biederman

[permalink] [raw]
Subject: [PATCH] x86: Document the hotplug code is incompatible with x86 irq handling


I just realized that except for doing the code review and noticing
that the current cpu hotplug code is fundamentally incompatible
with x86 I haven't done anything about it. So here is my patch
to document what is wrong.

The current cpu hotplug code requires irqs to be migrated from a cpu
outside of irq context. On x86 ioapics simply do not support this,
making the code unfixable without major redesign of the generic cpu
hotplug code.

So this patch makes CPU_HOTPLUG on x86 depend on CONFIG_BROKEN
and adds a WARN_ON so people that do enable it are not in doubt about
which part of the code is broken, even if it does work for them.

Signed-off-by: Eric W. Biederman <[email protected]>
---
arch/i386/Kconfig | 2 +-
arch/i386/kernel/irq.c | 13 +++++++++++++
arch/x86_64/Kconfig | 2 +-
arch/x86_64/kernel/irq.c | 13 +++++++++++++
4 files changed, 28 insertions(+), 2 deletions(-)

diff --git a/arch/i386/Kconfig b/arch/i386/Kconfig
index 8770a5d..74444c1 100644
--- a/arch/i386/Kconfig
+++ b/arch/i386/Kconfig
@@ -892,7 +892,7 @@ config PHYSICAL_ALIGN

config HOTPLUG_CPU
bool "Support for suspend on SMP and hot-pluggable CPUs (EXPERIMENTAL)"
- depends on SMP && HOTPLUG && EXPERIMENTAL && !X86_VOYAGER
+ depends on SMP && HOTPLUG && EXPERIMENTAL && !X86_VOYAGER && BROKEN
---help---
Say Y here to experiment with turning CPUs off and on, and to
enable suspend on SMP systems. CPUs can be controlled through
diff --git a/arch/i386/kernel/irq.c b/arch/i386/kernel/irq.c
index d2daf67..42aeccb 100644
--- a/arch/i386/kernel/irq.c
+++ b/arch/i386/kernel/irq.c
@@ -312,6 +312,19 @@ void fixup_irqs(cpumask_t map)
unsigned int irq;
static int warned;

+ /*
+ * Function is so wrong at so many levels.
+ * - We migrate irqs that are directed at the cpu we are
+ * removing.
+ * - We cannot safely migrate ioapic irqs on x86 except in
+ * side of irq context.
+ * - We are enabling irqs when the interface requires irqs to
+ * be disabled.
+ * Since someone probably finds this useful just warn very
+ * loudly until cpu hotplug is redesigned.
+ */
+ WARN_ON(1);
+
for (irq = 0; irq < NR_IRQS; irq++) {
cpumask_t mask;
if (irq == 2)
diff --git a/arch/x86_64/Kconfig b/arch/x86_64/Kconfig
index 5ce9443..a61c4f2 100644
--- a/arch/x86_64/Kconfig
+++ b/arch/x86_64/Kconfig
@@ -429,7 +429,7 @@ config NR_CPUS

config HOTPLUG_CPU
bool "Support for suspend on SMP and hot-pluggable CPUs (EXPERIMENTAL)"
- depends on SMP && HOTPLUG && EXPERIMENTAL
+ depends on SMP && HOTPLUG && EXPERIMENTAL && BROKEN
help
Say Y here to experiment with turning CPUs off and on. CPUs
can be controlled through /sys/devices/system/cpu/cpu#.
diff --git a/arch/x86_64/kernel/irq.c b/arch/x86_64/kernel/irq.c
index 3eaceac..da6f282 100644
--- a/arch/x86_64/kernel/irq.c
+++ b/arch/x86_64/kernel/irq.c
@@ -142,6 +142,19 @@ void fixup_irqs(cpumask_t map)
unsigned int irq;
static int warned;

+ /*
+ * Function is so wrong at so many levels.
+ * - We migrate irqs that are directed at the cpu we are
+ * removing.
+ * - We cannot safely migrate ioapic irqs on x86 except in
+ * side of irq context.
+ * - We are enabling irqs when the interface requires irqs to
+ * be disabled.
+ * Since someone probably finds this useful just warn very
+ * loudly until cpu hotplug is redesigned.
+ */
+ WARN_ON(1);
+
for (irq = 0; irq < NR_IRQS; irq++) {
cpumask_t mask;
if (irq == 2)
--
1.5.1.1.181.g2de0


2007-05-31 14:34:48

by Robert Hancock

[permalink] [raw]
Subject: Re: [PATCH] x86: Document the hotplug code is incompatible with x86 irq handling

Eric W. Biederman wrote:
> I just realized that except for doing the code review and noticing
> that the current cpu hotplug code is fundamentally incompatible
> with x86 I haven't done anything about it. So here is my patch
> to document what is wrong.
>
> The current cpu hotplug code requires irqs to be migrated from a cpu
> outside of irq context. On x86 ioapics simply do not support this,
> making the code unfixable without major redesign of the generic cpu
> hotplug code.
>
> So this patch makes CPU_HOTPLUG on x86 depend on CONFIG_BROKEN
> and adds a WARN_ON so people that do enable it are not in doubt about
> which part of the code is broken, even if it does work for them.
>
> Signed-off-by: Eric W. Biederman <[email protected]>

I don't think this is useful, though the code may be problematic, this
patch will break suspend on all SMP machines with an existing config,
which is a major regression..

--
Robert Hancock Saskatoon, SK, Canada
To email, remove "nospam" from [email protected]
Home Page: http://www.roberthancock.com/

2007-05-31 15:49:18

by Eric W. Biederman

[permalink] [raw]
Subject: Re: [PATCH] x86: Document the hotplug code is incompatible with x86 irq handling

Robert Hancock <[email protected]> writes:

> Eric W. Biederman wrote:
>> I just realized that except for doing the code review and noticing
>> that the current cpu hotplug code is fundamentally incompatible
>> with x86 I haven't done anything about it. So here is my patch
>> to document what is wrong.
>>
>> The current cpu hotplug code requires irqs to be migrated from a cpu
>> outside of irq context. On x86 ioapics simply do not support this,
>> making the code unfixable without major redesign of the generic cpu
>> hotplug code.
>>
>> So this patch makes CPU_HOTPLUG on x86 depend on CONFIG_BROKEN
>> and adds a WARN_ON so people that do enable it are not in doubt about
>> which part of the code is broken, even if it does work for them.
>>
>> Signed-off-by: Eric W. Biederman <[email protected]>
>
> I don't think this is useful, though the code may be problematic, this patch
> will break suspend on all SMP machines with an existing config, which is a major
> regression..


Cool.

My patch does not change the functionality of the code just complains
very loudly that it is broken.

Further the code is broken at a design level. The code isn't
problematic the code is impossible. The cpu hotplug code can not be
fixed on x86 without a redesign of the generic cpu hotplug code.

Suspend does not need to use cpu hotplug because it already gets in
deep with the drivers, and can stop interrupts at the source. I know
there was some talk about this doing this earlier, but I don't know if
anything came of that discussion.

Regardless if you care about this being a problem feel free to fix the
relevant code so it attempts to do something that the hardware
actually supports.

But if the suspend needs this code for smp support it is also broken.

Eric

2007-05-31 20:07:20

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [PATCH] x86: Document the hotplug code is incompatible with x86 irq handling

On Thursday, 31 May 2007 17:47, Eric W. Biederman wrote:
> Robert Hancock <[email protected]> writes:
>
> > Eric W. Biederman wrote:
> >> I just realized that except for doing the code review and noticing
> >> that the current cpu hotplug code is fundamentally incompatible
> >> with x86 I haven't done anything about it. So here is my patch
> >> to document what is wrong.
> >>
> >> The current cpu hotplug code requires irqs to be migrated from a cpu
> >> outside of irq context. On x86 ioapics simply do not support this,
> >> making the code unfixable without major redesign of the generic cpu
> >> hotplug code.
> >>
> >> So this patch makes CPU_HOTPLUG on x86 depend on CONFIG_BROKEN
> >> and adds a WARN_ON so people that do enable it are not in doubt about
> >> which part of the code is broken, even if it does work for them.
> >>
> >> Signed-off-by: Eric W. Biederman <[email protected]>
> >
> > I don't think this is useful, though the code may be problematic, this patch
> > will break suspend on all SMP machines with an existing config, which is a major
> > regression..
>
>
> Cool.
>
> My patch does not change the functionality of the code just complains
> very loudly that it is broken.
>
> Further the code is broken at a design level. The code isn't
> problematic the code is impossible. The cpu hotplug code can not be
> fixed on x86 without a redesign of the generic cpu hotplug code.
>
> Suspend does not need to use cpu hotplug because it already gets in
> deep with the drivers, and can stop interrupts at the source. I know
> there was some talk about this doing this earlier, but I don't know if
> anything came of that discussion.
>
> Regardless if you care about this being a problem feel free to fix the
> relevant code so it attempts to do something that the hardware
> actually supports.
>
> But if the suspend needs this code for smp support it is also broken.

Well, from the functionality point of view, it's not. We have no problems
with it, at least not that I know of.

Greetings,
Rafael

2007-06-01 19:50:12

by Eric W. Biederman

[permalink] [raw]
Subject: Re: [PATCH] x86: Document the hotplug code is incompatible with x86 irq handling

"Rafael J. Wysocki" <[email protected]> writes:
>> Cool.
>>
>> My patch does not change the functionality of the code just complains
>> very loudly that it is broken.
>>
>> Further the code is broken at a design level. The code isn't
>> problematic the code is impossible. The cpu hotplug code can not be
>> fixed on x86 without a redesign of the generic cpu hotplug code.
>>
>> Suspend does not need to use cpu hotplug because it already gets in
>> deep with the drivers, and can stop interrupts at the source. I know
>> there was some talk about this doing this earlier, but I don't know if
>> anything came of that discussion.
>>
>> Regardless if you care about this being a problem feel free to fix the
>> relevant code so it attempts to do something that the hardware
>> actually supports.
>>
>> But if the suspend needs this code for smp support it is also broken.
>
> Well, from the functionality point of view, it's not. We have no problems
> with it, at least not that I know of.

Luck, or enough other issues someone hasn't tracked their problems
down to this. On the pure cpu hotplug path I just got a bug
report about it not working: http://lkml.org/lkml/2007/5/31/419

So apparently real people can hit this one. Not just theoretical
people seen by code reviewers.

The code is broken by design and cannot be made to support existing
x86 SMP systems. Please find a way to use something that works.

The suspend path is already talking to the drivers and can stop IRQs
at their source so it should not be a big deal.

Eric

2007-06-01 20:00:47

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [PATCH] x86: Document the hotplug code is incompatible with x86 irq handling

On Friday, 1 June 2007 21:48, Eric W. Biederman wrote:
> "Rafael J. Wysocki" <[email protected]> writes:
> >> Cool.
> >>
> >> My patch does not change the functionality of the code just complains
> >> very loudly that it is broken.
> >>
> >> Further the code is broken at a design level. The code isn't
> >> problematic the code is impossible. The cpu hotplug code can not be
> >> fixed on x86 without a redesign of the generic cpu hotplug code.
> >>
> >> Suspend does not need to use cpu hotplug because it already gets in
> >> deep with the drivers, and can stop interrupts at the source. I know
> >> there was some talk about this doing this earlier, but I don't know if
> >> anything came of that discussion.
> >>
> >> Regardless if you care about this being a problem feel free to fix the
> >> relevant code so it attempts to do something that the hardware
> >> actually supports.
> >>
> >> But if the suspend needs this code for smp support it is also broken.
> >
> > Well, from the functionality point of view, it's not. We have no problems
> > with it, at least not that I know of.
>
> Luck, or enough other issues someone hasn't tracked their problems
> down to this. On the pure cpu hotplug path I just got a bug
> report about it not working: http://lkml.org/lkml/2007/5/31/419
>
> So apparently real people can hit this one. Not just theoretical
> people seen by code reviewers.
>
> The code is broken by design and cannot be made to support existing
> x86 SMP systems. Please find a way to use something that works.
>
> The suspend path is already talking to the drivers and can stop IRQs
> at their source so it should not be a big deal.

Very well, but could you _please_ give us some time to do this?

We know of the problem now and will work to fix it, but it's not _that_ easy.

In fact, we also rely on the CPU hotplug's code that takes tasks away from the
offlined CPUs (and does the opposite with respect to the onlined CPUs), so
we just can't get rid of the CPU hotplug _right_ _now_.

Greetings,
Rafael


--
"Premature optimization is the root of all evil." - Donald Knuth

2007-06-01 20:29:16

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [PATCH] x86: Document the hotplug code is incompatible with x86 irq handling

On Friday, 1 June 2007 21:48, Eric W. Biederman wrote:
> "Rafael J. Wysocki" <[email protected]> writes:
> >> Cool.
> >>
> >> My patch does not change the functionality of the code just complains
> >> very loudly that it is broken.
> >>
> >> Further the code is broken at a design level. The code isn't
> >> problematic the code is impossible. The cpu hotplug code can not be
> >> fixed on x86 without a redesign of the generic cpu hotplug code.
> >>
> >> Suspend does not need to use cpu hotplug because it already gets in
> >> deep with the drivers, and can stop interrupts at the source. I know
> >> there was some talk about this doing this earlier, but I don't know if
> >> anything came of that discussion.
> >>
> >> Regardless if you care about this being a problem feel free to fix the
> >> relevant code so it attempts to do something that the hardware
> >> actually supports.
> >>
> >> But if the suspend needs this code for smp support it is also broken.
> >
> > Well, from the functionality point of view, it's not. We have no problems
> > with it, at least not that I know of.
>
> Luck, or enough other issues someone hasn't tracked their problems
> down to this. On the pure cpu hotplug path I just got a bug
> report about it not working: http://lkml.org/lkml/2007/5/31/419

[Apart from what I said in the other message:]

In the hibernation/suspend code paths we call the CPU hotplug quite late,
actually right before we turn off local IRQs on the only CPU that is not
offlined. I think that at this point the majority of interrupt sources should
have been turned off already.

Greetings,
Rafael


--
"Premature optimization is the root of all evil." - Donald Knuth

2007-06-01 20:31:52

by Eric W. Biederman

[permalink] [raw]
Subject: Re: [PATCH] x86: Document the hotplug code is incompatible with x86 irq handling

"Rafael J. Wysocki" <[email protected]> writes:

> Very well, but could you _please_ give us some time to do this?
>
> We know of the problem now and will work to fix it, but it's not _that_ easy.
>
> In fact, we also rely on the CPU hotplug's code that takes tasks away from the
> offlined CPUs (and does the opposite with respect to the onlined CPUs), so
> we just can't get rid of the CPU hotplug _right_ _now_.

Sure. My primary point was to document this, and get things moving if
possible to fix it. This patch is 2.6.23 at the earliest, so there is
some time to work on it even on the most optimistic merge schedule.

I just don't want to be complacent and let this problem sit undetected
in back corner any more.

Eric

2007-06-01 20:38:49

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [PATCH] x86: Document the hotplug code is incompatible with x86 irq handling

[Removed the Zwane's address from the CC list, because I get rejects from it]

On Friday, 1 June 2007 22:29, Eric W. Biederman wrote:
> "Rafael J. Wysocki" <[email protected]> writes:
>
> > Very well, but could you _please_ give us some time to do this?
> >
> > We know of the problem now and will work to fix it, but it's not _that_ easy.
> >
> > In fact, we also rely on the CPU hotplug's code that takes tasks away from the
> > offlined CPUs (and does the opposite with respect to the onlined CPUs), so
> > we just can't get rid of the CPU hotplug _right_ _now_.
>
> Sure. My primary point was to document this, and get things moving if
> possible to fix it. This patch is 2.6.23 at the earliest, so there is
> some time to work on it even on the most optimistic merge schedule.
>
> I just don't want to be complacent and let this problem sit undetected
> in back corner any more.

So, can you advise what we can do to get the "nonboot" CPUs out of the picture
during a suspend/hibernation?

Greetings,
Rafael


--
"Premature optimization is the root of all evil." - Donald Knuth

2007-06-08 15:11:48

by Pavel Machek

[permalink] [raw]
Subject: Re: [PATCH] x86: Document the hotplug code is incompatible with x86 irq handling

Hi!

> I just realized that except for doing the code review and noticing
> that the current cpu hotplug code is fundamentally incompatible
> with x86 I haven't done anything about it. So here is my patch
> to document what is wrong.
>
> The current cpu hotplug code requires irqs to be migrated from a cpu
> outside of irq context. On x86 ioapics simply do not support this,
> making the code unfixable without major redesign of the generic cpu
> hotplug code.
>
> So this patch makes CPU_HOTPLUG on x86 depend on CONFIG_BROKEN
> and adds a WARN_ON so people that do enable it are not in doubt about
> which part of the code is broken, even if it does work for them.


> --- a/arch/i386/kernel/irq.c
> +++ b/arch/i386/kernel/irq.c
> @@ -312,6 +312,19 @@ void fixup_irqs(cpumask_t map)
> unsigned int irq;
> static int warned;
>
> + /*
> + * Function is so wrong at so many levels.
> + * - We migrate irqs that are directed at the cpu we are
> + * removing.

Is this about irq pinning?

> + * - We cannot safely migrate ioapic irqs on x86 except in
> + * side of irq context.

'inside'?

Can you be more specific for this one?

> + * Since someone probably finds this useful just warn very
> + * loudly until cpu hotplug is redesigned.
> + */
> + WARN_ON(1);

Ugh, no, this does not warn anyone. This will just make people ask me
why they see stack trace while suspending... and we are not interested
in the stack trace, anyway.

printk(KERN_WARNING)?

> index 5ce9443..a61c4f2 100644
> --- a/arch/x86_64/Kconfig
> +++ b/arch/x86_64/Kconfig
> @@ -429,7 +429,7 @@ config NR_CPUS
>
> config HOTPLUG_CPU
> bool "Support for suspend on SMP and hot-pluggable CPUs (EXPERIMENTAL)"
> - depends on SMP && HOTPLUG && EXPERIMENTAL
> + depends on SMP && HOTPLUG && EXPERIMENTAL && BROKEN
> help

Great, this will force everyone and their dog to enable broken, making
broken useless. Please don't.
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

2007-06-12 18:22:18

by Eric W. Biederman

[permalink] [raw]
Subject: Re: [PATCH] x86: Document the hotplug code is incompatible with x86 irq handling

Pavel Machek <[email protected]> writes:

> Hi!
>
>> I just realized that except for doing the code review and noticing
>> that the current cpu hotplug code is fundamentally incompatible
>> with x86 I haven't done anything about it. So here is my patch
>> to document what is wrong.
>>
>> The current cpu hotplug code requires irqs to be migrated from a cpu
>> outside of irq context. On x86 ioapics simply do not support this,
>> making the code unfixable without major redesign of the generic cpu
>> hotplug code.
>>
>> So this patch makes CPU_HOTPLUG on x86 depend on CONFIG_BROKEN
>> and adds a WARN_ON so people that do enable it are not in doubt about
>> which part of the code is broken, even if it does work for them.
>
>
>> --- a/arch/i386/kernel/irq.c
>> +++ b/arch/i386/kernel/irq.c
>> @@ -312,6 +312,19 @@ void fixup_irqs(cpumask_t map)
>> unsigned int irq;
>> static int warned;
>>
>> + /*
>> + * Function is so wrong at so many levels.
>> + * - We migrate irqs that are directed at the cpu we are
>> + * removing.
>
> Is this about irq pinning?

Sorry. That should have been: We migrate irqs that are not directed at the
cpu we are removing. (We are migrating irqs when it is unnecessary).

>> + * - We cannot safely migrate ioapic irqs on x86 except in
>> + * side of irq context.
>
> 'inside'?
>
> Can you be more specific for this one?

Yes inside.

An irq migration currently requires two instances of the irq firing to
complete. Once on the source cpu once on the target cpu.

Migrating irqs while the irq is alive is a royal pain.

>> + * Since someone probably finds this useful just warn very
>> + * loudly until cpu hotplug is redesigned.
>> + */
>> + WARN_ON(1);
>
> Ugh, no, this does not warn anyone. This will just make people ask me
> why they see stack trace while suspending... and we are not interested
> in the stack trace, anyway.
>
> printk(KERN_WARNING)?


Because you are calling unfixably broken code. That should be a decent
incentive to do something else won't it?

IOAPICs do not support what the code is doing here. There is lots of
practical evidence including bad experiences and practical tests that
support this.

I suspect the only reason you don't have problems is that the irqs are
already shut down at the source before we get to this code path.

>> index 5ce9443..a61c4f2 100644
>> --- a/arch/x86_64/Kconfig
>> +++ b/arch/x86_64/Kconfig
>> @@ -429,7 +429,7 @@ config NR_CPUS
>>
>> config HOTPLUG_CPU
>> bool "Support for suspend on SMP and hot-pluggable CPUs (EXPERIMENTAL)"
>> - depends on SMP && HOTPLUG && EXPERIMENTAL
>> + depends on SMP && HOTPLUG && EXPERIMENTAL && BROKEN
>> help
>
> Great, this will force everyone and their dog to enable broken, making
> broken useless. Please don't.

CONFIG_BROKEN is quite likely excessive but the code is totally and
unfixably broken. So it still doesn't feel wrong to me.

Perhaps we should just disable swap suspend on SMP until we get a
design that can be implemented correctly on existing hardware. I am
not happy with people telling me that we must keep broken code because
people with brand new SMP laptops will scream otherwise. Since the
fundamental problems in this code path don't appear to bite people
very frequently I don't mind waiting while an alternative solution is
debugged and tested, but there is no way it makes sense to keep this
code in service more for more than a kernel release or two.

Eric

2007-06-12 20:45:52

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [PATCH] x86: Document the hotplug code is incompatible with x86 irq handling

On Tuesday, 12 June 2007 20:19, Eric W. Biederman wrote:
> Pavel Machek <[email protected]> writes:
>
> > Hi!
> >
> >> I just realized that except for doing the code review and noticing
> >> that the current cpu hotplug code is fundamentally incompatible
> >> with x86 I haven't done anything about it. So here is my patch
> >> to document what is wrong.
> >>
> >> The current cpu hotplug code requires irqs to be migrated from a cpu
> >> outside of irq context. On x86 ioapics simply do not support this,
> >> making the code unfixable without major redesign of the generic cpu
> >> hotplug code.
> >>
> >> So this patch makes CPU_HOTPLUG on x86 depend on CONFIG_BROKEN
> >> and adds a WARN_ON so people that do enable it are not in doubt about
> >> which part of the code is broken, even if it does work for them.
> >
> >
> >> --- a/arch/i386/kernel/irq.c
> >> +++ b/arch/i386/kernel/irq.c
> >> @@ -312,6 +312,19 @@ void fixup_irqs(cpumask_t map)
> >> unsigned int irq;
> >> static int warned;
> >>
> >> + /*
> >> + * Function is so wrong at so many levels.
> >> + * - We migrate irqs that are directed at the cpu we are
> >> + * removing.
> >
> > Is this about irq pinning?
>
> Sorry. That should have been: We migrate irqs that are not directed at the
> cpu we are removing. (We are migrating irqs when it is unnecessary).
>
> >> + * - We cannot safely migrate ioapic irqs on x86 except in
> >> + * side of irq context.
> >
> > 'inside'?
> >
> > Can you be more specific for this one?
>
> Yes inside.
>
> An irq migration currently requires two instances of the irq firing to
> complete. Once on the source cpu once on the target cpu.
>
> Migrating irqs while the irq is alive is a royal pain.
>
> >> + * Since someone probably finds this useful just warn very
> >> + * loudly until cpu hotplug is redesigned.
> >> + */
> >> + WARN_ON(1);
> >
> > Ugh, no, this does not warn anyone. This will just make people ask me
> > why they see stack trace while suspending... and we are not interested
> > in the stack trace, anyway.
> >
> > printk(KERN_WARNING)?
>
>
> Because you are calling unfixably broken code. That should be a decent
> incentive to do something else won't it?

Can you please tell me _what_ else can be done?

> IOAPICs do not support what the code is doing here. There is lots of
> practical evidence including bad experiences and practical tests that
> support this.

Well, AFAICS, Suresh has tried to debug one failing case recently without
any consistent conclusions. I don't know of any other test cases (links,
please?).

> I suspect the only reason you don't have problems is that the irqs are
> already shut down at the source before we get to this code path.

You are probably right, but OTOH I've tried the CPU hotplug via the sysfs
interface for _many_ times and it has _never_ failed for me.

> >> index 5ce9443..a61c4f2 100644
> >> --- a/arch/x86_64/Kconfig
> >> +++ b/arch/x86_64/Kconfig
> >> @@ -429,7 +429,7 @@ config NR_CPUS
> >>
> >> config HOTPLUG_CPU
> >> bool "Support for suspend on SMP and hot-pluggable CPUs (EXPERIMENTAL)"
> >> - depends on SMP && HOTPLUG && EXPERIMENTAL
> >> + depends on SMP && HOTPLUG && EXPERIMENTAL && BROKEN
> >> help
> >
> > Great, this will force everyone and their dog to enable broken, making
> > broken useless. Please don't.
>
> CONFIG_BROKEN is quite likely excessive but the code is totally and
> unfixably broken. So it still doesn't feel wrong to me.
>
> Perhaps we should just disable swap suspend on SMP until we get a
> design that can be implemented correctly on existing hardware.

All kinds of suspend, not only hibernation.

> I am not happy with people telling me that we must keep broken code because
> people with brand new SMP laptops will scream otherwise.

Sorry, but that's how it goes.

We've been using that code for more then a year now and I'd expect someone
to tell us that it's wrong a bit earlier.

If you are telling us to drop it now, then _please_ advise what we can use
instead, because we obviously need the functionality.

> Since the fundamental problems in this code path don't appear to bite people
> very frequently I don't mind waiting while an alternative solution is
> debugged and tested, but there is no way it makes sense to keep this
> code in service more for more than a kernel release or two.

Greetings,
Rafael


--
"Premature optimization is the root of all evil." - Donald Knuth

2007-06-12 22:01:09

by Suresh Siddha

[permalink] [raw]
Subject: Re: [PATCH] x86: Document the hotplug code is incompatible with x86 irq handling

On Tue, Jun 12, 2007 at 10:52:09PM +0200, Rafael J. Wysocki wrote:
> On Tuesday, 12 June 2007 20:19, Eric W. Biederman wrote:
> > Because you are calling unfixably broken code. That should be a decent
> > incentive to do something else won't it?
>
> Can you please tell me _what_ else can be done?
>
> > IOAPICs do not support what the code is doing here. There is lots of
> > practical evidence including bad experiences and practical tests that
> > support this.
>
> Well, AFAICS, Suresh has tried to debug one failing case recently without
> any consistent conclusions. I don't know of any other test cases (links,
> please?).

Rafael, Darrick Wong's issue looks different and hence I was motivated to
look and fix if it was a SW issue. For now, I am not able to comprehend
what is happening on Darrick Wong's system. Need more help from Darrick
as he has the golden failing system.

Meanwhile I talked to our hardware folks about the irq migration in general.

One good news is that future versions of chipsets will have an Interrupt
Remapping feature(for more details please refer to
http://download.intel.com/technology/computing/vptech/Intel(r)_VT_for_Direct_IO.pdf)
where in we can reliably migrate the irq to someother cpu in the process
context.

And for the existing platforms, chipset folks don't see a reason why the
Eric's algorithm(specified below) should fail.

Eric's algorithm for level triggered(edge triggered should be easier than
level triggered):
- Mask the irq in process context.
- Poll RIRR until an ack of for the irq was not pending.
- Migrate the irq.

Eric had a problem of stuck remote IRR on E75xx chipset with this algorithm
and my next step is to reproduce this issue on this platform and understand
the behavior.

thanks,
suresh

2007-06-12 22:10:25

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [PATCH] x86: Document the hotplug code is incompatible with x86 irq handling

On Tuesday, 12 June 2007 23:56, Siddha, Suresh B wrote:
> On Tue, Jun 12, 2007 at 10:52:09PM +0200, Rafael J. Wysocki wrote:
> > On Tuesday, 12 June 2007 20:19, Eric W. Biederman wrote:
> > > Because you are calling unfixably broken code. That should be a decent
> > > incentive to do something else won't it?
> >
> > Can you please tell me _what_ else can be done?
> >
> > > IOAPICs do not support what the code is doing here. There is lots of
> > > practical evidence including bad experiences and practical tests that
> > > support this.
> >
> > Well, AFAICS, Suresh has tried to debug one failing case recently without
> > any consistent conclusions. I don't know of any other test cases (links,
> > please?).
>
> Rafael, Darrick Wong's issue looks different and hence I was motivated to
> look and fix if it was a SW issue. For now, I am not able to comprehend
> what is happening on Darrick Wong's system. Need more help from Darrick
> as he has the golden failing system.
>
> Meanwhile I talked to our hardware folks about the irq migration in general.
>
> One good news is that future versions of chipsets will have an Interrupt
> Remapping feature(for more details please refer to
> http://download.intel.com/technology/computing/vptech/Intel(r)_VT_for_Direct_IO.pdf)
> where in we can reliably migrate the irq to someother cpu in the process
> context.
>
> And for the existing platforms, chipset folks don't see a reason why the
> Eric's algorithm(specified below) should fail.
>
> Eric's algorithm for level triggered(edge triggered should be easier than
> level triggered):
> - Mask the irq in process context.
> - Poll RIRR until an ack of for the irq was not pending.
> - Migrate the irq.
>
> Eric had a problem of stuck remote IRR on E75xx chipset with this algorithm
> and my next step is to reproduce this issue on this platform and understand
> the behavior.

OK

In that case, do I understand correctly that we are going to implement the
Eric's algorithm above for the CPU hotunplugging on x86 once you've figured
out what's the E75xx issue?

Rafael


--
"Premature optimization is the root of all evil." - Donald Knuth

2007-06-12 22:28:39

by Suresh Siddha

[permalink] [raw]
Subject: Re: [PATCH] x86: Document the hotplug code is incompatible with x86 irq handling

On Wed, Jun 13, 2007 at 12:16:08AM +0200, Rafael J. Wysocki wrote:
> On Tuesday, 12 June 2007 23:56, Siddha, Suresh B wrote:
> > On Tue, Jun 12, 2007 at 10:52:09PM +0200, Rafael J. Wysocki wrote:
> > > On Tuesday, 12 June 2007 20:19, Eric W. Biederman wrote:
> > > > Because you are calling unfixably broken code. That should be a decent
> > > > incentive to do something else won't it?
> > >
> > > Can you please tell me _what_ else can be done?
> > >
> > > > IOAPICs do not support what the code is doing here. There is lots of
> > > > practical evidence including bad experiences and practical tests that
> > > > support this.
> > >
> > > Well, AFAICS, Suresh has tried to debug one failing case recently without
> > > any consistent conclusions. I don't know of any other test cases (links,
> > > please?).
> >
> > Rafael, Darrick Wong's issue looks different and hence I was motivated to
> > look and fix if it was a SW issue. For now, I am not able to comprehend
> > what is happening on Darrick Wong's system. Need more help from Darrick
> > as he has the golden failing system.
> >
> > Meanwhile I talked to our hardware folks about the irq migration in general.
> >
> > One good news is that future versions of chipsets will have an Interrupt
> > Remapping feature(for more details please refer to
> > http://download.intel.com/technology/computing/vptech/Intel(r)_VT_for_Direct_IO.pdf)
> > where in we can reliably migrate the irq to someother cpu in the process
> > context.
> >
> > And for the existing platforms, chipset folks don't see a reason why the
> > Eric's algorithm(specified below) should fail.
> >
> > Eric's algorithm for level triggered(edge triggered should be easier than
> > level triggered):
> > - Mask the irq in process context.
> > - Poll RIRR until an ack of for the irq was not pending.
> > - Migrate the irq.
> >
> > Eric had a problem of stuck remote IRR on E75xx chipset with this algorithm
> > and my next step is to reproduce this issue on this platform and understand
> > the behavior.
>
> OK
>
> In that case, do I understand correctly that we are going to implement the
> Eric's algorithm above for the CPU hotunplugging on x86 once you've figured
> out what's the E75xx issue?

That is the idea. Thanks.

2007-06-12 22:52:17

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [PATCH] x86: Document the hotplug code is incompatible with x86 irq handling

On Wednesday, 13 June 2007 00:24, Siddha, Suresh B wrote:
> On Wed, Jun 13, 2007 at 12:16:08AM +0200, Rafael J. Wysocki wrote:
> > On Tuesday, 12 June 2007 23:56, Siddha, Suresh B wrote:
> > > On Tue, Jun 12, 2007 at 10:52:09PM +0200, Rafael J. Wysocki wrote:
> > > > On Tuesday, 12 June 2007 20:19, Eric W. Biederman wrote:
> > > > > Because you are calling unfixably broken code. That should be a decent
> > > > > incentive to do something else won't it?
> > > >
> > > > Can you please tell me _what_ else can be done?
> > > >
> > > > > IOAPICs do not support what the code is doing here. There is lots of
> > > > > practical evidence including bad experiences and practical tests that
> > > > > support this.
> > > >
> > > > Well, AFAICS, Suresh has tried to debug one failing case recently without
> > > > any consistent conclusions. I don't know of any other test cases (links,
> > > > please?).
> > >
> > > Rafael, Darrick Wong's issue looks different and hence I was motivated to
> > > look and fix if it was a SW issue. For now, I am not able to comprehend
> > > what is happening on Darrick Wong's system. Need more help from Darrick
> > > as he has the golden failing system.
> > >
> > > Meanwhile I talked to our hardware folks about the irq migration in general.
> > >
> > > One good news is that future versions of chipsets will have an Interrupt
> > > Remapping feature(for more details please refer to
> > > http://download.intel.com/technology/computing/vptech/Intel(r)_VT_for_Direct_IO.pdf)
> > > where in we can reliably migrate the irq to someother cpu in the process
> > > context.
> > >
> > > And for the existing platforms, chipset folks don't see a reason why the
> > > Eric's algorithm(specified below) should fail.
> > >
> > > Eric's algorithm for level triggered(edge triggered should be easier than
> > > level triggered):
> > > - Mask the irq in process context.
> > > - Poll RIRR until an ack of for the irq was not pending.
> > > - Migrate the irq.
> > >
> > > Eric had a problem of stuck remote IRR on E75xx chipset with this algorithm
> > > and my next step is to reproduce this issue on this platform and understand
> > > the behavior.
> >
> > OK
> >
> > In that case, do I understand correctly that we are going to implement the
> > Eric's algorithm above for the CPU hotunplugging on x86 once you've figured
> > out what's the E75xx issue?
>
> That is the idea. Thanks.

OK, thanks.

Eric, would you agree to follow this plan without making the entire CPU hotplug
(on x86) depend on BROKEN?

Rafael


--
"Premature optimization is the root of all evil." - Donald Knuth

2007-06-22 17:29:17

by Eric W. Biederman

[permalink] [raw]
Subject: Re: [PATCH] x86: Document the hotplug code is incompatible with x86 irq handling

"Rafael J. Wysocki" <[email protected]> writes:
>
> OK, thanks.
>
> Eric, would you agree to follow this plan without making the entire CPU hotplug
> (on x86) depend on BROKEN?


Sorry for the delay, I missed this email. If we can actually move irq
migration into process context on x86. I would be happy to.

Currently I am dubious if this can be done reliably. But I think
the benefits of doing irq migration outside of the irq handler
are sufficient to give another go at making it work.

However this isn't just Intel's ioapics we have to worry about,
but Intel's were the worst so that is a reasonable starting point.

Eric