2007-09-27 21:29:36

by Mark Lord

Subject: Problems with SMP & ACPI powering off

Question: do we disable all CPUs except 0 when doing ACPI power off?

Background:
I have a machine here dedicated to running MythTV.
It powers up to record, and then sets the RTC alarm for next time
and powers down again in between recordings.

It has an Intel Core2duo E6300 CPU, currently on an ICH8 motherboard.
Previously it was on a completely different (vendor,bios,...) ICH7 motherboard.

In both cases, "halt -p" sometimes fails to actually turn off the power,
which means that it later then fails to "turn on" to record again.

Annoying.

This is a 32-bit kernel/runtime, with full ACPI (not APM) kernel support enabled.

So I'm wondering if it may be due to the old SMP-poweroff bogeyman ?

For now, I've hardcoded a cpu_down(1) into the poweroff code,
and we'll see if that helps or is merely redundant.

But I do wonder where else to look for a cause?

Two different boards, vendors, BIOSs, same CPU chip. Same problem.

????


2007-09-27 21:30:19

by Mark Lord

Subject: Re: Problems with SMP & ACPI powering off

Mark Lord wrote:
> Question: do we disable all CPUs except 0 when doing ACPI power off?
>
> Background:
> I have a machine here dedicated to running MythTV.
> It powers up to record, and then sets the RTC alarm for next time
> and powers down again in between recordings.
>
> It has an Intel Core2duo E6300 CPU, currently on an ICH8 motherboard.
> Previously it was on a completely different (vendor,bios,...) ICH7
> motherboard.
>
> In both cases, "halt -p" sometimes fails to actually turn off the power,
> which means that it later then fails to "turn on" to record again.
>
> Annoying.
>
> This is a 32-bit kernel/runtime, with full ACPI (not APM) kernel support
> enabled.
>
> So I'm wondering if it may be due to the old SMP-poweroff bogeyman ?
>
> For now, I've hardcoded a cpu_down(1) into the poweroff code,
> and we'll see if that helps or is merely redundant.
>
> But I do wonder where else to look for a cause?
>
> Two different boards, vendors, BIOSs, same CPU chip. Same problem.

Oh, and two different power-supplies, too.

-ml

2007-09-27 21:46:41

by Rafael J. Wysocki

Subject: Re: Problems with SMP & ACPI powering off

On Thursday, 27 September 2007 23:29, Mark Lord wrote:
> Question: do we disable all CPUs except 0 when doing ACPI power off?

No, but we should.

> Background:
> I have a machine here dedicated to running MythTV.
> It powers up to record, and then sets the RTC alarm for next time
> and powers down again in between recordings.
>
> It has an Intel Core2duo E6300 CPU, currently on an ICH8 motherboard.
> Previously it was on a completely different (vendor,bios,...) ICH7 motherboard.
>
> In both cases, "halt -p" sometimes fails to actually turn off the power,
> which means that it later then fails to "turn on" to record again.
>
> Annoying.
>
> This is a 32-bit kernel/runtime, with full ACPI (not APM) kernel support enabled.
>
> So I'm wondering if it may be due to the old SMP-poweroff bogeyman ?

May be.

Which kernel?

> For now, I've hardcoded a cpu_down(1) into the poweroff code,
> and we'll see if that helps or is merely redundant.
>
> But I do wonder where else to look for a cause?
>
> Two different boards, vendors, BIOSs, same CPU chip. Same problem.

Same chipset, perchance?

Greetings,
Rafael

2007-09-27 23:07:40

by Mark Lord

Subject: Re: Problems with SMP & ACPI powering off

Rafael J. Wysocki wrote:
> On Thursday, 27 September 2007 23:29, Mark Lord wrote:
>> Question: do we disable all CPUs except 0 when doing ACPI power off?
>
> No, but we should.
>
>> Background:
>> I have a machine here dedicated to running MythTV.
>> It powers up to record, and then sets the RTC alarm for next time
>> and powers down again in between recordings.
>>
>> It has an Intel Core2duo E6300 CPU, currently on an ICH8 motherboard.
>> Previously it was on a completely different (vendor,bios,...) ICH7 motherboard.
>>
>> In both cases, "halt -p" sometimes fails to actually turn off the power,
>> which means that it later then fails to "turn on" to record again.
>>
>> Annoying.
>>
>> This is a 32-bit kernel/runtime, with full ACPI (not APM) kernel support enabled.
>>
>> So I'm wondering if it may be due to the old SMP-poweroff bogeyman ?
>
> May be.
>
> Which kernel?

Latest 2.6.23-rc-git. Same problem from time to time on 2.6.17, as well.
Dunno about in between those Revs., but it's much more common on the latest
than it was on the old kernel.

>
>> For now, I've hardcoded a cpu_down(1) into the poweroff code,
>> and we'll see if that helps or is merely redundant.
>>
>> But I do wonder where else to look for a cause?
>>
>> Two different boards, vendors, BIOSs, same CPU chip. Same problem.
>
> Same chipset, perchance?

Mmmm I originally didn't think so.

But actually one board is ICH8, the other ICH8R,
so yes, they use the same chipset.

Cheers

2007-09-28 04:58:59

by Len Brown

Subject: Re: Problems with SMP & ACPI powering off

On Thursday 27 September 2007 18:00, Rafael J. Wysocki wrote:
> On Thursday, 27 September 2007 23:29, Mark Lord wrote:
> > Question: do we disable all CPUs except 0 when doing ACPI power off?
>
> No, but we should.

We used to.
It is absolutely mandatory -- else it confuses the BIOS on some boards
b/c it isn't expecting SMM to get entered from other than cpu0.

-Len

2007-09-28 12:41:23

by Rafael J. Wysocki

Subject: Re: Problems with SMP & ACPI powering off

On Friday, 28 September 2007 06:57, Len Brown wrote:
> On Thursday 27 September 2007 18:00, Rafael J. Wysocki wrote:
> > On Thursday, 27 September 2007 23:29, Mark Lord wrote:
> > > Question: do we disable all CPUs except 0 when doing ACPI power off?
> >
> > No, but we should.
>
> We used to.
> It is absolutely mandatory -- else it confuses the BIOS on some boards
> b/c it isn't expecting SMM to get entered from other than cpu0.

Can we use the CPU hotplug for that, like in the suspend/hibernation case?

Rafael

2007-09-28 13:22:45

by Mark Lord

Subject: Re: Problems with SMP & ACPI powering off

Rafael J. Wysocki wrote:
> On Friday, 28 September 2007 06:57, Len Brown wrote:
>> On Thursday 27 September 2007 18:00, Rafael J. Wysocki wrote:
>>> On Thursday, 27 September 2007 23:29, Mark Lord wrote:
>>>> Question: do we disable all CPUs except 0 when doing ACPI power off?
>>> No, but we should.
>> We used to.
>> It is absolutely mandatory -- else it confuses the BIOS on some boards
>> b/c it isn't expecting SMM to get entered from other than cpu0.
>
> Can we use the CPU hotplug for that, like in the suspend/hibernation case?

Well, so far it's working: about ten poweroffs since I patched it,
and no issues with any of them. Prior to that, it seemed like about
one in five poweroffs wouldn't (power off).

It'll take a lot more testing to confirm, though.

What can I call to determine if more than one CPU is enabled, anyway?

Here's the hack I'm using here, very situation (2 cores) specific,
and it still has some printk's leftover with a sleep so I have time
to read them before the lights go out. :)

--- old/arch/i386/kernel/reboot.c 2007-09-27 17:17:00.000000000 -0400
+++ linux/arch/i386/kernel/reboot.c 2007-09-27 17:15:35.000000000 -0400
@@ -393,8 +393,22 @@
 	.halt = native_machine_halt,
 };

+static void kill_cpu1(void)
+{
+	extern int cpu_down(unsigned int cpu);
+
+	printk(KERN_EMERG "kill_cpu1: was running on CPU%d\n", smp_processor_id());
+	/* Some bioses don't like being called from CPU != 0 */
+	set_cpus_allowed(current, cpumask_of_cpu(0));
+	printk(KERN_EMERG "kill_cpu1: now running on CPU%d\n", smp_processor_id());
+	cpu_down(1);
+	printk(KERN_EMERG "kill_cpu1: done\n");
+	msleep(1000);
+}
+
 void machine_power_off(void)
 {
+	(void)kill_cpu1();
 	machine_ops.power_off();
 }
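
On the question of how to tell whether more than one CPU is enabled: inside the kernel, num_online_cpus() reports how many CPUs are currently online. The same check can be tried from user space as a rough sketch, assuming a libc that supports the _SC_NPROCESSORS_ONLN extension to sysconf() (glibc does). The function names below are made up for illustration.

```c
#include <assert.h>
#include <unistd.h>

/* Number of CPUs currently online, as reported to user space. */
long online_cpu_count(void)
{
	long n = sysconf(_SC_NPROCESSORS_ONLN);

	/* sysconf() returns -1 on error; a running system has at least one CPU. */
	assert(n >= 1);
	return n;
}

/* Non-zero when more than one CPU is online right now. */
int more_than_one_cpu_online(void)
{
	return online_cpu_count() > 1;
}
```

On the dual-core box described above, more_than_one_cpu_online() would be expected to return non-zero until the second core is taken down.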

2007-09-28 13:30:19

by Rafael J. Wysocki

Subject: Re: Problems with SMP & ACPI powering off

On Friday, 28 September 2007 15:22, Mark Lord wrote:
> Rafael J. Wysocki wrote:
> > On Friday, 28 September 2007 06:57, Len Brown wrote:
> >> On Thursday 27 September 2007 18:00, Rafael J. Wysocki wrote:
> >>> On Thursday, 27 September 2007 23:29, Mark Lord wrote:
> >>>> Question: do we disable all CPUs except 0 when doing ACPI power off?
> >>> No, but we should.
> >> We used to.
> >> It is absolutely mandatory -- else it confuses the BIOS on some boards
> >> b/c it isn't expecting SMM to get entered from other than cpu0.
> >
> > Can we use the CPU hotplug for that, like in the suspend/hibernation case?
>
> Well, so far it's working: about ten poweroffs since I patched it,
> and no issues with any of them. Prior to that, it seemed like about
> one in five poweroffs wouldn't (power off).
>
> It'll take a lot more testing to confirm, though.
>
> What can I call to determine if more than one CPU is enabled, anyway?
>
> Here's the hack I'm using here, very situation (2 cores) specific,
> and it still has some printk's leftover with a sleep so I have time
> to read them before the lights go out. :)

Well, we have disable_nonboot_cpus() that we use for suspend and that is
supposed to be general.

The question is when to call it.

Greetings,
Rafael

2007-09-28 13:46:41

by Mark Lord

Subject: Re: Problems with SMP & ACPI powering off

Rafael J. Wysocki wrote:
> On Friday, 28 September 2007 15:22, Mark Lord wrote:
>> Rafael J. Wysocki wrote:
>>> On Friday, 28 September 2007 06:57, Len Brown wrote:
>>>> On Thursday 27 September 2007 18:00, Rafael J. Wysocki wrote:
>>>>> On Thursday, 27 September 2007 23:29, Mark Lord wrote:
>>>>>> Question: do we disable all CPUs except 0 when doing ACPI power off?
>>>>> No, but we should.
>>>> We used to.
>>>> It is absolutely mandatory -- else it confuses the BIOS on some boards
>>>> b/c it isn't expecting SMM to get entered from other than cpu0.
>>> Can we use the CPU hotplug for that, like in the suspend/hibernation case?
>> Well, so far it's working: about ten poweroffs since I patched it,
>> and no issues with any of them. Prior to that, it seemed like about
>> one in five poweroffs wouldn't (power off).
>>
>> It'll take a lot more testing to confirm, though.
>>
>> What can I call to determine if more than one CPU is enabled, anyway?
>>
>> Here's the hack I'm using here, very situation (2 cores) specific,
>> and it still has some printk's leftover with a sleep so I have time
>> to read them before the lights go out. :)
>
> Well, we have disable_nonboot_cpus() that we use for suspend and that is
> supposed to be general.
>
> The question is when to call it.

How about from the obvious candidate: kernel/sys.c::kernel_power_off() ?

I'll rework my hack into a proper patch there,
repost it here, and test with it for another day or two.

-ml

2007-09-28 13:52:45

by Mark Lord

Subject: [PATCH] disable non-boot CPUs before poweroff


We need to disable all CPUs other than the boot CPU (usually 0)
before attempting to power-off modern SMP machines.
This seems to fix the hang-on-poweroff issue
that one of my SMP boxes exhibits. More testing required.

Signed-off-by: Mark Lord <[email protected]>
---

--- linux/kernel/sys.c.orig 2007-09-13 09:49:11.000000000 -0400
+++ linux/kernel/sys.c 2007-09-28 09:48:54.000000000 -0400
@@ -32,6 +32,7 @@
 #include <linux/getcpu.h>
 #include <linux/task_io_accounting_ops.h>
 #include <linux/seccomp.h>
+#include <linux/cpu.h>

 #include <linux/compat.h>
 #include <linux/syscalls.h>
@@ -879,6 +880,7 @@
 	if (pm_power_off_prepare)
 		pm_power_off_prepare();
 	sysdev_shutdown();
+	disable_nonboot_cpus();
 	printk(KERN_EMERG "Power down.\n");
 	machine_power_off();
 }

2007-09-28 14:12:10

by Mark Lord

Subject: Re: [PATCH] disable non-boot CPUs before poweroff

Mark Lord wrote:
>
> We need to disable all CPUs other than the boot CPU (usually 0)
> before attempting to power-off modern SMP machines.
> This seems to fix the hang-on-poweroff issue
> that one of my SMP boxes exhibits. More testing required.
>
> Signed-off-by: Mark Lord <[email protected]>
> ---
>
> --- linux/kernel/sys.c.orig 2007-09-13 09:49:11.000000000 -0400
> +++ linux/kernel/sys.c 2007-09-28 09:48:54.000000000 -0400
> @@ -32,6 +32,7 @@
> #include <linux/getcpu.h>
> #include <linux/task_io_accounting_ops.h>
> #include <linux/seccomp.h>
> +#include <linux/cpu.h>
>
> #include <linux/compat.h>
> #include <linux/syscalls.h>
> @@ -879,6 +880,7 @@
> if (pm_power_off_prepare)
> pm_power_off_prepare();
> sysdev_shutdown();
> + disable_nonboot_cpus();
> printk(KERN_EMERG "Power down.\n");
> machine_power_off();
> }

Okay, verified now. Prior to this patch, *both* CPUs were still up and running
when machine_power_off() got called, and there was no guarantee that CPU0 was
the one calling machine_power_off(). BUG.

The above patch guarantees that only the single boot CPU is running
and calling machine_power_off().

Hopefully this buries the SMP-power-off bogeyman for good!

Cheers

2007-09-28 14:51:00

by Rafael J. Wysocki

Subject: Re: [PATCH] disable non-boot CPUs before poweroff

On Friday, 28 September 2007 15:52, Mark Lord wrote:
>
> We need to disable all CPUs other than the boot CPU (usually 0)
> before attempting to power-off modern SMP machines.
> This seems to fix the hang-on-poweroff issue
> that one of my SMP boxes exhibits. More testing required.
>
> Signed-off-by: Mark Lord <[email protected]>
> ---
>
> --- linux/kernel/sys.c.orig 2007-09-13 09:49:11.000000000 -0400
> +++ linux/kernel/sys.c 2007-09-28 09:48:54.000000000 -0400
> @@ -32,6 +32,7 @@
> #include <linux/getcpu.h>
> #include <linux/task_io_accounting_ops.h>
> #include <linux/seccomp.h>
> +#include <linux/cpu.h>
>
> #include <linux/compat.h>
> #include <linux/syscalls.h>
> @@ -879,6 +880,7 @@
> if (pm_power_off_prepare)
> pm_power_off_prepare();
> sysdev_shutdown();
> + disable_nonboot_cpus();

Before sysdev_shutdown(), please.

sysdev_shutdown() may touch things that belong to CPU0.

> printk(KERN_EMERG "Power down.\n");
> machine_power_off();
> }

Greetings,
Rafael

2007-09-28 14:55:52

by Thomas Gleixner

Subject: Re: [PATCH] disable non-boot CPUs before poweroff


On Fri, 2007-09-28 at 09:52 -0400, Mark Lord wrote:
> We need to disable all CPUs other than the boot CPU (usually 0)
> before attempting to power-off modern SMP machines.
> This seems to fix the hang-on-poweroff issue
> that one of my SMP boxes exhibits. More testing required.
>
> Signed-off-by: Mark Lord <[email protected]>

Fixes my new toybox as well. Thanks for tracking it down before I had to
dig in.

Acked-by: Thomas Gleixner <[email protected]>

> ---
>
> --- linux/kernel/sys.c.orig 2007-09-13 09:49:11.000000000 -0400
> +++ linux/kernel/sys.c 2007-09-28 09:48:54.000000000 -0400
> @@ -32,6 +32,7 @@
> #include <linux/getcpu.h>
> #include <linux/task_io_accounting_ops.h>
> #include <linux/seccomp.h>
> +#include <linux/cpu.h>
>
> #include <linux/compat.h>
> #include <linux/syscalls.h>
> @@ -879,6 +880,7 @@
> if (pm_power_off_prepare)
> pm_power_off_prepare();
> sysdev_shutdown();
> + disable_nonboot_cpus();
> printk(KERN_EMERG "Power down.\n");
> machine_power_off();
> }

2007-09-28 15:02:44

by Thomas Gleixner

Subject: Re: [PATCH] disable non-boot CPUs before poweroff

On Fri, 2007-09-28 at 17:05 +0200, Rafael J. Wysocki wrote:
> > if (pm_power_off_prepare)
> > pm_power_off_prepare();
> > sysdev_shutdown();
> > + disable_nonboot_cpus();
>
> Before sysdev_shutdown(), please.
>
> sysdev_shutdown() may touch things that belong to CPU0.

Damn, you're right. Missed that.

tglx




2007-09-28 19:53:29

by Mark Lord

Subject: [PATCH] (repost) Fix SMP poweroff hangs

Rafael J. Wysocki wrote:
> ..
>> @@ -879,6 +880,7 @@
>> if (pm_power_off_prepare)
>> pm_power_off_prepare();
>> sysdev_shutdown();
>> + disable_nonboot_cpus();
>
> Before sysdev_shutdown(), please.
>
> sysdev_shutdown() may touch things that belong to CPU0.

Thanks. Here is the revised patch.

* * *

We need to disable all CPUs other than the boot CPU (usually 0)
before attempting to power-off modern SMP machines.
This fixes the hang-on-poweroff issue on my MythTV SMP box,
and also on Thomas Gleixner's new toybox.

Signed-off-by: Mark Lord <[email protected]>
Acked-by: Thomas Gleixner <[email protected]>
---

--- linux/kernel/sys.c.orig 2007-09-13 09:49:11.000000000 -0400
+++ linux/kernel/sys.c 2007-09-28 15:48:54.000000000 -0400
@@ -32,6 +32,7 @@
 #include <linux/getcpu.h>
 #include <linux/task_io_accounting_ops.h>
 #include <linux/seccomp.h>
+#include <linux/cpu.h>

 #include <linux/compat.h>
 #include <linux/syscalls.h>
@@ -878,6 +879,7 @@
 	kernel_shutdown_prepare(SYSTEM_POWER_OFF);
 	if (pm_power_off_prepare)
 		pm_power_off_prepare();
+	disable_nonboot_cpus();
 	sysdev_shutdown();
 	printk(KERN_EMERG "Power down.\n");
 	machine_power_off();
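
As a toy model of what the patch relies on: disable_nonboot_cpus() is expected to leave exactly one CPU online before the shutdown continues. The sketch below mimics that with a plain bitmask in user space. It assumes the boot CPU is the lowest-numbered online CPU. All names in it are invented for illustration, and none of this is the kernel implementation.

```c
#include <assert.h>

/* Toy model: bit i of the mask is set when CPU i is online. */
typedef unsigned int cpu_mask_t;

/* Lowest-numbered online CPU, taken here to be the boot CPU. */
int model_first_online_cpu(cpu_mask_t online)
{
	int cpu = 0;

	assert(online != 0);	/* at least one CPU must be running */
	while (!(online & (1u << cpu)))
		cpu++;
	return cpu;
}

/* Model of disable_nonboot_cpus(): only the boot CPU stays online. */
cpu_mask_t model_disable_nonboot_cpus(cpu_mask_t online)
{
	return 1u << model_first_online_cpu(online);
}

/* Count the CPUs still online in the mask. */
int model_num_online(cpu_mask_t online)
{
	int n = 0;

	for (; online; online &= online - 1)
		n++;
	return n;
}
```

With both cores of a dual-core machine online (mask 0x3), the model leaves mask 0x1, so whatever runs next can only be running on CPU 0.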

2007-09-30 09:00:52

by Santiago Garcia Mantinan

Subject: Re: [PATCH] (repost) Fix SMP poweroff hangs

Doesn't fix it for me!

I have an Athlon X2 running on an Asus A8N-E mobo, which has an nForce 4
chipset. I thought this patch would fix poweroff for me too, but it doesn't.

I'm seeing this on 2.6.23-rc8 with and without your patch. Here is what I get
on the console:

Shutdown: hdd
Shutdown: hda
System halted.

Nothing else pops up.

Without your patch, if I pressed Ctrl+Alt+Del after the "System halted."
message, I could then see these messages:

md: stopping all md devices.
md: md1 still in use.

But no reboot would take place. I have not tested this Ctrl+Alt+Del thing
with your patch, but I think it still behaves like that (not rebooting).

I'm attaching my .config.

Regards...
--
Manty/BestiaTester -> http://manty.net


Attachments:
(No filename) (730.00 B)
config-2.6.23-rc8 (46.87 kB)

2007-09-30 17:21:54

by Mark Lord

Subject: Re: [PATCH] (repost) Fix SMP poweroff hangs

Santiago Garcia Mantinan wrote:
> Doesn't fix it for me!
>
> I have an Athlon X2 running on an Asus A8N-E mobo, which has an nForce 4
> chipset. I thought this patch would fix poweroff for me too, but it doesn't.
>
> I'm seeing this on 2.6.23-rc8 with and without your patch. Here is what I get
> on the console:
>
> Shutdown: hdd
> Shutdown: hda
> System halted.
>
> Nothing else pops up.

I'd say your problem is more of a distro issue,
in that the method you are using to shutdown
is not actually requesting "poweroff".

That last message above ("System halted.") comes from kernel_halt(),
rather than the expected message ("Power down.") from kernel_power_off().

So, try using the "poweroff" command instead of "halt",
or try using "halt -p". If neither of those work,
then edit /etc/init.d/halt and hardcode the "-p" parameter
inside there onto the "halt" command line(s).

I had to do that frequently back in the Redhat/Fedora days.
I'm sure they have a nice GUI for it somewhere,
but at the time it was simpler to just edit the script.

Cheers

2007-09-30 17:55:04

by Santiago Garcia Mantinan

Subject: Re: [PATCH] (repost) Fix SMP poweroff hangs

> I'd say your problem is more of a distro issue,
> in that the method you are using to shutdown
> is not actually requesting "poweroff".

> That last message above ("System halted.") comes from kernel_halt(),
> rather than the expected message ("Power down.") from kernel_power_off().

> So, try using the "poweroff" command instead of "halt",
> or try using "halt -p". If neither of those work,

Well it works ok with 2.6.22 powering off and saying so right before
powering off, with some references to ACPI. On 2.6.23-rc8 however it doesn't
seem to get that far.

I have followed the poweroff of my distro (Debian unstable) and on getting
to the end of rc 0 it calls halt with options -d -f -i -p. So it does call
it with the -p you asked for. BTW, this halt comes from Debian's sysvinit
version 2.86.ds1-38.1 in case that matters, but as I said it is working ok
for .22.


> then edit /etc/init.d/halt and hardcode the "-p" parameter
> inside there onto the "halt" command line(s).

That init.d/halt is the one that is calling halt with -d -f -i -p already.

Regards...
--
Manty/BestiaTester -> http://manty.net

2007-09-30 18:47:37

by Mark Lord

Subject: Re: [PATCH] (repost) Fix SMP poweroff hangs

Santiago Garcia Mantinan wrote:
>> I'd say your problem is more of a distro issue,
>> in that the method you are using to shutdown
>> is not actually requesting "poweroff".
>
>> That last message above ("System halted.") comes from kernel_halt(),
>> rather than the expected message ("Power down.") from kernel_power_off().
>
>> So, try using the "poweroff" command instead of "halt",
>> or try using "halt -p". If neither of those work,
>
> Well it works ok with 2.6.22 powering off and saying so right before
> powering off, with some references to ACPI. On 2.6.23-rc8 however it doesn't
> seem to get that far.
>
> I have followed the poweroff of my distro (Debian unstable) and on getting
> to the end of rc 0 it calls halt with options -d -f -i -p. So it does call
> it with the -p you asked for. BTW, this halt comes from Debian's sysvinit
> version 2.86.ds1-38.1 in case that matters, but as I said it is working ok
> for .22.
>
>
>> then edit /etc/init.d/halt and hardcode the "-p" parameter
>> inside there onto the "halt" command line(s).
>
> That init.d/halt is the one that is calling halt with -d -f -i -p already.

Mmm.. then I wonder why it is not actually getting into the kernel's poweroff function?

Can you boot into single user mode ( kernel command line parameter of S ),
and then remount your filesystems all r/o (ALT-SysRQ-u + ALT-SysRQ-s,
or do it all manually if you prefer).

Then manually try this command from the primary console (Ctrl-ALT-F1):

/sbin/halt -f -p

The machine *should* poweroff.
If not, then do the whole thing again with this command:

strace /sbin/halt -f -p

And see what the final syscall + parameters is. Post them here if you can.

Cheers

2007-09-30 20:03:51

by Santiago Garcia Mantinan

Subject: Re: [PATCH] (repost) Fix SMP poweroff hangs

I booted into single-user mode, then unmounted all unneeded stuff, remounted /
read-only, stopped all unused raids, ... then did...

> /sbin/halt -f -p
>
> The machine *should* poweroff.

Nope, it didn't and what was curious was that I was left at the bash prompt
:-?

> If not, then do the whole thing again with this command:
>
> strace /sbin/halt -f -p

I tried this, running strace with -o to write the output to a remote
cifs fs. Here is the full trace:

execve("/sbin/halt", ["/sbin/halt", "-f", "-p"], [/* 16 vars */]) = 0
brk(0) = 0x804b000
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
mmap2(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7f29000
access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=52448, ...}) = 0
mmap2(NULL, 52448, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb7f1c000
close(3) = 0
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
open("/lib/libc.so.6", O_RDONLY) = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\260a\1"..., 512) = 512
fstat64(3, {st_mode=S_IFREG|0755, st_size=1335536, ...}) = 0
mmap2(NULL, 1340944, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xb7dd4000
mmap2(0xb7f16000, 12288, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x142) = 0xb7f16000
mmap2(0xb7f19000, 9744, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0xb7f19000
close(3) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7dd3000
set_thread_area({entry_number:-1 -> 6, base_addr:0xb7dd36b0, limit:1048575, seg_32bit:1, contents:0, read_exec_only:0, limit_in_pages:1, seg_not_present:0, useable:1}) = 0
mprotect(0xb7f16000, 4096, PROT_READ) = 0
munmap(0xb7f1c000, 52448) = 0
geteuid32() = 0
chdir("/") = 0
open("/var/log/wtmp", O_WRONLY|O_APPEND) = -1 EROFS (Read-only file system)
sync() = 0
rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
rt_sigaction(SIGCHLD, NULL, {SIG_DFL}, 8) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
nanosleep({2, 0}, {2, 0}) = 0
reboot(LINUX_REBOOT_MAGIC1, LINUX_REBOOT_MAGIC2, LINUX_REBOOT_CMD_RESTART|0x88888888) = 0
kill(1, SIGTSTP) = 0
reboot(LINUX_REBOOT_MAGIC1, LINUX_REBOOT_MAGIC2, LINUX_REBOOT_CMD_POWER_OFF <unfinished ... exit status 0>

Same output on the screen as I commented before, again I got to the shell
and this time I even typed some commands like ps and amazingly they still
worked, even though this was supposed to be halted.

Hope that helps.
--
Manty/BestiaTester -> http://manty.net

2007-09-30 22:53:05

by Mark Lord

Subject: Re: 32-bit Athlon X2 won't poweroff (was: Fix SMP poweroff hangs)

Santiago Garcia Mantinan wrote:
> I booted into single-user mode, then unmounted all unneeded stuff, remounted /
> read-only, stopped all unused raids, ... then did...
..
>> strace /sbin/halt -f -p
..
> rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
> rt_sigaction(SIGCHLD, NULL, {SIG_DFL}, 8) = 0
> rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
> nanosleep({2, 0}, {2, 0}) = 0
> reboot(LINUX_REBOOT_MAGIC1, LINUX_REBOOT_MAGIC2, LINUX_REBOOT_CMD_RESTART|0x88888888) = 0
> kill(1, SIGTSTP) = 0
> reboot(LINUX_REBOOT_MAGIC1, LINUX_REBOOT_MAGIC2, LINUX_REBOOT_CMD_POWER_OFF <unfinished ... exit status 0>
>
> Same output on the screen as I commented before, again I got to the shell
> and this time I even typed some commands like ps and amazingly they still
> worked, even though this was supposed to be halted.

Mmm.. okay, user space is doing the right things.
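
One detail of that trace is worth decoding. strace appears to have printed the first reboot() argument as a known constant plus leftover bits. OR-ing the two parts back together gives 0x89ABCDEF, which <linux/reboot.h> defines as LINUX_REBOOT_CMD_CAD_ON, so the first call most likely only enabled Ctrl-Alt-Del handling and the real request is the LINUX_REBOOT_CMD_POWER_OFF call on the last line. A minimal check of that arithmetic follows. The MODEL_ names and both helpers are local to this sketch, not kernel or strace identifiers.

```c
#include <stdint.h>

/* Values as assigned in <linux/reboot.h>, renamed so the sketch stands alone. */
#define MODEL_CMD_RESTART   UINT32_C(0x01234567)
#define MODEL_CMD_CAD_ON    UINT32_C(0x89ABCDEF)
#define MODEL_CMD_POWER_OFF UINT32_C(0x4321FEDC)

/* Recombine a "NAME|0x..." pair into the raw argument value. */
uint32_t combine_flags(uint32_t named, uint32_t rest)
{
	return named | rest;
}

/* Non-zero when RESTART|0x88888888 is really the CAD_ON command. */
int first_call_is_cad_on(void)
{
	return combine_flags(MODEL_CMD_RESTART, UINT32_C(0x88888888))
		== MODEL_CMD_CAD_ON;
}
```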

So next is inside the kernel itself, at linux/kernel/sys.c :: sys_reboot(),
where we see this code:

	/* Instead of trying to make the power_off code look like
	 * halt when pm_power_off is not set do it the easy way.
	 */
	if ((cmd == LINUX_REBOOT_CMD_POWER_OFF) && !pm_power_off)
		cmd = LINUX_REBOOT_CMD_HALT;

This converts a "poweroff" into a "reboot" if no machine dependent
power off function has been bound in (pm_power_off() is a function pointer).

So for this to work, I believe that either ACPI or APM has to have been
configured into the kernel (and the modules loaded). Your kernel .config
from earlier shows ACPI built-in to the kernel core, so it should be present.

Unless you booted with noacpi or some such parameter..
So let's have a look at the kernel boot logs,
and you could also try CONFIG_ACPI_DEBUG=y

Bizarre (and nothing to do with my patch).

2007-09-30 22:57:25

by Mark Lord

Subject: Re: 32-bit Athlon X2 won't poweroff

Mark Lord wrote:
>..
> So next is inside the kernel itself, at linux/kernel/sys.c :: sys_reboot(),
> where we see this code:
>
> /* Instead of trying to make the power_off code look like
> * halt when pm_power_off is not set do it the easy way.
> */
> if ((cmd == LINUX_REBOOT_CMD_POWER_OFF) && !pm_power_off)
> cmd = LINUX_REBOOT_CMD_HALT;
>
> This converts a "poweroff" into a "reboot" if no machine dependent
> power off function has been bound in (pm_power_off() is a function
> pointer).

Duh.. fingers failed to follow brain: that converts a "poweroff" into a "halt",
which is what you are seeing.

> So for this to work, I believe that either ACPI or APM has to have been
> configured into the kernel (and the modules loaded). Your kernel .config
> from earlier shows ACPI built-in to the kernel core, so it should be
> present.
>
> Unless you booted with noacpi or some such parameter..
> So let's have a look at the kernel boot logs,
> and you could also try CONFIG_ACPI_DEBUG=y
>
> Bizarre (and nothing to do with my patch).
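
The fallback quoted above can be modelled in user space to make the behaviour concrete. This is only a sketch of the decision, not the kernel code: the command values are the ones <linux/reboot.h> assigns to LINUX_REBOOT_CMD_HALT and LINUX_REBOOT_CMD_POWER_OFF, while the MODEL_ names, effective_reboot_cmd() and dummy_power_off() are hypothetical.

```c
#include <stddef.h>

/* Same numeric values as the LINUX_REBOOT_CMD_* constants. */
#define MODEL_CMD_HALT      0xCDEF0123u
#define MODEL_CMD_POWER_OFF 0x4321FEDCu

/* Shape of the pm_power_off hook: a pointer to a void function. */
typedef void (*power_off_fn)(void);

/* Command that would be acted on after the fallback check. */
unsigned int effective_reboot_cmd(unsigned int cmd, power_off_fn pm_power_off)
{
	/* No platform power-off handler bound: degrade poweroff to halt. */
	if (cmd == MODEL_CMD_POWER_OFF && pm_power_off == NULL)
		cmd = MODEL_CMD_HALT;
	return cmd;
}

/* Stand-in for a handler such as the one ACPI would register. */
void dummy_power_off(void)
{
}
```

With a NULL handler, a power-off request comes back as the halt command, which would match the "System halted." message seen on the console. With any handler bound, the request is passed through unchanged.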

2007-10-01 16:19:34

by Santiago Garcia Mantinan

Subject: Re: 32-bit Athlon X2 won't poweroff (was: Fix SMP poweroff hangs)

> So for this to work, I believe that either ACPI or APM has to have been
> configured into the kernel (and the modules loaded). Your kernel .config
> from earlier shows ACPI built-in to the kernel core, so it should be
> present.

Yes, and it is indeed present: acpid is running, it detects my power button,
and it starts the poweroff when I hit it.

> Unless you booted with noacpi or some such parameter..
> So let's have a look at the kernel boot logs,

I believe this is normal. I have done a grep -i acpi on the dmesg; here is
the result. If you want the full dmesg, tell me:

BIOS-e820: 000000003fff0000 - 000000003fff3000 (ACPI NVS)
BIOS-e820: 000000003fff3000 - 0000000040000000 (ACPI data)
ACPI: RSDP 000F7560, 0014 (r0 Nvidia)
ACPI: RSDT 3FFF3040, 0030 (r1 Nvidia AWRDACPI 42302E31 AWRD 0)
ACPI: FACP 3FFF30C0, 0074 (r1 Nvidia AWRDACPI 42302E31 AWRD 0)
ACPI: DSDT 3FFF3180, 65F2 (r1 NVIDIA AWRDACPI 1000 MSFT 100000E)
ACPI: FACS 3FFF0000, 0040
ACPI: MCFG 3FFF9880, 003C (r1 Nvidia AWRDACPI 42302E31 AWRD 0)
ACPI: APIC 3FFF97C0, 007C (r1 Nvidia AWRDACPI 42302E31 AWRD 0)
Nvidia board detected. Ignoring ACPI timer override.
If you got timer trouble try acpi_use_timer_override
ACPI: PM-Timer IO Port: 0x4008
ACPI: Local APIC address 0xfee00000
ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled)
ACPI: LAPIC (acpi_id[0x01] lapic_id[0x01] enabled)
ACPI: LAPIC_NMI (acpi_id[0x00] high edge lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x01] high edge lint[0x1])
ACPI: IOAPIC (id[0x02] address[0xfec00000] gsi_base[0])
ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
ACPI: BIOS IRQ0 pin2 override ignored.
ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
ACPI: INT_SRC_OVR (bus 0 bus_irq 14 global_irq 14 high edge)
ACPI: INT_SRC_OVR (bus 0 bus_irq 15 global_irq 15 high edge)
ACPI: IRQ9 used by override.
ACPI: IRQ14 used by override.
ACPI: IRQ15 used by override.
Using ACPI (MADT) for SMP configuration information
ACPI: Core revision 20070126
tbxface-0598 [00] tb_load_namespace : ACPI Tables successfully acquired
evxfevnt-0091 [00] enable : Transition to ACPI mode successful
ACPI: bus type pci registered
ACPI: EC: Look up EC in DSDT
ACPI: Interpreter enabled
ACPI: Using IOAPIC for interrupt routing
ACPI: PCI Root Bridge [PCI0] (0000:00)
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.HUB0._PRT]
ACPI: PCI Interrupt Link [LNK1] (IRQs 3 4 5 7 9 10 11 12 14 15) *0, disabled.
ACPI: PCI Interrupt Link [LNK2] (IRQs 3 4 5 7 9 10 11 12 14 15) *0, disabled.
ACPI: PCI Interrupt Link [LNK3] (IRQs *3 4 5 7 9 10 11 12 14 15)
ACPI: PCI Interrupt Link [LNK4] (IRQs 3 4 5 7 9 10 11 12 14 15) *0, disabled.
ACPI: PCI Interrupt Link [LNK5] (IRQs 3 4 5 7 9 10 11 12 14 15) *0, disabled.
ACPI: PCI Interrupt Link [LUBA] (IRQs *3 4 5 7 9 10 11 12 14 15)
ACPI: PCI Interrupt Link [LUBB] (IRQs 3 4 5 7 9 10 11 12 14 15) *0, disabled.
ACPI: PCI Interrupt Link [LMAC] (IRQs 3 4 *5 7 9 10 11 12 14 15)
ACPI: PCI Interrupt Link [LACI] (IRQs 3 4 5 7 9 10 *11 12 14 15)
ACPI: PCI Interrupt Link [LMCI] (IRQs 3 4 5 7 9 10 11 12 14 15) *0, disabled.
ACPI: PCI Interrupt Link [LSMB] (IRQs 3 4 *5 7 9 10 11 12 14 15)
ACPI: PCI Interrupt Link [LUB2] (IRQs 3 4 5 7 9 10 *11 12 14 15)
ACPI: PCI Interrupt Link [LIDE] (IRQs 3 4 5 7 9 10 11 12 14 15) *0, disabled.
ACPI: PCI Interrupt Link [LSID] (IRQs 3 4 5 7 9 10 *11 12 14 15)
ACPI: PCI Interrupt Link [LFID] (IRQs 3 4 *5 7 9 10 11 12 14 15)
ACPI: PCI Interrupt Link [LPCA] (IRQs 3 4 5 7 9 10 11 12 14 15) *0, disabled.
ACPI: PCI Interrupt Link [APC1] (IRQs 16) *0, disabled.
ACPI: PCI Interrupt Link [APC2] (IRQs 17) *0, disabled.
ACPI: PCI Interrupt Link [APC3] (IRQs 18) *0
ACPI: PCI Interrupt Link [APC4] (IRQs 19) *0, disabled.
ACPI: PCI Interrupt Link [APC5] (IRQs *16), disabled.
ACPI: PCI Interrupt Link [APCF] (IRQs 20 21 22 23) *0
ACPI: PCI Interrupt Link [APCG] (IRQs 20 21 22 23) *0, disabled.
ACPI: PCI Interrupt Link [APCH] (IRQs 20 21 22 23) *0
ACPI: PCI Interrupt Link [APCJ] (IRQs 20 21 22 23) *0
ACPI: PCI Interrupt Link [APCK] (IRQs 20 21 22 23) *0, disabled.
ACPI: PCI Interrupt Link [APCS] (IRQs 20 21 22 23) *0
ACPI: PCI Interrupt Link [APCL] (IRQs 20 21 22 23) *0
ACPI: PCI Interrupt Link [APCZ] (IRQs 20 21 22 23) *0, disabled.
ACPI: PCI Interrupt Link [APSI] (IRQs 20 21 22 23) *0
ACPI: PCI Interrupt Link [APSJ] (IRQs 20 21 22 23) *0
ACPI: PCI Interrupt Link [APCP] (IRQs 20 21 22 23) *0, disabled.
pnp: PnP ACPI init
ACPI: bus type pnp registered
pnp: PnP ACPI: found 16 devices
ACPI: ACPI bus type pnp unregistered
PCI: Using ACPI for IRQ routing
Time: acpi_pm clocksource has been installed.
ACPI: Power Button (FF) [PWRF]
ACPI: Power Button (CM) [PWRB]
ACPI: Thermal Zone [THRM] (40 C)
ACPI: PCI Interrupt Link [APC3] enabled at IRQ 18
ACPI: PCI Interrupt 0000:05:08.0[A] -> Link [APC3] -> GSI 18 (level, low) -> IRQ 16
ACPI: PCI Interrupt Link [APCJ] enabled at IRQ 23
ACPI: PCI Interrupt 0000:00:04.0[A] -> Link [APCJ] -> GSI 23 (level, low) -> IRQ 17
ACPI: PCI Interrupt Link [APCH] enabled at IRQ 22
ACPI: PCI Interrupt 0000:00:0a.0[A] -> Link [APCH] -> GSI 22 (level, low) -> IRQ 18
ACPI: PCI Interrupt Link [APCF] enabled at IRQ 21
ACPI: PCI Interrupt 0000:00:02.0[A] -> Link [APCF] -> GSI 21 (level, low) -> IRQ 19
ACPI: PCI Interrupt Link [APCL] enabled at IRQ 20
ACPI: PCI Interrupt 0000:00:02.1[B] -> Link [APCL] -> GSI 20 (level, low) -> IRQ 20
parport_pc 00:09: reported by Plug and Play ACPI

> and you could also try CONFIG_ACPI_DEBUG=y

I did so (in fact the grep above was taken with that option enabled), but the
only new output was these two lines:

tbxface-0598 [00] tb_load_namespace : ACPI Tables successfully acquired
evxfevnt-0091 [00] enable : Transition to ACPI mode successful

Nothing new came out on the halt -f -p test :-(

If you guys have any other tests in mind, just let me know.

Regards...
--
Manty/BestiaTester -> http://manty.net

2007-10-01 16:37:25

by Mark Lord

Subject: Re: 32-bit Athlon X2 won't poweroff

Santiago Garcia Mantinan wrote:
>> So for this to work, I believe that either ACPI or APM has to have been
>> configured into the kernel (and the modules loaded). Your kernel .config
>> from earlier shows ACPI built-in to the kernel core, so it should be
>> present.
>
> Yes, and it is indeed, the acpid is running and it detects my power button
> and starts the poweroff when I hit it.
>
>> Unless you booted with noacpi or some such parameter..
>> So let's have a look at the kernel boot logs,
>
> I believe this is normal, I have done a grep -i acpi on the dmesg, here is
> the result, if you want the full dmesg tell me:
..
> ACPI: Interpreter enabled
> ACPI: Using IOAPIC for interrupt routing
..

The output is missing a line like this, which should have been between the two above:

ACPI: (supports S0 S3 S4 S5)

The ACPI power-off function only gets bound into pm_power_off()
when that line shows S5 on it.

The only way that line can be missing is if something disabled ACPI
after boot.
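
As a quick check before patching, a saved dmesg can be scanned for that line. The following is only a minimal sketch and not part of the original mail: the helper names `acpi_sleep_states` and `can_acpi_power_off` are hypothetical, and the regex assumes the line looks exactly like the example shown above.

```python
import re

# Assumed format, copied from the example line above:
#   ACPI: (supports S0 S3 S4 S5)
SUPPORTS_RE = re.compile(r"ACPI: \(supports((?: S\d)+)\)")


def acpi_sleep_states(dmesg_text):
    """Return the sleep states listed on the 'supports' line, or None if the line is absent."""
    match = SUPPORTS_RE.search(dmesg_text)
    if match is None:
        return None
    return match.group(1).split()


def can_acpi_power_off(dmesg_text):
    """True only if the 'supports' line exists and lists S5 (the soft-off state)."""
    states = acpi_sleep_states(dmesg_text)
    return states is not None and "S5" in states


# Hypothetical sample input, modelled on the quoted boot log with the line present.
sample = (
    "ACPI: Interpreter enabled\n"
    "ACPI: (supports S0 S3 S4 S5)\n"
    "ACPI: Using IOAPIC for interrupt routing\n"
)
print(acpi_sleep_states(sample), can_acpi_power_off(sample))
```

If the helper returns None for a full dmesg, the "supports" line is missing altogether, which is the situation described above.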

This patch (below) should find the culprit for you:

---


--- old/include/asm-i386/acpi.h 2007-09-28 18:09:14.000000000 -0400
+++ linux/include/asm-i386/acpi.h 2007-10-01 12:35:23.000000000 -0400
@@ -97,6 +97,7 @@
 extern int acpi_pci_disabled;
 static inline void disable_acpi(void)
 {
+	WARN_ON(1);
 	acpi_disabled = 1;
 	acpi_ht = 0;
 	acpi_pci_disabled = 1;

2007-10-01 19:50:49

by Rafael J. Wysocki

Subject: Re: [PATCH] (repost) Fix SMP poweroff hangs

On Sunday, 30 September 2007 19:54, Santiago Garcia Mantinan wrote:
> > I'd say your problem is more of a distro issue,
> > in that the method you are using to shutdown
> > is not actually requesting "poweroff".
>
> > That last mess above ("System halted.") comes from kernel_halt(),
> > rather than the expected message ("Power down.") from kernel_power_off().
>
> > So, try using the "poweroff" command instead of "halt",
> > or try using "halt -p". If neither of those work,
>
> Well it works ok with 2.6.22 powering off and saying so right before
> powering off, with some references to ACPI. On 2.6.23-rc8 however it doesn't
> seem to get that far.

There was a bug in 2.6.23-rc8 that caused this to happen.

It's been fixed in the later -git kernels
(commits 2f3f22269bdf702311342c5d106dfdd7347d1c3e,
853298bc03ef65e3eb392f5d61265605214ee8fb).

Greetings,
Rafael

2007-10-01 22:38:50

by Santiago Garcia Mantinan

Subject: Re: [PATCH] (repost) Fix SMP poweroff hangs

> There was a bug in 2.6.23-rc8 that caused this to happen.
>
> It's been fixed in the later -git kernels
> (commits 2f3f22269bdf702311342c5d106dfdd7347d1c3e,
> 853298bc03ef65e3eb392f5d61265605214ee8fb).

You are right: I downloaded the current git head and it seems to work OK.

Sorry for the lost time :-(

Regards...
--
Manty/BestiaTester -> http://manty.net