2006-02-13 02:30:36

by Alistair John Strachan

Subject: 2.6.16-rc2, x86-64, CPU hotplug failure

Hi,

In an attempt to play with ACPI S3 on my Athlon 64 X2 3800+, I recompiled
2.6.16-rc2 with CPU hotplug and ACPI sleep state support. I experienced
multiple crashes and oopsen, which I quickly discovered were the result of
bringing at least one CPU back online.

echo 0 >> /sys/devices/system/cpu/cpu1/online

Works, but then if I try to do:

echo 1 >> /sys/devices/system/cpu/cpu1/online

I get an oops. Unfortunately this board has no serial ports so I've taken a
digital camera shot of the oops. From dmesg, I'm using the PM timer.

[alistair] 02:13 [~] dmesg | egrep time\.c
time.c: Using 3.579545 MHz PM timer.
time.c: Detected 2500.768 MHz processor.
time.c: Using PM based timekeeping.

http://devzero.co.uk/~alistair/oops-20060213/

Find the oops, my config and dmesg for a successful boot at this location.
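For anyone who wants to repeat the offline/online cycle above, here is a minimal shell sketch. The function name cycle_cpu and the CPU_SYSFS override are hypothetical, added only so the sequence can be tried against a scratch directory first; the sysfs path itself is the one used in the commands above. It needs root on a real system.

```shell
#!/bin/sh
# Sketch only: take one CPU offline and bring it back through sysfs.
# CPU_SYSFS is a hypothetical override (not a kernel feature); it defaults
# to the real sysfs location used in the commands above.
CPU_SYSFS="${CPU_SYSFS:-/sys/devices/system/cpu}"

cycle_cpu() {
    cpu="$1"
    ctl="$CPU_SYSFS/cpu$cpu/online"

    # The "online" file is absent without CONFIG_HOTPLUG_CPU and may be
    # missing for cpu0, so check before writing.
    if [ ! -w "$ctl" ]; then
        echo "cpu$cpu: $ctl not writable (no hotplug support, or not root?)" >&2
        return 1
    fi

    echo 0 > "$ctl" || return 1
    echo "cpu$cpu offline: $(cat "$ctl")"

    # This is the step that oopsed for me on 2.6.16-rc2.
    echo 1 > "$ctl" || return 1
    echo "cpu$cpu online: $(cat "$ctl")"
}
```

Calling `cycle_cpu 1` as root repeats the two echo commands above and prints the state read back after each write.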

--
Cheers,
Alistair.

'No sense being pessimistic, it probably wouldn't work anyway.'
Third year Computer Science undergraduate.
1F2 55 South Clerk Street, Edinburgh, UK.


2006-02-13 07:01:36

by Zwane Mwaikambo

Subject: Re: 2.6.16-rc2, x86-64, CPU hotplug failure

On Mon, 13 Feb 2006, Alistair John Strachan wrote:

> In an attempt to play with ACPI S3 on my Athlon 64 X2 3800+, I recompiled
> 2.6.16-rc2 with CPU hotplug and ACPI sleep state support. I experienced
> multiple crashes and oopsen, which I quickly discovered were the result of
> bringing at least one CPU back online.
>
> echo 0 >> /sys/devices/system/cpu/cpu1/online
>
> Works, but then if I try to do:
>
> echo 1 >> /sys/devices/system/cpu/cpu1/online
>
> I get an oops. Unfortunately this board has no serial ports so I've taken a
> digital camera shot of the oops. From dmesg, I'm using the PM timer.
>
> [alistair] 02:13 [~] dmesg | egrep time\.c
> time.c: Using 3.579545 MHz PM timer.
> time.c: Detected 2500.768 MHz processor.
> time.c: Using PM based timekeeping.
>
> http://devzero.co.uk/~alistair/oops-20060213/

Nice snapshot. That bug was fixed around 2.6.16-rc3: unsynchronized_tsc
was marked __init instead of __cpuinit.

2006-02-13 09:25:52

by Andi Kleen

Subject: Re: 2.6.16-rc2, x86-64, CPU hotplug failure

On Monday 13 February 2006 03:30, Alistair John Strachan wrote:
> Hi,
>
> In an attempt to play with ACPI S3 on my Athlon 64 X2 3800+, I recompiled
> 2.6.16-rc2 with CPU hotplug and ACPI sleep state support. I experienced
> multiple crashes and oopsen, which I quickly discovered were the result of
> bringing at least one CPU back online.

Yes, known problem. They seem to be related to the powernow driver. Does
it work if you don't compile CPUFREQ in?

-Andi

2006-02-13 10:06:17

by Alistair John Strachan

Subject: Re: 2.6.16-rc2, x86-64, CPU hotplug failure

On Monday 13 February 2006 07:05, Zwane Mwaikambo wrote:
> On Mon, 13 Feb 2006, Alistair John Strachan wrote:
> > In an attempt to play with ACPI S3 on my Athlon 64 X2 3800+, I recompiled
> > 2.6.16-rc2 with CPU hotplug and ACPI sleep state support. I experienced
> > multiple crashes and oopsen, which I quickly discovered were the result
> > of bringing at least one CPU back online.
> >
> > echo 0 >> /sys/devices/system/cpu/cpu1/online
> >
> > Works, but then if I try to do:
> >
> > echo 1 >> /sys/devices/system/cpu/cpu1/online
> >
> > I get an oops. Unfortunately this board has no serial ports so I've taken
> > a digital camera shot of the oops. From dmesg, I'm using the PM timer.
> >
> > [alistair] 02:13 [~] dmesg | egrep time\.c
> > time.c: Using 3.579545 MHz PM timer.
> > time.c: Detected 2500.768 MHz processor.
> > time.c: Using PM based timekeeping.
> >
> > http://devzero.co.uk/~alistair/oops-20060213/
>
> Nice snapshot. That bug was fixed around 2.6.16-rc3: unsynchronized_tsc
> was marked __init instead of __cpuinit.

Thanks Zwane, everything's working now. I guess I should have upgraded when I
read the announcement.

On the ACPI front, both standby and mem seem to work (S1 and S3 I assume), in
that they suspend, and now resume, but my SATA controller and NIC do not seem
to wake up properly. Since my rootfs is on the SATA controller, things
quickly hang thereafter.

Oh well.
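
For reference, "standby" and "mem" are the strings the kernel lists in /sys/power/state. A small sketch for checking which ones a given kernel offers before trying to suspend follows; the function name supports_sleep_state and the POWER_STATE_FILE override are hypothetical, there only so it can be run against a plain file.

```shell
#!/bin/sh
# Sketch only: report whether a sleep state name is listed in the
# power state file. POWER_STATE_FILE is a hypothetical override; the
# default is the real sysfs file.
POWER_STATE_FILE="${POWER_STATE_FILE:-/sys/power/state}"

supports_sleep_state() {
    want="$1"

    # Return 2 if the file cannot be read at all (no sleep support built in).
    [ -r "$POWER_STATE_FILE" ] || return 2

    # The file holds a space-separated list such as "standby mem disk".
    for s in $(cat "$POWER_STATE_FILE"); do
        [ "$s" = "$want" ] && return 0
    done
    return 1
}
```

For example, `supports_sleep_state mem && echo mem > /sys/power/state` would only attempt the suspend when the kernel advertises that state. It says nothing about whether devices come back afterwards, which is the problem I'm seeing here.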

--
Cheers,
Alistair.

'No sense being pessimistic, it probably wouldn't work anyway.'
Third year Computer Science undergraduate.
1F2 55 South Clerk Street, Edinburgh, UK.