2007-08-03 14:14:52

by Ben Collins

Subject: Regression in 2.6.22, clock problems on Turion with 32-bit kernel

Tim and I have both experienced this problem. With 2.6.20 things worked
perfectly fine on these systems. The two machines are a Dell 1501 Turion
X2 and Dell 1521 Turion X2.

With 2.6.22 the kernel hangs shortly after starting up, but after
several minutes, you can get activity by tapping the keyboard (generating
interrupts). We have NO_HZ and HIGH_RES enabled, but even disabling these
doesn't help.

I've tried every combination of boot param revolving around clocksource
and interrupts. The only thing that gets me booting is nolapic, but then
again, that knocks me down to a single cpu. Setting maxcpus=1 or nosmp
doesn't fix it.
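
When cycling through this many boot parameters it helps to confirm which
ones the running kernel was actually booted with. A minimal sketch,
assuming the standard /proc/cmdline file is available; the helper names
parse_cmdline and workarounds_in_effect are hypothetical, and the flag list
only mirrors the options mentioned in this thread:

```python
from pathlib import Path

# Workaround flags mentioned in this thread; the list is illustrative only.
FLAGS = ("nolapic", "nolapic_timer", "acpi", "clocksource", "maxcpus", "nosmp")


def parse_cmdline(text):
    """Split a kernel command line into a {name: value} dict.

    Bare flags such as "nolapic" map to None; "acpi=off" maps to "off".
    Quoted values containing spaces are not handled in this sketch.
    """
    params = {}
    for token in text.split():
        name, sep, value = token.partition("=")
        params[name] = value if sep else None
    return params


def workarounds_in_effect(text, flags=FLAGS):
    """Return only the parameters from `flags` that appear in `text`."""
    params = parse_cmdline(text)
    return {name: params[name] for name in flags if name in params}


if __name__ == "__main__":
    # /proc/cmdline holds the command line the running kernel was booted with.
    print(workarounds_in_effect(Path("/proc/cmdline").read_text()))
```

Recording this output next to each test result makes it harder to mix up
which combination was really in effect for a given boot.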

We both bisected (separately I might add) down to this commit:

commit e9e2cdb412412326c4827fc78ba27f410d837e6e
Author: Thomas Gleixner <[email protected]>
Date: Fri Feb 16 01:28:04 2007 -0800

[PATCH] clockevents: i386 drivers

Add clockevent drivers for i386: lapic (local) and PIT/HPET (global). Update
the timer IRQ to call into the PIT/HPET driver's event handler and the
lapic-timer IRQ to call into the lapic clockevent driver. The assignment of
timer functionality is delegated to the core framework code and replaces the
compile and runtime evaluation in do_timer_interrupt_hook().

Use the clockevents broadcast support and implement the lapic_broadcast
function for ACPI.

No changes to existing functionality.
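
The bisection that leads to a commit like this is a binary search over the
history between a known-good and a known-bad kernel. A minimal sketch of
that search, assuming a linear history and a hypothetical is_bad(commit)
callback standing in for "build, boot, and see whether it hangs"; the real
git bisect also copes with merges and skipped commits, which this does not:

```python
def first_bad(commits, is_bad):
    """Return the first commit for which is_bad() is true.

    `commits` is ordered oldest to newest; the first entry is assumed good
    and the last entry is assumed bad, as when bisecting a regression.
    """
    good, bad = 0, len(commits) - 1
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(commits[mid]):
            bad = mid   # regression is at mid or earlier
        else:
            good = mid  # regression was introduced after mid
    return commits[bad]


if __name__ == "__main__":
    # Hypothetical history: commits are just numbers, and 37 introduced the hang.
    history = list(range(100))
    print(first_bad(history, lambda c: c >= 37))
```

Each step halves the remaining range, so even a few thousand commits
between two releases need only a dozen or so test boots.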

Note, the problem doesn't happen when using an x86_64 kernel with the
same basic config, on the same machine.

Hoping to get some tips to test something a bit more specific in this
patch.

--
Ubuntu : http://www.ubuntu.com/
Linux1394: http://wiki.linux1394.org/


2007-08-03 14:56:58

by Tim Gardner

Subject: Re: Regression in 2.6.22, clock problems on Turion with 32-bit kernel

One other option that allows these systems to boot is 'acpi=off', though
that is hardly useful on a laptop.

rtg

Ben Collins wrote:
> Tim and I have both experienced this problem. With 2.6.20 things worked
> perfectly fine on these systems. The two machines are a Dell 1501 Turion
> X2 and Dell 1521 Turion X2.
>
> With 2.6.22 the kernel hangs shortly after starting up, but after
> several minutes, you can get activity by tapping the keyboard (generating
> interrupts). We have NO_HZ and HIGH_RES enabled, but even disabling these
> doesn't help.
>
> I've tried every combination of boot param revolving around clocksource
> and interrupts. The only thing that gets me booting is nolapic, but then
> again, that knocks me down to a single cpu. Setting maxcpus=1 or nosmp
> doesn't fix it.
>
> We both bisected (separately I might add) down to this commit:
>
> commit e9e2cdb412412326c4827fc78ba27f410d837e6e
> Author: Thomas Gleixner <[email protected]>
> Date: Fri Feb 16 01:28:04 2007 -0800
>
> [PATCH] clockevents: i386 drivers
>
> Add clockevent drivers for i386: lapic (local) and PIT/HPET (global). Update
> the timer IRQ to call into the PIT/HPET driver's event handler and the
> lapic-timer IRQ to call into the lapic clockevent driver. The assignment of
> timer functionality is delegated to the core framework code and replaces the
> compile and runtime evaluation in do_timer_interrupt_hook().
>
> Use the clockevents broadcast support and implement the lapic_broadcast
> function for ACPI.
>
> No changes to existing functionality.
>
> Note, the problem doesn't happen when using an x86_64 kernel with the
> same basic config, on the same machine.
>
> Hoping to get some tips to test something a bit more specific in this
> patch.
>


--
Tim Gardner [email protected]

2007-08-03 15:32:15

by Arjan van de Ven

Subject: Re: Regression in 2.6.22, clock problems on Turion with 32-bit kernel


> I've tried every combination of boot param revolving around clocksource
> and interrupts. The only thing that gets me booting is nolapic, but then
> again, that knocks me down to a single cpu.

hummm.... I wonder how nolapic knocks you down to a single cpu.......
that is just an entirely strange relationship.

2007-08-03 15:44:05

by Ben Collins

Subject: Re: Regression in 2.6.22, clock problems on Turion with 32-bit kernel


On Fri, 2007-08-03 at 08:30 -0700, Arjan van de Ven wrote:
> > I've tried every combination of boot param revolving around clocksource
> > and interrupts. The only thing that gets me booting is nolapic, but then
> > again, that knocks me down to a single cpu.
>
> hummm.... I wonder how nolapic knocks you down to a single cpu.......
> that is just an entirely strange relationship.

Sorry, s/cpu/core/, but not sure if that makes a difference.

--
Ubuntu : http://www.ubuntu.com/
Linux1394: http://wiki.linux1394.org/

2007-08-03 15:48:38

by Arjan van de Ven

Subject: Re: Regression in 2.6.22, clock problems on Turion with 32-bit kernel

On Fri, 2007-08-03 at 11:43 -0400, Ben Collins wrote:
> On Fri, 2007-08-03 at 08:30 -0700, Arjan van de Ven wrote:
> > > I've tried every combination of boot param revolving around clocksource
> > > and interrupts. The only thing that gets me booting is nolapic, but then
> > > again, that knocks me down to a single cpu.
> >
> > hummm.... I wonder how nolapic knocks you down to a single cpu.......
> > that is just an entirely strange relationship.
>
> Sorry, s/cpu/core/, but not sure if that makes a difference.

still that is extremely surprising; local apic normally has nothing to
do with smp in any shape or form.... something weird must be going on.


--
if you want to mail me at work (you don't), use arjan (at) linux.intel.com
Test the interaction between Linux and your BIOS via http://www.linuxfirmwarekit.org

2007-08-03 15:51:17

by Cal Peake

Subject: Re: Regression in 2.6.22, clock problems on Turion with 32-bit kernel

On Fri, 3 Aug 2007, Ben Collins wrote:

>
> On Fri, 2007-08-03 at 08:30 -0700, Arjan van de Ven wrote:
> > > I've tried every combination of boot param revolving around clocksource
> > > and interrupts. The only thing that gets me booting is nolapic, but then
> > > again, that knocks me down to a single cpu.
> >
> > hummm.... I wonder how nolapic knocks you down to a single cpu.......
> > that is just an entirely strange relationship.
>
> Sorry, s/cpu/core/, but not sure if that makes a difference.

Ben, Tim,

See thread <http://marc.info/?t=118573271600006&r=1&w=2>. Short version:
nolapic_timer should fix things for the moment. Long term: some AMD kernel
code needs to be fixed up to deal with a broken local APIC.

Cheers,
--
Cal Peake

2007-08-03 15:58:51

by Ben Collins

Subject: Re: Regression in 2.6.22, clock problems on Turion with 32-bit kernel


On Fri, 2007-08-03 at 11:50 -0400, Cal Peake wrote:
> On Fri, 3 Aug 2007, Ben Collins wrote:
>
> >
> > On Fri, 2007-08-03 at 08:30 -0700, Arjan van de Ven wrote:
> > > > I've tried every combination of boot param revolving around clocksource
> > > > and interrupts. The only thing that gets me booting is nolapic, but then
> > > > again, that knocks me down to a single cpu.
> > >
> > > hummm.... I wonder how nolapic knocks you down to a single cpu.......
> > > that is just an entirely strange relationship.
> >
> > Sorry, s/cpu/core/, but not sure if that makes a difference.
>
> Ben, Tim,
>
> See thread <http://marc.info/?t=118573271600006&r=1&w=2>. Short version:
> nolapic_timer should fix things for the moment. Long term: some AMD kernel
> code needs to be fixed up to deal with a broken local APIC.


nolapic_timer does not fix it for me. Only nolapic and acpi=off work. I
commented on that thread as well now, thanks.

--
Ubuntu : http://www.ubuntu.com/
Linux1394: http://wiki.linux1394.org/

2007-08-03 16:02:15

by Cal Peake

Subject: Re: Regression in 2.6.22, clock problems on Turion with 32-bit kernel

On Fri, 3 Aug 2007, Ben Collins wrote:

> nolapic_timer does not fix it for me. Only nolapic and acpi=off work. I
> commented on that thread as well now, thanks.

Interesting. However, I don't have NO_HZ enabled, so maybe that plays into it...

--
Cal Peake