2007-01-24 22:46:54

by Paweł Sikora

Subject: 2.6.20rc5 k8/acpi regression ( 2.6.17.13 works fine ).

hi,

for 2.6.20rc5 i get an acpi related oops during x86-64 boot:
http://minus.ds14.agh.edu.pl/~pluto/2.6.20rc5-acpi-oops.jpg
disabling the "amd-k8 cool'n'quiet" option in bios helps.
moreover, it works fine for 2.6.17.13, so it looks like
a recent regression. i can provide more details if you need.

BR,
pawel.


2007-01-24 22:52:34

by Adrian Bunk

Subject: Re: 2.6.20rc5 k8/acpi regression ( 2.6.17.13 works fine ).

On Wed, Jan 24, 2007 at 11:46:44PM +0100, Paweł Sikora wrote:

> hi,

Hi Paweł,

> for 2.6.20rc5 i get an acpi related oops during x86-64 boot:
> http://minus.ds14.agh.edu.pl/~pluto/2.6.20rc5-acpi-oops.jpg
> disabling the "amd-k8 cool'n'quiet" option in bios helps.
> moreover, it works fine for 2.6.17.13, so it looks like
> a recent regression. i can provide more details if you need.

thanks for your report.

Can you narrow down a bit when it started?
Is 2.6.19 OK?
Is 2.6.18 OK?

I've Cc'ed the ACPI maintainers that might have some clues.

> BR,
> pawel.

cu
Adrian

--

"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed

2007-01-25 05:00:48

by Len Brown

Subject: Re: 2.6.20rc5 k8/acpi regression ( 2.6.17.13 works fine ).

On Wednesday 24 January 2007 17:52, Adrian Bunk wrote:
> On Wed, Jan 24, 2007 at 11:46:44PM +0100, Paweł Sikora wrote:

> > for 2.6.20rc5 i get an acpi related oops during x86-64 boot:
> > http://minus.ds14.agh.edu.pl/~pluto/2.6.20rc5-acpi-oops.jpg
> > disabling the "amd-k8 cool'n'quiet" option in bios helps.
> > moreover, it works fine for 2.6.17.13, so it looks like
> > a recent regression. i can provide more details if you need.
>
> thanks for your report.
>
> Can you narrow down a bit when it started?
> Is 2.6.19 OK?
> Is 2.6.18 OK?

Is the stack trace always the same? It doesn't make much sense to me.

if AMD cool & quiet is enabled in the BIOS, but CONFIG_CPU_FREQ=n
in the kernel, do you see the same problem?

thanks,
-Len

2007-01-25 20:55:00

by Paweł Sikora

Subject: Re: 2.6.20rc5 k8/acpi regression ( 2.6.17.13 works fine ).

On Thursday 25 of January 2007 05:50:45 Len Brown wrote:
> On Wednesday 24 January 2007 17:52, Adrian Bunk wrote:
> > On Wed, Jan 24, 2007 at 11:46:44PM +0100, Paweł Sikora wrote:
> > > for 2.6.20rc5 i get an acpi related oops during x86-64 boot:
> > > http://minus.ds14.agh.edu.pl/~pluto/2.6.20rc5-acpi-oops.jpg
> > > disabling the "amd-k8 cool'n'quiet" option in bios helps.
> > > moreover, it works fine for 2.6.17.13, so it looks like
> > > a recent regression. i can provide more details if you need.
> >
> > thanks for your report.
> >
> > Can you narrow down a bit when it started?
> > Is 2.6.19 OK?
> > Is 2.6.18 OK?

ok, here are results of my tests:

M/B: http://www.epox.nl/products/view.php?product_id=421

$ cat /proc/cpuinfo
processor : 0
vendor_id : AuthenticAMD
cpu family : 15
model : 55
model name : AMD Athlon(tm) 64 Processor 3700+
stepping : 2
cpu MHz : 2200.000
cache size : 1024 KB
fpu : yes
fpu_exception : yes
cpuid level : 1
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca
cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx mmxext fxsr_opt lm
3dnowext 3dnow pni lahf_lm
bogomips : 4423.06
TLB size : 1024 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management: ts fid vid ttp

amd-k8 cool'n'quiet enabled in bios.

2.6.17.13-uni, 2.6.17.14-smp, 2.6.18.6-smp and 2.6.20rc5-uni work.
2.6.20rc5-smp oopses during boot but works if c'n'q is disabled.
pure 2.6.19.x not tested yet...

> Is the stack trace always the same? It doesn't make much sense to me.
>
> if AMD cool & quiet is enabled in the BIOS, but CONFIG_CPU_FREQ=n
> in the kernel, do you see the same problem?

it does not look related to config_cpu_freq.

2007-01-25 22:12:15

by Paweł Sikora

Subject: Re: 2.6.20rc5 k8/acpi regression ( 2.6.17.13 works fine ).

On Thursday 25 of January 2007 05:50:45 Len Brown wrote:
> On Wednesday 24 January 2007 17:52, Adrian Bunk wrote:
> > On Wed, Jan 24, 2007 at 11:46:44PM +0100, Paweł Sikora wrote:
> > > for 2.6.20rc5 i get an acpi related oops during x86-64 boot:
> > > http://minus.ds14.agh.edu.pl/~pluto/2.6.20rc5-acpi-oops.jpg
> > > disabling the "amd-k8 cool'n'quiet" option in bios helps.
> > > moreover, it works fine for 2.6.17.13, so it looks like
> > > a recent regression. i can provide more details if you need.
> >
> > thanks for your report.
> >
> > Can you narrow down a bit when it started?
> > Is 2.6.19 OK?
> > Is 2.6.18 OK?
>
> Is the stack trace always the same? It doesn't make much sense to me.

with debug options enabled oops looks better:
http://minus.ds14.agh.edu.pl/~pluto/2.6.20rc5-oops-01.jpg

2007-01-26 21:57:58

by Paweł Sikora

Subject: Re: 2.6.20rc5 k8/acpi regression ( 2.6.17.13 works fine ).

for more details see PR 7889 at kernel bugzilla.

2007-01-29 22:34:18

by Chuck Ebbert

Subject: Re: 2.6.20rc5 k8/acpi regression ( 2.6.17.13 works fine ).

Paweł Sikora wrote:
> On Thursday 25 of January 2007 05:50:45 Len Brown wrote:
>
>> On Wednesday 24 January 2007 17:52, Adrian Bunk wrote:
>>
>>> On Wed, Jan 24, 2007 at 11:46:44PM +0100, Paweł Sikora wrote:
>>>
>>>> for 2.6.20rc5 i get an acpi related oops during x86-64 boot:
>>>> http://minus.ds14.agh.edu.pl/~pluto/2.6.20rc5-acpi-oops.jpg
>>>> disabling the "amd-k8 cool'n'quiet" option in bios helps.
>>>> moreover, it works fine for 2.6.17.13, so it looks like
>>>> a recent regression. i can provide more details if you need.
>>>>
>>> thanks for your report.
>>>
>>> Can you narrow down a bit when it started?
>>> Is 2.6.19 OK?
>>> Is 2.6.18 OK?
>>>
>> Is the stack trace always the same? It doesn't make much sense to me.
>>
>
> with debug options enabled oops looks better:
> http://minus.ds14.agh.edu.pl/~pluto/2.6.20rc5-oops-01.jpg
>
>

In __rmqueue() (mm/page_alloc.c line 619):

static struct page *__rmqueue(struct zone *zone, unsigned int order)
{
        struct free_area * area;
        unsigned int current_order;
        struct page *page;

        for (current_order = order; current_order < MAX_ORDER; ++current_order) {
                area = zone->free_area + current_order;
                if (list_empty(&area->free_list))
                        continue;

                page = list_entry(area->free_list.next, struct page, lru);
                list_del(&page->lru);   <=====================
                rmv_page_order(page);
                area->nr_free--;

page->lru is NULL
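
To illustrate why a zeroed page->lru would fault at the marked list_del() call, here is a minimal userspace sketch. It assumes the usual doubly linked circular list layout; the names init_list_head, list_add_sketch, list_entry_is_linked and list_del_sketch are made up for this example and are not the kernel's own functions. The point is only that unlinking writes through entry->next and entry->prev, so an entry whose links are NULL cannot be removed safely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's struct list_head (assumption). */
struct list_head {
    struct list_head *next, *prev;
};

/* Make an empty circular list: the head points to itself. */
static void init_list_head(struct list_head *head)
{
    head->next = head;
    head->prev = head;
}

/* Insert entry right after head. */
static void list_add_sketch(struct list_head *entry, struct list_head *head)
{
    assert(entry != NULL && head != NULL);
    entry->next = head->next;
    entry->prev = head;
    head->next->prev = entry;
    head->next = entry;
}

/* Hypothetical helper: true if both neighbours exist and point back at entry. */
static bool list_entry_is_linked(const struct list_head *entry)
{
    return entry->next != NULL && entry->prev != NULL &&
           entry->next->prev == entry && entry->prev->next == entry;
}

/*
 * Unlink entry. The two stores below go through entry->next and
 * entry->prev, so NULL links would mean a NULL pointer dereference.
 * This sketch checks first and refuses instead of faulting.
 */
static bool list_del_sketch(struct list_head *entry)
{
    if (!list_entry_is_linked(entry))
        return false;
    entry->next->prev = entry->prev;
    entry->prev->next = entry->next;
    entry->next = NULL;
    entry->prev = NULL;
    return true;
}
```

With an entry that was properly added, list_del_sketch() returns true and leaves the head pointing at itself again. With an entry whose next and prev are both NULL, which is the state described above for page->lru, it returns false; without the check the two stores would be the faulting accesses.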