2007-09-26 11:03:18

by Joerg

[permalink] [raw]
Subject: Regression in 2.6.23-pre Was: Problems with 2.6.23-rc6 on AMD Geode LX800

Hello all,
this is what git bisect told me about the problem:

jpo@jpo-laptop:~/linux-2.6$ git bisect good
4fd06960f120e02e9abc802a09f9511c400042a5 is first bad commit
commit 4fd06960f120e02e9abc802a09f9511c400042a5
Author: H. Peter Anvin <[email protected]>
Date: Wed Jul 11 12:18:56 2007 -0700

Use the new x86 setup code for i386

This patch hooks the new x86 setup code into the Makefile machinery. It
also adapts boot/tools/build.c to a two-file (as opposed to three-file)
universe, and simplifies it substantially.

Signed-off-by: H. Peter Anvin <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

:040000 040000 6560eb5b7e40d93813276544bced8c478f9067f5 fe5f90d9ca08e526559815789175602ba2c51743 M arch


--
Regards

Joerg



----- Urspr?ngliche Mail ----
Von: Jordan Crouse <[email protected]>
An: Joerg Pommnitz <[email protected]>
CC: [email protected]; [email protected]
Gesendet: Dienstag, den 25. September 2007, 17:04:52 Uhr
Betreff: Re: Problems with 2.6.23-rc6 on AMD Geode LX800

On 25/09/07 01:38 -0700, Joerg Pommnitz wrote:
> Chuck, Jordan,
> thanks for taking an interest in this problem. As suggested by Jordan I tried a new
> BIOS revision from
> http://www.digitallogic.ch/index.php?id=256&dir=/MSEP800%20-%20SM800PCX%20%20-%20MPC20%20-%20MPC21&mountpoint=23
>
> Unfortunately the kernel still fails to boot in the same way.

You'll have to contact Digital Logic and have them check with the BIOS vendor
to see if the fix was made in that version or not. I don't have that
particular board, so I can't try out the fixes here.

I'm still trying to track down the particulars of the fix from the BIOS
vendor. I'll let you know.

> Do you still need the disassembled reserve_bootmem_core?

Sure - you might as well - just to make sure its the same problem.

Jordan

--
Jordan Crouse
Systems Software Development Engineer
Advanced Micro Devices, Inc.








__________________________________
Yahoo! Clever: Sie haben Fragen? Yahoo! Nutzer antworten Ihnen. http://www.yahoo.de/clever


2007-09-26 14:10:56

by H. Peter Anvin

[permalink] [raw]
Subject: Re: Regression in 2.6.23-pre Was: Problems with 2.6.23-rc6 on AMD Geode LX800

Joerg Pommnitz wrote:
> Hello all,
> this is what git bisect told me about the problem:
>
> jpo@jpo-laptop:~/linux-2.6$ git bisect good
> 4fd06960f120e02e9abc802a09f9511c400042a5 is first bad commit
> commit 4fd06960f120e02e9abc802a09f9511c400042a5
> Author: H. Peter Anvin <[email protected]>
> Date: Wed Jul 11 12:18:56 2007 -0700
>
> Use the new x86 setup code for i386
>
> This patch hooks the new x86 setup code into the Makefile machinery. It
> also adapts boot/tools/build.c to a two-file (as opposed to three-file)
> universe, and simplifies it substantially.
>
> Signed-off-by: H. Peter Anvin <[email protected]>
> Signed-off-by: Linus Torvalds <[email protected]>
>
> :040000 040000 6560eb5b7e40d93813276544bced8c478f9067f5 fe5f90d9ca08e526559815789175602ba2c51743 M arch
>

There is something very fishy.

The only documentation you've given us so far is a screen shot which
contained a message ("BIOS data check successful") which doesn't occur
in the kernel.

The loader string doesn't look all that familiar either; it looks like
an extremely old version of SYSLINUX, but that doesn't contain that
message either.

INT 6 is #UD, the undefined instruction exception. This is consistent with:

> Its hitting a bug - specifically (from bootmem.c:125):
> BUG_ON(PFN_DOWN(addr) >= bdata->node_low_pfn);

However, all that tells us is that reserve_bootmem_core() was either
called with a bad address or bdata->node_low_pfn is garbage. In
particular, without knowing how it got there it's hard to know for sure.

Could you send me the boot messages from a working kernel boot?

-hpa

2007-09-26 15:28:26

by Joerg

[permalink] [raw]
Subject: RE: Regression in 2.6.23-pre Was: Problems with 2.6.23-rc6 on AMD Geode LX800

> There is something very fishy.
>
> The only documentation you've given us so far is a screen shot which
> contained a message ("BIOS data check successful") which doesn't occur
> in the kernel.
>
> The loader string doesn't look all that familiar either; it looks like
> an extremely old version of SYSLINUX, but that doesn't contain that
> message either.

The boot loader is LILO from Ubuntu 7.04, so it should be fairly recent.

> INT 6 is #UD, the undefined instruction exception. This is consistent with:
>
> > Its hitting a bug - specifically (from bootmem.c:125):
> > BUG_ON(PFN_DOWN(addr) >= bdata->node_low_pfn);
>
> However, all that tells us is that reserve_bootmem_core() was either
> called with a bad address or bdata->node_low_pfn is garbage. In
> particular, without knowing how it got there it's hard to know for sure.
>
> Could you send me the boot messages from a working kernel boot?

Attached is a boot log I get where the last patch is
f2d98ae63dc64dedb00499289e13a50677f771f9, e.g. "Linker script for the
new x86 setup code".

The kernel is directly from "git bisect, make defconfig, make", no local
changes or strange patches applied. Build environment is plain Ubuntu-7.04.

--
Kind regards

Joerg




__________________________________
Alles was der Gesundheit und Entspannung dient. BE A BETTER MEDIZINMANN! http://www.yahoo.de/clever


Attachments:
dmesg.txt (10.15 kB)

2007-09-26 15:42:09

by Jordan Crouse

[permalink] [raw]
Subject: Re: Regression in 2.6.23-pre Was: Problems with 2.6.23-rc6 on AMD Geode LX800

On 26/09/07 07:10 -0700, H. Peter Anvin wrote:
> Joerg Pommnitz wrote:
> > Hello all,
> > this is what git bisect told me about the problem:
> >
> > jpo@jpo-laptop:~/linux-2.6$ git bisect good
> > 4fd06960f120e02e9abc802a09f9511c400042a5 is first bad commit
> > commit 4fd06960f120e02e9abc802a09f9511c400042a5
> > Author: H. Peter Anvin <[email protected]>
> > Date: Wed Jul 11 12:18:56 2007 -0700
> >
> > Use the new x86 setup code for i386
> >
> > This patch hooks the new x86 setup code into the Makefile machinery. It
> > also adapts boot/tools/build.c to a two-file (as opposed to three-file)
> > universe, and simplifies it substantially.
> >
> > Signed-off-by: H. Peter Anvin <[email protected]>
> > Signed-off-by: Linus Torvalds <[email protected]>
> >
> > :040000 040000 6560eb5b7e40d93813276544bced8c478f9067f5 fe5f90d9ca08e526559815789175602ba2c51743 M arch
> >
>
> There is something very fishy.
>
> The only documentation you've given us so far is a screen shot which
> contained a message ("BIOS data check successful") which doesn't occur
> in the kernel.
>
> The loader string doesn't look all that familiar either; it looks like
> an extremely old version of SYSLINUX, but that doesn't contain that
> message either.
>
> INT 6 is #UD, the undefined instruction exception. This is consistent with:
>
> > Its hitting a bug - specifically (from bootmem.c:125):
> > BUG_ON(PFN_DOWN(addr) >= bdata->node_low_pfn);
>
> However, all that tells us is that reserve_bootmem_core() was either
> called with a bad address or bdata->node_low_pfn is garbage. In
> particular, without knowing how it got there it's hard to know for sure.

/me swings a +5 JTAG debugger

Its the latter - max_pfn as read by find_max_pfn() in arch/i386/e820.c
is being set to 9F (640k) in the broken case, this due to the
the e820 map looking something like this:

Address Size Type
00000000 0009FC00 1
0009FC00 00000400 2
000E0000 00002000 2

(Yep, thats it - thats the list. e820.nr_map is indeed 3).

Long story short, bdata->node_low_pfn gets set to 9F, and When we
try to allocate the bootmem bitmap (at _pa_symbol(_text), which is
page 0x100), then the system gets appropriately angry.

As background, I'm using syslinux 3.36 as my loader here - I've used this
exact same version for a very long time, so I don't blame it in the least.
Something is getting confused in the early kernel, and whatever that
something is, a still unknown change in a newer version of the BIOS
fixed it. The search goes on.

Jordan
--
Jordan Crouse
Systems Software Development Engineer
Advanced Micro Devices, Inc.


2007-09-26 16:58:32

by H. Peter Anvin

[permalink] [raw]
Subject: Re: Regression in 2.6.23-pre Was: Problems with 2.6.23-rc6 on AMD Geode LX800

Jordan Crouse wrote:
>
> As background, I'm using syslinux 3.36 as my loader here - I've used this
> exact same version for a very long time, so I don't blame it in the least.
> Something is getting confused in the early kernel, and whatever that
> something is, a still unknown change in a newer version of the BIOS
> fixed it. The search goes on.
>

OK, we should put printf's in arch/i386/boot/memory.c and see what
actually gets read out from the BIOS. This could either be a BIOS
problem or a bug in memory.c (or a bug elsewhere in the code that the
change in memory.c triggers, but that seems less likely.)

-hpa

P.S. Are you guys in the Bay Area by any chance?

2007-09-26 19:15:18

by H. Peter Anvin

[permalink] [raw]
Subject: Re: Regression in 2.6.23-pre Was: Problems with 2.6.23-rc6 on AMD Geode LX800

diff --git a/arch/i386/boot/memory.c b/arch/i386/boot/memory.c
index 1a2e62d..a0ccf29 100644
--- a/arch/i386/boot/memory.c
+++ b/arch/i386/boot/memory.c
@@ -33,6 +33,12 @@ static int detect_memory_e820(void)
"=m" (*desc)
: "D" (desc), "a" (0xe820));

+ printf("e820: err %d id 0x%08x next %u %08x:%08x %u\n",
+ err, id, next,
+ (unsigned int)desc->addr,
+ (unsigned int)desc->size,
+ desc->type);
+
if (err || id != SMAP)
break;


Attachments:
diff (488.00 B)

2007-09-26 20:58:35

by Jordan Crouse

[permalink] [raw]
Subject: Re: Regression in 2.6.23-pre Was: Problems with 2.6.23-rc6 on AMD Geode LX800

[i386]: Return an error if the e820 detection goes bad

From: Jordan Crouse <[email protected]>

Change the e820 code to always return an error if something
bad happens while reading the e820 map. This matches the
old code behavior, and allows brain-dead e820 implementations
to still work.

Signed-off-by: Jordan Crouse <[email protected]>
---

arch/i386/boot/memory.c | 9 +++++----
1 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/arch/i386/boot/memory.c b/arch/i386/boot/memory.c
index 1a2e62d..4c7f0f6 100644
--- a/arch/i386/boot/memory.c
+++ b/arch/i386/boot/memory.c
@@ -22,7 +22,7 @@ static int detect_memory_e820(void)
{
u32 next = 0;
u32 size, id;
- u8 err;
+ u8 err, count = 0;
struct e820entry *desc = boot_params.e820_map;

do {
@@ -34,13 +34,14 @@ static int detect_memory_e820(void)
: "D" (desc), "a" (0xe820));

if (err || id != SMAP)
- break;
+ return -1;

- boot_params.e820_entries++;
+ count++;
desc++;
} while (next && boot_params.e820_entries < E820MAX);

- return boot_params.e820_entries;
+ boot_params.e820_entries = count;
+ return count;
}

static int detect_memory_e801(void)


Attachments:
(No filename) (2.05 kB)
memory-bail-on-error.patch (1.14 kB)
Download all attachments

2007-09-26 21:04:48

by H. Peter Anvin

[permalink] [raw]
Subject: Re: Regression in 2.6.23-pre Was: Problems with 2.6.23-rc6 on AMD Geode LX800

Jordan Crouse wrote:
> On 26/09/07 12:14 -0700, H. Peter Anvin wrote:
>> Please try the following debug patch to let us know what is going on.
>>
>> -hpa
>
>> diff --git a/arch/i386/boot/memory.c b/arch/i386/boot/memory.c
>> index 1a2e62d..a0ccf29 100644
>> --- a/arch/i386/boot/memory.c
>> +++ b/arch/i386/boot/memory.c
>> @@ -33,6 +33,12 @@ static int detect_memory_e820(void)
>> "=m" (*desc)
>> : "D" (desc), "a" (0xe820));
>>
>> + printf("e820: err %d id 0x%08x next %u %08x:%08x %u\n",
>> + err, id, next,
>> + (unsigned int)desc->addr,
>> + (unsigned int)desc->size,
>> + desc->type);
>> +
>> if (err || id != SMAP)
>> break;
>
> Okay, we have clarity. Here is the output
>
> e820: err 0 id 0x534d4150 next 15476 00000000:0009fc00 1
> e820: err 0 id 0x534d4150 next 15496 0009fc00:00000400 2
> e820: err 0 id 0x534d4150 next 15516 000e0000:00020000 2
> e820: err 0 id 0x0e7b0000 next 11536 00100000:0e6b0000 1
>
> In the last entry, id is obviously wrong (it should be 'SMAP' or
> 0x534d4150). This is the BIOS bug.
>
> Here's the reason why this bothers us now. In the old assembly code,
> if the returned ID wasn't equal to 'SMAP', we jumped straight to the e801
> code. In the new code in memory.c, if id != SMAP, we break out of the
> int15 loop, and return boot_params.e820_entries, which in our case is
> 3. detect_memory() considers this to be successful, and no attempt to
> parse e801 is made.
>
> So thats where the problem is - in the old code with the buggy BIOS, we
> punted to reading the e801 information, and that was enough to keep us
> going. In the new code, we allow a partial table to be used, and we
> blow up.
>
> Attached is a patch to fix this - it returns -1 on error, and only sets
> boot_params.e820_entries to be non-zero if we have something useful
> in it. This punts the detection to the e801 code, which then is
> then successful.
>
> This fixes the problem with the DB800, and so it probably should
> with the other Geode platforms affected by this.
>
> Many thanks to hpa for the guiding hand.
>

This patch is obviously wrong. There are a lot of e820 BIOSen out there
that terminate with CF=1, and that is a legitimate termination condition
for e820. Now, as far as what to do when id != SMAP, it probably is
still the right thing to do; since the BOS vendor couldn't get something
that elementary correct, we shouldn't trust the data.

I'll write up a corrected patch.

-hpa

2007-09-26 21:15:59

by Jordan Crouse

[permalink] [raw]
Subject: Re: Regression in 2.6.23-pre Was: Problems with 2.6.23-rc6 on AMD Geode LX800

On 26/09/07 14:04 -0700, H. Peter Anvin wrote:
> Jordan Crouse wrote:
> > On 26/09/07 12:14 -0700, H. Peter Anvin wrote:
> >> Please try the following debug patch to let us know what is going on.
> >>
> >> -hpa
> >
> >> diff --git a/arch/i386/boot/memory.c b/arch/i386/boot/memory.c
> >> index 1a2e62d..a0ccf29 100644
> >> --- a/arch/i386/boot/memory.c
> >> +++ b/arch/i386/boot/memory.c
> >> @@ -33,6 +33,12 @@ static int detect_memory_e820(void)
> >> "=m" (*desc)
> >> : "D" (desc), "a" (0xe820));
> >>
> >> + printf("e820: err %d id 0x%08x next %u %08x:%08x %u\n",
> >> + err, id, next,
> >> + (unsigned int)desc->addr,
> >> + (unsigned int)desc->size,
> >> + desc->type);
> >> +
> >> if (err || id != SMAP)
> >> break;
> >
> > Okay, we have clarity. Here is the output
> >
> > e820: err 0 id 0x534d4150 next 15476 00000000:0009fc00 1
> > e820: err 0 id 0x534d4150 next 15496 0009fc00:00000400 2
> > e820: err 0 id 0x534d4150 next 15516 000e0000:00020000 2
> > e820: err 0 id 0x0e7b0000 next 11536 00100000:0e6b0000 1
> >
> > In the last entry, id is obviously wrong (it should be 'SMAP' or
> > 0x534d4150). This is the BIOS bug.
> >
> > Here's the reason why this bothers us now. In the old assembly code,
> > if the returned ID wasn't equal to 'SMAP', we jumped straight to the e801
> > code. In the new code in memory.c, if id != SMAP, we break out of the
> > int15 loop, and return boot_params.e820_entries, which in our case is
> > 3. detect_memory() considers this to be successful, and no attempt to
> > parse e801 is made.
> >
> > So thats where the problem is - in the old code with the buggy BIOS, we
> > punted to reading the e801 information, and that was enough to keep us
> > going. In the new code, we allow a partial table to be used, and we
> > blow up.
> >
> > Attached is a patch to fix this - it returns -1 on error, and only sets
> > boot_params.e820_entries to be non-zero if we have something useful
> > in it. This punts the detection to the e801 code, which then is
> > then successful.
> >
> > This fixes the problem with the DB800, and so it probably should
> > with the other Geode platforms affected by this.
> >
> > Many thanks to hpa for the guiding hand.
> >
>
> This patch is obviously wrong. There are a lot of e820 BIOSen out there
> that terminate with CF=1, and that is a legitimate termination condition
> for e820. Now, as far as what to do when id != SMAP, it probably is
> still the right thing to do; since the BOS vendor couldn't get something
> that elementary correct, we shouldn't trust the data.
>
> I'll write up a corrected patch.

Hmm - the old code seems to fail to e801 when CF was set too:

int $0x15 # make the call
jc bail820 # fall to e801 if it fails

cmpl $SMAP, %eax # check the return is `SMAP'
jne bail820 # fall to e801 if it fails

Thats not to say that the old code was correct, mind you. Failing on a
bad ID and returning without error on a set CF seems to be good to me.

Jordan

--
Jordan Crouse
Systems Software Development Engineer
Advanced Micro Devices, Inc.


2007-09-26 21:21:18

by H. Peter Anvin

[permalink] [raw]
Subject: Re: Regression in 2.6.23-pre Was: Problems with 2.6.23-rc6 on AMD Geode LX800

Jordan Crouse wrote:
>
> Hmm - the old code seems to fail to e801 when CF was set too:
>
> int $0x15 # make the call
> jc bail820 # fall to e801 if it fails
>
> cmpl $SMAP, %eax # check the return is `SMAP'
> jne bail820 # fall to e801 if it fails
>
> Thats not to say that the old code was correct, mind you. Failing on a
> bad ID and returning without error on a set CF seems to be good to me.
>

Testing this patch now:


Attachments:
0001-x86-setup-Handle-case-of-improperly-terminated-E82.patch (2.19 kB)

2007-09-26 21:30:22

by Jordan Crouse

[permalink] [raw]
Subject: Re: Regression in 2.6.23-pre Was: Problems with 2.6.23-rc6 on AMD Geode LX800

On 26/09/07 14:20 -0700, H. Peter Anvin wrote:
> Testing this patch now:
>

> >From 2efa33f81ef56e7700c09a3d8a881c96692149e5 Mon Sep 17 00:00:00 2001
> From: H. Peter Anvin <[email protected]>
> Date: Wed, 26 Sep 2007 14:11:43 -0700
> Subject: [PATCH] [x86 setup] Handle case of improperly terminated E820 chain
>
> At least one system (a Geode system with a Digital Logic BIOS) has
> been found which suddenly stops reporting the SMAP signature when
> reading the E820 memory chain. We can't know what, exactly, broke in
> the BIOS, so if we detect this situation, declare the E820 data
> unusable and fall back to E801.
>
> Also, revert to original behavior of always probing all memory
> methods; that way all the memory information is available to the
> kernel.
>
> Signed-off-by: H. Peter Anvin <[email protected]>
> Cc: Jordan Crouse <[email protected]>
> Cc: Joerg Pommnitz <[email protected]>
> ---
> arch/i386/boot/memory.c | 30 +++++++++++++++++++++++-------
> 1 files changed, 23 insertions(+), 7 deletions(-)
>
> diff --git a/arch/i386/boot/memory.c b/arch/i386/boot/memory.c
> index 1a2e62d..bccaa1c 100644
> --- a/arch/i386/boot/memory.c
> +++ b/arch/i386/boot/memory.c
> @@ -20,6 +20,7 @@
>
> static int detect_memory_e820(void)
> {
> + int count = 0;
> u32 next = 0;
> u32 size, id;
> u8 err;
> @@ -33,14 +34,24 @@ static int detect_memory_e820(void)
> "=m" (*desc)
> : "D" (desc), "a" (0xe820));
>
> - if (err || id != SMAP)
> + /* Some BIOSes stop returning SMAP in the middle of
> + the search loop. We don't know exactly how the BIOS
> + screwed up the map at that point, we might have a
> + partial map, the full map, or complete garbage, so
> + just return failure. */
> + if (id != SMAP) {
> + count = 0;
> break;
> + }
>
> - boot_params.e820_entries++;
> + if (err)
> + break;
> +
> + count++;
> desc++;
> - } while (next && boot_params.e820_entries < E820MAX);
> + } while (next && count < E820MAX);
>
> - return boot_params.e820_entries;
> + return boot_params.e820_entries = count;
> }
>
> static int detect_memory_e801(void)
> @@ -89,11 +100,16 @@ static int detect_memory_88(void)
>
> int detect_memory(void)
> {
> + int err = -1;
> +
> if (detect_memory_e820() > 0)
> - return 0;
> + err = 0;
>
> if (!detect_memory_e801())
> - return 0;
> + err = 0;
> +
> + if (!detect_memory_88())
> + err = 0;
>
> - return detect_memory_88();
> + return err;
> }
> --
> 1.5.3.1
>

Works here with the buggy BIOS.

Acked-by: Jordan Crouse <[email protected]>

Thanks.

--
Jordan Crouse
Systems Software Development Engineer
Advanced Micro Devices, Inc.