2003-01-12 02:22:34

by Mikael Pettersson

Subject: 2.5.55/.56 instant reboot problem on 486

My '94 vintage 486 has problems booting 2.5.55 and 2.5.56.
When it fails, the boot gets to loading the kernel and
printing "Ok, booting the kernel.". Then there is a short
pause (like a tenth of a second) and the machine reboots.

After doing a binary search with "for(;;);" statements
(printk doesn't work this early) I found that the reboot
occurs in arch/i386/mm/init.c:kernel_physical_mapping_init():
(start_kernel() -> setup_arch() -> paging_init() ->
pagetable_init() -> kernel_physical_mapping_init())

diff -ruNp linux-2.5.55/arch/i386/mm/init.c linux-2.5.55.hack/arch/i386/mm/init.c
--- linux-2.5.55/arch/i386/mm/init.c 2003-01-12 02:20:49.000000000 +0100
+++ linux-2.5.55.hack/arch/i386/mm/init.c 2003-01-12 01:44:49.000000000 +0100
@@ -134,6 +134,7 @@ static void __init kernel_physical_mappi
 	pgd = pgd_base + pgd_ofs;
 	pfn = 0;
 
+	//for(;;);
 	for (; pgd_ofs < PTRS_PER_PGD; pgd++, pgd_ofs++) {
 		pmd = one_md_table_init(pgd);
 		if (pfn >= max_low_pfn)
@@ -151,6 +152,7 @@ static void __init kernel_physical_mappi
 			}
 		}
 	}
+	for(;;);
 }
 
 static inline int page_kills_ppro(unsigned long pagenr)

If I uncomment the first "//for(;;);" the kernel hangs, but if
I keep it commented out, the kernel reboots -- i.e. it doesn't
get to the final "for(;;);" at the end of the function.
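
The hang-versus-reboot probing described above is an ordinary binary search over probe points: a hang means the probe was reached, a reboot means the crash came first. A minimal sketch follows, assuming the probe points are numbered in execution order; `first_unreached_probe`, `hangs_at_fn` and `simulated_hangs_at` are hypothetical names for illustration, not kernel code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical oracle: returns 1 if a build with "for(;;);" at probe
 * point `probe` hangs (the probe was reached), 0 if the machine reboots
 * (the crash happened before the probe was reached). */
typedef int (*hangs_at_fn)(size_t probe, void *ctx);

/* Returns the index of the first probe point that is never reached,
 * or n if every probe is reached. Probes are numbered in execution order. */
size_t first_unreached_probe(size_t n, hangs_at_fn hangs_at, void *ctx)
{
    size_t lo = 0, hi = n;

    assert(hangs_at != NULL);
    /* Invariant: probes below lo are reached, probes at or above hi are not. */
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (hangs_at(mid, ctx))
            lo = mid + 1;   /* hang: execution got this far */
        else
            hi = mid;       /* reboot: crash is earlier */
    }
    return lo;
}

/* Stand-in for the real machine: the crash occurs just before the
 * probe whose index is stored in *ctx. */
int simulated_hangs_at(size_t probe, void *ctx)
{
    size_t crash_before = *(size_t *)ctx;

    return probe < crash_before;
}
```

On real hardware each oracle call is a rebuild and a boot attempt, so the logarithmic number of steps is what makes the approach practical.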

The problem is apparently related to the size of the kernel.
With gcc-2.95.3 and my normal config for this machine,
size vmlinux is

text data bss dec hex filename
1330953 109008 125656 1565617 17e3b1 vmlinux

and the kernel reboots. If I alter the size by changing some
irrelevant config option (like disabling INPUT_MOUSEDEV or
enabling KALLSYMS), the reboot problem doesn't occur.
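
For reference, the size figures above can be sanity-checked with a few lines of C. This is only a sketch: `size_row`, `size_row_consistent` and `approx_image_end_phys` are hypothetical helpers, and the 1 MB load address is an assumption about the usual i386 layout, not something stated in this report. Section alignment padding is ignored, so the end address is approximate.

```c
#include <assert.h>

/* One row of `size vmlinux` output. */
struct size_row {
    unsigned long text, data, bss, dec;
};

/* size(1) reports dec as text + data + bss. */
int size_row_consistent(const struct size_row *r)
{
    return r->text + r->data + r->bss == r->dec;
}

/* ASSUMPTION: the image is loaded at physical address 1 MB. */
#define ASSUMED_LOAD_PHYS 0x100000UL

/* Approximate physical address just past the end of the image. */
unsigned long approx_image_end_phys(const struct size_row *r)
{
    assert(size_row_consistent(r));
    return ASSUMED_LOAD_PHYS + r->dec;
}
```

For the row quoted above, the three columns do add up to 1565617 (0x17e3b1), and under the 1 MB assumption the image ends near physical 0x27e3b1, just under 2.5 MB.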

With gcc-3.2 the bug disappears, but only because gcc-3.2
generates a much larger code segment. If I remove some
driver & fs config options, the vmlinux size becomes almost
the same as above, and the reboot bug appears again.

The same kernel that fails on the 486 boots Ok on my newer
test boxes, so the problem is either 486-specific, related
to the actual memory size, or the BIOS memory size reporting
method (the 486 uses int 15 0x88); here's what 2.5.54 says:

BIOS-provided physical RAM map:
BIOS-88: 0000000000000000 - 000000000009f000 (usable)
BIOS-88: 0000000000100000 - 0000000001c00000 (usable)
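
To make the map above easier to read, here is a small sketch that converts the two ranges into sizes and page counts. It assumes 4 KB pages (`PAGE_SHIFT` of 12, as on i386); `ram_range`, `range_bytes`, `range_pages` and `top_pfn` are hypothetical helper names.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SHIFT 12   /* 4 KB pages, as on i386 */

/* One "start - end (usable)" line from the RAM map; end is exclusive. */
struct ram_range {
    unsigned long long start, end;
};

unsigned long long range_bytes(const struct ram_range *r)
{
    assert(r->end >= r->start);
    return r->end - r->start;
}

unsigned long long range_pages(const struct ram_range *r)
{
    return range_bytes(r) >> PAGE_SHIFT;
}

/* Page frame number just past the highest usable address. */
unsigned long long top_pfn(const struct ram_range *map, size_t n)
{
    unsigned long long top = 0;
    size_t i;

    for (i = 0; i < n; i++)
        if (map[i].end > top)
            top = map[i].end;
    return top >> PAGE_SHIFT;
}
```

With the two ranges above this gives 159 pages (636 KB) of base memory, 6912 pages (27 MB) above 1 MB, and a top page frame number of 7168, i.e. memory ends at 28 MB.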

The 486 has no known HW problems, and it survives memtest86.

/Mikael


2003-01-12 04:13:37

by Linus Torvalds

Subject: Re: 2.5.55/.56 instant reboot problem on 486


On Sun, 12 Jan 2003, Mikael Pettersson wrote:
>
> My '94 vintage 486 has problems booting 2.5.55 and 2.5.56.

Should I take it that 2.5.54 works? Or you haven't tested?

> After doing a binary search with "for(;;);" statements
> (printk doesn't work this early) I found that the reboot
> occurs in arch/i386/mm/init.c:kernel_physical_mapping_init():
> (start_kernel() -> setup_arch() -> paging_init() ->
> pagetable_init() -> kernel_physical_mapping_init())

Ho humm.. Sounds like the non-PSE case is broken. Which should probably
mean that even newer CPU's should show the same thing if we boot with
"mem=nopentium". Can you verify that with your other machine that
otherwise boots the same kernel fine?
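
As background for the PSE remark, here is a simplified user-space model of how the low-memory mapping differs between the two cases. It is a sketch under assumptions, not the kernel's code: it assumes a two-level (non-PAE) layout with 1024 entries per page table, and `mapping_cost` and `model_low_mapping` are hypothetical names. The `has_pse` flag stands in for the CPU's 4 MB page support, which "mem=nopentium" turns off.

```c
#include <assert.h>

#define PTRS_PER_PTE 1024UL   /* entries per page table, non-PAE i386 */

struct mapping_cost {
    unsigned long large_pages;   /* 4 MB mappings placed directly in the pmd */
    unsigned long page_tables;   /* page tables that must be set up */
    unsigned long ptes;          /* individual 4 KB entries written */
};

/* Count what mapping pfn 0 .. max_low_pfn-1 requires in each mode. */
struct mapping_cost model_low_mapping(unsigned long max_low_pfn, int has_pse)
{
    struct mapping_cost c = {0, 0, 0};
    unsigned long pfn = 0;

    assert(max_low_pfn > 0);
    while (pfn < max_low_pfn) {
        if (has_pse) {
            /* One directory-level entry covers 1024 page frames. */
            c.large_pages++;
            pfn += PTRS_PER_PTE;
        } else {
            unsigned long i;

            /* Non-PSE: a page table is needed, then one pte per frame. */
            c.page_tables++;
            for (i = 0; i < PTRS_PER_PTE && pfn < max_low_pfn; i++, pfn++)
                c.ptes++;
        }
    }
    return c;
}
```

For a 28 MB machine (7168 page frames) the model gives 7 large pages with PSE, versus 7 page tables and 7168 individual entries without it. Only the non-PSE path touches page tables at all, which is why that path is the natural suspect on a 486.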

> The problem is apparently related to the size of the kernel.
> With gcc-2.95.3 and my normal config for this machine,
> size vmlinux is
>
> text data bss dec hex filename
> 1330953 109008 125656 1565617 17e3b1 vmlinux
>
> and the kernel reboots. If I alter the size by changing some
> irrelevant config option (like disabling INPUT_MOUSEDEV or
> enabling KALLSYMS), the reboot problem doesn't occur.

That's bizarre. Especially the fact that a _smaller_ kernel has problems,
but a bigger one does not.
>
> With gcc-3.2 the bug disappears, but only because gcc-3.2
> generates a much larger code segment. If I remove some
> driver & fs config options, the vmlinux size becomes almost
> the same as above, and the reboot bug appears again.
>
> The same kernel that fails on the 486 boots Ok on my newer
> test boxes, so the problem is either 486-specific, related
> to the actual memory size, or the BIOS memory size reporting
> method (the 486 uses int 15 0x88); here's what 2.5.54 says:
>
> BIOS-provided physical RAM map:
> BIOS-88: 0000000000000000 - 000000000009f000 (usable)
> BIOS-88: 0000000000100000 - 0000000001c00000 (usable)

That looks like a perfectly fine memory map, even if 28MB of memory sounds
a bit strange.

Linus

2003-01-12 07:01:53

by Brian Gerst

Subject: Re: 2.5.55/.56 instant reboot problem on 486

diff -urN linux-2.5.56/arch/i386/mm/init.c linux/arch/i386/mm/init.c
--- linux-2.5.56/arch/i386/mm/init.c Sun Jan 12 00:16:22 2003
+++ linux/arch/i386/mm/init.c Sun Jan 12 01:48:28 2003
@@ -71,12 +71,16 @@
  */
 static pte_t * __init one_page_table_init(pmd_t *pmd)
 {
-	pte_t *page_table = (pte_t *) alloc_bootmem_low_pages(PAGE_SIZE);
-	set_pmd(pmd, __pmd(__pa(page_table) | _PAGE_TABLE));
-	if (page_table != pte_offset_kernel(pmd, 0))
-		BUG();
+	if (pmd_none(*pmd)) {
+		pte_t *page_table = (pte_t *) alloc_bootmem_low_pages(PAGE_SIZE);
+		set_pmd(pmd, __pmd(__pa(page_table) | _PAGE_TABLE));
+		if (page_table != pte_offset_kernel(pmd, 0))
+			BUG();
 
-	return page_table;
+		return page_table;
+	}
+
+	return pte_offset_kernel(pmd, 0);
 }
 
 /*


Attachments:
ptefix-1 (756.00 B)
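
The behavioural change in the patch can be seen in a small user-space model. This is a sketch only: the `toy_` types and functions are hypothetical stand-ins for `pmd_t`, `pte_t`, `pmd_none()` and the bootmem allocator, and `calloc()` stands in for a zeroed low-memory page. It shows that the patched version reuses a page table that is already installed, while the old version always replaces it with an empty one. It does not attempt to explain why the failure depended on kernel size.

```c
#include <assert.h>
#include <stdlib.h>

#define PTRS_PER_PTE 1024

typedef struct { unsigned long val; } toy_pte;   /* stand-in for pte_t */
typedef struct { toy_pte *table; } toy_pmd;      /* NULL means "none" */

static int toy_pmd_none(const toy_pmd *pmd)
{
    return pmd->table == NULL;
}

/* Old behaviour: always install a fresh, zeroed page table, discarding
 * whatever mapping the entry already pointed to (leaked here on purpose,
 * as the model has no notion of freeing boot-time tables). */
toy_pte *toy_one_page_table_init_old(toy_pmd *pmd)
{
    toy_pte *page_table = calloc(PTRS_PER_PTE, sizeof(*page_table));

    assert(page_table != NULL);
    pmd->table = page_table;
    return page_table;
}

/* Patched behaviour: allocate only if the entry is empty, otherwise
 * hand back the table that is already in place. */
toy_pte *toy_one_page_table_init(toy_pmd *pmd)
{
    if (toy_pmd_none(pmd)) {
        toy_pte *page_table = calloc(PTRS_PER_PTE, sizeof(*page_table));

        assert(page_table != NULL);
        pmd->table = page_table;
        return page_table;
    }
    return pmd->table;
}
```

Calling the patched function twice on the same entry returns the same table with its contents intact; calling the old one afterwards yields a different, empty table.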

2003-01-12 15:57:15

by Mikael Pettersson

Subject: Re: 2.5.55/.56 instant reboot problem on 486

On Sat, 11 Jan 2003 20:17:19 -0800 (PST), Linus Torvalds wrote:
>On Sun, 12 Jan 2003, Mikael Pettersson wrote:
>>
>> My '94 vintage 486 has problems booting 2.5.55 and 2.5.56.
>
>Should I take it that 2.5.54 works? Or you haven't tested?
>...
>Ho humm.. Sounds like the non-PSE case is broken. Which should probably
>mean that even newer CPU's should show the same thing if we boot with
>"mem=nopentium". Can you verify that with your other machine that
>otherwise boots the same kernel fine?

2.5.54 works, but since the bug in 2.5.55/.56 is dependent on
kernel size, and since I couldn't find anything in patch-2.5.55
to explain the change in behaviour, I suspected that the bug has
been around a bit longer: I just didn't manage to trigger it.

mem=nopentium made no difference for the other machine: it still
managed to boot the same kernel the 486 failed to boot.

However, with the patch to one_page_table_init() that Brian Gerst
posted earlier today (included below), my 486 boots 2.5.56 Ok.

/Mikael

diff -urN linux-2.5.56/arch/i386/mm/init.c linux/arch/i386/mm/init.c
--- linux-2.5.56/arch/i386/mm/init.c Sun Jan 12 00:16:22 2003
+++ linux/arch/i386/mm/init.c Sun Jan 12 01:48:28 2003
@@ -71,12 +71,16 @@
  */
 static pte_t * __init one_page_table_init(pmd_t *pmd)
 {
-	pte_t *page_table = (pte_t *) alloc_bootmem_low_pages(PAGE_SIZE);
-	set_pmd(pmd, __pmd(__pa(page_table) | _PAGE_TABLE));
-	if (page_table != pte_offset_kernel(pmd, 0))
-		BUG();
+	if (pmd_none(*pmd)) {
+		pte_t *page_table = (pte_t *) alloc_bootmem_low_pages(PAGE_SIZE);
+		set_pmd(pmd, __pmd(__pa(page_table) | _PAGE_TABLE));
+		if (page_table != pte_offset_kernel(pmd, 0))
+			BUG();
 
-	return page_table;
+		return page_table;
+	}
+
+	return pte_offset_kernel(pmd, 0);
 }
 
 /*

2003-01-13 04:13:47

by Bill Davidsen

Subject: Re: 2.5.55/.56 instant reboot problem on 486
