With 2 GB in place, the kernel boots just fine, but with
4 GB, it reports:
kernel direct mapping tables upto ffff 8101 5000 000 @ 8000-f000
PANIC: early exception rip ffff ffff 8016 f002 error 0 cr2 4230
PANIC: early exception rip ffff ffff 8011 d1fe error 0 cr2 ffff ffff f5ff d023
and some other lines, which I didn't jot down on paper...
These were copied from some Fedora Core development kernel version
after 2.6.15-rc1 (last working one) in a box with 4 GB memory.
The original hex values had no spaces in them, though.
I added the spaces while trying to make sense of the 64-bit values.
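
(A minimal sketch, assuming nothing beyond the values quoted above: the
spaced groups can be joined back into single 64-bit numbers like this.
The helper name join_hex_groups is made up for illustration.)

```python
def join_hex_groups(text: str) -> int:
    """Turn a space-separated hex string such as 'ffff ffff 8016 f002'
    back into one integer by dropping the spaces."""
    return int(text.replace(" ", ""), 16)


# The two RIP values and one CR2 value from the report above.
rip_first = join_hex_groups("ffff ffff 8016 f002")
rip_second = join_hex_groups("ffff ffff 8011 d1fe")
cr2_second = join_hex_groups("ffff ffff f5ff d023")

# Print them as 16-digit hex, the way the kernel would have shown them.
for value in (rip_first, rip_second, cr2_second):
    print(f"{value:016x}")
```
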
The last working kernel with all 4 GB of memory in the box was 2.6.15-rc1.
Since then the kernels have failed to boot at all unless the machine's
PHYSICAL memory is stripped down to 2 GB. Command-line options
(e.g. "mem=2G") don't help at all.
/Matti Aarnio
Matti Aarnio <[email protected]> writes:
> With 2 GB in place, the kernel boots just fine, but with
> 4 GB, it reports:
Works for me on several machines.
I even have a fix now for the wrong-MCFG problem on Asus boards that
broke the IOMMU (the workaround is pci=nommconf).
>
> kernel direct mapping tables upto ffff 8101 5000 000 @ 8000-f000
> PANIC: early exception rip ffff ffff 8016 f002 error 0 cr2 4230
> PANIC: early exception rip ffff ffff 8011 d1fe error 0 cr2 ffff ffff f5ff d023
>
> and some other lines, which I didn't jot down on paper...
Can you please look up the RIP values in your System.map?
> These were copied from some Fedora Core development kernel version
> after 2.6.15-rc1 (last working one) in a box with 4 GB memory.
Please try vanilla 2.6.15-rc2 as a reference at least.
-Andi
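
(A rough sketch of the System.map lookup asked for above, not anything
from the thread itself. It assumes the usual "address type name" line
format of System.map; the sample symbols and their addresses are made
up for illustration and are not taken from any real kernel build.)

```python
import bisect


def parse_system_map(lines):
    """Parse 'address type name' lines into (address, name) pairs,
    sorted by address. Lines with fewer than three fields are skipped."""
    symbols = []
    for line in lines:
        parts = line.split()
        if len(parts) < 3:
            continue
        symbols.append((int(parts[0], 16), parts[2]))
    symbols.sort()
    return symbols


def lookup(symbols, addr):
    """Return (name, offset) for the nearest symbol at or below addr,
    or None if addr lies below every symbol."""
    addresses = [a for a, _ in symbols]
    i = bisect.bisect_right(addresses, addr) - 1
    if i < 0:
        return None
    base, name = symbols[i]
    return name, addr - base


# Hypothetical sample data, NOT a real System.map.
SAMPLE = """\
ffffffff80100000 T example_start
ffffffff8016f000 T example_func_a
ffffffff8016f400 T example_func_b
""".splitlines()

symbols = parse_system_map(SAMPLE)
name, offset = lookup(symbols, 0xffffffff8016f002)
print(f"{name}+{offset:#x}")
```

With a real build, the lines would come from the System.map file that
matches the crashing kernel, e.g. parse_system_map(open("System.map")),
where the path is whatever your build produced.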
On Tue, Nov 29, 2005 at 07:01:12AM -0700, Andi Kleen wrote:
> Matti Aarnio <[email protected]> writes:
>
> > With 2 GB in place, the kernel boots just fine, but with
> > 4 GB, it reports:
>
> Works for me on several machines.
>
> I even have a fix now for the wrong-MCFG problem on Asus boards that
> broke the IOMMU (the workaround is pci=nommconf).
>
> >
> > kernel direct mapping tables upto ffff 8101 5000 000 @ 8000-f000
> > PANIC: early exception rip ffff ffff 8016 f002 error 0 cr2 4230
> > PANIC: early exception rip ffff ffff 8011 d1fe error 0 cr2 ffff ffff f5ff d023
> >
> > and some other lines, which I didn't jot down on paper...
>
> Can you please look up the RIP values in your System.map?
>
> > These were copied from some Fedora Core development kernel version
> > after 2.6.15-rc1 (last working one) in a box with 4 GB memory.
>
> Please try vanilla 2.6.15-rc2 as a reference at least.
Tried. It crashes with 4 GB of memory present in the box.
It boots and runs nicely with 2 GB of memory installed.
After adding -g to the *CFLAGS in the top-level Makefile, I tried
to determine WHERE those PANICs happened in rc2:
(gdb) list *0xffffffff80163a43
0xffffffff80163a43 is in memmap_init_zone (mm/page_alloc.c:1687).
1682            for (pfn = start_pfn; pfn < end_pfn; pfn++, page++) {
1683                    if (!early_pfn_valid(pfn))
1684                            continue;
1685                    if (!early_pfn_in_nid(pfn, nid))
1686                            continue;
1687                    page = pfn_to_page(pfn);
1688                    set_page_links(page, zone, nid, pfn);
1689                    set_page_count(page, 1);
1690                    reset_page_mapcount(page);
1691                    SetPageReserved(page);
(gdb) list *0xffffffff801196fa
0xffffffff801196fa is in safe_smp_processor_id (include/asm/smp.h:77).
72      #define raw_smp_processor_id() read_pda(cpunumber)
73
74      static inline int hard_smp_processor_id(void)
75      {
76              /* we don't want to mark this access volatile - bad code generation */
77              return GET_APIC_ID(*(unsigned int *)(APIC_BASE+APIC_ID));
78      }
79
80      extern int safe_smp_processor_id(void);
81      extern int __cpu_disable(void);
Not that those explain all that much...
> -Andi
/Matti Aarnio
> Not that those explain all that much...
Can you send me your .config? If you have SPARSEMEM enabled can you
disable it?
-Andi
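
(A small sketch for checking whether SPARSEMEM is enabled in a .config,
not part of the original thread. It assumes the usual .config
conventions: enabled options appear as "CONFIG_FOO=value" and disabled
ones as "# CONFIG_FOO is not set". The sample lines are made up for
illustration and are not the reporter's configuration.)

```python
def config_value(lines, option):
    """Return the value assigned to a kernel config option, or None
    when the option is absent or marked '# ... is not set'."""
    prefix = option + "="
    for line in lines:
        line = line.strip()
        if line.startswith(prefix):
            return line[len(prefix):]
    return None


# Hypothetical sample, NOT a real .config from this report.
SAMPLE = """\
# CONFIG_DISCONTIGMEM is not set
CONFIG_SPARSEMEM=y
""".splitlines()

for option in ("CONFIG_SPARSEMEM", "CONFIG_DISCONTIGMEM"):
    value = config_value(SAMPLE, option)
    print(option, "enabled" if value == "y" else "not enabled")
```

Against a real tree, the lines would come from the .config in the
build directory, e.g. config_value(open(".config"), "CONFIG_SPARSEMEM").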
On 11/29/05, Andi Kleen <[email protected]> wrote:
> > Not that those explain all that much...
>
> Can you send me your .config? If you have SPARSEMEM enabled can you
> disable it?
This looks just like the sparsemem troubles. There is a patch around
somewhere.... I thought a patch was being pushed into mainline but I
guess not.
Thanks,
Keith
On Tue, Nov 29, 2005 at 06:26:28PM -0800, Keith Mannthey wrote:
> On 11/29/05, Andi Kleen <[email protected]> wrote:
> > > Not that those explain all that much...
> >
> > Can you send me your .config? If you have SPARSEMEM enabled can you
> > disable it?
>
> This looks just like the sparsemem troubles. There is a patch around
> somewhere.... I thought a patch was being pushed into mainline but I
> guess not.
It was, I think. But I still don't trust it.
-Andi