Linux 2.6.14.2, yeah, I know, and sorry if this has been fixed...but read on, please,
this is a new take...
So everything else is equal...same rootfs, same command line, same
kernel, and same modules...
When I boot my x86 board with one BIOS (and bootloader, not to forget that),
I can modprobe 'til the proverbial cows come home...
With the other BIOS (and its own bootloader), modprobe'ing ANY module gets this:
# modprobe ide-core
Unable to handle kernel paging request at virtual address e081e000
printing eip:
c0127ae5
*pde = 01774067
*pte = 1fa6b01e
Oops: 0002 [#1]
SMP
Modules linked in:
CPU: 0
EIP: 0060:[<c0127ae5>] Not tainted VLI
EFLAGS: 00010202 (2.6.14.2-2-zbios)
eax: 00000000 ebx: e08128f0 ecx: 00004ac4 edx: 00012b10
esi: 00012b10 edi: e081e000 ebp: 00000000 esp: df863f3c
ds: 007b es: 007b ss: 0068
Process modprobe (pid: 265, threadinfo=df862000 task=dfe31030)
Stack: c17b1f60 e081e000 00000000 e0812380 00000000 00000000 00000000 00000000
0000000d 00000011 00000000 00000018 00000009 00000000 0000000f 0000001f
0000001e 00000020 e0819298 e0804000 b7e76000 df862000 b7f162f4 df862000
Call Trace:
[<c0127f98>]
[<c010271d>]
Code: 89 44 24 18 83 c4 14 83 7c 24 04 00 0f 84 0a 04 00 00 8b 7c 24 0c 31 ed 89 e8 8b b7 bc 00 00 00 8b 7c 24 04 89 f1
Segmentation fault
I've seen others post similar issues, but no one has correlated it to a BIOS issue.
Can anybody suggest to me why on earth BIOS would matter when
modprobe'ing??? I'm a BIOS engineer, and I can fix my own @#$%, but
I cannot fathom the connection, at least not while I can pass a breathalyzer test...
Thanks,
John
P.S.: I've not been able to find any other problems or issues on either BIOS. Modules
that were linked in at build time (full USB stack, a couple of SCSI mods, and a few
network mods) all work fine, and there are no other manifestations that anything is amiss...
I can provide more details, but all I'm looking for is a hint at what the connection between
load_module and BIOS might be...I can probably take it from there...
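A note for readers decoding the oops above: the number after "Oops:" is the x86 page-fault error code, whose low three bits say whether the page was present, whether the access was a write, and whether it came from user mode. The sketch below decodes those bits and checks the Present bit of a page table entry. The macro and function names are chosen here for illustration and are not taken from the 2.6.14 source.

```c
#include <stdio.h>

/* x86 page-fault error code bits (names are illustrative). */
#define PF_PROT  0x1u  /* 0 = page not present, 1 = protection violation */
#define PF_WRITE 0x2u  /* 0 = read access,      1 = write access */
#define PF_USER  0x4u  /* 0 = kernel mode,      1 = user mode */

/* Hypothetical helper: describe an oops error code in words. */
static const char *describe_fault(unsigned int code, char *buf, size_t len)
{
    snprintf(buf, len, "%s, %s, %s mode",
             (code & PF_PROT)  ? "protection violation" : "page not present",
             (code & PF_WRITE) ? "write" : "read",
             (code & PF_USER)  ? "user" : "kernel");
    return buf;
}

/* Bit 0 of an x86 page table entry is the Present bit. */
static int pte_present(unsigned int pte)
{
    return (pte & 0x1u) != 0;
}
```

Applied to the report above, describe_fault(0x0002, ...) reads as a kernel-mode write to a page that is not present, and pte_present(0x1fa6b01e) is false, so the two values in the oops agree with each other.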
On Tue, 2006-03-21 at 20:05 -0800, John Z. Bohach wrote:
> Linux 2.6.14.2, yeah, I know, and sorry if this has been fixed...but read on, please,
> this is a new take...
at least enable CONFIG_KALLSYMS to get us a readable backtrace
On Wednesday 22 March 2006 01:06, Arjan van de Ven wrote:
> On Tue, 2006-03-21 at 20:05 -0800, John Z. Bohach wrote:
> > Linux 2.6.14.2, yeah, I know, and sorry if this has been fixed...but read
> > on, please, this is a new take...
>
> at least enable CONFIG_KALLSYMS to get us a readable backtrace
>
I'll do you one better: here's the failing line from module.c:
        /* Determine total sizes, and put offsets in sh_entsize.  For now
           this is done generically; there doesn't appear to be any
           special cases for the architectures. */
        layout_sections(mod, hdr, sechdrs, secstrings);

        /* Do the allocs. */
        ptr = module_alloc(mod->core_size);
        if (!ptr) {
                err = -ENOMEM;
                goto free_percpu;
        }
!!! --->  memset(ptr, 0, mod->core_size);
        mod->module_core = ptr;
The !!!---> memset() line is the one that fails. In assembly, it's:
c0127ae5: f3 ab repz stos %eax,%es:(%edi)
eax is 0, and es is 0x7b with edi at 0xe0806000. Everything looks fine.
Interestingly, and expectedly, I might add, these values are identical in both the
failing case and the succeeding case. Here's the rest of the story:
I'm testing a different boot loader in the failing case. The failing case loads
arch/i386/boot/compressed/vmlinux
while the successful case loads
arch/i386/boot/bzImage.
Sorry I wasn't complete before; I was hoping not to
have to get this deep. As far as I can tell, the bootloader (a very minimal one
that uses no BIOS calls) sets up the Linux params area, as well as the MP table
and E820 table (in the cmdline area...). Linux is happy with both, without
displaying any visible differences with either bootloader on either BIOS.
/proc/meminfo looks identical in both failing/succeeding cases...
However, in the failing case, the 16-bit setup.o stuff of the kernel never got run.
My bootloader takes care of the useful work, and sets things up appropriately
without needing any BIOS calls (32-bit protected mode, initial GDT, etc.).
I've looked through setup.S, and can't see any major
differences between 2.4 and 2.6, but I'm starting to wonder.... I'm thinking that by
skipping the setup.S stuff, some initialization related to paging and/or virtual
memory gets skipped? Does anyone know if this is true?
I have a feeling that it's some pre-kernel initialization (or lack thereof) that has some subtle
side-effect regarding paging and is triggered by the first modload. Does the fact that the virtual
address is at 0xe0806000 have any significance? Does the 0xexxxxxxx range only
get used for modules? I'm trying to figure out what triggers this failed paging request...
On Wednesday 22 March 2006 19:48, John Z. Bohach wrote:
> I'll do you one better: here's the failing line from module.c:
>
> /* Determine total sizes, and put offsets in sh_entsize. For now
> this is done generically; there doesn't appear to be any
> special cases for the architectures. */
> layout_sections(mod, hdr, sechdrs, secstrings);
>
> /* Do the allocs. */
> ptr = module_alloc(mod->core_size);
> if (!ptr) {
> err = -ENOMEM;
> goto free_percpu;
> }
> !!! ---> memset(ptr, 0, mod->core_size);
> mod->module_core = ptr;
>
Let me summarize this a little better:

        ptr = module_alloc(mod->core_size);

is fine, but when, a few lines later, memset() tries to operate on that same
ptr to zero it out with

        memset(ptr, 0, mod->core_size);

I get:

Unable to handle kernel paging request at virtual address f8806000
(f8806000 is (in this case) the value of ptr returned by module_alloc()).

I've validated the parameters, and they all look okay. I think the page fault for ptr
is normal(?), and the page fault handler is supposed to set up this page???
But it fails...
Yet it succeeds with a different BIOS and bootloader, is what I'm trying to say.
So it seems that the page fault handler is somehow affected by something that the
BIOS has/has not done, long after the system has booted and been running, with
many page faults under its belt...now I've seen it all...or not.
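On the question of where these addresses come from: on i386, vmalloc-style allocations are handed out from a region that starts just above the end of directly mapped low memory. The sketch below reproduces that arithmetic under stated assumptions: the default 3G/1G split (PAGE_OFFSET at 0xC0000000), an 8 MB VMALLOC_OFFSET, and no early vmalloc reservation. The helper name is hypothetical, and this is an illustration of the address layout, not the kernel's own code.

```c
#include <stdint.h>

#define PAGE_OFFSET    0xC0000000u          /* assumed 3G/1G split */
#define VMALLOC_OFFSET (8u * 1024 * 1024)   /* assumed 8 MB guard gap */

/* Hypothetical helper: first vmalloc address for a given amount of lowmem. */
static uint32_t vmalloc_start(uint32_t lowmem_bytes)
{
    /* End of the direct mapping of low memory. */
    uint32_t high_memory = PAGE_OFFSET + lowmem_bytes;

    /* Leave a gap of at least 8 MB above lowmem and land on an
     * 8 MB boundary. */
    return (high_memory + 2 * VMALLOC_OFFSET - 1) & ~(VMALLOC_OFFSET - 1);
}
```

Under these assumptions, 512 MB of lowmem gives 0xe0800000 and 896 MB of lowmem gives 0xf8800000. Both reported fault addresses (e0806000 and f8806000) sit a few pages above those values, which is what one would expect for an early allocation from this region rather than a range reserved for modules alone.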
On Wed, 22 Mar 2006 21:26:56 -0800 John Z. Bohach wrote:
> So it seems that the page fault handler is somehow affected by something that the
> BIOS has/has not done, long after the system has booted and been running, with
> many page faults under its belt...now I've seen it all...or not.
Sounds like we need to see complete boot logs from both BIOSen boots.
Can you do that?
I'm just guessing that the memory maps are different, but who knows.
---
~Randy
On Wednesday 22 March 2006 21:42, Randy.Dunlap wrote:
> > So it seems that the page fault handler is somehow affected by something
> > that the BIOS has/has not done, long after the system has booted and been
> > running, with many page faults under its belt...now I've seen it all...or
> > not.
>
> Sounds like we need to see complete boot logs from both BIOSen boots.
> Can you do that?
>
> I'm just guessing that the memory maps are different, but who knows.
I got something...
Here's the symbolic dump. I've gotten it to break on the BIOS it
did work on, by adding 512 MB RAM and bringing the total RAM to 1GB.
In fact, it now breaks during boot-up, and doesn't even give me a chance
to modprobe anything. However, the cmdline is what makes it break/work:
Here it is:
fails with cmdline:
Kernel command line: ro root=/dev/sda1 rootdelay=10 mem=0x200M console=ttyS0,115200n8
works with:
Kernel command line: ro root=/dev/sda1 rootdelay=10 console=ttyS0,115200n8
Note the "mem=" being the differentiator!
So I guess BIOS is off the hook. Here's a more interesting dump of the new failing
case (with 1 GB RAM, and mem=0x200M on command line). BTW, note that
mem=0x200M works fine as long as there's only 512 MB in the system.
(And also note that the kernel was built without ACPI (or APM) support).
INIT: version 2.85 booting
INIT: Entering runlevel: 3
Starting system log daemon...
[ 39.333210] Unable to handle kernel paging request at virtual address b7c4e000
[ 39.340642] printing eip:
[ 39.343414] c013213c
[ 39.345653] *pde = 017eb067
[ 39.348516] *pte = 00000000
[ 39.351379] Oops: 0002 [#1]
[ 39.354239] SMP DEBUG_PAGEALLOC
[ 39.357476] Modules linked in:
[ 39.360615] CPU: 0
[ 39.360616] EIP: 0060:[<c013213c>] Not tainted VLI
[ 39.360618] EFLAGS: 00010006 (2.6.14.2)
[ 39.373302] EIP is at free_block+0x41/0xbc
[ 39.377501] eax: c1508d40 ebx: dffb6000 ecx: dffb6b80 edx: b7c4e000
[ 39.384458] esi: c1508d40 edi: 00000000 ebp: c1505280 esp: c1567ef4
[ 39.391413] ds: 007b es: 007b ss: 0068
[ 39.395622] Process events/0 (pid: 4, threadinfo=c1566000 task=c151fa30)
[ 39.402309] Stack: c150aa14 00000003 c150aa00 c1505280 c0132963 c1505280 c150aa14 00000003
[ 39.410930] 00000000 00000000 c1508368 c1508d10 c1508d40 c1505280 c0132a0d c1505280
[ 39.419568] c150aa00 00000000 00000000 00000002 c15052dc 00000246 c14063e0 c14063e4
[ 39.428186] Call Trace:
[ 39.430878] [<c0132963>] drain_array_locked+0x61/0x8c
[ 39.436161] [<c0132a0d>] cache_reap+0x7f/0x18f
[ 39.440823] [<c011f86a>] worker_thread+0x16f/0x1dd
[ 39.445843] [<c013298e>] cache_reap+0x0/0x18f
[ 39.450420] [<c0110299>] default_wake_function+0x0/0x12
[ 39.455879] [<c0110299>] default_wake_function+0x0/0x12
[ 39.461338] [<c011f6fb>] worker_thread+0x0/0x1dd
[ 39.466173] [<c0122d23>] kthread+0x7c/0xa6
[ 39.470472] [<c0122ca7>] kthread+0x0/0xa6
[ 39.474681] [<c0100ea5>] kernel_thread_helper+0x5/0xb
[ 39.479967] Code: 24 18 8b 15 50 ec 31 c0 8b 0c b8 8d 81 00 00 00 40 c1 e8 0c c1 e0 05 8b 5c 02 1c 8b 44 24 20 8b 53
[ 39.499780]
[42949372.960000] Linux version 2.6.14.2 (root@zeus) (gcc version 3.3.4) #1 SMP Fri Mar 24 08:27:33 PST 2006
[42949372.960000] BIOS-provided physical RAM map:
[42949372.960000] BIOS-e820: 0000000000000000 - 000000000009fc00 (usable)
[42949372.960000] BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved)
[42949372.960000] BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved)
[42949372.960000] BIOS-e820: 0000000000100000 - 000000003fe30000 (usable)
[42949372.960000] BIOS-e820: 000000003fe30000 - 000000003fe40000 (ACPI data)
[42949372.960000] BIOS-e820: 000000003fe40000 - 000000003ff00000 (ACPI NVS)
[42949372.960000] BIOS-e820: 000000003ff00000 - 0000000040000000 (reserved)
[42949372.960000] BIOS-e820: 00000000fec00000 - 00000000fec01000 (reserved)
[42949372.960000] BIOS-e820: 00000000fee00000 - 00000000fee01000 (reserved)
[42949372.960000] user-defined physical RAM map:
[42949372.960000] user: 0000000000000000 - 000000000009fc00 (usable)
[42949372.960000] user: 000000000009fc00 - 00000000000a0000 (reserved)
[42949372.960000] user: 00000000000e0000 - 0000000000100000 (reserved)
[42949372.960000] user: 0000000000100000 - 0000000020000000 (usable)
[42949372.960000] 512MB LOWMEM available.
[42949372.960000] found SMP MP-table at 000ff780
...
Thanks for taking a look...
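The difference between the "BIOS-provided" and "user-defined" maps in the log above can be illustrated with a small sketch: every region that starts at or above the mem= limit disappears, and the usable region that crosses the limit is cut short at it. The struct, enum values used, and function name below are assumptions made for illustration (the type numbers follow the usual E820 convention of 1 for RAM and 2 for reserved). It reproduces the logged outcome, not necessarily the algorithm the kernel uses to get there.

```c
#include <stddef.h>
#include <stdint.h>

enum { E820_RAM = 1, E820_RESERVED = 2, E820_ACPI = 3, E820_NVS = 4 };

/* One map entry; `end` is exclusive, as in the log output. */
struct region {
    uint64_t start;
    uint64_t end;
    int type;
};

/* Hypothetical helper: trim the map in place so that no usable RAM
 * extends past `limit`. Returns the new number of entries. */
static size_t clip_map(struct region *map, size_t n, uint64_t limit)
{
    size_t out = 0;

    for (size_t i = 0; i < n; i++) {
        struct region r = map[i];

        if (r.start >= limit)
            continue;               /* entirely above the limit: drop */
        if (r.type == E820_RAM && r.end > limit)
            r.end = limit;          /* crosses the limit: truncate */
        map[out++] = r;
    }
    return out;
}
```

Feeding the nine BIOS-e820 entries from the log through clip_map() with a limit of 0x20000000 (what mem=0x200M works out to) leaves the four entries shown under "user-defined physical RAM map", with the last usable region ending at 0x20000000.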
On Fri, 24 Mar 2006 09:36:13 -0800 John Z. Bohach wrote:
> On Wednesday 22 March 2006 21:42, Randy.Dunlap wrote:
> > > So it seems that the page fault handler is somehow affected by something
> > > that the BIOS has/has not done, long after the system has booted and been
> > > running, with many page faults under its belt...now I've seen it all...or
> > > not.
> >
> > Sounds like we need to see complete boot logs from both BIOSen boots.
> > Can you do that?
> >
> > I'm just guessing that the memory maps are different, but who knows.
>
> I got something...
>
> Here's the symbolic dump. I've gotten it to break on the BIOS it
> did work on, by adding 512 MB RAM and bringing the total RAM to 1GB.
> In fact, it now breaks during boot-up, and doesn't even give me a chance
> to modprobe anything. However, the cmdline is what makes it break/work:
>
> Here it is:
>
> fails with cmdline:
>
> Kernel command line: ro root=/dev/sda1 rootdelay=10 mem=0x200M console=ttyS0,115200n8
>
> works with:
>
> Kernel command line: ro root=/dev/sda1 rootdelay=10 console=ttyS0,115200n8
>
> Note the "mem=" being the differentiator!
OK, that is memory map difference.
Can you test a more recent kernel to see if it has the same problem?
(like 2.6.16 or 2.6.16-git9)
---
~Randy
On Friday 24 March 2006 16:32, Randy.Dunlap wrote:
> > Here it is:
> >
> > fails with cmdline:
> >
> > Kernel command line: ro root=/dev/sda1 rootdelay=10 mem=0x200M
> > console=ttyS0,115200n8
> >
> > works with:
> >
> > Kernel command line: ro root=/dev/sda1 rootdelay=10
> > console=ttyS0,115200n8
> >
> > Note the "mem=" being the differentiator!
>
> OK, that is memory map difference.
>
> Can you test a more recent kernel to see if it has the same problem?
> (like 2.6.16 or 2.6.16-git9)
No luck, or difference, for that matter. 2.6.16 behaves identically. I'm
trying a few different options, such as disabling MSI/MSI-X support,
because what I've seen is that everything works fine as long as the h/w
has MSI support, but in all the cases I've seen fail, the common denominator
is no MSI (and all are ICH4 platforms). The cases where I can't make it fail
are those where the h/w has MSI support. One other noteworthy difference is that the
failures all occur on Intel graphics chipsets, while the successes are on non-graphics chipsets.
Still trying to find out whether the failure follows graphics or the ICH4.
Anyway, what would help me is if someone could tell me if the page fault is a normal and
expected code path by design, in order to page in the area setup by __vmalloc_area()
as triggered by the module_alloc() call. I'd really rather not have to trace through the
page fault handler to identify the difference between success/failure unless I have to.
>Subject: Re: mem= causes oops (was Re: BIOS causes (exposes?) modprobe
> (load_module) kernel oops)
>
Hm, seeing this mail reminds me of something I've seen on SPARC just a while
ago. Maybe it's just something on my side. If I specify `mem=65536`, that
is, with no size suffix like M or G, what does Linux make out of it? 65536
KB or 64 KB?
Jan Engelhardt
--
On Sat, 25 Mar 2006 19:50:35 +0100 (MET) Jan Engelhardt wrote:
>
> >Subject: Re: mem= causes oops (was Re: BIOS causes (exposes?) modprobe
> > (load_module) kernel oops)
> >
>
> Hm, seeing this mail reminds me of something I seen on SPARC just a while
> ago. Maybe it's just something on my side. If I specify `mem=65536`, that
> is, with no size suffix like M or G, what does Linux make out of it? 65536
> KB or 64 KB?
65536 bytes. All of the suffixes [KMG] are optional.
---
~Randy
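To make the suffix handling concrete, here is a userspace sketch modeled loosely on the kernel's memparse() helper: the number is read with automatic base detection (so a 0x prefix means hex), and an optional K, M, or G suffix scales it by the matching power of 1024. The function name parse_mem is made up for this example, and strtoull() stands in for the kernel's own string-to-number routine.

```c
#include <stdlib.h>

/* Userspace sketch of a mem=-style size parser (hypothetical name). */
static unsigned long long parse_mem(const char *s)
{
    char *end;
    /* Base 0: accepts decimal, 0x-prefixed hex, and 0-prefixed octal. */
    unsigned long long v = strtoull(s, &end, 0);

    switch (*end) {
    case 'G': case 'g':
        v <<= 10;
        /* fall through */
    case 'M': case 'm':
        v <<= 10;
        /* fall through */
    case 'K': case 'k':
        v <<= 10;
        break;
    default:
        break;      /* no suffix: the value is in bytes */
    }
    return v;
}
```

With this reading, "0x200M" is hex 200 (512) megabytes, which matches the "512MB LOWMEM available" line in the boot log earlier in the thread, and a bare "65536" is 65536 bytes, the same as "64K".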
>>
>> Hm, seeing this mail reminds me of something I seen on SPARC just a while
>> ago. Maybe it's just something on my side. If I specify `mem=65536`, that
>> is, with no size suffix like M or G, what does Linux make out of it? 65536
>> KB or 64 KB?
>
>65536 bytes. All of the suffixes [KMG] are optional.
>
Oh ok, that should explain why it 'segfaults' with mem=65536 ;)
Jan Engelhardt
--