Hi,
Due to some mixture of late-night hacking and weak coffee I managed to build a
HIGHMEM (4G) kernel on a P2-192M box. I then tried to create some loopback
filesystems (ext2 on ext3) and mount them; mounting caused the attached oops.
I noticed the highmem option was enabled when I saw create_bounce in the trace.
It looks like create_bounce is passing mempool_alloc a NULL pool argument (a
pointer to the page_pool global variable in highmem.c); I verified this by
putting a BUG() check against pool. To me it looks like page_pool was never
mempool_create()'d before the mempool_alloc attempt: had mempool_create() been
called and failed, we would have hit the BUG() right after what seems to be its
only call site, mm/highmem.c:init_emergency_pool(). The reason it was never
called is that (sysinfo) "i" has a totalhigh of 0, so we return from
init_emergency_pool() without calling mempool_create():
mm/highmem.c:init_emergency_pool()
<--snip-->
	if (!i.totalhigh)
		return 0;    <--- we end up skipping mempool_create() here
	page_pool = mempool_create(POOL_SIZE, page_pool_alloc, page_pool_free, NULL);
	if (!page_pool)
		BUG();
<--snip-->
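For illustration, here is a minimal userspace sketch of that failure path (the
names and struct layout are invented, not the real mm/mempool.c code): with
page_pool left NULL, the first real work mempool_alloc() does is read
pool->pool_data and call pool->alloc() through the NULL pointer, which lines up
with the reads and indirect call at small offsets (0x14/0x18) in the decoded
oops below.

/*
 * nullpool.c: userspace model only; names and struct layout are invented
 * and do not match mm/mempool.c. Build with: gcc -Wall -o nullpool nullpool.c
 */
#include <stdio.h>

struct mempool {
	int min_nr, curr_nr;			/* bookkeeping fields first, so      */
	void *pool_data;			/* pool_data/alloc end up at small   */
	void *(*alloc)(int gfp, void *data);	/* non-zero offsets from the base    */
};

static struct mempool *page_pool;	/* never created: the box has no highmem */

static void *mempool_alloc_model(struct mempool *pool, int gfp)
{
	/* With pool == NULL this reads and calls through a near-NULL address,
	 * analogous to "mov 0x14(%esi)" / "call *0x18(%esi)" in the oops. */
	return pool->alloc(gfp, pool->pool_data);
}

int main(void)
{
	printf("calling mempool_alloc_model(NULL, 0)...\n");
	mempool_alloc_model(page_pool, 0);	/* segfaults, mirroring the kernel oops */
	return 0;
}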
Then again, is a highmem kernel on a non-highmem box a valid configuration?
Unable to handle kernel NULL pointer dereference at virtual address 00000014
printing eip:
c0132c7a
*pde = 00000000
Oops: 0000
CPU: 0
EIP: 0010:[<c0132c7a>] Not tainted
EFLAGS: 00010206
eax: c6e14000 ebx: c145ed20 ecx: 00000001 edx: c145eea0
esi: 00000000 edi: 00000060 ebp: 00000070 esp: c6e15dd8
ds: 0018 es: 0018 ss: 0018
Process mount (pid: 4317, stackpage=c6e15000)
Stack: 00000000 c6e14000 00000000 00000000 00000000 c6e14000 00000000 00000000
c145ed20 00000000 c145c9a0 c145ccc0 c01332ea 00000000 00000070 00000000
00000000 00000000 c145eea0 00000000 00000007 00000000 c7efc000 cc83f8d8
Call Trace: [<c01332ea>] [<cc83f8d8>] [<c01b6000>] [<c01256c2>] [<c01b60ed>]
[<c013777c>] [<c012e9a1>] [<c012832c>] [<c013a560>] [<c01287dc>] [<c0128680>]
[<c0134de6>] [<c0134d39>] [<c01087eb>]
Code: 8b 5e 14 53 57 ff 56 18 5a 85 c0 59 0f 85 b4 00 00 00 39 fd
>>EIP; c0132c7a <mempool_alloc+5a/130> <=====
Trace; c01332ea <create_bounce+ca/2a0>
Trace; cc83f8d8 <[loop]loop_make_request+98/1f0>
Trace; c01b6000 <generic_make_request+1a0/1c0>
Trace; c01256c2 <do_anonymous_page+c2/e0>
Trace; c01b60ed <submit_bio+7d/90>
Trace; c013777c <block_read_full_page+24c/260>
Trace; c012e9a1 <__alloc_pages+41/180>
Trace; c012832c <do_generic_file_read+2cc/440>
Trace; c013a560 <blkdev_get_block+0/40>
Trace; c01287dc <generic_file_read+7c/130>
Trace; c0128680 <file_read_actor+0/e0>
Trace; c0134de6 <sys_read+96/d0>
Trace; c0134d39 <sys_llseek+c9/e0>
Trace; c01087eb <system_call+33/38>
Code; c0132c7a <mempool_alloc+5a/130>
00000000 <_EIP>:
Code; c0132c7a <mempool_alloc+5a/130> <=====
0: 8b 5e 14 mov 0x14(%esi),%ebx <=====
Code; c0132c7d <mempool_alloc+5d/130>
3: 53 push %ebx
Code; c0132c7e <mempool_alloc+5e/130>
4: 57 push %edi
Code; c0132c7f <mempool_alloc+5f/130>
5: ff 56 18 call *0x18(%esi)
Code; c0132c82 <mempool_alloc+62/130>
8: 5a pop %edx
Code; c0132c83 <mempool_alloc+63/130>
9: 85 c0 test %eax,%eax
Code; c0132c85 <mempool_alloc+65/130>
b: 59 pop %ecx
Code; c0132c86 <mempool_alloc+66/130>
c: 0f 85 b4 00 00 00 jne c6 <_EIP+0xc6> c0132d40 <mempool_alloc+120/130>
Code; c0132c8c <mempool_alloc+6c/130>
12: 39 fd cmp %edi,%ebp
bug check patch:
diff -urN linux-2.5.1-pre11-orig/mm/highmem.c linux-2.5.1-pre11-test/mm/highmem.c
--- linux-2.5.1-pre11-orig/mm/highmem.c Thu Jan 1 13:33:33 1998
+++ linux-2.5.1-pre11-test/mm/highmem.c Thu Jan 1 03:33:20 1998
@@ -204,8 +204,11 @@
 	si_meminfo(&i);
 	si_swapinfo(&i);
-	if (!i.totalhigh)
+	if (!i.totalhigh) {
+		printk(KERN_WARNING "WARNING: You have highmem support on a non-highmem box!"
+		       " Recompile with CONFIG_NOHIGHMEM=y\n");
 		return 0;
+	}
 	page_pool = mempool_create(POOL_SIZE, page_pool_alloc, page_pool_free, NULL);
 	if (!page_pool)
diff -urN linux-2.5.1-pre11-orig/mm/mempool.c linux-2.5.1-pre11-test/mm/mempool.c
--- linux-2.5.1-pre11-orig/mm/mempool.c Thu Jan 1 13:33:33 1998
+++ linux-2.5.1-pre11-test/mm/mempool.c Thu Jan 1 03:28:34 1998
@@ -186,6 +186,9 @@
 	int curr_nr;
 	DECLARE_WAITQUEUE(wait, current);
 	int gfp_nowait = gfp_mask & ~__GFP_WAIT;
+
+	if (!pool)
+		BUG();
 repeat_alloc:
 	element = pool->alloc(gfp_nowait, pool->pool_data);
Another question: how did we manage to hit the bounce stuff when I mounted
a loopback filesystem? Regular mounts are fine.
Cheers,
Zwane Mwaikambo
On Sun, 16 Dec 2001, Zwane Mwaikambo wrote:
> Then again, is a highmem kernel on a non-highmem box a valid configuration?
Andrea had patches that allowed such a configuration. Better still,
it could be used for debugging highmem problems on boxes without highmem.
I thought this stuff had been merged; maybe it was only in -aa or -ac.
regards,
Dave.
--
| Dave Jones. http://www.codemonkey.org.uk
| SuSE Labs
On Sun, 16 Dec 2001, Zwane Mwaikambo wrote:
> Then again, is a highmem kernel on a non-highmem box a valid
> configuration?
Yes, it's a valid configuration. E.g. distribution makers frequently
use highmem-enabled kernels, and it's natural that they boot & work
just fine on non-highmem boxes as well. Also, even a highmem box could
have a RAM failure at any time that forces a temporary removal of RAM,
leaving the box with no highmem RAM anymore; in that situation it
would be pretty awkward if the highmem-enabled kernel failed.
Ingo
On Sun, 16 Dec 2001, Zwane Mwaikambo wrote:
> Due to some mixture of late-night hacking and weak coffee I managed to
> build a HIGHMEM (4G) kernel on a P2-192M box. I then tried to create
> some loopback filesystems (ext2 on ext3) and mount them; mounting
> caused the attached oops. I noticed the highmem option was enabled
> when I saw create_bounce in the trace. It looks like create_bounce is
> passing mempool_alloc a NULL pool argument
> [...]
It looks like the BLK_BOUNCE_HIGH definition is wrong; it's off by one.
Please try the attached patch; does it fix the oops? (The patch also fixes
BLK_BOUNCE_ANY, which is off by one as well.) In both cases, we created a
bounce page for the very last page in the system.
Ingo
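For illustration, a small sketch of the kind of off-by-one described above.
These are not the real blkdev.h macros, and the page counts are assumptions
for a 192M box; the sketch only shows how a bounce limit that stops one page
short makes the very last page in the system look like it needs bouncing,
which on a no-highmem box then reaches the never-created page_pool.

/* bouncelimit.c: illustration only; NOT the real BLK_BOUNCE_HIGH /
 * BLK_BOUNCE_ANY definitions. Assumes 4K pages and 192 MB of low memory. */
#include <stdio.h>

#define PAGE_SHIFT 12

int main(void)
{
	unsigned long long max_low_pfn = 49152;	/* 192 MB / 4 KB pages */

	/* buggy limit: stops at the start of the last low-memory page */
	unsigned long long limit_buggy = (max_low_pfn - 1) << PAGE_SHIFT;
	/* fixed limit: covers every byte of the last low-memory page */
	unsigned long long limit_fixed = (max_low_pfn << PAGE_SHIFT) - 1;

	/* a buffer sitting at the end of the very last page in the system */
	unsigned long long last_byte = ((max_low_pfn - 1) << PAGE_SHIFT)
					+ ((1 << PAGE_SHIFT) - 1);

	printf("buggy limit bounces the last page: %s\n",
	       last_byte > limit_buggy ? "yes" : "no");	/* yes */
	printf("fixed limit bounces the last page: %s\n",
	       last_byte > limit_fixed ? "yes" : "no");	/* no */
	return 0;
}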
On Sun, 16 Dec 2001, Ingo Molnar wrote:
> Yes, it's a valid configuration. E.g. distribution makers frequently
> use highmem-enabled kernels, and it's natural that they boot & work
> just fine on non-highmem boxes as well. Also, even a highmem box could
> have a RAM failure at any time that forces a temporary removal of RAM,
> leaving the box with no highmem RAM anymore; in that situation it
> would be pretty awkward if the highmem-enabled kernel failed.
Thanks, I'll try the patch this evening when I get home and give you the
lowdown.
Cheers,
Zwane Mwaikambo
On Sun, 16 Dec 2001, Zwane Mwaikambo wrote:
> Thanks, I'll try the patch this evening when I get home and give you the
> lowdown.
No luck, same oops.
Cheers,
Zwane
Okay, there was another bug as well, this time in the loopback driver: it
did not set up its own bounce limit. This happens because the loopback
driver is a special driver that is not governed by the normal elevator and
thus does not call blk_init_queue(). So the attached patch has two fixes:
- call blk_queue_bounce_limit() from loop.c
- fix the off-by-one bounce-limit bugs in blkdev.h
Does this fix your system?
Ingo
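For illustration, a userspace model of the mechanism described above (the
names and the zero default are assumptions, not the real block-layer or
drivers/block/loop.c code): a driver like loop that only registers its own
make_request function never goes through blk_init_queue(), so unless it calls
blk_queue_bounce_limit() itself its queue's bounce limit stays at the
uninitialized default, and every page, even ordinary low memory, looks like it
needs bouncing; on a box with no highmem that needless bounce then hits the
never-created page_pool.

/* loopbounce.c: model only; queue layout and defaults are invented,
 * not the real request queue. Build with: gcc -Wall -o loopbounce loopbounce.c */
#include <stdio.h>

struct queue {
	unsigned long bounce_pfn;	/* pages at or above this pfn get bounced */
};

/* What a normal driver effectively gets from blk_init_queue() (modelled). */
static void init_queue_normal(struct queue *q, unsigned long max_low_pfn)
{
	q->bounce_pfn = max_low_pfn;
}

/* Loop-style driver before the fix: registers make_request only and
 * never sets a bounce limit, so the limit stays at the model's default. */
static void init_queue_loop_buggy(struct queue *q)
{
	q->bounce_pfn = 0;
}

static int needs_bounce(const struct queue *q, unsigned long pfn)
{
	return pfn >= q->bounce_pfn;
}

int main(void)
{
	unsigned long max_low_pfn = 49152;	/* 192 MB of low memory, no highmem */
	struct queue normal, loopq;

	init_queue_normal(&normal, max_low_pfn);
	init_queue_loop_buggy(&loopq);

	/* pfn 1000 is ordinary low memory and should never need bouncing */
	printf("normal queue bounces pfn 1000: %d\n", needs_bounce(&normal, 1000));	/* 0 */
	printf("loop   queue bounces pfn 1000: %d\n", needs_bounce(&loopq, 1000));	/* 1 */
	return 0;
}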
On Mon, 17 Dec 2001, Ingo Molnar wrote:
>
> Okay, there was another bug as well, this time in the loopback driver: it
> did not set up its own bounce limit. This happens because the loopback
> driver is a special driver that is not governed by the normal elevator and
> thus does not call blk_init_queue(). So the attached patch has two fixes:
>
> - call blk_queue_bounce_limit() from loop.c
> - fix the off-by-one bounce-limit bugs in blkdev.h
>
> Does this fix your system?
Yep dead on, thanks.
Regards,
Zwane Mwaikambo