2002-11-22 00:59:45

by Tupshin Harper

Subject: paging oops with 2.4.20-rc2

I'm experiencing frequent oopses whenever I do anything
CPU-, memory-, or disk-intensive. I have yet to get through a kernel
compilation on this machine.

The hardware is a new KT400-based motherboard (Gigabyte GA-7VAX with VIA
chipset) with an Athlon XP. I upgraded to 2.4.20-rc2 in order to get DMA
support for the VIA 8235 chipset, and that appears to work. However,
this version did introduce frequent oopses (as shown below), all involving
kernel paging requests. I've included a ksymoops-processed log below.
Please let me know if you need more information, as this is the first
kernel bug report I've made.

Some notable things about the machine:
1) Every partition is reiserfs 3.6.
2) Every partition except /boot is on a software raid-0 + LVM setup.

Thanks
-Tupshin


Unable to handle kernel paging request at virtual address db7c7cd8
c01a76f9
*pde = 1b40e9e3
Oops: 0009
CPU: 0
EIP: 0010:[do_journal_end+1513/2720] Not tainted
EFLAGS: 00010206
eax: db7c7cc0 ebx: 00000024 ecx: d572b000 edx: 8005003b
esi: d6f1e000 edi: d572b000 ebp: e0bb41a8 esp: c15b5f3c
ds: 0018 es: 0018 ss: 0018
Process kupdated (pid: 6, stackpage=c15b5000)
Stack: c15b5fa0 df5dec00 0000001f 3ddd5019 00002c98 00000004 00000002
00000000
000000c5 000000e8 00000bd4 d5969140 d5712440 d554b000 d5968000
e0b9db10
c01a6b0f c15b5fa0 df5dec00 00000001 00000006 df5dec00 df5dec44
c15b424b
Call Trace: [flush_old_commits+287/320] [reiserfs_write_super+21/32]
[sync_supers+191/240] [sync_old_buffers+12/64] [kupdate+213/256]
Code: 8b 40 18 a9 00 00 01 00 0f 84 89 00 00 00 8b 44 24 48 8b 54
Using defaults from ksymoops -t elf32-i386 -a i386


>>eax; db7c7cc0 <_end+1b4105bc/2070095c>
>>ecx; d572b000 <_end+153738fc/2070095c>
>>esi; d6f1e000 <_end+16b668fc/2070095c>
>>edi; d572b000 <_end+153738fc/2070095c>
>>ebp; e0bb41a8 <[8139too].data.end+f3221/4270d9>
>>esp; c15b5f3c <_end+11fe838/2070095c>

Code; 00000000 Before first symbol
00000000 <_EIP>:
Code; 00000000 Before first symbol
0: 8b 40 18 mov 0x18(%eax),%eax
Code; 00000003 Before first symbol
3: a9 00 00 01 00 test $0x10000,%eax
Code; 00000008 Before first symbol
8: 0f 84 89 00 00 00 je 97 <_EIP+0x97>
Code; 0000000e Before first symbol
e: 8b 44 24 48 mov 0x48(%esp,1),%eax
Code; 00000012 Before first symbol
12: 8b 54 00 00 mov 0x0(%eax,%eax,1),%edx
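
For readers decoding the report above: on i386 the number after "Oops:" in a paging-request oops is the page-fault error code, a small bit field. The Python sketch below decodes it. The function and table names are illustrative and do not come from any kernel tool, and the bit-3 meaning (reserved bit set) is taken from the x86 architecture definition rather than from the 2.4 source comments, which describe only bits 0 to 2. It does not apply to other traps such as "invalid operand: 0000".

```python
# Sketch: decode the hex value printed after "Oops:" for an i386 page fault.
# Each entry is (mask, meaning if set, meaning if clear). Names are illustrative.
PF_BITS = [
    (0x1, "protection fault", "page not present"),
    (0x2, "write access", "read access"),
    (0x4, "user mode", "kernel mode"),
    (0x8, "reserved bit set in a paging entry", None),  # assumption: x86 RSVD bit
]


def decode_oops_code(text):
    """Return human-readable flags for the hex string after 'Oops:'."""
    code = int(text, 16)
    flags = []
    for mask, if_set, if_clear in PF_BITS:
        if code & mask:
            flags.append(if_set)
        elif if_clear is not None:
            flags.append(if_clear)
    return flags


if __name__ == "__main__":
    for flag in decode_oops_code("0009"):
        print(flag)
```

For the value 0009 above, this reports a kernel-mode read with the reserved-bit flag set, which is consistent with the faulting instruction being a load (mov 0x18(%eax),%eax). A reserved-bit fault may point at a damaged page-table entry rather than an ordinary bad pointer, but that is an inference, not something the log states.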



2002-11-22 06:47:43

by Oleg Drokin

Subject: Re: paging oops with 2.4.20-rc2

Hello!

On Thu, Nov 21, 2002 at 05:07:05PM -0800, Tupshin Harper wrote:
> I'm experiencing frequent oops when doing much of anything
> cpu/memory/disk intensive. I've yet to get through a kernel compilation
> on this machine.

Is the oops always the same, looking like the one you've posted here?
Or is it different from time to time?
If it is always the same, can you please try to compile your kernel with
the CONFIG_REISERFS_CHECK (reiserfs debug) option enabled and see what happens?

Thank you.

Bye,
Oleg

2002-11-22 08:51:41

by Tupshin Harper

Subject: Re: paging oops with 2.4.20-rc2

Oleg Drokin wrote:

>Hello!
>
>Is the oops always the same, looking like the one you've posted here?
>Or is it different from time to time?
>If it is always the same, can you please try to compile your kernel with
>the CONFIG_REISERFS_CHECK (reiserfs debug) option enabled and see what happens?
>
>Thank you.
>
Well, after looking more closely at my logs, I realize that I'm hitting a
kernel BUG in page_alloc.c more often than the oops I posted earlier, and
that it always precedes that oops (though not necessarily immediately).
Here's the log:

kernel BUG at page_alloc.c:100!
invalid operand: 0000
CPU: 0
EIP: 0010:[__free_pages_ok+54/656] Not tainted
EFLAGS: 00010282
eax: c1404914 ebx: c141d15c ecx: c141d178 edx: c140782c
esi: 00000000 edi: c864079c ebp: 085e4000 esp: d7619e30
ds: 0018 es: 0018 ss: 0018
Process cc1 (pid: 20387, stackpage=d7619000)
Stack: c141d15c 00003000 c864079c 085e4000 c0378900 00000000 c031bdf4
c102c01c
c031be0c 00000213 ffffffff 00016ef5 0000b77a c012b47b c012b873
c141d15c
c0120d90 c141d15c 0021c000 c01211b3 17ef0067 c85a96c0 c8aef2c0
081e4000
Call Trace: [__free_pages+27/32] [free_page_and_swap_cache+51/64]
[__free_pte+64/80] [zap_page_range+403/576] [exit_mmap+181/288]
Code: 0f 0b 64 00 d3 d2 2b c0 83 7b 08 00 74 08 0f 0b 66 00 d3 d2
Using defaults from ksymoops -t elf32-i386 -a i386


>>eax; c1404914 <_end+103b1d0/20b1491c>
>>ebx; c141d15c <_end+1053a18/20b1491c>
>>ecx; c141d178 <_end+1053a34/20b1491c>
>>edx; c140782c <_end+103e0e8/20b1491c>
>>edi; c864079c <_end+8277058/20b1491c>
>>esp; d7619e30 <_end+172506ec/20b1491c>

Code; 00000000 Before first symbol
00000000 <_EIP>:
Code; 00000000 Before first symbol
0: 0f 0b ud2a
Code; 00000002 Before first symbol
2: 64 fs
Code; 00000003 Before first symbol
3: 00 d3 add %dl,%bl
Code; 00000005 Before first symbol
5: d2 2b shrb %cl,(%ebx)
Code; 00000007 Before first symbol
7: c0 83 7b 08 00 74 08 rolb $0x8,0x7400087b(%ebx)
Code; 0000000e Before first symbol
e: 0f 0b ud2a
Code; 00000010 Before first symbol
10: 66 data16
Code; 00000011 Before first symbol
11: 00 d3 add %dl,%bl
Code; 00000013 Before first symbol
13: d2 00 rolb %cl,(%eax)
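
A note on the disassembly above: the bytes after ud2a are probably data, not instructions. Assuming the 2.4 i386 BUG() macro emits ud2 followed by a 16-bit line number and a 32-bit pointer to the file name (an assumption worth checking against include/asm-i386/page.h in the tree being run), the Code: line can be decoded as in this Python sketch. The helper name is hypothetical.

```python
import struct


def decode_bug_bytes(code_line):
    """Decode 'ud2; .word line; .long file_ptr' from an oops 'Code:' byte list.

    Assumes the layout described in the lead-in; returns (line, file_ptr).
    """
    raw = bytes(int(b, 16) for b in code_line.split())
    if raw[:2] != b"\x0f\x0b":
        raise ValueError("does not start with ud2 (0f 0b)")
    # Little-endian: unsigned 16-bit line number, then unsigned 32-bit pointer.
    line, file_ptr = struct.unpack_from("<HI", raw, 2)
    return line, file_ptr


if __name__ == "__main__":
    code = "0f 0b 64 00 d3 d2 2b c0 83 7b 08 00 74 08 0f 0b 66 00 d3 d2"
    line, file_ptr = decode_bug_bytes(code)
    print(line, hex(file_ptr))
```

Applied to the Code: line above, the first field decodes to line 100, matching the "kernel BUG at page_alloc.c:100!" banner, which would explain why ksymoops shows nonsense instructions after ud2a.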

kernel BUG at page_alloc.c:100!
invalid operand: 0000
CPU: 0
EIP: 0010:[__free_pages_ok+54/656] Not tainted
EFLAGS: 00010282
eax: 0100000d ebx: c141d15c ecx: c141d15c edx: 00000000
esi: 00000000 edi: 00000000 ebp: c0194a90 esp: c15b5f78
ds: 0018 es: 0018 ss: 0018
Process kupdated (pid: 6, stackpage=c15b5000)
Stack: c141d15c d76db230 00000000 c0194a90 d76db230 00000000 c8aef080
c15b5fc8
00027557 c0194aad c141d15c df5de000 c141d15c c012b47b c0124071
00000004
d76db180 df5de060 df5de000 c0142c1c d76db230 c15b4000 c02bde17
c15b424b
Call Trace: [reiserfs_writepage+0/48] [reiserfs_writepage+29/48]
[__free_pages+27/32] [filemap_fdatasync+129/144]
[sync_unlocked_inodes+140/384]
Code: 0f 0b 64 00 d3 d2 2b c0 83 7b 08 00 74 08 0f 0b 66 00 d3 d2


>>ebx; c141d15c <_end+1053a18/20b1491c>
>>ecx; c141d15c <_end+1053a18/20b1491c>
>>ebp; c0194a90 <reiserfs_writepage+0/30>
>>esp; c15b5f78 <_end+11ec834/20b1491c>

Code; 00000000 Before first symbol
00000000 <_EIP>:
Code; 00000000 Before first symbol
0: 0f 0b ud2a
Code; 00000002 Before first symbol
2: 64 fs
Code; 00000003 Before first symbol
3: 00 d3 add %dl,%bl
Code; 00000005 Before first symbol
5: d2 2b shrb %cl,(%ebx)
Code; 00000007 Before first symbol
7: c0 83 7b 08 00 74 08 rolb $0x8,0x7400087b(%ebx)
Code; 0000000e Before first symbol
e: 0f 0b ud2a
Code; 00000010 Before first symbol
10: 66 data16
Code; 00000011 Before first symbol
11: 00 d3 add %dl,%bl
Code; 00000013 Before first symbol
13: d2 00 rolb %cl,(%eax)

kernel BUG at page_alloc.c:100!
invalid operand: 0000
CPU: 0
EIP: 0010:[__free_pages_ok+54/656] Not tainted
EFLAGS: 00010282
eax: 01000008 ebx: c141d15c ecx: c141d15c edx: 00000000
esi: 00000000 edi: d76db230 ebp: 00000001 esp: d3bd3ef8
ds: 0018 es: 0018 ss: 0018
Process ld (pid: 21231, stackpage=d3bd3000)
Stack: c141d15c 00001000 d76db230 00000001 00000018 ffffffff c0124fda
00000010
00010216 c141d15c dfeb8778 d76db230 00000000 c012b47b c0124b53
d3bd3f8c
c141d15c 00000000 00001000 00000000 d9703640 00000000 40013000
00001000
Call Trace: [file_read_actor+90/144] [__free_pages+27/32]
[do_generic_file_read+499/1024] [generic_file_read+127/272]
[file_read_actor+0/144]
Code: 0f 0b 64 00 d3 d2 2b c0 83 7b 08 00 74 08 0f 0b 66 00 d3 d2


>>ebx; c141d15c <_end+1053a18/20b1491c>
>>ecx; c141d15c <_end+1053a18/20b1491c>
>>edi; d76db230 <_end+17311aec/20b1491c>
>>esp; d3bd3ef8 <_end+1380a7b4/20b1491c>

Code; 00000000 Before first symbol
00000000 <_EIP>:
Code; 00000000 Before first symbol
0: 0f 0b ud2a
Code; 00000002 Before first symbol
2: 64 fs
Code; 00000003 Before first symbol
3: 00 d3 add %dl,%bl
Code; 00000005 Before first symbol
5: d2 2b shrb %cl,(%ebx)
Code; 00000007 Before first symbol
7: c0 83 7b 08 00 74 08 rolb $0x8,0x7400087b(%ebx)
Code; 0000000e Before first symbol
e: 0f 0b ud2a
Code; 00000010 Before first symbol
10: 66 data16
Code; 00000011 Before first symbol
11: 00 d3 add %dl,%bl
Code; 00000013 Before first symbol
13: d2 00 rolb %cl,(%eax)
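
To answer the "is the oops always the same?" question mechanically, the EIP lines of several ksymoops-processed reports can be tallied. The following Python sketch assumes the "EIP: 0010:[symbol+offset/size]" layout seen in the logs above; the regular expression and function name are illustrative only.

```python
import re
from collections import Counter

# Matches e.g. "EIP: 0010:[__free_pages_ok+54/656] Not tainted"
EIP_RE = re.compile(r"EIP:\s+[0-9a-fA-F]{4}:\[(\w+)\+(\d+)/(\d+)\]")


def summarize_eips(log_text):
    """Count oops records by (symbol, offset) taken from their EIP lines."""
    return Counter(
        (m.group(1), int(m.group(2))) for m in EIP_RE.finditer(log_text)
    )


if __name__ == "__main__":
    sample = (
        "EIP: 0010:[do_journal_end+1513/2720] Not tainted\n"
        "EIP: 0010:[__free_pages_ok+54/656] Not tainted\n"
        "EIP: 0010:[__free_pages_ok+54/656] Not tainted\n"
        "EIP: 0010:[__free_pages_ok+54/656] Not tainted\n"
    )
    for (symbol, offset), count in summarize_eips(sample).most_common():
        print(symbol, offset, count)
```

Identical (symbol, offset) pairs across different processes, as with cc1, kupdated and ld above, suggest a single failing check rather than unrelated crashes.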



2002-11-24 05:28:35

by Tupshin Harper

Subject: Re: paging oops with 2.4.20-rc2

Well, after further examination and experimentation, I determined that
both the paging oops and the page_alloc.c BUG reports that I was
getting go away once I move my swap partition off the
raid0+lvm volume it was on and onto a separate disk.

This is still a bug, but a fairly obscure one. Putting swap on raid0 is
obviously not necessary for performance reasons, since the kernel stripes
swap itself, but it certainly caused me a lot of headaches until I figured
out what was going on.
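
For anyone wanting to check a machine for the same layout, here is a rough Python sketch that reads /proc/swaps-style text, flags swap areas whose device path starts with a given prefix, and groups areas by priority (the kernel allocates round-robin across areas of equal priority). The prefixes "/dev/md" and "/dev/vg" are placeholders: LVM device names depend on the volume group, so adjust them. The function names are illustrative.

```python
from collections import defaultdict


def parse_swaps(text):
    """Parse /proc/swaps-style text into (device, priority) pairs."""
    entries = []
    for line in text.splitlines()[1:]:  # first line is the column header
        fields = line.split()
        if len(fields) >= 5:
            entries.append((fields[0], int(fields[4])))
    return entries


def swap_report(text, layered_prefixes=("/dev/md", "/dev/vg")):
    """Return (swap areas on layered devices, equal-priority groups).

    layered_prefixes is a hypothetical list of path prefixes for md/LVM devices.
    """
    entries = parse_swaps(text)
    layered = [dev for dev, _ in entries if dev.startswith(layered_prefixes)]
    by_priority = defaultdict(list)
    for dev, prio in entries:
        by_priority[prio].append(dev)
    # Only groups with more than one area are interleaved by the kernel.
    striped = {p: devs for p, devs in by_priority.items() if len(devs) > 1}
    return layered, striped


if __name__ == "__main__":
    with open("/proc/swaps") as f:
        layered, striped = swap_report(f.read())
    print("swap on layered devices:", layered)
    print("equal-priority groups:", striped)
```

Swap areas on separate disks can be given the same priority in /etc/fstab (the pri= option) to get the kernel's own interleaving without a RAID layer underneath.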

-Tupshin

Oleg Drokin wrote:

>Is the oops always the same, looking like the one you've posted here?
>Or is it different from time to time?
>If it is always the same, can you please try to compile your kernel with
>the CONFIG_REISERFS_CHECK (reiserfs debug) option enabled and see what happens?
>
>Thank you.
>
>Bye,
> Oleg
>
>
>