2001-02-07 01:36:47

by Arthur Pedyczak

Subject: Oopses in 2.4.1 (lots of them)


Hi all,
I have the misfortune of reporting yet another Oops in 2.4.1 (my previous
report got ignored). After running for 4 days I got many, many oopses.
They were triggered by xscreensaver and some other X-related apps.
After dropping to runlevel 3, the system seemed O.K. Nothing unusual in
the process table, no zombies etc. I could restart the X server itself, but any
attempt to start gdm would generate yet another Oops. I had to reboot.

Ideas/suggestions/Help appreciated

Arthur

==========================================================================================
My hardware:
PIII 450
motherboard: Asus P2B
384 MB RAM (no swap)
ide: PIIX4
ide0 hda: WDC AC313000R, ATA DISK drive
hdb: MATSHITA CR-589, ATAPI CDROM drive
ide1 hdc: WDC WD200BB-00AUA1, ATA DISK drive
hdd: MITSBICDRW4420a, ATAPI CDROM drive (ide-scsi)
graphics: Riva TNT2
sound: es1370
eth0 eepro100
eth1 3c59x
=======================
ksymoops output:
=======================
ksymoops 2.3.4 on i686 2.4.1. Options used
-V (default)
-k /proc/ksyms (default)
-l /proc/modules (default)
-o /lib/modules/2.4.1/ (default)
-m /usr/src/linux/System.map (default)

Warning: You did not tell me where to find symbol information. I will
assume that the log matches the kernel and modules that are running
right now and I'll use the default options above for symbol resolution.
If the current kernel and/or modules do not match the log, you can get
more accurate output by telling me the kernel version and where to find
map, modules, ksyms etc. ksymoops -h explains the options.

Feb 6 16:41:46 cs865114-a kernel: Unable to handle kernel paging request at virtual address 0909093e
Feb 6 16:41:46 cs865114-a kernel: c0131ce1
Feb 6 16:41:46 cs865114-a kernel: *pde = 00000000
Feb 6 16:41:46 cs865114-a kernel: Oops: 0002
Feb 6 16:41:46 cs865114-a kernel: CPU: 0
Feb 6 16:41:46 cs865114-a kernel: EIP: 0010:[file_move+25/44]
Feb 6 16:41:46 cs865114-a kernel: EFLAGS: 00210282
Feb 6 16:41:46 cs865114-a kernel: eax: 0909093a ebx: d7937440 ecx: cb456600 edx: c6c35a20
Feb 6 16:41:46 cs865114-a kernel: esi: d5d16600 edi: ffffffe9 ebp: d7a1c320 esp: c3a65f48
Feb 6 16:41:46 cs865114-a kernel: ds: 0018 es: 0018 ss: 0018
Feb 6 16:41:46 cs865114-a kernel: Process xroger (pid: 1066, stackpage=c3a65000)
Feb 6 16:41:46 cs865114-a kernel: Stack: cb456600 c0130a6e cb456600 d7937440 400134a0 c3aa4000 00000000 c3aa4000
Feb 6 16:41:46 cs865114-a kernel: c01309ba d79d01c0 d7a1c320 00000000 c3a64000 00000003 08048984 d79d01c0
Feb 6 16:41:46 cs865114-a kernel: d7a1c320 08048984 c3aa4000 00000003 00000001 00000001 c0130cac c3aa4000
Feb 6 16:41:46 cs865114-a kernel: Call Trace: [dentry_open+170/328] [filp_open+82/92] [sys_open+56/180] [system_call+51/56]
Feb 6 16:41:46 cs865114-a kernel: Code: 89 48 04 89 01 89 59 04 89 0b 90 8d 74 26 00 5b c3 89 f6 53
Using defaults from ksymoops -t elf32-i386 -a i386

Code; 00000000 Before first symbol
00000000 <_EIP>:
Code; 00000000 Before first symbol
0: 89 48 04 mov %ecx,0x4(%eax)
Code; 00000003 Before first symbol
3: 89 01 mov %eax,(%ecx)
Code; 00000005 Before first symbol
5: 89 59 04 mov %ebx,0x4(%ecx)
Code; 00000008 Before first symbol
8: 89 0b mov %ecx,(%ebx)
Code; 0000000a Before first symbol
a: 90 nop
Code; 0000000b Before first symbol
b: 8d 74 26 00 lea 0x0(%esi,1),%esi
Code; 0000000f Before first symbol
f: 5b pop %ebx
Code; 00000010 Before first symbol
10: c3 ret
Code; 00000011 Before first symbol
11: 89 f6 mov %esi,%esi
Code; 00000013 Before first symbol
13: 53 push %ebx

Feb 6 16:51:46 cs865114-a kernel: Unable to handle kernel paging request at virtual address 0909093e
Feb 6 16:51:46 cs865114-a kernel: c0131ce1
Feb 6 16:51:46 cs865114-a kernel: *pde = 00000000
Feb 6 16:51:46 cs865114-a kernel: Oops: 0002
Feb 6 16:51:46 cs865114-a kernel: CPU: 0
Feb 6 16:51:46 cs865114-a kernel: EIP: 0010:[file_move+25/44]
Feb 6 16:51:46 cs865114-a kernel: EFLAGS: 00210282
Feb 6 16:51:46 cs865114-a kernel: eax: 0909093a ebx: d7937440 ecx: c961a5a0 edx: c6c35d80
Feb 6 16:51:46 cs865114-a kernel: esi: d5d16600 edi: ffffffe9 ebp: d7a1c320 esp: d1de7f48
Feb 6 16:51:46 cs865114-a kernel: ds: 0018 es: 0018 ss: 0018
Feb 6 16:51:46 cs865114-a kernel: Process xroger (pid: 1080, stackpage=d1de7000)
Feb 6 16:51:46 cs865114-a kernel: Stack: c961a5a0 c0130a6e c961a5a0 d7937440 400134a0 d7140000 00000000 d7140000
Feb 6 16:51:46 cs865114-a kernel: c01309ba d79d01c0 d7a1c320 00000000 d1de6000 00000003 08048984 d79d01c0
Feb 6 16:51:46 cs865114-a kernel: d7a1c320 08048984 d7140000 00000003 00000001 00000001 c0130cac d7140000
Feb 6 16:51:46 cs865114-a kernel: Call Trace: [dentry_open+170/328] [filp_open+82/92] [sys_open+56/180] [system_call+51/56]
Feb 6 16:51:46 cs865114-a kernel: Code: 89 48 04 89 01 89 59 04 89 0b 90 8d 74 26 00 5b c3 89 f6 53

Code; 00000000 Before first symbol
00000000 <_EIP>:
Code; 00000000 Before first symbol
0: 89 48 04 mov %ecx,0x4(%eax)
Code; 00000003 Before first symbol
3: 89 01 mov %eax,(%ecx)
Code; 00000005 Before first symbol
5: 89 59 04 mov %ebx,0x4(%ecx)
Code; 00000008 Before first symbol
8: 89 0b mov %ecx,(%ebx)
Code; 0000000a Before first symbol
a: 90 nop
Code; 0000000b Before first symbol
b: 8d 74 26 00 lea 0x0(%esi,1),%esi
Code; 0000000f Before first symbol
f: 5b pop %ebx
Code; 00000010 Before first symbol
10: c3 ret
Code; 00000011 Before first symbol
11: 89 f6 mov %esi,%esi
Code; 00000013 Before first symbol
13: 53 push %ebx

Feb 6 17:01:46 cs865114-a kernel: Unable to handle kernel paging request at virtual address 0909093e
Feb 6 17:01:46 cs865114-a kernel: c0131ce1
Feb 6 17:01:46 cs865114-a kernel: *pde = 00000000
Feb 6 17:01:46 cs865114-a kernel: Oops: 0002
Feb 6 17:01:46 cs865114-a kernel: CPU: 0
Feb 6 17:01:46 cs865114-a kernel: EIP: 0010:[file_move+25/44]
Feb 6 17:01:46 cs865114-a kernel: EFLAGS: 00210282
Feb 6 17:01:46 cs865114-a kernel: eax: 0909093a ebx: d7937440 ecx: cf0f8a40 edx: cb0f9680
Feb 6 17:01:46 cs865114-a kernel: esi: d5d16600 edi: ffffffe9 ebp: d7a1c320 esp: d1de7f48
Feb 6 17:01:46 cs865114-a kernel: ds: 0018 es: 0018 ss: 0018
Feb 6 17:01:46 cs865114-a kernel: Process xroger (pid: 1098, stackpage=d1de7000)
Feb 6 17:01:46 cs865114-a kernel: Stack: cf0f8a40 c0130a6e cf0f8a40 d7937440 400134a0 d4223000 00000000 d4223000
Feb 6 17:01:46 cs865114-a kernel: c01309ba d79d01c0 d7a1c320 00000000 d1de6000 00000003 08048984 d79d01c0
Feb 6 17:01:46 cs865114-a kernel: d7a1c320 08048984 d4223000 00000003 00000001 00000001 c0130cac d4223000
Feb 6 17:01:46 cs865114-a kernel: Call Trace: [dentry_open+170/328] [filp_open+82/92] [sys_open+56/180] [system_call+51/56]
Feb 6 17:01:46 cs865114-a kernel: Code: 89 48 04 89 01 89 59 04 89 0b 90 8d 74 26 00 5b c3 89 f6 53

Code; 00000000 Before first symbol
00000000 <_EIP>:
Code; 00000000 Before first symbol
0: 89 48 04 mov %ecx,0x4(%eax)
Code; 00000003 Before first symbol
3: 89 01 mov %eax,(%ecx)
Code; 00000005 Before first symbol
5: 89 59 04 mov %ebx,0x4(%ecx)
Code; 00000008 Before first symbol
8: 89 0b mov %ecx,(%ebx)
Code; 0000000a Before first symbol
a: 90 nop
Code; 0000000b Before first symbol
b: 8d 74 26 00 lea 0x0(%esi,1),%esi
Code; 0000000f Before first symbol
f: 5b pop %ebx
Code; 00000010 Before first symbol
10: c3 ret
Code; 00000011 Before first symbol
11: 89 f6 mov %esi,%esi
Code; 00000013 Before first symbol
13: 53 push %ebx

Feb 6 17:11:46 cs865114-a kernel: Unable to handle kernel paging request at virtual address 0909093e
Feb 6 17:11:46 cs865114-a kernel: c0131ce1
Feb 6 17:11:46 cs865114-a kernel: *pde = 00000000
Feb 6 17:11:46 cs865114-a kernel: Oops: 0002
Feb 6 17:11:46 cs865114-a kernel: CPU: 0
Feb 6 17:11:46 cs865114-a kernel: EIP: 0010:[file_move+25/44]
Feb 6 17:11:46 cs865114-a kernel: EFLAGS: 00210282
Feb 6 17:11:46 cs865114-a kernel: eax: 0909093a ebx: d7937440 ecx: c3958620 edx: cb0f9680
Feb 6 17:11:46 cs865114-a kernel: esi: d5d16600 edi: ffffffe9 ebp: d7a1c320 esp: d1de7f48
Feb 6 17:11:46 cs865114-a kernel: ds: 0018 es: 0018 ss: 0018
Feb 6 17:11:46 cs865114-a kernel: Process xroger (pid: 1114, stackpage=d1de7000)
Feb 6 17:11:46 cs865114-a kernel: Stack: c3958620 c0130a6e c3958620 d7937440 400134a0 c8392000 00000000 c8392000
Feb 6 17:11:46 cs865114-a kernel: c01309ba d79d01c0 d7a1c320 00000000 d1de6000 00000003 08048984 d79d01c0
Feb 6 17:11:46 cs865114-a kernel: d7a1c320 08048984 c8392000 00000003 00000001 00000001 c0130cac c8392000
Feb 6 17:11:46 cs865114-a kernel: Call Trace: [dentry_open+170/328] [filp_open+82/92] [sys_open+56/180] [system_call+51/56]
Feb 6 17:11:46 cs865114-a kernel: Code: 89 48 04 89 01 89 59 04 89 0b 90 8d 74 26 00 5b c3 89 f6 53

Code; 00000000 Before first symbol
00000000 <_EIP>:
Code; 00000000 Before first symbol
0: 89 48 04 mov %ecx,0x4(%eax)
Code; 00000003 Before first symbol
3: 89 01 mov %eax,(%ecx)
Code; 00000005 Before first symbol
5: 89 59 04 mov %ebx,0x4(%ecx)
Code; 00000008 Before first symbol
8: 89 0b mov %ecx,(%ebx)
Code; 0000000a Before first symbol
a: 90 nop
Code; 0000000b Before first symbol
b: 8d 74 26 00 lea 0x0(%esi,1),%esi
Code; 0000000f Before first symbol
f: 5b pop %ebx
Code; 00000010 Before first symbol
10: c3 ret
Code; 00000011 Before first symbol
11: 89 f6 mov %esi,%esi
Code; 00000013 Before first symbol
13: 53 push %ebx


1 warning issued. Results may not be reliable.
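
The four reports above share the same fault address, error code and eax
value. As a rough aid to reading them, here is a minimal Python sketch (not
part of the original report) that pulls those three fields out of one oops
and decodes the error code using the standard x86 page-fault bits. The
function names and the trimmed sample lines are assumptions made for
illustration; the values are copied from the first report.

```python
import re

def decode_error_code(code):
    """Decode the low three bits of an x86 page-fault error code."""
    return {
        "page_present": bool(code & 1),  # 0 means the page was not present
        "write": bool(code & 2),         # 1 means the access was a write
        "user_mode": bool(code & 4),     # 0 means the fault hit in kernel mode
    }

def parse_oops(lines):
    """Return (fault_address, error_code, eax) from the lines of one oops."""
    text = "\n".join(lines)
    fault = int(re.search(r"virtual address ([0-9a-f]{8})", text).group(1), 16)
    code = int(re.search(r"Oops: ([0-9a-f]{4})", text).group(1), 16)
    eax = int(re.search(r"eax: ([0-9a-f]{8})", text).group(1), 16)
    return fault, code, eax

# Lines trimmed from the first report (syslog prefix removed).
sample = [
    "Unable to handle kernel paging request at virtual address 0909093e",
    "Oops: 0002",
    "eax: 0909093a ebx: d7937440 ecx: cb456600 edx: c6c35a20",
]

fault, code, eax = parse_oops(sample)
print(decode_error_code(code))
print(hex(fault - eax))
```

With these values the error code decodes to a kernel-mode write to a page
that is not present, and the fault address is eax + 4. That is consistent
with the first decoded instruction, mov %ecx,0x4(%eax), storing through a
pointer in eax that is not a valid kernel address. It does not say what put
that value there.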


2001-02-07 09:18:27

by Alan

Subject: Re: Oopses in 2.4.1 (lots of them)

> report got ignored). After running for 4 days I got many, many oopses.
> They were trigerred by xscreensaver, and some other X-related apps.
> After dopping to runlevel 3, the system seemed O.K. Nothing unusual in
> graphics: Riva TNT2

That makes it harder to say 'Use a 3.3.6 X server'. If you are using the
nvidia binary/obfuscated modules for their 3d and stuff try running without
them.

Alan

2001-02-07 12:50:55

by Arthur Pedyczak

Subject: Re: Oopses in 2.4.1 (lots of them)

On Wed, 7 Feb 2001, Alan Cox wrote:

> > report got ignored). After running for 4 days I got many, many oopses.
> > They were trigerred by xscreensaver, and some other X-related apps.
> > After dopping to runlevel 3, the system seemed O.K. Nothing unusual in
> > graphics: Riva TNT2
>
> That makes it harder to say 'Use a 3.3.6 X server'. If you are using the
> nvidia binary/obfuscated modules for their 3d and stuff try running without
> them.
>
> Alan
>
Well,
NVidia is only one of a few suspects I have to eliminate. I also used
OSS, vmware, and FreeS/WAN (for IPSEC). So now I am in the process of
eliminating them one by one.
Also, last night Linus suggested that this could be a hardware problem. I
am not sure how to eliminate or confirm this. Recently I added some RAM
(256->384) and decided to get rid of swap. This seemed to have destabilized
the system, although nothing is obvious. I can try to stress the system by
copying 2 CDs to files simultaneously, while running a kernel build in the
background and tar-gzipping /usr, all at once. The load goes through the
roof, but everything works. Then, a few hours later with no load, a simple
thing like xscreensaver or makewhatis.cron would oops for no apparent
reason.
Not sure what to make of all this. Any ideas?

Arthur
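
One way to keep track of the "one by one" elimination described above is to
list which suspect modules are actually loaded and plan one test run per
suspect. The Python sketch below is only an illustration under stated
assumptions: it reads text in the /proc/modules layout (module name as the
first field), the sample text and its sizes are made up, and apart from
NVdriver the suspect names (vmmon, vmnet, ipsec) are assumed module names
for vmware and FreeS/WAN, not taken from this thread.

```python
def loaded_suspects(proc_modules_text, suspects):
    """Return the suspect modules that appear in /proc/modules-style text."""
    loaded = [line.split()[0]
              for line in proc_modules_text.splitlines() if line.strip()]
    return [name for name in loaded if name in suspects]

def elimination_rounds(found):
    """Plan one test run per suspect: leave that module out, keep the rest."""
    return [(name, [m for m in found if m != name]) for name in found]

# Hypothetical module list; sizes and use counts are invented.
SAMPLE = """\
NVdriver 628256 10 (autoclean)
vmmon 18320 0
eepro100 16816 1
3c59x 24408 1
es1370 22880 0
"""

# Assumed suspect names; only NVdriver is named in the thread.
SUSPECTS = {"NVdriver", "vmmon", "vmnet", "ipsec"}

found = loaded_suspects(SAMPLE, SUSPECTS)
for removed, kept in elimination_rounds(found):
    print("run without", removed, "keeping", kept)
```

On a real system the text would come from reading /proc/modules, and each
run would need to outlast the longest uptime seen before an oops for the
result to mean much.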

2001-02-07 18:35:13

by Alan

Subject: Re: Oopses in 2.4.1 (lots of them)

> am not sure how to eliminate or confirm this. Recently I added some RAM
> (256->384) and decided to get rid of swap. This seemed to have destabilized
> the system, although nothing is obvious. I can try to stress the system by

Get a copy of memtest86, it's a standalone memory tester.

2001-02-08 02:45:16

by Arthur Pedyczak

Subject: Re: Oopses in 2.4.1 (lots of them)

On Wed, 7 Feb 2001, Alan Cox wrote:

> > am not sure how to eliminate or confirm this. Recently I added some RAM
> > (256->384) and decided to get rid of swap. This seemed to have destabilized
> > the system, although nothing is obvious. I can try to stress the system by
>
> Get a copy of memtest86, its a standalone memory tester.
>
Alan, Linus,

Thanks for your help. I ran memtest86 for 6 hrs. RAM looks O.K.
I added swap back (just in case). Now I will be eliminating
suspicious kernel modules one by one.
Will post results in a few days.
Cheers!

Arthur

2001-02-22 12:59:41

by Arthur Pedyczak

Subject: Re: Oopses in 2.4.1 (lots of them)

On Wed, 7 Feb 2001, Alan Cox wrote:

> > report got ignored). After running for 4 days I got many, many oopses.
> > They were trigerred by xscreensaver, and some other X-related apps.
> > After dopping to runlevel 3, the system seemed O.K. Nothing unusual in
> > graphics: Riva TNT2
>
> That makes it harder to say 'Use a 3.3.6 X server'. If you are using the
> nvidia binary/obfuscated modules for their 3d and stuff try running without
> them.
>
Alan,
Looks like you were 100% right about the nvidia kernel module. After I
eliminated it and reverted to the driver that comes with XFree-4.0.1, my
system seems stable again. It's been up for 7 days (with NVdriver from
nvidia I couldn't get past the 72 hr mark).
Thanks for your help!

A.