2001-04-29 21:49:03

by putter

Subject: reiserfs autofix?

Hi,
I am a kernel newbie, especially with logging filesystems.
Now I am using Mandrake 7.1 with a 2.4.3 kernel, the imon patch,
and NVidia drivers compiled into the kernel.
Now, all my partitions are ReiserFS. I usually play Quake once
or twice a day. Sometimes the graphics subsystem freezes up, so it takes
keyboard input. Caps Lock and Num Lock work fine, unless I try to kill
X with Ctrl-Alt-Backspace. So I reset my machine with the hardware switch.

Here is the interesting part... after I reset my machine like that,
some files start to appear corrupted. Segmentation faults etc.
Isn't reiserfs supposed to be safe? NOW, THE REAL SPOOKY PART:
I reboot my machine with the normal procedure, like shutdown -r now,
and on the next boot, the corrupted files FIX themselves. Any insight?
I think it is rather unacceptable...
cheers,
pavel


2001-04-29 22:56:08

by Chris Mason

Subject: Re: reiserfs autofix?



On Sunday, April 29, 2001 02:48:27 PM -0700 putter wrote:

> Hi,
> I am a kernel newbie, especially with logging filesystems.
> Now I am using Mandrake 7.1 with a 2.4.3 kernel, the imon patch,
> and NVidia drivers compiled into the kernel.
      ^^^^^^^^^^^^^^

The binary-only nvidia drivers make it a bit hard for us to debug.

> Now, all my partitions are ReiserFS. I usually play Quake once
> or twice a day. Sometimes the graphics subsystem freezes up, so it takes
> keyboard input. Caps Lock and Num Lock work fine, unless I try to kill
> X with Ctrl-Alt-Backspace. So I reset my machine with the hardware switch.

Check your /var/log/messages. You probably have messages from reiserfs.
Send along an lspci so we can see what your hardware is.
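
If the log is long, a small filter can pull out just the reiserfs lines. The following is only a minimal sketch: it assumes reiserfs kernel messages look like "kernel: vs-13042: reiserfs_read_inode2: ...", and the names REISERFS_LINE, reiserfs_messages and summarize are made up for illustration.

```python
import re
from collections import Counter

# Assumed message shape: "... kernel: <code>: reiserfs_<function>: <text>".
# Adjust the pattern if your syslog format differs.
REISERFS_LINE = re.compile(
    r"kernel: (?P<code>[a-z]+-\d+): (?P<func>reiserfs_\w+):"
)


def reiserfs_messages(lines):
    """Yield (code, function, line) for each line that looks like a reiserfs message."""
    for line in lines:
        match = REISERFS_LINE.search(line)
        if match:
            yield match.group("code"), match.group("func"), line.rstrip("\n")


def summarize(lines):
    """Count how often each (code, function) pair occurs."""
    return Counter((code, func) for code, func, _ in reiserfs_messages(lines))
```

To use it, pass any iterable of log lines, for example an open file handle on /var/log/messages, to summarize() and print the resulting counts.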

-chris

2001-04-30 07:07:36

by putter

Subject: Re: reiserfs autofix?

I think I have tracked the problem down to the card itself. My machine is in graphics mode all the time,
like 24 hours a day, and it seems that this is somewhat taxing on the card. So now I switch down to
text mode every time I leave the machine. How did I find out? I placed my finger on the heatsink of my GeForce DDR.
It was HOT! The fan works all right, so if I run the computer for a while, stress accumulates, and when I run the GeForce
under the stress of maximum resolutions, it craps out. So much for NVidia, eh?

BTW, I don't question the graphical subsystem crashes. I question reiserfs, which is supposed to leave my partitions
in a consistent state no matter how trigger-happy with the power switch I am. Or is my judgement clouded? >=)

So here are the details: the offending reiserfs messages, after the last boot.

2376 Apr 29 15:23:28 candle fancylogin: from /dev/tty1: ACCESS GRANTED: pavel logged in
2377 Apr 29 15:23:33 candle kernel: NVRM: loading NVIDIA kernel module version 1.0-769
2378 Apr 29 16:24:29 candle kernel: mtrr: no MTRR for e4000000,2000000 found
2379 Apr 29 16:24:45 candle kernel: mtrr: no MTRR for e4000000,2000000 found
2380 Apr 29 16:24:50 candle fancylogin: from /dev/tty1: ACCESS GRANTED: pavel logged in
2381 Apr 29 16:31:18 candle modprobe: modprobe: Can't locate module net-pf-10
2382 Apr 29 18:01:02 candle kernel: vs-13042: reiserfs_read_inode2: [7772 8013 0x0 SD] not found
2383 Apr 29 18:01:02 candle kernel: vs-13048: reiserfs_iget: bad_inode. Stat data of (7772 8013) not found
2384 Apr 29 19:01:01 candle kernel: vs-13042: reiserfs_read_inode2: [7772 8013 0x0 SD] not found
2385 Apr 29 19:01:01 candle kernel: vs-13048: reiserfs_iget: bad_inode. Stat data of (7772 8013) not found
2386 Apr 29 20:01:00 candle kernel: vs-13048: reiserfs_iget: bad_inode. Stat data of (7772 8013) not found
2387 Apr 29 21:01:00 candle kernel: vs-13048: reiserfs_iget: bad_inode. Stat data of (7772 8013) not found
2388 Apr 29 22:01:01 candle kernel: vs-13048: reiserfs_iget: bad_inode. Stat data of (7772 8013) not found
2389 Apr 29 23:01:01 candle kernel: vs-13048: reiserfs_iget: bad_inode. Stat data of (7772 8013) not found
2390 Apr 29 23:52:55 candle sudo: pavel : TTY=pts/1 ; PWD=/home/pavel ; USER=root ; COMMAND=/bin/su -
2391 Apr 29 23:52:55 candle PAM_pwdb[2242]: (su) session opened for user root by pavel(uid=0)
2392 Apr 29 23:53:07 candle PAM_pwdb[2263]: (su) session opened for user spam by pavel(uid=0)
2393 Apr 29 23:54:42 candle PAM_pwdb[2263]: (su) session closed for user spam
2394 Apr 29 23:54:44 candle PAM_pwdb[2285]: (su) session opened for user spam by pavel(uid=0)
2395 Apr 30 00:00:14 candle sudo: pavel : TTY=pts/0 ; PWD=/home/pavel ; USER=root ; COMMAND=/bin/su -
2396 Apr 30 00:00:14 candle PAM_pwdb[2320]: (su) session opened for user root by pavel(uid=0)
2397 Apr 30 00:01:00 candle kernel: vs-13048: reiserfs_iget: bad_inode. Stat data of (7772 8013) not found

2001-04-30 12:03:55

by Chris Mason

Subject: Re: reiserfs autofix?



On Monday, April 30, 2001 12:07:04 AM -0700 putter wrote:

> I think I have tracked the problem down to the card itself. My machine is
> in graphics mode all the time, like 24 hours a day, and it seems that
> this is somewhat taxing on the card. So now I switch down to text mode
> every time I leave the machine. How did I find out? I placed my finger
> on the heatsink of my GeForce DDR. It was HOT! The fan works all right,
> so if I run the computer for a while, stress accumulates, and when I run
> the GeForce under the stress of maximum resolutions, it craps out. So
> much for NVidia, eh?

Do a search through the kernel archives for nvidia. The crashes could just
be the driver. But heat is always a problem; add fans ;-)

>
> BTW, I don't question the graphical subsystem crashes. I question
> reiserfs, which is supposed to leave my partitions in a consistent state
> no matter how trigger-happy with the power switch I am. Or is my
> judgement clouded? >=)

After a crash, reiserfs only cleans up after itself. If something else went
in and hosed the metadata (nvidia, bad drive, controller, IDE fun with
VIA), you've still got bad blocks.

This is one possible reason that we've seen more reports than ext2 has.
After a crash, e2fsck fixes _whatever_ was broken. Log replay in
reiserfs only fixes the operations that were in progress when the system
crashed.
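
To make the difference concrete, here is a toy model of log replay. It is a sketch only, not how the reiserfs journal is actually laid out: the class ToyJournaledDisk and all of its methods are hypothetical, and a plain dict stands in for the disk.

```python
class ToyJournaledDisk:
    """Toy model: a dict of blocks plus a log of writes not yet applied."""

    def __init__(self, blocks):
        self.blocks = dict(blocks)  # stand-in for on-disk state
        self.journal = []           # logged (block_id, new_value) pairs

    def log_write(self, block_id, value):
        # Record the intended write in the journal before touching the block.
        self.journal.append((block_id, value))

    def stray_write(self, block_id, value):
        # Damage that bypasses the filesystem (bad driver, drive, controller).
        self.blocks[block_id] = value

    def replay(self):
        # Crash recovery: redo only what the journal recorded, then clear it.
        touched = {block_id for block_id, _ in self.journal}
        for block_id, value in self.journal:
            self.blocks[block_id] = value
        self.journal.clear()
        return touched


disk = ToyJournaledDisk({"inode_a": "old", "inode_b": "good"})
disk.log_write("inode_a", "new")         # operation in progress at crash time
disk.stray_write("inode_b", "garbage")   # corruption the journal never saw
repaired = disk.replay()
# Replay brings inode_a up to date; inode_b was never logged, so it stays damaged.
```

In this model, only a full scan of every block (the fsck-style approach) would notice that inode_b is bad.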

Anyway, those messages show that you've got metadata corruption. Grab the
latest reiserfsprogs from ftp.reiserfs.org and run reiserfsck -x (after
backing things up).

-chris

2001-04-30 17:08:32

by Alan

Subject: Re: reiserfs autofix?

> 2376 Apr 29 15:23:28 candle fancylogin: from /dev/tty1: ACCESS GRANTED: pavel logged in
> 2377 Apr 29 15:23:33 candle kernel: NVRM: loading NVIDIA kernel module version 1.0-769
> 2378 Apr 29 16:24:29 candle kernel: mtrr: no MTRR for e4000000,2000000 found
> 2379 Apr 29 16:24:45 candle kernel: mtrr: no MTRR for e4000000,2000000 found
> 2380 Apr 29 16:24:50 candle fancylogin: from /dev/tty1: ACCESS GRANTED: pavel logged in

You are using the NVIDIA closed driver. Please duplicate the problem without
that driver, or take it up with NVIDIA, not the kernel list.