2001-07-18 04:14:14

by Sam Thompson

Subject: ReiserFS / 2.4.6 / Data Corruption

First, please CC all replies to [email protected], as I am not on the mailing list.


The other day a computer of mine lost power and the ext2 fs was severely damaged. I decided to reinstall Debian using reiserfs to prevent this. I had no problems with installation (I've done this same install on other computers), but as I started to untar backup tarballs I had made, I started noticing problems with what I believe is the filesystem.

Tar/gzip will complain about CRC errors in files: for example, in a certain 40 MB file that I can decompress fine on other computers. If I try to uncompress the same file again immediately, it will fail at a different point, seemingly at random. Sometimes it works fine. Random Debian packages I apt-get have the same problem: sometimes they won't unpack properly, sometimes they will.

I tried reinstalling gzip several times, but I don't think the problems are limited to compressed files; they are just very obvious in critical situations like that.

I can get complex software to run fine (XFree86 4.1, Mozilla, etc.), although some files apparently go missing in some programs.

Just now I got the following error message when deleting a tarball:

vs-4080: reiserfs_free_block: free_block (0301:672040)[dev:blocknr]: bit already cleared


Next, I took the hard drive to my other, stable computer and ran reiserfsck --rebuild-tree on it, in the hope that this would fix it. It did appear to fix it, but about 10 minutes later the symptoms came back.

Here is 'debugreiserfs /dev/hda1' output:


Super block of format 3.5 found on the 0x3 in block 16
Block count 4233112
Blocksize 4096
Free blocks 3900694
Busy blocks (skipped 16, bitmaps - 130, journal blocks - 8193
1 super blocks, 324078 data blocks
Root block 8529
Journal block (first) 18
Journal dev 0
Journal orig size 8192
Filesystem state ERROR
Tree height 4
Hash function used to sort names: "tea"
Objectid map size 62, max 1004
Version 0
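
The counters in this dump can be cross-checked with a little arithmetic. The sketch below is only an illustration: the variable names are mine, and the assumption that one bitmap block tracks blocksize x 8 filesystem blocks is not stated anywhere in the dump itself.

```python
import math

# Values copied from the debugreiserfs output above.
block_count = 4233112
block_size = 4096          # bytes
free_blocks = 3900694
skipped, bitmaps, journal, super_blocks, data_blocks = 16, 130, 8193, 1, 324078

# Busy blocks are everything that is not free.
busy_blocks = block_count - free_blocks
busy_from_breakdown = skipped + bitmaps + journal + super_blocks + data_blocks

# Assumption: each bitmap block holds one bit per filesystem block.
blocks_per_bitmap = block_size * 8
expected_bitmaps = math.ceil(block_count / blocks_per_bitmap)

size_gib = block_count * block_size / 2**30

print(f"busy blocks: {busy_blocks} (breakdown sums to {busy_from_breakdown})")
print(f"bitmap blocks expected: {expected_bitmaps}")
print(f"filesystem size: {size_gib:.2f} GiB")
```

Both busy-block counts come to 332418 and the bitmap estimate is 130, so the superblock counters are at least consistent with each other; the "ERROR" state flag is a separate matter.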


Here is my relevant hardware:

Motherboard: Asus A7V KT133 with 686A southbridge (NOT the 686B).
Harddrive: 30 gig ide maxtor/generic.

I installed 2.4.6 to try to fix the problem; it didn't seem to help, although I do not clearly remember the difference between 2.2.17-patched and 2.4.6 in terms of the symptoms.

I tried reinstalling once, but that did not help.

I'm at a loss as to how to proceed. Any ideas?

Thank you for your time,

Sam
---
[email protected]


2001-07-18 05:19:16

by Steve Kieu

Subject: Re: ReiserFS / 2.4.6 / Data Corruption

Just from my experience of using fs:

My advice:

Don't use reiserfs or JFS;
it is OK to use ext2.

Going journalling? Use ext3 or XFS.

I have used all of these filesystems and picked up this rule (true up
to now; I'm not sure it will remain right in the far future).

cheers


> -
> To unsubscribe from this list: send the line
> "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at
> http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/

=====
S.KIEU

_____________________________________________________________________________
http://messenger.yahoo.com.au - Yahoo! Messenger
- Voice chat, mail alerts, stock quotes and favourite news and lots more!

2001-07-18 09:43:46

by Hans Reiser

Subject: Re: ReiserFS / 2.4.6 / Data Corruption

That you had problems on both filesystems makes me suspect the hard drive. May I suggest you run
the badblocks program and see if it finds any?

My other developers may have other suggestions.

Hans


2001-07-18 13:10:32

by Andre Pang

Subject: Re: ReiserFS / 2.4.6 / Data Corruption

On Tue, Jul 17, 2001 at 09:14:01PM -0700, Sam Thompson wrote:

> Tar/gzip will complain about crc errors in files: for example
> in a certain 40 mb file I can decompress fine on other
> computers. If I try to uncompress the same file immediately,
> it will fail at a different point, seemingly at random.
> Sometimes it works fine. Random debian packages I apt-get have
> the same problem. Sometimes they won't unpack properly,
> sometimes they will.

you also possibly have a ram problem. check out memtest86 at
freshmeat.net and try that. anything can happen to a computer
if power dies unexpectedly.
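
A quick way to reproduce the symptom Sam describes, without involving tar or gzip, is to hash the same file several times and compare the results. The sketch below is only an illustration; the function names and the default of five passes are my own choices. Repeated reads are usually served from the page cache, so differing digests tend to point at memory or the data path rather than at the on-disk copy. It is not a substitute for memtest86.

```python
import hashlib


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of the file at `path`."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in 1 MiB chunks so large tarballs do not need to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def distinct_digests(path, passes=5):
    """Hash the same file `passes` times and return the set of digests seen."""
    return {file_digest(path) for _ in range(passes)}
```

On healthy hardware `distinct_digests("backup.tar.gz")` (a hypothetical file name) should return a set with exactly one element; more than one means the same bytes did not read back the same way twice.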


--
#ozone/algorithm <[email protected]> - trust.in.love.to.save

2001-07-18 16:26:57

by Sam Thompson

Subject: Re: ReiserFS / 2.4.6 / Data Corruption

I should have checked this first, but you were right. Memtest86 revealed I had a bad memory module. When I replaced it, everything began running flawlessly. I've been running for several hours with no problems.

Thank you very much,
Sam

* Vladimir V. Saveliev ([email protected]) wrote:
> Hi
>
> If you are able to get a problem easily - would you mind to start with
> simple hardware checking (just to ):
> does your CPU's cooler rotate?
> is CPU temperature ok?
> check memory with some tool
>
> Thanks,
> vs

2001-07-18 16:26:07

by Erik Mouw

Subject: Re: ReiserFS / 2.4.6 / Data Corruption

On Wed, Jul 18, 2001 at 03:18:59PM +1000, Steve Kieu wrote:
> My advice:
>
> Dont use reiserfs,JFS
> it is ok to use ext2
>
> Go journalling? use ext3 or XFS
>
> I have used all of these fs and pick up this rule (up
> to now, not sure it remains right in the far future)

FUD. I've been using reiserfs on quite a few systems and never had any
problem. If reiserfs weren't stable, SuSE wouldn't have supported
it as one of their stable filesystems for over a year.


Erik

--
J.A.K. (Erik) Mouw, Information and Communication Theory Group, Department
of Electrical Engineering, Faculty of Information Technology and Systems,
Delft University of Technology, PO BOX 5031, 2600 GA Delft, The Netherlands
Phone: +31-15-2783635 Fax: +31-15-2781843 Email: [email protected]
WWW: http://www-ict.its.tudelft.nl/~erik/

2001-07-18 16:35:57

by Hans Reiser

Subject: Re: ReiserFS / 2.4.6 / Data Corruption

It is very understandable that users should be less confident in the stability of our code than we
are. At this time, hardware bugs that look very much like software bugs greatly outnumber software
bugs. We find that if a bug is easy for the user to hit by doing something not especially unusual,
and it doesn't seem like recent VFS changes have caused it, it is almost surely a hardware bug
masquerading as a software bug.

Hans

Sam Thompson wrote:
>
> I should have checked this first, but you were right. Memtest86 revealed I had a bad memory module. When I replaced it, everything began running flawlessly. I've been running for several hours with no problems.
>
> Thank you very much,
> Sam

2001-07-19 02:03:21

by Steve Kieu

Subject: Re: ReiserFS / 2.4.6 / Data Corruption

--- Erik Mouw <[email protected]> wrote:
> On Wed, Jul 18, 2001 at 03:18:59PM +1000, Steve Kieu wrote:
> > My advice:
> >
> > Dont use reiserfs,JFS
> > it is ok to use ext2
> >
> > Go journalling? use ext3 or XFS
> >
> > I have used all of these fs and pick up this rule (up
> > to now, not sure it remains right in the far future)
>
> FUD. I've been using reiserfs on quite some systems

Probably! I was speaking just from my own computer. :-)

Reiserfs uses more system resources than the others.
Performance is OK (not far from JFS, more or less),
but after using it for a while, some mysterious things
happen; for example, the ini file of some program is
changed without any reason. For example, I run mc and
make it learn all keys and pause when executing a
command. After a reboot, sometimes all these settings
are lost, sometimes not. It still happens with XFS,
though, but I have never seen it with ext2 or ext3 (which I am using now).

I was happy to use JFS as long as my computer was powered
off normally. One time there was a power outage, and it completely
trashed my root partition (it could not be recovered by any
means). I had to restore from a backup, and sure, I am letting
it go for now.

I agree Reiserfs should be stable, but unfortunately
on my computer it doesn't show any sign of
advantages over XFS or ext3. I am not talking about some
minor bug like zeroed log files (fixed already, I hope)
but about the data. Ahhh, I remember one time when I ran

pppd call myisp

pppd could not make the connection. I viewed the syslog
file and noticed that chat sent the wrong command to
the modem. Strange, I thought, as it usually makes the
connection fine. I checked the /etc/ppp/chat/myisp
file; things seemed to be normal. OK, I deleted that file
and retyped it exactly as I saw it in the previous
file. I ran pppd call myisp again; it was OK. What do you
think?

It has not happened to me again for about 2 weeks now
(with ext3).

Okay, maybe that is FUD; let it be that way. I
only speak from my own usage.

Cheers,


> and never got any problem. If reiserfs wouldn't be stable, SuSE
> wouldn't have supported it as one of their stable filesystems
> for over a year.

=====
S.KIEU


2001-07-19 13:30:24

by Hans Reiser

Subject: Re: ReiserFS / 2.4.6 / Data Corruption

I think you have problems that are completely unrelated to ReiserFS.

Hans


2001-07-19 15:52:41

by Erik Mouw

Subject: Re: ReiserFS / 2.4.6 / Data Corruption

On Thu, Jul 19, 2001 at 12:02:59PM +1000, Steve Kieu wrote:
> --- Erik Mouw <[email protected]> wrote: > On
> > FUD. I've been using reiserfs on quite some systems
>
> Probably !. I said just from my computer, :-)
>
> Reiserfs uses system resources more than others.
> Perfomance is ok (not as far more or less than JFS)
> but after using for a while, some mysterious things
> happen ; for example, the ini file of some program is
> changed wihtout any reason. For example I run mc and
> make it learn all keys, and pause when executing a
> command ; After reboot, sometimes all these setting
> are lost, some times not. It still happen with XFS
> though but never see in ext2, ext3 (now I am using)

That sounds more like hardware problems to me.


Erik


2001-07-27 12:50:34

by bvermeul

Subject: Re: ReiserFS / 2.4.6 / Data Corruption

On Wed, 18 Jul 2001, Erik Mouw wrote:

> On Wed, Jul 18, 2001 at 03:18:59PM +1000, Steve Kieu wrote:
> > My advice:
> >
> > Dont use reiserfs,JFS
> > it is ok to use ext2
> >
> > Go journalling? use ext3 or XFS
> >
> > I have used all of these fs and pick up this rule (up
> > to now, not sure it remains right in the far future)
>
> FUD. I've been using reiserfs on quite some systems and never got any
> problem. If reiserfs wouldn't be stable, SuSE wouldn't have supported
> it as one of their stable filesystems for over a year.

Actually, I've been having some nasty corruption problems as well with
reiserfs. I develop my own drivers, and do occasionally make a mistake,
and when that hangs the kernel it will also screw up all files touched
just before it in an edit-make-install-try cycle. Which can be rather
annoying, because you have to start all over again (this effect randomly
distributes the last touched sectors to the last touched files. Very nice
effect, but not something I expect from a journalled filesystem).

Regards,

Bas Vermeulen

--
"God, root, what is difference?"
-- Pitr, User Friendly

"God is more forgiving."
-- Dave Aronson

2001-07-27 12:57:14

by Hans Reiser

Subject: Re: ReiserFS / 2.4.6 / Data Corruption

[email protected] wrote:
>
> Actually, I've been having some nasty corruption problems as well with
> reiserfs. I develop my own drivers, and do occasionally make a mistake,
> and when that hangs the kernel it will also screw up all files touched
> just before it in a edit-make-install-try cycle. Which can be rather
> annoying, because you can start all over again (this effect randomly
> distributes the last touched sectors to the last touched files. Very nice
> effect, but not something I expect from a journalled filesystem).

Do you think it is reasonable to ask that a filesystem be designed to work well with bad drivers?

2001-07-27 13:21:46

by bvermeul

Subject: Re: ReiserFS / 2.4.6 / Data Corruption

On Fri, 27 Jul 2001, Hans Reiser wrote:

> > Actually, I've been having some nasty corruption problems as well with
> > reiserfs. I develop my own drivers, and do occasionally make a mistake,
> > and when that hangs the kernel it will also screw up all files touched
> > just before it in a edit-make-install-try cycle. Which can be rather
> > annoying, because you can start all over again (this effect randomly
> > distributes the last touched sectors to the last touched files. Very nice
> > effect, but not something I expect from a journalled filesystem).
>
> Do you think it is reasonable to ask that a filesystem be designed to
> work well with bad drivers?

Yup. I know ext2 can do it. I expect a filesystem to not foul up my data
when something happens. Especially not shuffle around sectors in several
files. I can understand that the changes I made are not on disc, I can
even understand it if my files are gone, but not when it corrupts my data.
That just plain sucks.

A friend of mine has had crashes as well (not reiser related btw), where
files he was using at the time suddenly contained different pieces of
different files. It's just plain annoying. The reason why *I* use(d)
reiserfs was the fact that I thought that it would protect my data when
something does crash. From my experience, it doesn't, and I'd rather wait
a couple of minutes for ext2 to fsck than use reiserfs and be sure I can
start all over again.

Regards,

Bas Vermeulen


2001-07-27 13:35:56

by bvermeul

Subject: Re: ReiserFS / 2.4.6 / Data Corruption

On Fri, 27 Jul 2001, Alan Cox wrote:

> > > and when that hangs the kernel it will also screw up all files touched
> > > just before it in a edit-make-install-try cycle. Which can be rather
> > > annoying, because you can start all over again (this effect randomly
> > > distributes the last touched sectors to the last touched files. Very nice
> > > effect, but not something I expect from a journalled filesystem).
> > >
> > Do you think it is reasonable to ask that a filesystem be designed to
> > work well with bad drivers?
>
> Its certainly a good idea. But it sounds to me like he is describing the
> normal effect of metadata only logging.
>
> Putting a sync just before the insmod when developing new drivers is a good
> idea btw

I've been doing that most of the time. But I sometimes forget that.
But as I said, it's not something I expected from a journalled filesystem.

Regards,

Bas Vermeulen

--
"God, root, what is difference?"
-- Pitr, User Friendly

"God is more forgiving."
-- Dave Aronson

2001-07-27 13:32:26

by Alan

[permalink] [raw]
Subject: Re: ReiserFS / 2.4.6 / Data Corruption

> > and when that hangs the kernel it will also screw up all files touched
> > just before it in a edit-make-install-try cycle. Which can be rather
> > annoying, because you can start all over again (this effect randomly
> > distributes the last touched sectors to the last touched files. Very nice
> > effect, but not something I expect from a journalled filesystem).
> >
> Do you think it is reasonable to ask that a filesystem be designed to
> work well with bad drivers?

It's certainly a good idea. But it sounds to me like he is describing the
normal effect of metadata-only logging.

Putting a sync just before the insmod when developing new drivers is a good
idea btw

2001-07-27 13:40:16

by Alan

[permalink] [raw]
Subject: Re: ReiserFS / 2.4.6 / Data Corruption

> > Putting a sync just before the insmod when developing new drivers is a good
> > idea btw
>
> I've been doing that most of the time. But I sometimes forget that.
> But as I said, it's not something I expected from a journalled filesystem.

You misunderstand journalling, then.

A journalling file system can offer different levels of guarantee. With
metadata-only journalling you don't take any real performance hit, and your
file system is always consistent on reboot (consistent as in fsck would pass
it), but it makes no guarantee that data blocks got written.

Full data journalling will give you what you expect but at a performance hit
for many applications.

Alan
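
To make the distinction above concrete, here is a minimal sketch (not from the
original message) of what an application has to do when the filesystem only
journals metadata: ask explicitly for its data to be flushed. It assumes a
POSIX system; `durable_append` is a hypothetical helper name, and `os.fsync`
only requests the flush, so a drive with a volatile write cache can still lose
the data.

```python
import os


def durable_append(path: str, line: str) -> None:
    """Append one line and ask the kernel to flush it to stable storage."""
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(line + "\n")
        fh.flush()             # push Python's userspace buffer to the kernel
        os.fsync(fh.fileno())  # ask the kernel to write the data blocks out
```

Without the `os.fsync` call the write may sit in the page cache, which is the
window in which a crash can leave consistent metadata pointing at data that was
never written.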

2001-07-27 13:45:16

by bvermeul

[permalink] [raw]
Subject: Re: ReiserFS / 2.4.6 / Data Corruption

On Fri, 27 Jul 2001, Alan Cox wrote:

> > > Putting a sync just before the insmod when developing new drivers is a good
> > > idea btw
> >
> > I've been doing that most of the time. But I sometimes forget that.
> > But as I said, it's not something I expected from a journalled filesystem.
>
> You misunderstand journalling then

Yup, I guess I did.

> A journalling file system can offer different levels of guarantee. With
> metadata only journalling you don't take any real performance hit but your
> file system is always consistent on reboot (consistent as in fsck would pass
> it) but it makes no guarantee that data blocks got written.

I always thought that it could/would roll back the changes that weren't
consistent. But I stand corrected. Thanks... :)

> Full data journalling will give you what you expect but at a performance hit
> for many applications.

Do any of the other journalled filesystems for linux do this? If not, I
guess I'll go back to ext2.

Bas Vermeulen

--
"God, root, what is difference?"
-- Pitr, User Friendly

"God is more forgiving."
-- Dave Aronson

2001-07-27 13:50:08

by Alan

[permalink] [raw]
Subject: Re: ReiserFS / 2.4.6 / Data Corruption

> > Full data journalling will give you what you expect but at a performance hit
> > for many applications.
>
> Do any of the other journalled filesystems for linux do this? If not, I
> guess I'll go back to ext2.

ext3 can do full data journalling. I don't know whether reiserfs has an
option for it or not.

Alan

2001-07-27 14:17:26

by Philip R Auld

[permalink] [raw]
Subject: Re: ReiserFS / 2.4.6 / Data Corruption

Alan Cox wrote:
>
> Its certainly a good idea. But it sounds to me like he is describing the
> normal effect of metadata only logging.
>

Which brings up something I have been struggling with lately:

Linux (using both ext2 and reiserfs) can show garbage data blocks at the end of
files after a crash. With reiserfs this is clearly due to metadata-only logging
and happens, say, 3 out of 5 times. With ext2 the frequency is about 1 in 5 times,
and more often than not it is simply zeroed data. Sometimes it is old data,
though.


This is something that is not present in other unix filesystems as far as I can
tell. If linux wants to be used in enterprise sites we can't allow
old data blocks to be read. And ideally shouldn't allow zero blocks to be seen
either, but this is somewhat less serious.

I cannot reproduce this in ufs on either freebsd or solaris8.

I have not tested it with xfs and jfs for linux yet (and don't have any native
systems at hand.)

I believe vxfs to have a mechanism to prevent this despite metadata only
logging.

reiserfs with full data logging enabled of course does not show this behavior
(and works really well if you are willing to take the performance hit).

The basic test I use is to run this perl script for a while (to make sure at
least something has had a chance to get written out) and then power-cycle the
machine. When it comes back, a simple tail of the logfile will show the problem.
I also run bonnie beforehand to fill the disk with a known pattern so it's
easier to spot.

linux is 2.2.16 and 2.4.2 from redhat 7.1. reiserfs is 3.5.33 and was tested
only on 2.2.16.


#!/usr/bin/perl
# Append a recognisable log line to $path forever. Power-cycle the machine
# while this runs, then inspect the tail of the file.
use strict;
use warnings;
use Fcntl;

my $path  = "/scratch/logfile";
my $count = 0;
while (1) {
    # Swap in the O_SYNC variant to compare against synchronous writes.
    #sysopen(FH, $path, O_RDWR|O_APPEND|O_CREAT|O_SYNC)
    sysopen(FH, $path, O_RDWR|O_APPEND|O_CREAT)
        or die "Couldn't open file $path : $!\n";
    print FH "Log file line ", $count , " yadda yadda yadda yadda yadda yadda
yadda yadda yadda yadda yadda yadda yadda yadda yadda yadda yadda
yadda yadda yadda yadda yadda yadda yadda yadda yadda yadda yadda
yadda yadda yadda yadda yadda yadda yadda yadda yadda yadda yadda
yadda yadda yadda yadda yadda yadda yadda yadda yadda yadda yadda
yadda yadda yadda yadda yadda yadda yadda yadda yadda yadda yadda
yadda yadda yadda yadda yadda yadda yadda yadda yadda yadda yadda
yadda yadda yadda yadda yadda yadda yadda yadda yadda yadda yadda
yadda yadda yadda yadda yadda yadda yadda yadda yadda yadda yadda
yadda yadda yadda yadda yadda yadda yadda yadda \n" ;
    close (FH);
    #print $count , "\n";
    $count++;
}


------------------------------------------------------
Philip R. Auld, Ph.D. Technical Staff
Egenera Corp. [email protected]
165 Forest St, Marlboro, MA 01752 (508)786-9444
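
The post-crash inspection described above ("a simple tail logfile") could also
be automated. The following is only a sketch under assumptions: it treats every
whitespace-separated token other than "Log", "file", "line", "yadda" or a
decimal number as something the writer script could not have produced. The
function names are hypothetical, and a token cut short by the crash is flagged
as well.

```python
import re

# Tokens the Perl writer can legitimately produce.
EXPECTED = re.compile(rb"Log|file|line|yadda|[0-9]+")
# Any run of non-whitespace bytes (NUL bytes count as non-whitespace).
TOKEN = re.compile(rb"\S+")


def first_unexpected_offset(data: bytes):
    """Return the byte offset of the first unexpected token, or None."""
    for match in TOKEN.finditer(data):
        if not EXPECTED.fullmatch(match.group()):
            return match.start()
    return None


def check_logfile(path: str):
    """Read the whole log file and report where foreign data begins, if at all."""
    with open(path, "rb") as fh:
        return first_unexpected_offset(fh.read())
```

A zero-filled or stale block at the end of the file shows up as an offset near
the end. Old data that happens to consist of the same words would go
unnoticed, which is why filling the disk with a known pattern first still helps.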

2001-07-27 14:21:06

by Alan

[permalink] [raw]
Subject: Re: ReiserFS / 2.4.6 / Data Corruption

> This is something that is not present in other unix filesystems as far as I can
> tell. If linux wants to be used in enterprise sites we can't allow
> old data blocks to be read. And ideally shouldn't allow zero blocks to be seen
> either, but this is somewhat less serious.

> I cannot reproduce this in ufs on either freebsd or solaris8.

It can happen on UFS. What normally happens on UFS is that you get an old
file attached to a new filename when the file is deleted and the inode
reused.

Basically it can happen on any no data logging fs (with a few exceptions for
other clever algorithms like tree-phase)

If you write the metadata block first (UFS) then there is a risk of getting
someone else's data appended to the end of a file (e.g. length updated before
data blocks). If you write data first there is a risk of writing the data
and never committing the removal of the block from previous files.

FreeBSD softupdates probably make it very hard to trigger and they are a
very nice approach
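
The two orderings described above can be illustrated with a toy model. This is
a sketch only: the "disk" is a Python dict, block number 7 and the inode layout
are invented for illustration, and no real filesystem works this simply.

```python
def crash_after_metadata_first():
    """Metadata-first append: the inode is updated, then the crash hits."""
    disk = {7: "stale bytes from a deleted file"}  # block 7 is free but not zeroed
    inode = {"blocks": []}                         # on-disk inode of the growing file
    inode["blocks"].append(7)                      # step 1: metadata reaches disk
    # step 2 (writing the new data into block 7) never happens: crash
    return [disk[b] for b in inode["blocks"]]


def crash_after_data_first():
    """Data-first: block 7 is reused for B before A's release of it is committed."""
    disk = {7: "contents of file A"}
    inode_a = {"blocks": [7]}        # on disk, A still owns block 7
    inode_b = {"blocks": []}         # B's on-disk inode has not claimed block 7 yet
    disk[7] = "new data for file B"  # step 1: B's data reaches disk
    # step 2 (committing A's release and B's claim of block 7) never happens: crash
    return ([disk[b] for b in inode_a["blocks"]],
            [disk[b] for b in inode_b["blocks"]])
```

In the first case the file ends in someone else's stale data. In the second,
file A silently contains B's new data while B itself is empty.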

2001-07-27 14:21:36

by Joshua Schmidlkofer

[permalink] [raw]
Subject: Re: ReiserFS / 2.4.6 / Data Corruption

I've almost quit using reiser, because every time I have a power outage, the
last two or three files that I've edited, even ones that I haven't touched in
a while, will usually be hopelessly corrupted. The '<file>~' that Emacs
makes is usually fine though. It seems that any open file is
in danger. I don't know if this is normal or not, but I switched to XFS on
several machines. I have nothing against reiser. I assumed that these
problems were due to immaturity....

One more thing - All my computers with Reiser as '/' on them had a
disturbingly long boot time. From the time when the Redhat startup scripts
began, it was.... hideously slow. I thought nothing of it, blaming bash,
etc, Until I switched to ext2 on all those. Now the boot time is... SUPER
fast. [3 Computers, 1 K6-2, a Pentium III, and a Pentium II, all 128+meg,
and IDE] I currently have 3 computers running reiserfs left, all are using
MySQL databases.
Once, I lost power on my SQL box [it was blessedly during a
period of no use]. I had to rebuild all the indexes. Not only THAT, but
what happens to that box if I lose power whilst in the middle of operations?
I am working on the migration plan to move that to XFS because of these
concerns. [However, I am doing a better job of testing with XFS first.]

I think that Reiser is cool, and has neat ideology, but I am un-nerved by
this behaviour.

js


>
> Yup. I know ext2 can do it. I expect a filesystem to not foul up my data
> when something happens. Especially not shuffle around sectors in several
> files. I can understand that the changes I made are not on disc, I can
> even understand it if my files are gone, but not when it corrupts my data.
> That just plain sucks.
>
> A friend of mine has had crashes as well (not reiser related btw), where
> files he was using at the time suddenly contained different pieces of
> different files. It's just plain annoying. The reason why *I* use(d)
> reiserfs was the fact that I thought that it would protect my data when
> something does crash. From my experience, it doesn't, and I'd rather wait
> a couple of minutes for ext2 to fsck than use reiserfs and be sure I can
> start all over again.
>
> Regards,
>
> Bas Vermeulen

2001-07-27 14:24:36

by Hans Reiser

[permalink] [raw]
Subject: Re: ReiserFS / 2.4.6 / Data Corruption

Alan Cox wrote:
>
> > > and when that hangs the kernel it will also screw up all files touched
> > > just before it in a edit-make-install-try cycle. Which can be rather
> > > annoying, because you can start all over again (this effect randomly
> > > distributes the last touched sectors to the last touched files. Very nice
> > > effect, but not something I expect from a journalled filesystem).
> > >
> > Do you think it is reasonable to ask that a filesystem be designed to
> > work well with bad drivers?
>
> Its certainly a good idea.
I think it is a terrible idea.... at least as a general expectation to meet, there may be specifics
where things can be done though.... like journaling....

> But it sounds to me like he is describing the
> normal effect of metadata only logging.

Ah, right you are. Now I understand him. Well, data-journaling that doesn't cost a whole lot of
performance awaits reiser4, and reiser4 is at least a year away, we are doing seminars and
pseudo-coding now.

>
> Putting a sync just before the insmod when developing new drivers is a good
> idea btw

This makes a lot of sense to me. Good suggestion. It should go into our FAQ. Dad, please put it
there.

Q: I like to dynamically load buggy drivers into the kernel because that is what kernel developers
like me do for fun, how can I better avoid data corruption when doing this and using ReiserFS?

A: Do sync before insmod. (Alan Cox's good suggestion.)
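
The FAQ answer above could be wrapped in a small helper so the sync is never
forgotten. A minimal sketch, assuming a Unix system with Python 3.3 or later
(`os.sync` is unavailable elsewhere); `sync_then_run` and the module path in
the usage note are hypothetical.

```python
import os
import subprocess


def sync_then_run(cmd):
    """Flush dirty buffers to disk, then run cmd and return its exit status."""
    os.sync()  # same request as the sync(1) command
    return subprocess.run(cmd, check=False).returncode
```

A driver developer might call `sync_then_run(["insmod", "./mydriver.o"])`, so
that whatever was edited just before the load attempt is already on disk if the
module hangs the kernel.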

2001-07-27 14:41:37

by Jordan Breeding

[permalink] [raw]
Subject: Re: ReiserFS / 2.4.6 / Data Corruption

"Philip R. Auld" wrote:
>
> Alan Cox wrote:
> >
> > Its certainly a good idea. But it sounds to me like he is describing the
> > normal effect of metadata only logging.
> >
>
> Which brings up something I have been struggling with lately:
>
> Linux (using both ext2 and reiserfs) can show garbage data blocks at the end of
> files after a crash. With reiserfs this is clearly due to metadata only logging
> and happens say 3 out of 5 times. With ext2 the frequency is about 1 in 5 times,
> and more often that not it is simply zeroed data. Sometimes it is old data
> though.
>
> This is something that is not present in other unix filesystems as far as I can
> tell. If linux wants to be used in enterprise sites we can't allow
> old data blocks to be read. And ideally shouldn't allow zero blocks to be seen
> either, but this is somewhat less serious.
>
> I cannot reproduce this in ufs on either freebsd or solaris8.
>
> I have not tested it with xfs and jfs for linux yet (and don't have any native
> systems at hand.)
>
> I believe vxfs to have a mechanism to prevent this despite metadata only
> logging.
>
> reiserfs with full data logging enabled of course does not show this behavior
> (and works really well if you are willing to take the performance hit).
>
> The basic test I use is to run this perl script for a while (to make sure at
> least somehting has had a chance to get written out) and then power-cycle the
> machine. When it comes back a simple tail logfile will show the problem. I also
> run bonnie before hand to fill the disk with a known pattern so its easier to
> spot.
>
> linux is 2.2.16 and 2.4.2 from redhat 7.1. reiserfs is 3.5.33 and was tested
> only on 2.2.16.
>
> #!/usr/bin/perl
> use Fcntl;
> $count = 0;
> while (1) {
> #sysopen(FH, "/scratch/logfile", O_RDWR|O_APPEND|O_CREAT|O_SYNC)
> sysopen(FH, "/scratch/logfile", O_RDWR|O_APPEND|O_CREAT)
> or die "Couldn't open file $path : $!\n";
> print FH "Log file line ", $count , " yadda yadda yadda yadda yadda yadda
> yadda yadda yadda yadda yadda yadda yadda yadda yadda yadda yadda
> yadda yadda yadda yadda yadda yadda yadda yadda yadda yadda yadda
> yadda yadda yadda yadda yadda yadda yadda yadda yadda yadda yadda
> yadda yadda yadda yadda yadda yadda yadda yadda yadda yadda yadda
> yadda yadda yadda yadda yadda yadda yadda yadda yadda yadda yadda
> yadda yadda yadda yadda yadda yadda yadda yadda yadda yadda yadda
> yadda yadda yadda yadda yadda yadda yadda yadda yadda yadda yadda
> yadda yadda yadda yadda yadda yadda yadda yadda yadda yadda yadda
> yadda yadda yadda yadda yadda yadda yadda yadda \n" ;
> close (FH);
> #print $count , "\n";
> $count++;
> }
>
> ------------------------------------------------------
> Philip R. Auld, Ph.D. Technical Staff
> Egenera Corp. [email protected]
> 165 Forest St, Marlboro, MA 01752 (508)786-9444
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/

I didn't know that there was a way to enable full data journaling using
reiserfs. I was under the impression that, with the latest round of the
unlink patch to go with 2.4.7, reiserfs was basically in ordered
journaling mode instead of writeback (I believe that is the name). If I
am wrong, or if there really is a way to enable full data journaling,
please let me know. Thanks.

Jordan

2001-07-27 14:49:38

by Hans Reiser

[permalink] [raw]
Subject: Re: ReiserFS / 2.4.6 / Data Corruption

we all have different usage patterns and different needs. I find it extremely convenient to not be
using ext2 when my dell laptop with its poor linux power management crashes frequently, or the
kernel crashes. I have never had any problem with data corruption. Many users I know have also
had good experiences with leaving behind ext2 and going to reiserfs on their laptops. For your
needs and patterns though, it sounds like you need ext3.

Hans

[email protected] wrote:
>
> On Fri, 27 Jul 2001, Hans Reiser wrote:
>
> > [email protected] wrote:
> > >
> > > On Wed, 18 Jul 2001, Erik Mouw wrote:
> > >
> > > > On Wed, Jul 18, 2001 at 03:18:59PM +1000, Steve Kieu wrote:
> > > > > My advice:
> > > > >
> > > > > Dont use reiserfs,JFS
> > > > > it is ok to use ext2
> > > > >
> > > > > Go journalling? use ext3 or XFS
> > > > >
> > > > > I have used all of these fs and pick up this rule (up
> > > > > to now, not sure it remains right in the far future)
> > > >
> > > > FUD. I've been using reiserfs on quite some systems and never got any
> > > > problem. If reiserfs wouldn't be stable, SuSE wouldn't have supported
> > > > it as one of their stable filesystems for over a year.
> > >
> > > Actually, I've been having some nasty corruption problems as well with
> > > reiserfs. I develop my own drivers, and do occasionally make a mistake,
> > > and when that hangs the kernel it will also screw up all files touched
> > > just before it in a edit-make-install-try cycle. Which can be rather
> > > annoying, because you can start all over again (this effect randomly
> > > distributes the last touched sectors to the last touched files. Very nice
> > > effect, but not something I expect from a journalled filesystem).
> >
> > Do you think it is reasonable to ask that a filesystem be designed to
> > work well with bad drivers?
>
> Yup. I know ext2 can do it. I expect a filesystem to not foul up my data
> when something happens. Especially not shuffle around sectors in several
> files. I can understand that the changes I made are not on disc, I can
> even understand it if my files are gone, but not when it corrupts my data.
> That just plain sucks.
>
> A friend of mine has had crashes as well (not reiser related btw), where
> files he was using at the time suddenly contained different pieces of
> different files. It's just plain annoying. The reason why *I* use(d)
> reiserfs was the fact that I thought that it would protect my data when
> something does crash. From my experience, it doesn't, and I'd rather wait
> a couple of minutes for ext2 to fsck than use reiserfs and be sure I can
> start all over again.
>
> Regards,
>
> Bas Vermeulen
>
> --
> "God, root, what is difference?"
> -- Pitr, User Friendly
>
> "God is more forgiving."
> -- Dave Aronson

2001-07-27 14:52:48

by Hans Reiser

[permalink] [raw]
Subject: Re: ReiserFS / 2.4.6 / Data Corruption

"Philip R. Auld" wrote:

> reiserfs with full data logging enabled of course does not show this behavior
> (and works really well if you are willing to take the performance hit).

Hmmm, I didn't realize this had made it off our wish list and into the code. :)
We should benchmark the cost to performance.

Hans

2001-07-27 14:56:48

by Hans Reiser

[permalink] [raw]
Subject: Re: ReiserFS / 2.4.6 / Data Corruption

Joshua Schmidlkofer wrote:
>
> I've almost quit using reiser, because everytime I have a power outage, the
> last 2 or three files that I've editted, even ones that I haven't touched in
> a while, will usually be hopelessly corrupted. The '<file>~' that Emacs
> makes is usually fine though. It seems to be that any open file is
> in danger. I don't know if this is normal, or not, but I switched to XFS on
> several machines. I have nothing against reiser. I assumed that these
> problems were due to immaturity....
>
> One more thing - All my computers with Reiser as '/' on them had a
> disturbingly long boot time. From the time when the Redhat startup scripts
> began, it was.... hideously slow. I thought nothing of it, blaming bash,

Don't use RedHat with ReiserFS, they screw things up so many ways.....

For instance, they compile it with the wrong options set, their boot scripts are wrong, they just
shovel software onto the CD.

Use SuSE, and trust me, ReiserFS will boot faster than ext2.

Actually, I am curious as to exactly how they manage to make ReiserFS boot longer than ext2. Do
they run fsck or what?

Hans

> etc, Until I switched to ext2 on all those. Now the boot time is... SUPER
> fast. [3 Computers, 1 K6-2, a Pentium III, and a Pentium II, all 128+meg,
> and IDE] I currently have 3 computers running reiserfs left, all are using
> MySQL databases.
> Once, I lost power in on my SQL box, [it was blessedly during a
> period of no use.] I had to rebuild all the indexes. Not only THAT, but
> what happens to that box if I lose power whilst in the middle of operations?
> I am working on the migration plan to move that to XFS because of these
> concerns. [However, I am doing a better job of testing with XFS first.]
>
> I think that Reiser is cool, and has neat ideology, but I am un-nerved by
> this behaviour.
>
> js
>
> >
> > Yup. I know ext2 can do it. I expect a filesystem to not foul up my data
> > when something happens. Especially not shuffle around sectors in several
> > files. I can understand that the changes I made are not on disc, I can
> > even understand it if my files are gone, but not when it corrupts my data.
> > That just plain sucks.
> >
> > A friend of mine has had crashes as well (not reiser related btw), where
> > files he was using at the time suddenly contained different pieces of
> > different files. It's just plain annoying. The reason why *I* use(d)
> > reiserfs was the fact that I thought that it would protect my data when
> > something does crash. From my experience, it doesn't, and I'd rather wait
> > a couple of minutes for ext2 to fsck than use reiserfs and be sure I can
> > start all over again.
> >
> > Regards,
> >
> > Bas Vermeulen

2001-07-27 15:02:18

by Daniel Phillips

[permalink] [raw]
Subject: Re: ReiserFS / 2.4.6 / Data Corruption

On Friday 27 July 2001 16:18, Joshua Schmidlkofer wrote:
> I've almost quit using reiser, because everytime I have a power
> outage, the last 2 or three files that I've editted, even ones that I
> haven't touched in a while, will usually be hopelessly corrupted.

My early flush patch will fix this, or at least it will if I get
together with the ReiserFS guys and figure out how to integrate their
flushing mechanism with the standard bdflush. Or they could
incorporate the ideas from my early flush in their own flush daemon,
though generalizing the standard flush would have more value in the
long run.

> The '<file>~' that Emacs makes is usually fine though.

Because it's "created" by a rename.

[...]
> Once, I lost power in on my SQL box, [it was blessedly during a
> period of no use.] I had to rebuild all the indexes. Not only
> THAT, but what happens to that box if I lose power whilst in the
> middle of operations? I am working on the migration plan to move that
> to XFS because of these concerns. [However, I am doing a better job
> of testing with XFS first.]

Help is on its way. You can try full-data journalling with the journal
on NVRAM or on a separate disk. You can also wait for me to get a
usable version of Tux2 working. It's progressed a little slowly
because of frequent side trips ;-) But hopefully I'll be able to do
something about that soon.

Which flavor of SQL are you using? Are the indices in separate files?
(Sounds like they are.)

> I think that Reiser is cool, and has neat ideology, but I am
> un-nerved by this behaviour.

I think it's not hard to fix.

--
Daniel
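
The point above about the '<file>~' backup being "created" by a rename can be
sketched as follows. This is an illustration only, not Emacs's actual code, and
`save_with_backup` is a hypothetical name. The rename changes metadata but
leaves the old data blocks, which were written long ago, untouched. Only the
freshly written file depends on data that may still be in memory at crash time.

```python
import os


def save_with_backup(path: str, new_text: str) -> None:
    """Keep the old contents as path~ via rename, then write the new file."""
    if os.path.exists(path):
        os.replace(path, path + "~")  # metadata-only: old data blocks stay put
    with open(path, "w", encoding="utf-8") as fh:
        fh.write(new_text)
        fh.flush()
        os.fsync(fh.fileno())         # the new data is the part at risk
```

The explicit `os.fsync` narrows the window in which the new file's data exists
only in the page cache.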

2001-07-27 15:02:18

by bvermeul

[permalink] [raw]
Subject: Re: ReiserFS / 2.4.6 / Data Corruption

On Fri, 27 Jul 2001, Hans Reiser wrote:

> we all have different usage patterns and different needs. I find it
> extremely convenient to not be using ext2 when my dell laptop with its
> poor linux power management crashes frequently, or the kernel crashes.
> I have never had any problem with data corruption. Many users I know
> have also had good experiences with leaving behind ext2 and going to
> reiserfs on their laptops. For your needs and patterns though, it
> sounds like you need ext3.

The point is, this can happen every time the kernel crashes and reiserfs
has written something to its metadata logs (or so I gather from your and
Alan's explanation). And apart from my source files getting randomly
distributed, reiserfs works like a charm (I have a Dell as well, and it used
to crash a lot, which was the main reason for me to switch to reiserfs in the
first place), is fast, and stable. I like it a lot, but not on a machine where
I do my development, nor a machine without a UPS. It just doesn't help
not knowing if/when a file gets corrupted/wrongly distributed/written
back/whatever.

It looks to me (with all my ignorance) that reiserfs shuffles its blocks
a lot when writing back, and that bites when something interrupts it.
I can't back that up with code, put my finger to it or anything else, but
that's my take on my problems.

Bas Vermeulen

--
"God, root, what is difference?"
-- Pitr, User Friendly

"God is more forgiving."
-- Dave Aronson

2001-07-27 15:05:28

by Alan

[permalink] [raw]
Subject: Re: ReiserFS / 2.4.6 / Data Corruption

> Don't use RedHat with ReiserFS, they screw things up so many ways.....
> For instance, they compile it with the wrong options set, their boot scripts are wrong, they just
> shovel software onto the CD.

Sorry Hans you can rant all you like but you know you are wrong on most
of that. RH did weeks of stress testing on multiple systems up to 8Gb 8 way
and didn't ship until we stopped seeing corruption problems with the mm/fs
code.

That test suite caught bugs in kernel revisions other vendors shipped
blindly to their customers without fixing.

That is hardly shovelling software onto the CD.

> Actually, I am curious as to exactly how they manage to make ReiserFS boot longer than ext2. Do
> they run fsck or what?

No. The only thing I can think of that might slow it is that we build with
the reiserfs paranoia/sanity checks on. That's because at the time 7.1 was
done the kernel list was awash with reiserfs bug reports and Chris Mason's
tail recursion bug patch of the week.

That might be something to check to get a fair comparison.

Alan

2001-07-27 15:03:39

by Chris Wedgwood

[permalink] [raw]
Subject: Re: ReiserFS / 2.4.6 / Data Corruption

On Fri, Jul 27, 2001 at 06:55:09PM +0400, Hans Reiser wrote:

Don't use RedHat with ReiserFS, they screw things up so many
ways.....

For instance, they compile it with the wrong options set, their
boot scripts are wrong, they just shovel software onto the CD.

Use SuSE, and trust me, ReiserFS will boot faster than ext2.

Actually, I am curious as to exactly how they manage to make
ReiserFS boot longer than ext2. Do they run fsck or what?

FWIW, Debian although it doesn't support reiserfs "out of the box" at
present, works flawlessly for a large number of people I know. I also
hear Mandrake 7.2 and 8.0 work pretty nice if you want a pointy-clicky
experience :)

Since so many people seem to run RedHat, perhaps it's worth someone
determining exactly what is busted with their init scripts or whatever
that makes reiserfs barf more often than with other distributions.



--cw

2001-07-27 15:07:38

by Chris Wedgwood

[permalink] [raw]
Subject: Re: ReiserFS / 2.4.6 / Data Corruption

On Fri, Jul 27, 2001 at 08:18:12AM -0600, Joshua Schmidlkofer wrote:

Once, I lost power in on my SQL box, [it was blessedly during
a period of no use.] I had to rebuild all the indexes. Not only
THAT, but what happens to that box if I lose power whilst in the
middle of operations? I am working on the migration plan to move
that to XFS because of these concerns. [However, I am doing a
better job of testing with XFS first.]

Sounds like a MySQL bug (assuming data is on disk when perhaps it's
not). Using either Oracle or Sybase you seem to be able to yank the
power at pretty much any time, even under load, and things will recover
gracefully upon restart.





--cw

2001-07-27 15:13:08

by Philip R Auld

[permalink] [raw]
Subject: Re: ReiserFS / 2.4.6 / Data Corruption

Hans Reiser wrote:
>
> "Philip R. Auld" wrote:
>
> > reiserfs with full data logging enabled of course does not show this behavior
> > (and works really well if you are willing to take the performance hit).
>
> Hmmm, I didn't realize this had made off our wish list and into the code.:)
> We should benchmark the cost to performance.
>
> Hans

Ooops, hope I'm not getting Chris in trouble ;)

This is reiserfs 3.5.33, with a few changes from Chris to enable full logging,
and from me to make it a mount option.

We are in a situation where we need the safety more than the speed so it was
necessary.


Here is a simple comparison using bonnie:

              -------Sequential Output-------- ---Sequential Input-- --Random--
              -Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --Seeks---
Machine    MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU  /sec %CPU
pblade 1 (reiserfs defaults)
         1000 13048 98.9 21609 27.4  6599 10.7 11066 72.3 16483  8.4 1011.2  5.3
         1000 12771 96.7 21058 25.9  5536  9.0 10430 67.5 17347  8.4 1065.2  6.7
         1000 13034 98.6 19746 21.6  7026 11.6  9884 64.4 14838  7.2 1106.0  9.7
         1000 13091 99.3 19483 28.9  7586 12.3 10520 68.4 14685  6.9  900.9  6.3
pblade 2 (ext2 defaults)
         1000 14373 99.9 14940  8.8  7494 11.1 10093 65.3 22213  9.3 1028.3  6.4
         1000 14305 99.6 16129  9.4  7768 11.9  9629 62.2 26108 10.8 1135.8  7.7
         1000 14400 99.9 16769  9.8  7397 11.2  9805 63.4 21820  9.1 1139.8  5.7
         1000 14361 100. 17089 10.4  7768 11.5  9924 64.1 24154  9.8 1112.9  7.2
pblade 3 (log all data)
         1000  5932 47.6  7244 12.5  4708  9.7 13909 90.5 17051  8.1  894.5  6.5
         1000  5839 46.9  7229 12.5  4604  9.9 13437 87.9 19852  9.7  724.3  4.7
         1000  5853 47.0  7176 12.3  4611  9.8 13995 91.1 18838  8.7  908.0  5.7
         1000  5604 45.1  7106 12.2  4627  9.5 13628 88.6 15248  6.9  882.9  6.6
pblade 6 (log new data)
         1000  5556 49.0  7057 11.9  7714 12.6 11559 92.8 18075  8.8 1264.3  7.3
         1000  5631 49.8  7307 12.3  7945 13.0 11558 93.0 18859  9.0 1230.7  8.0
         1000  5610 49.6  7337 12.5  6620 11.0 11821 95.0 16484  7.5 1236.8  9.3
         1000  5592 49.4  7070 12.1  7422 12.0 11575 92.9 16198  7.3 1236.6  4.9


I suggest we move this to reiserfs-list for more discussion if needed :)


Cheers,

Phil


------------------------------------------------------
Philip R. Auld, Ph.D. Technical Staff
Egenera Corp. [email protected]
165 Forest St, Marlboro, MA 01752 (508)786-9444
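
To put a number on "the performance hit" in the bonnie table above, the
block-write column can be averaged per configuration. This sketch simply copies
the four block-output K/sec figures for each setup from the table. The
dictionary layout and variable names are assumptions made for illustration.

```python
from statistics import mean

# Sequential block-output throughput in K/sec, four runs per configuration.
block_write = {
    "reiserfs defaults": [21609, 21058, 19746, 19483],
    "ext2 defaults": [14940, 16129, 16769, 17089],
    "log all data": [7244, 7229, 7176, 7106],
    "log new data": [7057, 7307, 7337, 7070],
}

averages = {name: mean(runs) for name, runs in block_write.items()}
baseline = averages["reiserfs defaults"]
relative = {name: avg / baseline for name, avg in averages.items()}

for name in block_write:
    print(f"{name:18s} {averages[name]:9.1f} K/sec  {relative[name]:.2f}x of default")
```

On these figures, full data logging writes blocks at roughly a third of the
default reiserfs rate, while the read columns in the table are much less
affected.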

2001-07-27 15:14:18

by Cress, Andrew R

[permalink] [raw]
Subject: RE: ReiserFS / 2.4.6 / Data Corruption

-----Original Message-----
On Fri, Jul 27, 2001 at 06:55:09PM +0400, Hans Reiser wrote:
Don't use RedHat with ReiserFS, they screw things up so many
ways.....

For instance, they compile it with the wrong options set, their
boot scripts are wrong, they just shovel software onto the CD.
[...]
>Chris Wedgwood wrote:
>Since so many people seem to run RedHat, perhaps it's worth someone
>determining exactly what is busted with their init scripts or whatever
>that makes reiserfs barf more often that with other distributions.
---
Yes, I would be very interested in a tips/HOWTO on how to fix the compile
options, boot scripts, etc. for RedHat 7.1. I've been struggling with a
software RAID1 configuration with reiserfs on root and Redhat 7.1.

Andy Cress



2001-07-27 15:29:08

by Joshua Schmidlkofer

[permalink] [raw]
Subject: Re: ReiserFS / 2.4.6 / Data Corruption

On Friday 27 July 2001 09:06 am, Alan Cox wrote:
> > Don't use RedHat with ReiserFS, they screw things up so many ways.....
> > For instance, they compile it with the wrong options set, their boot
> > scripts are wrong, they just shovel software onto the CD.
>
> Sorry Hans you can rant all you like but you know you are wrong on most
> of that. RH did weeks of stress testing on multiple systems up to 8Gb 8 way
> and didn't ship until we stopped seeing corruption problems with the mm/fs
> code.
>
> That test suite caught bugs in kernel revisions other vendors shipped
> blindly to their customers without fixing.
>
> That is hardly shovelling software onto the CD.
>
> > Actually, I am curious as to exactly how they manage to make ReiserFS
> > boot longer than ext2. Do they run fsck or what?
>
> No. The only thing I can think of that might slow it is that we build with
> the reiserfs paranoia/sanity checks on. Thats because at the time 7.1 was
> done the kernel list was awash with reiserfs bug reports and Chris Mason
> tail recursion bug patch of the week.
>
> That might be something to check to get a fair comparison

I feel that things are actually progressing above my level of perception
here; however, I would like to mention that since my Redhat 4.x days I have
feared vendor kernels, and I never use them, for better or worse.

Also, maybe I screwed my own system - I don't think so, but maybe. I
prefer to stick with Linus's kernels, and sometimes, depending on the
changelog, -ac kernels. As far as the kernel & init scripts are concerned, I
axed any fsck'ing entries for reiserfs. [I assume that they were
unnecessary.] I used kgcc [w/Rh7.1] to compile kernels, until recently. And
I stayed current with the lkml, and the namesys page, watching for obvious
updates that I needed.

The slowness [seemed] actually [to be] the process of starting & stopping
daemons. Almost like there was some sort of stigma about reading shell
scripts. All the binaries executed with appropriate haste.

As far as shoveling code goes: sometimes the options used to compile packages
leave me with a large bit of wonder. Strange and seemingly heinous changes
to the various utilities, etc. But I have never had cause to fault them
based on this. [Except that I have never found the magic that causes all the
SRPMS to be [re]buildable.]

So to sort it out, I don't feel that my being a moron caused it to boot slowly -
unless there is some weird file-handling problem in bash2, or something that
causes severe slow-down when sourcing shell scripts. ???? However, Hans, I do
believe you about Suse, and if I wasn't a cheap bastard I would probably buy
a copy.

Thanks for all the responses, and I am sorry if this does not belong here.

2001-07-27 15:32:28

by Hans Reiser

[permalink] [raw]
Subject: Re: ReiserFS / 2.4.6 / Data Corruption

Alan Cox wrote:
>
> > Don't use RedHat with ReiserFS, they screw things up so many ways.....
> > For instance, they compile it with the wrong options set, their boot scripts are wrong, they just
> > shovel software onto the CD.
>
> Sorry Hans you can rant all you like but you know you are wrong on most
> of that. RH did weeks of stress testing on multiple systems up to 8Gb 8 way
> and didn't ship until we stopped seeing corruption problems with the mm/fs
> code.
>
> That test suite caught bugs in kernel revisions other vendors shipped
> blindly to their customers without fixing.
>
> That is hardly shovelling software onto the CD.
>
> > Actually, I am curious as to exactly how they manage to make ReiserFS boot longer than ext2. Do
> > they run fsck or what?
>
> No. The only thing I can think of that might slow it is that we build with
> the reiserfs paranoia/sanity checks on. Thats because at the time 7.1 was

Yes, that option should never be on for an end user not having a bug that he wants a more detailed
bug report on. It just makes us look slow compared to ext2.

2.4.2 was not a stable kernel for any FS, not just for ReiserFS.

2.4.4 was the earliest kernel that should have been called 2.4.0, and sad to say, I bet we won't hit
a really stable kernel for another couple of versions.

I understand the marketing pressure on distributions to ship using 2.4.x as soon as 2.4.0 was
available, and that pressure should never have been generated upon them by making an unstable kernel
be named 2.4.0.

It won't surprise me if you agree with me on the kernel naming though, and if so it is pointless for
me to complain to you about it.

> done the kernel list was awash with reiserfs bug reports and Chris Mason
> tail recursion bug patch of the week.
>
> That might be something to check to get a fair comparison
>
> Alan

I don't think that even with CONFIG_REISERFS_CHECK on, journal replay can take as long as fsck on
ext2. reiserfsck though, if that was on, oh, could even RedHat be that desperate to make us look
bad to users as to run reiserfsck at every boot?

I surely hope not, and I'd like to hear that this user just had something individually wrong with
his configuration.

Hans

2001-07-27 15:35:39

by Hans Reiser

[permalink] [raw]
Subject: Re: ReiserFS / 2.4.6 / Data Corruption

Daniel Phillips wrote:
>
> On Friday 27 July 2001 16:18, Joshua Schmidlkofer wrote:
> > I've almost quit using reiser, because every time I have a power
> > outage, the last 2 or three files that I've edited, even ones that I
> > haven't touched in a while, will usually be hopelessly corrupted.
>
> My early flush patch will fix this, or at least it will if I get
> together with the ReiserFS guys and figure out how to integrate their
> flushing mechanism with the standard bdflush. Or they could
> incorporate the ideas from my early flush in their own flush daemon,
> though generalizing the standard flush would have more value in the
> long run.

Can you describe early flush?

>
> > The '<file>~' that Emacs makes is usually fine though.
>
> Because it's "created" by a rename.
>
> [...]
> > Once, I lost power in on my SQL box, [it was blessedly during a
> > period of no use.] I had to rebuild all the indexes. Not only
> > THAT, but what happens to that box if I lose power whilst in the
> > middle of operations? I am working on the migration plan to move that
> > to XFS because of these concerns. [However, I am doing a better job
> > of testing with XFS first.]
>
> Help is on its way. You can try full-data journalling with the journal
> on NVRAM or on a separate disk. You can also wait for me to get a

Well, if you have NVRAM, you might try using our new journal relocation patch to put the journal on
NVRAM.

> I think it's not hard to fix.
>
> --
> Daniel

2001-07-27 15:39:28

by Hans Reiser

[permalink] [raw]
Subject: Re: ReiserFS / 2.4.6 / Data Corruption

This "feature" of not guaranteeing that a write that is in progress when the machine crashes will
not write garbage, has been present in most Unix filesystems for about 25 years of Unix history. It
is not that we are deviant on this, it is that a tradeoff is made, and for most but not all users it
is a good one to make.

reiser4 will do it better though by making data logging available as an option with only a moderate
performance penalty.

Hans

2001-07-27 15:48:58

by Hans Reiser

[permalink] [raw]
Subject: Re: ReiserFS / 2.4.6 / Data Corruption

Well, I am afraid this is much too vague for me to have any understanding of what went wrong on your
system.

Maybe somebody else who is using both ReiserFS and RedHat's boot scripts can comment on whether
things are slow for them and if so, where they get slow.

With this lack of specificity it is entirely possible that things went slow for coincidental reasons
unrelated to ReiserFS (waiting for network stuff to timeout, etc.)

Hans

Joshua Schmidlkofer wrote:
>
> On Friday 27 July 2001 09:06 am, Alan Cox wrote:
> > > Don't use RedHat with ReiserFS, they screw things up so many ways.....
> > > For instance, they compile it with the wrong options set, their boot
> > > scripts are wrong, they just shovel software onto the CD.
> >
> > Sorry Hans you can rant all you like but you know you are wrong on most
> > of that. RH did weeks of stress testing on multiple systems up to 8Gb 8 way
> > and didn't ship until we stopped seeing corruption problems with the mm/fs
> > code.
> >
> > That test suite caught bugs in kernel revisions other vendors shipped
> > blindly to their customers without fixing.
> >
> > That is hardly shovelling software onto the CD.
> >
> > > Actually, I am curious as to exactly how they manage to make ReiserFS
> > > boot longer than ext2. Do they run fsck or what?
> >
> > No. The only thing I can think of that might slow it is that we build with
> > the reiserfs paranoia/sanity checks on. Thats because at the time 7.1 was
> > done the kernel list was awash with reiserfs bug reports and Chris Mason
> > tail recursion bug patch of the week.
> >
> > That might be something to check to get a fair comparison
>
> I feel that things are actually progressing above my level of perception
> here; however, I would like to mention that since my Redhat 4.x days I have
> feared vendor kernels, and I never use them, for better or worse.
>
> Also, maybe I screwed my own system - I don't think so, but maybe. I
> prefer to stick with Linus's kernels, and sometimes, depending on the
> changelog, -ac kernels. As far as the kernel & init scripts are concerned, I
> axed any fsck'ing entries for reiserfs. [I assume that they were
> unnecessary.] I used kgcc [w/Rh7.1] to compile kernels, until recently. And
> I stayed current with the lkml, and the namesys page, watching for obvious
> updates that I needed.
>
> The slowness [seemed] actually [to be] the process of starting & stopping
> daemons. Almost like there was some sort of stigma about reading shell
> scripts. All the binaries executed with appropriate haste.
>
> As far as shoveling code goes: sometimes the options used to compile packages
> leave me with a large bit of wonder. Strange and seemingly heinous changes
> to the various utilities, etc. But I have never had cause to fault them
> based on this. [Except that I have never found the magic that causes all the
> SRPMS to be [re]buildable.]
>
> So to sort it out, I don't feel that my being a moron caused it to boot slowly -
> unless there is some weird file-handling problem in bash2, or something that
> causes severe slow-down when sourcing shell scripts. ???? However, Hans, I do
> believe you about Suse, and if I wasn't a cheap bastard I would probably buy
> a copy.
>
> Thanks for all the responses, and I am sorry if this does not belong here.

2001-07-27 15:51:08

by Alan

[permalink] [raw]
Subject: Re: ReiserFS / 2.4.6 / Data Corruption

> > No. The only thing I can think of that might slow it is that we build with
> > the reiserfs paranoia/sanity checks on. Thats because at the time 7.1 was
>
> Yes, that option should never be on for an end user not having a bug that he wants a more detailed
> bug report on. It just makes us look slow compared to ext2.

Maybe it's old-fashioned, but we'd rather any inconsistency in the file system
behaviour was made obvious to the end user. Enterprise customers object to
losing data.

> 2.4.2 was not a stable kernel for any FS, not just for ReiserFS.

The RH 2.4.2-derived kernel isn't 2.4.2 by any stretch of the imagination.
Vanilla 2.4.2 wouldn't pass a test suite.

> I don't think that even with CONFIG_REISERFS_CHECK on, journal replay can take as long as fsck on
> ext2. reiserfsck though, if that was on, oh, could even RedHat be that desperate to make us look
> bad to users as to run reiserfsck at every boot?

Hans, if you stopped considering every report that your file system wasn't
the best in the world as either a conspiracy theory or someone else's fault,
you'd have a much better product.

Nobody needs conspiracies to not use reiserfs as their core fs, and until
things like big endian support are cleanly resolved that isn't likely to
change.

Alan

Subject: Re: ReiserFS / 2.4.6 / Data Corruption

Chris Wedgwood <[email protected]> writes:

>FWIW, Debian although it doesn't support reiserfs "out of the box" at
>present, works flawlessly for a large number of people I know. I also
>hear Mandrake 7.2 and 8.0 work pretty nice if you want a pointy-clicky
>experience :)

>Since so many people seem to run RedHat, perhaps it's worth someone
>determining exactly what is busted with their init scripts or whatever
>that makes reiserfs barf more often than with other distributions.

There is nothing wrong with RedHat init scripts.

I run RH 6.2 on my self-rolled 2.2.x kernels and they boot ReiserFS
fine and neither faster nor slower than ext2. Nothing wrong with
RedHat here.

I consider a vendor that does not always ship "latest and greatest"
but tries to stress test its software superior to one that crams out
one new release every three months. And if they enable paranoia mode
in the experimental sections of the kernel: Fine! Goes well with my
philosophy of server running. Leads to 500+ days uptime.

I dropped out of ReiserFS again, however, because of unexplained
problems with various user space applications (PostgreSQL 6.5.3 and
7.x or Highwind oops bCandid oops Software.Com oops Highwind-again
Cyclone and Typhoon) that are heavy thread and mmap() users.

I use ReiserFS for my ftp-data-carousels and as spool and staging
disks. Not for system disks that contain binaries or performance
critical application disks. That works fine.


Regards
Henning

--
Dipl.-Inf. (Univ.) Henning P. Schmiedehausen -- Geschaeftsfuehrer
INTERMETA - Gesellschaft fuer Mehrwertdienste mbH [email protected]

Am Schwabachgrund 22 Fon.: 09131 / 50654-0 [email protected]
D-91054 Buckenhof Fax.: 09131 / 50654-20

2001-07-27 16:26:10

by Daniel Phillips

[permalink] [raw]
Subject: Re: ReiserFS / 2.4.6 / Data Corruption

On Friday 27 July 2001 17:33, Hans Reiser wrote:
> Daniel Phillips wrote:
> > On Friday 27 July 2001 16:18, Joshua Schmidlkofer wrote:
> > > I've almost quit using reiser, because every time I have a power
> > > outage, the last 2 or three files that I've edited, even ones
> > > that I haven't touched in a while, will usually be hopelessly
> > > corrupted.
> >
> > My early flush patch will fix this, or at least it will if I get
> > together with the ReiserFS guys and figure out how to integrate
> > their flushing mechanism with the standard bdflush. Or they could
> > incorporate the ideas from my early flush in their own flush
> > daemon, though generalizing the standard flush would have more
> > value in the long run.
>
> Can you describe early flush?

The idea is to do what amounts to a sync within a tenth of a second of
disk bandwidth usage falling below a certain threshold.
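
The trigger described above can be sketched in user space as a small state machine. This is only one reading of the idea, not the actual early flush patch: the threshold value, the 100 ms delay constant, and the names `early_flush_state` and `early_flush_sample` are all assumptions made for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical tunables; names and values are illustrative only. */
#define IDLE_THRESHOLD_BYTES_PER_SEC (64u * 1024u) /* the "certain threshold" */
#define IDLE_DELAY_MS 100u                          /* "a tenth of a second" */

struct early_flush_state {
    bool idle;           /* true while bandwidth is below the threshold */
    uint64_t idle_since; /* time (ms) at which bandwidth first dropped */
    bool flushed;        /* true once a flush was issued for this lull */
};

/*
 * Feed one bandwidth sample taken at now_ms. Returns true exactly once per
 * idle period, once the disk has stayed below the threshold for
 * IDLE_DELAY_MS; the caller would then do the equivalent of a sync.
 */
bool early_flush_sample(struct early_flush_state *s,
                        uint64_t now_ms, uint32_t bytes_per_sec)
{
    if (bytes_per_sec >= IDLE_THRESHOLD_BYTES_PER_SEC) {
        /* Disk is busy again: reset and wait for the next lull. */
        s->idle = false;
        s->flushed = false;
        return false;
    }
    if (!s->idle) {
        /* First sample below the threshold starts the idle timer. */
        s->idle = true;
        s->idle_since = now_ms;
        s->flushed = false;
    }
    if (!s->flushed && now_ms - s->idle_since >= IDLE_DELAY_MS) {
        s->flushed = true;
        return true;
    }
    return false;
}
```

A caller would zero-initialise the state, sample disk throughput periodically, and flush whenever the function returns true.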

The original posts/patches are here:

[RFC] Early flush (was: spindown)
[RFC] Early flush: new, improved (updated)

and there are long threads attached to each of them. The clearest
explanation is probably Jonathan Corbet's writeup on lwn:

http://lwn.net/2001/0628/kernel.php3

(Thanks, Jonathan, I often get the feeling I understand what I actually
did only *after* reading your writeups:-)

The second of the two patches needs more work - I think I goofed on
some needed "volatile" handling, see the current flam^H^H^H^H thread
about that.

--
Daniel

2001-07-27 16:27:38

by Kip Macy

[permalink] [raw]
Subject: Re: ReiserFS / 2.4.6 / Data Corruption





> Alan Cox wrote:
> >
> > > Don't use RedHat with ReiserFS, they screw things up so many ways.....
> > > For instance, they compile it with the wrong options set, their boot scripts are wrong, they just
> > > shovel software onto the CD.
> >
> > Sorry Hans you can rant all you like but you know you are wrong on most
> > of that. RH did weeks of stress testing on multiple systems up to 8Gb 8 way
> > and didn't ship until we stopped seeing corruption problems with the mm/fs
> > code.

Sorry Alan, but even though I am sure Redhat did lots of stress testing,
Redhat 7.1 was not a particularly solid release. I got oopses in the
eepro100 driver even though lots of other people use it, and the netapp
simulator which works just fine on 2.2.16 does not work on it. When I ran
strace on the simulator while it was zeroing some files it turned out that
sys_write was failing with ENOMEM (on a machine with 1GB of RAM that was
not doing anything else).

2001-07-27 16:33:38

by Andrew Morton

[permalink] [raw]
Subject: Re: ReiserFS / 2.4.6 / Data Corruption

Joshua Schmidlkofer wrote:
>
> I've almost quit using reiser, because every time I have a power outage, the
> last 2 or three files that I've edited, even ones that I haven't touched in
> a while, will usually be hopelessly corrupted. The '<file>~' that Emacs
> makes is usually fine though.

It's a matter of timing. There is a lengthy window where the metadata
is written, but its data is not. If you crash in this window, the files
contain old data.

You can narrow the window of exposure by fiddling with the
parameters in /proc/sys/vm/bdflush - force a full flush every
five seconds, say.
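
For a single important file, an application can also narrow the window itself rather than relying on system-wide flush tuning. The following is a minimal sketch, assuming POSIX `fsync()` and `rename()` semantics; `safe_replace` and the `.tmp` suffix are hypothetical names chosen for illustration, and a fully robust version would also handle `EINTR` and sync the containing directory.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Write `len` bytes from `buf` to a temporary file, force them to disk, and
 * only then rename the temporary file over `path`. The aim is that `path`
 * never names a file whose metadata is newer than its data.
 * Returns 0 on success, -1 on failure.
 * safe_replace() is an illustrative helper, not an existing API.
 */
int safe_replace(const char *path, const void *buf, size_t len)
{
    char tmp[4096];
    int n = snprintf(tmp, sizeof tmp, "%s.tmp", path);
    if (n < 0 || (size_t)n >= sizeof tmp)
        return -1;

    int fd = open(tmp, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    /* write() may accept fewer bytes than requested, so loop. */
    const char *p = buf;
    while (len > 0) {
        ssize_t w = write(fd, p, len);
        if (w < 0) {
            close(fd);
            unlink(tmp);
            return -1;
        }
        p += w;
        len -= (size_t)w;
    }

    /* Force the data blocks to disk before the name is switched over. */
    if (fsync(fd) != 0) {
        close(fd);
        unlink(tmp);
        return -1;
    }
    if (close(fd) != 0) {
        unlink(tmp);
        return -1;
    }

    /* rename() replaces the old file with the new one in a single step. */
    if (rename(tmp, path) != 0) {
        unlink(tmp);
        return -1;
    }
    return 0;
}
```

This trades an extra synchronous flush per save for a much smaller exposure on the files that matter most.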

> It seems to be that any open file is
> in danger. I don't know if this is normal, or not, but I switched to XFS on
> several machines. I have nothing against reiser. I assumed that these
> problems were due to immaturity....

I'm under the impression that XFS also leaves data in the hands
of the kernel's normal writeback mechanisms and will thus be
exposed to the same problem. I may be wrong about this.


Here's a ten-minute hack which gives reiserfs a simple `ordered data'
mode. It simply pushes all the dirty buffers and pages out to disk
before running a commit. Performance is still OK.

I hit reset partway through a massive file tree copy and every
file which had been copied came up peachy - which is very different
from the behaviour without the patch. Interestingly, there were
zero truncated files as well. hmmm...






--- linux-2.4.7/include/linux/fs.h Sat Jul 21 12:37:14 2001
+++ lk-ext3/include/linux/fs.h Sat Jul 28 02:37:43 2001
@@ -1061,6 +1061,7 @@ extern int fs_may_remount_ro(struct supe
 extern int try_to_free_buffers(struct page *, unsigned int);
 extern void refile_buffer(struct buffer_head * buf);
 extern void end_buffer_io_sync(struct buffer_head *bh, int uptodate);
+extern int flush_all_but_supers(kdev_t dev);
 
 /* reiserfs_writepage needs this */
 extern void set_buffer_async_io(struct buffer_head *bh) ;
--- linux-2.4.7/include/linux/reiserfs_fs_sb.h Sat Jul 21 12:37:14 2001
+++ lk-ext3/include/linux/reiserfs_fs_sb.h Sat Jul 28 02:37:43 2001
@@ -289,6 +289,8 @@ struct reiserfs_sb_info
     /* To be obsoleted soon by per buffer seals.. -Hans */
     atomic_t s_generation_counter; // increased by one every time the
                                    // tree gets re-balanced
+
+	int no_sync;
 
     /* session statistics */
     int s_kmallocs;
--- linux-2.4.7/fs/reiserfs/journal.c Sat Jul 21 12:37:14 2001
+++ lk-ext3/fs/reiserfs/journal.c Sat Jul 28 02:37:43 2001
@@ -2719,6 +2719,9 @@ static int do_journal_end(struct reiserf
   reiserfs_discard_all_prealloc(th); /* it should not involve new blocks into
                                       * the transaction */
 #endif
+
+	if (!p_s_sb->u.reiserfs_sb.no_sync)
+		flush_all_but_supers(p_s_sb->s_dev);
 
   rs = SB_DISK_SUPER_BLOCK(p_s_sb) ;
   /* setup description block */
--- linux-2.4.7/fs/reiserfs/super.c Wed Jul 4 18:21:31 2001
+++ lk-ext3/fs/reiserfs/super.c Sat Jul 28 02:37:43 2001
@@ -116,7 +116,9 @@ void reiserfs_put_super (struct super_bl
   /* note, journal_release checks for readonly mount, and can decide not
   ** to do a journal_end
   */
+	s->u.reiserfs_sb.no_sync = 1;
   journal_release(&th, s) ;
+	s->u.reiserfs_sb.no_sync = 0;
 
   for (i = 0; i < SB_BMAP_NR (s); i ++)
     brelse (SB_AP_BITMAP (s)[i]);
--- linux-2.4.7/fs/buffer.c Sat Jul 21 12:37:14 2001
+++ lk-ext3/fs/buffer.c Sat Jul 28 02:37:43 2001
@@ -333,6 +333,18 @@ int fsync_dev(kdev_t dev)
 	return sync_buffers(dev, 1);
 }
 
+int flush_all_but_supers(kdev_t dev)
+{
+	sync_buffers(dev, 0);
+
+	lock_kernel();
+	sync_inodes(dev);
+	DQUOT_SYNC(dev);
+	unlock_kernel();
+
+	return sync_buffers(dev, 1);
+}
+
 /*
  * There's no real reason to pretend we should
  * ever do anything differently
--- linux-2.4.7/kernel/ksyms.c Sat Jul 21 12:37:14 2001
+++ lk-ext3/kernel/ksyms.c Sat Jul 28 02:37:43 2001
@@ -161,6 +161,7 @@ EXPORT_SYMBOL(d_lookup);
 EXPORT_SYMBOL(__d_path);
 EXPORT_SYMBOL(mark_buffer_dirty);
 EXPORT_SYMBOL(set_buffer_async_io); /* for reiserfs_writepage */
+EXPORT_SYMBOL(flush_all_but_supers);
 EXPORT_SYMBOL(__mark_buffer_dirty);
 EXPORT_SYMBOL(__mark_inode_dirty);
 EXPORT_SYMBOL(get_empty_filp);
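
The ordering that the patch enforces (data flushed before the journal commit) can be illustrated with a toy user-space model. This is a sketch under loud assumptions: `toy_file` and the `toy_*` functions are invented for illustration, version counters stand in for real blocks, and none of this is kernel code.

```c
#include <stdbool.h>

/* Toy model of one file: in-memory state versus what has reached the disk. */
struct toy_file {
    int mem_data;   /* version of the contents in the page cache */
    int mem_meta;   /* version of the metadata (size, block pointers) in memory */
    int disk_data;  /* version of the contents in the on-disk data blocks */
    int disk_meta;  /* version of the metadata committed by the journal */
};

/* An application write dirties both the data and the metadata. */
void toy_write(struct toy_file *f)
{
    f->mem_data++;
    f->mem_meta++;
}

/* Stand-in for pushing dirty buffers and pages out to disk. */
void toy_flush_data(struct toy_file *f)
{
    f->disk_data = f->mem_data;
}

/* Journal commit; in ordered mode the data is flushed first. */
void toy_commit(struct toy_file *f, bool ordered)
{
    if (ordered)
        toy_flush_data(f);
    f->disk_meta = f->mem_meta;
}

/*
 * After a crash only the on-disk state survives. Metadata that is newer
 * than the data it describes is the bad case.
 */
bool toy_consistent_after_crash(const struct toy_file *f)
{
    return f->disk_data >= f->disk_meta;
}
```

In this model, a write followed by a commit with `ordered` set to false leaves committed metadata describing data that never reached disk, while the same sequence with `ordered` set to true does not.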

2001-07-27 16:43:08

by Hans Reiser

[permalink] [raw]
Subject: Re: ReiserFS / 2.4.6 / Data Corruption

Alan Cox wrote:

> Nobody needs conspiracies to not use reiserfs as their core fs, and until
> things like big endian support are cleanly resolved that isnt likely to
> change.
>
> Alan

Big endian support is resolved: there is a working patch for it by Jeff Mahoney, and it passes all of
our tests. But it is a feature, not a bug fix, and not something for a supposedly stabilizing kernel.

Nikita, you were supposed to send the big endian support and some other stuff in to Alan for testing
in the ac series, what is the status of patches that are supposed to be going to Alan?

Hans

2001-07-27 16:50:29

by Hans Reiser

[permalink] [raw]
Subject: Early Flush

Daniel Phillips wrote:
>
> On Friday 27 July 2001 17:33, Hans Reiser wrote:
> > Daniel Phillips wrote:
> > > On Friday 27 July 2001 16:18, Joshua Schmidlkofer wrote:
> > > > I've almost quit using reiser, because every time I have a power
> > > > outage, the last 2 or three files that I've edited, even ones
> > > > that I haven't touched in a while, will usually be hopelessly
> > > > corrupted.
> > >
> > > My early flush patch will fix this, or at least it will if I get
> > > together with the ReiserFS guys and figure out how to integrate
> > > their flushing mechanism with the standard bdflush. Or they could
> > > incorporate the ideas from my early flush in their own flush
> > > daemon, though generalizing the standard flush would have more
> > > value in the long run.
> >
> > Can you describe early flush?
>
> The idea is to do what amounts to a sync within a tenth of a second of
> disk bandwidth usage falling below a certain threshold.
>
> The original posts/patches are here:
>
> [RFC] Early flush (was: spindown)
> [RFC] Early flush: new, improved (updated)
>
> and there are long threads attached to each of them. The clearest
> explanation is probably Jonathan Corbet's writeup on lwn:
>
> http://lwn.net/2001/0628/kernel.php3
>
> (Thanks, Jonathan, I often get the feeling I understand what I actually
> did only *after* reading your writeups:-)
>
> The second of the two patches needs more work - I think I goofed on
> some needed "volatile" handling, see the current flam^H^H^H^H thread
> about that.
>
> --
> Daniel
Daniel, what you have done is something that I have wanted and believed in for a long time.

Spell out what you need from us and we will support you.

Hans

2001-07-27 16:54:29

by Alan

[permalink] [raw]
Subject: Re: ReiserFS / 2.4.6 / Data Corruption

> > Alan
> big endian support is resolved, there is a working patch for it by Jeff Mahoney, it passes all of
> our tests, but it is a feature not a bug fix, and not something for a supposedly stabilizing kernel.
>
> Nikita, you were supposed to send the big endian support and some other stuff in to Alan for testing
> in the ac series, what is the status of patches that are supposed to be going to Alan?

I suspect it's a bug fix to S/390, ppc and sparc users 8)

I'd be happy to test run it in -ac

2001-07-27 16:58:39

by Hans Reiser

[permalink] [raw]
Subject: Re: ReiserFS / 2.4.6 / Data Corruption

Andrew, can you do this such that there is no disruption of our disk format, and make a mount option
out of it, and probably we should use this patch....

After you make a mount option out of it, grev will benchmark it for us using the usual suite of
benchmarks.

Comments Chris?

Thanks,

Hans

Andrew Morton wrote:
>
> Joshua Schmidlkofer wrote:
> >
> > I've almost quit using reiser, because every time I have a power outage, the
> > last 2 or three files that I've edited, even ones that I haven't touched in
> > a while, will usually be hopelessly corrupted. The '<file>~' that Emacs
> > makes is usually fine though.
>
> It's a matter of timing. There is a lengthy window where the metadata
> is written, but its data is not. If you crash in this window, the files
> contain old data.
>
> You can narrow the window of exposure by fiddling with the
> parameters in /proc/sys/vm/bdflush - force a full flush every
> five seconds, say.
>
> > It seems to be that any open file is
> > in danger. I don't know if this is normal, or not, but I switched to XFS on
> > several machines. I have nothing against reiser. I assumed that these
> > problems were due to immaturity....
>
> I'm under the impression that XFS also leaves data in the hands
> of the kernel's normal writeback mechanisms and will thus be
> exposed to the same problem. I may be wrong about this.
>
> Here's a ten-minute hack which gives reiserfs a simple `ordered data'
> mode. It simply pushes all the dirty buffers and pages out to disk
> before running a commit. Performance is still OK.
>
> I hit reset partway through a massive file tree copy and every
> file which had been copied came up peachy - which is very different
> from the behaviour without the patch. Interestingly, there were
> zero truncated files as well. hmmm...
>
> --- linux-2.4.7/include/linux/fs.h Sat Jul 21 12:37:14 2001
> +++ lk-ext3/include/linux/fs.h Sat Jul 28 02:37:43 2001
> @@ -1061,6 +1061,7 @@ extern int fs_may_remount_ro(struct supe
> extern int try_to_free_buffers(struct page *, unsigned int);
> extern void refile_buffer(struct buffer_head * buf);
> extern void end_buffer_io_sync(struct buffer_head *bh, int uptodate);
> +extern int flush_all_but_supers(kdev_t dev);
>
> /* reiserfs_writepage needs this */
> extern void set_buffer_async_io(struct buffer_head *bh) ;
> --- linux-2.4.7/include/linux/reiserfs_fs_sb.h Sat Jul 21 12:37:14 2001
> +++ lk-ext3/include/linux/reiserfs_fs_sb.h Sat Jul 28 02:37:43 2001
> @@ -289,6 +289,8 @@ struct reiserfs_sb_info
> /* To be obsoleted soon by per buffer seals.. -Hans */
> atomic_t s_generation_counter; // increased by one every time the
> // tree gets re-balanced
> +
> + int no_sync;
>
> /* session statistics */
> int s_kmallocs;
> --- linux-2.4.7/fs/reiserfs/journal.c Sat Jul 21 12:37:14 2001
> +++ lk-ext3/fs/reiserfs/journal.c Sat Jul 28 02:37:43 2001
> @@ -2719,6 +2719,9 @@ static int do_journal_end(struct reiserf
> reiserfs_discard_all_prealloc(th); /* it should not involve new blocks into
> * the transaction */
> #endif
> +
> + if (!p_s_sb->u.reiserfs_sb.no_sync)
> + flush_all_but_supers(p_s_sb->s_dev);
>
> rs = SB_DISK_SUPER_BLOCK(p_s_sb) ;
> /* setup description block */
> --- linux-2.4.7/fs/reiserfs/super.c Wed Jul 4 18:21:31 2001
> +++ lk-ext3/fs/reiserfs/super.c Sat Jul 28 02:37:43 2001
> @@ -116,7 +116,9 @@ void reiserfs_put_super (struct super_bl
> /* note, journal_release checks for readonly mount, and can decide not
> ** to do a journal_end
> */
> + s->u.reiserfs_sb.no_sync = 1;
> journal_release(&th, s) ;
> + s->u.reiserfs_sb.no_sync = 0;
>
> for (i = 0; i < SB_BMAP_NR (s); i ++)
> brelse (SB_AP_BITMAP (s)[i]);
> --- linux-2.4.7/fs/buffer.c Sat Jul 21 12:37:14 2001
> +++ lk-ext3/fs/buffer.c Sat Jul 28 02:37:43 2001
> @@ -333,6 +333,18 @@ int fsync_dev(kdev_t dev)
> return sync_buffers(dev, 1);
> }
>
> +int flush_all_but_supers(kdev_t dev)
> +{
> + sync_buffers(dev, 0);
> +
> + lock_kernel();
> + sync_inodes(dev);
> + DQUOT_SYNC(dev);
> + unlock_kernel();
> +
> + return sync_buffers(dev, 1);
> +}
> +
> /*
> * There's no real reason to pretend we should
> * ever do anything differently
> --- linux-2.4.7/kernel/ksyms.c Sat Jul 21 12:37:14 2001
> +++ lk-ext3/kernel/ksyms.c Sat Jul 28 02:37:43 2001
> @@ -161,6 +161,7 @@ EXPORT_SYMBOL(d_lookup);
> EXPORT_SYMBOL(__d_path);
> EXPORT_SYMBOL(mark_buffer_dirty);
> EXPORT_SYMBOL(set_buffer_async_io); /* for reiserfs_writepage */
> +EXPORT_SYMBOL(flush_all_but_supers);
> EXPORT_SYMBOL(__mark_buffer_dirty);
> EXPORT_SYMBOL(__mark_inode_dirty);
> EXPORT_SYMBOL(get_empty_filp);

2001-07-27 17:11:09

by Steve Lord

[permalink] [raw]
Subject: Re: ReiserFS / 2.4.6 / Data Corruption


>
> You can narrow the window of exposure by fiddling with the
> parameters in /proc/sys/vm/bdflush - force a full flush every
> five seconds, say.
>
> > It seems to be that any open file is
> > in danger. I don't know if this is normal, or not, but I switched to XFS on
> > several machines. I have nothing against reiser. I assumed that these
> > problems were due to immaturity....
>
> I'm under the impression that XFS also leaves data in the hands
> of the kernel's normal writeback mechanisms and will thus be
> exposed to the same problem. I may be wrong about this.
>

Yes, XFS does leave writing the data to the normal writeback mechanisms,
however, what happens with XFS is usually:

o a file with no extents - the size made it out to disk but the data did not.
since on writes to new space we do not allocate the space until we flush
you tend not to see old data. The only way out of something like this is
to prevent the inode size update from hitting disk until the file data
is on disk. The performance consequences of doing that are probably
large.

This situation is somewhat helped by the fact that if one page gets
flushed by bdflush and it calls back into xfs to allocate space, we
will allocate space for, and flush all surrounding data in the file,
so this may be causing earlier flushing than might otherwise happen.

Since xfs usually operates with a much smaller in memory log than other
filesystems (64K default) and we have some synchronous transactions which
cause a flush of the in memory log, the amount that time can go backwards
by in a crash is a lot smaller.

Steve


2001-07-27 17:21:49

by Andrew Morton

[permalink] [raw]
Subject: Re: ReiserFS / 2.4.6 / Data Corruption

Hans Reiser wrote:
>
> Andrew, can you do this such that there is no disruption of our
> disk format, and make a mount option
> out of it, and probably we should use this patch....

I'll defer to Chris :)

There's no disruption to disk format - it just simulates
the user typing `sync' at the right time. I think the
concept is sound, and I'm sure Chris can find a more efficient
way...


> After you make a mount option out of it, grev will benchmark
> it for us using the usual suite of benchmarks.
>

Ordered-data is a funny thing. Under heavy loads it tends
to make a significant throughput difference - on ext3 it
almost halves throughput wrt writeback mode.

But this by no means indicates that writes are half as fast;
what happens is that metadata-intensive workloads fill the
journal up quickly, so the `sync' happens more frequently.
Under normal workloads, or less metadata-intense workloads
the difference is very small.
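
The effect can be put into rough numbers with a deliberately crude model: assume a commit (and hence the `sync`) is forced each time the journal fills, and that metadata is logged at a steady rate. Both assumptions, and the function name `forced_commits`, are illustrative only; real journals also commit on timers and on explicit syncs.

```c
/*
 * Number of commits forced by a full journal during `seconds`, assuming a
 * commit is triggered each time `journal_blocks` blocks have been logged
 * and metadata is logged at a steady `blocks_per_sec`.
 * Purely illustrative; not how any real filesystem schedules commits.
 */
unsigned long forced_commits(unsigned long journal_blocks,
                             unsigned long blocks_per_sec,
                             unsigned long seconds)
{
    if (journal_blocks == 0)
        return 0; /* no journal: nothing to model */
    return (blocks_per_sec * seconds) / journal_blocks;
}
```

With the 8192-block journal reported earlier in this thread, a workload logging 4096 metadata blocks per second would force 30 commits in a minute under this model, while one logging 64 blocks per second would force none in the same minute.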

During testing of that little patch I noted that the
disk went crunch every thirty seconds or so, which is good.
Presumably the reiserfs journal is larger, or more space-efficient.


2001-07-27 17:30:19

by Ville Herva

[permalink] [raw]
Subject: Re: ReiserFS / 2.4.6 / Data Corruption

On Fri, Jul 27, 2001 at 09:25:23AM -0700, you [Kip Macy] claimed:
>
> sys_write was failing with ENOMEM (on a machine with 1GB of RAM that was
> not doing anything else).

I second that.

256M memory, no swap at the time.

After fresh boot to the default RH71 kernel (2.4.2-2 or whatever it is) on
console (no X running):

> diff -Naur /usr/src/linux.rh-default /usr/src/linux-2.4.4 > diff
zsh: killed diff

> dmesg | tail
kernel: out of memory, killed process n (xfs)
kernel: out of memory, killed process n (diff)

Phew.


-- v --

[email protected]

2001-07-27 17:36:19

by Eric W. Biederman

[permalink] [raw]
Subject: Re: ReiserFS / 2.4.6 / Data Corruption

Hans Reiser <[email protected]> writes:

> This "feature" of not guaranteeing that a write that is in progress when the
> machine crashes will not write garbage, has been present in most Unix
> filesystems for about 25 years of Unix history.

A write in progress causing garbage when the power is lost is a
driver and drive thing.

stock unix behavior is that it delays writes for up to 30 seconds,
which in case of a crash could mean you have old data on disk. Not
wrong data. This is helped because in stock unix filesystems blocks
are rarely reallocated or moved. In reiserfs with the btree at least
some kinds of data are moved all over the disk.

I want to suspect a btree problem on the block jumping around (it's
a good candidate). But unless you have messed up metadata journalling,
btree writes are journaled. The reason I am suspecting the btree is
that most source code files are small, so they probably don't have complete
filesystem blocks of their own.

> It
>
> is not that we are deviant on this, it is that a tradeoff is made, and for most
> but not all users it
>
> is a good one to make.

If you can give me an explanation of what would cause the described
behavior of small files swapping their contents, I believe I
would feel more secure than with just a reflexive ``we don't guarantee all of the
data written before power failure''.

Eric


2001-07-27 17:44:11

by Ville Herva

Subject: Re: ReiserFS / 2.4.6 / Data Corruption

On Fri, Jul 27, 2001 at 06:40:32PM +0100, you [Alan Cox] claimed:
> > After fresh boot to the default RH71 kernel (2.4.2-2 or whatever it is) on
> > console (no X running):
> >
> > > diff -Naur /usr/src/linux.rh-default /usr/src/linux-2.4.4 > diff
> > zsh: killed diff
> >
> > > dmesg | tail
> > kernel: out of memory, killed process n (xfs)
> > kernel: out of memory, killed process n (diff)
> >
> > Phew.
>
> No argument on that one. I'm still seeing it in vanilla 2.4.6 as well but
> 2.4.7 is looking a lot better.

I wasn't able to easily reproduce that on 2.4.4ac5 (which I upgraded to
almost immediately). It may be that the OOM rambo wasn't fully sane on that
one either, but at least it seemed to handle the silly "someone filled the
cache - gee, we must be OOM" case rather better...


-- v --

[email protected]

2001-07-27 17:47:39

by Hans Reiser

Subject: Re: ReiserFS / 2.4.6 / Data Corruption

Andrew Morton wrote:
>
> Hans Reiser wrote:
> >
> > Andrew, can you do this such that there is no disruption of our
> > disk format, and make a mount option
> > out of it, and probably we should use this patch....
>
> I'll defer to Chris :)

Yes, I'll let him think carefully through the details of how it affects ordering of the writes.


>
> There's no disruption to disk format - it just simulates
> the user typing `sync' at the right time. I think the
> concept is sound, and I'm sure Chris can find a more efficient
> way...

Oops, sorry, you changed the in-ram not the on-disk sb....

>
> > After you make a mount option out of it, grev will benchmark
> > it for us using the usual suite of benchmarks.
> >
>
> Ordered-data is a funny thing. Under heavy loads it tends
> to make a significant throughput difference - on ext3 it
> almost halves throughput wrt writeback mode.
>
> But this by no means indicates that writes are half as slow;
> what happens is that metadata-intensive workloads fill the
> journal up quickly, so the `sync' happens more frequently.
> Under normal workloads, or less metadata-intense workloads
> the difference is very small.
>
> During testing of that little patch I noted that the
> disk went crunch every thirty seconds or so, which is good.
> Presumably the reiserfs journal is larger, or more space-efficient.
>
> -

Thanks Andrew

2001-07-27 17:52:59

by Christoph Rohland

Subject: Re: ReiserFS / 2.4.6 / Data Corruption

Hi Hans,

On Fri, 27 Jul 2001, Hans Reiser wrote:
> Maybe somebody else who is using both ReiserFS and RedHat's boot
> scripts can comment on whether things are slow for them and if so,
> where they get slow.

At least not if it's not the root disk. I have a RH71 box with a 19GB
reiserfs partition and it's booting fast and fine.

Greetings
Christoph


2001-07-27 18:03:40

by Hans Reiser

Subject: Re: ReiserFS / 2.4.6 / Data Corruption

Christoph Rohland wrote:
>
> Hi Hans,
>
> On Fri, 27 Jul 2001, Hans Reiser wrote:
> > Maybe somebody else who is using both ReiserFS and RedHat's boot
> > scripts can comment on whether things are slow for them and if so,
> > where they get slow.
>
> At least not if it's not the root disk. I have a RH71 box with a 19GB
> reiserfs partition and it's booting fast and fine.
>
> Greetings
> Christoph
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/

OK, well then I conclude that the user misdiagnosed his booting problem
in some unknowable way.

Apologies to RedHat.

Hans

2001-07-27 18:10:50

by Dustin Byford

Subject: Re: ReiserFS / 2.4.6 / Data Corruption

Hans Reiser wrote:

> Maybe somebody else who is using both ReiserFS and RedHat's boot scripts can comment on whether
> things are slow for them and if so, where they get slow.


For what it's worth I just configured a RedHat 7.1 box with reiserfs on
all partitions except /boot using this update disk
ftp://139.82.28.40/pub/update-rh71reiser-v1.img from
http://cambuca.ldhs.cetuc.puc-rio.br/.

Upgraded all of redhat's packages, note there is a SysVinit update and a
gcc update.

Compiled a 2.4.7-pre kernel and the latest reiserfsprogs.

Mounted /boot read only to eliminate the chance of an fsck required on
that partition.

I have been running reiserfs on a mail server with about 60k accounts
(30k really active) for about 6 months. I haven't experienced any
problems with the filesystems. The one I just configured is its intended
replacement. After a few days of testing with bonnie, some perl scripts I
wrote, and a few pullings of the power cord, I think it's almost ready
for production. An upgrade to 2.4.7 and some more testing will tell.

If you pull the plug on this machine, it takes around 40 seconds to get
back to the login prompt (p3-600, 60G IDE drive). That includes the act of
pulling the power cord, BIOS delays, lilo delays, and the rest of the
redhat boot sequence.

So, in my experience I've had very few problems with reiserfs and
redhat. That said, at the slightest hint of data corruption under normal
(continuous power, no failing hardware) operation I'll probably be
evaluating other filesystems. There are sometimes as many as 500,000
files on this filesystem; reiserfs seems to do a good job under my
conditions.

--Dustin

Also, one purely cosmetic patch to rc.sysinit if you want:
--- rc.sysinit.orig Fri Jul 27 13:06:58 2001
+++ rc.sysinit Fri Jul 27 13:38:25 2001
@@ -211,7 +211,8 @@

_RUN_QUOTACHECK=0
ROOTFSTYPE=`grep " / " /proc/mounts | awk '{ print $3 }'`
-if [ -z "$fastboot" -a "$ROOTFSTYPE" != "nfs" ]; then
+if [ -z "$fastboot" -a "$ROOTFSTYPE" != "nfs" \
+ -a "$ROOTFSTYPE" != "reiserfs" ]; then

STRING=$"Checking root filesystem"
echo $STRING
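
The check in the patch above can be sketched as two standalone shell functions. This is only an illustration, not part of Red Hat's rc.sysinit: the names root_fstype and needs_root_fsck are made up here, and the mounts table is passed as an argument so the logic can be tried without touching a real system.

```shell
# Hypothetical helper (not from rc.sysinit): print the filesystem type
# mounted on / according to a mounts table (default: /proc/mounts).
# If / is listed more than once (e.g. a "rootfs" entry first), the last
# entry wins.
root_fstype() {
    awk '$2 == "/" { fstype = $3 } END { print fstype }' "${1:-/proc/mounts}"
}

# Hypothetical helper: succeed only if the boot script should fsck the
# root filesystem. $1 is the root fs type, $2 the value of $fastboot.
needs_root_fsck() {
    case "$1" in
        nfs|reiserfs) return 1 ;;   # no boot-time check for these types
    esac
    [ -z "$2" ]                     # skip when fastboot was requested
}
```

In a boot script this might be used as `if needs_root_fsck "$(root_fstype)" "$fastboot"; then ...; fi`. Matching on the mount-point field, rather than grepping for " / ", avoids getting several lines back when / appears more than once in /proc/mounts.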

2001-07-27 18:44:37

by bvermeul

Subject: Re: ReiserFS / 2.4.6 / Data Corruption

On 27 Jul 2001, Eric W. Biederman wrote:

> Hans Reiser <[email protected]> writes:
>
> > This "feature" of not guaranteeing that a write that is in progress when the
> > machine crashes will
> >
> > not write garbage, has been present in most Unix filesystems for about 25 years
> > of Unix history.
>
> A write in progress causing garbage when the power is lost is a
> driver and drive thing.
>
> Stock Unix behavior is to delay writes for up to 30 seconds,
> which in case of a crash could mean you have old data on disk, not
> wrong data. This is helped because in stock Unix filesystems blocks
> are rarely reallocated or moved. In reiserfs, with the btree, at least
> some kinds of data are moved all over the disk.
>
> I want to suspect a btree problem behind the blocks jumping around (it's
> a good candidate). But unless you have messed up metadata journalling,
> btree writes are journaled. The reason I suspect the btree is
> that most source code files are small, so they probably don't have complete
> filesystem blocks of their own.

Possibly. We're talking 130 kByte in total. The above is the reason why
I don't like using reiserfs on my development system. My files get
completely garbled, with the data randomly distributed over the files last
touched. (Object files, dependency files, source files and header files)
I don't mind losing data I've just written, but I *hate* it when it
garbles all my files.

> If you can give me an explanation of what would cause the described
> behavior of small files swapping their contents, I believe I
> would feel more secure than with just a reflexive ``we don't guarantee all of the
> data written before power failure''.

Bas Vermeulen

--
"God, root, what is difference?"
-- Pitr, User Friendly

"God is more forgiving."
-- Dave Aronson

2001-07-27 19:22:08

by Hans Reiser

Subject: Re: ReiserFS / 2.4.6 / Data Corruption

Dustin Byford wrote:

> Also, one purely cosmetic patch to rc.sysinit if you want:
> --- rc.sysinit.orig Fri Jul 27 13:06:58 2001
> +++ rc.sysinit Fri Jul 27 13:38:25 2001
> @@ -211,7 +211,8 @@
>
> _RUN_QUOTACHECK=0
> ROOTFSTYPE=`grep " / " /proc/mounts | awk '{ print $3 }'`
> -if [ -z "$fastboot" -a "$ROOTFSTYPE" != "nfs" ]; then
> +if [ -z "$fastboot" -a "$ROOTFSTYPE" != "nfs" \
> + -a "$ROOTFSTYPE" != "reiserfs" ]; then
>
> STRING=$"Checking root filesystem"
> echo $STRING


Yes, this patch is much needed. Edward, put it on our website in an appropriate location.

Hans

2001-07-27 19:24:00

by Hans Reiser

Subject: Re: ReiserFS / 2.4.6 / Data Corruption

[email protected] wrote:
>
> On 27 Jul 2001, Eric W. Biederman wrote:
>
> > Hans Reiser <[email protected]> writes:
> >
> > > This "feature" of not guaranteeing that a write that is in progress when the
> > > machine crashes will
> > >
> > > not write garbage, has been present in most Unix filesystems for about 25 years
> > > of Unix history.
> >
> > A write in progress causing garbage when the power is lost is a
> > driver and drive thing.
> >
> > Stock Unix behavior is to delay writes for up to 30 seconds,
> > which in case of a crash could mean you have old data on disk, not
> > wrong data. This is helped because in stock Unix filesystems blocks
> > are rarely reallocated or moved. In reiserfs, with the btree, at least
> > some kinds of data are moved all over the disk.
> >
> > I want to suspect a btree problem behind the blocks jumping around (it's
> > a good candidate). But unless you have messed up metadata journalling,
> > btree writes are journaled. The reason I suspect the btree is
> > that most source code files are small, so they probably don't have complete
> > filesystem blocks of their own.
>
> Possibly. We're talking 130 kByte in total. The above is the reason why
> I don't like using reiserfs on my development system. My files get
> completely garbled, with the data randomly distributed over the files last
> touched. (Object files, dependency files, source files and header files)
> I don't mind losing data I've just written, but I *hate* it when it
> garbles all my files.
>
> > If you can give me an explanation of what would cause the described
> > behavior of small files swapping their contents, I believe I
> > would feel more secure than with just a reflexive ``we don't guarantee all of the
> > data written before power failure''.
>
> Bas Vermeulen
>
> --
> "God, root, what is difference?"
> -- Pitr, User Friendly
>
> "God is more forgiving."
> -- Dave Aronson

You should not see old data being corrupted. If you are seeing it with a recent ReiserFS version,
we'd like your help in reproducing it.

Hans

2001-07-27 19:31:28

by Jussi Laako

Subject: Re: ReiserFS / 2.4.6 / Data Corruption

[email protected] wrote:
>
> Possibly. We're talking 130 kByte in total. The above is the reason why
> I don't like using reiserfs on my development system. My files get
> completely garbled, with the data randomly distributed over the files

How about using the notail option?

- Jussi

--
PGP key fingerprint: 161D 6FED 6A92 39E2 EB5B 39DD A4DE 63EB C216 1E4B
Available at PGP keyservers

2001-07-27 21:07:21

by Marc Lehmann

Subject: Re: ReiserFS / 2.4.6 / Data Corruption

On Fri, Jul 27, 2001 at 04:06:16PM +0100, Alan Cox <[email protected]> wrote:
> > Don't use RedHat with ReiserFS, they screw things up so many ways.....
> > For instance, they compile it with the wrong options set, their boot scripts are wrong, they just
> > shovel software onto the CD.
>
> Sorry Hans you can rant all you like but you know you are wrong on most
> of that. RH did weeks of stress testing on multiple systems up to 8Gb 8 way
> and didn't ship until we stopped seeing corruption problems with the mm/fs
> code.

You might be well advised to look at reality (visit a few other projects)
and you'll see that redhat, indeed, has a very bad reputation. Whether it's
gimp, gtk, perl, wine, dosemu or any other project, the basic reaction is:
oh, you have gt problems under redhat? you compile it yourself and most
probably your problems will go away (gtk+ even had this message in their
install script).

> That test suite caught bugs in kernel revisions other vendors shipped
> blindly to their customers without fixing.

they might have a very good test suite, but that means nothing: redhat
so frequently takes snapshots of undebugged alpha versions of software
(higher version numbers) that no amount of testing will suffice to ever
make this work.

they might be doing well for the kernel, but that only gets you so far.

> That is hardly shovelling software onto the CD.

Right, that's shovelling the latest alpha versions of software onto CD.

> > Actually, I am curious as to exactly how they manage to make ReiserFS boot longer than ext2. Do
> > they run fsck or what?
> No. The only thing I can think of that might slow it is that we build with
> the reiserfs paranoia/sanity checks on.

That's a pretty dumb thing. Maybe one should have asked the developers
before doing this (they never do). Redhat somehow manages pretty well to
show reiserfs in a bad light ;)

However, ext2 is much faster on mount time with -onocheck (instantaneous);
and for all current harddisk sizes ext2 is somewhat to much slower on
mount. And yes, the redhat init system (just like suse's or most others,
of course) is sooo slow that improving the init system will have a much
larger effect than the ext2/reiserfs differences.

(So trying to improve this in the kernel would be the wrong place to
start).

--
-----==- |
----==-- _ |
---==---(_)__ __ ____ __ Marc Lehmann +--
--==---/ / _ \/ // /\ \/ / [email protected] |e|
-=====/_/_//_/\_,_/ /_/\_\ XX11-RIPE --+
The choice of a GNU generation |
|

2001-07-27 21:07:00

by Marc Lehmann

Subject: Re: ReiserFS / 2.4.6 / Data Corruption

On Fri, Jul 27, 2001 at 07:38:12PM +0400, Hans Reiser <[email protected]> wrote:
> not write garbage, has been present in most Unix filesystems for about 25 years of Unix history. It
> is not that we are deviant on this, it is that a tradeoff is made, and for most but not all users it
> is a good one to make.

it just happens much more with reiserfs than with other fs's. but I trust
Chris Mason who said that this might be fixable. so it might not be a design
trade-off at all.

--
-----==- |
----==-- _ |
---==---(_)__ __ ____ __ Marc Lehmann +--
--==---/ / _ \/ // /\ \/ / [email protected] |e|
-=====/_/_//_/\_,_/ /_/\_\ XX11-RIPE --+
The choice of a GNU generation |
|

2001-07-27 21:15:01

by Hans Reiser

Subject: Re: ReiserFS / 2.4.6 / Data Corruption

"pcg( Marc)"@goof(A.).(Lehmann )com wrote:

> > No. The only thing I can think of that might slow it is that we build with
> > the reiserfs paranoia/sanity checks on.
>
> That's a pretty dumb thing. Maybe one should have asked the developers
> before doing this (they never do). Redhat somehow manages pretty well to
> show reiserfs in a bad light ;)

Let us be a bit more precise here. If you click on the help button when deciding whether to select
that option it tells you not to do it. What can you say about a distro that doesn't read the help
buttons for the kernel options when configuring the kernel? Shovelware?

Hans

2001-07-27 21:24:56

by Alan

Subject: Re: ReiserFS / 2.4.6 / Data Corruption

> Let us be a bit more precise here. If you click on the help button when deciding whether to select
> that option it tells you not to do it. What can you say about a distro that doesn't read the help
> buttons for the kernel options when configuring the kernel? Shovelware?

The alternative was to disable it, because at the time we had lots of good
evidence it didn't work reliably. Evidence backed up by the pile of later
Chris Mason patches.

2001-07-27 21:47:08

by Daniel Phillips

Subject: Re: ReiserFS / 2.4.6 / Data Corruption

On Friday 27 July 2001 20:47, [email protected] wrote:
> I don't mind losing data I've just written, but I
> *hate* it when it garbles all my files.

Have you tried running with no tail merging? (And no already-merged
tails.)

--
Daniel

2001-07-27 21:48:58

by Hans Reiser

Subject: Re: ReiserFS / 2.4.6 / Data Corruption

Alan Cox wrote:
>
> > Let us be a bit more precise here. If you click on the help button when deciding whether to select
> > that option it tells you not to do it. What can you say about a distro that doesn't read the help
> > buttons for the kernel options when configuring the kernel? Shovelware?
>
> The alternative was to disable it, because at the time we had lots of good
> evidence it didn't work reliably. Evidence backed up by the pile of later
> Chris Mason patches.

Better to disable it than to cripple it.

By the way, how about considering the use of tests before redhat coders put stuff in the linux
kernel? You know, if VFS changes actually got tested before users encountered things like Viro
breaking ReiserFS in 2.4.5, it would be nice.

At Namesys, like all normal software shops, we actually run a test suite before shipping code
externally. We usually try to require that it be tested by at least one person in addition to the
code author.

It would catch things like your gcc problems. Test suites don't catch everything, but they are
considered the responsible thing to do at most places.

Hans

2001-07-27 22:03:10

by Luigi Genoni

Subject: Re: ReiserFS / 2.4.6 / Data Corruption



On Sat, 28 Jul 2001, Chris Wedgwood wrote:

> On Fri, Jul 27, 2001 at 06:55:09PM +0400, Hans Reiser wrote:
>
> Don't use RedHat with ReiserFS, they screw things up so many
> ways.....
>
> For instance, they compile it with the wrong options set, their
> boot scripts are wrong, they just shovel software onto the CD.
>
> Use SuSE, and trust me, ReiserFS will boot faster than ext2.
>
> Actually, I am curious as to exactly how they manage to make
> ReiserFS boot longer than ext2. Do they run fsck or what?
>
> FWIW, Debian although it doesn't support reiserfs "out of the box" at
> present, works flawlessly for a large number of people I know. I also
> hear Mandrake 7.2 and 8.0 work pretty nice if you want a pointy-clicky
> experience :)
>
I could add that Slackware is also simply faster with / on ReiserFS
than on ext2.
But I saw that some of the RH init scripts are, how can I say, redundant....

Luigi

> Since so many people seem to run RedHat, perhaps it's worth someone
> determining exactly what is busted with their init scripts or whatever
> that makes reiserfs barf more often than with other distributions.
>
>
>
> --cw

2001-07-27 22:10:11

by Alan

Subject: Re: ReiserFS / 2.4.6 / Data Corruption

> By the way, how about considering the use of tests before redhat coders put stuff in the linux
> kernel? You know, if VFS changes actually got tested before users encountered things like Viro
> breaking ReiserFS in 2.4.5, it would be nice.
> At Namesys, like all normal software shops, we actually run a test suite before shipping code
> externally. We usually try to require that it be tested by at least one person in addition to the
> code author.

*PLONK*

No doubt if Namesys ran test suites the whole tail merging bug fiasco and the
directory/tree balance races wouldn't have happened.

2001-07-27 17:39:39

by Alan

Subject: Re: ReiserFS / 2.4.6 / Data Corruption

> After fresh boot to the default RH71 kernel (2.4.2-2 or whatever it is) on
> console (no X running):
>
> > diff -Naur /usr/src/linux.rh-default /usr/src/linux-2.4.4 > diff
> zsh: killed diff
>
> > dmesg | tail
> kernel: out of memory, killed process n (xfs)
> kernel: out of memory, killed process n (diff)
>
> Phew.

No argument on that one. I'm still seeing it in vanilla 2.4.6 as well but
2.4.7 is looking a lot better.

2001-07-28 06:17:31

by bvermeul

Subject: Re: ReiserFS / 2.4.6 / Data Corruption

> You should not see old data being corrupted. If you are seeing it with
> a recent ReiserFS version,
> we'd like your help in reproducing it.

It is not old data per se. I edited those files. They have been opened and
written back. But it will shuffle every bit of data in those files, and
I'll find source code in the object file, *.d files, etc. The source file
itself is mostly garbled as well.

I can see if I can come up with a module as simple as possible to
reproduce this. (This is still a while(1); in kernel essentially, with
a couple of seconds between the hang and the compile/install cycle)

If you're interested, let me know, and I'll see if I can make a test-case
for you.

Bas Vermeulen

--
"God, root, what is difference?"
-- Pitr, User Friendly

"God is more forgiving."
-- Dave Aronson

2001-07-28 06:19:00

by bvermeul

Subject: Re: ReiserFS / 2.4.6 / Data Corruption

On Fri, 27 Jul 2001, Jussi Laako wrote:

> [email protected] wrote:
> >
> > Possibly. We're talking 130 kByte in total. The above is the reason why
> > I don't like using reiserfs on my development system. My files get
> > completely garbled, with the data randomly distributed over the files
>
> How about using notail -option?

Never tried it. I'll see if I can reproduce.

Bas Vermeulen
--
"God, root, what is difference?"
-- Pitr, User Friendly

"God is more forgiving."
-- Dave Aronson

2001-07-28 07:38:07

by Hans Reiser

Subject: Re: ReiserFS / 2.4.6 / Data Corruption

Alan Cox wrote:
>
> > By the way, how about considering the use of tests before redhat coders put stuff in the linux
> > kernel? You know, if VFS changes actually got tested before users encountered things like Viro
> > breaking ReiserFS in 2.4.5, it would be nice.
> > At Namesys, like all normal software shops, we actually run a test suite before shipping code
> > externally. We usually try to require that it be tested by at least one person in addition to the
> > code author.
>
> *PLONK*
>
> No doubt if Namesys ran test suites the whole tail merging bug fiasco and the
> directory/tree balance races wouldn't have happened.

Our test suites need much improvement, but we do have them and use them. Can you say the same?

Hans

2001-07-28 07:41:08

by Hans Reiser

Subject: Re: ReiserFS / 2.4.6 / Data Corruption

[email protected] wrote:
>
> > You should not see old data being corrupted. If you are seeing it with
> > a recent ReiserFS version,
> > we'd like your help in reproducing it.
>
> It is not old data per se. I edited those files. They have been opened and
> written back. But it will shuffle every bit of data in those files, and
> I'll find source code in the object file, *.d files, etc. The source file
> itself is mostly garbled as well.
>
> I can see if I can come up with a module as simple as possible to
> reproduce this. (This is still a while(1); in kernel essentially, with
> a couple of seconds between the hang and the compile/install cycle)
>
> If you're interested, let me know, and I'll see if I can make a test-case
> for you.
>
> Bas Vermeulen
>
> --
> "God, root, what is difference?"
> -- Pitr, User Friendly
>
> "God is more forgiving."
> -- Dave Aronson

I am very much interested in a test case.

Hans

2001-07-28 13:46:34

by Matthew Gardiner

Subject: Re: ReiserFS / 2.4.6 / Data Corruption

On Saturday 28 July 2001 02:55, Hans Reiser wrote:
> Joshua Schmidlkofer wrote:
> > I've almost quit using reiser, because every time I have a power outage,
> > the last 2 or three files that I've edited, even ones that I haven't
> > touched in a while, will usually be hopelessly corrupted. The '<file>~'
> > that Emacs makes is usually fine though. It seems to be that any open
> > file is in danger. I don't know if this is normal, or not, but I
> > switched to XFS on several machines. I have nothing against reiser. I
> > assumed that these problems were due to immaturity....
> >
> > One more thing - All my computers with Reiser as '/' on them had a
> > disturbingly long boot time. From the time when the Redhat startup
> > scripts began, it was.... hideously slow. I thought nothing of it,
> > blaming bash,
>
> Don't use RedHat with ReiserFS, they screw things up so many ways.....
>
> For instance, they compile it with the wrong options set, their boot
> scripts are wrong, they just shovel software onto the CD.
>
> Use SuSE, and trust me, ReiserFS will boot faster than ext2.
>
> Actually, I am curious as to exactly how they manage to make ReiserFS boot
> longer than ext2. Do they run fsck or what?
>
> Hans

Regarding ReiserFS, something spookier: OpenLinux (no boos and
hisses please ;) ) ships ReiserFS as a module, yet when I have the root
partition on reiser I have no problems. Voodoo magic, perhaps? Because when I
compiled 2.4.7 with ReiserFS as a module, the boot forks up.

Regarding the last comment, I think Redhat and Caldera have debugging enabled
(God knows why?). Well, Caldera definitely does, judging from a look at
their default kernel configuration. Hence, when I recompiled my kernel as
2.4.7 and threw reiserFS into the guts of the kernel with debugging turned
off, there was a speed increase.
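
One way to check whether a given kernel was built with the ReiserFS debugging checks is to look for CONFIG_REISERFS_CHECK in its configuration file. The following is a minimal sketch; the function name and the config path in the usage note are assumptions made for illustration.

```shell
# Hypothetical helper: succeed if the kernel config file given as $1
# enables the ReiserFS debug/sanity checks (CONFIG_REISERFS_CHECK=y).
reiserfs_check_enabled() {
    grep -q '^CONFIG_REISERFS_CHECK=y' "$1"
}
```

For example, `reiserfs_check_enabled /usr/src/linux/.config && echo "reiserfs debug checks are on"`. A disabled option normally appears as a comment line ("# CONFIG_REISERFS_CHECK is not set"), which the anchored pattern does not match.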

Also, to speed it up, I have heard an urban myth (I am not too sure whether it
is true) that you add the notail mount option. A little more disk space is used, but
apparently it is meant to speed up access.
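
For anyone wanting to try this, notail is a ReiserFS mount option and can be given in the options field of /etc/fstab. The sketch below prints a copy of an fstab-style file with notail appended to every reiserfs entry. The function name is made up for illustration, and it writes to standard output rather than editing the file. As far as I know the option only affects newly written files, which matches the remark elsewhere in this thread about already-merged tails.

```shell
# Hypothetical helper: print the fstab-style file $1 with ",notail"
# appended to the options of each reiserfs entry that lacks it.
# Comments, short lines and other filesystem types pass through
# unchanged. Modified lines are re-joined with single spaces by awk.
add_notail() {
    awk '
        /^[ \t]*#/ || NF < 4 { print; next }
        $3 == "reiserfs" && $4 !~ /(^|,)notail(,|$)/ { $4 = $4 ",notail" }
        { print }
    ' "$1"
}
```

A cautious way to use it would be `add_notail /etc/fstab > /tmp/fstab.new`, then review the result by hand before replacing anything.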

Matthew Gardiner
--
WARNING:

This email was written on an OS using the viral 'GPL' as its license.

Please check with Bill Gates before continuing to read this email/posting.

_________________________________________________________
Do You Yahoo!?
Get your free @yahoo.com address at http://mail.yahoo.com

2001-07-28 14:09:56

by Chris Mason

Subject: Re: ReiserFS / 2.4.6 / Data Corruption



On Saturday, July 28, 2001 11:36:33 AM +0400 Hans Reiser <[email protected]>
wrote:

> Alan Cox wrote:
>>
>> No doubt if Namesys ran test suites the whole tail merging bug fiasco and the
>> directory/tree balance races wouldn't have happened.
> Our test suites need much improvement, but we do have them and use them.
> Can you say the same?

He's already described some of the testing they do. I would suggest there
are better ways to use l-k bandwidth than picking a fight with redhat,
especially on topics that have already been beaten to death.

Alan, thanks for helping to test the reiserfs patches we've been sending in
to the ac tree, we do appreciate it.

-chris

2001-07-28 14:14:46

by Matthew Gardiner

Subject: Re: ReiserFS / 2.4.6 / Data Corruption

On Saturday 28 July 2001 01:24, [email protected] wrote:
> On Fri, 27 Jul 2001, Hans Reiser wrote:
> > [email protected] wrote:
> > > On Wed, 18 Jul 2001, Erik Mouw wrote:
> > > > On Wed, Jul 18, 2001 at 03:18:59PM +1000, Steve Kieu wrote:
> > > > > My advice:
> > > > >
> > > > > Dont use reiserfs,JFS
> > > > > it is ok to use ext2
> > > > >
> > > > > Go journalling? use ext3 or XFS
> > > > >
> > > > > I have used all of these fs and pick up this rule (up
> > > > > to now, not sure it remains right in the far future)
> > > >
> > > > FUD. I've been using reiserfs on quite some systems and never got any
> > > > problem. If reiserfs wouldn't be stable, SuSE wouldn't have supported
> > > > it as one of their stable filesystems for over a year.
> > >
> > > Actually, I've been having some nasty corruption problems as well with
> > > reiserfs. I develop my own drivers, and do occasionally make a mistake,
> > > and when that hangs the kernel it will also screw up all files touched
> > > just before it in a edit-make-install-try cycle. Which can be rather
> > > annoying, because you can start all over again (this effect randomly
> > > distributes the last touched sectors to the last touched files. Very
> > > nice effect, but not something I expect from a journalled filesystem).
> >
> > Do you think it is reasonable to ask that a filesystem be designed to
> > work well with bad drivers?
>
> Yup. I know ext2 can do it. I expect a filesystem to not foul up my data
> when something happens. Especially not shuffle around sectors in several
> files. I can understand that the changes I made are not on disc, I can
> even understand it if my files are gone, but not when it corrupts my data.
> That just plain sucks.
>
> A friend of mine has had crashes as well (not reiser related btw), where
> files he was using at the time suddenly contained different pieces of
> different files. It's just plain annoying. The reason why *I* use(d)
> reiserfs was the fact that I thought that it would protect my data when
> something does crash. From my experience, it doesn't, and I'd rather wait
> a couple of minutes for ext2 to fsck than use reiserfs and be sure I can
> start all over again.
>
> Regards,
>
> Bas Vermeulen

What chipset have you got? I am running a PIII 550 w/ Intel BX chipset and
ReiserFS (Kernel 2.4.7) and haven't run into any problems. Have you tried
just sticking with the generic IDE driver and waiting until the chipset-specific
driver becomes more stable?

Matthew Gardiner

--
WARNING:

This email was written on an OS using the viral 'GPL' as its license.

Please check with Bill Gates before continuing to read this email/posting.


2001-07-28 14:17:56

by Matthew Gardiner

Subject: Re: ReiserFS / 2.4.6 / Data Corruption

On Saturday 28 July 2001 01:39, Alan Cox wrote:
> > > Putting a sync just before the insmod when developing new drivers is a
> > > good idea btw
> >
> > I've been doing that most of the time. But I sometimes forget that.
> > But as I said, it's not something I expected from a journalled
> > filesystem.
>
> You misunderstand journalling then
>
> A journalling file system can offer different levels of guarantee. With
> metadata only journalling you don't take any real performance hit but your
> file system is always consistent on reboot (consistent as in fsck would
> pass it) but it makes no guarantee that data blocks got written.
>
> Full data journalling will give you what you expect but at a performance
> hit for many applications.
>
> Alan

Just with regard to full journalling: is there, or will there be, an option in
ReiserFS to allow it? Personally, I would much rather have full journalling with
a little more of a performance hit, for security and reliability, than great
performance and a higher level of risk.
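
Until then, the advice quoted above (run sync just before insmod when trying out a new driver) can be wrapped in a small shell function so it is not forgotten. This is only a sketch: the function name and the SYNC and INSMOD override variables are inventions for illustration, and it narrows the window for losing recently written data rather than closing it.

```shell
# Hypothetical wrapper: flush dirty buffers to disk, then load the module.
# SYNC and INSMOD can be overridden (e.g. for a dry run); by default the
# real sync and insmod commands are used.
load_test_module() {
    "${SYNC:-sync}" || return 1     # do not load the module if sync fails
    "${INSMOD:-insmod}" "$@"
}
```

Usage would be `load_test_module ./mydriver.o` in place of a bare insmod in the edit-make-install-try cycle.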

Matthew Gardiner
--
WARNING:

This email was written on an OS using the viral 'GPL' as its license.

Please check with Bill Gates before continuing to read this email/posting.


2001-07-28 14:20:16

by Matthew Gardiner

Subject: Re: ReiserFS / 2.4.6 / Data Corruption

I've noticed that in menuconfig there is support for the Veritas
Journalling File System. Has there been any push for that to be a "bootable"
filesystem so it can be used for Linux?

Matthew Gardiner
--
WARNING:

This email was written on an OS using the viral 'GPL' as its license.

Please check with Bill Gates before continuing to read this email/posting.


2001-07-28 14:38:25

by bvermeul

Subject: Re: ReiserFS / 2.4.6 / Data Corruption

> What chipset have you got? I am running a PIII 550 w/ Intel BX chipset and
> ReiserFS (Kernel 2.4.7) and haven't run into any problems. Have you tried
> just sticking with the generic IDE driver and wait until the chipset specific
> driver becomes more stable?

Intel i815, and the thing is rock solid unless I fuck up with a driver.

Bas Vermeulen

--
"God, root, what is difference?"
-- Pitr, User Friendly

"God is more forgiving."
-- Dave Aronson

Subject: Re: ReiserFS / 2.4.6 / Data Corruption

Hans Reiser <[email protected]> writes:

> Well, I am afraid this is much too vague for me to have any
> understanding of what went wrong on your system.

But on information just as vague you were able to accuse Redhat that they
"just shovel software on a CD". Why? Because, unlike some other vendors
such as SuSE, they didn't give you money?

The thing that really pisses me off about ReiserFS from time to time
is not the "FS" part...

Regards
Henning

--
Dipl.-Inf. (Univ.) Henning P. Schmiedehausen -- Geschaeftsfuehrer
INTERMETA - Gesellschaft fuer Mehrwertdienste mbH [email protected]

Am Schwabachgrund 22 Fon.: 09131 / 50654-0 [email protected]
D-91054 Buckenhof Fax.: 09131 / 50654-20

2001-07-28 16:17:33

by Hans Reiser

Subject: Re: ReiserFS / 2.4.6 / Data Corruption

Matthew Gardiner wrote:
>
> On Saturday 28 July 2001 02:55, Hans Reiser wrote:
> > Joshua Schmidlkofer wrote:
> > > I've almost quit using reiser, because every time I have a power outage,
> > > the last 2 or three files that I've edited, even ones that I haven't
> > > touched in a while, will usually be hopelessly corrupted. The '<file>~'
> > > that Emacs makes is usually fine though. It seems to be that any open
> > > file is in danger. I don't know if this is normal, or not, but I
> > > switched to XFS on several machines. I have nothing against reiser. I
> > > assumed that these problems were due to immaturity....
> > >
> > > One more thing - All my computers with Reiser as '/' on them had a
> > > disturbingly long boot time. From the time when the Redhat startup
> > > scripts began, it was.... hideously slow. I thought nothing of it,
> > > blaming bash,
> >
> > Don't use RedHat with ReiserFS, they screw things up so many ways.....
> >
> > For instance, they compile it with the wrong options set, their boot
> > scripts are wrong, they just shovel software onto the CD.
> >
> > Use SuSE, and trust me, ReiserFS will boot faster than ext2.
> >
> > Actually, I am curious as to exactly how they manage to make ReiserFS boot
> > longer than ext2. Do they run fsck or what?
> >
> > Hans
>
> Regarding ReiserFS, something more spooky: OpenLinux (no boos and
> hisses please ;) ) has ReiserFS as a module, yet when I have the root
> partition as reiser I have no problems. Voodoo magic perhaps? Because when I
> compiled 2.4.7 w/ ReiserFS as a module, the boot forks up.

Perhaps there is a problem in which the reiserfs module does not get loaded
before you need to read the root partition?

If you isolate the problem to where you think it is a reiserfs bug, please let
me know. It sounds like not.


>
> Also, to speed it up, I have heard an urban myth (I am not too sure whether it
> is true) that you add the notail option. A little more disk space is used;
> however, apparently, it is meant to speed up access.

This is entirely correct. Moving tails around costs performance; ReiserFS
cannot give you something for nothing in this respect.
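
As a rough illustration of the space side of this trade-off, the sketch below estimates how many bytes a set of files occupies with and without tail packing. It assumes the 4096-byte block size reported by debugreiserfs at the start of this thread, ignores all tree and item overhead, and pretends every partial last block gets packed. `bytes_used` is a hypothetical helper, so the result is an upper bound on the saving rather than a description of what ReiserFS really does. The option itself is given at mount time, for example with `mount -o notail`.

```python
BLOCK_SIZE = 4096  # taken from the debugreiserfs output earlier in the thread


def bytes_used(file_sizes, pack_tails, block_size=BLOCK_SIZE):
    """Estimate the data bytes consumed by files of the given sizes.

    Simplifying assumption: with tail packing a partial last block costs
    exactly its own length; without it (notail) it costs a whole block.
    """
    total = 0
    for size in file_sizes:
        full_blocks, tail = divmod(size, block_size)
        total += full_blocks * block_size
        if tail:
            total += tail if pack_tails else block_size
    return total


sizes = [100, 4096, 5000]
print(bytes_used(sizes, pack_tails=True))   # 9196
print(bytes_used(sizes, pack_tails=False))  # 16384
```

The gap between the two figures is the "little more disk space" that notail costs; the performance gain comes from no longer having to move tails around as files grow.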

Hans

2001-07-28 16:24:03

by Alan

Subject: Re: ReiserFS / 2.4.6 / Data Corruption

> I've noticed that in the menuconfig there is support for the Veritas
> Journalling File System. Has there been any push for that to be a "bootable"
> filesystem so it can be used for Linux?

The Linux freevxfs module is read-only currently. Veritas apparently will be
releasing the genuine article for Linux, but binary-only, with all the mess
that entails.

2001-07-28 16:27:43

by Jeff Garzik

Subject: binary modules (was Re: ReiserFS / 2.4.6 / Data Corruption)

Alan Cox wrote:
> The Linux freevxfs module is read only currently. Veritas apparently will be
> releasing the genuine article for Linux but binary only with all the mess
> that entails

Isn't that a violation of the GPL, to release binary modules?

--
Jeff Garzik | "Mind if I drive?" -Sam
Building 1024 | "Not if you don't mind me clawing at the dash
MandrakeSoft | and shrieking like a cheerleader." -Max

2001-07-28 16:45:34

by Marcus Meissner

Subject: Re: ReiserFS / 2.4.6 / Data Corruption

In article <[email protected]> you wrote:
> Regarding ReiserFS, something more spooky: OpenLinux (no boos and
> hisses please ;) ) has ReiserFS as a module, yet when I have the root
> partition as reiser I have no problems. Voodoo magic perhaps? Because when I
> compiled 2.4.7 w/ ReiserFS as a module, the boot forks up.

We have the reiserfs module in the initial ramdisk in such setups.

You need to recreate the initrd in those cases.
(Run "/usr/libexec/modules/mkinitrd.sh 2.4.7" in the /boot directory, this
will create /boot/initrd-2.4.7.gz.)

> Regarding the last comment, I think Redhat and Caldera have debugging enabled
> (God knows why?), well, Caldera definitely does, after having a look at
> their default kernel configuration, hence, when I recompiled my kernel to
> 2.4.7, threw the reiserFS into the guts of the kernel with debugging turned
> off, there was a speed increase.

ReiserFS is experimental in the 2.4 series, that's why we ship with a big
disclaimer and with checking enabled.

(And before you argue again, we ship 2.4.2-ac26. Since then several major
bugs have been found in reiserfs, including the knfsd lossage.)

Ciao, Marcus

2001-07-28 16:45:46

by Christoph Hellwig

Subject: Re: ReiserFS / 2.4.6 / Data Corruption

Hi Matthew,

In article <[email protected]> you wrote:
> Regarding ReiserFS, something more spooky: OpenLinux (no boos and
> hisses please ;) ) has ReiserFS as a module, yet when I have the root
> partition as reiser I have no problems. Voodoo magic perhaps? Because when I
> compiled 2.4.7 w/ ReiserFS as a module, the boot forks up.

Well, as reiserfs is a module it needs to be on the initrd. The install
of the kernel RPM automatically creates a new initrd which includes
the modules in /etc/modules/rootfs. If you don't create a new initrd your
modular reiserfs setup will of course fail.
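
Whether a freshly built kernel needs a new initrd for a ReiserFS root can be read off its configuration. The sketch below is a hypothetical helper (`reiserfs_root_needs_initrd` is not an existing tool) that inspects the text of a kernel `.config` for the `CONFIG_REISERFS_FS` symbol: built in (`=y`) means the root partition can be mounted directly, modular (`=m`) means the module has to be in the initrd.

```python
def reiserfs_root_needs_initrd(config_text, symbol="CONFIG_REISERFS_FS"):
    """Return True if the filesystem is modular (=m), False if built in (=y).

    Raises ValueError when the option is not enabled at all, because such
    a kernel cannot mount a ReiserFS root with or without an initrd.
    """
    for line in config_text.splitlines():
        line = line.strip()
        if line.startswith(symbol + "="):
            value = line.split("=", 1)[1]
            if value == "y":
                return False
            if value == "m":
                return True
    raise ValueError(symbol + " is not enabled in this configuration")


config = "CONFIG_EXT2_FS=y\nCONFIG_REISERFS_FS=m\n"
print(reiserfs_root_needs_initrd(config))  # True
```

A result of True corresponds to the failing case described in this thread: the kernel was rebuilt with ReiserFS as a module but the initrd was not regenerated.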

> Regarding the last comment, I think Redhat and Caldera have debugging enabled
> (God knows why?), well, Caldera definitely does, after having a look at
> their default kernel configuration, hence, when I recompiled my kernel to
> 2.4.7, threw the reiserFS into the guts of the kernel with debugging turned
> off, there was a speed increase.

Reiserfs as implemented in the 2.4.2-based kernel of OpenLinux 3.1 is
anything but stable and has a lot of issues (e.g. NFS-exporting doesn't
work). That is the reason why it is a) marked experimental and is completely
unsupported (and that is written _big_ _fat_ in manuals and similar stuff)
and b) has debugging enabled to have the additional sanity checks that are
under this option and give additional hints if reiserfs fails again.

Christoph

--
Of course it doesn't work. We've performed a software upgrade.

2001-07-28 16:51:35

by Thomas Kotzian

Subject: missing symbols in 2.4.7-ac2

when compiling with highmem = 4GB
problem in 3c59x - module:
unresolved symbol nr_free_highpages ...

ThomasK.

2001-07-28 17:45:22

by Richard Gooch

Subject: Re: binary modules (was Re: ReiserFS / 2.4.6 / Data Corruption)

Jeff Garzik writes:
> Alan Cox wrote:
> > The Linux freevxfs module is read only currently. Veritas apparently will be
> > releasing the genuine article for Linux but binary only with all the mess
> > that entails
>
> Isn't that a violation of the GPL, to release binary modules?

Linus said it's OK. I know Alan doesn't agree, but that's life :-)
The king penguin has spoken.

I don't see the need to be bloody-minded on this issue. If a vendor
wants to go through the pain of tracking kernel drift and having to
compile modules for many different versions, then let them. Given how
much trouble it is, why bother them with legal threats?

The right answer for vendors who want to ship binary modules is to
ship an Open Source interface layer which shields the vendor from
kernel drift (since users will be able to build the interface layer if
they need to, without waiting for the vendor).
I guess that would also shield them from unhelpful legal threats.

Regards,

Richard....
Permanent: [email protected]
Current: [email protected]

2001-07-28 18:22:59

by Andreas Dilger

Subject: Re: binary modules (was Re: ReiserFS / 2.4.6 / Data Corruption)

Jeff Garzik writes:
> Alan Cox wrote:
> > The Linux freevxfs module is read only currently. Veritas apparently will be
> > releasing the genuine article for Linux but binary only with all the mess
> > that entails
>
> Isn't that a violation of the GPL, to release binary modules?

Noooooo.... Not this thread again.

Cheers, Andreas
--
Andreas Dilger
http://sourceforge.net/projects/ext2resize/
http://www-mddsp.enel.ucalgary.ca/People/adilger/

2001-07-28 19:03:10

by Rik van Riel

Subject: Re: binary modules (was Re: ReiserFS / 2.4.6 / Data Corruption)

On Sat, 28 Jul 2001, Jeff Garzik wrote:
> Alan Cox wrote:
> > The Linux freevxfs module is read only currently. Veritas apparently will be
> > releasing the genuine article for Linux but binary only with all the mess
> > that entails
>
> Isn't that a violation of the GPL, to release binary modules?

Binary modules using only the interfaces exported in /proc/ksyms
are, under certain readings of the GPL, no more "infected" by the
GPL than binary programs making system calls.

This means binary only modules are ok, as long as they don't need
changes in the kernel to work.

regards,

Rik
--
Virtual memory is like a game you can't win;
However, without VM there's truly nothing to lose...

http://www.surriel.com/ http://distro.conectiva.com/

Send all your spam to [email protected] (spam digging piggy)

2001-07-28 19:08:02

by Alan

Subject: Re: binary modules (was Re: ReiserFS / 2.4.6 / Data Corruption)

> The right answer for vendors who want to ship binary modules is to
> ship an Open Source interface layer which shields the vendor from
> kernel drift (since users will be able to build the interface layer if
> they need to, without waiting for the vendor).

As people have seen from vmware and from the ever growing piles of
nvidia crashes, the truth about binary modules in general, even with glue, is
pain and suffering.

Veritas have some good Linux people though, and while I'm sad they won't
open source the core of veritas, they do at least appear to have the
knowledgebase to do a good job.

2001-07-29 01:47:42

by Andrew Morton

Subject: Re: missing symbols in 2.4.7-ac2

Thomas Kotzian wrote:
>
> when compiling with highmem = 4GB
> problem in 3c59x - module:
> unresolved symbol nr_free_highpages ...
>

Ah. Sorry.

Alan, is it OK to export this symbol?


--- linux-2.4.7-ac1/kernel/ksyms.c Sun Jul 29 11:43:01 2001
+++ ac/kernel/ksyms.c Sun Jul 29 11:43:05 2001
@@ -122,6 +122,7 @@ EXPORT_SYMBOL(kmap_high);
 EXPORT_SYMBOL(kunmap_high);
 EXPORT_SYMBOL(highmem_start_page);
 EXPORT_SYMBOL(create_bounce);
+EXPORT_SYMBOL(nr_free_highpages);
 #endif
 
 /* filesystem internal functions */

2001-07-29 07:06:16

by Richard Gooch

Subject: Re: binary modules (was Re: ReiserFS / 2.4.6 / Data Corruption)

Alan Cox writes:
> > The right answer for vendors who want to ship binary modules is to
> > ship an Open Source interface layer which shields the vendor from
> > kernel drift (since users will be able to build the interface layer if
> > they need to, without waiting for the vendor).
>
> As people have seen from vmware and from the ever growing piles of
> nvidia crashes the truth about binary modules in general even with
> glue is pain and suffering.

Sure. If you load a binary module (shim layer or not), you don't get
community support. So vendors are digging their own shitpile by
shipping binary-only drivers. I just don't see the need to shove them
in the back while they do it.

Besides, if someone can make a lot of money shipping binary drivers,
then they can afford the support costs, so it may well be a viable
revenue model for them (at the very least, programmers need to eat
too).

> Veritas have some good Linux people though, and while I'm sad they
> won't open source the core of veritas they do at least appear to
> have the knowledgebase to do a good job

Yeah, I'd rather see all source open. But that's an ideal world. In
the meantime, many people want $$$. One of the great things about
Linux is that it is open and allows different funding models. The
success of Linux is due to the openness, not some cool technological
feature.

Open Source pushes the "innovation envelope". Eventually, the "core"
(what's now the basic OS) which isn't worth selling grows outwards,
consuming areas where it used to be profitable to sell software. So it
forces companies to innovate or die, leading to a dynamic industry.
That is good for both Society and Industry (as seen by the respective
ideological poles).

Regards,

Richard....
Permanent: [email protected]
Current: [email protected]

2001-07-29 10:00:28

by Chris Wedgwood

Subject: Re: binary modules (was Re: ReiserFS / 2.4.6 / Data Corruption)

On Sun, Jul 29, 2001 at 03:05:06AM -0400, Richard Gooch wrote:

> Yeah, I'd rather see all source open. But that's an ideal world. In
> the meantime, many people want $$$. One of the great things about
> Linux is that it is open and allows different funding models. The
> success of Linux is due to the openness, not some cool technological
> feature.

People need to appreciate that sometimes vendors cannot release open
source drivers even if they wanted to. Sometimes vendors are only able
to release binary-only drivers, which are derived in part from
source code which they license --- but cannot share.

This is also the case for various SCSI cards and such like, firmware
is provided binary-only because the source for the firmware isn't
something that can be distributed.



--cw

2001-07-29 10:16:39

by Matthew Gardiner

Subject: Re: ReiserFS / 2.4.6 / Data Corruption

On Sunday 29 July 2001 04:25, Alan Cox wrote:
> > I've noticed that in the menuconfig there is support for the Veritas
> > Journalling File System. Has there been any push for that to be a
> > "bootable" filesystem so it can be used for Linux?
>
> The Linux freevxfs module is read only currently. Veritas apparently will
> be releasing the genuine article for Linux but binary only with all the
> mess that entails

tsk tsk tsk. A bit disappointing that Veritas has taken that approach.
However, even still, reiserFS is pretty awesome. Extremely fast and space
efficient, esp on a 60gig drive ;)

Matthew Gardiner
--
WARNING:

This email was written on an OS using the viral 'GPL' as its license.

Please check with Bill Gates before continuing to read this email/posting.


2001-07-29 10:21:31

by Hugh Dickins

Subject: Re: missing symbols in 2.4.7-ac2

On Sun, 29 Jul 2001, Andrew Morton wrote:
> Thomas Kotzian wrote:
> > when compiling with highmem = 4GB
> > problem in 3c59x - module:
> > unresolved symbol nr_free_highpages ...
>
> Ah. Sorry.
> Alan, is it OK to export this symbol?

Laconic version: "Probably not: si_meminfo() is your friend".

Verbose version:
I don't think you really want nr_free_highpages(), that's transient
info - it won't usually fall so low as 0 if there is highmem, but do
you want to rely on that? And nr_free_highpages() is CONFIG_HIGHMEM
only, so you'd need #ifdef CONFIG_HIGHMEM around its call in 3c59x.c.

But si_meminfo() is already exported: I think sysinfo.totalhigh is
what you want to check; if I'm wrong, and you really are interested
in whether there are currently free highpages, sysinfo.freehigh
gives you that too without needing a new export.

(I think there probably will be a need for new interfaces
to export more per-zone memory info, but not for this.)
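
The difference between the two questions ("does the system have any highmem?" versus "are there free high pages right now?") can also be seen from user space. The sketch below assumes a kernel whose /proc/meminfo carries `HighTotal:` and `HighFree:` lines, which as far as I know mirror the `totalhigh` and `freehigh` fields discussed here; `highmem_status` is a hypothetical helper name. Kernels without highmem simply omit the lines, and the sketch treats that as zero.

```python
def highmem_status(meminfo_text):
    """Return (has_highmem, has_free_highmem) from /proc/meminfo-style text."""
    fields = {}
    for line in meminfo_text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue
        parts = rest.split()
        # Keep only "Name: <number> ..." lines; ignore header rows.
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    total = fields.get("HighTotal", 0)  # stable: set once at boot
    free = fields.get("HighFree", 0)    # transient: changes with load
    return total > 0, free > 0


sample = "MemTotal: 1030000 kB\nHighTotal: 130000 kB\nHighFree: 0 kB\n"
print(highmem_status(sample))  # (True, False)
```

The sample shows the case the driver would get wrong with the transient check: highmem exists, but none of it happens to be free at the moment of the test.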

Hugh

--- linux-2.4.7-ac2/drivers/net/3c59x.c Sat Jul 28 07:12:03 2001
+++ linux/drivers/net/3c59x.c Sun Jul 29 10:53:31 2001
@@ -1299,8 +1299,11 @@
 	/* The 3c59x-specific entries in the device structure. */
 	dev->open = vortex_open;
 	if (vp->full_bus_master_tx) {
+		struct sysinfo sysinfo;
+
 		dev->hard_start_xmit = boomerang_start_xmit;
-		if (nr_free_highpages() == 0) {
+		si_meminfo(&sysinfo);
+		if (sysinfo.totalhigh == 0) {
 			/* Actually, it still should work with iommu. */
 			dev->features |= NETIF_F_SG;
 		}

2001-07-29 10:21:19

by Matthew Gardiner

Subject: Re: ReiserFS / 2.4.6 / Data Corruption

On Sunday 29 July 2001 04:45, Christoph Hellwig wrote:
> Hi Matthew,
>
> In article <[email protected]> you wrote:
> > Regarding ReiserFS, something more spooky: OpenLinux (no boos and
> > hisses please ;) ) has ReiserFS as a module, yet when I have the
> > root partition as reiser I have no problems. Voodoo magic perhaps?
> > Because when I compiled 2.4.7 w/ ReiserFS as a module, the boot forks up.
>
> Well, as reiserfs is a module it needs to be on the initrd. The install
> of the kernel RPM automatically creates a new initrd which includes
> the modules in /etc/modules/rootfs. If you don't create a new initrd your
> modular reiserfs setup will of course fail.
>
> > Regarding the last comment, I think Redhat and Caldera have debugging
> > enabled (God knows why?), well, Caldera definitely does, after having a
> > look at their default kernel configuration, hence, when I recompiled my
> > kernel to 2.4.7, threw the reiserFS into the guts of the kernel with
> > debugging turned off, there was a speed increase.
>
> Reiserfs as implemented in the 2.4.2-based kernel of OpenLinux 3.1 is
> anything but stable and has a lot of issues (e.g. NFS-exporting doesn't
> work). That is the reason why it is a) marked experimental and is
> completely unsupported (and that is written _big_ _fat_ in manuals and
> similar stuff) and b) has debugging enabled to have the additional sanity
> checks that are under this option and give additional hints if reiserfs
> fails again.

I've upgraded to 2.4.7 without any problems.

Regarding the above, that is, modular ReiserFS: it's not really an issue, as
compiling it into the kernel hasn't caused any problems.

Just a little suggestion. Is it possible to offer "kernel binary upgrades"
every other major release? For example, it shipped with 2.4.2, so the
next upgrades to be released would be 2.4.4, then 2.4.6, then 2.4.8. I can
compile everything myself; however, I know a couple of people too scared to
get down into the nitty gritty of Linux and compile their own kernel.

Matthew Gardiner

--
WARNING:

This email was written on an OS using the viral 'GPL' as its license.

Please check with Bill Gates before continuing to read this email/posting.


2001-07-29 10:25:49

by Matthew Gardiner

Subject: Re: binary modules (was Re: ReiserFS / 2.4.6 / Data Corruption)

On Sunday 29 July 2001 07:08, Alan Cox wrote:
> > The right answer for vendors who want to ship binary modules is to
> > ship an Open Source interface layer which shields the vendor from
> > kernel drift (since users will be able to build the interface layer if
> > they need to, without waiting for the vendor).
>
> As people have seen from vmware and from the ever growing piles of
> nvidia crashes the truth about binary modules in general even with glue is
> pain and suffering.
>
> Veritas have some good Linux people though, and while I'm sad they won't
> open source the core of veritas they do at least appear to have the
> knowledgebase to do a good job

1. With the file system, why not charge for commercial use?
2. Regarding hardware manufacturers, what have they got to lose from
publishing the specs? Nothing.

Matthew Gardiner
--
WARNING:

This email was written on an OS using the viral 'GPL' as its license.

Please check with Bill Gates before continuing to read this email/posting.


2001-07-29 10:42:33

by Andrew Morton

Subject: Re: missing symbols in 2.4.7-ac2

Hugh Dickins wrote:
>
> On Sun, 29 Jul 2001, Andrew Morton wrote:
> > Thomas Kotzian wrote:
> > > when compiling with highmem = 4GB
> > > problem in 3c59x - module:
> > > unresolved symbol nr_free_highpages ...
> >
> > Ah. Sorry.
> > Alan, is it OK to export this symbol?
>
> Laconic version: "Probably not: si_meminfo() is your friend".

:)

> Verbose version:
> I don't think you really want nr_free_highpages(), that's transient
> info - it won't usually fall so low as 0 if there is highmem, but do
> you want to rely on that?

Prefer not to. We want to know "does the system have any highmem
pages". I didn't know about sysinfo.totalhigh, so I used
nr_free_highpages(), which answers the question "does the system
have any free high pages right now".

It's good enough - if we get it wrong (system was very low on memory
when the driver was initialised) the driver will work - it just won't
perform zerocopy optimisations.

> And nr_free_highpages() is CONFIG_HIGHMEM
> only, so you'd need #ifdef CONFIG_HIGHMEM around its call in 3c59x.c.

That's OK actually - nr_free_highpages() evaluates to constant zero if
CONFIG_HIGHMEM isn't defined.


> --- linux-2.4.7-ac2/drivers/net/3c59x.c Sat Jul 28 07:12:03 2001
> +++ linux/drivers/net/3c59x.c Sun Jul 29 10:53:31 2001
> @@ -1299,8 +1299,11 @@
>  	/* The 3c59x-specific entries in the device structure. */
>  	dev->open = vortex_open;
>  	if (vp->full_bus_master_tx) {
> +		struct sysinfo sysinfo;
> +
>  		dev->hard_start_xmit = boomerang_start_xmit;
> -		if (nr_free_highpages() == 0) {
> +		si_meminfo(&sysinfo);
> +		if (sysinfo.totalhigh == 0) {
>  			/* Actually, it still should work with iommu. */
>  			dev->features |= NETIF_F_SG;
>  		}

Much preferable! Thanks.

I've checked all the architectures. Looks fine, works OK. Alan, please
apply this one.

-

2001-07-29 11:03:47

by Chris Wedgwood

Subject: Re: ReiserFS / 2.4.6 / Data Corruption

On Sun, Jul 29, 2001 at 10:19:32PM +1200, Matthew Gardiner wrote:

> Just a little suggestion. Is it possible to offer "kernel binary
> upgrades" every other major release, for example, it shipped with
> 2.4.2, hence, the next upgrade to be released, 2.4.4 then 2.4.6
> then 2.4.8. I can compile everything however, I know a couple of
> people too scared to get down into the nitty gritty of Linux and
> compile their own kernel.

Umm.. most (if not all) vendors provide kernel binary upgrades. If
these are not frequent enough for your needs, you need to complain to
them.




--cw

2001-07-29 11:10:16

by Chris Wedgwood

Subject: Re: ReiserFS / 2.4.6 / Data Corruption

On Sun, Jul 29, 2001 at 10:15:03PM +1200, Matthew Gardiner wrote:

> tsk tsk tsk. A bit disappointing that Veritas has taken that approach.
> However, even still, reiserFS is pretty awesome. Extremely fast and space
> efficient, esp on a 60gig drive ;)

Why "tsk tsk tsk" ? If reiserfs suits you, use it --- you need never
go near VXFS.

Personally, even though I use reiserfs, I am of the opinion that XFS,
and VXFS are both superior, especially when you include volume
management. Time will show whether or not these very well designed
file-systems are suitable under Linux though, as reiserfs has a
considerable head start.



--cw

2001-07-29 11:07:16

by Chris Wedgwood

Subject: Re: binary modules (was Re: ReiserFS / 2.4.6 / Data Corruption)

On Sun, Jul 29, 2001 at 10:24:11PM +1200, Matthew Gardiner wrote:

> 1. With the file system, why not charge for commercial use?

Maybe they will... but it's not something they could do under the GPL.

> 2. Regards to hardware manufacturers, what have they got to lose from
> publishing the specs? nothing.

Many manufacturers will claim otherwise... for some hardware products,
the useful life cycle is only six months; if you can't make money
within that period of time, the product never will --- so there are
arguments for keeping things vague for just a little while.

Also, some hardware vendors cannot release specifications because they
don't own all the IP here either (see my earlier comments) or are part
of some kind of cartel/consortium which is overrun by lobotomized
lawyers, the DVD people for example.




--cw

2001-07-29 11:16:57

by Christoph Hellwig

Subject: Re: ReiserFS / 2.4.6 / Data Corruption

In article <[email protected]> you wrote:
> I've noticed that in the menuconfig there is support for the Veritas
> Journalling File System. Has there been any push for that to be a "bootable"
> filesystem so it can be used for Linux?

I don't see any reason why it shouldn't be bootable, I just haven't tested it
yet. If you want to try it, please follow the steps below:

1) Get one of those read-only CD-ROM distributions
2) Copy it over NFS to a UnixWare system (or any other x86 system with VxFS)
3) Make a VxFS filesystem big enough for the distribution
4) Copy the distribution onto the VxFS filesystem

And now the difficult part:

5) Adjust the ondisk dev_t to match Linux's major/minor split instead
of SVR4's. This can either be done by creating (bogus) SVR4 device
nodes that are valid Linux ones when read by Linux or by doing this
with fsdb after they were created.
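
To make step 5 concrete, here is a sketch of the arithmetic behind the "bogus" device nodes. It rests on assumptions that are not stated in this thread: that SVR4 packs a device number as a 14-bit major above an 18-bit minor, that Linux 2.4 packs it as an 8-bit major above an 8-bit minor, and that the Linux side reads the raw on-disk number unchanged. The function names are hypothetical.

```python
SVR4_MINOR_BITS = 18   # assumed SVR4 layout: 14-bit major, 18-bit minor
LINUX_MINOR_BITS = 8   # assumed Linux 2.4 layout: 8-bit major, 8-bit minor


def linux_encode(major, minor):
    """Pack a Linux 2.4 style device number (8:8 split)."""
    if not (0 <= major < 256 and 0 <= minor < 256):
        raise ValueError("Linux 2.4 device numbers must fit in 8 bits each")
    return (major << LINUX_MINOR_BITS) | minor


def svr4_decode(raw):
    """Split a raw device number the way SVR4 is assumed to (14:18 split)."""
    return raw >> SVR4_MINOR_BITS, raw & ((1 << SVR4_MINOR_BITS) - 1)


def bogus_svr4_numbers(linux_major, linux_minor):
    """Major/minor to create on the SVR4 side so that Linux reads the wanted node."""
    return svr4_decode(linux_encode(linux_major, linux_minor))


# /dev/hda1 is major 3, minor 1 on Linux (the 0301 in the vs-4080 message).
print(bogus_svr4_numbers(3, 1))  # (0, 769)
```

Under these assumptions every Linux device number fits inside the SVR4 minor field, so the bogus node always has major 0 and the whole Linux number as its minor.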

If you have success with this approach please drop me a mail - I'll add
it to the freevxfs docs then.

Christoph

--
Whip me. Beat me. Make me maintain AIX.

2001-07-29 14:29:59

by Luigi Genoni

Subject: Re: ReiserFS / 2.4.6 / Data Corruption



On Sun, 29 Jul 2001, Chris Wedgwood wrote:

> On Sun, Jul 29, 2001 at 10:15:03PM +1200, Matthew Gardiner wrote:
>
> tsk tsk tsk. A bit disappointing that Veritas has taken that approach.
> However, even still, reiserFS is pretty awesome. Extremely fast and space
> efficient, esp on a 60gig drive ;)
>
> Why "tsk tsk tsk" ? If reiserfs suits you, use it --- you need never
> go near VXFS.
It depends. For example, if you have to manage a farm (let's say 800
systems) with many Unixes around, where Solaris is 70% of your installed
base, then Veritas (mainly the volume manager) could be a way to keep a
uniform environment. That is a good thing if your sysadmin staff also
includes people without a really high skill level.
>
> Personally, even though I use reiserfs, I am of the opinion that XFS,
> and VXFS are both superior, especially when you include volume
> management.
A journaling filesystem and a volume manager are two complementary
and useful things, but they are different things all the same.
While I do agree that Linux LVM is still not completely usable in a
production environment (though ELVM from IBM is somewhat immature too),
and some details of its design are not completely, how can I say...,
suitable for future HW developments, I find ReiserFS technology to be
really interesting. From a technological point of view ReiserFS is much
more advanced than any other journaled FS around.

I have yet to see VxFS on Linux, but I have seen it under Solaris and HP-UX
(I think I have used every journaled filesystem around: jfs, xfs, reiserFS,
ext3, vxfs, gfs, on every Unix I could), and found it much too slow on
high-end SCSI HW. XFS on my Origin 2000 (8 processors) sometimes takes one
whole CPU just to manage journaling under heavy I/O. Under Linux XFS is maybe
better than under Irix (!!!???), but its technology was designed for another
kind of HW, and an experienced sysadmin can "feel" this.
> Time will show whether or not these very well designed
> file-systems are suitable under Linux though, as reiserfs has a
> considerable head start.
Yes, time will show. ReiserFS can have a wonderful future: better than
ext3 if it matures before ext3, worse if after. But for Linux,
jfs and xfs are interesting right now, just until the native journaled
filesystems are ready; then I would bet everyone will stay with ReiserFS or
ext3 and not consider any other solution.

Luigi


2001-07-30 10:08:40

by Hans Reiser

Subject: Re: ReiserFS / 2.4.6 / Data Corruption

Christoph Hellwig wrote:
>

> Reiserfs as implemented in the 2.4.2-based kernel of OpenLinux 3.1 is
> anything but stable and has a lot of issues (e.g. NFS-exporting doesn't
> work). That is the reason why it is a) marked experimental and is completely
> unsupported (and that is written _big_ _fat_ in manuals and similar stuff)
> and b) has debugging enabled to have the additional sanity checks that are
> under this option and give additional hints if reiserfs fails again.

The debugging won't prevent a single crash, it will only print a diagnostic that
might help to understand why it crashed. It makes zero sense for a distro to
have it on, and I think we make that pretty clear in the help button. It would
be nice if distros read the help buttons before selecting options when
configuring their kernels.:-/

I make no claims that users should use ReiserFS as it is in a 2.4.2 kernel....
>
> Christoph
>
> --
> Of course it doesn't work. We've performed a software upgrade.

2001-07-30 15:25:49

by Chris Mason

Subject: Re: ReiserFS / 2.4.6 / Data Corruption



On Saturday, July 28, 2001 03:28:05 AM +1000 Andrew Morton
<[email protected]> wrote:

[ patch to trigger data writes before commit in reiserfs ]

>
> There's no disruption to disk format - it just simulates
> the user typing `sync' at the right time. I think the
> concept is sound, and I'm sure Chris can find a more efficient
> way...

Well, it gets points for simplicity ;-)

What I think we need is for commit_write to put new buffers on a per-super
list of new buffers, and then the journal code can flush that list on
commit.

Since all the filesystems already mark things BH_New, it seems a good
choice to let commit_write look for BH_New buffers and put them on this new
list. But, the only place BH_New seems to get cleared right now is
unmap_buffer, which only gets called from block_flushpage.

Is there any reason we can't just clear BH_New before writing the buffer
out? It looks like a bug to leave it set the way we do now.
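
The proposal can be pictured with a toy model. This is a user-space sketch in Python, not kernel code: `ToySuper`, `Buffer` and the method names merely echo the kernel terms used above, and the "disk" is just a log of what would have been written in which order.

```python
class Buffer:
    """Stand-in for a buffer_head; `new` plays the role of the BH_New bit."""

    def __init__(self, block, new=True):
        self.block = block
        self.new = new


class ToySuper:
    """Stand-in for a superblock that keeps a list of newly allocated buffers."""

    def __init__(self):
        self.new_buffers = []
        self.log = []  # order in which things would reach the disk

    def commit_write(self, buf):
        # Only freshly allocated blocks need to hit the disk before the
        # metadata that points at them is committed.
        if buf.new:
            self.new_buffers.append(buf)

    def journal_commit(self):
        # Flush the data first, clearing the "new" mark before each write,
        # and only then write the commit record.
        for buf in self.new_buffers:
            buf.new = False
            self.log.append(("data", buf.block))
        self.new_buffers.clear()
        self.log.append(("commit", None))


sb = ToySuper()
for buf in (Buffer(10), Buffer(11), Buffer(12, new=False)):
    sb.commit_write(buf)
sb.journal_commit()
print(sb.log)  # [('data', 10), ('data', 11), ('commit', None)]
```

Clearing the mark before the write is what keeps a buffer from being queued again on a later commit_write, which is the behaviour the last question above is asking about.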

-chris

2001-07-30 15:41:59

by Andrew Morton

Subject: Re: ReiserFS / 2.4.6 / Data Corruption

Chris Mason wrote:
>
> On Saturday, July 28, 2001 03:28:05 AM +1000 Andrew Morton
> <[email protected]> wrote:
>
> [ patch to trigger data writes before commit in reiserfs ]
>
> >
> > There's no disruption to disk format - it just simulates
> > the user typing `sync' at the right time. I think the
> > concept is sound, and I'm sure Chris can find a more efficient
> > way...
>
> Well, it gets points for simplicity ;-)

Well, I tried system("/bin/sync"); but that didn't link.

> What I think we need is for commit_write to put new buffers on a per-super
> list of new buffers, and then the journal code can flush that list on
> commit.

whee. Now there's an idea - If the fs keeps track of all its inodes
then you can traverse those and flush out the i_dirty_buffers ring
on each one.

writepage() output is a problem, but that never sits well with
journalling. I guess one could do fdatasync/fdatawait against
the same list of inodes.

> Since all the filesystems already mark things BH_New, it seems a good
> choice to let commit_write look for BH_New buffers and put them on this new
> list. But, the only place BH_New seems to get cleared right now is
> unmap_buffer, which only gets called from block_flushpage.
>
> Is there any reason we can't just clear BH_New before writing the buffer
> out? It looks like a bug to leave it set the way we do now.

I think it can be cleared as soon as the get_block() caller has looked at
it, actually. test_and_clear_bit. The lifecycle of the various buffer_head
fields is exasperatingly fluffy.

I'd be reluctant to add another eight bytes to buffer_head though.
It's 96 now, which is a nice number. b_inode can go - it's just
a boolean. b_size and b_list can be crunched into a single byte..

How about just doing it via the inodes?


2001-07-30 16:05:31

by Chris Mason

[permalink] [raw]
Subject: Re: ReiserFS / 2.4.6 / Data Corruption



On Tuesday, July 31, 2001 01:47:25 AM +1000 Andrew Morton <[email protected]>
wrote:

> Chris Mason wrote:
>>
>> On Saturday, July 28, 2001 03:28:05 AM +1000 Andrew Morton
>> <[email protected]> wrote:
>>
>> [ patch to trigger data writes before commit in reiserfs ]
>>
>> >
>> > There's no disruption to disk format - it just simulates
>> > the user typing `sync' at the right time. I think the
>> > concept is sound, and I'm sure Chris can find a more efficient
>> > way...
>>
>> Well, it gets points for simplicity ;-)
>
> Well, I tried system("/bin/sync"); but that didn't link.
>
>> What I think we need is for commit_write to put new buffers on a per-super
>> list of new buffers, and then the journal code can flush that list on
>> commit.
>
> whee. Now there's an idea - If the fs keeps track of all its inodes
> then you can traverse those and flush out the i_dirty_buffers ring
> on each one.

It won't work as well in a generic sense, but I was planning on just using
the b_inode_buffers. Instead of going onto the inode's dirty buffer list,
they go onto this special list instead (the reiserfs journal already has a
dummy inode used for this).

The advantage is that nothing extra is needed on the buffer head, but the
disadvantage is the buffer doesn't go on the inode's list. Somebody needs
to flush the new list on fsync and such. It works for reiserfs, but not in
general.

I think you're right about being able to clear BH_New once get_block tests
it. Unless someone comes up with a reason against, I'll test it out. I'm
guessing that we are wasting time rerunning unmap_underlying_metadata on
buffers marked BH_New.

-chris
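
[ Editor's sketch, not part of the original message. The idea of clearing
BH_New as soon as the get_block caller has looked at it can be illustrated
with an atomic test-and-clear. The kernel has its own test_and_clear_bit();
the version below uses C11 atomics in userspace instead, and toy_bh,
TOY_BH_NEW and the helper names are hypothetical. ]

```c
#include <stdatomic.h>

#define TOY_BH_NEW 0x1u   /* hypothetical flag bit standing in for BH_New */

struct toy_bh {
    atomic_uint state;    /* flag word */
    int unmap_calls;      /* counts how often the expensive step ran */
};

/* Atomically clear the flag and report whether it was set beforehand. */
static int toy_test_and_clear_new(struct toy_bh *bh)
{
    unsigned int old = atomic_fetch_and(&bh->state, ~TOY_BH_NEW);

    return (old & TOY_BH_NEW) != 0;
}

/* Analogue of a get_block caller: the one-time work runs only when the
 * flag was still set, so a second pass over the same buffer skips it. */
static void toy_prepare_write(struct toy_bh *bh)
{
    if (toy_test_and_clear_new(bh))
        bh->unmap_calls++;   /* stands in for unmap_underlying_metadata() */
}
```

Because the flag is consumed on first use, repeated writes to the same
buffer no longer repeat the per-new-block work, which is the waste guessed
at above.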

2001-07-30 19:07:14

by Christoph Hellwig

[permalink] [raw]
Subject: Re: ReiserFS / 2.4.6 / Data Corruption

On Mon, Jul 30, 2001 at 02:08:17PM +0400, Hans Reiser wrote:
> Christoph Hellwig wrote:
> >
>
> > Reiserfs as implemented in the 2.4.2-based kernel of OpenLinux 3.1 is
> > everything but stable and has a lot of issues (e.g. NFS-exporting doesn't
> > work). That is the reason why it is a) marked experimental and is completely
> > unsupported (and that is written _big_ _fat_ in manuals and similar stuff)
> > and b) has debugging enabled to have the additional sanity checks that are
> > under this option and give additional hints if reiserfs fails again.
>
> The debugging won't prevent a single crash, it will only print a diagnostic that
> might help to understand why it crashed.

I don't know when you took a look at your code the last time, but when
I did some time before the COL 3.1 release, there were lots of places
in the reiserfs code where functions assumed that they have valid
arguments when compiled without debugging and did the check explicitly
when compiled with. Given the state the reiserfs code is in I really
prefer to see this option turned on.

> It makes zero sense for a distro to
> have it on, and I think we make that pretty clear in the help button. It would
> be nice if distros read the help buttons before selecting options when
> configuring their kernels.:-/

Well sometimes it's even better to take a look at the code..

Christoph

> I make no claims that users should use ReiserFS as it is in a 2.4.2 kernel....

No one said that (and even if someone did, I wouldn't believe him).

Christoph

--
Of course it doesn't work. We've performed a software upgrade.

2001-07-30 20:30:37

by Hans Reiser

[permalink] [raw]
Subject: Re: ReiserFS / 2.4.6 / Data Corruption

Christoph Hellwig wrote:
>
> On Mon, Jul 30, 2001 at 02:08:17PM +0400, Hans Reiser wrote:
> > Christoph Hellwig wrote:
> > >
> >
> > > Reiserfs as implemented in the 2.4.2-based kernel of OpenLinux 3.1 is
> > > everything but stable and has a lot of issues (e.g. NFS-exporting doesn't
> > > work). That is the reason why it is a) marked experimental and is completely
> > > unsupported (and that is written _big_ _fat_ in manuals and similar stuff)
> > > and b) has debugging enabled to have the additional sanity checks that are
> > > under this option and give additional hints if reiserfs fails again.
> >
> > The debugging won't prevent a single crash, it will only print a diagnostic that
> > might help to understand why it crashed.
>
> I don't know when you took a look at your code the last time, but when
> I did some time before the COL 3.1 release, there were lots of places
> in the reiserfs code where functions assumed that they have valid
> arguments when compiled without debugging and did the check explicitly
> when compiled with. Given the state the reiserfs code is in I really
> prefer to see this option turned on.

But there is not one where they recover from invalid arguments without a panic
(unless I failed to notice something), so it gets you nothing except a message
that we the developers will find more informative when trying to find what made
it crash. We check invalid arguments at entry to reiserfs, and for those we
error gracefully. (We also have some checks for garbage blocks having been
read, and we have made some efforts to error gracefully from those.)

Hans

2001-07-30 20:49:57

by Christoph Hellwig

[permalink] [raw]
Subject: Re: ReiserFS / 2.4.6 / Data Corruption

On Tue, Jul 31, 2001 at 12:30:12AM +0400, Hans Reiser wrote:
> But there is not one where they recover from invalid arguments without a panic
> (unless I failed to notice something),

Right.

> so it gets you nothing except a message
> that we the developers will find more informative when trying to find what made
> it crash.

Nope. It does a reiserfs_panic instead of letting the wrong arguments
slip into lower layers and possibly onto disk, thus corrupting data.

And in my opinion correct data is worth much more than one crash more or
less (especially with a journaling filesystem).

Christoph

--
Whip me. Beat me. Make me maintain AIX.

2001-07-30 21:05:47

by Hans Reiser

[permalink] [raw]
Subject: Re: ReiserFS / 2.4.6 / Data Corruption

Christoph Hellwig wrote:
>
> On Tue, Jul 31, 2001 at 12:30:12AM +0400, Hans Reiser wrote:
> > But there is not one where they recover from invalid arguments without a panic
> > (unless I failed to notice something),
>
> Right.
>
> > so it gets you nothing except a message
> > that we the developers will find more informative when trying to find what made
> > it crash.
>
> Nope. It does a reiserfs_panic instead of letting the wrong arguments
> slip into lower layers and possibly onto disk, thus corrupting data.
>
> And in my opinion correct data is worth much more than one crash more or
> less (especially with a journaling filesystem).
>
> Christoph
>
> --
> Whip me. Beat me. Make me maintain AIX.


There is nothing like a distro maintainer overriding the design decisions made
by the lead architect of a package, not believing that said architect knows what
the fuck he is doing.

We will make this unusable by you from this point onwards. Vitaly, I told you
what to do weeks ago in this regard, do it today.

Does it get worse than shovelware? I suppose it does....

Hans

2001-07-30 21:13:37

by Hans Reiser

[permalink] [raw]
Subject: Re: ReiserFS / 2.4.6 / Data Corruption

The debugging tests in reiserfs were deliberately encouraged to be excessive and
performance unconcerned. That is part of how we get programmers to write
excessively paranoid bug finding code, we tell them not to worry about the
effect on performance, it will only be used when looking for bugs.

People like you destroy my ability to get lots of tests put into the code by the
coders.

Hans

Christoph Hellwig wrote:
>
> On Tue, Jul 31, 2001 at 12:30:12AM +0400, Hans Reiser wrote:
> > But there is not one where they recover from invalid arguments without a panic
> > (unless I failed to notice something),
>
> Right.
>
> > so it gets you nothing except a message
> > that we the developers will find more informative when trying to find what made
> > it crash.
>
> Nope. It does a reiserfs_panic instead of letting the wrong arguments
> slip into lower layers and possibly onto disk, thus corrupting data.
>
> And in my opinion correct data is worth much more than one crash more or
> less (especially with a journaling filesystem).
>
> Christoph
>
> --
> Whip me. Beat me. Make me maintain AIX.
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/

2001-07-30 21:21:19

by Hans Reiser

[permalink] [raw]
Subject: Re: ReiserFS / 2.4.6 / Data Corruption

Christoph Hellwig wrote:
>
> On Tue, Jul 31, 2001 at 12:30:12AM +0400, Hans Reiser wrote:
> > But there is not one where they recover from invalid arguments without a panic
> > (unless I failed to notice something),
>
> Right.
>
> > so it gets you nothing except a message
> > that we the developers will find more informative when trying to find what made
> > it crash.
>
> Nope. It does a reiserfs_panic instead of letting the wrong arguments
> slip into lower layers and possibly onto disk, thus corrupting data.
>
> And in my opinion correct data is worth much more than one crash more or
> less (especially with a journaling filesystem).

The cost is not a crash, the cost is performance sucks.

>
> Christoph
>
> --
> Whip me. Beat me. Make me maintain AIX.


Are you going to leave it on for future versions of ReiserFS, or just for Linux
2.4.2?

Hans

2001-07-30 21:30:17

by Christoph Hellwig

[permalink] [raw]
Subject: Re: ReiserFS / 2.4.6 / Data Corruption

On Tue, Jul 31, 2001 at 01:05:11AM +0400, Hans Reiser wrote:
> There is nothing like a distro maintainer

[NOTE: I do not maintain the Caldera kernel RPM, but I was
involved in the decision to turn reiserfs debugging on]

> overriding the design decisions made
> by the lead architect of a package, not believing that said architect knows what
> the fuck he is doing.

Reiserfs lately had a lot of stability issues, reports of data corruption
and as you said before you don't consider the reiserfs version in 2.4.2-ac
stable yourself.

The average user who loses data through a problem in reiserfs will not
blame you but the distribution, even if this filesystem is clearly
marked unsupported.

>
> We will make this unusable by you from this point onwards.
>

I do not see the debug kernel removed from the official kernel tree
before reiserfs has proven stable.

Of course there is still the option of CONFIG_REISERFS_FS=n if you
intentionally want to make your filesystem less acceptable..

Christoph

--
Of course it doesn't work. We've performed a software upgrade.

2001-07-30 21:44:37

by Hans Reiser

[permalink] [raw]
Subject: Re: ReiserFS / 2.4.6 / Data Corruption

Christoph Hellwig wrote:
>
> On Tue, Jul 31, 2001 at 01:05:11AM +0400, Hans Reiser wrote:
> > There is nothing like a distro maintainer
>
> [NOTE: I do not maintain the Caldera kernel RPM, but I was
> involved in the decision to turn reiserfs debugging on]
>
> > overriding the design decisions made
> > by the lead architect of a package, not believing that said architect knows what
> > the fuck he is doing.
>
> Reiserfs lately had a lot of stability issues, reports of data corruption
> and as you said before you don't consider the reiserfs version in 2.4.2-ac
> stable yourself.
>
> The average user who loses data through a problem in reiserfs will not
> blame you but the distribution, even if this filesystem is clearly
> marked unsupported.
>
> >
> > We will make this unusable by you from this point onwards.
> >
>
> I do not see the debug kernel removed from the official kernel tree
> before reiserfs has proven stable.
>
> Of course there is still the option of CONFIG_REISERFS_FS=n if you
> intentionally want to make your filesystem less acceptable..
>
> Christoph
>
> --
> Of course it doesn't work. We've performed a software upgrade.

We'd rather be off, than have debug on. Debug is not designed to be on, unless
you are debugging. Most users don't know that you have turned debug on. They
just think ReiserFS isn't as fast as their SuSE using friends say it is. We
will make sure users know that their distro has turned debug on.

Hans

2001-07-30 21:48:17

by Hans Reiser

[permalink] [raw]
Subject: Re: ReiserFS / 2.4.6 / Data Corruption

Christoph Hellwig wrote:
>
> On Tue, Jul 31, 2001 at 01:05:11AM +0400, Hans Reiser wrote:
> > There is nothing like a distro maintainer
>
> [NOTE: I do not maintain the Caldera kernel RPM, but I was
> involved in the decision to turn reiserfs debugging on]
>
> > overriding the design decisions made
> > by the lead architect of a package, not believing that said architect knows what
> > the fuck he is doing.
>
> Reiserfs lately had a lot of stability issues, reports of data corruption
> and as you said before you don't consider the reiserfs version in 2.4.2-ac
> stable yourself.

I also don't consider any 2.4 prior to 2.4.4 to be stable, and I don't consider
2.4.4 to be especially stable but it is usable.

Shipping 2.4.2 is something you and RedHat did for understandable marketing
reasons. SuSE waited for 2.4.4.

Hans

2001-07-30 21:49:57

by Christoph Hellwig

[permalink] [raw]
Subject: Re: ReiserFS / 2.4.6 / Data Corruption

On Tue, Jul 31, 2001 at 01:21:09AM +0400, Hans Reiser wrote:
>
> The cost is not a crash, the cost is performance sucks.
>

I don't give a damn about the performance if my filesystem doesn't prove stable.
And I think you can't deny that all reiserfs versions for 2.4 had issues
in that area. _IF_ reiserfs proves stable in the near future I don't see
any reason why these checks should stay in.

For example I've just turned off the debugging on my ext3-using boxens.
Not only has ext3 proven stable, but I also know that if it fails there
is still e2fsck, which has proven absolutely reliable in the last years.

Another example is the write support I am currently adding to my freevxfs driver
(and no, I'm neither working for RedHat nor is it the VxFS module that
played a central role in your 3/2000 conspiracy theories): until it has
proven stable for a long time I will not even add an option to turn off
all the consistency checks I've added. I won't give a damn if ext2, reiserfs
or VERITAS beats me until it is stable.

>
> Are you going to leave it on for future versions of ReiserFS, or just for Linux
> 2.4.2?

I'm not in a position to decide it. But if I'm asked for my opinion (again)
the answer will depend on whether reiserfs is more stable than now
at that point.

Christoph

--
Whip me. Beat me. Make me maintain AIX.

2001-07-30 21:57:07

by Chris Wedgwood

[permalink] [raw]
Subject: Re: ReiserFS / 2.4.6 / Data Corruption

On Tue, Jul 31, 2001 at 01:48:03AM +0400, Hans Reiser wrote:

    I also don't consider any 2.4 prior to 2.4.4 to be stable, and I
    don't consider 2.4.4 to be especially stable but it is usable.

    Shipping 2.4.2 is something you and RedHat did for understandable
    marketing reasons. SuSE waited for 2.4.4.

This is a myth.

As has been explained --- RedHat does _NOT_ ship a stock
kernel. Redhat 2.4.2 is not the same as stock/linux 2.4.2 so you are
comparing apples with oranges.



--cw


2001-07-30 21:59:37

by Rik van Riel

[permalink] [raw]
Subject: Re: ReiserFS / 2.4.6 / Data Corruption

On Tue, 31 Jul 2001, Hans Reiser wrote:
> Christoph Hellwig wrote:
> >
> > Nope. It does a reiserfs_panic instead of letting the wrong arguments
> > slip into lower layers and possibly onto disk, thus corrupting data.
> >
> > And in my opinion correct data is worth much more than one crash more or
> > less (especially with a journaling filesystem).
>
> There is nothing like a distro maintainer overriding the design
> decisions made by the lead architect of a package, not believing
> that said architect knows what the fuck he is doing.

Are you actually saying you don't care about user's data,
or is it just my imagination ?

(I hope it's my imagination ...)

cheers,

Rik
--
Executive summary of a recent Microsoft press release:
"we are concerned about the GNU General Public License (GPL)"


http://www.surriel.com/
http://www.conectiva.com/ http://distro.conectiva.com/

2001-07-30 21:59:37

by Christoph Hellwig

[permalink] [raw]
Subject: Re: ReiserFS / 2.4.6 / Data Corruption

On Tue, Jul 31, 2001 at 09:57:21AM +1200, Chris Wedgwood wrote:
> On Tue, Jul 31, 2001 at 01:48:03AM +0400, Hans Reiser wrote:
>
> I also don't consider any 2.4 prior to 2.4.4 to be stable, and I
> don't consider 2.4.4 to be especially stable but it is usable.
>
> Shipping 2.4.2 is something you and RedHat did for understandable
> marketing reasons. SuSE waited for 2.4.4.
>
> This is a myth.
>
> As has been explained --- RedHat does _NOT_ ship a stock
> kernel. Redhat 2.4.2 is not the same as stock/linux 2.4.2 so you are
> comparing apples with oranges.

The same is true for Caldera, btw.

Christoph

--
Of course it doesn't work. We've performed a software upgrade.

2001-07-30 22:05:07

by Rik van Riel

[permalink] [raw]
Subject: Re: ReiserFS / 2.4.6 / Data Corruption

On Tue, 31 Jul 2001, Hans Reiser wrote:
> Christoph Hellwig wrote:

> > Nope. It does a reiserfs_panic instead of letting the wrong arguments
> > slip into lower layers and possibly onto disk, thus corrupting data.
> >
> > And in my opinion correct data is worth much more than one crash more or
> > less (especially with a journaling filesystem).
>
> The cost is not a crash, the cost is performance sucks.

If you can choose between sucky performance or a chance
at silent data corruption ... which would you choose ?

Rik
--
Executive summary of a recent Microsoft press release:
"we are concerned about the GNU General Public License (GPL)"


http://www.surriel.com/
http://www.conectiva.com/ http://distro.conectiva.com/

2001-07-30 22:34:47

by Hans Reiser

[permalink] [raw]
Subject: Re: ReiserFS / 2.4.6 / Data Corruption

Rik van Riel wrote:
>
> On Tue, 31 Jul 2001, Hans Reiser wrote:
> > Christoph Hellwig wrote:
> > >
> > > Nope. It does a reiserfs_panic instead of letting the wrong arguments
> > > slip into lower layers and possibly onto disk, thus corrupting data.
> > >
> > > And in my opinion correct data is worth much more than one crash more or
> > > less (especially with a journaling filesystem).
> >
> > There is nothing like a distro maintainer overriding the design
> > decisions made by the lead architect of a package, not believing
> > that said architect knows what the fuck he is doing.
>
> Are you actually saying you don't care about user's data,
> or is it just my imagination ?
>
> (I hope it's my imagination ...)
>
> cheers,
>
> Rik
> --
> Executive summary of a recent Microsoft press release:
> "we are concerned about the GNU General Public License (GPL)"
>
> http://www.surriel.com/
> http://www.conectiva.com/ http://distro.conectiva.com/

I am saying that you can put so many internal checks into a filesystem that it is
unusable for any real usage. Guess what? ReiserFS does that! But we surround
the checks with a #define. The only limit we have on the checks is that after
the relevant bug disappears we cut out the ones that make things so slow that it
noticeably inconveniences our debugging. A check has to slow things down quite a
lot before we can't stand to wait for it while debugging, but there are some kinds
of checks that are that slow.

ReiserFS checks more things than the rest of the kernel does. We can do this
because we use the #define, and pay no price for it. You should do this also in
your code....

Every major kernel component should have a #define which if on checks every
imaginable thing the developer can think of to check regardless of how slow it
makes the code go to check it. Then, when users (or at least as usefully,
developers adding a new feature) have bugs in that component, they can turn it
on.

Hans

2001-07-30 22:36:58

by Hans Reiser

[permalink] [raw]
Subject: Re: ReiserFS / 2.4.6 / Data Corruption

Rik van Riel wrote:
>
> On Tue, 31 Jul 2001, Hans Reiser wrote:
> > Christoph Hellwig wrote:
>
> > > Nope. It does a reiserfs_panic instead of letting the wrong arguments
> > > slip into lower layers and possibly onto disk, thus corrupting data.
> > >
> > > And in my opinion correct data is worth much more than one crash more or
> > > less (especially with a journaling filesystem).
> >
> > The cost is not a crash, the cost is performance sucks.
>
> If you can choose between sucky performance or a chance
> at silent data corruption ... which would you choose ?
>
> Rik
> --
> Executive summary of a recent Microsoft press release:
> "we are concerned about the GNU General Public License (GPL)"
>
> http://www.surriel.com/
> http://www.conectiva.com/ http://distro.conectiva.com/


If you could halve linux memory manager performance and check as many things as
reiserfs checks, would you do it? I think not, or else you would have. You
made the right choice. Now, if you add a #define, you can check as many things
as ReiserFS checks, and still go just as fast....

Hans

2001-07-30 22:41:57

by Kip Macy

[permalink] [raw]
Subject: Re: ReiserFS / 2.4.6 / Data Corruption

How does compiling in debug infrastructure protect the user's data? By
making the file system so slow that he won't use it? :-)

-Kip

> Are you actually saying you don't care about user's data,
> or is it just my imagination ?
>
> (I hope it's my imagination ...)
>
> cheers,
>
> Rik
> --
> Executive summary of a recent Microsoft press release:
> "we are concerned about the GNU General Public License (GPL)"
>
>
> http://www.surriel.com/
> http://www.conectiva.com/ http://distro.conectiva.com/

2001-07-30 22:51:47

by Christoph Hellwig

[permalink] [raw]
Subject: Re: ReiserFS / 2.4.6 / Data Corruption

On Mon, Jul 30, 2001 at 03:41:16PM -0700, Kip Macy wrote:
> How does compiling in debug infrastructure protect the user's data? By
> making the file system so slow that he won't use it? :-)

The <<reiserfs debugging code>> isn't debugging code in a strict sense.
It mostly consists of sequences of the form:

    #ifdef CONFIG_REISERFS_CHECK
        if (condition_that_should_not_happen)
            reiserfs_panic (sb, "some_obscure_error_code");
    #endif

(Sometimes there is also code that the documentation describes as deadlock
avoidance; why it is not enabled without _CHECK defined is left as an
exercise for the reader.)

This way the system stops with an indication of the failing component
instead of silently corrupting disk contents. As reiserfs maintains
a log the recovery from that panic shouldn't take that long either.

(On the other hand I've seen some reiserfs systems that destroyed their
disk contents while trying to recover. That's a reason why I still
can't recommend using reiserfs for anything but /tmp, test machines
or proxy caches).

Christoph

--
Of course it doesn't work. We've performed a software upgrade.
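
[ Editor's sketch, not part of the original message. The compile-time
guarded check pattern discussed in this thread can be reproduced in a few
lines of standalone C. TOY_FS_CHECK, TOY_CHECK, toy_panic and
toy_free_block are hypothetical names modelled on the snippet above; this
is not reiserfs source. Built with -DTOY_FS_CHECK the checks abort on a
violated invariant; built without it they compile to nothing. ]

```c
#include <stdio.h>
#include <stdlib.h>

#ifdef TOY_FS_CHECK
/* Checks compiled in: stop immediately instead of carrying on. */
static void toy_panic(const char *msg)
{
    fprintf(stderr, "toy_panic: %s\n", msg);
    abort();
}
#define TOY_CHECK(cond, msg) \
    do { if (!(cond)) toy_panic(msg); } while (0)
#else
/* Checks compiled out: the condition is not even evaluated. */
#define TOY_CHECK(cond, msg) do { } while (0)
#endif

/* Clear one bit in a block bitmap. With checks enabled, a bad block
 * number or a double free stops the program; with checks disabled,
 * the same call would quietly modify whatever it was pointed at. */
static void toy_free_block(unsigned char *bitmap, unsigned int nbits,
                           unsigned int blk)
{
    TOY_CHECK(blk < nbits, "block number out of range");
    TOY_CHECK(bitmap[blk / 8] & (1u << (blk % 8)), "bit already cleared");

    bitmap[blk / 8] &= (unsigned char)~(1u << (blk % 8));
}
```

The two build modes are exactly what is being argued over here: the
checked build trades CPU time for an early stop, while the unchecked build
trusts its callers.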

2001-07-30 22:54:59

by Rik van Riel

[permalink] [raw]
Subject: Re: ReiserFS / 2.4.6 / Data Corruption

On Tue, 31 Jul 2001, Hans Reiser wrote:
> Rik van Riel wrote:
> > On Tue, 31 Jul 2001, Hans Reiser wrote:

> > > The cost is not a crash, the cost is performance sucks.
> >
> > If you can choose between sucky performance or a chance
> > at silent data corruption ... which would you choose ?
>
> If you could halve linux memory manager performance and check as
> many things as reiserfs checks, would you do it?

I haven't removed a single debugging check from the
2.4 VM. Performance is MUCH more reliant on things
like evicting the right page from RAM or reading in
the right page at the right time.

CPU usage is only secondary.

> .. You made the right choice.

Thanks ;) [yeah, yeah ... flame me about out-of-context]


> Now, if you add a #define, you can check as many things as
> ReiserFS checks, and still go just as fast....

I'm sure these checks make reiserfs a tad more CPU hungry,
but isn't the real win in reiserfs supposed to come from
superior disk layout, readahead across files, etc... ?

Or is that all just a myth ?

regards,

Rik
--
Executive summary of a recent Microsoft press release:
"we are concerned about the GNU General Public License (GPL)"


http://www.surriel.com/
http://www.conectiva.com/ http://distro.conectiva.com/

2001-07-30 23:12:47

by Hans Reiser

[permalink] [raw]
Subject: Re: ReiserFS / 2.4.6 / Data Corruption

Rik van Riel wrote:

> > If you could halve linux memory manager performance and check as
> > many things as reiserfs checks, would you do it?
>
> I haven't removed a single debugging check from the
> 2.4 VM. Performance is MUCH more reliant on things
> like evicting the right page from RAM or reading in
> the right page at the right time.
>
> CPU usage is only secondary.
>
> > .. You made the right choice.
>
> Thanks ;) [yeah, yeah ... flame me about out-of-context]
>
> > Now, if you add a #define, you can check as many things as
> > ReiserFS checks, and still go just as fast....
>
> I'm sure these checks make reiserfs a tad more CPU hungry,
> but isn't the real win in reiserfs supposed to come from
> superior disk layout, readahead across files, etc... ?
>
> Or is that all just a myth ?
>
> regards,
>
> Rik
> --
> Executive summary of a recent Microsoft press release:
> "we are concerned about the GNU General Public License (GPL)"
>
> http://www.surriel.com/
> http://www.conectiva.com/ http://distro.conectiva.com/


A tree is a complex structure. You can check it, and the temporary structures
involved in balancing it, quite a lot of ways while balancing it. I believe you
that the checks you need for your code have no significant performance impact.
Ours sometimes do. Consistency checks can be quite a bit more than a tad
consumptive of CPU. Like I said, there were a few checks we removed after the
bug was gone because we got tired of waiting for debugging iterations that took
so long because of them.

Using the #define means we don't have to think about the effect on performance
of a check, we just leave it in. Some checks belong outside the #define
(checking to see if garbage came back from disk is left outside the define
nowadays.) Distros should trust the developers in these tradeoff decisions.
Otherwise, things just get stupid.

Hans

2001-07-31 02:28:49

by Andrew Morton

[permalink] [raw]
Subject: Re: ReiserFS / 2.4.6 / Data Corruption

Christoph Hellwig wrote:
>
> For example I've just turned off the debugging on my ext3-using boxens.

FYI... CONFIG_JBD_DEBUG is really just that - debug stuff. Mainly,
it enables the printks which are controlled by /proc/sys/fs/jbd-debug.

Early on, sct made the decision that the assertion checks in ext3:

akpm-1:/usr/src/ext3> grep -r ASSERT . | wc -l
187

cannot be disabled. Each and every one of these will nicely
crash the machine. The idea being, as you stated earlier,
that data integrity is golden - if we detect an inconsistency
we take the machine out and let recovery fix it up.

Turns out that at present we're over-aggressive on this. A modest
filesystem inconsistency (bit already free in bitmap, whatever)
or an IO error could force a panic. Stephen is working on changing
the fs to be more selective in its handling of errors - less severe
errors will turn the fs readonly.

I would support your decision to enable reiserfs checking. It's
a valuable feature. It can save your data from hardware failures
as well as software failures. Perhaps Hans' team should look into
moving the expensive checks into a different ifdef.
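
[ Editor's sketch, not part of the original message. The graded policy
described above (severe inconsistencies take the machine down, less severe
errors turn the fs readonly) can be modelled as a small state machine. All
names (toy_fs, toy_handle_error, toy_write and the enums) are
hypothetical, and TOY_FS_HALTED merely stands in for a panic; this is not
ext3 code. ]

```c
enum toy_severity { TOY_ERR_MINOR, TOY_ERR_FATAL };
enum toy_fs_state { TOY_FS_RW, TOY_FS_READONLY, TOY_FS_HALTED };

struct toy_fs {
    enum toy_fs_state state;
};

/* Escalate according to severity; a halted filesystem never recovers
 * by itself and a later minor error cannot downgrade it. */
static void toy_handle_error(struct toy_fs *fs, enum toy_severity sev)
{
    if (fs->state == TOY_FS_HALTED)
        return;
    if (sev == TOY_ERR_FATAL)
        fs->state = TOY_FS_HALTED;     /* stands in for a panic */
    else
        fs->state = TOY_FS_READONLY;   /* keep data readable, stop writes */
}

/* Writes succeed (0) only while the filesystem is still read-write. */
static int toy_write(const struct toy_fs *fs)
{
    return fs->state == TOY_FS_RW ? 0 : -1;
}
```

The read-only state is the middle ground: it stops further damage without
taking the whole machine out.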


Subject: Re: ReiserFS / 2.4.6 / Data Corruption

Hans Reiser <[email protected]> writes:

>I also don't consider any 2.4 prior to 2.4.4 to be stable, and I don't consider
>2.4.4 to be especially stable but it is usable.

>Shipping 2.4.2 is something you and RedHat did for understandable marketing
>reasons. SuSE waited for 2.4.4.

Well, SuSE shipped 2.4.2 on their 7.1 release and I didn't see you
jumping up and down in anger for "shipping an unstable release":

ftp://ftp.suse.com/pub/suse/i386/7.1/full-names/i386/k_i386_24-2.4.2-12.i386.rpm

Ah, but then again, you got money from them... He who pays the piper,
calls the tune. And you're a fine piper.

Sorry, but I can't take you seriously. Especially as you're _so_
_obviously_ vendor biased, that it stinks.

Regards
Henning

--
Dipl.-Inf. (Univ.) Henning P. Schmiedehausen -- Geschaeftsfuehrer
INTERMETA - Gesellschaft fuer Mehrwertdienste mbH [email protected]

Am Schwabachgrund 22 Fon.: 09131 / 50654-0 [email protected]
D-91054 Buckenhof Fax.: 09131 / 50654-20

2001-07-31 09:56:16

by Hans Reiser

[permalink] [raw]
Subject: Re: ReiserFS / 2.4.6 / Data Corruption

"Henning P. Schmiedehausen" wrote:
>
> Hans Reiser <[email protected]> writes:
>
> >I also don't consider any 2.4 prior to 2.4.4 to be stable, and I don't consider
> >2.4.4 to be especially stable but it is usable.
>
> >Shipping 2.4.2 is something you and RedHat did for understandable marketing
> >reasons. SuSE waited for 2.4.4.
>
> Well, SuSE shipped 2.4.2 on their 7.1 release and I didn't see you
> jumping up and down in anger for "shipping an unstable release":
>
> ftp://ftp.suse.com/pub/suse/i386/7.1/full-names/i386/k_i386_24-2.4.2-12.i386.rpm
>
> Ah, but then again, you got money from them... He who pays the piper,
> calls the tune. And you're a fine piper.
>
> Sorry, but I can't take you seriously. Especially as you're _so_
> _obviously_ vendor biased, that it stinks.
>
> Regards
> Henning
>
> --
> Dipl.-Inf. (Univ.) Henning P. Schmiedehausen -- Geschaeftsfuehrer
> INTERMETA - Gesellschaft fuer Mehrwertdienste mbH [email protected]
>
> Am Schwabachgrund 22 Fon.: 09131 / 50654-0 [email protected]
> D-91054 Buckenhof Fax.: 09131 / 50654-20

I stand corrected, this means that all of the distros shipped 2.4 before they
should have. (I use SuSE 7.2. I thought that 7.1 had a 2.2 kernel as the
default, but I guess I was wrong.)

SuSE has its flaws also. I have complained to them about the yast license, for
instance. (I think the best single thing they could do for SuSE sales is change
that license.) All the distros I know of except debian like to put kernel
patches into their distros first. You would think they would want them in the
kernel first so that they could know they are stable, but that would give them
no "advantage". Sigh. I suppose there are much worse things they could do.

Hans

PS

I don't get money from SuSE anymore, I get it from DARPA. I do run SuSE on my
computer though, which is probably enough to bias me.

2001-07-31 10:24:43

by Arjan van de Ven

Subject: Re: ReiserFS / 2.4.6 / Data Corruption

Hans Reiser wrote:
> All the distros I know of except debian like to put kernel
> patches into their distros first. You would think they would want them in the
> kernel first so that they could know they are stable, but that would give them
> no "advantage". Sigh. I suppose there are much worse things they could do.

In the future, please check your facts more thoroughly. Almost all of the
patches in the Red Hat 2.4.2-2 kernel were bugfixes from later upstream kernel
releases. INCLUDING reiserfs corruption fixes. Caldera, Suse, Conectiva and
Mandrake all do the same. Ok so we all differ slightly in which bugfixes each
distro picks, and which base version we start with. That's a matter of taste.
And fwiw, the 2.4.2-2 Red Hat shipped was closer to 2.4.3-acX than the actual
2.4.2, due to the dozens and dozens of bugfixes applied from these newer
kernels. (and yes we do test our kernels. hard. That's why we can't recommend
reiserfs on anything non Little Endian or 64 bit right now)

Please take your false conspiracy theories to some place where they are
more appropriate.

Greetings,
Arjan van de Ven
Red Hat Linux kernel maintainer

2001-07-31 10:25:06

by Anders Eriksson

Subject: Re: ReiserFS / 2.4.6 / Data Corruption


Side note. I vaguely recall a distribution whose name has escaped me since.
Their selling point was "It's harder to install" and they claimed not to patch
any source. "If it's good enough for the author, it's good enough for us".
Might be worth checking out. If someone has a distro name for it, please...

/A

2001-07-31 10:32:25

by Chris Wedgwood

Subject: Re: ReiserFS / 2.4.6 / Data Corruption

On Tue, Jul 31, 2001 at 12:24:46PM +0200, Anders Eriksson wrote:

Side note. I vaguely recall a distribution whose name has escaped
me since.

OpenBSD?

:)


--cw

2001-07-31 10:31:44

by Chris Wedgwood

Subject: Re: ReiserFS / 2.4.6 / Data Corruption

On Tue, Jul 31, 2001 at 02:36:39AM +0400, Hans Reiser wrote:

If you could halve linux memory manager performance and check as
many things as reiserfs checks, would you do it. I think not, or
else you would have. You made the right choice. Now, if you add
a #define, you can check as many things as ReiserFS checks, and
still go just as fast....

The memory manager is stressed much more often than reiserfs; EVERYBODY
has it.

The MM system does have various sanity checks, things might be
slightly faster without them, but having the sanity checks is still
very important.

If the memory manager does something bad, chances are your system will
go boom --- upon reboot all is happy. If an fs goes bad, that
corruption might still be there when you reboot, even if to another
kernel! This is a major difference.

Anyhow, I use reiserfs with debugging/checking on in lots of places.
The speed difference is negligible, so I think this whole thread is
pointless.

FWIW, if the mainline kernels remove the debugging option, I will hack
it back in --- I for one am happy with the performance and am pleased
there is additional sanity checking.






--cw

2001-07-31 11:00:09

by Hans Reiser

Subject: Re: ReiserFS / 2.4.6 / Data Corruption

Chris Wedgwood wrote:
>
> On Tue, Jul 31, 2001 at 02:36:39AM +0400, Hans Reiser wrote:
>
> If you could halve linux memory manager performance and check as
> many things as reiserfs checks, would you do it. I think not, or
> else you would have. You made the right choice. Now, if you add
> a #define, you can check as many things as ReiserFS checks, and
> still go just as fast....
>
> The memory manager is stressed much more often than reiserfs; EVERYBODY
> has it.
>
> The MM system does have various sanity checks, things might be
> slightly faster without them, but having the sanity checks is still
> very important.
>
> If the memory manager does something bad, chances are your system will
> go boom --- upon reboot all is happy. If an fs goes bad, that
> corruption might still be there when you reboot, even if to another
> kernel! This is a major difference.
>
> Anyhow, I use reiserfs with debugging/checking on in lots of places.
> The speed difference is negligible, so I think this whole thread is
> pointless.
>
> FWIW, if the mainline kernels remove the debugging option, I will hack
> it back in --- I for one am happy with the performance and am pleased
> there is additional sanity checking.
>
> --cw
>


Last I ran benchmarks the performance cost was 30-40%, but this was some time
ago. I think that the coders have been quietly culling some checks out of the
FS, and so it does not cost as much anymore. I would prefer that the "excessive"
checks had stayed in.

Sigh, I see I cannot persuade in this argument. It seems Linus is right, and
debugging checks don't belong in debugged code even if they would make it easier
for persons hacking on the code to debug their latest hacks.

Hans

2001-07-31 11:35:57

by David Weinehall

Subject: Re: ReiserFS / 2.4.6 / Data Corruption

On Tue, Jul 31, 2001 at 02:34:38AM +0400, Hans Reiser wrote:

[snipping earlier discussion]

> I am saying that you can put so many internal checks into a filesystem
> that it is unusable for any real usage. Guess what? ReiserFS does
> that! But we surround the checks with a #define. The only limit we
> have on the checks, is that after the relevant bug disappears we cut
> out the ones that make things so slow that it noticeably
> inconveniences our debugging. It has to slow things down quite a lot
> that we can't stand to wait for it while debugging, but there are some
> kinds of checks that you can do that are that slow.
>
> ReiserFS checks more things than the rest of the kernel does. We can
> do this because we use the #define, and pay no price for it. You
> should do this also in your code....
>
> Every major kernel component should have a #define which if on checks
> every imaginable thing the developer can think of to check regardless
> of how slow it makes the code go to check it. Then, when users (or at
> least as usefully, developers adding a new feature) have bugs in that
> component, they can turn it on.

Ugh! I think you need to have a little chat with Linus about this
opinion of yours on how to use #ifdef / #endif in code... I'm not all
that sure he'll agree with you.


/David
_ _
// David Weinehall <[email protected]> /> Northern lights wander \\
// Project MCA Linux hacker // Dance across the winter sky //
\> http://www.acc.umu.se/~tao/ </ Full colour fire </

2001-07-31 11:41:58

by Chris Wedgwood

Subject: Re: ReiserFS / 2.4.6 / Data Corruption

On Tue, Jul 31, 2001 at 02:59:46PM +0400, Hans Reiser wrote:

Sigh, I see I cannot persuade in this argument. It seems Linus is
right, and debugging checks don't belong in debugged code even if
they would make it easier for persons hacking on the code to debug
their latest hacks.

In six months' time, or whenever people feel more confident about
reiserfs stability (there are still many bugs to be found), these
checks can be relaxed.

Right now, reiserfs is still relatively new --- and it's much more
complex than ext2, so having additional sanity checks is a good idea.




--cw

2001-07-31 12:22:49

by Hans Reiser

Subject: Re: ReiserFS / 2.4.6 / Data Corruption (patch to cause redhat to unmount reiserfs on halt included)

David Weinehall wrote:

> > Every major kernel component should have a #define which if on checks
> > every imaginable thing the developer can think of to check regardless
> > of how slow it makes the code go to check it. Then, when users (or at
> > least as usefully, developers adding a new feature) have bugs in that
> > component, they can turn it on.
>
> Ugh! I think you need to have a little chat with Linus about this
> opinion of yours on how to use #ifdef / #endif in code... I'm not all
> that sure he'll agree with you.

I didn't say he would agree with me, in fact I am sure he doesn't like
assertions in the code. I merely said it should be done.:-) As a final little
quibble, let me mention that nikita has created macros that neatly hide the
#ifdefs, and sent them out for testing.
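
For readers who have not seen the technique, here is a minimal userspace sketch of a macro that hides the #ifdef from the calling code. It is not Nikita's patch, which is not shown in this thread: DEMO_DEBUG_CHECKS, DEMO_CHECK_FALSE and free_blocks_after are hypothetical names, and abort() stands in for a kernel panic.

```c
#include <stdio.h>
#include <stdlib.h>

#ifdef DEMO_DEBUG_CHECKS
/* Checks enabled: evaluate the condition and stop if it holds. */
#define DEMO_CHECK_FALSE(cond, msg)                          \
    do {                                                     \
        if (cond) {                                          \
            fprintf(stderr, "check failed: %s\n", (msg));    \
            abort();                                         \
        }                                                    \
    } while (0)
#else
/* Checks disabled: expands to nothing, so cond is never evaluated. */
#define DEMO_CHECK_FALSE(cond, msg) do { } while (0)
#endif

/* The caller contains no preprocessor conditionals at all. */
int free_blocks_after(int free_blocks, int freed)
{
    DEMO_CHECK_FALSE(freed < 0, "negative block count");
    return free_blocks + freed;
}
```

Built without -DDEMO_DEBUG_CHECKS the check costs nothing at run time; built with it, the same source line becomes a hard stop.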

We will consider pulling all but the essential assertions out of ReiserFS.
Sigh. This is the difference between engineering, and marketing. As an
engineer, I said overengineer the checks so that our testing process will catch
more things, and then #define them out so that there is no performance cost.
Perfectly logical. Then along come the distros, and they turn on debugging,
they don't tell the users that debugging is on, and users think we are slower
than other filesystems when we are just configured exactly as we tell the users
not to configure us, sigh.

I'll try simply ensuring that users are warned that debugging is on first. Of
course, with the way syslog is usually misconfigured on most distros we'll have
to be careful to ensure that they ever see the messages.... Should I ask
whether, with ReiserFS debugging on, and the default syslog.conf, the assertions
being checked for on these particular distros ever reach the users? Better I
not ask....?

If Chris wants to run ReiserFS with the checks on, fine, he is a user, and he at
least knows he is doing it, but when a distro does it without warning users the
FS is crippled it is really foul.

Well, if any of you users out there are interested in knowing practical details
of how to overcome the shovelware, even more important than recompiling your
kernel, these patches will help. Note the cute patch that causes reiserfs to
get unmounted rather than unpowered by these folks so concerned about corruption
of data.:-O I am merely passing these patches onwards, I have not verified that
they are correct (because I lack a redhat machine to test on). If RedHat could
confirm that the patch is correct it would be nice, and mindboggling as well.

Vitaly, make sure these are on our website.

From Dustin Byford:

--- rc.sysinit.orig Mon Jul 30 22:58:45 2001
+++ rc.sysinit Mon Jul 30 22:57:16 2001
@@ -211,7 +211,8 @@

_RUN_QUOTACHECK=0
ROOTFSTYPE=`grep " / " /proc/mounts | awk '{ print $3 }'`
-if [ -z "$fastboot" -a "$ROOTFSTYPE" != "nfs" ]; then
+if [ -z "$fastboot" -a "$ROOTFSTYPE" != "nfs" \
+ -a "$ROOTFSTYPE" != "reiserfs" ]; then

STRING=$"Checking root filesystem"
echo $STRING

From David Rees:

--- halt.orig Mon Jul 30 17:26:24 2001
+++ halt Mon Jul 30 17:26:36 2001
@@ -165,7 +165,7 @@

# Remount read only anything that's left mounted.
#echo $"Remounting remaining filesystems (if any) readonly"
-mount | awk '/ext2/ { print $3 }' | while read line; do
+mount | awk '/ext2|reiserfs/ { print $3 }' | while read line; do
mount -n -o ro,remount $line
done

2001-07-31 12:38:31

by Christoph Hellwig

Subject: Re: ReiserFS / 2.4.6 / Data Corruption (patch to cause redhat to unmount reiserfs on halt included)

On Tue, Jul 31, 2001 at 04:22:29PM +0400, Hans Reiser wrote:
> I'll try simply ensuring that users are warned that debugging is on first.

Shouldn't the user be warned when mounting a reiserfs filesystem without
checking instead?

> Of
> course, with the way syslog is usually misconfigured on most distros we'll have
> to be careful to ensure that they ever see the messages.... Should I ask
> whether, with ReiserFS debugging on, and the default syslog.conf, the assertions
> being checked for on these particular distros ever reach the users? Better I
> not ask....?

I think you got quite a few facts wrong:

o when a kernel with non-modular reiserfs is booted, reiserfs is
loaded before syslogd even starts
o whether it hits some logfile, the console or not usually depends on
the KERN_ prefix you give to reiserfs
o on Caldera the user won't see any kernel messages unless something
unexpected happens or he explicitly wants it.
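
The second point can be illustrated with a small userspace sketch of how a "<N>" prefix decides whether a message reaches the console. This is a model, not kernel code: the DEMO_ names and both functions are hypothetical, and the default level of 4 for unprefixed messages is an assumption of the sketch.

```c
/* Prefix strings in the style of the kernel's KERN_* macros ("<0>".."<7>"). */
#define DEMO_KERN_WARNING "<4>"
#define DEMO_KERN_DEBUG   "<7>"

/* Level assumed for a message that carries no prefix. */
#define DEMO_DEFAULT_MESSAGE_LEVEL 4

/* Extract the level from "<N>text"; fall back to the default if absent. */
int message_level(const char *msg)
{
    if (msg[0] == '<' && msg[1] >= '0' && msg[1] <= '7' && msg[2] == '>')
        return msg[1] - '0';
    return DEMO_DEFAULT_MESSAGE_LEVEL;
}

/* A message is shown on the console only if it is more urgent
 * (numerically lower) than the console log level. */
int reaches_console(const char *msg, int console_loglevel)
{
    return message_level(msg) < console_loglevel;
}
```

With a console log level of 7, a warning-level notice would be printed and a debug-level one would not.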

In either case one could rip that message out if there is any gain from it..

> If Chris wants to run ReiserFS with the checks on, fine, he is a user,

I am _not_ a reiserfs user.

> and he at
> least knows he is doing it, but when a distro does it without warning users the
> FS is crippled it is really foul.

I think you got that wrong. It's really foul to not have checks in that
can prevent silent corruption on a filesystem that is not known for being
very stable.

> Well, if any of you users out there are interested in knowing practical details
> of how to overcome the shovelware,

There is no reason why you can't put a reiserfs_nocheck.o module on
your website. If you want to I can send you a Caldera OpenLinux 3.1
package as reference.

Christoph

--
Of course it doesn't work. We've performed a software upgrade.

2001-07-31 13:14:18

by Hans Reiser

Subject: Re: ReiserFS / 2.4.6 / Data Corruption (patch to cause redhat to unmount reiserfs on halt included)

Christoph Hellwig wrote:

> I am _not_ a reiserfs user.
Other Chris.

2001-07-31 13:42:52

by Chris Mason

Subject: Re: ReiserFS / 2.4.6 / Data Corruption



On Tuesday, July 31, 2001 02:59:46 PM +0400 Hans Reiser <[email protected]>
wrote:

[ CONFIG_REISERFS_CHECK ]

> Last I ran benchmarks the performance cost was 30-40%, but this was some
> time ago. I think that the coders have been quietly culling some checks
> out of the FS, and so it does not cost as much anymore. I would prefer
> that the "excessive" checks had stayed in.
>
> Sigh, I see I cannot persuade in this argument. It seems Linus is right,
> and debugging checks don't belong in debugged code even if they would make
> it easier for persons hacking on the code to debug their latest hacks.
>

In the end, the distributions are responsible for their own quality control,
and they are free to turn on whatever debugging features they like. You can
yell, scream, call them names, and in general piss them off however you like
and they will still be absolutely correct in turning on whatever debugging
check they feel is important.

The right way to deal with this is ask why they think it's important to turn
on the checks. The goal behind code under CONFIG_REISERFS_CHECK is to add
extra runtime consistency checks, but without CONFIG_REISERFS_CHECK on, the
code should still make sure it isn't hosing the disk. In other words, the
goal is like this:

if (some_error) {
#ifdef CONFIG_REISERFS_CHECK
	panic("some_error");
#else
	gracefully_recover();
#endif
}

There are places CONFIG_REISERFS_CHECK does extra scanning of the metadata
and such, but all of these are supposed to be things that can be recovered
from with the debugging off. Anything else is a bug.
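
As a slightly more concrete version of the goal above, here is a userspace sketch. It is an illustration only, not ReiserFS code: struct block, check_block and REISERFS_CHECK_DEMO are hypothetical, and abort() stands in for panic().

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for an on-disk block header. */
struct block {
    int nr_items;   /* number of items the header claims */
    int capacity;   /* maximum number of items that fit in the block */
};

/* Returns 0 if the block is usable, -1 if it had to be rejected. */
int check_block(const struct block *b)
{
    if (b->nr_items < 0 || b->nr_items > b->capacity) {
#ifdef REISERFS_CHECK_DEMO
        /* Debug build: stop at the first sign of trouble. */
        fprintf(stderr, "corrupt block: nr_items=%d\n", b->nr_items);
        abort();
#else
        /* Normal build: refuse the block and let the caller recover. */
        return -1;
#endif
    }
    return 0;
}
```

Either way the bad block is never used; the switch only changes whether the failure is loud or quiet.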

-chris

2001-07-31 15:15:34

by Chris Wedgwood

Subject: Re: ReiserFS / 2.4.6 / Data Corruption

On Tue, Jul 31, 2001 at 09:41:25AM -0400, Chris Mason wrote:

if (some_error) {
#ifdef CONFIG_REISERFS_CHECK
panic("some_error") ;
#else
gracefully_recover
#endif

What a terrible construct... it would be much more elegant as:

if(some_error) {
_namesys_internal_foo("some_error");
recover_bar();
}

where _namesys_internal_foo is compiled differently and may not return
depending on CONFIG_REISERFS_CHECK and maybe also the error type.

That way we don't end up with even more #ifdef BLAH / #endif cruft
which obfuscates what is already hard to read code in places!
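
To make the idea concrete, a minimal userspace sketch follows. The names fs_report_error, read_item and DEMO_FS_CHECK are made up for illustration, and abort() stands in for a kernel panic; the point is only that the conditional lives in one helper and the caller stays readable.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical reporting helper: the only place that knows about the
 * debug switch. With DEMO_FS_CHECK defined it does not return. */
static void fs_report_error(const char *msg)
{
    fprintf(stderr, "fs error: %s\n", msg);
#ifdef DEMO_FS_CHECK
    abort();
#endif
}

/* Caller with no preprocessor conditionals in its logic. */
int read_item(int item_len, int max_len)
{
    if (item_len > max_len) {
        fs_report_error("item longer than block");
        return -1;      /* recovery path: reject the item */
    }
    return item_len;
}
```

The recovery code is always compiled, so it cannot rot unnoticed while everyone tests with checking enabled.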

Flames welcome :)





--cw

2001-07-31 15:18:45

by Florian Weimer

Subject: Re: binary modules (was Re: ReiserFS / 2.4.6 / Data Corruption)

Chris Wedgwood <[email protected]> writes:

> People all need to appreciate sometimes vendors cannot release open
> source drivers even if they wanted to. Sometimes vendors have the
> ability to release binary only drivers which are derived in part from
> source-code which they license --- but cannot share.

That's particularly true if there is no documentation for the
hardware other than this reference source code. This seems to be a
common situation, even with hardware which has good specs, technically
speaking.

--
Florian Weimer [email protected]
University of Stuttgart http://cert.uni-stuttgart.de/
RUS-CERT +49-711-685-5973/fax +49-711-685-5898

2001-07-31 15:19:24

by Florian Weimer

Subject: Re: binary modules (was Re: ReiserFS / 2.4.6 / Data Corruption)

Matthew Gardiner <[email protected]> writes:

> 2. Regards to hardware manufacturers, what have they got to lose from
> publishing the specs? nothing.

Some vendors do not have proper specs or have received them under NDA
themselves.

--
Florian Weimer [email protected]
University of Stuttgart http://cert.uni-stuttgart.de/
RUS-CERT +49-711-685-5973/fax +49-711-685-5898

2001-07-31 15:22:14

by Hans Reiser

Subject: Re: ReiserFS / 2.4.6 / Data Corruption

Chris Mason wrote:
>
> On Tuesday, July 31, 2001 02:59:46 PM +0400 Hans Reiser <[email protected]>
> wrote:
>
> [ CONFIG_REISERFS_CHECK ]
>
> > Last I ran benchmarks the performance cost was 30-40%, but this was some
> > time ago. I think that the coders have been quietly culling some checks
> > out of the FS, and so it does not cost as much anymore. I would prefer
> > that the "excessive" checks had stayed in.
> >
> > Sigh, I see I cannot persuade in this argument. It seems Linus is right,
> > and debugging checks don't belong in debugged code even if they would make
> > it easier for persons hacking on the code to debug their latest hacks.
> >
>
> In the end, the distributions are responsible for their own quality control,
> and they are free to turn on whatever debugging features they like. You can
> yell, scream, call them names, and in general piss them off however you like
> and they will still be absolutely correct in turning on whatever debugging
> check they feel is important.

If they tell the user that the debugging is on and the FS is slowed. I think
this is my solution, we will just make sure that the user knows with every mount
and every boot that debug is on and things are going to be slow.
>
> The right way to deal with this is ask why they think it's important to turn
> on the checks. The goal behind code under CONFIG_REISERFS_CHECK is to add
> extra runtime consistency checks, but without CONFIG_REISERFS_CHECK on, the
> code should still make sure it isn't hosing the disk. In other words, the
> goal is like this:
>
> if (some_error) {
> #ifdef CONFIG_REISERFS_CHECK
> panic("some_error") ;
> #else
> gracefully_recover
> #endif
>
> There are places CONFIG_REISERFS_CHECK does extra scanning of the metadata
> and such, but all of these are supposed to be things that can be recovered
> from with the debugging off. Anything else is a bug.
>
> -chris


I am sorry Chris, but I cannot see the sense in what you say.
CONFIG_REISERFS_CHECK is not a flag that indicates whether the user desires
graceful recovery, it is a flag that indicates whether every imaginable check
should be in the code, performance be damned, because there is a bug in the code
somewhere, and we are desperately trying to get a clue about what its source is
earlier in its life prior to the machine hanging. (Bugs where there is a time
lag between data structure corrupting and FS crashing are harder than others to
debug, and checking the data structures excessively is one way to try to find
those bugs.) Making graceful recovery a selectable option is a separate topic.

There are lots of arguments that naturally arise in a development team about
what checks are debug only and what ones belong outside. I lose many of these
arguments to the betterment of ReiserFS. If persons not on the development
team, and not involved in those discussions, and not end users, turn debug on
and let users think it is normal slow motion reiserfs that they run, it screws
our whole methodology.

It may help readers if they understand that Chris does not like the big heavy
checks that one would not want to run all the time being inside
CONFIG_REISERFS_CHECK.

Having levels of CONFIG_REISERFS_CHECK, in which one level is something to the
effect of CHECK_EVERYTHING_YOU_CAN_WITHOUT_MY_NOTICING_THINGS_GOING_SLOWER,
would be reasonable. We have what we have though.
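
A rough userspace sketch of what such levels could look like is below. Nothing like this exists in the code being discussed; DEMO_CHECK_LEVEL, DEMO_CHECK and node_used_bytes are invented for the example, and abort() stands in for a panic.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical levels: 0 = off, 1 = cheap checks only, 2 = everything. */
#ifndef DEMO_CHECK_LEVEL
#define DEMO_CHECK_LEVEL 1
#endif

/* The level test compares two compile-time constants, so the compiler
 * can drop disabled checks entirely. */
#define DEMO_CHECK(level, cond, msg)                              \
    do {                                                          \
        if ((level) <= DEMO_CHECK_LEVEL && !(cond)) {             \
            fprintf(stderr, "check failed: %s\n", (msg));         \
            abort();                                              \
        }                                                         \
    } while (0)

/* Sum of the item sizes stored in a node. */
int node_used_bytes(const int *sizes, int n, int node_size)
{
    int i, used = 0;

    DEMO_CHECK(1, n >= 0, "negative item count");       /* cheap: O(1) */
    for (i = 0; i < n; i++) {
        DEMO_CHECK(2, sizes[i] > 0, "empty item");      /* heavy: per item */
        used += sizes[i];
    }
    DEMO_CHECK(1, used <= node_size, "node overflow");  /* cheap: O(1) */
    return used;
}
```

At the default level 1 the per-item check is skipped, so a distro could ship the cheap checks while developers build with -DDEMO_CHECK_LEVEL=2.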

Hans

2001-07-31 15:51:11

by Chris Mason

Subject: Re: ReiserFS / 2.4.6 / Data Corruption



On Tuesday, July 31, 2001 07:22:01 PM +0400 Hans Reiser <[email protected]>
wrote:

> Chris Mason wrote:
>>
>> On Tuesday, July 31, 2001 02:59:46 PM +0400 Hans Reiser
>> <[email protected]> wrote:
>>
>> [ CONFIG_REISERFS_CHECK ]
>>
>> > Last I ran benchmarks the performance cost was 30-40%, but this was some
>> > time ago. I think that the coders have been quietly culling some checks
>> > out of the FS, and so it does not cost as much anymore. I would prefer
>> > that the "excessive" checks had stayed in.
>> >
>> > Sigh, I see I cannot persuade in this argument. It seems Linus is right,
>> > and debugging checks don't belong in debugged code even if they would
>> > make it easier for persons hacking on the code to debug their latest
>> > hacks.
>> >
>>
>> In the end, the distributions are responsible for their own quality
>> control, and they are free to turn on whatever debugging features they
>> like. You can yell, scream, call them names, and in general piss them off
>> however you like and they will still be absolutely correct in turning on
>> whatever debugging check they feel is important.
>
> If they tell the user that the debugging is on and the FS is slowed. I
> think this is my solution, we will just make sure that the user knows with
> every mount and every boot that debug is on and things are going to be slow.

It already does. Read the mount output ;-)

>>
>> The right way to deal with this is ask why they think it's important to
>> turn on the checks. The goal behind code under CONFIG_REISERFS_CHECK is
>> to add extra runtime consistency checks, but without CONFIG_REISERFS_CHECK
>> on, the code should still make sure it isn't hosing the disk. In other
>> words, the goal is like this:
>>
>> if (some_error) {
>> #ifdef CONFIG_REISERFS_CHECK
>> panic("some_error") ;
>> #else
>> gracefully_recover
>> #endif
>>
>> There are places CONFIG_REISERFS_CHECK does extra scanning of the metadata
>> and such, but all of these are supposed to be things that can be recovered
>> from with the debugging off. Anything else is a bug.
>>
>> -chris
>
>
> I am sorry Chris, but I cannot see the sense in what you say.
> CONFIG_REISERFS_CHECK is not a flag that indicates whether the user desires
> graceful recovery, it is a flag that indicates whether every imaginable
> check should be in the code, performance be damned, because there is a bug
> in the code somewhere, and we are desperately trying to get a clue about
> what its source is earlier in its life prior to the machine hanging.

If graceful recovery is not possible with CONFIG_REISERFS_CHECK off, the FS
is supposed to panic (or remount readonly). Anything less is a bug.

CONFIG_REISERFS_CHECK might put the panic in a different place, for example,
it might notice when a block is read in that one of the items is hosed. That
doesn't mean we can completely ignore hosed items with CONFIG_REISERFS_CHECK
off though.

> (Bugs where there is a time lag between data structure corrupting and FS
> crashing are harder than others to debug, and checking the data structures
> excessively is one way to try to find those bugs.) Making graceful
> recovery a selectable option is a separate topic.
>
> There are lots of arguments that naturally arise in a development team about
> what checks are debug only and what ones belong outside. I lose many of
> these arguments to the betterment of ReiserFS. If persons not on the
> development team, and not involved in those discussions, and not end users,
> turn debug on and let users think it is normal slow motion reiserfs that
> they run, it screws our whole methodology.

Distributions have methodologies too, and it's their job to apply them to the
products they ship. They aren't adding code or breaking existing code, they are
simply enabling one of the options we provide. If you really don't want
anyone to enable it, it shouldn't be there at all.

>
> It may help readers if they understand that Chris does not like the big
> heavy checks that one would not want to run all the time being inside
> CONFIG_REISERFS_CHECK.

As per the rules above, this is somewhat true. I want the FS to be fast as
much as anyone else, but reasonable safety checks are much more important.

>
> Having levels of CONFIG_REISERFS_CHECK, in which one level is something to
> the effect of
> CHECK_EVERYTHING_YOU_CAN_WITHOUT_MY_NOTICING_THINGS_GOING_SLOWER, would be
> reasonable. We have what we have though.
>

If you can do the check without being slower, why would anyone ever turn it
off? Speed is not the determining factor in which checks are done, safety is.

-chris

2001-07-31 15:59:51

by Chris Mason

Subject: Re: ReiserFS / 2.4.6 / Data Corruption



On Wednesday, August 01, 2001 03:15:54 AM +1200 Chris Wedgwood <[email protected]>
wrote:

> On Tue, Jul 31, 2001 at 09:41:25AM -0400, Chris Mason wrote:
>
> if (some_error) {
> #ifdef CONFIG_REISERFS_CHECK
> panic("some_error") ;
> #else
> gracefully_recover
> #endif
>
> What a terrible construct... it would be much more elegant as:
>
> if(some_error) {
> _namesys_internal_foo("some_error");
> recover_bar();
> }
>
> where _namesys_internal_foo is compiled differently and may not return
> depending on CONFIG_REISERFS_CHECK and maybe also the error type.

Two part answer...

1) almost none of the CONFIG_REISERFS_CHECKs look like that, it was an
oversimplified example ;-)

2) Even still, the #ifdefs look nasty, and make the code hard to read. Take
a look at the latest ac release, which has a patch from Nikita that is
similar to what you describe.

-chris



2001-07-31 17:01:29

by J Sloan

Subject: [OT] Re: ReiserFS / 2.4.6 / Data Corruption

Anders Eriksson wrote:

> Side note. I vaguely recall a distribution whose name has escaped me since.
> Their selling point was "It's harder to install" and they claimed not to patch
> any source. "If it's good enough for the author, it's good enough for us".
> Might be worth checking out. If someone has a distro name for it, please...
>

Rock Linux IIRC -

cu

jjs

2001-07-31 22:09:15

by Jussi Laako

Subject: Re: ReiserFS / 2.4.6 / Data Corruption

Rik van Riel wrote:
>
> If you can choose between sucky performance or a chance
> at silent data corruption ... which would you choose ?

Just a side note to this discussion.

I'd be very happy with full data journalling even with a 50% performance
penalty... There are applications that require extreme data integrity at all
times no matter what happens.

- Jussi Laako

--
PGP key fingerprint: 161D 6FED 6A92 39E2 EB5B 39DD A4DE 63EB C216 1E4B
Available at PGP keyservers

2001-07-31 22:33:05

by Dan Hollis

Subject: Re: ReiserFS / 2.4.6 / Data Corruption

On Wed, 1 Aug 2001, Jussi Laako wrote:
> I'd be very happy with full data journalling even with 50% performance
> penalty... There are applications that require extreme data integrity all
> times no matter what happens.

How about an idea I proposed a while back, 'integrity loopback'?

A loopback device which writes a CRC with each block and checks the CRC
when read back.

So if you have a flaky DMA controller, bad cables, etc you will know
instantly. It would at least help catch the 'silent corruption' cases.
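
Roughly what I have in mind, as a userspace sketch rather than a real loop driver: struct stored_block, block_write and block_read are hypothetical names, the CRC is kept next to the data purely for brevity, and the bitwise CRC-32 is the slow textbook form.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_SIZE 4096

/* Bitwise CRC-32 (reflected polynomial 0xEDB88320); slow but short. */
uint32_t crc32_block(const unsigned char *data, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;
    size_t i;
    int bit;

    for (i = 0; i < len; i++) {
        crc ^= data[i];
        for (bit = 0; bit < 8; bit++)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return ~crc;
}

/* Hypothetical stored block: payload plus the CRC recorded at write time. */
struct stored_block {
    unsigned char data[BLOCK_SIZE];
    uint32_t crc;
};

/* Store the payload and remember its CRC. */
void block_write(struct stored_block *dst, const unsigned char *src)
{
    memcpy(dst->data, src, BLOCK_SIZE);
    dst->crc = crc32_block(dst->data, BLOCK_SIZE);
}

/* Returns 0 and copies the payload out, or -1 if the CRC no longer matches. */
int block_read(const struct stored_block *src, unsigned char *dst)
{
    if (crc32_block(src->data, BLOCK_SIZE) != src->crc)
        return -1;
    memcpy(dst, src->data, BLOCK_SIZE);
    return 0;
}
```

A single flipped bit anywhere in the block makes block_read fail instead of handing back bad data.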

-Dan

--
[-] Omae no subete no kichi wa ore no mono da. [-]

2001-07-31 23:45:03

by Chris Wedgwood

Subject: Re: ReiserFS / 2.4.6 / Data Corruption

On Tue, Jul 31, 2001 at 03:32:39PM -0700, Dan Hollis wrote:

How about an idea I proposed a while back, 'integrity loopback'?

A loopback device which writes a CRC with each block and checks
the CRC when read back.

So if you have a flaky DMA controller, bad cables, etc you will
know instantly. It would at least help catch the 'silent
corruption' cases.

It still doesn't help with block-reordering, the fs needs some way to
communicate write-barriers or relative block write ordering to the
lower-levels.

To implement the device, I would hack loopback to take not only the
loopback file, but also another 'checksum' file of 160-bits or
whatever for each 4096 (or whatever) block. This file might initially
be of zero-length, in which case the bind is responsible for
checksumming the blocks and writing the checksums out on attach.

I say 160-bits (or whatever) so you can use something like SHA1 for
the checksums, this way you can use a small application to resync the
entire fs at the block level over a network without having to read
every block (ie. you compare checksums and then xmit only the blocks
that differ).
The latter is something I needed a while ago.
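
The resync step might look something like the sketch below. It is only an illustration: block_digest and blocks_to_send are made-up names, and 64-bit FNV-1a is swapped in for SHA1 to keep it short (it is not collision resistant, so a real tool would want a cryptographic hash).

```c
#include <stddef.h>
#include <stdint.h>

/* FNV-1a, 64-bit: a stand-in for SHA1 to keep the sketch short. */
uint64_t block_digest(const unsigned char *data, size_t len)
{
    uint64_t h = 14695981039346656037ULL;   /* FNV offset basis */
    size_t i;

    for (i = 0; i < len; i++) {
        h ^= data[i];
        h *= 1099511628211ULL;              /* FNV prime */
    }
    return h;
}

/* Compare the two digest lists, store the indices of the differing
 * blocks in out[] and return how many there are. */
size_t blocks_to_send(const uint64_t *local, const uint64_t *remote,
                      size_t nblocks, size_t *out)
{
    size_t i, n = 0;

    for (i = 0; i < nblocks; i++)
        if (local[i] != remote[i])
            out[n++] = i;
    return n;
}
```

Only the digest lists cross the network first; the block data follows for the indices returned.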



--cw