2002-04-09 21:43:49

by Alexis S. L. Carvalho

Subject: implementing soft-updates

Hi

Does anyone know of any implementation of soft-updates over ext2? I'm
starting a project on this for grad school, and I'd like to know of any
previous (current?) efforts.

Thanks.

Alexis


2002-04-09 22:15:15

by Mike Fedyk

Subject: Re: implementing soft-updates

On Tue, Apr 09, 2002 at 06:46:05PM -0300, Alexis S. L. Carvalho wrote:
> Hi
>
> Does anyone know of any implementation of soft-updates over ext2? I'm
> starting a project on this for grad school, and I'd like to know of any
> previous (current?) efforts.
>

Heh, ext3? ;)

2002-04-09 22:23:46

by Dominik Kubla

Subject: Re: implementing soft-updates

On Tue, Apr 09, 2002 at 03:17:25PM -0700, Mike Fedyk wrote:
> On Tue, Apr 09, 2002 at 06:46:05PM -0300, Alexis S. L. Carvalho wrote:
> > Hi
> >
> > Does anyone know of any implementation of soft-updates over ext2? I'm
> > starting a project on this for grad school, and I'd like to know of any
> > previous (current?) efforts.
> >
>
> Heh, ext3? ;)

No. Ext3 uses journalling. Soft-updates is something different.

Dominik
--
"Those who would give up essential Liberty to purchase a little
temporary Safety deserve neither Liberty nor Safety." (Benjamin Franklin)

2002-04-09 23:34:30

by Mike Fedyk

Subject: Re: implementing soft-updates

On Wed, Apr 10, 2002 at 12:23:37AM +0200, Dominik Kubla wrote:
> On Tue, Apr 09, 2002 at 03:17:25PM -0700, Mike Fedyk wrote:
> > On Tue, Apr 09, 2002 at 06:46:05PM -0300, Alexis S. L. Carvalho wrote:
> > > Hi
> > >
> > > Does anyone know of any implementation of soft-updates over ext2? I'm
> > > starting a project on this for grad school, and I'd like to know of any
> > > previous (current?) efforts.
> > >
> >
> > Heh, ext3? ;)
>
> No. Ext3 uses journalling. Soft-updates is something different.
>

Yes, I know that... (don't forget the ";)"...)

Sorry, I don't know if anyone else has started anything like this.

It would be interesting to see how the locking would work compared to ext3.

You might get more responses from:

[email protected]
-and-
[email protected]

Since that's where the ext2/3 guys hang out.

Mike

2002-04-10 00:41:44

by Albert D. Cahalan

Subject: Re: implementing soft-updates

Alexis S. L. Carvalho writes:

> Does anyone know of any implementation of soft-updates
> over ext2? I'm starting a project on this for grad school,
> and I'd like to know of any previous (current?) efforts.

That's interesting. Some comments:

It is common for controllers, RAID arrays, and the disks to
mess up your ordering. Power failure during a write has been
known to scribble on random unrelated parts of the disk.
Power failure often creates bad sectors that can only be
fixed by a large write that covers the affected area.

Ext2 has deletion time stamps. These are not really good for
performance, but they help fsck to know what is going on.

While ext2 fsck doesn't guarantee anything, in practice it is far
more reliable than ufs fsck. If you change the algorithms to be
like those used by BSD, then you may lose some of the ability to
recover. Remember, fsck isn't just for power failures. It tries
to piece together a filesystem that has suffered disk corruption
caused by attackers, kernel bugs, fdisk screwups, MS-DOS writing
past the end of a partition, Windows NT Disk Manager, viruses,
disk head crashes, and every other cause you can imagine. If you
change fsck to make BSD-style assumptions about write ordering,
you weaken the ability to deal with disasters.

I'm sure you are aware of ext3. You should also be aware of tux2.
Tux2 uses the phase-tree algorithm to perform atomic updates of
the whole filesystem. Tux2 looks horridly slow at first glance,
but is actually quite fast. The overhead drops to almost nothing
as the number of simultaneous operations goes to infinity.
(the overhead asymptotically approaches 0.1%) While the operations
tend to cause fragmentation, they also make defragmentation
really cheap -- you can defragment on-the-fly as part of normal
filesystem operations without any additional IO. There is a
neat trick you can do with the phase-tree algorithm for better
integrity: make every non-leaf node carry checksums for all
directly connected child nodes. (either plain or keyed crypto)
Filesystem-level snapshots are easy with the phase-tree algorithm.
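
To make the checksum trick concrete, here is a minimal Python sketch.
It is not Tux2 code: the Leaf and Node classes, the choice of SHA-256
and the verify() helper are all assumptions made for illustration. The
only idea taken from the text above is that every non-leaf node stores
a checksum for each directly connected child.

```python
import hashlib


def checksum(data: bytes) -> bytes:
    # Plain (unkeyed) checksum; a keyed variant could use hmac instead.
    return hashlib.sha256(data).digest()


class Leaf:
    """Hypothetical leaf node holding raw block data."""

    def __init__(self, data: bytes):
        self.data = data

    def serialize(self) -> bytes:
        return b"L" + self.data


class Node:
    """Hypothetical non-leaf node: children plus one checksum per child."""

    def __init__(self, children):
        self.children = list(children)
        # Recorded when the node is built, i.e. when it is written out.
        self.child_sums = [checksum(c.serialize()) for c in self.children]

    def serialize(self) -> bytes:
        # A node's on-disk image includes the checksums of its children,
        # so the parent's checksum of this node covers them too.
        return b"N" + b"".join(self.child_sums)


def verify(node) -> bool:
    """Walk down from the root, checking each child against its parent."""
    if isinstance(node, Leaf):
        return True
    for child, expected in zip(node.children, node.child_sums):
        if checksum(child.serialize()) != expected:
            return False
        if not verify(child):
            return False
    return True
```

Because each parent vouches for its children, trusting the root is
enough to detect a changed block anywhere below it.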

Soft-updates are mainly useful for OS wars. Lots of FUD comes
flying out of the BSD camp. Ext2 horror stories are rare
when you consider just how many millions of users ext2 has.
Soft-updates would make our worst problems even worse. The whole
point of soft-updates is to have fsck and the kernel trust the
metadata a bit more... which is terrible if your VIA motherboard
is mangling your metadata before it hits the disk. Not to say
that doing well in an OS war isn't a useful goal though!

In case you are still thinking about what to do, here are a
few filesystem ideas that you might like:

soft-updates for ext2
ext2 compression (e2compr)
delayed allocation (allocate space only when about to do IO)
while rw mounted: defrag, undelete (not trash bin), grow, shrink, fsck
get tux2 into production shape
use the phase-tree algorithm for FAT32 (hint: active FAT flags)
new phase-tree filesystem, perhaps with JFS or XFS structure
make ext2 extents work
make ext2 handle huge block sizes
mark idle filesystems clean; mark dirty before non-atomic updates
ACLs compatible with NFSv4, fast, and compact
secure deletion (stop root, not the NSA: zero the name, inode...)
tools for in-place filesystem conversion (ufs --> ext2)
HFS+ filesystem
Apple's UID hacks for Darwin (the BSD-like MacOS X kernel)
design a fast way to map from inode number to filename(s)
try larger inodes (example: 168-byte, 3 in 512 bytes, 0,1,2,x,4,5,6,x,8...)
provide real-time file IO (app buffers do not guarantee bandwidth)
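
One possible reading of the "larger inodes" numbering in the list
above (0,1,2,x,4,5,6,x,8...) is that the low two bits of the inode
number pick a slot inside a 512-byte sector and the fourth slot is
simply never used. The sketch below assumes exactly that; the function
name and the layout are illustrative guesses, not an existing ext2
format.

```python
INODE_SIZE = 168       # bytes per inode, from the example above
SECTOR_SIZE = 512      # three 168-byte inodes fit, with 8 bytes to spare
SLOTS_PER_SECTOR = 4   # numbering stride; slot 3 is the unused "x"


def inode_location(ino: int):
    """Map an inode number to (sector index, byte offset in that sector)."""
    sector, slot = divmod(ino, SLOTS_PER_SECTOR)
    if slot == 3:
        raise ValueError("inode number %d falls on an unused slot" % ino)
    return sector, slot * INODE_SIZE
```

With this layout the lookup stays a shift and a mask even though the
inode size is not a power of two.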

BTW, the unbalanced trees can be good. They provide quick access
to file magic (see "file" command) and other header information.
We have read-ahead to take care of the rest of the file.

2002-04-10 01:56:38

by Alexis S. L. Carvalho

Subject: Re: implementing soft-updates

First of all, thanks for your comments.

Thus spake Albert D. Cahalan:
> Alexis S. L. Carvalho writes:
>
> > Does anyone know of any implementation of soft-updates
> > over ext2? I'm starting a project on this for grad school,
> > and I'd like to know of any previous (current?) efforts.
>
> That's interesting. Some comments:
>
> It is common for controllers, RAID arrays, and the disks to
> mess up your ordering. Power failure during a write has been
> known to scribble on random unrelated parts of the disk.
> Power failure often creates bad sectors that can only be
> fixed by a large write that covers the affected area.

OK, but if something scribbles on random unrelated parts of the disk
there's not much you can do besides praying that fsck will fix it.

> Ext2 has deletion time stamps. These are not really good for
> performance, but they help fsck to know what is going on.
>
> While ext2 fsck doesn't guarantee anything, in practice it is far
> more reliable than ufs fsck. If you change the algorithms to be
> like those used by BSD, then you may lose some of the ability to
> recover. Remember, fsck isn't just for power failures. It tries
> to piece together a filesystem that has suffered disk corruption
> caused by attackers, kernel bugs, fdisk screwups, MS-DOS writing
> past the end of a partition, Windows NT Disk Manager, viruses,
> disk head crashes, and every other cause you can imagine. If you
> change fsck to make BSD-style assumptions about write ordering,
> you weaken the ability to deal with disasters.

I haven't looked into e2fsck yet, but if/when I get to it, I'll probably
add a mode that makes some assumptions about the disk state. If you
don't explicitly ask for this mode, you get the current behavior.

Also, this mode would only be run during the boot sequence under a
specific situation (the system crashed while running with soft-updates).
Note that if you were running a journalling fs, fsck wouldn't be run at
all.

> I'm sure you are aware of ext3. You should also be aware of tux2.

I read some stuff about tux2 a couple of years ago, but I do have to
re-read it all...

> Soft-updates are mainly useful for OS wars. Lots of FUD comes
> flying out of the BSD camp. Ext2 horror stories are rare
> when you consider just how many millions of users ext2 has.

Well, I found soft-updates pretty interesting, and I want to play a bit
with it. Anyway, given my (lack of) experience with kernel programming I
don't believe I'll have anything useful for some time yet...

> In case you are still thinking about what to do, here are a
> few filesystem ideas that you might like:
<nice list of fs projects>

hmm... I guess I find soft-updates sexy enough... :-)

Thanks

Alexis

2002-04-10 02:57:09

by Andreas Dilger

Subject: Re: implementing soft-updates

On Apr 09, 2002 20:41 -0400, Albert D. Cahalan wrote:
> In case you are still thinking about what to do, here are a
> few filesystem ideas that you might like:
>
> ext2 compression (e2compr)
- project needs polishing, integration
> delayed allocation (allocate space only when about to do IO)
- Andrew Morton has done this for 2.5
> while rw mounted: defrag, undelete (not trash bin), grow, shrink, fsck
- Andrew Morton has implemented for ext3 (kernel space, needs user tool)
> make ext2 extents work
- yes, discussion ongoing on ext2-devel, no real progress yet
> make ext2 handle huge block sizes
- kernel issues w.r.t. buffers > PAGE_SIZE
> mark idle filesystems clean; mark dirty before non-atomic updates
- maybe marginally useful
> tools for in-place filesystem conversion (ufs --> ext2)
- existing project
> try larger inodes (example: 168-byte, 3 in 512 bytes, 0,1,2,x,4,5,6,x,8...)
- discussion ongoing on ext2-devel with some good progress

Cheers, Andreas
--
Andreas Dilger
http://www-mddsp.enel.ucalgary.ca/People/adilger/
http://sourceforge.net/projects/ext2resize/

2002-04-10 03:48:26

by Andreas Dilger

Subject: Re: implementing soft-updates

On Apr 09, 2002 22:58 -0300, Alexis S. L. Carvalho wrote:
> OK, but if something scribbles on random unrelated parts of the disk
> there's not much you can do besides praying that fsck will fix it.

Well, the fact that ext2 uses fixed areas of the disk for specific
purposes (e.g. inode table) and it has backups of a lot of metadata
makes it very possible to recover from random data corruption.
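
As one illustration of those backups: with the sparse_super feature
enabled, superblock and group descriptor copies are kept in block
groups 0 and 1 and in groups whose number is a power of 3, 5 or 7
(without that feature, every group carries a copy). A small Python
sketch of that rule follows; the helper names are made up for this
example.

```python
def is_power_of(n: int, base: int) -> bool:
    """True if n == base**k for some k >= 0."""
    while n > 1 and n % base == 0:
        n //= base
    return n == 1


def has_superblock_backup(group: int) -> bool:
    """Which block groups hold a superblock copy under sparse_super."""
    if group <= 1:
        return True
    return any(is_power_of(group, base) for base in (3, 5, 7))


def backup_groups(group_count: int):
    """List every group below group_count that holds a copy."""
    return [g for g in range(group_count) if has_superblock_backup(g)]
```

Since these locations are fixed, e2fsck can be pointed at one of the
copies when the primary superblock is damaged.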

> Note that if you were running a journalling fs, fsck wouldn't be run at
> all.

Note that this is incorrect. Even with ext3, e2fsck is run on each
boot. In the normal case all it does is journal recovery (which takes
a few seconds at most) and a superficial check of the superblock.
This is incredibly useful, however, if there was a filesystem error,
since e2fsck has a chance to check and clean up the filesystem before
it is put into use.

Cheers, Andreas
--
Andreas Dilger
http://www-mddsp.enel.ucalgary.ca/People/adilger/
http://sourceforge.net/projects/ext2resize/

2002-04-10 09:28:15

by Dominik Kubla

Subject: Re: implementing soft-updates

On Tue, Apr 09, 2002 at 08:41:28PM -0400, Albert D. Cahalan wrote:
...
> While ext2 fsck doesn't guarantee anything, in practice it is far
> more reliable than ufs fsck. If you change the algorithms to be
> like those used by BSD, then you may lose some of the ability to
> recover. Remember, fsck isn't just for power failures. It tries
> to piece together a filesystem that has suffered disk corruption
> caused by attackers, kernel bugs, fdisk screwups, MS-DOS writing
> past the end of a partition, Windows NT Disk Manager, viruses,
> disk head crashes, and every other cause you can imagine. If you
> change fsck to make BSD-style assumptions about write ordering,
> you weaken the ability to deal with disasters.

I disagree. In fact the current BSD softupdate code guarantees that all
that ever happens is that freed blocks are not entered into the free
block list. Something fsck can fix in background on a live system. See
M. Kirk McKusick's BSDcon 02 paper 'Running fsck in background.'
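
A toy model of that guarantee, in Python: if the only possible
inconsistency after a crash is a block that is still marked allocated
although no inode references it, then repair is a set difference. This
is not the BSD code; the data structures (a set of allocated block
numbers and a dict from inode number to block list) are assumptions
chosen to keep the sketch short.

```python
def find_leaked_blocks(allocated, inodes):
    """Return blocks marked allocated that no inode points at.

    allocated: set of block numbers marked in-use in the bitmap
    inodes:    dict mapping inode number -> list of block numbers
    """
    referenced = set()
    for blocks in inodes.values():
        referenced.update(blocks)
    return allocated - referenced


def reclaim(allocated, inodes):
    """Return (repaired allocation set, set of blocks that were freed)."""
    leaked = find_leaked_blocks(allocated, inodes)
    return allocated - leaked, leaked
```

Freeing such blocks never touches anything a live file depends on,
which is why this particular repair can run while the filesystem is
mounted.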

Faulty hardware creating havoc with your on-disk data structures is
something every file system is prone to, unless it does a raw
read-after-write for checking purposes, which definitely kills disk
performance.

The background fsck capability, just like journalling or logging, is
typically only needed in 24/7 systems (sure, they are nice to have in
your home system, but do you _REALLY_ need them? I don't!) and those
systems typically are run on proven hardware which is operated well
within the specs. So please don't construct these kinds of arguments.

The fact that the BSD FFS in its currently released version (which does
not include snapshot and background fsck capability) is considered to be
one of the more reliable file systems around, even when softupdates are
enabled, speaks for itself. So please just as you don't want horror
stories about Linux ext2 spread: don't do it yourself.

Alexis, if you're looking for a rewarding Linux project, don't focus too
much on softupdates; the majority of Linux users/developers couldn't
care less. I wonder sometimes if this is perhaps because BSD did it
first?

Read M. Kirk McKusick's paper on fsck and snapshots (it's in the
proceedings of this year's BSDcon, available from Usenix) and try to
implement the snapshot capability for ext2/ext3. Every one of us who has
to do live backups of production systems will thank you if you get that
development started. I found that Mr. McKusick is somebody who is very
helpful towards people trying to understand his work, so you might get
help from him if you get stuck. OTOH if you avoid the buzzword
'softupdates' many Linux file system hackers will be much more inclined
to help you out with the Linux part.

Yours,
Dominik Kubla
--
"Those who would give up essential Liberty to purchase a little
temporary Safety deserve neither Liberty nor Safety." (Benjamin Franklin)

2002-04-10 11:52:18

by Peter Wächtler

Subject: Re: implementing soft-updates

Andreas Dilger wrote:
>
> On Apr 09, 2002 20:41 -0400, Albert D. Cahalan wrote:
> > In case you are still thinking about what to do, here are a
> > few filesystem ideas that you might like:
> >
> > ext2 compression (e2compr)
> - project needs polishing, integration

Do we want to set up a project on SourceForge for this?

2002-04-10 18:07:15

by Albert D. Cahalan

Subject: Re: implementing soft-updates

Dominik Kubla writes:
> On Tue, Apr 09, 2002 at 08:41:28PM -0400, Albert D. Cahalan wrote:

>> While ext2 fsck doesn't guarantee anything, in practice it is far
>> more reliable than ufs fsck. If you change the algorithms to be
>> like those used by BSD, then you may lose some of the ability to
>> recover. Remember, fsck isn't just for power failures. It tries
>> to piece together a filesystem that has suffered disk corruption
>> caused by attackers, kernel bugs, fdisk screwups, MS-DOS writing
>> past the end of a partition, Windows NT Disk Manager, viruses,
>> disk head crashes, and every other cause you can imagine. If you
>> change fsck to make BSD-style assumptions about write ordering,
>> you weaken the ability to deal with disasters.
>
> I disagree. In fact the current BSD softupdate code guarantees that all
> that ever happens is that freed blocks are not entered into the free
> block list. Something fsck can fix in background on a live system. See
> M. Kirk McKusick's BSDcon 02 paper 'Running fsck in background.'

Two cases:

a. proper shutdown -- somewhat OK to never fsck
b. unclean shutdown -- may involve kernel crashing

So with an unclean filesystem, _any_ avoidance of fsck is
suspect. I have a UPS; when my system boots on an unclean
filesystem it's because XFree86 thought it could run a
hardware driver in userspace.

Journalling gives you a nice list of recently-touched data
structures to examine. The phase-tree algorithm can support
low-cost incremental checksumming of the whole filesystem.
Soft-updates leave you with... well, is prayer any good?
You'd better run fsck at boot, which AFAIK is exactly what
is done; you even say "not include [...] background fsck".

> The fact that the BSD FFS in its currently released version (which does
> not include snapshot and background fsck capability) is considered to be
> one of the more reliable file systems around, even when softupdates are
> enabled, speaks for itself. So please just as you don't want horror
> stories about Linux ext2 spread: don't do it yourself.

I'm just tired of this: "Back when I used to use Linux 2.1.44 my
disks were trashed so bad that I lost everything! So use BSD."
Last time I checked, BSD fsck didn't have a set of regression tests
like ext2 fsck does. On the BSD mailing lists you can read about
fsck getting signal 11. So it's not God's Glorious Filesystem by
any means.

2002-04-10 18:15:12

by Andreas Dilger

Subject: Re: implementing soft-updates

On Apr 10, 2002 11:28 +0200, Dominik Kubla wrote:
> try to implement the snapshot capability for ext2/ext3. Every one of us
> who has to do live backups of production systems will thank you if you
> get that development started.

LVM can already do snapshots at the device level. It integrates with
ext3/XFS/reiserfs via sync_super_lockfs/unlockfs so that what is in
the snapshot is a consistent, clean filesystem.

There might need to be a little touchup with ext2 to support these
calls, but even in the current state you get a usable filesystem
snapshot, with the exception that the filesystem has not been marked
clean.

As for a filesystem-level ext2/ext3 snapshot, this has also already
been done (sf.net/projects/snapfs). The people who took over that
project have removed all of the released files and CVS, but you can
still get the CVS from the sourceforge CVS backups. I also have a
version here, but don't have any time to work on it.

Cheers, Andreas
--
Andreas Dilger
http://www-mddsp.enel.ucalgary.ca/People/adilger/
http://sourceforge.net/projects/ext2resize/

2002-04-10 19:24:23

by Andi Kleen

Subject: Re: implementing soft-updates

Dominik Kubla <[email protected]> writes:

> The background fsck capability, just like journalling or logging, is
> typically only needed in 24/7 systems (sure, they are nice to have in
> your home system, but do you _REALLY_ need them? I don't!) and those
> systems typically are run on proven hardware which is operated well
> within the specs. So please don't construct these kinds of arguments.

You can already do background fsck on a linux system today. Just do it on
a LVM/EVMS snapshot.
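
For the LVM case, the sequence might look like the Python sketch
below. It only builds the command lines and does not run them. The
volume group, volume and snapshot names and the 1G snapshot size are
placeholders, and the helper function is invented for this example;
the lvcreate, e2fsck and lvremove options used are the standard ones
for creating a snapshot, doing a forced read-only check, and removing
the snapshot again.

```python
def snapshot_fsck_commands(vg, lv, snap="fsck_snap", size="1G"):
    """Return the argv lists for a read-only check of an LVM snapshot.

    vg, lv: volume group and logical volume holding the filesystem
    snap:   name for the temporary snapshot (placeholder default)
    size:   space reserved for changes made while the check runs
    """
    origin = "/dev/%s/%s" % (vg, lv)
    snap_dev = "/dev/%s/%s" % (vg, snap)
    return [
        # Freeze a point-in-time image of the mounted volume.
        ["lvcreate", "--snapshot", "--size", size, "--name", snap, origin],
        # -f: check even if marked clean; -n: open read-only, change nothing.
        ["e2fsck", "-f", "-n", snap_dev],
        # Drop the snapshot once the check has finished.
        ["lvremove", "-f", snap_dev],
    ]
```

Each list could be handed to subprocess.run() in turn by a cron job,
keeping the e2fsck exit status for later.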


-Andi

2002-04-11 12:48:32

by Bill Davidsen

Subject: Re: implementing soft-updates

On Tue, 9 Apr 2002, Andreas Dilger wrote:

> On Apr 09, 2002 20:41 -0400, Albert D. Cahalan wrote:
> > In case you are still thinking about what to do, here are a
> > few filesystem ideas that you might like:

> > mark idle filesystems clean; mark dirty before non-atomic updates
> - maybe marginally useful

I would think far more than marginally useful... this would provide
an improved possibility of avoiding fsck, something many people would find
desirable. And along that line I'll offer one more idea, not to increment
the mount count if a f/s is mounted and no writes are done to the f/s,
such as ro mount, noatime mount and no writes, etc. The check for fsck
after N mounts was designed to assist with reliability, not be a penalty
in boot time.
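
A rough Python model of both ideas together (mark the filesystem
dirty only around writes, and count a mount only once it has actually
written something). The Superblock and Mount classes and their field
names are invented for this sketch and do not correspond to the real
ext2 structures or kernel code.

```python
class Superblock:
    """Toy stand-in for the superblock fields discussed here."""

    def __init__(self, max_mount_count=20):
        self.mount_count = 0
        self.max_mount_count = max_mount_count
        self.clean = True


class Mount:
    """One mount of the filesystem; counts itself lazily."""

    def __init__(self, sb, read_only=False):
        self.sb = sb
        self.read_only = read_only
        self.counted = False  # has this mount bumped mount_count yet?

    def write(self):
        if self.read_only:
            raise PermissionError("read-only mount")
        if not self.counted:
            # Count the mount on its first write, not at mount time.
            self.sb.mount_count += 1
            self.counted = True
        # Mark dirty before the non-atomic update would be issued.
        self.sb.clean = False

    def went_idle(self):
        # Assumes every pending write has already reached the disk.
        self.sb.clean = True


def needs_fsck(sb):
    """Boot-time decision: dirty, or mounted-and-written too many times."""
    return (not sb.clean) or sb.mount_count >= sb.max_mount_count
```

In this model a crash on an idle or read-only filesystem leaves the
clean flag set, so the boot-time check can be skipped.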

--
bill davidsen <[email protected]>
CTO, TMR Associates, Inc
Doing interesting things with little computers since 1979.

2002-04-11 16:35:21

by James Bottomley

Subject: Re: implementing soft-updates

[email protected] said:
> Does anyone know of any implementation of soft-updates over ext2? I'm
> starting a project on this for grad school, and I'd like to know of
> any previous (current?) efforts.

There was a previous attempt (now defunct, I believe) to implement a phase
tree approach for ext2. While this is definitely not the same as the McKusick
soft update approach, the end goal of ensuring that the filesystem is
consistent at all times during operation is, so you may be able to salvage
something to help you from it.

The person originally doing it was Daniel Phillips

http://people.nl.linux.org/~phillips/

and he called the filesystem tux2. There were also several papers he
presented, one to ALS 2000 which unfortunately has no surviving on-line copy.
Marc Merlin I believe has a copy of the presentation made to the Australian
Linux Conference in 2001:

http://marc.merlins.org/linux/linux.conf.au_2001/Day2/Tux2.html

And there are probably others dotted about the web if you look.

James Bottomley


2002-04-15 20:21:35

by Pavel Machek

Subject: Re: implementing soft-updates

Hi

> > The background fsck capability, just like journalling or logging, is
> > typically only needed in 24/7 systems (sure, they are nice to have in
> > your home system, but do you _REALLY_ need them? I don't!) and those
> > systems typically are run on proven hardware which is operated well
> > within the specs. So please don't construct these kinds of arguments.
>
> You can already do background fsck on a linux system today. Just do it on
> a LVM/EVMS snapshot.

How do you fix errors you find by such background fsck?
Pavel
--
Philips Velo 1: 1"x4"x8", 300gram, 60, 12MB, 40bogomips, linux, mutt,
details at http://atrey.karlin.mff.cuni.cz/~pavel/velo/index.html.

2002-04-15 20:22:25

by Pavel Machek

Subject: Re: implementing soft-updates

Hi!

> I'm just tired of this: "Back when I used to use Linux 2.1.44 my
> disks were trashed so bad that I lost everything! So use BSD."
> Last time I checked, BSD fsck didn't have a set of regression tests
> like ext2 fsck does. On the BSD mailing lists you can read about
> fsck getting signal 11. So it's not God's Glorious Filesystem by
> any means.

Actually, I believe I can make current e2fsck mark filesystem clean when
it is not [by trashing disk badly and killing lost+found]. If you can mail
me recent [static] e2fsck, I can test it and create image where it fails.

Pavel
--
Philips Velo 1: 1"x4"x8", 300gram, 60, 12MB, 40bogomips, linux, mutt,
details at http://atrey.karlin.mff.cuni.cz/~pavel/velo/index.html.

2002-04-15 20:25:53

by Andi Kleen

Subject: Re: implementing soft-updates

On Mon, Apr 08, 2002 at 08:35:16PM +0000, Pavel Machek wrote:
> Hi
>
> > > The background fsck capability, just like journalling or logging, is
> > > typically only needed in 24/7 systems (sure, they are nice to have in
> > > your home system, but do you _REALLY_ need them? I don't!) and those
> > > systems typically are run on proven hardware which is operated well
> > > within the specs. So please don't construct these kinds of arguments.
> >
> > You can already do background fsck on a linux system today. Just do it on
> > a LVM/EVMS snapshot.
>
> How do you fix errors you find by such background fsck?

You umount the file system (that is the best you can do with a random
error anyways, BSD doesn't do any better except in the special case
of lost blocks in the bitmap) and fsck it again on the real volume.

In theory you could build a mechanism to pass some state from the
first fsck to the second to speed the second up, but it is probably not
worth it.


-Andi

2002-04-15 20:48:50

by Andreas Dilger

Subject: Re: implementing soft-updates

On Apr 08, 2002 20:35 +0000, Pavel Machek wrote:
> Andi Kleen writes:
> > You can already do background fsck on a linux system today. Just do it on
> > a LVM/EVMS snapshot.
>
> How do you fix errors you find by such background fsck?

You shouldn't get any in the first place (they would be from disk
errors, memory corruption, software bugs, etc). If you _do_ get such
an error, isn't it worth it to shut down your system and bring it back
to a known state (and maybe figure out what actually caused this error)?

The only reason to have such a feature is for high-availability,
high-uptime systems which cannot normally be shut down. In very recent
versions of e2fsprogs, you are able to reset the "last checked" field
in the superblock (you could reset the "mount count" field for a long
time), so that you do an online check every week/month, then you can
avoid the forced fsck after a reboot because the filesystem hasn't been
checked in 6 months.
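
Sketched in Python, the follow-up step after an online check could be
as small as this. The function name and the policy of returning None
on any non-zero status are assumptions for illustration; the facts
relied on are that e2fsck exits with 0 when it finds no errors, and
that tune2fs accepts -C to set the mount count and -T to set the
last-checked time.

```python
def after_online_check(exit_code, origin_dev):
    """Decide what to do once e2fsck has finished on the snapshot.

    Returns the tune2fs command that records a successful check on the
    real device, or None if the check was not clean and the machine
    should be scheduled for a proper offline fsck instead.
    """
    if exit_code == 0:
        # Reset the mount count and stamp "last checked" with the
        # current time so the next reboot does not force a full check.
        return ["tune2fs", "-C", "0", "-T", "now", origin_dev]
    return None
```

The returned command is only built here, not executed, so a wrapper
script can log it or ask for confirmation first.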

Cheers, Andreas
--
Andreas Dilger
http://www-mddsp.enel.ucalgary.ca/People/adilger/
http://sourceforge.net/projects/ext2resize/

2002-04-16 10:45:58

by Wichert Akkerman

Subject: Re: implementing soft-updates

In article <[email protected]>,
Pavel Machek <[email protected]> wrote:
>How do you fix errors you find by such background fsck?

Theoretically I suppose you could use a writeable snapshot and then switch the
fscked snapshot with the currently mounted fs.

Wichert.

--
_________________________________________________________________
/ Nothing is fool-proof to a sufficiently talented fool \
| [email protected] http://www.liacs.nl/~wichert/ |
| 1024D/2FA3BC2D 576E 100B 518D 2F16 36B0 2805 3CB8 9250 2FA3 BC2D |

2002-04-16 18:53:04

by Mike Fedyk

Subject: Re: implementing soft-updates

On Tue, Apr 16, 2002 at 12:45:53PM +0200, Wichert Akkerman wrote:
> In article <[email protected]>,
> Pavel Machek <[email protected]> wrote:
> >How do you fix errors you find by such background fsck?
>
> Theoretically I suppose you could use a writeable snapshot and then switch the
> fscked snapshot with the currently mounted fs.
>

Not if there were writes after the fsck. You could probably do it if there were only
reads though. If there were writes it would be based on a corrupted (to
whatever extent) filesystem.