Andrea Arcangeli wrote:
>
> On Fri, Jul 06, 2001 at 12:28:15AM +1000, Andrew Morton wrote:
> > ext3 journals data. That's unique and it breaks things (or rather,
> > things break it). It'd be trivial to support O_DIRECT in ext3's
> > writeback mode (metadata-only), but nobody uses that.
>
> I thought everybody uses metadata-only to avoid killing data-write
> performance.

ext3 has three modes:

  data=journal
      Data is journalled. Yes, this slows things down
      significantly.

  data=ordered
      The default mode and the most popular. All data is written
      to disk prior to a commit. Write throughput is good, and
      you don't have uninitialised data in your files after a
      crash.

  data=writeback
      Metadata-only. Better write throughput (in dbench, anyway),
      but only metadata integrity is preserved after a crash, i.e.
      fsck says the fs is fine, but files can (and almost always do)
      contain random stuff after a crash.
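
To make the three modes concrete, here is a minimal sketch of picking the mode out of a mount option string such as "rw,data=writeback". It is only an illustration: the enum and the function names parse_data_mode() and may_contain_garbage_after_crash() are made up for this example and are not the identifiers ext3 uses.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical names for this illustration; not ext3's own identifiers. */
enum data_mode { DATA_JOURNAL, DATA_ORDERED, DATA_WRITEBACK, DATA_INVALID };

/*
 * Parse a comma-separated option string such as "rw,data=writeback".
 * Returns DATA_ORDERED when no data= option is present, because ordered
 * is described above as the default mode. An unknown data= value yields
 * DATA_INVALID.
 */
enum data_mode parse_data_mode(const char *opts)
{
    enum data_mode mode = DATA_ORDERED;
    const char *p = opts;

    assert(opts != NULL);

    while (*p) {
        size_t len = strcspn(p, ",");   /* length of the current option */

        if (len >= 5 && strncmp(p, "data=", 5) == 0) {
            const char *val = p + 5;
            size_t vlen = len - 5;

            if (vlen == 7 && strncmp(val, "journal", 7) == 0)
                mode = DATA_JOURNAL;
            else if (vlen == 7 && strncmp(val, "ordered", 7) == 0)
                mode = DATA_ORDERED;
            else if (vlen == 9 && strncmp(val, "writeback", 9) == 0)
                mode = DATA_WRITEBACK;
            else
                return DATA_INVALID;
        }

        p += len;
        if (*p == ',')
            p++;                        /* step over the separator */
    }
    return mode;
}

/* Only writeback mode leaves file contents unprotected across a crash. */
int may_contain_garbage_after_crash(enum data_mode mode)
{
    return mode == DATA_WRITEBACK;
}
```

With this sketch, an option string with no data= entry comes back as DATA_ORDERED, and only DATA_WRITEBACK is flagged as able to leave random stuff in files after a crash.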

Ordered data mode is really nice. It's not magical though - for example,
if you reset the machine during a kernel build, a subsequent `make' will
fail because you have a number of .o files which have zero length.
That's the length they happened to have when the machine went down.

For ordered-data mode we need to keep track of all the buffers which
are associated with a transaction's journalled metadata and ensure that
they are written out before the transaction commits. That is done with
a little structure which hangs off ->b_private.
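
As a rough illustration of that bookkeeping, here is a toy userspace model. It is not the ext3/JBD code: struct buffer stands in for the real buffer head, and struct data_track, struct transaction, track_data_buffer() and commit_transaction() are hypothetical names invented for this sketch. The only point it makes is the ordering: every tracked data buffer is written before the transaction is marked committed.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_DATA_BUFFERS 16

struct transaction;

/* Toy stand-in for a buffer; the real buffer head carries much more. */
struct buffer {
    int dirty;          /* 1 while the contents are not yet on disk */
    void *b_private;    /* points at the tracking record, if any */
};

/* Hypothetical per-buffer record hanging off b_private. */
struct data_track {
    struct transaction *owner;  /* transaction that must wait for this buffer */
};

struct transaction {
    struct buffer *data[MAX_DATA_BUFFERS];  /* data buffers to flush first */
    size_t ndata;
    int committed;
};

/* Attach a dirty data buffer to the transaction whose metadata refers to it. */
void track_data_buffer(struct transaction *tx, struct buffer *bh,
                       struct data_track *rec)
{
    assert(tx->ndata < MAX_DATA_BUFFERS);
    rec->owner = tx;
    bh->b_private = rec;
    bh->dirty = 1;
    tx->data[tx->ndata++] = bh;
}

/* Pretend to write the buffer to disk. */
static void write_buffer(struct buffer *bh)
{
    bh->dirty = 0;
}

/* Ordered mode: flush every tracked data buffer, then commit. */
void commit_transaction(struct transaction *tx)
{
    for (size_t i = 0; i < tx->ndata; i++) {
        write_buffer(tx->data[i]);
        tx->data[i]->b_private = NULL;  /* tracking record no longer needed */
    }
    tx->ndata = 0;
    tx->committed = 1;                  /* only reached after all data is out */
}
```

In this model a crash before commit_transaction() finishes leaves the transaction uncommitted, so the metadata never points at data that was not written.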

> So I thought it was ok to at first support O_DIRECT only
> for metadata journaling, doing that should be a three liner as you said
> and that is what I expected.

Yup. Metadata-only journalling is much, much simpler all round.

On Fri, 06 Jul 2001 01:06:53 +1000,
Andrew Morton <[email protected]> wrote:
>Ordered data mode is really nice. It's not magical though - for example,
>if you reset the machine during a kernel build, a subsequent `make' will
>fail because you have a number of .o files which have zero length.

FYI, that particular problem will disappear with the 2.5 Makefiles.
The zero length .o files will still exist, but the post-compile
dependency data (.o.d) will not, so a subsequent make of the kernel
will rebuild the incomplete objects. This is a general workaround for
incomplete kernel objects, independent of the file system type.
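
The rule can be sketched as a small check. This is only an illustration of the idea, not the 2.5 Makefile logic itself: needs_rebuild() is a hypothetical helper, and it assumes the dependency file sits next to the object and is named by appending ".d" to the object path (foo.o gives foo.o.d).

```c
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>

/*
 * Hypothetical helper: returns 1 if the object should be rebuilt.
 * An object counts as complete only when its post-compile dependency
 * file exists, so a zero length .o left behind by a crash is rebuilt.
 */
int needs_rebuild(const char *obj_path)
{
    char dep_path[4096];
    struct stat st;
    int n;

    assert(obj_path != NULL);

    if (stat(obj_path, &st) != 0)
        return 1;               /* no object file at all */

    n = snprintf(dep_path, sizeof dep_path, "%s.d", obj_path);
    if (n < 0 || (size_t)n >= sizeof dep_path)
        return 1;               /* path too long: rebuild to be safe */

    return stat(dep_path, &st) != 0;    /* missing .o.d means incomplete */
}
```

Note that the check never looks at the size of the .o file, which is why it does not depend on how the file system behaves after a crash.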