2009-08-10 20:03:43

by Theodore Ts'o

Subject: [PATCH, RFC] ext3: Update Kconfig description of EXT3_DEFAULTS_TO_ORDERED

The old description for this configuration option was perhaps not
completely balanced in terms of describing the tradeoffs of using a
default of data=writeback vs. data=ordered. Despite the fact that the
old description very strongly recommended disabling this feature, all of
the major distributions have elected to preserve the existing 'legacy'
default, which is a strong hint that it perhaps wasn't telling the
whole story.

This revised description has been vetted by a number of ext3
developers as being better at informing the user about the tradeoffs
of enabling or disabling this configuration feature.

Signed-off-by: "Theodore Ts'o" <[email protected]>
Cc: [email protected]
---
fs/ext3/Kconfig | 32 +++++++++++++++++---------------
1 files changed, 17 insertions(+), 15 deletions(-)

diff --git a/fs/ext3/Kconfig b/fs/ext3/Kconfig
index fb3c1a2..522b154 100644
--- a/fs/ext3/Kconfig
+++ b/fs/ext3/Kconfig
@@ -29,23 +29,25 @@ config EXT3_FS
module will be called ext3.

config EXT3_DEFAULTS_TO_ORDERED
- bool "Default to 'data=ordered' in ext3 (legacy option)"
+ bool "Default to 'data=ordered' in ext3"
depends on EXT3_FS
help
- If a filesystem does not explicitly specify a data ordering
- mode, and the journal capability allowed it, ext3 used to
- historically default to 'data=ordered'.
-
- That was a rather unfortunate choice, because it leads to all
- kinds of latency problems, and the 'data=writeback' mode is more
- appropriate these days.
-
- You should probably always answer 'n' here, and if you really
- want to use 'data=ordered' mode, set it in the filesystem itself
- with 'tune2fs -o journal_data_ordered'.
-
- But if you really want to enable the legacy default, you can do
- so by answering 'y' to this question.
+ The journal mode options for ext3 have different tradeoffs
+ between when data is guaranteed to be on disk and
+ performance. The use of "data=writeback" can cause
+ unwritten data to appear in files after a system crash or
+ power failure, which can be a security issue. However,
+ "data=ordered" mode can also result in major performance
+ problems, including seconds-long delays before an fsync()
+ call returns. For details, see:
+
+ http://ext4.wiki.kernel.org/index.php/Ext3_data_mode_tradeoffs
+
+ If you have been historically happy with ext3's performance,
+ data=ordered mode will be a safe choice and you should
+ answer 'y' here. If you understand the reliability and data
+ privacy issues of data=writeback and are willing to make
+ that trade off, answer 'n'.

config EXT3_FS_XATTR
bool "Ext3 extended attributes"
--
1.6.3.2.1.gb9f7d.dirty
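
Whatever this Kconfig default is set to, the mode a given filesystem actually runs with can still be chosen per filesystem or per mount. The following is a minimal Python sketch for checking it, assuming text in the /proc/mounts format. The helper name `ext3_data_mode` is hypothetical, and the sketch assumes the kernel lists a `data=` option for the mount, which may not hold on every kernel version.

```python
def ext3_data_mode(mounts_text, mount_point):
    """Return the data= mode listed for an ext3 mount, or None if absent."""
    for line in mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts fields: device, mount point, fs type, options, dump, pass
        if len(fields) < 4:
            continue
        if fields[1] != mount_point or fields[2] != "ext3":
            continue
        for opt in fields[3].split(","):
            if opt.startswith("data="):
                return opt[len("data="):]
        return None  # mounted as ext3, but no explicit data= option shown
    return None  # no matching ext3 mount found
```

To use it on a live system, pass the contents of /proc/mounts and a mount point such as "/". A result of None only means that no `data=` option was listed, not that a particular mode is in effect.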


2009-08-10 20:28:23

by Jan Kara

Subject: Re: [PATCH, RFC] ext3: Update Kconfig description of EXT3_DEFAULTS_TO_ORDERED

On Mon 10-08-09 16:03:43, Theodore Ts'o wrote:
> The old description for this configuration option was perhaps not
> completely balanced in terms of describing the tradeoffs of using a
> default of data=writeback vs. data=ordered. Despite the fact that old
> description very strongly recomended disabling this feature, all of
> the major distributions have elected to preserve the existing 'legacy'
> default, which is a strong hint that it perhaps wasn't telling the
> whole story.
>
> This revised description has been vetted by a number of ext3
> developers as being better at informing the user about the tradeoffs
> of enabling or disabling this configuration feature.
Thanks. Merged to my tree. I plan to push it to Linus in the next merge
window.

Honza
> Signed-off-by: "Theodore Ts'o" <[email protected]>
> Cc: [email protected]
> ---
> fs/ext3/Kconfig | 32 +++++++++++++++++---------------
> 1 files changed, 17 insertions(+), 15 deletions(-)
>
> diff --git a/fs/ext3/Kconfig b/fs/ext3/Kconfig
> index fb3c1a2..522b154 100644
> --- a/fs/ext3/Kconfig
> +++ b/fs/ext3/Kconfig
> @@ -29,23 +29,25 @@ config EXT3_FS
> module will be called ext3.
>
> config EXT3_DEFAULTS_TO_ORDERED
> - bool "Default to 'data=ordered' in ext3 (legacy option)"
> + bool "Default to 'data=ordered' in ext3"
> depends on EXT3_FS
> help
> - If a filesystem does not explicitly specify a data ordering
> - mode, and the journal capability allowed it, ext3 used to
> - historically default to 'data=ordered'.
> -
> - That was a rather unfortunate choice, because it leads to all
> - kinds of latency problems, and the 'data=writeback' mode is more
> - appropriate these days.
> -
> - You should probably always answer 'n' here, and if you really
> - want to use 'data=ordered' mode, set it in the filesystem itself
> - with 'tune2fs -o journal_data_ordered'.
> -
> - But if you really want to enable the legacy default, you can do
> - so by answering 'y' to this question.
> + The journal mode options for ext3 have different tradeoffs
> + between when data is guaranteed to be on disk and
> + performance. The use of "data=writeback" can cause
> + unwritten data to appear in files after an system crash or
> + power failure, which can be a security issue. However,
> + "data=ordered" mode can also result in major performance
> + problems, including seconds-long delays before an fsync()
> + call returns. For details, see:
> +
> + http://ext4.wiki.kernel.org/index.php/Ext3_data_mode_tradeoffs
> +
> + If you have been historically happy with ext3's performance,
> + data=ordered mode will be a safe choice and you should
> + answer 'y' here. If you understand the reliability and data
> + privacy issues of data=writeback and are willing to make
> + that trade off, answer 'n'.
>
> config EXT3_FS_XATTR
> bool "Ext3 extended attributes"
> --
> 1.6.3.2.1.gb9f7d.dirty
>
--
Jan Kara <[email protected]>
SUSE Labs, CR

2009-08-10 20:55:44

by Ric Wheeler

Subject: Re: [PATCH, RFC] ext3: Update Kconfig description of EXT3_DEFAULTS_TO_ORDERED

On 08/10/2009 04:03 PM, Theodore Ts'o wrote:
> The old description for this configuration option was perhaps not
> completely balanced in terms of describing the tradeoffs of using a
> default of data=writeback vs. data=ordered. Despite the fact that old
> description very strongly recomended disabling this feature, all of
> the major distributions have elected to preserve the existing 'legacy'
> default, which is a strong hint that it perhaps wasn't telling the
> whole story.
>
> This revised description has been vetted by a number of ext3
> developers as being better at informing the user about the tradeoffs
> of enabling or disabling this configuration feature.
>
> Signed-off-by: "Theodore Ts'o"<[email protected]>
> Cc: [email protected]
>

Thanks Ted - this is much more informative and will allow for a more
informed trade off to be made,

Ric

> ---
> fs/ext3/Kconfig | 32 +++++++++++++++++---------------
> 1 files changed, 17 insertions(+), 15 deletions(-)
>
> diff --git a/fs/ext3/Kconfig b/fs/ext3/Kconfig
> index fb3c1a2..522b154 100644
> --- a/fs/ext3/Kconfig
> +++ b/fs/ext3/Kconfig
> @@ -29,23 +29,25 @@ config EXT3_FS
> module will be called ext3.
>
> config EXT3_DEFAULTS_TO_ORDERED
> - bool "Default to 'data=ordered' in ext3 (legacy option)"
> + bool "Default to 'data=ordered' in ext3"
> depends on EXT3_FS
> help
> - If a filesystem does not explicitly specify a data ordering
> - mode, and the journal capability allowed it, ext3 used to
> - historically default to 'data=ordered'.
> -
> - That was a rather unfortunate choice, because it leads to all
> - kinds of latency problems, and the 'data=writeback' mode is more
> - appropriate these days.
> -
> - You should probably always answer 'n' here, and if you really
> - want to use 'data=ordered' mode, set it in the filesystem itself
> - with 'tune2fs -o journal_data_ordered'.
> -
> - But if you really want to enable the legacy default, you can do
> - so by answering 'y' to this question.
> + The journal mode options for ext3 have different tradeoffs
> + between when data is guaranteed to be on disk and
> + performance. The use of "data=writeback" can cause
> + unwritten data to appear in files after an system crash or
> + power failure, which can be a security issue. However,
> + "data=ordered" mode can also result in major performance
> + problems, including seconds-long delays before an fsync()
> + call returns. For details, see:
> +
> + http://ext4.wiki.kernel.org/index.php/Ext3_data_mode_tradeoffs
> +
> + If you have been historically happy with ext3's performance,
> + data=ordered mode will be a safe choice and you should
> + answer 'y' here. If you understand the reliability and data
> + privacy issues of data=writeback and are willing to make
> + that trade off, answer 'n'.
>
> config EXT3_FS_XATTR
> bool "Ext3 extended attributes"
>


2009-08-11 09:33:16

by Jan Kara

Subject: Re: [PATCH, RFC] ext3: Update Kconfig description of EXT3_DEFAULTS_TO_ORDERED

On Tue 11-08-09 06:49:20, Al Boldi wrote:
> Theodore Ts'o wrote:
> > + "data=ordered" mode can also result in major performance
> > + problems, including seconds-long delays before an fsync()
> > + call returns. For details, see:
> > +
> > + http://ext4.wiki.kernel.org/index.php/Ext3_data_mode_tradeoffs
>
> Why isn't the fsync problem fixable?
Because it's quite deep in the design of JBD: All the modifications done
to a filesystem go into one transaction. When the transaction grows big
enough or old enough, we commit the transaction, which means we write all
the metadata to the journal and all the ordered data to their final
location on disk. If you do fsync(), you have to wait for the commit of
the transaction containing your data to finish, so that a consistent state
of metadata is guaranteed to be on disk. But when there is heavy background writing,
it means there's a lot of data you have to write out and wait for... It's
not easy to work around this - naively, you might want to separate out just
the writes you care about for fsync() but that's not easily possible
because bitmaps and group descriptors are modified by other writes as well.

Honza
--
Jan Kara <[email protected]>
SUSE Labs, CR
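
The effect described in the message above can be illustrated with a toy model. This is only a sketch under simplifying assumptions, not JBD code: the class `ToyJournal` and its methods are hypothetical, block counts stand in for real I/O, and the only property modelled is that all writers share a single running transaction.

```python
class ToyJournal:
    """Toy model: every modification joins the single running transaction."""

    def __init__(self):
        self.running = []   # (owner, ordered_data_blocks) in the running transaction
        self.committed = 0  # number of transactions committed so far

    def write(self, owner, data_blocks):
        # All writers share one transaction; there is no per-file transaction.
        self.running.append((owner, data_blocks))

    def commit(self):
        # Ordered mode: all ordered data attached to the transaction is
        # written out as part of the commit, regardless of who dirtied it.
        blocks_written = sum(blocks for _, blocks in self.running)
        self.running = []
        self.committed += 1
        return blocks_written

    def fsync(self, owner):
        # fsync() waits for the commit of the transaction holding the
        # caller's changes, which also carries everybody else's data.
        if any(o == owner for o, _ in self.running):
            return self.commit()
        return 0  # nothing of ours is pending


journal = ToyJournal()
journal.write("background-copy", 50000)  # heavy background writer
journal.write("editor", 1)               # the small file we care about
blocks_waited_for = journal.fsync("editor")
```

In this model the fsync of a one-block file ends up waiting for 50001 blocks, because the background writer's data sits in the same transaction.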

2009-08-11 11:41:01

by Al Boldi

Subject: Re: [PATCH, RFC] ext3: Update Kconfig description of EXT3_DEFAULTS_TO_ORDERED

Jan Kara wrote:
> On Tue 11-08-09 06:49:20, Al Boldi wrote:
> > Theodore Ts'o wrote:
> > > + "data=ordered" mode can also result in major performance
> > > + problems, including seconds-long delays before an fsync()
> > > + call returns. For details, see:
> > > +
> > > + http://ext4.wiki.kernel.org/index.php/Ext3_data_mode_tradeoffs
> >
> > Why isn't the fsync problem fixable?
>
> Because it's quite deep in the design of JBD: All the modifications done
> to a filesystem go to one transactions. When the transaction grows big
> enough or old enough, we commit the transaction, which means we write all
> the metadata to the journal and all the ordered data to their final
> location on disk. If you do fsync(), you have to wait for a transaction
> commit with your data to finish, so that you are guaranteed a consistent
> state of metadata is on disk. But when there is heavy background writing,
> it means there's a lot of data you have to write out and wait for... It's
> not easy to work around this - naively, you might want to separate out just
> the writes you care about for fsync() but that's not easily possible
> because bitmaps and group descriptors are modified by other writes as well.

Ok, I remember now, that was the konqueror deadlocks problem. I think making
the fsync soft in that case would yield a better result than turning
ordered-mode off completely.

BTW: did you get around to fixing the ordered-mode redundant write out problem?


Thanks!

--
Al

2009-08-11 12:49:01

by Al Boldi

Subject: Re: [PATCH, RFC] ext3: Update Kconfig description of EXT3_DEFAULTS_TO_ORDERED

Theodore Ts'o wrote:
> + "data=ordered" mode can also result in major performance
> + problems, including seconds-long delays before an fsync()
> + call returns. For details, see:
> +
> + http://ext4.wiki.kernel.org/index.php/Ext3_data_mode_tradeoffs

Why isn't the fsync problem fixable?


Thanks!

--
Al


2009-08-11 13:35:39

by Frans Pop

Subject: What happened to data=guarded? (was: ext3: Update Kconfig description of EXT3_DEFAULTS_TO_ORDERED)

> The old description for this configuration option was perhaps not
> completely balanced in terms of describing the tradeoffs of using a
> default of data=writeback vs. data=ordered.

Somewhat unrelated, but what happened to the data=guarded patches Chris
Mason proposed back in April?

Cheers,
FJP

2009-08-11 13:37:30

by Chris Mason

Subject: Re: What happened to data=guarded? (was: ext3: Update Kconfig description of EXT3_DEFAULTS_TO_ORDERED)

On Tue, Aug 11, 2009 at 03:35:36PM +0200, Frans Pop wrote:
> > The old description for this configuration option was perhaps not
> > completely balanced in terms of describing the tradeoffs of using a
> > default of data=writeback vs. data=ordered.
>
> Somewhat unrelated, but what happened to the data=guarded patches Chris
> Mason proposed back in April?

I missed 2.6.31 but plan on sending for 2.6.32. I promised to send
along a forward port of the patches a while back, but I finally have one
in testing here. It should go out shortly.

-chris


2009-08-11 14:54:36

by Frans Pop

Subject: Re: What happened to data=guarded?

On Tuesday 11 August 2009, Chris Mason wrote:
> On Tue, Aug 11, 2009 at 03:35:36PM +0200, Frans Pop wrote:
> > Somewhat unrelated, but what happened to the data=guarded patches
> > Chris Mason proposed back in April?
>
> I missed 2.6.31 but plan on sending for 2.6.32. I promised to send
> along a forward port of the patches a while back, but I finally have
> one in testing here. It should go out shortly.

Good to hear. I've so far stayed with data=ordered as I think I'd prefer
data=guarded over data=writeback. I'll certainly give it a try when it's
available.

Thanks,
FJP

2009-08-11 15:29:14

by Andi Kleen

Subject: Re: What happened to data=guarded?

Frans Pop <[email protected]> writes:

> On Tuesday 11 August 2009, Chris Mason wrote:
>> On Tue, Aug 11, 2009 at 03:35:36PM +0200, Frans Pop wrote:
>> > Somewhat unrelated, but what happened to the data=guarded patches
>> > Chris Mason proposed back in April?
>>
>> I missed 2.6.31 but plan on sending for 2.6.32. I promised to send
>> along a forward port of the patches a while back, but I finally have
>> one in testing here. It should go out shortly.
>
> Good to hear. I've so far stayed with data=ordered as I think I'd prefer
> data=guarded over data=writeback. I'll certainly give it a try when it's
> available.

Same here. data=writeback already cost me a few files after crashes here :/

-Andi

--
[email protected] -- Speaking for myself only.

2009-08-11 15:34:24

by Jan Kara

Subject: Re: What happened to data=guarded?

> Frans Pop <[email protected]> writes:
>
> > On Tuesday 11 August 2009, Chris Mason wrote:
> >> On Tue, Aug 11, 2009 at 03:35:36PM +0200, Frans Pop wrote:
> >> > Somewhat unrelated, but what happened to the data=guarded patches
> >> > Chris Mason proposed back in April?
> >>
> >> I missed 2.6.31 but plan on sending for 2.6.32. I promised to send
> >> along a forward port of the patches a while back, but I finally have
> >> one in testing here. It should go out shortly.
> >
> > Good to hear. I've so far stayed with data=ordered as I think I'd prefer
> > data=guarded over data=writeback. I'll certainly give it a try when it's
> > available.
>
> Same here. data=writeback already cost me a few files after crashes here :/
In this regard, data=guarded need not be better than data=writeback.
We push out the data in guarded mode as late as in writeback mode
(that's where the performance benefit comes from ;). The difference is
that we increase i_size only after data are safely on disk so we cannot
expose old data.
So security-wise, guarded mode is as safe as ordered mode but in other
aspects it's more like data=writeback.

Honza
--
Jan Kara <[email protected]>
SuSE CR Labs
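
The distinction drawn in the message above can be sketched as a toy model of an append that is interrupted by a crash. The function name `visible_after_crash` is hypothetical, the writeback branch models the worst case in which the metadata was committed before the data was written, and nothing here reflects the actual data=guarded patches.

```python
def visible_after_crash(mode, new_blocks_on_disk, blocks_appended):
    """Toy model of an append interrupted by a crash.

    Returns (file_size_in_blocks, stale_blocks_exposed) as seen after the crash.
    """
    if mode == "writeback":
        # Worst case: metadata (including i_size) was committed before the
        # data was written, so the file covers blocks with old disk contents.
        size = blocks_appended
        stale = blocks_appended - new_blocks_on_disk
    elif mode == "guarded":
        # i_size grows only after the data is safely on disk.
        size = new_blocks_on_disk
        stale = 0
    else:
        raise ValueError("unknown mode: %r" % mode)
    return size, stale
```

For a four-block append where none of the data reached the disk, the model gives a four-block file exposing four stale blocks in writeback mode, and an empty file exposing nothing in guarded mode. In both cases the newly written data is gone.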

2009-08-11 18:05:09

by Eric Sandeen

Subject: Re: What happened to data=guarded?

Jan Kara wrote:
>> Frans Pop <[email protected]> writes:
>>
>>> On Tuesday 11 August 2009, Chris Mason wrote:
>>>> On Tue, Aug 11, 2009 at 03:35:36PM +0200, Frans Pop wrote:
>>>>> Somewhat unrelated, but what happened to the data=guarded patches
>>>>> Chris Mason proposed back in April?
>>>> I missed 2.6.31 but plan on sending for 2.6.32. I promised to send
>>>> along a forward port of the patches a while back, but I finally have
>>>> one in testing here. It should go out shortly.
>>> Good to hear. I've so far stayed with data=ordered as I think I'd prefer
>>> data=guarded over data=writeback. I'll certainly give it a try when it's
>>> available.
>> Same here. data=writeback already cost me a few files after crashes here :/
> In this regard, data=guarded need not be better than data=writeback.
> We push out the data in guarded mode as late as in writeback mode
> (that's where the performance benefit comes from ;). The difference is
> that we increase i_size only after data are safely on disk so we cannot
> expose old data.
> So security-wise, guarded mode is as safe as ordered mode but in other
> aspects its more like data=writeback.

Yes, I think the people anxiously waiting for data=guarded may be sadly
surprised at their 0-length files.

For those who understand the data=writeback tradeoffs it'll be very
useful in terms of more consistent results (easily-detectable 0-size or
short files, vs. randomly corrupted data sprinkled around) but it's not
going to be "data=ordered, but faster!"

-Eric

> Honza


2009-08-11 18:57:13

by Theodore Ts'o

Subject: Re: What happened to data=guarded?

On Tue, Aug 11, 2009 at 05:29:14PM +0200, Andi Kleen wrote:
> > Good to hear. I've so far stayed with data=ordered as I think I'd prefer
> > data=guarded over data=writeback. I'll certainly give it a try when it's
> > available.
>
> Same here. data=writeback already cost me a few files after crashes here :/

What sort of files were you losing? I don't know if we can improve
the implied flush heuristics, but we should at least try to see if we
can do something about it.

- Ted

2009-08-11 19:09:38

by Andi Kleen

Subject: Re: What happened to data=guarded?

On Tue, Aug 11, 2009 at 02:57:03PM -0400, Theodore Tso wrote:
> On Tue, Aug 11, 2009 at 05:29:14PM +0200, Andi Kleen wrote:
> > > Good to hear. I've so far stayed with data=ordered as I think I'd prefer
> > > data=guarded over data=writeback. I'll certainly give it a try when it's
> > > available.
> >
> > Same here. data=writeback already cost me a few files after crashes here :/
>
> What sort of files were you losing? I don't know if we can improve
> the implied flush hueristics, but we should at least try to see if we
> do something about it.

The common case is something patched or checked out with git shortly
(but not as short as seconds) before the crash.

-Andi

--
[email protected] -- Speaking for myself only.

2009-08-11 21:02:16

by Pavel Machek

Subject: Re: What happened to data=guarded?

On Tue 2009-08-11 14:57:03, Theodore Tso wrote:
> On Tue, Aug 11, 2009 at 05:29:14PM +0200, Andi Kleen wrote:
> > > Good to hear. I've so far stayed with data=ordered as I think I'd prefer
> > > data=guarded over data=writeback. I'll certainly give it a try when it's
> > > available.
> >
> > Same here. data=writeback already cost me a few files after crashes here :/
>
> What sort of files were you losing? I don't know if we can improve
> the implied flush hueristics, but we should at least try to see if we
> do something about it.

IIRC... the flush heuristics invoke async flush, so you can still lose
data if you are unlucky, no?
Pavel

--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
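
An application that cannot afford to depend on implied, asynchronous flushing can request the flush explicitly. A minimal Python sketch follows; the helper name `write_durably` is hypothetical, and as discussed earlier in the thread, the fsync() call may take seconds in data=ordered mode when there is heavy background writing.

```python
import os


def write_durably(path, data):
    """Write bytes to path and fsync() before returning."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()             # push Python's userspace buffer to the kernel
        os.fsync(f.fileno())  # block until the kernel reports the data as written out
```

This only covers the file's contents. Whether a newly created directory entry is also durable is a separate question that the sketch does not address.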

2009-08-12 20:37:57

by Jan Kara

Subject: Re: What happened to data=guarded?

On Mon 10-08-09 18:42:05, Pavel Machek wrote:
> On Tue 2009-08-11 14:57:03, Theodore Tso wrote:
> > On Tue, Aug 11, 2009 at 05:29:14PM +0200, Andi Kleen wrote:
> > > > Good to hear. I've so far stayed with data=ordered as I think I'd prefer
> > > > data=guarded over data=writeback. I'll certainly give it a try when it's
> > > > available.
> > >
> > > Same here. data=writeback already cost me a few files after crashes here :/
> >
> > What sort of files were you losing? I don't know if we can improve
> > the implied flush hueristics, but we should at least try to see if we
> > do something about it.
>
> IIRC... the flush heuristics invoke async flush, so you can still lose
> data if you are unlucky, no?
Of course you can, but it can happen in data=ordered mode as well (if the
machine crashes before the transaction is committed). The perceived
difference is in the fact that kjournald starts its commit every 5 seconds
while pdflush starts writeback every 30-35 seconds. So if you use
data=guarded/writeback mode and set dirty_expire_centisecs to 500, the
experience wrt. data loss is going to be similar to data=ordered mode.
fsync with heavy background writers won't be as painful as in data=ordered
mode, but apart from that the performance will probably be comparable.

Honza
--
Jan Kara <[email protected]>
SUSE Labs, CR
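
The tunable named in the message above is expressed in hundredths of a second, so the suggested value of 500 corresponds to the 5-second kjournald commit interval. The following minimal Python sketch does the conversion and reads the current setting; the helper names are hypothetical, and the procfs path is given on the assumption that the sysctl is exposed at its usual location.

```python
def centisecs_to_seconds(centisecs):
    """Convert a value in hundredths of a second to seconds."""
    return centisecs / 100.0


def read_dirty_expire_seconds(path="/proc/sys/vm/dirty_expire_centisecs"):
    """Read the current dirty-data expiry from procfs and return it in seconds."""
    with open(path) as f:
        return centisecs_to_seconds(int(f.read().strip()))
```

Reading the value needs no special privileges. Changing it requires root, and the new value applies to all filesystems on the machine, not just ext3.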