Hi,
I'm not completely sure this is interesting for enough people but maybe
it is...
As you well know, there are three independent code bases in the kernel
implementing ext-based filesystems - ext2, ext3, and ext4. Of course it
costs some effort to maintain them all in reasonably good condition, so
once in a while someone comes and proposes we should drop ext2, ext3,
or both. So I'd like to gather input on what people think about this - should
we ever drop the ext2 / ext3 codebases? If yes, under what conditions do we
deem it OK to drop them?
To give some facts:
Feature-wise, ext4 should now be almost a superset of both ext2 and
ext3. ext4 has a nojournal mode to simulate ext2. Looking at the code, the
only things I don't see in ext4 are XIP support and arguably nobh mode, but
I personally feel that these days the complication in the code isn't worth
it. As far as I know, mounting an ext2/ext3 filesystem writeably with ext4
should be backward compatible (i.e., no incompatible features should be
turned on magically).
On the other hand there are differences noticeable under some conditions -
e.g. delayed allocation, data=ordered mode of ext3 gives better data
integrity than that of ext4 in practice (it's just a side effect we never
promised but app developers somehow got used to it ;), different allocation
decisions, and I believe there are more of these subtle differences.
Then of course there is the factor of the codebase itself: ext2 - ~9k
lines, ext3+JBD - 24k lines, ext4+JBD2 - 43k lines. The ext2 codebase is so
simple that it sometimes serves as a "model filesystem". But arguably it
also bitrots slowly, so copy-and-pasting from ext2 need not be a clever idea
anymore.
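To put the line counts above in proportion, here is a minimal Python sketch. It uses only the approximate figures quoted above (in thousands of lines); the names KLOC and share are made up for this illustration, and the result is only as accurate as those rough numbers.

```python
# Approximate sizes quoted above, in thousands of lines (rough figures).
KLOC = {"ext2": 9, "ext3+JBD": 24, "ext4+JBD2": 43}


def share(names, kloc=KLOC):
    """Return the percentage of all counted lines that the named codebases make up."""
    return 100.0 * sum(kloc[name] for name in names) / sum(kloc.values())


if __name__ == "__main__":
    for name, size in KLOC.items():
        print(f"{name:10s} {size:3d}k lines  {share([name]):5.1f}%")
    legacy = ["ext2", "ext3+JBD"]
    print(f"ext2 and ext3+JBD together: {share(legacy):.1f}% of the total")
```

By these figures the two older codebases together come to 33k of roughly 76k lines.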
Honza
--
Jan Kara <[email protected]>
SUSE Labs, CR
On 2/3/11 8:40 AM, Jan Kara wrote:
> Hi,
>
> I'm not completely sure this is interesting for enough people but maybe
> it is...
>
> As you well know, there are three independent code bases in kernel
> implementing ext-based filesystems - ext2, ext3, and ext4. Of course it
> costs some effort to maintain them all in a reasonably good condition so
> once in a while someone comes and proposes we should drop one of ext2, ext3
> or both. So I'd like to gather input what people think about this - should
> we ever drop ext2 / ext3 codebases? If yes, under what condition do we deem
> it is OK to drop it?
>
> To give some facts:
> Feature-wise, ext4 should now be almost a superset of both ext2 and
> ext3. ext4 has nojournal mode to simulate ext2, looking at the code I only
> don't see XIP support in ext4, arguably also nobh-mode but I personally
> feel that these days the complication in the code isn't worth it. As far as
> I know it should be backward compatible to writeably mount ext2/ext3
> filesystem with ext4 (i.e., no incompatible features should be turned on
> magically).
>
> On the other hand there are differences noticeable under some conditions -
> e.g. delayed allocation, data=ordered mode of ext3 gives better data
> integrity than that of ext4 in practice (it's just a side effect we never
> promised but app developers somehow got used to it ;), different allocation
> decisions, and I believe there are more of these subtle differences.
I think that ext4 with nodelalloc should mostly mimic ext3 in those
cases, no?
> Then of course there is the factor of the codebase itself: Ext2 - ~9k
> lines, Ext3+JBD - 24k lines, Ext4+JBD2 - 43k lines. Ext2 codebase is so
> simple that it sometimes serves as a "model filesystem". But arguably it
> also bitrots slowly so copy-and-pasting from ext2 need not be clever idea
> anymore.
Yep at one point it was asserted that ext2 was a model filesystem and should
therefore be kept around, but I agree with you that it may not really
serve that purpose too well.
While I'm no fan of having 3 kinda-similar codebases that must be maintained,
my concerns would be:
1) ext4 is still in active development, and may introduce instabilities
that ext3 would otherwise avoid.
2) ext4's more, um ... unique option combinations probably get next to
no testing in the real world. So while we can say that noextent,
nodelalloc is mostly like ext3, in practice, does that ever really
get much testing?
If we can have a real plan for moving in this direction though, I'd
support it. I'm just not sure how we get enough real testing under
our belts to be comfortable with dropping ext[23], especially as
most distros now default to ext4 anyway.
-Eric
> Honza
On Thu, Feb 3, 2011 at 7:08 AM, Eric Sandeen <[email protected]> wrote:
> If we can have a real plan for moving in this direction though, I'd
> support it. I'm just not sure how we get enough real testing under
> our belts to be comfortable with dropping ext[23], especially as
> most distros now default to ext4 anyway.
Eric what sort of testing are you looking for?
I admit I like having ext2 around for comparisons in bug situations.
It really helps to isolate the problem area. How painful is the
upkeep?
mrubin
On 2/3/11 1:32 PM, Michael Rubin wrote:
> On Thu, Feb 3, 2011 at 7:08 AM, Eric Sandeen <[email protected]> wrote:
>> If we can have a real plan for moving in this direction though, I'd
>> support it. I'm just not sure how we get enough real testing under
>> our belts to be comfortable with dropping ext[23], especially as
>> most distros now default to ext4 anyway.
>
> Eric what sort of testing are you looking for?
Anything, the more formal or more widespread the better.
I just don't think it's used much this way today...
We can start with xfstests etc but I'd be more concerned about
unexpected behavioral or performance changes.
> I admit I like having ext2 around for comparisons in bug situations.
> It really helps to isolate the problem area. How painful is the
> upkeep?
Since ext4 was merged, there have been about 450 commits to ext2 & ext3 files.
Since 2.6.32, about 150 commits.
Translating that into pain units, I dunno. In distro-land, I often
have bugfixes that need to hit 2 or 3 of the filesystems as well.
-Eric
> mrubin
> --
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Feb 3, 2011 at 9:49 PM, Eric Sandeen <[email protected]> wrote:
>
> On 2/3/11 1:32 PM, Michael Rubin wrote:
> > On Thu, Feb 3, 2011 at 7:08 AM, Eric Sandeen <[email protected]> wrote:
> >> If we can have a real plan for moving in this direction though, I'd
> >> support it. I'm just not sure how we get enough real testing under
> >> our belts to be comfortable with dropping ext[23], especially as
> >> most distros now default to ext4 anyway.
> >
> > Eric what sort of testing are you looking for?
>
> Anything, the more formal or more widespread the better.
>
> I just don't think it's used much this way today...
>
> We can start with xfstests etc but I'd be more concerned about
> unexpected behavioral or performance changes.
>
> > I admit I like having ext2 around for comparisons in bug situations.
> > It really helps to isolate the problem area. How painful is the
> > upkeep?
>
> since ext4 was merged, about 450 commits to ext2 & ext3 files.
>
> since 2.6.32, about 150 commits.
>
Can you give a rough estimate of how those commits break down into
bugfixes, kernel API changes, and code cleanups?
Next3 has been following ext3 since 2.6.31 and I remember changes of
the latter two kinds, but not many major bugfixes.
I hardly think we can get away with throwing out the ext3 code base, but
maybe it can go into bugfixes-only mode? That is, unless Jan likes to
apply cleanups ;-)
Amir.
> Translating that into pain units, I dunno. In distro-land, I often
> have bugfixes that need to hit 2 or 3 of the filesystems as well.
>
> -Eric
>
> > mrubin
On 2/3/11 3:57 PM, Amir Goldstein wrote:
> Can you give a rough estimate of how those commits diverge between
> bugfixes, kernel API changes, code cleanups?
Um, maybe by the time LSF rolls around ;) It'd take a while to sift
through.
> Next3 has been following ext3 since 2.6.31 and I remember changes of
> the 2 latter,
> but not many major bugfixes.
>
> I hardly think we can get away with throwing out ext3 code base, but
> maybe it can go
> into bugfixes-only mode? that is unless Jan likes to apply cleanups ;-)
In theory that's what it's been for a couple years already :)
-Eric
> Amir.
>
On Thu, Feb 03, 2011 at 11:32:01AM -0800, Michael Rubin wrote:
> On Thu, Feb 3, 2011 at 7:08 AM, Eric Sandeen <[email protected]> wrote:
> > If we can have a real plan for moving in this direction though, I'd
> > support it. I'm just not sure how we get enough real testing under
> > our belts to be comfortable with dropping ext[23], especially as
> > most distros now default to ext4 anyway.
>
> Eric what sort of testing are you looking for?
The biggest problem in my opinion is that we have a large set of
options, and we don't necessarily test all of them. The configurations
that I normally test are:
* 4k blocksize, with journal, extents
* 1k blocksize, with journal, extents (this helps flush out problems
that show up on architectures with 16k page size and
4k block sizes, i.e., Power PC and Itanium)
* 4k blocksize, no journal
Things that I should also test, but which take a lot longer:
* nodelalloc (and combinatorics, 4k/1k blocksize, journal)
* filesystem with extents disabled (with more combinatorics!)
I'll sometimes do these additional tests, but they're not part of my
regular test sets.
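As a rough illustration of the combinatorics above, here is a minimal Python sketch that enumerates the full matrix and shows how much of it the regular runs leave uncovered. The keys are only labels for this sketch, not real mount options or mkfs flags, and treating the no-journal run as using extents and delayed allocation is an assumption, not something stated above.

```python
from itertools import product

# Labels for this sketch only; they are not real mount options or mkfs flags.
KEYS = ("blocksize", "journal", "extents", "delalloc")
VALUES = (("4k", "1k"), (True, False), (True, False), (True, False))

# The three regularly tested configurations from the list above.
# Assumption: the no-journal run keeps extents and delayed allocation enabled.
REGULAR = [
    {"blocksize": "4k", "journal": True, "extents": True, "delalloc": True},
    {"blocksize": "1k", "journal": True, "extents": True, "delalloc": True},
    {"blocksize": "4k", "journal": False, "extents": True, "delalloc": True},
]


def full_matrix():
    """Return every combination of the four options as a list of dicts."""
    return [dict(zip(KEYS, combo)) for combo in product(*VALUES)]


def untested(matrix, regular):
    """Return the configurations that are not part of the regular runs."""
    return [cfg for cfg in matrix if cfg not in regular]


if __name__ == "__main__":
    matrix = full_matrix()
    remaining = untested(matrix, REGULAR)
    print(f"{len(matrix)} combinations, {len(remaining)} outside the regular runs")
    for cfg in remaining:
        print(cfg)
```

Even with only four two-valued options, most of the matrix falls outside the regular runs, and that is before counting other mount options.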
- Ted
On Thu 03-02-11 09:08:18, Eric Sandeen wrote:
> On 2/3/11 8:40 AM, Jan Kara wrote:
> > As you well know, there are three independent code bases in kernel
> > implementing ext-based filesystems - ext2, ext3, and ext4. Of course it
> > costs some effort to maintain them all in a reasonably good condition so
> > once in a while someone comes and proposes we should drop one of ext2, ext3
> > or both. So I'd like to gather input what people think about this - should
> > we ever drop ext2 / ext3 codebases? If yes, under what condition do we deem
> > it is OK to drop it?
> >
> > To give some facts:
> > Feature-wise, ext4 should now be almost a superset of both ext2 and
> > ext3. ext4 has nojournal mode to simulate ext2, looking at the code I only
> > don't see XIP support in ext4, arguably also nobh-mode but I personally
> > feel that these days the complication in the code isn't worth it. As far as
> > I know it should be backward compatible to writeably mount ext2/ext3
> > filesystem with ext4 (i.e., no incompatible features should be turned on
> > magically).
> >
> > On the other hand there are differences noticeable under some conditions -
> > e.g. delayed allocation, data=ordered mode of ext3 gives better data
> > integrity than that of ext4 in practice (it's just a side effect we never
> > promised but app developers somehow got used to it ;), different allocation
> > decisions, and I believe there are more of these subtle differences.
>
> I think that ext4 with nodelalloc should mostly mimic ext3 in those
> cases, no?
Yeah, mostly. The biggest obstacle I see here is the different behavior
of mmap - with nodelalloc allocation happens at the time of page fault and
that fragments the file like hell for some kinds of load. Since ext3 here
essentially does delayed allocation, it might be useful to do delayed
allocation only from page fault path when we try to mimic ext3 behavior.
So mimicking ext3 is possible but needs some tweaks...
> > Then of course there is the factor of the codebase itself: Ext2 - ~9k
> > lines, Ext3+JBD - 24k lines, Ext4+JBD2 - 43k lines. Ext2 codebase is so
> > simple that it sometimes serves as a "model filesystem". But arguably it
> > also bitrots slowly so copy-and-pasting from ext2 need not be clever idea
> > anymore.
>
> Yep at one point it was asserted that ext2 was a model filesystem and should
> therefore be kept around, but I agree with you that it may not really
> serve that purpose too well.
>
> While I'm no fan of having 3 kinda-similar codebases that must be maintained,
> my concerns would be:
>
> 1) ext4 is still in active development, and may introduce instabilities
> that ext3 would otherwise avoid.
Sure but since ext4 is now pushed in RHEL, Fedora, openSUSE, Ubuntu, we
should be already really careful not to break stuff. I agree there is
higher potential for bugs in ext4 but sometime it should be good enough I
hope ;). And it's exactly this "sometime" which I'd like to get some
consensus on.
> 2) ext4's more, um ... unique option combinations probably get next to
> no testing in the real world. So while we can say that noextent,
> nodelalloc is mostly like ext3, in practice, does that ever really
> get much testing?
Yes. We definitely cannot remove the old codebase until the equivalent paths
in ext4 are exercised regularly and hard. So I agree there is definitely
lots of testing ahead if we decide to move towards removing old code.
> If we can have a real plan for moving in this direction though, I'd
> support it. I'm just not sure how we get enough real testing under
> our belts to be comfortable with dropping ext[23], especially as
> most distros now default to ext4 anyway.
Well, I believe this actually works for us. If the real users move to
ext4 (or a different fs), then it's easier to make ext[23] mode in ext4
good enough for the few legacy users...
Honza
--
Jan Kara <[email protected]>
SUSE Labs, CR
On Thu 03-02-11 11:32:01, Michael Rubin wrote:
> On Thu, Feb 3, 2011 at 7:08 AM, Eric Sandeen <[email protected]> wrote:
> > If we can have a real plan for moving in this direction though, I'd
> > support it. I'm just not sure how we get enough real testing under
> > our belts to be comfortable with dropping ext[23], especially as
> > most distros now default to ext4 anyway.
>
> Eric what sort of testing are you looking for?
I believe Ted wrote a good summary of what combinations of options would
need to be tested on a regular basis to get at least some confidence that
the switch could work.
> I admit I like having ext2 around for comparisons in bug situations.
> It really helps to isolate the problem area. How painful is the
> upkeep?
Well, for me it's a couple of hours per week on average I'd say. Plus
there is some work other people do when changing some VFS/MM interfaces
influencing all the filesystems.
The time I spend is enough to keep ext3 in good shape, I believe, but I
have a feeling that ext2 is slowly bitrotting. Sometimes when I look at
ext2 code I see stuff we simply do differently these days and that's just
a step away from the code getting broken... It would not be too much work
to clean things up and maintain, but it's work with no clear gain (if you
do the thankless job of maintaining old code, you should at least have
users who appreciate that ;) so naturally no one does it.
Honza
--
Jan Kara <[email protected]>
SUSE Labs, CR
On Thu 03-02-11 23:57:25, Amir Goldstein wrote:
> On Thu, Feb 3, 2011 at 9:49 PM, Eric Sandeen <[email protected]> wrote:
> Can you give a rough estimate of how those commits diverge between
> bugfixes, kernel API changes, code cleanups?
>
> Next3 has been following ext3 since 2.6.31 and I remember changes of
> the 2 latter, but not many major bugfixes.
So I took the time and went through the commit log of ext3 since 2.6.19
(when ext4 was merged). We have
305 commits in total, of which:
62 cleanups
113 bugfixes
105 changes because of API changing or other kernel-wide efforts
25 features, speedups, and similar
The categorization of commits is somewhat arbitrary in some cases but I
think the numbers should be roughly fair...
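For a quick view of the proportions, here is a minimal Python sketch that turns the counts above into percentages. The category labels are shortened for the sketch, and the numbers are simply the ones listed above.

```python
# Counts taken from the breakdown above; labels are shortened for this sketch.
COMMITS = {
    "cleanups": 62,
    "bugfixes": 113,
    "API / kernel-wide changes": 105,
    "features and speedups": 25,
}


def percentages(commits=COMMITS):
    """Return each category's share of all commits, in percent."""
    total = sum(commits.values())
    return {name: 100.0 * count / total for name, count in commits.items()}


if __name__ == "__main__":
    print(f"total: {sum(COMMITS.values())} commits")
    for name, pct in percentages().items():
        print(f"{name:28s} {COMMITS[name]:4d}  {pct:5.1f}%")
```

The four categories add up to the 305 total, and features and speedups come to a bit over 8% of it.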
> I hardly think we can get away with throwing out ext3 code base, but
> maybe it can go into bugfixes-only mode? that is unless Jan likes to
> apply cleanups ;-)
As you can see, it pretty much is. 25 feature commits in 5 years (and
those features are often like - report mount options in /proc/mounts,
unify error messages, avoid loading bitmap when block group is full, etc.)
is IMHO bugfixes-only mode if you don't want the filesystem to start
bitrotting. I've merged one bigger feature in the last year, and that was FITRIM
support, on the grounds that it did not touch any code outside of the FITRIM ioctl
handling itself. So when Lukas wanted to do the work of implementing it,
I was OK with it.
Sure I could be harder on people pushing cleanups on me but I don't want to
scare newbies away so I try to be nice and if the result actually is
better, I take it.
Honza
--
Jan Kara <[email protected]>
SUSE Labs, CR
On 02/04/2011 08:17 AM, Jan Kara wrote:
> On Thu 03-02-11 11:32:01, Michael Rubin wrote:
>> On Thu, Feb 3, 2011 at 7:08 AM, Eric Sandeen<[email protected]> wrote:
>>> If we can have a real plan for moving in this direction though, I'd
>>> support it. I'm just not sure how we get enough real testing under
>>> our belts to be comfortable with dropping ext[23], especially as
>>> most distros now default to ext4 anyway.
>> Eric what sort of testing are you looking for?
> I believe Ted wrote a good summary of what combinations of options would
> need to be tested on a regular basis to get at least some confidence that
> the switch could work.
>
>> I admit I like having ext2 around for comparisons in bug situations.
>> It really helps to isolate the problem area. How painful is the
>> upkeep?
> Well, for me it's a couple of hours per week on average I'd say. Plus
> there is some work other people do when changing some VFS/MM interfaces
> influencing all the filesystems.
>
> The time I spend is enough to keep ext3 in a good shape I believe but I
> have a feeling that ext2 is slowly bitrotting. Sometime when I look at
> ext2 code I see stuff we simply do differently these days and that's just
> a step away from the code getting broken... It would not be too much work
> to clean things up and maintain but it's a work with no clear gain (if you
> do the thankless job of maintaining old code, you should at least have
> users who appreciate that ;) so naturally no one does it.
>
> Honza
I would definitely be interested in figuring out if and when we can drop one or
both of ext2 and ext3. The number of actively supported file systems to test for
correctness and performance is getting to be a challenge.
Great topic, might require beer though to be done right :)
ric
On Fri, 2011-02-04 at 12:03 -0500, Ric Wheeler wrote:
> On 02/04/2011 08:17 AM, Jan Kara wrote:
> > On Thu 03-02-11 11:32:01, Michael Rubin wrote:
> >> On Thu, Feb 3, 2011 at 7:08 AM, Eric Sandeen<[email protected]> wrote:
> >>> If we can have a real plan for moving in this direction though, I'd
> >>> support it. I'm just not sure how we get enough real testing under
> >>> our belts to be comfortable with dropping ext[23], especially as
> >>> most distros now default to ext4 anyway.
> >> Eric what sort of testing are you looking for?
> > I believe Ted wrote a good summary of what combinations of options would
> > need to be tested on a regular basis to get at least some confidence that
> > the switch could work.
> >
> >> I admit I like having ext2 around for comparisons in bug situations.
> >> It really helps to isolate the problem area. How painful is the
> >> upkeep?
> > Well, for me it's a couple of hours per week on average I'd say. Plus
> > there is some work other people do when changing some VFS/MM interfaces
> > influencing all the filesystems.
> >
> > The time I spend is enough to keep ext3 in a good shape I believe but I
> > have a feeling that ext2 is slowly bitrotting. Sometime when I look at
> > ext2 code I see stuff we simply do differently these days and that's just
> > a step away from the code getting broken... It would not be too much work
> > to clean things up and maintain but it's a work with no clear gain (if you
> > do the thankless job of maintaining old code, you should at least have
> > users who appreciate that ;) so naturally no one does it.
> >
> > Honza
>
> I would definitely be interested in figuring out if and when we can drop one or
> both of ext2 and ext3. The number of actively supported file systems to test for
> correctness and performance is getting to be a challenge.
ext2 yes ... I think there's no way we can drop ext3: it's still a
current default filesystem for most distributions. Now, if we discuss
dropping ext2 and working out an end of life plan for ext3 (for the
feature removals schedule) so we don't eventually get into the same
position with it as we are with ext2, then this sounds like a plan.
> Great topic, might require beer though to be done right :)
I'm invoking the anti-discrimination statutes here on behalf of those of
us who don't like beer.
James
On 2011-02-04, at 6:03, Jan Kara <[email protected]> wrote:
>> I think that ext4 with nodelalloc should mostly mimic ext3 in those
>> cases, no?
> Yeah, mostly. The biggest obstacle I see here is the different behavior
> of mmap - with nodelalloc allocation happens at the time of page fault and
> that fragments the file like hell for some kinds of load. Since ext3 here
> essentially does delayed allocation, it might be useful to do delayed
> allocation only from page fault path when we try to mimic ext3 behavior.
> So mimicking ext3 is possible but needs some tweaks...
The question is whether we need to mimic the runtime behavior or just the on-disk format? Apps already need to deal with ext4 and other fs that do not do ext3 ordered mode.
>> If we can have a real plan for moving in this direction though, I'd
>> support it. I'm just not sure how we get enough real testing under
>> our belts to be comfortable with dropping ext[23], especially as
>> most distros now default to ext4 anyway.
> Well, I believe this actually works for us. If the real users move to
> ext4 (or a different fs), then it's easier to make ext[23] mode in ext4
> good enough for the few legacy users...
I think the best road forward is to make ext4 the default for ext2 and ext3 filesystems in newer kernels, and mark ext2 and ext3 obsolete. This will start to get usage and testing of these other config options. The ext2 mode is already heavily tested at Google, and don't they also test noextent mode on updated filesystems, or were all of the filesystems reformatted with ext4 options?
After some number of kernel releases with ext4 as the default, we could remove the ext2 and ext3 code.
Cheers, Andreas.
On Fri, 2011-02-04 at 11:17 -0600, James Bottomley wrote:
> On Fri, 2011-02-04 at 12:03 -0500, Ric Wheeler wrote:
> > On 02/04/2011 08:17 AM, Jan Kara wrote:
> ext2 yes ... I think there's no way we can drop ext3: it's still a
> current default filesystem for most distributions. Now, if we discuss
> dropping ext2 and working out an end of life plan for ext3 (for the
> feature removals schedule) so we don't eventually get into the same
> position with it as we are with ext2, then this sounds like a plan.
>
> > Great topic, might require beer though to be done right :)
>
> I'm invoking the anti-discrimination statutes here on behalf of those of
> us who don't like beer.
OK. I'm putting this as filesystems track only proposal, then...
--
Trond Myklebust
Linux NFS client maintainer
NetApp
[email protected]
http://www.netapp.com
On Fri 04-02-11 10:36:21, Andreas Dilger wrote:
> On 2011-02-04, at 6:03, Jan Kara <[email protected]> wrote:
> >> I think that ext4 with nodelalloc should mostly mimic ext3 in those
> >> cases, no?
> > Yeah, mostly. The biggest obstacle I see here is the different behavior
> > of mmap - with nodelalloc allocation happens at the time of page fault and
> > that fragments the file like hell for some kinds of load. Since ext3 here
> > essentially does delayed allocation, it might be useful to do delayed
> > allocation only from page fault path when we try to mimic ext3 behavior.
> > So mimicking ext3 is possible but needs some tweaks...
>
> The question is whether we need to mimic the runtime behavior or just the
> on-disk format? Apps already need to deal with ext4 and other fs that do
> not do ext3 ordered mode.
Well written apps do, but badly written apps don't and e.g. our distro
customers don't always have the choice of the application. So as a developer
I see your point (screw stupidly written apps) but in the real world, I'm
afraid it's too hard on users.
> >> If we can have a real plan for moving in this direction though, I'd
> >> support it. I'm just not sure how we get enough real testing under
> >> our belts to be comfortable with dropping ext[23], especially as
> >> most distros now default to ext4 anyway.
> > Well, I believe this actually works for us. If the real users move to
> > ext4 (or a different fs), then it's easier to make ext[23] mode in ext4
> > good enough for the few legacy users...
>
> I think the best road forward is to make ext4 the default for ext2 and
> ext3 filesystems in newer kernels, and mark ext2 and ext3 obsolete. This
> will start to get usage and testing of these other config options. The
> ext2 mode is already heavily tested at Google, and don't they also test
> noextent mode on updated filesystems, or were all of the filesystems
> reformatted with ext4 options?
Yes, I know you are on the relatively radical side ;). My position would be
to test ext4 with reasonable combinations of options as the ext2 driver and if
that works, switch ext2 as you describe. Then if it works fine for a year
or so, we can talk about ext3, but as James said, ext3 is still widely used
so there might be more friction on subtle runtime differences...
Honza
--
Jan Kara <[email protected]>
SUSE Labs, CR
On 2011-02-07, at 08:19, Jan Kara wrote:
> On Fri 04-02-11 10:36:21, Andreas Dilger wrote:
>> The question is whether we need to mimic the runtime behavior or just the
>> on-disk format? Apps already need to deal with ext4 and other fs that do
>> not do ext3 ordered mode.
>
> Well written apps do, but badly written apps don't and e.g. our distro
> customers don't always have the choice of the application. So as a developer
> I see your point (screw stupidly written apps) but in the real world, I'm
> afraid it's too hard on users.
We have to remember that this is only for new kernels, and does not affect older kernels or existing applications, so such users shouldn't be affected.
>> I think the best road forward is to make ext4 the default for ext2 and
>> ext3 filesystems in newer kernels, and mark ext2 and ext3 obsolete. This
>> will start to get usage and testing of these other config options. The
>> ext2 mode is already heavily tested at Google, and don't they also test
>> noextent mode on updated filesystems, or were all of the filesystems
>> reformatted with ext4 options?
>
> Yes, I know you are on relatively radical side ;). My position would be
> to test ext4 for reasonable combinations of options as ext2 driver and if
> that works, switch ext2 as you describe. Then if it works fine for a year
> or so, we can talk about ext3 but as James said, ext3 is still widely used
> so there might be more friction on subtle runtime differences...
Since most new distros use ext4 by default, the point is kind of moot, because those users will get this behaviour in any case. Relatively few users upgrade their kernel on a production system after it is installed, except for errata kernels, and I definitely wouldn't expect such a change to appear in an errata kernel.
Cheers, Andreas
On Fri, 2011-02-04 at 11:17 -0600, James Bottomley wrote:
> On Fri, 2011-02-04 at 12:03 -0500, Ric Wheeler wrote:
> > On 02/04/2011 08:17 AM, Jan Kara wrote:
> > > On Thu 03-02-11 11:32:01, Michael Rubin wrote:
> > >> On Thu, Feb 3, 2011 at 7:08 AM, Eric Sandeen<[email protected]> wrote:
> > >>> If we can have a real plan for moving in this direction though, I'd
> > >>> support it. I'm just not sure how we get enough real testing under
> > >>> our belts to be comfortable with dropping ext[23], especially as
> > >>> most distros now default to ext4 anyway.
> > >> Eric what sort of testing are you looking for?
> > > I believe Ted wrote a good summary of what combinations of options would
> > > need to be tested on a regular basis to get at least some confidence that
> > > the switch could work.
> > >
> > >> I admit I like having ext2 around for comparisons in bug situations.
> > >> It really helps to isolate the problem area. How painful is the
> > >> upkeep?
> > > Well, for me it's a couple of hours per week on average I'd say. Plus
> > > there is some work other people do when changing some VFS/MM interfaces
> > > influencing all the filesystems.
> > >
> > > The time I spend is enough to keep ext3 in a good shape I believe but I
> > > have a feeling that ext2 is slowly bitrotting. Sometime when I look at
> > > ext2 code I see stuff we simply do differently these days and that's just
> > > a step away from the code getting broken... It would not be too much work
> > > to clean things up and maintain but it's a work with no clear gain (if you
> > > do the thankless job of maintaining old code, you should at least have
> > > users who appreciate that ;) so naturally no one does it.
> > >
> > > Honza
> >
> > I would definitely be interested in figuring out if and when we can drop one or
> > both of ext2 and ext3. The number of actively supported file systems to test for
> > correctness and performance is getting to be a challenge.
>
> ext2 yes ... I think there's no way we can drop ext3: it's still a
> current default filesystem for most distributions. Now, if we discuss
> dropping ext2 and working out an end of life plan for ext3 (for the
> feature removals schedule) so we don't eventually get into the same
> position with it as we are with ext2, then this sounds like a plan.
>
I second this. Clearly ext2 is sunsetting, especially given that ext4
already has a no-journal mode. ext3 is still widely used, and although
we have a way to migrate ext3 to ext4, it will still require quite a bit
of brainstorming to figure out what needs to improve in ext4 to handle
ext3 filesystems more smoothly. Having a planning discussion sounds
interesting to me.
Mingming
> > Great topic, might require beer though to be done right :)
>
> I'm invoking the anti-discrimination statutes here on behalf of those of
> us who don't like beer.
>
> James
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon 07-02-11 08:35:31, Andreas Dilger wrote:
> On 2011-02-07, at 08:19, Jan Kara wrote:
> > On Fri 04-02-11 10:36:21, Andreas Dilger wrote:
> >> The question is whether we need to mimic the runtime behavior or just the
> >> on-disk format? Apps already need to deal with ext4 and other fs that do
> >> not do ext3 ordered mode.
> >
> > Well written apps do, but badly written apps don't and e.g. our distro
> > customers don't always have the choice of the application. So as a developer
> > I see your point (screw stupidly written apps) but in the real world, I'm
> > afraid it's too hard on users.
>
> We have to remember that this is only for new kernels, and does not
> affect older kernels or existing applications, so such users shouldn't be
> affected.
Well, customers do upgrade distros and that means they get new kernels,
but they are still bound to use the same app from their ISV, so I do
think there will be users hitting this.
> >> I think the best road forward is to make ext4 the default for ext2 and
> >> ext3 filesystems in newer kernels, and mark ext2 and ext3 obsolete. This
> >> will start to get usage and testing of these other config options. The
> >> ext2 mode is already heavily tested at Google, and don't they also test
> >> noextent mode on updated filesystems, or were all of the filesystems
> >> reformatted with ext4 options?
> >
> > Yes, I know you are on the relatively radical side ;). My position would be
> > to test ext4 for reasonable combinations of options as the ext2 driver and if
> > that works, switch ext2 as you describe. Then if it works fine for a year
> > or so, we can talk about ext3 but as James said, ext3 is still widely used
> > so there might be more friction on subtle runtime differences...
>
> Since most new distros use ext4 by default, the point is kind of moot,
> because those users will get this behaviour in any case. Relatively few
> users upgrade their kernel on a production system after it is installed,
> except for errata kernels, and I definitely wouldn't expect such a change
> to appear in an errata kernel.
Umm, a lot of our customers upgrade even production systems (e.g. SLE10 SP3
-> SLE11 SP1 these days). But still, they keep the old filesystem (they do
not reformat their storage) because they were happy with how it worked. And
yes, they are happy about things that get better but loudly complain about
things that got worse for them.
Of course this does not have a perfect solution (someone will always
complain ;) but putting reasonable effort into making behavior of ext4
in the 'compatibility' mode not too much different from ext3 is IMHO decent
to users.
Honza
--
Jan Kara <[email protected]>
SUSE Labs, CR
Just as an aside, we just upgraded in place a large number of ext2
file systems to ext4. The process completed very smoothly and created
a performance boost for almost every workload we had.
I think keeping ext3 around is really convenient to compare against
ext4 as it becomes more mature, but outside of its academic use I
don't see any good reason to keep it around. With such an easy
migration path for users (mount as ext4 in place) I think an "end of
life" plan should not be that complicated and should be encouraged.
mrubin
On Fri, Feb 4, 2011 at 3:17 PM, Jan Kara <[email protected]> wrote:
>
> On Thu 03-02-11 11:32:01, Michael Rubin wrote:
> > On Thu, Feb 3, 2011 at 7:08 AM, Eric Sandeen <[email protected]> wrote:
> > > If we can have a real plan for moving in this direction though, I'd
> > > support it. I'm just not sure how we get enough real testing under
> > > our belts to be comfortable with dropping ext[23], especially as
> > > most distros now default to ext4 anyway.
> >
> > Eric what sort of testing are you looking for?
> I believe Ted wrote a good summary of what combinations of options would
> need to be tested on a regular basis to get at least some confidence that
> the switch could work.
>
So the problem is that people don't have much incentive to test "ext3
mode" as long as they have, well, ext3.
I can offer an incentive in the form of snapshot support, which may
appeal to some users for whom performance improvements are not a good
enough reason to upgrade their fs.
Most conveniently, ext4 snapshots lack extents and delalloc
support at the moment, but the rest of the code, which was ported from
next3, is ready to be stabilized/cleaned up for submission.
So it can be claimed that pursuing my cause of pushing the snapshots
feature to early testers as soon as possible (i.e. before the extent
move-on-write implementation) may also be beneficial to the cause of
getting "ext3 mode" tested by a larger number of users.
What do you say, Jan? Do you think that some of your upgrading
customers could be lured into using ext4 code if we offer them
snapshots in "ext3 mode"?
Amir.
On Sat 12-02-11 13:05:02, Amir Goldstein wrote:
> On Fri, Feb 4, 2011 at 3:17 PM, Jan Kara <[email protected]> wrote:
> >
> > On Thu 03-02-11 11:32:01, Michael Rubin wrote:
> > > On Thu, Feb 3, 2011 at 7:08 AM, Eric Sandeen <[email protected]> wrote:
> > > > If we can have a real plan for moving in this direction though, I'd
> > > > support it. I'm just not sure how we get enough real testing under
> > > > our belts to be comfortable with dropping ext[23], especially as
> > > > most distros now default to ext4 anyway.
> > >
> > > Eric what sort of testing are you looking for?
> > I believe Ted wrote a good summary of what combinations of options would
> > need to be tested on a regular basis to get at least some confidence that
> > the switch could work.
>
> So the problem is that people don't have much incentive to test "ext3
> mode" as long as they have, well, ext3.
>
> I can offer an incentive in the form of snapshots support, which may
> appeal for some users, to whom performance improvements is not a good
> enough reason to upgrade their fs.
>
> Most conveniently, ext4 snapshots is short of extents and delalloc
> support at the moment, but the rest of the code, which was ported from
> next3 is ready to be stabilized/cleaned up for submission.
>
> So it can be claimed, that pursuing my cause, of pushing the snapshots
> feature for early testers as soon as possible (i.e. before extent
> move-on-write implementation), may also be beneficial to the cause of
> getting "ext3 mode" tested by a larger number of users.
>
> What do you say, Jan. Do you think that some of your upgrading
> customers could be lured into using ext4 code if we offer them
> snapshots in "ext3 mode"?
Well, some people might be interested in snapshotting and might move to
ext4 for that reason but these would be mostly people installing new
systems anyway, not the ones just updating older systems. So I don't feel
this would be a major game changer...
Honza
--
Jan Kara <[email protected]>
SUSE Labs, CR
On Mon, Feb 14, 2011 at 7:25 PM, Jan Kara <[email protected]> wrote:
> On Sat 12-02-11 13:05:02, Amir Goldstein wrote:
>> On Fri, Feb 4, 2011 at 3:17 PM, Jan Kara <[email protected]> wrote:
>> >
>> > On Thu 03-02-11 11:32:01, Michael Rubin wrote:
>> > > On Thu, Feb 3, 2011 at 7:08 AM, Eric Sandeen <[email protected]> wrote:
>> > > > If we can have a real plan for moving in this direction though, I'd
>> > > > support it. I'm just not sure how we get enough real testing under
>> > > > our belts to be comfortable with dropping ext[23], especially as
>> > > > most distros now default to ext4 anyway.
>> > >
>> > > Eric what sort of testing are you looking for?
>> > I believe Ted wrote a good summary of what combinations of options would
>> > need to be tested on a regular basis to get at least some confidence that
>> > the switch could work.
>>
>> So the problem is that people don't have much incentive to test "ext3
>> mode" as long as they have, well, ext3.
>>
>> I can offer an incentive in the form of snapshots support, which may
>> appeal for some users, to whom performance improvements is not a good
>> enough reason to upgrade their fs.
>>
>> Most conveniently, ext4 snapshots is short of extents and delalloc
>> support at the moment, but the rest of the code, which was ported from
>> next3 is ready to be stabilized/cleaned up for submission.
>>
>> So it can be claimed, that pursuing my cause, of pushing the snapshots
>> feature for early testers as soon as possible (i.e. before extent
>> move-on-write implementation), may also be beneficial to the cause of
>> getting "ext3 mode" tested by a larger number of users.
>>
>> What do you say, Jan. Do you think that some of your upgrading
>> customers could be lured into using ext4 code if we offer them
>> snapshots in "ext3 mode"?
> Well, some people might be interested in snapshotting and might move to
> ext4 for that reason but these would be mostly people installing new
> systems anyway, not the ones just updating older systems. So I don't feel
> this would be a major game changer...
>
Yes, of course. Upgraders won't be the ones using snapshots.
My intention was to state that those people installing new systems to test
snapshots would be functioning as testers for "ext3 mode", because:
1. when no snapshots exist it boils down to testing "ext3 mode".
2. it is unlikely that snapshots will mask "ext3 mode" bugs.
So my claim is that "ext3 mode" would benefit from a transition period in which
snapshots and (extents, delalloc) are mutually exclusive in ext4.
Amir.
-lsf-pc, -linux-fsdevel
On Mon, Feb 14, 2011 at 09:00:58PM +0200, Amir Goldstein wrote:
> Yes, of course. Upgraders won't be the ones using snapshots.
> My intention was to state that those people installing new systems to test
> snapshots would be functioning as testers for "ext3 mode", because:
> 1. when no snapshots exist it boils down to testing "ext3 mode".
> 2. it is unlikely that snapshots will mask "ext3 mode" bugs.
>
> So my claim is that "ext3 mode" would benefit from a transition
> period in which snapshots and (extents, delalloc) are mutually
> exclusive in ext4.
Here are the requirements that I think are critical before we do this:
1) We need to solve the testing matrix problem. Right now "ext3 mode"
in ext4 doesn't get enough testing as it is. Part of the solution is
(a) deciding on the modes that need testing, and (b) writing some
shell scripts so that xfstests can be automatically run in all of the
right modes. And then it will be having some number of people
(hopefully not just me) running said tests and reporting failures.
2) The code has to integrate in a fairly seamless and easy way.
mballoc.c is an example of code that still needs a lot of cleanup.
Coly Li has submitted some cleanups, which is great. But I suspect a
lot more is needed.
One thing that comes to mind about your question with the
e4b->alloc_semp causing problems. If the only reason why we need it
is to protect against multiple attempts to initialize different block
groups that share the same buddy bitmap, can we solve the problem by
ditching e4b->alloc_semp entirely, and simply using lock_page() on the
buddy bitmap page to protect it?
That's an example of the radical code cleanup and simplification that
parts of the ext4 codebase could really use. That isn't the
snapshot code's fault, and if we didn't really need to touch parts of the
code in question, it's probably stable enough as it is.
Unfortunately, if you need to make changes, there's enough code debt
in some of the files that you need to change that any change _has_ to
make things better, and not worse. So for example, checking to see if
the blocksize==page_size, and then skipping the down_read(alloc_semp)
call is an example of layering _more_ complexity and code hackery, and
not less.
Note what I did with patches in the ext4_da_writepages() codepath ---
about 100 lines of code removed in just 7 patches, and I expect
performance will get better as a result of the cleanup. And then
compare that to how that code looked in say, 2.6.27. We need to do
similar amounts of cleanup in other parts of ext4 --- and mballoc.c is
by no means the worst. But building on top of code which has a fair
amount of code debt is not a recipe for long-term success; it's like
building a castle on quicksand, or in a swamp (insert obligatory Monty
Python reference here).
- Ted
On 2011-02-14, at 12:58, Ted Ts'o wrote:
> On Mon, Feb 14, 2011 at 09:00:58PM +0200, Amir Goldstein wrote:
>> Yes, of course. Upgraders won't be the ones using snapshots.
>> My intention was to state that those people installing new systems to test
>> snapshots would be functioning as testers for "ext3 mode", because:
>> 1. when no snapshots exist it boils down to testing "ext3 mode".
>> 2. it is unlikely that snapshots will mask "ext3 mode" bugs.
>>
>> So my claim is that "ext3 mode" would benefit from a transition
>> period in which snapshots and (extents, delalloc) are mutually
>> exclusive in ext4.
>
> Here are the requirements that I think are critical before we do this:
>
> 1) We need to solve the testing matrix problem. Right now "ext3 mode"
> in ext4 doesn't get enough testing as it is. Part of the solution is
> (a) deciding on the modes that need testing, and (b) writing some
> shell scripts so that xfstests can be automatically run in all of the
> right modes. And then it will be having some number of people
> (hopefully not just me) running said tests and reporting failures.
I think the vast majority of systems use the default ext3 features as set by "mke2fs -j": "has_journal ext_attr resize_inode dir_index filetype sparse_super large_file"
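One way to turn that feature list into a test setup is to work out which mkfs.ext4 defaults have to be switched off to get the same on-disk format. A minimal Python sketch follows. The list of extra ext4 features is an assumption about the mke2fs.conf defaults of this period, and the helper name feature_override is made up for illustration; the "^feature" syntax is the usual mke2fs -O way of clearing a feature.

```python
# Default ext3 features as listed in the message above ("mke2fs -j").
EXT3_FEATURES = [
    "has_journal", "ext_attr", "resize_inode", "dir_index",
    "filetype", "sparse_super", "large_file",
]

# ASSUMPTION: features that mkfs.ext4 enabled on top of the ext3 set
# around this time; check mke2fs.conf on the system under test.
EXT4_EXTRA_FEATURES = [
    "extent", "huge_file", "flex_bg", "uninit_bg", "dir_nlink", "extra_isize",
]


def feature_override(wanted, enabled_by_default):
    """Build a '-O' argument that clears every default feature not wanted."""
    disable = [f for f in enabled_by_default if f not in wanted]
    if not disable:
        return ""
    return "-O " + ",".join("^" + f for f in disable)


ext4_defaults = EXT3_FEATURES + EXT4_EXTRA_FEATURES
print(feature_override(EXT3_FEATURES, ext4_defaults))
```

The resulting string could be passed to mkfs.ext4 (or set as MKFS_OPTIONS for xfstests) to create an ext3-format filesystem that is then mounted with the ext4 code.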
> One thing that comes to mind about your question with the
> e4b->alloc_semp causing problems. If the only reason why we need it
> is to protect against multiple attempts to initialize different block
> groups that share the same buddy bitmap, can we solve the problem by
> ditching e4b->alloc_semp entirely, and simply using lock_page() on the
> buddy bitmap page to protect it?
With flex_bg it might make sense to just read all of the bitmaps into that page in one shot, and avoid additional seeking later. This won't consume any additional RAM, because the page is already allocated, but can give a good performance win.
Cheers, Andreas
On Mon, Feb 14, 2011 at 9:58 PM, Ted Ts'o <[email protected]> wrote:
> -lsf-pc, -linux-fsdevel
>
> On Mon, Feb 14, 2011 at 09:00:58PM +0200, Amir Goldstein wrote:
>> Yes, of course. Upgraders won't be the ones using snapshots.
>> My intention was to state that those people installing new systems to test
>> snapshots would be functioning as testers for "ext3 mode", because:
>> 1. when no snapshots exist it boils down to testing "ext3 mode".
>> 2. it is unlikely that snapshots will mask "ext3 mode" bugs.
>>
>> So my claim is that "ext3 mode" would benefit from a transition
>> period in which snapshots and (extents, delalloc) are mutually
>> exclusive in ext4.
>
> Here are the requirements that I think are critical before we do this:
>
> 1) We need to solve the testing matrix problem. Right now "ext3 mode"
> in ext4 doesn't get enough testing as it is. Part of the solution is
> (a) deciding on the modes that need testing, and (b) writing some
> shell scripts so that xfstests can be automatically run in all of the
> right modes. And then it will be having some number of people
> (hopefully not just me) running said tests and reporting failures.
>
> 2) The code has to integrate in a fairly seamless and easy way.
> mballoc.c is an example of code that still needs a lot of cleanup.
> Coly Li has submitted some cleanups, which is great. But I suspect a
> lot more is needed.
>
> One thing that comes to mind about your question with the
> e4b->alloc_semp causing problems. If the only reason why we need it
> is to protect against multiple attempts to initialize different block
> groups that share the same buddy bitmap, can we solve the problem by
> ditching e4b->alloc_semp entirely, and simply using lock_page() on the
> buddy bitmap page to protect it?
>
Perhaps. I imagine there is more than one elegant way to deal with that,
but using a semaphore is not one of them.
I will take a shot at evaporating e4b->alloc_semp.
> That's an example of the radical code cleanup and simplification that
> parts of the ext4 codebase could really use. That isn't the
> snapshot code's fault, and if we didn't really need to touch parts of the
> code in question, it's probably stable enough as it is.
> Unfortunately, if you need to make changes, there's enough code debt
> in some of the files that you need to change that any change _has_ to
> make things better, and not worse. So for example, checking to see if
> the blocksize==page_size, and then skipping the down_read(alloc_semp)
> call is an example of layering _more_ complexity and code hackery, and
> not less.
>
Fair enough. I accept the challenge.
I shall clean up mballoc.c in the process of merging snapshots code.
If you have specific things that bug you in mballoc.c, let me know.
> Note what I did with patches in the ext4_da_writepages() codepath ---
> about 100 lines of code removed in just 7 patches, and I expect
> performance will get better as a result of the cleanup. And then
> compare that to how that code looked in say, 2.6.27. We need to do
> similar amounts of cleanup in other parts of ext4 --- and mballoc.c is
> by no means the worst. But building on top of code which has a fair
> amount of code debt is not a recipe for long-term success; it's like
> building a castle on quicksand, or in a swamp (insert obligatory Monty
> Python reference here).
>
> - Ted
>
On Mon, Feb 14, 2011 at 02:58:45PM -0500, Ted Ts'o wrote:
> -lsf-pc, -linux-fsdevel
>
> On Mon, Feb 14, 2011 at 09:00:58PM +0200, Amir Goldstein wrote:
> > Yes, of course. Upgraders won't be the ones using snapshots.
> > My intention was to state that those people installing new systems to test
> > snapshots would be functioning as testers for "ext3 mode", because:
> > 1. when no snapshots exist it boils down to testing "ext3 mode".
> > 2. it is unlikely that snapshots will mask "ext3 mode" bugs.
> >
> > So my claim is that "ext3 mode" would benefit from a transition
> > period in which snapshots and (extents, delalloc) are mutually
> > exclusive in ext4.
>
> Here are the requirements that I think are critical before we do this:
>
> 1) We need to solve the testing matrix problem. Right now "ext3 mode"
> in ext4 doesn't get enough testing as it is. Part of the solution is
> (a) deciding on the modes that need testing, and (b) writing some
> shell scripts so that xfstests can be automatically run in all of the
> right modes. And then it will be having some number of people
> (hopefully not just me) running said tests and reporting failures.
What scripts are needed? xfstests has the $MKFS_OPTIONS and
$MOUNT_OPTIONS environment variables for customising your mkfs and
mount parameters for each test run, so shouldn't testing ext3
filesystems with the ext4 code just be a matter of setting
these appropriately?
Cheers,
Dave.
--
Dave Chinner
[email protected]
On Tue, Feb 15, 2011 at 03:28:37PM +1100, Dave Chinner wrote:
>
> What scripts are needed? xfstests has the $MKFS_OPTIONS and
> $MOUNT_OPTIONS environment variables for customising your mkfs and
> mount parameters for each test run, so shouldn't testing ext3
> filesystems with the ext4 code just be a matter of setting
> these appropriately?
Correct, this doesn't require changes to xfstests.
What is needed for this ext4/ext3-using-ext4 testing is a wrapper
script *around* xfstests that sets up the environment variables
correctly, and uses different devices for the "common case"
combinations of mkfs and mount options (where we would keep an aged
file system around), and for those combinations which we don't think are
valuable enough to dedicate a reserved file system image, we'd have to
mkfs a special version of that filesystem for TEST_DEV.
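A minimal Python sketch of such a wrapper is below. It only builds the environment for each mode and hands it to the xfstests "check" script. The mode names, option strings, and mount points are illustrative assumptions, not an agreed list of modes; FSTYP, TEST_DEV, TEST_DIR, SCRATCH_DEV, SCRATCH_MNT, MKFS_OPTIONS and MOUNT_OPTIONS are the variables xfstests reads.

```python
import os
import subprocess

# HYPOTHETICAL test matrix: mode names and option strings are illustrative,
# not an agreed list of the modes that need testing.
MODES = {
    "ext4": {"mkfs": "", "mount": ""},
    "ext3-on-ext4": {
        "mkfs": "-O ^extent,^huge_file,^flex_bg,^uninit_bg,^dir_nlink",
        "mount": "-o nodelalloc",
    },
    "ext2-on-ext4": {
        "mkfs": "-O ^has_journal,^extent,^huge_file,^flex_bg,^uninit_bg,^dir_nlink",
        "mount": "-o nodelalloc",
    },
}


def build_env(mode, test_dev, scratch_dev, base_env=None):
    """Return the environment xfstests should see for one mode."""
    opts = MODES[mode]
    env = dict(os.environ if base_env is None else base_env)
    env.update({
        "FSTYP": "ext4",                 # always exercise the ext4 code
        "TEST_DEV": test_dev,
        "TEST_DIR": "/mnt/test",         # assumed mount point
        "SCRATCH_DEV": scratch_dev,
        "SCRATCH_MNT": "/mnt/scratch",   # assumed mount point
        "MKFS_OPTIONS": opts["mkfs"],
        "MOUNT_OPTIONS": opts["mount"],
    })
    return env


def run_mode(mode, test_dev, scratch_dev, xfstests_dir):
    """Run the xfstests 'check' script for one mode; returns its exit code."""
    env = build_env(mode, test_dev, scratch_dev)
    return subprocess.call(["./check", "-g", "auto"], cwd=xfstests_dir, env=env)
```

The sketch does not mkfs TEST_DEV itself. A real wrapper would have to do that for the modes that have no aged image kept around, and would pick a dedicated device for the modes that do.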
(I'm not sure why xfstests doesn't use a freshly created file in the
case where SCRATCH_DEV is defined but TEST_DEV is not, but it doesn't;
as far as I know there are no tests where it uses both TEST_DEV and
SCRATCH_DEV, are there?)
- Ted
On Tue, Feb 15, 2011 at 12:29:22PM -0500, Ted Ts'o wrote:
> On Tue, Feb 15, 2011 at 03:28:37PM +1100, Dave Chinner wrote:
> >
> > What scripts are needed? xfstests has the $MKFS_OPTIONS and
> > $MOUNT_OPTIONS environment variables for customising your mkfs and
> > mount parameters for each test run, so shouldn't testing ext3
> > filesystems with the ext4 code just be a matter of setting
> > these appropriately?
>
> Correct, this doesn't require changes to xfstests.
>
> What is needed for this ext4/ext3-using-ext4 testing is a wrapper
> script *around* xfstests that sets up the environment variables
> correctly, and uses different devices for the "common case"
> combinations of mkfs and mount options (where we would keep an aged
> file system around), and for those combinations which we don't think are
> valuable enough to dedicate a reserved file system image, we'd have to
> mkfs a special version of that filesystem for TEST_DEV.
Every developer has their own set of wrapper scripts for doing just
this. Every test environment is different, so I'm not sure there is
a one-size-fits-all script waiting here.
In the past I've considered extending this sort of test
configuration to the configuration files and adding a command line
parameter to select the config file that defines the test setup. I
think you can specify the config file via the HOST_OPTIONS env
variable right now, but I haven't looked any further than that.
FWIW, I keep all my config files in a patch I apply to my xfstests
git repo before I rsync it to all my test machines, so this approach
would work for me, too. ;)
> (I'm not sure why xfstests doesn't use a freshly created file in the
> case where SCRATCH_DEV is defined but TEST_DEV is not, but it doesn't;
I'm not sure what you are asking for here...
> as far as I know there are no tests where it uses both TEST_DEV and
> SCRATCH_DEV, are there?)
There are tests that do this (e.g. 073) - maybe none of the
generic tests do right now, but there are XFS specific tests that
do.
Cheers,
Dave.
--
Dave Chinner
[email protected]