LinuxLists.cc - Re: [RFC 12/26] ext2 white-out support

2007-08-01 15:23:53

Subject: Re: [RFC 12/26] ext2 white-out support

On Tue, 2007-07-31 at 13:11 -0400, Josef Sipek wrote:
> On Tue, Jul 31, 2007 at 07:00:12PM +0200, Jan Blunck wrote:
> > On Tue, Jul 31, Josef Sipek wrote:
> >
> > > On Mon, Jul 30, 2007 at 06:13:35PM +0200, Jan Blunck wrote:
> > > > Introduce white-out support to ext2.
> > >
> > > I think storing whiteouts on the branches is wrong. It creates all sorts of
> > > nasty cases when people actually try to use unioning. Imagine a (not-so
> > > unlikely) scenario where you have 2 unions, and they share a branch. If you
> > > create a whiteout in one union on that shared branch, the whiteout magically
> > > affects the other union as well! Whiteouts are a union-level construct, and
> > > therefore storing them at the branch level is wrong.
> >
> > So you think that just because you mounted the filesystem somewhere else it
> > should look different? This is what sharing is all about. If you share a
> > filesystem you also share the removal of objects.
>
> The removal happens at the union level, not the branch level. Say you have:
>
> /a/
> /b/foo
> /c/foo
>
> And you mount /u1 as a union of {a,b}, and /u2 as union of {a,c}.

Who does this? I'm assuming that a is the "top" layer. Aren't union
mounts typically about sharing lower layers and having a separate rw
layer for each union mount?

> $ find /u*
> /u1
> /u1/foo
> /u2
> /u2/foo
> $ rm /u1/foo # this creates whiteout for "foo" in /a
> $ find /u*
> /u1
> /u2
>
> Is that what you'd expect as a user? I don't think so.

That's exactly what I would expect.

If I were to:
$ echo "this is new" > /u1/foo

I would expect:
$ cat /u2/foo
this is new

So why should rm behave differently?

I haven't really been tuned into union mounts, so maybe I'm missing out
on something basic here.

Thanks,
Shaggy
--
David Kleikamp
IBM Linux Technology Center

2007-08-01 18:44:47

by Josef Sipek

Subject: Re: [RFC 12/26] ext2 white-out support

On Wed, Aug 01, 2007 at 10:23:29AM -0500, Dave Kleikamp wrote:
> On Tue, 2007-07-31 at 13:11 -0400, Josef Sipek wrote:
> > On Tue, Jul 31, 2007 at 07:00:12PM +0200, Jan Blunck wrote:
> > > On Tue, Jul 31, Josef Sipek wrote:
> > >
> > > > On Mon, Jul 30, 2007 at 06:13:35PM +0200, Jan Blunck wrote:
> > > > > Introduce white-out support to ext2.
> > > >
> > > > I think storing whiteouts on the branches is wrong. It creates all sorts of
> > > > nasty cases when people actually try to use unioning. Imagine a (not-so
> > > > unlikely) scenario where you have 2 unions, and they share a branch. If you
> > > > create a whiteout in one union on that shared branch, the whiteout magically
> > > > affects the other union as well! Whiteouts are a union-level construct, and
> > > > therefore storing them at the branch level is wrong.
> > >
> > > So you think that just because you mounted the filesystem somewhere else it
> > > should look different? This is what sharing is all about. If you share a
> > > filesystem you also share the removal of objects.
> >
> > The removal happens at the union level, not the branch level. Say you have:
> >
> > /a/
> > /b/foo
> > /c/foo
> >
> > And you mount /u1 as a union of {a,b}, and /u2 as union of {a,c}.
>
> Who does this? I'm assuming that a is the "top" layer. Aren't union
> mounts typically about sharing lower layers and having a separate rw
> layer for each union mount?

Alright not the greatest of examples, there is something to be said about
symmetry, so...let me try again :)

/a/
/b/bar (whiteout for bar)
/c/foo/qwerty

Now, let's mount a union of {a,b,c}, and we'll see:

$ find /u
/u
/u/foo
/u/foo/qwerty
$ mv /u/foo /u/bar

Now what? How do you rename? Do you rename in the same branch (assuming it
is rw)? If you do, you'll get:

$ find /u
/u

Oops! There's a whiteout in /b that hides the directory in /c -- rename(2)
shouldn't make directory subtrees disappear.

There are two ways to solve this:

1) "cp -r" the entire subtree being renamed to highest-priority branch, and
rename there (you might have to recreate a series of directories to have a
place to "cp" to...so you got "cp -r" _AND_ "mkdir -p"-like code in the VFS!
1/2 a :) )

2) Don't store whiteouts within branches. This makes it really easy to
rename and remove the whiteout.

Sure, you could try to rename in-place and remove the whiteout, but what if
you have:

/a/
/b/bar (whiteout)
/c/bar/blah
/d/foo/qwerty

$ mv /u/foo /u/bar

You can't just remove the whiteout, because that'd uncover the whited-out
directory bar in /c.

Josef 'Jeff' Sipek.

--
Bad pun of the week: The formula 1 control computer suffered from a race
condition
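
Archive note: the two rename scenarios in the message above can be replayed with a small model. This is only a sketch under stated assumptions, not code from either Union Mounts or Unionfs: the names `Branch`, `visible`, and `rename_in_place` are hypothetical, and the model assumes that a whiteout stored in one branch hides the same name (and everything beneath it) in every lower-priority branch.

```python
from dataclasses import dataclass, field


@dataclass
class Branch:
    """One layer of the union (hypothetical model, not a kernel structure)."""
    name: str
    paths: set = field(default_factory=set)      # entries present in this branch
    whiteouts: set = field(default_factory=set)  # names hidden in all lower branches


def hidden(path, whiteouts):
    # A whiteout on "bar" hides "bar" itself and everything below it.
    return any(path == w or path.startswith(w + "/") for w in whiteouts)


def visible(branches):
    """Return the sorted union view; branches[0] has the highest priority."""
    seen = set()
    active = set()  # whiteouts collected from higher-priority branches so far
    for b in branches:
        seen |= {p for p in b.paths if not hidden(p, active)}
        active |= b.whiteouts
    return sorted(seen)


def rename_in_place(branch, src, dst):
    """Rename src to dst inside a single branch, leaving other branches untouched."""
    moved = {p for p in branch.paths if p == src or p.startswith(src + "/")}
    branch.paths -= moved
    branch.paths |= {dst + p[len(src):] for p in moved}


# First scenario: union of {a, b, c} with a whiteout for "bar" in b.
a = Branch("a")
b = Branch("b", whiteouts={"bar"})
c = Branch("c", paths={"foo", "foo/qwerty"})
print(visible([a, b, c]))        # ['foo', 'foo/qwerty']
rename_in_place(c, "foo", "bar")
print(visible([a, b, c]))        # [] : the renamed subtree is hidden by b's whiteout

# Second scenario: dropping the whiteout after the rename uncovers c's old "bar".
a = Branch("a")
b = Branch("b", whiteouts={"bar"})
c = Branch("c", paths={"bar", "bar/blah"})
d = Branch("d", paths={"foo", "foo/qwerty"})
rename_in_place(d, "foo", "bar")
b.whiteouts.discard("bar")
print(visible([a, b, c, d]))     # ['bar', 'bar/blah', 'bar/qwerty']
```

In the second run `bar/blah` reappears even though it was deleted earlier, which is the reason given above for why removing the whiteout is not enough.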

2007-08-01 19:10:46

by Dave Kleikamp

Subject: Re: [RFC 12/26] ext2 white-out support

On Wed, 2007-08-01 at 14:44 -0400, Josef Sipek wrote:
> Alright not the greatest of examples, there is something to be said about
> symmetry, so...let me try again :)
>
> /a/
> /b/bar (whiteout for bar)
> /c/foo/qwerty
>
> Now, let's mount a union of {a,b,c}, and we'll see:
>
> $ find /u
> /u
> /u/foo
> /u/foo/qwerty
> $ mv /u/foo /u/bar
>
> Now what? How do you rename? Do you rename in the same branch (assuming it
> is rw)?

Er, no. According to Documentation/filesystems/union-mounts.txt, "only
the topmost layer of the mount stack can be altered".

> If you do, you'll get:
>
> $ find /u
> /u
>
> Oops! There's a whiteout in /b that hides the directory in /c -- rename(2)
> shouldn't make directory subtrees disappear.
>
> There are two ways to solve this:
>
> 1) "cp -r" the entire subtree being renamed to highest-priority branch, and
> rename there (you might have to recreate a series of directories to have a
> place to "cp" to...so you got "cp -r" _AND_ "mkdir -p"-like code in the VFS!
> 1/2 a :) )

I think this is the only alternative, given the design.

> 2) Don't store whiteouts within branches. This makes it really easy to
> rename and remove the whiteout.
>
> Sure, you could try to rename in-place and remove the whiteout, but what if
> you have:
>
> /a/
> /b/bar (whiteout)
> /c/bar/blah
> /d/foo/qwerty
>
> $ mv /u/foo /u/bar
>
> You can't just remove the whiteout, because that'd uncover the whited-out
> directory bar in /c.
>
> Josef 'Jeff' Sipek.
>
--
David Kleikamp
IBM Linux Technology Center

2007-08-01 19:33:50

by Josef Sipek

Subject: Re: [RFC 12/26] ext2 white-out support

On Wed, Aug 01, 2007 at 02:10:31PM -0500, Dave Kleikamp wrote:
> On Wed, 2007-08-01 at 14:44 -0400, Josef Sipek wrote:
> > Alright not the greatest of examples, there is something to be said about
> > symmetry, so...let me try again :)
> >
> > /a/
> > /b/bar (whiteout for bar)
> > /c/foo/qwerty
> >
> > Now, let's mount a union of {a,b,c}, and we'll see:
> >
> > $ find /u
> > /u
> > /u/foo
> > /u/foo/qwerty
> > $ mv /u/foo /u/bar
> >
> > Now what? How do you rename? Do you rename in the same branch (assuming it
> > is rw)?
>
> Er, no. According to Documentation/filesystems/union-mounts.txt, "only
> the topmost layer of the mount stack can be altered".

This brings up a very interesting (but painful) question...which makes more
sense? Allowing the modifications in only the top-most branch, or any branch
(given the user allows it at mount-time)?

This is really a question to the community at large, not just you, Dave :)

> > 1) "cp -r" the entire subtree being renamed to highest-priority branch, and
> > rename there (you might have to recreate a series of directories to have a
> > place to "cp" to...so you got "cp -r" _AND_ "mkdir -p"-like code in the VFS!
> > 1/2 a :) )
>
> I think this is the only alternative, given the design.

Right. Doing something like this at the filesystem level (as we do in
unionfs) seems less painful - filesystems are places full of all sorts of
nefarious activities to begin with. Having it in the VFS seems...even
uglier.

Josef 'Jeff' Sipek.

--
*NOTE: This message is ROT-13 encrypted twice for extra protection*

2007-08-01 19:52:34

by Dave Kleikamp

Subject: Re: [RFC 12/26] ext2 white-out support

On Wed, 2007-08-01 at 15:33 -0400, Josef Sipek wrote:
> On Wed, Aug 01, 2007 at 02:10:31PM -0500, Dave Kleikamp wrote:
> > On Wed, 2007-08-01 at 14:44 -0400, Josef Sipek wrote:
> > > Now what? How do you rename? Do you rename in the same branch (assuming it
> > > is rw)?
> >
> > Er, no. According to Documentation/filesystems/union-mounts.txt, "only
> > the topmost layer of the mount stack can be altered".
>
> This brings up a very interesting (but painful) question...which makes more
> sense? Allowing the modifications in only the top-most branch, or any branch
> (given the user allows it at mount-time)?

Your examples point out the complexity of trying to allow modifications
at lower levels. It seems to me to be simpler (even if recursive copies
are needed) to leave it as proposed.

> This is really a question to the community at large, not just you, Dave :)

I agree, but I have to add my $.02.

> > > 1) "cp -r" the entire subtree being renamed to highest-priority branch, and
> > > rename there (you might have to recreate a series of directories to have a
> > > place to "cp" to...so you got "cp -r" _AND_ "mkdir -p"-like code in the VFS!
> > > 1/2 a :) )
> >
> > I think this is the only alternative, given the design.
>
> Right. Doing something like this at the filesystem level (as we do in
> unionfs) seems less painful - filesystems are places full of all sorts of
> nefarious activities to begin with. Having it in the VFS seems...even
> uglier.

I haven't looked at either implementation close enough to offer an
opinion here that I would be able to defend. I'm sure others have their
opinions.

> Josef 'Jeff' Sipek.
>

Thanks,
Shaggy
--
David Kleikamp
IBM Linux Technology Center

2007-08-01 22:07:21

by Erez Zadok

Subject: Re: [RFC 12/26] ext2 white-out support

Dave Kleikamp writes:
> On Wed, 2007-08-01 at 15:33 -0400, Josef Sipek wrote:
> > On Wed, Aug 01, 2007 at 02:10:31PM -0500, Dave Kleikamp wrote:
> > > On Wed, 2007-08-01 at 14:44 -0400, Josef Sipek wrote:
> > > > Now what? How do you rename? Do you rename in the same branch (assuming it
> > > > is rw)?
> > >
> > > Er, no. According to Documentation/filesystems/union-mounts.txt, "only
> > > the topmost layer of the mount stack can be altered".
> >
> > This brings up a very interesting (but painful) question...which makes more
> > sense? Allowing the modifications in only the top-most branch, or any branch
> > (given the user allows it at mount-time)?
>
> Your examples point out the complexity of trying to allow modifications
> at lower levels. It seems to me to be simpler (even if recursive copies
> are needed) to leave it as proposed.
[...]

There are three other reasons why Unionfs and our users like to have
multiple writable branches:

1. If only the topmost layer is writable, then every little change tends to
cause a copyup, which tends to clutter the top layer more quickly. Some
of our users didn't like that idea, while others explicitly wanted it --
so we give them a choice to decide, on a per-layer/branch basis, whether it
should be writable or readonly.

2. Some users unify different packages together. Imagine you union under
/union, several installed packages: /X11R6/{bin,man,lib,conf},
/apache/{bin,man,lib,etc}, and /mysql/{bin,man,lib,etc}, and so on. If a
user modifies /union/apache/etc/apache.conf, they sometimes want
apache.conf to remain in the writable branch it came from, not copied up.
That way all apache related files are logically left where they came
from, which makes administration easier. Again, some users like to have
multiple writable branches, and some don't -- so in Unionfs we give them
the choice. And yes, it does make our implementation more complex.

3. Some people use Unionfs in the scenario described in point #2 above, as a
poor man's space- and load- distribution system. Some of our users like
the idea of controlling how much storage space they give each branch, and
how much it might grow, and even how much CPU or I/O load might be placed
on each of the lower filesystems which serve a given branch. That way
they worry less about the top-layer's space filling up more quickly than
expected. Now Unionfs was never designed to be a load-balancing f/s (we
have RAIF for that, see <http://www.filesystems.org/project-raif.html>),
but users seem to always find creative ways to [ab]use one's software in
ways one never thought of. :-)

BTW, does Union Mounts copyup on meta-data changes (e.g., chmod, chgrp,
etc.)?

Erez.

2007-08-02 05:24:52

by Ph. Marek

Subject: Re: [RFC 12/26] ext2 white-out support

On Wednesday, 1 August 2007, Josef Sipek wrote:
> Alright not the greatest of examples, there is something to be said about
> symmetry, so...let me try again :)
...
> Oops! There's a whiteout in /b that hides the directory in /c -- rename(2)
> shouldn't make directory subtrees disappear.
>
> There are two ways to solve this:
>
> 1) "cp -r" the entire subtree ...
>
> 2) Don't store whiteouts within branches ...
Sorry for making uninformed guesses, but if there are already special nodes
(whiteout), why not extend them to some more general format - specifying a
(source, destination) pair at the topmost level?
- A delete is a (source, NULL) pair
- A rename is a (source, destination) pair, which causes lookups on source to
use the string destination in the lower branches.

Would that work?

Regards,

Phil
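
Archive note: one possible reading of the (source, destination) idea above, as a rough sketch. Everything here is an assumption made for illustration: the function name `resolve`, the dict-based branches, and in particular the interpretation that `mv foo bar` would be recorded at the topmost level as a whiteout for "foo" plus a pair that sends lookups of "bar" to the name "foo" in the lower branches. It handles single names only, not multi-component paths.

```python
def resolve(name, redirects, lower):
    """Look up `name` via the topmost redirect table, then in the lower branches.

    redirects: dict mapping a name to None (plain whiteout) or to the
               name that should be used in the lower branches instead.
    lower:     list of dicts (name -> object), highest priority first.
    """
    if name in redirects:
        target = redirects[name]
        if target is None:   # (source, NULL): the name is deleted
            return None
        name = target        # (source, destination): search under another name
    for branch in lower:
        if name in branch:
            return branch[name]
    return None


# "mv foo bar" on a union whose only copy of foo lives in a read-only branch:
lower = [{"foo": "directory foo from branch c"}]
redirects = {"foo": None, "bar": "foo"}

print(resolve("bar", redirects, lower))  # directory foo from branch c
print(resolve("foo", redirects, lower))  # None
```

The sketch avoids any copy of the subtree, but it says nothing about the harder cases raised in the thread, such as a lower branch that already contains its own "bar".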

2007-08-02 11:55:47

by Jan Blunck

Subject: Re: [RFC 12/26] ext2 white-out support

On Wed, Aug 01, Josef Sipek wrote:

> This brings up a very interesting (but painful) question...which makes more
> sense? Allowing the modifications in only the top-most branch, or any branch
> (given the user allows it at mount-time)?

My implementation keeps things simple for a reason. There have been
many attempts to get unioning working on the filesystem layer. Most of them
failed because of complexity. E.g. BSD threw away all of the filesystem
stacking support after they tried to fix unionfs for years. Writing to lower
layers makes things unnecessarily complex. Therefore I left it out.

> > > 1) "cp -r" the entire subtree being renamed to highest-priority branch, and
> > > rename there (you might have to recreate a series of directories to have a
> > > place to "cp" to...so you got "cp -r" _AND_ "mkdir -p"-like code in the VFS!
> > > 1/2 a :) )
> >
> > I think this is the only alternative, given the design.
>
> Right. Doing something like this at the filesystem level (as we do in
> unionfs) seems less painful - filesystems are places full of all sorts of
> nefarious activities to begin with. Having it in the VFS seems...even
> uglier.

The userspace is doing it since I return -EXDEV. And that even comes for
free. I don't need to hack around and call back into VFS as you do. It is so
simple and straightforward in the VFS.
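
Archive note: the userspace fallback referred to above can be sketched as follows. When rename(2) fails with EXDEV, tools such as mv copy the source to the destination and then delete the source. The helper name `move` and its injectable `rename` parameter are hypothetical and exist only so the fallback path can be exercised without two filesystems; this is not the code of any particular mv implementation.

```python
import errno
import os
import shutil


def move(src, dst, rename=os.rename):
    """Rename src to dst, falling back to copy + delete on EXDEV."""
    try:
        rename(src, dst)
    except OSError as e:
        if e.errno != errno.EXDEV:
            raise  # any other failure is a real error
        if os.path.isdir(src) and not os.path.islink(src):
            # Directory: recursive copy ("cp -r"), then remove the original tree.
            shutil.copytree(src, dst, symlinks=True)
            shutil.rmtree(src)
        else:
            # Regular file or symlink: copy with metadata, then unlink.
            shutil.copy2(src, dst, follow_symlinks=False)
            os.unlink(src)
```

Python's standard library already offers `shutil.move`, which performs a similar fallback. The point of the sketch is only that the recursive copy happens in the calling program, not in the kernel.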

2007-08-02 12:05:55

by Jan Blunck

Subject: Re: [RFC 12/26] ext2 white-out support

On Wed, Aug 01, Erez Zadok wrote:

> There are three other reasons why Unionfs and our users like to have
> multiple writable branches:
>

...

> And yes, it does make our implementation more complex.

And error-prone and inflexible wrt. changes. When XIP was introduced,
unionfs crashed all over these changes. I don't know if this has changed
yet. Not speaking of other issues like calling back into VFS (stack usage),
locking problems and so on.

> 3. Some people use Unionfs in the scenario described in point #2 above, as a
> poor man's space- and load- distribution system. Some of our users like
> the idea of controlling how much storage space they give each branch, and
> how much it might grow, and even how much CPU or I/O load might be placed
> on each of the lower filesystems which serve a given branch. That way
> they worry less about the top-layer's space filling up more quickly than
> expected. Now Unionfs was never designed to be a load-balancing f/s (we
> have RAIF for that, see <http://www.filesystems.org/project-raif.html>),
> but users seem to always find creative ways to [ab]use one's software in
> ways one never thought of. :-)

And this has nothing to do with unioning ...

> BTW, does Union Mounts copyup on meta-data changes (e.g., chmod, chgrp,
> etc.)?

No. But it was proposed during one of the last postings.

2007-08-02 12:12:18

by Jan Blunck

Subject: Re: [RFC 12/26] ext2 white-out support

On Thu, Aug 02, Ph. Marek wrote:

> On Wednesday, 1 August 2007, Josef Sipek wrote:
> > Alright not the greatest of examples, there is something to be said about
> > symmetry, so...let me try again :)
> ...
> > Oops! There's a whiteout in /b that hides the directory in /c -- rename(2)
> > shouldn't make directory subtrees disappear.
> >
> > There are two ways to solve this:
> >
> > 1) "cp -r" the entire subtree ...
> >
> > 2) Don't store whiteouts within branches ...
> Sorry for making uninformed guesses, but if there are already special nodes
> (whiteout), why not extend them to some more general format - specifying a
> (source, destination) pair at the topmost level?
> - A delete is a (source, NULL) pair
> - A rename is a (source, destination) pair, which causes lookups on source to
> use the string destination in the lower branches.

Originally I had the idea that whiteouts are a special kind of symlink. After
discussing that with various people, I stuck to the simplest approach.

2007-08-02 17:54:18

by Jörn Engel

Subject: Re: [RFC 12/26] ext2 white-out support

On Wed, 1 August 2007 15:33:30 -0400, Josef Sipek wrote:
>
> This brings up a very interesting (but painful) question...which makes more
> sense? Allowing the modifications in only the top-most branch, or any branch
> (given the user allows it at mount-time)?
>
> This is really a question to the community at large, not just you, Dave :)

Only write to top-most layer.

There are two reasons for this. First it allows users to create a union
mount, test something (e.g. update the distribution) and remove every
trace from the test by umounting the top-most layer. Such a thing can
be quite valuable.

The second reason is simplicity. I personally couldn't even start to
describe the semantics. If the user does a rename, which layer will the
change end up in? What if source or target exist in multiple layers?
How to rename a directory in a lower layer containing a new file in an
upper layer?

Finding new and interesting corner cases for such a beast can be quite
entertaining. And until someone has properly documented the semantics
for _all_ the corner cases, my enthusiasm is below freezing point. Does
such documentation exist?

Jörn

--
A surrounded army must be given a way out.
-- Sun Tzu

2007-08-02 18:15:46

by Jeremy Maitin-Shepard

Subject: Re: [RFC 12/26] ext2 white-out support

Jörn Engel writes:

> On Wed, 1 August 2007 15:33:30 -0400, Josef Sipek wrote:
>>
>> This brings up a very interesting (but painful) question...which makes more
>> sense? Allowing the modifications in only the top-most branch, or any branch
>> (given the user allows it at mount-time)?
>>
>> This is really a question to the community at large, not just you, Dave :)

> Only write to top-most layer.

> There are two reasons for this. First it allows users to create a union
> mount, test something (e.g. update the distribution) and remove every
> trace from the test by umounting the top-most layer. Such a thing can
> be quite valuable.

Josef did specifically state that modification to the lower layers would
be allowed only if a special mount flag is given.

> The second reason is simplicity. I personally couldn't even start to
> describe the semantics. If the user does a rename, which layer will the
> change end up in? What if source or target exist in multiple layers?
> How to rename a directory in a lower layer containing a new file in an
> upper layer?

> Finding new and interesting corner cases for such a beast can be quite
> entertaining. And until someone has properly documented the semantics
> for _all_ the corner cases, my enthusiasm is below freezing point. Does
> such documentation exist?

I think that if someone can come up with consistent (and useful)
semantics for a mount option that allows modifications to other layers
as well, it would be a useful additional feature to support. It seems
that it should be possible to add this feature at a later time in any
case.

Perhaps referring to the plan9 semantics could be helpful.

--
Jeremy Maitin-Shepard