2011-05-11 17:27:48

by David Howells

Subject: Unionmount fallthru directory entries


What are unionmount fallthru directory entries for and why are they needed?

I'm guessing that what they do is indicate that the dirent in question must be
looked up in the corresponding directory on the underlying fs.

As to why they are needed, am I right in thinking that unionmount caches a
copy of all the lower directory's entries in the upper directory with fallthru
markers set on the first call to readdir?

David


2011-05-11 19:28:51

by Valerie Aurora

Subject: Re: Unionmount fallthru directory entries

On Wed, May 11, 2011 at 10:00 AM, David Howells wrote:
>
> What are unionmount fallthru directory entries for and why are they needed?

The long version of this answer is in:

http://lwn.net/Articles/325369/

The short version is that fallthrus allow you to generate and use a
32-bit d_off in readdir() using the topmost file system's native
readdir() implementation. They let you process duplicates and
whiteouts only once per directory, instead of every time you open it.
They do not require pinning O(size of all unioned dirs) memory while
the directory is open, and they do not require a d_off-generating hack
when readdir() runs off the end of the first dir (which no one has
made fs-agnostic so far). The problems with previous approaches are
summarized in that article.

> I'm guessing that what they do is indicate that the dirent in question must be
> looked up in the corresponding directory on the underlying fs.

Yes, if you are doing something that isn't just readdir().

> As to why they are needed, am I right in thinking that unionmount caches a
> copy of all the lower directory's entries in the upper directory with fallthru
> markers set on the first call to readdir?

Yes, but actually on the first lookup, not the first readdir(), if I
recall correctly. The duplicate and whiteout processing is done
exactly once, on first lookup of the directory, and then the directory
is marked opaque.
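
To make the one-time merge concrete, here is a minimal user-space sketch
in Python. It is a toy model only, not the unionmount code: the names
Directory, Kind, merge_on_first_lookup, readdir and resolve are all
hypothetical, and a dict stands in for the on-disk directory of the
topmost file system.

```python
from dataclasses import dataclass, field
from enum import Enum


class Kind(Enum):
    REAL = "real"          # entry that really lives in the upper directory
    WHITEOUT = "whiteout"  # hides a same-named entry in the lower directory
    FALLTHRU = "fallthru"  # placeholder: the name must be looked up below


@dataclass
class Directory:
    # Hypothetical model of an upper (topmost) directory: name -> Kind.
    entries: dict = field(default_factory=dict)
    opaque: bool = False


def merge_on_first_lookup(upper, lower_names):
    """Copy lower names into the upper directory as fallthrus, once."""
    if upper.opaque:
        return  # already merged; never consult the lower directory again
    for name in lower_names:
        # setdefault leaves existing real entries and whiteouts alone,
        # so duplicates and whiteouts are resolved here, exactly once.
        upper.entries.setdefault(name, Kind.FALLTHRU)
    upper.opaque = True


def readdir(upper):
    """List the merged view using only the upper directory's entries."""
    return sorted(n for n, k in upper.entries.items() if k is not Kind.WHITEOUT)


def resolve(upper, name):
    """Say which layer holds the object behind a name, or None."""
    kind = upper.entries.get(name)
    if kind is None or kind is Kind.WHITEOUT:
        return None
    return "lower" if kind is Kind.FALLTHRU else "upper"
```

With an upper directory holding a real "a" and a whiteout for "b", and
a lower directory holding "a", "b" and "c", the merge adds a single
fallthru for "c". After that, readdir() needs nothing but the upper
directory, and only a non-readdir operation on "c" has to go down to
the lower file system.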

It's the best solution I've seen so far, but that doesn't mean it
can't be improved.

-VAL