2007-12-14 15:49:14

by Erez Zadok

[permalink] [raw]
Subject: [GIT PULL -mm] 0/2 Unionfs updates/fixes/cleanups


The following is a series of patches related to Unionfs.

These patches were tested (where appropriate) on Linus's 2.6.24 latest code
(as of v2.6.24-rc5-43-gda8cadb), MM (MMOTM stamp-2007-12-13-15-37), as well
as the backports to 2.6.{23,22,21,20,19,18,9} on ext2/3/4, xfs, reiserfs,
nfs2/3/4, jffs2, ramfs, tmpfs, cramfs, and squashfs (where available). Also
tested with LTP-full. See http://unionfs.filesystems.org/ to download
back-ported unionfs code.

Please pull from the 'master' branch of
git://git.kernel.org/pub/scm/linux/kernel/git/ezk/unionfs.git

to receive the following:

Erez Zadok (2):
Unionfs: avoid using drop_pagecache_sb in remount
Unionfs: clarify usage.txt mount options

Documentation/filesystems/unionfs/usage.txt | 38 +++++++++++++++++++---------
fs/drop_caches.c | 4 --
fs/unionfs/dentry.c | 11 ++++++++
fs/unionfs/super.c | 34 ++++++++-----------------
fs/unionfs/union.h | 1
include/linux/mm.h | 2 -
6 files changed, 52 insertions(+), 38 deletions(-)

---
Erez Zadok
[email protected]


2007-12-14 15:49:30

by Erez Zadok

[permalink] [raw]
Subject: [PATCH 1/2] Unionfs: avoid using drop_pagecache_sb in remount

Exporting drop_pagecache_sb to modules is somewhat risky because one cannot
sleep inside invalidate_mapping_pages. This could cause a lot of latency in
the pre-emption code. So don't export this symbol to minimize the risk that
others will use it.

Instead, unionfs will try to directly invalidate as many pages it can from
the unionfs_remount code. Invalidating those inode pages is not strictly
required, but helpful in encouraging a revalidation of inodes sooner than
waiting for individual f/s ops to access the union. Since a remount is
already an expensive but rare operation, this inode pages invalidation
shouldn't add too much overhead.

CC: Nick Piggin <[email protected]>

Signed-off-by: Erez Zadok <[email protected]>
---
fs/drop_caches.c | 4 +---
fs/unionfs/dentry.c | 11 +++++++++++
fs/unionfs/super.c | 34 ++++++++++++----------------------
fs/unionfs/union.h | 1 +
include/linux/mm.h | 2 --
5 files changed, 25 insertions(+), 27 deletions(-)

diff --git a/fs/drop_caches.c b/fs/drop_caches.c
index 90410ac..59375ef 100644
--- a/fs/drop_caches.c
+++ b/fs/drop_caches.c
@@ -3,7 +3,6 @@
*/

#include <linux/kernel.h>
-#include <linux/module.h>
#include <linux/mm.h>
#include <linux/fs.h>
#include <linux/writeback.h>
@@ -13,7 +12,7 @@
/* A global variable is a bit ugly, but it keeps the code simple */
int sysctl_drop_caches;

-void drop_pagecache_sb(struct super_block *sb)
+static void drop_pagecache_sb(struct super_block *sb)
{
struct inode *inode;

@@ -25,7 +24,6 @@ void drop_pagecache_sb(struct super_block *sb)
}
spin_unlock(&inode_lock);
}
-EXPORT_SYMBOL(drop_pagecache_sb);

void drop_pagecache(void)
{
diff --git a/fs/unionfs/dentry.c b/fs/unionfs/dentry.c
index 7d27987..c7b6a7c 100644
--- a/fs/unionfs/dentry.c
+++ b/fs/unionfs/dentry.c
@@ -276,6 +276,17 @@ static inline void purge_inode_data(struct inode *inode)
truncate_inode_pages(&inode->i_data, 0);
}

+void purge_sb_data(struct super_block *sb)
+{
+ struct inode *inode;
+
+ list_for_each_entry(inode, &sb->s_inodes, i_sb_list) {
+ if (inode->i_state & (I_FREEING|I_WILL_FREE))
+ continue;
+ purge_inode_data(inode);
+ }
+}
+
/*
* Revalidate a parent chain of dentries, then the actual node.
* Assumes that dentry is locked, but will lock all parents if/when needed.
diff --git a/fs/unionfs/super.c b/fs/unionfs/super.c
index d9cf2a7..0ff9a9e 100644
--- a/fs/unionfs/super.c
+++ b/fs/unionfs/super.c
@@ -704,29 +704,19 @@ out_no_change:
*/

/*
- * Now we call drop_pagecache_sb() to invalidate all pages in this
- * super. This function calls invalidate_inode_pages(mapping),
- * which calls invalidate_mapping_pages(): the latter, however, will
- * not invalidate pages which are dirty, locked, under writeback, or
- * mapped into page tables. We shouldn't have to worry about dirty
- * or under-writeback pages, because do_remount_sb() called
- * fsync_super() which would not have returned until all dirty pages
- * were flushed.
- *
- * But do we have to worry about locked pages? Is there any chance
- * that in here we'll get locked pages?
- *
- * XXX: what about pages mapped into pagetables? Are these pages
- * which user processes may have mmap(2)'ed? If so, then we need to
- * invalidate those too, no? Maybe we'll have to write our own
- * version of invalidate_mapping_pages() which also handled mapped
- * pages.
- *
- * XXX: Alternatively, maybe we should call truncate_inode_pages(),
- * which use two passes over the pages list, and will truncate all
- * pages.
+ * Once we finish the remounting successfully, our superblock
+ * generation number will have increased. This will be detected by
+ * our dentry-revalidation code upon subsequent f/s operations
+ * through unionfs. The revalidation code will rebuild the union of
+ * lower inodes for a given unionfs inode and invalidate any pages
+ * of such "stale" inodes (by calling our purge_inode_data
+ * function). This revalidation will happen lazily and
+ * incrementally, as users perform operations on cached inodes. We
+ * would like to encourage this revalidation to happen sooner if
+ * possible, so we try to invalidate as many other pages in our
+ * superblock as we can.
*/
- drop_pagecache_sb(sb);
+ purge_sb_data(sb);

/* copy new vectors into their correct place */
tmp_data = UNIONFS_SB(sb)->data;
diff --git a/fs/unionfs/union.h b/fs/unionfs/union.h
index 20bff7b..b1bcb42 100644
--- a/fs/unionfs/union.h
+++ b/fs/unionfs/union.h
@@ -362,6 +362,7 @@ extern int unionfs_rmdir(struct inode *dir, struct dentry *dentry);
extern bool __unionfs_d_revalidate_chain(struct dentry *dentry,
struct nameidata *nd, bool willwrite);
extern bool is_newer_lower(const struct dentry *dentry);
+extern void purge_sb_data(struct super_block *sb);

/* The values for unionfs_interpose's flag. */
#define INTERPOSE_DEFAULT 0
diff --git a/include/linux/mm.h b/include/linux/mm.h
index fc61bd3..1b7b95c 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -19,7 +19,6 @@ struct anon_vma;
struct file_ra_state;
struct user_struct;
struct writeback_control;
-struct super_block;

#ifndef CONFIG_DISCONTIGMEM /* Don't use mapnrs, do it properly */
extern unsigned long max_mapnr;
@@ -1136,7 +1135,6 @@ int drop_caches_sysctl_handler(struct ctl_table *, int, struct file *,
void __user *, size_t *, loff_t *);
unsigned long shrink_slab(unsigned long scanned, gfp_t gfp_mask,
unsigned long lru_pages);
-extern void drop_pagecache_sb(struct super_block *);
void drop_pagecache(void);
void drop_slab(void);

--
1.5.2.2

2007-12-14 15:50:23

by Erez Zadok

[permalink] [raw]
Subject: [PATCH 2/2] Unionfs: clarify usage.txt mount options

CC: Jim Kissel <[email protected]>

Signed-off-by: Erez Zadok <[email protected]>
---
Documentation/filesystems/unionfs/usage.txt | 38 +++++++++++++++++++--------
1 files changed, 27 insertions(+), 11 deletions(-)

diff --git a/Documentation/filesystems/unionfs/usage.txt b/Documentation/filesystems/unionfs/usage.txt
index a6b1aca..59c4f28 100644
--- a/Documentation/filesystems/unionfs/usage.txt
+++ b/Documentation/filesystems/unionfs/usage.txt
@@ -7,17 +7,32 @@ read-write branches, as well as insertion and deletion of branches anywhere
in the fan-out. To maintain Unix semantics, Unionfs handles elimination of
duplicates, partial-error conditions, and more.

-# mount -t unionfs -o branch-option[,union-options[,...]] none MOUNTPOINT
+GENERAL SYNTAX
+==============

-The available branch-option for the mount command is:
+# mount -t unionfs -o <OPTIONS>,<BRANCH-OPTIONS> none MOUNTPOINT
+
+OPTIONS can be any legal combination one of:
+
+- ro # mount file system read-only
+- rw # mount file system read-write
+- remount # remount the file system (see Branch Management below)
+- incgen # increment generation no. (see Cache Consistency below)
+
+BRANCH-OPTIONS can be either (1) a list of branches given to the "dirs="
+option, or (2) a list of individual branch manipulation commands, described
+in the "Branch Management" section below.
+
+The syntax for the "dirs=" mount option is:

dirs=branch[=ro|=rw][:...]

-specifies a separated list of which directories compose the union.
-Directories that come earlier in the list have a higher precedence than
-those which come later. Additionally, read-only or read-write permissions of
-the branch can be specified by appending =ro or =rw (default) to each
-directory.
+The "dirs=" option takes a colon-delimited list of directories to compose
+the union, with an optional branch mode for each of those directories.
+Directories that come earlier (specified first, on the left) in the list
+have a higher precedence than those which come later. Additionally,
+read-only or read-write permissions of the branch can be specified by
+appending =ro or =rw (default) to each directory.

Syntax:

@@ -28,11 +43,12 @@ Example:
dirs=/writable_branch=rw:/read-only_branch=ro


-DYNAMIC BRANCH MANAGEMENT AND REMOUNTS
-======================================
+BRANCH MANAGEMENT
+=================

-You can remount a union and change its overall mode, or reconfigure the
-branches, as follows.
+Once you mount your union for the first time, using the "dirs=" option, you
+can then change the union's overall mode or reconfigure the branches, using
+the remount option, as follows.

To downgrade a union from read-write to read-only:

--
1.5.2.2

2007-12-14 16:32:58

by Michael Tokarev

[permalink] [raw]
Subject: Re: [PATCH 2/2] Unionfs: clarify usage.txt mount options

Erez Zadok wrote:

> --- a/Documentation/filesystems/unionfs/usage.txt
> +++ b/Documentation/filesystems/unionfs/usage.txt
[]
> +OPTIONS can be any legal combination one of:
^^^^^
A small typo.

> +
> +- ro # mount file system read-only
> +- rw # mount file system read-write
> +- remount # remount the file system (see Branch Management below)
> +- incgen # increment generation no. (see Cache Consistency below)
> +
> +BRANCH-OPTIONS can be either (1) a list of branches given to the "dirs="
> +option, or (2) a list of individual branch manipulation commands, described
> +in the "Branch Management" section below.

Should this (2) choice mention remount operation somehow?
It seems the (2) case is about remount only.

> +The "dirs=" option takes a colon-delimited list of directories to compose
> +the union, with an optional branch mode for each of those directories.
> +Directories that come earlier (specified first, on the left) in the list
> +have a higher precedence than those which come later. Additionally,

By the way, what is "precedence" here? I mean, will unionfs always
write to the first rw branch, or is it possible to write to some
deeper branch too?

And finally, can a first (and all of them as well) branch be ro?

Thanks.

/mjt

2007-12-14 16:36:20

by Michael Tokarev

[permalink] [raw]
Subject: Re: [PATCH 2/2] Unionfs: clarify usage.txt mount options

Michael Tokarev wrote:
> Erez Zadok wrote:
[...]

JFYI: My message bounced back:

<[email protected]>: host cs.sunysb.edu[130.245.1.15] said: 550 5.7.1 Access
denied (in reply to MAIL FROM command)

(stupid anti-spam policy @sunysb.edu, it seems - refusing
to accept *.ru as sender address, no matter if the mail
was sent to, say, postmaster@, asking about delivery
problems... Oh well.)

Please excuse me for cluttering up the lists.

/mjt

2007-12-14 16:59:48

by Erez Zadok

[permalink] [raw]
Subject: Re: [PATCH 2/2] Unionfs: clarify usage.txt mount options

In message <[email protected]>, Michael Tokarev writes:
> Erez Zadok wrote:
>
> > --- a/Documentation/filesystems/unionfs/usage.txt
> > +++ b/Documentation/filesystems/unionfs/usage.txt
> []
> > +OPTIONS can be any legal combination one of:
> ^^^^^
> A small typo.

Thanks. Will fix.

> > +
> > +- ro # mount file system read-only
> > +- rw # mount file system read-write
> > +- remount # remount the file system (see Branch Management below)
> > +- incgen # increment generation no. (see Cache Consistency below)
> > +
> > +BRANCH-OPTIONS can be either (1) a list of branches given to the "dirs="
> > +option, or (2) a list of individual branch manipulation commands, described
> > +in the "Branch Management" section below.
>
> Should this (2) choice mention remount operation somehow?
> It seems the (2) case is about remount only.

I'll clarify it further.

> > +The "dirs=" option takes a colon-delimited list of directories to compose
> > +the union, with an optional branch mode for each of those directories.
> > +Directories that come earlier (specified first, on the left) in the list
> > +have a higher precedence than those which come later. Additionally,
>
> By the way, what is "precedence" here? I mean, will unionfs always
> write to the first rw branch, or is it possible to write to some
> deeper branch too?
>
> And finally, can a first (and all of them as well) branch be ro?

Sounds like more clarifications :-)

A unionfs mount can be readonly or readwrite. If it's readonly, none of the
branches will be written to. If the union is readwrite, then the leftmost
(highest priority) branch must be writeable (for copyup to take place); the
remaining branches can be any mix of read-write and read-only.

In a writeable union, Unionfs will create new files in the leftmost branch.
If one tries to modify a file in a readonly branch/media, unionfs will copy
it up to the leftmost branch and modify it there. If you try to modify a
file from a writeable branch which is NOT the leftmost branch, then unionfs
will modify it in that branch; this is useful if you, say, unify differnet
packages (e.g., apache, sendmail, ftpd, etc.) and you want changes to
specific package files to remain logically in the directory where they came
from.

> Thanks.
>
> /mjt

Erez.