2007-09-03 02:24:32

by Josef 'Jeff' Sipek

[permalink] [raw]
Subject: [GIT PULL -mm] Unionfs/fsstack/eCryptfs updates/cleanups/fixes

The following is a series of patches related to Unionfs, which include three
small VFS/fsstack patches and one eCryptfs patch; the rest are Unionfs
patches. The patches here represent several months of work and testing
under various conditions, especially low-memory, SMP, and preemption
situations with an assortment of lower systems: ext2/3/4, xfs, reiserfs,
nfs, jffs2, ramfs, tmpfs, cramfs, and squashfs.

You can pull from 'master' branch of
git://git.kernel.org/pub/scm/linux/kernel/git/jsipek/unionfs.git

As I already mentioned to Andrew, some of these patches should go into
Linus's tree, but the remainder depends on them.

Additionally, as Andrew said earlier today, let's figure out whether or not
Unionfs gets merged.


For Linus:
VFS: export release_open_intent symbol
VFS/fsstack: remove 3rd argument to fsstack_copy_attr_all
VFS/fsstack: cpp endif comments

For Andrew:
Unionfs: fixed compilation error
Unionfs: do not use fsstack_copy_attr_all
Unionfs: copyright corrections and updates
Unionfs: cpp endif comments
Unionfs: cache-coherency - update inode times
Unionfs: cache-coherency - dentries
Unionfs: cache-coherency - file flush
Unionfs: cache-coherency and fixes for unionfs_rename
Unionfs: documentation updates
Unionfs: copyup updates
Unionfs: file_revalidate updates
Unionfs: implement f/async
Unionfs: minor file_release updates
Unionfs: interpose updates
Unionfs: unionfs_ioctl bug fixes
Unionfs: partial_lookup update
Unionfs: lower nameidata support
Unionfs: mmap fixes
Unionfs: handling lower vfsmount fixes
Unionfs: mount-time option parsing fix
Unionfs: remove old nfsro option
Unionfs: readonly branch test fix
Unionfs: minor remount fixes
Unionfs: extended attributes fixes
Unionfs: use file f_path field
Unionfs: assorted comment and style updates
Unionfs: update unionfs version number
Unionfs: debugging and validation of fan-out invariants
Unionfs: unionfs_create rewrite


Josef 'Jeff' Sipek, on behalf of the Unionfs team.

[email protected]


2007-09-03 02:25:30

by Josef 'Jeff' Sipek

[permalink] [raw]
Subject: [PATCH 16/32] Unionfs: minor file_release updates

From: Erez Zadok <[email protected]>

Signed-off-by: Erez Zadok <[email protected]>
Signed-off-by: Josef 'Jeff' Sipek <[email protected]>
---
fs/unionfs/commonfops.c | 6 +++---
1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/fs/unionfs/commonfops.c b/fs/unionfs/commonfops.c
index 612207a..1050c49 100644
--- a/fs/unionfs/commonfops.c
+++ b/fs/unionfs/commonfops.c
@@ -593,8 +593,7 @@ int unionfs_file_release(struct inode *inode, struct file *file)
struct unionfs_inode_info *inodeinfo;
struct super_block *sb = inode->i_sb;
int bindex, bstart, bend;
- int fgen;
- int err;
+ int fgen, err = 0;

unionfs_read_lock(sb);
/*
@@ -618,7 +617,7 @@ int unionfs_file_release(struct inode *inode, struct file *file)

if (lower_file) {
fput(lower_file);
- branchput(inode->i_sb, bindex);
+ branchput(sb, bindex);
}
}
kfree(fileinfo->lower_files);
@@ -640,6 +639,7 @@ int unionfs_file_release(struct inode *inode, struct file *file)
fileinfo->rdstate = NULL;
}
kfree(fileinfo);
+
out:
unionfs_read_unlock(sb);
return err;
--
1.5.2.2.238.g7cbf2f2

2007-09-03 02:26:22

by Josef 'Jeff' Sipek

[permalink] [raw]
Subject: [PATCH 22/32] Unionfs: handling lower vfsmount fixes

From: Erez Zadok <[email protected]>

Properly increase/release lower vfsmounts.
Validate proper use of unionfs mntget/put.

Signed-off-by: Erez Zadok <[email protected]>
Signed-off-by: Josef 'Jeff' Sipek <[email protected]>
---
fs/unionfs/dentry.c | 6 ++++--
fs/unionfs/inode.c | 11 +++++++++++
fs/unionfs/lookup.c | 11 ++++++-----
fs/unionfs/union.h | 47 +++++++++++++++++++++++++++++++++++++++--------
4 files changed, 60 insertions(+), 15 deletions(-)

diff --git a/fs/unionfs/dentry.c b/fs/unionfs/dentry.c
index c7bbeca..0be958f 100644
--- a/fs/unionfs/dentry.c
+++ b/fs/unionfs/dentry.c
@@ -450,9 +450,11 @@ static void unionfs_d_release(struct dentry *dentry)
bend = dbend(dentry);
for (bindex = bstart; bindex <= bend; bindex++) {
dput(unionfs_lower_dentry_idx(dentry, bindex));
- unionfs_mntput(dentry, bindex);
-
unionfs_set_lower_dentry_idx(dentry, bindex, NULL);
+ /* NULL lower mnt is ok if this is a negative dentry */
+ if (!dentry->d_inode && !unionfs_lower_mnt_idx(dentry,bindex))
+ continue;
+ unionfs_mntput(dentry, bindex);
unionfs_set_lower_mnt_idx(dentry, bindex, NULL);
}
/* free private data (unionfs_dentry_info) here */
diff --git a/fs/unionfs/inode.c b/fs/unionfs/inode.c
index a219a40..d6a79d5 100644
--- a/fs/unionfs/inode.c
+++ b/fs/unionfs/inode.c
@@ -222,8 +222,11 @@ out:
dput(wh_dentry);
kfree(name);

+ if (!err)
+ unionfs_postcopyup_setmnt(dentry);
unionfs_unlock_dentry(dentry);
unionfs_read_unlock(dentry->d_sb);
+
return err;
}

@@ -563,8 +566,12 @@ out:
d_drop(dentry);

kfree(name);
+ if (!err)
+ unionfs_postcopyup_setmnt(dentry);
unionfs_unlock_dentry(dentry);
+
unionfs_read_unlock(dentry->d_sb);
+
return err;
}

@@ -835,8 +842,12 @@ out:

kfree(name);

+ if (!err)
+ unionfs_postcopyup_setmnt(dentry);
unionfs_unlock_dentry(dentry);
+
unionfs_read_unlock(dentry->d_sb);
+
return err;
}

diff --git a/fs/unionfs/lookup.c b/fs/unionfs/lookup.c
index 152d421..38ee21f 100644
--- a/fs/unionfs/lookup.c
+++ b/fs/unionfs/lookup.c
@@ -314,24 +314,25 @@ out_negative:
if (lookupmode == INTERPOSE_REVAL) {
if (dentry->d_inode)
UNIONFS_I(dentry->d_inode)->stale = 1;
-
goto out;
}
/* This should only happen if we found a whiteout. */
if (first_dentry_offset == -1) {
first_lower_dentry = lookup_one_len(name, lower_dir_dentry,
- namelen);
+ namelen);
first_dentry_offset = bindex;
if (IS_ERR(first_lower_dentry)) {
err = PTR_ERR(first_lower_dentry);
goto out;
}
-
- /* FIXME: the following line needs to be changed to allow
+
+ /*
+ * FIXME: the following line needs to be changed to allow
* mount-point crossing
*/
first_dentry = dentry;
- first_lower_mnt = unionfs_mntget(dentry, bindex);
+ first_lower_mnt = unionfs_mntget(dentry->d_sb->s_root,
+ bindex);
}
unionfs_set_lower_dentry_idx(dentry, first_dentry_offset,
first_lower_dentry);
diff --git a/fs/unionfs/union.h b/fs/unionfs/union.h
index 9f3e68a..33c2137 100644
--- a/fs/unionfs/union.h
+++ b/fs/unionfs/union.h
@@ -459,18 +459,49 @@ static inline void unlock_dir(struct dentry *dir)
static inline struct vfsmount *unionfs_mntget(struct dentry *dentry,
int bindex)
{
- BUG_ON(!dentry || bindex < 0);
-
- return mntget(unionfs_lower_mnt_idx(dentry, bindex));
+ struct vfsmount *mnt;
+
+ if (!dentry) {
+ if (bindex < 0)
+ return NULL;
+ if (!dentry && bindex >= 0) {
+ return NULL;
+ }
+ }
+ mnt = unionfs_lower_mnt_idx(dentry, bindex);
+ if (!mnt) {
+ if (bindex < 0)
+ return NULL;
+ if (!mnt && bindex >= 0) {
+ return NULL;
+ }
+ }
+ mnt = mntget(mnt);
+ return mnt;
}

static inline void unionfs_mntput(struct dentry *dentry, int bindex)
{
- if (!dentry)
- return;
+ struct vfsmount *mnt;
+
+ if (!dentry) {
+ if (bindex < 0)
+ return;
+ if (!dentry && bindex >= 0) {
+ return;
+ }
+ }
+ mnt = unionfs_lower_mnt_idx(dentry, bindex);
+ if (!mnt) {
+ if (bindex < 0)
+ return;
+ if (!mnt && bindex >= 0) {
+ return;
+ }
+ }
+ mntput(mnt);
+}
+

- BUG_ON(bindex < 0);

- mntput(unionfs_lower_mnt_idx(dentry, bindex));
-}
#endif /* not _UNION_H_ */
--
1.5.2.2.238.g7cbf2f2

2007-09-03 02:26:49

by Josef 'Jeff' Sipek

[permalink] [raw]
Subject: [PATCH 02/32] VFS/fsstack: remove 3rd argument to fsstack_copy_attr_all

From: Erez Zadok <[email protected]>

Unionfs needs a special fan-out version of fsstack_copy_attr_all. A
single-level stackable file systems such as eCryptfs can therefore use a
simplified fsstack_copy_attr_all function; remove its 3rd argument, which
was never used by eCryptfs and was only used by Unionfs.

Acked-by: Michael Halcrow <[email protected]>
Signed-off-by: Erez Zadok <[email protected]>
Signed-off-by: Josef 'Jeff' Sipek <[email protected]>
---
fs/ecryptfs/dentry.c | 2 +-
fs/ecryptfs/inode.c | 6 +++---
fs/ecryptfs/main.c | 2 +-
fs/stack.c | 13 ++-----------
include/linux/fs_stack.h | 4 +---
5 files changed, 8 insertions(+), 19 deletions(-)

diff --git a/fs/ecryptfs/dentry.c b/fs/ecryptfs/dentry.c
index cb20b96..a8c1686 100644
--- a/fs/ecryptfs/dentry.c
+++ b/fs/ecryptfs/dentry.c
@@ -62,7 +62,7 @@ static int ecryptfs_d_revalidate(struct dentry *dentry, struct nameidata *nd)
struct inode *lower_inode =
ecryptfs_inode_to_lower(dentry->d_inode);

- fsstack_copy_attr_all(dentry->d_inode, lower_inode, NULL);
+ fsstack_copy_attr_all(dentry->d_inode, lower_inode);
}
out:
return rc;
diff --git a/fs/ecryptfs/inode.c b/fs/ecryptfs/inode.c
index 131954b..fc4c6cb 100644
--- a/fs/ecryptfs/inode.c
+++ b/fs/ecryptfs/inode.c
@@ -601,9 +601,9 @@ ecryptfs_rename(struct inode *old_dir, struct dentry *old_dentry,
lower_new_dir_dentry->d_inode, lower_new_dentry);
if (rc)
goto out_lock;
- fsstack_copy_attr_all(new_dir, lower_new_dir_dentry->d_inode, NULL);
+ fsstack_copy_attr_all(new_dir, lower_new_dir_dentry->d_inode);
if (new_dir != old_dir)
- fsstack_copy_attr_all(old_dir, lower_old_dir_dentry->d_inode, NULL);
+ fsstack_copy_attr_all(old_dir, lower_old_dir_dentry->d_inode);
out_lock:
unlock_rename(lower_old_dir_dentry, lower_new_dir_dentry);
dput(lower_new_dentry->d_parent);
@@ -961,7 +961,7 @@ static int ecryptfs_setattr(struct dentry *dentry, struct iattr *ia)
}
rc = notify_change(lower_dentry, ia);
out:
- fsstack_copy_attr_all(inode, lower_inode, NULL);
+ fsstack_copy_attr_all(inode, lower_inode);
return rc;
}

diff --git a/fs/ecryptfs/main.c b/fs/ecryptfs/main.c
index a984972..cb349a4 100644
--- a/fs/ecryptfs/main.c
+++ b/fs/ecryptfs/main.c
@@ -151,7 +151,7 @@ int ecryptfs_interpose(struct dentry *lower_dentry, struct dentry *dentry,
d_add(dentry, inode);
else
d_instantiate(dentry, inode);
- fsstack_copy_attr_all(inode, lower_inode, NULL);
+ fsstack_copy_attr_all(inode, lower_inode);
/* This size will be overwritten for real files w/ headers and
* other metadata */
fsstack_copy_inode_size(inode, lower_inode);
diff --git a/fs/stack.c b/fs/stack.c
index 4368d4b..a548aac 100644
--- a/fs/stack.c
+++ b/fs/stack.c
@@ -30,8 +30,7 @@ EXPORT_SYMBOL_GPL(fsstack_copy_inode_size);
* copy all attributes; get_nlinks is optional way to override the i_nlink
* copying
*/
-void fsstack_copy_attr_all(struct inode *dest, const struct inode *src,
- int (*get_nlinks)(struct inode *))
+void fsstack_copy_attr_all(struct inode *dest, const struct inode *src)
{
dest->i_mode = src->i_mode;
dest->i_uid = src->i_uid;
@@ -42,14 +41,6 @@ void fsstack_copy_attr_all(struct inode *dest, const struct inode *src,
dest->i_ctime = src->i_ctime;
dest->i_blkbits = src->i_blkbits;
dest->i_flags = src->i_flags;
-
- /*
- * Update the nlinks AFTER updating the above fields, because the
- * get_links callback may depend on them.
- */
- if (!get_nlinks)
- dest->i_nlink = src->i_nlink;
- else
- dest->i_nlink = (*get_nlinks)(dest);
+ dest->i_nlink = src->i_nlink;
}
EXPORT_SYMBOL_GPL(fsstack_copy_attr_all);
diff --git a/include/linux/fs_stack.h b/include/linux/fs_stack.h
index f3cd7f4..6b52faf 100644
--- a/include/linux/fs_stack.h
+++ b/include/linux/fs_stack.h
@@ -20,9 +20,7 @@
#include <linux/fs.h>

/* externs for fs/stack.c */
-extern void fsstack_copy_attr_all(struct inode *dest, const struct inode *src,
- int (*get_nlinks)(struct inode *));
-
+extern void fsstack_copy_attr_all(struct inode *dest, const struct inode *src);
extern void fsstack_copy_inode_size(struct inode *dst,
const struct inode *src);

--
1.5.2.2.238.g7cbf2f2

2007-09-03 02:27:20

by Josef 'Jeff' Sipek

[permalink] [raw]
Subject: [PATCH 26/32] Unionfs: minor remount fixes

From: Erez Zadok <[email protected]>

Signed-off-by: Erez Zadok <[email protected]>
Signed-off-by: Josef 'Jeff' Sipek <[email protected]>
---
fs/unionfs/super.c | 8 ++++----
1 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/fs/unionfs/super.c b/fs/unionfs/super.c
index 1de41ea..339afab 100644
--- a/fs/unionfs/super.c
+++ b/fs/unionfs/super.c
@@ -736,7 +736,7 @@ out_no_change:
for (i=dbstart(sb->s_root); i<=dbend(sb->s_root); i++) {
struct dentry *lower_dentry =
unionfs_lower_dentry_idx(sb->s_root, i);
- atomic_inc(&lower_dentry->d_inode->i_count);
+ igrab(lower_dentry->d_inode);
new_lower_inodes[i] = lower_dentry->d_inode;
}
/* 2. release reference on all older lower inodes */
@@ -758,11 +758,11 @@ out_no_change:
i = atomic_inc_return(&UNIONFS_SB(sb)->generation);
atomic_set(&UNIONFS_D(sb->s_root)->generation, i);
atomic_set(&UNIONFS_I(sb->s_root->d_inode)->generation, i);
-
- err = 0; /* reset to success */
-
if (!(*flags & MS_SILENT))
printk("unionfs: new generation number %d\n", i);
+ /* finally, update the root dentry's times */
+ unionfs_copy_attr_times(sb->s_root->d_inode);
+ err = 0; /* reset to success */

/*
* The code above falls through to the next label, and releases the
--
1.5.2.2.238.g7cbf2f2

2007-09-03 02:27:40

by Josef 'Jeff' Sipek

[permalink] [raw]
Subject: [PATCH 11/32] Unionfs: cache-coherency and fixes for unionfs_rename

From: Erez Zadok <[email protected]>

Signed-off-by: Erez Zadok <[email protected]>
Signed-off-by: Josef 'Jeff' Sipek <[email protected]>
---
fs/unionfs/rename.c | 30 ++++++++++++++++++++++++++++--
1 files changed, 28 insertions(+), 2 deletions(-)

diff --git a/fs/unionfs/rename.c b/fs/unionfs/rename.c
index 4de984e..f8ea37a 100644
--- a/fs/unionfs/rename.c
+++ b/fs/unionfs/rename.c
@@ -457,18 +457,44 @@ int unionfs_rename(struct inode *old_dir, struct dentry *old_dentry,
}
}
err = do_unionfs_rename(old_dir, old_dentry, new_dir, new_dentry);
-
out:
if (err)
/* clear the new_dentry stuff created */
d_drop(new_dentry);
- else
+ else {
/*
* force re-lookup since the dir on ro branch is not renamed,
* and lower dentries still indicate the un-renamed ones.
*/
if (S_ISDIR(old_dentry->d_inode->i_mode))
atomic_dec(&UNIONFS_D(old_dentry)->generation);
+ if (new_dentry->d_inode &&
+ !S_ISDIR(new_dentry->d_inode->i_mode)) {
+ if (!unionfs_lower_inode(new_dentry->d_inode)) {
+ /*
+ * If we get here, it means that no copyup
+ * was needed, and that a file by the old
+ * name already existing on the destination
+ * branch; that file got renamed earlier in
+ * this function, so all we need to do here
+ * is set the lower inode.
+ */
+ struct inode *inode;
+ inode = unionfs_lower_inode(
+ old_dentry->d_inode);
+ igrab(inode);
+ unionfs_set_lower_inode_idx(
+ new_dentry->d_inode,
+ dbstart(new_dentry), inode);
+ }
+
+ }
+ /* if all of this renaming succeeded, update our times */
+ unionfs_copy_attr_times(old_dir);
+ unionfs_copy_attr_times(new_dir);
+ unionfs_copy_attr_times(old_dentry->d_inode);
+ unionfs_copy_attr_times(new_dentry->d_inode);
+ }

unionfs_unlock_dentry(new_dentry);
unionfs_unlock_dentry(old_dentry);
--
1.5.2.2.238.g7cbf2f2

2007-09-03 02:28:15

by Josef 'Jeff' Sipek

[permalink] [raw]
Subject: [PATCH 04/32] Unionfs: fixed compilation error

From: Andrew Morton <[email protected]>

Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Josef 'Jeff' Sipek <[email protected]>
---
include/linux/mm.h | 1 +
1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index d823db0..aee99b6 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -20,6 +20,7 @@ struct anon_vma;
struct file_ra_state;
struct user_struct;
struct writeback_control;
+struct super_block;

#ifndef CONFIG_DISCONTIGMEM /* Don't use mapnrs, do it properly */
extern unsigned long max_mapnr;
--
1.5.2.2.238.g7cbf2f2

2007-09-03 02:28:45

by Josef 'Jeff' Sipek

[permalink] [raw]
Subject: [PATCH 08/32] Unionfs: cache-coherency - update inode times

From: Erez Zadok <[email protected]>

Part of cache-coherency support (as per OLS'07 talk and
Documentation/filesystems/unionfs/concepts.txt): update our inode time if
lower had changed.

Signed-off-by: Erez Zadok <[email protected]>
Signed-off-by: Josef 'Jeff' Sipek <[email protected]>
---
fs/unionfs/file.c | 7 +++++--
fs/unionfs/inode.c | 11 +++++++++++
fs/unionfs/unlink.c | 3 +++
3 files changed, 19 insertions(+), 2 deletions(-)

diff --git a/fs/unionfs/file.c b/fs/unionfs/file.c
index ad6821c..47b63f3 100644
--- a/fs/unionfs/file.c
+++ b/fs/unionfs/file.c
@@ -67,17 +67,20 @@ out:
unionfs_read_unlock(file->f_path.dentry->d_sb);
return err;
}
-static ssize_t unionfs_write(struct file * file, const char __user * buf,
+
+static ssize_t unionfs_write(struct file *file, const char __user *buf,
size_t count, loff_t *ppos)
{
int err = 0;

unionfs_read_lock(file->f_path.dentry->d_sb);
-
if ((err = unionfs_file_revalidate(file, 1)))
goto out;

err = do_sync_write(file, buf, count, ppos);
+ /* update our inode times upon a successful lower write */
+ if (err >= 0)
+ unionfs_copy_attr_times(file->f_path.dentry->d_inode);

out:
unionfs_read_unlock(file->f_path.dentry->d_sb);
diff --git a/fs/unionfs/inode.c b/fs/unionfs/inode.c
index 9261bed..66614e3 100644
--- a/fs/unionfs/inode.c
+++ b/fs/unionfs/inode.c
@@ -245,6 +245,12 @@ static struct dentry *unionfs_lookup(struct inode *parent,
nd->dentry = path_save.dentry;
nd->mnt = path_save.mnt;
}
+ if (!IS_ERR(ret)) {
+ if (ret)
+ dentry = ret;
+ /* parent times may have changed */
+ unionfs_copy_attr_times(dentry->d_parent->d_inode);
+ }

unionfs_read_unlock(dentry->d_sb);

@@ -675,8 +681,11 @@ out:

kfree(name);

+ if (!err)
+ unionfs_copy_attr_times(dentry->d_inode);
unionfs_unlock_dentry(dentry);
unionfs_read_unlock(dentry->d_sb);
+
return err;
}

@@ -1006,6 +1015,8 @@ static int unionfs_permission(struct inode *inode, int mask,
break;
}
}
+ /* sync times which may have changed (asynchronously) below */
+ unionfs_copy_attr_times(inode);

out:
return err;
diff --git a/fs/unionfs/unlink.c b/fs/unionfs/unlink.c
index 822bffe..7ad19ec 100644
--- a/fs/unionfs/unlink.c
+++ b/fs/unionfs/unlink.c
@@ -41,6 +41,9 @@ static int unionfs_unlink_whiteout(struct inode *dir, struct dentry *dentry)
dget(lower_dentry);
if (!(err = is_robranch_super(dentry->d_sb, bindex)))
err = vfs_unlink(lower_dir_dentry->d_inode, lower_dentry);
+ /* if vfs_unlink succeeded, update our inode's times */
+ if (!err)
+ unionfs_copy_attr_times(dentry->d_inode);
dput(lower_dentry);
fsstack_copy_attr_times(dir, lower_dir_dentry->d_inode);
unlock_dir(lower_dir_dentry);
--
1.5.2.2.238.g7cbf2f2

2007-09-03 02:29:21

by Josef 'Jeff' Sipek

[permalink] [raw]
Subject: [PATCH 09/32] Unionfs: cache-coherency - dentries

From: Erez Zadok <[email protected]>

Utility functions to check if lower dentries/inodes are newer than upper
ones, and purging cached data if lower objects are newer. Also passed flag
to our d_revalidate_chain, to tell it if the caller may be writing data or
just reading it.

[jsipek: changed purge_inode_data to take a struct inode]
Signed-off-by: Erez Zadok <[email protected]>
Signed-off-by: Josef 'Jeff' Sipek <[email protected]>
---
fs/unionfs/commonfops.c | 2 +-
fs/unionfs/dentry.c | 127 ++++++++++++++++++++++++++++++++++++++++++++--
fs/unionfs/inode.c | 24 ++++++---
fs/unionfs/rename.c | 4 +-
fs/unionfs/super.c | 2 +-
fs/unionfs/union.h | 3 +-
fs/unionfs/unlink.c | 12 +++-
fs/unionfs/xattr.c | 8 ++--
8 files changed, 155 insertions(+), 27 deletions(-)

diff --git a/fs/unionfs/commonfops.c b/fs/unionfs/commonfops.c
index 28cb4e9..0dc7492 100644
--- a/fs/unionfs/commonfops.c
+++ b/fs/unionfs/commonfops.c
@@ -287,7 +287,7 @@ int unionfs_file_revalidate(struct file *file, int willwrite)
* but not unhashed dentries.
*/
if (!d_deleted(dentry) &&
- !__unionfs_d_revalidate_chain(dentry, NULL)) {
+ !__unionfs_d_revalidate_chain(dentry, NULL, willwrite)) {
err = -ESTALE;
goto out_nofree;
}
diff --git a/fs/unionfs/dentry.c b/fs/unionfs/dentry.c
index 4a3c479..c7bbeca 100644
--- a/fs/unionfs/dentry.c
+++ b/fs/unionfs/dentry.c
@@ -35,7 +35,6 @@ static int __unionfs_d_revalidate_one(struct dentry *dentry,
int positive = 0;
int locked = 0;
int interpose_flag;
-
struct nameidata lowernd; /* TODO: be gentler to the stack */

if (nd)
@@ -158,7 +157,7 @@ static int __unionfs_d_revalidate_one(struct dentry *dentry,
|| !lower_dentry->d_op->d_revalidate)
continue;
if (!lower_dentry->d_op->d_revalidate(lower_dentry,
- &lowernd))
+ &lowernd))
valid = 0;
}

@@ -184,10 +183,92 @@ out:
}

/*
+ * Determine if the lower inode objects have changed from below the unionfs
+ * inode. Return 1 if changed, 0 otherwise.
+ */
+int is_newer_lower(const struct dentry *dentry)
+{
+ int bindex;
+ struct inode *inode;
+ struct inode *lower_inode;
+
+ /* ignore if we're called on semi-initialized dentries/inodes */
+ if (!dentry || !UNIONFS_D(dentry))
+ return 0;
+ inode = dentry->d_inode;
+ if (!inode || !UNIONFS_I(inode) ||
+ ibstart(inode) < 0 || ibend(inode) < 0)
+ return 0;
+
+ for (bindex = ibstart(inode); bindex <= ibend(inode); bindex++) {
+ lower_inode = unionfs_lower_inode_idx(inode, bindex);
+ if (!lower_inode)
+ continue;
+ /*
+ * We may want to apply other tests to determine if the
+ * lower inode's data has changed, but checking for changed
+ * ctime and mtime on the lower inode should be enough.
+ */
+ if (timespec_compare(&inode->i_mtime,
+ &lower_inode->i_mtime) < 0) {
+ printk("unionfs: new lower inode mtime "
+ "(bindex=%d, name=%s)\n", bindex,
+ dentry->d_name.name);
+ return 1; /* mtime changed! */
+ }
+ if (timespec_compare(&inode->i_ctime,
+ &lower_inode->i_ctime) < 0) {
+ printk("unionfs: new lower inode ctime "
+ "(bindex=%d, name=%s)\n", bindex,
+ dentry->d_name.name);
+ return 1; /* ctime changed! */
+ }
+ }
+ return 0; /* default: lower is not newer */
+}
+
+/*
+ * Purge/remove/unmap all date pages of a unionfs inode. This is called
+ * when the lower inode has changed, and we have to force processes to get
+ * the new data.
+ *
+ * XXX: Our implementation works in that as long as a user process will have
+ * caused Unionfs to be called, directly or indirectly, even to just do
+ * ->d_revalidate; then we will have purged the current Unionfs data and the
+ * process will see the new data. For example, a process that continually
+ * re-reads the same file's data will see the NEW data as soon as the lower
+ * file had changed, upon the next read(2) syscall (even if the file is
+ * still open!) However, this doesn't work when the process re-reads the
+ * open file's data via mmap(2) (unless the user unmaps/closes the file and
+ * remaps/reopens it). Once we respond to ->readpage(s), then the kernel
+ * maps the page into the process's address space and there doesn't appear
+ * to be a way to force the kernel to invalidate those pages/mappings, and
+ * force the process to re-issue ->readpage. If there's a way to invalidate
+ * active mappings and force a ->readpage, let us know please
+ * (invalidate_inode_pages2 doesn't do the trick).
+ */
+static inline void purge_inode_data(struct inode *inode)
+{
+ /* remove all non-private mappings */
+ unmap_mapping_range(inode->i_mapping, 0, 0, 0);
+
+ if (inode->i_data.nrpages)
+ truncate_inode_pages(&inode->i_data, 0);
+}
+
+/*
* Revalidate a parent chain of dentries, then the actual node.
* Assumes that dentry is locked, but will lock all parents if/when needed.
+ *
+ * If 'willwrite' is 1, and the lower inode times are not in sync, then
+ * *don't* purge_inode_data, as it could deadlock if ->write calls us and we
+ * try to truncate a locked page. Besides, if unionfs is about to write
+ * data to a file, then there's the data unionfs is about to write is more
+ * authoritative than what's below, therefore we can safely overwrite the
+ * lower inode times and data.
*/
-int __unionfs_d_revalidate_chain(struct dentry *dentry, struct nameidata *nd)
+int __unionfs_d_revalidate_chain(struct dentry *dentry, struct nameidata *nd,
+ int willwrite)
{
int valid = 0; /* default is invalid (0); valid is 1. */
struct dentry **chain = NULL; /* chain of dentries to reval */
@@ -202,6 +283,25 @@ int __unionfs_d_revalidate_chain(struct dentry *dentry, struct nameidata *nd)
sbgen = atomic_read(&UNIONFS_SB(dentry->d_sb)->generation);
dtmp = dentry->d_parent;
dgen = atomic_read(&UNIONFS_D(dtmp)->generation);
+ /* XXX: should we check if is_newer_lower all the way up? */
+ if (is_newer_lower(dtmp)) {
+ /*
+ * Special case: the root dentry's generation number must
+ * always be valid, but its lower inode times don't have to
+ * be, so sync up the times only.
+ */
+ if (IS_ROOT(dtmp))
+ unionfs_copy_attr_times(dtmp->d_inode);
+ else {
+ /*
+ * reset generation number to zero, guaranteed to be
+ * "old"
+ */
+ dgen = 0;
+ atomic_set(&UNIONFS_D(dtmp)->generation, dgen);
+ }
+ purge_inode_data(dtmp->d_inode);
+ }
while (sbgen != dgen) {
/* The root entry should always be valid */
BUG_ON(IS_ROOT(dtmp));
@@ -234,8 +334,8 @@ int __unionfs_d_revalidate_chain(struct dentry *dentry, struct nameidata *nd)
}

/*
- * call __unionfs_d_revalidate() on each dentry, but in parent to
- * child order.
+ * call __unionfs_d_revalidate_one() on each dentry, but in parent
+ * to child order.
*/
for (i=0; i<chain_len; i++) {
unionfs_lock_dentry(chain[i]);
@@ -264,6 +364,21 @@ out_this:
/* finally, lock this dentry and revalidate it */
verify_locked(dentry);
dgen = atomic_read(&UNIONFS_D(dentry)->generation);
+ if (is_newer_lower(dentry)) {
+ /* root dentry special case as aforementioned */
+ if (IS_ROOT(dentry))
+ unionfs_copy_attr_times(dentry->d_inode);
+ else {
+ /*
+ * reset generation number to zero, guaranteed to be
+ * "old"
+ */
+ dgen = 0;
+ atomic_set(&UNIONFS_D(dentry)->generation, dgen);
+ }
+ if (!willwrite)
+ purge_inode_data(dentry->d_inode);
+ }
valid = __unionfs_d_revalidate_one(dentry, nd);

/*
@@ -299,7 +414,7 @@ static int unionfs_d_revalidate(struct dentry *dentry, struct nameidata *nd)
unionfs_read_lock(dentry->d_sb);

unionfs_lock_dentry(dentry);
- err = __unionfs_d_revalidate_chain(dentry, nd);
+ err = __unionfs_d_revalidate_chain(dentry, nd, 0);
unionfs_unlock_dentry(dentry);

unionfs_read_unlock(dentry->d_sb);
diff --git a/fs/unionfs/inode.c b/fs/unionfs/inode.c
index 66614e3..4789fd5 100644
--- a/fs/unionfs/inode.c
+++ b/fs/unionfs/inode.c
@@ -34,13 +34,13 @@ static int unionfs_create(struct inode *parent, struct dentry *dentry,
unionfs_lock_dentry(dentry);

unionfs_lock_dentry(dentry->d_parent);
- valid = __unionfs_d_revalidate_chain(dentry->d_parent, nd);
+ valid = __unionfs_d_revalidate_chain(dentry->d_parent, nd, 0);
unionfs_unlock_dentry(dentry->d_parent);
if (!valid) {
err = -ESTALE; /* same as what real_lookup does */
goto out;
}
- valid = __unionfs_d_revalidate_chain(dentry, nd);
+ valid = __unionfs_d_revalidate_chain(dentry, nd, 0);
/*
* It's only a bug if this dentry was not negative and couldn't be
* revalidated (shouldn't happen).
@@ -270,12 +270,12 @@ static int unionfs_link(struct dentry *old_dentry, struct inode *dir,
unionfs_read_lock(old_dentry->d_sb);
unionfs_double_lock_dentry(new_dentry, old_dentry);

- if (!__unionfs_d_revalidate_chain(old_dentry, NULL)) {
+ if (!__unionfs_d_revalidate_chain(old_dentry, NULL, 0)) {
err = -ESTALE;
goto out;
}
if (new_dentry->d_inode &&
- !__unionfs_d_revalidate_chain(new_dentry, NULL)) {
+ !__unionfs_d_revalidate_chain(new_dentry, NULL, 0)) {
err = -ESTALE;
goto out;
}
@@ -412,7 +412,7 @@ static int unionfs_symlink(struct inode *dir, struct dentry *dentry,
unionfs_lock_dentry(dentry);

if (dentry->d_inode &&
- !__unionfs_d_revalidate_chain(dentry, NULL)) {
+ !__unionfs_d_revalidate_chain(dentry, NULL, 0)) {
err = -ESTALE;
goto out;
}
@@ -560,7 +560,7 @@ static int unionfs_mkdir(struct inode *parent, struct dentry *dentry, int mode)
unionfs_lock_dentry(dentry);

if (dentry->d_inode &&
- !__unionfs_d_revalidate_chain(dentry, NULL)) {
+ !__unionfs_d_revalidate_chain(dentry, NULL, 0)) {
err = -ESTALE;
goto out;
}
@@ -703,7 +703,7 @@ static int unionfs_mknod(struct inode *dir, struct dentry *dentry, int mode,
unionfs_lock_dentry(dentry);

if (dentry->d_inode &&
- !__unionfs_d_revalidate_chain(dentry, NULL)) {
+ !__unionfs_d_revalidate_chain(dentry, NULL, 0)) {
err = -ESTALE;
goto out;
}
@@ -816,7 +816,7 @@ static int unionfs_readlink(struct dentry *dentry, char __user *buf,
unionfs_read_lock(dentry->d_sb);
unionfs_lock_dentry(dentry);

- if (!__unionfs_d_revalidate_chain(dentry, NULL)) {
+ if (!__unionfs_d_revalidate_chain(dentry, NULL, 0)) {
err = -ESTALE;
goto out;
}
@@ -890,6 +890,12 @@ static void unionfs_put_link(struct dentry *dentry, struct nameidata *nd,
void *cookie)
{
unionfs_read_lock(dentry->d_sb);
+
+ unionfs_lock_dentry(dentry);
+ if (!__unionfs_d_revalidate_chain(dentry, nd, 0))
+ printk("unionfs: put_link failed to revalidate dentry\n");
+ unionfs_unlock_dentry(dentry);
+
kfree(nd_get_link(nd));
unionfs_read_unlock(dentry->d_sb);
}
@@ -1035,7 +1041,7 @@ static int unionfs_setattr(struct dentry *dentry, struct iattr *ia)
unionfs_read_lock(dentry->d_sb);
unionfs_lock_dentry(dentry);

- if (!__unionfs_d_revalidate_chain(dentry, NULL)) {
+ if (!__unionfs_d_revalidate_chain(dentry, NULL, 0)) {
err = -ESTALE;
goto out;
}
diff --git a/fs/unionfs/rename.c b/fs/unionfs/rename.c
index 8a159d1..4de984e 100644
--- a/fs/unionfs/rename.c
+++ b/fs/unionfs/rename.c
@@ -401,12 +401,12 @@ int unionfs_rename(struct inode *old_dir, struct dentry *old_dentry,
unionfs_read_lock(old_dentry->d_sb);
unionfs_double_lock_dentry(old_dentry, new_dentry);

- if (!__unionfs_d_revalidate_chain(old_dentry, NULL)) {
+ if (!__unionfs_d_revalidate_chain(old_dentry, NULL, 0)) {
err = -ESTALE;
goto out;
}
if (!d_deleted(new_dentry) && new_dentry->d_inode &&
- !__unionfs_d_revalidate_chain(new_dentry, NULL)) {
+ !__unionfs_d_revalidate_chain(new_dentry, NULL, 0)) {
err = -ESTALE;
goto out;
}
diff --git a/fs/unionfs/super.c b/fs/unionfs/super.c
index 2f321c2..1de41ea 100644
--- a/fs/unionfs/super.c
+++ b/fs/unionfs/super.c
@@ -126,7 +126,7 @@ static int unionfs_statfs(struct dentry *dentry, struct kstatfs *buf)
unionfs_read_lock(sb);
unionfs_lock_dentry(dentry);

- if (!__unionfs_d_revalidate_chain(dentry, NULL)) {
+ if (!__unionfs_d_revalidate_chain(dentry, NULL, 0)) {
err = -ESTALE;
goto out;
}
diff --git a/fs/unionfs/union.h b/fs/unionfs/union.h
index b7e5a93..a9f9b97 100644
--- a/fs/unionfs/union.h
+++ b/fs/unionfs/union.h
@@ -317,7 +317,8 @@ extern int unionfs_unlink(struct inode *dir, struct dentry *dentry);
extern int unionfs_rmdir(struct inode *dir, struct dentry *dentry);

extern int __unionfs_d_revalidate_chain(struct dentry *dentry,
- struct nameidata *nd);
+ struct nameidata *nd, int willwrite);
+extern int is_newer_lower(const struct dentry *dentry);

/* The values for unionfs_interpose's flag. */
#define INTERPOSE_DEFAULT 0
diff --git a/fs/unionfs/unlink.c b/fs/unionfs/unlink.c
index 7ad19ec..4e1c8f3 100644
--- a/fs/unionfs/unlink.c
+++ b/fs/unionfs/unlink.c
@@ -80,15 +80,21 @@ int unionfs_unlink(struct inode *dir, struct dentry *dentry)
unionfs_read_lock(dentry->d_sb);
unionfs_lock_dentry(dentry);

- if (!__unionfs_d_revalidate_chain(dentry, NULL)) {
+ if (!__unionfs_d_revalidate_chain(dentry, NULL, 0)) {
err = -ESTALE;
goto out;
}

err = unionfs_unlink_whiteout(dir, dentry);
/* call d_drop so the system "forgets" about us */
- if (!err)
+ if (!err) {
d_drop(dentry);
+ /*
+ * if unlink/whiteout succeeded, parent dir mtime has
+ * changed
+ */
+ unionfs_copy_attr_times(dir);
+ }

out:
unionfs_unlock_dentry(dentry);
@@ -136,7 +142,7 @@ int unionfs_rmdir(struct inode *dir, struct dentry *dentry)
unionfs_read_lock(dentry->d_sb);
unionfs_lock_dentry(dentry);

- if (!__unionfs_d_revalidate_chain(dentry, NULL)) {
+ if (!__unionfs_d_revalidate_chain(dentry, NULL, 0)) {
err = -ESTALE;
goto out;
}
diff --git a/fs/unionfs/xattr.c b/fs/unionfs/xattr.c
index 92bcd20..46f3d4e 100644
--- a/fs/unionfs/xattr.c
+++ b/fs/unionfs/xattr.c
@@ -60,7 +60,7 @@ ssize_t unionfs_getxattr(struct dentry *dentry, const char *name, void *value,
unionfs_read_lock(dentry->d_sb);
unionfs_lock_dentry(dentry);

- if (!__unionfs_d_revalidate_chain(dentry, NULL)) {
+ if (!__unionfs_d_revalidate_chain(dentry, NULL, 0)) {
err = -ESTALE;
goto out;
}
@@ -88,7 +88,7 @@ int unionfs_setxattr(struct dentry *dentry, const char *name,
unionfs_read_lock(dentry->d_sb);
unionfs_lock_dentry(dentry);

- if (!__unionfs_d_revalidate_chain(dentry, NULL)) {
+ if (!__unionfs_d_revalidate_chain(dentry, NULL, 0)) {
err = -ESTALE;
goto out;
}
@@ -116,7 +116,7 @@ int unionfs_removexattr(struct dentry *dentry, const char *name)
unionfs_read_lock(dentry->d_sb);
unionfs_lock_dentry(dentry);

- if (!__unionfs_d_revalidate_chain(dentry, NULL)) {
+ if (!__unionfs_d_revalidate_chain(dentry, NULL, 0)) {
err = -ESTALE;
goto out;
}
@@ -144,7 +144,7 @@ ssize_t unionfs_listxattr(struct dentry *dentry, char *list, size_t size)
unionfs_read_lock(dentry->d_sb);
unionfs_lock_dentry(dentry);

- if (!__unionfs_d_revalidate_chain(dentry, NULL)) {
+ if (!__unionfs_d_revalidate_chain(dentry, NULL, 0)) {
err = -ESTALE;
goto out;
}
--
1.5.2.2.238.g7cbf2f2

2007-09-03 02:29:46

by Josef 'Jeff' Sipek

[permalink] [raw]
Subject: [PATCH 20/32] Unionfs: lower nameidata support

From: Erez Zadok <[email protected]>

Create and free custom nameidata structures, and pass them to lower file
systems when needed via vfs_create. (This code will get updated when/if
nameidata is split into an intent structure and a VFS-level only structure.)

Signed-off-by: Erez Zadok <[email protected]>
Signed-off-by: Josef 'Jeff' Sipek <[email protected]>
---
fs/unionfs/copyup.c | 8 ++++++-
fs/unionfs/lookup.c | 58 +++++++++++++++++++++++++++++++++++++++++++++++++++
fs/unionfs/rename.c | 15 +++++++++++-
fs/unionfs/subr.c | 14 ++++++++++-
fs/unionfs/union.h | 2 +
5 files changed, 92 insertions(+), 5 deletions(-)

diff --git a/fs/unionfs/copyup.c b/fs/unionfs/copyup.c
index 9c476cd..4c45790 100644
--- a/fs/unionfs/copyup.c
+++ b/fs/unionfs/copyup.c
@@ -177,19 +177,25 @@ static int __copyup_ndentry(struct dentry *old_lower_dentry,
run_sioq(__unionfs_mknod, &args);
err = args.err;
} else if (S_ISREG(old_mode)) {
+ struct nameidata nd;
+ err = init_lower_nd(&nd, LOOKUP_CREATE);
+ if (err < 0)
+ goto out;
+ args.create.nd = &nd;
args.create.parent = new_lower_parent_dentry->d_inode;
args.create.dentry = new_lower_dentry;
args.create.mode = old_mode;
- args.create.nd = NULL;

run_sioq(__unionfs_create, &args);
err = args.err;
+ release_lower_nd(&nd, err);
} else {
printk(KERN_ERR "unionfs: unknown inode type %d\n",
old_mode);
BUG();
}

+out:
return err;
}

diff --git a/fs/unionfs/lookup.c b/fs/unionfs/lookup.c
index d05daa5..152d421 100644
--- a/fs/unionfs/lookup.c
+++ b/fs/unionfs/lookup.c
@@ -564,3 +564,61 @@ void update_bstart(struct dentry *dentry)
unionfs_set_lower_dentry_idx(dentry, bindex, NULL);
}
}
+
+
+/*
+ * Initialize a nameidata structure (the intent part) we can pass to a lower
+ * file system. Returns 0 on success or -error (only -ENOMEM possible).
+ * Inside that nd structure, this function may also return an allocated
+ * struct file (for open intents). The caller, when done with this nd, must
+ * kfree the intent file (using release_lower_nd).
+ */
+int init_lower_nd(struct nameidata *nd, unsigned int flags)
+{
+ int err = 0;
+#ifdef ALLOC_LOWER_ND_FILE
+ /*
+ * XXX: one day we may need to have the lower return an open file
+ * for us. It is not needed in 2.6.23-rc1 for nfs2/nfs3, but may
+ * very well be needed for nfs4.
+ */
+ struct file *file;
+#endif /* ALLOC_LOWER_ND_FILE */
+
+ memset(nd, 0, sizeof(struct nameidata));
+
+ switch (flags) {
+ case LOOKUP_CREATE:
+ nd->flags = LOOKUP_CREATE;
+ nd->intent.open.flags = FMODE_READ | FMODE_WRITE | O_CREAT;
+#ifdef ALLOC_LOWER_ND_FILE
+ file = kzalloc(sizeof(struct file), GFP_KERNEL);
+ if (!file) {
+ err = -ENOMEM;
+ break; /* exit switch statement and thus return */
+ }
+ nd->intent.open.file = file;
+#endif /* ALLOC_LOWER_ND_FILE */
+ break;
+ default:
+ /*
+ * We should never get here, for now.
+ * We can add new cases here later on.
+ */
+ BUG();
+ break;
+ }
+
+ return err;
+}
+
+void release_lower_nd(struct nameidata *nd, int err)
+{
+ if (!nd->intent.open.file)
+ return;
+ if (!err)
+ release_open_intent(nd);
+#ifdef ALLOC_LOWER_ND_FILE
+ kfree(nd->intent.open.file);
+#endif /* ALLOC_LOWER_ND_FILE */
+}
diff --git a/fs/unionfs/rename.c b/fs/unionfs/rename.c
index acf829a..a02d678 100644
--- a/fs/unionfs/rename.c
+++ b/fs/unionfs/rename.c
@@ -256,10 +256,20 @@ static int do_unionfs_rename(struct inode *old_dir,
*/
if ((old_bstart != old_bend) || (do_copyup != -1)) {
struct dentry *lower_parent;
- BUG_ON(!wh_old || wh_old->d_inode || bwh_old < 0);
+ struct nameidata nd;
+ if (!wh_old || wh_old->d_inode || bwh_old < 0) {
+ printk(KERN_ERR "unionfs: rename error "
+ "(wh_old=%p/%p bwh_old=%d)\n", wh_old,
+ (wh_old ? wh_old->d_inode : NULL), bwh_old);
+ err = -EIO;
+ goto out;
+ }
+ err = init_lower_nd(&nd, LOOKUP_CREATE);
+ if (err < 0)
+ goto out;
lower_parent = lock_parent(wh_old);
local_err = vfs_create(lower_parent->d_inode, wh_old, S_IRUGO,
- NULL);
+ &nd);
unlock_dir(lower_parent);
if (!local_err)
set_dbopaque(old_dentry, bwh_old);
@@ -272,6 +282,7 @@ static int do_unionfs_rename(struct inode *old_dir,
"the source in rename!\n");
err = -EIO;
}
+ release_lower_nd(&nd, local_err);
}

out:
diff --git a/fs/unionfs/subr.c b/fs/unionfs/subr.c
index 5db9e62..b7e7904 100644
--- a/fs/unionfs/subr.c
+++ b/fs/unionfs/subr.c
@@ -29,6 +29,7 @@ int create_whiteout(struct dentry *dentry, int start)
struct dentry *lower_dir_dentry;
struct dentry *lower_dentry;
struct dentry *lower_wh_dentry;
+ struct nameidata nd;
char *name = NULL;
int err = -EINVAL;

@@ -82,14 +83,18 @@ int create_whiteout(struct dentry *dentry, int start)
goto out;
}

+ err = init_lower_nd(&nd, LOOKUP_CREATE);
+ if (err < 0)
+ goto out;
lower_dir_dentry = lock_parent(lower_wh_dentry);
if (!(err = is_robranch_super(dentry->d_sb, bindex)))
err = vfs_create(lower_dir_dentry->d_inode,
lower_wh_dentry,
~current->fs->umask & S_IRWXUGO,
- NULL);
+ &nd);
unlock_dir(lower_dir_dentry);
dput(lower_wh_dentry);
+ release_lower_nd(&nd, err);

if (!err || !IS_COPYUP_ERR(err))
break;
@@ -151,6 +156,7 @@ int make_dir_opaque(struct dentry *dentry, int bindex)
int err = 0;
struct dentry *lower_dentry, *diropq;
struct inode *lower_dir;
+ struct nameidata nd;

lower_dentry = unionfs_lower_dentry_idx(dentry, bindex);
lower_dir = lower_dentry->d_inode;
@@ -165,10 +171,14 @@ int make_dir_opaque(struct dentry *dentry, int bindex)
goto out;
}

+ err = init_lower_nd(&nd, LOOKUP_CREATE);
+ if (err < 0)
+ goto out;
if (!diropq->d_inode)
- err = vfs_create(lower_dir, diropq, S_IRUGO, NULL);
+ err = vfs_create(lower_dir, diropq, S_IRUGO, &nd);
if (!err)
set_dbopaque(dentry, bindex);
+ release_lower_nd(&nd, err);

dput(diropq);

diff --git a/fs/unionfs/union.h b/fs/unionfs/union.h
index 0fa8ae0..9f3e68a 100644
--- a/fs/unionfs/union.h
+++ b/fs/unionfs/union.h
@@ -248,6 +248,8 @@ static inline void unionfs_double_lock_dentry(struct dentry *d1,
extern int new_dentry_private_data(struct dentry *dentry);
extern void free_dentry_private_data(struct dentry *dentry);
extern void update_bstart(struct dentry *dentry);
+extern int init_lower_nd(struct nameidata *nd, unsigned int flags);
+extern void release_lower_nd(struct nameidata *nd, int err);

/*
* EXTERNALS:
--
1.5.2.2.238.g7cbf2f2

2007-09-03 02:30:23

by Josef 'Jeff' Sipek

[permalink] [raw]
Subject: [PATCH 01/32] VFS: export release_open_intent symbol

From: Erez Zadok <[email protected]>

Signed-off-by: Erez Zadok <[email protected]>
Signed-off-by: Josef 'Jeff' Sipek <[email protected]>
---
fs/namei.c | 1 +
1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/fs/namei.c b/fs/namei.c
index a83160a..b2b7c8e 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -374,6 +374,7 @@ void release_open_intent(struct nameidata *nd)
else
fput(nd->intent.open.file);
}
+EXPORT_SYMBOL(release_open_intent);

static inline struct dentry *
do_revalidate(struct dentry *dentry, struct nameidata *nd)
--
1.5.2.2.238.g7cbf2f2

2007-09-03 02:30:49

by Josef 'Jeff' Sipek

[permalink] [raw]
Subject: [PATCH 06/32] Unionfs: copyright corrections and updates

From: Erez Zadok <[email protected]>

Signed-off-by: Erez Zadok <[email protected]>
Signed-off-by: Josef 'Jeff' Sipek <[email protected]>
---
fs/unionfs/mmap.c | 2 +-
fs/unionfs/sioq.c | 13 +++++++------
fs/unionfs/sioq.h | 13 +++++++------
3 files changed, 15 insertions(+), 13 deletions(-)

diff --git a/fs/unionfs/mmap.c b/fs/unionfs/mmap.c
index e03ca00..969fd16 100644
--- a/fs/unionfs/mmap.c
+++ b/fs/unionfs/mmap.c
@@ -10,7 +10,7 @@
* Copyright (c) 2003 Puja Gupta
* Copyright (c) 2003 Harikesavan Krishnan
* Copyright (c) 2003-2007 Stony Brook University
- * Copyright (c) 2003-2007 The Research Foundation of State University of New York
+ * Copyright (c) 2003-2007 The Research Foundation of SUNY
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License version 2 as
diff --git a/fs/unionfs/sioq.c b/fs/unionfs/sioq.c
index 575f08d..2a8c88e 100644
--- a/fs/unionfs/sioq.c
+++ b/fs/unionfs/sioq.c
@@ -1,10 +1,11 @@
/*
- * Copyright (c) 2003-2007 Erez Zadok
- * Copyright (c) 2005-2007 Josef 'Jeff' Sipek
- * Copyright (c) 2005-2006 Junjiro Okajima
- * Copyright (c) 2004-2006 David P. Quigley
- * Copyright (c) 2003-2007 Stony Brook University
- * Copyright (c) 2003-2007 The Research Foundation of SUNY
+ * Copyright (c) 2006-2007 Erez Zadok
+ * Copyright (c) 2006 Charles P. Wright
+ * Copyright (c) 2006-2007 Josef 'Jeff' Sipek
+ * Copyright (c) 2006 Junjiro Okajima
+ * Copyright (c) 2006 David P. Quigley
+ * Copyright (c) 2006-2007 Stony Brook University
+ * Copyright (c) 2006-2007 The Research Foundation of SUNY
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License version 2 as
diff --git a/fs/unionfs/sioq.h b/fs/unionfs/sioq.h
index 6d0d01f..bedd7af 100644
--- a/fs/unionfs/sioq.h
+++ b/fs/unionfs/sioq.h
@@ -1,10 +1,11 @@
/*
- * Copyright (c) 2003-2007 Erez Zadok
- * Copyright (c) 2005-2007 Josef 'Jeff' Sipek
- * Copyright (c) 2005-2006 Junjiro Okajima
- * Copyright (c) 2004-2006 David P. Quigley
- * Copyright (c) 2003-2007 Stony Brook University
- * Copyright (c) 2003-2007 The Research Foundation of SUNY
+ * Copyright (c) 2006-2007 Erez Zadok
+ * Copyright (c) 2006 Charles P. Wright
+ * Copyright (c) 2006-2007 Josef 'Jeff' Sipek
+ * Copyright (c) 2006 Junjiro Okajima
+ * Copyright (c) 2006 David P. Quigley
+ * Copyright (c) 2006-2007 Stony Brook University
+ * Copyright (c) 2006-2007 The Research Foundation of SUNY
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License version 2 as
--
1.5.2.2.238.g7cbf2f2

2007-09-03 02:31:25

by Josef 'Jeff' Sipek

[permalink] [raw]
Subject: [PATCH 25/32] Unionfs: readonly branch test fix

From: Erez Zadok <[email protected]>

Bug fix to test if a lower branch is readonly, even when given negative
dentries.

Signed-off-by: Erez Zadok <[email protected]>
Signed-off-by: Josef 'Jeff' Sipek <[email protected]>
---
fs/unionfs/union.h | 19 ++++++++++++++-----
1 files changed, 14 insertions(+), 5 deletions(-)

diff --git a/fs/unionfs/union.h b/fs/unionfs/union.h
index 33c2137..26d886e 100644
--- a/fs/unionfs/union.h
+++ b/fs/unionfs/union.h
@@ -396,14 +396,23 @@ static inline int is_robranch_super(const struct super_block *sb, int index)
/* Is this file on a read-only branch? */
static inline int is_robranch_idx(const struct dentry *dentry, int index)
{
- int err = 0;
+ struct super_block *lower_sb;

BUG_ON(index < 0);

- if ((!(branchperms(dentry->d_sb, index) & MAY_WRITE)) ||
- IS_RDONLY(unionfs_lower_dentry_idx(dentry, index)->d_inode))
- err = -EROFS;
- return err;
+ if (!(branchperms(dentry->d_sb, index) & MAY_WRITE))
+ return -EROFS;
+
+ lower_sb = unionfs_lower_super_idx(dentry->d_sb, index);
+ BUG_ON(lower_sb == NULL);
+ /*
+ * test sb flags directly, not IS_RDONLY(lower_inode) because the
+ * lower_dentry could be a negative.
+ */
+ if (lower_sb->s_flags & MS_RDONLY)
+ return -EROFS;
+
+ return 0;
}

static inline int is_robranch(const struct dentry *dentry)
--
1.5.2.2.238.g7cbf2f2

2007-09-03 02:31:48

by Josef 'Jeff' Sipek

[permalink] [raw]
Subject: [PATCH 31/32] Unionfs: debugging and validation of fan-out invariants

From: Erez Zadok <[email protected]>

Introduce debugging functionality, Makefile support to turn it on at compile
time, and hooks in the main code to verify fan-out invariants. This is very
similar to how other file systems provide debugging functionality. This
code has been very useful in detecting and fixing problems, especially when
stacking on top of assorted file systems.

Signed-off-by: Erez Zadok <[email protected]>
Signed-off-by: Josef 'Jeff' Sipek <[email protected]>
---
fs/Kconfig | 6 +
fs/unionfs/Makefile | 2 +
fs/unionfs/commonfops.c | 17 ++
fs/unionfs/copyup.c | 2 +
fs/unionfs/debug.c | 502 +++++++++++++++++++++++++++++++++++++++++++++++
fs/unionfs/dentry.c | 4 +
fs/unionfs/file.c | 19 ++-
fs/unionfs/inode.c | 22 ++
fs/unionfs/mmap.c | 5 +
fs/unionfs/rename.c | 4 +
fs/unionfs/super.c | 3 +
fs/unionfs/union.h | 105 +++++++---
fs/unionfs/unlink.c | 6 +
fs/unionfs/xattr.c | 4 +
14 files changed, 668 insertions(+), 33 deletions(-)
create mode 100644 fs/unionfs/debug.c

diff --git a/fs/Kconfig b/fs/Kconfig
index d53d8ca..0c58d02 100644
--- a/fs/Kconfig
+++ b/fs/Kconfig
@@ -1063,6 +1063,12 @@ config UNION_FS_XATTR

If unsure, say N.

+config UNION_FS_DEBUG
+ bool "Debug Unionfs"
+ depends on UNION_FS
+ help
+ If you say Y here, you can turn on debugging output from Unionfs.
+
endmenu

menu "Miscellaneous filesystems"
diff --git a/fs/unionfs/Makefile b/fs/unionfs/Makefile
index 78be3e7..38fed90 100644
--- a/fs/unionfs/Makefile
+++ b/fs/unionfs/Makefile
@@ -5,3 +5,5 @@ unionfs-y := subr.o dentry.o file.o inode.o main.o super.o \
lookup.o commonfops.o dirfops.o sioq.o mmap.o

unionfs-$(CONFIG_UNION_FS_XATTR) += xattr.o
+
+unionfs-$(CONFIG_UNION_FS_DEBUG) += debug.o
diff --git a/fs/unionfs/commonfops.c b/fs/unionfs/commonfops.c
index 341b6f2..724128d 100644
--- a/fs/unionfs/commonfops.c
+++ b/fs/unionfs/commonfops.c
@@ -91,6 +91,8 @@ retry:
unlock_dir(lower_dir_dentry);

out:
+ if (!err)
+ unionfs_check_dentry(dentry);
return err;
}

@@ -247,6 +249,8 @@ static int do_delayed_copyup(struct file *file)

BUG_ON(!S_ISREG(dentry->d_inode->i_mode));

+ unionfs_check_file(file);
+ unionfs_check_dentry(dentry);
for (bindex = bstart - 1; bindex >= 0; bindex--) {
if (!d_deleted(dentry))
err = copyup_file(parent_inode, file, bstart,
@@ -286,6 +290,8 @@ static int do_delayed_copyup(struct file *file)
ibend(dentry->d_inode) = ibstart(dentry->d_inode);

out:
+ unionfs_check_file(file);
+ unionfs_check_dentry(dentry);
return err;
}

@@ -399,6 +405,8 @@ out:
kfree(UNIONFS_F(file)->saved_branch_ids);
}
out_nofree:
+ if (!err)
+ unionfs_check_file(file);
unionfs_unlock_dentry(dentry);
return err;
}
@@ -576,6 +584,11 @@ out:
}
out_nofree:
unionfs_read_unlock(inode->i_sb);
+ unionfs_check_inode(inode);
+ if (!err) {
+ unionfs_check_file(file);
+ unionfs_check_dentry(file->f_path.dentry->d_parent);
+ }
return err;
}

@@ -601,6 +614,7 @@ int unionfs_file_release(struct inode *inode, struct file *file)
*/
if ((err = unionfs_file_revalidate(file, 1)))
goto out;
+ unionfs_check_file(file);
fileinfo = UNIONFS_F(file);
BUG_ON(file->f_path.dentry->d_inode != inode);
inodeinfo = UNIONFS_I(inode);
@@ -764,6 +778,7 @@ long unionfs_ioctl(struct file *file, unsigned int cmd, unsigned long arg)

out:
unionfs_read_unlock(file->f_path.dentry->d_sb);
+ unionfs_check_file(file);
return err;
}

@@ -778,6 +793,7 @@ int unionfs_flush(struct file *file, fl_owner_t id)

if ((err = unionfs_file_revalidate(file, 1)))
goto out;
+ unionfs_check_file(file);

if (!atomic_dec_and_test(&UNIONFS_I(dentry->d_inode)->totalopens))
goto out;
@@ -814,5 +830,6 @@ out_lock:
unionfs_unlock_dentry(dentry);
out:
unionfs_read_unlock(dentry->d_sb);
+ unionfs_check_file(file);
return err;
}
diff --git a/fs/unionfs/copyup.c b/fs/unionfs/copyup.c
index 36f751e..23ac4c8 100644
--- a/fs/unionfs/copyup.c
+++ b/fs/unionfs/copyup.c
@@ -532,6 +532,8 @@ out_free:
unionfs_postcopyup_setmnt(dentry);
/* sync inode times from copied-up inode to our inode */
unionfs_copy_attr_times(dentry->d_inode);
+ unionfs_check_inode(dir);
+ unionfs_check_dentry(dentry);
out:
return err;
}
diff --git a/fs/unionfs/debug.c b/fs/unionfs/debug.c
new file mode 100644
index 0000000..f678534
--- /dev/null
+++ b/fs/unionfs/debug.c
@@ -0,0 +1,502 @@
+/*
+ * Copyright (c) 2003-2007 Erez Zadok
+ * Copyright (c) 2005-2007 Josef 'Jeff' Sipek
+ * Copyright (c) 2003-2007 Stony Brook University
+ * Copyright (c) 2003-2007 The Research Foundation of SUNY
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include "union.h"
+
+/*
+ * Helper debugging functions for maintainers (and for users to report back
+ * useful information back to maintainers)
+ */
+
+/* it's always useful to know what part of the code called us */
+#define PRINT_CALLER(fname, fxn, line) \
+ do { \
+ if (!printed_caller) { \
+ printk("PC:%s:%s:%d\n",(fname),(fxn),(line)); \
+ printed_caller = 1; \
+ } \
+ } while (0)
+
+#if BITS_PER_LONG == 32
+#define POISONED_PTR ((void*) 0x5a5a5a5a)
+#elif BITS_PER_LONG == 64
+#define POISONED_PTR ((void*) 0x5a5a5a5a5a5a5a5a)
+#else
+#error Unknown BITS_PER_LONG value
+#endif /* BITS_PER_LONG != known */
+
+/*
+ * __unionfs_check_{inode,dentry,file} perform exhaustive sanity checking on
+ * the fan-out of various Unionfs objects. We check that no lower objects
+ * exist outside the start/end branch range; that all objects within are
+ * non-NULL (with some allowed exceptions); that for every lower file
+ * there's a lower dentry+inode; that the start/end ranges match for all
+ * corresponding lower objects; that open files/symlinks have only one lower
+ * objects, but directories can have several; and more.
+ */
+void __unionfs_check_inode(const struct inode *inode,
+ const char *fname, const char *fxn, int line)
+{
+ int bindex;
+ int istart, iend;
+ struct inode *lower_inode;
+ struct super_block *sb;
+ int printed_caller = 0;
+
+ /* for inodes now */
+ BUG_ON(!inode);
+ sb = inode->i_sb;
+ istart = ibstart(inode);
+ iend = ibend(inode);
+ if (istart > iend) {
+ PRINT_CALLER(fname, fxn, line);
+ printk(" Ci0: inode=%p istart/end=%d:%d\n",
+ inode, istart, iend);
+ }
+ if ((istart == -1 && iend != -1) ||
+ (istart != -1 && iend == -1)) {
+ PRINT_CALLER(fname, fxn, line);
+ printk(" Ci1: inode=%p istart/end=%d:%d\n",
+ inode, istart, iend);
+ }
+ if (!S_ISDIR(inode->i_mode)) {
+ if (iend != istart) {
+ PRINT_CALLER(fname, fxn, line);
+ printk(" Ci2: inode=%p istart=%d iend=%d\n",
+ inode, istart, iend);
+ }
+ }
+
+ for (bindex = sbstart(sb); bindex < sbmax(sb); bindex++) {
+ if (!UNIONFS_I(inode)) {
+ PRINT_CALLER(fname, fxn, line);
+ printk(" Ci3: no inode_info %p\n", inode);
+ return;
+ }
+ if (!UNIONFS_I(inode)->lower_inodes) {
+ PRINT_CALLER(fname, fxn, line);
+ printk(" Ci4: no lower_inodes %p\n", inode);
+ return;
+ }
+ lower_inode = unionfs_lower_inode_idx(inode, bindex);
+ if (lower_inode) {
+ if (bindex < istart || bindex > iend) {
+ PRINT_CALLER(fname, fxn, line);
+ printk(" Ci5: inode/linode=%p:%p bindex=%d "
+ "istart/end=%d:%d\n", inode,
+ lower_inode, bindex, istart, iend);
+ } else if (lower_inode == POISONED_PTR) {
+ /* freed inode! */
+ PRINT_CALLER(fname, fxn, line);
+ printk(" Ci6: inode/linode=%p:%p bindex=%d "
+ "istart/end=%d:%d\n", inode,
+ lower_inode, bindex, istart, iend);
+ }
+ } else { /* lower_inode == NULL */
+ if (bindex >= istart && bindex <= iend) {
+ /*
+ * directories can have NULL lower inodes in
+ * b/t start/end, but NOT if at the
+ * start/end range.
+ */
+ if (!(S_ISDIR(inode->i_mode) &&
+ bindex > istart && bindex < iend)) {
+ PRINT_CALLER(fname, fxn, line);
+ printk(" Ci7: inode/linode=%p:%p "
+ "bindex=%d istart/end=%d:%d\n",
+ inode, lower_inode, bindex,
+ istart, iend);
+ }
+ }
+ }
+ }
+}
+
+void __unionfs_check_dentry(const struct dentry *dentry,
+ const char *fname, const char *fxn, int line)
+{
+ int bindex;
+ int dstart, dend, istart, iend;
+ struct dentry *lower_dentry;
+ struct inode *inode, *lower_inode;
+ struct super_block *sb;
+ struct vfsmount *lower_mnt;
+ int printed_caller = 0;
+
+ BUG_ON(!dentry);
+ sb = dentry->d_sb;
+ inode = dentry->d_inode;
+ dstart = dbstart(dentry);
+ dend = dbend(dentry);
+ BUG_ON(dstart > dend);
+
+ if ((dstart == -1 && dend != -1) ||
+ (dstart != -1 && dend == -1)) {
+ PRINT_CALLER(fname, fxn, line);
+ printk(" CD0: dentry=%p dstart/end=%d:%d\n",
+ dentry, dstart, dend);
+ }
+ /*
+ * check for NULL dentries inside the start/end range, or
+ * non-NULL dentries outside the start/end range.
+ */
+ for (bindex = sbstart(sb); bindex < sbmax(sb); bindex++) {
+ lower_dentry = unionfs_lower_dentry_idx(dentry, bindex);
+ if (lower_dentry) {
+ if (bindex < dstart || bindex > dend) {
+ PRINT_CALLER(fname, fxn, line);
+ printk(" CD1: dentry/lower=%p:%p(%p) "
+ "bindex=%d dstart/end=%d:%d\n",
+ dentry, lower_dentry,
+ (lower_dentry ? lower_dentry->d_inode :
+ (void *) -1L),
+ bindex, dstart, dend);
+ }
+ } else { /* lower_dentry == NULL */
+ if (bindex >= dstart && bindex <= dend) {
+ /*
+ * Directories can have NULL lower inodes in
+ * b/t start/end, but NOT if at the
+ * start/end range. Ignore this rule,
+ * however, if this is a NULL dentry or a
+ * deleted dentry.
+ */
+ if (!d_deleted((struct dentry *) dentry) &&
+ inode &&
+ !(inode && S_ISDIR(inode->i_mode) &&
+ bindex > dstart && bindex < dend)) {
+ PRINT_CALLER(fname, fxn, line);
+ printk(" CD2: dentry/lower=%p:%p(%p) "
+ "bindex=%d dstart/end=%d:%d\n",
+ dentry, lower_dentry,
+ (lower_dentry ?
+ lower_dentry->d_inode :
+ (void *) -1L),
+ bindex, dstart, dend);
+ }
+ }
+ }
+ }
+
+ /* check for vfsmounts same as for dentries */
+ for (bindex = sbstart(sb); bindex < sbmax(sb); bindex++) {
+ lower_mnt = unionfs_lower_mnt_idx(dentry, bindex);
+ if (lower_mnt) {
+ if (bindex < dstart || bindex > dend) {
+ PRINT_CALLER(fname, fxn, line);
+ printk(" CM0: dentry/lmnt=%p:%p bindex=%d "
+ "dstart/end=%d:%d\n", dentry,
+ lower_mnt, bindex, dstart, dend);
+ }
+ } else { /* lower_mnt == NULL */
+ if (bindex >= dstart && bindex <= dend) {
+ /*
+ * Directories can have NULL lower inodes in
+ * b/t start/end, but NOT if at the
+ * start/end range. Ignore this rule,
+ * however, if this is a NULL dentry.
+ */
+ if (inode &&
+ !(inode && S_ISDIR(inode->i_mode) &&
+ bindex > dstart && bindex < dend)) {
+ PRINT_CALLER(fname, fxn, line);
+ printk(" CM1: dentry/lmnt=%p:%p "
+ "bindex=%d dstart/end=%d:%d\n",
+ dentry, lower_mnt, bindex,
+ dstart, dend);
+ }
+ }
+ }
+ }
+
+ /* for inodes now */
+ if (!inode)
+ return;
+ istart = ibstart(inode);
+ iend = ibend(inode);
+ BUG_ON(istart > iend);
+ if ((istart == -1 && iend != -1) ||
+ (istart != -1 && iend == -1)) {
+ PRINT_CALLER(fname, fxn, line);
+ printk(" CI0: dentry/inode=%p:%p istart/end=%d:%d\n",
+ dentry, inode, istart, iend);
+ }
+ if (istart != dstart) {
+ PRINT_CALLER(fname, fxn, line);
+ printk(" CI1: dentry/inode=%p:%p istart=%d dstart=%d\n",
+ dentry, inode, istart, dstart);
+ }
+ if (iend != dend) {
+ PRINT_CALLER(fname, fxn, line);
+ printk(" CI2: dentry/inode=%p:%p iend=%d dend=%d\n",
+ dentry, inode, iend, dend);
+ }
+
+ if (!S_ISDIR(inode->i_mode)) {
+ if (dend != dstart) {
+ PRINT_CALLER(fname, fxn, line);
+ printk(" CI3: dentry/inode=%p:%p dstart=%d dend=%d\n",
+ dentry, inode, dstart, dend);
+ }
+ if (iend != istart) {
+ PRINT_CALLER(fname, fxn, line);
+ printk(" CI4: dentry/inode=%p:%p istart=%d iend=%d\n",
+ dentry, inode, istart, iend);
+ }
+ }
+
+ for (bindex = sbstart(sb); bindex < sbmax(sb); bindex++) {
+ lower_inode = unionfs_lower_inode_idx(inode, bindex);
+ if (lower_inode) {
+ if (bindex < istart || bindex > iend) {
+ PRINT_CALLER(fname, fxn, line);
+ printk(" CI5: dentry/linode=%p:%p bindex=%d "
+ "istart/end=%d:%d\n", dentry,
+ lower_inode, bindex, istart, iend);
+ } else if (lower_inode == POISONED_PTR) {
+ /* freed inode! */
+ PRINT_CALLER(fname, fxn, line);
+ printk(" CI6: dentry/linode=%p:%p bindex=%d "
+ "istart/end=%d:%d\n", dentry,
+ lower_inode, bindex, istart, iend);
+ }
+ } else { /* lower_inode == NULL */
+ if (bindex >= istart && bindex <= iend) {
+ /*
+ * directories can have NULL lower inodes in
+ * b/t start/end, but NOT if at the
+ * start/end range.
+ */
+ if (!(S_ISDIR(inode->i_mode) &&
+ bindex > istart && bindex < iend)) {
+ PRINT_CALLER(fname, fxn, line);
+ printk(" CI7: dentry/linode=%p:%p "
+ "bindex=%d istart/end=%d:%d\n",
+ dentry, lower_inode, bindex,
+ istart, iend);
+ }
+ }
+ }
+ }
+
+ /*
+ * If it's a directory, then intermediate objects b/t start/end can
+ * be NULL. But, check that all three are NULL: lower dentry, mnt,
+ * and inode.
+ */
+ if (S_ISDIR(inode->i_mode))
+ for (bindex = dstart+1; bindex < dend; bindex++) {
+ lower_inode = unionfs_lower_inode_idx(inode, bindex);
+ lower_dentry = unionfs_lower_dentry_idx(dentry,
+ bindex);
+ lower_mnt = unionfs_lower_mnt_idx(dentry, bindex);
+ if (!((lower_inode && lower_dentry && lower_mnt) ||
+ (!lower_inode && !lower_dentry && !lower_mnt))) {
+ PRINT_CALLER(fname, fxn, line);
+ printk(" Cx: lmnt/ldentry/linode=%p:%p:%p "
+ "bindex=%d dstart/end=%d:%d\n",
+ lower_mnt, lower_dentry, lower_inode,
+ bindex, dstart, dend);
+ }
+ }
+ /* check if lower inode is newer than upper one (it shouldn't) */
+ if (is_newer_lower(dentry)) {
+ PRINT_CALLER(fname, fxn, line);
+ for (bindex=ibstart(inode); bindex <= ibend(inode); bindex++) {
+ lower_inode = unionfs_lower_inode_idx(inode, bindex);
+ if (!lower_inode)
+ continue;
+ printk(" CI8: bindex=%d mtime/lmtime=%lu.%lu/%lu.%lu "
+ "ctime/lctime=%lu.%lu/%lu.%lu\n",
+ bindex,
+ inode->i_mtime.tv_sec,
+ inode->i_mtime.tv_nsec,
+ lower_inode->i_mtime.tv_sec,
+ lower_inode->i_mtime.tv_nsec,
+ inode->i_ctime.tv_sec,
+ inode->i_ctime.tv_nsec,
+ lower_inode->i_ctime.tv_sec,
+ lower_inode->i_ctime.tv_nsec);
+ }
+ }
+}
+
+void __unionfs_check_file(const struct file *file,
+ const char *fname, const char *fxn, int line)
+{
+ int bindex;
+ int dstart, dend, fstart, fend;
+ struct dentry *dentry;
+ struct file *lower_file;
+ struct inode *inode;
+ struct super_block *sb;
+ int printed_caller = 0;
+
+ BUG_ON(!file);
+ dentry = file->f_path.dentry;
+ sb = dentry->d_sb;
+ dstart = dbstart(dentry);
+ dend = dbend(dentry);
+ BUG_ON(dstart > dend);
+ fstart = fbstart(file);
+ fend = fbend(file);
+ BUG_ON(fstart > fend);
+
+ if ((fstart == -1 && fend != -1) ||
+ (fstart != -1 && fend == -1)) {
+ PRINT_CALLER(fname, fxn, line);
+ printk(" CF0: file/dentry=%p:%p fstart/end=%d:%d\n",
+ file, dentry, fstart, fend);
+ }
+ if (fstart != dstart) {
+ PRINT_CALLER(fname, fxn, line);
+ printk(" CF1: file/dentry=%p:%p fstart=%d dstart=%d\n",
+ file, dentry, fstart, dstart);
+ }
+ if (fend != dend) {
+ PRINT_CALLER(fname, fxn, line);
+ printk(" CF2: file/dentry=%p:%p fend=%d dend=%d\n",
+ file, dentry, fend, dend);
+ }
+ inode = dentry->d_inode;
+ if (!S_ISDIR(inode->i_mode)) {
+ if (fend != fstart) {
+ PRINT_CALLER(fname, fxn, line);
+ printk(" CF3: file/inode=%p:%p fstart=%d fend=%d\n",
+ file, inode, fstart, fend);
+ }
+ if (dend != dstart) {
+ PRINT_CALLER(fname, fxn, line);
+ printk(" CF4: file/dentry=%p:%p dstart=%d dend=%d\n",
+ file, dentry, dstart, dend);
+ }
+ }
+
+ /*
+ * check for NULL dentries inside the start/end range, or
+ * non-NULL dentries outside the start/end range.
+ */
+ for (bindex = sbstart(sb); bindex < sbmax(sb); bindex++) {
+ lower_file = unionfs_lower_file_idx(file, bindex);
+ if (lower_file) {
+ if (bindex < fstart || bindex > fend) {
+ PRINT_CALLER(fname, fxn, line);
+ printk(" CF5: file/lower=%p:%p bindex=%d "
+ "fstart/end=%d:%d\n",
+ file, lower_file, bindex, fstart, fend);
+ }
+ } else { /* lower_file == NULL */
+ if (bindex >= fstart && bindex <= fend) {
+ /*
+ * directories can have NULL lower inodes in
+ * b/t start/end, but NOT if at the
+ * start/end range.
+ */
+ if (!(S_ISDIR(inode->i_mode) &&
+ bindex > fstart && bindex < fend)) {
+ PRINT_CALLER(fname, fxn, line);
+ printk(" CF6: file/lower=%p:%p "
+ "bindex=%d fstart/end=%d:%d\n",
+ file, lower_file, bindex,
+ fstart, fend);
+ }
+ }
+ }
+ }
+
+ __unionfs_check_dentry(dentry,fname,fxn,line);
+}
+
+/* useful to track vfsmount leaks that could cause EBUSY on unmount */
+void __show_branch_counts(const struct super_block *sb,
+ const char *file, const char *fxn, int line)
+{
+ int i;
+ struct vfsmount *mnt;
+
+ printk("BC:");
+ for (i=0; i<sbmax(sb); i++) {
+ if (sb->s_root)
+ mnt = UNIONFS_D(sb->s_root)->lower_paths[i].mnt;
+ else
+ mnt = NULL;
+ printk("%d:", (mnt ? atomic_read(&mnt->mnt_count) : -99));
+ }
+ printk("%s:%s:%d\n",file,fxn,line);
+}
+
+void __show_inode_times(const struct inode *inode,
+ const char *file, const char *fxn, int line)
+{
+ struct inode *lower_inode;
+ int bindex;
+
+ for (bindex=ibstart(inode); bindex <= ibend(inode); bindex++) {
+ lower_inode = unionfs_lower_inode_idx(inode, bindex);
+ if (!lower_inode)
+ continue;
+ printk("IT(%lu:%d): ", inode->i_ino, bindex);
+ printk("%s:%s:%d ",file,fxn,line);
+ printk("um=%lu/%lu lm=%lu/%lu ",
+ inode->i_mtime.tv_sec, inode->i_mtime.tv_nsec,
+ lower_inode->i_mtime.tv_sec,
+ lower_inode->i_mtime.tv_nsec);
+ printk("uc=%lu/%lu lc=%lu/%lu\n",
+ inode->i_ctime.tv_sec, inode->i_ctime.tv_nsec,
+ lower_inode->i_ctime.tv_sec,
+ lower_inode->i_ctime.tv_nsec);
+ }
+}
+
+void __show_dinode_times(const struct dentry *dentry,
+ const char *file, const char *fxn, int line)
+{
+ struct inode *inode = dentry->d_inode;
+ struct inode *lower_inode;
+ int bindex;
+
+ for (bindex=ibstart(inode); bindex <= ibend(inode); bindex++) {
+ lower_inode = unionfs_lower_inode_idx(inode, bindex);
+ if (!lower_inode)
+ continue;
+ printk("DT(%s:%lu:%d): ", dentry->d_name.name, inode->i_ino, bindex);
+ printk("%s:%s:%d ",file,fxn,line);
+ printk("um=%lu/%lu lm=%lu/%lu ",
+ inode->i_mtime.tv_sec, inode->i_mtime.tv_nsec,
+ lower_inode->i_mtime.tv_sec,
+ lower_inode->i_mtime.tv_nsec);
+ printk("uc=%lu/%lu lc=%lu/%lu\n",
+ inode->i_ctime.tv_sec, inode->i_ctime.tv_nsec,
+ lower_inode->i_ctime.tv_sec,
+ lower_inode->i_ctime.tv_nsec);
+ }
+}
+
+void __show_inode_counts(const struct inode *inode,
+ const char *file, const char *fxn, int line)
+{
+ struct inode *lower_inode;
+ int bindex;
+
+ if (!inode) {
+ printk("SiC: Null inode\n");
+ return;
+ }
+ for (bindex=sbstart(inode->i_sb); bindex <= sbend(inode->i_sb); bindex++) {
+ lower_inode = unionfs_lower_inode_idx(inode, bindex);
+ if (!lower_inode)
+ continue;
+ printk("SIC(%lu:%d:%d): ", inode->i_ino, bindex,
+ atomic_read(&(inode)->i_count));
+ printk("lc=%d ", atomic_read(&(lower_inode)->i_count));
+ printk("%s:%s:%d\n",file,fxn,line);
+ }
+}
diff --git a/fs/unionfs/dentry.c b/fs/unionfs/dentry.c
index 0be958f..00b0bc2 100644
--- a/fs/unionfs/dentry.c
+++ b/fs/unionfs/dentry.c
@@ -214,6 +214,7 @@ int is_newer_lower(const struct dentry *dentry)
printk("unionfs: new lower inode mtime "
"(bindex=%d, name=%s)\n", bindex,
dentry->d_name.name);
+ show_dinode_times(dentry);
return 1; /* mtime changed! */
}
if (timespec_compare(&inode->i_ctime,
@@ -221,6 +222,7 @@ int is_newer_lower(const struct dentry *dentry)
printk("unionfs: new lower inode ctime "
"(bindex=%d, name=%s)\n", bindex,
dentry->d_name.name);
+ show_dinode_times(dentry);
return 1; /* ctime changed! */
}
}
@@ -416,6 +418,7 @@ static int unionfs_d_revalidate(struct dentry *dentry, struct nameidata *nd)
unionfs_lock_dentry(dentry);
err = __unionfs_d_revalidate_chain(dentry, nd, 0);
unionfs_unlock_dentry(dentry);
+ unionfs_check_dentry(dentry);

unionfs_read_unlock(dentry->d_sb);

@@ -432,6 +435,7 @@ static void unionfs_d_release(struct dentry *dentry)

unionfs_read_lock(dentry->d_sb);

+ unionfs_check_dentry(dentry);
/* this could be a negative dentry, so check first */
if (!UNIONFS_D(dentry)) {
printk(KERN_DEBUG "unionfs: dentry without private data: %.*s",
diff --git a/fs/unionfs/file.c b/fs/unionfs/file.c
index b33f44f..e9e63b7 100644
--- a/fs/unionfs/file.c
+++ b/fs/unionfs/file.c
@@ -26,6 +26,7 @@ static ssize_t unionfs_read(struct file *file, char __user *buf,
unionfs_read_lock(file->f_path.dentry->d_sb);
if ((err = unionfs_file_revalidate(file, 0)))
goto out;
+ unionfs_check_file(file);

err = do_sync_read(file, buf, count, ppos);

@@ -35,6 +36,7 @@ static ssize_t unionfs_read(struct file *file, char __user *buf,

out:
unionfs_read_unlock(file->f_path.dentry->d_sb);
+ unionfs_check_file(file);
return err;
}

@@ -47,6 +49,7 @@ static ssize_t unionfs_aio_read(struct kiocb *iocb, const struct iovec *iov,
unionfs_read_lock(file->f_path.dentry->d_sb);
if ((err = unionfs_file_revalidate(file, 0)))
goto out;
+ unionfs_check_file(file);

err = generic_file_aio_read(iocb, iov, nr_segs, pos);

@@ -59,6 +62,7 @@ static ssize_t unionfs_aio_read(struct kiocb *iocb, const struct iovec *iov,

out:
unionfs_read_unlock(file->f_path.dentry->d_sb);
+ unionfs_check_file(file);
return err;
}

@@ -70,11 +74,14 @@ static ssize_t unionfs_write(struct file *file, const char __user *buf,
unionfs_read_lock(file->f_path.dentry->d_sb);
if ((err = unionfs_file_revalidate(file, 1)))
goto out;
+ unionfs_check_file(file);

err = do_sync_write(file, buf, count, ppos);
/* update our inode times upon a successful lower write */
- if (err >= 0)
+ if (err >= 0) {
unionfs_copy_attr_times(file->f_path.dentry->d_inode);
+ unionfs_check_file(file);
+ }

out:
unionfs_read_unlock(file->f_path.dentry->d_sb);
@@ -99,6 +106,7 @@ static int unionfs_mmap(struct file *file, struct vm_area_struct *vma)
willwrite = ((vma->vm_flags | VM_SHARED | VM_WRITE) == vma->vm_flags);
if ((err = unionfs_file_revalidate(file, willwrite)))
goto out;
+ unionfs_check_file(file);

/*
* File systems which do not implement ->writepage may use
@@ -123,9 +131,12 @@ static int unionfs_mmap(struct file *file, struct vm_area_struct *vma)

out:
unionfs_read_unlock(file->f_path.dentry->d_sb);
- if (!err)
+ if (!err) {
/* copyup could cause parent dir times to change */
unionfs_copy_attr_times(file->f_path.dentry->d_parent->d_inode);
+ unionfs_check_file(file);
+ unionfs_check_dentry(file->f_path.dentry->d_parent);
+ }
return err;
}

@@ -140,6 +151,7 @@ int unionfs_fsync(struct file *file, struct dentry *dentry, int datasync)
unionfs_read_lock(file->f_path.dentry->d_sb);
if ((err = unionfs_file_revalidate(file, 1)))
goto out;
+ unionfs_check_file(file);

bstart = fbstart(file);
bend = fbend(file);
@@ -171,6 +183,7 @@ int unionfs_fsync(struct file *file, struct dentry *dentry, int datasync)

out:
unionfs_read_unlock(file->f_path.dentry->d_sb);
+ unionfs_check_file(file);
return err;
}

@@ -185,6 +198,7 @@ int unionfs_fasync(int fd, struct file *file, int flag)
unionfs_read_lock(file->f_path.dentry->d_sb);
if ((err = unionfs_file_revalidate(file, 1)))
goto out;
+ unionfs_check_file(file);

bstart = fbstart(file);
bend = fbend(file);
@@ -214,6 +228,7 @@ int unionfs_fasync(int fd, struct file *file, int flag)

out:
unionfs_read_unlock(file->f_path.dentry->d_sb);
+ unionfs_check_file(file);
return err;
}

diff --git a/fs/unionfs/inode.c b/fs/unionfs/inode.c
index 218320e..9adf272 100644
--- a/fs/unionfs/inode.c
+++ b/fs/unionfs/inode.c
@@ -227,6 +227,10 @@ out:
unionfs_unlock_dentry(dentry);
unionfs_read_unlock(dentry->d_sb);

+ unionfs_check_inode(parent);
+ if (!err)
+ unionfs_check_dentry(dentry->d_parent);
+ unionfs_check_dentry(dentry);
return err;
}

@@ -265,6 +269,9 @@ static struct dentry *unionfs_lookup(struct inode *parent,
unionfs_copy_attr_times(dentry->d_parent->d_inode);
}

+ unionfs_check_inode(parent);
+ unionfs_check_dentry(dentry);
+ unionfs_check_dentry(dentry->d_parent);
unionfs_read_unlock(dentry->d_sb);

return ret;
@@ -410,6 +417,9 @@ out:
unionfs_unlock_dentry(new_dentry);
unionfs_unlock_dentry(old_dentry);

+ unionfs_check_inode(dir);
+ unionfs_check_dentry(new_dentry);
+ unionfs_check_dentry(old_dentry);
unionfs_read_unlock(old_dentry->d_sb);

return err;
@@ -570,6 +580,8 @@ out:
unionfs_postcopyup_setmnt(dentry);
unionfs_unlock_dentry(dentry);

+ unionfs_check_inode(dir);
+ unionfs_check_dentry(dentry);
unionfs_read_unlock(dentry->d_sb);

return err;
@@ -718,6 +730,8 @@ out:
if (!err)
unionfs_copy_attr_times(dentry->d_inode);
unionfs_unlock_dentry(dentry);
+ unionfs_check_inode(parent);
+ unionfs_check_dentry(dentry);
unionfs_read_unlock(dentry->d_sb);

return err;
@@ -845,6 +859,8 @@ out:
unionfs_postcopyup_setmnt(dentry);
unionfs_unlock_dentry(dentry);

+ unionfs_check_inode(dir);
+ unionfs_check_dentry(dentry);
unionfs_read_unlock(dentry->d_sb);

return err;
@@ -880,6 +896,7 @@ static int unionfs_readlink(struct dentry *dentry, char __user *buf,

out:
unionfs_unlock_dentry(dentry);
+ unionfs_check_dentry(dentry);
unionfs_read_unlock(dentry->d_sb);

return err;
@@ -925,6 +942,7 @@ static void *unionfs_follow_link(struct dentry *dentry, struct nameidata *nd)
err = 0;

out:
+ unionfs_check_dentry(dentry);
unionfs_read_unlock(dentry->d_sb);
return ERR_PTR(err);
}
@@ -940,6 +958,7 @@ static void unionfs_put_link(struct dentry *dentry, struct nameidata *nd,
printk("unionfs: put_link failed to revalidate dentry\n");
unionfs_unlock_dentry(dentry);

+ unionfs_check_dentry(dentry);
kfree(nd_get_link(nd));
unionfs_read_unlock(dentry->d_sb);
}
@@ -1067,6 +1086,7 @@ static int unionfs_permission(struct inode *inode, int mask,
unionfs_copy_attr_times(inode);

out:
+ unionfs_check_inode(inode);
return err;
}

@@ -1153,6 +1173,8 @@ static int unionfs_setattr(struct dentry *dentry, struct iattr *ia)
unionfs_copy_attr_times(dentry->d_parent->d_inode);
out:
unionfs_unlock_dentry(dentry);
+ unionfs_check_dentry(dentry);
+ unionfs_check_dentry(dentry->d_parent);
unionfs_read_unlock(dentry->d_sb);

return err;
diff --git a/fs/unionfs/mmap.c b/fs/unionfs/mmap.c
index e05e6a2..6ef19af 100644
--- a/fs/unionfs/mmap.c
+++ b/fs/unionfs/mmap.c
@@ -213,6 +213,7 @@ static int unionfs_readpage(struct file *file, struct page *page)
unionfs_read_lock(file->f_path.dentry->d_sb);
if ((err = unionfs_file_revalidate(file, 0)))
goto out;
+ unionfs_check_file(file);

err = unionfs_do_readpage(file, page);

@@ -229,6 +230,7 @@ static int unionfs_readpage(struct file *file, struct page *page)
*/
out:
unlock_page(page);
+ unionfs_check_file(file);
unionfs_read_unlock(file->f_path.dentry->d_sb);

return err;
@@ -253,6 +255,7 @@ static int unionfs_prepare_write(struct file *file, struct page *page,
*/
unionfs_copy_attr_times(file->f_path.dentry->d_inode);
err = unionfs_file_revalidate(file, 1);
+ unionfs_check_file(file);
unionfs_read_unlock(file->f_path.dentry->d_sb);

return err;
@@ -274,6 +277,7 @@ static int unionfs_commit_write(struct file *file, struct page *page,
unionfs_read_lock(file->f_path.dentry->d_sb);
if ((err = unionfs_file_revalidate(file, 1)))
goto out;
+ unionfs_check_file(file);

inode = page->mapping->host;
lower_inode = unionfs_lower_inode(inode);
@@ -319,6 +323,7 @@ out:
ClearPageUptodate(page);

unionfs_read_unlock(file->f_path.dentry->d_sb);
+ unionfs_check_file(file);
return err; /* assume all is ok */
}

diff --git a/fs/unionfs/rename.c b/fs/unionfs/rename.c
index 7aa7e57..b8d406c 100644
--- a/fs/unionfs/rename.c
+++ b/fs/unionfs/rename.c
@@ -513,6 +513,10 @@ out:
unionfs_copy_attr_times(new_dir);
unionfs_copy_attr_times(old_dentry->d_inode);
unionfs_copy_attr_times(new_dentry->d_inode);
+ unionfs_check_inode(old_dir);
+ unionfs_check_inode(new_dir);
+ unionfs_check_dentry(old_dentry);
+ unionfs_check_dentry(new_dentry);
}

unionfs_unlock_dentry(new_dentry);
diff --git a/fs/unionfs/super.c b/fs/unionfs/super.c
index 3f972df..bfac7a4 100644
--- a/fs/unionfs/super.c
+++ b/fs/unionfs/super.c
@@ -130,6 +130,7 @@ static int unionfs_statfs(struct dentry *dentry, struct kstatfs *buf)
err = -ESTALE;
goto out;
}
+ unionfs_check_dentry(dentry);

lower_dentry = unionfs_lower_dentry(sb->s_root);
err = vfs_statfs(lower_dentry, buf);
@@ -153,6 +154,7 @@ static int unionfs_statfs(struct dentry *dentry, struct kstatfs *buf)

out:
unionfs_unlock_dentry(dentry);
+ unionfs_check_dentry(dentry);
unionfs_read_unlock(sb);
return err;
}
@@ -790,6 +792,7 @@ out_free:
kfree(new_lower_inodes);
out_error:
unionfs_write_unlock(sb);
+ unionfs_check_dentry(sb->s_root);
return err;
}

diff --git a/fs/unionfs/union.h b/fs/unionfs/union.h
index ce43a50..adad6a5 100644
--- a/fs/unionfs/union.h
+++ b/fs/unionfs/union.h
@@ -472,22 +472,15 @@ static inline struct vfsmount *unionfs_mntget(struct dentry *dentry,
{
struct vfsmount *mnt;

- if (!dentry) {
- if (bindex < 0)
- return NULL;
- if (!dentry && bindex >= 0) {
- return NULL;
- }
- }
- mnt = unionfs_lower_mnt_idx(dentry, bindex);
- if (!mnt) {
- if (bindex < 0)
- return NULL;
- if (!mnt && bindex >= 0) {
- return NULL;
- }
- }
- mnt = mntget(mnt);
+ BUG_ON(!dentry || bindex < 0);
+
+ mnt = mntget(unionfs_lower_mnt_idx(dentry, bindex));
+#ifdef CONFIG_UNION_FS_DEBUG
+ if (!mnt)
+ printk(KERN_DEBUG "unionfs_mntget: mnt=%p bindex=%d\n",
+ mnt, bindex);
+#endif /* CONFIG_UNION_FS_DEBUG */
+
return mnt;
}

@@ -495,24 +488,74 @@ static inline void unionfs_mntput(struct dentry *dentry, int bindex)
{
struct vfsmount *mnt;

- if (!dentry) {
- if (bindex < 0)
- return;
- if (!dentry && bindex >= 0) {
- return;
- }
- }
+ if (!dentry && bindex < 0)
+ return;
+ BUG_ON(!dentry || bindex < 0);
+
mnt = unionfs_lower_mnt_idx(dentry, bindex);
- if (!mnt) {
- if (bindex < 0)
- return;
- if (!mnt && bindex >= 0) {
- return;
- }
- }
+#ifdef CONFIG_UNION_FS_DEBUG
+ /*
+ * Directories can have NULL lower objects in between start/end, but
+ * NOT if at the start/end range. We cannot verify that this dentry
+ * is a type=DIR, because it may already be a negative dentry. But
+ * if dbstart is greater than dbend, we know that this couldn't have
+ * been a regular file: it had to have been a directory.
+ */
+ if (!mnt && !(bindex > dbstart(dentry) && bindex < dbend(dentry)))
+ printk(KERN_WARNING
+ "unionfs_mntput: mnt=%p bindex=%d\n",
+ mnt, bindex);
+#endif /* CONFIG_UNION_FS_DEBUG */
mntput(mnt);
}

-
+#ifdef CONFIG_UNION_FS_DEBUG
+
+/* useful for tracking code reachability */
+#define UDBG printk("DBG:%s:%s:%d\n",__FILE__,__FUNCTION__,__LINE__)
+
+#define unionfs_check_inode(i) __unionfs_check_inode((i), \
+ __FILE__,__FUNCTION__,__LINE__)
+#define unionfs_check_dentry(d) __unionfs_check_dentry((d), \
+ __FILE__,__FUNCTION__,__LINE__)
+#define unionfs_check_file(f) __unionfs_check_file((f), \
+ __FILE__,__FUNCTION__,__LINE__)
+#define show_branch_counts(sb) __show_branch_counts((sb), \
+ __FILE__,__FUNCTION__,__LINE__)
+#define show_inode_times(i) __show_inode_times((i), \
+ __FILE__,__FUNCTION__,__LINE__)
+#define show_dinode_times(d) __show_dinode_times((d), \
+ __FILE__,__FUNCTION__,__LINE__)
+#define show_inode_counts(i) __show_inode_counts((i), \
+ __FILE__,__FUNCTION__,__LINE__)
+
+extern void __unionfs_check_inode(const struct inode *inode, const char *fname,
+ const char *fxn, int line);
+extern void __unionfs_check_dentry(const struct dentry *dentry,
+ const char *fname, const char *fxn,
+ int line);
+extern void __unionfs_check_file(const struct file *file,
+ const char *fname, const char *fxn, int line);
+extern void __show_branch_counts(const struct super_block *sb,
+ const char *file, const char *fxn, int line);
+extern void __show_inode_times(const struct inode *inode,
+ const char *file, const char *fxn, int line);
+extern void __show_dinode_times(const struct dentry *dentry,
+ const char *file, const char *fxn, int line);
+extern void __show_inode_counts(const struct inode *inode,
+ const char *file, const char *fxn, int line);
+
+#else /* not CONFIG_UNION_FS_DEBUG */
+
+/* we leave useful hooks for these check functions throughout the code */
+#define unionfs_check_inode(i) do { } while(0)
+#define unionfs_check_dentry(d) do { } while(0)
+#define unionfs_check_file(f) do { } while(0)
+#define show_branch_counts(sb) do { } while(0)
+#define show_inode_times(i) do { } while(0)
+#define show_dinode_times(d) do { } while(0)
+#define show_inode_counts(i) do { } while(0)
+
+#endif /* not CONFIG_UNION_FS_DEBUG */

#endif /* not _UNION_H_ */
diff --git a/fs/unionfs/unlink.c b/fs/unionfs/unlink.c
index 50ed351..4ea350a 100644
--- a/fs/unionfs/unlink.c
+++ b/fs/unionfs/unlink.c
@@ -83,6 +83,7 @@ int unionfs_unlink(struct inode *dir, struct dentry *dentry)
err = -ESTALE;
goto out;
}
+ unionfs_check_dentry(dentry);

err = unionfs_unlink_whiteout(dir, dentry);
/* call d_drop so the system "forgets" about us */
@@ -98,6 +99,10 @@ int unionfs_unlink(struct inode *dir, struct dentry *dentry)
}

out:
+ if (!err) {
+ unionfs_check_dentry(dentry);
+ unionfs_check_inode(dir);
+ }
unionfs_unlock_dentry(dentry);
unionfs_read_unlock(dentry->d_sb);
return err;
@@ -147,6 +152,7 @@ int unionfs_rmdir(struct inode *dir, struct dentry *dentry)
err = -ESTALE;
goto out;
}
+ unionfs_check_dentry(dentry);

/* check if this unionfs directory is empty or not */
err = check_empty(dentry, &namelist);
diff --git a/fs/unionfs/xattr.c b/fs/unionfs/xattr.c
index 6ab92f3..ee7da13 100644
--- a/fs/unionfs/xattr.c
+++ b/fs/unionfs/xattr.c
@@ -59,6 +59,7 @@ ssize_t unionfs_getxattr(struct dentry *dentry, const char *name, void *value,

out:
unionfs_unlock_dentry(dentry);
+ unionfs_check_dentry(dentry);
unionfs_read_unlock(dentry->d_sb);
return err;
}
@@ -88,6 +89,7 @@ int unionfs_setxattr(struct dentry *dentry, const char *name,

out:
unionfs_unlock_dentry(dentry);
+ unionfs_check_dentry(dentry);
unionfs_read_unlock(dentry->d_sb);
return err;
}
@@ -115,6 +117,7 @@ int unionfs_removexattr(struct dentry *dentry, const char *name)

out:
unionfs_unlock_dentry(dentry);
+ unionfs_check_dentry(dentry);
unionfs_read_unlock(dentry->d_sb);
return err;
}
@@ -144,6 +147,7 @@ ssize_t unionfs_listxattr(struct dentry *dentry, char *list, size_t size)

out:
unionfs_unlock_dentry(dentry);
+ unionfs_check_dentry(dentry);
unionfs_read_unlock(dentry->d_sb);
return err;
}
--
1.5.2.2.238.g7cbf2f2

2007-09-03 02:32:19

by Josef 'Jeff' Sipek

[permalink] [raw]
Subject: [PATCH 24/32] Unionfs: remove old nfsro option

From: Erez Zadok <[email protected]>

Ensure that a branch set as 'ro' behaves like a real readonly mounted lower
file system. This allows us to remove the old 'nfsro' option. Now unionfs
handles even an readonly exported NFS file system, which was mounted on the
client in readwrite mode.

Signed-off-by: Erez Zadok <[email protected]>
Signed-off-by: Josef 'Jeff' Sipek <[email protected]>
---
fs/unionfs/inode.c | 48 ++++++++++++++++++++++-----------------------
include/linux/union_fs.h | 3 --
2 files changed, 23 insertions(+), 28 deletions(-)

diff --git a/fs/unionfs/inode.c b/fs/unionfs/inode.c
index d6a79d5..4574fbe 100644
--- a/fs/unionfs/inode.c
+++ b/fs/unionfs/inode.c
@@ -948,47 +948,44 @@ static void unionfs_put_link(struct dentry *dentry, struct nameidata *nd,
* Basically copied from the kernel vfs permission(), but we've changed
* the following:
* (1) the IS_RDONLY check is skipped, and
- * (2) if you set the mount option `mode=nfsro', we assume that -EACCES
- * means that the export is read-only and we should check standard Unix
- * permissions. This means that NFS ACL checks (or other advanced
- * permission features) are bypassed. Note however, that we do call
- * security_inode_permission, and therefore security inside SELinux, etc.
- * are performed.
+ * (2) We return 0 (success) if the non-leftmost branch is mounted
+ * readonly, to allow copyup to work.
+ * (3) we do call security_inode_permission, and therefore security inside
+ * SELinux, etc. are performed.
*/
-static int inode_permission(struct inode *inode, int mask,
+static int inode_permission(struct super_block *sb, struct inode *inode, int mask,
struct nameidata *nd, int bindex)
{
int retval, submask;

if (mask & MAY_WRITE) {
+ umode_t mode = inode->i_mode;
/* The first branch is allowed to be really readonly. */
- if (bindex == 0) {
- umode_t mode = inode->i_mode;
- if (IS_RDONLY(inode) &&
- (S_ISREG(mode) || S_ISDIR(mode) || S_ISLNK(mode)))
- return -EROFS;
- }
+ if (bindex == 0 &&
+ IS_RDONLY(inode) &&
+ (S_ISREG(mode) || S_ISDIR(mode) || S_ISLNK(mode)))
+ return -EROFS;
/*
* Nobody gets write access to an immutable file.
*/
if (IS_IMMUTABLE(inode))
return -EACCES;
+ /*
+ * For all other branches than the first one, we ignore
+ * EROFS or if the branch is mounted as readonly, to let
+ * copyup take place.
+ */
+ if (bindex > 0 &&
+ is_robranch_super(sb, bindex) &&
+ (S_ISREG(mode) || S_ISDIR(mode) || S_ISLNK(mode)))
+ return 0;
}

/* Ordinary permission routines do not understand MAY_APPEND. */
submask = mask & ~MAY_APPEND;
- if (inode->i_op && inode->i_op->permission) {
+ if (inode->i_op && inode->i_op->permission)
retval = inode->i_op->permission(inode, submask, nd);
- if ((retval == -EACCES) && (submask & MAY_WRITE) &&
- (!strcmp("nfs", (inode)->i_sb->s_type->name)) &&
- (nd) && (nd->mnt) && (nd->mnt->mnt_sb)) {
- int perms;
- perms = branchperms(nd->mnt->mnt_sb, bindex);
- if (perms & MAY_NFSRO)
- retval = generic_permission(inode, submask,
- NULL);
- }
- } else
+ else
retval = generic_permission(inode, submask, NULL);

if (retval && retval != -EROFS) /* ignore EROFS */
@@ -1046,7 +1043,8 @@ static int unionfs_permission(struct inode *inode, int mask,
* We use our own special version of permission, such that
* only the first branch returns -EROFS.
*/
- err = inode_permission(lower_inode, mask, nd, bindex);
+ err = inode_permission(inode->i_sb, lower_inode, mask, nd,
+ bindex);

/*
* The permissions are an intersection of the overall directory
diff --git a/include/linux/union_fs.h b/include/linux/union_fs.h
index 223ccab..9bc4e3b 100644
--- a/include/linux/union_fs.h
+++ b/include/linux/union_fs.h
@@ -22,8 +22,5 @@
/* We don't support normal remount, but unionctl uses it. */
# define UNIONFS_REMOUNT_MAGIC 0x4a5a4380

-/* should be at least LAST_USED_UNIONFS_PERMISSION<<1 */
-#define MAY_NFSRO 16
-
#endif /* _LINUX_UNIONFS_H */

--
1.5.2.2.238.g7cbf2f2

2007-09-03 02:32:44

by Josef 'Jeff' Sipek

[permalink] [raw]
Subject: [PATCH 28/32] Unionfs: use file f_path field

From: Erez Zadok <[email protected]>

Start using file->f_path.dentry instead of file->f_dentry

Signed-off-by: Erez Zadok <[email protected]>
Signed-off-by: Josef 'Jeff' Sipek <[email protected]>
---
fs/unionfs/commonfops.c | 42 ++++++++++++++++++++----------------------
fs/unionfs/dirfops.c | 6 +++---
fs/unionfs/mmap.c | 5 ++---
fs/unionfs/rdstate.c | 2 +-
4 files changed, 26 insertions(+), 29 deletions(-)

diff --git a/fs/unionfs/commonfops.c b/fs/unionfs/commonfops.c
index eee68b8..341b6f2 100644
--- a/fs/unionfs/commonfops.c
+++ b/fs/unionfs/commonfops.c
@@ -158,7 +158,7 @@ static int open_all_files(struct file *file)
int bindex, bstart, bend, err = 0;
struct file *lower_file;
struct dentry *lower_dentry;
- struct dentry *dentry = file->f_dentry;
+ struct dentry *dentry = file->f_path.dentry;
struct super_block *sb = dentry->d_sb;

bstart = dbstart(dentry);
@@ -193,8 +193,7 @@ static int open_highest_file(struct file *file, int willwrite)
int bindex, bstart, bend, err = 0;
struct file *lower_file;
struct dentry *lower_dentry;
-
- struct dentry *dentry = file->f_dentry;
+ struct dentry *dentry = file->f_path.dentry;
struct inode *parent_inode = dentry->d_parent->d_inode;
struct super_block *sb = dentry->d_sb;
size_t inode_size = dentry->d_inode->i_size;
@@ -302,10 +301,9 @@ int unionfs_file_revalidate(struct file *file, int willwrite)
int sbgen, fgen, dgen;
int bstart, bend;
int size;
-
int err = 0;

- dentry = file->f_dentry;
+ dentry = file->f_path.dentry;
unionfs_lock_dentry(dentry);
sb = dentry->d_sb;

@@ -412,20 +410,20 @@ static int __open_dir(struct inode *inode, struct file *file)
struct file *lower_file;
int bindex, bstart, bend;

- bstart = fbstart(file) = dbstart(file->f_dentry);
- bend = fbend(file) = dbend(file->f_dentry);
+ bstart = fbstart(file) = dbstart(file->f_path.dentry);
+ bend = fbend(file) = dbend(file->f_path.dentry);

for (bindex = bstart; bindex <= bend; bindex++) {
lower_dentry =
- unionfs_lower_dentry_idx(file->f_dentry, bindex);
+ unionfs_lower_dentry_idx(file->f_path.dentry, bindex);
if (!lower_dentry)
continue;

dget(lower_dentry);
- unionfs_mntget(file->f_dentry, bindex);
+ unionfs_mntget(file->f_path.dentry, bindex);
lower_file = dentry_open(lower_dentry,
- unionfs_lower_mnt_idx(file->f_dentry,
- bindex),
+ unionfs_lower_mnt_idx(file->f_path.dentry,
+ bindex),
file->f_flags);
if (IS_ERR(lower_file))
return PTR_ERR(lower_file);
@@ -450,17 +448,17 @@ static int __open_file(struct inode *inode, struct file *file)
int lower_flags;
int bindex, bstart, bend;

- lower_dentry = unionfs_lower_dentry(file->f_dentry);
+ lower_dentry = unionfs_lower_dentry(file->f_path.dentry);
lower_flags = file->f_flags;

- bstart = fbstart(file) = dbstart(file->f_dentry);
- bend = fbend(file) = dbend(file->f_dentry);
+ bstart = fbstart(file) = dbstart(file->f_path.dentry);
+ bend = fbend(file) = dbend(file->f_path.dentry);

/*
* check for the permission for lower file. If the error is
* COPYUP_ERR, copyup the file.
*/
- if (lower_dentry->d_inode && is_robranch(file->f_dentry)) {
+ if (lower_dentry->d_inode && is_robranch(file->f_path.dentry)) {
/*
* if the open will change the file, copy it up otherwise
* defer it.
@@ -472,7 +470,7 @@ static int __open_file(struct inode *inode, struct file *file)
/* copyup the file */
for (bindex = bstart - 1; bindex >= 0; bindex--) {
err = copyup_file(
- file->f_dentry->d_parent->d_inode,
+ file->f_path.dentry->d_parent->d_inode,
file, bstart, bindex, size);
if (!err)
break;
@@ -488,10 +486,10 @@ static int __open_file(struct inode *inode, struct file *file)
* dentry_open will decrement mnt refcnt if err.
* otherwise fput() will do an mntput() for us upon file close.
*/
- unionfs_mntget(file->f_dentry, bstart);
+ unionfs_mntget(file->f_path.dentry, bstart);
lower_file =
dentry_open(lower_dentry,
- unionfs_lower_mnt_idx(file->f_dentry, bstart),
+ unionfs_lower_mnt_idx(file->f_path.dentry, bstart),
lower_flags);
if (IS_ERR(lower_file))
return PTR_ERR(lower_file);
@@ -536,7 +534,7 @@ int unionfs_open(struct inode *inode, struct file *file)
goto out;
}

- dentry = file->f_dentry;
+ dentry = file->f_path.dentry;
unionfs_lock_dentry(dentry);

bstart = fbstart(file) = dbstart(dentry);
@@ -562,7 +560,7 @@ int unionfs_open(struct inode *inode, struct file *file)
if (!lower_file)
continue;

- branchput(file->f_dentry->d_sb, bindex);
+ branchput(file->f_path.dentry->d_sb, bindex);
/* fput calls dput for lower_dentry */
fput(lower_file);
}
@@ -604,7 +602,7 @@ int unionfs_file_release(struct inode *inode, struct file *file)
if ((err = unionfs_file_revalidate(file, 1)))
goto out;
fileinfo = UNIONFS_F(file);
- BUG_ON(file->f_dentry->d_inode != inode);
+ BUG_ON(file->f_path.dentry->d_inode != inode);
inodeinfo = UNIONFS_I(inode);

/* fput all the lower files */
@@ -664,7 +662,7 @@ static long do_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
err = lower_file->f_op->unlocked_ioctl(lower_file, cmd, arg);
} else if (lower_file->f_op->ioctl) {
lock_kernel();
- err = lower_file->f_op->ioctl(lower_file->f_dentry->d_inode,
+ err = lower_file->f_op->ioctl(lower_file->f_path.dentry->d_inode,
lower_file, cmd, arg);
unlock_kernel();
}
diff --git a/fs/unionfs/dirfops.c b/fs/unionfs/dirfops.c
index 0e93bd7..980f125 100644
--- a/fs/unionfs/dirfops.c
+++ b/fs/unionfs/dirfops.c
@@ -102,7 +102,7 @@ static int unionfs_readdir(struct file *file, void *dirent, filldir_t filldir)
if ((err = unionfs_file_revalidate(file, 0)))
goto out;

- inode = file->f_dentry->d_inode;
+ inode = file->f_path.dentry->d_inode;

uds = UNIONFS_F(file)->rdstate;
if (!uds) {
@@ -156,7 +156,7 @@ static int unionfs_readdir(struct file *file, void *dirent, filldir_t filldir)
uds->dirpos = offset;

/* Copy the atime. */
- fsstack_copy_attr_atime(inode, lower_file->f_dentry->d_inode);
+ fsstack_copy_attr_atime(inode, lower_file->f_path.dentry->d_inode);

if (err < 0)
goto out;
@@ -239,7 +239,7 @@ static loff_t unionfs_dir_llseek(struct file *file, loff_t offset, int origin)
else
err = -EINVAL;
} else {
- rdstate = find_rdstate(file->f_dentry->d_inode,
+ rdstate = find_rdstate(file->f_path.dentry->d_inode,
offset);
if (rdstate) {
UNIONFS_F(file)->rdstate = rdstate;
diff --git a/fs/unionfs/mmap.c b/fs/unionfs/mmap.c
index d26b572..e05e6a2 100644
--- a/fs/unionfs/mmap.c
+++ b/fs/unionfs/mmap.c
@@ -229,7 +229,7 @@ static int unionfs_readpage(struct file *file, struct page *page)
*/
out:
unlock_page(page);
- unionfs_read_unlock(file->f_dentry->d_sb);
+ unionfs_read_unlock(file->f_path.dentry->d_sb);

return err;
}
@@ -271,8 +271,7 @@ static int unionfs_commit_write(struct file *file, struct page *page,

BUG_ON(file == NULL);

- unionfs_read_lock(file->f_dentry->d_sb);
-
+ unionfs_read_lock(file->f_path.dentry->d_sb);
if ((err = unionfs_file_revalidate(file, 1)))
goto out;

diff --git a/fs/unionfs/rdstate.c b/fs/unionfs/rdstate.c
index 5fc242e..0a18d5c 100644
--- a/fs/unionfs/rdstate.c
+++ b/fs/unionfs/rdstate.c
@@ -91,7 +91,7 @@ int init_rdstate(struct file *file)
(sizeof(unsigned int) + sizeof(unsigned int)));
BUG_ON(UNIONFS_F(file)->rdstate != NULL);

- UNIONFS_F(file)->rdstate = alloc_rdstate(file->f_dentry->d_inode,
+ UNIONFS_F(file)->rdstate = alloc_rdstate(file->f_path.dentry->d_inode,
fbstart(file));

return (UNIONFS_F(file)->rdstate ? 0 : -ENOMEM);
--
1.5.2.2.238.g7cbf2f2

2007-09-03 02:33:18

by Josef 'Jeff' Sipek

[permalink] [raw]
Subject: [PATCH 13/32] Unionfs: copyup updates

From: Erez Zadok <[email protected]>

Fixes, updates, and better documentation for the file-copyup functionality.
Include two additional utility functions useful for copyup code callers.
Parent directory copyup updates: create_parents now takes a string name
instead of the whole dentry.

Signed-off-by: Erez Zadok <[email protected]>
Signed-off-by: Josef 'Jeff' Sipek <[email protected]>
---
fs/unionfs/commonfops.c | 135 +++++++++++--------
fs/unionfs/copyup.c | 348 +++++++++++++++++++++++++++-------------------
fs/unionfs/inode.c | 33 ++++--
fs/unionfs/rename.c | 26 +++--
fs/unionfs/subr.c | 4 +-
fs/unionfs/union.h | 11 +-
fs/unionfs/unlink.c | 2 +
7 files changed, 338 insertions(+), 221 deletions(-)

diff --git a/fs/unionfs/commonfops.c b/fs/unionfs/commonfops.c
index edb52c0..64bd0bd 100644
--- a/fs/unionfs/commonfops.c
+++ b/fs/unionfs/commonfops.c
@@ -31,7 +31,6 @@ static int copyup_deleted_file(struct file *file, struct dentry *dentry,
const int countersize = sizeof(counter) * 2;
const int nlen = sizeof(".unionfs") + i_inosize + countersize - 1;
char name[nlen + 1];
-
int err;
struct dentry *tmp_dentry = NULL;
struct dentry *lower_dentry;
@@ -42,7 +41,6 @@ static int copyup_deleted_file(struct file *file, struct dentry *dentry,
sprintf(name, ".unionfs%*.*lx",
i_inosize, i_inosize, lower_dentry->d_inode->i_ino);

-retry:
/*
* Loop, looking for an unused temp name to copyup to.
*
@@ -52,6 +50,7 @@ retry:
* the name exists in the dest branch, but it'd be nice to catch it
* sooner than later.
*/
+retry:
tmp_dentry = NULL;
do {
char *suffix = name + nlen - countersize;
@@ -73,14 +72,20 @@ retry:
dput(tmp_dentry);

err = copyup_named_file(dentry->d_parent->d_inode, file, name, bstart,
- bindex, file->f_dentry->d_inode->i_size);
- if (err == -EEXIST)
- goto retry;
- else if (err)
+ bindex, file->f_path.dentry->d_inode->i_size);
+ if (err) {
+ if (err == -EEXIST)
+ goto retry;
goto out;
+ }

/* bring it to the same state as an unlinked file */
lower_dentry = unionfs_lower_dentry_idx(dentry, dbstart(dentry));
+ if (!unionfs_lower_inode_idx(dentry->d_inode, bindex)) {
+ atomic_inc(&lower_dentry->d_inode->i_count);
+ unionfs_set_lower_inode_idx(dentry->d_inode, bindex,
+ lower_dentry->d_inode);
+ }
lower_dir_dentry = lock_parent(lower_dentry);
err = vfs_unlink(lower_dir_dentry->d_inode, lower_dentry);
unlock_dir(lower_dir_dentry);
@@ -96,48 +101,52 @@ out:
static void cleanup_file(struct file *file)
{
int bindex, bstart, bend;
- struct file **lf;
- struct super_block *sb = file->f_dentry->d_sb;
+ struct file **lower_files;
+ struct file *lower_file;
+ struct super_block *sb = file->f_path.dentry->d_sb;

- lf = UNIONFS_F(file)->lower_files;
+ lower_files = UNIONFS_F(file)->lower_files;
bstart = fbstart(file);
bend = fbend(file);

for (bindex = bstart; bindex <= bend; bindex++) {
- if (unionfs_lower_file_idx(file, bindex)) {
- /*
- * Find new index of matching branch with an open
- * file, since branches could have been added or
- * deleted causing the one with open files to shift.
- */
- int i; /* holds (possibly) updated branch index */
- int old_bid;
-
- old_bid = UNIONFS_F(file)->saved_branch_ids[bindex];
- i = branch_id_to_idx(sb, old_bid);
- if (i < 0)
- printk(KERN_ERR "unionfs: no superblock for "
- "file %p\n", file);
- else {
- /* decrement count of open files */
- branchput(sb, i);
- /*
- * fput will perform an mntput for us on the
- * correct branch. Although we're using the
- * file's old branch configuration, bindex,
- * which is the old index, correctly points
- * to the right branch in the file's branch
- * list. In other words, we're going to
- * mntput the correct branch even if
- * branches have been added/removed.
- */
- fput(unionfs_lower_file_idx(file, bindex));
- }
+ int i; /* holds (possibly) updated branch index */
+ int old_bid;
+
+ lower_file = unionfs_lower_file_idx(file, bindex);
+ if (!lower_file)
+ continue;
+
+ /*
+ * Find new index of matching branch with an open
+ * file, since branches could have been added or
+ * deleted causing the one with open files to shift.
+ */
+ old_bid = UNIONFS_F(file)->saved_branch_ids[bindex];
+ i = branch_id_to_idx(sb, old_bid);
+ if (i < 0) {
+ printk(KERN_ERR "unionfs: no superblock for "
+ "file %p\n", file);
+ continue;
}
+
+ /* decrement count of open files */
+ branchput(sb, i);
+ /*
+ * fput will perform an mntput for us on the correct branch.
+ * Although we're using the file's old branch configuration,
+ * bindex, which is the old index, correctly points to the
+ * right branch in the file's branch list. In other words,
+ * we're going to mntput the correct branch even if branches
+ * have been added/removed.
+ */
+ fput(lower_file);
+ UNIONFS_F(file)->lower_files[bindex] = NULL;
+ UNIONFS_F(file)->saved_branch_ids[bindex] = -1;
}

UNIONFS_F(file)->lower_files = NULL;
- kfree(lf);
+ kfree(lower_files);
kfree(UNIONFS_F(file)->saved_branch_ids);
/* set to NULL because caller needs to know if to kfree on error */
UNIONFS_F(file)->saved_branch_ids = NULL;
@@ -209,7 +218,6 @@ static int open_highest_file(struct file *file, int willwrite)

dget(lower_dentry);
unionfs_mntget(dentry, bstart);
- branchget(sb, bstart);
lower_file = dentry_open(lower_dentry,
unionfs_lower_mnt_idx(dentry, bstart),
file->f_flags);
@@ -217,6 +225,7 @@ static int open_highest_file(struct file *file, int willwrite)
err = PTR_ERR(lower_file);
goto out;
}
+ branchget(sb, bstart);
unionfs_set_lower_file(file, lower_file);
/* Fix up the position. */
lower_file->f_pos = file->f_pos;
@@ -227,19 +236,20 @@ out:
}

/* perform a delayed copyup of a read-write file on a read-only branch */
-static int do_delayed_copyup(struct file *file, struct dentry *dentry)
+static int do_delayed_copyup(struct file *file)
{
int bindex, bstart, bend, err = 0;
+ struct dentry *dentry = file->f_path.dentry;
struct inode *parent_inode = dentry->d_parent->d_inode;
- loff_t inode_size = file->f_dentry->d_inode->i_size;
+ loff_t inode_size = dentry->d_inode->i_size;

bstart = fbstart(file);
bend = fbend(file);

- BUG_ON(!S_ISREG(file->f_dentry->d_inode->i_mode));
+ BUG_ON(!S_ISREG(dentry->d_inode->i_mode));

for (bindex = bstart - 1; bindex >= 0; bindex--) {
- if (!d_deleted(file->f_dentry))
+ if (!d_deleted(dentry))
err = copyup_file(parent_inode, file, bstart,
bindex, inode_size);
else
@@ -249,17 +259,34 @@ static int do_delayed_copyup(struct file *file, struct dentry *dentry)
if (!err)
break;
}
- if (!err && (bstart > fbstart(file))) {
- bend = fbend(file);
- for (bindex = bstart; bindex <= bend; bindex++) {
- if (unionfs_lower_file_idx(file, bindex)) {
- branchput(dentry->d_sb, bindex);
- fput(unionfs_lower_file_idx(file, bindex));
- unionfs_set_lower_file_idx(file, bindex, NULL);
- }
+ if (err || (bstart <= fbstart(file)))
+ goto out;
+ bend = fbend(file);
+ for (bindex = bstart; bindex <= bend; bindex++) {
+ if (unionfs_lower_file_idx(file, bindex)) {
+ branchput(dentry->d_sb, bindex);
+ fput(unionfs_lower_file_idx(file, bindex));
+ unionfs_set_lower_file_idx(file, bindex, NULL);
+ }
+ if (unionfs_lower_mnt_idx(dentry, bindex)) {
+ unionfs_mntput(dentry, bindex);
+ unionfs_set_lower_mnt_idx(dentry, bindex, NULL);
+ }
+ if (unionfs_lower_dentry_idx(dentry, bindex)) {
+ BUG_ON(!dentry->d_inode);
+ iput(unionfs_lower_inode_idx(dentry->d_inode, bindex));
+ unionfs_set_lower_inode_idx(dentry->d_inode, bindex,
+ NULL);
+ dput(unionfs_lower_dentry_idx(dentry, bindex));
+ unionfs_set_lower_dentry_idx(dentry, bindex, NULL);
}
- fbend(file) = bend;
}
+ /* for reg file, we only open it "once" */
+ fbend(file) = fbstart(file);
+ set_dbend(dentry, dbstart(dentry));
+ ibend(dentry->d_inode) = ibstart(dentry->d_inode);
+
+out:
return err;
}

@@ -348,7 +375,7 @@ int unionfs_file_revalidate(struct file *file, int willwrite)
is_robranch(dentry)) {
printk(KERN_DEBUG "unionfs: Doing delayed copyup of a "
"read-write file on a read-only branch.\n");
- err = do_delayed_copyup(file, dentry);
+ err = do_delayed_copyup(file);
}

out:
diff --git a/fs/unionfs/copyup.c b/fs/unionfs/copyup.c
index 410ce07..9c476cd 100644
--- a/fs/unionfs/copyup.c
+++ b/fs/unionfs/copyup.c
@@ -23,15 +23,6 @@
* Documentation/filesystems/unionfs/concepts.txt
*/

-/* forward definitions */
-static int copyup_named_dentry(struct inode *dir, struct dentry *dentry,
- int bstart, int new_bindex, const char *name,
- int namelen, struct file **copyup_file,
- loff_t len);
-static struct dentry *create_parents_named(struct inode *dir,
- struct dentry *dentry,
- const char *name, int bindex);
-
#ifdef CONFIG_UNION_FS_XATTR
/* copyup all extended attrs for a given dentry */
static int copyup_xattrs(struct dentry *old_lower_dentry,
@@ -101,7 +92,14 @@ out:
}
#endif /* CONFIG_UNION_FS_XATTR */

-/* Determine the mode based on the copyup flags, and the existing dentry. */
+/*
+ * Determine the mode based on the copyup flags, and the existing dentry.
+ *
+ * Handle file systems which may not support certain options. For example
+ * jffs2 doesn't allow one to chmod a symlink. So we ignore such harmless
+ * errors, rather than propagating them up, which results in copyup errors
+ * and errors returned back to users.
+ */
static int copyup_permissions(struct super_block *sb,
struct dentry *old_lower_dentry,
struct dentry *new_lower_dentry)
@@ -113,30 +111,31 @@ static int copyup_permissions(struct super_block *sb,
newattrs.ia_atime = i->i_atime;
newattrs.ia_mtime = i->i_mtime;
newattrs.ia_ctime = i->i_ctime;
-
newattrs.ia_gid = i->i_gid;
newattrs.ia_uid = i->i_uid;
-
- newattrs.ia_mode = i->i_mode;
-
newattrs.ia_valid = ATTR_CTIME | ATTR_ATIME | ATTR_MTIME |
ATTR_ATIME_SET | ATTR_MTIME_SET | ATTR_FORCE |
- ATTR_GID | ATTR_UID | ATTR_MODE;
+ ATTR_GID | ATTR_UID;
+ err = notify_change(new_lower_dentry, &newattrs);
+ if (err)
+ goto out;

+ /* now try to change the mode and ignore EOPNOTSUPP on symlinks */
+ newattrs.ia_mode = i->i_mode;
+ newattrs.ia_valid = ATTR_MODE | ATTR_FORCE;
err = notify_change(new_lower_dentry, &newattrs);
+ if (err == -EOPNOTSUPP &&
+ S_ISLNK(new_lower_dentry->d_inode->i_mode)) {
+ printk(KERN_WARNING
+ "unionfs: changing \"%s\" symlink mode unsupported\n",
+ new_lower_dentry->d_name.name);
+ err = 0;
+ }

+out:
return err;
}

-int copyup_dentry(struct inode *dir, struct dentry *dentry,
- int bstart, int new_bindex,
- struct file **copyup_file, loff_t len)
-{
- return copyup_named_dentry(dir, dentry, bstart, new_bindex,
- dentry->d_name.name,
- dentry->d_name.len, copyup_file, len);
-}
-
/*
* create the new device/file/directory - use copyup_permission to copyup
* times, and mode
@@ -202,6 +201,7 @@ static int __copyup_reg_data(struct dentry *dentry,
struct super_block *sb = dentry->d_sb;
struct file *input_file;
struct file *output_file;
+ struct vfsmount *output_mnt;
mm_segment_t old_fs;
char *buf = NULL;
ssize_t read_bytes, write_bytes;
@@ -211,6 +211,7 @@ static int __copyup_reg_data(struct dentry *dentry,
/* open old file */
unionfs_mntget(dentry, old_bindex);
branchget(sb, old_bindex);
+ /* dentry_open calls dput and mntput if it returns an error */
input_file = dentry_open(old_lower_dentry,
unionfs_lower_mnt_idx(dentry, old_bindex),
O_RDONLY | O_LARGEFILE);
@@ -226,10 +227,9 @@ static int __copyup_reg_data(struct dentry *dentry,

/* open new file */
dget(new_lower_dentry);
- unionfs_mntget(dentry, new_bindex);
+ output_mnt = unionfs_mntget(sb->s_root, new_bindex);
branchget(sb, new_bindex);
- output_file = dentry_open(new_lower_dentry,
- unionfs_lower_mnt_idx(dentry, new_bindex),
+ output_file = dentry_open(new_lower_dentry, output_mnt,
O_RDWR | O_LARGEFILE);
if (IS_ERR(output_file)) {
err = PTR_ERR(output_file);
@@ -331,11 +331,21 @@ static void __clear(struct dentry *dentry, struct dentry *old_lower_dentry,
dput(old_lower_dentry);
}

-/* copy up a dentry to a file of specified name */
-static int copyup_named_dentry(struct inode *dir, struct dentry *dentry,
- int bstart, int new_bindex, const char *name,
- int namelen, struct file **copyup_file,
- loff_t len)
+/*
+ * Copy up a dentry to a file of specified name.
+ *
+ * @dir: used to pull the ->i_sb to access other branches
+ * @dentry: the non-negative dentry whose lower_inode we should copy
+ * @bstart: the branch of the lower_inode to copy from
+ * @new_bindex: the branch to create the new file in
+ * @name: the name of the file to create
+ * @namelen: length of @name
+ * @copyup_file: the "struct file" to return (optional)
+ * @len: how many bytes to copy-up?
+ */
+int copyup_dentry(struct inode *dir, struct dentry *dentry, int bstart,
+ int new_bindex, const char *name, int namelen,
+ struct file **copyup_file, loff_t len)
{
struct dentry *new_lower_dentry;
struct dentry *old_lower_dentry = NULL;
@@ -363,8 +373,7 @@ static int copyup_named_dentry(struct inode *dir, struct dentry *dentry,
goto out;

/* Create the directory structure above this dentry. */
- new_lower_dentry =
- create_parents_named(dir, dentry, name, new_bindex);
+ new_lower_dentry = create_parents(dir, dentry, name, new_bindex);
if (IS_ERR(new_lower_dentry)) {
err = PTR_ERR(new_lower_dentry);
goto out;
@@ -479,6 +488,25 @@ out_free:
dput(old_lower_dentry);
kfree(symbuf);

+ if (err)
+ goto out;
+ if (!S_ISDIR(dentry->d_inode->i_mode)) {
+ unionfs_postcopyup_release(dentry);
+ if (!unionfs_lower_inode(dentry->d_inode)) {
+ /*
+ * If we got here, then we copied up to an
+ * unlinked-open file, whose name is .unionfsXXXXX.
+ */
+ struct inode *inode = new_lower_dentry->d_inode;
+ atomic_inc(&inode->i_count);
+ unionfs_set_lower_inode_idx(dentry->d_inode,
+ ibstart(dentry->d_inode),
+ inode);
+ }
+ }
+ unionfs_postcopyup_setmnt(dentry);
+ /* sync inode times from copied-up inode to our inode */
+ unionfs_copy_attr_times(dentry->d_inode);
out:
return err;
}
@@ -494,9 +522,8 @@ int copyup_named_file(struct inode *dir, struct file *file, char *name,
int err = 0;
struct file *output_file = NULL;

- err = copyup_named_dentry(dir, file->f_dentry, bstart,
- new_bindex, name, strlen(name), &output_file,
- len);
+ err = copyup_dentry(dir, file->f_path.dentry, bstart, new_bindex,
+ name, strlen(name), &output_file, len);
if (!err) {
fbstart(file) = new_bindex;
unionfs_set_lower_file_idx(file, new_bindex, output_file);
@@ -514,8 +541,10 @@ int copyup_file(struct inode *dir, struct file *file, int bstart,
{
int err = 0;
struct file *output_file = NULL;
+ struct dentry *dentry = file->f_path.dentry;

- err = copyup_dentry(dir, file->f_dentry, bstart, new_bindex,
+ err = copyup_dentry(dir, dentry, bstart, new_bindex,
+ dentry->d_name.name, dentry->d_name.len,
&output_file, len);
if (!err) {
fbstart(file) = new_bindex;
@@ -525,17 +554,6 @@ int copyup_file(struct inode *dir, struct file *file, int bstart,
return err;
}

-/*
- * This function replicates the directory structure up-to given dentry in the
- * bindex branch. Can create directory structure recursively to the right
- * also.
- */
-struct dentry *create_parents(struct inode *dir, struct dentry *dentry,
- int bindex)
-{
- return create_parents_named(dir, dentry, dentry->d_name.name, bindex);
-}
-
/* purge a dentry's lower-branch states (dput/mntput, etc.) */
static void __cleanup_dentry(struct dentry *dentry, int bindex,
int old_bstart, int old_bend)
@@ -615,9 +633,8 @@ static void __set_dentry(struct dentry *upper, struct dentry *lower,
* This function replicates the directory structure up-to given dentry
* in the bindex branch.
*/
-static struct dentry *create_parents_named(struct inode *dir,
- struct dentry *dentry,
- const char *name, int bindex)
+struct dentry *create_parents(struct inode *dir, struct dentry *dentry,
+ const char *name, int bindex)
{
int err;
struct dentry *child_dentry;
@@ -626,10 +643,8 @@ static struct dentry *create_parents_named(struct inode *dir,
struct dentry *lower_dentry = NULL;
const char *childname;
unsigned int childnamelen;
-
int nr_dentry;
int count = 0;
-
int old_bstart;
int old_bend;
struct dentry **path = NULL;
@@ -681,7 +696,8 @@ static struct dentry *create_parents_named(struct inode *dir,
void *p;

nr_dentry *= 2;
- p = krealloc(path, nr_dentry * sizeof(struct dentry *), GFP_KERNEL);
+ p = krealloc(path, nr_dentry * sizeof(struct dentry *),
+ GFP_KERNEL);
if (!p) {
lower_dentry = ERR_PTR(-ENOMEM);
goto out;
@@ -697,106 +713,100 @@ static struct dentry *create_parents_named(struct inode *dir,
sb = dentry->d_sb;

/*
- * This is basically while(child_dentry != dentry). This loop is
- * horrible to follow and should be replaced with cleaner code.
+ * This code goes between the begin/end labels and basically
+ * emulates a while(child_dentry != dentry), only cleaner and
+ * shorter than what would be a much longer while loop.
*/
- while (1) {
- /* get lower parent dir in the current branch */
- lower_parent_dentry =
- unionfs_lower_dentry_idx(parent_dentry, bindex);
- unionfs_unlock_dentry(parent_dentry);
-
- /* init the values to lookup */
- childname = child_dentry->d_name.name;
- childnamelen = child_dentry->d_name.len;
-
- if (child_dentry != dentry) {
- /* lookup child in the underlying file system */
- lower_dentry =
- lookup_one_len(childname, lower_parent_dentry,
- childnamelen);
- if (IS_ERR(lower_dentry))
- goto out;
- } else {
+begin:
+ /* get lower parent dir in the current branch */
+ lower_parent_dentry = unionfs_lower_dentry_idx(parent_dentry, bindex);
+ unionfs_unlock_dentry(parent_dentry);
+
+ /* init the values to lookup */
+ childname = child_dentry->d_name.name;
+ childnamelen = child_dentry->d_name.len;
+
+ if (child_dentry != dentry) {
+ /* lookup child in the underlying file system */
+ lower_dentry = lookup_one_len(childname, lower_parent_dentry,
+ childnamelen);
+ if (IS_ERR(lower_dentry))
+ goto out;
+ } else {
+ /*
+ * Is the name a whiteout of the child name ? lookup the
+ * whiteout child in the underlying file system
+ */
+ lower_dentry = lookup_one_len(name, lower_parent_dentry,
+ strlen(name));
+ if (IS_ERR(lower_dentry))
+ goto out;

- /*
- * is the name a whiteout of the child name ?
- * lookup the whiteout child in the underlying file
- * system
- */
- lower_dentry =
- lookup_one_len(name, lower_parent_dentry,
- strlen(name));
- if (IS_ERR(lower_dentry))
- goto out;
+ /* Replace the current dentry (if any) with the new one */
+ dput(unionfs_lower_dentry_idx(dentry, bindex));
+ unionfs_set_lower_dentry_idx(dentry, bindex,
+ lower_dentry);

- /*
- * Replace the current dentry (if any) with the new
- * one.
- */
- dput(unionfs_lower_dentry_idx(dentry, bindex));
- unionfs_set_lower_dentry_idx(dentry, bindex,
- lower_dentry);
+ __cleanup_dentry(dentry, bindex, old_bstart, old_bend);
+ goto out;
+ }

- __cleanup_dentry(dentry, bindex, old_bstart, old_bend);
- break;
- }
+ if (lower_dentry->d_inode) {
+ /*
+ * since this already exists we dput to avoid
+ * multiple references on the same dentry
+ */
+ dput(lower_dentry);
+ } else {
+ struct sioq_args args;
+
+ /* it's a negative dentry, create a new dir */
+ lower_parent_dentry = lock_parent(lower_dentry);
+
+ args.mkdir.parent = lower_parent_dentry->d_inode;
+ args.mkdir.dentry = lower_dentry;
+ args.mkdir.mode = child_dentry->d_inode->i_mode;
+
+ run_sioq(__unionfs_mkdir, &args);
+ err = args.err;

- if (lower_dentry->d_inode) {
+ if (!err)
+ err = copyup_permissions(dir->i_sb, child_dentry,
+ lower_dentry);
+ unlock_dir(lower_parent_dentry);
+ if (err) {
+ struct inode *inode = lower_dentry->d_inode;
/*
- * since this already exists we dput to avoid
- * multiple references on the same dentry
+ * If we get here, it means that we created a new
+ * dentry+inode, but copying permissions failed.
+ * Therefore, we should delete this inode and dput
+ * the dentry so as not to leave cruft behind.
*/
+ if (lower_dentry->d_op && lower_dentry->d_op->d_iput)
+ lower_dentry->d_op->d_iput(lower_dentry,
+ inode);
+ else
+ iput(inode);
+ lower_dentry->d_inode = NULL;
dput(lower_dentry);
- } else {
- struct sioq_args args;
-
- /* its a negative dentry, create a new dir */
- lower_parent_dentry = lock_parent(lower_dentry);
-
- args.mkdir.parent = lower_parent_dentry->d_inode;
- args.mkdir.dentry = lower_dentry;
- args.mkdir.mode = child_dentry->d_inode->i_mode;
-
- run_sioq(__unionfs_mkdir, &args);
- err = args.err;
-
- if (!err)
- err = copyup_permissions(dir->i_sb,
- child_dentry,
- lower_dentry);
- unlock_dir(lower_parent_dentry);
- if (err) {
- struct inode *inode = lower_dentry->d_inode;
- /*
- * If we get here, it means that we created a new
- * dentry+inode, but copying permissions failed.
- * Therefore, we should delete this inode and dput
- * the dentry so as not to leave cruft behind.
- *
- * XXX: call dentry_iput() instead, but then we have
- * to export that symbol.
- */
- if (lower_dentry->d_op && lower_dentry->d_op->d_iput)
- lower_dentry->d_op->d_iput(lower_dentry,
- inode);
- else
- iput(inode);
- lower_dentry->d_inode = NULL;
-
- dput(lower_dentry);
- lower_dentry = ERR_PTR(err);
- goto out;
- }
-
+ lower_dentry = ERR_PTR(err);
+ goto out;
}

- __set_inode(child_dentry, lower_dentry, bindex);
- __set_dentry(child_dentry, lower_dentry, bindex);
-
- parent_dentry = child_dentry;
- child_dentry = path[--count];
}
+
+ __set_inode(child_dentry, lower_dentry, bindex);
+ __set_dentry(child_dentry, lower_dentry, bindex);
+ /*
+ * update times of this dentry, but also the parent, because if
+ * we changed, the parent may have changed too.
+ */
+ unionfs_copy_attr_times(parent_dentry->d_inode);
+ unionfs_copy_attr_times(child_dentry->d_inode);
+
+ parent_dentry = child_dentry;
+ child_dentry = path[--count];
+ goto begin;
out:
/* cleanup any leftover locks from the do/while loop above */
if (IS_ERR(lower_dentry))
@@ -805,3 +815,53 @@ out:
kfree(path);
return lower_dentry;
}
+
+/*
+ * Post-copyup helper to ensure we have valid mnts: set lower mnt of
+ * dentry+parents to the first parent node that has an mnt.
+ */
+void unionfs_postcopyup_setmnt(struct dentry *dentry)
+{
+ struct dentry *parent, *hasone;
+ int bindex = dbstart(dentry);
+
+ if (unionfs_lower_mnt_idx(dentry, bindex))
+ return;
+ hasone = dentry->d_parent;
+ /* this loop should stop at root dentry */
+ while (!unionfs_lower_mnt_idx(hasone, bindex))
+ hasone = hasone->d_parent;
+ parent = dentry;
+ while (!unionfs_lower_mnt_idx(parent, bindex)) {
+ unionfs_set_lower_mnt_idx(parent, bindex,
+ unionfs_mntget(hasone, bindex));
+ parent = parent->d_parent;
+ }
+}
+
+/*
+ * Post-copyup helper to release all non-directory source objects of a
+ * copied-up file. Regular files should have only one lower object.
+ */
+void unionfs_postcopyup_release(struct dentry *dentry)
+{
+ int bindex;
+
+ BUG_ON(S_ISDIR(dentry->d_inode->i_mode));
+ for (bindex=dbstart(dentry)+1; bindex<=dbend(dentry); bindex++) {
+ if (unionfs_lower_mnt_idx(dentry, bindex)) {
+ unionfs_mntput(dentry, bindex);
+ unionfs_set_lower_mnt_idx(dentry, bindex, NULL);
+ }
+ if (unionfs_lower_dentry_idx(dentry, bindex)) {
+ dput(unionfs_lower_dentry_idx(dentry, bindex));
+ unionfs_set_lower_dentry_idx(dentry, bindex, NULL);
+ iput(unionfs_lower_inode_idx(dentry->d_inode, bindex));
+ unionfs_set_lower_inode_idx(dentry->d_inode, bindex,
+ NULL);
+ }
+ }
+ bindex = dbstart(dentry);
+ set_dbend(dentry, bindex);
+ ibend(dentry->d_inode) = ibstart(dentry->d_inode) = bindex;
+}
diff --git a/fs/unionfs/inode.c b/fs/unionfs/inode.c
index 4789fd5..6cec564 100644
--- a/fs/unionfs/inode.c
+++ b/fs/unionfs/inode.c
@@ -167,7 +167,9 @@ static int unionfs_create(struct inode *parent, struct dentry *dentry,
* because lookup passed as a negative unionfs dentry
* pointing to a lone negative underlying dentry.
*/
- lower_dentry = create_parents(parent, dentry, bindex);
+ lower_dentry = create_parents(parent, dentry,
+ dentry->d_name.name,
+ bindex);
if (!lower_dentry || IS_ERR(lower_dentry)) {
if (IS_ERR(lower_dentry))
err = PTR_ERR(lower_dentry);
@@ -321,8 +323,9 @@ static int unionfs_link(struct dentry *old_dentry, struct inode *dir,
}

if (dbstart(old_dentry) != dbstart(new_dentry)) {
- lower_new_dentry =
- create_parents(dir, new_dentry, dbstart(old_dentry));
+ lower_new_dentry = create_parents(dir, new_dentry,
+ new_dentry->d_name.name,
+ dbstart(old_dentry));
err = PTR_ERR(lower_new_dentry);
if (IS_COPYUP_ERR(err))
goto docopyup;
@@ -347,11 +350,13 @@ docopyup:
for (bindex = old_bstart - 1; bindex >= 0; bindex--) {
err = copyup_dentry(old_dentry->d_parent->d_inode,
old_dentry, old_bstart,
- bindex, NULL,
+ bindex, old_dentry->d_name.name,
+ old_dentry->d_name.len, NULL,
old_dentry->d_inode->i_size);
if (!err) {
lower_new_dentry =
create_parents(dir, new_dentry,
+ new_dentry->d_name.name,
bindex);
lower_old_dentry =
unionfs_lower_dentry(old_dentry);
@@ -388,6 +393,8 @@ out:
d_drop(new_dentry);

kfree(name);
+ if (!err)
+ unionfs_postcopyup_setmnt(new_dentry);

unionfs_unlock_dentry(new_dentry);
unionfs_unlock_dentry(old_dentry);
@@ -488,7 +495,9 @@ static int unionfs_symlink(struct inode *dir, struct dentry *dentry,
* unionfs dentry pointing to a lone negative
* underlying dentry
*/
- lower_dentry = create_parents(dir, dentry, bindex);
+ lower_dentry = create_parents(dir, dentry,
+ dentry->d_name.name,
+ bindex);
if (!lower_dentry || IS_ERR(lower_dentry)) {
if (IS_ERR(lower_dentry))
err = PTR_ERR(lower_dentry);
@@ -621,7 +630,9 @@ static int unionfs_mkdir(struct inode *parent, struct dentry *dentry, int mode)

lower_dentry = unionfs_lower_dentry_idx(dentry, bindex);
if (!lower_dentry) {
- lower_dentry = create_parents(parent, dentry, bindex);
+ lower_dentry = create_parents(parent, dentry,
+ dentry->d_name.name,
+ bindex);
if (!lower_dentry || IS_ERR(lower_dentry)) {
printk(KERN_DEBUG "unionfs: lower dentry "
" NULL for bindex = %d\n", bindex);
@@ -759,7 +770,9 @@ static int unionfs_mknod(struct inode *dir, struct dentry *dentry, int mode,

lower_dentry = unionfs_lower_dentry_idx(dentry, bindex);
if (!lower_dentry) {
- lower_dentry = create_parents(dir, dentry, bindex);
+ lower_dentry = create_parents(dir, dentry,
+ dentry->d_name.name,
+ bindex);
if (IS_ERR(lower_dentry)) {
printk(KERN_DEBUG "unionfs: failed to create "
"parents on %d, err = %ld\n",
@@ -1068,8 +1081,10 @@ static int unionfs_setattr(struct dentry *dentry, struct iattr *ia)
if (ia->ia_valid & ATTR_SIZE)
size = ia->ia_size;
err = copyup_dentry(dentry->d_parent->d_inode,
- dentry, bstart, i, NULL,
- size);
+ dentry, bstart, i,
+ dentry->d_name.name,
+ dentry->d_name.len,
+ NULL, size);

if (!err) {
copyup = 1;
diff --git a/fs/unionfs/rename.c b/fs/unionfs/rename.c
index f8ea37a..acf829a 100644
--- a/fs/unionfs/rename.c
+++ b/fs/unionfs/rename.c
@@ -37,7 +37,8 @@ static int __unionfs_rename(struct inode *old_dir, struct dentry *old_dentry,
if (!lower_new_dentry) {
lower_new_dentry =
create_parents(new_dentry->d_parent->d_inode,
- new_dentry, bindex);
+ new_dentry, new_dentry->d_name.name,
+ bindex);
if (IS_ERR(lower_new_dentry)) {
printk(KERN_DEBUG "unionfs: error creating directory "
"tree for rename, bindex = %d, err = %ld\n",
@@ -226,15 +227,18 @@ static int do_unionfs_rename(struct inode *old_dir,
*/
err = copyup_dentry(old_dentry->d_parent->d_inode,
old_dentry, old_bstart, bindex,
+ old_dentry->d_name.name,
+ old_dentry->d_name.len,
NULL, old_dentry->d_inode->i_size);
- if (!err) {
- dput(wh_old);
- bwh_old = bindex;
- err = __unionfs_rename(old_dir, old_dentry,
- new_dir, new_dentry,
- bindex, &wh_old);
- break;
- }
+ /* if copyup failed, try next branch to the left */
+ if (err)
+ continue;
+ dput(wh_old);
+ bwh_old = bindex;
+ err = __unionfs_rename(old_dir, old_dentry,
+ new_dir, new_dentry,
+ bindex, &wh_old);
+ break;
}
}

@@ -468,8 +472,12 @@ out:
*/
if (S_ISDIR(old_dentry->d_inode->i_mode))
atomic_dec(&UNIONFS_D(old_dentry)->generation);
+ else
+ unionfs_postcopyup_release(old_dentry);
if (new_dentry->d_inode &&
!S_ISDIR(new_dentry->d_inode->i_mode)) {
+ unionfs_postcopyup_release(new_dentry);
+ unionfs_postcopyup_setmnt(new_dentry);
if (!unionfs_lower_inode(new_dentry->d_inode)) {
/*
* If we get here, it means that no copyup
diff --git a/fs/unionfs/subr.c b/fs/unionfs/subr.c
index 3cebc17..5db9e62 100644
--- a/fs/unionfs/subr.c
+++ b/fs/unionfs/subr.c
@@ -56,7 +56,9 @@ int create_whiteout(struct dentry *dentry, int start)
* this dentry.
*/
lower_dentry = create_parents(dentry->d_inode,
- dentry, bindex);
+ dentry,
+ dentry->d_name.name,
+ bindex);
if (!lower_dentry || IS_ERR(lower_dentry)) {
printk(KERN_DEBUG "unionfs: create_parents "
"failed for bindex = %d\n", bindex);
diff --git a/fs/unionfs/union.h b/fs/unionfs/union.h
index a9f9b97..f8a9cd2 100644
--- a/fs/unionfs/union.h
+++ b/fs/unionfs/union.h
@@ -255,7 +255,7 @@ extern void update_bstart(struct dentry *dentry);

/* replicates the directory structure up to given dentry in given branch */
extern struct dentry *create_parents(struct inode *dir, struct dentry *dentry,
- int bindex);
+ const char *name, int bindex);
extern int make_dir_opaque(struct dentry *dir, int bindex);

/* partial lookup */
@@ -275,9 +275,12 @@ extern int copyup_named_file(struct inode *dir, struct file *file,
char *name, int bstart, int new_bindex,
loff_t len);
/* copies a dentry from dbstart to newbindex branch */
-extern int copyup_dentry(struct inode *dir, struct dentry *dentry, int bstart,
- int new_bindex, struct file **copyup_file,
- loff_t len);
+extern int copyup_dentry(struct inode *dir, struct dentry *dentry,
+ int bstart, int new_bindex, const char *name,
+ int namelen, struct file **copyup_file, loff_t len);
+/* helper functions for post-copyup actions */
+extern void unionfs_postcopyup_setmnt(struct dentry *dentry);
+extern void unionfs_postcopyup_release(struct dentry *dentry);

extern int remove_whiteouts(struct dentry *dentry,
struct dentry *lower_dentry, int bindex);
diff --git a/fs/unionfs/unlink.c b/fs/unionfs/unlink.c
index 4e1c8f3..1e5bfce 100644
--- a/fs/unionfs/unlink.c
+++ b/fs/unionfs/unlink.c
@@ -88,6 +88,8 @@ int unionfs_unlink(struct inode *dir, struct dentry *dentry)
err = unionfs_unlink_whiteout(dir, dentry);
/* call d_drop so the system "forgets" about us */
if (!err) {
+ if (!S_ISDIR(dentry->d_inode->i_mode))
+ unionfs_postcopyup_release(dentry);
d_drop(dentry);
/*
* if unlink/whiteout succeeded, parent dir mtime has
--
1.5.2.2.238.g7cbf2f2

2007-09-03 02:33:49

by Josef 'Jeff' Sipek

[permalink] [raw]
Subject: [PATCH 07/32] Unionfs: cpp endif comments

From: Erez Zadok <[email protected]>

Add comments to #endif's to help clarify code.

Signed-off-by: Erez Zadok <[email protected]>
Signed-off-by: Josef 'Jeff' Sipek <[email protected]>
---
fs/unionfs/copyup.c | 2 +-
fs/unionfs/inode.c | 4 ++--
fs/unionfs/sioq.h | 2 +-
fs/unionfs/union.h | 2 +-
4 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/fs/unionfs/copyup.c b/fs/unionfs/copyup.c
index cc7a816..410ce07 100644
--- a/fs/unionfs/copyup.c
+++ b/fs/unionfs/copyup.c
@@ -433,7 +433,7 @@ static int copyup_named_dentry(struct inode *dir, struct dentry *dentry,
/* Selinux uses extended attributes for permissions. */
if ((err = copyup_xattrs(old_lower_dentry, new_lower_dentry)))
goto out_unlink;
-#endif
+#endif /* CONFIG_UNION_FS_XATTR */

/* do not allow files getting deleted to be re-interposed */
if (!d_deleted(dentry))
diff --git a/fs/unionfs/inode.c b/fs/unionfs/inode.c
index cbbde6f..9261bed 100644
--- a/fs/unionfs/inode.c
+++ b/fs/unionfs/inode.c
@@ -1122,7 +1122,7 @@ struct inode_operations unionfs_dir_iops = {
.getxattr = unionfs_getxattr,
.removexattr = unionfs_removexattr,
.listxattr = unionfs_listxattr,
-#endif
+#endif /* CONFIG_UNION_FS_XATTR */
};

struct inode_operations unionfs_main_iops = {
@@ -1133,5 +1133,5 @@ struct inode_operations unionfs_main_iops = {
.getxattr = unionfs_getxattr,
.removexattr = unionfs_removexattr,
.listxattr = unionfs_listxattr,
-#endif
+#endif /* CONFIG_UNION_FS_XATTR */
};
diff --git a/fs/unionfs/sioq.h b/fs/unionfs/sioq.h
index bedd7af..afb71ee 100644
--- a/fs/unionfs/sioq.h
+++ b/fs/unionfs/sioq.h
@@ -89,4 +89,4 @@ extern void __unionfs_unlink(struct work_struct *work);
extern void __delete_whiteouts(struct work_struct *work);
extern void __is_opaque_dir(struct work_struct *work);

-#endif /* _SIOQ_H */
+#endif /* not _SIOQ_H */
diff --git a/fs/unionfs/union.h b/fs/unionfs/union.h
index 2a8f763..b7e5a93 100644
--- a/fs/unionfs/union.h
+++ b/fs/unionfs/union.h
@@ -419,7 +419,7 @@ static inline int is_robranch(const struct dentry *dentry)

#ifndef DEFAULT_POLLMASK
#define DEFAULT_POLLMASK (POLLIN | POLLOUT | POLLRDNORM | POLLWRNORM)
-#endif
+#endif /* not DEFAULT_POLLMASK */

/*
* EXTERNALS:
--
1.5.2.2.238.g7cbf2f2

2007-09-03 02:34:25

by Josef 'Jeff' Sipek

[permalink] [raw]
Subject: [PATCH 14/32] Unionfs: file_revalidate updates

From: Erez Zadok <[email protected]>

Correctly revalidate a file and account for lower mnts, even when branches
are updated or inserted. Better info upon file copyup.

Signed-off-by: Erez Zadok <[email protected]>
Signed-off-by: Josef 'Jeff' Sipek <[email protected]>
---
fs/unionfs/commonfops.c | 25 +++++++++++++++++++++----
1 files changed, 21 insertions(+), 4 deletions(-)

diff --git a/fs/unionfs/commonfops.c b/fs/unionfs/commonfops.c
index 64bd0bd..612207a 100644
--- a/fs/unionfs/commonfops.c
+++ b/fs/unionfs/commonfops.c
@@ -333,6 +333,9 @@ int unionfs_file_revalidate(struct file *file, int willwrite)
*/
if (!d_deleted(dentry) &&
(sbgen > fgen || dbstart(dentry) != fbstart(file))) {
+ /* save orig branch ID */
+ int orig_brid = UNIONFS_F(file)->saved_branch_ids[fbstart(file)];
+
/* First we throw out the existing files. */
cleanup_file(file);

@@ -359,22 +362,36 @@ int unionfs_file_revalidate(struct file *file, int willwrite)
if (err)
goto out;
} else {
+ int new_brid;
/* We only open the highest priority branch. */
err = open_highest_file(file, willwrite);
if (err)
goto out;
+ new_brid = UNIONFS_F(file)->
+ saved_branch_ids[fbstart(file)];
+ if (new_brid != orig_brid && sbgen > fgen) {
+ /*
+ * If we re-opened the file on a different
+ * branch than the original one, and this
+ * was due to a new branch inserted, then
+ * update the mnt counts of the old and new
+ * branches accordingly.
+ */
+ unionfs_mntget(dentry, bstart);
+ unionfs_mntput(sb->s_root,
+ branch_id_to_idx(sb, orig_brid));
+ }
}
atomic_set(&UNIONFS_F(file)->generation,
- atomic_read(&UNIONFS_I(dentry->d_inode)->
- generation));
+ atomic_read(&UNIONFS_I(dentry->d_inode)->generation));
}

/* Copyup on the first write to a file on a readonly branch. */
if (willwrite && IS_WRITE_FLAG(file->f_flags) &&
!IS_WRITE_FLAG(unionfs_lower_file(file)->f_flags) &&
is_robranch(dentry)) {
- printk(KERN_DEBUG "unionfs: Doing delayed copyup of a "
- "read-write file on a read-only branch.\n");
+ printk(KERN_DEBUG "unionfs: do delay copyup of \"%s\"\n",
+ dentry->d_name.name);
err = do_delayed_copyup(file);
}

--
1.5.2.2.238.g7cbf2f2

2007-09-03 02:34:49

by Josef 'Jeff' Sipek

[permalink] [raw]
Subject: [PATCH 23/32] Unionfs: mount-time option parsing fix

From: Erez Zadok <[email protected]>

Signed-off-by: Erez Zadok <[email protected]>
Signed-off-by: Josef 'Jeff' Sipek <[email protected]>
---
fs/unionfs/main.c | 9 ++++++---
1 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/fs/unionfs/main.c b/fs/unionfs/main.c
index bc5c105..ce08d96 100644
--- a/fs/unionfs/main.c
+++ b/fs/unionfs/main.c
@@ -337,8 +337,12 @@ static int parse_dirs_option(struct super_block *sb, struct unionfs_dentry_info
int perms;
char *mode = strchr(name, '=');

- if (!name || !*name)
+ if (!name)
continue;
+ if (!*name) { /* bad use of ':' (extra colons) */
+ err = -EINVAL;
+ goto out;
+ }

branches++;

@@ -404,10 +408,9 @@ static int parse_dirs_option(struct super_block *sb, struct unionfs_dentry_info
* branch-overlapping test.
*/
for (i = 0; i < branches; i++) {
+ dent1 = lower_root_info->lower_paths[i].dentry;
for (j = i + 1; j < branches; j++) {
- dent1 = lower_root_info->lower_paths[i].dentry;
dent2 = lower_root_info->lower_paths[j].dentry;
-
if (is_branch_overlap(dent1, dent2)) {
printk(KERN_WARNING "unionfs: branches %d and "
"%d overlap\n", i, j);
--
1.5.2.2.238.g7cbf2f2

2007-09-03 02:35:29

by Josef 'Jeff' Sipek

[permalink] [raw]
Subject: [PATCH 18/32] Unionfs: unionfs_ioctl bug fixes

From: Erez Zadok <[email protected]>

Properly update lower objects, and release lower mnts upon ioctl success or
failure.

Signed-off-by: Erez Zadok <[email protected]>
Signed-off-by: Josef 'Jeff' Sipek <[email protected]>
---
fs/unionfs/commonfops.c | 25 +++++++++++++++++++++++--
1 files changed, 23 insertions(+), 2 deletions(-)

diff --git a/fs/unionfs/commonfops.c b/fs/unionfs/commonfops.c
index 1050c49..eee68b8 100644
--- a/fs/unionfs/commonfops.c
+++ b/fs/unionfs/commonfops.c
@@ -684,12 +684,15 @@ static int unionfs_ioctl_queryfile(struct file *file, unsigned int cmd,
{
int err = 0;
fd_set branchlist;
-
int bstart = 0, bend = 0, bindex = 0;
+ int orig_bstart, orig_bend;
struct dentry *dentry, *lower_dentry;
+ struct vfsmount *mnt;

- dentry = file->f_dentry;
+ dentry = file->f_path.dentry;
unionfs_lock_dentry(dentry);
+ orig_bstart = dbstart(dentry);
+ orig_bend = dbend(dentry);
if ((err = unionfs_partial_lookup(dentry)))
goto out;
bstart = dbstart(dentry);
@@ -703,7 +706,25 @@ static int unionfs_ioctl_queryfile(struct file *file, unsigned int cmd,
continue;
if (lower_dentry->d_inode)
FD_SET(bindex, &branchlist);
+ /* purge any lower objects after partial_lookup */
+ if (bindex < orig_bstart || bindex > orig_bend) {
+ dput(lower_dentry);
+ unionfs_set_lower_dentry_idx(dentry, bindex, NULL);
+ iput(unionfs_lower_inode_idx(dentry->d_inode, bindex));
+ unionfs_set_lower_inode_idx(dentry->d_inode, bindex,
+ NULL);
+ mnt = unionfs_lower_mnt_idx(dentry, bindex);
+ if (!mnt)
+ continue;
+ unionfs_mntput(dentry, bindex);
+ unionfs_set_lower_mnt_idx(dentry, bindex, NULL);
+ }
}
+ /* restore original dentry's offsets */
+ set_dbstart(dentry, orig_bstart);
+ set_dbend(dentry, orig_bend);
+ ibstart(dentry->d_inode) = orig_bstart;
+ ibend(dentry->d_inode) = orig_bend;

err = copy_to_user((void __user *)arg, &branchlist, sizeof(fd_set));
if (err)
--
1.5.2.2.238.g7cbf2f2

2007-09-03 02:35:53

by Josef 'Jeff' Sipek

[permalink] [raw]
Subject: [PATCH 15/32] Unionfs: implement f/async

From: Erez Zadok <[email protected]>

Unionfs needs its own fsync and fasync instead of calling the generic
file_fsync, because it may have to sync multiple writable lower branches
(not just one). This also allows Unionfs to compile with CONFIG_BLOCK=n.

Signed-off-by: Erez Zadok <[email protected]>
Signed-off-by: Josef 'Jeff' Sipek <[email protected]>
---
fs/unionfs/dirfops.c | 2 +
fs/unionfs/file.c | 91 +++++++++++++++++++++++++++++++++++++++++++++++++-
fs/unionfs/union.h | 3 ++
3 files changed, 95 insertions(+), 1 deletions(-)

diff --git a/fs/unionfs/dirfops.c b/fs/unionfs/dirfops.c
index 8503411..0e93bd7 100644
--- a/fs/unionfs/dirfops.c
+++ b/fs/unionfs/dirfops.c
@@ -273,4 +273,6 @@ struct file_operations unionfs_dir_fops = {
.open = unionfs_open,
.release = unionfs_file_release,
.flush = unionfs_flush,
+ .fsync = unionfs_fsync,
+ .fasync = unionfs_fasync,
};
diff --git a/fs/unionfs/file.c b/fs/unionfs/file.c
index 47b63f3..0555b6c 100644
--- a/fs/unionfs/file.c
+++ b/fs/unionfs/file.c
@@ -135,6 +135,94 @@ out:
return err;
}

+int unionfs_fsync(struct file *file, struct dentry *dentry, int datasync)
+{
+ int bindex, bstart, bend;
+ struct file *lower_file;
+ struct dentry *lower_dentry;
+ struct inode *lower_inode, *inode;
+ int err = -EINVAL;
+
+ unionfs_read_lock(file->f_path.dentry->d_sb);
+ if ((err = unionfs_file_revalidate(file, 1)))
+ goto out;
+
+ bstart = fbstart(file);
+ bend = fbend(file);
+ if (bstart < 0 || bend < 0)
+ goto out;
+
+ inode = dentry->d_inode;
+ if (!inode) {
+ printk(KERN_ERR
+ "unionfs: null lower inode in unionfs_fsync\n");
+ goto out;
+ }
+ for (bindex = bstart; bindex <= bend; bindex++) {
+ lower_inode = unionfs_lower_inode_idx(inode, bindex);
+ if (!lower_inode || !lower_inode->i_fop->fsync)
+ continue;
+ lower_file = unionfs_lower_file_idx(file, bindex);
+ lower_dentry = unionfs_lower_dentry_idx(dentry, bindex);
+ mutex_lock(&lower_inode->i_mutex);
+ err = lower_inode->i_fop->fsync(lower_file,
+ lower_dentry,
+ datasync);
+ mutex_unlock(&lower_inode->i_mutex);
+ if (err)
+ goto out;
+ }
+
+ unionfs_copy_attr_times(inode);
+
+out:
+ unionfs_read_unlock(file->f_path.dentry->d_sb);
+ return err;
+}
+
+int unionfs_fasync(int fd, struct file *file, int flag)
+{
+ int bindex, bstart, bend;
+ struct file *lower_file;
+ struct dentry *dentry;
+ struct inode *lower_inode, *inode;
+ int err = 0;
+
+ unionfs_read_lock(file->f_path.dentry->d_sb);
+ if ((err = unionfs_file_revalidate(file, 1)))
+ goto out;
+
+ bstart = fbstart(file);
+ bend = fbend(file);
+ if (bstart < 0 || bend < 0)
+ goto out;
+
+ dentry = file->f_path.dentry;
+ inode = dentry->d_inode;
+ if (!inode) {
+ printk(KERN_ERR
+ "unionfs: null lower inode in unionfs_fasync\n");
+ goto out;
+ }
+ for (bindex = bstart; bindex <= bend; bindex++) {
+ lower_inode = unionfs_lower_inode_idx(inode, bindex);
+ if (!lower_inode || !lower_inode->i_fop->fasync)
+ continue;
+ lower_file = unionfs_lower_file_idx(file, bindex);
+ mutex_lock(&lower_inode->i_mutex);
+ err = lower_inode->i_fop->fasync(fd, lower_file, flag);
+ mutex_unlock(&lower_inode->i_mutex);
+ if (err)
+ goto out;
+ }
+
+ unionfs_copy_attr_times(inode);
+
+out:
+ unionfs_read_unlock(file->f_path.dentry->d_sb);
+ return err;
+}
+
struct file_operations unionfs_main_fops = {
.llseek = generic_file_llseek,
.read = unionfs_read,
@@ -147,6 +235,7 @@ struct file_operations unionfs_main_fops = {
.open = unionfs_open,
.flush = unionfs_flush,
.release = unionfs_file_release,
- .fsync = file_fsync,
+ .fsync = unionfs_fsync,
+ .fasync = unionfs_fasync,
.splice_read = generic_file_splice_read,
};
diff --git a/fs/unionfs/union.h b/fs/unionfs/union.h
index f8a9cd2..ec33155 100644
--- a/fs/unionfs/union.h
+++ b/fs/unionfs/union.h
@@ -312,6 +312,9 @@ extern int unionfs_file_release(struct inode *inode, struct file *file);
extern int unionfs_flush(struct file *file, fl_owner_t id);
extern long unionfs_ioctl(struct file *file, unsigned int cmd,
unsigned long arg);
+extern int unionfs_fsync(struct file *file, struct dentry *dentry,
+ int datasync);
+extern int unionfs_fasync(int fd, struct file *file, int flag);

/* Inode operations */
extern int unionfs_rename(struct inode *old_dir, struct dentry *old_dentry,
--
1.5.2.2.238.g7cbf2f2

2007-09-03 02:36:29

by Josef 'Jeff' Sipek

[permalink] [raw]
Subject: [PATCH 27/32] Unionfs: extended attributes fixes

From: Erez Zadok <[email protected]>

Signed-off-by: Erez Zadok <[email protected]>
Signed-off-by: Josef 'Jeff' Sipek <[email protected]>
---
fs/unionfs/copyup.c | 43 +++++++++++++++++++++++++++++++------------
fs/unionfs/union.h | 6 ++++--
fs/unionfs/xattr.c | 16 ++--------------
3 files changed, 37 insertions(+), 28 deletions(-)

diff --git a/fs/unionfs/copyup.c b/fs/unionfs/copyup.c
index 4c45790..36f751e 100644
--- a/fs/unionfs/copyup.c
+++ b/fs/unionfs/copyup.c
@@ -32,27 +32,39 @@ static int copyup_xattrs(struct dentry *old_lower_dentry,
ssize_t list_size = -1;
char *name_list = NULL;
char *attr_value = NULL;
- char *name_list_orig = NULL;
+ char *name_list_buf = NULL;

+ /* query the actual size of the xattr list */
list_size = vfs_listxattr(old_lower_dentry, NULL, 0);
-
if (list_size <= 0) {
err = list_size;
goto out;
}

+ /* allocate space for the actual list */
name_list = unionfs_xattr_alloc(list_size + 1, XATTR_LIST_MAX);
if (!name_list || IS_ERR(name_list)) {
err = PTR_ERR(name_list);
goto out;
}
+
+ name_list_buf = name_list; /* save for kfree at end */
+
+ /* now get the actual xattr list of the source file */
list_size = vfs_listxattr(old_lower_dentry, name_list, list_size);
+ if (list_size <= 0) {
+ err = list_size;
+ goto out;
+ }
+
+ /* allocate space to hold each xattr's value */
attr_value = unionfs_xattr_alloc(XATTR_SIZE_MAX, XATTR_SIZE_MAX);
if (!attr_value || IS_ERR(attr_value)) {
err = PTR_ERR(name_list);
goto out;
}
- name_list_orig = name_list;
+
+ /* in a loop, get and set each xattr from src to dst file */
while (*name_list) {
ssize_t size;

@@ -65,7 +77,6 @@ static int copyup_xattrs(struct dentry *old_lower_dentry,
err = size;
goto out;
}
-
if (size > XATTR_SIZE_MAX) {
err = -E2BIG;
goto out;
@@ -73,19 +84,27 @@ static int copyup_xattrs(struct dentry *old_lower_dentry,
/* Don't lock here since vfs_setxattr does it for us. */
err = vfs_setxattr(new_lower_dentry, name_list, attr_value,
size, 0);
-
+ /*
+ * Selinux depends on "security.*" xattrs, so to maintain
+ * the security of copied-up files, if Selinux is active,
+ * then we must copy these xattrs as well. So we need to
+ * temporarily get FOWNER privileges.
+ * XXX: move entire copyup code to SIOQ.
+ */
+ if (err == -EPERM && !capable(CAP_FOWNER)) {
+ cap_raise(current->cap_effective, CAP_FOWNER);
+ err = vfs_setxattr(new_lower_dentry, name_list,
+ attr_value, size, 0);
+ cap_lower(current->cap_effective, CAP_FOWNER);
+ }
if (err < 0)
goto out;
name_list += strlen(name_list) + 1;
}
out:
- name_list = name_list_orig;
-
- if (name_list)
- unionfs_xattr_free(name_list, list_size + 1);
- if (attr_value)
- unionfs_xattr_free(attr_value, XATTR_SIZE_MAX);
- /* It is no big deal if this fails, we just roll with the punches. */
+ unionfs_xattr_kfree(name_list_buf);
+ unionfs_xattr_kfree(attr_value);
+ /* Ignore if xattr isn't supported */
if (err == -ENOTSUPP || err == -EOPNOTSUPP)
err = 0;
return err;
diff --git a/fs/unionfs/union.h b/fs/unionfs/union.h
index 26d886e..d1232ac 100644
--- a/fs/unionfs/union.h
+++ b/fs/unionfs/union.h
@@ -341,8 +341,10 @@ extern struct dentry *unionfs_interpose(struct dentry *this_dentry,
#ifdef CONFIG_UNION_FS_XATTR
/* Extended attribute functions. */
extern void *unionfs_xattr_alloc(size_t size, size_t limit);
-extern void unionfs_xattr_free(void *ptr, size_t size);
-
+static inline void unionfs_xattr_kfree(const void *p)
+{
+ kfree(p);
+}
extern ssize_t unionfs_getxattr(struct dentry *dentry, const char *name,
void *value, size_t size);
extern int unionfs_removexattr(struct dentry *dentry, const char *name);
diff --git a/fs/unionfs/xattr.c b/fs/unionfs/xattr.c
index 46f3d4e..6ab92f3 100644
--- a/fs/unionfs/xattr.c
+++ b/fs/unionfs/xattr.c
@@ -28,25 +28,13 @@ void *unionfs_xattr_alloc(size_t size, size_t limit)

if (!size) /* size request, no buffer is needed */
return NULL;
- else if (size <= PAGE_SIZE)
- ptr = kmalloc(size, GFP_KERNEL);
- else
- ptr = vmalloc(size);
+
+ ptr = kmalloc(size, GFP_KERNEL);
if (!ptr)
return ERR_PTR(-ENOMEM);
return ptr;
}

-void unionfs_xattr_free(void *ptr, size_t size)
-{
- if (!size) /* size request, no buffer was needed */
- return;
- else if (size <= PAGE_SIZE)
- kfree(ptr);
- else
- vfree(ptr);
-}
-
/*
* BKL held by caller.
* dentry->d_inode->i_mutex locked
--
1.5.2.2.238.g7cbf2f2

2007-09-03 02:36:56

by Josef 'Jeff' Sipek

[permalink] [raw]
Subject: [PATCH 05/32] Unionfs: do not use fsstack_copy_attr_all

From: Erez Zadok <[email protected]>

Unionfs needs a special fan-out version of fsstack_copy_attr_all, which is
called unionfs_copy_attr_all.

Signed-off-by: Erez Zadok <[email protected]>
Signed-off-by: Josef 'Jeff' Sipek <[email protected]>
---
fs/unionfs/dentry.c | 12 +++++++++---
fs/unionfs/fanout.h | 50 +++++++++++++++++++++++++++++++++++++++++++++++++-
fs/unionfs/inode.c | 13 ++++++++-----
fs/unionfs/main.c | 2 +-
fs/unionfs/subr.c | 2 +-
fs/unionfs/union.h | 5 +++--
6 files changed, 71 insertions(+), 13 deletions(-)

diff --git a/fs/unionfs/dentry.c b/fs/unionfs/dentry.c
index f88a285..4a3c479 100644
--- a/fs/unionfs/dentry.c
+++ b/fs/unionfs/dentry.c
@@ -166,9 +166,15 @@ static int __unionfs_d_revalidate_one(struct dentry *dentry,
valid = 0;

if (valid) {
- fsstack_copy_attr_all(dentry->d_inode,
- unionfs_lower_inode(dentry->d_inode),
- unionfs_get_nlinks);
+ /*
+ * If we get here, and we copy the meta-data from the lower
+ * inode to our inode, then it is vital that we have already
+ * purged all unionfs-level file data. We do that in the
+ * caller (__unionfs_d_revalidate_chain) by calling
+ * purge_inode_data.
+ */
+ unionfs_copy_attr_all(dentry->d_inode,
+ unionfs_lower_inode(dentry->d_inode));
fsstack_copy_inode_size(dentry->d_inode,
unionfs_lower_inode(dentry->d_inode));
}
diff --git a/fs/unionfs/fanout.h b/fs/unionfs/fanout.h
index 4da34c6..5908bc7 100644
--- a/fs/unionfs/fanout.h
+++ b/fs/unionfs/fanout.h
@@ -305,4 +305,52 @@ static inline void verify_locked(struct dentry *d)
BUG_ON(!mutex_is_locked(&UNIONFS_D(d)->lock));
}

-#endif /* _FANOUT_H */
+/* copy a/m/ctime from the lower branch with the newest times */
+static inline void unionfs_copy_attr_times(struct inode *upper)
+{
+ int bindex;
+ struct inode *lower;
+
+ if (!upper)
+ return;
+ for (bindex=ibstart(upper); bindex <= ibend(upper); bindex++) {
+ lower = unionfs_lower_inode_idx(upper, bindex);
+ if (!lower)
+ continue; /* not all lower dir objects may exist */
+ if (timespec_compare(&upper->i_mtime, &lower->i_mtime) < 0)
+ upper->i_mtime = lower->i_mtime;
+ if (timespec_compare(&upper->i_ctime, &lower->i_ctime) < 0)
+ upper->i_ctime = lower->i_ctime;
+ if (timespec_compare(&upper->i_atime, &lower->i_atime) < 0)
+ upper->i_atime = lower->i_atime;
+ /* XXX: should we notify_change on our upper inode? */
+ }
+}
+
+/*
+ * A unionfs/fanout version of fsstack_copy_attr_all. Uses a
+ * unionfs_get_nlinks to properly calcluate the number of links to a file.
+ * Also, copies the max() of all a/m/ctimes for all lower inodes (which is
+ * important if the lower inode is a directory type)
+ */
+static inline void unionfs_copy_attr_all(struct inode *dest,
+ const struct inode *src)
+{
+ dest->i_mode = src->i_mode;
+ dest->i_uid = src->i_uid;
+ dest->i_gid = src->i_gid;
+ dest->i_rdev = src->i_rdev;
+
+ unionfs_copy_attr_times(dest);
+
+ dest->i_blkbits = src->i_blkbits;
+ dest->i_flags = src->i_flags;
+
+ /*
+ * Update the nlinks AFTER updating the above fields, because the
+ * get_links callback may depend on them.
+ */
+ dest->i_nlink = unionfs_get_nlinks(dest);
+}
+
+#endif /* not _FANOUT_H */
diff --git a/fs/unionfs/inode.c b/fs/unionfs/inode.c
index 59bb418..cbbde6f 100644
--- a/fs/unionfs/inode.c
+++ b/fs/unionfs/inode.c
@@ -369,12 +369,13 @@ check_link:
/* Its a hard link, so use the same inode */
new_dentry->d_inode = igrab(old_dentry->d_inode);
d_instantiate(new_dentry, new_dentry->d_inode);
- fsstack_copy_attr_all(dir, lower_new_dentry->d_parent->d_inode,
- unionfs_get_nlinks);
+ unionfs_copy_attr_all(dir, lower_new_dentry->d_parent->d_inode);
fsstack_copy_inode_size(dir, lower_new_dentry->d_parent->d_inode);

/* propagate number of hard-links */
old_dentry->d_inode->i_nlink = unionfs_get_nlinks(old_dentry->d_inode);
+ /* new dentry's ctime may have changed due to hard-link counts */
+ unionfs_copy_attr_times(new_dentry->d_inode);

out:
if (!new_dentry->d_inode)
@@ -1084,13 +1085,15 @@ static int unionfs_setattr(struct dentry *dentry, struct iattr *ia)
}

/* get the size from the first lower inode */
- lower_inode = unionfs_lower_inode(dentry->d_inode);
- fsstack_copy_attr_all(inode, lower_inode, unionfs_get_nlinks);
+ lower_inode = unionfs_lower_inode(inode);
+ unionfs_copy_attr_all(inode, lower_inode);
fsstack_copy_inode_size(inode, lower_inode);
-
+ /* if setattr succeeded, then parent dir may have changed */
+ unionfs_copy_attr_times(dentry->d_parent->d_inode);
out:
unionfs_unlock_dentry(dentry);
unionfs_read_unlock(dentry->d_sb);
+
return err;
}

diff --git a/fs/unionfs/main.c b/fs/unionfs/main.c
index 1ea7894..689a8fa 100644
--- a/fs/unionfs/main.c
+++ b/fs/unionfs/main.c
@@ -123,7 +123,7 @@ int unionfs_interpose(struct dentry *dentry, struct super_block *sb, int flag)
lower_inode->i_rdev);

/* all well, copy inode attributes */
- fsstack_copy_attr_all(inode, lower_inode, unionfs_get_nlinks);
+ unionfs_copy_attr_all(inode, lower_inode);
fsstack_copy_inode_size(inode, lower_inode);

skip:
diff --git a/fs/unionfs/subr.c b/fs/unionfs/subr.c
index 0c8b0aa..3cebc17 100644
--- a/fs/unionfs/subr.c
+++ b/fs/unionfs/subr.c
@@ -179,7 +179,7 @@ out:
* returns the sum of the n_link values of all the underlying inodes of the
* passed inode
*/
-int unionfs_get_nlinks(struct inode *inode)
+int unionfs_get_nlinks(const struct inode *inode)
{
int sum_nlinks = 0;
int dirs = 0;
diff --git a/fs/unionfs/union.h b/fs/unionfs/union.h
index e4ee3c1..2a8f763 100644
--- a/fs/unionfs/union.h
+++ b/fs/unionfs/union.h
@@ -186,6 +186,9 @@ struct unionfs_dir_state {
struct list_head list[0];
};

+/* externs needed for fanout.h or sioq.h */
+extern int unionfs_get_nlinks(const struct inode *inode);
+
/* include miscellaneous macros */
#include "fanout.h"
#include "sioq.h"
@@ -282,8 +285,6 @@ extern int remove_whiteouts(struct dentry *dentry,
extern int do_delete_whiteouts(struct dentry *dentry, int bindex,
struct unionfs_dir_state *namelist);

-extern int unionfs_get_nlinks(struct inode *inode);
-
/* Is this directory empty: 0 if it is empty, -ENOTEMPTY if not. */
extern int check_empty(struct dentry *dentry,
struct unionfs_dir_state **namelist);
--
1.5.2.2.238.g7cbf2f2

2007-09-03 02:37:32

by Josef 'Jeff' Sipek

[permalink] [raw]
Subject: [PATCH 30/32] Unionfs: update unionfs version number

From: Erez Zadok <[email protected]>

Update version number from 2.0 to 2.1 to reflect the amount of work that had
gone in since 2.0 was first released, and also to sync up with Unionfs 2.x
releases for earlier kernels.

Signed-off-by: Erez Zadok <[email protected]>
Signed-off-by: Josef 'Jeff' Sipek <[email protected]>
---
include/linux/union_fs.h | 3 ++-
1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/include/linux/union_fs.h b/include/linux/union_fs.h
index 9bc4e3b..7f8dcc3 100644
--- a/include/linux/union_fs.h
+++ b/include/linux/union_fs.h
@@ -12,7 +12,8 @@
#ifndef _LINUX_UNION_FS_H
#define _LINUX_UNION_FS_H

-#define UNIONFS_VERSION "2.0"
+#define UNIONFS_VERSION "2.1-mm"
+
/*
* DEFINITIONS FOR USER AND KERNEL CODE:
*/
--
1.5.2.2.238.g7cbf2f2

2007-09-03 02:37:54

by Josef 'Jeff' Sipek

[permalink] [raw]
Subject: [PATCH 19/32] Unionfs: partial_lookup update

From: Erez Zadok <[email protected]>

Handle new semantics of lookup_backend.

Signed-off-by: Erez Zadok <[email protected]>
Signed-off-by: Josef 'Jeff' Sipek <[email protected]>
---
fs/unionfs/lookup.c | 22 ++++++++++++++++------
1 files changed, 16 insertions(+), 6 deletions(-)

diff --git a/fs/unionfs/lookup.c b/fs/unionfs/lookup.c
index e4e8470..d05daa5 100644
--- a/fs/unionfs/lookup.c
+++ b/fs/unionfs/lookup.c
@@ -425,20 +425,30 @@ out:
return ERR_PTR(err);
}

-/* This is a utility function that fills in a unionfs dentry */
+/*
+ * This is a utility function that fills in a unionfs dentry.
+ *
+ * Returns: 0 (ok), or -ERRNO if an error occurred.
+ */
int unionfs_partial_lookup(struct dentry *dentry)
{
struct dentry *tmp;
struct nameidata nd = { .flags = 0 };
+ int err = -ENOSYS;

tmp = unionfs_lookup_backend(dentry, &nd, INTERPOSE_PARTIAL);
- if (!tmp)
- return 0;
- if (IS_ERR(tmp))
- return PTR_ERR(tmp);
+ if (!tmp) {
+ err = 0;
+ goto out;
+ }
+ if (IS_ERR(tmp)) {
+ err = PTR_ERR(tmp);
+ goto out;
+ }
/* need to change the interface */
BUG_ON(tmp != dentry);
- return -ENOSYS;
+out:
+ return err;
}

/* The dentry cache is just so we have properly sized dentries. */
--
1.5.2.2.238.g7cbf2f2

2007-09-03 02:38:27

by Josef 'Jeff' Sipek

[permalink] [raw]
Subject: [PATCH 10/32] Unionfs: cache-coherency - file flush

From: Erez Zadok <[email protected]>

Update our inode's time after flush.

Signed-off-by: Erez Zadok <[email protected]>
Signed-off-by: Josef 'Jeff' Sipek <[email protected]>
---
fs/unionfs/commonfops.c | 5 +++++
1 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/fs/unionfs/commonfops.c b/fs/unionfs/commonfops.c
index 0dc7492..edb52c0 100644
--- a/fs/unionfs/commonfops.c
+++ b/fs/unionfs/commonfops.c
@@ -742,6 +742,11 @@ int unionfs_flush(struct file *file, fl_owner_t id)

}

+ /* on success, update our times */
+ unionfs_copy_attr_times(dentry->d_inode);
+ /* parent time could have changed too (async) */
+ unionfs_copy_attr_times(dentry->d_parent->d_inode);
+
out_lock:
unionfs_unlock_dentry(dentry);
out:
--
1.5.2.2.238.g7cbf2f2

2007-09-03 02:38:51

by Josef 'Jeff' Sipek

[permalink] [raw]
Subject: [PATCH 32/32] Unionfs: unionfs_create rewrite

The code was hard to follow and violated some invariants (e.g., never modify
a read only branch, and always create on branch 0).

Signed-off-by: Josef 'Jeff' Sipek <[email protected]>
---
fs/unionfs/inode.c | 207 +++++++++++++++-------------------------------------
1 files changed, 58 insertions(+), 149 deletions(-)

diff --git a/fs/unionfs/inode.c b/fs/unionfs/inode.c
index 9adf272..08c1ae0 100644
--- a/fs/unionfs/inode.c
+++ b/fs/unionfs/inode.c
@@ -24,9 +24,7 @@ static int unionfs_create(struct inode *parent, struct dentry *dentry,
int err = 0;
struct dentry *lower_dentry = NULL;
struct dentry *wh_dentry = NULL;
- struct dentry *new_lower_dentry;
struct dentry *lower_parent_dentry = NULL;
- int bindex = 0, bstart;
char *name = NULL;
int valid = 0;

@@ -47,177 +45,88 @@ static int unionfs_create(struct inode *parent, struct dentry *dentry,
*/
BUG_ON(!valid && dentry->d_inode);

- /* We start out in the leftmost branch. */
- bstart = dbstart(dentry);
- lower_dentry = unionfs_lower_dentry(dentry);
-
/*
- * check if whiteout exists in this branch, i.e. lookup .wh.foo
- * first.
+ * We shouldn't create things in a read-only branch; this check is a
+ * bit redundant as we don't allow branch 0 to be read-only at the
+ * moment
*/
- name = alloc_whname(dentry->d_name.name, dentry->d_name.len);
- if (IS_ERR(name)) {
- err = PTR_ERR(name);
- goto out;
- }
-
- wh_dentry = lookup_one_len(name, lower_dentry->d_parent,
- dentry->d_name.len + UNIONFS_WHLEN);
- if (IS_ERR(wh_dentry)) {
- err = PTR_ERR(wh_dentry);
- wh_dentry = NULL;
+ if ((err = is_robranch_super(dentry->d_sb, 0))) {
+ err = -EROFS;
goto out;
}

- if (wh_dentry->d_inode) {
+ /*
+ * We _always_ create on branch 0
+ */
+ lower_dentry = unionfs_lower_dentry_idx(dentry, 0);
+ if (lower_dentry) {
/*
- * .wh.foo has been found.
- * First truncate it and then rename it to foo (hence having
- * the same overall effect as a normal create.
+ * check if whiteout exists in this branch, i.e. lookup .wh.foo
+ * first.
*/
- struct dentry *lower_dir_dentry;
- struct iattr newattrs;
-
- mutex_lock(&wh_dentry->d_inode->i_mutex);
- newattrs.ia_valid = ATTR_CTIME | ATTR_MODE | ATTR_ATIME
- | ATTR_MTIME | ATTR_UID | ATTR_GID | ATTR_FORCE
- | ATTR_KILL_SUID | ATTR_KILL_SGID;
-
- newattrs.ia_mode = mode & ~current->fs->umask;
- newattrs.ia_uid = current->fsuid;
- newattrs.ia_gid = current->fsgid;
-
- if (wh_dentry->d_inode->i_size != 0) {
- newattrs.ia_valid |= ATTR_SIZE;
- newattrs.ia_size = 0;
- }
-
- err = notify_change(wh_dentry, &newattrs);
-
- mutex_unlock(&wh_dentry->d_inode->i_mutex);
-
- if (err)
- printk(KERN_WARNING "unionfs: %s:%d: notify_change "
- "failed: %d, ignoring..\n",
- __FILE__, __LINE__, err);
-
- new_lower_dentry = unionfs_lower_dentry(dentry);
- dget(new_lower_dentry);
-
- lower_dir_dentry = dget_parent(wh_dentry);
- lock_rename(lower_dir_dentry, lower_dir_dentry);
-
- if (!(err = is_robranch_super(dentry->d_sb, bstart))) {
- err = vfs_rename(lower_dir_dentry->d_inode,
- wh_dentry,
- lower_dir_dentry->d_inode,
- new_lower_dentry);
- }
- if (!err) {
- fsstack_copy_attr_times(parent,
- new_lower_dentry->d_parent->
- d_inode);
- fsstack_copy_inode_size(parent,
- new_lower_dentry->d_parent->
- d_inode);
- parent->i_nlink = unionfs_get_nlinks(parent);
+ name = alloc_whname(dentry->d_name.name, dentry->d_name.len);
+ if (IS_ERR(name)) {
+ err = PTR_ERR(name);
+ goto out;
}

- unlock_rename(lower_dir_dentry, lower_dir_dentry);
- dput(lower_dir_dentry);
-
- dput(new_lower_dentry);
-
- if (err) {
- /* exit if the error returned was NOT -EROFS */
- if (!IS_COPYUP_ERR(err))
- goto out;
- /*
- * We were not able to create the file in this
- * branch, so, we try to create it in one branch to
- * left
- */
- bstart--;
- } else {
- /*
- * reset the unionfs dentry to point to the .wh.foo
- * entry.
- */
-
- /* Discard any old reference. */
- dput(unionfs_lower_dentry(dentry));
-
- /* Trade one reference to another. */
- unionfs_set_lower_dentry_idx(dentry, bstart,
- wh_dentry);
+ wh_dentry = lookup_one_len(name, lower_dentry->d_parent,
+ dentry->d_name.len + UNIONFS_WHLEN);
+ if (IS_ERR(wh_dentry)) {
+ err = PTR_ERR(wh_dentry);
wh_dentry = NULL;
-
- /*
- * Only INTERPOSE_LOOKUP can return a value other
- * than 0 on err.
- */
- err = PTR_ERR(unionfs_interpose(dentry,
- parent->i_sb, 0));
goto out;
}
- }

- for (bindex = bstart; bindex >= 0; bindex--) {
- lower_dentry = unionfs_lower_dentry_idx(dentry, bindex);
- if (!lower_dentry) {
+ if (wh_dentry->d_inode) {
/*
- * if lower_dentry is NULL, create the entire
- * dentry directory structure in branch 'bindex'.
- * lower_dentry will NOT be null when bindex == bstart
- * because lookup passed as a negative unionfs dentry
- * pointing to a lone negative underlying dentry.
+ * .wh.foo has been found, so let's unlink it
*/
- lower_dentry = create_parents(parent, dentry,
- dentry->d_name.name,
- bindex);
- if (!lower_dentry || IS_ERR(lower_dentry)) {
- if (IS_ERR(lower_dentry))
- err = PTR_ERR(lower_dentry);
- continue;
+ struct dentry *lower_dir_dentry;
+
+ lower_dir_dentry = lock_parent(wh_dentry);
+ err = vfs_unlink(lower_dir_dentry->d_inode, wh_dentry);
+ unlock_dir(lower_dir_dentry);
+
+ if (err) {
+ printk("unionfs_create: could not unlink "
+ "whiteout, err = %d\n", err);
+ goto out;
}
}
-
- lower_parent_dentry = lock_parent(lower_dentry);
- if (IS_ERR(lower_parent_dentry)) {
- err = PTR_ERR(lower_parent_dentry);
+ } else {
+ /*
+ * if lower_dentry is NULL, create the entire
+ * dentry directory structure in branch 0.
+ */
+ lower_dentry = create_parents(parent, dentry, dentry->d_name.name, 0);
+ if (IS_ERR(lower_dentry)) {
+ err = PTR_ERR(lower_dentry);
goto out;
}
- /* We shouldn't create things in a read-only branch. */
- if (!(err = is_robranch_super(dentry->d_sb, bindex)))
- err = vfs_create(lower_parent_dentry->d_inode,
- lower_dentry, mode, nd);
+ }

- if (err || !lower_dentry->d_inode) {
- unlock_dir(lower_parent_dentry);
+ lower_parent_dentry = lock_parent(lower_dentry);
+ if (IS_ERR(lower_parent_dentry)) {
+ err = PTR_ERR(lower_parent_dentry);
+ goto out;
+ }

- /* break out of for loop if the error wasn't -EROFS */
- if (!IS_COPYUP_ERR(err))
- break;
- } else {
- /*
- * Only INTERPOSE_LOOKUP can return a value other
- * than 0 on err.
- */
- err = PTR_ERR(unionfs_interpose(dentry,
- parent->i_sb, 0));
- if (!err) {
- unionfs_copy_attr_times(parent);
- fsstack_copy_inode_size(parent,
- lower_parent_dentry->
- d_inode);
- /* update no. of links on parent directory */
- parent->i_nlink = unionfs_get_nlinks(parent);
- }
- unlock_dir(lower_parent_dentry);
- break;
+ err = vfs_create(lower_parent_dentry->d_inode, lower_dentry, mode, nd);
+
+ if (!err) {
+ err = PTR_ERR(unionfs_interpose(dentry, parent->i_sb, 0));
+ if (!err) {
+ unionfs_copy_attr_times(parent);
+ fsstack_copy_inode_size(parent,
+ lower_parent_dentry->d_inode);
+ /* update no. of links on parent directory */
+ parent->i_nlink = unionfs_get_nlinks(parent);
}
}

+ unlock_dir(lower_parent_dentry);
+
out:
dput(wh_dentry);
kfree(name);
--
1.5.2.2.238.g7cbf2f2

2007-09-03 02:39:21

by Josef 'Jeff' Sipek

[permalink] [raw]
Subject: [PATCH 03/32] VFS/fsstack: cpp endif comments

From: Erez Zadok <[email protected]>

Add comments to #endif's to help clarify code.

Signed-off-by: Erez Zadok <[email protected]>
Signed-off-by: Josef 'Jeff' Sipek <[email protected]>
---
include/linux/fs_stack.h | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/include/linux/fs_stack.h b/include/linux/fs_stack.h
index 6b52faf..28543ad 100644
--- a/include/linux/fs_stack.h
+++ b/include/linux/fs_stack.h
@@ -39,4 +39,4 @@ static inline void fsstack_copy_attr_times(struct inode *dest,
dest->i_ctime = src->i_ctime;
}

-#endif /* _LINUX_FS_STACK_H */
+#endif /* not _LINUX_FS_STACK_H */
--
1.5.2.2.238.g7cbf2f2

2007-09-03 02:39:47

by Josef 'Jeff' Sipek

[permalink] [raw]
Subject: [PATCH 12/32] Unionfs: documentation updates

From: Erez Zadok <[email protected]>

Details of cache-coherency implementation (as per OLS'07 talk).
Also explain new incgen support via remount, not ioctl.

Signed-off-by: Erez Zadok <[email protected]>
Signed-off-by: Josef 'Jeff' Sipek <[email protected]>
---
Documentation/filesystems/unionfs/concepts.txt | 106 ++++++++++++++++++++++++
Documentation/filesystems/unionfs/issues.txt | 53 ++++--------
Documentation/filesystems/unionfs/usage.txt | 11 ++-
3 files changed, 134 insertions(+), 36 deletions(-)

diff --git a/Documentation/filesystems/unionfs/concepts.txt b/Documentation/filesystems/unionfs/concepts.txt
index 83d45b9..74b5bdc 100644
--- a/Documentation/filesystems/unionfs/concepts.txt
+++ b/Documentation/filesystems/unionfs/concepts.txt
@@ -4,6 +4,7 @@ Unionfs 2.0 CONCEPTS:
This file describes the concepts needed by a namespace unification file
system.

+
Branch Priority:
================

@@ -72,4 +73,109 @@ that lookup and readdir return this newer "version" of the file rather than
the original (see duplicate elimination).


+Cache Coherency:
+================
+
+Unionfs users often want to be able to modify files and directories directly
+on the lower branches, and have those changes be visible at the Unionfs
+level. This means that data (e.g., pages) and meta-data (dentries, inodes,
+open files, etc.) have to be synchronized between the upper and lower
+layers. In other words, the newest changes from a layer below have to be
+propagated to the Unionfs layer above. If the two layers are not in sync, a
+cache incoherency ensues, which could lead to application failures and even
+oopses. The Linux kernel, however, has a rather limited set of mechanisms
+to ensure this inter-layer cache coherency---so Unionfs has to do most of
+the hard work on its own.
+
+Maintaining Invariants:
+
+The way Unionfs ensures cache coherency is as follows. At each entry point
+to a Unionfs file system method, we call a utility function to validate the
+primary objects of this method. Generally, we call unionfs_file_revalidate
+on open files, and __unionfs_d_revalidate_chain on dentries (which also
+validates inodes). These utility functions check to see whether the upper
+Unionfs object is in sync with any of the lower objects that it represents.
+The checks we perform include whether the Unionfs superblock has a newer
+generation number, or if any of the lower objects mtime's or ctime's are
+newer. (Note: generation numbers change when branch-management commands are
+issued, so in a way, maintaining cache coherency is also very important for
+branch-management.) If indeed we determine that any Unionfs object is no
+longer in sync with its lower counterparts, then we rebuild that object
+similarly to how we do so for branch-management.
+
+While rebuilding Unionfs's objects, we also purge any page mappings and
+truncate inode pages (see fs/Unionfs/dentry.c:purge_inode_data). This is to
+ensure that Unionfs will re-get the newer data from the lower branches. We
+perform this purging only if the Unionfs operation in question is a reading
+operation; if Unionfs is performing a data writing operation (e.g., ->write,
+->commit_write, etc.) then we do NOT flush the lower mappings/pages: this is
+because (1) a self-deadlock could occur and (2) the upper Unionfs pages are
+considered more authoritative anyway, as they are newer and will overwrite
+any lower pages.
+
+Unionfs maintains the following important invariant regarding mtime's,
+ctime's, and atime's: the upper inode object's times are the max() of all of
+the lower ones. For non-directory objects, there's only one object below,
+so the mapping is simple; for directory objects, there could me multiple
+lower objects and we have to sync up with the newest one of all the lower
+ones. This invariant is important to maintain, especially for directories
+(besides, we need this to be POSIX compliant). A union could comprise
+multiple writable branches, each of which could change. If we don't reflect
+the newest possible mtime/ctime, some applications could fail. For example,
+NFSv2/v3 exports check for newer directory mtimes on the server to determine
+if the client-side attribute cache should be purged.
+
+To maintain these important invariants, of course, Unionfs carefully
+synchronizes upper and lower times in various places. For example, if we
+copy-up a file to a top-level branch, the parent directory where the file
+was copied up to will now have a new mtime: so after a successful copy-up,
+we sync up with the new top-level branch's parent directory mtime.
+
+Implementation:
+
+This cache-coherency implementation is efficient because it defers any
+synchronizing between the upper and lower layers until absolutely needed.
+Consider the example a common situation where users perform a lot of lower
+changes, such as untarring a whole package. While these take place,
+typically the user doesn't access the files via Unionfs; only after the
+lower changes are done, does the user try to access the lower files. With
+our cache-coherency implementation, the entirety of the changes to the lower
+branches will not result in a single CPU cycle spent at the Unionfs level
+until the user invokes a system call that goes through Unionfs.
+
+We have considered two alternate cache-coherency designs. (1) Using the
+dentry/inode notify functionality to register interest in finding out about
+any lower changes. This is a somewhat limited and also a heavy-handed
+approach which could result in many notifications to the Unionfs layer upon
+each small change at the lower layer (imagine a file being modified multiple
+times in rapid succession). (2) Rewriting the VFS to support explicit
+callbacks from lower objects to upper objects. We began exploring such an
+implementation, but found it to be very complicated--it would have resulted
+in massive VFS/MM changes which are unlikely to be accepted by the LKML
+community. We therefore believe that our current cache-coherency design and
+implementation represent the best approach at this time.
+
+Limitations:
+
+Our implementation works in that as long as a user process will have caused
+Unionfs to be called, directly or indirectly, even to just do
+->d_revalidate; then we will have purged the current Unionfs data and the
+process will see the new data. For example, a process that continually
+re-reads the same file's data will see the NEW data as soon as the lower
+file had changed, upon the next read(2) syscall (even if the file is still
+open!) However, this doesn't work when the process re-reads the open file's
+data via mmap(2) (unless the user unmaps/closes the file and remaps/reopens
+it). Once we respond to ->readpage(s), then the kernel maps the page into
+the process's address space and there doesn't appear to be a way to force
+the kernel to invalidate those pages/mappings, and force the process to
+re-issue ->readpage. If there's a way to invalidate active mappings and
+force a ->readpage, let us know please (invalidate_inode_pages2 doesn't do
+the trick).
+
+Our current Unionfs code has to perform many file-revalidation calls. It
+would be really nice if the VFS would export an optional file system hook
+->file_revalidate (similarly to dentry->d_revalidate) that will be called
+before each VFS op that has a "struct file" in it.
+
+
For more information, see <http://unionfs.filesystems.org/>.
diff --git a/Documentation/filesystems/unionfs/issues.txt b/Documentation/filesystems/unionfs/issues.txt
index c634604..c3a0aef 100644
--- a/Documentation/filesystems/unionfs/issues.txt
+++ b/Documentation/filesystems/unionfs/issues.txt
@@ -1,39 +1,24 @@
-KNOWN Unionfs 2.0 ISSUES:
+KNOWN Unionfs 2.1 ISSUES:
=========================

-1. The NFS server returns -EACCES for read-only exports, instead of -EROFS.
- This means we can't reliably detect a read-only NFS export.
-
-2. Modifying a Unionfs branch directly, while the union is mounted, is
- currently unsupported, because it could cause a cache incoherency between
- the union layer and the lower file systems (for that reason, Unionfs
- currently prohibits using branches which overlap with each other, even
- partially). We have tested Unionfs under such conditions, and fixed any
- bugs we found (Unionfs comes with an extensive regression test suite).
- However, it may still be possible that changes made to lower branches
- directly could cause cache incoherency which, in the worst case, may case
- an oops.
-
- Unionfs 2.0 has a temporary workaround for this. You can force Unionfs
- to increase the superblock generation number, and hence purge all cached
- Unionfs objects, which would then be re-gotten from the lower branches.
- This should ensure cache consistency. To increase the generation number,
- executed the command:
-
- mount -t unionfs -o remount,incgen none MOUNTPOINT
-
- Note that the older way of incrementing the generation number using an
- ioctl, is no longer supported in Unionfs 2.0. Ioctls in general are not
- encouraged. Plus, an ioctl is per-file concept, whereas the generation
- number is a per-file-system concept. Worse, such an ioctl requires an
- open file, which then has to be invalidated by the very nature of the
- generation number increase (read: the old generation increase ioctl was
- pretty racy).
-
-3. Unionfs should not use lookup_one_len() on the underlying f/s as it
- confuses NFS. Currently, unionfs_lookup() passes lookup intents to the
+1. Unionfs should not use lookup_one_len() on the underlying f/s as it
+ confuses NFSv4. Currently, unionfs_lookup() passes lookup intents to the
lower file-system, this eliminates part of the problem. The remaining
- calls to lookup_one_len may need to be changed to pass an intent.
-
+ calls to lookup_one_len may need to be changed to pass an intent. We are
+ currently introducing VFS changes to fs/namei.c's do_path_lookup() to
+ allow proper file lookup and opening in stackable file systems.
+
+2. Lockdep (a debugging feature) isn't aware of stacking, and so it
+ incorrectly complains about locking problems. The problem boils down to
+ this: Lockdep considers all objects of a certain type to be in the same
+ class, for example, all inodes. Lockdep doesn't like to see a lock held
+ on two inodes within the same task, and warns that it could lead to a
+ deadlock. However, stackable file systems do precisely that: they lock
+ an upper object, and then a lower object, in a strict order to avoid
+ locking problems; in addition, Unionfs, as a fan-out file system, may
+ have to lock several lower inodes. We are currently looking into Lockdep
+ to see how to make it aware of stackable file systems. In the mean time,
+ if you get any warnings from Lockdep, you can safely ignore them (or feel
+ free to report them to the Unionfs maintainers, just to be sure).

For more information, see <http://unionfs.filesystems.org/>.
diff --git a/Documentation/filesystems/unionfs/usage.txt b/Documentation/filesystems/unionfs/usage.txt
index 1c7554b..2316670 100644
--- a/Documentation/filesystems/unionfs/usage.txt
+++ b/Documentation/filesystems/unionfs/usage.txt
@@ -44,7 +44,7 @@ To upgrade a union from read-only to read-write:

To delete a branch /foo, regardless where it is in the current union:

-# mount -t unionfs -o del=/foo none MOUNTPOINT
+# mount -t unionfs -o remount,del=/foo none MOUNTPOINT

To insert (add) a branch /foo before /bar:

@@ -67,7 +67,7 @@ To append a branch to the very end (new lowest-priority branch):
To append a branch to the very end (new lowest-priority branch), in
read-only mode:

-# mount -t unionfs -o remount,add=:/foo:ro none MOUNTPOINT
+# mount -t unionfs -o remount,add=:/foo=ro none MOUNTPOINT

Finally, to change the mode of one existing branch, say /foo, from read-only
to read-write, and change /bar from read-write to read-only:
@@ -86,5 +86,12 @@ command:

# mount -t unionfs -o remount,incgen none MOUNTPOINT

+Note that the older way of incrementing the generation number using an
+ioctl, is no longer supported in Unionfs 2.0. Ioctls in general are not
+encouraged. Plus, an ioctl is per-file concept, whereas the generation
+number is a per-file-system concept. Worse, such an ioctl requires an open
+file, which then has to be invalidated by the very nature of the generation
+number increase (read: the old generation increase ioctl was pretty racy).
+

For more information, see <http://unionfs.filesystems.org/>.
--
1.5.2.2.238.g7cbf2f2

2007-09-03 02:40:27

by Josef 'Jeff' Sipek

[permalink] [raw]
Subject: [PATCH 17/32] Unionfs: interpose updates

From: Erez Zadok <[email protected]>

Update unionfs_interpose to handle spliced dentries, which is important for
NFS exporting.

Signed-off-by: Erez Zadok <[email protected]>
Signed-off-by: Josef 'Jeff' Sipek <[email protected]>
---
fs/unionfs/inode.c | 40 +++++++++++----
fs/unionfs/lookup.c | 22 +++++++-
fs/unionfs/main.c | 138 +++++++++++++++++++++++++++++++++------------------
fs/unionfs/union.h | 4 +-
4 files changed, 141 insertions(+), 63 deletions(-)

diff --git a/fs/unionfs/inode.c b/fs/unionfs/inode.c
index 6cec564..a219a40 100644
--- a/fs/unionfs/inode.c
+++ b/fs/unionfs/inode.c
@@ -152,7 +152,12 @@ static int unionfs_create(struct inode *parent, struct dentry *dentry,
wh_dentry);
wh_dentry = NULL;

- err = unionfs_interpose(dentry, parent->i_sb, 0);
+ /*
+ * Only INTERPOSE_LOOKUP can return a value other
+ * than 0 on err.
+ */
+ err = PTR_ERR(unionfs_interpose(dentry,
+ parent->i_sb, 0));
goto out;
}
}
@@ -194,11 +199,14 @@ static int unionfs_create(struct inode *parent, struct dentry *dentry,
if (!IS_COPYUP_ERR(err))
break;
} else {
- err = unionfs_interpose(dentry, parent->i_sb, 0);
+ /*
+ * Only INTERPOSE_LOOKUP can return a value other
+ * than 0 on err.
+ */
+ err = PTR_ERR(unionfs_interpose(dentry,
+ parent->i_sb, 0));
if (!err) {
- fsstack_copy_attr_times(parent,
- lower_parent_dentry->
- d_inode);
+ unionfs_copy_attr_times(parent);
fsstack_copy_inode_size(parent,
lower_parent_dentry->
d_inode);
@@ -527,7 +535,12 @@ static int unionfs_symlink(struct inode *dir, struct dentry *dentry,
if (!IS_COPYUP_ERR(err))
break;
} else {
- err = unionfs_interpose(dentry, dir->i_sb, 0);
+ /*
+ * Only INTERPOSE_LOOKUP can return a value other
+ * than 0 on err.
+ */
+ err = PTR_ERR(unionfs_interpose(dentry,
+ dir->i_sb, 0));
if (!err) {
fsstack_copy_attr_times(dir,
lower_dir_dentry->
@@ -664,10 +677,13 @@ static int unionfs_mkdir(struct inode *parent, struct dentry *dentry, int mode)
}
set_dbend(dentry, bindex);

- err = unionfs_interpose(dentry, parent->i_sb, 0);
+ /*
+ * Only INTERPOSE_LOOKUP can return a value other than 0 on
+ * err.
+ */
+ err = PTR_ERR(unionfs_interpose(dentry, parent->i_sb, 0));
if (!err) {
- fsstack_copy_attr_times(parent,
- lower_parent_dentry->d_inode);
+ unionfs_copy_attr_times(parent);
fsstack_copy_inode_size(parent,
lower_parent_dentry->d_inode);

@@ -795,7 +811,11 @@ static int unionfs_mknod(struct inode *dir, struct dentry *dentry, int mode,
break;
}

- err = unionfs_interpose(dentry, dir->i_sb, 0);
+ /*
+ * Only INTERPOSE_LOOKUP can return a value other than 0 on
+ * err.
+ */
+ err = PTR_ERR(unionfs_interpose(dentry, dir->i_sb, 0));
if (!err) {
fsstack_copy_attr_times(dir,
lower_parent_dentry->d_inode);
diff --git a/fs/unionfs/lookup.c b/fs/unionfs/lookup.c
index 61ee50d..e4e8470 100644
--- a/fs/unionfs/lookup.c
+++ b/fs/unionfs/lookup.c
@@ -72,7 +72,12 @@ out:
return err;
}

-/* main (and complex) driver function for Unionfs's lookup */
+/*
+ * Main (and complex) driver function for Unionfs's lookup
+ *
+ * Returns: NULL (ok), ERR_PTR if an error occurred, or a non-null non-error
+ * PTR if d_splice returned a different dentry.
+ */
struct dentry *unionfs_lookup_backend(struct dentry *dentry,
struct nameidata *nd, int lookupmode)
{
@@ -81,6 +86,7 @@ struct dentry *unionfs_lookup_backend(struct dentry *dentry,
struct dentry *wh_lower_dentry = NULL;
struct dentry *lower_dir_dentry = NULL;
struct dentry *parent_dentry = NULL;
+ struct dentry *d_interposed = NULL;
int bindex, bstart, bend, bopaque;
int dentry_count = 0; /* Number of positive dentries. */
int first_dentry_offset = -1; /* -1 is uninitialized */
@@ -90,7 +96,6 @@ struct dentry *unionfs_lookup_backend(struct dentry *dentry,
int locked_parent = 0;
int locked_child = 0;
int allocated_new_info = 0;
-
int opaque;
char *whname = NULL;
const char *name;
@@ -370,7 +375,16 @@ out_positive:
bend = dbend(dentry);
}

- err = unionfs_interpose(dentry, dentry->d_sb, lookupmode);
+ /*
+ * Interpose can return a dentry if d_splice returned a different
+ * dentry.
+ */
+ d_interposed = unionfs_interpose(dentry, dentry->d_sb, lookupmode);
+ if (IS_ERR(d_interposed))
+ err = PTR_ERR(d_interposed);
+ else if (d_interposed)
+ dentry = d_interposed;
+
if (err)
goto out_drop;

@@ -406,6 +420,8 @@ out:
dput(parent_dentry);
if (locked_child || (err && allocated_new_info))
unionfs_unlock_dentry(dentry);
+ if (!err && d_interposed)
+ return d_interposed;
return ERR_PTR(err);
}

diff --git a/fs/unionfs/main.c b/fs/unionfs/main.c
index 689a8fa..bc5c105 100644
--- a/fs/unionfs/main.c
+++ b/fs/unionfs/main.c
@@ -20,20 +20,73 @@
#include <linux/module.h>
#include <linux/moduleparam.h>

+static void unionfs_fill_inode(struct dentry *dentry,
+ struct inode *inode)
+{
+ struct inode *lower_inode;
+ struct dentry *lower_dentry;
+ int bindex, bstart, bend;
+
+ bstart = dbstart(dentry);
+ bend = dbend(dentry);
+
+ for (bindex = bstart; bindex <= bend; bindex++) {
+ lower_dentry = unionfs_lower_dentry_idx(dentry, bindex);
+ if (!lower_dentry) {
+ unionfs_set_lower_inode_idx(inode, bindex, NULL);
+ continue;
+ }
+
+ /* Initialize the lower inode to the new lower inode. */
+ if (!lower_dentry->d_inode)
+ continue;
+
+ unionfs_set_lower_inode_idx(inode, bindex,
+ igrab(lower_dentry->d_inode));
+ }
+
+ ibstart(inode) = dbstart(dentry);
+ ibend(inode) = dbend(dentry);
+
+ /* Use attributes from the first branch. */
+ lower_inode = unionfs_lower_inode(inode);
+
+ /* Use different set of inode ops for symlinks & directories */
+ if (S_ISLNK(lower_inode->i_mode))
+ inode->i_op = &unionfs_symlink_iops;
+ else if (S_ISDIR(lower_inode->i_mode))
+ inode->i_op = &unionfs_dir_iops;
+
+ /* Use different set of file ops for directories */
+ if (S_ISDIR(lower_inode->i_mode))
+ inode->i_fop = &unionfs_dir_fops;
+
+ /* properly initialize special inodes */
+ if (S_ISBLK(lower_inode->i_mode) || S_ISCHR(lower_inode->i_mode) ||
+ S_ISFIFO(lower_inode->i_mode) || S_ISSOCK(lower_inode->i_mode))
+ init_special_inode(inode, lower_inode->i_mode,
+ lower_inode->i_rdev);
+
+ /* all well, copy inode attributes */
+ unionfs_copy_attr_all(inode, lower_inode);
+ fsstack_copy_inode_size(inode, lower_inode);
+}
+
/*
* Connect a unionfs inode dentry/inode with several lower ones. This is
* the classic stackable file system "vnode interposition" action.
*
* @sb: unionfs's super_block
*/
-int unionfs_interpose(struct dentry *dentry, struct super_block *sb, int flag)
+struct dentry *unionfs_interpose(struct dentry *dentry, struct super_block *sb,
+ int flag)
{
- struct inode *lower_inode;
- struct dentry *lower_dentry;
int err = 0;
struct inode *inode;
int is_negative_dentry = 1;
int bindex, bstart, bend;
+ int need_fill_inode = 1;
+ struct dentry *spliced = NULL;

verify_locked(dentry);

@@ -80,51 +133,12 @@ int unionfs_interpose(struct dentry *dentry, struct super_block *sb, int flag)
err = -EACCES;
goto out;
}
-
if (atomic_read(&inode->i_count) > 1)
goto skip;
}

- for (bindex = bstart; bindex <= bend; bindex++) {
- lower_dentry = unionfs_lower_dentry_idx(dentry, bindex);
- if (!lower_dentry) {
- unionfs_set_lower_inode_idx(inode, bindex, NULL);
- continue;
- }
-
- /* Initialize the lower inode to the new lower inode. */
- if (!lower_dentry->d_inode)
- continue;
-
- unionfs_set_lower_inode_idx(inode, bindex,
- igrab(lower_dentry->d_inode));
- }
-
- ibstart(inode) = dbstart(dentry);
- ibend(inode) = dbend(dentry);
-
- /* Use attributes from the first branch. */
- lower_inode = unionfs_lower_inode(inode);
-
- /* Use different set of inode ops for symlinks & directories */
- if (S_ISLNK(lower_inode->i_mode))
- inode->i_op = &unionfs_symlink_iops;
- else if (S_ISDIR(lower_inode->i_mode))
- inode->i_op = &unionfs_dir_iops;
-
- /* Use different set of file ops for directories */
- if (S_ISDIR(lower_inode->i_mode))
- inode->i_fop = &unionfs_dir_fops;
-
- /* properly initialize special inodes */
- if (S_ISBLK(lower_inode->i_mode) || S_ISCHR(lower_inode->i_mode) ||
- S_ISFIFO(lower_inode->i_mode) || S_ISSOCK(lower_inode->i_mode))
- init_special_inode(inode, lower_inode->i_mode,
- lower_inode->i_rdev);
-
- /* all well, copy inode attributes */
- unionfs_copy_attr_all(inode, lower_inode);
- fsstack_copy_inode_size(inode, lower_inode);
+ need_fill_inode = 0;
+ unionfs_fill_inode(dentry, inode);

skip:
/* only (our) lookup wants to do a d_add */
@@ -134,7 +148,28 @@ skip:
d_instantiate(dentry, inode);
break;
case INTERPOSE_LOOKUP:
- err = PTR_ERR(d_splice_alias(inode, dentry));
+ spliced = d_splice_alias(inode, dentry);
+ if (IS_ERR(spliced))
+ err = PTR_ERR(spliced);
+ else if (spliced && spliced != dentry) {
+ /*
+ * d_splice can return a dentry if it was
+ * disconnected and had to be moved. We must ensure
+ * that the private data of the new dentry is
+ * correct and that the inode info was filled
+ * properly. Finally we must return this new
+ * dentry.
+ */
+ spliced->d_op = &unionfs_dops;
+ spliced->d_fsdata = dentry->d_fsdata;
+ dentry->d_fsdata = NULL;
+ dentry = spliced;
+ if (need_fill_inode) {
+ need_fill_inode = 0;
+ unionfs_fill_inode(dentry, inode);
+ }
+ goto out_spliced;
+ }
break;
case INTERPOSE_REVAL:
/* Do nothing. */
@@ -143,9 +178,13 @@ skip:
printk(KERN_ERR "unionfs: invalid interpose flag passed!");
BUG();
}
+ goto out;

+out_spliced:
+ if (!err)
+ return spliced;
out:
- return err;
+ return ERR_PTR(err);
}

/* like interpose above, but for an already existing dentry */
@@ -623,8 +662,11 @@ static int unionfs_read_super(struct super_block *sb, void *raw_data,
/* Set the generation number to one, since this is for the mount. */
atomic_set(&UNIONFS_D(sb->s_root)->generation, 1);

- /* call interpose to create the upper level inode */
- err = unionfs_interpose(sb->s_root, sb, 0);
+ /*
+ * Call interpose to create the upper level inode. Only
+ * INTERPOSE_LOOKUP can return a value other than 0 on err.
+ */
+ err = PTR_ERR(unionfs_interpose(sb->s_root, sb, 0));
unionfs_unlock_dentry(sb->s_root);
if (!err)
goto out;
diff --git a/fs/unionfs/union.h b/fs/unionfs/union.h
index ec33155..0fa8ae0 100644
--- a/fs/unionfs/union.h
+++ b/fs/unionfs/union.h
@@ -333,8 +333,8 @@ extern int is_newer_lower(const struct dentry *dentry);
#define INTERPOSE_REVAL_NEG 3
#define INTERPOSE_PARTIAL 4

-extern int unionfs_interpose(struct dentry *this_dentry,
- struct super_block *sb, int flag);
+extern struct dentry *unionfs_interpose(struct dentry *this_dentry,
+ struct super_block *sb, int flag);

#ifdef CONFIG_UNION_FS_XATTR
/* Extended attribute functions. */
--
1.5.2.2.238.g7cbf2f2

2007-09-03 02:41:00

by Josef 'Jeff' Sipek

[permalink] [raw]
Subject: [PATCH 21/32] Unionfs: mmap fixes

From: Erez Zadok <[email protected]>

Most important fixes prevent deadlocks especially under low-memory
conditions, when one is not supposed to cause more memory pressure; also
handle AOP_WRITEPAGE_ACTIVATE from lower file systems.

Signed-off-by: Erez Zadok <[email protected]>
Signed-off-by: Josef 'Jeff' Sipek <[email protected]>
---
fs/unionfs/file.c | 6 +-
fs/unionfs/mmap.c | 132 ++++++++++++++++++++++++++++++++++++++---------------
2 files changed, 98 insertions(+), 40 deletions(-)

diff --git a/fs/unionfs/file.c b/fs/unionfs/file.c
index 0555b6c..b55da4f 100644
--- a/fs/unionfs/file.c
+++ b/fs/unionfs/file.c
@@ -101,9 +101,6 @@ static int unionfs_mmap(struct file *file, struct vm_area_struct *vma)

unionfs_read_lock(file->f_path.dentry->d_sb);

- if ((err = unionfs_file_revalidate(file, 1)))
- goto out;
-
/* This might be deferred to mmap's writepage */
willwrite = ((vma->vm_flags | VM_SHARED | VM_WRITE) == vma->vm_flags);
if ((err = unionfs_file_revalidate(file, willwrite)))
@@ -132,6 +129,9 @@ static int unionfs_mmap(struct file *file, struct vm_area_struct *vma)

out:
unionfs_read_unlock(file->f_path.dentry->d_sb);
+ if (!err)
+ /* copyup could cause parent dir times to change */
+ unionfs_copy_attr_times(file->f_path.dentry->d_parent->d_inode);
return err;
}

diff --git a/fs/unionfs/mmap.c b/fs/unionfs/mmap.c
index 969fd16..d26b572 100644
--- a/fs/unionfs/mmap.c
+++ b/fs/unionfs/mmap.c
@@ -21,7 +21,7 @@

/*
* Unionfs doesn't implement ->writepages, which is OK with the VFS and
- * nkeeps our code simpler and smaller. Nevertheless, somehow, our own
+ * keeps our code simpler and smaller. Nevertheless, somehow, our own
* ->writepage must be called so we can sync the upper pages with the lower
* pages: otherwise data changed at the upper layer won't get written to the
* lower layer.
@@ -64,10 +64,31 @@ static int unionfs_writepage(struct page *page, struct writeback_control *wbc)
inode = page->mapping->host;
lower_inode = unionfs_lower_inode(inode);

- /* find lower page (returns a locked page) */
- lower_page = grab_cache_page(lower_inode->i_mapping, page->index);
- if (!lower_page)
+ /*
+ * find lower page (returns a locked page)
+ *
+ * NOTE: we used to call grab_cache_page(), but that was unnecessary
+ * as it would have tried to create a new lower page if it didn't
+ * exist, leading to deadlocks (esp. under memory-pressure
+ * conditions, when it is really a bad idea to *consume* more
+ * memory). Instead, we assume the lower page exists, and if we can
+ * find it, then we ->writepage on it; if we can't find it, then it
+ * couldn't have disappeared unless the kernel already flushed it,
+ * in which case we're still OK. This is especially correct if
+ * wbc->sync_mode is WB_SYNC_NONE (as per
+ * Documentation/filesystems/vfs.txt). If we can't flush our page
+ * because we can't find a lower page, then at least we re-mark our
+ * page as dirty, and return AOP_WRITEPAGE_ACTIVATE as the VFS
+ * expects us to. (Note, if in the future it'd turn out that we
+ * have to find a lower page no matter what, then we'd have to
+ * resort to RAIF's page pointer flipping trick.)
+ */
+ lower_page = find_lock_page(lower_inode->i_mapping, page->index);
+ if (!lower_page) {
+ err = AOP_WRITEPAGE_ACTIVATE;
+ set_page_dirty(page);
goto out;
+ }

/* get page address, and encode it */
kaddr = kmap(page);
@@ -85,24 +106,41 @@ static int unionfs_writepage(struct page *page, struct writeback_control *wbc)
wbc->for_writepages = 0;

/* call lower writepage (expects locked page) */
+ clear_page_dirty_for_io(lower_page); /* emulate VFS behavior */
err = lower_inode->i_mapping->a_ops->writepage(lower_page, wbc);
wbc->for_writepages = saved_for_writepages; /* restore value */

- /*
- * update mtime and ctime of lower level file system
- * unionfs' mtime and ctime are updated by generic_file_write
- */
- lower_inode->i_mtime = lower_inode->i_ctime = CURRENT_TIME;
-
- page_cache_release(lower_page); /* b/c grab_cache_page increased refcnt */
-
+ /* b/c find_lock_page locked it and ->writepage unlocks on success */
if (err)
+ unlock_page(lower_page);
+ /* b/c grab_cache_page increased refcnt */
+ page_cache_release(lower_page);
+
+ if (err < 0) {
ClearPageUptodate(page);
- else
- SetPageUptodate(page);
+ goto out;
+ }
+ if (err == AOP_WRITEPAGE_ACTIVATE) {
+ /*
+ * Lower file systems such as ramfs and tmpfs, may return
+ * AOP_WRITEPAGE_ACTIVATE so that the VM won't try to
+ * (pointlessly) write the page again for a while. But
+ * those lower file systems also set the page dirty bit back
+ * again. So we mimic that behaviour here.
+ */
+ if (PageDirty(lower_page))
+ set_page_dirty(page);
+ goto out;
+ }
+
+ /* all is well */
+ SetPageUptodate(page);
+ /* lower mtimes has changed: update ours */
+ unionfs_copy_attr_times(inode);

-out:
unlock_page(page);
+
+out:
return err;
}

@@ -155,7 +193,9 @@ static int unionfs_do_readpage(struct file *file, struct page *page)
err = 0;

/* if vfs_read succeeded above, sync up our times */
- fsstack_copy_attr_times(inode, lower_file->f_path.dentry->d_inode);
+ unionfs_copy_attr_times(inode);
+
+ flush_dcache_page(page);

out:
if (err == 0)
@@ -170,16 +210,17 @@ static int unionfs_readpage(struct file *file, struct page *page)
{
int err;

- unionfs_read_lock(file->f_dentry->d_sb);
-
+ unionfs_read_lock(file->f_path.dentry->d_sb);
if ((err = unionfs_file_revalidate(file, 0)))
goto out;

err = unionfs_do_readpage(file, page);

- if (!err)
+ if (!err) {
touch_atime(unionfs_lower_mnt(file->f_path.dentry),
unionfs_lower_dentry(file->f_path.dentry));
+ unionfs_copy_attr_times(file->f_path.dentry->d_inode);
+ }

/*
* we have to unlock our page, b/c we _might_ have gotten a locked
@@ -198,11 +239,21 @@ static int unionfs_prepare_write(struct file *file, struct page *page,
{
int err;

- unionfs_read_lock(file->f_dentry->d_sb);
-
+ unionfs_read_lock(file->f_path.dentry->d_sb);
+ /*
+ * This is the only place where we unconditionally copy the lower
+ * attribute times before calling unionfs_file_revalidate. The
+ * reason is that our ->write calls do_sync_write which in turn will
+ * call our ->prepare_write and then ->commit_write. Before our
+ * ->write is called, the lower mtimes are in sync, but by the time
+ * the VFS calls our ->commit_write, the lower mtimes have changed.
+ * Therefore, the only reasonable time for us to sync up from the
+ * changed lower mtimes, and avoid an invariant violation warning,
+ * is here, in ->prepare_write.
+ */
+ unionfs_copy_attr_times(file->f_path.dentry->d_inode);
err = unionfs_file_revalidate(file, 1);
-
- unionfs_read_unlock(file->f_dentry->d_sb);
+ unionfs_read_unlock(file->f_path.dentry->d_sb);

return err;
}
@@ -237,7 +288,8 @@ static int unionfs_commit_write(struct file *file, struct page *page,
page_data = (char *)kmap(page);
lower_file->f_pos = (page->index << PAGE_CACHE_SHIFT) + from;

- /* SP: I use vfs_write instead of copying page data and the
+ /*
+ * SP: I use vfs_write instead of copying page data and the
* prepare_write/commit_write combo because file system's like
* GFS/OCFS2 don't like things touching those directly,
* calling the underlying write op, while a little bit slower, will
@@ -259,20 +311,15 @@ static int unionfs_commit_write(struct file *file, struct page *page,
pos = ((loff_t) page->index << PAGE_CACHE_SHIFT) + to;
if (pos > i_size_read(inode))
i_size_write(inode, pos);
-
- /*
- * update mtime and ctime of lower level file system
- * unionfs' mtime and ctime are updated by generic_file_write
- */
- lower_inode->i_mtime = lower_inode->i_ctime = CURRENT_TIME;
-
+ /* if vfs_write succeeded above, sync up our times */
+ unionfs_copy_attr_times(inode);
mark_inode_dirty_sync(inode);

out:
if (err < 0)
ClearPageUptodate(page);

- unionfs_read_unlock(file->f_dentry->d_sb);
+ unionfs_read_unlock(file->f_path.dentry->d_sb);
return err; /* assume all is ok */
}

@@ -286,10 +333,19 @@ static void unionfs_sync_page(struct page *page)
inode = page->mapping->host;
lower_inode = unionfs_lower_inode(inode);

- /* find lower page (returns a locked page) */
- lower_page = grab_cache_page(lower_inode->i_mapping, page->index);
- if (!lower_page)
+ /*
+ * Find lower page (returns a locked page).
+ *
+ * NOTE: we used to call grab_cache_page(), but that was unnecessary
+ * as it would have tried to create a new lower page if it didn't
+ * exist, leading to deadlocks. All our sync_page method needs to
+ * do is ensure that pending I/O gets done.
+ */
+ lower_page = find_lock_page(lower_inode->i_mapping, page->index);
+ if (!lower_page) {
+ printk(KERN_DEBUG "unionfs: find_lock_page failed\n");
goto out;
+ }

/* do the actual sync */
mapping = lower_page->mapping;
@@ -300,8 +356,10 @@ static void unionfs_sync_page(struct page *page)
if (mapping && mapping->a_ops && mapping->a_ops->sync_page)
mapping->a_ops->sync_page(lower_page);

- unlock_page(lower_page); /* b/c grab_cache_page locked it */
- page_cache_release(lower_page); /* b/c grab_cache_page increased refcnt */
+ /* b/c find_lock_page locked it */
+ unlock_page(lower_page);
+ /* b/c find_lock_page increased refcnt */
+ page_cache_release(lower_page);

out:
return;
--
1.5.2.2.238.g7cbf2f2

2007-09-03 02:41:29

by Josef 'Jeff' Sipek

[permalink] [raw]
Subject: [PATCH 29/32] Unionfs: assorted comment and style updates

From: Erez Zadok <[email protected]>

Signed-off-by: Erez Zadok <[email protected]>
Signed-off-by: Josef 'Jeff' Sipek <[email protected]>
---
fs/unionfs/dirhelper.c | 2 --
fs/unionfs/fanout.h | 9 +++------
fs/unionfs/file.c | 6 ------
fs/unionfs/inode.c | 2 +-
fs/unionfs/lookup.c | 2 +-
fs/unionfs/main.c | 7 +++----
fs/unionfs/rename.c | 8 ++++----
fs/unionfs/super.c | 7 ++++++-
fs/unionfs/union.h | 14 +++++++-------
fs/unionfs/unlink.c | 1 -
10 files changed, 25 insertions(+), 33 deletions(-)

diff --git a/fs/unionfs/dirhelper.c b/fs/unionfs/dirhelper.c
index 24bd327..a72f711 100644
--- a/fs/unionfs/dirhelper.c
+++ b/fs/unionfs/dirhelper.c
@@ -31,7 +31,6 @@ int do_delete_whiteouts(struct dentry *dentry, int bindex,
struct dentry *lower_dentry;
char *name = NULL, *p;
struct inode *lower_dir;
-
int i;
struct list_head *pos;
struct filldir_node *cursor;
@@ -95,7 +94,6 @@ int delete_whiteouts(struct dentry *dentry, int bindex,
struct super_block *sb;
struct dentry *lower_dir_dentry;
struct inode *lower_dir;
-
struct sioq_args args;

sb = dentry->d_sb;
diff --git a/fs/unionfs/fanout.h b/fs/unionfs/fanout.h
index 5908bc7..c5bf454 100644
--- a/fs/unionfs/fanout.h
+++ b/fs/unionfs/fanout.h
@@ -65,9 +65,9 @@ static inline void new_branch_id(struct super_block *sb, int index)
}

/*
- * Find new index of matching branch with an existing superblock a a known
+ * Find new index of matching branch with an existing superblock of a known
* (possibly old) id. This is needed because branches could have been
- * added/deleted causing the branchs of any open files to shift.
+ * added/deleted causing the branches of any open files to shift.
*
* @sb: the new superblock which may have new/different branch IDs
* @id: the old/existing id we're looking for
@@ -80,10 +80,7 @@ static inline int branch_id_to_idx(struct super_block *sb, int id)
if (branch_id(sb, i) == id)
return i;
}
- /*
- * XXX: maybe we should BUG_ON if not found new branch index?
- * (really that should never happen).
- */
+ /* in the non-ODF code, this should really never happen */
printk(KERN_WARNING "unionfs: cannot find branch with id %d\n", id);
return -1;
}
diff --git a/fs/unionfs/file.c b/fs/unionfs/file.c
index b55da4f..b33f44f 100644
--- a/fs/unionfs/file.c
+++ b/fs/unionfs/file.c
@@ -18,17 +18,12 @@

#include "union.h"

-/*******************
- * File Operations *
- *******************/
-
static ssize_t unionfs_read(struct file *file, char __user *buf,
size_t count, loff_t *ppos)
{
int err;

unionfs_read_lock(file->f_path.dentry->d_sb);
-
if ((err = unionfs_file_revalidate(file, 0)))
goto out;

@@ -50,7 +45,6 @@ static ssize_t unionfs_aio_read(struct kiocb *iocb, const struct iovec *iov,
struct file *file = iocb->ki_filp;

unionfs_read_lock(file->f_path.dentry->d_sb);
-
if ((err = unionfs_file_revalidate(file, 0)))
goto out;

diff --git a/fs/unionfs/inode.c b/fs/unionfs/inode.c
index 4574fbe..218320e 100644
--- a/fs/unionfs/inode.c
+++ b/fs/unionfs/inode.c
@@ -781,7 +781,6 @@ static int unionfs_mknod(struct inode *dir, struct dentry *dentry, int mode,
if (err) {
if (!IS_COPYUP_ERR(err))
goto out;
-
bstart--;
} else
whiteout_unlinked = 1;
@@ -882,6 +881,7 @@ static int unionfs_readlink(struct dentry *dentry, char __user *buf,
out:
unionfs_unlock_dentry(dentry);
unionfs_read_unlock(dentry->d_sb);
+
return err;
}

diff --git a/fs/unionfs/lookup.c b/fs/unionfs/lookup.c
index 38ee21f..7fa6310 100644
--- a/fs/unionfs/lookup.c
+++ b/fs/unionfs/lookup.c
@@ -197,7 +197,7 @@ struct dentry *unionfs_lookup_backend(struct dentry *dentry,

/* check if whiteout exists in this branch: lookup .wh.foo */
wh_lower_dentry = lookup_one_len(whname, lower_dir_dentry,
- namelen + UNIONFS_WHLEN);
+ namelen + UNIONFS_WHLEN);
if (IS_ERR(wh_lower_dentry)) {
dput(first_lower_dentry);
unionfs_mntput(first_dentry, first_dentry_offset);
diff --git a/fs/unionfs/main.c b/fs/unionfs/main.c
index ce08d96..4faae44 100644
--- a/fs/unionfs/main.c
+++ b/fs/unionfs/main.c
@@ -226,6 +226,7 @@ void unionfs_reinterpose(struct dentry *dentry)
*/
int check_branch(struct nameidata *nd)
{
+ /* XXX: remove in ODF code -- stacking unions allowed there */
if (!strcmp(nd->dentry->d_sb->s_type->name, "unionfs"))
return -EINVAL;
if (!nd->dentry->d_inode)
@@ -298,7 +299,6 @@ static int parse_dirs_option(struct super_block *sb, struct unionfs_dentry_info
int bindex = 0;
int i = 0;
int j = 0;
-
struct dentry *dent1;
struct dentry *dent2;

@@ -449,8 +449,8 @@ out:
* We want to mount our stackable file system on top of that lower directory.
*/
static struct unionfs_dentry_info *unionfs_parse_options(
- struct super_block *sb,
- char *options)
+ struct super_block *sb,
+ char *options)
{
struct unionfs_dentry_info *lower_root_info;
char *optname;
@@ -584,7 +584,6 @@ static int unionfs_read_super(struct super_block *sb, void *raw_data,
int silent)
{
int err = 0;
-
struct unionfs_dentry_info *lower_root_info = NULL;
int bindex, bstart, bend;

diff --git a/fs/unionfs/rename.c b/fs/unionfs/rename.c
index a02d678..7aa7e57 100644
--- a/fs/unionfs/rename.c
+++ b/fs/unionfs/rename.c
@@ -56,8 +56,8 @@ static int __unionfs_rename(struct inode *old_dir, struct dentry *old_dentry,
}

lower_wh_dentry = lookup_one_len(wh_name, lower_new_dentry->d_parent,
- new_dentry->d_name.len +
- UNIONFS_WHLEN);
+ new_dentry->d_name.len +
+ UNIONFS_WHLEN);
if (IS_ERR(lower_wh_dentry)) {
err = PTR_ERR(lower_wh_dentry);
goto out;
@@ -143,8 +143,8 @@ out:
}

/*
- * Main rename code. This is sufficienly complex, that it's documented in
- * Docmentation/filesystems/unionfs/rename.txt. This routine calls
+ * Main rename code. This is sufficiently complex, that it's documented in
+ * Documentation/filesystems/unionfs/rename.txt. This routine calls
* __unionfs_rename() above to perform some of the work.
*/
static int do_unionfs_rename(struct inode *old_dir,
diff --git a/fs/unionfs/super.c b/fs/unionfs/super.c
index 339afab..3f972df 100644
--- a/fs/unionfs/super.c
+++ b/fs/unionfs/super.c
@@ -136,13 +136,18 @@ static int unionfs_statfs(struct dentry *dentry, struct kstatfs *buf)

/* set return buf to our f/s to avoid confusing user-level utils */
buf->f_type = UNIONFS_SUPER_MAGIC;
-
/*
* Our maximum file name can is shorter by a few bytes because every
* file name could potentially be whited-out.
+ *
+ * XXX: this restriction goes away with ODF.
*/
buf->f_namelen -= UNIONFS_WHLEN;

+ /*
+ * reset two fields to avoid confusing user-land.
+ * XXX: is this still necessary?
+ */
memset(&buf->f_fsid, 0, sizeof(__kernel_fsid_t));
memset(&buf->f_spare, 0, sizeof(buf->f_spare));

diff --git a/fs/unionfs/union.h b/fs/unionfs/union.h
index d1232ac..ce43a50 100644
--- a/fs/unionfs/union.h
+++ b/fs/unionfs/union.h
@@ -203,11 +203,14 @@ extern void unionfs_destroy_dentry_cache(void);

/* Initialize and free readdir-specific state. */
extern int init_rdstate(struct file *file);
-extern struct unionfs_dir_state *alloc_rdstate(struct inode *inode, int bindex);
-extern struct unionfs_dir_state *find_rdstate(struct inode *inode, loff_t fpos);
+extern struct unionfs_dir_state *alloc_rdstate(struct inode *inode,
+ int bindex);
+extern struct unionfs_dir_state *find_rdstate(struct inode *inode,
+ loff_t fpos);
extern void free_rdstate(struct unionfs_dir_state *state);
-extern int add_filldir_node(struct unionfs_dir_state *rdstate, const char *name,
- int namelen, int bindex, int whiteout);
+extern int add_filldir_node(struct unionfs_dir_state *rdstate,
+ const char *name, int namelen, int bindex,
+ int whiteout);
extern struct filldir_node *find_filldir_node(struct unionfs_dir_state *rdstate,
const char *name, int namelen);

@@ -373,16 +376,13 @@ struct dentry *unionfs_lookup_backend(struct dentry *dentry,
static inline int branchperms(const struct super_block *sb, int index)
{
BUG_ON(index < 0);
-
return UNIONFS_SB(sb)->data[index].branchperms;
}

static inline int set_branchperms(struct super_block *sb, int index, int perms)
{
BUG_ON(index < 0);
-
UNIONFS_SB(sb)->data[index].branchperms = perms;
-
return perms;
}

diff --git a/fs/unionfs/unlink.c b/fs/unionfs/unlink.c
index 1e5bfce..50ed351 100644
--- a/fs/unionfs/unlink.c
+++ b/fs/unionfs/unlink.c
@@ -54,7 +54,6 @@ static int unionfs_unlink_whiteout(struct inode *dir, struct dentry *dentry)
if (err) {
if (dbstart(dentry) == 0)
goto out;
-
err = create_whiteout(dentry, dbstart(dentry) - 1);
} else if (dbopaque(dentry) != -1)
/* There is a lower lower-priority file with the same name. */
--
1.5.2.2.238.g7cbf2f2

2007-09-03 03:48:34

by Al Boldi

[permalink] [raw]
Subject: Re: [GIT PULL -mm] Unionfs/fsstack/eCryptfs updates/cleanups/fixes

Josef 'Jeff' Sipek wrote:
> The following is a series of patches related to Unionfs, which include
> three small VFS/fsstack patches and one eCryptfs patch; the rest are
> Unionfs patches. The patches here represent several months of work and
> testing under various conditions, especially low-memory, SMP, and
> preemption situations with an assortment of lower systems: ext2/3/4, xfs,
> reiserfs, nfs, jffs2, ramfs, tmpfs, cramfs, and squashfs.

To increase test-usage, it may be critical to always backport at least to the
latest stable release, like 2.6.22.


Thanks!

--
Al

2007-09-03 06:39:37

by Jan Engelhardt

[permalink] [raw]
Subject: Re: [PATCH 03/32] VFS/fsstack: cpp endif comments


On Sep 2 2007 22:20, Josef 'Jeff' Sipek wrote:
>diff --git a/include/linux/fs_stack.h b/include/linux/fs_stack.h
>index 6b52faf..28543ad 100644
>--- a/include/linux/fs_stack.h
>+++ b/include/linux/fs_stack.h
>@@ -39,4 +39,4 @@ static inline void fsstack_copy_attr_times(struct inode *dest,
> dest->i_ctime = src->i_ctime;
> }
>
>-#endif /* _LINUX_FS_STACK_H */
>+#endif /* not _LINUX_FS_STACK_H */

I hardly think this changes a thing, and it even goes against
the de-facto standard of a lot of other files.


Jan
--

2007-09-03 06:43:23

by Jan Engelhardt

[permalink] [raw]
Subject: Re: [PATCH 05/32] Unionfs: do not use fsstack_copy_attr_all


On Sep 2 2007 22:20, Josef 'Jeff' Sipek wrote:
>+/* copy a/m/ctime from the lower branch with the newest times */
>+static inline void unionfs_copy_attr_times(struct inode *upper)
>+{
>
>+ /* XXX: should we notify_change on our upper inode? */

I do not think so. Inotifying should only happen when the user really
did something (e.g. touch, cp, whatever). When merely the newest time is
picked from a stack of files and applied to the user-visible dentry,
nothing should happen.


Jan
--

2007-09-03 06:49:12

by Jan Engelhardt

[permalink] [raw]
Subject: Re: [PATCH 08/32] Unionfs: cache-coherency - update inode times


On Sep 2 2007 22:20, Josef 'Jeff' Sipek wrote:
>-static ssize_t unionfs_write(struct file * file, const char __user * buf,
>+
>+static ssize_t unionfs_write(struct file *file, const char __user *buf,
> size_t count, loff_t *ppos)
> {
> int err = 0;
>
> unionfs_read_lock(file->f_path.dentry->d_sb);
>-
> if ((err = unionfs_file_revalidate(file, 1)))
> goto out;
>
> err = do_sync_write(file, buf, count, ppos);
>+ /* update our inode times upon a successful lower write */
>+ if (err >= 0)
>+ unionfs_copy_attr_times(file->f_path.dentry->d_inode);

So, revisiting the question of inotifying, write() is indeed a case which
warrants notification (I think). Just was not sure what callers
unionfs_copy_attr_times can have.

>@@ -245,6 +245,12 @@ static struct dentry *unionfs_lookup(struct inode *parent,
> nd->dentry = path_save.dentry;
> nd->mnt = path_save.mnt;
> }
>+ if (!IS_ERR(ret)) {
>+ if (ret)
>+ dentry = ret;
>+ /* parent times may have changed */
>+ unionfs_copy_attr_times(dentry->d_parent->d_inode);
>+ }

On lookup? No inotify if you ask me...


>@@ -675,8 +681,11 @@ out:
>
> kfree(name);
>
>+ if (!err)
>+ unionfs_copy_attr_times(dentry->d_inode);
> unionfs_unlock_dentry(dentry);
> unionfs_read_unlock(dentry->d_sb);
>+
> return err;
> }
>

Can't decide ;-)

>@@ -1006,6 +1015,8 @@ static int unionfs_permission(struct inode *inode, int mask,
> break;
> }
> }
>+ /* sync times which may have changed (asynchronously) below */
>+ unionfs_copy_attr_times(inode);
>
> out:
> return err;

permission => readonly operation => no inotify

>diff --git a/fs/unionfs/unlink.c b/fs/unionfs/unlink.c
>index 822bffe..7ad19ec 100644
>--- a/fs/unionfs/unlink.c
>+++ b/fs/unionfs/unlink.c
>@@ -41,6 +41,9 @@ static int unionfs_unlink_whiteout(struct inode *dir, struct dentry *dentry)
> dget(lower_dentry);
> if (!(err = is_robranch_super(dentry->d_sb, bindex)))
> err = vfs_unlink(lower_dir_dentry->d_inode, lower_dentry);
>+ /* if vfs_unlink succeeded, update our inode's times */
>+ if (!err)
>+ unionfs_copy_attr_times(dentry->d_inode);
> dput(lower_dentry);
> fsstack_copy_attr_times(dir, lower_dir_dentry->d_inode);
> unlock_dir(lower_dir_dentry);

unlink => write operation => inotify



Jan
--

2007-09-03 06:52:27

by Jan Engelhardt

[permalink] [raw]
Subject: Re: [PATCH 09/32] Unionfs: cache-coherency - dentries


On Sep 2 2007 22:20, Josef 'Jeff' Sipek wrote:
>@@ -184,10 +183,92 @@ out:
> }
>
> /*
>+ * Determine if the lower inode objects have changed from below the unionfs
>+ * inode. Return 1 if changed, 0 otherwise.
>+ */
>+int is_newer_lower(const struct dentry *dentry)

Could use bool and true/false as return value.

>-int __unionfs_d_revalidate_chain(struct dentry *dentry, struct nameidata *nd)
>+int __unionfs_d_revalidate_chain(struct dentry *dentry, struct nameidata *nd,
>+ int willwrite)

also looks like a bool (willwrite)

>- if (!__unionfs_d_revalidate_chain(dentry, NULL)) {
>+ if (!__unionfs_d_revalidate_chain(dentry, NULL, 0)) {

(Are there any callers with ,1?)



Jan
--

2007-09-03 06:59:17

by Jan Engelhardt

[permalink] [raw]
Subject: Re: [PATCH 12/32] Unionfs: documentation updates


On Sep 2 2007 22:20, Josef 'Jeff' Sipek wrote:
>+
>+While rebuilding Unionfs's objects, we also purge any page mappings and
>+truncate inode pages (see fs/Unionfs/dentry.c:purge_inode_data). This is to

fs/unionfs/dentry.c

>+Unionfs maintains the following important invariant regarding mtime's,
>+ctime's, and atime's: the upper inode object's times are the max() of all of

utimes, ctimes and atimes.

>+2. Lockdep (a debugging feature) isn't aware of stacking, and so it
>+ incorrectly complains about locking problems. The problem boils down to
>+ this: Lockdep considers all objects of a certain type to be in the same
>+ class, for example, all inodes. Lockdep doesn't like to see a lock held
>+ on two inodes within the same task, and warns that it could lead to a
>+ deadlock. However, stackable file systems do precisely that: they lock
>+ an upper object, and then a lower object, in a strict order to avoid
>+ locking problems; in addition, Unionfs, as a fan-out file system, may
>+ have to lock several lower inodes. We are currently looking into Lockdep
>+ to see how to make it aware of stackable file systems. In the mean time,

meantime

>@@ -86,5 +86,12 @@ command:
>
> # mount -t unionfs -o remount,incgen none MOUNTPOINT
>
>+Note that the older way of incrementing the generation number using an
>+ioctl, is no longer supported in Unionfs 2.0. Ioctls in general are not

2.1?



Jan
--

2007-09-03 14:06:39

by Josef 'Jeff' Sipek

[permalink] [raw]
Subject: Re: [PATCH 12/32] Unionfs: documentation updates

On Mon, Sep 03, 2007 at 08:59:02AM +0200, Jan Engelhardt wrote:
>
> On Sep 2 2007 22:20, Josef 'Jeff' Sipek wrote:
> >+
> >+While rebuilding Unionfs's objects, we also purge any page mappings and
> >+truncate inode pages (see fs/Unionfs/dentry.c:purge_inode_data). This is to
>
> fs/unionfs/dentry.c
>
> >+Unionfs maintains the following important invariant regarding mtime's,
> >+ctime's, and atime's: the upper inode object's times are the max() of all of
>
> utimes, ctimes and atimes.
>
> >+2. Lockdep (a debugging feature) isn't aware of stacking, and so it
> >+ incorrectly complains about locking problems. The problem boils down to
> >+ this: Lockdep considers all objects of a certain type to be in the same
> >+ class, for example, all inodes. Lockdep doesn't like to see a lock held
> >+ on two inodes within the same task, and warns that it could lead to a
> >+ deadlock. However, stackable file systems do precisely that: they lock
> >+ an upper object, and then a lower object, in a strict order to avoid
> >+ locking problems; in addition, Unionfs, as a fan-out file system, may
> >+ have to lock several lower inodes. We are currently looking into Lockdep
> >+ to see how to make it aware of stackable file systems. In the mean time,
>
> meantime
>
> >@@ -86,5 +86,12 @@ command:
> >
> > # mount -t unionfs -o remount,incgen none MOUNTPOINT
> >
> >+Note that the older way of incrementing the generation number using an
> >+ioctl, is no longer supported in Unionfs 2.0. Ioctls in general are not
>
> 2.1?

Thanks.

Jeff.

--
Penguin : Linux version 2.4.20-46.9.legacysmp on an i386 machine (2778.72 BogoMips).

2007-09-03 14:09:49

by Josef 'Jeff' Sipek

[permalink] [raw]
Subject: Re: [PATCH 09/32] Unionfs: cache-coherency - dentries

On Mon, Sep 03, 2007 at 08:52:17AM +0200, Jan Engelhardt wrote:
>
> On Sep 2 2007 22:20, Josef 'Jeff' Sipek wrote:
> >@@ -184,10 +183,92 @@ out:
> > }
> >
> > /*
> >+ * Determine if the lower inode objects have changed from below the unionfs
> >+ * inode. Return 1 if changed, 0 otherwise.
> >+ */
> >+int is_newer_lower(const struct dentry *dentry)
>
> Could use bool and true/false as return value.

I remember that way back when there was a discussion about the bool type.
What how did that end? Is bool preferred?

> >-int __unionfs_d_revalidate_chain(struct dentry *dentry, struct nameidata *nd)
> >+int __unionfs_d_revalidate_chain(struct dentry *dentry, struct nameidata *nd,
> >+ int willwrite)
>
> also looks like a bool (willwrite)

Right.

> >- if (!__unionfs_d_revalidate_chain(dentry, NULL)) {
> >+ if (!__unionfs_d_revalidate_chain(dentry, NULL, 0)) {
>
> (Are there any callers with ,1?)

Indirectly yes. There are callers that pass a value they get. Very large
majority is 0.

Jeff.

--
Bad pun of the week: The formula 1 control computer suffered from a race
condition

2007-09-03 14:23:33

by Jan Engelhardt

[permalink] [raw]
Subject: Re: [PATCH 09/32] Unionfs: cache-coherency - dentries


On Sep 3 2007 10:08, Josef 'Jeff' Sipek wrote:
>> >+int is_newer_lower(const struct dentry *dentry)
>>
>> Could use bool and true/false as return value.
>
>I remember that way back when there was a discussion about the bool type.
>What how did that end? Is bool preferred?

Well if there were objections, it would not be in the tree.


Jan
--

2007-09-03 16:16:28

by Satyam Sharma

[permalink] [raw]
Subject: Re: [PATCH 01/32] VFS: export release_open_intent symbol



On Sun, 2 Sep 2007, Josef 'Jeff' Sipek wrote:
>
> diff --git a/fs/namei.c b/fs/namei.c
> index a83160a..b2b7c8e 100644
> --- a/fs/namei.c
> +++ b/fs/namei.c
> @@ -374,6 +374,7 @@ void release_open_intent(struct nameidata *nd)
> else
> fput(nd->intent.open.file);
> }
> +EXPORT_SYMBOL(release_open_intent);

Hmm, why is this being pushed into mainline? This also looks like an
-mm only patch to me, there are no modular users of release_open_intent()
in mainline, and the next patch doesn't add one either.

2007-09-03 16:20:31

by Erez Zadok

[permalink] [raw]
Subject: Re: [GIT PULL -mm] Unionfs/fsstack/eCryptfs updates/cleanups/fixes

In message <[email protected]>, Al Boldi writes:
> Josef 'Jeff' Sipek wrote:
> > The following is a series of patches related to Unionfs, which include
> > three small VFS/fsstack patches and one eCryptfs patch; the rest are
> > Unionfs patches. The patches here represent several months of work and
> > testing under various conditions, especially low-memory, SMP, and
> > preemption situations with an assortment of lower systems: ext2/3/4, xfs,
> > reiserfs, nfs, jffs2, ramfs, tmpfs, cramfs, and squashfs.
>
> To increase test-usage, it may be critical to always backport at least to
> the latest stable release, like 2.6.22.
>
> Thanks!
>
> --
> Al

Al, we have back-ports of the latest Unionfs to 2.6.{22,21,20,19,18,9}, all
in http://unionfs.filesystems.org/. Before we release any change, we test
it on all back-ports as well as the latest -rc/-mm code base (takes over 24
hours straight to get through all of our regressions :-)

So we'd be happy to submit those patches to the latest stable kernel. But,
are you talking about VFS/ecryptfs patches (which are in the stable kernel),
or are you talking about Unionfs (which is not)?

Thanks,
Erez.

2007-09-03 17:40:06

by Josef 'Jeff' Sipek

[permalink] [raw]
Subject: Re: [PATCH 01/32] VFS: export release_open_intent symbol

On Mon, Sep 03, 2007 at 09:59:15PM +0530, Satyam Sharma wrote:
>
>
> On Sun, 2 Sep 2007, Josef 'Jeff' Sipek wrote:
> >
> > diff --git a/fs/namei.c b/fs/namei.c
> > index a83160a..b2b7c8e 100644
> > --- a/fs/namei.c
> > +++ b/fs/namei.c
> > @@ -374,6 +374,7 @@ void release_open_intent(struct nameidata *nd)
> > else
> > fput(nd->intent.open.file);
> > }
> > +EXPORT_SYMBOL(release_open_intent);
>
> Hmm, why is this being pushed into mainline? This also looks like an
> -mm only patch to me, there are no modular users of release_open_intent()
> in mainline, and the next patch doesn't add one either.

Err...a thinko. Good catch.

Josef 'Jeff' Sipek.

--
I abhor a system designed for the "user", if that word is a coded pejorative
meaning "stupid and unsophisticated."
- Ken Thompson

2007-09-03 18:27:23

by Al Boldi

[permalink] [raw]
Subject: Re: [GIT PULL -mm] Unionfs/fsstack/eCryptfs updates/cleanups/fixes

Erez Zadok wrote:
> Al, we have back-ports of the latest Unionfs to 2.6.{22,21,20,19,18,9},
> all in http://unionfs.filesystems.org/. Before we release any change, we
> test it on all back-ports as well as the latest -rc/-mm code base (takes
> over 24 hours straight to get through all of our regressions :-)

I am impressed, thanks!

It's probably a good idea to always point these backports out, whenever
submitting patches against -mm. Otherwise, people might forget.

> So we'd be happy to submit those patches to the latest stable kernel.
> But, are you talking about VFS/ecryptfs patches (which are in the stable
> kernel), or are you talking about Unionfs (which is not)?

I'm talking about Unionfs, which seems like a rather critical feature to
miss-out on. BTW, did you ever get that oops-on-umount worked out?


Thanks!

--
Al

2007-09-03 18:46:41

by Erez Zadok

[permalink] [raw]
Subject: Re: [GIT PULL -mm] Unionfs/fsstack/eCryptfs updates/cleanups/fixes

In message <[email protected]>, Al Boldi writes:
> Erez Zadok wrote:
> > Al, we have back-ports of the latest Unionfs to 2.6.{22,21,20,19,18,9},
> > all in http://unionfs.filesystems.org/. Before we release any change, we
> > test it on all back-ports as well as the latest -rc/-mm code base (takes
> > over 24 hours straight to get through all of our regressions :-)
>
> I am impressed, thanks!

You're welcome.

> It's probably a good idea to always point these backports out, whenever
> submitting patches against -mm. Otherwise, people might forget.

Good idea.

> > So we'd be happy to submit those patches to the latest stable kernel.
> > But, are you talking about VFS/ecryptfs patches (which are in the stable
> > kernel), or are you talking about Unionfs (which is not)?
>
> I'm talking about Unionfs, which seems like a rather critical feature to
> miss-out on.

Hmmm, we'll have to discuss this among the unionfs developers first.

> BTW, did you ever get that oops-on-umount worked out?

Which bug? Is that something anyone submitted to our
https://bugzilla.filesystems.org? I don't recall such a bug in a while, so
if it got fixed, it must've been a while back. If it's still there and
reproducible, please let me know asap so I can work on it.

It is possible that the bug is in the -mm code. That's why we just posted
the long series of Unionfs patches: those patches represent more than four
months of intense hardening and testing. Some of the bugs we've fixed had
to do with improper refcounting (esp. mnt refcounting, which, if not
perfect, either causes an EBUSY on unmount or an oops on unmount :-)

> Thanks!
>
> --
> Al

Erez.

2007-09-03 23:45:29

by Josef 'Jeff' Sipek

[permalink] [raw]
Subject: Re: [PATCH 03/32] VFS/fsstack: cpp endif comments

On Mon, Sep 03, 2007 at 08:39:27AM +0200, Jan Engelhardt wrote:
>
> On Sep 2 2007 22:20, Josef 'Jeff' Sipek wrote:
> >diff --git a/include/linux/fs_stack.h b/include/linux/fs_stack.h
> >index 6b52faf..28543ad 100644
> >--- a/include/linux/fs_stack.h
> >+++ b/include/linux/fs_stack.h
> >@@ -39,4 +39,4 @@ static inline void fsstack_copy_attr_times(struct inode *dest,
> > dest->i_ctime = src->i_ctime;
> > }
> >
> >-#endif /* _LINUX_FS_STACK_H */
> >+#endif /* not _LINUX_FS_STACK_H */
>
> I hardly think this changes a thing, and it even goes against
> the de-facto standard of a lot of other files.

Fair enough. I'm going to drop it.

Jeff.

--
My public GPG key can be found at
http://www.josefsipek.net/gpg/public-0xC7958FFE.txt

2007-09-03 23:46:35

by Josef 'Jeff' Sipek

[permalink] [raw]
Subject: [PATCH 1/1] Unionfs: cache-coherency - dentries

From: Erez Zadok <[email protected]>

Utility functions to check if lower dentries/inodes are newer than upper
ones, and purging cached data if lower objects are newer. Also passed flag
to our d_revalidate_chain, to tell it if the caller may be writing data or
just reading it.

[jsipek: changed purge_inode_data to take a struct inode]
Signed-off-by: Erez Zadok <[email protected]>
Signed-off-by: Josef 'Jeff' Sipek <[email protected]>
---
fs/unionfs/commonfops.c | 2 +-
fs/unionfs/dentry.c | 143 ++++++++++++++++++++++++++++++++++++++++++-----
fs/unionfs/inode.c | 24 +++++---
fs/unionfs/rename.c | 4 +-
fs/unionfs/super.c | 2 +-
fs/unionfs/union.h | 5 +-
fs/unionfs/unlink.c | 12 +++-
fs/unionfs/xattr.c | 8 +-
8 files changed, 164 insertions(+), 36 deletions(-)

diff --git a/fs/unionfs/commonfops.c b/fs/unionfs/commonfops.c
index 28cb4e9..0dc7492 100644
--- a/fs/unionfs/commonfops.c
+++ b/fs/unionfs/commonfops.c
@@ -287,7 +287,7 @@ int unionfs_file_revalidate(struct file *file, int willwrite)
* but not unhashed dentries.
*/
if (!d_deleted(dentry) &&
- !__unionfs_d_revalidate_chain(dentry, NULL)) {
+ !__unionfs_d_revalidate_chain(dentry, NULL, willwrite)) {
err = -ESTALE;
goto out_nofree;
}
diff --git a/fs/unionfs/dentry.c b/fs/unionfs/dentry.c
index 4a3c479..d937329 100644
--- a/fs/unionfs/dentry.c
+++ b/fs/unionfs/dentry.c
@@ -23,19 +23,18 @@
* Assume that dentry's info node is locked.
* Assume that parent(s) are all valid already, but
* the child may not yet be valid.
- * Returns 1 if valid, 0 otherwise.
+ * Returns true if valid, false otherwise.
*/
-static int __unionfs_d_revalidate_one(struct dentry *dentry,
+static bool __unionfs_d_revalidate_one(struct dentry *dentry,
struct nameidata *nd)
{
- int valid = 1; /* default is valid (1); invalid is 0. */
+ bool valid = true; /* default is valid */
struct dentry *lower_dentry;
int bindex, bstart, bend;
int sbgen, dgen;
int positive = 0;
int locked = 0;
int interpose_flag;
-
struct nameidata lowernd; /* TODO: be gentler to the stack */

if (nd)
@@ -128,7 +127,7 @@ static int __unionfs_d_revalidate_one(struct dentry *dentry,
interpose_flag);
if (result) {
if (IS_ERR(result)) {
- valid = 0;
+ valid = false;
goto out;
}
/*
@@ -142,7 +141,7 @@ static int __unionfs_d_revalidate_one(struct dentry *dentry,
if (positive && UNIONFS_I(dentry->d_inode)->stale) {
make_bad_inode(dentry->d_inode);
d_drop(dentry);
- valid = 0;
+ valid = false;
goto out;
}
goto out;
@@ -158,12 +157,12 @@ static int __unionfs_d_revalidate_one(struct dentry *dentry,
|| !lower_dentry->d_op->d_revalidate)
continue;
if (!lower_dentry->d_op->d_revalidate(lower_dentry,
- &lowernd))
- valid = 0;
+ &lowernd))
+ valid = false;
}

if (!dentry->d_inode)
- valid = 0;
+ valid = false;

if (valid) {
/*
@@ -184,12 +183,94 @@ out:
}

/*
+ * Determine if the lower inode objects have changed from below the unionfs
+ * inode. Return 1 if changed, 0 otherwise.
+ */
+bool is_newer_lower(const struct dentry *dentry)
+{
+ int bindex;
+ struct inode *inode;
+ struct inode *lower_inode;
+
+ /* ignore if we're called on semi-initialized dentries/inodes */
+ if (!dentry || !UNIONFS_D(dentry))
+ return false;
+ inode = dentry->d_inode;
+ if (!inode || !UNIONFS_I(inode) ||
+ ibstart(inode) < 0 || ibend(inode) < 0)
+ return false;
+
+ for (bindex = ibstart(inode); bindex <= ibend(inode); bindex++) {
+ lower_inode = unionfs_lower_inode_idx(inode, bindex);
+ if (!lower_inode)
+ continue;
+ /*
+ * We may want to apply other tests to determine if the
+ * lower inode's data has changed, but checking for changed
+ * ctime and mtime on the lower inode should be enough.
+ */
+ if (timespec_compare(&inode->i_mtime,
+ &lower_inode->i_mtime) < 0) {
+ printk("unionfs: new lower inode mtime "
+ "(bindex=%d, name=%s)\n", bindex,
+ dentry->d_name.name);
+ return true; /* mtime changed! */
+ }
+ if (timespec_compare(&inode->i_ctime,
+ &lower_inode->i_ctime) < 0) {
+ printk("unionfs: new lower inode ctime "
+ "(bindex=%d, name=%s)\n", bindex,
+ dentry->d_name.name);
+ return true; /* ctime changed! */
+ }
+ }
+ return true; /* default: lower is not newer */
+}
+
+/*
+ * Purge/remove/unmap all date pages of a unionfs inode. This is called
+ * when the lower inode has changed, and we have to force processes to get
+ * the new data.
+ *
+ * XXX: Our implementation works in that as long as a user process will have
+ * caused Unionfs to be called, directly or indirectly, even to just do
+ * ->d_revalidate; then we will have purged the current Unionfs data and the
+ * process will see the new data. For example, a process that continually
+ * re-reads the same file's data will see the NEW data as soon as the lower
+ * file had changed, upon the next read(2) syscall (even if the file is
+ * still open!) However, this doesn't work when the process re-reads the
+ * open file's data via mmap(2) (unless the user unmaps/closes the file and
+ * remaps/reopens it). Once we respond to ->readpage(s), then the kernel
+ * maps the page into the process's address space and there doesn't appear
+ * to be a way to force the kernel to invalidate those pages/mappings, and
+ * force the process to re-issue ->readpage. If there's a way to invalidate
+ * active mappings and force a ->readpage, let us know please
+ * (invalidate_inode_pages2 doesn't do the trick).
+ */
+static inline void purge_inode_data(struct inode *inode)
+{
+ /* remove all non-private mappings */
+ unmap_mapping_range(inode->i_mapping, 0, 0, 0);
+
+ if (inode->i_data.nrpages)
+ truncate_inode_pages(&inode->i_data, 0);
+}
+
+/*
* Revalidate a parent chain of dentries, then the actual node.
* Assumes that dentry is locked, but will lock all parents if/when needed.
+ *
+ * If 'willwrite' is 1, and the lower inode times are not in sync, then
+ * *don't* purge_inode_data, as it could deadlock if ->write calls us and we
+ * try to truncate a locked page. Besides, if unionfs is about to write
+ * data to a file, then there's the data unionfs is about to write is more
+ * authoritative than what's below, therefore we can safely overwrite the
+ * lower inode times and data.
*/
-int __unionfs_d_revalidate_chain(struct dentry *dentry, struct nameidata *nd)
+bool __unionfs_d_revalidate_chain(struct dentry *dentry, struct nameidata *nd,
+ bool willwrite)
{
- int valid = 0; /* default is invalid (0); valid is 1. */
+ bool valid = false; /* default is invalid */
struct dentry **chain = NULL; /* chain of dentries to reval */
int chain_len = 0;
struct dentry *dtmp;
@@ -202,6 +283,25 @@ int __unionfs_d_revalidate_chain(struct dentry *dentry, struct nameidata *nd)
sbgen = atomic_read(&UNIONFS_SB(dentry->d_sb)->generation);
dtmp = dentry->d_parent;
dgen = atomic_read(&UNIONFS_D(dtmp)->generation);
+ /* XXX: should we check if is_newer_lower all the way up? */
+ if (is_newer_lower(dtmp)) {
+ /*
+ * Special case: the root dentry's generation number must
+ * always be valid, but its lower inode times don't have to
+ * be, so sync up the times only.
+ */
+ if (IS_ROOT(dtmp))
+ unionfs_copy_attr_times(dtmp->d_inode);
+ else {
+ /*
+ * reset generation number to zero, guaranteed to be
+ * "old"
+ */
+ dgen = 0;
+ atomic_set(&UNIONFS_D(dtmp)->generation, dgen);
+ }
+ purge_inode_data(dtmp->d_inode);
+ }
while (sbgen != dgen) {
/* The root entry should always be valid */
BUG_ON(IS_ROOT(dtmp));
@@ -234,8 +334,8 @@ int __unionfs_d_revalidate_chain(struct dentry *dentry, struct nameidata *nd)
}

/*
- * call __unionfs_d_revalidate() on each dentry, but in parent to
- * child order.
+ * call __unionfs_d_revalidate_one() on each dentry, but in parent
+ * to child order.
*/
for (i=0; i<chain_len; i++) {
unionfs_lock_dentry(chain[i]);
@@ -264,6 +364,21 @@ out_this:
/* finally, lock this dentry and revalidate it */
verify_locked(dentry);
dgen = atomic_read(&UNIONFS_D(dentry)->generation);
+ if (is_newer_lower(dentry)) {
+ /* root dentry special case as aforementioned */
+ if (IS_ROOT(dentry))
+ unionfs_copy_attr_times(dentry->d_inode);
+ else {
+ /*
+ * reset generation number to zero, guaranteed to be
+ * "old"
+ */
+ dgen = 0;
+ atomic_set(&UNIONFS_D(dentry)->generation, dgen);
+ }
+ if (!willwrite)
+ purge_inode_data(dentry->d_inode);
+ }
valid = __unionfs_d_revalidate_one(dentry, nd);

/*
@@ -299,7 +414,7 @@ static int unionfs_d_revalidate(struct dentry *dentry, struct nameidata *nd)
unionfs_read_lock(dentry->d_sb);

unionfs_lock_dentry(dentry);
- err = __unionfs_d_revalidate_chain(dentry, nd);
+ err = __unionfs_d_revalidate_chain(dentry, nd, false);
unionfs_unlock_dentry(dentry);

unionfs_read_unlock(dentry->d_sb);
diff --git a/fs/unionfs/inode.c b/fs/unionfs/inode.c
index 66614e3..568fc1c 100644
--- a/fs/unionfs/inode.c
+++ b/fs/unionfs/inode.c
@@ -34,13 +34,13 @@ static int unionfs_create(struct inode *parent, struct dentry *dentry,
unionfs_lock_dentry(dentry);

unionfs_lock_dentry(dentry->d_parent);
- valid = __unionfs_d_revalidate_chain(dentry->d_parent, nd);
+ valid = __unionfs_d_revalidate_chain(dentry->d_parent, nd, false);
unionfs_unlock_dentry(dentry->d_parent);
if (!valid) {
err = -ESTALE; /* same as what real_lookup does */
goto out;
}
- valid = __unionfs_d_revalidate_chain(dentry, nd);
+ valid = __unionfs_d_revalidate_chain(dentry, nd, false);
/*
* It's only a bug if this dentry was not negative and couldn't be
* revalidated (shouldn't happen).
@@ -270,12 +270,12 @@ static int unionfs_link(struct dentry *old_dentry, struct inode *dir,
unionfs_read_lock(old_dentry->d_sb);
unionfs_double_lock_dentry(new_dentry, old_dentry);

- if (!__unionfs_d_revalidate_chain(old_dentry, NULL)) {
+ if (!__unionfs_d_revalidate_chain(old_dentry, NULL, false)) {
err = -ESTALE;
goto out;
}
if (new_dentry->d_inode &&
- !__unionfs_d_revalidate_chain(new_dentry, NULL)) {
+ !__unionfs_d_revalidate_chain(new_dentry, NULL, false)) {
err = -ESTALE;
goto out;
}
@@ -412,7 +412,7 @@ static int unionfs_symlink(struct inode *dir, struct dentry *dentry,
unionfs_lock_dentry(dentry);

if (dentry->d_inode &&
- !__unionfs_d_revalidate_chain(dentry, NULL)) {
+ !__unionfs_d_revalidate_chain(dentry, NULL, false)) {
err = -ESTALE;
goto out;
}
@@ -560,7 +560,7 @@ static int unionfs_mkdir(struct inode *parent, struct dentry *dentry, int mode)
unionfs_lock_dentry(dentry);

if (dentry->d_inode &&
- !__unionfs_d_revalidate_chain(dentry, NULL)) {
+ !__unionfs_d_revalidate_chain(dentry, NULL, false)) {
err = -ESTALE;
goto out;
}
@@ -703,7 +703,7 @@ static int unionfs_mknod(struct inode *dir, struct dentry *dentry, int mode,
unionfs_lock_dentry(dentry);

if (dentry->d_inode &&
- !__unionfs_d_revalidate_chain(dentry, NULL)) {
+ !__unionfs_d_revalidate_chain(dentry, NULL, false)) {
err = -ESTALE;
goto out;
}
@@ -816,7 +816,7 @@ static int unionfs_readlink(struct dentry *dentry, char __user *buf,
unionfs_read_lock(dentry->d_sb);
unionfs_lock_dentry(dentry);

- if (!__unionfs_d_revalidate_chain(dentry, NULL)) {
+ if (!__unionfs_d_revalidate_chain(dentry, NULL, false)) {
err = -ESTALE;
goto out;
}
@@ -890,6 +890,12 @@ static void unionfs_put_link(struct dentry *dentry, struct nameidata *nd,
void *cookie)
{
unionfs_read_lock(dentry->d_sb);
+
+ unionfs_lock_dentry(dentry);
+ if (!__unionfs_d_revalidate_chain(dentry, nd, false))
+ printk("unionfs: put_link failed to revalidate dentry\n");
+ unionfs_unlock_dentry(dentry);
+
kfree(nd_get_link(nd));
unionfs_read_unlock(dentry->d_sb);
}
@@ -1035,7 +1041,7 @@ static int unionfs_setattr(struct dentry *dentry, struct iattr *ia)
unionfs_read_lock(dentry->d_sb);
unionfs_lock_dentry(dentry);

- if (!__unionfs_d_revalidate_chain(dentry, NULL)) {
+ if (!__unionfs_d_revalidate_chain(dentry, NULL, false)) {
err = -ESTALE;
goto out;
}
diff --git a/fs/unionfs/rename.c b/fs/unionfs/rename.c
index 8a159d1..80add64 100644
--- a/fs/unionfs/rename.c
+++ b/fs/unionfs/rename.c
@@ -401,12 +401,12 @@ int unionfs_rename(struct inode *old_dir, struct dentry *old_dentry,
unionfs_read_lock(old_dentry->d_sb);
unionfs_double_lock_dentry(old_dentry, new_dentry);

- if (!__unionfs_d_revalidate_chain(old_dentry, NULL)) {
+ if (!__unionfs_d_revalidate_chain(old_dentry, NULL, false)) {
err = -ESTALE;
goto out;
}
if (!d_deleted(new_dentry) && new_dentry->d_inode &&
- !__unionfs_d_revalidate_chain(new_dentry, NULL)) {
+ !__unionfs_d_revalidate_chain(new_dentry, NULL, false)) {
err = -ESTALE;
goto out;
}
diff --git a/fs/unionfs/super.c b/fs/unionfs/super.c
index 2f321c2..1b26756 100644
--- a/fs/unionfs/super.c
+++ b/fs/unionfs/super.c
@@ -126,7 +126,7 @@ static int unionfs_statfs(struct dentry *dentry, struct kstatfs *buf)
unionfs_read_lock(sb);
unionfs_lock_dentry(dentry);

- if (!__unionfs_d_revalidate_chain(dentry, NULL)) {
+ if (!__unionfs_d_revalidate_chain(dentry, NULL, false)) {
err = -ESTALE;
goto out;
}
diff --git a/fs/unionfs/union.h b/fs/unionfs/union.h
index b7e5a93..c5c6090 100644
--- a/fs/unionfs/union.h
+++ b/fs/unionfs/union.h
@@ -316,8 +316,9 @@ extern int unionfs_rename(struct inode *old_dir, struct dentry *old_dentry,
extern int unionfs_unlink(struct inode *dir, struct dentry *dentry);
extern int unionfs_rmdir(struct inode *dir, struct dentry *dentry);

-extern int __unionfs_d_revalidate_chain(struct dentry *dentry,
- struct nameidata *nd);
+extern bool __unionfs_d_revalidate_chain(struct dentry *dentry,
+ struct nameidata *nd, bool willwrite);
+extern bool is_newer_lower(const struct dentry *dentry);

/* The values for unionfs_interpose's flag. */
#define INTERPOSE_DEFAULT 0
diff --git a/fs/unionfs/unlink.c b/fs/unionfs/unlink.c
index 7ad19ec..cda1a88 100644
--- a/fs/unionfs/unlink.c
+++ b/fs/unionfs/unlink.c
@@ -80,15 +80,21 @@ int unionfs_unlink(struct inode *dir, struct dentry *dentry)
unionfs_read_lock(dentry->d_sb);
unionfs_lock_dentry(dentry);

- if (!__unionfs_d_revalidate_chain(dentry, NULL)) {
+ if (!__unionfs_d_revalidate_chain(dentry, NULL, false)) {
err = -ESTALE;
goto out;
}

err = unionfs_unlink_whiteout(dir, dentry);
/* call d_drop so the system "forgets" about us */
- if (!err)
+ if (!err) {
d_drop(dentry);
+ /*
+ * if unlink/whiteout succeeded, parent dir mtime has
+ * changed
+ */
+ unionfs_copy_attr_times(dir);
+ }

out:
unionfs_unlock_dentry(dentry);
@@ -136,7 +142,7 @@ int unionfs_rmdir(struct inode *dir, struct dentry *dentry)
unionfs_read_lock(dentry->d_sb);
unionfs_lock_dentry(dentry);

- if (!__unionfs_d_revalidate_chain(dentry, NULL)) {
+ if (!__unionfs_d_revalidate_chain(dentry, NULL, false)) {
err = -ESTALE;
goto out;
}
diff --git a/fs/unionfs/xattr.c b/fs/unionfs/xattr.c
index 92bcd20..9b50f44 100644
--- a/fs/unionfs/xattr.c
+++ b/fs/unionfs/xattr.c
@@ -60,7 +60,7 @@ ssize_t unionfs_getxattr(struct dentry *dentry, const char *name, void *value,
unionfs_read_lock(dentry->d_sb);
unionfs_lock_dentry(dentry);

- if (!__unionfs_d_revalidate_chain(dentry, NULL)) {
+ if (!__unionfs_d_revalidate_chain(dentry, NULL, false)) {
err = -ESTALE;
goto out;
}
@@ -88,7 +88,7 @@ int unionfs_setxattr(struct dentry *dentry, const char *name,
unionfs_read_lock(dentry->d_sb);
unionfs_lock_dentry(dentry);

- if (!__unionfs_d_revalidate_chain(dentry, NULL)) {
+ if (!__unionfs_d_revalidate_chain(dentry, NULL, false)) {
err = -ESTALE;
goto out;
}
@@ -116,7 +116,7 @@ int unionfs_removexattr(struct dentry *dentry, const char *name)
unionfs_read_lock(dentry->d_sb);
unionfs_lock_dentry(dentry);

- if (!__unionfs_d_revalidate_chain(dentry, NULL)) {
+ if (!__unionfs_d_revalidate_chain(dentry, NULL, false)) {
err = -ESTALE;
goto out;
}
@@ -144,7 +144,7 @@ ssize_t unionfs_listxattr(struct dentry *dentry, char *list, size_t size)
unionfs_read_lock(dentry->d_sb);
unionfs_lock_dentry(dentry);

- if (!__unionfs_d_revalidate_chain(dentry, NULL)) {
+ if (!__unionfs_d_revalidate_chain(dentry, NULL, false)) {
err = -ESTALE;
goto out;
}
--
1.5.2.2.238.g7cbf2f2

2007-09-06 16:44:19

by Josef 'Jeff' Sipek

[permalink] [raw]
Subject: Re: [PATCH 1/1] Unionfs: cache-coherency - dentries

On Mon, Sep 03, 2007 at 07:39:48PM -0400, Josef 'Jeff' Sipek wrote:
...
> /*
> + * Determine if the lower inode objects have changed from below the unionfs
> + * inode. Return 1 if changed, 0 otherwise.
> + */
> +bool is_newer_lower(const struct dentry *dentry)
> +{
> + int bindex;
> + struct inode *inode;
> + struct inode *lower_inode;
> +
> + /* ignore if we're called on semi-initialized dentries/inodes */
> + if (!dentry || !UNIONFS_D(dentry))
> + return false;
> + inode = dentry->d_inode;
> + if (!inode || !UNIONFS_I(inode) ||
> + ibstart(inode) < 0 || ibend(inode) < 0)
> + return false;
> +
> + for (bindex = ibstart(inode); bindex <= ibend(inode); bindex++) {
> + lower_inode = unionfs_lower_inode_idx(inode, bindex);
> + if (!lower_inode)
> + continue;
> + /*
> + * We may want to apply other tests to determine if the
> + * lower inode's data has changed, but checking for changed
> + * ctime and mtime on the lower inode should be enough.
> + */
> + if (timespec_compare(&inode->i_mtime,
> + &lower_inode->i_mtime) < 0) {
> + printk("unionfs: new lower inode mtime "
> + "(bindex=%d, name=%s)\n", bindex,
> + dentry->d_name.name);
> + return true; /* mtime changed! */
> + }
> + if (timespec_compare(&inode->i_ctime,
> + &lower_inode->i_ctime) < 0) {
> + printk("unionfs: new lower inode ctime "
> + "(bindex=%d, name=%s)\n", bindex,
> + dentry->d_name.name);
> + return true; /* ctime changed! */
> + }
> + }
> + return true; /* default: lower is not newer */

This was a thinko, it should have returned false. I'll send an updated
patch for completeness.

Josef 'Jeff' Sipek.

--
Only two things are infinite, the universe and human stupidity, and I'm not
sure about the former.
- Albert Einstein

2007-09-06 16:47:54

by Josef 'Jeff' Sipek

[permalink] [raw]
Subject: [PATCH 1/1] Unionfs: cache-coherency - dentries

From: Erez Zadok <[email protected]>

Utility functions to check if lower dentries/inodes are newer than upper
ones, and purging cached data if lower objects are newer. Also passed flag
to our d_revalidate_chain, to tell it if the caller may be writing data or
just reading it.

[jsipek: changed purge_inode_data to take a struct inode]
Signed-off-by: Erez Zadok <[email protected]>
Signed-off-by: Josef 'Jeff' Sipek <[email protected]>
---
fs/unionfs/commonfops.c | 2 +-
fs/unionfs/dentry.c | 143 ++++++++++++++++++++++++++++++++++++++++++-----
fs/unionfs/inode.c | 24 +++++---
fs/unionfs/rename.c | 4 +-
fs/unionfs/super.c | 2 +-
fs/unionfs/union.h | 5 +-
fs/unionfs/unlink.c | 12 +++-
fs/unionfs/xattr.c | 8 +-
8 files changed, 164 insertions(+), 36 deletions(-)

diff --git a/fs/unionfs/commonfops.c b/fs/unionfs/commonfops.c
index 28cb4e9..0dc7492 100644
--- a/fs/unionfs/commonfops.c
+++ b/fs/unionfs/commonfops.c
@@ -287,7 +287,7 @@ int unionfs_file_revalidate(struct file *file, int willwrite)
* but not unhashed dentries.
*/
if (!d_deleted(dentry) &&
- !__unionfs_d_revalidate_chain(dentry, NULL)) {
+ !__unionfs_d_revalidate_chain(dentry, NULL, willwrite)) {
err = -ESTALE;
goto out_nofree;
}
diff --git a/fs/unionfs/dentry.c b/fs/unionfs/dentry.c
index 4a3c479..647a347 100644
--- a/fs/unionfs/dentry.c
+++ b/fs/unionfs/dentry.c
@@ -23,19 +23,18 @@
* Assume that dentry's info node is locked.
* Assume that parent(s) are all valid already, but
* the child may not yet be valid.
- * Returns 1 if valid, 0 otherwise.
+ * Returns true if valid, false otherwise.
*/
-static int __unionfs_d_revalidate_one(struct dentry *dentry,
+static bool __unionfs_d_revalidate_one(struct dentry *dentry,
struct nameidata *nd)
{
- int valid = 1; /* default is valid (1); invalid is 0. */
+ bool valid = true; /* default is valid */
struct dentry *lower_dentry;
int bindex, bstart, bend;
int sbgen, dgen;
int positive = 0;
int locked = 0;
int interpose_flag;
-
struct nameidata lowernd; /* TODO: be gentler to the stack */

if (nd)
@@ -128,7 +127,7 @@ static int __unionfs_d_revalidate_one(struct dentry *dentry,
interpose_flag);
if (result) {
if (IS_ERR(result)) {
- valid = 0;
+ valid = false;
goto out;
}
/*
@@ -142,7 +141,7 @@ static int __unionfs_d_revalidate_one(struct dentry *dentry,
if (positive && UNIONFS_I(dentry->d_inode)->stale) {
make_bad_inode(dentry->d_inode);
d_drop(dentry);
- valid = 0;
+ valid = false;
goto out;
}
goto out;
@@ -158,12 +157,12 @@ static int __unionfs_d_revalidate_one(struct dentry *dentry,
|| !lower_dentry->d_op->d_revalidate)
continue;
if (!lower_dentry->d_op->d_revalidate(lower_dentry,
- &lowernd))
- valid = 0;
+ &lowernd))
+ valid = false;
}

if (!dentry->d_inode)
- valid = 0;
+ valid = false;

if (valid) {
/*
@@ -184,12 +183,94 @@ out:
}

/*
+ * Determine if the lower inode objects have changed from below the unionfs
+ * inode. Return true if changed, false otherwise.
+ */
+bool is_newer_lower(const struct dentry *dentry)
+{
+ int bindex;
+ struct inode *inode;
+ struct inode *lower_inode;
+
+ /* ignore if we're called on semi-initialized dentries/inodes */
+ if (!dentry || !UNIONFS_D(dentry))
+ return false;
+ inode = dentry->d_inode;
+ if (!inode || !UNIONFS_I(inode) ||
+ ibstart(inode) < 0 || ibend(inode) < 0)
+ return false;
+
+ for (bindex = ibstart(inode); bindex <= ibend(inode); bindex++) {
+ lower_inode = unionfs_lower_inode_idx(inode, bindex);
+ if (!lower_inode)
+ continue;
+ /*
+ * We may want to apply other tests to determine if the
+ * lower inode's data has changed, but checking for changed
+ * ctime and mtime on the lower inode should be enough.
+ */
+ if (timespec_compare(&inode->i_mtime,
+ &lower_inode->i_mtime) < 0) {
+ printk("unionfs: new lower inode mtime "
+ "(bindex=%d, name=%s)\n", bindex,
+ dentry->d_name.name);
+ return true; /* mtime changed! */
+ }
+ if (timespec_compare(&inode->i_ctime,
+ &lower_inode->i_ctime) < 0) {
+ printk("unionfs: new lower inode ctime "
+ "(bindex=%d, name=%s)\n", bindex,
+ dentry->d_name.name);
+ return true; /* ctime changed! */
+ }
+ }
+ return false; /* default: lower is not newer */
+}
+
+/*
+ * Purge/remove/unmap all date pages of a unionfs inode. This is called
+ * when the lower inode has changed, and we have to force processes to get
+ * the new data.
+ *
+ * XXX: Our implementation works in that as long as a user process will have
+ * caused Unionfs to be called, directly or indirectly, even to just do
+ * ->d_revalidate; then we will have purged the current Unionfs data and the
+ * process will see the new data. For example, a process that continually
+ * re-reads the same file's data will see the NEW data as soon as the lower
+ * file had changed, upon the next read(2) syscall (even if the file is
+ * still open!) However, this doesn't work when the process re-reads the
+ * open file's data via mmap(2) (unless the user unmaps/closes the file and
+ * remaps/reopens it). Once we respond to ->readpage(s), then the kernel
+ * maps the page into the process's address space and there doesn't appear
+ * to be a way to force the kernel to invalidate those pages/mappings, and
+ * force the process to re-issue ->readpage. If there's a way to invalidate
+ * active mappings and force a ->readpage, let us know please
+ * (invalidate_inode_pages2 doesn't do the trick).
+ */
+static inline void purge_inode_data(struct inode *inode)
+{
+ /* remove all non-private mappings */
+ unmap_mapping_range(inode->i_mapping, 0, 0, 0);
+
+ if (inode->i_data.nrpages)
+ truncate_inode_pages(&inode->i_data, 0);
+}
+
+/*
* Revalidate a parent chain of dentries, then the actual node.
* Assumes that dentry is locked, but will lock all parents if/when needed.
+ *
+ * If 'willwrite' is 1, and the lower inode times are not in sync, then
+ * *don't* purge_inode_data, as it could deadlock if ->write calls us and we
+ * try to truncate a locked page. Besides, if unionfs is about to write
+ * data to a file, then there's the data unionfs is about to write is more
+ * authoritative than what's below, therefore we can safely overwrite the
+ * lower inode times and data.
*/
-int __unionfs_d_revalidate_chain(struct dentry *dentry, struct nameidata *nd)
+bool __unionfs_d_revalidate_chain(struct dentry *dentry, struct nameidata *nd,
+ bool willwrite)
{
- int valid = 0; /* default is invalid (0); valid is 1. */
+ bool valid = false; /* default is invalid */
struct dentry **chain = NULL; /* chain of dentries to reval */
int chain_len = 0;
struct dentry *dtmp;
@@ -202,6 +283,25 @@ int __unionfs_d_revalidate_chain(struct dentry *dentry, struct nameidata *nd)
sbgen = atomic_read(&UNIONFS_SB(dentry->d_sb)->generation);
dtmp = dentry->d_parent;
dgen = atomic_read(&UNIONFS_D(dtmp)->generation);
+ /* XXX: should we check if is_newer_lower all the way up? */
+ if (is_newer_lower(dtmp)) {
+ /*
+ * Special case: the root dentry's generation number must
+ * always be valid, but its lower inode times don't have to
+ * be, so sync up the times only.
+ */
+ if (IS_ROOT(dtmp))
+ unionfs_copy_attr_times(dtmp->d_inode);
+ else {
+ /*
+ * reset generation number to zero, guaranteed to be
+ * "old"
+ */
+ dgen = 0;
+ atomic_set(&UNIONFS_D(dtmp)->generation, dgen);
+ }
+ purge_inode_data(dtmp->d_inode);
+ }
while (sbgen != dgen) {
/* The root entry should always be valid */
BUG_ON(IS_ROOT(dtmp));
@@ -234,8 +334,8 @@ int __unionfs_d_revalidate_chain(struct dentry *dentry, struct nameidata *nd)
}

/*
- * call __unionfs_d_revalidate() on each dentry, but in parent to
- * child order.
+ * call __unionfs_d_revalidate_one() on each dentry, but in parent
+ * to child order.
*/
for (i=0; i<chain_len; i++) {
unionfs_lock_dentry(chain[i]);
@@ -264,6 +364,21 @@ out_this:
/* finally, lock this dentry and revalidate it */
verify_locked(dentry);
dgen = atomic_read(&UNIONFS_D(dentry)->generation);
+ if (is_newer_lower(dentry)) {
+ /* root dentry special case as aforementioned */
+ if (IS_ROOT(dentry))
+ unionfs_copy_attr_times(dentry->d_inode);
+ else {
+ /*
+ * reset generation number to zero, guaranteed to be
+ * "old"
+ */
+ dgen = 0;
+ atomic_set(&UNIONFS_D(dentry)->generation, dgen);
+ }
+ if (!willwrite)
+ purge_inode_data(dentry->d_inode);
+ }
valid = __unionfs_d_revalidate_one(dentry, nd);

/*
@@ -299,7 +414,7 @@ static int unionfs_d_revalidate(struct dentry *dentry, struct nameidata *nd)
unionfs_read_lock(dentry->d_sb);

unionfs_lock_dentry(dentry);
- err = __unionfs_d_revalidate_chain(dentry, nd);
+ err = __unionfs_d_revalidate_chain(dentry, nd, false);
unionfs_unlock_dentry(dentry);

unionfs_read_unlock(dentry->d_sb);
diff --git a/fs/unionfs/inode.c b/fs/unionfs/inode.c
index 66614e3..568fc1c 100644
--- a/fs/unionfs/inode.c
+++ b/fs/unionfs/inode.c
@@ -34,13 +34,13 @@ static int unionfs_create(struct inode *parent, struct dentry *dentry,
unionfs_lock_dentry(dentry);

unionfs_lock_dentry(dentry->d_parent);
- valid = __unionfs_d_revalidate_chain(dentry->d_parent, nd);
+ valid = __unionfs_d_revalidate_chain(dentry->d_parent, nd, false);
unionfs_unlock_dentry(dentry->d_parent);
if (!valid) {
err = -ESTALE; /* same as what real_lookup does */
goto out;
}
- valid = __unionfs_d_revalidate_chain(dentry, nd);
+ valid = __unionfs_d_revalidate_chain(dentry, nd, false);
/*
* It's only a bug if this dentry was not negative and couldn't be
* revalidated (shouldn't happen).
@@ -270,12 +270,12 @@ static int unionfs_link(struct dentry *old_dentry, struct inode *dir,
unionfs_read_lock(old_dentry->d_sb);
unionfs_double_lock_dentry(new_dentry, old_dentry);

- if (!__unionfs_d_revalidate_chain(old_dentry, NULL)) {
+ if (!__unionfs_d_revalidate_chain(old_dentry, NULL, false)) {
err = -ESTALE;
goto out;
}
if (new_dentry->d_inode &&
- !__unionfs_d_revalidate_chain(new_dentry, NULL)) {
+ !__unionfs_d_revalidate_chain(new_dentry, NULL, false)) {
err = -ESTALE;
goto out;
}
@@ -412,7 +412,7 @@ static int unionfs_symlink(struct inode *dir, struct dentry *dentry,
unionfs_lock_dentry(dentry);

if (dentry->d_inode &&
- !__unionfs_d_revalidate_chain(dentry, NULL)) {
+ !__unionfs_d_revalidate_chain(dentry, NULL, false)) {
err = -ESTALE;
goto out;
}
@@ -560,7 +560,7 @@ static int unionfs_mkdir(struct inode *parent, struct dentry *dentry, int mode)
unionfs_lock_dentry(dentry);

if (dentry->d_inode &&
- !__unionfs_d_revalidate_chain(dentry, NULL)) {
+ !__unionfs_d_revalidate_chain(dentry, NULL, false)) {
err = -ESTALE;
goto out;
}
@@ -703,7 +703,7 @@ static int unionfs_mknod(struct inode *dir, struct dentry *dentry, int mode,
unionfs_lock_dentry(dentry);

if (dentry->d_inode &&
- !__unionfs_d_revalidate_chain(dentry, NULL)) {
+ !__unionfs_d_revalidate_chain(dentry, NULL, false)) {
err = -ESTALE;
goto out;
}
@@ -816,7 +816,7 @@ static int unionfs_readlink(struct dentry *dentry, char __user *buf,
unionfs_read_lock(dentry->d_sb);
unionfs_lock_dentry(dentry);

- if (!__unionfs_d_revalidate_chain(dentry, NULL)) {
+ if (!__unionfs_d_revalidate_chain(dentry, NULL, false)) {
err = -ESTALE;
goto out;
}
@@ -890,6 +890,12 @@ static void unionfs_put_link(struct dentry *dentry, struct nameidata *nd,
void *cookie)
{
unionfs_read_lock(dentry->d_sb);
+
+ unionfs_lock_dentry(dentry);
+ if (!__unionfs_d_revalidate_chain(dentry, nd, false))
+ printk("unionfs: put_link failed to revalidate dentry\n");
+ unionfs_unlock_dentry(dentry);
+
kfree(nd_get_link(nd));
unionfs_read_unlock(dentry->d_sb);
}
@@ -1035,7 +1041,7 @@ static int unionfs_setattr(struct dentry *dentry, struct iattr *ia)
unionfs_read_lock(dentry->d_sb);
unionfs_lock_dentry(dentry);

- if (!__unionfs_d_revalidate_chain(dentry, NULL)) {
+ if (!__unionfs_d_revalidate_chain(dentry, NULL, false)) {
err = -ESTALE;
goto out;
}
diff --git a/fs/unionfs/rename.c b/fs/unionfs/rename.c
index 8a159d1..80add64 100644
--- a/fs/unionfs/rename.c
+++ b/fs/unionfs/rename.c
@@ -401,12 +401,12 @@ int unionfs_rename(struct inode *old_dir, struct dentry *old_dentry,
unionfs_read_lock(old_dentry->d_sb);
unionfs_double_lock_dentry(old_dentry, new_dentry);

- if (!__unionfs_d_revalidate_chain(old_dentry, NULL)) {
+ if (!__unionfs_d_revalidate_chain(old_dentry, NULL, false)) {
err = -ESTALE;
goto out;
}
if (!d_deleted(new_dentry) && new_dentry->d_inode &&
- !__unionfs_d_revalidate_chain(new_dentry, NULL)) {
+ !__unionfs_d_revalidate_chain(new_dentry, NULL, false)) {
err = -ESTALE;
goto out;
}
diff --git a/fs/unionfs/super.c b/fs/unionfs/super.c
index 2f321c2..1b26756 100644
--- a/fs/unionfs/super.c
+++ b/fs/unionfs/super.c
@@ -126,7 +126,7 @@ static int unionfs_statfs(struct dentry *dentry, struct kstatfs *buf)
unionfs_read_lock(sb);
unionfs_lock_dentry(dentry);

- if (!__unionfs_d_revalidate_chain(dentry, NULL)) {
+ if (!__unionfs_d_revalidate_chain(dentry, NULL, false)) {
err = -ESTALE;
goto out;
}
diff --git a/fs/unionfs/union.h b/fs/unionfs/union.h
index b7e5a93..c5c6090 100644
--- a/fs/unionfs/union.h
+++ b/fs/unionfs/union.h
@@ -316,8 +316,9 @@ extern int unionfs_rename(struct inode *old_dir, struct dentry *old_dentry,
extern int unionfs_unlink(struct inode *dir, struct dentry *dentry);
extern int unionfs_rmdir(struct inode *dir, struct dentry *dentry);

-extern int __unionfs_d_revalidate_chain(struct dentry *dentry,
- struct nameidata *nd);
+extern bool __unionfs_d_revalidate_chain(struct dentry *dentry,
+ struct nameidata *nd, bool willwrite);
+extern bool is_newer_lower(const struct dentry *dentry);

/* The values for unionfs_interpose's flag. */
#define INTERPOSE_DEFAULT 0
diff --git a/fs/unionfs/unlink.c b/fs/unionfs/unlink.c
index 7ad19ec..cda1a88 100644
--- a/fs/unionfs/unlink.c
+++ b/fs/unionfs/unlink.c
@@ -80,15 +80,21 @@ int unionfs_unlink(struct inode *dir, struct dentry *dentry)
unionfs_read_lock(dentry->d_sb);
unionfs_lock_dentry(dentry);

- if (!__unionfs_d_revalidate_chain(dentry, NULL)) {
+ if (!__unionfs_d_revalidate_chain(dentry, NULL, false)) {
err = -ESTALE;
goto out;
}

err = unionfs_unlink_whiteout(dir, dentry);
/* call d_drop so the system "forgets" about us */
- if (!err)
+ if (!err) {
d_drop(dentry);
+ /*
+ * if unlink/whiteout succeeded, parent dir mtime has
+ * changed
+ */
+ unionfs_copy_attr_times(dir);
+ }

out:
unionfs_unlock_dentry(dentry);
@@ -136,7 +142,7 @@ int unionfs_rmdir(struct inode *dir, struct dentry *dentry)
unionfs_read_lock(dentry->d_sb);
unionfs_lock_dentry(dentry);

- if (!__unionfs_d_revalidate_chain(dentry, NULL)) {
+ if (!__unionfs_d_revalidate_chain(dentry, NULL, false)) {
err = -ESTALE;
goto out;
}
diff --git a/fs/unionfs/xattr.c b/fs/unionfs/xattr.c
index 92bcd20..9b50f44 100644
--- a/fs/unionfs/xattr.c
+++ b/fs/unionfs/xattr.c
@@ -60,7 +60,7 @@ ssize_t unionfs_getxattr(struct dentry *dentry, const char *name, void *value,
unionfs_read_lock(dentry->d_sb);
unionfs_lock_dentry(dentry);

- if (!__unionfs_d_revalidate_chain(dentry, NULL)) {
+ if (!__unionfs_d_revalidate_chain(dentry, NULL, false)) {
err = -ESTALE;
goto out;
}
@@ -88,7 +88,7 @@ int unionfs_setxattr(struct dentry *dentry, const char *name,
unionfs_read_lock(dentry->d_sb);
unionfs_lock_dentry(dentry);

- if (!__unionfs_d_revalidate_chain(dentry, NULL)) {
+ if (!__unionfs_d_revalidate_chain(dentry, NULL, false)) {
err = -ESTALE;
goto out;
}
@@ -116,7 +116,7 @@ int unionfs_removexattr(struct dentry *dentry, const char *name)
unionfs_read_lock(dentry->d_sb);
unionfs_lock_dentry(dentry);

- if (!__unionfs_d_revalidate_chain(dentry, NULL)) {
+ if (!__unionfs_d_revalidate_chain(dentry, NULL, false)) {
err = -ESTALE;
goto out;
}
@@ -144,7 +144,7 @@ ssize_t unionfs_listxattr(struct dentry *dentry, char *list, size_t size)
unionfs_read_lock(dentry->d_sb);
unionfs_lock_dentry(dentry);

- if (!__unionfs_d_revalidate_chain(dentry, NULL)) {
+ if (!__unionfs_d_revalidate_chain(dentry, NULL, false)) {
err = -ESTALE;
goto out;
}
--
1.5.2.2.238.g7cbf2f2