The following is a series of patchsets related to Unionfs. This is the
second set of patchsets resulting from an lkml review of the entire unionfs
code base (many thanks go to all of those who sent their comments). The
most significant changes in this set are bug fixes and an overhaul of the
locking in unionfs (many of these fixes are applicable to stacking in
general).
The first two patches in the series are small generic VFS fsstack patches.
These patches were tested (where appropriate) on Linus's 2.6.24 latest code
(as of v2.6.24-rc6-125-g5356f66), MM, as well as the backports to
2.6.{23,22,21,20,19,18,9} on ext2/3/4, xfs, reiserfs, nfs2/3/4, jffs2,
ramfs, tmpfs, cramfs, and squashfs (where available). Also tested with
LTP-full. See http://unionfs.filesystems.org/ to download back-ported
unionfs code.
Please pull from the 'master' branch of
git://git.kernel.org/pub/scm/linux/kernel/git/ezk/unionfs.git
to receive the following:
Erez Zadok (27):
VFS/fs_stack: drop cast on inode passed to i_size_read
Unionfs: use locking around i_size_write in 32-bit systems
Unionfs: clarify usage.txt read/write behavior
Unionfs: interpose cleanup and fix for spliced dentries
Unionfs: initialize inode times for reused inodes
Unionfs: create new special files only in first branch
Unionfs: create new symlinks only in first branch
Unionfs: release special files on copyup
Unionfs: remove custom read/write methods
Unionfs: prevent deadlock in cache coherency
Unionfs: remove unnecessary conditional inode lock
Unionfs: remove unnecessary lock when deleting whiteouts
Unionfs: remove unnecessary lock in read_inode
Unionfs: remove unnecessary locking in follow-link
Unionfs: remove unnecessary parent lock in create
Unionfs: prevent false lockdep warnings in stacking
Unionfs: implement lockdep classes
Unionfs: minor code rearrangement in rename
Unionfs: handle on lower inodes in lookup
Unionfs: set our superblock a/m/ctime granularity
Unionfs: update inode times after a successful open
Unionfs: minor cleanup in check_empty
Unionfs: initialize namelist variable in rename
Unionfs: cleanup lower inodes after successful unlink
Unionfs: don't check dentry on error
Unionfs: implement d_iput method
Unionfs: don't check parent dentries
Hugh Dickins (3):
VFS/fs_stack: use locking around i_size_write in 32-bit systems
Unionfs: mmap fixes
Unionfs: restructure unionfs_setattr and fix truncation order
Documentation/filesystems/unionfs/concepts.txt | 20
Documentation/filesystems/unionfs/issues.txt | 12
Documentation/filesystems/unionfs/usage.txt | 13
fs/stack.c | 8
fs/unionfs/commonfops.c | 22
fs/unionfs/copyup.c | 16
fs/unionfs/dentry.c | 101 ++--
fs/unionfs/dirfops.c | 4
fs/unionfs/dirhelper.c | 11
fs/unionfs/fanout.h | 14
fs/unionfs/file.c | 53 --
fs/unionfs/inode.c | 579 ++++++++++++-------------
fs/unionfs/lookup.c | 20
fs/unionfs/main.c | 19
fs/unionfs/mmap.c | 33 -
fs/unionfs/rename.c | 114 ++--
fs/unionfs/subr.c | 4
fs/unionfs/super.c | 31 -
fs/unionfs/union.h | 38 +
fs/unionfs/unlink.c | 31 -
fs/unionfs/xattr.c | 16
21 files changed, 608 insertions(+), 551 deletions(-)
---
Erez Zadok
[email protected]
Don't try to truncate_inode_pages in in purge_inode_data, because this could
lead to a deadlock between some of address_space ops and dentry
revalidation: the address space op is invoked with a lock on our own page,
and truncate_inode_pages will block on locked pages. Instead, it should be
enough to be gentler and just invalidate_mapping_pages.
Signed-off-by: Erez Zadok <[email protected]>
---
fs/unionfs/dentry.c | 32 +++++++++++---------------------
1 files changed, 11 insertions(+), 21 deletions(-)
diff --git a/fs/unionfs/dentry.c b/fs/unionfs/dentry.c
index c7b6a7c..68c07ba 100644
--- a/fs/unionfs/dentry.c
+++ b/fs/unionfs/dentry.c
@@ -248,32 +248,22 @@ bool is_newer_lower(const struct dentry *dentry)
}
/*
- * Purge/remove/unmap all date pages of a unionfs inode. This is called
- * when the lower inode has changed, and we have to force processes to get
- * the new data.
- *
- * XXX: Our implementation works in that as long as a user process will have
- * caused Unionfs to be called, directly or indirectly, even to just do
- * ->d_revalidate; then we will have purged the current Unionfs data and the
- * process will see the new data. For example, a process that continually
- * re-reads the same file's data will see the NEW data as soon as the lower
- * file had changed, upon the next read(2) syscall (even if the file is
- * still open!) However, this doesn't work when the process re-reads the
- * open file's data via mmap(2) (unless the user unmaps/closes the file and
- * remaps/reopens it). Once we respond to ->readpage(s), then the kernel
- * maps the page into the process's address space and there doesn't appear
- * to be a way to force the kernel to invalidate those pages/mappings, and
- * force the process to re-issue ->readpage. If there's a way to invalidate
- * active mappings and force a ->readpage, let us know please
- * (invalidate_inode_pages2 doesn't do the trick).
+ * Purge and invalidate as many data pages of a unionfs inode. This is
+ * called when the lower inode has changed, and we want to force processes
+ * to re-get the new data.
*/
static inline void purge_inode_data(struct inode *inode)
{
/* remove all non-private mappings */
unmap_mapping_range(inode->i_mapping, 0, 0, 0);
-
- if (inode->i_data.nrpages)
- truncate_inode_pages(&inode->i_data, 0);
+ /* invalidate as many pages as possible */
+ invalidate_mapping_pages(inode->i_mapping, 0, -1);
+ /*
+ * Don't try to truncate_inode_pages here, because this could lead
+ * to a deadlock between some of address_space ops and dentry
+ * revalidation: the address space op is invoked with a lock on our
+ * own page, and truncate_inode_pages will block on locked pages.
+ */
}
void purge_sb_data(struct super_block *sb)
--
1.5.2.2
CC: Hugh Dickins <[email protected]>
Signed-off-by: Erez Zadok <[email protected]>
---
fs/unionfs/super.c | 6 ++++++
1 files changed, 6 insertions(+), 0 deletions(-)
diff --git a/fs/unionfs/super.c b/fs/unionfs/super.c
index 0ff9a9e..ed3eb04 100644
--- a/fs/unionfs/super.c
+++ b/fs/unionfs/super.c
@@ -69,7 +69,13 @@ static void unionfs_read_inode(struct inode *inode)
*/
static void unionfs_delete_inode(struct inode *inode)
{
+#if BITS_PER_LONG == 32 && defined(CONFIG_SMP)
+ spin_lock(&inode->i_lock);
+#endif
i_size_write(inode, 0); /* every f/s seems to do that */
+#if BITS_PER_LONG == 32 && defined(CONFIG_SMP)
+ spin_unlock(&inode->i_lock);
+#endif
if (inode->i_data.nrpages)
truncate_inode_pages(&inode->i_data, 0);
--
1.5.2.2
Having them results in lockdep warnings about having locks and grabbing the
same class locks in do_sync_read/write which were called from
unionfs_read/write. All they did was revalidate out file object sooner,
which will now be deferred till a bit later. Instead, use generic
do_sync_read and do_sync_write.
Signed-off-by: Erez Zadok <[email protected]>
---
fs/unionfs/file.c | 46 ++--------------------------------------------
1 files changed, 2 insertions(+), 44 deletions(-)
diff --git a/fs/unionfs/file.c b/fs/unionfs/file.c
index c922173..94784c3 100644
--- a/fs/unionfs/file.c
+++ b/fs/unionfs/file.c
@@ -18,48 +18,6 @@
#include "union.h"
-static ssize_t unionfs_read(struct file *file, char __user *buf,
- size_t count, loff_t *ppos)
-{
- int err;
-
- unionfs_read_lock(file->f_path.dentry->d_sb);
- err = unionfs_file_revalidate(file, false);
- if (unlikely(err))
- goto out;
- unionfs_check_file(file);
-
- err = do_sync_read(file, buf, count, ppos);
-
-out:
- unionfs_check_file(file);
- unionfs_read_unlock(file->f_path.dentry->d_sb);
- return err;
-}
-
-static ssize_t unionfs_write(struct file *file, const char __user *buf,
- size_t count, loff_t *ppos)
-{
- int err = 0;
-
- unionfs_read_lock(file->f_path.dentry->d_sb);
- err = unionfs_file_revalidate(file, true);
- if (unlikely(err))
- goto out;
- unionfs_check_file(file);
-
- err = do_sync_write(file, buf, count, ppos);
- /* update our inode times upon a successful lower write */
- if (err >= 0) {
- unionfs_copy_attr_times(file->f_path.dentry->d_inode);
- unionfs_check_file(file);
- }
-
-out:
- unionfs_read_unlock(file->f_path.dentry->d_sb);
- return err;
-}
-
static int unionfs_file_readdir(struct file *file, void *dirent,
filldir_t filldir)
{
@@ -210,9 +168,9 @@ out:
struct file_operations unionfs_main_fops = {
.llseek = generic_file_llseek,
- .read = unionfs_read,
+ .read = do_sync_read,
.aio_read = generic_file_aio_read,
- .write = unionfs_write,
+ .write = do_sync_write,
.aio_write = generic_file_aio_write,
.readdir = unionfs_file_readdir,
.unlocked_ioctl = unionfs_ioctl,
--
1.5.2.2
Signed-off-by: Erez Zadok <[email protected]>
---
fs/unionfs/super.c | 8 ++++++++
1 files changed, 8 insertions(+), 0 deletions(-)
diff --git a/fs/unionfs/super.c b/fs/unionfs/super.c
index ed3eb04..c474c86 100644
--- a/fs/unionfs/super.c
+++ b/fs/unionfs/super.c
@@ -55,6 +55,14 @@ static void unionfs_read_inode(struct inode *inode)
inode->i_mapping->a_ops = &unionfs_aops;
+ /*
+ * reset times so unionfs_copy_attr_all can keep out time invariants
+ * right (upper inode time being the max of all lower ones).
+ */
+ inode->i_atime.tv_sec = inode->i_atime.tv_nsec = 0;
+ inode->i_mtime.tv_sec = inode->i_mtime.tv_nsec = 0;
+ inode->i_ctime.tv_sec = inode->i_ctime.tv_nsec = 0;
+
unionfs_read_unlock(inode->i_sb);
}
--
1.5.2.2
Signed-off-by: Erez Zadok <[email protected]>
---
fs/unionfs/dentry.c | 6 +++++-
fs/unionfs/lookup.c | 5 +++++
2 files changed, 10 insertions(+), 1 deletions(-)
diff --git a/fs/unionfs/dentry.c b/fs/unionfs/dentry.c
index b207a6f..0e89308 100644
--- a/fs/unionfs/dentry.c
+++ b/fs/unionfs/dentry.c
@@ -151,8 +151,12 @@ static bool __unionfs_d_revalidate_one(struct dentry *dentry,
valid = false;
}
- if (!dentry->d_inode)
+ if (!dentry->d_inode ||
+ ibstart(dentry->d_inode) < 0 ||
+ ibend(dentry->d_inode) < 0) {
valid = false;
+ goto out;
+ }
if (valid) {
/*
diff --git a/fs/unionfs/lookup.c b/fs/unionfs/lookup.c
index 85a85aa..b9ee072 100644
--- a/fs/unionfs/lookup.c
+++ b/fs/unionfs/lookup.c
@@ -225,6 +225,7 @@ struct dentry *unionfs_lookup_backend(struct dentry *dentry,
wh_lower_dentry = NULL;
/* Now do regular lookup; lookup foo */
+ BUG_ON(!lower_dir_dentry);
lower_dentry = lookup_one_len(name, lower_dir_dentry, namelen);
if (IS_ERR(lower_dentry)) {
dput(first_lower_dentry);
@@ -315,6 +316,10 @@ out_negative:
UNIONFS_I(dentry->d_inode)->stale = 1;
goto out;
}
+ if (!lower_dir_dentry) {
+ err = -ENOENT;
+ goto out;
+ }
/* This should only happen if we found a whiteout. */
if (first_dentry_offset == -1) {
first_lower_dentry = lookup_one_len(name, lower_dir_dentry,
--
1.5.2.2
A stackable file system like unionfs often performs an operation on a lower
file system, by calling a vfs_* method, having been called possibly by the
very same method from the VFS. Both calls to the vfs_* method grab a lock
in the same lock class, and hence lockdep complains. This warning is a
false positive in instances where unionfs only calls the vfs_* method on
lower objects; there's a strict lock ordering here: upper objects first,
then lower objects.
We want to prevent these false positives so that lockdep will not shutdown
so it'd still be able to warn us about potentially true locking problems.
So, we temporarily turn off lockdep ONLY AROUND the calls to vfs methods to
which we pass lower objects, and only for those instances where lockdep
complained. While this solution may seem unclean, it is not without
precedent: other places in the kernel also do similar temporary disabling,
of course after carefully having checked that it is the right thing to do.
In the long run, lockdep needs to be taught how to handle about stacking.
Then this patch can be removed. It is likely that such lockdep-stacking
support will do essentially the same as this patch: consider the same
ordering (upper then lower) and consider upper vs. lower locks to be in
different classes.
Signed-off-by: Erez Zadok <[email protected]>
---
Documentation/filesystems/unionfs/issues.txt | 10 +++++++---
fs/unionfs/copyup.c | 3 +++
fs/unionfs/inode.c | 21 +++++++++++++++++++--
fs/unionfs/rename.c | 21 +++++++++++----------
fs/unionfs/super.c | 4 ++++
fs/unionfs/unlink.c | 12 ++++++++++--
6 files changed, 54 insertions(+), 17 deletions(-)
diff --git a/Documentation/filesystems/unionfs/issues.txt b/Documentation/filesystems/unionfs/issues.txt
index bb6ab05..f4b7e7e 100644
--- a/Documentation/filesystems/unionfs/issues.txt
+++ b/Documentation/filesystems/unionfs/issues.txt
@@ -17,8 +17,12 @@ KNOWN Unionfs 2.x ISSUES:
an upper object, and then a lower object, in a strict order to avoid
locking problems; in addition, Unionfs, as a fan-out file system, may
have to lock several lower inodes. We are currently looking into Lockdep
- to see how to make it aware of stackable file systems. In the meantime,
- if you get any warnings from Lockdep, you can safely ignore them (or feel
- free to report them to the Unionfs maintainers, just to be sure).
+ to see how to make it aware of stackable file systems. For now, we
+ temporarily disable lockdep when calling vfs methods on lower objects,
+ but only for those places where lockdep complained. While this solution
+ may seem unclean, it is not without precedent: other places in the kernel
+ also do similar temporary disabling, of course after carefully having
+ checked that it is the right thing to do. Anyway, you get any warnings
+ from Lockdep, please report them to the Unionfs maintainers.
For more information, see <http://unionfs.filesystems.org/>.
diff --git a/fs/unionfs/copyup.c b/fs/unionfs/copyup.c
index f48209f..0012caf 100644
--- a/fs/unionfs/copyup.c
+++ b/fs/unionfs/copyup.c
@@ -297,11 +297,14 @@ static int __copyup_reg_data(struct dentry *dentry,
break;
}
+ /* see Documentation/filesystems/unionfs/issues.txt */
+ lockdep_off();
write_bytes =
output_file->f_op->write(output_file,
(char __user *)buf,
read_bytes,
&output_file->f_pos);
+ lockdep_on();
if ((write_bytes < 0) || (write_bytes < read_bytes)) {
err = write_bytes;
break;
diff --git a/fs/unionfs/inode.c b/fs/unionfs/inode.c
index 3df9b19..4890f42 100644
--- a/fs/unionfs/inode.c
+++ b/fs/unionfs/inode.c
@@ -80,7 +80,10 @@ static int unionfs_create(struct inode *parent, struct dentry *dentry,
struct dentry *lower_dir_dentry;
lower_dir_dentry = lock_parent(wh_dentry);
+ /* see Documentation/filesystems/unionfs/issues.txt */
+ lockdep_off();
err = vfs_unlink(lower_dir_dentry->d_inode, wh_dentry);
+ lockdep_on();
unlock_dir(lower_dir_dentry);
/*
@@ -262,9 +265,13 @@ static int unionfs_link(struct dentry *old_dentry, struct inode *dir,
/* found a .wh.foo entry, unlink it and then call vfs_link() */
lower_dir_dentry = lock_parent(whiteout_dentry);
err = is_robranch_super(new_dentry->d_sb, dbstart(new_dentry));
- if (!err)
+ if (!err) {
+ /* see Documentation/filesystems/unionfs/issues.txt */
+ lockdep_off();
err = vfs_unlink(lower_dir_dentry->d_inode,
whiteout_dentry);
+ lockdep_on();
+ }
fsstack_copy_attr_times(dir, lower_dir_dentry->d_inode);
dir->i_nlink = unionfs_get_nlinks(dir);
@@ -291,9 +298,13 @@ static int unionfs_link(struct dentry *old_dentry, struct inode *dir,
BUG_ON(dbstart(old_dentry) != dbstart(new_dentry));
lower_dir_dentry = lock_parent(lower_new_dentry);
err = is_robranch(old_dentry);
- if (!err)
+ if (!err) {
+ /* see Documentation/filesystems/unionfs/issues.txt */
+ lockdep_off();
err = vfs_link(lower_old_dentry, lower_dir_dentry->d_inode,
lower_new_dentry);
+ lockdep_on();
+ }
unlock_dir(lower_dir_dentry);
docopyup:
@@ -316,10 +327,16 @@ docopyup:
unionfs_lower_dentry(old_dentry);
lower_dir_dentry =
lock_parent(lower_new_dentry);
+ /*
+ * see
+ * Documentation/filesystems/unionfs/issues.txt
+ */
+ lockdep_off();
/* do vfs_link */
err = vfs_link(lower_old_dentry,
lower_dir_dentry->d_inode,
lower_new_dentry);
+ lockdep_on();
unlock_dir(lower_dir_dentry);
goto check_link;
}
diff --git a/fs/unionfs/rename.c b/fs/unionfs/rename.c
index 452d1e7..8b04acf 100644
--- a/fs/unionfs/rename.c
+++ b/fs/unionfs/rename.c
@@ -90,16 +90,14 @@ static int __unionfs_rename(struct inode *old_dir, struct dentry *old_dentry,
dput(lower_wh_dentry);
}
+ err = is_robranch_super(old_dentry->d_sb, bindex);
+ if (err)
+ goto out;
+
dget(lower_old_dentry);
lower_old_dir_dentry = dget_parent(lower_old_dentry);
lower_new_dir_dentry = dget_parent(lower_new_dentry);
- lock_rename(lower_old_dir_dentry, lower_new_dir_dentry);
-
- err = is_robranch_super(old_dentry->d_sb, bindex);
- if (err)
- goto out_unlock;
-
/*
* ready to whiteout for old_dentry. caller will create the actual
* whiteout, and must dput(*wh_old)
@@ -110,7 +108,7 @@ static int __unionfs_rename(struct inode *old_dir, struct dentry *old_dentry,
old_dentry->d_name.len);
err = PTR_ERR(whname);
if (unlikely(IS_ERR(whname)))
- goto out_unlock;
+ goto out_dput;
*wh_old = lookup_one_len(whname, lower_old_dir_dentry,
old_dentry->d_name.len +
UNIONFS_WHLEN);
@@ -118,16 +116,19 @@ static int __unionfs_rename(struct inode *old_dir, struct dentry *old_dentry,
err = PTR_ERR(*wh_old);
if (IS_ERR(*wh_old)) {
*wh_old = NULL;
- goto out_unlock;
+ goto out_dput;
}
}
+ /* see Documentation/filesystems/unionfs/issues.txt */
+ lockdep_off();
+ lock_rename(lower_old_dir_dentry, lower_new_dir_dentry);
err = vfs_rename(lower_old_dir_dentry->d_inode, lower_old_dentry,
lower_new_dir_dentry->d_inode, lower_new_dentry);
-
-out_unlock:
unlock_rename(lower_old_dir_dentry, lower_new_dir_dentry);
+ lockdep_on();
+out_dput:
dput(lower_old_dir_dentry);
dput(lower_new_dir_dentry);
dput(lower_old_dentry);
diff --git a/fs/unionfs/super.c b/fs/unionfs/super.c
index 8b70aca..45bcf89 100644
--- a/fs/unionfs/super.c
+++ b/fs/unionfs/super.c
@@ -839,7 +839,11 @@ static void unionfs_clear_inode(struct inode *inode)
lower_inode = unionfs_lower_inode_idx(inode, bindex);
if (!lower_inode)
continue;
+ unionfs_set_lower_inode_idx(inode, bindex, NULL);
+ /* see Documentation/filesystems/unionfs/issues.txt */
+ lockdep_off();
iput(lower_inode);
+ lockdep_on();
}
}
diff --git a/fs/unionfs/unlink.c b/fs/unionfs/unlink.c
index 423ff36..677a5ae 100644
--- a/fs/unionfs/unlink.c
+++ b/fs/unionfs/unlink.c
@@ -41,8 +41,12 @@ static int unionfs_unlink_whiteout(struct inode *dir, struct dentry *dentry)
/* avoid destroying the lower inode if the file is in use */
dget(lower_dentry);
err = is_robranch_super(dentry->d_sb, bindex);
- if (!err)
+ if (!err) {
+ /* see Documentation/filesystems/unionfs/issues.txt */
+ lockdep_off();
err = vfs_unlink(lower_dir_dentry->d_inode, lower_dentry);
+ lockdep_on();
+ }
/* if vfs_unlink succeeded, update our inode's times */
if (!err)
unionfs_copy_attr_times(dentry->d_inode);
@@ -139,8 +143,12 @@ static int unionfs_rmdir_first(struct inode *dir, struct dentry *dentry,
/* avoid destroying the lower inode if the file is in use */
dget(lower_dentry);
err = is_robranch(dentry);
- if (!err)
+ if (!err) {
+ /* see Documentation/filesystems/unionfs/issues.txt */
+ lockdep_off();
err = vfs_rmdir(lower_dir_dentry->d_inode, lower_dentry);
+ lockdep_on();
+ }
dput(lower_dentry);
fsstack_copy_attr_times(dir, lower_dir_dentry->d_inode);
--
1.5.2.2
When creating a new special file, always create it in the first branch,
which is always writeable, not in the branch which may have a whiteout in
it. This makes the policy for the creation of new special files consistent
with that of new files/directories, as well as improves efficiency a bit.
Signed-off-by: Erez Zadok <[email protected]>
---
fs/unionfs/inode.c | 174 +++++++++++++++++++++++++++------------------------
1 files changed, 92 insertions(+), 82 deletions(-)
diff --git a/fs/unionfs/inode.c b/fs/unionfs/inode.c
index 63ff3d3..8076d0b 100644
--- a/fs/unionfs/inode.c
+++ b/fs/unionfs/inode.c
@@ -686,15 +686,15 @@ out:
return err;
}
-static int unionfs_mknod(struct inode *dir, struct dentry *dentry, int mode,
+static int unionfs_mknod(struct inode *parent, struct dentry *dentry, int mode,
dev_t dev)
{
int err = 0;
- struct dentry *lower_dentry = NULL, *whiteout_dentry = NULL;
+ struct dentry *lower_dentry = NULL;
+ struct dentry *wh_dentry = NULL;
struct dentry *lower_parent_dentry = NULL;
- int bindex = 0, bstart;
char *name = NULL;
- int whiteout_unlinked = 0;
+ int valid = 0;
unionfs_read_lock(dentry->d_sb);
unionfs_lock_dentry(dentry);
@@ -705,115 +705,125 @@ static int unionfs_mknod(struct inode *dir, struct dentry *dentry, int mode,
goto out;
}
- bstart = dbstart(dentry);
-
- lower_dentry = unionfs_lower_dentry(dentry);
+ /*
+ * It's only a bug if this dentry was not negative and couldn't be
+ * revalidated (shouldn't happen).
+ */
+ BUG_ON(!valid && dentry->d_inode);
/*
- * check if whiteout exists in this branch, i.e. lookup .wh.foo
- * first.
+ * We shouldn't create things in a read-only branch; this check is a
+ * bit redundant as we don't allow branch 0 to be read-only at the
+ * moment
*/
- name = alloc_whname(dentry->d_name.name, dentry->d_name.len);
- if (unlikely(IS_ERR(name))) {
- err = PTR_ERR(name);
+ err = is_robranch_super(dentry->d_sb, 0);
+ if (err) {
+ err = -EROFS;
goto out;
}
- whiteout_dentry = lookup_one_len(name, lower_dentry->d_parent,
- dentry->d_name.len + UNIONFS_WHLEN);
- if (IS_ERR(whiteout_dentry)) {
- err = PTR_ERR(whiteout_dentry);
- goto out;
- }
+ /*
+ * We _always_ create on branch 0
+ */
+ lower_dentry = unionfs_lower_dentry_idx(dentry, 0);
+ if (lower_dentry) {
+ /*
+ * check if whiteout exists in this branch, i.e. lookup .wh.foo
+ * first.
+ */
+ name = alloc_whname(dentry->d_name.name, dentry->d_name.len);
+ if (unlikely(IS_ERR(name))) {
+ err = PTR_ERR(name);
+ goto out;
+ }
- if (!whiteout_dentry->d_inode) {
- dput(whiteout_dentry);
- whiteout_dentry = NULL;
- } else {
- /* found .wh.foo, unlink it */
- lower_parent_dentry = lock_parent(whiteout_dentry);
+ wh_dentry = lookup_one_len(name, lower_dentry->d_parent,
+ dentry->d_name.len + UNIONFS_WHLEN);
+ if (IS_ERR(wh_dentry)) {
+ err = PTR_ERR(wh_dentry);
+ wh_dentry = NULL;
+ goto out;
+ }
- /* found a.wh.foo entry, remove it then do vfs_mkdir */
- err = is_robranch_super(dentry->d_sb, bstart);
- if (!err)
- err = vfs_unlink(lower_parent_dentry->d_inode,
- whiteout_dentry);
- dput(whiteout_dentry);
+ if (wh_dentry->d_inode) {
+ /*
+ * .wh.foo has been found, so let's unlink it
+ */
+ struct dentry *lower_dir_dentry;
- unlock_dir(lower_parent_dentry);
+ lower_dir_dentry = lock_parent(wh_dentry);
+ err = vfs_unlink(lower_dir_dentry->d_inode, wh_dentry);
+ unlock_dir(lower_dir_dentry);
- if (err) {
- if (!IS_COPYUP_ERR(err))
+ /*
+ * Whiteouts are special files and should be deleted
+ * no matter what (as if they never existed), in
+ * order to allow this create operation to succeed.
+ * This is especially important in sticky
+ * directories: a whiteout may have been created by
+ * one user, but the newly created file may be
+ * created by another user. Therefore, in order to
+ * maintain Unix semantics, if the vfs_unlink above
+ * ailed, then we have to try to directly unlink the
+ * whiteout. Note: in the ODF version of unionfs,
+ * whiteout are handled much more cleanly.
+ */
+ if (err == -EPERM) {
+ struct inode *inode = lower_dir_dentry->d_inode;
+ err = inode->i_op->unlink(inode, wh_dentry);
+ }
+ if (err) {
+ printk(KERN_ERR "unionfs: mknod: could not "
+ "unlink whiteout, err = %d\n", err);
goto out;
- bstart--;
- } else {
- whiteout_unlinked = 1;
- }
- }
-
- for (bindex = bstart; bindex >= 0; bindex--) {
- if (is_robranch_super(dentry->d_sb, bindex))
- continue;
-
- lower_dentry = unionfs_lower_dentry_idx(dentry, bindex);
- if (!lower_dentry) {
- lower_dentry = create_parents(dir, dentry,
- dentry->d_name.name,
- bindex);
- if (IS_ERR(lower_dentry)) {
- printk(KERN_ERR "unionfs: failed to create "
- "parents on %d, err = %ld\n",
- bindex, PTR_ERR(lower_dentry));
- continue;
}
}
-
- lower_parent_dentry = lock_parent(lower_dentry);
- if (IS_ERR(lower_parent_dentry)) {
- err = PTR_ERR(lower_parent_dentry);
+ } else {
+ /*
+ * if lower_dentry is NULL, create the entire
+ * dentry directory structure in branch 0.
+ */
+ lower_dentry = create_parents(parent, dentry,
+ dentry->d_name.name, 0);
+ if (IS_ERR(lower_dentry)) {
+ err = PTR_ERR(lower_dentry);
goto out;
}
+ }
- err = vfs_mknod(lower_parent_dentry->d_inode,
- lower_dentry, mode, dev);
-
- if (err) {
- unlock_dir(lower_parent_dentry);
- break;
- }
+ lower_parent_dentry = lock_parent(lower_dentry);
+ if (IS_ERR(lower_parent_dentry)) {
+ err = PTR_ERR(lower_parent_dentry);
+ goto out;
+ }
- /*
- * Only INTERPOSE_LOOKUP can return a value other than 0 on
- * err.
- */
- err = PTR_ERR(unionfs_interpose(dentry, dir->i_sb, 0));
+ err = vfs_mknod(lower_parent_dentry->d_inode, lower_dentry, mode, dev);
+ if (!err) {
+ err = PTR_ERR(unionfs_interpose(dentry, parent->i_sb, 0));
if (!err) {
- fsstack_copy_attr_times(dir,
- lower_parent_dentry->d_inode);
- fsstack_copy_inode_size(dir,
+ unionfs_copy_attr_times(parent);
+ fsstack_copy_inode_size(parent,
lower_parent_dentry->d_inode);
- /* update number of links on parent directory */
- dir->i_nlink = unionfs_get_nlinks(dir);
+ /* update no. of links on parent directory */
+ parent->i_nlink = unionfs_get_nlinks(parent);
}
- unlock_dir(lower_parent_dentry);
-
- break;
}
-out:
- if (!dentry->d_inode)
- d_drop(dentry);
+ unlock_dir(lower_parent_dentry);
+out:
+ dput(wh_dentry);
kfree(name);
if (!err)
unionfs_postcopyup_setmnt(dentry);
- unionfs_check_inode(dir);
+ unionfs_check_inode(parent);
+ if (!err)
+ unionfs_check_dentry(dentry->d_parent);
unionfs_check_dentry(dentry);
unionfs_unlock_dentry(dentry);
unionfs_read_unlock(dentry->d_sb);
-
return err;
}
--
1.5.2.2
Set it to 1 ns, because we could be stacked on top of file systems with such
granularity.
Signed-off-by: Erez Zadok <[email protected]>
---
fs/unionfs/main.c | 7 +++++++
1 files changed, 7 insertions(+), 0 deletions(-)
diff --git a/fs/unionfs/main.c b/fs/unionfs/main.c
index 92f0e9d..23c18f7 100644
--- a/fs/unionfs/main.c
+++ b/fs/unionfs/main.c
@@ -642,6 +642,13 @@ static int unionfs_read_super(struct super_block *sb, void *raw_data,
/* max Bytes is the maximum bytes from highest priority branch */
sb->s_maxbytes = unionfs_lower_super_idx(sb, 0)->s_maxbytes;
+ /*
+ * Our c/m/atime granularity is 1 ns because we may stack on file
+ * systems whose granularity is as good. This is important for our
+ * time-based cache coherency.
+ */
+ sb->s_time_gran = 1;
+
sb->s_op = &unionfs_sops;
/* See comment next to the definition of unionfs_d_alloc_root */
--
1.5.2.2
Signed-off-by: Erez Zadok <[email protected]>
---
fs/unionfs/dirhelper.c | 9 +++------
1 files changed, 3 insertions(+), 6 deletions(-)
diff --git a/fs/unionfs/dirhelper.c b/fs/unionfs/dirhelper.c
index b40090a..4b73bb6 100644
--- a/fs/unionfs/dirhelper.c
+++ b/fs/unionfs/dirhelper.c
@@ -182,6 +182,7 @@ int check_empty(struct dentry *dentry, struct unionfs_dir_state **namelist)
{
int err = 0;
struct dentry *lower_dentry = NULL;
+ struct vfsmount *mnt;
struct super_block *sb;
struct file *lower_file;
struct unionfs_rdutil_callback *buf = NULL;
@@ -226,15 +227,11 @@ int check_empty(struct dentry *dentry, struct unionfs_dir_state **namelist)
continue;
dget(lower_dentry);
- unionfs_mntget(dentry, bindex);
+ mnt = unionfs_mntget(dentry, bindex);
branchget(sb, bindex);
- lower_file =
- dentry_open(lower_dentry,
- unionfs_lower_mnt_idx(dentry, bindex),
- O_RDONLY);
+ lower_file = dentry_open(lower_dentry, mnt, O_RDONLY);
if (IS_ERR(lower_file)) {
err = PTR_ERR(lower_file);
- dput(lower_dentry);
branchput(sb, bindex);
goto out;
}
--
1.5.2.2
Fix unionfs_interpose to fill lower inode info when d_splice_alias returns
NULL. Also cleanup impossible case (d_splice_alias doesn't return ERR_PTR).
Signed-off-by: Rachita Kothiyal <[email protected]>
Signed-off-by: Erez Zadok <[email protected]>
---
fs/unionfs/main.c | 10 +++++++---
1 files changed, 7 insertions(+), 3 deletions(-)
diff --git a/fs/unionfs/main.c b/fs/unionfs/main.c
index 22aa6e6..ea8976d 100644
--- a/fs/unionfs/main.c
+++ b/fs/unionfs/main.c
@@ -149,9 +149,7 @@ skip:
break;
case INTERPOSE_LOOKUP:
spliced = d_splice_alias(inode, dentry);
- if (IS_ERR(spliced)) {
- err = PTR_ERR(spliced);
- } else if (spliced && spliced != dentry) {
+ if (spliced && spliced != dentry) {
/*
* d_splice can return a dentry if it was
* disconnected and had to be moved. We must ensure
@@ -169,6 +167,12 @@ skip:
unionfs_fill_inode(dentry, inode);
}
goto out_spliced;
+ } else if (!spliced) {
+ if (need_fill_inode) {
+ need_fill_inode = 0;
+ unionfs_fill_inode(dentry, inode);
+ goto out_spliced;
+ }
}
break;
case INTERPOSE_REVAL:
--
1.5.2.2
Signed-off-by: Erez Zadok <[email protected]>
---
fs/unionfs/commonfops.c | 7 +++++--
1 files changed, 5 insertions(+), 2 deletions(-)
diff --git a/fs/unionfs/commonfops.c b/fs/unionfs/commonfops.c
index 4077907..b8357a7 100644
--- a/fs/unionfs/commonfops.c
+++ b/fs/unionfs/commonfops.c
@@ -583,10 +583,13 @@ out:
kfree(UNIONFS_F(file));
}
out_nofree:
- unionfs_check_inode(inode);
if (!err) {
+ dentry = file->f_path.dentry;
+ unionfs_copy_attr_times(dentry->d_parent->d_inode);
+ unionfs_copy_attr_times(inode);
unionfs_check_file(file);
- unionfs_check_dentry(file->f_path.dentry->d_parent);
+ unionfs_check_dentry(dentry->d_parent);
+ unionfs_check_inode(inode);
}
unionfs_read_unlock(inode->i_sb);
return err;
--
1.5.2.2
From: Hugh Dickins <[email protected]>
Remove !mapping_cap_writeback_dirty shortcircuit from unionfs_writepages.
It was introduced to avoid the stray AOP_WRITEPAGE_ACTIVATE coming from
shmem_writepage; but that has since been fixed in shmem_writepage and in
write_cache_pages. It stayed because it looked like a good optimization,
not to waste time calling down to tmpfs when that would serve no purpose.
But in fact this optimization causes hangs when running LTP with unionfs
over tmpfs. The problem is that the test comes at the wrong level: unionfs
has already declared in its default_backing_dev_info that it's playing by
cap_writeback_dirty rules. If it does nothing here in its writepages, its
dirty pages accumulate and choke the system. What's needed is to carry on
down and let its pages be cleaned while in turn they dirty the lower level.
And this now has an additional benefit for tmpfs, that a sync or pdflush
pushes these pages down to shmem_writepage, letting it match the filepage
coming from unionfs with the swap which may have been allocated earlier,
so it can free the duplication sooner than waiting for further pressure.
Remove unnecessary locking/code from prepare_write. Handle if no lower
inodes in writepage.
Signed-off-by: Hugh Dickins <[email protected]>
Signed-off-by: Erez Zadok <[email protected]>
---
fs/unionfs/mmap.c | 29 +++++++++--------------------
1 files changed, 9 insertions(+), 20 deletions(-)
diff --git a/fs/unionfs/mmap.c b/fs/unionfs/mmap.c
index 4d05352..aad2137 100644
--- a/fs/unionfs/mmap.c
+++ b/fs/unionfs/mmap.c
@@ -30,6 +30,11 @@ static int unionfs_writepage(struct page *page, struct writeback_control *wbc)
BUG_ON(!PageUptodate(page));
inode = page->mapping->host;
+ /* if no lower inode, nothing to do */
+ if (!inode || !UNIONFS_I(inode) || UNIONFS_I(inode)->lower_inodes) {
+ err = 0;
+ goto out;
+ }
lower_inode = unionfs_lower_inode(inode);
lower_mapping = lower_inode->i_mapping;
@@ -130,9 +135,6 @@ static int unionfs_writepages(struct address_space *mapping,
if (!lower_inode)
goto out;
- if (!mapping_cap_writeback_dirty(lower_inode->i_mapping))
- goto out;
-
err = generic_writepages(mapping, wbc);
if (!err)
unionfs_copy_attr_times(inode);
@@ -222,26 +224,13 @@ out:
static int unionfs_prepare_write(struct file *file, struct page *page,
unsigned from, unsigned to)
{
- int err;
-
- unionfs_read_lock(file->f_path.dentry->d_sb);
/*
- * This is the only place where we unconditionally copy the lower
- * attribute times before calling unionfs_file_revalidate. The
- * reason is that our ->write calls do_sync_write which in turn will
- * call our ->prepare_write and then ->commit_write. Before our
- * ->write is called, the lower mtimes are in sync, but by the time
- * the VFS calls our ->commit_write, the lower mtimes have changed.
- * Therefore, the only reasonable time for us to sync up from the
- * changed lower mtimes, and avoid an invariant violation warning,
- * is here, in ->prepare_write.
+ * Just copy lower inode attributes and return success. Not much
+ * else to do here. No need to lock either (lockdep won't like it).
+ * Let commit_write do all the hard work instead.
*/
unionfs_copy_attr_times(file->f_path.dentry->d_inode);
- err = unionfs_file_revalidate(file, true);
- unionfs_check_file(file);
- unionfs_read_unlock(file->f_path.dentry->d_sb);
-
- return err;
+ return 0;
}
static int unionfs_commit_write(struct file *file, struct page *page,
--
1.5.2.2
Signed-off-by: Erez Zadok <[email protected]>
---
fs/unionfs/inode.c | 6 ++----
1 files changed, 2 insertions(+), 4 deletions(-)
diff --git a/fs/unionfs/inode.c b/fs/unionfs/inode.c
index 37258c8..7ec9c1b 100644
--- a/fs/unionfs/inode.c
+++ b/fs/unionfs/inode.c
@@ -851,7 +851,8 @@ out:
* nor do we need to revalidate it either. It is safe to not lock our
* dentry here, nor revalidate it, because unionfs_follow_link does not do
* anything (prior to calling ->readlink) which could become inconsistent
- * due to branch management.
+ * due to branch management. We also don't need to lock our super because
+ * this function isn't affected by branch-management.
*/
static void *unionfs_follow_link(struct dentry *dentry, struct nameidata *nd)
{
@@ -859,8 +860,6 @@ static void *unionfs_follow_link(struct dentry *dentry, struct nameidata *nd)
int len = PAGE_SIZE, err;
mm_segment_t old_fs;
- unionfs_read_lock(dentry->d_sb);
-
/* This is freed by the put_link method assuming a successful call. */
buf = kmalloc(len, GFP_KERNEL);
if (unlikely(!buf)) {
@@ -885,7 +884,6 @@ static void *unionfs_follow_link(struct dentry *dentry, struct nameidata *nd)
out:
unionfs_check_dentry(dentry);
unionfs_check_nd(nd);
- unionfs_read_unlock(dentry->d_sb);
return ERR_PTR(err);
}
--
1.5.2.2
Our read_inode doesn't need to grab the superblock rwsem because there no
chance it could be affected by branch management. But our read_inode was
called from other places which did grab need to grab that rwsem, and lockdep
complained.
Signed-off-by: Erez Zadok <[email protected]>
---
fs/unionfs/super.c | 3 ---
1 files changed, 0 insertions(+), 3 deletions(-)
diff --git a/fs/unionfs/super.c b/fs/unionfs/super.c
index c474c86..8b70aca 100644
--- a/fs/unionfs/super.c
+++ b/fs/unionfs/super.c
@@ -29,8 +29,6 @@ static void unionfs_read_inode(struct inode *inode)
int size;
struct unionfs_inode_info *info = UNIONFS_I(inode);
- unionfs_read_lock(inode->i_sb);
-
memset(info, 0, offsetof(struct unionfs_inode_info, vfs_inode));
info->bstart = -1;
info->bend = -1;
@@ -63,7 +61,6 @@ static void unionfs_read_inode(struct inode *inode)
inode->i_mtime.tv_sec = inode->i_mtime.tv_nsec = 0;
inode->i_ctime.tv_sec = inode->i_ctime.tv_nsec = 0;
- unionfs_read_unlock(inode->i_sb);
}
/*
--
1.5.2.2
If we copyup a special file (char, block, etc.), then dput the source
object.
Signed-off-by: Erez Zadok <[email protected]>
---
fs/unionfs/copyup.c | 7 +++----
1 files changed, 3 insertions(+), 4 deletions(-)
diff --git a/fs/unionfs/copyup.c b/fs/unionfs/copyup.c
index 3fe4865..f48209f 100644
--- a/fs/unionfs/copyup.c
+++ b/fs/unionfs/copyup.c
@@ -505,13 +505,12 @@ out_unlock:
out_free:
/*
- * If old_lower_dentry was a directory, we need to dput it. If it
- * was a file, then it was already dput indirectly by other
+ * If old_lower_dentry was not a file, then we need to dput it. If
+ * it was a file, then it was already dput indirectly by other
* functions we call above which operate on regular files.
*/
if (old_lower_dentry && old_lower_dentry->d_inode &&
- (S_ISDIR(old_lower_dentry->d_inode->i_mode) ||
- S_ISLNK(old_lower_dentry->d_inode->i_mode)))
+ !S_ISREG(old_lower_dentry->d_inode->i_mode))
dput(old_lower_dentry);
kfree(symbuf);
--
1.5.2.2
Parent dentries may not be locked and may change, so don't check them. But
do check parent inodes if they are passed to the method. Also, ensure the
checks are done only if no error occurred.
Signed-off-by: Erez Zadok <[email protected]>
---
fs/unionfs/commonfops.c | 1 -
fs/unionfs/file.c | 1 -
fs/unionfs/inode.c | 23 +++++++++++------------
3 files changed, 11 insertions(+), 14 deletions(-)
diff --git a/fs/unionfs/commonfops.c b/fs/unionfs/commonfops.c
index b8357a7..2c32ada 100644
--- a/fs/unionfs/commonfops.c
+++ b/fs/unionfs/commonfops.c
@@ -588,7 +588,6 @@ out_nofree:
unionfs_copy_attr_times(dentry->d_parent->d_inode);
unionfs_copy_attr_times(inode);
unionfs_check_file(file);
- unionfs_check_dentry(dentry->d_parent);
unionfs_check_inode(inode);
}
unionfs_read_unlock(inode->i_sb);
diff --git a/fs/unionfs/file.c b/fs/unionfs/file.c
index b632042..0c424f6 100644
--- a/fs/unionfs/file.c
+++ b/fs/unionfs/file.c
@@ -66,7 +66,6 @@ out:
/* copyup could cause parent dir times to change */
unionfs_copy_attr_times(file->f_path.dentry->d_parent->d_inode);
unionfs_check_file(file);
- unionfs_check_dentry(file->f_path.dentry->d_parent);
}
unionfs_read_unlock(file->f_path.dentry->d_sb);
return err;
diff --git a/fs/unionfs/inode.c b/fs/unionfs/inode.c
index 740d364..6095c4f 100644
--- a/fs/unionfs/inode.c
+++ b/fs/unionfs/inode.c
@@ -157,7 +157,6 @@ out:
unionfs_check_inode(parent);
if (!err) {
- unionfs_check_dentry(dentry->d_parent);
unionfs_check_dentry(dentry);
unionfs_check_nd(nd);
}
@@ -207,14 +206,16 @@ static struct dentry *unionfs_lookup(struct inode *parent,
}
unionfs_check_inode(parent);
- unionfs_check_dentry(dentry);
- unionfs_check_dentry(dentry->d_parent);
- unionfs_check_nd(nd);
- if (!IS_ERR(ret))
+ if (!IS_ERR(ret)) {
+ unionfs_check_dentry(dentry);
+ unionfs_check_nd(nd);
unionfs_unlock_dentry(dentry);
+ }
- if (dentry != dentry->d_parent)
+ if (dentry != dentry->d_parent) {
+ unionfs_check_dentry(dentry->d_parent);
unionfs_unlock_dentry(dentry->d_parent);
+ }
unionfs_read_unlock(dentry->d_sb);
return ret;
@@ -520,8 +521,7 @@ out:
unionfs_check_inode(parent);
if (!err)
- unionfs_check_dentry(dentry->d_parent);
- unionfs_check_dentry(dentry);
+ unionfs_check_dentry(dentry);
unionfs_unlock_dentry(dentry);
unionfs_read_unlock(dentry->d_sb);
return err;
@@ -815,8 +815,7 @@ out:
unionfs_check_inode(parent);
if (!err)
- unionfs_check_dentry(dentry->d_parent);
- unionfs_check_dentry(dentry);
+ unionfs_check_dentry(dentry);
unionfs_unlock_dentry(dentry);
unionfs_read_unlock(dentry->d_sb);
return err;
@@ -1110,8 +1109,8 @@ static int unionfs_setattr(struct dentry *dentry, struct iattr *ia)
/* if setattr succeeded, then parent dir may have changed */
unionfs_copy_attr_times(dentry->d_parent->d_inode);
out:
- unionfs_check_dentry(dentry);
- unionfs_check_dentry(dentry->d_parent);
+ if (!err)
+ unionfs_check_dentry(dentry);
unionfs_unlock_dentry(dentry);
unionfs_read_unlock(dentry->d_sb);
--
1.5.2.2
Signed-off-by: Erez Zadok <[email protected]>
---
fs/unionfs/rename.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/fs/unionfs/rename.c b/fs/unionfs/rename.c
index 1019d47..9306a2b 100644
--- a/fs/unionfs/rename.c
+++ b/fs/unionfs/rename.c
@@ -462,7 +462,7 @@ int unionfs_rename(struct inode *old_dir, struct dentry *old_dentry,
}
if (S_ISDIR(new_dentry->d_inode->i_mode)) {
- struct unionfs_dir_state *namelist;
+ struct unionfs_dir_state *namelist = NULL;
/* check if this unionfs directory is empty or not */
err = check_empty(new_dentry, &namelist);
if (err)
--
1.5.2.2
Signed-off-by: Erez Zadok <[email protected]>
---
fs/unionfs/unlink.c | 11 +++++++++--
1 files changed, 9 insertions(+), 2 deletions(-)
diff --git a/fs/unionfs/unlink.c b/fs/unionfs/unlink.c
index 1e81494..a1c82b6 100644
--- a/fs/unionfs/unlink.c
+++ b/fs/unionfs/unlink.c
@@ -91,7 +91,9 @@ out:
int unionfs_unlink(struct inode *dir, struct dentry *dentry)
{
int err = 0;
+ struct inode *inode = dentry->d_inode;
+ BUG_ON(S_ISDIR(inode->i_mode));
unionfs_read_lock(dentry->d_sb, UNIONFS_SMUTEX_CHILD);
unionfs_lock_dentry(dentry, UNIONFS_DMUTEX_CHILD);
@@ -104,8 +106,13 @@ int unionfs_unlink(struct inode *dir, struct dentry *dentry)
err = unionfs_unlink_whiteout(dir, dentry);
/* call d_drop so the system "forgets" about us */
if (!err) {
- if (!S_ISDIR(dentry->d_inode->i_mode))
- unionfs_postcopyup_release(dentry);
+ unionfs_postcopyup_release(dentry);
+ if (inode->i_nlink == 0) {
+ /* drop lower inodes */
+ iput(unionfs_lower_inode(inode));
+ unionfs_set_lower_inode(inode, NULL);
+ ibstart(inode) = ibend(inode) = -1;
+ }
d_drop(dentry);
/*
* if unlink/whiteout succeeded, parent dir mtime has
--
1.5.2.2
This is needed to drop lower objects early enough, under certain conditions,
so the lower objects don't stay behind until umount(). [LTP testing]
Signed-off-by: Erez Zadok <[email protected]>
---
fs/unionfs/dentry.c | 42 ++++++++++++++++++++++++++++++++++++++++++
1 files changed, 42 insertions(+), 0 deletions(-)
diff --git a/fs/unionfs/dentry.c b/fs/unionfs/dentry.c
index 0e89308..0369d93 100644
--- a/fs/unionfs/dentry.c
+++ b/fs/unionfs/dentry.c
@@ -484,7 +484,49 @@ out:
return;
}
+/*
+ * Called when we're removing the last reference to our dentry. So we
+ * should drop all lower references too.
+ */
+static void unionfs_d_iput(struct dentry *dentry, struct inode *inode)
+{
+ int bindex, rc;
+
+ unionfs_read_lock(dentry->d_sb, UNIONFS_SMUTEX_CHILD);
+
+ if (dbstart(dentry) < 0)
+ goto drop_lower_inodes;
+ for (bindex = dbstart(dentry); bindex <= dbend(dentry); bindex++) {
+ if (unionfs_lower_mnt_idx(dentry, bindex)) {
+ unionfs_mntput(dentry, bindex);
+ unionfs_set_lower_mnt_idx(dentry, bindex, NULL);
+ }
+ if (unionfs_lower_dentry_idx(dentry, bindex)) {
+ dput(unionfs_lower_dentry_idx(dentry, bindex));
+ unionfs_set_lower_dentry_idx(dentry, bindex, NULL);
+ }
+ }
+ set_dbstart(dentry, -1);
+ set_dbend(dentry, -1);
+
+drop_lower_inodes:
+ rc = atomic_read(&inode->i_count);
+ if (rc == 1 && inode->i_nlink == 1 && ibstart(inode) >= 0) {
+ /* see Documentation/filesystems/unionfs/issues.txt */
+ lockdep_off();
+ iput(unionfs_lower_inode(inode));
+ lockdep_on();
+ unionfs_set_lower_inode(inode, NULL);
+ /* XXX: may need to set start/end to -1? */
+ }
+
+ iput(inode);
+
+ unionfs_read_unlock(dentry->d_sb);
+}
+
struct dentry_operations unionfs_dops = {
.d_revalidate = unionfs_d_revalidate,
.d_release = unionfs_d_release,
+ .d_iput = unionfs_d_iput,
};
--
1.5.2.2
To avoid too much code nesting.
Signed-off-by: Erez Zadok <[email protected]>
---
fs/unionfs/rename.c | 83 ++++++++++++++++++++++++--------------------------
1 files changed, 40 insertions(+), 43 deletions(-)
diff --git a/fs/unionfs/rename.c b/fs/unionfs/rename.c
index 824c285..1019d47 100644
--- a/fs/unionfs/rename.c
+++ b/fs/unionfs/rename.c
@@ -479,54 +479,51 @@ int unionfs_rename(struct inode *old_dir, struct dentry *old_dentry,
goto out;
}
}
+
err = do_unionfs_rename(old_dir, old_dentry, new_dir, new_dentry);
-out:
- if (err) {
- /* clear the new_dentry stuff created */
- d_drop(new_dentry);
- } else {
- /*
- * force re-lookup since the dir on ro branch is not renamed,
- * and lower dentries still indicate the un-renamed ones.
- */
- if (S_ISDIR(old_dentry->d_inode->i_mode))
- atomic_dec(&UNIONFS_D(old_dentry)->generation);
- else
- unionfs_postcopyup_release(old_dentry);
- if (new_dentry->d_inode &&
- !S_ISDIR(new_dentry->d_inode->i_mode)) {
- unionfs_postcopyup_release(new_dentry);
- unionfs_postcopyup_setmnt(new_dentry);
- if (!unionfs_lower_inode(new_dentry->d_inode)) {
- /*
- * If we get here, it means that no copyup
- * was needed, and that a file by the old
- * name already existing on the destination
- * branch; that file got renamed earlier in
- * this function, so all we need to do here
- * is set the lower inode.
- */
- struct inode *inode;
- inode = unionfs_lower_inode(
- old_dentry->d_inode);
- igrab(inode);
- unionfs_set_lower_inode_idx(
- new_dentry->d_inode,
- dbstart(new_dentry), inode);
- }
+ if (err)
+ goto out;
+ /*
+ * force re-lookup since the dir on ro branch is not renamed, and
+ * lower dentries still indicate the un-renamed ones.
+ */
+ if (S_ISDIR(old_dentry->d_inode->i_mode))
+ atomic_dec(&UNIONFS_D(old_dentry)->generation);
+ else
+ unionfs_postcopyup_release(old_dentry);
+ if (new_dentry->d_inode && !S_ISDIR(new_dentry->d_inode->i_mode)) {
+ unionfs_postcopyup_release(new_dentry);
+ unionfs_postcopyup_setmnt(new_dentry);
+ if (!unionfs_lower_inode(new_dentry->d_inode)) {
+ /*
+ * If we get here, it means that no copyup was
+ * needed, and that a file by the old name already
+ * existing on the destination branch; that file got
+ * renamed earlier in this function, so all we need
+ * to do here is set the lower inode.
+ */
+ struct inode *inode;
+ inode = unionfs_lower_inode(old_dentry->d_inode);
+ igrab(inode);
+ unionfs_set_lower_inode_idx(new_dentry->d_inode,
+ dbstart(new_dentry),
+ inode);
}
- /* if all of this renaming succeeded, update our times */
- unionfs_copy_attr_times(old_dir);
- unionfs_copy_attr_times(new_dir);
- unionfs_copy_attr_times(old_dentry->d_inode);
- unionfs_copy_attr_times(new_dentry->d_inode);
- unionfs_check_inode(old_dir);
- unionfs_check_inode(new_dir);
- unionfs_check_dentry(old_dentry);
- unionfs_check_dentry(new_dentry);
}
+ /* if all of this renaming succeeded, update our times */
+ unionfs_copy_attr_times(old_dir);
+ unionfs_copy_attr_times(new_dir);
+ unionfs_copy_attr_times(old_dentry->d_inode);
+ unionfs_copy_attr_times(new_dentry->d_inode);
+ unionfs_check_inode(old_dir);
+ unionfs_check_inode(new_dir);
+ unionfs_check_dentry(old_dentry);
+ unionfs_check_dentry(new_dentry);
+out:
+ if (err) /* clear the new_dentry stuff created */
+ d_drop(new_dentry);
unionfs_unlock_dentry(new_dentry);
unionfs_unlock_dentry(old_dentry);
unionfs_read_unlock(old_dentry->d_sb);
--
1.5.2.2
Lockdep complained, because we eventually call vfs_unlink which'd grab the
necessary locks.
Signed-off-by: Erez Zadok <[email protected]>
---
fs/unionfs/dirhelper.c | 2 --
1 files changed, 0 insertions(+), 2 deletions(-)
diff --git a/fs/unionfs/dirhelper.c b/fs/unionfs/dirhelper.c
index 2e52fc3..b40090a 100644
--- a/fs/unionfs/dirhelper.c
+++ b/fs/unionfs/dirhelper.c
@@ -110,7 +110,6 @@ int delete_whiteouts(struct dentry *dentry, int bindex,
lower_dir = lower_dir_dentry->d_inode;
BUG_ON(!S_ISDIR(lower_dir->i_mode));
- mutex_lock(&lower_dir->i_mutex);
if (!permission(lower_dir, MAY_WRITE | MAY_EXEC, NULL)) {
err = do_delete_whiteouts(dentry, bindex, namelist);
} else {
@@ -120,7 +119,6 @@ int delete_whiteouts(struct dentry *dentry, int bindex,
run_sioq(__delete_whiteouts, &args);
err = args.err;
}
- mutex_unlock(&lower_dir->i_mutex);
out:
return err;
--
1.5.2.2
From: Hugh Dickins <[email protected]>
Restructure the code to move the lower notify_change out of the loop in
unionfs_setattr. Cleanup and simplify the code. Then fix the truncation
order which fsx-linux in a unionfs on tmpfs found. Then handle copyup
properly.
When shrinking a file, unionfs_setattr needs to vmtruncate the upper level
before notifying change to the lower level, to eliminate those dirty pages
beyond new eof which otherwise drift down to the lower level's writepage,
writing beyond its eof (and later uncovered when the file is expanded).
Also truncate the upper level first when expanding, in the case when
the upper level's s_maxbytes is more limiting than the lower level's.
Signed-off-by: Hugh Dickins <[email protected]>
Signed-off-by: Erez Zadok <[email protected]>
---
fs/unionfs/inode.c | 97 ++++++++++++++++++++++++++-------------------------
1 files changed, 49 insertions(+), 48 deletions(-)
diff --git a/fs/unionfs/inode.c b/fs/unionfs/inode.c
index 78cdfa2..37258c8 100644
--- a/fs/unionfs/inode.c
+++ b/fs/unionfs/inode.c
@@ -998,11 +998,10 @@ static int unionfs_setattr(struct dentry *dentry, struct iattr *ia)
{
int err = 0;
struct dentry *lower_dentry;
- struct inode *inode = NULL;
- struct inode *lower_inode = NULL;
+ struct inode *inode;
+ struct inode *lower_inode;
int bstart, bend, bindex;
- int i;
- int copyup = 0;
+ loff_t size;
unionfs_read_lock(dentry->d_sb);
unionfs_lock_dentry(dentry);
@@ -1023,62 +1022,64 @@ static int unionfs_setattr(struct dentry *dentry, struct iattr *ia)
if (ia->ia_valid & (ATTR_KILL_SUID | ATTR_KILL_SGID))
ia->ia_valid &= ~ATTR_MODE;
- for (bindex = bstart; (bindex <= bend) || (bindex == bstart);
- bindex++) {
- lower_dentry = unionfs_lower_dentry_idx(dentry, bindex);
- if (!lower_dentry)
- continue;
- BUG_ON(lower_dentry->d_inode == NULL);
-
- /* If the file is on a read only branch */
- if (is_robranch_super(dentry->d_sb, bindex)
- || IS_RDONLY(lower_dentry->d_inode)) {
- if (copyup || (bindex != bstart))
- continue;
- /* Only if its the leftmost file, copyup the file */
- for (i = bstart - 1; i >= 0; i--) {
- loff_t size = i_size_read(dentry->d_inode);
- if (ia->ia_valid & ATTR_SIZE)
- size = ia->ia_size;
- err = copyup_dentry(dentry->d_parent->d_inode,
- dentry, bstart, i,
- dentry->d_name.name,
- dentry->d_name.len,
- NULL, size);
-
- if (!err) {
- copyup = 1;
- lower_dentry =
- unionfs_lower_dentry(dentry);
- break;
- }
- /*
- * if error is in the leftmost branch, pass
- * it up.
- */
- if (i == 0)
- goto out;
- }
+ lower_dentry = unionfs_lower_dentry(dentry);
+ BUG_ON(!lower_dentry); /* should never happen after above revalidate */
+
+ /* copyup if the file is on a read only branch */
+ if (is_robranch_super(dentry->d_sb, bstart)
+ || IS_RDONLY(lower_dentry->d_inode)) {
+ /* check if we have a branch to copy up to */
+ if (bstart <= 0) {
+ err = -EACCES;
+ goto out;
+ }
+ if (ia->ia_valid & ATTR_SIZE)
+ size = ia->ia_size;
+ else
+ size = i_size_read(inode);
+ /* copyup to next available branch */
+ for (bindex = bstart - 1; bindex >= 0; bindex--) {
+ err = copyup_dentry(dentry->d_parent->d_inode,
+ dentry, bstart, bindex,
+ dentry->d_name.name,
+ dentry->d_name.len,
+ NULL, size);
+ if (!err)
+ break;
}
- err = notify_change(lower_dentry, ia);
if (err)
goto out;
- break;
+ /* get updated lower_dentry after copyup */
+ lower_dentry = unionfs_lower_dentry(dentry);
}
- /* for mmap */
+ lower_inode = unionfs_lower_inode(inode);
+
+ /*
+ * If shrinking, first truncate upper level to cancel writing dirty
+ * pages beyond the new eof; and also if its' maxbytes is more
+ * limiting (fail with -EFBIG before making any change to the lower
+ * level). There is no need to vmtruncate the upper level
+ * afterwards in the other cases: we fsstack_copy_inode_size from
+ * the lower level.
+ */
if (ia->ia_valid & ATTR_SIZE) {
- if (ia->ia_size != i_size_read(inode)) {
+ size = i_size_read(inode);
+ if (ia->ia_size < size || (ia->ia_size > size &&
+ inode->i_sb->s_maxbytes < lower_inode->i_sb->s_maxbytes)) {
err = vmtruncate(inode, ia->ia_size);
if (err)
- printk(KERN_ERR
- "unionfs: setattr: vmtruncate failed\n");
+ goto out;
}
}
- /* get the size from the first lower inode */
- lower_inode = unionfs_lower_inode(inode);
+ /* notify the (possibly copied-up) lower inode */
+ err = notify_change(lower_dentry, ia);
+ if (err)
+ goto out;
+
+ /* get attributes from the first lower inode */
unionfs_copy_attr_all(inode, lower_inode);
/*
* unionfs_copy_attr_all will copy the lower times to our inode if
--
1.5.2.2
Signed-off-by: Erez Zadok <[email protected]>
---
fs/unionfs/inode.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/fs/unionfs/inode.c b/fs/unionfs/inode.c
index df6138a..740d364 100644
--- a/fs/unionfs/inode.c
+++ b/fs/unionfs/inode.c
@@ -158,9 +158,9 @@ out:
unionfs_check_inode(parent);
if (!err) {
unionfs_check_dentry(dentry->d_parent);
+ unionfs_check_dentry(dentry);
unionfs_check_nd(nd);
}
- unionfs_check_dentry(dentry);
unionfs_unlock_dentry(dentry);
unionfs_read_unlock(dentry->d_sb);
return err;
--
1.5.2.2
Lockdep fixes. Support locking order/classes (e.g., parent -> child ->
whiteout). Remove locking from create_parents: it's enough to just dget the
dentries in question. Move parent locking to from lookup_backend to caller,
unionfs_lookup.
Signed-off-by: Erez Zadok <[email protected]>
---
fs/unionfs/commonfops.c | 16 ++++++++--------
fs/unionfs/copyup.c | 6 +++---
fs/unionfs/dentry.c | 8 ++++----
fs/unionfs/dirfops.c | 4 ++--
fs/unionfs/fanout.h | 14 ++++++++++++--
fs/unionfs/file.c | 6 +++---
fs/unionfs/inode.c | 47 ++++++++++++++++++++++++++---------------------
fs/unionfs/lookup.c | 15 +++++----------
fs/unionfs/main.c | 2 +-
fs/unionfs/mmap.c | 4 ++--
fs/unionfs/rename.c | 8 ++++----
fs/unionfs/subr.c | 4 ++--
fs/unionfs/super.c | 10 +++++-----
fs/unionfs/union.h | 38 ++++++++++++++++++++++++--------------
fs/unionfs/unlink.c | 8 ++++----
fs/unionfs/xattr.c | 16 ++++++++--------
16 files changed, 113 insertions(+), 93 deletions(-)
diff --git a/fs/unionfs/commonfops.c b/fs/unionfs/commonfops.c
index f714e2f..4077907 100644
--- a/fs/unionfs/commonfops.c
+++ b/fs/unionfs/commonfops.c
@@ -311,7 +311,7 @@ int unionfs_file_revalidate(struct file *file, bool willwrite)
int err = 0;
dentry = file->f_path.dentry;
- unionfs_lock_dentry(dentry);
+ unionfs_lock_dentry(dentry, UNIONFS_DMUTEX_CHILD);
sb = dentry->d_sb;
/*
@@ -519,7 +519,7 @@ int unionfs_open(struct inode *inode, struct file *file)
int bindex = 0, bstart = 0, bend = 0;
int size;
- unionfs_read_lock(inode->i_sb);
+ unionfs_read_lock(inode->i_sb, UNIONFS_SMUTEX_PARENT);
file->private_data =
kzalloc(sizeof(struct unionfs_file_info), GFP_KERNEL);
@@ -546,7 +546,7 @@ int unionfs_open(struct inode *inode, struct file *file)
}
dentry = file->f_path.dentry;
- unionfs_lock_dentry(dentry);
+ unionfs_lock_dentry(dentry, UNIONFS_DMUTEX_CHILD);
bstart = fbstart(file) = dbstart(dentry);
bend = fbend(file) = dbend(dentry);
@@ -607,7 +607,7 @@ int unionfs_file_release(struct inode *inode, struct file *file)
int bindex, bstart, bend;
int fgen, err = 0;
- unionfs_read_lock(sb);
+ unionfs_read_lock(sb, UNIONFS_SMUTEX_PARENT);
/*
* Yes, we have to revalidate this file even if it's being released.
* This is important for open-but-unlinked files, as well as mmap
@@ -626,7 +626,7 @@ int unionfs_file_release(struct inode *inode, struct file *file)
bstart = fbstart(file);
bend = fbend(file);
- unionfs_lock_dentry(dentry);
+ unionfs_lock_dentry(dentry, UNIONFS_DMUTEX_CHILD);
for (bindex = bstart; bindex <= bend; bindex++) {
lower_file = unionfs_lower_file_idx(file, bindex);
@@ -705,7 +705,7 @@ static int unionfs_ioctl_queryfile(struct file *file, unsigned int cmd,
struct vfsmount *mnt;
dentry = file->f_path.dentry;
- unionfs_lock_dentry(dentry);
+ unionfs_lock_dentry(dentry, UNIONFS_DMUTEX_CHILD);
orig_bstart = dbstart(dentry);
orig_bend = dbend(dentry);
err = unionfs_partial_lookup(dentry);
@@ -755,7 +755,7 @@ long unionfs_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
{
long err;
- unionfs_read_lock(file->f_path.dentry->d_sb);
+ unionfs_read_lock(file->f_path.dentry->d_sb, UNIONFS_SMUTEX_PARENT);
err = unionfs_file_revalidate(file, true);
if (unlikely(err))
@@ -794,7 +794,7 @@ int unionfs_flush(struct file *file, fl_owner_t id)
struct dentry *dentry = file->f_path.dentry;
int bindex, bstart, bend;
- unionfs_read_lock(dentry->d_sb);
+ unionfs_read_lock(dentry->d_sb, UNIONFS_SMUTEX_PARENT);
err = unionfs_file_revalidate(file, true);
if (unlikely(err))
diff --git a/fs/unionfs/copyup.c b/fs/unionfs/copyup.c
index 0012caf..16b2c7c 100644
--- a/fs/unionfs/copyup.c
+++ b/fs/unionfs/copyup.c
@@ -717,7 +717,7 @@ struct dentry *create_parents(struct inode *dir, struct dentry *dentry,
/* find the parent directory dentry in unionfs */
parent_dentry = child_dentry->d_parent;
- unionfs_lock_dentry(parent_dentry);
+ dget(parent_dentry);
/* find out the lower_parent_dentry in the given branch */
lower_parent_dentry =
@@ -752,7 +752,7 @@ struct dentry *create_parents(struct inode *dir, struct dentry *dentry,
begin:
/* get lower parent dir in the current branch */
lower_parent_dentry = unionfs_lower_dentry_idx(parent_dentry, bindex);
- unionfs_unlock_dentry(parent_dentry);
+ dput(parent_dentry);
/* init the values to lookup */
childname = child_dentry->d_name.name;
@@ -843,7 +843,7 @@ out:
/* cleanup any leftover locks from the do/while loop above */
if (IS_ERR(lower_dentry))
while (count)
- unionfs_unlock_dentry(path[count--]);
+ dput(path[count--]);
kfree(path);
return lower_dentry;
}
diff --git a/fs/unionfs/dentry.c b/fs/unionfs/dentry.c
index dc1aa39..b207a6f 100644
--- a/fs/unionfs/dentry.c
+++ b/fs/unionfs/dentry.c
@@ -347,7 +347,7 @@ bool __unionfs_d_revalidate_chain(struct dentry *dentry, struct nameidata *nd,
* to child order.
*/
for (i = 0; i < chain_len; i++) {
- unionfs_lock_dentry(chain[i]);
+ unionfs_lock_dentry(chain[i], UNIONFS_DMUTEX_REVAL+i);
saved_bstart = dbstart(chain[i]);
saved_bend = dbend(chain[i]);
sbgen = atomic_read(&UNIONFS_SB(dentry->d_sb)->generation);
@@ -421,9 +421,9 @@ static int unionfs_d_revalidate(struct dentry *dentry, struct nameidata *nd)
{
int err;
- unionfs_read_lock(dentry->d_sb);
+ unionfs_read_lock(dentry->d_sb, UNIONFS_SMUTEX_CHILD);
- unionfs_lock_dentry(dentry);
+ unionfs_lock_dentry(dentry, UNIONFS_DMUTEX_CHILD);
err = __unionfs_d_revalidate_chain(dentry, nd, false);
if (likely(err > 0)) { /* true==1: dentry is valid */
unionfs_check_dentry(dentry);
@@ -444,7 +444,7 @@ static void unionfs_d_release(struct dentry *dentry)
{
int bindex, bstart, bend;
- unionfs_read_lock(dentry->d_sb);
+ unionfs_read_lock(dentry->d_sb, UNIONFS_SMUTEX_CHILD);
unionfs_check_dentry(dentry);
/* this could be a negative dentry, so check first */
diff --git a/fs/unionfs/dirfops.c b/fs/unionfs/dirfops.c
index 88df635..a613862 100644
--- a/fs/unionfs/dirfops.c
+++ b/fs/unionfs/dirfops.c
@@ -103,7 +103,7 @@ static int unionfs_readdir(struct file *file, void *dirent, filldir_t filldir)
int bend;
loff_t offset;
- unionfs_read_lock(file->f_path.dentry->d_sb);
+ unionfs_read_lock(file->f_path.dentry->d_sb, UNIONFS_SMUTEX_PARENT);
err = unionfs_file_revalidate(file, false);
if (unlikely(err))
@@ -208,7 +208,7 @@ static loff_t unionfs_dir_llseek(struct file *file, loff_t offset, int origin)
struct unionfs_dir_state *rdstate;
loff_t err;
- unionfs_read_lock(file->f_path.dentry->d_sb);
+ unionfs_read_lock(file->f_path.dentry->d_sb, UNIONFS_SMUTEX_PARENT);
err = unionfs_file_revalidate(file, false);
if (unlikely(err))
diff --git a/fs/unionfs/fanout.h b/fs/unionfs/fanout.h
index 864383e..5f31015 100644
--- a/fs/unionfs/fanout.h
+++ b/fs/unionfs/fanout.h
@@ -284,10 +284,20 @@ static inline struct vfsmount *unionfs_lower_mnt(const struct dentry *dent)
}
/* Macros for locking a dentry. */
-static inline void unionfs_lock_dentry(struct dentry *d)
+enum unionfs_dentry_lock_class {
+ UNIONFS_DMUTEX_NORMAL,
+ UNIONFS_DMUTEX_ROOT,
+ UNIONFS_DMUTEX_PARENT,
+ UNIONFS_DMUTEX_CHILD,
+ UNIONFS_DMUTEX_WHITEOUT,
+ UNIONFS_DMUTEX_REVAL, /* for file/dentry revalidate */
+};
+
+static inline void unionfs_lock_dentry(struct dentry *d,
+ unsigned int subclass)
{
BUG_ON(!d);
- mutex_lock(&UNIONFS_D(d)->lock);
+ mutex_lock_nested(&UNIONFS_D(d)->lock, subclass);
}
static inline void unionfs_unlock_dentry(struct dentry *d)
diff --git a/fs/unionfs/file.c b/fs/unionfs/file.c
index 94784c3..b632042 100644
--- a/fs/unionfs/file.c
+++ b/fs/unionfs/file.c
@@ -30,7 +30,7 @@ static int unionfs_mmap(struct file *file, struct vm_area_struct *vma)
bool willwrite;
struct file *lower_file;
- unionfs_read_lock(file->f_path.dentry->d_sb);
+ unionfs_read_lock(file->f_path.dentry->d_sb, UNIONFS_SMUTEX_PARENT);
/* This might be deferred to mmap's writepage */
willwrite = ((vma->vm_flags | VM_SHARED | VM_WRITE) == vma->vm_flags);
@@ -80,7 +80,7 @@ int unionfs_fsync(struct file *file, struct dentry *dentry, int datasync)
struct inode *lower_inode, *inode;
int err = -EINVAL;
- unionfs_read_lock(file->f_path.dentry->d_sb);
+ unionfs_read_lock(file->f_path.dentry->d_sb, UNIONFS_SMUTEX_PARENT);
err = unionfs_file_revalidate(file, true);
if (unlikely(err))
goto out;
@@ -128,7 +128,7 @@ int unionfs_fasync(int fd, struct file *file, int flag)
struct inode *lower_inode, *inode;
int err = 0;
- unionfs_read_lock(file->f_path.dentry->d_sb);
+ unionfs_read_lock(file->f_path.dentry->d_sb, UNIONFS_SMUTEX_PARENT);
err = unionfs_file_revalidate(file, true);
if (unlikely(err))
goto out;
diff --git a/fs/unionfs/inode.c b/fs/unionfs/inode.c
index 4890f42..df6138a 100644
--- a/fs/unionfs/inode.c
+++ b/fs/unionfs/inode.c
@@ -29,8 +29,8 @@ static int unionfs_create(struct inode *parent, struct dentry *dentry,
int valid = 0;
struct nameidata lower_nd;
- unionfs_read_lock(dentry->d_sb);
- unionfs_lock_dentry(dentry);
+ unionfs_read_lock(dentry->d_sb, UNIONFS_SMUTEX_CHILD);
+ unionfs_lock_dentry(dentry, UNIONFS_DMUTEX_CHILD);
valid = __unionfs_d_revalidate_chain(dentry, nd, false);
/*
@@ -79,7 +79,7 @@ static int unionfs_create(struct inode *parent, struct dentry *dentry,
*/
struct dentry *lower_dir_dentry;
- lower_dir_dentry = lock_parent(wh_dentry);
+ lower_dir_dentry = lock_parent_wh(wh_dentry);
/* see Documentation/filesystems/unionfs/issues.txt */
lockdep_off();
err = vfs_unlink(lower_dir_dentry->d_inode, wh_dentry);
@@ -178,7 +178,9 @@ static struct dentry *unionfs_lookup(struct inode *parent,
struct path path_save;
struct dentry *ret;
- unionfs_read_lock(dentry->d_sb);
+ unionfs_read_lock(dentry->d_sb, UNIONFS_SMUTEX_CHILD);
+ if (dentry != dentry->d_parent)
+ unionfs_lock_dentry(dentry->d_parent, UNIONFS_DMUTEX_PARENT);
/* save the dentry & vfsmnt from namei */
if (nd) {
@@ -210,6 +212,9 @@ static struct dentry *unionfs_lookup(struct inode *parent,
unionfs_check_nd(nd);
if (!IS_ERR(ret))
unionfs_unlock_dentry(dentry);
+
+ if (dentry != dentry->d_parent)
+ unionfs_unlock_dentry(dentry->d_parent);
unionfs_read_unlock(dentry->d_sb);
return ret;
@@ -225,7 +230,7 @@ static int unionfs_link(struct dentry *old_dentry, struct inode *dir,
struct dentry *whiteout_dentry;
char *name = NULL;
- unionfs_read_lock(old_dentry->d_sb);
+ unionfs_read_lock(old_dentry->d_sb, UNIONFS_SMUTEX_CHILD);
unionfs_double_lock_dentry(new_dentry, old_dentry);
if (unlikely(!__unionfs_d_revalidate_chain(old_dentry, NULL, false))) {
@@ -263,7 +268,7 @@ static int unionfs_link(struct dentry *old_dentry, struct inode *dir,
whiteout_dentry = NULL;
} else {
/* found a .wh.foo entry, unlink it and then call vfs_link() */
- lower_dir_dentry = lock_parent(whiteout_dentry);
+ lower_dir_dentry = lock_parent_wh(whiteout_dentry);
err = is_robranch_super(new_dentry->d_sb, dbstart(new_dentry));
if (!err) {
/* see Documentation/filesystems/unionfs/issues.txt */
@@ -389,8 +394,8 @@ static int unionfs_symlink(struct inode *parent, struct dentry *dentry,
int valid = 0;
umode_t mode;
- unionfs_read_lock(dentry->d_sb);
- unionfs_lock_dentry(dentry);
+ unionfs_read_lock(dentry->d_sb, UNIONFS_SMUTEX_CHILD);
+ unionfs_lock_dentry(dentry, UNIONFS_DMUTEX_CHILD);
if (unlikely(dentry->d_inode &&
!__unionfs_d_revalidate_chain(dentry, NULL, false))) {
@@ -444,7 +449,7 @@ static int unionfs_symlink(struct inode *parent, struct dentry *dentry,
*/
struct dentry *lower_dir_dentry;
- lower_dir_dentry = lock_parent(wh_dentry);
+ lower_dir_dentry = lock_parent_wh(wh_dentry);
err = vfs_unlink(lower_dir_dentry->d_inode, wh_dentry);
unlock_dir(lower_dir_dentry);
@@ -532,8 +537,8 @@ static int unionfs_mkdir(struct inode *parent, struct dentry *dentry, int mode)
int whiteout_unlinked = 0;
struct sioq_args args;
- unionfs_read_lock(dentry->d_sb);
- unionfs_lock_dentry(dentry);
+ unionfs_read_lock(dentry->d_sb, UNIONFS_SMUTEX_CHILD);
+ unionfs_lock_dentry(dentry, UNIONFS_DMUTEX_CHILD);
if (unlikely(dentry->d_inode &&
!__unionfs_d_revalidate_chain(dentry, NULL, false))) {
@@ -566,7 +571,7 @@ static int unionfs_mkdir(struct inode *parent, struct dentry *dentry, int mode)
dput(whiteout_dentry);
whiteout_dentry = NULL;
} else {
- lower_parent_dentry = lock_parent(whiteout_dentry);
+ lower_parent_dentry = lock_parent_wh(whiteout_dentry);
/* found a.wh.foo entry, remove it then do vfs_mkdir */
err = is_robranch_super(dentry->d_sb, bstart);
@@ -686,8 +691,8 @@ static int unionfs_mknod(struct inode *parent, struct dentry *dentry, int mode,
char *name = NULL;
int valid = 0;
- unionfs_read_lock(dentry->d_sb);
- unionfs_lock_dentry(dentry);
+ unionfs_read_lock(dentry->d_sb, UNIONFS_SMUTEX_CHILD);
+ unionfs_lock_dentry(dentry, UNIONFS_DMUTEX_CHILD);
if (unlikely(dentry->d_inode &&
!__unionfs_d_revalidate_chain(dentry, NULL, false))) {
@@ -741,7 +746,7 @@ static int unionfs_mknod(struct inode *parent, struct dentry *dentry, int mode,
*/
struct dentry *lower_dir_dentry;
- lower_dir_dentry = lock_parent(wh_dentry);
+ lower_dir_dentry = lock_parent_wh(wh_dentry);
err = vfs_unlink(lower_dir_dentry->d_inode, wh_dentry);
unlock_dir(lower_dir_dentry);
@@ -823,8 +828,8 @@ static int unionfs_readlink(struct dentry *dentry, char __user *buf,
int err;
struct dentry *lower_dentry;
- unionfs_read_lock(dentry->d_sb);
- unionfs_lock_dentry(dentry);
+ unionfs_read_lock(dentry->d_sb, UNIONFS_SMUTEX_CHILD);
+ unionfs_lock_dentry(dentry, UNIONFS_DMUTEX_CHILD);
if (unlikely(!__unionfs_d_revalidate_chain(dentry, NULL, false))) {
err = -ESTALE;
@@ -901,9 +906,9 @@ out:
static void unionfs_put_link(struct dentry *dentry, struct nameidata *nd,
void *cookie)
{
- unionfs_read_lock(dentry->d_sb);
+ unionfs_read_lock(dentry->d_sb, UNIONFS_SMUTEX_CHILD);
- unionfs_lock_dentry(dentry);
+ unionfs_lock_dentry(dentry, UNIONFS_DMUTEX_CHILD);
if (unlikely(!__unionfs_d_revalidate_chain(dentry, nd, false)))
printk(KERN_ERR
"unionfs: put_link failed to revalidate dentry\n");
@@ -1011,8 +1016,8 @@ static int unionfs_setattr(struct dentry *dentry, struct iattr *ia)
int bstart, bend, bindex;
loff_t size;
- unionfs_read_lock(dentry->d_sb);
- unionfs_lock_dentry(dentry);
+ unionfs_read_lock(dentry->d_sb, UNIONFS_SMUTEX_CHILD);
+ unionfs_lock_dentry(dentry, UNIONFS_DMUTEX_CHILD);
if (unlikely(!__unionfs_d_revalidate_chain(dentry, NULL, false))) {
err = -ESTALE;
diff --git a/fs/unionfs/lookup.c b/fs/unionfs/lookup.c
index a1904c9..85a85aa 100644
--- a/fs/unionfs/lookup.c
+++ b/fs/unionfs/lookup.c
@@ -98,7 +98,6 @@ struct dentry *unionfs_lookup_backend(struct dentry *dentry,
struct dentry *first_dentry = NULL;
struct dentry *first_lower_dentry = NULL;
struct vfsmount *first_lower_mnt = NULL;
- int locked_parent = 0;
int opaque;
char *whname = NULL;
const char *name;
@@ -119,7 +118,7 @@ struct dentry *unionfs_lookup_backend(struct dentry *dentry,
case INTERPOSE_PARTIAL:
break;
case INTERPOSE_LOOKUP:
- err = new_dentry_private_data(dentry);
+ err = new_dentry_private_data(dentry, UNIONFS_DMUTEX_CHILD);
if (unlikely(err))
goto out;
break;
@@ -136,10 +135,7 @@ struct dentry *unionfs_lookup_backend(struct dentry *dentry,
parent_dentry = dget_parent(dentry);
/* We never partial lookup the root directory. */
- if (parent_dentry != dentry) {
- unionfs_lock_dentry(parent_dentry);
- locked_parent = 1;
- } else {
+ if (parent_dentry == dentry) {
dput(parent_dentry);
parent_dentry = NULL;
goto out;
@@ -426,8 +422,6 @@ out:
}
}
kfree(whname);
- if (locked_parent)
- unionfs_unlock_dentry(parent_dentry);
dput(parent_dentry);
if (err && (lookupmode == INTERPOSE_LOOKUP))
unionfs_unlock_dentry(dentry);
@@ -441,6 +435,7 @@ out:
/*
* This is a utility function that fills in a unionfs dentry.
+ * Caller must lock this dentry with unionfs_lock_dentry.
*
* Returns: 0 (ok), or -ERRNO if an error occurred.
*/
@@ -530,7 +525,7 @@ static int realloc_dentry_private_data(struct dentry *dentry)
}
/* allocate new dentry private data */
-int new_dentry_private_data(struct dentry *dentry)
+int new_dentry_private_data(struct dentry *dentry, int subclass)
{
struct unionfs_dentry_info *info = UNIONFS_D(dentry);
@@ -541,7 +536,7 @@ int new_dentry_private_data(struct dentry *dentry)
return -ENOMEM;
mutex_init(&info->lock);
- mutex_lock(&info->lock);
+ mutex_lock_nested(&info->lock, subclass);
info->lower_paths = NULL;
diff --git a/fs/unionfs/main.c b/fs/unionfs/main.c
index ea8976d..92f0e9d 100644
--- a/fs/unionfs/main.c
+++ b/fs/unionfs/main.c
@@ -653,7 +653,7 @@ static int unionfs_read_super(struct super_block *sb, void *raw_data,
/* link the upper and lower dentries */
sb->s_root->d_fsdata = NULL;
- err = new_dentry_private_data(sb->s_root);
+ err = new_dentry_private_data(sb->s_root, UNIONFS_DMUTEX_ROOT);
if (unlikely(err))
goto out_freedpd;
diff --git a/fs/unionfs/mmap.c b/fs/unionfs/mmap.c
index aad2137..a0e654b 100644
--- a/fs/unionfs/mmap.c
+++ b/fs/unionfs/mmap.c
@@ -152,7 +152,7 @@ static int unionfs_readpage(struct file *file, struct page *page)
char *page_data = NULL;
mode_t orig_mode;
- unionfs_read_lock(file->f_path.dentry->d_sb);
+ unionfs_read_lock(file->f_path.dentry->d_sb, UNIONFS_SMUTEX_PARENT);
err = unionfs_file_revalidate(file, false);
if (unlikely(err))
goto out;
@@ -245,7 +245,7 @@ static int unionfs_commit_write(struct file *file, struct page *page,
BUG_ON(file == NULL);
- unionfs_read_lock(file->f_path.dentry->d_sb);
+ unionfs_read_lock(file->f_path.dentry->d_sb, UNIONFS_SMUTEX_PARENT);
err = unionfs_file_revalidate(file, true);
if (unlikely(err))
goto out;
diff --git a/fs/unionfs/rename.c b/fs/unionfs/rename.c
index 8b04acf..824c285 100644
--- a/fs/unionfs/rename.c
+++ b/fs/unionfs/rename.c
@@ -76,7 +76,7 @@ static int __unionfs_rename(struct inode *old_dir, struct dentry *old_dentry,
goto out;
}
- lower_wh_dir_dentry = lock_parent(lower_wh_dentry);
+ lower_wh_dir_dentry = lock_parent_wh(lower_wh_dentry);
err = is_robranch_super(old_dentry->d_sb, bindex);
if (!err)
err = vfs_unlink(lower_wh_dir_dentry->d_inode,
@@ -274,7 +274,7 @@ static int do_unionfs_rename(struct inode *old_dir,
err = init_lower_nd(&nd, LOOKUP_CREATE);
if (unlikely(err < 0))
goto out;
- lower_parent = lock_parent(wh_old);
+ lower_parent = lock_parent_wh(wh_old);
local_err = vfs_create(lower_parent->d_inode, wh_old, S_IRUGO,
&nd);
unlock_dir(lower_parent);
@@ -362,7 +362,7 @@ static struct dentry *lookup_whiteout(struct dentry *dentry)
return (void *)whname;
parent = dget_parent(dentry);
- unionfs_lock_dentry(parent);
+ unionfs_lock_dentry(parent, UNIONFS_DMUTEX_WHITEOUT);
bstart = dbstart(parent);
bend = dbend(parent);
wh_dentry = ERR_PTR(-ENOENT);
@@ -421,7 +421,7 @@ int unionfs_rename(struct inode *old_dir, struct dentry *old_dentry,
int err = 0;
struct dentry *wh_dentry;
- unionfs_read_lock(old_dentry->d_sb);
+ unionfs_read_lock(old_dentry->d_sb, UNIONFS_SMUTEX_CHILD);
unionfs_double_lock_dentry(old_dentry, new_dentry);
if (unlikely(!__unionfs_d_revalidate_chain(old_dentry, NULL, false))) {
diff --git a/fs/unionfs/subr.c b/fs/unionfs/subr.c
index 1a26c57..0a0fce9 100644
--- a/fs/unionfs/subr.c
+++ b/fs/unionfs/subr.c
@@ -90,7 +90,7 @@ int create_whiteout(struct dentry *dentry, int start)
err = init_lower_nd(&nd, LOOKUP_CREATE);
if (unlikely(err < 0))
goto out;
- lower_dir_dentry = lock_parent(lower_wh_dentry);
+ lower_dir_dentry = lock_parent_wh(lower_wh_dentry);
err = is_robranch_super(dentry->d_sb, bindex);
if (!err)
err = vfs_create(lower_dir_dentry->d_inode,
@@ -126,7 +126,7 @@ int unionfs_refresh_lower_dentry(struct dentry *dentry, int bindex)
verify_locked(dentry);
- unionfs_lock_dentry(dentry->d_parent);
+ unionfs_lock_dentry(dentry->d_parent, UNIONFS_DMUTEX_CHILD);
lower_parent = unionfs_lower_dentry_idx(dentry->d_parent, bindex);
unionfs_unlock_dentry(dentry->d_parent);
diff --git a/fs/unionfs/super.c b/fs/unionfs/super.c
index 45bcf89..986c980 100644
--- a/fs/unionfs/super.c
+++ b/fs/unionfs/super.c
@@ -134,8 +134,8 @@ static int unionfs_statfs(struct dentry *dentry, struct kstatfs *buf)
sb = dentry->d_sb;
- unionfs_read_lock(sb);
- unionfs_lock_dentry(dentry);
+ unionfs_read_lock(sb, UNIONFS_SMUTEX_CHILD);
+ unionfs_lock_dentry(dentry, UNIONFS_DMUTEX_CHILD);
if (unlikely(!__unionfs_d_revalidate_chain(dentry, NULL, false))) {
err = -ESTALE;
@@ -944,7 +944,7 @@ static void unionfs_umount_begin(struct vfsmount *mnt, int flags)
sb = mnt->mnt_sb;
- unionfs_read_lock(sb);
+ unionfs_read_lock(sb, UNIONFS_SMUTEX_CHILD);
bstart = sbstart(sb);
bend = sbend(sb);
@@ -969,9 +969,9 @@ static int unionfs_show_options(struct seq_file *m, struct vfsmount *mnt)
int bindex, bstart, bend;
int perms;
- unionfs_read_lock(sb);
+ unionfs_read_lock(sb, UNIONFS_SMUTEX_CHILD);
- unionfs_lock_dentry(sb->s_root);
+ unionfs_lock_dentry(sb->s_root, UNIONFS_DMUTEX_CHILD);
tmp_page = (char *) __get_free_page(GFP_KERNEL);
if (unlikely(!tmp_page)) {
diff --git a/fs/unionfs/union.h b/fs/unionfs/union.h
index b1bcb42..d324f83 100644
--- a/fs/unionfs/union.h
+++ b/fs/unionfs/union.h
@@ -243,12 +243,18 @@ static inline off_t rdstate2offset(struct unionfs_dir_state *buf)
return tmp;
}
-static inline void unionfs_read_lock(struct super_block *sb)
+/* Macros for locking a super_block. */
+enum unionfs_super_lock_class {
+ UNIONFS_SMUTEX_NORMAL,
+ UNIONFS_SMUTEX_PARENT, /* when locking on behalf of file */
+ UNIONFS_SMUTEX_CHILD, /* when locking on behalf of dentry */
+};
+static inline void unionfs_read_lock(struct super_block *sb, int subclass)
{
if (UNIONFS_SB(sb)->write_lock_owner &&
UNIONFS_SB(sb)->write_lock_owner == current->pid)
return;
- down_read(&UNIONFS_SB(sb)->rwsem);
+ down_read_nested(&UNIONFS_SB(sb)->rwsem, subclass);
}
static inline void unionfs_read_unlock(struct super_block *sb)
{
@@ -271,16 +277,17 @@ static inline void unionfs_write_unlock(struct super_block *sb)
static inline void unionfs_double_lock_dentry(struct dentry *d1,
struct dentry *d2)
{
- if (d2 < d1) {
- struct dentry *tmp = d1;
- d1 = d2;
- d2 = tmp;
+ BUG_ON(d1 == d2);
+ if (d1 < d2) {
+ unionfs_lock_dentry(d1, UNIONFS_DMUTEX_PARENT);
+ unionfs_lock_dentry(d2, UNIONFS_DMUTEX_CHILD);
+ } else {
+ unionfs_lock_dentry(d2, UNIONFS_DMUTEX_PARENT);
+ unionfs_lock_dentry(d1, UNIONFS_DMUTEX_CHILD);
}
- unionfs_lock_dentry(d1);
- unionfs_lock_dentry(d2);
}
-extern int new_dentry_private_data(struct dentry *dentry);
+extern int new_dentry_private_data(struct dentry *dentry, int subclass);
extern void free_dentry_private_data(struct dentry *dentry);
extern void update_bstart(struct dentry *dentry);
extern int init_lower_nd(struct nameidata *nd, unsigned int flags);
@@ -477,15 +484,18 @@ extern char *alloc_whname(const char *name, int len);
extern int check_branch(struct nameidata *nd);
extern int parse_branch_mode(const char *name, int *perms);
-/*
- * These two functions are here because it is kind of daft to copy and paste
- * the contents of the two functions to 32+ places in unionfs
- */
+/* locking helpers */
static inline struct dentry *lock_parent(struct dentry *dentry)
{
struct dentry *dir = dget(dentry->d_parent);
+ mutex_lock_nested(&dir->d_inode->i_mutex, I_MUTEX_PARENT);
+ return dir;
+}
+static inline struct dentry *lock_parent_wh(struct dentry *dentry)
+{
+ struct dentry *dir = dget(dentry->d_parent);
- mutex_lock(&dir->d_inode->i_mutex);
+ mutex_lock_nested(&dir->d_inode->i_mutex, UNIONFS_DMUTEX_WHITEOUT);
return dir;
}
diff --git a/fs/unionfs/unlink.c b/fs/unionfs/unlink.c
index 677a5ae..1e81494 100644
--- a/fs/unionfs/unlink.c
+++ b/fs/unionfs/unlink.c
@@ -92,8 +92,8 @@ int unionfs_unlink(struct inode *dir, struct dentry *dentry)
{
int err = 0;
- unionfs_read_lock(dentry->d_sb);
- unionfs_lock_dentry(dentry);
+ unionfs_read_lock(dentry->d_sb, UNIONFS_SMUTEX_CHILD);
+ unionfs_lock_dentry(dentry, UNIONFS_DMUTEX_CHILD);
if (unlikely(!__unionfs_d_revalidate_chain(dentry, NULL, false))) {
err = -ESTALE;
@@ -167,8 +167,8 @@ int unionfs_rmdir(struct inode *dir, struct dentry *dentry)
struct unionfs_dir_state *namelist = NULL;
int dstart, dend;
- unionfs_read_lock(dentry->d_sb);
- unionfs_lock_dentry(dentry);
+ unionfs_read_lock(dentry->d_sb, UNIONFS_SMUTEX_CHILD);
+ unionfs_lock_dentry(dentry, UNIONFS_DMUTEX_CHILD);
if (unlikely(!__unionfs_d_revalidate_chain(dentry, NULL, false))) {
err = -ESTALE;
diff --git a/fs/unionfs/xattr.c b/fs/unionfs/xattr.c
index 00c6d0d..8001c65 100644
--- a/fs/unionfs/xattr.c
+++ b/fs/unionfs/xattr.c
@@ -45,8 +45,8 @@ ssize_t unionfs_getxattr(struct dentry *dentry, const char *name, void *value,
struct dentry *lower_dentry = NULL;
int err = -EOPNOTSUPP;
- unionfs_read_lock(dentry->d_sb);
- unionfs_lock_dentry(dentry);
+ unionfs_read_lock(dentry->d_sb, UNIONFS_SMUTEX_CHILD);
+ unionfs_lock_dentry(dentry, UNIONFS_DMUTEX_CHILD);
if (unlikely(!__unionfs_d_revalidate_chain(dentry, NULL, false))) {
err = -ESTALE;
@@ -74,8 +74,8 @@ int unionfs_setxattr(struct dentry *dentry, const char *name,
struct dentry *lower_dentry = NULL;
int err = -EOPNOTSUPP;
- unionfs_read_lock(dentry->d_sb);
- unionfs_lock_dentry(dentry);
+ unionfs_read_lock(dentry->d_sb, UNIONFS_SMUTEX_CHILD);
+ unionfs_lock_dentry(dentry, UNIONFS_DMUTEX_CHILD);
if (unlikely(!__unionfs_d_revalidate_chain(dentry, NULL, false))) {
err = -ESTALE;
@@ -103,8 +103,8 @@ int unionfs_removexattr(struct dentry *dentry, const char *name)
struct dentry *lower_dentry = NULL;
int err = -EOPNOTSUPP;
- unionfs_read_lock(dentry->d_sb);
- unionfs_lock_dentry(dentry);
+ unionfs_read_lock(dentry->d_sb, UNIONFS_SMUTEX_CHILD);
+ unionfs_lock_dentry(dentry, UNIONFS_DMUTEX_CHILD);
if (unlikely(!__unionfs_d_revalidate_chain(dentry, NULL, false))) {
err = -ESTALE;
@@ -132,8 +132,8 @@ ssize_t unionfs_listxattr(struct dentry *dentry, char *list, size_t size)
int err = -EOPNOTSUPP;
char *encoded_list = NULL;
- unionfs_read_lock(dentry->d_sb);
- unionfs_lock_dentry(dentry);
+ unionfs_read_lock(dentry->d_sb, UNIONFS_SMUTEX_CHILD);
+ unionfs_lock_dentry(dentry, UNIONFS_DMUTEX_CHILD);
if (unlikely(!__unionfs_d_revalidate_chain(dentry, NULL, false))) {
err = -ESTALE;
--
1.5.2.2
This was intended to protect the inode during branch management, but that is
now done through our superblock rwsem.
Signed-off-by: Erez Zadok <[email protected]>
---
fs/unionfs/dentry.c | 13 -------------
1 files changed, 0 insertions(+), 13 deletions(-)
diff --git a/fs/unionfs/dentry.c b/fs/unionfs/dentry.c
index 68c07ba..dc1aa39 100644
--- a/fs/unionfs/dentry.c
+++ b/fs/unionfs/dentry.c
@@ -33,7 +33,6 @@ static bool __unionfs_d_revalidate_one(struct dentry *dentry,
int bindex, bstart, bend;
int sbgen, dgen;
int positive = 0;
- int locked = 0;
int interpose_flag;
struct nameidata lowernd; /* TODO: be gentler to the stack */
@@ -87,16 +86,6 @@ static bool __unionfs_d_revalidate_one(struct dentry *dentry,
interpose_flag = INTERPOSE_REVAL_NEG;
if (positive) {
interpose_flag = INTERPOSE_REVAL;
- /*
- * During BRM, the VFS could already hold a lock on
- * a file being read, so don't lock it again
- * (deadlock), but if you lock it in this function,
- * then release it here too.
- */
- if (!mutex_is_locked(&dentry->d_inode->i_mutex)) {
- mutex_lock(&dentry->d_inode->i_mutex);
- locked = 1;
- }
bstart = ibstart(dentry->d_inode);
bend = ibend(dentry->d_inode);
@@ -115,8 +104,6 @@ static bool __unionfs_d_revalidate_one(struct dentry *dentry,
UNIONFS_I(dentry->d_inode)->lower_inodes = NULL;
ibstart(dentry->d_inode) = -1;
ibend(dentry->d_inode) = -1;
- if (locked)
- mutex_unlock(&dentry->d_inode->i_mutex);
}
result = unionfs_lookup_backend(dentry, &lowernd,
--
1.5.2.2
Signed-off-by: Erez Zadok <[email protected]>
---
fs/unionfs/inode.c | 7 -------
1 files changed, 0 insertions(+), 7 deletions(-)
diff --git a/fs/unionfs/inode.c b/fs/unionfs/inode.c
index 7ec9c1b..3df9b19 100644
--- a/fs/unionfs/inode.c
+++ b/fs/unionfs/inode.c
@@ -32,13 +32,6 @@ static int unionfs_create(struct inode *parent, struct dentry *dentry,
unionfs_read_lock(dentry->d_sb);
unionfs_lock_dentry(dentry);
- unionfs_lock_dentry(dentry->d_parent);
- valid = __unionfs_d_revalidate_chain(dentry->d_parent, nd, false);
- unionfs_unlock_dentry(dentry->d_parent);
- if (unlikely(!valid)) {
- err = -ESTALE; /* same as what real_lookup does */
- goto out;
- }
valid = __unionfs_d_revalidate_chain(dentry, nd, false);
/*
* It's only a bug if this dentry was not negative and couldn't be
--
1.5.2.2
When creating a new symlink, always create it in the first branch, which is
always writeable, not in the branch which may have a whiteout in it. This
makes the policy for the creation of new symlinks consistent with that of
new files/directories, as well as improves efficiency a bit.
Signed-off-by: Erez Zadok <[email protected]>
---
fs/unionfs/inode.c | 210 +++++++++++++++++++++++----------------------------
1 files changed, 95 insertions(+), 115 deletions(-)
diff --git a/fs/unionfs/inode.c b/fs/unionfs/inode.c
index 8076d0b..78cdfa2 100644
--- a/fs/unionfs/inode.c
+++ b/fs/unionfs/inode.c
@@ -368,16 +368,16 @@ out:
return err;
}
-static int unionfs_symlink(struct inode *dir, struct dentry *dentry,
+static int unionfs_symlink(struct inode *parent, struct dentry *dentry,
const char *symname)
{
int err = 0;
struct dentry *lower_dentry = NULL;
- struct dentry *whiteout_dentry = NULL;
- struct dentry *lower_dir_dentry = NULL;
- umode_t mode;
- int bindex = 0, bstart;
+ struct dentry *wh_dentry = NULL;
+ struct dentry *lower_parent_dentry = NULL;
char *name = NULL;
+ int valid = 0;
+ umode_t mode;
unionfs_read_lock(dentry->d_sb);
unionfs_lock_dentry(dentry);
@@ -388,147 +388,127 @@ static int unionfs_symlink(struct inode *dir, struct dentry *dentry,
goto out;
}
- /* We start out in the leftmost branch. */
- bstart = dbstart(dentry);
-
- lower_dentry = unionfs_lower_dentry(dentry);
-
/*
- * check if whiteout exists in this branch, i.e. lookup .wh.foo
- * first. If present, delete it
+ * It's only a bug if this dentry was not negative and couldn't be
+ * revalidated (shouldn't happen).
*/
- name = alloc_whname(dentry->d_name.name, dentry->d_name.len);
- if (unlikely(IS_ERR(name))) {
- err = PTR_ERR(name);
- goto out;
- }
+ BUG_ON(!valid && dentry->d_inode);
- whiteout_dentry =
- lookup_one_len(name, lower_dentry->d_parent,
- dentry->d_name.len + UNIONFS_WHLEN);
- if (IS_ERR(whiteout_dentry)) {
- err = PTR_ERR(whiteout_dentry);
+ /*
+ * We shouldn't create things in a read-only branch; this check is a
+ * bit redundant as we don't allow branch 0 to be read-only at the
+ * moment
+ */
+ err = is_robranch_super(dentry->d_sb, 0);
+ if (err) {
+ err = -EROFS;
goto out;
}
- if (!whiteout_dentry->d_inode) {
- dput(whiteout_dentry);
- whiteout_dentry = NULL;
- } else {
+ /*
+ * We _always_ create on branch 0
+ */
+ lower_dentry = unionfs_lower_dentry_idx(dentry, 0);
+ if (lower_dentry) {
/*
- * found a .wh.foo entry, unlink it and then call
- * vfs_symlink().
+ * check if whiteout exists in this branch, i.e. lookup .wh.foo
+ * first.
*/
- lower_dir_dentry = lock_parent(whiteout_dentry);
-
- err = is_robranch_super(dentry->d_sb, bstart);
- if (!err)
- err = vfs_unlink(lower_dir_dentry->d_inode,
- whiteout_dentry);
- dput(whiteout_dentry);
-
- fsstack_copy_attr_times(dir, lower_dir_dentry->d_inode);
- /* propagate number of hard-links */
- dir->i_nlink = unionfs_get_nlinks(dir);
+ name = alloc_whname(dentry->d_name.name, dentry->d_name.len);
+ if (unlikely(IS_ERR(name))) {
+ err = PTR_ERR(name);
+ goto out;
+ }
- unlock_dir(lower_dir_dentry);
+ wh_dentry = lookup_one_len(name, lower_dentry->d_parent,
+ dentry->d_name.len + UNIONFS_WHLEN);
+ if (IS_ERR(wh_dentry)) {
+ err = PTR_ERR(wh_dentry);
+ wh_dentry = NULL;
+ goto out;
+ }
- if (err) {
- /* exit if the error returned was NOT -EROFS */
- if (!IS_COPYUP_ERR(err))
- goto out;
+ if (wh_dentry->d_inode) {
/*
- * should now try to create symlink in the another
- * branch.
+ * .wh.foo has been found, so let's unlink it
*/
- bstart--;
- }
- }
+ struct dentry *lower_dir_dentry;
+
+ lower_dir_dentry = lock_parent(wh_dentry);
+ err = vfs_unlink(lower_dir_dentry->d_inode, wh_dentry);
+ unlock_dir(lower_dir_dentry);
- /*
- * deleted whiteout if it was present, now do a normal vfs_symlink()
- * with possible recursive directory creation
- */
- for (bindex = bstart; bindex >= 0; bindex--) {
- lower_dentry = unionfs_lower_dentry_idx(dentry, bindex);
- if (!lower_dentry) {
/*
- * if lower_dentry is NULL, create the entire
- * dentry directory structure in branch 'bindex'.
- * lower_dentry will NOT be null when bindex ==
- * bstart because lookup passed as a negative
- * unionfs dentry pointing to a lone negative
- * underlying dentry
+ * Whiteouts are special files and should be deleted
+ * no matter what (as if they never existed), in
+ * order to allow this create operation to succeed.
+ * This is especially important in sticky
+ * directories: a whiteout may have been created by
+ * one user, but the newly created file may be
+ * created by another user. Therefore, in order to
+ * maintain Unix semantics, if the vfs_unlink above
+ * ailed, then we have to try to directly unlink the
+ * whiteout. Note: in the ODF version of unionfs,
+ * whiteout are handled much more cleanly.
*/
- lower_dentry = create_parents(dir, dentry,
- dentry->d_name.name,
- bindex);
- if (!lower_dentry || IS_ERR(lower_dentry)) {
- if (IS_ERR(lower_dentry))
- err = PTR_ERR(lower_dentry);
- if (!IS_COPYUP_ERR(err))
- printk(KERN_ERR
- "unionfs: create_parents for "
- "symlink failed: bindex=%d "
- "err=%d\n", bindex, err);
- continue;
+ if (err == -EPERM) {
+ struct inode *inode = lower_dir_dentry->d_inode;
+ err = inode->i_op->unlink(inode, wh_dentry);
+ }
+ if (err) {
+ printk(KERN_ERR "unionfs: symlink: could not "
+ "unlink whiteout, err = %d\n", err);
+ goto out;
}
}
+ } else {
+ /*
+ * if lower_dentry is NULL, create the entire
+ * dentry directory structure in branch 0.
+ */
+ lower_dentry = create_parents(parent, dentry,
+ dentry->d_name.name, 0);
+ if (IS_ERR(lower_dentry)) {
+ err = PTR_ERR(lower_dentry);
+ goto out;
+ }
+ }
- lower_dir_dentry = lock_parent(lower_dentry);
+ lower_parent_dentry = lock_parent(lower_dentry);
+ if (IS_ERR(lower_parent_dentry)) {
+ err = PTR_ERR(lower_parent_dentry);
+ goto out;
+ }
- err = is_robranch_super(dentry->d_sb, bindex);
+ mode = S_IALLUGO;
+ err = vfs_symlink(lower_parent_dentry->d_inode, lower_dentry,
+ symname, mode);
+ if (!err) {
+ err = PTR_ERR(unionfs_interpose(dentry, parent->i_sb, 0));
if (!err) {
- mode = S_IALLUGO;
- err = vfs_symlink(lower_dir_dentry->d_inode,
- lower_dentry, symname, mode);
- }
- unlock_dir(lower_dir_dentry);
-
- if (err || !lower_dentry->d_inode) {
- /*
- * break out of for loop if error returned was NOT
- * -EROFS.
- */
- if (!IS_COPYUP_ERR(err))
- break;
- } else {
- /*
- * Only INTERPOSE_LOOKUP can return a value other
- * than 0 on err.
- */
- err = PTR_ERR(unionfs_interpose(dentry,
- dir->i_sb, 0));
- if (!err) {
- fsstack_copy_attr_times(dir,
- lower_dir_dentry->
- d_inode);
- fsstack_copy_inode_size(dir,
- lower_dir_dentry->
- d_inode);
- /*
- * update number of links on parent
- * directory.
- */
- dir->i_nlink = unionfs_get_nlinks(dir);
- }
- break;
+ unionfs_copy_attr_times(parent);
+ fsstack_copy_inode_size(parent,
+ lower_parent_dentry->d_inode);
+ /* update no. of links on parent directory */
+ parent->i_nlink = unionfs_get_nlinks(parent);
}
}
-out:
- if (!dentry->d_inode)
- d_drop(dentry);
+ unlock_dir(lower_parent_dentry);
+out:
+ dput(wh_dentry);
kfree(name);
+
if (!err)
unionfs_postcopyup_setmnt(dentry);
- unionfs_check_inode(dir);
+ unionfs_check_inode(parent);
+ if (!err)
+ unionfs_check_dentry(dentry->d_parent);
unionfs_check_dentry(dentry);
unionfs_unlock_dentry(dentry);
unionfs_read_unlock(dentry->d_sb);
-
return err;
}
--
1.5.2.2
CC: Michael Tokarev <[email protected]>
Signed-off-by: Erez Zadok <[email protected]>
---
Documentation/filesystems/unionfs/concepts.txt | 20 +++++++++++++++++---
Documentation/filesystems/unionfs/issues.txt | 2 +-
Documentation/filesystems/unionfs/usage.txt | 13 ++++++++-----
3 files changed, 26 insertions(+), 9 deletions(-)
diff --git a/Documentation/filesystems/unionfs/concepts.txt b/Documentation/filesystems/unionfs/concepts.txt
index 7654ccc..bed69bd 100644
--- a/Documentation/filesystems/unionfs/concepts.txt
+++ b/Documentation/filesystems/unionfs/concepts.txt
@@ -1,4 +1,4 @@
-Unionfs 2.1 CONCEPTS:
+Unionfs 2.x CONCEPTS:
=====================
This file describes the concepts needed by a namespace unification file
@@ -66,12 +66,26 @@ Copyup:
=======
When a change is made to the contents of a file's data or meta-data, they
-have to be stored somewhere. The best way is to create a copy of the
+have to be stored somewhere. The best way is to create a copy of the
original file on a branch that is writable, and then redirect the write
-though to this copy. The copy must be made on a higher priority branch so
+though to this copy. The copy must be made on a higher priority branch so
that lookup and readdir return this newer "version" of the file rather than
the original (see duplicate elimination).
+An entire unionfs mount can be read-only or read-write. If it's read-only,
+then none of the branches will be written to, even if some of the branches
+are physically writeable. If the unionfs mount is read-write, then the
+leftmost (highest priority) branch must be writeable (for copyup to take
+place); the remaining branches can be any mix of read-write and read-only.
+
+In a writeable mount, unionfs will create new files/dir in the leftmost
+branch. If one tries to modify a file in a read-only branch/media, unionfs
+will copyup the file to the leftmost branch and modify it there. If you try
+to modify a file from a writeable branch which is not the leftmost branch,
+then unionfs will modify it in that branch; this is useful if you, say,
+unify differnet packages (e.g., apache, sendmail, ftpd, etc.) and you want
+changes to specific package files to remain logically in the directory where
+they came from.
Cache Coherency:
================
diff --git a/Documentation/filesystems/unionfs/issues.txt b/Documentation/filesystems/unionfs/issues.txt
index 9db1d70..bb6ab05 100644
--- a/Documentation/filesystems/unionfs/issues.txt
+++ b/Documentation/filesystems/unionfs/issues.txt
@@ -1,4 +1,4 @@
-KNOWN Unionfs 2.1 ISSUES:
+KNOWN Unionfs 2.x ISSUES:
=========================
1. Unionfs should not use lookup_one_len() on the underlying f/s as it
diff --git a/Documentation/filesystems/unionfs/usage.txt b/Documentation/filesystems/unionfs/usage.txt
index 59c4f28..1adde69 100644
--- a/Documentation/filesystems/unionfs/usage.txt
+++ b/Documentation/filesystems/unionfs/usage.txt
@@ -12,7 +12,7 @@ GENERAL SYNTAX
# mount -t unionfs -o <OPTIONS>,<BRANCH-OPTIONS> none MOUNTPOINT
-OPTIONS can be any legal combination one of:
+OPTIONS can be any legal combination of:
- ro # mount file system read-only
- rw # mount file system read-write
@@ -20,8 +20,9 @@ OPTIONS can be any legal combination one of:
- incgen # increment generation no. (see Cache Consistency below)
BRANCH-OPTIONS can be either (1) a list of branches given to the "dirs="
-option, or (2) a list of individual branch manipulation commands, described
-in the "Branch Management" section below.
+option, or (2) a list of individual branch manipulation commands, combined
+with the "remount" option, and is further described in the "Branch
+Management" section below.
The syntax for the "dirs=" mount option is:
@@ -32,7 +33,9 @@ the union, with an optional branch mode for each of those directories.
Directories that come earlier (specified first, on the left) in the list
have a higher precedence than those which come later. Additionally,
read-only or read-write permissions of the branch can be specified by
-appending =ro or =rw (default) to each directory.
+appending =ro or =rw (default) to each directory. See the Copyup section in
+concepts.txt, for a description of Unionfs's behavior when mixing read-only
+and read-write branches and mounts.
Syntax:
@@ -112,7 +115,7 @@ CACHE CONSISTENCY
=================
If you modify any file on any of the lower branches directly, while there is
-a Unionfs 2.1 mounted above any of those branches, you should tell Unionfs
+a Unionfs 2.x mounted above any of those branches, you should tell Unionfs
to purge its caches and re-get the objects. To do that, you have to
increment the generation number of the superblock using the following
command:
--
1.5.2.2
From: Hugh Dickins <[email protected]>
LTP's iogen01 doio tests hang nicely on 32-bit SMP when /tmp is a unionfs
mount of a tmpfs. See the comment on i_size_write in linux/fs.h: it needs
to be locked, otherwise i_size_read can spin forever waiting for a lost
seqcount update.
Most filesystems are already holding i_mutex for this, but unionfs calls
fsstack_copy_inode_size from many places, not necessarily holding i_mutex.
Use the low-level i_lock within fsstack_copy_inode_size when 32-bit SMP.
Checked the entire unionfs code to ensure this is the right fix for
i_size_write().
Also compared to what other file systems do when they have to handle inodes,
esp. not their own inodes (e.g., network file systems have to access the
exported file system's inodes). Found out that most such file systems not just
don't lock around i_size_write, but they don't even use i_size_read or
i_size_write to access the inode's size.
CC: Mike Halcrow <[email protected]>
Signed-off-by: Hugh Dickins <[email protected]>
Signed-off-by: Erez Zadok <[email protected]>
---
fs/stack.c | 6 ++++++
1 files changed, 6 insertions(+), 0 deletions(-)
diff --git a/fs/stack.c b/fs/stack.c
index 7913fe5..4336f2b 100644
--- a/fs/stack.c
+++ b/fs/stack.c
@@ -21,8 +21,14 @@
*/
void fsstack_copy_inode_size(struct inode *dst, const struct inode *src)
{
+#if BITS_PER_LONG == 32 && defined(CONFIG_SMP)
+ spin_lock(&dst->i_lock);
+#endif
i_size_write(dst, i_size_read(src));
dst->i_blocks = src->i_blocks;
+#if BITS_PER_LONG == 32 && defined(CONFIG_SMP)
+ spin_unlock(&dst->i_lock);
+#endif
}
EXPORT_SYMBOL_GPL(fsstack_copy_inode_size);
--
1.5.2.2
i_size_read() takes 'const struct inode *' already, as of 2.6.20.
CC: Mike Halcrow <[email protected]>
Signed-off-by: Jan Engelhardt <[email protected]>
Signed-off-by: Erez Zadok <[email protected]>
---
fs/stack.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/fs/stack.c b/fs/stack.c
index a548aac..7913fe5 100644
--- a/fs/stack.c
+++ b/fs/stack.c
@@ -21,7 +21,7 @@
*/
void fsstack_copy_inode_size(struct inode *dst, const struct inode *src)
{
- i_size_write(dst, i_size_read((struct inode *)src));
+ i_size_write(dst, i_size_read(src));
dst->i_blocks = src->i_blocks;
}
EXPORT_SYMBOL_GPL(fsstack_copy_inode_size);
--
1.5.2.2