2007-11-26 16:45:22

by Erez Zadok

[permalink] [raw]
Subject: [GIT PULL -mm] 00/16 Unionfs updates/fixes/cleanups


The following is a series of patches related to Unionfs. The main changes
here are bug fixes (mostly discovered using ltp-full-20071031), as well as
full support for splice(2) and swapon(2).

These patches were tested (where appropriate) on Linus's 2.6.24 latest code
(as of v2.6.24-rc3-19-g2ffbb83), MM (mmotm-2007-11-21-16-25), as well as the
backports to 2.6.{23,22,21,20,19,18,9} on ext2/3/4, xfs, reiserfs, nfs2/3/4,
jffs2, ramfs, tmpfs, cramfs, and squashfs (where available). See
http://unionfs.filesystems.org/ to download back-ported unionfs code.

Please pull from the 'master' branch of
git://git.kernel.org/pub/scm/linux/kernel/git/ezk/unionfs.git

to receive the following:

Erez Zadok (15):
Unionfs: use f_path instead of f_dentry/mnt
Unionfs: minor coding standards applied
Unionfs: minor cleanup in the debugging infrastructure
Unionfs: set lower mnt after mkdir which resulted in copyup
Unionfs: handle whiteouts more efficiently in filldir
Unionfs: remove useless debugging messages
Unionfs: release lower resources on successful rmdir
Unionfs: don't create whiteouts on rightmost branch
Unionfs: create opaque directories' whiteouts unconditionally
Unionfs: update times in setattr
Unionfs: reintroduce a bmap method
Unionfs: support splice(2)
Unionfs: prevent multiple writers to lower_page
Unionfs: update our inode size correctly upon partial write
Unionfs: use generic_file_aio_read/write

Hugh Dickins (1):
Unionfs: minor cleanup in writepage

commonfops.c | 5 ----
copyup.c | 4 +--
debug.c | 66 +++++++++++++++++++++++++++--------------------------------
dentry.c | 15 ++-----------
dirfops.c | 18 +++++++++++-----
dirhelper.c | 2 -
fanout.h | 2 -
file.c | 28 ++-----------------------
inode.c | 19 +++++++++++++++-
mmap.c | 52 +++++++++++++++++++++++++++++++++++-----------
rdstate.c | 11 ++++++---
subr.c | 14 ++++++++++++
union.h | 3 +-
unlink.c | 56 ++++++++++++++++++++++++++++++++++++++++++--------
14 files changed, 182 insertions(+), 113 deletions(-)

---
Erez Zadok
[email protected]


2007-11-26 16:45:54

by Erez Zadok

[permalink] [raw]
Subject: [PATCH 10/16] Unionfs: create opaque directories' whiteouts unconditionally

Needed to maintain Unix semantics (LTP testing).

Signed-off-by: Erez Zadok <[email protected]>
---
fs/unionfs/subr.c | 14 ++++++++++++++
1 files changed, 14 insertions(+), 0 deletions(-)

diff --git a/fs/unionfs/subr.c b/fs/unionfs/subr.c
index 968ee8c..1a26c57 100644
--- a/fs/unionfs/subr.c
+++ b/fs/unionfs/subr.c
@@ -162,6 +162,19 @@ int make_dir_opaque(struct dentry *dentry, int bindex)
struct dentry *lower_dentry, *diropq;
struct inode *lower_dir;
struct nameidata nd;
+ kernel_cap_t orig_cap;
+
+ /*
+ * Opaque directory whiteout markers are special files (like regular
+ * whiteouts), and should appear to the users as if they don't
+ * exist. They should be created/deleted regardless of directory
+ * search/create permissions, but only for the duration of this
+ * creation of the .wh.__dir_opaque: file. Note, this does not
+ * circumvent normal ->permission).
+ */
+ orig_cap = current->cap_effective;
+ cap_raise(current->cap_effective, CAP_DAC_READ_SEARCH);
+ cap_raise(current->cap_effective, CAP_DAC_OVERRIDE);

lower_dentry = unionfs_lower_dentry_idx(dentry, bindex);
lower_dir = lower_dentry->d_inode;
@@ -189,6 +202,7 @@ int make_dir_opaque(struct dentry *dentry, int bindex)

out:
mutex_unlock(&lower_dir->i_mutex);
+ current->cap_effective = orig_cap;
return err;
}

--
1.5.2.2

2007-11-26 16:46:19

by Erez Zadok

[permalink] [raw]
Subject: [PATCH 13/16] Unionfs: support splice(2)

Also remove redundant variable from unionfs_readpage (saves a bit on stack
space).

Signed-off-by: Erez Zadok <[email protected]>
---
fs/unionfs/file.c | 1 +
fs/unionfs/mmap.c | 10 ++++++++--
2 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/fs/unionfs/file.c b/fs/unionfs/file.c
index cfbfb8e..b7d0d55 100644
--- a/fs/unionfs/file.c
+++ b/fs/unionfs/file.c
@@ -246,4 +246,5 @@ struct file_operations unionfs_main_fops = {
.fsync = unionfs_fsync,
.fasync = unionfs_fasync,
.splice_read = generic_file_splice_read,
+ .splice_write = generic_file_splice_write,
};
diff --git a/fs/unionfs/mmap.c b/fs/unionfs/mmap.c
index fa358ef..623a913 100644
--- a/fs/unionfs/mmap.c
+++ b/fs/unionfs/mmap.c
@@ -147,7 +147,7 @@ static int unionfs_readpage(struct file *file, struct page *page)
struct inode *inode;
mm_segment_t old_fs;
char *page_data = NULL;
- loff_t offset;
+ mode_t orig_mode;

unionfs_read_lock(file->f_path.dentry->d_sb);
err = unionfs_file_revalidate(file, false);
@@ -175,11 +175,17 @@ static int unionfs_readpage(struct file *file, struct page *page)
* the necessary magic for us.
*/
lower_file->f_pos = page_offset(page);
- offset = page_offset(page);
old_fs = get_fs();
set_fs(KERNEL_DS);
+ /*
+ * generic_file_splice_write may call us on a file not opened for
+ * reading, so temporarily allow reading.
+ */
+ orig_mode = lower_file->f_mode;
+ lower_file->f_mode |= FMODE_READ;
err = vfs_read(lower_file, page_data, PAGE_CACHE_SIZE,
&lower_file->f_pos);
+ lower_file->f_mode = orig_mode;
set_fs(old_fs);
if (err >= 0 && err < PAGE_CACHE_SIZE)
memset(page_data + err, 0, PAGE_CACHE_SIZE - err);
--
1.5.2.2

2007-11-26 16:46:34

by Erez Zadok

[permalink] [raw]
Subject: [PATCH 15/16] Unionfs: update our inode size correctly upon partial write

Signed-off-by: Erez Zadok <[email protected]>
---
fs/unionfs/mmap.c | 9 ++-------
1 files changed, 2 insertions(+), 7 deletions(-)

diff --git a/fs/unionfs/mmap.c b/fs/unionfs/mmap.c
index ea5ef3d..8c07eed 100644
--- a/fs/unionfs/mmap.c
+++ b/fs/unionfs/mmap.c
@@ -250,7 +250,6 @@ static int unionfs_commit_write(struct file *file, struct page *page,
int err = -ENOMEM;
struct inode *inode, *lower_inode;
struct file *lower_file = NULL;
- loff_t pos;
unsigned bytes = to - from;
char *page_data = NULL;
mm_segment_t old_fs;
@@ -293,12 +292,8 @@ static int unionfs_commit_write(struct file *file, struct page *page,
if (err < 0)
goto out;

- inode->i_blocks = lower_inode->i_blocks;
- /* we may have to update i_size */
- pos = page_offset(page) + to;
- if (pos > i_size_read(inode))
- i_size_write(inode, pos);
- /* if vfs_write succeeded above, sync up our times */
+ /* if vfs_write succeeded above, sync up our times/sizes */
+ fsstack_copy_inode_size(inode, lower_inode);
unionfs_copy_attr_times(inode);
mark_inode_dirty_sync(inode);

--
1.5.2.2

2007-11-26 16:46:47

by Erez Zadok

[permalink] [raw]
Subject: [PATCH 07/16] Unionfs: remove useless debugging messages

These are considered normal behaviour, they don't really reveal any insight
to the person debugging the code, and they tend to clutter console messages.

Signed-off-by: Erez Zadok <[email protected]>
---
fs/unionfs/commonfops.c | 5 -----
fs/unionfs/dentry.c | 15 +++------------
2 files changed, 3 insertions(+), 17 deletions(-)

diff --git a/fs/unionfs/commonfops.c b/fs/unionfs/commonfops.c
index b33917e..f714e2f 100644
--- a/fs/unionfs/commonfops.c
+++ b/fs/unionfs/commonfops.c
@@ -648,11 +648,6 @@ int unionfs_file_release(struct inode *inode, struct file *file)

if (fileinfo->rdstate) {
fileinfo->rdstate->access = jiffies;
- pr_debug("unionfs: saving rdstate with cookie "
- "%u [%d.%lld]\n",
- fileinfo->rdstate->cookie,
- fileinfo->rdstate->bindex,
- (long long)fileinfo->rdstate->dirpos);
spin_lock(&inodeinfo->rdlock);
inodeinfo->rdcount++;
list_add_tail(&fileinfo->rdstate->cache,
diff --git a/fs/unionfs/dentry.c b/fs/unionfs/dentry.c
index edea1a4..05d9914 100644
--- a/fs/unionfs/dentry.c
+++ b/fs/unionfs/dentry.c
@@ -45,12 +45,8 @@ static bool __unionfs_d_revalidate_one(struct dentry *dentry,
verify_locked(dentry);

/* if the dentry is unhashed, do NOT revalidate */
- if (d_deleted(dentry)) {
- pr_debug("unionfs: unhashed dentry being "
- "revalidated: %*s\n",
- dentry->d_name.len, dentry->d_name.name);
+ if (d_deleted(dentry))
goto out;
- }

BUG_ON(dbstart(dentry) == -1);
if (dentry->d_inode)
@@ -460,13 +456,8 @@ static void unionfs_d_release(struct dentry *dentry)
printk(KERN_ERR "unionfs: dentry without private data: %.*s\n",
dentry->d_name.len, dentry->d_name.name);
goto out;
- } else if (dbstart(dentry) < 0) {
- /* this is due to a failed lookup */
- pr_debug("unionfs: dentry without lower "
- "dentries: %.*s\n",
- dentry->d_name.len, dentry->d_name.name);
- goto out_free;
- }
+ } else if (dbstart(dentry) < 0)
+ goto out_free; /* due to a (normal) failed lookup */

/* Release all the lower dentries */
bstart = dbstart(dentry);
--
1.5.2.2

2007-11-26 16:47:01

by Erez Zadok

[permalink] [raw]
Subject: [PATCH 11/16] Unionfs: update times in setattr

Needed to maintain Unix semantics via utimes(2).

Signed-off-by: Erez Zadok <[email protected]>
---
fs/unionfs/inode.c | 12 ++++++++++++
1 files changed, 12 insertions(+), 0 deletions(-)

diff --git a/fs/unionfs/inode.c b/fs/unionfs/inode.c
index ef61d9c..63ff3d3 100644
--- a/fs/unionfs/inode.c
+++ b/fs/unionfs/inode.c
@@ -1090,6 +1090,18 @@ static int unionfs_setattr(struct dentry *dentry, struct iattr *ia)
/* get the size from the first lower inode */
lower_inode = unionfs_lower_inode(inode);
unionfs_copy_attr_all(inode, lower_inode);
+ /*
+ * unionfs_copy_attr_all will copy the lower times to our inode if
+ * the lower ones are newer (useful for cache coherency). However,
+ * ->setattr is the only place in which we may have to copy the
+ * lower inode times absolutely, to support utimes(2).
+ */
+ if (ia->ia_valid & ATTR_MTIME_SET)
+ inode->i_mtime = lower_inode->i_mtime;
+ if (ia->ia_valid & ATTR_CTIME)
+ inode->i_ctime = lower_inode->i_ctime;
+ if (ia->ia_valid & ATTR_ATIME_SET)
+ inode->i_atime = lower_inode->i_atime;
fsstack_copy_inode_size(inode, lower_inode);
/* if setattr succeeded, then parent dir may have changed */
unionfs_copy_attr_times(dentry->d_parent->d_inode);
--
1.5.2.2

2007-11-26 16:47:23

by Erez Zadok

[permalink] [raw]
Subject: [PATCH 12/16] Unionfs: reintroduce a bmap method

This is needed for swapon(2) files in the union.

Signed-off-by: Erez Zadok <[email protected]>
---
fs/unionfs/mmap.c | 23 +++++++++++++++++++++++
1 files changed, 23 insertions(+), 0 deletions(-)

diff --git a/fs/unionfs/mmap.c b/fs/unionfs/mmap.c
index 3f65e52..fa358ef 100644
--- a/fs/unionfs/mmap.c
+++ b/fs/unionfs/mmap.c
@@ -304,10 +304,33 @@ out:
return err; /* assume all is ok */
}

+/*
+ * Although unionfs isn't a block-based file system, it may stack on one.
+ * ->bmap is needed, for example, to swapon(2) files.
+ */
+sector_t unionfs_bmap(struct address_space *mapping, sector_t block)
+{
+ int err = -EINVAL;
+ struct inode *inode, *lower_inode;
+ sector_t (*bmap)(struct address_space *, sector_t);
+
+ inode = (struct inode *)mapping->host;
+ lower_inode = unionfs_lower_inode(inode);
+ if (!lower_inode)
+ goto out;
+ bmap = lower_inode->i_mapping->a_ops->bmap;
+ if (bmap)
+ err = bmap(lower_inode->i_mapping, block);
+out:
+ return err;
+}
+
+
struct address_space_operations unionfs_aops = {
.writepage = unionfs_writepage,
.writepages = unionfs_writepages,
.readpage = unionfs_readpage,
.prepare_write = unionfs_prepare_write,
.commit_write = unionfs_commit_write,
+ .bmap = unionfs_bmap,
};
--
1.5.2.2

2007-11-26 16:47:41

by Erez Zadok

[permalink] [raw]
Subject: [PATCH 08/16] Unionfs: release lower resources on successful rmdir

This patch prevents those resources from lingering around until memory
pressure would have forced them out. The patch also properly handles
directories that have been rmdir'ed which are still some process's cwd.

CC: Hugh Dickins <[email protected]>

Signed-off-by: Erez Zadok <[email protected]>
---
fs/unionfs/inode.c | 3 ++-
fs/unionfs/mmap.c | 5 ++++-
fs/unionfs/unlink.c | 33 ++++++++++++++++++++++++++++-----
3 files changed, 34 insertions(+), 7 deletions(-)

diff --git a/fs/unionfs/inode.c b/fs/unionfs/inode.c
index 9c144be..ef61d9c 100644
--- a/fs/unionfs/inode.c
+++ b/fs/unionfs/inode.c
@@ -944,7 +944,8 @@ static int unionfs_permission(struct inode *inode, int mask,
* dentry+inode. This should be equivalent to issuing
* __unionfs_d_revalidate_chain on nd.dentry here.
*/
- err = -ESTALE; /* force revalidate */
+ if (is_file) /* dirs can be unlinked but chdir'ed to */
+ err = -ESTALE; /* force revalidate */
goto out;
}

diff --git a/fs/unionfs/mmap.c b/fs/unionfs/mmap.c
index 1e10280..3f65e52 100644
--- a/fs/unionfs/mmap.c
+++ b/fs/unionfs/mmap.c
@@ -123,8 +123,11 @@ static int unionfs_writepages(struct address_space *mapping,
struct inode *inode;

inode = mapping->host;
+ if (ibstart(inode) < 0 && ibend(inode) < 0)
+ goto out;
lower_inode = unionfs_lower_inode(inode);
- BUG_ON(!lower_inode);
+ if (!lower_inode)
+ goto out;

if (!mapping_cap_writeback_dirty(lower_inode->i_mapping))
goto out;
diff --git a/fs/unionfs/unlink.c b/fs/unionfs/unlink.c
index a8de672..f65245d 100644
--- a/fs/unionfs/unlink.c
+++ b/fs/unionfs/unlink.c
@@ -148,6 +148,7 @@ int unionfs_rmdir(struct inode *dir, struct dentry *dentry)
{
int err = 0;
struct unionfs_dir_state *namelist = NULL;
+ int dstart, dend;

unionfs_read_lock(dentry->d_sb);
unionfs_lock_dentry(dentry);
@@ -164,28 +165,50 @@ int unionfs_rmdir(struct inode *dir, struct dentry *dentry)
goto out;

err = unionfs_rmdir_first(dir, dentry, namelist);
+ dstart = dbstart(dentry);
+ dend = dbend(dentry);
/* create whiteout */
if (!err) {
- err = create_whiteout(dentry, dbstart(dentry));
+ err = create_whiteout(dentry, dstart);
} else {
int new_err;

- if (dbstart(dentry) == 0)
+ if (dstart == 0)
goto out;

/* exit if the error returned was NOT -EROFS */
if (!IS_COPYUP_ERR(err))
goto out;

- new_err = create_whiteout(dentry, dbstart(dentry) - 1);
+ new_err = create_whiteout(dentry, dstart - 1);
if (new_err != -EEXIST)
err = new_err;
}

out:
- /* call d_drop so the system "forgets" about us */
- if (!err)
+ /*
+ * Drop references to lower dentry/inode so storage space for them
+ * can be reclaimed. Then, call d_drop so the system "forgets"
+ * about us.
+ */
+ if (!err) {
+ struct inode *inode = dentry->d_inode;
+ BUG_ON(!inode);
+ iput(unionfs_lower_inode_idx(inode, dstart));
+ unionfs_set_lower_inode_idx(inode, dstart, NULL);
+ dput(unionfs_lower_dentry_idx(dentry, dstart));
+ unionfs_set_lower_dentry_idx(dentry, dstart, NULL);
+ /*
+ * If the last directory is unlinked, then mark istart/end
+ * as -1, (to maintain the invariant that if there are no
+ * lower objects, then branch index start and end are set to
+ * -1).
+ */
+ if (!unionfs_lower_inode_idx(inode, dstart) &&
+ !unionfs_lower_inode_idx(inode, dend))
+ ibstart(inode) = ibend(inode) = -1;
d_drop(dentry);
+ }

if (namelist)
free_rdstate(namelist);
--
1.5.2.2

2007-11-26 16:47:54

by Erez Zadok

[permalink] [raw]
Subject: [PATCH 05/16] Unionfs: set lower mnt after mkdir which resulted in copyup

Signed-off-by: Erez Zadok <[email protected]>
---
fs/unionfs/inode.c | 4 +++-
1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/fs/unionfs/inode.c b/fs/unionfs/inode.c
index 1708f40..9c144be 100644
--- a/fs/unionfs/inode.c
+++ b/fs/unionfs/inode.c
@@ -674,8 +674,10 @@ out:

kfree(name);

- if (!err)
+ if (!err) {
unionfs_copy_attr_times(dentry->d_inode);
+ unionfs_postcopyup_setmnt(dentry);
+ }
unionfs_check_inode(parent);
unionfs_check_dentry(dentry);
unionfs_unlock_dentry(dentry);
--
1.5.2.2

2007-11-26 16:48:17

by Erez Zadok

[permalink] [raw]
Subject: [PATCH 16/16] Unionfs: use generic_file_aio_read/write

There's no apparent need to define our own aio_read/write methods.

Signed-off-by: Erez Zadok <[email protected]>
---
fs/unionfs/file.c | 27 ++-------------------------
1 files changed, 2 insertions(+), 25 deletions(-)

diff --git a/fs/unionfs/file.c b/fs/unionfs/file.c
index b7d0d55..c922173 100644
--- a/fs/unionfs/file.c
+++ b/fs/unionfs/file.c
@@ -37,29 +37,6 @@ out:
return err;
}

-static ssize_t unionfs_aio_read(struct kiocb *iocb, const struct iovec *iov,
- unsigned long nr_segs, loff_t pos)
-{
- int err = 0;
- struct file *file = iocb->ki_filp;
-
- unionfs_read_lock(file->f_path.dentry->d_sb);
- err = unionfs_file_revalidate(file, false);
- if (unlikely(err))
- goto out;
- unionfs_check_file(file);
-
- err = generic_file_aio_read(iocb, iov, nr_segs, pos);
-
- if (err == -EIOCBQUEUED)
- err = wait_on_sync_kiocb(iocb);
-
-out:
- unionfs_check_file(file);
- unionfs_read_unlock(file->f_path.dentry->d_sb);
- return err;
-}
-
static ssize_t unionfs_write(struct file *file, const char __user *buf,
size_t count, loff_t *ppos)
{
@@ -234,9 +211,9 @@ out:
struct file_operations unionfs_main_fops = {
.llseek = generic_file_llseek,
.read = unionfs_read,
- .aio_read = unionfs_aio_read,
+ .aio_read = generic_file_aio_read,
.write = unionfs_write,
- .aio_write = generic_file_aio_write,
+ .aio_write = generic_file_aio_write,
.readdir = unionfs_file_readdir,
.unlocked_ioctl = unionfs_ioctl,
.mmap = unionfs_mmap,
--
1.5.2.2

2007-11-26 16:48:37

by Erez Zadok

[permalink] [raw]
Subject: [PATCH 01/16] Unionfs: use f_path instead of f_dentry/mnt

Signed-off-by: Erez Zadok <[email protected]>
---
fs/unionfs/debug.c | 2 +-
fs/unionfs/fanout.h | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/unionfs/debug.c b/fs/unionfs/debug.c
index 8464fbb..bc221d6 100644
--- a/fs/unionfs/debug.c
+++ b/fs/unionfs/debug.c
@@ -437,7 +437,7 @@ void __unionfs_check_nd(const struct nameidata *nd,
if (nd->flags & LOOKUP_OPEN) {
file = nd->intent.open.file;
if (unlikely(file->f_path.dentry &&
- strcmp(file->f_dentry->d_sb->s_type->name,
+ strcmp(file->f_path.dentry->d_sb->s_type->name,
UNIONFS_NAME))) {
PRINT_CALLER(fname, fxn, line);
pr_debug(" CND1: lower_file of type %s\n",
diff --git a/fs/unionfs/fanout.h b/fs/unionfs/fanout.h
index ec18013..864383e 100644
--- a/fs/unionfs/fanout.h
+++ b/fs/unionfs/fanout.h
@@ -106,7 +106,7 @@ static inline void unionfs_set_lower_file_idx(struct file *f, int index,
UNIONFS_F(f)->lower_files[index] = val;
/* save branch ID (may be redundant?) */
UNIONFS_F(f)->saved_branch_ids[index] =
- branch_id((f)->f_dentry->d_sb, index);
+ branch_id((f)->f_path.dentry->d_sb, index);
}

static inline void unionfs_set_lower_file(struct file *f, struct file *val)
--
1.5.2.2

2007-11-26 16:48:58

by Erez Zadok

[permalink] [raw]
Subject: [PATCH 02/16] Unionfs: minor cleanup in writepage

From: Hugh Dickins <[email protected]>

Signed-off-by: Erez Zadok <[email protected]>
---
fs/unionfs/mmap.c | 4 ++--
1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/unionfs/mmap.c b/fs/unionfs/mmap.c
index 4918f77..1e10280 100644
--- a/fs/unionfs/mmap.c
+++ b/fs/unionfs/mmap.c
@@ -56,6 +56,7 @@ static int unionfs_writepage(struct page *page, struct writeback_control *wbc)
copy_highpage(lower_page, page);
flush_dcache_page(lower_page);
SetPageUptodate(lower_page);
+ set_page_dirty(lower_page);

/*
* Call lower writepage (expects locked page). However, if we are
@@ -66,12 +67,11 @@ static int unionfs_writepage(struct page *page, struct writeback_control *wbc)
* success.
*/
if (wbc->for_reclaim) {
- set_page_dirty(lower_page);
unlock_page(lower_page);
goto out_release;
}
+
BUG_ON(!lower_mapping->a_ops->writepage);
- set_page_dirty(lower_page);
clear_page_dirty_for_io(lower_page); /* emulate VFS behavior */
err = lower_mapping->a_ops->writepage(lower_page, wbc);
if (err < 0)
--
1.5.2.2

2007-11-26 16:49:30

by Erez Zadok

[permalink] [raw]
Subject: [PATCH 03/16] Unionfs: minor coding standards applied

Signed-off-by: Erez Zadok <[email protected]>
---
fs/unionfs/copyup.c | 4 ++--
fs/unionfs/dirfops.c | 5 +++--
2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/fs/unionfs/copyup.c b/fs/unionfs/copyup.c
index 98bed0b..3fe4865 100644
--- a/fs/unionfs/copyup.c
+++ b/fs/unionfs/copyup.c
@@ -460,8 +460,8 @@ int copyup_dentry(struct inode *dir, struct dentry *dentry, int bstart,
goto out_unlink;

/* Set permissions. */
- if ((err = copyup_permissions(sb, old_lower_dentry,
- new_lower_dentry)))
+ err = copyup_permissions(sb, old_lower_dentry, new_lower_dentry);
+ if (err)
goto out_unlink;

#ifdef CONFIG_UNION_FS_XATTR
diff --git a/fs/unionfs/dirfops.c b/fs/unionfs/dirfops.c
index c644c13..5276cb3 100644
--- a/fs/unionfs/dirfops.c
+++ b/fs/unionfs/dirfops.c
@@ -77,8 +77,9 @@ static int unionfs_filldir(void *dirent, const char *name, int namelen,
goto out;
}
buf->entries_written++;
- if ((err = add_filldir_node(buf->rdstate, name, namelen,
- buf->rdstate->bindex, is_wh_entry)))
+ err = add_filldir_node(buf->rdstate, name, namelen,
+ buf->rdstate->bindex, is_wh_entry);
+ if (err)
buf->filldir_error = err;

out:
--
1.5.2.2

2007-11-26 16:49:44

by Erez Zadok

[permalink] [raw]
Subject: [PATCH 06/16] Unionfs: handle whiteouts more efficiently in filldir

If we cache a dirent for file "foo", and then it gets deleted, then we look
for a ".wh.foo" whiteout entry in the same dirent cache. But our dirent
cache strips the ".wh." prefix, thus looking for an entry named "foo" whose
filldir_node->whiteout should be 1 instead of 0. In that case, don't
display an incorrect printk message that the file system may be corrupt,
but set that filldir_node->whiteout to 1.

CC: Hugh Dickins <[email protected]>

Signed-off-by: Erez Zadok <[email protected]>
---
fs/unionfs/dirfops.c | 13 ++++++++++---
fs/unionfs/dirhelper.c | 2 +-
fs/unionfs/rdstate.c | 11 +++++++----
fs/unionfs/union.h | 3 ++-
4 files changed, 20 insertions(+), 9 deletions(-)

diff --git a/fs/unionfs/dirfops.c b/fs/unionfs/dirfops.c
index 5276cb3..88df635 100644
--- a/fs/unionfs/dirfops.c
+++ b/fs/unionfs/dirfops.c
@@ -53,10 +53,17 @@ static int unionfs_filldir(void *dirent, const char *name, int namelen,
is_wh_entry = 1;
}

- found = find_filldir_node(buf->rdstate, name, namelen);
-
- if (found)
+ found = find_filldir_node(buf->rdstate, name, namelen, is_wh_entry);
+
+ if (found) {
+ /*
+ * If we had non-whiteout entry in dir cache, then mark it
+ * as a whiteout and but leave it in the dir cache.
+ */
+ if (is_wh_entry && !found->whiteout)
+ found->whiteout = is_wh_entry;
goto out;
+ }

/* if 'name' isn't a whiteout, filldir it. */
if (!is_wh_entry) {
diff --git a/fs/unionfs/dirhelper.c b/fs/unionfs/dirhelper.c
index 7a28444..2e52fc3 100644
--- a/fs/unionfs/dirhelper.c
+++ b/fs/unionfs/dirhelper.c
@@ -158,7 +158,7 @@ static int readdir_util_callback(void *dirent, const char *name, int namelen,
whiteout = 1;
}

- found = find_filldir_node(buf->rdstate, name, namelen);
+ found = find_filldir_node(buf->rdstate, name, namelen, whiteout);
/* If it was found in the table there was a previous whiteout. */
if (found)
goto out;
diff --git a/fs/unionfs/rdstate.c b/fs/unionfs/rdstate.c
index bdf335d..7ba1e1a 100644
--- a/fs/unionfs/rdstate.c
+++ b/fs/unionfs/rdstate.c
@@ -188,7 +188,8 @@ void free_rdstate(struct unionfs_dir_state *state)
}

struct filldir_node *find_filldir_node(struct unionfs_dir_state *rdstate,
- const char *name, int namelen)
+ const char *name, int namelen,
+ int is_whiteout)
{
int index;
unsigned int hash;
@@ -215,10 +216,12 @@ struct filldir_node *find_filldir_node(struct unionfs_dir_state *rdstate,
found = 1;

/*
- * if the duplicate is in this branch, then the file
- * system is corrupted.
+ * if a duplicate is found in this branch, and is
+ * not due to the caller looking for an entry to
+ * whiteout, then the file system may be corrupted.
*/
- if (unlikely(cursor->bindex == rdstate->bindex))
+ if (unlikely(!is_whiteout &&
+ cursor->bindex == rdstate->bindex))
printk(KERN_ERR "unionfs: filldir: possible "
"I/O error: a file is duplicated "
"in the same branch %d: %s\n",
diff --git a/fs/unionfs/union.h b/fs/unionfs/union.h
index f61e32d..283b085 100644
--- a/fs/unionfs/union.h
+++ b/fs/unionfs/union.h
@@ -223,7 +223,8 @@ extern int add_filldir_node(struct unionfs_dir_state *rdstate,
const char *name, int namelen, int bindex,
int whiteout);
extern struct filldir_node *find_filldir_node(struct unionfs_dir_state *rdstate,
- const char *name, int namelen);
+ const char *name, int namelen,
+ int is_whiteout);

extern struct dentry **alloc_new_dentries(int objs);
extern struct unionfs_data *alloc_new_data(int objs);
--
1.5.2.2

2007-11-26 16:50:01

by Erez Zadok

[permalink] [raw]
Subject: [PATCH 04/16] Unionfs: minor cleanup in the debugging infrastructure

Signed-off-by: Erez Zadok <[email protected]>
---
fs/unionfs/debug.c | 64 ++++++++++++++++++++++++---------------------------
1 files changed, 30 insertions(+), 34 deletions(-)

diff --git a/fs/unionfs/debug.c b/fs/unionfs/debug.c
index bc221d6..c2b8b58 100644
--- a/fs/unionfs/debug.c
+++ b/fs/unionfs/debug.c
@@ -97,24 +97,22 @@ void __unionfs_check_inode(const struct inode *inode,
"istart/end=%d:%d\n", inode,
lower_inode, bindex, istart, iend);
}
- } else { /* lower_inode == NULL */
- if (bindex >= istart && bindex <= iend) {
- /*
- * directories can have NULL lower inodes in
- * b/t start/end, but NOT if at the
- * start/end range.
- */
- if (unlikely(!(S_ISDIR(inode->i_mode) &&
- bindex > istart &&
- bindex < iend))) {
- PRINT_CALLER(fname, fxn, line);
- pr_debug(" Ci7: inode/linode=%p:%p "
- "bindex=%d istart/end=%d:%d\n",
- inode, lower_inode, bindex,
- istart, iend);
- }
- }
+ continue;
}
+ /* if we get here, then lower_inode == NULL */
+ if (bindex < istart || bindex > iend)
+ continue;
+ /*
+ * directories can have NULL lower inodes in b/t start/end,
+ * but NOT if at the start/end range.
+ */
+ if (unlikely(S_ISDIR(inode->i_mode) &&
+ bindex > istart && bindex < iend))
+ continue;
+ PRINT_CALLER(fname, fxn, line);
+ pr_debug(" Ci7: inode/linode=%p:%p "
+ "bindex=%d istart/end=%d:%d\n",
+ inode, lower_inode, bindex, istart, iend);
}
}

@@ -274,24 +272,22 @@ check_inode:
"istart/end=%d:%d\n", dentry,
lower_inode, bindex, istart, iend);
}
- } else { /* lower_inode == NULL */
- if (bindex >= istart && bindex <= iend) {
- /*
- * directories can have NULL lower inodes in
- * b/t start/end, but NOT if at the
- * start/end range.
- */
- if (unlikely(!(S_ISDIR(inode->i_mode) &&
- bindex > istart &&
- bindex < iend))) {
- PRINT_CALLER(fname, fxn, line);
- pr_debug(" CI7: dentry/linode=%p:%p "
- "bindex=%d istart/end=%d:%d\n",
- dentry, lower_inode, bindex,
- istart, iend);
- }
- }
+ continue;
}
+ /* if we get here, then lower_inode == NULL */
+ if (bindex < istart || bindex > iend)
+ continue;
+ /*
+ * directories can have NULL lower inodes in b/t start/end,
+ * but NOT if at the start/end range.
+ */
+ if (unlikely(S_ISDIR(inode->i_mode) &&
+ bindex > istart && bindex < iend))
+ continue;
+ PRINT_CALLER(fname, fxn, line);
+ pr_debug(" CI7: dentry/linode=%p:%p "
+ "bindex=%d istart/end=%d:%d\n",
+ dentry, lower_inode, bindex, istart, iend);
}

/*
--
1.5.2.2

2007-11-26 16:50:40

by Erez Zadok

[permalink] [raw]
Subject: [PATCH 14/16] Unionfs: prevent multiple writers to lower_page

Without this patch, the LTP fs test "rwtest04" triggers a
BUG_ON(PageWriteback(page)) in fs/buffer.c:1706.

CC: Hugh Dickins <[email protected]>

Signed-off-by: Erez Zadok <[email protected]>
---
fs/unionfs/mmap.c | 1 +
1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/fs/unionfs/mmap.c b/fs/unionfs/mmap.c
index 623a913..ea5ef3d 100644
--- a/fs/unionfs/mmap.c
+++ b/fs/unionfs/mmap.c
@@ -73,6 +73,7 @@ static int unionfs_writepage(struct page *page, struct writeback_control *wbc)

BUG_ON(!lower_mapping->a_ops->writepage);
clear_page_dirty_for_io(lower_page); /* emulate VFS behavior */
+ wait_on_page_writeback(lower_page); /* prevent multiple writers */
err = lower_mapping->a_ops->writepage(lower_page, wbc);
if (err < 0)
goto out_release;
--
1.5.2.2

2007-11-26 16:50:54

by Erez Zadok

[permalink] [raw]
Subject: [PATCH 09/16] Unionfs: don't create whiteouts on rightmost branch

If we are unlinking/rmdir'ing an object on the rightmost branch, there's no
need to create a whiteout there: this saves on storage space and inodes.
Also, in the (degenerate) case of having only one branch, this really saves
on whiteouts.

CC: Hugh Dickins <[email protected]>

Signed-off-by: Erez Zadok <[email protected]>
---
fs/unionfs/unlink.c | 25 +++++++++++++++++++++----
1 files changed, 21 insertions(+), 4 deletions(-)

diff --git a/fs/unionfs/unlink.c b/fs/unionfs/unlink.c
index f65245d..423ff36 100644
--- a/fs/unionfs/unlink.c
+++ b/fs/unionfs/unlink.c
@@ -53,14 +53,23 @@ static int unionfs_unlink_whiteout(struct inode *dir, struct dentry *dentry)
if (err && !IS_COPYUP_ERR(err))
goto out;

+ /*
+ * We create whiteouts if (1) there was an error unlinking the main
+ * file; (2) there is a lower priority file with the same name
+ * (dbopaque); (3) the branch in which the file is not the last
+ * (rightmost0 branch. The last rule is an optimization to avoid
+ * creating all those whiteouts if there's no chance they'd be
+ * masking any lower-priority branch, as well as unionfs is used
+ * with only one branch (using only one branch, while odd, is still
+ * possible).
+ */
if (err) {
if (dbstart(dentry) == 0)
goto out;
err = create_whiteout(dentry, dbstart(dentry) - 1);
} else if (dbopaque(dentry) != -1) {
- /* There is a lower lower-priority file with the same name. */
err = create_whiteout(dentry, dbopaque(dentry));
- } else {
+ } else if (dbstart(dentry) < sbend(dentry->d_sb)) {
err = create_whiteout(dentry, dbstart(dentry));
}

@@ -167,9 +176,17 @@ int unionfs_rmdir(struct inode *dir, struct dentry *dentry)
err = unionfs_rmdir_first(dir, dentry, namelist);
dstart = dbstart(dentry);
dend = dbend(dentry);
- /* create whiteout */
+ /*
+ * We create a whiteout for the directory if there was an error to
+ * rmdir the first directory entry in the union. Otherwise, we
+ * create a whiteout only if there is no chance that a lower
+ * priority branch might also have the same named directory. IOW,
+ * if there is not another same-named directory at a lower priority
+ * branch, then we don't need to create a whiteout for it.
+ */
if (!err) {
- err = create_whiteout(dentry, dstart);
+ if (dstart < dend)
+ err = create_whiteout(dentry, dstart);
} else {
int new_err;

--
1.5.2.2