2006-09-01 01:35:33

by Josef 'Jeff' Sipek

Subject: [PATCH 00/22][RFC] Unionfs: Stackable Namespace Unification Filesystem

This set of patches constitutes Unionfs version 2.0. We are presenting it to
be reviewed and considered for inclusion into the kernel.

Unionfs is a stackable filesystem based on the FiST stackable filesystem
framework written by Erez Zadok: see <http://filesystems.org/>.

Josef Sipek presented Unionfs at the 2006 Ottawa Linux Symposium; the
high-level overview from this year's symposium starts on page 349 of the
second half of the symposium proceedings: see

<http://www.linuxsymposium.org/2006/linuxsymposium_procv2.pdf>

Note that this set of patches contains a considerably trimmed-down version
of Unionfs. That way it'd be possible to evaluate Unionfs's most basic
functionality, gradually adding features in future patches.

To download tarballs of the full source and the userspace utilities, or to
read documentation and other information about Unionfs, see the home page at

<http://www.unionfs.org>

Josef "Jeff" Sipek, on behalf of the Unionfs team.

Index:
=======
[PATCH 01/22][RFC] Unionfs: Documentation
[PATCH 02/22][RFC] Unionfs: Kconfig and Makefile
[PATCH 03/22][RFC] Unionfs: Branch management functionality
[PATCH 04/22][RFC] Unionfs: Common file operations
[PATCH 05/22][RFC] Unionfs: Copyup Functionality
[PATCH 06/22][RFC] Unionfs: Dentry operations
[PATCH 07/22][RFC] Unionfs: Directory file operations
[PATCH 08/22][RFC] Unionfs: Directory manipulation helper functions
[PATCH 09/22][RFC] Unionfs: File operations
[PATCH 10/22][RFC] Unionfs: Inode operations
[PATCH 11/22][RFC] Unionfs: Lookup helper functions
[PATCH 12/22][RFC] Unionfs: Main module functions
[PATCH 13/22][RFC] Unionfs: Readdir state
[PATCH 14/22][RFC] Unionfs: Rename
[PATCH 15/22][RFC] Unionfs: Privileged operations workqueue
[PATCH 16/22][RFC] Unionfs: Handling of stale inodes
[PATCH 17/22][RFC] Unionfs: Miscellaneous helper functions
[PATCH 18/22][RFC] Unionfs: Superblock operations
[PATCH 19/22][RFC] Unionfs: Helper macros/inlines
[PATCH 20/22][RFC] Unionfs: Internal include file
[PATCH 21/22][RFC] Unionfs: Unlink
[PATCH 22/22][RFC] Unionfs: Include file

Known Issues and Limitations:

- The NFS server returns -EACCES for read-only exports, instead of -EROFS.
This means we can't reliably detect a read-only NFS export.

- Modifying a Unionfs branch directly, while the union is mounted, is
currently unsupported. Any such change may cause Unionfs to oops and it
can even result in data loss!

- The PPC module loading algorithm has an O(N^2) loop, so it takes a while
to load the Unionfs module, because we have lots of symbols.

- Unionfs shouldn't use lookup_one_len on the underlying fs as it confuses
NFS.

For the initial release we removed support for xattrs, persistent inode
mappings, and mmap operations.

Signed-off-by: Josef "Jeff" Sipek <[email protected]>
Signed-off-by: David Quigley <[email protected]>
Signed-off-by: Erez Zadok <[email protected]>


2006-09-01 01:37:43

by Josef 'Jeff' Sipek

Subject: [PATCH 01/22][RFC] Unionfs: Documentation

From: David Quigley <[email protected]>

This patch contains documentation for Unionfs. You will find several files
outlining basic unification concepts and rename semantics.

Signed-off-by: David Quigley <[email protected]>
Signed-off-by: Josef "Jeff" Sipek <[email protected]>
Signed-off-by: Erez Zadok <[email protected]>

---

Documentation/filesystems/00-INDEX | 2
Documentation/filesystems/unionfs/00-INDEX | 8 ++
Documentation/filesystems/unionfs/concepts.txt | 68 +++++++++++++++++++++++++
Documentation/filesystems/unionfs/rename.txt | 31 +++++++++++
Documentation/filesystems/unionfs/usage.txt | 44 ++++++++++++++++
5 files changed, 153 insertions(+)

diff -Nur -x linux-2.6-git/Documentation/dontdiff linux-2.6-git/Documentation/filesystems/00-INDEX linux-2.6-git-unionfs/Documentation/filesystems/00-INDEX
--- linux-2.6-git/Documentation/filesystems/00-INDEX 2006-08-31 18:43:38.000000000 -0400
+++ linux-2.6-git-unionfs/Documentation/filesystems/00-INDEX 2006-08-31 19:03:48.000000000 -0400
@@ -82,6 +82,8 @@
- info and mount options for the UDF filesystem.
ufs.txt
- info on the ufs filesystem.
+unionfs/
+ - info on the unionfs filesystem
v9fs.txt
- v9fs is a Unix implementation of the Plan 9 9p remote fs protocol.
vfat.txt
diff -Nur -x linux-2.6-git/Documentation/dontdiff linux-2.6-git/Documentation/filesystems/unionfs/00-INDEX linux-2.6-git-unionfs/Documentation/filesystems/unionfs/00-INDEX
--- linux-2.6-git/Documentation/filesystems/unionfs/00-INDEX 1969-12-31 19:00:00.000000000 -0500
+++ linux-2.6-git-unionfs/Documentation/filesystems/unionfs/00-INDEX 2006-08-31 19:26:15.000000000 -0400
@@ -0,0 +1,8 @@
+00-INDEX
+ - this file.
+concepts.txt
+ - A brief introduction of concepts
+rename.txt
+ - Information regarding rename operations
+usage.txt
+ - Usage & known limitations
diff -Nur -x linux-2.6-git/Documentation/dontdiff linux-2.6-git/Documentation/filesystems/unionfs/concepts.txt linux-2.6-git-unionfs/Documentation/filesystems/unionfs/concepts.txt
--- linux-2.6-git/Documentation/filesystems/unionfs/concepts.txt 1969-12-31 19:00:00.000000000 -0500
+++ linux-2.6-git-unionfs/Documentation/filesystems/unionfs/concepts.txt 2006-08-31 19:03:48.000000000 -0400
@@ -0,0 +1,68 @@
+This file describes the concepts needed by a name-space unification file system.
+
+Branch Priority:
+================
+
+Each branch is assigned a unique priority - starting from 0 (highest priority).
+No two branches can have the same priority.
+
+
+Branch Mode:
+============
+
+Each branch is assigned a mode - read-write or read-only. This allows
+directories on media mounted read-write to be used in a read-only manner.
+
+
+Whiteouts:
+==========
+
+A whiteout removes a file name from the name-space. Whiteouts are needed when
+one attempts to remove a file on a read-only branch.
+
+Suppose we have a two-branch union, where branch 0 is read-write and branch 1
+is read-only, and a file 'foo' exists on branch 1:
+
+./b0/
+./b1/
+./b1/foo
+
+The unified view would simply be:
+
+./union/
+./union/foo
+
+Since 'foo' is stored on a read-only branch, it cannot be removed. A whiteout
+is used to remove the name 'foo' from the unified name-space. Again, since
+branch 1 is read-only, the whiteout cannot be created there. So, we try on a
+higher priority (lower numerically) branch. And there we create the whiteout.
+
+./b0/
+./b0/.wh.foo
+./b1/
+./b1/foo
+
+Later, when Unionfs traverses branches (due to lookup or readdir), it eliminates
+'foo' from the name-space (as well as the whiteout itself).
+
+
+Duplicate Elimination:
+======================
+
+It is possible for files on different branches to have the same name. Unionfs
+then has to select which instance of the file to show to the user. Given the
+fact that each branch has a priority associated with it, the simplest
+solution is to take the instance from the highest priority (lowest numerical
+value) and "hide" the others.
+
+
+Copyup:
+=======
+
+When a change is made to a file's data or meta-data, the change has to be
+stored somewhere. The best way is to create a copy of the original file on a
+branch that is writable, and then redirect the write through to this copy.
+The copy must be made on a higher priority branch so that lookup & readdir
+return this newer "version" of the file rather than the original (see
+duplicate elimination).
+
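The whiteout and duplicate-elimination rules described in concepts.txt can be
sketched in userspace C. This is only an illustration under stated
assumptions, not the lookup code from this patch set: the struct branch type,
the in-memory name lists, and the union_lookup() helper are all hypothetical.
The sketch also assumes that a whiteout hides the name in its own branch as
well as in every lower priority branch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define WH_PREFIX ".wh."

/* Hypothetical model: a branch is a NULL-terminated list of file names. */
struct branch {
	const char *const *names;
};

/* Return 1 if 'name' appears in the branch's name list. */
static int branch_has(const struct branch *b, const char *name)
{
	const char *const *p;

	for (p = b->names; *p; p++)
		if (strcmp(*p, name) == 0)
			return 1;
	return 0;
}

/* Return 1 if the branch holds a whiteout (".wh.<name>") for 'name'. */
static int branch_has_whiteout(const struct branch *b, const char *name)
{
	const char *const *p;
	size_t plen = strlen(WH_PREFIX);

	for (p = b->names; *p; p++)
		if (strncmp(*p, WH_PREFIX, plen) == 0 &&
		    strcmp(*p + plen, name) == 0)
			return 1;
	return 0;
}

/*
 * Walk the branches from highest priority (index 0) downwards. Returns the
 * index of the branch that supplies 'name', or -1 if the name does not
 * exist or is hidden by a whiteout. Returning at the first match is what
 * implements duplicate elimination: lower priority copies are never seen.
 */
static int union_lookup(const struct branch *br, int nbranches,
			const char *name)
{
	int i;

	assert(br != NULL && nbranches >= 0);

	/* Whiteout entries themselves are never visible in the union. */
	if (strncmp(name, WH_PREFIX, strlen(WH_PREFIX)) == 0)
		return -1;

	for (i = 0; i < nbranches; i++) {
		if (branch_has_whiteout(&br[i], name))
			return -1;
		if (branch_has(&br[i], name))
			return i;
	}
	return -1;
}
```

With branch 0 holding ".wh.foo" and "bar", and branch 1 holding "foo", "bar",
and "baz", a lookup of "foo" finds nothing, "bar" resolves to branch 0, and
"baz" resolves to branch 1.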
diff -Nur -x linux-2.6-git/Documentation/dontdiff linux-2.6-git/Documentation/filesystems/unionfs/rename.txt linux-2.6-git-unionfs/Documentation/filesystems/unionfs/rename.txt
--- linux-2.6-git/Documentation/filesystems/unionfs/rename.txt 1969-12-31 19:00:00.000000000 -0500
+++ linux-2.6-git-unionfs/Documentation/filesystems/unionfs/rename.txt 2006-08-31 19:03:48.000000000 -0400
@@ -0,0 +1,31 @@
+Rename is a complex beast. The following table shows which rename(2) operations
+should succeed and which should fail.
+
+o: success
+E: error (either unionfs or vfs)
+X: EXDEV
+
+none = file does not exist
+file = file is a file
+dir = file is an empty directory
+child= file is a non-empty directory
+wh = file is a directory containing only whiteouts; this makes it logically
+ empty
+
+ none file dir child wh
+file o o E E E
+dir o E o E o
+child X E X E X
+wh o E o E o
+
+
+Renaming directories:
+=====================
+
+Whenever an empty (either physically or logically) directory is being renamed,
+the following sequence of events should take place:
+
+1) Remove whiteouts from both source and destination directory
+2) Rename source to destination
+3) Make destination opaque to prevent anything under it from showing up
+
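The rename table in rename.txt can be encoded as a small lookup table. The
sketch below is an illustration only. It assumes that rows are the source and
columns are the destination (the text does not say so explicitly, but only the
columns include "none"), and the enum obj_kind type and rename_outcome()
helper are hypothetical names that do not appear in the patch.

```c
#include <assert.h>

/* Object kinds, in the same order as the table's column headers. */
enum obj_kind { K_NONE, K_FILE, K_DIR, K_CHILD, K_WH, K_MAX };

/*
 * One string per source kind; each character is the outcome for the
 * destination kind at that column: 'o' success, 'E' error, 'X' EXDEV.
 * There is no K_NONE row because a rename source must exist.
 */
static const char rename_table[K_MAX][K_MAX + 1] = {
	[K_FILE]  = "ooEEE",
	[K_DIR]   = "oEoEo",
	[K_CHILD] = "XEXEX",
	[K_WH]    = "oEoEo",
};

/* Look up the expected outcome of renaming 'src' onto 'dst'. */
static char rename_outcome(enum obj_kind src, enum obj_kind dst)
{
	assert(src > K_NONE && src < K_MAX);
	assert(dst >= K_NONE && dst < K_MAX);
	return rename_table[src][dst];
}
```

For example, rename_outcome(K_CHILD, K_NONE) yields 'X', matching the table
entry for a non-empty directory renamed to a name that does not exist.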
diff -Nur -x linux-2.6-git/Documentation/dontdiff linux-2.6-git/Documentation/filesystems/unionfs/usage.txt linux-2.6-git-unionfs/Documentation/filesystems/unionfs/usage.txt
--- linux-2.6-git/Documentation/filesystems/unionfs/usage.txt 1969-12-31 19:00:00.000000000 -0500
+++ linux-2.6-git-unionfs/Documentation/filesystems/unionfs/usage.txt 2006-08-31 19:25:19.000000000 -0400
@@ -0,0 +1,44 @@
+Unionfs is a stackable unification file system, which can appear to merge the
+contents of several directories (branches), while keeping their physical
+content separate. Unionfs is useful for unified source tree management, merged
+contents of split CD-ROM, merged separate software package directories, data
+grids, and more. Unionfs allows any mix of read-only and read-write branches,
+as well as insertion and deletion of branches anywhere in the fan-out. To
+maintain unix semantics, Unionfs handles elimination of duplicates,
+partial-error conditions, and more.
+
+mount -t unionfs -o branch-option[,union-options[,...]] none unionfs
+
+The available branch-option for the mount command is:
+
+dirs=branch[=ro|=rw][:...]
+specifies a colon-separated list of directories that compose the union.
+Directories that come earlier in the list have a higher precedence than those
+which come later. Additionally, read-only or read-write permissions of the
+branch can be specified by appending =ro or =rw (default) to each directory.
+
+Syntax:
+dirs=/branch1[=ro|=rw]:/branch2[=ro|=rw]:...:/branchN[=ro|=rw]
+
+Example:
+dirs=/writable_branch=rw:/read-only_branch=ro
+
+
+KNOWN ISSUES:
+=============
+
+The NFS server returns -EACCES for read-only exports, instead of -EROFS. This
+means we can't reliably detect a read-only NFS export. If you want to use
+copy-on-write with NFS, set the per-branch option `nfsro'. If this flag is
+set, then Unionfs will apply standard Unix permissions to files on this
+nfs-mounted branch.
+
+Modifying a Unionfs branch directly, while the union is mounted, is currently
+unsupported. Any such change can cause Unionfs to oops, and it could even
+RESULT IN DATA LOSS.
+
+The PPC module loading algorithm has an O(N^2) loop, so it takes a while to
+load the Unionfs module, because we have lots of symbols.
+
+Unionfs shouldn't use lookup_one_len on the underlying fs as it confuses NFS.
+
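The dirs= syntax documented in usage.txt can be parsed along the following
lines. This is a userspace sketch, not the option parser from the patch set:
struct branch_spec, parse_dirs(), and the size limits are hypothetical, and
the sketch assumes that branch paths contain neither ':' nor '='.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_BRANCHES 8		/* hypothetical limit for this sketch */
#define MAX_PATH_LEN 256	/* hypothetical limit for this sketch */

struct branch_spec {
	char path[MAX_PATH_LEN];
	int readonly;		/* 1 for =ro, 0 for =rw or no suffix */
};

/*
 * Parse "/a=rw:/b=ro:/c" into specs[]. Earlier entries have higher
 * precedence. Returns the number of branches, or -1 on malformed input.
 */
static int parse_dirs(const char *dirs, struct branch_spec *specs, int max)
{
	const char *p = dirs;
	int n = 0;

	assert(dirs != NULL && specs != NULL);

	while (*p) {
		const char *end = strchr(p, ':');
		size_t len = end ? (size_t)(end - p) : strlen(p);
		const char *eq = memchr(p, '=', len);
		size_t plen = eq ? (size_t)(eq - p) : len;
		int ro = 0;	/* read-write is the documented default */

		if (n >= max || plen == 0 || plen >= MAX_PATH_LEN)
			return -1;

		if (eq) {
			size_t mlen = len - plen - 1;

			if (mlen == 2 && strncmp(eq + 1, "ro", 2) == 0)
				ro = 1;
			else if (mlen == 2 && strncmp(eq + 1, "rw", 2) == 0)
				ro = 0;
			else
				return -1;	/* unknown mode suffix */
		}

		memcpy(specs[n].path, p, plen);
		specs[n].path[plen] = '\0';
		specs[n].readonly = ro;
		n++;

		if (!end)
			break;
		p = end + 1;
	}
	return n;
}
```

Parsing the example from usage.txt, "/writable_branch=rw:/read-only_branch=ro",
gives two entries: the first read-write and the second read-only.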

2006-09-01 01:39:35

by Josef 'Jeff' Sipek

Subject: [PATCH 02/22][RFC] Unionfs: Kconfig and Makefile

From: David Quigley <[email protected]>

This patch contains the changes to the fs Kconfig file, the Makefiles, and the
MAINTAINERS file for Unionfs.

Signed-off-by: David Quigley <[email protected]>
Signed-off-by: Josef "Jeff" Sipek <[email protected]>
Signed-off-by: Erez Zadok <[email protected]>

---

MAINTAINERS | 7 +++++++
fs/Kconfig | 10 ++++++++++
fs/Makefile | 1 +
fs/unionfs/Makefile | 5 +++++
4 files changed, 23 insertions(+)

diff -Nur -x linux-2.6-git/Documentation/dontdiff linux-2.6-git/fs/Kconfig linux-2.6-git-unionfs/fs/Kconfig
--- linux-2.6-git/fs/Kconfig 2006-08-31 18:43:47.000000000 -0400
+++ linux-2.6-git-unionfs/fs/Kconfig 2006-08-31 19:04:00.000000000 -0400
@@ -1394,6 +1394,16 @@
Y here. This will result in _many_ additional debugging messages to be
written to the system log.

+config UNION_FS
+ tristate "Stackable namespace unification file system"
+ depends on EXPERIMENTAL
+ help
+ Unionfs is a stackable unification file system, which appears to
+ merge the contents of several directories (branches), while keeping
+ their physical content separate.
+
+ See <http://www.unionfs.org> for details
+
endmenu

menu "Network File Systems"
diff -Nur -x linux-2.6-git/Documentation/dontdiff linux-2.6-git/fs/Makefile linux-2.6-git-unionfs/fs/Makefile
--- linux-2.6-git/fs/Makefile 2006-08-31 18:43:47.000000000 -0400
+++ linux-2.6-git-unionfs/fs/Makefile 2006-08-31 19:04:00.000000000 -0400
@@ -102,3 +102,4 @@
obj-$(CONFIG_HPPFS) += hppfs/
obj-$(CONFIG_DEBUG_FS) += debugfs/
obj-$(CONFIG_OCFS2_FS) += ocfs2/
+obj-$(CONFIG_UNION_FS) += unionfs/
diff -Nur -x linux-2.6-git/Documentation/dontdiff linux-2.6-git/fs/unionfs/Makefile linux-2.6-git-unionfs/fs/unionfs/Makefile
--- linux-2.6-git/fs/unionfs/Makefile 1969-12-31 19:00:00.000000000 -0500
+++ linux-2.6-git-unionfs/fs/unionfs/Makefile 2006-08-31 19:04:00.000000000 -0400
@@ -0,0 +1,5 @@
+obj-$(CONFIG_UNION_FS) += unionfs.o
+
+unionfs-objs := subr.o dentry.o file.o inode.o main.o super.o \
+ stale_inode.o branchman.o rdstate.o copyup.o dirhelper.o \
+ rename.o unlink.o lookup.o commonfops.o dirfops.o sioq.o
diff -Nur -x linux-2.6-git/Documentation/dontdiff linux-2.6-git/MAINTAINERS linux-2.6-git-unionfs/MAINTAINERS
--- linux-2.6-git/MAINTAINERS 2006-08-31 18:43:38.000000000 -0400
+++ linux-2.6-git-unionfs/MAINTAINERS 2006-08-31 19:03:49.000000000 -0400
@@ -2921,6 +2921,13 @@
W: http://www.kernel.dk
S: Maintained

+UNIONFS
+P: Josef "Jeff" Sipek
+M: [email protected]
+L: [email protected]
+W: http://www.unionfs.org
+S: Maintained
+
USB ACM DRIVER
P: Oliver Neukum
M: [email protected]

2006-09-01 01:40:23

by Josef 'Jeff' Sipek

Subject: [PATCH 03/22][RFC] Unionfs: Branch management functionality

From: David Quigley <[email protected]>

This patch contains the ioctls to increase the union generation and to query
which branch a file exists on.

Signed-off-by: David Quigley <[email protected]>
Signed-off-by: Josef "Jeff" Sipek <[email protected]>
Signed-off-by: Erez Zadok <[email protected]>

---

fs/unionfs/branchman.c | 92 +++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 92 insertions(+)

diff -Nur -x linux-2.6-git/Documentation/dontdiff linux-2.6-git/fs/unionfs/branchman.c linux-2.6-git-unionfs/fs/unionfs/branchman.c
--- linux-2.6-git/fs/unionfs/branchman.c 1969-12-31 19:00:00.000000000 -0500
+++ linux-2.6-git-unionfs/fs/unionfs/branchman.c 2006-08-31 19:04:00.000000000 -0400
@@ -0,0 +1,92 @@
+/*
+ * Copyright (c) 2003-2006 Erez Zadok
+ * Copyright (c) 2003-2006 Charles P. Wright
+ * Copyright (c) 2005-2006 Josef 'Jeff' Sipek
+ * Copyright (c) 2005-2006 Junjiro Okajima
+ * Copyright (c) 2005 Arun M. Krishnakumar
+ * Copyright (c) 2004-2006 David P. Quigley
+ * Copyright (c) 2003-2004 Mohammad Nayyer Zubair
+ * Copyright (c) 2003 Puja Gupta
+ * Copyright (c) 2003 Harikesavan Krishnan
+ * Copyright (c) 2003-2006 Stony Brook University
+ * Copyright (c) 2003-2006 The Research Foundation of State University of New York
+ *
+ * For specific licensing information, see the COPYING file distributed with
+ * this package.
+ *
+ * This Copyright notice must be kept intact and distributed with all sources.
+ */
+
+#include "union.h"
+
+struct dentry **alloc_new_dentries(int objs)
+{
+ if (!objs)
+ return NULL;
+
+ return kzalloc(sizeof(struct dentry *) * objs, GFP_KERNEL);
+}
+
+struct unionfs_usi_data *alloc_new_data(int objs)
+{
+ if (!objs)
+ return NULL;
+
+ return kzalloc(sizeof(struct unionfs_usi_data) * objs, GFP_KERNEL);
+}
+
+int unionfs_ioctl_incgen(struct file *file, unsigned int cmd, unsigned long arg)
+{
+ struct super_block *sb;
+ int gen;
+
+ sb = file->f_dentry->d_sb;
+
+ unionfs_write_lock(sb);
+
+ atomic_inc(&stopd(sb)->usi_generation);
+ gen = atomic_read(&stopd(sb)->usi_generation);
+
+ atomic_set(&dtopd(sb->s_root)->udi_generation, gen);
+ atomic_set(&itopd(sb->s_root->d_inode)->uii_generation, gen);
+
+ unionfs_write_unlock(sb);
+
+ return gen;
+}
+
+int unionfs_ioctl_queryfile(struct file *file, unsigned int cmd,
+ unsigned long arg)
+{
+ int err = 0;
+ fd_set branchlist;
+
+ int bstart = 0, bend = 0, bindex = 0;
+ struct dentry *dentry, *hidden_dentry;
+
+ dentry = file->f_dentry;
+ lock_dentry(dentry);
+ if ((err = unionfs_partial_lookup(dentry)))
+ goto out;
+ bstart = dbstart(dentry);
+ bend = dbend(dentry);
+
+ FD_ZERO(&branchlist);
+
+ for (bindex = bstart; bindex <= bend; bindex++) {
+ hidden_dentry = dtohd_index(dentry, bindex);
+ if (!hidden_dentry)
+ continue;
+ if (hidden_dentry->d_inode)
+ FD_SET(bindex, &branchlist);
+ }
+
+ err = copy_to_user((void __user *)arg, &branchlist, sizeof(fd_set));
+ if (err)
+ err = -EFAULT;
+
+out:
+ unlock_dentry(dentry);
+ return err < 0 ? err : bend;
+}
+
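unionfs_ioctl_queryfile() above copies an fd_set to userspace with one bit set
per branch on which the file exists, and returns bend on success. A caller
could decode that result roughly as follows. This is a sketch under stated
assumptions: branches_from_fdset() is a hypothetical helper, and the ioctl
call itself is left out because the request number and its header are not
part of this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>

/*
 * Collect the branch indices whose bits are set in 'branches', scanning
 * indices 0..bend inclusive. Stores at most 'max' indices in 'out' and
 * returns how many were stored.
 */
static int branches_from_fdset(fd_set *branches, int bend, int *out, int max)
{
	int i, n = 0;

	assert(branches != NULL && out != NULL);

	for (i = 0; i <= bend && i < FD_SETSIZE && n < max; i++)
		if (FD_ISSET(i, branches))
			out[n++] = i;
	return n;
}
```

If the kernel reported bits 0 and 2 with a return value of 2, the helper
would fill out[] with 0 and 2 and return 2.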

2006-09-01 01:41:55

by Josef 'Jeff' Sipek

Subject: [PATCH 04/22][RFC] Unionfs: Common file operations

From: David Quigley <[email protected]>

This patch contains file-related helper functions that are used throughout the
rest of the code.

Signed-off-by: David Quigley <[email protected]>
Signed-off-by: Josef "Jeff" Sipek <[email protected]>
Signed-off-by: Erez Zadok <[email protected]>

---

fs/unionfs/commonfops.c | 575 ++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 575 insertions(+)

diff -Nur -x linux-2.6-git/Documentation/dontdiff linux-2.6-git/fs/unionfs/commonfops.c linux-2.6-git-unionfs/fs/unionfs/commonfops.c
--- linux-2.6-git/fs/unionfs/commonfops.c 1969-12-31 19:00:00.000000000 -0500
+++ linux-2.6-git-unionfs/fs/unionfs/commonfops.c 2006-08-31 19:04:00.000000000 -0400
@@ -0,0 +1,575 @@
+/*
+ * Copyright (c) 2003-2006 Erez Zadok
+ * Copyright (c) 2003-2006 Charles P. Wright
+ * Copyright (c) 2005-2006 Josef 'Jeff' Sipek
+ * Copyright (c) 2005-2006 Junjiro Okajima
+ * Copyright (c) 2005 Arun M. Krishnakumar
+ * Copyright (c) 2004-2006 David P. Quigley
+ * Copyright (c) 2003-2004 Mohammad Nayyer Zubair
+ * Copyright (c) 2003 Puja Gupta
+ * Copyright (c) 2003 Harikesavan Krishnan
+ * Copyright (c) 2003-2006 Stony Brook University
+ * Copyright (c) 2003-2006 The Research Foundation of State University of New York
+ *
+ * For specific licensing information, see the COPYING file distributed with
+ * this package.
+ *
+ * This Copyright notice must be kept intact and distributed with all sources.
+ */
+
+#include "union.h"
+
+/* 1) Copyup the file
+ * 2) Rename the file to '.unionfs<original inode#><counter>' - obviously
+ * stolen from NFS's silly rename
+ */
+static int copyup_deleted_file(struct file *file, struct dentry *dentry,
+ int bstart, int bindex)
+{
+ static unsigned int counter;
+ const int i_inosize = sizeof(dentry->d_inode->i_ino) * 2;
+ const int countersize = sizeof(counter) * 2;
+ const int nlen = sizeof(".unionfs") + i_inosize + countersize - 1;
+ char name[nlen + 1];
+
+ int err;
+ struct dentry *tmp_dentry = NULL;
+ struct dentry *hidden_dentry = NULL;
+ struct dentry *hidden_dir_dentry = NULL;
+
+ hidden_dentry = dtohd_index(dentry, bstart);
+
+ sprintf(name, ".unionfs%*.*lx",
+ i_inosize, i_inosize, hidden_dentry->d_inode->i_ino);
+
+ tmp_dentry = NULL;
+ do {
+ char *suffix = name + nlen - countersize;
+
+ dput(tmp_dentry);
+ counter++;
+ sprintf(suffix, "%*.*x", countersize, countersize, counter);
+
+ printk(KERN_DEBUG "unionfs: trying to rename %s to %s\n",
+ dentry->d_name.name, name);
+
+ tmp_dentry = lookup_one_len(name, hidden_dentry->d_parent,
+ UNIONFS_TMPNAM_LEN);
+ if (IS_ERR(tmp_dentry)) {
+ err = PTR_ERR(tmp_dentry);
+ goto out;
+ }
+ } while (tmp_dentry->d_inode != NULL); /* need negative dentry */
+
+ err = copyup_named_file(dentry->d_parent->d_inode, file, name, bstart,
+ bindex, file->f_dentry->d_inode->i_size);
+ if (err)
+ goto out;
+
+ /* bring it to the same state as an unlinked file */
+ hidden_dentry = dtohd_index(dentry, dbstart(dentry));
+ hidden_dir_dentry = lock_parent(hidden_dentry);
+ err = vfs_unlink(hidden_dir_dentry->d_inode, hidden_dentry);
+ unlock_dir(hidden_dir_dentry);
+
+out:
+ return err;
+}
+
+static void cleanup_file(struct file *file, struct dentry *dentry)
+{
+ int bindex, bstart, bend;
+
+ bstart = fbstart(file);
+ bend = fbend(file);
+ for (bindex = bstart; bindex <= bend; bindex++) {
+ if (ftohf_index(file, bindex)) {
+ branchput(dentry->d_sb, bindex);
+ fput(ftohf_index(file, bindex));
+ }
+ }
+
+ if (ftohf_ptr(file)) {
+ kfree(ftohf_ptr(file));
+ ftohf_ptr(file) = NULL;
+ }
+}
+
+static int open_all_files(struct file *file, struct dentry *dentry)
+{
+ int bindex, bstart, bend, err = 0;
+ struct file *hidden_file;
+ struct dentry *hidden_dentry;
+ struct super_block *sb = dentry->d_sb;
+
+ bstart = dbstart(dentry);
+ bend = dbend(dentry);
+
+ for (bindex = bstart; bindex <= bend; bindex++) {
+ hidden_dentry = dtohd_index(dentry, bindex);
+ if (!hidden_dentry)
+ continue;
+
+ dget(hidden_dentry);
+ mntget(stohiddenmnt_index(sb, bindex));
+ branchget(sb, bindex);
+
+ hidden_file = dentry_open(hidden_dentry,
+ stohiddenmnt_index(sb, bindex), file->f_flags);
+ if (IS_ERR(hidden_file)) {
+ err = PTR_ERR(hidden_file);
+ goto out;
+ } else
+ set_ftohf_index(file, bindex, hidden_file);
+ }
+out:
+ return err;
+}
+
+static int open_highest_file(struct file *file, struct dentry *dentry,
+ int willwrite)
+{
+ int bindex, bstart, bend, err = 0;
+ struct file *hidden_file;
+ struct dentry *hidden_dentry;
+ struct inode *parent_inode = dentry->d_parent->d_inode;
+ size_t inode_size = file->f_dentry->d_inode->i_size;
+ struct super_block *sb = dentry->d_sb;
+
+ bstart = dbstart(dentry);
+ bend = dbend(dentry);
+
+ hidden_dentry = dtohd(dentry);
+ if (willwrite && IS_WRITE_FLAG(file->f_flags)
+ && is_robranch(dentry)) {
+ for (bindex = bstart - 1; bindex >= 0; bindex--) {
+ err = copyup_file(parent_inode, file, bstart, bindex,
+ inode_size);
+ if (!err)
+ break;
+
+ }
+ atomic_set(&ftopd(file)->ufi_generation,
+ atomic_read(&itopd(dentry->d_inode)->uii_generation));
+ goto out;
+ }
+
+ dget(hidden_dentry);
+ mntget(stohiddenmnt_index(sb, bstart));
+ branchget(sb, bstart);
+ hidden_file = dentry_open(hidden_dentry,
+ stohiddenmnt_index(sb, bstart), file->f_flags);
+ if (IS_ERR(hidden_file)) {
+ err = PTR_ERR(hidden_file);
+ goto out;
+ }
+ set_ftohf(file, hidden_file);
+ /* Fix up the position. */
+ hidden_file->f_pos = file->f_pos;
+
+ memcpy(&(hidden_file->f_ra), &(file->f_ra),
+ sizeof(struct file_ra_state));
+out:
+ return err;
+}
+
+static int do_delayed_copyup(struct file *file, struct dentry *dentry)
+{
+ int bindex, bstart, bend, err = 0;
+ struct inode *parent_inode = dentry->d_parent->d_inode;
+ size_t inode_size = file->f_dentry->d_inode->i_size;
+
+ bstart = fbstart(file);
+ bend = fbend(file);
+
+ BUG_ON(!S_ISREG(file->f_dentry->d_inode->i_mode));
+
+ for (bindex = bstart - 1; bindex >= 0; bindex--) {
+ if (!d_deleted(file->f_dentry))
+ err = copyup_file(parent_inode, file, bstart,
+ bindex, inode_size);
+ else
+ err = copyup_deleted_file(file, dentry, bstart, bindex);
+
+ if (!err)
+ break;
+ }
+ if (!err && (bstart > fbstart(file))) {
+ bend = fbend(file);
+ for (bindex = bstart; bindex <= bend; bindex++) {
+ if (ftohf_index(file, bindex)) {
+ branchput(dentry->d_sb, bindex);
+ fput(ftohf_index(file, bindex));
+ set_ftohf_index(file, bindex, NULL);
+ }
+ }
+ fbend(file) = bend;
+ }
+ return err;
+}
+
+int unionfs_file_revalidate(struct file *file, int willwrite)
+{
+ struct super_block *sb;
+ struct dentry *dentry;
+ int sbgen, fgen, dgen;
+ int bstart, bend;
+ int size;
+
+ int err = 0;
+
+ dentry = file->f_dentry;
+ lock_dentry(dentry);
+ sb = dentry->d_sb;
+ unionfs_read_lock(sb);
+ if (!unionfs_d_revalidate(dentry, NULL) && !d_deleted(dentry)) {
+ err = -ESTALE;
+ goto out;
+ }
+
+ sbgen = atomic_read(&stopd(sb)->usi_generation);
+ dgen = atomic_read(&dtopd(dentry)->udi_generation);
+ fgen = atomic_read(&ftopd(file)->ufi_generation);
+
+ BUG_ON(sbgen > dgen);
+
+ /* There are two cases we are interested in. The first is if the
+ * generation is lower than the super-block. The second is if someone
+ * has copied up this file from underneath us, we also need to refresh
+ * things. */
+ if (!d_deleted(dentry) &&
+ ((sbgen > fgen) || (dbstart(dentry) != fbstart(file)))) {
+ /* First we throw out the existing files. */
+ cleanup_file(file, dentry);
+
+ /* Now we reopen the file(s) as in unionfs_open. */
+ bstart = fbstart(file) = dbstart(dentry);
+ bend = fbend(file) = dbend(dentry);
+
+ size = sizeof(struct file *) * sbmax(sb);
+ ftohf_ptr(file) = kzalloc(size, GFP_KERNEL);
+ if (!ftohf_ptr(file)) {
+ err = -ENOMEM;
+ goto out;
+ }
+
+ if (S_ISDIR(dentry->d_inode->i_mode)) {
+ /* We need to open all the files. */
+ err = open_all_files(file, dentry);
+ if (err)
+ goto out;
+ } else {
+ /* We only open the highest priority branch. */
+ err = open_highest_file(file, dentry, willwrite);
+ if (err)
+ goto out;
+ }
+ atomic_set(&ftopd(file)->ufi_generation,
+ atomic_read(&itopd(dentry->d_inode)->
+ uii_generation));
+ }
+
+ /* Copyup on the first write to a file on a readonly branch. */
+ if (willwrite && IS_WRITE_FLAG(file->f_flags)
+ && !IS_WRITE_FLAG(ftohf(file)->f_flags) && is_robranch(dentry)) {
+ printk(KERN_DEBUG
+ "Doing delayed copyup of a read-write file on a read-only branch.\n");
+ err = do_delayed_copyup(file, dentry);
+ }
+
+out:
+ unlock_dentry(dentry);
+ unionfs_read_unlock(dentry->d_sb);
+ return err;
+}
+
+/* unionfs_open helper function: open a directory */
+static inline int __open_dir(struct inode *inode, struct file *file)
+{
+ struct dentry *hidden_dentry;
+ struct file *hidden_file;
+ int bindex, bstart, bend;
+
+ bstart = fbstart(file) = dbstart(file->f_dentry);
+ bend = fbend(file) = dbend(file->f_dentry);
+
+ for (bindex = bstart; bindex <= bend; bindex++) {
+ hidden_dentry = dtohd_index(file->f_dentry, bindex);
+ if (!hidden_dentry)
+ continue;
+
+ dget(hidden_dentry);
+ mntget(stohiddenmnt_index(inode->i_sb, bindex));
+ hidden_file = dentry_open(hidden_dentry,
+ stohiddenmnt_index(inode->i_sb, bindex),
+ file->f_flags);
+ if (IS_ERR(hidden_file))
+ return PTR_ERR(hidden_file);
+
+ set_ftohf_index(file, bindex, hidden_file);
+
+ /* The branchget goes after the open, because otherwise
+ * we would miss the reference on release. */
+ branchget(inode->i_sb, bindex);
+ }
+
+ return 0;
+}
+
+/* unionfs_open helper function: open a file */
+static inline int __open_file(struct inode *inode, struct file *file)
+{
+ struct dentry *hidden_dentry;
+ struct file *hidden_file;
+ int hidden_flags;
+ int bindex, bstart, bend;
+
+ hidden_dentry = dtohd(file->f_dentry);
+ hidden_flags = file->f_flags;
+
+ bstart = fbstart(file) = dbstart(file->f_dentry);
+ bend = fbend(file) = dbend(file->f_dentry);
+
+ /* check for the permission for hidden file. If the error is COPYUP_ERR,
+ * copyup the file.
+ */
+ if (hidden_dentry->d_inode && is_robranch(file->f_dentry)) {
+ /* if the open will change the file, copy it up otherwise defer it. */
+ if (hidden_flags & O_TRUNC) {
+ int size = 0;
+ int err = -EROFS;
+
+ /* copyup the file */
+ for (bindex = bstart - 1; bindex >= 0; bindex--) {
+ err = copyup_file(file->f_dentry->d_parent->d_inode,
+ file, bstart, bindex, size);
+ if (!err)
+ break;
+ }
+ return err;
+ } else
+ hidden_flags &= ~(OPEN_WRITE_FLAGS);
+ }
+
+ dget(hidden_dentry);
+
+ /* dentry_open will decrement mnt refcnt if err.
+ * otherwise fput() will do an mntput() for us upon file close.
+ */
+ mntget(stohiddenmnt_index(inode->i_sb, bstart));
+ hidden_file = dentry_open(hidden_dentry,
+ stohiddenmnt_index(inode->i_sb, bstart),
+ hidden_flags);
+ if (IS_ERR(hidden_file))
+ return PTR_ERR(hidden_file);
+
+ set_ftohf(file, hidden_file);
+ branchget(inode->i_sb, bstart);
+
+ return 0;
+}
+
+int unionfs_open(struct inode *inode, struct file *file)
+{
+ int err = 0;
+ struct file *hidden_file = NULL;
+ struct dentry *dentry = NULL;
+ int bindex = 0, bstart = 0, bend = 0;
+ int size;
+
+ ftopd_lhs(file) = kzalloc(sizeof(struct unionfs_file_info), GFP_KERNEL);
+ if (!ftopd(file)) {
+ err = -ENOMEM;
+ goto out;
+ }
+ fbstart(file) = -1;
+ fbend(file) = -1;
+ atomic_set(&ftopd(file)->ufi_generation,
+ atomic_read(&itopd(inode)->uii_generation));
+
+ size = sizeof(struct file *) * sbmax(inode->i_sb);
+ ftohf_ptr(file) = kzalloc(size, GFP_KERNEL);
+ if (!ftohf_ptr(file)) {
+ err = -ENOMEM;
+ goto out;
+ }
+
+ dentry = file->f_dentry;
+ lock_dentry(dentry);
+ unionfs_read_lock(inode->i_sb);
+
+ bstart = fbstart(file) = dbstart(dentry);
+ bend = fbend(file) = dbend(dentry);
+
+ /* increment, so that we can flush appropriately */
+ atomic_inc(&itopd(dentry->d_inode)->uii_totalopens);
+
+ /* open all directories and make the unionfs file struct point to these hidden file structs */
+ if (S_ISDIR(inode->i_mode))
+ err = __open_dir(inode, file); /* open a dir */
+ else
+ err = __open_file(inode, file); /* open a file */
+
+ /* freeing the allocated resources, and fput the opened files */
+ if (err) {
+ for (bindex = bstart; bindex <= bend; bindex++) {
+ hidden_file = ftohf_index(file, bindex);
+ if (!hidden_file)
+ continue;
+
+ branchput(file->f_dentry->d_sb, bindex);
+ /* fput calls dput for hidden_dentry */
+ fput(hidden_file);
+ }
+ }
+
+ unlock_dentry(dentry);
+ unionfs_read_unlock(inode->i_sb);
+
+out:
+ if (err) {
+ kfree(ftohf_ptr(file));
+ kfree(ftopd(file));
+ }
+
+ return err;
+}
+
+int unionfs_file_release(struct inode *inode, struct file *file)
+{
+ int err = 0;
+ struct file *hidden_file = NULL;
+ int bindex, bstart, bend;
+ int fgen;
+
+ /* fput all the hidden files */
+ fgen = atomic_read(&ftopd(file)->ufi_generation);
+ bstart = fbstart(file);
+ bend = fbend(file);
+
+ for (bindex = bstart; bindex <= bend; bindex++) {
+ hidden_file = ftohf_index(file, bindex);
+
+ if (hidden_file) {
+ fput(hidden_file);
+ unionfs_read_lock(inode->i_sb);
+ branchput(inode->i_sb, bindex);
+ unionfs_read_unlock(inode->i_sb);
+ }
+ }
+ kfree(ftohf_ptr(file));
+
+ if (ftopd(file)->rdstate) {
+ ftopd(file)->rdstate->uds_access = jiffies;
+ printk(KERN_DEBUG "Saving rdstate with cookie %u [%d.%lld]\n",
+ ftopd(file)->rdstate->uds_cookie,
+ ftopd(file)->rdstate->uds_bindex,
+ (long long)ftopd(file)->rdstate->uds_dirpos);
+ spin_lock(&itopd(inode)->uii_rdlock);
+ itopd(inode)->uii_rdcount++;
+ list_add_tail(&ftopd(file)->rdstate->uds_cache,
+ &itopd(inode)->uii_readdircache);
+ mark_inode_dirty(inode);
+ spin_unlock(&itopd(inode)->uii_rdlock);
+ ftopd(file)->rdstate = NULL;
+ }
+ kfree(ftopd(file));
+ return err;
+}
+
+static inline long do_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
+{
+ struct file *hidden_file;
+ int err;
+
+ hidden_file = ftohf(file);
+
+ err = security_file_ioctl(hidden_file, cmd, arg);
+ if (err)
+ goto out;
+ err = -ENOTTY;
+ if (!hidden_file || !hidden_file->f_op)
+ goto out;
+ if (hidden_file->f_op->unlocked_ioctl) {
+ err = hidden_file->f_op->unlocked_ioctl(hidden_file, cmd, arg);
+ } else if (hidden_file->f_op->ioctl) {
+ lock_kernel();
+ err = hidden_file->f_op->ioctl(hidden_file->f_dentry->d_inode,
+ hidden_file, cmd, arg);
+ unlock_kernel();
+ }
+
+out:
+ return err;
+}
+
+long unionfs_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
+{
+ long err = 0; /* don't fail by default */
+
+ if ((err = unionfs_file_revalidate(file, 1)))
+ goto out;
+
+ /* check if asked for local commands */
+ switch (cmd) {
+ case UNIONFS_IOCTL_INCGEN:
+ if (!capable(CAP_SYS_ADMIN)) {
+ err = -EACCES;
+ goto out;
+ }
+ err = unionfs_ioctl_incgen(file, cmd, arg);
+ break;
+
+ case UNIONFS_IOCTL_QUERYFILE:
+ err = unionfs_ioctl_queryfile(file, cmd, arg);
+ break;
+
+ default:
+ err = do_ioctl(file, cmd, arg);
+ break;
+ }
+
+out:
+ return err;
+}
+
+int unionfs_flush(struct file *file, fl_owner_t id)
+{
+ int err = 0; /* assume ok (see open.c:close_fp) */
+ struct file *hidden_file = NULL;
+ int bindex, bstart, bend;
+
+ if ((err = unionfs_file_revalidate(file, 1)))
+ goto out;
+ if (!atomic_dec_and_test
+ (&itopd(file->f_dentry->d_inode)->uii_totalopens))
+ goto out;
+
+ lock_dentry(file->f_dentry);
+
+ bstart = fbstart(file);
+ bend = fbend(file);
+ for (bindex = bstart; bindex <= bend; bindex++) {
+ hidden_file = ftohf_index(file, bindex);
+
+ if (hidden_file && hidden_file->f_op
+ && hidden_file->f_op->flush) {
+ err = hidden_file->f_op->flush(hidden_file, id);
+ if (err)
+ goto out_lock;
+
+ /* if there are no more references to the dentry, dput it */
+ if (d_deleted(file->f_dentry)) {
+ dput(dtohd_index(file->f_dentry, bindex));
+ set_dtohd_index(file->f_dentry, bindex, NULL);
+ }
+ }
+
+ }
+
+out_lock:
+ unlock_dentry(file->f_dentry);
+out:
+ return err;
+}
+

2006-09-01 01:43:11

by Josef 'Jeff' Sipek

Subject: [PATCH 05/22][RFC] Unionfs: Copyup Functionality

From: David Quigley <[email protected]>

This patch contains the functions used to perform copyup operations in unionfs.

Signed-off-by: David Quigley <[email protected]>
Signed-off-by: Josef "Jeff" Sipek <[email protected]>
Signed-off-by: Erez Zadok <[email protected]>

---

fs/unionfs/copyup.c | 662 ++++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 662 insertions(+)
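
As a rough illustration of the data-copy step in __copyup_reg_data(), here
is a minimal userspace sketch. It assumes POSIX read()/write() in place of
the f_op->read/f_op->write calls and a fixed CHUNK_SIZE in place of
PAGE_SIZE. The name copy_prefix() is hypothetical and does not appear in the
patch. The sketch reports a short write as -EIO, which is a choice made here
for illustration only.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

#define CHUNK_SIZE 4096	/* stands in for PAGE_SIZE; an assumption */

/*
 * Copy at most len bytes from in_fd to out_fd in CHUNK_SIZE pieces.
 * Returns 0 on success (including an early EOF on the source) or a
 * negative errno value on failure.
 */
static int copy_prefix(int in_fd, int out_fd, off_t len)
{
	char buf[CHUNK_SIZE];

	while (len > 0) {
		size_t want = len < CHUNK_SIZE ? (size_t)len : CHUNK_SIZE;
		ssize_t got = read(in_fd, buf, want);
		ssize_t put;

		if (got < 0)
			return -errno;
		if (got == 0)
			break;	/* source is shorter than len */

		put = write(out_fd, buf, (size_t)got);
		if (put < 0)
			return -errno;
		if (put < got)
			return -EIO;	/* short write: an error, not a byte count */

		len -= got;
	}
	return 0;
}
```

A caller would pass two open descriptors and the number of bytes to carry
over, for example copy_prefix(in_fd, out_fd, 60) to copy only the first 60
bytes of the source.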

diff -Nur -x linux-2.6-git/Documentation/dontdiff linux-2.6-git/fs/unionfs/copyup.c linux-2.6-git-unionfs/fs/unionfs/copyup.c
--- linux-2.6-git/fs/unionfs/copyup.c 1969-12-31 19:00:00.000000000 -0500
+++ linux-2.6-git-unionfs/fs/unionfs/copyup.c 2006-08-31 19:04:00.000000000 -0400
@@ -0,0 +1,662 @@
+/*
+ * Copyright (c) 2003-2006 Erez Zadok
+ * Copyright (c) 2003-2006 Charles P. Wright
+ * Copyright (c) 2005-2006 Josef 'Jeff' Sipek
+ * Copyright (c) 2005-2006 Junjiro Okajima
+ * Copyright (c) 2005 Arun M. Krishnakumar
+ * Copyright (c) 2004-2006 David P. Quigley
+ * Copyright (c) 2003-2004 Mohammad Nayyer Zubair
+ * Copyright (c) 2003 Puja Gupta
+ * Copyright (c) 2003 Harikesavan Krishnan
+ * Copyright (c) 2003-2006 Stony Brook University
+ * Copyright (c) 2003-2006 The Research Foundation of State University of New York
+ *
+ * For specific licensing information, see the COPYING file distributed with
+ * this package.
+ *
+ * This Copyright notice must be kept intact and distributed with all sources.
+ */
+
+#include "union.h"
+
+/* Determine the mode based on the copyup flags, and the existing dentry. */
+static int copyup_permissions(struct super_block *sb,
+ struct dentry *old_hidden_dentry,
+ struct dentry *new_hidden_dentry)
+{
+ struct iattr newattrs;
+ int err;
+
+ newattrs.ia_atime = old_hidden_dentry->d_inode->i_atime;
+ newattrs.ia_mtime = old_hidden_dentry->d_inode->i_mtime;
+ newattrs.ia_ctime = old_hidden_dentry->d_inode->i_ctime;
+
+ newattrs.ia_gid = old_hidden_dentry->d_inode->i_gid;
+ newattrs.ia_uid = old_hidden_dentry->d_inode->i_uid;
+
+ newattrs.ia_mode = old_hidden_dentry->d_inode->i_mode;
+
+ newattrs.ia_valid = ATTR_CTIME | ATTR_ATIME | ATTR_MTIME |
+ ATTR_ATIME_SET | ATTR_MTIME_SET | ATTR_FORCE |
+ ATTR_GID | ATTR_UID | ATTR_MODE;
+
+ err = notify_change(new_hidden_dentry, &newattrs);
+
+ return err;
+}
+
+int copyup_dentry(struct inode *dir, struct dentry *dentry,
+ int bstart, int new_bindex,
+ struct file **copyup_file, loff_t len)
+{
+ return copyup_named_dentry(dir, dentry, bstart, new_bindex,
+ dentry->d_name.name,
+ dentry->d_name.len, copyup_file, len);
+}
+
+/* create the new device/file/directory - use copyup_permissions() to copy up
+ * the times, ownership, and mode
+ *
+ * if the object being copied up is a regular file, the file is only created,
+ * the contents have to be copied up separately
+ */
+static inline int __copyup_ndentry(struct dentry *old_hidden_dentry,
+ struct dentry *new_hidden_dentry,
+ struct dentry *new_hidden_parent_dentry,
+ char *symbuf)
+{
+ int err = 0;
+ umode_t old_mode = old_hidden_dentry->d_inode->i_mode;
+ struct sioq_args args;
+
+ if (S_ISDIR(old_mode)) {
+ args.u.mkdir.parent = new_hidden_parent_dentry->d_inode;
+ args.u.mkdir.dentry = new_hidden_dentry;
+ args.u.mkdir.mode = old_mode;
+
+ run_sioq(__unionfs_mkdir, &args);
+ err = args.err;
+ } else if (S_ISLNK(old_mode)) {
+ args.u.symlink.parent = new_hidden_parent_dentry->d_inode;
+ args.u.symlink.dentry = new_hidden_dentry;
+ args.u.symlink.symbuf = symbuf;
+ args.u.symlink.mode = old_mode;
+
+ run_sioq(__unionfs_symlink, &args);
+ err = args.err;
+ } else if (S_ISBLK(old_mode)
+ || S_ISCHR(old_mode)
+ || S_ISFIFO(old_mode)
+ || S_ISSOCK(old_mode)) {
+ args.u.mknod.parent = new_hidden_parent_dentry->d_inode;
+ args.u.mknod.dentry = new_hidden_dentry;
+ args.u.mknod.mode = old_mode;
+ args.u.mknod.dev = old_hidden_dentry->d_inode->i_rdev;
+
+ run_sioq(__unionfs_mknod, &args);
+ err = args.err;
+ } else if (S_ISREG(old_mode)) {
+ args.u.create.parent = new_hidden_parent_dentry->d_inode;
+ args.u.create.dentry = new_hidden_dentry;
+ args.u.create.mode = old_mode;
+ args.u.create.nd = NULL;
+
+ run_sioq(__unionfs_create, &args);
+ err = args.err;
+ } else {
+ printk(KERN_ERR "Unknown inode type %d\n",
+ old_mode);
+ BUG();
+ }
+
+ return err;
+}
+
+static inline int __copyup_reg_data(struct super_block *sb,
+ struct dentry *new_hidden_dentry,
+ int new_bindex,
+ struct dentry *old_hidden_dentry,
+ int old_bindex,
+ struct file **copyup_file,
+ loff_t len)
+{
+ struct file *input_file;
+ struct file *output_file;
+ mm_segment_t old_fs;
+ char *buf = NULL;
+ ssize_t read_bytes, write_bytes;
+ ssize_t size;
+ int err = 0;
+
+ /* open old file */
+ mntget(stohiddenmnt_index(sb, old_bindex));
+ branchget(sb, old_bindex);
+ input_file = dentry_open(old_hidden_dentry,
+ stohiddenmnt_index(sb, old_bindex), O_RDONLY);
+ if (IS_ERR(input_file)) {
+ dput(old_hidden_dentry);
+ err = PTR_ERR(input_file);
+ goto out;
+ }
+ if (!input_file->f_op || !input_file->f_op->read) {
+ err = -EINVAL;
+ goto out_close_in;
+ }
+
+ /* open new file */
+ dget(new_hidden_dentry);
+ mntget(stohiddenmnt_index(sb, new_bindex));
+ branchget(sb, new_bindex);
+ output_file = dentry_open(new_hidden_dentry,
+ stohiddenmnt_index(sb, new_bindex), O_WRONLY);
+ if (IS_ERR(output_file)) {
+ err = PTR_ERR(output_file);
+ goto out_close_in2;
+ }
+ if (!output_file->f_op || !output_file->f_op->write) {
+ err = -EINVAL;
+ goto out_close_out;
+ }
+
+ /* allocating a buffer */
+ buf = kmalloc(PAGE_SIZE, GFP_KERNEL);
+ if (!buf) {
+ err = -ENOMEM;
+ goto out_close_out;
+ }
+
+ input_file->f_pos = 0;
+ output_file->f_pos = 0;
+
+ old_fs = get_fs();
+ set_fs(KERNEL_DS);
+
+ size = len;
+ err = 0;
+ do {
+ if (len >= PAGE_SIZE)
+ size = PAGE_SIZE;
+ else if ((len < PAGE_SIZE) && (len > 0))
+ size = len;
+
+ len -= PAGE_SIZE;
+
+ read_bytes =
+ input_file->f_op->read(input_file,
+ (char __user *)buf, size,
+ &input_file->f_pos);
+ if (read_bytes <= 0) {
+ err = read_bytes;
+ break;
+ }
+
+ write_bytes =
+ output_file->f_op->write(output_file,
+ (char __user *)buf,
+ read_bytes,
+ &output_file->f_pos);
+ if (write_bytes < 0 || (write_bytes < read_bytes)) {
+ err = (write_bytes < 0) ? write_bytes : -EIO;
+ break;
+ }
+ } while ((read_bytes > 0) && (len > 0));
+
+ set_fs(old_fs);
+
+ kfree(buf);
+
+ if (err)
+ goto out_close_out;
+ if (copyup_file) {
+ *copyup_file = output_file;
+ goto out_close_in;
+ }
+
+out_close_out:
+ fput(output_file);
+
+out_close_in2:
+ branchput(sb, new_bindex);
+
+out_close_in:
+ fput(input_file);
+
+out:
+ branchput(sb, old_bindex);
+
+ return err;
+}
+
+static inline void __clear(struct dentry *dentry,
+ struct dentry *old_hidden_dentry,
+ int old_bstart, int old_bend,
+ struct dentry *new_hidden_dentry,
+ int new_bindex)
+{
+ /* get rid of the hidden dentry and all its traces */
+ set_dtohd_index(dentry, new_bindex, NULL);
+ set_dbstart(dentry, old_bstart);
+ set_dbend(dentry, old_bend);
+
+ dput(new_hidden_dentry);
+ dput(old_hidden_dentry);
+}
+
+int copyup_named_dentry(struct inode *dir, struct dentry *dentry,
+ int bstart, int new_bindex, const char *name,
+ int namelen, struct file **copyup_file, loff_t len)
+{
+ struct dentry *new_hidden_dentry;
+ struct dentry *old_hidden_dentry = NULL;
+ struct super_block *sb;
+ int err = 0;
+ int old_bindex;
+ int old_bstart;
+ int old_bend;
+ struct dentry *new_hidden_parent_dentry = NULL;
+ mm_segment_t oldfs;
+ char *symbuf = NULL;
+
+ verify_locked(dentry);
+
+ old_bindex = bstart;
+ old_bstart = dbstart(dentry);
+ old_bend = dbend(dentry);
+
+ BUG_ON(new_bindex < 0);
+ BUG_ON(new_bindex >= old_bindex);
+
+ sb = dir->i_sb;
+
+ unionfs_read_lock(sb);
+
+ if ((err = is_robranch_super(sb, new_bindex))) {
+ dput(old_hidden_dentry);
+ goto out;
+ }
+
+ /* Create the directory structure above this dentry. */
+ new_hidden_dentry = create_parents_named(dir, dentry, name, new_bindex);
+ if (IS_ERR(new_hidden_dentry)) {
+ dput(old_hidden_dentry);
+ err = PTR_ERR(new_hidden_dentry);
+ goto out;
+ }
+
+ old_hidden_dentry = dtohd_index(dentry, old_bindex);
+ dget(old_hidden_dentry);
+
+ /* For symlinks, we must read the link before we lock the directory. */
+ if (S_ISLNK(old_hidden_dentry->d_inode->i_mode)) {
+
+ symbuf = kmalloc(PATH_MAX, GFP_KERNEL);
+ if (!symbuf) {
+ __clear(dentry, old_hidden_dentry,
+ old_bstart, old_bend,
+ new_hidden_dentry, new_bindex);
+ err = -ENOMEM;
+ goto out_free;
+ }
+
+ oldfs = get_fs();
+ set_fs(KERNEL_DS);
+ err = old_hidden_dentry->d_inode->i_op->readlink(
+ old_hidden_dentry,
+ (char __user *)symbuf,
+ PATH_MAX);
+ set_fs(oldfs);
+ if (err < 0) {
+ __clear(dentry, old_hidden_dentry,
+ old_bstart, old_bend,
+ new_hidden_dentry, new_bindex);
+ goto out_free;
+ }
+ symbuf[err] = '\0';
+ }
+
+ /* Now we lock the parent, and create the object in the new branch. */
+ new_hidden_parent_dentry = lock_parent(new_hidden_dentry);
+
+ /* create the new inode */
+ err = __copyup_ndentry(old_hidden_dentry, new_hidden_dentry,
+ new_hidden_parent_dentry, symbuf);
+
+ if (err) {
+ __clear(dentry, old_hidden_dentry,
+ old_bstart, old_bend,
+ new_hidden_dentry, new_bindex);
+ goto out_unlock;
+ }
+
+ /* We actually copyup the file here. */
+ if (S_ISREG(old_hidden_dentry->d_inode->i_mode))
+ err = __copyup_reg_data(sb, new_hidden_dentry, new_bindex,
+ old_hidden_dentry, old_bindex, copyup_file, len);
+ if (err)
+ goto out_unlink;
+
+ /* Set permissions. */
+ if ((err = copyup_permissions(sb, old_hidden_dentry, new_hidden_dentry)))
+ goto out_unlink;
+
+ /* do not allow files getting deleted to be reinterposed */
+ if (!d_deleted(dentry))
+ unionfs_reinterpose(dentry);
+
+ goto out_unlock;
+ /****/
+
+out_unlink:
+ /* copyup failed, because we possibly ran out of space or
+ * quota, or something else happened so let's unlink; we don't
+ * really care about the return value of vfs_unlink */
+ vfs_unlink(new_hidden_parent_dentry->d_inode, new_hidden_dentry);
+
+ if (copyup_file) {
+ /* need to close the file */
+
+ fput(*copyup_file);
+ branchput(sb, new_bindex);
+ }
+
+ /* TODO: should we reset the error to something like -EIO? */
+
+out_unlock:
+ unlock_dir(new_hidden_parent_dentry);
+
+out_free:
+ kfree(symbuf);
+
+out:
+ unionfs_read_unlock(sb);
+
+ return err;
+}
+
+/* This function creates a copy of a file represented by 'file' which currently
+ * resides in branch 'bstart' to branch 'new_bindex'. The copy will be named
+ * "name". */
+int copyup_named_file(struct inode *dir, struct file *file, char *name,
+ int bstart, int new_bindex, loff_t len)
+{
+ int err = 0;
+ struct file *output_file = NULL;
+
+ err = copyup_named_dentry(dir, file->f_dentry, bstart,
+ new_bindex, name, strlen(name), &output_file,
+ len);
+ if (!err) {
+ fbstart(file) = new_bindex;
+ set_ftohf_index(file, new_bindex, output_file);
+ }
+
+ return err;
+}
+
+/* This function creates a copy of a file represented by 'file' which currently
+ * resides in branch 'bstart' to branch 'new_bindex'.
+ */
+int copyup_file(struct inode *dir, struct file *file, int bstart,
+ int new_bindex, loff_t len)
+{
+ int err = 0;
+ struct file *output_file = NULL;
+
+ err = copyup_dentry(dir, file->f_dentry, bstart, new_bindex,
+ &output_file, len);
+ if (!err) {
+ fbstart(file) = new_bindex;
+ set_ftohf_index(file, new_bindex, output_file);
+ }
+
+ return err;
+}
+
+/* This function replicates the directory structure up to the given dentry
+ * in the bindex branch. It can also create the directory structure
+ * recursively to the right.
+ */
+struct dentry *create_parents(struct inode *dir, struct dentry *dentry,
+ int bindex)
+{
+ struct dentry *hidden_dentry;
+
+ hidden_dentry =
+ create_parents_named(dir, dentry, dentry->d_name.name, bindex);
+
+ return (hidden_dentry);
+}
+
+static inline void __cleanup_dentry(struct dentry * dentry, int bindex,
+ int old_bstart, int old_bend)
+{
+ int loop_start;
+ int loop_end;
+ int new_bstart = -1;
+ int new_bend = -1;
+ int i;
+
+ loop_start = min(old_bstart, bindex);
+ loop_end = max(old_bend, bindex);
+
+ /* This loop sets the bstart and bend for the new
+ * dentry by traversing from left to right.
+ * It also dputs all negative dentries except
+ * bindex (the newly looked-up dentry).
+ */
+ for (i = loop_start; i <= loop_end; i++) {
+ if (!dtohd_index(dentry, i))
+ continue;
+
+ if (i == bindex) {
+ new_bend = i;
+ if (new_bstart < 0)
+ new_bstart = i;
+ continue;
+ }
+
+ if (!dtohd_index(dentry, i)->d_inode) {
+ dput(dtohd_index(dentry, i));
+ set_dtohd_index(dentry, i, NULL);
+ } else {
+ if (new_bstart < 0)
+ new_bstart = i;
+ new_bend = i;
+ }
+ }
+
+ if (new_bstart < 0)
+ new_bstart = bindex;
+ if (new_bend < 0)
+ new_bend = bindex;
+ set_dbstart(dentry, new_bstart);
+ set_dbend(dentry, new_bend);
+
+}
+static inline void __set_inode(struct dentry * upper, struct dentry * lower,
+ int bindex)
+{
+ set_itohi_index(upper->d_inode, bindex,
+ igrab(lower->d_inode));
+ if (likely(ibstart(upper->d_inode) > bindex))
+ ibstart(upper->d_inode) = bindex;
+ if (likely(ibend(upper->d_inode) < bindex))
+ ibend(upper->d_inode) = bindex;
+
+}
+static inline void __set_dentry(struct dentry * upper, struct dentry * lower,
+ int bindex)
+{
+ set_dtohd_index(upper, bindex, lower);
+ if (likely(dbstart(upper) > bindex))
+ set_dbstart(upper, bindex);
+ if (likely(dbend(upper) < bindex))
+ set_dbend(upper, bindex);
+}
+/* This function replicates the directory structure up to the given dentry
+ * in the bindex branch. */
+struct dentry *create_parents_named(struct inode *dir, struct dentry *dentry,
+ const char *name, int bindex)
+{
+ int err;
+ struct dentry *child_dentry;
+ struct dentry *parent_dentry;
+ struct dentry *hidden_parent_dentry = NULL;
+ struct dentry *hidden_dentry = NULL;
+ const char *childname;
+ unsigned int childnamelen;
+
+ int old_kmalloc_size;
+ int kmalloc_size;
+ int num_dentry;
+ int count;
+
+ int old_bstart;
+ int old_bend;
+ struct dentry **path = NULL;
+ struct dentry **tmp_path;
+ struct super_block *sb;
+
+ verify_locked(dentry);
+
+ /* There is no sense allocating any less than the minimum. */
+ kmalloc_size = malloc_sizes[0].cs_size;
+ num_dentry = kmalloc_size / sizeof(struct dentry *);
+
+ if ((err = is_robranch_super(dir->i_sb, bindex))) {
+ hidden_dentry = ERR_PTR(err);
+ goto out;
+ }
+
+ old_bstart = dbstart(dentry);
+ old_bend = dbend(dentry);
+
+ hidden_dentry = ERR_PTR(-ENOMEM);
+ path = (struct dentry **)kzalloc(kmalloc_size, GFP_KERNEL);
+ if (!path)
+ goto out;
+
+ /* assume the negative dentry of unionfs as the parent dentry */
+ parent_dentry = dentry;
+
+ count = 0;
+ /* This loop finds the first parent that exists in the given branch.
+ * We start building the directory structure from there. At the end
+ * of the loop, the following should hold:
+ * child_dentry is the first nonexistent child
+ * parent_dentry is the first existent parent
+ * path[0] is the deepest child
+ * path[count] is the first child to create
+ */
+ do {
+ child_dentry = parent_dentry;
+
+ /* find the parent directory dentry in unionfs */
+ parent_dentry = child_dentry->d_parent;
+ lock_dentry(parent_dentry);
+
+ /* find out the hidden_parent_dentry in the given branch */
+ hidden_parent_dentry = dtohd_index(parent_dentry, bindex);
+
+ /* store the child dentry */
+ path[count++] = child_dentry;
+
+ /* grow path table */
+ if (count == num_dentry) {
+ old_kmalloc_size = kmalloc_size;
+ kmalloc_size *= 2;
+ num_dentry = kmalloc_size / sizeof(struct dentry *);
+
+ tmp_path =
+ (struct dentry **)kzalloc(kmalloc_size, GFP_KERNEL);
+ if (!tmp_path) {
+ hidden_dentry = ERR_PTR(-ENOMEM);
+ goto out;
+ }
+ memcpy(tmp_path, path, old_kmalloc_size);
+ kfree(path);
+ path = tmp_path;
+ tmp_path = NULL;
+ }
+
+ } while (!hidden_parent_dentry);
+ count--;
+
+ sb = dentry->d_sb;
+
+ /* This is basically while(child_dentry != dentry). This loop is
+ * horrible to follow and should be replaced with cleaner code. */
+ while (1) {
+ // get hidden parent dir in the current branch
+ hidden_parent_dentry = dtohd_index(parent_dentry, bindex);
+ unlock_dentry(parent_dentry);
+
+ // init the values to lookup
+ childname = child_dentry->d_name.name;
+ childnamelen = child_dentry->d_name.len;
+
+ if (child_dentry != dentry) {
+ // lookup child in the underlying file system
+ hidden_dentry =
+ lookup_one_len(childname, hidden_parent_dentry,
+ childnamelen);
+ if (IS_ERR(hidden_dentry))
+ goto out;
+ } else {
+
+ /* is the name a whiteout of the childname ? */
+ //lookup the whiteout child in the underlying file system
+ hidden_dentry =
+ lookup_one_len(name, hidden_parent_dentry,
+ strlen(name));
+ if (IS_ERR(hidden_dentry))
+ goto out;
+
+ /* Replace the current dentry (if any) with the new one. */
+ dput(dtohd_index(dentry, bindex));
+ set_dtohd_index(dentry, bindex, hidden_dentry);
+
+ __cleanup_dentry(dentry, bindex, old_bstart, old_bend);
+ break;
+ }
+
+ if (hidden_dentry->d_inode) {
+ /* since this already exists we dput to avoid
+ * multiple references on the same dentry */
+ dput(hidden_dentry);
+ } else {
+ struct sioq_args args;
+
+ /* it's a negative dentry, create a new dir */
+ hidden_parent_dentry = lock_parent(hidden_dentry);
+
+ args.u.mkdir.parent = hidden_parent_dentry->d_inode;
+ args.u.mkdir.dentry = hidden_dentry;
+ args.u.mkdir.mode = child_dentry->d_inode->i_mode;
+
+ run_sioq(__unionfs_mkdir, &args);
+ err = args.err;
+
+ if (!err)
+ err = copyup_permissions(dir->i_sb,
+ child_dentry, hidden_dentry);
+ unlock_dir(hidden_parent_dentry);
+ if (err) {
+ dput(hidden_dentry);
+ hidden_dentry = ERR_PTR(err);
+ goto out;
+ }
+
+ }
+
+ __set_inode(child_dentry, hidden_dentry, bindex);
+ __set_dentry(child_dentry, hidden_dentry, bindex);
+
+ parent_dentry = child_dentry;
+ child_dentry = path[--count];
+ }
+out:
+ kfree(path);
+ return hidden_dentry;
+}
+

2006-09-01 01:44:27

by Josef 'Jeff' Sipek

Subject: [PATCH 06/22][RFC] Unionfs: Dentry operations

From: Josef "Jeff" Sipek <[email protected]>

This patch contains the dentry operations for Unionfs.

Signed-off-by: Josef "Jeff" Sipek <[email protected]>
Signed-off-by: David Quigley <[email protected]>
Signed-off-by: Erez Zadok <[email protected]>

---

fs/unionfs/dentry.c | 236 ++++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 236 insertions(+)
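
The generation check at the heart of unionfs_d_revalidate() can be sketched
in userspace roughly as follows. This is a simplified model under stated
assumptions: struct fake_sb, struct fake_dentry and revalidate() are
hypothetical stand-ins, the "rebuild" is reduced to a counter, and a root
node is simply refreshed rather than asserted to be valid. The real function
also drops and looks up the hidden dentries and inodes again.

```c
#include <stddef.h>

/* Hypothetical stand-in for the superblock generation counter. */
struct fake_sb {
	int generation;
};

/* Hypothetical stand-in for a dentry that caches a generation number. */
struct fake_dentry {
	struct fake_dentry *parent;	/* NULL for the root */
	struct fake_sb *sb;
	int generation;			/* generation this entry was built at */
	int rebuilds;			/* how many times it was refreshed */
};

/*
 * Returns 1 if the entry is (or has been made) valid, 0 otherwise.
 * A stale entry first revalidates its parent, then refreshes itself.
 */
static int revalidate(struct fake_dentry *d)
{
	if (d->generation == d->sb->generation)
		return 1;	/* fast path: nothing changed */

	if (d->parent && !revalidate(d->parent))
		return 0;	/* cannot be valid under an invalid parent */

	d->rebuilds++;		/* stands in for the fresh lookup */
	d->generation = d->sb->generation;
	return 1;
}
```

Bumping the superblock generation once and then revalidating a leaf
refreshes every stale ancestor on the way down exactly once; later calls
take the fast path until the generation changes again.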

diff -Nur -x linux-2.6-git/Documentation/dontdiff linux-2.6-git/fs/unionfs/dentry.c linux-2.6-git-unionfs/fs/unionfs/dentry.c
--- linux-2.6-git/fs/unionfs/dentry.c 1969-12-31 19:00:00.000000000 -0500
+++ linux-2.6-git-unionfs/fs/unionfs/dentry.c 2006-08-31 19:04:00.000000000 -0400
@@ -0,0 +1,236 @@
+/*
+ * Copyright (c) 2003-2006 Erez Zadok
+ * Copyright (c) 2003-2006 Charles P. Wright
+ * Copyright (c) 2005-2006 Josef 'Jeff' Sipek
+ * Copyright (c) 2005-2006 Junjiro Okajima
+ * Copyright (c) 2005 Arun M. Krishnakumar
+ * Copyright (c) 2004-2006 David P. Quigley
+ * Copyright (c) 2003-2004 Mohammad Nayyer Zubair
+ * Copyright (c) 2003 Puja Gupta
+ * Copyright (c) 2003 Harikesavan Krishnan
+ * Copyright (c) 2003-2006 Stony Brook University
+ * Copyright (c) 2003-2006 The Research Foundation of State University of New York
+ *
+ * For specific licensing information, see the COPYING file distributed with
+ * this package.
+ *
+ * This Copyright notice must be kept intact and distributed with all sources.
+ */
+
+#include "union.h"
+
+/* declarations added for "sparse" */
+extern int unionfs_d_revalidate_wrap(struct dentry *dentry,
+ struct nameidata *nd);
+extern void unionfs_d_release(struct dentry *dentry);
+extern void unionfs_d_iput(struct dentry *dentry, struct inode *inode);
+
+/*
+ * THIS IS A BOOLEAN FUNCTION: returns 1 if valid, 0 otherwise.
+ */
+int unionfs_d_revalidate(struct dentry *dentry, struct nameidata *nd)
+{
+ int valid = 1; /* default is valid (1); invalid is 0. */
+ struct dentry *hidden_dentry;
+ int bindex, bstart, bend;
+ int sbgen, dgen;
+ int positive = 0;
+ int locked = 0;
+ int restart = 0;
+ int interpose_flag;
+
+restart:
+ verify_locked(dentry);
+
+ /* if the dentry is unhashed, do NOT revalidate */
+ if (d_deleted(dentry)) {
+ printk(KERN_DEBUG "unhashed dentry being revalidated: %*s\n",
+ dentry->d_name.len, dentry->d_name.name);
+ goto out;
+ }
+
+ BUG_ON(dbstart(dentry) == -1);
+ if (dentry->d_inode)
+ positive = 1;
+ dgen = atomic_read(&dtopd(dentry)->udi_generation);
+ sbgen = atomic_read(&stopd(dentry->d_sb)->usi_generation);
+ /* If we are working on an unconnected dentry, then there is no
+ * revalidation to be done, because this file does not exist within the
+ * namespace, and Unionfs operates on the namespace, not data.
+ */
+ if (sbgen != dgen) {
+ struct dentry *result;
+ int pdgen;
+
+ unionfs_read_lock(dentry->d_sb);
+ locked = 1;
+
+ /* The root entry should always be valid */
+ BUG_ON(IS_ROOT(dentry));
+
+ /* We can't work correctly if our parent isn't valid. */
+ pdgen = atomic_read(&dtopd(dentry->d_parent)->udi_generation);
+ if (!restart && (pdgen != sbgen)) {
+ unionfs_read_unlock(dentry->d_sb);
+ locked = 0;
+ /* We must be locked before our parent. */
+ if (!
+ (dentry->d_parent->d_op->
+ d_revalidate(dentry->d_parent, nd))) {
+ valid = 0;
+ goto out;
+ }
+ restart = 1;
+ goto restart;
+ }
+ BUG_ON(pdgen != sbgen);
+
+ /* Free the pointers for our inodes and this dentry. */
+ bstart = dbstart(dentry);
+ bend = dbend(dentry);
+ if (bstart >= 0) {
+ struct dentry *hidden_dentry;
+ for (bindex = bstart; bindex <= bend; bindex++) {
+ hidden_dentry =
+ dtohd_index_nocheck(dentry, bindex);
+ if (!hidden_dentry)
+ continue;
+ dput(hidden_dentry);
+ }
+ }
+ set_dbstart(dentry, -1);
+ set_dbend(dentry, -1);
+
+ interpose_flag = INTERPOSE_REVAL_NEG;
+ if (positive) {
+ interpose_flag = INTERPOSE_REVAL;
+ mutex_lock(&dentry->d_inode->i_mutex);
+ bstart = ibstart(dentry->d_inode);
+ bend = ibend(dentry->d_inode);
+ if (bstart >= 0) {
+ struct inode *hidden_inode;
+ for (bindex = bstart; bindex <= bend; bindex++) {
+ hidden_inode =
+ itohi_index(dentry->d_inode,
+ bindex);
+ if (!hidden_inode)
+ continue;
+ iput(hidden_inode);
+ }
+ }
+ kfree(itohi_ptr(dentry->d_inode));
+ itohi_ptr(dentry->d_inode) = NULL;
+ ibstart(dentry->d_inode) = -1;
+ ibend(dentry->d_inode) = -1;
+ mutex_unlock(&dentry->d_inode->i_mutex);
+ }
+
+ result = unionfs_lookup_backend(dentry, interpose_flag);
+ if (result) {
+ if (IS_ERR(result)) {
+ valid = 0;
+ goto out;
+ }
+ /* current unionfs_lookup_backend() doesn't return
+ a valid dentry */
+ dput(dentry);
+ dentry = result;
+ }
+
+ if (positive && itopd(dentry->d_inode)->uii_stale) {
+ make_stale_inode(dentry->d_inode);
+ d_drop(dentry);
+ valid = 0;
+ goto out;
+ }
+ goto out;
+ }
+
+ /* The revalidation must occur across all branches */
+ bstart = dbstart(dentry);
+ bend = dbend(dentry);
+ BUG_ON(bstart == -1);
+ for (bindex = bstart; bindex <= bend; bindex++) {
+ hidden_dentry = dtohd_index(dentry, bindex);
+ if (!hidden_dentry || !hidden_dentry->d_op
+ || !hidden_dentry->d_op->d_revalidate)
+ continue;
+
+ if (!hidden_dentry->d_op->d_revalidate(hidden_dentry, nd))
+ valid = 0;
+ }
+
+ if (!dentry->d_inode)
+ valid = 0;
+ if (valid)
+ fist_copy_attr_all(dentry->d_inode, itohi(dentry->d_inode));
+
+out:
+ if (locked)
+ unionfs_read_unlock(dentry->d_sb);
+ return valid;
+}
+
+int unionfs_d_revalidate_wrap(struct dentry *dentry, struct nameidata *nd)
+{
+ int err;
+
+ lock_dentry(dentry);
+
+ err = unionfs_d_revalidate(dentry, nd);
+
+ unlock_dentry(dentry);
+ return err;
+}
+
+void unionfs_d_release(struct dentry *dentry)
+{
+ struct dentry *hidden_dentry;
+ int bindex, bstart, bend;
+
+ /* There is no reason to lock the dentry, because we have the only
+ * reference, but the printing functions verify that we have a lock
+ * on the dentry before calling dbstart, etc. */
+ lock_dentry(dentry);
+
+ /* this could be a negative dentry, so check first */
+ if (!dtopd(dentry)) {
+ printk(KERN_DEBUG "dentry without private data: %*s",
+ dentry->d_name.len, dentry->d_name.name);
+ goto out;
+ } else if (dbstart(dentry) < 0) {
+ /* this is due to a failed lookup */
+ /* the failed lookup has a dtohd_ptr set to null,
+ but this is a better check */
+ printk(KERN_DEBUG "dentry without hidden dentries : %*s",
+ dentry->d_name.len, dentry->d_name.name);
+ goto out_free;
+ }
+
+ /* Release all the hidden dentries */
+ bstart = dbstart(dentry);
+ bend = dbend(dentry);
+ for (bindex = bstart; bindex <= bend; bindex++) {
+ hidden_dentry = dtohd_index(dentry, bindex);
+ dput(hidden_dentry);
+ set_dtohd_index(dentry, bindex, NULL);
+ }
+ /* free private data (unionfs_dentry_info) here */
+ kfree(dtohd_ptr(dentry));
+ dtohd_ptr(dentry) = NULL;
+
+out_free:
+ /* No need to unlock it, because it is going away. */
+ free_dentry_private_data(dtopd(dentry));
+ dtopd_lhs(dentry) = NULL; /* just to be safe */
+
+out:
+ /* This is here to make the compiler happy */
+ return;
+}
+
+struct dentry_operations unionfs_dops = {
+ .d_revalidate = unionfs_d_revalidate_wrap,
+ .d_release = unionfs_d_release,
+};
+

2006-09-01 01:45:41

by Josef 'Jeff' Sipek

Subject: [PATCH 07/22][RFC] Unionfs: Directory file operations

From: Josef "Jeff" Sipek <[email protected]>

This patch provides directory file operations.

Signed-off-by: Josef "Jeff" Sipek <[email protected]>
Signed-off-by: David Quigley <[email protected]>
Signed-off-by: Erez Zadok <[email protected]>

---

fs/unionfs/dirfops.c | 270 +++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 270 insertions(+)
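
The whiteout and duplicate handling in unionfs_filldir() can be modelled in
userspace roughly as below. This is a sketch, not the patch's code: the
fixed-size table replaces the rdstate hash, merge_entry() and struct
merge_state are hypothetical names, and the ".wh." prefix is an assumed
value for UNIONFS_WHPFX, which is defined outside this patch.

```c
#include <stddef.h>
#include <string.h>

#define WH_PREFIX ".wh."	/* assumed value of UNIONFS_WHPFX */
#define WH_LEN 4
#define MAX_SEEN 64		/* arbitrary table size for the sketch */

struct seen_name {
	const char *name;
	size_t len;
};

struct merge_state {
	struct seen_name seen[MAX_SEEN];
	int nseen;	/* names recorded so far, whiteouts included */
	int emitted;	/* names actually passed on to the caller */
};

static int already_seen(const struct merge_state *st,
			const char *name, size_t len)
{
	int i;

	for (i = 0; i < st->nseen; i++)
		if (st->seen[i].len == len &&
		    memcmp(st->seen[i].name, name, len) == 0)
			return 1;
	return 0;
}

/*
 * Feed one directory entry, in branch priority order.
 * Returns 1 if the name is emitted, 0 if it is suppressed (a duplicate,
 * a whiteout, or a name hidden by an earlier whiteout), -1 if the table
 * is full.
 */
static int merge_entry(struct merge_state *st, const char *name)
{
	size_t len = strlen(name);
	int is_whiteout = 0;

	if (len > WH_LEN && strncmp(name, WH_PREFIX, WH_LEN) == 0) {
		name += WH_LEN;
		len -= WH_LEN;
		is_whiteout = 1;
	}

	if (already_seen(st, name, len))
		return 0;
	if (st->nseen == MAX_SEEN)
		return -1;

	/* Record the name either way so lower branches cannot show it. */
	st->seen[st->nseen].name = name;
	st->seen[st->nseen].len = len;
	st->nseen++;

	if (is_whiteout)
		return 0;
	st->emitted++;
	return 1;
}
```

Feeding "a", ".wh.b", "b", "a", "c" in that order emits only "a" and "c":
the second "a" is a duplicate, and "b" is hidden by the whiteout seen
before it.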

diff -Nur -x linux-2.6-git/Documentation/dontdiff linux-2.6-git/fs/unionfs/dirfops.c linux-2.6-git-unionfs/fs/unionfs/dirfops.c
--- linux-2.6-git/fs/unionfs/dirfops.c 1969-12-31 19:00:00.000000000 -0500
+++ linux-2.6-git-unionfs/fs/unionfs/dirfops.c 2006-08-31 19:04:00.000000000 -0400
@@ -0,0 +1,270 @@
+/*
+ * Copyright (c) 2003-2006 Erez Zadok
+ * Copyright (c) 2003-2006 Charles P. Wright
+ * Copyright (c) 2005-2006 Josef 'Jeff' Sipek
+ * Copyright (c) 2005-2006 Junjiro Okajima
+ * Copyright (c) 2005 Arun M. Krishnakumar
+ * Copyright (c) 2004-2006 David P. Quigley
+ * Copyright (c) 2003-2004 Mohammad Nayyer Zubair
+ * Copyright (c) 2003 Puja Gupta
+ * Copyright (c) 2003 Harikesavan Krishnan
+ * Copyright (c) 2003-2006 Stony Brook University
+ * Copyright (c) 2003-2006 The Research Foundation of State University of New York
+ *
+ * For specific licensing information, see the COPYING file distributed with
+ * this package.
+ *
+ * This Copyright notice must be kept intact and distributed with all sources.
+ */
+
+#include "union.h"
+
+/* Make sure our rdstate is playing by the rules. */
+static void verify_rdstate_offset(struct unionfs_dir_state *rdstate)
+{
+ BUG_ON(rdstate->uds_offset >= DIREOF);
+ BUG_ON(rdstate->uds_cookie >= MAXRDCOOKIE);
+}
+
+struct unionfs_getdents_callback {
+ struct unionfs_dir_state *rdstate;
+ void *dirent;
+ int entries_written;
+ int filldir_called;
+ int filldir_error;
+ filldir_t filldir;
+ struct super_block *sb;
+};
+
+/* copied from generic filldir in fs/readdir.c */
+static int unionfs_filldir(void *dirent, const char *name, int namelen,
+ loff_t offset, ino_t ino, unsigned int d_type)
+{
+ struct unionfs_getdents_callback *buf =
+ (struct unionfs_getdents_callback *)dirent;
+ struct filldir_node *found = NULL;
+ int err = 0;
+ int is_wh_entry = 0;
+
+ buf->filldir_called++;
+
+ if ((namelen > UNIONFS_WHLEN) && !strncmp(name, UNIONFS_WHPFX, UNIONFS_WHLEN)) {
+ name += UNIONFS_WHLEN;
+ namelen -= UNIONFS_WHLEN;
+ is_wh_entry = 1;
+ }
+
+ found = find_filldir_node(buf->rdstate, name, namelen);
+
+ if (found)
+ goto out;
+
+ /* If 'name' isn't a whiteout, pass it to filldir. */
+ if (!is_wh_entry) {
+ off_t pos = rdstate2offset(buf->rdstate);
+ ino_t unionfs_ino = ino;
+
+ if (!err) {
+ err = buf->filldir(buf->dirent, name, namelen, pos,
+ unionfs_ino, d_type);
+ buf->rdstate->uds_offset++;
+ verify_rdstate_offset(buf->rdstate);
+ }
+ }
+ /* If we did fill it, stuff it in our hash, otherwise return an error */
+ if (err) {
+ buf->filldir_error = err;
+ goto out;
+ }
+ buf->entries_written++;
+ if ((err = add_filldir_node(buf->rdstate, name, namelen,
+ buf->rdstate->uds_bindex, is_wh_entry)))
+ buf->filldir_error = err;
+
+out:
+ return err;
+}
+
+static int unionfs_readdir(struct file *file, void *dirent, filldir_t filldir)
+{
+ int err = 0;
+ struct file *hidden_file = NULL;
+ struct inode *inode = NULL;
+ struct unionfs_getdents_callback buf;
+ struct unionfs_dir_state *uds;
+ int bend;
+ loff_t offset;
+
+ if ((err = unionfs_file_revalidate(file, 0)))
+ goto out;
+
+ inode = file->f_dentry->d_inode;
+
+ uds = ftopd(file)->rdstate;
+ if (!uds) {
+ if (file->f_pos == DIREOF) {
+ goto out;
+ } else if (file->f_pos > 0) {
+ uds = find_rdstate(inode, file->f_pos);
+ if (!uds) {
+ err = -ESTALE;
+ goto out;
+ }
+ ftopd(file)->rdstate = uds;
+ } else {
+ init_rdstate(file);
+ uds = ftopd(file)->rdstate;
+ }
+ }
+ bend = fbend(file);
+
+ while (uds->uds_bindex <= bend) {
+ hidden_file = ftohf_index(file, uds->uds_bindex);
+ if (!hidden_file) {
+ uds->uds_bindex++;
+ uds->uds_dirpos = 0;
+ continue;
+ }
+
+ /* prepare callback buffer */
+ buf.filldir_called = 0;
+ buf.filldir_error = 0;
+ buf.entries_written = 0;
+ buf.dirent = dirent;
+ buf.filldir = filldir;
+ buf.rdstate = uds;
+ buf.sb = inode->i_sb;
+
+ /* Read starting from where we last left off. */
+ offset = vfs_llseek(hidden_file, uds->uds_dirpos, 0);
+ if (offset < 0) {
+ err = offset;
+ goto out;
+ }
+ err = vfs_readdir(hidden_file, unionfs_filldir, (void *)&buf);
+ /* Save the position for when we continue. */
+
+ offset = vfs_llseek(hidden_file, 0, 1);
+ if (offset < 0) {
+ err = offset;
+ goto out;
+ }
+ uds->uds_dirpos = offset;
+
+ /* Copy the atime. */
+ fist_copy_attr_atime(inode, hidden_file->f_dentry->d_inode);
+
+ if (err < 0) {
+ goto out;
+ }
+
+ if (buf.filldir_error) {
+ break;
+ }
+
+ if (!buf.entries_written) {
+ uds->uds_bindex++;
+ uds->uds_dirpos = 0;
+ }
+ }
+
+ if (!buf.filldir_error && uds->uds_bindex >= bend) {
+ /* Save the number of hash entries for next time. */
+ itopd(inode)->uii_hashsize = uds->uds_hashentries;
+ free_rdstate(uds);
+ ftopd(file)->rdstate = NULL;
+ file->f_pos = DIREOF;
+ } else {
+ file->f_pos = rdstate2offset(uds);
+ }
+
+out:
+ return err;
+}
+
+/* This is not meant to be a generic repositioning function. If you do
+ * things that aren't supported, then we return EINVAL.
+ *
+ * What is allowed:
+ * (1) seeking to the same position that you are currently at
+ * This really has no effect, but returns where you are.
+ * (2) seeking to the end of the file, if you've read everything
+ * This really has no effect, but returns where you are.
+ * (3) seeking to the beginning of the file
+ * This throws out all state, and lets you begin again.
+ */
+static loff_t unionfs_dir_llseek(struct file *file, loff_t offset, int origin)
+{
+ struct unionfs_dir_state *rdstate;
+ loff_t err;
+
+ if ((err = unionfs_file_revalidate(file, 0)))
+ goto out;
+
+ rdstate = ftopd(file)->rdstate;
+
+ /* We let users seek to their current position, but not anywhere else. */
+ if (!offset) {
+ switch (origin) {
+ case SEEK_SET:
+ if (rdstate) {
+ free_rdstate(rdstate);
+ ftopd(file)->rdstate = NULL;
+ }
+ init_rdstate(file);
+ err = 0;
+ break;
+ case SEEK_CUR:
+ err = file->f_pos;
+ break;
+ case SEEK_END:
+ /* Unsupported, because we would break everything. */
+ err = -EINVAL;
+ break;
+ }
+ } else {
+ switch (origin) {
+ case SEEK_SET:
+ if (rdstate) {
+ if (offset == rdstate2offset(rdstate)) {
+ err = offset;
+ } else if (file->f_pos == DIREOF) {
+ err = DIREOF;
+ } else {
+ err = -EINVAL;
+ }
+ } else {
+ if ((rdstate = find_rdstate(file->f_dentry->d_inode,
+ offset))) {
+ ftopd(file)->rdstate = rdstate;
+ err = rdstate->uds_offset;
+ } else {
+ err = -EINVAL;
+ }
+ }
+ break;
+ case SEEK_CUR:
+ case SEEK_END:
+ /* Unsupported, because we would break everything. */
+ err = -EINVAL;
+ break;
+ }
+ }
+
+out:
+ return err;
+}
+
+/* Trimmed directory operations: we shouldn't pass everything down, since
+ * we don't want to operate on partial directories.
+ */
+struct file_operations unionfs_dir_fops = {
+ .llseek = unionfs_dir_llseek,
+ .read = generic_read_dir,
+ .readdir = unionfs_readdir,
+ .unlocked_ioctl = unionfs_ioctl,
+ .open = unionfs_open,
+ .release = unionfs_file_release,
+ .flush = unionfs_flush,
+};
+
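
A minimal userspace sketch, not part of the patch, of the seek policy that
the comment above unionfs_dir_llseek() describes. Every model_/MODEL_ name is
hypothetical. MODEL_DIREOF is a stand-in value, the model treats
rdstate2offset() as simply the current position, it assumes a rewind resets
the position to 0, and it leaves out the find_rdstate() cache lookup (it
answers -EINVAL where the patch would consult the cache).

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>	/* SEEK_SET, SEEK_CUR, SEEK_END */

/* Hypothetical stand-in for the patch's DIREOF marker. */
#define MODEL_DIREOF 0x7fffffffL

/* Simplified stand-in for the per-open-file readdir state. */
struct model_dir {
	long pos;		/* models file->f_pos */
	int has_rdstate;	/* models ftopd(file)->rdstate != NULL */
};

/* Returns the resulting position, or -EINVAL for an unsupported seek. */
long model_dir_llseek(struct model_dir *d, long offset, int origin)
{
	if (offset == 0) {
		switch (origin) {
		case SEEK_SET:
			/* Rewind: throw away all state and begin again. */
			d->has_rdstate = 1;
			d->pos = 0;
			return 0;
		case SEEK_CUR:
			/* No effect; just report where we are. */
			return d->pos;
		default:
			/* SEEK_END (and anything else) is rejected here. */
			return -EINVAL;
		}
	}

	/* A non-zero offset is only meaningful with SEEK_SET. */
	if (origin != SEEK_SET || !d->has_rdstate)
		return -EINVAL;
	if (offset == d->pos)
		return offset;		/* seeking to the current position */
	if (d->pos == MODEL_DIREOF)
		return MODEL_DIREOF;	/* everything was already read */
	return -EINVAL;
}
```

With a state at position 42, model_dir_llseek(&d, 42, SEEK_SET) returns 42,
while model_dir_llseek(&d, 7, SEEK_SET) returns -EINVAL. Unlike this sketch,
the switch statements in the patch have no default case.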

2006-09-01 01:47:21

by Josef 'Jeff' Sipek

Subject: [PATCH 08/22][RFC] Unionfs: Directory manipulation helper functions

From: Josef "Jeff" Sipek <[email protected]>

This patch contains directory manipulation helper functions.

Signed-off-by: Josef "Jeff" Sipek <[email protected]>
Signed-off-by: David Quigley <[email protected]>
Signed-off-by: Erez Zadok <[email protected]>

---

fs/unionfs/dirhelper.c | 268 +++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 268 insertions(+)

diff -Nur -x linux-2.6-git/Documentation/dontdiff linux-2.6-git/fs/unionfs/dirhelper.c linux-2.6-git-unionfs/fs/unionfs/dirhelper.c
--- linux-2.6-git/fs/unionfs/dirhelper.c 1969-12-31 19:00:00.000000000 -0500
+++ linux-2.6-git-unionfs/fs/unionfs/dirhelper.c 2006-08-31 19:04:00.000000000 -0400
@@ -0,0 +1,268 @@
+/*
+ * Copyright (c) 2003-2006 Erez Zadok
+ * Copyright (c) 2003-2006 Charles P. Wright
+ * Copyright (c) 2005-2006 Josef 'Jeff' Sipek
+ * Copyright (c) 2005-2006 Junjiro Okajima
+ * Copyright (c) 2005 Arun M. Krishnakumar
+ * Copyright (c) 2004-2006 David P. Quigley
+ * Copyright (c) 2003-2004 Mohammad Nayyer Zubair
+ * Copyright (c) 2003 Puja Gupta
+ * Copyright (c) 2003 Harikesavan Krishnan
+ * Copyright (c) 2003-2006 Stony Brook University
+ * Copyright (c) 2003-2006 The Research Foundation of State University of New York
+ *
+ * For specific licensing information, see the COPYING file distributed with
+ * this package.
+ *
+ * This Copyright notice must be kept intact and distributed with all sources.
+ */
+
+#include "union.h"
+
+/* Delete all of the whiteouts in a given directory for rmdir. */
+int do_delete_whiteouts(struct dentry *dentry, int bindex,
+ struct unionfs_dir_state *namelist)
+{
+ int err = 0;
+ struct dentry *hidden_dir_dentry = NULL;
+ struct dentry *hidden_dentry;
+ char *name = NULL, *p;
+ struct inode *hidden_dir;
+
+ int i;
+ struct list_head *pos;
+ struct filldir_node *cursor;
+
+ /* Find out hidden parent dentry */
+ hidden_dir_dentry = dtohd_index(dentry, bindex);
+ BUG_ON(!S_ISDIR(hidden_dir_dentry->d_inode->i_mode));
+ hidden_dir = hidden_dir_dentry->d_inode;
+ BUG_ON(!S_ISDIR(hidden_dir->i_mode));
+
+ err = -ENOMEM;
+ name = __getname();
+ if (!name)
+ goto out;
+ strcpy(name, UNIONFS_WHPFX);
+ p = name + UNIONFS_WHLEN;
+
+ err = 0;
+ for (i = 0; !err && i < namelist->uds_size; i++) {
+ list_for_each(pos, &namelist->uds_list[i]) {
+ cursor =
+ list_entry(pos, struct filldir_node, file_list);
+ /* Only operate on whiteouts in this branch. */
+ if (cursor->bindex != bindex)
+ continue;
+ if (!cursor->whiteout)
+ continue;
+
+ strcpy(p, cursor->name);
+ hidden_dentry =
+ lookup_one_len(name, hidden_dir_dentry,
+ cursor->namelen + UNIONFS_WHLEN);
+ if (IS_ERR(hidden_dentry)) {
+ err = PTR_ERR(hidden_dentry);
+ break;
+ }
+ if (hidden_dentry->d_inode)
+ err = vfs_unlink(hidden_dir, hidden_dentry);
+ dput(hidden_dentry);
+ if (err)
+ break;
+ }
+ }
+
+ __putname(name);
+
+ /* After all of the removals, we should copy the attributes once. */
+ fist_copy_attr_times(dentry->d_inode, hidden_dir_dentry->d_inode);
+
+out:
+ return err;
+}
+
+int delete_whiteouts(struct dentry *dentry, int bindex,
+ struct unionfs_dir_state *namelist)
+{
+ int err;
+ struct super_block *sb;
+ struct dentry *hidden_dir_dentry;
+ struct inode *hidden_dir;
+
+ struct sioq_args args;
+
+ sb = dentry->d_sb;
+ unionfs_read_lock(sb);
+
+ BUG_ON(!S_ISDIR(dentry->d_inode->i_mode));
+ BUG_ON(bindex < dbstart(dentry));
+ BUG_ON(bindex > dbend(dentry));
+ err = is_robranch_super(sb, bindex);
+ if (err)
+ goto out;
+
+ hidden_dir_dentry = dtohd_index(dentry, bindex);
+ BUG_ON(!S_ISDIR(hidden_dir_dentry->d_inode->i_mode));
+ hidden_dir = hidden_dir_dentry->d_inode;
+ BUG_ON(!S_ISDIR(hidden_dir->i_mode));
+
+ mutex_lock(&hidden_dir->i_mutex);
+ if (!permission(hidden_dir, MAY_WRITE | MAY_EXEC, NULL))
+ err = do_delete_whiteouts(dentry, bindex, namelist);
+ else {
+ args.u.deletewh.namelist = namelist;
+ args.u.deletewh.dentry = dentry;
+ args.u.deletewh.bindex = bindex;
+ run_sioq(__delete_whiteouts, &args);
+ err = args.err;
+ }
+ mutex_unlock(&hidden_dir->i_mutex);
+
+out:
+ unionfs_read_unlock(sb);
+ return err;
+}
+
+#define RD_NONE 0
+#define RD_CHECK_EMPTY 1
+/* The callback structure for check_empty. */
+struct unionfs_rdutil_callback {
+ int err;
+ int filldir_called;
+ struct unionfs_dir_state *rdstate;
+ int mode;
+};
+
+/* This filldir function makes sure only whiteouts exist within a directory. */
+static int readdir_util_callback(void *dirent, const char *name, int namelen,
+ loff_t offset, ino_t ino, unsigned int d_type)
+{
+ int err = 0;
+ struct unionfs_rdutil_callback *buf =
+ (struct unionfs_rdutil_callback *)dirent;
+ int whiteout = 0;
+ struct filldir_node *found;
+
+ buf->filldir_called = 1;
+
+ if (name[0] == '.'
+ && (namelen == 1 || (name[1] == '.' && namelen == 2)))
+ goto out;
+
+ if ((namelen > UNIONFS_WHLEN) && !strncmp(name, UNIONFS_WHPFX, UNIONFS_WHLEN)) {
+ namelen -= UNIONFS_WHLEN;
+ name += UNIONFS_WHLEN;
+ whiteout = 1;
+ }
+
+ found = find_filldir_node(buf->rdstate, name, namelen);
+ /* If it was found in the table there was a previous whiteout. */
+ if (found)
+ goto out;
+
+ /* If it wasn't found and isn't a whiteout, the directory isn't empty. */
+ err = -ENOTEMPTY;
+ if ((buf->mode == RD_CHECK_EMPTY) && !whiteout)
+ goto out;
+
+ err = add_filldir_node(buf->rdstate, name, namelen,
+ buf->rdstate->uds_bindex, whiteout);
+
+out:
+ buf->err = err;
+ return err;
+}
+
+/* Is a directory logically empty? */
+int check_empty(struct dentry *dentry, struct unionfs_dir_state **namelist)
+{
+ int err = 0;
+ struct dentry *hidden_dentry = NULL;
+ struct super_block *sb;
+ struct file *hidden_file;
+ struct unionfs_rdutil_callback *buf = NULL;
+ int bindex, bstart, bend, bopaque;
+
+ sb = dentry->d_sb;
+
+ unionfs_read_lock(sb);
+
+ BUG_ON(!S_ISDIR(dentry->d_inode->i_mode));
+
+ if ((err = unionfs_partial_lookup(dentry)))
+ goto out;
+
+ bstart = dbstart(dentry);
+ bend = dbend(dentry);
+ bopaque = dbopaque(dentry);
+ if (0 <= bopaque && bopaque < bend)
+ bend = bopaque;
+
+ buf = kmalloc(sizeof(struct unionfs_rdutil_callback), GFP_KERNEL);
+ if (!buf) {
+ err = -ENOMEM;
+ goto out;
+ }
+ buf->err = 0;
+ buf->mode = RD_CHECK_EMPTY;
+ buf->rdstate = alloc_rdstate(dentry->d_inode, bstart);
+ if (!buf->rdstate) {
+ err = -ENOMEM;
+ goto out;
+ }
+
+ /* Process the hidden directories with rdutil_callback as a filldir. */
+ for (bindex = bstart; bindex <= bend; bindex++) {
+ hidden_dentry = dtohd_index(dentry, bindex);
+ if (!hidden_dentry)
+ continue;
+ if (!hidden_dentry->d_inode)
+ continue;
+ if (!S_ISDIR(hidden_dentry->d_inode->i_mode))
+ continue;
+
+ dget(hidden_dentry);
+ mntget(stohiddenmnt_index(sb, bindex));
+ branchget(sb, bindex);
+ hidden_file =
+ dentry_open(hidden_dentry, stohiddenmnt_index(sb, bindex),
+ O_RDONLY);
+ if (IS_ERR(hidden_file)) {
+ err = PTR_ERR(hidden_file);
+ dput(hidden_dentry);
+ branchput(sb, bindex);
+ goto out;
+ }
+
+ do {
+ buf->filldir_called = 0;
+ buf->rdstate->uds_bindex = bindex;
+ err = vfs_readdir(hidden_file,
+ readdir_util_callback, buf);
+ if (buf->err)
+ err = buf->err;
+ } while ((err >= 0) && buf->filldir_called);
+
+ /* fput calls dput for hidden_dentry */
+ fput(hidden_file);
+ branchput(sb, bindex);
+
+ if (err < 0)
+ goto out;
+ }
+
+out:
+ if (buf) {
+ if (namelist && !err)
+ *namelist = buf->rdstate;
+ else if (buf->rdstate)
+ free_rdstate(buf->rdstate);
+ kfree(buf);
+ }
+
+ unionfs_read_unlock(sb);
+
+ return err;
+}
+
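
A minimal userspace sketch, not part of the patch, of how
readdir_util_callback() decides logical emptiness in RD_CHECK_EMPTY mode. It
assumes the whiteout prefix is ".wh." with length 4 (suggested by the
".wh.foo" comments elsewhere in the series) and replaces the rdstate hash
table with a small fixed array. All model_/MODEL_ names are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define MODEL_WHPFX ".wh."	/* assumed value of UNIONFS_WHPFX */
#define MODEL_WHLEN 4		/* assumed value of UNIONFS_WHLEN */
#define MODEL_MAX_SEEN 64	/* arbitrary cap for this sketch */

/* Tiny stand-in for the rdstate hash table: names recorded so far. */
struct model_seen {
	const char *name[MODEL_MAX_SEEN];
	size_t len[MODEL_MAX_SEEN];
	size_t count;
};

static int model_seen_has(const struct model_seen *s, const char *name,
			  size_t len)
{
	size_t i;

	for (i = 0; i < s->count; i++)
		if (s->len[i] == len && !memcmp(s->name[i], name, len))
			return 1;
	return 0;
}

/* Follows the decision order of the callback for a single entry. */
int model_check_entry(struct model_seen *s, const char *name)
{
	size_t len = strlen(name);
	int whiteout = 0;

	/* "." and ".." never make a directory non-empty. */
	if (name[0] == '.' && (len == 1 || (name[1] == '.' && len == 2)))
		return 0;

	/* Strip the whiteout prefix and remember that we saw one. */
	if (len > MODEL_WHLEN && !strncmp(name, MODEL_WHPFX, MODEL_WHLEN)) {
		name += MODEL_WHLEN;
		len -= MODEL_WHLEN;
		whiteout = 1;
	}

	/* Already recorded while scanning an earlier branch. */
	if (model_seen_has(s, name, len))
		return 0;

	/* A new name that is not a whiteout is a visible entry. */
	if (!whiteout)
		return -ENOTEMPTY;

	if (s->count == MODEL_MAX_SEEN)
		return -ENOMEM;
	s->name[s->count] = name;
	s->len[s->count] = len;
	s->count++;
	return 0;
}

/* Entries must be supplied in branch order, leftmost branch first. */
int model_dir_is_empty(const char *const *entries, size_t n)
{
	struct model_seen seen = { {0}, {0}, 0 };
	size_t i;
	int err = 0;

	for (i = 0; i < n && !err; i++)
		err = model_check_entry(&seen, entries[i]);
	return err;
}
```

The order of entries matters: ".wh.foo" followed by "foo" counts as empty,
because the whiteout is recorded before the masked name shows up, whereas
"foo" followed by ".wh.foo" yields -ENOTEMPTY.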

2006-09-01 01:48:33

by Josef 'Jeff' Sipek

Subject: [PATCH 09/22][RFC] Unionfs: File operations

From: Josef "Jeff" Sipek <[email protected]>

This patch provides the file operations for Unionfs.

Signed-off-by: Josef "Jeff" Sipek <[email protected]>
Signed-off-by: David Quigley <[email protected]>
Signed-off-by: Erez Zadok <[email protected]>

---

fs/unionfs/file.c | 279 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 279 insertions(+)

diff -Nur -x linux-2.6-git/Documentation/dontdiff linux-2.6-git/fs/unionfs/file.c linux-2.6-git-unionfs/fs/unionfs/file.c
--- linux-2.6-git/fs/unionfs/file.c 1969-12-31 19:00:00.000000000 -0500
+++ linux-2.6-git-unionfs/fs/unionfs/file.c 2006-08-31 19:04:00.000000000 -0400
@@ -0,0 +1,279 @@
+/*
+ * Copyright (c) 2003-2006 Erez Zadok
+ * Copyright (c) 2003-2006 Charles P. Wright
+ * Copyright (c) 2005-2006 Josef 'Jeff' Sipek
+ * Copyright (c) 2005-2006 Junjiro Okajima
+ * Copyright (c) 2005 Arun M. Krishnakumar
+ * Copyright (c) 2004-2006 David P. Quigley
+ * Copyright (c) 2003-2004 Mohammad Nayyer Zubair
+ * Copyright (c) 2003 Puja Gupta
+ * Copyright (c) 2003 Harikesavan Krishnan
+ * Copyright (c) 2003-2006 Stony Brook University
+ * Copyright (c) 2003-2006 The Research Foundation of State University of New York
+ *
+ * For specific licensing information, see the COPYING file distributed with
+ * this package.
+ *
+ * This Copyright notice must be kept intact and distributed with all sources.
+ */
+
+#include "union.h"
+
+/* declarations for sparse */
+extern ssize_t unionfs_read(struct file *, char __user *, size_t, loff_t *);
+extern ssize_t unionfs_write(struct file *, const char __user *, size_t,
+ loff_t *);
+
+/*******************
+ * File Operations *
+ *******************/
+
+static loff_t unionfs_llseek(struct file *file, loff_t offset, int origin)
+{
+ loff_t err;
+ struct file *hidden_file = NULL;
+
+ if ((err = unionfs_file_revalidate(file, 0)))
+ goto out;
+
+ hidden_file = ftohf(file);
+ /* always set hidden position to this one */
+ hidden_file->f_pos = file->f_pos;
+
+ memcpy(&(hidden_file->f_ra), &(file->f_ra),
+ sizeof(struct file_ra_state));
+
+ if (hidden_file->f_op && hidden_file->f_op->llseek)
+ err = hidden_file->f_op->llseek(hidden_file, offset, origin);
+ else
+ err = generic_file_llseek(hidden_file, offset, origin);
+
+ if (err < 0)
+ goto out;
+ if (err != file->f_pos) {
+ file->f_pos = err;
+ // ION maybe this?
+ // file->f_pos = hidden_file->f_pos;
+
+ file->f_version++;
+ }
+out:
+ return err;
+}
+
+ssize_t __unionfs_read(struct file * file, char __user * buf, size_t count,
+ loff_t * ppos)
+{
+ int err = -EINVAL;
+ struct file *hidden_file = NULL;
+ loff_t pos = *ppos;
+
+ hidden_file = ftohf(file);
+ if (!hidden_file->f_op || !hidden_file->f_op->read)
+ goto out;
+
+ err = hidden_file->f_op->read(hidden_file, buf, count, &pos);
+ *ppos = pos;
+
+out:
+ return err;
+}
+
+ssize_t unionfs_read(struct file * file, char __user * buf, size_t count,
+ loff_t * ppos)
+{
+ int err = -EINVAL;
+
+ if ((err = unionfs_file_revalidate(file, 0)))
+ goto out;
+
+ err = __unionfs_read(file, buf, count, ppos);
+
+out:
+ return err;
+}
+
+ssize_t __unionfs_write(struct file * file, const char __user * buf,
+ size_t count, loff_t * ppos)
+{
+ int err = -EINVAL;
+ struct file *hidden_file = NULL;
+ struct inode *inode;
+ struct inode *hidden_inode;
+ loff_t pos = *ppos;
+ int bstart, bend;
+
+ inode = file->f_dentry->d_inode;
+
+ bstart = fbstart(file);
+ bend = fbend(file);
+
+ BUG_ON(bstart == -1);
+
+ hidden_file = ftohf(file);
+ hidden_inode = hidden_file->f_dentry->d_inode;
+
+ if (!hidden_file->f_op || !hidden_file->f_op->write)
+ goto out;
+
+ /* adjust for append -- seek to the end of the file */
+ if (file->f_flags & O_APPEND)
+ pos = inode->i_size;
+
+ err = hidden_file->f_op->write(hidden_file, buf, count, &pos);
+
+ /*
+ * copy ctime and mtime from lower layer attributes
+ * atime is unchanged for both layers
+ */
+ if (err >= 0)
+ fist_copy_attr_times(inode, hidden_inode);
+
+ *ppos = pos;
+
+ /* update this inode's size */
+ if (pos > inode->i_size)
+ inode->i_size = pos;
+out:
+ return err;
+}
+
+ssize_t unionfs_write(struct file * file, const char __user * buf, size_t count,
+ loff_t * ppos)
+{
+ int err = 0;
+
+ if ((err = unionfs_file_revalidate(file, 1)))
+ goto out;
+
+ err = __unionfs_write(file, buf, count, ppos);
+
+out:
+ return err;
+}
+
+static int unionfs_file_readdir(struct file *file, void *dirent,
+ filldir_t filldir)
+{
+ int err = -ENOTDIR;
+ return err;
+}
+
+static unsigned int unionfs_poll(struct file *file, poll_table * wait)
+{
+ unsigned int mask = DEFAULT_POLLMASK;
+ struct file *hidden_file = NULL;
+
+ if (unionfs_file_revalidate(file, 0)) {
+ /* We should pretend an error happened. */
+ mask = POLLERR | POLLIN | POLLOUT;
+ goto out;
+ }
+
+ hidden_file = ftohf(file);
+
+ if (!hidden_file->f_op || !hidden_file->f_op->poll)
+ goto out;
+
+ mask = hidden_file->f_op->poll(hidden_file, wait);
+
+out:
+ return mask;
+}
+
+static int __do_mmap(struct file *file, struct vm_area_struct *vma)
+{
+ int err;
+ struct file *hidden_file;
+
+ hidden_file = ftohf(file);
+
+ err = -ENODEV;
+ if (!hidden_file->f_op || !hidden_file->f_op->mmap)
+ goto out;
+
+ vma->vm_file = hidden_file;
+ err = hidden_file->f_op->mmap(hidden_file, vma);
+ get_file(hidden_file); /* make sure it doesn't get freed on us */
+ fput(file); /* no need to keep extra ref on ours */
+out:
+ return err;
+}
+
+/* SP: the mmap code now maps the upper file. Like the old code, it only
+ * copies up at this point. It is possible to copy up in writepage(), but
+ * I haven't bothered with that, as only apt-get seems to want to write to
+ * a shared/writable mapping.
+ */
+static int unionfs_mmap(struct file *file, struct vm_area_struct *vma)
+{
+ int err = 0;
+ int willwrite;
+
+ /* This could perhaps be deferred to mmap's writepage. */
+ willwrite = ((vma->vm_flags | VM_SHARED | VM_WRITE) == vma->vm_flags);
+ if ((err = unionfs_file_revalidate(file, willwrite)))
+ goto out;
+
+ err = __do_mmap(file, vma);
+
+out:
+ return err;
+}
+
+static int unionfs_fsync(struct file *file, struct dentry *dentry, int datasync)
+{
+ int err;
+ struct file *hidden_file = NULL;
+
+ if ((err = unionfs_file_revalidate(file, 1)))
+ goto out;
+
+ hidden_file = ftohf(file);
+
+ err = -EINVAL;
+ if (!hidden_file->f_op || !hidden_file->f_op->fsync)
+ goto out;
+
+ mutex_lock(&hidden_file->f_dentry->d_inode->i_mutex);
+ err = hidden_file->f_op->fsync(hidden_file, hidden_file->f_dentry,
+ datasync);
+ mutex_unlock(&hidden_file->f_dentry->d_inode->i_mutex);
+
+out:
+ return err;
+}
+
+/* SP: disabled as none of the other in kernel fs's seem to use it */
+static int unionfs_fasync(int fd, struct file *file, int flag)
+{
+ int err = 0;
+ struct file *hidden_file = NULL;
+
+ if ((err = unionfs_file_revalidate(file, 1)))
+ goto out;
+
+ hidden_file = ftohf(file);
+
+ if (hidden_file->f_op && hidden_file->f_op->fasync)
+ err = hidden_file->f_op->fasync(fd, hidden_file, flag);
+
+out:
+ return err;
+}
+
+struct file_operations unionfs_main_fops = {
+ .llseek = unionfs_llseek,
+ .read = unionfs_read,
+ .write = unionfs_write,
+ .readdir = unionfs_file_readdir,
+ .poll = unionfs_poll,
+ .unlocked_ioctl = unionfs_ioctl,
+ .mmap = unionfs_mmap,
+ .open = unionfs_open,
+ .flush = unionfs_flush,
+ .release = unionfs_file_release,
+ .fsync = unionfs_fsync,
+ .fasync = unionfs_fasync,
+};
+
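
A minimal userspace sketch, not part of the patch, of the willwrite test in
unionfs_mmap(). OR-ing flags into a value and comparing with the original is
true exactly when all of those flags were already set, so it is equivalent to
the more common mask-and-compare form. The MODEL_VM_* values are illustrative
stand-ins rather than values taken from the kernel headers, and the function
names are hypothetical.

```c
#include <assert.h>

/* Illustrative flag bits standing in for VM_WRITE and VM_SHARED. */
#define MODEL_VM_WRITE	0x2UL
#define MODEL_VM_SHARED	0x8UL

/* The form used in unionfs_mmap(): OR-ing the flags in changes nothing. */
int model_willwrite_or(unsigned long vm_flags)
{
	return (vm_flags | MODEL_VM_SHARED | MODEL_VM_WRITE) == vm_flags;
}

/* The mask-and-compare spelling of the same test. */
int model_willwrite_mask(unsigned long vm_flags)
{
	const unsigned long need = MODEL_VM_SHARED | MODEL_VM_WRITE;

	return (vm_flags & need) == need;
}
```

Both functions report a write intent only when the mapping is shared and
writable at the same time; a private writable mapping or a shared read-only
mapping does not qualify.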

2006-09-01 01:49:52

by Josef 'Jeff' Sipek

Subject: [PATCH 10/22][RFC] Unionfs: Inode operations

From: Josef "Jeff" Sipek <[email protected]>

This patch provides the inode operations for Unionfs.

Signed-off-by: Josef "Jeff" Sipek <[email protected]>
Signed-off-by: David Quigley <[email protected]>
Signed-off-by: Erez Zadok <[email protected]>

---

fs/unionfs/inode.c | 925 +++++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 925 insertions(+)

diff -Nur -x linux-2.6-git/Documentation/dontdiff linux-2.6-git/fs/unionfs/inode.c linux-2.6-git-unionfs/fs/unionfs/inode.c
--- linux-2.6-git/fs/unionfs/inode.c 1969-12-31 19:00:00.000000000 -0500
+++ linux-2.6-git-unionfs/fs/unionfs/inode.c 2006-08-31 19:04:00.000000000 -0400
@@ -0,0 +1,925 @@
+/*
+ * Copyright (c) 2003-2006 Erez Zadok
+ * Copyright (c) 2003-2006 Charles P. Wright
+ * Copyright (c) 2005-2006 Josef 'Jeff' Sipek
+ * Copyright (c) 2005-2006 Junjiro Okajima
+ * Copyright (c) 2005 Arun M. Krishnakumar
+ * Copyright (c) 2004-2006 David P. Quigley
+ * Copyright (c) 2003-2004 Mohammad Nayyer Zubair
+ * Copyright (c) 2003 Puja Gupta
+ * Copyright (c) 2003 Harikesavan Krishnan
+ * Copyright (c) 2003-2006 Stony Brook University
+ * Copyright (c) 2003-2006 The Research Foundation of State University of New York
+ *
+ * For specific licensing information, see the COPYING file distributed with
+ * this package.
+ *
+ * This Copyright notice must be kept intact and distributed with all sources.
+ */
+
+#include "union.h"
+
+/* declarations added for "sparse" */
+extern struct dentry *unionfs_lookup(struct inode *, struct dentry *,
+ struct nameidata *);
+extern int unionfs_readlink(struct dentry *dentry, char __user * buf,
+ int bufsiz);
+extern void unionfs_put_link(struct dentry *dentry, struct nameidata *nd,
+ void *cookie);
+
+static int unionfs_create(struct inode *parent, struct dentry *dentry,
+ int mode, struct nameidata *nd)
+{
+ int err = 0;
+ struct dentry *hidden_dentry = NULL;
+ struct dentry *whiteout_dentry = NULL;
+ struct dentry *new_hidden_dentry;
+ struct dentry *hidden_parent_dentry = NULL;
+ int bindex = 0, bstart;
+ char *name = NULL;
+
+ lock_dentry(dentry);
+
+ /* We start out in the leftmost branch. */
+ bstart = dbstart(dentry);
+ hidden_dentry = dtohd(dentry);
+
+ /* check if whiteout exists in this branch, i.e. lookup .wh.foo first */
+ name = alloc_whname(dentry->d_name.name, dentry->d_name.len);
+ if (IS_ERR(name)) {
+ err = PTR_ERR(name);
+ goto out;
+ }
+
+ whiteout_dentry =
+ lookup_one_len(name, hidden_dentry->d_parent,
+ dentry->d_name.len + UNIONFS_WHLEN);
+ if (IS_ERR(whiteout_dentry)) {
+ err = PTR_ERR(whiteout_dentry);
+ whiteout_dentry = NULL;
+ goto out;
+ }
+
+ if (whiteout_dentry->d_inode) {
+ /* .wh.foo has been found. */
+ /* First truncate it and then rename it to foo (hence having
+ * the same overall effect as a normal create).
+ */
+ struct dentry *hidden_dir_dentry;
+ struct iattr newattrs;
+
+ mutex_lock(&whiteout_dentry->d_inode->i_mutex);
+ newattrs.ia_valid = ATTR_CTIME | ATTR_MODE | ATTR_ATIME
+ | ATTR_MTIME | ATTR_UID | ATTR_GID | ATTR_FORCE
+ | ATTR_KILL_SUID | ATTR_KILL_SGID;
+
+ newattrs.ia_mode = mode & ~current->fs->umask;
+ newattrs.ia_uid = current->fsuid;
+ newattrs.ia_gid = current->fsgid;
+
+ if (whiteout_dentry->d_inode->i_size != 0) {
+ newattrs.ia_valid |= ATTR_SIZE;
+ newattrs.ia_size = 0;
+ }
+
+ err = notify_change(whiteout_dentry, &newattrs);
+
+ mutex_unlock(&whiteout_dentry->d_inode->i_mutex);
+
+ if (err)
+ printk(KERN_WARNING
+ "unionfs: %s:%d: notify_change failed: %d, ignoring..\n",
+ __FILE__, __LINE__, err);
+
+ new_hidden_dentry = dtohd(dentry);
+ dget(new_hidden_dentry);
+
+ hidden_dir_dentry = dget_parent(whiteout_dentry);
+ lock_rename(hidden_dir_dentry, hidden_dir_dentry);
+
+ if (!(err = is_robranch_super(dentry->d_sb, bstart))) {
+ err =
+ vfs_rename(hidden_dir_dentry->d_inode,
+ whiteout_dentry,
+ hidden_dir_dentry->d_inode,
+ new_hidden_dentry);
+ }
+ if (!err) {
+ fist_copy_attr_timesizes(parent,
+ new_hidden_dentry->d_parent->
+ d_inode);
+ parent->i_nlink = get_nlinks(parent);
+ }
+
+ unlock_rename(hidden_dir_dentry, hidden_dir_dentry);
+ dput(hidden_dir_dentry);
+
+ dput(new_hidden_dentry);
+
+ if (err) {
+ /* exit if the error returned was NOT -EROFS */
+ if (!IS_COPYUP_ERR(err))
+ goto out;
+ /* We were not able to create the file in this branch,
+ * so we try to create it one branch to the left
+ */
+ bstart--;
+ } else {
+ /* reset the unionfs dentry to point to the .wh.foo entry. */
+
+ /* Discard any old reference. */
+ dput(dtohd(dentry));
+
+ /* Trade one reference to another. */
+ set_dtohd_index(dentry, bstart, whiteout_dentry);
+ whiteout_dentry = NULL;
+
+ err = unionfs_interpose(dentry, parent->i_sb, 0);
+ goto out;
+ }
+ }
+
+ for (bindex = bstart; bindex >= 0; bindex--) {
+ hidden_dentry = dtohd_index(dentry, bindex);
+ if (!hidden_dentry) {
+ /* if hidden_dentry is NULL, create the entire
+ * dentry directory structure in branch 'bindex'.
+ * hidden_dentry will NOT be null when bindex == bstart
+ * because lookup passed us a negative unionfs dentry
+ * pointing to a lone negative underlying dentry */
+ hidden_dentry = create_parents(parent, dentry, bindex);
+ if (!hidden_dentry || IS_ERR(hidden_dentry)) {
+ if (IS_ERR(hidden_dentry))
+ err = PTR_ERR(hidden_dentry);
+ continue;
+ }
+ }
+
+ hidden_parent_dentry = lock_parent(hidden_dentry);
+ if (IS_ERR(hidden_parent_dentry)) {
+ err = PTR_ERR(hidden_parent_dentry);
+ goto out;
+ }
+ /* We shouldn't create things in a read-only branch. */
+ if (!(err = is_robranch_super(dentry->d_sb, bindex))) {
+ //DQ: vfs_create has a different prototype in 2.6
+ err = vfs_create(hidden_parent_dentry->d_inode,
+ hidden_dentry, mode, nd);
+ }
+ if (err || !hidden_dentry->d_inode) {
+ unlock_dir(hidden_parent_dentry);
+
+ /* break out of for loop if the error wasn't -EROFS */
+ if (!IS_COPYUP_ERR(err))
+ break;
+ } else {
+ err = unionfs_interpose(dentry, parent->i_sb, 0);
+ if (!err) {
+ fist_copy_attr_timesizes(parent,
+ hidden_parent_dentry->
+ d_inode);
+ /* update number of links on parent directory */
+ parent->i_nlink = get_nlinks(parent);
+ }
+ unlock_dir(hidden_parent_dentry);
+ break;
+ }
+ }
+
+out:
+ dput(whiteout_dentry);
+ kfree(name);
+
+ unlock_dentry(dentry);
+ return err;
+}
+
+struct dentry *unionfs_lookup(struct inode *parent, struct dentry *dentry,
+ struct nameidata *nd)
+{
+ /* The locking is done by unionfs_lookup_backend. */
+ return unionfs_lookup_backend(dentry, INTERPOSE_LOOKUP);
+}
+
+static int unionfs_link(struct dentry *old_dentry, struct inode *dir,
+ struct dentry *new_dentry)
+{
+ int err = 0;
+ struct dentry *hidden_old_dentry = NULL;
+ struct dentry *hidden_new_dentry = NULL;
+ struct dentry *hidden_dir_dentry = NULL;
+ struct dentry *whiteout_dentry;
+ char *name = NULL;
+
+ double_lock_dentry(new_dentry, old_dentry);
+
+ hidden_new_dentry = dtohd(new_dentry);
+
+ /* check if whiteout exists in the branch of new dentry, i.e. lookup
+ * .wh.foo first. If present, delete it */
+ name = alloc_whname(new_dentry->d_name.name, new_dentry->d_name.len);
+ if (IS_ERR(name)) {
+ err = PTR_ERR(name);
+ goto out;
+ }
+
+ whiteout_dentry =
+ lookup_one_len(name, hidden_new_dentry->d_parent,
+ new_dentry->d_name.len + UNIONFS_WHLEN);
+ if (IS_ERR(whiteout_dentry)) {
+ err = PTR_ERR(whiteout_dentry);
+ goto out;
+ }
+
+ if (!whiteout_dentry->d_inode) {
+ dput(whiteout_dentry);
+ whiteout_dentry = NULL;
+ } else {
+ /* found a .wh.foo entry, unlink it and then call vfs_link() */
+ hidden_dir_dentry = lock_parent(whiteout_dentry);
+ if (!
+ (err =
+ is_robranch_super(new_dentry->d_sb,
+ dbstart(new_dentry)))) {
+ err =
+ vfs_unlink(hidden_dir_dentry->d_inode,
+ whiteout_dentry);
+ }
+ fist_copy_attr_times(dir, hidden_dir_dentry->d_inode);
+ dir->i_nlink = get_nlinks(dir);
+ unlock_dir(hidden_dir_dentry);
+ hidden_dir_dentry = NULL;
+ dput(whiteout_dentry);
+ if (err)
+ goto out;
+ }
+
+ if (dbstart(old_dentry) != dbstart(new_dentry)) {
+ hidden_new_dentry =
+ create_parents(dir, new_dentry, dbstart(old_dentry));
+ err = PTR_ERR(hidden_new_dentry);
+ if (IS_COPYUP_ERR(err))
+ goto docopyup;
+ if (!hidden_new_dentry || IS_ERR(hidden_new_dentry))
+ goto out;
+ }
+ hidden_new_dentry = dtohd(new_dentry);
+ hidden_old_dentry = dtohd(old_dentry);
+
+ BUG_ON(dbstart(old_dentry) != dbstart(new_dentry));
+ hidden_dir_dentry = lock_parent(hidden_new_dentry);
+ if (!(err = is_robranch(old_dentry)))
+ err =
+ vfs_link(hidden_old_dentry, hidden_dir_dentry->d_inode,
+ hidden_new_dentry);
+ unlock_dir(hidden_dir_dentry);
+
+docopyup:
+ if (IS_COPYUP_ERR(err)) {
+ int old_bstart = dbstart(old_dentry);
+ int bindex;
+
+ for (bindex = old_bstart - 1; bindex >= 0; bindex--) {
+ err =
+ copyup_dentry(old_dentry->d_parent->
+ d_inode, old_dentry,
+ old_bstart, bindex, NULL,
+ old_dentry->d_inode->i_size);
+ if (!err) {
+ hidden_new_dentry =
+ create_parents(dir, new_dentry, bindex);
+ hidden_old_dentry = dtohd(old_dentry);
+ hidden_dir_dentry =
+ lock_parent(hidden_new_dentry);
+ /* do vfs_link */
+ err =
+ vfs_link(hidden_old_dentry,
+ hidden_dir_dentry->d_inode,
+ hidden_new_dentry);
+ unlock_dir(hidden_dir_dentry);
+ goto check_link;
+ }
+ }
+ goto out;
+ }
+check_link:
+ if (err || !hidden_new_dentry->d_inode)
+ goto out;
+
+ /* It's a hard link, so use the same inode */
+ new_dentry->d_inode = igrab(old_dentry->d_inode);
+ d_instantiate(new_dentry, new_dentry->d_inode);
+ fist_copy_attr_all(dir, hidden_new_dentry->d_parent->d_inode);
+ /* propagate number of hard-links */
+ old_dentry->d_inode->i_nlink = get_nlinks(old_dentry->d_inode);
+
+out:
+ if (!new_dentry->d_inode)
+ d_drop(new_dentry);
+
+ kfree(name);
+
+ unlock_dentry(new_dentry);
+ unlock_dentry(old_dentry);
+
+ return err;
+}
+
+static int unionfs_symlink(struct inode *dir, struct dentry *dentry,
+ const char *symname)
+{
+ int err = 0;
+ struct dentry *hidden_dentry = NULL;
+ struct dentry *whiteout_dentry = NULL;
+ struct dentry *hidden_dir_dentry = NULL;
+ umode_t mode;
+ int bindex = 0, bstart;
+ char *name = NULL;
+
+ lock_dentry(dentry);
+
+ /* We start out in the leftmost branch. */
+ bstart = dbstart(dentry);
+
+ hidden_dentry = dtohd(dentry);
+
+ /* check if whiteout exists in this branch, i.e. lookup .wh.foo first. If present, delete it */
+ name = alloc_whname(dentry->d_name.name, dentry->d_name.len);
+ if (IS_ERR(name)) {
+ err = PTR_ERR(name);
+ goto out;
+ }
+
+ whiteout_dentry =
+ lookup_one_len(name, hidden_dentry->d_parent,
+ dentry->d_name.len + UNIONFS_WHLEN);
+ if (IS_ERR(whiteout_dentry)) {
+ err = PTR_ERR(whiteout_dentry);
+ goto out;
+ }
+
+ if (!whiteout_dentry->d_inode) {
+ dput(whiteout_dentry);
+ whiteout_dentry = NULL;
+ } else {
+ /* found a .wh.foo entry, unlink it and then call vfs_symlink() */
+ hidden_dir_dentry = lock_parent(whiteout_dentry);
+
+ if (!(err = is_robranch_super(dentry->d_sb, bstart))) {
+ err =
+ vfs_unlink(hidden_dir_dentry->d_inode,
+ whiteout_dentry);
+ }
+ dput(whiteout_dentry);
+
+ fist_copy_attr_times(dir, hidden_dir_dentry->d_inode);
+ /* propagate number of hard-links */
+ dir->i_nlink = get_nlinks(dir);
+
+ unlock_dir(hidden_dir_dentry);
+
+ if (err) {
+ /* exit if the error returned was NOT -EROFS */
+ if (!IS_COPYUP_ERR(err))
+ goto out;
+ /* should now try to create the symlink in another branch */
+ bstart--;
+ }
+ }
+
+ /* deleted whiteout if it was present, now do a normal vfs_symlink() with
+ possible recursive directory creation */
+ for (bindex = bstart; bindex >= 0; bindex--) {
+ hidden_dentry = dtohd_index(dentry, bindex);
+ if (!hidden_dentry) {
+ /* if hidden_dentry is NULL, create the entire
+ * dentry directory structure in branch 'bindex'. hidden_dentry will NOT be null when
+ * bindex == bstart because lookup passed us a negative unionfs dentry pointing to a
+ * lone negative underlying dentry */
+ hidden_dentry = create_parents(dir, dentry, bindex);
+ if (!hidden_dentry || IS_ERR(hidden_dentry)) {
+ if (IS_ERR(hidden_dentry)) {
+ err = PTR_ERR(hidden_dentry);
+ }
+ printk(KERN_DEBUG
+ "hidden dentry NULL (or error) for bindex = %d\n",
+ bindex);
+ continue;
+ }
+ }
+
+ hidden_dir_dentry = lock_parent(hidden_dentry);
+
+ if (!(err = is_robranch_super(dentry->d_sb, bindex))) {
+ mode = S_IALLUGO;
+ err =
+ vfs_symlink(hidden_dir_dentry->d_inode,
+ hidden_dentry, symname, mode);
+ }
+ unlock_dir(hidden_dir_dentry);
+
+ if (err || !hidden_dentry->d_inode) {
+ /* break out of for loop if error returned was NOT -EROFS */
+ if (!IS_COPYUP_ERR(err))
+ break;
+ } else {
+ err = unionfs_interpose(dentry, dir->i_sb, 0);
+ if (!err) {
+ fist_copy_attr_timesizes(dir,
+ hidden_dir_dentry->
+ d_inode);
+ /* update number of links on parent directory */
+ dir->i_nlink = get_nlinks(dir);
+ }
+ break;
+ }
+ }
+
+out:
+ if (!dentry->d_inode)
+ d_drop(dentry);
+
+ kfree(name);
+ unlock_dentry(dentry);
+ return err;
+}
+
+static int unionfs_mkdir(struct inode *parent, struct dentry *dentry, int mode)
+{
+ int err = 0;
+ struct dentry *hidden_dentry = NULL, *whiteout_dentry = NULL;
+ struct dentry *hidden_parent_dentry = NULL;
+ int bindex = 0, bstart;
+ char *name = NULL;
+ int whiteout_unlinked = 0;
+ struct sioq_args args;
+
+ lock_dentry(dentry);
+ bstart = dbstart(dentry);
+
+ hidden_dentry = dtohd(dentry);
+
+ // check if whiteout exists in this branch, i.e. lookup .wh.foo first
+ name = alloc_whname(dentry->d_name.name, dentry->d_name.len);
+ if (IS_ERR(name)) {
+ err = PTR_ERR(name);
+ goto out;
+ }
+
+ whiteout_dentry = lookup_one_len(name, hidden_dentry->d_parent,
+ dentry->d_name.len + UNIONFS_WHLEN);
+ if (IS_ERR(whiteout_dentry)) {
+ err = PTR_ERR(whiteout_dentry);
+ goto out;
+ }
+
+ if (!whiteout_dentry->d_inode) {
+ dput(whiteout_dentry);
+ whiteout_dentry = NULL;
+ } else {
+ hidden_parent_dentry = lock_parent(whiteout_dentry);
+
+ // found a .wh.foo entry; remove it, then do vfs_mkdir
+ if (!(err = is_robranch_super(dentry->d_sb, bstart))) {
+ args.u.unlink.parent = hidden_parent_dentry->d_inode;
+ args.u.unlink.dentry = whiteout_dentry;
+ run_sioq(__unionfs_unlink, &args);
+ err = args.err;
+ }
+ dput(whiteout_dentry);
+
+ unlock_dir(hidden_parent_dentry);
+
+ if (err) {
+ /* exit if the error returned was NOT -EROFS */
+ if (!IS_COPYUP_ERR(err))
+ goto out;
+ bstart--;
+ } else {
+ whiteout_unlinked = 1;
+ }
+ }
+
+ for (bindex = bstart; bindex >= 0; bindex--) {
+ int i;
+ int bend = dbend(dentry);
+
+ if (is_robranch_super(dentry->d_sb, bindex))
+ continue;
+
+ hidden_dentry = dtohd_index(dentry, bindex);
+ if (!hidden_dentry) {
+ hidden_dentry = create_parents(parent, dentry, bindex);
+ if (!hidden_dentry || IS_ERR(hidden_dentry)) {
+ printk(KERN_DEBUG
+ "hidden dentry NULL for bindex = %d\n",
+ bindex);
+ continue;
+ }
+ }
+
+ hidden_parent_dentry = lock_parent(hidden_dentry);
+
+ if (IS_ERR(hidden_parent_dentry)) {
+ err = PTR_ERR(hidden_parent_dentry);
+ goto out;
+ }
+
+ err = vfs_mkdir(hidden_parent_dentry->d_inode, hidden_dentry, mode);
+
+ unlock_dir(hidden_parent_dentry);
+
+ /* did the mkdir succeed? */
+ if (err)
+ break;
+
+ for (i = bindex + 1; i < bend; i++) {
+ if (dtohd_index(dentry, i)) {
+ dput(dtohd_index(dentry, i));
+ set_dtohd_index(dentry, i, NULL);
+ }
+ }
+ set_dbend(dentry, bindex);
+
+ err = unionfs_interpose(dentry, parent->i_sb, 0);
+ if (!err) {
+ fist_copy_attr_timesizes(parent,
+ hidden_parent_dentry->d_inode);
+
+ /* update number of links on parent directory */
+ parent->i_nlink = get_nlinks(parent);
+ }
+
+ err = make_dir_opaque(dentry, dbstart(dentry));
+ if (err) {
+ printk(KERN_DEBUG "mkdir: error creating .wh.__dir_opaque: %d\n", err);
+ goto out;
+ }
+
+ /* we are done! */
+ break;
+ }
+
+out:
+ if (!dentry->d_inode)
+ d_drop(dentry);
+
+ kfree(name);
+
+ unlock_dentry(dentry);
+ return err;
+}
+
+static int unionfs_mknod(struct inode *dir, struct dentry *dentry, int mode,
+ dev_t dev)
+{
+ int err = 0;
+ struct dentry *hidden_dentry = NULL, *whiteout_dentry = NULL;
+ struct dentry *hidden_parent_dentry = NULL;
+ int bindex = 0, bstart;
+ char *name = NULL;
+ int whiteout_unlinked = 0;
+
+ lock_dentry(dentry);
+ bstart = dbstart(dentry);
+
+ hidden_dentry = dtohd(dentry);
+
+ // check if whiteout exists in this branch, i.e. lookup .wh.foo first
+ name = alloc_whname(dentry->d_name.name, dentry->d_name.len);
+ if (IS_ERR(name)) {
+ err = PTR_ERR(name);
+ goto out;
+ }
+
+ whiteout_dentry =
+ lookup_one_len(name, hidden_dentry->d_parent,
+ dentry->d_name.len + UNIONFS_WHLEN);
+ if (IS_ERR(whiteout_dentry)) {
+ err = PTR_ERR(whiteout_dentry);
+ goto out;
+ }
+
+ if (!whiteout_dentry->d_inode) {
+ dput(whiteout_dentry);
+ whiteout_dentry = NULL;
+ } else {
+ /* found .wh.foo, unlink it */
+ hidden_parent_dentry = lock_parent(whiteout_dentry);
+
+ /* found a .wh.foo entry, remove it then do vfs_mknod */
+ if (!(err = is_robranch_super(dentry->d_sb, bstart)))
+ err = vfs_unlink(hidden_parent_dentry->d_inode,
+ whiteout_dentry);
+ dput(whiteout_dentry);
+
+ unlock_dir(hidden_parent_dentry);
+
+ if (err) {
+ if (!IS_COPYUP_ERR(err))
+ goto out;
+
+ bstart--;
+ } else {
+ whiteout_unlinked = 1;
+ }
+ }
+
+ for (bindex = bstart; bindex >= 0; bindex--) {
+ if (is_robranch_super(dentry->d_sb, bindex))
+ continue;
+
+ hidden_dentry = dtohd_index(dentry, bindex);
+ if (!hidden_dentry) {
+ hidden_dentry = create_parents(dir, dentry, bindex);
+ if (IS_ERR(hidden_dentry)) {
+ printk(KERN_DEBUG
+ "failed to create parents on %d, err = %ld\n",
+ bindex, PTR_ERR(hidden_dentry));
+ continue;
+ }
+ }
+
+ hidden_parent_dentry = lock_parent(hidden_dentry);
+ if (IS_ERR(hidden_parent_dentry)) {
+ err = PTR_ERR(hidden_parent_dentry);
+ goto out;
+ }
+
+ err = vfs_mknod(hidden_parent_dentry->d_inode,
+ hidden_dentry, mode, dev);
+
+ if (err) {
+ unlock_dir(hidden_parent_dentry);
+ break;
+ }
+
+ err = unionfs_interpose(dentry, dir->i_sb, 0);
+ if (!err) {
+ fist_copy_attr_timesizes(dir,
+ hidden_parent_dentry->
+ d_inode);
+ /* update number of links on parent directory */
+ dir->i_nlink = get_nlinks(dir);
+ }
+ unlock_dir(hidden_parent_dentry);
+
+ break;
+ }
+
+out:
+ if (!dentry->d_inode)
+ d_drop(dentry);
+
+ if (name) {
+ kfree(name);
+ }
+
+ unlock_dentry(dentry);
+ return err;
+}
+
+int unionfs_readlink(struct dentry *dentry, char __user * buf, int bufsiz)
+{
+ int err;
+ struct dentry *hidden_dentry;
+
+ lock_dentry(dentry);
+ hidden_dentry = dtohd(dentry);
+
+ if (!hidden_dentry->d_inode->i_op ||
+ !hidden_dentry->d_inode->i_op->readlink) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ err = hidden_dentry->d_inode->i_op->readlink(hidden_dentry,
+ buf, bufsiz);
+ if (err > 0)
+ fist_copy_attr_atime(dentry->d_inode, hidden_dentry->d_inode);
+
+out:
+ unlock_dentry(dentry);
+ return err;
+}
+
+/* We don't lock the dentry here, because readlink does the heavy lifting. */
+static void *unionfs_follow_link(struct dentry *dentry, struct nameidata *nd)
+{
+ char *buf;
+ int len = PAGE_SIZE, err;
+ mm_segment_t old_fs;
+
+ /* This is freed by the put_link method assuming a successful call. */
+ buf = (char *)kmalloc(len, GFP_KERNEL);
+ if (!buf) {
+ err = -ENOMEM;
+ goto out;
+ }
+
+ /* read the symlink, and then we will follow it */
+ old_fs = get_fs();
+ set_fs(KERNEL_DS);
+ err = dentry->d_inode->i_op->readlink(dentry, (char __user *)buf, len);
+ set_fs(old_fs);
+ if (err < 0) {
+ kfree(buf);
+ buf = NULL;
+ goto out;
+ }
+ buf[err] = 0;
+ nd_set_link(nd, buf);
+ err = 0;
+
+out:
+ return ERR_PTR(err);
+}
+
+void unionfs_put_link(struct dentry *dentry, struct nameidata *nd, void *cookie)
+{
+ char *link;
+ link = nd_get_link(nd);
+ kfree(link);
+}
+
+/* Basically copied from the kernel vfs permission(), but we've changed
+ * the following: (1) the IS_RDONLY check is skipped, and (2) if you set
+ * the mount option `nfsperms=insecure', we assume that -EACCES means that
+ * the export is read-only and we should check standard Unix permissions.
+ * This means that NFS ACL checks (or other advanced permission features)
+ * are bypassed.
+ */
+static int inode_permission(struct inode *inode, int mask, struct nameidata *nd,
+ int bindex)
+{
+ int retval, submask;
+
+ if (mask & MAY_WRITE) {
+ /* The first branch is allowed to be really readonly. */
+ if (bindex == 0) {
+ umode_t mode = inode->i_mode;
+ if (IS_RDONLY(inode) && (S_ISREG(mode) || S_ISDIR(mode)
+ || S_ISLNK(mode)))
+ return -EROFS;
+ }
+ /*
+ * Nobody gets write access to an immutable file.
+ */
+ if (IS_IMMUTABLE(inode))
+ return -EACCES;
+ }
+
+ /* Ordinary permission routines do not understand MAY_APPEND. */
+ submask = mask & ~MAY_APPEND;
+ if (inode->i_op && inode->i_op->permission) {
+ retval = inode->i_op->permission(inode, submask, nd);
+ if ((retval == -EACCES) && (submask & MAY_WRITE) &&
+ (!strcmp("nfs", (inode)->i_sb->s_type->name)) &&
+ (nd) && (nd->mnt) && (nd->mnt->mnt_sb) &&
+ (branchperms(nd->mnt->mnt_sb, bindex) & MAY_NFSRO)) {
+ retval = generic_permission(inode, submask, NULL);
+ }
+ } else {
+ retval = generic_permission(inode, submask, NULL);
+ }
+
+ if (retval && retval != -EROFS) /* ignore EROFS */
+ return retval;
+
+ retval = security_inode_permission(inode, mask, nd);
+ return ((retval == -EROFS) ? 0 : retval); /* ignore EROFS */
+}
+
+static int unionfs_permission(struct inode *inode, int mask,
+ struct nameidata *nd)
+{
+ struct inode *hidden_inode = NULL;
+ int err = 0;
+ int bindex, bstart, bend;
+ const int is_file = !S_ISDIR(inode->i_mode);
+ const int write_mask = (mask & MAY_WRITE) && !(mask & MAY_READ);
+
+ bstart = ibstart(inode);
+ bend = ibend(inode);
+
+ for (bindex = bstart; bindex <= bend; bindex++) {
+ hidden_inode = itohi_index(inode, bindex);
+ if (!hidden_inode)
+ continue;
+
+ /* check the condition for D-F-D underlying files/directories,
+ * we don't have to check for files, if we are checking for
+ * directories.
+ */
+ if (!is_file && !S_ISDIR(hidden_inode->i_mode))
+ continue;
+ /* We use our own special version of permission, such that
+ * only the first branch returns -EROFS. */
+ err = inode_permission(hidden_inode, mask, nd, bindex);
+ /* The permissions are an intersection of the overall directory
+ * permissions, so we fail if one fails. */
+ if (err)
+ goto out;
+ /* only the leftmost file matters. */
+ if (is_file || write_mask) {
+ if (is_file && write_mask) {
+ err = get_write_access(hidden_inode);
+ if (!err)
+ put_write_access(hidden_inode);
+ }
+ break;
+ }
+ }
+
+out:
+ return err;
+}
+
+static int unionfs_setattr(struct dentry *dentry, struct iattr *ia)
+{
+ int err = 0;
+ struct dentry *hidden_dentry;
+ struct inode *inode = NULL;
+ struct inode *hidden_inode = NULL;
+ int bstart, bend, bindex;
+ int i;
+ int copyup = 0;
+
+ lock_dentry(dentry);
+ bstart = dbstart(dentry);
+ bend = dbend(dentry);
+ inode = dentry->d_inode;
+
+ for (bindex = bstart; (bindex <= bend) || (bindex == bstart); bindex++) {
+ hidden_dentry = dtohd_index(dentry, bindex);
+ if (!hidden_dentry)
+ continue;
+ BUG_ON(hidden_dentry->d_inode == NULL);
+
+ /* If the file is on a read only branch */
+ if (is_robranch_super(dentry->d_sb, bindex)
+ || IS_RDONLY(hidden_dentry->d_inode)) {
+ if (copyup || (bindex != bstart))
+ continue;
+ /* Only if it's the leftmost file, copyup the file */
+ for (i = bstart - 1; i >= 0; i--) {
+ size_t size = dentry->d_inode->i_size;
+ if (ia->ia_valid & ATTR_SIZE)
+ size = ia->ia_size;
+ err = copyup_dentry(dentry->d_parent->d_inode,
+ dentry, bstart, i, NULL,
+ size);
+
+ if (!err) {
+ copyup = 1;
+ hidden_dentry = dtohd(dentry);
+ break;
+ }
+ /* if error is in the leftmost f/s, pass it up */
+ if (i == 0)
+ goto out;
+ }
+
+ }
+ err = notify_change(hidden_dentry, ia);
+ if (err)
+ goto out;
+ break;
+ }
+
+ /* get the size from the first hidden inode */
+ hidden_inode = itohi(dentry->d_inode);
+ fist_copy_attr_all(inode, hidden_inode);
+
+out:
+ unlock_dentry(dentry);
+ return err;
+}
+
+struct inode_operations unionfs_symlink_iops = {
+ .readlink = unionfs_readlink,
+ .permission = unionfs_permission,
+ .follow_link = unionfs_follow_link,
+ .setattr = unionfs_setattr,
+ .put_link = unionfs_put_link,
+};
+
+struct inode_operations unionfs_dir_iops = {
+ .create = unionfs_create,
+ .lookup = unionfs_lookup,
+ .link = unionfs_link,
+ .unlink = unionfs_unlink,
+ .symlink = unionfs_symlink,
+ .mkdir = unionfs_mkdir,
+ .rmdir = unionfs_rmdir,
+ .mknod = unionfs_mknod,
+ .rename = unionfs_rename,
+ .permission = unionfs_permission,
+ .setattr = unionfs_setattr,
+};
+
+struct inode_operations unionfs_main_iops = {
+ .permission = unionfs_permission,
+ .setattr = unionfs_setattr,
+};
+

2006-09-01 01:50:35

by Josef 'Jeff' Sipek

Subject: [PATCH 11/22][RFC] Unionfs: Lookup helper functions

From: Josef "Jeff" Sipek <[email protected]>

This patch provides helper functions for the lookup operations in Unionfs.

Signed-off-by: Josef "Jeff" Sipek <[email protected]>
Signed-off-by: David Quigley <[email protected]>
Signed-off-by: Erez Zadok <[email protected]>

---

fs/unionfs/lookup.c | 477 ++++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 477 insertions(+)
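
As a reading aid for the whiteout handling in this file, here is a minimal
userspace sketch of how a whiteout name is formed and how reserved names are
rejected. It is not the kernel code: the sketch_* names are hypothetical, and
the prefix ".wh." and marker "__dir_opaque" are assumptions inferred from the
comments in the patch (the real definitions are in union.h, which is not part
of this message).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Assumed values; the real macros are defined in union.h (not shown here). */
#define SKETCH_WHPFX ".wh."
#define SKETCH_WHLEN (sizeof(SKETCH_WHPFX) - 1)
#define SKETCH_DIR_OPAQUE_NAME "__dir_opaque"

/*
 * Build ".wh.<name>" in a freshly allocated buffer.
 * The caller frees the result; NULL is returned if allocation fails.
 */
char *sketch_alloc_whname(const char *name, size_t len)
{
	char *buf;

	assert(name != NULL);
	buf = malloc(len + SKETCH_WHLEN + 1);
	if (!buf)
		return NULL;
	memcpy(buf, SKETCH_WHPFX, SKETCH_WHLEN);
	memcpy(buf + SKETCH_WHLEN, name, len);
	buf[SKETCH_WHLEN + len] = '\0';
	return buf;
}

/*
 * Same shape as is_validname() in the patch: a name is rejected (0) if it
 * starts with the whiteout prefix or with the opaque-directory marker.
 */
int sketch_is_validname(const char *name)
{
	if (!strncmp(name, SKETCH_WHPFX, SKETCH_WHLEN))
		return 0;
	if (!strncmp(name, SKETCH_DIR_OPAQUE_NAME,
		     sizeof(SKETCH_DIR_OPAQUE_NAME) - 1))
		return 0;
	return 1;
}
```

The lookup loop builds the whiteout name once and reuses it for every branch,
since the name does not depend on the branch index; only the parent directory
passed to lookup_one_len() changes.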

diff -Nur -x linux-2.6-git/Documentation/dontdiff linux-2.6-git/fs/unionfs/lookup.c linux-2.6-git-unionfs/fs/unionfs/lookup.c
--- linux-2.6-git/fs/unionfs/lookup.c 1969-12-31 19:00:00.000000000 -0500
+++ linux-2.6-git-unionfs/fs/unionfs/lookup.c 2006-08-31 19:04:00.000000000 -0400
@@ -0,0 +1,477 @@
+/*
+ * Copyright (c) 2003-2006 Erez Zadok
+ * Copyright (c) 2003-2006 Charles P. Wright
+ * Copyright (c) 2005-2006 Josef 'Jeff' Sipek
+ * Copyright (c) 2005-2006 Junjiro Okajima
+ * Copyright (c) 2005 Arun M. Krishnakumar
+ * Copyright (c) 2004-2006 David P. Quigley
+ * Copyright (c) 2003-2004 Mohammad Nayyer Zubair
+ * Copyright (c) 2003 Puja Gupta
+ * Copyright (c) 2003 Harikesavan Krishnan
+ * Copyright (c) 2003-2006 Stony Brook University
+ * Copyright (c) 2003-2006 The Research Foundation of State University of New York
+ *
+ * For specific licensing information, see the COPYING file distributed with
+ * this package.
+ *
+ * This Copyright notice must be kept intact and distributed with all sources.
+ */
+
+#include "union.h"
+
+static int is_opaque_dir(struct dentry *dentry, int bindex);
+static int is_validname(const char *name);
+
+struct dentry *unionfs_lookup_backend(struct dentry *dentry, int lookupmode)
+{
+ int err = 0;
+ struct dentry *hidden_dentry = NULL;
+ struct dentry *wh_hidden_dentry = NULL;
+ struct dentry *hidden_dir_dentry = NULL;
+ struct dentry *parent_dentry = NULL;
+ int bindex, bstart, bend, bopaque;
+ int dentry_count = 0; /* Number of positive dentries. */
+ int first_dentry_offset = -1;
+ struct dentry *first_hidden_dentry = NULL;
+ int locked_parent = 0;
+ int locked_child = 0;
+
+ int opaque;
+ char *whname = NULL;
+ const char *name;
+ int namelen;
+
+ /* We should already have a lock on this dentry in the case of a
+ * partial lookup, or a revalidation. Otherwise it is returned from
+ * new_dentry_private_data already locked. */
+ if (lookupmode == INTERPOSE_PARTIAL || lookupmode == INTERPOSE_REVAL
+ || lookupmode == INTERPOSE_REVAL_NEG) {
+ verify_locked(dentry);
+ } else {
+ BUG_ON(dtopd_nocheck(dentry) != NULL);
+ locked_child = 1;
+ }
+ if (lookupmode != INTERPOSE_PARTIAL)
+ if ((err = new_dentry_private_data(dentry)))
+ goto out;
+ /* must initialize dentry operations */
+ dentry->d_op = &unionfs_dops;
+
+ parent_dentry = dget_parent(dentry);
+ /* We never partial lookup the root directory. */
+ if (parent_dentry != dentry) {
+ lock_dentry(parent_dentry);
+ locked_parent = 1;
+ } else {
+ dput(parent_dentry);
+ parent_dentry = NULL;
+ goto out;
+ }
+
+ name = dentry->d_name.name;
+ namelen = dentry->d_name.len;
+
+ /* No dentries should get created for possible whiteout names. */
+ if (!is_validname(name)) {
+ err = -EPERM;
+ goto out_free;
+ }
+
+ /* Now start the actual lookup procedure. */
+ bstart = dbstart(parent_dentry);
+ bend = dbend(parent_dentry);
+ bopaque = dbopaque(parent_dentry);
+ BUG_ON(bstart < 0);
+
+ /* It would be ideal if we could convert partial lookups to only have
+ * to do this work when they really need to. It could probably improve
+ * performance quite a bit, and maybe simplify the rest of the code. */
+ if (lookupmode == INTERPOSE_PARTIAL) {
+ bstart++;
+ if ((bopaque != -1) && (bopaque < bend))
+ bend = bopaque;
+ }
+
+ for (bindex = bstart; bindex <= bend; bindex++) {
+ hidden_dentry = dtohd_index(dentry, bindex);
+ if (lookupmode == INTERPOSE_PARTIAL && hidden_dentry)
+ continue;
+ BUG_ON(hidden_dentry != NULL);
+
+ hidden_dir_dentry = dtohd_index(parent_dentry, bindex);
+
+ /* if the parent hidden dentry does not exist skip this */
+ if (!(hidden_dir_dentry && hidden_dir_dentry->d_inode))
+ continue;
+
+ /* also skip it if the parent isn't a directory. */
+ if (!S_ISDIR(hidden_dir_dentry->d_inode->i_mode))
+ continue;
+
+ /* Reuse the whiteout name because its value doesn't change. */
+ if (!whname) {
+ whname = alloc_whname(name, namelen);
+ if (IS_ERR(whname)) {
+ err = PTR_ERR(whname);
+ goto out_free;
+ }
+ }
+
+ /* check if whiteout exists in this branch: lookup .wh.foo */
+ wh_hidden_dentry = lookup_one_len(whname, hidden_dir_dentry,
+ namelen + UNIONFS_WHLEN);
+ if (IS_ERR(wh_hidden_dentry)) {
+ dput(first_hidden_dentry);
+ err = PTR_ERR(wh_hidden_dentry);
+ goto out_free;
+ }
+
+ if (wh_hidden_dentry->d_inode) {
+ /* We found a whiteout so let's give up. */
+ if (S_ISREG(wh_hidden_dentry->d_inode->i_mode)) {
+ set_dbend(dentry, bindex);
+ set_dbopaque(dentry, bindex);
+ dput(wh_hidden_dentry);
+ break;
+ }
+ err = -EIO;
+ printk(KERN_NOTICE "EIO: Invalid whiteout entry type"
+ " %d.\n", wh_hidden_dentry->d_inode->i_mode);
+ dput(wh_hidden_dentry);
+ dput(first_hidden_dentry);
+ goto out_free;
+ }
+
+ dput(wh_hidden_dentry);
+ wh_hidden_dentry = NULL;
+
+ /* Now do regular lookup; lookup foo */
+ hidden_dentry = lookup_one_len(name, hidden_dir_dentry,
+ namelen);
+ if (IS_ERR(hidden_dentry)) {
+ dput(first_hidden_dentry);
+ err = PTR_ERR(hidden_dentry);
+ goto out_free;
+ }
+
+ /* Store the first negative dentry specially, because if they
+ * are all negative we need this for future creates. */
+ if (!hidden_dentry->d_inode) {
+ if (!first_hidden_dentry && (dbstart(dentry) == -1)) {
+ first_hidden_dentry = hidden_dentry;
+ first_dentry_offset = bindex;
+ } else {
+ dput(hidden_dentry);
+ }
+ continue;
+ }
+
+ /* number of positive dentries */
+ dentry_count++;
+
+ /* store underlying dentry */
+ if (dbstart(dentry) == -1)
+ set_dbstart(dentry, bindex);
+ set_dtohd_index(dentry, bindex, hidden_dentry);
+ set_dbend(dentry, bindex);
+
+ /* update parent directory's atime with the bindex */
+ fist_copy_attr_atime(parent_dentry->d_inode,
+ hidden_dir_dentry->d_inode);
+
+ /* We terminate file lookups here. */
+ if (!S_ISDIR(hidden_dentry->d_inode->i_mode)) {
+ if (lookupmode == INTERPOSE_PARTIAL)
+ continue;
+ if (dentry_count == 1)
+ goto out_positive;
+ /* This can only happen with mixed D-*-F-* */
+ BUG_ON(!S_ISDIR(dtohd(dentry)->d_inode->i_mode));
+ continue;
+ }
+
+ opaque = is_opaque_dir(dentry, bindex);
+ if (opaque < 0) {
+ dput(first_hidden_dentry);
+ err = opaque;
+ goto out_free;
+ }
+ if (opaque) {
+ set_dbend(dentry, bindex);
+ set_dbopaque(dentry, bindex);
+ break;
+ }
+ }
+
+ if (dentry_count)
+ goto out_positive;
+ else
+ goto out_negative;
+
+out_negative:
+ if (lookupmode == INTERPOSE_PARTIAL)
+ goto out;
+
+ /* If we've only got negative dentries, then use the leftmost one. */
+ if (lookupmode == INTERPOSE_REVAL) {
+ if (dentry->d_inode) {
+ itopd(dentry->d_inode)->uii_stale = 1;
+ }
+ goto out;
+ }
+ /* This should only happen if we found a whiteout. */
+ if (first_dentry_offset == -1) {
+ first_hidden_dentry = lookup_one_len(name, hidden_dir_dentry,
+ namelen);
+ first_dentry_offset = bindex;
+ if (IS_ERR(first_hidden_dentry)) {
+ err = PTR_ERR(first_hidden_dentry);
+ goto out;
+ }
+ }
+ set_dtohd_index(dentry, first_dentry_offset, first_hidden_dentry);
+ set_dbstart(dentry, first_dentry_offset);
+ set_dbend(dentry, first_dentry_offset);
+
+ if (lookupmode == INTERPOSE_REVAL_NEG)
+ BUG_ON(dentry->d_inode != NULL);
+ else
+ d_add(dentry, NULL);
+ goto out;
+
+/* This part of the code is for positive dentries. */
+out_positive:
+ BUG_ON(dentry_count <= 0);
+
+ /* If we're holding onto the first negative dentry throw it out. */
+ dput(first_hidden_dentry);
+
+ /* Partial lookups need to reinterpose, or throw away older negs. */
+ if (lookupmode == INTERPOSE_PARTIAL) {
+ if (dentry->d_inode) {
+ unionfs_reinterpose(dentry);
+ goto out;
+ }
+
+ /* This somehow turned positive, so it is as if we had a
+ * negative revalidation. */
+ lookupmode = INTERPOSE_REVAL_NEG;
+
+ update_bstart(dentry);
+ bstart = dbstart(dentry);
+ bend = dbend(dentry);
+ }
+
+ err = unionfs_interpose(dentry, dentry->d_sb, lookupmode);
+ if (err)
+ goto out_drop;
+
+ goto out;
+
+out_drop:
+ d_drop(dentry);
+
+out_free:
+ /* should dput all the underlying dentries on error condition */
+ bstart = dbstart(dentry);
+ if (bstart >= 0) {
+ bend = dbend(dentry);
+ for (bindex = bstart; bindex <= bend; bindex++)
+ dput(dtohd_index(dentry, bindex));
+ }
+ kfree(dtohd_ptr(dentry));
+ dtohd_ptr(dentry) = NULL;
+ set_dbstart(dentry, -1);
+ set_dbend(dentry, -1);
+
+out:
+ if (!err && dtopd(dentry)) {
+ BUG_ON(dbend(dentry) > dtopd(dentry)->udi_bcount);
+ BUG_ON(dbend(dentry) > sbmax(dentry->d_sb));
+ BUG_ON(dbstart(dentry) < 0);
+ }
+ kfree(whname);
+ if (locked_parent)
+ unlock_dentry(parent_dentry);
+ dput(parent_dentry);
+ if (locked_child)
+ unlock_dentry(dentry);
+ return ERR_PTR(err);
+}
+
+/* This is a utility function that fills in a unionfs dentry. */
+int unionfs_partial_lookup(struct dentry *dentry)
+{
+ struct dentry *tmp;
+
+ tmp = unionfs_lookup_backend(dentry, INTERPOSE_PARTIAL);
+ if (!tmp)
+ return 0;
+ if (IS_ERR(tmp))
+ return PTR_ERR(tmp);
+ /* need to change the interface */
+ BUG_ON(tmp != dentry);
+ return -ENOSYS;
+}
+
+/* The rest of these are utility functions for lookup. */
+static int is_opaque_dir(struct dentry *dentry, int bindex)
+{
+ int err = 0;
+ struct dentry *hidden_dentry;
+ struct dentry *wh_hidden_dentry;
+ struct inode *hidden_inode;
+ struct sioq_args args;
+
+ hidden_dentry = dtohd_index(dentry, bindex);
+ hidden_inode = hidden_dentry->d_inode;
+
+ BUG_ON(!S_ISDIR(hidden_inode->i_mode));
+
+ mutex_lock(&hidden_inode->i_mutex);
+
+ if (!permission(hidden_inode, MAY_EXEC, NULL))
+ wh_hidden_dentry = lookup_one_len(UNIONFS_DIR_OPAQUE, hidden_dentry,
+ sizeof(UNIONFS_DIR_OPAQUE) - 1);
+ else {
+ args.u.isopaque.dentry = hidden_dentry;
+ run_sioq(__is_opaque_dir, &args);
+ wh_hidden_dentry = args.ret;
+ }
+
+ mutex_unlock(&hidden_inode->i_mutex);
+
+ if (IS_ERR(wh_hidden_dentry)) {
+ err = PTR_ERR(wh_hidden_dentry);
+ goto out;
+ }
+
+ /* This is an opaque dir iff wh_hidden_dentry is positive */
+ err = !!wh_hidden_dentry->d_inode;
+
+ dput(wh_hidden_dentry);
+out:
+ return err;
+}
+
+static int is_validname(const char *name)
+{
+ if (!strncmp(name, UNIONFS_WHPFX, UNIONFS_WHLEN))
+ return 0;
+ if (!strncmp(name, UNIONFS_DIR_OPAQUE_NAME,
+ sizeof(UNIONFS_DIR_OPAQUE_NAME) - 1))
+ return 0;
+ return 1;
+}
+
+/* The dentry cache is just so we have properly sized dentries. */
+static kmem_cache_t *unionfs_dentry_cachep;
+int init_dentry_cache(void)
+{
+ unionfs_dentry_cachep =
+ kmem_cache_create("unionfs_dentry",
+ sizeof(struct unionfs_dentry_info), 0,
+ SLAB_RECLAIM_ACCOUNT, NULL, NULL);
+
+ if (!unionfs_dentry_cachep)
+ return -ENOMEM;
+ return 0;
+}
+
+void destroy_dentry_cache(void)
+{
+ if (!unionfs_dentry_cachep)
+ return;
+ if (kmem_cache_destroy(unionfs_dentry_cachep))
+ printk(KERN_ERR
+ "unionfs_dentry_cache: not all structures were freed\n");
+ return;
+}
+
+void free_dentry_private_data(struct unionfs_dentry_info *udi)
+{
+ if (!udi)
+ return;
+ kmem_cache_free(unionfs_dentry_cachep, udi);
+}
+
+int new_dentry_private_data(struct dentry *dentry)
+{
+ int newsize;
+ int oldsize = 0;
+
+ spin_lock(&dentry->d_lock);
+ if (!dtopd_nocheck(dentry)) {
+ dtopd_lhs(dentry) = (struct unionfs_dentry_info *)
+ kmem_cache_alloc(unionfs_dentry_cachep, SLAB_ATOMIC);
+ if (!dtopd_nocheck(dentry))
+ goto out;
+ init_MUTEX_LOCKED(&dtopd_nocheck(dentry)->udi_sem);
+
+ dtohd_ptr(dentry) = NULL;
+ } else {
+ oldsize = sizeof(struct dentry *) * dtopd(dentry)->udi_bcount;
+ }
+
+ dtopd_nocheck(dentry)->udi_bstart = -1;
+ dtopd_nocheck(dentry)->udi_bend = -1;
+ dtopd_nocheck(dentry)->udi_bopaque = -1;
+ dtopd_nocheck(dentry)->udi_bcount = sbmax(dentry->d_sb);
+ atomic_set(&dtopd_nocheck(dentry)->udi_generation,
+ atomic_read(&stopd(dentry->d_sb)->usi_generation));
+ newsize = sizeof(struct dentry *) * sbmax(dentry->d_sb);
+
+ /* Don't reallocate when we already have enough space. */
+ /* It would be ideal if we could actually use the slab macros to
+ * determine what our object size is, but those are not exported.
+ */
+ if (oldsize) {
+ int minsize = malloc_sizes[0].cs_size;
+
+ if (!newsize || ((oldsize < newsize) && (newsize > minsize))) {
+ kfree(dtohd_ptr(dentry));
+ dtohd_ptr(dentry) = NULL;
+ }
+ }
+
+ if (!dtohd_ptr(dentry) && newsize) {
+ dtohd_ptr(dentry) = kmalloc(newsize, GFP_ATOMIC);
+ if (!dtohd_ptr(dentry))
+ goto out;
+ }
+
+ if (oldsize > newsize)
+ memset(dtohd_ptr(dentry), 0, oldsize);
+ else
+ memset(dtohd_ptr(dentry), 0, newsize);
+
+ spin_unlock(&dentry->d_lock);
+ return 0;
+
+out:
+ free_dentry_private_data(dtopd_nocheck(dentry));
+ dtopd_lhs(dentry) = NULL;
+ spin_unlock(&dentry->d_lock);
+ return -ENOMEM;
+}
+
+void update_bstart(struct dentry *dentry)
+{
+ int bindex;
+ int bstart = dbstart(dentry);
+ int bend = dbend(dentry);
+ struct dentry *hidden_dentry;
+
+ for (bindex = bstart; bindex <= bend; bindex++) {
+ hidden_dentry = dtohd_index(dentry, bindex);
+ if (!hidden_dentry)
+ continue;
+ if (hidden_dentry->d_inode) {
+ set_dbstart(dentry, bindex);
+ break;
+ }
+ dput(hidden_dentry);
+ set_dtohd_index(dentry, bindex, NULL);
+ }
+}
+

2006-09-01 01:51:55

by Josef 'Jeff' Sipek

Subject: [PATCH 12/22][RFC] Unionfs: Main module functions

From: Josef "Jeff" Sipek <[email protected]>

Module init & cleanup code, as well as interposition functions.

Signed-off-by: Josef "Jeff" Sipek <[email protected]>
Signed-off-by: David Quigley <[email protected]>
Signed-off-by: Erez Zadok <[email protected]>

---

fs/unionfs/main.c | 685 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 685 insertions(+)
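
To illustrate the branch syntax handled by parse_dirs_option() below
("b1:b2=rw:b3=ro:b4"), here is a small userspace sketch of the per-branch
suffix parsing. It is an illustration under assumptions, not the kernel code:
the sketch_* names are hypothetical and the SKETCH_MAY_* bit values are
placeholders, since the real MAY_NFSRO definition is not part of this message.
The sketch also checks the name length before comparing a suffix.

```c
#include <assert.h>
#include <string.h>

/* Placeholder permission bits for the sketch only. */
#define SKETCH_MAY_WRITE 2
#define SKETCH_MAY_READ  4
#define SKETCH_MAY_NFSRO 16

/* If name ends with sfx, cut the suffix off in place and return 1. */
static int sketch_strip_suffix(char *name, size_t len, const char *sfx)
{
	size_t n = strlen(sfx);

	if (len < n || strcmp(name + len - n, sfx) != 0)
		return 0;
	name[len - n] = '\0';
	return 1;
}

/*
 * Return the permission bits for one branch specification and strip a
 * trailing "=ro", "=nfsro" or "=rw". A branch with no suffix is read-write.
 */
int sketch_parse_branch_mode(char *name)
{
	size_t len;

	assert(name != NULL);
	len = strlen(name);

	if (sketch_strip_suffix(name, len, "=ro"))
		return SKETCH_MAY_READ;
	if (sketch_strip_suffix(name, len, "=nfsro"))
		return SKETCH_MAY_READ | SKETCH_MAY_NFSRO;
	if (sketch_strip_suffix(name, len, "=rw"))
		return SKETCH_MAY_READ | SKETCH_MAY_WRITE;
	return SKETCH_MAY_READ | SKETCH_MAY_WRITE;
}
```

In the patch, the result for the first (leftmost) branch must include write
permission, otherwise the mount is rejected with -EINVAL.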

diff -Nur -x linux-2.6-git/Documentation/dontdiff linux-2.6-git/fs/unionfs/main.c linux-2.6-git-unionfs/fs/unionfs/main.c
--- linux-2.6-git/fs/unionfs/main.c 1969-12-31 19:00:00.000000000 -0500
+++ linux-2.6-git-unionfs/fs/unionfs/main.c 2006-08-31 19:04:00.000000000 -0400
@@ -0,0 +1,685 @@
+/*
+ * Copyright (c) 2003-2006 Erez Zadok
+ * Copyright (c) 2003-2006 Charles P. Wright
+ * Copyright (c) 2005-2006 Josef 'Jeff' Sipek
+ * Copyright (c) 2005-2006 Junjiro Okajima
+ * Copyright (c) 2005 Arun M. Krishnakumar
+ * Copyright (c) 2004-2006 David P. Quigley
+ * Copyright (c) 2003-2004 Mohammad Nayyer Zubair
+ * Copyright (c) 2003 Puja Gupta
+ * Copyright (c) 2003 Harikesavan Krishnan
+ * Copyright (c) 2003-2006 Stony Brook University
+ * Copyright (c) 2003-2006 The Research Foundation of State University of New York
+ *
+ * For specific licensing information, see the COPYING file distributed with
+ * this package.
+ *
+ * This Copyright notice must be kept intact and distributed with all sources.
+ */
+
+#include "union.h"
+#include <linux/module.h>
+#include <linux/moduleparam.h>
+
+/* declarations added for "sparse" */
+extern void unionfs_kill_block_super(struct super_block *sb);
+
+/* declarations added for malloc_debugging */
+
+/* sb we pass is unionfs's super_block */
+int unionfs_interpose(struct dentry *dentry, struct super_block *sb, int flag)
+{
+ struct inode *hidden_inode;
+ struct dentry *hidden_dentry;
+ int err = 0;
+ struct inode *inode;
+ int is_negative_dentry = 1;
+ int bindex, bstart, bend;
+
+ verify_locked(dentry);
+
+ bstart = dbstart(dentry);
+ bend = dbend(dentry);
+
+ /* Make sure that we didn't get a negative dentry. */
+ for (bindex = bstart; bindex <= bend; bindex++) {
+ if (dtohd_index(dentry, bindex) &&
+ dtohd_index(dentry, bindex)->d_inode) {
+ is_negative_dentry = 0;
+ break;
+ }
+ }
+ BUG_ON(is_negative_dentry);
+
+ /* We allocate our new inode below, by calling iget.
+ * iget will call our read_inode which will initialize some
+ * of the new inode's fields
+ */
+
+ /* On revalidate we've already got our own inode and just need
+ * to fix it up. */
+ if (flag == INTERPOSE_REVAL) {
+ inode = dentry->d_inode;
+ itopd(inode)->b_start = -1;
+ itopd(inode)->b_end = -1;
+ atomic_set(&itopd(inode)->uii_generation,
+ atomic_read(&stopd(sb)->usi_generation));
+
+ itohi_ptr(inode) =
+ kzalloc(sbmax(sb) * sizeof(struct inode *), GFP_KERNEL);
+ if (!itohi_ptr(inode)) {
+ err = -ENOMEM;
+ goto out;
+ }
+ mutex_lock(&inode->i_mutex);
+ } else {
+ ino_t ino;
+ /* get unique inode number for unionfs */
+ ino = iunique(sb, UNIONFS_ROOT_INO);
+
+ inode = iget(sb, ino);
+ if (!inode) {
+ err = -EACCES; /* should be impossible??? */
+ goto out;
+ }
+
+ mutex_lock(&inode->i_mutex);
+ if (atomic_read(&inode->i_count) > 1)
+ goto skip;
+ }
+
+ for (bindex = bstart; bindex <= bend; bindex++) {
+ hidden_dentry = dtohd_index(dentry, bindex);
+ if (!hidden_dentry) {
+ set_itohi_index(inode, bindex, NULL);
+ continue;
+ }
+
+ /* Initialize the hidden inode to the new hidden inode. */
+ if (!hidden_dentry->d_inode)
+ continue;
+
+ set_itohi_index(inode, bindex, igrab(hidden_dentry->d_inode));
+ }
+
+ ibstart(inode) = dbstart(dentry);
+ ibend(inode) = dbend(dentry);
+
+ /* Use attributes from the first branch. */
+ hidden_inode = itohi(inode);
+
+ /* Use different set of inode ops for symlinks & directories */
+ if (S_ISLNK(hidden_inode->i_mode))
+ inode->i_op = &unionfs_symlink_iops;
+ else if (S_ISDIR(hidden_inode->i_mode))
+ inode->i_op = &unionfs_dir_iops;
+
+ /* Use different set of file ops for directories */
+ if (S_ISDIR(hidden_inode->i_mode))
+ inode->i_fop = &unionfs_dir_fops;
+
+ /* properly initialize special inodes */
+ if (S_ISBLK(hidden_inode->i_mode) || S_ISCHR(hidden_inode->i_mode) ||
+ S_ISFIFO(hidden_inode->i_mode) || S_ISSOCK(hidden_inode->i_mode))
+ init_special_inode(inode, hidden_inode->i_mode,
+ hidden_inode->i_rdev);
+ /* Fix our inode's address operations to that of the lower inode (Unionfs is FiST-Lite) */
+ if (inode->i_mapping->a_ops != hidden_inode->i_mapping->a_ops)
+ inode->i_mapping->a_ops = hidden_inode->i_mapping->a_ops;
+
+ /* all well, copy inode attributes */
+ fist_copy_attr_all(inode, hidden_inode);
+
+skip:
+ /* only (our) lookup wants to do a d_add */
+ switch (flag) {
+ case INTERPOSE_DEFAULT:
+ case INTERPOSE_REVAL_NEG:
+ d_instantiate(dentry, inode);
+ break;
+ case INTERPOSE_LOOKUP:
+ err = PTR_ERR(d_splice_alias(inode, dentry));
+ break;
+ case INTERPOSE_REVAL:
+ /* Do nothing. */
+ break;
+ default:
+ printk(KERN_ERR "Invalid interpose flag passed!\n");
+ BUG();
+ }
+
+ mutex_unlock(&inode->i_mutex);
+
+out:
+ return err;
+}
+
+void unionfs_reinterpose(struct dentry *dentry)
+{
+ struct dentry *hidden_dentry;
+ struct inode *inode;
+ int bindex, bstart, bend;
+
+ verify_locked(dentry);
+
+ /* This is pre-allocated inode */
+ inode = dentry->d_inode;
+
+ bstart = dbstart(dentry);
+ bend = dbend(dentry);
+ for (bindex = bstart; bindex <= bend; bindex++) {
+ hidden_dentry = dtohd_index(dentry, bindex);
+ if (!hidden_dentry)
+ continue;
+
+ if (!hidden_dentry->d_inode)
+ continue;
+ if (itohi_index(inode, bindex))
+ continue;
+ set_itohi_index(inode, bindex, igrab(hidden_dentry->d_inode));
+ }
+ ibstart(inode) = dbstart(dentry);
+ ibend(inode) = dbend(dentry);
+}
+
+int check_branch(struct nameidata *nd)
+{
+ if (!strcmp(nd->dentry->d_sb->s_type->name, "unionfs"))
+ return -EINVAL;
+ if (!nd->dentry->d_inode)
+ return -ENOENT;
+ if (!S_ISDIR(nd->dentry->d_inode->i_mode))
+ return -ENOTDIR;
+ return 0;
+}
+
+/* checks if two hidden_dentries have overlapping branches */
+int is_branch_overlap(struct dentry *dent1, struct dentry *dent2)
+{
+ struct dentry *dent = NULL;
+
+ dent = dent1;
+ while ((dent != dent2) && (dent->d_parent != dent)) {
+ dent = dent->d_parent;
+ }
+ if (dent == dent2) {
+ return 1;
+ }
+
+ dent = dent2;
+ while ((dent != dent1) && (dent->d_parent != dent)) {
+ dent = dent->d_parent;
+ }
+ if (dent == dent1) {
+ return 1;
+ }
+
+ return 0;
+}
+static int parse_branch_mode(char *name)
+{
+ int perms;
+ int l = strlen(name);
+ if (l >= 3 && !strcmp(name + l - 3, "=ro")) {
+ perms = MAY_READ;
+ name[l - 3] = '\0';
+ } else if (l >= 6 && !strcmp(name + l - 6, "=nfsro")) {
+ perms = MAY_READ | MAY_NFSRO;
+ name[l - 6] = '\0';
+ } else if (l >= 3 && !strcmp(name + l - 3, "=rw")) {
+ perms = MAY_READ | MAY_WRITE;
+ name[l - 3] = '\0';
+ } else {
+ perms = MAY_READ | MAY_WRITE;
+ }
+
+ return perms;
+}
+
+static int parse_dirs_option(struct super_block *sb, struct unionfs_dentry_info
+ *hidden_root_info, char *options)
+{
+ struct nameidata nd;
+ char *name;
+ int err = 0;
+ int branches = 1;
+ int bindex = 0;
+ int i = 0;
+ int j = 0;
+
+ struct dentry *dent1 = NULL;
+ struct dentry *dent2 = NULL;
+
+ if (options[0] == '\0') {
+ printk(KERN_WARNING "unionfs: no branches specified\n");
+ err = -EINVAL;
+ goto out;
+ }
+
+ /* Each colon means we have a separator, this is really just a rough
+ * guess, since strsep will handle empty fields for us. */
+ for (i = 0; options[i]; i++) {
+ if (options[i] == ':')
+ branches++;
+ }
+
+ /* allocate space for underlying pointers to hidden dentry */
+ if (!(stopd(sb)->usi_data = alloc_new_data(branches))) {
+ err = -ENOMEM;
+ goto out;
+ }
+
+ if (!(hidden_root_info->udi_dentry = alloc_new_dentries(branches))) {
+ err = -ENOMEM;
+ goto out;
+ }
+
+ /* now parsing the string b1:b2=rw:b3=ro:b4 */
+ branches = 0;
+ while ((name = strsep(&options, ":")) != NULL) {
+ int perms;
+
+ if (!*name)
+ continue;
+
+ branches++;
+
+ /* strip off =rw or =ro if it is specified. */
+ perms = parse_branch_mode(name);
+ if (!bindex && !(perms & MAY_WRITE)) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ err = path_lookup(name, LOOKUP_FOLLOW, &nd);
+ if (err) {
+ printk(KERN_WARNING "unionfs: error accessing "
+ "hidden directory '%s' (error %d)\n", name, err);
+ goto out;
+ }
+
+ if ((err = check_branch(&nd))) {
+ printk(KERN_WARNING "unionfs: hidden directory "
+ "'%s' is not a valid branch\n", name);
+ path_release(&nd);
+ goto out;
+ }
+
+ hidden_root_info->udi_dentry[bindex] = nd.dentry;
+
+ set_stohiddenmnt_index(sb, bindex, nd.mnt);
+ set_branchperms(sb, bindex, perms);
+ set_branch_count(sb, bindex, 0);
+
+ if (hidden_root_info->udi_bstart < 0)
+ hidden_root_info->udi_bstart = bindex;
+ hidden_root_info->udi_bend = bindex;
+ bindex++;
+ }
+
+ if (branches == 0) {
+ printk(KERN_WARNING "unionfs: no branches specified\n");
+ err = -EINVAL;
+ goto out;
+ }
+
+ BUG_ON(branches != (hidden_root_info->udi_bend + 1));
+
+ /* ensure that no overlaps exist in the branches */
+ for (i = 0; i < branches; i++) {
+ for (j = i + 1; j < branches; j++) {
+ dent1 = hidden_root_info->udi_dentry[i];
+ dent2 = hidden_root_info->udi_dentry[j];
+
+ if (is_branch_overlap(dent1, dent2)) {
+ printk(KERN_WARNING "unionfs: branches %d and %d overlap\n", i, j);
+ err = -EINVAL;
+ goto out;
+ }
+ }
+ }
+
+out:
+ if (err) {
+ for (i = 0; i < branches; i++) {
+ if (hidden_root_info->udi_dentry[i])
+ dput(hidden_root_info->udi_dentry[i]);
+ }
+
+ kfree(hidden_root_info->udi_dentry);
+ kfree(stopd(sb)->usi_data);
+
+ /* MUST clear the pointers to prevent potential double free if
+ * the caller dies later on
+ */
+ hidden_root_info->udi_dentry = NULL;
+ stopd(sb)->usi_data = NULL;
+ }
+ return err;
+}
+
+/*
+ * Parse mount options. See the manual page for usage instructions.
+ *
+ * Returns the dentry object of the lower-level (hidden) directory;
+ * We want to mount our stackable file system on top of that hidden directory.
+ *
+ * Sets default debugging level to N, if any.
+ */
+static struct unionfs_dentry_info *unionfs_parse_options(struct super_block *sb,
+ char *options)
+{
+ struct unionfs_dentry_info *hidden_root_info;
+ char *optname;
+ int err = 0;
+ int bindex;
+ int dirsfound = 0;
+
+ /* allocate private data area */
+ err = -ENOMEM;
+ hidden_root_info =
+ kzalloc(sizeof(struct unionfs_dentry_info), GFP_KERNEL);
+ if (!hidden_root_info)
+ return ERR_PTR(err);	/* out_error would dereference the NULL pointer */
+ hidden_root_info->udi_bstart = -1;
+ hidden_root_info->udi_bend = -1;
+ hidden_root_info->udi_bopaque = -1;
+
+ while ((optname = strsep(&options, ",")) != NULL) {
+ char *optarg;
+ char *endptr;
+ int intval;
+
+ if (!*optname) {
+ continue;
+ }
+
+ optarg = strchr(optname, '=');
+ if (optarg) {
+ *optarg++ = '\0';
+ }
+
+ /* All of our options take an argument now. Insert ones that
+ * don't, above this check. */
+ if (!optarg) {
+ printk(KERN_WARNING
+ "unionfs: %s requires an argument.\n", optname);
+ err = -EINVAL;
+ goto out_error;
+ }
+
+ if (!strcmp("dirs", optname)) {
+ if (++dirsfound > 1) {
+ printk(KERN_WARNING
+ "unionfs: multiple dirs specified\n");
+ err = -EINVAL;
+ goto out_error;
+ }
+ err = parse_dirs_option(sb, hidden_root_info, optarg);
+ if (err)
+ goto out_error;
+ continue;
+ }
+
+ /* All of these options require an integer argument. */
+ intval = simple_strtoul(optarg, &endptr, 0);
+ if (*endptr) {
+ printk(KERN_WARNING
+ "unionfs: invalid %s option '%s'\n",
+ optname, optarg);
+ err = -EINVAL;
+ goto out_error;
+ }
+
+ err = -EINVAL;
+ printk(KERN_WARNING
+ "unionfs: unrecognized option '%s'\n", optname);
+ goto out_error;
+ }
+ if (dirsfound != 1) {
+ printk(KERN_WARNING "unionfs: dirs option required\n");
+ err = -EINVAL;
+ goto out_error;
+ }
+ goto out;
+
+out_error:
+ if (hidden_root_info && hidden_root_info->udi_dentry) {
+ for (bindex = hidden_root_info->udi_bstart;
+ bindex >= 0 && bindex <= hidden_root_info->udi_bend;
+ bindex++) {
+ struct dentry *d;
+ d = hidden_root_info->udi_dentry[bindex];
+ dput(d);
+ if (stohiddenmnt_index(sb, bindex))
+ mntput(stohiddenmnt_index(sb, bindex));
+ }
+ }
+
+ kfree(hidden_root_info->udi_dentry);
+ kfree(hidden_root_info);
+
+ kfree(stopd(sb)->usi_data);
+ stopd(sb)->usi_data = NULL;
+
+ hidden_root_info = ERR_PTR(err);
+out:
+ return hidden_root_info;
+}
+
+static struct dentry *unionfs_d_alloc_root(struct super_block *sb)
+{
+ struct dentry *ret = NULL;
+
+ if (sb) {
+ static const struct qstr name = {.name = "/",.len = 1 };
+
+ ret = d_alloc(NULL, &name);
+ if (ret) {
+ ret->d_op = &unionfs_dops;
+ ret->d_sb = sb;
+ ret->d_parent = ret;
+ }
+ }
+ return ret;
+}
+
+static int unionfs_read_super(struct super_block *sb, void *raw_data,
+ int silent)
+{
+ int err = 0;
+
+ struct unionfs_dentry_info *hidden_root_info = NULL;
+ int bindex, bstart, bend;
+
+ if (!raw_data) {
+ printk(KERN_WARNING
+ "unionfs_read_super: missing data argument\n");
+ err = -EINVAL;
+ goto out;
+ }
+
+ /*
+ * Allocate superblock private data
+ */
+ stopd_lhs(sb) = kzalloc(sizeof(struct unionfs_sb_info), GFP_KERNEL);
+ if (!stopd(sb)) {
+ printk(KERN_WARNING "%s: out of memory\n", __FUNCTION__);
+ err = -ENOMEM;
+ goto out;
+ }
+ stopd(sb)->b_end = -1;
+ atomic_set(&stopd(sb)->usi_generation, 1);
+ init_rwsem(&stopd(sb)->usi_rwsem);
+
+ hidden_root_info = unionfs_parse_options(sb, raw_data);
+ if (IS_ERR(hidden_root_info)) {
+ printk(KERN_WARNING
+ "unionfs_read_super: error while parsing options (err = %ld)\n",
+ PTR_ERR(hidden_root_info));
+ err = PTR_ERR(hidden_root_info);
+ hidden_root_info = NULL;
+ goto out_free;
+ }
+ if (hidden_root_info->udi_bstart == -1) {
+ err = -ENOENT;
+ goto out_free;
+ }
+
+ /* set the hidden superblock field of upper superblock */
+ bstart = hidden_root_info->udi_bstart;
+ BUG_ON(bstart != 0);
+ sbend(sb) = bend = hidden_root_info->udi_bend;
+ for (bindex = bstart; bindex <= bend; bindex++) {
+ struct dentry *d;
+
+ d = hidden_root_info->udi_dentry[bindex];
+
+ set_stohs_index(sb, bindex, d->d_sb);
+ }
+
+ /* Unionfs: Max Bytes is the maximum bytes from highest priority branch */
+ sb->s_maxbytes = stohs_index(sb, 0)->s_maxbytes;
+
+ sb->s_op = &unionfs_sops;
+
+ /*
+ * we can't use d_alloc_root if we want to use
+ * our own interpose function unchanged,
+ * so we simply call our own "fake" d_alloc_root
+ */
+ sb->s_root = unionfs_d_alloc_root(sb);
+ if (!sb->s_root) {
+ err = -ENOMEM;
+ goto out_dput;
+ }
+
+ /* link the upper and lower dentries */
+ dtopd_lhs(sb->s_root) = NULL;
+ if ((err = new_dentry_private_data(sb->s_root)))
+ goto out_freedpd;
+
+ /* Set the hidden dentries for s_root */
+ for (bindex = bstart; bindex <= bend; bindex++) {
+ struct dentry *d;
+
+ d = hidden_root_info->udi_dentry[bindex];
+
+ set_dtohd_index(sb->s_root, bindex, d);
+ }
+ set_dbstart(sb->s_root, bstart);
+ set_dbend(sb->s_root, bend);
+
+ /* Set the generation number to one, since this is for the mount. */
+ atomic_set(&dtopd(sb->s_root)->udi_generation, 1);
+
+ /* call interpose to create the upper level inode */
+ if ((err = unionfs_interpose(sb->s_root, sb, 0)))
+ goto out_freedpd;
+ unlock_dentry(sb->s_root);
+ goto out;
+
+out_freedpd:
+ if (dtopd(sb->s_root)) {
+ kfree(dtohd_ptr(sb->s_root));
+ free_dentry_private_data(dtopd(sb->s_root));
+ }
+ dput(sb->s_root);
+
+out_dput:
+ if (hidden_root_info && !IS_ERR(hidden_root_info)) {
+ for (bindex = hidden_root_info->udi_bstart;
+ bindex <= hidden_root_info->udi_bend; bindex++) {
+ struct dentry *d;
+
+ d = hidden_root_info->udi_dentry[bindex];
+
+ if (d)
+ dput(d);
+
+ if (stopd(sb) && stohiddenmnt_index(sb, bindex))
+ mntput(stohiddenmnt_index(sb, bindex));
+ }
+ kfree(hidden_root_info->udi_dentry);
+ kfree(hidden_root_info);
+ hidden_root_info = NULL;
+ }
+
+out_free:
+ kfree(stopd(sb)->usi_data);
+ kfree(stopd(sb));
+ stopd_lhs(sb) = NULL;
+
+out:
+ if (hidden_root_info && !IS_ERR(hidden_root_info)) {
+ kfree(hidden_root_info->udi_dentry);
+ kfree(hidden_root_info);
+ }
+ return err;
+}
+
+static int unionfs_get_sb(struct file_system_type *fs_type,
+ int flags, const char *dev_name,
+ void *raw_data, struct vfsmount *mnt)
+{
+ return get_sb_nodev(fs_type, flags, raw_data, unionfs_read_super, mnt);
+}
+
+void unionfs_kill_block_super(struct super_block *sb)
+{
+ generic_shutdown_super(sb);
+}
+
+static struct file_system_type unionfs_fs_type = {
+ .owner = THIS_MODULE,
+ .name = "unionfs",
+ .get_sb = unionfs_get_sb,
+ .kill_sb = unionfs_kill_block_super,
+ .fs_flags = FS_REVAL_DOT,
+};
+
+static int init_debug = 0;
+module_param_named(debug, init_debug, int, S_IRUGO);
+MODULE_PARM_DESC(debug, "Initial Unionfs debug value.");
+
+static int __init init_unionfs_fs(void)
+{
+ int err;
+ printk(KERN_INFO "Registering unionfs " UNIONFS_VERSION "\n");
+
+ if ((err = init_filldir_cache()))
+ goto out;
+ if ((err = init_inode_cache()))
+ goto out;
+ if ((err = init_dentry_cache()))
+ goto out;
+ if ((err = init_sioq()))
+ goto out;
+ err = register_filesystem(&unionfs_fs_type);
+out:
+ if (err) {
+ fin_sioq();
+ destroy_filldir_cache();
+ destroy_inode_cache();
+ destroy_dentry_cache();
+ }
+ return err;
+}
+static void __exit exit_unionfs_fs(void)
+{
+ fin_sioq();
+ destroy_filldir_cache();
+ destroy_inode_cache();
+ destroy_dentry_cache();
+ unregister_filesystem(&unionfs_fs_type);
+ printk(KERN_INFO "Completed unionfs module unload.\n");
+}
+
+MODULE_AUTHOR("Filesystems and Storage Lab, Stony Brook University"
+ " (http://www.fsl.cs.sunysb.edu/)");
+MODULE_DESCRIPTION("Unionfs " UNIONFS_VERSION
+ " (http://unionfs.filesystems.org/)");
+MODULE_LICENSE("GPL");
+
+module_init(init_unionfs_fs);
+module_exit(exit_unionfs_fs);
+
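
For readers following the option handling in unionfs_parse_options()
above, here is a minimal userspace sketch of the same comma-separated
"name=value" walk. It is not the kernel code: parse_opts() and struct opts
are hypothetical names, and it uses strchr() where the patch uses the
kernel's strsep(), purely to stay within standard C.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* Hypothetical result type: only records what the sketch needs. */
struct opts {
	const char *dirs;	/* argument of the single dirs= option */
	int dirsfound;		/* how many times dirs= was seen */
};

/*
 * Walk a writable, comma-separated option string in place.
 * Returns 0 on success or -EINVAL, mirroring the checks in the patch:
 * empty tokens are skipped, every option needs "=arg", dirs= must
 * appear exactly once, and anything else is unrecognized.
 */
static int parse_opts(char *options, struct opts *out)
{
	char *optname = options;

	assert(options && out);
	out->dirs = NULL;
	out->dirsfound = 0;

	while (optname) {
		char *next = strchr(optname, ',');
		char *arg;

		if (next)
			*next++ = '\0';		/* terminate this token */

		if (*optname) {
			arg = strchr(optname, '=');
			if (!arg)
				return -EINVAL;	/* option without argument */
			*arg++ = '\0';

			if (strcmp(optname, "dirs") != 0)
				return -EINVAL;	/* unrecognized option */
			if (++out->dirsfound > 1)
				return -EINVAL;	/* dirs= given twice */
			out->dirs = arg;
		}
		optname = next;
	}
	return out->dirsfound == 1 ? 0 : -EINVAL;
}
```

The string is modified in place, as in the patch, so the caller has to
pass a writable copy of the mount data.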

2006-09-01 01:54:05

by Josef 'Jeff' Sipek

Subject: [PATCH 13/22][RFC] Unionfs: Readdir state

From: Josef "Jeff" Sipek <[email protected]>

This file contains the routines for maintaining readdir state.

Signed-off-by: Josef "Jeff" Sipek <[email protected]>
Signed-off-by: David Quigley <[email protected]>
Signed-off-by: Erez Zadok <[email protected]>

---

fs/unionfs/rdstate.c | 303 +++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 303 insertions(+)
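
As a reading aid for alloc_rdstate() in the diff below: the function
rounds the allocation size up to a power of two, caps it at one page, and
then derives the bucket count from whatever space is left after the
header. The following is a userspace sketch of that arithmetic only. The
struct layouts, the 4096-byte page and all names are stand-ins chosen for
illustration, not the kernel definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for the kernel types; their sizes are illustrative only. */
struct list_head_sketch { void *next, *prev; };
struct dir_state_sketch { unsigned int cookie, offset; int bindex, size; };

#define SKETCH_PAGE_SIZE 4096u

/* Round v up to the next power of two; valid for 1 <= v <= 2^31. */
static unsigned int round_up_pow2(unsigned int v)
{
	assert(v >= 1 && v <= 0x80000000u);
	v--;
	v |= v >> 1;
	v |= v >> 2;
	v |= v >> 4;
	v |= v >> 8;
	v |= v >> 16;
	return v + 1;
}

/*
 * Number of hash buckets that fit behind the header once the allocation
 * has been rounded up to a power of two and capped at one page.
 */
static unsigned int buckets_for(unsigned int wanted)
{
	size_t header = sizeof(struct dir_state_sketch);
	size_t bucket = sizeof(struct list_head_sketch);
	unsigned int bytes =
		round_up_pow2((unsigned int)(header + wanted * bucket));

	if (bytes > SKETCH_PAGE_SIZE)
		bytes = SKETCH_PAGE_SIZE;
	return (unsigned int)((bytes - header) / bucket);
}
```

Because of the rounding, the table usually ends up with somewhat more
buckets than the estimate asked for; only the page cap can make it
smaller.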

diff -Nur -x linux-2.6-git/Documentation/dontdiff linux-2.6-git/fs/unionfs/rdstate.c linux-2.6-git-unionfs/fs/unionfs/rdstate.c
--- linux-2.6-git/fs/unionfs/rdstate.c 1969-12-31 19:00:00.000000000 -0500
+++ linux-2.6-git-unionfs/fs/unionfs/rdstate.c 2006-08-31 19:04:00.000000000 -0400
@@ -0,0 +1,303 @@
+/*
+ * Copyright (c) 2003-2006 Erez Zadok
+ * Copyright (c) 2003-2006 Charles P. Wright
+ * Copyright (c) 2005-2006 Josef 'Jeff' Sipek
+ * Copyright (c) 2005-2006 Junjiro Okajima
+ * Copyright (c) 2005 Arun M. Krishnakumar
+ * Copyright (c) 2004-2006 David P. Quigley
+ * Copyright (c) 2003-2004 Mohammad Nayyer Zubair
+ * Copyright (c) 2003 Puja Gupta
+ * Copyright (c) 2003 Harikesavan Krishnan
+ * Copyright (c) 2003-2006 Stony Brook University
+ * Copyright (c) 2003-2006 The Research Foundation of State University of New York
+ *
+ * For specific licensing information, see the COPYING file distributed with
+ * this package.
+ *
+ * This Copyright notice must be kept intact and distributed with all sources.
+ */
+
+#include "union.h"
+
+/* This file contains the routines for maintaining readdir state. */
+/* There are two structures here: rdstate, which is a hash table, and
+ * filldir_node, which is the entry type stored in that table. */
+
+/* This is a kmem_cache_t for filldir nodes, because we allocate a lot of them
+ * and they shouldn't waste memory. If the node has a small name (as defined
+ * by the dentry structure), then we use an inline name to preserve kmalloc
+ * space. */
+static kmem_cache_t *unionfs_filldir_cachep;
+int init_filldir_cache(void)
+{
+ unionfs_filldir_cachep =
+ kmem_cache_create("unionfs_filldir", sizeof(struct filldir_node), 0,
+ SLAB_RECLAIM_ACCOUNT, NULL, NULL);
+
+ if (!unionfs_filldir_cachep)
+ return -ENOMEM;
+
+ return 0;
+}
+
+void destroy_filldir_cache(void)
+{
+ if (!unionfs_filldir_cachep)
+ return;
+ if (kmem_cache_destroy(unionfs_filldir_cachep)) {
+ printk(KERN_ERR
+ "unionfs_filldir_cache: not all structures were freed\n");
+ }
+ return;
+}
+
+/* This is a tuning parameter that tells us roughly how big to make the
+ * hash table in directory entries per page. This isn't perfect, but
+ * at least we get a hash table size that shouldn't be too overloaded.
+ * The following averages are based on my home directory.
+ * 14.44693 Overall
+ * 12.29 Single Page Directories
+ * 117.93 Multi-page directories
+ */
+#define DENTPAGE 4096
+#define DENTPERONEPAGE 12
+#define DENTPERPAGE 118
+#define MINHASHSIZE 1
+static int guesstimate_hash_size(struct inode *inode)
+{
+ struct inode *hidden_inode;
+ int bindex;
+ int hashsize = MINHASHSIZE;
+
+ if (itopd(inode)->uii_hashsize > 0)
+ return itopd(inode)->uii_hashsize;
+
+ for (bindex = ibstart(inode); bindex <= ibend(inode); bindex++) {
+ if (!(hidden_inode = itohi_index(inode, bindex)))
+ continue;
+
+ if (hidden_inode->i_size == DENTPAGE) {
+ hashsize += DENTPERONEPAGE;
+ } else {
+ hashsize +=
+ (hidden_inode->i_size / DENTPAGE) * DENTPERPAGE;
+ }
+ }
+
+ return hashsize;
+}
+
+int init_rdstate(struct file *file)
+{
+ BUG_ON(sizeof(loff_t) != (sizeof(unsigned int) + sizeof(unsigned int)));
+ BUG_ON(ftopd(file)->rdstate != NULL);
+
+ ftopd(file)->rdstate =
+ alloc_rdstate(file->f_dentry->d_inode, fbstart(file));
+ if (!ftopd(file)->rdstate)
+ return -ENOMEM;
+ return 0;
+}
+
+struct unionfs_dir_state *find_rdstate(struct inode *inode, loff_t fpos)
+{
+ struct unionfs_dir_state *rdstate = NULL;
+ struct list_head *pos;
+
+ spin_lock(&itopd(inode)->uii_rdlock);
+ list_for_each(pos, &itopd(inode)->uii_readdircache) {
+ struct unionfs_dir_state *r =
+ list_entry(pos, struct unionfs_dir_state, uds_cache);
+ if (fpos == rdstate2offset(r)) {
+ itopd(inode)->uii_rdcount--;
+ list_del(&r->uds_cache);
+ rdstate = r;
+ break;
+ }
+ }
+ spin_unlock(&itopd(inode)->uii_rdlock);
+ return rdstate;
+}
+
+struct unionfs_dir_state *alloc_rdstate(struct inode *inode, int bindex)
+{
+ int i = 0;
+ int hashsize;
+ int mallocsize = sizeof(struct unionfs_dir_state);
+ struct unionfs_dir_state *rdstate;
+
+ hashsize = guesstimate_hash_size(inode);
+ mallocsize += hashsize * sizeof(struct list_head);
+ /* Round it up to the next highest power of two. */
+ mallocsize--;
+ mallocsize |= mallocsize >> 1;
+ mallocsize |= mallocsize >> 2;
+ mallocsize |= mallocsize >> 4;
+ mallocsize |= mallocsize >> 8;
+ mallocsize |= mallocsize >> 16;
+ mallocsize++;
+
+ /* This should give us about 500 entries anyway. */
+ if (mallocsize > PAGE_SIZE)
+ mallocsize = PAGE_SIZE;
+
+ hashsize =
+ (mallocsize -
+ sizeof(struct unionfs_dir_state)) / sizeof(struct list_head);
+
+ rdstate = kmalloc(mallocsize, GFP_KERNEL);
+ if (!rdstate)
+ return NULL;
+
+ spin_lock(&itopd(inode)->uii_rdlock);
+ if (itopd(inode)->uii_cookie >= (MAXRDCOOKIE - 1))
+ itopd(inode)->uii_cookie = 1;
+ else
+ itopd(inode)->uii_cookie++;
+
+ rdstate->uds_cookie = itopd(inode)->uii_cookie;
+ spin_unlock(&itopd(inode)->uii_rdlock);
+ rdstate->uds_offset = 1;
+ rdstate->uds_access = jiffies;
+ rdstate->uds_bindex = bindex;
+ rdstate->uds_dirpos = 0;
+ rdstate->uds_hashentries = 0;
+ rdstate->uds_size = hashsize;
+ for (i = 0; i < rdstate->uds_size; i++)
+ INIT_LIST_HEAD(&rdstate->uds_list[i]);
+
+ return rdstate;
+}
+
+static void free_filldir_node(struct filldir_node *node)
+{
+ if (node->namelen >= DNAME_INLINE_LEN_MIN)
+ kfree(node->name);
+ kmem_cache_free(unionfs_filldir_cachep, node);
+}
+
+void free_rdstate(struct unionfs_dir_state *state)
+{
+ struct filldir_node *tmp;
+ int i;
+
+ for (i = 0; i < state->uds_size; i++) {
+ struct list_head *head = &(state->uds_list[i]);
+ struct list_head *pos, *n;
+
+ /* traverse the list and deallocate space */
+ list_for_each_safe(pos, n, head) {
+ tmp = list_entry(pos, struct filldir_node, file_list);
+ list_del(&tmp->file_list);
+ free_filldir_node(tmp);
+ }
+ }
+
+ kfree(state);
+}
+
+struct filldir_node *find_filldir_node(struct unionfs_dir_state *rdstate,
+ const char *name, int namelen)
+{
+ int index;
+ unsigned int hash;
+ struct list_head *head;
+ struct list_head *pos;
+ struct filldir_node *cursor = NULL;
+ int found = 0;
+
+ /* If we print entry, we end up with spurious data. */
+ /* print_entry("name = %*s", namelen, name); */
+ BUG_ON(namelen <= 0);
+
+ hash = full_name_hash(name, namelen);
+ index = hash % rdstate->uds_size;
+
+ head = &(rdstate->uds_list[index]);
+ list_for_each(pos, head) {
+ cursor = list_entry(pos, struct filldir_node, file_list);
+
+ if (cursor->namelen == namelen && cursor->hash == hash
+ && !strncmp(cursor->name, name, namelen)) {
+ /* a duplicate exists, and hence no need to create entry to the list */
+ found = 1;
+ /* if the duplicate is in this branch, then the file system is corrupted. */
+ if (cursor->bindex == rdstate->uds_bindex) {
+ //buf->error = err = -EIO;
+ printk(KERN_DEBUG
+ "Possible I/O error unionfs_filldir: a file is duplicated in the same branch %d: %s\n",
+ rdstate->uds_bindex, cursor->name);
+ }
+ break;
+ }
+ }
+
+ if (!found) {
+ cursor = NULL;
+ }
+ return cursor;
+}
+
+inline struct filldir_node *alloc_filldir_node(const char *name, int namelen,
+ unsigned int hash, int bindex)
+{
+ struct filldir_node *newnode;
+
+ newnode =
+ (struct filldir_node *)kmem_cache_alloc(unionfs_filldir_cachep,
+ SLAB_KERNEL);
+ if (!newnode)
+ goto out;
+
+out:
+ return newnode;
+}
+
+int add_filldir_node(struct unionfs_dir_state *rdstate, const char *name,
+ int namelen, int bindex, int whiteout)
+{
+ struct filldir_node *new;
+ unsigned int hash;
+ int index;
+ int err = 0;
+ struct list_head *head;
+
+ BUG_ON(namelen <= 0);
+
+ hash = full_name_hash(name, namelen);
+ index = hash % rdstate->uds_size;
+ head = &(rdstate->uds_list[index]);
+
+ new = alloc_filldir_node(name, namelen, hash, bindex);
+ if (!new) {
+ err = -ENOMEM;
+ goto out;
+ }
+
+ INIT_LIST_HEAD(&new->file_list);
+ new->namelen = namelen;
+ new->hash = hash;
+ new->bindex = bindex;
+ new->whiteout = whiteout;
+
+ if (namelen < DNAME_INLINE_LEN_MIN) {
+ new->name = new->iname;
+ } else {
+ new->name = (char *)kmalloc(namelen + 1, GFP_KERNEL);
+ if (!new->name) {
+ kmem_cache_free(unionfs_filldir_cachep, new);
+ err = -ENOMEM;
+ goto out;
+ }
+ }
+
+ memcpy(new->name, name, namelen);
+ new->name[namelen] = '\0';
+
+ rdstate->uds_hashentries++;
+
+ list_add(&(new->file_list), head);
+out:
+ return err;
+}
+

2006-09-01 01:53:38

by Stephen Rothwell

Subject: Re: [PATCH 00/22][RFC] Unionfs: Stackable Namespace Unification Filesystem

On Thu, 31 Aug 2006 21:35:13 -0400 Josef Sipek <[email protected]> wrote:
>
> This set of patches constitutes Unionfs version 2.0. We are presenting it to
> be reviewed and considered for inclusion into the kernel.

Small nit: is it possible to order these patches so that the kernel
builds at each intermediate point (so we can use git bisect)? The
easiest way to achieve this would be to do the Kconfig and Makefile
updates last.

--
Cheers,
Stephen Rothwell [email protected]
http://www.canb.auug.org.au/~sfr/



2006-09-01 01:55:07

by Josef 'Jeff' Sipek

Subject: [PATCH 14/22][RFC] Unionfs: Rename

From: Josef "Jeff" Sipek <[email protected]>

This patch provides rename functionality for Unionfs.

Signed-off-by: Josef "Jeff" Sipek <[email protected]>
Signed-off-by: David Quigley <[email protected]>
Signed-off-by: Erez Zadok <[email protected]>

---

fs/unionfs/rename.c | 480 ++++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 480 insertions(+)
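
The rename code below looks up whiteout entries by name, using
alloc_whname() and a length of d_name.len + UNIONFS_WHLEN. Both are
defined elsewhere in the series, not in this message. The userspace
sketch that follows assumes a whiteout name is simply the original name
with a fixed prefix; the ".wh." prefix and every identifier in it are
assumptions made for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Assumed prefix; the series defines the real one outside this message. */
#define SKETCH_WHPFX	".wh."
#define SKETCH_WHLEN	(sizeof(SKETCH_WHPFX) - 1)

/*
 * Return a freshly malloc()ed "<prefix><name>" string, or NULL on
 * allocation failure. name need not be NUL-terminated; len gives its
 * length. The caller frees the result.
 */
static char *sketch_alloc_whname(const char *name, size_t len)
{
	char *buf;

	assert(name && len > 0);
	buf = malloc(SKETCH_WHLEN + len + 1);
	if (!buf)
		return NULL;
	memcpy(buf, SKETCH_WHPFX, SKETCH_WHLEN);
	memcpy(buf + SKETCH_WHLEN, name, len);
	buf[SKETCH_WHLEN + len] = '\0';
	return buf;
}
```

The sketch reports failure with NULL. The kernel helper reports it with
an ERR_PTR() value instead, which is why its callers test IS_ERR() before
using the pointer.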

diff -Nur -x linux-2.6-git/Documentation/dontdiff linux-2.6-git/fs/unionfs/rename.c linux-2.6-git-unionfs/fs/unionfs/rename.c
--- linux-2.6-git/fs/unionfs/rename.c 1969-12-31 19:00:00.000000000 -0500
+++ linux-2.6-git-unionfs/fs/unionfs/rename.c 2006-08-31 19:04:00.000000000 -0400
@@ -0,0 +1,480 @@
+/*
+ * Copyright (c) 2003-2006 Erez Zadok
+ * Copyright (c) 2003-2006 Charles P. Wright
+ * Copyright (c) 2005-2006 Josef 'Jeff' Sipek
+ * Copyright (c) 2005-2006 Junjiro Okajima
+ * Copyright (c) 2005 Arun M. Krishnakumar
+ * Copyright (c) 2004-2006 David P. Quigley
+ * Copyright (c) 2003-2004 Mohammad Nayyer Zubair
+ * Copyright (c) 2003 Puja Gupta
+ * Copyright (c) 2003 Harikesavan Krishnan
+ * Copyright (c) 2003-2006 Stony Brook University
+ * Copyright (c) 2003-2006 The Research Foundation of State University of New York
+ *
+ * For specific licensing information, see the COPYING file distributed with
+ * this package.
+ *
+ * This Copyright notice must be kept intact and distributed with all sources.
+ */
+
+#include "union.h"
+
+static int do_rename(struct inode *old_dir, struct dentry *old_dentry,
+ struct inode *new_dir, struct dentry *new_dentry,
+ int bindex, struct dentry **wh_old)
+{
+ int err = 0;
+ struct dentry *hidden_old_dentry;
+ struct dentry *hidden_new_dentry;
+ struct dentry *hidden_old_dir_dentry;
+ struct dentry *hidden_new_dir_dentry;
+ struct dentry *hidden_wh_dentry;
+ struct dentry *hidden_wh_dir_dentry;
+ char *wh_name = NULL;
+
+ hidden_new_dentry = dtohd_index(new_dentry, bindex);
+ hidden_old_dentry = dtohd_index(old_dentry, bindex);
+
+ if (!hidden_new_dentry) {
+ hidden_new_dentry =
+ create_parents(new_dentry->d_parent->d_inode, new_dentry, bindex);
+ if (IS_ERR(hidden_new_dentry)) {
+ printk(KERN_DEBUG "error creating directory tree for"
+ " rename, bindex = %d, err = %ld\n",
+ bindex, PTR_ERR(hidden_new_dentry));
+ err = PTR_ERR(hidden_new_dentry);
+ goto out;
+ }
+ }
+
+ wh_name = alloc_whname(new_dentry->d_name.name, new_dentry->d_name.len);
+ if (IS_ERR(wh_name)) {
+ /* must not reach the kfree(wh_name) at out: with an error pointer */
+ return PTR_ERR(wh_name);
+ }
+
+ hidden_wh_dentry =
+ lookup_one_len(wh_name, hidden_new_dentry->d_parent,
+ new_dentry->d_name.len + UNIONFS_WHLEN);
+ if (IS_ERR(hidden_wh_dentry)) {
+ err = PTR_ERR(hidden_wh_dentry);
+ goto out;
+ }
+
+ if (hidden_wh_dentry->d_inode) {
+ /* get rid of the existing whiteout */
+ if (hidden_new_dentry->d_inode) {
+ printk(KERN_WARNING "Both a whiteout and a dentry"
+ " exist when doing a rename!\n");
+ err = -EIO;
+
+ dput(hidden_wh_dentry);
+ goto out;
+ }
+
+ hidden_wh_dir_dentry = lock_parent(hidden_wh_dentry);
+ if (!(err = is_robranch_super(old_dentry->d_sb, bindex))) {
+ err = vfs_unlink(hidden_wh_dir_dentry->d_inode,
+ hidden_wh_dentry);
+ }
+ dput(hidden_wh_dentry);
+ unlock_dir(hidden_wh_dir_dentry);
+ if (err)
+ goto out;
+ } else
+ dput(hidden_wh_dentry);
+
+ dget(hidden_old_dentry);
+ hidden_old_dir_dentry = dget_parent(hidden_old_dentry);
+ hidden_new_dir_dentry = dget_parent(hidden_new_dentry);
+
+ lock_rename(hidden_old_dir_dentry, hidden_new_dir_dentry);
+
+ err = is_robranch_super(old_dentry->d_sb, bindex);
+ if (err)
+ goto out_unlock;
+
+ /* Prepare the whiteout for old_dentry: look up its dentry here.
+ The caller will create the actual whiteout,
+ and must dput(*wh_old). */
+ if (wh_old) {
+ char *whname;
+ whname = alloc_whname(old_dentry->d_name.name,
+ old_dentry->d_name.len);
+ err = PTR_ERR(whname);
+ if (IS_ERR(whname))
+ goto out_unlock;
+ *wh_old = lookup_one_len(whname, hidden_old_dir_dentry,
+ old_dentry->d_name.len + UNIONFS_WHLEN);
+ kfree(whname);
+ err = PTR_ERR(*wh_old);
+ if (IS_ERR(*wh_old)) {
+ *wh_old = NULL;
+ goto out_unlock;
+ }
+ }
+
+ err = vfs_rename(hidden_old_dir_dentry->d_inode, hidden_old_dentry,
+ hidden_new_dir_dentry->d_inode, hidden_new_dentry);
+
+out_unlock:
+ unlock_rename(hidden_old_dir_dentry, hidden_new_dir_dentry);
+
+ dput(hidden_old_dir_dentry);
+ dput(hidden_new_dir_dentry);
+ dput(hidden_old_dentry);
+
+out:
+ if (!err) {
+ /* Fixup the newdentry. */
+ if (bindex < dbstart(new_dentry))
+ set_dbstart(new_dentry, bindex);
+ else if (bindex > dbend(new_dentry))
+ set_dbend(new_dentry, bindex);
+ }
+
+ kfree(wh_name);
+
+ return err;
+}
+
+static int do_unionfs_rename(struct inode *old_dir,
+ struct dentry *old_dentry,
+ struct inode *new_dir,
+ struct dentry *new_dentry)
+{
+ int err = 0;
+ int bindex, bwh_old;
+ int old_bstart, old_bend;
+ int new_bstart, new_bend;
+ int do_copyup = -1;
+ struct dentry *parent_dentry;
+ int local_err = 0;
+ int eio = 0;
+ int revert = 0;
+ struct dentry *wh_old = NULL;
+
+ old_bstart = dbstart(old_dentry);
+ bwh_old = old_bstart;
+ old_bend = dbend(old_dentry);
+ parent_dentry = old_dentry->d_parent;
+
+ new_bstart = dbstart(new_dentry);
+ new_bend = dbend(new_dentry);
+
+ /* Rename source to destination. */
+ err = do_rename(old_dir, old_dentry, new_dir, new_dentry, old_bstart,
+ &wh_old);
+ if (err) {
+ if (!IS_COPYUP_ERR(err)) {
+ goto out;
+ }
+ do_copyup = old_bstart - 1;
+ } else {
+ revert = 1;
+ }
+
+ /* Unlink all instances of destination that exist to the left of
+ * bstart of source. On error, revert back, goto out.
+ */
+ for (bindex = old_bstart - 1; bindex >= new_bstart; bindex--) {
+ struct dentry *unlink_dentry;
+ struct dentry *unlink_dir_dentry;
+
+ unlink_dentry = dtohd_index(new_dentry, bindex);
+ if (!unlink_dentry) {
+ continue;
+ }
+
+ unlink_dir_dentry = lock_parent(unlink_dentry);
+ if (!(err = is_robranch_super(old_dir->i_sb, bindex))) {
+ err =
+ vfs_unlink(unlink_dir_dentry->d_inode,
+ unlink_dentry);
+ }
+
+ fist_copy_attr_times(new_dentry->d_parent->d_inode,
+ unlink_dir_dentry->d_inode);
+ /* propagate number of hard-links */
+ new_dentry->d_parent->d_inode->i_nlink =
+ get_nlinks(new_dentry->d_parent->d_inode);
+
+ unlock_dir(unlink_dir_dentry);
+ if (!err) {
+ if (bindex != new_bstart) {
+ dput(unlink_dentry);
+ set_dtohd_index(new_dentry, bindex, NULL);
+ }
+ } else if (IS_COPYUP_ERR(err)) {
+ do_copyup = bindex - 1;
+ } else if (revert) {
+ dput(wh_old);
+ goto revert;
+ }
+ }
+
+ if (do_copyup != -1) {
+ for (bindex = do_copyup; bindex >= 0; bindex--) {
+ /* copyup the file into some left directory, so that you can rename it */
+ err =
+ copyup_dentry(old_dentry->d_parent->d_inode,
+ old_dentry, old_bstart, bindex, NULL,
+ old_dentry->d_inode->i_size);
+ if (!err) {
+ dput(wh_old);
+ bwh_old = bindex;
+ err =
+ do_rename(old_dir, old_dentry, new_dir,
+ new_dentry, bindex, &wh_old);
+ break;
+ }
+ }
+ }
+
+ /* make it opaque */
+ if (S_ISDIR(old_dentry->d_inode->i_mode)) {
+ err = make_dir_opaque(old_dentry, dbstart(old_dentry));
+ if (err)
+ goto revert;
+ }
+
+ /* Create whiteout for source, only if:
+ * (1) There is more than one underlying instance of source.
+ * (2) We did a copy_up
+ */
+ if ((old_bstart != old_bend) || (do_copyup != -1)) {
+ struct dentry *hidden_parent;
+ BUG_ON(!wh_old || IS_ERR(wh_old) || wh_old->d_inode
+ || bwh_old < 0);
+ hidden_parent = lock_parent(wh_old);
+ local_err = vfs_create(hidden_parent->d_inode, wh_old, S_IRUGO,
+ NULL);
+ unlock_dir(hidden_parent);
+ if (!local_err)
+ set_dbopaque(old_dentry, bwh_old);
+ else {
+ /* We can't fix anything now, so we cop-out and use -EIO. */
+ printk(KERN_EMERG "We can't create a whiteout for the source in rename!\n");
+ err = -EIO;
+ }
+ }
+
+out:
+ dput(wh_old);
+ return err;
+
+revert:
+ /* Do revert here. */
+ local_err = unionfs_refresh_hidden_dentry(new_dentry, old_bstart);
+ if (local_err) {
+ printk(KERN_WARNING
+ "Revert failed in rename: the new refresh failed.\n");
+ eio = -EIO;
+ }
+
+ local_err = unionfs_refresh_hidden_dentry(old_dentry, old_bstart);
+ if (local_err) {
+ printk(KERN_WARNING
+ "Revert failed in rename: the old refresh failed.\n");
+ eio = -EIO;
+ goto revert_out;
+ }
+
+ if (!dtohd_index(new_dentry, bindex)
+ || !dtohd_index(new_dentry, bindex)->d_inode) {
+ printk(KERN_WARNING
+ "Revert failed in rename: the object disappeared from under us!\n");
+ eio = -EIO;
+ goto revert_out;
+ }
+
+ if (dtohd_index(old_dentry, bindex)
+ && dtohd_index(old_dentry, bindex)->d_inode) {
+ printk(KERN_WARNING
+ "Revert failed in rename: the object was created underneath us!\n");
+ eio = -EIO;
+ goto revert_out;
+ }
+
+ local_err =
+ do_rename(new_dir, new_dentry, old_dir, old_dentry, old_bstart,
+ NULL);
+
+ /* If we can't fix it, then we cop-out with -EIO. */
+ if (local_err) {
+ printk(KERN_WARNING "Revert failed in rename!\n");
+ eio = -EIO;
+ }
+
+ local_err = unionfs_refresh_hidden_dentry(new_dentry, bindex);
+ if (local_err)
+ eio = -EIO;
+ local_err = unionfs_refresh_hidden_dentry(old_dentry, bindex);
+ if (local_err)
+ eio = -EIO;
+
+revert_out:
+ if (eio)
+ err = eio;
+ return err;
+}
+
+/*
+ * Unfortunately, we cannot simply call things like dbstart() in different
+ * places of the rename code because we move things around. So, we use this
+ * structure to pass the necessary information around to all the places that
+ * need it.
+ */
+struct rename_info {
+ int do_copyup;
+ int do_whiteout;
+ int rename_ok;
+
+ int old_bstart;
+ int old_bend;
+ int new_bstart;
+ int new_bend;
+
+ int isdir; /* Is the source a directory? */
+ int clobber; /* Are we clobbering the destination? */
+
+ int bwh_old; /* where we create the whiteout */
+ struct dentry *wh_old; /* lookup and set by do_rename() */
+};
+
+static struct dentry *lookup_whiteout(struct dentry *dentry)
+{
+ char *whname;
+ int bindex = -1, bstart = -1, bend = -1;
+ struct dentry *parent, *hidden_parent, *wh_dentry;
+
+ whname = alloc_whname(dentry->d_name.name, dentry->d_name.len);
+ if (IS_ERR(whname))
+ return (void *)whname;
+
+ parent = dget_parent(dentry);
+ lock_dentry(parent);
+ bstart = dbstart(parent);
+ bend = dbend(parent);
+ wh_dentry = ERR_PTR(-ENOENT);
+ for (bindex = bstart; bindex <= bend; bindex++) {
+ hidden_parent = dtohd_index(parent, bindex);
+ if (!hidden_parent)
+ continue;
+ wh_dentry =
+ lookup_one_len(whname, hidden_parent,
+ dentry->d_name.len + UNIONFS_WHLEN);
+ if (IS_ERR(wh_dentry))
+ continue;
+ if (wh_dentry->d_inode)
+ break;
+ dput(wh_dentry);
+ wh_dentry = ERR_PTR(-ENOENT);
+ }
+ unlock_dentry(parent);
+ dput(parent);
+ kfree(whname);
+ return wh_dentry;
+}
+
+/* We can't copyup a directory, because it may involve huge
+ * numbers of children, etc. Doing that in the kernel would
+ * be bad, so instead we let the userspace recurse and ask us
+ * to copy up each file separately
+ */
+static int may_rename_dir(struct dentry *dentry)
+{
+ int err, bstart;
+
+ err = check_empty(dentry, NULL);
+ if (err == -ENOTEMPTY) {
+ if (is_robranch(dentry))
+ return -EXDEV;
+ } else if (err)
+ return err;
+
+ bstart = dbstart(dentry);
+ if (dbend(dentry) == bstart || dbopaque(dentry) == bstart)
+ return 0;
+
+ set_dbstart(dentry, bstart + 1);
+ err = check_empty(dentry, NULL);
+ set_dbstart(dentry, bstart);
+ if (err == -ENOTEMPTY)
+ err = -EXDEV;
+ return err;
+}
+
+int unionfs_rename(struct inode *old_dir, struct dentry *old_dentry,
+ struct inode *new_dir, struct dentry *new_dentry)
+{
+ int err = 0;
+ struct dentry *wh_dentry;
+
+ double_lock_dentry(old_dentry, new_dentry);
+
+ if (!S_ISDIR(old_dentry->d_inode->i_mode))
+ err = unionfs_partial_lookup(old_dentry);
+ else
+ err = may_rename_dir(old_dentry);
+
+ if (err)
+ goto out;
+
+ err = unionfs_partial_lookup(new_dentry);
+ if (err)
+ goto out;
+
+ /*
+ * if new_dentry is already hidden because of whiteout,
+ * simply override it even if the whiteouted dir is not empty.
+ */
+ wh_dentry = lookup_whiteout(new_dentry);
+ if (!IS_ERR(wh_dentry))
+ dput(wh_dentry);
+ else if (new_dentry->d_inode) {
+ if (S_ISDIR(old_dentry->d_inode->i_mode) !=
+ S_ISDIR(new_dentry->d_inode->i_mode)) {
+ err =
+ S_ISDIR(old_dentry->d_inode->
+ i_mode) ? -ENOTDIR : -EISDIR;
+ goto out;
+ }
+
+ if (S_ISDIR(new_dentry->d_inode->i_mode)) {
+ struct unionfs_dir_state *namelist;
+ /* check if this unionfs directory is empty or not */
+ err = check_empty(new_dentry, &namelist);
+ if (err)
+ goto out;
+
+ if (!is_robranch(new_dentry))
+ err = delete_whiteouts(new_dentry,
+ dbstart(new_dentry),
+ namelist);
+
+ free_rdstate(namelist);
+
+ if (err)
+ goto out;
+ }
+ }
+ err = do_unionfs_rename(old_dir, old_dentry, new_dir, new_dentry);
+
+out:
+
+ if (err) {
+ /* clear the new_dentry stuff created */
+ d_drop(new_dentry);
+ } else {
+ /* force re-lookup since the dir on ro branch is not renamed,
+ and hidden dentries still indicate the un-renamed ones. */
+ if (S_ISDIR(old_dentry->d_inode->i_mode))
+ atomic_dec(&dtopd(old_dentry)->udi_generation);
+ }
+
+ unlock_dentry(new_dentry);
+ unlock_dentry(old_dentry);
+ return err;
+}
+

2006-09-01 01:55:52

by Josef 'Jeff' Sipek

Subject: [PATCH 15/22][RFC] Unionfs: Privileged operations workqueue

From: Josef "Jeff" Sipek <[email protected]>

Workqueue & helper functions used to perform privileged operations on
behalf of the user process.

Signed-off-by: Josef "Jeff" Sipek <[email protected]>
Signed-off-by: David Quigley <[email protected]>
Signed-off-by: Erez Zadok <[email protected]>

---

fs/unionfs/sioq.c | 114 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
fs/unionfs/sioq.h | 79 +++++++++++++++++++++++++++++++++++++
2 files changed, 193 insertions(+)
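
The helpers in sioq.c below share one convention: the caller packs its
arguments into a struct sioq_args, run_sioq() queues the handler and
waits on args->comp, and the handler stores its result in the argument
block and signals completion before returning. The sketch below is a
single-threaded userspace stand-in for that contract, not the workqueue
code itself: a flag replaces the completion, the handler is called
inline, and all names are hypothetical.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical argument block, modelled loosely on struct sioq_args. */
struct sketch_args {
	int done;	/* stands in for struct completion */
	int err;	/* result reported by the handler */
	union {
		struct { int mode; } mkdir;
		struct { const char *name; } unlink;
	} u;
};

/* Stand-in for complete(): mark the request as finished. */
static void sketch_complete(struct sketch_args *args)
{
	args->done = 1;
}

/* Example handler: check the input, store the result, signal completion. */
static void sketch_mkdir(void *data)
{
	struct sketch_args *args = data;

	args->err = (args->u.mkdir.mode & ~07777) ? -EINVAL : 0;
	sketch_complete(args);
}

/*
 * Stand-in for run_sioq(). The real code queues func on a workqueue and
 * sleeps in wait_for_completion(); here func runs inline and we only
 * check that it signalled completion. Returns 0 if it did, -EIO if not.
 */
static int sketch_run(void (*func)(void *), struct sketch_args *args)
{
	assert(func && args);
	args->done = 0;
	func(args);
	return args->done ? 0 : -EIO;
}
```

In the real code a handler that returns without signalling completion
does not produce an error code; the caller simply never wakes up from
wait_for_completion().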

diff -Nur -x linux-2.6-git/Documentation/dontdiff linux-2.6-git/fs/unionfs/sioq.c linux-2.6-git-unionfs/fs/unionfs/sioq.c
--- linux-2.6-git/fs/unionfs/sioq.c 1969-12-31 19:00:00.000000000 -0500
+++ linux-2.6-git-unionfs/fs/unionfs/sioq.c 2006-08-31 19:04:00.000000000 -0400
@@ -0,0 +1,114 @@
+/*
+ * Copyright (c) 2003-2006 Erez Zadok
+ * Copyright (c) 2003-2006 Charles P. Wright
+ * Copyright (c) 2005-2006 Josef 'Jeff' Sipek
+ * Copyright (c) 2005-2006 Junjiro Okajima
+ * Copyright (c) 2005 Arun M. Krishnakumar
+ * Copyright (c) 2004-2006 David P. Quigley
+ * Copyright (c) 2003-2004 Mohammad Nayyer Zubair
+ * Copyright (c) 2003 Puja Gupta
+ * Copyright (c) 2003 Harikesavan Krishnan
+ * Copyright (c) 2003-2006 Stony Brook University
+ * Copyright (c) 2003-2006 The Research Foundation of State University of New York
+ *
+ * For specific licensing information, see the COPYING file distributed with
+ * this package.
+ *
+ * This Copyright notice must be kept intact and distributed with all sources.
+ */
+
+#include "union.h"
+
+struct workqueue_struct *sioq;
+
+int __init init_sioq(void)
+{
+ int err;
+
+ sioq = create_workqueue("unionfs_siod");
+ if (sioq)
+ return 0;
+
+ /* create_workqueue() returns NULL on failure, not an ERR_PTR() value */
+ err = -ENOMEM;
+ printk(KERN_ERR "create_workqueue failed %d\n", err);
+ return err;
+}
+
+void fin_sioq(void)
+{
+ if (sioq)
+ destroy_workqueue(sioq);
+}
+
+void run_sioq(void (*func)(void *arg), struct sioq_args *args)
+{
+ DECLARE_WORK(wk, func, args);
+
+ init_completion(&args->comp);
+ while (!queue_work(sioq, &wk)) {
+ /* TODO: do accounting if needed */
+ schedule();
+ }
+ wait_for_completion(&args->comp);
+}
+
+void __unionfs_create(void *data)
+{
+ struct sioq_args *args = data;
+
+ args->err = vfs_create(args->u.create.parent, args->u.create.dentry,
+ args->u.create.mode, args->u.create.nd);
+ complete(&args->comp);
+}
+
+void __unionfs_mkdir(void *data)
+{
+ struct sioq_args *args = data;
+
+ args->err = vfs_mkdir(args->u.mkdir.parent, args->u.mkdir.dentry,
+ args->u.mkdir.mode);
+ complete(&args->comp);
+}
+
+void __unionfs_mknod(void *data)
+{
+ struct sioq_args *args = data;
+
+ args->err = vfs_mknod(args->u.mknod.parent, args->u.mknod.dentry,
+ args->u.mknod.mode, args->u.mknod.dev);
+ complete(&args->comp);
+}
+void __unionfs_symlink(void *data)
+{
+ struct sioq_args *args = data;
+
+ args->err = vfs_symlink(args->u.symlink.parent, args->u.symlink.dentry,
+ args->u.symlink.symbuf, args->u.symlink.mode);
+ complete(&args->comp);
+}
+
+void __unionfs_unlink(void *data)
+{
+ struct sioq_args *args = data;
+
+ args->err = vfs_unlink(args->u.unlink.parent, args->u.unlink.dentry);
+ complete(&args->comp);
+}
+
+void __delete_whiteouts(void *data) {
+ struct sioq_args *args = data;
+
+ args->err = do_delete_whiteouts(args->u.deletewh.dentry, args->u.deletewh.bindex,
+ args->u.deletewh.namelist);
+
+ complete(&args->comp);
+}
+
+void __is_opaque_dir(void *data)
+{
+ struct sioq_args *args = data;
+
+ args->ret = lookup_one_len(UNIONFS_DIR_OPAQUE, args->u.isopaque.dentry,
+ sizeof(UNIONFS_DIR_OPAQUE) - 1);
+ complete(&args->comp);
+}
diff -Nur -x linux-2.6-git/Documentation/dontdiff linux-2.6-git/fs/unionfs/sioq.h linux-2.6-git-unionfs/fs/unionfs/sioq.h
--- linux-2.6-git/fs/unionfs/sioq.h 1969-12-31 19:00:00.000000000 -0500
+++ linux-2.6-git-unionfs/fs/unionfs/sioq.h 2006-08-31 19:04:01.000000000 -0400
@@ -0,0 +1,79 @@
+#ifndef _SIOQ_H
+#define _SIOQ_H
+
+struct deletewh_args {
+ struct unionfs_dir_state *namelist;
+ struct dentry *dentry;
+ int bindex;
+};
+
+struct isopaque_args {
+ struct dentry *dentry;
+};
+
+struct create_args {
+ struct inode *parent;
+ struct dentry *dentry;
+ umode_t mode;
+ struct nameidata *nd;
+};
+
+struct mkdir_args {
+ struct inode *parent;
+ struct dentry *dentry;
+ umode_t mode;
+};
+
+struct mknod_args {
+ struct inode *parent;
+ struct dentry *dentry;
+ umode_t mode;
+ dev_t dev;
+};
+
+struct symlink_args {
+ struct inode *parent;
+ struct dentry *dentry;
+ char *symbuf;
+ umode_t mode;
+};
+
+struct unlink_args {
+ struct inode *parent;
+ struct dentry *dentry;
+};
+
+
+struct sioq_args {
+
+ struct completion comp;
+ int err;
+ void *ret;
+
+ union {
+ struct deletewh_args deletewh;
+ struct isopaque_args isopaque;
+ struct create_args create;
+ struct mkdir_args mkdir;
+ struct mknod_args mknod;
+ struct symlink_args symlink;
+ struct unlink_args unlink;
+ } u;
+};
+
+extern struct workqueue_struct *sioq;
+int __init init_sioq(void);
+extern void fin_sioq(void);
+extern void run_sioq(void (*func)(void *arg), struct sioq_args *args);
+
+/* Extern definitions for our privilege escalation helpers */
+extern void __unionfs_create(void *data);
+extern void __unionfs_mkdir(void *data);
+extern void __unionfs_mknod(void *data);
+extern void __unionfs_symlink(void *data);
+extern void __unionfs_unlink(void *data);
+extern void __delete_whiteouts(void *data);
+extern void __is_opaque_dir(void *data);
+
+#endif /* _SIOQ_H */
+
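
A minimal userspace sketch of the calling convention that run_sioq() and the helpers in this patch imply: the caller packs its arguments into one member of the union, hands a helper function to the queue, and reads the error code back afterwards. This is not kernel code. The one-slot, single-threaded queue (model_run_sioq, model_drain), the plain "done" flag standing in for struct completion, and model_mkdir standing in for vfs_mkdir() are all hypothetical names made up for illustration. In the patch the helper runs on the unionfs_siod worker thread while the caller sleeps in wait_for_completion().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Per-operation argument blocks, mirroring the shape of sioq.h. */
struct mkdir_args { const char *name; unsigned int mode; };
struct unlink_args { const char *name; };

struct sioq_args {
	int done;	/* stand-in for struct completion */
	int err;	/* result handed back to the caller */
	union {
		struct mkdir_args mkdir;
		struct unlink_args unlink;
	} u;
};

/* One pending work item; a real workqueue holds many. */
struct work {
	void (*func)(void *data);
	void *data;
};

static struct work pending;

/* Hypothetical stand-in for vfs_mkdir(): rejects an empty name. */
static int model_mkdir(const char *name, unsigned int mode)
{
	(void)mode;
	return (name && name[0]) ? 0 : -EINVAL;
}

/* Helper in the style of __unionfs_mkdir(): do the work, store the
 * result, then signal completion. A helper that skips the last step
 * would leave its caller waiting forever. */
static void model_unionfs_mkdir(void *data)
{
	struct sioq_args *args = data;

	args->err = model_mkdir(args->u.mkdir.name, args->u.mkdir.mode);
	args->done = 1;
}

/* Plays the role of the worker thread: run whatever is queued. */
static void model_drain(void)
{
	if (pending.func) {
		void (*func)(void *) = pending.func;

		pending.func = NULL;
		func(pending.data);
	}
}

/* Queue the helper, then "wait" until it has signalled completion. */
static void model_run_sioq(void (*func)(void *), struct sioq_args *args)
{
	args->done = 0;
	pending.func = func;
	pending.data = args;	/* the whole argument block is the work data */
	model_drain();
	assert(args->done);
}

/* Caller side: fill in the union member, run, read back err. */
static int model_do_mkdir(const char *name, unsigned int mode)
{
	struct sioq_args args;

	args.u.mkdir.name = name;
	args.u.mkdir.mode = mode;
	model_run_sioq(model_unionfs_mkdir, &args);
	return args.err;
}
```

Because the argument block lives on the caller's stack, the caller must not return before the helper has signalled completion; that is the reason run_sioq() waits synchronously instead of returning right after queueing.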

2006-09-01 01:56:57

by Josef 'Jeff' Sipek

Subject: [PATCH 16/22][RFC] Unionfs: Handling of stale inodes

From: Josef "Jeff" Sipek <[email protected]>

Provides stub operations for stale inodes. They return -ESTALE, which is a
bit friendlier than the -EIO that bad_inode.c returns.

Signed-off-by: Josef "Jeff" Sipek <[email protected]>
Signed-off-by: David Quigley <[email protected]>
Signed-off-by: Erez Zadok <[email protected]>

---

fs/unionfs/stale_inode.c | 116 +++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 116 insertions(+)

diff -Nur -x linux-2.6-git/Documentation/dontdiff linux-2.6-git/fs/unionfs/stale_inode.c linux-2.6-git-unionfs/fs/unionfs/stale_inode.c
--- linux-2.6-git/fs/unionfs/stale_inode.c 1969-12-31 19:00:00.000000000 -0500
+++ linux-2.6-git-unionfs/fs/unionfs/stale_inode.c 2006-08-31 19:04:01.000000000 -0400
@@ -0,0 +1,116 @@
+/*
+ * Adapted from linux/fs/bad_inode.c
+ *
+ * Copyright (C) 1997, Stephen Tweedie
+ *
+ * Provide stub functions for "stale" inodes, a bit friendlier than the
+ * -EIO that bad_inode.c does.
+ */
+
+#include <linux/config.h>
+#include <linux/version.h>
+
+#include <linux/fs.h>
+#include <linux/stat.h>
+#include <linux/sched.h>
+
+static struct address_space_operations unionfs_stale_aops;
+
+/* declarations for "sparse" */
+extern struct inode_operations stale_inode_ops;
+
+/*
+ * The follow_link operation is special: it must behave as a no-op
+ * so that a stale root inode can at least be unmounted. To do this
+ * we must dput() the base and return the dentry with a dget().
+ */
+static void *stale_follow_link(struct dentry *dent, struct nameidata *nd)
+{
+ int err = vfs_follow_link(nd, ERR_PTR(-ESTALE));
+ return ERR_PTR(err);
+}
+
+static int return_ESTALE(void)
+{
+ return -ESTALE;
+}
+
+#define ESTALE_ERROR ((void *) (return_ESTALE))
+
+static struct file_operations stale_file_ops = {
+ .llseek = ESTALE_ERROR,
+ .read = ESTALE_ERROR,
+ .write = ESTALE_ERROR,
+ .readdir = ESTALE_ERROR,
+ .poll = ESTALE_ERROR,
+ .ioctl = ESTALE_ERROR,
+ .mmap = ESTALE_ERROR,
+ .open = ESTALE_ERROR,
+ .flush = ESTALE_ERROR,
+ .release = ESTALE_ERROR,
+ .fsync = ESTALE_ERROR,
+ .fasync = ESTALE_ERROR,
+ .lock = ESTALE_ERROR,
+};
+
+struct inode_operations stale_inode_ops = {
+ .create = ESTALE_ERROR,
+ .lookup = ESTALE_ERROR,
+ .link = ESTALE_ERROR,
+ .unlink = ESTALE_ERROR,
+ .symlink = ESTALE_ERROR,
+ .mkdir = ESTALE_ERROR,
+ .rmdir = ESTALE_ERROR,
+ .mknod = ESTALE_ERROR,
+ .rename = ESTALE_ERROR,
+ .readlink = ESTALE_ERROR,
+ .follow_link = stale_follow_link,
+ .truncate = ESTALE_ERROR,
+ .permission = ESTALE_ERROR,
+};
+
+/*
+ * When a filesystem is unable to read an inode due to an I/O error in
+ * its read_inode() function, it can call make_stale_inode() to return a
+ * set of stubs which will return ESTALE errors as required.
+ *
+ * We only need to do limited initialisation: all other fields are
+ * preinitialised to zero automatically.
+ */
+
+/**
+ * make_stale_inode - mark an inode stale due to an I/O error
+ * @inode: Inode to mark stale
+ *
+ * When an inode cannot be read due to a media or remote network
+ * failure this function makes the inode "stale" and causes I/O operations
+ * on it to fail from this point on.
+ */
+
+void make_stale_inode(struct inode *inode)
+{
+ inode->i_mode = S_IFREG;
+ inode->i_atime = inode->i_mtime = inode->i_ctime = CURRENT_TIME;
+ inode->i_op = &stale_inode_ops;
+ inode->i_fop = &stale_file_ops;
+ inode->i_mapping->a_ops = &unionfs_stale_aops;
+}
+
+/*
+ * This tests whether an inode has been flagged as stale. The test uses
+ * &stale_inode_ops to cover the case of invalidated inodes as well as
+ * those created by make_stale_inode() above.
+ */
+
+/**
+ * is_stale_inode - test whether an inode has been marked stale
+ * @inode: inode to test
+ *
+ * Returns true if the inode in question has been marked as stale.
+ */
+
+int is_stale_inode(struct inode *inode)
+{
+ return (inode->i_op == &stale_inode_ops);
+}
+
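
A minimal userspace sketch of the technique in stale_inode.c: an object is made "stale" by swapping its operations table for one whose every slot fails with -ESTALE, and staleness is later detected by comparing the table pointer. The struct node, node_ops, and live_* names are hypothetical stand-ins for the kernel's inode and operations structures. Unlike the patch, which fills every slot with one function cast through void *, this sketch uses one correctly typed stub per slot, because standard C does not define calls through a mismatched function pointer type.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct node;

/* A small operations table, standing in for inode/file operations. */
struct node_ops {
	long (*read)(struct node *n, char *buf, size_t len);
	int (*open)(struct node *n);
};

struct node {
	const struct node_ops *ops;
};

/* Ordinary operations for a healthy node. */
static long live_read(struct node *n, char *buf, size_t len)
{
	(void)n; (void)buf;
	return (long)len;
}

static int live_open(struct node *n)
{
	(void)n;
	return 0;
}

static const struct node_ops live_ops = {
	.read = live_read,
	.open = live_open,
};

/* One typed stub per slot; each simply fails with -ESTALE. */
static long stale_read(struct node *n, char *buf, size_t len)
{
	(void)n; (void)buf; (void)len;
	return -ESTALE;
}

static int stale_open(struct node *n)
{
	(void)n;
	return -ESTALE;
}

static const struct node_ops stale_ops = {
	.read = stale_read,
	.open = stale_open,
};

/* Swap in the stub table; every later operation now fails cleanly. */
static void make_stale(struct node *n)
{
	n->ops = &stale_ops;
}

/* Staleness is detected by identity of the operations table. */
static int is_stale(const struct node *n)
{
	return n->ops == &stale_ops;
}
```

The identity test is cheap and needs no extra flag in the object, at the cost of requiring that nothing else ever installs the stub table for a different reason.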

2006-09-01 01:58:22

by Josef 'Jeff' Sipek

Subject: [PATCH 17/22][RFC] Unionfs: Miscellaneous helper functions

From: Josef "Jeff" Sipek <[email protected]>

This patch contains miscellaneous helper functions used throughout Unionfs.

Signed-off-by: Josef "Jeff" Sipek <[email protected]>
Signed-off-by: David Quigley <[email protected]>
Signed-off-by: Erez Zadok <[email protected]>

---

fs/unionfs/subr.c | 179 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 179 insertions(+)

diff -Nur -x linux-2.6-git/Documentation/dontdiff linux-2.6-git/fs/unionfs/subr.c linux-2.6-git-unionfs/fs/unionfs/subr.c
--- linux-2.6-git/fs/unionfs/subr.c 1969-12-31 19:00:00.000000000 -0500
+++ linux-2.6-git-unionfs/fs/unionfs/subr.c 2006-08-31 19:04:01.000000000 -0400
@@ -0,0 +1,179 @@
+/*
+ * Copyright (c) 2003-2006 Erez Zadok
+ * Copyright (c) 2003-2006 Charles P. Wright
+ * Copyright (c) 2005-2006 Josef 'Jeff' Sipek
+ * Copyright (c) 2005-2006 Junjiro Okajima
+ * Copyright (c) 2005 Arun M. Krishnakumar
+ * Copyright (c) 2004-2006 David P. Quigley
+ * Copyright (c) 2003-2004 Mohammad Nayyer Zubair
+ * Copyright (c) 2003 Puja Gupta
+ * Copyright (c) 2003 Harikesavan Krishnan
+ * Copyright (c) 2003-2006 Stony Brook University
+ * Copyright (c) 2003-2006 The Research Foundation of State University of New York
+ *
+ * For specific licensing information, see the COPYING file distributed with
+ * this package.
+ *
+ * This Copyright notice must be kept intact and distributed with all sources.
+ */
+
+#include "union.h"
+
+/* Pass a unionfs dentry and a starting branch index. This tries to create
+ * a whiteout for the filename in dentry, beginning in branch 'start'. On
+ * error, it proceeds to the next branch to the left.
+ */
+int create_whiteout(struct dentry *dentry, int start)
+{
+ int bstart, bend, bindex;
+ struct dentry *hidden_dir_dentry;
+ struct dentry *hidden_dentry;
+ struct dentry *hidden_wh_dentry;
+ char *name = NULL;
+ int err = -EINVAL;
+
+ verify_locked(dentry);
+
+ bstart = dbstart(dentry);
+ bend = dbend(dentry);
+
+ /* create dentry's whiteout equivalent */
+ name = alloc_whname(dentry->d_name.name, dentry->d_name.len);
+ if (IS_ERR(name)) {
+ err = PTR_ERR(name);
+ goto out;
+ }
+
+ for (bindex = start; bindex >= 0; bindex--) {
+ hidden_dentry = dtohd_index(dentry, bindex);
+
+ if (!hidden_dentry) {
+ /* if hidden dentry is not present, create the entire
+ * hidden dentry directory structure and go ahead.
+ * Since we want to just create whiteout, we only want
+ * the parent dentry, and hence get rid of this dentry.
+ */
+ hidden_dentry = create_parents(dentry->d_inode,
+ dentry, bindex);
+ if (!hidden_dentry || IS_ERR(hidden_dentry)) {
+ printk(KERN_DEBUG
+ "create_parents failed for bindex = %d\n",
+ bindex);
+ continue;
+ }
+ }
+ hidden_wh_dentry =
+ lookup_one_len(name, hidden_dentry->d_parent,
+ dentry->d_name.len + UNIONFS_WHLEN);
+ if (IS_ERR(hidden_wh_dentry))
+ continue;
+
+ /* The whiteout already exists. This used to be impossible, but
+ * now is possible because of opaqueness. */
+ if (hidden_wh_dentry->d_inode) {
+ dput(hidden_wh_dentry);
+ err = 0;
+ goto out;
+ }
+
+ hidden_dir_dentry = lock_parent(hidden_wh_dentry);
+ if (!(err = is_robranch_super(dentry->d_sb, bindex))) {
+ err =
+ vfs_create(hidden_dir_dentry->d_inode,
+ hidden_wh_dentry,
+ ~current->fs->umask & S_IRWXUGO, NULL);
+
+ }
+ unlock_dir(hidden_dir_dentry);
+ dput(hidden_wh_dentry);
+
+ if (!err)
+ break;
+
+ if (!IS_COPYUP_ERR(err))
+ break;
+ }
+
+ /* set dbopaque so that lookup will not proceed after this branch */
+ if (!err)
+ set_dbopaque(dentry, bindex);
+
+out:
+ kfree(name);
+ return err;
+}
+
+/* This is a helper function for rename, which ends up with hosed over dentries
+ * when it needs to revert. */
+int unionfs_refresh_hidden_dentry(struct dentry *dentry, int bindex)
+{
+ struct dentry *hidden_dentry;
+ struct dentry *hidden_parent;
+ int err = 0;
+
+ verify_locked(dentry);
+
+ lock_dentry(dentry->d_parent);
+ hidden_parent = dtohd_index(dentry->d_parent, bindex);
+ unlock_dentry(dentry->d_parent);
+
+ BUG_ON(!S_ISDIR(hidden_parent->d_inode->i_mode));
+
+ hidden_dentry =
+ lookup_one_len(dentry->d_name.name, hidden_parent,
+ dentry->d_name.len);
+ if (IS_ERR(hidden_dentry)) {
+ err = PTR_ERR(hidden_dentry);
+ goto out;
+ }
+
+ if (dtohd_index(dentry, bindex))
+ dput(dtohd_index(dentry, bindex));
+ if (itohi_index(dentry->d_inode, bindex)) {
+ iput(itohi_index(dentry->d_inode, bindex));
+ set_itohi_index(dentry->d_inode, bindex, NULL);
+ }
+ if (!hidden_dentry->d_inode) {
+ dput(hidden_dentry);
+ set_dtohd_index(dentry, bindex, NULL);
+ } else {
+ set_dtohd_index(dentry, bindex, hidden_dentry);
+ set_itohi_index(dentry->d_inode, bindex,
+ igrab(hidden_dentry->d_inode));
+ }
+
+out:
+ return err;
+}
+
+int make_dir_opaque(struct dentry *dentry, int bindex)
+{
+ int err = 0;
+ struct dentry *hidden_dentry, *diropq;
+ struct inode *hidden_dir;
+
+ hidden_dentry = dtohd_index(dentry, bindex);
+ hidden_dir = hidden_dentry->d_inode;
+ BUG_ON(!S_ISDIR(dentry->d_inode->i_mode)
+ || !S_ISDIR(hidden_dir->i_mode));
+
+ mutex_lock(&hidden_dir->i_mutex);
+ diropq = lookup_one_len(UNIONFS_DIR_OPAQUE, hidden_dentry,
+ sizeof(UNIONFS_DIR_OPAQUE) - 1);
+ if (IS_ERR(diropq)) {
+ err = PTR_ERR(diropq);
+ goto out;
+ }
+
+ if (!diropq->d_inode)
+ err = vfs_create(hidden_dir, diropq, S_IRUGO, NULL);
+ if (!err)
+ set_dbopaque(dentry, bindex);
+
+ dput(diropq);
+
+out:
+ mutex_unlock(&hidden_dir->i_mutex);
+ return err;
+}
+
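
A minimal userspace sketch of the branch walk in create_whiteout(): start at the requested branch, and while creation fails with an error that another branch might not share, move one branch to the left. The struct branch model, create_in_branch() and is_copyup_err() are hypothetical; in particular, treating only -EROFS as retryable is an assumption made here, since the definition of IS_COPYUP_ERR is not part of this patch. The sketch also returns the branch index on success, whereas the patch returns 0 and records the branch with set_dbopaque().

```c
#include <assert.h>
#include <errno.h>

struct branch {
	int read_only;		/* branch cannot be written to */
	int has_whiteout;	/* set once a whiteout was created here */
};

/* Assumption: only a read-only failure justifies trying another branch. */
static int is_copyup_err(int err)
{
	return err == -EROFS;
}

/* Hypothetical stand-in for the vfs_create() of the whiteout file. */
static int create_in_branch(struct branch *b)
{
	if (b->read_only)
		return -EROFS;
	b->has_whiteout = 1;
	return 0;
}

/*
 * Try branch 'start' first, then walk left toward branch 0.
 * Returns the index of the branch that now holds the whiteout,
 * or a negative errno if no branch could take it.
 */
static int create_whiteout_model(struct branch *branches, int start)
{
	int err = -EINVAL;
	int bindex;

	for (bindex = start; bindex >= 0; bindex--) {
		err = create_in_branch(&branches[bindex]);
		if (!err)
			return bindex;
		if (!is_copyup_err(err))
			break;	/* a different branch would not help */
	}
	return err;
}
```

Walking left means walking toward higher-priority branches, so a whiteout created during the fallback still hides the same name in every branch to its right.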

2006-09-01 01:59:09

by Josef 'Jeff' Sipek

Subject: [PATCH 18/22][RFC] Unionfs: Superblock operations

From: Josef "Jeff" Sipek <[email protected]>

This patch contains the superblock operations for Unionfs.

Signed-off-by: Josef "Jeff" Sipek <[email protected]>
Signed-off-by: David Quigley <[email protected]>
Signed-off-by: Erez Zadok <[email protected]>

---

fs/unionfs/super.c | 352 +++++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 352 insertions(+)

diff -Nur -x linux-2.6-git/Documentation/dontdiff linux-2.6-git/fs/unionfs/super.c linux-2.6-git-unionfs/fs/unionfs/super.c
--- linux-2.6-git/fs/unionfs/super.c 1969-12-31 19:00:00.000000000 -0500
+++ linux-2.6-git-unionfs/fs/unionfs/super.c 2006-08-31 19:04:01.000000000 -0400
@@ -0,0 +1,352 @@
+/*
+ * Copyright (c) 2003-2006 Erez Zadok
+ * Copyright (c) 2003-2006 Charles P. Wright
+ * Copyright (c) 2005-2006 Josef 'Jeff' Sipek
+ * Copyright (c) 2005-2006 Junjiro Okajima
+ * Copyright (c) 2005 Arun M. Krishnakumar
+ * Copyright (c) 2004-2006 David P. Quigley
+ * Copyright (c) 2003-2004 Mohammad Nayyer Zubair
+ * Copyright (c) 2003 Puja Gupta
+ * Copyright (c) 2003 Harikesavan Krishnan
+ * Copyright (c) 2003-2006 Stony Brook University
+ * Copyright (c) 2003-2006 The Research Foundation of State University of New York
+ *
+ * For specific licensing information, see the COPYING file distributed with
+ * this package.
+ *
+ * This Copyright notice must be kept intact and distributed with all sources.
+ */
+
+#include "union.h"
+
+/* The inode cache is used with alloc_inode for both our inode info and the
+ * vfs inode. */
+static kmem_cache_t *unionfs_inode_cachep;
+
+static void unionfs_read_inode(struct inode *inode)
+{
+ static struct address_space_operations unionfs_empty_aops;
+ int size;
+
+ if (!itopd(inode)) {
+ printk(KERN_ERR "No kernel memory when allocating inode "
+ "private data!\n");
+ BUG();
+ }
+
+ memset(itopd(inode), 0, sizeof(struct unionfs_inode_info));
+ itopd(inode)->b_start = -1;
+ itopd(inode)->b_end = -1;
+ atomic_set(&itopd(inode)->uii_generation,
+ atomic_read(&stopd(inode->i_sb)->usi_generation));
+ itopd(inode)->uii_rdlock = SPIN_LOCK_UNLOCKED;
+ itopd(inode)->uii_rdcount = 1;
+ itopd(inode)->uii_hashsize = -1;
+ INIT_LIST_HEAD(&itopd(inode)->uii_readdircache);
+
+ size = sbmax(inode->i_sb) * sizeof(struct inode *);
+ itohi_ptr(inode) = kzalloc(size, GFP_KERNEL);
+ if (!itohi_ptr(inode)) {
+ printk(KERN_ERR "No kernel memory when allocating lower-"
+ "pointer array!\n");
+ BUG();
+ }
+
+ inode->i_version++;
+ inode->i_op = &unionfs_main_iops;
+ inode->i_fop = &unionfs_main_fops;
+
+ /* I don't think ->a_ops is ever allowed to be NULL */
+ inode->i_mapping->a_ops = &unionfs_empty_aops;
+}
+
+static void unionfs_put_inode(struct inode *inode)
+{
+ /*
+ * This is really funky stuff:
+ * Basically, if i_count == 1, iput will then decrement it and this
+ * inode will be destroyed. It is currently holding a reference to the
+ * hidden inode. Therefore, it needs to release that reference by
+ * calling iput on the hidden inode. iput() _will_ do it for us (by
+ * calling our clear_inode), but _only_ if i_nlink == 0. The problem
+ * is, NFS keeps i_nlink == 1 for silly_rename'd files. So we must force
+ * our i_nlink to 0 here to trick iput() into calling our clear_inode.
+ */
+
+ if (atomic_read(&inode->i_count) == 1)
+ inode->i_nlink = 0;
+}
+
+/*
+ * we now define delete_inode, because there are two VFS paths that may
+ * destroy an inode: one of them calls clear inode before doing everything
+ * else that's needed, and the other is fine. This way we truncate the inode
+ * size (and its pages) and then clear our own inode, which will do an iput
+ * on our and the lower inode.
+ */
+static void unionfs_delete_inode(struct inode *inode)
+{
+ inode->i_size = 0; /* every f/s seems to do that */
+
+ clear_inode(inode);
+}
+
+/* final actions when unmounting a file system */
+static void unionfs_put_super(struct super_block *sb)
+{
+ int bindex, bstart, bend;
+ struct unionfs_sb_info *spd;
+
+ if ((spd = stopd(sb))) {
+ bstart = sbstart(sb);
+ bend = sbend(sb);
+ for (bindex = bstart; bindex <= bend; bindex++)
+ mntput(stohiddenmnt_index(sb, bindex));
+
+ /* Make sure we have no leaks of branchget/branchput. */
+ for (bindex = bstart; bindex <= bend; bindex++)
+ BUG_ON(branch_count(sb, bindex) != 0);
+
+ kfree(spd->usi_data);
+ kfree(spd);
+ stopd_lhs(sb) = NULL;
+ }
+}
+
+/* Since people use this to answer the "How big of a file can I write?"
+ * question, we report the size of the highest priority branch as the size of
+ * the union. */
+static int unionfs_statfs(struct dentry *dentry, struct kstatfs *buf)
+{
+ int err = 0;
+ struct super_block *sb, *hidden_sb;
+
+ sb = dentry->d_sb;
+
+ hidden_sb = stohs_index(sb, sbstart(sb));
+ err = vfs_statfs(hidden_sb->s_root, buf);
+
+ buf->f_type = UNIONFS_SUPER_MAGIC;
+ buf->f_namelen -= UNIONFS_WHLEN;
+
+ memset(&buf->f_fsid, 0, sizeof(__kernel_fsid_t));
+ memset(&buf->f_spare, 0, sizeof(buf->f_spare));
+
+ return err;
+}
+
+/* We don't support a standard text remount. Eventually it would be nice to
+ * support a full-on remount, so that you can have all of the directories
+ * change at once, but that would require some pretty complicated matching
+ * code. */
+static int unionfs_remount_fs(struct super_block *sb, int *flags, char *data)
+{
+ return -ENOSYS;
+}
+
+/*
+ * Called by iput() when the inode reference count reached zero
+ * and the inode is not hashed anywhere. Used to clear anything
+ * that needs to be, before the inode is completely destroyed and put
+ * on the inode free list.
+ */
+static void unionfs_clear_inode(struct inode *inode)
+{
+ int bindex, bstart, bend;
+ struct inode *hidden_inode;
+ struct list_head *pos, *n;
+ struct unionfs_dir_state *rdstate;
+
+ list_for_each_safe(pos, n, &itopd(inode)->uii_readdircache) {
+ rdstate = list_entry(pos, struct unionfs_dir_state, uds_cache);
+ list_del(&rdstate->uds_cache);
+ free_rdstate(rdstate);
+ }
+
+ /* Decrement a reference to a hidden_inode, which was incremented
+ * by our read_inode when it was created initially. */
+ bstart = ibstart(inode);
+ bend = ibend(inode);
+ if (bstart >= 0) {
+ for (bindex = bstart; bindex <= bend; bindex++) {
+ hidden_inode = itohi_index(inode, bindex);
+ if (!hidden_inode)
+ continue;
+ iput(hidden_inode);
+ }
+ }
+
+ kfree(itohi_ptr(inode));
+ itohi_ptr(inode) = NULL;
+}
+
+static struct inode *unionfs_alloc_inode(struct super_block *sb)
+{
+ struct unionfs_inode_container *c;
+
+ c = (struct unionfs_inode_container *)
+ kmem_cache_alloc(unionfs_inode_cachep, SLAB_KERNEL);
+ if (!c) {
+ return NULL;
+ }
+
+ memset(&c->info, 0, sizeof(c->info));
+
+ c->vfs_inode.i_version = 1;
+ return &c->vfs_inode;
+}
+
+static void unionfs_destroy_inode(struct inode *inode)
+{
+ kmem_cache_free(unionfs_inode_cachep, itopd(inode));
+}
+
+static void init_once(void *v, kmem_cache_t * cachep, unsigned long flags)
+{
+ struct unionfs_inode_container *c = (struct unionfs_inode_container *)v;
+
+ if ((flags & (SLAB_CTOR_VERIFY | SLAB_CTOR_CONSTRUCTOR)) ==
+ SLAB_CTOR_CONSTRUCTOR)
+ inode_init_once(&c->vfs_inode);
+}
+
+int init_inode_cache(void)
+{
+ int err = 0;
+
+ unionfs_inode_cachep =
+ kmem_cache_create("unionfs_inode_cache",
+ sizeof(struct unionfs_inode_container), 0,
+ SLAB_RECLAIM_ACCOUNT, init_once, NULL);
+ if (!unionfs_inode_cachep)
+ err = -ENOMEM;
+ return err;
+}
+
+void destroy_inode_cache(void)
+{
+ if (!unionfs_inode_cachep)
+ goto out;
+ if (kmem_cache_destroy(unionfs_inode_cachep))
+ printk(KERN_ERR
+ "unionfs_inode_cache: not all structures were freed\n");
+out:
+ return;
+}
+
+/* Called when we have a dirty inode. Here we only throw out the parts of
+ * our readdir list that are too old.
+ */
+static int unionfs_write_inode(struct inode *inode, int sync)
+{
+ struct list_head *pos, *n;
+ struct unionfs_dir_state *rdstate;
+
+ spin_lock(&itopd(inode)->uii_rdlock);
+ list_for_each_safe(pos, n, &itopd(inode)->uii_readdircache) {
+ rdstate = list_entry(pos, struct unionfs_dir_state, uds_cache);
+ /* We keep this list in LRU order. */
+ if ((rdstate->uds_access + RDCACHE_JIFFIES) > jiffies)
+ break;
+ itopd(inode)->uii_rdcount--;
+ list_del(&rdstate->uds_cache);
+ free_rdstate(rdstate);
+ }
+ spin_unlock(&itopd(inode)->uii_rdlock);
+
+ return 0;
+}
+
+/*
+ * Used only in nfs, to kill any pending RPC tasks, so that subsequent
+ * code can actually succeed and won't leave tasks that need handling.
+ *
+ * PS. I wonder if this is somehow useful to undo damage that was
+ * left in the kernel after a user level file server (such as amd)
+ * dies.
+ */
+static void unionfs_umount_begin(struct vfsmount *mnt, int flags)
+{
+ struct super_block *sb, *hidden_sb;
+ struct vfsmount *hidden_mnt;
+ int bindex, bstart, bend;
+
+ if (!(flags & MNT_FORCE))
+ /* we are not being MNT_FORCEd, therefore we should emulate old
+ * behaviour
+ */
+ return;
+
+ sb = mnt->mnt_sb;
+
+ bstart = sbstart(sb);
+ bend = sbend(sb);
+ for (bindex = bstart; bindex <= bend; bindex++) {
+ hidden_mnt = stohiddenmnt_index(sb, bindex);
+ hidden_sb = stohs_index(sb, bindex);
+
+ if (hidden_mnt && hidden_sb && hidden_sb->s_op &&
+ hidden_sb->s_op->umount_begin)
+ hidden_sb->s_op->umount_begin(hidden_mnt, flags);
+ }
+}
+
+static int unionfs_show_options(struct seq_file *m, struct vfsmount *mnt)
+{
+ struct super_block *sb = mnt->mnt_sb;
+ int ret = 0;
+ char *tmp_page;
+ char *path;
+ int bindex, bstart, bend;
+ int perms;
+
+ lock_dentry(sb->s_root);
+
+ tmp_page = (char*) __get_free_page(GFP_KERNEL);
+ if (!tmp_page) {
+ ret = -ENOMEM;
+ goto out;
+ }
+
+ bindex = bstart = sbstart(sb);
+ bend = sbend(sb);
+
+ seq_printf(m, ",dirs=");
+ for (bindex = bstart; bindex <= bend; bindex++) {
+ path = d_path(dtohd_index(sb->s_root, bindex),
+ stohiddenmnt_index(sb, bindex), tmp_page,
+ PAGE_SIZE);
+ perms = branchperms(sb, bindex);
+
+ seq_printf(m, "%s=%s", path,
+ perms & MAY_WRITE ? "rw" :
+ perms & MAY_NFSRO ? "nfsro" : "ro");
+ if (bindex != bend) {
+ seq_printf(m, ":");
+ }
+ }
+
+out:
+ if (tmp_page)
+ free_page((unsigned long) tmp_page);
+
+ unlock_dentry(sb->s_root);
+
+ return ret;
+}
+
+struct super_operations unionfs_sops = {
+ .read_inode = unionfs_read_inode,
+ .put_inode = unionfs_put_inode,
+ .delete_inode = unionfs_delete_inode,
+ .put_super = unionfs_put_super,
+ .statfs = unionfs_statfs,
+ .remount_fs = unionfs_remount_fs,
+ .clear_inode = unionfs_clear_inode,
+ .umount_begin = unionfs_umount_begin,
+ .show_options = unionfs_show_options,
+ .write_inode = unionfs_write_inode,
+ .alloc_inode = unionfs_alloc_inode,
+ .destroy_inode = unionfs_destroy_inode,
+};
+
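
A minimal userspace sketch of the option string that unionfs_show_options() emits: ",dirs=" followed by one "path=mode" pair per branch, with ":" between pairs. The struct branch model and format_dirs() are hypothetical, and PERM_WRITE and PERM_NFSRO are made-up stand-ins for the MAY_WRITE and MAY_NFSRO bits tested in the patch. The kernel code writes into a seq_file; the sketch writes into a caller-supplied buffer instead.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define PERM_WRITE 0x02	/* hypothetical stand-in for MAY_WRITE */
#define PERM_NFSRO 0x10	/* hypothetical stand-in for MAY_NFSRO */

struct branch {
	const char *path;
	int perms;
};

/* Same precedence as the patch: rw wins, then nfsro, else ro. */
static const char *perm_name(int perms)
{
	if (perms & PERM_WRITE)
		return "rw";
	if (perms & PERM_NFSRO)
		return "nfsro";
	return "ro";
}

/*
 * Write ",dirs=<path>=<mode>:<path>=<mode>..." into buf.
 * Returns 0 on success, -1 if buf is too small.
 */
static int format_dirs(char *buf, size_t size,
		       const struct branch *b, int count)
{
	size_t used;
	int i, n;

	n = snprintf(buf, size, ",dirs=");
	if (n < 0 || (size_t)n >= size)
		return -1;
	used = (size_t)n;

	for (i = 0; i < count; i++) {
		/* No separator after the last branch. */
		n = snprintf(buf + used, size - used, "%s=%s%s",
			     b[i].path, perm_name(b[i].perms),
			     i != count - 1 ? ":" : "");
		if (n < 0 || (size_t)n >= size - used)
			return -1;
		used += (size_t)n;
	}
	return 0;
}
```

Branches are listed left to right in priority order, so the first path in the string is the branch that statfs reports on.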

2006-09-01 01:59:59

by Josef 'Jeff' Sipek

Subject: [PATCH 19/22][RFC] Unionfs: Helper macros/inlines

From: Josef "Jeff" Sipek <[email protected]>

This patch contains many macros and inline functions used throughout Unionfs.

Signed-off-by: Josef "Jeff" Sipek <[email protected]>
Signed-off-by: David Quigley <[email protected]>
Signed-off-by: Erez Zadok <[email protected]>

---

fs/unionfs/fanout.h | 188 ++++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 188 insertions(+)

diff -Nur -x linux-2.6-git/Documentation/dontdiff linux-2.6-git/fs/unionfs/fanout.h linux-2.6-git-unionfs/fs/unionfs/fanout.h
--- linux-2.6-git/fs/unionfs/fanout.h 1969-12-31 19:00:00.000000000 -0500
+++ linux-2.6-git-unionfs/fs/unionfs/fanout.h 2006-08-31 19:04:00.000000000 -0400
@@ -0,0 +1,188 @@
+/*
+ * Copyright (c) 2003-2006 Erez Zadok
+ * Copyright (c) 2003-2006 Charles P. Wright
+ * Copyright (c) 2005-2006 Josef Sipek
+ * Copyright (c) 2005 Arun M. Krishnakumar
+ * Copyright (c) 2004-2006 David P. Quigley
+ * Copyright (c) 2003-2004 Mohammad Nayyer Zubair
+ * Copyright (c) 2003 Puja Gupta
+ * Copyright (c) 2003 Harikesavan Krishnan
+ * Copyright (c) 2003-2006 Stony Brook University
+ * Copyright (c) 2003-2006 The Research Foundation of State University of New York
+ *
+ * For specific licensing information, see the COPYING file distributed with
+ * this package.
+ *
+ * This Copyright notice must be kept intact and distributed with all sources.
+ */
+
+#ifndef _FANOUT_H_
+#define _FANOUT_H_
+
+/* Inode to private data */
+static inline struct unionfs_inode_info *itopd(const struct inode *inode)
+{
+ return
+ &(container_of(inode, struct unionfs_inode_container, vfs_inode)->
+ info);
+}
+
+#define itohi_ptr(ino) (itopd(ino)->uii_inode)
+#define ibstart(ino) (itopd(ino)->b_start)
+#define ibend(ino) (itopd(ino)->b_end)
+
+/* Superblock to private data */
+#define stopd(super) ((struct unionfs_sb_info *)(super)->s_fs_info)
+#define stopd_lhs(super) ((super)->s_fs_info)
+#define sbstart(sb) 0
+#define sbend(sb) (stopd(sb)->b_end)
+#define sbmax(sb) (stopd(sb)->b_end + 1)
+
+/* File to private Data */
+#define ftopd(file) ((struct unionfs_file_info *)((file)->private_data))
+#define ftopd_lhs(file) ((file)->private_data)
+#define ftohf_ptr(file) (ftopd(file)->ufi_file)
+#define fbstart(file) (ftopd(file)->b_start)
+#define fbend(file) (ftopd(file)->b_end)
+
+/* File to hidden file. */
+static inline struct file *ftohf(struct file *f)
+{
+ return ftopd(f)->ufi_file[fbstart(f)];
+}
+
+static inline struct file *ftohf_index(const struct file *f, int index)
+{
+ return ftopd(f)->ufi_file[index];
+}
+
+static inline void set_ftohf_index(struct file *f, int index, struct file *val)
+{
+ ftopd(f)->ufi_file[index] = val;
+}
+
+static inline void set_ftohf(struct file *f, struct file *val)
+{
+ ftopd(f)->ufi_file[fbstart(f)] = val;
+}
+
+/* Inode to hidden inode. */
+static inline struct inode *itohi(const struct inode *i)
+{
+ return itopd(i)->uii_inode[ibstart(i)];
+}
+
+static inline struct inode *itohi_index(const struct inode *i, int index)
+{
+ return itopd(i)->uii_inode[index];
+}
+
+static inline void set_itohi_index(struct inode *i, int index,
+ struct inode *val)
+{
+ itopd(i)->uii_inode[index] = val;
+}
+
+static inline void set_itohi(struct inode *i, struct inode *val)
+{
+ itopd(i)->uii_inode[ibstart(i)] = val;
+}
+
+/* Superblock to hidden superblock. */
+static inline struct super_block *stohs(const struct super_block *o)
+{
+ return stopd(o)->usi_data[sbstart(o)].sb;
+}
+
+static inline struct super_block *stohs_index(const struct super_block *o, int index)
+{
+ return stopd(o)->usi_data[index].sb;
+}
+
+static inline void set_stohs_index(struct super_block *o, int index,
+ struct super_block *val)
+{
+ stopd(o)->usi_data[index].sb = val;
+}
+
+static inline void set_stohs(struct super_block *o, struct super_block *val)
+{
+ stopd(o)->usi_data[sbstart(o)].sb = val;
+}
+
+/* Super to hidden mount. */
+static inline struct vfsmount *stohiddenmnt_index(struct super_block *o,
+ int index)
+{
+ return stopd(o)->usi_data[index].hidden_mnt;
+}
+
+static inline void set_stohiddenmnt_index(struct super_block *o, int index,
+ struct vfsmount *val)
+{
+ stopd(o)->usi_data[index].hidden_mnt = val;
+}
+
+/* Branch count macros. */
+static inline int branch_count(struct super_block *o, int index)
+{
+ return atomic_read(&stopd(o)->usi_data[index].sbcount);
+}
+
+static inline void set_branch_count(struct super_block *o, int index, int val)
+{
+ atomic_set(&stopd(o)->usi_data[index].sbcount, val);
+}
+
+static inline void branchget(struct super_block *o, int index)
+{
+ atomic_inc(&stopd(o)->usi_data[index].sbcount);
+}
+
+static inline void branchput(struct super_block *o, int index)
+{
+ atomic_dec(&stopd(o)->usi_data[index].sbcount);
+}
+
+/* Dentry macros */
+static inline struct unionfs_dentry_info *dtopd(const struct dentry *dent)
+{
+ return (struct unionfs_dentry_info *)dent->d_fsdata;
+}
+
+#define dtopd_lhs(dent) ((dent)->d_fsdata)
+#define dtopd_nocheck(dent) dtopd(dent)
+#define dbstart(dent) (dtopd(dent)->udi_bstart)
+#define set_dbstart(dent, val) do { dtopd(dent)->udi_bstart = (val); } while (0)
+#define dbend(dent) (dtopd(dent)->udi_bend)
+#define set_dbend(dent, val) do { dtopd(dent)->udi_bend = (val); } while (0)
+#define dbopaque(dent) (dtopd(dent)->udi_bopaque)
+#define set_dbopaque(dent, val) do { dtopd(dent)->udi_bopaque = (val); } while (0)
+
+static inline void set_dtohd_index(struct dentry *dent, int index,
+ struct dentry *val)
+{
+ dtopd(dent)->udi_dentry[index] = val;
+}
+
+static inline struct dentry *dtohd_index(const struct dentry *dent, int index)
+{
+ return dtopd(dent)->udi_dentry[index];
+}
+
+static inline struct dentry *dtohd(const struct dentry *dent)
+{
+ return dtopd(dent)->udi_dentry[dbstart(dent)];
+}
+
+#define set_dtohd_index_nocheck(dent, index, val) set_dtohd_index(dent, index, val)
+#define dtohd_index_nocheck(dent, index) dtohd_index(dent, index)
+
+#define dtohd_ptr(dent) (dtopd_nocheck(dent)->udi_dentry)
+
+/* Macros for locking a dentry. */
+#define lock_dentry(d) down(&dtopd(d)->udi_sem)
+#define unlock_dentry(d) up(&dtopd(d)->udi_sem)
+#define verify_locked(d)
+
+#endif /* _FANOUT_H_ */
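
A minimal userspace sketch of the itopd() idea from fanout.h: the filesystem's private data and the generic inode are allocated together in one container, and the private data is recovered from an inode pointer with container_of(). The container_of() macro below is a plain userspace rewrite of the kernel macro, and struct vfs_inode, struct fs_info and itopd_model() are hypothetical stand-ins for struct inode, struct unionfs_inode_info and itopd().

```c
#include <assert.h>
#include <stddef.h>

/* Userspace version of the kernel's container_of(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Stand-in for struct inode. */
struct vfs_inode {
	unsigned long i_ino;
};

/* Stand-in for the filesystem's per-inode private data. */
struct fs_info {
	int b_start;
	int b_end;
};

/* Private data and the generic inode live in one allocation. */
struct inode_container {
	struct fs_info info;
	struct vfs_inode vfs_inode;
};

/* Inode to private data, in the style of itopd(). */
static struct fs_info *itopd_model(struct vfs_inode *inode)
{
	return &container_of(inode, struct inode_container, vfs_inode)->info;
}
```

Because the private data is the first member, its address is also the address of the whole container, so a pointer obtained this way can be handed to the allocator when the container is freed.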

2006-09-01 02:01:23

by Josef 'Jeff' Sipek

Subject: [PATCH 20/22][RFC] Unionfs: Internal include file

From: Josef "Jeff" Sipek <[email protected]>

This patch contains an internal Unionfs include file. The file is used only
by kernel code, and is therefore separate from
include/linux/union_fs.h.

Signed-off-by: Josef "Jeff" Sipek <[email protected]>
Signed-off-by: David Quigley <[email protected]>
Signed-off-by: Erez Zadok <[email protected]>

---

fs/unionfs/union.h | 544 +++++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 544 insertions(+)

diff -Nur -x linux-2.6-git/Documentation/dontdiff linux-2.6-git/fs/unionfs/union.h linux-2.6-git-unionfs/fs/unionfs/union.h
--- linux-2.6-git/fs/unionfs/union.h 1969-12-31 19:00:00.000000000 -0500
+++ linux-2.6-git-unionfs/fs/unionfs/union.h 2006-08-31 19:04:01.000000000 -0400
@@ -0,0 +1,544 @@
+/*
+ * Copyright (c) 2003-2006 Erez Zadok
+ * Copyright (c) 2003-2006 Charles P. Wright
+ * Copyright (c) 2005-2006 Josef Sipek
+ * Copyright (c) 2005 Arun M. Krishnakumar
+ * Copyright (c) 2004-2006 David P. Quigley
+ * Copyright (c) 2003-2004 Mohammad Nayyer Zubair
+ * Copyright (c) 2003 Puja Gupta
+ * Copyright (c) 2003 Harikesavan Krishnan
+ * Copyright (c) 2003-2006 Stony Brook University
+ * Copyright (c) 2003-2006 The Research Foundation of State University of New York
+ *
+ * For specific licensing information, see the COPYING file distributed with
+ * this package.
+ *
+ * This Copyright notice must be kept intact and distributed with all sources.
+ */
+
+#ifndef _UNION_H_
+#define _UNION_H_
+
+#include <asm/mman.h>
+#include <asm/system.h>
+#include <linux/dcache.h>
+#include <linux/file.h>
+#include <linux/list.h>
+#include <linux/mm.h>
+#include <linux/module.h>
+#include <linux/mount.h>
+#include <linux/namei.h>
+#include <linux/page-flags.h>
+#include <linux/pagemap.h>
+#include <linux/poll.h>
+#include <linux/security.h>
+#include <linux/seq_file.h>
+#include <linux/slab.h>
+#include <linux/spinlock.h>
+#include <linux/statfs.h>
+#include <linux/string.h>
+#include <linux/vmalloc.h>
+#include <linux/writeback.h>
+
+#include <linux/union_fs.h>
+/* the file system name */
+#define UNIONFS_NAME "unionfs"
+
+/* unionfs file systems superblock magic */
+#define UNIONFS_SUPER_MAGIC 0xf15f083d
+
+/* unionfs root inode number */
+#define UNIONFS_ROOT_INO 1
+
+/* Mount time flags */
+#define MOUNT_FLAG(sb) (stopd(sb)->usi_mount_flag)
+
+/* number of characters while generating unique temporary file names */
+#define UNIONFS_TMPNAM_LEN 12
+
+/* number of times we try to get a unique temporary file name */
+#define GET_TMPNAM_MAX_RETRY 5
+
+/* Operations vectors defined in specific files. */
+extern struct file_operations unionfs_main_fops;
+extern struct file_operations unionfs_dir_fops;
+extern struct inode_operations unionfs_main_iops;
+extern struct inode_operations unionfs_dir_iops;
+extern struct inode_operations unionfs_symlink_iops;
+extern struct super_operations unionfs_sops;
+extern struct dentry_operations unionfs_dops;
+
+/* How long should an entry be allowed to persist */
+#define RDCACHE_JIFFIES (5*HZ)
+
+/* file private data. */
+struct unionfs_file_info {
+ int b_start;
+ int b_end;
+ atomic_t ufi_generation;
+
+ struct unionfs_dir_state *rdstate;
+ struct file **ufi_file;
+};
+
+/* unionfs inode data in memory */
+struct unionfs_inode_info {
+ int b_start;
+ int b_end;
+ atomic_t uii_generation;
+ int uii_stale;
+ /* Stuff for readdir over NFS. */
+ spinlock_t uii_rdlock;
+ struct list_head uii_readdircache;
+ int uii_rdcount;
+ int uii_hashsize;
+ int uii_cookie; /* The hidden inodes */
+ struct inode **uii_inode;
+ /* to keep track of reads/writes for unlinks before closes */
+ atomic_t uii_totalopens;
+};
+
+struct unionfs_inode_container {
+ struct unionfs_inode_info info;
+ struct inode vfs_inode;
+};
+
+/* unionfs dentry data in memory */
+struct unionfs_dentry_info {
+ /* The semaphore is used to lock the dentry as soon as we get into a
+ * unionfs function from the VFS. Our lock ordering is that children
+ * go before their parents. */
+ struct semaphore udi_sem;
+ int udi_bstart;
+ int udi_bend;
+ int udi_bopaque;
+ int udi_bcount;
+ atomic_t udi_generation;
+ struct dentry **udi_dentry;
+};
+
+/* These are the pointers to our various objects. */
+struct unionfs_usi_data {
+ struct super_block *sb;
+ struct vfsmount *hidden_mnt;
+ atomic_t sbcount;
+ int branchperms;
+};
+
+/* unionfs super-block data in memory */
+struct unionfs_sb_info {
+ int b_end;
+
+ atomic_t usi_generation;
+ unsigned long usi_mount_flag;
+ struct rw_semaphore usi_rwsem;
+
+ struct unionfs_usi_data *usi_data;
+};
+
+/*
+ * structure for making the linked list of entries by readdir on left branch
+ * to compare with entries on right branch
+ */
+struct filldir_node {
+ struct list_head file_list; // list for directory entries
+ char *name; // name entry
+ int hash; // name hash
+ int namelen; // name len since name is not 0 terminated
+ int bindex; // we can check for duplicate whiteouts and files in the same branch in order to return -EIO.
+ int whiteout; // is this a whiteout entry?
+ char iname[DNAME_INLINE_LEN_MIN]; // Inline name, so we don't need to separately kmalloc small ones
+};
+
+/* Directory hash table. */
+struct unionfs_dir_state {
+ unsigned int uds_cookie; /* The cookie, which is based off of uii_rdversion */
+ unsigned int uds_offset; /* The entry we have returned. */
+ int uds_bindex;
+ loff_t uds_dirpos; /* The offset within the lower level directory. */
+ int uds_size; /* How big is the hash table? */
+ int uds_hashentries; /* How many entries have been inserted? */
+ unsigned long uds_access;
+ /* This cache list is used when the inode keeps us around. */
+ struct list_head uds_cache;
+ struct list_head uds_list[0];
+};
+
+/* include miscellaneous macros */
+#include "fanout.h"
+#include "sioq.h"
+
+/* Cache creation/deletion routines. */
+void destroy_filldir_cache(void);
+int init_filldir_cache(void);
+int init_inode_cache(void);
+void destroy_inode_cache(void);
+int init_dentry_cache(void);
+void destroy_dentry_cache(void);
+
+/* Initialize and free readdir-specific state. */
+int init_rdstate(struct file *file);
+struct unionfs_dir_state *alloc_rdstate(struct inode *inode, int bindex);
+struct unionfs_dir_state *find_rdstate(struct inode *inode, loff_t fpos);
+void free_rdstate(struct unionfs_dir_state *state);
+int add_filldir_node(struct unionfs_dir_state *rdstate, const char *name,
+ int namelen, int bindex, int whiteout);
+struct filldir_node *find_filldir_node(struct unionfs_dir_state *rdstate,
+ const char *name, int namelen);
+
+struct dentry **alloc_new_dentries(int objs);
+struct unionfs_usi_data *alloc_new_data(int objs);
+
+/* We can only use 32-bits of offset for rdstate --- blech! */
+#define DIREOF (0xfffff)
+#define RDOFFBITS 20 /* This is the number of bits in DIREOF. */
+#define MAXRDCOOKIE (0xfff)
+/* Turn an rdstate into an offset. */
+static inline off_t rdstate2offset(struct unionfs_dir_state *buf)
+{
+ off_t tmp;
+ tmp = ((buf->uds_cookie & MAXRDCOOKIE) << RDOFFBITS)
+ | (buf->uds_offset & DIREOF);
+ return tmp;
+}
+
+#define unionfs_read_lock(sb) down_read(&stopd(sb)->usi_rwsem)
+#define unionfs_read_unlock(sb) up_read(&stopd(sb)->usi_rwsem)
+#define unionfs_write_lock(sb) down_write(&stopd(sb)->usi_rwsem)
+#define unionfs_write_unlock(sb) up_write(&stopd(sb)->usi_rwsem)
+
+/* The double lock function needs to go after the debugmacros, so that
+ * dtopd is defined. */
+static inline void double_lock_dentry(struct dentry *d1, struct dentry *d2)
+{
+ if (d2 < d1) {
+ struct dentry *tmp = d1;
+ d1 = d2;
+ d2 = tmp;
+ }
+ lock_dentry(d1);
+ lock_dentry(d2);
+}
+
+extern int new_dentry_private_data(struct dentry *dentry);
+void free_dentry_private_data(struct unionfs_dentry_info *udi);
+void update_bstart(struct dentry *dentry);
+#define sbt(sb) ((sb)->s_type->name)
+
+/*
+ * EXTERNALS:
+ */
+/* replicates the directory structure up to given dentry in given branch */
+extern struct dentry *create_parents(struct inode *dir, struct dentry *dentry,
+ int bindex);
+struct dentry *create_parents_named(struct inode *dir, struct dentry *dentry,
+ const char *name, int bindex);
+
+/* check if two branches overlap */
+extern int is_branch_overlap(struct dentry *dent1, struct dentry *dent2);
+
+/* partial lookup */
+extern int unionfs_partial_lookup(struct dentry *dentry);
+
+/* Pass a unionfs dentry and an index and it will try to create a whiteout in branch 'index'.
+ On error, it will proceed to a branch to the left */
+extern int create_whiteout(struct dentry *dentry, int start);
+/* copies a file from dbstart to newbindex branch */
+extern int copyup_file(struct inode *dir, struct file *file, int bstart,
+ int newbindex, loff_t size);
+extern int copyup_named_file(struct inode *dir, struct file *file,
+ char *name, int bstart, int new_bindex,
+ loff_t len);
+/* copies a dentry from dbstart to newbindex branch */
+extern int copyup_dentry(struct inode *dir, struct dentry *dentry, int bstart,
+ int new_bindex, struct file **copyup_file, loff_t len);
+extern int copyup_named_dentry(struct inode *dir, struct dentry *dentry,
+ int bstart, int new_bindex, const char *name,
+ int namelen, struct file **copyup_file,
+ loff_t len);
+
+extern int remove_whiteouts(struct dentry *dentry, struct dentry *hidden_dentry,
+ int bindex);
+
+extern int do_delete_whiteouts(struct dentry *dentry, int bindex,
+ struct unionfs_dir_state *namelist);
+
+/* Is this directory empty: 0 if it is empty, -ENOTEMPTY if not. */
+extern int check_empty(struct dentry *dentry,
+ struct unionfs_dir_state **namelist);
+/* Delete whiteouts from this directory in branch bindex. */
+extern int delete_whiteouts(struct dentry *dentry, int bindex,
+ struct unionfs_dir_state *namelist);
+
+/* Re-lookup a hidden dentry. */
+extern int unionfs_refresh_hidden_dentry(struct dentry *dentry, int bindex);
+
+extern void unionfs_reinterpose(struct dentry *this_dentry);
+extern struct super_block *unionfs_duplicate_super(struct super_block *sb);
+
+/* Locking functions. */
+extern int unionfs_setlk(struct file *file, int cmd, struct file_lock *fl);
+extern int unionfs_getlk(struct file *file, struct file_lock *fl);
+
+/* Common file operations. */
+extern int unionfs_file_revalidate(struct file *file, int willwrite);
+extern int unionfs_open(struct inode *inode, struct file *file);
+extern int unionfs_file_release(struct inode *inode, struct file *file);
+extern int unionfs_flush(struct file *file, fl_owner_t id);
+extern long unionfs_ioctl(struct file *file, unsigned int cmd,
+ unsigned long arg);
+
+/* Inode operations */
+extern int unionfs_rename(struct inode *old_dir, struct dentry *old_dentry,
+ struct inode *new_dir, struct dentry *new_dentry);
+int unionfs_unlink(struct inode *dir, struct dentry *dentry);
+int unionfs_rmdir(struct inode *dir, struct dentry *dentry);
+
+int unionfs_d_revalidate(struct dentry *dentry, struct nameidata *nd);
+
+/* The values for unionfs_interpose's flag. */
+#define INTERPOSE_DEFAULT 0
+#define INTERPOSE_LOOKUP 1
+#define INTERPOSE_REVAL 2
+#define INTERPOSE_REVAL_NEG 3
+#define INTERPOSE_PARTIAL 4
+
+extern int unionfs_interpose(struct dentry *this_dentry, struct super_block *sb,
+ int flag);
+
+/* Branch management ioctls. */
+int unionfs_ioctl_incgen(struct file *file, unsigned int cmd,
+ unsigned long arg);
+int unionfs_ioctl_queryfile(struct file *file, unsigned int cmd,
+ unsigned long arg);
+
+/* Verify that a branch is valid. */
+int check_branch(struct nameidata *nd);
+
+/* The root directory is unhashed, but isn't deleted. */
+static inline int d_deleted(struct dentry *d)
+{
+ return d_unhashed(d) && (d != d->d_sb->s_root);
+}
+
+/* returns the sum of the i_nlink values of all the underlying inodes of the passed inode */
+static inline int get_nlinks(struct inode *inode)
+{
+ int sum_nlinks = 0;
+ int dirs = 0;
+ int bindex;
+ struct inode *hidden_inode;
+
+ if (!S_ISDIR(inode->i_mode))
+ return itohi(inode)->i_nlink;
+
+ for (bindex = ibstart(inode); bindex <= ibend(inode); bindex++) {
+ hidden_inode = itohi_index(inode, bindex);
+ if (!hidden_inode || !S_ISDIR(hidden_inode->i_mode))
+ continue;
+ BUG_ON(hidden_inode->i_nlink < 0);
+
+ /* A deleted directory. */
+ if (hidden_inode->i_nlink == 0)
+ continue;
+ dirs++;
+ /* A broken directory (e.g., squashfs). */
+ if (hidden_inode->i_nlink == 1)
+ sum_nlinks += 2;
+ else
+ sum_nlinks += (hidden_inode->i_nlink - 2);
+ }
+
+ if (!dirs)
+ return 0;
+ return sum_nlinks + 2;
+}
+
+static inline void fist_copy_attr_atime(struct inode *dest,
+ const struct inode *src)
+{
+ dest->i_atime = src->i_atime;
+}
+static inline void fist_copy_attr_times(struct inode *dest,
+ const struct inode *src)
+{
+ dest->i_atime = src->i_atime;
+ dest->i_mtime = src->i_mtime;
+ dest->i_ctime = src->i_ctime;
+}
+static inline void fist_copy_attr_timesizes(struct inode *dest,
+ const struct inode *src)
+{
+ dest->i_atime = src->i_atime;
+ dest->i_mtime = src->i_mtime;
+ dest->i_ctime = src->i_ctime;
+ dest->i_size = src->i_size;
+ dest->i_blocks = src->i_blocks;
+}
+static inline void fist_copy_attr_all(struct inode *dest,
+ const struct inode *src)
+{
+ dest->i_mode = src->i_mode;
+ /* we do not need to copy if the file is a deleted file */
+ if (dest->i_nlink > 0)
+ dest->i_nlink = get_nlinks(dest);
+ dest->i_uid = src->i_uid;
+ dest->i_gid = src->i_gid;
+ dest->i_rdev = src->i_rdev;
+ dest->i_atime = src->i_atime;
+ dest->i_mtime = src->i_mtime;
+ dest->i_ctime = src->i_ctime;
+ dest->i_blksize = src->i_blksize;
+ dest->i_blkbits = src->i_blkbits;
+ dest->i_size = src->i_size;
+ dest->i_blocks = src->i_blocks;
+ dest->i_flags = src->i_flags;
+}
+
+struct dentry *unionfs_lookup_backend(struct dentry *dentry, int lookupmode);
+int is_stale_inode(struct inode *inode);
+void make_stale_inode(struct inode *inode);
+
+#define IS_SET(sb, check_flag) ((check_flag) & MOUNT_FLAG(sb))
+
+/* unionfs_permission, check if we should bypass error to facilitate copyup */
+#define IS_COPYUP_ERR(err) ((err) == -EROFS)
+
+/* unionfs_open, check if we need to copyup the file */
+#define OPEN_WRITE_FLAGS (O_WRONLY | O_RDWR | O_APPEND)
+#define IS_WRITE_FLAG(flag) ((flag) & (OPEN_WRITE_FLAGS))
+
+static inline int branchperms(struct super_block *sb, int index)
+{
+ BUG_ON(index < 0);
+
+ return stopd(sb)->usi_data[index].branchperms;
+}
+static inline int set_branchperms(struct super_block *sb, int index, int perms)
+{
+ BUG_ON(index < 0);
+
+ stopd(sb)->usi_data[index].branchperms = perms;
+
+ return perms;
+}
+
+/* Is this file on a read-only branch? */
+static inline int __is_robranch_super(struct super_block *sb, int index,
+ char *file, const char *function,
+ int line)
+{
+ int err = 0;
+
+ if (!(branchperms(sb, index) & MAY_WRITE))
+ err = -EROFS;
+
+ return err;
+}
+
+/* Is this file on a read-only branch? */
+static inline int __is_robranch_index(struct dentry *dentry, int index,
+ char *file, const char *function,
+ int line)
+{
+ int err = 0;
+ int perms;
+
+ BUG_ON(index < 0);
+
+ perms = stopd(dentry->d_sb)->usi_data[index].branchperms;
+
+ if ((!(perms & MAY_WRITE))
+ || (IS_RDONLY(dtohd_index(dentry, index)->d_inode)))
+ err = -EROFS;
+
+ return err;
+}
+static inline int __is_robranch(struct dentry *dentry, char *file,
+ const char *function, int line)
+{
+ int index;
+ int err;
+
+ index = dtopd(dentry)->udi_bstart;
+ BUG_ON(index < 0);
+
+ err = __is_robranch_index(dentry, index, file, function, line);
+
+ return err;
+}
+
+#define is_robranch(d) __is_robranch(d, __FILE__, __FUNCTION__, __LINE__)
+#define is_robranch_super(s, n) __is_robranch_super(s, n, __FILE__, __FUNCTION__, __LINE__)
+
+/* What do we use for whiteouts. */
+#define UNIONFS_WHPFX ".wh."
+#define UNIONFS_WHLEN 4
+/* If a directory contains this file, then it is opaque. We start with the
+ * .wh. flag so that it is blocked by lookup.
+ */
+#define UNIONFS_DIR_OPAQUE_NAME "__dir_opaque"
+#define UNIONFS_DIR_OPAQUE UNIONFS_WHPFX UNIONFS_DIR_OPAQUE_NAME
+
+/* construct whiteout filename */
+static inline char *alloc_whname(const char *name, int len)
+{
+ char *buf;
+
+ buf = kmalloc(len + UNIONFS_WHLEN + 1, GFP_KERNEL);
+ if (!buf)
+ return ERR_PTR(-ENOMEM);
+
+ strcpy(buf, UNIONFS_WHPFX);
+ strlcat(buf, name, len + UNIONFS_WHLEN + 1);
+
+ return buf;
+}
+
+#define VALID_MOUNT_FLAGS (0)
+
+/*
+ * MACROS:
+ */
+
+#ifndef SEEK_SET
+#define SEEK_SET 0
+#endif /* not SEEK_SET */
+
+#ifndef SEEK_CUR
+#define SEEK_CUR 1
+#endif /* not SEEK_CUR */
+
+#ifndef SEEK_END
+#define SEEK_END 2
+#endif /* not SEEK_END */
+
+#ifndef DEFAULT_POLLMASK
+#define DEFAULT_POLLMASK (POLLIN | POLLOUT | POLLRDNORM | POLLWRNORM)
+#endif
+
+/*
+ * EXTERNALS:
+ */
+
+/* These two functions are here because it is kind of daft to copy and paste the
+ * contents of the two functions to 32+ places in unionfs
+ */
+static inline struct dentry *lock_parent(struct dentry *dentry)
+{
+ struct dentry *dir = dget(dentry->d_parent);
+
+ mutex_lock(&dir->d_inode->i_mutex);
+ return dir;
+}
+
+static inline void unlock_dir(struct dentry *dir)
+{
+ mutex_unlock(&dir->d_inode->i_mutex);
+ dput(dir);
+}
+
+extern int make_dir_opaque(struct dentry *dir, int bindex);
+
+#endif /* not _UNION_H_ */
+
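
Note on the readdir offset encoding in union.h above: rdstate2offset() packs
a 12-bit cookie and a 20-bit entry offset into a single 32-bit value. The
following is a minimal userspace sketch of that packing and is not part of
the patch. The helper names pack_rdoffset(), rdoffset_cookie() and
rdoffset_entry() are hypothetical; only the three constants are taken from
the header.

```c
#include <stdint.h>

/* Constants as defined in fs/unionfs/union.h. */
#define DIREOF      0xfffffu /* mask for the 20-bit entry offset */
#define RDOFFBITS   20       /* number of bits in DIREOF */
#define MAXRDCOOKIE 0xfffu   /* mask for the 12-bit cookie */

/* Pack cookie and entry offset the same way rdstate2offset() does. */
static inline uint32_t pack_rdoffset(uint32_t cookie, uint32_t offset)
{
	return ((cookie & MAXRDCOOKIE) << RDOFFBITS) | (offset & DIREOF);
}

/* Hypothetical inverse: recover the cookie from a packed value. */
static inline uint32_t rdoffset_cookie(uint32_t packed)
{
	return (packed >> RDOFFBITS) & MAXRDCOOKIE;
}

/* Hypothetical inverse: recover the entry offset from a packed value. */
static inline uint32_t rdoffset_entry(uint32_t packed)
{
	return packed & DIREOF;
}
```

With these helpers, cookie 0x123 and offset 0x45678 pack to 0x12345678. A
cookie wider than 12 bits is silently truncated by the mask, so two distinct
cookies can map to the same packed value.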

2006-09-01 02:02:14

by Josef 'Jeff' Sipek

Subject: [PATCH 21/22][RFC] Unionfs: Unlink

From: Josef "Jeff" Sipek <[email protected]>

This patch provides unlink functionality for Unionfs.

Signed-off-by: Josef "Jeff" Sipek <[email protected]>
Signed-off-by: David Quigley <[email protected]>
Signed-off-by: Erez Zadok <[email protected]>

---

fs/unionfs/unlink.c | 179 ++++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 179 insertions(+)

diff -Nur -x linux-2.6-git/Documentation/dontdiff linux-2.6-git/fs/unionfs/unlink.c linux-2.6-git-unionfs/fs/unionfs/unlink.c
--- linux-2.6-git/fs/unionfs/unlink.c 1969-12-31 19:00:00.000000000 -0500
+++ linux-2.6-git-unionfs/fs/unionfs/unlink.c 2006-08-31 19:04:01.000000000 -0400
@@ -0,0 +1,179 @@
+/*
+ * Copyright (c) 2003-2006 Erez Zadok
+ * Copyright (c) 2003-2006 Charles P. Wright
+ * Copyright (c) 2005-2006 Josef 'Jeff' Sipek
+ * Copyright (c) 2005-2006 Junjiro Okajima
+ * Copyright (c) 2005 Arun M. Krishnakumar
+ * Copyright (c) 2004-2006 David P. Quigley
+ * Copyright (c) 2003-2004 Mohammad Nayyer Zubair
+ * Copyright (c) 2003 Puja Gupta
+ * Copyright (c) 2003 Harikesavan Krishnan
+ * Copyright (c) 2003-2006 Stony Brook University
+ * Copyright (c) 2003-2006 The Research Foundation of State University of New York
+ *
+ * For specific licensing information, see the COPYING file distributed with
+ * this package.
+ *
+ * This Copyright notice must be kept intact and distributed with all sources.
+ */
+
+#include "union.h"
+
+static int unionfs_unlink_whiteout(struct inode *dir, struct dentry *dentry)
+{
+ struct dentry *hidden_dentry;
+ struct dentry *hidden_dir_dentry;
+ int bindex;
+ int err = 0;
+
+ if ((err = unionfs_partial_lookup(dentry)))
+ goto out;
+
+ bindex = dbstart(dentry);
+
+ hidden_dentry = dtohd_index(dentry, bindex);
+ if (!hidden_dentry)
+ goto out;
+
+ hidden_dir_dentry = lock_parent(hidden_dentry);
+
+ /* avoid destroying the hidden inode if the file is in use */
+ dget(hidden_dentry);
+ if (!(err = is_robranch_super(dentry->d_sb, bindex)))
+ err = vfs_unlink(hidden_dir_dentry->d_inode, hidden_dentry);
+ dput(hidden_dentry);
+ fist_copy_attr_times(dir, hidden_dir_dentry->d_inode);
+ unlock_dir(hidden_dir_dentry);
+
+ if (err && !IS_COPYUP_ERR(err))
+ goto out;
+
+ if (err) {
+ if (dbstart(dentry) == 0)
+ goto out;
+
+ err = create_whiteout(dentry, dbstart(dentry) - 1);
+ } else if (dbopaque(dentry) != -1) {
+ /* There is a hidden lower-priority file with the same name. */
+ err = create_whiteout(dentry, dbopaque(dentry));
+ } else {
+ err = create_whiteout(dentry, dbstart(dentry));
+ }
+
+out:
+ if (!err)
+ dentry->d_inode->i_nlink--;
+
+ /* We don't want to leave negative leftover dentries for revalidate. */
+ if (!err && (dbopaque(dentry) != -1))
+ update_bstart(dentry);
+
+ return err;
+}
+
+int unionfs_unlink(struct inode *dir, struct dentry *dentry)
+{
+ int err = 0;
+
+ lock_dentry(dentry);
+
+ err = unionfs_unlink_whiteout(dir, dentry);
+ /* call d_drop so the system "forgets" about us */
+ if (!err)
+ d_drop(dentry);
+
+ unlock_dentry(dentry);
+ return err;
+}
+
+static int unionfs_rmdir_first(struct inode *dir, struct dentry *dentry,
+ struct unionfs_dir_state *namelist)
+{
+ int err;
+ struct dentry *hidden_dentry;
+ struct dentry *hidden_dir_dentry = NULL;
+
+ /* Here we need to remove whiteout entries. */
+ err = delete_whiteouts(dentry, dbstart(dentry), namelist);
+ if (err) {
+ goto out;
+ }
+
+ hidden_dentry = dtohd(dentry);
+
+ hidden_dir_dentry = lock_parent(hidden_dentry);
+
+ /* avoid destroying the hidden inode if the file is in use */
+ dget(hidden_dentry);
+ if (!(err = is_robranch(dentry))) {
+ err = vfs_rmdir(hidden_dir_dentry->d_inode, hidden_dentry);
+ }
+ dput(hidden_dentry);
+
+ fist_copy_attr_times(dir, hidden_dir_dentry->d_inode);
+ /* propagate number of hard-links */
+ dentry->d_inode->i_nlink = get_nlinks(dentry->d_inode);
+
+out:
+ if (hidden_dir_dentry) {
+ unlock_dir(hidden_dir_dentry);
+ }
+ return err;
+}
+
+int unionfs_rmdir(struct inode *dir, struct dentry *dentry)
+{
+ int err = 0;
+ struct unionfs_dir_state *namelist = NULL;
+
+ lock_dentry(dentry);
+
+ /* check if this unionfs directory is empty or not */
+ err = check_empty(dentry, &namelist);
+ if (err) {
+#if 0
+ /* vfs_rmdir(our caller) unhashed the dentry. This will recover
+ * the Unionfs inode number for the directory itself, but the
+ * children are already lost. It seems that tmpfs manages its
+ * way around this by upping the refcount on everything.
+ *
+ * Even if we do this, we still lose the inode numbers of the
+ * children. The best way to fix this is to fix the VFS (or
+ * use persistent inode maps). */
+ if (d_unhashed(dentry))
+ d_rehash(dentry);
+#endif
+ goto out;
+ }
+
+ err = unionfs_rmdir_first(dir, dentry, namelist);
+ /* create whiteout */
+ if (!err) {
+ err = create_whiteout(dentry, dbstart(dentry));
+ } else {
+ int new_err;
+
+ if (dbstart(dentry) == 0)
+ goto out;
+
+ /* exit if the error returned was NOT -EROFS */
+ if (!IS_COPYUP_ERR(err))
+ goto out;
+
+ new_err = create_whiteout(dentry, dbstart(dentry) - 1);
+ if (new_err != -EEXIST)
+ err = new_err;
+ }
+
+out:
+ /* call d_drop so the system "forgets" about us */
+ if (!err)
+ d_drop(dentry);
+
+ if (namelist)
+ free_rdstate(namelist);
+
+ unlock_dentry(dentry);
+ return err;
+}
+
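
Note on unionfs_unlink_whiteout() above: the branch that receives the
whiteout depends on the result of the lower-level unlink and on the dentry's
bstart and bopaque values. The following userspace sketch restates only that
decision, under the assumption that the code is read correctly here; it is
not part of the patch. The function whiteout_branch() and the constant
NO_WHITEOUT are hypothetical names.

```c
#include <errno.h>

#define NO_WHITEOUT (-1) /* hypothetical marker: no whiteout is created */

/*
 * unlink_err: 0, or a negative errno from unlinking on branch bstart
 * bstart:     leftmost (highest-priority) branch holding the file
 * bopaque:    value of dbopaque() for the dentry, or -1 if unset
 *
 * Returns the branch index passed to create_whiteout(), or NO_WHITEOUT
 * when the function bails out without creating one.
 */
static inline int whiteout_branch(int unlink_err, int bstart, int bopaque)
{
	/* Any error other than -EROFS is returned as is. */
	if (unlink_err && unlink_err != -EROFS)
		return NO_WHITEOUT;

	/* Read-only branch: try the next branch to the left, if any. */
	if (unlink_err)
		return bstart == 0 ? NO_WHITEOUT : bstart - 1;

	/* Unlink succeeded: use bopaque if it is set, else bstart. */
	if (bopaque != -1)
		return bopaque;
	return bstart;
}
```

For a file on read-only branch 2, the sketch returns 1; create_whiteout()
itself may then move further left on error, as its comment in union.h says.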

2006-09-01 02:02:34

by Josef 'Jeff' Sipek

Subject: [PATCH 22/22][RFC] Unionfs: Include file

From: Josef "Jeff" Sipek <[email protected]>

Global include file - can be included from userspace by utilities.

Signed-off-by: Josef "Jeff" Sipek <[email protected]>
Signed-off-by: David Quigley <[email protected]>
Signed-off-by: Erez Zadok <[email protected]>

---

include/linux/union_fs.h | 20 ++++++++++++++++++++
1 file changed, 20 insertions(+)

diff -Nur -x linux-2.6-git/Documentation/dontdiff linux-2.6-git/include/linux/union_fs.h linux-2.6-git-unionfs/include/linux/union_fs.h
--- linux-2.6-git/include/linux/union_fs.h 1969-12-31 19:00:00.000000000 -0500
+++ linux-2.6-git-unionfs/include/linux/union_fs.h 2006-08-31 19:04:04.000000000 -0400
@@ -0,0 +1,20 @@
+#ifndef _LINUX_UNION_FS_H
+#define _LINUX_UNION_FS_H
+
+#define UNIONFS_VERSION "2.0"
+/*
+ * DEFINITIONS FOR USER AND KERNEL CODE:
+ * (Note: ioctl numbers 1--9 are reserved for fistgen, the rest
+ * are generated automatically based on the user's .fist file.)
+ */
+# define UNIONFS_IOCTL_INCGEN _IOR(0x15, 11, int)
+# define UNIONFS_IOCTL_QUERYFILE _IOR(0x15, 15, int)
+
+/* We don't support normal remount, but unionctl uses it. */
+# define UNIONFS_REMOUNT_MAGIC 0x4a5a4380
+
+/* should be at least LAST_USED_UNIONFS_PERMISSION<<1 */
+#define MAY_NFSRO 16
+
+#endif /* _LINUX_UNION_FS_H */
+
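
Note on MAY_NFSRO above: the value 16 is the first bit above the standard
permission bits, which is what the "LAST_USED_UNIONFS_PERMISSION<<1" comment
asks for. The sketch below is a userspace illustration, not part of the
patch. It assumes the MAY_* values from the kernel's include/linux/fs.h of
that era (MAY_EXEC 1, MAY_WRITE 2, MAY_READ 4, MAY_APPEND 8), and
branch_write_check() is a hypothetical stand-in for the MAY_WRITE test done
by __is_robranch_super() in union.h.

```c
#include <errno.h>

/* Assumed to match include/linux/fs.h; redefined here for userspace. */
#define MAY_EXEC   1
#define MAY_WRITE  2
#define MAY_READ   4
#define MAY_APPEND 8

/* From include/linux/union_fs.h in this patch. */
#define MAY_NFSRO  16

/* Hypothetical helper: 0 if the branch is writable, -EROFS otherwise. */
static inline int branch_write_check(int perms)
{
	return (perms & MAY_WRITE) ? 0 : -EROFS;
}
```

Because MAY_NFSRO does not overlap any of the four standard bits, it can be
stored in the same branchperms word without changing the result of the
MAY_WRITE test.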

2006-09-01 03:02:48

by Ian Kent

Subject: Re: [PATCH 09/22][RFC] Unionfs: File operations

On Thu, 31 Aug 2006, Josef Sipek wrote:

> + if (err < 0)
> + goto out;
> + if (err != file->f_pos) {
> + file->f_pos = err;
> + // ION maybe this?
> + // file->f_pos = hidden_file->f_pos;

Do you really want to keep this comment? Perhaps it's time to decide.

Ian

2006-09-01 07:47:24

by Jan Engelhardt

Subject: Re: [PATCH 01/22][RFC] Unionfs: Documentation

Hi,


nice to see that unionfs finally tries to get in :)


>+Whiteouts:
>+==========
>+
>+A whiteout removes a file name from the name-space. Whiteouts are needed when
>+one attempts to remove a file on a read-only branch.

"namespace".

>+Suppose we have a two branch union, where branch 0 is read-write and branch 1

I'd go for "two-branch".

>+Copyup:
>+=======
>+
>+When a change is made to the contents of a file's data or meta-data, they
>+have to be stored somewhere. The best way is to create a copy of the
>+original file on a branch that is writable, and then redirect the write
>+though to this copy. The copy must be made on a higher priority branch so
>+that lookup & readdir return this newer "version" of the file rather than
>+the original (see duplicate elimination).

Apropos copyup, sparse copyup would probably be a nice feature in the future,
but it also has its effects.

>--- linux-2.6-git/Documentation/filesystems/unionfs/usage.txt 1969-12-31 19:00:00.000000000 -0500
>+++ linux-2.6-git-unionfs/Documentation/filesystems/unionfs/usage.txt 2006-08-31 19:25:19.000000000 -0400
>+
>+mount -t unionfs -o branch-option[,union-options[,...]] none unionfs

should read
mount -t unionfs -o branch-option[,union-options[,...]] none MOUNTPOINT

>+KNOWN ISSUES:
>+=============
>+
>+The NFS server returns -EACCES for read-only exports, instead of -EROFS. This

Will the NFS code ever be changed to return EROFS instead?

>+nfs-mouted branch.

mounted

>+Modifying a Unionfs branch directly, while the union is mounted is currently
>+unsupported. Any such change can cause Unionfs to oops, however it could even
>+BRESULT IN DATA LOSS.

RESULT




Jan Engelhardt
--

2006-09-01 12:48:55

by Jan Engelhardt

Subject: Re: [PATCH 02/22][RFC] Unionfs: Kconfig and Makefile


>+config UNION_FS
>+ tristate "Stackable namespace unification file system"
>+ depends on EXPERIMENTAL
>+ help
>+ Unionfs is a stackable unification file system, which appears to
>+ merge the contents of several directories (branches), while keeping
>+ their physical content separate.

Is there any CodingStyle for Kconfig files? Like what indentation to use (4 vs
8) (tab vs space) and/or whether to use "help" or "---help---"



Jan Engelhardt
--

2006-09-01 12:54:07

by Jan Engelhardt

Subject: Re: [PATCH 04/22][RFC] Unionfs: Common file operations


>+ if (!d_deleted(dentry) &&
>+ ((sbgen > fgen) || (dbstart(dentry) != fbstart(file)))) {

(sbgen > fgen || dbstart(dentry) != fbstart(file)) should suffice. (Read:
reduce the amount of "()" depth.)
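
The suggestion relies on C operator precedence: relational (>) and equality
(!=) operators bind more tightly than logical OR (||), so the inner
parentheses do not change the result. A minimal sketch, with hypothetical
function and parameter names standing in for the values compared in the
patch:

```c
/* Condition as written in the patch, with explicit inner parentheses. */
static inline int stale_with_parens(int sbgen, int fgen, int dstart, int fstart)
{
	return ((sbgen > fgen) || (dstart != fstart));
}

/* Same condition without them; > and != are evaluated before ||. */
static inline int stale_without_parens(int sbgen, int fgen, int dstart, int fstart)
{
	return sbgen > fgen || dstart != fstart;
}
```

Both functions return the same value for every input, so the choice is
purely one of readability.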

>+int unionfs_file_release(struct inode *inode, struct file *file)
>+{
>+ int err = 0;
>+ struct file *hidden_file = NULL;
>+ int bindex, bstart, bend;
>+ int fgen;
>+
>+ /* fput all the hidden files */
>+ fgen = atomic_read(&ftopd(file)->ufi_generation);
>+ bstart = fbstart(file);
>+ bend = fbend(file);
>+
>+ for (bindex = bstart; bindex <= bend; bindex++) {
>+ hidden_file = ftohf_index(file, bindex);
>+
>+ if (hidden_file) {
>+ fput(hidden_file);
>+ unionfs_read_lock(inode->i_sb);
>+ branchput(inode->i_sb, bindex);
>+ unionfs_read_unlock(inode->i_sb);
>+ }
>+ }
>+ kfree(ftohf_ptr(file));
>+
>+ if (ftopd(file)->rdstate) {
>+ ftopd(file)->rdstate->uds_access = jiffies;
>+ printk(KERN_DEBUG "Saving rdstate with cookie %u [%d.%lld]\n",
>+ ftopd(file)->rdstate->uds_cookie,
>+ ftopd(file)->rdstate->uds_bindex,
>+ (long long)ftopd(file)->rdstate->uds_dirpos);
>+ spin_lock(&itopd(inode)->uii_rdlock);
>+ itopd(inode)->uii_rdcount++;
>+ list_add_tail(&ftopd(file)->rdstate->uds_cache,
>+ &itopd(inode)->uii_readdircache);
>+ mark_inode_dirty(inode);
>+ spin_unlock(&itopd(inode)->uii_rdlock);
>+ ftopd(file)->rdstate = NULL;
>+ }
>+ kfree(ftopd(file));
>+ return err;
>+}

"err" is unused in this function. Rid it.

>+ }
>+ }
>+
>+ }



Jan Engelhardt
--

2006-09-01 15:29:29

by Randy Dunlap

Subject: Re: [PATCH 02/22][RFC] Unionfs: Kconfig and Makefile

On Fri, 1 Sep 2006 14:44:51 +0200 (MEST) Jan Engelhardt wrote:

>
> >+config UNION_FS
> >+ tristate "Stackable namespace unification file system"
> >+ depends on EXPERIMENTAL
> >+ help
> >+ Unionfs is a stackable unification file system, which appears to
> >+ merge the contents of several directories (branches), while keeping
> >+ their physical content separate.
>
> Is there any CodingStyle for Kconfig files? Like what indentation to use (4 vs
> 8) (tab vs space) and/or whether to use "help" or "---help---"

Documentation/kbuild/kconfig-language.txt says that "help" or "---help---" is OK.
It also seems to recommend some indentation under "help", but doesn't
say what that is. Roman seems to use TAB + "help" and then
TAB SPACE SPACE <help text> under the "help" keyword, but it's not
codified anywhere that I know of.

---
~Randy

2006-09-01 17:23:55

by Josef Sipek

Subject: Re: [PATCH 00/22][RFC] Unionfs: Stackable Namespace Unification Filesystem

On Fri, Sep 01, 2006 at 11:53:27AM +1000, Stephen Rothwell wrote:
> On Thu, 31 Aug 2006 21:35:13 -0400 Josef Sipek <[email protected]> wrote:
> >
> > This set of patches constitutes Unionfs version 2.0. We are presenting it to
> > be reviewed and considered for inclusion into the kernel.
>
> Small nit: is it possible to order these patches so that the kernel
> builds at each intermediate point (so we can use git bisect). The
> easiest way to achieve this would be to do the Kconfig and Makefile
> updates last.

Ideally, when Unionfs is committed into git, the whole thing would be one
commit - what's the point of having half of a filesystem? I reordered the
patches for the next submission so that the Makefile & Kconfig one is last.

Thanks,

Josef "Jeff" Sipek.

2006-09-01 22:20:21

by Trond Myklebust

Subject: Re: [PATCH 04/22][RFC] Unionfs: Common file operations

On Thu, 2006-08-31 at 21:41 -0400, Josef Sipek wrote:
> From: David Quigley <[email protected]>
>
> This patch contains helper functions used through the rest of the code which
> pertains to files.
>
> Signed-off-by: David Quigley <[email protected]>
> Signed-off-by: Josef "Jeff" Sipek <[email protected]>
> Signed-off-by: Erez Zadok <[email protected]>
>
> ---
>
> fs/unionfs/commonfops.c | 575 ++++++++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 575 insertions(+)
>
> diff -Nur -x linux-2.6-git/Documentation/dontdiff linux-2.6-git/fs/unionfs/commonfops.c linux-2.6-git-unionfs/fs/unionfs/commonfops.c
> --- linux-2.6-git/fs/unionfs/commonfops.c 1969-12-31 19:00:00.000000000 -0500
> +++ linux-2.6-git-unionfs/fs/unionfs/commonfops.c 2006-08-31 19:04:00.000000000 -0400
> @@ -0,0 +1,575 @@
> +/*
> + * Copyright (c) 2003-2006 Erez Zadok
> + * Copyright (c) 2003-2006 Charles P. Wright
> + * Copyright (c) 2005-2006 Josef 'Jeff' Sipek
> + * Copyright (c) 2005-2006 Junjiro Okajima
> + * Copyright (c) 2005 Arun M. Krishnakumar
> + * Copyright (c) 2004-2006 David P. Quigley
> + * Copyright (c) 2003-2004 Mohammad Nayyer Zubair
> + * Copyright (c) 2003 Puja Gupta
> + * Copyright (c) 2003 Harikesavan Krishnan
> + * Copyright (c) 2003-2006 Stony Brook University
> + * Copyright (c) 2003-2006 The Research Foundation of State University of New York
> + *
> + * For specific licensing information, see the COPYING file distributed with
> + * this package.
> + *
> + * This Copyright notice must be kept intact and distributed with all sources.
> + */
> +
> +#include "union.h"
> +
> +/* 1) Copyup the file
> + * 2) Rename the file to '.unionfs<original inode#><counter>' - obviously
> + * stolen from NFS's silly rename
> + */
> +static int copyup_deleted_file(struct file *file, struct dentry *dentry,
> + int bstart, int bindex)
> +{
> + static unsigned int counter;
> + const int i_inosize = sizeof(dentry->d_inode->i_ino) * 2;
> + const int countersize = sizeof(counter) * 2;
> + const int nlen = sizeof(".unionfs") + i_inosize + countersize - 1;
> + char name[nlen + 1];
> +
> + int err;
> + struct dentry *tmp_dentry = NULL;
> + struct dentry *hidden_dentry = NULL;
> + struct dentry *hidden_dir_dentry = NULL;
> +
> + hidden_dentry = dtohd_index(dentry, bstart);
> +
> + sprintf(name, ".unionfs%*.*lx",
> + i_inosize, i_inosize, hidden_dentry->d_inode->i_ino);
> +
> + tmp_dentry = NULL;
> + do {
> + char *suffix = name + nlen - countersize;
> +
> + dput(tmp_dentry);
> + counter++;
> + sprintf(suffix, "%*.*x", countersize, countersize, counter);
> +
> + printk(KERN_DEBUG "unionfs: trying to rename %s to %s\n",
> + dentry->d_name.name, name);
> +
> + tmp_dentry = lookup_one_len(name, hidden_dentry->d_parent,
> + UNIONFS_TMPNAM_LEN);
> + if (IS_ERR(tmp_dentry)) {
> + err = PTR_ERR(tmp_dentry);
> + goto out;
> + }
> + } while (tmp_dentry->d_inode != NULL); /* need negative dentry */
> +
> + err = copyup_named_file(dentry->d_parent->d_inode, file, name, bstart,
> + bindex, file->f_dentry->d_inode->i_size);
> + if (err)
> + goto out;
> +
> + /* bring it to the same state as an unlinked file */
> + hidden_dentry = dtohd_index(dentry, dbstart(dentry));
> + hidden_dir_dentry = lock_parent(hidden_dentry);
> + err = vfs_unlink(hidden_dir_dentry->d_inode, hidden_dentry);
> + unlock_dir(hidden_dir_dentry);
> +
> +out:
> + return err;
> +}
> +
> +static void cleanup_file(struct file *file, struct dentry *dentry)
> +{
> + int bindex, bstart, bend;
> +
> + bstart = fbstart(file);
> + bend = fbend(file);
> + for (bindex = bstart; bindex <= bend; bindex++) {
> + if (ftohf_index(file, bindex)) {
> + branchput(dentry->d_sb, bindex);
> + fput(ftohf_index(file, bindex));
> + }
> + }
> +
> + if (ftohf_ptr(file)) {
> + kfree(ftohf_ptr(file));
> + ftohf_ptr(file) = NULL;
> + }
> +}
> +
> +static int open_all_files(struct file *file, struct dentry *dentry)
> +{
> + int bindex, bstart, bend, err = 0;
> + struct file *hidden_file;
> + struct dentry *hidden_dentry;
> + struct super_block *sb = dentry->d_sb;
> +
> + bstart = dbstart(dentry);
> + bend = dbend(dentry);
> +
> + for (bindex = bstart; bindex <= bend; bindex++) {
> + hidden_dentry = dtohd_index(dentry, bindex);
> + if (!hidden_dentry)
> + continue;
> +
> + dget(hidden_dentry);
> + mntget(stohiddenmnt_index(sb, bindex));
> + branchget(sb, bindex);
> +
> + hidden_file = dentry_open(hidden_dentry,
> + stohiddenmnt_index(sb, bindex), file->f_flags);
> + if (IS_ERR(hidden_file)) {
> + err = PTR_ERR(hidden_file);
> + goto out;
> + } else
> + set_ftohf_index(file, bindex, hidden_file);
> + }
> +out:
> + return err;
> +}
> +
> +static int open_highest_file(struct file *file, struct dentry *dentry,
> + int willwrite)
> +{
> + int bindex, bstart, bend, err = 0;
> + struct file *hidden_file;
> + struct dentry *hidden_dentry;
> + struct inode *parent_inode = dentry->d_parent->d_inode;
> + size_t inode_size = file->f_dentry->d_inode->i_size;
> + struct super_block *sb = dentry->d_sb;
> +
> + bstart = dbstart(dentry);
> + bend = dbend(dentry);
> +
> + hidden_dentry = dtohd(dentry);
> + if (willwrite && IS_WRITE_FLAG(file->f_flags)
> + && is_robranch(dentry)) {
> + for (bindex = bstart - 1; bindex >= 0; bindex--) {
> + err = copyup_file(parent_inode, file, bstart, bindex,
> + inode_size);
> + if (!err)
> + break;
> +
> + }
> + atomic_set(&ftopd(file)->ufi_generation,
> + atomic_read(&itopd(dentry->d_inode)->uii_generation));
> + goto out;
> + }
> +
> + dget(hidden_dentry);
> + mntget(stohiddenmnt_index(sb, bstart));
> + branchget(sb, bstart);
> + hidden_file = dentry_open(hidden_dentry,
> + stohiddenmnt_index(sb, bstart), file->f_flags);
> + if (IS_ERR(hidden_file)) {
> + err = PTR_ERR(hidden_file);
> + goto out;
> + }
> + set_ftohf(file, hidden_file);
> + /* Fix up the position. */
> + hidden_file->f_pos = file->f_pos;
> +
> + memcpy(&(hidden_file->f_ra), &(file->f_ra),
> + sizeof(struct file_ra_state));
> +out:
> + return err;
> +}
> +
> +static int do_delayed_copyup(struct file *file, struct dentry *dentry)
> +{
> + int bindex, bstart, bend, err = 0;
> + struct inode *parent_inode = dentry->d_parent->d_inode;
> + size_t inode_size = file->f_dentry->d_inode->i_size;
> +
> + bstart = fbstart(file);
> + bend = fbend(file);
> +
> + BUG_ON(!S_ISREG(file->f_dentry->d_inode->i_mode));
> +
> + for (bindex = bstart - 1; bindex >= 0; bindex--) {
> + if (!d_deleted(file->f_dentry))
> + err = copyup_file(parent_inode, file, bstart,
> + bindex, inode_size);
> + else
> + err = copyup_deleted_file(file, dentry, bstart, bindex);
> +
> + if (!err)
> + break;
> + }
> + if (!err && (bstart > fbstart(file))) {
> + bend = fbend(file);
> + for (bindex = bstart; bindex <= bend; bindex++) {
> + if (ftohf_index(file, bindex)) {
> + branchput(dentry->d_sb, bindex);
> + fput(ftohf_index(file, bindex));
> + set_ftohf_index(file, bindex, NULL);
> + }
> + }
> + fbend(file) = bend;
> + }
> + return err;
> +}
> +
> +int unionfs_file_revalidate(struct file *file, int willwrite)
> +{
> + struct super_block *sb;
> + struct dentry *dentry;
> + int sbgen, fgen, dgen;
> + int bstart, bend;
> + int size;
> +
> + int err = 0;
> +
> + dentry = file->f_dentry;
> + lock_dentry(dentry);
> + sb = dentry->d_sb;
> + unionfs_read_lock(sb);
> + if (!unionfs_d_revalidate(dentry, NULL) && !d_deleted(dentry)) {
> + err = -ESTALE;
> + goto out;
> + }
> +
> + sbgen = atomic_read(&stopd(sb)->usi_generation);
> + dgen = atomic_read(&dtopd(dentry)->udi_generation);
> + fgen = atomic_read(&ftopd(file)->ufi_generation);
> +
> + BUG_ON(sbgen > dgen);
> +
> + /* There are two cases we are interested in. The first is if the
> + * generation is lower than the super-block. The second is if someone
> + * has copied up this file from underneath us, we also need to refresh
> + * things. */
> + if (!d_deleted(dentry) &&
> + ((sbgen > fgen) || (dbstart(dentry) != fbstart(file)))) {
> + /* First we throw out the existing files. */
> + cleanup_file(file, dentry);
> +
> + /* Now we reopen the file(s) as in unionfs_open. */
> + bstart = fbstart(file) = dbstart(dentry);
> + bend = fbend(file) = dbend(dentry);
> +
> + size = sizeof(struct file *) * sbmax(sb);
> + ftohf_ptr(file) = kzalloc(size, GFP_KERNEL);
> + if (!ftohf_ptr(file)) {
> + err = -ENOMEM;
> + goto out;
> + }
> +
> + if (S_ISDIR(dentry->d_inode->i_mode)) {
> + /* We need to open all the files. */
> + err = open_all_files(file, dentry);
> + if (err)
> + goto out;
> + } else {
> + /* We only open the highest priority branch. */
> + err = open_highest_file(file, dentry, willwrite);
> + if (err)
> + goto out;
> + }
> + atomic_set(&ftopd(file)->ufi_generation,
> + atomic_read(&itopd(dentry->d_inode)->
> + uii_generation));
> + }
> +
> + /* Copyup on the first write to a file on a readonly branch. */
> + if (willwrite && IS_WRITE_FLAG(file->f_flags)
> + && !IS_WRITE_FLAG(ftohf(file)->f_flags) && is_robranch(dentry)) {
> + printk(KERN_DEBUG
> + "Doing delayed copyup of a read-write file on a read-only branch.\n");
> + err = do_delayed_copyup(file, dentry);
> + }
> +
> +out:
> + unlock_dentry(dentry);
> + unionfs_read_unlock(dentry->d_sb);
> + return err;
> +}
> +
> +/* unionfs_open helper function: open a directory */
> +static inline int __open_dir(struct inode *inode, struct file *file)
> +{
> + struct dentry *hidden_dentry;
> + struct file *hidden_file;
> + int bindex, bstart, bend;
> +
> + bstart = fbstart(file) = dbstart(file->f_dentry);
> + bend = fbend(file) = dbend(file->f_dentry);
> +
> + for (bindex = bstart; bindex <= bend; bindex++) {
> + hidden_dentry = dtohd_index(file->f_dentry, bindex);
> + if (!hidden_dentry)
> + continue;
> +
> + dget(hidden_dentry);
> + mntget(stohiddenmnt_index(inode->i_sb, bindex));
> + hidden_file = dentry_open(hidden_dentry,
> + stohiddenmnt_index(inode->i_sb, bindex),
> + file->f_flags);

Race! You cannot open an underlying NFS file by name after it has been
looked up: you have no guarantee that it hasn't been renamed.

> + if (IS_ERR(hidden_file))
> + return PTR_ERR(hidden_file);
> +
> + set_ftohf_index(file, bindex, hidden_file);
> +
> + /* The branchget goes after the open, because otherwise
> + * we would miss the reference on release. */
> + branchget(inode->i_sb, bindex);
> + }
> +
> + return 0;
> +}
> +
> +/* unionfs_open helper function: open a file */
> +static inline int __open_file(struct inode *inode, struct file *file)
> +{
> + struct dentry *hidden_dentry;
> + struct file *hidden_file;
> + int hidden_flags;
> + int bindex, bstart, bend;
> +
> + hidden_dentry = dtohd(file->f_dentry);
> + hidden_flags = file->f_flags;
> +
> + bstart = fbstart(file) = dbstart(file->f_dentry);
> + bend = fbend(file) = dbend(file->f_dentry);
> +
> + /* check for the permission for hidden file. If the error is COPYUP_ERR,
> + * copyup the file.
> + */
> + if (hidden_dentry->d_inode && is_robranch(file->f_dentry)) {
> + /* if the open will change the file, copy it up otherwise defer it. */
> + if (hidden_flags & O_TRUNC) {
> + int size = 0;
> + int err = -EROFS;
> +
> + /* copyup the file */
> + for (bindex = bstart - 1; bindex >= 0; bindex--) {
> + err = copyup_file(file->f_dentry->d_parent->d_inode,
> + file, bstart, bindex, size);
> + if (!err)
> + break;
> + }
> + return err;
> + } else
> + hidden_flags &= ~(OPEN_WRITE_FLAGS);
> + }
> +
> + dget(hidden_dentry);
> +
> + /* dentry_open will decrement mnt refcnt if err.
> + * otherwise fput() will do an mntput() for us upon file close.
> + */
> + mntget(stohiddenmnt_index(inode->i_sb, bstart));
> + hidden_file = dentry_open(hidden_dentry,
> + stohiddenmnt_index(inode->i_sb, bstart),
> + hidden_flags);

Race: see above. Besides, calling dentry_open() directly on an NFS file
is a bug!

> + if (IS_ERR(hidden_file))
> + return PTR_ERR(hidden_file);
> +
> + set_ftohf(file, hidden_file);
> + branchget(inode->i_sb, bstart);
> +
> + return 0;
> +}
> +
> +int unionfs_open(struct inode *inode, struct file *file)
> +{
> + int err = 0;
> + struct file *hidden_file = NULL;
> + struct dentry *dentry = NULL;
> + int bindex = 0, bstart = 0, bend = 0;
> + int size;
> +
> + ftopd_lhs(file) = kzalloc(sizeof(struct unionfs_file_info), GFP_KERNEL);
> + if (!ftopd(file)) {
> + err = -ENOMEM;
> + goto out;
> + }
> + fbstart(file) = -1;
> + fbend(file) = -1;
> + atomic_set(&ftopd(file)->ufi_generation,
> + atomic_read(&itopd(inode)->uii_generation));
> +
> + size = sizeof(struct file *) * sbmax(inode->i_sb);
> + ftohf_ptr(file) = kzalloc(size, GFP_KERNEL);
> + if (!ftohf_ptr(file)) {
> + err = -ENOMEM;
> + goto out;
> + }
> +
> + dentry = file->f_dentry;
> + lock_dentry(dentry);
> + unionfs_read_lock(inode->i_sb);
> +
> + bstart = fbstart(file) = dbstart(dentry);
> + bend = fbend(file) = dbend(dentry);
> +
> + /* increment, so that we can flush appropriately */
> + atomic_inc(&itopd(dentry->d_inode)->uii_totalopens);
> +
> + /* open all directories and make the unionfs file struct point to these hidden file structs */
> + if (S_ISDIR(inode->i_mode))
> + err = __open_dir(inode, file); /* open a dir */
> + else
> + err = __open_file(inode, file); /* open a file */
> +
> + /* freeing the allocated resources, and fput the opened files */
> + if (err) {
> + for (bindex = bstart; bindex <= bend; bindex++) {
> + hidden_file = ftohf_index(file, bindex);
> + if (!hidden_file)
> + continue;
> +
> + branchput(file->f_dentry->d_sb, bindex);
> + /* fput calls dput for hidden_dentry */
> + fput(hidden_file);
> + }
> + }
> +
> + unlock_dentry(dentry);
> + unionfs_read_unlock(inode->i_sb);
> +
> +out:
> + if (err) {
> + kfree(ftohf_ptr(file));
> + kfree(ftopd(file));
> + }
> +
> + return err;
> +}
> +
> +int unionfs_file_release(struct inode *inode, struct file *file)
> +{
> + int err = 0;
> + struct file *hidden_file = NULL;
> + int bindex, bstart, bend;
> + int fgen;
> +
> + /* fput all the hidden files */
> + fgen = atomic_read(&ftopd(file)->ufi_generation);
> + bstart = fbstart(file);
> + bend = fbend(file);
> +
> + for (bindex = bstart; bindex <= bend; bindex++) {
> + hidden_file = ftohf_index(file, bindex);
> +
> + if (hidden_file) {
> + fput(hidden_file);
> + unionfs_read_lock(inode->i_sb);
> + branchput(inode->i_sb, bindex);
> + unionfs_read_unlock(inode->i_sb);
> + }
> + }
> + kfree(ftohf_ptr(file));
> +
> + if (ftopd(file)->rdstate) {
> + ftopd(file)->rdstate->uds_access = jiffies;
> + printk(KERN_DEBUG "Saving rdstate with cookie %u [%d.%lld]\n",
> + ftopd(file)->rdstate->uds_cookie,
> + ftopd(file)->rdstate->uds_bindex,
> + (long long)ftopd(file)->rdstate->uds_dirpos);
> + spin_lock(&itopd(inode)->uii_rdlock);
> + itopd(inode)->uii_rdcount++;
> + list_add_tail(&ftopd(file)->rdstate->uds_cache,
> + &itopd(inode)->uii_readdircache);
> + mark_inode_dirty(inode);
> + spin_unlock(&itopd(inode)->uii_rdlock);
> + ftopd(file)->rdstate = NULL;
> + }
> + kfree(ftopd(file));
> + return err;
> +}
> +
> +static inline long do_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
> +{
> + struct file *hidden_file;
> + int err;
> +
> + hidden_file = ftohf(file);
> +
> + err = security_file_ioctl(hidden_file, cmd, arg);
> + if (err)
> + goto out;
> + err = -ENOTTY;
> + if (!hidden_file || !hidden_file->f_op)
> + goto out;
> + if (hidden_file->f_op->unlocked_ioctl) {
> + err = hidden_file->f_op->unlocked_ioctl(hidden_file, cmd, arg);
> + } else if (hidden_file->f_op->ioctl) {
> + lock_kernel();
> + err = hidden_file->f_op->ioctl(hidden_file->f_dentry->d_inode,
> + hidden_file, cmd, arg);
> + unlock_kernel();
> + }
> +
> +out:
> + return err;
> +}
> +
> +long unionfs_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
> +{
> + long err = 0; /* don't fail by default */
> +
> + if ((err = unionfs_file_revalidate(file, 1)))
> + goto out;
> +
> + /* check if asked for local commands */
> + switch (cmd) {
> + case UNIONFS_IOCTL_INCGEN:
> + if (!capable(CAP_SYS_ADMIN)) {
> + err = -EACCES;
> + goto out;
> + }
> + err = unionfs_ioctl_incgen(file, cmd, arg);
> + break;
> +
> + case UNIONFS_IOCTL_QUERYFILE:
> + err = unionfs_ioctl_queryfile(file, cmd, arg);
> + break;
> +
> + default:
> + err = do_ioctl(file, cmd, arg);
> + break;
> + }
> +
> +out:
> + return err;
> +}
> +
> +int unionfs_flush(struct file *file, fl_owner_t id)
> +{
> + int err = 0; /* assume ok (see open.c:close_fp) */
> + struct file *hidden_file = NULL;
> + int bindex, bstart, bend;
> +
> + if ((err = unionfs_file_revalidate(file, 1)))
> + goto out;
> + if (!atomic_dec_and_test
> + (&itopd(file->f_dentry->d_inode)->uii_totalopens))
> + goto out;
> +
> + lock_dentry(file->f_dentry);
> +
> + bstart = fbstart(file);
> + bend = fbend(file);
> + for (bindex = bstart; bindex <= bend; bindex++) {
> + hidden_file = ftohf_index(file, bindex);
> +
> + if (hidden_file && hidden_file->f_op
> + && hidden_file->f_op->flush) {
> + err = hidden_file->f_op->flush(hidden_file, id);
> + if (err)
> + goto out_lock;
> +
> + /* if there are no more references to the dentry, dput it */
> + if (d_deleted(file->f_dentry)) {
> + dput(dtohd_index(file->f_dentry, bindex));
> + set_dtohd_index(file->f_dentry, bindex, NULL);
> + }
> + }
> +
> + }
> +
> +out_lock:
> + unlock_dentry(file->f_dentry);
> +out:
> + return err;
> +}
> +



2006-09-01 22:37:19

by Shaya Potter

[permalink] [raw]
Subject: Re: [PATCH 04/22][RFC] Unionfs: Common file operations

On Fri, 2006-09-01 at 18:20 -0400, Trond Myklebust wrote:

> Race! You cannot open an underlying NFS file by name after it has been
> looked up: you have no guarantee that it hasn't been renamed.

In a unionfs case that's not an issue. Nothing else is allowed to use
the backing store (i.e. the nfs fs) while unionfs is using it, so there
shouldn't be a renaming issue.



2006-09-01 22:57:33

by Trond Myklebust

[permalink] [raw]
Subject: Re: [PATCH 04/22][RFC] Unionfs: Common file operations

On Fri, 2006-09-01 at 18:36 -0400, Shaya Potter wrote:
> On Fri, 2006-09-01 at 18:20 -0400, Trond Myklebust wrote:
>
> > Race! You cannot open an underlying NFS file by name after it has been
> > looked up: you have no guarantee that it hasn't been renamed.
>
> In a unionfs case that's not an issue. Nothing else is allowed to use
> the backing store (i.e. the nfs fs) while unionfs is using it, so there
> shouldn't be a renaming issue.

How are you enforcing that on the server?




2006-09-02 02:48:21

by Josef Sipek

[permalink] [raw]
Subject: Re: [PATCH 04/22][RFC] Unionfs: Common file operations

On Fri, Sep 01, 2006 at 06:20:00PM -0400, Trond Myklebust wrote:
> On Thu, 2006-08-31 at 21:41 -0400, Josef Sipek wrote:
...
> > + for (bindex = bstart; bindex <= bend; bindex++) {
> > + hidden_dentry = dtohd_index(file->f_dentry, bindex);
> > + if (!hidden_dentry)
> > + continue;
> > +
> > + dget(hidden_dentry);
> > + mntget(stohiddenmnt_index(inode->i_sb, bindex));
> > + hidden_file = dentry_open(hidden_dentry,
> > + stohiddenmnt_index(inode->i_sb, bindex),
> > + file->f_flags);
>
> Race! You cannot open an underlying NFS file by name after it has been
> looked up: you have no guarantee that it hasn't been renamed.

From what I can see, the solution to this would be to pass the lookup
intents in unionfs_lookup down to the lower filesystem (the way it should be
done in the first place). Then, we could use the dentry here without any
problems. Is that all that needs to be done or am I forgetting something?

Thanks,

Josef "Jeff" Sipek.


2006-09-03 01:04:10

by Shaya Potter

[permalink] [raw]
Subject: Re: [PATCH 04/22][RFC] Unionfs: Common file operations

On Fri, 2006-09-01 at 18:57 -0400, Trond Myklebust wrote:
> On Fri, 2006-09-01 at 18:36 -0400, Shaya Potter wrote:
> > On Fri, 2006-09-01 at 18:20 -0400, Trond Myklebust wrote:
> >
> > > Race! You cannot open an underlying NFS file by name after it has been
> > > looked up: you have no guarantee that it hasn't been renamed.
> >
> > In a unionfs case that's not an issue. Nothing else is allowed to use
> > the backing store (i.e. the nfs fs) while unionfs is using it, so there
> > shouldn't be a renaming issue.
>
> How are you enforcing that on the server?

If I formatted a partition on a SAN with ext3, who would enforce that only
one machine has access to it at a time?

The administrator of the file system.



2006-09-03 04:11:12

by Trond Myklebust

[permalink] [raw]
Subject: Re: [PATCH 04/22][RFC] Unionfs: Common file operations

On Fri, 2006-09-01 at 22:47 -0400, Josef Sipek wrote:

> From what I can see, the solution to this would be to pass the lookup
> intents in unionfs_lookup down to the lower filesystem (the way it should be
> done in the first place). Then, we could use the dentry here without any
> problems. Is that all that needs to be done or am I forgetting something?

That sounds correct. If you pass the lookup intents down to the
underlying filesystem while doing the lookup, then all should be well:
NFS will actually open the file for you at that point too, whereas most
other filesystems will just look it up and return the dentry.

In any case, you avoid the race, because you lookup/revalidate the
underlying dentry at the same time as you lookup/revalidate the unionfs
dentry.

Cheers,
Trond



2006-09-03 17:48:50

by Jan Engelhardt

[permalink] [raw]
Subject: Re: [PATCH 00/22][RFC] Unionfs: Stackable Namespace Unification Filesystem

>> > This set of patches constitutes Unionfs version 2.0. We are presenting it to
>> > be reviewed and considered for inclusion into the kernel.
>>
>> Small nit: is it possible to order these patches so that the kernel
>> builds at each intermediate point (so we can use git bisect). The
>> easiest way to achieve this would be to do the Kconfig and Makefile
>> updates last.
>
>Ideally, when Unionfs is committed into git, the whole thing would be one
>commit - what's the point of having half of a filesystem?

So that you can eliminate e.g. locking bugs by searching in less code.



Jan Engelhardt
--


2006-09-03 19:45:32

by Josef Sipek

[permalink] [raw]
Subject: Re: [PATCH 00/22][RFC] Unionfs: Stackable Namespace Unification Filesystem

On Sun, Sep 03, 2006 at 07:42:53PM +0200, Jan Engelhardt wrote:
> >> > This set of patches constitutes Unionfs version 2.0. We are presenting it to
> >> > be reviewed and considered for inclusion into the kernel.
> >>
> >> Small nit: is it possible to order these patches so that the kernel
> >> builds at each intermediate point (so we can use git bisect). The
> >> easiest way to achieve this would be to do the Kconfig and Makefile
> >> updates last.
> >
> >Ideally, when Unionfs is committed into git, the whole thing would be one
> >commit - what's the point of having half of a filesystem?
>
> So that you can eliminate e.g. locking bugs by searching in less code.

I think you misunderstood my comment. What I meant to say was that there is
_no way_ you can compile a filesystem that has only dentry ops but not
superblock ops - this would happen if you tried to bisect and you landed
half way in the series of commits for the filesystem. For the _initial_
commit one cset makes sense. For subsequent fixes one commit per fix is the
only logical thing to do.

Josef "Jeff" Sipek.

--
Bad pun of the week: The formula 1 control computer suffered from a race
condition


2006-09-04 07:03:07

by Jan Engelhardt

[permalink] [raw]
Subject: Re: [PATCH 05/22][RFC] Unionfs: Copyup Functionality


>+/* Determine the mode based on the copyup flags, and the existing dentry. */
>+static int copyup_permissions(struct super_block *sb,
>+ struct dentry *old_hidden_dentry,
>+ struct dentry *new_hidden_dentry)
>+{
>+ struct iattr newattrs;
>+ int err;
>+
>+ newattrs.ia_atime = old_hidden_dentry->d_inode->i_atime;
>+ newattrs.ia_mtime = old_hidden_dentry->d_inode->i_mtime;
>+ newattrs.ia_ctime = old_hidden_dentry->d_inode->i_ctime;

How about,

struct inode *ohi = old_hidden_dentry->d_inode;
newattrs.ia_atime = ohi->i_atime;

reduces typing a little.

>+ if (S_ISDIR(old_mode)) {
>+ args.u.mkdir.parent = new_hidden_parent_dentry->d_inode;
>+ args.u.mkdir.dentry = new_hidden_dentry;
>+ args.u.mkdir.mode = old_mode;

Like above maybe?

>+ } else {
>+ printk(KERN_ERR "Unknown inode type %d\n",
>+ old_mode);
>+ BUG();
>+ }

Is BUG the right thing here? What do others think? (Maybe use WARN and set
err to something useful instead?)
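
The alternative being asked about can be sketched in userspace C. This is only an illustration: `enum node_type`, `copyup_create()` and the choice of -EINVAL are assumptions made for the example, and fprintf() stands in for the kernel's warning facilities.

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical stand-ins for the inode types the copyup code handles. */
enum node_type { NODE_DIR = 1, NODE_REG = 2, NODE_SYMLINK = 3 };

/*
 * Report an unknown type and hand an error back to the caller instead
 * of halting. -EINVAL is an assumed choice, not taken from the patch.
 */
static int copyup_create(int type)
{
	switch (type) {
	case NODE_DIR:
	case NODE_REG:
	case NODE_SYMLINK:
		return 0;	/* real code would create the object here */
	default:
		fprintf(stderr, "copyup: unknown inode type %d\n", type);
		return -EINVAL;
	}
}
```

The caller then decides how to recover, which is the main difference from stopping at the point of detection.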

>+ } while ((read_bytes > 0) && (len > 0));

-()

>+/* This function creates a copy of a file represented by 'file' which currently
>+ * resides in branch 'bstart' to branch 'new_bindex.
^

+'

>+struct dentry *create_parents(struct inode *dir, struct dentry *dentry,
>+ int bindex)
>+{
>+ struct dentry *hidden_dentry;
>+
>+ hidden_dentry =
>+ create_parents_named(dir, dentry, dentry->d_name.name, bindex);
>+
>+ return (hidden_dentry);
>+}

{
return create_parents_named(dir, dentry, dentry->d_name.name, bindex);
}


>+struct dentry *create_parents_named(struct inode *dir, struct dentry *dentry,
>+ const char *name, int bindex)
>+{
>+ struct dentry **path = NULL;
>+ path = (struct dentry **)kzalloc(kmalloc_size, GFP_KERNEL);

Nocast.
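
For readers unfamiliar with the "Nocast" shorthand: in C a `void *` converts implicitly to any object pointer type, so the cast on an allocator's return value (or on a `void *` callback argument) adds nothing. A minimal userspace sketch, with calloc() standing in for kzalloc() and `struct entry`, `alloc_path()` and `entry_id()` invented for the example:

```c
#include <stdlib.h>

struct entry { int id; };	/* hypothetical stand-in for struct dentry */

/*
 * calloc() returns void *, which converts implicitly to the
 * destination pointer type; no cast is needed on the assignment.
 */
static struct entry **alloc_path(size_t count)
{
	struct entry **path = calloc(count, sizeof(*path));

	return path;	/* NULL on allocation failure */
}

/* A void * callback argument converts back the same way. */
static int entry_id(void *data)
{
	struct entry *e = data;

	return e->id;
}
```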

>+ if (!path)
>+ ;

Woha?!

>+ tmp_path =
>+ (struct dentry **)kzalloc(kmalloc_size, GFP_KERNEL);

Nocast.



Jan Engelhardt
--


2006-09-04 07:08:22

by Jan Engelhardt

[permalink] [raw]
Subject: Re: [PATCH 06/22][RFC] Unionfs: Dentry operations


>+/*
>+ * THIS IS A BOOLEAN FUNCTION: returns 1 if valid, 0 otherwise.
>+ */

Candidate for "generic boolean patch"!

>+ if (!restart && (pdgen != sbgen)) {
()

>+ } else if (dbstart(dentry) < 0) {
>+ /* this is due to a failed lookup */
>+ /* the failed lookup has a dtohd_ptr set to null,
>+ but this is a better check */
>+ printk(KERN_DEBUG "dentry without hidden dentries : %*s",
>+ dentry->d_name.len, dentry->d_name.name);

I think you want %.*s
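
The difference matters because `%*s` only sets a minimum field width, while `%.*s` sets a precision that caps how many bytes are read from the string. A small userspace sketch using snprintf() (the helper names are made up for the example; printk accepts the same width and precision syntax):

```c
#include <stdio.h>
#include <string.h>

/*
 * Precision ("%.*s"): at most `len` bytes of `name` are printed, so a
 * length-counted name is never read past its end.
 */
static int format_with_precision(char *out, size_t outsz,
				 const char *name, int len)
{
	return snprintf(out, outsz, "%.*s", len, name);
}

/*
 * Field width ("%*s"): `len` is only a minimum width. Short strings
 * are padded and long ones are printed in full, up to the NUL byte.
 */
static int format_with_width(char *out, size_t outsz,
			     const char *name, int len)
{
	return snprintf(out, outsz, "%*s", len, name);
}
```

With the name "hello_world" and a length of 5, the precision form yields "hello" while the width form prints the whole string.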

>+out_free:
>+ /* No need to unlock it, because it is disappeared. */
>+ free_dentry_private_data(dtopd(dentry));
>+ dtopd_lhs(dentry) = NULL; /* just to be safe */

Things like this NULLing could be removed. If it then oopses somewhere,
you either

(a) needed this =NULL indeed (because some other function depends
on it being NULL)

or (b) found a bug elsewhere (more likely, since you write "just to be safe")


The (a) case is needed if you wanted to kfree(dtopd_lhs(dentry))
elsewhere.


Jan Engelhardt
--


2006-09-04 07:11:06

by Jan Engelhardt

[permalink] [raw]
Subject: Re: [PATCH 07/22][RFC] Unionfs: Directory file operations


>+/* copied from generic filldir in fs/readir.c */
>+static int unionfs_filldir(void *dirent, const char *name, int namelen,
>+ loff_t offset, ino_t ino, unsigned int d_type)
>+{
>+ struct unionfs_getdents_callback *buf =
>+ (struct unionfs_getdents_callback *)dirent;

Nocast.

>+ if ((namelen > UNIONFS_WHLEN) && !strncmp(name, UNIONFS_WHPFX, UNIONFS_WHLEN)) {
()

>+ /* if 'name' isn't a whiteout filldir it. */
^
I would put a , here

>+ err = vfs_readdir(hidden_file, unionfs_filldir, (void *)&buf);

Most likely nocast.

>+ if (err < 0) {
>+ goto out;
>+ }
>+
>+ if (buf.filldir_error) {
>+ break;
>+ }

-{}

>+ if (offset == rdstate2offset(rdstate)) {
>+ err = offset;
>+ } else if (file->f_pos == DIREOF) {
>+ err = DIREOF;
>+ } else {
>+ err = -EINVAL;
>+ }

-{}

>+/* Trimmed directory options, we shouldn't pass everything down since
>+ * we don't want to operate on partial directories.
>+ */
>+struct file_operations unionfs_dir_fops = {
>+ .llseek = unionfs_dir_llseek,
>+ .read = generic_read_dir,
>+ .readdir = unionfs_readdir,
>+ .unlocked_ioctl = unionfs_ioctl,
>+ .open = unionfs_open,
>+ .release = unionfs_file_release,
>+ .flush = unionfs_flush,
>+};

Might want to line up structs' members.




Jan Engelhardt
--


2006-09-04 07:13:12

by Jan Engelhardt

[permalink] [raw]
Subject: Re: [PATCH 08/22][RFC] Unionfs: Directory manipulation helper functions


>+/* This filldir function makes sure only whiteouts exist within a directory. */
>+static int readdir_util_callback(void *dirent, const char *name, int namelen,
>+ loff_t offset, ino_t ino, unsigned int d_type)
>+{
>+ int err = 0;
>+ struct unionfs_rdutil_callback *buf =
>+ (struct unionfs_rdutil_callback *)dirent;

Nocast.

>+ if ((namelen > UNIONFS_WHLEN) && !strncmp(name, UNIONFS_WHPFX, UNIONFS_WHLEN)) {
()
Also elsewhere.

>+ if (0 <= bopaque && bopaque < bend)

Turn it. Constant values are usually wanted on the right side.

bopaque >= 0



Jan Engelhardt
--


2006-09-04 07:15:48

by Jan Engelhardt

[permalink] [raw]
Subject: Re: [PATCH 09/22][RFC] Unionfs: File operations


>+ memcpy(&(hidden_file->f_ra), &(file->f_ra),
>+ sizeof(struct file_ra_state));

-> has precedence over &, so the () are not needed.
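
A quick way to convince yourself of this, as a userspace sketch with made-up struct names standing in for struct file and its f_ra member:

```c
#include <stddef.h>

/* Hypothetical stand-ins for struct file_ra_state and struct file. */
struct ra_state { unsigned long start; unsigned long size; };
struct file_like { long pos; struct ra_state ra; };

/* Address of the member, written with the redundant parentheses. */
static struct ra_state *ra_with_parens(struct file_like *f)
{
	return &(f->ra);
}

/* Same expression without them: "->" binds tighter than unary "&". */
static struct ra_state *ra_without_parens(struct file_like *f)
{
	return &f->ra;
}
```

Both helpers return the same address, namely the struct's base plus the member's offset.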

>+ if (err != file->f_pos) {
>+ file->f_pos = err;
>+ // ION maybe this?
>+ // file->f_pos = hidden_file->f_pos;

ION?

>+static int unionfs_file_readdir(struct file *file, void *dirent,
>+ filldir_t filldir)
>+{
>+ int err = -ENOTDIR;
>+ return err;
>+}

{
return -ENOTDIR;
}




Jan Engelhardt
--


2006-09-04 07:22:27

by Jan Engelhardt

[permalink] [raw]
Subject: Re: [PATCH 10/22][RFC] Unionfs: Inode operations


>+ //DQ: vfs_create has a different prototype in 2.6

I found a relic! :-)

>+ if (IS_ERR(hidden_dentry)) {
>+ err = PTR_ERR(hidden_dentry);
>+ }
-{}

>+/* We don't lock the dentry here, because readlink does the heavy lifting. */
>+static void *unionfs_follow_link(struct dentry *dentry, struct nameidata *nd)
>+{
>+ char *buf;
>+ int len = PAGE_SIZE, err;
>+ mm_segment_t old_fs;
>+
>+ /* This is freed by the put_link method assuming a successful call. */
>+ buf = (char *)kmalloc(len, GFP_KERNEL);

Nocast.

>+void unionfs_put_link(struct dentry *dentry, struct nameidata *nd, void *cookie)
>+{
>+ char *link;
>+ link = nd_get_link(nd);
>+ kfree(link);
>+}

kfree(nd_get_link(nd));

>+ /* Ordinary permission routines do not understand MAY_APPEND. */
>+ submask = mask & ~MAY_APPEND;
>+ if (inode->i_op && inode->i_op->permission) {
>+ retval = inode->i_op->permission(inode, submask, nd);
>+ if ((retval == -EACCES) && (submask & MAY_WRITE) &&
>+ (!strcmp("nfs", (inode)->i_sb->s_type->name)) &&
>+ (nd) && (nd->mnt) && (nd->mnt->mnt_sb) &&
>+ (branchperms(nd->mnt->mnt_sb, bindex) & MAY_NFSRO)) {
>+ retval = generic_permission(inode, submask, NULL);

I am not sure right now and would need to test: does anyone know offhand
whether other network filesystems (in particular SMBFS/CIFS) behave like NFS
and also return -EACCES rather than -EROFS?

>+ return ((retval == -EROFS) ? 0 : retval); /* ignore EROFS */
outer ()
>+ for (bindex = bstart; (bindex <= bend) || (bindex == bstart); bindex++) {

>+ /* if error is in the leftmost f/s, pass it up */

"f/s" => branch, to follow unionfs terminology.



Jan Engelhardt
--


2006-09-04 07:23:15

by Jeremy Fitzhardinge

[permalink] [raw]
Subject: Re: [PATCH 08/22][RFC] Unionfs: Directory manipulation helper functions

Jan Engelhardt wrote:
>> + if (0 <= bopaque && bopaque < bend)
>>
>
> Turn it. Constant values are usually wanted on the right side.
>
> bopaque >= 0

Not in this case. The test in its current form clearly shows that
bopaque needs to be within a range.
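
As a small illustration of that point, the number-line ordering reads naturally when wrapped in a helper. The function name is invented for the example and is not part of the patch:

```c
#include <stdbool.h>

/*
 * True when lo <= value < hi. Writing the operands in number-line
 * order keeps `value` visibly between its two bounds.
 */
static bool in_half_open_range(int lo, int value, int hi)
{
	return lo <= value && value < hi;
}
```

The quoted test would then read in_half_open_range(0, bopaque, bend).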


J


2006-09-04 07:28:45

by Jan Engelhardt

[permalink] [raw]
Subject: Re: [PATCH 11/22][RFC] Unionfs: Lookup helper functions


>+ if ((bopaque != -1) && (bopaque < bend))

>+ /* If we've only got negative dentries, then use the leftmost one. */
>+ if (lookupmode == INTERPOSE_REVAL) {
>+ if (dentry->d_inode) {
>+ itopd(dentry->d_inode)->uii_stale = 1;
>+ }
>+ goto out;
>+ }
>+ if (!newsize || ((oldsize < newsize) && (newsize > minsize))) {

There are some places here for -() and -{}.

>+static int is_opaque_dir(struct dentry *dentry, int bindex)
>+{
>+ int err = 0;
>+ /* This is an opaque dir iff wh_hidden_dentry is positive */
>+ err = !!wh_hidden_dentry->d_inode;
>+out:
>+ return err;
>+}

This smells like a bool function. If so, don't call it "err" (since boolean
functions do not return "errors").
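
A sketch of what the predicate could look like with a boolean return type and no "err" variable. `struct node` and `is_positive()` are hypothetical userspace stand-ins, and availability of `bool` in the kernel tree of the time is assumed rather than checked:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a dentry: non-NULL inode means "positive". */
struct node { void *inode; };

/* A predicate returns an answer, not an error code. */
static bool is_positive(const struct node *n)
{
	return n->inode != NULL;
}
```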



Jan Engelhardt
--


2006-09-04 07:32:42

by Jan Engelhardt

[permalink] [raw]
Subject: Re: [PATCH 12/22][RFC] Unionfs: Main module functions


>+/* checks if two hidden_dentries have overlapping branches */
>+int is_branch_overlap(struct dentry *dent1, struct dentry *dent2)
>+{
>+ struct dentry *dent = NULL;
>+
>+ dent = dent1;
>+ while ((dent != dent2) && (dent->d_parent != dent)) {
>+ dent = dent->d_parent;
>+ }
>+ if (dent == dent2) {
>+ return 1;
>+ }
>+
>+ dent = dent2;
>+ while ((dent != dent1) && (dent->d_parent != dent)) {
>+ dent = dent->d_parent;
>+ }
>+ if (dent == dent1) {
>+ return 1;
>+ }
>+
>+ return 0;
>+}
-()-{} Also elsewhere.

>+ struct dentry *dent1 = NULL;
>+ struct dentry *dent2 = NULL;

Is it necessary to set these to NULL?

>+ for (i = 0; i < branches; i++) {
>+ if (hidden_root_info->udi_dentry[i])
>+ dput(hidden_root_info->udi_dentry[i]);
>+ }


>+MODULE_AUTHOR("Filesystems and Storage Lab, Stony Brook University"
>+ " (http://www.fsl.cs.sunysb.edu/)");

Should probably have at least one human person in the list.


Jan Engelhardt
--


2006-09-04 07:34:44

by Jan Engelhardt

[permalink] [raw]
Subject: Re: [PATCH 13/22][RFC] Unionfs: Readdir state


{} as usual.

>+ /* If we print entry, we end up with spurious data. */
>+ /* print_entry("name = %*s", namelen, name); */

%.*s

>+ new->name = (char *)kmalloc(namelen + 1, GFP_KERNEL);

Nocast.




Jan Engelhardt
--


2006-09-04 07:41:08

by Jan Engelhardt

[permalink] [raw]
Subject: Re: [PATCH 15/22][RFC] Unionfs: Privileged operations workqueue


>+void __unionfs_create(void *data)
>+{
>+ struct sioq_args *args = data;
>+
>+ args->err = vfs_create(args->u.create.parent, args->u.create.dentry,
>+ args->u.create.mode, args->u.create.nd);
>+ complete(&args->comp);
>+}

Suggestion

{
struct sioq_args *args = data;
struct create_args *c = &args->u.create;
args->err = vfs_create(c->parent, c->dentry, c->mode, c->nd);
complete(&args->comp);
}

Similar for others.

>+ union {
>+ struct deletewh_args deletewh;
>+ struct isopaque_args isopaque;
>+ struct create_args create;
>+ struct mkdir_args mkdir;
>+ struct mknod_args mknod;
>+ struct symlink_args symlink;
>+ struct unlink_args unlink;
>+ } u;

Anonymous unions (and structs) are allowed, use them if you think they sound
cool.
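
For comparison, here is the same layout with a named and with an anonymous union. The struct contents are trimmed-down stand-ins for the example; anonymous unions are a GNU C extension that C11 later standardized:

```c
struct create_args { int mode; };
struct mkdir_args { int mode; };

/* Named union: members are reached through the extra ".u" step. */
struct sioq_args_named {
	int err;
	union {
		struct create_args create;
		struct mkdir_args mkdir;
	} u;
};

/* Anonymous union: members appear directly in the enclosing struct. */
struct sioq_args_anon {
	int err;
	union {
		struct create_args create;
		struct mkdir_args mkdir;
	};
};

static int named_create_mode(const struct sioq_args_named *a)
{
	return a->u.create.mode;
}

static int anon_create_mode(const struct sioq_args_anon *a)
{
	return a->create.mode;
}
```

The storage layout is the same either way; only the access path gets shorter.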


Jan Engelhardt
--


2006-09-04 07:43:25

by Jan Engelhardt

[permalink] [raw]
Subject: Re: [PATCH 16/22][RFC] Unionfs: Handling of stale inodes


>+static void *stale_follow_link(struct dentry *dent, struct nameidata *nd)
>+{
>+ int err = vfs_follow_link(nd, ERR_PTR(-ESTALE));
>+ return ERR_PTR(err);
>+}

{
return ERR_PTR(vfs_follow_link(nd, ERR_PTR(-ESTALE)));
}?

>+#define ESTALE_ERROR ((void *) (return_ESTALE))

#define ESTALE_ERROR ((void *)return_ESTALE)

That really looks BSDish, violating the number of args ;-)


Jan Engelhardt
--


2006-09-04 07:50:55

by Jan Engelhardt

[permalink] [raw]
Subject: Re: [PATCH 18/22][RFC] Unionfs: Superblock operations


>+/* final actions when unmounting a file system */
>+static void unionfs_put_super(struct super_block *sb)
>+{
>+ int bindex, bstart, bend;
>+ struct unionfs_sb_info *spd;
>+
>+ if ((spd = stopd(sb))) {

Sugg.:

if ((spd = stopd(sb)) == NULL)
return;

>+static struct inode *unionfs_alloc_inode(struct super_block *sb)
>+{
>+ struct unionfs_inode_container *c;
>+
>+ c = (struct unionfs_inode_container *)
>+ kmem_cache_alloc(unionfs_inode_cachep, SLAB_KERNEL);

Nocast.

>+static void init_once(void *v, kmem_cache_t * cachep, unsigned long flags)
>+{
>+ struct unionfs_inode_container *c = (struct unionfs_inode_container *)v;

Nocast.
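
The reason no cast is needed: in C, `void *` converts implicitly to any object pointer type. A userspace sketch, with `calloc()` standing in for the slab allocator and a hypothetical container struct:

```c
#include <stdlib.h>

/* Hypothetical stand-in for the inode container. */
struct inode_container_sketch {
	int generation;
};

/* calloc() returns void *, which converts without a cast. */
static struct inode_container_sketch *container_alloc(void)
{
	struct inode_container_sketch *c = calloc(1, sizeof(*c));

	return c;
}

/* The same holds for a void * parameter, as in a constructor callback. */
static void container_init_once(void *v)
{
	struct inode_container_sketch *c = v;

	c->generation = 1;
}
```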



Jan Engelhardt
--


2006-09-04 07:53:10

by Jan Engelhardt

[permalink] [raw]
Subject: Re: [PATCH 19/22][RFC] Unionfs: Helper macros/inlines


>+/* Dentry macros */
>+static inline struct unionfs_dentry_info *dtopd(const struct dentry *dent)
>+{
>+ return (struct unionfs_dentry_info *)dent->d_fsdata;
>+}

Nocast.


Jan Engelhardt
--


2006-09-04 07:57:58

by Jan Engelhardt

[permalink] [raw]
Subject: Re: [PATCH 20/22][RFC] Unionfs: Internal include file


>+static inline void fist_copy_attr_atime(struct inode *dest,
>+ const struct inode *src)
>+{
>+ dest->i_atime = src->i_atime;
>+}
>+static inline void fist_copy_attr_times(struct inode *dest,
>+ const struct inode *src)
>+{
>+ dest->i_atime = src->i_atime;
>+ dest->i_mtime = src->i_mtime;
>+ dest->i_ctime = src->i_ctime;
>+}
>+static inline void fist_copy_attr_timesizes(struct inode *dest,
>+ const struct inode *src)
>+{
>+ dest->i_atime = src->i_atime;
>+ dest->i_mtime = src->i_mtime;
>+ dest->i_ctime = src->i_ctime;
>+ dest->i_size = src->i_size;
>+ dest->i_blocks = src->i_blocks;
>+}
>+static inline void fist_copy_attr_all(struct inode *dest,
>+ const struct inode *src)
>+{
>+ dest->i_mode = src->i_mode;
>+ /* we do not need to copy if the file is a deleted file */
>+ if (dest->i_nlink > 0)
>+ dest->i_nlink = get_nlinks(dest);
>+ dest->i_uid = src->i_uid;
>+ dest->i_gid = src->i_gid;
>+ dest->i_rdev = src->i_rdev;
>+ dest->i_atime = src->i_atime;
>+ dest->i_mtime = src->i_mtime;
>+ dest->i_ctime = src->i_ctime;
>+ dest->i_blksize = src->i_blksize;
>+ dest->i_blkbits = src->i_blkbits;
>+ dest->i_size = src->i_size;
>+ dest->i_blocks = src->i_blocks;
>+ dest->i_flags = src->i_flags;
>+}

Add some empty lines between the functions.

>+static inline int branchperms(struct super_block *sb, int index)
>+{
>+ BUG_ON(index < 0);
>+
>+ return stopd(sb)->usi_data[index].branchperms;
>+}
>+static inline int set_branchperms(struct super_block *sb, int index, int perms)
>+{
>+ BUG_ON(index < 0);
>+
>+ stopd(sb)->usi_data[index].branchperms = perms;
>+
>+ return perms;
>+}

>+/* What do we use for whiteouts. */
>+#define UNIONFS_WHPFX ".wh."
>+#define UNIONFS_WHLEN 4

#define UNIONFS_WHLEN (sizeof(UNIONFS_WHPFX) - 1)
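
A small userspace sketch of why deriving the length helps: the prefix and its length cannot drift apart. The `SKETCH_` names and `is_whiteout_name()` are invented for this example and are not part of the patch.

```c
#include <string.h>

#define SKETCH_WHPFX ".wh."
/* sizeof counts the trailing NUL, hence the - 1. */
#define SKETCH_WHLEN (sizeof(SKETCH_WHPFX) - 1)

/* Nonzero if name starts with the whiteout prefix. */
static int is_whiteout_name(const char *name)
{
	return strncmp(name, SKETCH_WHPFX, SKETCH_WHLEN) == 0;
}
```

Changing the prefix string then updates the length at compile time, with no second constant to edit.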

>+/*
>+ * MACROS:
>+ */
>+
>+#ifndef SEEK_SET
>+#define SEEK_SET 0
>+#endif /* not SEEK_SET */
>+
>+#ifndef SEEK_CUR
>+#define SEEK_CUR 1
>+#endif /* not SEEK_CUR */
>+
>+#ifndef SEEK_END
>+#define SEEK_END 2
>+#endif /* not SEEK_END */

These should really be in include/linux/. Currently, everybody
replicates them or uses hard-coded values :-(

linux-2.6.17.11-jen33$ grep -r SEEK_CUR .;
./drivers/char/mbcs.c: case 1: /* SEEK_CUR */
./drivers/isdn/hardware/eicon/dsp_defs.h:#define OS_SEEK_CUR 1
./drivers/mtd/mtdchar.c: /* SEEK_CUR */
./fs/unionfs/dirfops.c: case SEEK_CUR:
./fs/unionfs/dirfops.c: case SEEK_CUR:
./fs/unionfs/unionfs.h:#ifndef SEEK_CUR
./fs/unionfs/unionfs.h:#define SEEK_CUR 1
./fs/unionfs/unionfs.h:#endif /* not SEEK_CUR */
./fs/xfs/xfs_vnodeops.c: case 1: /*SEEK_CUR*/
./fs/locks.c: case 1: /*SEEK_CUR*/
./fs/locks.c: case 1: /*SEEK_CUR*/
./security/rpldev.c:#ifndef SEEK_CUR
./security/rpldev.c:# define SEEK_CUR 1
./security/rpldev.c: (BufRP). Thus, the only accepted origin is SEEK_CUR, or SEEK_END
./security/rpldev.c: if(origin != SEEK_CUR)
./security/rpldev.c: return rpldev_seek(filp, arg, SEEK_CUR);
./sound/core/info.c: case 1: /* SEEK_CUR */
./sound/drivers/opl4/opl4_proc.c: case 1: /* SEEK_CUR */
./sound/isa/gus/gus_mem_proc.c: case 1: /* SEEK_CUR */
./sound/pci/mixart/mixart.c: case 1: /* SEEK_CUR */
./sound/pci/mixart/mixart.c: case 1: /* SEEK_CUR */
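
For comparison, a userspace sketch of a seek routine written against the named constants rather than `case 1:` with a comment. Here the constants come from `<stdio.h>`; `sketch_llseek()` and its parameters are hypothetical and do not correspond to any of the files listed above.

```c
#include <stdio.h>	/* SEEK_SET, SEEK_CUR, SEEK_END */

/* Returns the new offset, or -1 for an unknown origin or a negative result. */
static long sketch_llseek(long cur, long size, long offset, int origin)
{
	long pos;

	switch (origin) {
	case SEEK_SET:
		pos = offset;
		break;
	case SEEK_CUR:
		pos = cur + offset;
		break;
	case SEEK_END:
		pos = size + offset;
		break;
	default:
		return -1;
	}
	return pos < 0 ? -1 : pos;
}
```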




Jan Engelhardt
--


2006-09-04 07:59:35

by Jan Engelhardt

[permalink] [raw]
Subject: Re: [PATCH 22/22][RFC] Unionfs: Include file

On Aug 31 2006 22:02, Josef Sipek wrote:

>Date: Thu, 31 Aug 2006 22:02:22 -0400
>From: Josef Sipek <[email protected]>
>To: [email protected]
>Cc: [email protected], [email protected], [email protected],
> [email protected]
>Subject: [PATCH 22/22][RFC] Unionfs: Include file
>
>From: Josef "Jeff" Sipek <[email protected]>
>
>Global include file - can be included from userspace by utilities.
>
>Signed-off-by: Josef "Jeff" Sipek <[email protected]>
>Signed-off-by: David Quigley <[email protected]>
>Signed-off-by: Erez Zadok <[email protected]>
>
>---
>
> include/linux/union_fs.h | 20 ++++++++++++++++++++
> 1 file changed, 20 insertions(+)
>
>diff -Nur -x linux-2.6-git/Documentation/dontdiff linux-2.6-git/include/linux/union_fs.h linux-2.6-git-unionfs/include/linux/union_fs.h
>--- linux-2.6-git/include/linux/union_fs.h 1969-12-31 19:00:00.000000000 -0500
>+++ linux-2.6-git-unionfs/include/linux/union_fs.h 2006-08-31 19:04:04.000000000 -0400
>@@ -0,0 +1,20 @@
>+#ifndef _LINUX_UNION_FS_H
>+#define _LINUX_UNION_FS_H
>+
>+#define UNIONFS_VERSION "2.0"
>+/*
>+ * DEFINITIONS FOR USER AND KERNEL CODE:
>+ * (Note: ioctl numbers 1--9 are reserved for fistgen, the rest
>+ * are auto-generated automatically based on the user's .fist file.)
>+ */
>+# define UNIONFS_IOCTL_INCGEN _IOR(0x15, 11, int)
>+# define UNIONFS_IOCTL_QUERYFILE _IOR(0x15, 15, int)
>+
>+/* We don't support normal remount, but unionctl uses it. */
>+# define UNIONFS_REMOUNT_MAGIC 0x4a5a4380
>+
>+/* should be at least LAST_USED_UNIONFS_PERMISSION<<1 */
>+#define MAY_NFSRO 16
>+
>+#endif /* _LINUX_UNIONFS_H */
>+

Jan Engelhardt
--


2006-09-04 08:25:11

by Andreas Schwab

[permalink] [raw]
Subject: Re: [PATCH 18/22][RFC] Unionfs: Superblock operations

Jan Engelhardt <[email protected]> writes:

>>+/* final actions when unmounting a file system */
>>+static void unionfs_put_super(struct super_block *sb)
>>+{
>>+ int bindex, bstart, bend;
>>+ struct unionfs_sb_info *spd;
>>+
>>+ if ((spd = stopd(sb))) {
>
> Sugg.:
>
> if((spd = stopd(sb)) == NULL)
> return;

Better:

spd = stopd(sb);
if (!spd)
return;

Andreas.

--
Andreas Schwab, SuSE Labs, [email protected]
SuSE Linux Products GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany
PGP key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5
"And now for something completely different."


2006-09-04 09:26:07

by Josef Sipek

[permalink] [raw]
Subject: Re: [PATCH 05/22][RFC] Unionfs: Copyup Functionality

On Mon, Sep 04, 2006 at 08:59:08AM +0200, Jan Engelhardt wrote:
> >+ newattrs.ia_atime = old_hidden_dentry->d_inode->i_atime;
> >+ newattrs.ia_mtime = old_hidden_dentry->d_inode->i_mtime;
> >+ newattrs.ia_ctime = old_hidden_dentry->d_inode->i_ctime;
>
> How about,
>
> struct inode *ohi = old_hidden_dentry->d_inode;
> newattrs.ia_atime = ohi->i_atime;
>
> reduces typing a little.

Makes sense.

> >+ } else {
> >+ printk(KERN_ERR "Unknown inode type %d\n",
> >+ old_mode);
> >+ BUG();
> >+ }
>
> Is BUG the right thing, what do others think? (Using WARN, and set err to
> something useful?)

Well, it is definitely a condition which Unionfs doesn't expect - if it
doesn't know about the type, how could it copy it up?

> >+ if (!path)
> >+ ;
>
> Woha?!

Eeek. Good catch. The 'goto out' disappeared somehow.

Thanks for the comments.

Josef 'Jeff' Sipek.

--
Real Programmers consider "what you see is what you get" to be just as bad a
concept in Text Editors as it is in women. No, the Real Programmer wants a
"you asked for it, you got it" text editor -- complicated, cryptic,
powerful, unforgiving, dangerous.


2006-09-04 10:46:34

by Jan Engelhardt

[permalink] [raw]
Subject: Re: [PATCH 05/22][RFC] Unionfs: Copyup Functionality


>> Is BUG the right thing, what do others think? (Using WARN, and set err to
>> something useful?)
>
>Well, it is definitely a condition which Unionfs doesn't expect - if it
>doesn't know about the type, how could it copy it up?

Other filesystems don't seem to BUG either (at least I have not run into
that yet) when - for whatever reasons - the statdata of a dentry is
fubared. `ls` just displays it then, like

?-w---Sr-T 1 root root 4294967295 date fubared_file



Jan Engelhardt
--


2006-09-04 11:01:45

by Pekka Enberg

[permalink] [raw]
Subject: Re: Re: [PATCH 00/22][RFC] Unionfs: Stackable Namespace Unification Filesystem

On 9/3/06, Josef Sipek <[email protected]> wrote:
> I think you misunderstood my comment. What I meant to say was that there is
> _no way_ you can compile a filesystem that has only dentry ops but not
> superblock ops - this would happen if you tried to bisect and you landed
> half way in the series of commits for the filesystem. For the _initial_
> commit one cset makes sense. For subsequent fixes one commit per fix is the
> only logical thing to do.

Reorder the patches so that Makefile and Kconfig changes come last and
git bisect will work just fine.


2006-09-04 11:21:18

by Pekka Enberg

[permalink] [raw]
Subject: Re: [PATCH 03/22][RFC] Unionfs: Branch management functionality

On 9/1/06, Josef Sipek <[email protected]> wrote:
> +struct dentry **alloc_new_dentries(int objs)
> +{
> + if (!objs)
> + return NULL;
> +
> + return kzalloc(sizeof(struct dentry *) * objs, GFP_KERNEL);

kcalloc

> +struct unionfs_usi_data *alloc_new_data(int objs)
> +{
> + if (!objs)
> + return NULL;
> +
> + return kzalloc(sizeof(struct unionfs_usi_data) * objs, GFP_KERNEL);
> +}

Same here. I suggest you kill the wrappers too.
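
The point of kcalloc is that the element count and element size are passed separately, so the allocator can reject a product that overflows instead of silently allocating too little. A userspace analogue with `calloc()`, which takes the same two arguments; `struct dentry_sketch` and `alloc_dentry_array()` are hypothetical names for this sketch:

```c
#include <stdlib.h>

struct dentry_sketch;	/* opaque; only pointers to it are stored */

/* Returns a zeroed array of objs pointers, or NULL if objs is 0 or on failure. */
static struct dentry_sketch **alloc_dentry_array(size_t objs)
{
	if (objs == 0)
		return NULL;
	/* Count and size stay separate; no open-coded multiplication. */
	return calloc(objs, sizeof(struct dentry_sketch *));
}
```

Once the body is a single allocator call like this, the wrapper adds little, which is presumably the reason for suggesting its removal.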

> +int unionfs_ioctl_incgen(struct file *file, unsigned int cmd, unsigned long arg)
> +{
> + struct super_block *sb;
> + int gen;
> +
> + sb = file->f_dentry->d_sb;
> +
> + unionfs_write_lock(sb);
> +
> + atomic_inc(&stopd(sb)->usi_generation);
> + gen = atomic_read(&stopd(sb)->usi_generation);

You could use atomic_inc_return here. Is usi_generation protected by
write lock on sb or do you really need atomic ops?

> +
> + atomic_set(&dtopd(sb->s_root)->udi_generation, gen);
> + atomic_set(&itopd(sb->s_root->d_inode)->uii_generation, gen);
> +
> + unionfs_write_unlock(sb);
> +
> + return gen;
> +}
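
A userspace sketch of the increment-and-return idea using C11 `<stdatomic.h>`. With a separate increment and read, another increment could land between the two unless a lock already serializes them; a single read-modify-write avoids relying on that. `bump_generation()` is a made-up name, and this is only an analogue of atomic_inc_return, not kernel code.

```c
#include <stdatomic.h>

/* Increment the counter and return the new value in one atomic step. */
static int bump_generation(atomic_int *gen)
{
	/* atomic_fetch_add() returns the value held before the addition. */
	return atomic_fetch_add(gen, 1) + 1;
}
```

If the write lock already covers every access to the counter, a plain integer would do as well, which is the second half of the question above.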


2006-09-04 12:33:50

by Pavel Machek

[permalink] [raw]
Subject: Re: [PATCH 00/22][RFC] Unionfs: Stackable Namespace Unification Filesystem

Hi!

> - Modifying a Unionfs branch directly, while the union is mounted, is
> currently unsupported. Any such change may cause Unionfs to oops and it
> can even result in data loss!

I'm not sure if that is acceptable. Even root user should be unable to
oops the kernel using 'normal' actions.
Pavel
--
Thanks for all the (sleeping) penguins.


2006-09-04 12:57:51

by Jörn Engel

[permalink] [raw]
Subject: Re: [PATCH 00/22][RFC] Unionfs: Stackable Namespace Unification Filesystem

On Sun, 3 September 2006 11:05:08 +0000, Pavel Machek wrote:
>
> > - Modifying a Unionfs branch directly, while the union is mounted, is
> > currently unsupported. Any such change may cause Unionfs to oops and it
> > can even result in data loss!
>
> I'm not sure if that is acceptable. Even root user should be unable to
> oops the kernel using 'normal' actions.

Direct modification of branches is similar to direct modification of
block devices underneath a mounted filesystem. While I agree that
such a thing _should_ not oops the kernel, I'd bet that you can easily
run a stresstest on a filesystem while randomly flipping bits in the
block device and get just that.

There are bigger problems in unionfs to worry about.

Jörn

--
Simplicity is prerequisite for reliability.
-- Edsger W. Dijkstra

2006-09-04 13:29:46

by Shaya Potter

[permalink] [raw]
Subject: Re: [PATCH 00/22][RFC] Unionfs: Stackable Namespace Unification Filesystem

On Sun, 2006-09-03 at 11:05 +0000, Pavel Machek wrote:
> Hi!
>
> > - Modifying a Unionfs branch directly, while the union is mounted, is
> > currently unsupported. Any such change may cause Unionfs to oops and it
> > can even result in data loss!
>
> I'm not sure if that is acceptable. Even root user should be unable to
> oops the kernel using 'normal' actions.

As I said in the other case: imagine ext2/3 on a SAN file system
where 2 systems try to make use of it. Will they not have issues?

2006-09-04 20:34:08

by Pavel Machek

[permalink] [raw]
Subject: Re: [PATCH 00/22][RFC] Unionfs: Stackable Namespace Unification Filesystem

On Mon 2006-09-04 09:28:26, Shaya Potter wrote:
> On Sun, 2006-09-03 at 11:05 +0000, Pavel Machek wrote:
> > Hi!
> >
> > > - Modifying a Unionfs branch directly, while the union is mounted, is
> > > currently unsupported. Any such change may cause Unionfs to oops and it
> > > can even result in data loss!
> >
> > I'm not sure if that is acceptable. Even root user should be unable to
> > oops the kernel using 'normal' actions.
>
> As I said in the other case: imagine ext2/3 on a SAN file system
> where 2 systems try to make use of it. Will they not have issues?

They probably will have issues (although I'm not sure, perhaps ext2
has been debugged enough), but they'll fix them (as opposed to
document that oopses are okay).
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

2006-09-04 21:44:07

by Shaya Potter

[permalink] [raw]
Subject: Re: [PATCH 00/22][RFC] Unionfs: Stackable Namespace Unification Filesystem

On Mon, 2006-09-04 at 22:33 +0200, Pavel Machek wrote:
> On Mon 2006-09-04 09:28:26, Shaya Potter wrote:
> > On Sun, 2006-09-03 at 11:05 +0000, Pavel Machek wrote:
> > > Hi!
> > >
> > > > - Modifying a Unionfs branch directly, while the union is mounted, is
> > > > currently unsupported. Any such change may cause Unionfs to oops and it
> > > > can even result in data loss!
> > >
> > > I'm not sure if that is acceptable. Even root user should be unable to
> > > oops the kernel using 'normal' actions.
> >
> > As I said in the other case: imagine ext2/3 on a SAN file system
> > where 2 systems try to make use of it. Will they not have issues?
>
> They probably will have issues (although I'm not sure, perhaps ext2
> has been debugged enough), but they'll fix them (as opposed to
> document that oopses are okay).

I agree that unionfs shouldn't oops, it should handle that situation in
a more graceful manner, but once the "backing store" is modified
underneath it, all bets are off for either unionfs or ext2/3 behaving
"correctly" (where "correctly" doesn't just mean handle the error
gracefully).

But are you also 100% sure that messing with the underlying backing
store wouldn't be considered an admin bug as opposed to an administrator
bug? I mean there's nothing that we can do to prevent an administrator
from FUBAR'ing their system by

dd if=/dev/random of=/dev/kmem.

where does one draw the line? I agree that stackable file systems make
this a more pressing issue, as the "backing store" can be visible within
the file system namespace as a regular file system that people are
generally accustomed to interacting with.

2006-09-04 23:31:59

by Josef Sipek

[permalink] [raw]
Subject: Re: [PATCH 00/22][RFC] Unionfs: Stackable Namespace Unification Filesystem

On Mon, Sep 04, 2006 at 05:43:04PM -0400, Shaya Potter wrote:
> On Mon, 2006-09-04 at 22:33 +0200, Pavel Machek wrote:
> > On Mon 2006-09-04 09:28:26, Shaya Potter wrote:
> > > On Sun, 2006-09-03 at 11:05 +0000, Pavel Machek wrote:
> > > > > - Modifying a Unionfs branch directly, while the union is mounted, is
> > > > > currently unsupported. Any such change may cause Unionfs to oops and it
> > > > > can even result in data loss!
> > > >
> > > > I'm not sure if that is acceptable. Even root user should be unable to
> > > > oops the kernel using 'normal' actions.
> > >
> > > As I said in the other case: imagine ext2/3 on a SAN file system
> > > where 2 systems try to make use of it. Will they not have issues?
> >
> > They probably will have issues (although I'm not sure, perhaps ext2
> > has been debugged enough), but they'll fix them (as opposed to
> > document that oopses are okay).
>
> I agree that unionfs shouldn't oops, it should handle that situation in
> a more graceful manner,

Agreed. The solution is to make the VFS a little friendlier to stackable
filesystems. I'm not sure what the best way of accomplishing that would be.
There are some papers about it [1], but my gut feeling is that they are way
too invasive.

> but once the "backing store" is modified underneath it, all bets are off
> for either unionfs or ext2/3 behaving "correctly"

Exactly.

> where does one draw the line? I agree that stackable file systems make
> this a more pressing issue, as the "backing store" can be visible within
> the file system namespace as a regular file system that people are
> generally accustomed to interacting with.

The unionfs (as well as any other stackable filesystem) case can be fixed up
by making the VFS aware of the stack - for example, by having some kind of
stackable filesystem callbacks.

Josef 'Jeff' Sipek.

[1] http://www.isi.edu/people/johnh/PAPERS/Heidemann95e.html

--
Bad pun of the week: The formula 1 control computer suffered from a race
condition

2006-09-04 23:35:22

by Josef Sipek

[permalink] [raw]
Subject: Re: Re: [PATCH 00/22][RFC] Unionfs: Stackable Namespace Unification Filesystem

On Mon, Sep 04, 2006 at 02:01:41PM +0300, Pekka Enberg wrote:
> On 9/3/06, Josef Sipek <[email protected]> wrote:
> >I think you misunderstood my comment. What I meant to say was that there is
> >_no way_ you can compile a filesystem that has only dentry ops but not
> >superblock ops - this would happen if you tried to bisect and you landed
> >half way in the series of commits for the filesystem. For the _initial_
> >commit one cset makes sense. For subsequent fixes one commit per fix is the
> >only logical thing to do.
>
> Reorder the patches so that Makefile and Kconfig changes come last and
> git bisect will work just fine.

Already done.

Josef 'Jeff' Sipek.

--
Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are, by
definition, not smart enough to debug it.
- Brian W. Kernighan

2006-09-05 03:09:17

by Trond Myklebust

[permalink] [raw]
Subject: Re: [PATCH 00/22][RFC] Unionfs: Stackable Namespace Unification Filesystem

On Mon, 2006-09-04 at 09:28 -0400, Shaya Potter wrote:
> On Sun, 2006-09-03 at 11:05 +0000, Pavel Machek wrote:
> > Hi!
> >
> > > - Modifying a Unionfs branch directly, while the union is mounted, is
> > > currently unsupported. Any such change may cause Unionfs to oops and it
> > > can even result in data loss!
> >
> > I'm not sure if that is acceptable. Even root user should be unable to
> > oops the kernel using 'normal' actions.
>
> As I said in the other case: imagine ext2/3 on a SAN file system
> where 2 systems try to make use of it. Will they not have issues?

Yes, but you are deliberately ignoring that NAS systems like CIFS or NFS
don't, and neither do clustered filesystems. Users of those systems
don't expect them to have issues with that sort of scenario.


2006-09-05 03:29:35

by Shaya Potter

[permalink] [raw]
Subject: Re: [PATCH 00/22][RFC] Unionfs: Stackable Namespace Unification Filesystem

On Mon, 2006-09-04 at 23:08 -0400, Trond Myklebust wrote:
> On Mon, 2006-09-04 at 09:28 -0400, Shaya Potter wrote:
> > On Sun, 2006-09-03 at 11:05 +0000, Pavel Machek wrote:
> > > Hi!
> > >
> > > > - Modifying a Unionfs branch directly, while the union is mounted, is
> > > > currently unsupported. Any such change may cause Unionfs to oops and it
> > > > can even result in data loss!
> > >
> > > I'm not sure if that is acceptable. Even root user should be unable to
> > > oops the kernel using 'normal' actions.
> >
> > As I said in the other case: imagine ext2/3 on a SAN file system
> > where 2 systems try to make use of it. Will they not have issues?
>
> Yes, but you are deliberately ignoring that NAS systems like CIFS or NFS
> don't, and neither do clustered filesystems. Users of those systems
> don't expect them to have issues with that sort of scenario.

No. I just view them as a backing store type system. Yes, if you use
unionfs in an nfs context you better be sure about how the nfs backing
store is going to be used (i.e. read-only or only used by a single
user), just like if you put ext2/3 on a san block device, you better be
sure that either it's only used read-only or only used by a single user.

Yes, unionfs enables you to use the backing store "incorrectly", but so
do ext2/3 or any other non clustered file system when used on a SAN.

2006-09-05 04:45:44

by Al Boldi

[permalink] [raw]
Subject: Re: [PATCH 00/22][RFC] Unionfs: Stackable Namespace Unification Filesystem

Jörn Engel wrote:
> On Sun, 3 September 2006 11:05:08 +0000, Pavel Machek wrote:
> > > - Modifying a Unionfs branch directly, while the union is mounted, is
> > > currently unsupported. Any such change may cause Unionfs to oops
> > > and it can even result in data loss!
> >
> > I'm not sure if that is acceptable. Even root user should be unable to
> > oops the kernel using 'normal' actions.
>
> Direct modification of branches is similar to direct modification of
> block devices underneath a mounted filesystem. While I agree that
> such a thing _should_ not oops the kernel, I'd bet that you can easily
> run a stresstest on a filesystem while randomly flipping bits in the
> block device and get just that.

Not really a fair comparison. The block level is conceptually totally
different than the fs level, while a stackable fs is within the realms of
the fs level.

> There are bigger problems in unionfs to worry about.

Agreed. Moving basic functionality abstractions into the VFS could easily
alleviate these kinds of problems.


Thanks!

--
Al

2006-09-05 06:10:24

by Jan Engelhardt

[permalink] [raw]
Subject: Re: [PATCH 00/22][RFC] Unionfs: Stackable Namespace Unification Filesystem

>
>I agree that unionfs shouldn't oops, it should handle that situation in
>a more graceful manner, but once the "backing store" is modified
>underneath it, all bets are off for either unionfs or ext2/3 behaving
>"correctly" (where "correctly" doesn't just mean handle the error
>gracefully).
>
>But are you also 100% sure that messing with the underlying backing
>store wouldn't be considered an admin bug as opposed to an administrator
>bug? I mean there's nothing that we can do to prevent an administrator
>from FUBAR'ing their system by
>
>dd if=/dev/random of=/dev/kmem.
>
>where does one draw the line? I agree that stackable file systems make
>this a more pressing issue, as the "backing store" can be visible within
>the file system namespace as a regular file system that people are
>generally accustomed to interacting with.

So here's an idea. When a branch is added, mount an empty space onto the
branch. (From within the kernel, so it appears as a side-effect of mount(2))

mount -t unionfs -o dirs=/a=rw:/b=ro none /union

should imply something like

mount --bind /var/lib/empty /a
mount --bind /var/lib/empty /b

Or better, yet, make them read-only:

mount --rbind -o ro /a /a
mount --rbind -o ro /b /b
(hope this works as intended?)

So that no one can tinker with /a and /b while the union is mounted.


Jan Engelhardt
--

2006-09-05 07:01:27

by Jörn Engel

[permalink] [raw]
Subject: Re: [PATCH 00/22][RFC] Unionfs: Stackable Namespace Unification Filesystem

On Tue, 5 September 2006 07:46:44 +0300, Al Boldi wrote:
> Jörn Engel wrote:
> >
> > Direct modification of branches is similar to direct modification of
> > block devices underneath a mounted filesystem. While I agree that
> > such a thing _should_ not oops the kernel, I'd bet that you can easily
> > run a stresstest on a filesystem while randomly flipping bits in the
> > block device and get just that.
>
> Not really a fair comparison. The block level is conceptually totally
> different than the fs level, while a stackable fs is within the realms of
> the fs level.

Well, I didn't realize that unionfs required its backing filesystems
to be mounted. That's more like having the block device open in a
text editor while mounting ext3. In the presence of such a design, an
oops clearly is not acceptable. And this sort of design is just what
I was talking about when I said:

> > There are bigger problems in unionfs to worry about.

Jörn

--
You can't tell where a program is going to spend its time. Bottlenecks
occur in surprising places, so don't try to second guess and put in a
speed hack until you've proven that's where the bottleneck is.
-- Rob Pike

2006-09-05 13:03:38

by Shaya Potter

[permalink] [raw]
Subject: Re: [PATCH 00/22][RFC] Unionfs: Stackable Namespace Unification Filesystem

On Tue, 2006-09-05 at 08:02 +0200, Jan Engelhardt wrote:
> >
> >I agree that unionfs shouldn't oops, it should handle that situation in
> >a more graceful manner, but once the "backing store" is modified
> >underneath it, all bets are off for either unionfs or ext2/3 behaving
> >"correctly" (where "correctly" doesn't just mean handle the error
> >gracefully).
> >
> >But are you also 100% sure that messing with the underlying backing
> >store wouldn't be considered an admin bug as opposed to an administrator
> >bug? I mean there's nothing that we can do to prevent an administrator
> >from FUBAR'ing their system by
> >
> >dd if=/dev/random of=/dev/kmem.
> >
> >where does one draw the line? I agree that stackable file systems make
> >this a more pressing issue, as the "backing store" can be visible within
> >the file system namespace as a regular file system that people are
> >generally accustomed to interacting with.
>
> So here's an idea. When a branch is added, mount an empty space onto the
> branch. (From within the kernel, so it appears as a side-effect of mount(2))
>
> mount -t unionfs -o dirs=/a=rw:/b=ro none /union
>
> should imply something like
>
> mount --bind /var/lib/empty /a
> mount --bind /var/lib/empty /b
>
> Or better, yet, make them read-only:
>
> mount --rbind -o ro /a /a
> mount --rbind -o ro /b /b
> (hope this works as intended?)
>
> So that no one can tinker with /a and /b while the union is mounted.

I thought about that, but that doesn't help you w/ the NFS as branch
case.

2006-09-16 22:14:04

by Josef Sipek

[permalink] [raw]
Subject: Re: [PATCH 05/22][RFC] Unionfs: Copyup Functionality

On Mon, Sep 04, 2006 at 12:41:58PM +0200, Jan Engelhardt wrote:
>
> >> Is BUG the right thing, what do others think? (Using WARN, and set err to
> >> something useful?)
> >
> >Well, it is definitely a condition which Unionfs doesn't expect - if it
> >doesn't know about the type, how could it copy it up?
>
> Other filesystems don't seem to BUG either (at least I have not run into
> that yet) when - for whatever reasons - the statdata of a dentry is
> fubared. `ls` just displays it then, like
>
> ?-w---Sr-T 1 root root 4294967295 date fubared_file

I was thinking about this, and the difference between "other filesystems"
and unionfs in this case is that the example above is just stat. During
copyup, unionfs has to copy the file to another filesystem. How is it
supposed to do that when it doesn't understand what the file is?

Sure, when unionfs does stat, fubared statdata is fine, but during
copyup...bad things could potentially happen.

Any suggestions how to copyup an unknown file type?

Josef 'Jeff' Sipek.

--
In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like
that.
- Linus Torvalds

2006-09-16 22:30:20

by Jan Engelhardt

[permalink] [raw]
Subject: Re: [PATCH 05/22][RFC] Unionfs: Copyup Functionality

>> >> Is BUG the right thing, what do others think? (Using WARN, and set err to
>> >> something useful?)
>> >
>> >Well, it is definitely a condition which Unionfs doesn't expect - if it
>> >doesn't know about the type, how could it copy it up?
>>
>> Other filesystems don't seem to BUG either (at least I have not run into
>> that yet) when - for whatever reasons - the statdata of a dentry is
>> fubared. `ls` just displays it then, like
>>
>> ?-w---Sr-T 1 root root 4294967295 date fubared_file
>
>I was thinking about this, and the difference between "other filesystems"
>and unionfs in this case is that the example above is just stat. During
>copyup, unionfs has to copy the file to another filesystem. How is it
>supposed to do that when it doesn't understand what the file is?
>
>Sure, when unionfs does stat, fubared statdata is fine, but during
>copyup...bad things could potentially happen.
>
>Any suggestions how to copyup an unknown file type?

Return some error value if possible. -EIO, banners, big printk()s,
anything. Tell the user the filesystem on some branch is hosed - or
too advanced to be understood by unionfs. (There seems to be a
variety of file types in other UNIXes according to `man 2 stat`,
such as doors, etc.)
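
A userspace sketch of the "return an error" approach: classify the mode first and fail with -EIO for anything unrecognized, rather than aborting. `copyup_check_type()` is a hypothetical helper, and the list of accepted types is an assumption of this example, not a statement of what Unionfs copies up.

```c
#define _XOPEN_SOURCE 700	/* for the S_IS*() macros under strict -std= modes */
#include <errno.h>
#include <sys/stat.h>

/* Returns 0 if the file type is one this sketch recognizes, -EIO otherwise. */
static int copyup_check_type(mode_t mode)
{
	if (S_ISREG(mode) || S_ISDIR(mode) || S_ISLNK(mode) ||
	    S_ISCHR(mode) || S_ISBLK(mode) || S_ISFIFO(mode) ||
	    S_ISSOCK(mode))
		return 0;
	/* Unknown or corrupted type bits: report it, do not crash. */
	return -EIO;
}
```

A caller would propagate the error to the operation that triggered the copyup, ideally together with a log message naming the branch.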



Jan Engelhardt
--

2006-09-17 01:30:53

by Shaya Potter

[permalink] [raw]
Subject: Re: [PATCH 05/22][RFC] Unionfs: Copyup Functionality

On Sat, 2006-09-16 at 18:13 -0400, Josef Sipek wrote:
> On Mon, Sep 04, 2006 at 12:41:58PM +0200, Jan Engelhardt wrote:
> >
> > >> Is BUG the right thing, what do others think? (Using WARN, and set err to
> > >> something useful?)
> > >
> > >Well, it is definitely a condition which Unionfs doesn't expect - if it
> > >doesn't know about the type, how could it copy it up?
> >
> > Other filesystems don't seem to BUG either (at least I have not run into
> > that yet) when - for whatever reasons - the statdata of a dentry is
> > fubared. `ls` just displays it then, like
> >
> > ?-w---Sr-T 1 root root 4294967295 date fubared_file
>
> I was thinking about this, and the difference between "other filesystems"
> and unionfs in this case is that the example above is just stat. During
> copyup, unionfs has to copy the file to another filesystem. How is it
> supposed to do that when it doesn't understand what the file is?
>
> Sure, when unionfs does stat, fubared statdata is fine, but during
> copyup...bad things could potentially happen.
>
> Any suggestions how to copyup an unknown file type?

copyup is only required if a file is going to be modified. refuse to
modify (or perhaps even open for write) an unknown file? i.e. calling
BUG() is bad when it can be cleanly handled much earlier in the chain.