2010-07-12 06:38:43

by Aneesh Kumar K.V

Subject: [PATCH -V16 0/12] Generic name to handle and open by handle syscalls

Hi,

The following patches implement open-by-handle support using exportfs
operations. This allows a user-space application to map a file name to a file
handle and later open the file using that handle. This should be usable
for a userspace NFS server [1] and a 9P server [2]. XFS already supports this with the ioctls
XFS_IOC_PATH_TO_HANDLE and XFS_IOC_OPEN_BY_HANDLE.

[1] http://nfs-ganesha.sourceforge.net/
[2] http://thread.gmane.org/gmane.comp.emulators.qemu/68992

git repo for the patchset at:

git://git.kernel.org/pub/scm/linux/kernel/git/kvaneesh/linux-open-handle.git open-by-handle

Changes from V15:
a) Add mount id to file handle
b) Add support for optional uuid tag in /proc/<pid>/mountinfo
c) Added MAY_OPEN_LINK to support open on a symlink instead of
adding a new may_open function. Also limited the open flag
for symlinks to O_RDONLY.

Changes from V14:
a) Use fget_light instead of fget in the patches
b) Drop uuid from struct file_handle as per the last review

Changes from V13:
a) Add support for file descriptor to handle conversion. This is needed
so that we find the right file handle for newly created files.

Changes from V12:
a) Use CAP_DAC_READ_SEARCH instead of CAP_DAC_OVERRIDE in open_by_handle
b) Return -ENOTDIR if O_DIRECTORY flag is specified in open_by_handle with
handle for non directory

Changes from V11:
a) Add necessary documentation to different functions
b) Add null pathname support to faccessat and linkat similar to
readlinkat.
c) compile fix on x86_64

Changes from V10:
a) Missed an stg refresh before sending out the patchset. Send
updated patchset.

Changes from V9:
a) Fix compile errors with CONFIG_EXPORTFS not defined
b) Return -EOPNOTSUPP if file system doesn't support fh_to_dentry exportfs callback.

Changes from V8:
a) exportfs_decode_fh now returns -ESTALE if export operations are not defined.
b) drop get_fsid super_operations. Instead use superblock to store uuid.

Changes from V7:
a) open_by_handle now uses mountdirfd to identify the vfsmount.
b) We don't validate the UUID passed as part of the file handle in open_by_handle.
The UUID is provided as part of the file handle so that userspace can easily
use the kernel-returned handle as is. It also helps in finding the 16-byte
filesystem UUID in userspace without using file-system-specific libraries to
read the file system superblock. If a particular file system doesn't support a UUID
or any form of unique id, this field in the file handle will be zero-filled.
c) Drop the freadlink syscall. Instead use readlinkat with a NULL pathname to
read the target name of the link pointed to by fd. This is similar to
sys_utimensat.
d) Instead of open-coding all the open-flag-related checks, use helper functions.
Added finish_open_by_handle, similar to finish_open.
e) Fix may_open to not return ELOOP for a symlink when called from handle open.
open(2) still returns the error as expected.

Changes from V6:
a) Add uuid to vfsmount lookup and drop uuid to superblock lookup
b) Return -EOPNOTSUPP in sys_name_to_handle if the UUID returned by the file system
doesn't give the same vfsmount on lookup. This ensures that
sys_name_to_handle fails when multiple file systems return the same UUID.

Changes from V5:
a) added sys_name_to_handle_at syscall which takes AT_SYMLINK_NOFOLLOW flag
instead of two syscalls sys_name_to_handle and sys_lname_to_handle.
b) addressed review comments from Neil Brown
c) rebased to b91ce4d14a21fc04d165be30319541e0f9204f15
d) Add compat_sys_open_by_handle

Changes from V4:
a) Changed the syscall arguments so that we don't need compat syscalls,
as suggested by Christoph
c) Added two new syscalls, sys_lname_to_handle and sys_freadlink, to work with
symlinks
d) Changed open_by_handle to work with all file types
e) Add ext3 support

Changes from V3:
a) Code cleanup suggested by Andreas
b) x86_64 syscall support
c) add compat syscall

Changes from V2:
a) Support system wide unique handle.

Changes from v1:
a) handle size is now specified in bytes
b) returns -EOVERFLOW if the handle size is small
c) dropped open_handle syscall and added open_by_handle_at syscall
open_by_handle_at takes mount_fd as the directory fd of the mount point
containing the file
e) handle will only be unique within a given file system. So an NFS server
exporting multiple file systems will have to internally track the
mount point to which a file handle belongs. We should be able to do that much
more easily than expecting the kernel to give a system-wide unique file handle. A
system-wide unique file handle would need much larger changes to the exportfs or VFS
interface, and I was not sure whether we really need to do that in the kernel or
in user space.
f) open_handle_at now only checks for the DAC_OVERRIDE capability
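
The size negotiation from items a) and b) above can be sketched in plain
user-space C. This is only a model under stated assumptions, not the kernel
code: struct model_handle, model_encode() and model_get_handle() are
hypothetical names, and model_encode() merely stands in for the kernel side,
which reports the required size in handle_size and fails with EOVERFLOW when
the caller's buffer is too small.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical model of a variable-size handle (not the kernel struct). */
struct model_handle {
	int handle_size;	/* in: bytes available, out: bytes used or needed */
	unsigned char data[];	/* variable part of the handle */
};

/*
 * Hypothetical stand-in for the kernel side: always report the needed
 * size, and fail with -EOVERFLOW if the caller's buffer is too small.
 */
static int model_encode(struct model_handle *h, const void *src, int src_len)
{
	int avail;

	assert(h != NULL && src != NULL && src_len >= 0);
	avail = h->handle_size;
	h->handle_size = src_len;
	if (src_len > avail)
		return -EOVERFLOW;
	memcpy(h->data, src, (size_t)src_len);
	return 0;
}

/* Caller side: probe with a zero-size buffer, then retry once with the
 * size reported back. Returns NULL on any failure. */
static struct model_handle *model_get_handle(const void *src, int src_len)
{
	struct model_handle *h = calloc(1, sizeof(*h));
	int ret, need;

	if (!h)
		return NULL;
	ret = model_encode(h, src, src_len);
	if (ret == -EOVERFLOW) {
		need = h->handle_size;
		free(h);
		h = calloc(1, sizeof(*h) + (size_t)need);
		if (!h)
			return NULL;
		h->handle_size = need;
		ret = model_encode(h, src, src_len);
	}
	if (ret) {
		free(h);
		return NULL;
	}
	return h;
}
```

A caller would use model_get_handle(bytes, len) and free() the result; the
first call to model_encode() is expected to fail and only serves to learn the
size, which is the same two-pass pattern the example program below follows
with the real syscall.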


Example program (x86_32; x86_64 would need different syscall numbers):
-------
cc <source.c>
--------
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>

#include <fcntl.h>
#include <unistd.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <string.h>

/* Userspace copy of the handle layout used by this patch series */
struct file_handle {
	int mnt_id;
	int handle_size;
	int handle_type;
	unsigned char handle[0];
};

#define AT_FDCWD		-100
#define AT_SYMLINK_FOLLOW	0x400

/* 338 = name_to_handle_at, 339 = open_by_handle_at on x86_32 */
static int name_to_handle(const char *name, struct file_handle *fh)
{
	return syscall(338, AT_FDCWD, name, fh, AT_SYMLINK_FOLLOW);
}

/* Like name_to_handle(), but does not follow a trailing symlink */
static int lname_to_handle(const char *name, struct file_handle *fh)
{
	return syscall(338, AT_FDCWD, name, fh, 0);
}

/* NULL name: build the handle from an already open file descriptor */
static int fd_to_handle(int fd, struct file_handle *fh)
{
	return syscall(338, fd, NULL, fh, AT_SYMLINK_FOLLOW);
}

static int open_by_handle(int mountfd, struct file_handle *fh, int flags)
{
	return syscall(339, mountfd, fh, flags);
}

#define BUFSZ 100
int main(int argc, char *argv[])
{
	int fd;
	int ret, done = 0;
	int mountfd;
	int handle_sz;
	struct stat bufstat;
	char buf[BUFSZ];
	struct file_handle *fh = NULL;

	if (argc != 3) {
		printf("Usage: %s <filename> <mount-dir-name>\n", argv[0]);
		exit(1);
	}
again:
	/* First pass probes with size 0; second pass uses the size returned */
	if (fh && fh->handle_size) {
		handle_sz = fh->handle_size;
		free(fh);
		fh = malloc(sizeof(struct file_handle) + handle_sz);
		if (fh)
			fh->handle_size = handle_sz;
	} else {
		fh = malloc(sizeof(struct file_handle));
		if (fh)
			fh->handle_size = 0;
	}
	if (!fh) {
		perror("malloc");
		exit(1);
	}
	errno = 0;
	ret = lname_to_handle(argv[1], fh);
	if (ret && errno == EOVERFLOW) {
		printf("Found the handle size needed to be %d\n", fh->handle_size);
		goto again;
	} else if (ret) {
		perror("Error:");
		exit(1);
	}
do_again:
	printf("found mount_id %d\n", fh->mnt_id);
	printf("Waiting for input");
	getchar();
	mountfd = open(argv[2], O_RDONLY | O_DIRECTORY);
	if (mountfd < 0) {	/* 0 is a valid descriptor */
		perror("Error:");
		exit(1);
	}
	fd = open_by_handle(mountfd, fh, O_RDONLY);
	if (fd < 0) {
		perror("Error:");
		exit(1);
	}
	printf("Reading the content now \n");
	fstat(fd, &bufstat);
	ret = S_ISLNK(bufstat.st_mode);
	if (ret) {
		memset(buf, 0, BUFSZ);
		/* NULL pathname: read the link the descriptor refers to */
		readlinkat(fd, NULL, buf, BUFSZ);
		printf("%s is a symlink pointing to %s\n", argv[1], buf);
	}
	memset(buf, 0, BUFSZ);
	while (1) {
		ret = read(fd, buf, BUFSZ - 1);
		if (ret <= 0)
			break;
		buf[ret] = '\0';
		printf("%s", buf);
		memset(buf, 0, BUFSZ);
	}
	/* Now check for faccess */
	if (faccessat(fd, NULL, W_OK, 0) == 0) {
		printf("Got write permission on the file \n");
	} else
		perror("faccess error");
	/* now try to create a hardlink */
	if (linkat(fd, NULL, AT_FDCWD, "test", 0) == 0) {
		printf("created hardlink\n");
	} else
		perror("linkat error");
	if (done)
		exit(0);
	/* Second round: derive the handle from the open descriptor */
	printf("Map fd to handle \n");
	ret = fd_to_handle(fd, fh);
	if (ret) {
		perror("Error:");
		exit(1);
	}
	done = 1;
	goto do_again;
}

-aneesh


2010-07-12 06:35:59

by Aneesh Kumar K.V

Subject: [PATCH -V16 05/12] vfs: Support null pathname in readlink

From: NeilBrown <[email protected]>

This makes it possible to use readlinkat to get the link target name
from a file descriptor pointing to the link. This can be used
with the open_by_handle syscall, which returns a file descriptor for a link.
We can then use this file descriptor to get the target name.

This is similar to the utimensat(2) interface.
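
The dispatch rule this patch adds (and which the faccessat and linkat patches
in this series repeat) can be modelled in user space as below. This is a
sketch only: enum path_source and pick_path_source() are hypothetical names,
not kernel symbols. It shows that a NULL pathname selects the descriptor's own
path only when dfd is a real descriptor; NULL together with AT_FDCWD still
goes through the normal user_path_at() lookup.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>

#ifndef AT_FDCWD
#define AT_FDCWD -100	/* fallback for strict compilation modes */
#endif

/* Hypothetical labels for the two branches taken in the patch. */
enum path_source {
	PATH_FROM_FD,		/* use file->f_path of the open descriptor */
	PATH_FROM_LOOKUP	/* fall back to user_path_at(dfd, pathname, ...) */
};

/* Mirrors the condition "pathname == NULL && dfd != AT_FDCWD". */
static enum path_source pick_path_source(int dfd, const char *pathname)
{
	if (pathname == NULL && dfd != AT_FDCWD)
		return PATH_FROM_FD;
	return PATH_FROM_LOOKUP;
}

/* Small usage demonstration of the three interesting cases. */
void path_source_demo(void)
{
	assert(pick_path_source(3, NULL) == PATH_FROM_FD);
	assert(pick_path_source(AT_FDCWD, NULL) == PATH_FROM_LOOKUP);
	assert(pick_path_source(3, "link") == PATH_FROM_LOOKUP);
}
```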

Signed-off-by: NeilBrown <[email protected]>
Signed-off-by: Aneesh Kumar K.V <[email protected]>
---
fs/stat.c | 30 ++++++++++++++++++++++--------
1 files changed, 22 insertions(+), 8 deletions(-)

diff --git a/fs/stat.c b/fs/stat.c
index c4ecd52..a66a0ef 100644
--- a/fs/stat.c
+++ b/fs/stat.c
@@ -284,26 +284,40 @@ SYSCALL_DEFINE2(newfstat, unsigned int, fd, struct stat __user *, statbuf)
SYSCALL_DEFINE4(readlinkat, int, dfd, const char __user *, pathname,
char __user *, buf, int, bufsiz)
{
- struct path path;
- int error;
+ int error = 0, fput_needed;
+ struct path path, *pp;
+ struct file *file = NULL;

if (bufsiz <= 0)
return -EINVAL;

- error = user_path_at(dfd, pathname, 0, &path);
+ if (pathname == NULL && dfd != AT_FDCWD) {
+ file = fget_light(dfd, &fput_needed);
+
+ if (file)
+ pp = &file->f_path;
+ else
+ error = -EBADF;
+ } else {
+ error = user_path_at(dfd, pathname, 0, &path);
+ pp = &path;
+ }
if (!error) {
- struct inode *inode = path.dentry->d_inode;
+ struct inode *inode = pp->dentry->d_inode;

error = -EINVAL;
if (inode->i_op->readlink) {
- error = security_inode_readlink(path.dentry);
+ error = security_inode_readlink(pp->dentry);
if (!error) {
- touch_atime(path.mnt, path.dentry);
- error = inode->i_op->readlink(path.dentry,
+ touch_atime(pp->mnt, pp->dentry);
+ error = inode->i_op->readlink(pp->dentry,
buf, bufsiz);
}
}
- path_put(&path);
+ if (file)
+ fput_light(file, fput_needed);
+ else
+ path_put(&path);
}
return error;
}
--
1.7.2.rc1

2010-07-12 06:36:27

by Aneesh Kumar K.V

Subject: [PATCH -V16 09/12] x86: Add new syscalls for x86_64

Add sys_name_to_handle_at and sys_open_by_handle_at syscalls
for x86_64

Signed-off-by: Aneesh Kumar K.V <[email protected]>
---
arch/x86/ia32/ia32entry.S | 2 ++
arch/x86/include/asm/unistd_64.h | 4 ++++
fs/compat.c | 11 +++++++++++
fs/open.c | 20 ++++++++++----------
include/linux/fs.h | 1 +
5 files changed, 28 insertions(+), 10 deletions(-)

diff --git a/arch/x86/ia32/ia32entry.S b/arch/x86/ia32/ia32entry.S
index e790bc1..99f9623 100644
--- a/arch/x86/ia32/ia32entry.S
+++ b/arch/x86/ia32/ia32entry.S
@@ -842,4 +842,6 @@ ia32_sys_call_table:
.quad compat_sys_rt_tgsigqueueinfo /* 335 */
.quad sys_perf_event_open
.quad compat_sys_recvmmsg
+ .quad sys_name_to_handle_at
+ .quad compat_sys_open_by_handle_at /* 339 */
ia32_syscall_end:
diff --git a/arch/x86/include/asm/unistd_64.h b/arch/x86/include/asm/unistd_64.h
index ff4307b..6afd818 100644
--- a/arch/x86/include/asm/unistd_64.h
+++ b/arch/x86/include/asm/unistd_64.h
@@ -663,6 +663,10 @@ __SYSCALL(__NR_rt_tgsigqueueinfo, sys_rt_tgsigqueueinfo)
__SYSCALL(__NR_perf_event_open, sys_perf_event_open)
#define __NR_recvmmsg 299
__SYSCALL(__NR_recvmmsg, sys_recvmmsg)
+#define __NR_name_to_handle_at 300
+__SYSCALL(__NR_name_to_handle_at, sys_name_to_handle_at)
+#define __NR_open_by_handle_at 301
+__SYSCALL(__NR_open_by_handle_at, sys_open_by_handle_at)

#ifndef __NO_STUBS
#define __ARCH_WANT_OLD_READDIR
diff --git a/fs/compat.c b/fs/compat.c
index 6490d21..dc662ac 100644
--- a/fs/compat.c
+++ b/fs/compat.c
@@ -2332,3 +2332,14 @@ asmlinkage long compat_sys_timerfd_gettime(int ufd,
}

#endif /* CONFIG_TIMERFD */
+
+/*
+ * Exactly like fs/open.c:sys_open_by_handle_at(), except that it
+ * doesn't set the O_LARGEFILE flag.
+ */
+asmlinkage long
+compat_sys_open_by_handle_at(int mountdirfd,
+ struct file_handle __user *handle, int flags)
+{
+ return do_sys_open_by_handle(mountdirfd, handle, flags);
+}
diff --git a/fs/open.c b/fs/open.c
index a87c654..7a11b48 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -1217,8 +1217,8 @@ out_err:
return ERR_PTR(retval);
}

-static long do_sys_open_by_handle(int mountdirfd,
- struct file_handle __user *ufh, int open_flag)
+long do_sys_open_by_handle(int mountdirfd,
+ struct file_handle __user *ufh, int open_flag)
{
long retval = 0;
int fd, acc_mode;
@@ -1327,6 +1327,14 @@ out_handle:
out_err:
return retval;
}
+#else
+long do_sys_open_by_handle(int mountdirfd,
+ struct file_handle __user *ufh, int open_flag)
+{
+
+ return -ENOSYS;
+}
+#endif

/**
* sys_open_by_handle_at: Open the file handle
@@ -1351,11 +1359,3 @@ SYSCALL_DEFINE3(open_by_handle_at, int, mountdirfd,
ret = do_sys_open_by_handle(mountdirfd, handle, flags);
return ret;
}
-#else
-SYSCALL_DEFINE3(open_by_handle_at, int, mountdirfd,
- struct file_handle __user *, handle,
- int, flags)
-{
- return -ENOSYS;
-}
-#endif
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 08afa72..3103c39 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1929,6 +1929,7 @@ extern struct file * dentry_open(struct dentry *, struct vfsmount *, int,
const struct cred *);
extern int filp_close(struct file *, fl_owner_t id);
extern char * getname(const char __user *);
+extern long do_sys_open_by_handle(int, struct file_handle __user *, int);

/* fs/ioctl.c */

--
1.7.2.rc1

2010-07-12 06:36:23

by Aneesh Kumar K.V

Subject: [PATCH -V16 06/12] vfs: Support null pathname in faccessat

This makes it possible to use faccessat to perform the access check
on a file descriptor pointing to the file. This can be used
with the open_by_handle syscall, which returns a file descriptor.

Signed-off-by: Aneesh Kumar K.V <[email protected]>
---
fs/open.c | 29 +++++++++++++++++++++--------
1 files changed, 21 insertions(+), 8 deletions(-)

diff --git a/fs/open.c b/fs/open.c
index afb089e..a87c654 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -288,9 +288,10 @@ SYSCALL_DEFINE3(faccessat, int, dfd, const char __user *, filename, int, mode)
{
const struct cred *old_cred;
struct cred *override_cred;
- struct path path;
+ struct file *file = NULL;
+ struct path path, *pp;
struct inode *inode;
- int res;
+ int res, fput_needed;

if (mode & ~S_IRWXO) /* where's F_OK, X_OK, W_OK, R_OK? */
return -EINVAL;
@@ -312,12 +313,21 @@ SYSCALL_DEFINE3(faccessat, int, dfd, const char __user *, filename, int, mode)
}

old_cred = override_creds(override_cred);
-
- res = user_path_at(dfd, filename, LOOKUP_FOLLOW, &path);
+ if (filename == NULL && dfd != AT_FDCWD) {
+ file = fget_light(dfd, &fput_needed);
+ if (file) {
+ pp = &file->f_path;
+ res = 0;
+ } else
+ res = -EBADF;
+ } else {
+ res = user_path_at(dfd, filename, LOOKUP_FOLLOW, &path);
+ pp = &path;
+ }
if (res)
goto out;

- inode = path.dentry->d_inode;
+ inode = pp->dentry->d_inode;

if ((mode & MAY_EXEC) && S_ISREG(inode->i_mode)) {
/*
@@ -325,7 +335,7 @@ SYSCALL_DEFINE3(faccessat, int, dfd, const char __user *, filename, int, mode)
* with the "noexec" flag.
*/
res = -EACCES;
- if (path.mnt->mnt_flags & MNT_NOEXEC)
+ if (pp->mnt->mnt_flags & MNT_NOEXEC)
goto out_path_release;
}

@@ -343,11 +353,14 @@ SYSCALL_DEFINE3(faccessat, int, dfd, const char __user *, filename, int, mode)
* inherently racy and know that the fs may change
* state before we even see this result.
*/
- if (__mnt_is_readonly(path.mnt))
+ if (__mnt_is_readonly(pp->mnt))
res = -EROFS;

out_path_release:
- path_put(&path);
+ if (file)
+ fput_light(file, fput_needed);
+ else
+ path_put(&path);
out:
revert_creds(old_cred);
put_cred(override_cred);
--
1.7.2.rc1

2010-07-12 06:36:14

by Aneesh Kumar K.V

Subject: [PATCH -V16 08/12] x86: Add new syscalls for x86_32

This patch adds sys_name_to_handle_at and sys_open_by_handle_at
syscalls to x86_32

Acked-by: Serge Hallyn <[email protected]>
Signed-off-by: Aneesh Kumar K.V <[email protected]>
---
arch/x86/include/asm/unistd_32.h | 4 +++-
arch/x86/kernel/syscall_table_32.S | 2 ++
2 files changed, 5 insertions(+), 1 deletions(-)

diff --git a/arch/x86/include/asm/unistd_32.h b/arch/x86/include/asm/unistd_32.h
index beb9b5f..06890db 100644
--- a/arch/x86/include/asm/unistd_32.h
+++ b/arch/x86/include/asm/unistd_32.h
@@ -343,10 +343,12 @@
#define __NR_rt_tgsigqueueinfo 335
#define __NR_perf_event_open 336
#define __NR_recvmmsg 337
+#define __NR_name_to_handle_at 338
+#define __NR_open_by_handle_at 339

#ifdef __KERNEL__

-#define NR_syscalls 338
+#define NR_syscalls 340

#define __ARCH_WANT_IPC_PARSE_VERSION
#define __ARCH_WANT_OLD_READDIR
diff --git a/arch/x86/kernel/syscall_table_32.S b/arch/x86/kernel/syscall_table_32.S
index 8b37293..646717f 100644
--- a/arch/x86/kernel/syscall_table_32.S
+++ b/arch/x86/kernel/syscall_table_32.S
@@ -337,3 +337,5 @@ ENTRY(sys_call_table)
.long sys_rt_tgsigqueueinfo /* 335 */
.long sys_perf_event_open
.long sys_recvmmsg
+ .long sys_name_to_handle_at
+ .long sys_open_by_handle_at /* 339 */
--
1.7.2.rc1

2010-07-12 06:36:17

by Aneesh Kumar K.V

Subject: [PATCH -V16 03/12] vfs: Add open by file handle support

Signed-off-by: Aneesh Kumar K.V <[email protected]>
---
fs/exportfs/expfs.c | 2 +
fs/namei.c | 73 +++++++++++++++++++
fs/open.c | 175 ++++++++++++++++++++++++++++++++++++++++++++++
include/linux/fs.h | 3 +-
include/linux/namei.h | 1 +
include/linux/syscalls.h | 3 +
6 files changed, 256 insertions(+), 1 deletions(-)

diff --git a/fs/exportfs/expfs.c b/fs/exportfs/expfs.c
index cfee0f0..05a1179 100644
--- a/fs/exportfs/expfs.c
+++ b/fs/exportfs/expfs.c
@@ -373,6 +373,8 @@ struct dentry *exportfs_decode_fh(struct vfsmount *mnt, struct fid *fid,
/*
* Try to get any dentry for the given file handle from the filesystem.
*/
+ if (!nop || !nop->fh_to_dentry)
+ return ERR_PTR(-ESTALE);
result = nop->fh_to_dentry(mnt->mnt_sb, fid, fh_len, fileid_type);
if (!result)
result = ERR_PTR(-ESTALE);
diff --git a/fs/namei.c b/fs/namei.c
index 868d0cb..4d590a3 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -1052,6 +1052,29 @@ out_fail:
return retval;
}

+struct vfsmount *get_vfsmount_from_fd(int fd)
+{
+ int fput_needed;
+ struct path *path;
+ struct file *filep;
+
+ if (fd == AT_FDCWD) {
+ struct fs_struct *fs = current->fs;
+ read_lock(&fs->lock);
+ path = &fs->pwd;
+ mntget(path->mnt);
+ read_unlock(&fs->lock);
+ } else {
+ filep = fget_light(fd, &fput_needed);
+ if (!filep)
+ return ERR_PTR(-EBADF);
+ path = &filep->f_path;
+ mntget(path->mnt);
+ fput_light(filep, fput_needed);
+ }
+ return path->mnt;
+}
+
/* Returns 0 and nd will be valid on success; Retuns error, otherwise. */
static int do_path_lookup(int dfd, const char *name,
unsigned int flags, struct nameidata *nd)
@@ -1557,6 +1580,56 @@ static int open_will_truncate(int flag, struct inode *inode)
return (flag & O_TRUNC);
}

+struct file *finish_open_handle(struct path *path,
+ int open_flag, int acc_mode)
+{
+ int error;
+ struct file *filp;
+ int will_truncate;
+
+ will_truncate = open_will_truncate(open_flag, path->dentry->d_inode);
+ if (will_truncate) {
+ error = mnt_want_write(path->mnt);
+ if (error)
+ goto exit;
+ }
+ error = may_open(path, acc_mode, open_flag);
+ if (error) {
+ if (will_truncate)
+ mnt_drop_write(path->mnt);
+ goto exit;
+ }
+ filp = dentry_open(path->dentry, path->mnt, open_flag, current_cred());
+ if (!IS_ERR(filp)) {
+ error = ima_file_check(filp, acc_mode);
+ if (error) {
+ fput(filp);
+ filp = ERR_PTR(error);
+ }
+ }
+ if (!IS_ERR(filp)) {
+ if (will_truncate) {
+ error = handle_truncate(path);
+ if (error) {
+ fput(filp);
+ filp = ERR_PTR(error);
+ }
+ }
+ }
+ /*
+ * It is now safe to drop the mnt write
+ * because the filp has had a write taken
+ * on its behalf.
+ */
+ if (will_truncate)
+ mnt_drop_write(path->mnt);
+ return filp;
+
+exit:
+ path_put(path);
+ return ERR_PTR(error);
+}
+
static struct file *finish_open(struct nameidata *nd,
int open_flag, int acc_mode)
{
diff --git a/fs/open.c b/fs/open.c
index 7ad8f28..df5d21e 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -1164,3 +1164,178 @@ SYSCALL_DEFINE4(name_to_handle_at, int, dfd, const char __user *, name,
return -ENOSYS;
}
#endif
+
+#ifdef CONFIG_EXPORTFS
+static int vfs_dentry_acceptable(void *context, struct dentry *dentry)
+{
+ return 1;
+}
+
+static struct path *handle_to_path(int mountdirfd, struct file_handle *handle)
+{
+ int retval;
+ int handle_size;
+ struct path *path;
+
+ path = kmalloc(sizeof(struct path), GFP_KERNEL);
+ if (!path)
+ return ERR_PTR(-ENOMEM);
+
+ path->mnt = get_vfsmount_from_fd(mountdirfd);
+ if (IS_ERR(path->mnt)) {
+ retval = PTR_ERR(path->mnt);
+ goto out_err;
+ }
+ /* change the handle size to multiple of sizeof(u32) */
+ handle_size = handle->handle_size >> 2;
+ path->dentry = exportfs_decode_fh(path->mnt,
+ (struct fid *)handle->f_handle,
+ handle_size, handle->handle_type,
+ vfs_dentry_acceptable, NULL);
+ if (IS_ERR(path->dentry)) {
+ retval = PTR_ERR(path->dentry);
+ goto out_mnt;
+ }
+ return path;
+out_mnt:
+ mntput(path->mnt);
+out_err:
+ kfree(path);
+ return ERR_PTR(retval);
+}
+
+static long do_sys_open_by_handle(int mountdirfd,
+ struct file_handle __user *ufh, int open_flag)
+{
+ long retval = 0;
+ int fd, acc_mode;
+ struct file *filp;
+ struct path *path;
+ struct file_handle f_handle;
+ struct file_handle *handle = NULL;
+
+ /*
+ * With a handle we don't look at the execute bit on
+ * the directory. Ideally we would like CAP_DAC_SEARCH,
+ * but we don't have that.
+ */
+ if (!capable(CAP_DAC_READ_SEARCH)) {
+ retval = -EPERM;
+ goto out_err;
+ }
+ /* can't use O_CREAT with open_by_handle */
+ if (open_flag & O_CREAT) {
+ retval = -EINVAL;
+ goto out_err;
+ }
+ if (copy_from_user(&f_handle, ufh, sizeof(struct file_handle))) {
+ retval = -EFAULT;
+ goto out_err;
+ }
+ if ((f_handle.handle_size > MAX_HANDLE_SZ) ||
+ (f_handle.handle_size <= 0)) {
+ retval = -EINVAL;
+ goto out_err;
+ }
+ handle = kmalloc(sizeof(struct file_handle) + f_handle.handle_size,
+ GFP_KERNEL);
+ if (!handle) {
+ retval = -ENOMEM;
+ goto out_err;
+ }
+ /* copy the full handle */
+ if (copy_from_user(handle, ufh,
+ sizeof(struct file_handle) +
+ f_handle.handle_size)) {
+ retval = -EFAULT;
+ goto out_handle;
+ }
+ path = handle_to_path(mountdirfd, handle);
+ if (IS_ERR(path)) {
+ retval = PTR_ERR(path);
+ goto out_handle;
+ }
+ if ((open_flag & O_DIRECTORY) &&
+ !S_ISDIR(path->dentry->d_inode->i_mode)) {
+ retval = -ENOTDIR;
+ goto out_path;
+ }
+ /*
+ * O_SYNC is implemented as __O_SYNC|O_DSYNC. As many places only
+ * check for O_DSYNC if the need any syncing at all we enforce it's
+ * always set instead of having to deal with possibly weird behaviour
+ * for malicious applications setting only __O_SYNC.
+ */
+ if (open_flag & __O_SYNC)
+ open_flag |= O_DSYNC;
+
+ acc_mode = MAY_OPEN | ACC_MODE(open_flag);
+
+ /* O_TRUNC implies we need access checks for write permissions */
+ if (open_flag & O_TRUNC)
+ acc_mode |= MAY_WRITE;
+ /*
+ * Allow the LSM permission hook to distinguish append
+ * access from general write access.
+ */
+ if (open_flag & O_APPEND)
+ acc_mode |= MAY_APPEND;
+
+ fd = get_unused_fd_flags(open_flag);
+ if (fd < 0) {
+ retval = fd;
+ goto out_path;
+ }
+ filp = finish_open_handle(path, open_flag, acc_mode);
+ if (IS_ERR(filp)) {
+ put_unused_fd(fd);
+ retval = PTR_ERR(filp);
+ } else {
+ retval = fd;
+ fsnotify_open(filp->f_path.dentry);
+ fd_install(fd, filp);
+ }
+ kfree(path);
+ kfree(handle);
+ return retval;
+
+out_path:
+ path_put(path);
+ kfree(path);
+out_handle:
+ kfree(handle);
+out_err:
+ return retval;
+}
+
+/**
+ * sys_open_by_handle_at: Open the file handle
+ * @mountdirfd: directory file descriptor
+ * @handle: file handle to be opened
+ * @flags: open flags.
+ *
+ * @mountdirfd indicates the directory file descriptor
+ * of the mount point. The file handle is decoded relative
+ * to the vfsmount pointed to by @mountdirfd. The @flags
+ * value is the same as the open(2) flags.
+ */
+SYSCALL_DEFINE3(open_by_handle_at, int, mountdirfd,
+ struct file_handle __user *, handle,
+ int, flags)
+{
+ long ret;
+
+ if (force_o_largefile())
+ flags |= O_LARGEFILE;
+
+ ret = do_sys_open_by_handle(mountdirfd, handle, flags);
+ return ret;
+}
+#else
+SYSCALL_DEFINE3(open_by_handle_at, int, mountdirfd,
+ struct file_handle __user *, handle,
+ int, flags)
+{
+ return -ENOSYS;
+}
+#endif
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 0e7cf4c..a458b4e 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2139,7 +2139,8 @@ extern int may_open(struct path *, int, int);

extern int kernel_read(struct file *, loff_t, char *, unsigned long);
extern struct file * open_exec(const char *);
-
+extern struct file *finish_open_handle(struct path *, int, int);
+
/* fs/dcache.c -- generic fs support functions */
extern int is_subdir(struct dentry *, struct dentry *);
extern int path_is_under(struct path *, struct path *);
diff --git a/include/linux/namei.h b/include/linux/namei.h
index 05b441d..b95c582 100644
--- a/include/linux/namei.h
+++ b/include/linux/namei.h
@@ -67,6 +67,7 @@ extern int user_path_at(int, const char __user *, unsigned, struct path *);
extern int kern_path(const char *, unsigned, struct path *);

extern int path_lookup(const char *, unsigned, struct nameidata *);
+extern struct vfsmount *get_vfsmount_from_fd(int);
extern int vfs_path_lookup(struct dentry *, struct vfsmount *,
const char *, unsigned int, struct nameidata *);

diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index 9d6d6e3..3a4f499 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -827,4 +827,7 @@ asmlinkage long sys_old_mmap(struct mmap_arg_struct __user *arg);
asmlinkage long sys_name_to_handle_at(int dfd, const char __user *name,
struct file_handle __user *handle,
int flag);
+asmlinkage long sys_open_by_handle_at(int mountdirfd,
+ struct file_handle __user *handle,
+ int flags);
#endif
--
1.7.2.rc1

2010-07-12 06:36:19

by Aneesh Kumar K.V

Subject: [PATCH -V16 12/12] ext4: Copy fs UUID to superblock

The file system UUID is made available to applications
via /proc/<pid>/mountinfo

Acked-by: Serge Hallyn <[email protected]>
Signed-off-by: Aneesh Kumar K.V <[email protected]>
---
fs/ext4/super.c | 2 ++
1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 4e8983a..f2bd1bc 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -2816,6 +2816,8 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent)
sb->s_qcop = &ext4_qctl_operations;
sb->dq_op = &ext4_quota_operations;
#endif
+ memcpy(sb->s_uuid, es->s_uuid, sizeof(es->s_uuid));
+
INIT_LIST_HEAD(&sbi->s_orphan); /* unlinked but open files */
mutex_init(&sbi->s_orphan_lock);
mutex_init(&sbi->s_resize_lock);
--
1.7.2.rc1

2010-07-12 06:36:21

by Aneesh Kumar K.V

Subject: [PATCH -V16 07/12] vfs: Support null pathname in linkat

This makes it possible to use linkat to create hardlinks from a
file descriptor pointing to the file. This can be used
with the open_by_handle syscall, which returns a file descriptor.

Signed-off-by: Aneesh Kumar K.V <[email protected]>
---
fs/namei.c | 34 +++++++++++++++++++++++++---------
1 files changed, 25 insertions(+), 9 deletions(-)

diff --git a/fs/namei.c b/fs/namei.c
index a6a8093..9a7b71a 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -2553,16 +2553,28 @@ SYSCALL_DEFINE5(linkat, int, olddfd, const char __user *, oldname,
{
struct dentry *new_dentry;
struct nameidata nd;
- struct path old_path;
- int error;
+ struct path old_path, *old_pathp;
+ struct file *file = NULL;
+ int error, fput_needed;
char *to;

if ((flags & ~AT_SYMLINK_FOLLOW) != 0)
return -EINVAL;

- error = user_path_at(olddfd, oldname,
- flags & AT_SYMLINK_FOLLOW ? LOOKUP_FOLLOW : 0,
- &old_path);
+ if (oldname == NULL && olddfd != AT_FDCWD) {
+ file = fget_light(olddfd, &fput_needed);
+ if (file) {
+ old_pathp = &file->f_path;
+ error = 0;
+ } else
+ error = -EBADF;
+ } else {
+ error = user_path_at(olddfd, oldname,
+ flags & AT_SYMLINK_FOLLOW ?
+ LOOKUP_FOLLOW : 0,
+ &old_path);
+ old_pathp = &old_path;
+ }
if (error)
return error;

@@ -2570,7 +2582,7 @@ SYSCALL_DEFINE5(linkat, int, olddfd, const char __user *, oldname,
if (error)
goto out;
error = -EXDEV;
- if (old_path.mnt != nd.path.mnt)
+ if (old_pathp->mnt != nd.path.mnt)
goto out_release;
new_dentry = lookup_create(&nd, 0);
error = PTR_ERR(new_dentry);
@@ -2579,10 +2591,11 @@ SYSCALL_DEFINE5(linkat, int, olddfd, const char __user *, oldname,
error = mnt_want_write(nd.path.mnt);
if (error)
goto out_dput;
- error = security_path_link(old_path.dentry, &nd.path, new_dentry);
+ error = security_path_link(old_pathp->dentry, &nd.path, new_dentry);
if (error)
goto out_drop_write;
- error = vfs_link(old_path.dentry, nd.path.dentry->d_inode, new_dentry);
+ error = vfs_link(old_pathp->dentry,
+ nd.path.dentry->d_inode, new_dentry);
out_drop_write:
mnt_drop_write(nd.path.mnt);
out_dput:
@@ -2593,7 +2606,10 @@ out_release:
path_put(&nd.path);
putname(to);
out:
- path_put(&old_path);
+ if (file)
+ fput_light(file, fput_needed);
+ else
+ path_put(&old_path);

return error;
}
--
1.7.2.rc1

2010-07-12 06:36:04

by Aneesh Kumar K.V

Subject: [PATCH -V16 04/12] vfs: Allow handle based open on symlinks

This patch updates may_open to allow handle-based open on symlinks.
The file-handle-based API uses the file descriptor returned by open_by_handle_at
to perform different file system operations. To find the link target name we
need to get a file descriptor on the symlink.

We should be able to read the link target using a file handle. The exact
use case is implementing the READLINK operation in a
userspace NFS server. The request contains the file handle and the
response includes the target name.
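
The resulting rule for symlinks can be modelled in user space as follows. This
is a sketch, not the kernel code: model_acc_mode() is a simplified,
hypothetical stand-in for the kernel's ACC_MODE() macro, and the other helper
names are likewise made up for illustration. The MAY_* values match
include/linux/fs.h, with MAY_OPEN_LINK being the value added by this patch.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>

/* MAY_* values as in include/linux/fs.h; MAY_OPEN_LINK is new in this patch. */
#define MAY_WRITE	2
#define MAY_READ	4
#define MAY_OPEN	32
#define MAY_OPEN_LINK	64

/* Simplified, hypothetical stand-in for the kernel's ACC_MODE() macro. */
static int model_acc_mode(int open_flag)
{
	switch (open_flag & O_ACCMODE) {
	case O_RDONLY:
		return MAY_READ;
	case O_WRONLY:
		return MAY_WRITE;
	default:
		return MAY_READ | MAY_WRITE;
	}
}

/* Mirrors the new S_IFLNK case in may_open(): 0 if allowed, else -ELOOP. */
static int symlink_may_open(int acc_mode, int flag)
{
	if (acc_mode != (MAY_OPEN_LINK | MAY_READ))
		return -ELOOP;
	if (flag != O_RDONLY)
		return -ELOOP;
	return 0;
}

/* acc_mode as do_sys_open_by_handle() builds it for a symlink dentry. */
static int handle_open_on_symlink(int open_flag)
{
	return symlink_may_open(MAY_OPEN_LINK | model_acc_mode(open_flag),
				open_flag);
}

/* A caller that passes plain MAY_OPEN still gets -ELOOP on a symlink. */
static int plain_open_on_symlink(int open_flag)
{
	return symlink_may_open(MAY_OPEN | model_acc_mode(open_flag),
				open_flag);
}

/* Small usage demonstration. */
void symlink_open_demo(void)
{
	assert(handle_open_on_symlink(O_RDONLY) == 0);
	assert(handle_open_on_symlink(O_RDWR) == -ELOOP);
	assert(plain_open_on_symlink(O_RDONLY) == -ELOOP);
}
```

Note that the flag comparison is an exact match, so in this model any extra
open flag combined with O_RDONLY is also rejected for a symlink.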

Signed-off-by: Aneesh Kumar K.V <[email protected]>
---
fs/namei.c | 10 +++++++++-
fs/open.c | 9 ++++++++-
include/linux/fs.h | 1 +
3 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/fs/namei.c b/fs/namei.c
index 4d590a3..a6a8093 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -1456,7 +1456,15 @@ int may_open(struct path *path, int acc_mode, int flag)

switch (inode->i_mode & S_IFMT) {
case S_IFLNK:
- return -ELOOP;
+ /*
+ * Allow only if acc_mode contain
+ * open link request and read request.
+ */
+ if (acc_mode != (MAY_OPEN_LINK | MAY_READ))
+ return -ELOOP;
+ if (flag != O_RDONLY)
+ return -ELOOP;
+ break;
case S_IFDIR:
if (acc_mode & MAY_WRITE)
return -EISDIR;
diff --git a/fs/open.c b/fs/open.c
index df5d21e..afb089e 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -1268,8 +1268,15 @@ static long do_sys_open_by_handle(int mountdirfd,
*/
if (open_flag & __O_SYNC)
open_flag |= O_DSYNC;
+ /*
+ * Handle based API allow open on a symlink
+ */
+ if (S_ISLNK(path->dentry->d_inode->i_mode))
+ acc_mode = MAY_OPEN_LINK;
+ else
+ acc_mode = MAY_OPEN;

- acc_mode = MAY_OPEN | ACC_MODE(open_flag);
+ acc_mode |= ACC_MODE(open_flag);

/* O_TRUNC implies we need access checks for write permissions */
if (open_flag & O_TRUNC)
diff --git a/include/linux/fs.h b/include/linux/fs.h
index a458b4e..08afa72 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -53,6 +53,7 @@ struct inodes_stat_t {
#define MAY_APPEND 8
#define MAY_ACCESS 16
#define MAY_OPEN 32
+#define MAY_OPEN_LINK 64

/*
* flags in file.f_mode. Note that FMODE_READ and FMODE_WRITE must correspond
--
1.7.2.rc1

2010-07-12 06:36:12

by Aneesh Kumar K.V

Subject: [PATCH -V16 02/12] vfs: Add name to file handle conversion support

The file handle also includes the mount id, which can be used
to look up file-system-specific information such as the uuid
in /proc/<pid>/mountinfo
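
A user-space server could map the mnt_id from a handle back to a mount point
by scanning mountinfo lines. The helper below is a hedged sketch:
mountinfo_match() is a hypothetical name, it relies only on the leading
mountinfo fields (mount ID, parent ID, major:minor, root, mount point), and it
does not parse the optional uuid tag, whose format is not shown in this
message.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Return 1 and copy the mount point into mnt_point if `line` (one line of
 * /proc/<pid>/mountinfo) describes the mount with id `mnt_id`; return 0
 * otherwise. Hypothetical helper for illustration only.
 */
static int mountinfo_match(const char *line, int mnt_id,
			   char *mnt_point, size_t len)
{
	int id, parent;
	unsigned int major, minor;
	char root[256], target[256];

	assert(line != NULL && mnt_point != NULL && len > 0);

	/* Leading fields: mount ID, parent ID, major:minor, root, mount point */
	if (sscanf(line, "%d %d %u:%u %255s %255s",
		   &id, &parent, &major, &minor, root, target) != 6)
		return 0;
	if (id != mnt_id)
		return 0;
	/* snprintf truncates safely if the caller's buffer is short */
	snprintf(mnt_point, len, "%s", target);
	return 1;
}
```

In a real program the lines would come from reading /proc/self/mountinfo with
fgets(), calling mountinfo_match() on each until it returns 1.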

Signed-off-by: Aneesh Kumar K.V <[email protected]>
---
fs/open.c | 124 ++++++++++++++++++++++++++++++++++++++++++++++
include/linux/fs.h | 9 +++
include/linux/syscalls.h | 5 ++-
3 files changed, 137 insertions(+), 1 deletions(-)

diff --git a/fs/open.c b/fs/open.c
index 5463266..7ad8f28 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -29,6 +29,7 @@
#include <linux/falloc.h>
#include <linux/fs_struct.h>
#include <linux/ima.h>
+#include <linux/exportfs.h>

#include "internal.h"

@@ -1040,3 +1041,126 @@ int nonseekable_open(struct inode *inode, struct file *filp)
}

EXPORT_SYMBOL(nonseekable_open);
+
+#ifdef CONFIG_EXPORTFS
+/* limit the handle size to some value */
+#define MAX_HANDLE_SZ 4096
+static long do_sys_name_to_handle(struct path *path,
+ struct file_handle __user *ufh)
+{
+ long retval;
+ int handle_size;
+ struct file_handle f_handle;
+ struct file_handle *handle = NULL;
+
+ if (copy_from_user(&f_handle, ufh, sizeof(struct file_handle))) {
+ retval = -EFAULT;
+ goto err_out;
+ }
+ if (f_handle.handle_size > MAX_HANDLE_SZ) {
+ retval = -EINVAL;
+ goto err_out;
+ }
+ handle = kmalloc(sizeof(struct file_handle) + f_handle.handle_size,
+ GFP_KERNEL);
+ if (!handle) {
+ retval = -ENOMEM;
+ goto err_out;
+ }
+
+ /* convert handle size to multiple of sizeof(u32) */
+ handle_size = f_handle.handle_size >> 2;
+
+ /* we ask for a non connected handle */
+ retval = exportfs_encode_fh(path->dentry,
+ (struct fid *)handle->f_handle,
+ &handle_size, 0);
+ /* convert handle size to bytes */
+ handle_size *= sizeof(u32);
+ handle->handle_type = retval;
+ handle->handle_size = handle_size;
+ /* copy the mount id */
+ handle->mnt_id = path->mnt->mnt_id;
+ if (handle_size > f_handle.handle_size) {
+ /*
+ * set the handle_size to zero so we copy only
+ * non variable part of the file_handle
+ */
+ handle_size = 0;
+ retval = -EOVERFLOW;
+ } else
+ retval = 0;
+ if (copy_to_user(ufh, handle,
+ sizeof(struct file_handle) + handle_size))
+ retval = -EFAULT;
+
+ kfree(handle);
+err_out:
+ return retval;
+}
+
+/**
+ * sys_name_to_handle_at: convert name to handle
+ * @dfd: directory relative to which name is interpreted if not absolute
+ * @name: name that should be converted to handle.
+ * @handle: resulting file handle
+ * @flag: flag value to indicate whether to follow symlink or not
+ *
+ * @handle->handle_size indicate the space available to store the
+ * variable part of the file handle in bytes. If there is not
+ * enough space, the field is updated to return the minimum
+ * value required.
+ */
+SYSCALL_DEFINE4(name_to_handle_at, int, dfd, const char __user *, name,
+ struct file_handle __user *, handle, int, flag)
+{
+
+ int follow;
+ int fput_needed;
+ long ret = -EINVAL;
+ struct path path, *pp;
+ struct file *file = NULL;
+
+ if ((flag & ~AT_SYMLINK_FOLLOW) != 0)
+ goto err_out;
+
+ if (name == NULL && dfd != AT_FDCWD) {
+ file = fget_light(dfd, &fput_needed);
+ if (file) {
+ pp = &file->f_path;
+ ret = 0;
+ } else
+ ret = -EBADF;
+ } else {
+ follow = (flag & AT_SYMLINK_FOLLOW) ? LOOKUP_FOLLOW : 0;
+ ret = user_path_at(dfd, name, follow, &path);
+ pp = &path;
+ }
+ if (ret)
+ goto err_out;
+ /*
+ * We need to make sure that the file system
+ * supports decoding of the file handle
+ */
+ if (!pp->mnt->mnt_sb->s_export_op ||
+ !pp->mnt->mnt_sb->s_export_op->fh_to_dentry) {
+ ret = -EOPNOTSUPP;
+ goto out_path;
+ }
+ ret = do_sys_name_to_handle(pp, handle);
+
+out_path:
+ if (file)
+ fput_light(file, fput_needed);
+ else
+ path_put(&path);
+err_out:
+ return ret;
+}
+#else
+SYSCALL_DEFINE4(name_to_handle_at, int, dfd, const char __user *, name,
+ struct file_handle __user *, handle, int, flag)
+{
+ return -ENOSYS;
+}
+#endif
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 471e1ff..0e7cf4c 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -949,6 +949,15 @@ struct file {
unsigned long f_mnt_write_state;
#endif
};
+
+struct file_handle {
+ int mnt_id;
+ int handle_size;
+ int handle_type;
+ /* file identifier */
+ unsigned char f_handle[0];
+};
+
extern spinlock_t files_lock;
#define file_list_lock() spin_lock(&files_lock);
#define file_list_unlock() spin_unlock(&files_lock);
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index 7f614ce..9d6d6e3 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -61,6 +61,7 @@ struct robust_list_head;
struct getcpu_cache;
struct old_linux_dirent;
struct perf_event_attr;
+struct file_handle;

#include <linux/types.h>
#include <linux/aio_abi.h>
@@ -823,5 +824,7 @@ asmlinkage long sys_mmap_pgoff(unsigned long addr, unsigned long len,
unsigned long prot, unsigned long flags,
unsigned long fd, unsigned long pgoff);
asmlinkage long sys_old_mmap(struct mmap_arg_struct __user *arg);
-
+asmlinkage long sys_name_to_handle_at(int dfd, const char __user *name,
+ struct file_handle __user *handle,
+ int flag);
#endif
--
1.7.2.rc1
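
The size handling in do_sys_name_to_handle() above is easy to misread, because the user-visible handle_size is in bytes while exportfs counts 4-byte words. Below is a rough userspace model of that flow, not kernel code. It assumes a made-up filesystem that always needs two words, and the names model_handle, model_encode and model_name_to_handle are illustrative only. The model also rejects a negative size up front; the hunk as posted only checks the upper bound.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

#define MODEL_NEEDED_WORDS 2    /* assumption: the filesystem needs 2 x 4 bytes */

struct model_handle {
    int mnt_id;
    int handle_size;            /* bytes, both on input and on output */
    int handle_type;
    unsigned char f_handle[64]; /* fixed here; a flexible array in the patch */
};

/* Stand-in for exportfs_encode_fh(): *len is counted in 4-byte words. */
static int model_encode(unsigned char *buf, int *len)
{
    if (*len < MODEL_NEEDED_WORDS) {
        *len = MODEL_NEEDED_WORDS;  /* report the minimum required size */
        return 255;
    }
    memset(buf, 0xab, MODEL_NEEDED_WORDS * 4);
    *len = MODEL_NEEDED_WORDS;
    return 1;
}

/* Follows the flow of do_sys_name_to_handle(): returns 0 or -errno. */
static int model_name_to_handle(struct model_handle *h, int mnt_id)
{
    int avail_bytes;
    int words;

    assert(h != NULL);
    avail_bytes = h->handle_size;
    if (avail_bytes < 0 || avail_bytes > (int)sizeof(h->f_handle))
        return -EINVAL;

    words = avail_bytes >> 2;       /* bytes to words, rounding down */
    h->handle_type = model_encode(h->f_handle, &words);
    h->handle_size = words * 4;     /* words back to bytes */
    h->mnt_id = mnt_id;

    return h->handle_size > avail_bytes ? -EOVERFLOW : 0;
}
```

A caller would start with handle_size set to 0, see -EOVERFLOW with handle_size updated to the required byte count, and call again. Because of the rounding down, a 7-byte buffer is treated as one word and also overflows.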

2010-07-12 06:38:21

by Aneesh Kumar K.V

Subject: [PATCH -V16 01/12] exportfs: Return the minimum required handle size

The exportfs encode handle function should return the minimum required
handle size. This lets the user find out the handle size by passing a
handle size of 0 in the first call and then repeating the call with
the returned handle size value.

Acked-by: Serge Hallyn <[email protected]>
Signed-off-by: Aneesh Kumar K.V <[email protected]>
---
fs/btrfs/export.c | 8 ++++++--
fs/exportfs/expfs.c | 9 +++++++--
fs/fat/inode.c | 4 +++-
fs/fuse/inode.c | 4 +++-
fs/gfs2/export.c | 8 ++++++--
fs/isofs/export.c | 8 ++++++--
fs/ocfs2/export.c | 8 +++++++-
fs/reiserfs/inode.c | 7 ++++++-
fs/udf/namei.c | 7 ++++++-
fs/xfs/linux-2.6/xfs_export.c | 4 +++-
include/linux/exportfs.h | 6 ++++--
mm/shmem.c | 4 +++-
12 files changed, 60 insertions(+), 17 deletions(-)

diff --git a/fs/btrfs/export.c b/fs/btrfs/export.c
index 951ef09..5f8ee5a 100644
--- a/fs/btrfs/export.c
+++ b/fs/btrfs/export.c
@@ -21,9 +21,13 @@ static int btrfs_encode_fh(struct dentry *dentry, u32 *fh, int *max_len,
int len = *max_len;
int type;

- if ((len < BTRFS_FID_SIZE_NON_CONNECTABLE) ||
- (connectable && len < BTRFS_FID_SIZE_CONNECTABLE))
+ if (connectable && (len < BTRFS_FID_SIZE_CONNECTABLE)) {
+ *max_len = BTRFS_FID_SIZE_CONNECTABLE;
return 255;
+ } else if (len < BTRFS_FID_SIZE_NON_CONNECTABLE) {
+ *max_len = BTRFS_FID_SIZE_NON_CONNECTABLE;
+ return 255;
+ }

len = BTRFS_FID_SIZE_NON_CONNECTABLE;
type = FILEID_BTRFS_WITHOUT_PARENT;
diff --git a/fs/exportfs/expfs.c b/fs/exportfs/expfs.c
index e9e1759..cfee0f0 100644
--- a/fs/exportfs/expfs.c
+++ b/fs/exportfs/expfs.c
@@ -319,9 +319,14 @@ static int export_encode_fh(struct dentry *dentry, struct fid *fid,
struct inode * inode = dentry->d_inode;
int len = *max_len;
int type = FILEID_INO32_GEN;
-
- if (len < 2 || (connectable && len < 4))
+
+ if (connectable && (len < 4)) {
+ *max_len = 4;
+ return 255;
+ } else if (len < 2) {
+ *max_len = 2;
return 255;
+ }

len = 2;
fid->i32.ino = inode->i_ino;
diff --git a/fs/fat/inode.c b/fs/fat/inode.c
index 7bf45ae..da2f8a1 100644
--- a/fs/fat/inode.c
+++ b/fs/fat/inode.c
@@ -761,8 +761,10 @@ fat_encode_fh(struct dentry *de, __u32 *fh, int *lenp, int connectable)
struct inode *inode = de->d_inode;
u32 ipos_h, ipos_m, ipos_l;

- if (len < 5)
+ if (len < 5) {
+ *lenp = 5;
return 255; /* no room */
+ }

ipos_h = MSDOS_I(inode)->i_pos >> 8;
ipos_m = (MSDOS_I(inode)->i_pos & 0xf0) << 24;
diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c
index ec14d19..beaea69 100644
--- a/fs/fuse/inode.c
+++ b/fs/fuse/inode.c
@@ -638,8 +638,10 @@ static int fuse_encode_fh(struct dentry *dentry, u32 *fh, int *max_len,
u64 nodeid;
u32 generation;

- if (*max_len < len)
+ if (*max_len < len) {
+ *max_len = len;
return 255;
+ }

nodeid = get_fuse_inode(inode)->nodeid;
generation = inode->i_generation;
diff --git a/fs/gfs2/export.c b/fs/gfs2/export.c
index dfe237a..bd0fd68 100644
--- a/fs/gfs2/export.c
+++ b/fs/gfs2/export.c
@@ -36,9 +36,13 @@ static int gfs2_encode_fh(struct dentry *dentry, __u32 *p, int *len,
struct super_block *sb = inode->i_sb;
struct gfs2_inode *ip = GFS2_I(inode);

- if (*len < GFS2_SMALL_FH_SIZE ||
- (connectable && *len < GFS2_LARGE_FH_SIZE))
+ if (connectable && (*len < GFS2_LARGE_FH_SIZE)) {
+ *len = GFS2_LARGE_FH_SIZE;
return 255;
+ } else if (*len < GFS2_SMALL_FH_SIZE) {
+ *len = GFS2_SMALL_FH_SIZE;
+ return 255;
+ }

fh[0] = cpu_to_be32(ip->i_no_formal_ino >> 32);
fh[1] = cpu_to_be32(ip->i_no_formal_ino & 0xFFFFFFFF);
diff --git a/fs/isofs/export.c b/fs/isofs/export.c
index ed752cb..dd4687f 100644
--- a/fs/isofs/export.c
+++ b/fs/isofs/export.c
@@ -124,9 +124,13 @@ isofs_export_encode_fh(struct dentry *dentry,
* offset of the inode and the upper 16 bits of fh32[1] to
* hold the offset of the parent.
*/
-
- if (len < 3 || (connectable && len < 5))
+ if (connectable && (len < 5)) {
+ *max_len = 5;
+ return 255;
+ } else if (len < 3) {
+ *max_len = 3;
return 255;
+ }

len = 3;
fh32[0] = ei->i_iget5_block;
diff --git a/fs/ocfs2/export.c b/fs/ocfs2/export.c
index 19ad145..250a347 100644
--- a/fs/ocfs2/export.c
+++ b/fs/ocfs2/export.c
@@ -201,8 +201,14 @@ static int ocfs2_encode_fh(struct dentry *dentry, u32 *fh_in, int *max_len,
dentry->d_name.len, dentry->d_name.name,
fh, len, connectable);

- if (len < 3 || (connectable && len < 6)) {
+ if (connectable && (len < 6)) {
mlog(ML_ERROR, "fh buffer is too small for encoding\n");
+ *max_len = 6;
+ type = 255;
+ goto bail;
+ } else if (len < 3) {
+ mlog(ML_ERROR, "fh buffer is too small for encoding\n");
+ *max_len = 3;
type = 255;
goto bail;
}
diff --git a/fs/reiserfs/inode.c b/fs/reiserfs/inode.c
index 0f22fda..8f1bf99 100644
--- a/fs/reiserfs/inode.c
+++ b/fs/reiserfs/inode.c
@@ -1588,8 +1588,13 @@ int reiserfs_encode_fh(struct dentry *dentry, __u32 * data, int *lenp,
struct inode *inode = dentry->d_inode;
int maxlen = *lenp;

- if (maxlen < 3)
+ if (need_parent && (maxlen < 5)) {
+ *lenp = 5;
return 255;
+ } else if (maxlen < 3) {
+ *lenp = 3;
+ return 255;
+ }

data[0] = inode->i_ino;
data[1] = le32_to_cpu(INODE_PKEY(inode)->k_dir_id);
diff --git a/fs/udf/namei.c b/fs/udf/namei.c
index bf5fc67..20db42f 100644
--- a/fs/udf/namei.c
+++ b/fs/udf/namei.c
@@ -1336,8 +1336,13 @@ static int udf_encode_fh(struct dentry *de, __u32 *fh, int *lenp,
struct fid *fid = (struct fid *)fh;
int type = FILEID_UDF_WITHOUT_PARENT;

- if (len < 3 || (connectable && len < 5))
+ if (connectable && (len < 5)) {
+ *lenp = 5;
+ return 255;
+ } else if (len < 3) {
+ *lenp = 3;
return 255;
+ }

*lenp = 3;
fid->udf.block = location.logicalBlockNum;
diff --git a/fs/xfs/linux-2.6/xfs_export.c b/fs/xfs/linux-2.6/xfs_export.c
index e7839ee..3dd0bf4 100644
--- a/fs/xfs/linux-2.6/xfs_export.c
+++ b/fs/xfs/linux-2.6/xfs_export.c
@@ -81,8 +81,10 @@ xfs_fs_encode_fh(
* seven combinations work. The real answer is "don't use v2".
*/
len = xfs_fileid_length(fileid_type);
- if (*max_len < len)
+ if (*max_len < len) {
+ *max_len = len;
return 255;
+ }
*max_len = len;

switch (fileid_type) {
diff --git a/include/linux/exportfs.h b/include/linux/exportfs.h
index a9cd507..acd0b2d 100644
--- a/include/linux/exportfs.h
+++ b/include/linux/exportfs.h
@@ -108,8 +108,10 @@ struct fid {
* set, the encode_fh() should store sufficient information so that a good
* attempt can be made to find not only the file but also it's place in the
* filesystem. This typically means storing a reference to de->d_parent in
- * the filehandle fragment. encode_fh() should return the number of bytes
- * stored or a negative error code such as %-ENOSPC
+ * the filehandle fragment. encode_fh() should return the fileid_type on
+ * success and 255 on error (if the space needed to encode fh is
+ * greater than @max_len*4 bytes). On error @max_len contains the minimum
+ * size (in 4-byte units) needed to encode the file handle.
*
* fh_to_dentry:
* @fh_to_dentry is given a &struct super_block (@sb) and a file handle
diff --git a/mm/shmem.c b/mm/shmem.c
index f65f840..d8223db 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -2117,8 +2117,10 @@ static int shmem_encode_fh(struct dentry *dentry, __u32 *fh, int *len,
{
struct inode *inode = dentry->d_inode;

- if (*len < 3)
+ if (*len < 3) {
+ *len = 3;
return 255;
+ }

if (hlist_unhashed(&inode->i_hash)) {
/* Unfortunately insert_inode_hash is not idempotent,
--
1.7.2.rc1

2010-07-12 06:39:07

by Aneesh Kumar K.V

Subject: [PATCH -V16 10/12] vfs: Export file system uuid via /proc/<pid>mountinfo

We add a per superblock uuid field. File systems should
update the uuid in the fill_super callback

Signed-off-by: Aneesh Kumar K.V <[email protected]>
---
fs/namespace.c | 3 +++
include/linux/fs.h | 1 +
2 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/fs/namespace.c b/fs/namespace.c
index 88058de..5dbdbd6 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -871,6 +871,9 @@ static int show_mountinfo(struct seq_file *m, void *v)
if (IS_MNT_UNBINDABLE(mnt))
seq_puts(m, " unbindable");

+ /* print the uuid */
+ seq_printf(m, " uuid:%pU", mnt->mnt_sb->s_uuid);
+
/* Filesystem specific data */
seq_puts(m, " - ");
show_type(m, sb);
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 3103c39..5f43472 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1367,6 +1367,7 @@ struct super_block {
wait_queue_head_t s_wait_unfrozen;

char s_id[32]; /* Informational name */
+ u8 s_uuid[16]; /* UUID */

void *s_fs_info; /* Filesystem private info */
fmode_t s_mode;
--
1.7.2.rc1

2010-07-12 06:39:13

by Aneesh Kumar K.V

Subject: [PATCH -V16 11/12] ext3: Copy fs UUID to superblock.

The file system UUID is made available to applications
via /proc/<pid>/mountinfo.

Signed-off-by: Aneesh Kumar K.V <[email protected]>
---
fs/ext3/super.c | 1 +
1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/fs/ext3/super.c b/fs/ext3/super.c
index 6c953bb..1596795 100644
--- a/fs/ext3/super.c
+++ b/fs/ext3/super.c
@@ -1931,6 +1931,7 @@ static int ext3_fill_super (struct super_block *sb, void *data, int silent)
sb->s_qcop = &ext3_qctl_operations;
sb->dq_op = &ext3_quota_operations;
#endif
+ memcpy(sb->s_uuid, es->s_uuid, sizeof(es->s_uuid));
INIT_LIST_HEAD(&sbi->s_orphan); /* unlinked but open files */
mutex_init(&sbi->s_orphan_lock);
mutex_init(&sbi->s_resize_lock);
--
1.7.2.rc1

2010-07-12 07:19:49

by Aneesh Kumar K.V

Subject: Re: [PATCH -V16 10/12] vfs: Export file system uuid via /proc/<pid>mountinfo

On Mon, 12 Jul 2010 12:05:43 +0530, "Aneesh Kumar K.V" <[email protected]> wrote:
> We add a per superblock uuid field. File systems should
> update the uuid in the fill_super callback
>
> Signed-off-by: Aneesh Kumar K.V <[email protected]>
> ---
> fs/namespace.c | 3 +++
> include/linux/fs.h | 1 +
> 2 files changed, 4 insertions(+), 0 deletions(-)
>
> diff --git a/fs/namespace.c b/fs/namespace.c
> index 88058de..5dbdbd6 100644
> --- a/fs/namespace.c
> +++ b/fs/namespace.c
> @@ -871,6 +871,9 @@ static int show_mountinfo(struct seq_file *m, void *v)
> if (IS_MNT_UNBINDABLE(mnt))
> seq_puts(m, " unbindable");
>
> + /* print the uuid */
> + seq_printf(m, " uuid:%pU", mnt->mnt_sb->s_uuid);
> +
> /* Filesystem specific data */
> seq_puts(m, " - ");
> show_type(m, sb);
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index 3103c39..5f43472 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -1367,6 +1367,7 @@ struct super_block {
> wait_queue_head_t s_wait_unfrozen;
>
> char s_id[32]; /* Informational name */
> + u8 s_uuid[16]; /* UUID */
>
> void *s_fs_info; /* Filesystem private info */
> fmode_t s_mode;

Since it is an optional tag, is it OK to do the below patch? Or is
"optional" only a way to introduce changes across kernel versions, with
each line supposed to carry the newly added fields? Is there a userspace
tool that uses /proc/<pid>/mountinfo?

diff --git a/fs/namespace.c b/fs/namespace.c
index 5dbdbd6..7542959 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -833,6 +833,16 @@ const struct seq_operations mounts_op = {
.show = show_vfsmnt
};

+static int uuid_is_nil(u8 *uuid)
+{
+ int i;
+ u8 *cp = (u8 *)uuid;
+
+ for (i = 0; i < 16; i++)
+ if (*cp++) return 0; /* not nil */
+ return 1; /* is nil */
+}
+
static int show_mountinfo(struct seq_file *m, void *v)
{
struct proc_mounts *p = m->private;
@@ -871,8 +881,9 @@ static int show_mountinfo(struct seq_file *m, void *v)
if (IS_MNT_UNBINDABLE(mnt))
seq_puts(m, " unbindable");

+ if (!uuid_is_nil(mnt->mnt_sb->s_uuid))
/* print the uuid */
- seq_printf(m, " uuid:%pU", mnt->mnt_sb->s_uuid);
+ seq_printf(m, " uuid:%pU", mnt->mnt_sb->s_uuid);

/* Filesystem specific data */
seq_puts(m, " - ");
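
For readers unfamiliar with %pU: it prints 16 raw bytes in the usual 8-4-4-4-12 lowercase hexadecimal form. The following is a userspace approximation of the tag the patch above would emit, with the nil check folded in. It is only a sketch; uuid_is_nil_model and format_uuid_tag are hypothetical helpers and not part of the patch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Return 1 if all 16 bytes are zero, like uuid_is_nil() in the patch. */
static int uuid_is_nil_model(const unsigned char uuid[16])
{
    static const unsigned char zero[16];

    return memcmp(uuid, zero, sizeof(zero)) == 0;
}

/*
 * Write " uuid:xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" into out, or an
 * empty string for a nil UUID. Returns the number of characters written.
 */
static int format_uuid_tag(const unsigned char uuid[16], char *out, size_t n)
{
    int i, pos;

    assert(out != NULL && n >= 43);  /* 6 + 36 characters + NUL */
    out[0] = '\0';
    if (uuid_is_nil_model(uuid))
        return 0;                    /* tag is omitted entirely */

    pos = snprintf(out, n, " uuid:");
    for (i = 0; i < 16; i++) {
        pos += snprintf(out + pos, n - pos, "%02x", uuid[i]);
        /* dashes after bytes 4, 6, 8 and 10 */
        if (i == 3 || i == 5 || i == 7 || i == 9)
            pos += snprintf(out + pos, n - pos, "-");
    }
    return pos;
}
```

With the nil check in place, a filesystem that never fills in s_uuid simply produces no tag, which is what makes the field optional from a parser's point of view.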

2010-07-12 08:03:29

by Miklos Szeredi

Subject: Re: [PATCH -V16 05/12] vfs: Support null pathname in readlink

On Mon, 12 Jul 2010, Aneesh Kumar K.V wrote:
> From: NeilBrown <[email protected]>
>
> This allows readlink to be used to get the link target name
> from a file descriptor pointing to the link. This can be used
> with the open_by_handle syscall, which returns a file descriptor for a link.
> We can then use this file descriptor to get the target name.
>
> This is similar to the utimensat(2) interface

We could introduce a pair of new helper functions to extract the common
code from do_utimes() and this:

err = lookup_path_at(dfd, filename, atflags, &path, &file);
/* do something with path */
put_path_at(&path, file);

Thanks,
Miklos

> Signed-off-by: NeilBrown <[email protected]>
> Signed-off-by: Aneesh Kumar K.V <[email protected]>
> ---
> fs/stat.c | 30 ++++++++++++++++++++++--------
> 1 files changed, 22 insertions(+), 8 deletions(-)
>
> diff --git a/fs/stat.c b/fs/stat.c
> index c4ecd52..a66a0ef 100644
> --- a/fs/stat.c
> +++ b/fs/stat.c
> @@ -284,26 +284,40 @@ SYSCALL_DEFINE2(newfstat, unsigned int, fd, struct stat __user *, statbuf)
> SYSCALL_DEFINE4(readlinkat, int, dfd, const char __user *, pathname,
> char __user *, buf, int, bufsiz)
> {
> - struct path path;
> - int error;
> + int error = 0, fput_needed;
> + struct path path, *pp;
> + struct file *file = NULL;
>
> if (bufsiz <= 0)
> return -EINVAL;
>
> - error = user_path_at(dfd, pathname, 0, &path);
> + if (pathname == NULL && dfd != AT_FDCWD) {
> + file = fget_light(dfd, &fput_needed);
> +
> + if (file)
> + pp = &file->f_path;
> + else
> + error = -EBADF;
> + } else {
> + error = user_path_at(dfd, pathname, 0, &path);
> + pp = &path;
> + }
> if (!error) {
> - struct inode *inode = path.dentry->d_inode;
> + struct inode *inode = pp->dentry->d_inode;
>
> error = -EINVAL;
> if (inode->i_op->readlink) {
> - error = security_inode_readlink(path.dentry);
> + error = security_inode_readlink(pp->dentry);
> if (!error) {
> - touch_atime(path.mnt, path.dentry);
> - error = inode->i_op->readlink(path.dentry,
> + touch_atime(pp->mnt, pp->dentry);
> + error = inode->i_op->readlink(pp->dentry,
> buf, bufsiz);
> }
> }
> - path_put(&path);
> + if (file)
> + fput_light(file, fput_needed);
> + else
> + path_put(&path);
> }
> return error;
> }
> --
> 1.7.2.rc1
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
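
One possible reading of the helper pair suggested in this reply, modeled in userspace with mock types since the real helpers would live in the kernel. Everything here is hypothetical: the two signatures follow the calls quoted in the reply, and the bodies simply collect the fd-or-name branching that the patches repeat in readlinkat, linkat and name_to_handle_at. The mock_* functions stand in for fget_light(), user_path_at(), fput_light() and path_put().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MOCK_AT_FDCWD (-100)

struct mock_path { int dentry; int refs; };
struct mock_file { struct mock_path f_path; int refs; };

/* fd -> open file; an entry with refs == 0 is a closed descriptor. */
static struct mock_file mock_fd_table[4];

/* Stand-in for fget_light(): take a reference on an open descriptor. */
static struct mock_file *mock_fget(int fd)
{
    if (fd < 0 || fd >= 4 || mock_fd_table[fd].refs == 0)
        return NULL;
    mock_fd_table[fd].refs++;
    return &mock_fd_table[fd];
}

/* Stand-in for user_path_at(): "resolve" a name, taking a path reference. */
static int mock_user_path_at(const char *name, struct mock_path *path)
{
    if (name == NULL)
        return -EFAULT;
    if (name[0] == '\0')
        return -ENOENT;
    path->dentry = name[0];
    path->refs = 1;
    return 0;
}

/* Resolve either by descriptor (name == NULL) or by name. */
static int lookup_path_at(int dfd, const char *name, int atflags,
                          struct mock_path *path, struct mock_file **filep)
{
    (void)atflags;                   /* follow/nofollow ignored in the mock */
    assert(path != NULL && filep != NULL);
    *filep = NULL;
    if (name == NULL && dfd != MOCK_AT_FDCWD) {
        *filep = mock_fget(dfd);
        if (*filep == NULL)
            return -EBADF;
        *path = (*filep)->f_path;    /* borrowed; pinned by the file ref */
        return 0;
    }
    return mock_user_path_at(name, path);
}

/* Release whatever lookup_path_at() acquired. */
static void put_path_at(struct mock_path *path, struct mock_file *file)
{
    if (file)
        file->refs--;                /* fput_light() analogue */
    else
        path->refs--;                /* path_put() analogue */
}
```

The point of the pairing is that callers no longer need to remember which of the two release functions matches the way the path was obtained.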

2010-07-12 08:06:12

by Miklos Szeredi

Subject: Re: [PATCH -V16 07/12] vfs: Support null pathname in linkat

On Mon, 12 Jul 2010, Aneesh Kumar K.V wrote:
> This allows linkat to create hard links from a
> file descriptor pointing to the file. This can be used
> with the open_by_handle syscall, which returns a file descriptor.

This needs more thought, filesystems don't usually tolerate
resurrecting a file which has already been unlinked (i_nlink == 0).

Thanks,
Miklos

>
> Signed-off-by: Aneesh Kumar K.V <[email protected]>
> ---
> fs/namei.c | 34 +++++++++++++++++++++++++---------
> 1 files changed, 25 insertions(+), 9 deletions(-)
>
> diff --git a/fs/namei.c b/fs/namei.c
> index a6a8093..9a7b71a 100644
> --- a/fs/namei.c
> +++ b/fs/namei.c
> @@ -2553,16 +2553,28 @@ SYSCALL_DEFINE5(linkat, int, olddfd, const char __user *, oldname,
> {
> struct dentry *new_dentry;
> struct nameidata nd;
> - struct path old_path;
> - int error;
> + struct path old_path, *old_pathp;
> + struct file *file = NULL;
> + int error, fput_needed;
> char *to;
>
> if ((flags & ~AT_SYMLINK_FOLLOW) != 0)
> return -EINVAL;
>
> - error = user_path_at(olddfd, oldname,
> - flags & AT_SYMLINK_FOLLOW ? LOOKUP_FOLLOW : 0,
> - &old_path);
> + if (oldname == NULL && olddfd != AT_FDCWD) {
> + file = fget_light(olddfd, &fput_needed);
> + if (file) {
> + old_pathp = &file->f_path;
> + error = 0;
> + } else
> + error = -EBADF;
> + } else {
> + error = user_path_at(olddfd, oldname,
> + flags & AT_SYMLINK_FOLLOW ?
> + LOOKUP_FOLLOW : 0,
> + &old_path);
> + old_pathp = &old_path;
> + }
> if (error)
> return error;
>
> @@ -2570,7 +2582,7 @@ SYSCALL_DEFINE5(linkat, int, olddfd, const char __user *, oldname,
> if (error)
> goto out;
> error = -EXDEV;
> - if (old_path.mnt != nd.path.mnt)
> + if (old_pathp->mnt != nd.path.mnt)
> goto out_release;
> new_dentry = lookup_create(&nd, 0);
> error = PTR_ERR(new_dentry);
> @@ -2579,10 +2591,11 @@ SYSCALL_DEFINE5(linkat, int, olddfd, const char __user *, oldname,
> error = mnt_want_write(nd.path.mnt);
> if (error)
> goto out_dput;
> - error = security_path_link(old_path.dentry, &nd.path, new_dentry);
> + error = security_path_link(old_pathp->dentry, &nd.path, new_dentry);
> if (error)
> goto out_drop_write;
> - error = vfs_link(old_path.dentry, nd.path.dentry->d_inode, new_dentry);
> + error = vfs_link(old_pathp->dentry,
> + nd.path.dentry->d_inode, new_dentry);
> out_drop_write:
> mnt_drop_write(nd.path.mnt);
> out_dput:
> @@ -2593,7 +2606,10 @@ out_release:
> path_put(&nd.path);
> putname(to);
> out:
> - path_put(&old_path);
> + if (file)
> + fput_light(file, fput_needed);
> + else
> + path_put(&old_path);
>
> return error;
> }
> --
> 1.7.2.rc1
>
>
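
The situation raised in this reply is easy to reproduce from userspace: a descriptor can outlive the last name of its file, at which point the link count seen through the descriptor is expected to be 0 on typical local filesystems. A minimal demonstration using only POSIX calls follows; the helper name nlink_after_unlink and the /tmp template are made up for this sketch.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Create a temporary file, unlink it while keeping the descriptor open,
 * and report the link count seen through the descriptor.
 * Returns -1 on any failure.
 */
static long nlink_after_unlink(void)
{
    char name[] = "/tmp/nlink-demo-XXXXXX";
    struct stat st;
    long nlink = -1;
    int fd = mkstemp(name);

    if (fd < 0)
        return -1;
    if (unlink(name) == 0 && fstat(fd, &st) == 0)
        nlink = (long)st.st_nlink;
    close(fd);
    return nlink;
}
```

A null-pathname linkat() on such a descriptor would ask the filesystem to give a name back to an inode it already considers unlinked, which is the case the reply says needs more thought.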

2010-07-12 08:15:50

by Miklos Szeredi

Subject: Re: [PATCH -V16 02/12] vfs: Add name to file handle conversion support

On Mon, 12 Jul 2010, Aneesh Kumar K.V wrote:
> The file handle also includes the mount id, which can be used
> to look up file system specific information such as the uuid
> in /proc/<pid>/mountinfo.
>
> Signed-off-by: Aneesh Kumar K.V <[email protected]>
> ---
> fs/open.c | 124 ++++++++++++++++++++++++++++++++++++++++++++++
> include/linux/fs.h | 9 +++
> include/linux/syscalls.h | 5 ++-
> 3 files changed, 137 insertions(+), 1 deletions(-)
>
> diff --git a/fs/open.c b/fs/open.c
> index 5463266..7ad8f28 100644
> --- a/fs/open.c
> +++ b/fs/open.c
> @@ -29,6 +29,7 @@
> #include <linux/falloc.h>
> #include <linux/fs_struct.h>
> #include <linux/ima.h>
> +#include <linux/exportfs.h>
>
> #include "internal.h"
>
> @@ -1040,3 +1041,126 @@ int nonseekable_open(struct inode *inode, struct file *filp)
> }
>
> EXPORT_SYMBOL(nonseekable_open);
> +
> +#ifdef CONFIG_EXPORTFS
> +/* limit the handle size to some value */
> +#define MAX_HANDLE_SZ 4096
> +static long do_sys_name_to_handle(struct path *path,
> + struct file_handle __user *ufh)
> +{
> + long retval;
> + int handle_size;
> + struct file_handle f_handle;
> + struct file_handle *handle = NULL;
> +
> + if (copy_from_user(&f_handle, ufh, sizeof(struct file_handle))) {
> + retval = -EFAULT;
> + goto err_out;
> + }
> + if (f_handle.handle_size > MAX_HANDLE_SZ) {
> + retval = -EINVAL;
> + goto err_out;
> + }
> + handle = kmalloc(sizeof(struct file_handle) + f_handle.handle_size,
> + GFP_KERNEL);
> + if (!handle) {
> + retval = -ENOMEM;
> + goto err_out;
> + }
> +
> + /* convert handle size to multiple of sizeof(u32) */
> + handle_size = f_handle.handle_size >> 2;
> +
> + /* we ask for a non connected handle */
> + retval = exportfs_encode_fh(path->dentry,
> + (struct fid *)handle->f_handle,
> + &handle_size, 0);
> + /* convert handle size to bytes */
> + handle_size *= sizeof(u32);
> + handle->handle_type = retval;
> + handle->handle_size = handle_size;
> + /* copy the mount id */
> + handle->mnt_id = path->mnt->mnt_id;
> + if (handle_size > f_handle.handle_size) {
> + /*
> + * set the handle_size to zero so we copy only
> + * non variable part of the file_handle
> + */
> + handle_size = 0;
> + retval = -EOVERFLOW;
> + } else
> + retval = 0;
> + if (copy_to_user(ufh, handle,
> + sizeof(struct file_handle) + handle_size))
> + retval = -EFAULT;
> +
> + kfree(handle);
> +err_out:
> + return retval;
> +}
> +
> +/**
> + * sys_name_to_handle_at: convert name to handle
> + * @dfd: directory relative to which name is interpreted if not absolute
> + * @name: name that should be converted to handle.
> + * @handle: resulting file handle
> + * @flag: flag value to indicate whether to follow symlink or not
> + *
> + * @handle->handle_size indicate the space available to store the
> + * variable part of the file handle in bytes. If there is not
> + * enough space, the field is updated to return the minimum
> + * value required.
> + */
> +SYSCALL_DEFINE4(name_to_handle_at, int, dfd, const char __user *, name,
> + struct file_handle __user *, handle, int, flag)
> +{
> +
> + int follow;
> + int fput_needed;
> + long ret = -EINVAL;
> + struct path path, *pp;
> + struct file *file = NULL;
> +
> + if ((flag & ~AT_SYMLINK_FOLLOW) != 0)
> + goto err_out;
> +
> + if (name == NULL && dfd != AT_FDCWD) {
> + file = fget_light(dfd, &fput_needed);
> + if (file) {
> + pp = &file->f_path;
> + ret = 0;
> + } else
> + ret = -EBADF;
> + } else {
> + follow = (flag & AT_SYMLINK_FOLLOW) ? LOOKUP_FOLLOW : 0;
> + ret = user_path_at(dfd, name, follow, &path);
> + pp = &path;
> + }
> + if (ret)
> + goto err_out;
> + /*
> + * We need to make sure that the file system
> + * supports decoding of the file handle
> + */
> + if (!pp->mnt->mnt_sb->s_export_op ||
> + !pp->mnt->mnt_sb->s_export_op->fh_to_dentry) {
> + ret = -EOPNOTSUPP;
> + goto out_path;
> + }
> + ret = do_sys_name_to_handle(pp, handle);
> +
> +out_path:
> + if (file)
> + fput_light(file, fput_needed);
> + else
> + path_put(&path);
> +err_out:
> + return ret;
> +}
> +#else
> +SYSCALL_DEFINE4(name_to_handle_at, int, dfd, const char __user *, name,
> + struct file_handle __user *, handle, int, flag)
> +{
> + return -ENOSYS;
> +}
> +#endif
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index 471e1ff..0e7cf4c 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -949,6 +949,15 @@ struct file {
> unsigned long f_mnt_write_state;
> #endif
> };
> +
> +struct file_handle {
> + int mnt_id;

The mount id is not part of the handle in that it's not used when
converting a handle back to a file descriptor. So it shouldn't be
included here.

The uuid can be looked up based on st_dev.

> + int handle_size;
> + int handle_type;
> + /* file identifier */
> + unsigned char f_handle[0];
> +};
> +
> extern spinlock_t files_lock;
> #define file_list_lock() spin_lock(&files_lock);
> #define file_list_unlock() spin_unlock(&files_lock);
> diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
> index 7f614ce..9d6d6e3 100644
> --- a/include/linux/syscalls.h
> +++ b/include/linux/syscalls.h
> @@ -61,6 +61,7 @@ struct robust_list_head;
> struct getcpu_cache;
> struct old_linux_dirent;
> struct perf_event_attr;
> +struct file_handle;
>
> #include <linux/types.h>
> #include <linux/aio_abi.h>
> @@ -823,5 +824,7 @@ asmlinkage long sys_mmap_pgoff(unsigned long addr, unsigned long len,
> unsigned long prot, unsigned long flags,
> unsigned long fd, unsigned long pgoff);
> asmlinkage long sys_old_mmap(struct mmap_arg_struct __user *arg);
> -
> +asmlinkage long sys_name_to_handle_at(int dfd, const char __user *name,
> + struct file_handle __user *handle,
> + int flag);
> #endif
> --
> 1.7.2.rc1
>
>
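
A sketch of the st_dev based lookup suggested in this reply: take st_dev from stat(2), split it into major and minor numbers, and look for the mountinfo line whose third field matches. mountinfo_line_matches_dev is a hypothetical helper; it checks a single line so that it can be exercised without reading /proc.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/sysmacros.h>

/*
 * Return 1 if the third field (major:minor) of one mountinfo line
 * equals dev, 0 otherwise (including when the line does not parse).
 */
static int mountinfo_line_matches_dev(const char *line, dev_t dev)
{
    int mnt_id, parent_id;
    unsigned int dev_major, dev_minor;

    assert(line != NULL);
    /* Fields 1-3: mount ID, parent ID, major:minor */
    if (sscanf(line, "%d %d %u:%u",
               &mnt_id, &parent_id, &dev_major, &dev_minor) != 4)
        return 0;
    return dev_major == major(dev) && dev_minor == minor(dev);
}
```

A caller would run this over each line of /proc/self/mountinfo with the st_dev of the file in question and then read whatever per-filesystem information it needs from the matching line.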

2010-07-12 08:23:50

by Miklos Szeredi

Subject: Re: [PATCH -V16 04/12] vfs: Allow handle based open on symlinks

On Mon, 12 Jul 2010, Aneesh Kumar K.V wrote:
> This patch updates may_open to allow handle based open on symlinks.
> The file handle based API uses the file descriptor returned from open_by_handle_at
> to do different file system operations. To find the link target name we
> need to get a file descriptor on symlinks.
>
> We should be able to read the link target using a file handle. The exact
> use case is implementing the READLINK operation on a
> userspace NFS server. The request contains the file handle and the
> response includes the target name.
>
> Signed-off-by: Aneesh Kumar K.V <[email protected]>
> ---
> fs/namei.c | 10 +++++++++-
> fs/open.c | 9 ++++++++-
> include/linux/fs.h | 1 +
> 3 files changed, 18 insertions(+), 2 deletions(-)
>
> diff --git a/fs/namei.c b/fs/namei.c
> index 4d590a3..a6a8093 100644
> --- a/fs/namei.c
> +++ b/fs/namei.c
> @@ -1456,7 +1456,15 @@ int may_open(struct path *path, int acc_mode, int flag)
>
> switch (inode->i_mode & S_IFMT) {
> case S_IFLNK:
> - return -ELOOP;
> + /*
> + * Allow only if acc_mode contain
> + * open link request and read request.
> + */
> + if (acc_mode != (MAY_OPEN_LINK | MAY_READ))

Why require MAY_READ?

Actually, open_by_handle() should be a good place to start supporting
O_NOACCESS from the start. I.e. neither read nor write access is
permitted on the file.


> + return -ELOOP;
> + if (flag != O_RDONLY)
> + return -ELOOP;
> + break;
> case S_IFDIR:
> if (acc_mode & MAY_WRITE)
> return -EISDIR;
> diff --git a/fs/open.c b/fs/open.c
> index df5d21e..afb089e 100644
> --- a/fs/open.c
> +++ b/fs/open.c
> @@ -1268,8 +1268,15 @@ static long do_sys_open_by_handle(int mountdirfd,
> */
> if (open_flag & __O_SYNC)
> open_flag |= O_DSYNC;
> + /*
> + * Handle based API allow open on a symlink
> + */
> + if (S_ISLNK(path->dentry->d_inode->i_mode))
> + acc_mode = MAY_OPEN_LINK;
> + else
> + acc_mode = MAY_OPEN;
>
> - acc_mode = MAY_OPEN | ACC_MODE(open_flag);
> + acc_mode |= ACC_MODE(open_flag);
>
> /* O_TRUNC implies we need access checks for write permissions */
> if (open_flag & O_TRUNC)
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index a458b4e..08afa72 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -53,6 +53,7 @@ struct inodes_stat_t {
> #define MAY_APPEND 8
> #define MAY_ACCESS 16
> #define MAY_OPEN 32
> +#define MAY_OPEN_LINK 64
>
> /*
> * flags in file.f_mode. Note that FMODE_READ and FMODE_WRITE must correspond
> --
> 1.7.2.rc1
>
>
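
The interaction between the may_open() hunk and the do_sys_open_by_handle() hunk quoted above is compact enough to model directly. In this sketch the MAY_OPEN and MAY_OPEN_LINK values come from the fs.h hunk, MAY_READ and MAY_WRITE take their usual kernel values (4 and 2), and ACC_MODE is reproduced from memory of kernels of this era, so treat that definition as an assumption. The model_* names are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>

#define MAY_WRITE       2
#define MAY_READ        4
#define MAY_OPEN        32
#define MAY_OPEN_LINK   64

/* Assumed definition: maps O_RDONLY/O_WRONLY/O_RDWR to MAY_* bits. */
#define ACC_MODE(x) ("\004\002\006\006"[(x) & O_ACCMODE])

/* acc_mode as computed in do_sys_open_by_handle() after the patch. */
static int model_acc_mode(int is_symlink, int open_flag)
{
    int acc_mode = is_symlink ? MAY_OPEN_LINK : MAY_OPEN;

    return acc_mode | ACC_MODE(open_flag);
}

/* The S_IFLNK branch of may_open() after the patch: 0 or -ELOOP. */
static int model_symlink_open_allowed(int acc_mode, int flag)
{
    /* Exact match: the open-link request plus read access, nothing else. */
    if (acc_mode != (MAY_OPEN_LINK | MAY_READ))
        return -ELOOP;
    /* Any flag beyond plain O_RDONLY is refused as well. */
    if (flag != O_RDONLY)
        return -ELOOP;
    return 0;
}
```

Since both tests are equality comparisons, a symlink opened with O_RDWR, or with O_RDONLY plus any extra flag such as O_NONBLOCK, still gets -ELOOP. Dropping the MAY_READ requirement, as the review asks, would mean relaxing the first comparison.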

2010-07-12 08:28:00

by Miklos Szeredi

Subject: Re: [PATCH -V16 10/12] vfs: Export file system uuid via /proc/<pid>mountinfo

On Mon, 12 Jul 2010, Aneesh Kumar K. V wrote:
> On Mon, 12 Jul 2010 12:05:43 +0530, "Aneesh Kumar K.V" <[email protected]> wrote:
> > We add a per superblock uuid field. File systems should
> > update the uuid in the fill_super callback
> >
> > Signed-off-by: Aneesh Kumar K.V <[email protected]>
> > ---
> > fs/namespace.c | 3 +++
> > include/linux/fs.h | 1 +
> > 2 files changed, 4 insertions(+), 0 deletions(-)
> >
> > diff --git a/fs/namespace.c b/fs/namespace.c
> > index 88058de..5dbdbd6 100644
> > --- a/fs/namespace.c
> > +++ b/fs/namespace.c
> > @@ -871,6 +871,9 @@ static int show_mountinfo(struct seq_file *m, void *v)
> > if (IS_MNT_UNBINDABLE(mnt))
> > seq_puts(m, " unbindable");
> >
> > + /* print the uuid */
> > + seq_printf(m, " uuid:%pU", mnt->mnt_sb->s_uuid);
> > +
> > /* Filesystem specific data */
> > seq_puts(m, " - ");
> > show_type(m, sb);
> > diff --git a/include/linux/fs.h b/include/linux/fs.h
> > index 3103c39..5f43472 100644
> > --- a/include/linux/fs.h
> > +++ b/include/linux/fs.h
> > @@ -1367,6 +1367,7 @@ struct super_block {
> > wait_queue_head_t s_wait_unfrozen;
> >
> > char s_id[32]; /* Informational name */
> > + u8 s_uuid[16]; /* UUID */
> >
> > void *s_fs_info; /* Filesystem private info */
> > fmode_t s_mode;
>
> Since it is an optional tag, is it OK to do the below patch? Or is
> "optional" only a way to introduce changes across kernel versions, with
> each line supposed to carry the newly added fields?

Tagged fields are optional, so yes, the patch is OK.

> Is there a userspace
> tool that uses /proc/<pid>/mountinfo?

libmount from recent enough util-linux is using mountinfo. So
mount(8) should be using it, as well as some other utilities in
util-linux.

Thanks,
Miklos

>
> diff --git a/fs/namespace.c b/fs/namespace.c
> index 5dbdbd6..7542959 100644
> --- a/fs/namespace.c
> +++ b/fs/namespace.c
> @@ -833,6 +833,16 @@ const struct seq_operations mounts_op = {
> .show = show_vfsmnt
> };
>
> +static int uuid_is_nil(u8 *uuid)
> +{
> +	int i;
> +	u8 *cp = (u8 *)uuid;
> +
> +	for (i = 0; i < 16; i++)
> +		if (*cp++) return 0; /* not nil */
> +	return 1; /* is nil */
> +}
> +
> static int show_mountinfo(struct seq_file *m, void *v)
> {
> struct proc_mounts *p = m->private;
> @@ -871,8 +881,9 @@ static int show_mountinfo(struct seq_file *m, void *v)
> if (IS_MNT_UNBINDABLE(mnt))
> seq_puts(m, " unbindable");
>
> +	if (!uuid_is_nil(mnt->mnt_sb->s_uuid))
>  	/* print the uuid */
> -	seq_printf(m, " uuid:%pU", mnt->mnt_sb->s_uuid);
> +		seq_printf(m, " uuid:%pU", mnt->mnt_sb->s_uuid);
>
> /* Filesystem specific data */
> seq_puts(m, " - ");
>
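
Because tagged fields are optional, a userspace reader of mountinfo has to scan the fields between the mount options (field 6) and the "-" separator rather than rely on fixed positions. The following is a minimal sketch of such a scan, not code from the patchset or from libmount. The helper name mountinfo_find_tag is hypothetical, and the "uuid" tag is only the one proposed in this thread; it is an assumption that a given kernel emits it.

```c
#include <stddef.h>
#include <string.h>

/*
 * Look for an optional "tag:value" field in one mountinfo line.
 * Optional fields sit after the sixth field (mount options) and
 * end at the single "-" separator.
 * Returns 1 and copies the value into out if found, 0 otherwise.
 * (Hypothetical helper for illustration only.)
 */
static int mountinfo_find_tag(const char *line, const char *tag,
                              char *out, size_t outlen)
{
    size_t taglen = strlen(tag);
    const char *p = line;
    int field = 0;

    while (*p) {
        size_t len;

        p += strspn(p, " \n");          /* skip separators */
        len = strcspn(p, " \n");        /* length of this field */
        if (len == 0)
            break;
        field++;
        if (field > 6) {
            if (len == 1 && p[0] == '-')
                break;                  /* end of optional fields */
            if (len > taglen && p[taglen] == ':' &&
                strncmp(p, tag, taglen) == 0) {
                size_t vlen = len - taglen - 1;

                if (vlen >= outlen)
                    return 0;           /* value does not fit */
                memcpy(out, p + taglen + 1, vlen);
                out[vlen] = '\0';
                return 1;
            }
        }
        p += len;
    }
    return 0;
}
```

A line without the tag simply yields 0, which matches the behaviour of the follow-up patch above that omits the field when the uuid is all zeroes.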

2010-07-12 09:33:23

by Aneesh Kumar K.V

Subject: Re: [PATCH -V16 02/12] vfs: Add name to file handle conversion support

On Mon, 12 Jul 2010 10:15:29 +0200, Miklos Szeredi <[email protected]> wrote:
> On Mon, 12 Jul 2010, Aneesh Kumar K.V wrote:
> > The file handle also includes the mount id, which can be used
> > to look up file system specific information such as the uuid
> > in /proc/<pid>/mountinfo
> >
> > Signed-off-by: Aneesh Kumar K.V <[email protected]>
> > ---
> > fs/open.c | 124 ++++++++++++++++++++++++++++++++++++++++++++++
> > include/linux/fs.h | 9 +++
> > include/linux/syscalls.h | 5 ++-
> > 3 files changed, 137 insertions(+), 1 deletions(-)
> >
> > diff --git a/fs/open.c b/fs/open.c
> > index 5463266..7ad8f28 100644
> > --- a/fs/open.c
> > +++ b/fs/open.c
> > @@ -29,6 +29,7 @@
> > #include <linux/falloc.h>
> > #include <linux/fs_struct.h>
> > #include <linux/ima.h>
> > +#include <linux/exportfs.h>
> >
> > #include "internal.h"
> >
> > @@ -1040,3 +1041,126 @@ int nonseekable_open(struct inode *inode, struct file *filp)
> > }
> >
> > EXPORT_SYMBOL(nonseekable_open);
> > +
> > +#ifdef CONFIG_EXPORTFS
> > +/* limit the handle size to some value */
> > +#define MAX_HANDLE_SZ 4096
> > +static long do_sys_name_to_handle(struct path *path,
> > + struct file_handle __user *ufh)
> > +{
> > + long retval;
> > + int handle_size;
> > + struct file_handle f_handle;
> > + struct file_handle *handle = NULL;
> > +
> > + if (copy_from_user(&f_handle, ufh, sizeof(struct file_handle))) {
> > + retval = -EFAULT;
> > + goto err_out;
> > + }
> > + if (f_handle.handle_size > MAX_HANDLE_SZ) {
> > + retval = -EINVAL;
> > + goto err_out;
> > + }
> > + handle = kmalloc(sizeof(struct file_handle) + f_handle.handle_size,
> > + GFP_KERNEL);
> > + if (!handle) {
> > + retval = -ENOMEM;
> > + goto err_out;
> > + }
> > +
> > + /* convert handle size to multiple of sizeof(u32) */
> > + handle_size = f_handle.handle_size >> 2;
> > +
> > + /* we ask for a non connected handle */
> > + retval = exportfs_encode_fh(path->dentry,
> > + (struct fid *)handle->f_handle,
> > + &handle_size, 0);
> > + /* convert handle size to bytes */
> > + handle_size *= sizeof(u32);
> > + handle->handle_type = retval;
> > + handle->handle_size = handle_size;
> > + /* copy the mount id */
> > + handle->mnt_id = path->mnt->mnt_id;
> > + if (handle_size > f_handle.handle_size) {
> > + /*
> > + * set the handle_size to zero so we copy only
> > + * non variable part of the file_handle
> > + */
> > + handle_size = 0;
> > + retval = -EOVERFLOW;
> > + } else
> > + retval = 0;
> > + if (copy_to_user(ufh, handle,
> > + sizeof(struct file_handle) + handle_size))
> > + retval = -EFAULT;
> > +
> > + kfree(handle);
> > +err_out:
> > + return retval;
> > +}
> > +
> > +/**
> > + * sys_name_to_handle_at: convert name to handle
> > + * @dfd: directory relative to which name is interpreted if not absolute
> > + * @name: name that should be converted to handle.
> > + * @handle: resulting file handle
> > + * @flag: flag value to indicate whether to follow symlink or not
> > + *
> > + * @handle->handle_size indicates the space available to store the
> > + * variable part of the file handle in bytes. If there is not
> > + * enough space, the field is updated to return the minimum
> > + * value required.
> > + */
> > +SYSCALL_DEFINE4(name_to_handle_at, int, dfd, const char __user *, name,
> > + struct file_handle __user *, handle, int, flag)
> > +{
> > +
> > + int follow;
> > + int fput_needed;
> > + long ret = -EINVAL;
> > + struct path path, *pp;
> > + struct file *file = NULL;
> > +
> > + if ((flag & ~AT_SYMLINK_FOLLOW) != 0)
> > + goto err_out;
> > +
> > + if (name == NULL && dfd != AT_FDCWD) {
> > + file = fget_light(dfd, &fput_needed);
> > + if (file) {
> > + pp = &file->f_path;
> > + ret = 0;
> > + } else
> > + ret = -EBADF;
> > + } else {
> > + follow = (flag & AT_SYMLINK_FOLLOW) ? LOOKUP_FOLLOW : 0;
> > + ret = user_path_at(dfd, name, follow, &path);
> > + pp = &path;
> > + }
> > + if (ret)
> > + goto err_out;
> > + /*
> > + * We need to make sure that the file system
> > + * supports decoding of the file handle
> > + */
> > + if (!pp->mnt->mnt_sb->s_export_op ||
> > + !pp->mnt->mnt_sb->s_export_op->fh_to_dentry) {
> > + ret = -EOPNOTSUPP;
> > + goto out_path;
> > + }
> > + ret = do_sys_name_to_handle(pp, handle);
> > +
> > +out_path:
> > + if (file)
> > + fput_light(file, fput_needed);
> > + else
> > + path_put(&path);
> > +err_out:
> > + return ret;
> > +}
> > +#else
> > +SYSCALL_DEFINE4(name_to_handle_at, int, dfd, const char __user *, name,
> > + struct file_handle __user *, handle, int, flag)
> > +{
> > + return -ENOSYS;
> > +}
> > +#endif
> > diff --git a/include/linux/fs.h b/include/linux/fs.h
> > index 471e1ff..0e7cf4c 100644
> > --- a/include/linux/fs.h
> > +++ b/include/linux/fs.h
> > @@ -949,6 +949,15 @@ struct file {
> > unsigned long f_mnt_write_state;
> > #endif
> > };
> > +
> > +struct file_handle {
> > + int mnt_id;
>
> The mount id is not part of the handle in that it's not used when
> converting back a handle to a file descriptor. So it shouldn't be
> included here.
>
> The uuid can be looked up based on st_dev.
>

That would involve another stat call on the file to get the st_dev? As
per the last review (Message-id:[email protected])
http://article.gmane.org/gmane.linux.kernel/1007385 we discussed that
it would be nice to add st_dev as a part of the handle. Later I suggested
it would be nice to get mount_id instead of st_dev because st_dev is
not stable (across remounts) for file systems that don't have a
backing device. So instead of using something that is partially stable,
add mnt_id, which is explicitly stated to be unstable across remounts.

If you are against having mount_id as a part of struct file_handle, do
you think we could add it as an extra argument to the syscall?

-aneesh

2010-07-12 09:42:39

by Aneesh Kumar K.V

Subject: Re: [PATCH -V16 04/12] vfs: Allow handle based open on symlinks

On Mon, 12 Jul 2010 10:23:21 +0200, Miklos Szeredi <[email protected]> wrote:
> On Mon, 12 Jul 2010, Aneesh Kumar K.V wrote:
> > The patch updates may_open to allow handle based open on symlinks.
> > The file handle based API uses the file descriptor returned from open_by_handle_at
> > to do different file system operations. To find the link target name we
> > need to get a file descriptor on symlinks.
> >
> > We should be able to read the link target using a file handle. The exact
> > usecase is with respect to implementing the READLINK operation on a
> > userspace NFS server. The request contains the file handle and the
> > response includes the target name.
> >
> > Signed-off-by: Aneesh Kumar K.V <[email protected]>
> > ---
> > fs/namei.c | 10 +++++++++-
> > fs/open.c | 9 ++++++++-
> > include/linux/fs.h | 1 +
> > 3 files changed, 18 insertions(+), 2 deletions(-)
> >
> > diff --git a/fs/namei.c b/fs/namei.c
> > index 4d590a3..a6a8093 100644
> > --- a/fs/namei.c
> > +++ b/fs/namei.c
> > @@ -1456,7 +1456,15 @@ int may_open(struct path *path, int acc_mode, int flag)
> >
> > switch (inode->i_mode & S_IFMT) {
> > case S_IFLNK:
> > - return -ELOOP;
> > + /*
> > + * Allow only if acc_mode contain
> > + * open link request and read request.
> > + */
> > + if (acc_mode != (MAY_OPEN_LINK | MAY_READ))
>
> Why require MAY_READ?

A value of 0x0 for flags in open(2) implies a read?

>
> Actually, open_by_handle() should be a good place to start supporting
> O_NOACCESS from the start. I.e. neither read nor write access is
> permitted on the file.

Yes, that would be ideal. But that would involve a much larger change. I was
hoping we could get something in this merge window with the change
suggested above?

>
>
> > + return -ELOOP;
> > + if (flag != O_RDONLY)
> > + return -ELOOP;
> > + break;
> > case S_IFDIR:
> > if (acc_mode & MAY_WRITE)
> > return -EISDIR;

-aneesh

2010-07-12 10:59:07

by Aneesh Kumar K.V

Subject: Re: [PATCH -V16 07/12] vfs: Support null pathname in linkat

On Mon, 12 Jul 2010 10:05:39 +0200, Miklos Szeredi <[email protected]> wrote:
> On Mon, 12 Jul 2010, Aneesh Kumar K.V wrote:
> > This enables the use of linkat to create hardlinks from a
> > file descriptor pointing to the file. This can be used
> > with the open_by_handle syscall, which returns a file descriptor.
>
> This needs more thought, filesystems don't usually tolerate
> resurrecting a file which has already been unlinked (i_nlink == 0).
>

We get an -ENOENT error when we do that on ext*:

open("test", O_RDONLY) = 3
unlink("test") = 0
linkat(3, NULL, AT_FDCWD, "test3", 0) = -1 ENOENT (No such file or directory)

ext4_link does the following:

/*
* Return -ENOENT if we've raced with unlink and i_nlink is 0. Doing
* otherwise has the potential to corrupt the orphan inode list.
*/
if (inode->i_nlink == 0)
return -ENOENT;

I can move this check to VFS so that we do it for all file systems.

-aneesh
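
The condition that the ext4 check tests can be observed from userspace through st_nlink on a descriptor that outlives its last name. The sketch below only creates a temporary file, unlinks it and reports the link count; it does not call linkat with a NULL path, since that is the interface under discussion here. The helper name nlink_after_unlink and the /tmp template are hypothetical, and a count of zero is what local filesystems typically report, not a guarantee for every filesystem.

```c
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Create a temporary file, remove its only name while keeping the
 * descriptor open, and return the link count seen through fstat().
 * Returns -1 on any error. (Hypothetical helper for illustration.)
 */
static long nlink_after_unlink(void)
{
    char path[] = "/tmp/nlink-demo-XXXXXX";
    struct stat st;
    long n = -1;
    int fd = mkstemp(path);

    if (fd < 0)
        return -1;
    /* The inode stays alive because fd still refers to it. */
    if (unlink(path) == 0 && fstat(fd, &st) == 0)
        n = (long)st.st_nlink;
    close(fd);
    return n;
}
```

A result of 0 is the state in which a VFS-level i_nlink check would refuse to create a new link.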

2010-07-12 16:39:24

by Miklos Szeredi

Subject: Re: [PATCH -V16 02/12] vfs: Add name to file handle conversion support

On Mon, 12 Jul 2010, Aneesh Kumar K. V wrote:
> That would involve another stat call on the file to get the st_dev? As
> per the last review (Message-id:[email protected])
> http://article.gmane.org/gmane.linux.kernel/1007385 we discussed that
> it would be nice to add st_dev as a part of the handle. Later I suggested
> it would be nice to get mount_id instead of st_dev because st_dev is
> not stable (across remounts) for file systems that don't have a
> backing device. So instead of using something that is partially stable,
> add mnt_id, which is explicitly stated to be unstable across remounts.
>
> If you are against having mount_id as a part of struct file_handle, do
> you think we could add it as an extra argument to the syscall?

Yeah, I think that would be cleaner.

Thanks,
Miklos
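
For readers who want to see the shape of the "extra argument" variant, here is a hedged sketch using the name_to_handle_at() wrapper that glibc provides today, where the mount ID is returned through a separate int pointer and the size field is called handle_bytes. That differs from the V16 patch quoted above (handle_size, with mnt_id inside the struct), so treat it as an illustration of the idea and of the EOVERFLOW size negotiation, not as the code under review. The helper name get_handle is hypothetical.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>

/*
 * Return a malloc()ed handle for path, or NULL with errno set.
 * The mount ID comes back through *mount_id, outside the handle.
 * (Hypothetical helper for illustration only.)
 */
static struct file_handle *get_handle(const char *path, int *mount_id)
{
    struct file_handle *fh, *bigger;
    int saved;

    fh = malloc(sizeof(*fh));
    if (!fh)
        return NULL;

    /* First call: no room for the variable part, so EOVERFLOW is
     * expected and the required size is written to handle_bytes. */
    fh->handle_bytes = 0;
    if (name_to_handle_at(AT_FDCWD, path, fh, mount_id, 0) != -1) {
        free(fh);
        errno = EINVAL;     /* unexpected success with empty buffer */
        return NULL;
    }
    if (errno != EOVERFLOW) {
        saved = errno;      /* e.g. ENOENT or EOPNOTSUPP */
        free(fh);
        errno = saved;
        return NULL;
    }

    bigger = realloc(fh, sizeof(*fh) + fh->handle_bytes);
    if (!bigger) {
        free(fh);
        errno = ENOMEM;
        return NULL;
    }
    fh = bigger;

    /* Second call: the buffer is now large enough. */
    if (name_to_handle_at(AT_FDCWD, path, fh, mount_id, 0) == -1) {
        saved = errno;
        free(fh);
        errno = saved;
        return NULL;
    }
    return fh;
}
```

The caller frees the result with free(). The mount ID can then be matched against the first field of a /proc/<pid>/mountinfo line, which is the lookup the thread has in mind.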

2010-07-12 16:56:54

by Miklos Szeredi

Subject: Re: [PATCH -V16 04/12] vfs: Allow handle based open on symlinks

On Mon, 12 Jul 2010, Aneesh Kumar K. V wrote:
> On Mon, 12 Jul 2010 10:23:21 +0200, Miklos Szeredi <[email protected]> wrote:
> > On Mon, 12 Jul 2010, Aneesh Kumar K.V wrote:
> > > > The patch updates may_open to allow handle based open on symlinks.
> > > > The file handle based API uses the file descriptor returned from open_by_handle_at
> > > > to do different file system operations. To find the link target name we
> > > > need to get a file descriptor on symlinks.
> > > >
> > > > We should be able to read the link target using a file handle. The exact
> > > > usecase is with respect to implementing the READLINK operation on a
> > > > userspace NFS server. The request contains the file handle and the
> > > > response includes the target name.
> > >
> > > Signed-off-by: Aneesh Kumar K.V <[email protected]>
> > > ---
> > > fs/namei.c | 10 +++++++++-
> > > fs/open.c | 9 ++++++++-
> > > include/linux/fs.h | 1 +
> > > 3 files changed, 18 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/fs/namei.c b/fs/namei.c
> > > index 4d590a3..a6a8093 100644
> > > --- a/fs/namei.c
> > > +++ b/fs/namei.c
> > > @@ -1456,7 +1456,15 @@ int may_open(struct path *path, int acc_mode, int flag)
> > >
> > > switch (inode->i_mode & S_IFMT) {
> > > case S_IFLNK:
> > > - return -ELOOP;
> > > + /*
> > > + * Allow only if acc_mode contain
> > > + * open link request and read request.
> > > + */
> > > + if (acc_mode != (MAY_OPEN_LINK | MAY_READ))
> >
> > Why require MAY_READ?
>
> > A value of 0x0 for flags in open(2) implies a read?

Yes.

However, a value of 0x3 is not defined in POSIX, and in Linux it's a
sort of weird O_NOACCESS: it requires both read and write permission
on the file but allows neither read nor write.

Requiring permission is because on device files the open can have side
effects. Not sure if open_by_handle() really wants to allow device
opens, that's another thing to think about.

Thanks,
Miklos
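
The access-mode values discussed above live in the low bits of the open(2) flags and can be decoded with O_ACCMODE. The sketch below is a userspace illustration of the semantics described in this message, not kernel code; the helper names are hypothetical, and the numeric values (O_RDONLY 0, O_WRONLY 1, O_RDWR 2, mask 3) are the Linux ones.

```c
#include <fcntl.h>

/* 1 if the access mode in flags permits read(), else 0.
 * (Hypothetical helper for illustration only.) */
static int accmode_allows_read(int flags)
{
    int mode = flags & O_ACCMODE;

    return mode == O_RDONLY || mode == O_RDWR;
}

/* 1 if the access mode in flags permits write(), else 0.
 * (Hypothetical helper for illustration only.) */
static int accmode_allows_write(int flags)
{
    int mode = flags & O_ACCMODE;

    return mode == O_WRONLY || mode == O_RDWR;
}
```

With these helpers a flags value of 0 reports read access only, while the undefined value 3 reports neither, which is the "weird O_NOACCESS" case mentioned above.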

2010-07-12 17:05:38

by Miklos Szeredi

Subject: Re: [PATCH -V16 07/12] vfs: Support null pathname in linkat

On Mon, 12 Jul 2010, Aneesh Kumar K. V wrote:
> /*
> * Return -ENOENT if we've raced with unlink and i_nlink is 0. Doing
> * otherwise has the potential to corrupt the orphan inode list.
> */
> if (inode->i_nlink == 0)
> return -ENOENT;
>
> I can move this check to VFS so that we do it for all file systems.

That makes sense. Hopefully filesystems which implement hard links
will change i_nlink to zero for the last unlink...

Thanks,
Miklos

2010-07-13 05:47:38

by Matt Helsley

Subject: Re: [PATCH -V16 07/12] vfs: Support null pathname in linkat

On Mon, Jul 12, 2010 at 10:05:39AM +0200, Miklos Szeredi wrote:
> On Mon, 12 Jul 2010, Aneesh Kumar K.V wrote:
> > This enables the use of linkat to create hardlinks from a
> > file descriptor pointing to the file. This can be used
> > with the open_by_handle syscall, which returns a file descriptor.
>
> This needs more thought, filesystems don't usually tolerate
> resurrecting a file which has already been unlinked (i_nlink == 0).

File resurrection support would be useful for more than the
open-by-handle patches.

Checkpoint/restart would make use of it too. Without it, programs that use
large unlinked files would take corresponding amounts of IO to checkpoint.
By resurrecting the file its contents can be checkpointed without copying
on filesystems or block devices that support CoW-sharing of snapshot data.
Then, during restart, we'd do something like:

1. Take a snapshot of the snapshot and remount it rw.
(Not necessary if userspace doesn't mind restart
destroying the checkpoint.)
2. Re-open the snapshot of the resurrected file.
<everything else we already do to restart open files (e.g. seek)>
N. Re-unlink the file.

Cheers,
-Matt Helsley

>
> Thanks,
> Miklos
>
> >
> > Signed-off-by: Aneesh Kumar K.V <[email protected]>
> > ---
> > fs/namei.c | 34 +++++++++++++++++++++++++---------
> > 1 files changed, 25 insertions(+), 9 deletions(-)
> >
> > diff --git a/fs/namei.c b/fs/namei.c
> > index a6a8093..9a7b71a 100644
> > --- a/fs/namei.c
> > +++ b/fs/namei.c
> > @@ -2553,16 +2553,28 @@ SYSCALL_DEFINE5(linkat, int, olddfd, const char __user *, oldname,
> > {
> > struct dentry *new_dentry;
> > struct nameidata nd;
> > - struct path old_path;
> > - int error;
> > + struct path old_path, *old_pathp;
> > + struct file *file = NULL;
> > + int error, fput_needed;
> > char *to;
> >
> > if ((flags & ~AT_SYMLINK_FOLLOW) != 0)
> > return -EINVAL;
> >
> > - error = user_path_at(olddfd, oldname,
> > - flags & AT_SYMLINK_FOLLOW ? LOOKUP_FOLLOW : 0,
> > - &old_path);
> > + if (oldname == NULL && olddfd != AT_FDCWD) {
> > + file = fget_light(olddfd, &fput_needed);
> > + if (file) {
> > + old_pathp = &file->f_path;
> > + error = 0;
> > + } else
> > + error = -EBADF;
> > + } else {
> > + error = user_path_at(olddfd, oldname,
> > + flags & AT_SYMLINK_FOLLOW ?
> > + LOOKUP_FOLLOW : 0,
> > + &old_path);
> > + old_pathp = &old_path;
> > + }
> > if (error)
> > return error;
> >
> > @@ -2570,7 +2582,7 @@ SYSCALL_DEFINE5(linkat, int, olddfd, const char __user *, oldname,
> > if (error)
> > goto out;
> > error = -EXDEV;
> > - if (old_path.mnt != nd.path.mnt)
> > + if (old_pathp->mnt != nd.path.mnt)
> > goto out_release;
> > new_dentry = lookup_create(&nd, 0);
> > error = PTR_ERR(new_dentry);
> > @@ -2579,10 +2591,11 @@ SYSCALL_DEFINE5(linkat, int, olddfd, const char __user *, oldname,
> > error = mnt_want_write(nd.path.mnt);
> > if (error)
> > goto out_dput;
> > - error = security_path_link(old_path.dentry, &nd.path, new_dentry);
> > + error = security_path_link(old_pathp->dentry, &nd.path, new_dentry);
> > if (error)
> > goto out_drop_write;
> > - error = vfs_link(old_path.dentry, nd.path.dentry->d_inode, new_dentry);
> > + error = vfs_link(old_pathp->dentry,
> > + nd.path.dentry->d_inode, new_dentry);
> > out_drop_write:
> > mnt_drop_write(nd.path.mnt);
> > out_dput:
> > @@ -2593,7 +2606,10 @@ out_release:
> > path_put(&nd.path);
> > putname(to);
> > out:
> > - path_put(&old_path);
> > + if (file)
> > + fput_light(file, fput_needed);
> > + else
> > + path_put(&old_path);
> >
> > return error;
> > }
> > --
> > 1.7.2.rc1