2009-06-16 20:40:41

by David Howells

Subject: [PATCH 00/17] [RFC] AFS: Implement OpenAFS pioctls



This series of patches provides a pioctl() system call, and makes kAFS use it
to provide a number of the OpenAFS pioctl functions, sufficient to allow a
number of OpenAFS userspace utilities to work with kAFS.

File-requiring pioctls have been tested with:

[root@andromeda ~]# fs getfid /afs
File /afs (1.1.0) contained in volume 1
[root@andromeda ~]# fs whichcell /afs
File /afs lives in cell 'procyon.org.uk'
[root@andromeda ~]# fs examine /afs
fs: You don't have the required access rights on '/afs'
[root@andromeda ~]# fs whereis /afs
File /afs is on host altair.procyon.org.uk

Non-file-requiring pioctls for manipulating authentication tokens have been
tested with:

[root@andromeda ~]# kinit admin/admin
Password for admin/[email protected]:
[root@andromeda ~]# klog admin
Password:
[root@andromeda ~]# keyctl show
Session Keyring
-3 --alswrv 0 0 keyring: _ses
939040040 --als--v 0 0 \_ rxrpc: [email protected]
[root@andromeda ~]# fs examine /afs
File /afs (1.1.32558) contained in volume 1
Volume status for vid = 536870912 named
Current disk quota is 5000
Current blocks used are 2
The partition has 39007484 blocks available out of 39187776

[root@andromeda ~]# aklog
[root@andromeda ~]# keyctl show
Session Keyring
-3 --alswrv 0 0 keyring: _ses
660792724 --als--v 0 0 \_ rxrpc: [email protected]
[root@andromeda ~]# tokens

Tokens held by the Cache Manager:

User's (AFS ID 10143) tokens for [email protected] [Expires Jun 17 00:47]
--End of list--

The AFS keys probably should be in their own keyring which is linked to from
the session keyring.


2009-06-16 20:39:52

by David Howells

Subject: [PATCH 08/17] AFS: Implement the PGetFileCell pioctl

From: Jacob Thebault-Spieker <[email protected]>

Implement the PGetFileCell pioctl for AFS. This will get the name of the cell
to which a file belongs and return it to userspace.

This can be tested with the OpenAFS userspace tools by doing:

fs whichcell /afs

on a mounted AFS filesystem, which should return something like:

File /afs lives in cell 'cambridge.redhat.com'
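The size check and copy that this pioctl performs can be sketched in userspace
roughly as follows. This is only an illustration under stated assumptions: the
struct mirrors the kernel-side vice_ioctl block from this series, and
copy_cell_name() is a hypothetical helper, not a function in the patch.

```c
#include <errno.h>
#include <string.h>

/* Userspace stand-in for the kernel-side argument block in this series. */
struct vice_ioctl {
	char *in;
	char *out;
	short in_size;
	short out_size;
};

/*
 * Hypothetical helper: copy a NUL-terminated cell name into the reply
 * buffer, or fail with -EINVAL if the buffer cannot hold it.
 */
long copy_cell_name(const char *cell_name, struct vice_ioctl *arg)
{
	size_t name_len = strlen(cell_name);

	/* The reply must hold the name plus its terminating NUL. */
	if (arg->out_size < 0 || (size_t)arg->out_size < name_len + 1)
		return -EINVAL;

	memcpy(arg->out, cell_name, name_len + 1);
	arg->out_size = (short)(name_len + 1);	/* report bytes written */
	return 0;
}
```

On success the caller sees out_size shrink to the length of the name including
the NUL, which is how the reply length is reported back.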

Signed-off-by: Jacob Thebault-Spieker <[email protected]>
Signed-off-by: David Howells <[email protected]>
---

fs/afs/pioctl.c | 31 +++++++++++++++++++++++++++++++
include/linux/afscall.h | 1 +
include/linux/venus.h | 1 +
3 files changed, 33 insertions(+), 0 deletions(-)


diff --git a/fs/afs/pioctl.c b/fs/afs/pioctl.c
index 2e4f741..3d95a0d 100644
--- a/fs/afs/pioctl.c
+++ b/fs/afs/pioctl.c
@@ -11,6 +11,7 @@
#include <linux/afscall.h>
#include <linux/pioctl.h>
#include <linux/venus.h>
+#include <linux/string.h>
#include "internal.h"

/*
@@ -39,6 +40,32 @@ static long afs_PGetFID(struct dentry *dentry, struct vice_ioctl *arg,
}

/*
+ * Get the cell that the file belongs to
+ */
+long afs_PGetFileCell(struct dentry *dentry, struct vice_ioctl *arg,
+ struct key *key)
+{
+ struct afs_vnode *vnode;
+ size_t name_len;
+
+ _enter("");
+
+ vnode = AFS_FS_I(dentry->d_inode);
+ name_len = strlen(vnode->volume->vlocation->cell->name);
+
+ if (arg->out_size < name_len + 1) {
+ _leave(" = -EINVAL [%d < %zu]", arg->out_size, name_len + 1);
+ return -EINVAL;
+ }
+
+ memcpy(arg->out, &vnode->volume->vlocation->cell->name, name_len + 1);
+ arg->out_size = name_len + 1;
+
+ _leave(" = 0 [%d]", arg->out_size);
+ return 0;
+}
+
+/*
* The AFS path-based I/O control operation
*/
long afs_pioctl(struct dentry *dentry, int cmd, struct vice_ioctl *arg)
@@ -64,6 +91,10 @@ long afs_pioctl(struct dentry *dentry, int cmd, struct vice_ioctl *arg)
ret = afs_PGetFID(dentry, arg, key);
break;

+ case VIOC_COMMAND(PGetFileCell):
+ ret = afs_PGetFileCell(dentry, arg, key);
+ break;
+
default:
_debug("fallback to pathless: %x", cmd);
ret = afs_pathless_pioctl(cmd, arg);
diff --git a/include/linux/afscall.h b/include/linux/afscall.h
index cb006a2..0976469 100644
--- a/include/linux/afscall.h
+++ b/include/linux/afscall.h
@@ -15,5 +15,6 @@

/* pioctl commands */
#define PGetFID 22 /* get file ID */
+#define PGetFileCell 30 /* get the cell a file inhabits */

#endif /* _LINUX_AFSCALL_H */
diff --git a/include/linux/venus.h b/include/linux/venus.h
index ea896e4..9cc115c 100644
--- a/include/linux/venus.h
+++ b/include/linux/venus.h
@@ -18,5 +18,6 @@
* pioctl commands (not usable as ioctls)
*/
#define VIOCGETFID _VICEIOCTL(PGetFID)
+#define VIOC_FILE_CELL_NAME _VICEIOCTL(PGetFileCell)

#endif /* _LINUX_VENUS_H */

2009-06-16 20:40:18

by David Howells

Subject: [PATCH 09/17] AFS: Implement the PGetVolStat pioctl

From: Wang Lei <[email protected]>

Implement the PGetVolStat pioctl for AFS. This will get the status information
for the volume in which a specified file is located.

This can be tested with the OpenAFS userspace tools by doing:

fs examine /afs

on a mounted AFS filesystem, which should return something like:

fs: You don't have the required access rights on '/afs'

or:

[root@andromeda ~]# fs examine /afs
File /afs (1.1.32576) contained in volume 1
Volume status for vid = 536870912 named
Current disk quota is 5000
Current blocks used are 2
The partition has 39007484 blocks available out of 39187776

if authenticated.
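For illustration, a userspace consumer might decode the fixed-size reply along
these lines. This is a sketch, not code from the OpenAFS tools: the struct
layout is copied from the patch below, while volstat_quota_remaining() is a
hypothetical helper and the "remaining quota" arithmetic is an assumption
about how the fields relate.

```c
#include <errno.h>
#include <string.h>

/* Reply record, laid out as in this patch's include/linux/afscall.h. */
struct VolumeStatus {
	int Vid;
	int ParentId;
	char Online;
	char InService;
	char Blessed;
	char NeedsSalvage;
	int Type;
	int MinQuota;
	int MaxQuota;
	int BlocksInUse;
	int PartBlocksAvail;
	int PartMaxBlocks;
};

/*
 * Hypothetical consumer: decode a PGetVolStat reply buffer and compute
 * how many quota blocks are left.
 */
long volstat_quota_remaining(const char *reply, short reply_size,
			     int *remaining)
{
	struct VolumeStatus vs;

	/* A short reply cannot contain a whole record. */
	if (reply_size < 0 || (size_t)reply_size < sizeof(vs))
		return -EINVAL;

	/* Copy rather than cast, so the alignment of 'reply' is irrelevant. */
	memcpy(&vs, reply, sizeof(vs));
	*remaining = vs.MaxQuota - vs.BlocksInUse;
	return 0;
}
```

With the figures from the sample output above (quota 5000, 2 blocks used) this
would yield 4998.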

Signed-off-by: Wang Lei <[email protected]>
Signed-off-by: David Howells <[email protected]>
---

fs/afs/pioctl.c | 35 +++++++++++++++++++++++++++++++++++
include/linux/afscall.h | 19 +++++++++++++++++++
include/linux/venus.h | 1 +
3 files changed, 55 insertions(+), 0 deletions(-)


diff --git a/fs/afs/pioctl.c b/fs/afs/pioctl.c
index 3d95a0d..4efd825 100644
--- a/fs/afs/pioctl.c
+++ b/fs/afs/pioctl.c
@@ -66,6 +66,37 @@ long afs_PGetFileCell(struct dentry *dentry, struct vice_ioctl *arg,
}

/*
+ * Get volume status for pathname
+ */
+long afs_PGetVolStat(struct dentry *dentry, struct vice_ioctl *arg,
+ struct key *key)
+{
+ struct afs_volume_status vs;
+ struct afs_vnode *vnode = AFS_FS_I(dentry->d_inode);
+ long ret;
+
+ _enter("");
+
+ if (arg->out_size < sizeof(struct VolumeStatus)) {
+ _leave(" = -EINVAL [%d < %zu]", arg->out_size,
+ sizeof(struct VolumeStatus));
+ return -EINVAL;
+ }
+
+ ret = afs_vnode_get_volume_status(vnode, key, &vs);
+ if (ret < 0) {
+ _leave(" = %ld", ret);
+ return ret;
+ }
+
+ memcpy(arg->out, &vs, sizeof(struct VolumeStatus));
+ arg->out_size = sizeof(struct VolumeStatus);
+
+ _leave(" = 0 [%d]", arg->out_size);
+ return 0;
+}
+
+/*
* The AFS path-based I/O control operation
*/
long afs_pioctl(struct dentry *dentry, int cmd, struct vice_ioctl *arg)
@@ -95,6 +126,10 @@ long afs_pioctl(struct dentry *dentry, int cmd, struct vice_ioctl *arg)
ret = afs_PGetFileCell(dentry, arg, key);
break;

+ case VIOC_COMMAND(PGetVolStat):
+ ret = afs_PGetVolStat(dentry, arg, key);
+ break;
+
default:
_debug("fallback to pathless: %x", cmd);
ret = afs_pathless_pioctl(cmd, arg);
diff --git a/include/linux/afscall.h b/include/linux/afscall.h
index 0976469..6772712 100644
--- a/include/linux/afscall.h
+++ b/include/linux/afscall.h
@@ -14,7 +14,26 @@
#define AFSCALL_PIOCTL 0x14

/* pioctl commands */
+#define PGetVolStat 4 /* get volume status */
#define PGetFID 22 /* get file ID */
#define PGetFileCell 30 /* get the cell a file inhabits */

+/*
+ * AFS volume status record
+ */
+struct VolumeStatus {
+ int Vid; /* volume ID */
+ int ParentId; /* parent volume ID */
+ char Online; /* 1 if volume currently online and available */
+ char InService; /* 1 if volume currently in service */
+ char Blessed; /* same as in_service */
+ char NeedsSalvage; /* 1 if consistency checking required */
+ int Type; /* volume type (afs_voltype_t) */
+ int MinQuota; /* minimum blocks set aside */
+ int MaxQuota; /* maximum blocks this volume may occupy */
+ int BlocksInUse; /* blocks this volume currently occupies */
+ int PartBlocksAvail;/* space available in volume's partition */
+ int PartMaxBlocks; /* size of volume's partition */
+};
+
#endif /* _LINUX_AFSCALL_H */
diff --git a/include/linux/venus.h b/include/linux/venus.h
index 9cc115c..437e7f3 100644
--- a/include/linux/venus.h
+++ b/include/linux/venus.h
@@ -17,6 +17,7 @@
/*
* pioctl commands (not usable as ioctls)
*/
+#define VIOCGETVOLSTAT _VICEIOCTL(PGetVolStat)
#define VIOCGETFID _VICEIOCTL(PGetFID)
#define VIOC_FILE_CELL_NAME _VICEIOCTL(PGetFileCell)

2009-06-16 20:41:07

by David Howells

Subject: [PATCH 05/17] AFS: Handle pathless pioctls aimed at AFS

From: Wang Lei <[email protected]>

Handle pathless pioctls aimed at the AFS client in general, rather than at
specific files, volumes or cells. Pathed pioctls whose command is not handled
by the path-based switch also fall back to the pathless handler.
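The handler in this patch compares commands with the ioctl size field masked
out. A userspace sketch of why that helps is given below. It assumes that
_VICEIOCTL(nr) expands to _IOW('V', nr, struct ViceIoctl), as I believe it does
in OpenAFS; the series does not show that definition here. COMPAT_VICEIOCTL()
and pioctl_cmd_matches() are hypothetical names used only for illustration.

```c
#include <linux/ioctl.h>

/* Native argument block and an assumed 32-bit compat layout. */
struct ViceIoctl {
	char *in;
	char *out;
	short in_size;
	short out_size;
};

struct compat_ViceIoctl {
	unsigned int in;
	unsigned int out;
	short in_size;
	short out_size;
};

/* Assumption: the command encoding embeds the size of the argument block. */
#define _VICEIOCTL(nr) \
	((unsigned int)_IOW('V', (nr), struct ViceIoctl))
#define COMPAT_VICEIOCTL(nr) \
	((unsigned int)_IOW('V', (nr), struct compat_ViceIoctl))

/* As in the patch: strip the size bits before comparing. */
#define VIOC_COMMAND(nr) (_VICEIOCTL(nr) & ~IOCSIZE_MASK)

#define PGetFID 22

/* Hypothetical helper: does 'cmd' name pioctl number 'nr', whatever its size? */
int pioctl_cmd_matches(unsigned int cmd, unsigned int nr)
{
	return (cmd & ~IOCSIZE_MASK) == VIOC_COMMAND(nr);
}
```

On a typical 64-bit host the two structs differ in size, so the raw command
values differ between native and compat callers; once the size bits are masked,
both compare equal to the same case label.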

Signed-off-by: Wang Lei <[email protected]>
Signed-off-by: David Howells <[email protected]>
---

fs/afs/internal.h | 3 +++
fs/afs/pioctl.c | 28 ++++++++++++++++++++++++++--
fs/afs/super.c | 17 +++++++++++++++++
3 files changed, 46 insertions(+), 2 deletions(-)


diff --git a/fs/afs/internal.h b/fs/afs/internal.h
index 0aaa324..9a8e8a2 100644
--- a/fs/afs/internal.h
+++ b/fs/afs/internal.h
@@ -13,6 +13,7 @@
#include <linux/kernel.h>
#include <linux/fs.h>
#include <linux/pagemap.h>
+#include <linux/pioctl.h>
#include <linux/skbuff.h>
#include <linux/rxrpc.h>
#include <linux/key.h>
@@ -588,6 +589,8 @@ extern void afs_mntpt_kill_timer(void);
*/
extern long afs_pioctl(struct dentry *, int, struct vice_ioctl *);

+extern long afs_pathless_pioctl(int, struct vice_ioctl *);
+
/*
* proc.c
*/
diff --git a/fs/afs/pioctl.c b/fs/afs/pioctl.c
index 63a6211..63d2fe1 100644
--- a/fs/afs/pioctl.c
+++ b/fs/afs/pioctl.c
@@ -33,8 +33,8 @@ long afs_pioctl(struct dentry *dentry, int cmd, struct vice_ioctl *arg)

switch (cmd) {
default:
- printk(KERN_DEBUG "AFS: Unsupported pioctl command %x\n", cmd);
- ret = -EOPNOTSUPP;
+ _debug("fallback to pathless: %x", cmd);
+ ret = afs_pathless_pioctl(cmd, arg);
break;
}

@@ -42,3 +42,27 @@ long afs_pioctl(struct dentry *dentry, int cmd, struct vice_ioctl *arg)
_leave(" = %ld", ret);
return ret;
}
+
+/*
+ * The AFS pathless pioctl handler
+ */
+long afs_pathless_pioctl(int cmd, struct vice_ioctl *arg)
+{
+ long ret;
+
+ _enter(",%x(%d),{%d,%d}",
+ cmd, _IOC_NR(cmd), arg->in_size, arg->out_size);
+
+#define VIOC_COMMAND(nr) (_VICEIOCTL(nr) & ~IOCSIZE_MASK)
+
+ switch (cmd & ~IOCSIZE_MASK) {
+ default:
+ printk(KERN_DEBUG
+ "AFS: Unsupported pioctl command %x\n", cmd);
+ ret = -EOPNOTSUPP;
+ break;
+ }
+
+ _leave(" = %ld", ret);
+ return ret;
+}
diff --git a/fs/afs/super.c b/fs/afs/super.c
index ad0514d..62a43ea 100644
--- a/fs/afs/super.c
+++ b/fs/afs/super.c
@@ -71,6 +71,11 @@ static const match_table_t afs_options_list = {
{ afs_no_opt, NULL },
};

+struct pathless_pioctl_handler afs_pathless_pioctl_handler = {
+ .owner = THIS_MODULE,
+ .pioctl = afs_pathless_pioctl,
+};
+
/*
* initialise the filesystem
*/
@@ -94,9 +99,20 @@ int __init afs_fs_init(void)
return ret;
}

+ /* register our pathless pioctl handler to pathless pioctl list */
+ ret = pathless_pioctl_register(&afs_pathless_pioctl_handler);
+ if (ret < 0) {
+ printk(KERN_NOTICE
+ "kAFS: Failed to register pathless pioctl handler\n");
+ kmem_cache_destroy(afs_inode_cachep);
+ _leave(" = %d", ret);
+ return ret;
+ }
+
/* now export our filesystem to lesser mortals */
ret = register_filesystem(&afs_fs_type);
if (ret < 0) {
+ pathless_pioctl_unregister(&afs_pathless_pioctl_handler);
kmem_cache_destroy(afs_inode_cachep);
_leave(" = %d", ret);
return ret;
@@ -114,6 +130,7 @@ void __exit afs_fs_exit(void)
_enter("");

afs_mntpt_kill_timer();
+ pathless_pioctl_unregister(&afs_pathless_pioctl_handler);
unregister_filesystem(&afs_fs_type);

if (atomic_read(&afs_count_active_inodes) != 0) {

2009-06-16 20:41:25

by David Howells

Subject: [PATCH 03/17] VFS: Implement handling for pathless pioctls

From: Wang Lei <[email protected]>

Implement handling for pathless pioctls. Because these take no path argument,
there's no way to know for certain which filesystem they're aimed at, so we
have to switch on command number instead. This patch allows interested parties
to register handlers. Each registered handler function is tried in turn until
one doesn't return -EOPNOTSUPP.

This is required because OpenAFS implemented a number of AFS calls that don't
get given a path as they're aimed at AFS in general, and not at a particular
file, volume or cell in the AFS world.
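The "try each handler until one doesn't return -EOPNOTSUPP" rule can be
sketched in userspace as below. This is a simplified model, not the kernel
code: locking and module reference counting are left out, and
handler_register(), dispatch_pathless() and the two demo handlers are
hypothetical names.

```c
#include <errno.h>
#include <stddef.h>

struct vice_ioctl {
	char *in;
	char *out;
	short in_size;
	short out_size;
};

struct pathless_pioctl_handler {
	struct pathless_pioctl_handler *next;
	long (*pioctl)(int cmd, struct vice_ioctl *arg);
};

static struct pathless_pioctl_handler *handlers;

/* Append a handler to the list, refusing duplicates. */
int handler_register(struct pathless_pioctl_handler *h)
{
	struct pathless_pioctl_handler **p;

	if (h->next)
		return -EBUSY;
	for (p = &handlers; *p; p = &(*p)->next)
		if (*p == h)
			return -EBUSY;
	*p = h;
	return 0;
}

/* Try each handler in turn; the first result other than -EOPNOTSUPP wins. */
long dispatch_pathless(int cmd, struct vice_ioctl *arg)
{
	struct pathless_pioctl_handler *p;

	for (p = handlers; p; p = p->next) {
		long ret = p->pioctl(cmd, arg);

		if (ret != -EOPNOTSUPP)
			return ret;
	}
	return -EOPNOTSUPP;
}

/* Demo handler that only recognises command 1. */
long demo_handler_a(int cmd, struct vice_ioctl *arg)
{
	(void)arg;
	return cmd == 1 ? 0 : -EOPNOTSUPP;
}

/* Demo handler that recognises command 2 but rejects it with an error. */
long demo_handler_b(int cmd, struct vice_ioctl *arg)
{
	(void)arg;
	return cmd == 2 ? -EACCES : -EOPNOTSUPP;
}
```

Note that any result other than -EOPNOTSUPP ends the walk, including a real
error such as -EACCES, so a handler that recognises a command owns its outcome.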

Signed-off-by: Wang Lei <[email protected]>
Signed-off-by: David Howells <[email protected]>
---

fs/compat_pioctl.c | 14 ++++--
fs/pioctl.c | 114 ++++++++++++++++++++++++++++++++++++++++++++++--
include/linux/pioctl.h | 12 +++++
3 files changed, 130 insertions(+), 10 deletions(-)


diff --git a/fs/compat_pioctl.c b/fs/compat_pioctl.c
index 9f2de77..36b0553 100644
--- a/fs/compat_pioctl.c
+++ b/fs/compat_pioctl.c
@@ -75,11 +75,15 @@ long compat_sys_pioctl(const char __user *filename, int cmd,
kargs.out = NULL;
}

- error = user_path(filename, &path);
- if (!error) {
- if (path.dentry->d_inode)
- error = vfs_pioctl(path.dentry, cmd, &kargs);
- path_put(&path);
+ if (!filename) {
+ error = vfs_pioctl(NULL, cmd, &kargs);
+ } else {
+ error = user_path(filename, &path);
+ if (!error) {
+ if (path.dentry->d_inode)
+ error = vfs_pioctl(path.dentry, cmd, &kargs);
+ path_put(&path);
+ }
}
kfree(kargs.in);

diff --git a/fs/pioctl.c b/fs/pioctl.c
index c17f220..1fe4bf8 100644
--- a/fs/pioctl.c
+++ b/fs/pioctl.c
@@ -1,5 +1,6 @@
/* Path-based I/O control
*
+ * Copyright (C) 2009 Wang Lei <[email protected]>
* Copyright (C) 2009 David Howells <[email protected]>
* Copyright (C) 2008 Jacob Thebault-Spieker <[email protected]>
*
@@ -12,15 +13,47 @@
#include <linux/syscalls.h>
#include <linux/uaccess.h>
#include <linux/kernel.h>
+#include <linux/module.h>
#include <linux/namei.h>
#include <linux/pioctl.h>
#include <linux/slab.h>

+static struct pathless_pioctl_handler *pathless_pioctls;
+static DECLARE_RWSEM(pathless_pioctls_rwsem);
+
+/*
+ * Traverse the pathless pioctl handlers list, to find the appropriate handler
+ */
+static long pathless_pioctl(int cmd, struct vice_ioctl *arg)
+{
+ struct pathless_pioctl_handler *p;
+ long ret;
+
+ down_read(&pathless_pioctls_rwsem);
+ p = pathless_pioctls;
+ while (p) {
+ if (try_module_get(p->owner)) {
+ ret = p->pioctl(cmd, arg);
+ module_put(p->owner);
+ if (ret != -EOPNOTSUPP) {
+ up_read(&pathless_pioctls_rwsem);
+ return ret;
+ }
+ }
+ p = p->next;
+ }
+ up_read(&pathless_pioctls_rwsem);
+ return -EOPNOTSUPP;
+}
+
/*
* VFS entry point for path-based I/O control
*/
long vfs_pioctl(struct dentry *dentry, int cmd, struct vice_ioctl *arg)
{
+ if (!dentry)
+ return pathless_pioctl(cmd, arg);
+
if (!dentry->d_inode->i_op || !dentry->d_inode->i_op->pioctl)
return -EPERM;

@@ -28,6 +61,73 @@ long vfs_pioctl(struct dentry *dentry, int cmd, struct vice_ioctl *arg)
}

/*
+ * Find the pointer to a pathless pioctl handler or the point at which it
+ * should be inserted
+ */
+static struct pathless_pioctl_handler **find_pathless_pioctl(
+ struct pathless_pioctl_handler *handler)
+{
+ struct pathless_pioctl_handler **p;
+
+ for (p = &pathless_pioctls; *p; p = &(*p)->next)
+ if ((*p) == handler)
+ break;
+ return p;
+}
+
+/**
+ * pathless_pioctl_register - Register a pathless pioctl handler
+ * @handler: The handler to be registered
+ *
+ * Add a handler to the list of pathless pioctl handlers, making sure that the
+ * handler is not already registered.
+ */
+int pathless_pioctl_register(struct pathless_pioctl_handler *handler)
+{
+ int res = 0;
+ struct pathless_pioctl_handler **p;
+
+ if (handler->next)
+ return -EBUSY;
+
+ down_write(&pathless_pioctls_rwsem);
+ p = find_pathless_pioctl(handler);
+ if (*p)
+ res = -EBUSY;
+ else
+ *p = handler;
+ up_write(&pathless_pioctls_rwsem);
+ return res;
+}
+EXPORT_SYMBOL(pathless_pioctl_register);
+
+/**
+ * pathless_pioctl_unregister - Unregister a pathless pioctl handler
+ * @handler: The handler to be unregistered
+ *
+ * Remove the special handler from the list of pathless pioctl handlers, making
+ * sure that the handler is already registered.
+ */
+int pathless_pioctl_unregister(struct pathless_pioctl_handler *handler)
+{
+ struct pathless_pioctl_handler **p;
+
+ down_write(&pathless_pioctls_rwsem);
+ for (p = &pathless_pioctls; *p; p = &(*p)->next) {
+ if (*p == handler) {
+ *p = handler->next;
+ handler->next = NULL;
+ up_write(&pathless_pioctls_rwsem);
+ return 0;
+ }
+
+ }
+ up_write(&pathless_pioctls_rwsem);
+ return -EINVAL;
+}
+EXPORT_SYMBOL(pathless_pioctl_unregister);
+
+/*
* Path-based I/O control system call
*/
SYSCALL_DEFINE4(pioctl,
@@ -82,11 +182,15 @@ SYSCALL_DEFINE4(pioctl,
kargs.out = NULL;
}

- error = user_path(filename, &path);
- if (!error) {
- if (path.dentry->d_inode)
- error = vfs_pioctl(path.dentry, cmd, &kargs);
- path_put(&path);
+ if (!filename) {
+ error = vfs_pioctl(NULL, cmd, &kargs);
+ } else {
+ error = user_path(filename, &path);
+ if (!error) {
+ if (path.dentry->d_inode)
+ error = vfs_pioctl(path.dentry, cmd, &kargs);
+ path_put(&path);
+ }
}
kfree(kargs.in);

diff --git a/include/linux/pioctl.h b/include/linux/pioctl.h
index 8e979f4..a4c1082 100644
--- a/include/linux/pioctl.h
+++ b/include/linux/pioctl.h
@@ -41,6 +41,18 @@ struct vice_ioctl {
*/
extern long vfs_pioctl(struct dentry *, int, struct vice_ioctl *);

+/*
+ * Pathless pioctl handler type
+ */
+struct pathless_pioctl_handler {
+ struct module *owner;
+ struct pathless_pioctl_handler *next;
+ long (*pioctl)(int cmd, struct vice_ioctl *);
+};
+
+extern int pathless_pioctl_register(struct pathless_pioctl_handler *);
+extern int pathless_pioctl_unregister(struct pathless_pioctl_handler *);
+
#else

/*

2009-06-16 20:41:38

by David Howells

Subject: [PATCH 01/17] VFS: Implement the pioctl() system call

From: Jacob Thebault-Spieker <[email protected]>

Implement the pioctl() system call. This is used to support a number of AFS
functions, and could also be used for Coda and other filesystems.
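The buffer handling done by the system call can be modelled in userspace
roughly as follows: validate the sizes, take private copies of the buffers,
call the handler, and copy the reply back only if the handler did not fail.
This is a sketch under assumptions, with malloc()/memcpy() standing in for
kmalloc()/copy_from_user()/copy_to_user(); marshal_pioctl() and demo_echo()
are hypothetical names, and zero-length buffers are simply skipped here.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

struct ViceIoctl {
	char *in;
	char *out;
	short in_size;
	short out_size;
};

typedef long (*pioctl_fn)(int cmd, struct ViceIoctl *kargs);

long marshal_pioctl(struct ViceIoctl *user, int cmd, pioctl_fn handler)
{
	struct ViceIoctl kargs = { NULL, NULL, 0, 0 };
	long error;

	if (user->in_size < 0 || user->out_size < 0)
		return -EINVAL;

	/* Private copy of the input buffer. */
	if (user->in && user->in_size > 0) {
		kargs.in = malloc((size_t)user->in_size);
		if (!kargs.in)
			return -ENOMEM;
		memcpy(kargs.in, user->in, (size_t)user->in_size);
		kargs.in_size = user->in_size;
	}

	/* Private reply buffer of the size the caller offered. */
	if (user->out && user->out_size > 0) {
		kargs.out = malloc((size_t)user->out_size);
		if (!kargs.out) {
			free(kargs.in);
			return -ENOMEM;
		}
		kargs.out_size = user->out_size;
	}

	error = handler(cmd, &kargs);
	free(kargs.in);

	/* Only a successful call hands its reply and length back. */
	if (kargs.out) {
		if (error >= 0) {
			memcpy(user->out, kargs.out, (size_t)kargs.out_size);
			user->out_size = kargs.out_size;
		}
		free(kargs.out);
	}
	return error;
}

/* Demo handler: echo the input into the reply buffer if it fits. */
long demo_echo(int cmd, struct ViceIoctl *kargs)
{
	(void)cmd;
	if (kargs->out_size < kargs->in_size)
		return -EINVAL;
	if (kargs->in_size > 0)
		memcpy(kargs->out, kargs->in, (size_t)kargs->in_size);
	kargs->out_size = kargs->in_size;
	return 0;
}
```

The model trusts the handler not to enlarge out_size beyond what was
allocated; a more defensive version would check that before copying back.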

Signed-off-by: Jacob Thebault-Spieker <[email protected]>
Signed-off-by: David Howells <[email protected]>
---

arch/x86/ia32/ia32entry.S | 1
arch/x86/include/asm/unistd_32.h | 1
arch/x86/include/asm/unistd_64.h | 2 +
arch/x86/kernel/syscall_table_32.S | 1
fs/Kconfig | 6 ++
fs/Makefile | 3 +
fs/afs/Kconfig | 1
fs/afs/Makefile | 3 +
fs/afs/dir.c | 1
fs/afs/file.c | 1
fs/afs/inode.c | 9 +++
fs/afs/internal.h | 5 ++
fs/afs/mntpt.c | 1
fs/afs/pioctl.c | 24 ++++++++
fs/compat_pioctl.c | 100 ++++++++++++++++++++++++++++++++++
fs/pioctl.c | 107 ++++++++++++++++++++++++++++++++++++
include/linux/compat.h | 10 +++
include/linux/fs.h | 2 +
include/linux/pioctl.h | 58 ++++++++++++++++++++
include/linux/syscalls.h | 5 ++
20 files changed, 339 insertions(+), 2 deletions(-)
create mode 100644 fs/afs/pioctl.c
create mode 100644 fs/compat_pioctl.c
create mode 100644 fs/pioctl.c
create mode 100644 include/linux/pioctl.h


diff --git a/arch/x86/ia32/ia32entry.S b/arch/x86/ia32/ia32entry.S
index e590261..5caa7bd 100644
--- a/arch/x86/ia32/ia32entry.S
+++ b/arch/x86/ia32/ia32entry.S
@@ -832,4 +832,5 @@ ia32_sys_call_table:
.quad compat_sys_pwritev
.quad compat_sys_rt_tgsigqueueinfo /* 335 */
.quad sys_perf_counter_open
+ .quad compat_sys_pioctl
ia32_syscall_end:
diff --git a/arch/x86/include/asm/unistd_32.h b/arch/x86/include/asm/unistd_32.h
index 732a307..f4f5c35 100644
--- a/arch/x86/include/asm/unistd_32.h
+++ b/arch/x86/include/asm/unistd_32.h
@@ -342,6 +342,7 @@
#define __NR_pwritev 334
#define __NR_rt_tgsigqueueinfo 335
#define __NR_perf_counter_open 336
+#define __NR_pioctl 337

#ifdef __KERNEL__

diff --git a/arch/x86/include/asm/unistd_64.h b/arch/x86/include/asm/unistd_64.h
index 900e161..495d0fb 100644
--- a/arch/x86/include/asm/unistd_64.h
+++ b/arch/x86/include/asm/unistd_64.h
@@ -661,6 +661,8 @@ __SYSCALL(__NR_pwritev, sys_pwritev)
__SYSCALL(__NR_rt_tgsigqueueinfo, sys_rt_tgsigqueueinfo)
#define __NR_perf_counter_open 298
__SYSCALL(__NR_perf_counter_open, sys_perf_counter_open)
+#define __NR_pioctl 299
+__SYSCALL(__NR_pioctl, sys_pioctl)

#ifndef __NO_STUBS
#define __ARCH_WANT_OLD_READDIR
diff --git a/arch/x86/kernel/syscall_table_32.S b/arch/x86/kernel/syscall_table_32.S
index d51321d..723f33e 100644
--- a/arch/x86/kernel/syscall_table_32.S
+++ b/arch/x86/kernel/syscall_table_32.S
@@ -336,3 +336,4 @@ ENTRY(sys_call_table)
.long sys_pwritev
.long sys_rt_tgsigqueueinfo /* 335 */
.long sys_perf_counter_open
+ .long sys_pioctl
diff --git a/fs/Kconfig b/fs/Kconfig
index 525da2e..69cff4a 100644
--- a/fs/Kconfig
+++ b/fs/Kconfig
@@ -76,6 +76,12 @@ config GENERIC_ACL
bool
select FS_POSIX_ACL

+config PIOCTL
+ bool
+ help
+ This option enables the pioctl() system call, which is used by AFS
+ and Coda.
+
menu "Caches"

source "fs/fscache/Kconfig"
diff --git a/fs/Makefile b/fs/Makefile
index af6d047..d5bf38a 100644
--- a/fs/Makefile
+++ b/fs/Makefile
@@ -51,6 +51,9 @@ obj-$(CONFIG_FS_POSIX_ACL) += posix_acl.o xattr_acl.o
obj-$(CONFIG_NFS_COMMON) += nfs_common/
obj-$(CONFIG_GENERIC_ACL) += generic_acl.o

+pioctl-compat-$(CONFIG_COMPAT) := compat_pioctl.o
+obj-$(CONFIG_PIOCTL) += pioctl.o $(pioctl-compat-y)
+
obj-y += quota/

obj-$(CONFIG_PROC_FS) += proc/
diff --git a/fs/afs/Kconfig b/fs/afs/Kconfig
index 5c4e61d..2bd2324 100644
--- a/fs/afs/Kconfig
+++ b/fs/afs/Kconfig
@@ -2,6 +2,7 @@ config AFS_FS
tristate "Andrew File System support (AFS) (EXPERIMENTAL)"
depends on INET && EXPERIMENTAL
select AF_RXRPC
+ select PIOCTL
help
If you say Y here, you will get an experimental Andrew File System
driver. It currently only supports unsecured read-only AFS access.
diff --git a/fs/afs/Makefile b/fs/afs/Makefile
index 4f64b95..0160227 100644
--- a/fs/afs/Makefile
+++ b/fs/afs/Makefile
@@ -27,6 +27,7 @@ kafs-objs := \
vlocation.o \
vnode.o \
volume.o \
- write.o
+ write.o \
+ pioctl.o

obj-$(CONFIG_AFS_FS) := kafs.o
diff --git a/fs/afs/dir.c b/fs/afs/dir.c
index 9bd7577..e1a785c 100644
--- a/fs/afs/dir.c
+++ b/fs/afs/dir.c
@@ -60,6 +60,7 @@ const struct inode_operations afs_dir_inode_operations = {
.permission = afs_permission,
.getattr = afs_getattr,
.setattr = afs_setattr,
+ .pioctl = afs_pioctl,
};

static const struct dentry_operations afs_fs_dentry_operations = {
diff --git a/fs/afs/file.c b/fs/afs/file.c
index 0149dab..73835b7 100644
--- a/fs/afs/file.c
+++ b/fs/afs/file.c
@@ -45,6 +45,7 @@ const struct inode_operations afs_file_inode_operations = {
.getattr = afs_getattr,
.setattr = afs_setattr,
.permission = afs_permission,
+ .pioctl = afs_pioctl,
};

const struct address_space_operations afs_fs_aops = {
diff --git a/fs/afs/inode.c b/fs/afs/inode.c
index c048f06..f1de608 100644
--- a/fs/afs/inode.c
+++ b/fs/afs/inode.c
@@ -27,6 +27,13 @@ struct afs_iget_data {
struct afs_volume *volume; /* volume on which resides */
};

+static const struct inode_operations afs_symlink_inode_operations = {
+ .readlink = generic_readlink,
+ .follow_link = page_follow_link_light,
+ .put_link = page_put_link,
+ .pioctl = afs_pioctl,
+};
+
/*
* map the AFS file status to the inode member variables
*/
@@ -54,7 +61,7 @@ static int afs_inode_map_status(struct afs_vnode *vnode, struct key *key)
break;
case AFS_FTYPE_SYMLINK:
inode->i_mode = S_IFLNK | vnode->status.mode;
- inode->i_op = &page_symlink_inode_operations;
+ inode->i_op = &afs_symlink_inode_operations;
break;
default:
printk("kAFS: AFS vnode with undefined type\n");
diff --git a/fs/afs/internal.h b/fs/afs/internal.h
index 106be66..0aaa324 100644
--- a/fs/afs/internal.h
+++ b/fs/afs/internal.h
@@ -584,6 +584,11 @@ extern int afs_mntpt_check_symlink(struct afs_vnode *, struct key *);
extern void afs_mntpt_kill_timer(void);

/*
+ * pioctl.c
+ */
+extern long afs_pioctl(struct dentry *, int, struct vice_ioctl *);
+
+/*
* proc.c
*/
extern int afs_proc_init(void);
diff --git a/fs/afs/mntpt.c b/fs/afs/mntpt.c
index c52be53..153bea5 100644
--- a/fs/afs/mntpt.c
+++ b/fs/afs/mntpt.c
@@ -37,6 +37,7 @@ const struct inode_operations afs_mntpt_inode_operations = {
.follow_link = afs_mntpt_follow_link,
.readlink = page_readlink,
.getattr = afs_getattr,
+ .pioctl = afs_pioctl,
};

static LIST_HEAD(afs_vfsmounts);
diff --git a/fs/afs/pioctl.c b/fs/afs/pioctl.c
new file mode 100644
index 0000000..e266f27
--- /dev/null
+++ b/fs/afs/pioctl.c
@@ -0,0 +1,24 @@
+/* Path-based I/O control
+ *
+ * Copyright (C) 2008 Jacob Thebault-Spieker <[email protected]>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ */
+#include <linux/fs.h>
+#include <linux/pioctl.h>
+#include "internal.h"
+
+/*
+ * The AFS path-based I/O control operation
+ */
+long afs_pioctl(struct dentry *dentry, int cmd, struct vice_ioctl *arg)
+{
+ switch (cmd) {
+ default:
+ printk(KERN_DEBUG "AFS: Unsupported pioctl command %x\n", cmd);
+ return -EOPNOTSUPP;
+ }
+}
diff --git a/fs/compat_pioctl.c b/fs/compat_pioctl.c
new file mode 100644
index 0000000..9f2de77
--- /dev/null
+++ b/fs/compat_pioctl.c
@@ -0,0 +1,100 @@
+/* Path-based I/O control, compatibility
+ *
+ * Copyright (C) 2009 David Howells <[email protected]>
+ * Copyright (C) 2008 Jacob Thebault-Spieker <[email protected]>
+ *
+ * Modified by David Howells <[email protected]>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ */
+#include <linux/fs.h>
+#include <linux/syscalls.h>
+#include <linux/uaccess.h>
+#include <linux/kernel.h>
+#include <linux/compat.h>
+#include <linux/namei.h>
+#include <linux/pioctl.h>
+#include <linux/slab.h>
+
+/*
+ * Path-based I/O control system call, 32-bit compatibility version
+ */
+long compat_sys_pioctl(const char __user *filename, int cmd,
+ struct compat_ViceIoctl __user *arg, int follow)
+{
+ struct compat_ViceIoctl user_args;
+ struct vice_ioctl kargs;
+ struct path path;
+ long error;
+
+ if (copy_from_user(&user_args, arg, sizeof(user_args)) != 0)
+ return -EFAULT;
+
+ if (user_args.in_size < 0 || user_args.out_size < 0)
+ return -EINVAL;
+
+ if (user_args.in) {
+ if (unlikely(!access_ok(VERIFY_READ,
+ compat_ptr(user_args.in),
+ user_args.in_size)))
+ return -EFAULT;
+
+ kargs.in = kmalloc(user_args.in_size, GFP_KERNEL);
+ if (!kargs.in)
+ return -ENOMEM;
+
+ if (copy_from_user(kargs.in,
+ compat_ptr(user_args.in),
+ user_args.in_size) != 0) {
+ kfree(kargs.in);
+ return -EFAULT;
+ }
+ kargs.in_size = user_args.in_size;
+ } else {
+ kargs.in_size = 0;
+ kargs.in = NULL;
+ }
+
+ if (user_args.out) {
+ if (unlikely(!access_ok(VERIFY_WRITE,
+ compat_ptr(user_args.out),
+ user_args.out_size))) {
+ kfree(kargs.in);
+ return -EFAULT;
+ }
+ kargs.out = kmalloc(user_args.out_size, GFP_KERNEL);
+ if (!kargs.out) {
+ kfree(kargs.in);
+ return -ENOMEM;
+ }
+ } else {
+ kargs.out_size = 0;
+ kargs.out = NULL;
+ }
+
+ error = user_path(filename, &path);
+ if (!error) {
+ if (path.dentry->d_inode)
+ error = vfs_pioctl(path.dentry, cmd, &kargs);
+ path_put(&path);
+ }
+ kfree(kargs.in);
+
+ if (user_args.out) {
+ if (error >= 0) {
+ if (copy_to_user(compat_ptr(user_args.out), kargs.out,
+ kargs.out_size) != 0) {
+ kfree(kargs.out);
+ return -EFAULT;
+ }
+ if (put_user(kargs.out_size, &arg->out_size) != 0)
+ error = -EFAULT;
+ }
+ kfree(kargs.out);
+ }
+
+ return error;
+}
diff --git a/fs/pioctl.c b/fs/pioctl.c
new file mode 100644
index 0000000..c17f220
--- /dev/null
+++ b/fs/pioctl.c
@@ -0,0 +1,107 @@
+/* Path-based I/O control
+ *
+ * Copyright (C) 2009 David Howells <[email protected]>
+ * Copyright (C) 2008 Jacob Thebault-Spieker <[email protected]>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ */
+#include <linux/fs.h>
+#include <linux/syscalls.h>
+#include <linux/uaccess.h>
+#include <linux/kernel.h>
+#include <linux/namei.h>
+#include <linux/pioctl.h>
+#include <linux/slab.h>
+
+/*
+ * VFS entry point for path-based I/O control
+ */
+long vfs_pioctl(struct dentry *dentry, int cmd, struct vice_ioctl *arg)
+{
+ if (!dentry->d_inode->i_op || !dentry->d_inode->i_op->pioctl)
+ return -EPERM;
+
+ return dentry->d_inode->i_op->pioctl(dentry, cmd, arg);
+}
+
+/*
+ * Path-based I/O control system call
+ */
+SYSCALL_DEFINE4(pioctl,
+ const char __user *, filename, int, cmd,
+ struct ViceIoctl __user *, arg, int, follow)
+{
+ struct vice_ioctl kargs;
+ struct ViceIoctl user_args;
+ struct path path;
+ long error;
+
+ if (copy_from_user(&user_args, arg, sizeof(user_args)) != 0)
+ return -EFAULT;
+
+ if (user_args.in_size < 0 || user_args.out_size < 0)
+ return -EINVAL;
+
+ if (user_args.in) {
+ if (unlikely(!access_ok(VERIFY_READ, user_args.in,
+ user_args.in_size)))
+ return -EFAULT;
+
+ kargs.in = kmalloc(user_args.in_size, GFP_KERNEL);
+ if (!kargs.in)
+ return -ENOMEM;
+
+ if (copy_from_user(kargs.in, user_args.in,
+ user_args.in_size) != 0) {
+ kfree(kargs.in);
+ return -EFAULT;
+ }
+ kargs.in_size = user_args.in_size;
+ } else {
+ kargs.in = NULL;
+ kargs.in_size = 0;
+ }
+
+ if (user_args.out) {
+ if (unlikely(!access_ok(VERIFY_WRITE, user_args.out,
+ user_args.out_size))) {
+ kfree(kargs.in);
+ return -EFAULT;
+ }
+ kargs.out = kmalloc(user_args.out_size, GFP_KERNEL);
+ if (!kargs.out) {
+ kfree(kargs.in);
+ return -ENOMEM;
+ }
+ kargs.out_size = user_args.out_size;
+ } else {
+ kargs.out_size = 0;
+ kargs.out = NULL;
+ }
+
+ error = user_path(filename, &path);
+ if (!error) {
+ if (path.dentry->d_inode)
+ error = vfs_pioctl(path.dentry, cmd, &kargs);
+ path_put(&path);
+ }
+ kfree(kargs.in);
+
+ if (user_args.out) {
+ if (error >= 0) {
+ if (copy_to_user(user_args.out, kargs.out,
+ kargs.out_size) != 0) {
+ kfree(kargs.out);
+ return -EFAULT;
+ }
+ if (put_user(kargs.out_size, &arg->out_size) != 0)
+ error = -EFAULT;
+ }
+ kfree(kargs.out);
+ }
+
+ return error;
+}
diff --git a/include/linux/compat.h b/include/linux/compat.h
index af931ee..35afe29 100644
--- a/include/linux/compat.h
+++ b/include/linux/compat.h
@@ -132,6 +132,13 @@ struct compat_ustat {
char f_fpack[6];
};

+struct compat_ViceIoctl {
+ compat_uptr_t in;
+ compat_uptr_t out;
+ short in_size;
+ short out_size;
+};
+
typedef union compat_sigval {
compat_int_t sival_int;
compat_uptr_t sival_ptr;
@@ -308,6 +315,9 @@ asmlinkage long compat_sys_newfstatat(unsigned int dfd, char __user * filename,
int flag);
asmlinkage long compat_sys_openat(unsigned int dfd, const char __user *filename,
int flags, int mode);
+asmlinkage long compat_sys_pioctl(const char __user *filename, int cmd,
+ struct compat_ViceIoctl __user *arg,
+ int follow);

#endif /* CONFIG_COMPAT */
#endif /* _LINUX_COMPAT_H */
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 32b0228..1737524 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -388,6 +388,7 @@ struct kstatfs;
struct vm_area_struct;
struct vfsmount;
struct cred;
+struct vice_ioctl;

extern void __init inode_init(void);
extern void __init inode_init_early(void);
@@ -1531,6 +1532,7 @@ struct inode_operations {
loff_t len);
int (*fiemap)(struct inode *, struct fiemap_extent_info *, u64 start,
u64 len);
+ long (*pioctl)(struct dentry *, int, struct vice_ioctl *);
};

struct seq_file;
diff --git a/include/linux/pioctl.h b/include/linux/pioctl.h
new file mode 100644
index 0000000..8e979f4
--- /dev/null
+++ b/include/linux/pioctl.h
@@ -0,0 +1,58 @@
+/* Path-based I/O control command listing
+ *
+ * Copyright (C) 2008 Red Hat, Inc. All Rights Reserved.
+ * Written by David Howells ([email protected])
+ *
+ * Modifications Copyright (C) 2008
+ * Jacob Thebault-Spieker <[email protected]>
+ *
+ * pioctl definitions taken from http://grand.central.org/numbers/pioctls.html
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public Licence
+ * as published by the Free Software Foundation; either version
+ * 2 of the Licence, or (at your option) any later version.
+ */
+
+#ifndef _LINUX_PIOCTL_H
+#define _LINUX_PIOCTL_H
+
+#ifdef __KERNEL__
+
+/*
+ * pioctl syscall argument block
+ */
+struct ViceIoctl {
+ caddr_t __user in; /* input/argument buffer (or NULL) */
+ caddr_t __user out; /* output/reply buffer (or NULL) */
+ short in_size; /* size of input buffer (or 0) */
+ short out_size; /* size of output buffer (or 0) */
+};
+
+struct vice_ioctl {
+ char *in;
+ char *out;
+ short in_size;
+ short out_size;
+};
+
+/*
+ * Internal pioctl handler
+ */
+extern long vfs_pioctl(struct dentry *, int, struct vice_ioctl *);
+
+#else
+
+/*
+ * Userspace version of pioctl syscall argument block
+ */
+struct ViceIoctl {
+ caddr_t in;
+ caddr_t out;
+ short in_size;
+ short out_size;
+};
+
+#endif /* __KERNEL__ */
+
+#endif /* _LINUX_PIOCTL_H */
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index 418d90f..ab6f49f 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -56,6 +56,7 @@ struct robust_list_head;
struct getcpu_cache;
struct old_linux_dirent;
struct perf_counter_attr;
+struct ViceIoctl;

#include <linux/types.h>
#include <linux/aio_abi.h>
@@ -760,4 +761,8 @@ int kernel_execve(const char *filename, char *const argv[], char *const envp[]);
asmlinkage long sys_perf_counter_open(
struct perf_counter_attr __user *attr_uptr,
pid_t pid, int cpu, int group_fd, unsigned long flags);
+
+asmlinkage long sys_pioctl(const char __user *filename, int cmd,
+ struct ViceIoctl __user *args, int nofollow);
+
#endif

2009-06-16 20:41:56

by David Howells

Subject: [PATCH 10/17] AFS: Implement the PWhereIs pioctl

From: Wang Lei <[email protected]>

Implement the PWhereIs pioctl for AFS. This will find out on which servers the
volume containing the specified file is located and return the IPv4 addresses
to userspace.

This can be tested with the OpenAFS userspace tools by doing:

fs whereis /afs

on a mounted AFS filesystem, which should return something like:

File /afs is on host altair.cambridge.redhat.com
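
As a hedged illustration of the reply format (not part of the patch): the
handler below packs one 4-byte IPv4 address per server into the output buffer
and, if fewer than AFS_MAXHOSTS entries were written, appends an all-zero
entry. A userspace caller could walk that buffer roughly as follows.
whereis_parse() is a hypothetical helper name, and the sketch assumes the
addresses are left in network byte order, as copied from
volume->servers[i]->addr.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define AFS_MAXHOSTS 8	/* same value as the patch defines */

/*
 * Copy the server addresses out of a PWhereIs reply buffer.
 * The list ends at the first all-zero entry, after AFS_MAXHOSTS entries,
 * or at the end of the buffer, whichever comes first.
 * Returns the number of addresses stored in hosts[].
 */
static size_t whereis_parse(const void *out, size_t out_size,
			    uint32_t hosts[AFS_MAXHOSTS])
{
	const unsigned char *cp = out;
	size_t n = 0;

	assert(hosts != NULL);
	assert(out != NULL || out_size == 0);

	while (n < AFS_MAXHOSTS && (n + 1) * sizeof(uint32_t) <= out_size) {
		uint32_t addr;

		/* memcpy avoids alignment assumptions about the buffer */
		memcpy(&addr, cp + n * sizeof(addr), sizeof(addr));
		if (addr == 0)
			break;	/* terminating zero entry */
		hosts[n++] = addr;
	}
	return n;
}
```

Each returned value can be placed in a struct in_addr and printed with
inet_ntop(); note that the patch rejects any output buffer smaller than
AFS_MAXHOSTS * 4 bytes with -EINVAL.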

Signed-off-by: Wang Lei <[email protected]>
Signed-off-by: David Howells <[email protected]>
---

fs/afs/pioctl.c | 51 +++++++++++++++++++++++++++++++++++++++++++++++
include/linux/afscall.h | 3 +++
include/linux/venus.h | 1 +
3 files changed, 55 insertions(+), 0 deletions(-)


diff --git a/fs/afs/pioctl.c b/fs/afs/pioctl.c
index 4efd825..5f6beeb 100644
--- a/fs/afs/pioctl.c
+++ b/fs/afs/pioctl.c
@@ -97,6 +97,53 @@ long afs_PGetVolStat(struct dentry *dentry, struct vice_ioctl *arg,
}

/*
+ * Find out where a volume is located
+ */
+long afs_PWhereIs(struct dentry *dentry, struct vice_ioctl *arg,
+ struct key *key)
+{
+ int i;
+ char *cp = arg->out;
+ const size_t addr_size = sizeof(struct in_addr);
+ struct afs_volume *volume = AFS_FS_I(dentry->d_inode)->volume;
+ long ret;
+
+ _enter("");
+
+ if (arg->out_size < AFS_MAXHOSTS * addr_size) {
+ _leave(" = -EINVAL [%d < %zu]", arg->out_size,
+ AFS_MAXHOSTS * addr_size);
+ return -EINVAL;
+ }
+
+ down_read(&volume->server_sem);
+
+ /* handle the no-server case */
+ if (volume->nservers == 0) {
+ ret = volume->rjservers ? -ENOMEDIUM : -ESTALE;
+ up_read(&volume->server_sem);
+ _leave(" = %ld [no servers]", ret);
+ return ret;
+ }
+
+ for (i = 0; i < volume->nservers; i++, cp += addr_size)
+ memcpy(cp, &volume->servers[i]->addr.s_addr, addr_size);
+
+ up_read(&volume->server_sem);
+
+ if (i < AFS_MAXHOSTS) {
+ /* still room for a terminating zero address; add it on the end */
+ memset(cp, 0, addr_size);
+ cp += addr_size;
+ }
+
+ arg->out_size = cp - arg->out;
+
+ _leave(" = 0 [%d]", arg->out_size);
+ return 0;
+}
+
+/*
* The AFS path-based I/O control operation
*/
long afs_pioctl(struct dentry *dentry, int cmd, struct vice_ioctl *arg)
@@ -130,6 +177,10 @@ long afs_pioctl(struct dentry *dentry, int cmd, struct vice_ioctl *arg)
ret = afs_PGetVolStat(dentry, arg, key);
break;

+ case VIOC_COMMAND(PWhereIs):
+ ret = afs_PWhereIs(dentry, arg, key);
+ break;
+
default:
_debug("fallback to pathless: %x", cmd);
ret = afs_pathless_pioctl(cmd, arg);
diff --git a/include/linux/afscall.h b/include/linux/afscall.h
index 6772712..0a60cd1 100644
--- a/include/linux/afscall.h
+++ b/include/linux/afscall.h
@@ -15,9 +15,12 @@

/* pioctl commands */
#define PGetVolStat 4 /* get volume status */
+#define PWhereIs 14 /* find out where a volume is located */
#define PGetFID 22 /* get file ID */
#define PGetFileCell 30 /* get the cell a file inhabits */

+#define AFS_MAXHOSTS 8 /* maximum number of hosts */
+
/*
* AFS volume status record
*/
diff --git a/include/linux/venus.h b/include/linux/venus.h
index 437e7f3..78cbf47 100644
--- a/include/linux/venus.h
+++ b/include/linux/venus.h
@@ -18,6 +18,7 @@
* pioctl commands (not usable as ioctls)
*/
#define VIOCGETVOLSTAT _VICEIOCTL(PGetVolStat)
+#define VIOCWHEREIS _VICEIOCTL(PWhereIs)
#define VIOCGETFID _VICEIOCTL(PGetFID)
#define VIOC_FILE_CELL_NAME _VICEIOCTL(PGetFileCell)

2009-06-16 20:42:16

by David Howells

Subject: [PATCH 02/17] VFS: Implement the AFS system call

From: Jacob Thebault-Spieker <[email protected]>

Implement the AFS system call, supporting just the pioctl() function for now.

Signed-off-by: Jacob Thebault-Spieker <[email protected]>
---

arch/x86/ia32/ia32entry.S | 2 +-
arch/x86/include/asm/unistd_64.h | 2 +-
arch/x86/kernel/syscall_table_32.S | 2 +-
fs/Makefile | 5 ++++-
fs/afs/Kconfig | 12 ++++++++++++
fs/afs/pioctl.c | 1 +
fs/afs_call.c | 33 ++++++++++++++++++++++++++++++++
fs/afs_compat.c | 37 ++++++++++++++++++++++++++++++++++++
include/linux/afscall.h | 16 ++++++++++++++++
include/linux/syscalls.h | 2 ++
10 files changed, 108 insertions(+), 4 deletions(-)
create mode 100644 fs/afs_call.c
create mode 100644 fs/afs_compat.c
create mode 100644 include/linux/afscall.h


diff --git a/arch/x86/ia32/ia32entry.S b/arch/x86/ia32/ia32entry.S
index 5caa7bd..43abb72 100644
--- a/arch/x86/ia32/ia32entry.S
+++ b/arch/x86/ia32/ia32entry.S
@@ -632,7 +632,7 @@ ia32_sys_call_table:
.quad quiet_ni_syscall /* bdflush */
.quad sys_sysfs /* 135 */
.quad sys_personality
- .quad quiet_ni_syscall /* for afs_syscall */
+ .quad compat_sys_afs /* for afs_syscall */
.quad sys_setfsuid16
.quad sys_setfsgid16
.quad sys_llseek /* 140 */
diff --git a/arch/x86/include/asm/unistd_64.h b/arch/x86/include/asm/unistd_64.h
index 495d0fb..5b0a806 100644
--- a/arch/x86/include/asm/unistd_64.h
+++ b/arch/x86/include/asm/unistd_64.h
@@ -424,7 +424,7 @@ __SYSCALL(__NR_putpmsg, sys_ni_syscall)

/* reserved for AFS */
#define __NR_afs_syscall 183
-__SYSCALL(__NR_afs_syscall, sys_ni_syscall)
+__SYSCALL(__NR_afs_syscall, sys_afs)

/* reserved for tux */
#define __NR_tuxcall 184
diff --git a/arch/x86/kernel/syscall_table_32.S b/arch/x86/kernel/syscall_table_32.S
index 723f33e..530c5d0 100644
--- a/arch/x86/kernel/syscall_table_32.S
+++ b/arch/x86/kernel/syscall_table_32.S
@@ -136,7 +136,7 @@ ENTRY(sys_call_table)
.long sys_bdflush
.long sys_sysfs /* 135 */
.long sys_personality
- .long sys_ni_syscall /* reserved for afs_syscall */
+ .long sys_afs /* reserved for afs_syscall */
.long sys_setfsuid16
.long sys_setfsgid16
.long sys_llseek /* 140 */
diff --git a/fs/Makefile b/fs/Makefile
index d5bf38a..a75d3d9 100644
--- a/fs/Makefile
+++ b/fs/Makefile
@@ -64,7 +64,10 @@ obj-y += devpts/

obj-$(CONFIG_PROFILING) += dcookies.o
obj-$(CONFIG_DLM) += dlm/
-
+
+afs-compat-$(CONFIG_COMPAT) += afs_compat.o
+obj-$(CONFIG_AFS_CALL) += afs_call.o $(afs-compat-y)
+
# Do not add any filesystems before this line
obj-$(CONFIG_FSCACHE) += fscache/
obj-$(CONFIG_REISERFS_FS) += reiserfs/
diff --git a/fs/afs/Kconfig b/fs/afs/Kconfig
index 2bd2324..6871ca3 100644
--- a/fs/afs/Kconfig
+++ b/fs/afs/Kconfig
@@ -1,8 +1,15 @@
+config AFS_CALL
+ bool "Enable AFS system call"
+ depends on EXPERIMENTAL
+ help
+ Enable the AFS system call functions. AFS_FS selects this option.
+
config AFS_FS
tristate "Andrew File System support (AFS) (EXPERIMENTAL)"
depends on INET && EXPERIMENTAL
select AF_RXRPC
select PIOCTL
+ select AFS_CALL
help
If you say Y here, you will get an experimental Andrew File System
driver. It currently only supports unsecured read-only AFS access.
@@ -28,3 +35,8 @@ config AFS_FSCACHE
help
Say Y here if you want AFS data to be cached locally on disk through
the generic filesystem cache manager
+
+
+
+
+
diff --git a/fs/afs/pioctl.c b/fs/afs/pioctl.c
index e266f27..5a76017 100644
--- a/fs/afs/pioctl.c
+++ b/fs/afs/pioctl.c
@@ -8,6 +8,7 @@
* 2 of the License, or (at your option) any later version.
*/
#include <linux/fs.h>
+#include <linux/afscall.h>
#include <linux/pioctl.h>
#include "internal.h"

diff --git a/fs/afs_call.c b/fs/afs_call.c
new file mode 100644
index 0000000..5dc28f8
--- /dev/null
+++ b/fs/afs_call.c
@@ -0,0 +1,33 @@
+/* AFS system call multiplexor
+ *
+ * Copyright (C) 2009 David Howells <[email protected]>
+ * Copyright (C) 2008 Jacob Thebault-Spieker <[email protected]>
+ *
+ * Modified by David Howells <[email protected]>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ */
+#include <linux/syscalls.h>
+#include <linux/afscall.h>
+#include <linux/pioctl.h>
+
+/*
+ * The AFS system call entry point
+ */
+SYSCALL_DEFINE5(afs, int, option,
+ unsigned long, arg2, unsigned long, arg3,
+ unsigned long, arg4, unsigned long, arg5)
+{
+ switch (option) {
+ case AFSCALL_PIOCTL:
+ return sys_pioctl((const char __user *) arg2, (int) arg3,
+ (struct ViceIoctl __user *) arg4, (int) arg5);
+
+ default:
+ printk(KERN_NOTICE "Unknown AFS call %x invoked\n", option);
+ return -ENOSYS;
+ }
+}
diff --git a/fs/afs_compat.c b/fs/afs_compat.c
new file mode 100644
index 0000000..0add585
--- /dev/null
+++ b/fs/afs_compat.c
@@ -0,0 +1,37 @@
+/* AFS syscall multiplexor, compatibility
+ *
+ * Copyright (C) 2009 David Howells <[email protected]>
+ * Copyright (C) 2008 Jacob Thebault-Spieker <[email protected]>
+ *
+ * Modified by David Howells <[email protected]>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ */
+#include <linux/syscalls.h>
+#include <linux/afscall.h>
+#include <linux/pioctl.h>
+#include <linux/compat.h>
+
+/*
+ * The AFS system call 32-bit compatibility entry point
+ */
+asmlinkage long compat_sys_afs(int option,
+ unsigned long arg2, unsigned long arg3,
+ unsigned long arg4, unsigned long arg5)
+{
+ switch (option) {
+ case AFSCALL_PIOCTL:
+ return compat_sys_pioctl(
+ (const char __user *) compat_ptr(arg2),
+ (int) arg3,
+ (struct compat_ViceIoctl __user *) compat_ptr(arg4),
+ (int) arg5);
+
+ default:
+ printk(KERN_NOTICE "Unknown AFS call %x invoked\n", option);
+ return -ENOSYS;
+ }
+}
diff --git a/include/linux/afscall.h b/include/linux/afscall.h
new file mode 100644
index 0000000..40cbfa5
--- /dev/null
+++ b/include/linux/afscall.h
@@ -0,0 +1,16 @@
+/* AFS system call multiplexor
+ *
+ * Copyright (C) 2008 Jacob Thebault-Spieker <[email protected]>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ */
+
+#ifndef _LINUX_AFSCALL_H
+#define _LINUX_AFSCALL_H
+
+#define AFSCALL_PIOCTL 0x14
+
+#endif /* _LINUX_AFSCALL_H */
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index ab6f49f..0a8a194 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -764,5 +764,7 @@ asmlinkage long sys_perf_counter_open(

asmlinkage long sys_pioctl(const char __user *filename, int cmd,
struct ViceIoctl __user *args, int nofollow);
+asmlinkage long sys_afs(int option, unsigned long arg2, unsigned long arg3,
+ unsigned long arg4, unsigned long arg5);

#endif

2009-06-16 20:43:30

by David Howells

Subject: [PATCH 06/17] VFS: Define pioctl command wrappers

Define pioctl() command wrappers, equivalent to ioctl() command wrapper _IOW().
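
To illustrate why the handlers mask off the size bits (a sketch only, not part
of the patch): _IOW() encodes sizeof() of the argument struct into the command
number, so the native and compat layouts of the argument block give different
raw values on a 64-bit kernel. Clearing IOCSIZE_MASK leaves only the
direction, type letter and number. The struct and macro names below are
stand-ins for struct ViceIoctl, struct compat_ViceIoctl and the _VICEIOCTL()
wrappers.

```c
#include <assert.h>
#include <stdint.h>
#include <linux/ioctl.h>

/* Stand-in for the native argument block (pointer-sized buffers) */
struct vice_native {
	void *in;
	void *out;
	short in_size;
	short out_size;
};

/* Stand-in for the 32-bit compat argument block */
struct vice_compat32 {
	uint32_t in;
	uint32_t out;
	short in_size;
	short out_size;
};

#define VICE_NATIVE(nr)	_IOW('V', nr, struct vice_native)
#define VICE_COMPAT(nr)	_IOW('V', nr, struct vice_compat32)

/* Strip the encoded argument size, as the switch statements do */
static unsigned int vioc_command(unsigned int cmd)
{
	return cmd & ~(unsigned int)IOCSIZE_MASK;
}

/* Check the property the switch statements rely on */
static void vioc_selfcheck(void)
{
	/* same command number, different struct size: equal after masking */
	assert(vioc_command(VICE_NATIVE(22)) == vioc_command(VICE_COMPAT(22)));
	/* different command numbers stay distinct */
	assert(vioc_command(VICE_NATIVE(22)) != vioc_command(VICE_NATIVE(30)));
}
```

This is why a single case label built from the native wrapper can match
commands issued by both 64-bit and 32-bit callers.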

Signed-off-by: David Howells <[email protected]>
---

fs/afs/pioctl.c | 7 ++++---
include/linux/venus.h | 21 +++++++++++++++++++++
include/linux/vice.h | 35 +++++++++++++++++++++++++++++++++++
3 files changed, 60 insertions(+), 3 deletions(-)
create mode 100644 include/linux/venus.h
create mode 100644 include/linux/vice.h


diff --git a/fs/afs/pioctl.c b/fs/afs/pioctl.c
index 63d2fe1..6cac006 100644
--- a/fs/afs/pioctl.c
+++ b/fs/afs/pioctl.c
@@ -31,7 +31,9 @@ long afs_pioctl(struct dentry *dentry, int cmd, struct vice_ioctl *arg)
return ret;
}

- switch (cmd) {
+#define VIOC_COMMAND(nr) (_VICEIOCTL(nr) & ~IOCSIZE_MASK)
+
+ switch (cmd & ~IOCSIZE_MASK) {
default:
_debug("fallback to pathless: %x", cmd);
ret = afs_pathless_pioctl(cmd, arg);
@@ -57,8 +59,7 @@ long afs_pathless_pioctl(int cmd, struct vice_ioctl *arg)

switch (cmd & ~IOCSIZE_MASK) {
default:
- printk(KERN_DEBUG
- "AFS: Unsupported pioctl command %x\n", cmd);
+ printk(KERN_DEBUG "AFS: Unsupported pioctl command %x\n", cmd);
ret = -EOPNOTSUPP;
break;
}
diff --git a/include/linux/venus.h b/include/linux/venus.h
new file mode 100644
index 0000000..19fe13e
--- /dev/null
+++ b/include/linux/venus.h
@@ -0,0 +1,21 @@
+/* Venus VICE (p)ioctls used by AFS
+ *
+ * Copyright (C) 2009 Red Hat, Inc. All Rights Reserved.
+ * Written by David Howells ([email protected])
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public Licence
+ * as published by the Free Software Foundation; either version
+ * 2 of the Licence, or (at your option) any later version.
+ */
+
+#ifndef _LINUX_VENUS_H
+#define _LINUX_VENUS_H
+
+#include <linux/vice.h>
+
+/*
+ * pioctl commands (not usable as ioctls)
+ */
+
+#endif /* _LINUX_VENUS_H */
diff --git a/include/linux/vice.h b/include/linux/vice.h
new file mode 100644
index 0000000..76080fb
--- /dev/null
+++ b/include/linux/vice.h
@@ -0,0 +1,35 @@
+/* Command wrappers for the Vast Interconnected Computing Environment ioctls
+ * and pioctls
+ *
+ * Copyright (C) 2009 Red Hat, Inc. All Rights Reserved.
+ * Written by David Howells ([email protected])
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public Licence
+ * as published by the Free Software Foundation; either version
+ * 2 of the Licence, or (at your option) any later version.
+ */
+
+#ifndef _LINUX_VICE_H
+#define _LINUX_VICE_H
+
+#include <linux/pioctl.h>
+#include <linux/ioctl.h>
+#include <linux/compat.h>
+
+/*
+ * Wrappers for VICE ioctl/pioctl
+ */
+#define _VICEIOCTL(nr) _IOW('V', nr, struct ViceIoctl)
+#define _CVICEIOCTL(nr) _IOW('C', nr, struct ViceIoctl)
+#define _OVICEIOCTL(nr) _IOW('O', nr, struct ViceIoctl)
+
+#ifdef __KERNEL__
+#ifdef CONFIG_COMPAT
+#define _compat_VICEIOCTL(nr) _IOW('V', nr, struct compat_ViceIoctl)
+#define _compat_CVICEIOCTL(nr) _IOW('C', nr, struct compat_ViceIoctl)
+#define _compat_OVICEIOCTL(nr) _IOW('O', nr, struct compat_ViceIoctl)
+#endif
+#endif
+
+#endif /* _LINUX_VICE_H */

2009-06-16 20:42:31

by David Howells

Subject: [PATCH 13/17] RxRPC: Record extra data in key

Institute key data version 2 to allow the kernel to store the vice_id and the
ticket start time as well as the other fields. Whilst these aren't actually
required for the network protocol, they are required to be returned by the
VIOCGETTOK/PGetTokens pioctl of AFS.
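
As a rough sketch of what a version 2 payload looks like from userspace (not
part of the patch): the blob is the fixed header followed immediately by the
ticket, and rxrpc_instantiate() accepts it only if its total length equals
sizeof(header) + ticket_length. The struct below mirrors
struct rxrpc_key_data_v2 with fixed-width types; key_v2_build() is a
hypothetical helper name.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Userspace mirror of struct rxrpc_key_data_v2 from the patch */
struct key_data_v2 {
	uint32_t kif_version;		/* 2 */
	uint16_t security_index;	/* 2 = rxkad */
	uint16_t ticket_length;
	uint32_t vice_id;
	uint32_t start;			/* time_t */
	uint32_t expiry;		/* time_t */
	uint32_t kvno;
	uint8_t  session_key[8];
	uint8_t  ticket[];
};

/*
 * Allocate and fill a v2 payload.  On success the blob is returned and its
 * total length is stored in *lenp; the caller frees it.  Returns NULL if
 * allocation fails.
 */
static struct key_data_v2 *key_v2_build(uint32_t vice_id, uint32_t start,
					uint32_t expiry, uint32_t kvno,
					const uint8_t session_key[8],
					const void *ticket,
					uint16_t ticket_len, size_t *lenp)
{
	struct key_data_v2 *p;
	size_t len = sizeof(*p) + ticket_len;

	assert(session_key != NULL && lenp != NULL);
	assert(ticket != NULL || ticket_len == 0);

	p = calloc(1, len);
	if (!p)
		return NULL;

	p->kif_version = 2;
	p->security_index = 2;	/* the only index the patch accepts */
	p->ticket_length = ticket_len;
	p->vice_id = vice_id;
	p->start = start;
	p->expiry = expiry;
	p->kvno = kvno;
	memcpy(p->session_key, session_key, 8);
	if (ticket_len)
		memcpy(p->ticket, ticket, ticket_len);

	*lenp = len;
	return p;
}
```

The resulting blob and length are what would be handed to add_key(2) with
key type "rxrpc"; a version 1 blob has the same shape without the vice_id
and start fields.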

Signed-off-by: David Howells <[email protected]>
---

include/keys/rxrpc-type.h | 46 +++++++++++++++++
net/rxrpc/ar-internal.h | 16 ------
net/rxrpc/ar-key.c | 121 ++++++++++++++++++++++++++++++---------------
net/rxrpc/rxkad.c | 1
4 files changed, 129 insertions(+), 55 deletions(-)


diff --git a/include/keys/rxrpc-type.h b/include/keys/rxrpc-type.h
index 7609365..42d6d91 100644
--- a/include/keys/rxrpc-type.h
+++ b/include/keys/rxrpc-type.h
@@ -21,4 +21,50 @@ extern struct key_type key_type_rxrpc;

extern struct key *rxrpc_get_null_key(const char *);

+/*
+ * RxRPC key for Kerberos (type-2 security)
+ */
+struct rxkad_key {
+ u16 security_index; /* RxRPC header security index */
+ u16 ticket_len; /* length of ticket[] */
+ u32 vice_id;
+ u32 start; /* time at which ticket starts */
+ u32 expiry; /* time at which ticket expires */
+ u32 kvno; /* key version number */
+ u8 session_key[8]; /* DES session key */
+ u8 ticket[0]; /* the encrypted ticket */
+};
+
+/*
+ * structure of raw payloads passed to add_key() or instantiate key
+ */
+struct rxrpc_key_data_v1 {
+ u32 kif_version; /* 1 */
+ u16 security_index;
+ u16 ticket_length;
+ u32 expiry; /* time_t */
+ u32 kvno;
+ u8 session_key[8];
+ u8 ticket[0];
+};
+
+struct rxrpc_key_data_v2 {
+ u32 kif_version; /* 2 */
+ u16 security_index;
+ u16 ticket_length;
+ u32 vice_id;
+ u32 start; /* time_t */
+ u32 expiry; /* time_t */
+ u32 kvno;
+ u8 session_key[8];
+ u8 ticket[0];
+};
+
+/*
+ * structure of data attached to rxrpc key struct
+ */
+struct rxrpc_key_payload {
+ struct rxkad_key k;
+};
+
#endif /* _KEYS_RXRPC_TYPE_H */
diff --git a/net/rxrpc/ar-internal.h b/net/rxrpc/ar-internal.h
index 3e7318c..46c6d88 100644
--- a/net/rxrpc/ar-internal.h
+++ b/net/rxrpc/ar-internal.h
@@ -402,22 +402,6 @@ struct rxrpc_call {
};

/*
- * RxRPC key for Kerberos (type-2 security)
- */
-struct rxkad_key {
- u16 security_index; /* RxRPC header security index */
- u16 ticket_len; /* length of ticket[] */
- u32 expiry; /* time at which expires */
- u32 kvno; /* key version number */
- u8 session_key[8]; /* DES session key */
- u8 ticket[0]; /* the encrypted ticket */
-};
-
-struct rxrpc_key_payload {
- struct rxkad_key k;
-};
-
-/*
* locally abort an RxRPC call
*/
static inline void rxrpc_abort_call(struct rxrpc_call *call, u32 abort_code)
diff --git a/net/rxrpc/ar-key.c b/net/rxrpc/ar-key.c
index ad8c7a7..00678c5 100644
--- a/net/rxrpc/ar-key.c
+++ b/net/rxrpc/ar-key.c
@@ -70,10 +70,11 @@ struct key_type key_type_rxrpc_s = {
*/
static int rxrpc_instantiate(struct key *key, const void *data, size_t datalen)
{
- const struct rxkad_key *tsec;
+ const struct rxrpc_key_data_v1 *v1 = data;
+ const struct rxrpc_key_data_v2 *v2 = data;
struct rxrpc_key_payload *upayload;
size_t plen;
- u32 kver;
+ u32 kver, security_index, ticket_len;
int ret;

_enter("{%x},,%zu", key_serial(key), datalen);
@@ -86,48 +87,74 @@ static int rxrpc_instantiate(struct key *key, const void *data, size_t datalen)
ret = -EINVAL;
if (datalen <= 4 || !data)
goto error;
- memcpy(&kver, data, sizeof(kver));
- data += sizeof(kver);
- datalen -= sizeof(kver);
+ kver = v1->kif_version;

_debug("KEY I/F VERSION: %u", kver);

- ret = -EKEYREJECTED;
- if (kver != 1)
+ if (kver == 1) {
+ /* deal with a version 1 data blob */
+ ret = -EINVAL;
+ if (datalen < sizeof(*v1))
+ goto error;
+ security_index = v1->security_index;
+ ticket_len = v1->ticket_length;
+ if (datalen != sizeof(*v1) + ticket_len)
+ goto error;
+
+ _debug("SCIX: %u", security_index);
+ _debug("TLEN: %u", ticket_len);
+ _debug("EXPY: %x", v1->expiry);
+ _debug("KVNO: %u", v1->kvno);
+ _debug("SKEY: %02x%02x%02x%02x%02x%02x%02x%02x",
+ v1->session_key[0], v1->session_key[1],
+ v1->session_key[2], v1->session_key[3],
+ v1->session_key[4], v1->session_key[5],
+ v1->session_key[6], v1->session_key[7]);
+ if (ticket_len >= 8)
+ _debug("TCKT: %02x%02x%02x%02x%02x%02x%02x%02x",
+ v1->ticket[0], v1->ticket[1],
+ v1->ticket[2], v1->ticket[3],
+ v1->ticket[4], v1->ticket[5],
+ v1->ticket[6], v1->ticket[7]);
+ } else if (kver == 2) {
+ /* deal with a version 2 data blob */
+ ret = -EINVAL;
+ if (datalen < sizeof(*v2))
+ goto error;
+ security_index = v2->security_index;
+ ticket_len = v2->ticket_length;
+ if (datalen != sizeof(*v2) + ticket_len)
+ goto error;
+
+ _debug("SCIX: %u", security_index);
+ _debug("TLEN: %u", ticket_len);
+ _debug("VICE: %x", v2->vice_id);
+ _debug("STRT: %x", v2->start);
+ _debug("EXPY: %x", v2->expiry);
+ _debug("KVNO: %u", v2->kvno);
+ _debug("SKEY: %02x%02x%02x%02x%02x%02x%02x%02x",
+ v2->session_key[0], v2->session_key[1],
+ v2->session_key[2], v2->session_key[3],
+ v2->session_key[4], v2->session_key[5],
+ v2->session_key[6], v2->session_key[7]);
+ if (ticket_len >= 8)
+ _debug("TCKT: %02x%02x%02x%02x%02x%02x%02x%02x",
+ v2->ticket[0], v2->ticket[1],
+ v2->ticket[2], v2->ticket[3],
+ v2->ticket[4], v2->ticket[5],
+ v2->ticket[6], v2->ticket[7]);
+ } else {
+ ret = -EKEYREJECTED;
goto error;
-
- /* deal with a version 1 key */
- ret = -EINVAL;
- if (datalen < sizeof(*tsec))
- goto error;
-
- tsec = data;
- if (datalen != sizeof(*tsec) + tsec->ticket_len)
- goto error;
-
- _debug("SCIX: %u", tsec->security_index);
- _debug("TLEN: %u", tsec->ticket_len);
- _debug("EXPY: %x", tsec->expiry);
- _debug("KVNO: %u", tsec->kvno);
- _debug("SKEY: %02x%02x%02x%02x%02x%02x%02x%02x",
- tsec->session_key[0], tsec->session_key[1],
- tsec->session_key[2], tsec->session_key[3],
- tsec->session_key[4], tsec->session_key[5],
- tsec->session_key[6], tsec->session_key[7]);
- if (tsec->ticket_len >= 8)
- _debug("TCKT: %02x%02x%02x%02x%02x%02x%02x%02x",
- tsec->ticket[0], tsec->ticket[1],
- tsec->ticket[2], tsec->ticket[3],
- tsec->ticket[4], tsec->ticket[5],
- tsec->ticket[6], tsec->ticket[7]);
+ }

ret = -EPROTONOSUPPORT;
- if (tsec->security_index != 2)
+ if (security_index != 2)
goto error;

- key->type_data.x[0] = tsec->security_index;
+ key->type_data.x[0] = security_index;

- plen = sizeof(*upayload) + tsec->ticket_len;
+ plen = sizeof(*upayload) + ticket_len;
ret = key_payload_reserve(key, plen);
if (ret < 0)
goto error;
@@ -138,14 +165,30 @@ static int rxrpc_instantiate(struct key *key, const void *data, size_t datalen)
goto error;

/* attach the data */
- memcpy(&upayload->k, tsec, sizeof(*tsec));
- memcpy(&upayload->k.ticket, (void *)tsec + sizeof(*tsec),
- tsec->ticket_len);
+ if (kver == 1) {
+ upayload->k.security_index = security_index;
+ upayload->k.ticket_len = ticket_len;
+ upayload->k.expiry = v1->expiry;
+ upayload->k.kvno = v1->kvno;
+ memcpy(&upayload->k.session_key, &v1->session_key,
+ 8 + ticket_len);
+ } else if (kver == 2) {
+ upayload->k.security_index = security_index;
+ upayload->k.ticket_len = ticket_len;
+ upayload->k.vice_id = v2->vice_id;
+ upayload->k.start = v2->start;
+ upayload->k.expiry = v2->expiry;
+ upayload->k.kvno = v2->kvno;
+ memcpy(&upayload->k.session_key, &v2->session_key,
+ 8 + ticket_len);
+ }
+
key->payload.data = upayload;
- key->expiry = tsec->expiry;
+ key->expiry = upayload->k.expiry;
ret = 0;

error:
+ _leave(" = %d", ret);
return ret;
}

diff --git a/net/rxrpc/rxkad.c b/net/rxrpc/rxkad.c
index ef8f910..d5a677f 100644
--- a/net/rxrpc/rxkad.c
+++ b/net/rxrpc/rxkad.c
@@ -16,6 +16,7 @@
#include <linux/crypto.h>
#include <linux/scatterlist.h>
#include <linux/ctype.h>
+#include <keys/rxrpc-type.h>
#include <net/sock.h>
#include <net/af_rxrpc.h>
#define rxrpc_debug rxkad_debug

2009-06-16 20:43:46

by David Howells

Subject: [PATCH 07/17] AFS: Implement the PGetFid pioctl

From: Jacob Thebault-Spieker <[email protected]>

Implement the PGetFID pioctl for AFS. This will get the FID of a specified
file and return it to userspace.

This can be tested with the OpenAFS userspace tools by doing:

fs getfid /afs

on a mounted AFS filesystem, which should return something like:

File /afs (1.1.0) contained in volume 1

Signed-off-by: Jacob Thebault-Spieker <[email protected]>
Signed-off-by: David Howells <[email protected]>
---

fs/afs/pioctl.c | 30 ++++++++++++++++++++++++++++++
include/linux/afscall.h | 3 +++
include/linux/venus.h | 1 +
3 files changed, 34 insertions(+), 0 deletions(-)


diff --git a/fs/afs/pioctl.c b/fs/afs/pioctl.c
index 6cac006..2e4f741 100644
--- a/fs/afs/pioctl.c
+++ b/fs/afs/pioctl.c
@@ -10,9 +10,35 @@
#include <linux/fs.h>
#include <linux/afscall.h>
#include <linux/pioctl.h>
+#include <linux/venus.h>
#include "internal.h"

/*
+ * Get the AFS file identifier of a file
+ */
+static long afs_PGetFID(struct dentry *dentry, struct vice_ioctl *arg,
+ struct key *key)
+{
+ struct afs_vnode *vnode;
+
+ _enter("");
+
+ vnode = AFS_FS_I(dentry->d_inode);
+
+ if (arg->out_size < sizeof(vnode->fid)) {
+ _leave(" = -EINVAL [%d < %zu]",
+ arg->out_size, sizeof(vnode->fid));
+ return -EINVAL;
+ }
+
+ memcpy(arg->out, &vnode->fid, sizeof(vnode->fid));
+ arg->out_size = sizeof(vnode->fid);
+
+ _leave(" = 0 [%d]", arg->out_size);
+ return 0;
+}
+
+/*
* The AFS path-based I/O control operation
*/
long afs_pioctl(struct dentry *dentry, int cmd, struct vice_ioctl *arg)
@@ -34,6 +60,10 @@ long afs_pioctl(struct dentry *dentry, int cmd, struct vice_ioctl *arg)
#define VIOC_COMMAND(nr) (_VICEIOCTL(nr) & ~IOCSIZE_MASK)

switch (cmd & ~IOCSIZE_MASK) {
+ case VIOC_COMMAND(PGetFID):
+ ret = afs_PGetFID(dentry, arg, key);
+ break;
+
default:
_debug("fallback to pathless: %x", cmd);
ret = afs_pathless_pioctl(cmd, arg);
diff --git a/include/linux/afscall.h b/include/linux/afscall.h
index 40cbfa5..cb006a2 100644
--- a/include/linux/afscall.h
+++ b/include/linux/afscall.h
@@ -13,4 +13,7 @@

#define AFSCALL_PIOCTL 0x14

+/* pioctl commands */
+#define PGetFID 22 /* get file ID */
+
#endif /* _LINUX_AFSCALL_H */
diff --git a/include/linux/venus.h b/include/linux/venus.h
index 19fe13e..ea896e4 100644
--- a/include/linux/venus.h
+++ b/include/linux/venus.h
@@ -17,5 +17,6 @@
/*
* pioctl commands (not usable as ioctls)
*/
+#define VIOCGETFID _VICEIOCTL(PGetFID)

#endif /* _LINUX_VENUS_H */

2009-06-16 20:42:47

by David Howells

Subject: [PATCH 15/17] AFS: Implement the PSetTokens pioctl

Implement the PSetTokens pioctl for AFS. This will submit a security token for
caching.

This can be tested with the OpenAFS userspace tools using the klog program,
which should add a key to the session keyring with something like:

[root@andromeda ~]# echo password | klog -pipe admin
[root@andromeda ~]# keyctl show
Session Keyring
-3 --alswrv 0 0 keyring: _ses
147139749 --alswrv 0 -1 \_ keyring: _uid.0
457362442 --als--v 0 0 \_ rxrpc: cambridge.redhat.com

Note that 'klog -setpag' is not supported by this patch as there's currently
no way for a process to replace its parent process's session keyring.
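
For reference, a hedged sketch of the argument block that afs_PSetTokens()
below decodes, written as a userspace encoder (not part of the patch). The
layout is: a 32-bit ticket length, the ticket, a 32-bit clear-token length,
the clear token, then optionally a 32-bit flag word and a NUL-terminated cell
name. settokens_encode() is a hypothetical helper name, and the sketch assumes
end_timestamp is the last field of struct clear_token, as the decoder's use of
details.end_timestamp suggests.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Mirror of struct clear_token; the final field is an assumption */
struct clear_token {
	uint32_t auth_handle;		/* key version number */
	uint8_t  session_key[8];	/* session encryption key */
	uint32_t vice_id;		/* client/user ID */
	uint32_t begin_timestamp;	/* ticket start time */
	uint32_t end_timestamp;		/* ticket expiry time (assumed) */
};

/* Append n bytes at offset off and return the new offset */
static size_t put(unsigned char *buf, size_t off, const void *src, size_t n)
{
	memcpy(buf + off, src, n);
	return off + n;
}

/*
 * Encode a PSetTokens argument block including the optional flag word and
 * cell name.  Returns the number of bytes written, or 0 if the block does
 * not fit in the buffer or in the 'short' in_size field.
 */
static size_t settokens_encode(unsigned char *buf, size_t bufsize,
			       const void *ticket, uint32_t tktlen,
			       const struct clear_token *ct,
			       uint32_t flag, const char *cellname)
{
	uint32_t ctlen = sizeof(*ct);
	size_t namelen, need, off = 0;

	assert(buf != NULL && ticket != NULL && ct != NULL && cellname != NULL);

	namelen = strlen(cellname) + 1;	/* include the trailing NUL */
	need = sizeof(tktlen) + tktlen + sizeof(ctlen) + ctlen +
	       sizeof(flag) + namelen;
	if (need > bufsize || need > 32767)
		return 0;

	off = put(buf, off, &tktlen, sizeof(tktlen));
	off = put(buf, off, ticket, tktlen);
	off = put(buf, off, &ctlen, sizeof(ctlen));
	off = put(buf, off, ct, sizeof(*ct));
	off = put(buf, off, &flag, sizeof(flag));
	off = put(buf, off, cellname, namelen);
	return off;
}
```

The returned length would go in in_size of the pioctl argument block. The
decoder rejects a flag word with bit 0x8000 set (the "new PAG" request) with
-EACCES, and requires every byte of the cell name before the NUL to be
printable.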

Signed-off-by: David Howells <[email protected]>
---

fs/afs/cell.c | 15 ++++
fs/afs/internal.h | 1
fs/afs/pioctl.c | 177 +++++++++++++++++++++++++++++++++++++++++++++++
include/linux/afscall.h | 14 ++++
include/linux/venus.h | 1
5 files changed, 208 insertions(+), 0 deletions(-)


diff --git a/fs/afs/cell.c b/fs/afs/cell.c
index e19c13f..b900fc7 100644
--- a/fs/afs/cell.c
+++ b/fs/afs/cell.c
@@ -227,6 +227,21 @@ int afs_cell_init(char *rootcell)
}

/*
+ * get a reference to the root cell
+ */
+struct afs_cell *afs_get_root_cell(void)
+{
+ struct afs_cell *cell;
+
+ read_lock(&afs_cells_lock);
+ cell = afs_cell_root;
+ afs_get_cell(cell);
+ read_unlock(&afs_cells_lock);
+
+ return cell;
+}
+
+/*
* lookup a cell record
*/
struct afs_cell *afs_cell_lookup(const char *name, unsigned namesz)
diff --git a/fs/afs/internal.h b/fs/afs/internal.h
index 9a8e8a2..cf08782 100644
--- a/fs/afs/internal.h
+++ b/fs/afs/internal.h
@@ -467,6 +467,7 @@ extern struct list_head afs_proc_cells;

#define afs_get_cell(C) do { atomic_inc(&(C)->usage); } while(0)
extern int afs_cell_init(char *);
+extern struct afs_cell *afs_get_root_cell(void);
extern struct afs_cell *afs_cell_create(const char *, char *);
extern struct afs_cell *afs_cell_lookup(const char *, unsigned);
extern struct afs_cell *afs_grab_cell(struct afs_cell *);
diff --git a/fs/afs/pioctl.c b/fs/afs/pioctl.c
index ffbec0c..e6ea69f 100644
--- a/fs/afs/pioctl.c
+++ b/fs/afs/pioctl.c
@@ -2,6 +2,8 @@
*
* Copyright (C) 2008 Jacob Thebault-Spieker <[email protected]>
*
+ * Modified by David Howells <[email protected]>
+ *
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public License
* as published by the Free Software Foundation; either version
@@ -12,6 +14,10 @@
#include <linux/pioctl.h>
#include <linux/venus.h>
#include <linux/string.h>
+#include <linux/ctype.h>
+#include <linux/key.h>
+#include <linux/keyctl.h>
+#include <keys/rxrpc-type.h>
#include "internal.h"

/*
@@ -219,6 +225,173 @@ long afs_pioctl(struct dentry *dentry, int cmd, struct vice_ioctl *arg)
}

/*
+ * Set a user's rxkad authentication tokens
+ */
+static long afs_PSetTokens(struct vice_ioctl *arg)
+{
+ struct rxrpc_key_data_v2 *payload;
+ struct clear_token details;
+ struct afs_cell *cell;
+ const char *cp;
+ key_ref_t keyring_r, key_r;
+ size_t in_size, loop;
+ void *in, *in_next, *ticket;
+ char *cellname, *keyname, *dp;
+ long ret;
+ u32 tktlen, tmp, flag;
+
+ _enter("");
+
+ /* decode the argument block */
+ in_next = arg->in;
+ in_size = arg->in_size;
+
+#define CHECK(n) \
+ do { \
+ if (in_size < (n)) \
+ goto underflow; \
+ in = in_next; \
+ in_size -= (n); \
+ in_next += (n); \
+ } while(0)
+
+#define DECODE(to) \
+ do { \
+ CHECK(sizeof(*(to))); \
+ memcpy(to, in, sizeof(*(to))); \
+ } while(0)
+
+ DECODE(&tktlen);
+ _debug("tktlen: %u", tktlen);
+ if (tktlen > INT_MAX)
+ goto invalid;
+ CHECK(tktlen);
+ ticket = in;
+ DECODE(&tmp);
+ _debug("clear token %u", tmp);
+ if (tmp != sizeof(struct clear_token))
+ goto invalid;
+ DECODE(&details);
+ _debug("ah:%x vi:%x bts:%x ets:%x (e-b:%u) (CT:%lx)",
+ details.auth_handle, details.vice_id,
+ details.begin_timestamp, details.end_timestamp,
+ details.end_timestamp - details.begin_timestamp,
+ CURRENT_TIME.tv_sec);
+ if (details.vice_id == UINT_MAX)
+ goto invalid;
+ if (details.auth_handle == UINT_MAX)
+ details.auth_handle = 999;
+
+ /* flags and cellname are optional, defaulting to the root cell */
+ _debug("in_size: %zu", in_size);
+ if (in_size != 0) {
+ DECODE(&flag);
+ _debug("flag: %x", flag);
+
+ if (flag & 0x8000) {
+ /* the caller wants us to give our parent a new PAG
+ * - we don't support this currently
+ */
+ _leave(" = -EACCES");
+ return -EACCES;
+ }
+
+ /* remainder is cell name */
+ CHECK(sizeof(char));
+ cellname = in;
+ for (loop = 0; loop < in_size; loop++)
+ if (!isprint(cellname[loop]))
+ goto invalid;
+
+ if (cellname[loop] != '\0')
+ goto invalid;
+ cell = NULL;
+
+ _debug("cellname: %s", cellname);
+ } else {
+ cell = afs_get_root_cell();
+ cellname = cell->name;
+ flag = 1;
+ }
+
+#undef DECODE
+#undef CHECK
+
+ /* construct the key name */
+ ret = -ENOMEM;
+ keyname = kmalloc(4 + strlen(cellname) + 1, GFP_KERNEL);
+ if (!keyname)
+ goto error_nokeyname;
+
+ memcpy(keyname, "afs@", 4);
+ dp = keyname + 4;
+ cp = cellname;
+ while (*cp)
+ *dp++ = toupper(*cp++);
+ *dp = 0;
+
+ /* we install the authentication token as a key */
+ payload = kmalloc(sizeof(*payload) + tktlen, GFP_KERNEL);
+ if (!payload)
+ goto error_nopayload;
+
+ payload->kif_version = 2;
+ payload->security_index = RXRPC_SECURITY_RXKAD;
+ payload->ticket_length = tktlen;
+ payload->vice_id = details.vice_id;
+ payload->start = details.begin_timestamp;
+ payload->expiry = details.end_timestamp;
+ payload->kvno = details.auth_handle;
+ memcpy(payload->session_key, details.session_key, 8);
+ memcpy(payload->ticket, ticket, tktlen);
+
+ /* add the key to the session keyring */
+ keyring_r = lookup_user_key(KEY_SPEC_SESSION_KEYRING, 1, 0,
+ WANT_KEY_WRITE);
+ if (IS_ERR(keyring_r)) {
+ _debug("keyring lookup failed");
+ ret = PTR_ERR(keyring_r);
+ goto error;
+ }
+
+ /* create or update the requested key and add it to the target
+ * keyring */
+ key_r = key_create_or_update(keyring_r, "rxrpc", keyname,
+ payload, sizeof(*payload) + tktlen,
+ KEY_PERM_UNDEF, KEY_ALLOC_IN_QUOTA);
+ key_ref_put(keyring_r);
+
+ if (IS_ERR(key_r)) {
+ _debug("key create failed");
+ ret = PTR_ERR(key_r);
+ goto error;
+ }
+
+ _debug("key serial: %x", key_ref_to_ptr(key_r)->serial);
+
+ key_ref_put(key_r);
+ arg->out_size = 0;
+ ret = 0;
+
+error:
+ kfree(payload);
+error_nopayload:
+ kfree(keyname);
+error_nokeyname:
+ afs_put_cell(cell);
+ _leave(" = %ld", ret);
+ return ret;
+
+underflow:
+ _leave(" = -EINVAL [short arg]");
+ return -EINVAL;
+
+invalid:
+ _leave(" = -EINVAL [invalid arg]");
+ return -EINVAL;
+}
+
+/*
* The AFS pathless pioctl handler
*/
long afs_pathless_pioctl(int cmd, struct vice_ioctl *arg)
@@ -231,6 +404,10 @@ long afs_pathless_pioctl(int cmd, struct vice_ioctl *arg)
#define VIOC_COMMAND(nr) (_VICEIOCTL(nr) & ~IOCSIZE_MASK)

switch (cmd & ~IOCSIZE_MASK) {
+ case VIOC_COMMAND(PSetTokens):
+ ret = afs_PSetTokens(arg);
+ break;
+
default:
printk(KERN_DEBUG "AFS: Unsupported pioctl command %x\n", cmd);
ret = -EOPNOTSUPP;
diff --git a/include/linux/afscall.h b/include/linux/afscall.h
index 00054f0..7635aab 100644
--- a/include/linux/afscall.h
+++ b/include/linux/afscall.h
@@ -2,6 +2,8 @@
*
* Copyright (C) 2008 Jacob Thebault-Spieker <[email protected]>
*
+ * Modified by David Howells <[email protected]>
+ *
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public License
* as published by the Free Software Foundation; either version
@@ -14,6 +16,7 @@
#define AFSCALL_PIOCTL 0x14

/* pioctl commands */
+#define PSetTokens 3 /* set authentication tokens for user */
#define PGetVolStat 4 /* get volume status */
#define PWhereIs 14 /* find out where a volume is located */
#define PGetFID 22 /* get file ID */
@@ -40,4 +43,15 @@ struct VolumeStatus {
int PartMaxBlocks; /* size of volume's partition */
};

+/*
+ * User details when getting or submitting a token
+ */
+struct clear_token {
+ u32 auth_handle; /* key version number */
+ u8 session_key[8]; /* session encryption key */
+ u32 vice_id; /* client/user ID */
+ u32 begin_timestamp; /* time_t at which ticket starts */
+ u32 end_timestamp; /* time_t at which ticket expires */
+};
+
#endif /* _LINUX_AFSCALL_H */
diff --git a/include/linux/venus.h b/include/linux/venus.h
index ea8e468..b90e5f2 100644
--- a/include/linux/venus.h
+++ b/include/linux/venus.h
@@ -17,6 +17,7 @@
/*
* pioctl commands (not usable as ioctls)
*/
+#define VIOCSETTOK _VICEIOCTL(PSetTokens)
#define VIOCGETVOLSTAT _VICEIOCTL(PGetVolStat)
#define VIOCWHEREIS _VICEIOCTL(PWhereIs)
#define VIOCGETFID _VICEIOCTL(PGetFID)

2009-06-16 20:43:05

by David Howells

Subject: [PATCH 04/17] AFS: Add key request for pioctl

From: Wang Lei <[email protected]>

afs_pioctl() should get the security key applicable to the nominated file and
pass it on to the command handlers.

Signed-off-by: Wang Lei <[email protected]>
Signed-off-by: David Howells <[email protected]>
---

fs/afs/pioctl.c | 21 ++++++++++++++++++++-
1 files changed, 20 insertions(+), 1 deletions(-)


diff --git a/fs/afs/pioctl.c b/fs/afs/pioctl.c
index 5a76017..63a6211 100644
--- a/fs/afs/pioctl.c
+++ b/fs/afs/pioctl.c
@@ -17,9 +17,28 @@
*/
long afs_pioctl(struct dentry *dentry, int cmd, struct vice_ioctl *arg)
{
+ struct afs_vnode *vnode = AFS_FS_I(dentry->d_inode);
+ struct key *key;
+ long ret;
+
+ _enter(",%x(%d),{%d,%d}",
+ cmd, _IOC_NR(cmd), arg->in_size, arg->out_size);
+
+ key = afs_request_key(vnode->volume->cell);
+ if (IS_ERR(key)) {
+ ret = PTR_ERR(key);
+ _leave(" = %ld [no key]", ret);
+ return ret;
+ }
+
switch (cmd) {
default:
printk(KERN_DEBUG "AFS: Unsupported pioctl command %x\n", cmd);
- return -EOPNOTSUPP;
+ ret = -EOPNOTSUPP;
+ break;
}
+
+ key_put(key);
+ _leave(" = %ld", ret);
+ return ret;
}

2009-06-16 20:44:05

by David Howells

Subject: [PATCH 16/17] KEYS: Add a function by which the contents of a keyring can be enumerated

Add a function by which the contents of a keyring can be enumerated.

This allows AFS's VIOCGETTOK/PGetTokens pioctl to list the AFS RxRPC keys on
behalf of userspace.

The following text is added to Documentation/keys.txt:

(*) The contents of a keyring may be enumerated by the following function:

typedef bool (*keyring_enum_filter_t)(const struct key *key,
void *filter_data);
key_ref_t keyring_enum(key_ref_t keyring_ref,
unsigned skip,
keyring_enum_filter_t filter,
void *filter_data,
key_perm_t perm)

This scans the keyring in question for keys for which the caller has the
specified permissions and that match the filter provided. It returns a
reference to the first of those keys, after the specified quantity of them
have been skipped. If no key is found error ENOKEY will be returned.

If the keyring is invalid or unsearchable, error ENOTDIR or EACCES will be
returned.

The filter function should return true if the key it is passed is a match,
and false if it is not. The filter_data passed to keyring_enum() will be
passed on to the filter function.
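
The skip-and-filter semantics described above can be modelled in userspace
roughly as follows. This is only a sketch: struct toy_key, toy_enum() and
match_afs() are hypothetical stand-ins rather than kernel API, and the
permission check is reduced to a bitmask comparison.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct key; not the kernel structure. */
struct toy_key {
	const char *type;	/* e.g. "rxrpc" */
	const char *desc;	/* e.g. "afs@A.EXAMPLE" */
	unsigned perm;		/* permission bits the caller holds on this key */
	bool revoked;
};

typedef bool (*toy_filter_t)(const struct toy_key *key, void *filter_data);

/*
 * Return the (skip+1)th key that passes the filter, grants all of the
 * requested permission bits and is not revoked.  NULL models -ENOKEY.
 */
static const struct toy_key *toy_enum(const struct toy_key *keys, size_t nkeys,
				      unsigned skip, toy_filter_t filter,
				      void *filter_data, unsigned perm)
{
	assert(keys != NULL && filter != NULL);

	for (size_t i = 0; i < nkeys; i++) {
		const struct toy_key *key = &keys[i];

		if (!filter(key, filter_data) ||
		    (key->perm & perm) != perm ||
		    key->revoked)
			continue;
		if (skip == 0)
			return key;
		skip--;
	}
	return NULL;
}

/* Example filter: match on key type and on an "afs@" description prefix. */
static bool match_afs(const struct toy_key *key, void *filter_data)
{
	const char *type = filter_data;

	return strcmp(key->type, type) == 0 &&
	       strncmp(key->desc, "afs@", 4) == 0;
}
```

A caller that wants to walk all matching keys would call toy_enum() with
skip = 0, 1, 2, ... until it gets NULL back. Keys that fail the filter, lack
the permission or are revoked do not use up a skip count.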

Signed-off-by: David Howells <[email protected]>
---

Documentation/keys.txt | 23 +++++++++++++++++
include/linux/key.h | 5 ++++
security/keys/keyring.c | 65 +++++++++++++++++++++++++++++++++++++++++++++++
3 files changed, 93 insertions(+), 0 deletions(-)


diff --git a/Documentation/keys.txt b/Documentation/keys.txt
index 35618d1..77bbe07 100644
--- a/Documentation/keys.txt
+++ b/Documentation/keys.txt
@@ -947,6 +947,29 @@ payload contents" for more information.
reference pointer if successful.


+(*) The contents of a keyring may be enumerated by the following function:
+
+ typedef bool (*keyring_enum_filter_t)(const struct key *key,
+ void *filter_data);
+ key_ref_t keyring_enum(key_ref_t keyring_ref,
+ unsigned skip,
+ keyring_enum_filter_t filter,
+ void *filter_data,
+ key_perm_t perm)
+
+ This scans the keyring in question for keys for which the caller has the
+ specified permissions and that match the filter provided. It returns a
+ reference to the first of those keys, after the specified quantity of them
+ have been skipped. If no key is found error ENOKEY will be returned.
+
+ If the keyring is invalid or unsearchable, error ENOTDIR or EACCES will be
+ returned.
+
+ The filter function should return true if the key it is passed is a match,
+ and false if it is not. The filter_data passed to keyring_enum() will be
+ passed on to the filter function.
+
+
(*) To check the validity of a key, this function can be called:

int validate_key(struct key *key);
diff --git a/include/linux/key.h b/include/linux/key.h
index 4d8cc1e..6d41a4e 100644
--- a/include/linux/key.h
+++ b/include/linux/key.h
@@ -273,6 +273,11 @@ extern key_ref_t keyring_search(key_ref_t keyring,
extern int keyring_add_key(struct key *keyring,
struct key *key);

+typedef bool (*keyring_enum_filter_t)(const struct key *key, void *data);
+extern key_ref_t keyring_enum(key_ref_t keyring_r, unsigned skip,
+ keyring_enum_filter_t filter, void *filter_data,
+ key_perm_t perm);
+
extern struct key *key_lookup(key_serial_t id);

static inline key_serial_t key_serial(struct key *key)
diff --git a/security/keys/keyring.c b/security/keys/keyring.c
index 97529ab..c83ab26 100644
--- a/security/keys/keyring.c
+++ b/security/keys/keyring.c
@@ -1000,3 +1000,68 @@ static void keyring_revoke(struct key *keyring)
}

} /* end keyring_revoke() */
+
+/**
+ * keyring_enum - Allow enumeration of a keyring
+ * @keyring_ref: The keyring to search
+ * @skip: Number of matching keys to skip
+ * @filter: A function to filter out unwanted matches
+ * @filter_data: Data to be passed to filter()
+ * @perm: The permissions desired on the key
+ *
+ * Allow the caller to enumerate a keyring by getting the (skip+1)'th
+ * permissible key that matched a particular filter.
+ *
+ * The caller must lock the keyring if they don't want the contents to change
+ * between calls.
+ */
+key_ref_t keyring_enum(key_ref_t keyring_ref, unsigned skip,
+ keyring_enum_filter_t filter, void *filter_data,
+ key_perm_t perm)
+{
+ struct keyring_list *klist;
+ unsigned long possessed;
+ struct key *keyring, *key;
+ long ret;
+ int loop;
+
+ keyring = key_ref_to_ptr(keyring_ref);
+ key_check(keyring);
+
+ /* top keyring must have search permission to begin the search */
+ ret = key_permission(keyring_ref, WANT_KEY_SEARCH);
+ if (ret < 0)
+ return ERR_PTR(ret);
+
+ if (keyring->type != &key_type_keyring)
+ return ERR_PTR(-ENOTDIR);
+
+ possessed = is_key_possessed(keyring_ref);
+
+ rcu_read_lock();
+
+ klist = rcu_dereference(keyring->payload.subscriptions);
+ if (klist) {
+ for (loop = 0; loop < klist->nkeys; loop++) {
+ key = klist->keys[loop];
+
+ if (!filter(key, filter_data) ||
+ key_permission(make_key_ref(key, possessed),
+ perm) != 0 ||
+ test_bit(KEY_FLAG_REVOKED, &key->flags))
+ continue;
+ if (skip == 0)
+ goto found;
+ skip--;
+ }
+ }
+
+ rcu_read_unlock();
+ return ERR_PTR(-ENOKEY);
+
+ found:
+ atomic_inc(&key->usage);
+ rcu_read_unlock();
+ return make_key_ref(key, possessed);
+}
+EXPORT_SYMBOL(keyring_enum);

2009-06-16 20:44:27

by David Howells

Subject: [PATCH 17/17] AFS: Implement the PGetTokens pioctl

Implement the PGetTokens pioctl for AFS. This returns the rxkad (security
index 2) tokens cached for a user.

This can be tested with the OpenAFS userspace tools by doing:

tokens

which should return something like:

[root@andromeda ~]# echo password | klog admin -pipe
[root@andromeda ~]# keyctl show
Session Keyring
-3 --alswrv 0 0 keyring: _ses
237984081 --alswrv 0 -1 \_ keyring: _uid.0
978861311 --als--v 0 0 \_ rxrpc: cambridge.redhat.com
[root@andromeda ~]# tokens

Tokens held by the Cache Manager:

User's (AFS ID 10143) tokens for [email protected] [Expires Jun 16 16:10]
--End of list--
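
As a rough illustration of the reply layout that afs_PGetTokens() in this
patch assembles with its CHECK()/ENCODE() macros, here is a userspace sketch.
struct out_buf, reserve(), put_u32() and encode_token_reply() are hypothetical
names, and the sketch assumes, following the "at least 56 bytes of space"
comment in the patch, that short tickets are zero-padded up to 56 bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MIN_TICKET_SPACE 56U	/* from the comment in afs_PGetTokens() */

/* Mirrors struct clear_token from this series, using fixed-width types. */
struct clear_token {
	uint32_t auth_handle;		/* key version number */
	uint8_t  session_key[8];	/* session encryption key */
	uint32_t vice_id;		/* client/user ID */
	uint32_t begin_timestamp;	/* time at which ticket starts */
	uint32_t end_timestamp;		/* time at which ticket expires */
};

/* Hypothetical bounded output cursor, standing in for CHECK()/ENCODE(). */
struct out_buf {
	uint8_t *next;
	size_t left;
};

/* Reserve n bytes, or return NULL if the caller's buffer is too small. */
static uint8_t *reserve(struct out_buf *ob, size_t n)
{
	uint8_t *p = ob->next;

	if (ob->left < n)
		return NULL;
	ob->next += n;
	ob->left -= n;
	return p;
}

static int put_u32(struct out_buf *ob, uint32_t v)
{
	uint8_t *p = reserve(ob, sizeof(v));

	if (!p)
		return -1;
	memcpy(p, &v, sizeof(v));
	return 0;
}

/*
 * Encode: ticket space length, ticket (zero-padded), sizeof(details), details
 * and, when cellname is non-NULL, a primary-cell flag plus the NUL-terminated
 * cell name.  Returns the bytes written, or -1 if out is too small.
 */
static long encode_token_reply(uint8_t *out, size_t out_size,
			       const uint8_t *ticket, uint32_t ticket_len,
			       const struct clear_token *details,
			       const char *cellname, uint32_t primary)
{
	struct out_buf ob = { out, out_size };
	uint32_t space = ticket_len > MIN_TICKET_SPACE ?
		ticket_len : MIN_TICKET_SPACE;
	uint8_t *p;

	assert(out && ticket && details);

	if (put_u32(&ob, space) < 0 || !(p = reserve(&ob, space)))
		return -1;
	memcpy(p, ticket, ticket_len);
	memset(p + ticket_len, 0, space - ticket_len);

	if (put_u32(&ob, sizeof(*details)) < 0 ||
	    !(p = reserve(&ob, sizeof(*details))))
		return -1;
	memcpy(p, details, sizeof(*details));

	if (cellname) {
		size_t n = strlen(cellname) + 1;

		if (put_u32(&ob, primary) < 0 || !(p = reserve(&ob, n)))
			return -1;
		memcpy(p, cellname, n);
	}
	return (long)(ob.next - out);
}
```

Every write goes through reserve(), so a too-small output buffer yields an
error instead of an overrun. The space reserved for the ticket is never
smaller than the number of ticket bytes copied into it.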

Signed-off-by: David Howells <[email protected]>
---

fs/afs/pioctl.c | 174 +++++++++++++++++++++++++++++++++++++++++++++++
include/linux/afscall.h | 1
include/linux/venus.h | 1
3 files changed, 176 insertions(+), 0 deletions(-)


diff --git a/fs/afs/pioctl.c b/fs/afs/pioctl.c
index e6ea69f..d097745 100644
--- a/fs/afs/pioctl.c
+++ b/fs/afs/pioctl.c
@@ -392,6 +392,175 @@ invalid:
}

/*
+ * filter tickets by security index when enumerating
+ */
+bool afs_enum_filter(const struct key *key, void *data)
+{
+ unsigned long security_index = (unsigned long) data;
+
+ return key->type == &key_type_rxrpc &&
+ key->type_data.x[0] == security_index &&
+ memcmp(key->description, "afs@", 4) == 0;
+}
+
+/*
+ * Get a user's authentication rxkad tokens
+ */
+static long afs_PGetTokens(struct vice_ioctl *arg)
+{
+ struct rxrpc_key_payload *upayload;
+ struct clear_token details;
+ struct afs_cell *root_cell;
+ struct key *key;
+ key_ref_t keyring_r, key_r = NULL;
+ size_t out_size;
+ void *out, *out_next;
+ long ret;
+ int skip;
+ u32 tmp, ticket_len;
+
+ _enter("");
+
+ /* find out which key we're being asked for */
+ if (arg->in_size == 0) {
+ skip = -1;
+ } else if (arg->in_size == sizeof(skip)) {
+ memcpy(&skip, arg->in, sizeof(skip));
+ if (skip < 0) {
+ _leave(" = -EINVAL [inval iter]");
+ return -EINVAL;
+ }
+ } else {
+ _leave(" = -EINVAL [bad arg size]");
+ return -EINVAL;
+ }
+
+ _debug("iterator: %d", skip);
+
+ /* we're going to look through the session keyring */
+ keyring_r = lookup_user_key(KEY_SPEC_SESSION_KEYRING, 1, 0,
+ WANT_KEY_SEARCH);
+ if (IS_ERR(keyring_r)) {
+ _leave(" = %ld [keyring]", PTR_ERR(keyring_r));
+ return PTR_ERR(keyring_r);
+ }
+
+ root_cell = afs_get_root_cell();
+ ASSERT(root_cell != NULL);
+
+ /* if there's no input argument, then we return the tokens for the root
+ * cell; if there is an argument, then we're being asked for the nth
+ * key belonging to this session */
+ if (skip >= 0) {
+ key_r = keyring_enum(keyring_r, skip,
+ afs_enum_filter,
+ (void *) RXRPC_SECURITY_RXKAD,
+ WANT_KEY_SEARCH);
+ if (key_r == ERR_PTR(-ENOKEY)) {
+ ret = -EDOM;
+ goto error_no_key;
+ }
+ } else {
+ /* find key for the root cell */
+ key_r = keyring_search(keyring_r, &key_type_rxrpc,
+ root_cell->anonymous_key->description);
+ if (key_r == ERR_PTR(-ENOKEY)) {
+ ret = -ENOTCONN;
+ goto error_no_key;
+ }
+ }
+
+ key_ref_put(keyring_r);
+ keyring_r = NULL;
+
+ if (IS_ERR(key_r)) {
+ if (key_r == ERR_PTR(-EACCES))
+ ret = -EACCES;
+ else
+ ret = -EIO;
+ goto error_no_key;
+ }
+
+ key = key_ref_to_ptr(key_r);
+ upayload = key->payload.data;
+
+ _debug("key serial: %x", key->serial);
+
+ /* pass the contents of the key back to userspace */
+#define CHECK(n) \
+ do { \
+ if (out_size < (n)) { \
+ ret = -EINVAL; \
+ goto error; \
+ } \
+ out = out_next; \
+ out_size -= (n); \
+ out_next += (n); \
+ } while(0)
+
+#define ENCODE(from) \
+ do { \
+ CHECK(sizeof(*(from))); \
+ memcpy(out, from, sizeof(*(from))); \
+ } while(0)
+
+ out_next = arg->out;
+ out_size = arg->out_size;
+
+ /* pass the ticket in at least 56 bytes of space */
+ ticket_len = upayload->k.ticket_len;
+ tmp = max(ticket_len, 56U);
+ ENCODE(&tmp);
+ CHECK(tmp);
+ memcpy(out, upayload->k.ticket, ticket_len);
+ if (ticket_len < tmp)
+ memset(out + ticket_len, 0, tmp - ticket_len);
+
+ tmp = sizeof(details);
+ ENCODE(&tmp);
+ details.vice_id = upayload->k.vice_id;
+ details.begin_timestamp = upayload->k.start;
+ details.end_timestamp = upayload->k.expiry;
+ details.auth_handle = upayload->k.kvno;
+ memcpy(details.session_key, upayload->k.session_key, 8);
+ ENCODE(&details);
+
+ /* if we were given an iterator, then there's more stuff we must
+ * return */
+ if (arg->in_size > 0) {
+ struct afs_cell *cell;
+ size_t cellname_size;
+
+ cellname_size = strlen(key->description + 4);
+ cell = afs_cell_lookup(key->description + 4, cellname_size);
+ tmp = (cell == root_cell) ? 1 : 0;
+ if (!IS_ERR(cell))
+ afs_put_cell(cell);
+ ENCODE(&tmp);
+ CHECK(cellname_size + 1);
+ memcpy(out, key->description + 4, cellname_size + 1);
+ }
+
+#undef ENCODE
+#undef CHECK
+
+ arg->out_size = (char *) out_next - arg->out;
+
+ key_ref_put(key_r);
+ afs_put_cell(root_cell);
+ _leave(" = 0");
+ return 0;
+
+error:
+ key_ref_put(key_r);
+error_no_key:
+ key_ref_put(keyring_r);
+ afs_put_cell(root_cell);
+ _leave(" = %ld", ret);
+ return ret;
+}
+
+/*
* The AFS pathless pioctl handler
*/
long afs_pathless_pioctl(int cmd, struct vice_ioctl *arg)
@@ -408,6 +577,11 @@ long afs_pathless_pioctl(int cmd, struct vice_ioctl *arg)
ret = afs_PSetTokens(arg);
break;

+
+ case VIOC_COMMAND(PGetTokens):
+ ret = afs_PGetTokens(arg);
+ break;
+
default:
printk(KERN_DEBUG "AFS: Unsupported pioctl command %x\n", cmd);
ret = -EOPNOTSUPP;
diff --git a/include/linux/afscall.h b/include/linux/afscall.h
index 7635aab..bdff9a0 100644
--- a/include/linux/afscall.h
+++ b/include/linux/afscall.h
@@ -18,6 +18,7 @@
/* pioctl commands */
#define PSetTokens 3 /* set authentication tokens for user */
#define PGetVolStat 4 /* get volume status */
+#define PGetTokens 8 /* get authentication tokens for user */
#define PWhereIs 14 /* find out where a volume is located */
#define PGetFID 22 /* get file ID */
#define PFlushCB 25 /* flush callback only */
diff --git a/include/linux/venus.h b/include/linux/venus.h
index b90e5f2..7a3ae08 100644
--- a/include/linux/venus.h
+++ b/include/linux/venus.h
@@ -19,6 +19,7 @@
*/
#define VIOCSETTOK _VICEIOCTL(PSetTokens)
#define VIOCGETVOLSTAT _VICEIOCTL(PGetVolStat)
+#define VIOCGETTOK _VICEIOCTL(PGetTokens)
#define VIOCWHEREIS _VICEIOCTL(PWhereIs)
#define VIOCGETFID _VICEIOCTL(PGetFID)
#define VIOCFLUSHCB _VICEIOCTL(PFlushCB)

2009-06-16 20:44:57

by David Howells

Subject: [PATCH 11/17] AFS: Implement the PFlushCB pioctl

From: Wang Lei <[email protected]>

Implement the PFlushCB pioctl for AFS. This flushes the callback of the
specified file, indicating to the server we're no longer interested in
notifications of changes to that file.

Signed-off-by: Wang Lei <[email protected]>
Signed-off-by: David Howells <[email protected]>
---

fs/afs/pioctl.c | 26 ++++++++++++++++++++++++++
include/linux/afscall.h | 1 +
include/linux/venus.h | 1 +
3 files changed, 28 insertions(+), 0 deletions(-)


diff --git a/fs/afs/pioctl.c b/fs/afs/pioctl.c
index 5f6beeb..ffbec0c 100644
--- a/fs/afs/pioctl.c
+++ b/fs/afs/pioctl.c
@@ -144,6 +144,28 @@ long afs_PWhereIs(struct dentry *dentry, struct vice_ioctl *arg,
}

/*
+ * Flush callback only
+ */
+long afs_PFlushCB(struct dentry *dentry, struct vice_ioctl *arg,
+ struct key *key)
+{
+ struct afs_vnode *vnode = AFS_FS_I(dentry->d_inode);
+ struct afs_volume *volume = vnode->volume;
+
+ _enter("");
+
+ /* file servers do not grant callbacks on files from read-only
+ * volumes */
+ if (volume->type != AFSVL_ROVOL && vnode->cb_promised) {
+ afs_give_up_callback(vnode);
+ afs_flush_callback_breaks(vnode->server);
+ }
+
+ _leave(" = 0");
+ return 0;
+}
+
+/*
* The AFS path-based I/O control operation
*/
long afs_pioctl(struct dentry *dentry, int cmd, struct vice_ioctl *arg)
@@ -181,6 +203,10 @@ long afs_pioctl(struct dentry *dentry, int cmd, struct vice_ioctl *arg)
ret = afs_PWhereIs(dentry, arg, key);
break;

+ case VIOC_COMMAND(PFlushCB):
+ ret = afs_PFlushCB(dentry, arg, key);
+ break;
+
default:
_debug("fallback to pathless: %x", cmd);
ret = afs_pathless_pioctl(cmd, arg);
diff --git a/include/linux/afscall.h b/include/linux/afscall.h
index 0a60cd1..00054f0 100644
--- a/include/linux/afscall.h
+++ b/include/linux/afscall.h
@@ -17,6 +17,7 @@
#define PGetVolStat 4 /* get volume status */
#define PWhereIs 14 /* find out where a volume is located */
#define PGetFID 22 /* get file ID */
+#define PFlushCB 25 /* flush callback only */
#define PGetFileCell 30 /* get the cell a file inhabits */

#define AFS_MAXHOSTS 8 /* the maximum of hosts number */
diff --git a/include/linux/venus.h b/include/linux/venus.h
index 78cbf47..ea8e468 100644
--- a/include/linux/venus.h
+++ b/include/linux/venus.h
@@ -20,6 +20,7 @@
#define VIOCGETVOLSTAT _VICEIOCTL(PGetVolStat)
#define VIOCWHEREIS _VICEIOCTL(PWhereIs)
#define VIOCGETFID _VICEIOCTL(PGetFID)
+#define VIOCFLUSHCB _VICEIOCTL(PFlushCB)
#define VIOC_FILE_CELL_NAME _VICEIOCTL(PGetFileCell)

#endif /* _LINUX_VENUS_H */

2009-06-16 20:45:19

by David Howells

Subject: [PATCH 12/17] KEYS: Export lookup_user_key() and the key permission request flags

Export lookup_user_key() and the key permission request flags so that the token
handling pioctls of kAFS can make use of them.

This requires that the key permission request flags also be renamed from
KEY_xxx to WANT_KEY_xxx to avoid collision with keyboard-related symbols.

This allows AFS's VIOCSETTOK/PSetTokens and similar to access and manipulate
the calling process's session keyring.

The following text is added to Documentation/keys.txt:

(*) For code that manipulates keys and keyrings on behalf of userspace (such
as keyctl functions), the following function is available:

key_ref_t lookup_user_key(key_serial_t id,
int create,
int partial,
key_perm_t perm)

This looks up a key or keyring by serial ID, or may take a KEY_SPEC_
constant instead as the ID [see above]. It may be asked to create special
keyrings if they're asked for, but don't already exist (such as the
per-thread keyring), and may be asked to look up partially created keys for
the purpose of instantiation.

The key requested must have the specified permission available, where perm
is one of:

WANT_KEY_VIEW - Require permission to view attributes
WANT_KEY_READ - Require permission to read content
WANT_KEY_WRITE - Require permission to update / modify
WANT_KEY_SEARCH - Require permission to search (keyring) or find (key)
WANT_KEY_LINK - Require permission to link
WANT_KEY_SETATTR - Require permission to change attributes
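
The WANT_KEY_* values are request bits that get compared against the
permissions a key grants. The sketch below is modelled on the check in
security/keys/permission.c touched by this patch (mask with WANT_KEY_ALL, then
require every requested bit to be present). toy_key_permission() is a
hypothetical userspace helper, and 'granted' is assumed to be an
already-computed permission mask for the caller.

```c
#include <assert.h>
#include <errno.h>

/* Values as defined in include/linux/key.h by this patch. */
#define WANT_KEY_VIEW		0x01
#define WANT_KEY_READ		0x02
#define WANT_KEY_WRITE		0x04
#define WANT_KEY_SEARCH		0x08
#define WANT_KEY_LINK		0x10
#define WANT_KEY_SETATTR	0x20
#define WANT_KEY_ALL		0x3f

/*
 * Hypothetical model: 'granted' is the set of permission bits the key offers
 * the caller; 'wanted' is a mask of WANT_KEY_* bits.  Every wanted bit must
 * be granted, otherwise the request is refused with -EACCES.
 */
static int toy_key_permission(unsigned granted, unsigned wanted)
{
	assert((wanted & ~WANT_KEY_ALL) == 0);

	if ((granted & wanted & WANT_KEY_ALL) != wanted)
		return -EACCES;
	return 0;
}
```

Because the comparison is against the full requested mask, asking for two
permissions at once fails if either one is missing.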


Signed-off-by: David Howells <[email protected]>
---

Documentation/keys.txt | 25 +++++++++++++++++++++++++
include/linux/key.h | 12 ++++++++++++
security/keys/internal.h | 12 ------------
security/keys/key.c | 6 +++---
security/keys/keyctl.c | 38 +++++++++++++++++++-------------------
security/keys/keyring.c | 8 ++++----
security/keys/permission.c | 2 +-
security/keys/proc.c | 2 +-
security/keys/process_keys.c | 2 ++
9 files changed, 67 insertions(+), 40 deletions(-)


diff --git a/Documentation/keys.txt b/Documentation/keys.txt
index b56aacc..35618d1 100644
--- a/Documentation/keys.txt
+++ b/Documentation/keys.txt
@@ -877,6 +877,31 @@ payload contents" for more information.
case error ERESTARTSYS will be returned.


+(*) For code that manipulates keys and keyrings on behalf of userspace (such
+ as keyctl functions), the following function is available:
+
+ key_ref_t lookup_user_key(key_serial_t id,
+ int create,
+ int partial,
+ key_perm_t perm)
+
+ This looks up a key or keyring by serial ID, or may take a KEY_SPEC_
+ constant instead as the ID [see above]. It may be asked to create special
+ keyrings if they're asked for, but don't already exist (such as the
+ per-thread keyring), and may be asked to look up partially created keys
+ for the purpose of instantiation.
+
+ The key requested must have the specified permission available, where perm
+ is one of:
+
+ WANT_KEY_VIEW - Require permission to view attributes
+ WANT_KEY_READ - Require permission to read content
+ WANT_KEY_WRITE - Require permission to update / modify
+ WANT_KEY_SEARCH - Require permission to search (keyring) or find (key)
+ WANT_KEY_LINK - Require permission to link
+ WANT_KEY_SETATTR - Require permission to change attributes
+
+
(*) When it is no longer required, the key should be released using:

void key_put(struct key *key);
diff --git a/include/linux/key.h b/include/linux/key.h
index e544f46..4d8cc1e 100644
--- a/include/linux/key.h
+++ b/include/linux/key.h
@@ -71,6 +71,15 @@ struct key;

#define KEY_PERM_UNDEF 0xffffffff

+/* required permissions */
+#define WANT_KEY_VIEW 0x01 /* require permission to view attributes */
+#define WANT_KEY_READ 0x02 /* require permission to read content */
+#define WANT_KEY_WRITE 0x04 /* require permission to update / modify */
+#define WANT_KEY_SEARCH 0x08 /* require permission to search (keyring) or find (key) */
+#define WANT_KEY_LINK 0x10 /* require permission to link */
+#define WANT_KEY_SETATTR 0x20 /* require permission to change attributes */
+#define WANT_KEY_ALL 0x3f /* all the above permissions */
+
struct seq_file;
struct user_struct;
struct signal_struct;
@@ -275,6 +284,9 @@ static inline key_serial_t key_serial(struct key *key)
extern ctl_table key_sysctls[];
#endif

+extern key_ref_t lookup_user_key(key_serial_t id, int create, int partial,
+ key_perm_t perm);
+
/*
* the userspace interface
*/
diff --git a/security/keys/internal.h b/security/keys/internal.h
index 9fb679c..7baf655 100644
--- a/security/keys/internal.h
+++ b/security/keys/internal.h
@@ -124,9 +124,6 @@ extern struct key *request_key_and_link(struct key_type *type,
struct key *dest_keyring,
unsigned long flags);

-extern key_ref_t lookup_user_key(key_serial_t id, int create, int partial,
- key_perm_t perm);
-
extern long join_session_keyring(const char *name);

/*
@@ -141,15 +138,6 @@ static inline int key_permission(const key_ref_t key_ref, key_perm_t perm)
return key_task_permission(key_ref, current_cred(), perm);
}

-/* required permissions */
-#define KEY_VIEW 0x01 /* require permission to view attributes */
-#define KEY_READ 0x02 /* require permission to read content */
-#define KEY_WRITE 0x04 /* require permission to update / modify */
-#define KEY_SEARCH 0x08 /* require permission to search (keyring) or find (key) */
-#define KEY_LINK 0x10 /* require permission to link */
-#define KEY_SETATTR 0x20 /* require permission to change attributes */
-#define KEY_ALL 0x3f /* all the above permissions */
-
/*
* request_key authorisation
*/
diff --git a/security/keys/key.c b/security/keys/key.c
index 4a1297d..68d7d6b 100644
--- a/security/keys/key.c
+++ b/security/keys/key.c
@@ -708,7 +708,7 @@ static inline key_ref_t __key_update(key_ref_t key_ref,
int ret;

/* need write permission on the key to update it */
- ret = key_permission(key_ref, KEY_WRITE);
+ ret = key_permission(key_ref, WANT_KEY_WRITE);
if (ret < 0)
goto error;

@@ -780,7 +780,7 @@ key_ref_t key_create_or_update(key_ref_t keyring_ref,

/* if we're going to allocate a new key, we're going to have
* to modify the keyring */
- ret = key_permission(keyring_ref, KEY_WRITE);
+ ret = key_permission(keyring_ref, WANT_KEY_WRITE);
if (ret < 0) {
key_ref = ERR_PTR(ret);
goto error_3;
@@ -860,7 +860,7 @@ int key_update(key_ref_t key_ref, const void *payload, size_t plen)
key_check(key);

/* the key must be writable */
- ret = key_permission(key_ref, KEY_WRITE);
+ ret = key_permission(key_ref, WANT_KEY_WRITE);
if (ret < 0)
goto error;

diff --git a/security/keys/keyctl.c b/security/keys/keyctl.c
index 7f09fb8..ec0cd69 100644
--- a/security/keys/keyctl.c
+++ b/security/keys/keyctl.c
@@ -103,7 +103,7 @@ SYSCALL_DEFINE5(add_key, const char __user *, _type,
}

/* find the target keyring (which must be writable) */
- keyring_ref = lookup_user_key(ringid, 1, 0, KEY_WRITE);
+ keyring_ref = lookup_user_key(ringid, 1, 0, WANT_KEY_WRITE);
if (IS_ERR(keyring_ref)) {
ret = PTR_ERR(keyring_ref);
goto error3;
@@ -185,7 +185,7 @@ SYSCALL_DEFINE4(request_key, const char __user *, _type,
/* get the destination keyring if specified */
dest_ref = NULL;
if (destringid) {
- dest_ref = lookup_user_key(destringid, 1, 0, KEY_WRITE);
+ dest_ref = lookup_user_key(destringid, 1, 0, WANT_KEY_WRITE);
if (IS_ERR(dest_ref)) {
ret = PTR_ERR(dest_ref);
goto error3;
@@ -235,7 +235,7 @@ long keyctl_get_keyring_ID(key_serial_t id, int create)
key_ref_t key_ref;
long ret;

- key_ref = lookup_user_key(id, create, 0, KEY_SEARCH);
+ key_ref = lookup_user_key(id, create, 0, WANT_KEY_SEARCH);
if (IS_ERR(key_ref)) {
ret = PTR_ERR(key_ref);
goto error;
@@ -309,7 +309,7 @@ long keyctl_update_key(key_serial_t id,
}

/* find the target key (which must be writable) */
- key_ref = lookup_user_key(id, 0, 0, KEY_WRITE);
+ key_ref = lookup_user_key(id, 0, 0, WANT_KEY_WRITE);
if (IS_ERR(key_ref)) {
ret = PTR_ERR(key_ref);
goto error2;
@@ -337,7 +337,7 @@ long keyctl_revoke_key(key_serial_t id)
key_ref_t key_ref;
long ret;

- key_ref = lookup_user_key(id, 0, 0, KEY_WRITE);
+ key_ref = lookup_user_key(id, 0, 0, WANT_KEY_WRITE);
if (IS_ERR(key_ref)) {
ret = PTR_ERR(key_ref);
goto error;
@@ -363,7 +363,7 @@ long keyctl_keyring_clear(key_serial_t ringid)
key_ref_t keyring_ref;
long ret;

- keyring_ref = lookup_user_key(ringid, 1, 0, KEY_WRITE);
+ keyring_ref = lookup_user_key(ringid, 1, 0, WANT_KEY_WRITE);
if (IS_ERR(keyring_ref)) {
ret = PTR_ERR(keyring_ref);
goto error;
@@ -389,13 +389,13 @@ long keyctl_keyring_link(key_serial_t id, key_serial_t ringid)
key_ref_t keyring_ref, key_ref;
long ret;

- keyring_ref = lookup_user_key(ringid, 1, 0, KEY_WRITE);
+ keyring_ref = lookup_user_key(ringid, 1, 0, WANT_KEY_WRITE);
if (IS_ERR(keyring_ref)) {
ret = PTR_ERR(keyring_ref);
goto error;
}

- key_ref = lookup_user_key(id, 1, 0, KEY_LINK);
+ key_ref = lookup_user_key(id, 1, 0, WANT_KEY_LINK);
if (IS_ERR(key_ref)) {
ret = PTR_ERR(key_ref);
goto error2;
@@ -423,7 +423,7 @@ long keyctl_keyring_unlink(key_serial_t id, key_serial_t ringid)
key_ref_t keyring_ref, key_ref;
long ret;

- keyring_ref = lookup_user_key(ringid, 0, 0, KEY_WRITE);
+ keyring_ref = lookup_user_key(ringid, 0, 0, WANT_KEY_WRITE);
if (IS_ERR(keyring_ref)) {
ret = PTR_ERR(keyring_ref);
goto error;
@@ -465,7 +465,7 @@ long keyctl_describe_key(key_serial_t keyid,
char *tmpbuf;
long ret;

- key_ref = lookup_user_key(keyid, 0, 1, KEY_VIEW);
+ key_ref = lookup_user_key(keyid, 0, 1, WANT_KEY_VIEW);
if (IS_ERR(key_ref)) {
/* viewing a key under construction is permitted if we have the
* authorisation token handy */
@@ -558,7 +558,7 @@ long keyctl_keyring_search(key_serial_t ringid,
}

/* get the keyring at which to begin the search */
- keyring_ref = lookup_user_key(ringid, 0, 0, KEY_SEARCH);
+ keyring_ref = lookup_user_key(ringid, 0, 0, WANT_KEY_SEARCH);
if (IS_ERR(keyring_ref)) {
ret = PTR_ERR(keyring_ref);
goto error2;
@@ -567,7 +567,7 @@ long keyctl_keyring_search(key_serial_t ringid,
/* get the destination keyring if specified */
dest_ref = NULL;
if (destringid) {
- dest_ref = lookup_user_key(destringid, 1, 0, KEY_WRITE);
+ dest_ref = lookup_user_key(destringid, 1, 0, WANT_KEY_WRITE);
if (IS_ERR(dest_ref)) {
ret = PTR_ERR(dest_ref);
goto error3;
@@ -594,7 +594,7 @@ long keyctl_keyring_search(key_serial_t ringid,

/* link the resulting key to the destination keyring if we can */
if (dest_ref) {
- ret = key_permission(key_ref, KEY_LINK);
+ ret = key_permission(key_ref, WANT_KEY_LINK);
if (ret < 0)
goto error6;

@@ -646,7 +646,7 @@ long keyctl_read_key(key_serial_t keyid, char __user *buffer, size_t buflen)
key = key_ref_to_ptr(key_ref);

/* see if we can read it directly */
- ret = key_permission(key_ref, KEY_READ);
+ ret = key_permission(key_ref, WANT_KEY_READ);
if (ret == 0)
goto can_read_key;
if (ret != -EACCES)
@@ -700,7 +700,7 @@ long keyctl_chown_key(key_serial_t id, uid_t uid, gid_t gid)
if (uid == (uid_t) -1 && gid == (gid_t) -1)
goto error;

- key_ref = lookup_user_key(id, 1, 1, KEY_SETATTR);
+ key_ref = lookup_user_key(id, 1, 1, WANT_KEY_SETATTR);
if (IS_ERR(key_ref)) {
ret = PTR_ERR(key_ref);
goto error;
@@ -805,7 +805,7 @@ long keyctl_setperm_key(key_serial_t id, key_perm_t perm)
if (perm & ~(KEY_POS_ALL | KEY_USR_ALL | KEY_GRP_ALL | KEY_OTH_ALL))
goto error;

- key_ref = lookup_user_key(id, 1, 1, KEY_SETATTR);
+ key_ref = lookup_user_key(id, 1, 1, WANT_KEY_SETATTR);
if (IS_ERR(key_ref)) {
ret = PTR_ERR(key_ref);
goto error;
@@ -847,7 +847,7 @@ static long get_instantiation_keyring(key_serial_t ringid,

/* if a specific keyring is nominated by ID, then use that */
if (ringid > 0) {
- dkref = lookup_user_key(ringid, 1, 0, KEY_WRITE);
+ dkref = lookup_user_key(ringid, 1, 0, WANT_KEY_WRITE);
if (IS_ERR(dkref))
return PTR_ERR(dkref);
*_dest_keyring = key_ref_to_ptr(dkref);
@@ -1083,7 +1083,7 @@ long keyctl_set_timeout(key_serial_t id, unsigned timeout)
time_t expiry;
long ret;

- key_ref = lookup_user_key(id, 1, 1, KEY_SETATTR);
+ key_ref = lookup_user_key(id, 1, 1, WANT_KEY_SETATTR);
if (IS_ERR(key_ref)) {
ret = PTR_ERR(key_ref);
goto error;
@@ -1170,7 +1170,7 @@ long keyctl_get_security(key_serial_t keyid,
char *context;
long ret;

- key_ref = lookup_user_key(keyid, 0, 1, KEY_VIEW);
+ key_ref = lookup_user_key(keyid, 0, 1, WANT_KEY_VIEW);
if (IS_ERR(key_ref)) {
if (PTR_ERR(key_ref) != -EACCES)
return PTR_ERR(key_ref);
diff --git a/security/keys/keyring.c b/security/keys/keyring.c
index 3dba81c..97529ab 100644
--- a/security/keys/keyring.c
+++ b/security/keys/keyring.c
@@ -304,7 +304,7 @@ key_ref_t keyring_search_aux(key_ref_t keyring_ref,
key_check(keyring);

/* top keyring must have search permission to begin the search */
- err = key_task_permission(keyring_ref, cred, KEY_SEARCH);
+ err = key_task_permission(keyring_ref, cred, WANT_KEY_SEARCH);
if (err < 0) {
key_ref = ERR_PTR(err);
goto error;
@@ -377,7 +377,7 @@ descend:

/* key must have search permissions */
if (key_task_permission(make_key_ref(key, possessed),
- cred, KEY_SEARCH) < 0)
+ cred, WANT_KEY_SEARCH) < 0)
continue;

/* we set a different error code if we pass a negative key */
@@ -404,7 +404,7 @@ ascend:
continue;

if (key_task_permission(make_key_ref(key, possessed),
- cred, KEY_SEARCH) < 0)
+ cred, WANT_KEY_SEARCH) < 0)
continue;

/* stack the current position */
@@ -550,7 +550,7 @@ struct key *find_keyring_by_name(const char *name, bool skip_perm_check)

if (!skip_perm_check &&
key_permission(make_key_ref(keyring, 0),
- KEY_SEARCH) < 0)
+ WANT_KEY_SEARCH) < 0)
continue;

/* we've got a match */
diff --git a/security/keys/permission.c b/security/keys/permission.c
index 0ed802c..a3a2bbe 100644
--- a/security/keys/permission.c
+++ b/security/keys/permission.c
@@ -72,7 +72,7 @@ use_these_perms:
if (is_key_possessed(key_ref))
kperm |= key->perm >> 24;

- kperm = kperm & perm & KEY_ALL;
+ kperm = kperm & perm & WANT_KEY_ALL;

if (kperm != perm)
return -EACCES;
diff --git a/security/keys/proc.c b/security/keys/proc.c
index 6132629..cfab6f7 100644
--- a/security/keys/proc.c
+++ b/security/keys/proc.c
@@ -182,7 +182,7 @@ static int proc_keys_show(struct seq_file *m, void *v)
* access to __current_cred() safe
*/
rc = key_task_permission(make_key_ref(key, 0), current_cred(),
- KEY_VIEW);
+ WANT_KEY_VIEW);
if (rc < 0)
return 0;

diff --git a/security/keys/process_keys.c b/security/keys/process_keys.c
index 276d278..5b3b7a0 100644
--- a/security/keys/process_keys.c
+++ b/security/keys/process_keys.c
@@ -685,6 +685,8 @@ reget_creds:

} /* end lookup_user_key() */

+EXPORT_SYMBOL(lookup_user_key);
+
/*****************************************************************************/
/*
* join the named keyring as the session keyring if possible, or attempt to

2009-06-16 20:45:42

by David Howells

Subject: [PATCH 14/17] RxRPC: Declare the security index constants symbolically

Declare the security index constants symbolically rather than just referring
to them numerically.

Signed-off-by: David Howells <[email protected]>
---

include/linux/rxrpc.h | 7 +++++++
net/rxrpc/ar-key.c | 4 ++--
net/rxrpc/rxkad.c | 6 +++---
3 files changed, 12 insertions(+), 5 deletions(-)


diff --git a/include/linux/rxrpc.h b/include/linux/rxrpc.h
index f7b826b..a53915c 100644
--- a/include/linux/rxrpc.h
+++ b/include/linux/rxrpc.h
@@ -58,5 +58,12 @@ struct sockaddr_rxrpc {
#define RXRPC_SECURITY_AUTH 1 /* authenticated packets */
#define RXRPC_SECURITY_ENCRYPT 2 /* encrypted packets */

+/*
+ * RxRPC security indices
+ */
+#define RXRPC_SECURITY_NONE 0 /* no security protocol */
+#define RXRPC_SECURITY_RXKAD 2 /* kaserver or kerberos 4 */
+#define RXRPC_SECURITY_RXGK 4 /* gssapi-based */
+#define RXRPC_SECURITY_RXK5 5 /* kerberos 5 */

#endif /* _LINUX_RXRPC_H */
diff --git a/net/rxrpc/ar-key.c b/net/rxrpc/ar-key.c
index 00678c5..3ed1a44 100644
--- a/net/rxrpc/ar-key.c
+++ b/net/rxrpc/ar-key.c
@@ -149,7 +149,7 @@ static int rxrpc_instantiate(struct key *key, const void *data, size_t datalen)
}

ret = -EPROTONOSUPPORT;
- if (security_index != 2)
+ if (security_index != RXRPC_SECURITY_RXKAD)
goto error;

key->type_data.x[0] = security_index;
@@ -351,7 +351,7 @@ int rxrpc_get_server_data_key(struct rxrpc_connection *conn,
_debug("key %d", key_serial(key));

data.kver = 1;
- data.tsec.security_index = 2;
+ data.tsec.security_index = RXRPC_SECURITY_RXKAD;
data.tsec.ticket_len = 0;
data.tsec.expiry = expiry;
data.tsec.kvno = 0;
diff --git a/net/rxrpc/rxkad.c b/net/rxrpc/rxkad.c
index d5a677f..34fc0c4 100644
--- a/net/rxrpc/rxkad.c
+++ b/net/rxrpc/rxkad.c
@@ -43,7 +43,7 @@ struct rxkad_level2_hdr {
__be32 checksum; /* decrypted data checksum */
};

-MODULE_DESCRIPTION("RxRPC network protocol type-2 security (Kerberos)");
+MODULE_DESCRIPTION("RxRPC network protocol type-2 security (Kerberos 4)");
MODULE_AUTHOR("Red Hat, Inc.");
MODULE_LICENSE("GPL");

@@ -507,7 +507,7 @@ static int rxkad_verify_packet(const struct rxrpc_call *call,
if (!call->conn->cipher)
return 0;

- if (sp->hdr.securityIndex != 2) {
+ if (sp->hdr.securityIndex != RXRPC_SECURITY_RXKAD) {
*_abort_code = RXKADINCONSISTENCY;
_leave(" = -EPROTO [not rxkad]");
return -EPROTO;
@@ -1123,7 +1123,7 @@ static void rxkad_clear(struct rxrpc_connection *conn)
static struct rxrpc_security rxkad = {
.owner = THIS_MODULE,
.name = "rxkad",
- .security_index = RXKAD_VERSION,
+ .security_index = RXRPC_SECURITY_RXKAD,
.init_connection_security = rxkad_init_connection_security,
.prime_packet_security = rxkad_prime_packet_security,
.secure_packet = rxkad_secure_packet,

2009-06-16 20:55:38

by Christoph Hellwig

Subject: Re: [PATCH 01/17] VFS: Implement the pioctl() system call

On Tue, Jun 16, 2009 at 09:38:51PM +0100, David Howells wrote:
> From: Jacob Thebault-Spieker <[email protected]>
>
> Implement the pioctl() system call. This is used to support a number of AFS
> functions, and could also be used for Coda and other filesystems.

Umm, adding a new system call multiplexer without any structure is a
serious no-go. And this one is much worse than ioctl, which with a
fixed [fd,cmd,arg] tuple seems like a stronghold of sanity compared to this
monster with multiple arguments and a path that may or may not be there.

I think you'd be better off writing tools that use a sane interface than
adding a big pile of crap like this to the kernel.

2009-06-16 23:00:18

by David Howells

Subject: Re: [PATCH 00/17] [RFC] AFS: Implement OpenAFS pioctls(version)s


David Howells <[email protected]> wrote:

> This series of patches provides a pioctl() system call, and makes kAFS use it
> to provide a number of the OpenAFS pioctl functions, sufficient to allow a
> number of OpenAFS userspace utilities to work with kAFS.

The point of this is to permit unmodified OpenAFS userspace utilities to use
either the OpenAFS kernel module or the kAFS kernel module.

Whilst I might wish to ultimately replace the OpenAFS userspace utilities with
my own set, that's no small piece of work, and so a handy halfway stage is a
mixed environment as outlined above.

Furthermore, the ability to use the OpenAFS userspace utilities unmodified
with my kernel module, and, indeed vice-versa, makes testing much easier.

David

2009-06-16 23:11:19

by Alan

Subject: Re: [PATCH 00/17] [RFC] AFS: Implement OpenAFS pioctls(version)s

> Whilst I might wish to ultimately replace the OpenAFS userspace utilities with
> my own set, that's no small piece of work, and so a handy halfway stage is a
> mixed environment as outlined above.
>
> Furthermore, the ability to use the OpenAFS userspace utilities unmodified
> with my kernel module, and, indeed vice-versa, makes testing much easier.

But if we add an ABI we end up stuck with it and this one is really
really rather ugly.

Can you not put pioctl() into a C library linked with the openafs
utilities that generates more sensible interface calls ? I mean you have
to produce the pioctl() syscall wrapper anyway so why not make "pioctl" a
user space compat library ?

Alan

2009-06-17 00:20:29

by David Howells

Subject: Re: [PATCH 01/17] VFS: Implement the pioctl() system call

Christoph Hellwig <[email protected]> wrote:

> > Implement the pioctl() system call. This is used to support a number of AFS
> > functions, and could also be used for Coda and other filesystems.
>
> Umm, adding a new system call multiplexer without any structure is a
> serious no-go. And this one is much worse than ioctl, which with a
> fixed [fd,cmd,arg] tuple seems like a stronghold of sanity compared to this
> monster with multiple arguments and a path that may or may not be there.

Ummm... pioctl() has lots of structure. Standard argument/reply block
definition, for example: you get one blob of argument, you may return one blob
of argument, you must structure your blobs such that 32-bit/64-bit
compatibility problems don't occur. It's _much_ more structured than ioctl,
for example.

The main annoyance with it, as you noted, is the fact that people have treated
the path as being optional.

> I think you'd be better off writing tools that use a sane interface than
> adding a big pile of crap like this to the kernel.

Name a single sane interface that can do all that pioctl() can? There isn't
one. You can emulate almost all of pioctl() in userspace by a combination of
getxattr, lgetxattr, setxattr, lsetxattr, add_key, keyctl_read, and when all
else fails, open/open-NOFOLLOW + ioctl [IF not a dev file, and IF there are no
collisions between ioctl numbers and pioctl numbers]. In other words, a mess.

Now, assuming I do produce such a userspace library - that does not address
the other requirement: that of using a common set of binaries to manipulate
both OpenAFS and kAFS without the need for recompilation. I presume you
advocate making OpenAFS change to suit your requirements?

David

2009-06-17 00:26:41

by David Howells

Subject: Re: [PATCH 00/17] [RFC] AFS: Implement OpenAFS pioctls(version)s

Alan Cox <[email protected]> wrote:

> But if we add an ABI we end up stuck with it and this one is really
> really rather ugly.

Somewhat less ugly than ioctl, for instance, but you're not entirely wrong.
There is no good way of doing this.

> Can you not put pioctl() into a C library linked with the openafs utilities
> that generates more sensible interface calls? I mean you have to produce
> the pioctl() syscall wrapper anyway so why not make "pioctl" a user space
> compat library?

pioctl() is almost implementable with a combination of (l)setxattr,
(l)getxattr, set_key, keyctl_read, and if all else fails, open + ioctl or
open(O_NOFOLLOW) + ioctl, but not quite completely. There are things you can't
open, even with O_NOFOLLOW. And doing state-retaining setxattr/getxattr pairs
is even more nasty than pioctl (IIRC, that's something Christoph suggested a
while back).

Besides, I want a set of utilities that I can use in conjunction with both kAFS
and OpenAFS without having to recompile.

David

2009-06-17 07:48:13

by Andreas Dilger

Subject: Re: [PATCH 03/17] VFS: Implement handling for pathless pioctls

On Jun 16, 2009 21:39 +0100, David Howells wrote:
> Implement handling for pathless pioctls. Because these take no path argument,
> there's no way to know for certain which filesystem they're aimed at, so we
> have to switch on command number instead. This patch allows interested parties
> to register handlers. Each registered handler function is tried in turn until
> one doesn't return -EOPNOTSUPP.
>
> This is required because OpenAFS implemented a number of AFS calls that don't
> get given a path as they're aimed at AFS in general, and not at a particular
> file, volume or cell in the AFS world.

Wouldn't it make a lot more sense to create a virtual device, open the
mountpoint, or do _something_ that associates these calls with AFS
directly instead of having the kernel magically route the call to a
specific filesystem?

Not that I agree with having a filesystem-specific syscall either
(sys_reiserfs() was not allowed either :-), but wouldn't even that
be better suited to doing AFS-specific things?

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.

2009-06-17 07:51:58

by Andreas Dilger

Subject: Re: [PATCH 10/17] AFS: Implement the PWhereIs pioctl

On Jun 16, 2009 21:39 +0100, David Howells wrote:
> Implement the PWhereIs pioctl for AFS. This will find out on which servers
> the volume containing the specified file is located and return the IPv4
> addresses to userspace.

What happens with IPv6?

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.

2009-06-17 07:55:26

by Andreas Dilger

Subject: Re: [PATCH 00/17] [RFC] AFS: Implement OpenAFS pioctls(version)s

On Jun 17, 2009 01:25 +0100, David Howells wrote:
> Alan Cox <[email protected]> wrote:
> > But if we add an ABI we end up stuck with it and this one is really
> > really rather ugly.
>
> Somewhat less ugly than ioctl, for instance, but you're not entirely wrong.
> There is no good way of doing this.
>
> > Can you not put pioctl() into a C library linked with the openafs utilities
> > that generates more sensible interface calls? I mean you have to produce
> > the pioctl() syscall wrapper anyway so why not make "pioctl" a user space
> > compat library?
>
> pioctl() is almost implementable with a combination of (l)setxattr,
> (l)getxattr, set_key, keyctl_read, and if all else fails, open + ioctl or
> open(O_NOFOLLOW) + ioctl, but not quite completely. There are things you
> can't open, even with O_NOFOLLOW. And doing state-retaining setxattr/
> getxattr pairs is even more nasty than pioctl (IIRC, that's something
> Christoph suggested a while back).

What about opening the mountpoint (which HAS to be available) and then
calling an ioctl() on that? At least the mess would be contained within
AFS instead of requiring several new syscalls.

> Besides, I want a set of utilities that I can use in conjunction with both
> kAFS and OpenAFS without having to recompile.

That doesn't mean it isn't possible to have the same user-space utilities,
just that the pioctl() wrapper will need to multiplex its behaviour depending
on whether it is working with kAFS or OpenAFS.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.

2009-06-17 09:01:20

by Alan

Subject: Re: [PATCH 00/17] [RFC] AFS: Implement OpenAFS pioctls(version)s

> > Can you not put pioctl() into a C library linked with the openafs utilities
> > that generates more sensible interface calls? I mean you have to produce
> > the pioctl() syscall wrapper anyway so why not make "pioctl" a user space
> > compat library?
>
> pioctl() is almost implementable with a combination of (l)setxattr,
> (l)getxattr, set_key, keyctl_read, and if all else fails, open + ioctl or
> open(O_NOFOLLOW) + ioctl, but not quite completely. There are things you can't
> open, even with O_NOFOLLOW. And doing state-retaining setxattr/getxattr pairs
> is even more nasty than pioctl (IIRC, that's something Christoph suggested a
> while back).
>
> Besides, I want a set of utilities that I can use in conjunction with both kAFS
> and OpenAFS without having to recompile.

"I want" isn't a good policy for the introduction of ugly as sin long
term interfaces into the kernel. Besides which your argument doesn't
actually make sense anyway.

If you have to put a pioctl() wrapper in a library then you can compile
both sets of tools with your wrapper library and the library can do the
relevant gunge to decide how to make the kernel calls and which ones to
make.

The pioctl() interface is crap, keep it in user space wrappers and put
actual proper structured interfaces into the kernel.

2009-06-17 09:03:06

by Alan

Subject: Re: [PATCH 01/17] VFS: Implement the pioctl() system call

> Now, assuming I do produce such a userspace library - that does not address
> the other requirement: that of using a common set of binaries to manipulate
> both OpenAFS and kAFS without the need for recompilation. I presume you
> advocate making OpenAFS change to suit your requirements?

If you have a library providing pioctl() it can speak brain damage to the
OpenAFS module and sensible interfaces otherwise.

The problem you are attempting to introduce simply doesn't exist.

2009-06-17 16:11:09

by Linus Torvalds

Subject: Re: [PATCH 00/17] [RFC] AFS: Implement OpenAFS pioctls(version)s



On Wed, 17 Jun 2009, Andreas Dilger wrote:
>
> What about opening the mountpoint (which HAS to be available) and then
> calling an ioctl() on that?

It's very hard to "open the mountpoint" in user space. How would you even
do it? Remember: we're not living in the 1980's any more, and disco is
dead. ABBA may have made a comeback, but static mountpoints are long gone,
and won't be coming back.

These days, you can mount individual files, you can have per-process
mounts, and automounters have been a fact for a long time.

So I _agree_ that pioctl's are problematic, but please don't argue against
them using _stupid_ arguments. And "open the mountpoint" really is a
stupid argument. It not only isn't possible to do in user space, but you
may well want to do operations on a particular path, not just the mount.

So you'd need to open the file itself. Which might be a symlink or a
device node, depending on the exact semantics of pioctl.

We've traditionally had that magic "open with flag=3" to do a magic open
of device files without waiting, and we have O_NOFOLLOW to open symlinks
without following them (sadly, it just errors out, rather than opening the
symlink, but that's another detail).

So I think it should be solvable some way, but not by trying to find the
mount point.

Linus

2009-06-17 17:27:14

by David Howells

Subject: Re: [PATCH 00/17] [RFC] AFS: Implement OpenAFS pioctls(version)s

Linus Torvalds <[email protected]> wrote:

> > What about opening the mountpoint (which HAS to be available) and then
> > calling an ioctl() on that?
>
> It's very hard to "open the mountpoint" in user space. How would you even
> do it? Remember: we're not living in the 1980's any more, and disco is
> dead. ABBA may have made a comeback, but static mountpoints are long gone,
> and won't be coming back.

I think what Andreas means is open the directory at the root of the mounted
tree, i.e. "/afs" for AFS, and then do an ioctl() on that that emulates
pioctl().

This is similar to what Coda does (see fs/coda/pioctl.c) except that that uses
a special file hidden in the root dir of the Coda mount (see coda_lookup() in
fs/coda/dir.c) that doesn't appear to readdir.

This is also similar to what OpenAFS does if it can't alter the syscall table -
it creates a proc file (/proc/fs/openafs/afs_ioctl) and issues ioctls on that,
to emulate pioctl().

In both cases, the pioctl emulator calls user_path() or similar from the
module, and then calls the appropriate handler directly.

> So you'd need to open the file itself. Which might be a symlink or a
> device node, depending on the exact semantics of pioctl.

Indeed. I was also hoping to use them to control caching of AFS, NFS and CIFS
files, where you might want to point at a file and say "put that in the cache"
or "eject that from the cache" or "keep that out of the cache". However,
that's only possible with certain types of file where the file is opened
directly and an ioctl() done on it instead of a pioctl(). As you point out,
for symlinks it's tricky and for device files it's definitely wrong.

I'll acknowledge that it *could* perhaps be done with [l]setxattr() using
special attributes - however these things aren't necessarily attributes of the
file, but rather attributes of the system, so is setxattr() semantically
correct?

David

2009-06-17 17:34:37

by Linus Torvalds

Subject: Re: [PATCH 00/17] [RFC] AFS: Implement OpenAFS pioctls(version)s



On Wed, 17 Jun 2009, David Howells wrote:

> Linus Torvalds <[email protected]> wrote:
>
> > > What about opening the mountpoint (which HAS to be available) and then
> > > calling an ioctl() on that?
> >
> > It's very hard to "open the mountpoint" in user space. How would you even
> > do it? Remember: we're not living in the 1980's any more, and disco is
> > dead. ABBA may have made a comeback, but static mountpoints are long gone,
> > and won't be coming back.
>
> I think what Andreas means is open the directory at the root of the mounted
> tree, i.e. "/afs" for AFS, and then do an ioctl() on that that emulates
> pioctl().

I agree that that is what he means.

What _I_ mean is that THIS IS IMPOSSIBLE TO DO FROM USER SPACE!

Try it. Not doable. User space simply doesn't know enough, and has
fundamental races with mount/umount.

Sure, you can try to do it by trying to parse the pathname and looking in
/etc/mtab or /proc/mounts. And I guarantee that the end result will be a
buggy pile of sh*t.

End result: you do need a new system call.

I just don't think "pioctl()" is a good one. You'd be better off with some
modification of open and then use ioctl.

Linus

2009-06-17 18:05:30

by David Howells

Subject: Re: [PATCH 10/17] AFS: Implement the PWhereIs pioctl

Andreas Dilger <[email protected]> wrote:

> What happens with IPv6?

Currently AFS doesn't support IPv6. There need to be some protocol changes
for that to happen, for instance the volume location server needs to return
IPv6 records for its fileservers if that's the right way to access them.

David

2009-06-17 18:06:40

by David Howells

Subject: Re: [PATCH 00/17] [RFC] AFS: Implement OpenAFS pioctls(version)s

Linus Torvalds <[email protected]> wrote:

> What _I_ mean is that THIS IS IMPOSSIBLE TO DO FROM USER SPACE!
>
> Try it. Not doable. User space simply doesn't know enough, and has
> fundamental races with mount/umount.

Ummm... I'm not sure I completely agree. If you've managed to open, say,
"/afs", where's the race with mount/umount? You've got a file descriptor you
can use as a handle. Yes, you have to check that it's actually an inode of
your fs, but that's not exactly difficult, and that's not going to change just
because someone unmounts it or mounts over it whilst you've got it open.
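
A minimal sketch of that check, assuming "an inode of your fs" can be
recognised by the superblock magic that fstatfs() reports in f_type. The
helper names fd_is_on_fs() and open_checked() are made up for illustration,
and the expected magic is supplied by the caller rather than assumed here:

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <sys/vfs.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns 1 if fd lives on a filesystem whose statfs
 * f_type equals magic, 0 if it does not, and -1 if fstatfs() fails.
 */
int fd_is_on_fs(int fd, long magic)
{
	struct statfs st;

	assert(fd >= 0);
	if (fstatfs(fd, &st) < 0)
		return -1;
	return (long)st.f_type == magic;
}

/*
 * Hypothetical helper: open a directory such as "/afs" and keep the
 * descriptor only if it really is on the expected filesystem.
 * Returns the fd, or -1 on any failure.
 */
int open_checked(const char *path, long magic)
{
	int fd = open(path, O_RDONLY | O_DIRECTORY);

	if (fd < 0)
		return -1;
	if (fd_is_on_fs(fd, magic) != 1) {
		close(fd);
		return -1;
	}
	return fd;
}
```

Once open_checked() has returned a descriptor, the check has already been
made against the object that was actually opened, so a later mount over the
path does not change what the descriptor refers to.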

However, that makes userspace have to assume that the pioctl handler is on an
AFS inode, perhaps any AFS inode. This is not compatible with OpenAFS as it
stands, and also means you can't use the AFS pioctls before mounting anything,
and you can't mount it elsewhere and expect it to work.

> End result: you do need a new system call.
>
> I just don't think "pioctl()" is a good one.

Out of interest, why not? Is it just because it's another multiplexor? Or is
it because it's been abused to have pathless commands?

> You'd be better off with some modification of open and then use ioctl.

So you'd say use:

fd = open("/the/target/file", O_SUPPRESS | (nofollow?O_NOFOLLOW:0));
ioctl(fd, cmd, &args);
close(fd);

where O_SUPPRESS (or whatever) suppressed override of the ops tables by the
chardev and blockdev handlers, and allows symlinks to be opened, rather than:

pioctl("/the/target/file", cmd, &args, nofollow);

I would counter that with:

(1) pioctl() is actually simpler and cleaner, and doesn't require
modifications to open().

(2) The open()/ioctl() method doesn't handle pathless pioctls, and so is not
a complete solution.

(3) The open()/ioctl() method assumes that pioctl() command numbers don't
clash with ioctl() command numbers - something that's unfortunately not
true of OpenAFS:-(

Of course, you could have one ioctl() command number that says that this
is a pioctl() and then a second number in the argument data that is the
pioctl() command number.

(4) pioctl() is compatible with OpenAFS.
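
The workaround mentioned under (3), one ioctl() command carrying the pioctl()
command number inside its argument data, might look roughly like this. It is
only a sketch: struct pioctl_wrap, its field names and pioctl_wrap_fill() are
hypothetical, no carrier ioctl number is defined, and fixed-width fields are
used on the assumption that the layout should be the same for 32-bit and
64-bit callers:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical argument block handed to a single carrier ioctl(). */
struct pioctl_wrap {
	uint32_t cmd;		/* the real pioctl command number */
	uint32_t flags;		/* reserved, e.g. for a nofollow bit */
	uint32_t in_size;	/* bytes valid in the input blob */
	uint32_t out_size;	/* capacity of the output blob */
	uint64_t in;		/* userspace pointer to the input blob */
	uint64_t out;		/* userspace pointer to the output blob */
};

_Static_assert(sizeof(struct pioctl_wrap) == 32,
	       "layout must not depend on the word size");

/* Fill in the wrapper; returns 0, or -1 if a size does not fit in 32 bits. */
int pioctl_wrap_fill(struct pioctl_wrap *w, uint32_t cmd,
		     const void *in, size_t in_size,
		     void *out, size_t out_size)
{
	assert(w != NULL);
	if (in_size > UINT32_MAX || out_size > UINT32_MAX)
		return -1;

	memset(w, 0, sizeof(*w));
	w->cmd = cmd;
	w->in_size = (uint32_t)in_size;
	w->out_size = (uint32_t)out_size;
	w->in = (uint64_t)(uintptr_t)in;
	w->out = (uint64_t)(uintptr_t)out;
	return 0;
}
```

The filled-in structure would then be the third argument to ioctl() on the
opened file, which keeps the pioctl() numbers out of the ioctl() number space
at the cost of a second level of multiplexing.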

Do you also disagree with OpenAFS's idea of creating a proc file to open so
that you can do ioctls on that to emulate pioctl()? That would serve also.

David

2009-06-17 18:25:20

by Linus Torvalds

Subject: Re: [PATCH 00/17] [RFC] AFS: Implement OpenAFS pioctls(version)s



On Wed, 17 Jun 2009, David Howells wrote:

> Linus Torvalds <[email protected]> wrote:
>
> > What _I_ mean is that THIS IS IMPOSSIBLE TO DO FROM USER SPACE!
> >
> > Try it. Not doable. User space simply doesn't know enough, and has
> > fundamental races with mount/umount.
>
> Ummm... I'm not sure I completely agree. If you've managed to open, say,
> "/afs", where's the race with mount/umount?

Well, if you mean that you're going to have a new system call that then
passes in both the 'fd' from that /afs open, _and_ the pathname you want
to work on, then sure.

But if you do that new system call, then what's the point again? You're
back to pinfo() anyway.

> > I just don't think "pioctl()" is a good one.
>
> Out of interest, why not? Is it just because it's another multiplexor? Or is
> it because it's been abused to have pathless commands?

No. It's because it's another _typeless_ multiplexor.

Look at ioctl. It's a F*CKING DISASTER. Look at all the compat crap, and
at the ioctl numbers that mean different things for different file types,
and all the random sizing crap. You fixed the random sizing crap (at least
it has well-defined "input" and "output" areas), and that's an
improvement, but it's still just random numbers with no semantics.

Now, you can take two approaches:

- learn from your mistake, and not do another f*cking disaster that just
takes a pathname instead of a fd. Do something else, that actually has
semantics and has a well-defined input and output buffer.

- do the same stupid thing over again, and never learn.

And quite frankly, I know which of those choices I'd call "intelligent",
and which of them I'd call "you're a f*cking moron for doing it".

And guess which one "pioctl()" is. Just take a wild stab at it.

> > You'd be better off with some modification of open and then use ioctl.
>
> So you'd say use:
>
> fd = open("/the/target/file", O_SUPPRESS | (nofollow?O_NOFOLLOW:0));
> ioctl(fd, cmd, &args);
> close(fd);

Yes, I think that would be better. It's not perfect, because I think ioctl
is still a f*cking broken mess (and with the sizing issue, it's arguably
_worse_ than your pioctl), but at least we're not adding _another_ broken
mess.

So I don't think the above is great either.

What I'd really prefer is something that actually has semantics. Not just
"here's input, here's output, do something random to it".

> Do you also disagree with OpenAFS's idea of creating a proc file to open so
> that you can do ioctls on that to emulate pioctl()? That would serve also.

Oh yes, I think that's a piece of crap too.

Linus

2009-06-17 18:27:00

by David Howells

Subject: Re: [PATCH 03/17] VFS: Implement handling for pathless pioctls

Andreas Dilger <[email protected]> wrote:

> Wouldn't it make a lot more sense to create a virtual device, open the
> mountpoint, or do _something_ that associates these calls with AFS
> directly instead of having the kernel magically route the call to a
> specific filesystem?

Ummm... You mean like mkdir, rmdir, symlink, readlink, mknod, creat, open,
unlink, etc. aren't magically routed by the kernel to a specific filesystem
based on the pathname?

The exception to that is pathless pioctls which are most annoying and require
special handling and foreknowledge whatever.

David
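
The patch description for pathless pioctls says that each registered handler
is tried in turn until one doesn't return -EOPNOTSUPP. A minimal userspace
sketch of that dispatch rule, with a hypothetical handler signature and
made-up command numbers (this is not the code from the patch):

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical handler type: returns -EOPNOTSUPP for commands it doesn't own. */
typedef long (*pathless_handler_fn)(int cmd, void *arg);

/* Try each handler in turn; the first answer other than -EOPNOTSUPP wins. */
long dispatch_pathless(const pathless_handler_fn *handlers, size_t n,
		       int cmd, void *arg)
{
	assert(handlers != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		long ret = handlers[i](cmd, arg);

		if (ret != -EOPNOTSUPP)
			return ret;
	}
	return -EOPNOTSUPP;	/* nobody claimed the command */
}

/* Illustrative handlers; command numbers 1 and 2 are invented. */
long handler_a(int cmd, void *arg)
{
	(void)arg;
	return cmd == 1 ? 0 : -EOPNOTSUPP;
}

long handler_b(int cmd, void *arg)
{
	(void)arg;
	return cmd == 2 ? -EINVAL : -EOPNOTSUPP;
}
```

Note that a real error such as -EINVAL from handler_b stops the search just
as success does, so the first handler to recognise a command number owns it.
That is why the command numbers have to be known in advance.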

2009-06-17 18:31:13

by Theodore Ts'o

Subject: Re: [PATCH 00/17] [RFC] AFS: Implement OpenAFS pioctls(version)s

On Wed, Jun 17, 2009 at 07:03:43PM +0100, David Howells wrote:
> Linus Torvalds <[email protected]> wrote:
>
> > What _I_ mean is that THIS IS IMPOSSIBLE TO DO FROM USER SPACE!
> >
> > Try it. Not doable. User space simply doesn't know enough, and has
> > fundamental races with mount/umount.
>
> Ummm... I'm not sure I completely agree. If you've managed to open, say,
> "/afs", where's the race with mount/umount? You've got a file descriptor you
> can use as a handle. Yes, you have to check that it's actually an inode of
> your fs, but that's not exactly difficult, and that's not going to change just
> because someone unmounts it or mounts over it whilst you've got it open.

I believe Linus is arguing that in the general case, it's impossible
to open the mountpoint of an arbitrarily mounted filesystem. David,
you're arguing that by convention the afs root is *always* in /afs,
and that the afs utilities will always simply open "/afs", and thus
it's not hard to find the mount point, since afs works by having a
single top-level static mount point --- and AFS hides the
lookup of what volume server you might need to go to when you open
/afs/athena.mit.edu/user/t/y/tytso versus /afs/andrew.cmu.edu/usr/shadow.

There are no magic "automounts" such that OS won't know that
user.tytso AFS Volume in the athena.mit.edu AFS cell is at
/afs/athena.mit.edu/user/t/y/tytso, so the only "mountpoint" that
exists as far as AFS is concerned is at /afs --- and that in the AFS
world, it's essentially a universal convention that AFS pathnames
begin with "/afs", and so the AFS filesystem will always be mounted in /afs.

- Ted

2009-06-17 18:39:18

by Al Viro

Subject: Re: [PATCH 00/17] [RFC] AFS: Implement OpenAFS pioctls(version)s

On Wed, Jun 17, 2009 at 09:09:47AM -0700, Linus Torvalds wrote:
> We've traditionally had that magic "open with flag=3" to do a magic open
> of device files without waiting, and we have O_NOFOLLOW to open symlinks
> without following them (sadly, it just errors out, rather than opening the
> symlink, but that's another detail).
>
> So I think it should be solvable some way, but not by trying to find the
> mount point.

O_NOFOLLOW *will* open their mountpoints just fine, without triggering
automount. Of course, if something's already mounted there, it will
get you the covering object. Which is a feature, as far as I'm concerned,
since "I've overmounted that to have it unreachable" shouldn't be breakable
regardless of the syscall we are using - be it open() or pioctl().

FWIW, count me strongly opposed to that shit; it's too damn ugly to live,
has interesting security implications and we'll get stuck with it forever.
And we *really* don't need another multiplexor from hell, without anything
resembling well-defined semantics.

2009-06-17 18:45:30

by Linus Torvalds

Subject: Re: [PATCH 00/17] [RFC] AFS: Implement OpenAFS pioctls(version)s



On Wed, 17 Jun 2009, Al Viro wrote:

> On Wed, Jun 17, 2009 at 09:09:47AM -0700, Linus Torvalds wrote:
> > We've traditionally had that magic "open with flag=3" to do a magic open
> > of device files without waiting, and we have O_NOFOLLOW to open symlinks
> > without following them (sadly, it just errors out, rather than opening the
> > symlink, but that's another detail).
> >
> > So I think it should be solvable some way, but not by trying to find the
> > mount point.
>
> O_NOFOLLOW *will* open their mountpoints just fine, without triggering
> automount.

That's not the problem with O_NOFOLLOW.

The problem is that if you want to actually open the symlink itself (say,
you do some filesystem cleanup operation on it, like saying "drop the
caches of this file"), you can't do it. O_NOFOLLOW won't open the symlink,
it will just refuse to follow it, and return an error.

Linus
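
The behaviour described above can be seen with a small test program. This is
a sketch for Linux only; the helper names and the temporary path are made up,
and on Linux the open() of the symlink is expected to fail with ELOOP:

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Try to open path without following a trailing symlink.
 * Returns 0 on success, or the errno value set by open().
 */
int try_open_nofollow(const char *path)
{
	int fd;

	assert(path != NULL);
	fd = open(path, O_RDONLY | O_NOFOLLOW);
	if (fd < 0)
		return errno;
	close(fd);
	return 0;
}

/*
 * Create a throwaway symlink to "/" and report what open(O_NOFOLLOW)
 * does with it. Returns the errno from open(), 0 if the open succeeded,
 * or -1 if the scratch symlink could not be set up.
 */
int demo_nofollow_on_symlink(void)
{
	char dir[] = "/tmp/nofollow-XXXXXX";
	char link_path[64];
	int err;

	if (mkdtemp(dir) == NULL)
		return -1;
	snprintf(link_path, sizeof(link_path), "%s/link", dir);
	if (symlink("/", link_path) < 0) {
		rmdir(dir);
		return -1;
	}
	err = try_open_nofollow(link_path);
	unlink(link_path);
	rmdir(dir);
	return err;
}
```

So the flag only stops the link from being traversed; it does not hand back a
descriptor for the symlink itself, which is what a per-file operation on a
symlink would need.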

2009-06-17 18:52:46

by Al Viro

Subject: Re: [PATCH 00/17] [RFC] AFS: Implement OpenAFS pioctls(version)s

On Wed, Jun 17, 2009 at 11:44:18AM -0700, Linus Torvalds wrote:
> > O_NOFOLLOW *will* open their mountpoints just fine, without triggering
> > automount.
>
> That's not the problem with O_NOFOLLOW.
>
> The problem is that if you want to actually open the symlink itself (say,
> you do some filesystem cleanup operation on it, like saying "drop the
> caches of this file"), you can't do it. O_NOFOLLOW won't open the symlink,
> it will just refuse to follow it, and return an error.

... so we need a syscall that would do that "drop the caches" operation.
_After_ having decided that it's really needed for symlinks. With decision
made on per-operation basis. Sure, it will be painful for people proposing
such operations, which is just fine by me - barriers to adding new primitives
shouldn't be low.

2009-06-17 19:15:39

by David Lang

Subject: Re: [PATCH 00/17] [RFC] AFS: Implement OpenAFS pioctls(version)s

On Wed, 17 Jun 2009, Theodore Tso wrote:

> There are no magic "automounts" such that OS won't know that
> user.tytso AFS Volume in the athena.mit.edu AFS cell is at
> /afs/athena.mit.edu/user/t/y/tytso, so the only "mountpoint" that
> exists as far as AFS is concerned is at /afs --- and that in the AFS
> world, it's essentially a universal convention that AFS pathnames
> begin with "/afs", and so the AFS filesystem will always be mounted in /afs.

so does this mean that there can never be more than a single AFS
filesystem mounted?

David Lang

2009-06-17 19:29:07

by David Howells

Subject: Re: [PATCH 00/17] [RFC] AFS: Implement OpenAFS pioctls(version)s

Al Viro <[email protected]> wrote:

> ... so we need a syscall that would do that "drop the caches" operation.
> _After_ having decided that it's really needed for symlinks.

If you want to support disconnected operation, then you need a way to (a) lock
an object in the cache, (b) unlock an object in the cache, (c) pull an object
into the cache, (d) kick an object out of the cache, (e) ban an object from the
cache, (f) reserve space in the cache for an object, (g) release the
reservation on an object and (h) find out the lock/ban/reservation status of an
object in the cache, and you'd need to support them for _all_ file types,
including dirs, symlinks, dev files and fifos. Probably not UNIX sockets,
though.

I can add a system call for each of these operations. I need some of them
anyway to implement kAFS if I'm not allowed pioctl().

<sarcasm> In fact, why don't I just make each AFS pioctl function a full-blown
syscall? That satisfies Linus's semantics requirement, and avoids the need for
a 'typeless' multiplexor that so offends people. OTOH, the master syscall mux
_is_ a typeless multiplexor... </sarcasm>

David

2009-06-17 19:30:50

by David Howells

Subject: Re: [PATCH 00/17] [RFC] AFS: Implement OpenAFS pioctls(version)s

[email protected] wrote:

> so does this mean that there can never be more than a single AFS filesystem
> mounted?

Currently, in a way, yes: the existence of pathless pioctls ensures that.

kAFS does use multiple mounts, one per volume, but they're normally rooted at
/afs.

David

2009-06-17 19:52:20

by David Howells

Subject: Re: [PATCH 00/17] [RFC] AFS: Implement OpenAFS pioctls(version)s

Linus Torvalds <[email protected]> wrote:

> > Ummm... I'm not sure I completely agree. If you've managed to open, say,
> > "/afs", where's the race with mount/umount?
>
> Well, if you mean that you're going to have a new system call that then
> passes in both the 'fd' from that /afs open, _and_ the pathname you want
> to work on, then sure.

No. I meant open + ioctl.

> But if you do that new system call, then what's the point again? You're
> back to pinfo() anyway.

Ummm... pinfo()? Did you mean pioctl()?

> No. It's because it's another _typeless_ multiplexor.

What do you mean by 'typeless'? Even the master syscall mux is typeless,
depending on how you look at it; either that, or it's the superposition of a
multiplicity of types selected by an arbitrary number.

Short of doing something like an XML or ASN1 structured interface, we aren't
going to get that, and do we really want to go down that path?

The difference between the syscall mux and a filesystem's ioctl/pioctl mux is
that they both need to check on their arguments.

> Look at ioctl. It's a F*CKING DISASTER. Look at all the compat crap, and
> at the ioctl numbers that mean different things for different file types,
> and all the random sizing crap. You fixed the random sizing crap (at least
> it has well-defined "input" and "output" areas), and that's an
> improvement, but it's still just random numbers with no semantics.

Well, to be fair, I didn't fix it. That's the way pioctl() was defined before
I got to deal with it. Compat code is not necessary beyond the outermost VFS
layer because you have to carefully structure your input and output blobs, and
pointers and CPU-dependent types are not allowed therein. It even uses XDR
encoding in some circumstances (VIOCSETTOK2 and VIOCGETTOK2), so some pioctls
will even work on a mixed-endian machine. Now if it only used XDR for all...
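
As a rough illustration of why such blobs need no compat layer, here is a sketch of a request made only of fixed-width fields and encoded the way XDR encodes a 32-bit integer (four bytes, most significant first). The struct demo_req and its field names are invented for this example; this is not the actual layout of VIOCSETTOK2 or any other pioctl argument.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical request: two 32-bit fields, no pointers, no 'long'. */
struct demo_req {
	uint32_t cell_index;
	uint32_t flags;
};

#define DEMO_REQ_SIZE 8	/* encoded size in bytes, identical on every arch */

/* Store a 32-bit value big-endian, as XDR does for integers. */
static void put_be32(unsigned char *buf, uint32_t v)
{
	buf[0] = (unsigned char)(v >> 24);
	buf[1] = (unsigned char)(v >> 16);
	buf[2] = (unsigned char)(v >> 8);
	buf[3] = (unsigned char)v;
}

/* Load a big-endian 32-bit value regardless of host byte order. */
static uint32_t get_be32(const unsigned char *buf)
{
	return ((uint32_t)buf[0] << 24) | ((uint32_t)buf[1] << 16) |
	       ((uint32_t)buf[2] << 8) | (uint32_t)buf[3];
}

/* Encode into buf (at least DEMO_REQ_SIZE bytes); returns bytes written. */
size_t demo_req_encode(unsigned char *buf, const struct demo_req *r)
{
	put_be32(buf, r->cell_index);
	put_be32(buf + 4, r->flags);
	return DEMO_REQ_SIZE;
}

/* Decode from buf; returns 0 on success, -1 if the blob is too short. */
int demo_req_decode(const unsigned char *buf, size_t len, struct demo_req *r)
{
	if (len < DEMO_REQ_SIZE)
		return -1;
	r->cell_index = get_be32(buf);
	r->flags = get_be32(buf + 4);
	return 0;
}
```

Because the encoded form is defined byte by byte, a 32-bit caller and a 64-bit kernel, or a little-endian and a big-endian machine, all agree on it. A blob that is merely "carefully structured" but stored in host byte order gets the first property and not the second.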

> - learn from your mistake, and not do another f*cking disaster that just
> takes a pathname instead of a fd. Do something else, that actually has
> semantics and has a well-defined input and output buffer.

That sounds like you want all the pioctl functions promoted to syscalls.
Emulation through ioctl does not gain this; nor does going through xattrs -
that's just a way of doing ioctls with textual command names instead of
numbers.

> And guess which one "pioctl()" is. Just take a wild stab at it.

Well, I'd say it's more intelligent than open+ioctl...

> > fd = open("/the/target/file", O_SUPPRESS | (nofollow?O_NOFOLLOW:0));
> > ioctl(fd, cmd, &args);
> > close(fd);
>
> Yes, I think that would be better.

But it isn't sufficient to address all the cases - in which case it's
pointless.

David

2009-06-17 20:09:50

by Linus Torvalds

Subject: Re: [PATCH 00/17] [RFC] AFS: Implement OpenAFS pioctls(version)s



On Wed, 17 Jun 2009, David Howells wrote:
>
> > But if you do that new system call, then what's the point again? You're
> > back to pinfo() anyway.
>
> Ummm... pinfo()? Did you mean pioctl()?

Yes.

> > No. It's because it's another _typeless_ multiplexor.
>
> What do you mean by 'typeless'? Even the master syscall mux is typeless,
> depending on how you look at it; either that, or it's the superposition of a
> multiplicity of types selected by an arbitrary number.

No, it's not typeless, and you're just an ugly troll.

The master syscall mux is VERY WELL DEFINED, and doesn't change randomly
depending on what file you happen to look at. It's also something where
people actually spend time and effort before adding new entries, instead
of the incessant cluster-fuck that is always the random ioctl, where
subsystems add a new entry at the drop of a hat.

> Short of doing something like an XML or ASN1 structured interface, we aren't
> going to get that, and do we really want to go down that path?
>
> The difference between the syscall mux and a filesystem's ioctl/pioctl mux is
> that they both need to check on their arguments.

What kind of idiotic "arguments" are these? Neither of them is in the
least true, relevant, or anything sane at all.

Why should I bother reading any further, when you show that your emails
are full of pointless crap?

> > - learn from your mistake, and not do another f*cking disaster that just
> > takes a pathname instead of a fd. Do something else, that actually has
> > semantics and has a well-defined input and output buffer.
>
> That sounds like you want all the pioctl functions promoted to syscalls.

No.

It means that I want more structure. Stop making these things up.

More structure to make _guarantees_ that you will never need a compat
layer, for example. More structure so that people can _figure out_ what
the supported set of interfaces are, since they are going to differ for
different files. More structure so that we can see security issues without
having to know every f*cking new entry in that table. Etc etc etc.

> > And guess which one "pioctl()" is. Just take a wild stab at it.
>
> Well, I'd say it's more intelligent than open+ioctl...

Why?

So far, you're just spouting total nonsense.

Linus

2009-06-18 12:50:50

by Olivier Galibert

Subject: Re: [PATCH 00/17] [RFC] AFS: Implement OpenAFS pioctls(version)s

On Wed, Jun 17, 2009 at 08:28:29PM +0100, David Howells wrote:
> Al Viro <[email protected]> wrote:
>
> > ... so we need a syscall that would do that "drop the caches" operation.
> > _After_ having decided that it's really needed for symlinks.
>
> If you want to support disconnected operation, then you need a way to (a) lock
> an object in the cache, (b) unlock an object in the cache, (c) pull an object
> into the cache, (d) kick an object out of the cache, (e) ban an object from the
> cache, (f) reserve space in the cache for an object, (g) release the
> reservation on an object and (h) find out the lock/ban/reservation status of an
> object in the cache, and you'd need to support them for _all_ file types,
> including dirs, symlinks, dev files and fifos. Probably not UNIX sockets,
> though.

If I follow correctly, what you call "object" is "anything a name can
point to in a filesystem", and you need to be able to refer to any of
them without side effects. So, Al, what should be used to refer to
them?

OG.