> On December 3, 2001 01:54 am, Nathan Scott wrote:
> > ...BTW, we have reworked the interfaces once more and will
> > send out the latest revision in the next couple of days -
hi folks,
Here is the revised interface. I believe it takes into account
the issues raised so far - further suggestions are also welcome,
of course.
Man pages for the system calls are available from the XFS CVS tree
http://oss.sgi.com/cgi-bin/cvsweb.cgi/linux-2.4-xfs/cmd/attr2/man/
[Andreas, could you host html-ised versions at bestbits.at again?]
The interesting pages are getxattr(2), setxattr(2), listxattr(2),
removexattr(2) and attr(5), though there are several user tools
based on this interface too and a version of Andreas' POSIX ACL
tools which makes use of this interface now exists (these also have
man pages and are all available from the XFS CVS tree).
Two patches follow - the first marks syscall numbers as reserved,
the second is the proposed VFS interface. These are patches based
on the 2.5.0 tree, but should apply cleanly to any 2.5.1-preX and
2.4.16/17-preX tree. Linus - if possible, we'd really like to get
system call numbers reserved for these, or know of any aspects you
would like changed in order to make this acceptable for 2.5.
many thanks.
--
Nathan
[1st patch]
diff -Naur 2.5.0-pristine/arch/i386/kernel/entry.S 2.5.0-reserved/arch/i386/kernel/entry.S
--- 2.5.0-pristine/arch/i386/kernel/entry.S Sat Nov 3 12:18:49 2001
+++ 2.5.0-reserved/arch/i386/kernel/entry.S Tue Dec 4 11:57:32 2001
@@ -622,6 +622,18 @@
.long SYMBOL_NAME(sys_ni_syscall) /* Reserved for Security */
.long SYMBOL_NAME(sys_gettid)
.long SYMBOL_NAME(sys_readahead) /* 225 */
+ .long SYMBOL_NAME(sys_ni_syscall) /* reserved for setxattr */
+ .long SYMBOL_NAME(sys_ni_syscall) /* reserved for lsetxattr */
+ .long SYMBOL_NAME(sys_ni_syscall) /* reserved for fsetxattr */
+ .long SYMBOL_NAME(sys_ni_syscall) /* reserved for getxattr */
+ .long SYMBOL_NAME(sys_ni_syscall) /* 230 reserved for lgetxattr */
+ .long SYMBOL_NAME(sys_ni_syscall) /* reserved for fgetxattr */
+ .long SYMBOL_NAME(sys_ni_syscall) /* reserved for listxattr */
+ .long SYMBOL_NAME(sys_ni_syscall) /* reserved for llistxattr */
+ .long SYMBOL_NAME(sys_ni_syscall) /* reserved for flistxattr */
+ .long SYMBOL_NAME(sys_ni_syscall) /* 235 reserved for removexattr */
+ .long SYMBOL_NAME(sys_ni_syscall) /* reserved for lremovexattr */
+ .long SYMBOL_NAME(sys_ni_syscall) /* reserved for fremovexattr */
.rept NR_syscalls-(.-sys_call_table)/4
.long SYMBOL_NAME(sys_ni_syscall)
diff -Naur 2.5.0-pristine/include/asm-i386/unistd.h 2.5.0-reserved/include/asm-i386/unistd.h
--- 2.5.0-pristine/include/asm-i386/unistd.h Thu Oct 18 03:03:03 2001
+++ 2.5.0-reserved/include/asm-i386/unistd.h Tue Dec 4 11:58:21 2001
@@ -230,6 +230,18 @@
#define __NR_security 223 /* syscall for security modules */
#define __NR_gettid 224
#define __NR_readahead 225
+#define __NR_setxattr 226
+#define __NR_lsetxattr 227
+#define __NR_fsetxattr 228
+#define __NR_getxattr 229
+#define __NR_lgetxattr 230
+#define __NR_fgetxattr 231
+#define __NR_listxattr 232
+#define __NR_llistxattr 233
+#define __NR_flistxattr 234
+#define __NR_removexattr 235
+#define __NR_lremovexattr 236
+#define __NR_fremovexattr 237
/* user-visible error numbers are in the range -1 - -124: see <asm-i386/errno.h> */
[2nd patch]
diff -Naur 2.5.0-pristine/arch/i386/kernel/entry.S 2.5.0-xattr/arch/i386/kernel/entry.S
--- 2.5.0-pristine/arch/i386/kernel/entry.S Sat Nov 3 12:18:49 2001
+++ 2.5.0-xattr/arch/i386/kernel/entry.S Tue Dec 4 12:02:56 2001
@@ -622,6 +622,18 @@
.long SYMBOL_NAME(sys_ni_syscall) /* Reserved for Security */
.long SYMBOL_NAME(sys_gettid)
.long SYMBOL_NAME(sys_readahead) /* 225 */
+ .long SYMBOL_NAME(sys_setxattr)
+ .long SYMBOL_NAME(sys_lsetxattr)
+ .long SYMBOL_NAME(sys_fsetxattr)
+ .long SYMBOL_NAME(sys_getxattr)
+ .long SYMBOL_NAME(sys_lgetxattr) /* 230 */
+ .long SYMBOL_NAME(sys_fgetxattr)
+ .long SYMBOL_NAME(sys_listxattr)
+ .long SYMBOL_NAME(sys_llistxattr)
+ .long SYMBOL_NAME(sys_flistxattr)
+ .long SYMBOL_NAME(sys_removexattr) /* 235 */
+ .long SYMBOL_NAME(sys_lremovexattr)
+ .long SYMBOL_NAME(sys_fremovexattr)
.rept NR_syscalls-(.-sys_call_table)/4
.long SYMBOL_NAME(sys_ni_syscall)
diff -Naur 2.5.0-pristine/fs/Makefile 2.5.0-xattr/fs/Makefile
--- 2.5.0-pristine/fs/Makefile Tue Nov 13 04:34:16 2001
+++ 2.5.0-xattr/fs/Makefile Fri Nov 30 15:33:28 2001
@@ -14,7 +14,7 @@
super.o block_dev.o char_dev.o stat.o exec.o pipe.o namei.o \
fcntl.o ioctl.o readdir.o select.o fifo.o locks.o \
dcache.o inode.o attr.o bad_inode.o file.o iobuf.o dnotify.o \
- filesystems.o namespace.o seq_file.o
+ filesystems.o namespace.o seq_file.o xattr.o
ifeq ($(CONFIG_QUOTA),y)
obj-y += dquot.o
diff -Naur 2.5.0-pristine/fs/xattr.c 2.5.0-xattr/fs/xattr.c
--- 2.5.0-pristine/fs/xattr.c Thu Jan 1 10:00:00 1970
+++ 2.5.0-xattr/fs/xattr.c Tue Dec 4 12:00:49 2001
@@ -0,0 +1,346 @@
+/*
+ File: fs/xattr.c
+
+ Extended attribute handling.
+
+ Copyright (C) 2001 by Andreas Gruenbacher <[email protected]>
+ Copyright (C) 2001 SGI - Silicon Graphics, Inc <[email protected]>
+ */
+#include <linux/fs.h>
+#include <linux/slab.h>
+#include <linux/vmalloc.h>
+#include <linux/smp_lock.h>
+#include <linux/file.h>
+#include <linux/xattr.h>
+#include <asm/uaccess.h>
+
+/*
+ * Extended attribute memory allocation wrappers, originally
+ * based on the Intermezzo PRESTO_ALLOC/PRESTO_FREE macros.
+ * The vmalloc use here is very uncommon - extended attributes
+ * are supposed to be small chunks of metadata, and it is quite
+ * unusual to have very many extended attributes, so lists tend
+ * to be quite short as well. The 64K upper limit is derived
+ * from the extended attribute size limit used by XFS.
+ * Intentionally allow zero @size for value/list size requests.
+ */
+static void *
+xattr_alloc(size_t size, size_t limit)
+{
+ void *ptr;
+
+ if (size > limit)
+ return ERR_PTR(-E2BIG);
+
+ if (!size) /* size request, no buffer is needed */
+ return NULL;
+ else if (size <= PAGE_SIZE)
+ ptr = kmalloc((unsigned long) size, GFP_KERNEL);
+ else
+ ptr = vmalloc((unsigned long) size);
+ if (!ptr)
+ return ERR_PTR(-ENOMEM);
+ return ptr;
+}
+
+static void
+xattr_free(void *ptr, size_t size)
+{
+ if (!size) /* size request, no buffer was needed */
+ return;
+ else if (size <= PAGE_SIZE)
+ kfree(ptr);
+ else
+ vfree(ptr);
+}
+
+/*
+ * Extended attribute SET operations
+ */
+static long
+setxattr(struct dentry *d, char *name, void *value, size_t size, int flags)
+{
+ int error;
+ void *kvalue;
+ char kname[XATTR_NAME_MAX + 1];
+
+ error = -EINVAL;
+ if (flags & ~(XATTR_CREATE|XATTR_REPLACE))
+ return error;
+
+ error = -EFAULT;
+ if (copy_from_user(kname, name, XATTR_NAME_MAX))
+ return error;
+ kname[XATTR_NAME_MAX] = '\0';
+
+ kvalue = xattr_alloc(size, XATTR_SIZE_MAX);
+ if (IS_ERR(kvalue))
+ return PTR_ERR(kvalue);
+
+ error = -EFAULT;
+ if (size > 0 && copy_from_user(kvalue, value, size)) {
+ xattr_free(kvalue, size);
+ return error;
+ }
+
+ error = -EOPNOTSUPP;
+ if (d->d_inode->i_op && d->d_inode->i_op->setxattr) {
+ lock_kernel();
+ error = d->d_inode->i_op->setxattr(d, kname, kvalue, size, flags);
+ unlock_kernel();
+ }
+
+ xattr_free(kvalue, size);
+ return error;
+}
+
+asmlinkage long
+sys_setxattr(char *path, char *name, void *value, size_t size, int flags)
+{
+ struct nameidata nd;
+ int error;
+
+ error = user_path_walk(path, &nd);
+ if (error)
+ return error;
+ error = setxattr(nd.dentry, name, value, size, flags);
+ path_release(&nd);
+ return error;
+}
+
+asmlinkage long
+sys_lsetxattr(char *path, char *name, void *value, size_t size, int flags)
+{
+ struct nameidata nd;
+ int error;
+
+ error = user_path_walk_link(path, &nd);
+ if (error)
+ return error;
+ error = setxattr(nd.dentry, name, value, size, flags);
+ path_release(&nd);
+ return error;
+}
+
+asmlinkage long
+sys_fsetxattr(int fd, char *name, void *value, size_t size, int flags)
+{
+ struct file *f;
+ int error = -EBADF;
+
+ f = fget(fd);
+ if (!f)
+ return error;
+ error = setxattr(f->f_dentry, name, value, size, flags);
+ fput(f);
+ return error;
+}
+
+/*
+ * Extended attribute GET operations
+ */
+static long
+getxattr(struct dentry *d, char *name, void *value, size_t size)
+{
+ int error;
+ void *kvalue;
+ char kname[XATTR_NAME_MAX + 1];
+
+ error = -EFAULT;
+ if (copy_from_user(kname, name, XATTR_NAME_MAX))
+ return error;
+ kname[XATTR_NAME_MAX] = '\0';
+
+ kvalue = xattr_alloc(size, XATTR_SIZE_MAX);
+ if (IS_ERR(kvalue))
+ return PTR_ERR(kvalue);
+
+ error = -EOPNOTSUPP;
+ if (d->d_inode->i_op && d->d_inode->i_op->getxattr) {
+ lock_kernel();
+ error = d->d_inode->i_op->getxattr(d, kname, kvalue, size);
+ unlock_kernel();
+ }
+
+ if (kvalue && error > 0)
+ if (copy_to_user(value, kvalue, error))
+ error = -EFAULT;
+ xattr_free(kvalue, size);
+ return error;
+}
+
+asmlinkage long
+sys_getxattr(char *path, char *name, void *value, size_t size)
+{
+ struct nameidata nd;
+ int error;
+
+ error = user_path_walk(path, &nd);
+ if (error)
+ return error;
+ error = getxattr(nd.dentry, name, value, size);
+ path_release(&nd);
+ return error;
+}
+
+asmlinkage long
+sys_lgetxattr(char *path, char *name, void *value, size_t size)
+{
+ struct nameidata nd;
+ int error;
+
+ error = user_path_walk_link(path, &nd);
+ if (error)
+ return error;
+ error = getxattr(nd.dentry, name, value, size);
+ path_release(&nd);
+ return error;
+}
+
+asmlinkage long
+sys_fgetxattr(int fd, char *name, void *value, size_t size)
+{
+ struct file *f;
+ int error = -EBADF;
+
+ f = fget(fd);
+ if (!f)
+ return error;
+ error = getxattr(f->f_dentry, name, value, size);
+ fput(f);
+ return error;
+}
+
+/*
+ * Extended attribute LIST operations
+ */
+static long
+listxattr(struct dentry *d, char *list, size_t size)
+{
+ int error;
+ char *klist;
+
+ klist = (char *)xattr_alloc(size, XATTR_LIST_MAX);
+ if (IS_ERR(klist))
+ return PTR_ERR(klist);
+
+ error = -EOPNOTSUPP;
+ if (d->d_inode->i_op && d->d_inode->i_op->listxattr) {
+ lock_kernel();
+ error = d->d_inode->i_op->listxattr(d, klist, size);
+ unlock_kernel();
+ }
+
+ if (klist && error > 0)
+ if (copy_to_user(list, klist, error))
+ error = -EFAULT;
+ xattr_free(klist, size);
+ return error;
+}
+
+asmlinkage long
+sys_listxattr(char *path, char *list, size_t size)
+{
+ struct nameidata nd;
+ int error;
+
+ error = user_path_walk(path, &nd);
+ if (error)
+ return error;
+ error = listxattr(nd.dentry, list, size);
+ path_release(&nd);
+ return error;
+}
+
+asmlinkage long
+sys_llistxattr(char *path, char *list, size_t size)
+{
+ struct nameidata nd;
+ int error;
+
+ error = user_path_walk_link(path, &nd);
+ if (error)
+ return error;
+ error = listxattr(nd.dentry, list, size);
+ path_release(&nd);
+ return error;
+}
+
+asmlinkage long
+sys_flistxattr(int fd, char *list, size_t size)
+{
+ struct file *f;
+ int error = -EBADF;
+
+ f = fget(fd);
+ if (!f)
+ return error;
+ error = listxattr(f->f_dentry, list, size);
+ fput(f);
+ return error;
+}
+
+/*
+ * Extended attribute REMOVE operations
+ */
+static long
+removexattr(struct dentry *d, char *name)
+{
+ int error;
+ char kname[XATTR_NAME_MAX + 1];
+
+ error = -EFAULT;
+ if (copy_from_user(kname, name, XATTR_NAME_MAX))
+ return error;
+ kname[XATTR_NAME_MAX] = '\0';
+
+ error = -EOPNOTSUPP;
+ if (d->d_inode->i_op && d->d_inode->i_op->removexattr) {
+ lock_kernel();
+ error = d->d_inode->i_op->removexattr(d, kname);
+ unlock_kernel();
+ }
+ return error;
+}
+
+asmlinkage long
+sys_removexattr(char *path, char *name)
+{
+ struct nameidata nd;
+ int error;
+
+ error = user_path_walk(path, &nd);
+ if (error)
+ return error;
+ error = removexattr(nd.dentry, name);
+ path_release(&nd);
+ return error;
+}
+
+asmlinkage long
+sys_lremovexattr(char *path, char *name)
+{
+ struct nameidata nd;
+ int error;
+
+ error = user_path_walk_link(path, &nd);
+ if (error)
+ return error;
+ error = removexattr(nd.dentry, name);
+ path_release(&nd);
+ return error;
+}
+
+asmlinkage long
+sys_fremovexattr(int fd, char *name)
+{
+ struct file *f;
+ int error = -EBADF;
+
+ f = fget(fd);
+ if (!f)
+ return error;
+ error = removexattr(f->f_dentry, name);
+ fput(f);
+ return error;
+}
diff -Naur 2.5.0-pristine/include/asm-i386/unistd.h 2.5.0-xattr/include/asm-i386/unistd.h
--- 2.5.0-pristine/include/asm-i386/unistd.h Thu Oct 18 03:03:03 2001
+++ 2.5.0-xattr/include/asm-i386/unistd.h Tue Dec 4 12:03:22 2001
@@ -230,6 +230,18 @@
#define __NR_security 223 /* syscall for security modules */
#define __NR_gettid 224
#define __NR_readahead 225
+#define __NR_setxattr 226
+#define __NR_lsetxattr 227
+#define __NR_fsetxattr 228
+#define __NR_getxattr 229
+#define __NR_lgetxattr 230
+#define __NR_fgetxattr 231
+#define __NR_listxattr 232
+#define __NR_llistxattr 233
+#define __NR_flistxattr 234
+#define __NR_removexattr 235
+#define __NR_lremovexattr 236
+#define __NR_fremovexattr 237
/* user-visible error numbers are in the range -1 - -124: see <asm-i386/errno.h> */
diff -Naur 2.5.0-pristine/include/linux/fs.h 2.5.0-xattr/include/linux/fs.h
--- 2.5.0-pristine/include/linux/fs.h Fri Nov 23 06:46:19 2001
+++ 2.5.0-xattr/include/linux/fs.h Tue Dec 4 12:03:34 2001
@@ -851,6 +851,10 @@
int (*revalidate) (struct dentry *);
int (*setattr) (struct dentry *, struct iattr *);
int (*getattr) (struct dentry *, struct iattr *);
+ int (*setxattr) (struct dentry *, char *, void *, size_t, int);
+ int (*getxattr) (struct dentry *, char *, void *, size_t);
+ int (*listxattr) (struct dentry *, char *, size_t);
+ int (*removexattr) (struct dentry *, char *);
};
/*
diff -Naur 2.5.0-pristine/include/linux/limits.h 2.5.0-xattr/include/linux/limits.h
--- 2.5.0-pristine/include/linux/limits.h Thu Jul 29 03:30:10 1999
+++ 2.5.0-xattr/include/linux/limits.h Fri Nov 30 15:33:28 2001
@@ -13,6 +13,9 @@
#define NAME_MAX 255 /* # chars in a file name */
#define PATH_MAX 4095 /* # chars in a path name */
#define PIPE_BUF 4096 /* # bytes in atomic write to a pipe */
+#define XATTR_NAME_MAX 255 /* # chars in an extended attribute name */
+#define XATTR_SIZE_MAX 65536 /* size of an extended attribute value (64k) */
+#define XATTR_LIST_MAX 65536 /* size of extended attribute namelist (64k) */
#define RTSIG_MAX 32
diff -Naur 2.5.0-pristine/include/linux/xattr.h 2.5.0-xattr/include/linux/xattr.h
--- 2.5.0-pristine/include/linux/xattr.h Thu Jan 1 10:00:00 1970
+++ 2.5.0-xattr/include/linux/xattr.h Tue Dec 4 12:01:35 2001
@@ -0,0 +1,15 @@
+/*
+ File: linux/xattr.h
+
+ Extended attributes handling.
+
+ Copyright (C) 2001 by Andreas Gruenbacher <[email protected]>
+ Copyright (C) 2001 SGI - Silicon Graphics, Inc <[email protected]>
+*/
+#ifndef _LINUX_XATTR_H
+#define _LINUX_XATTR_H
+
+#define XATTR_CREATE 0x1 /* set value, fail if attr already exists */
+#define XATTR_REPLACE 0x2 /* set value, fail if attr does not exist */
+
+#endif /* _LINUX_XATTR_H */
At 03:32 05/12/01, Nathan Scott wrote:
>Here is the revised interface. I believe it takes into account
>the issues raised so far - further suggestions are also welcome,
>of course.
Hi,
Looks good to me. Just one tiny point: you seem to like setting error=xyz;
a lot which is completely unnecessary sometimes. Any particular reason?
Here is an example of what I mean:
>+static long
>+setxattr(struct dentry *d, char *name, void *value, size_t size, int flags)
>+{
>+ int error;
>+ void *kvalue;
>+ char kname[XATTR_NAME_MAX + 1];
>+
>+ error = -EINVAL;
>+ if (flags & ~(XATTR_CREATE|XATTR_REPLACE))
>+ return error;
Why not:
+ if (flags & ~(XATTR_CREATE|XATTR_REPLACE))
+ return -EINVAL;
>+
>+ error = -EFAULT;
>+ if (copy_from_user(kname, name, XATTR_NAME_MAX))
>+ return error;
+ if (copy_from_user(kname, name, XATTR_NAME_MAX))
+ return -EFAULT;
>+ kname[XATTR_NAME_MAX] = '\0';
>+
>+ kvalue = xattr_alloc(size, XATTR_SIZE_MAX);
>+ if (IS_ERR(kvalue))
>+ return PTR_ERR(kvalue);
>+
>+ error = -EFAULT;
>+ if (size > 0 && copy_from_user(kvalue, value, size)) {
>+ xattr_free(kvalue, size);
>+ return error;
>+ }
+ if (size > 0 && copy_from_user(kvalue, value, size)) {
+ xattr_free(kvalue, size);
+ return -EFAULT;
+ }
Shorter, faster in the common non-error path, and looks nicer, although the
latter is probably a matter of personal preference. (-;
Best regards,
Anton
--
"I've not lost my mind. It's backed up on tape somewhere." - Unknown
--
Anton Altaparmakov <aia21 at cam.ac.uk> (replace at with @)
Linux NTFS Maintainer / WWW: http://linux-ntfs.sf.net/
ICQ: 8561279 / WWW: http://www-stu.christs.cam.ac.uk/~aia21/
On December 5, 2001 04:32 am, Nathan Scott wrote:
> Here is the revised interface. I believe it takes into account
> the issues raised so far - further suggestions are also welcome,
> of course.
Hi Nathan,
I still don't like the class parsing inside the kernel, it's hard to see
what is good about that.
Is there a difference between these two?:
long sys_setxattr(char *path, char *name, void *value, size_t size, int flags)
long sys_lsetxattr(char *path, char *name, void *value, size_t size, int flags)
--
Daniel
On Thu, Dec 06, 2001 at 04:05:32AM +0100, Daniel Phillips wrote:
> Hi Nathan,
>
hey there.
> I still don't like the class parsing inside the kernel, it's hard to see
> what is good about that.
I guess it ultimately comes down to simplicity. The IRIX interfaces
have this separation of name and namespace - each operation has to
indicate which namespace is to be used. That becomes very messy when
you wish to work with multiple attribute names and namespaces at once.
Since the namespace is intimately tied to the name anyway, this idea
of specifying the two components together provides very clean APIs.
The term "parsing" is a bit of an overstatement too. We're talking
strncmp() complexity here, not lex/yacc. ;) And it's not clear that
you can get out of doing that level of parsing in the kernel anyway
(unless you go for a binary namespace representation, and that's a
real can of worms).
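To make the strncmp() point concrete, here is a minimal userspace sketch of
how a prefixed name could be split into its namespace and the remaining
attribute name. The enum, the prefix table and xattr_split_name() are
hypothetical names used only for illustration and are not part of the patch;
only the "user." and "system." prefixes come from this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical namespace identifiers, for illustration only. */
enum xattr_ns { XATTR_NS_UNKNOWN = 0, XATTR_NS_USER, XATTR_NS_SYSTEM };

struct xattr_prefix {
	const char *prefix;	/* includes the trailing dot */
	enum xattr_ns ns;
};

static const struct xattr_prefix xattr_prefixes[] = {
	{ "user.",   XATTR_NS_USER },
	{ "system.", XATTR_NS_SYSTEM },
};

/*
 * Match the leading "namespace." of @name with strncmp().  On success
 * *suffix points at the part after the dot.  An unrecognised prefix or
 * an empty suffix yields XATTR_NS_UNKNOWN and *suffix == NULL.
 */
static enum xattr_ns
xattr_split_name(const char *name, const char **suffix)
{
	size_t i;

	assert(name != NULL && suffix != NULL);
	for (i = 0; i < sizeof(xattr_prefixes) / sizeof(xattr_prefixes[0]); i++) {
		size_t len = strlen(xattr_prefixes[i].prefix);

		if (strncmp(name, xattr_prefixes[i].prefix, len) == 0 &&
		    name[len] != '\0') {
			*suffix = name + len;
			return xattr_prefixes[i].ns;
		}
	}
	*suffix = NULL;
	return XATTR_NS_UNKNOWN;
}
```

Supporting another namespace in this sketch means adding one row to the
table; the function signature does not change.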
> Is there a difference between these two?:
>
> long sys_setxattr(char *path, char *name, void *value, size_t size, int flags)
> long sys_lsetxattr(char *path, char *name, void *value, size_t size, int flags)
>
Yes, definitely. The easiest reason - there are filesystems which
support extended attributes on symlinks already (XFS does), coming
from other operating systems, and there should be a way to get at
that information too.
cheers.
--
Nathan
On Wed, Dec 05, 2001 at 09:08:12AM +0000, Anton Altaparmakov wrote:
> Looks good to me. Just one tiny point: you seem to like setting error=xyz;
> a lot which is completely unnecessary sometimes. Any particular reason?
No compelling reason - I've switched to your version, new patch is here:
http://oss.sgi.com/cgi-bin/cvsweb.cgi/linux-2.4-xfs/cmd/xfsmisc/xattr.patch
cheers.
--
Nathan
On December 6, 2001 06:41 am, Nathan Scott wrote:
> hey there.
>
> > I still don't like the class parsing inside the kernel, it's hard to see
> > what is good about that.
>
> I guess it ultimately comes down to simplicity. The IRIX interfaces
> have this separation of name and namespace - each operation has to
> indicate which namespace is to be used. That becomes very messy when
> you wish to work with multiple attribute names and namespaces at once.
> Since the namespace is intimately tied to the name anyway, this idea
> of specifying the two components together provides very clean APIs.
Right now we have two namespaces, user and system. That's one bit of
information, and the proposal is to represent it with 5-7 bytes, passing it
on every call, and decoding it with a memcmp or similar. This is just extra
fluff as far as I can see, and provides every bit as much opportunity for
implementing a private API as the original cmd parameter did, by encoding
whatever one pleases before the dot.
> The term "parsing" is a bit of an overstatement too. We're talking
> strncmp() complexity here, not lex/yacc. ;) And it's not clear that
> you can get out of doing that level of parsing in the kernel anyway
> (unless you go for a binary namespace representation, and that's a
> real can of worms).
I'm suggesting we take a look at that.
> > Is there a difference between these two?:
> >
> > long sys_setxattr(char *path, char *name, void *value, size_t size, int flags)
> > long sys_lsetxattr(char *path, char *name, void *value, size_t size, int flags)
> >
>
> Yes, definitely. The easiest reason - there are filesystems which
> support extended attributes on symlinks already (XFS does), coming
> from other operating systems, and there should be a way to get at
> that information too.
OK, well it looks like you're going a little overboard here in dividing out
the functionality. What you're talking about is 'follow symlink or not',
right? That really does sound to me as though it's naturally expressed with
a flag bit. I really don't see a compelling reason to go beyond 8 syscalls:
get, fget, set, fset, del, fdel, list, flist
--
Daniel
hi Daniel,
On Thu, Dec 06, 2001 at 04:25:41PM +0100, Daniel Phillips wrote:
> On December 6, 2001 06:41 am, Nathan Scott wrote:
> >
> > I guess it ultimately comes down to simplicity.
>
> Right now we have two namespaces, user and system.
Andreas and the security folk have long been investigating
"trusted" and "owner" namespaces too. See Andreas' web pages
for more discussion on those.
I see no reason to impose such arbitrary restrictions as you
seem to be suggesting - if a filesystem does not want to or
cannot implement some particular namespace, then it shouldn't
be forced to. eg. there is no reason why the user namespace
has to be implemented - Andreas' patches for ext2/ext3 go so
far as to make this namespace compile-time conditional (and
his userspace tools just continue to work - it's really quite
a nice design, IMO).
> That's one bit of
> information, and the proposal is to represent it with 5-7 bytes, passing it
> on every call, and decoding it with a memcmp or similar. This is just extra
> fluff as far as I can see,
I'd be interested in seeing exactly how you'd like to see the
interfaces changed - could you put forward some APIs?
<guessing here>
It sounds like you're suggesting a separate integer namespace
parameter to each syscall/vfs interface? I think you'll find
this solves none of the problems you're describing, and makes
every operation more complex, and more difficult. Worse, it's
a lot more open to abuse in the way the old command parameter
was than namespace-prefixed names are! (there would probably
be some free high-order bits in there where I could sneak new
functionality in, right?)
</guessing here>
But perhaps I'm misunderstanding what you're suggesting - I
should wait to see your patch.
> and provides every bit as much opportunity for
> implementing a private API as the original cmd parameter did, by encoding
> whatever one pleases before the dot.
This is just not true - the API does not change at all if a new
namespace was needed for some reason. Restricting namespaces by
using a binary namespace also doesn't help at all - the namespace
becomes obfuscated (strings are easier to grok than bitfields),
divorced from the name (which it clarifies, making life difficult
for the userspace tools) and doesn't even solve the "problem" -
someone could use new bits just as easily as use new namespace
strings.
> > (unless you go for a binary namespace representation, and that's a
> > real can of worms).
>
> I'm suggesting we take a look at that.
>
Andreas and I did have such an implementation, but we ditched it.
The CVS revision history of cmd/attr2/{set,get}fattr/*.c in the XFS
tree shows the progression of user<->kernel interfaces which I tried
while Andreas and I were nutting out a clean solution that we both
could use.
Thar be dragons thar. Big hairy ones.
> > [extended attributes on symlinks]
>
> OK, well it looks like you're going a little overboard here in dividing out
> the functionality. What you're talking about is 'follow symlink or not',
> right? That really does sound to me as though it's naturally expressed with
> a flag bit. I really don't see a compelling reason to go beyond 8 syscalls:
>
> get, fget, set, fset, del, fdel, list, flist
>
I'm not too fussed - the second draft patch I sent out did exactly
as you describe, in an attempt to cut down on syscalls. This again
meant adding a "flags" field to each operation. We also have stat/
lstat/fstat and chown/lchown/fchown - I was trying to be consistent
with those, and I still think that is the right thing to do.
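The follow/no-follow convention referred to here can be seen with the
existing stat()/lstat() pair; the proposed l*xattr calls would apply the same
rule to attributes. The sketch below uses only POSIX calls. The helper names
is_symlink_via() and follow_demo(), and the /tmp location, are assumptions
made for the illustration.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Report whether @path looks like a symlink to the chosen call.
 * follow != 0 uses stat(), which resolves a trailing symlink;
 * follow == 0 uses lstat(), which examines the link itself.
 * Returns 1 for a symlink, 0 otherwise, -1 on error (errno set).
 */
static int
is_symlink_via(const char *path, int follow)
{
	struct stat st;
	int ret;

	assert(path != NULL);
	ret = follow ? stat(path, &st) : lstat(path, &st);
	if (ret != 0)
		return -1;
	return S_ISLNK(st.st_mode) ? 1 : 0;
}

/*
 * Create "<tmpdir>/target" plus a symlink "<tmpdir>/link" pointing at
 * it, then compare the two views.  Returns 0 if stat() saw a non-link
 * and lstat() saw the link itself, -1 otherwise.
 */
static int
follow_demo(void)
{
	char dir[] = "/tmp/xattr-demo-XXXXXX";
	char target[64], linkpath[64];
	int fd, followed, unfollowed;

	if (mkdtemp(dir) == NULL)
		return -1;
	snprintf(target, sizeof(target), "%s/target", dir);
	snprintf(linkpath, sizeof(linkpath), "%s/link", dir);

	fd = open(target, O_CREAT | O_WRONLY, 0600);
	if (fd < 0) {
		rmdir(dir);
		return -1;
	}
	close(fd);
	if (symlink(target, linkpath) != 0) {
		unlink(target);
		rmdir(dir);
		return -1;
	}

	followed = is_symlink_via(linkpath, 1);
	unfollowed = is_symlink_via(linkpath, 0);

	/* Clean up before reporting the result. */
	unlink(linkpath);
	unlink(target);
	rmdir(dir);
	return (followed == 0 && unfollowed == 1) ? 0 : -1;
}
```

With a flag-bit design the same choice would move into an extra argument on
every call; with the l-prefixed variants it is carried by the call name, as
it is for lstat and lchown.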
cheers.
--
Nathan
On December 7, 2001 12:15 am, Nathan Scott wrote:
> > > (unless you go for a binary namespace representation, and that's a
> > > real can of worms).
> >
> > I'm suggesting we take a look at that.
>
> Andreas and I did have such an implementation, but we ditched it.
> The CVS revision history of cmd/attr2/{set,get}fattr/*.c in the XFS
> tree shows the progression of user<->kernel interfaces which I tried
> while Andreas and I were nutting out a clean solution that we both
> could use.
>
> Thar be dragons thar. Big hairy ones.
Could you describe them, please?
--
Daniel
On December 7, 2001 12:15 am, Nathan Scott wrote:
> > > [extended attributes on symlinks]
> >
> > OK, well it looks like you're going a little overboard here in dividing out
> > the functionality. What you're talking about is 'follow symlink or not',
> > right? That really does sound to me as though it's naturally expressed with
> > a flag bit. I really don't see a compelling reason to go beyond 8 syscalls:
> >
> > get, fget, set, fset, del, fdel, list, flist
>
> I'm not too fussed - the second draft patch I sent out did exactly
> as you describe, in an attempt to cut down on syscalls. This again
> meant adding a "flags" field to each operation. We also have stat/
> lstat/fstat and chown/lchown/fchown - I was trying to be consistent
> with those, and I still think that is the right thing to do.
It may well be, however, the one call that has flags, set, is looking a
little irregular sitting there on its own.
We're inventing an API here for which we don't have a lot of guidance from
existing unices, correct? It wouldn't hurt to really kick it around. After
all, what we settle on in Linux is likely to become the standard.
Presumably there's some existing practice at SGI, do you have a pointer to
man pages?
--
Daniel
hi Daniel,
On Fri, Dec 07, 2001 at 03:03:43AM +0100, Daniel Phillips wrote:
> On December 7, 2001 12:15 am, Nathan Scott wrote:
> > > > [extended attributes on symlinks]
> > > get, fget, set, fset, del, fdel, list, flist
> >
> > I'm not too fussed - the second draft patch I sent out did exactly
> > as you describe, in an attempt to cut down on syscalls. This again
> > meant adding a "flags" field to each operation. We also have stat/
> > lstat/fstat and chown/lchown/fchown - I was trying to be consistent
> > with those, and I still think that is the right thing to do.
>
> It may well be, however, the one call that has flags, set, is looking a
> little irregular sitting there on its own.
Not sure what to say to that ... the API is practical, flags
seem to make sense for that call (the flags give the slightly
different "set" semantics, but it is still "set"), IMO they
don't make sense for others.
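As a rough illustration of what the two flags mean for "set", here is a toy
in-memory attribute table that honours them. Only the XATTR_CREATE and
XATTR_REPLACE values are taken from the linux/xattr.h in the patch. The toy_*
names and limits are hypothetical, and the specific errno values returned for
the two failure cases are an assumption, since the patch comments only say
that the call fails.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

#define XATTR_CREATE	0x1	/* set value, fail if attr already exists */
#define XATTR_REPLACE	0x2	/* set value, fail if attr does not exist */

#define TOY_MAX_ATTRS	8
#define TOY_NAME_MAX	255
#define TOY_VALUE_MAX	64

/* Hypothetical in-memory attribute table standing in for a filesystem. */
struct toy_attr {
	int used;
	char name[TOY_NAME_MAX + 1];
	char value[TOY_VALUE_MAX];
	size_t size;
};

struct toy_inode {
	struct toy_attr attrs[TOY_MAX_ATTRS];
};

static struct toy_attr *
toy_find(struct toy_inode *ino, const char *name)
{
	int i;

	for (i = 0; i < TOY_MAX_ATTRS; i++)
		if (ino->attrs[i].used && strcmp(ino->attrs[i].name, name) == 0)
			return &ino->attrs[i];
	return NULL;
}

/* Returns 0 on success or a negative errno, as the VFS methods do. */
static int
toy_setxattr(struct toy_inode *ino, const char *name,
	     const void *value, size_t size, int flags)
{
	struct toy_attr *a;
	int i;

	assert(ino != NULL && name != NULL);
	if (flags & ~(XATTR_CREATE | XATTR_REPLACE))
		return -EINVAL;
	if (strlen(name) > TOY_NAME_MAX || size > TOY_VALUE_MAX)
		return -E2BIG;

	a = toy_find(ino, name);
	if (a && (flags & XATTR_CREATE))
		return -EEXIST;		/* assumed errno: already exists */
	if (!a && (flags & XATTR_REPLACE))
		return -ENODATA;	/* assumed errno: nothing to replace */

	if (!a) {
		/* Take the first free slot for a new attribute. */
		for (i = 0; i < TOY_MAX_ATTRS && ino->attrs[i].used; i++)
			;
		if (i == TOY_MAX_ATTRS)
			return -ENOSPC;
		a = &ino->attrs[i];
		a->used = 1;
		strcpy(a->name, name);
	}
	if (size)
		memcpy(a->value, value, size);
	a->size = size;
	return 0;
}
```

With flags == 0 the call creates or overwrites as needed; the two flags only
narrow that behaviour, which is why they belong to "set" and have no obvious
counterpart on get, list or remove.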
> We're inventing an API here for which we don't have a lot of guidance from
> existing unices, correct?
No. Many existing versions of Unix support extended attributes
in some form or another, but there is no common API/standard -
each implementation differs, sometimes radically, from the others.
> It wouldn't hurt to really kick it around.
Please read the archives first - discussion started well over a
year ago now on an API for Linux, and there have been heaps and
heaps of ideas, proposals, prototypes, etc, floated.
The lack of progress on getting something in the kernel is
hurting several projects (even having syscalls reserved would
be a _huge_ help to us). We have distributors telling us they
would include XFS in their kernels if there was some progress
on this particular issue.
> After all, what we settle on in Linux is likely to become the standard.
Mmm.. I seriously doubt there could ever be any standards in this
area - I would be satisfied with a good implementation on Linux,
which allows filesystems from different operating systems to use
it while preserving any existing on-disk formats they may have.
>
> Presumably there's some existing practice at SGI, do you have a pointer to
> man pages?
>
Start with the mail we sent to Linus, LKML, fs-devel, etc about a
month ago - it had pointers to the original discussion from this
time last year (and in that discussion, Andreas provided pointers
to documentation for several other implementations, incl. BSD,
IRIX, Tru64, & others). It also has pointers to several projects
relying on the existing, diverging Linux implementations.
It should probably be pointed out again that many of the folk
working on filesystems which support extended attributes have
given their collective thumbs up to the latest round of patches.
In particular, the projects which have already implemented EAs
(XFS, ext2/ext3) and services above EAs, are confident that
these patches will work for them and that we'll be able to get
our userspace code working together for the first time.
cheers.
--
Nathan
Hi,
On Wed, Dec 05, 2001 at 02:32:10PM +1100, Nathan Scott wrote:
> Here is the revised interface. I believe it takes into account
> the issues raised so far - further suggestions are also welcome,
> of course.
This is looking OK as far as EAs go. However, there is still no
mention of ACLs specifically, except an oblique reference to
"system.posix_acl_access".
Is there no consensus on this? In previous proposals we've at least
tried to deal with it to some extent.
Cheers,
Stephen
On Fri, Dec 07, 2001 at 08:20:36PM +0000, Stephen C. Tweedie wrote:
> Hi,
>
hi Stephen,
> This is looking OK as far as EAs go. However, there is still no
> mention of ACLs specifically, except an oblique reference to
> "system.posix_acl_access".
Yup - there's little mention of ACLs because they are only an
optional, higher-level consumer of the API, & so didn't seem
appropriate to document here.
We have implemented POSIX ACLs above this interface - there
is source to new versions of Andreas' user tools here:
http://oss.sgi.com/cgi-bin/cvsweb.cgi/linux-2.4-xfs/cmd/acl2
These have been tested with XFS and seem to work fine, so we
are ready to transition over from our old implementation to
this new one.
In a way there's consensus wrt how to do POSIX ACLs on Linux
now, as both the ext2/ext3 and XFS ACL projects will be using
the same tools, libraries, etc. In terms of other ACL types,
I don't know of anyone actively working on any.
The existence of a POSIX ACL implementation using attributes
system.posix_acl_access and system.posix_acl_default doesn't
preclude other types of ACLs from being implemented (obviously
using different attributes) as well of course, if someone had
an itch to scratch.
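For illustration, a tool that only cares whether a file carries POSIX ACLs
could scan the name list for those two attributes. The sketch below assumes
the list layout described in listxattr(2), that is, NUL-terminated names
stored one after another; that detail is not spelled out in this thread. The
function name and the HAS_ACL_* bits are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define ACL_ACCESS_NAME		"system.posix_acl_access"
#define ACL_DEFAULT_NAME	"system.posix_acl_default"

/* Hypothetical result bits, for illustration only. */
#define HAS_ACL_ACCESS	0x1
#define HAS_ACL_DEFAULT	0x2

/*
 * Walk a buffer of back-to-back NUL-terminated attribute names and
 * report which of the two POSIX ACL attributes appear in it.  A
 * trailing name without its terminating NUL is ignored.
 */
static int
acl_attrs_present(const char *list, size_t len)
{
	const char *p, *end;
	int found = 0;

	assert(list != NULL || len == 0);
	if (len == 0)
		return 0;

	p = list;
	end = list + len;
	while (p < end) {
		const char *nul = memchr(p, '\0', (size_t)(end - p));

		if (nul == NULL)	/* truncated final entry */
			break;
		if (strcmp(p, ACL_ACCESS_NAME) == 0)
			found |= HAS_ACL_ACCESS;
		else if (strcmp(p, ACL_DEFAULT_NAME) == 0)
			found |= HAS_ACL_DEFAULT;
		p = nul + 1;
	}
	return found;
}
```

Other ACL types using different attribute names would simply not match here,
which is the sense in which the POSIX ACL attributes do not preclude them.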
cheers.
--
Nathan
Nathan Scott wrote:
>
>
>In a way there's consensus wrt how to do POSIX ACLs on Linux
>now, as both the ext2/ext3 and XFS ACL projects will be using
>the same tools, libraries, etc. In terms of other ACL types,
>I don't know of anyone actively working on any.
>
>
We are taking a very different approach to EAs (and thus to ACLs) as
described in brief at http://www.namesys.com/v4/v4.html. We don't expect
anyone to take us seriously on it before it works, but silence while
coding does not equal consensus.;-)
In essence, we think that if a file can't do what an EA can do, then you
need to make files able to do more.
It is very important not to reduce the amount of closure (as in
mathematical closure) within the namespace, and creating EAs that cannot
be accessed as files reduces closure.
The same argument applies to streams, but it is kind of interesting to
see people argue against streams for this reason, and then embrace EAs.
Kind of leaves you wondering whether their hatred of streams was really
any deeper than streams aren't what they are used to from Unix.
Hans
Hi,
On Sat, Dec 08, 2001 at 03:58:41PM +1100, Nathan Scott wrote:
> On Fri, Dec 07, 2001 at 08:20:36PM +0000, Stephen C. Tweedie wrote:
>
> > This is looking OK as far as EAs go. However, there is still no
> > mention of ACLs specifically, except an oblique reference to
> > "system.posix_acl_access".
>
> Yup - there's little mention of ACLs because they are only an
> optional, higher-level consumer of the API, & so didn't seem
> appropriate to document here.
Unfortunately, if there are many filesystems wanting to use posix
ACLs, then standardising the API is still desirable.
> We have implemented POSIX ACLs above this interface - there
> is source to new versions of Andreas' user tools here:
> http://oss.sgi.com/cgi-bin/cvsweb.cgi/linux-2.4-xfs/cmd/acl2
> These have been tested with XFS and seem to work fine, so we
> are ready to transition over from our old implementation to
> this new one.
But the ACL encoding is still hobbled: there's no namespace for
credentials other than uid/gid. This has been brought up before, but
it's worth going over some of the things we'd like to be able to do
with extended credentials again:
* NFSv4.
NFSv4 credentials are of the form "user@realm", and an NFSv4 server
needs to be able to apply ACLs using such credentials so that it can
securely serve users in foreign realms.
* Kerberos single-signon.
I want to be able to get a kerberos login ticket on the desktop in
front of me and access files in my entire organisation securely. I
want to be able to login to remote systems in different departments
and still have ACLs work. So "[email protected]" might login to a
machine in the "DEVEL.CO.COM", and would only get a "guest" uid, but
the ACL system would allow access based on the full "[email protected]"
credentials.
* Samba.
Is there any reason not to allow an NT SID to be used as the
credential for an ACL?
* Sub-IDs.
There was a beautiful paper presented at a recent Usenix in which the
concept of user-manageable sub-ids was presented. I am on a secure
intranet, but I'm constantly accessing untrusted data. Every time
Mozilla accesses a web site I am potentially vulnerable to web
rendering bugs which could allow a site to take over my machine.
Plugins such as flash just make the matter worse. Even in the home
environment we'd like to make it easy to allow multiuser games to be
run without compromising the whole local system.
The sub-id concept proposes allowing users to create process groups
with restricted rights to the system. I would _really_ like to give
Mozilla write access to ~/tmp and ~/.mozilla, but not to the rest of
my homedir. Can't I use a "sct/mozilla" credential for my ACLs?
Authentication is about *much* more than just local uid/gids, but the
current EA/ACL specs are creating an implicit standard for ACLs
without addressing any of these concerns.
> The existence of a POSIX ACL implementation using attributes
> system.posix_acl_access and system.posix_acl_default doesn't
> preclude other types of ACLs from being implemented (obviously
> using different attributes) as well of course, if someone had
> an itch to scratch.
I am not talking about other types of ACLs! I am talking about
*POSIX* ACLs, but using a credentials namespace which is more than
just uid/gid. Only the credentials change: the rest of the POSIX
semantics still apply. The CITI NFSv4 implementation is already doing
POSIX ACLs and GSSAPI krb5 authentication on top of the bestbits API,
so we already have at least one application ready and waiting to use
such an extension.
Cheers,
Stephen
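To make the request above concrete: an ACL entry whose qualifier carries its own namespace might look roughly like this. It is a sketch only. The type names, the set of namespaces and the comparison rule are assumptions for illustration and are not taken from the bestbits or XFS code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical credential namespaces, one per example in the message. */
enum qual_ns {
    QUAL_UID,        /* local numeric uid */
    QUAL_GID,        /* local numeric gid */
    QUAL_PRINCIPAL,  /* "user@realm" (NFSv4, Kerberos) */
    QUAL_NT_SID,     /* textual NT SID (Samba) */
    QUAL_SUBID       /* user-managed sub-id such as "sct/mozilla" */
};

struct acl_qualifier {
    enum qual_ns ns;
    union {
        uint32_t id;       /* used by QUAL_UID and QUAL_GID */
        const char *name;  /* used by every other namespace */
    } u;
};

/*
 * Two qualifiers match only if they live in the same namespace and
 * carry the same value. The same string in two different namespaces
 * is deliberately NOT a match.
 */
static int qualifier_matches(const struct acl_qualifier *a,
                             const struct acl_qualifier *b)
{
    assert(a != NULL && b != NULL);
    if (a->ns != b->ns)
        return 0;
    if (a->ns == QUAL_UID || a->ns == QUAL_GID)
        return a->u.id == b->u.id;
    return strcmp(a->u.name, b->u.name) == 0;
}
```

Only the qualifier changes in this picture; the permission bits and the evaluation order of entries would stay as POSIX defines them.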
On Mon, Dec 10, 2001 at 11:52:09AM +0000, Stephen C. Tweedie wrote:
> * Sub-IDs.
>
> There was a beautiful paper presented at a recent Usenix in which the
> concept of user-manageable sub-ids was presented.
Stephen, Do you have a ref for that?
Thanks!
Hi,
On Mon, Dec 10, 2001 at 08:00:03AM -0700, Peter J. Braam wrote:
> On Mon, Dec 10, 2001 at 11:52:09AM +0000, Stephen C. Tweedie wrote:
>
> > * Sub-IDs.
> >
> > There was a beautiful paper presented at a recent Usenix in which the
> > concept of user-manageable sub-ids was presented.
>
> Stephen, Do you have a ref for that?
http://www.usenix.org/publications/library/proceedings/usenix01/freenix01/ioannidis.html
Cheers,
Stephen
Hello Stephen , Is this the only attribution ?
Just love those 'we won't share security info with you unless you
are member or pay.' . Sorry , JimL
On Mon, 10 Dec 2001, Stephen C. Tweedie wrote:
> Hi,
>
> On Mon, Dec 10, 2001 at 08:00:03AM -0700, Peter J. Braam wrote:
> > On Mon, Dec 10, 2001 at 11:52:09AM +0000, Stephen C. Tweedie wrote:
> >
> > > * Sub-IDs.
> > >
> > > There was a beautiful paper presented at a recent Usenix in which the
> > > concept of user-manageable sub-ids was presented.
> >
> > Stephen, Do you have a ref for that?
>
> http://www.usenix.org/publications/library/proceedings/usenix01/freenix01/ioannidis.html
>
> Cheers,
> Stephen
+------------------------------------------------------------------+
| James W. Laferriere | System Techniques | Give me VMS |
| Network Engineer | P.O. Box 854 | Give me Linux |
| [email protected] | Coudersport PA 16915 | only on AXP |
+------------------------------------------------------------------+
Hi,
On Mon, Dec 10, 2001 at 11:00:06AM -0500, Mr. James W. Laferriere wrote:
>
> Hello Stephen , Is this the only attribution ?
> Just love those 'we won't share security info with you unless you
> are member or pay.' . Sorry , JimL
There are other references in the paper: I've appended them below.
One, in particular, seems to talk about quite similar concepts:
http://www.usenix.org/publications/library/proceedings/sec2000/acharya.html
Cheers,
Stephen
[1] NJS JavaScript Interpreter. http://www.bbassett.net/njs/.
[2] The OpenBSD Operating System. http://www.openbsd.org/.
[3] World Wide Web Consortium. http://www.w3.org/.
[4] Anurag Acharya and Mandar Raje. Mapbox: Using parameterized
behavior classes to confine applications. In Proceedings of the 2000
USENIX Security Symposium, pages 1-17, Denver, CO, August 2000.
[5] Andrew Berman, Virgil Bourassa, and Erik Selberg. TRON:
Process-Specific File Protection for the UNIX Operating System. In
USENIX 1995 Technical Conference, New Orleans, Louisiana, January
1995.
[6] David Flanagan. JavaScript: The Definitive Guide. O'Reilly, 1998.
[7] Tim Fraser, Lee Badger, and Mark Feldman. Hardening COTS Software
with Generic Software Wrappers. In Proceedings of the IEEE Symposium
on Security and Privacy, Oakland, CA, May 1999.
[8] Ian Goldberg, David Wagner, Randi Thomas, and Eric A. Brewer. A
Secure Environment for Untrusted Helper Applications. In USENIX 1996
Technical Conference, 1996.
[9] Li Gong. Inside Java 2 Platform Security. Addison-Wesley, 1999.
[10] James Gosling, Bill Joy, and Guy Steele. The Java Language
Specification. Addison Wesley, Reading, 1996.
[11] http://www.cert.org/advisories/.
[12] Sotiris Ioannidis and Steven M. Bellovin. Sub-Operating Systems: A
New Approach to Application Security. Technical Report MS-CIS-01-06,
University of Pennsylvania, February 2000.
[13] R. Kaplan. SUID and SGID Based Attacks on UNIX: a Look at One Form
of the Use and Abuse of Privileges. Computer Security Journal,
9(1):73-7, 1993.
[14] Jacob Y. Levy, Laurent Demailly, John K. Ousterhout, and Brent
B. Welch. The Safe-Tcl Security Model. In USENIX 1998 Annual Technical
Conference, New Orleans, Louisiana, June 1998.
[15] Gary McGraw and Edward W. Felten. Java Security: hostile applets,
holes and antidotes. Wiley, New York, NY, 1997.
[16] G. C. Necula and P. Lee. Safe, Untrusted Agents using
Proof-Carrying Code. In Lecture Notes in Computer Science Special
Issue on Mobile Agents, October 1997.
[17] Dan S. Wallach, Dirk Balfanz, Drew Dean, and Edward
W. Felten. Extensible Security Architectures for Java. In Proceedings
of the 16th ACM Symposium on Operating Systems Principles, October
1997.
James> Hello Stephen , Is this the only attribution ?
Probably.
James> Just love those 'we won't share security info with you unless
James> you are member or pay.' . Sorry , JimL
Excuse me? This is a paper presented at a conference, not a security
bug report in existing code. I can totally understand having to pay
for proceedings from a conference.
John
On Mon, Dec 10, 2001 at 11:52:09AM +0000, Stephen C. Tweedie wrote:
> On Sat, Dec 08, 2001 at 03:58:41PM +1100, Nathan Scott wrote:
> > On Fri, Dec 07, 2001 at 08:20:36PM +0000, Stephen C. Tweedie wrote:
> >
> > > This is looking OK as far as EAs go. However, there is still no
> > > mention of ACLs specifically, except an oblique reference to
> > > "system.posix_acl_access".
> >
> > Yup - there's little mention of ACLs because they are only an
> > optional, higher-level consumer of the API, & so didn't seem
> > appropriate to document here.
>
> Unfortunately, if there are many filesystems wanting to use posix
> ACLs, then standardising the API is still desirable.
True.
>
> > We have implemented POSIX ACLs above this interface - there
> > is source to new versions of Andreas' user tools here:
> > http://oss.sgi.com/cgi-bin/cvsweb.cgi/linux-2.4-xfs/cmd/acl2
> > These have been tested with XFS and seem to work fine, so we
> > are ready to transition over from our old implementation to
> > this new one.
>
> But the ACL encoding is still hobbled: there's no namespace for
> credentials other than uid/gid. This has been brought up before, but
> it's worth going over some of the things we'd like to be able to do
> with extended credentials again:
>
[credential examples deleted]
>
> Authentication is about *much* more than just local uid/gids, but the
> current EA/ACL specs are creating an implicit standard for ACLs
> without addressing any of these concerns.
>
> > The existence of a POSIX ACL implementation using attributes
> > system.posix_acl_access and system.posix_acl_default doesn't
> > preclude other types of ACLs from being implemented (obviously
> > using different attributes) as well of course, if someone had
> > an itch to scratch.
>
> I am not talking about other types of ACLs! I am talking about
> *POSIX* ACLs, but using a credentials namespace which is more than
> just uid/gid. Only the credentials change: the rest of the POSIX
> semantics still apply. The CITI NFSv4 implementation is already doing
> POSIX ACLs and GSSAPI krb5 authentication on top of the bestbits API,
> so we already have at least one application ready and waiting to use
> such an extension.
>
So you are particularly interested in more general "qualifiers"
(in posix acl entry speak:).
Some people are also interested in more general "permissions" for ACEs.
Could this not be catered for independent of the proposed EA interface
for getting/setting/removing EAs ?
One could come up with more general data structures and functions
for ACLs/ACEs than what we currently propose,
and yet still use the same EA interface.
--Tim
On Mon, Dec 10, 2001 at 11:52:09AM +0000, Stephen C. Tweedie wrote:
> Hi,
>
hi there Stephen.
> On Sat, Dec 08, 2001 at 03:58:41PM +1100, Nathan Scott wrote:
> > On Fri, Dec 07, 2001 at 08:20:36PM +0000, Stephen C. Tweedie wrote:
> >
> > > This is looking OK as far as EAs go. However, there is still no
> > > mention of ACLs specifically, except an oblique reference to
> > > "system.posix_acl_access".
> >
> > Yup - there's little mention of ACLs because they are only an
> > optional, higher-level consumer of the API, & so didn't seem
> > appropriate to document here.
>
> Unfortunately, if there are many filesystems wanting to use posix
> ACLs, then standardising the API is still desirable.
>
Yes, absolutely. That is in fact a large driving force behind
this effort to get a common EA and POSIX ACL API, and we are now
for the first time at a point where we have multiple filesystems
(xfs, ext2, and ext3) sharing the same API. The history went a
bit like this:
- an implementation of POSIX ACLs was written for ext2 and ext3
by Andreas;
- an implementation of POSIX ACLs was ported for XFS (at the time,
Andreas' implementation didn't allow us to use our pre-existing
on-disk format from IRIX)
- Andreas made attempt #1 to get a system call interface agreed on
over a year ago now. He incorporated several people's suggestions,
but eventually the discussion got sidetracked, died and nothing
further happened;
- We were all _really_ hoping for something to come out of that,
so we could then "standardise" on the various APIs involved;
- [time passes, much pain is felt by lots of users - the patches
have to continually track new kernels where the syscall table
changes frequently break the user/kernel interface, affecting
an increasing number of userspace applications]
- After about a year of this, Andi gives us a kick in the pants,
we contact Andreas and make a renewed effort at producing an API
that we all can share.
- Several iterations later, we have an initial implementation
(which is not filesystem-specific, for the first time)
- We made attempt #2 to get system call and VFS interfaces agreed
on by posting to Linus, Al, various lists. We incorporate all
the suggestions that we think make sense, and push out several
iterations of the patches.
- We are all _really_ hoping for something to come out of this,
so that we can "standardise" on the various APIs involved;
- ...?
> > We have implemented POSIX ACLs above this interface -
>
> But the ACL encoding is still hobbled: ...
I have been on the acl-devel mailing list for a long time now,
and while these features all sound like good ideas or interesting
projects, I have never seen anyone post a patch or request any
specific changes to Andreas' ACL encoding in that time.
It seems to me that the relatively simple implementation which
Andreas has done is a good starting point (it has been used in
production for a long time now).
His POSIX ACL encoding has a version field in it, so if/when some
people step forward to implement these features you've described,
and if they require changes to the format, then there should be no
reason they can't do it cleanly and in a filesystem-independent
manner, right? And if you do have reasons, it's high time you sent
Andreas some patches! ;-)
Seriously though, from an XFS point of view, Andreas' current
implementation is simple and meets all of our needs, he does a
really good job of maintaining the code and is very responsive
on the acl-devel list and to questions from us XFS folk, so we
are quite happy to use his as the initial filesystem-independent
implementation of POSIX ACLs for Linux.
cheers.
--
Nathan
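The argument that a version field lets the encoding evolve can be sketched as follows. The layout here (a little-endian 32-bit version word followed by fixed 8-byte entries) and all the SKETCH_ names are assumptions chosen for illustration; the thread does not spell out Andreas' actual format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: version word, then zero or more entries. */
#define SKETCH_ACL_VERSION  1u
#define SKETCH_HEADER_SIZE  4u  /* uint32 version */
#define SKETCH_ENTRY_SIZE   8u  /* uint16 tag, uint16 perm, uint32 id */

enum sketch_acl_status {
    SKETCH_ACL_OK = 0,
    SKETCH_ACL_TRUNCATED = -1,
    SKETCH_ACL_BAD_VERSION = -2,
    SKETCH_ACL_BAD_LENGTH = -3
};

/* Read a little-endian 32-bit value regardless of host byte order. */
static uint32_t read_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Validate a raw attribute value and report how many entries follow
 * the header. An unknown version is rejected rather than guessed at,
 * which is what lets a later format coexist with this one.
 */
static int sketch_acl_count(const void *value, size_t size, size_t *count)
{
    const unsigned char *p = value;

    assert(count != NULL);
    if (p == NULL || size < SKETCH_HEADER_SIZE)
        return SKETCH_ACL_TRUNCATED;
    if (read_le32(p) != SKETCH_ACL_VERSION)
        return SKETCH_ACL_BAD_VERSION;
    if ((size - SKETCH_HEADER_SIZE) % SKETCH_ENTRY_SIZE != 0)
        return SKETCH_ACL_BAD_LENGTH;
    *count = (size - SKETCH_HEADER_SIZE) / SKETCH_ENTRY_SIZE;
    return SKETCH_ACL_OK;
}
```

A reader that sees a newer version can fail cleanly or fall back, instead of misparsing entries whose qualifier is no longer a plain uid or gid.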
hi Hans,
On Sat, Dec 08, 2001 at 11:17:21PM +0300, Hans Reiser wrote:
> Nathan Scott wrote:
> >
> >In a way there's consensus wrt how to do POSIX ACLs on Linux
> >now, as both the ext2/ext3 and XFS ACL projects will be using
> >the same tools, libraries, etc. In terms of other ACL types,
> >I don't know of anyone actively working on any.
> >
> We are taking a very different approach to EAs (and thus to ACLs) as
> described in brief at http://www.namesys.com/v4/v4.html. We don't expect
> anyone to take us seriously on it before it works, but silence while
> coding does not equal consensus.;-)
>
> In essence, we think that if a file can't do what an EA can do, then you
> need to make files able to do more.
We did read through your page awhile ago. It wasn't clear to me
how you were addressing Anton's questions here:
http://marc.theaimsgroup.com/?l=linux-fsdevel&m=97260371413867&w=2
(I couldn't find a reply in the archive, but may have missed it).
We were concentrating on something that could be fs-independent,
so the lack of answers there put us off a bit, and the dependence
on a reiser4() syscall is pretty filesystem-specific too (I guess
if your solution is intended to be a reiserfs-specific one, then
the questions above are meaningless).
I was curious on another thing also - in the section titled
``The Usual Resolution Of These Flaws Is A One-Off Solution'',
talking about security attributes interfaces, your page says:
"Linus said that we can have a system call to use as our
experimental plaything in this. With what I have in mind for the
API, one rather flexible system call is all we want..."
How did you manage to get him to say that? We were flamed for
suggesting a syscall which multiplexed all extended attributes
commands through the one interface (because its semantics were
not clearly defined & it could be extended with new commands,
like ioctl/quotactl/...), and we've also had no luck so far in
getting either our original interface or any of the revised syscall
interfaces (which aren't like that anymore) accepted by Linus.
many thanks.
--
Nathan
Hi,
On Tue, Dec 11, 2001 at 12:22:58PM +1100, Timothy Shimmin wrote:
> Could this not be catered for independent of the proposed EA interface
> for getting/setting/removing EAs ?
Definitely. The whole problem I pointed out with the EA interface was
that it didn't talk about ACLs at all. So, sure, it gives us an API
for arbitrary EAs, but it does absolutely nothing to help us unify ACL
APIs. In effect it is far _too_ extensible: we need to have some
agreement on how it can be used if the different ACL applications are
to have any hope of working together.
The bright point is that this can be done reasonably well in user
space, if we're careful (but we still need to worry about exactly how
the kernel will deal with validating ACE chains --- we need to specify
whether EAs in the system namespace are expected to be stored verbatim
or whether the filesystem is permitted to interpret their semantics
intelligently.)
Cheers,
Stephen
I respond below.
I didn't see that email, probably because I was not on the cc list.
Nathan Scott wrote:
>hi Hans,
>
>On Sat, Dec 08, 2001 at 11:17:21PM +0300, Hans Reiser wrote:
>
>>Nathan Scott wrote:
>>
>>>In a way there's consensus wrt how to do POSIX ACLs on Linux
>>>now, as both the ext2/ext3 and XFS ACL projects will be using
>>>the same tools, libraries, etc. In terms of other ACL types,
>>>I don't know of anyone actively working on any.
>>>
>>We are taking a very different approach to EAs (and thus to ACLs) as
>>described in brief at http://www.namesys.com/v4/v4.html. We don't expect
>>anyone to take us seriously on it before it works, but silence while
>>coding does not equal consensus.;-)
>>
>>In essence, we think that if a file can't do what an EA can do, then you
>>need to make files able to do more.
>>
>
>We did read through your page awhile ago. It wasn't clear to me
>how you were addressing Anton's questions here:
>http://marc.theaimsgroup.com/?l=linux-fsdevel&m=97260371413867&w=2
>(I couldn't find a reply in the archive, but may have missed it).
>
>We were concentrating on something that could be fs-independent,
>so the lack of answers there put us off a bit, and the dependence
>on a reiser4() syscall is pretty filesystem-specific too (I guess
>if your solution is intended to be a reiserfs-specific one, then
>the questions above are meaningless).
>
Changing the name of the system call is not a biggie. Our approach is
to make it work for reiserfs, then proselytize. While we work, we let
people know what we are working on, and if they join in, great to have
it work for more than one FS.
>
>
>I was curious on another thing also - in the section titled
>``The Usual Resolution Of These Flaws Is A One-Off Solution'',
>talking about security attributes interfaces, your page says:
>
>"Linus said that we can have a system call to use as our
>experimental plaything in this. With what I have in mind for the
>API, one rather flexible system call is all we want..."
>
>How did you manage to get him to say that? We were flamed for
>suggesting a syscall which multiplexed all extended attributes
>commands through the one interface (because its semantics were
>not clearly defined & it could be extended with new commands,
>like ioctl/quotactl/...), and we've also had no luck so far in
>getting either our original interface or any of the revised syscall
>interfaces (which aren't like that anymore) accepted by Linus.
>
We expect to get flamed once we have a patch.;-) When we have something
mature enough to be usable, I expect he'll find a lot that could be made
better. He does that.;-)
For us, there are semantic advantages to having a single system call.
Probably it will get a lot of argument once we have working code, and
frankly I prefer to have that argument only after it is something usable,
and it is easy to see the convenience of expression that comes from it.
We want Linux to be MORE expressive than BeOS in regards to files.
>
>many thanks.
>
Curtis Anderson wrote:
> > The problem with streams-style attributes comes from stepping onto the
> > slippery slope of trying to put too much generality into it. I chose the
> > block-access style of API so that there would be no temptation to start
> > down that slope.
>
>I understand you right up until this. I just don't get it. If you extend
>the functionality of files and directories so that attributes are not
>needed, this is goodness, right? I sure think it is the right
>approach. We should just decompose carefully what functionality is
>provided by attributes that files and directories lack, and one feature at
>a time add that capability to files and directories as separate optional
>features.
No, it is _not_ goodness, IMHO. - If you did implement the API for
attributes through files and directories, then what would you do with
named streams?!?
Hans Reiser wrote:
What is your intended functional difference between extended attributes
and streams? None?
Ok, let's assume none until I get your response. (I can respond more
specifically after you correct me.) Let me further go out on a limb, and
guess that you intend that extended attributes are meta-information about
the object, and streams are contained within the object.
In this case, a naming convention is quite sufficient to distinguish them.
Extended attributes can have names of the form filenameA/..extone.
Streams can have names of the form filenameA/streamone.
In other words, all meta-information about an object should by convention
(and only by convention, because people should live free, and because
there is not always an obvious distinction between meta and contained
information) be preceded by '..'
Note that readdir should return neither stream names nor extended attribute names,
and the use of 'hidden' directory entries accomplishes this (ala .snapshot
for WAFL).
Curtis:
You can't possibly have both using the same API since you would then get
name collision on filesystems where both named streams and EAs are
supported.
Name distinctions are what you use to avoid name collisions, see above.
Curtis:
(And I haven't even mentioned EAs and named streams attached to
actual _real_ directories yet.)
I don't understand this.
Curtis:
Let's face it: EAs exist. They are _not_ files/directories so the API
Is this an argument?
EA's do not exist in Linux, and they should never exist as something that
is more than a file. Since they do not exist, you might as well improve
the filesystems you port to Linux while porting them. APIs shape an OS
over the long term, and if done wrong they burden generations after you
with crud.
Curtis:
should not make them appear as files/directories. - You have to consider
that there are a lot of filesystems out there which are already developed
and which need to be supported. - Not everyone has their own filesystem
which they can change/extend the specifications/implementation of at will.
Yes they do. It is all GPL'd. Even XFS. Do the underlying infrastructure
the right way, and I bet you'll be surprised at how little need there really
is for ea's done the wrong way. A user space library can cover
over it all (causing only the obsolete programs using it to suffer while they
wait to fade away).
What would have happened if set theory had not just sets and elements,
but sets, elements, extended-attributes, and streams, and you could not
use the same operators on streams that you use on elements? It would
have been crap as a theoretical model. It does real damage when you add
things that require different operators to the set of primitives.
Closure is extremely important to design. Don't do this.
Hans
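The naming convention proposed earlier in this message (meta-information under names beginning with '..', contained streams under ordinary names) is easy to state in code. The sketch below is only an illustration of that convention; the enum and function names are invented, and nothing here is reiserfs code.

```c
#include <assert.h>
#include <string.h>

enum component_kind {
    COMPONENT_PARENT,    /* exactly ".."  : the usual parent directory */
    COMPONENT_META,      /* "..something" : meta-information by convention */
    COMPONENT_CONTAINED  /* anything else : a stream or ordinary entry */
};

/* Classify a single name within an object, e.g. "..extone". */
static enum component_kind classify_component(const char *name)
{
    assert(name != NULL);
    if (strcmp(name, "..") == 0)
        return COMPONENT_PARENT;
    if (strncmp(name, "..", 2) == 0)
        return COMPONENT_META;
    return COMPONENT_CONTAINED;
}

/* Classify the last component of a path such as "filenameA/..extone". */
static enum component_kind classify_path(const char *path)
{
    const char *slash;

    assert(path != NULL);
    slash = strrchr(path, '/');
    return classify_component(slash ? slash + 1 : path);
}
```

Because the distinction lives entirely in the name, the same open/read/write operators apply to both kinds of object, which is the closure property the message argues for. Note that a bare ".." still has to mean the parent directory.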
At 11:33 11/12/01, Stephen C. Tweedie wrote:
>On Tue, Dec 11, 2001 at 12:22:58PM +1100, Timothy Shimmin wrote:
> > Could this not be catered for independent of the proposed EA interface
> > for getting/setting/removing EAs ?
>
>Definitely. The whole problem I pointed out with the EA interface was
>that it didn't talk about ACLs at all. So, sure, it gives us an API
>for arbitrary EAs, but it does absolutely nothing to help us unify ACL
>APIs.
And this is a Good Thing(TM). EAs are completely orthogonal to ACLs and the
API for EAs should not in any way have anything to do with ACLs.
It is up to a different API to implement ACLs. In the case of xfs, ext3,
etc, it can have EAs as a backend to the API but in the case of NTFS ACLs
would not be anything to do with EAs.
_Please_ do not mix the two issues. We have here an IMO nice API for EAs.
Let's get it into the kernel. Once that is done, one can start talking about
an API for ACLs.
>In effect it is far _too_ extensible: we need to have some agreement on
>how it can be used if the different ACL applications are to have any hope
>of working together.
IMHO a generic POSIX ACL API would never even know that ACLs are stored as
EAs, this should be up to the individual fs and if several fs use EAs it
would make sense to have vfs helpers which all fs can use (a-la generic_*
helpers).
If you create a hard connection between ACLs and EAs then you are already
asserting from the start that there will be file systems with an alternate
ACL interface separate from this "generic" one and alternate user space
utilities... Perhaps this is what you want, but then it is certainly not a
true generic interface, it's just a "cater for the people who first
implemented it" interface.
>The bright point is that this can be done reasonably well in user
>space, if we're careful (but we still need to worry about exactly how
>the kernel will deal with validating ACE chains --- we need to specify
>whether EAs in the system namespace are expected to be stored verbatim
>or whether the filesystem is permitted to interpret their semantics
>intelligently.)
At the very least you have to allow for the possibility that a file system
will have ACLs but those will NOT be EAs. An implementation which actually
makes this impossible is IMHO unacceptable in the generic parts of the kernel.
IMHO an interface where each fs would have a set of acl_ops which the vfs
can invoke like inode->acl_ops->{get,set,remove,add,whatever you
like}_{acl,ace} - you get the idea - is required for a generic
implementation of POSIX ACLs.
Each fs then implements ACLs any way it likes. xfs,ext3,et al would use the
EA API, ntfs would use its own security attributes, other fs will do
whatever is required.
This fits in nicely with the idea for vfs helpers so that xfs,ext3,etc
could just set their acl_ops to generic_*_acl and be done with it.
Comments?
Best regards,
Anton
--
"I've not lost my mind. It's backed up on tape somewhere." - Unknown
--
Anton Altaparmakov <aia21 at cam.ac.uk> (replace at with @)
Linux NTFS Maintainer / WWW: http://linux-ntfs.sf.net/
ICQ: 8561279 / WWW: http://www-stu.christs.cam.ac.uk/~aia21/
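The acl_ops idea in the message above can be mocked up in user space to show the shape of the dispatch. Everything in this sketch is hypothetical: the struct and function names, the string representation of an ACL, and the two in-memory "stores" standing in for an EA and for a filesystem's own security attribute.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct sketch_inode;

/* Hypothetical per-filesystem ACL operations. */
struct sketch_acl_ops {
    int (*get_acl)(const struct sketch_inode *ino, char *buf, size_t len);
    int (*set_acl)(struct sketch_inode *ino, const char *acl);
};

struct sketch_inode {
    const struct sketch_acl_ops *acl_ops;
    char ea_store[64];      /* stands in for an EA-backed ACL */
    char native_store[64];  /* stands in for a native security attribute */
};

/* Copy src into dst only if it fits, including the terminator. */
static int copy_bounded(char *dst, size_t len, const char *src)
{
    if (strlen(src) >= len)
        return -1;
    strcpy(dst, src);
    return 0;
}

/* Generic helpers for filesystems that keep ACLs in an EA. */
static int generic_get_acl(const struct sketch_inode *ino, char *buf, size_t len)
{
    return copy_bounded(buf, len, ino->ea_store);
}

static int generic_set_acl(struct sketch_inode *ino, const char *acl)
{
    return copy_bounded(ino->ea_store, sizeof ino->ea_store, acl);
}

/* A filesystem with its own ACL storage supplies its own pair. */
static int native_get_acl(const struct sketch_inode *ino, char *buf, size_t len)
{
    return copy_bounded(buf, len, ino->native_store);
}

static int native_set_acl(struct sketch_inode *ino, const char *acl)
{
    return copy_bounded(ino->native_store, sizeof ino->native_store, acl);
}

static const struct sketch_acl_ops generic_acl_ops = { generic_get_acl, generic_set_acl };
static const struct sketch_acl_ops native_acl_ops = { native_get_acl, native_set_acl };

/* VFS-level entry points: callers never learn where the ACL is stored. */
static int vfs_get_acl(const struct sketch_inode *ino, char *buf, size_t len)
{
    assert(ino != NULL && ino->acl_ops != NULL);
    return ino->acl_ops->get_acl(ino, buf, len);
}

static int vfs_set_acl(struct sketch_inode *ino, const char *acl)
{
    assert(ino != NULL && ino->acl_ops != NULL);
    return ino->acl_ops->set_acl(ino, acl);
}
```

An EA-backed filesystem would point acl_ops at the generic pair and be done; one with native ACLs would install its own pair, and callers of vfs_get_acl()/vfs_set_acl() could not tell the difference.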
Hi,
On Tue, Dec 11, 2001 at 12:41:15PM +1100, Nathan Scott wrote:
> On Mon, Dec 10, 2001 at 11:52:09AM +0000, Stephen C. Tweedie wrote:
> > Unfortunately, if there are many filesystems wanting to use posix
> > ACLs, then standardising the API is still desirable.
> Yes, absolutely. That is in fact a large driving force behind
> this effort to get a common EA and POSIX ACL API, and we are now
> for the first time at a point where we have multiple filesystems
> (xfs, ext2, and ext3) sharing the same API. The history went a
> bit like this:
Yep, I know the history: I've been following this for a long time. :)
> - Andreas made attempt #1 to get a system call interface agreed on
> over a year ago now. He incorporated several people's suggestions,
> but eventually the discussion got sidetracked, died and nothing
> further happened;
Yep, and I brought up all these points last time, too.
> > But the ACL encoding is still hobbled: ...
>
> I have been on the acl-devel mailing list for a long time now,
> and while these features all sound like good ideas or interesting
> projects, I have never seen anyone post a patch or request any
> specific changes to Andreas' ACL encoding in that time.
It was proposed over a year ago on fsdevel-list. I've attached the
main proposal email, and I've posted the mailbox containing the
discussion at
http://people.redhat.com/sct/ACL/ACL.mailbox.gz
Warning, it uncompresses to over 600k!
> It seems to me that the relatively simple implementation which
> Andreas has done is a good starting point (it has been used in
> production for a long time now).
>
> His POSIX ACL encoding has a version field in it
Umm, and where in the EA man pages is this described? How does an
application use the EA API? That's what I'm concerned about.
The EA API is fine, as far as it goes. However, it doesn't talk _at
all_ about extending semantics. It doesn't even say if it is legal to
use system EAs for POSIX ACLs. Right now, system EAs are just a magic
way of stuffing undefined bits into undefined filesystems. What if I
want to add non-user-modifiable EAs to a file for user-space reasons?
Eg. what if my backup tool wants to write a backup timestamp which the
user can't modify? How do I do that? The EA spec doesn't actually
say whether it is legal for applications to store their own data in
system EAs, and if so, which set of system EAs must be reserved for
system internal use.
> so if/when some
> people step forward to implement these features you've described,
> and if they require changes to the format, then there should be no
> reason they can't do it cleanly and in a filesystem-independent
> manner, right?
What format? There _is_ no defined format. There's some existing
practice, but no rules whatever right now.
Cheers,
Stephen
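The question raised above, which system EAs are interpreted by the kernel and which could hold opaque application data, amounts to asking for a rule like the one sketched here. This is one possible answer and not what any spec in this thread says: the "user." prefix, the list of interpreted names and all function names are assumptions for illustration.

```c
#include <assert.h>
#include <string.h>

enum attr_ns { ATTR_NS_USER, ATTR_NS_SYSTEM, ATTR_NS_UNKNOWN };

/* Classify an attribute name by its namespace prefix. */
static enum attr_ns attr_namespace(const char *name)
{
    assert(name != NULL);
    if (strncmp(name, "user.", 5) == 0 && name[5] != '\0')
        return ATTR_NS_USER;
    if (strncmp(name, "system.", 7) == 0 && name[7] != '\0')
        return ATTR_NS_SYSTEM;
    return ATTR_NS_UNKNOWN;
}

/*
 * Names whose values the filesystem is assumed to interpret (here,
 * only the two POSIX ACL attributes mentioned in the thread).
 */
static int attr_is_interpreted(const char *name)
{
    assert(name != NULL);
    return strcmp(name, "system.posix_acl_access") == 0 ||
           strcmp(name, "system.posix_acl_default") == 0;
}

/*
 * A system attribute that is stored verbatim: a candidate home for
 * data such as a backup tool's timestamp, IF the spec allowed it.
 */
static int attr_is_opaque_system(const char *name)
{
    return attr_namespace(name) == ATTR_NS_SYSTEM &&
           !attr_is_interpreted(name);
}
```

Writing the rule down, in whatever form, is the missing piece being asked for: without it an application cannot know whether "system.backup_time" would be stored, rejected, or given some meaning by a particular filesystem.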
Hi,
On Tue, Dec 11, 2001 at 01:30:16PM +0000, Anton Altaparmakov wrote:
> At 11:33 11/12/01, Stephen C. Tweedie wrote:
> >On Tue, Dec 11, 2001 at 12:22:58PM +1100, Timothy Shimmin wrote:
> > > Could this not be catered for independent of the proposed EA interface
> > > for getting/setting/removing EAs ?
> >
> >Definitely. The whole problem I pointed out with the EA interface was
> >that it didn't talk about ACLs at all. So, sure, it gives us an API
> >for arbitrary EAs, but it does absolutely nothing to help us unify ACL
> >APIs.
>
> And this is a Good Thing(TM). EAs are completely orthogonal to ACLs and the
> API for EAs should not in any way have anything to do with ACLs.
At the moment, however, they do.
The user-visible APIs for ACLs and EAs are quite separate. However,
the way that the user ACL libraries talk to the filesystem is through
special reserved EAs, to which the kernel magically imparts ACL
semantics. The format of that EA, its name, its semantics and so on,
are all completely glossed over by the existing EA spec, despite the
fact that this mechanism is right at the very core of the ACL
implementation.
> It is up to a different API to implement ACLs. In the case of xfs, ext3,
> etc, it can have EAs as a backend to the API but in the case of NTFS ACLs
> would not be anything to do with EAs.
I know, and that's part of the problem we identified over a year ago.
My old EA proposal had the concept of "attribute families", and
allowed us to define ACL attributes in a completely different
namespace from EAs, so that NTFS, AFS or NFSv4 ACLs could be passed
cleanly through the same API.
> _Please_ do not mix the two issues. We have here an IMO nice API for EAs.
> Let's get it into the kernel. Once that is done, one can start talking about
> an API for ACLs.
I'm not mixing them: I'm *trying* to unmix them.
The existing EA code implements magic "handlers" for system EAs.
Undefined magic gets called when you set such an EA. There's no
mention about this in the spec.
So right now the EA code opens up what is basically a named-ioctl back
door into the filesystems, and ACLs work on top of that. The existing
ACL and EA code is already enormously intertwined.
> IMHO a generic POSIX ACL API would never even know that ACLs are stored as
> EAs, this should be up to the individual fs and if several fs use EAs it
> would make sense to have vfs helpers which all fs can use (a-la generic_*
> helpers).
Yes, and the existing bestbits ACL API does not have anything like
that abstraction: rather, it relies on doing an assignment to a "$acl"
EA.
So how are we going to do ACLs on top of EAs? Even if we forget about
the ACL API for now, the API between the ACL layer and the EA layer
*does* matter.
Cheers,
Stephen
Hi Stephen,
At 14:34 11/12/01, Stephen C. Tweedie wrote:
>On Tue, Dec 11, 2001 at 01:30:16PM +0000, Anton Altaparmakov wrote:
> > At 11:33 11/12/01, Stephen C. Tweedie wrote:
> > >On Tue, Dec 11, 2001 at 12:22:58PM +1100, Timothy Shimmin wrote:
> > > > Could this not be catered for independent of the proposed EA interface
> > > > for getting/setting/removing EAs ?
> > >
> > >Definitely. The whole problem I pointed out with the EA interface was
> > >that it didn't talk about ACLs at all. So, sure, it gives us an API
> > >for arbitrary EAs, but it does absolutely nothing to help us unify ACL
> > >APIs.
> >
> > And this is a Good Thing(TM). EAs are completely orthogonal to ACLs and
> > the API for EAs should not in any way have anything to do with ACLs.
>
>At the moment, however, they do.
>
>The user-visible APIs for ACLs and EAs are quite separate. However,
>the way that the user ACL libraries talk to the filesystem is through
>special reserved EAs, to which the kernel magically imparts ACL
>semantics. The format of that EA, its name, its semantics and so on,
>are all completely glossed over by the existing EA spec, despite the
>fact that this mechanism is right at the very core of the ACL
>implementation.
You are certainly right here. However, I thought the existing ACL
implementation was going to provide the detailed EA specs on top of the
generic EA spec. I can see why you would want to reserve certain EAs for
ACL use in the EA spec though. It would be just like reserving syscall
numbers and that makes a lot of sense...
> > It is up to a different API to implement ACLs. In the case of xfs, ext3,
> > etc, it can have EAs as a backend to the API but in the case of NTFS ACLs
> > would not be anything to do with EAs.
>
>I know, and that's part of the problem we identified over a year ago.
>My old EA proposal had the concept of "attribute families", and
>allowed us to define ACL attributes in a completely different
>namespace from EAs, so that NTFS, AFS or NFSv4 ACLs could be passed
>cleanly through the same API.
>
> > _Please_ do not mix the two issues. We have here an IMO nice API for EAs.
> > Let's get it into the kernel. Once that is done, one can start talking
> > about an API for ACLs.
>
>I'm not mixing them: I'm *trying* to unmix them.
Ok. (-:
>The existing EA code implements magic "handlers" for system EAs.
>Undefined magic gets called when you set such an EA. There's no
>mention about this in the spec.
That is a shortcoming of the EA spec. Agreed. Or rather it is a flaw in
the EA design as there shouldn't be any magic handlers. I have always seen
EAs as a means to store things in files that aren't the file data but other
random pieces of information. The fs shouldn't care what those are.
>So right now the EA code opens up what is basically a named-ioctl back
>door into the filesystems, and ACLs work on top of that. The existing
>ACL and EA code is already enormously intertwined.
Viewing it from this point of view you are of course right.
The ACL API should only use EAs as backing store, no magic handlers
attached to the EAs. Then the system would not be open to abuse. The ACL
part of the fs can of course make use of the EA backing store if it wants to.
> > IMHO a generic POSIX ACL API would never even know that ACLs are stored as
> > EAs, this should be up to the individual fs and if several fs use EAs it
> > would make sense to have vfs helpers which all fs can use (a-la generic_*
> > helpers).
>
>Yes, and the existing bestbits ACL API does not have anything like
>that abstraction: rather, it relies on doing an assignment to a "$acl"
>EA.
So the ACL API is flawed.
>So how are we going to do ACLs on top of EAs? Even if we forget about
>the ACL API for now, the API between the ACL layer and the EA layer
>*does* matter.
Indeed. I would favour an abstraction at vfs level and fs specific methods
as I described in my previous post because this really divorces the two layers.
I know code speaks but I am more interested in making NTFS work properly
read-write and only then implement ACLs in it at the moment...
Best regards,
Anton
--
"I've not lost my mind. It's backed up on tape somewhere." - Unknown
--
Anton Altaparmakov <aia21 at cam.ac.uk> (replace at with @)
Linux NTFS Maintainer / WWW: http://linux-ntfs.sf.net/
ICQ: 8561279 / WWW: http://www-stu.christs.cam.ac.uk/~aia21/
Stephen C. Tweedie wrote:
>
>
>The proposal defines two "families" of attribute entities: attribute
>families and name families.
>
>An attribute family might be ATR_USER or ATR_SYSTEM to specify that we
>are dealing with arbitrary user or system named extended attributes,
>or ATR_POSIXACL to specify POSIX-semantics ACLs. Obviously, this can
>be extended to other ACL semantics without revving the API --- a new
>attribute family would be all that is needed.
>
>The "name family" is the other part of the equation. Attributes in
>the ATR_USER or ATR_SYSTEM families might be named with counted
>strings, so they would have names in the ANAME_STRING name family.
>POSIX ACLs, however, have a different namespace: ANAME_UID or
>ANAME_GID. The API cleanly deals with the difference between user and
>group ACLs. It also makes it easy to add support later on for more
>complex operations: if we want to add NT SID support to ext2 ACLs so
>that Samba and local accesses get the same access control, we can pass
>ANAME_NTSID names to the ATR_POSIXACL attribute family without
>changing the API.
>
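The two families described above could be modelled roughly as follows. This is a sketch under assumptions: the constant names (ATR_USER, ATR_SYSTEM, ATR_POSIXACL, ANAME_STRING, ANAME_UID, ANAME_GID, ANAME_NTSID) come from the quoted proposal, but the struct layout, the numeric values and the `attr_name_valid` helper are invented here for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Attribute families named in the proposal (values are arbitrary). */
enum attr_family { ATR_USER, ATR_SYSTEM, ATR_POSIXACL };

/* Name families named in the proposal; ANAME_NTSID is the possible
 * later extension it mentions. */
enum name_family { ANAME_STRING, ANAME_UID, ANAME_GID, ANAME_NTSID };

/* Hypothetical tagged union for "a name in some name family". */
struct attr_name {
    enum name_family family;
    union {
        struct { const char *bytes; size_t len; } string;  /* counted string */
        uint32_t id;                                        /* uid or gid */
        struct { const uint8_t *bytes; size_t len; } sid;   /* opaque SID blob (assumed) */
    } u;
};

/* Assumed pairing rule: string names for user/system attributes,
 * id-like names for POSIX ACL entries. Returns 1 if valid, else 0. */
int attr_name_valid(enum attr_family af, const struct attr_name *name)
{
    assert(name != NULL);
    switch (af) {
    case ATR_USER:
    case ATR_SYSTEM:
        return name->family == ANAME_STRING;
    case ATR_POSIXACL:
        return name->family == ANAME_UID
            || name->family == ANAME_GID
            || name->family == ANAME_NTSID;
    }
    return 0;
}
```

Under this layout, adding a new ACL flavour means adding an enum value and a case, while the function signatures that carry `struct attr_name` stay the same, which is the extensibility argument the proposal makes.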
If you have given it some thought, which your writing hints you may
have, can you say a little about supporting NT SIDS and NT ACLs by
Linux, and how that can be hard and easy?
One of my programmers is arguing that NT (as opposed to POSIX) ACL
support is harder than I imagine due to SIDS, and.... your view would be
interesting.
Hans
At 18:23 11/12/01, Hans Reiser wrote:
>Stephen C. Tweedie wrote:
>>The proposal defines two "families" of attribute entities: attribute
>>families and name families.
>>
>>An attribute family might be ATR_USER or ATR_SYSTEM to specify that we
>>are dealing with arbitrary user or system named extended attributes,
>>or ATR_POSIXACL to specify POSIX-semantics ACLs. Obviously, this can
>>be extended to other ACL semantics without revving the API --- a new
>>attribute family would be all that is needed.
>>
>>The "name family" is the other part of the equation. Attributes in
>>the ATR_USER or ATR_SYSTEM families might be named with counted
>>strings, so they would have names in the ANAME_STRING name family.
>>POSIX ACLs, however, have a different namespace: ANAME_UID or
>>ANAME_GID. The API cleanly deals with the difference between user and
>>group ACLs. It also makes it easy to add support later on for more
>>complex operations: if we want to add NT SID support to ext2 ACLs so
>>that Samba and local accesses get the same access control, we can pass
>>ANAME_NTSID names to the ATR_POSIXACL attribute family without
>>changing the API.
>If you have given it some thought, which your writing hints you may have,
>can you say a little about supporting NT SIDS and NT ACLs by Linux, and
>how that can be hard and easy?
>
>One of my programmers is arguing that NT (as opposed to POSIX) ACL support
>is harder than I imagine due to SIDS, and.... your view would be interesting.
SIDs are nothing but user ids so you just require the user to pass a
mapping between SIDs and Linux user&group ids at mount time and that
problem is solved.
I am told samba already has support for SIDs so it can't be that difficult. (-:
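A mount-time mapping of the kind suggested above might be sketched like this. Everything here is an assumption made for the example: the textual SID strings are made up, and `sid_map_entry` and `sid_to_uid` are hypothetical names, not an existing kernel or Samba interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One entry of a hypothetical table supplied at mount time. */
struct sid_map_entry {
    const char *sid;   /* textual SID (made-up values in the usage below) */
    uint32_t uid;      /* local user id it maps to */
};

/* Linear lookup. Returns 0 and stores the uid on a hit, -1 when the
 * SID is not in the table. */
int sid_to_uid(const struct sid_map_entry *map, size_t n,
               const char *sid, uint32_t *uid_out)
{
    assert(map != NULL && sid != NULL && uid_out != NULL);
    for (size_t i = 0; i < n; i++) {
        if (strcmp(map[i].sid, sid) == 0) {
            *uid_out = map[i].uid;
            return 0;
        }
    }
    return -1;
}
```

The lookup itself is trivial; what to do when it returns -1 (an SID with no local counterpart) is a policy decision the sketch deliberately leaves to the caller.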
Best regards,
Anton
--
"I've not lost my mind. It's backed up on tape somewhere." - Unknown
--
Anton Altaparmakov <aia21 at cam.ac.uk> (replace at with @)
Linux NTFS Maintainer / WWW: http://linux-ntfs.sf.net/
ICQ: 8561279 / WWW: http://www-stu.christs.cam.ac.uk/~aia21/
Hi Hans,
At 12:02 11/12/01, Hans Reiser wrote:
> I respond below.
>
>I didn't see that email, probably because I was not on the cc list.
>
>Nathan Scott wrote:
>>hi Hans,
>>On Sat, Dec 08, 2001 at 11:17:21PM +0300, Hans Reiser wrote:
>>>Nathan Scott wrote:
>>>>In a way there's consensus wrt how to do POSIX ACLs on Linux
>>>>now, as both the ext2/ext3 and XFS ACL projects will be using
>>>>the same tools, libraries, etc. In terms of other ACL types,
>>>>I don't know of anyone actively working on any.
>>>We are taking a very different approach to EAs (and thus to ACLs) as
>>>described in brief at http://www.namesys.com/v4/v4.html. We don't expect
>>>anyone to take us seriously on it before it works, but silence while
>>>coding does not equal consensus.;-)
>>>
>>>In essence, we think that if a file can't do what an EA can do, then you
>>>need to make files able to do more.
>>We did read through your page awhile ago. It wasn't clear to me
>>how you were addressing Anton's questions here:
>>http://marc.theaimsgroup.com/?l=linux-fsdevel&m=97260371413867&w=2
>>(I couldn't find a reply in the archive, but may have missed it).
>>
>>We were concentrating on something that could be fs-independent,
>>so the lack of answers there put us off a bit, and the dependence
>>on a reiser4() syscall is pretty filesystem-specific too (I guess
>>if your solution is intended to be a reiserfs-specific one, then
>>the questions above are meaningless).
>Changing the name of the system call is not a biggie. Our approach is to make
>it work for reiserfs, then proselytize. While we work, we let people know
>what we are working on, and if they join in, great to have it work for more
>than one FS.
>
>>I was curious on another thing also - in the section titled ``The Usual
>>Resolution Of These Flaws Is A One-Off Solution'',
>>talking about security attributes interfaces, your page says:
>>
>> "Linus said that we can have a system call to use as
>> our experimental plaything in this. With what I have in mind for the
>> API, one rather flexible system call is all we want..."
>>
>>How did you manage to get him to say that? We were flamed for
>>suggesting a syscall which multiplexed all extended attributes
>>commands through the one interface (because its semantics were
>>not clearly defined & it could be extended with new commands,
>>like ioctl/quotactl/...), and we've also had no luck so far in
>>getting either our original interface or any revised syscall
>>interfaces (which aren't like that anymore) accepted by Linus.
>
>We expect to get flamed once we have a patch.;-) When we
>have something mature enough to be usable, I expect he'll find a lot that
>could be made better. He does that.;-)
>
>For us, there are semantic advantages to having a single system call. Probably
>it will get a lot of argument once we have working code, and frankly I prefer
>to have that argument only after it is something usable, and it is easy to see
>the convenience of expression that comes from it. We want Linux to be
>MORE expressive than BeOS in regards to files.
>
>>many thanks.
>Curtis Anderson wrote:
>
>> > The problem with streams-style attributes comes from stepping onto the
>> > slippery slope of trying to put too much generality into it. I chose the
>> > block-access style of API so that there would be no temptation to start
>> > down that slope.
>>
>>I understand you right up until this. I just don't get it. If you
>>extend the functionality of files and directories so that attributes are
>>not needed, this is goodness, right? I sure think it is the right
>>approach. We should just decompose carefully what functionality is
>>provided by attributes that files and directories lack, and one feature
>>at a time add that capability to files and directories as separate
>>optional features.
I wrote:
>No, it is _not_ goodness, IMHO. - If you did implement the API for
>attributes through files and directories, then what would you do with
>named streams?!?
>
>Hans Reiser wrote:
>
>What is your intended functional difference between extended attributes
>and streams?
>
>None?
Differences in NTFS:
- maximum size (EAs are limited to 64kiB, named streams to 2^63 bytes)
- locality of storage (all EAs are stored together in one place, so they
are quicker to access when you need to access multiple EAs)
- name namespace (Unicode names for named streams vs ASCII for EAs)
- potential ability to compress/encrypt (EAs cannot do this, named streams
possibly could, and they can certainly be sparse too, which EAs cannot be)
- named streams have creation/modification/access/etc times associated with
them, EAs don't
How is that for a start?
>Ok, let's assume none until I get your response. (I can respond more
>specifically after you correct me.) Let me further go out on a limb, and
>guess that you intend that extended attributes are meta-information about
>the object, and streams are contained within the object.
Streams are only within the inode if they are tiny, otherwise they are
stored indirect just like normal file data. What they contain is completely
specific to the creator. The same is valid for EAs, with the exception that all
EAs are stored as one "stream" (for lack of a better word).
>In this case, a naming convention is quite sufficient to distinguish them.
Still think so? I don't.
>Extended attributes can have names of the form filenameA/..extone.
>
>Streams can have names of the form filenameA/streamone.
>
>In other words, all meta-information about an object should by convention
>(and only by convention, because people should live free, and because
>there is not always an obvious distinction between meta and contained
>information) be preceded by '..'
>
>Note that readdir should return neither stream names nor extended
>attribute names, and the use of 'hidden' directory entries accomplishes
>this (ala .snapshot for WAFL).
All the below quotes referring to *Curtis are actually from me, IIRC...
>*Curtis:
>You can't possibly have both using the same API since you would then get
>name collision on filesystems where both named streams and EAs are supported.
>
>Name distinctions are what you use to avoid name collisions, see above.
Ok, that would work, BUT:
(Again this is me not Curtis...)
>*Curtis:
>(And I haven't even mentioned EAs and named streams attached to actual
>_real_ directories yet.)
>
>I don't understand this.
Ok, I will try to explain. An inode is the real thing, not a file. An inode
can by definition be a file or a directory (or a symlink, or special device
file, etc).
Any of these (i.e. any inode) can have both named streams AND EAs attached
to them on NTFS. So say I have a directory named MyDir and it contains a
named stream called MyStream1 and an EA called MyEA1 and two files, one
called MyStream1 and one called "..MyEA1".
Now with your scheme of naming things, looking up MyDir/MyStream1 matches
both the file MyStream1 that is in the directory MyDir and the named stream
MyStream1 belonging to the directory MyDir. - How do you/does one
distinguish the two in your scheme?!? I can only see it making a big BANG
here...
Similarly, looking up MyDir/..MyEA1 matches both the file named "..MyEA1"
and the EA MyEA1. BANG!
And add a named stream actually named "..MyEA1" to MyDir and you have total
salad!
See the problem now?
I certainly fail to see how your naming scheme is going to cope with
this... Perhaps I am missing something?
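The MyDir example can be written down as a small lookup that counts how many distinct objects a path component could name under the "stream = dir/name, EA = dir/..name" convention. This is a toy model: `dir_object` and `count_matches` are hypothetical names, and no real filesystem interface is implied.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy model of one directory object with three separate namespaces. */
struct dir_object {
    const char *const *files;   size_t nfiles;    /* ordinary directory entries */
    const char *const *streams; size_t nstreams;  /* named streams on the directory */
    const char *const *eas;     size_t neas;      /* EAs on the directory */
};

/* Returns 1 if `name` occurs in the list, else 0. */
static int contains(const char *const *names, size_t n, const char *name)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp(names[i], name) == 0)
            return 1;
    return 0;
}

/* Number of distinct objects that "dir/component" could refer to when
 * streams are addressed as dir/<name> and EAs as dir/..<name>. */
int count_matches(const struct dir_object *dir, const char *component)
{
    assert(dir != NULL && component != NULL);
    int matches = 0;
    matches += contains(dir->files, dir->nfiles, component);
    matches += contains(dir->streams, dir->nstreams, component);
    if (strncmp(component, "..", 2) == 0)
        matches += contains(dir->eas, dir->neas, component + 2);
    return matches;
}
```

Any result greater than 1 is an ambiguous lookup. With the MyDir contents described above, both "MyStream1" and "..MyEA1" give 2, and adding a stream named "..MyEA1" raises the second to 3.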
Now if you have distinct APIs for EAs you have no problems on that side and
if you don't use the slash but say the colon (like Windows does) for named
streams you get rid of the named streams in directories problem, too. But
then you need to forbid the ":" as an accepted character in the file name
just like Windows does which is probably a reason not to use that API either...
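The colon alternative amounts to splitting the last path component at the first ':'. A minimal sketch, assuming a hypothetical helper `split_stream_path` (not an existing API):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Split "file:stream" at the first ':'. Returns a pointer to the stream
 * name inside `path`, or NULL if there is no colon (plain file).
 * *file_len receives the length of the file part. */
const char *split_stream_path(const char *path, size_t *file_len)
{
    assert(path != NULL && file_len != NULL);
    const char *colon = strchr(path, ':');
    if (colon == NULL) {
        *file_len = strlen(path);
        return NULL;
    }
    *file_len = (size_t)(colon - path);
    return colon + 1;
}
```

Because every ':' is taken as a separator, a file literally named "a:b" can no longer be addressed: "a:b" always resolves to stream "b" of file "a". That is the cost of forbidding ':' in file names mentioned above.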
>
>*Curtis:
>Let's face it: EAs exist. They are _not_ files/directories so the API
>
>Is this an argument?
>
>EA's do not exist in Linux, and they should never exist as something that
>is more than a file. Since they do not exist, you might as well improve
>the filesystems you port to Linux while porting them. APIs shape an OS
>over the long term, and if done wrong they burden generations after you
>with crud.
Like Microsoft is going to let me change the NTFS specifications to modify
how EAs and named streams are stored. Dream on!
But perhaps we are talking past each other: I am talking on-disk format /
specifications. These exist and no, we cannot change those at all. You can
do that with reiserfs as it is yours but all of us supporting existing file
systems owned by corporations like Microsoft, SGI, etc, have to live with
the specifications.
>*Curtis:
>should not make them appear as files/directories. - You have to consider
>that there are a lot of filesystems out there which are already developed
>and which need to be supported. - Not everyone has their own filesystem
>which they can change/extend the specifications/implementation of at will.
>
>Yes they do. It is all GPL'd. Even XFS. Do the underlying infrastructure
>the right way, and I bet you'll be surprised at how little need there really
>is for ea's done the wrong way. A user space library can cover
>over it all (causing only the obsolete programs using it to suffer while they
>wait to fade away).
?!? GPL has nothing to do with on-disk format and I doubt Microsoft would
agree that the ntfs on-disk layout is GPL. It's a trade secret! Why do you
think ntfs developers have to spend half their life using disassemblers and
hexeditors?!?
>What would have happened if set theory had not just sets and elements, but
>sets, elements, extended-attributes, and streams, and you could not use
>the same operators on streams that you use on elements? It would have
>been crap as a theoretical model. It does real damage when you add things
>that require different operators to the set of primitives. Closure is
>extremely important to design. Don't do this.
Since we are going into analogies: You don't use a hammer to affix a screw
and neither do you use a screwdriver to affix a nail...at least I don't. I
think you are trying to use a large sledge hammer to put together things
which do not fit together thus breaking them in the process. To use your
own words: Don't do this. (-; Each is distinct and should be treated as
such. </me ducks>
Best regards,
Anton
--
"I've not lost my mind. It's backed up on tape somewhere." - Unknown
--
Anton Altaparmakov <aia21 at cam.ac.uk> (replace at with @)
Linux NTFS Maintainer / WWW: http://linux-ntfs.sf.net/
ICQ: 8561279 / WWW: http://www-stu.christs.cam.ac.uk/~aia21/
I know I'm stepping into a minefield, but I just can't help putting in
my 2 pennies. :-)
Anton Altaparmakov wrote:
> At 12:02 11/12/01, Hans Reiser wrote:
> >What would have happened if set theory had not just sets and elements, but
> >sets, elements, extended-attributes, and streams, and you could not use
> >the same operators on streams that you use on elements? It would have
> >been crap as a theoretical model. It does real damage when you add things
> >that require different operators to the set of primitives. Closure is
> >extremely important to design. Don't do this.
>
> Since we are going into analogies: You don't use a hammer to affix a screw
> and neither do you use a screwdriver to affix a nail...at least I don't. I
> think you are trying to use a large sledge hammer to put together things
> which do not fit together thus breaking them in the process. To use your
> own words: Don't do this. (-; Each is distinct and should be treated as
> such. </me ducks>
I agree with Anton. Files have certain characteristics that we all
know and love; stream-style attributes have pretty much those same
characteristics. IMHO, we would like EAs to have a different set of
characteristics so that the application programmer has different tools
in her toolbox. To continue the analogy: "if all you have is a hammer,
everything looks like a nail". Give someone that _already has_ a hammer
a screwdriver and they will be confused for a while but will end up
happier than if you gave them a "better hammer".
Thanks,
Curtis
--
Curtis Anderson [email protected]
Anton Altaparmakov wrote:
> Hi Hans,
>
> At 12:02 11/12/01, Hans Reiser wrote:
>
>> I respond below.
>>
>> I didn't see that email, probably because I was not on the cc list.
>>
>> Nathan Scott wrote:
>>
>>> hi Hans,
>>> On Sat, Dec 08, 2001 at 11:17:21PM +0300, Hans Reiser wrote:
>>>
>>>> Nathan Scott wrote:
>>>>
>>>>> In a way there's consensus wrt how to do POSIX ACLs on Linux
>>>>> now, as both the ext2/ext3 and XFS ACL projects will be using
>>>>> the same tools, libraries, etc. In terms of other ACL types,
>>>>> I don't know of anyone actively working on any.
>>>>
>>>> We are taking a very different approach to EAs (and thus to ACLs)
>>>> as described in brief at http://www.namesys.com/v4/v4.html. We don't
>>>> expect anyone to take us seriously on it before it works, but
>>>> silence while coding does not equal consensus.;-)
>>>>
>>>> In essence, we think that if a file can't do what an EA can do,
>>>> then you need to make files able to do more.
>>>
>>> We did read through your page awhile ago. It wasn't clear to me
>>> how you were addressing Anton's questions here:
>>> http://marc.theaimsgroup.com/?l=linux-fsdevel&m=97260371413867&w=2
>>> (I couldn't find a reply in the archive, but may have missed it).
>>>
>>> We were concentrating on something that could be fs-independent,
>>> so the lack of answers there put us off a bit, and the dependence
>>> on a reiser4() syscall is pretty filesystem-specific too (I guess
>>> if your solution is intended to be a reiserfs-specific one, then
>>> the questions above are meaningless).
>>
>> Changing the name of the system call is not a biggie. Our approach
>> is to make it work for reiserfs, then proselytize. While we work, we
>> let people know what we are working on, and if they join in, great to
>> have it work for more than one FS.
>>
>>> I was curious on another thing also - in the section titled ``The
>>> Usual Resolution Of These Flaws Is A One-Off Solution'',
>>> talking about security attributes interfaces, your page says:
>>>
>>> "Linus said that we can have a system call to use as
>>> our experimental plaything in this. With what I have in mind for the
>>> API, one rather flexible system call is all we want..."
>>>
>>> How did you manage to get him to say that? We were flamed for
>>> suggesting a syscall which multiplexed all extended attributes
>>> commands through the one interface (because its semantics were
>>> not clearly defined & it could be extended with new commands,
>>> like ioctl/quotactl/...), and we've also had no luck so far in
>>> getting either our original interface or any revised syscall
>>> interfaces (which aren't like that anymore) accepted by Linus.
>>
>>
>> We expect to get flamed once we have a patch.;-) When we
>> have something mature enough to be usable, I expect he'll find a lot
>> that could be made better. He does that.;-)
>>
>> For us, there are semantic advantages to having a single system call.
>> Probably it will get a lot of argument once we have working code, and
>> frankly I prefer to have that argument only after it is something
>> usable, and it is easy to see the convenience of expression that comes
>> from it. We want Linux to be MORE expressive than BeOS in regards to
>> files.
>>
>>> many thanks.
>>
>> Curtis Anderson wrote:
>>
>>> > The problem with streams-style attributes comes from stepping onto
>>> > the slippery slope of trying to put too much generality into it. I
>>> > chose the block-access style of API so that there would be no
>>> > temptation to start down that slope.
>>>
>>> I understand you right up until this. I just don't get it. If you
>>> extend the functionality of files and directories so that attributes
>>> are not needed, this is goodness, right? I sure think it is the
>>> right approach. We should just decompose carefully what
>>> functionality is provided by attributes that files and directories
>>> lack, and one feature at a time add that capability to files and
>>> directories as separate optional features.
>>
>
> I wrote:
>
>> No, it is _not_ goodness, IMHO. - If you did implement the API for
>> attributes through files and directories, then what would you do with
>> named streams?!?
>>
>> Hans Reiser wrote:
>>
>> What is your intended functional difference between extended
>> attributes and streams?
>>
>> None?
>
>
> Differences in NTFS:
>
> - maximum size (EA limited to 64kiB, named stream 2^63 bytes)
These are desirable limits to preserve? For sure? If so, then a
particular plugin can be written to restrict files to 64k, though I
shake my head at the thought.
>
> - locality of storage (all EAs are stored in one so they are quicker
> to access when you need to access multiple EAs)
Arbitrarily aggregating files is a useful feature; there is no reason to
rigidly offer and require the feature for EAs only.
>
> - name namespace (Unicode names for named streams vs ASCII for EAs)
Namespaces can be changed for the children of directories. Plan9 guys
have done such things, and it is cool.
>
> - potential ability to compress/encrypt (EAs cannot do this, named
> streams could possibly and they certainly can be sparse, too which EAs
> cannot be)
Well, I suppose you could allow restricting some files to not have
compression and sparseness, though it isn't exciting to me.
>
> - named streams have creation/modification/access/etc times associated
> with them, EAs don't
I thought streams shared the stat data of the parent file?
Regardless, files should be able to share/inherit stat data.
>
>
> How is that for a start?
Not one reason cited is convincing to me.
>
>
>> Ok, let's assume none until I get your response. (I can respond more
>> specifically after you correct me.) Let me further go out on a limb,
>> and guess that you intend that extended attributes are
>> meta-information about the object, and streams are contained within
>> the object.
>
>
> Streams are only within the inode if they are tiny, otherwise they are
> stored indirect just like normal file data. What they contain is
> completely specific to the creator. The same is valid for EAs, with the
> exception that all EAs are stored as one "stream" (for lack of a
> better word).
I miss the point of the implementation details cited above.
>
>
>> In this case, a naming convention is quite sufficient to distinguish
>> them.
>
>
> Still think so?
Yes.
> I don't.
>
>> Extended attributes can have names of the form filenameA/..extone.
>>
>> Streams can have names of the form filenameA/streamone.
>>
>> In other words, all meta-information about an object should by
>> convention (and only by convention, because people should live free,
>> and because there is not always an obvious distinction between meta
>> and contained information) be preceded by '..'
>>
>> Note that readdir should return neither stream names nor extended
>> attribute names, and the use of 'hidden' directory entries
>> accomplishes this (ala .snapshot for WAFL).
>
>
> All the below quotes referring to *Curtis are actually from me, IIRC...
>
>> *Curtis:
>> You can't possibly have both using the same API since you would then
>> get name collision on filesystems where both named streams and EAs
>> are supported.
>>
>> Name distinctions are what you use to avoid name collisions, see above.
>
>
> Ok, that would work, BUT:
>
> (Again this is me not Curtis...)
>
>> *Curtis:
>> (And I haven't even mentioned EAs and named streams attached to
>> actual _real_ directories yet.)
>>
>> I don't understand this.
>
>
> Ok, I will try to explain. An inode is the real thing, not a file.
In reiserfs we say object, and consider files and directories (and
symlinks, etc.) to be objects. We don't have on-disk inodes. Inodes
are implementation layer not semantic layer. We should be talking about
semantic layer here I think.
> An inode can by definition be a file or a directory (or a symlink, or
> special device file, etc).
>
> Any of these (i.e. any inode) can have both named streams AND EAs
> attached to them on NTFS. So say I have a directory named MyDir and it
> contains a named stream called MyStream1 and an EA called MyEA1 and
> two files, one called MyStream1 and one called "..MyEA1".
>
> Now with your scheme of naming things, looking up MyDir/MyStream1
> matches both the file MyStream1 that is in the directory MyDir and the
> named stream MyStream1 belonging to the directory MyDir. - How do
> you/does one distinguish the two in your scheme?!? I can only see it
> making a big BANG here...
Well, gosh, okay, maybe you want to prepend ',,' to streams and '..' to
extended attributes. I personally think Linux would only want to do so
when used as a fileserver emulating NTFS/SAMBA. There is no enhancement
of user functionality from doing it for general purpose filesystems.
Feel free to substitute anything you like for ',,', the choice of
naming convention is not the point. You could even use ':' :-).
It is important though that you not require ',,', ':', or '..' to have
these special meanings for all Linux namespaces, I hope that is understood.
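The prefix convention suggested here (',,' for streams, '..' for extended attributes) reduces to a classification of the path component. A minimal sketch, with `classify_component` and the `name_kind` enum being hypothetical names chosen for the example:

```c
#include <assert.h>
#include <string.h>

enum name_kind { KIND_FILE, KIND_STREAM, KIND_EA };

/* Classify a single path component by its prefix. */
enum name_kind classify_component(const char *component)
{
    assert(component != NULL);
    /* "." and ".." keep their usual directory meaning. */
    if (strcmp(component, ".") == 0 || strcmp(component, "..") == 0)
        return KIND_FILE;
    if (strncmp(component, "..", 2) == 0)
        return KIND_EA;       /* e.g. "..MyEA1" */
    if (strncmp(component, ",,", 2) == 0)
        return KIND_STREAM;   /* e.g. ",,MyStream1" */
    return KIND_FILE;
}
```

With distinct prefixes the stream/file collision goes away, but only in namespaces where ordinary files are not allowed to begin with those prefixes, which is why the convention would have to be scoped to particular directories or filesystems rather than applied to all Linux namespaces.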
>
>
> Similarly, looking up MyDir/..MyEA1 matches both the file named
> "..MyEA1" and the EA MyEA1. BANG!
>
> And add a named stream actually named "..MyEA1" to MyDir and you have
> total salad!
>
> See the problem now?
No, see above.
>
>
> I certainly fail to see how your naming scheme is going to cope with
> this... Perhaps I am missing something?
Naming conventions are easy. See above.
>
>
> Now if you have distinct APIs for EAs you have no problems on that
> side and if you don't use the slash but say the colon (like Windows
> does) for named streams you get rid of the named streams in
> directories problem, too. But then you need to forbid the ":" as an
> accepted character in the file name just like Windows does which is
> probably a reason not to use that API either...
>
>>
>> *Curtis:
>> Let's face it: EAs exist. They are _not_ files/directories so the API
>>
>> Is this an argument?
>>
>> EA's do not exist in Linux, and they should never exist as something
>> that is more than a file. Since they do not exist, you might as well
>> improve the filesystems you port to Linux while porting them. APIs
>> shape an OS over the long term, and if done wrong they burden
>> generations after you with crud.
>
>
> Like Microsoft is going to let me change the NTFS specifications to
> modify how EAs and named streams are stored. Dream on!
>
> But perhaps we are talking past each other: I am talking on-disk
> format / specifications.
I am NOT talking about on-disk format, I am talking about APIs and
naming conventions. On disk format is entirely FS specific. Live free
(errr, no, you are doing NTFS, live confined;-) ).....
> These exist and no, we cannot change those at all. You can do that
> with reiserfs as it is yours but all of us supporting existing file
> systems owned by corporations like Microsoft, SGI, etc, have to live
> with the specifications.
>
>> *Curtis:
>> should not make them appear as files/directories. - You have to
>> consider that there are a lot of filesystems out there which are
>> already developed and which need to be supported. - Not everyone has
>> their own filesystem which they can change/extend the
>> specifications/implementation of at will.
>>
>> Yes they do. It is all GPL'd. Even XFS. Do the underlying
>> infrastructure the right way, and I bet you'll be surprised at how
>> little need there really is for ea's done the wrong way. A user space
>> library can cover over it all (causing only the obsolete programs
>> using it to suffer while they wait to fade away).
>
>
> ?!? GPL has nothing to do with on-disk format and I doubt Microsoft
> would agree that the ntfs on-disk layout is GPL. It's a trade secret!
> Why do you think ntfs developers have to spend half their life using
> disassemblers and hexeditors?!?
Did you file comments in the various MS legal battles going on? You
should..... (I know, there is only a small chance it will have an
effect, but..... ) they should be required to give you the info, and if
you don't demand it I bet they won't be so required. Did you notice how
they are restricting things to only persons with a viable business in
the opinion of MS?
>
>
>> *
>> What would have happened if set theory had not just sets and
>> elements, but sets, elements, extended-attributes, and streams, and
>> you could not use the same operators on streams that you use on
>> elements? It would have been crap as a theoretical model. It does
>> real damage when you add things that require different operators to
>> the set of primitives. Closure is extremely important to design.
>> Don't do this.
>
>
> Since we are going into analogies: You don't use a hammer to affix a
> screw and neither do you use a screwdriver to affix a nail...at least
> I don't. I think you are trying to use a large sledge hammer to put
> together things which do not fit together thus breaking them in the
> process. To use your own words: Don't do this. (-; Each is distinct
> and should be treated as such. </me ducks>
>
> Best regards,
>
> Anton
>
>
Programs will get written to use your API, and not work with reiserfs,
and will get written to use our API and not work with NTFS, and this is
bad....
Thanks for the FS driver by the way, it is very useful to us dual-booters.
Hans
Extended attributes differ from files in N ways (forgive me for not
specifying N exactly, it would distract us).
What I am saying is that each of the N permutations required to
transform a file into an extended attribute should be separately
selectable. Theory guys would call this orthogonalizing the primitives.
(I am a theory guy.;-) ).
It is very important to decompose one's primitives into separately
specifiable primitives. It makes for a much more expressive abstract
model. This is a very standard policy among mathematicians, that they
strive to decompose primitives into a more orthogonal toolkit because
they know from hundreds of years of experience that it inevitably leads
to more expressive power. Let us learn from these mathematicians who
are so much older and wiser than we.
Hans
[email protected] wrote:
>I know I'm stepping into a minefield, but I just can't help putting in
>my 2 pennies. :-)
>
>Anton Altaparmakov wrote:
>
>>At 12:02 11/12/01, Hans Reiser wrote:
>>
>>>What would have happened if set theory had not just sets and elements, but
>>>sets, elements, extended-attributes, and streams, and you could not use
>>>the same operators on streams that you use on elements? It would have
>>>been crap as a theoretical model. It does real damage when you add things
>>>that require different operators to the set of primitives. Closure is
>>>extremely important to design. Don't do this.
>>>
>>Since we are going into analogies: You don't use a hammer to affix a screw
>>and neither do you use a screwdriver to affix a nail...at least I don't. I
>>think you are trying to use a large sledge hammer to put together things
>>which do not fit together thus breaking them in the process. To use your
>>own words: Don't do this. (-; Each is distinct and should be treated as
>>such. </me ducks>
>>
>
>I agree with Anton. Files have certain characteristics that we all
>know and love, stream-style attributes have pretty-much those same
>characteristics. IMHO, we would like EAs to have a different set of
>characteristics so that the application programmer has different tools
>in her toolbox. To continue the analogy: "if all you have is a hammer,
>everything looks like a nail". Give someone that _already has_ a hammer
>a screwdriver and they will be confused for a while but will end up
>happier than if you gave them a "better hammer".
>
>Thanks,
>
> Curtis
>
Hans Reiser wrote:
> What I am saying is that each of the N permutations required to
> transform a file into an extended attribute should be separately
> selectable. Theory guys would call this orthogonalizing the primitives.
> (I am a theory guy.;-) ).
Applying such rigor in the architecture design phase is probably a good
idea. Doing it at application run time is not so clear to me.
If you think of files and EAs as apples and oranges, knowing the minimal
set of orthogonal steps to turn an apple into an orange is good when
designing, but I hesitate to burden an app with having to select the
"skin-color" characteristic separately from the "ascorbic acid content"
characteristic. IMHO, files and EAs are "package deals" where we have
chosen a different set of characteristics for each, ones that we believe
will be useful to an app.
At bottom, a file holds an uninterpreted data stream. You have to ask
yourself whether you want that to change or not. If not, then you
build any additional functionality in selectable layers on top of the
filesystem, not in it. If you do want it to change, then you are
headed down the path of pulling a database into the filesystem. Come
to think of it, I believe that someone is already doing that. :-)
Having an interface such that an app can ask for
open("pizza-pie", F_OLIVES|F_PEPPERONI|F_ANCHOVIES|F_PINEAPPLE...)
where each of the "F_*" options are orthogonal and ask the filesystem to
layer in a different "filter" between the raw data and the app, or to
change the access characteristics (eg: block alignment, non-buffered,
etc), sounds overly complex. I believe that this would be better done
by explicitly stacking filesystems in a per-process namespace.
Thanks,
Curtis
--
Curtis Anderson [email protected]
[email protected] wrote:
>Hans Reiser wrote:
>
>>What I am saying is that each of the N permutations required to
>>transform a file into an extended attribute should be separately
>>selectable. Theory guys would call this orthogonalizing the primitives.
>> (I am a theory guy.;-) ).
>>
>
>Applying such rigor in the architecture design phase is probably a good
>idea. Doing it at application run time is not so clear to me.
>
>If you think of files and EAs as apples and oranges, knowing the minimal
>set of orthogonal steps to turn an apple into an orange is good when
>designing, but I hesitate to burden an app with having to select the
>"skin-color" characteristic separately from the "ascorbic acid content"
>characteristic. IMHO, files and EAs are "package deals" where we have
>chosen a different set of characteristics for each, ones that we believe
>will be useful to an app.
>
>At bottom, a file holds an uninterpreted data stream. You have to ask
>yourself whether you want that to change or not. If not, then you
>build any additional functionality in selectable layers on top of the
>filesystem, not in it. If you do want it to change, then you are
>headed down the path of pulling a database into the filesystem. Come
>to think of it, I believe that someone is already doing that. :-)
>
>
>Having an interface such that an app can ask for
> open("pizza-pie", F_OLIVES|F_PEPPERONI|F_ANCHOVIES|F_PINEAPPLE...)
>where each of the "F_*" options are orthogonal and ask the filesystem to
>layer in a different "filter" between the raw data and the app, or to
>change the access characteristics (eg: block alignment, non-buffered,
>etc), sounds overly complex. I believe that this would be better done
>by explicitly stacking filesystems in a per-process namespace.
>
#define PIZZA (F_OLIVES|F_PEPPERONI|F_ANCHOVIES|F_PINEAPPLE)
#define EDIBLE_PIZZA (F_OLIVES|F_PEPPERONI|F_PINEAPPLE)
Your way allows for PIZZA but not EDIBLE_PIZZA to be selected by users.
Both are easy to specify.
You cannot know in advance what a user will consider to be EDIBLE_PIZZA.
Not allowing choice is for, umh, better I not say what OS likes to
prevent choice......;-)
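
To make the PIZZA / EDIBLE_PIZZA point concrete, here is a minimal sketch in C. The numeric values of the F_* bits and the helper flags_subset() are hypothetical and exist only for illustration; no such open() flags exist in any real kernel.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, illustrative flag bits: one bit per orthogonal feature. */
#define F_OLIVES    0x1u
#define F_PEPPERONI 0x2u
#define F_ANCHOVIES 0x4u
#define F_PINEAPPLE 0x8u

/* The "package deal" and a user-chosen subset of it. */
#define PIZZA        (F_OLIVES | F_PEPPERONI | F_ANCHOVIES | F_PINEAPPLE)
#define EDIBLE_PIZZA (F_OLIVES | F_PEPPERONI | F_PINEAPPLE)

/* Compile-time check that the subset asks for nothing outside the package. */
static_assert((EDIBLE_PIZZA & ~PIZZA) == 0u,
              "EDIBLE_PIZZA must be a subset of PIZZA");

/* Returns true when every bit in `wanted` is also present in `offered`. */
bool flags_subset(unsigned wanted, unsigned offered)
{
    return (wanted & ~offered) == 0u;
}
```

With separately selectable bits, an interface that offers PIZZA can also honour EDIBLE_PIZZA; an interface that only offers the package cannot express the subset at all.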
Ok, so I understand that what I am advocating is a lot of work, and a
much harder path to take, and I understand why you feel you have enough
work, and I think we can both respect each other for our positions.
I'll try to convince you again when I have working code that isn't
monstrous code, but allows users full choice, ok?
Best,
Hans
On Wed, 12 Dec 2001, Hans Reiser wrote:
> Anton Altaparmakov wrote:
> >> *Hans Reiser wrote:
> >>
> >> What is your intended functional difference between extended
> >> attributes and streams?
> >>
> >> None?
> >
> >
> > Differences in NTFS:
> >
> > - maximum size (EA limited to 64kiB, named stream 2^63 bytes)
>
> These are desirable limits to preserve? For sure? If so, then a
> particular plugin can be written to restrict files to 64k, though I
> shake my head at the thought.
You cannot not preserve them. At least not on NTFS! That is what the NTFS
specifications state (as visible from $AttrDef system file). I don't have
the option to change this.
> > - locality of storage (all EAs are stored in one so they are quicker
> > to access when you need to access multiple EAs)
>
> arbitrarily aggregating files is a useful feature, there is no reason to
> rigidly offer and require the feature for EAs only.
I was just stating a fact of how they are stored on NTFS, again something
I have no power to change.
> > - name namespace (Unicode names for named streams vs ASCII for EAs)
>
> Namespaces can be changed for the children of directories. Plan9 guys
> have done such things, and it is cool.
>
> > - potential ability to compress/encrypt (EAs cannot do this, named
> > streams could possibly and they certainly can be sparse, too which EAs
> > cannot be)
>
> Well, I suppose you could allow restricting some files to not have
> compression and sparseness, though it isn't exciting to me.
> >
> > - named streams have creation/modification/access/etc times associated
> > with them, EAs don't
>
> I thought streams shared the stat data of the parent file?
Yes, they do. Sorry, I got mixed up on that one point. I was thinking of
hard links rather than named streams (attribute $FILE_NAME rather than
$DATA).
> > How is that for a start?
>
> Not one reason cited is convincing to me.
> >
> >> Ok, let's assume none until I get your response. (I can respond more
> >> specifically after you correct me.) Let me further go out on a limb,
> >> and guess that you intend that extended attributes are
> >> meta-information about the object, and streams are contained within
> >> the object.
> >
> > Streams are only within the inode if they are tiny, otherwise they are
> > stored indirect just like normal file data. What they contain is
> > > completely specific to the creator. Same is valid for EAs, with the
> > exception that all EAs are stored as one "stream" (for lack of a
> > better word).
>
> I miss the point of the implementation details cited above.
You said that EAs contain meta-info and streams are contained within the
object (not sure what you mean there but anyway) and I was saying that
that is not true, even if in somewhat unclear words.
> >> In this case, a naming convention is quite sufficient to distinguish
> >> them.
> >
> > Still think so?
>
> Yes.
>
> > I don't.
> >
> >> Extended attributes can have names of the form filenameA/..extone.
> >>
> >> Streams can have names of the form filenameA/streamone.
> >>
> >> In other words, all meta-information about an object should by
> >> convention (and only by convention, because people should live free,
> >> and because there is not always an obvious distinction between meta
> >> and contained information) be preceded by '..'
> >>
> >> Note that readdir should return neither stream names nor extended
> >> attribute names, and the use of 'hidden' directory entries
> >> accomplishes this (ala .snapshot for WAFL).
> >
> >
> > > All the below quotes referring to *Curtis are actually from me, IIRC...
> >
> >> *Curtis:
> >> You can't possibly have both using the same API since you would then
> >> get name collision on filesystems where both named streams and EAs
> >> are supported.
> >>
> >> *Name distinctions are what you use to avoid name collisions, see above.
> >> *
> >
> >
> > Ok, that would work, BUT:
> >
> > (Again this is me not Curtis...)
> >
> >> *Curtis:
> >> (And I haven't even mentioned EAs and named streams attached to
> >> actual _real_ directories yet.)
> >>
> >> *I don't understand this.
> >
> >
> > Ok, I will try to explain. An inode is the real thing, not a file.
>
> In reiserfs we say object, and consider files and directories (and
> symlinks, etc.) to be objects. We don't have on-disk inodes. Inodes
> are implementation layer not semantic layer. We should be talking about
> semantic layer here I think.
>
> > An inode can by definition be a file or a directory (or a symlink, or
> > special device file, etc).
> >
> > Any of these (i.e. any inode) can have both named streams AND EAs
> > attached to them on NTFS. So say I have a directory named MyDir and it
> > contains a named stream called MyStream1 and an EA called MyEA1 and
> > two files, one called MyStream1 and one called "..MyEA1".
> >
> > Now with your scheme of naming things, looking up MyDir/MyStream1
> > matches both the file MyStream1 that is in the directory MyDir and the
> > named stream MyStream1 belonging to the directory MyDir. - How do
> > you/does one distinguish the two in your scheme?!? I can only see it
> > > making a big BANG here...
>
> Well, gosh, okay, maybe you want to prepend ',,' to streams and '..' to
> extended attributes. I personally think Linux would only want to do so
> when used as a fileserver emulating NTFS/SAMBA. There is no enhancement
> of user functionality from doing it for general purpose filesystems.
Just wait until this functionality is available and watch all GUI things
start to use it en masse! I doubt that GNOME/KDE/replace with your
favourite window manager are going to hesitate to start putting in the
icon, the name, and whatnot inside EAs or inside named streams the instant
they are ubiquitously available and I think that makes a lot of sense too.
No doubt I will get flamed for saying this but all flames go to
/dev/null...
Both MacOS and as of recently Windows do this kind of stuff, too, and it
can't be long before Linux goes the same way, provided file systems
support the required features (i.e. EAs and/or named streams) so I
disagree with you this is only a compatibility thing. It might start out
as one but it will find real world applications very quickly...
> Feel free to substitute anything you like for ',,', the choice of
> naming convention is not the point. You could even use ':':-).
>
> It is important though that you not require ',,', ':', or '..' to have
> these special meanings for all Linux namespaces, I hope that is understood.
The problem with making this flexible is this: how does a user space
application find out what the current separators are? Will you be
introducing a get_name_spaces_of_this_inode syscall that needs to be
called on each inode before accessing EAs? And what if someone changes it
halfway through while you are reading EAs? That way lie dragons IMHO.
The proposed EA API which accesses EAs as EAs and not as files doesn't
suffer from any such problems.
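
A minimal sketch of what accessing EAs as EAs looks like from user space, using setxattr(2) and getxattr(2) as declared in glibc's <sys/xattr.h> on current Linux systems. The prototypes in the 2001 patch may differ in detail, and the helper name ea_roundtrip() is an illustrative assumption.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/types.h>
#include <sys/xattr.h>

/*
 * Store `value` under attribute `name` on `path`, then read it back into
 * `out` as a NUL-terminated string.
 * Returns 0 on success or a negative errno value on failure.
 */
int ea_roundtrip(const char *path, const char *name, const char *value,
                 char *out, size_t outlen)
{
    ssize_t n;

    assert(out != NULL && outlen > 0);

    /* flags == 0: create the attribute, or replace it if it exists. */
    if (setxattr(path, name, value, strlen(value), 0) != 0)
        return -errno;

    /* Leave room for the terminating NUL; EA values are raw bytes. */
    n = getxattr(path, name, out, outlen - 1);
    if (n < 0)
        return -errno;

    out[n] = '\0';
    return 0;
}
```

The attribute name travels in its own argument rather than inside the path, so no separator characters are involved and nothing can collide with file names. On a file system without EA support the call fails with an errno such as ENOTSUP.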
> > Similarly, looking up MyDir/..MyEA1 matches both the file named
> > "..MyEA1" and the EA MyEA1. BANG!
> >
> > And add a named stream actually named "..MyEA1" to MyDir and you have
> > total salad!
> >
> > See the problem now?
>
> No, see above.
>
> > I certainly fail to see how your naming scheme is going to cope with
> > this... Perhaps I am missing something?
>
> Naming conventions are easy. See above.
How so? You are begging the question by just saying it is easy. Please
answer what you do when there is a clash, which is bound to happen
eventually especially if you make the name space prefixes flexible.
Does the user just experience undefined behaviour? That would be
unacceptable IMHO.
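
The clash can be sketched in a few lines of C, assuming the convention exactly as quoted above (',,' prepended to stream names, '..' prepended to EA names, nothing for ordinary files). All type and function names here are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

enum obj_kind { KIND_FILE, KIND_STREAM, KIND_EA };

/* Prefixes taken from the thread; ordinary files get no prefix. */
static const char *kind_prefix(enum obj_kind kind)
{
    switch (kind) {
    case KIND_STREAM: return ",,";
    case KIND_EA:     return "..";
    default:          return "";
    }
}

/*
 * Build the name a lookup inside the parent directory would see.
 * Returns 0 on success, -1 if `buf` is too small.
 */
int lookup_name(enum obj_kind kind, const char *name, char *buf, size_t len)
{
    int n;

    assert(name != NULL && buf != NULL);
    n = snprintf(buf, len, "%s%s", kind_prefix(kind), name);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Two objects clash when the convention maps them to the same name. */
int names_clash(enum obj_kind k1, const char *n1,
                enum obj_kind k2, const char *n2)
{
    char a[256], b[256];

    if (lookup_name(k1, n1, a, sizeof a) != 0 ||
        lookup_name(k2, n2, b, sizeof b) != 0)
        return 0;
    return strcmp(a, b) == 0;
}
```

Here names_clash(KIND_EA, "MyEA1", KIND_FILE, "..MyEA1") is true: the EA MyEA1 and the file "..MyEA1" in MyDir end up with the same lookup name, which is the BANG case described earlier. A prefix only moves the collision to a different file name unless that prefix is also forbidden in file names.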
> > Now if you have distinct APIs for EAs you have no problems on that
> > side and if you don't use the slash but say the colon (like Windows
> > does) for named streams you get rid of the named streams in
> > directories problem, too. But then you need to forbid the ":" as an
> > accepted character in the file name just like Windows does which is
> > probably a reason not to use that API either...
> >
> >> *
> >> **
> >> **
> >>
> >> *Curtis:
> >> Let's face it: EAs exist. They are _not_ files/directories so the API
> >> *
> >> **
> >> *Is this an argument?
> >>
> >> EA's do not exist in Linux, and they should never exist as something
> >> that is more than a file. Since they do not exist, you might as well
> >> improve the filesystems you port to Linux while porting them. APIs
> >> shape an OS over the long term, and if done wrong they burden
> >> generations after you with crud.
> >
> >
> > Like Microsoft is going to let me change the NTFS specifications to
> > modify how EAs and named streams are stored. Dream on!
> >
> > But perhaps we are talking past each other: I am talking on-disk
> > format / specifications.
>
>
> I am NOT talking about on-disk format, I am talking about APIs and
> naming conventions. On disk format is entirely FS specific. Live free
> (errr, no, you are doing NTFS, live confined;-) ).....
>
> > These exist and no, we cannot change those at all. You can do that
> > with reiserfs as it is yours but all of us supporting existing file
> > systems owned by corporations like Microsoft, SGI, etc, have to live
> > with the specifications.
> >
> >> *
> >> **
> >> **
> >> *Curtis:
> >> should not make them appear as files/directories. - You have to
> >> consider that there are a lot of filesystems out there which are
> >> already developed and which need to be supported. - Not everyone has
> >> their own filesystem which they can change/extend the
> >> specifications/implementation of at will.
> >> *
> >> **
> >> *
> >> Yes they do. It is all GPL'd. Even XFS. Do the underlying
> >> infrastructure
> >> the right way, and I bet you'll be surprised at how little need there
> >> really
> >> is for ea's done the wrong way. A user space library can cover
> >> over it all (causing only the obsolete programs using it to suffer
> >> while they
> >> wait to fade away).
> >
> >
> > ?!? GPL has nothing to do with on-disk format and I doubt Microsoft
> > would agree that the ntfs on-disk layout is GPL. It's a trade secret!
> > Why do you think ntfs developers have to spend half their life using
> > disassemblers and hexeditors?!?
>
> Did you file comments in the various MS legal battles going on? You
> should..... (I know, there is only a small chance it will have an
> effect, but..... ) they should be required to give you the info, and if
> you don't demand it I bet they won't be so required. Did you notice how
> they are restricting things to only persons with a viable business in
> the opinion of MS?
They are restricting a lot of things. From what I have seen so far they
will end up not showing anything to anybody at all. This settlement is
complete garbage. But we are getting off topic...
> >> *
> >> What would have happened if set theory had not just sets and
> >> elements, but sets, elements, extended-attributes, and streams, and
> >> you could not use the same operators on streams that you use on
> >> elements? It would have been crap as a theoretical model. It does
> >> real damage when you add things that require different operators to
> >> the set of primitives. Closure is extremely important to design.
> >> Don't do this.
> > Since we are going into analogies: You don't use a hammer to affix a
> > screw and neither do you use a screwdriver to affix a nail...at least
> > I don't. I think you are trying to use a large sledge hammer to put
> > together things which do not fit together thus breaking them in the
> > process. To use your own words: Don't do this. (-; Each is distinct
> > and should be treated as such. </me ducks>
> >
> Programs will get written to use your API, and not work with reiserfs,
> and will get written to use our API and not work with NTFS, and this is
> bad....
Now that is true. And yes, it is bad. However it will be up to the
community to decide which API to use and at the moment there are several
fs using the "bestbits" API and only reiserfs (?) the "reiserfs" one...
And we all know from our very own $Deity that we don't design software, we
just write things and let evolution decide which is better. (((-;
> Thanks for the FS driver by the way, it is very useful to us dual-booters.
Thanks. (-: It is indeed, I being one of the dual-booters as well. (-:
Best regards,
Anton
--
Anton Altaparmakov <aia21 at cam.ac.uk> (replace at with @)
Linux NTFS maintainer / WWW: http://linux-ntfs.sf.net/
ICQ: 8561279 / WWW: http://www-stu.christs.cam.ac.uk/~aia21/
hi Stephen,
On Tue, Dec 11, 2001 at 01:47:58PM +0000, Stephen C. Tweedie wrote:
> > - Andreas made attempt #1 to get a system call interface agreed on
> > over a year ago now. He incorporated several people's suggestions,
> > but eventually the discussion got sidetracked, died and nothing
> > further happened;
>
> Yep, and I brought up all these points last time, too.
>
> > > But the ACL encoding is still hobbled: ...
> >
> > I have been on the acl-devel mailing list for a long time now,
> > and while these features all sound like good ideas or interesting
> > projects, I have never seen anyone post a patch or request any
> > specific changes to Andreas' ACL encoding in that time.
>
> It was proposed over a year ago on fsdevel-list. I've attached the
Yeah, I know - I've read it. What I was trying to say with "post a
patch or request any specific changes" is that we can make proposals
till the cows come home, but someone has to do the work and it seems
no one has implemented these things in the intervening year.
The encoding should cater for these future changes, for sure - through
a version or features field, or an ACL "family" concept, or whatever
(these things shouldn't pollute the generic EA interfaces though).
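
As a sketch of the "version field" idea only: the layout below is entirely hypothetical and is NOT Andreas' ACL encoding, which is not documented in this thread. It shows how a consumer of an opaque attribute value could reject encodings it does not understand without the generic EA interface knowing anything about ACLs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical version number for an illustrative encoding. */
#define EXAMPLE_ACL_VERSION 1u

/* Hypothetical header at the start of the attribute value (host byte order). */
struct example_acl_header {
    uint32_t version;
};

/*
 * Validate the header of an attribute value.
 * Returns 0 if the version is understood, -1 if the value is too short
 * to hold a header, -2 if the version is unknown.
 */
int example_acl_check(const void *blob, size_t len)
{
    struct example_acl_header hdr;

    assert(blob != NULL);
    if (len < sizeof hdr)
        return -1;

    /* Copy out rather than cast: the blob may not be suitably aligned. */
    memcpy(&hdr, blob, sizeof hdr);
    if (hdr.version != EXAMPLE_ACL_VERSION)
        return -2;
    return 0;
}
```

A real on-disk or on-wire format would also have to fix the byte order, which this sketch leaves as the host's.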
If the current POSIX ACL format isn't good enough for the needs that
you've foreseen, then it would be a good idea to suggest your changes
to Andreas directly.
Yes, it would help if the current format was documented. ;) But the
code is there and the format it implements hasn't changed at all in
the year that's gone.
> > ...
> > His POSIX ACL encoding has a version field in it
>
> Umm, and where in the EA man pages is this described? How does an
> application use the EA API? That's what I'm concerned about.
>
> > ...
> > so if/when some
> > people step forward to implement these features you've described,
> > and if they require changes to the format, then there should be no
> > reason they can't do it cleanly and in a filesystem-independent
> > manner, right?
>
> What format? There _is_ no defined format.
These are valid points - I have only read Andreas' code and I
don't believe the format he has chosen is documented anywhere.
It really should be and (imo) all system attributes should be
documented and incorporated into the extended attributes docs.
cheers.
--
Nathan
On Wed, 12 Dec 2001, Hans Reiser wrote:
[snip theories and pizzas]
> Ok, so I understand that what I am advocating is a lot of work, and a
> much harder path to take, and I understand why you feel you have enough
> work, and I think we can both respect each other for our positions.
>
> I'll try to convince you again when I have working code that isn't
> monstrous code, but allows users full choice, ok?
I am looking forward to it. Perhaps one day I will be able to show you
some alternative code in ntfs, if the issue is still open for
discussion then... (-:
Best regards,
Anton
--
Anton Altaparmakov <aia21 at cam.ac.uk> (replace at with @)
Linux NTFS maintainer / WWW: http://linux-ntfs.sf.net/
ICQ: 8561279 / WWW: http://www-stu.christs.cam.ac.uk/~aia21/
Anton Altaparmakov wrote:
>
>I was just stating a fact of how they are stored on NTFS, again something
>I have no power to change.
>
But does NTFS specificism/cripplism belong in VFS? (I in no way blame
you for NTFS's design:-) )
>>Well, gosh, okay, maybe you want to prepend ',,' to streams and '..' to
>>extended attributes. I personally think Linux would only want to do so
>>when used as a fileserver emulating NTFS/SAMBA. There is no enhancement
>>of user functionality from doing it for general purpose filesystems.
>>
>
>Just wait until this functionality is available and watch all GUI things
>start to use it en masse! I don't doubt that GNOME/KDE/replace with your
>favourite window manager are going to hesitate to start putting in the
>icon, the name, and whatnot inside EAs or inside named streams the instant
>they are ubiquitously available and I think that makes a lot of sense too.
>No doubt I will get flamed for saying this but all flames go to
>/dev/null...
>
>Both MacOS and as of recently Windows do this kind of stuff, too, and it
>can't be long before Linux goes the same way, provided file systems
>support the required features (i.e. EAs and/or named streams) so I
>disagree with you this is only a compatibility thing. It might start out
>as one but it will find real world applications very quickly...
>
I am not saying that the features of EAs are not useful, I am saying
that I want to choose them individually for particular files.
It could be so much better to have EDIBLE_PIZZA (example from previous
email) instead of just PIZZA, sigh.
>
>
>
>>>
>>Programs will get written to use your API, and not work with reiserfs,
>>and will get written to use our API and not work with NTFS, and this is
>>bad....
>>
>
>Now that is true. And yes, it is bad. However it will be up to the
>community to decide which API to use and at the moment there are several
>fs using the "bestbits" API and only reiserfs (?) the "reiserfs" one...
>And we all know from our very own $Deity that we don't design software, we
>just write things and let evolution decide which is better. (((-;
>
Fortunately he isn't entirely consistent on this point.:-)
I predict you guys will ship first and get a lot of usage, and then we
will ship later with more features, and the result will be a mess for
users. This is the usual evolutionary design standards mess.
Objectively, I understand it is highly reasonable for the Linux
community to assume that what we implement will be horrible until we
finish it. I would encourage it to assume that someone else will
eventually get orthogonalism right though, and I think it would be
worth waiting for it, because these are the sorts of design features
that stick around for 30 years. I don't really expect that most folks
will choose to wait though.
Best to all,
Hans
Hans Reiser wrote:
> I'll try to convince you again when I have working code that isn't
> monstrous code, but allows users full choice, ok?
It's a deal!
Thanks,
Curtis
--
Curtis Anderson [email protected]
On Wed, 12 Dec 2001, Hans Reiser wrote:
> Anton Altaparmakov wrote:
> >I was just stating a fact of how they are stored on NTFS, again something
> >I have no power to change.
> >
> But does NTFS specificism/cripplism belong in VFS?
No, of course not. But the vfs needs to be able to cope with limitations
of specific file systems (even if it is only by passing -Exyz into
userspace).
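
A minimal sketch of such a per-filesystem limit check, assuming the 64 KiB NTFS EA limit quoted earlier in the thread. The function name and the choice of E2BIG are assumptions (the thread only says "-Exyz"), and this is not code from any NTFS driver.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* The 64 KiB EA limit quoted in the thread for NTFS. */
#define EXAMPLE_NTFS_EA_MAX (64u * 1024u)

static_assert(EXAMPLE_NTFS_EA_MAX == 65536u, "64 KiB expressed in bytes");

/*
 * Hypothetical check a file system could run before storing an EA value.
 * Returns 0 if the size is acceptable, or a negative errno that the
 * generic layer would simply pass through to user space.
 */
int example_ea_size_check(size_t value_len)
{
    if (value_len > EXAMPLE_NTFS_EA_MAX)
        return -E2BIG;  /* assumed error code; the thread leaves it open */
    return 0;
}
```

The generic layer stays free of NTFS specifics: it calls the file system's check and forwards whatever negative errno comes back.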
> >>Well, gosh, okay, maybe you want to prepend ',,' to streams and '..' to
> >>extended attributes. I personally think Linux would only want to do so
> >>when used as a fileserver emulating NTFS/SAMBA. There is no enhancement
> >>of user functionality from doing it for general purpose filesystems.
> >
> >Just wait until this functionality is available and watch all GUI things
> >start to use it en masse! I don't doubt that GNOME/KDE/replace with your
> >favourite window manager are going to hesitate to start putting in the
> >icon, the name, and whatnot inside EAs or inside named streams the instant
> >they are ubiquitously available and I think that makes a lot of sense too.
> >No doubt I will get flamed for saying this but all flames go to
> >/dev/null...
> >
> >Both MacOS and as of recently Windows do this kind of stuff, too, and it
> >can't be long before Linux goes the same way, provided file systems
> >support the required features (i.e. EAs and/or named streams) so I
> >disagree with you this is only a compatibility thing. It might start out
> >as one but it will find real world applications very quickly...
> >
> I am not saying that the features of EAs are not useful, I am saying
> that I want to choose them individually for particular files.
>
> It could be so much better to have EDIBLE_PIZZA (example from previous
> email) instead of just PIZZA, sigh.
I am not quite sure what you mean. Surely you can just have all features
available at all times/to all files and then you just use the ones you
want, just ignoring/not using the rest. Why do you see the need for
"selecting features of EAs individually for particular files"? It makes
sense when buying EDIBLE_PIZZA but I don't see how that can be transferred
onto files. After all I can just have all pizza ingredients and only put
the ones I want on the pizza just ignoring the others.
Um, I think we might be saying the same thing in different words...
> >>Programs will get written to use your API, and not work with reiserfs,
> >>and will get written to use our API and not work with NTFS, and this is
> >>bad....
> >
> >Now that is true. And yes, it is bad. However it will be up to the
> >community to decide which API to use and at the moment there are several
> >fs using the "bestbits" API and only reiserfs (?) the "reiserfs" one...
> >And we all know from our very own $Deity that we don't design software, we
> >just write things and let evolution decide which is better. (((-;
> >
> Fortunately he isn't entirely consistent on this point.:-)
>
> I predict you guys will ship first and get a lot of usage, and then we
> will ship later with more features, and the result will be a mess for
> users. This is the usual evolutionary design standards mess.
Yes, your prediction will likely hold true IMO.
> Objectively, I understand it is highly reasonable for the Linux
> community to assume that what we implement will be horrible until we
> finish it. I would encourage it to assume that someone else will
> eventually get orthogonalism right though, and I think it would be
> worth waiting for it, because these are the sorts of design features
> that stick around for 30 years. I don't really expect that most folks
> will choose to wait though.
Me neither. People want it now, which pretty much limits the choice to
one of the things available and working now, plus some required cleanups
to satisfy all $Deities so the solution can be accepted in the kernel...
The one who comes first gets to populate the vacuum. Evolution at its
best. (-:
Best regards,
Anton
--
Anton Altaparmakov <aia21 at cam.ac.uk> (replace at with @)
Linux NTFS maintainer / WWW: http://linux-ntfs.sf.net/
ICQ: 8561279 / WWW: http://www-stu.christs.cam.ac.uk/~aia21/
Anton Altaparmakov wrote:
>On Wed, 12 Dec 2001, Hans Reiser wrote:
>
>>Anton Altaparmakov wrote:
>>
>>>I was just stating a fact of how they are stored on NTFS, again something
>>>I have no power to change.
>>>
>>But does NTFS specificism/cripplism belong in VFS?
>>
>
>No, of course not. But the vfs needs to be able to cope with limitations
>of specific file systems (even if it is only by passing -Exyz into
>userspace).
>
We agree on this, I have no opposition to NTFS checking size of files
used to store EAs and rejecting any write of more than 64k.
>
>
>>>>Well, gosh, okay, maybe you want to prepend ',,' to streams and '..' to
>>>>extended attributes. I personally think Linux would only want to do so
>>>>when used as a fileserver emulating NTFS/SAMBA. There is no enhancement
>>>>of user functionality from doing it for general purpose filesystems.
>>>>
>>>Just wait until this functionality is available and watch all GUI things
>>>start to use it en masse! I doubt that GNOME/KDE/replace with your
>>>favourite window manager are going to hesitate to start putting in the
>>>icon, the name, and whatnot inside EAs or inside named streams the instant
>>>they are ubiquitously available and I think that makes a lot of sense too.
>>>No doubt I will get flamed for saying this but all flames go to
>>>/dev/null...
>>>
>>>Both MacOS and as of recently Windows do this kind of stuff, too, and it
>>>can't be long before Linux goes the same way, provided file systems
>>>support the required features (i.e. EAs and/or named streams) so I
>>>disagree with you this is only a compatibility thing. It might start out
>>>as one but it will find real world applications very quickly...
>>>
>>I am not saying that the features of EAs are not useful, I am saying
>>that I want to choose them individually for particular files.
>>
>>It could be so much better to have EDIBLE_PIZZA (example from previous
>>email) instead of just PIZZA, sigh.
>>
>
>I am not quite sure what you mean. Surely you can just have all features
>available at all times/to all files and then you just use the ones you
>want, just ignoring/not using the rest. Why do you see the need for
>"selecting features of EAs individually for particular files"? It makes
>sense when buying EDIBLE_PIZZA but I don't see how that can be transferred
>onto files. After all I can just have all pizza ingredients and only put
>the ones I want on the pizza just ignoring the others.
>
Inheriting stat data from the parent directory should be a feature
available not just for streams, but for all files that want it.
Efficient small file access to a 32 byte file should be a feature
available to all files, not just EAs. Not being listed in readdir
should be a feature available to all files, not just EAs. Constraining
what is written to them should be a feature available to all files, not
just EAs, and arbitrary plugin based constraints should be possible.
Is this more clear?
Hans
At 12:02 12/12/01, Hans Reiser wrote:
>Anton Altaparmakov wrote:
>>On Wed, 12 Dec 2001, Hans Reiser wrote:
>>>Anton Altaparmakov wrote:
>>>>Both MacOS and as of recently Windows do this kind of stuff, too, and it
>>>>can't be long before Linux goes the same way, provided file systems
>>>>support the required features (i.e. EAs and/or named streams) so I
>>>>disagree with you this is only a compatibility thing. It might start out
>>>>as one but it will find real world applications very quickly...
>>>I am not saying that the features of EAs are not useful, I am saying
>>>that I want to choose them individually for particular files.
>>>
>>>It could be so much better to have EDIBLE_PIZZA (example from previous
>>>email) instead of just PIZZA, sigh.
>>
>>I am not quite sure what you mean. Surely you can just have all features
>>available at all times/to all files and then you just use the ones you
>>want, just ignoring/not using the rest. Why do you see the need for
>>"selecting features of EAs individually for particular files"? It makes
>>sense when buying EDIBLE_PIZZA but I don't see how that can be transferred
>>onto files. After all I can just have all pizza ingredients and only put
>>the ones I want on the pizza just ignoring the others.
>Inheriting stat data from the parent directory should be a feature
>available not just for streams, but for all files that want it. Efficient
>small file access to a 32 byte file should be a feature available to all
>files, not just EAs. Not being listed in readdir should be a feature
>available to all files, not just EAs. Constraining what is written to
>them should be a feature available to all files, not just EAs, and
>arbitrary plugin based constraints should be possible.
>
>Is this more clear?
Yes it is, thanks. And yes it makes sense. But this is talking about files
as a whole and has nothing to do with EAs as such (but it would obviously
apply to EAs, too under your proposed API).
I will be looking forward to seeing this stuff implemented. (-:
Anton
--
"I've not lost my mind. It's backed up on tape somewhere." - Unknown
--
Anton Altaparmakov <aia21 at cam.ac.uk> (replace at with @)
Linux NTFS Maintainer / WWW: http://linux-ntfs.sf.net/
ICQ: 8561279 / WWW: http://www-stu.christs.cam.ac.uk/~aia21/
Anton Altaparmakov wrote:
> At 12:02 12/12/01, Hans Reiser wrote:
>
>> Anton Altaparmakov wrote:
>>
>>> On Wed, 12 Dec 2001, Hans Reiser wrote:
>>>
>>>> Anton Altaparmakov wrote:
>>>>
>>>>> Both MacOS and as of recently Windows do this kind of stuff, too,
>>>>> and it
>>>>> can't be long before Linux goes the same way, provided file systems
>>>>> support the required features (i.e. EAs and/or named streams) so I
>>>>> disagree with you this is only a compatibility thing. It might
>>>>> start out
>>>>> as one but it will find real world applications very quickly...
>>>>
>>>> I am not saying that the features of EAs are not useful, I am
>>>> saying that I want to choose them individually for particular files.
>>>>
>>>> It could be so much better to have EDIBLE_PIZZA (example from
>>>> previous email) instead of just PIZZA, sigh.
>>>
>>>
>>> I am not quite sure what you mean. Surely you can just have all
>>> features
>>> available at all times/to all files and then you just use the ones you
>>> want, just ignoring/not using the rest. Why do you see the need for
>>> "selecting features of EAs individually for particular files"? It makes
>>> sense when buying EDIBLE_PIZZA but I don't see how that can be
>>> transferred
>>> onto files. After all I can just have all pizza ingredients and only
>>> put
>>> the ones I want on the pizza just ignoring the others.
>>
>> Inheriting stat data from the parent directory should be a feature
>> available not just for streams, but for all files that want it.
>> Efficient small file access to a 32 byte file should be a feature
>> available to all files, not just EAs. Not being listed in readdir
>> should be a feature available to all files, not just EAs.
>> Constraining what is written to them should be a feature available to
>> all files, not just EAs, and arbitrary plugin based constraints
>> should be possible.
>>
>> Is this more clear?
>
>
> Yes it is, thanks. And yes it makes sense. But this is talking about
> files as a whole and has nothing to do with EAs as such (but it would
> obviously apply to EAs, too under your proposed API).
There would be no need for EAs if files as a whole could have these
properties, as EAs would just be particular files with particular
properties within a directory/file object.
>
>
> I will be looking forward to seeing this stuff implemented. (-:
>
> Anton
>
>
If Linux users get really unlucky, which seems likely, :-/, 2.6 will
take as long as 2.4, in which case I think we will complete the task in
plenty of time for 2.6, and we can ask Linus which implementation he
prefers before he has committed to one in a stable release.
Hans
On Wed, Dec 12, 2001 at 12:21:49AM +0300, Hans Reiser wrote:
> Naming conventions are easy.
Hans,
While I look forward to your work, I think Anton points out some
issues that you really should try to address now, only you have not
understood them. Can I take a crack at posing some concrete
questions that manifest the issues?
Let's imagine that we have a Linux system with an NTFS filesystem
and a reiserfs4 filesystem. You can make any tentative assumptions
about reiserfs4 and new API's that you like, I just want to have an
idea of how you envision the following working:
First, I write a desktop application that wants to save an HTML file
along with some other object that contains the name of the creating
application. The latter can go anywhere you want, except in the
same stream as the HTML file. The user has requested that the
filename be /home/user/foo.html , and expects to be able to FTP this
file to his ISP with a standard FTP program. What calls does my
application make to store the HTML and the application name? If the
answer is different depending on whether /home/user is NTFS or
reiserfs4, explain both ways.
Second, I booted NT and created a directory in the NTFS filesystem
called /foo . In the directory, I created a file called bar. I
also created a named stream called bar, and an extended attribute
called bar. Now I boot Linux. What calls do I make to see each of
the three objects called bar?
The heart of Anton's argument is that the UNIX filesystem name space
is basically used up--there's just not much room to add new
semantics. The only obvious avenue for extension is, if /foo is not
a directory, you can give some interpretation to /foo/bar . But
this doesn't help if /foo is a directory. So something has to give,
and we want to see what will give in reiserfs4.
Andrew
Andrew Pimlott wrote:
>On Wed, Dec 12, 2001 at 12:21:49AM +0300, Hans Reiser wrote:
>
>>Naming conventions are easy.
>>
>
>Hans,
>
>While I look forward to your work, I think Anton points out some
>issues that you really should try to address now, only you have not
>understood them. Can I take a crack at posing some concrete
>questions that manifest the issues?
>
>Let's imagine that we have a Linux system with an NTFS filesystem
>and a reiserfs4 filesystem. You can make any tentative assumptions
>about reiserfs4 and new API's that you like, I just want to have an
>idea of how you envision the following working:
>
>First, I write a desktop application that wants to save an HTML file
>along with some other object that contains the name of the creating
>application. The latter can go anywhere you want, except in the
>same stream as the HTML file. The user has requested that the
>filename be /home/user/foo.html , and expects to be able to FTP this
>file to his ISP with a standard FTP program. What calls does my
>application make to store the HTML and the application name? If the
>answer is different depending on whether /home/user is NTFS or
>reiserfs4, explain both ways.
>
Are you sure that standard ftp will be able to handle extended
attributes without modification?
One approach is to create a plugin called ..archive that when read is a
virtual file consisting of an archive of everything in the directory.
It would be interesting I think to attach said plugin to standard
directories by default along with several other standard plugins like
..cat, etc.
>
>
>Second, I booted NT and created a directory in the NTFS filesystem
>called /foo . In the directory, I created a file called bar. I
>also created a named stream called bar, and an extended attribute
>called bar. Now I boot Linux. What calls do I make to see each of
>the three objects called bar?
>
You access /foo/bar, /foo/bar/,,bar, /foo/..bar by name.
>
>
>The heart of Anton's argument is that the UNIX filesystem name space
>is basically used up--there's just not much room to add new
>semantics. The only obvious avenue for extension is, if /foo is not
>a directory, you can give some interpretation to /foo/bar . But
>this doesn't help if /foo is a directory. So something has to give,
>and we want to see what will give in reiserfs4.
>
>Andrew
>
>
Naming conventions are easy, but teaching user space is hard no matter
whose scheme is used.
Good morning to everyone.
I was thinking about the idea of sub-ids to enable users to run "untrusted"
or "dangerous" binaries without risk to their files/privacy.
I had an idea for a low-profile, almost non-invasive way to implement it that
should be transparent to almost all user-space applications. Let me explain
with an example.
Let's add to task_struct another array like groups[NGROUPS], calling it
slave_uids[NSLAVES]. Add a (privileged) syscall, addslave(uid_t, uid_t),
that can fill that array.
Now, the user space configuration is the following:
I am romano, uid 300.
There is another user (or several), for example r-slave, uid 3001, with no
login shell and with home dir in ~romano/r-slave.
When I log in as romano, the login binary calls addslave(300, 3001), after
looking at an /etc/slaves file that has a line saying romano:r-slave
Now change the kernel so that:
1) user romano can do a setuid() call to become any one of his slaves, with
no way back possible.
2) user romano can chown() files owned by him or by any of his slaves to
romano's uid or to any of the slave uids.
And that's all. All the other strange file management that user romano would
want to do in the slave-id environment, he can do by doing a kind-of-su(*)
to one of his slaves (with setuid) and then playing in the restricted
environment. If you add ACLs to this(**), you could easily fine-control what
the untrusted binary can see of your environment; add the per-vfsmount ro-flag
and it gives you a lot of flexibility.
This should be a change simple enough for the kernel, and for the userspace
too: just change login to add the addslave call, and the tools that need to
spawn untrusted binaries can do a setuid() to a slave before the exec().
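A minimal user-space sketch of the rules above, not kernel code. It models
the proposed slave_uids[] array and the two permission changes: setuid() to a
slave with no way back, and chown() among the master and slave uids. The names
struct task_model, model_addslave(), model_setuid() and may_chown(), and the
value chosen for NSLAVES, are assumptions made for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>

#define NSLAVES 8 /* assumed bound, by analogy with NGROUPS */

/* Hypothetical stand-in for the fields the proposal adds to task_struct. */
struct task_model {
    uid_t uid;
    uid_t slave_uids[NSLAVES];
    size_t nslaves;
};

/* Models addslave(master, slave). The real call would be privileged
 * (made by login); that check is left out of this model. */
int model_addslave(struct task_model *t, uid_t master, uid_t slave)
{
    assert(t != NULL);
    if (t->uid != master || t->nslaves >= NSLAVES)
        return -1;
    t->slave_uids[t->nslaves++] = slave;
    return 0;
}

/* True if uid is the task's own uid or one of its slaves. */
static bool is_own_or_slave(const struct task_model *t, uid_t uid)
{
    if (uid == t->uid)
        return true;
    for (size_t i = 0; i < t->nslaves; i++)
        if (t->slave_uids[i] == uid)
            return true;
    return false;
}

/* Rule 1: a task may become any of its slaves, with no way back. */
int model_setuid(struct task_model *t, uid_t target)
{
    if (target == t->uid)
        return 0;
    if (!is_own_or_slave(t, target))
        return -1;
    t->uid = target;
    t->nslaves = 0; /* the slave inherits no slaves, so it cannot return */
    return 0;
}

/* Rule 2: chown() is allowed only among the master and its slaves. */
bool may_chown(const struct task_model *t, uid_t owner, uid_t new_owner)
{
    return is_own_or_slave(t, owner) && is_own_or_slave(t, new_owner);
}
```

With uid 300 and slave 3001 registered, may_chown() accepts 300 -> 3001 and
3001 -> 300 but refuses any transfer involving another uid, and after
model_setuid() to 3001 a later attempt to return to 300 fails.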
Is there something that I am missing here?
Romano
(*) probably to be called giu... (sorry, Italian-speaking only joke: /su/
means 'up' in Italian, and /giu/ means 'down').
(**) and without ACLs, if you make a parallel thing for gids, you can
probably fine-tune access in the old-style way, provided that the system
set-up is the "every user in its own group" style.
--
Romano Giannetti - Univ. Pontificia Comillas (Madrid, Spain)
Electronic Engineer - phone +34 915 422 800 ext 2416 fax +34 915 411 132
On Thu, Dec 13, 2001 at 11:36:16AM +0100, Romano Giannetti wrote:
> I am romano, uid 300.
> There is another user (or several), for example r-slave, uid 3001, with no
> login shell and with home dir in ~romano/r-slave.
It would be so much nicer to be able to do this on-the-fly, rather than
having to create the user and its home directory first.
However, I think one must first start with figuring out what
functionality we want:
1 do we want the "slave" to be able to read the user's files
2 do we want the "slave" to be able to write the user's files
3 do we want the "slave" to keep its own configuration files
And I think the answers are:
1. No. It would make it possible for broken/evil programs
to steal your data.
2. Definitely not
3. No - it would cause different "slave" processes to interact.
It should rather use the user's regular configuration files.
And we end up with a different solution:
    olduid = getuid();
    /* Allocate a uid with no privileges */
    slaveuid = setruid_slave();
    set_acl("private-file", ACL_READ, slaveuid);
    set_acl("private-log", ACL_APPEND, slaveuid);
    seteuid(slaveuid);
    exec("dangerous-program");
This should also be possible to implement with minimal impact. All you
need is a new system call to allocate a uid for the slave. This means you
need to reserve some uids for this purpose, but with 32bit uids......
A possible add-on would be a system call to free the uid when it is no
longer in use, so it can be reused safely.
An alternative would be to not give the new uid access to the files, but
just open them before doing exec. This way it is safe to run multiple
slaves with the same uid at once, and it doesn't rely on ACLs! The
downside is that it needs cooperation from the dangerous-program, while
the above could work as long as the wrapper (e.g. a browser) took the
appropriate steps.
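The open-before-exec alternative can be sketched with existing calls only.
This is an illustration under assumptions, not part of the proposal: the
helper names open_for_slave() and exec_as_slave() are made up here, and on a
stock kernel setuid() to a different uid succeeds only for a privileged
caller, so the second helper stands in for whatever mechanism hands out the
slave uid.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Open a file the untrusted program may read, and clear the close-on-exec
 * flag so the descriptor is still open after exec(). Returns -1 on error. */
int open_for_slave(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    int flags = fcntl(fd, F_GETFD);
    if (flags < 0 || fcntl(fd, F_SETFD, flags & ~FD_CLOEXEC) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Hypothetical wrapper step: switch to the slave uid, then run the program.
 * Descriptors opened with open_for_slave() stay usable in the new program
 * even though the slave uid itself has no access to the files. */
int exec_as_slave(uid_t slave_uid, char *const argv[])
{
    assert(argv != NULL && argv[0] != NULL);
    if (setuid(slave_uid) != 0)
        return -1;
    execv(argv[0], argv);
    return -1; /* only reached if execv() failed */
}
```

The wrapper would call open_for_slave() for each file it wants to expose,
tell the program which descriptor numbers to use (for example on its command
line), and then call exec_as_slave(). That agreement on descriptor numbers is
the cooperation from the dangerous-program mentioned above.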
--
Ragnar Kjørstad
Big Storage
On Thu, Dec 13, 2001 at 12:23:46PM +0300, Hans Reiser wrote:
> Andrew Pimlott wrote:
> >First, I write a desktop application that wants to save an HTML file
> >along with some other object that contains the name of the creating
> >application. The latter can go anywhere you want, except in the
> >same stream as the HTML file. The user has requested that the
> >filename be /home/user/foo.html , and expects to be able to FTP this
> >file to his ISP with a standard FTP program. What calls does my
> >application make to store the HTML and the application name? If the
> >answer is different depending on whether /home/user is NTFS or
> >reiserfs4, explain both ways.
> >
> Are you sure that standard ftp will be able to handle extended
> attributes without modification?
No, the ftp program only needs to transfer the HTML part.
> One approach is to create a plugin called ..archive that when read is a
> virtual file consisting of an archive of everything in the directory.
Ok, does this mean that every directory in the filesystem (or in
some part of it) will automatically have a node ..archive?
Presumably, it will not appear in directory listings, but can be
read but not written to? Does this mean that a legacy application
(pathological as it may be) that expects to be able to create a file
called ..archive will fail?
Or do you mean that the application would explicitly create the node
associated with this plugin?
> It would be interesting I think to attach said plugin to standard
> directories by default along with several other standard plugins like
> ..cat, etc.
Anyway, you didn't answer the part I really care about. What calls
does the application make to store the HTML and the "extended
attribute"? You can pick whatever conventions you want, just give
me an example.
> >Second, I booted NT and created a directory in the NTFS filesystem
> >called /foo . In the directory, I created a file called bar. I
> >also created a named stream called bar, and an extended attribute
> >called bar. Now I boot Linux. What calls do I make to see each of
> >the three objects called bar?
> >
>
> You access /foo/bar, /foo/bar/,,bar, /foo/..bar by name.
How do I access the file called ..bar (created in NT) in the
directory /foo?
(Anton, does NTFS define any reserved filename characters, or only
win32?)
Andrew
On Thu, Dec 13, 2001 at 02:37:52PM +0100, Ragnar Kjørstad wrote:
> On Thu, Dec 13, 2001 at 11:36:16AM +0100, Romano Giannetti wrote:
> > I am romano, uid 300.
> > There is another user (or several), for example r-slave, uid 3001, with no
> > login shell and with home dir in ~romano/r-slave.
>
> It would be so much nicer to be able to do this on-the-fly, rather than
> having to create the user and its home directory first.
Yes, this could be nice.
> However, I think one must first start with figuring out what
> functionality we want:
> 1 do we want the "slave" to be able to read the user's files
Yes, but _by default_ the slave process could read only the files that you
have made world readable (or group readable, if the slave is in the same group
as you, which probably is not a good idea). So you could decide which files
it can access and which not.
> 2 do we want the "slave" to be able to write the user's files
Generally no, but you can create a dir where the slave uid can create files
(think of a Java applet that needs temporary files, etc...)
> 3 do we want the "slave" to keep its own configuration files
Define the slave uid to have the same home dir as the main user...
>
> This should also be possible to implement with minimal impact. All you
> need is a new system call to allocate a uid for the slave. This means you
> need to reserve some uids for this purpose, but with 32bit uids......
>
Yes, but then the slave process is _very_ limited. It could need
to read/map dynamic libraries, for example; with my approach the slave uid
processes are processes that have full citizenship and that can do
anything a process can do, but under a different name than the user. Root
sometimes uses "nobody" to this end; my proposal is to extend this to
every (unprivileged) user in a safe way. Then, you can create a chrooted
environment for the new process and tailor the level of access it has
depending on the needs.
Romano
--
Romano Giannetti - Univ. Pontificia Comillas (Madrid, Spain)
Electronic Engineer - phone +34 915 422 800 ext 2416 fax +34 915 411 132
On Thu, 13 Dec 2001 11:36:16 -0500, Romano Giannetti wrote
> Good morning to everyone.
>
> I was thinking about the idea of sub-ids to enable users to run "untrusted"
> or "dangerous" binaries without risk to their files/privacy.
I have another solution, which is almost completed. I am combining
two projects. One is the vserver project (see the url in my signature) and the
other is the AclFS component of the virtualfs project
(http://www.solucorp.qc.ca/virtualfs).
Using the chcontext utility from the vserver project, you can isolate a process
from the rest of the system, including the other user processes.
For example, as a normal user, you can do
xterm &
/usr/sbin/chcontext /bin/sh
ps ax
killall xterm
and you only see your new shell, the ps command and init. The killall fails,
finding no xterm to kill.
Another part of the vserver project is the capability ceiling, which is a way to
turn off some capabilities for a process and its children, even setuid children.
I was thinking about introducing a new capability CAP_OPEN. Without this
capability, any open system call would fail. Wow. Now that's secure :-)
The aclfs daemon (aclfsd) works using a unix domain socket. Using a preload
object, the client makes various system call requests to aclfsd, including open,
socket and so on. If aclfsd grants the access, it opens the file and passes back
the file handle using the socket. So the client does not need to open the file
itself.
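Passing an open file over a unix domain socket is normally done with
SCM_RIGHTS ancillary data. The sketch below shows that mechanism in isolation;
it is not taken from aclfsd, and the helper names send_fd() and recv_fd() are
assumptions for illustration.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Send the open descriptor fd over the unix domain socket sock.
 * One dummy data byte is sent along with the control message. */
int send_fd(int sock, int fd)
{
    assert(fd >= 0);
    char byte = 0;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union {
        struct cmsghdr align; /* forces correct alignment of buf */
        char buf[CMSG_SPACE(sizeof(int))];
    } ctl;
    struct msghdr msg;

    memset(&msg, 0, sizeof msg);
    memset(&ctl, 0, sizeof ctl);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctl.buf;
    msg.msg_controllen = sizeof ctl.buf;

    struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive a descriptor sent with send_fd(). Returns the new descriptor,
 * or -1 if no descriptor arrived. */
int recv_fd(int sock)
{
    char byte;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union {
        struct cmsghdr align;
        char buf[CMSG_SPACE(sizeof(int))];
    } ctl;
    struct msghdr msg;

    memset(&msg, 0, sizeof msg);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctl.buf;
    msg.msg_controllen = sizeof ctl.buf;

    if (recvmsg(sock, &msg, 0) != 1)
        return -1;

    struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
    if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS ||
        cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
        return -1;

    int fd;
    memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
    return fd;
}
```

In the scheme described above, the daemon side would call send_fd() after
deciding to grant a request, and the preload object in the client would call
recv_fd() and hand the result back as if open() had returned it.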
So CAP_OPEN is there to force the client to use aclfsd. Even though using
aclfsd is transparent to normal clients, some clients might make direct calls
to the OS. All those calls would fail.
Note also that aclfsd does not need any privilege. A normal user may start
it with his own configuration (access privileges).
Ultimately, one goal of this would be to run your favorite browser in a security
box and allow fine-grained access to your own files. Then one could do the so
cool thing Windows users do all the time: they visit a site, select a plugin
and run it. Unlike Windows, you would not get all the viruses though :-)
Anyway, the vserver and virtualfs projects are used for different purposes today
but could be combined to achieve this kind of result.
---------------------------------------------------------
Jacques Gelinas <[email protected]>
vserver: run general purpose virtual servers on one box, full speed!
http://www.solucorp.qc.ca/miscprj/s_context.hc
On Thu, Dec 13, 2001 at 05:06:29PM +0100, Romano Giannetti wrote:
> > 2 do we want the "slave" to be able to write the user's files
>
> Generally no, but you can create a dir where the slave uid can create files
> (think of a Java applet that needs temporary files, etc...)
I think generally temporary files should go to /tmp and not the home
directory, but yes, there may be reasons to write to specific files in
the home directory as well.
> > This should also be possible to implement with minimal impact. All you
> > need is a new system call to allocate a uid for the slave. This means you
> > need to reserve some uids for this purpose, but with 32bit uids......
>
> Yes, but then the slave process is _very_ limited. It could need
> to read/map dynamic libraries, for example; with my approach the slave uid
> processes are processes that have full citizenship and that can do
> anything a process can do, but under a different name than the user. Root
> sometimes uses "nobody" to this end; my proposal is to extend this to
> every (unprivileged) user in a safe way. Then, you can create a chrooted
> environment for the new process and tailor the level of access it has
> depending on the needs.
Why would the slave not be able to read/map dynamic libraries in my
scheme? Such files should be readable by everyone, so I don't see the
problem?
With ACL support I don't see this being limited at all. The process can
be given any rights you desire before changing its effective userid.
--
Ragnar Kjørstad
Big Storage
Andrew Pimlott wrote:
>On Thu, Dec 13, 2001 at 12:23:46PM +0300, Hans Reiser wrote:
>
>>Andrew Pimlott wrote:
>>
>>>First, I write a desktop application that wants to save an HTML file
>>>along with some other object that contains the name of the creating
>>>application. The latter can go anywhere you want, except in the
>>>same stream as the HTML file. The user has requested that the
>>>filename be /home/user/foo.html , and expects to be able to FTP this
>>>file to his ISP with a standard FTP program. What calls does my
>>>application make to store the HTML and the application name? If the
>>>answer is different depending on whether /home/user is NTFS or
>>>reiserfs4, explain both ways.
>>>
>>Are you sure that standard ftp will be able to handle extended
>>attributes without modification?
>>
>
>No, the ftp program only needs to transfer the HTML part.
>
>>One approach is to create a plugin called ..archive that when read is a
>>virtual file consisting of an archive of everything in the directory.
>>
>
>Ok, does this mean that every directory in the filesystem (or in
>some part of it) will automatically have a node ..archive?
>Presumably, it will not appear in directory listings, but can be
>read but not written to? Does this mean that a legacy application
>(pathological as it may be) that expects to be able to create a file
>called ..archive will fail?
>
I remember that I used to be a sysadmin with some NetApp boxes that have
a .snapshot directory that is invisible, and has special qualities.
It worked. There were no namespace collision problems. None.
These things can be survived by users.;-)
Nothing I say should be construed to mean that I think that a particular
name for a pseudo-file implemented by the default regular directory
plugin is what should ship. I am easy in such matters. You can also
get me to agree it should be modifiable, so that if Joe Sevenpack needs
a file named ..archive, he can have it.
>
>
>Or do you mean that the application would explicitly create the node
>associated with this plugin?
>
Both. If you want a file named '..glob' that does the same thing as
'..archive', go for it. I am not necessarily committed to putting
..archive in the default directory plugin (actually, I don't like that
name, it should be something snappier, but I haven't thought of it). I
also am not funded to implement ..archive at the moment (I am funded to
do inheritance though) .
>
>
>>It would be interesting I think to attach said plugin to standard
>>directories by default along with several other standard plugins like
>>..cat, etc.
>>
>
>Anyway, you didn't answer the part I really care about. What calls
>does the application make to store the HTML and the "extended
>attribute"? You can pick whatever conventions you want, just give
>me an example.
>
read, write, etc., on file.html/..joes_attribute, unless it is a
particular attribute that has particular effects on the particular
plugin for file.html, in which case it all depends on the plugin and the
constraints imposed on joes_attribute. It may be that modifying
file.html modifies ..joes_attribute as a side-effect, plugins can do
anything in response to a VFS operation. You put the plugin into your
kernel, you'd better be able to trust it....
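For concreteness, the naming convention discussed in this thread ('..' in
front of an extended attribute name, ',,' in front of a stream name) can be
sketched as plain path construction. The helpers ea_path() and stream_path()
are hypothetical and only build the strings; nothing here is an existing
reiserfs interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Build "<object>/..<name>", the path of an extended attribute under the
 * convention from the thread. Returns -1 if out is too small. */
int ea_path(char *out, size_t outlen, const char *object, const char *name)
{
    assert(out != NULL && object != NULL && name != NULL);
    int n = snprintf(out, outlen, "%s/..%s", object, name);
    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}

/* Build "<object>/,,<name>", the path of a named stream. */
int stream_path(char *out, size_t outlen, const char *object, const char *name)
{
    assert(out != NULL && object != NULL && name != NULL);
    int n = snprintf(out, outlen, "%s/,,%s", object, name);
    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

An application would then use ordinary open(), read() and write() on the
resulting path. On a filesystem that does not treat a regular file as
something that can be descended into, open() on such a path fails with
ENOTDIR, so the caller has to be prepared for that.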
>
>
>>>Second, I booted NT and created a directory in the NTFS filesystem
>>>called /foo . In the directory, I created a file called bar. I
>>>also created a named stream called bar, and an extended attribute
>>>called bar. Now I boot Linux. What calls do I make to see each of
>>>the three objects called bar?
>>>
>>You access /foo/bar, /foo/bar/,,bar, /foo/..bar by name.
>>
>
>How do I access the file called ..bar (created in NT) in the
>directory /foo?
>
If you have permission, you can:
cat /foo/..bar
Or you can use the efficient for small files API we are implementing,
which I won't go into here.
>
>
>(Anton, does NTFS define any reserved filename characters, or only
>win32?)
>
>Andrew
>
>
At 15:27 13/12/01, Andrew Pimlott wrote:
>(Anton, does NTFS define any reserved filename characters, or only
>win32?)
It does. RTFS. (-8
From ntfs-driver-tng/linux/fs/ntfs/layout.h:
/*
 * The maximum allowed length for a file name.
 */
#define MAXIMUM_FILE_NAME_LENGTH 255
/*
 * Possible namespaces for filenames in ntfs (8-bit).
 */
typedef enum {
        FILE_NAME_POSIX = 0x00,
                /* This is the largest namespace. It is case sensitive and
                   allows all Unicode characters except for: '\0' and '/'.
                   Beware that in WinNT/2k files which eg have the same name
                   except for their case will not be distinguished by the
                   standard utilities and thus a "del filename" will delete
                   both "filename" and "fileName" without warning. */
        FILE_NAME_WIN32 = 0x01,
                /* The standard WinNT/2k NTFS long filenames. Case
                   insensitive.
                   All Unicode chars except: '\0', '"', '*', '/', ':', '<',
                   '>', '?', '\' and '|'. Further, names cannot end with a '.'
                   or a space. */
        FILE_NAME_DOS = 0x02,
                /* The standard DOS filenames (8.3 format). Uppercase only.
                   All 8-bit characters greater space, except: '"', '*', '+',
                   ',', '/', ':', ';', '<', '=', '>', '?' and '\'. */
        FILE_NAME_WIN32_AND_DOS = 0x03,
                /* 3 means that both the Win32 and the DOS filenames are
                   identical and hence have been saved in this single filename
                   record. */
} __attribute__ ((__packed__)) FILE_NAME_TYPE_FLAGS;
The whole of layout.h can be viewed here (link to view of CVS):
http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/linux-ntfs/ntfs-driver-tng/linux/fs/ntfs/layout.h?rev=1.6&content-type=text/vnd.viewcvs-markup
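As an illustration of the rules in those comments, here is a small check for
the POSIX and Win32 namespaces. It is a sketch under assumptions: it handles
plain ASCII C strings only (the real names are Unicode), it is not code from
the NTFS driver, and the function names posix_name_ok() and win32_name_ok()
are made up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAXIMUM_FILE_NAME_LENGTH 255

/* POSIX namespace: everything except '\0' (the terminator) and '/'. */
bool posix_name_ok(const char *name)
{
    assert(name != NULL);
    size_t len = strlen(name);
    if (len == 0 || len > MAXIMUM_FILE_NAME_LENGTH)
        return false;
    return strchr(name, '/') == NULL;
}

/* Win32 namespace: additionally excludes " * : < > ? \ | and forbids a
 * trailing '.' or space. */
bool win32_name_ok(const char *name)
{
    assert(name != NULL);
    static const char forbidden[] = "\"*/:<>?\\|";
    size_t len = strlen(name);
    if (len == 0 || len > MAXIMUM_FILE_NAME_LENGTH)
        return false;
    if (strpbrk(name, forbidden) != NULL)
        return false;
    return name[len - 1] != '.' && name[len - 1] != ' ';
}
```

Note that both "..bar" and ",,bar" pass both checks, so a file created under
NT can legitimately carry either name. That is the collision the earlier
question about a file called ..bar in /foo is getting at.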
Anton
--
"I've not lost my mind. It's backed up on tape somewhere." - Unknown
--
Anton Altaparmakov <aia21 at cam.ac.uk> (replace at with @)
Linux NTFS Maintainer / WWW: http://linux-ntfs.sf.net/
ICQ: 8561279 / WWW: http://www-stu.christs.cam.ac.uk/~aia21/
Romano Giannetti wrote:
>I was thinking about the idea of sub-ids to enable users to run "untrusted"
>or "dangerous" binaries without risk to their files/privacy.
Can I point you to some work I've done on this topic?
We built a tool called Janus (alas, it required kernel patches).
http://www.cs.berkeley.edu/~daw/janus/
Your basic goal seems like a good one: it'd be really nice if you could
run untrusted code in a sandbox that nothing can escape. Based on
our experience, though, I've come to believe you probably want a more
sophisticated solution than the one you outlined.
First, the 'nobody' userid (and equivalents) leave a lot to be desired.
A troubling number of system resources can be accessed by 'nobody': there
is usually an enormous quantity of world-readable files; more troubling,
there are tons of world-executable setuid programs, and it's hard with a
purely userid-based mechanism to be sure that they won't provide an escape
hatch; not to mention other resources, such as interprocess communication,
network sockets, and so on.
The conclusion we came to is that you really need something more powerful
than the existing access control measures. Unix systems are really not
very good at preventing attacks by local users.
A second claim is that you really want to start from the Principle
of Least Privilege: give the untrusted process the absolute minimum
privilege necessary for it to accomplish its task, and nothing more.
Userid-based mechanisms do a lousy job at achieving this.
This, by the way, is analogous to the "default deny" policy that you may
be familiar with from the firewalls world: if you start by giving the
untrusted process zero privileges and then explicitly declare only the
ones you want allowed, you greatly reduce the risk that the untrusted
process can escape and cause harm in some way you didn't expect.
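To make the "default deny" idea concrete, here is a minimal sketch of a
policy check in C. The operation classes, the rule table, and the names
policy_allows() and sandbox_rules are all hypothetical and made up for
illustration; none of them come from Janus or any real policy language.
The only point is that the fall-through case is "deny".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical operation classes, for illustration only. */
enum op { OP_READ, OP_WRITE, OP_EXEC, OP_CONNECT };

struct rule {
    enum op op;          /* operation this rule allows */
    const char *prefix;  /* resource-name prefix this rule allows */
};

/*
 * Return true only if some rule explicitly allows (op, resource).
 * Assumes `resource` is already canonical (no "..", no symlinks);
 * a real monitor would have to resolve names before matching.
 */
static bool policy_allows(const struct rule *rules, size_t n,
                          enum op op, const char *resource)
{
    assert(resource != NULL);
    for (size_t i = 0; i < n; i++) {
        if (rules[i].op == op &&
            strncmp(resource, rules[i].prefix,
                    strlen(rules[i].prefix)) == 0)
            return true;
    }
    return false;  /* default deny: nothing matched, so refuse */
}

/* Example allow-list: everything not named here is refused. */
static const struct rule sandbox_rules[] = {
    { OP_READ,  "/usr/lib/" },
    { OP_READ,  "/tmp/sandbox/" },
    { OP_WRITE, "/tmp/sandbox/" },
};
```

With this table, a write to /tmp/sandbox/out.txt is allowed, while a
write to /etc/passwd or any network connect is refused simply because
no rule mentions it.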
A third observation is that you need to control access to a lot more
than just the filesystem. You want to control the network (prevent the
spread of viruses; and if anyone uses IP-based authentication, or if
your machine is inside a firewall, prevent the untrusted process from
abusing the good name of the local host). You want to control resources
like IPC, signals, resource usage, and so on. And I claim you want more
fine-grained control than POSIX capabilities provide.
A fourth observation is that in practice it's useful to provide more
than just isolation: you often also want to allow some limited degree
of sharing between trusted and untrusted processes. chroot() is not
so good in this respect, even apart from the fact that it protects only
the filesystem and not any of the other resources on the system.
The Janus approach is to interpose on system calls to impose a more
restrictive security policy. We use ptrace() and the like to do this
from userspace. It's a little clunky, especially since support for
process tracing on Linux has shortcomings, but it works. We've run
a web browser, a web server, etc., inside the restricted environment
Janus provides. Janus is just one approach, of course, and there are
a number of other projects that have followed related directions (DTE,
consh, mapbox, SubDomain, SELinux, etc.).
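For readers who haven't used ptrace(), the following is a rough sketch
of the basic mechanics of interposing on system calls from userspace
on Linux. It is not Janus code, and the function name
trace_syscall_stops() is made up for this example. It only counts the
syscall stops the monitor sees; each stop is the point where a
Janus-style monitor would inspect the call and decide whether to allow
it. Fetching the syscall number and arguments is architecture-specific
and is left out.

```c
#define _GNU_SOURCE  /* expose fork()/kill() under strict -std= modes */
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Run argv[0] under ptrace and return the number of syscall stops
 * (entries and exits) observed, or -1 on error.
 */
static long trace_syscall_stops(char *const argv[])
{
    assert(argv != NULL && argv[0] != NULL);

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Child: ask to be traced, then stop until the monitor is ready. */
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) < 0)
            _exit(127);
        raise(SIGSTOP);
        execv(argv[0], argv);
        _exit(127);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0 || !WIFSTOPPED(status))
        return -1;

    /* Make syscall stops distinguishable: they report SIGTRAP|0x80. */
    if (ptrace(PTRACE_SETOPTIONS, pid, NULL,
               (void *)(long)PTRACE_O_TRACESYSGOOD) < 0) {
        kill(pid, SIGKILL);
        waitpid(pid, &status, 0);
        return -1;
    }

    long stops = 0;
    int sig = 0;  /* signal to deliver on resume; 0 means none */
    for (;;) {
        /* Resume the child until its next syscall entry or exit. */
        if (ptrace(PTRACE_SYSCALL, pid, NULL, (void *)(long)sig) < 0) {
            kill(pid, SIGKILL);
            waitpid(pid, &status, 0);
            return -1;
        }
        if (waitpid(pid, &status, 0) < 0)
            return -1;
        if (WIFEXITED(status) || WIFSIGNALED(status))
            return stops;

        sig = 0;
        if (WIFSTOPPED(status)) {
            int stopsig = WSTOPSIG(status);
            if (stopsig == (SIGTRAP | 0x80))
                stops++;        /* a policy decision would go here */
            else if (stopsig != SIGTRAP)
                sig = stopsig;  /* pass ordinary signals through */
            /* plain SIGTRAP is the post-exec notification: swallow it */
        }
    }
}
```

Calling it with an argv such as { "/bin/sh", "-c", "exit 0", NULL }
returns a positive count where tracing is permitted. Even this toy
shows some of the clunkiness: the monitor takes two context switches
per system call, and it has to handle signal forwarding by hand.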
Looking to the future, may I direct your attention to the Linux Security
Module project? They're doing some great work that I think will lay
a fantastic foundation for trying out many different approaches to
this problem.
Hi!
> And we end up with a different solution:
> olduid=getuid();
> /* Allocate a uid with no privilegies */
Dangerous. Imagine:
while (1) {
On Thu, Dec 13, 2001 at 11:36:16AM +0100, Romano Giannetti wrote:
>
> I was thinking about the idea of sub-ids to enable users to run "untrusted"
> binary or "dangerous" one without risk for their files/privacy.
Most parts of your proposal can be implemented in userspace, without
any kernel changes.
In fact, most parts /are/ already implemented, and only waiting to be
configured properly. It's called "sudo".
The only deficiency of the userspace-only approach I see at the moment
is that you can't impersonate the slave user from the main user id
with regard to filesystem access. This can be worked around with proper
permissions if you take the "one group/one user" approach: all
slave users will have the main user's group.
Andreas
--
Andreas Ferber - dev/consulting GmbH - Bielefeld, FRG
---------------------------------------------------------
+49 521 1365800 - [email protected] - http://www.devcon.net