2010-09-24 12:48:33

by Aneesh Kumar K.V

[permalink] [raw]
Subject: [PATCH -V4 00/11] New ACL format for better NFSv4 acl interoperability


Hi,

The following set of patches implements VFS changes needed to implement
a new acl model for linux. Rich ACLs are an implementation of NFSv4 ACLs,
extended by file masks to fit into the standard POSIX file permission model.
They are designed to work seamlessly locally as well as across the NFSv4 and
CIFS/SMB2 network file system protocols.

The patch set consists of four parts:

The first set of patches, posted as a follow up, contains VFS changes needed
to implement the Rich ACL model. The second set [1] contains the Rich ACL model
and Ext4 implementation. The third set [2] contains mapping of Rich ACL to
NFSv4 ACL (how to apply file mask to access mask) and implementation of
Richacl ACL for NFS server and client. The fourth set [3] contains POSIX ACL
to Rich ACL mapping and its ext4 usage.

[1] git://git.kernel.org/pub/scm/linux/kernel/git/agruen/linux-2.6-richacl.git richacl-minimal
[2] git://git.kernel.org/pub/scm/linux/kernel/git/agruen/linux-2.6-richacl.git richacl-upstream
[3] git://git.kernel.org/pub/scm/linux/kernel/git/agruen/linux-2.6-richacl.git richacl-fullset

A user-space utility for displaying and changing richacls is available at [4]
(a number of examples can be found at http://acl.bestbits.at/richacl/examples.html).

[4] git://git.kernel.org/pub/scm/linux/kernel/git/agruen/richacl.git master

To test richacl on ext4 use -o richacl mount option. This mount option may later be
dropped in favour of a feature flag.

More details regarding richacl can be found at
http://acl.bestbits.at/richacl/

Changes from V3:
a) Droped may_delete and may_create inode operations callback and reworked
the patch series to use additional check flags.
b) Rebased to the latest kernel
c) The patch series now contain only the minimal VFS changes.

Changes from V2:
1) Git repo include check-acl branch that drop newly added inode_operations
callback in favour for additional access check flags (MAY_CREATE_FILE,
MAY_CREATE_DIR, MAY_DELETE_CHILD, MAY_DELETE_SELF, MAY_TAKE_OWNERSHIP,
MAY_CHMOD, and MAY_SET_TIMES)
2) richacl is now cached in the vfs inode instead of file system inode.
(currently kept as a separate patch. We may want to fold that later)
3) Added a new acl flag ACL4_MASKED. richacl_apply_masks() can skip transforming acls
without this flag, which speeds things up and avoids modifying those acls unnecessarily.
4) Owner always allowed permissions are now explicitly included when synthesizing an acl
from file mode.

Changes from V1:
1) Split the patches into smaller patches
2) Added extensive documentation to the patches.

-aneesh




2010-09-24 12:48:05

by Aneesh Kumar K.V

[permalink] [raw]
Subject: [PATCH -V4 02/11] vfs: Pass all mask flags down to iop->check_acl

From: Andreas Gruenbacher <[email protected]>

Some file permission models differentiate between writing to a file
(MAY_WRITE) and appending to it (MAY_WRITE | MAY_APPEND). Pass all the
mask flags down to iop->check_acl so that filesystems can distinguish
between writing and appending.

All users of iop->check_acl pass the mask value back into
posix_acl_permission(); strip off the additional mask flags there.

Signed-off-by: Andreas Gruenbacher <[email protected]>
Signed-off-by: Aneesh Kumar K.V <[email protected]>
---
fs/namei.c | 4 +---
fs/posix_acl.c | 2 ++
2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/fs/namei.c b/fs/namei.c
index c5920b5..7cadd07 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -174,8 +174,6 @@ static int acl_permission_check(struct inode *inode, int mask,
{
umode_t mode = inode->i_mode;

- mask &= MAY_READ | MAY_WRITE | MAY_EXEC;
-
if (current_fsuid() == inode->i_uid)
mode >>= 6;
else {
@@ -192,7 +190,7 @@ static int acl_permission_check(struct inode *inode, int mask,
/*
* If the DACs are ok we don't need any capability check.
*/
- if ((mask & ~mode) == 0)
+ if ((mask & (MAY_READ | MAY_WRITE | MAY_EXEC) & ~mode) == 0)
return 0;
return -EACCES;
}
diff --git a/fs/posix_acl.c b/fs/posix_acl.c
index 39df95a..c60bddf 100644
--- a/fs/posix_acl.c
+++ b/fs/posix_acl.c
@@ -213,6 +213,8 @@ posix_acl_permission(struct inode *inode, const struct posix_acl *acl, int want)
const struct posix_acl_entry *pa, *pe, *mask_obj;
int found = 0;

+ want &= MAY_READ | MAY_WRITE | MAY_EXEC;
+
FOREACH_ACL_ENTRY(pa, acl, pe) {
switch(pa->e_tag) {
case ACL_USER_OBJ:
--
1.7.0.4

2010-09-24 12:48:07

by Aneesh Kumar K.V

[permalink] [raw]
Subject: [PATCH -V4 04/11] vfs: Add generic IS_ACL() test for acl support

From: Andreas Gruenbacher <[email protected]>

When IS_POSIXACL() is true, the vfs does not apply the umask. Other acl
models will need the same exception, so introduce a separate IS_ACL()
test.

The IS_POSIX_ACL() test is still needed so that nfsd can determine when
the underlying file system supports POSIX ACLs (as opposed to some other
kind).

Signed-off-by: Andreas Gruenbacher <[email protected]>
Signed-off-by: Aneesh Kumar K.V <[email protected]>
---
fs/namei.c | 6 +++---
fs/nfs/nfs4proc.c | 2 +-
include/linux/fs.h | 8 +++++++-
3 files changed, 11 insertions(+), 5 deletions(-)

diff --git a/fs/namei.c b/fs/namei.c
index bf822b8..855b360 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -1486,7 +1486,7 @@ static int __open_namei_create(struct nameidata *nd, struct path *path,
int error;
struct dentry *dir = nd->path.dentry;

- if (!IS_POSIXACL(dir->d_inode))
+ if (!IS_ACL(dir->d_inode))
mode &= ~current_umask();
error = security_path_mknod(&nd->path, path->dentry, mode, 0);
if (error)
@@ -2009,7 +2009,7 @@ SYSCALL_DEFINE4(mknodat, int, dfd, const char __user *, filename, int, mode,
error = PTR_ERR(dentry);
goto out_unlock;
}
- if (!IS_POSIXACL(nd.path.dentry->d_inode))
+ if (!IS_ACL(nd.path.dentry->d_inode))
mode &= ~current_umask();
error = may_mknod(mode);
if (error)
@@ -2086,7 +2086,7 @@ SYSCALL_DEFINE3(mkdirat, int, dfd, const char __user *, pathname, int, mode)
if (IS_ERR(dentry))
goto out_unlock;

- if (!IS_POSIXACL(nd.path.dentry->d_inode))
+ if (!IS_ACL(nd.path.dentry->d_inode))
mode &= ~current_umask();
error = mnt_want_write(nd.path.mnt);
if (error)
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 089da5b..c77dbbf 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -2042,7 +2042,7 @@ nfs4_atomic_open(struct inode *dir, struct dentry *dentry, struct nameidata *nd)
if (nd->flags & LOOKUP_CREATE) {
attr.ia_mode = nd->intent.open.create_mode;
attr.ia_valid = ATTR_MODE;
- if (!IS_POSIXACL(dir))
+ if (!IS_ACL(dir))
attr.ia_mode &= ~current_umask();
} else {
open_flags &= ~O_EXCL;
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 63d069b..485c697 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -199,7 +199,7 @@ struct inodes_stat_t {
#define MS_VERBOSE 32768 /* War is peace. Verbosity is silence.
MS_VERBOSE is deprecated. */
#define MS_SILENT 32768
-#define MS_POSIXACL (1<<16) /* VFS does not apply the umask */
+#define MS_POSIXACL (1<<16) /* Supports POSIX ACLs */
#define MS_UNBINDABLE (1<<17) /* change to unbindable */
#define MS_PRIVATE (1<<18) /* change to private */
#define MS_SLAVE (1<<19) /* change to slave */
@@ -270,6 +270,12 @@ struct inodes_stat_t {
#define IS_SWAPFILE(inode) ((inode)->i_flags & S_SWAPFILE)
#define IS_PRIVATE(inode) ((inode)->i_flags & S_PRIVATE)

+/*
+ * IS_ACL() tells the VFS to not apply the umask
+ * and use iop->check_acl for acl permission checks when defined.
+ */
+#define IS_ACL(inode) __IS_FLG(inode, MS_POSIXACL)
+
/* the read-only stuff doesn't really belong here, but any other place is
probably as bad and I don't want to create yet another include file. */

--
1.7.0.4

2010-09-24 12:48:06

by Aneesh Kumar K.V

[permalink] [raw]
Subject: [PATCH -V4 03/11] vfs: Add a comment to inode_permission()

From: Andreas Gruenbacher <[email protected]>

Signed-off-by: Andreas Gruenbacher <[email protected]>
Signed-off-by: Aneesh Kumar K.V <[email protected]>
---
fs/namei.c | 6 ++++--
1 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/fs/namei.c b/fs/namei.c
index 7cadd07..bf822b8 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -167,7 +167,7 @@ EXPORT_SYMBOL(putname);
#endif

/*
- * This does basic POSIX ACL permission checking
+ * This does the basic permission checking
*/
static int acl_permission_check(struct inode *inode, int mask,
int (*check_acl)(struct inode *inode, int mask))
@@ -212,7 +212,7 @@ int generic_permission(struct inode *inode, int mask,
int ret;

/*
- * Do the basic POSIX ACL permission checks.
+ * Do the basic permission checks.
*/
ret = acl_permission_check(inode, mask, check_acl);
if (ret != -EACCES)
@@ -246,6 +246,8 @@ int generic_permission(struct inode *inode, int mask,
* We use "fsuid" for this, letting us set arbitrary permissions
* for filesystem access without changing the "normal" uids which
* are used for other things.
+ *
+ * When checking for MAY_APPEND, MAY_WRITE must also be set in @mask.
*/
int inode_permission(struct inode *inode, int mask)
{
--
1.7.0.4

2010-09-24 12:48:09

by Aneesh Kumar K.V

[permalink] [raw]
Subject: [PATCH -V4 06/11] vfs: Optimize out IS_RICHACL() if CONFIG_FS_RICHACL is not defined

From: Andreas Gruenbacher <[email protected]>

if CONFIG_FS_RICHACL is not defined optimize out
the ACL check function.

Signed-off-by: Andreas Gruenbacher <[email protected]>
Signed-off-by: Aneesh Kumar K.V <[email protected]>
---
fs/Kconfig | 4 ++++
include/linux/fs.h | 5 +++++
2 files changed, 9 insertions(+), 0 deletions(-)

diff --git a/fs/Kconfig b/fs/Kconfig
index 3d18530..cd283dc 100644
--- a/fs/Kconfig
+++ b/fs/Kconfig
@@ -39,6 +39,10 @@ config FS_POSIX_ACL
bool
default n

+config FS_RICHACL
+ bool
+ default n
+
source "fs/xfs/Kconfig"
source "fs/gfs2/Kconfig"
source "fs/ocfs2/Kconfig"
diff --git a/include/linux/fs.h b/include/linux/fs.h
index d86e77c..26fc8ae 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -265,7 +265,12 @@ struct inodes_stat_t {
#define IS_APPEND(inode) ((inode)->i_flags & S_APPEND)
#define IS_IMMUTABLE(inode) ((inode)->i_flags & S_IMMUTABLE)
#define IS_POSIXACL(inode) __IS_FLG(inode, MS_POSIXACL)
+
+#ifdef CONFIG_FS_RICHACL
#define IS_RICHACL(inode) __IS_FLG(inode, MS_RICHACL)
+#else
+#define IS_RICHACL(inode) 0
+#endif

#define IS_DEADDIR(inode) ((inode)->i_flags & S_DEAD)
#define IS_NOCMTIME(inode) ((inode)->i_flags & S_NOCMTIME)
--
1.7.0.4

2010-09-24 12:48:10

by Aneesh Kumar K.V

[permalink] [raw]
Subject: [PATCH -V4 07/11] vfs: Make acl_permission_check() work for richacls

From: Andreas Gruenbacher <[email protected]>

Signed-off-by: Andreas Gruenbacher <[email protected]>
Signed-off-by: Aneesh Kumar K.V <[email protected]>
---
fs/namei.c | 6 ++++++
1 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/fs/namei.c b/fs/namei.c
index 855b360..b0b8a71 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -174,6 +174,12 @@ static int acl_permission_check(struct inode *inode, int mask,
{
umode_t mode = inode->i_mode;

+ if (IS_RICHACL(inode)) {
+ int error = check_acl(inode, mask);
+ if (error != -EAGAIN)
+ return error;
+ }
+
if (current_fsuid() == inode->i_uid)
mode >>= 6;
else {
--
1.7.0.4

2010-09-24 12:49:07

by Aneesh Kumar K.V

[permalink] [raw]
Subject: [PATCH -V4 10/11] vfs: Make the inode passed to inode_change_ok non-const

From: Andreas Gruenbacher <[email protected]>

We will need to call iop->permission and iop->change_acl from
inode_change_ok() for additional permission checks, and both take a
non-const inode.

Signed-off-by: Andreas Gruenbacher <[email protected]>
Signed-off-by: Aneesh Kumar K.V <[email protected]>
---
fs/attr.c | 2 +-
include/linux/fs.h | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/attr.c b/fs/attr.c
index 7ca4181..f081b5a 100644
--- a/fs/attr.c
+++ b/fs/attr.c
@@ -26,7 +26,7 @@
* Should be called as the first thing in ->setattr implementations,
* possibly after taking additional locks.
*/
-int inode_change_ok(const struct inode *inode, struct iattr *attr)
+int inode_change_ok(struct inode *inode, struct iattr *attr)
{
unsigned int ia_valid = attr->ia_valid;

diff --git a/include/linux/fs.h b/include/linux/fs.h
index 845a930..2616d34 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2402,7 +2402,7 @@ extern int buffer_migrate_page(struct address_space *,
#define buffer_migrate_page NULL
#endif

-extern int inode_change_ok(const struct inode *, struct iattr *);
+extern int inode_change_ok(struct inode *, struct iattr *);
extern int inode_newsize_ok(const struct inode *, loff_t offset);
extern void setattr_copy(struct inode *inode, const struct iattr *attr);

--
1.7.0.4


2010-09-24 12:49:01

by Aneesh Kumar K.V

[permalink] [raw]
Subject: [PATCH -V4 08/11] vfs: Add new file and directory create permission flags

From: Andreas Gruenbacher <[email protected]>

Some permission models distinguish between the permission to create a
non-directory and a directory. Pass this information down to
inode_permission() as mask flags

Signed-off-by: Andreas Gruenbacher <[email protected]>
Signed-off-by: Aneesh Kumar K.V <[email protected]>
---
fs/namei.c | 21 ++++++++++++---------
include/linux/fs.h | 2 ++
2 files changed, 14 insertions(+), 9 deletions(-)

diff --git a/fs/namei.c b/fs/namei.c
index b0b8a71..ed786b2 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -253,7 +253,8 @@ int generic_permission(struct inode *inode, int mask,
* for filesystem access without changing the "normal" uids which
* are used for other things.
*
- * When checking for MAY_APPEND, MAY_WRITE must also be set in @mask.
+ * When checking for MAY_APPEND, MAY_CREATE_FILE, MAY_CREATE_DIR,
+ * MAY_WRITE must also be set in @mask.
*/
int inode_permission(struct inode *inode, int mask)
{
@@ -1337,13 +1338,15 @@ static int may_delete(struct inode *dir,struct dentry *victim,int isdir)
* 3. We should have write and exec permissions on dir
* 4. We can't do it if dir is immutable (done in permission())
*/
-static inline int may_create(struct inode *dir, struct dentry *child)
+static inline int may_create(struct inode *dir, struct dentry *child, int isdir)
{
+ int mask = isdir ? MAY_CREATE_DIR : MAY_CREATE_FILE;
+
if (child->d_inode)
return -EEXIST;
if (IS_DEADDIR(dir))
return -ENOENT;
- return inode_permission(dir, MAY_WRITE | MAY_EXEC);
+ return inode_permission(dir, MAY_WRITE | MAY_EXEC | mask);
}

/*
@@ -1391,7 +1394,7 @@ void unlock_rename(struct dentry *p1, struct dentry *p2)
int vfs_create(struct inode *dir, struct dentry *dentry, int mode,
struct nameidata *nd)
{
- int error = may_create(dir, dentry);
+ int error = may_create(dir, dentry, 0);

if (error)
return error;
@@ -1953,7 +1956,7 @@ EXPORT_SYMBOL_GPL(lookup_create);

int vfs_mknod(struct inode *dir, struct dentry *dentry, int mode, dev_t dev)
{
- int error = may_create(dir, dentry);
+ int error = may_create(dir, dentry, 0);

if (error)
return error;
@@ -2057,7 +2060,7 @@ SYSCALL_DEFINE3(mknod, const char __user *, filename, int, mode, unsigned, dev)

int vfs_mkdir(struct inode *dir, struct dentry *dentry, int mode)
{
- int error = may_create(dir, dentry);
+ int error = may_create(dir, dentry, 1);

if (error)
return error;
@@ -2342,7 +2345,7 @@ SYSCALL_DEFINE1(unlink, const char __user *, pathname)

int vfs_symlink(struct inode *dir, struct dentry *dentry, const char *oldname)
{
- int error = may_create(dir, dentry);
+ int error = may_create(dir, dentry, 0);

if (error)
return error;
@@ -2415,7 +2418,7 @@ int vfs_link(struct dentry *old_dentry, struct inode *dir, struct dentry *new_de
if (!inode)
return -ENOENT;

- error = may_create(dir, new_dentry);
+ error = may_create(dir, new_dentry, S_ISDIR(inode->i_mode));
if (error)
return error;

@@ -2631,7 +2634,7 @@ int vfs_rename(struct inode *old_dir, struct dentry *old_dentry,
return error;

if (!new_dentry->d_inode)
- error = may_create(new_dir, new_dentry);
+ error = may_create(new_dir, new_dentry, is_dir);
else
error = may_delete(new_dir, new_dentry, is_dir);
if (error)
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 26fc8ae..5dc1ebd 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -55,6 +55,8 @@ struct inodes_stat_t {
#define MAY_ACCESS 16
#define MAY_OPEN 32
#define MAY_CHDIR 64
+#define MAY_CREATE_FILE 128
+#define MAY_CREATE_DIR 256

/*
* flags in file.f_mode. Note that FMODE_READ and FMODE_WRITE must correspond
--
1.7.0.4


2010-09-24 12:49:04

by Aneesh Kumar K.V

[permalink] [raw]
Subject: [PATCH -V4 09/11] vfs: Add delete child and delete self permission flags

From: Andreas Gruenbacher <[email protected]>

Normally, deleting a file requires write access to the parent directory.
Some permission models use a different permission on the parent
directory to indicate delete access. In addition, a process can have
per-file delete access even without delete access on the parent
directory.

Introduce two new inode_permission() mask flags and use them in
may_delete()

Signed-off-by: Andreas Gruenbacher <[email protected]>
Signed-off-by: Aneesh Kumar K.V <[email protected]>
---
fs/namei.c | 36 ++++++++++++++++++++++++------------
include/linux/fs.h | 2 ++
2 files changed, 26 insertions(+), 12 deletions(-)

diff --git a/fs/namei.c b/fs/namei.c
index ed786b2..24c1e8c 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -254,7 +254,7 @@ int generic_permission(struct inode *inode, int mask,
* are used for other things.
*
* When checking for MAY_APPEND, MAY_CREATE_FILE, MAY_CREATE_DIR,
- * MAY_WRITE must also be set in @mask.
+ * MAY_DELETE_CHILD, MAY_DELETE_SELF, MAY_WRITE must also be set in @mask.
*/
int inode_permission(struct inode *inode, int mask)
{
@@ -1298,30 +1298,42 @@ static inline int check_sticky(struct inode *dir, struct inode *inode)
* 10. We don't allow removal of NFS sillyrenamed files; it's handled by
* nfs_async_unlink().
*/
-static int may_delete(struct inode *dir,struct dentry *victim,int isdir)
+static int may_delete(struct inode *dir, struct dentry *victim,
+ int isdir, int replace)
{
+ struct inode *inode = victim->d_inode;
+ int mask;
int error;

- if (!victim->d_inode)
+ if (!inode)
return -ENOENT;

BUG_ON(victim->d_parent->d_inode != dir);
audit_inode_child(victim, dir);

- error = inode_permission(dir, MAY_WRITE | MAY_EXEC);
+ mask = MAY_WRITE | MAY_EXEC | MAY_DELETE_CHILD;
+ if (replace)
+ mask |= S_ISDIR(inode->i_mode) ?
+ MAY_CREATE_DIR : MAY_CREATE_FILE;
+ error = inode_permission(dir, mask);
+ if (error && IS_RICHACL(inode) &&
+ !inode_permission(dir, MAY_EXEC) &&
+ !inode_permission(inode, MAY_WRITE | MAY_DELETE_SELF))
+ error = 0;
+ else if (!error && check_sticky(dir, inode))
+ error = -EPERM;
if (error)
return error;
if (IS_APPEND(dir))
return -EPERM;
- if (check_sticky(dir, victim->d_inode)||IS_APPEND(victim->d_inode)||
- IS_IMMUTABLE(victim->d_inode) || IS_SWAPFILE(victim->d_inode))
+ if (IS_APPEND(inode) || IS_IMMUTABLE(inode) || IS_SWAPFILE(inode))
return -EPERM;
if (isdir) {
- if (!S_ISDIR(victim->d_inode->i_mode))
+ if (!S_ISDIR(inode->i_mode))
return -ENOTDIR;
if (IS_ROOT(victim))
return -EBUSY;
- } else if (S_ISDIR(victim->d_inode->i_mode))
+ } else if (S_ISDIR(inode->i_mode))
return -EISDIR;
if (IS_DEADDIR(dir))
return -ENOENT;
@@ -2150,7 +2162,7 @@ void dentry_unhash(struct dentry *dentry)

int vfs_rmdir(struct inode *dir, struct dentry *dentry)
{
- int error = may_delete(dir, dentry, 1);
+ int error = may_delete(dir, dentry, 1, 0);

if (error)
return error;
@@ -2237,7 +2249,7 @@ SYSCALL_DEFINE1(rmdir, const char __user *, pathname)

int vfs_unlink(struct inode *dir, struct dentry *dentry)
{
- int error = may_delete(dir, dentry, 0);
+ int error = may_delete(dir, dentry, 0, 0);

if (error)
return error;
@@ -2629,14 +2641,14 @@ int vfs_rename(struct inode *old_dir, struct dentry *old_dentry,
if (old_dentry->d_inode == new_dentry->d_inode)
return 0;

- error = may_delete(old_dir, old_dentry, is_dir);
+ error = may_delete(old_dir, old_dentry, is_dir, 0);
if (error)
return error;

if (!new_dentry->d_inode)
error = may_create(new_dir, new_dentry, is_dir);
else
- error = may_delete(new_dir, new_dentry, is_dir);
+ error = may_delete(new_dir, new_dentry, is_dir, 1);
if (error)
return error;

diff --git a/include/linux/fs.h b/include/linux/fs.h
index 5dc1ebd..845a930 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -57,6 +57,8 @@ struct inodes_stat_t {
#define MAY_CHDIR 64
#define MAY_CREATE_FILE 128
#define MAY_CREATE_DIR 256
+#define MAY_DELETE_CHILD 512
+#define MAY_DELETE_SELF 1024

/*
* flags in file.f_mode. Note that FMODE_READ and FMODE_WRITE must correspond
--
1.7.0.4


2010-09-24 12:49:10

by Aneesh Kumar K.V

[permalink] [raw]
Subject: [PATCH -V4 11/11] vfs: Add permission flags for setting file attributes

From: Andreas Gruenbacher <[email protected]>

Some permission models can allow processes to take ownership of a file,
change the file permissions, and set the file timestamps. Introduce new
permission mask flags and check for those permissions in
inode_change_ok().

Signed-off-by: Andreas Gruenbacher <[email protected]>
Signed-off-by: Aneesh Kumar K.V <[email protected]>
---
fs/attr.c | 49 ++++++++++++++++++++++++++++++++++++++-----------
include/linux/fs.h | 3 +++
2 files changed, 41 insertions(+), 11 deletions(-)

diff --git a/fs/attr.c b/fs/attr.c
index f081b5a..00f971a 100644
--- a/fs/attr.c
+++ b/fs/attr.c
@@ -13,6 +13,20 @@
#include <linux/fsnotify.h>
#include <linux/fcntl.h>
#include <linux/security.h>
+#include <linux/richacl.h>
+
+static int richacl_change_ok(struct inode *inode, int mask)
+{
+ if (!IS_RICHACL(inode))
+ return -EPERM;
+
+ if (inode->i_op->permission)
+ return inode->i_op->permission(inode, mask);
+ else if (inode->i_op->check_acl)
+ return inode->i_op->check_acl(inode, mask);
+ else
+ return -EPERM;
+}

/**
* inode_change_ok - check if attribute changes to an inode are allowed
@@ -47,20 +61,29 @@ int inode_change_ok(struct inode *inode, struct iattr *attr)
/* Make sure a caller can chown. */
if ((ia_valid & ATTR_UID) &&
(current_fsuid() != inode->i_uid ||
- attr->ia_uid != inode->i_uid) && !capable(CAP_CHOWN))
- return -EPERM;
+ attr->ia_uid != inode->i_uid) &&
+ (current_fsuid() != attr->ia_uid ||
+ richacl_change_ok(inode, MAY_TAKE_OWNERSHIP)) &&
+ !capable(CAP_CHOWN))
+ goto error;

/* Make sure caller can chgrp. */
- if ((ia_valid & ATTR_GID) &&
- (current_fsuid() != inode->i_uid ||
- (!in_group_p(attr->ia_gid) && attr->ia_gid != inode->i_gid)) &&
- !capable(CAP_CHOWN))
- return -EPERM;
+ if (ia_valid & ATTR_GID) {
+ int in_group = in_group_p(attr->ia_gid);
+ if ((current_fsuid() != inode->i_uid ||
+ (!in_group && attr->ia_gid != inode->i_gid)) &&
+ (!in_group ||
+ richacl_change_ok(inode, MAY_TAKE_OWNERSHIP)) &&
+ !capable(CAP_CHOWN))
+ goto error;
+ }

/* Make sure a caller can chmod. */
if (ia_valid & ATTR_MODE) {
- if (!is_owner_or_cap(inode))
- return -EPERM;
+ if (current_fsuid() != inode->i_uid &&
+ richacl_change_ok(inode, MAY_CHMOD) &&
+ !capable(CAP_FOWNER))
+ goto error;
/* Also check the setgid bit! */
if (!in_group_p((ia_valid & ATTR_GID) ? attr->ia_gid :
inode->i_gid) && !capable(CAP_FSETID))
@@ -69,11 +92,15 @@ int inode_change_ok(struct inode *inode, struct iattr *attr)

/* Check for setting the inode time. */
if (ia_valid & (ATTR_MTIME_SET | ATTR_ATIME_SET | ATTR_TIMES_SET)) {
- if (!is_owner_or_cap(inode))
- return -EPERM;
+ if (current_fsuid() != inode->i_uid &&
+ richacl_change_ok(inode, MAY_SET_TIMES) &&
+ !capable(CAP_FOWNER))
+ goto error;
}

return 0;
+error:
+ return -EPERM;
}
EXPORT_SYMBOL(inode_change_ok);

diff --git a/include/linux/fs.h b/include/linux/fs.h
index 2616d34..f08e82f 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -59,6 +59,9 @@ struct inodes_stat_t {
#define MAY_CREATE_DIR 256
#define MAY_DELETE_CHILD 512
#define MAY_DELETE_SELF 1024
+#define MAY_TAKE_OWNERSHIP 2048
+#define MAY_CHMOD 4096
+#define MAY_SET_TIMES 8192

/*
* flags in file.f_mode. Note that FMODE_READ and FMODE_WRITE must correspond
--
1.7.0.4


2010-09-24 12:48:08

by Aneesh Kumar K.V

[permalink] [raw]
Subject: [PATCH -V4 05/11] vfs: Add IS_RICHACL() test for richacl support

From: Andreas Gruenbacher <[email protected]>

Introduce a new MS_RICHACL super-block flag and a new IS_RICHACL() test
which file systems like nfs can use. IS_ACL() is true if IS_POSIXACL()
or IS_RICHACL() is true.

Signed-off-by: Andreas Gruenbacher <[email protected]>
Signed-off-by: Aneesh Kumar K.V <[email protected]>
---
include/linux/fs.h | 4 +++-
1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/include/linux/fs.h b/include/linux/fs.h
index 485c697..d86e77c 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -208,6 +208,7 @@ struct inodes_stat_t {
#define MS_KERNMOUNT (1<<22) /* this is a kern_mount call */
#define MS_I_VERSION (1<<23) /* Update inode I_version field */
#define MS_STRICTATIME (1<<24) /* Always perform atime updates */
+#define MS_RICHACL (1<<25) /* Supports richacls */
#define MS_BORN (1<<29)
#define MS_ACTIVE (1<<30)
#define MS_NOUSER (1<<31)
@@ -264,6 +265,7 @@ struct inodes_stat_t {
#define IS_APPEND(inode) ((inode)->i_flags & S_APPEND)
#define IS_IMMUTABLE(inode) ((inode)->i_flags & S_IMMUTABLE)
#define IS_POSIXACL(inode) __IS_FLG(inode, MS_POSIXACL)
+#define IS_RICHACL(inode) __IS_FLG(inode, MS_RICHACL)

#define IS_DEADDIR(inode) ((inode)->i_flags & S_DEAD)
#define IS_NOCMTIME(inode) ((inode)->i_flags & S_NOCMTIME)
@@ -274,7 +276,7 @@ struct inodes_stat_t {
* IS_ACL() tells the VFS to not apply the umask
* and use iop->check_acl for acl permission checks when defined.
*/
-#define IS_ACL(inode) __IS_FLG(inode, MS_POSIXACL)
+#define IS_ACL(inode) __IS_FLG(inode, MS_POSIXACL | MS_RICHACL)

/* the read-only stuff doesn't really belong here, but any other place is
probably as bad and I don't want to create yet another include file. */
--
1.7.0.4


2010-09-24 12:48:04

by Aneesh Kumar K.V

[permalink] [raw]
Subject: [PATCH -V4 01/11] vfs: Indicate that the permission functions take all the MAY_* flags

From: Andreas Gruenbacher <[email protected]>

Signed-off-by: Andreas Gruenbacher <[email protected]>
Signed-off-by: Aneesh Kumar K.V <[email protected]>
---
fs/namei.c | 6 +++---
1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/fs/namei.c b/fs/namei.c
index 24896e8..c5920b5 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -200,7 +200,7 @@ static int acl_permission_check(struct inode *inode, int mask,
/**
* generic_permission - check for access rights on a Posix-like filesystem
* @inode: inode to check access rights for
- * @mask: right to check for (%MAY_READ, %MAY_WRITE, %MAY_EXEC)
+ * @mask: right to check for (%MAY_READ, %MAY_WRITE, %MAY_EXEC, ...)
* @check_acl: optional callback to check for Posix ACLs
*
* Used to check for read/write/execute permissions on a file.
@@ -242,7 +242,7 @@ int generic_permission(struct inode *inode, int mask,
/**
* inode_permission - check for access rights to a given inode
* @inode: inode to check permission on
- * @mask: right to check for (%MAY_READ, %MAY_WRITE, %MAY_EXEC)
+ * @mask: right to check for (%MAY_READ, %MAY_WRITE, %MAY_EXEC, ...)
*
* Used to check for read/write/execute permissions on an inode.
* We use "fsuid" for this, letting us set arbitrary permissions
@@ -288,7 +288,7 @@ int inode_permission(struct inode *inode, int mask)
/**
* file_permission - check for additional access rights to a given file
* @file: file to check access rights for
- * @mask: right to check for (%MAY_READ, %MAY_WRITE, %MAY_EXEC)
+ * @mask: right to check for (%MAY_READ, %MAY_WRITE, %MAY_EXEC, ...)
*
* Used to check for read/write/execute permissions on an already opened
* file.
--
1.7.0.4


2010-09-24 15:51:08

by Jeff Layton

[permalink] [raw]
Subject: Re: [PATCH -V4 07/11] vfs: Make acl_permission_check() work for richacls

On Fri, 24 Sep 2010 18:18:10 +0530
"Aneesh Kumar K.V" <[email protected]> wrote:

> From: Andreas Gruenbacher <[email protected]>
>
> Signed-off-by: Andreas Gruenbacher <[email protected]>
> Signed-off-by: Aneesh Kumar K.V <[email protected]>
> ---
> fs/namei.c | 6 ++++++
> 1 files changed, 6 insertions(+), 0 deletions(-)
>
> diff --git a/fs/namei.c b/fs/namei.c
> index 855b360..b0b8a71 100644
> --- a/fs/namei.c
> +++ b/fs/namei.c
> @@ -174,6 +174,12 @@ static int acl_permission_check(struct inode *inode, int mask,
> {
> umode_t mode = inode->i_mode;
>
> + if (IS_RICHACL(inode)) {
> + int error = check_acl(inode, mask);
> + if (error != -EAGAIN)
> + return error;
> + }
> +
> if (current_fsuid() == inode->i_uid)
> mode >>= 6;
> else {

This may just be my own ignorance of ACL semantics and unfamiliarity
with the ACL code in general. It seems a bit unusual though...

Just to be clear...this patch implies that with richacls you can deny
or grant access to the owner of the file even if the mode bits say
otherwise. With POSIX acls, this seems to be the other way around.

Hmm....guess I should read the spec...

--
Jeff Layton <[email protected]>

2010-09-24 15:54:34

by Jeff Layton

[permalink] [raw]
Subject: Re: [PATCH -V4 08/11] vfs: Add new file and directory create permission flags

On Fri, 24 Sep 2010 18:18:11 +0530
"Aneesh Kumar K.V" <[email protected]> wrote:

> From: Andreas Gruenbacher <[email protected]>
>
> Some permission models distinguish between the permission to create a
> non-directory and a directory. Pass this information down to
> inode_permission() as mask flags
>
> Signed-off-by: Andreas Gruenbacher <[email protected]>
> Signed-off-by: Aneesh Kumar K.V <[email protected]>
> ---
> fs/namei.c | 21 ++++++++++++---------
> include/linux/fs.h | 2 ++
> 2 files changed, 14 insertions(+), 9 deletions(-)
>
> diff --git a/fs/namei.c b/fs/namei.c
> index b0b8a71..ed786b2 100644
> --- a/fs/namei.c
> +++ b/fs/namei.c
> @@ -253,7 +253,8 @@ int generic_permission(struct inode *inode, int mask,
> * for filesystem access without changing the "normal" uids which
> * are used for other things.
> *
> - * When checking for MAY_APPEND, MAY_WRITE must also be set in @mask.
> + * When checking for MAY_APPEND, MAY_CREATE_FILE, MAY_CREATE_DIR,
> + * MAY_WRITE must also be set in @mask.
> */
> int inode_permission(struct inode *inode, int mask)
> {
> @@ -1337,13 +1338,15 @@ static int may_delete(struct inode *dir,struct dentry *victim,int isdir)
> * 3. We should have write and exec permissions on dir
> * 4. We can't do it if dir is immutable (done in permission())
> */
> -static inline int may_create(struct inode *dir, struct dentry *child)
> +static inline int may_create(struct inode *dir, struct dentry *child, int isdir)
^^^^^
nit: maybe saner as a bool?
> {
> + int mask = isdir ? MAY_CREATE_DIR : MAY_CREATE_FILE;
> +
> if (child->d_inode)
> return -EEXIST;
> if (IS_DEADDIR(dir))
> return -ENOENT;
> - return inode_permission(dir, MAY_WRITE | MAY_EXEC);
> + return inode_permission(dir, MAY_WRITE | MAY_EXEC | mask);
> }
>
> /*
> @@ -1391,7 +1394,7 @@ void unlock_rename(struct dentry *p1, struct dentry *p2)
> int vfs_create(struct inode *dir, struct dentry *dentry, int mode,
> struct nameidata *nd)
> {
> - int error = may_create(dir, dentry);
> + int error = may_create(dir, dentry, 0);
>
> if (error)
> return error;
> @@ -1953,7 +1956,7 @@ EXPORT_SYMBOL_GPL(lookup_create);
>
> int vfs_mknod(struct inode *dir, struct dentry *dentry, int mode, dev_t dev)
> {
> - int error = may_create(dir, dentry);
> + int error = may_create(dir, dentry, 0);
>
> if (error)
> return error;
> @@ -2057,7 +2060,7 @@ SYSCALL_DEFINE3(mknod, const char __user *, filename, int, mode, unsigned, dev)
>
> int vfs_mkdir(struct inode *dir, struct dentry *dentry, int mode)
> {
> - int error = may_create(dir, dentry);
> + int error = may_create(dir, dentry, 1);
>
> if (error)
> return error;
> @@ -2342,7 +2345,7 @@ SYSCALL_DEFINE1(unlink, const char __user *, pathname)
>
> int vfs_symlink(struct inode *dir, struct dentry *dentry, const char *oldname)
> {
> - int error = may_create(dir, dentry);
> + int error = may_create(dir, dentry, 0);
>
> if (error)
> return error;
> @@ -2415,7 +2418,7 @@ int vfs_link(struct dentry *old_dentry, struct inode *dir, struct dentry *new_de
> if (!inode)
> return -ENOENT;
>
> - error = may_create(dir, new_dentry);
> + error = may_create(dir, new_dentry, S_ISDIR(inode->i_mode));

^^^^ this is a little
scary, but even if it's
a directory, it'll get
kicked out in a later
check. Would it be
clearer to move up the
S_ISDIR() check in this
function and then pass
this in as false?

> if (error)
> return error;
>
> @@ -2631,7 +2634,7 @@ int vfs_rename(struct inode *old_dir, struct dentry *old_dentry,
> return error;
>
> if (!new_dentry->d_inode)
> - error = may_create(new_dir, new_dentry);
> + error = may_create(new_dir, new_dentry, is_dir);
> else
> error = may_delete(new_dir, new_dentry, is_dir);
> if (error)
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index 26fc8ae..5dc1ebd 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -55,6 +55,8 @@ struct inodes_stat_t {
> #define MAY_ACCESS 16
> #define MAY_OPEN 32
> #define MAY_CHDIR 64
> +#define MAY_CREATE_FILE 128
> +#define MAY_CREATE_DIR 256
>
> /*
> * flags in file.f_mode. Note that FMODE_READ and FMODE_WRITE must correspond


--
Jeff Layton <[email protected]>

2010-09-24 18:56:05

by Aneesh Kumar K.V

[permalink] [raw]
Subject: Re: [PATCH -V4 07/11] vfs: Make acl_permission_check() work for richacls

On Fri, 24 Sep 2010 11:50:49 -0400, Jeff Layton <[email protected]> wrote:
> On Fri, 24 Sep 2010 18:18:10 +0530
> "Aneesh Kumar K.V" <[email protected]> wrote:
>
> > From: Andreas Gruenbacher <[email protected]>
> >
> > Signed-off-by: Andreas Gruenbacher <[email protected]>
> > Signed-off-by: Aneesh Kumar K.V <[email protected]>
> > ---
> > fs/namei.c | 6 ++++++
> > 1 files changed, 6 insertions(+), 0 deletions(-)
> >
> > diff --git a/fs/namei.c b/fs/namei.c
> > index 855b360..b0b8a71 100644
> > --- a/fs/namei.c
> > +++ b/fs/namei.c
> > @@ -174,6 +174,12 @@ static int acl_permission_check(struct inode *inode, int mask,
> > {
> > umode_t mode = inode->i_mode;
> >
> > + if (IS_RICHACL(inode)) {
> > + int error = check_acl(inode, mask);
> > + if (error != -EAGAIN)
> > + return error;
> > + }
> > +
> > if (current_fsuid() == inode->i_uid)
> > mode >>= 6;
> > else {
>
> This may just be my own ignorance of ACL semantics and unfamiliarity
> with the ACL code in general. It seems a bit unusual though...
>
> Just to be clear...this patch implies that with richacls you can deny
> or grant access to the owner of the file even if the mode bits say
> otherwise. With POSIX acls, this seems to be the other way around.
>
> Hmm....guess I should read the spec...
>

To be POSIX compatible we need to ensure that additional file access
control mechanisms may only further restrict the access permissions defined
by the file permission bits. So with rich acl, similar to POSIX ACL,
we use file mask as an upper bound of the acess permission
allowed. Unlike POSIX ACL where the 'owner' and 'other' ACL entry access
mask is kept in sync with mode bits, rich acl include a file mask even
for 'owner' and 'everyone' entries.

The patch that gives more details about the permission check algo can be
found at

http://git.kernel.org/?p=linux/kernel/git/agruen/linux-2.6-richacl.git;a=commitdiff;h=442c675aeac85cfc893a2ec600f7fb3da3951177;hp=02456437cf280838a50ef00d7b4df2e7179fe6b2

-aneesh

2010-09-24 19:16:14

by Aneesh Kumar K.V

[permalink] [raw]
Subject: Re: [PATCH -V4 08/11] vfs: Add new file and directory create permission flags

On Fri, 24 Sep 2010 11:54:23 -0400, Jeff Layton <[email protected]> wrote:
> On Fri, 24 Sep 2010 18:18:11 +0530
> "Aneesh Kumar K.V" <[email protected]> wrote:
>
> > From: Andreas Gruenbacher <[email protected]>
> >
> > Some permission models distinguish between the permission to create a
> > non-directory and a directory. Pass this information down to
> > inode_permission() as mask flags
> >
> > Signed-off-by: Andreas Gruenbacher <[email protected]>
> > Signed-off-by: Aneesh Kumar K.V <[email protected]>
> > ---
> > fs/namei.c | 21 ++++++++++++---------
> > include/linux/fs.h | 2 ++
> > 2 files changed, 14 insertions(+), 9 deletions(-)
> >
> > diff --git a/fs/namei.c b/fs/namei.c
> > index b0b8a71..ed786b2 100644
> > --- a/fs/namei.c
> > +++ b/fs/namei.c
> > @@ -253,7 +253,8 @@ int generic_permission(struct inode *inode, int mask,
> > * for filesystem access without changing the "normal" uids which
> > * are used for other things.
> > *
> > - * When checking for MAY_APPEND, MAY_WRITE must also be set in @mask.
> > + * When checking for MAY_APPEND, MAY_CREATE_FILE, MAY_CREATE_DIR,
> > + * MAY_WRITE must also be set in @mask.
> > */
> > int inode_permission(struct inode *inode, int mask)
> > {
> > @@ -1337,13 +1338,15 @@ static int may_delete(struct inode *dir,struct dentry *victim,int isdir)
> > * 3. We should have write and exec permissions on dir
> > * 4. We can't do it if dir is immutable (done in permission())
> > */
> > -static inline int may_create(struct inode *dir, struct dentry *child)
> > +static inline int may_create(struct inode *dir, struct dentry *child, int isdir)
> ^^^^^
> nit: maybe saner as a bool?
> > {
> > + int mask = isdir ? MAY_CREATE_DIR : MAY_CREATE_FILE;
> > +
> > if (child->d_inode)
> > return -EEXIST;
> > if (IS_DEADDIR(dir))
> > return -ENOENT;
> > - return inode_permission(dir, MAY_WRITE | MAY_EXEC);
> > + return inode_permission(dir, MAY_WRITE | MAY_EXEC | mask);
> > }
> >
> > /*
> > @@ -1391,7 +1394,7 @@ void unlock_rename(struct dentry *p1, struct dentry *p2)
> > int vfs_create(struct inode *dir, struct dentry *dentry, int mode,
> > struct nameidata *nd)
> > {
> > - int error = may_create(dir, dentry);
> > + int error = may_create(dir, dentry, 0);
> >
> > if (error)
> > return error;
> > @@ -1953,7 +1956,7 @@ EXPORT_SYMBOL_GPL(lookup_create);
> >
> > int vfs_mknod(struct inode *dir, struct dentry *dentry, int mode, dev_t dev)
> > {
> > - int error = may_create(dir, dentry);
> > + int error = may_create(dir, dentry, 0);
> >
> > if (error)
> > return error;
> > @@ -2057,7 +2060,7 @@ SYSCALL_DEFINE3(mknod, const char __user *, filename, int, mode, unsigned, dev)
> >
> > int vfs_mkdir(struct inode *dir, struct dentry *dentry, int mode)
> > {
> > - int error = may_create(dir, dentry);
> > + int error = may_create(dir, dentry, 1);
> >
> > if (error)
> > return error;
> > @@ -2342,7 +2345,7 @@ SYSCALL_DEFINE1(unlink, const char __user *, pathname)
> >
> > int vfs_symlink(struct inode *dir, struct dentry *dentry, const char *oldname)
> > {
> > - int error = may_create(dir, dentry);
> > + int error = may_create(dir, dentry, 0);
> >
> > if (error)
> > return error;
> > @@ -2415,7 +2418,7 @@ int vfs_link(struct dentry *old_dentry, struct inode *dir, struct dentry *new_de
> > if (!inode)
> > return -ENOENT;
> >
> > - error = may_create(dir, new_dentry);
> > + error = may_create(dir, new_dentry, S_ISDIR(inode->i_mode));
>
> ^^^^ this is a little
> scary, but even if it's
> a directory, it'll get
> kicked out in a later
> check. Would it be
> clearer to move up the
> S_ISDIR() check in this
> function and then pass
> this in as false?

Can you elaborate on this ?

-aneesh


2010-09-24 19:23:54

by Jeff Layton

[permalink] [raw]
Subject: Re: [PATCH -V4 08/11] vfs: Add new file and directory create permission flags

On Sat, 25 Sep 2010 00:46:03 +0530
"Aneesh Kumar K. V" <[email protected]> wrote:

> On Fri, 24 Sep 2010 11:54:23 -0400, Jeff Layton <[email protected]> wrote:
> > On Fri, 24 Sep 2010 18:18:11 +0530
> > "Aneesh Kumar K.V" <[email protected]> wrote:
> >
> > > From: Andreas Gruenbacher <[email protected]>
> > >
> > > Some permission models distinguish between the permission to create a
> > > non-directory and a directory. Pass this information down to
> > > inode_permission() as mask flags
> > >
> > > Signed-off-by: Andreas Gruenbacher <[email protected]>
> > > Signed-off-by: Aneesh Kumar K.V <[email protected]>
> > > ---
> > > fs/namei.c | 21 ++++++++++++---------
> > > include/linux/fs.h | 2 ++
> > > 2 files changed, 14 insertions(+), 9 deletions(-)
> > >
> > > diff --git a/fs/namei.c b/fs/namei.c
> > > index b0b8a71..ed786b2 100644
> > > --- a/fs/namei.c
> > > +++ b/fs/namei.c
> > > @@ -253,7 +253,8 @@ int generic_permission(struct inode *inode, int mask,
> > > * for filesystem access without changing the "normal" uids which
> > > * are used for other things.
> > > *
> > > - * When checking for MAY_APPEND, MAY_WRITE must also be set in @mask.
> > > + * When checking for MAY_APPEND, MAY_CREATE_FILE, MAY_CREATE_DIR,
> > > + * MAY_WRITE must also be set in @mask.
> > > */
> > > int inode_permission(struct inode *inode, int mask)
> > > {
> > > @@ -1337,13 +1338,15 @@ static int may_delete(struct inode *dir,struct dentry *victim,int isdir)
> > > * 3. We should have write and exec permissions on dir
> > > * 4. We can't do it if dir is immutable (done in permission())
> > > */
> > > -static inline int may_create(struct inode *dir, struct dentry *child)
> > > +static inline int may_create(struct inode *dir, struct dentry *child, int isdir)
> > ^^^^^
> > nit: maybe saner as a bool?
> > > {
> > > + int mask = isdir ? MAY_CREATE_DIR : MAY_CREATE_FILE;
> > > +
> > > if (child->d_inode)
> > > return -EEXIST;
> > > if (IS_DEADDIR(dir))
> > > return -ENOENT;
> > > - return inode_permission(dir, MAY_WRITE | MAY_EXEC);
> > > + return inode_permission(dir, MAY_WRITE | MAY_EXEC | mask);
> > > }
> > >
> > > /*
> > > @@ -1391,7 +1394,7 @@ void unlock_rename(struct dentry *p1, struct dentry *p2)
> > > int vfs_create(struct inode *dir, struct dentry *dentry, int mode,
> > > struct nameidata *nd)
> > > {
> > > - int error = may_create(dir, dentry);
> > > + int error = may_create(dir, dentry, 0);
> > >
> > > if (error)
> > > return error;
> > > @@ -1953,7 +1956,7 @@ EXPORT_SYMBOL_GPL(lookup_create);
> > >
> > > int vfs_mknod(struct inode *dir, struct dentry *dentry, int mode, dev_t dev)
> > > {
> > > - int error = may_create(dir, dentry);
> > > + int error = may_create(dir, dentry, 0);
> > >
> > > if (error)
> > > return error;
> > > @@ -2057,7 +2060,7 @@ SYSCALL_DEFINE3(mknod, const char __user *, filename, int, mode, unsigned, dev)
> > >
> > > int vfs_mkdir(struct inode *dir, struct dentry *dentry, int mode)
> > > {
> > > - int error = may_create(dir, dentry);
> > > + int error = may_create(dir, dentry, 1);
> > >
> > > if (error)
> > > return error;
> > > @@ -2342,7 +2345,7 @@ SYSCALL_DEFINE1(unlink, const char __user *, pathname)
> > >
> > > int vfs_symlink(struct inode *dir, struct dentry *dentry, const char *oldname)
> > > {
> > > - int error = may_create(dir, dentry);
> > > + int error = may_create(dir, dentry, 0);
> > >
> > > if (error)
> > > return error;
> > > @@ -2415,7 +2418,7 @@ int vfs_link(struct dentry *old_dentry, struct inode *dir, struct dentry *new_de
> > > if (!inode)
> > > return -ENOENT;
> > >
> > > - error = may_create(dir, new_dentry);
> > > + error = may_create(dir, new_dentry, S_ISDIR(inode->i_mode));
> >
> > ^^^^ this is a little
> > scary, but even if it's
> > a directory, it'll get
> > kicked out in a later
> > check. Would it be
> > clearer to move up the
> > S_ISDIR() check in this
> > function and then pass
> > this in as false?
>
> Can you elaborate on this ?
>
> -aneesh
>

Hardlinked directories are a no-no, of course. So when I first saw this
patch, it gave me pause. There's a later check in vfs_link though that
explicitly rejects hardlinking directories, so the above is harmless.
It may be more efficient to go ahead and return error if the target
is a directory however and bypass the permission check.

OTOH, maybe there's good reason to do it this way or I'm just being
excessively nitpicky.

--
Jeff Layton <[email protected]>

2010-09-27 13:03:49

by Andreas Gruenbacher

[permalink] [raw]
Subject: Re: [PATCH -V4 07/11] vfs: Make acl_permission_check() work for richacls

On Friday 24 September 2010 20:55:51 Aneesh Kumar K. V wrote:
> To be POSIX compatible we need to ensure that additional file access
> control mechanisms may only further restrict the access permissions defined
> by the file permission bits.

That's true but I don't think it fully answers Jeff's question.

With POSIX ACLs, the owner file permission bits are always identical to the
permissions that the owner is granted through the ACL. Therefore, when
acl_permission_check() is invoked on behalf of the owner, the ACL does not
need to be consulted at all. For non-owners, the ACL always needs to be
checked. This optimization is also true for richacls for the base permissions
(read, write, execute), but:

* Some permissions are more fine-grained than the file mode permission
bits: richacls distinguish between write and append, and between creating
directories and non-directories.

* Some permissions go beyond what the owner is implicitly allowed or what can
be expressed with read, write, execute: in a richacl, a user can be granted
the right to delete a specific file even without write access to the
containing directory and to take ownership of a file

(* In addition, a richacl can grant the right to chmod and set the acl of a
file, and to explicitly set the file timestamps. These are permissions
which the owner is implicitly allowed anyway, so they are not relevant to
this change to acl_permission_check().)

To handle those cases correctly too, we always look at the acl for richacls,
even for the owner. (We could still skip the acl check in some, but fewer,
cases.)

Thanks,
Andreas

2010-09-27 13:14:03

by Andreas Gruenbacher

[permalink] [raw]
Subject: Re: [PATCH -V4 08/11] vfs: Add new file and directory create permission flags

On Friday 24 September 2010 17:54:23 Jeff Layton wrote:
> On Fri, 24 Sep 2010 18:18:11 +0530
> "Aneesh Kumar K.V" <[email protected]> wrote:
> > @@ -2415,7 +2418,7 @@ int vfs_link(struct dentry *old_dentry, struct inode *dir, struct dentry *new_de
> > if (!inode)
> > return -ENOENT;
> >
> > - error = may_create(dir, new_dentry);
> > + error = may_create(dir, new_dentry, S_ISDIR(inode->i_mode));
>
> ^^^^ this is a little
> scary, but even if it's
> a directory, it'll get
> kicked out in a later
> check. Would it be
> clearer to move up the
> S_ISDIR() check in this
> function and then pass
> this in as false?

Ah, you mean this:

--- a/fs/namei.c
+++ b/fs/namei.c
@@ -2450,7 +2450,9 @@ int vfs_link(struct dentry *old_dentry, struct inode *dir, struct dentry *new_de
if (!inode)
return -ENOENT;

- error = may_create(dir, new_dentry, S_ISDIR(inode->i_mode));
+ if (S_ISDIR(inode->i_mode))
+ return -EPERM;
+ error = may_create(dir, new_dentry, 0);
if (error)
return error;

@@ -2464,8 +2466,6 @@ int vfs_link(struct dentry *old_dentry, struct inode *dir, struct dentry *new_de
return -EPERM;
if (!dir->i_op->link)
return -EPERM;
- if (S_ISDIR(inode->i_mode))
- return -EPERM;

error = security_inode_link(old_dentry, dir, new_dentry);
if (error)

This is a clear improvement; I don't think it matters that user-space will
get -EPERM instead of -EXDEV when trying to hard-link a directory across
devices.

Thanks,
Andreas

2010-10-12 00:24:09

by J.Bruce Fields

[permalink] [raw]
Subject: Re: [PATCH -V4 00/11] New ACL format for better NFSv4 acl interoperability

On Fri, Sep 24, 2010 at 06:18:03PM +0530, Aneesh Kumar K.V wrote:
>
> Hi,
>
> The following set of patches implements VFS changes needed to implement
> a new acl model for linux. Rich ACLs are an implementation of NFSv4 ACLs,
> extended by file masks to fit into the standard POSIX file permission model.
> They are designed to work seamlessly locally as well as across the NFSv4 and
> CIFS/SMB2 network file system protocols.
>
> The patch set consists of four parts:
>
> The first set of patches, posted as a follow up, contains VFS changes needed
> to implement the Rich ACL model. The second set [1] contains the Rich ACL model
> and Ext4 implementation. The third set [2] contains mapping of Rich ACL to
> NFSv4 ACL (how to apply file mask to access mask) and implementation of
> Richacl ACL for NFS server and client.

That's the part I'd like to review carefully and haven't yet.

> The fourth set [3] contains POSIX ACL
> to Rich ACL mapping and its ext4 usage.

The one thing I remember not liking before was a flag that told the user
whether a given ACL was originally mapped from POSIX or not. Is that
still there?

Overall I'm for doing this: I don't like NFSv4/Windows ACLs more than
anyone else, but they're too useful to ignore, and the mapping that
Samba and the NFSv4 server try to do is painful for users.

--b.

>
> [1] git://git.kernel.org/pub/scm/linux/kernel/git/agruen/linux-2.6-richacl.git richacl-minimal
> [2] git://git.kernel.org/pub/scm/linux/kernel/git/agruen/linux-2.6-richacl.git richacl-upstream
> [3] git://git.kernel.org/pub/scm/linux/kernel/git/agruen/linux-2.6-richacl.git richacl-fullset
>
> A user-space utility for displaying and changing richacls is available at [4]
> (a number of examples can be found at http://acl.bestbits.at/richacl/examples.html).
>
> [4] git://git.kernel.org/pub/scm/linux/kernel/git/agruen/richacl.git master
>
> To test richacl on ext4 use -o richacl mount option. This mount option may later be
> dropped in favour of a feature flag.
>
> More details regarding richacl can be found at
> http://acl.bestbits.at/richacl/
>
> Changes from V3:
> a) Droped may_delete and may_create inode operations callback and reworked
> the patch series to use additional check flags.
> b) Rebased to the latest kernel
> c) The patch series now contain only the minimal VFS changes.
>
> Changes from V2:
> 1) Git repo include check-acl branch that drop newly added inode_operations
> callback in favour for additional access check flags (MAY_CREATE_FILE,
> MAY_CREATE_DIR, MAY_DELETE_CHILD, MAY_DELETE_SELF, MAY_TAKE_OWNERSHIP,
> MAY_CHMOD, and MAY_SET_TIMES)
> 2) richacl is now cached in the vfs inode instead of file system inode.
> (currently kept as a separate patch. We may want to fold that later)
> 3) Added a new acl flag ACL4_MASKED. richacl_apply_masks() can skip transforming acls
> without this flag, which speeds things up and avoids modifying those acls unnecessarily.
> 4) Owner always allowed permissions are now explicitly included when synthesizing an acl
> from file mode.
>
> Changes from V1:
> 1) Split the patches into smaller patches
> 2) Added extensive documentation to the patches.
>
> -aneesh
>
>

2010-10-12 07:17:48

by Aneesh Kumar K.V

[permalink] [raw]
Subject: Re: [PATCH -V4 00/11] New ACL format for better NFSv4 acl interoperability

On Mon, 11 Oct 2010 20:24:09 -0400, "J. Bruce Fields" <[email protected]> wrote:
> On Fri, Sep 24, 2010 at 06:18:03PM +0530, Aneesh Kumar K.V wrote:
> >
> > Hi,
> >
> > The following set of patches implements VFS changes needed to implement
> > a new acl model for linux. Rich ACLs are an implementation of NFSv4 ACLs,
> > extended by file masks to fit into the standard POSIX file permission model.
> > They are designed to work seamlessly locally as well as across the NFSv4 and
> > CIFS/SMB2 network file system protocols.
> >
> > The patch set consists of four parts:
> >
> > The first set of patches, posted as a follow up, contains VFS changes needed
> > to implement the Rich ACL model. The second set [1] contains the Rich ACL model
> > and Ext4 implementation. The third set [2] contains mapping of Rich ACL to
> > NFSv4 ACL (how to apply file mask to access mask) and implementation of
> > Richacl ACL for NFS server and client.
>
> That's the part I'd like to review carefully and haven't yet.
>
> > The fourth set [3] contains POSIX ACL
> > to Rich ACL mapping and its ext4 usage.
>
> The one thing I remember not liking before was a flag that told the user
> whether a given ACL was originally mapped from POSIX or not. Is that
> still there?

We still have that. But we can resolve that once we decide on how to
migrate an existing file system containing posix acl to richacl. Most of
those patches will need to be updated based on the feedback from
different local file system maintainers. That is why those patches are
pushed towards the end and is part of last set.

What we need in the first step is to get VFS changes reviewed.Once we
agree on the VFS changes done, then we can start looking at the changes
upto NFS richacl nfs support. When get that merged then we can start
having discussion on how local file system maintainers want to migrate
the existing file system with posixacl to richacl.

>
> Overall I'm for doing this: I don't like NFSv4/Windows ACLs more than
> anyone else, but they're too useful to ignore, and the mapping that
> Samba and the NFSv4 server try to do is painful for users.
>


-aneesh

2010-10-12 15:35:07

by J.Bruce Fields

[permalink] [raw]
Subject: Re: [PATCH -V4 00/11] New ACL format for better NFSv4 acl interoperability

On Tue, Oct 12, 2010 at 12:47:38PM +0530, Aneesh Kumar K. V wrote:
> On Mon, 11 Oct 2010 20:24:09 -0400, "J. Bruce Fields" <[email protected]> wrote:
> > The one thing I remember not liking before was a flag that told the user
> > whether a given ACL was originally mapped from POSIX or not. Is that
> > still there?
>
> We still have that. But we can resolve that once we decide on how to
> migrate an existing file system containing posix acl to richacl. Most of
> those patches will need to be updated based on the feedback from
> different local file system maintainers. That is why those patches are
> pushed towards the end and is part of last set.
>
> What we need in the first step is to get VFS changes reviewed.Once we
> agree on the VFS changes done, then we can start looking at the changes
> upto NFS richacl nfs support. When get that merged then we can start
> having discussion on how local file system maintainers want to migrate
> the existing file system with posixacl to richacl.

OK. So, personally: I'm resigned to the idea that we want support for
this ACL model. The vfs changes look OK to me (and wouldn't be changed
by any comments I'd have on the more richacl-specific patches to
follow). So that's an ACK from me on the first set of these patches,
assuming it's OK with people to merge these things one step at a time.

--b.

2011-01-02 23:21:01

by Theodore Ts'o

[permalink] [raw]
Subject: Re: [PATCH -V4 08/11] vfs: Add new file and directory create permission flags

I was going through some old patches in the ext4 patchwork list, and
came across this. It looks like this patch has never been applied to
mainline. If it's a "clear improvement", any reason not to submit it?

Regards,

- Ted

On Mon, Sep 27, 2010 at 03:14:00PM +0200, Andreas Gruenbacher wrote:
>
> Ah, you mean this:
>
> --- a/fs/namei.c
> +++ b/fs/namei.c
> @@ -2450,7 +2450,9 @@ int vfs_link(struct dentry *old_dentry, struct inode *dir, struct dentry *new_de
> if (!inode)
> return -ENOENT;
>
> - error = may_create(dir, new_dentry, S_ISDIR(inode->i_mode));
> + if (S_ISDIR(inode->i_mode))
> + return -EPERM;
> + error = may_create(dir, new_dentry, 0);
> if (error)
> return error;
>
> @@ -2464,8 +2466,6 @@ int vfs_link(struct dentry *old_dentry, struct inode *dir, struct dentry *new_de
> return -EPERM;
> if (!dir->i_op->link)
> return -EPERM;
> - if (S_ISDIR(inode->i_mode))
> - return -EPERM;
>
> error = security_inode_link(old_dentry, dir, new_dentry);
> if (error)
>
> This is a clear improvement; I don't think it matters that user-space will
> get -EPERM instead of -EXDEV when trying to hard-link a directory across
> devices.
>
> Thanks,
> Andreas

2011-01-03 05:20:08

by Andreas Dilger

[permalink] [raw]
Subject: Re: [PATCH -V4 08/11] vfs: Add new file and directory create permission flags

Actually, I think it is required by POSIX that EXDEV is returned when cross-linking files. For mv calling rename() at least it uses a return EXDEV to copy the file to the target filesystem instead of just giving up and returning an error. It may be other applications like tar have a similar dependency for link().

Cheers, Andreas

On 2011-01-02, at 16:21, Ted Ts'o <[email protected]> wrote:

> I was going through some old patches in the ext4 patchwork list, and
> came across this. It looks like this patch has never been applied to
> mainline. If it's a "clear improvement", any reason not to submit it?
>
> Regards,
>
> - Ted
>
> On Mon, Sep 27, 2010 at 03:14:00PM +0200, Andreas Gruenbacher wrote:
>>
>> Ah, you mean this:
>>
>> --- a/fs/namei.c
>> +++ b/fs/namei.c
>> @@ -2450,7 +2450,9 @@ int vfs_link(struct dentry *old_dentry, struct inode *dir, struct dentry *new_de
>> if (!inode)
>> return -ENOENT;
>>
>> - error = may_create(dir, new_dentry, S_ISDIR(inode->i_mode));
>> + if (S_ISDIR(inode->i_mode))
>> + return -EPERM;
>> + error = may_create(dir, new_dentry, 0);
>> if (error)
>> return error;
>>
>> @@ -2464,8 +2466,6 @@ int vfs_link(struct dentry *old_dentry, struct inode *dir, struct dentry *new_de
>> return -EPERM;
>> if (!dir->i_op->link)
>> return -EPERM;
>> - if (S_ISDIR(inode->i_mode))
>> - return -EPERM;
>>
>> error = security_inode_link(old_dentry, dir, new_dentry);
>> if (error)
>>
>> This is a clear improvement; I don't think it matters that user-space will
>> get -EPERM instead of -EXDEV when trying to hard-link a directory across
>> devices.
>>
>> Thanks,
>> Andreas
> --
> To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html

2011-01-03 05:59:00

by Andreas Dilger

[permalink] [raw]
Subject: Re: [PATCH -V4 08/11] vfs: Add new file and directory create permission flags

Never mind, I didn't notice this was only for hard linking directories...

Cheers, Andreas

On 2011-01-02, at 22:20, Andreas Dilger <[email protected]> wrote:

> Actually, I think it is required by POSIX that EXDEV is returned when cross-linking files. For mv calling rename() at least it uses a return EXDEV to copy the file to the target filesystem instead of just giving up and returning an error. It may be other applications like tar have a similar dependency for link().
>
> Cheers, Andreas
>
> On 2011-01-02, at 16:21, Ted Ts'o <[email protected]> wrote:
>
>> I was going through some old patches in the ext4 patchwork list, and
>> came across this. It looks like this patch has never been applied to
>> mainline. If it's a "clear improvement", any reason not to submit it?
>>
>> Regards,
>>
>> - Ted
>>
>> On Mon, Sep 27, 2010 at 03:14:00PM +0200, Andreas Gruenbacher wrote:
>>>
>>> Ah, you mean this:
>>>
>>> --- a/fs/namei.c
>>> +++ b/fs/namei.c
>>> @@ -2450,7 +2450,9 @@ int vfs_link(struct dentry *old_dentry, struct inode *dir, struct dentry *new_de
>>> if (!inode)
>>> return -ENOENT;
>>>
>>> - error = may_create(dir, new_dentry, S_ISDIR(inode->i_mode));
>>> + if (S_ISDIR(inode->i_mode))
>>> + return -EPERM;
>>> + error = may_create(dir, new_dentry, 0);
>>> if (error)
>>> return error;
>>>
>>> @@ -2464,8 +2466,6 @@ int vfs_link(struct dentry *old_dentry, struct inode *dir, struct dentry *new_de
>>> return -EPERM;
>>> if (!dir->i_op->link)
>>> return -EPERM;
>>> - if (S_ISDIR(inode->i_mode))
>>> - return -EPERM;
>>>
>>> error = security_inode_link(old_dentry, dir, new_dentry);
>>> if (error)
>>>
>>> This is a clear improvement; I don't think it matters that user-space will
>>> get -EPERM instead of -EXDEV when trying to hard-link a directory across
>>> devices.
>>>
>>> Thanks,
>>> Andreas
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
>> the body of a message to [email protected]
>> More majordomo info at http://vger.kernel.org/majordomo-info.html
> --
> To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html

2011-01-03 14:21:11

by Aneesh Kumar K.V

[permalink] [raw]
Subject: Re: [PATCH -V4 08/11] vfs: Add new file and directory create permission flags

On Sun, 2 Jan 2011 18:21:01 -0500, "Ted Ts'o" <[email protected]> wrote:
> I was going through some old patches in the ext4 patchwork list, and
> came across this. It looks like this patch has never been applied to
> mainline. If it's a "clear improvement", any reason not to submit it?

The patch is needed only with richacl patches, that add an extra argument to
may_create.

-aneesh