2010-03-04 18:34:32

by Dmitry Monakhov

Subject: [PATCH 0/5] RFC: introduce extended inode owner identifier v5

This is the next iteration of the attempt to add an extended inode identifier.
Again I've changed the name of the identifier, but this is the last
time, I promise.
It is now named after the XFS project_id feature; in fact the two
features are almost identical.

*Feature description*
1) An inode may have a project identifier which has the same meaning as uid/gid.
2) The id is stored in the inode's xattr named "system.project_id"
3) The id is inherited from the parent inode on creation.
4) The id is cached in the in-memory inode structure, vfs_inode->i_prjid.
This field is guarded by CONFIG_PROJECT_ID, so no memory is wasted
when the option is disabled.

5) Since id is cached in memory it may be used for different purposes
such as:
5A) Implement an additional quota id space orthogonal to uid/gid. This is
useful for managing quota for some filesystem hierarchy (chroot or
container over a bind mount)
5B) Export a dedicated fs hierarchy to nfsd (only inodes which have a given
project_id will be accessible via nfsd)

6) It is possible to create an isolated inode subtree.
(To Al Viro) please do not dismiss the isolation feature before you read
the isolation patch description; after that, comments are welcome.
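
Points 3 and 5A can be pictured with a small userspace model. This is only a sketch: `struct model_inode`, `model_inherit` and `model_quota_id` are hypothetical names made up for illustration, not kernel code, and the quota type numbering simply mirrors the USRQUOTA/GRPQUOTA/PRJQUOTA values used later in the series.

```c
#include <stdint.h>

/* Quota types, numbered as in the series (PRJQUOTA is the new one). */
enum quota_type { USRQUOTA = 0, GRPQUOTA = 1, PRJQUOTA = 2 };

/* Simplified stand-in for the in-memory inode; not the kernel's struct inode. */
struct model_inode {
	uint32_t i_uid;
	uint32_t i_gid;
	uint32_t i_prjid;
};

/* Point 3: a new inode takes the project id of its parent directory. */
void model_inherit(struct model_inode *child, const struct model_inode *dir)
{
	child->i_prjid = dir->i_prjid;
}

/* Point 5A: each quota type is charged against a different id of the inode. */
uint32_t model_quota_id(const struct model_inode *inode, enum quota_type type)
{
	switch (type) {
	case USRQUOTA:
		return inode->i_uid;
	case GRPQUOTA:
		return inode->i_gid;
	case PRJQUOTA:
		return inode->i_prjid;
	}
	return 0;
}
```

Because the project id is independent of uid/gid, files owned by different users inside one container tree are all charged to the same project quota.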

*User interface*
Project id is managed via generic xattr interface "system.project_id"
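
From userspace this could look roughly like the sketch below, which uses the standard Linux setxattr(2)/getxattr(2) calls. The helper names `prjid_set` and `prjid_get` are hypothetical. The value is passed as a decimal string because the ext4 handlers in patch 3 format the id with "%u" and parse it with simple_strtoul(); the terminating NUL is included on write on the assumption that the parser expects a terminated string.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/xattr.h>

#define PRJID_XATTR "system.project_id"

/* Hypothetical helper: store `id` as a NUL-terminated decimal string. */
int prjid_set(const char *path, unsigned int id)
{
	char buf[32];
	int len = snprintf(buf, sizeof(buf), "%u", id);

	/* len + 1 so that the terminating NUL is part of the value. */
	return setxattr(path, PRJID_XATTR, buf, (size_t)len + 1, 0);
}

/* Hypothetical helper: read the xattr back and parse it. Returns 0 or -1. */
int prjid_get(const char *path, unsigned int *id)
{
	char buf[32];
	ssize_t len = getxattr(path, PRJID_XATTR, buf, sizeof(buf) - 1);

	if (len < 0)
		return -1;	/* errno is set by getxattr() */
	buf[len] = '\0';
	*id = (unsigned int)strtoul(buf, NULL, 0);
	return 0;
}
```

On a kernel or filesystem without this series both helpers are expected to fail and leave errno set, so callers should check the return value.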

PATCH SET TOC:
1) generic projectid support
2) generic project quota support
3) ext4 project support implementation
3A) ext4: generic project support
3B) ext4: project isolation support
3C) ext4: project quota support

Patches are against linux-next-20100304.
The patchset survived basic stress testing.



2010-03-04 18:34:50

by Dmitry Monakhov

Subject: [PATCH 1/5] vfs: Add additional owner identifier

This patch adds a project inode identifier. The project ID may be used as
an auxiliary owner specifier in addition to the standard uid/gid.

Signed-off-by: Dmitry Monakhov <[email protected]>
---
fs/Kconfig | 7 +++++++
include/linux/fs.h | 4 ++++
include/linux/xattr.h | 3 +++
3 files changed, 14 insertions(+), 0 deletions(-)

diff --git a/fs/Kconfig b/fs/Kconfig
index 5f85b59..23957c0 100644
--- a/fs/Kconfig
+++ b/fs/Kconfig
@@ -54,6 +54,13 @@ config FILE_LOCKING
This option enables standard file locking support, required
for filesystems like NFS and for the flock() system
call. Disabling this option saves about 11k.
+config PROJECT_ID
+ bool "Enable project inode identifier"
+ default y
+ help
+ This option enables project inode identifier. Project ID
+ may be used as auxiliary owner specifier in addition to
+ standard uid/gid.

source "fs/notify/Kconfig"

diff --git a/include/linux/fs.h b/include/linux/fs.h
index 30eee24..0218906 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -753,6 +753,10 @@ struct inode {
#ifdef CONFIG_QUOTA
struct dquot *i_dquot[MAXQUOTAS];
#endif
+#ifdef CONFIG_PROJECT_ID
+ /* Project id, protected by i_mutex similar to i_uid/i_gid */
+ __u32 i_prjid;
+#endif
struct list_head i_devices;
union {
struct pipe_inode_info *i_pipe;
diff --git a/include/linux/xattr.h b/include/linux/xattr.h
index fb9b7e6..9d85a4b 100644
--- a/include/linux/xattr.h
+++ b/include/linux/xattr.h
@@ -33,6 +33,9 @@
#define XATTR_USER_PREFIX "user."
#define XATTR_USER_PREFIX_LEN (sizeof (XATTR_USER_PREFIX) - 1)

+#define XATTR_PRJID "system.project_id"
+#define XATTR_PRJID_LEN (sizeof (XATTR_PRJID))
+
struct inode;
struct dentry;

--
1.6.6


2010-03-04 18:34:34

by Dmitry Monakhov

Subject: [PATCH 2/5] quota: Implement project id support for generic quota

Since all the preparatory code is already in the quota tree,
this patch is really small.

Signed-off-by: Dmitry Monakhov <[email protected]>
---
fs/quota/dquot.c | 12 ++++++++++++
fs/quota/quotaio_v2.h | 6 ++++--
include/linux/quota.h | 12 +++++++++++-
3 files changed, 27 insertions(+), 3 deletions(-)

diff --git a/fs/quota/dquot.c b/fs/quota/dquot.c
index 10d021d..3c4838f 100644
--- a/fs/quota/dquot.c
+++ b/fs/quota/dquot.c
@@ -1090,6 +1090,11 @@ static int need_print_warning(struct dquot *dquot)
return current_fsuid() == dquot->dq_id;
case GRPQUOTA:
return in_group_p(dquot->dq_id);
+ case PRJQUOTA:
+ /* XXX: Currently there is no way to determine
+ which project_id this task belongs to, so print
+ a warning message unconditionally. -dmon */
+ return 1;
}
return 0;
}
@@ -1322,6 +1327,13 @@ int dquot_initialize(struct inode *inode, int type)
case GRPQUOTA:
id = inode->i_gid;
break;
+ case PRJQUOTA:
+#ifdef CONFIG_PROJECT_ID
+ id = inode->i_prjid;
+#else
+ BUG_ON(sb_has_quota_loaded(inode->i_sb, PRJQUOTA));
+#endif
+ break;
}
got[cnt] = dqget(sb, id, cnt);
}
diff --git a/fs/quota/quotaio_v2.h b/fs/quota/quotaio_v2.h
index f1966b4..bfab9df 100644
--- a/fs/quota/quotaio_v2.h
+++ b/fs/quota/quotaio_v2.h
@@ -13,12 +13,14 @@
*/
#define V2_INITQMAGICS {\
0xd9c01f11, /* USRQUOTA */\
- 0xd9c01927 /* GRPQUOTA */\
+ 0xd9c01927, /* GRPQUOTA */\
+ 0xd9c03f14 /* PRJQUOTA */\
}

#define V2_INITQVERSIONS {\
1, /* USRQUOTA */\
- 1 /* GRPQUOTA */\
+ 1, /* GRPQUOTA */ \
+ 1 /* PRJQUOTA */\
}

/* First generic header */
diff --git a/include/linux/quota.h b/include/linux/quota.h
index edf34f2..514435f 100644
--- a/include/linux/quota.h
+++ b/include/linux/quota.h
@@ -36,18 +36,28 @@
#include <linux/errno.h>
#include <linux/types.h>

-#define __DQUOT_VERSION__ "dquot_6.5.2"
+#define __DQUOT_VERSION__ "dquot_6.6.0"

+#ifdef CONFIG_PROJECT_ID
+#define MAXQUOTAS 3
+#else
#define MAXQUOTAS 2
+#endif
+
#define USRQUOTA 0 /* element used for user quotas */
#define GRPQUOTA 1 /* element used for group quotas */

+#ifdef CONFIG_PROJECT_ID
+#define PRJQUOTA 2 /* element used for project quotas */
+#endif
+
/*
* Definitions for the default names of the quotas files.
*/
#define INITQFNAMES { \
"user", /* USRQUOTA */ \
"group", /* GRPQUOTA */ \
+ "project", /* PRJQUOTA */ \
"undefined", \
};

--
1.6.6


2010-03-04 18:34:52

by Dmitry Monakhov

Subject: [PATCH 3/5] ext4: Implement project ID support for ext4 filesystem

* Abstract
A subtree of a directory tree T is a tree consisting of a directory
(the subtree root) in T and all of its descendants in T.

*NOTE*: The user is allowed to break the pure subtree hierarchy via manual
id manipulation.

Project subtree assumptions:
(1) Each inode has an id. This id is persistently stored inside the
inode (in an xattr, usually inside ibody)
(2) The project id is inherited from the parent directory

This feature is similar to project-id in XFS. One may assign some id to
a subtree. Each entry from the subtree may be accounted in the directory
project quota, which will appear in later patches.

* Disk layout
The project id is stored on disk inside an xattr, usually inside ibody.
The xattr is used only as data storage; it has no user-visible xattr
interface.
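
As a rough illustration of the stored value: the patch writes the id as a single __le32 and treats any other stored size as an I/O error. The portable userspace sketch below mimics that encoding; `prjid_to_disk` and `prjid_from_disk` are hypothetical names, and the plain -1 return stands in for the -EIO used in the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Encode a project id as the 4-byte little-endian xattr value. */
void prjid_to_disk(uint32_t prjid, unsigned char out[4])
{
	out[0] = (unsigned char)(prjid & 0xff);
	out[1] = (unsigned char)((prjid >> 8) & 0xff);
	out[2] = (unsigned char)((prjid >> 16) & 0xff);
	out[3] = (unsigned char)((prjid >> 24) & 0xff);
}

/*
 * Decode a stored value. Returns 0 on success, or -1 if the value does
 * not have exactly 4 bytes (the patch returns -EIO in that case).
 */
int prjid_from_disk(const unsigned char *val, size_t len, uint32_t *prjid)
{
	if (len != 4)
		return -1;
	*prjid = (uint32_t)val[0] |
		 ((uint32_t)val[1] << 8) |
		 ((uint32_t)val[2] << 16) |
		 ((uint32_t)val[3] << 24);
	return 0;
}
```

A fixed little-endian layout keeps the on-disk value independent of the host byte order, which is what cpu_to_le32()/le32_to_cpu() provide in the kernel code.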

* User interface
Project id is accessible via generic xattr interface "system.project_id"

TODO: implement e2libfs support for project_id.

Signed-off-by: Dmitry Monakhov <[email protected]>
---
fs/ext4/Kconfig | 8 ++
fs/ext4/Makefile | 1 +
fs/ext4/ext4.h | 1 +
fs/ext4/ialloc.c | 12 +++-
fs/ext4/inode.c | 5 +-
fs/ext4/project.c | 209 +++++++++++++++++++++++++++++++++++++++++++++++++++++
fs/ext4/project.h | 25 +++++++
fs/ext4/super.c | 9 ++-
fs/ext4/xattr.c | 7 ++
fs/ext4/xattr.h | 2 +
10 files changed, 276 insertions(+), 3 deletions(-)
create mode 100644 fs/ext4/project.c
create mode 100644 fs/ext4/project.h

diff --git a/fs/ext4/Kconfig b/fs/ext4/Kconfig
index 9ed1bb1..1c04c9f 100644
--- a/fs/ext4/Kconfig
+++ b/fs/ext4/Kconfig
@@ -74,6 +74,14 @@ config EXT4_FS_SECURITY

If you are not using a security module that requires using
extended attributes for file security labels, say N.
+config EXT4_PROJECT_ID
+ bool "Ext4 project_id support"
+ depends on PROJECT_ID
+ depends on EXT4_FS_XATTR
+ help
+ Enables project inode identifier support for ext4 filesystem.
+ This feature allows assigning an id to inodes, similar to
+ uid/gid.

config EXT4_DEBUG
bool "EXT4 debugging support"
diff --git a/fs/ext4/Makefile b/fs/ext4/Makefile
index 8867b2a..be923b1 100644
--- a/fs/ext4/Makefile
+++ b/fs/ext4/Makefile
@@ -11,3 +11,4 @@ ext4-y := balloc.o bitmap.o dir.o file.o fsync.o ialloc.o inode.o \
ext4-$(CONFIG_EXT4_FS_XATTR) += xattr.o xattr_user.o xattr_trusted.o
ext4-$(CONFIG_EXT4_FS_POSIX_ACL) += acl.o
ext4-$(CONFIG_EXT4_FS_SECURITY) += xattr_security.o
+ext4-$(CONFIG_EXT4_PROJECT_ID) += project.o
diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 5806f53..9112c21 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -763,6 +763,7 @@ struct ext4_inode_info {
#define EXT4_MOUNT_JOURNAL_CHECKSUM 0x800000 /* Journal checksums */
#define EXT4_MOUNT_JOURNAL_ASYNC_COMMIT 0x1000000 /* Journal Async Commit */
#define EXT4_MOUNT_I_VERSION 0x2000000 /* i_version support */
+#define EXT4_MOUNT_PROJECT_ID 0x4000000 /* project owner id support */
#define EXT4_MOUNT_DELALLOC 0x8000000 /* Delalloc support */
#define EXT4_MOUNT_DATA_ERR_ABORT 0x10000000 /* Abort on file data write */
#define EXT4_MOUNT_BLOCK_VALIDITY 0x20000000 /* Block validity checking */
diff --git a/fs/ext4/ialloc.c b/fs/ext4/ialloc.c
index 004c9da..13cc85f 100644
--- a/fs/ext4/ialloc.c
+++ b/fs/ext4/ialloc.c
@@ -28,7 +28,7 @@
#include "ext4_jbd2.h"
#include "xattr.h"
#include "acl.h"
-
+#include "project.h"
#include <trace/events/ext4.h>

/*
@@ -1028,6 +1028,13 @@ got:

ei->i_extra_isize = EXT4_SB(sb)->s_want_extra_isize;

+#ifdef CONFIG_EXT4_PROJECT_ID
+ /*
+ * XXX: move this to generic inode init helper
+ * depends on generic_inode_init patch.
+ */
+ inode->i_prjid = dir->i_prjid;
+#endif
ret = inode;
if (vfs_dq_alloc_inode(inode)) {
err = -EDQUOT;
@@ -1041,6 +1048,9 @@ got:
err = ext4_init_security(handle, inode, dir);
if (err)
goto fail_free_drop;
+ err = ext4_prj_init(handle, inode);
+ if (err)
+ goto fail_free_drop;

if (EXT4_HAS_INCOMPAT_FEATURE(sb, EXT4_FEATURE_INCOMPAT_EXTENTS)) {
/* set extent flag only for directory, file and normal symlink*/
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index efc0442..119491a 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -44,7 +44,7 @@
#include "xattr.h"
#include "acl.h"
#include "ext4_extents.h"
-
+#include "project.h"
#include <trace/events/ext4.h>

#define MPAGE_DA_EXTENT_TAIL 0x01
@@ -5076,6 +5076,9 @@ struct inode *ext4_iget(struct super_block *sb, unsigned long ino)
}
if (ret)
goto bad_inode;
+ ret = ext4_prj_read(inode);
+ if (ret)
+ goto bad_inode;

if (S_ISREG(inode->i_mode)) {
inode->i_op = &ext4_file_inode_operations;
diff --git a/fs/ext4/project.c b/fs/ext4/project.c
new file mode 100644
index 0000000..8de8c0c
--- /dev/null
+++ b/fs/ext4/project.c
@@ -0,0 +1,209 @@
+/*
+ * linux/fs/ext4/projectid.c
+ *
+ * Copyright (C) 2010 Parallels Inc
+ * Dmitry Monakhov <[email protected]>
+ */
+
+#include <linux/init.h>
+#include <linux/sched.h>
+#include <linux/slab.h>
+#include <linux/capability.h>
+#include <linux/fs.h>
+#include <linux/quotaops.h>
+#include "ext4_jbd2.h"
+#include "ext4.h"
+#include "xattr.h"
+#include "project.h"
+
+/*
+ * PROJECT SUBTREE
+ * A subtree of a directory tree T is a tree consisting of a directory
+ * (the subtree root) in T and all of its descendants in T.
+ *
+ * Project Subtree's assumptions:
+ * (1) Each inode has subtree id. This id is persistently stored inside
+ * inode's xattr, usually inside ibody
+ * (2) Subtree id is inherent from parent directory
+ */
+
+/*
+ * Read project_id id from inode's xattr
+ * Locking: none
+ */
+int ext4_prj_xattr_read(struct inode *inode, unsigned int *prjid)
+{
+ __le32 dsk_prjid;
+ int retval;
+ retval = ext4_xattr_get(inode, EXT4_XATTR_INDEX_PROJECT_ID, "",
+ &dsk_prjid, sizeof (dsk_prjid));
+ if (retval > 0) {
+ if (retval != sizeof(dsk_prjid))
+ return -EIO;
+ else
+ retval = 0;
+ }
+ *prjid = le32_to_cpu(dsk_prjid);
+ return retval;
+
+}
+
+/*
+ * Save project_id id to inode's xattr
+ * Locking: none
+ */
+int ext4_prj_xattr_write(handle_t *handle, struct inode *inode,
+ unsigned int prjid, int xflags)
+{
+ __le32 dsk_prjid = cpu_to_le32(prjid);
+ int retval;
+ retval = ext4_xattr_set_handle(handle,
+ inode, EXT4_XATTR_INDEX_PROJECT_ID, "",
+ &dsk_prjid, sizeof (dsk_prjid), xflags);
+ if (retval > 0) {
+ if (retval != sizeof(dsk_prjid))
+ retval = -EIO;
+ else
+ retval = 0;
+ }
+ return retval;
+}
+
+/*
+ * Change project_id id.
+ * Called under inode->i_mutex
+ */
+static int ext4_prj_change(struct inode *inode, unsigned int new_prjid)
+{
+ /*
+ * One data_trans_blocks chunk for xattr update.
+ * One quota_trans_blocks chunk for quota transfer, and one
+ * quota_trans_block chunk for emergency quota rollback transfer,
+ * because quota rollback may result new quota blocks allocation.
+ */
+ unsigned credits = EXT4_DATA_TRANS_BLOCKS(inode->i_sb) +
+ EXT4_QUOTA_TRANS_BLOCKS(inode->i_sb) * 2;
+ qid_t qid[MAXQUOTAS];
+ int ret, ret2 = 0;
+ unsigned retries = 0;
+ handle_t *handle;
+
+ vfs_dq_init(inode);
+retry:
+ handle = ext4_journal_start(inode, credits);
+ if (IS_ERR(handle)) {
+ ret = PTR_ERR(handle);
+ ext4_std_error(inode->i_sb, ret);
+ goto out;
+ }
+ /* Inode may not have project_id xattr yet. Create it explicitly */
+ ret = ext4_prj_xattr_write(handle, inode, inode->i_prjid,
+ XATTR_CREATE);
+ if (ret == -EEXIST)
+ ret = 0;
+ if (ret) {
+ ret2 = ext4_journal_stop(handle);
+ if (ret2)
+ ret = ret2;
+ if (ret == -ENOSPC &&
+ ext4_should_retry_alloc(inode->i_sb, &retries))
+ goto retry;
+ }
+#ifdef CONFIG_QUOTA
+ qid[PRJQUOTA] = new_prjid;
+ ret = inode->i_sb->dq_op->transfer(inode, qid, 1 << PRJQUOTA);
+ if (ret)
+ return ret;
+#endif
+ ret = ext4_prj_xattr_write(handle, inode, new_prjid, XATTR_REPLACE);
+ if (ret) {
+ /*
+ * Function may fail only due to a fatal error. Nevertheless
+ * we have to try to roll back quota changes.
+ */
+#ifdef CONFIG_QUOTA
+ qid[PRJQUOTA] = inode->i_prjid;
+ inode->i_sb->dq_op->transfer(inode, qid, 1 << PRJQUOTA);
+#endif
+ ext4_std_error(inode->i_sb, ret);
+
+ }
+ inode->i_prjid = new_prjid;
+ ret2 = ext4_journal_stop(handle);
+out:
+ if (ret2)
+ ret = ret2;
+ return ret;
+}
+
+int ext4_prj_read(struct inode *inode)
+{
+ int ret = 0;
+ if(test_opt(inode->i_sb, PROJECT_ID)) {
+ ret = ext4_prj_xattr_read(inode, &inode->i_prjid);
+ if (ret == -ENODATA) {
+ inode->i_prjid = 0;
+ ret = 0;
+ }
+ } else
+ inode->i_prjid = 0;
+ return ret;
+}
+/*
+ * Initialize the projectid xattr of a new inode. Called from ext4_new_inode.
+ *
+ * dir->i_mutex: down
+ * inode->i_mutex: up (access to inode is still exclusive)
+ * Note: caller must assign correct project id to inode before.
+ */
+int ext4_prj_init(handle_t *handle, struct inode *inode)
+{
+ return ext4_prj_xattr_write(handle, inode, inode->i_prjid,
+ XATTR_CREATE);
+}
+
+static size_t
+ext4_xattr_prj_list(struct dentry *dentry, char *list, size_t list_size,
+ const char *name, size_t name_len, int type)
+{
+ if (list && XATTR_PRJID_LEN <= list_size)
+ memcpy(list, XATTR_PRJID, XATTR_PRJID_LEN);
+ return XATTR_PRJID_LEN;
+
+}
+
+static int
+ext4_xattr_prj_get(struct dentry *dentry, const char *name,
+ void *buffer, size_t size, int type)
+{
+ int ret;
+ unsigned prjid;
+ char buf[32];
+ if (strcmp(name, "") != 0)
+ return -EINVAL;
+ ret = ext4_prj_xattr_read(dentry->d_inode, &prjid);
+ if (ret)
+ return ret;
+ snprintf(buf, sizeof(buf)-1, "%u", prjid);
+ buf[31] = '\0';
+ strncpy(buffer, buf, size);
+ return strlen(buf);
+}
+
+static int
+ext4_xattr_prj_set(struct dentry *dentry, const char *name,
+ const void *value, size_t size, int flags, int type)
+{
+ unsigned int new_prjid;
+ if (strcmp(name, "") != 0)
+ return -EINVAL;
+ new_prjid = simple_strtoul(value, (char **)&value, 0);
+ return ext4_prj_change(dentry->d_inode, new_prjid);
+}
+
+struct xattr_handler ext4_xattr_prj_handler = {
+ .prefix = XATTR_PRJID,
+ .list = ext4_xattr_prj_list,
+ .get = ext4_xattr_prj_get,
+ .set = ext4_xattr_prj_set,
+};
diff --git a/fs/ext4/project.h b/fs/ext4/project.h
new file mode 100644
index 0000000..7e80579
--- /dev/null
+++ b/fs/ext4/project.h
@@ -0,0 +1,25 @@
+#include <linux/xattr.h>
+#include <linux/fs.h>
+
+#ifdef CONFIG_EXT4_PROJECT_ID
+extern int ext4_prj_xattr_read(struct inode *inode, unsigned int *prjid);
+extern int ext4_prj_xattr_write(handle_t *handle, struct inode *inode,
+ unsigned int prjid, int xflags);
+extern int ext4_prj_init(handle_t *handle, struct inode *inode);
+extern int ext4_prj_read(struct inode *inode);
+
+#else
+static inline int ext4_prj_xattr_read(struct inode *inode, unsigned int *prjid)
+{
+ return -ENOTSUPP;
+}
+static inline int ext4_prj_xattr_write(handle_t *handle, struct inode *inode,
+ unsigned int prjid, int xflags)
+{
+ return -ENOTSUPP;
+}
+static int ext4_prj_read(struct inode *inode)
+{
+ return 0;
+}
+#endif
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 9fc6057..240df9a 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -960,6 +960,9 @@ static int ext4_show_options(struct seq_file *seq, struct vfsmount *vfs)
if (test_opt(sb, DISCARD))
seq_puts(seq, ",discard");

+ if (test_opt(sb, PROJECT_ID))
+ seq_puts(seq, ",project_id");
+
if (test_opt(sb, NOLOAD))
seq_puts(seq, ",norecovery");

@@ -1150,7 +1153,7 @@ enum {
Opt_block_validity, Opt_noblock_validity,
Opt_inode_readahead_blks, Opt_journal_ioprio,
Opt_dioread_nolock, Opt_dioread_lock,
- Opt_discard, Opt_nodiscard,
+ Opt_discard, Opt_nodiscard, Opt_project_id,
};

static const match_table_t tokens = {
@@ -1221,6 +1224,7 @@ static const match_table_t tokens = {
{Opt_dioread_lock, "dioread_lock"},
{Opt_discard, "discard"},
{Opt_nodiscard, "nodiscard"},
+ {Opt_project_id, "project_id"},
{Opt_err, NULL},
};

@@ -1689,6 +1693,9 @@ set_qf_format:
case Opt_dioread_lock:
clear_opt(sbi->s_mount_opt, DIOREAD_NOLOCK);
break;
+ case Opt_project_id:
+ set_opt(sbi->s_mount_opt, PROJECT_ID);
+ break;
default:
ext4_msg(sb, KERN_ERR,
"Unrecognized mount option \"%s\" "
diff --git a/fs/ext4/xattr.c b/fs/ext4/xattr.c
index efc16a4..881b4de 100644
--- a/fs/ext4/xattr.c
+++ b/fs/ext4/xattr.c
@@ -107,6 +107,10 @@ static struct xattr_handler *ext4_xattr_handler_map[] = {
#ifdef CONFIG_EXT4_FS_SECURITY
[EXT4_XATTR_INDEX_SECURITY] = &ext4_xattr_security_handler,
#endif
+#ifdef CONFIG_EXT4_PROJECT_ID
+ [EXT4_XATTR_INDEX_PROJECT_ID] = &ext4_xattr_prj_handler,
+#endif
+
};

struct xattr_handler *ext4_xattr_handlers[] = {
@@ -119,6 +123,9 @@ struct xattr_handler *ext4_xattr_handlers[] = {
#ifdef CONFIG_EXT4_FS_SECURITY
&ext4_xattr_security_handler,
#endif
+#ifdef CONFIG_EXT4_PROJECT_ID
+ &ext4_xattr_prj_handler,
+#endif
NULL
};

diff --git a/fs/ext4/xattr.h b/fs/ext4/xattr.h
index 8ede88b..777d60f 100644
--- a/fs/ext4/xattr.h
+++ b/fs/ext4/xattr.h
@@ -21,6 +21,7 @@
#define EXT4_XATTR_INDEX_TRUSTED 4
#define EXT4_XATTR_INDEX_LUSTRE 5
#define EXT4_XATTR_INDEX_SECURITY 6
+#define EXT4_XATTR_INDEX_PROJECT_ID 7

struct ext4_xattr_header {
__le32 h_magic; /* magic number for identification */
@@ -70,6 +71,7 @@ extern struct xattr_handler ext4_xattr_trusted_handler;
extern struct xattr_handler ext4_xattr_acl_access_handler;
extern struct xattr_handler ext4_xattr_acl_default_handler;
extern struct xattr_handler ext4_xattr_security_handler;
+extern struct xattr_handler ext4_xattr_prj_handler;

extern ssize_t ext4_listxattr(struct dentry *, char *, size_t);

--
1.6.6


2010-03-04 18:34:54

by Dmitry Monakhov

Subject: [PATCH 4/5] ext4: add isolated project support

PROJECT_ISOLATION
This feature allows creating isolated project subtrees.
Isolation means that:
1) Directory subtrees have no common inodes (no hardlinks across subtrees)
2) All descendants belong to the same subtree.

Project subtree isolation assumptions:
1) An inode cannot belong to different subtrees.
Otherwise changes in one subtree result in changes in another subtree,
which contradicts the isolation criteria.

*Usage*
We already have bind mounts, which prevent link/rename across mounts.
But a user may have an isolated project which consists of several subtrees
and want link/rename to work between those subtrees (within one project).

Since this feature is non-obvious, it is controlled by a mount option.
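
The link/rename rules enforced by this patch can be summarized with a small userspace model. This is only a sketch of the checks in ext4_which_subtree(), ext4_prj_may_link() and ext4_prj_may_rename() with the mount option assumed to be on; `struct iso_inode` and the function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

enum subtree_rel {
	SUBTREE_SAME = 1,	/* both inodes carry the same project id */
	SUBTREE_COMMON,		/* ancestor is in the default (0) tree */
	SUBTREE_CROSS,		/* inodes are in different projects */
};

/* Simplified stand-in for the fields of struct inode that the checks read. */
struct iso_inode {
	uint32_t prjid;
	unsigned int nlink;
	bool is_dir;
};

enum subtree_rel which_subtree(const struct iso_inode *anc,
			       const struct iso_inode *inode)
{
	if (inode->prjid == anc->prjid)
		return SUBTREE_SAME;
	if (anc->prjid == 0)
		return SUBTREE_COMMON;
	return SUBTREE_CROSS;
}

/* A hardlink is allowed only inside one project. */
bool may_link(const struct iso_inode *tdir, const struct iso_inode *inode)
{
	return which_subtree(tdir, inode) == SUBTREE_SAME;
}

/*
 * A rename may not cross projects. A non-directory with more than one
 * link must additionally stay in exactly the same project, otherwise the
 * inode would end up visible from two subtrees.
 */
bool may_rename(const struct iso_inode *new_dir, const struct iso_inode *inode)
{
	enum subtree_rel rel = which_subtree(new_dir, inode);

	if (!inode->is_dir && inode->nlink > 1)
		return rel == SUBTREE_SAME;
	return rel != SUBTREE_CROSS;
}
```

In the patch, a rejected link or rename returns -EXDEV, the same error userspace already sees when crossing a mount boundary.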

Signed-off-by: Dmitry Monakhov <[email protected]>
---
fs/ext4/ext4.h | 1 +
fs/ext4/namei.c | 9 ++++-
fs/ext4/project.c | 106 +++++++++++++++++++++++++++++++++++++++++++++++++++++
fs/ext4/project.h | 4 +-
fs/ext4/super.c | 9 ++++-
5 files changed, 126 insertions(+), 3 deletions(-)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 9112c21..8fc8257 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -768,6 +768,7 @@ struct ext4_inode_info {
#define EXT4_MOUNT_DATA_ERR_ABORT 0x10000000 /* Abort on file data write */
#define EXT4_MOUNT_BLOCK_VALIDITY 0x20000000 /* Block validity checking */
#define EXT4_MOUNT_DISCARD 0x40000000 /* Issue DISCARD requests */
+#define EXT4_MOUNT_PRJ_ISOLATION 0x80000000 /* Isolation project support */

#define clear_opt(o, opt) o &= ~EXT4_MOUNT_##opt
#define set_opt(o, opt) o |= EXT4_MOUNT_##opt
diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
index 608d21f..5401ec9 100644
--- a/fs/ext4/namei.c
+++ b/fs/ext4/namei.c
@@ -39,6 +39,7 @@

#include "xattr.h"
#include "acl.h"
+#include "project.h"

/*
* define how far ahead to read directories while searching them.
@@ -1080,6 +1081,7 @@ static struct dentry *ext4_lookup(struct inode *dir, struct dentry *dentry, stru
return ERR_CAST(inode);
}
}
+ ext4_prj_check_parent(dir, inode);
}
return d_splice_alias(inode, dentry);
}
@@ -2315,7 +2317,8 @@ static int ext4_link(struct dentry *old_dentry,
*/
if (inode->i_nlink == 0)
return -ENOENT;
-
+ if (!ext4_prj_may_link(dir, inode))
+ return -EXDEV;
retry:
handle = ext4_journal_start(dir, EXT4_DATA_TRANS_BLOCKS(dir->i_sb) +
EXT4_INDEX_EXTRA_TRANS_BLOCKS);
@@ -2365,6 +2368,10 @@ static int ext4_rename(struct inode *old_dir, struct dentry *old_dentry,
* in separate transaction */
if (new_dentry->d_inode)
vfs_dq_init(new_dentry->d_inode);
+
+ if (!ext4_prj_may_rename(new_dir, old_dentry->d_inode))
+ return -EXDEV;
+
handle = ext4_journal_start(old_dir, 2 *
EXT4_DATA_TRANS_BLOCKS(old_dir->i_sb) +
EXT4_INDEX_EXTRA_TRANS_BLOCKS + 2);
diff --git a/fs/ext4/project.c b/fs/ext4/project.c
index 8de8c0c..dea0487 100644
--- a/fs/ext4/project.c
+++ b/fs/ext4/project.c
@@ -25,6 +25,17 @@
* (1) Each inode has subtree id. This id is persistently stored inside
* inode's xattr, usually inside ibody
* (2) Subtree id is inherent from parent directory
+ *
+ * PROJECT ISOLATION
+ * This feature allows creating isolated subtrees.
+ * Isolation means that:
+ * 1) Subtrees have no common inodes (no hardlinks across subtrees)
+ * 2) All descendants belong to the same subtree.
+ *
+ * Project subtree isolation assumptions:
+ * 1) An inode cannot belong to different subtrees.
+ * Otherwise changes in one subtree result in changes in another subtree,
+ * which contradicts the isolation criteria.
*/

/*
@@ -149,6 +160,101 @@ int ext4_prj_read(struct inode *inode)
inode->i_prjid = 0;
return ret;
}
+
+enum {
+ EXT4_SUBTREE_SAME = 1, /* Both nodes belongs to same subtree */
+ EXT4_SUBTREE_COMMON, /* Ancestor tree includes descent subtree*/
+ EXT4_SUBTREE_CROSS, /* Nodes belongs to different subtrees */
+};
+
+/**
+ * Check ancestor descendant subtree relationship.
+ * @ancino: ancestor inode
+ * @inode: descendant inode
+ */
+static inline int ext4_which_subtree(struct inode *ancino, struct inode *inode)
+{
+ if (inode->i_prjid == ancino->i_prjid)
+ return EXT4_SUBTREE_SAME;
+ else if (ancino->i_prjid == 0)
+ /*
+ * Ancestor inode belongs to default tree and it includes
+ * other subtrees by default
+ */
+ return EXT4_SUBTREE_COMMON;
+ return EXT4_SUBTREE_CROSS;
+}
+
+/**
+ * Check subtree assumptions on ext4_link()
+ * @tdir: target directory inode
+ * @inode: inode in question
+ * @return: true if link is possible, zero otherwise
+ */
+inline int ext4_prj_may_link(struct inode *tdir, struct inode *inode)
+{
+ if (!test_opt(inode->i_sb, PRJ_ISOLATION))
+ return 1;
+ /*
+ * According to subtree quota assumptions inode can not belongs to
+ * different quota trees.
+ */
+ if(ext4_which_subtree(tdir, inode) != EXT4_SUBTREE_SAME)
+ return 0;
+ return 1;
+}
+
+/**
+ * Check for directory subtree assumptions on ext4_rename()
+ * @new_dir: new directory inode
+ * @inode: inode in question
+ * @return: true if rename is possible, zero otherwise.
+ */
+inline int ext4_prj_may_rename(struct inode *new_dir, struct inode *inode)
+{
+ int same;
+ if (!test_opt(inode->i_sb, PRJ_ISOLATION))
+ return 1;
+ // XXX: Seems what i_nlink check is racy
+ // Is it possible to get inode->i_mutex here?
+ same = ext4_which_subtree(new_dir, inode);
+ if (S_ISDIR(inode->i_mode)) {
+ if (same == EXT4_SUBTREE_CROSS)
+ return 0;
+ } else {
+ if (inode->i_nlink > 1) {
+ /*
+ * If we allow to move any dentry of inode which has
+ * more than one link between subtrees then we end up
+ * with inode which belongs to different subtrees.
+ */
+ if (same != EXT4_SUBTREE_SAME)
+ return 0;
+ } else {
+ if (same == EXT4_SUBTREE_CROSS)
+ return 0;
+ }
+ }
+ return 1;
+}
+
+/**
+ * Check subtree parent/child relationship assumptions.
+ */
+inline void ext4_prj_check_parent(struct inode *dir, struct inode *inode)
+{
+ if (!test_opt(dir->i_sb, PRJ_ISOLATION))
+ return;
+
+ if (ext4_which_subtree(dir, inode) == EXT4_SUBTREE_CROSS) {
+ ext4_warning(inode->i_sb,
+ "Bad subtree hierarchy: directory{ino:%lu, project:%u} "
+ "inode{ino:%lu, project:%u}\n",
+ dir->i_ino, dir->i_prjid,
+ inode->i_ino, inode->i_prjid);
+ }
+}
+
/*
* Initialize the projectid xattr of a new inode. Called from ext4_new_inode.
*
diff --git a/fs/ext4/project.h b/fs/ext4/project.h
index 7e80579..058bdc1 100644
--- a/fs/ext4/project.h
+++ b/fs/ext4/project.h
@@ -7,7 +7,9 @@ extern int ext4_prj_xattr_write(handle_t *handle, struct inode *inode,
unsigned int prjid, int xflags);
extern int ext4_prj_init(handle_t *handle, struct inode *inode);
extern int ext4_prj_read(struct inode *inode);
-
+extern int ext4_prj_may_link(struct inode *dir, struct inode *inode);
+extern int ext4_prj_may_rename(struct inode *dir, struct inode *inode);
+extern void ext4_prj_check_parent(struct inode *dir, struct inode *inode);
#else
static inline int ext4_prj_xattr_read(struct inode *inode, unsigned int *prjid)
{
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 240df9a..92b0662 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -963,6 +963,9 @@ static int ext4_show_options(struct seq_file *seq, struct vfsmount *vfs)
if (test_opt(sb, PROJECT_ID))
seq_puts(seq, ",project_id");

+ if (test_opt(sb, PRJ_ISOLATION))
+ seq_puts(seq, ",project_isolation");
+
if (test_opt(sb, NOLOAD))
seq_puts(seq, ",norecovery");

@@ -1153,7 +1156,7 @@ enum {
Opt_block_validity, Opt_noblock_validity,
Opt_inode_readahead_blks, Opt_journal_ioprio,
Opt_dioread_nolock, Opt_dioread_lock,
- Opt_discard, Opt_nodiscard, Opt_project_id,
+ Opt_discard, Opt_nodiscard, Opt_project_id, Opt_prj_isolation
};

static const match_table_t tokens = {
@@ -1225,6 +1228,7 @@ static const match_table_t tokens = {
{Opt_discard, "discard"},
{Opt_nodiscard, "nodiscard"},
{Opt_project_id, "project_id"},
+ {Opt_prj_isolation, "project_isolation"},
{Opt_err, NULL},
};

@@ -1696,6 +1700,9 @@ set_qf_format:
case Opt_project_id:
set_opt(sbi->s_mount_opt, PROJECT_ID);
break;
+ case Opt_prj_isolation:
+ set_opt(sbi->s_mount_opt, PRJ_ISOLATION);
+ break;
default:
ext4_msg(sb, KERN_ERR,
"Unrecognized mount option \"%s\" "
--
1.6.6


2010-03-04 18:34:56

by Dmitry Monakhov

Subject: [PATCH 5/5] ext4: add project quota support

Both regular and journaled quota are supported.
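
One consequence, visible in the parse_options() hunk of this patch, is that the project type takes part in the existing "old and new quota format mixing" check. The userspace sketch below models only that part of the logic; `struct quota_opts` and `quota_opts_check` are hypothetical names, and the other checks done by the real function (for example on jqfmt) are deliberately left out.

```c
#include <stdbool.h>
#include <stddef.h>

enum { USRQUOTA, GRPQUOTA, PRJQUOTA, MAXQUOTAS };

/* Hypothetical summary of the parsed quota mount options. */
struct quota_opts {
	const char *qf_names[MAXQUOTAS];	/* usrjquota=/grpjquota=/prjjquota= */
	bool plain[MAXQUOTAS];			/* usrquota/grpquota/prjquota */
};

/*
 * Returns true if the combination is acceptable. As in the patch, a plain
 * flag is dropped when a journaled file is named for the same type, and
 * any plain flag left over next to a journaled type counts as mixing.
 */
bool quota_opts_check(struct quota_opts *o)
{
	bool any_journaled = false;
	int t;

	for (t = 0; t < MAXQUOTAS; t++)
		if (o->qf_names[t] != NULL)
			any_journaled = true;
	if (!any_journaled)
		return true;

	for (t = 0; t < MAXQUOTAS; t++)
		if (o->plain[t] && o->qf_names[t] != NULL)
			o->plain[t] = false;

	for (t = 0; t < MAXQUOTAS; t++)
		if (o->plain[t])
			return false;	/* old and new quota format mixing */
	return true;
}
```

For instance, naming a journaled project quota file together with a plain usrquota flag is rejected, while plain usrquota, grpquota and prjquota together are accepted.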

Signed-off-by: Dmitry Monakhov <[email protected]>
---
fs/ext4/ext4.h | 1 +
fs/ext4/super.c | 64 ++++++++++++++++++++++++++++++++++++++++++++++++-------
2 files changed, 57 insertions(+), 8 deletions(-)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 8fc8257..2d775d0 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -745,6 +745,7 @@ struct ext4_inode_info {
#define EXT4_MOUNT_ERRORS_PANIC 0x00040 /* Panic on errors */
#define EXT4_MOUNT_MINIX_DF 0x00080 /* Mimics the Minix statfs */
#define EXT4_MOUNT_NOLOAD 0x00100 /* Don't use existing journal*/
+#define EXT4_MOUNT_PRJQUOTA 0x00200 /* Project quota support */
#define EXT4_MOUNT_DATA_FLAGS 0x00C00 /* Mode for data writes: */
#define EXT4_MOUNT_JOURNAL_DATA 0x00400 /* Write data to journal */
#define EXT4_MOUNT_ORDERED_DATA 0x00800 /* Flush data before commit */
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 92b0662..d023e80 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -833,11 +833,18 @@ static inline void ext4_show_quota_options(struct seq_file *seq,
if (sbi->s_qf_names[GRPQUOTA])
seq_printf(seq, ",grpjquota=%s", sbi->s_qf_names[GRPQUOTA]);

+ if (sbi->s_qf_names[PRJQUOTA])
+ seq_printf(seq, ",prjjquota=%s", sbi->s_qf_names[PRJQUOTA]);
+
if (test_opt(sb, USRQUOTA))
seq_puts(seq, ",usrquota");

if (test_opt(sb, GRPQUOTA))
seq_puts(seq, ",grpquota");
+
+ if (test_opt(sb, PRJQUOTA))
+ seq_puts(seq, ",prjquota");
+
#endif
}

@@ -1041,8 +1048,8 @@ static int bdev_try_to_free_page(struct super_block *sb, struct page *page,
}

#ifdef CONFIG_QUOTA
-#define QTYPE2NAME(t) ((t) == USRQUOTA ? "user" : "group")
-#define QTYPE2MOPT(on, t) ((t) == USRQUOTA?((on)##USRJQUOTA):((on)##GRPJQUOTA))
+static char *quotatypes[] = INITQFNAMES;
+#define QTYPE2NAME(t) (quotatypes[t])

static int ext4_write_dquot(struct dquot *dquot);
static int ext4_acquire_dquot(struct dquot *dquot);
@@ -1148,10 +1155,11 @@ enum {
Opt_journal_checksum, Opt_journal_async_commit,
Opt_abort, Opt_data_journal, Opt_data_ordered, Opt_data_writeback,
Opt_data_err_abort, Opt_data_err_ignore,
- Opt_usrjquota, Opt_grpjquota, Opt_offusrjquota, Opt_offgrpjquota,
+ Opt_usrjquota, Opt_grpjquota, Opt_prjjquota, Opt_offusrjquota,
+ Opt_offgrpjquota, Opt_offprjjquota,
Opt_jqfmt_vfsold, Opt_jqfmt_vfsv0, Opt_jqfmt_vfsv1, Opt_quota,
Opt_noquota, Opt_ignore, Opt_barrier, Opt_nobarrier, Opt_err,
- Opt_resize, Opt_usrquota, Opt_grpquota, Opt_i_version,
+ Opt_resize, Opt_usrquota, Opt_grpquota, Opt_prjquota, Opt_i_version,
Opt_stripe, Opt_delalloc, Opt_nodelalloc,
Opt_block_validity, Opt_noblock_validity,
Opt_inode_readahead_blks, Opt_journal_ioprio,
@@ -1201,10 +1209,13 @@ static const match_table_t tokens = {
{Opt_usrjquota, "usrjquota=%s"},
{Opt_offgrpjquota, "grpjquota="},
{Opt_grpjquota, "grpjquota=%s"},
+ {Opt_offprjjquota, "prjjquota="},
+ {Opt_prjjquota, "prjjquota=%s"},
{Opt_jqfmt_vfsold, "jqfmt=vfsold"},
{Opt_jqfmt_vfsv0, "jqfmt=vfsv0"},
{Opt_jqfmt_vfsv1, "jqfmt=vfsv1"},
{Opt_grpquota, "grpquota"},
+ {Opt_prjquota, "prjquota"},
{Opt_noquota, "noquota"},
{Opt_quota, "quota"},
{Opt_usrquota, "usrquota"},
@@ -1525,6 +1536,16 @@ static int parse_options(char *options, struct super_block *sb,
if (!set_qf_name(sb, GRPQUOTA, &args[0]))
return 0;
break;
+
+ case Opt_prjjquota:
+#ifdef CONFIG_PROJECT_ID
+ if (!set_qf_name(sb, PRJQUOTA, &args[0]))
+ return 0;
+#else
+ ext4_msg(sb, KERN_ERR,
+ "project quota options not supported");
+#endif
+ break;
case Opt_offusrjquota:
if (!clear_qf_name(sb, USRQUOTA))
return 0;
@@ -1533,7 +1554,15 @@ static int parse_options(char *options, struct super_block *sb,
if (!clear_qf_name(sb, GRPQUOTA))
return 0;
break;
-
+ case Opt_offprjjquota:
+#ifdef CONFIG_PROJECT_ID
+ if (!clear_qf_name(sb, PRJQUOTA))
+ return 0;
+#else
+ ext4_msg(sb, KERN_ERR,
+ "project quota options not supported");
+#endif
+ break;
case Opt_jqfmt_vfsold:
qfmt = QFMT_VFS_OLD;
goto set_qf_format;
@@ -1561,6 +1590,15 @@ set_qf_format:
set_opt(sbi->s_mount_opt, QUOTA);
set_opt(sbi->s_mount_opt, GRPQUOTA);
break;
+ case Opt_prjquota:
+#ifdef CONFIG_PROJECT_ID
+ set_opt(sbi->s_mount_opt, QUOTA);
+ set_opt(sbi->s_mount_opt, PRJQUOTA);
+#else
+ ext4_msg(sb, KERN_ERR,
+ "project quota options not supported");
+#endif
+ break;
case Opt_noquota:
if (sb_any_quota_loaded(sb)) {
ext4_msg(sb, KERN_ERR, "Cannot change quota "
@@ -1570,18 +1608,22 @@ set_qf_format:
clear_opt(sbi->s_mount_opt, QUOTA);
clear_opt(sbi->s_mount_opt, USRQUOTA);
clear_opt(sbi->s_mount_opt, GRPQUOTA);
+ clear_opt(sbi->s_mount_opt, PRJQUOTA);
break;
#else
case Opt_quota:
case Opt_usrquota:
case Opt_grpquota:
+ case Opt_prjquota:
ext4_msg(sb, KERN_ERR,
"quota options not supported");
break;
case Opt_usrjquota:
case Opt_grpjquota:
+ case Opt_prjjquota:
case Opt_offusrjquota:
case Opt_offgrpjquota:
+ case Opt_offprjjquota:
case Opt_jqfmt_vfsold:
case Opt_jqfmt_vfsv0:
case Opt_jqfmt_vfsv1:
@@ -1711,14 +1753,19 @@ set_qf_format:
}
}
#ifdef CONFIG_QUOTA
- if (sbi->s_qf_names[USRQUOTA] || sbi->s_qf_names[GRPQUOTA]) {
+ if (sbi->s_qf_names[USRQUOTA] || sbi->s_qf_names[GRPQUOTA] ||
+ sbi->s_qf_names[PRJQUOTA]) {
if (test_opt(sb, USRQUOTA) && sbi->s_qf_names[USRQUOTA])
clear_opt(sbi->s_mount_opt, USRQUOTA);

if (test_opt(sb, GRPQUOTA) && sbi->s_qf_names[GRPQUOTA])
clear_opt(sbi->s_mount_opt, GRPQUOTA);

- if (test_opt(sb, GRPQUOTA) || test_opt(sb, USRQUOTA)) {
+ if (test_opt(sb, PRJQUOTA) && sbi->s_qf_names[PRJQUOTA])
+ clear_opt(sbi->s_mount_opt, PRJQUOTA);
+
+ if (test_opt(sb, GRPQUOTA) || test_opt(sb, USRQUOTA) ||
+ test_opt(sb, PRJQUOTA)) {
ext4_msg(sb, KERN_ERR, "old and new quota "
"format mixing");
return 0;
@@ -3878,7 +3925,8 @@ static int ext4_mark_dquot_dirty(struct dquot *dquot)
{
/* Are we journaling quotas? */
if (EXT4_SB(dquot->dq_sb)->s_qf_names[USRQUOTA] ||
- EXT4_SB(dquot->dq_sb)->s_qf_names[GRPQUOTA]) {
+ EXT4_SB(dquot->dq_sb)->s_qf_names[GRPQUOTA] ||
+ EXT4_SB(dquot->dq_sb)->s_qf_names[PRJQUOTA]) {
dquot_mark_dquot_dirty(dquot);
return ext4_write_dquot(dquot);
} else {
--
1.6.6
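
For anyone skimming the parse_options() hunk above, the "old and new quota format mixing" check can be modelled in user space roughly as follows. This is only a sketch: `struct quota_opts` and `check_quota_mixing()` are made-up names, the ordering of the quota types is an assumption, and only the part of the check visible in the hunk is covered.

```c
#include <stdbool.h>
#include <stddef.h>

/* Quota types; the ordering here is an assumption, not taken from the patch. */
enum { USRQUOTA, GRPQUOTA, PRJQUOTA, MAXQUOTAS };

/* Hypothetical stand-in for the relevant bits of ext4_sb_info. */
struct quota_opts {
    const char *qf_names[MAXQUOTAS]; /* journaled quota file per type, or NULL */
    bool plain[MAXQUOTAS];           /* usrquota/grpquota/prjquota mount bits */
};

/*
 * Mirrors the check in the hunk: if any journaled quota file is named,
 * drop the plain option for every type that has a file, then refuse the
 * mount if a plain option is still set for some other type.
 * Returns true when the combination is acceptable.
 */
static bool check_quota_mixing(struct quota_opts *o)
{
    bool any_journaled = false;

    for (int t = 0; t < MAXQUOTAS; t++)
        if (o->qf_names[t])
            any_journaled = true;
    if (!any_journaled)
        return true;

    for (int t = 0; t < MAXQUOTAS; t++)
        if (o->plain[t] && o->qf_names[t])
            o->plain[t] = false;

    for (int t = 0; t < MAXQUOTAS; t++)
        if (o->plain[t])
            return false; /* "old and new quota format mixing" */
    return true;
}
```

With this model, "usrjquota=<file>,prjquota" is rejected, while "prjjquota=<file>,prjquota" is accepted and the plain prjquota bit is silently cleared.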


2010-03-04 20:07:03

by Jan Kara

[permalink] [raw]
Subject: Re: [PATCH 4/5] ext4: add isolated project support

Hi,

> PROJECT_ISOLATION
> This feature allows to create an isolated project subtrees.
> Isolation means what:
> 1) directory subtree has no common inodes (no hadlinks across subtrees)
> 2) All descendants belongs to the same subtree.
>
> Project subtree's isolation assumptions:
> 1)Inode can not belongs to different subtree trees
> Otherwise changes in one subtree result in changes in other subtree
> which contradict to isolation criteria.
Just a curious question:
Do you really need this subtree separation in your envisioned containers
usecase? Because there I imagine you have one project_id per container,
containers form disjoint subtrees (at least their writeable parts) and
each file & directory has this project_id set and you forbid to manipulate
project id's from inside the container (otherwise you'd have problems with
enforcing quota limits I guess).

And when project_id is a per-inode property, quota has no problems with it
(is well defined) even without subtree separation. So is this subtree
separation really needed?

Honza
--
Jan Kara <[email protected]>
SuSE CR Labs

2010-03-04 20:34:43

by Dmitry Monakhov

[permalink] [raw]
Subject: Re: [PATCH 4/5] ext4: add isolated project support

Jan Kara <[email protected]> writes:

> Hi,
>
>> PROJECT_ISOLATION
>> This feature allows to create an isolated project subtrees.
>> Isolation means what:
>> 1) directory subtree has no common inodes (no hadlinks across subtrees)
>> 2) All descendants belongs to the same subtree.
>>
>> Project subtree's isolation assumptions:
>> 1)Inode can not belongs to different subtree trees
>> Otherwise changes in one subtree result in changes in other subtree
>> which contradict to isolation criteria.
> Just a curious question:
> Do you really need this subtree separation in your envisioned containers
> usecase? Because there I imagine you have one project_id per container,
> containers form disjoint subtrees (at least their writeable parts) and
> each file & directory has this project_id set and you forbid to manipulate
> project id's from inside the container (otherwise you'd have problems with
> enforcing quota limits I guess).
>
> And when project_id is a per-inode property, quota has no problems with it
> (is well defined) even without subtree separation. So is this subtree
> separation really needed?
You are right: containers deal with only one subtree, so a bind mount
is sufficient for all container-like solutions.
I did the isolation part after a long discussion with Dave Chinner.
He gave some examples where fs-specific (not mount-based) isolation
is useful. The most obvious example is a project that has several
trees. He said that this feature is used by XFS users.
But this feature attracts the most complaints from reviewers, and I'll
drop it if that is necessary to get the project-id quota merged.
>
> Honza

2010-03-11 12:01:52

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH 1/5] vfs: Add additional owner identifier

On Thu, Mar 04, 2010 at 09:34:33PM +0300, Dmitry Monakhov wrote:
> This patch add project inode identifier. Project ID may be used as
> auxiliary owner specifier in addition to standard uid/gid.

There's really no reason to make this a config option. And xattrs
are a rather bad interface for this, so even if a filesystem has to
resort to implementing the project ID this way, this should be hidden
inside the filesystem.


2010-03-11 12:03:35

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH 2/5] quota: Implement project id support for generic quota

On Thu, Mar 04, 2010 at 09:34:34PM +0300, Dmitry Monakhov wrote:
> case GRPQUOTA:
> return in_group_p(dquot->dq_id);
> + case PRJQUOTA:
> + /* XXX: Currently there is no way to understand
> + which project_id this task belonges to, So print
> + a warn message unconditionally. -dmon */
> + return 1;

Note that this is different from the XFS behaviour, which never warns
for project quota. In fact project quota in XFS traditionally doesn't
even return EDQUOT but ENOSPC instead, as it's not a traditional quota
mechanism but filesystem containerization.

Otherwise the patch looks good except for the ifdef mess already commented
on in the previous patch.
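
To make the errno difference concrete, here is a small user-space sketch. The names `enum quota_type` and `over_limit_errno()` are hypothetical and are taken neither from XFS nor from this patch; the function only illustrates the two policies being compared.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical quota type tags, for illustration only. */
enum quota_type { QT_USER, QT_GROUP, QT_PROJECT };

/*
 * Pick the errno reported when a limit of the given type is hit.
 * xfs_style == true models the XFS behaviour described above (a project
 * limit looks like a full filesystem); false models returning EDQUOT
 * for every quota type.
 */
static int over_limit_errno(enum quota_type type, bool xfs_style)
{
    if (type == QT_PROJECT && xfs_style)
        return ENOSPC;
    return EDQUOT;
}
```

An application inside a project tree that only checks for ENOSPC would notice the limit under the first policy but not under the second.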

2010-03-11 12:06:24

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH 3/5] ext4: Implement project ID support for ext4 filesystem

On Thu, Mar 04, 2010 at 09:34:35PM +0300, Dmitry Monakhov wrote:
> Project id is stored on disk inside xattr usually inside ibody.
> Xattr is used only as a data storage, It has not user visible xattr
> interface.
>
> * User interface
> Project id is accessible via generic xattr interface "system.project_id"
>
> TODO: implement e2libfs support for project_id.

I think you'd be much better off storing it inside the inode core itself.
E.g. you could use the never used fragment address in the ext2/3/4 disk
inode.

> +#ifdef CONFIG_QUOTA
> + qid[PRJQUOTA] = new_prjid;
> + ret = inode->i_sb->dq_op->transfer(inode, qid, 1 << PRJQUOTA);
> + if (ret)
> + return ret;
> +#endif

This needs to be updated to use dquot_transfer directly.


2010-03-11 12:07:48

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH 4/5] ext4: add isolated project support

On Thu, Mar 04, 2010 at 09:34:36PM +0300, Dmitry Monakhov wrote:
> PROJECT_ISOLATION
> This feature allows to create an isolated project subtrees.
> Isolation means what:
> 1) directory subtree has no common inodes (no hadlinks across subtrees)
> 2) All descendants belongs to the same subtree.
>
> Project subtree's isolation assumptions:
> 1)Inode can not belongs to different subtree trees
> Otherwise changes in one subtree result in changes in other subtree
> which contradict to isolation criteria.
>
> *Usage*
> We already has bind mounts which prevent link/remount across mounts.
> But if user has isolated project which consists of several subtrees
> and he want link/renames to work between subtrees(but in one project)
>
> Since this feature is non obvious it controlled by mount option.

Making this a mount option is even more non-obvious. Unless you have
a very good reason to support both and not just stick to the existing
"isolated" semantics, make it a chattr option so we can easily check
what kind of subtree we are dealing with.


2010-03-11 13:12:02

by Dmitry Monakhov

[permalink] [raw]
Subject: Re: [PATCH 1/5] vfs: Add additional owner identifier

Christoph Hellwig <[email protected]> writes:

> On Thu, Mar 04, 2010 at 09:34:33PM +0300, Dmitry Monakhov wrote:
>> This patch add project inode identifier. Project ID may be used as
>> auxiliary owner specifier in addition to standard uid/gid.
>
> There's really no reason to make this a config option.
The project id feature is not likely to be widely used (I hope only
at the beginning). Personally, I don't mind dropping the config option;
at least we would get many new users for free. But I predict
many angry voices against enabling this feature by default.
Look, we even have a CONFIG_BLOCK option, although that option
is enabled on 99.99% of systems.

Let's ask Jan and Al:
would they accept generic projectid support without an appropriate
config option?
> And xattrs
> are a rather bad interfaces to this, so even if a filesystem has to
> resort to implement the project ID this way, this should be hidden
> inside the filesystem.
Only one question: "Why not?"
I've tried to list the pros and cons.
*PRO*
1) The xattr interface is already available, so we don't have to invent
a new one.
2) xattrs are supported by most backup tools and other fs-related
tools.
3) The xattr interface is rather simple and flexible. A filesystem is
free to store its projectid wherever it likes; xattr is just a
manipulation interface.
*CONTRA*
1) XATTR_CREATE/XATTR_REPLACE is useless for project id, since every
inode already belongs to the default project (id == 0).
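
The CONTRA point can be illustrated with a small model. This is only one possible reading, not what the patch does: `struct inode_model` and `prjid_xattr_set()` are hypothetical, and the handler is assumed to treat "system.project_id" as always present because every inode starts in project 0.

```c
#include <errno.h>
#include <stdint.h>
#include <sys/xattr.h>   /* XATTR_CREATE, XATTR_REPLACE */

/* Hypothetical in-memory inode: only the cached project id. */
struct inode_model {
    uint32_t prjid;   /* 0 means the default project */
};

/*
 * Model of a set handler that treats the attribute as always existing.
 * XATTR_CREATE can then never succeed and XATTR_REPLACE can never fail,
 * so neither flag tells the caller anything useful.
 */
static int prjid_xattr_set(struct inode_model *inode, uint32_t id, int flags)
{
    if (flags & XATTR_CREATE)
        return -EEXIST;     /* the attribute conceptually always exists */
    inode->prjid = id;      /* flags == 0 and XATTR_REPLACE behave alike */
    return 0;
}
```

Under that assumption a caller gets the same result from flags == 0 and XATTR_REPLACE, which is why the flags carry no information here.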



2010-03-11 13:17:58

by Dmitry Monakhov

[permalink] [raw]
Subject: Re: [PATCH 2/5] quota: Implement project id support for generic quota

Christoph Hellwig <[email protected]> writes:

> On Thu, Mar 04, 2010 at 09:34:34PM +0300, Dmitry Monakhov wrote:
>> case GRPQUOTA:
>> return in_group_p(dquot->dq_id);
>> + case PRJQUOTA:
>> + /* XXX: Currently there is no way to understand
>> + which project_id this task belonges to, So print
>> + a warn message unconditionally. -dmon */
>> + return 1;
>
> Note that this is different from the XFS behaviour, which neve warns
> for project quota. In fact project quota in XFS traditionally doesn't
> even return EDQUOT but ENOSPC instead, as it's not a traditional quota
> mechanism but filesystem containerization.
Yes, I know, but IMHO it is better to return a true EDQUOT so that it
is clear what really happened.
>
> Otherwise the patch looks good except for the ifdef mess already comment
> on in the previous patch.

2010-03-11 13:30:55

by Dmitry Monakhov

[permalink] [raw]
Subject: Re: [PATCH 3/5] ext4: Implement project ID support for ext4 filesystem

Christoph Hellwig <[email protected]> writes:

> On Thu, Mar 04, 2010 at 09:34:35PM +0300, Dmitry Monakhov wrote:
>> Project id is stored on disk inside xattr usually inside ibody.
>> Xattr is used only as a data storage, It has not user visible xattr
>> interface.
>>
>> * User interface
>> Project id is accessible via generic xattr interface "system.project_id"
>>
>> TODO: implement e2libfs support for project_id.
>
> I think you'd be much better off storing it inide the inode core itself.
> E.g. you could ue the never used fragment address in the ext2/3/4 disk
> inode.
This was already discussed at the first RFC
http://patchwork.ozlabs.org/patch/38766
and Andreas was strongly against this idea.
>
>> +#ifdef CONFIG_QUOTA
>> + qid[PRJQUOTA] = new_prjid;
>> + ret = inode->i_sb->dq_op->transfer(inode, qid, 1 << PRJQUOTA);
>> + if (ret)
>> + return ret;
>> +#endif
>
> This needs to be updated to use dquot_transfer directly.

2010-03-11 18:50:18

by J. Bruce Fields

[permalink] [raw]
Subject: Re: [PATCH 1/5] vfs: Add additional owner identifier

On Thu, Mar 11, 2010 at 04:11:57PM +0300, Dmitry Monakhov wrote:
> Christoph Hellwig <[email protected]> writes:
>
> > On Thu, Mar 04, 2010 at 09:34:33PM +0300, Dmitry Monakhov wrote:
> >> This patch add project inode identifier. Project ID may be used as
> >> auxiliary owner specifier in addition to standard uid/gid.
> >
> > There's really no reason to make this a config option.
> Project id feature is not likely to be widely used (i hope only
> at the beginning). Personally this is not bad to avoid config option.
> At least we will have many new users for free. But I predict
> many angry voices against enabling this feature by default.

How would those "angry voices" notice this feature?

--b.

> Look, we even have CONFIG_BLOCK option at the time when this option
> is enabled in 99.99% of systems.
>
> Let's ask Jan an Al:
> Will they accept genetic projectid support without appropriate config option?
> > And xattrs
> > are a rather bad interfaces to this, so even if a filesystem has to
> > resort to implement the project ID this way, this should be hidden
> > inside the filesystem.
> Only one question "Why not?".
> I've tried to count pro and contra options.
> *PRO*
> 1) xattr interface already available, we don't have to invent a new one
> 2) xattrs are supported by most of backup tools and other fs-related
> tools.
> 3) xattr interface is rather simple and flexible. Filesystem is free
> to store it's projectid whenever it likes to, xattr is just an
> manipulation interface.
> *CONTRA*
> 1) XATTR_CREATE/XATTR_REPLACE is useless for project id since
> inode is belongs to default project (id == 0)
>
>

2010-03-11 19:40:19

by Andreas Dilger

[permalink] [raw]
Subject: Re: [PATCH 1/5] vfs: Add additional owner identifier

On 2010-03-11, at 11:51, J. Bruce Fields wrote:
> On Thu, Mar 11, 2010 at 04:11:57PM +0300, Dmitry Monakhov wrote:
>> Christoph Hellwig <[email protected]> writes:
>>> There's really no reason to make this a config option.
>> Project id feature is not likely to be widely used (i hope only
>> at the beginning). Personally this is not bad to avoid config option.
>> At least we will have many new users for free. But I predict
>> many angry voices against enabling this feature by default.
>
> How would those "angry voices" notice this feature?


The embedded folks are continually complaining about kernel bloat.

I don't see it as a bad thing that there is a config option for a
feature that will likely be used by < 1% of Linux users.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.


2010-03-11 19:54:46

by Andreas Dilger

[permalink] [raw]
Subject: Re: [PATCH 3/5] ext4: Implement project ID support for ext4 filesystem

On 2010-03-11, at 06:30, Dmitry Monakhov wrote:
> Christoph Hellwig <[email protected]> writes:
>>
>> I think you'd be much better off storing it inide the inode core
>> itself.
>> E.g. you could ue the never used fragment address in the ext2/3/4
>> disk
>> inode.
>
> This was already discussed at the first RFC
> http://patchwork.ozlabs.org/patch/38766
> and Andreas was strongly against this idea.


I had written:
>> You can instead just store this data in an xattr (which will
>> normally be stored in the inode, so no performance impact), and
>> then you are free to store multiple values per inode.

I don't know if I would classify this as "strongly against", but it is
true that I'm hesitant to use up the last field in the inode for
something that may be used so rarely. There is also some desire to
use this field for an extended i_links_hi field, and/or an inode
checksum.

Part of my suggestion to use xattrs was that it would then be possible
to allow hard links to have different project IDs on the same file,
since the size of the xattr is flexible. Since the xattr is stored
inside the inode in ext3/ext4 if the filesystem was formatted with
256-byte inodes, this is a minimal performance hit.

A second possibility (if there is really no desire to have more than a
single project ID per inode) is to add a field to the "large" inode
for ext4, though that doesn't help filesystems that were not formatted
that way, and it also consumes space in all inodes even if this
feature is not used.
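
As a rough illustration of the in-inode versus external-block distinction above, consider the sketch below. The names `enum xattr_home` and `prjid_xattr_home()` are hypothetical. The space the xattr needs is passed in by the caller rather than computed, and the model assumes no other xattr is competing for the in-inode area.

```c
#include <stdint.h>

#define GOOD_OLD_INODE_SIZE 128u  /* size of the original ext2/3 on-disk inode */

enum xattr_home { XATTR_IN_INODE_BODY, XATTR_IN_EXTERNAL_BLOCK };

/*
 * inode_size:  on-disk inode size chosen at mkfs time (128, 256, ...)
 * extra_isize: bytes of extra fixed fields after the first 128 bytes
 * needed:      bytes the project-id xattr needs in the in-inode area
 *              (caller-supplied; not derived from the on-disk format here)
 */
static enum xattr_home prjid_xattr_home(uint32_t inode_size,
                                        uint32_t extra_isize,
                                        uint32_t needed)
{
    uint32_t used, room;

    if (inode_size <= GOOD_OLD_INODE_SIZE)
        return XATTR_IN_EXTERNAL_BLOCK;   /* no in-inode xattr area at all */

    used = GOOD_OLD_INODE_SIZE + extra_isize;
    if (used >= inode_size)
        return XATTR_IN_EXTERNAL_BLOCK;   /* extra fields fill the inode */

    room = inode_size - used;
    return needed <= room ? XATTR_IN_INODE_BODY : XATTR_IN_EXTERNAL_BLOCK;
}
```

With 128-byte inodes the attribute always ends up in an external block, which is where the extra I/O would come from; with 256-byte inodes a small value normally fits next to the inode.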

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.


2010-03-11 22:01:37

by Theodore Ts'o

[permalink] [raw]
Subject: Re: [PATCH 3/5] ext4: Implement project ID support for ext4 filesystem

On Thu, Mar 11, 2010 at 12:54:46PM -0700, Andreas Dilger wrote:
> A second possibility (if there is really no desire to have more than
> a single project ID per inode) is to add a field to the "large"
> inode for ext4, though that doesn't help filesystems that were not
> formatted that way, and it also consumes space in all inodes even if
> this feature is not used.

The big question that I'm still uncertain about is how often are
people going to be using this feature, and how many project ID's do we
really need? I know Dmitry believes this is going to be the greatest
thing since sliced bread, but even for people running virtualization,
I'm not sure how many folks really will consider it critical.

I'd be a bit more willing to give the last 16-bit field for the
project ID, but otherwise, I think using a 32-bit field in the large
inode might be the better compromise if we don't like the xattr
approach.

Of course, I'm willing to be convinced otherwise with some sound
technical arguments. (Or beer; beer is good too. :-)

- Ted

2010-03-12 08:47:47

by Dmitry Monakhov

[permalink] [raw]
Subject: Re: [PATCH 1/5] vfs: Add additional owner identifier

Andreas Dilger <[email protected]> writes:

> On 2010-03-11, at 11:51, J. Bruce Fields wrote:
>> On Thu, Mar 11, 2010 at 04:11:57PM +0300, Dmitry Monakhov wrote:
>>> Christoph Hellwig <[email protected]> writes:
>>>> There's really no reason to make this a config option.
>>> Project id feature is not likely to be widely used (i hope only
>>> at the beginning). Personally this is not bad to avoid config option.
>>> At least we will have many new users for free. But I predict
>>> many angry voices against enabling this feature by default.
>>
>> How would those "angry voices" notice this feature?
>
>
> The embedded folks are continually complaining about kernel bloat.
>
> I don't see it as a bad thing that there is a config option for a
> feature that will likely be used by < 1% of Linux users.
We definitely have to hide projectid under a kernel config option, but
which option should it be? We have two choices:
1) CONFIG_QUOTA: because most users who use quota will probably
also use project id support.
2) Introduce a new config option: this may be reasonable if projectid
is useful without quota and inode size is critical.
(I don't know of such cases for now.)
>
> Cheers, Andreas
> --
> Andreas Dilger
> Sr. Staff Engineer, Lustre Group
> Sun Microsystems of Canada, Inc.

2010-03-12 09:32:09

by Dmitry Monakhov

[permalink] [raw]
Subject: Re: [PATCH 3/5] ext4: Implement project ID support for ext4 filesystem

[email protected] writes:

> On Thu, Mar 11, 2010 at 12:54:46PM -0700, Andreas Dilger wrote:
>> A second possibility (if there is really no desire to have more than
>> a single project ID per inode) is to add a field to the "large"
>> inode for ext4, though that doesn't help filesystems that were not
>> formatted that way, and it also consumes space in all inodes even if
>> this feature is not used.
>
> The big question that I'm still uncertain about is how often are
> people going to be using this feature, and how many project ID's do we
> really need? I know Dimitry believes this is going to be the greatest
> thing since sliced bread, but even for people running virtualization,
> I'm not sure how many folks really will consider it critical.
Most of our customers (hosting providers) use quota; otherwise
it is impossible to restrict disk usage. Currently they have to
perform a full quotacheck after a power failure, which results in huge
service downtime. If we were able to use journalled quota, all these
problems would be solved.
Also, the NFS people were interested in the projectid feature. They want
to use it for creating safe file handles.
http://marc.info/?l=linux-fsdevel&m=126634832431306&w=2
In fact the projectid feature is not intrusive (except for the isolation
part); it is even much simpler than ACL.
>
> I'd be a bit more willing to give the last 16-bit field for the
> project ID, but otherwise, I think using a 32-bit field in the large
> inode might be the better compromise if we don't like the xattr
> approach.
IMHO one 32-bit value in an xattr is the best solution, because it is
stored in the inode's in-body xattr. Whether we store it via update_inode
or xattr_set is not really important.
Another plus is that we are able to support all existing filesystems
without any problems, even if they have 128-byte inodes.
>
> Of course, I'm willing to be convinced otherwise with some sound
> technical arguments. (Or beer; beer is good too. :-)
I've attached two mega-liters of beer. Feel free to drink the attachment :)

2010-03-12 20:07:29

by J. Bruce Fields

[permalink] [raw]
Subject: Re: [PATCH 3/5] ext4: Implement project ID support for ext4 filesystem

On Fri, Mar 12, 2010 at 12:32:09PM +0300, Dmitry Monakhov wrote:
> [email protected] writes:
>
> > On Thu, Mar 11, 2010 at 12:54:46PM -0700, Andreas Dilger wrote:
> >> A second possibility (if there is really no desire to have more than
> >> a single project ID per inode) is to add a field to the "large"
> >> inode for ext4, though that doesn't help filesystems that were not
> >> formatted that way, and it also consumes space in all inodes even if
> >> this feature is not used.
> >
> > The big question that I'm still uncertain about is how often are
> > people going to be using this feature, and how many project ID's do we
> > really need? I know Dimitry believes this is going to be the greatest
> > thing since sliced bread, but even for people running virtualization,
> > I'm not sure how many folks really will consider it critical.
> Most of our customers (hosting providers) use quota, otherwise
> it is impossible to restrict disk usage. Currently they have to
> perform full quotecheck after power failure. Which result in huge
> service down time. If we able to use journalled quota all problems
> will be solved.
> Also NFS people was interesting in projectid feature. They want to
> use it for creating safe file-handles.
> http://marc.info/?l=linux-fsdevel&m=126634832431306&w=2

By the way, have you looked at all at what it would take to be able to
encode and decode filehandles with projectid's in them?

--b.