2010-03-18 14:02:58

by Dmitry Monakhov

Subject: [PATCH 0/5] RFC: introduce extended inode owner identifier v6

This is the 6th version of the extended inode owner patch set.
Please review it and tell me what you think about all this.
Do you agree with this approach?
Are you worried about any implementation details?
Is it ready to be merged into some development tree?

*Feature description*
1) An inode may have a project identifier, which has the same meaning
as uid/gid.
2) The id is stored in the inode's xattr named "system.project_id".
3) The id is inherited from the parent inode on creation.
4) The id is cached in the in-memory inode structure, vfs_inode->i_prjid.
This field is guarded by CONFIG_PROJECT_ID, so no memory is wasted
when the option is disabled.

5) Since the id is cached in memory, it may be used for different
purposes, such as:
5A) Implementing an additional quota id space orthogonal to uid/gid.
This is useful for managing quota for a filesystem hierarchy (a chroot
or a container over a bind mount).
5B) Exporting a dedicated fs hierarchy to nfsd (only inodes which have
a given project_id will be accessible via nfsd).

6) It is possible to create an isolated project subtree.
Note: please do not blame the isolation feature before you read the
isolation patch description; after that, comments are welcome.
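
As a rough illustration of points 3 and 5A, here is a minimal user-space
sketch. It is not kernel code: struct model_inode, model_create(),
struct model_prj_quota and model_charge() are hypothetical names invented
for this example, and the fixed-size table is only an assumption that
keeps the sketch short.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical user-space model; this is not the kernel's struct inode. */
struct model_inode {
    uint32_t uid;
    uint32_t gid;
    uint32_t prjid;   /* plays the role of i_prjid */
};

/* Point 3: a new inode takes its project id from the parent directory. */
static struct model_inode model_create(const struct model_inode *dir,
                                       uint32_t uid, uint32_t gid)
{
    struct model_inode child = { uid, gid, dir->prjid };
    return child;
}

#define MODEL_MAX_PROJECTS 8   /* arbitrary size chosen for the sketch */

/* Point 5A: usage is accounted per project id, independent of uid/gid. */
struct model_prj_quota {
    uint64_t used[MODEL_MAX_PROJECTS];
    uint64_t limit[MODEL_MAX_PROJECTS];   /* 0 means "no limit" */
};

/* Returns 0 on success, -1 if the charge would exceed the project limit. */
static int model_charge(struct model_prj_quota *q,
                        const struct model_inode *inode, uint64_t blocks)
{
    uint32_t id = inode->prjid;

    if (id >= MODEL_MAX_PROJECTS)
        return -1;
    if (q->limit[id] && q->used[id] + blocks > q->limit[id])
        return -1;
    q->used[id] += blocks;
    return 0;
}
```

Two files created by different users under the same directory share one
project id in this model, so both are charged against the same limit.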

*User interface*
The project id is managed via the generic xattr interface, under the
name "system.project_id". This is good because:
1) We can use an already existing interface.
2) xattrs are already supported by generic utilities such as tar and rsync.
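
From user space this could look roughly like the sketch below, which
only wraps the standard Linux getxattr(2)/setxattr(2) calls. The helper
names prjid_xattr_get() and prjid_xattr_set() are hypothetical, and the
value is passed through as raw bytes because the encoding seen by user
space is not described here.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/xattr.h>

#define PRJID_XATTR "system.project_id"   /* name taken from this patch set */

/*
 * Fetch the raw attribute value. Returns the number of bytes read,
 * or -1 with errno set (for example ENOENT, ENODATA or EOPNOTSUPP).
 */
static ssize_t prjid_xattr_get(const char *path, void *buf, size_t size)
{
    return getxattr(path, PRJID_XATTR, buf, size);
}

/* Store a raw attribute value. Returns 0 on success, -1 with errno set. */
static int prjid_xattr_set(const char *path, const void *buf, size_t size)
{
    return setxattr(path, PRJID_XATTR, buf, size, 0);
}
```

On a kernel without this patch set both helpers are expected to fail.
With it, patch 1/5 requires CAP_SYS_RESOURCE for an ATTR_PRJID change.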

PATCH SET TOC:
1) generic projectid support
2) generic project quota support
3) ext4 project support implementation
3A) ext4: generic project support
3B) ext4: project quota support
3C) ext4: project isolation support. This patch is not essential,
but it makes the ext4 rename behaviour equal to that of XFS.

Patch is against linux-next-20100318
Changes since v5:
- convert dquot_transfer to the struct iattr interface. Now it is
possible to change i_prjid via notify_change()
- some bugfixes.
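
The iattr-based transfer can be pictured with the following user-space
sketch, assuming simplified stand-in types. All m_ prefixed names are
hypothetical and only model the idea: compare the requested ids with the
current ones, build a mask of quota types to transfer, and store each new
id in the slot of its own quota type.

```c
#include <assert.h>
#include <stdint.h>

/* Model constants; the M_PRJQUOTA slot and bit 17 follow this patch set. */
enum { M_USRQUOTA = 0, M_GRPQUOTA = 1, M_PRJQUOTA = 2, M_MAXQUOTAS = 3 };

#define M_ATTR_UID   (1u << 1)
#define M_ATTR_GID   (1u << 2)
#define M_ATTR_PRJID (1u << 17)

struct m_owner { uint32_t uid, gid, prjid; };
struct m_iattr { unsigned int valid; uint32_t uid, gid, prjid; };

/*
 * Fill chid[] with the new ids and return a bitmask of the quota types
 * that have to be transferred. Unchanged ids are left out of the mask.
 */
static unsigned int m_transfer_mask(const struct m_owner *cur,
                                    const struct m_iattr *attr,
                                    uint32_t chid[M_MAXQUOTAS])
{
    unsigned int mask = 0;

    if ((attr->valid & M_ATTR_UID) && attr->uid != cur->uid) {
        mask |= 1u << M_USRQUOTA;
        chid[M_USRQUOTA] = attr->uid;
    }
    if ((attr->valid & M_ATTR_GID) && attr->gid != cur->gid) {
        mask |= 1u << M_GRPQUOTA;
        chid[M_GRPQUOTA] = attr->gid;
    }
    if ((attr->valid & M_ATTR_PRJID) && attr->prjid != cur->prjid) {
        mask |= 1u << M_PRJQUOTA;
        chid[M_PRJQUOTA] = attr->prjid;   /* project slot, not the group slot */
    }
    return mask;
}
```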


2010-03-18 14:03:01

by Dmitry Monakhov

Subject: [PATCH 1/5] vfs: Add additional owner identifier

This patch adds a project inode identifier. The project ID may be used
as an auxiliary owner specifier in addition to the standard uid/gid.
---
fs/Kconfig | 7 +++++++
fs/attr.c | 10 +++++++++-
include/linux/fs.h | 8 ++++++++
include/linux/xattr.h | 3 +++
4 files changed, 27 insertions(+), 1 deletions(-)

diff --git a/fs/Kconfig b/fs/Kconfig
index 5f85b59..23957c0 100644
--- a/fs/Kconfig
+++ b/fs/Kconfig
@@ -54,6 +54,13 @@ config FILE_LOCKING
This option enables standard file locking support, required
for filesystems like NFS and for the flock() system
call. Disabling this option saves about 11k.
+config PROJECT_ID
+ bool "Enable project inode identifier"
+ default y
+ help
+ This option enables project inode identifier. Project ID
+ may be used as auxiliary owner specifier in addition to
+ standard uid/gid.

source "fs/notify/Kconfig"

diff --git a/fs/attr.c b/fs/attr.c
index 0815e93..2894cc7 100644
--- a/fs/attr.c
+++ b/fs/attr.c
@@ -32,6 +32,9 @@ int inode_change_ok(const struct inode *inode, struct iattr *attr)
attr->ia_uid != inode->i_uid) && !capable(CAP_CHOWN))
goto error;

+ if ((ia_valid & ATTR_PRJID) && !capable(CAP_SYS_RESOURCE))
+ goto error;
+
/* Make sure caller can chgrp. */
if ((ia_valid & ATTR_GID) &&
(current_fsuid() != inode->i_uid ||
@@ -119,6 +122,10 @@ int inode_setattr(struct inode * inode, struct iattr * attr)
inode->i_uid = attr->ia_uid;
if (ia_valid & ATTR_GID)
inode->i_gid = attr->ia_gid;
+#ifdef CONFIG_PROJECT_ID
+ if (ia_valid & ATTR_PRJID)
+ inode->i_prjid = attr->ia_prjid;
+#endif
if (ia_valid & ATTR_ATIME)
inode->i_atime = timespec_trunc(attr->ia_atime,
inode->i_sb->s_time_gran);
@@ -149,7 +156,8 @@ int notify_change(struct dentry * dentry, struct iattr * attr)
struct timespec now;
unsigned int ia_valid = attr->ia_valid;

- if (ia_valid & (ATTR_MODE | ATTR_UID | ATTR_GID | ATTR_TIMES_SET)) {
+ if (ia_valid & (ATTR_MODE | ATTR_UID | ATTR_GID | ATTR_PRJID |
+ ATTR_TIMES_SET)) {
if (IS_IMMUTABLE(inode) || IS_APPEND(inode))
return -EPERM;
}
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 48aee87..0c9dadb 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -438,6 +438,7 @@ typedef void (dio_iodone_t)(struct kiocb *iocb, loff_t offset,
#define ATTR_KILL_PRIV (1 << 14)
#define ATTR_OPEN (1 << 15) /* Truncating from open(O_TRUNC) */
#define ATTR_TIMES_SET (1 << 16)
+#define ATTR_PRJID (1 << 17)

/*
* This is the Inode Attributes structure, used for notify_change(). It
@@ -453,6 +454,9 @@ struct iattr {
umode_t ia_mode;
uid_t ia_uid;
gid_t ia_gid;
+#ifdef CONFIG_PROJECT_ID
+ unsigned int ia_prjid;
+#endif
loff_t ia_size;
struct timespec ia_atime;
struct timespec ia_mtime;
@@ -756,6 +760,10 @@ struct inode {
#ifdef CONFIG_QUOTA
struct dquot *i_dquot[MAXQUOTAS];
#endif
+#ifdef CONFIG_PROJECT_ID
+ /* Project id, protected by i_mutex similar to i_uid/i_gid */
+ __u32 i_prjid;
+#endif
struct list_head i_devices;
union {
struct pipe_inode_info *i_pipe;
diff --git a/include/linux/xattr.h b/include/linux/xattr.h
index fb9b7e6..9d85a4b 100644
--- a/include/linux/xattr.h
+++ b/include/linux/xattr.h
@@ -33,6 +33,9 @@
#define XATTR_USER_PREFIX "user."
#define XATTR_USER_PREFIX_LEN (sizeof (XATTR_USER_PREFIX) - 1)

+#define XATTR_PRJID "system.project_id"
+#define XATTR_PRJID_LEN (sizeof (XATTR_PRJID))
+
struct inode;
struct dentry;

--
1.6.6.1


2010-03-18 14:03:02

by Dmitry Monakhov

Subject: [PATCH 2/5] quota: Implement project id support for generic quota

Since all the preparation code is already in the quota tree,
this patch is really small.
---
fs/quota/dquot.c | 18 ++++++++++++++++++
fs/quota/quotaio_v2.h | 6 ++++--
include/linux/quota.h | 12 +++++++++++-
3 files changed, 33 insertions(+), 3 deletions(-)

diff --git a/fs/quota/dquot.c b/fs/quota/dquot.c
index e0b870f..3590888 100644
--- a/fs/quota/dquot.c
+++ b/fs/quota/dquot.c
@@ -1091,6 +1091,11 @@ static int need_print_warning(struct dquot *dquot)
return current_fsuid() == dquot->dq_id;
case GRPQUOTA:
return in_group_p(dquot->dq_id);
+ case PRJQUOTA:
+ /* XXX: Currently there is no way to know which
+ project_id this task belongs to, so print a
+ warning message unconditionally. -dmon */
+ return 1;
}
return 0;
}
@@ -1327,6 +1332,13 @@ static void __dquot_initialize(struct inode *inode, int type)
case GRPQUOTA:
id = inode->i_gid;
break;
+ case PRJQUOTA:
+#ifdef CONFIG_PROJECT_ID
+ id = inode->i_prjid;
+#else
+ BUG_ON(sb_has_quota_loaded(inode->i_sb, PRJQUOTA));
+#endif
+ break;
}
got[cnt] = dqget(sb, id, cnt);
}
@@ -1792,6 +1804,12 @@ int dquot_transfer(struct inode *inode, struct iattr *iattr)
mask |= 1 << GRPQUOTA;
chid[GRPQUOTA] = iattr->ia_gid;
}
+#ifdef CONFIG_PROJECT_ID
+ if (iattr->ia_valid & ATTR_PRJID && iattr->ia_prjid != inode->i_prjid) {
+ mask |= 1 << PRJQUOTA;
+ chid[GRPQUOTA] = iattr->ia_prjid;
+ }
+#endif
if (sb_any_quota_active(inode->i_sb) && !IS_NOQUOTA(inode)) {
dquot_initialize(inode);
return __dquot_transfer(inode, chid, mask);
diff --git a/fs/quota/quotaio_v2.h b/fs/quota/quotaio_v2.h
index f1966b4..bfab9df 100644
--- a/fs/quota/quotaio_v2.h
+++ b/fs/quota/quotaio_v2.h
@@ -13,12 +13,14 @@
*/
#define V2_INITQMAGICS {\
0xd9c01f11, /* USRQUOTA */\
- 0xd9c01927 /* GRPQUOTA */\
+ 0xd9c01927, /* GRPQUOTA */\
+ 0xd9c03f14 /* PRJQUOTA */\
}

#define V2_INITQVERSIONS {\
1, /* USRQUOTA */\
- 1 /* GRPQUOTA */\
+ 1, /* GRPQUOTA */ \
+ 1 /* PRJQUOTA */\
}

/* First generic header */
diff --git a/include/linux/quota.h b/include/linux/quota.h
index b462916..73332c5 100644
--- a/include/linux/quota.h
+++ b/include/linux/quota.h
@@ -36,18 +36,28 @@
#include <linux/errno.h>
#include <linux/types.h>

-#define __DQUOT_VERSION__ "dquot_6.5.2"
+#define __DQUOT_VERSION__ "dquot_6.6.0"

+#ifdef CONFIG_PROJECT_ID
+#define MAXQUOTAS 3
+#else
#define MAXQUOTAS 2
+#endif
+
#define USRQUOTA 0 /* element used for user quotas */
#define GRPQUOTA 1 /* element used for group quotas */

+#ifdef CONFIG_PROJECT_ID
+#define PRJQUOTA 2 /* element used for project quotas */
+#endif
+
/*
* Definitions for the default names of the quotas files.
*/
#define INITQFNAMES { \
"user", /* USRQUOTA */ \
"group", /* GRPQUOTA */ \
+ "project", /* PRJQUOTA */ \
"undefined", \
};

--
1.6.6.1


2010-03-18 14:03:10

by Dmitry Monakhov

Subject: [PATCH 5/5] ext4: add isolated project support

This is not a mandatory part of the project_id feature, but it allows
the project hierarchy to be used to implement sub-filesystem isolation
semantics.

PROJECT_ISOLATION
This feature allows creating isolated project sub-trees.
Isolation means that:
1) Directory sub-trees have no common inodes (no hard links across
sub-trees).
2) All descendants belong to the same sub-tree.

Project sub-tree isolation assumptions:
1) An inode cannot belong to different sub-trees. Otherwise changes
in one sub-tree would result in changes in another sub-tree, which
contradicts the isolation criteria.
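
The checks this patch adds for link and rename can be restated as a small
user-space decision function. The sketch below mirrors the logic of
ext4_which_subtree(), ext4_prj_may_link() and ext4_prj_may_rename() from
the diff, but works on plain project ids; the m_ prefixed names are
hypothetical and the mount-option test is left out.

```c
#include <assert.h>
#include <stdint.h>

enum m_subtree {
    M_SAME = 1,   /* both ids name the same subtree */
    M_COMMON,     /* ancestor is in the default tree (id 0) */
    M_CROSS       /* ids name different subtrees */
};

static enum m_subtree m_which_subtree(uint32_t anc_prjid, uint32_t prjid)
{
    if (prjid == anc_prjid)
        return M_SAME;
    if (anc_prjid == 0)
        return M_COMMON;
    return M_CROSS;
}

/* A hard link is allowed only inside one subtree. */
static int m_may_link(uint32_t dir_prjid, uint32_t prjid)
{
    return m_which_subtree(dir_prjid, prjid) == M_SAME;
}

/*
 * A rename may not cross subtrees. A non-directory with more than one
 * link must stay in the same subtree, otherwise the inode would end up
 * belonging to two subtrees at once.
 */
static int m_may_rename(uint32_t new_dir_prjid, uint32_t prjid,
                        int is_dir, unsigned int nlink)
{
    enum m_subtree rel = m_which_subtree(new_dir_prjid, prjid);

    if (!is_dir && nlink > 1)
        return rel == M_SAME;
    return rel != M_CROSS;
}
```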

*Usage*
We already have bind mounts, which prevent link/rename across mounts.
But a user may have an isolated project which consists of several
sub-trees and want link/rename to work between those sub-trees (within
one project); bind mounts alone cannot provide that.

Since this feature is not obvious, it is controlled by a mount option.

*Approach N2*
I am currently considering another approach to implementing isolation
semantics: maintain a per-sb list of prjid objects, similar to dquot.
Each object has a corresponding isolation type. Three isolation types
are possible:
1) NO_ISOLATION: no isolation at all.
2) SEMI_ISOLATION: restrict rename/link, similar to the XFS implementation.
3) FULL_ISOLATION: do not allow any nested sub-trees, even if the user
wants to change prjid explicitly. This type of isolation may be
useful for nfs export and container-like solutions.
The isolation types for projectid sub-trees would be read from a special
config file, for example:
# mnt_point projectid type
/mnt 100 NO_ISOLATION
/mnt 101 FULL_ISOLATION
Any project id which is not listed in the config file is considered
NO_ISOLATION by default. This allows us to enable isolation
semantics by default without pain.
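
A line of the proposed config file could be parsed along these lines.
This is only a sketch of the idea: the file format is the tentative one
shown above, and struct m_prj_rule and m_parse_rule() are hypothetical
names, as is the 63-character limit on the mount point.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

enum m_iso { M_NO_ISOLATION, M_SEMI_ISOLATION, M_FULL_ISOLATION };

struct m_prj_rule {
    char mnt[64];          /* mount point, truncated to 63 characters */
    unsigned int prjid;
    enum m_iso type;
};

/*
 * Parse one "mnt_point projectid type" line.
 * Returns 0 on success, -1 for comments and malformed lines.
 */
static int m_parse_rule(const char *line, struct m_prj_rule *out)
{
    char type[32];

    if (line[0] == '#')
        return -1;
    if (sscanf(line, "%63s %u %31s", out->mnt, &out->prjid, type) != 3)
        return -1;

    if (strcmp(type, "NO_ISOLATION") == 0)
        out->type = M_NO_ISOLATION;
    else if (strcmp(type, "SEMI_ISOLATION") == 0)
        out->type = M_SEMI_ISOLATION;
    else if (strcmp(type, "FULL_ISOLATION") == 0)
        out->type = M_FULL_ISOLATION;
    else
        return -1;
    return 0;
}
```

A caller would treat any project id without a matching rule as
NO_ISOLATION, which is the default described above.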

---
fs/ext4/ext4.h | 1 +
fs/ext4/namei.c | 9 ++++-
fs/ext4/project.c | 106 +++++++++++++++++++++++++++++++++++++++++++++++++++++
fs/ext4/project.h | 15 +++++++
fs/ext4/super.c | 9 ++++-
5 files changed, 138 insertions(+), 2 deletions(-)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 3fa3602..22996eb 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -769,6 +769,7 @@ struct ext4_inode_info {
#define EXT4_MOUNT_DATA_ERR_ABORT 0x10000000 /* Abort on file data write */
#define EXT4_MOUNT_BLOCK_VALIDITY 0x20000000 /* Block validity checking */
#define EXT4_MOUNT_DISCARD 0x40000000 /* Issue DISCARD requests */
+#define EXT4_MOUNT_PRJ_ISOLATION 0x80000000 /* Isolation project support */

#define clear_opt(o, opt) o &= ~EXT4_MOUNT_##opt
#define set_opt(o, opt) o |= EXT4_MOUNT_##opt
diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
index 0c070fa..92afa62 100644
--- a/fs/ext4/namei.c
+++ b/fs/ext4/namei.c
@@ -39,6 +39,7 @@

#include "xattr.h"
#include "acl.h"
+#include "project.h"

/*
* define how far ahead to read directories while searching them.
@@ -1080,6 +1081,7 @@ static struct dentry *ext4_lookup(struct inode *dir, struct dentry *dentry, stru
return ERR_CAST(inode);
}
}
+ ext4_prj_check_parent(dir, inode);
}
return d_splice_alias(inode, dentry);
}
@@ -2329,7 +2331,8 @@ static int ext4_link(struct dentry *old_dentry,
*/
if (inode->i_nlink == 0)
return -ENOENT;
-
+ if (!ext4_prj_may_link(dir, inode))
+ return -EXDEV;
retry:
handle = ext4_journal_start(dir, EXT4_DATA_TRANS_BLOCKS(dir->i_sb) +
EXT4_INDEX_EXTRA_TRANS_BLOCKS);
@@ -2382,6 +2385,10 @@ static int ext4_rename(struct inode *old_dir, struct dentry *old_dentry,
* in separate transaction */
if (new_dentry->d_inode)
dquot_initialize(new_dentry->d_inode);
+
+ if (!ext4_prj_may_rename(new_dir, old_dentry->d_inode))
+ return -EXDEV;
+
handle = ext4_journal_start(old_dir, 2 *
EXT4_DATA_TRANS_BLOCKS(old_dir->i_sb) +
EXT4_INDEX_EXTRA_TRANS_BLOCKS + 2);
diff --git a/fs/ext4/project.c b/fs/ext4/project.c
index b571599..22085cc 100644
--- a/fs/ext4/project.c
+++ b/fs/ext4/project.c
@@ -25,6 +25,17 @@
* (1) Each inode has subtree id. This id is persistently stored inside
* inode's xattr, usually inside ibody
* (2) Subtree id is inherent from parent directory
+ *
+ * PROJECT ISOLATION
+ * This feature allows creating isolated subtrees.
+ * Isolation means that:
+ * 1) Subtrees have no common inodes (no hard links across subtrees)
+ * 2) All descendants belong to the same subtree.
+ *
+ * Project subtree isolation assumptions:
+ * 1) An inode cannot belong to different subtrees.
+ * Otherwise changes in one subtree result in changes in another subtree,
+ * which contradicts the isolation criteria.
*/

/*
@@ -151,6 +162,101 @@ int ext4_prj_read(struct inode *inode)
inode->i_prjid = 0;
return ret;
}
+
+enum {
+ EXT4_SUBTREE_SAME = 1, /* Both nodes belong to the same subtree */
+ EXT4_SUBTREE_COMMON, /* Ancestor tree includes descendant subtree */
+ EXT4_SUBTREE_CROSS, /* Nodes belong to different subtrees */
+};
+
+/**
+ * Check ancestor descendant subtree relationship.
+ * @ancino: ancestor inode
+ * @inode: descendant inode
+ */
+static inline int ext4_which_subtree(struct inode *ancino, struct inode *inode)
+{
+ if (inode->i_prjid == ancino->i_prjid)
+ return EXT4_SUBTREE_SAME;
+ else if (ancino->i_prjid == 0)
+ /*
+ * Ancestor inode belongs to default tree and it includes
+ * other subtrees by default
+ */
+ return EXT4_SUBTREE_COMMON;
+ return EXT4_SUBTREE_CROSS;
+}
+
+/**
+ * Check subtree assumptions on ext4_link()
+ * @tdir: target directory inode
+ * @inode: inode in question
+ * @return: true if link is possible, zero otherwise
+ */
+inline int ext4_prj_may_link(struct inode *tdir, struct inode *inode)
+{
+ if (!test_opt(inode->i_sb, PRJ_ISOLATION))
+ return 1;
+ /*
+ * According to subtree quota assumptions an inode cannot belong to
+ * different quota trees.
+ */
+ if(ext4_which_subtree(tdir, inode) != EXT4_SUBTREE_SAME)
+ return 0;
+ return 1;
+}
+
+/**
+ * Check for directory subtree assumptions on ext4_rename()
+ * @new_dir: new directory inode
+ * @inode: inode in question
+ * @return: true if rename is possible, zero otherwise.
+ */
+inline int ext4_prj_may_rename(struct inode *new_dir, struct inode *inode)
+{
+ int same;
+ if (!test_opt(inode->i_sb, PRJ_ISOLATION))
+ return 1;
+ // XXX: It seems that the i_nlink check is racy
+ // Is it possible to get inode->i_mutex here?
+ same = ext4_which_subtree(new_dir, inode);
+ if (S_ISDIR(inode->i_mode)) {
+ if (same == EXT4_SUBTREE_CROSS)
+ return 0;
+ } else {
+ if (inode->i_nlink > 1) {
+ /*
+ * If we allow to move any dentry of inode which has
+ * more than one link between subtrees then we end up
+ * with inode which belongs to different subtrees.
+ */
+ if (same != EXT4_SUBTREE_SAME)
+ return 0;
+ } else {
+ if (same == EXT4_SUBTREE_CROSS)
+ return 0;
+ }
+ }
+ return 1;
+}
+
+/**
+ * Check subtree parent/child relationship assumptions.
+ */
+inline void ext4_prj_check_parent(struct inode *dir, struct inode *inode)
+{
+ if (!test_opt(dir->i_sb, PRJ_ISOLATION))
+ return;
+
+ if (ext4_which_subtree(dir, inode) == EXT4_SUBTREE_CROSS) {
+ ext4_warning(inode->i_sb,
+ "Bad subtree hierarchy: directory{ino:%lu, project:%u} "
+ "inode{ino:%lu, project:%u}\n",
+ dir->i_ino, dir->i_prjid,
+ inode->i_ino, inode->i_prjid);
+ }
+}
+
/*
* Initialize the projectid xattr of a new inode. Called from ext4_new_inode.
*
diff --git a/fs/ext4/project.h b/fs/ext4/project.h
index a8b56a0..00deddc 100644
--- a/fs/ext4/project.h
+++ b/fs/ext4/project.h
@@ -8,6 +8,9 @@ extern int ext4_prj_xattr_write(handle_t *handle, struct inode *inode,
extern int ext4_prj_init(handle_t *handle, struct inode *inode);
extern int ext4_prj_read(struct inode *inode);
extern int ext4_prj_change(struct inode *inode, unsigned int new_prjid);
+extern int ext4_prj_may_link(struct inode *dir, struct inode *inode);
+extern int ext4_prj_may_rename(struct inode *dir, struct inode *inode);
+extern void ext4_prj_check_parent(struct inode *dir, struct inode *inode);
#else
static inline int ext4_prj_xattr_read(struct inode *inode, unsigned int *prjid)
{
@@ -26,4 +29,16 @@ static int ext4_prj_change(struct inode *inode, unsigned int new_prjid)
{
return -ENOTSUP;
}
+static int ext4_prj_may_link(struct inode *dir, struct inode *inode)
+{
+ return 1;
+}
+static int ext4_prj_may_rename(struct inode *dir, struct inode *inode)
+{
+ return 1;
+}
+static void ext4_prj_check_parent(struct inode *dir, struct inode *inode)
+{
+ return;
+}
#endif
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 43a525e..8321844 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -971,6 +971,9 @@ static int ext4_show_options(struct seq_file *seq, struct vfsmount *vfs)
if (test_opt(sb, PROJECT_ID))
seq_puts(seq, ",project_id");

+ if (test_opt(sb, PRJ_ISOLATION))
+ seq_puts(seq, ",project_isolation");
+
if (test_opt(sb, NOLOAD))
seq_puts(seq, ",norecovery");

@@ -1152,7 +1155,7 @@ enum {
Opt_block_validity, Opt_noblock_validity,
Opt_inode_readahead_blks, Opt_journal_ioprio,
Opt_dioread_nolock, Opt_dioread_lock,
- Opt_discard, Opt_nodiscard, Opt_project_id,
+ Opt_discard, Opt_nodiscard, Opt_project_id, Opt_prj_isolation
};

static const match_table_t tokens = {
@@ -1227,6 +1230,7 @@ static const match_table_t tokens = {
{Opt_discard, "discard"},
{Opt_nodiscard, "nodiscard"},
{Opt_project_id, "project_id"},
+ {Opt_prj_isolation, "project_isolation"},
{Opt_err, NULL},
};

@@ -1729,6 +1733,9 @@ set_qf_format:
case Opt_project_id:
set_opt(sbi->s_mount_opt, PROJECT_ID);
break;
+ case Opt_prj_isolation:
+ set_opt(sbi->s_mount_opt, PRJ_ISOLATION);
+ break;
default:
ext4_msg(sb, KERN_ERR,
"Unrecognized mount option \"%s\" "
--
1.6.6.1


2010-03-18 14:03:07

by Dmitry Monakhov

Subject: [PATCH 4/5] ext4: add project quota support

Both regular and journaled quota are supported.
---
fs/ext4/ext4.h | 1 +
fs/ext4/super.c | 64 +++++++++++++++++++++++++++++++++++++++++++++++------
fs/quota/dquot.c | 2 +-
3 files changed, 58 insertions(+), 9 deletions(-)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 5df54ae..3fa3602 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -745,6 +745,7 @@ struct ext4_inode_info {
#define EXT4_MOUNT_ERRORS_PANIC 0x00040 /* Panic on errors */
#define EXT4_MOUNT_MINIX_DF 0x00080 /* Mimics the Minix statfs */
#define EXT4_MOUNT_NOLOAD 0x00100 /* Don't use existing journal*/
+#define EXT4_MOUNT_PRJQUOTA 0x00200 /* Project quota support */
#define EXT4_MOUNT_DATA_FLAGS 0x00C00 /* Mode for data writes: */
#define EXT4_MOUNT_JOURNAL_DATA 0x00400 /* Write data to journal */
#define EXT4_MOUNT_ORDERED_DATA 0x00800 /* Flush data before commit */
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index ddb4588..43a525e 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -834,11 +834,18 @@ static inline void ext4_show_quota_options(struct seq_file *seq,
if (sbi->s_qf_names[GRPQUOTA])
seq_printf(seq, ",grpjquota=%s", sbi->s_qf_names[GRPQUOTA]);

+ if (sbi->s_qf_names[PRJQUOTA])
+ seq_printf(seq, ",prjjquota=%s", sbi->s_qf_names[PRJQUOTA]);
+
if (test_opt(sb, USRQUOTA))
seq_puts(seq, ",usrquota");

if (test_opt(sb, GRPQUOTA))
seq_puts(seq, ",grpquota");
+
+ if (test_opt(sb, PRJQUOTA))
+ seq_puts(seq, ",prjquota");
+
#endif
}

@@ -1039,8 +1046,8 @@ static int bdev_try_to_free_page(struct super_block *sb, struct page *page,
}

#ifdef CONFIG_QUOTA
-#define QTYPE2NAME(t) ((t) == USRQUOTA ? "user" : "group")
-#define QTYPE2MOPT(on, t) ((t) == USRQUOTA?((on)##USRJQUOTA):((on)##GRPJQUOTA))
+static char *quotatypes[] = INITQFNAMES;
+#define QTYPE2NAME(t) (quotatypes[t])

static int ext4_write_dquot(struct dquot *dquot);
static int ext4_acquire_dquot(struct dquot *dquot);
@@ -1136,10 +1143,11 @@ enum {
Opt_journal_checksum, Opt_journal_async_commit,
Opt_abort, Opt_data_journal, Opt_data_ordered, Opt_data_writeback,
Opt_data_err_abort, Opt_data_err_ignore,
- Opt_usrjquota, Opt_grpjquota, Opt_offusrjquota, Opt_offgrpjquota,
+ Opt_usrjquota, Opt_grpjquota, Opt_prjjquota, Opt_offusrjquota,
+ Opt_offgrpjquota, Opt_offprjjquota,
Opt_jqfmt_vfsold, Opt_jqfmt_vfsv0, Opt_jqfmt_vfsv1, Opt_quota,
Opt_noquota, Opt_ignore, Opt_barrier, Opt_nobarrier, Opt_err,
- Opt_resize, Opt_usrquota, Opt_grpquota, Opt_i_version,
+ Opt_resize, Opt_usrquota, Opt_grpquota, Opt_prjquota, Opt_i_version,
Opt_stripe, Opt_delalloc, Opt_nodelalloc,
Opt_block_validity, Opt_noblock_validity,
Opt_inode_readahead_blks, Opt_journal_ioprio,
@@ -1189,10 +1197,13 @@ static const match_table_t tokens = {
{Opt_usrjquota, "usrjquota=%s"},
{Opt_offgrpjquota, "grpjquota="},
{Opt_grpjquota, "grpjquota=%s"},
+ {Opt_offprjjquota, "prjjquota="},
+ {Opt_prjjquota, "prjjquota=%s"},
{Opt_jqfmt_vfsold, "jqfmt=vfsold"},
{Opt_jqfmt_vfsv0, "jqfmt=vfsv0"},
{Opt_jqfmt_vfsv1, "jqfmt=vfsv1"},
{Opt_grpquota, "grpquota"},
+ {Opt_prjquota, "prjquota"},
{Opt_noquota, "noquota"},
{Opt_quota, "quota"},
{Opt_usrquota, "usrquota"},
@@ -1512,6 +1523,16 @@ static int parse_options(char *options, struct super_block *sb,
if (!set_qf_name(sb, GRPQUOTA, &args[0]))
return 0;
break;
+
+ case Opt_prjjquota:
+#ifdef CONFIG_PROJECT_ID
+ if (!set_qf_name(sb, PRJQUOTA, &args[0]))
+ return 0;
+#else
+ ext4_msg(sb, KERN_ERR,
+ "project quota options not supported");
+#endif
+ break;
case Opt_offusrjquota:
if (!clear_qf_name(sb, USRQUOTA))
return 0;
@@ -1520,7 +1541,15 @@ static int parse_options(char *options, struct super_block *sb,
if (!clear_qf_name(sb, GRPQUOTA))
return 0;
break;
-
+ case Opt_offprjjquota:
+#ifdef CONFIG_PROJECT_ID
+ if (!clear_qf_name(sb, PRJQUOTA))
+ return 0;
+#else
+ ext4_msg(sb, KERN_ERR,
+ "project quota options not supported");
+#endif
+ break;
case Opt_jqfmt_vfsold:
qfmt = QFMT_VFS_OLD;
goto set_qf_format;
@@ -1548,6 +1577,15 @@ set_qf_format:
set_opt(sbi->s_mount_opt, QUOTA);
set_opt(sbi->s_mount_opt, GRPQUOTA);
break;
+ case Opt_prjquota:
+#ifdef CONFIG_PROJECT_ID
+ set_opt(sbi->s_mount_opt, QUOTA);
+ set_opt(sbi->s_mount_opt, PRJQUOTA);
+#else
+ ext4_msg(sb, KERN_ERR,
+ "project quota options not supported");
+#endif
+ break;
case Opt_noquota:
if (sb_any_quota_loaded(sb)) {
ext4_msg(sb, KERN_ERR, "Cannot change quota "
@@ -1557,18 +1595,22 @@ set_qf_format:
clear_opt(sbi->s_mount_opt, QUOTA);
clear_opt(sbi->s_mount_opt, USRQUOTA);
clear_opt(sbi->s_mount_opt, GRPQUOTA);
+ clear_opt(sbi->s_mount_opt, PRJQUOTA);
break;
#else
case Opt_quota:
case Opt_usrquota:
case Opt_grpquota:
+ case Opt_prjquota:
ext4_msg(sb, KERN_ERR,
"quota options not supported");
break;
case Opt_usrjquota:
case Opt_grpjquota:
+ case Opt_prjjquota:
case Opt_offusrjquota:
case Opt_offgrpjquota:
+ case Opt_offprjjquota:
case Opt_jqfmt_vfsold:
case Opt_jqfmt_vfsv0:
case Opt_jqfmt_vfsv1:
@@ -1695,14 +1737,19 @@ set_qf_format:
}
}
#ifdef CONFIG_QUOTA
- if (sbi->s_qf_names[USRQUOTA] || sbi->s_qf_names[GRPQUOTA]) {
+ if (sbi->s_qf_names[USRQUOTA] || sbi->s_qf_names[GRPQUOTA] ||
+ sbi->s_qf_names[PRJQUOTA]) {
if (test_opt(sb, USRQUOTA) && sbi->s_qf_names[USRQUOTA])
clear_opt(sbi->s_mount_opt, USRQUOTA);

if (test_opt(sb, GRPQUOTA) && sbi->s_qf_names[GRPQUOTA])
clear_opt(sbi->s_mount_opt, GRPQUOTA);

- if (test_opt(sb, GRPQUOTA) || test_opt(sb, USRQUOTA)) {
+ if (test_opt(sb, PRJQUOTA) && sbi->s_qf_names[PRJQUOTA])
+ clear_opt(sbi->s_mount_opt, PRJQUOTA);
+
+ if (test_opt(sb, GRPQUOTA) || test_opt(sb, USRQUOTA) ||
+ test_opt(sb, PRJQUOTA)) {
ext4_msg(sb, KERN_ERR, "old and new quota "
"format mixing");
return 0;
@@ -3868,7 +3915,8 @@ static int ext4_mark_dquot_dirty(struct dquot *dquot)
{
/* Are we journaling quotas? */
if (EXT4_SB(dquot->dq_sb)->s_qf_names[USRQUOTA] ||
- EXT4_SB(dquot->dq_sb)->s_qf_names[GRPQUOTA]) {
+ EXT4_SB(dquot->dq_sb)->s_qf_names[GRPQUOTA] ||
+ EXT4_SB(dquot->dq_sb)->s_qf_names[PRJQUOTA]) {
dquot_mark_dquot_dirty(dquot);
return ext4_write_dquot(dquot);
} else {
diff --git a/fs/quota/dquot.c b/fs/quota/dquot.c
index 3590888..73ef888 100644
--- a/fs/quota/dquot.c
+++ b/fs/quota/dquot.c
@@ -1807,7 +1807,7 @@ int dquot_transfer(struct inode *inode, struct iattr *iattr)
#ifdef CONFIG_PROJECT_ID
if (iattr->ia_valid & ATTR_PRJID && iattr->ia_prjid != inode->i_prjid) {
mask |= 1 << PRJQUOTA;
- chid[GRPQUOTA] = iattr->ia_prjid;
+ chid[PRJQUOTA] = iattr->ia_prjid;
}
#endif
if (sb_any_quota_active(inode->i_sb) && !IS_NOQUOTA(inode)) {
--
1.6.6.1


2010-03-18 14:03:05

by Dmitry Monakhov

Subject: [PATCH 3/5] ext4: Implement project ID support for ext4 filesystem

* Abstract
A subtree of a directory tree T is a tree consisting of a directory
(the subtree root) in T and all of its descendants in T.

*NOTE*: The user is allowed to break the pure subtree hierarchy via
manual id manipulation.

Project subtree assumptions:
(1) Each inode has an id. This id is persistently stored inside the
inode (in an xattr, usually inside ibody).
(2) The project id is inherited from the parent directory.

This feature is similar to project-id in XFS. One may assign an id to
a subtree. Each entry from the subtree may be accounted in the directory
project quota; quota support will appear in later patches.

* Disk layout
The project id is stored on disk inside an xattr, usually inside ibody.
This xattr is used only as data storage; the on-disk entry itself has
no user-visible xattr interface.
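
The stored value is a 32-bit little-endian integer (the patch uses
__le32 with cpu_to_le32()/le32_to_cpu()). A portable user-space sketch
of that encoding, with the hypothetical helper names m_prjid_to_disk()
and m_prjid_from_disk(), could look like this:

```c
#include <assert.h>
#include <stdint.h>

/* Encode a project id as 4 little-endian bytes, regardless of host order. */
static void m_prjid_to_disk(uint32_t prjid, unsigned char out[4])
{
    out[0] = (unsigned char)(prjid & 0xff);
    out[1] = (unsigned char)((prjid >> 8) & 0xff);
    out[2] = (unsigned char)((prjid >> 16) & 0xff);
    out[3] = (unsigned char)((prjid >> 24) & 0xff);
}

/* Decode 4 little-endian bytes back into a host-order project id. */
static uint32_t m_prjid_from_disk(const unsigned char in[4])
{
    return (uint32_t)in[0] |
           ((uint32_t)in[1] << 8) |
           ((uint32_t)in[2] << 16) |
           ((uint32_t)in[3] << 24);
}
```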

* User interface
The project id is accessible via the generic xattr interface, under the
name "system.project_id".

* Notes
ext4_setattr interface to prjid: semantically, prjid must be changed
in the same way as uid/gid, but project_id is stored inside an xattr,
so updating the on-disk structures is not that trivial. I have therefore
moved the prjid change logic to a separate function.

TODO: implement e2libfs support for project_id.
---
fs/ext4/Kconfig | 8 ++
fs/ext4/Makefile | 1 +
fs/ext4/ext4.h | 1 +
fs/ext4/ialloc.c | 12 +++-
fs/ext4/inode.c | 21 +++++-
fs/ext4/project.c | 211 +++++++++++++++++++++++++++++++++++++++++++++++++++++
fs/ext4/project.h | 29 +++++++
fs/ext4/super.c | 9 ++-
fs/ext4/xattr.c | 7 ++
fs/ext4/xattr.h | 2 +
10 files changed, 297 insertions(+), 4 deletions(-)
create mode 100644 fs/ext4/project.c
create mode 100644 fs/ext4/project.h

diff --git a/fs/ext4/Kconfig b/fs/ext4/Kconfig
index 9ed1bb1..1c04c9f 100644
--- a/fs/ext4/Kconfig
+++ b/fs/ext4/Kconfig
@@ -74,6 +74,14 @@ config EXT4_FS_SECURITY

If you are not using a security module that requires using
extended attributes for file security labels, say N.
+config EXT4_PROJECT_ID
+ bool "Ext4 project_id support"
+ depends on PROJECT_ID
+ depends on EXT4_FS_XATTR
+ help
+ Enables project inode identifier support for ext4 filesystem.
+ This feature allows assigning an id to inodes, similar to
+ uid/gid.

config EXT4_DEBUG
bool "EXT4 debugging support"
diff --git a/fs/ext4/Makefile b/fs/ext4/Makefile
index 8867b2a..be923b1 100644
--- a/fs/ext4/Makefile
+++ b/fs/ext4/Makefile
@@ -11,3 +11,4 @@ ext4-y := balloc.o bitmap.o dir.o file.o fsync.o ialloc.o inode.o \
ext4-$(CONFIG_EXT4_FS_XATTR) += xattr.o xattr_user.o xattr_trusted.o
ext4-$(CONFIG_EXT4_FS_POSIX_ACL) += acl.o
ext4-$(CONFIG_EXT4_FS_SECURITY) += xattr_security.o
+ext4-$(CONFIG_EXT4_PROJECT_ID) += project.o
diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index bf938cf..5df54ae 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -763,6 +763,7 @@ struct ext4_inode_info {
#define EXT4_MOUNT_JOURNAL_CHECKSUM 0x800000 /* Journal checksums */
#define EXT4_MOUNT_JOURNAL_ASYNC_COMMIT 0x1000000 /* Journal Async Commit */
#define EXT4_MOUNT_I_VERSION 0x2000000 /* i_version support */
+#define EXT4_MOUNT_PROJECT_ID 0x4000000 /* project owner id support */
#define EXT4_MOUNT_DELALLOC 0x8000000 /* Delalloc support */
#define EXT4_MOUNT_DATA_ERR_ABORT 0x10000000 /* Abort on file data write */
#define EXT4_MOUNT_BLOCK_VALIDITY 0x20000000 /* Block validity checking */
diff --git a/fs/ext4/ialloc.c b/fs/ext4/ialloc.c
index 361c0b9..e1fa91a 100644
--- a/fs/ext4/ialloc.c
+++ b/fs/ext4/ialloc.c
@@ -28,7 +28,7 @@
#include "ext4_jbd2.h"
#include "xattr.h"
#include "acl.h"
-
+#include "project.h"
#include <trace/events/ext4.h>

/*
@@ -1028,6 +1028,13 @@ got:

ei->i_extra_isize = EXT4_SB(sb)->s_want_extra_isize;

+#ifdef CONFIG_EXT4_PROJECT_ID
+ /*
+ * XXX: move this to generic inode init helper
+ * depends on generic_inode_init patch.
+ */
+ inode->i_prjid = dir->i_prjid;
+#endif
ret = inode;
dquot_initialize(inode);
err = dquot_alloc_inode(inode);
@@ -1041,6 +1048,9 @@ got:
err = ext4_init_security(handle, inode, dir);
if (err)
goto fail_free_drop;
+ err = ext4_prj_init(handle, inode);
+ if (err)
+ goto fail_free_drop;

if (EXT4_HAS_INCOMPAT_FEATURE(sb, EXT4_FEATURE_INCOMPAT_EXTENTS)) {
/* set extent flag only for directory, file and normal symlink*/
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 986120f..56ff209 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -44,7 +44,7 @@
#include "xattr.h"
#include "acl.h"
#include "ext4_extents.h"
-
+#include "project.h"
#include <trace/events/ext4.h>

#define MPAGE_DA_EXTENT_TAIL 0x01
@@ -5113,6 +5113,9 @@ struct inode *ext4_iget(struct super_block *sb, unsigned long ino)
}
if (ret)
goto bad_inode;
+ ret = ext4_prj_read(inode);
+ if (ret)
+ goto bad_inode;

if (S_ISREG(inode->i_mode)) {
inode->i_op = &ext4_file_inode_operations;
@@ -5413,10 +5416,12 @@ int ext4_write_inode(struct inode *inode, struct writeback_control *wbc)
*
* Called with inode->i_mutex down.
*/
-int ext4_setattr(struct dentry *dentry, struct iattr *attr)
+int ext4_setattr(struct dentry *dentry, struct iattr *iattr)
{
struct inode *inode = dentry->d_inode;
int error, rc = 0;
+ struct iattr local_attr = *iattr;
+ struct iattr *attr = &local_attr;
const unsigned int ia_valid = attr->ia_valid;

error = inode_change_ok(inode, attr);
@@ -5425,6 +5430,18 @@ int ext4_setattr(struct dentry *dentry, struct iattr *attr)

if (ia_valid & ATTR_SIZE)
dquot_initialize(inode);
+#ifdef CONFIG_EXT4_PROJECT_ID
+ if (ia_valid & ATTR_PRJID && attr->ia_prjid != inode->i_prjid) {
+ error = ext4_prj_change(dentry->d_inode, attr->ia_prjid);
+ if (error)
+ return error;
+ /*
+ Clear the valid bit to prevent a quota_transfer attempt in the next
+ stage. It is safe since 'attr' is a local variable.
+ */
+ attr->ia_valid &= ~ATTR_PRJID;
+ }
+#endif
if ((ia_valid & ATTR_UID && attr->ia_uid != inode->i_uid) ||
(ia_valid & ATTR_GID && attr->ia_gid != inode->i_gid)) {
handle_t *handle;
diff --git a/fs/ext4/project.c b/fs/ext4/project.c
new file mode 100644
index 0000000..b571599
--- /dev/null
+++ b/fs/ext4/project.c
@@ -0,0 +1,211 @@
+/*
+ * linux/fs/ext4/projectid.c
+ *
+ * Copyright (C) 2010 Parallels Inc
+ * Dmitry Monakhov <[email protected]>
+ */
+
+#include <linux/init.h>
+#include <linux/sched.h>
+#include <linux/slab.h>
+#include <linux/capability.h>
+#include <linux/fs.h>
+#include <linux/quotaops.h>
+#include "ext4_jbd2.h"
+#include "ext4.h"
+#include "xattr.h"
+#include "project.h"
+
+/*
+ * PROJECT SUBTREE
+ * A subtree of a directory tree T is a tree consisting of a directory
+ * (the subtree root) in T and all of its descendants in T.
+ *
+ * Project Subtree's assumptions:
+ * (1) Each inode has subtree id. This id is persistently stored inside
+ * inode's xattr, usually inside ibody
+ * (2) Subtree id is inherent from parent directory
+ */
+
+/*
+ * Read project_id id from inode's xattr
+ * Locking: none
+ */
+int ext4_prj_xattr_read(struct inode *inode, unsigned int *prjid)
+{
+ __le32 dsk_prjid;
+ int retval;
+ retval = ext4_xattr_get(inode, EXT4_XATTR_INDEX_PROJECT_ID, "",
+ &dsk_prjid, sizeof (dsk_prjid));
+ if (retval > 0) {
+ if (retval != sizeof(dsk_prjid))
+ return -EIO;
+ else
+ retval = 0;
+ }
+ *prjid = le32_to_cpu(dsk_prjid);
+ return retval;
+
+}
+
+/*
+ * Save project_id id to inode's xattr
+ * Locking: none
+ */
+int ext4_prj_xattr_write(handle_t *handle, struct inode *inode,
+ unsigned int prjid, int xflags)
+{
+ __le32 dsk_prjid = cpu_to_le32(prjid);
+ int retval;
+ retval = ext4_xattr_set_handle(handle,
+ inode, EXT4_XATTR_INDEX_PROJECT_ID, "",
+ &dsk_prjid, sizeof (dsk_prjid), xflags);
+ if (retval > 0) {
+ if (retval != sizeof(dsk_prjid))
+ retval = -EIO;
+ else
+ retval = 0;
+ }
+ return retval;
+}
+
+/*
+ * Change project_id id.
+ * Called under inode->i_mutex
+ */
+int ext4_prj_change(struct inode *inode, unsigned int new_prjid)
+{
+ /*
+ * One data_trans_blocks chunk for the xattr update.
+ * One quota_trans_blocks chunk for the quota transfer, and one
+ * quota_trans_blocks chunk for an emergency quota rollback transfer,
+ * because the rollback may itself allocate new quota blocks.
+ */
+ unsigned credits = EXT4_DATA_TRANS_BLOCKS(inode->i_sb) +
+ EXT4_QUOTA_TRANS_BLOCKS(inode->i_sb) * 2;
+ struct iattr attr;
+ int ret, ret2 = 0;
+ unsigned retries = 0;
+ handle_t *handle;
+
+ dquot_initialize(inode);
+retry:
+ handle = ext4_journal_start(inode, credits);
+ if (IS_ERR(handle)) {
+ ret = PTR_ERR(handle);
+ ext4_std_error(inode->i_sb, ret);
+ goto out;
+ }
+ /* Inode may not have project_id xattr yet. Create it explicitly */
+ ret = ext4_prj_xattr_write(handle, inode, inode->i_prjid,
+ XATTR_CREATE);
+ if (ret == -EEXIST)
+ ret = 0;
+ if (ret) {
+ ret2 = ext4_journal_stop(handle);
+ if (ret == -ENOSPC &&
+ ext4_should_retry_alloc(inode->i_sb, &retries))
+ goto retry;
+ goto out; /* the handle is stopped; do not reuse it below */
+ }
+#ifdef CONFIG_QUOTA
+ attr.ia_prjid = new_prjid;
+ attr.ia_valid = ATTR_PRJID;
+ ret = dquot_transfer(inode, &attr);
+ if (ret)
+ goto out_stop;
+#endif
+ ret = ext4_prj_xattr_write(handle, inode, new_prjid, XATTR_REPLACE);
+ if (ret) {
+ /*
+ * This can fail only on a fatal error. Nevertheless, we have
+ * to try to roll back the quota changes and keep the old id.
+ */
+#ifdef CONFIG_QUOTA
+ attr.ia_prjid = inode->i_prjid;
+ attr.ia_valid = ATTR_PRJID;
+ dquot_transfer(inode, &attr);
+#endif
+ ext4_std_error(inode->i_sb, ret);
+ goto out_stop;
+ }
+ inode->i_prjid = new_prjid;
+out_stop:
+ ret2 = ext4_journal_stop(handle);
+out:
+ if (ret2)
+ ret = ret2;
+ return ret;
+}
+
+int ext4_prj_read(struct inode *inode)
+{
+ int ret = 0;
+ if (test_opt(inode->i_sb, PROJECT_ID)) {
+ ret = ext4_prj_xattr_read(inode, &inode->i_prjid);
+ if (ret == -ENODATA) {
+ inode->i_prjid = 0;
+ ret = 0;
+ }
+ } else
+ inode->i_prjid = 0;
+ return ret;
+}
+/*
+ * Initialize the projectid xattr of a new inode. Called from ext4_new_inode.
+ *
+ * dir->i_mutex: down
+ * inode->i_mutex: up (access to inode is still exclusive)
+ * Note: caller must assign correct project id to inode before.
+ */
+int ext4_prj_init(handle_t *handle, struct inode *inode)
+{
+ return ext4_prj_xattr_write(handle, inode, inode->i_prjid,
+ XATTR_CREATE);
+}
+
+static size_t
+ext4_xattr_prj_list(struct dentry *dentry, char *list, size_t list_size,
+ const char *name, size_t name_len, int type)
+{
+ if (list && XATTR_PRJID_LEN <= list_size)
+ memcpy(list, XATTR_PRJID, XATTR_PRJID_LEN);
+ return XATTR_PRJID_LEN;
+
+}
+
+static int
+ext4_xattr_prj_get(struct dentry *dentry, const char *name,
+ void *buffer, size_t size, int type)
+{
+ int ret;
+ unsigned prjid;
+ char buf[32];
+ if (strcmp(name, "") != 0)
+ return -EINVAL;
+ ret = ext4_prj_xattr_read(dentry->d_inode, &prjid);
+ if (ret)
+ return ret;
+ ret = snprintf(buf, sizeof(buf), "%u", prjid);
+ if (buffer && size >= ret)
+ memcpy(buffer, buf, ret);
+ return (buffer && size < ret) ? -ERANGE : ret;
+}
+
+static int
+ext4_xattr_prj_set(struct dentry *dentry, const char *name,
+ const void *value, size_t size, int flags, int type)
+{
+ unsigned int new_prjid;
+ if (strcmp(name, "") != 0 || !value)
+ return -EINVAL;
+ new_prjid = simple_strtoul(value, NULL, 0);
+ return ext4_prj_change(dentry->d_inode, new_prjid);
+}
+
+struct xattr_handler ext4_xattr_prj_handler = {
+ .prefix = XATTR_PRJID,
+ .list = ext4_xattr_prj_list,
+ .get = ext4_xattr_prj_get,
+ .set = ext4_xattr_prj_set,
+};
diff --git a/fs/ext4/project.h b/fs/ext4/project.h
new file mode 100644
index 0000000..a8b56a0
--- /dev/null
+++ b/fs/ext4/project.h
@@ -0,0 +1,29 @@
+#include <linux/xattr.h>
+#include <linux/fs.h>
+
+#ifdef CONFIG_EXT4_PROJECT_ID
+extern int ext4_prj_xattr_read(struct inode *inode, unsigned int *prjid);
+extern int ext4_prj_xattr_write(handle_t *handle, struct inode *inode,
+ unsigned int prjid, int xflags);
+extern int ext4_prj_init(handle_t *handle, struct inode *inode);
+extern int ext4_prj_read(struct inode *inode);
+extern int ext4_prj_change(struct inode *inode, unsigned int new_prjid);
+#else
+static inline int ext4_prj_xattr_read(struct inode *inode, unsigned int *prjid)
+{
+ return -ENOTSUPP;
+}
+static inline int ext4_prj_xattr_write(handle_t *handle, struct inode *inode,
+ unsigned int prjid, int xflags)
+{
+ return -ENOTSUPP;
+}
+static inline int ext4_prj_read(struct inode *inode)
+{
+ return 0;
+}
+static inline int ext4_prj_change(struct inode *inode, unsigned int new_prjid)
+{
+ return -ENOTSUPP;
+}
+#endif
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index ba191da..ddb4588 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -961,6 +961,9 @@ static int ext4_show_options(struct seq_file *seq, struct vfsmount *vfs)
if (test_opt(sb, DISCARD))
seq_puts(seq, ",discard");

+ if (test_opt(sb, PROJECT_ID))
+ seq_puts(seq, ",project_id");
+
if (test_opt(sb, NOLOAD))
seq_puts(seq, ",norecovery");

@@ -1141,7 +1144,7 @@ enum {
Opt_block_validity, Opt_noblock_validity,
Opt_inode_readahead_blks, Opt_journal_ioprio,
Opt_dioread_nolock, Opt_dioread_lock,
- Opt_discard, Opt_nodiscard,
+ Opt_discard, Opt_nodiscard, Opt_project_id,
};

static const match_table_t tokens = {
@@ -1212,6 +1215,7 @@ static const match_table_t tokens = {
{Opt_dioread_lock, "dioread_lock"},
{Opt_discard, "discard"},
{Opt_nodiscard, "nodiscard"},
+ {Opt_project_id, "project_id"},
{Opt_err, NULL},
};

@@ -1680,6 +1684,9 @@ set_qf_format:
case Opt_dioread_lock:
clear_opt(sbi->s_mount_opt, DIOREAD_NOLOCK);
break;
+ case Opt_project_id:
+ set_opt(sbi->s_mount_opt, PROJECT_ID);
+ break;
default:
ext4_msg(sb, KERN_ERR,
"Unrecognized mount option \"%s\" "
diff --git a/fs/ext4/xattr.c b/fs/ext4/xattr.c
index b4c5aa8..ffec4b7 100644
--- a/fs/ext4/xattr.c
+++ b/fs/ext4/xattr.c
@@ -107,6 +107,10 @@ static struct xattr_handler *ext4_xattr_handler_map[] = {
#ifdef CONFIG_EXT4_FS_SECURITY
[EXT4_XATTR_INDEX_SECURITY] = &ext4_xattr_security_handler,
#endif
+#ifdef CONFIG_EXT4_PROJECT_ID
+ [EXT4_XATTR_INDEX_PROJECT_ID] = &ext4_xattr_prj_handler,
+#endif
+
};

struct xattr_handler *ext4_xattr_handlers[] = {
@@ -119,6 +123,9 @@ struct xattr_handler *ext4_xattr_handlers[] = {
#ifdef CONFIG_EXT4_FS_SECURITY
&ext4_xattr_security_handler,
#endif
+#ifdef CONFIG_EXT4_PROJECT_ID
+ &ext4_xattr_prj_handler,
+#endif
NULL
};

diff --git a/fs/ext4/xattr.h b/fs/ext4/xattr.h
index 8ede88b..777d60f 100644
--- a/fs/ext4/xattr.h
+++ b/fs/ext4/xattr.h
@@ -21,6 +21,7 @@
#define EXT4_XATTR_INDEX_TRUSTED 4
#define EXT4_XATTR_INDEX_LUSTRE 5
#define EXT4_XATTR_INDEX_SECURITY 6
+#define EXT4_XATTR_INDEX_PROJECT_ID 7

struct ext4_xattr_header {
__le32 h_magic; /* magic number for identification */
@@ -70,6 +71,7 @@ extern struct xattr_handler ext4_xattr_trusted_handler;
extern struct xattr_handler ext4_xattr_acl_access_handler;
extern struct xattr_handler ext4_xattr_acl_default_handler;
extern struct xattr_handler ext4_xattr_security_handler;
+extern struct xattr_handler ext4_xattr_prj_handler;

extern ssize_t ext4_listxattr(struct dentry *, char *, size_t);

--
1.6.6.1


2010-03-18 21:25:00

by Andreas Dilger

Subject: Re: [PATCH 3/5] ext4: Implement project ID support for ext4 filesystem

On 2010-03-18, at 08:02, Dmitry Monakhov wrote:
> * Disk layout
> Project id is stored on disk inside xattr usually inside ibody.
> Xattr is used only as a data storage, It has not user visible xattr
> interface.
>
> * User interface
> Project id is accessible via generic xattr interface
> "system.project_id"
>
> +#define EXT4_XATTR_INDEX_PROJECT_ID 7

If you are making this attribute available to userspace via
system.project_id, doesn't it make sense to store it on disk as
system.project_id also (i.e. in the "system" namespace)?

Alternatively, the "trusted" namespace is already intended to provide
"accessible only by root/kernel" semantics, if that is what you are
trying to achieve.

I'm also confused by the statement "It has not user visible xattr interface"
yet you also write "Project id is accessible via generic xattr
interface". Why bother having this complexity when you are free to
make the two identical?
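
For context, the patch as posted keeps two encodings of the same value: the
on-disk xattr body is a 4-byte little-endian integer (cpu_to_le32()), while
the "system.project_id" handler exchanges decimal text with userspace. A
minimal user-space sketch of both, with hypothetical helper names that do not
appear in the patch:

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* On-disk form: 4 bytes, least significant byte first. */
static inline void prjid_to_disk(uint32_t prjid, unsigned char out[4])
{
	out[0] = prjid & 0xff;
	out[1] = (prjid >> 8) & 0xff;
	out[2] = (prjid >> 16) & 0xff;
	out[3] = (prjid >> 24) & 0xff;
}

static inline uint32_t prjid_from_disk(const unsigned char in[4])
{
	return (uint32_t)in[0] | ((uint32_t)in[1] << 8) |
	       ((uint32_t)in[2] << 16) | ((uint32_t)in[3] << 24);
}

/* User-visible form: text. Returns the length, or -1 if buf is too small. */
static inline int prjid_to_text(uint32_t prjid, char *buf, size_t size)
{
	int len = snprintf(buf, size, "%u", (unsigned int)prjid);

	return (len < 0 || (size_t)len >= size) ? -1 : len;
}

/* xattr values carry a length and no trailing NUL, so copy before parsing. */
static inline int prjid_from_text(const char *text, size_t len, uint32_t *prjid)
{
	char tmp[16];
	char *end;
	unsigned long v;

	if (len == 0 || len >= sizeof(tmp))
		return -1;
	memcpy(tmp, text, len);
	tmp[len] = '\0';
	if (tmp[0] < '0' || tmp[0] > '9')	/* reject signs and whitespace */
		return -1;
	errno = 0;
	v = strtoul(tmp, &end, 0);
	if (errno || *end != '\0' || v > UINT32_MAX)
		return -1;
	*prjid = (uint32_t)v;
	return 0;
}
```

Storing the value under the same name it is exposed as would not remove the
binary/text conversion, but it would remove the separate name index.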

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.


2010-03-19 08:16:14

by Dmitry Monakhov

[permalink] [raw]
Subject: Re: [PATCH 3/5] ext4: Implement project ID support for ext4 filesystem

Andreas Dilger writes:

> On 2010-03-18, at 08:02, Dmitry Monakhov wrote:
>> * Disk layout
>> Project id is stored on disk inside xattr usually inside ibody.
>> Xattr is used only as a data storage, It has not user visible xattr
>> interface.
>>
>> * User interface
>> Project id is accessible via generic xattr interface
>> "system.project_id"
>>
>> +#define EXT4_XATTR_INDEX_PROJECT_ID 7
>
> If you are making this attribute available to userspace via
> system.project_id, doesn't it make sense to store it on disk as
> system.project_id also (i.e. in the "system" namespace)?
>
> Alternatively, the "trusted" namespace is already intended to provide
> "accessible only by root/kernel" semantics, if that is what you are
> trying to achieve.
Yes, maybe that would be a better storage class for the project ID.
> I'm also confused by the statement "It has not user visible xattr interface"
> yet you also write "Project id is accessible via generic xattr
> interface". Why bother having this complexity when you are free to
> make the two identical?
Oh, obviously the sentence saying 'xattr is not visible' is obsolete.
I forgot to change it after prjid became a visible xattr. Sorry
for the ambiguity.
>
> Cheers, Andreas
> --
> Andreas Dilger
> Sr. Staff Engineer, Lustre Group
> Sun Microsystems of Canada, Inc.

2010-04-06 09:07:55

by Dmitry Monakhov

Subject: Ping.

Dmitry Monakhov writes:

> This is 6'th version of extened inode owner patch-set.
> Please review it tell me what do you think about all this.
> Are you agree with this approach?
> Are you worry about some implementation details?
> Is it ready for merge to some devel's tree?
Ping. I haven't received any response to the patchset, apart from a small
note about the xattr name from Andreas.
Please tell me what you think about the whole idea and the current
state of the patch-set. What do I have to do to make progress?

2010-04-13 18:14:50

by Christoph Hellwig

Subject: Re: Ping.

On Tue, Apr 06, 2010 at 01:00:58PM +0400, Dmitry Monakhov wrote:
> Dmitry Monakhov writes:
>
> > This is 6'th version of extened inode owner patch-set.
> > Please review it tell me what do you think about all this.
> > Are you agree with this approach?
> > Are you worry about some implementation details?
> > Is it ready for merge to some devel's tree?
> Ping. I haven't received any response to the patchset, apart from a small
> note about the xattr name from Andreas.
> Please tell me what you think about the whole idea and the current
> state of the patch-set. What do I have to do to make progress?

As long as you still have the awkward ifdefs and different semantics for
"isolation" vs "not" there's still a NACK from me. But in the end Al
will have to decide if he wants to take your patches or not.


2010-04-15 11:30:58

by Dmitry Monakhov

Subject: Re: Ping.

Christoph Hellwig writes:

> On Tue, Apr 06, 2010 at 01:00:58PM +0400, Dmitry Monakhov wrote:
>> Dmitry Monakhov writes:
>>
>> > This is 6'th version of extened inode owner patch-set.
>> > Please review it tell me what do you think about all this.
>> > Are you agree with this approach?
>> > Are you worry about some implementation details?
>> > Is it ready for merge to some devel's tree?
>> Ping. I haven't received any response to the patchset, apart from a small
>> note about the xattr name from Andreas.
>> Please tell me what you think about the whole idea and the current
>> state of the patch-set. What do I have to do to make progress?
>
> As long as you still have the awkward ifdefs and different semantics for
About the ifdef style:
How can I avoid CONFIG_PROJECT_ID without bloating the inode size?
Embedded people will kill me for that.
I just followed the approach of the quota code. Or do you want the prjid
logic to be guarded by the quota config option?
I don't remember whether we have already talked about this.
If so, please remind me of your view.
> "isolation" vs "not" there's still a NACK from me. But in the end Al
> will have to decide if he wants to take your patches or not.

2010-04-30 17:08:02

by Pavel Emelyanov

Subject: Re: Ping to Al

> As long as you still have the awkward ifdefs and different semantics for
> "isolation" vs "not" there's still a NACK from me. But in the end Al
> will have to decide if he wants to take your patches or not.

Hi, Al!

Regardless of Christoph's concerns about the awkward ifdefs and semantics (we
will fix these for sure), could you please comment on the patchset?

Thanks,
Pavel

2010-05-15 09:34:51

by Al Viro

Subject: Re: Ping.

On Thu, Apr 15, 2010 at 03:30:58PM +0400, Dmitry Monakhov wrote:

> > As long as you still have the awkward ifdefs and different semantics for
> About the ifdef style:
> How can I avoid CONFIG_PROJECT_ID without bloating the inode size?
> Embedded people will kill me for that.
> I just followed the approach of the quota code. Or do you want the prjid
> logic to be guarded by the quota config option?
> I don't remember whether we have already talked about this.
> If so, please remind me of your view.

You have way more ifdefs than just that.

E.g. the struct iattr ones are pure junk; that structure is not kept around
for long. Moreover, since ATTR_... is not userland-reachable, you can
bloody well #define ATTR_PRJID to 0 and get rid of more ifdefs.
Ifdefs due to accesses to i_prjid can be reduced to just two inlined
helpers (get_prjid/set_prjid) and there you go...
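
A rough user-space sketch of that suggestion. Only the get_prjid/set_prjid
names come from the text above; the cut-down `struct inode_sketch`, the
ATTR_PRJID bit value and the caller `prjid_change_needed()` are assumptions
made for illustration and are not taken from the patch:

```c
#include <assert.h>

/* Hypothetical stand-in for struct inode; only the field itself is ifdef'ed. */
struct inode_sketch {
	unsigned int i_uid;
	unsigned int i_gid;
#ifdef CONFIG_PROJECT_ID
	unsigned int i_prjid;
#endif
};

#ifdef CONFIG_PROJECT_ID
#define ATTR_PRJID (1u << 20)	/* assumed bit value, for illustration only */

static inline unsigned int get_prjid(const struct inode_sketch *inode)
{
	return inode->i_prjid;
}

static inline void set_prjid(struct inode_sketch *inode, unsigned int prjid)
{
	inode->i_prjid = prjid;
}
#else
#define ATTR_PRJID 0		/* every "valid & ATTR_PRJID" test folds to 0 */

static inline unsigned int get_prjid(const struct inode_sketch *inode)
{
	(void)inode;
	return 0;
}

static inline void set_prjid(struct inode_sketch *inode, unsigned int prjid)
{
	(void)inode;
	(void)prjid;
}
#endif

/* A caller written once, with no ifdefs of its own. */
static inline int prjid_change_needed(const struct inode_sketch *inode,
				      unsigned int ia_valid,
				      unsigned int ia_prjid)
{
	return (ia_valid & ATTR_PRJID) && ia_prjid != get_prjid(inode);
}
```

Built without -DCONFIG_PROJECT_ID, the ATTR_PRJID test is a constant zero, so
the compiler can drop the branch while the caller's source stays the same.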

As for the ->i_prjid itself, I'd buy ifdef there, but _not_ in that
place within structure. You are shoving it between two pointers;
think of alignment.
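
To make the alignment point concrete, here is a small user-space sketch. The
struct and field names are invented and do not reflect the real struct inode
layout. On a typical LP64 ABI, a 32-bit field wedged between two pointers
pulls in padding, while the same field placed beside an existing 32-bit
member fills a hole that was already there:

```c
#include <assert.h>
#include <stddef.h>

/* Layout before the new field is added. */
struct base_sketch {
	void *a;
	void *b;
	unsigned int flags;
};

/* New 32-bit field shoved between two pointers. */
struct wedged_sketch {
	void *a;
	unsigned int prjid;
	void *b;
	unsigned int flags;
};

/* New 32-bit field placed next to an existing 32-bit member. */
struct grouped_sketch {
	void *a;
	void *b;
	unsigned int flags;
	unsigned int prjid;
};

/* Bytes each placement adds to the structure. */
static inline size_t prjid_cost_wedged(void)
{
	return sizeof(struct wedged_sketch) - sizeof(struct base_sketch);
}

static inline size_t prjid_cost_grouped(void)
{
	return sizeof(struct grouped_sketch) - sizeof(struct base_sketch);
}
```

With 8-byte pointers and 4-byte ints, the wedged layout grows the structure
by 8 bytes and the grouped one by none; on a 32-bit ABI the two cost the same.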

As for the isolation... Why do we care about link/rename at all?
All stuff with the same project ID gets counted together, wherever
in the tree it might be. It's not as if we were scanning a subtree
and calculating the sum over it, after all. So what is the problem?
Some bastard DoSing you by creating links from places you can't write
to? Then you want as much isolation as possible anyway, TYVM, and
bind-as-link-barrier is entirely appropriate there.
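
The accounting model described here can be sketched as follows: usage is a
sum over inodes keyed by project ID, so the directories that hold names for
an inode never enter the calculation. `struct file_sketch` and
`project_usage()` are illustrative names, not kernel interfaces:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an inode as seen by quota accounting. */
struct file_sketch {
	unsigned int prjid;	/* project the inode is charged to */
	unsigned long blocks;	/* blocks owned by the inode */
	unsigned int nlink;	/* hard links, possibly in unrelated dirs */
};

/* Total blocks charged to one project; link count and paths are ignored. */
static inline unsigned long project_usage(const struct file_sketch *files,
					  size_t n, unsigned int prjid)
{
	unsigned long total = 0;
	size_t i;

	for (i = 0; i < n; i++)
		if (files[i].prjid == prjid)
			total += files[i].blocks;
	return total;
}
```

Adding a hard link from another part of the tree changes nlink but not the
sum, which is why link and rename do not disturb the accounting itself.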

> > "isolation" vs "not" there's still a NACK from me. But in the end Al
> > will have to decide if he wants to take your patches or not.

I want details on use cases and semantics for the isolation stuff; as it is,
it looks rather odd. Why is a move towards the "neutral" part of the tree
allowed while links over there are not, for example?