2008-12-22 21:48:55

by Mark Fasheh

Subject: [git patches] Ocfs2 patches for merge window, batch 2/3

Hi,

This is the second batch of Ocfs2 patches intended for the merge window. The
first batch was sent out previously:

http://lkml.org/lkml/2008/12/19/280

The bulk of this set consists of Jan Kara's patches adding quota support to
Ocfs2. Many of the quota patches touch generic code, which I carried to make
merging the Ocfs2 support easier. All of the non-ocfs2 patches should have
appropriate signoffs. Quota is handled a bit differently in Ocfs2 than in
other file systems: we keep a set of node-local quota files (user and group),
which periodically sync with a global file. This allows for a higher level of
concurrency on multi-node clusters. Additionally, a small portion of each
quota block is reserved for later use as a checksum field.
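
For those unfamiliar with the scheme, here is a rough sketch of the idea.
This is illustrative only -- it is not the ocfs2 code and every name below
is made up:

/*
 * Illustrative sketch only -- not ocfs2 code; all names are hypothetical.
 * Each node accounts its allocations in a node-local quota file and a
 * periodic sync folds the accumulated delta into the shared global file,
 * so most allocations avoid cluster-wide locking.
 */
#include <pthread.h>

struct global_dqblk {                   /* entry in the shared global quota file */
        long long space_used;           /* bytes used cluster-wide */
        long long inodes_used;          /* inodes used cluster-wide */
        pthread_mutex_t lock;           /* stand-in for the real cluster lock */
};

struct local_dqblk {                    /* entry in one node's local quota file */
        long long space_delta;          /* bytes allocated (+) / freed (-) since last sync */
        long long inode_delta;          /* same, for inodes */
};

/* Periodic sync: fold the node-local delta into the global entry. */
static void quota_sync_entry(struct global_dqblk *g, struct local_dqblk *l)
{
        pthread_mutex_lock(&g->lock);
        g->space_used  += l->space_delta;
        g->inodes_used += l->inode_delta;
        l->space_delta = 0;             /* local entry starts accumulating again */
        l->inode_delta = 0;
        pthread_mutex_unlock(&g->lock);
}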

The other non-trivial part of this series consists of some more metadata I/O
cleanups by Joel. This time the focus is on writing leaves in indexed xattr
trees and managing those changes in a fashion that makes the metadata
checksum patches in round 3 more straightforward.
--Mark

Please pull from 'upstream-round2' branch of
git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2.git upstream-round2

to receive the following updates:

fs/Kconfig | 7 +
fs/Makefile | 1 +
fs/dquot.c | 436 +++++++++----
fs/ext3/super.c | 16 +-
fs/ext4/super.c | 15 +-
fs/ocfs2/Makefile | 2 +
fs/ocfs2/alloc.c | 20 +-
fs/ocfs2/aops.c | 16 +-
fs/ocfs2/buffer_head_io.c | 5 +-
fs/ocfs2/cluster/masklog.c | 1 +
fs/ocfs2/cluster/masklog.h | 1 +
fs/ocfs2/dir.c | 24 +-
fs/ocfs2/dlmglue.c | 146 +++++
fs/ocfs2/dlmglue.h | 19 +
fs/ocfs2/file.c | 78 ++-
fs/ocfs2/file.h | 3 +
fs/ocfs2/inode.c | 16 +-
fs/ocfs2/inode.h | 2 +
fs/ocfs2/journal.c | 142 ++++-
fs/ocfs2/journal.h | 85 ++-
fs/ocfs2/namei.c | 44 ++-
fs/ocfs2/ocfs2.h | 7 +-
fs/ocfs2/ocfs2_fs.h | 126 ++++-
fs/ocfs2/ocfs2_lockid.h | 5 +
fs/ocfs2/quota.h | 117 ++++
fs/ocfs2/quota_global.c | 990 ++++++++++++++++++++++++++++
fs/ocfs2/quota_local.c | 1253 ++++++++++++++++++++++++++++++++++++
fs/ocfs2/super.c | 277 ++++++++-
fs/ocfs2/xattr.c | 567 +++++++++--------
fs/quota.c | 11 +-
fs/quota_tree.c | 645 +++++++++++++++++++
fs/quota_tree.h | 25 +
fs/quota_v1.c | 28 +-
fs/quota_v2.c | 631 +++----------------
{include/linux => fs}/quotaio_v1.h | 0
{include/linux => fs}/quotaio_v2.h | 33 +-
fs/reiserfs/super.c | 10 +-
include/linux/Kbuild | 4 -
include/linux/dqblk_qtree.h | 56 ++
include/linux/dqblk_v1.h | 7 -
include/linux/dqblk_v2.h | 22 +-
include/linux/jbd2.h | 1 +
include/linux/quota.h | 108 +++-
include/linux/quotaops.h | 96 +++-
mm/pdflush.c | 1 +
45 files changed, 4925 insertions(+), 1174 deletions(-)
create mode 100644 fs/ocfs2/quota.h
create mode 100644 fs/ocfs2/quota_global.c
create mode 100644 fs/ocfs2/quota_local.c
create mode 100644 fs/quota_tree.c
create mode 100644 fs/quota_tree.h
rename {include/linux => fs}/quotaio_v1.h (100%)
rename {include/linux => fs}/quotaio_v2.h (68%)
create mode 100644 include/linux/dqblk_qtree.h

Jan Kara (36):
quota: Add callbacks for allocating and destroying dquot structures
quota: Increase size of variables for limits and inode usage
quota: Remove bogus 'optimization' in check_idq() and check_bdq()
quota: Make _SUSPENDED just a flag
quota: Allow to separately enable quota accounting and enforcing limits
ext3: Use sb_any_quota_loaded() instead of sb_any_quota_enabled()
ext4: Use sb_any_quota_loaded() instead of sb_any_quota_enabled()
reiserfs: Use sb_any_quota_loaded() instead of sb_any_quota_enabled().
quota: Remove compatibility function sb_any_quota_enabled()
quota: Introduce DQUOT_QUOTA_SYS_FILE flag
quota: Move quotaio_v[12].h from include/linux/ to fs/
quota: Split off quota tree handling into a separate file
quota: Convert union in mem_dqinfo to a pointer
quota: Allow negative usage of space and inodes
quota: Keep which entries were set by SETQUOTA quotactl
quota: Update version number
quota: Add helpers to allow ocfs2 specific quota initialization, freeing and recovery
quota: Implement function for scanning active dquots
mm: Export pdflush_operation()
ocfs2: Support nested transactions
ocfs2: Assign feature bits and system inodes to quota feature and quota files
ocfs2: Mark system files as not subject to quota accounting
ocfs2: Implementation of local and global quota file handling
ocfs2: Add quota calls for allocation and freeing of inodes and space
ocfs2: Implement quota syncing thread
ocfs2: Implement quota recovery
ocfs2: Enable quota accounting on mount, disable on umount
ocfs2: Add missing initialization
ocfs2: Fix oops when extending quota files
ocfs2: Make ocfs2_get_quota_block() consistent with ocfs2_read_quota_block()
ocfs2: Fix build warnings (64-bit types vs long long)
quota: Unexport dqblk_v1.h and dqblk_v2.h
quota: Export dquot_alloc() and dquot_destroy() functions
reiserfs: Add default allocation routines for quota structures
ext3: Add default allocation routines for quota structures
ext4: Add default allocation routines for quota structures

Joel Becker (14):
ocfs2: Fix ocfs2_read_quota_block() error handling.
ocfs2: Dirty the entire bucket in ocfs2_bucket_value_truncate()
ocfs2: Dirty the entire first bucket in ocfs2_extend_xattr_bucket()
ocfs2: Dirty the entire first bucket in ocfs2_cp_xattr_cluster().
ocfs2: Explain t_is_new in ocfs2_cp_xattr_cluster().
ocfs2: Use ocfs2_cp_xattr_bucket() in ocfs2_mv_xattr_bucket_cross_cluster().
ocfs2: Rename ocfs2_cp_xattr_cluster() to ocfs2_mv_xattr_buckets().
ocfs2: ocfs2_mv_xattr_buckets() can handle a partial cluster now.
ocfs2: Use ocfs2_mv_xattr_buckets() in ocfs2_mv_xattr_bucket_cross_cluster().
ocfs2: Start using buckets in ocfs2_adjust_xattr_cross_cluster().
ocfs2: Pass buckets into ocfs2_mv_xattr_bucket_cross_cluster().
ocfs2: Move buckets up into ocfs2_add_new_xattr_cluster().
ocfs2: Move buckets up into ocfs2_add_new_xattr_bucket().
ocfs2: Pass xs->bucket into ocfs2_add_new_xattr_bucket().

Mark Fasheh (2):
jbd2: Add BH_JBDPrivateStart
ocfs2: Use BH_JBDPrivateStart instead of BH_Unshadow

Tao Ma (4):
ocfs2: fix indendation in ocfs2_dquot_drop_slow
ocfs2/quota: sparse fixes for quota
ocfs2: Narrow the transaction for deleting xattrs from a bucket.
ocfs2/quota: Add QUOTA in mlog_attribute.


2008-12-22 21:49:18

by Mark Fasheh

Subject: [PATCH 01/56] quota: Add callbacks for allocating and destroying dquot structures

From: Jan Kara <[email protected]>

Some filesystems would like to keep private information together with each
dquot. Add callbacks alloc_dquot and destroy_dquot, allowing a filesystem to
allocate larger dquots from its private slab, in a fashion similar to how we
currently allocate inodes.

Signed-off-by: Jan Kara <[email protected]>
Signed-off-by: Mark Fasheh <[email protected]>
---
fs/dquot.c | 27 ++++++++++++++++++++++-----
include/linux/quota.h | 2 ++
2 files changed, 24 insertions(+), 5 deletions(-)

diff --git a/fs/dquot.c b/fs/dquot.c
index 5e95261..5512e38 100644
--- a/fs/dquot.c
+++ b/fs/dquot.c
@@ -415,6 +415,16 @@ out_dqlock:
return ret;
}

+static void dquot_destroy(struct dquot *dquot)
+{
+ kmem_cache_free(dquot_cachep, dquot);
+}
+
+static inline void do_destroy_dquot(struct dquot *dquot)
+{
+ dquot->dq_sb->dq_op->destroy_dquot(dquot);
+}
+
/* Invalidate all dquots on the list. Note that this function is called after
* quota is disabled and pointers from inodes removed so there cannot be new
* quota users. There can still be some users of quotas due to inodes being
@@ -463,7 +473,7 @@ restart:
remove_dquot_hash(dquot);
remove_free_dquot(dquot);
remove_inuse(dquot);
- kmem_cache_free(dquot_cachep, dquot);
+ do_destroy_dquot(dquot);
}
spin_unlock(&dq_list_lock);
}
@@ -527,7 +537,7 @@ static void prune_dqcache(int count)
remove_dquot_hash(dquot);
remove_free_dquot(dquot);
remove_inuse(dquot);
- kmem_cache_free(dquot_cachep, dquot);
+ do_destroy_dquot(dquot);
count--;
head = free_dquots.prev;
}
@@ -625,11 +635,16 @@ we_slept:
spin_unlock(&dq_list_lock);
}

+static struct dquot *dquot_alloc(struct super_block *sb, int type)
+{
+ return kmem_cache_zalloc(dquot_cachep, GFP_NOFS);
+}
+
static struct dquot *get_empty_dquot(struct super_block *sb, int type)
{
struct dquot *dquot;

- dquot = kmem_cache_zalloc(dquot_cachep, GFP_NOFS);
+ dquot = sb->dq_op->alloc_dquot(sb, type);
if(!dquot)
return NODQUOT;

@@ -682,7 +697,7 @@ we_slept:
dqstats.lookups++;
spin_unlock(&dq_list_lock);
if (empty)
- kmem_cache_free(dquot_cachep, empty);
+ do_destroy_dquot(empty);
}
/* Wait for dq_lock - after this we know that either dquot_release() is already
* finished or it will be canceled due to dq_count > 1 test */
@@ -1533,7 +1548,9 @@ struct dquot_operations dquot_operations = {
.acquire_dquot = dquot_acquire,
.release_dquot = dquot_release,
.mark_dirty = dquot_mark_dquot_dirty,
- .write_info = dquot_commit_info
+ .write_info = dquot_commit_info,
+ .alloc_dquot = dquot_alloc,
+ .destroy_dquot = dquot_destroy,
};

static inline void set_enable_flags(struct quota_info *dqopt, int type)
diff --git a/include/linux/quota.h b/include/linux/quota.h
index 40401b5..3ce708c 100644
--- a/include/linux/quota.h
+++ b/include/linux/quota.h
@@ -292,6 +292,8 @@ struct dquot_operations {
int (*free_inode) (const struct inode *, unsigned long);
int (*transfer) (struct inode *, struct iattr *);
int (*write_dquot) (struct dquot *); /* Ordinary dquot write */
+ struct dquot *(*alloc_dquot)(struct super_block *, int); /* Allocate memory for new dquot */
+ void (*destroy_dquot)(struct dquot *); /* Free memory for dquot */
int (*acquire_dquot) (struct dquot *); /* Quota is going to be created on disk */
int (*release_dquot) (struct dquot *); /* Quota is going to be deleted from disk */
int (*mark_dirty) (struct dquot *); /* Dquot is marked dirty */
--
1.5.6

2008-12-22 21:49:41

by Mark Fasheh

Subject: [PATCH 02/56] quota: Increase size of variables for limits and inode usage

From: Jan Kara <[email protected]>

So far, quota has been fine with quota block limits and inode limits/numbers
in a 32-bit type. Now, with the rapid increase in storage sizes, requests are
coming in to handle quota limits above 4TB and more than 2^32 inodes. So bump
up the sizes of the types in the mem_dqblk structure to 64 bits to handle
this. Also update the inode allocation / checking functions to use qsize_t,
and make the global structure keep quota limits in bytes so that things are
consistent.
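
As a rough standalone illustration of the byte <-> quota-block conversion
that now happens only at the quotactl boundary (mirroring the qbtos()/stoqb()
helpers added below; QIF_DQBLKSIZE_BITS == 10, i.e. 1 KiB quota blocks --
the demo's qsize_t merely stands in for the kernel's 64-bit type):

#include <stdio.h>

typedef unsigned long long qsize_t;

#define QIF_DQBLKSIZE_BITS 10
#define QIF_DQBLKSIZE (1ULL << QIF_DQBLKSIZE_BITS)

static qsize_t qbtos(qsize_t blocks)    /* quota blocks -> bytes (SETQUOTA path) */
{
        return blocks << QIF_DQBLKSIZE_BITS;
}

static qsize_t stoqb(qsize_t space)     /* bytes -> quota blocks, rounded up (GETQUOTA path) */
{
        return (space + QIF_DQBLKSIZE - 1) >> QIF_DQBLKSIZE_BITS;
}

int main(void)
{
        qsize_t limit_blocks = 5ULL << 30;      /* 5 * 2^30 1 KiB blocks == a 5 TiB limit */
        qsize_t limit_bytes  = qbtos(limit_blocks);

        /* A 64-bit type is needed -- this would overflow the old 32-bit fields. */
        printf("%llu blocks = %llu bytes, back to %llu blocks\n",
               limit_blocks, limit_bytes, stoqb(limit_bytes));
        return 0;
}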

Signed-off-by: Jan Kara <[email protected]>
Signed-off-by: Mark Fasheh <[email protected]>
---
fs/dquot.c | 50 ++++++++++++++++++++++++++-------------------
fs/quota_v1.c | 25 +++++++++++++++++-----
fs/quota_v2.c | 21 +++++++++++++++---
include/linux/quota.h | 28 +++++++++++--------------
include/linux/quotaops.h | 4 +-
5 files changed, 79 insertions(+), 49 deletions(-)

diff --git a/fs/dquot.c b/fs/dquot.c
index 5512e38..59e8aea 100644
--- a/fs/dquot.c
+++ b/fs/dquot.c
@@ -835,7 +835,7 @@ static void drop_dquot_ref(struct super_block *sb, int type)
}
}

-static inline void dquot_incr_inodes(struct dquot *dquot, unsigned long number)
+static inline void dquot_incr_inodes(struct dquot *dquot, qsize_t number)
{
dquot->dq_dqb.dqb_curinodes += number;
}
@@ -845,7 +845,7 @@ static inline void dquot_incr_space(struct dquot *dquot, qsize_t number)
dquot->dq_dqb.dqb_curspace += number;
}

-static inline void dquot_decr_inodes(struct dquot *dquot, unsigned long number)
+static inline void dquot_decr_inodes(struct dquot *dquot, qsize_t number)
{
if (dquot->dq_dqb.dqb_curinodes > number)
dquot->dq_dqb.dqb_curinodes -= number;
@@ -862,7 +862,7 @@ static inline void dquot_decr_space(struct dquot *dquot, qsize_t number)
dquot->dq_dqb.dqb_curspace -= number;
else
dquot->dq_dqb.dqb_curspace = 0;
- if (toqb(dquot->dq_dqb.dqb_curspace) <= dquot->dq_dqb.dqb_bsoftlimit)
+ if (dquot->dq_dqb.dqb_curspace <= dquot->dq_dqb.dqb_bsoftlimit)
dquot->dq_dqb.dqb_btime = (time_t) 0;
clear_bit(DQ_BLKS_B, &dquot->dq_flags);
}
@@ -1038,7 +1038,7 @@ static inline char ignore_hardlimit(struct dquot *dquot)
}

/* needs dq_data_lock */
-static int check_idq(struct dquot *dquot, ulong inodes, char *warntype)
+static int check_idq(struct dquot *dquot, qsize_t inodes, char *warntype)
{
*warntype = QUOTA_NL_NOWARN;
if (inodes <= 0 || test_bit(DQ_FAKE_B, &dquot->dq_flags))
@@ -1077,7 +1077,7 @@ static int check_bdq(struct dquot *dquot, qsize_t space, int prealloc, char *war
return QUOTA_OK;

if (dquot->dq_dqb.dqb_bhardlimit &&
- toqb(dquot->dq_dqb.dqb_curspace + space) > dquot->dq_dqb.dqb_bhardlimit &&
+ dquot->dq_dqb.dqb_curspace + space > dquot->dq_dqb.dqb_bhardlimit &&
!ignore_hardlimit(dquot)) {
if (!prealloc)
*warntype = QUOTA_NL_BHARDWARN;
@@ -1085,7 +1085,7 @@ static int check_bdq(struct dquot *dquot, qsize_t space, int prealloc, char *war
}

if (dquot->dq_dqb.dqb_bsoftlimit &&
- toqb(dquot->dq_dqb.dqb_curspace + space) > dquot->dq_dqb.dqb_bsoftlimit &&
+ dquot->dq_dqb.dqb_curspace + space > dquot->dq_dqb.dqb_bsoftlimit &&
dquot->dq_dqb.dqb_btime && get_seconds() >= dquot->dq_dqb.dqb_btime &&
!ignore_hardlimit(dquot)) {
if (!prealloc)
@@ -1094,7 +1094,7 @@ static int check_bdq(struct dquot *dquot, qsize_t space, int prealloc, char *war
}

if (dquot->dq_dqb.dqb_bsoftlimit &&
- toqb(dquot->dq_dqb.dqb_curspace + space) > dquot->dq_dqb.dqb_bsoftlimit &&
+ dquot->dq_dqb.dqb_curspace + space > dquot->dq_dqb.dqb_bsoftlimit &&
dquot->dq_dqb.dqb_btime == 0) {
if (!prealloc) {
*warntype = QUOTA_NL_BSOFTWARN;
@@ -1111,7 +1111,7 @@ static int check_bdq(struct dquot *dquot, qsize_t space, int prealloc, char *war
return QUOTA_OK;
}

-static int info_idq_free(struct dquot *dquot, ulong inodes)
+static int info_idq_free(struct dquot *dquot, qsize_t inodes)
{
if (test_bit(DQ_FAKE_B, &dquot->dq_flags) ||
dquot->dq_dqb.dqb_curinodes <= dquot->dq_dqb.dqb_isoftlimit)
@@ -1128,15 +1128,13 @@ static int info_idq_free(struct dquot *dquot, ulong inodes)
static int info_bdq_free(struct dquot *dquot, qsize_t space)
{
if (test_bit(DQ_FAKE_B, &dquot->dq_flags) ||
- toqb(dquot->dq_dqb.dqb_curspace) <= dquot->dq_dqb.dqb_bsoftlimit)
+ dquot->dq_dqb.dqb_curspace <= dquot->dq_dqb.dqb_bsoftlimit)
return QUOTA_NL_NOWARN;

- if (toqb(dquot->dq_dqb.dqb_curspace - space) <=
- dquot->dq_dqb.dqb_bsoftlimit)
+ if (dquot->dq_dqb.dqb_curspace - space <= dquot->dq_dqb.dqb_bsoftlimit)
return QUOTA_NL_BSOFTBELOW;
- if (toqb(dquot->dq_dqb.dqb_curspace) >= dquot->dq_dqb.dqb_bhardlimit &&
- toqb(dquot->dq_dqb.dqb_curspace - space) <
- dquot->dq_dqb.dqb_bhardlimit)
+ if (dquot->dq_dqb.dqb_curspace >= dquot->dq_dqb.dqb_bhardlimit &&
+ dquot->dq_dqb.dqb_curspace - space < dquot->dq_dqb.dqb_bhardlimit)
return QUOTA_NL_BHARDBELOW;
return QUOTA_NL_NOWARN;
}
@@ -1279,7 +1277,7 @@ warn_put_all:
/*
* This operation can block, but only after everything is updated
*/
-int dquot_alloc_inode(const struct inode *inode, unsigned long number)
+int dquot_alloc_inode(const struct inode *inode, qsize_t number)
{
int cnt, ret = NO_QUOTA;
char warntype[MAXQUOTAS];
@@ -1364,7 +1362,7 @@ out_sub:
/*
* This operation can block, but only after everything is updated
*/
-int dquot_free_inode(const struct inode *inode, unsigned long number)
+int dquot_free_inode(const struct inode *inode, qsize_t number)
{
unsigned int cnt;
char warntype[MAXQUOTAS];
@@ -1883,14 +1881,24 @@ int vfs_dq_quota_on_remount(struct super_block *sb)
return ret;
}

+static inline qsize_t qbtos(qsize_t blocks)
+{
+ return blocks << QIF_DQBLKSIZE_BITS;
+}
+
+static inline qsize_t stoqb(qsize_t space)
+{
+ return (space + QIF_DQBLKSIZE - 1) >> QIF_DQBLKSIZE_BITS;
+}
+
/* Generic routine for getting common part of quota structure */
static void do_get_dqblk(struct dquot *dquot, struct if_dqblk *di)
{
struct mem_dqblk *dm = &dquot->dq_dqb;

spin_lock(&dq_data_lock);
- di->dqb_bhardlimit = dm->dqb_bhardlimit;
- di->dqb_bsoftlimit = dm->dqb_bsoftlimit;
+ di->dqb_bhardlimit = stoqb(dm->dqb_bhardlimit);
+ di->dqb_bsoftlimit = stoqb(dm->dqb_bsoftlimit);
di->dqb_curspace = dm->dqb_curspace;
di->dqb_ihardlimit = dm->dqb_ihardlimit;
di->dqb_isoftlimit = dm->dqb_isoftlimit;
@@ -1937,8 +1945,8 @@ static int do_set_dqblk(struct dquot *dquot, struct if_dqblk *di)
check_blim = 1;
}
if (di->dqb_valid & QIF_BLIMITS) {
- dm->dqb_bsoftlimit = di->dqb_bsoftlimit;
- dm->dqb_bhardlimit = di->dqb_bhardlimit;
+ dm->dqb_bsoftlimit = qbtos(di->dqb_bsoftlimit);
+ dm->dqb_bhardlimit = qbtos(di->dqb_bhardlimit);
check_blim = 1;
}
if (di->dqb_valid & QIF_INODES) {
@@ -1956,7 +1964,7 @@ static int do_set_dqblk(struct dquot *dquot, struct if_dqblk *di)
dm->dqb_itime = di->dqb_itime;

if (check_blim) {
- if (!dm->dqb_bsoftlimit || toqb(dm->dqb_curspace) < dm->dqb_bsoftlimit) {
+ if (!dm->dqb_bsoftlimit || dm->dqb_curspace < dm->dqb_bsoftlimit) {
dm->dqb_btime = 0;
clear_bit(DQ_BLKS_B, &dquot->dq_flags);
}
diff --git a/fs/quota_v1.c b/fs/quota_v1.c
index 5ae15b1..3e078ee 100644
--- a/fs/quota_v1.c
+++ b/fs/quota_v1.c
@@ -14,14 +14,27 @@ MODULE_AUTHOR("Jan Kara");
MODULE_DESCRIPTION("Old quota format support");
MODULE_LICENSE("GPL");

+#define QUOTABLOCK_BITS 10
+#define QUOTABLOCK_SIZE (1 << QUOTABLOCK_BITS)
+
+static inline qsize_t v1_stoqb(qsize_t space)
+{
+ return (space + QUOTABLOCK_SIZE - 1) >> QUOTABLOCK_BITS;
+}
+
+static inline qsize_t v1_qbtos(qsize_t blocks)
+{
+ return blocks << QUOTABLOCK_BITS;
+}
+
static void v1_disk2mem_dqblk(struct mem_dqblk *m, struct v1_disk_dqblk *d)
{
m->dqb_ihardlimit = d->dqb_ihardlimit;
m->dqb_isoftlimit = d->dqb_isoftlimit;
m->dqb_curinodes = d->dqb_curinodes;
- m->dqb_bhardlimit = d->dqb_bhardlimit;
- m->dqb_bsoftlimit = d->dqb_bsoftlimit;
- m->dqb_curspace = ((qsize_t)d->dqb_curblocks) << QUOTABLOCK_BITS;
+ m->dqb_bhardlimit = v1_qbtos(d->dqb_bhardlimit);
+ m->dqb_bsoftlimit = v1_qbtos(d->dqb_bsoftlimit);
+ m->dqb_curspace = v1_qbtos(d->dqb_curblocks);
m->dqb_itime = d->dqb_itime;
m->dqb_btime = d->dqb_btime;
}
@@ -31,9 +44,9 @@ static void v1_mem2disk_dqblk(struct v1_disk_dqblk *d, struct mem_dqblk *m)
d->dqb_ihardlimit = m->dqb_ihardlimit;
d->dqb_isoftlimit = m->dqb_isoftlimit;
d->dqb_curinodes = m->dqb_curinodes;
- d->dqb_bhardlimit = m->dqb_bhardlimit;
- d->dqb_bsoftlimit = m->dqb_bsoftlimit;
- d->dqb_curblocks = toqb(m->dqb_curspace);
+ d->dqb_bhardlimit = v1_stoqb(m->dqb_bhardlimit);
+ d->dqb_bsoftlimit = v1_stoqb(m->dqb_bsoftlimit);
+ d->dqb_curblocks = v1_stoqb(m->dqb_curspace);
d->dqb_itime = m->dqb_itime;
d->dqb_btime = m->dqb_btime;
}
diff --git a/fs/quota_v2.c b/fs/quota_v2.c
index b53827d..51c4717 100644
--- a/fs/quota_v2.c
+++ b/fs/quota_v2.c
@@ -26,6 +26,19 @@ typedef char *dqbuf_t;
#define GETIDINDEX(id, depth) (((id) >> ((V2_DQTREEDEPTH-(depth)-1)*8)) & 0xff)
#define GETENTRIES(buf) ((struct v2_disk_dqblk *)(((char *)buf)+sizeof(struct v2_disk_dqdbheader)))

+#define QUOTABLOCK_BITS 10
+#define QUOTABLOCK_SIZE (1 << QUOTABLOCK_BITS)
+
+static inline qsize_t v2_stoqb(qsize_t space)
+{
+ return (space + QUOTABLOCK_SIZE - 1) >> QUOTABLOCK_BITS;
+}
+
+static inline qsize_t v2_qbtos(qsize_t blocks)
+{
+ return blocks << QUOTABLOCK_BITS;
+}
+
/* Check whether given file is really vfsv0 quotafile */
static int v2_check_quota_file(struct super_block *sb, int type)
{
@@ -104,8 +117,8 @@ static void disk2memdqb(struct mem_dqblk *m, struct v2_disk_dqblk *d)
m->dqb_isoftlimit = le32_to_cpu(d->dqb_isoftlimit);
m->dqb_curinodes = le32_to_cpu(d->dqb_curinodes);
m->dqb_itime = le64_to_cpu(d->dqb_itime);
- m->dqb_bhardlimit = le32_to_cpu(d->dqb_bhardlimit);
- m->dqb_bsoftlimit = le32_to_cpu(d->dqb_bsoftlimit);
+ m->dqb_bhardlimit = v2_qbtos(le32_to_cpu(d->dqb_bhardlimit));
+ m->dqb_bsoftlimit = v2_qbtos(le32_to_cpu(d->dqb_bsoftlimit));
m->dqb_curspace = le64_to_cpu(d->dqb_curspace);
m->dqb_btime = le64_to_cpu(d->dqb_btime);
}
@@ -116,8 +129,8 @@ static void mem2diskdqb(struct v2_disk_dqblk *d, struct mem_dqblk *m, qid_t id)
d->dqb_isoftlimit = cpu_to_le32(m->dqb_isoftlimit);
d->dqb_curinodes = cpu_to_le32(m->dqb_curinodes);
d->dqb_itime = cpu_to_le64(m->dqb_itime);
- d->dqb_bhardlimit = cpu_to_le32(m->dqb_bhardlimit);
- d->dqb_bsoftlimit = cpu_to_le32(m->dqb_bsoftlimit);
+ d->dqb_bhardlimit = cpu_to_le32(v2_qbtos(m->dqb_bhardlimit));
+ d->dqb_bsoftlimit = cpu_to_le32(v2_qbtos(m->dqb_bsoftlimit));
d->dqb_curspace = cpu_to_le64(m->dqb_curspace);
d->dqb_btime = cpu_to_le64(m->dqb_btime);
d->dqb_id = cpu_to_le32(id);
diff --git a/include/linux/quota.h b/include/linux/quota.h
index 3ce708c..9ea4683 100644
--- a/include/linux/quota.h
+++ b/include/linux/quota.h
@@ -39,15 +39,6 @@
#define __DQUOT_VERSION__ "dquot_6.5.1"
#define __DQUOT_NUM_VERSION__ 6*10000+5*100+1

-/* Size of blocks in which are counted size limits */
-#define QUOTABLOCK_BITS 10
-#define QUOTABLOCK_SIZE (1 << QUOTABLOCK_BITS)
-
-/* Conversion routines from and to quota blocks */
-#define qb2kb(x) ((x) << (QUOTABLOCK_BITS-10))
-#define kb2qb(x) ((x) >> (QUOTABLOCK_BITS-10))
-#define toqb(x) (((x) + QUOTABLOCK_SIZE - 1) >> QUOTABLOCK_BITS)
-
#define MAXQUOTAS 2
#define USRQUOTA 0 /* element used for user quotas */
#define GRPQUOTA 1 /* element used for group quotas */
@@ -80,6 +71,11 @@
#define Q_GETQUOTA 0x800007 /* get user quota structure */
#define Q_SETQUOTA 0x800008 /* set user quota structure */

+/* Size of block in which space limits are passed through the quota
+ * interface */
+#define QIF_DQBLKSIZE_BITS 10
+#define QIF_DQBLKSIZE (1 << QIF_DQBLKSIZE_BITS)
+
/*
* Quota structure used for communication with userspace via quotactl
* Following flags are used to specify which fields are valid
@@ -187,12 +183,12 @@ extern spinlock_t dq_data_lock;
* Data for one user/group kept in memory
*/
struct mem_dqblk {
- __u32 dqb_bhardlimit; /* absolute limit on disk blks alloc */
- __u32 dqb_bsoftlimit; /* preferred limit on disk blks */
+ qsize_t dqb_bhardlimit; /* absolute limit on disk blks alloc */
+ qsize_t dqb_bsoftlimit; /* preferred limit on disk blks */
qsize_t dqb_curspace; /* current used space */
- __u32 dqb_ihardlimit; /* absolute limit on allocated inodes */
- __u32 dqb_isoftlimit; /* preferred inode limit */
- __u32 dqb_curinodes; /* current # allocated inodes */
+ qsize_t dqb_ihardlimit; /* absolute limit on allocated inodes */
+ qsize_t dqb_isoftlimit; /* preferred inode limit */
+ qsize_t dqb_curinodes; /* current # allocated inodes */
time_t dqb_btime; /* time limit for excessive disk use */
time_t dqb_itime; /* time limit for excessive inode use */
};
@@ -287,9 +283,9 @@ struct dquot_operations {
int (*initialize) (struct inode *, int);
int (*drop) (struct inode *);
int (*alloc_space) (struct inode *, qsize_t, int);
- int (*alloc_inode) (const struct inode *, unsigned long);
+ int (*alloc_inode) (const struct inode *, qsize_t);
int (*free_space) (struct inode *, qsize_t);
- int (*free_inode) (const struct inode *, unsigned long);
+ int (*free_inode) (const struct inode *, qsize_t);
int (*transfer) (struct inode *, struct iattr *);
int (*write_dquot) (struct dquot *); /* Ordinary dquot write */
struct dquot *(*alloc_dquot)(struct super_block *, int); /* Allocate memory for new dquot */
diff --git a/include/linux/quotaops.h b/include/linux/quotaops.h
index a558a4c..adcc7ba 100644
--- a/include/linux/quotaops.h
+++ b/include/linux/quotaops.h
@@ -26,10 +26,10 @@ int dquot_initialize(struct inode *inode, int type);
int dquot_drop(struct inode *inode);

int dquot_alloc_space(struct inode *inode, qsize_t number, int prealloc);
-int dquot_alloc_inode(const struct inode *inode, unsigned long number);
+int dquot_alloc_inode(const struct inode *inode, qsize_t number);

int dquot_free_space(struct inode *inode, qsize_t number);
-int dquot_free_inode(const struct inode *inode, unsigned long number);
+int dquot_free_inode(const struct inode *inode, qsize_t number);

int dquot_transfer(struct inode *inode, struct iattr *iattr);
int dquot_commit(struct dquot *dquot);
--
1.5.6

2008-12-22 21:49:59

by Mark Fasheh

Subject: [PATCH 03/56] quota: Remove bogus 'optimization' in check_idq() and check_bdq()

From: Jan Kara <[email protected]>

Checks like <= 0 for an unsigned type do not make much sense. The value can
only be 0, and that does not happen often enough for the check to be worth
it.

Signed-off-by: Jan Kara <[email protected]>
Signed-off-by: Mark Fasheh <[email protected]>
---
fs/dquot.c | 4 ++--
1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/dquot.c b/fs/dquot.c
index 59e8aea..735e2c3 100644
--- a/fs/dquot.c
+++ b/fs/dquot.c
@@ -1041,7 +1041,7 @@ static inline char ignore_hardlimit(struct dquot *dquot)
static int check_idq(struct dquot *dquot, qsize_t inodes, char *warntype)
{
*warntype = QUOTA_NL_NOWARN;
- if (inodes <= 0 || test_bit(DQ_FAKE_B, &dquot->dq_flags))
+ if (test_bit(DQ_FAKE_B, &dquot->dq_flags))
return QUOTA_OK;

if (dquot->dq_dqb.dqb_ihardlimit &&
@@ -1073,7 +1073,7 @@ static int check_idq(struct dquot *dquot, qsize_t inodes, char *warntype)
static int check_bdq(struct dquot *dquot, qsize_t space, int prealloc, char *warntype)
{
*warntype = QUOTA_NL_NOWARN;
- if (space <= 0 || test_bit(DQ_FAKE_B, &dquot->dq_flags))
+ if (test_bit(DQ_FAKE_B, &dquot->dq_flags))
return QUOTA_OK;

if (dquot->dq_dqb.dqb_bhardlimit &&
--
1.5.6

2008-12-22 21:50:51

by Mark Fasheh

Subject: [PATCH 04/56] quota: Make _SUSPENDED just a flag

From: Jan Kara <[email protected]>

Up to now, DQUOT_USR_SUSPENDED behaved like a state - i.e., quota was either
enabled, suspended, or off. Now the allowed states are 0, ENABLED, and
ENABLED | SUSPENDED. This will be useful later when we implement separate
enabling of quota usage tracking and limits enforcement, because we need to
keep track of the state that was suspended.
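
In other words (illustrative check only, not code from this patch), the
valid combinations of the per-type bits are now:

static int usr_quota_state_valid(unsigned int flags)
{
        unsigned int s = flags & (DQUOT_USR_ENABLED | DQUOT_USR_SUSPENDED);

        return s == 0 ||                                        /* quota off */
               s == DQUOT_USR_ENABLED ||                        /* quota on */
               s == (DQUOT_USR_ENABLED | DQUOT_USR_SUSPENDED);  /* on, suspended by remount */
}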

Signed-off-by: Jan Kara <[email protected]>
Signed-off-by: Mark Fasheh <[email protected]>
---
fs/dquot.c | 10 ++++++----
include/linux/quotaops.h | 6 ++++--
2 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/fs/dquot.c b/fs/dquot.c
index 735e2c3..1f9f1f1 100644
--- a/fs/dquot.c
+++ b/fs/dquot.c
@@ -1570,18 +1570,20 @@ static inline void reset_enable_flags(struct quota_info *dqopt, int type,
{
switch (type) {
case USRQUOTA:
- dqopt->flags &= ~DQUOT_USR_ENABLED;
if (remount)
dqopt->flags |= DQUOT_USR_SUSPENDED;
- else
+ else {
+ dqopt->flags &= ~DQUOT_USR_ENABLED;
dqopt->flags &= ~DQUOT_USR_SUSPENDED;
+ }
break;
case GRPQUOTA:
- dqopt->flags &= ~DQUOT_GRP_ENABLED;
if (remount)
dqopt->flags |= DQUOT_GRP_SUSPENDED;
- else
+ else {
+ dqopt->flags &= ~DQUOT_GRP_ENABLED;
dqopt->flags &= ~DQUOT_GRP_SUSPENDED;
+ }
break;
}
}
diff --git a/include/linux/quotaops.h b/include/linux/quotaops.h
index adcc7ba..ffd9707 100644
--- a/include/linux/quotaops.h
+++ b/include/linux/quotaops.h
@@ -67,8 +67,10 @@ static inline struct mem_dqinfo *sb_dqinfo(struct super_block *sb, int type)
static inline int sb_has_quota_enabled(struct super_block *sb, int type)
{
if (type == USRQUOTA)
- return sb_dqopt(sb)->flags & DQUOT_USR_ENABLED;
- return sb_dqopt(sb)->flags & DQUOT_GRP_ENABLED;
+ return (sb_dqopt(sb)->flags & DQUOT_USR_ENABLED)
+ && !(sb_dqopt(sb)->flags & DQUOT_USR_SUSPENDED);
+ return (sb_dqopt(sb)->flags & DQUOT_GRP_ENABLED)
+ && !(sb_dqopt(sb)->flags & DQUOT_GRP_SUSPENDED);
}

static inline int sb_any_quota_enabled(struct super_block *sb)
--
1.5.6

2008-12-22 21:51:15

by Mark Fasheh

Subject: [PATCH 05/56] quota: Allow to separately enable quota accounting and enforcing limits

From: Jan Kara <[email protected]>

Split DQUOT_USR_ENABLED (and DQUOT_GRP_ENABLED) into DQUOT_USR_USAGE_ENABLED
and DQUOT_USR_LIMITS_ENABLED. This way we are able to separately enable /
disable whether we should:
1) ignore quotas completely
2) just keep up-to-date information about usage
3) actually enforce quota limits

This is going to be useful when quota is treated as filesystem metadata - we
then want to keep quota information up to date all the time and just enable /
disable limits enforcement.
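
For example (hypothetical caller, not part of this series), a filesystem
could now do:

/*
 * Hypothetical caller: turn on group-quota accounting first and start
 * enforcing limits only later.  'quota_inode' is assumed to be the inode
 * of the group quota file.
 */
static int example_enable_group_quota(struct inode *quota_inode)
{
        int ret;

        /* Track usage only -- limit checks effectively become no-ops. */
        ret = vfs_quota_enable(quota_inode, GRPQUOTA, QFMT_VFS_V0,
                               DQUOT_USAGE_ENABLED);
        if (ret)
                return ret;

        /* Later: enforce limits on the already-loaded quota as well. */
        return vfs_quota_enable(quota_inode, GRPQUOTA, QFMT_VFS_V0,
                                DQUOT_LIMITS_ENABLED);
}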

Signed-off-by: Jan Kara <[email protected]>
Signed-off-by: Mark Fasheh <[email protected]>
---
fs/dquot.c | 222 ++++++++++++++++++++++++++++-----------------
fs/quota.c | 8 +-
include/linux/quota.h | 30 ++++++-
include/linux/quotaops.h | 91 +++++++++++++++----
4 files changed, 239 insertions(+), 112 deletions(-)

diff --git a/fs/dquot.c b/fs/dquot.c
index 1f9f1f1..adf59ce 100644
--- a/fs/dquot.c
+++ b/fs/dquot.c
@@ -489,7 +489,7 @@ int vfs_quota_sync(struct super_block *sb, int type)
for (cnt = 0; cnt < MAXQUOTAS; cnt++) {
if (type != -1 && cnt != type)
continue;
- if (!sb_has_quota_enabled(sb, cnt))
+ if (!sb_has_quota_active(sb, cnt))
continue;
spin_lock(&dq_list_lock);
dirty = &dqopt->info[cnt].dqi_dirty_list;
@@ -514,8 +514,8 @@ int vfs_quota_sync(struct super_block *sb, int type)
}

for (cnt = 0; cnt < MAXQUOTAS; cnt++)
- if ((cnt == type || type == -1) && sb_has_quota_enabled(sb, cnt)
- && info_dirty(&dqopt->info[cnt]))
+ if ((cnt == type || type == -1) && sb_has_quota_active(sb, cnt)
+ && info_dirty(&dqopt->info[cnt]))
sb->dq_op->write_info(sb, cnt);
spin_lock(&dq_list_lock);
dqstats.syncs++;
@@ -594,7 +594,7 @@ we_slept:
/* We have more than one user... nothing to do */
atomic_dec(&dquot->dq_count);
/* Releasing dquot during quotaoff phase? */
- if (!sb_has_quota_enabled(dquot->dq_sb, dquot->dq_type) &&
+ if (!sb_has_quota_active(dquot->dq_sb, dquot->dq_type) &&
atomic_read(&dquot->dq_count) == 1)
wake_up(&dquot->dq_wait_unused);
spin_unlock(&dq_list_lock);
@@ -670,7 +670,7 @@ static struct dquot *dqget(struct super_block *sb, unsigned int id, int type)
unsigned int hashent = hashfn(sb, id, type);
struct dquot *dquot, *empty = NODQUOT;

- if (!sb_has_quota_enabled(sb, type))
+ if (!sb_has_quota_active(sb, type))
return NODQUOT;
we_slept:
spin_lock(&dq_list_lock);
@@ -1041,7 +1041,8 @@ static inline char ignore_hardlimit(struct dquot *dquot)
static int check_idq(struct dquot *dquot, qsize_t inodes, char *warntype)
{
*warntype = QUOTA_NL_NOWARN;
- if (test_bit(DQ_FAKE_B, &dquot->dq_flags))
+ if (!sb_has_quota_limits_enabled(dquot->dq_sb, dquot->dq_type) ||
+ test_bit(DQ_FAKE_B, &dquot->dq_flags))
return QUOTA_OK;

if (dquot->dq_dqb.dqb_ihardlimit &&
@@ -1073,7 +1074,8 @@ static int check_idq(struct dquot *dquot, qsize_t inodes, char *warntype)
static int check_bdq(struct dquot *dquot, qsize_t space, int prealloc, char *warntype)
{
*warntype = QUOTA_NL_NOWARN;
- if (test_bit(DQ_FAKE_B, &dquot->dq_flags))
+ if (!sb_has_quota_limits_enabled(dquot->dq_sb, dquot->dq_type) ||
+ test_bit(DQ_FAKE_B, &dquot->dq_flags))
return QUOTA_OK;

if (dquot->dq_dqb.dqb_bhardlimit &&
@@ -1114,7 +1116,8 @@ static int check_bdq(struct dquot *dquot, qsize_t space, int prealloc, char *war
static int info_idq_free(struct dquot *dquot, qsize_t inodes)
{
if (test_bit(DQ_FAKE_B, &dquot->dq_flags) ||
- dquot->dq_dqb.dqb_curinodes <= dquot->dq_dqb.dqb_isoftlimit)
+ dquot->dq_dqb.dqb_curinodes <= dquot->dq_dqb.dqb_isoftlimit ||
+ !sb_has_quota_limits_enabled(dquot->dq_sb, dquot->dq_type))
return QUOTA_NL_NOWARN;

if (dquot->dq_dqb.dqb_curinodes - inodes <= dquot->dq_dqb.dqb_isoftlimit)
@@ -1508,7 +1511,7 @@ warn_put_all:
/* Wrapper for transferring ownership of an inode */
int vfs_dq_transfer(struct inode *inode, struct iattr *iattr)
{
- if (sb_any_quota_enabled(inode->i_sb) && !IS_NOQUOTA(inode)) {
+ if (sb_any_quota_active(inode->i_sb) && !IS_NOQUOTA(inode)) {
vfs_dq_init(inode);
if (inode->i_sb->dq_op->transfer(inode, iattr) == NO_QUOTA)
return 1;
@@ -1551,53 +1554,22 @@ struct dquot_operations dquot_operations = {
.destroy_dquot = dquot_destroy,
};

-static inline void set_enable_flags(struct quota_info *dqopt, int type)
-{
- switch (type) {
- case USRQUOTA:
- dqopt->flags |= DQUOT_USR_ENABLED;
- dqopt->flags &= ~DQUOT_USR_SUSPENDED;
- break;
- case GRPQUOTA:
- dqopt->flags |= DQUOT_GRP_ENABLED;
- dqopt->flags &= ~DQUOT_GRP_SUSPENDED;
- break;
- }
-}
-
-static inline void reset_enable_flags(struct quota_info *dqopt, int type,
- int remount)
-{
- switch (type) {
- case USRQUOTA:
- if (remount)
- dqopt->flags |= DQUOT_USR_SUSPENDED;
- else {
- dqopt->flags &= ~DQUOT_USR_ENABLED;
- dqopt->flags &= ~DQUOT_USR_SUSPENDED;
- }
- break;
- case GRPQUOTA:
- if (remount)
- dqopt->flags |= DQUOT_GRP_SUSPENDED;
- else {
- dqopt->flags &= ~DQUOT_GRP_ENABLED;
- dqopt->flags &= ~DQUOT_GRP_SUSPENDED;
- }
- break;
- }
-}
-
-
/*
* Turn quota off on a device. type == -1 ==> quotaoff for all types (umount)
*/
-int vfs_quota_off(struct super_block *sb, int type, int remount)
+int vfs_quota_disable(struct super_block *sb, int type, unsigned int flags)
{
int cnt, ret = 0;
struct quota_info *dqopt = sb_dqopt(sb);
struct inode *toputinode[MAXQUOTAS];

+ /* Cannot turn off usage accounting without turning off limits, or
+ * suspend quotas and simultaneously turn quotas off. */
+ if ((flags & DQUOT_USAGE_ENABLED && !(flags & DQUOT_LIMITS_ENABLED))
+ || (flags & DQUOT_SUSPENDED && flags & (DQUOT_LIMITS_ENABLED |
+ DQUOT_USAGE_ENABLED)))
+ return -EINVAL;
+
/* We need to serialize quota_off() for device */
mutex_lock(&dqopt->dqonoff_mutex);

@@ -1606,7 +1578,7 @@ int vfs_quota_off(struct super_block *sb, int type, int remount)
* sometimes we are called when fill_super() failed and calling
* sync_fs() in such cases does no good.
*/
- if (!sb_any_quota_enabled(sb) && !sb_any_quota_suspended(sb)) {
+ if (!sb_any_quota_loaded(sb)) {
mutex_unlock(&dqopt->dqonoff_mutex);
return 0;
}
@@ -1614,17 +1586,28 @@ int vfs_quota_off(struct super_block *sb, int type, int remount)
toputinode[cnt] = NULL;
if (type != -1 && cnt != type)
continue;
- /* If we keep inodes of quota files after remount and quotaoff
- * is called, drop kept inodes. */
- if (!remount && sb_has_quota_suspended(sb, cnt)) {
- iput(dqopt->files[cnt]);
- dqopt->files[cnt] = NULL;
- reset_enable_flags(dqopt, cnt, 0);
+ if (!sb_has_quota_loaded(sb, cnt))
continue;
+
+ if (flags & DQUOT_SUSPENDED) {
+ dqopt->flags |=
+ dquot_state_flag(DQUOT_SUSPENDED, cnt);
+ } else {
+ dqopt->flags &= ~dquot_state_flag(flags, cnt);
+ /* Turning off suspended quotas? */
+ if (!sb_has_quota_loaded(sb, cnt) &&
+ sb_has_quota_suspended(sb, cnt)) {
+ dqopt->flags &= ~dquot_state_flag(
+ DQUOT_SUSPENDED, cnt);
+ iput(dqopt->files[cnt]);
+ dqopt->files[cnt] = NULL;
+ continue;
+ }
}
- if (!sb_has_quota_enabled(sb, cnt))
+
+ /* We still have to keep quota loaded? */
+ if (sb_has_quota_loaded(sb, cnt) && !(flags & DQUOT_SUSPENDED))
continue;
- reset_enable_flags(dqopt, cnt, remount);

/* Note: these are blocking operations */
drop_dquot_ref(sb, cnt);
@@ -1640,7 +1623,7 @@ int vfs_quota_off(struct super_block *sb, int type, int remount)
put_quota_format(dqopt->info[cnt].dqi_format);

toputinode[cnt] = dqopt->files[cnt];
- if (!remount)
+ if (!sb_has_quota_loaded(sb, cnt))
dqopt->files[cnt] = NULL;
dqopt->info[cnt].dqi_flags = 0;
dqopt->info[cnt].dqi_igrace = 0;
@@ -1663,7 +1646,7 @@ int vfs_quota_off(struct super_block *sb, int type, int remount)
mutex_lock(&dqopt->dqonoff_mutex);
/* If quota was reenabled in the meantime, we have
* nothing to do */
- if (!sb_has_quota_enabled(sb, cnt)) {
+ if (!sb_has_quota_loaded(sb, cnt)) {
mutex_lock_nested(&toputinode[cnt]->i_mutex, I_MUTEX_QUOTA);
toputinode[cnt]->i_flags &= ~(S_IMMUTABLE |
S_NOATIME | S_NOQUOTA);
@@ -1673,10 +1656,13 @@ int vfs_quota_off(struct super_block *sb, int type, int remount)
}
mutex_unlock(&dqopt->dqonoff_mutex);
/* On remount RO, we keep the inode pointer so that we
- * can reenable quota on the subsequent remount RW.
- * But we have better not keep inode pointer when there
- * is pending delete on the quota file... */
- if (!remount)
+ * can reenable quota on the subsequent remount RW. We
+ * have to check 'flags' variable and not use sb_has_
+ * function because another quotaon / quotaoff could
+ * change global state before we got here. We refuse
+ * to suspend quotas when there is pending delete on
+ * the quota file... */
+ if (!(flags & DQUOT_SUSPENDED))
iput(toputinode[cnt]);
else if (!toputinode[cnt]->i_nlink)
ret = -EBUSY;
@@ -1686,12 +1672,22 @@ int vfs_quota_off(struct super_block *sb, int type, int remount)
return ret;
}

+int vfs_quota_off(struct super_block *sb, int type, int remount)
+{
+ return vfs_quota_disable(sb, type, remount ? DQUOT_SUSPENDED :
+ (DQUOT_USAGE_ENABLED | DQUOT_LIMITS_ENABLED));
+}
+
/*
* Turn quotas on on a device
*/

-/* Helper function when we already have the inode */
-static int vfs_quota_on_inode(struct inode *inode, int type, int format_id)
+/*
+ * Helper function to turn quotas on when we already have the inode of
+ * quota file and no quota information is loaded.
+ */
+static int vfs_load_quota_inode(struct inode *inode, int type, int format_id,
+ unsigned int flags)
{
struct quota_format_type *fmt = find_quota_format(format_id);
struct super_block *sb = inode->i_sb;
@@ -1713,6 +1709,11 @@ static int vfs_quota_on_inode(struct inode *inode, int type, int format_id)
error = -EINVAL;
goto out_fmt;
}
+ /* Usage always has to be set... */
+ if (!(flags & DQUOT_USAGE_ENABLED)) {
+ error = -EINVAL;
+ goto out_fmt;
+ }

/* As we bypass the pagecache we must now flush the inode so that
* we see all the changes from userspace... */
@@ -1721,8 +1722,7 @@ static int vfs_quota_on_inode(struct inode *inode, int type, int format_id)
invalidate_bdev(sb->s_bdev);
mutex_lock(&inode->i_mutex);
mutex_lock(&dqopt->dqonoff_mutex);
- if (sb_has_quota_enabled(sb, type) ||
- sb_has_quota_suspended(sb, type)) {
+ if (sb_has_quota_loaded(sb, type)) {
error = -EBUSY;
goto out_lock;
}
@@ -1754,7 +1754,7 @@ static int vfs_quota_on_inode(struct inode *inode, int type, int format_id)
}
mutex_unlock(&dqopt->dqio_mutex);
mutex_unlock(&inode->i_mutex);
- set_enable_flags(dqopt, type);
+ dqopt->flags |= dquot_state_flag(flags, type);

add_dquot_ref(sb, type);
mutex_unlock(&dqopt->dqonoff_mutex);
@@ -1787,20 +1787,23 @@ static int vfs_quota_on_remount(struct super_block *sb, int type)
struct quota_info *dqopt = sb_dqopt(sb);
struct inode *inode;
int ret;
+ unsigned int flags;

mutex_lock(&dqopt->dqonoff_mutex);
if (!sb_has_quota_suspended(sb, type)) {
mutex_unlock(&dqopt->dqonoff_mutex);
return 0;
}
- BUG_ON(sb_has_quota_enabled(sb, type));
-
inode = dqopt->files[type];
dqopt->files[type] = NULL;
- reset_enable_flags(dqopt, type, 0);
+ flags = dqopt->flags & dquot_state_flag(DQUOT_USAGE_ENABLED |
+ DQUOT_LIMITS_ENABLED, type);
+ dqopt->flags &= ~dquot_state_flag(DQUOT_STATE_FLAGS, type);
mutex_unlock(&dqopt->dqonoff_mutex);

- ret = vfs_quota_on_inode(inode, type, dqopt->info[type].dqi_fmt_id);
+ flags = dquot_generic_flag(flags, type);
+ ret = vfs_load_quota_inode(inode, type, dqopt->info[type].dqi_fmt_id,
+ flags);
iput(inode);

return ret;
@@ -1816,12 +1819,12 @@ int vfs_quota_on_path(struct super_block *sb, int type, int format_id,
if (path->mnt->mnt_sb != sb)
error = -EXDEV;
else
- error = vfs_quota_on_inode(path->dentry->d_inode, type,
- format_id);
+ error = vfs_load_quota_inode(path->dentry->d_inode, type,
+ format_id, DQUOT_USAGE_ENABLED |
+ DQUOT_LIMITS_ENABLED);
return error;
}

-/* Actual function called from quotactl() */
int vfs_quota_on(struct super_block *sb, int type, int format_id, char *name,
int remount)
{
@@ -1840,6 +1843,50 @@ int vfs_quota_on(struct super_block *sb, int type, int format_id, char *name,
}

/*
+ * More powerful function for turning on quotas allowing setting
+ * of individual quota flags
+ */
+int vfs_quota_enable(struct inode *inode, int type, int format_id,
+ unsigned int flags)
+{
+ int ret = 0;
+ struct super_block *sb = inode->i_sb;
+ struct quota_info *dqopt = sb_dqopt(sb);
+
+ /* Just unsuspend quotas? */
+ if (flags & DQUOT_SUSPENDED)
+ return vfs_quota_on_remount(sb, type);
+ if (!flags)
+ return 0;
+ /* Just updating flags needed? */
+ if (sb_has_quota_loaded(sb, type)) {
+ mutex_lock(&dqopt->dqonoff_mutex);
+ /* Now do a reliable test... */
+ if (!sb_has_quota_loaded(sb, type)) {
+ mutex_unlock(&dqopt->dqonoff_mutex);
+ goto load_quota;
+ }
+ if (flags & DQUOT_USAGE_ENABLED &&
+ sb_has_quota_usage_enabled(sb, type)) {
+ ret = -EBUSY;
+ goto out_lock;
+ }
+ if (flags & DQUOT_LIMITS_ENABLED &&
+ sb_has_quota_limits_enabled(sb, type)) {
+ ret = -EBUSY;
+ goto out_lock;
+ }
+ sb_dqopt(sb)->flags |= dquot_state_flag(flags, type);
+out_lock:
+ mutex_unlock(&dqopt->dqonoff_mutex);
+ return ret;
+ }
+
+load_quota:
+ return vfs_load_quota_inode(inode, type, format_id, flags);
+}
+
+/*
* This function is used when filesystem needs to initialize quotas
* during mount time.
*/
@@ -1860,7 +1907,8 @@ int vfs_quota_on_mount(struct super_block *sb, char *qf_name,

error = security_quota_on(dentry);
if (!error)
- error = vfs_quota_on_inode(dentry->d_inode, type, format_id);
+ error = vfs_load_quota_inode(dentry->d_inode, type, format_id,
+ DQUOT_USAGE_ENABLED | DQUOT_LIMITS_ENABLED);

out:
dput(dentry);
@@ -1997,12 +2045,14 @@ int vfs_set_dqblk(struct super_block *sb, int type, qid_t id, struct if_dqblk *d
int rc;

mutex_lock(&sb_dqopt(sb)->dqonoff_mutex);
- if (!(dquot = dqget(sb, id, type))) {
- mutex_unlock(&sb_dqopt(sb)->dqonoff_mutex);
- return -ESRCH;
+ dquot = dqget(sb, id, type);
+ if (!dquot) {
+ rc = -ESRCH;
+ goto out;
}
rc = do_set_dqblk(dquot, di);
dqput(dquot);
+out:
mutex_unlock(&sb_dqopt(sb)->dqonoff_mutex);
return rc;
}
@@ -2013,7 +2063,7 @@ int vfs_get_dqinfo(struct super_block *sb, int type, struct if_dqinfo *ii)
struct mem_dqinfo *mi;

mutex_lock(&sb_dqopt(sb)->dqonoff_mutex);
- if (!sb_has_quota_enabled(sb, type)) {
+ if (!sb_has_quota_active(sb, type)) {
mutex_unlock(&sb_dqopt(sb)->dqonoff_mutex);
return -ESRCH;
}
@@ -2032,11 +2082,12 @@ int vfs_get_dqinfo(struct super_block *sb, int type, struct if_dqinfo *ii)
int vfs_set_dqinfo(struct super_block *sb, int type, struct if_dqinfo *ii)
{
struct mem_dqinfo *mi;
+ int err = 0;

mutex_lock(&sb_dqopt(sb)->dqonoff_mutex);
- if (!sb_has_quota_enabled(sb, type)) {
- mutex_unlock(&sb_dqopt(sb)->dqonoff_mutex);
- return -ESRCH;
+ if (!sb_has_quota_active(sb, type)) {
+ err = -ESRCH;
+ goto out;
}
mi = sb_dqopt(sb)->info + type;
spin_lock(&dq_data_lock);
@@ -2050,8 +2101,9 @@ int vfs_set_dqinfo(struct super_block *sb, int type, struct if_dqinfo *ii)
mark_info_dirty(sb, type);
/* Force write to disk */
sb->dq_op->write_info(sb, type);
+out:
mutex_unlock(&sb_dqopt(sb)->dqonoff_mutex);
- return 0;
+ return err;
}

struct quotactl_ops vfs_quotactl_ops = {
@@ -2213,9 +2265,11 @@ EXPORT_SYMBOL(register_quota_format);
EXPORT_SYMBOL(unregister_quota_format);
EXPORT_SYMBOL(dqstats);
EXPORT_SYMBOL(dq_data_lock);
+EXPORT_SYMBOL(vfs_quota_enable);
EXPORT_SYMBOL(vfs_quota_on);
EXPORT_SYMBOL(vfs_quota_on_path);
EXPORT_SYMBOL(vfs_quota_on_mount);
+EXPORT_SYMBOL(vfs_quota_disable);
EXPORT_SYMBOL(vfs_quota_off);
EXPORT_SYMBOL(vfs_quota_sync);
EXPORT_SYMBOL(vfs_get_dqinfo);
diff --git a/fs/quota.c b/fs/quota.c
index 7f4386e..a8026f1 100644
--- a/fs/quota.c
+++ b/fs/quota.c
@@ -73,7 +73,7 @@ static int generic_quotactl_valid(struct super_block *sb, int type, int cmd, qid
case Q_SETQUOTA:
case Q_GETQUOTA:
/* This is just informative test so we are satisfied without a lock */
- if (!sb_has_quota_enabled(sb, type))
+ if (!sb_has_quota_active(sb, type))
return -ESRCH;
}

@@ -175,7 +175,7 @@ static void quota_sync_sb(struct super_block *sb, int type)
for (cnt = 0; cnt < MAXQUOTAS; cnt++) {
if (type != -1 && cnt != type)
continue;
- if (!sb_has_quota_enabled(sb, cnt))
+ if (!sb_has_quota_active(sb, cnt))
continue;
mutex_lock_nested(&sb_dqopt(sb)->files[cnt]->i_mutex, I_MUTEX_QUOTA);
truncate_inode_pages(&sb_dqopt(sb)->files[cnt]->i_data, 0);
@@ -201,7 +201,7 @@ restart:
for (cnt = 0; cnt < MAXQUOTAS; cnt++) {
if (type != -1 && type != cnt)
continue;
- if (!sb_has_quota_enabled(sb, cnt))
+ if (!sb_has_quota_active(sb, cnt))
continue;
if (!info_dirty(&sb_dqopt(sb)->info[cnt]) &&
list_empty(&sb_dqopt(sb)->info[cnt].dqi_dirty_list))
@@ -245,7 +245,7 @@ static int do_quotactl(struct super_block *sb, int type, int cmd, qid_t id, void
__u32 fmt;

down_read(&sb_dqopt(sb)->dqptr_sem);
- if (!sb_has_quota_enabled(sb, type)) {
+ if (!sb_has_quota_active(sb, type)) {
up_read(&sb_dqopt(sb)->dqptr_sem);
return -ESRCH;
}
diff --git a/include/linux/quota.h b/include/linux/quota.h
index 9ea4683..93717ab 100644
--- a/include/linux/quota.h
+++ b/include/linux/quota.h
@@ -318,12 +318,34 @@ struct quota_format_type {
struct quota_format_type *qf_next;
};

-#define DQUOT_USR_ENABLED 0x01 /* User diskquotas enabled */
-#define DQUOT_GRP_ENABLED 0x02 /* Group diskquotas enabled */
-#define DQUOT_USR_SUSPENDED 0x04 /* User diskquotas are off, but
+/* Quota state flags - they actually come in two flavors - for users and groups */
+enum {
+ _DQUOT_USAGE_ENABLED = 0, /* Track disk usage for users */
+ _DQUOT_LIMITS_ENABLED, /* Enforce quota limits for users */
+ _DQUOT_SUSPENDED, /* User diskquotas are off, but
* we have necessary info in
* memory to turn them on */
-#define DQUOT_GRP_SUSPENDED 0x08 /* The same for group quotas */
+ _DQUOT_STATE_FLAGS
+};
+#define DQUOT_USAGE_ENABLED (1 << _DQUOT_USAGE_ENABLED)
+#define DQUOT_LIMITS_ENABLED (1 << _DQUOT_LIMITS_ENABLED)
+#define DQUOT_SUSPENDED (1 << _DQUOT_SUSPENDED)
+#define DQUOT_STATE_FLAGS (DQUOT_USAGE_ENABLED | DQUOT_LIMITS_ENABLED | \
+ DQUOT_SUSPENDED)
+
+static inline unsigned int dquot_state_flag(unsigned int flags, int type)
+{
+ if (type == USRQUOTA)
+ return flags;
+ return flags << _DQUOT_STATE_FLAGS;
+}
+
+static inline unsigned int dquot_generic_flag(unsigned int flags, int type)
+{
+ if (type == USRQUOTA)
+ return flags;
+ return flags >> _DQUOT_STATE_FLAGS;
+}

struct quota_info {
unsigned int flags; /* Flags for diskquotas on this device */
diff --git a/include/linux/quotaops.h b/include/linux/quotaops.h
index ffd9707..3b3346f 100644
--- a/include/linux/quotaops.h
+++ b/include/linux/quotaops.h
@@ -40,11 +40,14 @@ int dquot_mark_dquot_dirty(struct dquot *dquot);

int vfs_quota_on(struct super_block *sb, int type, int format_id,
char *path, int remount);
+int vfs_quota_enable(struct inode *inode, int type, int format_id,
+ unsigned int flags);
int vfs_quota_on_path(struct super_block *sb, int type, int format_id,
struct path *path);
int vfs_quota_on_mount(struct super_block *sb, char *qf_name,
int format_id, int type);
int vfs_quota_off(struct super_block *sb, int type, int remount);
+int vfs_quota_disable(struct super_block *sb, int type, unsigned int flags);
int vfs_quota_sync(struct super_block *sb, int type);
int vfs_get_dqinfo(struct super_block *sb, int type, struct if_dqinfo *ii);
int vfs_set_dqinfo(struct super_block *sb, int type, struct if_dqinfo *ii);
@@ -64,26 +67,22 @@ static inline struct mem_dqinfo *sb_dqinfo(struct super_block *sb, int type)
* Functions for checking status of quota
*/

-static inline int sb_has_quota_enabled(struct super_block *sb, int type)
+static inline int sb_has_quota_usage_enabled(struct super_block *sb, int type)
{
- if (type == USRQUOTA)
- return (sb_dqopt(sb)->flags & DQUOT_USR_ENABLED)
- && !(sb_dqopt(sb)->flags & DQUOT_USR_SUSPENDED);
- return (sb_dqopt(sb)->flags & DQUOT_GRP_ENABLED)
- && !(sb_dqopt(sb)->flags & DQUOT_GRP_SUSPENDED);
+ return sb_dqopt(sb)->flags &
+ dquot_state_flag(DQUOT_USAGE_ENABLED, type);
}

-static inline int sb_any_quota_enabled(struct super_block *sb)
+static inline int sb_has_quota_limits_enabled(struct super_block *sb, int type)
{
- return sb_has_quota_enabled(sb, USRQUOTA) ||
- sb_has_quota_enabled(sb, GRPQUOTA);
+ return sb_dqopt(sb)->flags &
+ dquot_state_flag(DQUOT_LIMITS_ENABLED, type);
}

static inline int sb_has_quota_suspended(struct super_block *sb, int type)
{
- if (type == USRQUOTA)
- return sb_dqopt(sb)->flags & DQUOT_USR_SUSPENDED;
- return sb_dqopt(sb)->flags & DQUOT_GRP_SUSPENDED;
+ return sb_dqopt(sb)->flags &
+ dquot_state_flag(DQUOT_SUSPENDED, type);
}

static inline int sb_any_quota_suspended(struct super_block *sb)
@@ -92,6 +91,34 @@ static inline int sb_any_quota_suspended(struct super_block *sb)
sb_has_quota_suspended(sb, GRPQUOTA);
}

+/* Does kernel know about any quota information for given sb + type? */
+static inline int sb_has_quota_loaded(struct super_block *sb, int type)
+{
+ /* Currently if anything is on, then quota usage is on as well */
+ return sb_has_quota_usage_enabled(sb, type);
+}
+
+static inline int sb_any_quota_loaded(struct super_block *sb)
+{
+ return sb_has_quota_loaded(sb, USRQUOTA) ||
+ sb_has_quota_loaded(sb, GRPQUOTA);
+}
+
+static inline int sb_has_quota_active(struct super_block *sb, int type)
+{
+ return sb_has_quota_loaded(sb, type) &&
+ !sb_has_quota_suspended(sb, type);
+}
+
+static inline int sb_any_quota_active(struct super_block *sb)
+{
+ return sb_has_quota_active(sb, USRQUOTA) ||
+ sb_has_quota_active(sb, GRPQUOTA);
+}
+
+/* For backward compatibility until we remove all users */
+#define sb_any_quota_enabled(sb) sb_any_quota_active(sb)
+
/*
* Operations supported for diskquotas.
*/
@@ -106,7 +133,7 @@ extern struct quotactl_ops vfs_quotactl_ops;
static inline void vfs_dq_init(struct inode *inode)
{
BUG_ON(!inode->i_sb);
- if (sb_any_quota_enabled(inode->i_sb) && !IS_NOQUOTA(inode))
+ if (sb_any_quota_active(inode->i_sb) && !IS_NOQUOTA(inode))
inode->i_sb->dq_op->initialize(inode, -1);
}

@@ -114,7 +141,7 @@ static inline void vfs_dq_init(struct inode *inode)
* a transaction (deadlocks possible otherwise) */
static inline int vfs_dq_prealloc_space_nodirty(struct inode *inode, qsize_t nr)
{
- if (sb_any_quota_enabled(inode->i_sb)) {
+ if (sb_any_quota_active(inode->i_sb)) {
/* Used space is updated in alloc_space() */
if (inode->i_sb->dq_op->alloc_space(inode, nr, 1) == NO_QUOTA)
return 1;
@@ -134,7 +161,7 @@ static inline int vfs_dq_prealloc_space(struct inode *inode, qsize_t nr)

static inline int vfs_dq_alloc_space_nodirty(struct inode *inode, qsize_t nr)
{
- if (sb_any_quota_enabled(inode->i_sb)) {
+ if (sb_any_quota_active(inode->i_sb)) {
/* Used space is updated in alloc_space() */
if (inode->i_sb->dq_op->alloc_space(inode, nr, 0) == NO_QUOTA)
return 1;
@@ -154,7 +181,7 @@ static inline int vfs_dq_alloc_space(struct inode *inode, qsize_t nr)

static inline int vfs_dq_alloc_inode(struct inode *inode)
{
- if (sb_any_quota_enabled(inode->i_sb)) {
+ if (sb_any_quota_active(inode->i_sb)) {
vfs_dq_init(inode);
if (inode->i_sb->dq_op->alloc_inode(inode, 1) == NO_QUOTA)
return 1;
@@ -164,7 +191,7 @@ static inline int vfs_dq_alloc_inode(struct inode *inode)

static inline void vfs_dq_free_space_nodirty(struct inode *inode, qsize_t nr)
{
- if (sb_any_quota_enabled(inode->i_sb))
+ if (sb_any_quota_active(inode->i_sb))
inode->i_sb->dq_op->free_space(inode, nr);
else
inode_sub_bytes(inode, nr);
@@ -178,7 +205,7 @@ static inline void vfs_dq_free_space(struct inode *inode, qsize_t nr)

static inline void vfs_dq_free_inode(struct inode *inode)
{
- if (sb_any_quota_enabled(inode->i_sb))
+ if (sb_any_quota_active(inode->i_sb))
inode->i_sb->dq_op->free_inode(inode, 1);
}

@@ -199,12 +226,12 @@ static inline int vfs_dq_off(struct super_block *sb, int remount)

#else

-static inline int sb_has_quota_enabled(struct super_block *sb, int type)
+static inline int sb_has_quota_usage_enabled(struct super_block *sb, int type)
{
return 0;
}

-static inline int sb_any_quota_enabled(struct super_block *sb)
+static inline int sb_has_quota_limits_enabled(struct super_block *sb, int type)
{
return 0;
}
@@ -219,6 +246,30 @@ static inline int sb_any_quota_suspended(struct super_block *sb)
return 0;
}

+/* Does kernel know about any quota information for given sb + type? */
+static inline int sb_has_quota_loaded(struct super_block *sb, int type)
+{
+ return 0;
+}
+
+static inline int sb_any_quota_loaded(struct super_block *sb)
+{
+ return 0;
+}
+
+static inline int sb_has_quota_active(struct super_block *sb, int type)
+{
+ return 0;
+}
+
+static inline int sb_any_quota_active(struct super_block *sb)
+{
+ return 0;
+}
+
+/* For backward compatibility until we remove all users */
+#define sb_any_quota_enabled(sb) sb_any_quota_active(sb)
+
/*
* NO-OP when quota not configured.
*/
--
1.5.6

2008-12-22 21:51:32

by Mark Fasheh

Subject: [PATCH 06/56] ext3: Use sb_any_quota_loaded() instead of sb_any_quota_enabled()

From: Jan Kara <[email protected]>

Signed-off-by: Jan Kara <[email protected]>
Signed-off-by: Mark Fasheh <[email protected]>
---
fs/ext3/super.c | 12 ++++--------
1 files changed, 4 insertions(+), 8 deletions(-)

diff --git a/fs/ext3/super.c b/fs/ext3/super.c
index f6c94f2..250ec53 100644
--- a/fs/ext3/super.c
+++ b/fs/ext3/super.c
@@ -1035,8 +1035,7 @@ static int parse_options (char *options, struct super_block *sb,
case Opt_grpjquota:
qtype = GRPQUOTA;
set_qf_name:
- if ((sb_any_quota_enabled(sb) ||
- sb_any_quota_suspended(sb)) &&
+ if (sb_any_quota_loaded(sb) &&
!sbi->s_qf_names[qtype]) {
printk(KERN_ERR
"EXT3-fs: Cannot change journaled "
@@ -1075,8 +1074,7 @@ set_qf_name:
case Opt_offgrpjquota:
qtype = GRPQUOTA;
clear_qf_name:
- if ((sb_any_quota_enabled(sb) ||
- sb_any_quota_suspended(sb)) &&
+ if (sb_any_quota_loaded(sb) &&
sbi->s_qf_names[qtype]) {
printk(KERN_ERR "EXT3-fs: Cannot change "
"journaled quota options when "
@@ -1095,8 +1093,7 @@ clear_qf_name:
case Opt_jqfmt_vfsv0:
qfmt = QFMT_VFS_V0;
set_qf_format:
- if ((sb_any_quota_enabled(sb) ||
- sb_any_quota_suspended(sb)) &&
+ if (sb_any_quota_loaded(sb) &&
sbi->s_jquota_fmt != qfmt) {
printk(KERN_ERR "EXT3-fs: Cannot change "
"journaled quota options when "
@@ -1115,8 +1112,7 @@ set_qf_format:
set_opt(sbi->s_mount_opt, GRPQUOTA);
break;
case Opt_noquota:
- if (sb_any_quota_enabled(sb) ||
- sb_any_quota_suspended(sb)) {
+ if (sb_any_quota_loaded(sb)) {
printk(KERN_ERR "EXT3-fs: Cannot change quota "
"options when quota turned on.\n");
return 0;
--
1.5.6

2008-12-22 21:51:50

by Mark Fasheh

Subject: [PATCH 07/56] ext4: Use sb_any_quota_loaded() instead of sb_any_quota_enabled()

From: Jan Kara <[email protected]>

Signed-off-by: Jan Kara <[email protected]>
Signed-off-by: Mark Fasheh <[email protected]>
---
fs/ext4/super.c | 11 ++++-------
1 files changed, 4 insertions(+), 7 deletions(-)

diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index e4a241c..9e5a717 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -1142,8 +1142,7 @@ static int parse_options(char *options, struct super_block *sb,
case Opt_grpjquota:
qtype = GRPQUOTA;
set_qf_name:
- if ((sb_any_quota_enabled(sb) ||
- sb_any_quota_suspended(sb)) &&
+ if (sb_any_quota_loaded(sb) &&
!sbi->s_qf_names[qtype]) {
printk(KERN_ERR
"EXT4-fs: Cannot change journaled "
@@ -1182,8 +1181,7 @@ set_qf_name:
case Opt_offgrpjquota:
qtype = GRPQUOTA;
clear_qf_name:
- if ((sb_any_quota_enabled(sb) ||
- sb_any_quota_suspended(sb)) &&
+ if (sb_any_quota_loaded(sb) &&
sbi->s_qf_names[qtype]) {
printk(KERN_ERR "EXT4-fs: Cannot change "
"journaled quota options when "
@@ -1202,8 +1200,7 @@ clear_qf_name:
case Opt_jqfmt_vfsv0:
qfmt = QFMT_VFS_V0;
set_qf_format:
- if ((sb_any_quota_enabled(sb) ||
- sb_any_quota_suspended(sb)) &&
+ if (sb_any_quota_loaded(sb) &&
sbi->s_jquota_fmt != qfmt) {
printk(KERN_ERR "EXT4-fs: Cannot change "
"journaled quota options when "
@@ -1222,7 +1219,7 @@ set_qf_format:
set_opt(sbi->s_mount_opt, GRPQUOTA);
break;
case Opt_noquota:
- if (sb_any_quota_enabled(sb)) {
+ if (sb_any_quota_loaded(sb)) {
printk(KERN_ERR "EXT4-fs: Cannot change quota "
"options when quota turned on.\n");
return 0;
--
1.5.6

2008-12-22 21:52:12

by Mark Fasheh

Subject: [PATCH 08/56] reiserfs: Use sb_any_quota_loaded() instead of sb_any_quota_enabled().

From: Jan Kara <[email protected]>

Signed-off-by: Jan Kara <[email protected]>
Signed-off-by: Mark Fasheh <[email protected]>
---
fs/reiserfs/super.c | 8 +++-----
1 files changed, 3 insertions(+), 5 deletions(-)

diff --git a/fs/reiserfs/super.c b/fs/reiserfs/super.c
index 663a91f..a9b393a 100644
--- a/fs/reiserfs/super.c
+++ b/fs/reiserfs/super.c
@@ -994,8 +994,7 @@ static int reiserfs_parse_options(struct super_block *s, char *options, /* strin
if (c == 'u' || c == 'g') {
int qtype = c == 'u' ? USRQUOTA : GRPQUOTA;

- if ((sb_any_quota_enabled(s) ||
- sb_any_quota_suspended(s)) &&
+ if (sb_any_quota_loaded(s) &&
(!*arg != !REISERFS_SB(s)->s_qf_names[qtype])) {
reiserfs_warning(s,
"reiserfs_parse_options: cannot change journaled quota options when quota turned on.");
@@ -1041,8 +1040,7 @@ static int reiserfs_parse_options(struct super_block *s, char *options, /* strin
"reiserfs_parse_options: unknown quota format specified.");
return 0;
}
- if ((sb_any_quota_enabled(s) ||
- sb_any_quota_suspended(s)) &&
+ if (sb_any_quota_loaded(s) &&
*qfmt != REISERFS_SB(s)->s_jquota_fmt) {
reiserfs_warning(s,
"reiserfs_parse_options: cannot change journaled quota options when quota turned on.");
@@ -1067,7 +1065,7 @@ static int reiserfs_parse_options(struct super_block *s, char *options, /* strin
}
/* This checking is not precise wrt the quota type but for our purposes it is sufficient */
if (!(*mount_options & (1 << REISERFS_QUOTA))
- && sb_any_quota_enabled(s)) {
+ && sb_any_quota_loaded(s)) {
reiserfs_warning(s,
"reiserfs_parse_options: quota options must be present when quota is turned on.");
return 0;
--
1.5.6

2008-12-22 21:52:34

by Mark Fasheh

Subject: [PATCH 09/56] quota: Remove compatibility function sb_any_quota_enabled()

From: Jan Kara <[email protected]>

Signed-off-by: Jan Kara <[email protected]>
Signed-off-by: Mark Fasheh <[email protected]>
---
include/linux/quotaops.h | 6 ------
1 files changed, 0 insertions(+), 6 deletions(-)

diff --git a/include/linux/quotaops.h b/include/linux/quotaops.h
index 3b3346f..e840ca5 100644
--- a/include/linux/quotaops.h
+++ b/include/linux/quotaops.h
@@ -116,9 +116,6 @@ static inline int sb_any_quota_active(struct super_block *sb)
sb_has_quota_active(sb, GRPQUOTA);
}

-/* For backward compatibility until we remove all users */
-#define sb_any_quota_enabled(sb) sb_any_quota_active(sb)
-
/*
* Operations supported for diskquotas.
*/
@@ -267,9 +264,6 @@ static inline int sb_any_quota_active(struct super_block *sb)
return 0;
}

-/* For backward compatibility until we remove all users */
-#define sb_any_quota_enabled(sb) sb_any_quota_active(sb)
-
/*
* NO-OP when quota not configured.
*/
--
1.5.6

2008-12-22 21:52:52

by Mark Fasheh

Subject: [PATCH 10/56] quota: Introduce DQUOT_QUOTA_SYS_FILE flag

From: Jan Kara <[email protected]>

If a filesystem can handle quota files as system files hidden from users, we
can skip a lot of cache invalidation, syncing, inode flag setting, etc. when
turning quotas on and off and during quota_sync. Allow a filesystem to indicate
that it hides quota files from users by setting the DQUOT_QUOTA_SYS_FILE flag.
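
As a minimal illustration (not part of the patch), a filesystem that keeps its
quota files as internal system inodes would advertise this roughly as follows;
the helper name is invented here, and the actual ocfs2 call site appears later
in this series:

#include <linux/fs.h>
#include <linux/quota.h>

/* Mark this superblock's quota files as kernel-internal system files so
 * that the generic quota code skips pagecache flushing, block device
 * invalidation and the S_NOQUOTA/S_NOATIME/S_IMMUTABLE handling when
 * quota is turned on or off. */
static void example_mark_quota_sys_files(struct super_block *sb)
{
	sb_dqopt(sb)->flags |= DQUOT_QUOTA_SYS_FILE;
}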

Signed-off-by: Jan Kara <[email protected]>
Signed-off-by: Mark Fasheh <[email protected]>
---
fs/dquot.c | 45 ++++++++++++++++++++++++++++++---------------
fs/quota.c | 3 +++
include/linux/quota.h | 7 +++++++
3 files changed, 40 insertions(+), 15 deletions(-)

diff --git a/fs/dquot.c b/fs/dquot.c
index adf59ce..f4d6f7e 100644
--- a/fs/dquot.c
+++ b/fs/dquot.c
@@ -1631,6 +1631,11 @@ int vfs_quota_disable(struct super_block *sb, int type, unsigned int flags)
dqopt->ops[cnt] = NULL;
}
mutex_unlock(&dqopt->dqonoff_mutex);
+
+ /* Skip syncing and setting flags if quota files are hidden */
+ if (dqopt->flags & DQUOT_QUOTA_SYS_FILE)
+ goto put_inodes;
+
/* Sync the superblock so that buffers with quota data are written to
* disk (and so userspace sees correct data afterwards). */
if (sb->s_op->sync_fs)
@@ -1655,6 +1660,12 @@ int vfs_quota_disable(struct super_block *sb, int type, unsigned int flags)
mark_inode_dirty(toputinode[cnt]);
}
mutex_unlock(&dqopt->dqonoff_mutex);
+ }
+ if (sb->s_bdev)
+ invalidate_bdev(sb->s_bdev);
+put_inodes:
+ for (cnt = 0; cnt < MAXQUOTAS; cnt++)
+ if (toputinode[cnt]) {
/* On remount RO, we keep the inode pointer so that we
* can reenable quota on the subsequent remount RW. We
* have to check 'flags' variable and not use sb_has_
@@ -1667,8 +1678,6 @@ int vfs_quota_disable(struct super_block *sb, int type, unsigned int flags)
else if (!toputinode[cnt]->i_nlink)
ret = -EBUSY;
}
- if (sb->s_bdev)
- invalidate_bdev(sb->s_bdev);
return ret;
}

@@ -1715,25 +1724,31 @@ static int vfs_load_quota_inode(struct inode *inode, int type, int format_id,
goto out_fmt;
}

- /* As we bypass the pagecache we must now flush the inode so that
- * we see all the changes from userspace... */
- write_inode_now(inode, 1);
- /* And now flush the block cache so that kernel sees the changes */
- invalidate_bdev(sb->s_bdev);
+ if (!(dqopt->flags & DQUOT_QUOTA_SYS_FILE)) {
+ /* As we bypass the pagecache we must now flush the inode so
+ * that we see all the changes from userspace... */
+ write_inode_now(inode, 1);
+ /* And now flush the block cache so that kernel sees the
+ * changes */
+ invalidate_bdev(sb->s_bdev);
+ }
mutex_lock(&inode->i_mutex);
mutex_lock(&dqopt->dqonoff_mutex);
if (sb_has_quota_loaded(sb, type)) {
error = -EBUSY;
goto out_lock;
}
- /* We don't want quota and atime on quota files (deadlocks possible)
- * Also nobody should write to the file - we use special IO operations
- * which ignore the immutable bit. */
- down_write(&dqopt->dqptr_sem);
- oldflags = inode->i_flags & (S_NOATIME | S_IMMUTABLE | S_NOQUOTA);
- inode->i_flags |= S_NOQUOTA | S_NOATIME | S_IMMUTABLE;
- up_write(&dqopt->dqptr_sem);
- sb->dq_op->drop(inode);
+
+ if (!(dqopt->flags & DQUOT_QUOTA_SYS_FILE)) {
+ /* We don't want quota and atime on quota files (deadlocks
+ * possible) Also nobody should write to the file - we use
+ * special IO operations which ignore the immutable bit. */
+ down_write(&dqopt->dqptr_sem);
+ oldflags = inode->i_flags & (S_NOATIME | S_IMMUTABLE | S_NOQUOTA);
+ inode->i_flags |= S_NOQUOTA | S_NOATIME | S_IMMUTABLE;
+ up_write(&dqopt->dqptr_sem);
+ sb->dq_op->drop(inode);
+ }

error = -EIO;
dqopt->files[type] = igrab(inode);
diff --git a/fs/quota.c b/fs/quota.c
index a8026f1..2c6ea78 100644
--- a/fs/quota.c
+++ b/fs/quota.c
@@ -160,6 +160,9 @@ static void quota_sync_sb(struct super_block *sb, int type)
int cnt;

sb->s_qcop->quota_sync(sb, type);
+
+ if (sb_dqopt(sb)->flags & DQUOT_QUOTA_SYS_FILE)
+ return;
/* This is not very clever (and fast) but currently I don't know about
* any other simple way of getting quota data to disk and we must get
* them there for userspace to be visible... */
diff --git a/include/linux/quota.h b/include/linux/quota.h
index 93717ab..80b8807 100644
--- a/include/linux/quota.h
+++ b/include/linux/quota.h
@@ -332,6 +332,13 @@ enum {
#define DQUOT_SUSPENDED (1 << _DQUOT_SUSPENDED)
#define DQUOT_STATE_FLAGS (DQUOT_USAGE_ENABLED | DQUOT_LIMITS_ENABLED | \
DQUOT_SUSPENDED)
+/* Other quota flags */
+#define DQUOT_QUOTA_SYS_FILE (1 << 6) /* Quota file is a special
+ * system file and user cannot
+ * touch it. Filesystem is
+ * responsible for setting
+ * S_NOQUOTA, S_NOATIME flags
+ */

static inline unsigned int dquot_state_flag(unsigned int flags, int type)
{
--
1.5.6

2008-12-22 21:53:17

by Mark Fasheh

Subject: [PATCH 11/56] quota: Move quotaio_v[12].h from include/linux/ to fs/

From: Jan Kara <[email protected]>

Since these include files are used only by the quota format implementations,
there's no need to have them in include/linux/.

Signed-off-by: Jan Kara <[email protected]>
Signed-off-by: Mark Fasheh <[email protected]>
---
fs/quota_v1.c | 3 +-
fs/quota_v2.c | 7 ++--
fs/quotaio_v1.h | 33 ++++++++++++++++++
fs/quotaio_v2.h | 79 ++++++++++++++++++++++++++++++++++++++++++++
include/linux/Kbuild | 2 -
include/linux/quotaio_v1.h | 33 ------------------
include/linux/quotaio_v2.h | 79 --------------------------------------------
7 files changed, 118 insertions(+), 118 deletions(-)
create mode 100644 fs/quotaio_v1.h
create mode 100644 fs/quotaio_v2.h
delete mode 100644 include/linux/quotaio_v1.h
delete mode 100644 include/linux/quotaio_v2.h

diff --git a/fs/quota_v1.c b/fs/quota_v1.c
index 3e078ee..b4af1c6 100644
--- a/fs/quota_v1.c
+++ b/fs/quota_v1.c
@@ -3,13 +3,14 @@
#include <linux/quota.h>
#include <linux/quotaops.h>
#include <linux/dqblk_v1.h>
-#include <linux/quotaio_v1.h>
#include <linux/kernel.h>
#include <linux/init.h>
#include <linux/module.h>

#include <asm/byteorder.h>

+#include "quotaio_v1.h"
+
MODULE_AUTHOR("Jan Kara");
MODULE_DESCRIPTION("Old quota format support");
MODULE_LICENSE("GPL");
diff --git a/fs/quota_v2.c b/fs/quota_v2.c
index 51c4717..a21d1a7 100644
--- a/fs/quota_v2.c
+++ b/fs/quota_v2.c
@@ -6,7 +6,6 @@
#include <linux/fs.h>
#include <linux/mount.h>
#include <linux/dqblk_v2.h>
-#include <linux/quotaio_v2.h>
#include <linux/kernel.h>
#include <linux/init.h>
#include <linux/module.h>
@@ -15,6 +14,8 @@

#include <asm/byteorder.h>

+#include "quotaio_v2.h"
+
MODULE_AUTHOR("Jan Kara");
MODULE_DESCRIPTION("Quota format v2 support");
MODULE_LICENSE("GPL");
@@ -129,8 +130,8 @@ static void mem2diskdqb(struct v2_disk_dqblk *d, struct mem_dqblk *m, qid_t id)
d->dqb_isoftlimit = cpu_to_le32(m->dqb_isoftlimit);
d->dqb_curinodes = cpu_to_le32(m->dqb_curinodes);
d->dqb_itime = cpu_to_le64(m->dqb_itime);
- d->dqb_bhardlimit = cpu_to_le32(v2_qbtos(m->dqb_bhardlimit));
- d->dqb_bsoftlimit = cpu_to_le32(v2_qbtos(m->dqb_bsoftlimit));
+ d->dqb_bhardlimit = cpu_to_le32(v2_stoqb(m->dqb_bhardlimit));
+ d->dqb_bsoftlimit = cpu_to_le32(v2_stoqb(m->dqb_bsoftlimit));
d->dqb_curspace = cpu_to_le64(m->dqb_curspace);
d->dqb_btime = cpu_to_le64(m->dqb_btime);
d->dqb_id = cpu_to_le32(id);
diff --git a/fs/quotaio_v1.h b/fs/quotaio_v1.h
new file mode 100644
index 0000000..746654b
--- /dev/null
+++ b/fs/quotaio_v1.h
@@ -0,0 +1,33 @@
+#ifndef _LINUX_QUOTAIO_V1_H
+#define _LINUX_QUOTAIO_V1_H
+
+#include <linux/types.h>
+
+/*
+ * The following constants define the amount of time given a user
+ * before the soft limits are treated as hard limits (usually resulting
+ * in an allocation failure). The timer is started when the user crosses
+ * their soft limit, it is reset when they go below their soft limit.
+ */
+#define MAX_IQ_TIME 604800 /* (7*24*60*60) 1 week */
+#define MAX_DQ_TIME 604800 /* (7*24*60*60) 1 week */
+
+/*
+ * The following structure defines the format of the disk quota file
+ * (as it appears on disk) - the file is an array of these structures
+ * indexed by user or group number.
+ */
+struct v1_disk_dqblk {
+ __u32 dqb_bhardlimit; /* absolute limit on disk blks alloc */
+ __u32 dqb_bsoftlimit; /* preferred limit on disk blks */
+ __u32 dqb_curblocks; /* current block count */
+ __u32 dqb_ihardlimit; /* absolute limit on allocated inodes */
+ __u32 dqb_isoftlimit; /* preferred inode limit */
+ __u32 dqb_curinodes; /* current # allocated inodes */
+ time_t dqb_btime; /* time limit for excessive disk use */
+ time_t dqb_itime; /* time limit for excessive inode use */
+};
+
+#define v1_dqoff(UID) ((loff_t)((UID) * sizeof (struct v1_disk_dqblk)))
+
+#endif /* _LINUX_QUOTAIO_V1_H */
diff --git a/fs/quotaio_v2.h b/fs/quotaio_v2.h
new file mode 100644
index 0000000..303d7cb
--- /dev/null
+++ b/fs/quotaio_v2.h
@@ -0,0 +1,79 @@
+/*
+ * Definitions of structures for vfsv0 quota format
+ */
+
+#ifndef _LINUX_QUOTAIO_V2_H
+#define _LINUX_QUOTAIO_V2_H
+
+#include <linux/types.h>
+#include <linux/quota.h>
+
+/*
+ * Definitions of magics and versions of current quota files
+ */
+#define V2_INITQMAGICS {\
+ 0xd9c01f11, /* USRQUOTA */\
+ 0xd9c01927 /* GRPQUOTA */\
+}
+
+#define V2_INITQVERSIONS {\
+ 0, /* USRQUOTA */\
+ 0 /* GRPQUOTA */\
+}
+
+/*
+ * The following structure defines the format of the disk quota file
+ * (as it appears on disk) - the file is a radix tree whose leaves point
+ * to blocks of these structures.
+ */
+struct v2_disk_dqblk {
+ __le32 dqb_id; /* id this quota applies to */
+ __le32 dqb_ihardlimit; /* absolute limit on allocated inodes */
+ __le32 dqb_isoftlimit; /* preferred inode limit */
+ __le32 dqb_curinodes; /* current # allocated inodes */
+ __le32 dqb_bhardlimit; /* absolute limit on disk space (in QUOTABLOCK_SIZE) */
+ __le32 dqb_bsoftlimit; /* preferred limit on disk space (in QUOTABLOCK_SIZE) */
+ __le64 dqb_curspace; /* current space occupied (in bytes) */
+ __le64 dqb_btime; /* time limit for excessive disk use */
+ __le64 dqb_itime; /* time limit for excessive inode use */
+};
+
+/*
+ * Here are header structures as written on disk and their in-memory copies
+ */
+/* First generic header */
+struct v2_disk_dqheader {
+ __le32 dqh_magic; /* Magic number identifying file */
+ __le32 dqh_version; /* File version */
+};
+
+/* Header with type and version specific information */
+struct v2_disk_dqinfo {
+ __le32 dqi_bgrace; /* Time before block soft limit becomes hard limit */
+ __le32 dqi_igrace; /* Time before inode soft limit becomes hard limit */
+ __le32 dqi_flags; /* Flags for quotafile (DQF_*) */
+ __le32 dqi_blocks; /* Number of blocks in file */
+ __le32 dqi_free_blk; /* Number of first free block in the list */
+ __le32 dqi_free_entry; /* Number of block with at least one free entry */
+};
+
+/*
+ * Structure of header of block with quota structures. It is padded to 16 bytes so
+ * there will be space for exactly 21 quota-entries in a block
+ */
+struct v2_disk_dqdbheader {
+ __le32 dqdh_next_free; /* Number of next block with free entry */
+ __le32 dqdh_prev_free; /* Number of previous block with free entry */
+ __le16 dqdh_entries; /* Number of valid entries in block */
+ __le16 dqdh_pad1;
+ __le32 dqdh_pad2;
+};
+
+#define V2_DQINFOOFF sizeof(struct v2_disk_dqheader) /* Offset of info header in file */
+#define V2_DQBLKSIZE_BITS 10
+#define V2_DQBLKSIZE (1 << V2_DQBLKSIZE_BITS) /* Size of block with quota structures */
+#define V2_DQTREEOFF 1 /* Offset of tree in file in blocks */
+#define V2_DQTREEDEPTH 4 /* Depth of quota tree */
+#define V2_DQSTRINBLK ((V2_DQBLKSIZE - sizeof(struct v2_disk_dqdbheader)) / sizeof(struct v2_disk_dqblk)) /* Number of entries in one blocks */
+
+#endif /* _LINUX_QUOTAIO_V2_H */
diff --git a/include/linux/Kbuild b/include/linux/Kbuild
index e531783..0fd2da3 100644
--- a/include/linux/Kbuild
+++ b/include/linux/Kbuild
@@ -134,8 +134,6 @@ header-y += posix_types.h
header-y += ppdev.h
header-y += prctl.h
header-y += qnxtypes.h
-header-y += quotaio_v1.h
-header-y += quotaio_v2.h
header-y += radeonfb.h
header-y += raw.h
header-y += resource.h
diff --git a/include/linux/quotaio_v1.h b/include/linux/quotaio_v1.h
deleted file mode 100644
index 746654b..0000000
--- a/include/linux/quotaio_v1.h
+++ /dev/null
@@ -1,33 +0,0 @@
-#ifndef _LINUX_QUOTAIO_V1_H
-#define _LINUX_QUOTAIO_V1_H
-
-#include <linux/types.h>
-
-/*
- * The following constants define the amount of time given a user
- * before the soft limits are treated as hard limits (usually resulting
- * in an allocation failure). The timer is started when the user crosses
- * their soft limit, it is reset when they go below their soft limit.
- */
-#define MAX_IQ_TIME 604800 /* (7*24*60*60) 1 week */
-#define MAX_DQ_TIME 604800 /* (7*24*60*60) 1 week */
-
-/*
- * The following structure defines the format of the disk quota file
- * (as it appears on disk) - the file is an array of these structures
- * indexed by user or group number.
- */
-struct v1_disk_dqblk {
- __u32 dqb_bhardlimit; /* absolute limit on disk blks alloc */
- __u32 dqb_bsoftlimit; /* preferred limit on disk blks */
- __u32 dqb_curblocks; /* current block count */
- __u32 dqb_ihardlimit; /* absolute limit on allocated inodes */
- __u32 dqb_isoftlimit; /* preferred inode limit */
- __u32 dqb_curinodes; /* current # allocated inodes */
- time_t dqb_btime; /* time limit for excessive disk use */
- time_t dqb_itime; /* time limit for excessive inode use */
-};
-
-#define v1_dqoff(UID) ((loff_t)((UID) * sizeof (struct v1_disk_dqblk)))
-
-#endif /* _LINUX_QUOTAIO_V1_H */
diff --git a/include/linux/quotaio_v2.h b/include/linux/quotaio_v2.h
deleted file mode 100644
index 303d7cb..0000000
--- a/include/linux/quotaio_v2.h
+++ /dev/null
@@ -1,79 +0,0 @@
-/*
- * Definitions of structures for vfsv0 quota format
- */
-
-#ifndef _LINUX_QUOTAIO_V2_H
-#define _LINUX_QUOTAIO_V2_H
-
-#include <linux/types.h>
-#include <linux/quota.h>
-
-/*
- * Definitions of magics and versions of current quota files
- */
-#define V2_INITQMAGICS {\
- 0xd9c01f11, /* USRQUOTA */\
- 0xd9c01927 /* GRPQUOTA */\
-}
-
-#define V2_INITQVERSIONS {\
- 0, /* USRQUOTA */\
- 0 /* GRPQUOTA */\
-}
-
-/*
- * The following structure defines the format of the disk quota file
- * (as it appears on disk) - the file is a radix tree whose leaves point
- * to blocks of these structures.
- */
-struct v2_disk_dqblk {
- __le32 dqb_id; /* id this quota applies to */
- __le32 dqb_ihardlimit; /* absolute limit on allocated inodes */
- __le32 dqb_isoftlimit; /* preferred inode limit */
- __le32 dqb_curinodes; /* current # allocated inodes */
- __le32 dqb_bhardlimit; /* absolute limit on disk space (in QUOTABLOCK_SIZE) */
- __le32 dqb_bsoftlimit; /* preferred limit on disk space (in QUOTABLOCK_SIZE) */
- __le64 dqb_curspace; /* current space occupied (in bytes) */
- __le64 dqb_btime; /* time limit for excessive disk use */
- __le64 dqb_itime; /* time limit for excessive inode use */
-};
-
-/*
- * Here are header structures as written on disk and their in-memory copies
- */
-/* First generic header */
-struct v2_disk_dqheader {
- __le32 dqh_magic; /* Magic number identifying file */
- __le32 dqh_version; /* File version */
-};
-
-/* Header with type and version specific information */
-struct v2_disk_dqinfo {
- __le32 dqi_bgrace; /* Time before block soft limit becomes hard limit */
- __le32 dqi_igrace; /* Time before inode soft limit becomes hard limit */
- __le32 dqi_flags; /* Flags for quotafile (DQF_*) */
- __le32 dqi_blocks; /* Number of blocks in file */
- __le32 dqi_free_blk; /* Number of first free block in the list */
- __le32 dqi_free_entry; /* Number of block with at least one free entry */
-};
-
-/*
- * Structure of header of block with quota structures. It is padded to 16 bytes so
- * there will be space for exactly 21 quota-entries in a block
- */
-struct v2_disk_dqdbheader {
- __le32 dqdh_next_free; /* Number of next block with free entry */
- __le32 dqdh_prev_free; /* Number of previous block with free entry */
- __le16 dqdh_entries; /* Number of valid entries in block */
- __le16 dqdh_pad1;
- __le32 dqdh_pad2;
-};
-
-#define V2_DQINFOOFF sizeof(struct v2_disk_dqheader) /* Offset of info header in file */
-#define V2_DQBLKSIZE_BITS 10
-#define V2_DQBLKSIZE (1 << V2_DQBLKSIZE_BITS) /* Size of block with quota structures */
-#define V2_DQTREEOFF 1 /* Offset of tree in file in blocks */
-#define V2_DQTREEDEPTH 4 /* Depth of quota tree */
-#define V2_DQSTRINBLK ((V2_DQBLKSIZE - sizeof(struct v2_disk_dqdbheader)) / sizeof(struct v2_disk_dqblk)) /* Number of entries in one blocks */
-
-#endif /* _LINUX_QUOTAIO_V2_H */
--
1.5.6

2008-12-22 21:53:35

by Mark Fasheh

Subject: [PATCH 12/56] quota: Split off quota tree handling into a separate file

From: Jan Kara <[email protected]>

There is going to be a new version of the quota format with 64-bit
quota limits, as well as a new quota format for OCFS2. Both are
going to use the same tree structure as the VFSv0 quota format. So
split the tree handling out into a separate file and make the size
of leaf blocks, the amount of space usable in each block (needed
for checksumming), and the structures contained in them configurable
so that the code can be shared.
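
As an illustration (not part of the patch), here is a condensed sketch of how
a format now describes itself to the shared tree code. It mirrors the
v2_read_file_info() changes below and would live in fs/quota_v2.c next to the
v2_qtree_ops callbacks; the helper name is invented:

#include <linux/fs.h>
#include <linux/dqblk_qtree.h>

#include "quota_tree.h"
#include "quotaio_v2.h"

/* Fill in the generic tree description for the v2 format.  A format that
 * reserves part of each block for a checksum would simply set dqi_usable_bs
 * smaller than the block size. */
static void example_fill_qtree_info(struct super_block *sb, int type,
				    struct qtree_mem_dqinfo *qinfo)
{
	qinfo->dqi_sb = sb;
	qinfo->dqi_type = type;
	qinfo->dqi_blocksize_bits = V2_DQBLKSIZE_BITS;	/* 1KB leaf blocks */
	qinfo->dqi_usable_bs = 1 << V2_DQBLKSIZE_BITS;	/* whole block usable */
	qinfo->dqi_qtree_depth = qtree_depth(qinfo);
	qinfo->dqi_entry_size = sizeof(struct v2_disk_dqblk);
	qinfo->dqi_ops = &v2_qtree_ops;	/* mem2disk_dqblk, disk2mem_dqblk, is_id */
}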

Signed-off-by: Jan Kara <[email protected]>
Signed-off-by: Mark Fasheh <[email protected]>
---
fs/Kconfig | 5 +
fs/Makefile | 1 +
fs/quota_tree.c | 645 +++++++++++++++++++++++++++++++++++++++++++
fs/quota_tree.h | 25 ++
fs/quota_v2.c | 596 ++++------------------------------------
fs/quotaio_v2.h | 33 +--
include/linux/dqblk_qtree.h | 56 ++++
include/linux/dqblk_v2.h | 19 +-
8 files changed, 799 insertions(+), 581 deletions(-)
create mode 100644 fs/quota_tree.c
create mode 100644 fs/quota_tree.h
create mode 100644 include/linux/dqblk_qtree.h

diff --git a/fs/Kconfig b/fs/Kconfig
index 55dc974..d99bc0a 100644
--- a/fs/Kconfig
+++ b/fs/Kconfig
@@ -339,6 +339,10 @@ config PRINT_QUOTA_WARNING
Note that this behavior is currently deprecated and may go away in
future. Please use notification via netlink socket instead.

+# Generic support for tree-structured quota files. Selected when needed.
+config QUOTA_TREE
+ tristate
+
config QFMT_V1
tristate "Old quota format support"
depends on QUOTA
@@ -350,6 +354,7 @@ config QFMT_V1
config QFMT_V2
tristate "Quota format v2 support"
depends on QUOTA
+ select QUOTA_TREE
help
This quota format allows using quotas with 32-bit UIDs/GIDs. If you
need this functionality say Y here.
diff --git a/fs/Makefile b/fs/Makefile
index d9f8afe..cdf6556 100644
--- a/fs/Makefile
+++ b/fs/Makefile
@@ -55,6 +55,7 @@ obj-$(CONFIG_GENERIC_ACL) += generic_acl.o
obj-$(CONFIG_QUOTA) += dquot.o
obj-$(CONFIG_QFMT_V1) += quota_v1.o
obj-$(CONFIG_QFMT_V2) += quota_v2.o
+obj-$(CONFIG_QUOTA_TREE) += quota_tree.o
obj-$(CONFIG_QUOTACTL) += quota.o

obj-$(CONFIG_DNOTIFY) += dnotify.o
diff --git a/fs/quota_tree.c b/fs/quota_tree.c
new file mode 100644
index 0000000..953404c
--- /dev/null
+++ b/fs/quota_tree.c
@@ -0,0 +1,645 @@
+/*
+ * vfsv0 quota IO operations on file
+ */
+
+#include <linux/errno.h>
+#include <linux/fs.h>
+#include <linux/mount.h>
+#include <linux/dqblk_v2.h>
+#include <linux/kernel.h>
+#include <linux/init.h>
+#include <linux/module.h>
+#include <linux/slab.h>
+#include <linux/quotaops.h>
+
+#include <asm/byteorder.h>
+
+#include "quota_tree.h"
+
+MODULE_AUTHOR("Jan Kara");
+MODULE_DESCRIPTION("Quota trie support");
+MODULE_LICENSE("GPL");
+
+#define __QUOTA_QT_PARANOIA
+
+typedef char *dqbuf_t;
+
+static int get_index(struct qtree_mem_dqinfo *info, qid_t id, int depth)
+{
+ unsigned int epb = info->dqi_usable_bs >> 2;
+
+ depth = info->dqi_qtree_depth - depth - 1;
+ while (depth--)
+ id /= epb;
+ return id % epb;
+}
+
+/* Number of entries in one block */
+static inline int qtree_dqstr_in_blk(struct qtree_mem_dqinfo *info)
+{
+ return (info->dqi_usable_bs - sizeof(struct qt_disk_dqdbheader))
+ / info->dqi_entry_size;
+}
+
+static dqbuf_t getdqbuf(size_t size)
+{
+ dqbuf_t buf = kmalloc(size, GFP_NOFS);
+ if (!buf)
+ printk(KERN_WARNING "VFS: Not enough memory for quota buffers.\n");
+ return buf;
+}
+
+static inline void freedqbuf(dqbuf_t buf)
+{
+ kfree(buf);
+}
+
+static inline ssize_t read_blk(struct qtree_mem_dqinfo *info, uint blk, dqbuf_t buf)
+{
+ struct super_block *sb = info->dqi_sb;
+
+ memset(buf, 0, info->dqi_usable_bs);
+ return sb->s_op->quota_read(sb, info->dqi_type, (char *)buf,
+ info->dqi_usable_bs, blk << info->dqi_blocksize_bits);
+}
+
+static inline ssize_t write_blk(struct qtree_mem_dqinfo *info, uint blk, dqbuf_t buf)
+{
+ struct super_block *sb = info->dqi_sb;
+
+ return sb->s_op->quota_write(sb, info->dqi_type, (char *)buf,
+ info->dqi_usable_bs, blk << info->dqi_blocksize_bits);
+}
+
+/* Remove empty block from list and return it */
+static int get_free_dqblk(struct qtree_mem_dqinfo *info)
+{
+ dqbuf_t buf = getdqbuf(info->dqi_usable_bs);
+ struct qt_disk_dqdbheader *dh = (struct qt_disk_dqdbheader *)buf;
+ int ret, blk;
+
+ if (!buf)
+ return -ENOMEM;
+ if (info->dqi_free_blk) {
+ blk = info->dqi_free_blk;
+ ret = read_blk(info, blk, buf);
+ if (ret < 0)
+ goto out_buf;
+ info->dqi_free_blk = le32_to_cpu(dh->dqdh_next_free);
+ }
+ else {
+ memset(buf, 0, info->dqi_usable_bs);
+ /* Assure block allocation... */
+ ret = write_blk(info, info->dqi_blocks, buf);
+ if (ret < 0)
+ goto out_buf;
+ blk = info->dqi_blocks++;
+ }
+ mark_info_dirty(info->dqi_sb, info->dqi_type);
+ ret = blk;
+out_buf:
+ freedqbuf(buf);
+ return ret;
+}
+
+/* Insert empty block to the list */
+static int put_free_dqblk(struct qtree_mem_dqinfo *info, dqbuf_t buf, uint blk)
+{
+ struct qt_disk_dqdbheader *dh = (struct qt_disk_dqdbheader *)buf;
+ int err;
+
+ dh->dqdh_next_free = cpu_to_le32(info->dqi_free_blk);
+ dh->dqdh_prev_free = cpu_to_le32(0);
+ dh->dqdh_entries = cpu_to_le16(0);
+ err = write_blk(info, blk, buf);
+ if (err < 0)
+ return err;
+ info->dqi_free_blk = blk;
+ mark_info_dirty(info->dqi_sb, info->dqi_type);
+ return 0;
+}
+
+/* Remove given block from the list of blocks with free entries */
+static int remove_free_dqentry(struct qtree_mem_dqinfo *info, dqbuf_t buf, uint blk)
+{
+ dqbuf_t tmpbuf = getdqbuf(info->dqi_usable_bs);
+ struct qt_disk_dqdbheader *dh = (struct qt_disk_dqdbheader *)buf;
+ uint nextblk = le32_to_cpu(dh->dqdh_next_free);
+ uint prevblk = le32_to_cpu(dh->dqdh_prev_free);
+ int err;
+
+ if (!tmpbuf)
+ return -ENOMEM;
+ if (nextblk) {
+ err = read_blk(info, nextblk, tmpbuf);
+ if (err < 0)
+ goto out_buf;
+ ((struct qt_disk_dqdbheader *)tmpbuf)->dqdh_prev_free =
+ dh->dqdh_prev_free;
+ err = write_blk(info, nextblk, tmpbuf);
+ if (err < 0)
+ goto out_buf;
+ }
+ if (prevblk) {
+ err = read_blk(info, prevblk, tmpbuf);
+ if (err < 0)
+ goto out_buf;
+ ((struct qt_disk_dqdbheader *)tmpbuf)->dqdh_next_free =
+ dh->dqdh_next_free;
+ err = write_blk(info, prevblk, tmpbuf);
+ if (err < 0)
+ goto out_buf;
+ } else {
+ info->dqi_free_entry = nextblk;
+ mark_info_dirty(info->dqi_sb, info->dqi_type);
+ }
+ freedqbuf(tmpbuf);
+ dh->dqdh_next_free = dh->dqdh_prev_free = cpu_to_le32(0);
+ /* No matter whether write succeeds block is out of list */
+ if (write_blk(info, blk, buf) < 0)
+ printk(KERN_ERR "VFS: Can't write block (%u) with free entries.\n", blk);
+ return 0;
+out_buf:
+ freedqbuf(tmpbuf);
+ return err;
+}
+
+/* Insert given block to the beginning of list with free entries */
+static int insert_free_dqentry(struct qtree_mem_dqinfo *info, dqbuf_t buf, uint blk)
+{
+ dqbuf_t tmpbuf = getdqbuf(info->dqi_usable_bs);
+ struct qt_disk_dqdbheader *dh = (struct qt_disk_dqdbheader *)buf;
+ int err;
+
+ if (!tmpbuf)
+ return -ENOMEM;
+ dh->dqdh_next_free = cpu_to_le32(info->dqi_free_entry);
+ dh->dqdh_prev_free = cpu_to_le32(0);
+ err = write_blk(info, blk, buf);
+ if (err < 0)
+ goto out_buf;
+ if (info->dqi_free_entry) {
+ err = read_blk(info, info->dqi_free_entry, tmpbuf);
+ if (err < 0)
+ goto out_buf;
+ ((struct qt_disk_dqdbheader *)tmpbuf)->dqdh_prev_free =
+ cpu_to_le32(blk);
+ err = write_blk(info, info->dqi_free_entry, tmpbuf);
+ if (err < 0)
+ goto out_buf;
+ }
+ freedqbuf(tmpbuf);
+ info->dqi_free_entry = blk;
+ mark_info_dirty(info->dqi_sb, info->dqi_type);
+ return 0;
+out_buf:
+ freedqbuf(tmpbuf);
+ return err;
+}
+
+/* Is the entry in the block free? */
+int qtree_entry_unused(struct qtree_mem_dqinfo *info, char *disk)
+{
+ int i;
+
+ for (i = 0; i < info->dqi_entry_size; i++)
+ if (disk[i])
+ return 0;
+ return 1;
+}
+EXPORT_SYMBOL(qtree_entry_unused);
+
+/* Find space for dquot */
+static uint find_free_dqentry(struct qtree_mem_dqinfo *info,
+ struct dquot *dquot, int *err)
+{
+ uint blk, i;
+ struct qt_disk_dqdbheader *dh;
+ dqbuf_t buf = getdqbuf(info->dqi_usable_bs);
+ char *ddquot;
+
+ *err = 0;
+ if (!buf) {
+ *err = -ENOMEM;
+ return 0;
+ }
+ dh = (struct qt_disk_dqdbheader *)buf;
+ if (info->dqi_free_entry) {
+ blk = info->dqi_free_entry;
+ *err = read_blk(info, blk, buf);
+ if (*err < 0)
+ goto out_buf;
+ } else {
+ blk = get_free_dqblk(info);
+ if ((int)blk < 0) {
+ *err = blk;
+ freedqbuf(buf);
+ return 0;
+ }
+ memset(buf, 0, info->dqi_usable_bs);
+ /* This is enough as block is already zeroed and entry list is empty... */
+ info->dqi_free_entry = blk;
+ mark_info_dirty(dquot->dq_sb, dquot->dq_type);
+ }
+ /* Block will be full? */
+ if (le16_to_cpu(dh->dqdh_entries) + 1 >= qtree_dqstr_in_blk(info)) {
+ *err = remove_free_dqentry(info, buf, blk);
+ if (*err < 0) {
+ printk(KERN_ERR "VFS: find_free_dqentry(): Can't "
+ "remove block (%u) from entry free list.\n",
+ blk);
+ goto out_buf;
+ }
+ }
+ le16_add_cpu(&dh->dqdh_entries, 1);
+ /* Find free structure in block */
+ for (i = 0, ddquot = ((char *)buf) + sizeof(struct qt_disk_dqdbheader);
+ i < qtree_dqstr_in_blk(info) && !qtree_entry_unused(info, ddquot);
+ i++, ddquot += info->dqi_entry_size);
+#ifdef __QUOTA_QT_PARANOIA
+ if (i == qtree_dqstr_in_blk(info)) {
+ printk(KERN_ERR "VFS: find_free_dqentry(): Data block full "
+ "but it shouldn't.\n");
+ *err = -EIO;
+ goto out_buf;
+ }
+#endif
+ *err = write_blk(info, blk, buf);
+ if (*err < 0) {
+ printk(KERN_ERR "VFS: find_free_dqentry(): Can't write quota "
+ "data block %u.\n", blk);
+ goto out_buf;
+ }
+ dquot->dq_off = (blk << info->dqi_blocksize_bits) +
+ sizeof(struct qt_disk_dqdbheader) +
+ i * info->dqi_entry_size;
+ freedqbuf(buf);
+ return blk;
+out_buf:
+ freedqbuf(buf);
+ return 0;
+}
+
+/* Insert reference to structure into the trie */
+static int do_insert_tree(struct qtree_mem_dqinfo *info, struct dquot *dquot,
+ uint *treeblk, int depth)
+{
+ dqbuf_t buf = getdqbuf(info->dqi_usable_bs);
+ int ret = 0, newson = 0, newact = 0;
+ __le32 *ref;
+ uint newblk;
+
+ if (!buf)
+ return -ENOMEM;
+ if (!*treeblk) {
+ ret = get_free_dqblk(info);
+ if (ret < 0)
+ goto out_buf;
+ *treeblk = ret;
+ memset(buf, 0, info->dqi_usable_bs);
+ newact = 1;
+ } else {
+ ret = read_blk(info, *treeblk, buf);
+ if (ret < 0) {
+ printk(KERN_ERR "VFS: Can't read tree quota block "
+ "%u.\n", *treeblk);
+ goto out_buf;
+ }
+ }
+ ref = (__le32 *)buf;
+ newblk = le32_to_cpu(ref[get_index(info, dquot->dq_id, depth)]);
+ if (!newblk)
+ newson = 1;
+ if (depth == info->dqi_qtree_depth - 1) {
+#ifdef __QUOTA_QT_PARANOIA
+ if (newblk) {
+ printk(KERN_ERR "VFS: Inserting already present quota "
+ "entry (block %u).\n",
+ le32_to_cpu(ref[get_index(info,
+ dquot->dq_id, depth)]));
+ ret = -EIO;
+ goto out_buf;
+ }
+#endif
+ newblk = find_free_dqentry(info, dquot, &ret);
+ } else {
+ ret = do_insert_tree(info, dquot, &newblk, depth+1);
+ }
+ if (newson && ret >= 0) {
+ ref[get_index(info, dquot->dq_id, depth)] =
+ cpu_to_le32(newblk);
+ ret = write_blk(info, *treeblk, buf);
+ } else if (newact && ret < 0) {
+ put_free_dqblk(info, buf, *treeblk);
+ }
+out_buf:
+ freedqbuf(buf);
+ return ret;
+}
+
+/* Wrapper for inserting quota structure into tree */
+static inline int dq_insert_tree(struct qtree_mem_dqinfo *info,
+ struct dquot *dquot)
+{
+ int tmp = QT_TREEOFF;
+ return do_insert_tree(info, dquot, &tmp, 0);
+}
+
+/*
+ * We don't have to be afraid of deadlocks as we never have quotas on quota files...
+ */
+int qtree_write_dquot(struct qtree_mem_dqinfo *info, struct dquot *dquot)
+{
+ int type = dquot->dq_type;
+ struct super_block *sb = dquot->dq_sb;
+ ssize_t ret;
+ dqbuf_t ddquot = getdqbuf(info->dqi_entry_size);
+
+ if (!ddquot)
+ return -ENOMEM;
+
+ /* dq_off is guarded by dqio_mutex */
+ if (!dquot->dq_off) {
+ ret = dq_insert_tree(info, dquot);
+ if (ret < 0) {
+ printk(KERN_ERR "VFS: Error %zd occurred while "
+ "creating quota.\n", ret);
+ freedqbuf(ddquot);
+ return ret;
+ }
+ }
+ spin_lock(&dq_data_lock);
+ info->dqi_ops->mem2disk_dqblk(ddquot, dquot);
+ spin_unlock(&dq_data_lock);
+ ret = sb->s_op->quota_write(sb, type, (char *)ddquot,
+ info->dqi_entry_size, dquot->dq_off);
+ if (ret != info->dqi_entry_size) {
+ printk(KERN_WARNING "VFS: dquota write failed on dev %s\n",
+ sb->s_id);
+ if (ret >= 0)
+ ret = -ENOSPC;
+ } else {
+ ret = 0;
+ }
+ dqstats.writes++;
+ freedqbuf(ddquot);
+
+ return ret;
+}
+EXPORT_SYMBOL(qtree_write_dquot);
+
+/* Free dquot entry in data block */
+static int free_dqentry(struct qtree_mem_dqinfo *info, struct dquot *dquot,
+ uint blk)
+{
+ struct qt_disk_dqdbheader *dh;
+ dqbuf_t buf = getdqbuf(info->dqi_usable_bs);
+ int ret = 0;
+
+ if (!buf)
+ return -ENOMEM;
+ if (dquot->dq_off >> info->dqi_blocksize_bits != blk) {
+ printk(KERN_ERR "VFS: Quota structure has offset to other "
+ "block (%u) than it should (%u).\n", blk,
+ (uint)(dquot->dq_off >> info->dqi_blocksize_bits));
+ goto out_buf;
+ }
+ ret = read_blk(info, blk, buf);
+ if (ret < 0) {
+ printk(KERN_ERR "VFS: Can't read quota data block %u\n", blk);
+ goto out_buf;
+ }
+ dh = (struct qt_disk_dqdbheader *)buf;
+ le16_add_cpu(&dh->dqdh_entries, -1);
+ if (!le16_to_cpu(dh->dqdh_entries)) { /* Block got free? */
+ ret = remove_free_dqentry(info, buf, blk);
+ if (ret >= 0)
+ ret = put_free_dqblk(info, buf, blk);
+ if (ret < 0) {
+ printk(KERN_ERR "VFS: Can't move quota data block (%u) "
+ "to free list.\n", blk);
+ goto out_buf;
+ }
+ } else {
+ memset(buf +
+ (dquot->dq_off & ((1 << info->dqi_blocksize_bits) - 1)),
+ 0, info->dqi_entry_size);
+ if (le16_to_cpu(dh->dqdh_entries) ==
+ qtree_dqstr_in_blk(info) - 1) {
+ /* Insert will write block itself */
+ ret = insert_free_dqentry(info, buf, blk);
+ if (ret < 0) {
+ printk(KERN_ERR "VFS: Can't insert quota data "
+ "block (%u) to free entry list.\n", blk);
+ goto out_buf;
+ }
+ } else {
+ ret = write_blk(info, blk, buf);
+ if (ret < 0) {
+ printk(KERN_ERR "VFS: Can't write quota data "
+ "block %u\n", blk);
+ goto out_buf;
+ }
+ }
+ }
+ dquot->dq_off = 0; /* Quota is now unattached */
+out_buf:
+ freedqbuf(buf);
+ return ret;
+}
+
+/* Remove reference to dquot from tree */
+static int remove_tree(struct qtree_mem_dqinfo *info, struct dquot *dquot,
+ uint *blk, int depth)
+{
+ dqbuf_t buf = getdqbuf(info->dqi_usable_bs);
+ int ret = 0;
+ uint newblk;
+ __le32 *ref = (__le32 *)buf;
+
+ if (!buf)
+ return -ENOMEM;
+ ret = read_blk(info, *blk, buf);
+ if (ret < 0) {
+ printk(KERN_ERR "VFS: Can't read quota data block %u\n", *blk);
+ goto out_buf;
+ }
+ newblk = le32_to_cpu(ref[get_index(info, dquot->dq_id, depth)]);
+ if (depth == info->dqi_qtree_depth - 1) {
+ ret = free_dqentry(info, dquot, newblk);
+ newblk = 0;
+ } else {
+ ret = remove_tree(info, dquot, &newblk, depth+1);
+ }
+ if (ret >= 0 && !newblk) {
+ int i;
+ ref[get_index(info, dquot->dq_id, depth)] = cpu_to_le32(0);
+ /* Block got empty? */
+ for (i = 0;
+ i < (info->dqi_usable_bs >> 2) && !ref[i];
+ i++);
+ /* Don't put the root block into the free block list */
+ if (i == (info->dqi_usable_bs >> 2)
+ && *blk != QT_TREEOFF) {
+ put_free_dqblk(info, buf, *blk);
+ *blk = 0;
+ } else {
+ ret = write_blk(info, *blk, buf);
+ if (ret < 0)
+ printk(KERN_ERR "VFS: Can't write quota tree "
+ "block %u.\n", *blk);
+ }
+ }
+out_buf:
+ freedqbuf(buf);
+ return ret;
+}
+
+/* Delete dquot from tree */
+int qtree_delete_dquot(struct qtree_mem_dqinfo *info, struct dquot *dquot)
+{
+ uint tmp = QT_TREEOFF;
+
+ if (!dquot->dq_off) /* Even not allocated? */
+ return 0;
+ return remove_tree(info, dquot, &tmp, 0);
+}
+EXPORT_SYMBOL(qtree_delete_dquot);
+
+/* Find entry in block */
+static loff_t find_block_dqentry(struct qtree_mem_dqinfo *info,
+ struct dquot *dquot, uint blk)
+{
+ dqbuf_t buf = getdqbuf(info->dqi_usable_bs);
+ loff_t ret = 0;
+ int i;
+ char *ddquot;
+
+ if (!buf)
+ return -ENOMEM;
+ ret = read_blk(info, blk, buf);
+ if (ret < 0) {
+ printk(KERN_ERR "VFS: Can't read quota tree block %u.\n", blk);
+ goto out_buf;
+ }
+ for (i = 0, ddquot = ((char *)buf) + sizeof(struct qt_disk_dqdbheader);
+ i < qtree_dqstr_in_blk(info) && !info->dqi_ops->is_id(ddquot, dquot);
+ i++, ddquot += info->dqi_entry_size);
+ if (i == qtree_dqstr_in_blk(info)) {
+ printk(KERN_ERR "VFS: Quota for id %u referenced "
+ "but not present.\n", dquot->dq_id);
+ ret = -EIO;
+ goto out_buf;
+ } else {
+ ret = (blk << info->dqi_blocksize_bits) + sizeof(struct
+ qt_disk_dqdbheader) + i * info->dqi_entry_size;
+ }
+out_buf:
+ freedqbuf(buf);
+ return ret;
+}
+
+/* Find entry for given id in the tree */
+static loff_t find_tree_dqentry(struct qtree_mem_dqinfo *info,
+ struct dquot *dquot, uint blk, int depth)
+{
+ dqbuf_t buf = getdqbuf(info->dqi_usable_bs);
+ loff_t ret = 0;
+ __le32 *ref = (__le32 *)buf;
+
+ if (!buf)
+ return -ENOMEM;
+ ret = read_blk(info, blk, buf);
+ if (ret < 0) {
+ printk(KERN_ERR "VFS: Can't read quota tree block %u.\n", blk);
+ goto out_buf;
+ }
+ ret = 0;
+ blk = le32_to_cpu(ref[get_index(info, dquot->dq_id, depth)]);
+ if (!blk) /* No reference? */
+ goto out_buf;
+ if (depth < info->dqi_qtree_depth - 1)
+ ret = find_tree_dqentry(info, dquot, blk, depth+1);
+ else
+ ret = find_block_dqentry(info, dquot, blk);
+out_buf:
+ freedqbuf(buf);
+ return ret;
+}
+
+/* Find entry for given id in the tree - wrapper function */
+static inline loff_t find_dqentry(struct qtree_mem_dqinfo *info,
+ struct dquot *dquot)
+{
+ return find_tree_dqentry(info, dquot, QT_TREEOFF, 0);
+}
+
+int qtree_read_dquot(struct qtree_mem_dqinfo *info, struct dquot *dquot)
+{
+ int type = dquot->dq_type;
+ struct super_block *sb = dquot->dq_sb;
+ loff_t offset;
+ dqbuf_t ddquot;
+ int ret = 0;
+
+#ifdef __QUOTA_QT_PARANOIA
+ /* Invalidated quota? */
+ if (!sb_dqopt(dquot->dq_sb)->files[type]) {
+ printk(KERN_ERR "VFS: Quota invalidated while reading!\n");
+ return -EIO;
+ }
+#endif
+ /* Do we know offset of the dquot entry in the quota file? */
+ if (!dquot->dq_off) {
+ offset = find_dqentry(info, dquot);
+ if (offset <= 0) { /* Entry not present? */
+ if (offset < 0)
+ printk(KERN_ERR "VFS: Can't read quota "
+ "structure for id %u.\n", dquot->dq_id);
+ dquot->dq_off = 0;
+ set_bit(DQ_FAKE_B, &dquot->dq_flags);
+ memset(&dquot->dq_dqb, 0, sizeof(struct mem_dqblk));
+ ret = offset;
+ goto out;
+ }
+ dquot->dq_off = offset;
+ }
+ ddquot = getdqbuf(info->dqi_entry_size);
+ if (!ddquot)
+ return -ENOMEM;
+ ret = sb->s_op->quota_read(sb, type, (char *)ddquot,
+ info->dqi_entry_size, dquot->dq_off);
+ if (ret != info->dqi_entry_size) {
+ if (ret >= 0)
+ ret = -EIO;
+ printk(KERN_ERR "VFS: Error while reading quota "
+ "structure for id %u.\n", dquot->dq_id);
+ set_bit(DQ_FAKE_B, &dquot->dq_flags);
+ memset(&dquot->dq_dqb, 0, sizeof(struct mem_dqblk));
+ freedqbuf(ddquot);
+ goto out;
+ }
+ spin_lock(&dq_data_lock);
+ info->dqi_ops->disk2mem_dqblk(dquot, ddquot);
+ if (!dquot->dq_dqb.dqb_bhardlimit &&
+ !dquot->dq_dqb.dqb_bsoftlimit &&
+ !dquot->dq_dqb.dqb_ihardlimit &&
+ !dquot->dq_dqb.dqb_isoftlimit)
+ set_bit(DQ_FAKE_B, &dquot->dq_flags);
+ spin_unlock(&dq_data_lock);
+ freedqbuf(ddquot);
+out:
+ dqstats.reads++;
+ return ret;
+}
+EXPORT_SYMBOL(qtree_read_dquot);
+
+/* Check whether dquot should not be deleted. We know we are
+ * the only one operating on dquot (thanks to dq_lock) */
+int qtree_release_dquot(struct qtree_mem_dqinfo *info, struct dquot *dquot)
+{
+ if (test_bit(DQ_FAKE_B, &dquot->dq_flags) && !(dquot->dq_dqb.dqb_curinodes | dquot->dq_dqb.dqb_curspace))
+ return qtree_delete_dquot(info, dquot);
+ return 0;
+}
+EXPORT_SYMBOL(qtree_release_dquot);
diff --git a/fs/quota_tree.h b/fs/quota_tree.h
new file mode 100644
index 0000000..a1ab8db
--- /dev/null
+++ b/fs/quota_tree.h
@@ -0,0 +1,25 @@
+/*
+ * Definitions of structures for vfsv0 quota format
+ */
+
+#ifndef _LINUX_QUOTA_TREE_H
+#define _LINUX_QUOTA_TREE_H
+
+#include <linux/types.h>
+#include <linux/quota.h>
+
+/*
+ * Structure of header of block with quota structures. It is padded to 16 bytes so
+ * there will be space for exactly 21 quota-entries in a block
+ */
+struct qt_disk_dqdbheader {
+ __le32 dqdh_next_free; /* Number of next block with free entry */
+ __le32 dqdh_prev_free; /* Number of previous block with free entry */
+ __le16 dqdh_entries; /* Number of valid entries in block */
+ __le16 dqdh_pad1;
+ __le32 dqdh_pad2;
+};
+
+#define QT_TREEOFF 1 /* Offset of tree in file in blocks */
+
+#endif /* _LINUX_QUOTA_TREE_H */
diff --git a/fs/quota_v2.c b/fs/quota_v2.c
index a21d1a7..a87f102 100644
--- a/fs/quota_v2.c
+++ b/fs/quota_v2.c
@@ -14,6 +14,7 @@

#include <asm/byteorder.h>

+#include "quota_tree.h"
#include "quotaio_v2.h"

MODULE_AUTHOR("Jan Kara");
@@ -22,10 +23,15 @@ MODULE_LICENSE("GPL");

#define __QUOTA_V2_PARANOIA

-typedef char *dqbuf_t;
+static void v2_mem2diskdqb(void *dp, struct dquot *dquot);
+static void v2_disk2memdqb(struct dquot *dquot, void *dp);
+static int v2_is_id(void *dp, struct dquot *dquot);

-#define GETIDINDEX(id, depth) (((id) >> ((V2_DQTREEDEPTH-(depth)-1)*8)) & 0xff)
-#define GETENTRIES(buf) ((struct v2_disk_dqblk *)(((char *)buf)+sizeof(struct v2_disk_dqdbheader)))
+static struct qtree_fmt_operations v2_qtree_ops = {
+ .mem2disk_dqblk = v2_mem2diskdqb,
+ .disk2mem_dqblk = v2_disk2memdqb,
+ .is_id = v2_is_id,
+};

#define QUOTABLOCK_BITS 10
#define QUOTABLOCK_SIZE (1 << QUOTABLOCK_BITS)
@@ -64,7 +70,7 @@ static int v2_check_quota_file(struct super_block *sb, int type)
static int v2_read_file_info(struct super_block *sb, int type)
{
struct v2_disk_dqinfo dinfo;
- struct mem_dqinfo *info = sb_dqopt(sb)->info+type;
+ struct mem_dqinfo *info = sb_dqinfo(sb, type);
ssize_t size;

size = sb->s_op->quota_read(sb, type, (char *)&dinfo,
@@ -80,9 +86,16 @@ static int v2_read_file_info(struct super_block *sb, int type)
info->dqi_bgrace = le32_to_cpu(dinfo.dqi_bgrace);
info->dqi_igrace = le32_to_cpu(dinfo.dqi_igrace);
info->dqi_flags = le32_to_cpu(dinfo.dqi_flags);
- info->u.v2_i.dqi_blocks = le32_to_cpu(dinfo.dqi_blocks);
- info->u.v2_i.dqi_free_blk = le32_to_cpu(dinfo.dqi_free_blk);
- info->u.v2_i.dqi_free_entry = le32_to_cpu(dinfo.dqi_free_entry);
+ info->u.v2_i.i.dqi_sb = sb;
+ info->u.v2_i.i.dqi_type = type;
+ info->u.v2_i.i.dqi_blocks = le32_to_cpu(dinfo.dqi_blocks);
+ info->u.v2_i.i.dqi_free_blk = le32_to_cpu(dinfo.dqi_free_blk);
+ info->u.v2_i.i.dqi_free_entry = le32_to_cpu(dinfo.dqi_free_entry);
+ info->u.v2_i.i.dqi_blocksize_bits = V2_DQBLKSIZE_BITS;
+ info->u.v2_i.i.dqi_usable_bs = 1 << V2_DQBLKSIZE_BITS;
+ info->u.v2_i.i.dqi_qtree_depth = qtree_depth(&info->u.v2_i.i);
+ info->u.v2_i.i.dqi_entry_size = sizeof(struct v2_disk_dqblk);
+ info->u.v2_i.i.dqi_ops = &v2_qtree_ops;
return 0;
}

@@ -90,7 +103,7 @@ static int v2_read_file_info(struct super_block *sb, int type)
static int v2_write_file_info(struct super_block *sb, int type)
{
struct v2_disk_dqinfo dinfo;
- struct mem_dqinfo *info = sb_dqopt(sb)->info+type;
+ struct mem_dqinfo *info = sb_dqinfo(sb, type);
ssize_t size;

spin_lock(&dq_data_lock);
@@ -99,9 +112,9 @@ static int v2_write_file_info(struct super_block *sb, int type)
dinfo.dqi_igrace = cpu_to_le32(info->dqi_igrace);
dinfo.dqi_flags = cpu_to_le32(info->dqi_flags & DQF_MASK);
spin_unlock(&dq_data_lock);
- dinfo.dqi_blocks = cpu_to_le32(info->u.v2_i.dqi_blocks);
- dinfo.dqi_free_blk = cpu_to_le32(info->u.v2_i.dqi_free_blk);
- dinfo.dqi_free_entry = cpu_to_le32(info->u.v2_i.dqi_free_entry);
+ dinfo.dqi_blocks = cpu_to_le32(info->u.v2_i.i.dqi_blocks);
+ dinfo.dqi_free_blk = cpu_to_le32(info->u.v2_i.i.dqi_free_blk);
+ dinfo.dqi_free_entry = cpu_to_le32(info->u.v2_i.i.dqi_free_entry);
size = sb->s_op->quota_write(sb, type, (char *)&dinfo,
sizeof(struct v2_disk_dqinfo), V2_DQINFOOFF);
if (size != sizeof(struct v2_disk_dqinfo)) {
@@ -112,8 +125,11 @@ static int v2_write_file_info(struct super_block *sb, int type)
return 0;
}

-static void disk2memdqb(struct mem_dqblk *m, struct v2_disk_dqblk *d)
+static void v2_disk2memdqb(struct dquot *dquot, void *dp)
{
+ struct v2_disk_dqblk *d = dp, empty;
+ struct mem_dqblk *m = &dquot->dq_dqb;
+
m->dqb_ihardlimit = le32_to_cpu(d->dqb_ihardlimit);
m->dqb_isoftlimit = le32_to_cpu(d->dqb_isoftlimit);
m->dqb_curinodes = le32_to_cpu(d->dqb_curinodes);
@@ -122,10 +138,20 @@ static void disk2memdqb(struct mem_dqblk *m, struct v2_disk_dqblk *d)
m->dqb_bsoftlimit = v2_qbtos(le32_to_cpu(d->dqb_bsoftlimit));
m->dqb_curspace = le64_to_cpu(d->dqb_curspace);
m->dqb_btime = le64_to_cpu(d->dqb_btime);
+ /* We need to escape back all-zero structure */
+ memset(&empty, 0, sizeof(struct v2_disk_dqblk));
+ empty.dqb_itime = cpu_to_le64(1);
+ if (!memcmp(&empty, dp, sizeof(struct v2_disk_dqblk)))
+ m->dqb_itime = 0;
}

-static void mem2diskdqb(struct v2_disk_dqblk *d, struct mem_dqblk *m, qid_t id)
+static void v2_mem2diskdqb(void *dp, struct dquot *dquot)
{
+ struct v2_disk_dqblk *d = dp;
+ struct mem_dqblk *m = &dquot->dq_dqb;
+ struct qtree_mem_dqinfo *info =
+ &sb_dqinfo(dquot->dq_sb, dquot->dq_type)->u.v2_i.i;
+
d->dqb_ihardlimit = cpu_to_le32(m->dqb_ihardlimit);
d->dqb_isoftlimit = cpu_to_le32(m->dqb_isoftlimit);
d->dqb_curinodes = cpu_to_le32(m->dqb_curinodes);
@@ -134,553 +160,35 @@ static void mem2diskdqb(struct v2_disk_dqblk *d, struct mem_dqblk *m, qid_t id)
d->dqb_bsoftlimit = cpu_to_le32(v2_stoqb(m->dqb_bsoftlimit));
d->dqb_curspace = cpu_to_le64(m->dqb_curspace);
d->dqb_btime = cpu_to_le64(m->dqb_btime);
- d->dqb_id = cpu_to_le32(id);
-}
-
-static dqbuf_t getdqbuf(void)
-{
- dqbuf_t buf = kmalloc(V2_DQBLKSIZE, GFP_NOFS);
- if (!buf)
- printk(KERN_WARNING "VFS: Not enough memory for quota buffers.\n");
- return buf;
-}
-
-static inline void freedqbuf(dqbuf_t buf)
-{
- kfree(buf);
-}
-
-static inline ssize_t read_blk(struct super_block *sb, int type, uint blk, dqbuf_t buf)
-{
- memset(buf, 0, V2_DQBLKSIZE);
- return sb->s_op->quota_read(sb, type, (char *)buf,
- V2_DQBLKSIZE, blk << V2_DQBLKSIZE_BITS);
-}
-
-static inline ssize_t write_blk(struct super_block *sb, int type, uint blk, dqbuf_t buf)
-{
- return sb->s_op->quota_write(sb, type, (char *)buf,
- V2_DQBLKSIZE, blk << V2_DQBLKSIZE_BITS);
-}
-
-/* Remove empty block from list and return it */
-static int get_free_dqblk(struct super_block *sb, int type)
-{
- dqbuf_t buf = getdqbuf();
- struct mem_dqinfo *info = sb_dqinfo(sb, type);
- struct v2_disk_dqdbheader *dh = (struct v2_disk_dqdbheader *)buf;
- int ret, blk;
-
- if (!buf)
- return -ENOMEM;
- if (info->u.v2_i.dqi_free_blk) {
- blk = info->u.v2_i.dqi_free_blk;
- if ((ret = read_blk(sb, type, blk, buf)) < 0)
- goto out_buf;
- info->u.v2_i.dqi_free_blk = le32_to_cpu(dh->dqdh_next_free);
- }
- else {
- memset(buf, 0, V2_DQBLKSIZE);
- /* Assure block allocation... */
- if ((ret = write_blk(sb, type, info->u.v2_i.dqi_blocks, buf)) < 0)
- goto out_buf;
- blk = info->u.v2_i.dqi_blocks++;
- }
- mark_info_dirty(sb, type);
- ret = blk;
-out_buf:
- freedqbuf(buf);
- return ret;
-}
-
-/* Insert empty block to the list */
-static int put_free_dqblk(struct super_block *sb, int type, dqbuf_t buf, uint blk)
-{
- struct mem_dqinfo *info = sb_dqinfo(sb, type);
- struct v2_disk_dqdbheader *dh = (struct v2_disk_dqdbheader *)buf;
- int err;
-
- dh->dqdh_next_free = cpu_to_le32(info->u.v2_i.dqi_free_blk);
- dh->dqdh_prev_free = cpu_to_le32(0);
- dh->dqdh_entries = cpu_to_le16(0);
- info->u.v2_i.dqi_free_blk = blk;
- mark_info_dirty(sb, type);
- /* Some strange block. We had better leave it... */
- if ((err = write_blk(sb, type, blk, buf)) < 0)
- return err;
- return 0;
+ d->dqb_id = cpu_to_le32(dquot->dq_id);
+ if (qtree_entry_unused(info, dp))
+ d->dqb_itime = cpu_to_le64(1);
}

-/* Remove given block from the list of blocks with free entries */
-static int remove_free_dqentry(struct super_block *sb, int type, dqbuf_t buf, uint blk)
+static int v2_is_id(void *dp, struct dquot *dquot)
{
- dqbuf_t tmpbuf = getdqbuf();
- struct mem_dqinfo *info = sb_dqinfo(sb, type);
- struct v2_disk_dqdbheader *dh = (struct v2_disk_dqdbheader *)buf;
- uint nextblk = le32_to_cpu(dh->dqdh_next_free), prevblk = le32_to_cpu(dh->dqdh_prev_free);
- int err;
+ struct v2_disk_dqblk *d = dp;
+ struct qtree_mem_dqinfo *info =
+ &sb_dqinfo(dquot->dq_sb, dquot->dq_type)->u.v2_i.i;

- if (!tmpbuf)
- return -ENOMEM;
- if (nextblk) {
- if ((err = read_blk(sb, type, nextblk, tmpbuf)) < 0)
- goto out_buf;
- ((struct v2_disk_dqdbheader *)tmpbuf)->dqdh_prev_free = dh->dqdh_prev_free;
- if ((err = write_blk(sb, type, nextblk, tmpbuf)) < 0)
- goto out_buf;
- }
- if (prevblk) {
- if ((err = read_blk(sb, type, prevblk, tmpbuf)) < 0)
- goto out_buf;
- ((struct v2_disk_dqdbheader *)tmpbuf)->dqdh_next_free = dh->dqdh_next_free;
- if ((err = write_blk(sb, type, prevblk, tmpbuf)) < 0)
- goto out_buf;
- }
- else {
- info->u.v2_i.dqi_free_entry = nextblk;
- mark_info_dirty(sb, type);
- }
- freedqbuf(tmpbuf);
- dh->dqdh_next_free = dh->dqdh_prev_free = cpu_to_le32(0);
- /* No matter whether write succeeds block is out of list */
- if (write_blk(sb, type, blk, buf) < 0)
- printk(KERN_ERR "VFS: Can't write block (%u) with free entries.\n", blk);
- return 0;
-out_buf:
- freedqbuf(tmpbuf);
- return err;
-}
-
-/* Insert given block to the beginning of list with free entries */
-static int insert_free_dqentry(struct super_block *sb, int type, dqbuf_t buf, uint blk)
-{
- dqbuf_t tmpbuf = getdqbuf();
- struct mem_dqinfo *info = sb_dqinfo(sb, type);
- struct v2_disk_dqdbheader *dh = (struct v2_disk_dqdbheader *)buf;
- int err;
-
- if (!tmpbuf)
- return -ENOMEM;
- dh->dqdh_next_free = cpu_to_le32(info->u.v2_i.dqi_free_entry);
- dh->dqdh_prev_free = cpu_to_le32(0);
- if ((err = write_blk(sb, type, blk, buf)) < 0)
- goto out_buf;
- if (info->u.v2_i.dqi_free_entry) {
- if ((err = read_blk(sb, type, info->u.v2_i.dqi_free_entry, tmpbuf)) < 0)
- goto out_buf;
- ((struct v2_disk_dqdbheader *)tmpbuf)->dqdh_prev_free = cpu_to_le32(blk);
- if ((err = write_blk(sb, type, info->u.v2_i.dqi_free_entry, tmpbuf)) < 0)
- goto out_buf;
- }
- freedqbuf(tmpbuf);
- info->u.v2_i.dqi_free_entry = blk;
- mark_info_dirty(sb, type);
- return 0;
-out_buf:
- freedqbuf(tmpbuf);
- return err;
-}
-
-/* Find space for dquot */
-static uint find_free_dqentry(struct dquot *dquot, int *err)
-{
- struct super_block *sb = dquot->dq_sb;
- struct mem_dqinfo *info = sb_dqopt(sb)->info+dquot->dq_type;
- uint blk, i;
- struct v2_disk_dqdbheader *dh;
- struct v2_disk_dqblk *ddquot;
- struct v2_disk_dqblk fakedquot;
- dqbuf_t buf;
-
- *err = 0;
- if (!(buf = getdqbuf())) {
- *err = -ENOMEM;
+ if (qtree_entry_unused(info, dp))
return 0;
- }
- dh = (struct v2_disk_dqdbheader *)buf;
- ddquot = GETENTRIES(buf);
- if (info->u.v2_i.dqi_free_entry) {
- blk = info->u.v2_i.dqi_free_entry;
- if ((*err = read_blk(sb, dquot->dq_type, blk, buf)) < 0)
- goto out_buf;
- }
- else {
- blk = get_free_dqblk(sb, dquot->dq_type);
- if ((int)blk < 0) {
- *err = blk;
- freedqbuf(buf);
- return 0;
- }
- memset(buf, 0, V2_DQBLKSIZE);
- /* This is enough as block is already zeroed and entry list is empty... */
- info->u.v2_i.dqi_free_entry = blk;
- mark_info_dirty(sb, dquot->dq_type);
- }
- if (le16_to_cpu(dh->dqdh_entries)+1 >= V2_DQSTRINBLK) /* Block will be full? */
- if ((*err = remove_free_dqentry(sb, dquot->dq_type, buf, blk)) < 0) {
- printk(KERN_ERR "VFS: find_free_dqentry(): Can't remove block (%u) from entry free list.\n", blk);
- goto out_buf;
- }
- le16_add_cpu(&dh->dqdh_entries, 1);
- memset(&fakedquot, 0, sizeof(struct v2_disk_dqblk));
- /* Find free structure in block */
- for (i = 0; i < V2_DQSTRINBLK && memcmp(&fakedquot, ddquot+i, sizeof(struct v2_disk_dqblk)); i++);
-#ifdef __QUOTA_V2_PARANOIA
- if (i == V2_DQSTRINBLK) {
- printk(KERN_ERR "VFS: find_free_dqentry(): Data block full but it shouldn't.\n");
- *err = -EIO;
- goto out_buf;
- }
-#endif
- if ((*err = write_blk(sb, dquot->dq_type, blk, buf)) < 0) {
- printk(KERN_ERR "VFS: find_free_dqentry(): Can't write quota data block %u.\n", blk);
- goto out_buf;
- }
- dquot->dq_off = (blk<<V2_DQBLKSIZE_BITS)+sizeof(struct v2_disk_dqdbheader)+i*sizeof(struct v2_disk_dqblk);
- freedqbuf(buf);
- return blk;
-out_buf:
- freedqbuf(buf);
- return 0;
+ return le32_to_cpu(d->dqb_id) == dquot->dq_id;
}

-/* Insert reference to structure into the trie */
-static int do_insert_tree(struct dquot *dquot, uint *treeblk, int depth)
-{
- struct super_block *sb = dquot->dq_sb;
- dqbuf_t buf;
- int ret = 0, newson = 0, newact = 0;
- __le32 *ref;
- uint newblk;
-
- if (!(buf = getdqbuf()))
- return -ENOMEM;
- if (!*treeblk) {
- ret = get_free_dqblk(sb, dquot->dq_type);
- if (ret < 0)
- goto out_buf;
- *treeblk = ret;
- memset(buf, 0, V2_DQBLKSIZE);
- newact = 1;
- }
- else {
- if ((ret = read_blk(sb, dquot->dq_type, *treeblk, buf)) < 0) {
- printk(KERN_ERR "VFS: Can't read tree quota block %u.\n", *treeblk);
- goto out_buf;
- }
- }
- ref = (__le32 *)buf;
- newblk = le32_to_cpu(ref[GETIDINDEX(dquot->dq_id, depth)]);
- if (!newblk)
- newson = 1;
- if (depth == V2_DQTREEDEPTH-1) {
-#ifdef __QUOTA_V2_PARANOIA
- if (newblk) {
- printk(KERN_ERR "VFS: Inserting already present quota entry (block %u).\n", le32_to_cpu(ref[GETIDINDEX(dquot->dq_id, depth)]));
- ret = -EIO;
- goto out_buf;
- }
-#endif
- newblk = find_free_dqentry(dquot, &ret);
- }
- else
- ret = do_insert_tree(dquot, &newblk, depth+1);
- if (newson && ret >= 0) {
- ref[GETIDINDEX(dquot->dq_id, depth)] = cpu_to_le32(newblk);
- ret = write_blk(sb, dquot->dq_type, *treeblk, buf);
- }
- else if (newact && ret < 0)
- put_free_dqblk(sb, dquot->dq_type, buf, *treeblk);
-out_buf:
- freedqbuf(buf);
- return ret;
-}
-
-/* Wrapper for inserting quota structure into tree */
-static inline int dq_insert_tree(struct dquot *dquot)
+static int v2_read_dquot(struct dquot *dquot)
{
- int tmp = V2_DQTREEOFF;
- return do_insert_tree(dquot, &tmp, 0);
+ return qtree_read_dquot(&sb_dqinfo(dquot->dq_sb, dquot->dq_type)->u.v2_i.i, dquot);
}

-/*
- * We don't have to be afraid of deadlocks as we never have quotas on quota files...
- */
static int v2_write_dquot(struct dquot *dquot)
{
- int type = dquot->dq_type;
- ssize_t ret;
- struct v2_disk_dqblk ddquot, empty;
-
- /* dq_off is guarded by dqio_mutex */
- if (!dquot->dq_off)
- if ((ret = dq_insert_tree(dquot)) < 0) {
- printk(KERN_ERR "VFS: Error %zd occurred while creating quota.\n", ret);
- return ret;
- }
- spin_lock(&dq_data_lock);
- mem2diskdqb(&ddquot, &dquot->dq_dqb, dquot->dq_id);
- /* Argh... We may need to write structure full of zeroes but that would be
- * treated as an empty place by the rest of the code. Format change would
- * be definitely cleaner but the problems probably are not worth it */
- memset(&empty, 0, sizeof(struct v2_disk_dqblk));
- if (!memcmp(&empty, &ddquot, sizeof(struct v2_disk_dqblk)))
- ddquot.dqb_itime = cpu_to_le64(1);
- spin_unlock(&dq_data_lock);
- ret = dquot->dq_sb->s_op->quota_write(dquot->dq_sb, type,
- (char *)&ddquot, sizeof(struct v2_disk_dqblk), dquot->dq_off);
- if (ret != sizeof(struct v2_disk_dqblk)) {
- printk(KERN_WARNING "VFS: dquota write failed on dev %s\n", dquot->dq_sb->s_id);
- if (ret >= 0)
- ret = -ENOSPC;
- }
- else
- ret = 0;
- dqstats.writes++;
-
- return ret;
+ return qtree_write_dquot(&sb_dqinfo(dquot->dq_sb, dquot->dq_type)->u.v2_i.i, dquot);
}

-/* Free dquot entry in data block */
-static int free_dqentry(struct dquot *dquot, uint blk)
-{
- struct super_block *sb = dquot->dq_sb;
- int type = dquot->dq_type;
- struct v2_disk_dqdbheader *dh;
- dqbuf_t buf = getdqbuf();
- int ret = 0;
-
- if (!buf)
- return -ENOMEM;
- if (dquot->dq_off >> V2_DQBLKSIZE_BITS != blk) {
- printk(KERN_ERR "VFS: Quota structure has offset to other "
- "block (%u) than it should (%u).\n", blk,
- (uint)(dquot->dq_off >> V2_DQBLKSIZE_BITS));
- goto out_buf;
- }
- if ((ret = read_blk(sb, type, blk, buf)) < 0) {
- printk(KERN_ERR "VFS: Can't read quota data block %u\n", blk);
- goto out_buf;
- }
- dh = (struct v2_disk_dqdbheader *)buf;
- le16_add_cpu(&dh->dqdh_entries, -1);
- if (!le16_to_cpu(dh->dqdh_entries)) { /* Block got free? */
- if ((ret = remove_free_dqentry(sb, type, buf, blk)) < 0 ||
- (ret = put_free_dqblk(sb, type, buf, blk)) < 0) {
- printk(KERN_ERR "VFS: Can't move quota data block (%u) "
- "to free list.\n", blk);
- goto out_buf;
- }
- }
- else {
- memset(buf+(dquot->dq_off & ((1 << V2_DQBLKSIZE_BITS)-1)), 0,
- sizeof(struct v2_disk_dqblk));
- if (le16_to_cpu(dh->dqdh_entries) == V2_DQSTRINBLK-1) {
- /* Insert will write block itself */
- if ((ret = insert_free_dqentry(sb, type, buf, blk)) < 0) {
- printk(KERN_ERR "VFS: Can't insert quota data block (%u) to free entry list.\n", blk);
- goto out_buf;
- }
- }
- else
- if ((ret = write_blk(sb, type, blk, buf)) < 0) {
- printk(KERN_ERR "VFS: Can't write quota data "
- "block %u\n", blk);
- goto out_buf;
- }
- }
- dquot->dq_off = 0; /* Quota is now unattached */
-out_buf:
- freedqbuf(buf);
- return ret;
-}
-
-/* Remove reference to dquot from tree */
-static int remove_tree(struct dquot *dquot, uint *blk, int depth)
-{
- struct super_block *sb = dquot->dq_sb;
- int type = dquot->dq_type;
- dqbuf_t buf = getdqbuf();
- int ret = 0;
- uint newblk;
- __le32 *ref = (__le32 *)buf;
-
- if (!buf)
- return -ENOMEM;
- if ((ret = read_blk(sb, type, *blk, buf)) < 0) {
- printk(KERN_ERR "VFS: Can't read quota data block %u\n", *blk);
- goto out_buf;
- }
- newblk = le32_to_cpu(ref[GETIDINDEX(dquot->dq_id, depth)]);
- if (depth == V2_DQTREEDEPTH-1) {
- ret = free_dqentry(dquot, newblk);
- newblk = 0;
- }
- else
- ret = remove_tree(dquot, &newblk, depth+1);
- if (ret >= 0 && !newblk) {
- int i;
- ref[GETIDINDEX(dquot->dq_id, depth)] = cpu_to_le32(0);
- for (i = 0; i < V2_DQBLKSIZE && !buf[i]; i++); /* Block got empty? */
- /* Don't put the root block into the free block list */
- if (i == V2_DQBLKSIZE && *blk != V2_DQTREEOFF) {
- put_free_dqblk(sb, type, buf, *blk);
- *blk = 0;
- }
- else
- if ((ret = write_blk(sb, type, *blk, buf)) < 0)
- printk(KERN_ERR "VFS: Can't write quota tree "
- "block %u.\n", *blk);
- }
-out_buf:
- freedqbuf(buf);
- return ret;
-}
-
-/* Delete dquot from tree */
-static int v2_delete_dquot(struct dquot *dquot)
-{
- uint tmp = V2_DQTREEOFF;
-
- if (!dquot->dq_off) /* Even not allocated? */
- return 0;
- return remove_tree(dquot, &tmp, 0);
-}
-
-/* Find entry in block */
-static loff_t find_block_dqentry(struct dquot *dquot, uint blk)
-{
- dqbuf_t buf = getdqbuf();
- loff_t ret = 0;
- int i;
- struct v2_disk_dqblk *ddquot = GETENTRIES(buf);
-
- if (!buf)
- return -ENOMEM;
- if ((ret = read_blk(dquot->dq_sb, dquot->dq_type, blk, buf)) < 0) {
- printk(KERN_ERR "VFS: Can't read quota tree block %u.\n", blk);
- goto out_buf;
- }
- if (dquot->dq_id)
- for (i = 0; i < V2_DQSTRINBLK &&
- le32_to_cpu(ddquot[i].dqb_id) != dquot->dq_id; i++);
- else { /* ID 0 as a bit more complicated searching... */
- struct v2_disk_dqblk fakedquot;
-
- memset(&fakedquot, 0, sizeof(struct v2_disk_dqblk));
- for (i = 0; i < V2_DQSTRINBLK; i++)
- if (!le32_to_cpu(ddquot[i].dqb_id) &&
- memcmp(&fakedquot, ddquot+i, sizeof(struct v2_disk_dqblk)))
- break;
- }
- if (i == V2_DQSTRINBLK) {
- printk(KERN_ERR "VFS: Quota for id %u referenced "
- "but not present.\n", dquot->dq_id);
- ret = -EIO;
- goto out_buf;
- }
- else
- ret = (blk << V2_DQBLKSIZE_BITS) + sizeof(struct
- v2_disk_dqdbheader) + i * sizeof(struct v2_disk_dqblk);
-out_buf:
- freedqbuf(buf);
- return ret;
-}
-
-/* Find entry for given id in the tree */
-static loff_t find_tree_dqentry(struct dquot *dquot, uint blk, int depth)
-{
- dqbuf_t buf = getdqbuf();
- loff_t ret = 0;
- __le32 *ref = (__le32 *)buf;
-
- if (!buf)
- return -ENOMEM;
- if ((ret = read_blk(dquot->dq_sb, dquot->dq_type, blk, buf)) < 0) {
- printk(KERN_ERR "VFS: Can't read quota tree block %u.\n", blk);
- goto out_buf;
- }
- ret = 0;
- blk = le32_to_cpu(ref[GETIDINDEX(dquot->dq_id, depth)]);
- if (!blk) /* No reference? */
- goto out_buf;
- if (depth < V2_DQTREEDEPTH-1)
- ret = find_tree_dqentry(dquot, blk, depth+1);
- else
- ret = find_block_dqentry(dquot, blk);
-out_buf:
- freedqbuf(buf);
- return ret;
-}
-
-/* Find entry for given id in the tree - wrapper function */
-static inline loff_t find_dqentry(struct dquot *dquot)
-{
- return find_tree_dqentry(dquot, V2_DQTREEOFF, 0);
-}
-
-static int v2_read_dquot(struct dquot *dquot)
-{
- int type = dquot->dq_type;
- loff_t offset;
- struct v2_disk_dqblk ddquot, empty;
- int ret = 0;
-
-#ifdef __QUOTA_V2_PARANOIA
- /* Invalidated quota? */
- if (!dquot->dq_sb || !sb_dqopt(dquot->dq_sb)->files[type]) {
- printk(KERN_ERR "VFS: Quota invalidated while reading!\n");
- return -EIO;
- }
-#endif
- offset = find_dqentry(dquot);
- if (offset <= 0) { /* Entry not present? */
- if (offset < 0)
- printk(KERN_ERR "VFS: Can't read quota "
- "structure for id %u.\n", dquot->dq_id);
- dquot->dq_off = 0;
- set_bit(DQ_FAKE_B, &dquot->dq_flags);
- memset(&dquot->dq_dqb, 0, sizeof(struct mem_dqblk));
- ret = offset;
- }
- else {
- dquot->dq_off = offset;
- if ((ret = dquot->dq_sb->s_op->quota_read(dquot->dq_sb, type,
- (char *)&ddquot, sizeof(struct v2_disk_dqblk), offset))
- != sizeof(struct v2_disk_dqblk)) {
- if (ret >= 0)
- ret = -EIO;
- printk(KERN_ERR "VFS: Error while reading quota "
- "structure for id %u.\n", dquot->dq_id);
- memset(&ddquot, 0, sizeof(struct v2_disk_dqblk));
- }
- else {
- ret = 0;
- /* We need to escape back all-zero structure */
- memset(&empty, 0, sizeof(struct v2_disk_dqblk));
- empty.dqb_itime = cpu_to_le64(1);
- if (!memcmp(&empty, &ddquot, sizeof(struct v2_disk_dqblk)))
- ddquot.dqb_itime = 0;
- }
- disk2memdqb(&dquot->dq_dqb, &ddquot);
- if (!dquot->dq_dqb.dqb_bhardlimit &&
- !dquot->dq_dqb.dqb_bsoftlimit &&
- !dquot->dq_dqb.dqb_ihardlimit &&
- !dquot->dq_dqb.dqb_isoftlimit)
- set_bit(DQ_FAKE_B, &dquot->dq_flags);
- }
- dqstats.reads++;
-
- return ret;
-}
-
-/* Check whether dquot should not be deleted. We know we are
- * the only one operating on dquot (thanks to dq_lock) */
static int v2_release_dquot(struct dquot *dquot)
{
- if (test_bit(DQ_FAKE_B, &dquot->dq_flags) && !(dquot->dq_dqb.dqb_curinodes | dquot->dq_dqb.dqb_curspace))
- return v2_delete_dquot(dquot);
- return 0;
+ return qtree_release_dquot(&sb_dqinfo(dquot->dq_sb, dquot->dq_type)->u.v2_i.i, dquot);
}

static struct quota_format_ops v2_format_ops = {
diff --git a/fs/quotaio_v2.h b/fs/quotaio_v2.h
index 303d7cb..530fe58 100644
--- a/fs/quotaio_v2.h
+++ b/fs/quotaio_v2.h
@@ -21,6 +21,12 @@
0 /* GRPQUOTA */\
}

+/* First generic header */
+struct v2_disk_dqheader {
+ __le32 dqh_magic; /* Magic number identifying file */
+ __le32 dqh_version; /* File version */
+};
+
/*
* The following structure defines the format of the disk quota file
* (as it appears on disk) - the file is a radix tree whose leaves point
@@ -38,15 +44,6 @@ struct v2_disk_dqblk {
__le64 dqb_itime; /* time limit for excessive inode use */
};

-/*
- * Here are header structures as written on disk and their in-memory copies
- */
-/* First generic header */
-struct v2_disk_dqheader {
- __le32 dqh_magic; /* Magic number identifying file */
- __le32 dqh_version; /* File version */
-};
-
/* Header with type and version specific information */
struct v2_disk_dqinfo {
__le32 dqi_bgrace; /* Time before block soft limit becomes hard limit */
@@ -57,23 +54,7 @@ struct v2_disk_dqinfo {
__le32 dqi_free_entry; /* Number of block with at least one free entry */
};

-/*
- * Structure of header of block with quota structures. It is padded to 16 bytes so
- * there will be space for exactly 21 quota-entries in a block
- */
-struct v2_disk_dqdbheader {
- __le32 dqdh_next_free; /* Number of next block with free entry */
- __le32 dqdh_prev_free; /* Number of previous block with free entry */
- __le16 dqdh_entries; /* Number of valid entries in block */
- __le16 dqdh_pad1;
- __le32 dqdh_pad2;
-};
-
#define V2_DQINFOOFF sizeof(struct v2_disk_dqheader) /* Offset of info header in file */
-#define V2_DQBLKSIZE_BITS 10
-#define V2_DQBLKSIZE (1 << V2_DQBLKSIZE_BITS) /* Size of block with quota structures */
-#define V2_DQTREEOFF 1 /* Offset of tree in file in blocks */
-#define V2_DQTREEDEPTH 4 /* Depth of quota tree */
-#define V2_DQSTRINBLK ((V2_DQBLKSIZE - sizeof(struct v2_disk_dqdbheader)) / sizeof(struct v2_disk_dqblk)) /* Number of entries in one blocks */
+#define V2_DQBLKSIZE_BITS 10 /* Size of leaf block in tree */

#endif /* _LINUX_QUOTAIO_V2_H */
diff --git a/include/linux/dqblk_qtree.h b/include/linux/dqblk_qtree.h
new file mode 100644
index 0000000..82a1652
--- /dev/null
+++ b/include/linux/dqblk_qtree.h
@@ -0,0 +1,56 @@
+/*
+ * Definitions of structures and functions for quota formats using trie
+ */
+
+#ifndef _LINUX_DQBLK_QTREE_H
+#define _LINUX_DQBLK_QTREE_H
+
+#include <linux/types.h>
+
+/* Numbers of blocks needed for updates - we count with the smallest
+ * possible block size (1024) */
+#define QTREE_INIT_ALLOC 4
+#define QTREE_INIT_REWRITE 2
+#define QTREE_DEL_ALLOC 0
+#define QTREE_DEL_REWRITE 6
+
+struct dquot;
+
+/* Operations */
+struct qtree_fmt_operations {
+ void (*mem2disk_dqblk)(void *disk, struct dquot *dquot); /* Convert given entry from in memory format to disk one */
+ void (*disk2mem_dqblk)(struct dquot *dquot, void *disk); /* Convert given entry from disk format to in memory one */
+ int (*is_id)(void *disk, struct dquot *dquot); /* Is this structure for given id? */
+};
+
+/* Inmemory copy of version specific information */
+struct qtree_mem_dqinfo {
+ struct super_block *dqi_sb; /* Sb quota is on */
+ int dqi_type; /* Quota type */
+ unsigned int dqi_blocks; /* # of blocks in quota file */
+ unsigned int dqi_free_blk; /* First block in list of free blocks */
+ unsigned int dqi_free_entry; /* First block with free entry */
+ unsigned int dqi_blocksize_bits; /* Block size of quota file */
+ unsigned int dqi_entry_size; /* Size of quota entry in quota file */
+ unsigned int dqi_usable_bs; /* Space usable in block for quota data */
+ unsigned int dqi_qtree_depth; /* Precomputed depth of quota tree */
+ struct qtree_fmt_operations *dqi_ops; /* Operations for entry manipulation */
+};
+
+int qtree_write_dquot(struct qtree_mem_dqinfo *info, struct dquot *dquot);
+int qtree_read_dquot(struct qtree_mem_dqinfo *info, struct dquot *dquot);
+int qtree_delete_dquot(struct qtree_mem_dqinfo *info, struct dquot *dquot);
+int qtree_release_dquot(struct qtree_mem_dqinfo *info, struct dquot *dquot);
+int qtree_entry_unused(struct qtree_mem_dqinfo *info, char *disk);
+static inline int qtree_depth(struct qtree_mem_dqinfo *info)
+{
+ unsigned int epb = info->dqi_usable_bs >> 2;
+ unsigned long long entries = epb;
+ int i;
+
+ for (i = 1; entries < (1ULL << 32); i++)
+ entries *= epb;
+ return i;
+}
+
+#endif /* _LINUX_DQBLK_QTREE_H */
diff --git a/include/linux/dqblk_v2.h b/include/linux/dqblk_v2.h
index 4f85332..e5e22a7 100644
--- a/include/linux/dqblk_v2.h
+++ b/include/linux/dqblk_v2.h
@@ -1,26 +1,23 @@
/*
- * Definitions of structures for vfsv0 quota format
+ * Definitions for vfsv0 quota format
*/

#ifndef _LINUX_DQBLK_V2_H
#define _LINUX_DQBLK_V2_H

-#include <linux/types.h>
+#include <linux/dqblk_qtree.h>

-/* id numbers of quota format */
+/* Id number of quota format */
#define QFMT_VFS_V0 2

/* Numbers of blocks needed for updates */
-#define V2_INIT_ALLOC 4
-#define V2_INIT_REWRITE 2
-#define V2_DEL_ALLOC 0
-#define V2_DEL_REWRITE 6
+#define V2_INIT_ALLOC QTREE_INIT_ALLOC
+#define V2_INIT_REWRITE QTREE_INIT_REWRITE
+#define V2_DEL_ALLOC QTREE_DEL_ALLOC
+#define V2_DEL_REWRITE QTREE_DEL_REWRITE

-/* Inmemory copy of version specific information */
struct v2_mem_dqinfo {
- unsigned int dqi_blocks;
- unsigned int dqi_free_blk;
- unsigned int dqi_free_entry;
+ struct qtree_mem_dqinfo i;
};

#endif /* _LINUX_DQBLK_V2_H */
--
1.5.6

2008-12-22 21:53:52

by Mark Fasheh

[permalink] [raw]
Subject: [PATCH 13/56] quota: Convert union in mem_dqinfo to a pointer

From: Jan Kara <[email protected]>

The coming quota support for OCFS2 is going to need quite a bit
of additional per-sb quota information. Moreover, having fs.h
include all the types needed for this structure would be a
pain in the a**. So remove the union from mem_dqinfo and add
a private pointer for the filesystem's use.
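
As a minimal sketch of the new convention (illustrative only, not part of
the patch; the "my_*" names are invented and the calls simply mirror the
fs/quota_v2.c hunks below), a quota format now allocates its private state
behind dqi_priv and frees it from its ->free_file_info callback:

/* Sketch only: mirrors how fs/quota_v2.c uses the new dqi_priv pointer. */
static int my_read_file_info(struct super_block *sb, int type)
{
        struct mem_dqinfo *info = sb_dqinfo(sb, type);

        info->dqi_priv = kmalloc(sizeof(struct qtree_mem_dqinfo), GFP_NOFS);
        if (!info->dqi_priv)
                return -1;      /* quota-format callbacks signal failure with -1 */
        /* ... fill the qtree_mem_dqinfo fields from the on-disk header ... */
        return 0;
}

static int my_free_file_info(struct super_block *sb, int type)
{
        kfree(sb_dqinfo(sb, type)->dqi_priv);
        return 0;
}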

Signed-off-by: Jan Kara <[email protected]>
Signed-off-by: Mark Fasheh <[email protected]>
---
fs/quota_v2.c | 53 +++++++++++++++++++++++++++++----------------
include/linux/dqblk_v1.h | 4 ---
include/linux/dqblk_v2.h | 4 ---
include/linux/quota.h | 5 +---
4 files changed, 35 insertions(+), 31 deletions(-)

diff --git a/fs/quota_v2.c b/fs/quota_v2.c
index a87f102..b618b56 100644
--- a/fs/quota_v2.c
+++ b/fs/quota_v2.c
@@ -71,6 +71,7 @@ static int v2_read_file_info(struct super_block *sb, int type)
{
struct v2_disk_dqinfo dinfo;
struct mem_dqinfo *info = sb_dqinfo(sb, type);
+ struct qtree_mem_dqinfo *qinfo;
ssize_t size;

size = sb->s_op->quota_read(sb, type, (char *)&dinfo,
@@ -80,22 +81,29 @@ static int v2_read_file_info(struct super_block *sb, int type)
sb->s_id);
return -1;
}
+ info->dqi_priv = kmalloc(sizeof(struct qtree_mem_dqinfo), GFP_NOFS);
+ if (!info->dqi_priv) {
+ printk(KERN_WARNING
+ "Not enough memory for quota information structure.\n");
+ return -1;
+ }
+ qinfo = info->dqi_priv;
/* limits are stored as unsigned 32-bit data */
info->dqi_maxblimit = 0xffffffff;
info->dqi_maxilimit = 0xffffffff;
info->dqi_bgrace = le32_to_cpu(dinfo.dqi_bgrace);
info->dqi_igrace = le32_to_cpu(dinfo.dqi_igrace);
info->dqi_flags = le32_to_cpu(dinfo.dqi_flags);
- info->u.v2_i.i.dqi_sb = sb;
- info->u.v2_i.i.dqi_type = type;
- info->u.v2_i.i.dqi_blocks = le32_to_cpu(dinfo.dqi_blocks);
- info->u.v2_i.i.dqi_free_blk = le32_to_cpu(dinfo.dqi_free_blk);
- info->u.v2_i.i.dqi_free_entry = le32_to_cpu(dinfo.dqi_free_entry);
- info->u.v2_i.i.dqi_blocksize_bits = V2_DQBLKSIZE_BITS;
- info->u.v2_i.i.dqi_usable_bs = 1 << V2_DQBLKSIZE_BITS;
- info->u.v2_i.i.dqi_qtree_depth = qtree_depth(&info->u.v2_i.i);
- info->u.v2_i.i.dqi_entry_size = sizeof(struct v2_disk_dqblk);
- info->u.v2_i.i.dqi_ops = &v2_qtree_ops;
+ qinfo->dqi_sb = sb;
+ qinfo->dqi_type = type;
+ qinfo->dqi_blocks = le32_to_cpu(dinfo.dqi_blocks);
+ qinfo->dqi_free_blk = le32_to_cpu(dinfo.dqi_free_blk);
+ qinfo->dqi_free_entry = le32_to_cpu(dinfo.dqi_free_entry);
+ qinfo->dqi_blocksize_bits = V2_DQBLKSIZE_BITS;
+ qinfo->dqi_usable_bs = 1 << V2_DQBLKSIZE_BITS;
+ qinfo->dqi_qtree_depth = qtree_depth(qinfo);
+ qinfo->dqi_entry_size = sizeof(struct v2_disk_dqblk);
+ qinfo->dqi_ops = &v2_qtree_ops;
return 0;
}

@@ -104,6 +112,7 @@ static int v2_write_file_info(struct super_block *sb, int type)
{
struct v2_disk_dqinfo dinfo;
struct mem_dqinfo *info = sb_dqinfo(sb, type);
+ struct qtree_mem_dqinfo *qinfo = info->dqi_priv;
ssize_t size;

spin_lock(&dq_data_lock);
@@ -112,9 +121,9 @@ static int v2_write_file_info(struct super_block *sb, int type)
dinfo.dqi_igrace = cpu_to_le32(info->dqi_igrace);
dinfo.dqi_flags = cpu_to_le32(info->dqi_flags & DQF_MASK);
spin_unlock(&dq_data_lock);
- dinfo.dqi_blocks = cpu_to_le32(info->u.v2_i.i.dqi_blocks);
- dinfo.dqi_free_blk = cpu_to_le32(info->u.v2_i.i.dqi_free_blk);
- dinfo.dqi_free_entry = cpu_to_le32(info->u.v2_i.i.dqi_free_entry);
+ dinfo.dqi_blocks = cpu_to_le32(qinfo->dqi_blocks);
+ dinfo.dqi_free_blk = cpu_to_le32(qinfo->dqi_free_blk);
+ dinfo.dqi_free_entry = cpu_to_le32(qinfo->dqi_free_entry);
size = sb->s_op->quota_write(sb, type, (char *)&dinfo,
sizeof(struct v2_disk_dqinfo), V2_DQINFOOFF);
if (size != sizeof(struct v2_disk_dqinfo)) {
@@ -150,7 +159,7 @@ static void v2_mem2diskdqb(void *dp, struct dquot *dquot)
struct v2_disk_dqblk *d = dp;
struct mem_dqblk *m = &dquot->dq_dqb;
struct qtree_mem_dqinfo *info =
- &sb_dqinfo(dquot->dq_sb, dquot->dq_type)->u.v2_i.i;
+ sb_dqinfo(dquot->dq_sb, dquot->dq_type)->dqi_priv;

d->dqb_ihardlimit = cpu_to_le32(m->dqb_ihardlimit);
d->dqb_isoftlimit = cpu_to_le32(m->dqb_isoftlimit);
@@ -169,7 +178,7 @@ static int v2_is_id(void *dp, struct dquot *dquot)
{
struct v2_disk_dqblk *d = dp;
struct qtree_mem_dqinfo *info =
- &sb_dqinfo(dquot->dq_sb, dquot->dq_type)->u.v2_i.i;
+ sb_dqinfo(dquot->dq_sb, dquot->dq_type)->dqi_priv;

if (qtree_entry_unused(info, dp))
return 0;
@@ -178,24 +187,30 @@ static int v2_is_id(void *dp, struct dquot *dquot)

static int v2_read_dquot(struct dquot *dquot)
{
- return qtree_read_dquot(&sb_dqinfo(dquot->dq_sb, dquot->dq_type)->u.v2_i.i, dquot);
+ return qtree_read_dquot(sb_dqinfo(dquot->dq_sb, dquot->dq_type)->dqi_priv, dquot);
}

static int v2_write_dquot(struct dquot *dquot)
{
- return qtree_write_dquot(&sb_dqinfo(dquot->dq_sb, dquot->dq_type)->u.v2_i.i, dquot);
+ return qtree_write_dquot(sb_dqinfo(dquot->dq_sb, dquot->dq_type)->dqi_priv, dquot);
}

static int v2_release_dquot(struct dquot *dquot)
{
- return qtree_release_dquot(&sb_dqinfo(dquot->dq_sb, dquot->dq_type)->u.v2_i.i, dquot);
+ return qtree_release_dquot(sb_dqinfo(dquot->dq_sb, dquot->dq_type)->dqi_priv, dquot);
+}
+
+static int v2_free_file_info(struct super_block *sb, int type)
+{
+ kfree(sb_dqinfo(sb, type)->dqi_priv);
+ return 0;
}

static struct quota_format_ops v2_format_ops = {
.check_quota_file = v2_check_quota_file,
.read_file_info = v2_read_file_info,
.write_file_info = v2_write_file_info,
- .free_file_info = NULL,
+ .free_file_info = v2_free_file_info,
.read_dqblk = v2_read_dquot,
.commit_dqblk = v2_write_dquot,
.release_dqblk = v2_release_dquot,
diff --git a/include/linux/dqblk_v1.h b/include/linux/dqblk_v1.h
index 57f1250..9cea901 100644
--- a/include/linux/dqblk_v1.h
+++ b/include/linux/dqblk_v1.h
@@ -17,8 +17,4 @@
#define V1_DEL_ALLOC 0
#define V1_DEL_REWRITE 2

-/* Special information about quotafile */
-struct v1_mem_dqinfo {
-};
-
#endif /* _LINUX_DQBLK_V1_H */
diff --git a/include/linux/dqblk_v2.h b/include/linux/dqblk_v2.h
index e5e22a7..ff8af1b 100644
--- a/include/linux/dqblk_v2.h
+++ b/include/linux/dqblk_v2.h
@@ -16,8 +16,4 @@
#define V2_DEL_ALLOC QTREE_DEL_ALLOC
#define V2_DEL_REWRITE QTREE_DEL_REWRITE

-struct v2_mem_dqinfo {
- struct qtree_mem_dqinfo i;
-};
-
#endif /* _LINUX_DQBLK_V2_H */
diff --git a/include/linux/quota.h b/include/linux/quota.h
index 80b8807..e51dfdc 100644
--- a/include/linux/quota.h
+++ b/include/linux/quota.h
@@ -208,10 +208,7 @@ struct mem_dqinfo {
unsigned int dqi_igrace;
qsize_t dqi_maxblimit;
qsize_t dqi_maxilimit;
- union {
- struct v1_mem_dqinfo v1_i;
- struct v2_mem_dqinfo v2_i;
- } u;
+ void *dqi_priv;
};

struct super_block;
--
1.5.6

2008-12-22 21:54:19

by Mark Fasheh

[permalink] [raw]
Subject: [PATCH 14/56] quota: Allow negative usage of space and inodes

From: Jan Kara <[email protected]>

For clustered filesystems, it can happen that space / inode usage
temporarily goes negative (because one node is allocating while another
node is freeing and they are not completely in sync). So let the quota
code allow this and change qsize_t to a signed type so that we don't
underflow the variables.
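
A tiny worked example of the failure mode (illustrative only; the function
name is made up):

/* Sketch: a temporarily "negative" delta as it can happen in a cluster
 * when one node accounts a free before it has seen the allocation. */
static void example_negative_usage(void)
{
        qsize_t usage = 4096;   /* bytes this node believes are charged */

        usage -= 8192;          /* remote free processed first */
        /* With signed qsize_t: usage == -4096 and gets fixed up at the
         * next quota sync.  With the old unsigned __u64 it would have
         * wrapped around to a huge value. */
}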

Signed-off-by: Jan Kara <[email protected]>
Signed-off-by: Mark Fasheh <[email protected]>
---
fs/dquot.c | 6 ++++--
include/linux/quota.h | 3 ++-
2 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/fs/dquot.c b/fs/dquot.c
index f4d6f7e..cf0dac7 100644
--- a/fs/dquot.c
+++ b/fs/dquot.c
@@ -847,7 +847,8 @@ static inline void dquot_incr_space(struct dquot *dquot, qsize_t number)

static inline void dquot_decr_inodes(struct dquot *dquot, qsize_t number)
{
- if (dquot->dq_dqb.dqb_curinodes > number)
+ if (sb_dqopt(dquot->dq_sb)->flags & DQUOT_NEGATIVE_USAGE ||
+ dquot->dq_dqb.dqb_curinodes >= number)
dquot->dq_dqb.dqb_curinodes -= number;
else
dquot->dq_dqb.dqb_curinodes = 0;
@@ -858,7 +859,8 @@ static inline void dquot_decr_inodes(struct dquot *dquot, qsize_t number)

static inline void dquot_decr_space(struct dquot *dquot, qsize_t number)
{
- if (dquot->dq_dqb.dqb_curspace > number)
+ if (sb_dqopt(dquot->dq_sb)->flags & DQUOT_NEGATIVE_USAGE ||
+ dquot->dq_dqb.dqb_curspace >= number)
dquot->dq_dqb.dqb_curspace -= number;
else
dquot->dq_dqb.dqb_curspace = 0;
diff --git a/include/linux/quota.h b/include/linux/quota.h
index e51dfdc..75bf761 100644
--- a/include/linux/quota.h
+++ b/include/linux/quota.h
@@ -168,7 +168,7 @@ enum {
#include <asm/atomic.h>

typedef __kernel_uid32_t qid_t; /* Type in which we store ids in memory */
-typedef __u64 qsize_t; /* Type in which we store sizes */
+typedef long long qsize_t; /* Type in which we store sizes */

extern spinlock_t dq_data_lock;

@@ -336,6 +336,7 @@ enum {
* responsible for setting
* S_NOQUOTA, S_NOATIME flags
*/
+#define DQUOT_NEGATIVE_USAGE (1 << 7) /* Allow negative quota usage */

static inline unsigned int dquot_state_flag(unsigned int flags, int type)
{
--
1.5.6

2008-12-22 21:54:38

by Mark Fasheh

[permalink] [raw]
Subject: [PATCH 15/56] quota: Keep which entries were set by SETQUOTA quotactl

From: Jan Kara <[email protected]>

Quota in a clustered environment needs to synchronize quota information
among cluster nodes. This means we have to occasionally update some
information in the dquot from disk / network. On the other hand, we have
to be careful not to overwrite changes the administrator made via
SETQUOTA. So indicate in dquot->dq_flags which entries have been set by
SETQUOTA; the quota format can clear these flags once it has properly
propagated the changes.
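
A short sketch of the consumer side (not part of this patch; the ocfs2
global-quota code later in the series uses the same test_bit() pattern,
while the "example_*" helpers here are invented):

/* Sketch: a quota format honouring the DQ_LASTSET_B marks. */
static void example_merge_from_disk(struct dquot *dquot, __le64 disk_curspace)
{
        /* Only overwrite fields the admin did not just change via SETQUOTA. */
        if (!test_bit(DQ_LASTSET_B + QIF_SPACE_B, &dquot->dq_flags))
                dquot->dq_dqb.dqb_curspace = le64_to_cpu(disk_curspace);
}

static void example_changes_propagated(struct dquot *dquot)
{
        /* Once the locally set value has reached the quota file, the
         * format may drop the mark again. */
        clear_bit(DQ_LASTSET_B + QIF_SPACE_B, &dquot->dq_flags);
}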

Signed-off-by: Jan Kara <[email protected]>
Signed-off-by: Mark Fasheh <[email protected]>
---
fs/dquot.c | 12 ++++++++++--
include/linux/quota.h | 26 ++++++++++++++++++++------
2 files changed, 30 insertions(+), 8 deletions(-)

diff --git a/fs/dquot.c b/fs/dquot.c
index cf0dac7..6784892 100644
--- a/fs/dquot.c
+++ b/fs/dquot.c
@@ -2010,25 +2010,33 @@ static int do_set_dqblk(struct dquot *dquot, struct if_dqblk *di)
if (di->dqb_valid & QIF_SPACE) {
dm->dqb_curspace = di->dqb_curspace;
check_blim = 1;
+ __set_bit(DQ_LASTSET_B + QIF_SPACE_B, &dquot->dq_flags);
}
if (di->dqb_valid & QIF_BLIMITS) {
dm->dqb_bsoftlimit = qbtos(di->dqb_bsoftlimit);
dm->dqb_bhardlimit = qbtos(di->dqb_bhardlimit);
check_blim = 1;
+ __set_bit(DQ_LASTSET_B + QIF_BLIMITS_B, &dquot->dq_flags);
}
if (di->dqb_valid & QIF_INODES) {
dm->dqb_curinodes = di->dqb_curinodes;
check_ilim = 1;
+ __set_bit(DQ_LASTSET_B + QIF_INODES_B, &dquot->dq_flags);
}
if (di->dqb_valid & QIF_ILIMITS) {
dm->dqb_isoftlimit = di->dqb_isoftlimit;
dm->dqb_ihardlimit = di->dqb_ihardlimit;
check_ilim = 1;
+ __set_bit(DQ_LASTSET_B + QIF_ILIMITS_B, &dquot->dq_flags);
}
- if (di->dqb_valid & QIF_BTIME)
+ if (di->dqb_valid & QIF_BTIME) {
dm->dqb_btime = di->dqb_btime;
- if (di->dqb_valid & QIF_ITIME)
+ __set_bit(DQ_LASTSET_B + QIF_BTIME_B, &dquot->dq_flags);
+ }
+ if (di->dqb_valid & QIF_ITIME) {
dm->dqb_itime = di->dqb_itime;
+ __set_bit(DQ_LASTSET_B + QIF_ITIME_B, &dquot->dq_flags);
+ }

if (check_blim) {
if (!dm->dqb_bsoftlimit || dm->dqb_curspace < dm->dqb_bsoftlimit) {
diff --git a/include/linux/quota.h b/include/linux/quota.h
index 75bf761..6d98885 100644
--- a/include/linux/quota.h
+++ b/include/linux/quota.h
@@ -80,12 +80,21 @@
* Quota structure used for communication with userspace via quotactl
* Following flags are used to specify which fields are valid
*/
-#define QIF_BLIMITS 1
-#define QIF_SPACE 2
-#define QIF_ILIMITS 4
-#define QIF_INODES 8
-#define QIF_BTIME 16
-#define QIF_ITIME 32
+enum {
+ QIF_BLIMITS_B = 0,
+ QIF_SPACE_B,
+ QIF_ILIMITS_B,
+ QIF_INODES_B,
+ QIF_BTIME_B,
+ QIF_ITIME_B,
+};
+
+#define QIF_BLIMITS (1 << QIF_BLIMITS_B)
+#define QIF_SPACE (1 << QIF_SPACE_B)
+#define QIF_ILIMITS (1 << QIF_ILIMITS_B)
+#define QIF_INODES (1 << QIF_INODES_B)
+#define QIF_BTIME (1 << QIF_BTIME_B)
+#define QIF_ITIME (1 << QIF_ITIME_B)
#define QIF_LIMITS (QIF_BLIMITS | QIF_ILIMITS)
#define QIF_USAGE (QIF_SPACE | QIF_INODES)
#define QIF_TIMES (QIF_BTIME | QIF_ITIME)
@@ -242,6 +251,11 @@ extern struct dqstats dqstats;
#define DQ_FAKE_B 3 /* no limits only usage */
#define DQ_READ_B 4 /* dquot was read into memory */
#define DQ_ACTIVE_B 5 /* dquot is active (dquot_release not called) */
+#define DQ_LASTSET_B 6 /* Following 6 bits (see QIF_) are reserved\
+ * for the mask of entries set via SETQUOTA\
+ * quotactl. They are set under dq_data_lock\
+ * and the quota format handling dquot can\
+ * clear them when it sees fit. */

struct dquot {
struct hlist_node dq_hash; /* Hash list in memory */
--
1.5.6

2008-12-22 21:54:55

by Mark Fasheh

[permalink] [raw]
Subject: [PATCH 16/56] quota: Update version number

From: Jan Kara <[email protected]>

Increase the reported version number of quota support since the quota
core has changed significantly. Also remove __DQUOT_NUM_VERSION__ since
nobody uses it.

Signed-off-by: Jan Kara <[email protected]>
Signed-off-by: Mark Fasheh <[email protected]>
---
include/linux/quota.h | 3 +--
1 files changed, 1 insertions(+), 2 deletions(-)

diff --git a/include/linux/quota.h b/include/linux/quota.h
index 6d98885..ec82beb 100644
--- a/include/linux/quota.h
+++ b/include/linux/quota.h
@@ -36,8 +36,7 @@
#include <linux/errno.h>
#include <linux/types.h>

-#define __DQUOT_VERSION__ "dquot_6.5.1"
-#define __DQUOT_NUM_VERSION__ 6*10000+5*100+1
+#define __DQUOT_VERSION__ "dquot_6.5.2"

#define MAXQUOTAS 2
#define USRQUOTA 0 /* element used for user quotas */
--
1.5.6

2008-12-22 21:55:21

by Mark Fasheh

[permalink] [raw]
Subject: [PATCH 17/56] quota: Add helpers to allow ocfs2 specific quota initialization, freeing and recovery

From: Jan Kara <[email protected]>

OCFS2 needs to check whether a quota structure is already in memory so
that it can avoid expensive cluster locking in that case. Similarly,
when freeing dquots, it needs to know whether it is the last user of the
quota structure. Finally, it needs to get a reference to the dquot
structure for a given id and quota type when recovering a quota file
after a crash.
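
Roughly how a cluster filesystem is expected to use the newly exported
helpers (a sketch under the prototypes added below, with an invented
function name; this is not ocfs2's actual code):

/* Sketch: callers must hold dqptr_sem or dqonoff_mutex, as noted in the
 * comments in fs/dquot.c. */
static int example_peek_and_sync(struct super_block *sb, unsigned int id,
                                 int type)
{
        struct dquot *dquot;

        if (!dquot_is_cached(sb, id, type))
                return 0;               /* nothing in memory, skip cluster lock */

        dquot = dqget(sb, id, type);    /* hold a reference while syncing */
        if (!dquot)
                return 0;
        /* ... refresh / push the structure under the cluster lock ... */
        dqput(dquot);
        return 0;
}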

Signed-off-by: Jan Kara <[email protected]>
Signed-off-by: Mark Fasheh <[email protected]>
---
fs/dquot.c | 38 ++++++++++++++++++++++++++++++++------
include/linux/quotaops.h | 4 ++++
2 files changed, 36 insertions(+), 6 deletions(-)

diff --git a/fs/dquot.c b/fs/dquot.c
index 6784892..8774be6 100644
--- a/fs/dquot.c
+++ b/fs/dquot.c
@@ -211,8 +211,6 @@ static struct hlist_head *dquot_hash;

struct dqstats dqstats;

-static void dqput(struct dquot *dquot);
-
static inline unsigned int
hashfn(const struct super_block *sb, unsigned int id, int type)
{
@@ -568,7 +566,7 @@ static struct shrinker dqcache_shrinker = {
* NOTE: If you change this function please check whether dqput_blocks() works right...
* MUST be called with either dqptr_sem or dqonoff_mutex held
*/
-static void dqput(struct dquot *dquot)
+void dqput(struct dquot *dquot)
{
int ret;

@@ -662,10 +660,28 @@ static struct dquot *get_empty_dquot(struct super_block *sb, int type)
}

/*
+ * Check whether dquot is in memory.
+ * MUST be called with either dqptr_sem or dqonoff_mutex held
+ */
+int dquot_is_cached(struct super_block *sb, unsigned int id, int type)
+{
+ unsigned int hashent = hashfn(sb, id, type);
+ int ret = 0;
+
+ if (!sb_has_quota_active(sb, type))
+ return 0;
+ spin_lock(&dq_list_lock);
+ if (find_dquot(hashent, sb, id, type) != NODQUOT)
+ ret = 1;
+ spin_unlock(&dq_list_lock);
+ return ret;
+}
+
+/*
* Get reference to dquot
* MUST be called with either dqptr_sem or dqonoff_mutex held
*/
-static struct dquot *dqget(struct super_block *sb, unsigned int id, int type)
+struct dquot *dqget(struct super_block *sb, unsigned int id, int type)
{
unsigned int hashent = hashfn(sb, id, type);
struct dquot *dquot, *empty = NODQUOT;
@@ -1184,17 +1200,23 @@ out_err:
* Release all quotas referenced by inode
* Transaction must be started at an entry
*/
-int dquot_drop(struct inode *inode)
+int dquot_drop_locked(struct inode *inode)
{
int cnt;

- down_write(&sb_dqopt(inode->i_sb)->dqptr_sem);
for (cnt = 0; cnt < MAXQUOTAS; cnt++) {
if (inode->i_dquot[cnt] != NODQUOT) {
dqput(inode->i_dquot[cnt]);
inode->i_dquot[cnt] = NODQUOT;
}
}
+ return 0;
+}
+
+int dquot_drop(struct inode *inode)
+{
+ down_write(&sb_dqopt(inode->i_sb)->dqptr_sem);
+ dquot_drop_locked(inode);
up_write(&sb_dqopt(inode->i_sb)->dqptr_sem);
return 0;
}
@@ -2308,7 +2330,11 @@ EXPORT_SYMBOL(dquot_release);
EXPORT_SYMBOL(dquot_mark_dquot_dirty);
EXPORT_SYMBOL(dquot_initialize);
EXPORT_SYMBOL(dquot_drop);
+EXPORT_SYMBOL(dquot_drop_locked);
EXPORT_SYMBOL(vfs_dq_drop);
+EXPORT_SYMBOL(dqget);
+EXPORT_SYMBOL(dqput);
+EXPORT_SYMBOL(dquot_is_cached);
EXPORT_SYMBOL(dquot_alloc_space);
EXPORT_SYMBOL(dquot_alloc_inode);
EXPORT_SYMBOL(dquot_free_space);
diff --git a/include/linux/quotaops.h b/include/linux/quotaops.h
index e840ca5..e3a1027 100644
--- a/include/linux/quotaops.h
+++ b/include/linux/quotaops.h
@@ -24,6 +24,10 @@ void sync_dquots(struct super_block *sb, int type);

int dquot_initialize(struct inode *inode, int type);
int dquot_drop(struct inode *inode);
+int dquot_drop_locked(struct inode *inode);
+struct dquot *dqget(struct super_block *sb, unsigned int id, int type);
+void dqput(struct dquot *dquot);
+int dquot_is_cached(struct super_block *sb, unsigned int id, int type);

int dquot_alloc_space(struct inode *inode, qsize_t number, int prealloc);
int dquot_alloc_inode(const struct inode *inode, qsize_t number);
--
1.5.6

2008-12-22 21:55:48

by Mark Fasheh

[permalink] [raw]
Subject: [PATCH 18/56] quota: Implement function for scanning active dquots

From: Jan Kara <[email protected]>

OCFS2 needs to scan all active dquots once in a while and sync quota
information among cluster nodes. Provide a helper function for this so
that it does not have to internally reimplement a list which the VFS
already maintains. Moreover, this function is probably going to be
useful for other clustered filesystems should they decide to use VFS
quotas.
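
A hedged sketch of a caller (invented names; the callback signature
matches the prototype added to quotaops.h below):

/* Sketch: visit every active dquot of a superblock.  The callback runs
 * on a referenced dquot; returning a negative value stops the scan. */
static int example_note_one(struct dquot *dquot, unsigned long priv)
{
        printk(KERN_DEBUG "active dquot: id %u type %d\n",
               dquot->dq_id, dquot->dq_type);
        return 0;
}

static int example_walk_active(struct super_block *sb)
{
        return dquot_scan_active(sb, example_note_one, 0);
}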

Signed-off-by: Jan Kara <[email protected]>
Signed-off-by: Mark Fasheh <[email protected]>
---
fs/dquot.c | 36 ++++++++++++++++++++++++++++++++++++
include/linux/quotaops.h | 3 +++
2 files changed, 39 insertions(+), 0 deletions(-)

diff --git a/fs/dquot.c b/fs/dquot.c
index 8774be6..6f7df91 100644
--- a/fs/dquot.c
+++ b/fs/dquot.c
@@ -476,6 +476,41 @@ restart:
spin_unlock(&dq_list_lock);
}

+/* Call callback for every active dquot on given filesystem */
+int dquot_scan_active(struct super_block *sb,
+ int (*fn)(struct dquot *dquot, unsigned long priv),
+ unsigned long priv)
+{
+ struct dquot *dquot, *old_dquot = NULL;
+ int ret = 0;
+
+ mutex_lock(&sb_dqopt(sb)->dqonoff_mutex);
+ spin_lock(&dq_list_lock);
+ list_for_each_entry(dquot, &inuse_list, dq_inuse) {
+ if (!test_bit(DQ_ACTIVE_B, &dquot->dq_flags))
+ continue;
+ if (dquot->dq_sb != sb)
+ continue;
+ /* Now we have active dquot so we can just increase use count */
+ atomic_inc(&dquot->dq_count);
+ dqstats.lookups++;
+ spin_unlock(&dq_list_lock);
+ dqput(old_dquot);
+ old_dquot = dquot;
+ ret = fn(dquot, priv);
+ if (ret < 0)
+ goto out;
+ spin_lock(&dq_list_lock);
+ /* We are safe to continue now because our dquot could not
+ * be moved out of the inuse list while we hold the reference */
+ }
+ spin_unlock(&dq_list_lock);
+out:
+ dqput(old_dquot);
+ mutex_unlock(&sb_dqopt(sb)->dqonoff_mutex);
+ return ret;
+}
+
int vfs_quota_sync(struct super_block *sb, int type)
{
struct list_head *dirty;
@@ -2318,6 +2353,7 @@ EXPORT_SYMBOL(vfs_quota_on_path);
EXPORT_SYMBOL(vfs_quota_on_mount);
EXPORT_SYMBOL(vfs_quota_disable);
EXPORT_SYMBOL(vfs_quota_off);
+EXPORT_SYMBOL(dquot_scan_active);
EXPORT_SYMBOL(vfs_quota_sync);
EXPORT_SYMBOL(vfs_get_dqinfo);
EXPORT_SYMBOL(vfs_set_dqinfo);
diff --git a/include/linux/quotaops.h b/include/linux/quotaops.h
index e3a1027..f491394 100644
--- a/include/linux/quotaops.h
+++ b/include/linux/quotaops.h
@@ -28,6 +28,9 @@ int dquot_drop_locked(struct inode *inode);
struct dquot *dqget(struct super_block *sb, unsigned int id, int type);
void dqput(struct dquot *dquot);
int dquot_is_cached(struct super_block *sb, unsigned int id, int type);
+int dquot_scan_active(struct super_block *sb,
+ int (*fn)(struct dquot *dquot, unsigned long priv),
+ unsigned long priv);

int dquot_alloc_space(struct inode *inode, qsize_t number, int prealloc);
int dquot_alloc_inode(const struct inode *inode, qsize_t number);
--
1.5.6

2008-12-22 21:56:07

by Mark Fasheh

[permalink] [raw]
Subject: [PATCH 19/56] mm: Export pdflush_operation()

From: Jan Kara <[email protected]>

OCFS2 will need to queue up work for periodic syncing of quotas among
nodes in the cluster. pdflush() is a good thread for this, so export its
controlling function so that OCFS2 can use it.
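
A hypothetical caller, for illustration only (the helper names are made
up; the signature of pdflush_operation() is visible in the hunk below):

/* Sketch: kick off periodic quota syncing on a pdflush thread. */
static void example_sync_quotas(unsigned long data)
{
        struct super_block *sb = (struct super_block *)data;

        sync_dquots(sb, -1);    /* hypothetical work: flush all quota types */
}

static void example_queue_quota_sync(struct super_block *sb)
{
        /* Returns nonzero if no idle pdflush thread was available. */
        pdflush_operation(example_sync_quotas, (unsigned long)sb);
}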

Signed-off-by: Jan Kara <[email protected]>
Signed-off-by: Mark Fasheh <[email protected]>
---
mm/pdflush.c | 1 +
1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/mm/pdflush.c b/mm/pdflush.c
index a0a14c4..13af84d 100644
--- a/mm/pdflush.c
+++ b/mm/pdflush.c
@@ -223,6 +223,7 @@ int pdflush_operation(void (*fn)(unsigned long), unsigned long arg0)

return ret;
}
+EXPORT_SYMBOL(pdflush_operation);

static void start_one_pdflush_thread(void)
{
--
1.5.6

2008-12-22 21:56:29

by Mark Fasheh

[permalink] [raw]
Subject: [PATCH 20/56] ocfs2: Support nested transactions

From: Jan Kara <[email protected]>

OCFS2 can easily support nested transactions. We just have to take care
not to spoil the statistics or acquire the j_trans_barrier semaphore
unnecessarily.
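
The effect on callers, as a sketch (invented function, error handling of
the inner transaction elided; ocfs2_start_trans()/ocfs2_commit_trans()
are the functions changed below):

/* Sketch: nesting is now legal; the inner pair reuses the outer handle. */
static int example_nested(struct ocfs2_super *osb)
{
        handle_t *outer, *inner;

        outer = ocfs2_start_trans(osb, 10);
        if (IS_ERR(outer))
                return PTR_ERR(outer);

        inner = ocfs2_start_trans(osb, 2);      /* same handle, h_ref bumped */
        /* ... journalled updates ... */
        ocfs2_commit_trans(osb, inner);         /* keeps j_trans_barrier held */

        return ocfs2_commit_trans(osb, outer);  /* outermost commit drops it */
}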

Signed-off-by: Jan Kara <[email protected]>
Signed-off-by: Mark Fasheh <[email protected]>
---
fs/ocfs2/journal.c | 14 +++++++-------
1 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/fs/ocfs2/journal.c b/fs/ocfs2/journal.c
index 12b62a3..11a1178 100644
--- a/fs/ocfs2/journal.c
+++ b/fs/ocfs2/journal.c
@@ -256,11 +256,9 @@ handle_t *ocfs2_start_trans(struct ocfs2_super *osb, int max_buffs)
BUG_ON(osb->journal->j_state == OCFS2_JOURNAL_FREE);
BUG_ON(max_buffs <= 0);

- /* JBD might support this, but our journalling code doesn't yet. */
- if (journal_current_handle()) {
- mlog(ML_ERROR, "Recursive transaction attempted!\n");
- BUG();
- }
+ /* Nested transaction? Just return the handle... */
+ if (journal_current_handle())
+ return jbd2_journal_start(journal, max_buffs);

down_read(&osb->journal->j_trans_barrier);

@@ -285,16 +283,18 @@ handle_t *ocfs2_start_trans(struct ocfs2_super *osb, int max_buffs)
int ocfs2_commit_trans(struct ocfs2_super *osb,
handle_t *handle)
{
- int ret;
+ int ret, nested;
struct ocfs2_journal *journal = osb->journal;

BUG_ON(!handle);

+ nested = handle->h_ref > 1;
ret = jbd2_journal_stop(handle);
if (ret < 0)
mlog_errno(ret);

- up_read(&journal->j_trans_barrier);
+ if (!nested)
+ up_read(&journal->j_trans_barrier);

return ret;
}
--
1.5.6

2008-12-22 21:56:46

by Mark Fasheh

[permalink] [raw]
Subject: [PATCH 21/56] ocfs2: Assign feature bits and system inodes to quota feature and quota files

From: Jan Kara <[email protected]>

Signed-off-by: Jan Kara <[email protected]>
Signed-off-by: Mark Fasheh <[email protected]>
---
fs/Kconfig | 2 ++
fs/ocfs2/inode.c | 2 ++
fs/ocfs2/ocfs2_fs.h | 21 ++++++++++++++++++---
fs/ocfs2/super.c | 17 +++++++++++++++++
4 files changed, 39 insertions(+), 3 deletions(-)

diff --git a/fs/Kconfig b/fs/Kconfig
index d99bc0a..107f1cd 100644
--- a/fs/Kconfig
+++ b/fs/Kconfig
@@ -189,6 +189,8 @@ config OCFS2_FS
select CONFIGFS_FS
select JBD2
select CRC32
+ select QUOTA
+ select QUOTA_TREE
help
OCFS2 is a general purpose extent based shared disk cluster file
system with many similarities to ext3. It supports 64 bit inode
diff --git a/fs/ocfs2/inode.c b/fs/ocfs2/inode.c
index ec3497b..ec25d99 100644
--- a/fs/ocfs2/inode.c
+++ b/fs/ocfs2/inode.c
@@ -283,6 +283,8 @@ void ocfs2_populate_inode(struct inode *inode, struct ocfs2_dinode *fe,
mlog(0, "local alloc inode: i_ino=%lu\n", inode->i_ino);
} else if (fe->i_flags & cpu_to_le32(OCFS2_BITMAP_FL)) {
OCFS2_I(inode)->ip_flags |= OCFS2_INODE_BITMAP;
+ } else if (fe->i_flags & cpu_to_le32(OCFS2_QUOTA_FL)) {
+ inode->i_flags |= S_NOQUOTA;
} else if (fe->i_flags & cpu_to_le32(OCFS2_SUPER_BLOCK_FL)) {
mlog(0, "superblock inode: i_ino=%lu\n", inode->i_ino);
/* we can't actually hit this as read_inode can't
diff --git a/fs/ocfs2/ocfs2_fs.h b/fs/ocfs2/ocfs2_fs.h
index 5e0c0d0..06e3bd6 100644
--- a/fs/ocfs2/ocfs2_fs.h
+++ b/fs/ocfs2/ocfs2_fs.h
@@ -94,7 +94,7 @@
| OCFS2_FEATURE_INCOMPAT_EXTENDED_SLOT_MAP \
| OCFS2_FEATURE_INCOMPAT_USERSPACE_STACK \
| OCFS2_FEATURE_INCOMPAT_XATTR)
-#define OCFS2_FEATURE_RO_COMPAT_SUPP OCFS2_FEATURE_RO_COMPAT_UNWRITTEN
+#define OCFS2_FEATURE_RO_COMPAT_SUPP (OCFS2_FEATURE_RO_COMPAT_UNWRITTEN)

/*
* Heartbeat-only devices are missing journals and other files. The
@@ -163,6 +163,12 @@
*/
#define OCFS2_FEATURE_RO_COMPAT_UNWRITTEN 0x0001

+/*
+ * Maintain quota information for this filesystem
+ */
+#define OCFS2_FEATURE_RO_COMPAT_USRQUOTA 0x0002
+#define OCFS2_FEATURE_RO_COMPAT_GRPQUOTA 0x0004
+
/* The byte offset of the first backup block will be 1G.
* The following will be 4G, 16G, 64G, 256G and 1T.
*/
@@ -192,6 +198,7 @@
#define OCFS2_HEARTBEAT_FL (0x00000200) /* Heartbeat area */
#define OCFS2_CHAIN_FL (0x00000400) /* Chain allocator */
#define OCFS2_DEALLOC_FL (0x00000800) /* Truncate log */
+#define OCFS2_QUOTA_FL (0x00001000) /* Quota file */

/*
* Flags on ocfs2_dinode.i_dyn_features
@@ -329,13 +336,17 @@ enum {
#define OCFS2_FIRST_ONLINE_SYSTEM_INODE SLOT_MAP_SYSTEM_INODE
HEARTBEAT_SYSTEM_INODE,
GLOBAL_BITMAP_SYSTEM_INODE,
-#define OCFS2_LAST_GLOBAL_SYSTEM_INODE GLOBAL_BITMAP_SYSTEM_INODE
+ USER_QUOTA_SYSTEM_INODE,
+ GROUP_QUOTA_SYSTEM_INODE,
+#define OCFS2_LAST_GLOBAL_SYSTEM_INODE GROUP_QUOTA_SYSTEM_INODE
ORPHAN_DIR_SYSTEM_INODE,
EXTENT_ALLOC_SYSTEM_INODE,
INODE_ALLOC_SYSTEM_INODE,
JOURNAL_SYSTEM_INODE,
LOCAL_ALLOC_SYSTEM_INODE,
TRUNCATE_LOG_SYSTEM_INODE,
+ LOCAL_USER_QUOTA_SYSTEM_INODE,
+ LOCAL_GROUP_QUOTA_SYSTEM_INODE,
NUM_SYSTEM_INODES
};

@@ -349,6 +360,8 @@ static struct ocfs2_system_inode_info ocfs2_system_inodes[NUM_SYSTEM_INODES] = {
[SLOT_MAP_SYSTEM_INODE] = { "slot_map", 0, S_IFREG | 0644 },
[HEARTBEAT_SYSTEM_INODE] = { "heartbeat", OCFS2_HEARTBEAT_FL, S_IFREG | 0644 },
[GLOBAL_BITMAP_SYSTEM_INODE] = { "global_bitmap", 0, S_IFREG | 0644 },
+ [USER_QUOTA_SYSTEM_INODE] = { "aquota.user", OCFS2_QUOTA_FL, S_IFREG | 0644 },
+ [GROUP_QUOTA_SYSTEM_INODE] = { "aquota.group", OCFS2_QUOTA_FL, S_IFREG | 0644 },

/* Slot-specific system inodes (one copy per slot) */
[ORPHAN_DIR_SYSTEM_INODE] = { "orphan_dir:%04d", 0, S_IFDIR | 0755 },
@@ -356,7 +369,9 @@ static struct ocfs2_system_inode_info ocfs2_system_inodes[NUM_SYSTEM_INODES] = {
[INODE_ALLOC_SYSTEM_INODE] = { "inode_alloc:%04d", OCFS2_BITMAP_FL | OCFS2_CHAIN_FL, S_IFREG | 0644 },
[JOURNAL_SYSTEM_INODE] = { "journal:%04d", OCFS2_JOURNAL_FL, S_IFREG | 0644 },
[LOCAL_ALLOC_SYSTEM_INODE] = { "local_alloc:%04d", OCFS2_BITMAP_FL | OCFS2_LOCAL_ALLOC_FL, S_IFREG | 0644 },
- [TRUNCATE_LOG_SYSTEM_INODE] = { "truncate_log:%04d", OCFS2_DEALLOC_FL, S_IFREG | 0644 }
+ [TRUNCATE_LOG_SYSTEM_INODE] = { "truncate_log:%04d", OCFS2_DEALLOC_FL, S_IFREG | 0644 },
+ [LOCAL_USER_QUOTA_SYSTEM_INODE] = { "aquota.user:%04d", OCFS2_QUOTA_FL, S_IFREG | 0644 },
+ [LOCAL_GROUP_QUOTA_SYSTEM_INODE] = { "aquota.group:%04d", OCFS2_QUOTA_FL, S_IFREG | 0644 },
};

/* Parameter passed from mount.ocfs2 to module */
diff --git a/fs/ocfs2/super.c b/fs/ocfs2/super.c
index 9e7accc..41bb019 100644
--- a/fs/ocfs2/super.c
+++ b/fs/ocfs2/super.c
@@ -225,6 +225,19 @@ static int ocfs2_sync_fs(struct super_block *sb, int wait)
return 0;
}

+static int ocfs2_need_system_inode(struct ocfs2_super *osb, int ino)
+{
+ if (!OCFS2_HAS_RO_COMPAT_FEATURE(osb->sb, OCFS2_FEATURE_RO_COMPAT_USRQUOTA)
+ && (ino == USER_QUOTA_SYSTEM_INODE
+ || ino == LOCAL_USER_QUOTA_SYSTEM_INODE))
+ return 0;
+ if (!OCFS2_HAS_RO_COMPAT_FEATURE(osb->sb, OCFS2_FEATURE_RO_COMPAT_GRPQUOTA)
+ && (ino == GROUP_QUOTA_SYSTEM_INODE
+ || ino == LOCAL_GROUP_QUOTA_SYSTEM_INODE))
+ return 0;
+ return 1;
+}
+
static int ocfs2_init_global_system_inodes(struct ocfs2_super *osb)
{
struct inode *new = NULL;
@@ -251,6 +264,8 @@ static int ocfs2_init_global_system_inodes(struct ocfs2_super *osb)

for (i = OCFS2_FIRST_ONLINE_SYSTEM_INODE;
i <= OCFS2_LAST_GLOBAL_SYSTEM_INODE; i++) {
+ if (!ocfs2_need_system_inode(osb, i))
+ continue;
new = ocfs2_get_system_file_inode(osb, i, osb->slot_num);
if (!new) {
ocfs2_release_system_inodes(osb);
@@ -281,6 +296,8 @@ static int ocfs2_init_local_system_inodes(struct ocfs2_super *osb)
for (i = OCFS2_LAST_GLOBAL_SYSTEM_INODE + 1;
i < NUM_SYSTEM_INODES;
i++) {
+ if (!ocfs2_need_system_inode(osb, i))
+ continue;
new = ocfs2_get_system_file_inode(osb, i, osb->slot_num);
if (!new) {
ocfs2_release_system_inodes(osb);
--
1.5.6

2008-12-22 21:57:11

by Mark Fasheh

[permalink] [raw]
Subject: [PATCH 22/56] ocfs2: Mark system files as not subject to quota accounting

From: Jan Kara <[email protected]>

Mark system files as not subject to quota accounting. This prevents
possible recursion into the quota code and thus deadlocks.
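
For context, a small sketch of why this works (invented helper; the
generic quota entry points such as dquot_initialize() bail out early for
inodes carrying this flag):

/* Sketch: writes to system/quota files never re-enter the quota code
 * because the VFS quota hooks test S_NOQUOTA first. */
static int example_is_accounted(struct inode *inode)
{
        return !IS_NOQUOTA(inode);
}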

Signed-off-by: Jan Kara <[email protected]>
Signed-off-by: Mark Fasheh <[email protected]>
---
fs/ocfs2/inode.c | 4 +++-
1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/fs/ocfs2/inode.c b/fs/ocfs2/inode.c
index ec25d99..50dbc48 100644
--- a/fs/ocfs2/inode.c
+++ b/fs/ocfs2/inode.c
@@ -275,8 +275,10 @@ void ocfs2_populate_inode(struct inode *inode, struct ocfs2_dinode *fe,

inode->i_nlink = le16_to_cpu(fe->i_links_count);

- if (fe->i_flags & cpu_to_le32(OCFS2_SYSTEM_FL))
+ if (fe->i_flags & cpu_to_le32(OCFS2_SYSTEM_FL)) {
OCFS2_I(inode)->ip_flags |= OCFS2_INODE_SYSTEM_FILE;
+ inode->i_flags |= S_NOQUOTA;
+ }

if (fe->i_flags & cpu_to_le32(OCFS2_LOCAL_ALLOC_FL)) {
OCFS2_I(inode)->ip_flags |= OCFS2_INODE_BITMAP;
--
1.5.6

2008-12-22 21:57:29

by Mark Fasheh

[permalink] [raw]
Subject: [PATCH 23/56] ocfs2: Implementation of local and global quota file handling

From: Jan Kara <[email protected]>

For each quota type, each node has a local quota file. In this file it
stores the changes users have made to disk usage via this node. Once in a
while this information is synced to the global file (and thus with the
other nodes) so that limit enforcement works at least approximately.

The global quota files contain all the information about usage and limits.
They are mostly handled by the generic VFS code (which implements a trie of
structures inside a quota file). We only have to provide functions to
convert structures from the on-disk format to the in-memory one. We also
have to provide wrappers for various quota functions that start
transactions and acquire the necessary cluster locks before the actual I/O
is started.
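
Conceptually, the periodic sync folds the node-local deltas into the
global structure; a sketch using the on-disk types introduced below
(invented helper name, and unlike the real code it ignores cluster
locking and journalling):

/* Sketch: apply one node-local change record to the global dquot block.
 * dqb_spacemod / dqb_inodemod are signed deltas recorded by this node. */
static void example_apply_local_delta(struct ocfs2_global_disk_dqblk *gd,
                                      struct ocfs2_local_disk_dqblk *ld)
{
        le64_add_cpu(&gd->dqb_curspace, le64_to_cpu(ld->dqb_spacemod));
        le64_add_cpu(&gd->dqb_curinodes, le64_to_cpu(ld->dqb_inodemod));
}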

Signed-off-by: Jan Kara <[email protected]>
Signed-off-by: Mark Fasheh <[email protected]>
---
fs/ocfs2/Makefile | 2 +
fs/ocfs2/cluster/masklog.h | 1 +
fs/ocfs2/dlmglue.c | 146 +++++++
fs/ocfs2/dlmglue.h | 19 +
fs/ocfs2/file.c | 6 +-
fs/ocfs2/file.h | 3 +
fs/ocfs2/inode.h | 2 +
fs/ocfs2/ocfs2_fs.h | 103 +++++
fs/ocfs2/ocfs2_lockid.h | 5 +
fs/ocfs2/quota.h | 93 +++++
fs/ocfs2/quota_global.c | 919 ++++++++++++++++++++++++++++++++++++++++++++
fs/ocfs2/quota_local.c | 833 +++++++++++++++++++++++++++++++++++++++
fs/ocfs2/super.c | 38 ++-
13 files changed, 2165 insertions(+), 5 deletions(-)
create mode 100644 fs/ocfs2/quota.h
create mode 100644 fs/ocfs2/quota_global.c
create mode 100644 fs/ocfs2/quota_local.c

diff --git a/fs/ocfs2/Makefile b/fs/ocfs2/Makefile
index e9ef5d1..7e4b361 100644
--- a/fs/ocfs2/Makefile
+++ b/fs/ocfs2/Makefile
@@ -35,6 +35,8 @@ ocfs2-objs := \
sysfile.o \
uptodate.o \
ver.o \
+ quota_local.o \
+ quota_global.o \
xattr.o

ifeq ($(CONFIG_OCFS2_FS_POSIX_ACL),y)
diff --git a/fs/ocfs2/cluster/masklog.h b/fs/ocfs2/cluster/masklog.h
index 57670c6..7e72a81 100644
--- a/fs/ocfs2/cluster/masklog.h
+++ b/fs/ocfs2/cluster/masklog.h
@@ -113,6 +113,7 @@
#define ML_QUORUM 0x0000000008000000ULL /* net connection quorum */
#define ML_EXPORT 0x0000000010000000ULL /* ocfs2 export operations */
#define ML_XATTR 0x0000000020000000ULL /* ocfs2 extended attributes */
+#define ML_QUOTA 0x0000000040000000ULL /* ocfs2 quota operations */
/* bits that are infrequently given and frequently matched in the high word */
#define ML_ERROR 0x0000000100000000ULL /* sent to KERN_ERR */
#define ML_NOTICE 0x0000000200000000ULL /* setn to KERN_NOTICE */
diff --git a/fs/ocfs2/dlmglue.c b/fs/ocfs2/dlmglue.c
index 9f2a7f7..058aa86 100644
--- a/fs/ocfs2/dlmglue.c
+++ b/fs/ocfs2/dlmglue.c
@@ -32,6 +32,7 @@
#include <linux/debugfs.h>
#include <linux/seq_file.h>
#include <linux/time.h>
+#include <linux/quotaops.h>

#define MLOG_MASK_PREFIX ML_DLM_GLUE
#include <cluster/masklog.h>
@@ -51,6 +52,7 @@
#include "slot_map.h"
#include "super.h"
#include "uptodate.h"
+#include "quota.h"

#include "buffer_head_io.h"

@@ -68,6 +70,7 @@ struct ocfs2_mask_waiter {
static struct ocfs2_super *ocfs2_get_dentry_osb(struct ocfs2_lock_res *lockres);
static struct ocfs2_super *ocfs2_get_inode_osb(struct ocfs2_lock_res *lockres);
static struct ocfs2_super *ocfs2_get_file_osb(struct ocfs2_lock_res *lockres);
+static struct ocfs2_super *ocfs2_get_qinfo_osb(struct ocfs2_lock_res *lockres);

/*
* Return value from ->downconvert_worker functions.
@@ -102,6 +105,7 @@ static int ocfs2_dentry_convert_worker(struct ocfs2_lock_res *lockres,
static void ocfs2_dentry_post_unlock(struct ocfs2_super *osb,
struct ocfs2_lock_res *lockres);

+static void ocfs2_set_qinfo_lvb(struct ocfs2_lock_res *lockres);

#define mlog_meta_lvb(__level, __lockres) ocfs2_dump_meta_lvb_info(__level, __PRETTY_FUNCTION__, __LINE__, __lockres)

@@ -258,6 +262,12 @@ static struct ocfs2_lock_res_ops ocfs2_flock_lops = {
.flags = 0,
};

+static struct ocfs2_lock_res_ops ocfs2_qinfo_lops = {
+ .set_lvb = ocfs2_set_qinfo_lvb,
+ .get_osb = ocfs2_get_qinfo_osb,
+ .flags = LOCK_TYPE_REQUIRES_REFRESH | LOCK_TYPE_USES_LVB,
+};
+
static inline int ocfs2_is_inode_lock(struct ocfs2_lock_res *lockres)
{
return lockres->l_type == OCFS2_LOCK_TYPE_META ||
@@ -279,6 +289,13 @@ static inline struct ocfs2_dentry_lock *ocfs2_lock_res_dl(struct ocfs2_lock_res
return (struct ocfs2_dentry_lock *)lockres->l_priv;
}

+static inline struct ocfs2_mem_dqinfo *ocfs2_lock_res_qinfo(struct ocfs2_lock_res *lockres)
+{
+ BUG_ON(lockres->l_type != OCFS2_LOCK_TYPE_QINFO);
+
+ return (struct ocfs2_mem_dqinfo *)lockres->l_priv;
+}
+
static inline struct ocfs2_super *ocfs2_get_lockres_osb(struct ocfs2_lock_res *lockres)
{
if (lockres->l_ops->get_osb)
@@ -507,6 +524,13 @@ static struct ocfs2_super *ocfs2_get_inode_osb(struct ocfs2_lock_res *lockres)
return OCFS2_SB(inode->i_sb);
}

+static struct ocfs2_super *ocfs2_get_qinfo_osb(struct ocfs2_lock_res *lockres)
+{
+ struct ocfs2_mem_dqinfo *info = lockres->l_priv;
+
+ return OCFS2_SB(info->dqi_gi.dqi_sb);
+}
+
static struct ocfs2_super *ocfs2_get_file_osb(struct ocfs2_lock_res *lockres)
{
struct ocfs2_file_private *fp = lockres->l_priv;
@@ -609,6 +633,17 @@ void ocfs2_file_lock_res_init(struct ocfs2_lock_res *lockres,
lockres->l_flags |= OCFS2_LOCK_NOCACHE;
}

+void ocfs2_qinfo_lock_res_init(struct ocfs2_lock_res *lockres,
+ struct ocfs2_mem_dqinfo *info)
+{
+ ocfs2_lock_res_init_once(lockres);
+ ocfs2_build_lock_name(OCFS2_LOCK_TYPE_QINFO, info->dqi_gi.dqi_type,
+ 0, lockres->l_name);
+ ocfs2_lock_res_init_common(OCFS2_SB(info->dqi_gi.dqi_sb), lockres,
+ OCFS2_LOCK_TYPE_QINFO, &ocfs2_qinfo_lops,
+ info);
+}
+
void ocfs2_lock_res_free(struct ocfs2_lock_res *res)
{
mlog_entry_void();
@@ -3445,6 +3480,117 @@ static int ocfs2_dentry_convert_worker(struct ocfs2_lock_res *lockres,
return UNBLOCK_CONTINUE_POST;
}

+static void ocfs2_set_qinfo_lvb(struct ocfs2_lock_res *lockres)
+{
+ struct ocfs2_qinfo_lvb *lvb;
+ struct ocfs2_mem_dqinfo *oinfo = ocfs2_lock_res_qinfo(lockres);
+ struct mem_dqinfo *info = sb_dqinfo(oinfo->dqi_gi.dqi_sb,
+ oinfo->dqi_gi.dqi_type);
+
+ mlog_entry_void();
+
+ lvb = (struct ocfs2_qinfo_lvb *)ocfs2_dlm_lvb(&lockres->l_lksb);
+ lvb->lvb_version = OCFS2_QINFO_LVB_VERSION;
+ lvb->lvb_bgrace = cpu_to_be32(info->dqi_bgrace);
+ lvb->lvb_igrace = cpu_to_be32(info->dqi_igrace);
+ lvb->lvb_syncms = cpu_to_be32(oinfo->dqi_syncms);
+ lvb->lvb_blocks = cpu_to_be32(oinfo->dqi_gi.dqi_blocks);
+ lvb->lvb_free_blk = cpu_to_be32(oinfo->dqi_gi.dqi_free_blk);
+ lvb->lvb_free_entry = cpu_to_be32(oinfo->dqi_gi.dqi_free_entry);
+
+ mlog_exit_void();
+}
+
+void ocfs2_qinfo_unlock(struct ocfs2_mem_dqinfo *oinfo, int ex)
+{
+ struct ocfs2_lock_res *lockres = &oinfo->dqi_gqlock;
+ struct ocfs2_super *osb = OCFS2_SB(oinfo->dqi_gi.dqi_sb);
+ int level = ex ? DLM_LOCK_EX : DLM_LOCK_PR;
+
+ mlog_entry_void();
+ if (!ocfs2_is_hard_readonly(osb) && !ocfs2_mount_local(osb))
+ ocfs2_cluster_unlock(osb, lockres, level);
+ mlog_exit_void();
+}
+
+static int ocfs2_refresh_qinfo(struct ocfs2_mem_dqinfo *oinfo)
+{
+ struct mem_dqinfo *info = sb_dqinfo(oinfo->dqi_gi.dqi_sb,
+ oinfo->dqi_gi.dqi_type);
+ struct ocfs2_lock_res *lockres = &oinfo->dqi_gqlock;
+ struct ocfs2_qinfo_lvb *lvb = ocfs2_dlm_lvb(&lockres->l_lksb);
+ struct buffer_head *bh;
+ struct ocfs2_global_disk_dqinfo *gdinfo;
+ int status = 0;
+
+ if (lvb->lvb_version == OCFS2_QINFO_LVB_VERSION) {
+ info->dqi_bgrace = be32_to_cpu(lvb->lvb_bgrace);
+ info->dqi_igrace = be32_to_cpu(lvb->lvb_igrace);
+ oinfo->dqi_syncms = be32_to_cpu(lvb->lvb_syncms);
+ oinfo->dqi_gi.dqi_blocks = be32_to_cpu(lvb->lvb_blocks);
+ oinfo->dqi_gi.dqi_free_blk = be32_to_cpu(lvb->lvb_free_blk);
+ oinfo->dqi_gi.dqi_free_entry =
+ be32_to_cpu(lvb->lvb_free_entry);
+ } else {
+ bh = ocfs2_read_quota_block(oinfo->dqi_gqinode, 0, &status);
+ if (!bh) {
+ mlog_errno(status);
+ goto bail;
+ }
+ gdinfo = (struct ocfs2_global_disk_dqinfo *)
+ (bh->b_data + OCFS2_GLOBAL_INFO_OFF);
+ info->dqi_bgrace = le32_to_cpu(gdinfo->dqi_bgrace);
+ info->dqi_igrace = le32_to_cpu(gdinfo->dqi_igrace);
+ oinfo->dqi_syncms = le32_to_cpu(gdinfo->dqi_syncms);
+ oinfo->dqi_gi.dqi_blocks = le32_to_cpu(gdinfo->dqi_blocks);
+ oinfo->dqi_gi.dqi_free_blk = le32_to_cpu(gdinfo->dqi_free_blk);
+ oinfo->dqi_gi.dqi_free_entry =
+ le32_to_cpu(gdinfo->dqi_free_entry);
+ brelse(bh);
+ ocfs2_track_lock_refresh(lockres);
+ }
+
+bail:
+ return status;
+}
+
+/* Lock quota info, this function expects at least shared lock on the quota file
+ * so that we can safely refresh quota info from disk. */
+int ocfs2_qinfo_lock(struct ocfs2_mem_dqinfo *oinfo, int ex)
+{
+ struct ocfs2_lock_res *lockres = &oinfo->dqi_gqlock;
+ struct ocfs2_super *osb = OCFS2_SB(oinfo->dqi_gi.dqi_sb);
+ int level = ex ? DLM_LOCK_EX : DLM_LOCK_PR;
+ int status = 0;
+
+ mlog_entry_void();
+
+ /* On RO devices, locking really isn't needed... */
+ if (ocfs2_is_hard_readonly(osb)) {
+ if (ex)
+ status = -EROFS;
+ goto bail;
+ }
+ if (ocfs2_mount_local(osb))
+ goto bail;
+
+ status = ocfs2_cluster_lock(osb, lockres, level, 0, 0);
+ if (status < 0) {
+ mlog_errno(status);
+ goto bail;
+ }
+ if (!ocfs2_should_refresh_lock_res(lockres))
+ goto bail;
+ /* OK, we have the lock but we need to refresh the quota info */
+ status = ocfs2_refresh_qinfo(oinfo);
+ if (status)
+ ocfs2_qinfo_unlock(oinfo, ex);
+ ocfs2_complete_lock_res_refresh(lockres, status);
+bail:
+ mlog_exit(status);
+ return status;
+}
+
/*
* This is the filesystem locking protocol. It provides the lock handling
* hooks for the underlying DLM. It has a maximum version number.
diff --git a/fs/ocfs2/dlmglue.h b/fs/ocfs2/dlmglue.h
index 2bb01f0..3f8d998 100644
--- a/fs/ocfs2/dlmglue.h
+++ b/fs/ocfs2/dlmglue.h
@@ -49,6 +49,19 @@ struct ocfs2_meta_lvb {
__be32 lvb_reserved2;
};

+#define OCFS2_QINFO_LVB_VERSION 1
+
+struct ocfs2_qinfo_lvb {
+ __u8 lvb_version;
+ __u8 lvb_reserved[3];
+ __be32 lvb_bgrace;
+ __be32 lvb_igrace;
+ __be32 lvb_syncms;
+ __be32 lvb_blocks;
+ __be32 lvb_free_blk;
+ __be32 lvb_free_entry;
+};
+
/* ocfs2_inode_lock_full() 'arg_flags' flags */
/* don't wait on recovery. */
#define OCFS2_META_LOCK_RECOVERY (0x01)
@@ -69,6 +82,9 @@ void ocfs2_dentry_lock_res_init(struct ocfs2_dentry_lock *dl,
struct ocfs2_file_private;
void ocfs2_file_lock_res_init(struct ocfs2_lock_res *lockres,
struct ocfs2_file_private *fp);
+struct ocfs2_mem_dqinfo;
+void ocfs2_qinfo_lock_res_init(struct ocfs2_lock_res *lockres,
+ struct ocfs2_mem_dqinfo *info);
void ocfs2_lock_res_free(struct ocfs2_lock_res *res);
int ocfs2_create_new_inode_locks(struct inode *inode);
int ocfs2_drop_inode_locks(struct inode *inode);
@@ -103,6 +119,9 @@ int ocfs2_dentry_lock(struct dentry *dentry, int ex);
void ocfs2_dentry_unlock(struct dentry *dentry, int ex);
int ocfs2_file_lock(struct file *file, int ex, int trylock);
void ocfs2_file_unlock(struct file *file);
+int ocfs2_qinfo_lock(struct ocfs2_mem_dqinfo *oinfo, int ex);
+void ocfs2_qinfo_unlock(struct ocfs2_mem_dqinfo *oinfo, int ex);
+

void ocfs2_mark_lockres_freeing(struct ocfs2_lock_res *lockres);
void ocfs2_simple_drop_lockres(struct ocfs2_super *osb,
diff --git a/fs/ocfs2/file.c b/fs/ocfs2/file.c
index 41001d5..372d965 100644
--- a/fs/ocfs2/file.c
+++ b/fs/ocfs2/file.c
@@ -304,9 +304,9 @@ bail:
return status;
}

-static int ocfs2_simple_size_update(struct inode *inode,
- struct buffer_head *di_bh,
- u64 new_i_size)
+int ocfs2_simple_size_update(struct inode *inode,
+ struct buffer_head *di_bh,
+ u64 new_i_size)
{
int ret;
struct ocfs2_super *osb = OCFS2_SB(inode->i_sb);
diff --git a/fs/ocfs2/file.h b/fs/ocfs2/file.h
index e92382c..172f9fb 100644
--- a/fs/ocfs2/file.h
+++ b/fs/ocfs2/file.h
@@ -51,6 +51,9 @@ int ocfs2_add_inode_data(struct ocfs2_super *osb,
struct ocfs2_alloc_context *data_ac,
struct ocfs2_alloc_context *meta_ac,
enum ocfs2_alloc_restarted *reason_ret);
+int ocfs2_simple_size_update(struct inode *inode,
+ struct buffer_head *di_bh,
+ u64 new_i_size);
int ocfs2_extend_no_holes(struct inode *inode, u64 new_i_size,
u64 zero_to);
int ocfs2_setattr(struct dentry *dentry, struct iattr *attr);
diff --git a/fs/ocfs2/inode.h b/fs/ocfs2/inode.h
index b79c371..eb3c302 100644
--- a/fs/ocfs2/inode.h
+++ b/fs/ocfs2/inode.h
@@ -142,6 +142,8 @@ int ocfs2_mark_inode_dirty(handle_t *handle,
struct buffer_head *bh);
int ocfs2_aio_read(struct file *file, struct kiocb *req, struct iocb *iocb);
int ocfs2_aio_write(struct file *file, struct kiocb *req, struct iocb *iocb);
+struct buffer_head *ocfs2_bread(struct inode *inode,
+ int block, int *err, int reada);

void ocfs2_set_inode_flags(struct inode *inode);
void ocfs2_get_inode_flags(struct ocfs2_inode_info *oi);
diff --git a/fs/ocfs2/ocfs2_fs.h b/fs/ocfs2/ocfs2_fs.h
index 06e3bd6..0a5ac79 100644
--- a/fs/ocfs2/ocfs2_fs.h
+++ b/fs/ocfs2/ocfs2_fs.h
@@ -883,6 +883,109 @@ static inline int ocfs2_xattr_get_type(struct ocfs2_xattr_entry *xe)
return xe->xe_type & OCFS2_XATTR_TYPE_MASK;
}

+/*
+ * On disk structures for global quota file
+ */
+
+/* Magic numbers and known versions for global quota files */
+#define OCFS2_GLOBAL_QMAGICS {\
+ 0x0cf52470, /* USRQUOTA */ \
+ 0x0cf52471 /* GRPQUOTA */ \
+}
+
+#define OCFS2_GLOBAL_QVERSIONS {\
+ 0, \
+ 0, \
+}
+
+
+/* Each block of each quota file has a certain fixed number of bytes reserved
+ * for OCFS2 internal use at its end. OCFS2 can use it for things like
+ * checksums, etc. */
+#define OCFS2_QBLK_RESERVED_SPACE 8
+
+/* Generic header of all quota files */
+struct ocfs2_disk_dqheader {
+ __le32 dqh_magic; /* Magic number identifying file */
+ __le32 dqh_version; /* Quota format version */
+};
+
+#define OCFS2_GLOBAL_INFO_OFF (sizeof(struct ocfs2_disk_dqheader))
+
+/* Information header of global quota file (immediately follows the generic
+ * header) */
+struct ocfs2_global_disk_dqinfo {
+/*00*/ __le32 dqi_bgrace; /* Grace time for space softlimit excess */
+ __le32 dqi_igrace; /* Grace time for inode softlimit excess */
+ __le32 dqi_syncms; /* Time after which we sync local changes to
+ * global quota file */
+ __le32 dqi_blocks; /* Number of blocks in quota file */
+/*10*/ __le32 dqi_free_blk; /* First free block in quota file */
+ __le32 dqi_free_entry; /* First block with free dquot entry in quota
+ * file */
+};
+
+/* Structure with global user / group information. We reserve some space
+ * for future use. */
+struct ocfs2_global_disk_dqblk {
+/*00*/ __le32 dqb_id; /* ID the structure belongs to */
+ __le32 dqb_use_count; /* Number of nodes having reference to this structure */
+ __le64 dqb_ihardlimit; /* absolute limit on allocated inodes */
+/*10*/ __le64 dqb_isoftlimit; /* preferred inode limit */
+ __le64 dqb_curinodes; /* current # allocated inodes */
+/*20*/ __le64 dqb_bhardlimit; /* absolute limit on disk space */
+ __le64 dqb_bsoftlimit; /* preferred limit on disk space */
+/*30*/ __le64 dqb_curspace; /* current space occupied */
+ __le64 dqb_btime; /* time limit for excessive disk use */
+/*40*/ __le64 dqb_itime; /* time limit for excessive inode use */
+ __le64 dqb_pad1;
+/*50*/ __le64 dqb_pad2;
+};
+
+/*
+ * On-disk structures for local quota file
+ */
+
+/* Magic numbers and known versions for local quota files */
+#define OCFS2_LOCAL_QMAGICS {\
+ 0x0cf524c0, /* USRQUOTA */ \
+ 0x0cf524c1 /* GRPQUOTA */ \
+}
+
+#define OCFS2_LOCAL_QVERSIONS {\
+ 0, \
+ 0, \
+}
+
+/* Quota flags in dqinfo header */
+#define OLQF_CLEAN 0x0001 /* Quota file is empty (this should be after\
+ * quota has been cleanly turned off) */
+
+#define OCFS2_LOCAL_INFO_OFF (sizeof(struct ocfs2_disk_dqheader))
+
+/* Information header of local quota file (immediately follows the generic
+ * header) */
+struct ocfs2_local_disk_dqinfo {
+ __le32 dqi_flags; /* Flags for quota file */
+ __le32 dqi_chunks; /* Number of chunks of quota structures
+ * with a bitmap */
+ __le32 dqi_blocks; /* Number of blocks allocated for quota file */
+};
+
+/* Header of one chunk of a quota file */
+struct ocfs2_local_disk_chunk {
+ __le32 dqc_free; /* Number of free entries in the bitmap */
+ u8 dqc_bitmap[0]; /* Bitmap of entries in the corresponding
+ * chunk of quota file */
+};
+
+/* One entry in local quota file */
+struct ocfs2_local_disk_dqblk {
+/*00*/ __le64 dqb_id; /* id this quota applies to */
+ __le64 dqb_spacemod; /* Change in the amount of used space */
+/*10*/ __le64 dqb_inodemod; /* Change in the amount of used inodes */
+};
+
#ifdef __KERNEL__
static inline int ocfs2_fast_symlink_chars(struct super_block *sb)
{
diff --git a/fs/ocfs2/ocfs2_lockid.h b/fs/ocfs2/ocfs2_lockid.h
index 82c200f..eb6f50c 100644
--- a/fs/ocfs2/ocfs2_lockid.h
+++ b/fs/ocfs2/ocfs2_lockid.h
@@ -46,6 +46,7 @@ enum ocfs2_lock_type {
OCFS2_LOCK_TYPE_DENTRY,
OCFS2_LOCK_TYPE_OPEN,
OCFS2_LOCK_TYPE_FLOCK,
+ OCFS2_LOCK_TYPE_QINFO,
OCFS2_NUM_LOCK_TYPES
};

@@ -77,6 +78,9 @@ static inline char ocfs2_lock_type_char(enum ocfs2_lock_type type)
case OCFS2_LOCK_TYPE_FLOCK:
c = 'F';
break;
+ case OCFS2_LOCK_TYPE_QINFO:
+ c = 'Q';
+ break;
default:
c = '\0';
}
@@ -95,6 +99,7 @@ static char *ocfs2_lock_type_strings[] = {
[OCFS2_LOCK_TYPE_DENTRY] = "Dentry",
[OCFS2_LOCK_TYPE_OPEN] = "Open",
[OCFS2_LOCK_TYPE_FLOCK] = "Flock",
+ [OCFS2_LOCK_TYPE_QINFO] = "Quota",
};

static inline const char *ocfs2_lock_type_string(enum ocfs2_lock_type type)
diff --git a/fs/ocfs2/quota.h b/fs/ocfs2/quota.h
new file mode 100644
index 0000000..1f1c863
--- /dev/null
+++ b/fs/ocfs2/quota.h
@@ -0,0 +1,93 @@
+/*
+ * quota.h for OCFS2
+ *
+ * On disk quota structures for local and global quota file, in-memory
+ * structures.
+ *
+ */
+
+#ifndef _OCFS2_QUOTA_H
+#define _OCFS2_QUOTA_H
+
+#include <linux/types.h>
+#include <linux/slab.h>
+#include <linux/quota.h>
+#include <linux/list.h>
+#include <linux/dqblk_qtree.h>
+
+#include "ocfs2.h"
+
+/* Common stuff */
+/* id number of quota format */
+#define QFMT_OCFS2 3
+
+/*
+ * In-memory structures
+ */
+struct ocfs2_dquot {
+ struct dquot dq_dquot; /* Generic VFS dquot */
+ loff_t dq_local_off; /* Offset in the local quota file */
+ struct ocfs2_quota_chunk *dq_chunk; /* Chunk dquot is in */
+ unsigned int dq_use_count; /* Number of nodes having reference to this entry in global quota file */
+ s64 dq_origspace; /* Last globally synced space usage */
+ s64 dq_originodes; /* Last globally synced inode usage */
+};
+
+/* In-memory structure with quota header information */
+struct ocfs2_mem_dqinfo {
+ unsigned int dqi_type; /* Quota type this structure describes */
+ unsigned int dqi_chunks; /* Number of chunks in local quota file */
+ unsigned int dqi_blocks; /* Number of blocks allocated for local quota file */
+ unsigned int dqi_syncms; /* How often should we sync with other nodes */
+ struct list_head dqi_chunk; /* List of chunks */
+ struct inode *dqi_gqinode; /* Global quota file inode */
+ struct ocfs2_lock_res dqi_gqlock; /* Lock protecting quota information structure */
+ struct buffer_head *dqi_gqi_bh; /* Buffer head with global quota file inode - set only if inode lock is obtained */
+ int dqi_gqi_count; /* Number of holders of dqi_gqi_bh */
+ struct buffer_head *dqi_lqi_bh; /* Buffer head with local quota file inode */
+ struct buffer_head *dqi_ibh; /* Buffer with information header */
+ struct qtree_mem_dqinfo dqi_gi; /* Info about global file */
+};
+
+static inline struct ocfs2_dquot *OCFS2_DQUOT(struct dquot *dquot)
+{
+ return container_of(dquot, struct ocfs2_dquot, dq_dquot);
+}
+
+struct ocfs2_quota_chunk {
+ struct list_head qc_chunk; /* List of quotafile chunks */
+ int qc_num; /* Number of quota chunk */
+ struct buffer_head *qc_headerbh; /* Buffer head with chunk header */
+};
+
+extern struct kmem_cache *ocfs2_dquot_cachep;
+extern struct kmem_cache *ocfs2_qf_chunk_cachep;
+
+extern struct qtree_fmt_operations ocfs2_global_ops;
+
+ssize_t ocfs2_quota_read(struct super_block *sb, int type, char *data,
+ size_t len, loff_t off);
+ssize_t ocfs2_quota_write(struct super_block *sb, int type,
+ const char *data, size_t len, loff_t off);
+int ocfs2_global_read_info(struct super_block *sb, int type);
+int ocfs2_global_write_info(struct super_block *sb, int type);
+int ocfs2_global_read_dquot(struct dquot *dquot);
+int __ocfs2_sync_dquot(struct dquot *dquot, int freeing);
+static inline int ocfs2_sync_dquot(struct dquot *dquot)
+{
+ return __ocfs2_sync_dquot(dquot, 0);
+}
+static inline int ocfs2_global_release_dquot(struct dquot *dquot)
+{
+ return __ocfs2_sync_dquot(dquot, 1);
+}
+
+int ocfs2_lock_global_qf(struct ocfs2_mem_dqinfo *oinfo, int ex);
+void ocfs2_unlock_global_qf(struct ocfs2_mem_dqinfo *oinfo, int ex);
+struct buffer_head *ocfs2_read_quota_block(struct inode *inode,
+ int block, int *err);
+
+extern struct dquot_operations ocfs2_quota_operations;
+extern struct quota_format_type ocfs2_quota_format;
+
+#endif /* _OCFS2_QUOTA_H */
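
A minimal sketch of how the dq_origspace/dq_originodes bookkeeping above is
meant to be used: only the delta accumulated since the last global sync needs
to be pushed to the global quota file. The helper name here is hypothetical
and not part of the patch:

#include <linux/quota.h>
#include "quota.h"

/* Hypothetical helper, illustration only: the amount of space usage this
 * node has accumulated locally since it last synced this dquot with the
 * global quota file. */
static inline s64 ocfs2_example_pending_space_delta(struct dquot *dquot)
{
        struct ocfs2_dquot *od = OCFS2_DQUOT(dquot);

        return dquot->dq_dqb.dqb_curspace - od->dq_origspace;
}
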
diff --git a/fs/ocfs2/quota_global.c b/fs/ocfs2/quota_global.c
new file mode 100644
index 0000000..af8340c
--- /dev/null
+++ b/fs/ocfs2/quota_global.c
@@ -0,0 +1,919 @@
+/*
+ * Implementation of operations over global quota file
+ */
+#include <linux/fs.h>
+#include <linux/quota.h>
+#include <linux/quotaops.h>
+#include <linux/dqblk_qtree.h>
+
+#define MLOG_MASK_PREFIX ML_QUOTA
+#include <cluster/masklog.h>
+
+#include "ocfs2_fs.h"
+#include "ocfs2.h"
+#include "alloc.h"
+#include "inode.h"
+#include "journal.h"
+#include "file.h"
+#include "sysfile.h"
+#include "dlmglue.h"
+#include "uptodate.h"
+#include "quota.h"
+
+static void ocfs2_global_disk2memdqb(struct dquot *dquot, void *dp)
+{
+ struct ocfs2_global_disk_dqblk *d = dp;
+ struct mem_dqblk *m = &dquot->dq_dqb;
+
+ /* Update from disk only entries not set by the admin */
+ if (!test_bit(DQ_LASTSET_B + QIF_ILIMITS_B, &dquot->dq_flags)) {
+ m->dqb_ihardlimit = le64_to_cpu(d->dqb_ihardlimit);
+ m->dqb_isoftlimit = le64_to_cpu(d->dqb_isoftlimit);
+ }
+ if (!test_bit(DQ_LASTSET_B + QIF_INODES_B, &dquot->dq_flags))
+ m->dqb_curinodes = le64_to_cpu(d->dqb_curinodes);
+ if (!test_bit(DQ_LASTSET_B + QIF_BLIMITS_B, &dquot->dq_flags)) {
+ m->dqb_bhardlimit = le64_to_cpu(d->dqb_bhardlimit);
+ m->dqb_bsoftlimit = le64_to_cpu(d->dqb_bsoftlimit);
+ }
+ if (!test_bit(DQ_LASTSET_B + QIF_SPACE_B, &dquot->dq_flags))
+ m->dqb_curspace = le64_to_cpu(d->dqb_curspace);
+ if (!test_bit(DQ_LASTSET_B + QIF_BTIME_B, &dquot->dq_flags))
+ m->dqb_btime = le64_to_cpu(d->dqb_btime);
+ if (!test_bit(DQ_LASTSET_B + QIF_ITIME_B, &dquot->dq_flags))
+ m->dqb_itime = le64_to_cpu(d->dqb_itime);
+ OCFS2_DQUOT(dquot)->dq_use_count = le32_to_cpu(d->dqb_use_count);
+}
+
+static void ocfs2_global_mem2diskdqb(void *dp, struct dquot *dquot)
+{
+ struct ocfs2_global_disk_dqblk *d = dp;
+ struct mem_dqblk *m = &dquot->dq_dqb;
+
+ d->dqb_id = cpu_to_le32(dquot->dq_id);
+ d->dqb_use_count = cpu_to_le32(OCFS2_DQUOT(dquot)->dq_use_count);
+ d->dqb_ihardlimit = cpu_to_le64(m->dqb_ihardlimit);
+ d->dqb_isoftlimit = cpu_to_le64(m->dqb_isoftlimit);
+ d->dqb_curinodes = cpu_to_le64(m->dqb_curinodes);
+ d->dqb_bhardlimit = cpu_to_le64(m->dqb_bhardlimit);
+ d->dqb_bsoftlimit = cpu_to_le64(m->dqb_bsoftlimit);
+ d->dqb_curspace = cpu_to_le64(m->dqb_curspace);
+ d->dqb_btime = cpu_to_le64(m->dqb_btime);
+ d->dqb_itime = cpu_to_le64(m->dqb_itime);
+}
+
+static int ocfs2_global_is_id(void *dp, struct dquot *dquot)
+{
+ struct ocfs2_global_disk_dqblk *d = dp;
+ struct ocfs2_mem_dqinfo *oinfo =
+ sb_dqinfo(dquot->dq_sb, dquot->dq_type)->dqi_priv;
+
+ if (qtree_entry_unused(&oinfo->dqi_gi, dp))
+ return 0;
+ return le32_to_cpu(d->dqb_id) == dquot->dq_id;
+}
+
+struct qtree_fmt_operations ocfs2_global_ops = {
+ .mem2disk_dqblk = ocfs2_global_mem2diskdqb,
+ .disk2mem_dqblk = ocfs2_global_disk2memdqb,
+ .is_id = ocfs2_global_is_id,
+};
+
+
+struct buffer_head *ocfs2_read_quota_block(struct inode *inode,
+ int block, int *err)
+{
+ struct buffer_head *tmp = NULL;
+
+ *err = ocfs2_read_virt_blocks(inode, block, 1, &tmp, 0, NULL);
+ if (*err)
+ mlog_errno(*err);
+
+ return tmp;
+}
+
+static struct buffer_head *ocfs2_get_quota_block(struct inode *inode,
+ int block, int *err)
+{
+ u64 pblock, pcount;
+ struct buffer_head *bh;
+
+ down_read(&OCFS2_I(inode)->ip_alloc_sem);
+ *err = ocfs2_extent_map_get_blocks(inode, block, &pblock, &pcount,
+ NULL);
+ up_read(&OCFS2_I(inode)->ip_alloc_sem);
+ if (*err) {
+ mlog_errno(*err);
+ return NULL;
+ }
+ bh = sb_getblk(inode->i_sb, pblock);
+ if (!bh) {
+ *err = -EIO;
+ mlog_errno(*err);
+ }
+ return bh;
+}
+
+/* Read data from global quotafile - avoid pagecache and such because we cannot
+ * afford acquiring the locks... We use quota cluster lock to serialize
+ * operations. Caller is responsible for acquiring it. */
+ssize_t ocfs2_quota_read(struct super_block *sb, int type, char *data,
+ size_t len, loff_t off)
+{
+ struct ocfs2_mem_dqinfo *oinfo = sb_dqinfo(sb, type)->dqi_priv;
+ struct inode *gqinode = oinfo->dqi_gqinode;
+ loff_t i_size = i_size_read(gqinode);
+ int offset = off & (sb->s_blocksize - 1);
+ sector_t blk = off >> sb->s_blocksize_bits;
+ int err = 0;
+ struct buffer_head *bh;
+ size_t toread, tocopy;
+
+ if (off > i_size)
+ return 0;
+ if (off + len > i_size)
+ len = i_size - off;
+ toread = len;
+ while (toread > 0) {
+ tocopy = min((size_t)(sb->s_blocksize - offset), toread);
+ bh = ocfs2_read_quota_block(gqinode, blk, &err);
+ if (!bh) {
+ mlog_errno(err);
+ return err;
+ }
+ memcpy(data, bh->b_data + offset, tocopy);
+ brelse(bh);
+ offset = 0;
+ toread -= tocopy;
+ data += tocopy;
+ blk++;
+ }
+ return len;
+}
+
+/* Write to quotafile (we know the transaction is already started and has
+ * enough credits) */
+ssize_t ocfs2_quota_write(struct super_block *sb, int type,
+ const char *data, size_t len, loff_t off)
+{
+ struct mem_dqinfo *info = sb_dqinfo(sb, type);
+ struct ocfs2_mem_dqinfo *oinfo = info->dqi_priv;
+ struct inode *gqinode = oinfo->dqi_gqinode;
+ int offset = off & (sb->s_blocksize - 1);
+ sector_t blk = off >> sb->s_blocksize_bits;
+ int err = 0, new = 0;
+ struct buffer_head *bh;
+ handle_t *handle = journal_current_handle();
+
+ if (!handle) {
+ mlog(ML_ERROR, "Quota write (off=%llu, len=%llu) cancelled "
+ "because transaction was not started.\n",
+ (unsigned long long)off, (unsigned long long)len);
+ return -EIO;
+ }
+ if (len > sb->s_blocksize - OCFS2_QBLK_RESERVED_SPACE - offset) {
+ WARN_ON(1);
+ len = sb->s_blocksize - OCFS2_QBLK_RESERVED_SPACE - offset;
+ }
+
+ mutex_lock_nested(&gqinode->i_mutex, I_MUTEX_QUOTA);
+ if (gqinode->i_size < off + len) {
+ down_write(&OCFS2_I(gqinode)->ip_alloc_sem);
+ err = ocfs2_extend_no_holes(gqinode, off + len, off);
+ up_write(&OCFS2_I(gqinode)->ip_alloc_sem);
+ if (err < 0)
+ goto out;
+ err = ocfs2_simple_size_update(gqinode,
+ oinfo->dqi_gqi_bh,
+ off + len);
+ if (err < 0)
+ goto out;
+ new = 1;
+ }
+ /* Not rewriting whole block? */
+ if ((offset || len < sb->s_blocksize - OCFS2_QBLK_RESERVED_SPACE) &&
+ !new) {
+ bh = ocfs2_read_quota_block(gqinode, blk, &err);
+ if (!bh) {
+ mlog_errno(err);
+ return err;
+ }
+ err = ocfs2_journal_access(handle, gqinode, bh,
+ OCFS2_JOURNAL_ACCESS_WRITE);
+ } else {
+ bh = ocfs2_get_quota_block(gqinode, blk, &err);
+ if (!bh) {
+ mlog_errno(err);
+ return err;
+ }
+ err = ocfs2_journal_access(handle, gqinode, bh,
+ OCFS2_JOURNAL_ACCESS_CREATE);
+ }
+ if (err < 0) {
+ brelse(bh);
+ goto out;
+ }
+ lock_buffer(bh);
+ if (new)
+ memset(bh->b_data, 0, sb->s_blocksize);
+ memcpy(bh->b_data + offset, data, len);
+ flush_dcache_page(bh->b_page);
+ unlock_buffer(bh);
+ ocfs2_set_buffer_uptodate(gqinode, bh);
+ err = ocfs2_journal_dirty(handle, bh);
+ brelse(bh);
+ if (err < 0)
+ goto out;
+out:
+ if (err) {
+ mutex_unlock(&gqinode->i_mutex);
+ mlog_errno(err);
+ return err;
+ }
+ gqinode->i_version++;
+ ocfs2_mark_inode_dirty(handle, gqinode, oinfo->dqi_gqi_bh);
+ mutex_unlock(&gqinode->i_mutex);
+ return len;
+}
+
+int ocfs2_lock_global_qf(struct ocfs2_mem_dqinfo *oinfo, int ex)
+{
+ int status;
+ struct buffer_head *bh = NULL;
+
+ status = ocfs2_inode_lock(oinfo->dqi_gqinode, &bh, ex);
+ if (status < 0)
+ return status;
+ spin_lock(&dq_data_lock);
+ if (!oinfo->dqi_gqi_count++)
+ oinfo->dqi_gqi_bh = bh;
+ else
+ WARN_ON(bh != oinfo->dqi_gqi_bh);
+ spin_unlock(&dq_data_lock);
+ return 0;
+}
+
+void ocfs2_unlock_global_qf(struct ocfs2_mem_dqinfo *oinfo, int ex)
+{
+ ocfs2_inode_unlock(oinfo->dqi_gqinode, ex);
+ brelse(oinfo->dqi_gqi_bh);
+ spin_lock(&dq_data_lock);
+ if (!--oinfo->dqi_gqi_count)
+ oinfo->dqi_gqi_bh = NULL;
+ spin_unlock(&dq_data_lock);
+}
+
+/* Read information header from global quota file */
+int ocfs2_global_read_info(struct super_block *sb, int type)
+{
+ struct inode *gqinode = NULL;
+ unsigned int ino[MAXQUOTAS] = { USER_QUOTA_SYSTEM_INODE,
+ GROUP_QUOTA_SYSTEM_INODE };
+ struct ocfs2_global_disk_dqinfo dinfo;
+ struct mem_dqinfo *info = sb_dqinfo(sb, type);
+ struct ocfs2_mem_dqinfo *oinfo = info->dqi_priv;
+ int status;
+
+ mlog_entry_void();
+
+ /* Read global header */
+ gqinode = ocfs2_get_system_file_inode(OCFS2_SB(sb), ino[type],
+ OCFS2_INVALID_SLOT);
+ if (!gqinode) {
+ mlog(ML_ERROR, "failed to get global quota inode (type=%d)\n",
+ type);
+ status = -EINVAL;
+ goto out_err;
+ }
+ oinfo->dqi_gi.dqi_sb = sb;
+ oinfo->dqi_gi.dqi_type = type;
+ ocfs2_qinfo_lock_res_init(&oinfo->dqi_gqlock, oinfo);
+ oinfo->dqi_gi.dqi_entry_size = sizeof(struct ocfs2_global_disk_dqblk);
+ oinfo->dqi_gi.dqi_ops = &ocfs2_global_ops;
+ oinfo->dqi_gqi_bh = NULL;
+ oinfo->dqi_gqi_count = 0;
+ oinfo->dqi_gqinode = gqinode;
+ status = ocfs2_lock_global_qf(oinfo, 0);
+ if (status < 0) {
+ mlog_errno(status);
+ goto out_err;
+ }
+ status = sb->s_op->quota_read(sb, type, (char *)&dinfo,
+ sizeof(struct ocfs2_global_disk_dqinfo),
+ OCFS2_GLOBAL_INFO_OFF);
+ ocfs2_unlock_global_qf(oinfo, 0);
+ if (status != sizeof(struct ocfs2_global_disk_dqinfo)) {
+ mlog(ML_ERROR, "Cannot read global quota info (%d).\n",
+ status);
+ if (status >= 0)
+ status = -EIO;
+ mlog_errno(status);
+ goto out_err;
+ }
+ info->dqi_bgrace = le32_to_cpu(dinfo.dqi_bgrace);
+ info->dqi_igrace = le32_to_cpu(dinfo.dqi_igrace);
+ oinfo->dqi_syncms = le32_to_cpu(dinfo.dqi_syncms);
+ oinfo->dqi_gi.dqi_blocks = le32_to_cpu(dinfo.dqi_blocks);
+ oinfo->dqi_gi.dqi_free_blk = le32_to_cpu(dinfo.dqi_free_blk);
+ oinfo->dqi_gi.dqi_free_entry = le32_to_cpu(dinfo.dqi_free_entry);
+ oinfo->dqi_gi.dqi_blocksize_bits = sb->s_blocksize_bits;
+ oinfo->dqi_gi.dqi_usable_bs = sb->s_blocksize -
+ OCFS2_QBLK_RESERVED_SPACE;
+ oinfo->dqi_gi.dqi_qtree_depth = qtree_depth(&oinfo->dqi_gi);
+out_err:
+ mlog_exit(status);
+ return status;
+}
+
+/* Write information to global quota file. Expects exclusive lock on quota
+ * file inode and quota info */
+static int __ocfs2_global_write_info(struct super_block *sb, int type)
+{
+ struct mem_dqinfo *info = sb_dqinfo(sb, type);
+ struct ocfs2_mem_dqinfo *oinfo = info->dqi_priv;
+ struct ocfs2_global_disk_dqinfo dinfo;
+ ssize_t size;
+
+ spin_lock(&dq_data_lock);
+ info->dqi_flags &= ~DQF_INFO_DIRTY;
+ dinfo.dqi_bgrace = cpu_to_le32(info->dqi_bgrace);
+ dinfo.dqi_igrace = cpu_to_le32(info->dqi_igrace);
+ spin_unlock(&dq_data_lock);
+ dinfo.dqi_syncms = cpu_to_le32(oinfo->dqi_syncms);
+ dinfo.dqi_blocks = cpu_to_le32(oinfo->dqi_gi.dqi_blocks);
+ dinfo.dqi_free_blk = cpu_to_le32(oinfo->dqi_gi.dqi_free_blk);
+ dinfo.dqi_free_entry = cpu_to_le32(oinfo->dqi_gi.dqi_free_entry);
+ size = sb->s_op->quota_write(sb, type, (char *)&dinfo,
+ sizeof(struct ocfs2_global_disk_dqinfo),
+ OCFS2_GLOBAL_INFO_OFF);
+ if (size != sizeof(struct ocfs2_global_disk_dqinfo)) {
+ mlog(ML_ERROR, "Cannot write global quota info structure\n");
+ if (size >= 0)
+ size = -EIO;
+ return size;
+ }
+ return 0;
+}
+
+int ocfs2_global_write_info(struct super_block *sb, int type)
+{
+ int err;
+ struct ocfs2_mem_dqinfo *info = sb_dqinfo(sb, type)->dqi_priv;
+
+ err = ocfs2_qinfo_lock(info, 1);
+ if (err < 0)
+ return err;
+ err = __ocfs2_global_write_info(sb, type);
+ ocfs2_qinfo_unlock(info, 1);
+ return err;
+}
+
+/* Read in information from global quota file and acquire a reference to it.
+ * dquot_acquire() has already started the transaction and locked quota file */
+int ocfs2_global_read_dquot(struct dquot *dquot)
+{
+ int err, err2, ex = 0;
+ struct ocfs2_mem_dqinfo *info =
+ sb_dqinfo(dquot->dq_sb, dquot->dq_type)->dqi_priv;
+
+ err = ocfs2_qinfo_lock(info, 0);
+ if (err < 0)
+ goto out;
+ err = qtree_read_dquot(&info->dqi_gi, dquot);
+ if (err < 0)
+ goto out_qlock;
+ OCFS2_DQUOT(dquot)->dq_use_count++;
+ OCFS2_DQUOT(dquot)->dq_origspace = dquot->dq_dqb.dqb_curspace;
+ OCFS2_DQUOT(dquot)->dq_originodes = dquot->dq_dqb.dqb_curinodes;
+ if (!dquot->dq_off) { /* No real quota entry? */
+ /* Upgrade to exclusive lock for allocation */
+ err = ocfs2_qinfo_lock(info, 1);
+ if (err < 0)
+ goto out_qlock;
+ ex = 1;
+ }
+ err = qtree_write_dquot(&info->dqi_gi, dquot);
+ if (ex && info_dirty(sb_dqinfo(dquot->dq_sb, dquot->dq_type))) {
+ err2 = __ocfs2_global_write_info(dquot->dq_sb, dquot->dq_type);
+ if (!err)
+ err = err2;
+ }
+out_qlock:
+ if (ex)
+ ocfs2_qinfo_unlock(info, 1);
+ ocfs2_qinfo_unlock(info, 0);
+out:
+ if (err < 0)
+ mlog_errno(err);
+ return err;
+}
+
+/* Sync local information about quota modifications with global quota file.
+ * Caller must have started the transaction and obtained exclusive lock for
+ * global quota file inode */
+int __ocfs2_sync_dquot(struct dquot *dquot, int freeing)
+{
+ int err, err2;
+ struct super_block *sb = dquot->dq_sb;
+ int type = dquot->dq_type;
+ struct ocfs2_mem_dqinfo *info = sb_dqinfo(sb, type)->dqi_priv;
+ struct ocfs2_global_disk_dqblk dqblk;
+ s64 spacechange, inodechange;
+ time_t olditime, oldbtime;
+
+ err = sb->s_op->quota_read(sb, type, (char *)&dqblk,
+ sizeof(struct ocfs2_global_disk_dqblk),
+ dquot->dq_off);
+ if (err != sizeof(struct ocfs2_global_disk_dqblk)) {
+ if (err >= 0) {
+ mlog(ML_ERROR, "Short read from global quota file "
+ "(%u read)\n", err);
+ err = -EIO;
+ }
+ goto out;
+ }
+
+ /* Update space and inode usage. Also get the other fields from the
+ * global quota file so that we don't overwrite any changes made
+ * there by other nodes. */
+ spin_lock(&dq_data_lock);
+ spacechange = dquot->dq_dqb.dqb_curspace -
+ OCFS2_DQUOT(dquot)->dq_origspace;
+ inodechange = dquot->dq_dqb.dqb_curinodes -
+ OCFS2_DQUOT(dquot)->dq_originodes;
+ olditime = dquot->dq_dqb.dqb_itime;
+ oldbtime = dquot->dq_dqb.dqb_btime;
+ ocfs2_global_disk2memdqb(dquot, &dqblk);
+ mlog(0, "Syncing global dquot %d space %lld+%lld, inodes %lld+%lld\n",
+ dquot->dq_id, dquot->dq_dqb.dqb_curspace, spacechange,
+ dquot->dq_dqb.dqb_curinodes, inodechange);
+ if (!test_bit(DQ_LASTSET_B + QIF_SPACE_B, &dquot->dq_flags))
+ dquot->dq_dqb.dqb_curspace += spacechange;
+ if (!test_bit(DQ_LASTSET_B + QIF_INODES_B, &dquot->dq_flags))
+ dquot->dq_dqb.dqb_curinodes += inodechange;
+ /* Set properly space grace time... */
+ if (dquot->dq_dqb.dqb_bsoftlimit &&
+ dquot->dq_dqb.dqb_curspace > dquot->dq_dqb.dqb_bsoftlimit) {
+ if (!test_bit(DQ_LASTSET_B + QIF_BTIME_B, &dquot->dq_flags) &&
+ oldbtime > 0) {
+ if (dquot->dq_dqb.dqb_btime > 0)
+ dquot->dq_dqb.dqb_btime =
+ min(dquot->dq_dqb.dqb_btime, oldbtime);
+ else
+ dquot->dq_dqb.dqb_btime = oldbtime;
+ }
+ } else {
+ dquot->dq_dqb.dqb_btime = 0;
+ clear_bit(DQ_BLKS_B, &dquot->dq_flags);
+ }
+ /* Set properly inode grace time... */
+ if (dquot->dq_dqb.dqb_isoftlimit &&
+ dquot->dq_dqb.dqb_curinodes > dquot->dq_dqb.dqb_isoftlimit) {
+ if (!test_bit(DQ_LASTSET_B + QIF_ITIME_B, &dquot->dq_flags) &&
+ olditime > 0) {
+ if (dquot->dq_dqb.dqb_itime > 0)
+ dquot->dq_dqb.dqb_itime =
+ min(dquot->dq_dqb.dqb_itime, olditime);
+ else
+ dquot->dq_dqb.dqb_itime = olditime;
+ }
+ } else {
+ dquot->dq_dqb.dqb_itime = 0;
+ clear_bit(DQ_INODES_B, &dquot->dq_flags);
+ }
+ /* All information is properly updated, clear the flags */
+ __clear_bit(DQ_LASTSET_B + QIF_SPACE_B, &dquot->dq_flags);
+ __clear_bit(DQ_LASTSET_B + QIF_INODES_B, &dquot->dq_flags);
+ __clear_bit(DQ_LASTSET_B + QIF_BLIMITS_B, &dquot->dq_flags);
+ __clear_bit(DQ_LASTSET_B + QIF_ILIMITS_B, &dquot->dq_flags);
+ __clear_bit(DQ_LASTSET_B + QIF_BTIME_B, &dquot->dq_flags);
+ __clear_bit(DQ_LASTSET_B + QIF_ITIME_B, &dquot->dq_flags);
+ OCFS2_DQUOT(dquot)->dq_origspace = dquot->dq_dqb.dqb_curspace;
+ OCFS2_DQUOT(dquot)->dq_originodes = dquot->dq_dqb.dqb_curinodes;
+ spin_unlock(&dq_data_lock);
+ err = ocfs2_qinfo_lock(info, freeing);
+ if (err < 0) {
+ mlog(ML_ERROR, "Failed to lock quota info, loosing quota write"
+ " (type=%d, id=%u)\n", dquot->dq_type,
+ (unsigned)dquot->dq_id);
+ goto out;
+ }
+ if (freeing)
+ OCFS2_DQUOT(dquot)->dq_use_count--;
+ err = qtree_write_dquot(&info->dqi_gi, dquot);
+ if (err < 0)
+ goto out_qlock;
+ if (freeing && !OCFS2_DQUOT(dquot)->dq_use_count) {
+ err = qtree_release_dquot(&info->dqi_gi, dquot);
+ if (info_dirty(sb_dqinfo(sb, type))) {
+ err2 = __ocfs2_global_write_info(sb, type);
+ if (!err)
+ err = err2;
+ }
+ }
+out_qlock:
+ ocfs2_qinfo_unlock(info, freeing);
+out:
+ if (err < 0)
+ mlog_errno(err);
+ return err;
+}
+
+/*
+ * Wrappers for generic quota functions
+ */
+
+static int ocfs2_write_dquot(struct dquot *dquot)
+{
+ handle_t *handle;
+ struct ocfs2_super *osb = OCFS2_SB(dquot->dq_sb);
+ int status = 0;
+
+ mlog_entry("id=%u, type=%d", dquot->dq_id, dquot->dq_type);
+
+ handle = ocfs2_start_trans(osb, OCFS2_QWRITE_CREDITS);
+ if (IS_ERR(handle)) {
+ status = PTR_ERR(handle);
+ mlog_errno(status);
+ goto out;
+ }
+ status = dquot_commit(dquot);
+ ocfs2_commit_trans(osb, handle);
+out:
+ mlog_exit(status);
+ return status;
+}
+
+int ocfs2_calc_qdel_credits(struct super_block *sb, int type)
+{
+ struct ocfs2_mem_dqinfo *oinfo;
+ int features[MAXQUOTAS] = { OCFS2_FEATURE_RO_COMPAT_USRQUOTA,
+ OCFS2_FEATURE_RO_COMPAT_GRPQUOTA };
+
+ if (!OCFS2_HAS_RO_COMPAT_FEATURE(sb, features[type]))
+ return 0;
+
+ oinfo = sb_dqinfo(sb, type)->dqi_priv;
+ /* We modify tree, leaf block, global info, local chunk header,
+ * global and local inode */
+ return oinfo->dqi_gi.dqi_qtree_depth + 2 + 1 +
+ 2 * OCFS2_INODE_UPDATE_CREDITS;
+}
+
+static int ocfs2_release_dquot(struct dquot *dquot)
+{
+ handle_t *handle;
+ struct ocfs2_mem_dqinfo *oinfo =
+ sb_dqinfo(dquot->dq_sb, dquot->dq_type)->dqi_priv;
+ struct ocfs2_super *osb = OCFS2_SB(dquot->dq_sb);
+ int status = 0;
+
+ mlog_entry("id=%u, type=%d", dquot->dq_id, dquot->dq_type);
+
+ status = ocfs2_lock_global_qf(oinfo, 1);
+ if (status < 0)
+ goto out;
+ handle = ocfs2_start_trans(osb,
+ ocfs2_calc_qdel_credits(dquot->dq_sb, dquot->dq_type));
+ if (IS_ERR(handle)) {
+ status = PTR_ERR(handle);
+ mlog_errno(status);
+ goto out_ilock;
+ }
+ status = dquot_release(dquot);
+ ocfs2_commit_trans(osb, handle);
+out_ilock:
+ ocfs2_unlock_global_qf(oinfo, 1);
+out:
+ mlog_exit(status);
+ return status;
+}
+
+int ocfs2_calc_qinit_credits(struct super_block *sb, int type)
+{
+ struct ocfs2_mem_dqinfo *oinfo;
+ int features[MAXQUOTAS] = { OCFS2_FEATURE_RO_COMPAT_USRQUOTA,
+ OCFS2_FEATURE_RO_COMPAT_GRPQUOTA };
+ struct ocfs2_dinode *lfe, *gfe;
+
+ if (!OCFS2_HAS_RO_COMPAT_FEATURE(sb, features[type]))
+ return 0;
+
+ oinfo = sb_dqinfo(sb, type)->dqi_priv;
+ gfe = (struct ocfs2_dinode *)oinfo->dqi_gqi_bh->b_data;
+ lfe = (struct ocfs2_dinode *)oinfo->dqi_lqi_bh->b_data;
+ /* We can extend local file + global file. In local file we
+ * can modify info, chunk header block and dquot block. In
+ * global file we can modify info, tree and leaf block */
+ return ocfs2_calc_extend_credits(sb, &lfe->id2.i_list, 0) +
+ ocfs2_calc_extend_credits(sb, &gfe->id2.i_list, 0) +
+ 3 + oinfo->dqi_gi.dqi_qtree_depth + 2;
+}
+
+static int ocfs2_acquire_dquot(struct dquot *dquot)
+{
+ handle_t *handle;
+ struct ocfs2_mem_dqinfo *oinfo =
+ sb_dqinfo(dquot->dq_sb, dquot->dq_type)->dqi_priv;
+ struct ocfs2_super *osb = OCFS2_SB(dquot->dq_sb);
+ int status = 0;
+
+ mlog_entry("id=%u, type=%d", dquot->dq_id, dquot->dq_type);
+ /* We need an exclusive lock, because we're going to update use count
+ * and instantiate possibly new dquot structure */
+ status = ocfs2_lock_global_qf(oinfo, 1);
+ if (status < 0)
+ goto out;
+ handle = ocfs2_start_trans(osb,
+ ocfs2_calc_qinit_credits(dquot->dq_sb, dquot->dq_type));
+ if (IS_ERR(handle)) {
+ status = PTR_ERR(handle);
+ mlog_errno(status);
+ goto out_ilock;
+ }
+ status = dquot_acquire(dquot);
+ ocfs2_commit_trans(osb, handle);
+out_ilock:
+ ocfs2_unlock_global_qf(oinfo, 1);
+out:
+ mlog_exit(status);
+ return status;
+}
+
+static int ocfs2_mark_dquot_dirty(struct dquot *dquot)
+{
+ unsigned long mask = (1 << (DQ_LASTSET_B + QIF_ILIMITS_B)) |
+ (1 << (DQ_LASTSET_B + QIF_BLIMITS_B)) |
+ (1 << (DQ_LASTSET_B + QIF_INODES_B)) |
+ (1 << (DQ_LASTSET_B + QIF_SPACE_B)) |
+ (1 << (DQ_LASTSET_B + QIF_BTIME_B)) |
+ (1 << (DQ_LASTSET_B + QIF_ITIME_B));
+ int sync = 0;
+ int status;
+ struct super_block *sb = dquot->dq_sb;
+ int type = dquot->dq_type;
+ struct ocfs2_mem_dqinfo *oinfo = sb_dqinfo(sb, type)->dqi_priv;
+ handle_t *handle;
+ struct ocfs2_super *osb = OCFS2_SB(sb);
+
+ mlog_entry("id=%u, type=%d", dquot->dq_id, type);
+ dquot_mark_dquot_dirty(dquot);
+
+ /* In case user set some limits, sync dquot immediately to global
+ * quota file so that information propagates quicker */
+ spin_lock(&dq_data_lock);
+ if (dquot->dq_flags & mask)
+ sync = 1;
+ spin_unlock(&dq_data_lock);
+ if (!sync) {
+ status = ocfs2_write_dquot(dquot);
+ goto out;
+ }
+ status = ocfs2_lock_global_qf(oinfo, 1);
+ if (status < 0)
+ goto out;
+ handle = ocfs2_start_trans(osb, OCFS2_QSYNC_CREDITS);
+ if (IS_ERR(handle)) {
+ status = PTR_ERR(handle);
+ mlog_errno(status);
+ goto out_ilock;
+ }
+ status = ocfs2_sync_dquot(dquot);
+ if (status < 0) {
+ mlog_errno(status);
+ goto out_trans;
+ }
+ /* Now write updated local dquot structure */
+ status = dquot_commit(dquot);
+out_trans:
+ ocfs2_commit_trans(osb, handle);
+out_ilock:
+ ocfs2_unlock_global_qf(oinfo, 1);
+out:
+ mlog_exit(status);
+ return status;
+}
+
+/* This should happen only after set_dqinfo(). */
+static int ocfs2_write_info(struct super_block *sb, int type)
+{
+ handle_t *handle;
+ int status = 0;
+ struct ocfs2_mem_dqinfo *oinfo = sb_dqinfo(sb, type)->dqi_priv;
+
+ mlog_entry_void();
+
+ status = ocfs2_lock_global_qf(oinfo, 1);
+ if (status < 0)
+ goto out;
+ handle = ocfs2_start_trans(OCFS2_SB(sb), OCFS2_QINFO_WRITE_CREDITS);
+ if (IS_ERR(handle)) {
+ status = PTR_ERR(handle);
+ mlog_errno(status);
+ goto out_ilock;
+ }
+ status = dquot_commit_info(sb, type);
+ ocfs2_commit_trans(OCFS2_SB(sb), handle);
+out_ilock:
+ ocfs2_unlock_global_qf(oinfo, 1);
+out:
+ mlog_exit(status);
+ return status;
+}
+
+/* This is difficult. We have to lock the quota inode and start a transaction
+ * in this function, but we don't want to pay the penalty of an exclusive
+ * quota file lock when we are just going to use cached structures. So
+ * we take the read lock, check whether we have the dquot cached, and if so
+ * we don't have to take the write lock... */
+static int ocfs2_dquot_initialize(struct inode *inode, int type)
+{
+ handle_t *handle = NULL;
+ int status = 0;
+ struct super_block *sb = inode->i_sb;
+ struct ocfs2_mem_dqinfo *oinfo;
+ int exclusive = 0;
+ int cnt;
+ qid_t id;
+
+ mlog_entry_void();
+
+ for (cnt = 0; cnt < MAXQUOTAS; cnt++) {
+ if (type != -1 && cnt != type)
+ continue;
+ if (!sb_has_quota_active(sb, cnt))
+ continue;
+ oinfo = sb_dqinfo(sb, cnt)->dqi_priv;
+ status = ocfs2_lock_global_qf(oinfo, 0);
+ if (status < 0)
+ goto out;
+ /* This is just a performance optimization not a reliable test.
+ * Since we hold an inode lock, no one can actually release
+ * the structure until we are finished with initialization. */
+ if (inode->i_dquot[cnt] != NODQUOT) {
+ ocfs2_unlock_global_qf(oinfo, 0);
+ continue;
+ }
+ /* When we have inode lock, we know that no dquot_release() can
+ * run and thus we can safely check whether we need to
+ * read+modify global file to get quota information or whether
+ * our node already has it. */
+ if (cnt == USRQUOTA)
+ id = inode->i_uid;
+ else if (cnt == GRPQUOTA)
+ id = inode->i_gid;
+ else
+ BUG();
+ /* Obtain exclusion from quota off... */
+ down_write(&sb_dqopt(sb)->dqptr_sem);
+ exclusive = !dquot_is_cached(sb, id, cnt);
+ up_write(&sb_dqopt(sb)->dqptr_sem);
+ if (exclusive) {
+ status = ocfs2_lock_global_qf(oinfo, 1);
+ if (status < 0) {
+ exclusive = 0;
+ mlog_errno(status);
+ goto out_ilock;
+ }
+ handle = ocfs2_start_trans(OCFS2_SB(sb),
+ ocfs2_calc_qinit_credits(sb, cnt));
+ if (IS_ERR(handle)) {
+ status = PTR_ERR(handle);
+ mlog_errno(status);
+ goto out_ilock;
+ }
+ }
+ dquot_initialize(inode, cnt);
+ if (exclusive) {
+ ocfs2_commit_trans(OCFS2_SB(sb), handle);
+ ocfs2_unlock_global_qf(oinfo, 1);
+ }
+ ocfs2_unlock_global_qf(oinfo, 0);
+ }
+ mlog_exit(0);
+ return 0;
+out_ilock:
+ if (exclusive)
+ ocfs2_unlock_global_qf(oinfo, 1);
+ ocfs2_unlock_global_qf(oinfo, 0);
+out:
+ mlog_exit(status);
+ return status;
+}
+
+static int ocfs2_dquot_drop_slow(struct inode *inode)
+{
+ int status;
+ int cnt;
+ int got_lock[MAXQUOTAS] = {0, 0};
+ handle_t *handle;
+ struct super_block *sb = inode->i_sb;
+ struct ocfs2_mem_dqinfo *oinfo;
+
+ for (cnt = 0; cnt < MAXQUOTAS; cnt++) {
+ if (!sb_has_quota_active(sb, cnt))
+ continue;
+ oinfo = sb_dqinfo(sb, cnt)->dqi_priv;
+ status = ocfs2_lock_global_qf(oinfo, 1);
+ if (status < 0)
+ goto out;
+ got_lock[cnt] = 1;
+ }
+ handle = ocfs2_start_trans(OCFS2_SB(sb),
+ ocfs2_calc_qinit_credits(sb, USRQUOTA) +
+ ocfs2_calc_qinit_credits(sb, GRPQUOTA));
+ if (IS_ERR(handle)) {
+ status = PTR_ERR(handle);
+ mlog_errno(status);
+ goto out;
+ }
+ dquot_drop(inode);
+ ocfs2_commit_trans(OCFS2_SB(sb), handle);
+out:
+ for (cnt = 0; cnt < MAXQUOTAS; cnt++)
+ if (got_lock[cnt]) {
+ oinfo = sb_dqinfo(sb, cnt)->dqi_priv;
+ ocfs2_unlock_global_qf(oinfo, 1);
+ }
+ return status;
+}
+
+/* See the comment before ocfs2_dquot_initialize. */
+static int ocfs2_dquot_drop(struct inode *inode)
+{
+ int status = 0;
+ struct super_block *sb = inode->i_sb;
+ struct ocfs2_mem_dqinfo *oinfo;
+ int exclusive = 0;
+ int cnt;
+ int got_lock[MAXQUOTAS] = {0, 0};
+
+ mlog_entry_void();
+ for (cnt = 0; cnt < MAXQUOTAS; cnt++) {
+ if (!sb_has_quota_active(sb, cnt))
+ continue;
+ oinfo = sb_dqinfo(sb, cnt)->dqi_priv;
+ status = ocfs2_lock_global_qf(oinfo, 0);
+ if (status < 0)
+ goto out;
+ got_lock[cnt] = 1;
+ }
+ /* Lock against anyone releasing references so that when we check
+ * we know we are not going to be the last ones to release the dquot */
+ down_write(&sb_dqopt(sb)->dqptr_sem);
+ /* Urgh, this is a terrible hack :( */
+ for (cnt = 0; cnt < MAXQUOTAS; cnt++) {
+ if (inode->i_dquot[cnt] != NODQUOT &&
+ atomic_read(&inode->i_dquot[cnt]->dq_count) > 1) {
+ exclusive = 1;
+ break;
+ }
+ }
+ if (!exclusive)
+ dquot_drop_locked(inode);
+ up_write(&sb_dqopt(sb)->dqptr_sem);
+out:
+ for (cnt = 0; cnt < MAXQUOTAS; cnt++)
+ if (got_lock[cnt]) {
+ oinfo = sb_dqinfo(sb, cnt)->dqi_priv;
+ ocfs2_unlock_global_qf(oinfo, 0);
+ }
+ /* In case we bailed out because we had to do expensive locking
+ * do it now... */
+ if (exclusive)
+ status = ocfs2_dquot_drop_slow(inode);
+ mlog_exit(status);
+ return status;
+}
+
+static struct dquot *ocfs2_alloc_dquot(struct super_block *sb, int type)
+{
+ struct ocfs2_dquot *dquot =
+ kmem_cache_zalloc(ocfs2_dquot_cachep, GFP_NOFS);
+
+ if (!dquot)
+ return NULL;
+ return &dquot->dq_dquot;
+}
+
+static void ocfs2_destroy_dquot(struct dquot *dquot)
+{
+ kmem_cache_free(ocfs2_dquot_cachep, dquot);
+}
+
+struct dquot_operations ocfs2_quota_operations = {
+ .initialize = ocfs2_dquot_initialize,
+ .drop = ocfs2_dquot_drop,
+ .alloc_space = dquot_alloc_space,
+ .alloc_inode = dquot_alloc_inode,
+ .free_space = dquot_free_space,
+ .free_inode = dquot_free_inode,
+ .transfer = dquot_transfer,
+ .write_dquot = ocfs2_write_dquot,
+ .acquire_dquot = ocfs2_acquire_dquot,
+ .release_dquot = ocfs2_release_dquot,
+ .mark_dirty = ocfs2_mark_dquot_dirty,
+ .write_info = ocfs2_write_info,
+ .alloc_dquot = ocfs2_alloc_dquot,
+ .destroy_dquot = ocfs2_destroy_dquot,
+};
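
For a concrete feel of the block math used by ocfs2_quota_read() and
ocfs2_quota_write() above, here is a small illustrative helper. The 4096-byte
block size and the 8-byte reserved tail mentioned in the comment are
assumptions for the example, not values taken from this patch:

#include <linux/fs.h>

/* Illustration only: split a quota-file byte offset the same way
 * ocfs2_quota_read()/ocfs2_quota_write() do.  With a 4096-byte block,
 * off = 5000 gives blk = 1 and offset = 904; if the reserved tail is
 * 8 bytes, each block then carries 4088 usable bytes of quota data. */
static inline void ocfs2_example_quota_offset(struct super_block *sb,
                                              loff_t off, sector_t *blk,
                                              int *offset)
{
        *blk = off >> sb->s_blocksize_bits;    /* 5000 >> 12 == 1   */
        *offset = off & (sb->s_blocksize - 1); /* 5000 & 4095 == 904 */
}
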
diff --git a/fs/ocfs2/quota_local.c b/fs/ocfs2/quota_local.c
new file mode 100644
index 0000000..55c3f2f
--- /dev/null
+++ b/fs/ocfs2/quota_local.c
@@ -0,0 +1,833 @@
+/*
+ * Implementation of operations over local quota file
+ */
+
+#include <linux/fs.h>
+#include <linux/quota.h>
+#include <linux/quotaops.h>
+#include <linux/module.h>
+
+#define MLOG_MASK_PREFIX ML_QUOTA
+#include <cluster/masklog.h>
+
+#include "ocfs2_fs.h"
+#include "ocfs2.h"
+#include "inode.h"
+#include "alloc.h"
+#include "file.h"
+#include "buffer_head_io.h"
+#include "journal.h"
+#include "sysfile.h"
+#include "dlmglue.h"
+#include "quota.h"
+
+/* Number of local quota structures per block */
+static inline unsigned int ol_quota_entries_per_block(struct super_block *sb)
+{
+ return ((sb->s_blocksize - OCFS2_QBLK_RESERVED_SPACE) /
+ sizeof(struct ocfs2_local_disk_dqblk));
+}
+
+/* Number of blocks with entries in one chunk */
+static inline unsigned int ol_chunk_blocks(struct super_block *sb)
+{
+ return ((sb->s_blocksize - sizeof(struct ocfs2_local_disk_chunk) -
+ OCFS2_QBLK_RESERVED_SPACE) << 3) /
+ ol_quota_entries_per_block(sb);
+}
+
+/* Number of entries in a chunk bitmap */
+static unsigned int ol_chunk_entries(struct super_block *sb)
+{
+ return ol_chunk_blocks(sb) * ol_quota_entries_per_block(sb);
+}
+
+/* Offset of the chunk in quota file */
+static unsigned int ol_quota_chunk_block(struct super_block *sb, int c)
+{
+ /* 1 block for local quota file info, 1 block per chunk for chunk info */
+ return 1 + (ol_chunk_blocks(sb) + 1) * c;
+}
+
+/* Offset of the dquot structure in the quota file */
+static loff_t ol_dqblk_off(struct super_block *sb, int c, int off)
+{
+ int epb = ol_quota_entries_per_block(sb);
+
+ return ((ol_quota_chunk_block(sb, c) + 1 + off / epb)
+ << sb->s_blocksize_bits) +
+ (off % epb) * sizeof(struct ocfs2_local_disk_dqblk);
+}
+
+/* Compute block number from given offset */
+static inline unsigned int ol_dqblk_file_block(struct super_block *sb, loff_t off)
+{
+ return off >> sb->s_blocksize_bits;
+}
+
+static inline unsigned int ol_dqblk_block_offset(struct super_block *sb, loff_t off)
+{
+ return off & ((1 << sb->s_blocksize_bits) - 1);
+}
+
+/* Compute offset in the chunk of a structure with the given offset */
+static int ol_dqblk_chunk_off(struct super_block *sb, int c, loff_t off)
+{
+ int epb = ol_quota_entries_per_block(sb);
+
+ return ((off >> sb->s_blocksize_bits) -
+ ol_quota_chunk_block(sb, c) - 1) * epb
+ + ((unsigned int)(off & ((1 << sb->s_blocksize_bits) - 1))) /
+ sizeof(struct ocfs2_local_disk_dqblk);
+}
+
+/* Write bufferhead into the fs */
+static int ocfs2_modify_bh(struct inode *inode, struct buffer_head *bh,
+ void (*modify)(struct buffer_head *, void *), void *private)
+{
+ struct super_block *sb = inode->i_sb;
+ handle_t *handle;
+ int status;
+
+ handle = ocfs2_start_trans(OCFS2_SB(sb), 1);
+ if (IS_ERR(handle)) {
+ status = PTR_ERR(handle);
+ mlog_errno(status);
+ return status;
+ }
+ status = ocfs2_journal_access(handle, inode, bh,
+ OCFS2_JOURNAL_ACCESS_WRITE);
+ if (status < 0) {
+ mlog_errno(status);
+ ocfs2_commit_trans(OCFS2_SB(sb), handle);
+ return status;
+ }
+ lock_buffer(bh);
+ modify(bh, private);
+ unlock_buffer(bh);
+ status = ocfs2_journal_dirty(handle, bh);
+ if (status < 0) {
+ mlog_errno(status);
+ ocfs2_commit_trans(OCFS2_SB(sb), handle);
+ return status;
+ }
+ status = ocfs2_commit_trans(OCFS2_SB(sb), handle);
+ if (status < 0) {
+ mlog_errno(status);
+ return status;
+ }
+ return 0;
+}
+
+/* Check whether we understand format of quota files */
+static int ocfs2_local_check_quota_file(struct super_block *sb, int type)
+{
+ unsigned int lmagics[MAXQUOTAS] = OCFS2_LOCAL_QMAGICS;
+ unsigned int lversions[MAXQUOTAS] = OCFS2_LOCAL_QVERSIONS;
+ unsigned int gmagics[MAXQUOTAS] = OCFS2_GLOBAL_QMAGICS;
+ unsigned int gversions[MAXQUOTAS] = OCFS2_GLOBAL_QVERSIONS;
+ unsigned int ino[MAXQUOTAS] = { USER_QUOTA_SYSTEM_INODE,
+ GROUP_QUOTA_SYSTEM_INODE };
+ struct buffer_head *bh;
+ struct inode *linode = sb_dqopt(sb)->files[type];
+ struct inode *ginode = NULL;
+ struct ocfs2_disk_dqheader *dqhead;
+ int status, ret = 0;
+
+ /* First check whether we understand local quota file */
+ bh = ocfs2_read_quota_block(linode, 0, &status);
+ if (!bh) {
+ mlog_errno(status);
+ mlog(ML_ERROR, "failed to read quota file header (type=%d)\n",
+ type);
+ goto out_err;
+ }
+ dqhead = (struct ocfs2_disk_dqheader *)(bh->b_data);
+ if (le32_to_cpu(dqhead->dqh_magic) != lmagics[type]) {
+ mlog(ML_ERROR, "quota file magic does not match (%u != %u),"
+ " type=%d\n", le32_to_cpu(dqhead->dqh_magic),
+ lmagics[type], type);
+ goto out_err;
+ }
+ if (le32_to_cpu(dqhead->dqh_version) != lversions[type]) {
+ mlog(ML_ERROR, "quota file version does not match (%u != %u),"
+ " type=%d\n", le32_to_cpu(dqhead->dqh_version),
+ lversions[type], type);
+ goto out_err;
+ }
+ brelse(bh);
+ bh = NULL;
+
+ /* Next check whether we understand global quota file */
+ ginode = ocfs2_get_system_file_inode(OCFS2_SB(sb), ino[type],
+ OCFS2_INVALID_SLOT);
+ if (!ginode) {
+ mlog(ML_ERROR, "cannot get global quota file inode "
+ "(type=%d)\n", type);
+ goto out_err;
+ }
+ /* Since the header is read only, we don't care about locking */
+ bh = ocfs2_read_quota_block(ginode, 0, &status);
+ if (!bh) {
+ mlog_errno(status);
+ mlog(ML_ERROR, "failed to read global quota file header "
+ "(type=%d)\n", type);
+ goto out_err;
+ }
+ dqhead = (struct ocfs2_disk_dqheader *)(bh->b_data);
+ if (le32_to_cpu(dqhead->dqh_magic) != gmagics[type]) {
+ mlog(ML_ERROR, "global quota file magic does not match "
+ "(%u != %u), type=%d\n",
+ le32_to_cpu(dqhead->dqh_magic), gmagics[type], type);
+ goto out_err;
+ }
+ if (le32_to_cpu(dqhead->dqh_version) != gversions[type]) {
+ mlog(ML_ERROR, "global quota file version does not match "
+ "(%u != %u), type=%d\n",
+ le32_to_cpu(dqhead->dqh_version), gversions[type],
+ type);
+ goto out_err;
+ }
+
+ ret = 1;
+out_err:
+ brelse(bh);
+ iput(ginode);
+ return ret;
+}
+
+/* Release given list of quota file chunks */
+static void ocfs2_release_local_quota_bitmaps(struct list_head *head)
+{
+ struct ocfs2_quota_chunk *pos, *next;
+
+ list_for_each_entry_safe(pos, next, head, qc_chunk) {
+ list_del(&pos->qc_chunk);
+ brelse(pos->qc_headerbh);
+ kmem_cache_free(ocfs2_qf_chunk_cachep, pos);
+ }
+}
+
+/* Load quota bitmaps into memory */
+static int ocfs2_load_local_quota_bitmaps(struct inode *inode,
+ struct ocfs2_local_disk_dqinfo *ldinfo,
+ struct list_head *head)
+{
+ struct ocfs2_quota_chunk *newchunk;
+ int i, status;
+
+ INIT_LIST_HEAD(head);
+ for (i = 0; i < le32_to_cpu(ldinfo->dqi_chunks); i++) {
+ newchunk = kmem_cache_alloc(ocfs2_qf_chunk_cachep, GFP_NOFS);
+ if (!newchunk) {
+ ocfs2_release_local_quota_bitmaps(head);
+ return -ENOMEM;
+ }
+ newchunk->qc_num = i;
+ newchunk->qc_headerbh = ocfs2_read_quota_block(inode,
+ ol_quota_chunk_block(inode->i_sb, i),
+ &status);
+ if (!newchunk->qc_headerbh) {
+ mlog_errno(status);
+ kmem_cache_free(ocfs2_qf_chunk_cachep, newchunk);
+ ocfs2_release_local_quota_bitmaps(head);
+ return status;
+ }
+ list_add_tail(&newchunk->qc_chunk, head);
+ }
+ return 0;
+}
+
+static void olq_update_info(struct buffer_head *bh, void *private)
+{
+ struct mem_dqinfo *info = private;
+ struct ocfs2_mem_dqinfo *oinfo = info->dqi_priv;
+ struct ocfs2_local_disk_dqinfo *ldinfo;
+
+ ldinfo = (struct ocfs2_local_disk_dqinfo *)(bh->b_data +
+ OCFS2_LOCAL_INFO_OFF);
+ spin_lock(&dq_data_lock);
+ ldinfo->dqi_flags = cpu_to_le32(info->dqi_flags & DQF_MASK);
+ ldinfo->dqi_chunks = cpu_to_le32(oinfo->dqi_chunks);
+ ldinfo->dqi_blocks = cpu_to_le32(oinfo->dqi_blocks);
+ spin_unlock(&dq_data_lock);
+}
+
+/* Read information header from quota file */
+static int ocfs2_local_read_info(struct super_block *sb, int type)
+{
+ struct ocfs2_local_disk_dqinfo *ldinfo;
+ struct mem_dqinfo *info = sb_dqinfo(sb, type);
+ struct ocfs2_mem_dqinfo *oinfo;
+ struct inode *lqinode = sb_dqopt(sb)->files[type];
+ int status;
+ struct buffer_head *bh = NULL;
+ int locked = 0;
+
+ info->dqi_maxblimit = 0x7fffffffffffffffLL;
+ info->dqi_maxilimit = 0x7fffffffffffffffLL;
+ oinfo = kmalloc(sizeof(struct ocfs2_mem_dqinfo), GFP_NOFS);
+ if (!oinfo) {
+ mlog(ML_ERROR, "failed to allocate memory for ocfs2 quota"
+ " info.");
+ goto out_err;
+ }
+ info->dqi_priv = oinfo;
+ oinfo->dqi_type = type;
+ INIT_LIST_HEAD(&oinfo->dqi_chunk);
+ oinfo->dqi_lqi_bh = NULL;
+ oinfo->dqi_ibh = NULL;
+
+ status = ocfs2_global_read_info(sb, type);
+ if (status < 0)
+ goto out_err;
+
+ status = ocfs2_inode_lock(lqinode, &oinfo->dqi_lqi_bh, 1);
+ if (status < 0) {
+ mlog_errno(status);
+ goto out_err;
+ }
+ locked = 1;
+
+ /* Now read local header */
+ bh = ocfs2_read_quota_block(lqinode, 0, &status);
+ if (!bh) {
+ mlog_errno(status);
+ mlog(ML_ERROR, "failed to read quota file info header "
+ "(type=%d)\n", type);
+ goto out_err;
+ }
+ ldinfo = (struct ocfs2_local_disk_dqinfo *)(bh->b_data +
+ OCFS2_LOCAL_INFO_OFF);
+ info->dqi_flags = le32_to_cpu(ldinfo->dqi_flags);
+ oinfo->dqi_chunks = le32_to_cpu(ldinfo->dqi_chunks);
+ oinfo->dqi_blocks = le32_to_cpu(ldinfo->dqi_blocks);
+ oinfo->dqi_ibh = bh;
+
+ /* We crashed when using local quota file? */
+ if (!(info->dqi_flags & OLQF_CLEAN))
+ goto out_err; /* So far we just bail out. Later we should resync here */
+
+ status = ocfs2_load_local_quota_bitmaps(sb_dqopt(sb)->files[type],
+ ldinfo,
+ &oinfo->dqi_chunk);
+ if (status < 0) {
+ mlog_errno(status);
+ goto out_err;
+ }
+
+ /* Now mark quota file as used */
+ info->dqi_flags &= ~OLQF_CLEAN;
+ status = ocfs2_modify_bh(lqinode, bh, olq_update_info, info);
+ if (status < 0) {
+ mlog_errno(status);
+ goto out_err;
+ }
+
+ return 0;
+out_err:
+ if (oinfo) {
+ iput(oinfo->dqi_gqinode);
+ ocfs2_simple_drop_lockres(OCFS2_SB(sb), &oinfo->dqi_gqlock);
+ ocfs2_lock_res_free(&oinfo->dqi_gqlock);
+ brelse(oinfo->dqi_lqi_bh);
+ if (locked)
+ ocfs2_inode_unlock(lqinode, 1);
+ ocfs2_release_local_quota_bitmaps(&oinfo->dqi_chunk);
+ kfree(oinfo);
+ }
+ brelse(bh);
+ return -1;
+}
+
+/* Write local info to quota file */
+static int ocfs2_local_write_info(struct super_block *sb, int type)
+{
+ struct mem_dqinfo *info = sb_dqinfo(sb, type);
+ struct buffer_head *bh = ((struct ocfs2_mem_dqinfo *)info->dqi_priv)
+ ->dqi_ibh;
+ int status;
+
+ status = ocfs2_modify_bh(sb_dqopt(sb)->files[type], bh, olq_update_info,
+ info);
+ if (status < 0) {
+ mlog_errno(status);
+ return -1;
+ }
+
+ return 0;
+}
+
+/* Release info from memory */
+static int ocfs2_local_free_info(struct super_block *sb, int type)
+{
+ struct mem_dqinfo *info = sb_dqinfo(sb, type);
+ struct ocfs2_mem_dqinfo *oinfo = info->dqi_priv;
+ struct ocfs2_quota_chunk *chunk;
+ struct ocfs2_local_disk_chunk *dchunk;
+ int mark_clean = 1, len;
+ int status;
+
+ iput(oinfo->dqi_gqinode);
+ ocfs2_simple_drop_lockres(OCFS2_SB(sb), &oinfo->dqi_gqlock);
+ ocfs2_lock_res_free(&oinfo->dqi_gqlock);
+ list_for_each_entry(chunk, &oinfo->dqi_chunk, qc_chunk) {
+ dchunk = (struct ocfs2_local_disk_chunk *)
+ (chunk->qc_headerbh->b_data);
+ if (chunk->qc_num < oinfo->dqi_chunks - 1) {
+ len = ol_chunk_entries(sb);
+ } else {
+ len = (oinfo->dqi_blocks -
+ ol_quota_chunk_block(sb, chunk->qc_num) - 1)
+ * ol_quota_entries_per_block(sb);
+ }
+ /* Not all entries free? Bug! */
+ if (le32_to_cpu(dchunk->dqc_free) != len) {
+ mlog(ML_ERROR, "releasing quota file with used "
+ "entries (type=%d)\n", type);
+ mark_clean = 0;
+ }
+ }
+ ocfs2_release_local_quota_bitmaps(&oinfo->dqi_chunk);
+
+ if (!mark_clean)
+ goto out;
+
+ /* Mark local file as clean */
+ info->dqi_flags |= OLQF_CLEAN;
+ status = ocfs2_modify_bh(sb_dqopt(sb)->files[type],
+ oinfo->dqi_ibh,
+ olq_update_info,
+ info);
+ if (status < 0) {
+ mlog_errno(status);
+ goto out;
+ }
+
+out:
+ ocfs2_inode_unlock(sb_dqopt(sb)->files[type], 1);
+ brelse(oinfo->dqi_ibh);
+ brelse(oinfo->dqi_lqi_bh);
+ kfree(oinfo);
+ return 0;
+}
+
+static void olq_set_dquot(struct buffer_head *bh, void *private)
+{
+ struct ocfs2_dquot *od = private;
+ struct ocfs2_local_disk_dqblk *dqblk;
+ struct super_block *sb = od->dq_dquot.dq_sb;
+
+ dqblk = (struct ocfs2_local_disk_dqblk *)(bh->b_data
+ + ol_dqblk_block_offset(sb, od->dq_local_off));
+
+ dqblk->dqb_id = cpu_to_le64(od->dq_dquot.dq_id);
+ spin_lock(&dq_data_lock);
+ dqblk->dqb_spacemod = cpu_to_le64(od->dq_dquot.dq_dqb.dqb_curspace -
+ od->dq_origspace);
+ dqblk->dqb_inodemod = cpu_to_le64(od->dq_dquot.dq_dqb.dqb_curinodes -
+ od->dq_originodes);
+ spin_unlock(&dq_data_lock);
+ mlog(0, "Writing local dquot %u space %lld inodes %lld\n",
+ od->dq_dquot.dq_id, dqblk->dqb_spacemod, dqblk->dqb_inodemod);
+}
+
+/* Write dquot to local quota file */
+static int ocfs2_local_write_dquot(struct dquot *dquot)
+{
+ struct super_block *sb = dquot->dq_sb;
+ struct ocfs2_dquot *od = OCFS2_DQUOT(dquot);
+ struct buffer_head *bh;
+ int status;
+
+ bh = ocfs2_read_quota_block(sb_dqopt(sb)->files[dquot->dq_type],
+ ol_dqblk_file_block(sb, od->dq_local_off),
+ &status);
+ if (!bh) {
+ mlog_errno(status);
+ goto out;
+ }
+ status = ocfs2_modify_bh(sb_dqopt(sb)->files[dquot->dq_type], bh,
+ olq_set_dquot, od);
+ if (status < 0) {
+ mlog_errno(status);
+ goto out;
+ }
+out:
+ brelse(bh);
+ return status;
+}
+
+/* Find free entry in local quota file */
+static struct ocfs2_quota_chunk *ocfs2_find_free_entry(struct super_block *sb,
+ int type,
+ int *offset)
+{
+ struct mem_dqinfo *info = sb_dqinfo(sb, type);
+ struct ocfs2_mem_dqinfo *oinfo = info->dqi_priv;
+ struct ocfs2_quota_chunk *chunk;
+ struct ocfs2_local_disk_chunk *dchunk;
+ int found = 0, len;
+
+ list_for_each_entry(chunk, &oinfo->dqi_chunk, qc_chunk) {
+ dchunk = (struct ocfs2_local_disk_chunk *)
+ chunk->qc_headerbh->b_data;
+ if (le32_to_cpu(dchunk->dqc_free) > 0) {
+ found = 1;
+ break;
+ }
+ }
+ if (!found)
+ return NULL;
+
+ if (chunk->qc_num < oinfo->dqi_chunks - 1) {
+ len = ol_chunk_entries(sb);
+ } else {
+ len = (oinfo->dqi_blocks -
+ ol_quota_chunk_block(sb, chunk->qc_num) - 1)
+ * ol_quota_entries_per_block(sb);
+ }
+
+ found = ocfs2_find_next_zero_bit(dchunk->dqc_bitmap, len, 0);
+ /* We failed? */
+ if (found == len) {
+ mlog(ML_ERROR, "Did not find empty entry in chunk %d with %u"
+ " entries free (type=%d)\n", chunk->qc_num,
+ le32_to_cpu(dchunk->dqc_free), type);
+ return ERR_PTR(-EIO);
+ }
+ *offset = found;
+ return chunk;
+}
+
+/* Add new chunk to the local quota file */
+static struct ocfs2_quota_chunk *ocfs2_local_quota_add_chunk(
+ struct super_block *sb,
+ int type,
+ int *offset)
+{
+ struct mem_dqinfo *info = sb_dqinfo(sb, type);
+ struct ocfs2_mem_dqinfo *oinfo = info->dqi_priv;
+ struct inode *lqinode = sb_dqopt(sb)->files[type];
+ struct ocfs2_quota_chunk *chunk = NULL;
+ struct ocfs2_local_disk_chunk *dchunk;
+ int status;
+ handle_t *handle;
+ struct buffer_head *bh = NULL;
+ u64 p_blkno;
+
+ /* We are protected by dqio_sem so no locking needed */
+ status = ocfs2_extend_no_holes(lqinode,
+ lqinode->i_size + 2 * sb->s_blocksize,
+ lqinode->i_size);
+ if (status < 0) {
+ mlog_errno(status);
+ goto out;
+ }
+ status = ocfs2_simple_size_update(lqinode, oinfo->dqi_lqi_bh,
+ lqinode->i_size + 2 * sb->s_blocksize);
+ if (status < 0) {
+ mlog_errno(status);
+ goto out;
+ }
+
+ chunk = kmem_cache_alloc(ocfs2_qf_chunk_cachep, GFP_NOFS);
+ if (!chunk) {
+ status = -ENOMEM;
+ mlog_errno(status);
+ goto out;
+ }
+
+ down_read(&OCFS2_I(lqinode)->ip_alloc_sem);
+ status = ocfs2_extent_map_get_blocks(lqinode, oinfo->dqi_blocks,
+ &p_blkno, NULL, NULL);
+ up_read(&OCFS2_I(lqinode)->ip_alloc_sem);
+ if (status < 0) {
+ mlog_errno(status);
+ goto out;
+ }
+ bh = sb_getblk(sb, p_blkno);
+ if (!bh) {
+ status = -ENOMEM;
+ mlog_errno(status);
+ goto out;
+ }
+ dchunk = (struct ocfs2_local_disk_chunk *)bh->b_data;
+
+ handle = ocfs2_start_trans(OCFS2_SB(sb), 2);
+ if (IS_ERR(handle)) {
+ status = PTR_ERR(handle);
+ mlog_errno(status);
+ goto out;
+ }
+
+ status = ocfs2_journal_access(handle, lqinode, bh,
+ OCFS2_JOURNAL_ACCESS_WRITE);
+ if (status < 0) {
+ mlog_errno(status);
+ goto out_trans;
+ }
+ lock_buffer(bh);
+ dchunk->dqc_free = ol_quota_entries_per_block(sb);
+ memset(dchunk->dqc_bitmap, 0,
+ sb->s_blocksize - sizeof(struct ocfs2_local_disk_chunk) -
+ OCFS2_QBLK_RESERVED_SPACE);
+ set_buffer_uptodate(bh);
+ unlock_buffer(bh);
+ status = ocfs2_journal_dirty(handle, bh);
+ if (status < 0) {
+ mlog_errno(status);
+ goto out_trans;
+ }
+
+ oinfo->dqi_blocks += 2;
+ oinfo->dqi_chunks++;
+ status = ocfs2_local_write_info(sb, type);
+ if (status < 0) {
+ mlog_errno(status);
+ goto out_trans;
+ }
+ status = ocfs2_commit_trans(OCFS2_SB(sb), handle);
+ if (status < 0) {
+ mlog_errno(status);
+ goto out;
+ }
+
+ list_add_tail(&chunk->qc_chunk, &oinfo->dqi_chunk);
+ chunk->qc_num = list_entry(chunk->qc_chunk.prev,
+ struct ocfs2_quota_chunk,
+ qc_chunk)->qc_num + 1;
+ chunk->qc_headerbh = bh;
+ *offset = 0;
+ return chunk;
+out_trans:
+ ocfs2_commit_trans(OCFS2_SB(sb), handle);
+out:
+ brelse(bh);
+ kmem_cache_free(ocfs2_qf_chunk_cachep, chunk);
+ return ERR_PTR(status);
+}
+
+/* Extend the local quota file to make space for a new dquot entry */
+static struct ocfs2_quota_chunk *ocfs2_extend_local_quota_file(
+ struct super_block *sb,
+ int type,
+ int *offset)
+{
+ struct mem_dqinfo *info = sb_dqinfo(sb, type);
+ struct ocfs2_mem_dqinfo *oinfo = info->dqi_priv;
+ struct ocfs2_quota_chunk *chunk;
+ struct inode *lqinode = sb_dqopt(sb)->files[type];
+ struct ocfs2_local_disk_chunk *dchunk;
+ int epb = ol_quota_entries_per_block(sb);
+ unsigned int chunk_blocks;
+ int status;
+ handle_t *handle;
+
+ if (list_empty(&oinfo->dqi_chunk))
+ return ocfs2_local_quota_add_chunk(sb, type, offset);
+ /* Is the last chunk full? */
+ chunk = list_entry(oinfo->dqi_chunk.prev,
+ struct ocfs2_quota_chunk, qc_chunk);
+ chunk_blocks = oinfo->dqi_blocks -
+ ol_quota_chunk_block(sb, chunk->qc_num) - 1;
+ if (ol_chunk_blocks(sb) == chunk_blocks)
+ return ocfs2_local_quota_add_chunk(sb, type, offset);
+
+ /* We are protected by dqio_sem so no locking needed */
+ status = ocfs2_extend_no_holes(lqinode,
+ lqinode->i_size + sb->s_blocksize,
+ lqinode->i_size);
+ if (status < 0) {
+ mlog_errno(status);
+ goto out;
+ }
+ status = ocfs2_simple_size_update(lqinode, oinfo->dqi_lqi_bh,
+ lqinode->i_size + sb->s_blocksize);
+ if (status < 0) {
+ mlog_errno(status);
+ goto out;
+ }
+ handle = ocfs2_start_trans(OCFS2_SB(sb), 2);
+ if (IS_ERR(handle)) {
+ status = PTR_ERR(handle);
+ mlog_errno(status);
+ goto out;
+ }
+ status = ocfs2_journal_access(handle, lqinode, chunk->qc_headerbh,
+ OCFS2_JOURNAL_ACCESS_WRITE);
+ if (status < 0) {
+ mlog_errno(status);
+ goto out_trans;
+ }
+
+ dchunk = (struct ocfs2_local_disk_chunk *)chunk->qc_headerbh->b_data;
+ lock_buffer(chunk->qc_headerbh);
+ le32_add_cpu(&dchunk->dqc_free, ol_quota_entries_per_block(sb));
+ unlock_buffer(chunk->qc_headerbh);
+ status = ocfs2_journal_dirty(handle, chunk->qc_headerbh);
+ if (status < 0) {
+ mlog_errno(status);
+ goto out_trans;
+ }
+ oinfo->dqi_blocks++;
+ status = ocfs2_local_write_info(sb, type);
+ if (status < 0) {
+ mlog_errno(status);
+ goto out_trans;
+ }
+
+ status = ocfs2_commit_trans(OCFS2_SB(sb), handle);
+ if (status < 0) {
+ mlog_errno(status);
+ goto out;
+ }
+ *offset = chunk_blocks * epb;
+ return chunk;
+out_trans:
+ ocfs2_commit_trans(OCFS2_SB(sb), handle);
+out:
+ return ERR_PTR(status);
+}
+
+void olq_alloc_dquot(struct buffer_head *bh, void *private)
+{
+ int *offset = private;
+ struct ocfs2_local_disk_chunk *dchunk;
+
+ dchunk = (struct ocfs2_local_disk_chunk *)bh->b_data;
+ ocfs2_set_bit(*offset, dchunk->dqc_bitmap);
+ le32_add_cpu(&dchunk->dqc_free, -1);
+}
+
+/* Create dquot in the local file for given id */
+static int ocfs2_create_local_dquot(struct dquot *dquot)
+{
+ struct super_block *sb = dquot->dq_sb;
+ int type = dquot->dq_type;
+ struct inode *lqinode = sb_dqopt(sb)->files[type];
+ struct ocfs2_quota_chunk *chunk;
+ struct ocfs2_dquot *od = OCFS2_DQUOT(dquot);
+ int offset;
+ int status;
+
+ chunk = ocfs2_find_free_entry(sb, type, &offset);
+ if (!chunk) {
+ chunk = ocfs2_extend_local_quota_file(sb, type, &offset);
+ if (IS_ERR(chunk))
+ return PTR_ERR(chunk);
+ } else if (IS_ERR(chunk)) {
+ return PTR_ERR(chunk);
+ }
+ od->dq_local_off = ol_dqblk_off(sb, chunk->qc_num, offset);
+ od->dq_chunk = chunk;
+
+ /* Initialize dquot structure on disk */
+ status = ocfs2_local_write_dquot(dquot);
+ if (status < 0) {
+ mlog_errno(status);
+ goto out;
+ }
+
+ /* Mark structure as allocated */
+ status = ocfs2_modify_bh(lqinode, chunk->qc_headerbh, olq_alloc_dquot,
+ &offset);
+ if (status < 0) {
+ mlog_errno(status);
+ goto out;
+ }
+out:
+ return status;
+}
+
+/* Create entry in local file for dquot, load data from the global file */
+static int ocfs2_local_read_dquot(struct dquot *dquot)
+{
+ int status;
+
+ mlog_entry("id=%u, type=%d\n", dquot->dq_id, dquot->dq_type);
+
+ status = ocfs2_global_read_dquot(dquot);
+ if (status < 0) {
+ mlog_errno(status);
+ goto out_err;
+ }
+
+ /* Now create entry in the local quota file */
+ status = ocfs2_create_local_dquot(dquot);
+ if (status < 0) {
+ mlog_errno(status);
+ goto out_err;
+ }
+ mlog_exit(0);
+ return 0;
+out_err:
+ mlog_exit(status);
+ return status;
+}
+
+/* Release dquot structure from local quota file. ocfs2_release_dquot() has
+ * already started a transaction and obtained exclusive lock for global
+ * quota file. */
+static int ocfs2_local_release_dquot(struct dquot *dquot)
+{
+ int status;
+ int type = dquot->dq_type;
+ struct ocfs2_dquot *od = OCFS2_DQUOT(dquot);
+ struct super_block *sb = dquot->dq_sb;
+ struct ocfs2_local_disk_chunk *dchunk;
+ int offset;
+ handle_t *handle = journal_current_handle();
+
+ BUG_ON(!handle);
+ /* First write all local changes to global file */
+ status = ocfs2_global_release_dquot(dquot);
+ if (status < 0) {
+ mlog_errno(status);
+ goto out;
+ }
+
+ status = ocfs2_journal_access(handle, sb_dqopt(sb)->files[type],
+ od->dq_chunk->qc_headerbh, OCFS2_JOURNAL_ACCESS_WRITE);
+ if (status < 0) {
+ mlog_errno(status);
+ goto out;
+ }
+ offset = ol_dqblk_chunk_off(sb, od->dq_chunk->qc_num,
+ od->dq_local_off);
+ dchunk = (struct ocfs2_local_disk_chunk *)
+ (od->dq_chunk->qc_headerbh->b_data);
+ /* Mark structure as freed */
+ lock_buffer(od->dq_chunk->qc_headerbh);
+ ocfs2_clear_bit(offset, dchunk->dqc_bitmap);
+ le32_add_cpu(&dchunk->dqc_free, 1);
+ unlock_buffer(od->dq_chunk->qc_headerbh);
+ status = ocfs2_journal_dirty(handle, od->dq_chunk->qc_headerbh);
+ if (status < 0) {
+ mlog_errno(status);
+ goto out;
+ }
+ status = 0;
+out:
+ /* Clear the read bit so that next time someone uses this
+ * dquot he reads fresh info from disk and allocates local
+ * dquot structure */
+ clear_bit(DQ_READ_B, &dquot->dq_flags);
+ return status;
+}
+
+static struct quota_format_ops ocfs2_format_ops = {
+ .check_quota_file = ocfs2_local_check_quota_file,
+ .read_file_info = ocfs2_local_read_info,
+ .write_file_info = ocfs2_global_write_info,
+ .free_file_info = ocfs2_local_free_info,
+ .read_dqblk = ocfs2_local_read_dquot,
+ .commit_dqblk = ocfs2_local_write_dquot,
+ .release_dqblk = ocfs2_local_release_dquot,
+};
+
+struct quota_format_type ocfs2_quota_format = {
+ .qf_fmt_id = QFMT_OCFS2,
+ .qf_ops = &ocfs2_format_ops,
+ .qf_owner = THIS_MODULE
+};
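
To make the local-file layout helpers above (ol_quota_entries_per_block(),
ol_chunk_blocks(), ol_quota_chunk_block()) more concrete, a worked example for
a 4096-byte block size; the 8-byte reserved tail per quota block is an assumed
value here, not taken from this patch:

#include "ocfs2_fs.h"

/* Worked example, illustration only, assuming a 4096-byte block and an
 * 8-byte reserved tail per quota block:
 *   entries per block : (4096 - 8) / sizeof(struct ocfs2_local_disk_dqblk)
 *                       = 4088 / 24 = 170
 *   blocks per chunk  : ((4096 - 4 - 8) * 8) / 170 = 192
 *   entries per chunk : 192 * 170 = 32640
 * so chunk c starts at file block 1 + 193 * c: one info block up front,
 * then a bitmap header block plus 192 entry blocks per chunk. */
static inline unsigned int ocfs2_example_entries_per_chunk(void)
{
        unsigned int bs = 4096;
        unsigned int epb = (bs - 8) / sizeof(struct ocfs2_local_disk_dqblk);
        unsigned int blocks = ((bs - sizeof(struct ocfs2_local_disk_chunk) - 8)
                               << 3) / epb;

        return blocks * epb;
}
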
diff --git a/fs/ocfs2/super.c b/fs/ocfs2/super.c
index 41bb019..7bb83e4 100644
--- a/fs/ocfs2/super.c
+++ b/fs/ocfs2/super.c
@@ -65,10 +65,13 @@
#include "uptodate.h"
#include "ver.h"
#include "xattr.h"
+#include "quota.h"

#include "buffer_head_io.h"

static struct kmem_cache *ocfs2_inode_cachep = NULL;
+struct kmem_cache *ocfs2_dquot_cachep;
+struct kmem_cache *ocfs2_qf_chunk_cachep;

/* OCFS2 needs to schedule several differnt types of work which
* require cluster locking, disk I/O, recovery waits, etc. Since these
@@ -137,6 +140,8 @@ static const struct super_operations ocfs2_sops = {
.put_super = ocfs2_put_super,
.remount_fs = ocfs2_remount,
.show_options = ocfs2_show_options,
+ .quota_read = ocfs2_quota_read,
+ .quota_write = ocfs2_quota_write,
};

enum {
@@ -1104,6 +1109,7 @@ static int __init ocfs2_init(void)

ocfs2_set_locking_protocol();

+ status = register_quota_format(&ocfs2_quota_format);
leave:
if (status < 0) {
ocfs2_free_mem_caches();
@@ -1127,6 +1133,8 @@ static void __exit ocfs2_exit(void)
destroy_workqueue(ocfs2_wq);
}

+ unregister_quota_format(&ocfs2_quota_format);
+
debugfs_remove(ocfs2_debugfs_root);

ocfs2_free_mem_caches();
@@ -1242,8 +1250,27 @@ static int ocfs2_initialize_mem_caches(void)
(SLAB_HWCACHE_ALIGN|SLAB_RECLAIM_ACCOUNT|
SLAB_MEM_SPREAD),
ocfs2_inode_init_once);
- if (!ocfs2_inode_cachep)
+ ocfs2_dquot_cachep = kmem_cache_create("ocfs2_dquot_cache",
+ sizeof(struct ocfs2_dquot),
+ 0,
+ (SLAB_HWCACHE_ALIGN|SLAB_RECLAIM_ACCOUNT|
+ SLAB_MEM_SPREAD),
+ NULL);
+ ocfs2_qf_chunk_cachep = kmem_cache_create("ocfs2_qf_chunk_cache",
+ sizeof(struct ocfs2_quota_chunk),
+ 0,
+ (SLAB_RECLAIM_ACCOUNT|SLAB_MEM_SPREAD),
+ NULL);
+ if (!ocfs2_inode_cachep || !ocfs2_dquot_cachep ||
+ !ocfs2_qf_chunk_cachep) {
+ if (ocfs2_inode_cachep)
+ kmem_cache_destroy(ocfs2_inode_cachep);
+ if (ocfs2_dquot_cachep)
+ kmem_cache_destroy(ocfs2_dquot_cachep);
+ if (ocfs2_qf_chunk_cachep)
+ kmem_cache_destroy(ocfs2_qf_chunk_cachep);
return -ENOMEM;
+ }

return 0;
}
@@ -1252,8 +1279,15 @@ static void ocfs2_free_mem_caches(void)
{
if (ocfs2_inode_cachep)
kmem_cache_destroy(ocfs2_inode_cachep);
-
ocfs2_inode_cachep = NULL;
+
+ if (ocfs2_dquot_cachep)
+ kmem_cache_destroy(ocfs2_dquot_cachep);
+ ocfs2_dquot_cachep = NULL;
+
+ if (ocfs2_qf_chunk_cachep)
+ kmem_cache_destroy(ocfs2_qf_chunk_cachep);
+ ocfs2_qf_chunk_cachep = NULL;
}

static int ocfs2_get_sector(struct super_block *sb,
--
1.5.6

2008-12-22 21:58:32

by Mark Fasheh

[permalink] [raw]
Subject: [PATCH 26/56] ocfs2: Implement quota recovery

From: Jan Kara <[email protected]>

Implement functions for quota recovery after a node crash. They read the
dead node's local quota file and sync the recorded usage changes into the
global quota file (a rough sketch of this flow follows the diffstat below).

Signed-off-by: Jan Kara <[email protected]>
Signed-off-by: Mark Fasheh <[email protected]>
---
fs/ocfs2/journal.c | 108 ++++++++++---
fs/ocfs2/journal.h | 1 +
fs/ocfs2/ocfs2.h | 4 +-
fs/ocfs2/quota.h | 21 +++
fs/ocfs2/quota_global.c | 1 -
fs/ocfs2/quota_local.c | 425 ++++++++++++++++++++++++++++++++++++++++++++++-
6 files changed, 528 insertions(+), 32 deletions(-)
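
As a rough sketch of the replay step described in the changelog: the real
entry points added below are ocfs2_begin_quota_recovery() and
ocfs2_finish_quota_recovery(); everything prefixed example_ here is
hypothetical, and the caller is assumed to hold the global quota-file lock
and a running transaction, as the surrounding code requires.

#include <linux/quota.h>
#include <linux/quotaops.h>
#include "ocfs2_fs.h"
#include "quota.h"

/* Conceptual sketch only, not the patch's code: fold one entry from a
 * dead node's local quota file back into the global quota file. */
static int example_replay_local_entry(struct super_block *sb, int type,
                                      struct ocfs2_local_disk_dqblk *ld)
{
        struct dquot *dquot;
        int status;

        dquot = dqget(sb, le64_to_cpu(ld->dqb_id), type);
        if (!dquot)
                return -EIO;

        /* Apply the usage changes the crashed node never synced... */
        spin_lock(&dq_data_lock);
        dquot->dq_dqb.dqb_curspace += le64_to_cpu(ld->dqb_spacemod);
        dquot->dq_dqb.dqb_curinodes += le64_to_cpu(ld->dqb_inodemod);
        spin_unlock(&dq_data_lock);

        /* ...and push the result to the global quota file. */
        status = ocfs2_sync_dquot(dquot);
        dqput(dquot);
        return status;
}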

diff --git a/fs/ocfs2/journal.c b/fs/ocfs2/journal.c
index 11a1178..c602420 100644
--- a/fs/ocfs2/journal.c
+++ b/fs/ocfs2/journal.c
@@ -45,6 +45,7 @@
#include "slot_map.h"
#include "super.h"
#include "sysfile.h"
+#include "quota.h"

#include "buffer_head_io.h"

@@ -52,7 +53,7 @@ DEFINE_SPINLOCK(trans_inc_lock);

static int ocfs2_force_read_journal(struct inode *inode);
static int ocfs2_recover_node(struct ocfs2_super *osb,
- int node_num);
+ int node_num, int slot_num);
static int __ocfs2_recovery_thread(void *arg);
static int ocfs2_commit_cache(struct ocfs2_super *osb);
static int ocfs2_wait_on_mount(struct ocfs2_super *osb);
@@ -857,6 +858,7 @@ struct ocfs2_la_recovery_item {
int lri_slot;
struct ocfs2_dinode *lri_la_dinode;
struct ocfs2_dinode *lri_tl_dinode;
+ struct ocfs2_quota_recovery *lri_qrec;
};

/* Does the second half of the recovery process. By this point, the
@@ -877,6 +879,7 @@ void ocfs2_complete_recovery(struct work_struct *work)
struct ocfs2_super *osb = journal->j_osb;
struct ocfs2_dinode *la_dinode, *tl_dinode;
struct ocfs2_la_recovery_item *item, *n;
+ struct ocfs2_quota_recovery *qrec;
LIST_HEAD(tmp_la_list);

mlog_entry_void();
@@ -922,6 +925,16 @@ void ocfs2_complete_recovery(struct work_struct *work)
if (ret < 0)
mlog_errno(ret);

+ qrec = item->lri_qrec;
+ if (qrec) {
+ mlog(0, "Recovering quota files");
+ ret = ocfs2_finish_quota_recovery(osb, qrec,
+ item->lri_slot);
+ if (ret < 0)
+ mlog_errno(ret);
+ /* Recovery info is already freed now */
+ }
+
kfree(item);
}

@@ -935,7 +948,8 @@ void ocfs2_complete_recovery(struct work_struct *work)
static void ocfs2_queue_recovery_completion(struct ocfs2_journal *journal,
int slot_num,
struct ocfs2_dinode *la_dinode,
- struct ocfs2_dinode *tl_dinode)
+ struct ocfs2_dinode *tl_dinode,
+ struct ocfs2_quota_recovery *qrec)
{
struct ocfs2_la_recovery_item *item;

@@ -950,6 +964,9 @@ static void ocfs2_queue_recovery_completion(struct ocfs2_journal *journal,
if (tl_dinode)
kfree(tl_dinode);

+ if (qrec)
+ ocfs2_free_quota_recovery(qrec);
+
mlog_errno(-ENOMEM);
return;
}
@@ -958,6 +975,7 @@ static void ocfs2_queue_recovery_completion(struct ocfs2_journal *journal,
item->lri_la_dinode = la_dinode;
item->lri_slot = slot_num;
item->lri_tl_dinode = tl_dinode;
+ item->lri_qrec = qrec;

spin_lock(&journal->j_lock);
list_add_tail(&item->lri_list, &journal->j_la_cleanups);
@@ -977,6 +995,7 @@ void ocfs2_complete_mount_recovery(struct ocfs2_super *osb)
ocfs2_queue_recovery_completion(journal,
osb->slot_num,
osb->local_alloc_copy,
+ NULL,
NULL);
ocfs2_schedule_truncate_log_flush(osb, 0);

@@ -985,11 +1004,26 @@ void ocfs2_complete_mount_recovery(struct ocfs2_super *osb)
}
}

+void ocfs2_complete_quota_recovery(struct ocfs2_super *osb)
+{
+ if (osb->quota_rec) {
+ ocfs2_queue_recovery_completion(osb->journal,
+ osb->slot_num,
+ NULL,
+ NULL,
+ osb->quota_rec);
+ osb->quota_rec = NULL;
+ }
+}
+
static int __ocfs2_recovery_thread(void *arg)
{
- int status, node_num;
+ int status, node_num, slot_num;
struct ocfs2_super *osb = arg;
struct ocfs2_recovery_map *rm = osb->recovery_map;
+ int *rm_quota = NULL;
+ int rm_quota_used = 0, i;
+ struct ocfs2_quota_recovery *qrec;

mlog_entry_void();

@@ -998,6 +1032,11 @@ static int __ocfs2_recovery_thread(void *arg)
goto bail;
}

+ rm_quota = kzalloc(osb->max_slots * sizeof(int), GFP_NOFS);
+ if (!rm_quota) {
+ status = -ENOMEM;
+ goto bail;
+ }
restart:
status = ocfs2_super_lock(osb, 1);
if (status < 0) {
@@ -1011,8 +1050,28 @@ restart:
* clear it until ocfs2_recover_node() has succeeded. */
node_num = rm->rm_entries[0];
spin_unlock(&osb->osb_lock);
-
- status = ocfs2_recover_node(osb, node_num);
+ mlog(0, "checking node %d\n", node_num);
+ slot_num = ocfs2_node_num_to_slot(osb, node_num);
+ if (slot_num == -ENOENT) {
+ status = 0;
+ mlog(0, "no slot for this node, so no recovery"
+ "required.\n");
+ goto skip_recovery;
+ }
+ mlog(0, "node %d was using slot %d\n", node_num, slot_num);
+
+ /* It is a bit subtle with quota recovery. We cannot do it
+ * immediately because we have to obtain cluster locks from
+ * quota files and we also don't want to just skip it because
+ * then quota usage would be out of sync until some node takes
+ * the slot. So we remember which nodes need quota recovery
+ * and when everything else is done, we recover quotas. */
+ for (i = 0; i < rm_quota_used && rm_quota[i] != slot_num; i++);
+ if (i == rm_quota_used)
+ rm_quota[rm_quota_used++] = slot_num;
+
+ status = ocfs2_recover_node(osb, node_num, slot_num);
+skip_recovery:
if (!status) {
ocfs2_recovery_map_clear(osb, node_num);
} else {
@@ -1034,13 +1093,27 @@ restart:
if (status < 0)
mlog_errno(status);

+ /* Now it is right time to recover quotas... We have to do this under
+ * superblock lock so that no one can start using the slot (and crash)
+ * before we recover it */
+ for (i = 0; i < rm_quota_used; i++) {
+ qrec = ocfs2_begin_quota_recovery(osb, rm_quota[i]);
+ if (IS_ERR(qrec)) {
+ status = PTR_ERR(qrec);
+ mlog_errno(status);
+ continue;
+ }
+ ocfs2_queue_recovery_completion(osb->journal, rm_quota[i],
+ NULL, NULL, qrec);
+ }
+
ocfs2_super_unlock(osb, 1);

/* We always run recovery on our own orphan dir - the dead
* node(s) may have disallowed a previous inode delete. Re-processing
* is therefore required. */
ocfs2_queue_recovery_completion(osb->journal, osb->slot_num, NULL,
- NULL);
+ NULL, NULL);

bail:
mutex_lock(&osb->recovery_lock);
@@ -1055,6 +1128,9 @@ bail:

mutex_unlock(&osb->recovery_lock);

+ if (rm_quota)
+ kfree(rm_quota);
+
mlog_exit(status);
/* no one is calling kthread_stop() for us so the kthread() api
* requires that we call do_exit(). And it isn't exported, but
@@ -1282,31 +1358,19 @@ done:
* far less concerning.
*/
static int ocfs2_recover_node(struct ocfs2_super *osb,
- int node_num)
+ int node_num, int slot_num)
{
int status = 0;
- int slot_num;
struct ocfs2_dinode *la_copy = NULL;
struct ocfs2_dinode *tl_copy = NULL;

- mlog_entry("(node_num=%d, osb->node_num = %d)\n",
- node_num, osb->node_num);
-
- mlog(0, "checking node %d\n", node_num);
+ mlog_entry("(node_num=%d, slot_num=%d, osb->node_num = %d)\n",
+ node_num, slot_num, osb->node_num);

/* Should not ever be called to recover ourselves -- in that
* case we should've called ocfs2_journal_load instead. */
BUG_ON(osb->node_num == node_num);

- slot_num = ocfs2_node_num_to_slot(osb, node_num);
- if (slot_num == -ENOENT) {
- status = 0;
- mlog(0, "no slot for this node, so no recovery required.\n");
- goto done;
- }
-
- mlog(0, "node %d was using slot %d\n", node_num, slot_num);
-
status = ocfs2_replay_journal(osb, node_num, slot_num);
if (status < 0) {
if (status == -EBUSY) {
@@ -1342,7 +1406,7 @@ static int ocfs2_recover_node(struct ocfs2_super *osb,

/* This will kfree the memory pointed to by la_copy and tl_copy */
ocfs2_queue_recovery_completion(osb->journal, slot_num, la_copy,
- tl_copy);
+ tl_copy, NULL);

status = 0;
done:
diff --git a/fs/ocfs2/journal.h b/fs/ocfs2/journal.h
index ee08e9c..37013bf 100644
--- a/fs/ocfs2/journal.h
+++ b/fs/ocfs2/journal.h
@@ -168,6 +168,7 @@ void ocfs2_recovery_thread(struct ocfs2_super *osb,
int node_num);
int ocfs2_mark_dead_nodes(struct ocfs2_super *osb);
void ocfs2_complete_mount_recovery(struct ocfs2_super *osb);
+void ocfs2_complete_quota_recovery(struct ocfs2_super *osb);

static inline void ocfs2_start_checkpoint(struct ocfs2_super *osb)
{
diff --git a/fs/ocfs2/ocfs2.h b/fs/ocfs2/ocfs2.h
index f04b229..6b25b4a 100644
--- a/fs/ocfs2/ocfs2.h
+++ b/fs/ocfs2/ocfs2.h
@@ -206,6 +206,7 @@ enum ocfs2_mount_options
struct ocfs2_journal;
struct ocfs2_slot_info;
struct ocfs2_recovery_map;
+struct ocfs2_quota_recovery;
struct ocfs2_super
{
struct task_struct *commit_task;
@@ -287,10 +288,11 @@ struct ocfs2_super
char *local_alloc_debug_buf;
#endif

- /* Next two fields are for local node slot recovery during
+ /* Next three fields are for local node slot recovery during
* mount. */
int dirty;
struct ocfs2_dinode *local_alloc_copy;
+ struct ocfs2_quota_recovery *quota_rec;

struct ocfs2_alloc_stats alloc_stats;
char dev_str[20]; /* "major,minor" of the device */
diff --git a/fs/ocfs2/quota.h b/fs/ocfs2/quota.h
index 11cdff1..84c50a1 100644
--- a/fs/ocfs2/quota.h
+++ b/fs/ocfs2/quota.h
@@ -34,6 +34,17 @@ struct ocfs2_dquot {
s64 dq_originodes; /* Last globally synced inode usage */
};

+/* Description of one chunk to recover in memory */
+struct ocfs2_recovery_chunk {
+ struct list_head rc_list; /* List of chunks */
+ int rc_chunk; /* Chunk number */
+ unsigned long *rc_bitmap; /* Bitmap of entries to recover */
+};
+
+struct ocfs2_quota_recovery {
+ struct list_head r_list[MAXQUOTAS]; /* List of chunks to recover */
+};
+
/* In-memory structure with quota header information */
struct ocfs2_mem_dqinfo {
unsigned int dqi_type; /* Quota type this structure describes */
@@ -50,6 +61,10 @@ struct ocfs2_mem_dqinfo {
struct buffer_head *dqi_ibh; /* Buffer with information header */
struct qtree_mem_dqinfo dqi_gi; /* Info about global file */
struct timer_list dqi_sync_timer; /* Timer for syncing dquots */
+ struct ocfs2_quota_recovery *dqi_rec; /* Pointer to recovery
+ * information, in case we
+ * enable quotas on file
+ * needing it */
};

static inline struct ocfs2_dquot *OCFS2_DQUOT(struct dquot *dquot)
@@ -68,6 +83,12 @@ extern struct kmem_cache *ocfs2_qf_chunk_cachep;

extern struct qtree_fmt_operations ocfs2_global_ops;

+struct ocfs2_quota_recovery *ocfs2_begin_quota_recovery(
+ struct ocfs2_super *osb, int slot_num);
+int ocfs2_finish_quota_recovery(struct ocfs2_super *osb,
+ struct ocfs2_quota_recovery *rec,
+ int slot_num);
+void ocfs2_free_quota_recovery(struct ocfs2_quota_recovery *rec);
ssize_t ocfs2_quota_read(struct super_block *sb, int type, char *data,
size_t len, loff_t off);
ssize_t ocfs2_quota_write(struct super_block *sb, int type,
diff --git a/fs/ocfs2/quota_global.c b/fs/ocfs2/quota_global.c
index 4a5bc09..d2a5bfa 100644
--- a/fs/ocfs2/quota_global.c
+++ b/fs/ocfs2/quota_global.c
@@ -85,7 +85,6 @@ struct qtree_fmt_operations ocfs2_global_ops = {
.is_id = ocfs2_global_is_id,
};

-
struct buffer_head *ocfs2_read_quota_block(struct inode *inode,
int block, int *err)
{
diff --git a/fs/ocfs2/quota_local.c b/fs/ocfs2/quota_local.c
index 1db7a16..54e8788 100644
--- a/fs/ocfs2/quota_local.c
+++ b/fs/ocfs2/quota_local.c
@@ -49,14 +49,25 @@ static unsigned int ol_quota_chunk_block(struct super_block *sb, int c)
return 1 + (ol_chunk_blocks(sb) + 1) * c;
}

-/* Offset of the dquot structure in the quota file */
-static loff_t ol_dqblk_off(struct super_block *sb, int c, int off)
+static unsigned int ol_dqblk_block(struct super_block *sb, int c, int off)
+{
+ int epb = ol_quota_entries_per_block(sb);
+
+ return ol_quota_chunk_block(sb, c) + 1 + off / epb;
+}
+
+static unsigned int ol_dqblk_block_off(struct super_block *sb, int c, int off)
{
int epb = ol_quota_entries_per_block(sb);

- return ((ol_quota_chunk_block(sb, c) + 1 + off / epb)
- << sb->s_blocksize_bits) +
- (off % epb) * sizeof(struct ocfs2_local_disk_dqblk);
+ return (off % epb) * sizeof(struct ocfs2_local_disk_dqblk);
+}
+
+/* Offset of the dquot structure in the quota file */
+static loff_t ol_dqblk_off(struct super_block *sb, int c, int off)
+{
+ return (ol_dqblk_block(sb, c, off) << sb->s_blocksize_bits) +
+ ol_dqblk_block_off(sb, c, off);
}

/* Compute block number from given offset */
@@ -253,6 +264,379 @@ static void olq_update_info(struct buffer_head *bh, void *private)
spin_unlock(&dq_data_lock);
}

+static int ocfs2_add_recovery_chunk(struct super_block *sb,
+ struct ocfs2_local_disk_chunk *dchunk,
+ int chunk,
+ struct list_head *head)
+{
+ struct ocfs2_recovery_chunk *rc;
+
+ rc = kmalloc(sizeof(struct ocfs2_recovery_chunk), GFP_NOFS);
+ if (!rc)
+ return -ENOMEM;
+ rc->rc_chunk = chunk;
+ rc->rc_bitmap = kmalloc(sb->s_blocksize, GFP_NOFS);
+ if (!rc->rc_bitmap) {
+ kfree(rc);
+ return -ENOMEM;
+ }
+ memcpy(rc->rc_bitmap, dchunk->dqc_bitmap,
+ (ol_chunk_entries(sb) + 7) >> 3);
+ list_add_tail(&rc->rc_list, head);
+ return 0;
+}
+
+static void free_recovery_list(struct list_head *head)
+{
+ struct ocfs2_recovery_chunk *next;
+ struct ocfs2_recovery_chunk *rchunk;
+
+ list_for_each_entry_safe(rchunk, next, head, rc_list) {
+ list_del(&rchunk->rc_list);
+ kfree(rchunk->rc_bitmap);
+ kfree(rchunk);
+ }
+}
+
+void ocfs2_free_quota_recovery(struct ocfs2_quota_recovery *rec)
+{
+ int type;
+
+ for (type = 0; type < MAXQUOTAS; type++)
+ free_recovery_list(&(rec->r_list[type]));
+ kfree(rec);
+}
+
+/* Load entries in our quota file that we have to recover */
+static int ocfs2_recovery_load_quota(struct inode *lqinode,
+ struct ocfs2_local_disk_dqinfo *ldinfo,
+ int type,
+ struct list_head *head)
+{
+ struct super_block *sb = lqinode->i_sb;
+ struct buffer_head *hbh;
+ struct ocfs2_local_disk_chunk *dchunk;
+ int i, chunks = le32_to_cpu(ldinfo->dqi_chunks);
+ int status = 0;
+
+ for (i = 0; i < chunks; i++) {
+ hbh = ocfs2_read_quota_block(lqinode,
+ ol_quota_chunk_block(sb, i),
+ &status);
+ if (!hbh) {
+ mlog_errno(status);
+ break;
+ }
+ dchunk = (struct ocfs2_local_disk_chunk *)hbh->b_data;
+ if (le32_to_cpu(dchunk->dqc_free) < ol_chunk_entries(sb))
+ status = ocfs2_add_recovery_chunk(sb, dchunk, i, head);
+ brelse(hbh);
+ if (status < 0)
+ break;
+ }
+ if (status < 0)
+ free_recovery_list(head);
+ return status;
+}
+
+static struct ocfs2_quota_recovery *ocfs2_alloc_quota_recovery(void)
+{
+ int type;
+ struct ocfs2_quota_recovery *rec;
+
+ rec = kmalloc(sizeof(struct ocfs2_quota_recovery), GFP_NOFS);
+ if (!rec)
+ return NULL;
+ for (type = 0; type < MAXQUOTAS; type++)
+ INIT_LIST_HEAD(&(rec->r_list[type]));
+ return rec;
+}
+
+/* Load information we need for quota recovery into memory */
+struct ocfs2_quota_recovery *ocfs2_begin_quota_recovery(
+ struct ocfs2_super *osb,
+ int slot_num)
+{
+ unsigned int feature[MAXQUOTAS] = { OCFS2_FEATURE_RO_COMPAT_USRQUOTA,
+ OCFS2_FEATURE_RO_COMPAT_GRPQUOTA};
+ unsigned int ino[MAXQUOTAS] = { LOCAL_USER_QUOTA_SYSTEM_INODE,
+ LOCAL_GROUP_QUOTA_SYSTEM_INODE };
+ struct super_block *sb = osb->sb;
+ struct ocfs2_local_disk_dqinfo *ldinfo;
+ struct inode *lqinode;
+ struct buffer_head *bh;
+ int type;
+ int status = 0;
+ struct ocfs2_quota_recovery *rec;
+
+ mlog(ML_NOTICE, "Beginning quota recovery in slot %u\n", slot_num);
+ rec = ocfs2_alloc_quota_recovery();
+ if (!rec)
+ return ERR_PTR(-ENOMEM);
+ /* First init... */
+
+ for (type = 0; type < MAXQUOTAS; type++) {
+ if (!OCFS2_HAS_RO_COMPAT_FEATURE(sb, feature[type]))
+ continue;
+ /* At this point, journal of the slot is already replayed so
+ * we can trust metadata and data of the quota file */
+ lqinode = ocfs2_get_system_file_inode(osb, ino[type], slot_num);
+ if (!lqinode) {
+ status = -ENOENT;
+ goto out;
+ }
+ status = ocfs2_inode_lock_full(lqinode, NULL, 1,
+ OCFS2_META_LOCK_RECOVERY);
+ if (status < 0) {
+ mlog_errno(status);
+ goto out_put;
+ }
+ /* Now read local header */
+ bh = ocfs2_read_quota_block(lqinode, 0, &status);
+ if (!bh) {
+ mlog_errno(status);
+ mlog(ML_ERROR, "failed to read quota file info header "
+ "(slot=%d type=%d)\n", slot_num, type);
+ goto out_lock;
+ }
+ ldinfo = (struct ocfs2_local_disk_dqinfo *)(bh->b_data +
+ OCFS2_LOCAL_INFO_OFF);
+ status = ocfs2_recovery_load_quota(lqinode, ldinfo, type,
+ &rec->r_list[type]);
+ brelse(bh);
+out_lock:
+ ocfs2_inode_unlock(lqinode, 1);
+out_put:
+ iput(lqinode);
+ if (status < 0)
+ break;
+ }
+out:
+ if (status < 0) {
+ ocfs2_free_quota_recovery(rec);
+ rec = ERR_PTR(status);
+ }
+ return rec;
+}
+
+/* Sync changes in local quota file into global quota file and
+ * reinitialize local quota file.
+ * The function expects local quota file to be already locked and
+ * dqonoff_mutex locked. */
+static int ocfs2_recover_local_quota_file(struct inode *lqinode,
+ int type,
+ struct ocfs2_quota_recovery *rec)
+{
+ struct super_block *sb = lqinode->i_sb;
+ struct ocfs2_mem_dqinfo *oinfo = sb_dqinfo(sb, type)->dqi_priv;
+ struct ocfs2_local_disk_chunk *dchunk;
+ struct ocfs2_local_disk_dqblk *dqblk;
+ struct dquot *dquot;
+ handle_t *handle;
+ struct buffer_head *hbh = NULL, *qbh = NULL;
+ int status = 0;
+ int bit, chunk;
+ struct ocfs2_recovery_chunk *rchunk, *next;
+ qsize_t spacechange, inodechange;
+
+ mlog_entry("ino=%lu type=%u", (unsigned long)lqinode->i_ino, type);
+
+ status = ocfs2_lock_global_qf(oinfo, 1);
+ if (status < 0)
+ goto out;
+
+ list_for_each_entry_safe(rchunk, next, &(rec->r_list[type]), rc_list) {
+ chunk = rchunk->rc_chunk;
+ hbh = ocfs2_read_quota_block(lqinode,
+ ol_quota_chunk_block(sb, chunk),
+ &status);
+ if (!hbh) {
+ mlog_errno(status);
+ break;
+ }
+ dchunk = (struct ocfs2_local_disk_chunk *)hbh->b_data;
+ for_each_bit(bit, rchunk->rc_bitmap, ol_chunk_entries(sb)) {
+ qbh = ocfs2_read_quota_block(lqinode,
+ ol_dqblk_block(sb, chunk, bit),
+ &status);
+ if (!qbh) {
+ mlog_errno(status);
+ break;
+ }
+ dqblk = (struct ocfs2_local_disk_dqblk *)(qbh->b_data +
+ ol_dqblk_block_off(sb, chunk, bit));
+ dquot = dqget(sb, le64_to_cpu(dqblk->dqb_id), type);
+ if (!dquot) {
+ status = -EIO;
+ mlog(ML_ERROR, "Failed to get quota structure "
+ "for id %u, type %d. Cannot finish quota "
+ "file recovery.\n",
+ (unsigned)le64_to_cpu(dqblk->dqb_id),
+ type);
+ goto out_put_bh;
+ }
+ handle = ocfs2_start_trans(OCFS2_SB(sb),
+ OCFS2_QSYNC_CREDITS);
+ if (IS_ERR(handle)) {
+ status = PTR_ERR(handle);
+ mlog_errno(status);
+ goto out_put_dquot;
+ }
+ mutex_lock(&sb_dqopt(sb)->dqio_mutex);
+ spin_lock(&dq_data_lock);
+ /* Add usage from quota entry into quota changes
+ * of our node. Auxiliary variables are important
+ * due to signedness */
+ spacechange = le64_to_cpu(dqblk->dqb_spacemod);
+ inodechange = le64_to_cpu(dqblk->dqb_inodemod);
+ dquot->dq_dqb.dqb_curspace += spacechange;
+ dquot->dq_dqb.dqb_curinodes += inodechange;
+ spin_unlock(&dq_data_lock);
+ /* We want to drop reference held by the crashed
+ * node. Since we have our own reference we know
+ * global structure actually won't be freed. */
+ status = ocfs2_global_release_dquot(dquot);
+ if (status < 0) {
+ mlog_errno(status);
+ goto out_commit;
+ }
+ /* Release local quota file entry */
+ status = ocfs2_journal_access(handle, lqinode,
+ qbh, OCFS2_JOURNAL_ACCESS_WRITE);
+ if (status < 0) {
+ mlog_errno(status);
+ goto out_commit;
+ }
+ lock_buffer(qbh);
+ WARN_ON(!ocfs2_test_bit(bit, dchunk->dqc_bitmap));
+ ocfs2_clear_bit(bit, dchunk->dqc_bitmap);
+ le32_add_cpu(&dchunk->dqc_free, 1);
+ unlock_buffer(qbh);
+ status = ocfs2_journal_dirty(handle, qbh);
+ if (status < 0)
+ mlog_errno(status);
+out_commit:
+ mutex_unlock(&sb_dqopt(sb)->dqio_mutex);
+ ocfs2_commit_trans(OCFS2_SB(sb), handle);
+out_put_dquot:
+ dqput(dquot);
+out_put_bh:
+ brelse(qbh);
+ if (status < 0)
+ break;
+ }
+ brelse(hbh);
+ list_del(&rchunk->rc_list);
+ kfree(rchunk->rc_bitmap);
+ kfree(rchunk);
+ if (status < 0)
+ break;
+ }
+ ocfs2_unlock_global_qf(oinfo, 1);
+out:
+ if (status < 0)
+ free_recovery_list(&(rec->r_list[type]));
+ mlog_exit(status);
+ return status;
+}
+
+/* Recover local quota files for given node different from us */
+int ocfs2_finish_quota_recovery(struct ocfs2_super *osb,
+ struct ocfs2_quota_recovery *rec,
+ int slot_num)
+{
+ unsigned int ino[MAXQUOTAS] = { LOCAL_USER_QUOTA_SYSTEM_INODE,
+ LOCAL_GROUP_QUOTA_SYSTEM_INODE };
+ struct super_block *sb = osb->sb;
+ struct ocfs2_local_disk_dqinfo *ldinfo;
+ struct buffer_head *bh;
+ handle_t *handle;
+ int type;
+ int status = 0;
+ struct inode *lqinode;
+ unsigned int flags;
+
+ mlog(ML_NOTICE, "Finishing quota recovery in slot %u\n", slot_num);
+ mutex_lock(&sb_dqopt(sb)->dqonoff_mutex);
+ for (type = 0; type < MAXQUOTAS; type++) {
+ if (list_empty(&(rec->r_list[type])))
+ continue;
+ mlog(0, "Recovering quota in slot %d\n", slot_num);
+ lqinode = ocfs2_get_system_file_inode(osb, ino[type], slot_num);
+ if (!lqinode) {
+ status = -ENOENT;
+ goto out;
+ }
+ status = ocfs2_inode_lock_full(lqinode, NULL, 1,
+ OCFS2_META_LOCK_NOQUEUE);
+ /* Someone else is holding the lock? Then he must be
+ * doing the recovery. Just skip the file... */
+ if (status == -EAGAIN) {
+ mlog(ML_NOTICE, "skipping quota recovery for slot %d "
+ "because quota file is locked.\n", slot_num);
+ status = 0;
+ goto out_put;
+ } else if (status < 0) {
+ mlog_errno(status);
+ goto out_put;
+ }
+ /* Now read local header */
+ bh = ocfs2_read_quota_block(lqinode, 0, &status);
+ if (!bh) {
+ mlog_errno(status);
+ mlog(ML_ERROR, "failed to read quota file info header "
+ "(slot=%d type=%d)\n", slot_num, type);
+ goto out_lock;
+ }
+ ldinfo = (struct ocfs2_local_disk_dqinfo *)(bh->b_data +
+ OCFS2_LOCAL_INFO_OFF);
+ /* Is recovery still needed? */
+ flags = le32_to_cpu(ldinfo->dqi_flags);
+ if (!(flags & OLQF_CLEAN))
+ status = ocfs2_recover_local_quota_file(lqinode,
+ type,
+ rec);
+ /* We don't want to mark file as clean when it is actually
+ * active */
+ if (slot_num == osb->slot_num)
+ goto out_bh;
+ /* Mark quota file as clean if we are recovering quota file of
+ * some other node. */
+ handle = ocfs2_start_trans(osb, 1);
+ if (IS_ERR(handle)) {
+ status = PTR_ERR(handle);
+ mlog_errno(status);
+ goto out_bh;
+ }
+ status = ocfs2_journal_access(handle, lqinode, bh,
+ OCFS2_JOURNAL_ACCESS_WRITE);
+ if (status < 0) {
+ mlog_errno(status);
+ goto out_trans;
+ }
+ lock_buffer(bh);
+ ldinfo->dqi_flags = cpu_to_le32(flags | OLQF_CLEAN);
+ unlock_buffer(bh);
+ status = ocfs2_journal_dirty(handle, bh);
+ if (status < 0)
+ mlog_errno(status);
+out_trans:
+ ocfs2_commit_trans(osb, handle);
+out_bh:
+ brelse(bh);
+out_lock:
+ ocfs2_inode_unlock(lqinode, 1);
+out_put:
+ iput(lqinode);
+ if (status < 0)
+ break;
+ }
+out:
+ mutex_unlock(&sb_dqopt(sb)->dqonoff_mutex);
+ kfree(rec);
+ return status;
+}
+
/* Read information header from quota file */
static int ocfs2_local_read_info(struct super_block *sb, int type)
{
@@ -262,6 +646,7 @@ static int ocfs2_local_read_info(struct super_block *sb, int type)
struct inode *lqinode = sb_dqopt(sb)->files[type];
int status;
struct buffer_head *bh = NULL;
+ struct ocfs2_quota_recovery *rec;
int locked = 0;

info->dqi_maxblimit = 0x7fffffffffffffffLL;
@@ -275,6 +660,7 @@ static int ocfs2_local_read_info(struct super_block *sb, int type)
info->dqi_priv = oinfo;
oinfo->dqi_type = type;
INIT_LIST_HEAD(&oinfo->dqi_chunk);
+ oinfo->dqi_rec = NULL;
oinfo->dqi_lqi_bh = NULL;
oinfo->dqi_ibh = NULL;

@@ -305,10 +691,27 @@ static int ocfs2_local_read_info(struct super_block *sb, int type)
oinfo->dqi_ibh = bh;

/* We crashed when using local quota file? */
- if (!(info->dqi_flags & OLQF_CLEAN))
- goto out_err; /* So far we just bail out. Later we should resync here */
+ if (!(info->dqi_flags & OLQF_CLEAN)) {
+ rec = OCFS2_SB(sb)->quota_rec;
+ if (!rec) {
+ rec = ocfs2_alloc_quota_recovery();
+ if (!rec) {
+ status = -ENOMEM;
+ mlog_errno(status);
+ goto out_err;
+ }
+ OCFS2_SB(sb)->quota_rec = rec;
+ }

- status = ocfs2_load_local_quota_bitmaps(sb_dqopt(sb)->files[type],
+ status = ocfs2_recovery_load_quota(lqinode, ldinfo, type,
+ &rec->r_list[type]);
+ if (status < 0) {
+ mlog_errno(status);
+ goto out_err;
+ }
+ }
+
+ status = ocfs2_load_local_quota_bitmaps(lqinode,
ldinfo,
&oinfo->dqi_chunk);
if (status < 0) {
@@ -394,6 +797,12 @@ static int ocfs2_local_free_info(struct super_block *sb, int type)
}
ocfs2_release_local_quota_bitmaps(&oinfo->dqi_chunk);

+ /* dqonoff_mutex protects us against racing with recovery thread... */
+ if (oinfo->dqi_rec) {
+ ocfs2_free_quota_recovery(oinfo->dqi_rec);
+ mark_clean = 0;
+ }
+
if (!mark_clean)
goto out;

--
1.5.6

2008-12-22 21:58:04

by Mark Fasheh

[permalink] [raw]
Subject: [PATCH 25/56] ocfs2: Implement quota syncing thread

From: Jan Kara <[email protected]>

This patch implements the functions and timer setup that handle periodic
syncing of locally cached quota information to the global quota file. The
timer callback itself cannot sleep, so the actual sync work is deferred to
pdflush.
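
A hedged outline of the timer lifecycle added here (it only mirrors the code
below; the real implementations are ocfs2_global_read_info(),
qsync_timer_fn()/ocfs2_do_qsync() and ocfs2_local_free_info()):

        /* 1) Quota setup: precompute the period in jiffies and arm the timer. */
        oinfo->dqi_syncjiff = msecs_to_jiffies(oinfo->dqi_syncms);
        setup_timer(&oinfo->dqi_sync_timer, qsync_timer_fn, (unsigned long)oinfo);
        mod_timer(&oinfo->dqi_sync_timer,
                  round_jiffies(jiffies + oinfo->dqi_syncjiff));

        /* 2) Timer fires (softirq context, must not sleep): hand the work to
         *    pdflush and re-arm.  ocfs2_do_qsync() then walks the active dquots
         *    of this type with dquot_scan_active() and syncs each one to the
         *    global file under the global quota lock and a journal transaction. */
        pdflush_operation(ocfs2_do_qsync, (unsigned long)oinfo);

        /* 3) Quota teardown, once no dquots remain: stop the timer for good. */
        del_timer_sync(&oinfo->dqi_sync_timer);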

Signed-off-by: Jan Kara <[email protected]>
Signed-off-by: Mark Fasheh <[email protected]>
---
fs/ocfs2/quota.h | 3 ++
fs/ocfs2/quota_global.c | 71 +++++++++++++++++++++++++++++++++++++++++++++++
fs/ocfs2/quota_local.c | 4 ++
3 files changed, 78 insertions(+), 0 deletions(-)

diff --git a/fs/ocfs2/quota.h b/fs/ocfs2/quota.h
index 1f1c863..11cdff1 100644
--- a/fs/ocfs2/quota.h
+++ b/fs/ocfs2/quota.h
@@ -14,6 +14,7 @@
#include <linux/quota.h>
#include <linux/list.h>
#include <linux/dqblk_qtree.h>
+#include <linux/timer.h>

#include "ocfs2.h"

@@ -39,6 +40,7 @@ struct ocfs2_mem_dqinfo {
unsigned int dqi_chunks; /* Number of chunks in local quota file */
unsigned int dqi_blocks; /* Number of blocks allocated for local quota file */
unsigned int dqi_syncms; /* How often should we sync with other nodes */
+ unsigned int dqi_syncjiff; /* Precomputed dqi_syncms in jiffies */
struct list_head dqi_chunk; /* List of chunks */
struct inode *dqi_gqinode; /* Global quota file inode */
struct ocfs2_lock_res dqi_gqlock; /* Lock protecting quota information structure */
@@ -47,6 +49,7 @@ struct ocfs2_mem_dqinfo {
struct buffer_head *dqi_lqi_bh; /* Buffer head with local quota file inode */
struct buffer_head *dqi_ibh; /* Buffer with information header */
struct qtree_mem_dqinfo dqi_gi; /* Info about global file */
+ struct timer_list dqi_sync_timer; /* Timer for syncing dquots */
};

static inline struct ocfs2_dquot *OCFS2_DQUOT(struct dquot *dquot)
diff --git a/fs/ocfs2/quota_global.c b/fs/ocfs2/quota_global.c
index af8340c..4a5bc09 100644
--- a/fs/ocfs2/quota_global.c
+++ b/fs/ocfs2/quota_global.c
@@ -1,10 +1,14 @@
/*
* Implementation of operations over global quota file
*/
+#include <linux/spinlock.h>
#include <linux/fs.h>
#include <linux/quota.h>
#include <linux/quotaops.h>
#include <linux/dqblk_qtree.h>
+#include <linux/jiffies.h>
+#include <linux/timer.h>
+#include <linux/writeback.h>

#define MLOG_MASK_PREFIX ML_QUOTA
#include <cluster/masklog.h>
@@ -20,6 +24,8 @@
#include "uptodate.h"
#include "quota.h"

+static void qsync_timer_fn(unsigned long oinfo_ptr);
+
static void ocfs2_global_disk2memdqb(struct dquot *dquot, void *dp)
{
struct ocfs2_global_disk_dqblk *d = dp;
@@ -313,6 +319,7 @@ int ocfs2_global_read_info(struct super_block *sb, int type)
info->dqi_bgrace = le32_to_cpu(dinfo.dqi_bgrace);
info->dqi_igrace = le32_to_cpu(dinfo.dqi_igrace);
oinfo->dqi_syncms = le32_to_cpu(dinfo.dqi_syncms);
+ oinfo->dqi_syncjiff = msecs_to_jiffies(oinfo->dqi_syncms);
oinfo->dqi_gi.dqi_blocks = le32_to_cpu(dinfo.dqi_blocks);
oinfo->dqi_gi.dqi_free_blk = le32_to_cpu(dinfo.dqi_free_blk);
oinfo->dqi_gi.dqi_free_entry = le32_to_cpu(dinfo.dqi_free_entry);
@@ -320,6 +327,10 @@ int ocfs2_global_read_info(struct super_block *sb, int type)
oinfo->dqi_gi.dqi_usable_bs = sb->s_blocksize -
OCFS2_QBLK_RESERVED_SPACE;
oinfo->dqi_gi.dqi_qtree_depth = qtree_depth(&oinfo->dqi_gi);
+ setup_timer(&oinfo->dqi_sync_timer, qsync_timer_fn,
+ (unsigned long)oinfo);
+ mod_timer(&oinfo->dqi_sync_timer,
+ round_jiffies(jiffies + oinfo->dqi_syncjiff));
out_err:
mlog_exit(status);
return status;
@@ -520,6 +531,66 @@ out:
}

/*
+ * Functions for periodic syncing of dquots with global file
+ */
+static int ocfs2_sync_dquot_helper(struct dquot *dquot, unsigned long type)
+{
+ handle_t *handle;
+ struct super_block *sb = dquot->dq_sb;
+ struct ocfs2_mem_dqinfo *oinfo = sb_dqinfo(sb, type)->dqi_priv;
+ struct ocfs2_super *osb = OCFS2_SB(sb);
+ int status = 0;
+
+ mlog_entry("id=%u qtype=%u type=%lu device=%s\n", dquot->dq_id,
+ dquot->dq_type, type, sb->s_id);
+ if (type != dquot->dq_type)
+ goto out;
+ status = ocfs2_lock_global_qf(oinfo, 1);
+ if (status < 0)
+ goto out;
+
+ handle = ocfs2_start_trans(osb, OCFS2_QSYNC_CREDITS);
+ if (IS_ERR(handle)) {
+ status = PTR_ERR(handle);
+ mlog_errno(status);
+ goto out_ilock;
+ }
+ mutex_lock(&sb_dqopt(sb)->dqio_mutex);
+ status = ocfs2_sync_dquot(dquot);
+ mutex_unlock(&sb_dqopt(sb)->dqio_mutex);
+ if (status < 0)
+ mlog_errno(status);
+ /* We have to write local structure as well... */
+ dquot_mark_dquot_dirty(dquot);
+ status = dquot_commit(dquot);
+ if (status < 0)
+ mlog_errno(status);
+ ocfs2_commit_trans(osb, handle);
+out_ilock:
+ ocfs2_unlock_global_qf(oinfo, 1);
+out:
+ mlog_exit(status);
+ return status;
+}
+
+static void ocfs2_do_qsync(unsigned long oinfo_ptr)
+{
+ struct ocfs2_mem_dqinfo *oinfo = (struct ocfs2_mem_dqinfo *)oinfo_ptr;
+ struct super_block *sb = oinfo->dqi_gqinode->i_sb;
+
+ dquot_scan_active(sb, ocfs2_sync_dquot_helper, oinfo->dqi_type);
+}
+
+static void qsync_timer_fn(unsigned long oinfo_ptr)
+{
+ struct ocfs2_mem_dqinfo *oinfo = (struct ocfs2_mem_dqinfo *)oinfo_ptr;
+
+ pdflush_operation(ocfs2_do_qsync, oinfo_ptr);
+ mod_timer(&oinfo->dqi_sync_timer,
+ round_jiffies(jiffies + oinfo->dqi_syncjiff));
+}
+
+/*
* Wrappers for generic quota functions
*/

diff --git a/fs/ocfs2/quota_local.c b/fs/ocfs2/quota_local.c
index 55c3f2f..1db7a16 100644
--- a/fs/ocfs2/quota_local.c
+++ b/fs/ocfs2/quota_local.c
@@ -368,6 +368,10 @@ static int ocfs2_local_free_info(struct super_block *sb, int type)
int mark_clean = 1, len;
int status;

+ /* At this point we know there are no more dquots and thus
+ * even if there's some sync in the pdflush queue, it won't
+ * find any dquots and return without doing anything */
+ del_timer_sync(&oinfo->dqi_sync_timer);
iput(oinfo->dqi_gqinode);
ocfs2_simple_drop_lockres(OCFS2_SB(sb), &oinfo->dqi_gqlock);
ocfs2_lock_res_free(&oinfo->dqi_gqlock);
--
1.5.6

2008-12-22 21:57:47

by Mark Fasheh

[permalink] [raw]
Subject: [PATCH 24/56] ocfs2: Add quota calls for allocation and freeing of inodes and space

From: Jan Kara <[email protected]>

Add quota calls for allocation and freeing of inodes and space, and update
the estimates of the number of credits needed for a transaction. Move inode
allocation out of ocfs2_mknod_locked() because vfs_dq_init() must be called
outside of a transaction.
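
As a hedged illustration of the recurring pattern this patch adds around
cluster allocation (the function name is made up; the real call sites are in
aops.c, dir.c, file.c, namei.c and alloc.c below):

        static int example_alloc_clusters_with_quota(struct inode *inode, u32 clusters)
        {
                struct super_block *sb = inode->i_sb;
                int status = 0, did_quota = 0;

                /* Reserve the bytes against the owner's quota before touching
                 * the bitmaps; -EDQUOT is returned without allocating anything. */
                if (vfs_dq_alloc_space_nodirty(inode,
                                        ocfs2_clusters_to_bytes(sb, clusters))) {
                        status = -EDQUOT;
                        goto out;
                }
                did_quota = 1;

                /* ... the real cluster allocation happens here, inside the
                 * journal transaction (ocfs2_claim_clusters() and friends) ... */

        out:
                /* On failure, return the reservation so quota usage stays exact. */
                if (status < 0 && did_quota)
                        vfs_dq_free_space_nodirty(inode,
                                        ocfs2_clusters_to_bytes(sb, clusters));
                return status;
        }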

Signed-off-by: Jan Kara <[email protected]>
Signed-off-by: Mark Fasheh <[email protected]>
---
fs/ocfs2/alloc.c | 20 +++++++++++-
fs/ocfs2/aops.c | 16 ++++++++--
fs/ocfs2/dir.c | 24 +++++++++++++-
fs/ocfs2/file.c | 72 +++++++++++++++++++++++++++++++++++++++++---
fs/ocfs2/inode.c | 10 +++++-
fs/ocfs2/journal.h | 84 ++++++++++++++++++++++++++++++++++++++++++---------
fs/ocfs2/namei.c | 44 ++++++++++++++++++++++++--
fs/ocfs2/xattr.c | 14 +++++----
8 files changed, 245 insertions(+), 39 deletions(-)

diff --git a/fs/ocfs2/alloc.c b/fs/ocfs2/alloc.c
index 69d67ab..84a7bd4 100644
--- a/fs/ocfs2/alloc.c
+++ b/fs/ocfs2/alloc.c
@@ -28,6 +28,7 @@
#include <linux/slab.h>
#include <linux/highmem.h>
#include <linux/swap.h>
+#include <linux/quotaops.h>

#define MLOG_MASK_PREFIX ML_DISK_ALLOC
#include <cluster/masklog.h>
@@ -5322,7 +5323,7 @@ int ocfs2_remove_btree_range(struct inode *inode,
}
}

- handle = ocfs2_start_trans(osb, OCFS2_REMOVE_EXTENT_CREDITS);
+ handle = ocfs2_start_trans(osb, ocfs2_remove_extent_credits(osb->sb));
if (IS_ERR(handle)) {
ret = PTR_ERR(handle);
mlog_errno(ret);
@@ -6552,6 +6553,8 @@ static int ocfs2_do_truncate(struct ocfs2_super *osb,
goto bail;
}

+ vfs_dq_free_space_nodirty(inode,
+ ocfs2_clusters_to_bytes(osb->sb, clusters_to_del));
spin_lock(&OCFS2_I(inode)->ip_lock);
OCFS2_I(inode)->ip_clusters = le32_to_cpu(fe->i_clusters) -
clusters_to_del;
@@ -6860,6 +6863,7 @@ int ocfs2_convert_inline_data_to_extents(struct inode *inode,
struct page **pages = NULL;
loff_t end = osb->s_clustersize;
struct ocfs2_extent_tree et;
+ int did_quota = 0;

has_data = i_size_read(inode) ? 1 : 0;

@@ -6879,7 +6883,8 @@ int ocfs2_convert_inline_data_to_extents(struct inode *inode,
}
}

- handle = ocfs2_start_trans(osb, OCFS2_INLINE_TO_EXTENTS_CREDITS);
+ handle = ocfs2_start_trans(osb,
+ ocfs2_inline_to_extents_credits(osb->sb));
if (IS_ERR(handle)) {
ret = PTR_ERR(handle);
mlog_errno(ret);
@@ -6898,6 +6903,13 @@ int ocfs2_convert_inline_data_to_extents(struct inode *inode,
unsigned int page_end;
u64 phys;

+ if (vfs_dq_alloc_space_nodirty(inode,
+ ocfs2_clusters_to_bytes(osb->sb, 1))) {
+ ret = -EDQUOT;
+ goto out_commit;
+ }
+ did_quota = 1;
+
ret = ocfs2_claim_clusters(osb, handle, data_ac, 1, &bit_off,
&num);
if (ret) {
@@ -6971,6 +6983,10 @@ int ocfs2_convert_inline_data_to_extents(struct inode *inode,
}

out_commit:
+ if (ret < 0 && did_quota)
+ vfs_dq_free_space_nodirty(inode,
+ ocfs2_clusters_to_bytes(osb->sb, 1));
+
ocfs2_commit_trans(osb, handle);

out_unlock:
diff --git a/fs/ocfs2/aops.c b/fs/ocfs2/aops.c
index 6af79ad..6b647ec 100644
--- a/fs/ocfs2/aops.c
+++ b/fs/ocfs2/aops.c
@@ -27,6 +27,7 @@
#include <linux/swap.h>
#include <linux/pipe_fs_i.h>
#include <linux/mpage.h>
+#include <linux/quotaops.h>

#define MLOG_MASK_PREFIX ML_FILE_IO
#include <cluster/masklog.h>
@@ -1730,6 +1731,11 @@ int ocfs2_write_begin_nolock(struct address_space *mapping,

wc->w_handle = handle;

+ if (clusters_to_alloc && vfs_dq_alloc_space_nodirty(inode,
+ ocfs2_clusters_to_bytes(osb->sb, clusters_to_alloc))) {
+ ret = -EDQUOT;
+ goto out_commit;
+ }
/*
* We don't want this to fail in ocfs2_write_end(), so do it
* here.
@@ -1738,7 +1744,7 @@ int ocfs2_write_begin_nolock(struct address_space *mapping,
OCFS2_JOURNAL_ACCESS_WRITE);
if (ret) {
mlog_errno(ret);
- goto out_commit;
+ goto out_quota;
}

/*
@@ -1751,14 +1757,14 @@ int ocfs2_write_begin_nolock(struct address_space *mapping,
mmap_page);
if (ret) {
mlog_errno(ret);
- goto out_commit;
+ goto out_quota;
}

ret = ocfs2_write_cluster_by_desc(mapping, data_ac, meta_ac, wc, pos,
len);
if (ret) {
mlog_errno(ret);
- goto out_commit;
+ goto out_quota;
}

if (data_ac)
@@ -1770,6 +1776,10 @@ success:
*pagep = wc->w_target_page;
*fsdata = wc;
return 0;
+out_quota:
+ if (clusters_to_alloc)
+ vfs_dq_free_space(inode,
+ ocfs2_clusters_to_bytes(osb->sb, clusters_to_alloc));
out_commit:
ocfs2_commit_trans(osb, handle);

diff --git a/fs/ocfs2/dir.c b/fs/ocfs2/dir.c
index d83cff9..3708fe4 100644
--- a/fs/ocfs2/dir.c
+++ b/fs/ocfs2/dir.c
@@ -40,6 +40,7 @@
#include <linux/types.h>
#include <linux/slab.h>
#include <linux/highmem.h>
+#include <linux/quotaops.h>

#define MLOG_MASK_PREFIX ML_NAMEI
#include <cluster/masklog.h>
@@ -1210,9 +1211,9 @@ static int ocfs2_expand_inline_dir(struct inode *dir, struct buffer_head *di_bh,
unsigned int blocks_wanted,
struct buffer_head **first_block_bh)
{
- int ret, credits = OCFS2_INLINE_TO_EXTENTS_CREDITS;
u32 alloc, bit_off, len;
struct super_block *sb = dir->i_sb;
+ int ret, credits = ocfs2_inline_to_extents_credits(sb);
u64 blkno, bytes = blocks_wanted << sb->s_blocksize_bits;
struct ocfs2_super *osb = OCFS2_SB(dir->i_sb);
struct ocfs2_inode_info *oi = OCFS2_I(dir);
@@ -1221,6 +1222,7 @@ static int ocfs2_expand_inline_dir(struct inode *dir, struct buffer_head *di_bh,
struct ocfs2_dinode *di = (struct ocfs2_dinode *)di_bh->b_data;
handle_t *handle;
struct ocfs2_extent_tree et;
+ int did_quota = 0;

ocfs2_init_dinode_extent_tree(&et, dir, di_bh);

@@ -1258,6 +1260,12 @@ static int ocfs2_expand_inline_dir(struct inode *dir, struct buffer_head *di_bh,
goto out_sem;
}

+ if (vfs_dq_alloc_space_nodirty(dir,
+ ocfs2_clusters_to_bytes(osb->sb, alloc))) {
+ ret = -EDQUOT;
+ goto out_commit;
+ }
+ did_quota = 1;
/*
* Try to claim as many clusters as the bitmap can give though
* if we only get one now, that's enough to continue. The rest
@@ -1380,6 +1388,9 @@ static int ocfs2_expand_inline_dir(struct inode *dir, struct buffer_head *di_bh,
dirdata_bh = NULL;

out_commit:
+ if (ret < 0 && did_quota)
+ vfs_dq_free_space_nodirty(dir,
+ ocfs2_clusters_to_bytes(osb->sb, 2));
ocfs2_commit_trans(osb, handle);

out_sem:
@@ -1404,7 +1415,7 @@ static int ocfs2_do_extend_dir(struct super_block *sb,
struct buffer_head **new_bh)
{
int status;
- int extend;
+ int extend, did_quota = 0;
u64 p_blkno, v_blkno;

spin_lock(&OCFS2_I(dir)->ip_lock);
@@ -1414,6 +1425,13 @@ static int ocfs2_do_extend_dir(struct super_block *sb,
if (extend) {
u32 offset = OCFS2_I(dir)->ip_clusters;

+ if (vfs_dq_alloc_space_nodirty(dir,
+ ocfs2_clusters_to_bytes(sb, 1))) {
+ status = -EDQUOT;
+ goto bail;
+ }
+ did_quota = 1;
+
status = ocfs2_add_inode_data(OCFS2_SB(sb), dir, &offset,
1, 0, parent_fe_bh, handle,
data_ac, meta_ac, NULL);
@@ -1439,6 +1457,8 @@ static int ocfs2_do_extend_dir(struct super_block *sb,
}
status = 0;
bail:
+ if (did_quota && status < 0)
+ vfs_dq_free_space_nodirty(dir, ocfs2_clusters_to_bytes(sb, 1));
mlog_exit(status);
return status;
}
diff --git a/fs/ocfs2/file.c b/fs/ocfs2/file.c
index 372d965..9374d37 100644
--- a/fs/ocfs2/file.c
+++ b/fs/ocfs2/file.c
@@ -35,6 +35,7 @@
#include <linux/mount.h>
#include <linux/writeback.h>
#include <linux/falloc.h>
+#include <linux/quotaops.h>

#define MLOG_MASK_PREFIX ML_INODE
#include <cluster/masklog.h>
@@ -57,6 +58,7 @@
#include "super.h"
#include "xattr.h"
#include "acl.h"
+#include "quota.h"

#include "buffer_head_io.h"

@@ -534,6 +536,7 @@ static int __ocfs2_extend_allocation(struct inode *inode, u32 logical_start,
enum ocfs2_alloc_restarted why;
struct ocfs2_super *osb = OCFS2_SB(inode->i_sb);
struct ocfs2_extent_tree et;
+ int did_quota = 0;

mlog_entry("(clusters_to_add = %u)\n", clusters_to_add);

@@ -577,6 +580,13 @@ restart_all:
}

restarted_transaction:
+ if (vfs_dq_alloc_space_nodirty(inode, ocfs2_clusters_to_bytes(osb->sb,
+ clusters_to_add))) {
+ status = -EDQUOT;
+ goto leave;
+ }
+ did_quota = 1;
+
/* reserve a write to the file entry early on - that way if we
* run out of credits in the allocation path, we can still
* update i_size. */
@@ -614,6 +624,10 @@ restarted_transaction:
spin_lock(&OCFS2_I(inode)->ip_lock);
clusters_to_add -= (OCFS2_I(inode)->ip_clusters - prev_clusters);
spin_unlock(&OCFS2_I(inode)->ip_lock);
+ /* Release unused quota reservation */
+ vfs_dq_free_space(inode,
+ ocfs2_clusters_to_bytes(osb->sb, clusters_to_add));
+ did_quota = 0;

if (why != RESTART_NONE && clusters_to_add) {
if (why == RESTART_META) {
@@ -646,6 +660,9 @@ restarted_transaction:
OCFS2_I(inode)->ip_clusters, (long long)i_size_read(inode));

leave:
+ if (status < 0 && did_quota)
+ vfs_dq_free_space(inode,
+ ocfs2_clusters_to_bytes(osb->sb, clusters_to_add));
if (handle) {
ocfs2_commit_trans(osb, handle);
handle = NULL;
@@ -877,6 +894,9 @@ int ocfs2_setattr(struct dentry *dentry, struct iattr *attr)
struct ocfs2_super *osb = OCFS2_SB(sb);
struct buffer_head *bh = NULL;
handle_t *handle = NULL;
+ int locked[MAXQUOTAS] = {0, 0};
+ int credits, qtype;
+ struct ocfs2_mem_dqinfo *oinfo;

mlog_entry("(0x%p, '%.*s')\n", dentry,
dentry->d_name.len, dentry->d_name.name);
@@ -947,11 +967,47 @@ int ocfs2_setattr(struct dentry *dentry, struct iattr *attr)
}
}

- handle = ocfs2_start_trans(osb, OCFS2_INODE_UPDATE_CREDITS);
- if (IS_ERR(handle)) {
- status = PTR_ERR(handle);
- mlog_errno(status);
- goto bail_unlock;
+ if ((attr->ia_valid & ATTR_UID && attr->ia_uid != inode->i_uid) ||
+ (attr->ia_valid & ATTR_GID && attr->ia_gid != inode->i_gid)) {
+ credits = OCFS2_INODE_UPDATE_CREDITS;
+ if (attr->ia_valid & ATTR_UID && attr->ia_uid != inode->i_uid
+ && OCFS2_HAS_RO_COMPAT_FEATURE(sb,
+ OCFS2_FEATURE_RO_COMPAT_USRQUOTA)) {
+ oinfo = sb_dqinfo(sb, USRQUOTA)->dqi_priv;
+ status = ocfs2_lock_global_qf(oinfo, 1);
+ if (status < 0)
+ goto bail_unlock;
+ credits += ocfs2_calc_qinit_credits(sb, USRQUOTA) +
+ ocfs2_calc_qdel_credits(sb, USRQUOTA);
+ locked[USRQUOTA] = 1;
+ }
+ if (attr->ia_valid & ATTR_GID && attr->ia_gid != inode->i_gid
+ && OCFS2_HAS_RO_COMPAT_FEATURE(sb,
+ OCFS2_FEATURE_RO_COMPAT_GRPQUOTA)) {
+ oinfo = sb_dqinfo(sb, GRPQUOTA)->dqi_priv;
+ status = ocfs2_lock_global_qf(oinfo, 1);
+ if (status < 0)
+ goto bail_unlock;
+ credits += ocfs2_calc_qinit_credits(sb, GRPQUOTA) +
+ ocfs2_calc_qdel_credits(sb, GRPQUOTA);
+ locked[GRPQUOTA] = 1;
+ }
+ handle = ocfs2_start_trans(osb, credits);
+ if (IS_ERR(handle)) {
+ status = PTR_ERR(handle);
+ mlog_errno(status);
+ goto bail_unlock;
+ }
+ status = vfs_dq_transfer(inode, attr) ? -EDQUOT : 0;
+ if (status < 0)
+ goto bail_commit;
+ } else {
+ handle = ocfs2_start_trans(osb, OCFS2_INODE_UPDATE_CREDITS);
+ if (IS_ERR(handle)) {
+ status = PTR_ERR(handle);
+ mlog_errno(status);
+ goto bail_unlock;
+ }
}

/*
@@ -974,6 +1030,12 @@ int ocfs2_setattr(struct dentry *dentry, struct iattr *attr)
bail_commit:
ocfs2_commit_trans(osb, handle);
bail_unlock:
+ for (qtype = 0; qtype < MAXQUOTAS; qtype++) {
+ if (!locked[qtype])
+ continue;
+ oinfo = sb_dqinfo(sb, qtype)->dqi_priv;
+ ocfs2_unlock_global_qf(oinfo, 1);
+ }
ocfs2_inode_unlock(inode, 1);
bail_unlock_rw:
if (size_change)
diff --git a/fs/ocfs2/inode.c b/fs/ocfs2/inode.c
index 50dbc48..288512c 100644
--- a/fs/ocfs2/inode.c
+++ b/fs/ocfs2/inode.c
@@ -28,6 +28,7 @@
#include <linux/slab.h>
#include <linux/highmem.h>
#include <linux/pagemap.h>
+#include <linux/quotaops.h>

#include <asm/byteorder.h>

@@ -603,7 +604,8 @@ static int ocfs2_remove_inode(struct inode *inode,
goto bail;
}

- handle = ocfs2_start_trans(osb, OCFS2_DELETE_INODE_CREDITS);
+ handle = ocfs2_start_trans(osb, OCFS2_DELETE_INODE_CREDITS +
+ ocfs2_quota_trans_credits(inode->i_sb));
if (IS_ERR(handle)) {
status = PTR_ERR(handle);
mlog_errno(status);
@@ -635,6 +637,7 @@ static int ocfs2_remove_inode(struct inode *inode,
}

ocfs2_remove_from_cache(inode, di_bh);
+ vfs_dq_free_inode(inode);

status = ocfs2_free_dinode(handle, inode_alloc_inode,
inode_alloc_bh, di);
@@ -917,7 +920,10 @@ void ocfs2_delete_inode(struct inode *inode)

mlog_entry("(inode->i_ino = %lu)\n", inode->i_ino);

- if (is_bad_inode(inode)) {
+ /* When we fail in read_inode() we mark inode as bad. The second test
+ * catches the case when inode allocation fails before allocating
+ * a block for inode. */
+ if (is_bad_inode(inode) || !OCFS2_I(inode)->ip_blkno) {
mlog(0, "Skipping delete of bad inode\n");
goto bail;
}
diff --git a/fs/ocfs2/journal.h b/fs/ocfs2/journal.h
index 8203980..ee08e9c 100644
--- a/fs/ocfs2/journal.h
+++ b/fs/ocfs2/journal.h
@@ -284,6 +284,37 @@ int ocfs2_journal_dirty(handle_t *handle,
/* extended attribute block update */
#define OCFS2_XATTR_BLOCK_UPDATE_CREDITS 1

+/* global quotafile inode update, data block */
+#define OCFS2_QINFO_WRITE_CREDITS (OCFS2_INODE_UPDATE_CREDITS + 1)
+
+/*
+ * The two writes below can accidentally see global info dirty due
+ * to set_info() quotactl so make them prepared for the writes.
+ */
+/* quota data block, global info */
+/* Write to local quota file */
+#define OCFS2_QWRITE_CREDITS (OCFS2_QINFO_WRITE_CREDITS + 1)
+
+/* global quota data block, local quota data block, global quota inode,
+ * global quota info */
+#define OCFS2_QSYNC_CREDITS (OCFS2_INODE_UPDATE_CREDITS + 3)
+
+static inline int ocfs2_quota_trans_credits(struct super_block *sb)
+{
+ int credits = 0;
+
+ if (OCFS2_HAS_RO_COMPAT_FEATURE(sb, OCFS2_FEATURE_RO_COMPAT_USRQUOTA))
+ credits += OCFS2_QWRITE_CREDITS;
+ if (OCFS2_HAS_RO_COMPAT_FEATURE(sb, OCFS2_FEATURE_RO_COMPAT_GRPQUOTA))
+ credits += OCFS2_QWRITE_CREDITS;
+ return credits;
+}
+
+/* Number of credits needed for removing quota structure from file */
+int ocfs2_calc_qdel_credits(struct super_block *sb, int type);
+/* Number of credits needed for initialization of new quota structure */
+int ocfs2_calc_qinit_credits(struct super_block *sb, int type);
+
/* group extend. inode update and last group update. */
#define OCFS2_GROUP_EXTEND_CREDITS (OCFS2_INODE_UPDATE_CREDITS + 1)

@@ -294,8 +325,11 @@ int ocfs2_journal_dirty(handle_t *handle,
* prev. group desc. if we relink. */
#define OCFS2_SUBALLOC_ALLOC (3)

-#define OCFS2_INLINE_TO_EXTENTS_CREDITS (OCFS2_SUBALLOC_ALLOC \
- + OCFS2_INODE_UPDATE_CREDITS)
+static inline int ocfs2_inline_to_extents_credits(struct super_block *sb)
+{
+ return OCFS2_SUBALLOC_ALLOC + OCFS2_INODE_UPDATE_CREDITS +
+ ocfs2_quota_trans_credits(sb);
+}

/* dinode + group descriptor update. We don't relink on free yet. */
#define OCFS2_SUBALLOC_FREE (2)
@@ -304,16 +338,23 @@ int ocfs2_journal_dirty(handle_t *handle,
#define OCFS2_TRUNCATE_LOG_FLUSH_ONE_REC (OCFS2_SUBALLOC_FREE \
+ OCFS2_TRUNCATE_LOG_UPDATE)

-#define OCFS2_REMOVE_EXTENT_CREDITS (OCFS2_TRUNCATE_LOG_UPDATE + OCFS2_INODE_UPDATE_CREDITS)
+static inline int ocfs2_remove_extent_credits(struct super_block *sb)
+{
+ return OCFS2_TRUNCATE_LOG_UPDATE + OCFS2_INODE_UPDATE_CREDITS +
+ ocfs2_quota_trans_credits(sb);
+}

/* data block for new dir/symlink, 2 for bitmap updates (bitmap fe +
* bitmap block for the new bit) */
#define OCFS2_DIR_LINK_ADDITIONAL_CREDITS (1 + 2)

/* parent fe, parent block, new file entry, inode alloc fe, inode alloc
- * group descriptor + mkdir/symlink blocks */
-#define OCFS2_MKNOD_CREDITS (3 + OCFS2_SUBALLOC_ALLOC \
- + OCFS2_DIR_LINK_ADDITIONAL_CREDITS)
+ * group descriptor + mkdir/symlink blocks + quota update */
+static inline int ocfs2_mknod_credits(struct super_block *sb)
+{
+ return 3 + OCFS2_SUBALLOC_ALLOC + OCFS2_DIR_LINK_ADDITIONAL_CREDITS +
+ ocfs2_quota_trans_credits(sb);
+}

/* local alloc metadata change + main bitmap updates */
#define OCFS2_WINDOW_MOVE_CREDITS (OCFS2_INODE_UPDATE_CREDITS \
@@ -323,13 +364,21 @@ int ocfs2_journal_dirty(handle_t *handle,
* for the dinode, one for the new block. */
#define OCFS2_SIMPLE_DIR_EXTEND_CREDITS (2)

-/* file update (nlink, etc) + directory mtime/ctime + dir entry block */
-#define OCFS2_LINK_CREDITS (2*OCFS2_INODE_UPDATE_CREDITS + 1)
+/* file update (nlink, etc) + directory mtime/ctime + dir entry block + quota
+ * update on dir */
+static inline int ocfs2_link_credits(struct super_block *sb)
+{
+ return 2*OCFS2_INODE_UPDATE_CREDITS + 1 +
+ ocfs2_quota_trans_credits(sb);
+}

/* inode + dir inode (if we unlink a dir), + dir entry block + orphan
* dir inode link */
-#define OCFS2_UNLINK_CREDITS (2 * OCFS2_INODE_UPDATE_CREDITS + 1 \
- + OCFS2_LINK_CREDITS)
+static inline int ocfs2_unlink_credits(struct super_block *sb)
+{
+ /* The quota update from ocfs2_link_credits is unused here... */
+ return 2 * OCFS2_INODE_UPDATE_CREDITS + 1 + ocfs2_link_credits(sb);
+}

/* dinode + orphan dir dinode + inode alloc dinode + orphan dir entry +
* inode alloc group descriptor */
@@ -338,8 +387,10 @@ int ocfs2_journal_dirty(handle_t *handle,
/* dinode update, old dir dinode update, new dir dinode update, old
* dir dir entry, new dir dir entry, dir entry update for renaming
* directory + target unlink */
-#define OCFS2_RENAME_CREDITS (3 * OCFS2_INODE_UPDATE_CREDITS + 3 \
- + OCFS2_UNLINK_CREDITS)
+static inline int ocfs2_rename_credits(struct super_block *sb)
+{
+ return 3 * OCFS2_INODE_UPDATE_CREDITS + 3 + ocfs2_unlink_credits(sb);
+}

/* global bitmap dinode, group desc., relinked group,
* suballocator dinode, group desc., relinked group,
@@ -377,18 +428,19 @@ static inline int ocfs2_calc_extend_credits(struct super_block *sb,
* credit for the dinode there. */
extent_blocks = 1 + 1 + le16_to_cpu(root_el->l_tree_depth);

- return bitmap_blocks + sysfile_bitmap_blocks + extent_blocks;
+ return bitmap_blocks + sysfile_bitmap_blocks + extent_blocks +
+ ocfs2_quota_trans_credits(sb);
}

static inline int ocfs2_calc_symlink_credits(struct super_block *sb)
{
- int blocks = OCFS2_MKNOD_CREDITS;
+ int blocks = ocfs2_mknod_credits(sb);

/* links can be longer than one block so we may update many
* within our single allocated extent. */
blocks += ocfs2_clusters_to_blocks(sb, 1);

- return blocks;
+ return blocks + ocfs2_quota_trans_credits(sb);
}

static inline int ocfs2_calc_group_alloc_credits(struct super_block *sb,
@@ -425,6 +477,8 @@ static inline int ocfs2_calc_tree_trunc_credits(struct super_block *sb,
/* update to the truncate log. */
credits += OCFS2_TRUNCATE_LOG_UPDATE;

+ credits += ocfs2_quota_trans_credits(sb);
+
return credits;
}

diff --git a/fs/ocfs2/namei.c b/fs/ocfs2/namei.c
index 98fd325..02c8026 100644
--- a/fs/ocfs2/namei.c
+++ b/fs/ocfs2/namei.c
@@ -40,6 +40,7 @@
#include <linux/types.h>
#include <linux/slab.h>
#include <linux/highmem.h>
+#include <linux/quotaops.h>

#define MLOG_MASK_PREFIX ML_NAMEI
#include <cluster/masklog.h>
@@ -212,6 +213,7 @@ static struct inode *ocfs2_get_init_inode(struct inode *dir, int mode)
} else
inode->i_gid = current->fsgid;
inode->i_mode = mode;
+ vfs_dq_init(inode);
return inode;
}

@@ -236,6 +238,7 @@ static int ocfs2_mknod(struct inode *dir,
struct ocfs2_security_xattr_info si = {
.enable = 1,
};
+ int did_quota_inode = 0;

mlog_entry("(0x%p, 0x%p, %d, %lu, '%.*s')\n", dir, dentry, mode,
(unsigned long)dev, dentry->d_name.len,
@@ -323,7 +326,8 @@ static int ocfs2_mknod(struct inode *dir,
goto leave;
}

- handle = ocfs2_start_trans(osb, OCFS2_MKNOD_CREDITS + xattr_credits);
+ handle = ocfs2_start_trans(osb, ocfs2_mknod_credits(osb->sb) +
+ xattr_credits);
if (IS_ERR(handle)) {
status = PTR_ERR(handle);
handle = NULL;
@@ -331,6 +335,15 @@ static int ocfs2_mknod(struct inode *dir,
goto leave;
}

+ /* We don't use standard VFS wrapper because we don't want vfs_dq_init
+ * to be called. */
+ if (sb_any_quota_active(osb->sb) &&
+ osb->sb->dq_op->alloc_inode(inode, 1) == NO_QUOTA) {
+ status = -EDQUOT;
+ goto leave;
+ }
+ did_quota_inode = 1;
+
/* do the real work now. */
status = ocfs2_mknod_locked(osb, dir, inode, dentry, dev,
&new_fe_bh, parent_fe_bh, handle,
@@ -399,6 +412,8 @@ static int ocfs2_mknod(struct inode *dir,
d_instantiate(dentry, inode);
status = 0;
leave:
+ if (status < 0 && did_quota_inode)
+ vfs_dq_free_inode(inode);
if (handle)
ocfs2_commit_trans(osb, handle);

@@ -641,7 +656,7 @@ static int ocfs2_link(struct dentry *old_dentry,
goto out_unlock_inode;
}

- handle = ocfs2_start_trans(osb, OCFS2_LINK_CREDITS);
+ handle = ocfs2_start_trans(osb, ocfs2_link_credits(osb->sb));
if (IS_ERR(handle)) {
err = PTR_ERR(handle);
handle = NULL;
@@ -828,7 +843,7 @@ static int ocfs2_unlink(struct inode *dir,
}
}

- handle = ocfs2_start_trans(osb, OCFS2_UNLINK_CREDITS);
+ handle = ocfs2_start_trans(osb, ocfs2_unlink_credits(osb->sb));
if (IS_ERR(handle)) {
status = PTR_ERR(handle);
handle = NULL;
@@ -1234,7 +1249,7 @@ static int ocfs2_rename(struct inode *old_dir,
}
}

- handle = ocfs2_start_trans(osb, OCFS2_RENAME_CREDITS);
+ handle = ocfs2_start_trans(osb, ocfs2_rename_credits(osb->sb));
if (IS_ERR(handle)) {
status = PTR_ERR(handle);
handle = NULL;
@@ -1555,6 +1570,7 @@ static int ocfs2_symlink(struct inode *dir,
struct ocfs2_security_xattr_info si = {
.enable = 1,
};
+ int did_quota = 0, did_quota_inode = 0;

mlog_entry("(0x%p, 0x%p, symname='%s' actual='%.*s')\n", dir,
dentry, symname, dentry->d_name.len, dentry->d_name.name);
@@ -1648,6 +1664,15 @@ static int ocfs2_symlink(struct inode *dir,
goto bail;
}

+ /* We don't use standard VFS wrapper because we don't want vfs_dq_init
+ * to be called. */
+ if (sb_any_quota_active(osb->sb) &&
+ osb->sb->dq_op->alloc_inode(inode, 1) == NO_QUOTA) {
+ status = -EDQUOT;
+ goto bail;
+ }
+ did_quota_inode = 1;
+
status = ocfs2_mknod_locked(osb, dir, inode, dentry,
0, &new_fe_bh, parent_fe_bh, handle,
inode_ac);
@@ -1663,6 +1688,12 @@ static int ocfs2_symlink(struct inode *dir,
u32 offset = 0;

inode->i_op = &ocfs2_symlink_inode_operations;
+ if (vfs_dq_alloc_space_nodirty(inode,
+ ocfs2_clusters_to_bytes(osb->sb, 1))) {
+ status = -EDQUOT;
+ goto bail;
+ }
+ did_quota = 1;
status = ocfs2_add_inode_data(osb, inode, &offset, 1, 0,
new_fe_bh,
handle, data_ac, NULL,
@@ -1728,6 +1759,11 @@ static int ocfs2_symlink(struct inode *dir,
dentry->d_op = &ocfs2_dentry_ops;
d_instantiate(dentry, inode);
bail:
+ if (status < 0 && did_quota)
+ vfs_dq_free_space_nodirty(inode,
+ ocfs2_clusters_to_bytes(osb->sb, 1));
+ if (status < 0 && did_quota_inode)
+ vfs_dq_free_inode(inode);
if (handle)
ocfs2_commit_trans(osb, handle);

diff --git a/fs/ocfs2/xattr.c b/fs/ocfs2/xattr.c
index 9cb71e1..3b9634c 100644
--- a/fs/ocfs2/xattr.c
+++ b/fs/ocfs2/xattr.c
@@ -1665,7 +1665,8 @@ static int ocfs2_remove_value_outside(struct inode*inode,

ocfs2_init_dealloc_ctxt(&ctxt.dealloc);

- ctxt.handle = ocfs2_start_trans(osb, OCFS2_REMOVE_EXTENT_CREDITS);
+ ctxt.handle = ocfs2_start_trans(osb,
+ ocfs2_remove_extent_credits(osb->sb));
if (IS_ERR(ctxt.handle)) {
ret = PTR_ERR(ctxt.handle);
mlog_errno(ret);
@@ -2233,7 +2234,7 @@ static int ocfs2_calc_xattr_set_need(struct inode *inode,
*/
if (!xi->value) {
if (!ocfs2_xattr_is_local(xe))
- credits += OCFS2_REMOVE_EXTENT_CREDITS;
+ credits += ocfs2_remove_extent_credits(inode->i_sb);

goto out;
}
@@ -2250,7 +2251,7 @@ static int ocfs2_calc_xattr_set_need(struct inode *inode,
*/
if (ocfs2_xattr_can_be_in_inode(inode, xi, xis)) {
clusters_add += new_clusters;
- credits += OCFS2_REMOVE_EXTENT_CREDITS +
+ credits += ocfs2_remove_extent_credits(inode->i_sb) +
OCFS2_INODE_UPDATE_CREDITS;
if (!ocfs2_xattr_is_local(xe))
credits += ocfs2_calc_extend_credits(
@@ -2275,7 +2276,7 @@ static int ocfs2_calc_xattr_set_need(struct inode *inode,
xv = &def_xv.xv;

if (old_clusters >= new_clusters) {
- credits += OCFS2_REMOVE_EXTENT_CREDITS;
+ credits += ocfs2_remove_extent_credits(inode->i_sb);
goto out;
} else {
meta_add += ocfs2_extend_meta_needed(&xv->xr_list);
@@ -4750,7 +4751,7 @@ static int ocfs2_rm_xattr_cluster(struct inode *inode,
}
}

- handle = ocfs2_start_trans(osb, OCFS2_REMOVE_EXTENT_CREDITS);
+ handle = ocfs2_start_trans(osb, ocfs2_remove_extent_credits(osb->sb));
if (IS_ERR(handle)) {
ret = -ENOMEM;
mlog_errno(ret);
@@ -5109,7 +5110,8 @@ static int ocfs2_delete_xattr_in_bucket(struct inode *inode,

ocfs2_init_dealloc_ctxt(&ctxt.dealloc);

- ctxt.handle = ocfs2_start_trans(osb, OCFS2_REMOVE_EXTENT_CREDITS);
+ ctxt.handle = ocfs2_start_trans(osb,
+ ocfs2_remove_extent_credits(osb->sb));
if (IS_ERR(ctxt.handle)) {
ret = PTR_ERR(ctxt.handle);
mlog_errno(ret);
--
1.5.6

2008-12-22 21:59:09

by Mark Fasheh

[permalink] [raw]
Subject: [PATCH 28/56] jbd2: Add BH_JBDPrivateStart

Add BH_JBDPrivateStart so that file systems using JBD2 can safely allocate
unused b_state bits.

In this case, we add it so that Ocfs2 can define a single bit for tracking
the validation state of a buffer.
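
For example (hedged; the examplefs names are made up, and this simply mirrors
the ocfs2 patch later in this series), a file system would define its private
flags relative to the new constant so they can never collide with bits JBD2
may add later:

        enum examplefs_state_bits {
                BH_ExampleFSNeedsValidate = BH_JBDPrivateStart, /* first private bit */
                BH_ExampleFSSomeOtherFlag,                      /* next private bit */
        };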

Acked-by: "Theodore Ts'o" <[email protected]>
Signed-off-by: Mark Fasheh <[email protected]>
---
include/linux/jbd2.h | 1 +
1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/include/linux/jbd2.h b/include/linux/jbd2.h
index c7d106e..f366457 100644
--- a/include/linux/jbd2.h
+++ b/include/linux/jbd2.h
@@ -329,6 +329,7 @@ enum jbd_state_bits {
BH_State, /* Pins most journal_head state */
BH_JournalHead, /* Pins bh->b_private and jh->b_bh */
BH_Unshadow, /* Dummy bit, for BJ_Shadow wakeup filtering */
+ BH_JBDPrivateStart, /* First bit available for private use by FS */
};

BUFFER_FNS(JBD, jbd)
--
1.5.6

2008-12-22 21:59:35

by Mark Fasheh

[permalink] [raw]
Subject: [PATCH 29/56] ocfs2: Use BH_JBDPrivateStart instead of BH_Unshadow

This is safer. We no longer have to worry about tracking changes to
jbd_state_bits.

Signed-off-by: Mark Fasheh <[email protected]>
---
fs/ocfs2/buffer_head_io.c | 5 ++---
1 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/fs/ocfs2/buffer_head_io.c b/fs/ocfs2/buffer_head_io.c
index 0e9eed0..15c8e6d 100644
--- a/fs/ocfs2/buffer_head_io.c
+++ b/fs/ocfs2/buffer_head_io.c
@@ -42,11 +42,10 @@
/*
* Bits on bh->b_state used by ocfs2.
*
- * These MUST be after the JBD2 bits. Currently BH_Unshadow is the last
- * JBD2 bit.
+ * These MUST be after the JBD2 bits. Hence, we use BH_JBDPrivateStart.
*/
enum ocfs2_state_bits {
- BH_NeedsValidate = BH_Unshadow + 1,
+ BH_NeedsValidate = BH_JBDPrivateStart,
};

/* Expand the magic b_state functions */
--
1.5.6

2008-12-22 21:58:48

by Mark Fasheh

[permalink] [raw]
Subject: [PATCH 27/56] ocfs2: Enable quota accounting on mount, disable on umount

From: Jan Kara <[email protected]>

Enable quota usage tracking on mount and disable it on umount. Also add
support for the quota-on and quota-off quotactls and for the usrquota and
grpquota mount options, and add the quota features to the list of supported
features.
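
A hedged sketch of the per-type pattern used throughout the code below: each
quota type is only touched when the matching RO_COMPAT feature bit is set on
the volume (USRQUOTA and GRPQUOTA index the MAXQUOTAS-sized arrays):

        unsigned int feature[MAXQUOTAS] = { OCFS2_FEATURE_RO_COMPAT_USRQUOTA,
                                            OCFS2_FEATURE_RO_COMPAT_GRPQUOTA };
        int type;

        for (type = 0; type < MAXQUOTAS; type++) {
                if (!OCFS2_HAS_RO_COMPAT_FEATURE(sb, feature[type]))
                        continue;       /* this quota type is not enabled on disk */
                /* ... enable, suspend or disable quota for 'type' here ... */
        }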

Signed-off-by: Jan Kara <[email protected]>
Signed-off-by: Mark Fasheh <[email protected]>
---
fs/ocfs2/journal.c | 20 ++++-
fs/ocfs2/ocfs2.h | 3 +
fs/ocfs2/ocfs2_fs.h | 4 +-
fs/ocfs2/super.c | 222 +++++++++++++++++++++++++++++++++++++++++++++++++++
4 files changed, 245 insertions(+), 4 deletions(-)

diff --git a/fs/ocfs2/journal.c b/fs/ocfs2/journal.c
index c602420..302f114 100644
--- a/fs/ocfs2/journal.c
+++ b/fs/ocfs2/journal.c
@@ -56,7 +56,7 @@ static int ocfs2_recover_node(struct ocfs2_super *osb,
int node_num, int slot_num);
static int __ocfs2_recovery_thread(void *arg);
static int ocfs2_commit_cache(struct ocfs2_super *osb);
-static int ocfs2_wait_on_mount(struct ocfs2_super *osb);
+static int __ocfs2_wait_on_mount(struct ocfs2_super *osb, int quota);
static int ocfs2_journal_toggle_dirty(struct ocfs2_super *osb,
int dirty, int replayed);
static int ocfs2_trylock_journal(struct ocfs2_super *osb,
@@ -65,6 +65,17 @@ static int ocfs2_recover_orphans(struct ocfs2_super *osb,
int slot);
static int ocfs2_commit_thread(void *arg);

+static inline int ocfs2_wait_on_mount(struct ocfs2_super *osb)
+{
+ return __ocfs2_wait_on_mount(osb, 0);
+}
+
+static inline int ocfs2_wait_on_quotas(struct ocfs2_super *osb)
+{
+ return __ocfs2_wait_on_mount(osb, 1);
+}
+
+

/*
* The recovery_list is a simple linked list of node numbers to recover.
@@ -895,6 +906,8 @@ void ocfs2_complete_recovery(struct work_struct *work)

mlog(0, "Complete recovery for slot %d\n", item->lri_slot);

+ ocfs2_wait_on_quotas(osb);
+
la_dinode = item->lri_la_dinode;
if (la_dinode) {
mlog(0, "Clean up local alloc %llu\n",
@@ -1701,13 +1714,14 @@ static int ocfs2_recover_orphans(struct ocfs2_super *osb,
return ret;
}

-static int ocfs2_wait_on_mount(struct ocfs2_super *osb)
+static int __ocfs2_wait_on_mount(struct ocfs2_super *osb, int quota)
{
/* This check is good because ocfs2 will wait on our recovery
* thread before changing it to something other than MOUNTED
* or DISABLED. */
wait_event(osb->osb_mount_event,
- atomic_read(&osb->vol_state) == VOLUME_MOUNTED ||
+ (!quota && atomic_read(&osb->vol_state) == VOLUME_MOUNTED) ||
+ atomic_read(&osb->vol_state) == VOLUME_MOUNTED_QUOTAS ||
atomic_read(&osb->vol_state) == VOLUME_DISABLED);

/* If there's an error on mount, then we may never get to the
diff --git a/fs/ocfs2/ocfs2.h b/fs/ocfs2/ocfs2.h
index 6b25b4a..5c77798 100644
--- a/fs/ocfs2/ocfs2.h
+++ b/fs/ocfs2/ocfs2.h
@@ -161,6 +161,7 @@ enum ocfs2_vol_state
{
VOLUME_INIT = 0,
VOLUME_MOUNTED,
+ VOLUME_MOUNTED_QUOTAS,
VOLUME_DISMOUNTED,
VOLUME_DISABLED
};
@@ -196,6 +197,8 @@ enum ocfs2_mount_options
OCFS2_MOUNT_NOUSERXATTR = 1 << 6, /* No user xattr */
OCFS2_MOUNT_INODE64 = 1 << 7, /* Allow inode numbers > 2^32 */
OCFS2_MOUNT_POSIX_ACL = 1 << 8, /* POSIX access control lists */
+ OCFS2_MOUNT_USRQUOTA = 1 << 9, /* We support user quotas */
+ OCFS2_MOUNT_GRPQUOTA = 1 << 10, /* We support group quotas */
};

#define OCFS2_OSB_SOFT_RO 0x0001
diff --git a/fs/ocfs2/ocfs2_fs.h b/fs/ocfs2/ocfs2_fs.h
index 0a5ac79..359732e 100644
--- a/fs/ocfs2/ocfs2_fs.h
+++ b/fs/ocfs2/ocfs2_fs.h
@@ -94,7 +94,9 @@
| OCFS2_FEATURE_INCOMPAT_EXTENDED_SLOT_MAP \
| OCFS2_FEATURE_INCOMPAT_USERSPACE_STACK \
| OCFS2_FEATURE_INCOMPAT_XATTR)
-#define OCFS2_FEATURE_RO_COMPAT_SUPP (OCFS2_FEATURE_RO_COMPAT_UNWRITTEN)
+#define OCFS2_FEATURE_RO_COMPAT_SUPP (OCFS2_FEATURE_RO_COMPAT_UNWRITTEN \
+ | OCFS2_FEATURE_RO_COMPAT_USRQUOTA \
+ | OCFS2_FEATURE_RO_COMPAT_GRPQUOTA)

/*
* Heartbeat-only devices are missing journals and other files. The
diff --git a/fs/ocfs2/super.c b/fs/ocfs2/super.c
index 7bb83e4..bc43138 100644
--- a/fs/ocfs2/super.c
+++ b/fs/ocfs2/super.c
@@ -41,6 +41,7 @@
#include <linux/debugfs.h>
#include <linux/mount.h>
#include <linux/seq_file.h>
+#include <linux/quotaops.h>

#define MLOG_MASK_PREFIX ML_SUPER
#include <cluster/masklog.h>
@@ -127,6 +128,9 @@ static int ocfs2_get_sector(struct super_block *sb,
static void ocfs2_write_super(struct super_block *sb);
static struct inode *ocfs2_alloc_inode(struct super_block *sb);
static void ocfs2_destroy_inode(struct inode *inode);
+static int ocfs2_susp_quotas(struct ocfs2_super *osb, int unsuspend);
+static int ocfs2_enable_quotas(struct ocfs2_super *osb);
+static void ocfs2_disable_quotas(struct ocfs2_super *osb);

static const struct super_operations ocfs2_sops = {
.statfs = ocfs2_statfs,
@@ -165,6 +169,8 @@ enum {
Opt_inode64,
Opt_acl,
Opt_noacl,
+ Opt_usrquota,
+ Opt_grpquota,
Opt_err,
};

@@ -189,6 +195,8 @@ static const match_table_t tokens = {
{Opt_inode64, "inode64"},
{Opt_acl, "acl"},
{Opt_noacl, "noacl"},
+ {Opt_usrquota, "usrquota"},
+ {Opt_grpquota, "grpquota"},
{Opt_err, NULL}
};

@@ -452,6 +460,12 @@ static int ocfs2_remount(struct super_block *sb, int *flags, char *data)

/* We're going to/from readonly mode. */
if ((*flags & MS_RDONLY) != (sb->s_flags & MS_RDONLY)) {
+ /* Disable quota accounting before remounting RO */
+ if (*flags & MS_RDONLY) {
+ ret = ocfs2_susp_quotas(osb, 0);
+ if (ret < 0)
+ goto out;
+ }
/* Lock here so the check of HARD_RO and the potential
* setting of SOFT_RO is atomic. */
spin_lock(&osb->osb_lock);
@@ -487,6 +501,21 @@ static int ocfs2_remount(struct super_block *sb, int *flags, char *data)
}
unlock_osb:
spin_unlock(&osb->osb_lock);
+ /* Enable quota accounting after remounting RW */
+ if (!ret && !(*flags & MS_RDONLY)) {
+ if (sb_any_quota_suspended(sb))
+ ret = ocfs2_susp_quotas(osb, 1);
+ else
+ ret = ocfs2_enable_quotas(osb);
+ if (ret < 0) {
+ /* Return back changes... */
+ spin_lock(&osb->osb_lock);
+ sb->s_flags |= MS_RDONLY;
+ osb->osb_flags |= OCFS2_OSB_SOFT_RO;
+ spin_unlock(&osb->osb_lock);
+ goto out;
+ }
+ }
}

if (!ret) {
@@ -647,6 +676,131 @@ static int ocfs2_verify_userspace_stack(struct ocfs2_super *osb,
return 0;
}

+static int ocfs2_susp_quotas(struct ocfs2_super *osb, int unsuspend)
+{
+ int type;
+ struct super_block *sb = osb->sb;
+ unsigned int feature[MAXQUOTAS] = { OCFS2_FEATURE_RO_COMPAT_USRQUOTA,
+ OCFS2_FEATURE_RO_COMPAT_GRPQUOTA};
+ int status = 0;
+
+ for (type = 0; type < MAXQUOTAS; type++) {
+ if (!OCFS2_HAS_RO_COMPAT_FEATURE(sb, feature[type]))
+ continue;
+ if (unsuspend)
+ status = vfs_quota_enable(
+ sb_dqopt(sb)->files[type],
+ type, QFMT_OCFS2,
+ DQUOT_SUSPENDED);
+ else
+ status = vfs_quota_disable(sb, type,
+ DQUOT_SUSPENDED);
+ if (status < 0)
+ break;
+ }
+ if (status < 0)
+ mlog(ML_ERROR, "Failed to suspend/unsuspend quotas on "
+ "remount (error = %d).\n", status);
+ return status;
+}
+
+static int ocfs2_enable_quotas(struct ocfs2_super *osb)
+{
+ struct inode *inode[MAXQUOTAS] = { NULL, NULL };
+ struct super_block *sb = osb->sb;
+ unsigned int feature[MAXQUOTAS] = { OCFS2_FEATURE_RO_COMPAT_USRQUOTA,
+ OCFS2_FEATURE_RO_COMPAT_GRPQUOTA};
+ unsigned int ino[MAXQUOTAS] = { LOCAL_USER_QUOTA_SYSTEM_INODE,
+ LOCAL_GROUP_QUOTA_SYSTEM_INODE };
+ int status;
+ int type;
+
+ sb_dqopt(sb)->flags |= DQUOT_QUOTA_SYS_FILE | DQUOT_NEGATIVE_USAGE;
+ for (type = 0; type < MAXQUOTAS; type++) {
+ if (!OCFS2_HAS_RO_COMPAT_FEATURE(sb, feature[type]))
+ continue;
+ inode[type] = ocfs2_get_system_file_inode(osb, ino[type],
+ osb->slot_num);
+ if (!inode[type]) {
+ status = -ENOENT;
+ goto out_quota_off;
+ }
+ status = vfs_quota_enable(inode[type], type, QFMT_OCFS2,
+ DQUOT_USAGE_ENABLED);
+ if (status < 0)
+ goto out_quota_off;
+ }
+
+ for (type = 0; type < MAXQUOTAS; type++)
+ iput(inode[type]);
+ return 0;
+out_quota_off:
+ ocfs2_disable_quotas(osb);
+ for (type = 0; type < MAXQUOTAS; type++)
+ iput(inode[type]);
+ mlog_errno(status);
+ return status;
+}
+
+static void ocfs2_disable_quotas(struct ocfs2_super *osb)
+{
+ int type;
+ struct inode *inode;
+ struct super_block *sb = osb->sb;
+
+ /* We mostly ignore errors in this function because there's not much
+ * we can do when we see them */
+ for (type = 0; type < MAXQUOTAS; type++) {
+ if (!sb_has_quota_loaded(sb, type))
+ continue;
+ inode = igrab(sb->s_dquot.files[type]);
+ /* Turn off quotas. This will remove all dquot structures from
+ * memory and so they will be automatically synced to global
+ * quota files */
+ vfs_quota_disable(sb, type, DQUOT_USAGE_ENABLED |
+ DQUOT_LIMITS_ENABLED);
+ if (!inode)
+ continue;
+ iput(inode);
+ }
+}
+
+/* Handle quota on quotactl */
+static int ocfs2_quota_on(struct super_block *sb, int type, int format_id,
+ char *path, int remount)
+{
+ unsigned int feature[MAXQUOTAS] = { OCFS2_FEATURE_RO_COMPAT_USRQUOTA,
+ OCFS2_FEATURE_RO_COMPAT_GRPQUOTA};
+
+ if (!OCFS2_HAS_RO_COMPAT_FEATURE(sb, feature[type]))
+ return -EINVAL;
+
+ if (remount)
+ return 0; /* Just ignore it; it has been handled in
+ * ocfs2_remount() */
+ return vfs_quota_enable(sb_dqopt(sb)->files[type], type,
+ format_id, DQUOT_LIMITS_ENABLED);
+}
+
+/* Handle quota off quotactl */
+static int ocfs2_quota_off(struct super_block *sb, int type, int remount)
+{
+ if (remount)
+ return 0; /* Ignore now and handle later in
+ * ocfs2_remount() */
+ return vfs_quota_disable(sb, type, DQUOT_LIMITS_ENABLED);
+}
+
+static struct quotactl_ops ocfs2_quotactl_ops = {
+ .quota_on = ocfs2_quota_on,
+ .quota_off = ocfs2_quota_off,
+ .quota_sync = vfs_quota_sync,
+ .get_info = vfs_get_dqinfo,
+ .set_info = vfs_set_dqinfo,
+ .get_dqblk = vfs_get_dqblk,
+ .set_dqblk = vfs_set_dqblk,
+};
+
static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
{
struct dentry *root;
@@ -689,6 +843,22 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
osb->osb_commit_interval = parsed_options.commit_interval;
osb->local_alloc_default_bits = ocfs2_megabytes_to_clusters(sb, parsed_options.localalloc_opt);
osb->local_alloc_bits = osb->local_alloc_default_bits;
+ if (osb->s_mount_opt & OCFS2_MOUNT_USRQUOTA &&
+ !OCFS2_HAS_RO_COMPAT_FEATURE(sb,
+ OCFS2_FEATURE_RO_COMPAT_USRQUOTA)) {
+ status = -EINVAL;
+ mlog(ML_ERROR, "User quotas were requested, but this "
+ "filesystem does not have the feature enabled.\n");
+ goto read_super_error;
+ }
+ if (osb->s_mount_opt & OCFS2_MOUNT_GRPQUOTA &&
+ !OCFS2_HAS_RO_COMPAT_FEATURE(sb,
+ OCFS2_FEATURE_RO_COMPAT_GRPQUOTA)) {
+ status = -EINVAL;
+ mlog(ML_ERROR, "Group quotas were requested, but this "
+ "filesystem does not have the feature enabled.\n");
+ goto read_super_error;
+ }

status = ocfs2_verify_userspace_stack(osb, &parsed_options);
if (status)
@@ -793,6 +963,28 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
atomic_set(&osb->vol_state, VOLUME_MOUNTED);
wake_up(&osb->osb_mount_event);

+ /* Now we can initialize quotas because we can afford to wait
+ * for cluster locks recovery now. That also means that truncation
+ * log recovery can happen but that waits for proper quota setup */
+ if (!(sb->s_flags & MS_RDONLY)) {
+ status = ocfs2_enable_quotas(osb);
+ if (status < 0) {
+ /* We have to err-out specially here because
+ * s_root is already set */
+ mlog_errno(status);
+ atomic_set(&osb->vol_state, VOLUME_DISABLED);
+ wake_up(&osb->osb_mount_event);
+ mlog_exit(status);
+ return status;
+ }
+ }
+
+ ocfs2_complete_quota_recovery(osb);
+
+ /* Now we wake up again for processes waiting for quotas */
+ atomic_set(&osb->vol_state, VOLUME_MOUNTED_QUOTAS);
+ wake_up(&osb->osb_mount_event);
+
mlog_exit(status);
return status;

@@ -980,6 +1172,28 @@ static int ocfs2_parse_options(struct super_block *sb,
case Opt_inode64:
mopt->mount_opt |= OCFS2_MOUNT_INODE64;
break;
+ case Opt_usrquota:
+ /* We check only on remount, otherwise features
+ * aren't yet initialized. */
+ if (is_remount && !OCFS2_HAS_RO_COMPAT_FEATURE(sb,
+ OCFS2_FEATURE_RO_COMPAT_USRQUOTA)) {
+ mlog(ML_ERROR, "User quota requested but "
+ "filesystem feature is not set\n");
+ status = 0;
+ goto bail;
+ }
+ mopt->mount_opt |= OCFS2_MOUNT_USRQUOTA;
+ break;
+ case Opt_grpquota:
+ if (is_remount && !OCFS2_HAS_RO_COMPAT_FEATURE(sb,
+ OCFS2_FEATURE_RO_COMPAT_GRPQUOTA)) {
+ mlog(ML_ERROR, "Group quota requested but "
+ "filesystem feature is not set\n");
+ status = 0;
+ goto bail;
+ }
+ mopt->mount_opt |= OCFS2_MOUNT_GRPQUOTA;
+ break;
#ifdef CONFIG_OCFS2_FS_POSIX_ACL
case Opt_acl:
mopt->mount_opt |= OCFS2_MOUNT_POSIX_ACL;
@@ -1056,6 +1270,10 @@ static int ocfs2_show_options(struct seq_file *s, struct vfsmount *mnt)
if (osb->osb_cluster_stack[0])
seq_printf(s, ",cluster_stack=%.*s", OCFS2_STACK_LABEL_LEN,
osb->osb_cluster_stack);
+ if (opts & OCFS2_MOUNT_USRQUOTA)
+ seq_printf(s, ",usrquota");
+ if (opts & OCFS2_MOUNT_GRPQUOTA)
+ seq_printf(s, ",grpquota");

if (opts & OCFS2_MOUNT_NOUSERXATTR)
seq_printf(s, ",nouser_xattr");
@@ -1387,6 +1605,8 @@ static void ocfs2_dismount_volume(struct super_block *sb, int mnt_err)
osb = OCFS2_SB(sb);
BUG_ON(!osb);

+ ocfs2_disable_quotas(osb);
+
ocfs2_shutdown_local_alloc(osb);

ocfs2_truncate_log_shutdown(osb);
@@ -1497,6 +1717,8 @@ static int ocfs2_initialize_super(struct super_block *sb,
sb->s_fs_info = osb;
sb->s_op = &ocfs2_sops;
sb->s_export_op = &ocfs2_export_ops;
+ sb->s_qcop = &ocfs2_quotactl_ops;
+ sb->dq_op = &ocfs2_quota_operations;
sb->s_xattr = ocfs2_xattr_handlers;
sb->s_time_gran = 1;
sb->s_flags |= MS_NOATIME;
--
1.5.6

2008-12-22 21:59:53

by Mark Fasheh

[permalink] [raw]
Subject: [PATCH 30/56] ocfs2: Add missing initialization

From: Jan Kara <[email protected]>

Add missing variable initialization to ocfs2_dquot_drop_slow().

Signed-off-by: Jan Kara <[email protected]>
Signed-off-by: Mark Fasheh <[email protected]>
---
fs/ocfs2/quota_global.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/fs/ocfs2/quota_global.c b/fs/ocfs2/quota_global.c
index d2a5bfa..7f561e4 100644
--- a/fs/ocfs2/quota_global.c
+++ b/fs/ocfs2/quota_global.c
@@ -873,7 +873,7 @@ out:

static int ocfs2_dquot_drop_slow(struct inode *inode)
{
- int status;
+ int status = 0;
int cnt;
int got_lock[MAXQUOTAS] = {0, 0};
handle_t *handle;
--
1.5.6

2008-12-22 22:00:44

by Mark Fasheh

[permalink] [raw]
Subject: [PATCH 32/56] ocfs2: Fix oops when extending quota files

From: Jan Kara <[email protected]>

We have to mark the buffer as uptodate before calling ocfs2_journal_access(),
and ocfs2_set_buffer_uptodate() does not do this for us.

Signed-off-by: Jan Kara <[email protected]>
Signed-off-by: Mark Fasheh <[email protected]>
---
fs/ocfs2/quota_global.c | 28 ++++++++++++----------------
1 files changed, 12 insertions(+), 16 deletions(-)

diff --git a/fs/ocfs2/quota_global.c b/fs/ocfs2/quota_global.c
index 1401edf..339c98a 100644
--- a/fs/ocfs2/quota_global.c
+++ b/fs/ocfs2/quota_global.c
@@ -172,7 +172,7 @@ ssize_t ocfs2_quota_write(struct super_block *sb, int type,
struct inode *gqinode = oinfo->dqi_gqinode;
int offset = off & (sb->s_blocksize - 1);
sector_t blk = off >> sb->s_blocksize_bits;
- int err = 0, new = 0;
+ int err = 0, new = 0, ja_type;
struct buffer_head *bh = NULL;
handle_t *handle = journal_current_handle();

@@ -205,32 +205,28 @@ ssize_t ocfs2_quota_write(struct super_block *sb, int type,
if ((offset || len < sb->s_blocksize - OCFS2_QBLK_RESERVED_SPACE) &&
!new) {
err = ocfs2_read_quota_block(gqinode, blk, &bh);
- if (err) {
- mlog_errno(err);
- return err;
- }
- err = ocfs2_journal_access(handle, gqinode, bh,
- OCFS2_JOURNAL_ACCESS_WRITE);
+ ja_type = OCFS2_JOURNAL_ACCESS_WRITE;
} else {
bh = ocfs2_get_quota_block(gqinode, blk, &err);
- if (!bh) {
- mlog_errno(err);
- return err;
- }
- err = ocfs2_journal_access(handle, gqinode, bh,
- OCFS2_JOURNAL_ACCESS_CREATE);
+ ja_type = OCFS2_JOURNAL_ACCESS_CREATE;
}
- if (err < 0) {
- brelse(bh);
- goto out;
+ if (err) {
+ mlog_errno(err);
+ return err;
}
lock_buffer(bh);
if (new)
memset(bh->b_data, 0, sb->s_blocksize);
memcpy(bh->b_data + offset, data, len);
flush_dcache_page(bh->b_page);
+ set_buffer_uptodate(bh);
unlock_buffer(bh);
ocfs2_set_buffer_uptodate(gqinode, bh);
+ err = ocfs2_journal_access(handle, gqinode, bh, ja_type);
+ if (err < 0) {
+ brelse(bh);
+ goto out;
+ }
err = ocfs2_journal_dirty(handle, bh);
brelse(bh);
if (err < 0)
--
1.5.6

2008-12-22 22:00:20

by Mark Fasheh

[permalink] [raw]
Subject: [PATCH 31/56] ocfs2: Fix ocfs2_read_quota_block() error handling.

From: Joel Becker <[email protected]>

ocfs2_bread() has become ocfs2_read_virt_blocks(), with a prototype to
match ocfs2_read_blocks(). The quota code, converting from
ocfs2_bread(), wraps the call to ocfs2_read_virt_blocks() in
ocfs2_read_quota_block(). Unfortunately, the prototype of
ocfs2_read_quota_block() matches the old prototype of ocfs2_bread().

The problem is that ocfs2_bread() returned the buffer head, and callers
assumed that a NULL pointer was indicative of error. It wasn't. This
is why ocfs2_bread() took an int *err argument as well.

The new prototype of ocfs2_read_virt_blocks() avoids this error handling
confusion. Let's change ocfs2_read_quota_block() to match.
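
A hedged before/after sketch of the calling-convention change described
above, in standalone C with toy types (read_block_old/read_block_new are
invented names standing in for the old ocfs2_bread()-style and new
ocfs2_read_virt_blocks()-style prototypes):

#include <stdio.h>
#include <stdlib.h>

struct buffer_head { int data; };

/* Old style: return the bh, report errors through *err. A NULL return
 * does NOT reliably mean failure, which is what bit the callers. */
static struct buffer_head *read_block_old(int blkno, int *err)
{
	*err = 0;
	return malloc(sizeof(struct buffer_head));
}

/* New style: return the error code and hand the bh back through a
 * pointer argument, so there is only one error channel. */
static int read_block_new(int blkno, struct buffer_head **bh)
{
	*bh = malloc(sizeof(struct buffer_head));
	return *bh ? 0 : -12;	/* -ENOMEM stand-in */
}

int main(void)
{
	struct buffer_head *bh = NULL;
	int err;

	/* Old-style caller: checking "!bh" conflates two failure modes. */
	bh = read_block_old(0, &err);
	if (!bh || err)
		fprintf(stderr, "old-style read failed: %d\n", err);
	free(bh);

	/* New-style caller: the return value is the only thing to check. */
	bh = NULL;
	err = read_block_new(0, &bh);
	if (err)
		fprintf(stderr, "new-style read failed: %d\n", err);
	free(bh);
	return 0;
}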

Signed-off-by: Joel Becker <[email protected]>
Acked-by: Jan Kara <[email protected]>
Signed-off-by: Mark Fasheh <[email protected]>
---
fs/ocfs2/dlmglue.c | 6 ++--
fs/ocfs2/quota.h | 4 +-
fs/ocfs2/quota_global.c | 34 ++++++++++++++----------
fs/ocfs2/quota_local.c | 64 +++++++++++++++++++++++++---------------------
4 files changed, 60 insertions(+), 48 deletions(-)

diff --git a/fs/ocfs2/dlmglue.c b/fs/ocfs2/dlmglue.c
index 058aa86..b1c7591 100644
--- a/fs/ocfs2/dlmglue.c
+++ b/fs/ocfs2/dlmglue.c
@@ -3519,7 +3519,7 @@ static int ocfs2_refresh_qinfo(struct ocfs2_mem_dqinfo *oinfo)
oinfo->dqi_gi.dqi_type);
struct ocfs2_lock_res *lockres = &oinfo->dqi_gqlock;
struct ocfs2_qinfo_lvb *lvb = ocfs2_dlm_lvb(&lockres->l_lksb);
- struct buffer_head *bh;
+ struct buffer_head *bh = NULL;
struct ocfs2_global_disk_dqinfo *gdinfo;
int status = 0;

@@ -3532,8 +3532,8 @@ static int ocfs2_refresh_qinfo(struct ocfs2_mem_dqinfo *oinfo)
oinfo->dqi_gi.dqi_free_entry =
be32_to_cpu(lvb->lvb_free_entry);
} else {
- bh = ocfs2_read_quota_block(oinfo->dqi_gqinode, 0, &status);
- if (!bh) {
+ status = ocfs2_read_quota_block(oinfo->dqi_gqinode, 0, &bh);
+ if (status) {
mlog_errno(status);
goto bail;
}
diff --git a/fs/ocfs2/quota.h b/fs/ocfs2/quota.h
index 84c50a1..abf6941 100644
--- a/fs/ocfs2/quota.h
+++ b/fs/ocfs2/quota.h
@@ -108,8 +108,8 @@ static inline int ocfs2_global_release_dquot(struct dquot *dquot)

int ocfs2_lock_global_qf(struct ocfs2_mem_dqinfo *oinfo, int ex);
void ocfs2_unlock_global_qf(struct ocfs2_mem_dqinfo *oinfo, int ex);
-struct buffer_head *ocfs2_read_quota_block(struct inode *inode,
- int block, int *err);
+int ocfs2_read_quota_block(struct inode *inode, u64 v_block,
+ struct buffer_head **bh);

extern struct dquot_operations ocfs2_quota_operations;
extern struct quota_format_type ocfs2_quota_format;
diff --git a/fs/ocfs2/quota_global.c b/fs/ocfs2/quota_global.c
index 7f561e4..1401edf 100644
--- a/fs/ocfs2/quota_global.c
+++ b/fs/ocfs2/quota_global.c
@@ -85,16 +85,21 @@ struct qtree_fmt_operations ocfs2_global_ops = {
.is_id = ocfs2_global_is_id,
};

-struct buffer_head *ocfs2_read_quota_block(struct inode *inode,
- int block, int *err)
+int ocfs2_read_quota_block(struct inode *inode, u64 v_block,
+ struct buffer_head **bh)
{
- struct buffer_head *tmp = NULL;
+ int rc = 0;
+ struct buffer_head *tmp = *bh;

- *err = ocfs2_read_virt_blocks(inode, block, 1, &tmp, 0, NULL);
- if (*err)
- mlog_errno(*err);
+ rc = ocfs2_read_virt_blocks(inode, v_block, 1, &tmp, 0, NULL);
+ if (rc)
+ mlog_errno(rc);
+
+ /* If ocfs2_read_virt_blocks() got us a new bh, pass it up. */
+ if (!rc && !*bh)
+ *bh = tmp;

- return tmp;
+ return rc;
}

static struct buffer_head *ocfs2_get_quota_block(struct inode *inode,
@@ -141,8 +146,9 @@ ssize_t ocfs2_quota_read(struct super_block *sb, int type, char *data,
toread = len;
while (toread > 0) {
tocopy = min((size_t)(sb->s_blocksize - offset), toread);
- bh = ocfs2_read_quota_block(gqinode, blk, &err);
- if (!bh) {
+ bh = NULL;
+ err = ocfs2_read_quota_block(gqinode, blk, &bh);
+ if (err) {
mlog_errno(err);
return err;
}
@@ -167,7 +173,7 @@ ssize_t ocfs2_quota_write(struct super_block *sb, int type,
int offset = off & (sb->s_blocksize - 1);
sector_t blk = off >> sb->s_blocksize_bits;
int err = 0, new = 0;
- struct buffer_head *bh;
+ struct buffer_head *bh = NULL;
handle_t *handle = journal_current_handle();

if (!handle) {
@@ -198,13 +204,13 @@ ssize_t ocfs2_quota_write(struct super_block *sb, int type,
/* Not rewriting whole block? */
if ((offset || len < sb->s_blocksize - OCFS2_QBLK_RESERVED_SPACE) &&
!new) {
- bh = ocfs2_read_quota_block(gqinode, blk, &err);
- if (!bh) {
+ err = ocfs2_read_quota_block(gqinode, blk, &bh);
+ if (err) {
mlog_errno(err);
return err;
}
err = ocfs2_journal_access(handle, gqinode, bh,
- OCFS2_JOURNAL_ACCESS_WRITE);
+ OCFS2_JOURNAL_ACCESS_WRITE);
} else {
bh = ocfs2_get_quota_block(gqinode, blk, &err);
if (!bh) {
@@ -212,7 +218,7 @@ ssize_t ocfs2_quota_write(struct super_block *sb, int type,
return err;
}
err = ocfs2_journal_access(handle, gqinode, bh,
- OCFS2_JOURNAL_ACCESS_CREATE);
+ OCFS2_JOURNAL_ACCESS_CREATE);
}
if (err < 0) {
brelse(bh);
diff --git a/fs/ocfs2/quota_local.c b/fs/ocfs2/quota_local.c
index 54e8788..c008dd9 100644
--- a/fs/ocfs2/quota_local.c
+++ b/fs/ocfs2/quota_local.c
@@ -139,15 +139,15 @@ static int ocfs2_local_check_quota_file(struct super_block *sb, int type)
unsigned int gversions[MAXQUOTAS] = OCFS2_GLOBAL_QVERSIONS;
unsigned int ino[MAXQUOTAS] = { USER_QUOTA_SYSTEM_INODE,
GROUP_QUOTA_SYSTEM_INODE };
- struct buffer_head *bh;
+ struct buffer_head *bh = NULL;
struct inode *linode = sb_dqopt(sb)->files[type];
struct inode *ginode = NULL;
struct ocfs2_disk_dqheader *dqhead;
int status, ret = 0;

/* First check whether we understand local quota file */
- bh = ocfs2_read_quota_block(linode, 0, &status);
- if (!bh) {
+ status = ocfs2_read_quota_block(linode, 0, &bh);
+ if (status) {
mlog_errno(status);
mlog(ML_ERROR, "failed to read quota file header (type=%d)\n",
type);
@@ -178,8 +178,8 @@ static int ocfs2_local_check_quota_file(struct super_block *sb, int type)
goto out_err;
}
/* Since the header is read only, we don't care about locking */
- bh = ocfs2_read_quota_block(ginode, 0, &status);
- if (!bh) {
+ status = ocfs2_read_quota_block(ginode, 0, &bh);
+ if (status) {
mlog_errno(status);
mlog(ML_ERROR, "failed to read global quota file header "
"(type=%d)\n", type);
@@ -235,10 +235,11 @@ static int ocfs2_load_local_quota_bitmaps(struct inode *inode,
return -ENOMEM;
}
newchunk->qc_num = i;
- newchunk->qc_headerbh = ocfs2_read_quota_block(inode,
+ newchunk->qc_headerbh = NULL;
+ status = ocfs2_read_quota_block(inode,
ol_quota_chunk_block(inode->i_sb, i),
- &status);
- if (!newchunk->qc_headerbh) {
+ &newchunk->qc_headerbh);
+ if (status) {
mlog_errno(status);
kmem_cache_free(ocfs2_qf_chunk_cachep, newchunk);
ocfs2_release_local_quota_bitmaps(head);
@@ -320,10 +321,11 @@ static int ocfs2_recovery_load_quota(struct inode *lqinode,
int status = 0;

for (i = 0; i < chunks; i++) {
- hbh = ocfs2_read_quota_block(lqinode,
- ol_quota_chunk_block(sb, i),
- &status);
- if (!hbh) {
+ hbh = NULL;
+ status = ocfs2_read_quota_block(lqinode,
+ ol_quota_chunk_block(sb, i),
+ &hbh);
+ if (status) {
mlog_errno(status);
break;
}
@@ -392,8 +394,9 @@ struct ocfs2_quota_recovery *ocfs2_begin_quota_recovery(
goto out_put;
}
/* Now read local header */
- bh = ocfs2_read_quota_block(lqinode, 0, &status);
- if (!bh) {
+ bh = NULL;
+ status = ocfs2_read_quota_block(lqinode, 0, &bh);
+ if (status) {
mlog_errno(status);
mlog(ML_ERROR, "failed to read quota file info header "
"(slot=%d type=%d)\n", slot_num, type);
@@ -447,19 +450,21 @@ static int ocfs2_recover_local_quota_file(struct inode *lqinode,

list_for_each_entry_safe(rchunk, next, &(rec->r_list[type]), rc_list) {
chunk = rchunk->rc_chunk;
- hbh = ocfs2_read_quota_block(lqinode,
- ol_quota_chunk_block(sb, chunk),
- &status);
- if (!hbh) {
+ hbh = NULL;
+ status = ocfs2_read_quota_block(lqinode,
+ ol_quota_chunk_block(sb, chunk),
+ &hbh);
+ if (status) {
mlog_errno(status);
break;
}
dchunk = (struct ocfs2_local_disk_chunk *)hbh->b_data;
for_each_bit(bit, rchunk->rc_bitmap, ol_chunk_entries(sb)) {
- qbh = ocfs2_read_quota_block(lqinode,
+ qbh = NULL;
+ status = ocfs2_read_quota_block(lqinode,
ol_dqblk_block(sb, chunk, bit),
- &status);
- if (!qbh) {
+ &qbh);
+ if (status) {
mlog_errno(status);
break;
}
@@ -581,8 +586,9 @@ int ocfs2_finish_quota_recovery(struct ocfs2_super *osb,
goto out_put;
}
/* Now read local header */
- bh = ocfs2_read_quota_block(lqinode, 0, &status);
- if (!bh) {
+ bh = NULL;
+ status = ocfs2_read_quota_block(lqinode, 0, &bh);
+ if (status) {
mlog_errno(status);
mlog(ML_ERROR, "failed to read quota file info header "
"(slot=%d type=%d)\n", slot_num, type);
@@ -676,8 +682,8 @@ static int ocfs2_local_read_info(struct super_block *sb, int type)
locked = 1;

/* Now read local header */
- bh = ocfs2_read_quota_block(lqinode, 0, &status);
- if (!bh) {
+ status = ocfs2_read_quota_block(lqinode, 0, &bh);
+ if (status) {
mlog_errno(status);
mlog(ML_ERROR, "failed to read quota file info header "
"(type=%d)\n", type);
@@ -850,13 +856,13 @@ static int ocfs2_local_write_dquot(struct dquot *dquot)
{
struct super_block *sb = dquot->dq_sb;
struct ocfs2_dquot *od = OCFS2_DQUOT(dquot);
- struct buffer_head *bh;
+ struct buffer_head *bh = NULL;
int status;

- bh = ocfs2_read_quota_block(sb_dqopt(sb)->files[dquot->dq_type],
+ status = ocfs2_read_quota_block(sb_dqopt(sb)->files[dquot->dq_type],
ol_dqblk_file_block(sb, od->dq_local_off),
- &status);
- if (!bh) {
+ &bh);
+ if (status) {
mlog_errno(status);
goto out;
}
--
1.5.6

2008-12-22 22:00:59

by Mark Fasheh

[permalink] [raw]
Subject: [PATCH 33/56] ocfs2: Make ocfs2_get_quota_block() consistent with ocfs2_read_quota_block()

From: Jan Kara <[email protected]>

Make the function return an error status rather than a buffer pointer, so
that it's consistent with ocfs2_read_quota_block().

Signed-off-by: Jan Kara <[email protected]>
Signed-off-by: Mark Fasheh <[email protected]>
---
fs/ocfs2/quota_global.c | 27 +++++++++++++--------------
1 files changed, 13 insertions(+), 14 deletions(-)

diff --git a/fs/ocfs2/quota_global.c b/fs/ocfs2/quota_global.c
index 339c98a..27a8123 100644
--- a/fs/ocfs2/quota_global.c
+++ b/fs/ocfs2/quota_global.c
@@ -102,26 +102,25 @@ int ocfs2_read_quota_block(struct inode *inode, u64 v_block,
return rc;
}

-static struct buffer_head *ocfs2_get_quota_block(struct inode *inode,
- int block, int *err)
+static int ocfs2_get_quota_block(struct inode *inode, int block,
+ struct buffer_head **bh)
{
u64 pblock, pcount;
- struct buffer_head *bh;
+ int err;

down_read(&OCFS2_I(inode)->ip_alloc_sem);
- *err = ocfs2_extent_map_get_blocks(inode, block, &pblock, &pcount,
- NULL);
+ err = ocfs2_extent_map_get_blocks(inode, block, &pblock, &pcount, NULL);
up_read(&OCFS2_I(inode)->ip_alloc_sem);
- if (*err) {
- mlog_errno(*err);
- return NULL;
+ if (err) {
+ mlog_errno(err);
+ return err;
}
- bh = sb_getblk(inode->i_sb, pblock);
- if (!bh) {
- *err = -EIO;
- mlog_errno(*err);
+ *bh = sb_getblk(inode->i_sb, pblock);
+ if (!*bh) {
+ err = -EIO;
+ mlog_errno(err);
}
- return bh;
+ return err;
}

/* Read data from global quotafile - avoid pagecache and such because we cannot
@@ -207,7 +206,7 @@ ssize_t ocfs2_quota_write(struct super_block *sb, int type,
err = ocfs2_read_quota_block(gqinode, blk, &bh);
ja_type = OCFS2_JOURNAL_ACCESS_WRITE;
} else {
- bh = ocfs2_get_quota_block(gqinode, blk, &err);
+ err = ocfs2_get_quota_block(gqinode, blk, &bh);
ja_type = OCFS2_JOURNAL_ACCESS_CREATE;
}
if (err) {
--
1.5.6

2008-12-22 22:01:30

by Mark Fasheh

[permalink] [raw]
Subject: [PATCH 34/56] ocfs2: Fix build warnings (64-bit types vs long long)

From: Jan Kara <[email protected]>

fs/ocfs2/quota_local.c: In function 'olq_set_dquot':
fs/ocfs2/quota_local.c:844: warning: format '%lld' expects type 'long long int', but argument 7 has type '__le64'
fs/ocfs2/quota_local.c:844: warning: format '%lld' expects type 'long long int', but argument 8 has type '__le64'
fs/ocfs2/quota_local.c:844: warning: format '%lld' expects type 'long long int', but argument 7 has type '__le64'
fs/ocfs2/quota_local.c:844: warning: format '%lld' expects type 'long long int', but argument 8 has type '__le64'
fs/ocfs2/quota_local.c:844: warning: format '%lld' expects type 'long long int', but argument 7 has type '__le64'
fs/ocfs2/quota_local.c:844: warning: format '%lld' expects type 'long long int', but argument 8 has type '__le64'
fs/ocfs2/quota_global.c: In function '__ocfs2_sync_dquot':
fs/ocfs2/quota_global.c:457: warning: format '%lld' expects type 'long long int', but argument 8 has type 's64'
fs/ocfs2/quota_global.c:457: warning: format '%lld' expects type 'long long int', but argument 10 has type 's64'
fs/ocfs2/quota_global.c:457: warning: format '%lld' expects type 'long long int', but argument 8 has type 's64'
fs/ocfs2/quota_global.c:457: warning: format '%lld' expects type 'long long int', but argument 10 has type 's64'
fs/ocfs2/quota_global.c:457: warning: format '%lld' expects type 'long long int', but argument 8 has type 's64'
fs/ocfs2/quota_global.c:457: warning: format '%lld' expects type 'long long int', but argument 10 has type 's64'
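
The fix follows the usual pattern for 64-bit values in printk-style format
strings: byte-swap on-disk __le64 fields with le64_to_cpu() and cast the
result (or any s64) to (long long) before printing with %lld. A tiny
standalone illustration with plain C99 types -- le64_to_cpu() is
kernel-only, so it only appears in a comment:

#include <stdio.h>
#include <stdint.h>

int main(void)
{
	int64_t curspace = 123456789012345LL;	/* plays the role of an s64 */

	/* Without the cast, %lld and int64_t can disagree on some ABIs
	 * (int64_t may be 'long'), which is exactly what gcc warned about. */
	printf("space %lld\n", (long long)curspace);

	/* For a __le64 field the kernel code additionally byte-swaps first:
	 *   (long long)le64_to_cpu(dqblk->dqb_spacemod)
	 */
	return 0;
}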

Signed-off-by: Jan Kara <[email protected]>
Signed-off-by: Mark Fasheh <[email protected]>
---
fs/ocfs2/quota_global.c | 6 +++---
fs/ocfs2/quota_local.c | 3 ++-
2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/fs/ocfs2/quota_global.c b/fs/ocfs2/quota_global.c
index 27a8123..b0eae79 100644
--- a/fs/ocfs2/quota_global.c
+++ b/fs/ocfs2/quota_global.c
@@ -455,9 +455,9 @@ int __ocfs2_sync_dquot(struct dquot *dquot, int freeing)
olditime = dquot->dq_dqb.dqb_itime;
oldbtime = dquot->dq_dqb.dqb_btime;
ocfs2_global_disk2memdqb(dquot, &dqblk);
- mlog(0, "Syncing global dquot %d space %lld+%lld, inodes %lld+%lld\n",
- dquot->dq_id, dquot->dq_dqb.dqb_curspace, spacechange,
- dquot->dq_dqb.dqb_curinodes, inodechange);
+ mlog(0, "Syncing global dquot %u space %lld+%lld, inodes %lld+%lld\n",
+ dquot->dq_id, dquot->dq_dqb.dqb_curspace, (long long)spacechange,
+ dquot->dq_dqb.dqb_curinodes, (long long)inodechange);
if (!test_bit(DQ_LASTSET_B + QIF_SPACE_B, &dquot->dq_flags))
dquot->dq_dqb.dqb_curspace += spacechange;
if (!test_bit(DQ_LASTSET_B + QIF_INODES_B, &dquot->dq_flags))
diff --git a/fs/ocfs2/quota_local.c b/fs/ocfs2/quota_local.c
index c008dd9..9b22643 100644
--- a/fs/ocfs2/quota_local.c
+++ b/fs/ocfs2/quota_local.c
@@ -848,7 +848,8 @@ static void olq_set_dquot(struct buffer_head *bh, void *private)
od->dq_originodes);
spin_unlock(&dq_data_lock);
mlog(0, "Writing local dquot %u space %lld inodes %lld\n",
- od->dq_dquot.dq_id, dqblk->dqb_spacemod, dqblk->dqb_inodemod);
+ od->dq_dquot.dq_id, (long long)le64_to_cpu(dqblk->dqb_spacemod),
+ (long long)le64_to_cpu(dqblk->dqb_inodemod));
}

/* Write dquot to local quota file */
--
1.5.6

2008-12-22 22:01:50

by Mark Fasheh

[permalink] [raw]
Subject: [PATCH 35/56] quota: Unexport dqblk_v1.h and dqblk_v2.h

From: Jan Kara <[email protected]>

Unexport the header files dqblk_v[12].h since, apart from the quota format
IDs, they don't contain information userspace should be interested in. Move
the ID definitions to quota.h.
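
The format IDs are the one piece userspace genuinely needs, since
quotactl(2) takes the format as the id argument of Q_QUOTAON. A hedged
userspace sketch (the device and quota-file paths are made up; the
fallback value 2 mirrors the QFMT_VFS_V0 definition this patch moves into
quota.h):

#include <stdio.h>
#include <sys/quota.h>

/* Mirror of the ID this patch moves into <linux/quota.h>; most libcs also
 * expose it via <sys/quota.h>. */
#ifndef QFMT_VFS_V0
#define QFMT_VFS_V0 2
#endif

int main(void)
{
	/* Hypothetical paths - adjust for a real system. */
	const char *dev = "/dev/sda1";
	const char *qfile = "/mnt/aquota.user";

	if (quotactl(QCMD(Q_QUOTAON, USRQUOTA), dev, QFMT_VFS_V0,
		     (void *)qfile) < 0)
		perror("Q_QUOTAON");
	return 0;
}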

Signed-off-by: Jan Kara <[email protected]>
Signed-off-by: Mark Fasheh <[email protected]>
---
include/linux/Kbuild | 2 --
include/linux/dqblk_v1.h | 3 ---
include/linux/dqblk_v2.h | 3 ---
include/linux/quota.h | 4 ++++
4 files changed, 4 insertions(+), 8 deletions(-)

diff --git a/include/linux/Kbuild b/include/linux/Kbuild
index 0fd2da3..4c32642 100644
--- a/include/linux/Kbuild
+++ b/include/linux/Kbuild
@@ -56,8 +56,6 @@ header-y += dlm_device.h
header-y += dlm_netlink.h
header-y += dm-ioctl.h
header-y += dn.h
-header-y += dqblk_v1.h
-header-y += dqblk_v2.h
header-y += dqblk_xfs.h
header-y += efs_fs_sb.h
header-y += elf-fdpic.h
diff --git a/include/linux/dqblk_v1.h b/include/linux/dqblk_v1.h
index 9cea901..3713a72 100644
--- a/include/linux/dqblk_v1.h
+++ b/include/linux/dqblk_v1.h
@@ -5,9 +5,6 @@
#ifndef _LINUX_DQBLK_V1_H
#define _LINUX_DQBLK_V1_H

-/* Id of quota format */
-#define QFMT_VFS_OLD 1
-
/* Root squash turned on */
#define V1_DQF_RSQUASH 1

diff --git a/include/linux/dqblk_v2.h b/include/linux/dqblk_v2.h
index ff8af1b..18000a5 100644
--- a/include/linux/dqblk_v2.h
+++ b/include/linux/dqblk_v2.h
@@ -7,9 +7,6 @@

#include <linux/dqblk_qtree.h>

-/* Id number of quota format */
-#define QFMT_VFS_V0 2
-
/* Numbers of blocks needed for updates */
#define V2_INIT_ALLOC QTREE_INIT_ALLOC
#define V2_INIT_REWRITE QTREE_INIT_REWRITE
diff --git a/include/linux/quota.h b/include/linux/quota.h
index ec82beb..d72d5d8 100644
--- a/include/linux/quota.h
+++ b/include/linux/quota.h
@@ -70,6 +70,10 @@
#define Q_GETQUOTA 0x800007 /* get user quota structure */
#define Q_SETQUOTA 0x800008 /* set user quota structure */

+/* Quota format type IDs */
+#define QFMT_VFS_OLD 1
+#define QFMT_VFS_V0 2
+
/* Size of block in which space limits are passed through the quota
* interface */
#define QIF_DQBLKSIZE_BITS 10
--
1.5.6

2008-12-22 22:02:14

by Mark Fasheh

[permalink] [raw]
Subject: [PATCH 36/56] quota: Export dquot_alloc() and dquot_destroy() functions

From: Jan Kara <[email protected]>

These are default functions for creating and destroying quota structures
and they should be used from filesystems.

Signed-off-by: Jan Kara <[email protected]>
Signed-off-by: Mark Fasheh <[email protected]>
---
fs/dquot.c | 6 ++++--
include/linux/quotaops.h | 2 ++
2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/fs/dquot.c b/fs/dquot.c
index 6f7df91..c724586 100644
--- a/fs/dquot.c
+++ b/fs/dquot.c
@@ -413,10 +413,11 @@ out_dqlock:
return ret;
}

-static void dquot_destroy(struct dquot *dquot)
+void dquot_destroy(struct dquot *dquot)
{
kmem_cache_free(dquot_cachep, dquot);
}
+EXPORT_SYMBOL(dquot_destroy);

static inline void do_destroy_dquot(struct dquot *dquot)
{
@@ -668,10 +669,11 @@ we_slept:
spin_unlock(&dq_list_lock);
}

-static struct dquot *dquot_alloc(struct super_block *sb, int type)
+struct dquot *dquot_alloc(struct super_block *sb, int type)
{
return kmem_cache_zalloc(dquot_cachep, GFP_NOFS);
}
+EXPORT_SYMBOL(dquot_alloc);

static struct dquot *get_empty_dquot(struct super_block *sb, int type)
{
diff --git a/include/linux/quotaops.h b/include/linux/quotaops.h
index f491394..21b781a 100644
--- a/include/linux/quotaops.h
+++ b/include/linux/quotaops.h
@@ -31,6 +31,8 @@ int dquot_is_cached(struct super_block *sb, unsigned int id, int type);
int dquot_scan_active(struct super_block *sb,
int (*fn)(struct dquot *dquot, unsigned long priv),
unsigned long priv);
+struct dquot *dquot_alloc(struct super_block *sb, int type);
+void dquot_destroy(struct dquot *dquot);

int dquot_alloc_space(struct inode *inode, qsize_t number, int prealloc);
int dquot_alloc_inode(const struct inode *inode, qsize_t number);
--
1.5.6

2008-12-22 22:02:50

by Mark Fasheh

[permalink] [raw]
Subject: [PATCH 38/56] ext3: Add default allocation routines for quota structures

From: Jan Kara <[email protected]>

Signed-off-by: Jan Kara <[email protected]>
Signed-off-by: Mark Fasheh <[email protected]>
---
fs/ext3/super.c | 4 +++-
1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/fs/ext3/super.c b/fs/ext3/super.c
index 250ec53..c22d014 100644
--- a/fs/ext3/super.c
+++ b/fs/ext3/super.c
@@ -713,7 +713,9 @@ static struct dquot_operations ext3_quota_operations = {
.acquire_dquot = ext3_acquire_dquot,
.release_dquot = ext3_release_dquot,
.mark_dirty = ext3_mark_dquot_dirty,
- .write_info = ext3_write_info
+ .write_info = ext3_write_info,
+ .alloc_dquot = dquot_alloc,
+ .destroy_dquot = dquot_destroy,
};

static struct quotactl_ops ext3_qctl_operations = {
--
1.5.6

2008-12-22 22:02:33

by Mark Fasheh

[permalink] [raw]
Subject: [PATCH 37/56] reiserfs: Add default allocation routines for quota structures

From: Jan Kara <[email protected]>

Signed-off-by: Jan Kara <[email protected]>
Signed-off-by: Mark Fasheh <[email protected]>
---
fs/reiserfs/super.c | 2 ++
1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/fs/reiserfs/super.c b/fs/reiserfs/super.c
index a9b393a..c55651f 100644
--- a/fs/reiserfs/super.c
+++ b/fs/reiserfs/super.c
@@ -649,6 +649,8 @@ static struct dquot_operations reiserfs_quota_operations = {
.release_dquot = reiserfs_release_dquot,
.mark_dirty = reiserfs_mark_dquot_dirty,
.write_info = reiserfs_write_info,
+ .alloc_dquot = dquot_alloc,
+ .destroy_dquot = dquot_destroy,
};

static struct quotactl_ops reiserfs_qctl_operations = {
--
1.5.6

2008-12-22 22:03:16

by Mark Fasheh

[permalink] [raw]
Subject: [PATCH 39/56] ext4: Add default allocation routines for quota structures

From: Jan Kara <[email protected]>

Signed-off-by: Jan Kara <[email protected]>
Signed-off-by: Mark Fasheh <[email protected]>
---
fs/ext4/super.c | 4 +++-
1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 9e5a717..e6a5d74 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -803,7 +803,9 @@ static struct dquot_operations ext4_quota_operations = {
.acquire_dquot = ext4_acquire_dquot,
.release_dquot = ext4_release_dquot,
.mark_dirty = ext4_mark_dquot_dirty,
- .write_info = ext4_write_info
+ .write_info = ext4_write_info,
+ .alloc_dquot = dquot_alloc,
+ .destroy_dquot = dquot_destroy,
};

static struct quotactl_ops ext4_qctl_operations = {
--
1.5.6

2008-12-22 22:03:34

by Mark Fasheh

[permalink] [raw]
Subject: [PATCH 40/56] ocfs2: fix indentation in ocfs2_dquot_drop_slow

From: Tao Ma <[email protected]>

Signed-off-by: Tao Ma <[email protected]>
Signed-off-by: Mark Fasheh <[email protected]>
---
fs/ocfs2/quota_global.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/fs/ocfs2/quota_global.c b/fs/ocfs2/quota_global.c
index b0eae79..4b38a2e 100644
--- a/fs/ocfs2/quota_global.c
+++ b/fs/ocfs2/quota_global.c
@@ -896,7 +896,7 @@ static int ocfs2_dquot_drop_slow(struct inode *inode)
if (IS_ERR(handle)) {
status = PTR_ERR(handle);
mlog_errno(status);
- goto out;
+ goto out;
}
dquot_drop(inode);
ocfs2_commit_trans(OCFS2_SB(sb), handle);
--
1.5.6

2008-12-22 22:03:54

by Mark Fasheh

[permalink] [raw]
Subject: [PATCH 41/56] ocfs2/quota: sparse fixes for quota

From: Tao Ma <[email protected]>

Fix two minor things in quota, both found by sparse checking:
1. an endianness bug in ocfs2_local_quota_add_chunk (see the sketch below).
2. make olq_alloc_dquot static.
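
For readers new to the on-disk endianness rules: a host-endian integer has
to be converted before being stored in a little-endian disk field, which is
what the missing cpu_to_le32() was about. A standalone sketch using glibc's
byte-order helpers (the structure and values are invented for illustration):

#define _DEFAULT_SOURCE
#include <stdio.h>
#include <stdint.h>
#include <endian.h>	/* htole32()/le32toh(), userspace analogue of cpu_to_le32() */

/* Invented on-disk structure; dqc_free mirrors the field fixed below. */
struct disk_chunk {
	uint32_t dqc_free;	/* stored little-endian on disk */
};

int main(void)
{
	struct disk_chunk dchunk;
	uint32_t entries = 50;	/* host-endian value, e.g. entries per block */

	dchunk.dqc_free = htole32(entries);	/* correct: convert before storing */
	/* dchunk.dqc_free = entries;          wrong on big-endian hosts */

	printf("free entries on disk: %u\n", (unsigned)le32toh(dchunk.dqc_free));
	return 0;
}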

Signed-off-by: Tao Ma <[email protected]>
Signed-off-by: Mark Fasheh <[email protected]>
---
fs/ocfs2/quota_local.c | 4 ++--
1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/ocfs2/quota_local.c b/fs/ocfs2/quota_local.c
index 9b22643..5353c42 100644
--- a/fs/ocfs2/quota_local.c
+++ b/fs/ocfs2/quota_local.c
@@ -988,7 +988,7 @@ static struct ocfs2_quota_chunk *ocfs2_local_quota_add_chunk(
goto out_trans;
}
lock_buffer(bh);
- dchunk->dqc_free = ol_quota_entries_per_block(sb);
+ dchunk->dqc_free = cpu_to_le32(ol_quota_entries_per_block(sb));
memset(dchunk->dqc_bitmap, 0,
sb->s_blocksize - sizeof(struct ocfs2_local_disk_chunk) -
OCFS2_QBLK_RESERVED_SPACE);
@@ -1110,7 +1110,7 @@ out:
return ERR_PTR(status);
}

-void olq_alloc_dquot(struct buffer_head *bh, void *private)
+static void olq_alloc_dquot(struct buffer_head *bh, void *private)
{
int *offset = private;
struct ocfs2_local_disk_chunk *dchunk;
--
1.5.6

2008-12-22 22:04:23

by Mark Fasheh

[permalink] [raw]
Subject: [PATCH 42/56] ocfs2: Dirty the entire bucket in ocfs2_bucket_value_truncate()

From: Joel Becker <[email protected]>

ocfs2_xattr_bucket_value_truncate() currently takes the first bh of the
bucket and magically plays around with the value bh, even though the
bucket structure in the calling function already holds all of the
bucket's buffer_heads.

In addition, future code wants to always dirty the entire bucket when it
is changed. So let's pass the entire bucket into this function, skip
any block reads (we have them), and add the access/dirty logic.

ocfs2_xattr_update_value_size() is no longer necessary, as it only did
one thing other than journal access/dirty.
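
The access/dirty shape this patch moves to can be summed up as: declare
journal access for the whole bucket up front, then dirty the whole bucket
even on the error path, because a partial modification may already have
happened. A minimal standalone sketch of that control flow (all helpers
are stand-ins, not the real ocfs2 functions):

#include <stdio.h>

/* Stand-ins for ocfs2_xattr_bucket_journal_access()/_dirty() and the
 * value-truncate call; none of these are the real kernel functions. */
static int  bucket_journal_access(void) { return 0; }
static void bucket_journal_dirty(void)  { printf("dirty whole bucket\n"); }
static int  value_truncate(void)        { return 0; }

static int bucket_value_truncate(void)
{
	int ret;

	ret = bucket_journal_access();	/* cover every block of the bucket */
	if (ret)
		return ret;

	ret = value_truncate();
	if (ret) {
		/* The truncate may already have touched one of the bucket's
		 * blocks, so we must still dirty the whole bucket. */
		goto out_dirty;
	}

	/* ... update the entry's value size here ... */

out_dirty:
	bucket_journal_dirty();
	return ret;
}

int main(void)
{
	return bucket_value_truncate();
}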

Signed-off-by: Joel Becker <[email protected]>
Signed-off-by: Mark Fasheh <[email protected]>
---
fs/ocfs2/xattr.c | 74 ++++++++++++++++++++---------------------------------
1 files changed, 28 insertions(+), 46 deletions(-)

diff --git a/fs/ocfs2/xattr.c b/fs/ocfs2/xattr.c
index 3b9634c..6db68a2 100644
--- a/fs/ocfs2/xattr.c
+++ b/fs/ocfs2/xattr.c
@@ -4580,31 +4580,6 @@ out:
return ret;
}

-static int ocfs2_xattr_value_update_size(struct inode *inode,
- handle_t *handle,
- struct buffer_head *xe_bh,
- struct ocfs2_xattr_entry *xe,
- u64 new_size)
-{
- int ret;
-
- ret = ocfs2_journal_access(handle, inode, xe_bh,
- OCFS2_JOURNAL_ACCESS_WRITE);
- if (ret < 0) {
- mlog_errno(ret);
- goto out;
- }
-
- xe->xe_value_size = cpu_to_le64(new_size);
-
- ret = ocfs2_journal_dirty(handle, xe_bh);
- if (ret < 0)
- mlog_errno(ret);
-
-out:
- return ret;
-}
-
/*
* Truncate the specified xe_off entry in xattr bucket.
* bucket is indicated by header_bh and len is the new length.
@@ -4613,7 +4588,7 @@ out:
* Copy the new updated xe and xe_value_root to new_xe and new_xv if needed.
*/
static int ocfs2_xattr_bucket_value_truncate(struct inode *inode,
- struct buffer_head *header_bh,
+ struct ocfs2_xattr_bucket *bucket,
int xe_off,
int len,
struct ocfs2_xattr_set_ctxt *ctxt)
@@ -4623,8 +4598,7 @@ static int ocfs2_xattr_bucket_value_truncate(struct inode *inode,
struct buffer_head *value_bh = NULL;
struct ocfs2_xattr_value_root *xv;
struct ocfs2_xattr_entry *xe;
- struct ocfs2_xattr_header *xh =
- (struct ocfs2_xattr_header *)header_bh->b_data;
+ struct ocfs2_xattr_header *xh = bucket_xh(bucket);
size_t blocksize = inode->i_sb->s_blocksize;

xe = &xh->xh_entries[xe_off];
@@ -4638,34 +4612,41 @@ static int ocfs2_xattr_bucket_value_truncate(struct inode *inode,

/* We don't allow ocfs2_xattr_value to be stored in different block. */
BUG_ON(value_blk != (offset + OCFS2_XATTR_ROOT_SIZE - 1) / blocksize);
- value_blk += header_bh->b_blocknr;

- ret = ocfs2_read_block(inode, value_blk, &value_bh, NULL);
- if (ret) {
- mlog_errno(ret);
- goto out;
- }
+ value_bh = bucket->bu_bhs[value_blk];
+ BUG_ON(!value_bh);

xv = (struct ocfs2_xattr_value_root *)
(value_bh->b_data + offset % blocksize);

- mlog(0, "truncate %u in xattr bucket %llu to %d bytes.\n",
- xe_off, (unsigned long long)header_bh->b_blocknr, len);
- ret = ocfs2_xattr_value_truncate(inode, value_bh, xv, len, ctxt);
+ ret = ocfs2_xattr_bucket_journal_access(ctxt->handle, bucket,
+ OCFS2_JOURNAL_ACCESS_WRITE);
if (ret) {
mlog_errno(ret);
goto out;
}

- ret = ocfs2_xattr_value_update_size(inode, ctxt->handle,
- header_bh, xe, len);
+ /*
+ * From here on out we have to dirty the bucket. The generic
+ * value calls only modify one of the bucket's bhs, but we need
+ * to send the bucket at once. So if they error, they *could* have
+ * modified something. We have to assume they did, and dirty
+ * the whole bucket. This leaves us in a consistent state.
+ */
+ mlog(0, "truncate %u in xattr bucket %llu to %d bytes.\n",
+ xe_off, (unsigned long long)bucket_blkno(bucket), len);
+ ret = ocfs2_xattr_value_truncate(inode, value_bh, xv, len, ctxt);
if (ret) {
mlog_errno(ret);
- goto out;
+ goto out_dirty;
}

+ xe->xe_value_size = cpu_to_le64(len);
+
+out_dirty:
+ ocfs2_xattr_bucket_journal_dirty(ctxt->handle, bucket);
+
out:
- brelse(value_bh);
return ret;
}

@@ -4681,7 +4662,7 @@ static int ocfs2_xattr_bucket_value_truncate_xs(struct inode *inode,
BUG_ON(!xs->bucket->bu_bhs[0] || !xe || ocfs2_xattr_is_local(xe));

offset = xe - xh->xh_entries;
- ret = ocfs2_xattr_bucket_value_truncate(inode, xs->bucket->bu_bhs[0],
+ ret = ocfs2_xattr_bucket_value_truncate(inode, xs->bucket,
offset, len, ctxt);
if (ret)
mlog_errno(ret);
@@ -5107,11 +5088,13 @@ static int ocfs2_delete_xattr_in_bucket(struct inode *inode,
struct ocfs2_xattr_entry *xe;
struct ocfs2_super *osb = OCFS2_SB(inode->i_sb);
struct ocfs2_xattr_set_ctxt ctxt = {NULL, NULL,};
+ int credits = ocfs2_remove_extent_credits(osb->sb) +
+ ocfs2_blocks_per_xattr_bucket(inode->i_sb);
+

ocfs2_init_dealloc_ctxt(&ctxt.dealloc);

- ctxt.handle = ocfs2_start_trans(osb,
- ocfs2_remove_extent_credits(osb->sb));
+ ctxt.handle = ocfs2_start_trans(osb, credits);
if (IS_ERR(ctxt.handle)) {
ret = PTR_ERR(ctxt.handle);
mlog_errno(ret);
@@ -5123,8 +5106,7 @@ static int ocfs2_delete_xattr_in_bucket(struct inode *inode,
if (ocfs2_xattr_is_local(xe))
continue;

- ret = ocfs2_xattr_bucket_value_truncate(inode,
- bucket->bu_bhs[0],
+ ret = ocfs2_xattr_bucket_value_truncate(inode, bucket,
i, 0, &ctxt);
if (ret) {
mlog_errno(ret);
--
1.5.6

2008-12-22 22:04:43

by Mark Fasheh

[permalink] [raw]
Subject: [PATCH 43/56] ocfs2: Narrow the transaction for deleting xattrs from a bucket.

From: Tao Ma <[email protected]>

Move the transaction into the loop: ocfs2_remove_extent() doubles the
credits in ocfs2_extend_rotate_transaction(), so with a large number of
loop iterations we would quickly waste much of the journal space. The
reshaped loop is sketched below.
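
In other words, instead of one handle wrapped around the whole loop, each
iteration now gets its own start/commit pair so the credit estimate only
ever has to cover a single truncate. A minimal standalone sketch of the
reshaped loop (the helpers and credit count are stand-ins):

#include <stdio.h>

static int  start_trans(int credits) { printf("start, %d credits\n", credits); return 0; }
static void commit_trans(void)       { printf("commit\n"); }
static int  truncate_entry(int i)    { printf("truncate entry %d\n", i); return 0; }

int main(void)
{
	int i, ret = 0;
	const int entries = 3, credits = 8;	/* made-up numbers */

	for (i = 0; i < entries; i++) {
		/* One short-lived transaction per entry keeps the credit
		 * requirement bounded instead of growing with the loop. */
		ret = start_trans(credits);
		if (ret)
			break;
		ret = truncate_entry(i);
		commit_trans();
		if (ret)
			break;
	}
	return ret;
}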

Signed-off-by: Tao Ma <[email protected]>
Signed-off-by: Joel Becker <[email protected]>
Signed-off-by: Mark Fasheh <[email protected]>
---
fs/ocfs2/xattr.c | 18 +++++++++---------
1 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/fs/ocfs2/xattr.c b/fs/ocfs2/xattr.c
index 6db68a2..df53a2c 100644
--- a/fs/ocfs2/xattr.c
+++ b/fs/ocfs2/xattr.c
@@ -5094,30 +5094,30 @@ static int ocfs2_delete_xattr_in_bucket(struct inode *inode,

ocfs2_init_dealloc_ctxt(&ctxt.dealloc);

- ctxt.handle = ocfs2_start_trans(osb, credits);
- if (IS_ERR(ctxt.handle)) {
- ret = PTR_ERR(ctxt.handle);
- mlog_errno(ret);
- goto out;
- }
-
for (i = 0; i < le16_to_cpu(xh->xh_count); i++) {
xe = &xh->xh_entries[i];
if (ocfs2_xattr_is_local(xe))
continue;

+ ctxt.handle = ocfs2_start_trans(osb, credits);
+ if (IS_ERR(ctxt.handle)) {
+ ret = PTR_ERR(ctxt.handle);
+ mlog_errno(ret);
+ break;
+ }
+
ret = ocfs2_xattr_bucket_value_truncate(inode, bucket,
i, 0, &ctxt);
+
+ ocfs2_commit_trans(osb, ctxt.handle);
if (ret) {
mlog_errno(ret);
break;
}
}

- ret = ocfs2_commit_trans(osb, ctxt.handle);
ocfs2_schedule_truncate_log_flush(osb, 1);
ocfs2_run_deallocs(osb, &ctxt.dealloc);
-out:
return ret;
}

--
1.5.6

2008-12-22 22:05:00

by Mark Fasheh

[permalink] [raw]
Subject: [PATCH 44/56] ocfs2: Dirty the entire first bucket in ocfs2_extend_xattr_bucket()

From: Joel Becker <[email protected]>

ocfs2_extend_xattr_bucket() takes an extent of buckets and shifts some
of them down to make room for a new xattr. It is passed the first bh of
the first bucket, because that is where we store the number of buckets
in the extent.

However, future code wants to always dirty the entire bucket when it
is changed. So let's pass the entire bucket into this function, skip
any block reads (we have them), and add the access/dirty logic. We also
can skip passing in the target bucket bh - we only need its block
number.
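
Conceptually the move is a backwards copy loop from the last existing
bucket down to the target, shifting each bucket into the next slot, after
which the target is split into the freed slot. A standalone sketch over a
plain character array, where indices stand in for bucket positions:

#include <stdio.h>

int main(void)
{
	/* Slots 0..3 hold buckets A..D; slot 4 is the free space at the
	 * end of the extent. We want an empty slot right after the target. */
	char slots[6] = "ABCD.";
	int target = 1;		/* bucket 'B' */
	int last = 3;		/* index of the last existing bucket */
	int i;

	/* Walk backwards so nothing is overwritten before it is copied,
	 * mirroring the end_blk -> target_blk loop in the patch. */
	for (i = last; i > target; i--)
		slots[i + 1] = slots[i];
	slots[target + 1] = '.';	/* freed slot; the real code splits
					 * the target bucket into it */

	printf("%s\n", slots);		/* prints "AB.CD" */
	return 0;
}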

Signed-off-by: Joel Becker <[email protected]>
Signed-off-by: Mark Fasheh <[email protected]>
---
fs/ocfs2/xattr.c | 85 +++++++++++++++++++++++++++++++++++-------------------
1 files changed, 55 insertions(+), 30 deletions(-)

diff --git a/fs/ocfs2/xattr.c b/fs/ocfs2/xattr.c
index df53a2c..ed1e959 100644
--- a/fs/ocfs2/xattr.c
+++ b/fs/ocfs2/xattr.c
@@ -3905,7 +3905,7 @@ static int ocfs2_cp_xattr_bucket(struct inode *inode,
mlog_errno(ret);
goto out;
}
-
+
ret = ocfs2_read_xattr_bucket(s_bucket, s_blkno);
if (ret)
goto out;
@@ -4232,37 +4232,45 @@ leave:
}

/*
- * Extend a new xattr bucket and move xattrs to the end one by one until
- * We meet with start_bh. Only move half of the xattrs to the bucket after it.
+ * We are given an extent. 'first' is the bucket at the very front of
+ * the extent. The extent has space for an additional bucket past
+ * bucket_xh(first)->xh_num_buckets. 'target_blkno' is the block number
+ * of the target bucket. We wish to shift every bucket past the target
+ * down one, filling in that additional space. When we get back to the
+ * target, we split the target between itself and the now-empty bucket
+ * at target+1 (aka, target_blkno + blks_per_bucket).
*/
static int ocfs2_extend_xattr_bucket(struct inode *inode,
handle_t *handle,
- struct buffer_head *first_bh,
- struct buffer_head *start_bh,
+ struct ocfs2_xattr_bucket *first,
+ u64 target_blk,
u32 num_clusters)
{
int ret, credits;
struct ocfs2_super *osb = OCFS2_SB(inode->i_sb);
u16 blk_per_bucket = ocfs2_blocks_per_xattr_bucket(inode->i_sb);
- u64 start_blk = start_bh->b_blocknr, end_blk;
- u32 num_buckets = num_clusters * ocfs2_xattr_buckets_per_cluster(osb);
- struct ocfs2_xattr_header *first_xh =
- (struct ocfs2_xattr_header *)first_bh->b_data;
- u16 bucket = le16_to_cpu(first_xh->xh_num_buckets);
+ u64 end_blk;
+ u16 new_bucket = le16_to_cpu(bucket_xh(first)->xh_num_buckets);

mlog(0, "extend xattr bucket in %llu, xattr extend rec starting "
- "from %llu, len = %u\n", (unsigned long long)start_blk,
- (unsigned long long)first_bh->b_blocknr, num_clusters);
+ "from %llu, len = %u\n", (unsigned long long)target_blk,
+ (unsigned long long)bucket_blkno(first), num_clusters);

- BUG_ON(bucket >= num_buckets);
+ /* The extent must have room for an additional bucket */
+ BUG_ON(new_bucket >=
+ (num_clusters * ocfs2_xattr_buckets_per_cluster(osb)));

- end_blk = first_bh->b_blocknr + (bucket - 1) * blk_per_bucket;
+ /* end_blk points to the last existing bucket */
+ end_blk = bucket_blkno(first) + ((new_bucket - 1) * blk_per_bucket);

/*
- * We will touch all the buckets after the start_bh(include it).
- * Then we add one more bucket.
+ * end_blk is the start of the last existing bucket.
+ * Thus, (end_blk - target_blk) covers the target bucket and
+ * every bucket after it up to, but not including, the last
+ * existing bucket. Then we add the last existing bucket, the
+ * new bucket, and the first bucket (3 * blk_per_bucket).
*/
- credits = end_blk - start_blk + 3 * blk_per_bucket + 1 +
+ credits = (end_blk - target_blk) + (3 * blk_per_bucket) +
handle->h_buffer_credits;
ret = ocfs2_extend_trans(handle, credits);
if (ret) {
@@ -4270,14 +4278,14 @@ static int ocfs2_extend_xattr_bucket(struct inode *inode,
goto out;
}

- ret = ocfs2_journal_access(handle, inode, first_bh,
- OCFS2_JOURNAL_ACCESS_WRITE);
+ ret = ocfs2_xattr_bucket_journal_access(handle, first,
+ OCFS2_JOURNAL_ACCESS_WRITE);
if (ret) {
mlog_errno(ret);
goto out;
}

- while (end_blk != start_blk) {
+ while (end_blk != target_blk) {
ret = ocfs2_cp_xattr_bucket(inode, handle, end_blk,
end_blk + blk_per_bucket, 0);
if (ret)
@@ -4285,12 +4293,12 @@ static int ocfs2_extend_xattr_bucket(struct inode *inode,
end_blk -= blk_per_bucket;
}

- /* Move half of the xattr in start_blk to the next bucket. */
- ret = ocfs2_divide_xattr_bucket(inode, handle, start_blk,
- start_blk + blk_per_bucket, NULL, 0);
+ /* Move half of the xattr in target_blkno to the next bucket. */
+ ret = ocfs2_divide_xattr_bucket(inode, handle, target_blk,
+ target_blk + blk_per_bucket, NULL, 0);

- le16_add_cpu(&first_xh->xh_num_buckets, 1);
- ocfs2_journal_dirty(handle, first_bh);
+ le16_add_cpu(&bucket_xh(first)->xh_num_buckets, 1);
+ ocfs2_xattr_bucket_journal_dirty(handle, first);

out:
return ret;
@@ -4324,10 +4332,19 @@ static int ocfs2_add_new_xattr_bucket(struct inode *inode,
int ret, num_buckets, extend = 1;
u64 p_blkno;
u32 e_cpos, num_clusters;
+ /* The bucket at the front of the extent */
+ struct ocfs2_xattr_bucket *first;

mlog(0, "Add new xattr bucket starting form %llu\n",
(unsigned long long)header_bh->b_blocknr);

+ first = ocfs2_xattr_bucket_new(inode);
+ if (!first) {
+ ret = -ENOMEM;
+ mlog_errno(ret);
+ goto out;
+ }
+
/*
* Add refrence for header_bh here because it may be
* changed in ocfs2_add_new_xattr_cluster and we need
@@ -4367,17 +4384,25 @@ static int ocfs2_add_new_xattr_bucket(struct inode *inode,
}
}

- if (extend)
+ if (extend) {
+ /* These bucket reads should be cached */
+ ret = ocfs2_read_xattr_bucket(first, first_bh->b_blocknr);
+ if (ret) {
+ mlog_errno(ret);
+ goto out;
+ }
ret = ocfs2_extend_xattr_bucket(inode,
ctxt->handle,
- first_bh,
- header_bh,
+ first, header_bh->b_blocknr,
num_clusters);
- if (ret)
- mlog_errno(ret);
+ if (ret)
+ mlog_errno(ret);
+ }
+
out:
brelse(first_bh);
brelse(header_bh);
+ ocfs2_xattr_bucket_free(first);
return ret;
}

--
1.5.6

2008-12-22 22:05:27

by Mark Fasheh

[permalink] [raw]
Subject: [PATCH 45/56] ocfs2: Dirty the entire first bucket in ocfs2_cp_xattr_cluster().

From: Joel Becker <[email protected]>

ocfs2_cp_xattr_cluster() takes the last bucket of a full extent and
copies it over to a new extent. It then updates the headers of both
extents to reflect the new state. It is passed the first bh of
the first bucket in order to update that first extent's bucket count.
It reads and dirties the first bh of the new extent for the same reason.

However, future code wants to always dirty the entire bucket when it
is changed. So the function is changed to read the entire first bucket
of each extent it updates.

Signed-off-by: Joel Becker <[email protected]>
Signed-off-by: Mark Fasheh <[email protected]>
---
fs/ocfs2/xattr.c | 80 ++++++++++++++++++++++++++++++++---------------------
1 files changed, 48 insertions(+), 32 deletions(-)

diff --git a/fs/ocfs2/xattr.c b/fs/ocfs2/xattr.c
index ed1e959..4dba347 100644
--- a/fs/ocfs2/xattr.c
+++ b/fs/ocfs2/xattr.c
@@ -3936,9 +3936,10 @@ out:
}

/*
- * Copy one xattr cluster from src_blk to to_blk.
- * The to_blk will become the first bucket header of the cluster, so its
- * xh_num_buckets will be initialized as the bucket num in the cluster.
+ * src_blk points to the last cluster of an existing extent. to_blk
+ * points to a newly allocated extent. We copy the cluster over to the
+ * new extent, initializing its xh_num_buckets. The old extent's
+ * xh_num_buckets shrinks by the same amount.
*/
static int ocfs2_cp_xattr_cluster(struct inode *inode,
handle_t *handle,
@@ -3950,27 +3951,42 @@ static int ocfs2_cp_xattr_cluster(struct inode *inode,
int i, ret, credits;
struct ocfs2_super *osb = OCFS2_SB(inode->i_sb);
int bpc = ocfs2_clusters_to_blocks(inode->i_sb, 1);
+ int blks_per_bucket = ocfs2_blocks_per_xattr_bucket(inode->i_sb);
int num_buckets = ocfs2_xattr_buckets_per_cluster(osb);
- struct buffer_head *bh = NULL;
- struct ocfs2_xattr_header *xh;
- u64 to_blk_start = to_blk;
+ struct ocfs2_xattr_bucket *old_first, *new_first;

mlog(0, "cp xattrs from cluster %llu to %llu\n",
(unsigned long long)src_blk, (unsigned long long)to_blk);

+ /* The first bucket of the original extent */
+ old_first = ocfs2_xattr_bucket_new(inode);
+ /* The first bucket of the new extent */
+ new_first = ocfs2_xattr_bucket_new(inode);
+ if (!old_first || !new_first) {
+ ret = -ENOMEM;
+ mlog_errno(ret);
+ goto out;
+ }
+
+ ret = ocfs2_read_xattr_bucket(old_first, first_bh->b_blocknr);
+ if (ret) {
+ mlog_errno(ret);
+ goto out;
+ }
+
/*
- * We need to update the new cluster and 1 more for the update of
- * the 1st bucket of the previous extent rec.
+ * We need to update the first bucket of the old extent and the
+ * entire first cluster of the new extent.
*/
- credits = bpc + 1 + handle->h_buffer_credits;
+ credits = blks_per_bucket + bpc + handle->h_buffer_credits;
ret = ocfs2_extend_trans(handle, credits);
if (ret) {
mlog_errno(ret);
goto out;
}

- ret = ocfs2_journal_access(handle, inode, first_bh,
- OCFS2_JOURNAL_ACCESS_WRITE);
+ ret = ocfs2_xattr_bucket_journal_access(handle, old_first,
+ OCFS2_JOURNAL_ACCESS_WRITE);
if (ret) {
mlog_errno(ret);
goto out;
@@ -3978,45 +3994,45 @@ static int ocfs2_cp_xattr_cluster(struct inode *inode,

for (i = 0; i < num_buckets; i++) {
ret = ocfs2_cp_xattr_bucket(inode, handle,
- src_blk, to_blk, 1);
+ src_blk + (i * blks_per_bucket),
+ to_blk + (i * blks_per_bucket),
+ 1);
if (ret) {
mlog_errno(ret);
goto out;
}
-
- src_blk += ocfs2_blocks_per_xattr_bucket(inode->i_sb);
- to_blk += ocfs2_blocks_per_xattr_bucket(inode->i_sb);
}

- /* update the old bucket header. */
- xh = (struct ocfs2_xattr_header *)first_bh->b_data;
- le16_add_cpu(&xh->xh_num_buckets, -num_buckets);
-
- ocfs2_journal_dirty(handle, first_bh);
-
- /* update the new bucket header. */
- ret = ocfs2_read_block(inode, to_blk_start, &bh, NULL);
- if (ret < 0) {
+ /*
+ * Get the new bucket ready before we dirty anything
+ * (This actually shouldn't fail, because we already dirtied
+ * it once in ocfs2_cp_xattr_bucket()).
+ */
+ ret = ocfs2_read_xattr_bucket(new_first, to_blk);
+ if (ret) {
mlog_errno(ret);
goto out;
}
-
- ret = ocfs2_journal_access(handle, inode, bh,
- OCFS2_JOURNAL_ACCESS_WRITE);
+ ret = ocfs2_xattr_bucket_journal_access(handle, new_first,
+ OCFS2_JOURNAL_ACCESS_WRITE);
if (ret) {
mlog_errno(ret);
goto out;
}

- xh = (struct ocfs2_xattr_header *)bh->b_data;
- xh->xh_num_buckets = cpu_to_le16(num_buckets);
+ /* Now update the headers */
+ le16_add_cpu(&bucket_xh(old_first)->xh_num_buckets, -num_buckets);
+ ocfs2_xattr_bucket_journal_dirty(handle, old_first);

- ocfs2_journal_dirty(handle, bh);
+ bucket_xh(new_first)->xh_num_buckets = cpu_to_le16(num_buckets);
+ ocfs2_xattr_bucket_journal_dirty(handle, new_first);

if (first_hash)
- *first_hash = le32_to_cpu(xh->xh_entries[0].xe_name_hash);
+ *first_hash = le32_to_cpu(bucket_xh(new_first)->xh_entries[0].xe_name_hash);
+
out:
- brelse(bh);
+ ocfs2_xattr_bucket_free(new_first);
+ ocfs2_xattr_bucket_free(old_first);
return ret;
}

--
1.5.6

2008-12-22 22:05:49

by Mark Fasheh

[permalink] [raw]
Subject: [PATCH 46/56] ocfs2: Explain t_is_new in ocfs2_cp_xattr_cluster().

From: Joel Becker <[email protected]>

I was unsure of the JOURNAL_ACCESS parameters in
ocfs2_cp_xattr_cluster(). They're based on the function argument
't_is_new', but I couldn't quite figure out how t_is_new mapped to
allocation. ocfs2_cp_xattr_cluster() actually overwrites the target,
regardless of t_is_new.

Well, I just figured it out. So I'm adding a big fat comment for those
who come after me. ocfs2_divide_xattr_cluster() has the same behavior.

Signed-off-by: Joel Becker <[email protected]>
Signed-off-by: Mark Fasheh <[email protected]>
---
fs/ocfs2/xattr.c | 17 +++++++++++++++++
1 files changed, 17 insertions(+), 0 deletions(-)

diff --git a/fs/ocfs2/xattr.c b/fs/ocfs2/xattr.c
index 4dba347..5efcf4e 100644
--- a/fs/ocfs2/xattr.c
+++ b/fs/ocfs2/xattr.c
@@ -3747,6 +3747,11 @@ static int ocfs2_divide_xattr_bucket(struct inode *inode,
goto out;
}

+ /*
+ * Hey, if we're overwriting t_bucket, what difference does
+ * ACCESS_CREATE vs ACCESS_WRITE make? See the comment in the
+ * same part of ocfs2_cp_xattr_bucket().
+ */
ret = ocfs2_xattr_bucket_journal_access(handle, t_bucket,
new_bucket_head ?
OCFS2_JOURNAL_ACCESS_CREATE :
@@ -3918,6 +3923,18 @@ static int ocfs2_cp_xattr_bucket(struct inode *inode,
if (ret)
goto out;

+ /*
+ * Hey, if we're overwriting t_bucket, what difference does
+ * ACCESS_CREATE vs ACCESS_WRITE make? Well, if we allocated a new
+ * cluster to fill, we came here from ocfs2_cp_xattr_cluster(), and
+ * it is really new - ACCESS_CREATE is required. But we also
+ * might have moved data out of t_bucket before extending back
+ * into it. ocfs2_add_new_xattr_bucket() can do this - its call
+ * to ocfs2_add_new_xattr_cluster() may have created a new extent
+ * and copied out the end of the old extent. Then it re-extends
+ * the old extent back to create space for new xattrs. That's
+ * how we get here, and the bucket isn't really new.
+ */
ret = ocfs2_xattr_bucket_journal_access(handle, t_bucket,
t_is_new ?
OCFS2_JOURNAL_ACCESS_CREATE :
--
1.5.6

2008-12-22 22:06:12

by Mark Fasheh

[permalink] [raw]
Subject: [PATCH 47/56] ocfs2: Use ocfs2_cp_xattr_bucket() in ocfs2_mv_xattr_bucket_cross_cluster().

From: Joel Becker <[email protected]>

The buffer copy loop of ocfs2_mv_xattr_bucket_cross_cluster() actually
looks a lot like ocfs2_cp_xattr_bucket(). Let's just use that instead.
We also use bucket operations to update the buckets at the start of each
extent.

Signed-off-by: Joel Becker <[email protected]>
Signed-off-by: Mark Fasheh <[email protected]>
---
fs/ocfs2/xattr.c | 169 +++++++++++++++++++++++++++++++++---------------------
1 files changed, 104 insertions(+), 65 deletions(-)

diff --git a/fs/ocfs2/xattr.c b/fs/ocfs2/xattr.c
index 5efcf4e..5be9966 100644
--- a/fs/ocfs2/xattr.c
+++ b/fs/ocfs2/xattr.c
@@ -170,6 +170,11 @@ static int ocfs2_xattr_set_entry_index_block(struct inode *inode,

static int ocfs2_delete_xattr_index_block(struct inode *inode,
struct buffer_head *xb_bh);
+static int ocfs2_cp_xattr_bucket(struct inode *inode,
+ handle_t *handle,
+ u64 s_blkno,
+ u64 t_blkno,
+ int t_is_new);

static inline u16 ocfs2_xattr_buckets_per_cluster(struct ocfs2_super *osb)
{
@@ -3526,13 +3531,21 @@ out:
}

/*
- * Move half nums of the xattr bucket in the previous cluster to this new
- * cluster. We only touch the last cluster of the previous extend record.
+ * prev_blkno points to the start of an existing extent. new_blkno
+ * points to a newly allocated extent. Because we know each of our
+ * clusters contains more than bucket, we can easily split one cluster
+ * at a bucket boundary. So we take the last cluster of the existing
+ * extent and split it down the middle. We move the last half of the
+ * buckets in the last cluster of the existing extent over to the new
+ * extent.
+ *
+ * first_bh is the buffer at prev_blkno so we can update the existing
+ * extent's bucket count. header_bh is the bucket were we were hoping
+ * to insert our xattr. If the bucket move places the target in the new
+ * extent, we'll update first_bh and header_bh after modifying the old
+ * extent.
*
- * first_bh is the first buffer_head of a series of bucket in the same
- * extent rec and header_bh is the header of one bucket in this cluster.
- * They will be updated if we move the data header_bh contains to the new
- * cluster. first_hash will be set as the 1st xe's name_hash of the new cluster.
+ * first_hash will be set as the 1st xe's name_hash in the new extent.
*/
static int ocfs2_mv_xattr_bucket_cross_cluster(struct inode *inode,
handle_t *handle,
@@ -3545,105 +3558,131 @@ static int ocfs2_mv_xattr_bucket_cross_cluster(struct inode *inode,
{
int i, ret, credits;
struct ocfs2_super *osb = OCFS2_SB(inode->i_sb);
+ int blks_per_bucket = ocfs2_blocks_per_xattr_bucket(inode->i_sb);
int bpc = ocfs2_clusters_to_blocks(inode->i_sb, 1);
int num_buckets = ocfs2_xattr_buckets_per_cluster(osb);
- int blocksize = inode->i_sb->s_blocksize;
- struct buffer_head *old_bh, *new_bh, *prev_bh, *new_first_bh = NULL;
- struct ocfs2_xattr_header *new_xh;
+ int to_move = num_buckets / 2;
+ u64 last_cluster_blkno, src_blkno;
struct ocfs2_xattr_header *xh =
(struct ocfs2_xattr_header *)((*first_bh)->b_data);
+ struct ocfs2_xattr_bucket *old_first, *new_first;

BUG_ON(le16_to_cpu(xh->xh_num_buckets) < num_buckets);
BUG_ON(OCFS2_XATTR_BUCKET_SIZE == osb->s_clustersize);

- prev_bh = *first_bh;
- get_bh(prev_bh);
- xh = (struct ocfs2_xattr_header *)prev_bh->b_data;
-
- prev_blkno += (num_clusters - 1) * bpc + bpc / 2;
+ last_cluster_blkno = prev_blkno + ((num_clusters - 1) * bpc);
+ src_blkno = last_cluster_blkno + (to_move * blks_per_bucket);

mlog(0, "move half of xattrs in cluster %llu to %llu\n",
(unsigned long long)prev_blkno, (unsigned long long)new_blkno);

+ /* The first bucket of the original extent */
+ old_first = ocfs2_xattr_bucket_new(inode);
+ /* The first bucket of the new extent */
+ new_first = ocfs2_xattr_bucket_new(inode);
+ if (!old_first || !new_first) {
+ ret = -ENOMEM;
+ mlog_errno(ret);
+ goto out;
+ }
+
+ ret = ocfs2_read_xattr_bucket(old_first, prev_blkno);
+ if (ret) {
+ mlog_errno(ret);
+ goto out;
+ }
+
/*
- * We need to update the 1st half of the new cluster and
- * 1 more for the update of the 1st bucket of the previous
- * extent record.
+ * We need to update the 1st half of the new extent, and we
+ * need to update the first bucket of the old extent.
*/
- credits = bpc / 2 + 1 + handle->h_buffer_credits;
+ credits = ((to_move + 1) * blks_per_bucket) + handle->h_buffer_credits;
ret = ocfs2_extend_trans(handle, credits);
if (ret) {
mlog_errno(ret);
goto out;
}

- ret = ocfs2_journal_access(handle, inode, prev_bh,
- OCFS2_JOURNAL_ACCESS_WRITE);
+ ret = ocfs2_xattr_bucket_journal_access(handle, old_first,
+ OCFS2_JOURNAL_ACCESS_WRITE);
if (ret) {
mlog_errno(ret);
goto out;
}

- for (i = 0; i < bpc / 2; i++, prev_blkno++, new_blkno++) {
- old_bh = new_bh = NULL;
- new_bh = sb_getblk(inode->i_sb, new_blkno);
- if (!new_bh) {
- ret = -EIO;
+ for (i = 0; i < to_move; i++) {
+ ret = ocfs2_cp_xattr_bucket(inode, handle,
+ src_blkno + (i * blks_per_bucket),
+ new_blkno + (i * blks_per_bucket),
+ 1);
+ if (ret) {
mlog_errno(ret);
goto out;
}
+ }

- ocfs2_set_new_buffer_uptodate(inode, new_bh);
+ /*
+ * Get the new bucket ready before we dirty anything
+ * (This actually shouldn't fail, because we already dirtied
+ * it once in ocfs2_cp_xattr_bucket()).
+ */
+ ret = ocfs2_read_xattr_bucket(new_first, new_blkno);
+ if (ret) {
+ mlog_errno(ret);
+ goto out;
+ }
+ ret = ocfs2_xattr_bucket_journal_access(handle, new_first,
+ OCFS2_JOURNAL_ACCESS_WRITE);
+ if (ret) {
+ mlog_errno(ret);
+ goto out;
+ }

- ret = ocfs2_journal_access(handle, inode, new_bh,
- OCFS2_JOURNAL_ACCESS_CREATE);
- if (ret < 0) {
- mlog_errno(ret);
- brelse(new_bh);
- goto out;
- }
+ /* Now update the headers */
+ le16_add_cpu(&bucket_xh(old_first)->xh_num_buckets, -to_move);
+ ocfs2_xattr_bucket_journal_dirty(handle, old_first);

- ret = ocfs2_read_block(inode, prev_blkno, &old_bh, NULL);
- if (ret < 0) {
- mlog_errno(ret);
- brelse(new_bh);
- goto out;
- }
+ bucket_xh(new_first)->xh_num_buckets = cpu_to_le16(to_move);
+ ocfs2_xattr_bucket_journal_dirty(handle, new_first);

- memcpy(new_bh->b_data, old_bh->b_data, blocksize);
+ if (first_hash)
+ *first_hash = le32_to_cpu(bucket_xh(new_first)->xh_entries[0].xe_name_hash);

- if (i == 0) {
- new_xh = (struct ocfs2_xattr_header *)new_bh->b_data;
- new_xh->xh_num_buckets = cpu_to_le16(num_buckets / 2);
+ /*
+ * If the target bucket is anywhere past src_blkno, we moved
+ * it to the new extent. We need to update first_bh and header_bh.
+ */
+ if ((*header_bh)->b_blocknr >= src_blkno) {
+ /* We're done with old_first, so we can re-use it. */
+ ocfs2_xattr_bucket_relse(old_first);

- if (first_hash)
- *first_hash = le32_to_cpu(
- new_xh->xh_entries[0].xe_name_hash);
- new_first_bh = new_bh;
- get_bh(new_first_bh);
- }
+ /* Find the block for the new target bucket */
+ src_blkno = new_blkno +
+ ((*header_bh)->b_blocknr - src_blkno);

- ocfs2_journal_dirty(handle, new_bh);
+ /*
+ * This shouldn't fail - the buffers are in the
+ * journal from ocfs2_cp_xattr_bucket().
+ */
+ ret = ocfs2_read_xattr_bucket(old_first, src_blkno);
+ if (ret) {
+ mlog_errno(ret);
+ goto out;
+ }

- if (*header_bh == old_bh) {
- brelse(*header_bh);
- *header_bh = new_bh;
- get_bh(*header_bh);
+ brelse(*first_bh);
+ *first_bh = new_first->bu_bhs[0];
+ get_bh(*first_bh);

- brelse(*first_bh);
- *first_bh = new_first_bh;
- get_bh(*first_bh);
- }
- brelse(new_bh);
- brelse(old_bh);
+ brelse(*header_bh);
+ *header_bh = old_first->bu_bhs[0];
+ get_bh(*header_bh);
}

- le16_add_cpu(&xh->xh_num_buckets, -(num_buckets / 2));
-
- ocfs2_journal_dirty(handle, prev_bh);
out:
- brelse(prev_bh);
- brelse(new_first_bh);
+ ocfs2_xattr_bucket_free(new_first);
+ ocfs2_xattr_bucket_free(old_first);
+
return ret;
}

--
1.5.6

2008-12-22 22:06:33

by Mark Fasheh

[permalink] [raw]
Subject: [PATCH 48/56] ocfs2: Rename ocfs2_cp_xattr_cluster() to ocfs2_mv_xattr_buckets().

From: Joel Becker <[email protected]>

ocfs2_cp_xattr_cluster() takes the last cluster of an xattr extent,
copies its buckets to the front of a new extent, and then shrinks the bucket
count of the original extent. So it's really moving the data, not
copying it.

While we're here, the function doesn't need a buffer_head for the old
extent, just the block number.

Signed-off-by: Joel Becker <[email protected]>
Signed-off-by: Mark Fasheh <[email protected]>
---
fs/ocfs2/xattr.c | 42 ++++++++++++++++++++++--------------------
1 files changed, 22 insertions(+), 20 deletions(-)

diff --git a/fs/ocfs2/xattr.c b/fs/ocfs2/xattr.c
index 5be9966..c1f2e06 100644
--- a/fs/ocfs2/xattr.c
+++ b/fs/ocfs2/xattr.c
@@ -3965,11 +3965,12 @@ static int ocfs2_cp_xattr_bucket(struct inode *inode,
/*
* Hey, if we're overwriting t_bucket, what difference does
* ACCESS_CREATE vs ACCESS_WRITE make? Well, if we allocated a new
- * cluster to fill, we came here from ocfs2_cp_xattr_cluster(), and
- * it is really new - ACCESS_CREATE is required. But we also
- * might have moved data out of t_bucket before extending back
- * into it. ocfs2_add_new_xattr_bucket() can do this - its call
- * to ocfs2_add_new_xattr_cluster() may have created a new extent
+ * cluster to fill, we came here from
+ * ocfs2_mv_xattr_buckets(), and it is really new -
+ * ACCESS_CREATE is required. But we also might have moved data
+ * out of t_bucket before extending back into it.
+ * ocfs2_add_new_xattr_bucket() can do this - its call to
+ * ocfs2_add_new_xattr_cluster() may have created a new extent
* and copied out the end of the old extent. Then it re-extends
* the old extent back to create space for new xattrs. That's
* how we get here, and the bucket isn't really new.
@@ -3992,17 +3993,16 @@ out:
}

/*
- * src_blk points to the last cluster of an existing extent. to_blk
- * points to a newly allocated extent. We copy the cluster over to the
- * new extent, initializing its xh_num_buckets. The old extent's
- * xh_num_buckets shrinks by the same amount.
+ * src_blk points to the start of an existing extent. last_blk points to
+ * last cluster in that extent. to_blk points to a newly allocated
+ * extent. We copy the buckets from cluster at last_blk to the new extent,
+ * initializing its xh_num_buckets. The old extent's xh_num_buckets
+ * shrinks by the same amount.
*/
-static int ocfs2_cp_xattr_cluster(struct inode *inode,
+static int ocfs2_mv_xattr_buckets(struct inode *inode,
handle_t *handle,
- struct buffer_head *first_bh,
- u64 src_blk,
- u64 to_blk,
- u32 *first_hash)
+ u64 src_blk, u64 last_blk,
+ u64 to_blk, u32 *first_hash)
{
int i, ret, credits;
struct ocfs2_super *osb = OCFS2_SB(inode->i_sb);
@@ -4011,8 +4011,8 @@ static int ocfs2_cp_xattr_cluster(struct inode *inode,
int num_buckets = ocfs2_xattr_buckets_per_cluster(osb);
struct ocfs2_xattr_bucket *old_first, *new_first;

- mlog(0, "cp xattrs from cluster %llu to %llu\n",
- (unsigned long long)src_blk, (unsigned long long)to_blk);
+ mlog(0, "mv xattrs from cluster %llu to %llu\n",
+ (unsigned long long)last_blk, (unsigned long long)to_blk);

/* The first bucket of the original extent */
old_first = ocfs2_xattr_bucket_new(inode);
@@ -4024,7 +4024,7 @@ static int ocfs2_cp_xattr_cluster(struct inode *inode,
goto out;
}

- ret = ocfs2_read_xattr_bucket(old_first, first_bh->b_blocknr);
+ ret = ocfs2_read_xattr_bucket(old_first, src_blk);
if (ret) {
mlog_errno(ret);
goto out;
@@ -4050,7 +4050,7 @@ static int ocfs2_cp_xattr_cluster(struct inode *inode,

for (i = 0; i < num_buckets; i++) {
ret = ocfs2_cp_xattr_bucket(inode, handle,
- src_blk + (i * blks_per_bucket),
+ last_blk + (i * blks_per_bucket),
to_blk + (i * blks_per_bucket),
1);
if (ret) {
@@ -4175,8 +4175,10 @@ static int ocfs2_adjust_xattr_cross_cluster(struct inode *inode,
u64 last_blk = prev_blk + bpc * (prev_clusters - 1);

if (prev_clusters > 1 && (*header_bh)->b_blocknr != last_blk)
- ret = ocfs2_cp_xattr_cluster(inode, handle, *first_bh,
- last_blk, new_blk,
+ ret = ocfs2_mv_xattr_buckets(inode, handle,
+ (*first_bh)->b_blocknr,
+ last_blk,
+ new_blk,
v_start);
else {
ret = ocfs2_divide_xattr_cluster(inode, handle,
--
1.5.6

2008-12-22 22:06:50

by Mark Fasheh

[permalink] [raw]
Subject: [PATCH 49/56] ocfs2: ocfs2_mv_xattr_buckets() can handle a partial cluster now.

From: Joel Becker <[email protected]>

If you look at ocfs2_mv_xattr_bucket_cross_cluster(), you'll notice that
two-thirds of the code is almost identical to ocfs2_mv_xattr_buckets().
The only difference is that ocfs2_mv_xattr_buckets() moves a whole
cluster's worth, while ocfs2_mv_xattr_bucket_cross_cluster() moves half
the cluster.

We change ocfs2_mv_xattr_buckets() to allow moving partial clusters.
The original caller of ocfs2_mv_xattr_buckets() still moves the whole
cluster's worth - it just passes a start_bucket of 0.

Signed-off-by: Joel Becker <[email protected]>
Signed-off-by: Mark Fasheh <[email protected]>
---
fs/ocfs2/xattr.c | 33 ++++++++++++++++++++-------------
1 files changed, 20 insertions(+), 13 deletions(-)

diff --git a/fs/ocfs2/xattr.c b/fs/ocfs2/xattr.c
index c1f2e06..9734094 100644
--- a/fs/ocfs2/xattr.c
+++ b/fs/ocfs2/xattr.c
@@ -3995,18 +3995,19 @@ out:
/*
* src_blk points to the start of an existing extent. last_blk points to
* last cluster in that extent. to_blk points to a newly allocated
- * extent. We copy the buckets from cluster at last_blk to the new extent,
- * initializing its xh_num_buckets. The old extent's xh_num_buckets
- * shrinks by the same amount.
+ * extent. We copy the buckets from the cluster at last_blk to the new
+ * extent. If start_bucket is non-zero, we skip that many buckets before
+ * we start copying. The new extent's xh_num_buckets gets set to the
+ * number of buckets we copied. The old extent's xh_num_buckets shrinks
+ * by the same amount.
*/
-static int ocfs2_mv_xattr_buckets(struct inode *inode,
- handle_t *handle,
- u64 src_blk, u64 last_blk,
- u64 to_blk, u32 *first_hash)
+static int ocfs2_mv_xattr_buckets(struct inode *inode, handle_t *handle,
+ u64 src_blk, u64 last_blk, u64 to_blk,
+ unsigned int start_bucket,
+ u32 *first_hash)
{
int i, ret, credits;
struct ocfs2_super *osb = OCFS2_SB(inode->i_sb);
- int bpc = ocfs2_clusters_to_blocks(inode->i_sb, 1);
int blks_per_bucket = ocfs2_blocks_per_xattr_bucket(inode->i_sb);
int num_buckets = ocfs2_xattr_buckets_per_cluster(osb);
struct ocfs2_xattr_bucket *old_first, *new_first;
@@ -4014,6 +4015,12 @@ static int ocfs2_mv_xattr_buckets(struct inode *inode,
mlog(0, "mv xattrs from cluster %llu to %llu\n",
(unsigned long long)last_blk, (unsigned long long)to_blk);

+ BUG_ON(start_bucket >= num_buckets);
+ if (start_bucket) {
+ num_buckets -= start_bucket;
+ last_blk += (start_bucket * blks_per_bucket);
+ }
+
/* The first bucket of the original extent */
old_first = ocfs2_xattr_bucket_new(inode);
/* The first bucket of the new extent */
@@ -4031,10 +4038,11 @@ static int ocfs2_mv_xattr_buckets(struct inode *inode,
}

/*
- * We need to update the first bucket of the old extent and the
- * entire first cluster of the new extent.
+ * We need to update the first bucket of the old extent and all
+ * the buckets going to the new extent.
*/
- credits = blks_per_bucket + bpc + handle->h_buffer_credits;
+ credits = ((num_buckets + 1) * blks_per_bucket) +
+ handle->h_buffer_credits;
ret = ocfs2_extend_trans(handle, credits);
if (ret) {
mlog_errno(ret);
@@ -4177,8 +4185,7 @@ static int ocfs2_adjust_xattr_cross_cluster(struct inode *inode,
if (prev_clusters > 1 && (*header_bh)->b_blocknr != last_blk)
ret = ocfs2_mv_xattr_buckets(inode, handle,
(*first_bh)->b_blocknr,
- last_blk,
- new_blk,
+ last_blk, new_blk, 0,
v_start);
else {
ret = ocfs2_divide_xattr_cluster(inode, handle,
--
1.5.6

2008-12-22 22:07:12

by Mark Fasheh

[permalink] [raw]
Subject: [PATCH 50/56] ocfs2: Use ocfs2_mv_xattr_buckets() in ocfs2_mv_xattr_bucket_cross_cluster().

From: Joel Becker <[email protected]>

Now that ocfs2_mv_xattr_buckets() can move a partial cluster's worth of
buckets, ocfs2_mv_xattr_bucket_cross_cluster() can use it.

Signed-off-by: Joel Becker <[email protected]>
Signed-off-by: Mark Fasheh <[email protected]>
---
fs/ocfs2/xattr.c | 110 ++++++++++++++---------------------------------------
1 files changed, 29 insertions(+), 81 deletions(-)

diff --git a/fs/ocfs2/xattr.c b/fs/ocfs2/xattr.c
index 9734094..c318928 100644
--- a/fs/ocfs2/xattr.c
+++ b/fs/ocfs2/xattr.c
@@ -170,11 +170,10 @@ static int ocfs2_xattr_set_entry_index_block(struct inode *inode,

static int ocfs2_delete_xattr_index_block(struct inode *inode,
struct buffer_head *xb_bh);
-static int ocfs2_cp_xattr_bucket(struct inode *inode,
- handle_t *handle,
- u64 s_blkno,
- u64 t_blkno,
- int t_is_new);
+static int ocfs2_mv_xattr_buckets(struct inode *inode, handle_t *handle,
+ u64 src_blk, u64 last_blk, u64 to_blk,
+ unsigned int start_bucket,
+ u32 *first_hash);

static inline u16 ocfs2_xattr_buckets_per_cluster(struct ocfs2_super *osb)
{
@@ -3556,115 +3555,64 @@ static int ocfs2_mv_xattr_bucket_cross_cluster(struct inode *inode,
u32 num_clusters,
u32 *first_hash)
{
- int i, ret, credits;
+ int ret;
struct ocfs2_super *osb = OCFS2_SB(inode->i_sb);
int blks_per_bucket = ocfs2_blocks_per_xattr_bucket(inode->i_sb);
- int bpc = ocfs2_clusters_to_blocks(inode->i_sb, 1);
int num_buckets = ocfs2_xattr_buckets_per_cluster(osb);
int to_move = num_buckets / 2;
- u64 last_cluster_blkno, src_blkno;
+ u64 src_blkno;
+ u64 last_cluster_blkno = prev_blkno +
+ ((num_clusters - 1) * ocfs2_clusters_to_blocks(inode->i_sb, 1));
struct ocfs2_xattr_header *xh =
(struct ocfs2_xattr_header *)((*first_bh)->b_data);
- struct ocfs2_xattr_bucket *old_first, *new_first;
+ struct ocfs2_xattr_bucket *new_target, *new_first;

BUG_ON(le16_to_cpu(xh->xh_num_buckets) < num_buckets);
BUG_ON(OCFS2_XATTR_BUCKET_SIZE == osb->s_clustersize);

- last_cluster_blkno = prev_blkno + ((num_clusters - 1) * bpc);
- src_blkno = last_cluster_blkno + (to_move * blks_per_bucket);
-
mlog(0, "move half of xattrs in cluster %llu to %llu\n",
- (unsigned long long)prev_blkno, (unsigned long long)new_blkno);
+ (unsigned long long)last_cluster_blkno, (unsigned long long)new_blkno);

- /* The first bucket of the original extent */
- old_first = ocfs2_xattr_bucket_new(inode);
/* The first bucket of the new extent */
new_first = ocfs2_xattr_bucket_new(inode);
- if (!old_first || !new_first) {
+ /* The target bucket if it was moved to the new extent */
+ new_target = ocfs2_xattr_bucket_new(inode);
+ if (!new_target || !new_first) {
ret = -ENOMEM;
mlog_errno(ret);
goto out;
}

- ret = ocfs2_read_xattr_bucket(old_first, prev_blkno);
+ ret = ocfs2_mv_xattr_buckets(inode, handle, prev_blkno,
+ last_cluster_blkno, new_blkno,
+ to_move, first_hash);
if (ret) {
mlog_errno(ret);
goto out;
}

- /*
- * We need to update the 1st half of the new extent, and we
- * need to update the first bucket of the old extent.
- */
- credits = ((to_move + 1) * blks_per_bucket) + handle->h_buffer_credits;
- ret = ocfs2_extend_trans(handle, credits);
- if (ret) {
- mlog_errno(ret);
- goto out;
- }
-
- ret = ocfs2_xattr_bucket_journal_access(handle, old_first,
- OCFS2_JOURNAL_ACCESS_WRITE);
- if (ret) {
- mlog_errno(ret);
- goto out;
- }
-
- for (i = 0; i < to_move; i++) {
- ret = ocfs2_cp_xattr_bucket(inode, handle,
- src_blkno + (i * blks_per_bucket),
- new_blkno + (i * blks_per_bucket),
- 1);
- if (ret) {
- mlog_errno(ret);
- goto out;
- }
- }
-
- /*
- * Get the new bucket ready before we dirty anything
- * (This actually shouldn't fail, because we already dirtied
- * it once in ocfs2_cp_xattr_bucket()).
- */
- ret = ocfs2_read_xattr_bucket(new_first, new_blkno);
- if (ret) {
- mlog_errno(ret);
- goto out;
- }
- ret = ocfs2_xattr_bucket_journal_access(handle, new_first,
- OCFS2_JOURNAL_ACCESS_WRITE);
- if (ret) {
- mlog_errno(ret);
- goto out;
- }
-
- /* Now update the headers */
- le16_add_cpu(&bucket_xh(old_first)->xh_num_buckets, -to_move);
- ocfs2_xattr_bucket_journal_dirty(handle, old_first);
-
- bucket_xh(new_first)->xh_num_buckets = cpu_to_le16(to_move);
- ocfs2_xattr_bucket_journal_dirty(handle, new_first);
-
- if (first_hash)
- *first_hash = le32_to_cpu(bucket_xh(new_first)->xh_entries[0].xe_name_hash);
+ /* This is the first bucket that got moved */
+ src_blkno = last_cluster_blkno + (to_move * blks_per_bucket);

/*
- * If the target bucket is anywhere past src_blkno, we moved
- * it to the new extent. We need to update first_bh and header_bh.
+ * If the target bucket was part of the moved buckets, we need to
+ * update first_bh and header_bh.
*/
if ((*header_bh)->b_blocknr >= src_blkno) {
- /* We're done with old_first, so we can re-use it. */
- ocfs2_xattr_bucket_relse(old_first);
-
/* Find the block for the new target bucket */
src_blkno = new_blkno +
((*header_bh)->b_blocknr - src_blkno);

/*
- * This shouldn't fail - the buffers are in the
+ * These shouldn't fail - the buffers are in the
* journal from ocfs2_cp_xattr_bucket().
*/
- ret = ocfs2_read_xattr_bucket(old_first, src_blkno);
+ ret = ocfs2_read_xattr_bucket(new_first, new_blkno);
+ if (ret) {
+ mlog_errno(ret);
+ goto out;
+ }
+ ret = ocfs2_read_xattr_bucket(new_target, src_blkno);
if (ret) {
mlog_errno(ret);
goto out;
@@ -3675,13 +3623,13 @@ static int ocfs2_mv_xattr_bucket_cross_cluster(struct inode *inode,
get_bh(*first_bh);

brelse(*header_bh);
- *header_bh = old_first->bu_bhs[0];
+ *header_bh = new_target->bu_bhs[0];
get_bh(*header_bh);
}

out:
ocfs2_xattr_bucket_free(new_first);
- ocfs2_xattr_bucket_free(old_first);
+ ocfs2_xattr_bucket_free(new_target);

return ret;
}
--
1.5.6

2008-12-22 22:07:38

by Mark Fasheh

[permalink] [raw]
Subject: [PATCH 51/56] ocfs2: Start using buckets in ocfs2_adjust_xattr_cross_cluster().

From: Joel Becker <[email protected]>

We want to be passing around buckets instead of buffer_heads. Let's get
them into ocfs2_adjust_xattr_cross_cluster.

Signed-off-by: Joel Becker <[email protected]>
Signed-off-by: Mark Fasheh <[email protected]>
---
fs/ocfs2/xattr.c | 44 +++++++++++++++++++++++++++++++++++++-------
1 files changed, 37 insertions(+), 7 deletions(-)

diff --git a/fs/ocfs2/xattr.c b/fs/ocfs2/xattr.c
index c318928..975ba36 100644
--- a/fs/ocfs2/xattr.c
+++ b/fs/ocfs2/xattr.c
@@ -4111,28 +4111,54 @@ static int ocfs2_adjust_xattr_cross_cluster(struct inode *inode,
u32 *v_start,
int *extend)
{
- int ret = 0;
- int bpc = ocfs2_clusters_to_blocks(inode->i_sb, 1);
+ int ret;
+ struct ocfs2_xattr_bucket *first, *target;

mlog(0, "adjust xattrs from cluster %llu len %u to %llu\n",
(unsigned long long)prev_blk, prev_clusters,
(unsigned long long)new_blk);

+ /* The first bucket of the original extent */
+ first = ocfs2_xattr_bucket_new(inode);
+ /* The target bucket for insert */
+ target = ocfs2_xattr_bucket_new(inode);
+ if (!first || !target) {
+ ret = -ENOMEM;
+ mlog_errno(ret);
+ goto out;
+ }
+
+ BUG_ON(prev_blk != (*first_bh)->b_blocknr);
+ ret = ocfs2_read_xattr_bucket(first, prev_blk);
+ if (ret) {
+ mlog_errno(ret);
+ goto out;
+ }
+
+ ret = ocfs2_read_xattr_bucket(target, (*header_bh)->b_blocknr);
+ if (ret) {
+ mlog_errno(ret);
+ goto out;
+ }
+
if (ocfs2_xattr_buckets_per_cluster(OCFS2_SB(inode->i_sb)) > 1)
ret = ocfs2_mv_xattr_bucket_cross_cluster(inode,
handle,
first_bh,
header_bh,
new_blk,
- prev_blk,
+ bucket_blkno(first),
prev_clusters,
v_start);
else {
- u64 last_blk = prev_blk + bpc * (prev_clusters - 1);
+ /* The start of the last cluster in the first extent */
+ u64 last_blk = bucket_blkno(first) +
+ ((prev_clusters - 1) *
+ ocfs2_clusters_to_blocks(inode->i_sb, 1));

- if (prev_clusters > 1 && (*header_bh)->b_blocknr != last_blk)
+ if (prev_clusters > 1 && bucket_blkno(target) != last_blk)
ret = ocfs2_mv_xattr_buckets(inode, handle,
- (*first_bh)->b_blocknr,
+ bucket_blkno(first),
last_blk, new_blk, 0,
v_start);
else {
@@ -4140,11 +4166,15 @@ static int ocfs2_adjust_xattr_cross_cluster(struct inode *inode,
last_blk, new_blk,
v_start);

- if ((*header_bh)->b_blocknr == last_blk && extend)
+ if ((bucket_blkno(target) == last_blk) && extend)
*extend = 0;
}
}

+out:
+ ocfs2_xattr_bucket_free(first);
+ ocfs2_xattr_bucket_free(target);
+
return ret;
}

--
1.5.6

2008-12-22 22:07:56

by Mark Fasheh

[permalink] [raw]
Subject: [PATCH 52/56] ocfs2: Pass buckets into ocfs2_mv_xattr_bucket_cross_cluster().

From: Joel Becker <[email protected]>

Now that ocfs2_adjust_xattr_cross_cluster() has buckets, it can pass
them into ocfs2_mv_xattr_bucket_cross_cluster(). It no longer has to
care about buffer_heads. The manipulation of first_bh and header_bh
moves up to ocfs2_adjust_xattr_cross_cluster().

Signed-off-by: Joel Becker <[email protected]>
Signed-off-by: Mark Fasheh <[email protected]>
---
fs/ocfs2/xattr.c | 84 +++++++++++++++++++++++------------------------------
1 files changed, 37 insertions(+), 47 deletions(-)

diff --git a/fs/ocfs2/xattr.c b/fs/ocfs2/xattr.c
index 975ba36..2f16f50 100644
--- a/fs/ocfs2/xattr.c
+++ b/fs/ocfs2/xattr.c
@@ -3548,42 +3548,28 @@ out:
*/
static int ocfs2_mv_xattr_bucket_cross_cluster(struct inode *inode,
handle_t *handle,
- struct buffer_head **first_bh,
- struct buffer_head **header_bh,
+ struct ocfs2_xattr_bucket *first,
+ struct ocfs2_xattr_bucket *target,
u64 new_blkno,
- u64 prev_blkno,
u32 num_clusters,
u32 *first_hash)
{
int ret;
- struct ocfs2_super *osb = OCFS2_SB(inode->i_sb);
- int blks_per_bucket = ocfs2_blocks_per_xattr_bucket(inode->i_sb);
- int num_buckets = ocfs2_xattr_buckets_per_cluster(osb);
+ struct super_block *sb = inode->i_sb;
+ int blks_per_bucket = ocfs2_blocks_per_xattr_bucket(sb);
+ int num_buckets = ocfs2_xattr_buckets_per_cluster(OCFS2_SB(sb));
int to_move = num_buckets / 2;
u64 src_blkno;
- u64 last_cluster_blkno = prev_blkno +
- ((num_clusters - 1) * ocfs2_clusters_to_blocks(inode->i_sb, 1));
- struct ocfs2_xattr_header *xh =
- (struct ocfs2_xattr_header *)((*first_bh)->b_data);
- struct ocfs2_xattr_bucket *new_target, *new_first;
+ u64 last_cluster_blkno = bucket_blkno(first) +
+ ((num_clusters - 1) * ocfs2_clusters_to_blocks(sb, 1));

- BUG_ON(le16_to_cpu(xh->xh_num_buckets) < num_buckets);
- BUG_ON(OCFS2_XATTR_BUCKET_SIZE == osb->s_clustersize);
+ BUG_ON(le16_to_cpu(bucket_xh(first)->xh_num_buckets) < num_buckets);
+ BUG_ON(OCFS2_XATTR_BUCKET_SIZE == OCFS2_SB(sb)->s_clustersize);

mlog(0, "move half of xattrs in cluster %llu to %llu\n",
(unsigned long long)last_cluster_blkno, (unsigned long long)new_blkno);

- /* The first bucket of the new extent */
- new_first = ocfs2_xattr_bucket_new(inode);
- /* The target bucket if it was moved to the new extent */
- new_target = ocfs2_xattr_bucket_new(inode);
- if (!new_target || !new_first) {
- ret = -ENOMEM;
- mlog_errno(ret);
- goto out;
- }
-
- ret = ocfs2_mv_xattr_buckets(inode, handle, prev_blkno,
+ ret = ocfs2_mv_xattr_buckets(inode, handle, bucket_blkno(first),
last_cluster_blkno, new_blkno,
to_move, first_hash);
if (ret) {
@@ -3596,41 +3582,32 @@ static int ocfs2_mv_xattr_bucket_cross_cluster(struct inode *inode,

/*
* If the target bucket was part of the moved buckets, we need to
- * update first_bh and header_bh.
+ * update first and target.
*/
- if ((*header_bh)->b_blocknr >= src_blkno) {
+ if (bucket_blkno(target) >= src_blkno) {
/* Find the block for the new target bucket */
src_blkno = new_blkno +
- ((*header_bh)->b_blocknr - src_blkno);
+ (bucket_blkno(target) - src_blkno);
+
+ ocfs2_xattr_bucket_relse(first);
+ ocfs2_xattr_bucket_relse(target);

/*
* These shouldn't fail - the buffers are in the
* journal from ocfs2_cp_xattr_bucket().
*/
- ret = ocfs2_read_xattr_bucket(new_first, new_blkno);
+ ret = ocfs2_read_xattr_bucket(first, new_blkno);
if (ret) {
mlog_errno(ret);
goto out;
}
- ret = ocfs2_read_xattr_bucket(new_target, src_blkno);
- if (ret) {
+ ret = ocfs2_read_xattr_bucket(target, src_blkno);
+ if (ret)
mlog_errno(ret);
- goto out;
- }

- brelse(*first_bh);
- *first_bh = new_first->bu_bhs[0];
- get_bh(*first_bh);
-
- brelse(*header_bh);
- *header_bh = new_target->bu_bhs[0];
- get_bh(*header_bh);
}

out:
- ocfs2_xattr_bucket_free(new_first);
- ocfs2_xattr_bucket_free(new_target);
-
return ret;
}

@@ -4141,16 +4118,29 @@ static int ocfs2_adjust_xattr_cross_cluster(struct inode *inode,
goto out;
}

- if (ocfs2_xattr_buckets_per_cluster(OCFS2_SB(inode->i_sb)) > 1)
+ if (ocfs2_xattr_buckets_per_cluster(OCFS2_SB(inode->i_sb)) > 1) {
ret = ocfs2_mv_xattr_bucket_cross_cluster(inode,
handle,
- first_bh,
- header_bh,
+ first, target,
new_blk,
- bucket_blkno(first),
prev_clusters,
v_start);
- else {
+ if (ret) {
+ mlog_errno(ret);
+ goto out;
+ }
+
+ /* Did first+target get moved? */
+ if (prev_blk != bucket_blkno(first)) {
+ brelse(*first_bh);
+ *first_bh = first->bu_bhs[0];
+ get_bh(*first_bh);
+
+ brelse(*header_bh);
+ *header_bh = target->bu_bhs[0];
+ get_bh(*header_bh);
+ }
+ } else {
/* The start of the last cluster in the first extent */
u64 last_blk = bucket_blkno(first) +
((prev_clusters - 1) *
--
1.5.6

2008-12-22 22:08:23

by Mark Fasheh

[permalink] [raw]
Subject: [PATCH 53/56] ocfs2: Move buckets up into ocfs2_add_new_xattr_cluster().

From: Joel Becker <[email protected]>

Lift the buckets from ocfs2_adjust_xattr_cross_cluster() up into
ocfs2_add_new_xattr_cluster(). Now ocfs2_adjust_xattr_cross_cluster()
doesn't deal with buffer_heads.

Signed-off-by: Joel Becker <[email protected]>
Signed-off-by: Mark Fasheh <[email protected]>
---
fs/ocfs2/xattr.c | 100 ++++++++++++++++++++++++++---------------------------
1 files changed, 49 insertions(+), 51 deletions(-)

diff --git a/fs/ocfs2/xattr.c b/fs/ocfs2/xattr.c
index 2f16f50..4b24704 100644
--- a/fs/ocfs2/xattr.c
+++ b/fs/ocfs2/xattr.c
@@ -4080,44 +4080,19 @@ static int ocfs2_divide_xattr_cluster(struct inode *inode,
*/
static int ocfs2_adjust_xattr_cross_cluster(struct inode *inode,
handle_t *handle,
- struct buffer_head **first_bh,
- struct buffer_head **header_bh,
+ struct ocfs2_xattr_bucket *first,
+ struct ocfs2_xattr_bucket *target,
u64 new_blk,
- u64 prev_blk,
u32 prev_clusters,
u32 *v_start,
int *extend)
{
int ret;
- struct ocfs2_xattr_bucket *first, *target;

mlog(0, "adjust xattrs from cluster %llu len %u to %llu\n",
- (unsigned long long)prev_blk, prev_clusters,
+ (unsigned long long)bucket_blkno(first), prev_clusters,
(unsigned long long)new_blk);

- /* The first bucket of the original extent */
- first = ocfs2_xattr_bucket_new(inode);
- /* The target bucket for insert */
- target = ocfs2_xattr_bucket_new(inode);
- if (!first || !target) {
- ret = -ENOMEM;
- mlog_errno(ret);
- goto out;
- }
-
- BUG_ON(prev_blk != (*first_bh)->b_blocknr);
- ret = ocfs2_read_xattr_bucket(first, prev_blk);
- if (ret) {
- mlog_errno(ret);
- goto out;
- }
-
- ret = ocfs2_read_xattr_bucket(target, (*header_bh)->b_blocknr);
- if (ret) {
- mlog_errno(ret);
- goto out;
- }
-
if (ocfs2_xattr_buckets_per_cluster(OCFS2_SB(inode->i_sb)) > 1) {
ret = ocfs2_mv_xattr_bucket_cross_cluster(inode,
handle,
@@ -4125,46 +4100,33 @@ static int ocfs2_adjust_xattr_cross_cluster(struct inode *inode,
new_blk,
prev_clusters,
v_start);
- if (ret) {
+ if (ret)
mlog_errno(ret);
- goto out;
- }
-
- /* Did first+target get moved? */
- if (prev_blk != bucket_blkno(first)) {
- brelse(*first_bh);
- *first_bh = first->bu_bhs[0];
- get_bh(*first_bh);
-
- brelse(*header_bh);
- *header_bh = target->bu_bhs[0];
- get_bh(*header_bh);
- }
} else {
/* The start of the last cluster in the first extent */
u64 last_blk = bucket_blkno(first) +
((prev_clusters - 1) *
ocfs2_clusters_to_blocks(inode->i_sb, 1));

- if (prev_clusters > 1 && bucket_blkno(target) != last_blk)
+ if (prev_clusters > 1 && bucket_blkno(target) != last_blk) {
ret = ocfs2_mv_xattr_buckets(inode, handle,
bucket_blkno(first),
last_blk, new_blk, 0,
v_start);
- else {
+ if (ret)
+ mlog_errno(ret);
+ } else {
ret = ocfs2_divide_xattr_cluster(inode, handle,
last_blk, new_blk,
v_start);
+ if (ret)
+ mlog_errno(ret);

if ((bucket_blkno(target) == last_blk) && extend)
*extend = 0;
}
}

-out:
- ocfs2_xattr_bucket_free(first);
- ocfs2_xattr_bucket_free(target);
-
return ret;
}

@@ -4202,6 +4164,7 @@ static int ocfs2_add_new_xattr_cluster(struct inode *inode,
handle_t *handle = ctxt->handle;
struct ocfs2_super *osb = OCFS2_SB(inode->i_sb);
struct ocfs2_extent_tree et;
+ struct ocfs2_xattr_bucket *first, *target;

mlog(0, "Add new xattr cluster for %llu, previous xattr hash = %u, "
"previous xattr blkno = %llu\n",
@@ -4210,6 +4173,29 @@ static int ocfs2_add_new_xattr_cluster(struct inode *inode,

ocfs2_init_xattr_tree_extent_tree(&et, inode, root_bh);

+ /* The first bucket of the original extent */
+ first = ocfs2_xattr_bucket_new(inode);
+ /* The target bucket for insert */
+ target = ocfs2_xattr_bucket_new(inode);
+ if (!first || !target) {
+ ret = -ENOMEM;
+ mlog_errno(ret);
+ goto leave;
+ }
+
+ BUG_ON(prev_blkno != (*first_bh)->b_blocknr);
+ ret = ocfs2_read_xattr_bucket(first, prev_blkno);
+ if (ret) {
+ mlog_errno(ret);
+ goto leave;
+ }
+
+ ret = ocfs2_read_xattr_bucket(target, (*header_bh)->b_blocknr);
+ if (ret) {
+ mlog_errno(ret);
+ goto leave;
+ }
+
ret = ocfs2_journal_access(handle, inode, root_bh,
OCFS2_JOURNAL_ACCESS_WRITE);
if (ret < 0) {
@@ -4250,10 +4236,9 @@ static int ocfs2_add_new_xattr_cluster(struct inode *inode,
} else {
ret = ocfs2_adjust_xattr_cross_cluster(inode,
handle,
- first_bh,
- header_bh,
+ first,
+ target,
block,
- prev_blkno,
prev_clusters,
&v_start,
extend);
@@ -4261,6 +4246,17 @@ static int ocfs2_add_new_xattr_cluster(struct inode *inode,
mlog_errno(ret);
goto leave;
}
+
+ /* Did first+target get moved? */
+ if (prev_blkno != bucket_blkno(first)) {
+ brelse(*first_bh);
+ *first_bh = first->bu_bhs[0];
+ get_bh(*first_bh);
+
+ brelse(*header_bh);
+ *header_bh = target->bu_bhs[0];
+ get_bh(*header_bh);
+ }
}

mlog(0, "Insert %u clusters at block %llu for xattr at %u\n",
@@ -4277,6 +4273,8 @@ static int ocfs2_add_new_xattr_cluster(struct inode *inode,
mlog_errno(ret);

leave:
+ ocfs2_xattr_bucket_free(first);
+ ocfs2_xattr_bucket_free(target);
return ret;
}

--
1.5.6

2008-12-22 22:08:41

by Mark Fasheh

[permalink] [raw]
Subject: [PATCH 54/56] ocfs2: Move buckets up into ocfs2_add_new_xattr_bucket().

From: Joel Becker <[email protected]>

Lift the buckets from ocfs2_add_new_xattr_cluster() up into
ocfs2_add_new_xattr_bucket(). Now ocfs2_add_new_xattr_cluster()
doesn't deal with buffer_heads. In fact, we no longer have to play
get_bh() tricks at all.

Signed-off-by: Joel Becker <[email protected]>
Signed-off-by: Mark Fasheh <[email protected]>
---
fs/ocfs2/xattr.c | 105 ++++++++++++++++-------------------------------------
1 files changed, 32 insertions(+), 73 deletions(-)

diff --git a/fs/ocfs2/xattr.c b/fs/ocfs2/xattr.c
index 4b24704..5a5a1bd 100644
--- a/fs/ocfs2/xattr.c
+++ b/fs/ocfs2/xattr.c
@@ -4148,11 +4148,10 @@ static int ocfs2_adjust_xattr_cross_cluster(struct inode *inode,
*/
static int ocfs2_add_new_xattr_cluster(struct inode *inode,
struct buffer_head *root_bh,
- struct buffer_head **first_bh,
- struct buffer_head **header_bh,
+ struct ocfs2_xattr_bucket *first,
+ struct ocfs2_xattr_bucket *target,
u32 *num_clusters,
u32 prev_cpos,
- u64 prev_blkno,
int *extend,
struct ocfs2_xattr_set_ctxt *ctxt)
{
@@ -4164,38 +4163,14 @@ static int ocfs2_add_new_xattr_cluster(struct inode *inode,
handle_t *handle = ctxt->handle;
struct ocfs2_super *osb = OCFS2_SB(inode->i_sb);
struct ocfs2_extent_tree et;
- struct ocfs2_xattr_bucket *first, *target;

mlog(0, "Add new xattr cluster for %llu, previous xattr hash = %u, "
"previous xattr blkno = %llu\n",
(unsigned long long)OCFS2_I(inode)->ip_blkno,
- prev_cpos, (unsigned long long)prev_blkno);
+ prev_cpos, (unsigned long long)bucket_blkno(first));

ocfs2_init_xattr_tree_extent_tree(&et, inode, root_bh);

- /* The first bucket of the original extent */
- first = ocfs2_xattr_bucket_new(inode);
- /* The target bucket for insert */
- target = ocfs2_xattr_bucket_new(inode);
- if (!first || !target) {
- ret = -ENOMEM;
- mlog_errno(ret);
- goto leave;
- }
-
- BUG_ON(prev_blkno != (*first_bh)->b_blocknr);
- ret = ocfs2_read_xattr_bucket(first, prev_blkno);
- if (ret) {
- mlog_errno(ret);
- goto leave;
- }
-
- ret = ocfs2_read_xattr_bucket(target, (*header_bh)->b_blocknr);
- if (ret) {
- mlog_errno(ret);
- goto leave;
- }
-
ret = ocfs2_journal_access(handle, inode, root_bh,
OCFS2_JOURNAL_ACCESS_WRITE);
if (ret < 0) {
@@ -4217,7 +4192,7 @@ static int ocfs2_add_new_xattr_cluster(struct inode *inode,
mlog(0, "Allocating %u clusters at block %u for xattr in inode %llu\n",
num_bits, bit_off, (unsigned long long)OCFS2_I(inode)->ip_blkno);

- if (prev_blkno + prev_clusters * bpc == block &&
+ if (bucket_blkno(first) + (prev_clusters * bpc) == block &&
(prev_clusters + num_bits) << osb->s_clustersize_bits <=
OCFS2_MAX_XATTR_TREE_LEAF_SIZE) {
/*
@@ -4246,17 +4221,6 @@ static int ocfs2_add_new_xattr_cluster(struct inode *inode,
mlog_errno(ret);
goto leave;
}
-
- /* Did first+target get moved? */
- if (prev_blkno != bucket_blkno(first)) {
- brelse(*first_bh);
- *first_bh = first->bu_bhs[0];
- get_bh(*first_bh);
-
- brelse(*header_bh);
- *header_bh = target->bu_bhs[0];
- get_bh(*header_bh);
- }
}

mlog(0, "Insert %u clusters at block %llu for xattr at %u\n",
@@ -4273,8 +4237,6 @@ static int ocfs2_add_new_xattr_cluster(struct inode *inode,
mlog_errno(ret);

leave:
- ocfs2_xattr_bucket_free(first);
- ocfs2_xattr_bucket_free(target);
return ret;
}

@@ -4357,16 +4319,16 @@ out:
* We will move all the buckets starting from header_bh to the next place. As
* for this one, half num of its xattrs will be moved to the next one.
*
- * We will allocate a new cluster if current cluster is full and adjust
- * header_bh and first_bh if the insert place is moved to the new cluster.
+ * We will allocate a new cluster if current cluster is full. The
+ * underlying calls will make sure that there is space at the target
+ * bucket, shifting buckets around if necessary. 'target' may be updated
+ * by those calls.
*/
static int ocfs2_add_new_xattr_bucket(struct inode *inode,
struct buffer_head *xb_bh,
struct buffer_head *header_bh,
struct ocfs2_xattr_set_ctxt *ctxt)
{
- struct ocfs2_xattr_header *first_xh = NULL;
- struct buffer_head *first_bh = NULL;
struct ocfs2_xattr_block *xb =
(struct ocfs2_xattr_block *)xb_bh->b_data;
struct ocfs2_xattr_tree_root *xb_root = &xb->xb_attrs.xb_root;
@@ -4374,31 +4336,26 @@ static int ocfs2_add_new_xattr_bucket(struct inode *inode,
struct ocfs2_xattr_header *xh =
(struct ocfs2_xattr_header *)header_bh->b_data;
u32 name_hash = le32_to_cpu(xh->xh_entries[0].xe_name_hash);
- struct super_block *sb = inode->i_sb;
- struct ocfs2_super *osb = OCFS2_SB(sb);
+ struct ocfs2_super *osb = OCFS2_SB(inode->i_sb);
int ret, num_buckets, extend = 1;
u64 p_blkno;
u32 e_cpos, num_clusters;
/* The bucket at the front of the extent */
- struct ocfs2_xattr_bucket *first;
+ struct ocfs2_xattr_bucket *first, *target;

mlog(0, "Add new xattr bucket starting form %llu\n",
(unsigned long long)header_bh->b_blocknr);

+ /* The first bucket of the original extent */
first = ocfs2_xattr_bucket_new(inode);
- if (!first) {
+ /* The target bucket for insert */
+ target = ocfs2_xattr_bucket_new(inode);
+ if (!first || !target) {
ret = -ENOMEM;
mlog_errno(ret);
goto out;
}

- /*
- * Add refrence for header_bh here because it may be
- * changed in ocfs2_add_new_xattr_cluster and we need
- * to free it in the end.
- */
- get_bh(header_bh);
-
ret = ocfs2_xattr_get_rec(inode, name_hash, &p_blkno, &e_cpos,
&num_clusters, el);
if (ret) {
@@ -4406,23 +4363,30 @@ static int ocfs2_add_new_xattr_bucket(struct inode *inode,
goto out;
}

- ret = ocfs2_read_block(inode, p_blkno, &first_bh, NULL);
+ ret = ocfs2_read_xattr_bucket(first, p_blkno);
if (ret) {
mlog_errno(ret);
goto out;
}

- num_buckets = ocfs2_xattr_buckets_per_cluster(osb) * num_clusters;
- first_xh = (struct ocfs2_xattr_header *)first_bh->b_data;
+ ret = ocfs2_read_xattr_bucket(target, header_bh->b_blocknr);
+ if (ret) {
+ mlog_errno(ret);
+ goto out;
+ }

- if (num_buckets == le16_to_cpu(first_xh->xh_num_buckets)) {
+ num_buckets = ocfs2_xattr_buckets_per_cluster(osb) * num_clusters;
+ if (num_buckets == le16_to_cpu(bucket_xh(first)->xh_num_buckets)) {
+ /*
+ * This can move first+target if the target bucket moves
+ * to the new extent.
+ */
ret = ocfs2_add_new_xattr_cluster(inode,
xb_bh,
- &first_bh,
- &header_bh,
+ first,
+ target,
&num_clusters,
e_cpos,
- p_blkno,
&extend,
ctxt);
if (ret) {
@@ -4432,24 +4396,19 @@ static int ocfs2_add_new_xattr_bucket(struct inode *inode,
}

if (extend) {
- /* These bucket reads should be cached */
- ret = ocfs2_read_xattr_bucket(first, first_bh->b_blocknr);
- if (ret) {
- mlog_errno(ret);
- goto out;
- }
ret = ocfs2_extend_xattr_bucket(inode,
ctxt->handle,
- first, header_bh->b_blocknr,
+ first,
+ bucket_blkno(target),
num_clusters);
if (ret)
mlog_errno(ret);
}

out:
- brelse(first_bh);
- brelse(header_bh);
ocfs2_xattr_bucket_free(first);
+ ocfs2_xattr_bucket_free(target);
+
return ret;
}

--
1.5.6

2008-12-22 22:08:58

by Mark Fasheh

[permalink] [raw]
Subject: [PATCH 55/56] ocfs2: Pass xs->bucket into ocfs2_add_new_xattr_bucket().

From: Joel Becker <[email protected]>

Pass the actual target bucket for insert through to
ocfs2_add_new_xattr_bucket(). Now growing a bucket has no buffer_head
knowledge.

ocfs2_add_new_xattr_bucket() leaves xs->bucket in the proper state for
insert. However, it doesn't update the rest of the search fields in xs,
so we still have to relse() and re-find. That's OK, because everything
is cached.

Signed-off-by: Joel Becker <[email protected]>
Signed-off-by: Mark Fasheh <[email protected]>
---
fs/ocfs2/xattr.c | 52 +++++++++++++++++++++++++---------------------------
1 files changed, 25 insertions(+), 27 deletions(-)

diff --git a/fs/ocfs2/xattr.c b/fs/ocfs2/xattr.c
index 5a5a1bd..dfc51c3 100644
--- a/fs/ocfs2/xattr.c
+++ b/fs/ocfs2/xattr.c
@@ -4314,43 +4314,42 @@ out:
}

/*
- * Add new xattr bucket in an extent record and adjust the buckets accordingly.
- * xb_bh is the ocfs2_xattr_block.
- * We will move all the buckets starting from header_bh to the next place. As
- * for this one, half num of its xattrs will be moved to the next one.
+ * Add new xattr bucket in an extent record and adjust the buckets
+ * accordingly. xb_bh is the ocfs2_xattr_block, and target is the
+ * bucket we want to insert into.
*
- * We will allocate a new cluster if current cluster is full. The
- * underlying calls will make sure that there is space at the target
- * bucket, shifting buckets around if necessary. 'target' may be updated
- * by those calls.
+ * In the easy case, we will move all the buckets after target down by
+ * one. Half of target's xattrs will be moved to the next bucket.
+ *
+ * If current cluster is full, we'll allocate a new one. This may not
+ * be contiguous. The underlying calls will make sure that there is
+ * space for the insert, shifting buckets around if necessary.
+ * 'target' may be moved by those calls.
*/
static int ocfs2_add_new_xattr_bucket(struct inode *inode,
struct buffer_head *xb_bh,
- struct buffer_head *header_bh,
+ struct ocfs2_xattr_bucket *target,
struct ocfs2_xattr_set_ctxt *ctxt)
{
struct ocfs2_xattr_block *xb =
(struct ocfs2_xattr_block *)xb_bh->b_data;
struct ocfs2_xattr_tree_root *xb_root = &xb->xb_attrs.xb_root;
struct ocfs2_extent_list *el = &xb_root->xt_list;
- struct ocfs2_xattr_header *xh =
- (struct ocfs2_xattr_header *)header_bh->b_data;
- u32 name_hash = le32_to_cpu(xh->xh_entries[0].xe_name_hash);
+ u32 name_hash =
+ le32_to_cpu(bucket_xh(target)->xh_entries[0].xe_name_hash);
struct ocfs2_super *osb = OCFS2_SB(inode->i_sb);
int ret, num_buckets, extend = 1;
u64 p_blkno;
u32 e_cpos, num_clusters;
/* The bucket at the front of the extent */
- struct ocfs2_xattr_bucket *first, *target;
+ struct ocfs2_xattr_bucket *first;

- mlog(0, "Add new xattr bucket starting form %llu\n",
- (unsigned long long)header_bh->b_blocknr);
+ mlog(0, "Add new xattr bucket starting from %llu\n",
+ (unsigned long long)bucket_blkno(target));

/* The first bucket of the original extent */
first = ocfs2_xattr_bucket_new(inode);
- /* The target bucket for insert */
- target = ocfs2_xattr_bucket_new(inode);
- if (!first || !target) {
+ if (!first) {
ret = -ENOMEM;
mlog_errno(ret);
goto out;
@@ -4369,12 +4368,6 @@ static int ocfs2_add_new_xattr_bucket(struct inode *inode,
goto out;
}

- ret = ocfs2_read_xattr_bucket(target, header_bh->b_blocknr);
- if (ret) {
- mlog_errno(ret);
- goto out;
- }
-
num_buckets = ocfs2_xattr_buckets_per_cluster(osb) * num_clusters;
if (num_buckets == le16_to_cpu(bucket_xh(first)->xh_num_buckets)) {
/*
@@ -4407,7 +4400,6 @@ static int ocfs2_add_new_xattr_bucket(struct inode *inode,

out:
ocfs2_xattr_bucket_free(first);
- ocfs2_xattr_bucket_free(target);

return ret;
}
@@ -5083,15 +5075,21 @@ try_again:

ret = ocfs2_add_new_xattr_bucket(inode,
xs->xattr_bh,
- xs->bucket->bu_bhs[0],
+ xs->bucket,
ctxt);
if (ret) {
mlog_errno(ret);
goto out;
}

+ /*
+ * ocfs2_add_new_xattr_bucket() will have updated
+ * xs->bucket if it moved, but it will not have updated
+ * any of the other search fields. Thus, we drop it and
+ * re-search. Everything should be cached, so it'll be
+ * quick.
+ */
ocfs2_xattr_bucket_relse(xs->bucket);
-
ret = ocfs2_xattr_index_block_find(inode, xs->xattr_bh,
xi->name_index,
xi->name, xs);
--
1.5.6

2008-12-22 22:09:26

by Mark Fasheh

[permalink] [raw]
Subject: [PATCH 56/56] ocfs2/quota: Add QUOTA in mlog_attribute.

From: Tao Ma <[email protected]>

A new mlog mask has to be added to the mlog_attribute array before it
can actually be used in mlog. ML_QUOTA was only added in masklog.h, so
add it to the array to enable it.

Signed-off-by: Tao Ma <[email protected]>
Signed-off-by: Mark Fasheh <[email protected]>
---
fs/ocfs2/cluster/masklog.c | 1 +
1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/fs/ocfs2/cluster/masklog.c b/fs/ocfs2/cluster/masklog.c
index d8a0cb9..96df541 100644
--- a/fs/ocfs2/cluster/masklog.c
+++ b/fs/ocfs2/cluster/masklog.c
@@ -110,6 +110,7 @@ static struct mlog_attribute mlog_attrs[MLOG_MAX_BITS] = {
define_mask(QUORUM),
define_mask(EXPORT),
define_mask(XATTR),
+ define_mask(QUOTA),
define_mask(ERROR),
define_mask(NOTICE),
define_mask(KTHREAD),
--
1.5.6

2008-12-23 00:01:35

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH 19/56] mm: Export pdflush_operation()

On Mon, 22 Dec 2008 13:48:00 -0800
Mark Fasheh <[email protected]> wrote:

> OCSF2 will need to queue up work for periodic syncing of quotas
> among nodes in the cluster. pdflush() is good thread for this so
> export it's controlling function so that OCFS2 can use it.

I trust that nothing will explode if pdflush_operation() fails
to do anything and returns -1?

2008-12-23 00:12:16

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH 23/56] ocfs2: Implementation of local and global quota file handling

On Mon, 22 Dec 2008 13:48:04 -0800
Mark Fasheh <[email protected]> wrote:

> From: Jan Kara <[email protected]>
>
> For each quota type each node has local quota file. In this file it stores
> changes users have made to disk usage via this node. Once in a while this
> information is synced to global file (and thus with other nodes) so that
> limits enforcement at least aproximately works.
>
> Global quota files contain all the information about usage and limits. It's
> mostly handled by the generic VFS code (which implements a trie of structures
> inside a quota file). We only have to provide functions to convert structures
> from on-disk format to in-memory one. We also have to provide wrappers for
> various quota functions starting transactions and acquiring necessary cluster
> locks before the actual IO is really started.
>
> +static void ocfs2_set_qinfo_lvb(struct ocfs2_lock_res *lockres)
> +{
> + struct ocfs2_qinfo_lvb *lvb;
> + struct ocfs2_mem_dqinfo *oinfo = ocfs2_lock_res_qinfo(lockres);
> + struct mem_dqinfo *info = sb_dqinfo(oinfo->dqi_gi.dqi_sb,
> + oinfo->dqi_gi.dqi_type);
> +
> + mlog_entry_void();
> +
> + lvb = (struct ocfs2_qinfo_lvb *)ocfs2_dlm_lvb(&lockres->l_lksb);

Unneeded cast.

> + lvb->lvb_version = OCFS2_QINFO_LVB_VERSION;
> + lvb->lvb_bgrace = cpu_to_be32(info->dqi_bgrace);
> + lvb->lvb_igrace = cpu_to_be32(info->dqi_igrace);
> + lvb->lvb_syncms = cpu_to_be32(oinfo->dqi_syncms);
> + lvb->lvb_blocks = cpu_to_be32(oinfo->dqi_gi.dqi_blocks);
> + lvb->lvb_free_blk = cpu_to_be32(oinfo->dqi_gi.dqi_free_blk);
> + lvb->lvb_free_entry = cpu_to_be32(oinfo->dqi_gi.dqi_free_entry);
> +
> + mlog_exit_void();
> +}
> +
> +void ocfs2_qinfo_unlock(struct ocfs2_mem_dqinfo *oinfo, int ex)
> +{
> + struct ocfs2_lock_res *lockres = &oinfo->dqi_gqlock;
> + struct ocfs2_super *osb = OCFS2_SB(oinfo->dqi_gi.dqi_sb);
> + int level = ex ? DLM_LOCK_EX : DLM_LOCK_PR;
> +
> + mlog_entry_void();
> + if (!ocfs2_is_hard_readonly(osb) && !ocfs2_mount_local(osb))
> + ocfs2_cluster_unlock(osb, lockres, level);
> + mlog_exit_void();
> +}
> +
> +static int ocfs2_refresh_qinfo(struct ocfs2_mem_dqinfo *oinfo)
> +{
> + struct mem_dqinfo *info = sb_dqinfo(oinfo->dqi_gi.dqi_sb,
> + oinfo->dqi_gi.dqi_type);
> + struct ocfs2_lock_res *lockres = &oinfo->dqi_gqlock;
> + struct ocfs2_qinfo_lvb *lvb = ocfs2_dlm_lvb(&lockres->l_lksb);

yeah, like that ;)

> + struct buffer_head *bh;
> + struct ocfs2_global_disk_dqinfo *gdinfo;
> + int status = 0;
> +
> + if (lvb->lvb_version == OCFS2_QINFO_LVB_VERSION) {
> + info->dqi_bgrace = be32_to_cpu(lvb->lvb_bgrace);
> + info->dqi_igrace = be32_to_cpu(lvb->lvb_igrace);
> + oinfo->dqi_syncms = be32_to_cpu(lvb->lvb_syncms);
> + oinfo->dqi_gi.dqi_blocks = be32_to_cpu(lvb->lvb_blocks);
> + oinfo->dqi_gi.dqi_free_blk = be32_to_cpu(lvb->lvb_free_blk);
> + oinfo->dqi_gi.dqi_free_entry =
> + be32_to_cpu(lvb->lvb_free_entry);
> + } else {
> + bh = ocfs2_read_quota_block(oinfo->dqi_gqinode, 0, &status);
> + if (!bh) {
> + mlog_errno(status);
> + goto bail;
> + }
> + gdinfo = (struct ocfs2_global_disk_dqinfo *)
> + (bh->b_data + OCFS2_GLOBAL_INFO_OFF);
> + info->dqi_bgrace = le32_to_cpu(gdinfo->dqi_bgrace);
> + info->dqi_igrace = le32_to_cpu(gdinfo->dqi_igrace);
> + oinfo->dqi_syncms = le32_to_cpu(gdinfo->dqi_syncms);
> + oinfo->dqi_gi.dqi_blocks = le32_to_cpu(gdinfo->dqi_blocks);
> + oinfo->dqi_gi.dqi_free_blk = le32_to_cpu(gdinfo->dqi_free_blk);
> + oinfo->dqi_gi.dqi_free_entry =
> + le32_to_cpu(gdinfo->dqi_free_entry);
> + brelse(bh);

put_bh() is more efficient and modern, in the case where bh is known to
not be NULL.

> + ocfs2_track_lock_refresh(lockres);
> + }
> +
> +bail:
> + return status;
> +}
> +
> +/* Lock quota info, this function expects at least shared lock on the quota file
> + * so that we can safely refresh quota info from disk. */
> +int ocfs2_qinfo_lock(struct ocfs2_mem_dqinfo *oinfo, int ex)
> +{
> + struct ocfs2_lock_res *lockres = &oinfo->dqi_gqlock;
> + struct ocfs2_super *osb = OCFS2_SB(oinfo->dqi_gi.dqi_sb);
> + int level = ex ? DLM_LOCK_EX : DLM_LOCK_PR;
> + int status = 0;
> +
> + mlog_entry_void();
> +
> + /* On RO devices, locking really isn't needed... */
> + if (ocfs2_is_hard_readonly(osb)) {
> + if (ex)
> + status = -EROFS;
> + goto bail;
> + }
> + if (ocfs2_mount_local(osb))
> + goto bail;

This is not an error case?

> +
> + status = ocfs2_cluster_lock(osb, lockres, level, 0, 0);
> + if (status < 0) {
> + mlog_errno(status);
> + goto bail;
> + }
> + if (!ocfs2_should_refresh_lock_res(lockres))
> + goto bail;

ditto?

> + /* OK, we have the lock but we need to refresh the quota info */
> + status = ocfs2_refresh_qinfo(oinfo);
> + if (status)
> + ocfs2_qinfo_unlock(oinfo, ex);
> + ocfs2_complete_lock_res_refresh(lockres, status);
> +bail:
> + mlog_exit(status);
> + return status;
> +}
> +
>
> ...
>
> +ssize_t ocfs2_quota_read(struct super_block *sb, int type, char *data,
> + size_t len, loff_t off)
> +{
> + struct ocfs2_mem_dqinfo *oinfo = sb_dqinfo(sb, type)->dqi_priv;
> + struct inode *gqinode = oinfo->dqi_gqinode;
> + loff_t i_size = i_size_read(gqinode);
> + int offset = off & (sb->s_blocksize - 1);
> + sector_t blk = off >> sb->s_blocksize_bits;
> + int err = 0;
> + struct buffer_head *bh;
> + size_t toread, tocopy;
> +
> + if (off > i_size)
> + return 0;
> + if (off + len > i_size)
> + len = i_size - off;
> + toread = len;
> + while (toread > 0) {
> + tocopy = min((size_t)(sb->s_blocksize - offset), toread);

min_t is preferred.

> + bh = ocfs2_read_quota_block(gqinode, blk, &err);
> + if (!bh) {
> + mlog_errno(err);
> + return err;
> + }
> + memcpy(data, bh->b_data + offset, tocopy);
> + brelse(bh);
> + offset = 0;
> + toread -= tocopy;
> + data += tocopy;
> + blk++;
> + }
> + return len;
> +}
> +
> +/* Write to quotafile (we know the transaction is already started and has
> + * enough credits) */
> +ssize_t ocfs2_quota_write(struct super_block *sb, int type,
> + const char *data, size_t len, loff_t off)
> +{
> + struct mem_dqinfo *info = sb_dqinfo(sb, type);
> + struct ocfs2_mem_dqinfo *oinfo = info->dqi_priv;
> + struct inode *gqinode = oinfo->dqi_gqinode;
> + int offset = off & (sb->s_blocksize - 1);
> + sector_t blk = off >> sb->s_blocksize_bits;

does ocfs2 attempt to support CONFIG_LBD=n?

> + int err = 0, new = 0;
> + struct buffer_head *bh;
> + handle_t *handle = journal_current_handle();
> +
>
> ...
>
> + lock_buffer(bh);
> + if (new)
> + memset(bh->b_data, 0, sb->s_blocksize);
> + memcpy(bh->b_data + offset, data, len);
> + flush_dcache_page(bh->b_page);
> + unlock_buffer(bh);
> + ocfs2_set_buffer_uptodate(gqinode, bh);
> + err = ocfs2_journal_dirty(handle, bh);
> + brelse(bh);

lots of put_bh()'s

> + if (err < 0)
> + goto out;
> +out:
> + if (err) {
> + mutex_unlock(&gqinode->i_mutex);
> + mlog_errno(err);
> + return err;
> + }
> + gqinode->i_version++;
> + ocfs2_mark_inode_dirty(handle, gqinode, oinfo->dqi_gqi_bh);
> + mutex_unlock(&gqinode->i_mutex);
> + return len;
> +}
> +
>
> ...
>

gee, what a lot of code.

2008-12-25 00:29:46

by Mark Fasheh

[permalink] [raw]
Subject: Re: [PATCH 23/56] ocfs2: Implementation of local and global quota file handling

On Mon, Dec 22, 2008 at 04:11:38PM -0800, Andrew Morton wrote:
> > + mlog_entry_void();
> > +
> > + lvb = (struct ocfs2_qinfo_lvb *)ocfs2_dlm_lvb(&lockres->l_lksb);
>
> Unneeded cast.

Yeah, there's quite a few of those in dlmglue.c actually. I'll add a patch
to fix them up.
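
The cleanup should just be dropping the cast, since ocfs2_dlm_lvb()
presumably hands back a void pointer (the later hunk assigns it without
a cast). Minimal sketch, not the actual follow-up patch:

        /* before: redundant cast of a void * return value */
        lvb = (struct ocfs2_qinfo_lvb *)ocfs2_dlm_lvb(&lockres->l_lksb);

        /* after: let the void * convert implicitly */
        lvb = ocfs2_dlm_lvb(&lockres->l_lksb);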


> > + if (lvb->lvb_version == OCFS2_QINFO_LVB_VERSION) {
> > + info->dqi_bgrace = be32_to_cpu(lvb->lvb_bgrace);
> > + info->dqi_igrace = be32_to_cpu(lvb->lvb_igrace);
> > + oinfo->dqi_syncms = be32_to_cpu(lvb->lvb_syncms);
> > + oinfo->dqi_gi.dqi_blocks = be32_to_cpu(lvb->lvb_blocks);
> > + oinfo->dqi_gi.dqi_free_blk = be32_to_cpu(lvb->lvb_free_blk);
> > + oinfo->dqi_gi.dqi_free_entry =
> > + be32_to_cpu(lvb->lvb_free_entry);
> > + } else {
> > + bh = ocfs2_read_quota_block(oinfo->dqi_gqinode, 0, &status);
> > + if (!bh) {
> > + mlog_errno(status);
> > + goto bail;
> > + }
> > + gdinfo = (struct ocfs2_global_disk_dqinfo *)
> > + (bh->b_data + OCFS2_GLOBAL_INFO_OFF);
> > + info->dqi_bgrace = le32_to_cpu(gdinfo->dqi_bgrace);
> > + info->dqi_igrace = le32_to_cpu(gdinfo->dqi_igrace);
> > + oinfo->dqi_syncms = le32_to_cpu(gdinfo->dqi_syncms);
> > + oinfo->dqi_gi.dqi_blocks = le32_to_cpu(gdinfo->dqi_blocks);
> > + oinfo->dqi_gi.dqi_free_blk = le32_to_cpu(gdinfo->dqi_free_blk);
> > + oinfo->dqi_gi.dqi_free_entry =
> > + le32_to_cpu(gdinfo->dqi_free_entry);
> > + brelse(bh);
>
> put_bh() is more efficient and modern, in the case where bh is known to
> not be NULL.

How about __brelse()? Won't we lose the ref counting check if we go straight
to put_bh()?


> > +/* Lock quota info, this function expects at least shared lock on the quota file
> > + * so that we can safely refresh quota info from disk. */
> > +int ocfs2_qinfo_lock(struct ocfs2_mem_dqinfo *oinfo, int ex)
> > +{
> > + struct ocfs2_lock_res *lockres = &oinfo->dqi_gqlock;
> > + struct ocfs2_super *osb = OCFS2_SB(oinfo->dqi_gi.dqi_sb);
> > + int level = ex ? DLM_LOCK_EX : DLM_LOCK_PR;
> > + int status = 0;
> > +
> > + mlog_entry_void();
> > +
> > + /* On RO devices, locking really isn't needed... */
> > + if (ocfs2_is_hard_readonly(osb)) {
> > + if (ex)
> > + status = -EROFS;
> > + goto bail;
> > + }
> > + if (ocfs2_mount_local(osb))
> > + goto bail;
>
> This is not an error case?

Nope, that's a short-circuit for the case where the file system is marked as
'local only' - thus no cluster locking is ever needed.

> > +
> > + status = ocfs2_cluster_lock(osb, lockres, level, 0, 0);
> > + if (status < 0) {
> > + mlog_errno(status);
> > + goto bail;
> > + }
> > + if (!ocfs2_should_refresh_lock_res(lockres))
> > + goto bail;
>
> ditto?

Another shortcut - the data which some locks protect wants to be
'refreshed', typically from lvb or disk. The lower level locking logic (heh,
try saying that 10 times fast ;) knows when this is required, and will mark
the lock appropriately. This is detected later with the above test and the
lock-specific data is refreshed.

I think eventually, we'll move some more of this stuff to the lower level,
probably using callbacks in the case of lock refresh.
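
A very rough sketch of the callback idea (purely hypothetical - no such
hook exists today):

        /* hypothetical: per-lock-type refresh hook in ocfs2_lock_res_ops */
        int (*refresh)(struct ocfs2_lock_res *lockres);

        /* ...so the generic locking code could drive the refresh itself */
        if (ocfs2_should_refresh_lock_res(lockres)) {
                status = lockres->l_ops->refresh(lockres);
                ocfs2_complete_lock_res_refresh(lockres, status);
        }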


> > +ssize_t ocfs2_quota_read(struct super_block *sb, int type, char *data,
> > + size_t len, loff_t off)
> > +{
> > + struct ocfs2_mem_dqinfo *oinfo = sb_dqinfo(sb, type)->dqi_priv;
> > + struct inode *gqinode = oinfo->dqi_gqinode;
> > + loff_t i_size = i_size_read(gqinode);
> > + int offset = off & (sb->s_blocksize - 1);
> > + sector_t blk = off >> sb->s_blocksize_bits;
> > + int err = 0;
> > + struct buffer_head *bh;
> > + size_t toread, tocopy;
> > +
> > + if (off > i_size)
> > + return 0;
> > + if (off + len > i_size)
> > + len = i_size - off;
> > + toread = len;
> > + while (toread > 0) {
> > + tocopy = min((size_t)(sb->s_blocksize - offset), toread);
>
> min_t is preferred.

Right, I'll fix that one up too.
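
i.e. something like (sketch):

	tocopy = min_t(size_t, sb->s_blocksize - offset, toread);

min_t() makes the common type explicit instead of burying it in a cast on
one operand.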


> > +/* Write to quotafile (we know the transaction is already started and has
> > + * enough credits) */
> > +ssize_t ocfs2_quota_write(struct super_block *sb, int type,
> > + const char *data, size_t len, loff_t off)
> > +{
> > + struct mem_dqinfo *info = sb_dqinfo(sb, type);
> > + struct ocfs2_mem_dqinfo *oinfo = info->dqi_priv;
> > + struct inode *gqinode = oinfo->dqi_gqinode;
> > + int offset = off & (sb->s_blocksize - 1);
> > + sector_t blk = off >> sb->s_blocksize_bits;
>
> does ocfs2 attempt to support CONFIG_LBD=n?

It should... What's the problem here?
--Mark

--
Mark Fasheh

2008-12-25 01:05:54

by Mark Fasheh

[permalink] [raw]
Subject: Re: [PATCH 19/56] mm: Export pdflush_operation()

On Mon, Dec 22, 2008 at 04:01:04PM -0800, Andrew Morton wrote:
> On Mon, 22 Dec 2008 13:48:00 -0800
> Mark Fasheh <[email protected]> wrote:
>
> > OCSF2 will need to queue up work for periodic syncing of quotas
> > among nodes in the cluster. pdflush() is good thread for this so
> > export it's controlling function so that OCFS2 can use it.
>
> I trust that nothing will explode if pdflush_operation() fails
> to do anything and returns -1?

Hmm, Jan do you have any opinion here? I'm wondering if we just need our own
thread for this after all...
--Mark

--
Mark Fasheh

2008-12-26 02:30:41

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH 23/56] ocfs2: Implementation of local and global quota file handling

On Wed, 24 Dec 2008 16:29:23 -0800 Mark Fasheh <[email protected]> wrote:

> >
> > put_bh() is more efficient and modern, in the case where bh is known to
> > not be NULL.
>
> How about __brelse()? Won't we lose the ref counting check if we go straight
> to put_bh()?
>

That would work, if you value the debug check.

>
> ...
>
> > > +/* Write to quotafile (we know the transaction is already started and has
> > > + * enough credits) */
> > > +ssize_t ocfs2_quota_write(struct super_block *sb, int type,
> > > + const char *data, size_t len, loff_t off)
> > > +{
> > > + struct mem_dqinfo *info = sb_dqinfo(sb, type);
> > > + struct ocfs2_mem_dqinfo *oinfo = info->dqi_priv;
> > > + struct inode *gqinode = oinfo->dqi_gqinode;
> > > + int offset = off & (sb->s_blocksize - 1);
> > > + sector_t blk = off >> sb->s_blocksize_bits;
> >
> > does ocfs2 attempt to support CONFIG_LBD=n?
>
> It should... What's the problem here?

Idle curiosity. I noticed that the above expression could result in
truncation when writing a 64-bit value into a 32-bit one, which makes
one wonder whether this all works and is tested, etc.
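
To spell it out (an illustration only, not ocfs2 code; with CONFIG_LBD=n,
sector_t is just 32 bits wide):

	loff_t off = (loff_t)1 << 45;	/* an offset far into a huge file */
	unsigned int blkbits = 12;	/* 4K blocks */
	u32 blk = off >> blkbits;	/* true block number is 2^33, but a
					 * 32-bit sector_t silently stores 0 */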

2008-12-31 19:29:12

by Mark Fasheh

[permalink] [raw]
Subject: Re: [PATCH 19/56] mm: Export pdflush_operation()

On Wed, Dec 24, 2008 at 05:05:44PM -0800, Mark Fasheh wrote:
> On Mon, Dec 22, 2008 at 04:01:04PM -0800, Andrew Morton wrote:
> > On Mon, 22 Dec 2008 13:48:00 -0800
> > Mark Fasheh <[email protected]> wrote:
> >
> > > OCSF2 will need to queue up work for periodic syncing of quotas
> > > among nodes in the cluster. pdflush() is good thread for this so
> > > export it's controlling function so that OCFS2 can use it.
> >
> > I trust that nothing will explode if pdflush_operation() fails
> > to do anything and returns -1?
>
> Hmm, Jan do you have any opinion here? I'm wondering if we just need our own
> thread for this after all...
> --Mark

Ok, looking at this closer, it seems like this could be a problem after all.
Starving the quota syncing thread doesn't seem like a great idea either.
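
The least intrusive band-aid would be to look at pdflush_operation()'s
return value in qsync_timer_fn() and retry sooner when it fails, something
like this (sketch only, using the names from the quoted patch):

	/* pdflush_operation() returns -1 when no pdflush thread could
	 * take the work, in which case the sync silently never runs. */
	if (pdflush_operation(ocfs2_do_qsync, oinfo_ptr) < 0)
		/* retry soon instead of waiting a whole sync interval */
		mod_timer(&oinfo->dqi_sync_timer, jiffies + HZ);
	else
		mod_timer(&oinfo->dqi_sync_timer,
			  round_jiffies(jiffies + oinfo->dqi_syncjiff));

But that still leaves quota syncing at the mercy of pdflush availability.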

The following patch changes things to use a workqueue. Really, this doesn't
seem like a big deal anyway - the workqueue has reasonable overhead.

I could add this on top of my upstream branch along with a revert of the
'mm: Export pdflush_operation()' patch, or I could work this into the patch
series so we never get the export patch in the first place.
--Mark

--
Mark Fasheh

From: Mark Fasheh <[email protected]>

ocfs2/quota: Use workqueue for periodic syncing instead of pdflush()

Using pdflush_operation() for this was potentially buggy - we could get into
a situation where the work function never gets run. Instead, create a
workqueue, 'o2quot', and keep re-queueing a delayed work item on it. The
impact of this should be pretty minimal.

Signed-off-by: Mark Fasheh <[email protected]>
---
fs/ocfs2/quota.h | 5 ++++-
fs/ocfs2/quota_global.c | 45 ++++++++++++++++++++++++++++-----------------
fs/ocfs2/quota_local.c | 2 +-
fs/ocfs2/super.c | 6 ++++++
4 files changed, 39 insertions(+), 19 deletions(-)

diff --git a/fs/ocfs2/quota.h b/fs/ocfs2/quota.h
index abf6941..6d190c0 100644
--- a/fs/ocfs2/quota.h
+++ b/fs/ocfs2/quota.h
@@ -60,7 +60,7 @@ struct ocfs2_mem_dqinfo {
struct buffer_head *dqi_lqi_bh; /* Buffer head with local quota file inode */
struct buffer_head *dqi_ibh; /* Buffer with information header */
struct qtree_mem_dqinfo dqi_gi; /* Info about global file */
- struct timer_list dqi_sync_timer; /* Timer for syncing dquots */
+ struct delayed_work dqi_sync_work; /* Work for syncing dquots */
struct ocfs2_quota_recovery *dqi_rec; /* Pointer to recovery
* information, in case we
* enable quotas on file
@@ -114,4 +114,7 @@ int ocfs2_read_quota_block(struct inode *inode, u64 v_block,
extern struct dquot_operations ocfs2_quota_operations;
extern struct quota_format_type ocfs2_quota_format;

+int ocfs2_quota_setup(void);
+void ocfs2_quota_shutdown(void);
+
#endif /* _OCFS2_QUOTA_H */
diff --git a/fs/ocfs2/quota_global.c b/fs/ocfs2/quota_global.c
index 9184953..fa9101c 100644
--- a/fs/ocfs2/quota_global.c
+++ b/fs/ocfs2/quota_global.c
@@ -7,8 +7,8 @@
#include <linux/quotaops.h>
#include <linux/dqblk_qtree.h>
#include <linux/jiffies.h>
-#include <linux/timer.h>
#include <linux/writeback.h>
+#include <linux/workqueue.h>

#define MLOG_MASK_PREFIX ML_QUOTA
#include <cluster/masklog.h>
@@ -25,7 +25,9 @@
#include "uptodate.h"
#include "quota.h"

-static void qsync_timer_fn(unsigned long oinfo_ptr);
+static struct workqueue_struct *ocfs2_quota_wq = NULL;
+
+static void qsync_work_fn(struct work_struct *work);

static void ocfs2_global_disk2memdqb(struct dquot *dquot, void *dp)
{
@@ -348,10 +350,10 @@ int ocfs2_global_read_info(struct super_block *sb, int type)
oinfo->dqi_gi.dqi_usable_bs = sb->s_blocksize -
OCFS2_QBLK_RESERVED_SPACE;
oinfo->dqi_gi.dqi_qtree_depth = qtree_depth(&oinfo->dqi_gi);
- setup_timer(&oinfo->dqi_sync_timer, qsync_timer_fn,
- (unsigned long)oinfo);
- mod_timer(&oinfo->dqi_sync_timer,
- round_jiffies(jiffies + oinfo->dqi_syncjiff));
+ INIT_DELAYED_WORK(&oinfo->dqi_sync_work, qsync_work_fn);
+ queue_delayed_work(ocfs2_quota_wq, &oinfo->dqi_sync_work,
+ oinfo->dqi_syncjiff);
+
out_err:
mlog_exit(status);
return status;
@@ -594,21 +596,16 @@ out:
return status;
}

-static void ocfs2_do_qsync(unsigned long oinfo_ptr)
+static void qsync_work_fn(struct work_struct *work)
{
- struct ocfs2_mem_dqinfo *oinfo = (struct ocfs2_mem_dqinfo *)oinfo_ptr;
+ struct ocfs2_mem_dqinfo *oinfo = container_of(work,
+ struct ocfs2_mem_dqinfo,
+ dqi_sync_work.work);
struct super_block *sb = oinfo->dqi_gqinode->i_sb;

dquot_scan_active(sb, ocfs2_sync_dquot_helper, oinfo->dqi_type);
-}
-
-static void qsync_timer_fn(unsigned long oinfo_ptr)
-{
- struct ocfs2_mem_dqinfo *oinfo = (struct ocfs2_mem_dqinfo *)oinfo_ptr;
-
- pdflush_operation(ocfs2_do_qsync, oinfo_ptr);
- mod_timer(&oinfo->dqi_sync_timer,
- round_jiffies(jiffies + oinfo->dqi_syncjiff));
+ queue_delayed_work(ocfs2_quota_wq, &oinfo->dqi_sync_work,
+ oinfo->dqi_syncjiff);
}

/*
@@ -1009,3 +1006,17 @@ struct dquot_operations ocfs2_quota_operations = {
.alloc_dquot = ocfs2_alloc_dquot,
.destroy_dquot = ocfs2_destroy_dquot,
};
+
+int ocfs2_quota_setup(void)
+{
+ ocfs2_quota_wq = create_workqueue("o2quot");
+ if (!ocfs2_quota_wq)
+ return -ENOMEM;
+ return 0;
+}
+
+void ocfs2_quota_shutdown(void)
+{
+ flush_workqueue(ocfs2_quota_wq);
+ destroy_workqueue(ocfs2_quota_wq);
+}
diff --git a/fs/ocfs2/quota_local.c b/fs/ocfs2/quota_local.c
index a5f6e2a..07deec5 100644
--- a/fs/ocfs2/quota_local.c
+++ b/fs/ocfs2/quota_local.c
@@ -780,7 +780,7 @@ static int ocfs2_local_free_info(struct super_block *sb, int type)
/* At this point we know there are no more dquots and thus
* even if there's some sync in the pdflush queue, it won't
* find any dquots and return without doing anything */
- del_timer_sync(&oinfo->dqi_sync_timer);
+ cancel_delayed_work_sync(&oinfo->dqi_sync_work);
iput(oinfo->dqi_gqinode);
ocfs2_simple_drop_lockres(OCFS2_SB(sb), &oinfo->dqi_gqlock);
ocfs2_lock_res_free(&oinfo->dqi_gqlock);
diff --git a/fs/ocfs2/super.c b/fs/ocfs2/super.c
index a79e67b..25ccf22 100644
--- a/fs/ocfs2/super.c
+++ b/fs/ocfs2/super.c
@@ -1326,6 +1326,10 @@ static int __init ocfs2_init(void)
mlog(ML_ERROR, "Unable to create ocfs2 debugfs root.\n");
}

+ status = ocfs2_quota_setup();
+ if (status)
+ goto leave;
+
ocfs2_set_locking_protocol();

status = register_quota_format(&ocfs2_quota_format);
@@ -1347,6 +1351,8 @@ static void __exit ocfs2_exit(void)
{
mlog_entry_void();

+ ocfs2_quota_shutdown();
+
if (ocfs2_wq) {
flush_workqueue(ocfs2_wq);
destroy_workqueue(ocfs2_wq);
--
1.5.6

2008-12-31 22:19:18

by Joel Becker

[permalink] [raw]
Subject: Re: [PATCH 19/56] mm: Export pdflush_operation()

On Wed, Dec 31, 2008 at 11:28:54AM -0800, Mark Fasheh wrote:
> On Wed, Dec 24, 2008 at 05:05:44PM -0800, Mark Fasheh wrote:
> > On Mon, Dec 22, 2008 at 04:01:04PM -0800, Andrew Morton wrote:
> > > On Mon, 22 Dec 2008 13:48:00 -0800
> > > Mark Fasheh <[email protected]> wrote:
> > >
> > > > OCSF2 will need to queue up work for periodic syncing of quotas
> > > > among nodes in the cluster. pdflush() is good thread for this so
> > > > export it's controlling function so that OCFS2 can use it.
> > >
> > > I trust that nothing will explode if pdflush_operation() fails
> > > to do anything and returns -1?
> >
> > Hmm, Jan do you have any opinion here? I'm wondering if we just need our own
> > thread for this after all...
> > --Mark
>
> Ok, looking at this closer, it seems like this could be a problem after all.
> Starving the quota syncing thread doesn't seem like a great idea either.

Definitely don't like the pdflush method. You guys are right
that it is buggy.

> The following patch changes things to use a workqueue. Really, this doesn't
> seem like a big deal anyway - the workqueue has reasonable overhead.

I like the patch overall. A couple comments.

> I could add this on top of my upstream branch along with a revert of the
> 'mm: Export pdflush_operation()' patch, or I could work this into the patch
> series so we never get the export patch in the 1st place.

Regarding merge, I'd rather drop the export patch and merge this
with the patch that uses pdflush_operation().

> diff --git a/fs/ocfs2/quota_local.c b/fs/ocfs2/quota_local.c
> index a5f6e2a..07deec5 100644
> --- a/fs/ocfs2/quota_local.c
> +++ b/fs/ocfs2/quota_local.c
> @@ -780,7 +780,7 @@ static int ocfs2_local_free_info(struct super_block *sb, int type)
> /* At this point we know there are no more dquots and thus
> * even if there's some sync in the pdflush queue, it won't
> * find any dquots and return without doing anything */
> - del_timer_sync(&oinfo->dqi_sync_timer);
> + cancel_delayed_work_sync(&oinfo->dqi_sync_work);
> iput(oinfo->dqi_gqinode);
> ocfs2_simple_drop_lockres(OCFS2_SB(sb), &oinfo->dqi_gqlock);
> ocfs2_lock_res_free(&oinfo->dqi_gqlock);

Ok, I found what I was looking for. The workqueue is not
flushed when unmounting a single volume, and I wanted to be sure that
was correct. It is, as vfs_quota_disable() calls ->write_info() before
calling ->free_file_info() here. So we can just cancel any delayed work
and forget about it safely.
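
Spelling that ordering out (a sketch of the sequence described above, not
verbatim fs/dquot.c):

	/*
	 * vfs_quota_disable(sb, type):
	 *   ->write_info()       - final sync of quota info to disk
	 *   ->free_file_info()   - ocfs2_local_free_info(): cancels
	 *                          dqi_sync_work, drops the lock and inode
	 *
	 * By the time the delayed work is cancelled, there is nothing
	 * left for a late qsync_work_fn() run to find.
	 */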

> diff --git a/fs/ocfs2/super.c b/fs/ocfs2/super.c
> index a79e67b..25ccf22 100644
> --- a/fs/ocfs2/super.c
> +++ b/fs/ocfs2/super.c
> @@ -1326,6 +1326,10 @@ static int __init ocfs2_init(void)
> mlog(ML_ERROR, "Unable to create ocfs2 debugfs root.\n");
> }
>
> + status = ocfs2_quota_setup();
> + if (status)
> + goto leave;
> +
> ocfs2_set_locking_protocol();
>
> status = register_quota_format(&ocfs2_quota_format);

Don't you need to shutdown the quota workqueue if
register_quota_format() fails?

Joel

--

Life's Little Instruction Book #80

"Slow dance"

Joel Becker
Principal Software Developer
Oracle
E-mail: [email protected]
Phone: (650) 506-8127

2008-12-31 23:10:00

by Mark Fasheh

[permalink] [raw]
Subject: Re: [Ocfs2-devel] [PATCH 19/56] mm: Export pdflush_operation()

On Wed, Dec 31, 2008 at 02:17:24PM -0800, Joel Becker wrote:
> On Wed, Dec 31, 2008 at 11:28:54AM -0800, Mark Fasheh wrote:
> > On Wed, Dec 24, 2008 at 05:05:44PM -0800, Mark Fasheh wrote:
> > > On Mon, Dec 22, 2008 at 04:01:04PM -0800, Andrew Morton wrote:
> > > > On Mon, 22 Dec 2008 13:48:00 -0800
> > > > Mark Fasheh <[email protected]> wrote:
> > > >
> > > > > OCSF2 will need to queue up work for periodic syncing of quotas
> > > > > among nodes in the cluster. pdflush() is good thread for this so
> > > > > export it's controlling function so that OCFS2 can use it.
> > > >
> > > > I trust that nothing will explode if pdflush_operation() fails
> > > > to do anything and returns -1?
> > >
> > > Hmm, Jan do you have any opinion here? I'm wondering if we just need our own
> > > thread for this after all...
> > > --Mark
> >
> > Ok, looking at this closer, it seems like this could be a problem after all.
> > Starving the quota syncing thread doesn't seem like a great idea either.
>
> Definitely don't like the pdflush method. You guys are right
> that it is buggy.
>
> > The following patch changes things to use a workqueue. Really, this doesn't
> > seem like a big deal anyway - the workqueue has reasonable overhead.
>
> I like the patch overall. A couple comments.
>
> > I could add this on top of my upstream branch along with a revert of the
> > 'mm: Export pdflush_operation()' patch, or I could work this into the patch
> > series so we never get the export patch in the 1st place.
>
> Regarding merge, I'd rather drop the export patch and merge this
> with the patch that uses pdflush_operation().

Sounds good. I think (hope) that shouldn't be too bad :)


> > diff --git a/fs/ocfs2/quota_local.c b/fs/ocfs2/quota_local.c
> > index a5f6e2a..07deec5 100644
> > --- a/fs/ocfs2/quota_local.c
> > +++ b/fs/ocfs2/quota_local.c
> > @@ -780,7 +780,7 @@ static int ocfs2_local_free_info(struct super_block *sb, int type)
> > /* At this point we know there are no more dquots and thus
> > * even if there's some sync in the pdflush queue, it won't
> > * find any dquots and return without doing anything */
> > - del_timer_sync(&oinfo->dqi_sync_timer);
> > + cancel_delayed_work_sync(&oinfo->dqi_sync_work);
> > iput(oinfo->dqi_gqinode);
> > ocfs2_simple_drop_lockres(OCFS2_SB(sb), &oinfo->dqi_gqlock);
> > ocfs2_lock_res_free(&oinfo->dqi_gqlock);
>
> Ok, I found what I was looking for. The workqueue is not
> flushed when unmounting a single volume, and I wanted to be sure that
> was correct. It is, as vfs_quota_disable() calls ->write_info() before
> calling ->free_file_info() here. So we can just cancel any delayed work
> and forget about it safely.
>
> > diff --git a/fs/ocfs2/super.c b/fs/ocfs2/super.c
> > index a79e67b..25ccf22 100644
> > --- a/fs/ocfs2/super.c
> > +++ b/fs/ocfs2/super.c
> > @@ -1326,6 +1326,10 @@ static int __init ocfs2_init(void)
> > mlog(ML_ERROR, "Unable to create ocfs2 debugfs root.\n");
> > }
> >
> > + status = ocfs2_quota_setup();
> > + if (status)
> > + goto leave;
> > +
> > ocfs2_set_locking_protocol();
> >
> > status = register_quota_format(&ocfs2_quota_format);
>
> Don't you need to shutdown the quota workqueue if
> register_quota_format() fails?

Yep, good catch. Fixed patch follows. I'll start merging it all now.
--Mark

From: Mark Fasheh <[email protected]>

ocfs2/quota: Use workqueue for periodic syncing instead of pdflush()

Using pdflush_operation() for this was potentially buggy - we could get into
a situation where the work function never gets run. Instead, create a
workqueue, 'o2quot', and keep re-queueing a delayed work item on it. The
impact of this should be pretty minimal.

Signed-off-by: Mark Fasheh <[email protected]>
---
fs/ocfs2/quota.h | 5 +++-
fs/ocfs2/quota_global.c | 48 ++++++++++++++++++++++++++++++----------------
fs/ocfs2/quota_local.c | 2 +-
fs/ocfs2/super.c | 7 ++++++
4 files changed, 43 insertions(+), 19 deletions(-)

diff --git a/fs/ocfs2/quota.h b/fs/ocfs2/quota.h
index abf6941..6d190c0 100644
--- a/fs/ocfs2/quota.h
+++ b/fs/ocfs2/quota.h
@@ -60,7 +60,7 @@ struct ocfs2_mem_dqinfo {
struct buffer_head *dqi_lqi_bh; /* Buffer head with local quota file inode */
struct buffer_head *dqi_ibh; /* Buffer with information header */
struct qtree_mem_dqinfo dqi_gi; /* Info about global file */
- struct timer_list dqi_sync_timer; /* Timer for syncing dquots */
+ struct delayed_work dqi_sync_work; /* Work for syncing dquots */
struct ocfs2_quota_recovery *dqi_rec; /* Pointer to recovery
* information, in case we
* enable quotas on file
@@ -114,4 +114,7 @@ int ocfs2_read_quota_block(struct inode *inode, u64 v_block,
extern struct dquot_operations ocfs2_quota_operations;
extern struct quota_format_type ocfs2_quota_format;

+int ocfs2_quota_setup(void);
+void ocfs2_quota_shutdown(void);
+
#endif /* _OCFS2_QUOTA_H */
diff --git a/fs/ocfs2/quota_global.c b/fs/ocfs2/quota_global.c
index 9184953..6aff8f2 100644
--- a/fs/ocfs2/quota_global.c
+++ b/fs/ocfs2/quota_global.c
@@ -7,8 +7,8 @@
#include <linux/quotaops.h>
#include <linux/dqblk_qtree.h>
#include <linux/jiffies.h>
-#include <linux/timer.h>
#include <linux/writeback.h>
+#include <linux/workqueue.h>

#define MLOG_MASK_PREFIX ML_QUOTA
#include <cluster/masklog.h>
@@ -25,7 +25,9 @@
#include "uptodate.h"
#include "quota.h"

-static void qsync_timer_fn(unsigned long oinfo_ptr);
+static struct workqueue_struct *ocfs2_quota_wq = NULL;
+
+static void qsync_work_fn(struct work_struct *work);

static void ocfs2_global_disk2memdqb(struct dquot *dquot, void *dp)
{
@@ -348,10 +350,10 @@ int ocfs2_global_read_info(struct super_block *sb, int type)
oinfo->dqi_gi.dqi_usable_bs = sb->s_blocksize -
OCFS2_QBLK_RESERVED_SPACE;
oinfo->dqi_gi.dqi_qtree_depth = qtree_depth(&oinfo->dqi_gi);
- setup_timer(&oinfo->dqi_sync_timer, qsync_timer_fn,
- (unsigned long)oinfo);
- mod_timer(&oinfo->dqi_sync_timer,
- round_jiffies(jiffies + oinfo->dqi_syncjiff));
+ INIT_DELAYED_WORK(&oinfo->dqi_sync_work, qsync_work_fn);
+ queue_delayed_work(ocfs2_quota_wq, &oinfo->dqi_sync_work,
+ oinfo->dqi_syncjiff);
+
out_err:
mlog_exit(status);
return status;
@@ -594,21 +596,16 @@ out:
return status;
}

-static void ocfs2_do_qsync(unsigned long oinfo_ptr)
+static void qsync_work_fn(struct work_struct *work)
{
- struct ocfs2_mem_dqinfo *oinfo = (struct ocfs2_mem_dqinfo *)oinfo_ptr;
+ struct ocfs2_mem_dqinfo *oinfo = container_of(work,
+ struct ocfs2_mem_dqinfo,
+ dqi_sync_work.work);
struct super_block *sb = oinfo->dqi_gqinode->i_sb;

dquot_scan_active(sb, ocfs2_sync_dquot_helper, oinfo->dqi_type);
-}
-
-static void qsync_timer_fn(unsigned long oinfo_ptr)
-{
- struct ocfs2_mem_dqinfo *oinfo = (struct ocfs2_mem_dqinfo *)oinfo_ptr;
-
- pdflush_operation(ocfs2_do_qsync, oinfo_ptr);
- mod_timer(&oinfo->dqi_sync_timer,
- round_jiffies(jiffies + oinfo->dqi_syncjiff));
+ queue_delayed_work(ocfs2_quota_wq, &oinfo->dqi_sync_work,
+ oinfo->dqi_syncjiff);
}

/*
@@ -1009,3 +1006,20 @@ struct dquot_operations ocfs2_quota_operations = {
.alloc_dquot = ocfs2_alloc_dquot,
.destroy_dquot = ocfs2_destroy_dquot,
};
+
+int ocfs2_quota_setup(void)
+{
+ ocfs2_quota_wq = create_workqueue("o2quot");
+ if (!ocfs2_quota_wq)
+ return -ENOMEM;
+ return 0;
+}
+
+void ocfs2_quota_shutdown(void)
+{
+ if (ocfs2_quota_wq) {
+ flush_workqueue(ocfs2_quota_wq);
+ destroy_workqueue(ocfs2_quota_wq);
+ ocfs2_quota_wq = NULL;
+ }
+}
diff --git a/fs/ocfs2/quota_local.c b/fs/ocfs2/quota_local.c
index a5f6e2a..07deec5 100644
--- a/fs/ocfs2/quota_local.c
+++ b/fs/ocfs2/quota_local.c
@@ -780,7 +780,7 @@ static int ocfs2_local_free_info(struct super_block *sb, int type)
/* At this point we know there are no more dquots and thus
* even if there's some sync in the pdflush queue, it won't
* find any dquots and return without doing anything */
- del_timer_sync(&oinfo->dqi_sync_timer);
+ cancel_delayed_work_sync(&oinfo->dqi_sync_work);
iput(oinfo->dqi_gqinode);
ocfs2_simple_drop_lockres(OCFS2_SB(sb), &oinfo->dqi_gqlock);
ocfs2_lock_res_free(&oinfo->dqi_gqlock);
diff --git a/fs/ocfs2/super.c b/fs/ocfs2/super.c
index a79e67b..43ed113 100644
--- a/fs/ocfs2/super.c
+++ b/fs/ocfs2/super.c
@@ -1326,11 +1326,16 @@ static int __init ocfs2_init(void)
mlog(ML_ERROR, "Unable to create ocfs2 debugfs root.\n");
}

+ status = ocfs2_quota_setup();
+ if (status)
+ goto leave;
+
ocfs2_set_locking_protocol();

status = register_quota_format(&ocfs2_quota_format);
leave:
if (status < 0) {
+ ocfs2_quota_shutdown();
ocfs2_free_mem_caches();
exit_ocfs2_uptodate_cache();
}
@@ -1347,6 +1352,8 @@ static void __exit ocfs2_exit(void)
{
mlog_entry_void();

+ ocfs2_quota_shutdown();
+
if (ocfs2_wq) {
flush_workqueue(ocfs2_wq);
destroy_workqueue(ocfs2_wq);
--
1.5.6

2009-01-05 13:27:28

by Jan Kara

[permalink] [raw]
Subject: Re: [Ocfs2-devel] [PATCH 19/56] mm: Export pdflush_operation()

On Wed 31-12-08 15:09:49, Mark Fasheh wrote:
> From: Mark Fasheh <[email protected]>
>
> ocfs2/quota: Use workqueue for periodic syncing instead of pdflush()
>
> Using pdflush_operation() for this was potentially buggy - we could get into
> a situation where the work function never gets run. Instead, create a
> workqueue, 'o2quot', and keep re-queueing a delayed work item on it. The
> impact of this should be pretty minimal.
>
> Signed-off-by: Mark Fasheh <[email protected]>
The patch looks fine. You can add 'Acked-by: Jan Kara <[email protected]>'

Honza
> ---
> fs/ocfs2/quota.h | 5 +++-
> fs/ocfs2/quota_global.c | 48 ++++++++++++++++++++++++++++++----------------
> fs/ocfs2/quota_local.c | 2 +-
> fs/ocfs2/super.c | 7 ++++++
> 4 files changed, 43 insertions(+), 19 deletions(-)
>
> diff --git a/fs/ocfs2/quota.h b/fs/ocfs2/quota.h
> index abf6941..6d190c0 100644
> --- a/fs/ocfs2/quota.h
> +++ b/fs/ocfs2/quota.h
> @@ -60,7 +60,7 @@ struct ocfs2_mem_dqinfo {
> struct buffer_head *dqi_lqi_bh; /* Buffer head with local quota file inode */
> struct buffer_head *dqi_ibh; /* Buffer with information header */
> struct qtree_mem_dqinfo dqi_gi; /* Info about global file */
> - struct timer_list dqi_sync_timer; /* Timer for syncing dquots */
> + struct delayed_work dqi_sync_work; /* Work for syncing dquots */
> struct ocfs2_quota_recovery *dqi_rec; /* Pointer to recovery
> * information, in case we
> * enable quotas on file
> @@ -114,4 +114,7 @@ int ocfs2_read_quota_block(struct inode *inode, u64 v_block,
> extern struct dquot_operations ocfs2_quota_operations;
> extern struct quota_format_type ocfs2_quota_format;
>
> +int ocfs2_quota_setup(void);
> +void ocfs2_quota_shutdown(void);
> +
> #endif /* _OCFS2_QUOTA_H */
> diff --git a/fs/ocfs2/quota_global.c b/fs/ocfs2/quota_global.c
> index 9184953..6aff8f2 100644
> --- a/fs/ocfs2/quota_global.c
> +++ b/fs/ocfs2/quota_global.c
> @@ -7,8 +7,8 @@
> #include <linux/quotaops.h>
> #include <linux/dqblk_qtree.h>
> #include <linux/jiffies.h>
> -#include <linux/timer.h>
> #include <linux/writeback.h>
> +#include <linux/workqueue.h>
>
> #define MLOG_MASK_PREFIX ML_QUOTA
> #include <cluster/masklog.h>
> @@ -25,7 +25,9 @@
> #include "uptodate.h"
> #include "quota.h"
>
> -static void qsync_timer_fn(unsigned long oinfo_ptr);
> +static struct workqueue_struct *ocfs2_quota_wq = NULL;
> +
> +static void qsync_work_fn(struct work_struct *work);
>
> static void ocfs2_global_disk2memdqb(struct dquot *dquot, void *dp)
> {
> @@ -348,10 +350,10 @@ int ocfs2_global_read_info(struct super_block *sb, int type)
> oinfo->dqi_gi.dqi_usable_bs = sb->s_blocksize -
> OCFS2_QBLK_RESERVED_SPACE;
> oinfo->dqi_gi.dqi_qtree_depth = qtree_depth(&oinfo->dqi_gi);
> - setup_timer(&oinfo->dqi_sync_timer, qsync_timer_fn,
> - (unsigned long)oinfo);
> - mod_timer(&oinfo->dqi_sync_timer,
> - round_jiffies(jiffies + oinfo->dqi_syncjiff));
> + INIT_DELAYED_WORK(&oinfo->dqi_sync_work, qsync_work_fn);
> + queue_delayed_work(ocfs2_quota_wq, &oinfo->dqi_sync_work,
> + oinfo->dqi_syncjiff);
> +
> out_err:
> mlog_exit(status);
> return status;
> @@ -594,21 +596,16 @@ out:
> return status;
> }
>
> -static void ocfs2_do_qsync(unsigned long oinfo_ptr)
> +static void qsync_work_fn(struct work_struct *work)
> {
> - struct ocfs2_mem_dqinfo *oinfo = (struct ocfs2_mem_dqinfo *)oinfo_ptr;
> + struct ocfs2_mem_dqinfo *oinfo = container_of(work,
> + struct ocfs2_mem_dqinfo,
> + dqi_sync_work.work);
> struct super_block *sb = oinfo->dqi_gqinode->i_sb;
>
> dquot_scan_active(sb, ocfs2_sync_dquot_helper, oinfo->dqi_type);
> -}
> -
> -static void qsync_timer_fn(unsigned long oinfo_ptr)
> -{
> - struct ocfs2_mem_dqinfo *oinfo = (struct ocfs2_mem_dqinfo *)oinfo_ptr;
> -
> - pdflush_operation(ocfs2_do_qsync, oinfo_ptr);
> - mod_timer(&oinfo->dqi_sync_timer,
> - round_jiffies(jiffies + oinfo->dqi_syncjiff));
> + queue_delayed_work(ocfs2_quota_wq, &oinfo->dqi_sync_work,
> + oinfo->dqi_syncjiff);
> }
>
> /*
> @@ -1009,3 +1006,20 @@ struct dquot_operations ocfs2_quota_operations = {
> .alloc_dquot = ocfs2_alloc_dquot,
> .destroy_dquot = ocfs2_destroy_dquot,
> };
> +
> +int ocfs2_quota_setup(void)
> +{
> + ocfs2_quota_wq = create_workqueue("o2quot");
> + if (!ocfs2_quota_wq)
> + return -ENOMEM;
> + return 0;
> +}
> +
> +void ocfs2_quota_shutdown(void)
> +{
> + if (ocfs2_quota_wq) {
> + flush_workqueue(ocfs2_quota_wq);
> + destroy_workqueue(ocfs2_quota_wq);
> + ocfs2_quota_wq = NULL;
> + }
> +}
> diff --git a/fs/ocfs2/quota_local.c b/fs/ocfs2/quota_local.c
> index a5f6e2a..07deec5 100644
> --- a/fs/ocfs2/quota_local.c
> +++ b/fs/ocfs2/quota_local.c
> @@ -780,7 +780,7 @@ static int ocfs2_local_free_info(struct super_block *sb, int type)
> /* At this point we know there are no more dquots and thus
> * even if there's some sync in the pdflush queue, it won't
> * find any dquots and return without doing anything */
> - del_timer_sync(&oinfo->dqi_sync_timer);
> + cancel_delayed_work_sync(&oinfo->dqi_sync_work);
> iput(oinfo->dqi_gqinode);
> ocfs2_simple_drop_lockres(OCFS2_SB(sb), &oinfo->dqi_gqlock);
> ocfs2_lock_res_free(&oinfo->dqi_gqlock);
> diff --git a/fs/ocfs2/super.c b/fs/ocfs2/super.c
> index a79e67b..43ed113 100644
> --- a/fs/ocfs2/super.c
> +++ b/fs/ocfs2/super.c
> @@ -1326,11 +1326,16 @@ static int __init ocfs2_init(void)
> mlog(ML_ERROR, "Unable to create ocfs2 debugfs root.\n");
> }
>
> + status = ocfs2_quota_setup();
> + if (status)
> + goto leave;
> +
> ocfs2_set_locking_protocol();
>
> status = register_quota_format(&ocfs2_quota_format);
> leave:
> if (status < 0) {
> + ocfs2_quota_shutdown();
> ocfs2_free_mem_caches();
> exit_ocfs2_uptodate_cache();
> }
> @@ -1347,6 +1352,8 @@ static void __exit ocfs2_exit(void)
> {
> mlog_entry_void();
>
> + ocfs2_quota_shutdown();
> +
> if (ocfs2_wq) {
> flush_workqueue(ocfs2_wq);
> destroy_workqueue(ocfs2_wq);
> --
> 1.5.6
>
--
Jan Kara <[email protected]>
SUSE Labs, CR