2014-04-01 09:15:58

by Steven Whitehouse

[permalink] [raw]
Subject: GFS2: Pre-pull patch posting (merge window)

Hi,

Here is the current content of the GFS2 -nmw tree for the
current merge window.

One of the main highlights this time, is not the patches themselves
but instead the widening contributor base. It is good to see that
interest is increasing in GFS2, and I'd like to thank all the
contributors to this patch set.

In addition to the usual set of bug fixes and clean ups, there are
patches to improve inode creation performance when xattrs are required
and some improvements to the transaction code which is intended to help
improve scalability after further changes in due course. Journal extent
mapping is also updated to make it more efficient and again, this is a
foundation for future work in this area.

The maximum number of ACLs has been increased to 300 (for a 4k block size)
which means that even with a few additional xattrs from selinux,
everything should fit within a single fs block. There is also a patch
to bring GFS2's own copy of the writepages code up to the same level as
the core VFS. Eventually we may be able to merge some of this code, since
it is fairly similar.

The other major change this time, is bringing consistency to the printing
of messages via fs_<level>, pr_<level> macros.

Steve.


2014-04-01 09:16:09

by Steven Whitehouse

[permalink] [raw]
Subject: [PATCH 01/29] GFS2: Plug on AIL flush

When we do a flush of the AIL list, we are writing out what is
likely to be a lot of small I/Os, which are possibly in an order
which is not ideal performance-wise. Since this is done by calling
filemap_fdatatwrite for each individual inode's address space there
is no overall plugging going on.

In addition to that, we do not always wait for AIL i/o when we flush
it, so that it is possible for things to get left behind on the queue.
By adding explicit plugging here, we reduce the chances of this
being an issues. A quick test using the AIL flush tracepoint shows a
small, but measurable improvement.

Signed-off-by: Steven Whitehouse <[email protected]>

diff --git a/fs/gfs2/log.c b/fs/gfs2/log.c
index 9dcb977..1e1bda0 100644
--- a/fs/gfs2/log.c
+++ b/fs/gfs2/log.c
@@ -18,6 +18,7 @@
#include <linux/kthread.h>
#include <linux/freezer.h>
#include <linux/bio.h>
+#include <linux/blkdev.h>
#include <linux/writeback.h>
#include <linux/list_sort.h>

@@ -145,8 +146,10 @@ void gfs2_ail1_flush(struct gfs2_sbd *sdp, struct writeback_control *wbc)
{
struct list_head *head = &sdp->sd_ail1_list;
struct gfs2_trans *tr;
+ struct blk_plug plug;

trace_gfs2_ail_flush(sdp, wbc, 1);
+ blk_start_plug(&plug);
spin_lock(&sdp->sd_ail_lock);
restart:
list_for_each_entry_reverse(tr, head, tr_list) {
@@ -156,6 +159,7 @@ restart:
goto restart;
}
spin_unlock(&sdp->sd_ail_lock);
+ blk_finish_plug(&plug);
trace_gfs2_ail_flush(sdp, wbc, 0);
}

--
1.8.3.1

2014-04-01 09:16:29

by Steven Whitehouse

[permalink] [raw]
Subject: [PATCH 12/29] GFS2: Remove extra "if" in gfs2_log_flush()

By reordering some of the assignments in gfs2_log_flush() it
is possible to remove one of the "if" statements as it can be
merged with one higher up the function.

Signed-off-by: Steven Whitehouse <[email protected]>

diff --git a/fs/gfs2/log.c b/fs/gfs2/log.c
index c1c9a29..edbd461 100644
--- a/fs/gfs2/log.c
+++ b/fs/gfs2/log.c
@@ -684,21 +684,19 @@ void gfs2_log_flush(struct gfs2_sbd *sdp, struct gfs2_glock *gl)
}
trace_gfs2_log_flush(sdp, 1);

+ sdp->sd_log_flush_head = sdp->sd_log_head;
+ sdp->sd_log_flush_wrapped = 0;
tr = sdp->sd_log_tr;
if (tr) {
sdp->sd_log_tr = NULL;
INIT_LIST_HEAD(&tr->tr_ail1_list);
INIT_LIST_HEAD(&tr->tr_ail2_list);
+ tr->tr_first = sdp->sd_log_flush_head;
}

gfs2_assert_withdraw(sdp,
sdp->sd_log_num_revoke == sdp->sd_log_commited_revoke);

- sdp->sd_log_flush_head = sdp->sd_log_head;
- sdp->sd_log_flush_wrapped = 0;
- if (tr)
- tr->tr_first = sdp->sd_log_flush_head;
-
gfs2_ordered_write(sdp);
lops_before_commit(sdp, tr);
gfs2_log_flush_bio(sdp, WRITE);
--
1.8.3.1

2014-04-01 09:16:38

by Steven Whitehouse

[permalink] [raw]
Subject: [PATCH 15/29] GFS2: return -E2BIG if hit the maximum limits of ACLs

From: Jie Liu <[email protected]>

Return -E2BIG rather than -EINVAL if hit the maximum size limits of
ACLs, as the former errno is consistent with VFS xattr syscalls.

This is pointed out by Dave Chinner in previous discussion thread:
http://www.spinics.net/lists/linux-fsdevel/msg71125.html

Signed-off-by: Jie Liu <[email protected]>
Signed-off-by: Steven Whitehouse <[email protected]>

diff --git a/fs/gfs2/acl.c b/fs/gfs2/acl.c
index ba94566..9c59ebe 100644
--- a/fs/gfs2/acl.c
+++ b/fs/gfs2/acl.c
@@ -86,7 +86,7 @@ int gfs2_set_acl(struct inode *inode, struct posix_acl *acl, int type)
BUG_ON(name == NULL);

if (acl->a_count > GFS2_ACL_MAX_ENTRIES)
- return -EINVAL;
+ return -E2BIG;

if (type == ACL_TYPE_ACCESS) {
umode_t mode = inode->i_mode;
--
1.8.3.1

2014-04-01 09:16:50

by Steven Whitehouse

[permalink] [raw]
Subject: [PATCH 18/29] GFS2: Use pr_<level> more consistently

From: Joe Perches <[email protected]>

Add pr_fmt, remove embedded "GFS2: " prefixes.
This now consistently emits lower case "gfs2: " for each message.

Other miscellanea around these changes:

o Add missing newlines
o Coalesce formats
o Realign arguments

Signed-off-by: Joe Perches <[email protected]>
Signed-off-by: Steven Whitehouse <[email protected]>

diff --git a/fs/gfs2/dir.c b/fs/gfs2/dir.c
index 39c7081..1a349f9 100644
--- a/fs/gfs2/dir.c
+++ b/fs/gfs2/dir.c
@@ -53,6 +53,8 @@
* but never before the maximum hash table size has been reached.
*/

+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
#include <linux/slab.h>
#include <linux/spinlock.h>
#include <linux/buffer_head.h>
@@ -507,8 +509,8 @@ static int gfs2_check_dirent(struct gfs2_dirent *dent, unsigned int offset,
goto error;
return 0;
error:
- pr_warn("gfs2_check_dirent: %s (%s)\n", msg,
- first ? "first in block" : "not first in block");
+ pr_warn("%s: %s (%s)\n",
+ __func__, msg, first ? "first in block" : "not first in block");
return -EIO;
}

@@ -531,8 +533,7 @@ static int gfs2_dirent_offset(const void *buf)
}
return offset;
wrong_type:
- pr_warn("gfs2_scan_dirent: wrong block type %u\n",
- be32_to_cpu(h->mh_type));
+ pr_warn("%s: wrong block type %u\n", __func__, be32_to_cpu(h->mh_type));
return -1;
}

@@ -728,7 +729,7 @@ static int get_leaf(struct gfs2_inode *dip, u64 leaf_no,

error = gfs2_meta_read(dip->i_gl, leaf_no, DIO_WAIT, bhp);
if (!error && gfs2_metatype_check(GFS2_SB(&dip->i_inode), *bhp, GFS2_METATYPE_LF)) {
- /* printk(KERN_INFO "block num=%llu\n", leaf_no); */
+ /* pr_info("block num=%llu\n", leaf_no); */
error = -EIO;
}

@@ -1006,7 +1007,8 @@ static int dir_split_leaf(struct inode *inode, const struct qstr *name)
len = 1 << (dip->i_depth - be16_to_cpu(oleaf->lf_depth));
half_len = len >> 1;
if (!half_len) {
- pr_warn("i_depth %u lf_depth %u index %u\n", dip->i_depth, be16_to_cpu(oleaf->lf_depth), index);
+ pr_warn("i_depth %u lf_depth %u index %u\n",
+ dip->i_depth, be16_to_cpu(oleaf->lf_depth), index);
gfs2_consist_inode(dip);
error = -EIO;
goto fail_brelse;
diff --git a/fs/gfs2/glock.c b/fs/gfs2/glock.c
index 329dc80..52f7478 100644
--- a/fs/gfs2/glock.c
+++ b/fs/gfs2/glock.c
@@ -7,6 +7,8 @@
* of the GNU General Public License version 2.
*/

+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
#include <linux/sched.h>
#include <linux/slab.h>
#include <linux/spinlock.h>
@@ -468,7 +470,7 @@ retry:
do_xmote(gl, gh, LM_ST_UNLOCKED);
break;
default: /* Everything else */
- pr_err("GFS2: wanted %u got %u\n", gl->gl_target, state);
+ pr_err("wanted %u got %u\n", gl->gl_target, state);
GLOCK_BUG_ON(gl, 1);
}
spin_unlock(&gl->gl_spin);
@@ -542,7 +544,7 @@ __acquires(&gl->gl_spin)
/* lock_dlm */
ret = sdp->sd_lockstruct.ls_ops->lm_lock(gl, target, lck_flags);
if (ret) {
- pr_err("GFS2: lm_lock ret %d\n", ret);
+ pr_err("lm_lock ret %d\n", ret);
GLOCK_BUG_ON(gl, 1);
}
} else { /* lock_nolock */
@@ -935,7 +937,7 @@ void gfs2_print_dbg(struct seq_file *seq, const char *fmt, ...)
vaf.fmt = fmt;
vaf.va = &args;

- pr_err(" %pV", &vaf);
+ pr_err("%pV", &vaf);
}

va_end(args);
diff --git a/fs/gfs2/lock_dlm.c b/fs/gfs2/lock_dlm.c
index a664ddd..c1eb555 100644
--- a/fs/gfs2/lock_dlm.c
+++ b/fs/gfs2/lock_dlm.c
@@ -7,6 +7,8 @@
* of the GNU General Public License version 2.
*/

+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
#include <linux/fs.h>
#include <linux/dlm.h>
#include <linux/slab.h>
@@ -176,7 +178,7 @@ static void gdlm_bast(void *arg, int mode)
gfs2_glock_cb(gl, LM_ST_SHARED);
break;
default:
- pr_err("unknown bast mode %d", mode);
+ pr_err("unknown bast mode %d\n", mode);
BUG();
}
}
@@ -195,7 +197,7 @@ static int make_mode(const unsigned int lmstate)
case LM_ST_SHARED:
return DLM_LOCK_PR;
}
- pr_err("unknown LM state %d", lmstate);
+ pr_err("unknown LM state %d\n", lmstate);
BUG();
return -1;
}
@@ -308,7 +310,8 @@ static void gdlm_put_lock(struct gfs2_glock *gl)
error = dlm_unlock(ls->ls_dlm, gl->gl_lksb.sb_lkid, DLM_LKF_VALBLK,
NULL, gl);
if (error) {
- pr_err("gdlm_unlock %x,%llx err=%d\n", gl->gl_name.ln_type,
+ pr_err("gdlm_unlock %x,%llx err=%d\n",
+ gl->gl_name.ln_type,
(unsigned long long)gl->gl_name.ln_number, error);
return;
}
diff --git a/fs/gfs2/main.c b/fs/gfs2/main.c
index ae9b02b..82b6ac8 100644
--- a/fs/gfs2/main.c
+++ b/fs/gfs2/main.c
@@ -7,6 +7,8 @@
* of the GNU General Public License version 2.
*/

+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
#include <linux/slab.h>
#include <linux/spinlock.h>
#include <linux/completion.h>
diff --git a/fs/gfs2/ops_fstype.c b/fs/gfs2/ops_fstype.c
index ea9c35c..fba74a2 100644
--- a/fs/gfs2/ops_fstype.c
+++ b/fs/gfs2/ops_fstype.c
@@ -7,6 +7,8 @@
* of the GNU General Public License version 2.
*/

+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
#include <linux/sched.h>
#include <linux/slab.h>
#include <linux/spinlock.h>
@@ -150,7 +152,7 @@ static int gfs2_check_sb(struct gfs2_sbd *sdp, int silent)
if (sb->sb_magic != GFS2_MAGIC ||
sb->sb_type != GFS2_METATYPE_SB) {
if (!silent)
- pr_warn("GFS2: not a GFS2 filesystem\n");
+ pr_warn("not a GFS2 filesystem\n");
return -EINVAL;
}

@@ -172,7 +174,7 @@ static void end_bio_io_page(struct bio *bio, int error)
if (!error)
SetPageUptodate(page);
else
- pr_warn("gfs2: error %d reading superblock\n", error);
+ pr_warn("error %d reading superblock\n", error);
unlock_page(page);
}

@@ -945,7 +947,7 @@ static int gfs2_lm_mount(struct gfs2_sbd *sdp, int silent)
lm = &gfs2_dlm_ops;
#endif
} else {
- pr_info("GFS2: can't find protocol %s\n", proto);
+ pr_info("can't find protocol %s\n", proto);
return -ENOENT;
}

@@ -1052,7 +1054,7 @@ static int fill_super(struct super_block *sb, struct gfs2_args *args, int silent

sdp = init_sbd(sb);
if (!sdp) {
- pr_warn("GFS2: can't alloc struct gfs2_sbd\n");
+ pr_warn("can't alloc struct gfs2_sbd\n");
return -ENOMEM;
}
sdp->sd_args = *args;
@@ -1300,7 +1302,7 @@ static struct dentry *gfs2_mount(struct file_system_type *fs_type, int flags,

error = gfs2_mount_args(&args, data);
if (error) {
- pr_warn("GFS2: can't parse mount arguments\n");
+ pr_warn("can't parse mount arguments\n");
goto error_super;
}

@@ -1350,15 +1352,15 @@ static struct dentry *gfs2_mount_meta(struct file_system_type *fs_type,

error = kern_path(dev_name, LOOKUP_FOLLOW, &path);
if (error) {
- pr_warn("GFS2: path_lookup on %s returned error %d\n",
- dev_name, error);
+ pr_warn("path_lookup on %s returned error %d\n",
+ dev_name, error);
return ERR_PTR(error);
}
s = sget(&gfs2_fs_type, test_gfs2_super, set_meta_super, flags,
path.dentry->d_inode->i_sb->s_bdev);
path_put(&path);
if (IS_ERR(s)) {
- pr_warn("GFS2: gfs2 mount does not exist\n");
+ pr_warn("gfs2 mount does not exist\n");
return ERR_CAST(s);
}
if ((flags ^ s->s_flags) & MS_RDONLY) {
diff --git a/fs/gfs2/quota.c b/fs/gfs2/quota.c
index 6e25ee4..73ed925 100644
--- a/fs/gfs2/quota.c
+++ b/fs/gfs2/quota.c
@@ -36,6 +36,8 @@
* the quota file, so it is not being constantly read.
*/

+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
#include <linux/sched.h>
#include <linux/slab.h>
#include <linux/mm.h>
@@ -1081,10 +1083,10 @@ static int print_message(struct gfs2_quota_data *qd, char *type)
{
struct gfs2_sbd *sdp = qd->qd_gl->gl_sbd;

- pr_info("GFS2: fsid=%s: quota %s for %s %u\n",
- sdp->sd_fsname, type,
- (qd->qd_id.type == USRQUOTA) ? "user" : "group",
- from_kqid(&init_user_ns, qd->qd_id));
+ pr_info("fsid=%s: quota %s for %s %u\n",
+ sdp->sd_fsname, type,
+ (qd->qd_id.type == USRQUOTA) ? "user" : "group",
+ from_kqid(&init_user_ns, qd->qd_id));

return 0;
}
diff --git a/fs/gfs2/rgrp.c b/fs/gfs2/rgrp.c
index 8d12038..281a771 100644
--- a/fs/gfs2/rgrp.c
+++ b/fs/gfs2/rgrp.c
@@ -7,6 +7,8 @@
* of the GNU General Public License version 2.
*/

+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
#include <linux/slab.h>
#include <linux/spinlock.h>
#include <linux/completion.h>
@@ -99,12 +101,12 @@ static inline void gfs2_setbit(const struct gfs2_rbm *rbm, bool do_clone,
cur_state = (*byte1 >> bit) & GFS2_BIT_MASK;

if (unlikely(!valid_change[new_state * 4 + cur_state])) {
- pr_warn("GFS2: buf_blk = 0x%x old_state=%d, "
- "new_state=%d\n", rbm->offset, cur_state, new_state);
- pr_warn("GFS2: rgrp=0x%llx bi_start=0x%x\n",
- (unsigned long long)rbm->rgd->rd_addr, bi->bi_start);
- pr_warn("GFS2: bi_offset=0x%x bi_len=0x%x\n",
- bi->bi_offset, bi->bi_len);
+ pr_warn("buf_blk = 0x%x old_state=%d, new_state=%d\n",
+ rbm->offset, cur_state, new_state);
+ pr_warn("rgrp=0x%llx bi_start=0x%x\n",
+ (unsigned long long)rbm->rgd->rd_addr, bi->bi_start);
+ pr_warn("bi_offset=0x%x bi_len=0x%x\n",
+ bi->bi_offset, bi->bi_len);
dump_stack();
gfs2_consist_rgrpd(rbm->rgd);
return;
@@ -736,11 +738,11 @@ void gfs2_clear_rgrpd(struct gfs2_sbd *sdp)

static void gfs2_rindex_print(const struct gfs2_rgrpd *rgd)
{
- pr_info(" ri_addr = %llu\n", (unsigned long long)rgd->rd_addr);
- pr_info(" ri_length = %u\n", rgd->rd_length);
- pr_info(" ri_data0 = %llu\n", (unsigned long long)rgd->rd_data0);
- pr_info(" ri_data = %u\n", rgd->rd_data);
- pr_info(" ri_bitbytes = %u\n", rgd->rd_bitbytes);
+ pr_info("ri_addr = %llu\n", (unsigned long long)rgd->rd_addr);
+ pr_info("ri_length = %u\n", rgd->rd_length);
+ pr_info("ri_data0 = %llu\n", (unsigned long long)rgd->rd_data0);
+ pr_info("ri_data = %u\n", rgd->rd_data);
+ pr_info("ri_bitbytes = %u\n", rgd->rd_bitbytes);
}

/**
diff --git a/fs/gfs2/super.c b/fs/gfs2/super.c
index 584c757..a08c66e 100644
--- a/fs/gfs2/super.c
+++ b/fs/gfs2/super.c
@@ -7,6 +7,8 @@
* of the GNU General Public License version 2.
*/

+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
#include <linux/bio.h>
#include <linux/sched.h>
#include <linux/slab.h>
@@ -175,8 +177,7 @@ int gfs2_mount_args(struct gfs2_args *args, char *options)
break;
case Opt_debug:
if (args->ar_errors == GFS2_ERRORS_PANIC) {
- pr_warn("GFS2: -o debug and -o errors=panic "
- "are mutually exclusive.\n");
+ pr_warn("-o debug and -o errors=panic are mutually exclusive\n");
return -EINVAL;
}
args->ar_debug = 1;
@@ -228,21 +229,21 @@ int gfs2_mount_args(struct gfs2_args *args, char *options)
case Opt_commit:
rv = match_int(&tmp[0], &args->ar_commit);
if (rv || args->ar_commit <= 0) {
- pr_warn("GFS2: commit mount option requires a positive numeric argument\n");
+ pr_warn("commit mount option requires a positive numeric argument\n");
return rv ? rv : -EINVAL;
}
break;
case Opt_statfs_quantum:
rv = match_int(&tmp[0], &args->ar_statfs_quantum);
if (rv || args->ar_statfs_quantum < 0) {
- pr_warn("GFS2: statfs_quantum mount option requires a non-negative numeric argument\n");
+ pr_warn("statfs_quantum mount option requires a non-negative numeric argument\n");
return rv ? rv : -EINVAL;
}
break;
case Opt_quota_quantum:
rv = match_int(&tmp[0], &args->ar_quota_quantum);
if (rv || args->ar_quota_quantum <= 0) {
- pr_warn("GFS2: quota_quantum mount option requires a positive numeric argument\n");
+ pr_warn("quota_quantum mount option requires a positive numeric argument\n");
return rv ? rv : -EINVAL;
}
break;
@@ -259,8 +260,7 @@ int gfs2_mount_args(struct gfs2_args *args, char *options)
break;
case Opt_err_panic:
if (args->ar_debug) {
- pr_warn("GFS2: -o debug and -o errors=panic "
- "are mutually exclusive.\n");
+ pr_warn("-o debug and -o errors=panic are mutually exclusive\n");
return -EINVAL;
}
args->ar_errors = GFS2_ERRORS_PANIC;
@@ -279,7 +279,7 @@ int gfs2_mount_args(struct gfs2_args *args, char *options)
break;
case Opt_error:
default:
- pr_warn("GFS2: invalid mount option: %s\n", o);
+ pr_warn("invalid mount option: %s\n", o);
return -EINVAL;
}
}
diff --git a/fs/gfs2/sys.c b/fs/gfs2/sys.c
index d09f6ed..256354c 100644
--- a/fs/gfs2/sys.c
+++ b/fs/gfs2/sys.c
@@ -7,6 +7,8 @@
* of the GNU General Public License version 2.
*/

+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
#include <linux/sched.h>
#include <linux/spinlock.h>
#include <linux/completion.h>
diff --git a/fs/gfs2/trans.c b/fs/gfs2/trans.c
index 3fe8e34..bead90d 100644
--- a/fs/gfs2/trans.c
+++ b/fs/gfs2/trans.c
@@ -7,6 +7,8 @@
* of the GNU General Public License version 2.
*/

+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
#include <linux/sched.h>
#include <linux/slab.h>
#include <linux/spinlock.h>
@@ -99,13 +101,13 @@ static void gfs2_log_release(struct gfs2_sbd *sdp, unsigned int blks)

static void gfs2_print_trans(const struct gfs2_trans *tr)
{
- pr_warn("GFS2: Transaction created at: %pSR\n", (void *)tr->tr_ip);
- pr_warn("GFS2: blocks=%u revokes=%u reserved=%u touched=%u\n",
- tr->tr_blocks, tr->tr_revokes, tr->tr_reserved, tr->tr_touched);
- pr_warn("GFS2: Buf %u/%u Databuf %u/%u Revoke %u/%u\n",
- tr->tr_num_buf_new, tr->tr_num_buf_rm,
- tr->tr_num_databuf_new, tr->tr_num_databuf_rm,
- tr->tr_num_revoke, tr->tr_num_revoke_rm);
+ pr_warn("Transaction created at: %pSR\n", (void *)tr->tr_ip);
+ pr_warn("blocks=%u revokes=%u reserved=%u touched=%u\n",
+ tr->tr_blocks, tr->tr_revokes, tr->tr_reserved, tr->tr_touched);
+ pr_warn("Buf %u/%u Databuf %u/%u Revoke %u/%u\n",
+ tr->tr_num_buf_new, tr->tr_num_buf_rm,
+ tr->tr_num_databuf_new, tr->tr_num_databuf_rm,
+ tr->tr_num_revoke, tr->tr_num_revoke_rm);
}

void gfs2_trans_end(struct gfs2_sbd *sdp)
@@ -231,8 +233,7 @@ static void meta_lo_add(struct gfs2_sbd *sdp, struct gfs2_bufdata *bd)
set_bit(GLF_DIRTY, &bd->bd_gl->gl_flags);
mh = (struct gfs2_meta_header *)bd->bd_bh->b_data;
if (unlikely(mh->mh_magic != cpu_to_be32(GFS2_MAGIC))) {
- pr_err("Attempting to add uninitialised block to journal "
- "(inplace block=%lld)\n",
+ pr_err("Attempting to add uninitialised block to journal (inplace block=%lld)\n",
(unsigned long long)bd->bd_bh->b_blocknr);
BUG();
}
diff --git a/fs/gfs2/util.c b/fs/gfs2/util.c
index e9d7001..02fb38d 100644
--- a/fs/gfs2/util.c
+++ b/fs/gfs2/util.c
@@ -7,6 +7,8 @@
* of the GNU General Public License version 2.
*/

+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
#include <linux/spinlock.h>
#include <linux/completion.h>
#include <linux/buffer_head.h>
@@ -30,7 +32,7 @@ mempool_t *gfs2_page_pool __read_mostly;

void gfs2_assert_i(struct gfs2_sbd *sdp)
{
- pr_emerg("GFS2: fsid=%s: fatal assertion failed\n", sdp->sd_fsname);
+ pr_emerg("fsid=%s: fatal assertion failed\n", sdp->sd_fsname);
}

int gfs2_lm_withdraw(struct gfs2_sbd *sdp, char *fmt, ...)
@@ -65,7 +67,7 @@ int gfs2_lm_withdraw(struct gfs2_sbd *sdp, char *fmt, ...)
}

if (sdp->sd_args.ar_errors == GFS2_ERRORS_PANIC)
- panic("GFS2: fsid=%s: panic requested.\n", sdp->sd_fsname);
+ panic("GFS2: fsid=%s: panic requested\n", sdp->sd_fsname);

return -1;
}
@@ -104,10 +106,8 @@ int gfs2_assert_warn_i(struct gfs2_sbd *sdp, char *assertion,
return -2;

if (sdp->sd_args.ar_errors == GFS2_ERRORS_WITHDRAW)
- pr_warn("GFS2: fsid=%s: warning: assertion \"%s\" failed\n"
- "GFS2: fsid=%s: function = %s, file = %s, line = %u\n",
- sdp->sd_fsname, assertion,
- sdp->sd_fsname, function, file, line);
+ pr_warn("fsid=%s: warning: assertion \"%s\" failed at function = %s, file = %s, line = %u\n",
+ sdp->sd_fsname, assertion, function, file, line);

if (sdp->sd_args.ar_debug)
BUG();
diff --git a/fs/gfs2/util.h b/fs/gfs2/util.h
index b7ffb09..d365733 100644
--- a/fs/gfs2/util.h
+++ b/fs/gfs2/util.h
@@ -10,22 +10,21 @@
#ifndef __UTIL_DOT_H__
#define __UTIL_DOT_H__

+#ifdef pr_fmt
+#undef pr_fmt
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+#endif
+
#include <linux/mempool.h>

#include "incore.h"

-#define fs_printk(level, fs, fmt, arg...) \
- printk(level "GFS2: fsid=%s: " fmt , (fs)->sd_fsname , ## arg)
-
-#define fs_info(fs, fmt, arg...) \
- fs_printk(KERN_INFO , fs , fmt , ## arg)
-
-#define fs_warn(fs, fmt, arg...) \
- fs_printk(KERN_WARNING , fs , fmt , ## arg)
-
-#define fs_err(fs, fmt, arg...) \
- fs_printk(KERN_ERR, fs , fmt , ## arg)
-
+#define fs_warn(fs, fmt, ...) \
+ pr_warn("fsid=%s: " fmt, (fs)->sd_fsname, ##__VA_ARGS__)
+#define fs_err(fs, fmt, ...) \
+ pr_err("fsid=%s: " fmt, (fs)->sd_fsname, ##__VA_ARGS__)
+#define fs_info(fs, fmt, ...) \
+ pr_info("fsid=%s: " fmt, (fs)->sd_fsname, ##__VA_ARGS__)

void gfs2_assert_i(struct gfs2_sbd *sdp);

@@ -85,7 +84,7 @@ static inline int gfs2_meta_check(struct gfs2_sbd *sdp,
struct gfs2_meta_header *mh = (struct gfs2_meta_header *)bh->b_data;
u32 magic = be32_to_cpu(mh->mh_magic);
if (unlikely(magic != GFS2_MAGIC)) {
- printk(KERN_ERR "GFS2: Magic number missing at %llu\n",
+ pr_err("Magic number missing at %llu\n",
(unsigned long long)bh->b_blocknr);
return -EIO;
}
--
1.8.3.1

2014-04-01 09:17:00

by Steven Whitehouse

[permalink] [raw]
Subject: [PATCH 29/29] GFS2: Fix address space from page function

Now that rgrps use the address space which is part of the super
block, we need to update gfs2_mapping2sbd() to take account of
that. The only way to do that easily is to use a different set
of address_space_operations for rgrps.

Reported-by: Abhi Das <[email protected]>
Tested-by: Abhi Das <[email protected]>
Signed-off-by: Steven Whitehouse <[email protected]>

diff --git a/fs/gfs2/meta_io.c b/fs/gfs2/meta_io.c
index 005e468..2cf09b6 100644
--- a/fs/gfs2/meta_io.c
+++ b/fs/gfs2/meta_io.c
@@ -97,6 +97,11 @@ const struct address_space_operations gfs2_meta_aops = {
.releasepage = gfs2_releasepage,
};

+const struct address_space_operations gfs2_rgrp_aops = {
+ .writepage = gfs2_aspace_writepage,
+ .releasepage = gfs2_releasepage,
+};
+
/**
* gfs2_getbuf - Get a buffer with a given address space
* @gl: the glock
diff --git a/fs/gfs2/meta_io.h b/fs/gfs2/meta_io.h
index 4823b93..ac5d802 100644
--- a/fs/gfs2/meta_io.h
+++ b/fs/gfs2/meta_io.h
@@ -38,12 +38,15 @@ static inline void gfs2_buffer_copy_tail(struct buffer_head *to_bh,
}

extern const struct address_space_operations gfs2_meta_aops;
+extern const struct address_space_operations gfs2_rgrp_aops;

static inline struct gfs2_sbd *gfs2_mapping2sbd(struct address_space *mapping)
{
struct inode *inode = mapping->host;
if (mapping->a_ops == &gfs2_meta_aops)
return (((struct gfs2_glock *)mapping) - 1)->gl_sbd;
+ else if (mapping->a_ops == &gfs2_rgrp_aops)
+ return container_of(mapping, struct gfs2_sbd, sd_aspace);
else
return inode->i_sb->s_fs_info;
}
diff --git a/fs/gfs2/ops_fstype.c b/fs/gfs2/ops_fstype.c
index fba74a2..22f9540 100644
--- a/fs/gfs2/ops_fstype.c
+++ b/fs/gfs2/ops_fstype.c
@@ -106,7 +106,7 @@ static struct gfs2_sbd *init_sbd(struct super_block *sb)
mapping = &sdp->sd_aspace;

address_space_init_once(mapping);
- mapping->a_ops = &gfs2_meta_aops;
+ mapping->a_ops = &gfs2_rgrp_aops;
mapping->host = sb->s_bdev->bd_inode;
mapping->flags = 0;
mapping_set_gfp_mask(mapping, GFP_NOFS);
--
1.8.3.1

2014-04-01 09:16:55

by Steven Whitehouse

[permalink] [raw]
Subject: [PATCH 27/29] GFS2: Fix return value in slot_get()

From: Abhi Das <[email protected]>

ENOSPC was being returned in slot_get inspite of successful
execution of the function. This patch fixes this return
code.

Signed-off-by: Abhi Das <[email protected]>
Signed-off-by: Steven Whitehouse <[email protected]>

diff --git a/fs/gfs2/quota.c b/fs/gfs2/quota.c
index 27f9435..c4effff 100644
--- a/fs/gfs2/quota.c
+++ b/fs/gfs2/quota.c
@@ -332,6 +332,7 @@ static int slot_get(struct gfs2_quota_data *qd)
if (bit < sdp->sd_quota_slots) {
set_bit(bit, sdp->sd_quota_bitmap);
qd->qd_slot = bit;
+ error = 0;
out:
qd->qd_slot_count++;
}
--
1.8.3.1

2014-04-01 09:17:32

by Steven Whitehouse

[permalink] [raw]
Subject: [PATCH 28/29] GFS2: Fix uninitialized VFS inode in gfs2_create_inode

From: Abhi Das <[email protected]>

When gfs2_create_inode() fails due to quota violation, the VFS
inode is not completely uninitialized. This can cause a list
corruption error.

This patch correctly uninitializes the VFS inode when a quota
violation occurs in the gfs2_create_inode codepath.

Resolves: rhbz#1059808
Signed-off-by: Abhi Das <[email protected]>
Signed-off-by: Steven Whitehouse <[email protected]>

diff --git a/fs/gfs2/incore.h b/fs/gfs2/incore.h
index ef26ed9..bdf70c1 100644
--- a/fs/gfs2/incore.h
+++ b/fs/gfs2/incore.h
@@ -371,6 +371,7 @@ enum {
GIF_ALLOC_FAILED = 2,
GIF_SW_PAGED = 3,
GIF_ORDERED = 4,
+ GIF_FREE_VFS_INODE = 5,
};

struct gfs2_inode {
diff --git a/fs/gfs2/inode.c b/fs/gfs2/inode.c
index 69ed57a..28cc7bf 100644
--- a/fs/gfs2/inode.c
+++ b/fs/gfs2/inode.c
@@ -597,7 +597,7 @@ static int gfs2_create_inode(struct inode *dir, struct dentry *dentry,
struct gfs2_sbd *sdp = GFS2_SB(&dip->i_inode);
struct gfs2_glock *io_gl;
struct dentry *d;
- int error;
+ int error, free_vfs_inode = 0;
u32 aflags = 0;
unsigned blocks = 1;
struct gfs2_diradd da = { .bh = NULL, };
@@ -788,15 +788,16 @@ fail_free_acls:
if (acl)
posix_acl_release(acl);
fail_free_vfs_inode:
- free_inode_nonrcu(inode);
- inode = NULL;
+ free_vfs_inode = 1;
fail_gunlock:
gfs2_dir_no_add(&da);
gfs2_glock_dq_uninit(ghs);
if (inode && !IS_ERR(inode)) {
clear_nlink(inode);
- mark_inode_dirty(inode);
- set_bit(GIF_ALLOC_FAILED, &GFS2_I(inode)->i_flags);
+ if (!free_vfs_inode)
+ mark_inode_dirty(inode);
+ set_bit(free_vfs_inode ? GIF_FREE_VFS_INODE : GIF_ALLOC_FAILED,
+ &GFS2_I(inode)->i_flags);
iput(inode);
}
fail:
diff --git a/fs/gfs2/super.c b/fs/gfs2/super.c
index a08c66e..29cacd5 100644
--- a/fs/gfs2/super.c
+++ b/fs/gfs2/super.c
@@ -1248,7 +1248,7 @@ static int gfs2_drop_inode(struct inode *inode)
{
struct gfs2_inode *ip = GFS2_I(inode);

- if (inode->i_nlink) {
+ if (!test_bit(GIF_FREE_VFS_INODE, &ip->i_flags) && inode->i_nlink) {
struct gfs2_glock *gl = ip->i_iopen_gh.gh_gl;
if (gl && test_bit(GLF_DEMOTE, &gl->gl_flags))
clear_nlink(inode);
@@ -1463,6 +1463,11 @@ static void gfs2_evict_inode(struct inode *inode)
struct gfs2_holder gh;
int error;

+ if (test_bit(GIF_FREE_VFS_INODE, &ip->i_flags)) {
+ clear_inode(inode);
+ return;
+ }
+
if (inode->i_nlink || (sb->s_flags & MS_RDONLY))
goto out;

--
1.8.3.1

2014-04-01 09:16:53

by Steven Whitehouse

[permalink] [raw]
Subject: [PATCH 26/29] GFS2: inline function gfs2_set_mode

From: Bob Peterson <[email protected]>

Here is a revised patch based on Steve's feedback:

This patch eliminates function gfs2_set_mode which was only called in
one place, and always returned 0.

Signed-off-by: Bob Peterson <[email protected]>
Signed-off-by: Steven Whitehouse <[email protected]>

diff --git a/fs/gfs2/acl.c b/fs/gfs2/acl.c
index 394dc55..3088e2a 100644
--- a/fs/gfs2/acl.c
+++ b/fs/gfs2/acl.c
@@ -64,18 +64,6 @@ struct posix_acl *gfs2_get_acl(struct inode *inode, int type)
return acl;
}

-static int gfs2_set_mode(struct inode *inode, umode_t mode)
-{
- int error = 0;
-
- if (mode != inode->i_mode) {
- inode->i_mode = mode;
- mark_inode_dirty(inode);
- }
-
- return error;
-}
-
int gfs2_set_acl(struct inode *inode, struct posix_acl *acl, int type)
{
int error;
@@ -98,9 +86,10 @@ int gfs2_set_acl(struct inode *inode, struct posix_acl *acl, int type)
if (error == 0)
acl = NULL;

- error = gfs2_set_mode(inode, mode);
- if (error)
- return error;
+ if (mode != inode->i_mode) {
+ inode->i_mode = mode;
+ mark_inode_dirty(inode);
+ }
}

if (acl) {
--
1.8.3.1

2014-04-01 09:18:22

by Steven Whitehouse

[permalink] [raw]
Subject: [PATCH 25/29] GFS2: Remove extraneous function gfs2_security_init

From: Bob Peterson <[email protected]>

This patch eliminates function gfs2_security_init in favor of just
calling security_inode_init_security directly.

Signed-off-by: Bob Peterson <[email protected]>
Signed-off-by: Steven Whitehouse <[email protected]>

diff --git a/fs/gfs2/inode.c b/fs/gfs2/inode.c
index b52ebf8..69ed57a 100644
--- a/fs/gfs2/inode.c
+++ b/fs/gfs2/inode.c
@@ -571,13 +571,6 @@ static int gfs2_initxattrs(struct inode *inode, const struct xattr *xattr_array,
return err;
}

-static int gfs2_security_init(struct gfs2_inode *dip, struct gfs2_inode *ip,
- const struct qstr *qstr)
-{
- return security_inode_init_security(&ip->i_inode, &dip->i_inode, qstr,
- &gfs2_initxattrs, NULL);
-}
-
/**
* gfs2_create_inode - Create a new inode
* @dir: The parent directory
@@ -758,7 +751,8 @@ static int gfs2_create_inode(struct inode *dir, struct dentry *dentry,
if (error)
goto fail_gunlock3;

- error = gfs2_security_init(dip, ip, name);
+ error = security_inode_init_security(&ip->i_inode, &dip->i_inode, name,
+ &gfs2_initxattrs, NULL);
if (error)
goto fail_gunlock3;

--
1.8.3.1

2014-04-01 09:18:50

by Steven Whitehouse

[permalink] [raw]
Subject: [PATCH 24/29] GFS2: Increase the max number of ACLs

From: Bob Peterson <[email protected]>

This patch increases the maximum number of ACLs from 25 to 300 for
a 4K block size. The value is adjusted accordingly if the block size
is smaller. Note that this is an arbitrary limit with a performance
tradeoff, and that the physical limit is slightly over 500.

Signed-off-by: Bob Peterson <[email protected]>
Signed-off-by: Steven Whitehouse <[email protected]>

diff --git a/fs/gfs2/acl.c b/fs/gfs2/acl.c
index 9c59ebe..394dc55 100644
--- a/fs/gfs2/acl.c
+++ b/fs/gfs2/acl.c
@@ -85,7 +85,7 @@ int gfs2_set_acl(struct inode *inode, struct posix_acl *acl, int type)

BUG_ON(name == NULL);

- if (acl->a_count > GFS2_ACL_MAX_ENTRIES)
+ if (acl->a_count > GFS2_ACL_MAX_ENTRIES(GFS2_SB(inode)))
return -E2BIG;

if (type == ACL_TYPE_ACCESS) {
diff --git a/fs/gfs2/acl.h b/fs/gfs2/acl.h
index 301260c..2d65ec4 100644
--- a/fs/gfs2/acl.h
+++ b/fs/gfs2/acl.h
@@ -14,7 +14,7 @@

#define GFS2_POSIX_ACL_ACCESS "posix_acl_access"
#define GFS2_POSIX_ACL_DEFAULT "posix_acl_default"
-#define GFS2_ACL_MAX_ENTRIES 25
+#define GFS2_ACL_MAX_ENTRIES(sdp) ((300 << (sdp)->sd_sb.sb_bsize_shift) >> 12)

extern struct posix_acl *gfs2_get_acl(struct inode *inode, int type);
extern int gfs2_set_acl(struct inode *inode, struct posix_acl *acl, int type);
--
1.8.3.1

2014-04-01 09:16:47

by Steven Whitehouse

[permalink] [raw]
Subject: [PATCH 19/29] GFS2: Use fs_<level> more often

From: Joe Perches <[email protected]>

Convert a couple of uses of pr_<level> to fs_<level>
Add and use fs_emerg.

Signed-off-by: Joe Perches <[email protected]>
Signed-off-by: Steven Whitehouse <[email protected]>

diff --git a/fs/gfs2/quota.c b/fs/gfs2/quota.c
index 73ed925..27f9435 100644
--- a/fs/gfs2/quota.c
+++ b/fs/gfs2/quota.c
@@ -1083,8 +1083,8 @@ static int print_message(struct gfs2_quota_data *qd, char *type)
{
struct gfs2_sbd *sdp = qd->qd_gl->gl_sbd;

- pr_info("fsid=%s: quota %s for %s %u\n",
- sdp->sd_fsname, type,
+ fs_info(sdp, "quota %s for %s %u\n",
+ type,
(qd->qd_id.type == USRQUOTA) ? "user" : "group",
from_kqid(&init_user_ns, qd->qd_id));

diff --git a/fs/gfs2/util.c b/fs/gfs2/util.c
index 02fb38d..84bf853 100644
--- a/fs/gfs2/util.c
+++ b/fs/gfs2/util.c
@@ -32,7 +32,7 @@ mempool_t *gfs2_page_pool __read_mostly;

void gfs2_assert_i(struct gfs2_sbd *sdp)
{
- pr_emerg("fsid=%s: fatal assertion failed\n", sdp->sd_fsname);
+ fs_emerg(sdp, "fatal assertion failed\n");
}

int gfs2_lm_withdraw(struct gfs2_sbd *sdp, char *fmt, ...)
@@ -106,8 +106,8 @@ int gfs2_assert_warn_i(struct gfs2_sbd *sdp, char *assertion,
return -2;

if (sdp->sd_args.ar_errors == GFS2_ERRORS_WITHDRAW)
- pr_warn("fsid=%s: warning: assertion \"%s\" failed at function = %s, file = %s, line = %u\n",
- sdp->sd_fsname, assertion, function, file, line);
+ fs_warn(sdp, "warning: assertion \"%s\" failed at function = %s, file = %s, line = %u\n",
+ assertion, function, file, line);

if (sdp->sd_args.ar_debug)
BUG();
diff --git a/fs/gfs2/util.h b/fs/gfs2/util.h
index d365733..515cce2 100644
--- a/fs/gfs2/util.h
+++ b/fs/gfs2/util.h
@@ -19,6 +19,8 @@

#include "incore.h"

+#define fs_emerg(fs, fmt, ...) \
+ pr_emerg("fsid=%s: " fmt, (fs)->sd_fsname, ##__VA_ARGS__)
#define fs_warn(fs, fmt, ...) \
pr_warn("fsid=%s: " fmt, (fs)->sd_fsname, ##__VA_ARGS__)
#define fs_err(fs, fmt, ...) \
--
1.8.3.1

2014-04-01 09:19:31

by Steven Whitehouse

[permalink] [raw]
Subject: [PATCH 23/29] GFS2: Re-add a call to log_flush_wait when flushing the journal

From: Bob Peterson <[email protected]>

Upstream commit 34cc178 changed a line of code from calling function
log_flush_commit to calling log_write_header. This had the effect of
eliminating a call to function log_flush_wait. That causes the journal
to skip over log headers, which results in multiple wrap points,
which itself leads to infinite loops in journal replay, both in the
kernel code and fsck.gfs2 code. This patch re-adds that call.

Signed-off-by: Bob Peterson <[email protected]>
Signed-off-by: Steven Whitehouse <[email protected]>

diff --git a/fs/gfs2/log.c b/fs/gfs2/log.c
index edbd461..4a14d50 100644
--- a/fs/gfs2/log.c
+++ b/fs/gfs2/log.c
@@ -702,6 +702,7 @@ void gfs2_log_flush(struct gfs2_sbd *sdp, struct gfs2_glock *gl)
gfs2_log_flush_bio(sdp, WRITE);

if (sdp->sd_log_head != sdp->sd_log_flush_head) {
+ log_flush_wait(sdp);
log_write_header(sdp, 0);
} else if (sdp->sd_log_tail != current_tail(sdp) && !sdp->sd_log_idle){
atomic_dec(&sdp->sd_log_blks_free); /* Adjust for unreserved buffer */
--
1.8.3.1

2014-04-01 09:16:45

by Steven Whitehouse

[permalink] [raw]
Subject: [PATCH 20/29] GFS2: Convert gfs2_lm_withdraw to use fs_err

From: Joe Perches <[email protected]>

vprintk use is not prefixed by a KERN_<LEVEL>,
so emit these messages at KERN_ERR level.

Using %pV can save some code and allow fs_err to
be used, so do it.

Signed-off-by: Joe Perches <[email protected]>
Signed-off-by: Steven Whitehouse <[email protected]>

diff --git a/fs/gfs2/sys.c b/fs/gfs2/sys.c
index 256354c..de25d55 100644
--- a/fs/gfs2/sys.c
+++ b/fs/gfs2/sys.c
@@ -140,9 +140,8 @@ static ssize_t withdraw_store(struct gfs2_sbd *sdp, const char *buf, size_t len)
if (simple_strtol(buf, NULL, 0) != 1)
return -EINVAL;

- gfs2_lm_withdraw(sdp,
- "GFS2: fsid=%s: withdrawing from cluster at user's request\n",
- sdp->sd_fsname);
+ gfs2_lm_withdraw(sdp, "withdrawing from cluster at user's request\n");
+
return len;
}

diff --git a/fs/gfs2/util.c b/fs/gfs2/util.c
index 84bf853..86d2035 100644
--- a/fs/gfs2/util.c
+++ b/fs/gfs2/util.c
@@ -35,18 +35,24 @@ void gfs2_assert_i(struct gfs2_sbd *sdp)
fs_emerg(sdp, "fatal assertion failed\n");
}

-int gfs2_lm_withdraw(struct gfs2_sbd *sdp, char *fmt, ...)
+int gfs2_lm_withdraw(struct gfs2_sbd *sdp, const char *fmt, ...)
{
struct lm_lockstruct *ls = &sdp->sd_lockstruct;
const struct lm_lockops *lm = ls->ls_ops;
va_list args;
+ struct va_format vaf;

if (sdp->sd_args.ar_errors == GFS2_ERRORS_WITHDRAW &&
test_and_set_bit(SDF_SHUTDOWN, &sdp->sd_flags))
return 0;

va_start(args, fmt);
- vprintk(fmt, args);
+
+ vaf.fmt = fmt;
+ vaf.va = &args;
+
+ fs_err(sdp, "%pV", &vaf);
+
va_end(args);

if (sdp->sd_args.ar_errors == GFS2_ERRORS_WITHDRAW) {
@@ -83,10 +89,9 @@ int gfs2_assert_withdraw_i(struct gfs2_sbd *sdp, char *assertion,
{
int me;
me = gfs2_lm_withdraw(sdp,
- "GFS2: fsid=%s: fatal: assertion \"%s\" failed\n"
- "GFS2: fsid=%s: function = %s, file = %s, line = %u\n",
- sdp->sd_fsname, assertion,
- sdp->sd_fsname, function, file, line);
+ "fatal: assertion \"%s\" failed\n"
+ " function = %s, file = %s, line = %u\n",
+ assertion, function, file, line);
dump_stack();
return (me) ? -1 : -2;
}
@@ -136,10 +141,8 @@ int gfs2_consist_i(struct gfs2_sbd *sdp, int cluster_wide, const char *function,
{
int rv;
rv = gfs2_lm_withdraw(sdp,
- "GFS2: fsid=%s: fatal: filesystem consistency error\n"
- "GFS2: fsid=%s: function = %s, file = %s, line = %u\n",
- sdp->sd_fsname,
- sdp->sd_fsname, function, file, line);
+ "fatal: filesystem consistency error - function = %s, file = %s, line = %u\n",
+ function, file, line);
return rv;
}

@@ -155,13 +158,12 @@ int gfs2_consist_inode_i(struct gfs2_inode *ip, int cluster_wide,
struct gfs2_sbd *sdp = GFS2_SB(&ip->i_inode);
int rv;
rv = gfs2_lm_withdraw(sdp,
- "GFS2: fsid=%s: fatal: filesystem consistency error\n"
- "GFS2: fsid=%s: inode = %llu %llu\n"
- "GFS2: fsid=%s: function = %s, file = %s, line = %u\n",
- sdp->sd_fsname,
- sdp->sd_fsname, (unsigned long long)ip->i_no_formal_ino,
- (unsigned long long)ip->i_no_addr,
- sdp->sd_fsname, function, file, line);
+ "fatal: filesystem consistency error\n"
+ " inode = %llu %llu\n"
+ " function = %s, file = %s, line = %u\n",
+ (unsigned long long)ip->i_no_formal_ino,
+ (unsigned long long)ip->i_no_addr,
+ function, file, line);
return rv;
}

@@ -177,12 +179,11 @@ int gfs2_consist_rgrpd_i(struct gfs2_rgrpd *rgd, int cluster_wide,
struct gfs2_sbd *sdp = rgd->rd_sbd;
int rv;
rv = gfs2_lm_withdraw(sdp,
- "GFS2: fsid=%s: fatal: filesystem consistency error\n"
- "GFS2: fsid=%s: RG = %llu\n"
- "GFS2: fsid=%s: function = %s, file = %s, line = %u\n",
- sdp->sd_fsname,
- sdp->sd_fsname, (unsigned long long)rgd->rd_addr,
- sdp->sd_fsname, function, file, line);
+ "fatal: filesystem consistency error\n"
+ " RG = %llu\n"
+ " function = %s, file = %s, line = %u\n",
+ (unsigned long long)rgd->rd_addr,
+ function, file, line);
return rv;
}

@@ -198,12 +199,11 @@ int gfs2_meta_check_ii(struct gfs2_sbd *sdp, struct buffer_head *bh,
{
int me;
me = gfs2_lm_withdraw(sdp,
- "GFS2: fsid=%s: fatal: invalid metadata block\n"
- "GFS2: fsid=%s: bh = %llu (%s)\n"
- "GFS2: fsid=%s: function = %s, file = %s, line = %u\n",
- sdp->sd_fsname,
- sdp->sd_fsname, (unsigned long long)bh->b_blocknr, type,
- sdp->sd_fsname, function, file, line);
+ "fatal: invalid metadata block\n"
+ " bh = %llu (%s)\n"
+ " function = %s, file = %s, line = %u\n",
+ (unsigned long long)bh->b_blocknr, type,
+ function, file, line);
return (me) ? -1 : -2;
}

@@ -219,12 +219,11 @@ int gfs2_metatype_check_ii(struct gfs2_sbd *sdp, struct buffer_head *bh,
{
int me;
me = gfs2_lm_withdraw(sdp,
- "GFS2: fsid=%s: fatal: invalid metadata block\n"
- "GFS2: fsid=%s: bh = %llu (type: exp=%u, found=%u)\n"
- "GFS2: fsid=%s: function = %s, file = %s, line = %u\n",
- sdp->sd_fsname,
- sdp->sd_fsname, (unsigned long long)bh->b_blocknr, type, t,
- sdp->sd_fsname, function, file, line);
+ "fatal: invalid metadata block\n"
+ " bh = %llu (type: exp=%u, found=%u)\n"
+ " function = %s, file = %s, line = %u\n",
+ (unsigned long long)bh->b_blocknr, type, t,
+ function, file, line);
return (me) ? -1 : -2;
}

@@ -239,10 +238,9 @@ int gfs2_io_error_i(struct gfs2_sbd *sdp, const char *function, char *file,
{
int rv;
rv = gfs2_lm_withdraw(sdp,
- "GFS2: fsid=%s: fatal: I/O error\n"
- "GFS2: fsid=%s: function = %s, file = %s, line = %u\n",
- sdp->sd_fsname,
- sdp->sd_fsname, function, file, line);
+ "fatal: I/O error\n"
+ " function = %s, file = %s, line = %u\n",
+ function, file, line);
return rv;
}

@@ -257,12 +255,11 @@ int gfs2_io_error_bh_i(struct gfs2_sbd *sdp, struct buffer_head *bh,
{
int rv;
rv = gfs2_lm_withdraw(sdp,
- "GFS2: fsid=%s: fatal: I/O error\n"
- "GFS2: fsid=%s: block = %llu\n"
- "GFS2: fsid=%s: function = %s, file = %s, line = %u\n",
- sdp->sd_fsname,
- sdp->sd_fsname, (unsigned long long)bh->b_blocknr,
- sdp->sd_fsname, function, file, line);
+ "fatal: I/O error\n"
+ " block = %llu\n"
+ " function = %s, file = %s, line = %u\n",
+ (unsigned long long)bh->b_blocknr,
+ function, file, line);
return rv;
}

diff --git a/fs/gfs2/util.h b/fs/gfs2/util.h
index 515cce2..cbdcbdf 100644
--- a/fs/gfs2/util.h
+++ b/fs/gfs2/util.h
@@ -165,7 +165,7 @@ static inline unsigned int gfs2_tune_get_i(struct gfs2_tune *gt,
#define gfs2_tune_get(sdp, field) \
gfs2_tune_get_i(&(sdp)->sd_tune, &(sdp)->sd_tune.field)

-int gfs2_lm_withdraw(struct gfs2_sbd *sdp, char *fmt, ...);
+__printf(2, 3)
+int gfs2_lm_withdraw(struct gfs2_sbd *sdp, const char *fmt, ...);

#endif /* __UTIL_DOT_H__ */
-
--
1.8.3.1

2014-04-01 09:19:54

by Steven Whitehouse

[permalink] [raw]
Subject: [PATCH 22/29] GFS2: Ensure workqueue is scheduled after noexp request

From: Bob Peterson <[email protected]>

This patch closes a small timing window whereby a request to hold the
transaction glock can get stuck. The problem is that after the DLM has
granted the lock, it can get into a state whereby it doesn't transition
the glock to a held state, due to not having requeued the glock state
machine to finish the transition.

Signed-off-by: Bob Peterson <[email protected]>
Signed-off-by: Steven Whitehouse <[email protected]>

diff --git a/fs/gfs2/glock.c b/fs/gfs2/glock.c
index 52f7478..aec7f73 100644
--- a/fs/gfs2/glock.c
+++ b/fs/gfs2/glock.c
@@ -1047,9 +1047,13 @@ int gfs2_glock_nq(struct gfs2_holder *gh)

spin_lock(&gl->gl_spin);
add_to_queue(gh);
- if ((LM_FLAG_NOEXP & gh->gh_flags) &&
- test_and_clear_bit(GLF_FROZEN, &gl->gl_flags))
+ if (unlikely((LM_FLAG_NOEXP & gh->gh_flags) &&
+ test_and_clear_bit(GLF_FROZEN, &gl->gl_flags))) {
set_bit(GLF_REPLY_PENDING, &gl->gl_flags);
+ gl->gl_lockref.count++;
+ if (queue_delayed_work(glock_workqueue, &gl->gl_work, 0) == 0)
+ gl->gl_lockref.count--;
+ }
run_queue(gl, 1);
spin_unlock(&gl->gl_spin);

--
1.8.3.1

2014-04-01 09:20:44

by Steven Whitehouse

[permalink] [raw]
Subject: [PATCH 21/29] GFS2: check NULL return value in gfs2_ok_to_move

From: Abhi Das <[email protected]>

gfs2_lookupi() can return NULL if the path to the root is broken by
another rename/rmdir. In this case gfs2_ok_to_move() must check for
this NULL pointer and return error.

Resolves: rhbz#1060246
Signed-off-by: Abhi Das <[email protected]>
Signed-off-by: Steven Whitehouse <[email protected]>

diff --git a/fs/gfs2/inode.c b/fs/gfs2/inode.c
index ec455b9..b52ebf8 100644
--- a/fs/gfs2/inode.c
+++ b/fs/gfs2/inode.c
@@ -1299,6 +1299,10 @@ static int gfs2_ok_to_move(struct gfs2_inode *this, struct gfs2_inode *to)
}

tmp = gfs2_lookupi(dir, &gfs2_qdotdot, 1);
+ if (!tmp) {
+ error = -ENOENT;
+ break;
+ }
if (IS_ERR(tmp)) {
error = PTR_ERR(tmp);
break;
--
1.8.3.1

2014-04-01 09:16:35

by Steven Whitehouse

[permalink] [raw]
Subject: [PATCH 13/29] GFS2: replace kmalloc - __vmalloc / memset 0

From: Fabian Frederick <[email protected]>

Use kzalloc and __vmalloc __GFP_ZERO for clean sd_quota_bitmap allocation.

Signed-off-by: Fabian Frederick <[email protected]>
Signed-off-by: Steven Whitehouse <[email protected]>

diff --git a/fs/gfs2/quota.c b/fs/gfs2/quota.c
index 8bec0e31..a5cccf6 100644
--- a/fs/gfs2/quota.c
+++ b/fs/gfs2/quota.c
@@ -1242,14 +1242,13 @@ int gfs2_quota_init(struct gfs2_sbd *sdp)
bm_size = DIV_ROUND_UP(sdp->sd_quota_slots, 8 * sizeof(unsigned long));
bm_size *= sizeof(unsigned long);
error = -ENOMEM;
- sdp->sd_quota_bitmap = kmalloc(bm_size, GFP_NOFS|__GFP_NOWARN);
+ sdp->sd_quota_bitmap = kzalloc(bm_size, GFP_NOFS | __GFP_NOWARN);
if (sdp->sd_quota_bitmap == NULL)
- sdp->sd_quota_bitmap = __vmalloc(bm_size, GFP_NOFS, PAGE_KERNEL);
+ sdp->sd_quota_bitmap = __vmalloc(bm_size, GFP_NOFS |
+ __GFP_ZERO, PAGE_KERNEL);
if (!sdp->sd_quota_bitmap)
return error;

- memset(sdp->sd_quota_bitmap, 0, bm_size);
-
for (x = 0; x < blocks; x++) {
struct buffer_head *bh;
const struct gfs2_quota_change *qc;
--
1.8.3.1

2014-04-01 09:21:43

by Steven Whitehouse

[permalink] [raw]
Subject: [PATCH 17/29] GFS2: Move recovery variables to journal structure in memory

From: Bob Peterson <[email protected]>

If multiple nodes fail and their recovery work runs simultaneously, they
would use the same unprotected variables in the superblock. For example,
they would stomp on each other's revoked blocks lists, which resulted
in file system metadata corruption. This patch moves the necessary
variables so that each journal has its own separate area for tracking
its journal replay.

Signed-off-by: Bob Peterson <[email protected]>
Signed-off-by: Steven Whitehouse <[email protected]>

diff --git a/fs/gfs2/incore.h b/fs/gfs2/incore.h
index 456d8fa..ef26ed9 100644
--- a/fs/gfs2/incore.h
+++ b/fs/gfs2/incore.h
@@ -503,6 +503,15 @@ struct gfs2_jdesc {
unsigned int jd_jid;
unsigned int jd_blocks;
int jd_recover_error;
+ /* Replay stuff */
+
+ unsigned int jd_found_blocks;
+ unsigned int jd_found_revokes;
+ unsigned int jd_replayed_blocks;
+
+ struct list_head jd_revoke_list;
+ unsigned int jd_replay_tail;
+
};

struct gfs2_statfs_change_host {
@@ -782,15 +791,6 @@ struct gfs2_sbd {
struct list_head sd_ail1_list;
struct list_head sd_ail2_list;

- /* Replay stuff */
-
- struct list_head sd_revoke_list;
- unsigned int sd_replay_tail;
-
- unsigned int sd_found_blocks;
- unsigned int sd_found_revokes;
- unsigned int sd_replayed_blocks;
-
/* For quiescing the filesystem */
struct gfs2_holder sd_freeze_gh;

diff --git a/fs/gfs2/lops.c b/fs/gfs2/lops.c
index ae1d635..a294d8d 100644
--- a/fs/gfs2/lops.c
+++ b/fs/gfs2/lops.c
@@ -520,13 +520,11 @@ static void buf_lo_after_commit(struct gfs2_sbd *sdp, struct gfs2_trans *tr)
static void buf_lo_before_scan(struct gfs2_jdesc *jd,
struct gfs2_log_header_host *head, int pass)
{
- struct gfs2_sbd *sdp = GFS2_SB(jd->jd_inode);
-
if (pass != 0)
return;

- sdp->sd_found_blocks = 0;
- sdp->sd_replayed_blocks = 0;
+ jd->jd_found_blocks = 0;
+ jd->jd_replayed_blocks = 0;
}

static int buf_lo_scan_elements(struct gfs2_jdesc *jd, unsigned int start,
@@ -549,9 +547,9 @@ static int buf_lo_scan_elements(struct gfs2_jdesc *jd, unsigned int start,
for (; blks; gfs2_replay_incr_blk(sdp, &start), blks--) {
blkno = be64_to_cpu(*ptr++);

- sdp->sd_found_blocks++;
+ jd->jd_found_blocks++;

- if (gfs2_revoke_check(sdp, blkno, start))
+ if (gfs2_revoke_check(jd, blkno, start))
continue;

error = gfs2_replay_read_block(jd, start, &bh_log);
@@ -572,7 +570,7 @@ static int buf_lo_scan_elements(struct gfs2_jdesc *jd, unsigned int start,
if (error)
break;

- sdp->sd_replayed_blocks++;
+ jd->jd_replayed_blocks++;
}

return error;
@@ -615,7 +613,7 @@ static void buf_lo_after_scan(struct gfs2_jdesc *jd, int error, int pass)
gfs2_meta_sync(ip->i_gl);

fs_info(sdp, "jid=%u: Replayed %u of %u blocks\n",
- jd->jd_jid, sdp->sd_replayed_blocks, sdp->sd_found_blocks);
+ jd->jd_jid, jd->jd_replayed_blocks, jd->jd_found_blocks);
}

static void revoke_lo_before_commit(struct gfs2_sbd *sdp, struct gfs2_trans *tr)
@@ -677,13 +675,11 @@ static void revoke_lo_after_commit(struct gfs2_sbd *sdp, struct gfs2_trans *tr)
static void revoke_lo_before_scan(struct gfs2_jdesc *jd,
struct gfs2_log_header_host *head, int pass)
{
- struct gfs2_sbd *sdp = GFS2_SB(jd->jd_inode);
-
if (pass != 0)
return;

- sdp->sd_found_revokes = 0;
- sdp->sd_replay_tail = head->lh_tail;
+ jd->jd_found_revokes = 0;
+ jd->jd_replay_tail = head->lh_tail;
}

static int revoke_lo_scan_elements(struct gfs2_jdesc *jd, unsigned int start,
@@ -715,13 +711,13 @@ static int revoke_lo_scan_elements(struct gfs2_jdesc *jd, unsigned int start,
while (offset + sizeof(u64) <= sdp->sd_sb.sb_bsize) {
blkno = be64_to_cpu(*(__be64 *)(bh->b_data + offset));

- error = gfs2_revoke_add(sdp, blkno, start);
+ error = gfs2_revoke_add(jd, blkno, start);
if (error < 0) {
brelse(bh);
return error;
}
else if (error)
- sdp->sd_found_revokes++;
+ jd->jd_found_revokes++;

if (!--revokes)
break;
@@ -741,16 +737,16 @@ static void revoke_lo_after_scan(struct gfs2_jdesc *jd, int error, int pass)
struct gfs2_sbd *sdp = GFS2_SB(jd->jd_inode);

if (error) {
- gfs2_revoke_clean(sdp);
+ gfs2_revoke_clean(jd);
return;
}
if (pass != 1)
return;

fs_info(sdp, "jid=%u: Found %u revoke tags\n",
- jd->jd_jid, sdp->sd_found_revokes);
+ jd->jd_jid, jd->jd_found_revokes);

- gfs2_revoke_clean(sdp);
+ gfs2_revoke_clean(jd);
}

/**
@@ -789,9 +785,9 @@ static int databuf_lo_scan_elements(struct gfs2_jdesc *jd, unsigned int start,
blkno = be64_to_cpu(*ptr++);
esc = be64_to_cpu(*ptr++);

- sdp->sd_found_blocks++;
+ jd->jd_found_blocks++;

- if (gfs2_revoke_check(sdp, blkno, start))
+ if (gfs2_revoke_check(jd, blkno, start))
continue;

error = gfs2_replay_read_block(jd, start, &bh_log);
@@ -811,7 +807,7 @@ static int databuf_lo_scan_elements(struct gfs2_jdesc *jd, unsigned int start,
brelse(bh_log);
brelse(bh_ip);

- sdp->sd_replayed_blocks++;
+ jd->jd_replayed_blocks++;
}

return error;
@@ -835,7 +831,7 @@ static void databuf_lo_after_scan(struct gfs2_jdesc *jd, int error, int pass)
gfs2_meta_sync(ip->i_gl);

fs_info(sdp, "jid=%u: Replayed %u of %u data blocks\n",
- jd->jd_jid, sdp->sd_replayed_blocks, sdp->sd_found_blocks);
+ jd->jd_jid, jd->jd_replayed_blocks, jd->jd_found_blocks);
}

static void databuf_lo_after_commit(struct gfs2_sbd *sdp, struct gfs2_trans *tr)
diff --git a/fs/gfs2/ops_fstype.c b/fs/gfs2/ops_fstype.c
index c3ef844..ea9c35c 100644
--- a/fs/gfs2/ops_fstype.c
+++ b/fs/gfs2/ops_fstype.c
@@ -128,8 +128,6 @@ static struct gfs2_sbd *init_sbd(struct super_block *sb)
atomic_set(&sdp->sd_log_in_flight, 0);
init_waitqueue_head(&sdp->sd_log_flush_wait);

- INIT_LIST_HEAD(&sdp->sd_revoke_list);
-
return sdp;
}

@@ -575,6 +573,8 @@ static int gfs2_jindex_hold(struct gfs2_sbd *sdp, struct gfs2_holder *ji_gh)
break;

INIT_LIST_HEAD(&jd->extent_list);
+ INIT_LIST_HEAD(&jd->jd_revoke_list);
+
INIT_WORK(&jd->jd_work, gfs2_recover_func);
jd->jd_inode = gfs2_lookupi(sdp->sd_jindex, &name, 1);
if (!jd->jd_inode || IS_ERR(jd->jd_inode)) {
diff --git a/fs/gfs2/recovery.c b/fs/gfs2/recovery.c
index 963b2d7..7ad4094 100644
--- a/fs/gfs2/recovery.c
+++ b/fs/gfs2/recovery.c
@@ -52,9 +52,9 @@ int gfs2_replay_read_block(struct gfs2_jdesc *jd, unsigned int blk,
return error;
}

-int gfs2_revoke_add(struct gfs2_sbd *sdp, u64 blkno, unsigned int where)
+int gfs2_revoke_add(struct gfs2_jdesc *jd, u64 blkno, unsigned int where)
{
- struct list_head *head = &sdp->sd_revoke_list;
+ struct list_head *head = &jd->jd_revoke_list;
struct gfs2_revoke_replay *rr;
int found = 0;

@@ -81,13 +81,13 @@ int gfs2_revoke_add(struct gfs2_sbd *sdp, u64 blkno, unsigned int where)
return 1;
}

-int gfs2_revoke_check(struct gfs2_sbd *sdp, u64 blkno, unsigned int where)
+int gfs2_revoke_check(struct gfs2_jdesc *jd, u64 blkno, unsigned int where)
{
struct gfs2_revoke_replay *rr;
int wrap, a, b, revoke;
int found = 0;

- list_for_each_entry(rr, &sdp->sd_revoke_list, rr_list) {
+ list_for_each_entry(rr, &jd->jd_revoke_list, rr_list) {
if (rr->rr_blkno == blkno) {
found = 1;
break;
@@ -97,17 +97,17 @@ int gfs2_revoke_check(struct gfs2_sbd *sdp, u64 blkno, unsigned int where)
if (!found)
return 0;

- wrap = (rr->rr_where < sdp->sd_replay_tail);
- a = (sdp->sd_replay_tail < where);
+ wrap = (rr->rr_where < jd->jd_replay_tail);
+ a = (jd->jd_replay_tail < where);
b = (where < rr->rr_where);
revoke = (wrap) ? (a || b) : (a && b);

return revoke;
}

-void gfs2_revoke_clean(struct gfs2_sbd *sdp)
+void gfs2_revoke_clean(struct gfs2_jdesc *jd)
{
- struct list_head *head = &sdp->sd_revoke_list;
+ struct list_head *head = &jd->jd_revoke_list;
struct gfs2_revoke_replay *rr;

while (!list_empty(head)) {
diff --git a/fs/gfs2/recovery.h b/fs/gfs2/recovery.h
index 2226136..6142836 100644
--- a/fs/gfs2/recovery.h
+++ b/fs/gfs2/recovery.h
@@ -23,9 +23,9 @@ static inline void gfs2_replay_incr_blk(struct gfs2_sbd *sdp, unsigned int *blk)
extern int gfs2_replay_read_block(struct gfs2_jdesc *jd, unsigned int blk,
struct buffer_head **bh);

-extern int gfs2_revoke_add(struct gfs2_sbd *sdp, u64 blkno, unsigned int where);
-extern int gfs2_revoke_check(struct gfs2_sbd *sdp, u64 blkno, unsigned int where);
-extern void gfs2_revoke_clean(struct gfs2_sbd *sdp);
+extern int gfs2_revoke_add(struct gfs2_jdesc *jd, u64 blkno, unsigned int where);
+extern int gfs2_revoke_check(struct gfs2_jdesc *jd, u64 blkno, unsigned int where);
+extern void gfs2_revoke_clean(struct gfs2_jdesc *jd);

extern int gfs2_find_jhead(struct gfs2_jdesc *jd,
struct gfs2_log_header_host *head);
--
1.8.3.1

2014-04-01 09:22:14

by Steven Whitehouse

[permalink] [raw]
Subject: [PATCH 16/29] GFS2: global conversion to pr_foo()

From: Fabian Frederick <[email protected]>

-All printk(KERN_foo converted to pr_foo().
-Messages updated to fit in 80 columns.
-fs_macros converted as well.
-fs_printk removed.

Signed-off-by: Fabian Frederick <[email protected]>
Signed-off-by: Steven Whitehouse <[email protected]>

diff --git a/fs/gfs2/dir.c b/fs/gfs2/dir.c
index ffcfdd1..39c7081 100644
--- a/fs/gfs2/dir.c
+++ b/fs/gfs2/dir.c
@@ -507,7 +507,7 @@ static int gfs2_check_dirent(struct gfs2_dirent *dent, unsigned int offset,
goto error;
return 0;
error:
- printk(KERN_WARNING "gfs2_check_dirent: %s (%s)\n", msg,
+ pr_warn("gfs2_check_dirent: %s (%s)\n", msg,
first ? "first in block" : "not first in block");
return -EIO;
}
@@ -531,8 +531,8 @@ static int gfs2_dirent_offset(const void *buf)
}
return offset;
wrong_type:
- printk(KERN_WARNING "gfs2_scan_dirent: wrong block type %u\n",
- be32_to_cpu(h->mh_type));
+ pr_warn("gfs2_scan_dirent: wrong block type %u\n",
+ be32_to_cpu(h->mh_type));
return -1;
}

@@ -1006,7 +1006,7 @@ static int dir_split_leaf(struct inode *inode, const struct qstr *name)
len = 1 << (dip->i_depth - be16_to_cpu(oleaf->lf_depth));
half_len = len >> 1;
if (!half_len) {
- printk(KERN_WARNING "i_depth %u lf_depth %u index %u\n", dip->i_depth, be16_to_cpu(oleaf->lf_depth), index);
+ pr_warn("i_depth %u lf_depth %u index %u\n", dip->i_depth, be16_to_cpu(oleaf->lf_depth), index);
gfs2_consist_inode(dip);
error = -EIO;
goto fail_brelse;
diff --git a/fs/gfs2/glock.c b/fs/gfs2/glock.c
index ca0be6c..329dc80 100644
--- a/fs/gfs2/glock.c
+++ b/fs/gfs2/glock.c
@@ -468,7 +468,7 @@ retry:
do_xmote(gl, gh, LM_ST_UNLOCKED);
break;
default: /* Everything else */
- printk(KERN_ERR "GFS2: wanted %u got %u\n", gl->gl_target, state);
+ pr_err("GFS2: wanted %u got %u\n", gl->gl_target, state);
GLOCK_BUG_ON(gl, 1);
}
spin_unlock(&gl->gl_spin);
@@ -542,7 +542,7 @@ __acquires(&gl->gl_spin)
/* lock_dlm */
ret = sdp->sd_lockstruct.ls_ops->lm_lock(gl, target, lck_flags);
if (ret) {
- printk(KERN_ERR "GFS2: lm_lock ret %d\n", ret);
+ pr_err("GFS2: lm_lock ret %d\n", ret);
GLOCK_BUG_ON(gl, 1);
}
} else { /* lock_nolock */
@@ -935,7 +935,7 @@ void gfs2_print_dbg(struct seq_file *seq, const char *fmt, ...)
vaf.fmt = fmt;
vaf.va = &args;

- printk(KERN_ERR " %pV", &vaf);
+ pr_err(" %pV", &vaf);
}

va_end(args);
@@ -1010,13 +1010,13 @@ do_cancel:
return;

trap_recursive:
- printk(KERN_ERR "original: %pSR\n", (void *)gh2->gh_ip);
- printk(KERN_ERR "pid: %d\n", pid_nr(gh2->gh_owner_pid));
- printk(KERN_ERR "lock type: %d req lock state : %d\n",
+ pr_err("original: %pSR\n", (void *)gh2->gh_ip);
+ pr_err("pid: %d\n", pid_nr(gh2->gh_owner_pid));
+ pr_err("lock type: %d req lock state : %d\n",
gh2->gh_gl->gl_name.ln_type, gh2->gh_state);
- printk(KERN_ERR "new: %pSR\n", (void *)gh->gh_ip);
- printk(KERN_ERR "pid: %d\n", pid_nr(gh->gh_owner_pid));
- printk(KERN_ERR "lock type: %d req lock state : %d\n",
+ pr_err("new: %pSR\n", (void *)gh->gh_ip);
+ pr_err("pid: %d\n", pid_nr(gh->gh_owner_pid));
+ pr_err("lock type: %d req lock state : %d\n",
gh->gh_gl->gl_name.ln_type, gh->gh_state);
gfs2_dump_glock(NULL, gl);
BUG();
diff --git a/fs/gfs2/lock_dlm.c b/fs/gfs2/lock_dlm.c
index 6b97d98..a664ddd 100644
--- a/fs/gfs2/lock_dlm.c
+++ b/fs/gfs2/lock_dlm.c
@@ -176,7 +176,7 @@ static void gdlm_bast(void *arg, int mode)
gfs2_glock_cb(gl, LM_ST_SHARED);
break;
default:
- printk(KERN_ERR "unknown bast mode %d", mode);
+ pr_err("unknown bast mode %d", mode);
BUG();
}
}
@@ -195,7 +195,7 @@ static int make_mode(const unsigned int lmstate)
case LM_ST_SHARED:
return DLM_LOCK_PR;
}
- printk(KERN_ERR "unknown LM state %d", lmstate);
+ pr_err("unknown LM state %d", lmstate);
BUG();
return -1;
}
@@ -308,8 +308,7 @@ static void gdlm_put_lock(struct gfs2_glock *gl)
error = dlm_unlock(ls->ls_dlm, gl->gl_lksb.sb_lkid, DLM_LKF_VALBLK,
NULL, gl);
if (error) {
- printk(KERN_ERR "gdlm_unlock %x,%llx err=%d\n",
- gl->gl_name.ln_type,
+ pr_err("gdlm_unlock %x,%llx err=%d\n", gl->gl_name.ln_type,
(unsigned long long)gl->gl_name.ln_number, error);
return;
}
diff --git a/fs/gfs2/main.c b/fs/gfs2/main.c
index c272e73..ae9b02b 100644
--- a/fs/gfs2/main.c
+++ b/fs/gfs2/main.c
@@ -165,7 +165,7 @@ static int __init init_gfs2_fs(void)

gfs2_register_debugfs();

- printk("GFS2 installed\n");
+ pr_info("GFS2 installed\n");

return 0;

diff --git a/fs/gfs2/ops_fstype.c b/fs/gfs2/ops_fstype.c
index e0d0db5..c3ef844 100644
--- a/fs/gfs2/ops_fstype.c
+++ b/fs/gfs2/ops_fstype.c
@@ -152,7 +152,7 @@ static int gfs2_check_sb(struct gfs2_sbd *sdp, int silent)
if (sb->sb_magic != GFS2_MAGIC ||
sb->sb_type != GFS2_METATYPE_SB) {
if (!silent)
- printk(KERN_WARNING "GFS2: not a GFS2 filesystem\n");
+ pr_warn("GFS2: not a GFS2 filesystem\n");
return -EINVAL;
}

@@ -174,7 +174,7 @@ static void end_bio_io_page(struct bio *bio, int error)
if (!error)
SetPageUptodate(page);
else
- printk(KERN_WARNING "gfs2: error %d reading superblock\n", error);
+ pr_warn("gfs2: error %d reading superblock\n", error);
unlock_page(page);
}

@@ -945,7 +945,7 @@ static int gfs2_lm_mount(struct gfs2_sbd *sdp, int silent)
lm = &gfs2_dlm_ops;
#endif
} else {
- printk(KERN_INFO "GFS2: can't find protocol %s\n", proto);
+ pr_info("GFS2: can't find protocol %s\n", proto);
return -ENOENT;
}

@@ -1052,7 +1052,7 @@ static int fill_super(struct super_block *sb, struct gfs2_args *args, int silent

sdp = init_sbd(sb);
if (!sdp) {
- printk(KERN_WARNING "GFS2: can't alloc struct gfs2_sbd\n");
+ pr_warn("GFS2: can't alloc struct gfs2_sbd\n");
return -ENOMEM;
}
sdp->sd_args = *args;
@@ -1300,7 +1300,7 @@ static struct dentry *gfs2_mount(struct file_system_type *fs_type, int flags,

error = gfs2_mount_args(&args, data);
if (error) {
- printk(KERN_WARNING "GFS2: can't parse mount arguments\n");
+ pr_warn("GFS2: can't parse mount arguments\n");
goto error_super;
}

@@ -1350,7 +1350,7 @@ static struct dentry *gfs2_mount_meta(struct file_system_type *fs_type,

error = kern_path(dev_name, LOOKUP_FOLLOW, &path);
if (error) {
- printk(KERN_WARNING "GFS2: path_lookup on %s returned error %d\n",
+ pr_warn("GFS2: path_lookup on %s returned error %d\n",
dev_name, error);
return ERR_PTR(error);
}
@@ -1358,7 +1358,7 @@ static struct dentry *gfs2_mount_meta(struct file_system_type *fs_type,
path.dentry->d_inode->i_sb->s_bdev);
path_put(&path);
if (IS_ERR(s)) {
- printk(KERN_WARNING "GFS2: gfs2 mount does not exist\n");
+ pr_warn("GFS2: gfs2 mount does not exist\n");
return ERR_CAST(s);
}
if ((flags ^ s->s_flags) & MS_RDONLY) {
diff --git a/fs/gfs2/quota.c b/fs/gfs2/quota.c
index a5cccf6..6e25ee4 100644
--- a/fs/gfs2/quota.c
+++ b/fs/gfs2/quota.c
@@ -1081,7 +1081,7 @@ static int print_message(struct gfs2_quota_data *qd, char *type)
{
struct gfs2_sbd *sdp = qd->qd_gl->gl_sbd;

- printk(KERN_INFO "GFS2: fsid=%s: quota %s for %s %u\n",
+ pr_info("GFS2: fsid=%s: quota %s for %s %u\n",
sdp->sd_fsname, type,
(qd->qd_id.type == USRQUOTA) ? "user" : "group",
from_kqid(&init_user_ns, qd->qd_id));
diff --git a/fs/gfs2/rgrp.c b/fs/gfs2/rgrp.c
index f585746..8d12038 100644
--- a/fs/gfs2/rgrp.c
+++ b/fs/gfs2/rgrp.c
@@ -99,11 +99,11 @@ static inline void gfs2_setbit(const struct gfs2_rbm *rbm, bool do_clone,
cur_state = (*byte1 >> bit) & GFS2_BIT_MASK;

if (unlikely(!valid_change[new_state * 4 + cur_state])) {
- printk(KERN_WARNING "GFS2: buf_blk = 0x%x old_state=%d, "
+ pr_warn("GFS2: buf_blk = 0x%x old_state=%d, "
"new_state=%d\n", rbm->offset, cur_state, new_state);
- printk(KERN_WARNING "GFS2: rgrp=0x%llx bi_start=0x%x\n",
+ pr_warn("GFS2: rgrp=0x%llx bi_start=0x%x\n",
(unsigned long long)rbm->rgd->rd_addr, bi->bi_start);
- printk(KERN_WARNING "GFS2: bi_offset=0x%x bi_len=0x%x\n",
+ pr_warn("GFS2: bi_offset=0x%x bi_len=0x%x\n",
bi->bi_offset, bi->bi_len);
dump_stack();
gfs2_consist_rgrpd(rbm->rgd);
@@ -736,11 +736,11 @@ void gfs2_clear_rgrpd(struct gfs2_sbd *sdp)

static void gfs2_rindex_print(const struct gfs2_rgrpd *rgd)
{
- printk(KERN_INFO " ri_addr = %llu\n", (unsigned long long)rgd->rd_addr);
- printk(KERN_INFO " ri_length = %u\n", rgd->rd_length);
- printk(KERN_INFO " ri_data0 = %llu\n", (unsigned long long)rgd->rd_data0);
- printk(KERN_INFO " ri_data = %u\n", rgd->rd_data);
- printk(KERN_INFO " ri_bitbytes = %u\n", rgd->rd_bitbytes);
+ pr_info(" ri_addr = %llu\n", (unsigned long long)rgd->rd_addr);
+ pr_info(" ri_length = %u\n", rgd->rd_length);
+ pr_info(" ri_data0 = %llu\n", (unsigned long long)rgd->rd_data0);
+ pr_info(" ri_data = %u\n", rgd->rd_data);
+ pr_info(" ri_bitbytes = %u\n", rgd->rd_bitbytes);
}

/**
@@ -2278,7 +2278,7 @@ int gfs2_alloc_blocks(struct gfs2_inode *ip, u64 *bn, unsigned int *nblocks,
}
}
if (rbm.rgd->rd_free < *nblocks) {
- printk(KERN_WARNING "nblocks=%u\n", *nblocks);
+ pr_warn("nblocks=%u\n", *nblocks);
goto rgrp_error;
}

diff --git a/fs/gfs2/super.c b/fs/gfs2/super.c
index 2574744..584c757 100644
--- a/fs/gfs2/super.c
+++ b/fs/gfs2/super.c
@@ -175,7 +175,7 @@ int gfs2_mount_args(struct gfs2_args *args, char *options)
break;
case Opt_debug:
if (args->ar_errors == GFS2_ERRORS_PANIC) {
- printk(KERN_WARNING "GFS2: -o debug and -o errors=panic "
+ pr_warn("GFS2: -o debug and -o errors=panic "
"are mutually exclusive.\n");
return -EINVAL;
}
@@ -228,21 +228,21 @@ int gfs2_mount_args(struct gfs2_args *args, char *options)
case Opt_commit:
rv = match_int(&tmp[0], &args->ar_commit);
if (rv || args->ar_commit <= 0) {
- printk(KERN_WARNING "GFS2: commit mount option requires a positive numeric argument\n");
+ pr_warn("GFS2: commit mount option requires a positive numeric argument\n");
return rv ? rv : -EINVAL;
}
break;
case Opt_statfs_quantum:
rv = match_int(&tmp[0], &args->ar_statfs_quantum);
if (rv || args->ar_statfs_quantum < 0) {
- printk(KERN_WARNING "GFS2: statfs_quantum mount option requires a non-negative numeric argument\n");
+ pr_warn("GFS2: statfs_quantum mount option requires a non-negative numeric argument\n");
return rv ? rv : -EINVAL;
}
break;
case Opt_quota_quantum:
rv = match_int(&tmp[0], &args->ar_quota_quantum);
if (rv || args->ar_quota_quantum <= 0) {
- printk(KERN_WARNING "GFS2: quota_quantum mount option requires a positive numeric argument\n");
+ pr_warn("GFS2: quota_quantum mount option requires a positive numeric argument\n");
return rv ? rv : -EINVAL;
}
break;
@@ -250,7 +250,7 @@ int gfs2_mount_args(struct gfs2_args *args, char *options)
rv = match_int(&tmp[0], &args->ar_statfs_percent);
if (rv || args->ar_statfs_percent < 0 ||
args->ar_statfs_percent > 100) {
- printk(KERN_WARNING "statfs_percent mount option requires a numeric argument between 0 and 100\n");
+ pr_warn("statfs_percent mount option requires a numeric argument between 0 and 100\n");
return rv ? rv : -EINVAL;
}
break;
@@ -259,7 +259,7 @@ int gfs2_mount_args(struct gfs2_args *args, char *options)
break;
case Opt_err_panic:
if (args->ar_debug) {
- printk(KERN_WARNING "GFS2: -o debug and -o errors=panic "
+ pr_warn("GFS2: -o debug and -o errors=panic "
"are mutually exclusive.\n");
return -EINVAL;
}
@@ -279,7 +279,7 @@ int gfs2_mount_args(struct gfs2_args *args, char *options)
break;
case Opt_error:
default:
- printk(KERN_WARNING "GFS2: invalid mount option: %s\n", o);
+ pr_warn("GFS2: invalid mount option: %s\n", o);
return -EINVAL;
}
}
diff --git a/fs/gfs2/trans.c b/fs/gfs2/trans.c
index 295f400..3fe8e34 100644
--- a/fs/gfs2/trans.c
+++ b/fs/gfs2/trans.c
@@ -99,11 +99,10 @@ static void gfs2_log_release(struct gfs2_sbd *sdp, unsigned int blks)

static void gfs2_print_trans(const struct gfs2_trans *tr)
{
- printk(KERN_WARNING "GFS2: Transaction created at: %pSR\n",
- (void *)tr->tr_ip);
- printk(KERN_WARNING "GFS2: blocks=%u revokes=%u reserved=%u touched=%u\n",
+ pr_warn("GFS2: Transaction created at: %pSR\n", (void *)tr->tr_ip);
+ pr_warn("GFS2: blocks=%u revokes=%u reserved=%u touched=%u\n",
tr->tr_blocks, tr->tr_revokes, tr->tr_reserved, tr->tr_touched);
- printk(KERN_WARNING "GFS2: Buf %u/%u Databuf %u/%u Revoke %u/%u\n",
+ pr_warn("GFS2: Buf %u/%u Databuf %u/%u Revoke %u/%u\n",
tr->tr_num_buf_new, tr->tr_num_buf_rm,
tr->tr_num_databuf_new, tr->tr_num_databuf_rm,
tr->tr_num_revoke, tr->tr_num_revoke_rm);
@@ -232,8 +231,8 @@ static void meta_lo_add(struct gfs2_sbd *sdp, struct gfs2_bufdata *bd)
set_bit(GLF_DIRTY, &bd->bd_gl->gl_flags);
mh = (struct gfs2_meta_header *)bd->bd_bh->b_data;
if (unlikely(mh->mh_magic != cpu_to_be32(GFS2_MAGIC))) {
- printk(KERN_ERR
- "Attempting to add uninitialised block to journal (inplace block=%lld)\n",
+ pr_err("Attempting to add uninitialised block to journal "
+ "(inplace block=%lld)\n",
(unsigned long long)bd->bd_bh->b_blocknr);
BUG();
}
diff --git a/fs/gfs2/util.c b/fs/gfs2/util.c
index f7109f6..e9d7001 100644
--- a/fs/gfs2/util.c
+++ b/fs/gfs2/util.c
@@ -30,8 +30,7 @@ mempool_t *gfs2_page_pool __read_mostly;

void gfs2_assert_i(struct gfs2_sbd *sdp)
{
- printk(KERN_EMERG "GFS2: fsid=%s: fatal assertion failed\n",
- sdp->sd_fsname);
+ pr_emerg("GFS2: fsid=%s: fatal assertion failed\n", sdp->sd_fsname);
}

int gfs2_lm_withdraw(struct gfs2_sbd *sdp, char *fmt, ...)
@@ -105,8 +104,7 @@ int gfs2_assert_warn_i(struct gfs2_sbd *sdp, char *assertion,
return -2;

if (sdp->sd_args.ar_errors == GFS2_ERRORS_WITHDRAW)
- printk(KERN_WARNING
- "GFS2: fsid=%s: warning: assertion \"%s\" failed\n"
+ pr_warn("GFS2: fsid=%s: warning: assertion \"%s\" failed\n"
"GFS2: fsid=%s: function = %s, file = %s, line = %u\n",
sdp->sd_fsname, assertion,
sdp->sd_fsname, function, file, line);
--
1.8.3.1

2014-04-01 09:22:46

by Steven Whitehouse

[permalink] [raw]
Subject: [PATCH 14/29] GFS2: Clean up journal extent mapping

This patch fixes a long standing issue in mapping the journal
extents. Most journals will consist of only a single extent,
and although the cache took account of that by merging extents,
it did not actually map large extents, but instead was doing a
block by block mapping. Since the journal was only being mapped
on mount, this was not normally noticeable.

With the updated code, it is now possible to use the same extent
mapping system during journal recovery (which will be added in a
later patch). This will allow checking of the integrity of the
journal before any reply of the journal content is attempted. For
this reason the code is moving to bmap.c, since it will be used
more widely in due course.

An exercise left for the reader is to compare the new function
gfs2_map_journal_extents() with gfs2_write_alloc_required()

Additionally, should there be a failure, the error reporting is
also updated to show more detail about what went wrong.

Signed-off-by: Steven Whitehouse <[email protected]>

diff --git a/fs/gfs2/bmap.c b/fs/gfs2/bmap.c
index fe0500c..c62d4b9 100644
--- a/fs/gfs2/bmap.c
+++ b/fs/gfs2/bmap.c
@@ -1328,6 +1328,121 @@ int gfs2_file_dealloc(struct gfs2_inode *ip)
}

/**
+ * gfs2_free_journal_extents - Free cached journal bmap info
+ * @jd: The journal
+ *
+ */
+
+void gfs2_free_journal_extents(struct gfs2_jdesc *jd)
+{
+ struct gfs2_journal_extent *jext;
+
+ while(!list_empty(&jd->extent_list)) {
+ jext = list_entry(jd->extent_list.next, struct gfs2_journal_extent, list);
+ list_del(&jext->list);
+ kfree(jext);
+ }
+}
+
+/**
+ * gfs2_add_jextent - Add or merge a new extent to extent cache
+ * @jd: The journal descriptor
+ * @lblock: The logical block at start of new extent
+ * @pblock: The physical block at start of new extent
+ * @blocks: Size of extent in fs blocks
+ *
+ * Returns: 0 on success or -ENOMEM
+ */
+
+static int gfs2_add_jextent(struct gfs2_jdesc *jd, u64 lblock, u64 dblock, u64 blocks)
+{
+ struct gfs2_journal_extent *jext;
+
+ if (!list_empty(&jd->extent_list)) {
+ jext = list_entry(jd->extent_list.prev, struct gfs2_journal_extent, list);
+ if ((jext->dblock + jext->blocks) == dblock) {
+ jext->blocks += blocks;
+ return 0;
+ }
+ }
+
+ jext = kzalloc(sizeof(struct gfs2_journal_extent), GFP_NOFS);
+ if (jext == NULL)
+ return -ENOMEM;
+ jext->dblock = dblock;
+ jext->lblock = lblock;
+ jext->blocks = blocks;
+ list_add_tail(&jext->list, &jd->extent_list);
+ jd->nr_extents++;
+ return 0;
+}
+
+/**
+ * gfs2_map_journal_extents - Cache journal bmap info
+ * @sdp: The super block
+ * @jd: The journal to map
+ *
+ * Create a reusable "extent" mapping from all logical
+ * blocks to all physical blocks for the given journal. This will save
+ * us time when writing journal blocks. Most journals will have only one
+ * extent that maps all their logical blocks. That's because gfs2.mkfs
+ * arranges the journal blocks sequentially to maximize performance.
+ * So the extent would map the first block for the entire file length.
+ * However, gfs2_jadd can happen while file activity is happening, so
+ * those journals may not be sequential. Less likely is the case where
+ * the users created their own journals by mounting the metafs and
+ * laying it out. But it's still possible. These journals might have
+ * several extents.
+ *
+ * Returns: 0 on success, or error on failure
+ */
+
+int gfs2_map_journal_extents(struct gfs2_sbd *sdp, struct gfs2_jdesc *jd)
+{
+ u64 lblock = 0;
+ u64 lblock_stop;
+ struct gfs2_inode *ip = GFS2_I(jd->jd_inode);
+ struct buffer_head bh;
+ unsigned int shift = sdp->sd_sb.sb_bsize_shift;
+ u64 size;
+ int rc;
+
+ lblock_stop = i_size_read(jd->jd_inode) >> shift;
+ size = (lblock_stop - lblock) << shift;
+ jd->nr_extents = 0;
+ WARN_ON(!list_empty(&jd->extent_list));
+
+ do {
+ bh.b_state = 0;
+ bh.b_blocknr = 0;
+ bh.b_size = size;
+ rc = gfs2_block_map(jd->jd_inode, lblock, &bh, 0);
+ if (rc || !buffer_mapped(&bh))
+ goto fail;
+ rc = gfs2_add_jextent(jd, lblock, bh.b_blocknr, bh.b_size >> shift);
+ if (rc)
+ goto fail;
+ size -= bh.b_size;
+ lblock += (bh.b_size >> ip->i_inode.i_blkbits);
+ } while(size > 0);
+
+ fs_info(sdp, "journal %d mapped with %u extents\n", jd->jd_jid,
+ jd->nr_extents);
+ return 0;
+
+fail:
+ fs_warn(sdp, "error %d mapping journal %u at offset %llu (extent %u)\n",
+ rc, jd->jd_jid,
+ (unsigned long long)(i_size_read(jd->jd_inode) - size),
+ jd->nr_extents);
+ fs_warn(sdp, "bmap=%d lblock=%llu block=%llu, state=0x%08lx, size=%llu\n",
+ rc, (unsigned long long)lblock, (unsigned long long)bh.b_blocknr,
+ bh.b_state, (unsigned long long)bh.b_size);
+ gfs2_free_journal_extents(jd);
+ return rc;
+}
+
+/**
* gfs2_write_alloc_required - figure out if a write will require an allocation
* @ip: the file being written to
* @offset: the offset to write to
diff --git a/fs/gfs2/bmap.h b/fs/gfs2/bmap.h
index 42fea03..81ded5e 100644
--- a/fs/gfs2/bmap.h
+++ b/fs/gfs2/bmap.h
@@ -55,5 +55,7 @@ extern int gfs2_truncatei_resume(struct gfs2_inode *ip);
extern int gfs2_file_dealloc(struct gfs2_inode *ip);
extern int gfs2_write_alloc_required(struct gfs2_inode *ip, u64 offset,
unsigned int len);
+extern int gfs2_map_journal_extents(struct gfs2_sbd *sdp, struct gfs2_jdesc *jd);
+extern void gfs2_free_journal_extents(struct gfs2_jdesc *jd);

#endif /* __BMAP_DOT_H__ */
diff --git a/fs/gfs2/incore.h b/fs/gfs2/incore.h
index d0c3928..456d8fa 100644
--- a/fs/gfs2/incore.h
+++ b/fs/gfs2/incore.h
@@ -485,7 +485,7 @@ struct gfs2_trans {
};

struct gfs2_journal_extent {
- struct list_head extent_list;
+ struct list_head list;

unsigned int lblock; /* First logical block */
u64 dblock; /* First disk block */
@@ -495,6 +495,7 @@ struct gfs2_journal_extent {
struct gfs2_jdesc {
struct list_head jd_list;
struct list_head extent_list;
+ unsigned int nr_extents;
struct work_struct jd_work;
struct inode *jd_inode;
unsigned long jd_flags;
diff --git a/fs/gfs2/lops.c b/fs/gfs2/lops.c
index 23c6e72..ae1d635 100644
--- a/fs/gfs2/lops.c
+++ b/fs/gfs2/lops.c
@@ -146,8 +146,8 @@ static u64 gfs2_log_bmap(struct gfs2_sbd *sdp)
struct gfs2_journal_extent *je;
u64 block;

- list_for_each_entry(je, &sdp->sd_jdesc->extent_list, extent_list) {
- if (lbn >= je->lblock && lbn < je->lblock + je->blocks) {
+ list_for_each_entry(je, &sdp->sd_jdesc->extent_list, list) {
+ if ((lbn >= je->lblock) && (lbn < (je->lblock + je->blocks))) {
block = je->dblock + lbn - je->lblock;
gfs2_log_incr_head(sdp);
return block;
diff --git a/fs/gfs2/ops_fstype.c b/fs/gfs2/ops_fstype.c
index 1f855a7..e0d0db5 100644
--- a/fs/gfs2/ops_fstype.c
+++ b/fs/gfs2/ops_fstype.c
@@ -517,67 +517,6 @@ out:
return ret;
}

-/**
- * map_journal_extents - create a reusable "extent" mapping from all logical
- * blocks to all physical blocks for the given journal. This will save
- * us time when writing journal blocks. Most journals will have only one
- * extent that maps all their logical blocks. That's because gfs2.mkfs
- * arranges the journal blocks sequentially to maximize performance.
- * So the extent would map the first block for the entire file length.
- * However, gfs2_jadd can happen while file activity is happening, so
- * those journals may not be sequential. Less likely is the case where
- * the users created their own journals by mounting the metafs and
- * laying it out. But it's still possible. These journals might have
- * several extents.
- *
- * TODO: This should be done in bigger chunks rather than one block at a time,
- * but since it's only done at mount time, I'm not worried about the
- * time it takes.
- */
-static int map_journal_extents(struct gfs2_sbd *sdp)
-{
- struct gfs2_jdesc *jd = sdp->sd_jdesc;
- unsigned int lb;
- u64 db, prev_db; /* logical block, disk block, prev disk block */
- struct gfs2_inode *ip = GFS2_I(jd->jd_inode);
- struct gfs2_journal_extent *jext = NULL;
- struct buffer_head bh;
- int rc = 0;
-
- prev_db = 0;
-
- for (lb = 0; lb < i_size_read(jd->jd_inode) >> sdp->sd_sb.sb_bsize_shift; lb++) {
- bh.b_state = 0;
- bh.b_blocknr = 0;
- bh.b_size = 1 << ip->i_inode.i_blkbits;
- rc = gfs2_block_map(jd->jd_inode, lb, &bh, 0);
- db = bh.b_blocknr;
- if (rc || !db) {
- printk(KERN_INFO "GFS2 journal mapping error %d: lb="
- "%u db=%llu\n", rc, lb, (unsigned long long)db);
- break;
- }
- if (!prev_db || db != prev_db + 1) {
- jext = kzalloc(sizeof(struct gfs2_journal_extent),
- GFP_KERNEL);
- if (!jext) {
- printk(KERN_INFO "GFS2 error: out of memory "
- "mapping journal extents.\n");
- rc = -ENOMEM;
- break;
- }
- jext->dblock = db;
- jext->lblock = lb;
- jext->blocks = 1;
- list_add_tail(&jext->extent_list, &jd->extent_list);
- } else {
- jext->blocks++;
- }
- prev_db = db;
- }
- return rc;
-}
-
static void gfs2_others_may_mount(struct gfs2_sbd *sdp)
{
char *message = "FIRSTMOUNT=Done";
@@ -779,7 +718,7 @@ static int init_journal(struct gfs2_sbd *sdp, int undo)
atomic_set(&sdp->sd_log_thresh2, 4*sdp->sd_jdesc->jd_blocks/5);

/* Map the extents for this journal's blocks */
- map_journal_extents(sdp);
+ gfs2_map_journal_extents(sdp, sdp->sd_jdesc);
}
trace_gfs2_log_blocks(sdp, atomic_read(&sdp->sd_log_blks_free));

diff --git a/fs/gfs2/super.c b/fs/gfs2/super.c
index 60f60f6..2574744 100644
--- a/fs/gfs2/super.c
+++ b/fs/gfs2/super.c
@@ -295,9 +295,8 @@ int gfs2_mount_args(struct gfs2_args *args, char *options)

void gfs2_jindex_free(struct gfs2_sbd *sdp)
{
- struct list_head list, *head;
+ struct list_head list;
struct gfs2_jdesc *jd;
- struct gfs2_journal_extent *jext;

spin_lock(&sdp->sd_jindex_spin);
list_add(&list, &sdp->sd_jindex_list);
@@ -307,14 +306,7 @@ void gfs2_jindex_free(struct gfs2_sbd *sdp)

while (!list_empty(&list)) {
jd = list_entry(list.next, struct gfs2_jdesc, jd_list);
- head = &jd->extent_list;
- while (!list_empty(head)) {
- jext = list_entry(head->next,
- struct gfs2_journal_extent,
- extent_list);
- list_del(&jext->extent_list);
- kfree(jext);
- }
+ gfs2_free_journal_extents(jd);
list_del(&jd->jd_list);
iput(jd->jd_inode);
kfree(jd);
--
1.8.3.1

2014-04-01 09:16:25

by Steven Whitehouse

[permalink] [raw]
Subject: [PATCH 10/29] GFS2: Move log buffer accounting to transaction

Now we have a master transaction into which other transactions
are merged, the accounting can be done using this master
transaction. We no longer require the superblock fields which
were being used for this function.

In addition, this allows for a clean up in calc_reserved()
making it rather easier understand. Also, by reducing the
number of variables used to track the buffers being added
and removed from the journal, a number of error checks are
now no longer required.

Signed-off-by: Steven Whitehouse <[email protected]>

diff --git a/fs/gfs2/incore.h b/fs/gfs2/incore.h
index 99aab64..d0c3928 100644
--- a/fs/gfs2/incore.h
+++ b/fs/gfs2/incore.h
@@ -748,15 +748,10 @@ struct gfs2_sbd {

struct gfs2_trans *sd_log_tr;
unsigned int sd_log_blks_reserved;
- unsigned int sd_log_commited_buf;
- unsigned int sd_log_commited_databuf;
int sd_log_commited_revoke;

atomic_t sd_log_pinned;
- unsigned int sd_log_num_buf;
unsigned int sd_log_num_revoke;
- unsigned int sd_log_num_rg;
- unsigned int sd_log_num_databuf;

struct list_head sd_log_le_revoke;
struct list_head sd_log_le_ordered;
diff --git a/fs/gfs2/log.c b/fs/gfs2/log.c
index 975712c..c1c9a29 100644
--- a/fs/gfs2/log.c
+++ b/fs/gfs2/log.c
@@ -414,24 +414,22 @@ static inline unsigned int log_distance(struct gfs2_sbd *sdp, unsigned int newer
static unsigned int calc_reserved(struct gfs2_sbd *sdp)
{
unsigned int reserved = 0;
- unsigned int mbuf_limit, metabufhdrs_needed;
- unsigned int dbuf_limit, databufhdrs_needed;
- unsigned int revokes = 0;
+ unsigned int mbuf;
+ unsigned int dbuf;
+ struct gfs2_trans *tr = sdp->sd_log_tr;

- mbuf_limit = buf_limit(sdp);
- metabufhdrs_needed = (sdp->sd_log_commited_buf +
- (mbuf_limit - 1)) / mbuf_limit;
- dbuf_limit = databuf_limit(sdp);
- databufhdrs_needed = (sdp->sd_log_commited_databuf +
- (dbuf_limit - 1)) / dbuf_limit;
+ if (tr) {
+ mbuf = tr->tr_num_buf_new - tr->tr_num_buf_rm;
+ dbuf = tr->tr_num_databuf_new - tr->tr_num_databuf_rm;
+ reserved = mbuf + dbuf;
+ /* Account for header blocks */
+ reserved += DIV_ROUND_UP(mbuf, buf_limit(sdp));
+ reserved += DIV_ROUND_UP(dbuf, databuf_limit(sdp));
+ }

if (sdp->sd_log_commited_revoke > 0)
- revokes = gfs2_struct2blk(sdp, sdp->sd_log_commited_revoke,
+ reserved += gfs2_struct2blk(sdp, sdp->sd_log_commited_revoke,
sizeof(u64));
-
- reserved = sdp->sd_log_commited_buf + metabufhdrs_needed +
- sdp->sd_log_commited_databuf + databufhdrs_needed +
- revokes;
/* One for the overall header */
if (reserved)
reserved++;
@@ -693,16 +691,6 @@ void gfs2_log_flush(struct gfs2_sbd *sdp, struct gfs2_glock *gl)
INIT_LIST_HEAD(&tr->tr_ail2_list);
}

- if (sdp->sd_log_num_buf != sdp->sd_log_commited_buf) {
- printk(KERN_INFO "GFS2: log buf %u %u\n", sdp->sd_log_num_buf,
- sdp->sd_log_commited_buf);
- gfs2_assert_withdraw(sdp, 0);
- }
- if (sdp->sd_log_num_databuf != sdp->sd_log_commited_databuf) {
- printk(KERN_INFO "GFS2: log databuf %u %u\n",
- sdp->sd_log_num_databuf, sdp->sd_log_commited_databuf);
- gfs2_assert_withdraw(sdp, 0);
- }
gfs2_assert_withdraw(sdp,
sdp->sd_log_num_revoke == sdp->sd_log_commited_revoke);

@@ -727,8 +715,6 @@ void gfs2_log_flush(struct gfs2_sbd *sdp, struct gfs2_glock *gl)
gfs2_log_lock(sdp);
sdp->sd_log_head = sdp->sd_log_flush_head;
sdp->sd_log_blks_reserved = 0;
- sdp->sd_log_commited_buf = 0;
- sdp->sd_log_commited_databuf = 0;
sdp->sd_log_commited_revoke = 0;

spin_lock(&sdp->sd_ail_lock);
@@ -769,31 +755,29 @@ static void log_refund(struct gfs2_sbd *sdp, struct gfs2_trans *tr)
{
unsigned int reserved;
unsigned int unused;
+ unsigned int maxres;

gfs2_log_lock(sdp);

- sdp->sd_log_commited_buf += tr->tr_num_buf_new - tr->tr_num_buf_rm;
- sdp->sd_log_commited_databuf += tr->tr_num_databuf_new -
- tr->tr_num_databuf_rm;
- gfs2_assert_withdraw(sdp, (((int)sdp->sd_log_commited_buf) >= 0) ||
- (((int)sdp->sd_log_commited_databuf) >= 0));
+ if (sdp->sd_log_tr) {
+ gfs2_merge_trans(sdp->sd_log_tr, tr);
+ } else if (tr->tr_num_buf_new || tr->tr_num_databuf_new) {
+ gfs2_assert_withdraw(sdp, tr->tr_t_gh.gh_gl);
+ sdp->sd_log_tr = tr;
+ tr->tr_attached = 1;
+ }
+
sdp->sd_log_commited_revoke += tr->tr_num_revoke - tr->tr_num_revoke_rm;
reserved = calc_reserved(sdp);
- gfs2_assert_withdraw(sdp, sdp->sd_log_blks_reserved + tr->tr_reserved >= reserved);
- unused = sdp->sd_log_blks_reserved - reserved + tr->tr_reserved;
+ maxres = sdp->sd_log_blks_reserved + tr->tr_reserved;
+ gfs2_assert_withdraw(sdp, maxres >= reserved);
+ unused = maxres - reserved;
atomic_add(unused, &sdp->sd_log_blks_free);
trace_gfs2_log_blocks(sdp, unused);
gfs2_assert_withdraw(sdp, atomic_read(&sdp->sd_log_blks_free) <=
sdp->sd_jdesc->jd_blocks);
sdp->sd_log_blks_reserved = reserved;

- if (sdp->sd_log_tr) {
- gfs2_merge_trans(sdp->sd_log_tr, tr);
- } else if (tr->tr_num_buf_new || tr->tr_num_databuf_new) {
- gfs2_assert_withdraw(sdp, tr->tr_t_gh.gh_gl);
- sdp->sd_log_tr = tr;
- tr->tr_attached = 1;
- }
gfs2_log_unlock(sdp);
}

@@ -833,10 +817,7 @@ void gfs2_log_shutdown(struct gfs2_sbd *sdp)
down_write(&sdp->sd_log_flush_lock);

gfs2_assert_withdraw(sdp, !sdp->sd_log_blks_reserved);
- gfs2_assert_withdraw(sdp, !sdp->sd_log_num_buf);
gfs2_assert_withdraw(sdp, !sdp->sd_log_num_revoke);
- gfs2_assert_withdraw(sdp, !sdp->sd_log_num_rg);
- gfs2_assert_withdraw(sdp, !sdp->sd_log_num_databuf);
gfs2_assert_withdraw(sdp, list_empty(&sdp->sd_ail1_list));

sdp->sd_log_flush_head = sdp->sd_log_head;
diff --git a/fs/gfs2/lops.c b/fs/gfs2/lops.c
index ee9ec7f..23c6e72 100644
--- a/fs/gfs2/lops.c
+++ b/fs/gfs2/lops.c
@@ -494,9 +494,11 @@ static void gfs2_before_commit(struct gfs2_sbd *sdp, unsigned int limit,
static void buf_lo_before_commit(struct gfs2_sbd *sdp, struct gfs2_trans *tr)
{
unsigned int limit = buf_limit(sdp); /* 503 for 4k blocks */
+ unsigned int nbuf;
if (tr == NULL)
return;
- gfs2_before_commit(sdp, limit, sdp->sd_log_num_buf, &tr->tr_buf, 0);
+ nbuf = tr->tr_num_buf_new - tr->tr_num_buf_rm;
+ gfs2_before_commit(sdp, limit, nbuf, &tr->tr_buf, 0);
}

static void buf_lo_after_commit(struct gfs2_sbd *sdp, struct gfs2_trans *tr)
@@ -511,11 +513,8 @@ static void buf_lo_after_commit(struct gfs2_sbd *sdp, struct gfs2_trans *tr)
while (!list_empty(head)) {
bd = list_entry(head->next, struct gfs2_bufdata, bd_list);
list_del_init(&bd->bd_list);
- sdp->sd_log_num_buf--;
-
gfs2_unpin(sdp, bd->bd_bh, tr);
}
- gfs2_assert_warn(sdp, !sdp->sd_log_num_buf);
}

static void buf_lo_before_scan(struct gfs2_jdesc *jd,
@@ -761,10 +760,12 @@ static void revoke_lo_after_scan(struct gfs2_jdesc *jd, int error, int pass)

static void databuf_lo_before_commit(struct gfs2_sbd *sdp, struct gfs2_trans *tr)
{
- unsigned int limit = buf_limit(sdp) / 2;
+ unsigned int limit = databuf_limit(sdp);
+ unsigned int nbuf;
if (tr == NULL)
return;
- gfs2_before_commit(sdp, limit, sdp->sd_log_num_databuf, &tr->tr_databuf, 1);
+ nbuf = tr->tr_num_databuf_new - tr->tr_num_databuf_rm;
+ gfs2_before_commit(sdp, limit, nbuf, &tr->tr_databuf, 1);
}

static int databuf_lo_scan_elements(struct gfs2_jdesc *jd, unsigned int start,
@@ -849,10 +850,8 @@ static void databuf_lo_after_commit(struct gfs2_sbd *sdp, struct gfs2_trans *tr)
while (!list_empty(head)) {
bd = list_entry(head->next, struct gfs2_bufdata, bd_list);
list_del_init(&bd->bd_list);
- sdp->sd_log_num_databuf--;
gfs2_unpin(sdp, bd->bd_bh, tr);
}
- gfs2_assert_warn(sdp, !sdp->sd_log_num_databuf);
}


diff --git a/fs/gfs2/meta_io.c b/fs/gfs2/meta_io.c
index c7f2469..005e468 100644
--- a/fs/gfs2/meta_io.c
+++ b/fs/gfs2/meta_io.c
@@ -267,15 +267,10 @@ void gfs2_remove_from_journal(struct buffer_head *bh, struct gfs2_trans *tr, int
trace_gfs2_pin(bd, 0);
atomic_dec(&sdp->sd_log_pinned);
list_del_init(&bd->bd_list);
- if (meta) {
- gfs2_assert_warn(sdp, sdp->sd_log_num_buf);
- sdp->sd_log_num_buf--;
+ if (meta)
tr->tr_num_buf_rm++;
- } else {
- gfs2_assert_warn(sdp, sdp->sd_log_num_databuf);
- sdp->sd_log_num_databuf--;
+ else
tr->tr_num_databuf_rm++;
- }
tr->tr_touched = 1;
was_pinned = 1;
brelse(bh);
diff --git a/fs/gfs2/trans.c b/fs/gfs2/trans.c
index e0464a2..295f400 100644
--- a/fs/gfs2/trans.c
+++ b/fs/gfs2/trans.c
@@ -213,7 +213,6 @@ void gfs2_trans_add_data(struct gfs2_glock *gl, struct buffer_head *bh)
set_bit(GLF_DIRTY, &bd->bd_gl->gl_flags);
gfs2_pin(sdp, bd->bd_bh);
tr->tr_num_databuf_new++;
- sdp->sd_log_num_databuf++;
list_add_tail(&bd->bd_list, &tr->tr_databuf);
}
gfs2_log_unlock(sdp);
@@ -241,7 +240,6 @@ static void meta_lo_add(struct gfs2_sbd *sdp, struct gfs2_bufdata *bd)
gfs2_pin(sdp, bd->bd_bh);
mh->__pad0 = cpu_to_be64(0);
mh->mh_jid = cpu_to_be32(sdp->sd_jdesc->jd_jid);
- sdp->sd_log_num_buf++;
list_add(&bd->bd_list, &tr->tr_buf);
tr->tr_num_buf_new++;
}
--
1.8.3.1

2014-04-01 09:23:15

by Steven Whitehouse

[permalink] [raw]
Subject: [PATCH 11/29] fs: NULL dereference in posix_acl_to_xattr()

From: Dan Carpenter <[email protected]>

This patch moves the dereference of "buffer" after the check for NULL.
The only place which passes a NULL parameter is gfs2_set_acl().

Cc: stable <[email protected]>
Signed-off-by: Dan Carpenter <[email protected]>
Signed-off-by: Steven Whitehouse <[email protected]>

diff --git a/fs/posix_acl.c b/fs/posix_acl.c
index 38bae5a..202b84f 100644
--- a/fs/posix_acl.c
+++ b/fs/posix_acl.c
@@ -717,7 +717,7 @@ posix_acl_to_xattr(struct user_namespace *user_ns, const struct posix_acl *acl,
void *buffer, size_t size)
{
posix_acl_xattr_header *ext_acl = (posix_acl_xattr_header *)buffer;
- posix_acl_xattr_entry *ext_entry = ext_acl->a_entries;
+ posix_acl_xattr_entry *ext_entry;
int real_size, n;

real_size = posix_acl_xattr_size(acl->a_count);
@@ -725,7 +725,8 @@ posix_acl_to_xattr(struct user_namespace *user_ns, const struct posix_acl *acl,
return real_size;
if (real_size > size)
return -ERANGE;
-
+
+ ext_entry = ext_acl->a_entries;
ext_acl->a_version = cpu_to_le32(POSIX_ACL_XATTR_VERSION);

for (n=0; n < acl->a_count; n++, ext_entry++) {
--
1.8.3.1

2014-04-01 09:16:21

by Steven Whitehouse

[permalink] [raw]
Subject: [PATCH 03/29] GFS2: journal data writepages update

GFS2 has carried what is more or less a copy of the
write_cache_pages() for some time. It seems that this
copy has slipped behind the core code over time. This
patch brings it back uptodate, and in addition adds the
tracepoint which would otherwise be missing.

We could go further, and eliminate some or all of the
code duplication here. The issue is that if we do that,
then the function we need to split out from the existing
write_cache_pages(), which will look a lot like
gfs2_jdata_write_pagevec(), would land up putting quite a
lot of extra variables on the stack. I know that has been
a problem in the past in the writeback code path, which
is why I've hesitated to do it here.

Signed-off-by: Steven Whitehouse <[email protected]>

diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index e0259a1..82a1456 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -94,6 +94,8 @@ static inline struct inode *wb_inode(struct list_head *head)
#define CREATE_TRACE_POINTS
#include <trace/events/writeback.h>

+EXPORT_TRACEPOINT_SYMBOL_GPL(wbc_writepage);
+
static void bdi_queue_work(struct backing_dev_info *bdi,
struct wb_writeback_work *work)
{
diff --git a/fs/gfs2/aops.c b/fs/gfs2/aops.c
index 49436fa..ce62dca 100644
--- a/fs/gfs2/aops.c
+++ b/fs/gfs2/aops.c
@@ -21,6 +21,7 @@
#include <linux/gfs2_ondisk.h>
#include <linux/backing-dev.h>
#include <linux/aio.h>
+#include <trace/events/writeback.h>

#include "gfs2.h"
#include "incore.h"
@@ -230,13 +231,11 @@ static int gfs2_writepages(struct address_space *mapping,
static int gfs2_write_jdata_pagevec(struct address_space *mapping,
struct writeback_control *wbc,
struct pagevec *pvec,
- int nr_pages, pgoff_t end)
+ int nr_pages, pgoff_t end,
+ pgoff_t *done_index)
{
struct inode *inode = mapping->host;
struct gfs2_sbd *sdp = GFS2_SB(inode);
- loff_t i_size = i_size_read(inode);
- pgoff_t end_index = i_size >> PAGE_CACHE_SHIFT;
- unsigned offset = i_size & (PAGE_CACHE_SIZE-1);
unsigned nrblocks = nr_pages * (PAGE_CACHE_SIZE/inode->i_sb->s_blocksize);
int i;
int ret;
@@ -248,40 +247,83 @@ static int gfs2_write_jdata_pagevec(struct address_space *mapping,
for(i = 0; i < nr_pages; i++) {
struct page *page = pvec->pages[i];

+ /*
+ * At this point, the page may be truncated or
+ * invalidated (changing page->mapping to NULL), or
+ * even swizzled back from swapper_space to tmpfs file
+ * mapping. However, page->index will not change
+ * because we have a reference on the page.
+ */
+ if (page->index > end) {
+ /*
+ * can't be range_cyclic (1st pass) because
+ * end == -1 in that case.
+ */
+ ret = 1;
+ break;
+ }
+
+ *done_index = page->index;
+
lock_page(page);

if (unlikely(page->mapping != mapping)) {
+continue_unlock:
unlock_page(page);
continue;
}

- if (!wbc->range_cyclic && page->index > end) {
- ret = 1;
- unlock_page(page);
- continue;
+ if (!PageDirty(page)) {
+ /* someone wrote it for us */
+ goto continue_unlock;
}

- if (wbc->sync_mode != WB_SYNC_NONE)
- wait_on_page_writeback(page);
-
- if (PageWriteback(page) ||
- !clear_page_dirty_for_io(page)) {
- unlock_page(page);
- continue;
+ if (PageWriteback(page)) {
+ if (wbc->sync_mode != WB_SYNC_NONE)
+ wait_on_page_writeback(page);
+ else
+ goto continue_unlock;
}

- /* Is the page fully outside i_size? (truncate in progress) */
- if (page->index > end_index || (page->index == end_index && !offset)) {
- page->mapping->a_ops->invalidatepage(page, 0,
- PAGE_CACHE_SIZE);
- unlock_page(page);
- continue;
- }
+ BUG_ON(PageWriteback(page));
+ if (!clear_page_dirty_for_io(page))
+ goto continue_unlock;
+
+ trace_wbc_writepage(wbc, mapping->backing_dev_info);

ret = __gfs2_jdata_writepage(page, wbc);
+ if (unlikely(ret)) {
+ if (ret == AOP_WRITEPAGE_ACTIVATE) {
+ unlock_page(page);
+ ret = 0;
+ } else {
+
+ /*
+ * done_index is set past this page,
+ * so media errors will not choke
+ * background writeout for the entire
+ * file. This has consequences for
+ * range_cyclic semantics (ie. it may
+ * not be suitable for data integrity
+ * writeout).
+ */
+ *done_index = page->index + 1;
+ ret = 1;
+ break;
+ }
+ }

- if (ret || (--(wbc->nr_to_write) <= 0))
+ /*
+ * We stop writing back only if we are not doing
+ * integrity sync. In case of integrity sync we have to
+ * keep going until we have written all the pages
+ * we tagged for writeback prior to entering this loop.
+ */
+ if (--wbc->nr_to_write <= 0 && wbc->sync_mode == WB_SYNC_NONE) {
ret = 1;
+ break;
+ }
+
}
gfs2_trans_end(sdp);
return ret;
@@ -306,51 +348,69 @@ static int gfs2_write_cache_jdata(struct address_space *mapping,
int done = 0;
struct pagevec pvec;
int nr_pages;
+ pgoff_t uninitialized_var(writeback_index);
pgoff_t index;
pgoff_t end;
- int scanned = 0;
+ pgoff_t done_index;
+ int cycled;
int range_whole = 0;
+ int tag;

pagevec_init(&pvec, 0);
if (wbc->range_cyclic) {
- index = mapping->writeback_index; /* Start from prev offset */
+ writeback_index = mapping->writeback_index; /* prev offset */
+ index = writeback_index;
+ if (index == 0)
+ cycled = 1;
+ else
+ cycled = 0;
end = -1;
} else {
index = wbc->range_start >> PAGE_CACHE_SHIFT;
end = wbc->range_end >> PAGE_CACHE_SHIFT;
if (wbc->range_start == 0 && wbc->range_end == LLONG_MAX)
range_whole = 1;
- scanned = 1;
+ cycled = 1; /* ignore range_cyclic tests */
}
+ if (wbc->sync_mode == WB_SYNC_ALL || wbc->tagged_writepages)
+ tag = PAGECACHE_TAG_TOWRITE;
+ else
+ tag = PAGECACHE_TAG_DIRTY;

retry:
- while (!done && (index <= end) &&
- (nr_pages = pagevec_lookup_tag(&pvec, mapping, &index,
- PAGECACHE_TAG_DIRTY,
- min(end - index, (pgoff_t)PAGEVEC_SIZE-1) + 1))) {
- scanned = 1;
- ret = gfs2_write_jdata_pagevec(mapping, wbc, &pvec, nr_pages, end);
+ if (wbc->sync_mode == WB_SYNC_ALL || wbc->tagged_writepages)
+ tag_pages_for_writeback(mapping, index, end);
+ done_index = index;
+ while (!done && (index <= end)) {
+ nr_pages = pagevec_lookup_tag(&pvec, mapping, &index, tag,
+ min(end - index, (pgoff_t)PAGEVEC_SIZE-1) + 1);
+ if (nr_pages == 0)
+ break;
+
+ ret = gfs2_write_jdata_pagevec(mapping, wbc, &pvec, nr_pages, end, &done_index);
if (ret)
done = 1;
if (ret > 0)
ret = 0;
-
pagevec_release(&pvec);
cond_resched();
}

- if (!scanned && !done) {
+ if (!cycled && !done) {
/*
+ * range_cyclic:
* We hit the last page and there is more work to be done: wrap
* back to the start of the file
*/
- scanned = 1;
+ cycled = 1;
index = 0;
+ end = writeback_index - 1;
goto retry;
}

if (wbc->range_cyclic || (range_whole && wbc->nr_to_write > 0))
- mapping->writeback_index = index;
+ mapping->writeback_index = done_index;
+
return ret;
}

diff --git a/include/trace/events/writeback.h b/include/trace/events/writeback.h
index c7bbbe7..309a086 100644
--- a/include/trace/events/writeback.h
+++ b/include/trace/events/writeback.h
@@ -4,6 +4,7 @@
#if !defined(_TRACE_WRITEBACK_H) || defined(TRACE_HEADER_MULTI_READ)
#define _TRACE_WRITEBACK_H

+#include <linux/tracepoint.h>
#include <linux/backing-dev.h>
#include <linux/writeback.h>

--
1.8.3.1

2014-04-01 09:23:51

by Steven Whitehouse

[permalink] [raw]
Subject: [PATCH 09/29] GFS2: Move log buffer lists into transaction

Over time, we hope to be able to improve the concurrency available
in the log code. This is one small step towards that, by moving
the buffer lists from the super block, and into the transaction
structure, so that each transaction builds its own buffer lists.

At transaction commit time, the buffer lists are merged into
the currently accumulating transaction. That transaction then
is passed into the before and after commit functions at journal
flush time. Thus there should be no change in overall behaviour
yet.

Signed-off-by: Steven Whitehouse <[email protected]>

diff --git a/fs/gfs2/glops.c b/fs/gfs2/glops.c
index 3bf0631..54b66809 100644
--- a/fs/gfs2/glops.c
+++ b/fs/gfs2/glops.c
@@ -82,6 +82,8 @@ static void gfs2_ail_empty_gl(struct gfs2_glock *gl)
struct gfs2_trans tr;

memset(&tr, 0, sizeof(tr));
+ INIT_LIST_HEAD(&tr.tr_buf);
+ INIT_LIST_HEAD(&tr.tr_databuf);
tr.tr_revokes = atomic_read(&gl->gl_ail_count);

if (!tr.tr_revokes)
diff --git a/fs/gfs2/incore.h b/fs/gfs2/incore.h
index 645655c..99aab64 100644
--- a/fs/gfs2/incore.h
+++ b/fs/gfs2/incore.h
@@ -52,7 +52,7 @@ struct gfs2_log_header_host {
*/

struct gfs2_log_operations {
- void (*lo_before_commit) (struct gfs2_sbd *sdp);
+ void (*lo_before_commit) (struct gfs2_sbd *sdp, struct gfs2_trans *tr);
void (*lo_after_commit) (struct gfs2_sbd *sdp, struct gfs2_trans *tr);
void (*lo_before_scan) (struct gfs2_jdesc *jd,
struct gfs2_log_header_host *head, int pass);
@@ -476,6 +476,8 @@ struct gfs2_trans {
unsigned int tr_num_revoke_rm;

struct list_head tr_list;
+ struct list_head tr_databuf;
+ struct list_head tr_buf;

unsigned int tr_first;
struct list_head tr_ail1_list;
@@ -756,9 +758,7 @@ struct gfs2_sbd {
unsigned int sd_log_num_rg;
unsigned int sd_log_num_databuf;

- struct list_head sd_log_le_buf;
struct list_head sd_log_le_revoke;
- struct list_head sd_log_le_databuf;
struct list_head sd_log_le_ordered;
spinlock_t sd_ordered_lock;

diff --git a/fs/gfs2/log.c b/fs/gfs2/log.c
index 1e1bda0..975712c 100644
--- a/fs/gfs2/log.c
+++ b/fs/gfs2/log.c
@@ -712,7 +712,7 @@ void gfs2_log_flush(struct gfs2_sbd *sdp, struct gfs2_glock *gl)
tr->tr_first = sdp->sd_log_flush_head;

gfs2_ordered_write(sdp);
- lops_before_commit(sdp);
+ lops_before_commit(sdp, tr);
gfs2_log_flush_bio(sdp, WRITE);

if (sdp->sd_log_head != sdp->sd_log_flush_head) {
@@ -744,6 +744,27 @@ void gfs2_log_flush(struct gfs2_sbd *sdp, struct gfs2_glock *gl)
kfree(tr);
}

+/**
+ * gfs2_merge_trans - Merge a new transaction into a cached transaction
+ * @old: Original transaction to be expanded
+ * @new: New transaction to be merged
+ */
+
+static void gfs2_merge_trans(struct gfs2_trans *old, struct gfs2_trans *new)
+{
+ WARN_ON_ONCE(old->tr_attached != 1);
+
+ old->tr_num_buf_new += new->tr_num_buf_new;
+ old->tr_num_databuf_new += new->tr_num_databuf_new;
+ old->tr_num_buf_rm += new->tr_num_buf_rm;
+ old->tr_num_databuf_rm += new->tr_num_databuf_rm;
+ old->tr_num_revoke += new->tr_num_revoke;
+ old->tr_num_revoke_rm += new->tr_num_revoke_rm;
+
+ list_splice_tail_init(&new->tr_databuf, &old->tr_databuf);
+ list_splice_tail_init(&new->tr_buf, &old->tr_buf);
+}
+
static void log_refund(struct gfs2_sbd *sdp, struct gfs2_trans *tr)
{
unsigned int reserved;
@@ -766,8 +787,9 @@ static void log_refund(struct gfs2_sbd *sdp, struct gfs2_trans *tr)
sdp->sd_jdesc->jd_blocks);
sdp->sd_log_blks_reserved = reserved;

- if (sdp->sd_log_tr == NULL &&
- (tr->tr_num_buf_new || tr->tr_num_databuf_new)) {
+ if (sdp->sd_log_tr) {
+ gfs2_merge_trans(sdp->sd_log_tr, tr);
+ } else if (tr->tr_num_buf_new || tr->tr_num_databuf_new) {
gfs2_assert_withdraw(sdp, tr->tr_t_gh.gh_gl);
sdp->sd_log_tr = tr;
tr->tr_attached = 1;
diff --git a/fs/gfs2/lops.c b/fs/gfs2/lops.c
index 7669379..ee9ec7f 100644
--- a/fs/gfs2/lops.c
+++ b/fs/gfs2/lops.c
@@ -491,24 +491,23 @@ static void gfs2_before_commit(struct gfs2_sbd *sdp, unsigned int limit,
gfs2_log_unlock(sdp);
}

-static void buf_lo_before_commit(struct gfs2_sbd *sdp)
+static void buf_lo_before_commit(struct gfs2_sbd *sdp, struct gfs2_trans *tr)
{
unsigned int limit = buf_limit(sdp); /* 503 for 4k blocks */
-
- gfs2_before_commit(sdp, limit, sdp->sd_log_num_buf,
- &sdp->sd_log_le_buf, 0);
+ if (tr == NULL)
+ return;
+ gfs2_before_commit(sdp, limit, sdp->sd_log_num_buf, &tr->tr_buf, 0);
}

static void buf_lo_after_commit(struct gfs2_sbd *sdp, struct gfs2_trans *tr)
{
- struct list_head *head = &sdp->sd_log_le_buf;
+ struct list_head *head;
struct gfs2_bufdata *bd;

- if (tr == NULL) {
- gfs2_assert(sdp, list_empty(head));
+ if (tr == NULL)
return;
- }

+ head = &tr->tr_buf;
while (!list_empty(head)) {
bd = list_entry(head->next, struct gfs2_bufdata, bd_list);
list_del_init(&bd->bd_list);
@@ -620,7 +619,7 @@ static void buf_lo_after_scan(struct gfs2_jdesc *jd, int error, int pass)
jd->jd_jid, sdp->sd_replayed_blocks, sdp->sd_found_blocks);
}

-static void revoke_lo_before_commit(struct gfs2_sbd *sdp)
+static void revoke_lo_before_commit(struct gfs2_sbd *sdp, struct gfs2_trans *tr)
{
struct gfs2_meta_header *mh;
unsigned int offset;
@@ -760,12 +759,12 @@ static void revoke_lo_after_scan(struct gfs2_jdesc *jd, int error, int pass)
*
*/

-static void databuf_lo_before_commit(struct gfs2_sbd *sdp)
+static void databuf_lo_before_commit(struct gfs2_sbd *sdp, struct gfs2_trans *tr)
{
unsigned int limit = buf_limit(sdp) / 2;
-
- gfs2_before_commit(sdp, limit, sdp->sd_log_num_databuf,
- &sdp->sd_log_le_databuf, 1);
+ if (tr == NULL)
+ return;
+ gfs2_before_commit(sdp, limit, sdp->sd_log_num_databuf, &tr->tr_databuf, 1);
}

static int databuf_lo_scan_elements(struct gfs2_jdesc *jd, unsigned int start,
@@ -840,14 +839,13 @@ static void databuf_lo_after_scan(struct gfs2_jdesc *jd, int error, int pass)

static void databuf_lo_after_commit(struct gfs2_sbd *sdp, struct gfs2_trans *tr)
{
- struct list_head *head = &sdp->sd_log_le_databuf;
+ struct list_head *head;
struct gfs2_bufdata *bd;

- if (tr == NULL) {
- gfs2_assert(sdp, list_empty(head));
+ if (tr == NULL)
return;
- }

+ head = &tr->tr_databuf;
while (!list_empty(head)) {
bd = list_entry(head->next, struct gfs2_bufdata, bd_list);
list_del_init(&bd->bd_list);
diff --git a/fs/gfs2/lops.h b/fs/gfs2/lops.h
index 9ca2e64..a65a7ba 100644
--- a/fs/gfs2/lops.h
+++ b/fs/gfs2/lops.h
@@ -46,12 +46,13 @@ static inline unsigned int databuf_limit(struct gfs2_sbd *sdp)
return limit;
}

-static inline void lops_before_commit(struct gfs2_sbd *sdp)
+static inline void lops_before_commit(struct gfs2_sbd *sdp,
+ struct gfs2_trans *tr)
{
int x;
for (x = 0; gfs2_log_ops[x]; x++)
if (gfs2_log_ops[x]->lo_before_commit)
- gfs2_log_ops[x]->lo_before_commit(sdp);
+ gfs2_log_ops[x]->lo_before_commit(sdp, tr);
}

static inline void lops_after_commit(struct gfs2_sbd *sdp,
diff --git a/fs/gfs2/ops_fstype.c b/fs/gfs2/ops_fstype.c
index c6872d0..1f855a7 100644
--- a/fs/gfs2/ops_fstype.c
+++ b/fs/gfs2/ops_fstype.c
@@ -114,9 +114,7 @@ static struct gfs2_sbd *init_sbd(struct super_block *sb)

spin_lock_init(&sdp->sd_log_lock);
atomic_set(&sdp->sd_log_pinned, 0);
- INIT_LIST_HEAD(&sdp->sd_log_le_buf);
INIT_LIST_HEAD(&sdp->sd_log_le_revoke);
- INIT_LIST_HEAD(&sdp->sd_log_le_databuf);
INIT_LIST_HEAD(&sdp->sd_log_le_ordered);
spin_lock_init(&sdp->sd_ordered_lock);

diff --git a/fs/gfs2/trans.c b/fs/gfs2/trans.c
index 963b28c..e0464a2 100644
--- a/fs/gfs2/trans.c
+++ b/fs/gfs2/trans.c
@@ -51,6 +51,9 @@ int gfs2_trans_begin(struct gfs2_sbd *sdp, unsigned int blocks,
if (revokes)
tr->tr_reserved += gfs2_struct2blk(sdp, revokes,
sizeof(u64));
+ INIT_LIST_HEAD(&tr->tr_databuf);
+ INIT_LIST_HEAD(&tr->tr_buf);
+
sb_start_intwrite(sdp->sd_vfs);
gfs2_holder_init(sdp->sd_trans_gl, LM_ST_SHARED, 0, &tr->tr_t_gh);

@@ -211,7 +214,7 @@ void gfs2_trans_add_data(struct gfs2_glock *gl, struct buffer_head *bh)
gfs2_pin(sdp, bd->bd_bh);
tr->tr_num_databuf_new++;
sdp->sd_log_num_databuf++;
- list_add_tail(&bd->bd_list, &sdp->sd_log_le_databuf);
+ list_add_tail(&bd->bd_list, &tr->tr_databuf);
}
gfs2_log_unlock(sdp);
unlock_buffer(bh);
@@ -239,7 +242,7 @@ static void meta_lo_add(struct gfs2_sbd *sdp, struct gfs2_bufdata *bd)
mh->__pad0 = cpu_to_be64(0);
mh->mh_jid = cpu_to_be32(sdp->sd_jdesc->jd_jid);
sdp->sd_log_num_buf++;
- list_add(&bd->bd_list, &sdp->sd_log_le_buf);
+ list_add(&bd->bd_list, &tr->tr_buf);
tr->tr_num_buf_new++;
}

--
1.8.3.1

2014-04-01 09:24:28

by Steven Whitehouse

[permalink] [raw]
Subject: [PATCH 08/29] GFS2: Reduce struct gfs2_trans in size

A couple of "int" fields were being used as boolean values
so we can make them bitfields of one bit, and put them in
what might otherwise be a hole in the structure with 64
bit alignment.

Signed-off-by: Steven Whitehouse <[email protected]>

diff --git a/fs/gfs2/incore.h b/fs/gfs2/incore.h
index cf0e344..645655c 100644
--- a/fs/gfs2/incore.h
+++ b/fs/gfs2/incore.h
@@ -462,11 +462,11 @@ struct gfs2_trans {
unsigned int tr_blocks;
unsigned int tr_revokes;
unsigned int tr_reserved;
+ unsigned int tr_touched:1;
+ unsigned int tr_attached:1;

struct gfs2_holder tr_t_gh;

- int tr_touched;
- int tr_attached;

unsigned int tr_num_buf_new;
unsigned int tr_num_databuf_new;
diff --git a/fs/gfs2/trans.c b/fs/gfs2/trans.c
index 2b20d70..963b28c 100644
--- a/fs/gfs2/trans.c
+++ b/fs/gfs2/trans.c
@@ -98,7 +98,7 @@ static void gfs2_print_trans(const struct gfs2_trans *tr)
{
printk(KERN_WARNING "GFS2: Transaction created at: %pSR\n",
(void *)tr->tr_ip);
- printk(KERN_WARNING "GFS2: blocks=%u revokes=%u reserved=%u touched=%d\n",
+ printk(KERN_WARNING "GFS2: blocks=%u revokes=%u reserved=%u touched=%u\n",
tr->tr_blocks, tr->tr_revokes, tr->tr_reserved, tr->tr_touched);
printk(KERN_WARNING "GFS2: Buf %u/%u Databuf %u/%u Revoke %u/%u\n",
tr->tr_num_buf_new, tr->tr_num_buf_rm,
--
1.8.3.1

2014-04-01 09:16:19

by Steven Whitehouse

[permalink] [raw]
Subject: [PATCH 04/29] GFS2: Lock i_mutex and use a local gfs2_holder for fallocate

From: Bob Peterson <[email protected]>

This patch causes GFS2 to lock the i_mutex during fallocate. It
also switches from using a dinode's inode glock to using a local
holder like the other GFS2 i_operations.

Signed-off-by: Bob Peterson <[email protected]>
Signed-off-by: Steven Whitehouse <[email protected]>

diff --git a/fs/gfs2/file.c b/fs/gfs2/file.c
index efc078f..6c79408 100644
--- a/fs/gfs2/file.c
+++ b/fs/gfs2/file.c
@@ -811,6 +811,8 @@ static long gfs2_fallocate(struct file *file, int mode, loff_t offset,
loff_t bsize_mask = ~((loff_t)sdp->sd_sb.sb_bsize - 1);
loff_t next = (offset + len - 1) >> sdp->sd_sb.sb_bsize_shift;
loff_t max_chunk_size = UINT_MAX & bsize_mask;
+ struct gfs2_holder gh;
+
next = (next + 1) << sdp->sd_sb.sb_bsize_shift;

/* We only support the FALLOC_FL_KEEP_SIZE mode */
@@ -831,8 +833,10 @@ static long gfs2_fallocate(struct file *file, int mode, loff_t offset,
if (error)
return error;

- gfs2_holder_init(ip->i_gl, LM_ST_EXCLUSIVE, 0, &ip->i_gh);
- error = gfs2_glock_nq(&ip->i_gh);
+ mutex_lock(&inode->i_mutex);
+
+ gfs2_holder_init(ip->i_gl, LM_ST_EXCLUSIVE, 0, &gh);
+ error = gfs2_glock_nq(&gh);
if (unlikely(error))
goto out_uninit;

@@ -900,9 +904,10 @@ out_trans_fail:
out_qunlock:
gfs2_quota_unlock(ip);
out_unlock:
- gfs2_glock_dq(&ip->i_gh);
+ gfs2_glock_dq(&gh);
out_uninit:
- gfs2_holder_uninit(&ip->i_gh);
+ gfs2_holder_uninit(&gh);
+ mutex_unlock(&inode->i_mutex);
return error;
}

--
1.8.3.1

2014-04-01 09:16:16

by Steven Whitehouse

[permalink] [raw]
Subject: [PATCH 05/29] GFS2: Add meta readahead field in directory entries

The intent of this new field in the directory entry is to
allow a subsequent lookup to know how many blocks, which
are contiguous with the inode, contain metadata which relates
to the inode. This will then allow the issuing of a single
read to read these blocks, rather than reading the inode
first, and then issuing a second read for the metadata.

This only works under some fairly strict conditions, since
we do not have back pointers from inodes to directory entries
we must ensure that the blocks referenced in this way will
always belong to the inode.

This rules out being able to use this system for indirect
blocks, as these can change as a result of truncate/rewrite.

So the idea here is to restrict this to xattr blocks only
for the time being. For most inodes, that means only a
single block. Also, when using ACLs and/or SELinux or
other LSMs, these will be added at inode creation time
so that they will be contiguous with the inode on disk and
also will almost always be needed when we read the inode in
for permissions checks.

Once an xattr block for an inode is allocated, it will never
change until the inode is deallocated.

This patch adds the new field, a further patch will add the
readahead in due course.

Signed-off-by: Steven Whitehouse <[email protected]>

diff --git a/fs/gfs2/dir.c b/fs/gfs2/dir.c
index fa32655..ffcfdd1 100644
--- a/fs/gfs2/dir.c
+++ b/fs/gfs2/dir.c
@@ -1684,6 +1684,14 @@ static int dir_new_leaf(struct inode *inode, const struct qstr *name)
return 0;
}

+static u16 gfs2_inode_ra_len(const struct gfs2_inode *ip)
+{
+ u64 where = ip->i_no_addr + 1;
+ if (ip->i_eattr == where)
+ return 1;
+ return 0;
+}
+
/**
* gfs2_dir_add - Add new filename into directory
* @inode: The directory inode
@@ -1721,6 +1729,7 @@ int gfs2_dir_add(struct inode *inode, const struct qstr *name,
dent = gfs2_init_dirent(inode, dent, name, bh);
gfs2_inum_out(nip, dent);
dent->de_type = cpu_to_be16(IF2DT(nip->i_inode.i_mode));
+ dent->de_rahead = cpu_to_be16(gfs2_inode_ra_len(nip));
tv = CURRENT_TIME;
if (ip->i_diskflags & GFS2_DIF_EXHASH) {
leaf = (struct gfs2_leaf *)bh->b_data;
diff --git a/include/uapi/linux/gfs2_ondisk.h b/include/uapi/linux/gfs2_ondisk.h
index 3100208..db3fdd0 100644
--- a/include/uapi/linux/gfs2_ondisk.h
+++ b/include/uapi/linux/gfs2_ondisk.h
@@ -304,7 +304,13 @@ struct gfs2_dirent {
__be16 de_rec_len;
__be16 de_name_len;
__be16 de_type;
- __u8 __pad[14];
+ union {
+ __u8 __pad[14];
+ struct {
+ __be16 de_rahead;
+ __u8 pad2[12];
+ };
+ };
};

/*
--
1.8.3.1

2014-04-01 09:25:07

by Steven Whitehouse

[permalink] [raw]
Subject: [PATCH 07/29] GFS2: add missing newline

From: David Teigland <[email protected]>

Log message is missing newline.

Signed-off-by: David Teigland <[email protected]>
Signed-off-by: Steven Whitehouse <[email protected]>

diff --git a/fs/gfs2/lock_dlm.c b/fs/gfs2/lock_dlm.c
index 2a6ba06..6b97d98 100644
--- a/fs/gfs2/lock_dlm.c
+++ b/fs/gfs2/lock_dlm.c
@@ -1102,7 +1102,7 @@ static void gdlm_recover_slot(void *arg, struct dlm_slot *slot)
}

if (ls->ls_recover_submit[jid]) {
- fs_info(sdp, "recover_slot jid %d gen %u prev %u",
+ fs_info(sdp, "recover_slot jid %d gen %u prev %u\n",
jid, ls->ls_recover_block, ls->ls_recover_submit[jid]);
}
ls->ls_recover_submit[jid] = ls->ls_recover_block;
--
1.8.3.1

2014-04-01 09:25:49

by Steven Whitehouse

[permalink] [raw]
Subject: [PATCH 06/29] GFS2: Mark functions as static in gfs2/rgrp.c

From: Rashika Kheria <[email protected]>

Mark functions as static in gfs2/rgrp.c because they are not used
outside this file.

This eliminates the following warning in gfs2/rgrp.c:
fs/gfs2/rgrp.c:1092:5: warning: no previous prototype for ‘gfs2_rgrp_bh_get’ [-Wmissing-prototypes]
fs/gfs2/rgrp.c:1157:5: warning: no previous prototype for ‘update_rgrp_lvb’ [-Wmissing-prototypes]

Signed-off-by: Rashika Kheria <[email protected]>
Reviewed-by: Josh Triplett <[email protected]>
Signed-off-by: Steven Whitehouse <[email protected]>

diff --git a/fs/gfs2/rgrp.c b/fs/gfs2/rgrp.c
index c13e4c5..f585746 100644
--- a/fs/gfs2/rgrp.c
+++ b/fs/gfs2/rgrp.c
@@ -1102,7 +1102,7 @@ static u32 count_unlinked(struct gfs2_rgrpd *rgd)
* Returns: errno
*/

-int gfs2_rgrp_bh_get(struct gfs2_rgrpd *rgd)
+static int gfs2_rgrp_bh_get(struct gfs2_rgrpd *rgd)
{
struct gfs2_sbd *sdp = rgd->rd_sbd;
struct gfs2_glock *gl = rgd->rd_gl;
@@ -1169,7 +1169,7 @@ fail:
return error;
}

-int update_rgrp_lvb(struct gfs2_rgrpd *rgd)
+static int update_rgrp_lvb(struct gfs2_rgrpd *rgd)
{
u32 rl_flags;

--
1.8.3.1

2014-04-01 09:26:28

by Steven Whitehouse

[permalink] [raw]
Subject: [PATCH 02/29] GFS2: Allocate block for xattr at inode alloc time, if required

This is another step towards improving the allocation of xattr
blocks at inode allocation time. Here we take advantage of
Christoph's recent work on ACLs to allocate a block for the
xattrs early if we know that we will be adding ACLs to the
inode later on. The advantage of that is that it is much
more likely that we'll get a contiguous run of two blocks
where the first is the inode and the second is the xattr block.

We still have to fall back to the original system in case we
don't get the requested two contiguous blocks, or in case the
ACLs are too large to fit into the block.

Future patches will move more of the ACL setting code further
up the gfs2_inode_create() function. Also, I'd like to be
able to do the same thing with the xattrs from LSMs in
due course, too. That way we should be able to slowly reduce
the number of independent transactions, at least in the
most common cases.

Signed-off-by: Steven Whitehouse <[email protected]>

diff --git a/fs/gfs2/inode.c b/fs/gfs2/inode.c
index 5c52418..ec455b9 100644
--- a/fs/gfs2/inode.c
+++ b/fs/gfs2/inode.c
@@ -376,12 +376,11 @@ static void munge_mode_uid_gid(const struct gfs2_inode *dip,
inode->i_gid = current_fsgid();
}

-static int alloc_dinode(struct gfs2_inode *ip, u32 flags)
+static int alloc_dinode(struct gfs2_inode *ip, u32 flags, unsigned *dblocks)
{
struct gfs2_sbd *sdp = GFS2_SB(&ip->i_inode);
- struct gfs2_alloc_parms ap = { .target = RES_DINODE, .aflags = flags, };
+ struct gfs2_alloc_parms ap = { .target = *dblocks, .aflags = flags, };
int error;
- int dblocks = 1;

error = gfs2_quota_lock_check(ip);
if (error)
@@ -391,11 +390,11 @@ static int alloc_dinode(struct gfs2_inode *ip, u32 flags)
if (error)
goto out_quota;

- error = gfs2_trans_begin(sdp, RES_RG_BIT + RES_STATFS + RES_QUOTA, 0);
+ error = gfs2_trans_begin(sdp, (*dblocks * RES_RG_BIT) + RES_STATFS + RES_QUOTA, 0);
if (error)
goto out_ipreserv;

- error = gfs2_alloc_blocks(ip, &ip->i_no_addr, &dblocks, 1, &ip->i_generation);
+ error = gfs2_alloc_blocks(ip, &ip->i_no_addr, dblocks, 1, &ip->i_generation);
ip->i_no_formal_ino = ip->i_generation;
ip->i_inode.i_ino = ip->i_no_addr;
ip->i_goal = ip->i_no_addr;
@@ -428,6 +427,33 @@ static void gfs2_init_dir(struct buffer_head *dibh,
}

/**
+ * gfs2_init_xattr - Initialise an xattr block for a new inode
+ * @ip: The inode in question
+ *
+ * This sets up an empty xattr block for a new inode, ready to
+ * take any ACLs, LSM xattrs, etc.
+ */
+
+static void gfs2_init_xattr(struct gfs2_inode *ip)
+{
+ struct gfs2_sbd *sdp = GFS2_SB(&ip->i_inode);
+ struct buffer_head *bh;
+ struct gfs2_ea_header *ea;
+
+ bh = gfs2_meta_new(ip->i_gl, ip->i_eattr);
+ gfs2_trans_add_meta(ip->i_gl, bh);
+ gfs2_metatype_set(bh, GFS2_METATYPE_EA, GFS2_FORMAT_EA);
+ gfs2_buffer_clear_tail(bh, sizeof(struct gfs2_meta_header));
+
+ ea = GFS2_EA_BH2FIRST(bh);
+ ea->ea_rec_len = cpu_to_be32(sdp->sd_jbsize);
+ ea->ea_type = GFS2_EATYPE_UNUSED;
+ ea->ea_flags = GFS2_EAFLAG_LAST;
+
+ brelse(bh);
+}
+
+/**
* init_dinode - Fill in a new dinode structure
* @dip: The directory this inode is being created in
* @ip: The inode
@@ -580,6 +606,7 @@ static int gfs2_create_inode(struct inode *dir, struct dentry *dentry,
struct dentry *d;
int error;
u32 aflags = 0;
+ unsigned blocks = 1;
struct gfs2_diradd da = { .bh = NULL, };

if (!name->len || name->len > GFS2_FNAMESIZE)
@@ -676,10 +703,15 @@ static int gfs2_create_inode(struct inode *dir, struct dentry *dentry,
(dip->i_diskflags & GFS2_DIF_TOPDIR))
aflags |= GFS2_AF_ORLOV;

- error = alloc_dinode(ip, aflags);
+ if (default_acl || acl)
+ blocks++;
+
+ error = alloc_dinode(ip, aflags, &blocks);
if (error)
goto fail_free_inode;

+ gfs2_set_inode_blocks(inode, blocks);
+
error = gfs2_glock_get(sdp, ip->i_no_addr, &gfs2_inode_glops, CREATE, &ip->i_gl);
if (error)
goto fail_free_inode;
@@ -689,10 +721,14 @@ static int gfs2_create_inode(struct inode *dir, struct dentry *dentry,
if (error)
goto fail_free_inode;

- error = gfs2_trans_begin(sdp, RES_DINODE, 0);
+ error = gfs2_trans_begin(sdp, blocks, 0);
if (error)
goto fail_gunlock2;

+ if (blocks > 1) {
+ ip->i_eattr = ip->i_no_addr + 1;
+ gfs2_init_xattr(ip);
+ }
init_dinode(dip, ip, symname);
gfs2_trans_end(sdp);

diff --git a/fs/gfs2/rgrp.c b/fs/gfs2/rgrp.c
index a1da213..c13e4c5 100644
--- a/fs/gfs2/rgrp.c
+++ b/fs/gfs2/rgrp.c
@@ -2296,7 +2296,7 @@ int gfs2_alloc_blocks(struct gfs2_inode *ip, u64 *bn, unsigned int *nblocks,

gfs2_statfs_change(sdp, 0, -(s64)*nblocks, dinode ? 1 : 0);
if (dinode)
- gfs2_trans_add_unrevoke(sdp, block, 1);
+ gfs2_trans_add_unrevoke(sdp, block, *nblocks);

gfs2_quota_change(ip, *nblocks, ip->i_inode.i_uid, ip->i_inode.i_gid);

diff --git a/include/uapi/linux/gfs2_ondisk.h b/include/uapi/linux/gfs2_ondisk.h
index 0f24c07..3100208 100644
--- a/include/uapi/linux/gfs2_ondisk.h
+++ b/include/uapi/linux/gfs2_ondisk.h
@@ -347,9 +347,9 @@ struct gfs2_leaf {
* metadata header. Each inode, if it has extended attributes, will
* have either a single block containing the extended attribute headers
* or a single indirect block pointing to blocks containing the
- * extended attribure headers.
+ * extended attribute headers.
*
- * The maximim size of the data part of an extended attribute is 64k
+ * The maximum size of the data part of an extended attribute is 64k
* so the number of blocks required depends upon block size. Since the
* block size also determines the number of pointers in an indirect
* block, its a fairly complicated calculation to work out the maximum
--
1.8.3.1