2011-07-30 06:20:06

by Boaz Harrosh

Subject: [PATCHSET 0/6] Patches needed for Stable

Hi Trond,

I've played all day with these commit patches and came to the conclusion
that we will need all 5 of Peng's commit patches. The minimal set is the first
4, but without the 5th one it might, as you pointed out, be very racy. Let me
explain.

At first I thought all I needed was:
[PATCH v4 05/27] pnfs: let layoutcommit handle a list of lseg
[PATCH v4 06/27] pnfs: use lwb as layoutcommit length

But they do not apply as is, and would not work.
Instead of taking the creds and lwb from the *header*, I tried taking them
from the *first segment*. But this is an ugly hack (I tried, it's really ugly)
and is scary as hell, because it is code we never tested, as opposed to this
code, which is the only one I have actually tested since before the last
Bakeathon. I would hate to put such under-tested, ugly code in Stable when we
will hopefully change it in this Kernel anyway.

So introducing these two:
[PATCH v4 03/27] pnfs: save layoutcommit lwb at layout header
[PATCH v4 04/27] pnfs: save layoutcommit cred at layout header
just makes sense.

Now also:
[PATCH v4 07/27] NFS41: save layoutcommit cred in layout header init

You pointed this one out as fixing a potential race, which I completely agree
with, and I think it should just be merged with:
[PATCH 2/6] pnfs: save layoutcommit cred at layout header
since both address the exact same issue, only done better.

I'm adding two more patches I'll need for Stable. They fix bugs we found
in the extensive testing we did here after Bakeathon:
[PATCH 5/6] pnfs-obj: Bug when we are running out of bio
[PATCH 6/6] pnfs-obj: Fix the comp_index != 0 case

----
So here is the list of patches for stable:

[PATCH 1/6] pnfs: save layoutcommit lwb at layout header
[PATCH 2/6] pnfs: save layoutcommit cred at layout header init
[PATCH 3/6] pnfs: let layoutcommit handle a list of lseg
[PATCH 4/6] pnfs: use lwb as layoutcommit length

These 4 are Peng's 5 layoutcommit patches. I've combined
[PATCH v4 04/27] pnfs: save layoutcommit cred at layout header
and
[PATCH v4 07/27] NFS41: save layoutcommit cred in layout header init
which are just the same issue, rebased the last one on top of that,
and added CC: Stable.

[PATCH 5/6] pnfs-obj: Bug when we are running out of bio
[PATCH 6/6] pnfs-obj: Fix the comp_index != 0 case

Thanks
Boaz


2011-07-30 06:30:21

by Boaz Harrosh

Subject: [PATCH 1/6] pnfs: save layoutcommit lwb at layout header

From: Peng Tao <[email protected]>

No need to save it for every lseg.

[Needed in v3.0]
CC: Stable Tree <[email protected]>
Signed-off-by: Peng Tao <[email protected]>
---
fs/nfs/nfs4filelayout.c | 2 +-
fs/nfs/pnfs.c | 10 ++++++----
fs/nfs/pnfs.h | 2 +-
3 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/fs/nfs/nfs4filelayout.c b/fs/nfs/nfs4filelayout.c
index f9d03ab..614c4d2 100644
--- a/fs/nfs/nfs4filelayout.c
+++ b/fs/nfs/nfs4filelayout.c
@@ -170,7 +170,7 @@ filelayout_set_layoutcommit(struct nfs_write_data *wdata)

pnfs_set_layoutcommit(wdata);
dprintk("%s ionde %lu pls_end_pos %lu\n", __func__, wdata->inode->i_ino,
- (unsigned long) wdata->lseg->pls_end_pos);
+ (unsigned long) NFS_I(wdata->inode)->layout->plh_lwb);
}

/*
diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
index 29c0ca7..fb1bcf1 100644
--- a/fs/nfs/pnfs.c
+++ b/fs/nfs/pnfs.c
@@ -1224,9 +1224,11 @@ pnfs_set_layoutcommit(struct nfs_write_data *wdata)
dprintk("%s: Set layoutcommit for inode %lu ",
__func__, wdata->inode->i_ino);
}
- if (end_pos > wdata->lseg->pls_end_pos)
- wdata->lseg->pls_end_pos = end_pos;
+ if (end_pos > nfsi->layout->plh_lwb)
+ nfsi->layout->plh_lwb = end_pos;
spin_unlock(&nfsi->vfs_inode.i_lock);
+ dprintk("%s: lseg %p end_pos %llu\n",
+ __func__, wdata->lseg, nfsi->layout->plh_lwb);

/* if pnfs_layoutcommit_inode() runs between inode locks, the next one
* will be a noop because NFS_INO_LAYOUTCOMMIT will not be set */
@@ -1278,9 +1280,9 @@ pnfs_layoutcommit_inode(struct inode *inode, bool sync)
*/
lseg = pnfs_list_write_lseg(inode);

- end_pos = lseg->pls_end_pos;
+ end_pos = nfsi->layout->plh_lwb;
cred = lseg->pls_lc_cred;
- lseg->pls_end_pos = 0;
+ nfsi->layout->plh_lwb = 0;
lseg->pls_lc_cred = NULL;

memcpy(&data->args.stateid.data, nfsi->layout->plh_stateid.data,
diff --git a/fs/nfs/pnfs.h b/fs/nfs/pnfs.h
index 96bf4e6..77e1b24 100644
--- a/fs/nfs/pnfs.h
+++ b/fs/nfs/pnfs.h
@@ -45,7 +45,6 @@ struct pnfs_layout_segment {
unsigned long pls_flags;
struct pnfs_layout_hdr *pls_layout;
struct rpc_cred *pls_lc_cred; /* LAYOUTCOMMIT credential */
- loff_t pls_end_pos; /* LAYOUTCOMMIT write end */
};

enum pnfs_try_status {
@@ -124,6 +123,7 @@ struct pnfs_layout_hdr {
unsigned long plh_block_lgets; /* block LAYOUTGET if >0 */
u32 plh_barrier; /* ignore lower seqids */
unsigned long plh_flags;
+ loff_t plh_lwb; /* last write byte for layoutcommit */
struct inode *plh_inode;
};

--
1.7.6
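
For readers following along, the idea of this patch can be sketched in a few
lines of userspace C. This is only an illustration, not the kernel code:
`struct layout_hdr`, `record_write()` and `take_lwb()` are made-up names, and
the locking that the real code does under `i_lock` is left out.

```c
#include <stdint.h>

/* Hypothetical stand-in for the layout header; only the lwb field matters. */
struct layout_hdr {
    uint64_t lwb;   /* highest write end position seen since the last commit */
};

/* Record the end position of a completed write, keeping the maximum. */
static void record_write(struct layout_hdr *lo, uint64_t end_pos)
{
    if (end_pos > lo->lwb)
        lo->lwb = end_pos;
}

/* Fetch the value for the next layoutcommit and reset it for the next round. */
static uint64_t take_lwb(struct layout_hdr *lo)
{
    uint64_t end_pos = lo->lwb;

    lo->lwb = 0;
    return end_pos;
}
```

Because the value lives in one place per file, writes that land in different
segments all feed the same maximum.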



2011-07-30 06:32:37

by Boaz Harrosh

Subject: [PATCH 3/6] pnfs: let layoutcommit handle a list of lseg

From: Peng Tao <[email protected]>

There can be multiple lsegs per file, so layoutcommit should be
able to handle them.

[Needed in v3.0]
CC: Stable Tree <[email protected]>
Signed-off-by: Peng Tao <[email protected]>
Signed-off-by: Boaz Harrosh <[email protected]>
---
fs/nfs/nfs4proc.c | 8 +++++++-
fs/nfs/pnfs.c | 32 ++++++++++++++++----------------
fs/nfs/pnfs.h | 2 ++
include/linux/nfs_xdr.h | 2 +-
4 files changed, 26 insertions(+), 18 deletions(-)

diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 5879b23..92cfd2e 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -5850,9 +5850,15 @@ nfs4_layoutcommit_done(struct rpc_task *task, void *calldata)
static void nfs4_layoutcommit_release(void *calldata)
{
struct nfs4_layoutcommit_data *data = calldata;
+ struct pnfs_layout_segment *lseg, *tmp;

/* Matched by references in pnfs_set_layoutcommit */
- put_lseg(data->lseg);
+ list_for_each_entry_safe(lseg, tmp, &data->lseg_list, pls_lc_list) {
+ list_del_init(&lseg->pls_lc_list);
+ if (test_and_clear_bit(NFS_LSEG_LAYOUTCOMMIT,
+ &lseg->pls_flags))
+ put_lseg(lseg);
+ }
put_rpccred(data->cred);
kfree(data);
}
diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
index baa2a04..a726c0a 100644
--- a/fs/nfs/pnfs.c
+++ b/fs/nfs/pnfs.c
@@ -224,6 +224,7 @@ static void
init_lseg(struct pnfs_layout_hdr *lo, struct pnfs_layout_segment *lseg)
{
INIT_LIST_HEAD(&lseg->pls_list);
+ INIT_LIST_HEAD(&lseg->pls_lc_list);
atomic_set(&lseg->pls_refcount, 1);
smp_mb();
set_bit(NFS_LSEG_VALID, &lseg->pls_flags);
@@ -1201,16 +1202,17 @@ pnfs_try_to_read_data(struct nfs_read_data *rdata,
}

/*
- * Currently there is only one (whole file) write lseg.
+ * There can be multiple RW segments.
*/
-static struct pnfs_layout_segment *pnfs_list_write_lseg(struct inode *inode)
+static void pnfs_list_write_lseg(struct inode *inode, struct list_head *listp)
{
- struct pnfs_layout_segment *lseg, *rv = NULL;
+ struct pnfs_layout_segment *lseg;

- list_for_each_entry(lseg, &NFS_I(inode)->layout->plh_segs, pls_list)
- if (lseg->pls_range.iomode == IOMODE_RW)
- rv = lseg;
- return rv;
+ list_for_each_entry(lseg, &NFS_I(inode)->layout->plh_segs, pls_list) {
+ if (lseg->pls_range.iomode == IOMODE_RW &&
+ test_bit(NFS_LSEG_LAYOUTCOMMIT, &lseg->pls_flags))
+ list_add(&lseg->pls_lc_list, listp);
+ }
}

void
@@ -1222,12 +1224,14 @@ pnfs_set_layoutcommit(struct nfs_write_data *wdata)

spin_lock(&nfsi->vfs_inode.i_lock);
if (!test_and_set_bit(NFS_INO_LAYOUTCOMMIT, &nfsi->flags)) {
- /* references matched in nfs4_layoutcommit_release */
- get_lseg(wdata->lseg);
mark_as_dirty = true;
dprintk("%s: Set layoutcommit for inode %lu ",
__func__, wdata->inode->i_ino);
}
+ if (!test_and_set_bit(NFS_LSEG_LAYOUTCOMMIT, &wdata->lseg->pls_flags)) {
+ /* references matched in nfs4_layoutcommit_release */
+ get_lseg(wdata->lseg);
+ }
if (end_pos > nfsi->layout->plh_lwb)
nfsi->layout->plh_lwb = end_pos;
spin_unlock(&nfsi->vfs_inode.i_lock);
@@ -1254,7 +1258,6 @@ pnfs_layoutcommit_inode(struct inode *inode, bool sync)
{
struct nfs4_layoutcommit_data *data;
struct nfs_inode *nfsi = NFS_I(inode);
- struct pnfs_layout_segment *lseg;
loff_t end_pos;
int status = 0;

@@ -1271,17 +1274,15 @@ pnfs_layoutcommit_inode(struct inode *inode, bool sync)
goto out;
}

+ INIT_LIST_HEAD(&data->lseg_list);
spin_lock(&inode->i_lock);
if (!test_and_clear_bit(NFS_INO_LAYOUTCOMMIT, &nfsi->flags)) {
spin_unlock(&inode->i_lock);
kfree(data);
goto out;
}
- /*
- * Currently only one (whole file) write lseg which is referenced
- * in pnfs_set_layoutcommit and will be found.
- */
- lseg = pnfs_list_write_lseg(inode);
+
+ pnfs_list_write_lseg(inode, &data->lseg_list);

end_pos = nfsi->layout->plh_lwb;
nfsi->layout->plh_lwb = 0;
@@ -1291,7 +1292,6 @@ pnfs_layoutcommit_inode(struct inode *inode, bool sync)
spin_unlock(&inode->i_lock);

data->args.inode = inode;
- data->lseg = lseg;
data->cred = get_rpccred(nfsi->layout->plh_lc_cred);
nfs_fattr_init(&data->fattr);
data->args.bitmask = NFS_SERVER(inode)->cache_consistency_bitmask;
diff --git a/fs/nfs/pnfs.h b/fs/nfs/pnfs.h
index 6969594..9d147d9 100644
--- a/fs/nfs/pnfs.h
+++ b/fs/nfs/pnfs.h
@@ -36,10 +36,12 @@
enum {
NFS_LSEG_VALID = 0, /* cleared when lseg is recalled/returned */
NFS_LSEG_ROC, /* roc bit received from server */
+ NFS_LSEG_LAYOUTCOMMIT, /* layoutcommit bit set for layoutcommit */
};

struct pnfs_layout_segment {
struct list_head pls_list;
+ struct list_head pls_lc_list;
struct pnfs_layout_range pls_range;
atomic_t pls_refcount;
unsigned long pls_flags;
diff --git a/include/linux/nfs_xdr.h b/include/linux/nfs_xdr.h
index 00848d8..be2eba7 100644
--- a/include/linux/nfs_xdr.h
+++ b/include/linux/nfs_xdr.h
@@ -262,7 +262,7 @@ struct nfs4_layoutcommit_res {
struct nfs4_layoutcommit_data {
struct rpc_task task;
struct nfs_fattr fattr;
- struct pnfs_layout_segment *lseg;
+ struct list_head lseg_list;
struct rpc_cred *cred;
struct nfs4_layoutcommit_args args;
struct nfs4_layoutcommit_res res;
--
1.7.6
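
The mark / collect / release cycle in this patch can be modelled roughly as
below. It is a simplified userspace sketch under several assumptions: a plain
singly linked list replaces the kernel `list_head`, a `bool` replaces the
`NFS_LSEG_LAYOUTCOMMIT` bit, an `int` replaces the atomic refcount, and all
names (`struct seg`, `mark_for_commit()`, ...) are invented for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

enum iomode { IOMODE_READ = 1, IOMODE_RW = 2 };

struct seg {
    enum iomode iomode;
    bool needs_commit;    /* stands in for the NFS_LSEG_LAYOUTCOMMIT bit */
    int refcount;
    struct seg *next;     /* all segments of the layout */
    struct seg *lc_next;  /* segments picked up for the current commit */
};

/* Mark a segment as written; take a reference only on the first marking. */
static void mark_for_commit(struct seg *s)
{
    if (!s->needs_commit) {
        s->needs_commit = true;
        s->refcount++;
    }
}

/* Gather every marked RW segment and return the head of the commit list. */
static struct seg *collect_commit_list(struct seg *all)
{
    struct seg *head = NULL;

    for (struct seg *s = all; s; s = s->next) {
        if (s->iomode == IOMODE_RW && s->needs_commit) {
            s->lc_next = head;
            head = s;
        }
    }
    return head;
}

/* Unlink each segment, clear its mark and drop the matching reference. */
static void release_commit_list(struct seg *head)
{
    while (head) {
        struct seg *next = head->lc_next;

        head->lc_next = NULL;
        if (head->needs_commit) {
            head->needs_commit = false;
            head->refcount--;
        }
        head = next;
    }
}
```

The point of the per-segment mark is that each segment holds exactly one extra
reference per commit cycle, no matter how many writes touched it.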



2011-07-30 06:30:56

by Boaz Harrosh

Subject: [PATCH 2/6] pnfs: save layoutcommit cred at layout header init

From: Peng Tao <[email protected]>

No need to save it for every lseg.
No need to save it at every pnfs_set_layoutcommit.

[Needed in v3.0]
CC: Stable Tree <[email protected]>
Signed-off-by: Peng Tao <[email protected]>
Signed-off-by: Boaz Harrosh <[email protected]>
---
fs/nfs/pnfs.c | 21 +++++++++++----------
fs/nfs/pnfs.h | 2 +-
2 files changed, 12 insertions(+), 11 deletions(-)

diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
index fb1bcf1..baa2a04 100644
--- a/fs/nfs/pnfs.c
+++ b/fs/nfs/pnfs.c
@@ -189,6 +189,7 @@ static void
pnfs_free_layout_hdr(struct pnfs_layout_hdr *lo)
{
struct pnfs_layoutdriver_type *ld = NFS_SERVER(lo->plh_inode)->pnfs_curr_ld;
+ put_rpccred(lo->plh_lc_cred);
return ld->alloc_layout_hdr ? ld->free_layout_hdr(lo) : kfree(lo);
}

@@ -805,7 +806,9 @@ out:
}

static struct pnfs_layout_hdr *
-alloc_init_layout_hdr(struct inode *ino, gfp_t gfp_flags)
+alloc_init_layout_hdr(struct inode *ino,
+ struct nfs_open_context *ctx,
+ gfp_t gfp_flags)
{
struct pnfs_layout_hdr *lo;

@@ -817,11 +820,14 @@ alloc_init_layout_hdr(struct inode *ino, gfp_t gfp_flags)
INIT_LIST_HEAD(&lo->plh_segs);
INIT_LIST_HEAD(&lo->plh_bulk_recall);
lo->plh_inode = ino;
+ lo->plh_lc_cred = get_rpccred(ctx->state->owner->so_cred);
return lo;
}

static struct pnfs_layout_hdr *
-pnfs_find_alloc_layout(struct inode *ino, gfp_t gfp_flags)
+pnfs_find_alloc_layout(struct inode *ino,
+ struct nfs_open_context *ctx,
+ gfp_t gfp_flags)
{
struct nfs_inode *nfsi = NFS_I(ino);
struct pnfs_layout_hdr *new = NULL;
@@ -836,7 +842,7 @@ pnfs_find_alloc_layout(struct inode *ino, gfp_t gfp_flags)
return nfsi->layout;
}
spin_unlock(&ino->i_lock);
- new = alloc_init_layout_hdr(ino, gfp_flags);
+ new = alloc_init_layout_hdr(ino, ctx, gfp_flags);
spin_lock(&ino->i_lock);

if (likely(nfsi->layout == NULL)) /* Won the race? */
@@ -928,7 +934,7 @@ pnfs_update_layout(struct inode *ino,
if (!pnfs_enabled_sb(NFS_SERVER(ino)))
return NULL;
spin_lock(&ino->i_lock);
- lo = pnfs_find_alloc_layout(ino, gfp_flags);
+ lo = pnfs_find_alloc_layout(ino, ctx, gfp_flags);
if (lo == NULL) {
dprintk("%s ERROR: can't get pnfs_layout_hdr\n", __func__);
goto out_unlock;
@@ -1218,8 +1224,6 @@ pnfs_set_layoutcommit(struct nfs_write_data *wdata)
if (!test_and_set_bit(NFS_INO_LAYOUTCOMMIT, &nfsi->flags)) {
/* references matched in nfs4_layoutcommit_release */
get_lseg(wdata->lseg);
- wdata->lseg->pls_lc_cred =
- get_rpccred(wdata->args.context->state->owner->so_cred);
mark_as_dirty = true;
dprintk("%s: Set layoutcommit for inode %lu ",
__func__, wdata->inode->i_ino);
@@ -1251,7 +1255,6 @@ pnfs_layoutcommit_inode(struct inode *inode, bool sync)
struct nfs4_layoutcommit_data *data;
struct nfs_inode *nfsi = NFS_I(inode);
struct pnfs_layout_segment *lseg;
- struct rpc_cred *cred;
loff_t end_pos;
int status = 0;

@@ -1281,9 +1284,7 @@ pnfs_layoutcommit_inode(struct inode *inode, bool sync)
lseg = pnfs_list_write_lseg(inode);

end_pos = nfsi->layout->plh_lwb;
- cred = lseg->pls_lc_cred;
nfsi->layout->plh_lwb = 0;
- lseg->pls_lc_cred = NULL;

memcpy(&data->args.stateid.data, nfsi->layout->plh_stateid.data,
sizeof(nfsi->layout->plh_stateid.data));
@@ -1291,7 +1292,7 @@ pnfs_layoutcommit_inode(struct inode *inode, bool sync)

data->args.inode = inode;
data->lseg = lseg;
- data->cred = cred;
+ data->cred = get_rpccred(nfsi->layout->plh_lc_cred);
nfs_fattr_init(&data->fattr);
data->args.bitmask = NFS_SERVER(inode)->cache_consistency_bitmask;
data->res.fattr = &data->fattr;
diff --git a/fs/nfs/pnfs.h b/fs/nfs/pnfs.h
index 77e1b24..6969594 100644
--- a/fs/nfs/pnfs.h
+++ b/fs/nfs/pnfs.h
@@ -44,7 +44,6 @@ struct pnfs_layout_segment {
atomic_t pls_refcount;
unsigned long pls_flags;
struct pnfs_layout_hdr *pls_layout;
- struct rpc_cred *pls_lc_cred; /* LAYOUTCOMMIT credential */
};

enum pnfs_try_status {
@@ -124,6 +123,7 @@ struct pnfs_layout_hdr {
u32 plh_barrier; /* ignore lower seqids */
unsigned long plh_flags;
loff_t plh_lwb; /* last write byte for layoutcommit */
+ struct rpc_cred *plh_lc_cred; /* layoutcommit cred */
struct inode *plh_inode;
};

--
1.7.6
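
As a rough model of the reference counting here: the header takes one
reference on the opener's credential when it is allocated, each layoutcommit
takes its own reference from the header, and the header drops its reference
when it is freed. The sketch below is userspace C with invented names
(`struct cred`, `alloc_hdr()`, `cred_for_commit()`, `free_hdr()`) and a plain
`int` counter, so it only illustrates the ownership rule, not the kernel API.

```c
#include <stdlib.h>

/* Hypothetical refcounted credential. */
struct cred {
    int refcount;
};

static struct cred *get_cred(struct cred *c)
{
    c->refcount++;
    return c;
}

static void put_cred(struct cred *c)
{
    c->refcount--;
}

/* Hypothetical layout header that owns one credential reference. */
struct layout_hdr {
    struct cred *lc_cred;
};

/* Take the credential once, when the header is created. */
static struct layout_hdr *alloc_hdr(struct cred *opener)
{
    struct layout_hdr *lo = calloc(1, sizeof(*lo));

    if (lo)
        lo->lc_cred = get_cred(opener);
    return lo;
}

/* Each commit gets its own reference; the caller must put_cred() it. */
static struct cred *cred_for_commit(struct layout_hdr *lo)
{
    return get_cred(lo->lc_cred);
}

/* Drop the header's reference when the header goes away. */
static void free_hdr(struct layout_hdr *lo)
{
    put_cred(lo->lc_cred);
    free(lo);
}
```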



2011-07-30 06:33:44

by Boaz Harrosh

Subject: [PATCH 4/6] pnfs: use lwb as layoutcommit length

From: Peng Tao <[email protected]>

Using NFS4_MAX_UINT64 will break the current protocol.

[Needed in v3.0]
CC: Stable Tree <[email protected]>
Signed-off-by: Peng Tao <[email protected]>
---
fs/nfs/nfs4xdr.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/fs/nfs/nfs4xdr.c b/fs/nfs/nfs4xdr.c
index e6e8f3b..fc97fd5 100644
--- a/fs/nfs/nfs4xdr.c
+++ b/fs/nfs/nfs4xdr.c
@@ -1888,7 +1888,7 @@ encode_layoutcommit(struct xdr_stream *xdr,
*p++ = cpu_to_be32(OP_LAYOUTCOMMIT);
/* Only whole file layouts */
p = xdr_encode_hyper(p, 0); /* offset */
- p = xdr_encode_hyper(p, NFS4_MAX_UINT64); /* length */
+ p = xdr_encode_hyper(p, args->lastbytewritten + 1); /* length */
*p++ = cpu_to_be32(0); /* reclaim */
p = xdr_encode_opaque_fixed(p, args->stateid.data, NFS4_STATEID_SIZE);
*p++ = cpu_to_be32(1); /* newoffset = TRUE */
--
1.7.6
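
To make the one-line change concrete: with offset 0, the encoded length
becomes `lastbytewritten + 1` instead of the all-ones 64-bit value. The sketch
below shows that arithmetic together with a big-endian 64-bit ("hyper")
encoder. `encode_hyper()` and `encode_commit_range()` are illustrative helpers
written for this note, not the kernel's XDR routines.

```c
#include <stdint.h>

/* Store a 64-bit value in big-endian order; return the position after it. */
static uint8_t *encode_hyper(uint8_t *p, uint64_t v)
{
    for (int i = 7; i >= 0; i--)
        *p++ = (uint8_t)(v >> (8 * i));
    return p;
}

/* Encode offset 0 followed by a length covering bytes 0..lastbytewritten. */
static uint8_t *encode_commit_range(uint8_t *p, uint64_t lastbytewritten)
{
    p = encode_hyper(p, 0);                    /* offset */
    p = encode_hyper(p, lastbytewritten + 1);  /* length */
    return p;
}
```

For example, if the last byte written is at position 4095, the encoded length
is 4096.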



2011-07-30 06:35:49

by Boaz Harrosh

Subject: [PATCH 6/6] pnfs-obj: Fix the comp_index != 0 case


There were bugs in the partial-layout case, where olo_comps_index
is not zero. This used to work and was tested, but one of the later
cleanup SQUASHMEs broke it and it has not been tested since.

Also add a dprintk that reports the received layout parameters.
Everything else was already printed.

[Needed in v3.0]
CC: Stable Tree <[email protected]>
Signed-off-by: Boaz Harrosh <[email protected]>
---
fs/nfs/objlayout/objio_osd.c | 16 +++++++---------
fs/nfs/objlayout/pnfs_osd_xdr_cli.c | 3 +++
2 files changed, 10 insertions(+), 9 deletions(-)

diff --git a/fs/nfs/objlayout/objio_osd.c b/fs/nfs/objlayout/objio_osd.c
index 9aa9d67..1d1dc1e 100644
--- a/fs/nfs/objlayout/objio_osd.c
+++ b/fs/nfs/objlayout/objio_osd.c
@@ -479,7 +479,6 @@ static int _io_check(struct objio_state *ios, bool is_write)
for (i = 0; i < ios->numdevs; i++) {
struct osd_sense_info osi;
struct osd_request *or = ios->per_dev[i].or;
- unsigned dev;
int ret;

if (!or)
@@ -500,9 +499,8 @@ static int _io_check(struct objio_state *ios, bool is_write)

continue; /* we recovered */
}
- dev = ios->per_dev[i].dev;
- objlayout_io_set_result(&ios->ol_state, dev,
- &ios->layout->comps[dev].oc_object_id,
+ objlayout_io_set_result(&ios->ol_state, i,
+ &ios->layout->comps[i].oc_object_id,
osd_pri_2_pnfs_err(osi.osd_err_pri),
ios->per_dev[i].offset,
ios->per_dev[i].length,
@@ -648,7 +646,7 @@ static int _prepare_one_group(struct objio_state *ios, u64 length,
int ret = 0;

while (length) {
- struct _objio_per_comp *per_dev = &ios->per_dev[dev];
+ struct _objio_per_comp *per_dev = &ios->per_dev[dev - first_dev];
unsigned cur_len, page_off = 0;

if (!per_dev->length) {
@@ -668,8 +666,8 @@ static int _prepare_one_group(struct objio_state *ios, u64 length,
cur_len = stripe_unit;
}

- if (max_comp < dev)
- max_comp = dev;
+ if (max_comp < dev - first_dev)
+ max_comp = dev - first_dev;
} else {
cur_len = stripe_unit;
}
@@ -804,7 +802,7 @@ static int _read_mirrors(struct objio_state *ios, unsigned cur_comp)
struct _objio_per_comp *per_dev = &ios->per_dev[cur_comp];
unsigned dev = per_dev->dev;
struct pnfs_osd_object_cred *cred =
- &ios->layout->comps[dev];
+ &ios->layout->comps[cur_comp];
struct osd_obj_id obj = {
.partition = cred->oc_object_id.oid_partition_id,
.id = cred->oc_object_id.oid_object_id,
@@ -902,7 +900,7 @@ static int _write_mirrors(struct objio_state *ios, unsigned cur_comp)
for (; cur_comp < last_comp; ++cur_comp, ++dev) {
struct osd_request *or = NULL;
struct pnfs_osd_object_cred *cred =
- &ios->layout->comps[dev];
+ &ios->layout->comps[cur_comp];
struct osd_obj_id obj = {
.partition = cred->oc_object_id.oid_partition_id,
.id = cred->oc_object_id.oid_object_id,
diff --git a/fs/nfs/objlayout/pnfs_osd_xdr_cli.c b/fs/nfs/objlayout/pnfs_osd_xdr_cli.c
index 16fc758..b3918f7 100644
--- a/fs/nfs/objlayout/pnfs_osd_xdr_cli.c
+++ b/fs/nfs/objlayout/pnfs_osd_xdr_cli.c
@@ -170,6 +170,9 @@ int pnfs_osd_xdr_decode_layout_map(struct pnfs_osd_layout *layout,
p = _osd_xdr_decode_data_map(p, &layout->olo_map);
layout->olo_comps_index = be32_to_cpup(p++);
layout->olo_num_comps = be32_to_cpup(p++);
+ dprintk("%s: olo_comps_index=%d olo_num_comps=%d\n", __func__,
+ layout->olo_comps_index, layout->olo_num_comps);
+
iter->total_comps = layout->olo_num_comps;
return 0;
}
--
1.7.6
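
The class of bug fixed here is an indexing mix-up: a local array that starts
at slot 0 was indexed with a device number that does not start at 0. The
general pattern, subtracting a base before indexing, can be sketched as below.
This is a generic illustration with made-up names (`struct partial_layout`,
`slot_for_dev()`) and a fixed array size chosen only for the example. It does
not reproduce the driver's actual bookkeeping.

```c
#include <stddef.h>

#define MAX_LOCAL_COMPS 8   /* arbitrary bound for this example */

struct partial_layout {
    unsigned first_dev;               /* device number of local slot 0 */
    unsigned num_comps;               /* local slots in use, <= MAX_LOCAL_COMPS */
    int per_comp[MAX_LOCAL_COMPS];    /* local state, indexed from 0 */
};

/* Map a device number to its local slot, or NULL if it is out of range. */
static int *slot_for_dev(struct partial_layout *l, unsigned dev)
{
    if (dev < l->first_dev || dev - l->first_dev >= l->num_comps)
        return NULL;
    return &l->per_comp[dev - l->first_dev];
}
```

When `first_dev` is 0 the subtraction is a no-op, which is why the problem only
shows up once the base is non-zero.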



2011-07-30 06:34:41

by Boaz Harrosh

Subject: [PATCH 5/6] pnfs-obj: Bug when we are running out of bio


When the number of pages we want to encode is bigger than the
size of the bio (which can currently happen only when all IO
is going to a single device, e.g. group_width==1), the IO is
submitted short and we report back only the amount of bytes
we actually wrote/read, and all is fine. BUT ...

There was a bug where the current length counter was advanced
before the failure to add the extra page, so we ended up in a
situation where the CDB length was one page longer than the
actual bio size, which is of course rejected by the osd-target.

While here, also fix the bio size calculation for the case
where we received more than one group of devices.

[Needed in v3.0]
CC: Stable Tree <[email protected]>
Signed-off-by: Boaz Harrosh <[email protected]>
---
fs/nfs/objlayout/objio_osd.c | 12 +++++-------
1 files changed, 5 insertions(+), 7 deletions(-)

diff --git a/fs/nfs/objlayout/objio_osd.c b/fs/nfs/objlayout/objio_osd.c
index 8ff2ea3..9aa9d67 100644
--- a/fs/nfs/objlayout/objio_osd.c
+++ b/fs/nfs/objlayout/objio_osd.c
@@ -589,22 +589,19 @@ static void _calc_stripe_info(struct objio_state *ios, u64 file_offset,
}

static int _add_stripe_unit(struct objio_state *ios, unsigned *cur_pg,
- unsigned pgbase, struct _objio_per_comp *per_dev, int cur_len,
+ unsigned pgbase, struct _objio_per_comp *per_dev, int len,
gfp_t gfp_flags)
{
unsigned pg = *cur_pg;
+ int cur_len = len;
struct request_queue *q =
osd_request_queue(_io_od(ios, per_dev->dev));

- per_dev->length += cur_len;
-
if (per_dev->bio == NULL) {
- unsigned stripes = ios->layout->num_comps /
- ios->layout->mirrors_p1;
- unsigned pages_in_stripe = stripes *
+ unsigned pages_in_stripe = ios->layout->group_width *
(ios->layout->stripe_unit / PAGE_SIZE);
unsigned bio_size = (ios->ol_state.nr_pages + pages_in_stripe) /
- stripes;
+ ios->layout->group_width;

if (BIO_MAX_PAGES_KMALLOC < bio_size)
bio_size = BIO_MAX_PAGES_KMALLOC;
@@ -632,6 +629,7 @@ static int _add_stripe_unit(struct objio_state *ios, unsigned *cur_pg,
}
BUG_ON(cur_len);

+ per_dev->length += len;
*cur_pg = pg;
return 0;
}
--
1.7.6
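
The ordering fix can be shown with a small userspace model: the byte counter
must only advance once every page of the stripe unit has been queued. In the
sketch below a page counter with a fixed capacity stands in for the bio, and
`struct dev_io`, `add_stripe_unit()` and `PAGE_SZ` are names invented for this
note. Unlike the kernel function, this simplified version queues nothing at
all when it runs out of room.

```c
#define PAGE_SZ 4096u   /* illustrative page size */

struct dev_io {
    unsigned pages_used;    /* pages already queued */
    unsigned capacity;      /* most pages the buffer can hold */
    unsigned long length;   /* bytes the device command will claim */
};

/*
 * Try to queue len bytes, one page at a time.
 * Returns 0 on success and -1 when the buffer is full.
 * The length counter is advanced only after every page has fit.
 */
static int add_stripe_unit(struct dev_io *d, unsigned len)
{
    unsigned remaining = len;
    unsigned used = d->pages_used;

    while (remaining) {
        unsigned chunk = remaining < PAGE_SZ ? remaining : PAGE_SZ;

        if (used == d->capacity)
            return -1;      /* out of room: length is left untouched */
        used++;
        remaining -= chunk;
    }
    d->pages_used = used;
    d->length += len;
    return 0;
}
```

Had `d->length += len` been placed at the top of the function, a failed call
would leave the claimed length larger than what was actually queued, which is
the mismatch described in the commit message.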