LinuxLists.cc - [RFC] Something very wrong with layout_recall of RETURN

2012-05-30 23:52:09

Subject: [RFC] Something very wrong with layout_recall of RETURN_FILE

In patch:
pnfsd: layout recall layout state

cl_has_file_layout() no longer inspects the layout structures added per file;
it checks whether the file has a layout_state instead.

So it is counting layout_states and not layouts.

This is bad because the layout_states are added to the file before the
call to the filesystem, so if the FS does a recall, nfsd is confused into thinking
it already has a layout and issues a recall, instead of returning -ENOENT (i.e. the
list is empty). The client then rightly returns nomatching_layout, and when the
lo_return(s) are emulated the system gets stuck in some reference mismatch.
(UML, so no crash trace.)

Now let's say that the state should be set before the call to the FS. Then I don't
see where the state is removed in the case of an error return from FS->layout_get,
meaning cl_has_file_layout() will always think it has some count.
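
A toy user-space model of this ordering problem, not the nfsd code: the two
counters stand in for fi_layout_states and fi_layouts, and every toy_* name and
the probing callback are made up for illustration. It shows what each kind of
check would see if the filesystem asks about layouts from inside its layout_get
call, and what is left behind when that call fails.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct nfs4_file: counters instead of lists. */
struct toy_file {
	int nr_states;   /* models entries on fi_layout_states */
	int nr_layouts;  /* models entries on fi_layouts */
};

typedef int (*toy_fs_layout_get_t)(struct toy_file *fp);

/* The check as changed by "pnfsd: layout recall layout state". */
static bool toy_has_layout_by_state(const struct toy_file *fp)
{
	return fp->nr_states > 0;
}

/* The historical check: look at the layouts themselves. */
static bool toy_has_layout_by_layouts(const struct toy_file *fp)
{
	return fp->nr_layouts > 0;
}

/* State is published first, then the filesystem is called. */
static int toy_layout_get(struct toy_file *fp, toy_fs_layout_get_t fs_get)
{
	int err;

	assert(fp != NULL && fs_get != NULL);
	fp->nr_states++;
	err = fs_get(fp);
	if (err)
		return err;  /* nothing removes the state on this path */
	fp->nr_layouts++;
	return 0;
}

/* What a recall issued from inside the FS callback would observe. */
static bool toy_seen_by_state;
static bool toy_seen_by_layouts;

static int toy_fs_probe_and_fail(struct toy_file *fp)
{
	toy_seen_by_state = toy_has_layout_by_state(fp);
	toy_seen_by_layouts = toy_has_layout_by_layouts(fp);
	return -EIO;
}
```

Calling toy_layout_get() on an empty toy_file with toy_fs_probe_and_fail leaves
the state-based check reporting a layout both during and after the failed call,
while the layout-based check reports none.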

Also, when a layout is returned it is the layout list that is inspected and freed,
so how is cl_has_file_layout() emptied?

In any case, I do not agree that it is the state that needs to be searched
in cl_has_file_layout(); it is the layouts that are needed. Otherwise the whole,
very delicate layout <---> recall dance is totally broken.

What was the meaning of the Poet?

I reverted cl_has_file_layout() to its historical processing and am debugging.
I will probably now get the state processing wrong.

Also, cl_has_file_layout() returns true for any layout on a file, but we must
also inspect IO_MODE and LSEG for a partial match.
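
A minimal user-space sketch of such a partial-match test, assuming the iomode
values are the NFSv4.1 bit flags (READ = 1, RW = 2, ANY = 3) and that a segment
running past the 64-bit range means "to end of file". struct toy_seg,
toy_seg_last() and toy_recall_matches() are hypothetical names; this is not the
kernel's lo_seg_overlapping().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed to mirror the NFSv4.1 layoutiomode4 values. */
enum { TOY_IOMODE_READ = 1, TOY_IOMODE_RW = 2, TOY_IOMODE_ANY = 3 };

struct toy_seg {
	uint64_t offset;
	uint64_t length;
	unsigned iomode;
};

/* Last byte covered by the segment, saturating if offset + length wraps. */
static uint64_t toy_seg_last(const struct toy_seg *s)
{
	uint64_t end = s->offset + s->length;

	assert(s->length != 0);
	return end > s->offset ? end - 1 : UINT64_MAX;
}

/* Two byte ranges overlap when each one starts before the other ends. */
static bool toy_seg_overlapping(const struct toy_seg *a,
				const struct toy_seg *b)
{
	return toy_seg_last(a) >= b->offset && toy_seg_last(b) >= a->offset;
}

/* A held layout is hit by a recall only if range AND iomode both match. */
static bool toy_recall_matches(const struct toy_seg *recall,
			       const struct toy_seg *held)
{
	return toy_seg_overlapping(recall, held) &&
	       (recall->iomode & held->iomode) != 0;
}
```

With this shape, a recall for RW on a range that only touches a READ layout does
not count as "has layout", and neither does a recall for a range adjacent to,
but not overlapping, the held segment.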

The patch below works for me. State also looks good
(lightly tested; the bug above is fixed; I have not tried multiple clients
sharing same-stripe writes).

Thanks
Boaz

------
git diff --stat -p -M fs/nfsd/nfs4pnfsd.c
fs/nfsd/nfs4pnfsd.c | 29 ++++++++++++++++-------------
1 file changed, 16 insertions(+), 13 deletions(-)
------
diff --git a/fs/nfsd/nfs4pnfsd.c b/fs/nfsd/nfs4pnfsd.c
index f90f3a7..b421437 100644
--- a/fs/nfsd/nfs4pnfsd.c
+++ b/fs/nfsd/nfs4pnfsd.c
@@ -1179,24 +1179,27 @@ out:
}

static bool
-cl_has_file_layout(struct nfs4_client *clp, struct nfs4_file *fp, stateid_t *lsid)
+cl_has_file_layout(struct nfs4_client *clp, struct nfs4_file *fp,
+ stateid_t *lsid, struct nfsd4_pnfs_cb_layout *cbl)
{
- struct nfs4_layout_state *ls;
+ struct nfs4_layout *lo;
+ bool ret = false;

spin_lock(&layout_lock);
- list_for_each_entry (ls, &fp->fi_layout_states, ls_perfile)
- if (same_clid(&ls->ls_stid.sc_stateid.si_opaque.so_clid,
- &clp->cl_clientid)) {
+ list_for_each_entry (lo, &fp->fi_layouts, lo_perfile) {
+ if (same_clid(&lo->lo_client->cl_clientid, &clp->cl_clientid) &&
+ lo_seg_overlapping(&cbl->cbl_seg, &lo->lo_seg) &&
+ (cbl->cbl_seg.iomode & lo->lo_seg.iomode))
goto found;
- }
- spin_unlock(&layout_lock);
- return false;
-
+ }
+ goto unlock;
found:
- update_layout_stateid_locked(ls, lsid);
+ /* I'm going to send a recall on this layout; update its state */
+ update_layout_stateid_locked(lo->lo_state, lsid);
+ ret = true;
+unlock:
spin_unlock(&layout_lock);
-
- return true;
+ return ret;
}

static int
@@ -1228,7 +1231,7 @@ cl_has_layout(struct nfs4_client *clp, struct nfsd4_pnfs_cb_layout *cbl,
{
switch (cbl->cbl_recall_type) {
case RETURN_FILE:
- return cl_has_file_layout(clp, lrfile, lsid);
+ return cl_has_file_layout(clp, lrfile, lsid, cbl);
case RETURN_FSID:
return cl_has_fsid_layout(clp, &cbl->cbl_fsid);
default:

2012-05-31 00:25:39

by Boaz Harrosh

Subject: [PATCH] pnfsd-exofs: Two clients must not write to the same RAID stripe

If anyone is interested, the patch below shows the behavior
I'm striving for.

----
From: Boaz Harrosh <[email protected]>
Subject: [PATCH] pnfsd-exofs: Two clients must not write to the same RAID stripe

If we have file redundancy RAID1/4/5/6 then two clients cannot
write to the same stripe/region.

We take care of this by giving out smaller regions of the file.
Before any layout_get we make sure to recall the exact same
region from any client. If a recall was issued we return
NFS4ERR_RECALLCONFLICT. The client will come back later for
its layout.

Meanwhile the first client can flush data and release the
layout. The next time the segment might be free and the
lo_get will succeed.
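
The decision described above can be sketched in user-space C as a mapping from
the recall's return code to what layout_get does next. enum toy_lg_outcome and
toy_layout_get_after_recall() are hypothetical names; the real code returns
enum nfsstat4 values, and the return-code cases are taken from the diff in this
patch.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical outcome type for illustration only. */
enum toy_lg_outcome {
	TOY_LG_PROCEED,          /* go on and hand out the layout */
	TOY_LG_RECALL_CONFLICT,  /* models NFS4ERR_RECALLCONFLICT */
};

static enum toy_lg_outcome toy_layout_get_after_recall(int recall_rc)
{
	/* Kernel-style return codes: zero or a negative errno. */
	assert(recall_rc <= 0);

	switch (recall_rc) {
	case 0:        /* a recall was issued for the region */
	case -EAGAIN:  /* treated the same way as 0 by the patch */
		return TOY_LG_RECALL_CONFLICT;
	case -ENOENT:  /* no matching layout is out; the region is free */
		return TOY_LG_PROCEED;
	default:
		/* Unexpected error: the patch logs it and carries on. */
		return TOY_LG_PROCEED;
	}
}
```

The important property is that -ENOENT is the only "clean" way to proceed, which
is why the server side must be precise about detecting the no-layout case.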

It is very possible that multiple writers will fight and
some clients will starve forever. But with a smaller
region, and if the clients randomize their wait, it should
statistically be OK.
(We could manage a fairness queue. What about an lo_available
notification?)

On the other hand a very small segment will hurt performance,
so default size is now set to 8 stripes.
TODO: Let segment size be set in sysfs.
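
A worked example of the alignment arithmetic, as a user-space sketch that
follows the shape of _align_io() in the diff of this patch. The toy_* types and
the sample geometry (64 KiB stripe unit, group width 5, one parity device,
group depth 16) are assumptions chosen for illustration, not values from exofs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, trimmed-down stand-ins for ore_layout / nfsd4_layout_seg. */
struct toy_layout {
	uint64_t stripe_unit;
	unsigned group_width;
	unsigned group_depth;
	unsigned parity;
	unsigned mirrors_p1;
};

struct toy_lseg {
	uint64_t offset;
	uint64_t length;
	bool is_read;
};

enum { TOY_SHARED_NUM_STRIPES = 8 };  /* the default named in the text */

/* Returns true when the caller should recall the region first. */
static bool toy_align_io(const struct toy_layout *l, struct toy_lseg *s)
{
	uint64_t stripe_size =
		(uint64_t)(l->group_width - l->parity) * l->stripe_unit;
	uint64_t group_size = stripe_size * l->group_depth;

	assert(stripe_size != 0 && group_size != 0);

	if (!s->is_read && (l->parity || l->mirrors_p1 > 1)) {
		/* RAID write: round down to a stripe, hand out 8 stripes */
		s->offset = s->offset / stripe_size * stripe_size;
		s->length = stripe_size * TOY_SHARED_NUM_STRIPES;
		return true;
	}
	/* read, or no redundancy: round down to a group, hand out a group */
	s->offset = s->offset / group_size * group_size;
	s->length = group_size;
	return false;
}
```

With the sample geometry, stripe_size is 4 * 65536 = 262144 and group_size is
262144 * 16 = 4194304. A write at offset 1000000 is aligned down to 786432 with
length 2097152, and a read at offset 5000000 is aligned down to 4194304 with
length 4194304.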

TODO:
For debugging we always give out small segments. But we should
only start giving out small segments on a shared file. The
first/single writer should get a large seg as before.

Signed-off-by: Boaz Harrosh <[email protected]>
---
fs/exofs/export.c | 48 ++++++++++++++++++++++++++++++++++++++++++------
1 file changed, 42 insertions(+), 6 deletions(-)

diff --git a/fs/exofs/export.c b/fs/exofs/export.c
index c5712f3..a1f112f 100644
--- a/fs/exofs/export.c
+++ b/fs/exofs/export.c
@@ -29,6 +29,9 @@

#include "linux/nfsd/pnfs_osd_xdr_srv.h"

+/* TODO: put in sysfs per sb */
+const static unsigned sb_shared_num_stripes = 8;
+
static int exofs_layout_type(struct super_block *sb)
{
return LAYOUT_OSD2_OBJECTS;
@@ -94,14 +97,27 @@ void ore_layout_2_pnfs_layout(struct pnfs_osd_layout *pl,
}
}

-static void _align_io(struct ore_layout *layout, u64 *offset, u64 *length)
+static bool _align_io(struct ore_layout *layout, struct nfsd4_layout_seg *lseg,
+ bool shared)
{
u64 stripe_size = (layout->group_width - layout->parity) *
layout->stripe_unit;
u64 group_size = stripe_size * layout->group_depth;

- *offset = div64_u64(*offset, group_size) * group_size;
- *length = group_size;
+ /* TODO: Don't ignore shared flag. Single writer can get a full group */
+ if (lseg->iomode != IOMODE_READ &&
+ (layout->parity || (layout->mirrors_p1 > 1))) {
+ /* RAID writes */
+ lseg->offset = div64_u64(lseg->offset, stripe_size) *
+ stripe_size;
+ lseg->length = stripe_size * sb_shared_num_stripes;
+ return true;
+ } else {
+ /* reads or no data redundancy */
+ lseg->offset = div64_u64(lseg->offset, group_size) * group_size;
+ lseg->length = group_size;
+ return false;
+ }
}

static enum nfsstat4 exofs_layout_get(
@@ -116,15 +132,15 @@ static enum nfsstat4 exofs_layout_get(
struct pnfs_osd_layout layout;
__be32 *start;
unsigned i;
- bool in_recall;
+ bool in_recall, need_recall;
enum nfsstat4 nfserr;

EXOFS_DBGMSG("(0x%lx) REQUESTED offset=0x%llx len=0x%llx iomod=0x%x\n",
inode->i_ino, res->lg_seg.offset,
res->lg_seg.length, res->lg_seg.iomode);

- _align_io(&sbi->layout, &res->lg_seg.offset, &res->lg_seg.length);
- res->lg_seg.iomode = IOMODE_RW;
+ need_recall = _align_io(&sbi->layout, &res->lg_seg,
+ test_bit(OBJ_LAYOUT_IS_GIVEN, &oi->i_flags));
res->lg_return_on_close = true;
res->lg_lo_cookie = inode; /* Just for debug prints */

@@ -132,6 +148,26 @@ static enum nfsstat4 exofs_layout_get(
inode->i_ino, res->lg_seg.offset,
res->lg_seg.length, res->lg_seg.iomode);

+ if (need_recall) {
+ int rc = cb_layout_recall(inode, IOMODE_RW, res->lg_seg.offset,
+ res->lg_seg.length, (void *)0x17);
+ switch (rc) {
+ case 0:
+ case -EAGAIN:
+ EXOFS_DBGMSG("(0x%lx) @@@ Sharing of RAID5/1 stripe\n",
+ inode->i_ino);
+ return NFS4ERR_RECALLCONFLICT;
+ default:
+ /* This is fine for now */
+ /* TODO: Fence object off */
+ EXOFS_DBGMSG("(0x%lx) !!!cb_layout_recall => %d\n",
+ inode->i_ino, rc);
+ /*fallthrough*/
+ case -ENOENT:
+ break;
+ }
+ }
+
/* skip opaque size, will be filled-in later */
start = exp_xdr_reserve_qwords(xdr, 1);
if (!start) {
--
1.7.10.2.677.gb6bc67f

2012-06-11 15:10:54

by Benny Halevy

Subject: Re: [RFC] Something very wrong with layout_recall of RETURN_FILE

On 2012-05-31 02:51, Boaz Harrosh wrote:
>
> In patch:
> pnfsd: layout recall layout state
>
> the cl_has_file_layout() is no longer inspecting the layout structures added per file
> but is inspecting if file has layout_state.
>
> So it is counting layout_states and not layouts
>
> This is bad because the addition of the layout_states on the file is done before the
> call to the filesystem so if the FS does a recall, the nfsd is confused thinking
> it already has a layout and issues a recall. Instead of returning -ENOENT, ie list
> is empty. The client then truly returns nomaching_layout and when the lo_return(s) are
> emulated the system gets stuck is some reference miss-match. (UML so no crash trace)

This should be fixed regardless so that exofs is more tolerant to "phantom"
layout returns.

>
> Now lets say that the state should be set before the call to the FS. Then I don't
> see where the state is removed in the case of an ERROR return from FS->layout_get.
> Meaning cl_has_file_layout() will always think it has some count.

This is a bug and there is actually no reason to insert the layout state
before the call to layout_get.

>
> Also When a layout is returned it is the layout list that is inspected and freed,
> so how is the cl_has_file_layout() emptied ?

Not sure I understand your question, but the layout state is unhashed
in destroy_layout_state().

>
> In any way. I do not agree that it is the state that is needed to be searched
> in cl_has_file_layout() but it is layouts that are needed, otherwise the all
> layout <---> recall very delicate dance is totally broken.
>
> What was the meaning of the Poet?

This wasn't the original intent for cl_has_file_layout.

I agree the requirement changed when we added the cookie magic
and the reliance of exofs on the layout recall process to be
precise about detecting the no layout case. So on one hand
the process needs to be more robust and on the other we can
lookup the exact region as you suggest below.

Then, we should be able to get rid of the layout states list altogether.
(practically reverting "pnfsd: layout recall layout state")

Benny

>
> I reverted the cl_has_file_layout() to historical processing and am debugging
> Will probably now get the state processing wrong.
>
> Also cl_has_file_layout() returns true for any layout on a file, but we must
> inspect IO_MODE and LSEG for a partial-match, as well.
>
> The below works for me. State also looks good
> (lightly tested, bug above is fixed, Have not tried multiple clients shared
> same-stripe writes)
>
> Thanks
> Boaz
>
> ------
> git diff --stat -p -M fs/nfsd/nfs4pnfsd.c
> fs/nfsd/nfs4pnfsd.c | 29 ++++++++++++++++-------------
> 1 file changed, 16 insertions(+), 13 deletions(-)
> ------
> diff --git a/fs/nfsd/nfs4pnfsd.c b/fs/nfsd/nfs4pnfsd.c
> index f90f3a7..b421437 100644
> --- a/fs/nfsd/nfs4pnfsd.c
> +++ b/fs/nfsd/nfs4pnfsd.c
> @@ -1179,24 +1179,27 @@ out:
> }
>
> static bool
> -cl_has_file_layout(struct nfs4_client *clp, struct nfs4_file *fp, stateid_t *lsid)
> +cl_has_file_layout(struct nfs4_client *clp, struct nfs4_file *fp,
> + stateid_t *lsid, struct nfsd4_pnfs_cb_layout *cbl)
> {
> - struct nfs4_layout_state *ls;
> + struct nfs4_layout *lo;
> + bool ret = false;
>
> spin_lock(&layout_lock);
> - list_for_each_entry (ls, &fp->fi_layout_states, ls_perfile)
> - if (same_clid(&ls->ls_stid.sc_stateid.si_opaque.so_clid,
> - &clp->cl_clientid)) {
> + list_for_each_entry (lo, &fp->fi_layouts, lo_perfile) {
> + if (same_clid(&lo->lo_client->cl_clientid, &clp->cl_clientid) &&
> + lo_seg_overlapping(&cbl->cbl_seg, &lo->lo_seg) &&
> + (cbl->cbl_seg.iomode & lo->lo_seg.iomode))
> goto found;
> - }
> - spin_unlock(&layout_lock);
> - return false;
> -
> + }
> + goto unlock;
> found:
> - update_layout_stateid_locked(ls, lsid);
> + /* Im going to send a recall on this latout update state */
> + update_layout_stateid_locked(lo->lo_state, lsid);
> + ret = true;
> +unlock:
> spin_unlock(&layout_lock);
> -
> - return true;
> + return ret;
> }
>
> static int
> @@ -1228,7 +1231,7 @@ cl_has_layout(struct nfs4_client *clp, struct nfsd4_pnfs_cb_layout *cbl,
> {
> switch (cbl->cbl_recall_type) {
> case RETURN_FILE:
> - return cl_has_file_layout(clp, lrfile, lsid);
> + return cl_has_file_layout(clp, lrfile, lsid, cbl);
> case RETURN_FSID:
> return cl_has_fsid_layout(clp, &cbl->cbl_fsid);
> default:
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html

2012-06-11 15:35:11

by Boaz Harrosh

Subject: Re: [RFC] Something very wrong with layout_recall of RETURN_FILE

On 06/11/2012 06:10 PM, Benny Halevy wrote:

> This should be fixed regardless so that exofs is more tolerant to "phantom"
> layout returns.
>

It's not a crash in exofs; it's a crash in nfsd due to a reference mismatch.

<>

> This wasn't the original intent for cl_has_file_layout.
>
> I agree the requirement changed when we added the cookie magic
> and the reliance of exofs on the layout recall process to be
> precise about detecting the no layout case. So on one hand
> the process needs to be more robust and on the other we can
> lookup the exact region as you suggest below.
>

I have lots of code changes for this, which work very well, to
my satisfaction. They fix the above and many other problems,
and also include cleanups and feature additions.

I would have sent them if I were not busy with client bugs that
were found, which are more urgent. The code is there and is being
heavily tested as we speak (mainly the client code; the server code
is very good).

It'll take a few more days to send all this in. It needs SPLITMEs
and cleanup. (Tell me if you want RFC-level code beforehand, which
will be harder to review.)

> Then, we should be able to get rid of the layout states list altogether.
> (practically reverting "pnfsd: layout recall layout state")
>

I have not removed this. As you say it's by now dead code. I'll send in
what I have and we can surgically revert that thing as well. It will
all be in SQUASHMEs and we can later re-arrange the patches for this
to naturally fall off the patchlist. (I intend to help a bit with this
work, in the areas these touch)

> Benny
>

Thanks
Boaz

2012-06-11 15:43:38

by Benny Halevy

Subject: Re: [PATCH] pnfsd-exofs: Two clients must not write to the same RAID stripe

On 2012-05-31 03:25, Boaz Harrosh wrote:
> If any one is interested the below is the behavior
> I'm striving for.

This patch looks reasonable to me.
Would you like me to commit it?

Benny

>
> ----
> From: Boaz Harrosh <[email protected]>
> Subject: [PATCH] pnfsd-exofs: Two clients must not write to the same RAID stripe
>
> If we have file redundancy RAID1/4/5/6 then two clients cannot
> write to the same stripe/region.
>
> We take care of this by giving out smaller regions of the file.
> Before any layout_get we make sure to recall the same exact
> region from any client. If a recall was issued we return
> NFS4ERR_RECALLCONFLICT. The client will come again later for
> it's layout.
>
> Meanwhile the fist client can flush data and release the
> layout. The next time the segment might be free and the
> lo_get succeed.
>
> It is very possible that multiple writers will fight and
> some clients will starve forever. But the smaller the
> region, and if the clients randomize a wait, it should
> statistically be OK.
> (We could manage a fairness queue. What about an lo_available
> notification)
>
> On the other hand a very small segment will hurt performance,
> so default size is now set to 8 stripes.
> TODO: Let segment size be set in sysfs.
>
> TODO:
> For debugging we always give out small segments. But we should
> only start giving out small segments on a shared file. The
> first/single writer should get a large seg as before.
>
> Signed-off-by: Boaz Harrosh <[email protected]>
> ---
> fs/exofs/export.c | 48 ++++++++++++++++++++++++++++++++++++++++++------
> 1 file changed, 42 insertions(+), 6 deletions(-)
>
> diff --git a/fs/exofs/export.c b/fs/exofs/export.c
> index c5712f3..a1f112f 100644
> --- a/fs/exofs/export.c
> +++ b/fs/exofs/export.c
> @@ -29,6 +29,9 @@
>
> #include "linux/nfsd/pnfs_osd_xdr_srv.h"
>
> +/* TODO: put in sysfs per sb */
> +const static unsigned sb_shared_num_stripes = 8;
> +
> static int exofs_layout_type(struct super_block *sb)
> {
> return LAYOUT_OSD2_OBJECTS;
> @@ -94,14 +97,27 @@ void ore_layout_2_pnfs_layout(struct pnfs_osd_layout *pl,
> }
> }
>
> -static void _align_io(struct ore_layout *layout, u64 *offset, u64 *length)
> +static bool _align_io(struct ore_layout *layout, struct nfsd4_layout_seg *lseg,
> + bool shared)
> {
> u64 stripe_size = (layout->group_width - layout->parity) *
> layout->stripe_unit;
> u64 group_size = stripe_size * layout->group_depth;
>
> - *offset = div64_u64(*offset, group_size) * group_size;
> - *length = group_size;
> + /* TODO: Don't ignore shared flag. Single writer can get a full group */
> + if (lseg->iomode != IOMODE_READ &&
> + (layout->parity || (layout->mirrors_p1 > 1))) {
> + /* RAID writes */
> + lseg->offset = div64_u64(lseg->offset, stripe_size) *
> + stripe_size;
> + lseg->length = stripe_size * sb_shared_num_stripes;
> + return true;
> + } else {
> + /* reads or no data redundancy */
> + lseg->offset = div64_u64(lseg->offset, group_size) * group_size;
> + lseg->length = group_size;
> + return false;
> + }
> }
>
> static enum nfsstat4 exofs_layout_get(
> @@ -116,15 +132,15 @@ static enum nfsstat4 exofs_layout_get(
> struct pnfs_osd_layout layout;
> __be32 *start;
> unsigned i;
> - bool in_recall;
> + bool in_recall, need_recall;
> enum nfsstat4 nfserr;
>
> EXOFS_DBGMSG("(0x%lx) REQUESTED offset=0x%llx len=0x%llx iomod=0x%x\n",
> inode->i_ino, res->lg_seg.offset,
> res->lg_seg.length, res->lg_seg.iomode);
>
> - _align_io(&sbi->layout, &res->lg_seg.offset, &res->lg_seg.length);
> - res->lg_seg.iomode = IOMODE_RW;
> + need_recall = _align_io(&sbi->layout, &res->lg_seg,
> + test_bit(OBJ_LAYOUT_IS_GIVEN, &oi->i_flags));
> res->lg_return_on_close = true;
> res->lg_lo_cookie = inode; /* Just for debug prints */
>
> @@ -132,6 +148,26 @@ static enum nfsstat4 exofs_layout_get(
> inode->i_ino, res->lg_seg.offset,
> res->lg_seg.length, res->lg_seg.iomode);
>
> + if (need_recall) {
> + int rc = cb_layout_recall(inode, IOMODE_RW, res->lg_seg.offset,
> + res->lg_seg.length, (void *)0x17);
> + switch (rc) {
> + case 0:
> + case -EAGAIN:
> + EXOFS_DBGMSG("(0x%lx) @@@ Sharing of RAID5/1 stripe\n",
> + inode->i_ino);
> + return NFS4ERR_RECALLCONFLICT;
> + default:
> + /* This is fine for now */
> + /* TODO: Fence object off */
> + EXOFS_DBGMSG("(0x%lx) !!!cb_layout_recall => %d\n",
> + inode->i_ino, rc);
> + /*fallthrough*/
> + case -ENOENT:
> + break;
> + }
> + }
> +
> /* skip opaque size, will be filled-in later */
> start = exp_xdr_reserve_qwords(xdr, 1);
> if (!start) {

2012-06-11 15:40:34

by Benny Halevy

Subject: Re: [RFC] Something very wrong with layout_recall of RETURN_FILE

On 2012-06-11 18:34, Boaz Harrosh wrote:
> On 06/11/2012 06:10 PM, Benny Halevy wrote:
>
>> This should be fixed regardless so that exofs is more tolerant to "phantom"
>> layout returns.
>>
>
>
> It's not a crash at exofs, it's a crash at nfsd do to reference miss-match.
>
>
> <>
>
>> This wasn't the original intent for cl_has_file_layout.
>>
>> I agree the requirement changed when we added the cookie magic
>> and the reliance of exofs on the layout recall process to be
>> precise about detecting the no layout case. So on one hand
>> the process needs to be more robust and on the other we can
>> lookup the exact region as you suggest below.
>>
>
>
> I have lots of code changes to this, which works very well, to
> my satisfaction. It fixes the above and many other problems
> and is also a cleanup and fixture additions.
>
> I would have sent it, if I was not busy with clients bugs found
> which are more urgent. The code is there and is being heavily
> tested as we speak. (mainly the client code, the server code is
> very good)
>
> It'll take a few more days to send all this, in. Needs SPLITMEs
> and cleanup. (Tell me if you want RFC level code which will be
> harder for review, before hand)
>
>> Then, we should be able to get rid of the layout states list altogether.
>> (practically reverting "pnfsd: layout recall layout state")
>>
>
>
> I have not removed this. As you say it's by now dead code. I'll send in
> what I have and we can surgically revert that thing as well. It will
> all be in SQUASHMEs and we can later re-arrange the patches for this
> to naturally fall off the patchlist. (I intend to help a bit with this
> work, in the areas these touch)

Great. Feel free to use/test the following patch...

From 537ddc4c2d35c820a5c74b783b139dbc11a36531 Mon Sep 17 00:00:00 2001
From: Benny Halevy <[email protected]>
Date: Mon, 11 Jun 2012 18:29:03 +0300
Subject: [PATCH] SQUASHME: pnfsd: get rid of fi_layout_states list

---
fs/nfsd/nfs4pnfsd.c | 15 ++-------------
fs/nfsd/nfs4state.c | 1 -
fs/nfsd/pnfsd.h | 1 -
fs/nfsd/state.h | 1 -
4 files changed, 2 insertions(+), 16 deletions(-)

diff --git a/fs/nfsd/nfs4pnfsd.c b/fs/nfsd/nfs4pnfsd.c
index 4ee26ff..a8e85e9 100644
--- a/fs/nfsd/nfs4pnfsd.c
+++ b/fs/nfsd/nfs4pnfsd.c
@@ -147,8 +147,7 @@ void pnfs_clear_device_notify(struct nfs4_client *clp)
* Note: must be called under the state lock
*/
static struct nfs4_layout_state *
-alloc_init_layout_state(struct nfs4_client *clp, struct nfs4_file *fp,
- stateid_t *stateid)
+alloc_init_layout_state(struct nfs4_client *clp, stateid_t *stateid)
{
struct nfs4_layout_state *new;

@@ -157,10 +156,6 @@ void pnfs_clear_device_notify(struct nfs4_client *clp)
return new;
kref_init(&new->ls_ref);
nfsd4_init_stid(&new->ls_stid, clp, NFS4_LAYOUT_STID);
- INIT_LIST_HEAD(&new->ls_perfile);
- spin_lock(&layout_lock);
- list_add(&new->ls_perfile, &fp->fi_layout_states);
- spin_unlock(&layout_lock);
new->ls_roc = false;
return new;
}
@@ -178,11 +173,6 @@ void pnfs_clear_device_notify(struct nfs4_client *clp)
container_of(kref, struct nfs4_layout_state, ls_ref);

nfsd4_unhash_stid(&ls->ls_stid);
- if (!list_empty(&ls->ls_perfile)) {
- spin_lock(&layout_lock);
- list_del(&ls->ls_perfile);
- spin_unlock(&layout_lock);
- }
kfree(ls);
}

@@ -233,7 +223,7 @@ void pnfs_clear_device_notify(struct nfs4_client *clp)
goto out;
}

- ls = alloc_init_layout_state(clp, fp, stateid);
+ ls = alloc_init_layout_state(clp, stateid);
if (!ls) {
status = nfserr_jukebox;
goto out;
@@ -344,7 +334,6 @@ static void update_layout_roc(struct nfs4_layout_state *ls, bool roc)
__func__, lp, clp, fp, fp->fi_inode);

kmem_cache_free(pnfs_layout_slab, lp);
- list_del_init(&ls->ls_perfile);
/* release references taken by init_layout */
put_layout_state(ls);
put_nfs4_file(fp);
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index db64d8e..930babd 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -2385,7 +2385,6 @@ static void nfsd4_init_file(struct nfs4_file *fp, struct inode *ino,
memset(fp->fi_access, 0, sizeof(fp->fi_access));
#if defined(CONFIG_PNFSD)
INIT_LIST_HEAD(&fp->fi_layouts);
- INIT_LIST_HEAD(&fp->fi_layout_states);
fp->fi_fsid.major = current_fh->fh_export->ex_fsid;
fp->fi_fsid.minor = 0;
fp->fi_fhlen = current_fh->fh_handle.fh_size;
diff --git a/fs/nfsd/pnfsd.h b/fs/nfsd/pnfsd.h
index df8595f..d2d7795 100644
--- a/fs/nfsd/pnfsd.h
+++ b/fs/nfsd/pnfsd.h
@@ -44,7 +44,6 @@
struct nfs4_layout_state {
struct kref ls_ref;
struct nfs4_stid ls_stid;
- struct list_head ls_perfile;
bool ls_roc;
};

diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
index a23ee0b..94cbcd1 100644
--- a/fs/nfsd/state.h
+++ b/fs/nfsd/state.h
@@ -405,7 +405,6 @@ struct nfs4_file {
bool fi_had_conflict;
#if defined(CONFIG_PNFSD)
struct list_head fi_layouts;
- struct list_head fi_layout_states;
/* used by layoutget / layoutrecall */
struct nfs4_fsid fi_fsid;
u32 fi_fhlen;
--
1.7.6.5

>
>> Benny
>>
>
>
> Thanks
> Boaz
>