2012-05-24 00:16:53

by Boaz Harrosh

[permalink] [raw]
Subject: Does our Kernel PNFSD-Server support recurring layout_get with open_state_id


Benny hi

If I remember/understand correctly, there is a mode in the RFC errata
about the layout forgetful-model and a client sending a layout_get with
an open_state_id after it already had previous state (layouts) on the file.

As I understood it, this is an indication to the server that the client has
"forgotten" all its layouts on a file, and the server can assume their
return.

Is my understanding correct?

If Yes:
Did we implement the internal return of all layouts, if the above
open_state_id is encountered?
I thought we did but I can't find this code.

Currently, I always set ROC so there is no leak. But theoretically
ROC does not have to be set. I'm doing some heavy lifting of
layout_return, and I want to make sure I have not missed a spot.

If I'm correct that it is needed, and it's missing:
My suggestion for now is that we always set ROC, disregarding the FS, so as not to
leak layouts and therefore inode-refs, until such time as we implement it.

Thanks
Boaz


2012-05-28 16:10:02

by Benny Halevy

[permalink] [raw]
Subject: [PATCH 5/5] SQUASHME: pnfsd: lrs_present is false by default

Set to true only on the error-free path.

Signed-off-by: Benny Halevy <[email protected]>
---
fs/nfsd/nfs4pnfsd.c | 1 +
fs/nfsd/nfs4proc.c | 2 +-
2 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/fs/nfsd/nfs4pnfsd.c b/fs/nfsd/nfs4pnfsd.c
index 0a8d5b5..11bccdf 100644
--- a/fs/nfsd/nfs4pnfsd.c
+++ b/fs/nfsd/nfs4pnfsd.c
@@ -892,6 +892,7 @@ struct super_block *
struct nfs4_layout *lp, *nextlp;

dprintk("%s: clp %p fp %p\n", __func__, clp, fp);
+ lrp->lrs_present = 1;
spin_lock(&layout_lock);
list_for_each_entry_safe (lp, nextlp, &fp->fi_layouts, lo_perfile) {
dprintk("%s: lp %p client %p,%p lo_type %x,%x iomode %d,%d\n",
diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
index 6328300..e76111c 100644
--- a/fs/nfsd/nfs4proc.c
+++ b/fs/nfsd/nfs4proc.c
@@ -1299,7 +1299,7 @@ static __be32 nfsd4_do_lookupp(struct svc_rqst *rqstp, struct svc_fh *fh)

/* Set clientid from sessionid */
copy_clientid((clientid_t *)&lrp->args.lr_seg.clientid, cstate->session);
- lrp->lrs_present = (lrp->args.lr_return_type == RETURN_FILE);
+ lrp->lrs_present = 0;
status = nfs4_pnfs_return_layout(sb, current_fh, lrp);
out:
dprintk("pNFS %s: status %d return_type 0x%x lrs_present %d\n",
--
1.7.6.5


2012-05-28 16:36:20

by Boaz Harrosh

[permalink] [raw]
Subject: Re: Does our Kernel PNFSD-Server support recurring layout_get with open_state_id

On 05/28/2012 07:05 PM, Benny Halevy wrote:

> Yup, see patchset in reply.
>


That I wish you wouldn't do. I have a patchset pending that makes
major changes to all this and totally conflicts with your code.

You can carry this for a while but I might revert your core
code in my new patchset.

Sigh, more work for me.

Boaz

2012-05-28 16:53:44

by Boaz Harrosh

[permalink] [raw]
Subject: Re: [PATCH 3/5] pnfsd: add debug printouts to pnfsd_roc

On 05/28/2012 07:09 PM, Benny Halevy wrote:

> Signed-off-by: Benny Halevy <[email protected]>


Benny, I disagree with this patch.

Not specifically with the prints, but with the !found print.

pnfsd_roc is always called, layouts or not. So these
!found prints are bogus.

> ---
> fs/nfsd/nfs4pnfsd.c | 6 ++++++
> 1 files changed, 6 insertions(+), 0 deletions(-)
>
> diff --git a/fs/nfsd/nfs4pnfsd.c b/fs/nfsd/nfs4pnfsd.c
> index cfaac56..0a8d5b5 100644
> --- a/fs/nfsd/nfs4pnfsd.c
> +++ b/fs/nfsd/nfs4pnfsd.c
> @@ -1227,7 +1227,9 @@ int nfs4_pnfs_return_layout(struct super_block *sb, struct svc_fh *current_fh,
> void pnfsd_roc(struct nfs4_client *clp, struct nfs4_file *fp)
> {
> struct nfs4_layout *lo, *nextlp;
> + bool found = false;
>
> + dprintk("%s: fp=%p clp=%p", __func__, fp, clp);
> spin_lock(&layout_lock);
> list_for_each_entry_safe (lo, nextlp, &fp->fi_layouts, lo_perfile) {
> struct nfsd4_pnfs_layoutreturn lr;
> @@ -1245,11 +1247,15 @@ void pnfsd_roc(struct nfs4_client *clp, struct nfs4_file *fp)
> destroy_layout(lo); /* do not access lp after this */
>
> empty = list_empty(&fp->fi_layouts);
> + found = true;
> + dprintk("%s: fp=%p clp=%p: return on close", __func__, fp, clp);
> fs_layout_return(fp->fi_inode->i_sb, fp->fi_inode, &lr,
> LR_FLAG_INTERN,
> empty ? PNFS_LAST_LAYOUT_NO_RECALLS : NULL);


fs_layout_return already prints for each call. So we should only add one
global print so we know we came from ROC.

Please don't fix it. I have this covered in my patches. In fact this
function is totally changed.

> }
> spin_unlock(&layout_lock);
> + if (!found)
> + dprintk("%s: no layout found", __func__);


Again, please no prints here. pnfsd_roc is always called unconditionally
from last close.
(And found should be conditional on SUNRPC_DEBUG if it is left in.)

> }
>
> void pnfs_expire_client(struct nfs4_client *clp)


Thanks
Boaz

2012-05-28 17:55:59

by Benny Halevy

[permalink] [raw]
Subject: Re: Does our Kernel PNFSD-Server support recurring layout_get with open_state_id

On 2012-05-28 19:36, Boaz Harrosh wrote:
> On 05/28/2012 07:05 PM, Benny Halevy wrote:
>
>> Yup, see patchset in reply.
>>
>
>
> That I wish you wouldn't do. I have a patchset pending that makes
> major changes to all this and totally conflicts with your code.

Sorry, I didn't know that.

>
> You can carry this for a while but I might revert your core
> code in my new patchset.

I hope that at least it will all be for the better.

Benny

>
> Sigh, more work for me.
>
> Boaz

2012-05-24 12:41:53

by Benny Halevy

[permalink] [raw]
Subject: Re: Does our Kernel PNFSD-Server support recurring layout_get with open_state_id

On 2012-05-24 03:16, Boaz Harrosh wrote:
>
> Benny hi
>
> If I remember/understand correctly, there is a mode in the RFC errata
> about the layout forgetful-model and a client sending a layout_get with
> an open_state_id after it already had previous state (layouts) on the file.
>
> As I understood it, this is an indication to the server that the client has
> "forgotten" all its layouts on a file, and the server can assume their
> return.
>
> Is my understanding correct?

Yes

>
> If Yes:
> Did we implement the internal return of all layouts, if the above
> open_state_id is encountered?
> I thought we did but I can't find this code.

True. This is not implemented yet.

>
> Currently, I always set ROC so there is no leak. But theoretically
> ROC does not have to be set. I'm doing some heavy lifting of
> layout_return, and I want to make sure I have not missed a spot.
>
> If I'm correct that it is needed, and it's missing:
> My suggestion for now is that we always set ROC, disregarding the FS, so as not to
> leak layouts and therefore inode-refs, until such time as we implement it.

According to the new errata the server will have to simulate layout returns in the ROC
case on last CLOSE if the (forgetful) client did not explicitly return the layout.
This is not implemented either :-(

Benny

>
> Thanks
> Boaz

2012-05-28 16:09:25

by Benny Halevy

[permalink] [raw]
Subject: [PATCH 1/5] pnfsd: make return_on_close state layout-global for the file/client

As per RFC5661 errata #3226
http://www.ietf.org/mail-archive/web/nfsv4/current/msg10965.html
Once the server returns the return_on_close flag set, all the layout
for that client will be implicitly returned on last close.

Reported-by: Boaz Harrosh <[email protected]>
Signed-off-by: Benny Halevy <[email protected]>
---
fs/nfsd/nfs4pnfsd.c | 19 ++++++++++++++-----
fs/nfsd/pnfsd.h | 2 +-
2 files changed, 15 insertions(+), 6 deletions(-)

diff --git a/fs/nfsd/nfs4pnfsd.c b/fs/nfsd/nfs4pnfsd.c
index 9f94cd0..a95e96e 100644
--- a/fs/nfsd/nfs4pnfsd.c
+++ b/fs/nfsd/nfs4pnfsd.c
@@ -161,6 +161,7 @@ void pnfs_clear_device_notify(struct nfs4_client *clp)
spin_lock(&layout_lock);
list_add(&new->ls_perfile, &fp->fi_layout_states);
spin_unlock(&layout_lock);
+ new->ls_roc = false;
return new;
}

@@ -286,6 +287,15 @@ static void update_layout_stateid(struct nfs4_layout_state *ls, stateid_t *sid)
spin_unlock(&layout_lock);
}

+static void update_layout_roc(struct nfs4_layout_state *ls, bool roc)
+{
+ if (roc) {
+ ls->ls_roc = true;
+ dprintk("%s: Marked return_on_close on layoutstate %p\n",
+ __func__, ls);
+ }
+}
+
static void
init_layout(struct nfs4_layout *lp,
struct nfs4_layout_state *ls,
@@ -293,8 +303,7 @@ static void update_layout_stateid(struct nfs4_layout_state *ls, stateid_t *sid)
struct nfs4_client *clp,
struct svc_fh *current_fh,
struct nfsd4_layout_seg *seg,
- stateid_t *stateid,
- bool roc)
+ stateid_t *stateid)
{
dprintk("pNFS %s: lp %p ls %p clp %p fp %p ino %p\n", __func__,
lp, ls, clp, fp, fp->fi_inode);
@@ -305,7 +314,6 @@ static void update_layout_stateid(struct nfs4_layout_state *ls, stateid_t *sid)
memcpy(&lp->lo_seg, seg, sizeof(lp->lo_seg));
get_layout_state(ls); /* put on destroy_layout */
lp->lo_state = ls;
- lp->lo_roc = roc;
update_layout_stateid(ls, stateid);
list_add_tail(&lp->lo_perclnt, &clp->cl_layouts);
list_add_tail(&lp->lo_perfile, &fp->fi_layouts);
@@ -811,6 +819,7 @@ struct super_block *

lgp->lg_seg = res.lg_seg;
lgp->lg_roc = res.lg_return_on_close;
+ update_layout_roc(ls, res.lg_return_on_close);

/* SUCCESS!
* Can the new layout be merged into an existing one?
@@ -820,7 +829,7 @@ struct super_block *
goto out_freelayout;

/* Can't merge, so let's initialize this new layout */
- init_layout(lp, ls, fp, clp, lgp->lg_fhp, &res.lg_seg, &lgp->lg_sid, res.lg_return_on_close);
+ init_layout(lp, ls, fp, clp, lgp->lg_fhp, &res.lg_seg, &lgp->lg_sid);
out_unlock:
if (ls)
put_layout_state(ls);
@@ -1225,7 +1234,7 @@ void pnfsd_roc(struct nfs4_client *clp, struct nfs4_file *fp)
bool empty;

/* Check for a match */
- if (!lo->lo_roc || lo->lo_client != clp)
+ if (!lo->lo_state->ls_roc || lo->lo_client != clp)
continue;

/* Return the layout */
diff --git a/fs/nfsd/pnfsd.h b/fs/nfsd/pnfsd.h
index f0862fb..e960fd3 100644
--- a/fs/nfsd/pnfsd.h
+++ b/fs/nfsd/pnfsd.h
@@ -45,6 +45,7 @@ struct nfs4_layout_state {
struct kref ls_ref;
struct nfs4_stid ls_stid;
struct list_head ls_perfile;
+ bool ls_roc;
};

/* outstanding layout */
@@ -55,7 +56,6 @@ struct nfs4_layout {
struct nfs4_client *lo_client;
struct nfs4_layout_state *lo_state;
struct nfsd4_layout_seg lo_seg;
- bool lo_roc;
};

struct pnfs_inval_state {
--
1.7.6.5


2012-05-28 16:05:18

by Benny Halevy

[permalink] [raw]
Subject: Re: Does our Kernel PNFSD-Server support recurring layout_get with open_state_id

On 2012-05-24 16:49, Boaz Harrosh wrote:
> On 05/24/2012 03:41 PM, Benny Halevy wrote:
>
>> On 2012-05-24 03:16, Boaz Harrosh wrote:
>>>
>>> Benny hi
>>>
>>> If I remember/understand correctly, there is a mode in the RFC errata
>>> about the layout forgetful-model and a client sending a layout_get with
>>> an open_state_id after it already had previous state (layouts) on the file.
>>>
>>> As I understood it, this is an indication to the server that the client has
>>> "forgotten" all its layouts on a file, and the server can assume their
>>> return.
>>>
>>> Is my understanding correct?
>>
>> Yes
>>
>>>
>>> If Yes:
>>> Did we implement the internal return of all layouts, if the above
>>> open_state_id is encountered?
>>> I thought we did but I can't find this code.
>>
>> True. This is not implemented yet.
>>
>>>
>>> Currently, I always set ROC so there is no leak. But theoretically
>>> ROC does not have to be set. I'm doing some heavy lifting of
>>> layout_return, and I want to make sure I have not missed a spot.
>>>
>>> If I'm correct that it is needed, and it's missing:
>>> My suggestion for now is that we always set ROC, disregarding the FS, so as not to
>>> leak layouts and therefore inode-refs, until such time as we implement it.
>>
>> According to the new errata the server will have to simulate layout returns in the ROC
>> case on last CLOSE if the (forgetful) client did not explicitly return the layout.
>> This is not implemented either :-(
>>
>
>
> Yes Benny, it is implemented. I implemented it and you fixed a bug I had. See:
> nfs4pnfsd.c::pnfsd_roc(...)
> and its call site. It is called on last call and all is working well (except some locking
> bugs I'll send a fix for later). Otherwise both exofs and DF would have crapped out for sure.

Duh, you're right of course.

>
> With ROC set, all works well. This is why I say we should set it for now to make sure we do
> not have the above bug.

Yup, see patchset in reply.

Thanks!

Benny

>
> Thanks
> Boaz
>
>> Benny
>>
>>>
>>> Thanks
>>> Boaz

2012-05-28 18:05:26

by Boaz Harrosh

[permalink] [raw]
Subject: Re: [PATCH 4/5] pnfsd-lexp: return_on_close config option

On 05/28/2012 08:58 PM, Benny Halevy wrote:

> On 2012-05-28 19:52, Boaz Harrosh wrote:
>
> so how will you debug the leak case?
> pnfsd-lexp is for testing, and the default is on.
>
> Benny
>


Good point, I agree.

Boaz

2012-05-28 16:39:05

by Boaz Harrosh

[permalink] [raw]
Subject: Re: [PATCH 1/5] pnfsd: make return_on_close state layout-global for the file/client

On 05/28/2012 07:09 PM, Benny Halevy wrote:

> As per RFC5661 errata #3226
> http://www.ietf.org/mail-archive/web/nfsv4/current/msg10965.html
> Once the server returns the return_on_close flag set, all the layout
> for that client will be implicitly returned on last close.
>


This one I'll keep, though it will conflict with my patches.

I will incorporate what's relevant.

> Reported-by: Boaz Harrosh <[email protected]>
> Signed-off-by: Benny Halevy <[email protected]>
> ---
> fs/nfsd/nfs4pnfsd.c | 19 ++++++++++++++-----
> fs/nfsd/pnfsd.h | 2 +-
> 2 files changed, 15 insertions(+), 6 deletions(-)
>
> diff --git a/fs/nfsd/nfs4pnfsd.c b/fs/nfsd/nfs4pnfsd.c
> index 9f94cd0..a95e96e 100644
> --- a/fs/nfsd/nfs4pnfsd.c
> +++ b/fs/nfsd/nfs4pnfsd.c
> @@ -161,6 +161,7 @@ void pnfs_clear_device_notify(struct nfs4_client *clp)
> spin_lock(&layout_lock);
> list_add(&new->ls_perfile, &fp->fi_layout_states);
> spin_unlock(&layout_lock);
> + new->ls_roc = false;
> return new;
> }
>
> @@ -286,6 +287,15 @@ static void update_layout_stateid(struct nfs4_layout_state *ls, stateid_t *sid)
> spin_unlock(&layout_lock);
> }
>
> +static void update_layout_roc(struct nfs4_layout_state *ls, bool roc)
> +{
> + if (roc) {
> + ls->ls_roc = true;
> + dprintk("%s: Marked return_on_close on layoutstate %p\n",
> + __func__, ls);
> + }
> +}
> +
> static void
> init_layout(struct nfs4_layout *lp,
> struct nfs4_layout_state *ls,
> @@ -293,8 +303,7 @@ static void update_layout_stateid(struct nfs4_layout_state *ls, stateid_t *sid)
> struct nfs4_client *clp,
> struct svc_fh *current_fh,
> struct nfsd4_layout_seg *seg,
> - stateid_t *stateid,
> - bool roc)
> + stateid_t *stateid)
> {
> dprintk("pNFS %s: lp %p ls %p clp %p fp %p ino %p\n", __func__,
> lp, ls, clp, fp, fp->fi_inode);
> @@ -305,7 +314,6 @@ static void update_layout_stateid(struct nfs4_layout_state *ls, stateid_t *sid)
> memcpy(&lp->lo_seg, seg, sizeof(lp->lo_seg));
> get_layout_state(ls); /* put on destroy_layout */
> lp->lo_state = ls;
> - lp->lo_roc = roc;
> update_layout_stateid(ls, stateid);
> list_add_tail(&lp->lo_perclnt, &clp->cl_layouts);
> list_add_tail(&lp->lo_perfile, &fp->fi_layouts);
> @@ -811,6 +819,7 @@ struct super_block *
>
> lgp->lg_seg = res.lg_seg;
> lgp->lg_roc = res.lg_return_on_close;
> + update_layout_roc(ls, res.lg_return_on_close);
>
> /* SUCCESS!
> * Can the new layout be merged into an existing one?
> @@ -820,7 +829,7 @@ struct super_block *
> goto out_freelayout;
>
> /* Can't merge, so let's initialize this new layout */
> - init_layout(lp, ls, fp, clp, lgp->lg_fhp, &res.lg_seg, &lgp->lg_sid, res.lg_return_on_close);
> + init_layout(lp, ls, fp, clp, lgp->lg_fhp, &res.lg_seg, &lgp->lg_sid);
> out_unlock:
> if (ls)
> put_layout_state(ls);
> @@ -1225,7 +1234,7 @@ void pnfsd_roc(struct nfs4_client *clp, struct nfs4_file *fp)
> bool empty;
>
> /* Check for a match */
> - if (!lo->lo_roc || lo->lo_client != clp)
> + if (!lo->lo_state->ls_roc || lo->lo_client != clp)
> continue;
>
> /* Return the layout */
> diff --git a/fs/nfsd/pnfsd.h b/fs/nfsd/pnfsd.h
> index f0862fb..e960fd3 100644
> --- a/fs/nfsd/pnfsd.h
> +++ b/fs/nfsd/pnfsd.h
> @@ -45,6 +45,7 @@ struct nfs4_layout_state {
> struct kref ls_ref;
> struct nfs4_stid ls_stid;
> struct list_head ls_perfile;
> + bool ls_roc;
> };
>
> /* outstanding layout */
> @@ -55,7 +56,6 @@ struct nfs4_layout {
> struct nfs4_client *lo_client;
> struct nfs4_layout_state *lo_state;
> struct nfsd4_layout_seg lo_seg;
> - bool lo_roc;
> };
>
> struct pnfs_inval_state {



2012-05-28 18:29:10

by Benny Halevy

[permalink] [raw]
Subject: Re: Does our Kernel PNFSD-Server support recurring layout_get with open_state_id

On 2012-05-28 21:09, Boaz Harrosh wrote:
> On 05/28/2012 08:55 PM, Benny Halevy wrote:
>
>> On 2012-05-28 19:36, Boaz Harrosh wrote:
>>> On 05/28/2012 07:05 PM, Benny Halevy wrote:
>>>
>>>> Yup, see patchset in reply.
>>>>
>>>
>>>
>>> That I wish you wouldn't do. I have a patchset pending that makes
>>> major changes to all this and totally conflicts with your code.
>>
>> Sorry, I didn't know that.
>>
>>>
>>> You can carry this for a while but I might revert your core
>>> code in my new patchset.
>>
>> I hope that at least it will all be for the better.
>>
>> Benny
>>
>
>
> NP. Yes it is all for the better. I think you'll like what I did.
>
> BTW, after I finish implementing my stuff, I think it would be easy
> to implement that open_state_id thing. All we need is to save the
> open_state_id somewhere per-file, or somehow identify its receipt,
> and then just call nfs4_roc(), which will do the proper work.

It's any non layout stateid so it should be pretty easy (in theory :).
See nfs4_process_layout_stateid()
if (stid->sc_type != NFS4_LAYOUT_STID) {}

We should be able to locate the layout stateid, if any, on the
fp->fi_layout_states list by matching stid->sc_client.
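
For instance, an untested sketch of that lookup, reusing the fields, lock and
helper already shown in the patches in this thread (the function name
find_layout_state_by_client is made up for illustration), could look like:

static struct nfs4_layout_state *
find_layout_state_by_client(struct nfs4_file *fp, struct nfs4_client *clp)
{
	struct nfs4_layout_state *ls;

	spin_lock(&layout_lock);
	list_for_each_entry(ls, &fp->fi_layout_states, ls_perfile) {
		if (ls->ls_stid.sc_client == clp) {
			/* take a reference; caller does put_layout_state() */
			get_layout_state(ls);
			spin_unlock(&layout_lock);
			return ls;
		}
	}
	spin_unlock(&layout_lock);
	return NULL;
}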

Benny

>
> But please wait for me to finish.
>
> (It is all done and debugged, but will be split into patches
> on Wednesday.)

Sure. Thanks!

Benny

>
> Thanks
> Boaz

2012-05-24 13:50:31

by Boaz Harrosh

[permalink] [raw]
Subject: Re: Does our Kernel PNFSD-Server support recurring layout_get with open_state_id

On 05/24/2012 04:49 PM, Boaz Harrosh wrote:

> and it's call site. It is called on last call and all is working well (except some locking


I meant last close, not last call.

Boaz

2012-05-28 18:10:03

by Boaz Harrosh

[permalink] [raw]
Subject: Re: Does our Kernel PNFSD-Server support recurring layout_get with open_state_id

On 05/28/2012 08:55 PM, Benny Halevy wrote:

> On 2012-05-28 19:36, Boaz Harrosh wrote:
>> On 05/28/2012 07:05 PM, Benny Halevy wrote:
>>
>>> Yup, see patchset in reply.
>>>
>>
>>
>> That I wish you wouldn't do. I have a patchset pending that makes
>> major changes to all this and totally conflicts with your code.
>
> Sorry, I didn't know that.
>
>>
>> You can carry this for a while but I might revert your core
>> code in my new patchset.
>
> I hope that at least it will all be for the better.
>
> Benny
>


NP. Yes it is all for the better. I think you'll like what I did.

BTW, after I finish implementing my stuff, I think it would be easy
to implement that open_state_id thing. All we need is to save the
open_state_id somewhere per-file, or somehow identify its receipt,
and then just call nfs4_roc(), which will do the proper work.
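
Only as a rough illustration (the wrapper name below is invented; pnfsd_roc()
and NFS4_LAYOUT_STID are the names used elsewhere in this thread), the stateid
handling on the LAYOUTGET path could end up doing something like:

/* Sketch: a non-layout (e.g. open) stateid presented with LAYOUTGET while
 * the client still holds layouts on the file means the client has forgotten
 * them, so return them internally, just as pnfsd_roc() does on last close.
 */
static void pnfsd_layoutget_saw_non_layout_stateid(struct nfs4_stid *stid,
						   struct nfs4_client *clp,
						   struct nfs4_file *fp)
{
	if (stid->sc_type != NFS4_LAYOUT_STID)
		pnfsd_roc(clp, fp);
}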

But please wait for me to finish.

(It is all done and debugged, but will be split into patches
on Wednesday.)

Thanks
Boaz

2012-05-28 18:01:20

by Benny Halevy

[permalink] [raw]
Subject: Re: [PATCH 4/5] pnfsd-lexp: return_on_close config option

On 2012-05-28 20:58, Benny Halevy wrote:
> On 2012-05-28 19:52, Boaz Harrosh wrote:
>> On 05/28/2012 07:09 PM, Benny Halevy wrote:
>>
>>> Signed-off-by: Benny Halevy <[email protected]>
>>> ---
>>> fs/nfsd/Kconfig | 7 +++++++
>>> fs/nfsd/pnfsd_lexp.c | 5 +++++
>>> 2 files changed, 12 insertions(+), 0 deletions(-)
>>>
>>> diff --git a/fs/nfsd/Kconfig b/fs/nfsd/Kconfig
>>> index 9d4d79b..f9b3426 100644
>>> --- a/fs/nfsd/Kconfig
>>> +++ b/fs/nfsd/Kconfig
>>> @@ -132,3 +132,10 @@ config PNFSD_LEXP_LAYOUT_SEGMENT_SIZE
>>> Set simulated layout segment size.
>>>
>>> If unsure, say N.
>>> +
>>> +config PNFSD_LEXP_RETURN_ON_CLOSE
>>> + bool "Reply to LAYOUTGET with return_on_close set to true"
>>> + depends on PNFSD_LOCAL_EXPORT
>>> + default false
>>> + help
>>> + Set return_on_close response flag.
>>> diff --git a/fs/nfsd/pnfsd_lexp.c b/fs/nfsd/pnfsd_lexp.c
>>> index 30d9757..33a724c 100644
>>> --- a/fs/nfsd/pnfsd_lexp.c
>>> +++ b/fs/nfsd/pnfsd_lexp.c
>>> @@ -153,6 +153,11 @@ static int get_stripe_unit(int blocksize)
>>> res->lg_seg.offset = 0;
>>> res->lg_seg.length = NFS4_MAX_UINT64;
>>> #endif /* CONFIG_PNFSD_LEXP_LAYOUT_SEGMENTS */
>>> +#ifdef CONFIG_PNFSD_LEXP_RETURN_ON_CLOSE
>>> + res->lg_return_on_close = true;
>>> +#else /* CONFIG_PNFSD_LEXP_RETURN_ON_CLOSE */
>>> + res->lg_return_on_close = false;
>>
>>
>> Back to my original question. With the current code this case is a BUG.
>> It will leak layouts, and therefore file references, in the case of a layout-get
>> with an open-state ID.
>>
>> So I suggest hardcoding it to true until such time as we can actually
>> support it.
>
> so how will you debug the leak case?
> pnfsd-lexp is for testing, and the default is on.

oops, the default is actually off in this patch.
I do need to code what I wish :)

Benny

>
> Benny
>
>>
>> Boaz
>>
>>> +#endif /* CONFIG_PNFSD_LEXP_RETURN_ON_CLOSE */
>>>
>>> layout = kzalloc(sizeof(*layout), GFP_KERNEL);
>>> if (layout == NULL) {
>>
>>

2012-05-28 16:09:34

by Benny Halevy

[permalink] [raw]
Subject: [PATCH 2/5] SQUASHME: pnfsd: use LR_FLAG_INTERN for pnfsd_roc implicit layout return

The client is not expired in this case; rather, we internally simulate
a layout return based on return_on_close.

Signed-off-by: Benny Halevy <[email protected]>
---
fs/nfsd/nfs4pnfsd.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/fs/nfsd/nfs4pnfsd.c b/fs/nfsd/nfs4pnfsd.c
index a95e96e..cfaac56 100644
--- a/fs/nfsd/nfs4pnfsd.c
+++ b/fs/nfsd/nfs4pnfsd.c
@@ -1246,7 +1246,7 @@ void pnfsd_roc(struct nfs4_client *clp, struct nfs4_file *fp)

empty = list_empty(&fp->fi_layouts);
fs_layout_return(fp->fi_inode->i_sb, fp->fi_inode, &lr,
- LR_FLAG_EXPIRE,
+ LR_FLAG_INTERN,
empty ? PNFS_LAST_LAYOUT_NO_RECALLS : NULL);
}
spin_unlock(&layout_lock);
--
1.7.6.5


2012-05-28 17:59:11

by Benny Halevy

[permalink] [raw]
Subject: Re: [PATCH 3/5] pnfsd: add debug printouts to pnfsd_roc

On 2012-05-28 19:53, Boaz Harrosh wrote:
> On 05/28/2012 07:09 PM, Benny Halevy wrote:
>
>> Signed-off-by: Benny Halevy <[email protected]>
>
>
> Benny, I disagree with this patch.
>
> Not specifically with the prints, but with the !found print.
>
> pnfsd_roc is always called, layouts or not. So these
> !found prints are bogus.
>
>> ---
>> fs/nfsd/nfs4pnfsd.c | 6 ++++++
>> 1 files changed, 6 insertions(+), 0 deletions(-)
>>
>> diff --git a/fs/nfsd/nfs4pnfsd.c b/fs/nfsd/nfs4pnfsd.c
>> index cfaac56..0a8d5b5 100644
>> --- a/fs/nfsd/nfs4pnfsd.c
>> +++ b/fs/nfsd/nfs4pnfsd.c
>> @@ -1227,7 +1227,9 @@ int nfs4_pnfs_return_layout(struct super_block *sb, struct svc_fh *current_fh,
>> void pnfsd_roc(struct nfs4_client *clp, struct nfs4_file *fp)
>> {
>> struct nfs4_layout *lo, *nextlp;
>> + bool found = false;
>>
>> + dprintk("%s: fp=%p clp=%p", __func__, fp, clp);
>> spin_lock(&layout_lock);
>> list_for_each_entry_safe (lo, nextlp, &fp->fi_layouts, lo_perfile) {
>> struct nfsd4_pnfs_layoutreturn lr;
>> @@ -1245,11 +1247,15 @@ void pnfsd_roc(struct nfs4_client *clp, struct nfs4_file *fp)
>> destroy_layout(lo); /* do not access lp after this */
>>
>> empty = list_empty(&fp->fi_layouts);
>> + found = true;
>> + dprintk("%s: fp=%p clp=%p: return on close", __func__, fp, clp);
>> fs_layout_return(fp->fi_inode->i_sb, fp->fi_inode, &lr,
>> LR_FLAG_INTERN,
>> empty ? PNFS_LAST_LAYOUT_NO_RECALLS : NULL);
>
>
> fs_layout_return already prints for each call. So we should only add one
> global print so we know we came from ROC.
>
> Please don't fix it. I have this covered in my patches. In fact this
> function is totally changed.
>

OK.

>> }
>> spin_unlock(&layout_lock);
>> + if (!found)
>> + dprintk("%s: no layout found", __func__);
>
>
> Again, please no prints here. pnfsd_roc is always called unconditionally
> from last close.
> (And found should be conditional on SUNRPC_DEBUG if it is left in.)
>
>> }
>>
>> void pnfs_expire_client(struct nfs4_client *clp)
>
>
> Thanks
> Boaz

2012-05-29 07:14:14

by Boaz Harrosh

[permalink] [raw]
Subject: Re: Does our Kernel PNFSD-Server support recurring layout_get with open_state_id

On 05/28/2012 09:29 PM, Benny Halevy wrote:

> On 2012-05-28 21:09, Boaz Harrosh wrote:


>> BTW, after I finish implementing my stuff, I think it would be easy
>> to implement that open_state_id thing. All we need is to save the
>> open_state_id somewhere per-file, or somehow identify its receipt,
>> and then just call nfs4_roc(), which will do the proper work.
>
> It's any non layout stateid so it should be pretty easy (in theory :).
> See nfs4_process_layout_stateid()
> if (stid->sc_type != NFS4_LAYOUT_STID) {}
>
> We should be able to locate the layout stateid, if any, on the
> fp->fi_layout_states list by matching stid->sc_client.
>


Does look easy, then. I thought it should be. I will have a shot
at it and add it to my patchset.

Do you know if we have a test case for this in pyNFS?

I need to think about whether we can cause this with the Linux client.
Perhaps: not set ROC, access a file and close, clear caches
at both ends, re-access, something like that. I'll try.

Thanks
Boaz


2012-05-28 17:58:21

by Benny Halevy

[permalink] [raw]
Subject: Re: [PATCH 4/5] pnfsd-lexp: return_on_close config option

On 2012-05-28 19:52, Boaz Harrosh wrote:
> On 05/28/2012 07:09 PM, Benny Halevy wrote:
>
>> Signed-off-by: Benny Halevy <[email protected]>
>> ---
>> fs/nfsd/Kconfig | 7 +++++++
>> fs/nfsd/pnfsd_lexp.c | 5 +++++
>> 2 files changed, 12 insertions(+), 0 deletions(-)
>>
>> diff --git a/fs/nfsd/Kconfig b/fs/nfsd/Kconfig
>> index 9d4d79b..f9b3426 100644
>> --- a/fs/nfsd/Kconfig
>> +++ b/fs/nfsd/Kconfig
>> @@ -132,3 +132,10 @@ config PNFSD_LEXP_LAYOUT_SEGMENT_SIZE
>> Set simulated layout segment size.
>>
>> If unsure, say N.
>> +
>> +config PNFSD_LEXP_RETURN_ON_CLOSE
>> + bool "Reply to LAYOUTGET with return_on_close set to true"
>> + depends on PNFSD_LOCAL_EXPORT
>> + default false
>> + help
>> + Set return_on_close response flag.
>> diff --git a/fs/nfsd/pnfsd_lexp.c b/fs/nfsd/pnfsd_lexp.c
>> index 30d9757..33a724c 100644
>> --- a/fs/nfsd/pnfsd_lexp.c
>> +++ b/fs/nfsd/pnfsd_lexp.c
>> @@ -153,6 +153,11 @@ static int get_stripe_unit(int blocksize)
>> res->lg_seg.offset = 0;
>> res->lg_seg.length = NFS4_MAX_UINT64;
>> #endif /* CONFIG_PNFSD_LEXP_LAYOUT_SEGMENTS */
>> +#ifdef CONFIG_PNFSD_LEXP_RETURN_ON_CLOSE
>> + res->lg_return_on_close = true;
>> +#else /* CONFIG_PNFSD_LEXP_RETURN_ON_CLOSE */
>> + res->lg_return_on_close = false;
>
>
> Back to my original question. With the current code this case is a BUG.
> It will leak layouts, and therefore file references, in the case of a layout-get
> with an open-state ID.
>
> So I suggest hardcoding it to true until such time as we can actually
> support it.

so how will you debug the leak case?
pnfsd-lexp is for testing, and the default is on.

Benny

>
> Boaz
>
>> +#endif /* CONFIG_PNFSD_LEXP_RETURN_ON_CLOSE */
>>
>> layout = kzalloc(sizeof(*layout), GFP_KERNEL);
>> if (layout == NULL) {
>
>

2012-05-28 16:52:56

by Boaz Harrosh

[permalink] [raw]
Subject: Re: [PATCH 4/5] pnfsd-lexp: return_on_close config option

On 05/28/2012 07:09 PM, Benny Halevy wrote:

> Signed-off-by: Benny Halevy <[email protected]>
> ---
> fs/nfsd/Kconfig | 7 +++++++
> fs/nfsd/pnfsd_lexp.c | 5 +++++
> 2 files changed, 12 insertions(+), 0 deletions(-)
>
> diff --git a/fs/nfsd/Kconfig b/fs/nfsd/Kconfig
> index 9d4d79b..f9b3426 100644
> --- a/fs/nfsd/Kconfig
> +++ b/fs/nfsd/Kconfig
> @@ -132,3 +132,10 @@ config PNFSD_LEXP_LAYOUT_SEGMENT_SIZE
> Set simulated layout segment size.
>
> If unsure, say N.
> +
> +config PNFSD_LEXP_RETURN_ON_CLOSE
> + bool "Reply to LAYOUTGET with return_on_close set to true"
> + depends on PNFSD_LOCAL_EXPORT
> + default false
> + help
> + Set return_on_close response flag.
> diff --git a/fs/nfsd/pnfsd_lexp.c b/fs/nfsd/pnfsd_lexp.c
> index 30d9757..33a724c 100644
> --- a/fs/nfsd/pnfsd_lexp.c
> +++ b/fs/nfsd/pnfsd_lexp.c
> @@ -153,6 +153,11 @@ static int get_stripe_unit(int blocksize)
> res->lg_seg.offset = 0;
> res->lg_seg.length = NFS4_MAX_UINT64;
> #endif /* CONFIG_PNFSD_LEXP_LAYOUT_SEGMENTS */
> +#ifdef CONFIG_PNFSD_LEXP_RETURN_ON_CLOSE
> + res->lg_return_on_close = true;
> +#else /* CONFIG_PNFSD_LEXP_RETURN_ON_CLOSE */
> + res->lg_return_on_close = false;


Back to my original question. With the current code this case is a BUG.
It will leak layouts, and therefore file references, in the case of a layout-get
with an open-state ID.

So I suggest hardcoding it to true until such time as we can actually
support it.

Boaz

> +#endif /* CONFIG_PNFSD_LEXP_RETURN_ON_CLOSE */
>
> layout = kzalloc(sizeof(*layout), GFP_KERNEL);
> if (layout == NULL) {



2012-05-28 16:09:44

by Benny Halevy

[permalink] [raw]
Subject: [PATCH 3/5] pnfsd: add debug printouts to pnfsd_roc

Signed-off-by: Benny Halevy <[email protected]>
---
fs/nfsd/nfs4pnfsd.c | 6 ++++++
1 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/fs/nfsd/nfs4pnfsd.c b/fs/nfsd/nfs4pnfsd.c
index cfaac56..0a8d5b5 100644
--- a/fs/nfsd/nfs4pnfsd.c
+++ b/fs/nfsd/nfs4pnfsd.c
@@ -1227,7 +1227,9 @@ int nfs4_pnfs_return_layout(struct super_block *sb, struct svc_fh *current_fh,
void pnfsd_roc(struct nfs4_client *clp, struct nfs4_file *fp)
{
struct nfs4_layout *lo, *nextlp;
+ bool found = false;

+ dprintk("%s: fp=%p clp=%p", __func__, fp, clp);
spin_lock(&layout_lock);
list_for_each_entry_safe (lo, nextlp, &fp->fi_layouts, lo_perfile) {
struct nfsd4_pnfs_layoutreturn lr;
@@ -1245,11 +1247,15 @@ void pnfsd_roc(struct nfs4_client *clp, struct nfs4_file *fp)
destroy_layout(lo); /* do not access lp after this */

empty = list_empty(&fp->fi_layouts);
+ found = true;
+ dprintk("%s: fp=%p clp=%p: return on close", __func__, fp, clp);
fs_layout_return(fp->fi_inode->i_sb, fp->fi_inode, &lr,
LR_FLAG_INTERN,
empty ? PNFS_LAST_LAYOUT_NO_RECALLS : NULL);
}
spin_unlock(&layout_lock);
+ if (!found)
+ dprintk("%s: no layout found", __func__);
}

void pnfs_expire_client(struct nfs4_client *clp)
--
1.7.6.5


2012-05-28 18:08:32

by Benny Halevy

[permalink] [raw]
Subject: Re: [PATCH 4/5] pnfsd-lexp: return_on_close config option

On 2012-05-28 21:01, Benny Halevy wrote:
> On 2012-05-28 20:58, Benny Halevy wrote:
>> On 2012-05-28 19:52, Boaz Harrosh wrote:
>>> On 05/28/2012 07:09 PM, Benny Halevy wrote:
>>>
>>>> Signed-off-by: Benny Halevy <[email protected]>
>>>> ---
>>>> fs/nfsd/Kconfig | 7 +++++++
>>>> fs/nfsd/pnfsd_lexp.c | 5 +++++
>>>> 2 files changed, 12 insertions(+), 0 deletions(-)
>>>>
>>>> diff --git a/fs/nfsd/Kconfig b/fs/nfsd/Kconfig
>>>> index 9d4d79b..f9b3426 100644
>>>> --- a/fs/nfsd/Kconfig
>>>> +++ b/fs/nfsd/Kconfig
>>>> @@ -132,3 +132,10 @@ config PNFSD_LEXP_LAYOUT_SEGMENT_SIZE
>>>> Set simulated layout segment size.
>>>>
>>>> If unsure, say N.
>>>> +
>>>> +config PNFSD_LEXP_RETURN_ON_CLOSE
>>>> + bool "Reply to LAYOUTGET with return_on_close set to true"
>>>> + depends on PNFSD_LOCAL_EXPORT
>>>> + default false
>>>> + help
>>>> + Set return_on_close response flag.
>>>> diff --git a/fs/nfsd/pnfsd_lexp.c b/fs/nfsd/pnfsd_lexp.c
>>>> index 30d9757..33a724c 100644
>>>> --- a/fs/nfsd/pnfsd_lexp.c
>>>> +++ b/fs/nfsd/pnfsd_lexp.c
>>>> @@ -153,6 +153,11 @@ static int get_stripe_unit(int blocksize)
>>>> res->lg_seg.offset = 0;
>>>> res->lg_seg.length = NFS4_MAX_UINT64;
>>>> #endif /* CONFIG_PNFSD_LEXP_LAYOUT_SEGMENTS */
>>>> +#ifdef CONFIG_PNFSD_LEXP_RETURN_ON_CLOSE
>>>> + res->lg_return_on_close = true;
>>>> +#else /* CONFIG_PNFSD_LEXP_RETURN_ON_CLOSE */
>>>> + res->lg_return_on_close = false;
>>>
>>>
>>> Back to my original question. With the current code this case is a BUG.
>>> It will leak layouts, and therefore file references, in the case of a layout-get
>>> with an open-state ID.
>>>
>>> So I suggest hardcoding it to true until such time as we can actually
>>> support it.
>>
>> so how will you debug the leak case?
>> pnfsd-lexp is for testing, and the default is on.
>
> oops, the default is actually off in this patch.
> I do need to code what I wish :)

Fixed.
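
Presumably the fix just flips the default, along the lines of the hunk below
(a guess; the updated patch wasn't reposted in this thread). Note that Kconfig
bool defaults are spelled y/n, so the original "default false" effectively
meant off:

config PNFSD_LEXP_RETURN_ON_CLOSE
	bool "Reply to LAYOUTGET with return_on_close set to true"
	depends on PNFSD_LOCAL_EXPORT
	default y
	help
	  Set return_on_close response flag.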

>
> Benny
>
>>
>> Benny
>>
>>>
>>> Boaz
>>>
>>>> +#endif /* CONFIG_PNFSD_LEXP_RETURN_ON_CLOSE */
>>>>
>>>> layout = kzalloc(sizeof(*layout), GFP_KERNEL);
>>>> if (layout == NULL) {
>>>
>>>

2012-05-24 13:49:15

by Boaz Harrosh

[permalink] [raw]
Subject: Re: Does our Kernel PNFSD-Server support recurring layout_get with open_state_id

On 05/24/2012 03:41 PM, Benny Halevy wrote:

> On 2012-05-24 03:16, Boaz Harrosh wrote:
>>
>> Benny hi
>>
>> If I remember/understand correctly, there is a mode in the RFC errata
>> about the layout forgetful-model and a client sending a layout_get with
>> an open_state_id after it already had previous state (layouts) on the file.
>>
>> As I understood it, this is an indication to the server that the client has
>> "forgotten" all its layouts on a file, and the server can assume their
>> return.
>>
>> Is my understanding correct?
>
> Yes
>
>>
>> If Yes:
>> Did we implement the internal return of all layouts, if the above
>> open_state_id is encountered?
>> I thought we did but I can't find this code.
>
> True. This is not implemented yet.
>
>>
>> Currently, I always set ROC so there is no leak. But theoretically
>> ROC does not have to be set. I'm doing some heavy lifting of
>> layout_return, and I want to make sure I have not missed a spot.
>>
>> If I'm correct that it is needed, and it's missing:
>> My suggestion for now is that we always set ROC, disregarding the FS, so as not to
>> leak layouts and therefore inode-refs, until such time as we implement it.
>
> According to the new errata the server will have to simulate layout returns in the ROC
> case on last CLOSE if the (forgetful) client did not explicitly return the layout.
> This is not implemented either :-(
>


Yes Benny, it is implemented. I implemented it and you fixed a bug I had. See:
nfs4pnfsd.c::pnfsd_roc(...)
and its call site. It is called on last call and all is working well (except some locking
bugs I'll send a fix for later). Otherwise both exofs and DF would have crapped out for sure.

With ROC set, all works well. This is why I say we should set it for now to make sure we do
not have the above bug.

Thanks
Boaz

> Benny
>
>>
>> Thanks
>> Boaz



2012-05-31 06:35:11

by Benny Halevy

[permalink] [raw]
Subject: Re: Does our Kernel PNFSD-Server support recurring layout_get with open_state_id

On 2012-05-29 10:13, Boaz Harrosh wrote:
> On 05/28/2012 09:29 PM, Benny Halevy wrote:
>
>> On 2012-05-28 21:09, Boaz Harrosh wrote:
>
>
>>> BTW, after I finish implementing my stuff, I think it would be easy
>>> to implement that open_state_id thing. All we need is to save the
>>> open_state_id somewhere per-file, or somehow identify its receipt,
>>> and then just call nfs4_roc(), which will do the proper work.
>>
>> It's any non layout stateid so it should be pretty easy (in theory :).
>> See nfs4_process_layout_stateid()
>> if (stid->sc_type != NFS4_LAYOUT_STID) {}
>>
>> We should be able to locate the layout stateid, if any, on the
>> fp->fi_layout_states list by matching stid->sc_client.
>>
>
>
> Does look easy, then. I thought it should be. I will have a shot
> at it and add it to my patchset.

Thanks!

>
> Do you know if we have a test case for this in pyNFS?

No, not that I know of.

>
> I need to think about whether we can cause this with the Linux client.
> Perhaps: not set ROC, access a file and close, clear caches
> at both ends, re-access, something like that. I'll try.

Cool. Thanks a bunch for picking this up!

Benny

>
> Thanks
> Boaz
>

2012-05-28 16:09:52

by Benny Halevy

[permalink] [raw]
Subject: [PATCH 4/5] pnfsd-lexp: return_on_close config option

Signed-off-by: Benny Halevy <[email protected]>
---
fs/nfsd/Kconfig | 7 +++++++
fs/nfsd/pnfsd_lexp.c | 5 +++++
2 files changed, 12 insertions(+), 0 deletions(-)

diff --git a/fs/nfsd/Kconfig b/fs/nfsd/Kconfig
index 9d4d79b..f9b3426 100644
--- a/fs/nfsd/Kconfig
+++ b/fs/nfsd/Kconfig
@@ -132,3 +132,10 @@ config PNFSD_LEXP_LAYOUT_SEGMENT_SIZE
Set simulated layout segment size.

If unsure, say N.
+
+config PNFSD_LEXP_RETURN_ON_CLOSE
+ bool "Reply to LAYOUTGET with return_on_close set to true"
+ depends on PNFSD_LOCAL_EXPORT
+ default false
+ help
+ Set return_on_close response flag.
diff --git a/fs/nfsd/pnfsd_lexp.c b/fs/nfsd/pnfsd_lexp.c
index 30d9757..33a724c 100644
--- a/fs/nfsd/pnfsd_lexp.c
+++ b/fs/nfsd/pnfsd_lexp.c
@@ -153,6 +153,11 @@ static int get_stripe_unit(int blocksize)
res->lg_seg.offset = 0;
res->lg_seg.length = NFS4_MAX_UINT64;
#endif /* CONFIG_PNFSD_LEXP_LAYOUT_SEGMENTS */
+#ifdef CONFIG_PNFSD_LEXP_RETURN_ON_CLOSE
+ res->lg_return_on_close = true;
+#else /* CONFIG_PNFSD_LEXP_RETURN_ON_CLOSE */
+ res->lg_return_on_close = false;
+#endif /* CONFIG_PNFSD_LEXP_RETURN_ON_CLOSE */

layout = kzalloc(sizeof(*layout), GFP_KERNEL);
if (layout == NULL) {
--
1.7.6.5