2010-04-22 14:32:24

by J.Bruce Fields

[permalink] [raw]
Subject: [PATCH 1/2] nfsd4: complete enforcement of 4.1 op ordering

Enforce the rules about compound op ordering.

Motivated by implementing RECLAIM_COMPLETE, for which the client is
implicit in the current session, so it is important to ensure a
succesful SEQUENCE proceeds the RECLAIM_COMPLETE.

Signed-off-by: J. Bruce Fields <[email protected]>
---
fs/nfsd/nfs4proc.c | 44 ++++++++++++++++++++++++++++++--------------
fs/nfsd/nfs4state.c | 13 +++++++++++++
2 files changed, 43 insertions(+), 14 deletions(-)

diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
index 37514c4..e147dbc 100644
--- a/fs/nfsd/nfs4proc.c
+++ b/fs/nfsd/nfs4proc.c
@@ -968,20 +968,36 @@ static struct nfsd4_operation nfsd4_ops[];
static const char *nfsd4_op_name(unsigned opnum);

/*
- * Enforce NFSv4.1 COMPOUND ordering rules.
+ * Enforce NFSv4.1 COMPOUND ordering rules:
*
- * TODO:
- * - enforce NFS4ERR_NOT_ONLY_OP,
- * - DESTROY_SESSION MUST be the final operation in the COMPOUND request.
+ * Also note, enforced elsewhere:
+ * - SEQUENCE other than as first op results in
+ * NFS4ERR_SEQUENCE_POS. (Enforced in nfsd4_sequence().)
+ * - BIND_CONN_TO_SESSION must be the only op in its compound
+ * (Will be enforced in nfsd4_bind_conn_to_session().)
+ * - DESTROY_SESSION must be the final operation in a compound, if
+ * sessionid's in SEQUENCE and DESTROY_SESSION are the same.
+ * (Enforced in nfsd4_destroy_session().)
*/
-static bool nfs41_op_ordering_ok(struct nfsd4_compoundargs *args)
+static __be32 nfs41_check_op_ordering(struct nfsd4_compoundargs *args)
{
- if (args->minorversion && args->opcnt > 0) {
- struct nfsd4_op *op = &args->ops[0];
- return (op->status == nfserr_op_illegal) ||
- (nfsd4_ops[op->opnum].op_flags & ALLOWED_AS_FIRST_OP);
- }
- return true;
+ struct nfsd4_op *op = &args->ops[0];
+
+ /* These ordering requirements don't apply to NFSv4.0: */
+ if (args->minorversion == 0)
+ return nfs_ok;
+ /* This is weird, but OK, not our problem: */
+ if (args->opcnt == 0)
+ return nfs_ok;
+ if (op->status == nfserr_op_illegal)
+ return nfs_ok;
+ if (!(nfsd4_ops[op->opnum].op_flags & ALLOWED_AS_FIRST_OP))
+ return nfserr_op_not_in_session;
+ if (op->opnum == OP_SEQUENCE)
+ return nfs_ok;
+ if (args->opcnt != 1)
+ return nfserr_not_only_op;
+ return nfs_ok;
}

/*
@@ -1023,13 +1039,13 @@ nfsd4_proc_compound(struct svc_rqst *rqstp,
if (args->minorversion > nfsd_supported_minorversion)
goto out;

- if (!nfs41_op_ordering_ok(args)) {
+ status = nfs41_check_op_ordering(args);
+ if (status) {
op = &args->ops[0];
- op->status = nfserr_sequence_pos;
+ op->status = status;
goto encode_op;
}

- status = nfs_ok;
while (!status && resp->opcnt < args->opcnt) {
op = &args->ops[resp->opcnt++];

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 5051ade..e444829 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -1365,6 +1365,14 @@ out:
return status;
}

+static bool nfsd4_last_compound_op(struct svc_rqst *rqstp)
+{
+ struct nfsd4_compoundres *resp = rqstp->rq_resp;
+ struct nfsd4_compoundargs *argp = rqstp->rq_argp;
+
+ return argp->opcnt == resp->opcnt;
+}
+
__be32
nfsd4_destroy_session(struct svc_rqst *r,
struct nfsd4_compound_state *cstate,
@@ -1380,6 +1388,11 @@ nfsd4_destroy_session(struct svc_rqst *r,
* - Do we need to clear any callback info from previous session?
*/

+ if (!memcmp(&sessionid->sessionid, &cstate->session->se_sessionid,
+ sizeof(struct nfs4_sessionid))) {
+ if (!nfsd4_last_compound_op(r))
+ return nfserr_not_only_op;
+ }
dump_sessionid(__func__, &sessionid->sessionid);
spin_lock(&sessionid_lock);
ses = find_in_sessionid_hashtbl(&sessionid->sessionid);
--
1.6.3.3



2010-04-27 16:36:30

by J.Bruce Fields

[permalink] [raw]
Subject: Re: [PATCH 1/2] nfsd4: complete enforcement of 4.1 op ordering

On Tue, Apr 27, 2010 at 06:44:51PM +0300, Benny Halevy wrote:
> On Apr. 27, 2010, 18:01 +0300, "J. Bruce Fields" <[email protected]> wrote:
> > I think what we actually want is a mechanism with semantics like yours,
> > but with multiple RENEW values so we can track how many users there are
> > and have only the last one do the renew.
> >
> > One possibility:
> >
> > - add a cl_users field, with:
> > cl_user>0 meaning: somebody's using this client right
> > now, don't expire it. (Your RENEW state.)
> > cl_user<0 meaning: this client is already being expired,
> > don't try to use it (or any sessionid or other
> > state associated with it). (Your EXPIRED state.)
> > cl_user==0: fair game to either use or (if expiry time
> > has passed) to expire. (Your NORMAL state.)
> >
> > - add a cl_renewme boolean, meaning: last user of this client
> > (user to decrement to 0) should renew the client (reset the
> > expiry time and move it to the back of the lru).
> >
> > So:
> >
> > - in laundromat:
> > - atomically check whether cl_users is 0, and, if so,
> > decrement to -1. (If positive, skip expiring this
> > client.)
> >
> > - on looking up a sessionid (or, eventually, any state object
> > associated with a client), call get_client(), which:
> > - atomically checks whether cl_users is >=0, and, if so,
> > increments it. If <0, fail to find the object and
> > return an appropriate error (STALE?).
> > - on renew:
> > - BUG_ON(cl_user<=0)
> > - set cl_renewme
> > - on dropping reference to a sessionid (or, eventually, any
> > state object associated with a client), call put_client(),
> > which:
> > - atomically decrements cl_users, checks whether it hits
> > zero, and (if so, and if cl_renewme set), renews the
> > client.
> >
> > One possible implementation: make cl_users atomic, add a per-client
> > spinlock, make the put_client() do an atomic_dec_and_lock(), etc.
> >
> > --b.
>
> Sounds good. I'll take a stab at it right away.

Great, thanks; let me know what I can do.

> (Funny that my original implementation uses a counter but IIRC
> I decided at the time it was too complicated but I agree it's much better)

You might also look at get_write_access/deny_write_access and friends as
an implementation of something similar.

--b.

2010-04-27 16:44:54

by Benny Halevy

[permalink] [raw]
Subject: Re: [PATCH 1/2] nfsd4: complete enforcement of 4.1 op ordering

On Apr. 27, 2010, 19:34 +0300, "J. Bruce Fields" <[email protected]> wrote:
> On Tue, Apr 27, 2010 at 07:22:55PM +0300, Benny Halevy wrote:
>> On Apr. 27, 2010, 19:12 +0300, "J. Bruce Fields" <[email protected]> wrote:
>>> On Tue, Apr 27, 2010 at 06:46:43PM +0300, Benny Halevy wrote:
>>>> On Apr. 27, 2010, 18:01 +0300, "J. Bruce Fields" <[email protected]> wrote:
>>>>> On Tue, Apr 27, 2010 at 05:40:53PM +0300, Benny Halevy wrote:
>>>>>> Bruce, Oddly enough I didn't receive the patch you're commenting on into
>>>>>> my inbox. It already happened before on this list and I've no idea what
>>>>>> could have went wrong. (I also have a gmail account subscribed on this list
>>>>>> and I can't find it there, even in the spam folder :-/)
>>>>>
>>>>> Huh. I see it in various archives, so I'll assume the problem's on your
>>>>> end.
>>>>>
>>>>
>>>> I see it on the archives too, but since I don't ever erase anything out
>>>> of my gmail lists account the fact that it's not there AND not
>>>> on my panasas account as well is really suspicious. I wonder if
>>>> vger had some sort of outage or glitch that could explain that.
>>>
>>> For what it's worth I've found postmaster at vger.kernel.org can
>>> sometimes help troubleshoot problems.
>>>
>>> --b.
>>
>> Oddly enough, it seems like I'm subscribed this list (called "linux-nfsv4"):
>> http://marc.info/?l=linux-nfsv4&r=1&b=201004&w=2
>> rather than
>> http://marc.info/?l=linux-nfs&r=1&b=201004&w=2
>>
>> Although my subscription messages from [email protected]
>> say "linux-nfs"
>>
>> The former is essentially [email protected], right?
>> Maybe I got bounced off of linux-nfs.org for some reason...

I've re-subscribed to linux-nfs. I hope I stay there now.
I've no idea why I was bounced off.
I'm not really sure if I should have got an email telling me I was bounced.

BTW, the "which <my-email>" command doesn't seem to work properly on
[email protected] as it returns "No matches found" even though
I'm subscribed now.

>
> The main lists we've been using are:
>
> [email protected]
> [email protected]
> [email protected]
>
> This seems to be confusing. We've talked before about transitioning
> away from one (probably [email protected]). Maybe now's the time to
> start.

I'm all for it.
Maybe it's time to fold the pnfs list into linux-nfs too.

>
> (Oh, and there's also an older sourceforge list; I think mail still gets
> forwarded from the vger list to there? Or is it the other way around?
> May be time to turn that one off completely.)

Never heard of it 'till now :)

Benny

>
> --b.


2010-04-27 18:10:08

by J.Bruce Fields

[permalink] [raw]
Subject: Re: [PATCH 1/2] nfsd4: complete enforcement of 4.1 op ordering

On Tue, Apr 27, 2010 at 07:44:51PM +0300, Benny Halevy wrote:
> On Apr. 27, 2010, 19:34 +0300, "J. Bruce Fields" <[email protected]> wrote:
> > The main lists we've been using are:
> >
> > [email protected]
> > [email protected]
> > [email protected]
> >
> > This seems to be confusing. We've talked before about transitioning
> > away from one (probably [email protected]). Maybe now's the time to
> > start.
>
> I'm all for it.
> Maybe it's time to fold the pnfs list into linux-nfs too.

So, looking back at what we did for the sourceforge list:

- We added something to the footer saying the list is
deprecated. (But if we wanted to be more obnoxious, we could
add it to the top of each message.)

- We subscribed the new list ([email protected]) to the
old list, I guess so newbie bug reports sent to the old list
wouldn't get lost.

Anyone want to object to starting that process for [email protected]?
What about [email protected]?

--b.

2010-04-22 14:32:24

by J.Bruce Fields

[permalink] [raw]
Subject: [PATCH 2/2] nfsd4: implement reclaim_complete

This is a mandatory operation. Also, here (not in open) is where we
should be committing the reboot recovery information.

Signed-off-by: J. Bruce Fields <[email protected]>
---
Documentation/filesystems/nfs/nfs41-server.txt | 2 +-
fs/nfsd/nfs4proc.c | 5 +++++
fs/nfsd/nfs4state.c | 22 +++++++++++++++++++---
fs/nfsd/nfs4xdr.c | 12 +++++++++++-
fs/nfsd/xdr4.h | 6 ++++++
5 files changed, 42 insertions(+), 5 deletions(-)

diff --git a/Documentation/filesystems/nfs/nfs41-server.txt b/Documentation/filesystems/nfs/nfs41-server.txt
index 6a53a84..0488491 100644
--- a/Documentation/filesystems/nfs/nfs41-server.txt
+++ b/Documentation/filesystems/nfs/nfs41-server.txt
@@ -137,7 +137,7 @@ NS*| OPENATTR | OPT | | Section 18.17 |
| READ | REQ | | Section 18.22 |
| READDIR | REQ | | Section 18.23 |
| READLINK | OPT | | Section 18.24 |
-NS | RECLAIM_COMPLETE | REQ | | Section 18.51 |
+ | RECLAIM_COMPLETE | REQ | | Section 18.51 |
| RELEASE_LOCKOWNER | MNI | | N/A |
| REMOVE | REQ | | Section 18.25 |
| RENAME | REQ | | Section 18.26 |
diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
index e147dbc..fd01b33 100644
--- a/fs/nfsd/nfs4proc.c
+++ b/fs/nfsd/nfs4proc.c
@@ -1310,6 +1310,11 @@ static struct nfsd4_operation nfsd4_ops[] = {
.op_flags = ALLOWED_WITHOUT_FH | ALLOWED_AS_FIRST_OP,
.op_name = "OP_SEQUENCE",
},
+ [OP_RECLAIM_COMPLETE] = {
+ .op_func = (nfsd4op_func)nfsd4_reclaim_complete,
+ .op_flags = ALLOWED_WITHOUT_FH,
+ .op_name = "OP_RECLAIM_COMPLETE",
+ },
};

static const char *nfsd4_op_name(unsigned opnum)
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index e444829..b9d9c04 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -1482,6 +1482,24 @@ out:
}

__be32
+nfsd4_reclaim_complete(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate, struct nfsd4_reclaim_complete *rc)
+{
+ if (rc->rca_one_fs) {
+ if (!cstate->current_fh.fh_dentry)
+ return nfserr_nofilehandle;
+ /*
+ * We don't take advantage of the rca_one_fs case.
+ * That's OK, it's optional, we can safely ignore it.
+ */
+ return nfs_ok;
+ }
+ nfs4_lock_state();
+ nfsd4_create_clid_dir(cstate->session->se_client);
+ nfs4_unlock_state();
+ return nfs_ok;
+}
+
+__be32
nfsd4_setclientid(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
struct nfsd4_setclientid *setclid)
{
@@ -2492,10 +2510,8 @@ nfsd4_process_open2(struct svc_rqst *rqstp, struct svc_fh *current_fh, struct nf
}
memcpy(&open->op_stateid, &stp->st_stateid, sizeof(stateid_t));

- if (nfsd4_has_session(&resp->cstate)) {
+ if (nfsd4_has_session(&resp->cstate))
open->op_stateowner->so_confirmed = 1;
- nfsd4_create_clid_dir(open->op_stateowner->so_client);
- }

/*
* Attempt to hand out a delegation. No error return, because the
diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
index fb27b1d..bc4b86d 100644
--- a/fs/nfsd/nfs4xdr.c
+++ b/fs/nfsd/nfs4xdr.c
@@ -1233,6 +1233,16 @@ nfsd4_decode_sequence(struct nfsd4_compoundargs *argp,
DECODE_TAIL;
}

+static __be32 nfsd4_decode_reclaim_complete(struct nfsd4_compoundargs *argp, struct nfsd4_reclaim_complete *rc)
+{
+ DECODE_HEAD;
+
+ READ_BUF(4);
+ READ32(rc->rca_one_fs);
+
+ DECODE_TAIL;
+}
+
static __be32
nfsd4_decode_noop(struct nfsd4_compoundargs *argp, void *p)
{
@@ -1345,7 +1355,7 @@ static nfsd4_dec nfsd41_dec_ops[] = {
[OP_TEST_STATEID] = (nfsd4_dec)nfsd4_decode_notsupp,
[OP_WANT_DELEGATION] = (nfsd4_dec)nfsd4_decode_notsupp,
[OP_DESTROY_CLIENTID] = (nfsd4_dec)nfsd4_decode_notsupp,
- [OP_RECLAIM_COMPLETE] = (nfsd4_dec)nfsd4_decode_notsupp,
+ [OP_RECLAIM_COMPLETE] = (nfsd4_dec)nfsd4_decode_reclaim_complete,
};

struct nfsd4_minorversion_ops {
diff --git a/fs/nfsd/xdr4.h b/fs/nfsd/xdr4.h
index c28958e..4d476ff 100644
--- a/fs/nfsd/xdr4.h
+++ b/fs/nfsd/xdr4.h
@@ -381,6 +381,10 @@ struct nfsd4_destroy_session {
struct nfs4_sessionid sessionid;
};

+struct nfsd4_reclaim_complete {
+ u32 rca_one_fs;
+};
+
struct nfsd4_op {
int opnum;
__be32 status;
@@ -421,6 +425,7 @@ struct nfsd4_op {
struct nfsd4_create_session create_session;
struct nfsd4_destroy_session destroy_session;
struct nfsd4_sequence sequence;
+ struct nfsd4_reclaim_complete reclaim_complete;
} u;
struct nfs4_replay * replay;
};
@@ -523,6 +528,7 @@ extern __be32 nfsd4_sequence(struct svc_rqst *,
extern __be32 nfsd4_destroy_session(struct svc_rqst *,
struct nfsd4_compound_state *,
struct nfsd4_destroy_session *);
+__be32 nfsd4_reclaim_complete(struct svc_rqst *, struct nfsd4_compound_state *, struct nfsd4_reclaim_complete *);
extern __be32 nfsd4_process_open1(struct nfsd4_compound_state *,
struct nfsd4_open *open);
extern __be32 nfsd4_process_open2(struct svc_rqst *rqstp,
--
1.6.3.3


2010-04-22 14:48:32

by J.Bruce Fields

[permalink] [raw]
Subject: Re: [PATCH 1/2] nfsd4: complete enforcement of 4.1 op ordering

On Thu, Apr 22, 2010 at 10:32:23AM -0400, J. Bruce Fields wrote:
> Enforce the rules about compound op ordering.
>
> Motivated by implementing RECLAIM_COMPLETE, for which the client is
> implicit in the current session, so it is important to ensure a
> succesful SEQUENCE proceeds the RECLAIM_COMPLETE.

The other problem here is that while we have a reference count on the
session itself preventing it from going away till the compound is done,
I don't see what prevents the associated clientid from going away.

To fix that, and to be more polite to 4.0 clients, I think we want to
also add a client pointer to the compound_state structure, keep count of
the number of compounds in progress which reference that client, and not
start the client's expiry timer until we've encoded the reply to the
compound.

One question there is whether it's really correct to assume that a
single compound can reference only one client. (I don't think rfc 3530
explicitly forbids a single compound referring to multiple clients. rfc
5661 explicitly allows it in the case of DESTROY_CLIENTID, though that's
a special case.)

--b.

>
> Signed-off-by: J. Bruce Fields <[email protected]>
> ---
> fs/nfsd/nfs4proc.c | 44 ++++++++++++++++++++++++++++++--------------
> fs/nfsd/nfs4state.c | 13 +++++++++++++
> 2 files changed, 43 insertions(+), 14 deletions(-)
>
> diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
> index 37514c4..e147dbc 100644
> --- a/fs/nfsd/nfs4proc.c
> +++ b/fs/nfsd/nfs4proc.c
> @@ -968,20 +968,36 @@ static struct nfsd4_operation nfsd4_ops[];
> static const char *nfsd4_op_name(unsigned opnum);
>
> /*
> - * Enforce NFSv4.1 COMPOUND ordering rules.
> + * Enforce NFSv4.1 COMPOUND ordering rules:
> *
> - * TODO:
> - * - enforce NFS4ERR_NOT_ONLY_OP,
> - * - DESTROY_SESSION MUST be the final operation in the COMPOUND request.
> + * Also note, enforced elsewhere:
> + * - SEQUENCE other than as first op results in
> + * NFS4ERR_SEQUENCE_POS. (Enforced in nfsd4_sequence().)
> + * - BIND_CONN_TO_SESSION must be the only op in its compound
> + * (Will be enforced in nfsd4_bind_conn_to_session().)
> + * - DESTROY_SESSION must be the final operation in a compound, if
> + * sessionid's in SEQUENCE and DESTROY_SESSION are the same.
> + * (Enforced in nfsd4_destroy_session().)
> */
> -static bool nfs41_op_ordering_ok(struct nfsd4_compoundargs *args)
> +static __be32 nfs41_check_op_ordering(struct nfsd4_compoundargs *args)
> {
> - if (args->minorversion && args->opcnt > 0) {
> - struct nfsd4_op *op = &args->ops[0];
> - return (op->status == nfserr_op_illegal) ||
> - (nfsd4_ops[op->opnum].op_flags & ALLOWED_AS_FIRST_OP);
> - }
> - return true;
> + struct nfsd4_op *op = &args->ops[0];
> +
> + /* These ordering requirements don't apply to NFSv4.0: */
> + if (args->minorversion == 0)
> + return nfs_ok;
> + /* This is weird, but OK, not our problem: */
> + if (args->opcnt == 0)
> + return nfs_ok;
> + if (op->status == nfserr_op_illegal)
> + return nfs_ok;
> + if (!(nfsd4_ops[op->opnum].op_flags & ALLOWED_AS_FIRST_OP))
> + return nfserr_op_not_in_session;
> + if (op->opnum == OP_SEQUENCE)
> + return nfs_ok;
> + if (args->opcnt != 1)
> + return nfserr_not_only_op;
> + return nfs_ok;
> }
>
> /*
> @@ -1023,13 +1039,13 @@ nfsd4_proc_compound(struct svc_rqst *rqstp,
> if (args->minorversion > nfsd_supported_minorversion)
> goto out;
>
> - if (!nfs41_op_ordering_ok(args)) {
> + status = nfs41_check_op_ordering(args);
> + if (status) {
> op = &args->ops[0];
> - op->status = nfserr_sequence_pos;
> + op->status = status;
> goto encode_op;
> }
>
> - status = nfs_ok;
> while (!status && resp->opcnt < args->opcnt) {
> op = &args->ops[resp->opcnt++];
>
> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> index 5051ade..e444829 100644
> --- a/fs/nfsd/nfs4state.c
> +++ b/fs/nfsd/nfs4state.c
> @@ -1365,6 +1365,14 @@ out:
> return status;
> }
>
> +static bool nfsd4_last_compound_op(struct svc_rqst *rqstp)
> +{
> + struct nfsd4_compoundres *resp = rqstp->rq_resp;
> + struct nfsd4_compoundargs *argp = rqstp->rq_argp;
> +
> + return argp->opcnt == resp->opcnt;
> +}
> +
> __be32
> nfsd4_destroy_session(struct svc_rqst *r,
> struct nfsd4_compound_state *cstate,
> @@ -1380,6 +1388,11 @@ nfsd4_destroy_session(struct svc_rqst *r,
> * - Do we need to clear any callback info from previous session?
> */
>
> + if (!memcmp(&sessionid->sessionid, &cstate->session->se_sessionid,
> + sizeof(struct nfs4_sessionid))) {
> + if (!nfsd4_last_compound_op(r))
> + return nfserr_not_only_op;
> + }
> dump_sessionid(__func__, &sessionid->sessionid);
> spin_lock(&sessionid_lock);
> ses = find_in_sessionid_hashtbl(&sessionid->sessionid);
> --
> 1.6.3.3
>

2010-04-22 15:03:14

by J.Bruce Fields

[permalink] [raw]
Subject: Re: [PATCH 1/2] nfsd4: complete enforcement of 4.1 op ordering

On Thu, Apr 22, 2010 at 10:48:31AM -0400, J. Bruce Fields wrote:
> On Thu, Apr 22, 2010 at 10:32:23AM -0400, J. Bruce Fields wrote:
> > Enforce the rules about compound op ordering.
> >
> > Motivated by implementing RECLAIM_COMPLETE, for which the client is
> > implicit in the current session, so it is important to ensure a
> > succesful SEQUENCE proceeds the RECLAIM_COMPLETE.
>
> The other problem here is that while we have a reference count on the
> session itself preventing it from going away till the compound is done,
> I don't see what prevents the associated clientid from going away.

That session reference counting also doesn't make sense to me: we should
never allow a client to disappear while a session is in use.

> > +static bool nfsd4_last_compound_op(struct svc_rqst *rqstp)
> > +{
> > + struct nfsd4_compoundres *resp = rqstp->rq_resp;
> > + struct nfsd4_compoundargs *argp = rqstp->rq_argp;
> > +
> > + return argp->opcnt == resp->opcnt;
> > +}
> > +
> > __be32
> > nfsd4_destroy_session(struct svc_rqst *r,
> > struct nfsd4_compound_state *cstate,
> > @@ -1380,6 +1388,11 @@ nfsd4_destroy_session(struct svc_rqst *r,
> > * - Do we need to clear any callback info from previous session?
> > */
> >

Also to do in destroy_session, from
http://tools.ietf.org/search/rfc5661#section-13.9 "DESTROY_SESSION MUST
be invoked on a connection that is associated with the session being
destroyed." We should probably be enforcing that here. I'm not sure
what error to return.

--b.

> > + if (!memcmp(&sessionid->sessionid, &cstate->session->se_sessionid,
> > + sizeof(struct nfs4_sessionid))) {
> > + if (!nfsd4_last_compound_op(r))
> > + return nfserr_not_only_op;
> > + }
> > dump_sessionid(__func__, &sessionid->sessionid);
> > spin_lock(&sessionid_lock);
> > ses = find_in_sessionid_hashtbl(&sessionid->sessionid);
> > --
> > 1.6.3.3
> >

2010-04-23 21:24:12

by J.Bruce Fields

[permalink] [raw]
Subject: Re: [PATCH 1/2] nfsd4: complete enforcement of 4.1 op ordering

On Thu, Apr 22, 2010 at 10:48:31AM -0400, J. Bruce Fields wrote:
> On Thu, Apr 22, 2010 at 10:32:23AM -0400, J. Bruce Fields wrote:
> > Enforce the rules about compound op ordering.
> >
> > Motivated by implementing RECLAIM_COMPLETE, for which the client is
> > implicit in the current session, so it is important to ensure a
> > succesful SEQUENCE proceeds the RECLAIM_COMPLETE.
>
> The other problem here is that while we have a reference count on the
> session itself preventing it from going away till the compound is done,
> I don't see what prevents the associated clientid from going away.
>
> To fix that, and to be more polite to 4.0 clients, I think we want to
> also add a client pointer to the compound_state structure, keep count of
> the number of compounds in progress which reference that client, and not
> start the client's expiry timer until we've encoded the reply to the
> compound.

Benny--I coded up a simple (possibly incorrect) implementation of this,
and then remembered that this was more or less what your
state-lock-reduction-prep patch series did. Do you have a more recent
version of those patches?

--b.

>
> One question there is whether it's really correct to assume that a
> single compound can reference only one client. (I don't think rfc 3530
> explicitly forbids a single compound referring to multiple clients. rfc
> 5661 explicitly allows it in the case of DESTROY_CLIENTID, though that's
> a special case.)
>
> --b.
>
> >
> > Signed-off-by: J. Bruce Fields <[email protected]>
> > ---
> > fs/nfsd/nfs4proc.c | 44 ++++++++++++++++++++++++++++++--------------
> > fs/nfsd/nfs4state.c | 13 +++++++++++++
> > 2 files changed, 43 insertions(+), 14 deletions(-)
> >
> > diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
> > index 37514c4..e147dbc 100644
> > --- a/fs/nfsd/nfs4proc.c
> > +++ b/fs/nfsd/nfs4proc.c
> > @@ -968,20 +968,36 @@ static struct nfsd4_operation nfsd4_ops[];
> > static const char *nfsd4_op_name(unsigned opnum);
> >
> > /*
> > - * Enforce NFSv4.1 COMPOUND ordering rules.
> > + * Enforce NFSv4.1 COMPOUND ordering rules:
> > *
> > - * TODO:
> > - * - enforce NFS4ERR_NOT_ONLY_OP,
> > - * - DESTROY_SESSION MUST be the final operation in the COMPOUND request.
> > + * Also note, enforced elsewhere:
> > + * - SEQUENCE other than as first op results in
> > + * NFS4ERR_SEQUENCE_POS. (Enforced in nfsd4_sequence().)
> > + * - BIND_CONN_TO_SESSION must be the only op in its compound
> > + * (Will be enforced in nfsd4_bind_conn_to_session().)
> > + * - DESTROY_SESSION must be the final operation in a compound, if
> > + * sessionid's in SEQUENCE and DESTROY_SESSION are the same.
> > + * (Enforced in nfsd4_destroy_session().)
> > */
> > -static bool nfs41_op_ordering_ok(struct nfsd4_compoundargs *args)
> > +static __be32 nfs41_check_op_ordering(struct nfsd4_compoundargs *args)
> > {
> > - if (args->minorversion && args->opcnt > 0) {
> > - struct nfsd4_op *op = &args->ops[0];
> > - return (op->status == nfserr_op_illegal) ||
> > - (nfsd4_ops[op->opnum].op_flags & ALLOWED_AS_FIRST_OP);
> > - }
> > - return true;
> > + struct nfsd4_op *op = &args->ops[0];
> > +
> > + /* These ordering requirements don't apply to NFSv4.0: */
> > + if (args->minorversion == 0)
> > + return nfs_ok;
> > + /* This is weird, but OK, not our problem: */
> > + if (args->opcnt == 0)
> > + return nfs_ok;
> > + if (op->status == nfserr_op_illegal)
> > + return nfs_ok;
> > + if (!(nfsd4_ops[op->opnum].op_flags & ALLOWED_AS_FIRST_OP))
> > + return nfserr_op_not_in_session;
> > + if (op->opnum == OP_SEQUENCE)
> > + return nfs_ok;
> > + if (args->opcnt != 1)
> > + return nfserr_not_only_op;
> > + return nfs_ok;
> > }
> >
> > /*
> > @@ -1023,13 +1039,13 @@ nfsd4_proc_compound(struct svc_rqst *rqstp,
> > if (args->minorversion > nfsd_supported_minorversion)
> > goto out;
> >
> > - if (!nfs41_op_ordering_ok(args)) {
> > + status = nfs41_check_op_ordering(args);
> > + if (status) {
> > op = &args->ops[0];
> > - op->status = nfserr_sequence_pos;
> > + op->status = status;
> > goto encode_op;
> > }
> >
> > - status = nfs_ok;
> > while (!status && resp->opcnt < args->opcnt) {
> > op = &args->ops[resp->opcnt++];
> >
> > diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> > index 5051ade..e444829 100644
> > --- a/fs/nfsd/nfs4state.c
> > +++ b/fs/nfsd/nfs4state.c
> > @@ -1365,6 +1365,14 @@ out:
> > return status;
> > }
> >
> > +static bool nfsd4_last_compound_op(struct svc_rqst *rqstp)
> > +{
> > + struct nfsd4_compoundres *resp = rqstp->rq_resp;
> > + struct nfsd4_compoundargs *argp = rqstp->rq_argp;
> > +
> > + return argp->opcnt == resp->opcnt;
> > +}
> > +
> > __be32
> > nfsd4_destroy_session(struct svc_rqst *r,
> > struct nfsd4_compound_state *cstate,
> > @@ -1380,6 +1388,11 @@ nfsd4_destroy_session(struct svc_rqst *r,
> > * - Do we need to clear any callback info from previous session?
> > */
> >
> > + if (!memcmp(&sessionid->sessionid, &cstate->session->se_sessionid,
> > + sizeof(struct nfs4_sessionid))) {
> > + if (!nfsd4_last_compound_op(r))
> > + return nfserr_not_only_op;
> > + }
> > dump_sessionid(__func__, &sessionid->sessionid);
> > spin_lock(&sessionid_lock);
> > ses = find_in_sessionid_hashtbl(&sessionid->sessionid);
> > --
> > 1.6.3.3
> >

2010-04-23 23:13:45

by J.Bruce Fields

[permalink] [raw]
Subject: Re: [PATCH 1/2] nfsd4: complete enforcement of 4.1 op ordering

On Fri, Apr 23, 2010 at 05:24:11PM -0400, J. Bruce Fields wrote:
> On Thu, Apr 22, 2010 at 10:48:31AM -0400, J. Bruce Fields wrote:
> > On Thu, Apr 22, 2010 at 10:32:23AM -0400, J. Bruce Fields wrote:
> > > Enforce the rules about compound op ordering.
> > >
> > > Motivated by implementing RECLAIM_COMPLETE, for which the client is
> > > implicit in the current session, so it is important to ensure a
> > > succesful SEQUENCE proceeds the RECLAIM_COMPLETE.
> >
> > The other problem here is that while we have a reference count on the
> > session itself preventing it from going away till the compound is done,
> > I don't see what prevents the associated clientid from going away.
> >
> > To fix that, and to be more polite to 4.0 clients, I think we want to
> > also add a client pointer to the compound_state structure, keep count of
> > the number of compounds in progress which reference that client, and not
> > start the client's expiry timer until we've encoded the reply to the
> > compound.
>
> Benny--I coded up a simple (possibly incorrect) implementation of this,
> and then remembered that this was more or less what your
> state-lock-reduction-prep patch series did. Do you have a more recent
> version of those patches?

One possible problem with that approach: mark_for_renew() can fail (if
the client is already expired), without telling the caller.

You then end up with an odd situation, continuing to process the rest of
the compound normally, while your session belongs to a client that's
already expired.

Perhaps it would be better to mark the client for possible renewal the
moment it (or some stateid, sessionid, or other object that refers to
it) is looked up?

--b.

2010-04-24 00:10:34

by J.Bruce Fields

[permalink] [raw]
Subject: Re: [PATCH 1/2] nfsd4: complete enforcement of 4.1 op ordering

On Fri, Apr 23, 2010 at 07:13:44PM -0400, J. Bruce Fields wrote:
> On Fri, Apr 23, 2010 at 05:24:11PM -0400, J. Bruce Fields wrote:
> > On Thu, Apr 22, 2010 at 10:48:31AM -0400, J. Bruce Fields wrote:
> > > On Thu, Apr 22, 2010 at 10:32:23AM -0400, J. Bruce Fields wrote:
> > > > Enforce the rules about compound op ordering.
> > > >
> > > > Motivated by implementing RECLAIM_COMPLETE, for which the client is
> > > > implicit in the current session, so it is important to ensure a
> > > > succesful SEQUENCE proceeds the RECLAIM_COMPLETE.
> > >
> > > The other problem here is that while we have a reference count on the
> > > session itself preventing it from going away till the compound is done,
> > > I don't see what prevents the associated clientid from going away.
> > >
> > > To fix that, and to be more polite to 4.0 clients, I think we want to
> > > also add a client pointer to the compound_state structure, keep count of
> > > the number of compounds in progress which reference that client, and not
> > > start the client's expiry timer until we've encoded the reply to the
> > > compound.
> >
> > Benny--I coded up a simple (possibly incorrect) implementation of this,
> > and then remembered that this was more or less what your
> > state-lock-reduction-prep patch series did. Do you have a more recent
> > version of those patches?
>
> One possible problem with that approach: mark_for_renew() can fail (if
> the client is already expired), without telling the caller.
>
> You then end up with an odd situation, continuing to process the rest of
> the compound normally, while your session belongs to a client that's
> already expired.
>
> Perhaps it would be better to mark the client for possible renewal the
> moment it (or some stateid, sessionid, or other object that refers to
> it) is looked up?

Also: a single "RENEW" state won't suffice when multiple outstanding
compounds reference the same client, in which case you want only the
last (not the first) to complete to do the renewal. So I think we want
a count of outstanding compounds referencing the client, so that we can
do the renewal when the count goes to zero.

--b.

2010-04-27 14:40:56

by Benny Halevy

[permalink] [raw]
Subject: Re: [PATCH 1/2] nfsd4: complete enforcement of 4.1 op ordering

Bruce, Oddly enough I didn't receive the patch you're commenting on into
my inbox. It already happened before on this list and I've no idea what
could have went wrong. (I also have a gmail account subscribed on this list
and I can't find it there, even in the spam folder :-/)

comments below...

On Apr. 24, 2010, 0:24 +0300, "J. Bruce Fields" <[email protected]> wrote:
> On Thu, Apr 22, 2010 at 10:48:31AM -0400, J. Bruce Fields wrote:
>> On Thu, Apr 22, 2010 at 10:32:23AM -0400, J. Bruce Fields wrote:
>>> Enforce the rules about compound op ordering.
>>>
>>> Motivated by implementing RECLAIM_COMPLETE, for which the client is
>>> implicit in the current session, so it is important to ensure a
>>> succesful SEQUENCE proceeds the RECLAIM_COMPLETE.
>>
>> The other problem here is that while we have a reference count on the
>> session itself preventing it from going away till the compound is done,
>> I don't see what prevents the associated clientid from going away.
>>
>> To fix that, and to be more polite to 4.0 clients, I think we want to
>> also add a client pointer to the compound_state structure, keep count of
>> the number of compounds in progress which reference that client, and not
>> start the client's expiry timer until we've encoded the reply to the
>> compound.
>
> Benny--I coded up a simple (possibly incorrect) implementation of this,
> and then remembered that this was more or less what your
> state-lock-reduction-prep patch series did. Do you have a more recent
> version of those patches?

Yup. though untested with latest bits.
I'll resend it as a reply to this message.

>
> --b.
>
>>
>> One question there is whether it's really correct to assume that a
>> single compound can reference only one client. (I don't think rfc 3530
>> explicitly forbids a single compound referring to multiple clients. rfc
>> 5661 explicitly allows it in the case of DESTROY_CLIENTID, though that's
>> a special case.)

I don't think so, if the compound starts with SEQUENCE there's no way
to replace the session/clientid for the compound. All operations
not associated with a sessions must be the only ops in the compound
(with the exception of CREATE_SESSION that can be preceded with a
SEQUENCE but it doesn't change the client id not the effective session
for the compound)

Benny

>>
>> --b.
>>
>>>
>>> Signed-off-by: J. Bruce Fields <[email protected]>
>>> ---
>>> fs/nfsd/nfs4proc.c | 44 ++++++++++++++++++++++++++++++--------------
>>> fs/nfsd/nfs4state.c | 13 +++++++++++++
>>> 2 files changed, 43 insertions(+), 14 deletions(-)
>>>
>>> diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
>>> index 37514c4..e147dbc 100644
>>> --- a/fs/nfsd/nfs4proc.c
>>> +++ b/fs/nfsd/nfs4proc.c
>>> @@ -968,20 +968,36 @@ static struct nfsd4_operation nfsd4_ops[];
>>> static const char *nfsd4_op_name(unsigned opnum);
>>>
>>> /*
>>> - * Enforce NFSv4.1 COMPOUND ordering rules.
>>> + * Enforce NFSv4.1 COMPOUND ordering rules:
>>> *
>>> - * TODO:
>>> - * - enforce NFS4ERR_NOT_ONLY_OP,
>>> - * - DESTROY_SESSION MUST be the final operation in the COMPOUND request.
>>> + * Also note, enforced elsewhere:
>>> + * - SEQUENCE other than as first op results in
>>> + * NFS4ERR_SEQUENCE_POS. (Enforced in nfsd4_sequence().)
>>> + * - BIND_CONN_TO_SESSION must be the only op in its compound
>>> + * (Will be enforced in nfsd4_bind_conn_to_session().)
>>> + * - DESTROY_SESSION must be the final operation in a compound, if
>>> + * sessionid's in SEQUENCE and DESTROY_SESSION are the same.
>>> + * (Enforced in nfsd4_destroy_session().)
>>> */
>>> -static bool nfs41_op_ordering_ok(struct nfsd4_compoundargs *args)
>>> +static __be32 nfs41_check_op_ordering(struct nfsd4_compoundargs *args)
>>> {
>>> - if (args->minorversion && args->opcnt > 0) {
>>> - struct nfsd4_op *op = &args->ops[0];
>>> - return (op->status == nfserr_op_illegal) ||
>>> - (nfsd4_ops[op->opnum].op_flags & ALLOWED_AS_FIRST_OP);
>>> - }
>>> - return true;
>>> + struct nfsd4_op *op = &args->ops[0];
>>> +
>>> + /* These ordering requirements don't apply to NFSv4.0: */
>>> + if (args->minorversion == 0)
>>> + return nfs_ok;
>>> + /* This is weird, but OK, not our problem: */
>>> + if (args->opcnt == 0)
>>> + return nfs_ok;
>>> + if (op->status == nfserr_op_illegal)
>>> + return nfs_ok;
>>> + if (!(nfsd4_ops[op->opnum].op_flags & ALLOWED_AS_FIRST_OP))
>>> + return nfserr_op_not_in_session;
>>> + if (op->opnum == OP_SEQUENCE)
>>> + return nfs_ok;
>>> + if (args->opcnt != 1)
>>> + return nfserr_not_only_op;
>>> + return nfs_ok;
>>> }
>>>
>>> /*
>>> @@ -1023,13 +1039,13 @@ nfsd4_proc_compound(struct svc_rqst *rqstp,
>>> if (args->minorversion > nfsd_supported_minorversion)
>>> goto out;
>>>
>>> - if (!nfs41_op_ordering_ok(args)) {
>>> + status = nfs41_check_op_ordering(args);
>>> + if (status) {
>>> op = &args->ops[0];
>>> - op->status = nfserr_sequence_pos;
>>> + op->status = status;
>>> goto encode_op;
>>> }
>>>
>>> - status = nfs_ok;
>>> while (!status && resp->opcnt < args->opcnt) {
>>> op = &args->ops[resp->opcnt++];
>>>
>>> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
>>> index 5051ade..e444829 100644
>>> --- a/fs/nfsd/nfs4state.c
>>> +++ b/fs/nfsd/nfs4state.c
>>> @@ -1365,6 +1365,14 @@ out:
>>> return status;
>>> }
>>>
>>> +static bool nfsd4_last_compound_op(struct svc_rqst *rqstp)
>>> +{
>>> + struct nfsd4_compoundres *resp = rqstp->rq_resp;
>>> + struct nfsd4_compoundargs *argp = rqstp->rq_argp;
>>> +
>>> + return argp->opcnt == resp->opcnt;
>>> +}
>>> +
>>> __be32
>>> nfsd4_destroy_session(struct svc_rqst *r,
>>> struct nfsd4_compound_state *cstate,
>>> @@ -1380,6 +1388,11 @@ nfsd4_destroy_session(struct svc_rqst *r,
>>> * - Do we need to clear any callback info from previous session?
>>> */
>>>
>>> + if (!memcmp(&sessionid->sessionid, &cstate->session->se_sessionid,
>>> + sizeof(struct nfs4_sessionid))) {
>>> + if (!nfsd4_last_compound_op(r))
>>> + return nfserr_not_only_op;
>>> + }
>>> dump_sessionid(__func__, &sessionid->sessionid);
>>> spin_lock(&sessionid_lock);
>>> ses = find_in_sessionid_hashtbl(&sessionid->sessionid);
>>> --
>>> 1.6.3.3
>>>

2010-04-27 15:01:04

by J.Bruce Fields

[permalink] [raw]
Subject: Re: [PATCH 1/2] nfsd4: complete enforcement of 4.1 op ordering

On Tue, Apr 27, 2010 at 05:40:53PM +0300, Benny Halevy wrote:
> Bruce, Oddly enough I didn't receive the patch you're commenting on into
> my inbox. It already happened before on this list and I've no idea what
> could have went wrong. (I also have a gmail account subscribed on this list
> and I can't find it there, even in the spam folder :-/)

Huh. I see it in various archives, so I'll assume the problem's on your
end.

> comments below...
>
> On Apr. 24, 2010, 0:24 +0300, "J. Bruce Fields" <[email protected]> wrote:
> > Benny--I coded up a simple (possibly incorrect) implementation of this,
> > and then remembered that this was more or less what your
> > state-lock-reduction-prep patch series did. Do you have a more recent
> > version of those patches?
>
> Yup. though untested with latest bits.
> I'll resend it as a reply to this message.

I think what we actually want is a mechanism with semantics like yours,
but with multiple RENEW values so we can track how many users there are
and have only the last one do the renew.

One possibility:

- add a cl_users field, with:
cl_user>0 meaning: somebody's using this client right
now, don't expire it. (Your RENEW state.)
cl_user<0 meaning: this client is already being expired,
don't try to use it (or any sessionid or other
state associated with it). (Your EXPIRED state.)
cl_user==0: fair game to either use or (if expiry time
has passed) to expire. (Your NORMAL state.)

- add a cl_renewme boolean, meaning: last user of this client
(user to decrement to 0) should renew the client (reset the
expiry time and move it to the back of the lru).

So:

- in laundromat:
- atomically check whether cl_users is 0, and, if so,
decrement to -1. (If positive, skip expiring this
client.)

- on looking up a sessionid (or, eventually, any state object
associated with a client), call get_client(), which:
- atomically checks whether cl_users is >=0, and, if so,
increments it. If <0, fail to find the object and
return an appropriate error (STALE?).
- on renew:
- BUG_ON(cl_user<=0)
- set cl_renewme
- on dropping reference to a sessionid (or, eventually, any
state object associated with a client), call put_client(),
which:
- atomically decrements cl_users, checks whether it hits
zero, and (if so, and if cl_renewme set), renews the
client.

One possible implementation: make cl_users atomic, add a per-client
spinlock, make the put_client() do an atomic_dec_and_lock(), etc.

--b.

2010-04-27 15:03:18

by Benny Halevy

[permalink] [raw]
Subject: Re: [PATCH 1/2] nfsd4: complete enforcement of 4.1 op ordering

On Apr. 24, 2010, 3:10 +0300, "J. Bruce Fields" <[email protected]> wrote:
> On Fri, Apr 23, 2010 at 07:13:44PM -0400, J. Bruce Fields wrote:
>> On Fri, Apr 23, 2010 at 05:24:11PM -0400, J. Bruce Fields wrote:
>>> On Thu, Apr 22, 2010 at 10:48:31AM -0400, J. Bruce Fields wrote:
>>>> On Thu, Apr 22, 2010 at 10:32:23AM -0400, J. Bruce Fields wrote:
>>>>> Enforce the rules about compound op ordering.
>>>>>
>>>>> Motivated by implementing RECLAIM_COMPLETE, for which the client is
>>>>> implicit in the current session, so it is important to ensure a
>>>>> succesful SEQUENCE proceeds the RECLAIM_COMPLETE.
>>>>
>>>> The other problem here is that while we have a reference count on the
>>>> session itself preventing it from going away till the compound is done,
>>>> I don't see what prevents the associated clientid from going away.
>>>>
>>>> To fix that, and to be more polite to 4.0 clients, I think we want to
>>>> also add a client pointer to the compound_state structure, keep count of
>>>> the number of compounds in progress which reference that client, and not
>>>> start the client's expiry timer until we've encoded the reply to the
>>>> compound.
>>>
>>> Benny--I coded up a simple (possibly incorrect) implementation of this,
>>> and then remembered that this was more or less what your
>>> state-lock-reduction-prep patch series did. Do you have a more recent
>>> version of those patches?
>>
>> One possible problem with that approach: mark_for_renew() can fail (if
>> the client is already expired), without telling the caller.
>>
>> You then end up with an odd situation, continuing to process the rest of
>> the compound normally, while your session belongs to a client that's
>> already expired.
>>
>> Perhaps it would be better to mark the client for possible renewal the
>> moment it (or some stateid, sessionid, or other object that refers to
>> it) is looked up?
>
> Also: a single "RENEW" state won't suffice when multiple outstanding
> compounds reference the same client, in which case you want only the
> last (not the first) to complete to do the renewal. So I think we want
> a count of outstanding compounds referencing the client, so that we can
> do the renewal when the count goes to zero.

True. My patchset sort of did that on SEQUENCE, though not under the
sessionid_lock. If the client is expired atomically with updating
all sessions referring to it under that lock we can refcount it for
renewal on nfsd4_get_session (always called under sessionid_lock),
next nfsd4_put_session should be moved under the same lock and there
the "use count" can be decremented and the client can be renewed
when no session is actively using it.

Benny

>
> --b.

2010-04-27 15:44:54

by Benny Halevy

[permalink] [raw]
Subject: Re: [PATCH 1/2] nfsd4: complete enforcement of 4.1 op ordering

On Apr. 27, 2010, 18:01 +0300, "J. Bruce Fields" <[email protected]> wrote:
> On Tue, Apr 27, 2010 at 05:40:53PM +0300, Benny Halevy wrote:
>> Bruce, Oddly enough I didn't receive the patch you're commenting on into
>> my inbox. It already happened before on this list and I've no idea what
>> could have went wrong. (I also have a gmail account subscribed on this list
>> and I can't find it there, even in the spam folder :-/)
>
> Huh. I see it in various archives, so I'll assume the problem's on your
> end.
>
>> comments below...
>>
>> On Apr. 24, 2010, 0:24 +0300, "J. Bruce Fields" <[email protected]> wrote:
>>> Benny--I coded up a simple (possibly incorrect) implementation of this,
>>> and then remembered that this was more or less what your
>>> state-lock-reduction-prep patch series did. Do you have a more recent
>>> version of those patches?
>>
>> Yup. though untested with latest bits.
>> I'll resend it as a reply to this message.
>
> I think what we actually want is a mechanism with semantics like yours,
> but with multiple RENEW values so we can track how many users there are
> and have only the last one do the renew.
>
> One possibility:
>
> - add a cl_users field, with:
> cl_user>0 meaning: somebody's using this client right
> now, don't expire it. (Your RENEW state.)
> cl_user<0 meaning: this client is already being expired,
> don't try to use it (or any sessionid or other
> state associated with it). (Your EXPIRED state.)
> cl_user==0: fair game to either use or (if expiry time
> has passed) to expire. (Your NORMAL state.)
>
> - add a cl_renewme boolean, meaning: last user of this client
> (user to decrement to 0) should renew the client (reset the
> expiry time and move it to the back of the lru).
>
> So:
>
> - in laundromat:
> - atomically check whether cl_users is 0, and, if so,
> decrement to -1. (If positive, skip expiring this
> client.)
>
> - on looking up a sessionid (or, eventually, any state object
> associated with a client), call get_client(), which:
> - atomically checks whether cl_users is >=0, and, if so,
> increments it. If <0, fail to find the object and
> return an appropriate error (STALE?).
> - on renew:
> - BUG_ON(cl_user<=0)
> - set cl_renewme
> - on dropping reference to a sessionid (or, eventually, any
> state object associated with a client), call put_client(),
> which:
> - atomically decrements cl_users, checks whether it hits
> zero, and (if so, and if cl_renewme set), renews the
> client.
>
> One possible implementation: make cl_users atomic, add a per-client
> spinlock, make the put_client() do an atomic_dec_and_lock(), etc.
>
> --b.

Sounds good. I'll take a stab at it right away.
(Funny that my original implementation uses a counter but IIRC
I decided at the time it was too complicated but I agree it's much better)

Benny

2010-04-27 15:46:45

by Benny Halevy

[permalink] [raw]
Subject: Re: [PATCH 1/2] nfsd4: complete enforcement of 4.1 op ordering

On Apr. 27, 2010, 18:01 +0300, "J. Bruce Fields" <[email protected]> wrote:
> On Tue, Apr 27, 2010 at 05:40:53PM +0300, Benny Halevy wrote:
>> Bruce, Oddly enough I didn't receive the patch you're commenting on into
>> my inbox. It already happened before on this list and I've no idea what
>> could have went wrong. (I also have a gmail account subscribed on this list
>> and I can't find it there, even in the spam folder :-/)
>
> Huh. I see it in various archives, so I'll assume the problem's on your
> end.
>

I see it on the archives too, but since I don't ever erase anything out
of my gmail lists account the fact that it's not there AND not
on my panasas account as well is really suspicious. I wonder if
vger had some sort of outage or glitch that could explain that.

Benny

2010-04-27 16:12:45

by J.Bruce Fields

[permalink] [raw]
Subject: Re: [PATCH 1/2] nfsd4: complete enforcement of 4.1 op ordering

On Tue, Apr 27, 2010 at 06:46:43PM +0300, Benny Halevy wrote:
> On Apr. 27, 2010, 18:01 +0300, "J. Bruce Fields" <[email protected]> wrote:
> > On Tue, Apr 27, 2010 at 05:40:53PM +0300, Benny Halevy wrote:
> >> Bruce, Oddly enough I didn't receive the patch you're commenting on into
> >> my inbox. It already happened before on this list and I've no idea what
> >> could have went wrong. (I also have a gmail account subscribed on this list
> >> and I can't find it there, even in the spam folder :-/)
> >
> > Huh. I see it in various archives, so I'll assume the problem's on your
> > end.
> >
>
> I see it on the archives too, but since I don't ever erase anything out
> of my gmail lists account the fact that it's not there AND not
> on my panasas account as well is really suspicious. I wonder if
> vger had some sort of outage or glitch that could explain that.

For what it's worth I've found postmaster at vger.kernel.org can
sometimes help troubleshoot problems.

--b.

2010-04-27 16:22:58

by Benny Halevy

[permalink] [raw]
Subject: Re: [PATCH 1/2] nfsd4: complete enforcement of 4.1 op ordering

On Apr. 27, 2010, 19:12 +0300, "J. Bruce Fields" <[email protected]> wrote:
> On Tue, Apr 27, 2010 at 06:46:43PM +0300, Benny Halevy wrote:
>> On Apr. 27, 2010, 18:01 +0300, "J. Bruce Fields" <[email protected]> wrote:
>>> On Tue, Apr 27, 2010 at 05:40:53PM +0300, Benny Halevy wrote:
>>>> Bruce, Oddly enough I didn't receive the patch you're commenting on into
>>>> my inbox. It already happened before on this list and I've no idea what
>>>> could have went wrong. (I also have a gmail account subscribed on this list
>>>> and I can't find it there, even in the spam folder :-/)
>>>
>>> Huh. I see it in various archives, so I'll assume the problem's on your
>>> end.
>>>
>>
>> I see it on the archives too, but since I don't ever erase anything out
>> of my gmail lists account the fact that it's not there AND not
>> on my panasas account as well is really suspicious. I wonder if
>> vger had some sort of outage or glitch that could explain that.
>
> For what it's worth I've found postmaster at vger.kernel.org can
> sometimes help troubleshoot problems.
>
> --b.

Oddly enough, it seems like I'm subscribed this list (called "linux-nfsv4"):
http://marc.info/?l=linux-nfsv4&r=1&b=201004&w=2
rather than
http://marc.info/?l=linux-nfs&r=1&b=201004&w=2

Although my subscription messages from [email protected]
say "linux-nfs"

The former is essentially [email protected], right?
Maybe I got bounced off of linux-nfs.org for some reason...

Benny

2010-04-27 16:34:04

by J.Bruce Fields

[permalink] [raw]
Subject: Re: [PATCH 1/2] nfsd4: complete enforcement of 4.1 op ordering

On Tue, Apr 27, 2010 at 07:22:55PM +0300, Benny Halevy wrote:
> On Apr. 27, 2010, 19:12 +0300, "J. Bruce Fields" <[email protected]> wrote:
> > On Tue, Apr 27, 2010 at 06:46:43PM +0300, Benny Halevy wrote:
> >> On Apr. 27, 2010, 18:01 +0300, "J. Bruce Fields" <[email protected]> wrote:
> >>> On Tue, Apr 27, 2010 at 05:40:53PM +0300, Benny Halevy wrote:
> >>>> Bruce, Oddly enough I didn't receive the patch you're commenting on into
> >>>> my inbox. It already happened before on this list and I've no idea what
> >>>> could have went wrong. (I also have a gmail account subscribed on this list
> >>>> and I can't find it there, even in the spam folder :-/)
> >>>
> >>> Huh. I see it in various archives, so I'll assume the problem's on your
> >>> end.
> >>>
> >>
> >> I see it on the archives too, but since I don't ever erase anything out
> >> of my gmail lists account the fact that it's not there AND not
> >> on my panasas account as well is really suspicious. I wonder if
> >> vger had some sort of outage or glitch that could explain that.
> >
> > For what it's worth I've found postmaster at vger.kernel.org can
> > sometimes help troubleshoot problems.
> >
> > --b.
>
> Oddly enough, it seems like I'm subscribed this list (called "linux-nfsv4"):
> http://marc.info/?l=linux-nfsv4&r=1&b=201004&w=2
> rather than
> http://marc.info/?l=linux-nfs&r=1&b=201004&w=2
>
> Although my subscription messages from [email protected]
> say "linux-nfs"
>
> The former is essentially [email protected], right?
> Maybe I got bounced off of linux-nfs.org for some reason...

The main lists we've been using are:

[email protected]
[email protected]
[email protected]

This seems to be confusing. We've talked before about transitioning
away from one (probably [email protected]). Maybe now's the time to
start.

(Oh, and there's also an older sourceforge list; I think mail still gets
forwarded from the vger list to there? Or is it the other way around?
May be time to turn that one off completely.)

--b.