current_stateid processing will allow clients to construct compounds
like OPEN + READ + CLOSE or OPEN +LAYOUTGET.
As dCache server marks all layouts return-on-close, any open to a file followed
by LAYOUTGET. A combined compound will allow to do the same in one go.
Currently linux client doesn't support it. But before we can get such functionality
linux server have to support it as well. Here is an attempt to add it.
I made a dirty test with pynfs with to test OPEN+CLOSE in a single compound.
If I read nfsd code correctly, READ and WRITE have to work as well. I will refactor
pynfs a bit to easily construct such tests.
This is patches for preview to be sure that other server developers are happy with
my approach before I change much more.
Tigran.
Ok, I think I know what to do. I will update patches later today or tomorrw.
Tigran.
On Tue, Dec 6, 2011 at 3:32 PM, Benny Halevy <[email protected]> wrote:
> On 2011-12-06 15:40, J. Bruce Fields wrote:
>> On Tue, Dec 06, 2011 at 02:31:14PM +0100, Tigran Mkrtchyan wrote:
>>> On Tue, Dec 6, 2011 at 12:26 PM, Benny Halevy <[email protected]> wrote:
>>>> On 2011-12-06 04:08, J. Bruce Fields wrote:
>>>>> On Sun, Dec 04, 2011 at 02:42:11PM +0200, Benny Halevy wrote:
>>>>>> On 2011-12-04 14:03, [email protected] wrote:
>>>>>>> From: Tigran Mkrtchyan <[email protected]>
>>>>>>>
>>>>>>>
>>>>>>> Signed-off-by: Tigran Mkrtchyan <[email protected]>
>>>>>>> ---
>>>>>>> fs/nfsd/nfs4proc.c | 6 ++++++
>>>>>>> fs/nfsd/nfs4state.c | 26 ++++++++++++++++++++------
>>>>>>> 2 files changed, 26 insertions(+), 6 deletions(-)
>>>>>>>
>>>>>>> diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
>>>>>>> index fa38336..535aed2 100644
>>>>>>> --- a/fs/nfsd/nfs4proc.c
>>>>>>> +++ b/fs/nfsd/nfs4proc.c
>>>>>>> @@ -400,6 +400,12 @@ nfsd4_open(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
>>>>>>> */
>>>>>>> status = nfsd4_process_open2(rqstp, &cstate->current_fh, open);
>>>>>>> WARN_ON(status && open->op_created);
>>>>>>> +
>>>>>>> + if(status)
>>>>>>> + goto out;
>>>>>>> +
>>>>>>> + /* set current state id */
>>>>>>> + memcpy(&cstate->current_stateid, &open->op_stateid, sizeof(stateid_t));
>>>>>
>>>>> That comment is a bit redundant.
>>>>>
>>>>>> Since this should be done for all stateid-returning operations
>>>>>> I think that a cleaner approach could be to mark those as such in
>>>>>> nfsd4_ops by providing a per-op function to return the operation's
>>>>>> stateid. You can then call this method from nfsd4_proc_compound()
>>>>>> after the call to nfsd4_encode_operation() and when status == 0.
>>>>>
>>>>> So the choice is between
>>>>>
>>>>> + memcpy(&cstate->current_stateid, &open->op_stateid,
>>>>> sizeof(stateid_t));
>>>>>
>>>>> and
>>>>>
>>>>> + static void get_open_stateid(stateid_t *s)
>>>>> + {
>>>>> + memcpy(s, open->op_stateid);
>>>>> + }
>>>>> +
>>>>> + [OP_OPEN] = {
>>>>> + ...
>>>>> + .op_get_stateid = get_open_stateid,
>>>>> + ...
>>>>> + }
>>>>>
>>>>> ?
>>>>>
>>>>> I'm not so sure.
>>>>
>>>> The point is to copy the result stateid into the current_stateid
>>>> in a centralized place: nfsd4_proc_compound() and do that for all
>>>> stateid-modifying operations.
>>>
>>> Sounds good, nevertheless all operations call differently resulting
>>> stateid which make it's not possible to use a unified schema.
>
> That was the point of introducing a per-op method to get the value.
>
>>
>> Could we just remove all (or most) of those different fields and replace
>> them by cstate->current_stateid?
>
> Hmm, that will require changing the corresponding xdr result encoding
> routines to use the new field and all other use sites, but I don't think
> of a good reason why this won't work.
>
> Benny
>
>>
>> --b.
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
>> the body of a message to [email protected]
>> More majordomo info at http://vger.kernel.org/majordomo-info.html
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Dec 06, 2011 at 01:26:05PM +0200, Benny Halevy wrote:
> On 2011-12-06 04:08, J. Bruce Fields wrote:
> > On Sun, Dec 04, 2011 at 02:42:11PM +0200, Benny Halevy wrote:
> >> On 2011-12-04 14:03, [email protected] wrote:
> >>> From: Tigran Mkrtchyan <[email protected]>
> >>>
> >>>
> >>> Signed-off-by: Tigran Mkrtchyan <[email protected]>
> >>> ---
> >>> fs/nfsd/nfs4proc.c | 6 ++++++
> >>> fs/nfsd/nfs4state.c | 26 ++++++++++++++++++++------
> >>> 2 files changed, 26 insertions(+), 6 deletions(-)
> >>>
> >>> diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
> >>> index fa38336..535aed2 100644
> >>> --- a/fs/nfsd/nfs4proc.c
> >>> +++ b/fs/nfsd/nfs4proc.c
> >>> @@ -400,6 +400,12 @@ nfsd4_open(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
> >>> */
> >>> status = nfsd4_process_open2(rqstp, &cstate->current_fh, open);
> >>> WARN_ON(status && open->op_created);
> >>> +
> >>> + if(status)
> >>> + goto out;
> >>> +
> >>> + /* set current state id */
> >>> + memcpy(&cstate->current_stateid, &open->op_stateid, sizeof(stateid_t));
> >
> > That comment is a bit redundant.
> >
> >> Since this should be done for all stateid-returning operations
> >> I think that a cleaner approach could be to mark those as such in
> >> nfsd4_ops by providing a per-op function to return the operation's
> >> stateid. You can then call this method from nfsd4_proc_compound()
> >> after the call to nfsd4_encode_operation() and when status == 0.
> >
> > So the choice is between
> >
> > + memcpy(&cstate->current_stateid, &open->op_stateid,
> > sizeof(stateid_t));
> >
> > and
> >
> > + static void get_open_stateid(stateid_t *s)
> > + {
> > + memcpy(s, open->op_stateid);
> > + }
> > +
> > + [OP_OPEN] = {
> > + ...
> > + .op_get_stateid = get_open_stateid,
> > + ...
> > + }
> >
> > ?
> >
> > I'm not so sure.
>
> The point is to copy the result stateid into the current_stateid
> in a centralized place: nfsd4_proc_compound() and do that for all
> stateid-modifying operations.
>
> >
> > Anyway, thanks Tigran for looking at this.
> >
> > Do we want to guarantee that the client can't expire as long as a
> > compound references the stateid? I think that's the case.
>
> The client can't time out while the 4.1 compound is in progress, see commit d768298.
OK, you're right, and presumably it would be a bug for a compound to use
a session from one client and a stateid from another, so this is taken
care of. (Except--I think we need to check for that case. On a quick
skim I don't see the current code doing that.)
Thanks!
> Are you thinking of explicit expiration of the client?
> We may unhash the client and keep using it while it's referenced
> so that's not a problem. As far as the stateid goes, we're copying the
> value of the stateid, not pointing to any stateid structure. If the
> actual state was destroyed, we will detect that when the current_stateid
> is used by any successive operation and we cannot find the state using
> the stateid.
I *think* the concensus of the working group was that explicit
destruction of a client should wait on in-progress compounds referencing
any of the client's sessions:
http://www.ietf.org/mail-archive/web/nfsv4/current/msg08584.html
So we should probably fix this. But we can fix it at the session level.
So, OK, I can't see any practical objection to doing as Tigran as and
just passing the value of stateid instead of a reference to some object.
Well, except for performance--it seems unfortunate to have to redo the
lookup on each use. As long as there's no impact on the existing cases
(so we're only doing the lookup when a client actually uses the current
stateid), I can live with that until somebody actually demonstrates some
harm.
--b.
On Sun, Dec 4, 2011 at 1:42 PM, Benny Halevy <[email protected]> wrote:
> On 2011-12-04 14:03, [email protected] wrote:
>> From: Tigran Mkrtchyan <[email protected]>
>>
>>
>> Signed-off-by: Tigran Mkrtchyan <[email protected]>
>> ---
>> fs/nfsd/nfs4proc.c | 6 ++++++
>> fs/nfsd/nfs4state.c | 26 ++++++++++++++++++++------
>> 2 files changed, 26 insertions(+), 6 deletions(-)
>>
>> diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
>> index fa38336..535aed2 100644
>> --- a/fs/nfsd/nfs4proc.c
>> +++ b/fs/nfsd/nfs4proc.c
>> @@ -400,6 +400,12 @@ nfsd4_open(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
>> */
>> status = nfsd4_process_open2(rqstp, &cstate->current_fh, open);
>> WARN_ON(status && open->op_created);
>> +
>> + if(status)
>> + goto out;
>> +
>> + /* set current state id */
>> + memcpy(&cstate->current_stateid, &open->op_stateid, sizeof(stateid_t));
>
> Since this should be done for all stateid-returning operations
> I think that a cleaner approach could be to mark those as such in
> nfsd4_ops by providing a per-op function to return the operation's
> stateid. You can then call this method from nfsd4_proc_compound()
> after the call to nfsd4_encode_operation() and when status == 0.
Ok , I will try to do that.
>
>> out:
>> nfsd4_cleanup_open_state(open, status);
>> if (open->op_openowner)
>> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
>> index 47e94e3..a34d82e 100644
>> --- a/fs/nfsd/nfs4state.c
>> +++ b/fs/nfsd/nfs4state.c
>> @@ -51,10 +51,14 @@ time_t nfsd4_grace = 90;
>> static time_t boot_time;
>> static stateid_t zerostateid; /* bits all 0 */
>> static stateid_t onestateid; /* bits all 1 */
>> +static stateid_t currentstateid; /* other all 0, seqid 1 */
>> static u64 current_sessionid = 1;
>>
>> #define ZERO_STATEID(stateid) (!memcmp((stateid), &zerostateid, sizeof(stateid_t)))
>> #define ONE_STATEID(stateid) (!memcmp((stateid), &onestateid, sizeof(stateid_t)))
>> +#define CURRENT_STATEID(stateid) (!memcmp((stateid), ¤tstateid, sizeof(stateid_t)))
>> +
>> +#define STATEID_OF(s, c) ( CURRENT_STATEID((s)) ? &(c)->current_stateid : (s) )
>>
>> /* forward declarations */
>> static int check_for_locks(struct nfs4_file *filp, struct nfs4_lockowner *lowner);
>> @@ -3314,13 +3318,15 @@ static __be32 nfsd4_lookup_stateid(stateid_t *stateid, unsigned char typemask, s
>> */
>> __be32
>> nfs4_preprocess_stateid_op(struct nfsd4_compound_state *cstate,
>> - stateid_t *stateid, int flags, struct file **filpp)
>> + stateid_t *stateid_in, int flags, struct file **filpp)
>> {
>> struct nfs4_stid *s;
>> struct nfs4_ol_stateid *stp = NULL;
>> struct nfs4_delegation *dp = NULL;
>> struct svc_fh *current_fh = &cstate->current_fh;
>> struct inode *ino = current_fh->fh_dentry->d_inode;
>> + stateid_t *stateid;
>> +
>> __be32 status;
>>
>> if (filpp)
>> @@ -3329,9 +3335,11 @@ nfs4_preprocess_stateid_op(struct nfsd4_compound_state *cstate,
>> if (grace_disallows_io(ino))
>> return nfserr_grace;
>>
>> - if (ZERO_STATEID(stateid) || ONE_STATEID(stateid))
>> - return check_special_stateids(current_fh, stateid, flags);
>> + if (ZERO_STATEID(stateid_in) || ONE_STATEID(stateid_in))
>> + return check_special_stateids(current_fh, stateid_in, flags);
>>
>> + stateid = STATEID_OF(stateid_in, cstate);
>> +
>> status = nfsd4_lookup_stateid(stateid, NFS4_DELEG_STID|NFS4_OPEN_STID|NFS4_LOCK_STID, &s);
>> if (status)
>> return status;
>> @@ -3655,14 +3663,17 @@ nfsd4_close(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
>> __be32 status;
>> struct nfs4_openowner *oo;
>> struct nfs4_ol_stateid *stp;
>> + stateid_t *stateid;
>>
>> dprintk("NFSD: nfsd4_close on file %.*s\n",
>> (int)cstate->current_fh.fh_dentry->d_name.len,
>> cstate->current_fh.fh_dentry->d_name.name);
>>
>> nfs4_lock_state();
>> + stateid = STATEID_OF( &close->cl_stateid, cstate);
>
> nit: extra space after open-parentheses
>
>> +
>> status = nfs4_preprocess_seqid_op(cstate, close->cl_seqid,
>> - &close->cl_stateid,
>> + stateid,
>> NFS4_OPEN_STID|NFS4_CLOSED_STID,
>> &stp);
>> if (status)
>> @@ -3670,7 +3681,7 @@ nfsd4_close(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
>> oo = openowner(stp->st_stateowner);
>> status = nfs_ok;
>> update_stateid(&stp->st_stid.sc_stateid);
>> - memcpy(&close->cl_stateid, &stp->st_stid.sc_stateid, sizeof(stateid_t));
>> + memcpy(stateid, &stp->st_stid.sc_stateid, sizeof(stateid_t));
>
> Currently, nfsd4_encode_close uses close->cl_stateid so we need to keep its value
> up-to-date. With the scheme I outlined above, copying into cstate->current_stateid can
> be done in a centralized place.
>
> Benny
>
>>
>> nfsd4_close_open_stateid(stp);
>> oo->oo_last_closed_stid = stp;
>> @@ -4400,7 +4411,7 @@ int
>> nfs4_state_init(void)
>> {
>> int i, status;
>> -
>> + uint32_t one = 1;
>> status = nfsd4_init_slabs();
>> if (status)
>> return status;
>> @@ -4423,6 +4434,9 @@ nfs4_state_init(void)
>> INIT_LIST_HEAD(&lock_ownerstr_hashtbl[i]);
>> }
>> memset(&onestateid, ~0, sizeof(stateid_t));
>> + /* seqid 1, other all 0 */
>> + memcpy(¤tstateid , &one, sizeof(one));
>> +
>
> you mean currentstateid.si_generation = cpu_to_be32(1) ?
why it should be big endian here? I think xdr encoding will take care about it.
Tigran.
>
>> INIT_LIST_HEAD(&close_lru);
>> INIT_LIST_HEAD(&client_lru);
>> INIT_LIST_HEAD(&del_recall_lru);
On 2011-12-06 15:40, J. Bruce Fields wrote:
> On Tue, Dec 06, 2011 at 02:31:14PM +0100, Tigran Mkrtchyan wrote:
>> On Tue, Dec 6, 2011 at 12:26 PM, Benny Halevy <[email protected]> wrote:
>>> On 2011-12-06 04:08, J. Bruce Fields wrote:
>>>> On Sun, Dec 04, 2011 at 02:42:11PM +0200, Benny Halevy wrote:
>>>>> On 2011-12-04 14:03, [email protected] wrote:
>>>>>> From: Tigran Mkrtchyan <[email protected]>
>>>>>>
>>>>>>
>>>>>> Signed-off-by: Tigran Mkrtchyan <[email protected]>
>>>>>> ---
>>>>>> fs/nfsd/nfs4proc.c | 6 ++++++
>>>>>> fs/nfsd/nfs4state.c | 26 ++++++++++++++++++++------
>>>>>> 2 files changed, 26 insertions(+), 6 deletions(-)
>>>>>>
>>>>>> diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
>>>>>> index fa38336..535aed2 100644
>>>>>> --- a/fs/nfsd/nfs4proc.c
>>>>>> +++ b/fs/nfsd/nfs4proc.c
>>>>>> @@ -400,6 +400,12 @@ nfsd4_open(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
>>>>>> */
>>>>>> status = nfsd4_process_open2(rqstp, &cstate->current_fh, open);
>>>>>> WARN_ON(status && open->op_created);
>>>>>> +
>>>>>> + if(status)
>>>>>> + goto out;
>>>>>> +
>>>>>> + /* set current state id */
>>>>>> + memcpy(&cstate->current_stateid, &open->op_stateid, sizeof(stateid_t));
>>>>
>>>> That comment is a bit redundant.
>>>>
>>>>> Since this should be done for all stateid-returning operations
>>>>> I think that a cleaner approach could be to mark those as such in
>>>>> nfsd4_ops by providing a per-op function to return the operation's
>>>>> stateid. You can then call this method from nfsd4_proc_compound()
>>>>> after the call to nfsd4_encode_operation() and when status == 0.
>>>>
>>>> So the choice is between
>>>>
>>>> + memcpy(&cstate->current_stateid, &open->op_stateid,
>>>> sizeof(stateid_t));
>>>>
>>>> and
>>>>
>>>> + static void get_open_stateid(stateid_t *s)
>>>> + {
>>>> + memcpy(s, open->op_stateid);
>>>> + }
>>>> +
>>>> + [OP_OPEN] = {
>>>> + ...
>>>> + .op_get_stateid = get_open_stateid,
>>>> + ...
>>>> + }
>>>>
>>>> ?
>>>>
>>>> I'm not so sure.
>>>
>>> The point is to copy the result stateid into the current_stateid
>>> in a centralized place: nfsd4_proc_compound() and do that for all
>>> stateid-modifying operations.
>>
>> Sounds good, nevertheless all operations call differently resulting
>> stateid which make it's not possible to use a unified schema.
That was the point of introducing a per-op method to get the value.
>
> Could we just remove all (or most) of those different fields and replace
> them by cstate->current_stateid?
Hmm, that will require changing the corresponding xdr result encoding
routines to use the new field and all other use sites, but I don't think
of a good reason why this won't work.
Benny
>
> --b.
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
From: Tigran Mkrtchyan <[email protected]>
Signed-off-by: Tigran Mkrtchyan <[email protected]>
---
fs/nfsd/xdr4.h | 1 +
1 files changed, 1 insertions(+), 0 deletions(-)
diff --git a/fs/nfsd/xdr4.h b/fs/nfsd/xdr4.h
index 2364747..7f26edd 100644
--- a/fs/nfsd/xdr4.h
+++ b/fs/nfsd/xdr4.h
@@ -54,6 +54,7 @@ struct nfsd4_compound_state {
size_t iovlen;
u32 minorversion;
u32 status;
+ stateid_t current_stateid;
};
static inline bool nfsd4_has_session(struct nfsd4_compound_state *cs)
--
1.7.7.3
On Tue, Dec 6, 2011 at 12:26 PM, Benny Halevy <[email protected]> wrote:
> On 2011-12-06 04:08, J. Bruce Fields wrote:
>> On Sun, Dec 04, 2011 at 02:42:11PM +0200, Benny Halevy wrote:
>>> On 2011-12-04 14:03, [email protected] wrote:
>>>> From: Tigran Mkrtchyan <[email protected]>
>>>>
>>>>
>>>> Signed-off-by: Tigran Mkrtchyan <[email protected]>
>>>> ---
>>>> fs/nfsd/nfs4proc.c | 6 ++++++
>>>> fs/nfsd/nfs4state.c | 26 ++++++++++++++++++++------
>>>> 2 files changed, 26 insertions(+), 6 deletions(-)
>>>>
>>>> diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
>>>> index fa38336..535aed2 100644
>>>> --- a/fs/nfsd/nfs4proc.c
>>>> +++ b/fs/nfsd/nfs4proc.c
>>>> @@ -400,6 +400,12 @@ nfsd4_open(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
>>>> */
>>>> status = nfsd4_process_open2(rqstp, &cstate->current_fh, open);
>>>> WARN_ON(status && open->op_created);
>>>> +
>>>> + if(status)
>>>> + goto out;
>>>> +
>>>> + /* set current state id */
>>>> + memcpy(&cstate->current_stateid, &open->op_stateid, sizeof(stateid_t));
>>
>> That comment is a bit redundant.
>>
>>> Since this should be done for all stateid-returning operations
>>> I think that a cleaner approach could be to mark those as such in
>>> nfsd4_ops by providing a per-op function to return the operation's
>>> stateid. You can then call this method from nfsd4_proc_compound()
>>> after the call to nfsd4_encode_operation() and when status == 0.
>>
>> So the choice is between
>>
>> + memcpy(&cstate->current_stateid, &open->op_stateid,
>> sizeof(stateid_t));
>>
>> and
>>
>> + static void get_open_stateid(stateid_t *s)
>> + {
>> + memcpy(s, open->op_stateid);
>> + }
>> +
>> + [OP_OPEN] = {
>> + ...
>> + .op_get_stateid = get_open_stateid,
>> + ...
>> + }
>>
>> ?
>>
>> I'm not so sure.
>
> The point is to copy the result stateid into the current_stateid
> in a centralized place: nfsd4_proc_compound() and do that for all
> stateid-modifying operations.
Sounds good, nevertheless all operations call differently resulting
stateid which make it's not possible to use a unified schema.
Regards,
Tigran.
>
>>
>> Anyway, thanks Tigran for looking at this.
>>
>> Do we want to guarantee that the client can't expire as long as a
>> compound references the stateid? I think that's the case.
>
> The client can't time out while the 4.1 compound is in progress, see commit d768298.
> Are you thinking of explicit expiration of the client?
> We may unhash the client and keep using it while it's referenced
> so that's not a problem. As far as the stateid goes, we're copying the
> value of the stateid, not pointing to any stateid structure. If the
> actual state was destroyed, we will detect that when the current_stateid
> is used by any successive operation and we cannot find the state using
> the stateid.
>
> Benny
>
>>
>> --b.
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
>> the body of a message to [email protected]
>> More majordomo info at http://vger.kernel.org/majordomo-info.html
From: Tigran Mkrtchyan <[email protected]>
Signed-off-by: Tigran Mkrtchyan <[email protected]>
---
fs/nfsd/nfs4proc.c | 6 ++++++
fs/nfsd/nfs4state.c | 26 ++++++++++++++++++++------
2 files changed, 26 insertions(+), 6 deletions(-)
diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
index fa38336..535aed2 100644
--- a/fs/nfsd/nfs4proc.c
+++ b/fs/nfsd/nfs4proc.c
@@ -400,6 +400,12 @@ nfsd4_open(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
*/
status = nfsd4_process_open2(rqstp, &cstate->current_fh, open);
WARN_ON(status && open->op_created);
+
+ if(status)
+ goto out;
+
+ /* set current state id */
+ memcpy(&cstate->current_stateid, &open->op_stateid, sizeof(stateid_t));
out:
nfsd4_cleanup_open_state(open, status);
if (open->op_openowner)
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 47e94e3..a34d82e 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -51,10 +51,14 @@ time_t nfsd4_grace = 90;
static time_t boot_time;
static stateid_t zerostateid; /* bits all 0 */
static stateid_t onestateid; /* bits all 1 */
+static stateid_t currentstateid; /* other all 0, seqid 1 */
static u64 current_sessionid = 1;
#define ZERO_STATEID(stateid) (!memcmp((stateid), &zerostateid, sizeof(stateid_t)))
#define ONE_STATEID(stateid) (!memcmp((stateid), &onestateid, sizeof(stateid_t)))
+#define CURRENT_STATEID(stateid) (!memcmp((stateid), ¤tstateid, sizeof(stateid_t)))
+
+#define STATEID_OF(s, c) ( CURRENT_STATEID((s)) ? &(c)->current_stateid : (s) )
/* forward declarations */
static int check_for_locks(struct nfs4_file *filp, struct nfs4_lockowner *lowner);
@@ -3314,13 +3318,15 @@ static __be32 nfsd4_lookup_stateid(stateid_t *stateid, unsigned char typemask, s
*/
__be32
nfs4_preprocess_stateid_op(struct nfsd4_compound_state *cstate,
- stateid_t *stateid, int flags, struct file **filpp)
+ stateid_t *stateid_in, int flags, struct file **filpp)
{
struct nfs4_stid *s;
struct nfs4_ol_stateid *stp = NULL;
struct nfs4_delegation *dp = NULL;
struct svc_fh *current_fh = &cstate->current_fh;
struct inode *ino = current_fh->fh_dentry->d_inode;
+ stateid_t *stateid;
+
__be32 status;
if (filpp)
@@ -3329,9 +3335,11 @@ nfs4_preprocess_stateid_op(struct nfsd4_compound_state *cstate,
if (grace_disallows_io(ino))
return nfserr_grace;
- if (ZERO_STATEID(stateid) || ONE_STATEID(stateid))
- return check_special_stateids(current_fh, stateid, flags);
+ if (ZERO_STATEID(stateid_in) || ONE_STATEID(stateid_in))
+ return check_special_stateids(current_fh, stateid_in, flags);
+ stateid = STATEID_OF(stateid_in, cstate);
+
status = nfsd4_lookup_stateid(stateid, NFS4_DELEG_STID|NFS4_OPEN_STID|NFS4_LOCK_STID, &s);
if (status)
return status;
@@ -3655,14 +3663,17 @@ nfsd4_close(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
__be32 status;
struct nfs4_openowner *oo;
struct nfs4_ol_stateid *stp;
+ stateid_t *stateid;
dprintk("NFSD: nfsd4_close on file %.*s\n",
(int)cstate->current_fh.fh_dentry->d_name.len,
cstate->current_fh.fh_dentry->d_name.name);
nfs4_lock_state();
+ stateid = STATEID_OF( &close->cl_stateid, cstate);
+
status = nfs4_preprocess_seqid_op(cstate, close->cl_seqid,
- &close->cl_stateid,
+ stateid,
NFS4_OPEN_STID|NFS4_CLOSED_STID,
&stp);
if (status)
@@ -3670,7 +3681,7 @@ nfsd4_close(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
oo = openowner(stp->st_stateowner);
status = nfs_ok;
update_stateid(&stp->st_stid.sc_stateid);
- memcpy(&close->cl_stateid, &stp->st_stid.sc_stateid, sizeof(stateid_t));
+ memcpy(stateid, &stp->st_stid.sc_stateid, sizeof(stateid_t));
nfsd4_close_open_stateid(stp);
oo->oo_last_closed_stid = stp;
@@ -4400,7 +4411,7 @@ int
nfs4_state_init(void)
{
int i, status;
-
+ uint32_t one = 1;
status = nfsd4_init_slabs();
if (status)
return status;
@@ -4423,6 +4434,9 @@ nfs4_state_init(void)
INIT_LIST_HEAD(&lock_ownerstr_hashtbl[i]);
}
memset(&onestateid, ~0, sizeof(stateid_t));
+ /* seqid 1, other all 0 */
+ memcpy(¤tstateid , &one, sizeof(one));
+
INIT_LIST_HEAD(&close_lru);
INIT_LIST_HEAD(&client_lru);
INIT_LIST_HEAD(&del_recall_lru);
--
1.7.7.3
On 2011-12-06 14:40, J. Bruce Fields wrote:
> On Tue, Dec 06, 2011 at 01:26:05PM +0200, Benny Halevy wrote:
>> On 2011-12-06 04:08, J. Bruce Fields wrote:
>>> On Sun, Dec 04, 2011 at 02:42:11PM +0200, Benny Halevy wrote:
>>>> On 2011-12-04 14:03, [email protected] wrote:
>>>>> From: Tigran Mkrtchyan <[email protected]>
>>>>>
>>>>>
>>>>> Signed-off-by: Tigran Mkrtchyan <[email protected]>
>>>>> ---
>>>>> fs/nfsd/nfs4proc.c | 6 ++++++
>>>>> fs/nfsd/nfs4state.c | 26 ++++++++++++++++++++------
>>>>> 2 files changed, 26 insertions(+), 6 deletions(-)
>>>>>
>>>>> diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
>>>>> index fa38336..535aed2 100644
>>>>> --- a/fs/nfsd/nfs4proc.c
>>>>> +++ b/fs/nfsd/nfs4proc.c
>>>>> @@ -400,6 +400,12 @@ nfsd4_open(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
>>>>> */
>>>>> status = nfsd4_process_open2(rqstp, &cstate->current_fh, open);
>>>>> WARN_ON(status && open->op_created);
>>>>> +
>>>>> + if(status)
>>>>> + goto out;
>>>>> +
>>>>> + /* set current state id */
>>>>> + memcpy(&cstate->current_stateid, &open->op_stateid, sizeof(stateid_t));
>>>
>>> That comment is a bit redundant.
>>>
>>>> Since this should be done for all stateid-returning operations
>>>> I think that a cleaner approach could be to mark those as such in
>>>> nfsd4_ops by providing a per-op function to return the operation's
>>>> stateid. You can then call this method from nfsd4_proc_compound()
>>>> after the call to nfsd4_encode_operation() and when status == 0.
>>>
>>> So the choice is between
>>>
>>> + memcpy(&cstate->current_stateid, &open->op_stateid,
>>> sizeof(stateid_t));
>>>
>>> and
>>>
>>> + static void get_open_stateid(stateid_t *s)
>>> + {
>>> + memcpy(s, open->op_stateid);
>>> + }
>>> +
>>> + [OP_OPEN] = {
>>> + ...
>>> + .op_get_stateid = get_open_stateid,
>>> + ...
>>> + }
>>>
>>> ?
>>>
>>> I'm not so sure.
>>
>> The point is to copy the result stateid into the current_stateid
>> in a centralized place: nfsd4_proc_compound() and do that for all
>> stateid-modifying operations.
>>
>>>
>>> Anyway, thanks Tigran for looking at this.
>>>
>>> Do we want to guarantee that the client can't expire as long as a
>>> compound references the stateid? I think that's the case.
>>
>> The client can't time out while the 4.1 compound is in progress, see commit d768298.
>
> OK, you're right, and presumably it would be a bug for a compound to use
> a session from one client and a stateid from another, so this is taken
> care of. (Except--I think we need to check for that case. On a quick
> skim I don't see the current code doing that.)
>
> Thanks!
>
>> Are you thinking of explicit expiration of the client?
>> We may unhash the client and keep using it while it's referenced
>> so that's not a problem. As far as the stateid goes, we're copying the
>> value of the stateid, not pointing to any stateid structure. If the
>> actual state was destroyed, we will detect that when the current_stateid
>> is used by any successive operation and we cannot find the state using
>> the stateid.
>
> I *think* the concensus of the working group was that explicit
> destruction of a client should wait on in-progress compounds referencing
> any of the client's sessions:
>
> http://www.ietf.org/mail-archive/web/nfsv4/current/msg08584.html
I'm a bit confused, since the rfc5661 says that:
DESTROY_CLIENTID:
If there are
sessions (both idle and non-idle), opens, locks, delegations,
layouts, and/or wants (Section 18.49) associated with the unexpired
lease of the client ID, the server MUST return NFS4ERR_CLIENTID_BUSY.
DESTROY_SESSION:
Locks, delegations, layouts, wants, and the lease, which
are all tied to the client ID, are not affected by DESTROY_SESSION.
>
> So we should probably fix this. But we can fix it at the session level.
>
> So, OK, I can't see any practical objection to doing as Tigran as and
> just passing the value of stateid instead of a reference to some object.
>
> Well, except for performance--it seems unfortunate to have to redo the
> lookup on each use. As long as there's no impact on the existing cases
> (so we're only doing the lookup when a client actually uses the current
> stateid), I can live with that until somebody actually demonstrates some
> harm.
Great. I'm glad we're in agreement!
Benny
>
> --b.
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Dec 06, 2011 at 02:31:14PM +0100, Tigran Mkrtchyan wrote:
> On Tue, Dec 6, 2011 at 12:26 PM, Benny Halevy <[email protected]> wrote:
> > On 2011-12-06 04:08, J. Bruce Fields wrote:
> >> On Sun, Dec 04, 2011 at 02:42:11PM +0200, Benny Halevy wrote:
> >>> On 2011-12-04 14:03, [email protected] wrote:
> >>>> From: Tigran Mkrtchyan <[email protected]>
> >>>>
> >>>>
> >>>> Signed-off-by: Tigran Mkrtchyan <[email protected]>
> >>>> ---
> >>>> fs/nfsd/nfs4proc.c | 6 ++++++
> >>>> fs/nfsd/nfs4state.c | 26 ++++++++++++++++++++------
> >>>> 2 files changed, 26 insertions(+), 6 deletions(-)
> >>>>
> >>>> diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
> >>>> index fa38336..535aed2 100644
> >>>> --- a/fs/nfsd/nfs4proc.c
> >>>> +++ b/fs/nfsd/nfs4proc.c
> >>>> @@ -400,6 +400,12 @@ nfsd4_open(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
> >>>> */
> >>>> status = nfsd4_process_open2(rqstp, &cstate->current_fh, open);
> >>>> WARN_ON(status && open->op_created);
> >>>> +
> >>>> + if(status)
> >>>> + goto out;
> >>>> +
> >>>> + /* set current state id */
> >>>> + memcpy(&cstate->current_stateid, &open->op_stateid, sizeof(stateid_t));
> >>
> >> That comment is a bit redundant.
> >>
> >>> Since this should be done for all stateid-returning operations
> >>> I think that a cleaner approach could be to mark those as such in
> >>> nfsd4_ops by providing a per-op function to return the operation's
> >>> stateid. You can then call this method from nfsd4_proc_compound()
> >>> after the call to nfsd4_encode_operation() and when status == 0.
> >>
> >> So the choice is between
> >>
> >> + memcpy(&cstate->current_stateid, &open->op_stateid,
> >> sizeof(stateid_t));
> >>
> >> and
> >>
> >> + static void get_open_stateid(stateid_t *s)
> >> + {
> >> + memcpy(s, open->op_stateid);
> >> + }
> >> +
> >> + [OP_OPEN] = {
> >> + ...
> >> + .op_get_stateid = get_open_stateid,
> >> + ...
> >> + }
> >>
> >> ?
> >>
> >> I'm not so sure.
> >
> > The point is to copy the result stateid into the current_stateid
> > in a centralized place: nfsd4_proc_compound() and do that for all
> > stateid-modifying operations.
>
> Sounds good, nevertheless all operations call differently resulting
> stateid which make it's not possible to use a unified schema.
Could we just remove all (or most) of those different fields and replace
them by cstate->current_stateid?
--b.
On Tue, Dec 06, 2011 at 04:30:25PM +0200, Benny Halevy wrote:
> On 2011-12-06 14:40, J. Bruce Fields wrote:
> > On Tue, Dec 06, 2011 at 01:26:05PM +0200, Benny Halevy wrote:
> >> On 2011-12-06 04:08, J. Bruce Fields wrote:
> >>> On Sun, Dec 04, 2011 at 02:42:11PM +0200, Benny Halevy wrote:
> >>>> On 2011-12-04 14:03, [email protected] wrote:
> >>>>> From: Tigran Mkrtchyan <[email protected]>
> >>>>>
> >>>>>
> >>>>> Signed-off-by: Tigran Mkrtchyan <[email protected]>
> >>>>> ---
> >>>>> fs/nfsd/nfs4proc.c | 6 ++++++
> >>>>> fs/nfsd/nfs4state.c | 26 ++++++++++++++++++++------
> >>>>> 2 files changed, 26 insertions(+), 6 deletions(-)
> >>>>>
> >>>>> diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
> >>>>> index fa38336..535aed2 100644
> >>>>> --- a/fs/nfsd/nfs4proc.c
> >>>>> +++ b/fs/nfsd/nfs4proc.c
> >>>>> @@ -400,6 +400,12 @@ nfsd4_open(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
> >>>>> */
> >>>>> status = nfsd4_process_open2(rqstp, &cstate->current_fh, open);
> >>>>> WARN_ON(status && open->op_created);
> >>>>> +
> >>>>> + if(status)
> >>>>> + goto out;
> >>>>> +
> >>>>> + /* set current state id */
> >>>>> + memcpy(&cstate->current_stateid, &open->op_stateid, sizeof(stateid_t));
> >>>
> >>> That comment is a bit redundant.
> >>>
> >>>> Since this should be done for all stateid-returning operations
> >>>> I think that a cleaner approach could be to mark those as such in
> >>>> nfsd4_ops by providing a per-op function to return the operation's
> >>>> stateid. You can then call this method from nfsd4_proc_compound()
> >>>> after the call to nfsd4_encode_operation() and when status == 0.
> >>>
> >>> So the choice is between
> >>>
> >>> + memcpy(&cstate->current_stateid, &open->op_stateid,
> >>> sizeof(stateid_t));
> >>>
> >>> and
> >>>
> >>> + static void get_open_stateid(stateid_t *s)
> >>> + {
> >>> + memcpy(s, open->op_stateid);
> >>> + }
> >>> +
> >>> + [OP_OPEN] = {
> >>> + ...
> >>> + .op_get_stateid = get_open_stateid,
> >>> + ...
> >>> + }
> >>>
> >>> ?
> >>>
> >>> I'm not so sure.
> >>
> >> The point is to copy the result stateid into the current_stateid
> >> in a centralized place: nfsd4_proc_compound() and do that for all
> >> stateid-modifying operations.
> >>
> >>>
> >>> Anyway, thanks Tigran for looking at this.
> >>>
> >>> Do we want to guarantee that the client can't expire as long as a
> >>> compound references the stateid? I think that's the case.
> >>
> >> The client can't time out while the 4.1 compound is in progress, see commit d768298.
> >
> > OK, you're right, and presumably it would be a bug for a compound to use
> > a session from one client and a stateid from another, so this is taken
> > care of. (Except--I think we need to check for that case. On a quick
> > skim I don't see the current code doing that.)
> >
> > Thanks!
> >
> >> Are you thinking of explicit expiration of the client?
> >> We may unhash the client and keep using it while it's referenced
> >> so that's not a problem. As far as the stateid goes, we're copying the
> >> value of the stateid, not pointing to any stateid structure. If the
> >> actual state was destroyed, we will detect that when the current_stateid
> >> is used by any successive operation and we cannot find the state using
> >> the stateid.
> >
> > I *think* the concensus of the working group was that explicit
> > destruction of a client should wait on in-progress compounds referencing
> > any of the client's sessions:
> >
> > http://www.ietf.org/mail-archive/web/nfsv4/current/msg08584.html
>
> I'm a bit confused, since the rfc5661 says that:
>
> DESTROY_CLIENTID:
> If there are
> sessions (both idle and non-idle), opens, locks, delegations,
> layouts, and/or wants (Section 18.49) associated with the unexpired
> lease of the client ID, the server MUST return NFS4ERR_CLIENTID_BUSY.
Hm. The case Mike Eisler mentions in the message referenced above is
EXCHANGE_ID/CREATE_SESSION.
But he also proposes some text for DESTROY_CLIENTID that contradicts the
text you quote above.
I guess that's intentional--he's worried about not being able to shut
down a client cleanly after a cluster node goes down.
Still, whatever behavior we decide to implement on the server, we may
want to run it by the ietf list before we're done to make sure there's
an agreement here....
> DESTROY_SESSION:
> Locks, delegations, layouts, wants, and the lease, which
> are all tied to the client ID, are not affected by DESTROY_SESSION.
But that doesn't explain what to do about in-progress operations. So if
you the DESTROY_SESSION comes while we're in the middle of processing a
rename using that session, what do we do?
In any case, Tigran's work doesn't need to block on any of this.
--b.
>
> >
> > So we should probably fix this. But we can fix it at the session level.
> >
> > So, OK, I can't see any practical objection to doing as Tigran as and
> > just passing the value of stateid instead of a reference to some object.
> >
> > Well, except for performance--it seems unfortunate to have to redo the
> > lookup on each use. As long as there's no impact on the existing cases
> > (so we're only doing the lookup when a client actually uses the current
> > stateid), I can live with that until somebody actually demonstrates some
> > harm.
>
> Great. I'm glad we're in agreement!
>
> Benny
>
> >
> > --b.
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> > the body of a message to [email protected]
> > More majordomo info at http://vger.kernel.org/majordomo-info.html
On 2011-12-06 20:24, Tigran Mkrtchyan wrote:
> Ok, I think I know what to do. I will update patches later today or tomorrw.
Great! Thanks much for picking this up!
Benny
>
> Tigran.
>
> On Tue, Dec 6, 2011 at 3:32 PM, Benny Halevy <[email protected]> wrote:
>> On 2011-12-06 15:40, J. Bruce Fields wrote:
>>> On Tue, Dec 06, 2011 at 02:31:14PM +0100, Tigran Mkrtchyan wrote:
>>>> On Tue, Dec 6, 2011 at 12:26 PM, Benny Halevy <[email protected]> wrote:
>>>>> On 2011-12-06 04:08, J. Bruce Fields wrote:
>>>>>> On Sun, Dec 04, 2011 at 02:42:11PM +0200, Benny Halevy wrote:
>>>>>>> On 2011-12-04 14:03, [email protected] wrote:
>>>>>>>> From: Tigran Mkrtchyan <[email protected]>
>>>>>>>>
>>>>>>>>
>>>>>>>> Signed-off-by: Tigran Mkrtchyan <[email protected]>
>>>>>>>> ---
>>>>>>>> fs/nfsd/nfs4proc.c | 6 ++++++
>>>>>>>> fs/nfsd/nfs4state.c | 26 ++++++++++++++++++++------
>>>>>>>> 2 files changed, 26 insertions(+), 6 deletions(-)
>>>>>>>>
>>>>>>>> diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
>>>>>>>> index fa38336..535aed2 100644
>>>>>>>> --- a/fs/nfsd/nfs4proc.c
>>>>>>>> +++ b/fs/nfsd/nfs4proc.c
>>>>>>>> @@ -400,6 +400,12 @@ nfsd4_open(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
>>>>>>>> */
>>>>>>>> status = nfsd4_process_open2(rqstp, &cstate->current_fh, open);
>>>>>>>> WARN_ON(status && open->op_created);
>>>>>>>> +
>>>>>>>> + if(status)
>>>>>>>> + goto out;
>>>>>>>> +
>>>>>>>> + /* set current state id */
>>>>>>>> + memcpy(&cstate->current_stateid, &open->op_stateid, sizeof(stateid_t));
>>>>>>
>>>>>> That comment is a bit redundant.
>>>>>>
>>>>>>> Since this should be done for all stateid-returning operations
>>>>>>> I think that a cleaner approach could be to mark those as such in
>>>>>>> nfsd4_ops by providing a per-op function to return the operation's
>>>>>>> stateid. You can then call this method from nfsd4_proc_compound()
>>>>>>> after the call to nfsd4_encode_operation() and when status == 0.
>>>>>>
>>>>>> So the choice is between
>>>>>>
>>>>>> + memcpy(&cstate->current_stateid, &open->op_stateid,
>>>>>> sizeof(stateid_t));
>>>>>>
>>>>>> and
>>>>>>
>>>>>> + static void get_open_stateid(stateid_t *s)
>>>>>> + {
>>>>>> + memcpy(s, open->op_stateid);
>>>>>> + }
>>>>>> +
>>>>>> + [OP_OPEN] = {
>>>>>> + ...
>>>>>> + .op_get_stateid = get_open_stateid,
>>>>>> + ...
>>>>>> + }
>>>>>>
>>>>>> ?
>>>>>>
>>>>>> I'm not so sure.
>>>>>
>>>>> The point is to copy the result stateid into the current_stateid
>>>>> in a centralized place: nfsd4_proc_compound() and do that for all
>>>>> stateid-modifying operations.
>>>>
>>>> Sounds good, nevertheless all operations call differently resulting
>>>> stateid which make it's not possible to use a unified schema.
>>
>> That was the point of introducing a per-op method to get the value.
>>
>>>
>>> Could we just remove all (or most) of those different fields and replace
>>> them by cstate->current_stateid?
>>
>> Hmm, that will require changing the corresponding xdr result encoding
>> routines to use the new field and all other use sites, but I don't think
>> of a good reason why this won't work.
>>
>> Benny
>>
>>>
>>> --b.
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
>>> the body of a message to [email protected]
>>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
>> the body of a message to [email protected]
>> More majordomo info at http://vger.kernel.org/majordomo-info.html
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
On Sun, Dec 04, 2011 at 02:42:11PM +0200, Benny Halevy wrote:
> On 2011-12-04 14:03, [email protected] wrote:
> > From: Tigran Mkrtchyan <[email protected]>
> >
> >
> > Signed-off-by: Tigran Mkrtchyan <[email protected]>
> > ---
> > fs/nfsd/nfs4proc.c | 6 ++++++
> > fs/nfsd/nfs4state.c | 26 ++++++++++++++++++++------
> > 2 files changed, 26 insertions(+), 6 deletions(-)
> >
> > diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
> > index fa38336..535aed2 100644
> > --- a/fs/nfsd/nfs4proc.c
> > +++ b/fs/nfsd/nfs4proc.c
> > @@ -400,6 +400,12 @@ nfsd4_open(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
> > */
> > status = nfsd4_process_open2(rqstp, &cstate->current_fh, open);
> > WARN_ON(status && open->op_created);
> > +
> > + if(status)
> > + goto out;
> > +
> > + /* set current state id */
> > + memcpy(&cstate->current_stateid, &open->op_stateid, sizeof(stateid_t));
That comment is a bit redundant.
> Since this should be done for all stateid-returning operations
> I think that a cleaner approach could be to mark those as such in
> nfsd4_ops by providing a per-op function to return the operation's
> stateid. You can then call this method from nfsd4_proc_compound()
> after the call to nfsd4_encode_operation() and when status == 0.
So the choice is between
+ memcpy(&cstate->current_stateid, &open->op_stateid,
sizeof(stateid_t));
and
+ static void get_open_stateid(stateid_t *s)
+ {
+ memcpy(s, open->op_stateid);
+ }
+
+ [OP_OPEN] = {
+ ...
+ .op_get_stateid = get_open_stateid,
+ ...
+ }
?
I'm not so sure.
Anyway, thanks Tigran for looking at this.
Do we want to guarantee that the client can't expire as long as a
compound references the stateid? I think that's the case.
--b.
On Tue, Dec 6, 2011 at 8:10 PM, J. Bruce Fields <[email protected]> wrote:
> On Tue, Dec 06, 2011 at 04:30:25PM +0200, Benny Halevy wrote:
>> On 2011-12-06 14:40, J. Bruce Fields wrote:
>> > On Tue, Dec 06, 2011 at 01:26:05PM +0200, Benny Halevy wrote:
>> >> On 2011-12-06 04:08, J. Bruce Fields wrote:
>> >>> On Sun, Dec 04, 2011 at 02:42:11PM +0200, Benny Halevy wrote:
>> >>>> On 2011-12-04 14:03, [email protected] wrote:
>> >>>>> From: Tigran Mkrtchyan <[email protected]>
>> >>>>>
>> >>>>>
>> >>>>> Signed-off-by: Tigran Mkrtchyan <[email protected]>
>> >>>>> ---
>> >>>>> fs/nfsd/nfs4proc.c | 6 ++++++
>> >>>>> fs/nfsd/nfs4state.c | 26 ++++++++++++++++++++------
>> >>>>> 2 files changed, 26 insertions(+), 6 deletions(-)
>> >>>>>
>> >>>>> diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
>> >>>>> index fa38336..535aed2 100644
>> >>>>> --- a/fs/nfsd/nfs4proc.c
>> >>>>> +++ b/fs/nfsd/nfs4proc.c
>> >>>>> @@ -400,6 +400,12 @@ nfsd4_open(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
>> >>>>> */
>> >>>>> status = nfsd4_process_open2(rqstp, &cstate->current_fh, open);
>> >>>>> WARN_ON(status && open->op_created);
>> >>>>> +
>> >>>>> + if(status)
>> >>>>> + goto out;
>> >>>>> +
>> >>>>> + /* set current state id */
>> >>>>> + memcpy(&cstate->current_stateid, &open->op_stateid, sizeof(stateid_t));
>> >>>
>> >>> That comment is a bit redundant.
>> >>>
>> >>>> Since this should be done for all stateid-returning operations
>> >>>> I think that a cleaner approach could be to mark those as such in
>> >>>> nfsd4_ops by providing a per-op function to return the operation's
>> >>>> stateid. You can then call this method from nfsd4_proc_compound()
>> >>>> after the call to nfsd4_encode_operation() and when status == 0.
>> >>>
>> >>> So the choice is between
>> >>>
>> >>> + memcpy(&cstate->current_stateid, &open->op_stateid,
>> >>> sizeof(stateid_t));
>> >>>
>> >>> and
>> >>>
>> >>> + static void get_open_stateid(stateid_t *s)
>> >>> + {
>> >>> + memcpy(s, open->op_stateid);
>> >>> + }
>> >>> +
>> >>> + [OP_OPEN] = {
>> >>> + ...
>> >>> + .op_get_stateid = get_open_stateid,
>> >>> + ...
>> >>> + }
>> >>>
>> >>> ?
>> >>>
>> >>> I'm not so sure.
>> >>
>> >> The point is to copy the result stateid into the current_stateid
>> >> in a centralized place: nfsd4_proc_compound() and do that for all
>> >> stateid-modifying operations.
>> >>
>> >>>
>> >>> Anyway, thanks Tigran for looking at this.
>> >>>
>> >>> Do we want to guarantee that the client can't expire as long as a
>> >>> compound references the stateid? I think that's the case.
>> >>
>> >> The client can't time out while the 4.1 compound is in progress, see commit d768298.
>> >
>> > OK, you're right, and presumably it would be a bug for a compound to use
>> > a session from one client and a stateid from another, so this is taken
>> > care of. (Except--I think we need to check for that case. On a quick
>> > skim I don't see the current code doing that.)
>> >
>> > Thanks!
>> >
>> >> Are you thinking of explicit expiration of the client?
>> >> We may unhash the client and keep using it while it's referenced
>> >> so that's not a problem. As far as the stateid goes, we're copying the
>> >> value of the stateid, not pointing to any stateid structure. If the
>> >> actual state was destroyed, we will detect that when the current_stateid
>> >> is used by any successive operation and we cannot find the state using
>> >> the stateid.
>> >
>> > I *think* the concensus of the working group was that explicit
>> > destruction of a client should wait on in-progress compounds referencing
>> > any of the client's sessions:
>> >
>> > http://www.ietf.org/mail-archive/web/nfsv4/current/msg08584.html
>>
>> I'm a bit confused, since the rfc5661 says that:
>>
>> DESTROY_CLIENTID:
>> If there are
>> sessions (both idle and non-idle), opens, locks, delegations,
>> layouts, and/or wants (Section 18.49) associated with the unexpired
>> lease of the client ID, the server MUST return NFS4ERR_CLIENTID_BUSY.
>
> Hm. The case Mike Eisler mentions in the message referenced above is
> EXCHANGE_ID/CREATE_SESSION.
>
> But he also proposes some text for DESTROY_CLIENTID that contradicts the
> text you quote above.
>
> I guess that's intentional--he's worried about not being able to shut
> down a client cleanly after a cluster node goes down.
>
> Still, whatever behavior we decide to implement on the server, we may
> want to run it by the ietf list before we're done to make sure there's
> an agreement here....
>
>> DESTROY_SESSION:
>> Locks, delegations, layouts, wants, and the lease, which
>> are all tied to the client ID, are not affected by DESTROY_SESSION.
>
> But that doesn't explain what to do about in-progress operations. So if
> you the DESTROY_SESSION comes while we're in the middle of processing a
> rename using that session, what do we do?
>
> In any case, Tigran's work doesn't need to block on any of this.
Please have a look at new patch. If it the line which you prefer I
will complete it.
Tigran.
>
> --b.
>
>>
>> >
>> > So we should probably fix this. But we can fix it at the session level.
>> >
>> > So, OK, I can't see any practical objection to doing as Tigran as and
>> > just passing the value of stateid instead of a reference to some object.
>> >
>> > Well, except for performance--it seems unfortunate to have to redo the
>> > lookup on each use. As long as there's no impact on the existing cases
>> > (so we're only doing the lookup when a client actually uses the current
>> > stateid), I can live with that until somebody actually demonstrates some
>> > harm.
>>
>> Great. I'm glad we're in agreement!
>>
>> Benny
>>
>> >
>> > --b.
>> > --
>> > To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
>> > the body of a message to [email protected]
>> > More majordomo info at http://vger.kernel.org/majordomo-info.html
On 2011-12-04 14:03, [email protected] wrote:
> current_stateid processing will allow clients to construct compounds
> like OPEN + READ + CLOSE or OPEN +LAYOUTGET.
>
> As dCache server marks all layouts return-on-close, any open to a file followed
> by LAYOUTGET. A combined compound will allow to do the same in one go.
>
> Currently linux client doesn't support it. But before we can get such functionality
> linux server have to support it as well. Here is an attempt to add it.
>
> I made a dirty test with pynfs with to test OPEN+CLOSE in a single compound.
> If I read nfsd code correctly, READ and WRITE have to work as well. I will refactor
> pynfs a bit to easily construct such tests.
Cool! Another interesting and meaningful tests could be OPEN + LOCK
and LOCKU + CLOSE
Benny
>
> This is patches for preview to be sure that other server developers are happy with
> my approach before I change much more.
>
> Tigran.
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
On 2011-12-04 14:03, [email protected] wrote:
> From: Tigran Mkrtchyan <[email protected]>
>
>
> Signed-off-by: Tigran Mkrtchyan <[email protected]>
> ---
> fs/nfsd/nfs4proc.c | 6 ++++++
> fs/nfsd/nfs4state.c | 26 ++++++++++++++++++++------
> 2 files changed, 26 insertions(+), 6 deletions(-)
>
> diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
> index fa38336..535aed2 100644
> --- a/fs/nfsd/nfs4proc.c
> +++ b/fs/nfsd/nfs4proc.c
> @@ -400,6 +400,12 @@ nfsd4_open(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
> */
> status = nfsd4_process_open2(rqstp, &cstate->current_fh, open);
> WARN_ON(status && open->op_created);
> +
> + if(status)
> + goto out;
> +
> + /* set current state id */
> + memcpy(&cstate->current_stateid, &open->op_stateid, sizeof(stateid_t));
Since this should be done for all stateid-returning operations
I think that a cleaner approach could be to mark those as such in
nfsd4_ops by providing a per-op function to return the operation's
stateid. You can then call this method from nfsd4_proc_compound()
after the call to nfsd4_encode_operation() and when status == 0.
> out:
> nfsd4_cleanup_open_state(open, status);
> if (open->op_openowner)
> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> index 47e94e3..a34d82e 100644
> --- a/fs/nfsd/nfs4state.c
> +++ b/fs/nfsd/nfs4state.c
> @@ -51,10 +51,14 @@ time_t nfsd4_grace = 90;
> static time_t boot_time;
> static stateid_t zerostateid; /* bits all 0 */
> static stateid_t onestateid; /* bits all 1 */
> +static stateid_t currentstateid; /* other all 0, seqid 1 */
> static u64 current_sessionid = 1;
>
> #define ZERO_STATEID(stateid) (!memcmp((stateid), &zerostateid, sizeof(stateid_t)))
> #define ONE_STATEID(stateid) (!memcmp((stateid), &onestateid, sizeof(stateid_t)))
> +#define CURRENT_STATEID(stateid) (!memcmp((stateid), ¤tstateid, sizeof(stateid_t)))
> +
> +#define STATEID_OF(s, c) ( CURRENT_STATEID((s)) ? &(c)->current_stateid : (s) )
>
> /* forward declarations */
> static int check_for_locks(struct nfs4_file *filp, struct nfs4_lockowner *lowner);
> @@ -3314,13 +3318,15 @@ static __be32 nfsd4_lookup_stateid(stateid_t *stateid, unsigned char typemask, s
> */
> __be32
> nfs4_preprocess_stateid_op(struct nfsd4_compound_state *cstate,
> - stateid_t *stateid, int flags, struct file **filpp)
> + stateid_t *stateid_in, int flags, struct file **filpp)
> {
> struct nfs4_stid *s;
> struct nfs4_ol_stateid *stp = NULL;
> struct nfs4_delegation *dp = NULL;
> struct svc_fh *current_fh = &cstate->current_fh;
> struct inode *ino = current_fh->fh_dentry->d_inode;
> + stateid_t *stateid;
> +
> __be32 status;
>
> if (filpp)
> @@ -3329,9 +3335,11 @@ nfs4_preprocess_stateid_op(struct nfsd4_compound_state *cstate,
> if (grace_disallows_io(ino))
> return nfserr_grace;
>
> - if (ZERO_STATEID(stateid) || ONE_STATEID(stateid))
> - return check_special_stateids(current_fh, stateid, flags);
> + if (ZERO_STATEID(stateid_in) || ONE_STATEID(stateid_in))
> + return check_special_stateids(current_fh, stateid_in, flags);
>
> + stateid = STATEID_OF(stateid_in, cstate);
> +
> status = nfsd4_lookup_stateid(stateid, NFS4_DELEG_STID|NFS4_OPEN_STID|NFS4_LOCK_STID, &s);
> if (status)
> return status;
> @@ -3655,14 +3663,17 @@ nfsd4_close(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
> __be32 status;
> struct nfs4_openowner *oo;
> struct nfs4_ol_stateid *stp;
> + stateid_t *stateid;
>
> dprintk("NFSD: nfsd4_close on file %.*s\n",
> (int)cstate->current_fh.fh_dentry->d_name.len,
> cstate->current_fh.fh_dentry->d_name.name);
>
> nfs4_lock_state();
> + stateid = STATEID_OF( &close->cl_stateid, cstate);
nit: extra space after open-parentheses
> +
> status = nfs4_preprocess_seqid_op(cstate, close->cl_seqid,
> - &close->cl_stateid,
> + stateid,
> NFS4_OPEN_STID|NFS4_CLOSED_STID,
> &stp);
> if (status)
> @@ -3670,7 +3681,7 @@ nfsd4_close(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
> oo = openowner(stp->st_stateowner);
> status = nfs_ok;
> update_stateid(&stp->st_stid.sc_stateid);
> - memcpy(&close->cl_stateid, &stp->st_stid.sc_stateid, sizeof(stateid_t));
> + memcpy(stateid, &stp->st_stid.sc_stateid, sizeof(stateid_t));
Currently, nfsd4_encode_close uses close->cl_stateid so we need to keep its value
up-to-date. With the scheme I outlined above, copying into cstate->current_stateid can
be done in a centralized place.
Benny
>
> nfsd4_close_open_stateid(stp);
> oo->oo_last_closed_stid = stp;
> @@ -4400,7 +4411,7 @@ int
> nfs4_state_init(void)
> {
> int i, status;
> -
> + uint32_t one = 1;
> status = nfsd4_init_slabs();
> if (status)
> return status;
> @@ -4423,6 +4434,9 @@ nfs4_state_init(void)
> INIT_LIST_HEAD(&lock_ownerstr_hashtbl[i]);
> }
> memset(&onestateid, ~0, sizeof(stateid_t));
> + /* seqid 1, other all 0 */
> + memcpy(¤tstateid , &one, sizeof(one));
> +
you mean currentstateid.si_generation = cpu_to_be32(1) ?
> INIT_LIST_HEAD(&close_lru);
> INIT_LIST_HEAD(&client_lru);
> INIT_LIST_HEAD(&del_recall_lru);
On 2011-12-06 21:10, J. Bruce Fields wrote:
> On Tue, Dec 06, 2011 at 04:30:25PM +0200, Benny Halevy wrote:
>> On 2011-12-06 14:40, J. Bruce Fields wrote:
>>> On Tue, Dec 06, 2011 at 01:26:05PM +0200, Benny Halevy wrote:
>>>> On 2011-12-06 04:08, J. Bruce Fields wrote:
>>>>> On Sun, Dec 04, 2011 at 02:42:11PM +0200, Benny Halevy wrote:
>>>>>> On 2011-12-04 14:03, [email protected] wrote:
>>>>>>> From: Tigran Mkrtchyan <[email protected]>
>>>>>>>
>>>>>>>
>>>>>>> Signed-off-by: Tigran Mkrtchyan <[email protected]>
>>>>>>> ---
>>>>>>> fs/nfsd/nfs4proc.c | 6 ++++++
>>>>>>> fs/nfsd/nfs4state.c | 26 ++++++++++++++++++++------
>>>>>>> 2 files changed, 26 insertions(+), 6 deletions(-)
>>>>>>>
>>>>>>> diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
>>>>>>> index fa38336..535aed2 100644
>>>>>>> --- a/fs/nfsd/nfs4proc.c
>>>>>>> +++ b/fs/nfsd/nfs4proc.c
>>>>>>> @@ -400,6 +400,12 @@ nfsd4_open(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
>>>>>>> */
>>>>>>> status = nfsd4_process_open2(rqstp, &cstate->current_fh, open);
>>>>>>> WARN_ON(status && open->op_created);
>>>>>>> +
>>>>>>> + if(status)
>>>>>>> + goto out;
>>>>>>> +
>>>>>>> + /* set current state id */
>>>>>>> + memcpy(&cstate->current_stateid, &open->op_stateid, sizeof(stateid_t));
>>>>>
>>>>> That comment is a bit redundant.
>>>>>
>>>>>> Since this should be done for all stateid-returning operations
>>>>>> I think that a cleaner approach could be to mark those as such in
>>>>>> nfsd4_ops by providing a per-op function to return the operation's
>>>>>> stateid. You can then call this method from nfsd4_proc_compound()
>>>>>> after the call to nfsd4_encode_operation() and when status == 0.
>>>>>
>>>>> So the choice is between
>>>>>
>>>>> + memcpy(&cstate->current_stateid, &open->op_stateid,
>>>>> sizeof(stateid_t));
>>>>>
>>>>> and
>>>>>
>>>>> + static void get_open_stateid(stateid_t *s)
>>>>> + {
>>>>> + memcpy(s, open->op_stateid);
>>>>> + }
>>>>> +
>>>>> + [OP_OPEN] = {
>>>>> + ...
>>>>> + .op_get_stateid = get_open_stateid,
>>>>> + ...
>>>>> + }
>>>>>
>>>>> ?
>>>>>
>>>>> I'm not so sure.
>>>>
>>>> The point is to copy the result stateid into the current_stateid
>>>> in a centralized place: nfsd4_proc_compound() and do that for all
>>>> stateid-modifying operations.
>>>>
>>>>>
>>>>> Anyway, thanks Tigran for looking at this.
>>>>>
>>>>> Do we want to guarantee that the client can't expire as long as a
>>>>> compound references the stateid? I think that's the case.
>>>>
>>>> The client can't time out while the 4.1 compound is in progress, see commit d768298.
>>>
>>> OK, you're right, and presumably it would be a bug for a compound to use
>>> a session from one client and a stateid from another, so this is taken
>>> care of. (Except--I think we need to check for that case. On a quick
>>> skim I don't see the current code doing that.)
>>>
>>> Thanks!
>>>
>>>> Are you thinking of explicit expiration of the client?
>>>> We may unhash the client and keep using it while it's referenced
>>>> so that's not a problem. As far as the stateid goes, we're copying the
>>>> value of the stateid, not pointing to any stateid structure. If the
>>>> actual state was destroyed, we will detect that when the current_stateid
>>>> is used by any successive operation and we cannot find the state using
>>>> the stateid.
>>>
>>> I *think* the concensus of the working group was that explicit
>>> destruction of a client should wait on in-progress compounds referencing
>>> any of the client's sessions:
>>>
>>> http://www.ietf.org/mail-archive/web/nfsv4/current/msg08584.html
>>
>> I'm a bit confused, since the rfc5661 says that:
>>
>> DESTROY_CLIENTID:
>> If there are
>> sessions (both idle and non-idle), opens, locks, delegations,
>> layouts, and/or wants (Section 18.49) associated with the unexpired
>> lease of the client ID, the server MUST return NFS4ERR_CLIENTID_BUSY.
>
> Hm. The case Mike Eisler mentions in the message referenced above is
> EXCHANGE_ID/CREATE_SESSION.
>
> But he also proposes some text for DESTROY_CLIENTID that contradicts the
> text you quote above.
>
> I guess that's intentional--he's worried about not being able to shut
> down a client cleanly after a cluster node goes down.
>
> Still, whatever behavior we decide to implement on the server, we may
> want to run it by the ietf list before we're done to make sure there's
> an agreement here....
>
>> DESTROY_SESSION:
>> Locks, delegations, layouts, wants, and the lease, which
>> are all tied to the client ID, are not affected by DESTROY_SESSION.
>
> But that doesn't explain what to do about in-progress operations. So if
> you the DESTROY_SESSION comes while we're in the middle of processing a
> rename using that session, what do we do?
The operations cannot be replied to nor their replies can be cached
since their respective slots are gone, but that can be checked when the
operation in-progress is done.
>
> In any case, Tigran's work doesn't need to block on any of this.
Agreed.
Benny
>
> --b.
>
>>
>>>
>>> So we should probably fix this. But we can fix it at the session level.
>>>
>>> So, OK, I can't see any practical objection to doing as Tigran as and
>>> just passing the value of stateid instead of a reference to some object.
>>>
>>> Well, except for performance--it seems unfortunate to have to redo the
>>> lookup on each use. As long as there's no impact on the existing cases
>>> (so we're only doing the lookup when a client actually uses the current
>>> stateid), I can live with that until somebody actually demonstrates some
>>> harm.
>>
>> Great. I'm glad we're in agreement!
>>
>> Benny
>>
>>>
>>> --b.
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
>>> the body of a message to [email protected]
>>> More majordomo info at http://vger.kernel.org/majordomo-info.html
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
On 2011-12-04 14:03, [email protected] wrote:
> From: Tigran Mkrtchyan <[email protected]>
>
>
> Signed-off-by: Tigran Mkrtchyan <[email protected]>
> ---
> fs/nfsd/xdr4.h | 1 +
> 1 files changed, 1 insertions(+), 0 deletions(-)
>
> diff --git a/fs/nfsd/xdr4.h b/fs/nfsd/xdr4.h
> index 2364747..7f26edd 100644
> --- a/fs/nfsd/xdr4.h
> +++ b/fs/nfsd/xdr4.h
> @@ -54,6 +54,7 @@ struct nfsd4_compound_state {
> size_t iovlen;
> u32 minorversion;
> u32 status;
> + stateid_t current_stateid;
> };
>
> static inline bool nfsd4_has_session(struct nfsd4_compound_state *cs)
nit: since the new member gets to be used only in the next patch
they might as well be squashed together...
Thanks,
Benny
On 2011-12-06 04:08, J. Bruce Fields wrote:
> On Sun, Dec 04, 2011 at 02:42:11PM +0200, Benny Halevy wrote:
>> On 2011-12-04 14:03, [email protected] wrote:
>>> From: Tigran Mkrtchyan <[email protected]>
>>>
>>>
>>> Signed-off-by: Tigran Mkrtchyan <[email protected]>
>>> ---
>>> fs/nfsd/nfs4proc.c | 6 ++++++
>>> fs/nfsd/nfs4state.c | 26 ++++++++++++++++++++------
>>> 2 files changed, 26 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
>>> index fa38336..535aed2 100644
>>> --- a/fs/nfsd/nfs4proc.c
>>> +++ b/fs/nfsd/nfs4proc.c
>>> @@ -400,6 +400,12 @@ nfsd4_open(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
>>> */
>>> status = nfsd4_process_open2(rqstp, &cstate->current_fh, open);
>>> WARN_ON(status && open->op_created);
>>> +
>>> + if(status)
>>> + goto out;
>>> +
>>> + /* set current state id */
>>> + memcpy(&cstate->current_stateid, &open->op_stateid, sizeof(stateid_t));
>
> That comment is a bit redundant.
>
>> Since this should be done for all stateid-returning operations
>> I think that a cleaner approach could be to mark those as such in
>> nfsd4_ops by providing a per-op function to return the operation's
>> stateid. You can then call this method from nfsd4_proc_compound()
>> after the call to nfsd4_encode_operation() and when status == 0.
>
> So the choice is between
>
> + memcpy(&cstate->current_stateid, &open->op_stateid,
> sizeof(stateid_t));
>
> and
>
> + static void get_open_stateid(stateid_t *s)
> + {
> + memcpy(s, open->op_stateid);
> + }
> +
> + [OP_OPEN] = {
> + ...
> + .op_get_stateid = get_open_stateid,
> + ...
> + }
>
> ?
>
> I'm not so sure.
The point is to copy the result stateid into the current_stateid
in a centralized place: nfsd4_proc_compound() and do that for all
stateid-modifying operations.
>
> Anyway, thanks Tigran for looking at this.
>
> Do we want to guarantee that the client can't expire as long as a
> compound references the stateid? I think that's the case.
The client can't time out while the 4.1 compound is in progress, see commit d768298.
Are you thinking of explicit expiration of the client?
We may unhash the client and keep using it while it's referenced
so that's not a problem. As far as the stateid goes, we're copying the
value of the stateid, not pointing to any stateid structure. If the
actual state was destroyed, we will detect that when the current_stateid
is used by any successive operation and we cannot find the state using
the stateid.
Benny
>
> --b.
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html