LinuxLists.cc - [PATCH v2] nfsd: Fix race between FREE

2016-08-07 18:53:10

Subject: [PATCH v2] nfsd: Fix race between FREE_STATEID and LOCK

When running LTP's nfslock01 test, the Linux client can send a LOCK
and a FREE_STATEID request at the same time. The LOCK uses the same
lockowner as the stateid sent in the FREE_STATEID request.

The outcome is:

Frame 115025 C FREE_STATEID stateid 2/A
Frame 115026 C LOCK offset 672128 len 64
Frame 115029 R FREE_STATEID NFS4_OK
Frame 115030 R LOCK stateid 3/A
Frame 115034 C WRITE stateid 0/A offset 672128 len 64
Frame 115038 R WRITE NFS4ERR_BAD_STATEID

In other words, the server returns stateid A in a successful LOCK
reply, but it has already released it. Subsequent uses of the
stateid fail.

To address this, protect the generation check in nfsd4_free_stateid
with the st_mutex. This should guarantee that only one of two
outcomes occurs: either LOCK returns a fresh valid stateid, or
FREE_STATEID returns NFS4ERR_LOCKS_HELD.

Reported-by: Alexey Kodanev <[email protected]>
Fix-suggested-by: Jeff Layton <[email protected]>
Signed-off-by: Chuck Lever <[email protected]>
---
fs/nfsd/nfs4state.c | 19 ++++++++++++-------
1 file changed, 12 insertions(+), 7 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index b921123..07dc1aa 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -4911,19 +4911,20 @@ nfsd4_free_stateid(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
ret = nfserr_locks_held;
break;
case NFS4_LOCK_STID:
+ atomic_inc(&s->sc_count);
+ spin_unlock(&cl->cl_lock);
+ stp = openlockstateid(s);
+ mutex_lock(&stp->st_mutex);
ret = check_stateid_generation(stateid, &s->sc_stateid, 1);
if (ret)
- break;
- stp = openlockstateid(s);
+ goto out_mutex_unlock;
ret = nfserr_locks_held;
if (check_for_locks(stp->st_stid.sc_file,
lockowner(stp->st_stateowner)))
- break;
- WARN_ON(!unhash_lock_stateid(stp));
- spin_unlock(&cl->cl_lock);
- nfs4_put_stid(s);
+ goto out_mutex_unlock;
+ release_lock_stateid(stp);
ret = nfs_ok;
- goto out;
+ goto out_mutex_unlock;
case NFS4_REVOKED_DELEG_STID:
dp = delegstateid(s);
list_del_init(&dp->dl_recall_lru);
@@ -4937,6 +4938,10 @@ out_unlock:
spin_unlock(&cl->cl_lock);
out:
return ret;
+out_mutex_unlock:
+ mutex_unlock(&stp->st_mutex);
+ nfs4_put_stid(s);
+ goto out;
}

static inline int
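
The locking order this patch adopts can be tried out in user space. The following is a minimal sketch, not nfsd code: every name in it (model_stid, model_lock, model_free_stateid, the status enum) is hypothetical, pthread mutexes stand in for both cl_lock and st_mutex, and the generation check is reduced to a plain equality test. It is single-threaded, so it shows the two permitted outcomes rather than the race itself.

```c
/* User-space model only; every name below is hypothetical, not nfsd code. */
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

enum model_status { MODEL_OK, MODEL_OLD_STATEID, MODEL_LOCKS_HELD };

struct model_stid {
	pthread_mutex_t st_mutex; /* plays the role of st_mutex */
	int refcount;             /* plays the role of sc_count */
	unsigned int generation;  /* stateid seqid */
	bool hashed;              /* still reachable by lookups */
	int locks_held;           /* byte-range locks under this stateid */
};

/* Plays the role of cl_lock (a spinlock in the kernel). */
static pthread_mutex_t cl_lock = PTHREAD_MUTEX_INITIALIZER;

static void model_stid_init(struct model_stid *s, unsigned int generation)
{
	pthread_mutex_init(&s->st_mutex, NULL);
	s->refcount = 1; /* reference owned by the lookup table */
	s->generation = generation;
	s->hashed = true;
	s->locks_held = 0;
}

/* LOCK: add a lock and bump the generation, all under st_mutex. */
static unsigned int model_lock(struct model_stid *s)
{
	unsigned int gen;

	pthread_mutex_lock(&s->st_mutex);
	s->locks_held++;
	gen = ++s->generation;
	pthread_mutex_unlock(&s->st_mutex);
	return gen;
}

/* FREE_STATEID: pin, drop cl_lock, then do every check under st_mutex. */
static enum model_status
model_free_stateid(struct model_stid *s, unsigned int client_gen)
{
	enum model_status ret;

	pthread_mutex_lock(&cl_lock);
	s->refcount++;                 /* pin before dropping cl_lock */
	pthread_mutex_unlock(&cl_lock);

	pthread_mutex_lock(&s->st_mutex);
	if (client_gen != s->generation) {
		ret = MODEL_OLD_STATEID;   /* simplified generation check */
	} else if (s->locks_held > 0) {
		ret = MODEL_LOCKS_HELD;
	} else {
		pthread_mutex_lock(&cl_lock);
		s->hashed = false;         /* unhash ... */
		s->refcount--;             /* ... and drop the table's reference */
		pthread_mutex_unlock(&cl_lock);
		ret = MODEL_OK;
	}
	pthread_mutex_unlock(&s->st_mutex);

	pthread_mutex_lock(&cl_lock);
	s->refcount--;                 /* drop the pin taken above */
	pthread_mutex_unlock(&cl_lock);
	return ret;
}
```

If model_lock() runs first, a later model_free_stateid() with the current generation reports MODEL_LOCKS_HELD and the stateid stays hashed. If model_free_stateid() runs first on a stateid with no locks, it reports MODEL_OK and unhashes it. The model does not stop a later LOCK from using the unhashed stateid.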

2016-08-07 22:22:40

by Jeff Layton

Subject: Re: [PATCH v2] nfsd: Fix race between FREE_STATEID and LOCK

On Sun, 2016-08-07 at 14:53 -0400, Chuck Lever wrote:
> When running LTP's nfslock01 test, the Linux client can send a LOCK
> and a FREE_STATEID request at the same time. The LOCK uses the same
> lockowner as the stateid sent in the FREE_STATEID request.
>
> The outcome is:
>
> Frame 115025 C FREE_STATEID stateid 2/A
> Frame 115026 C LOCK offset 672128 len 64
> Frame 115029 R FREE_STATEID NFS4_OK
> Frame 115030 R LOCK stateid 3/A
> Frame 115034 C WRITE stateid 0/A offset 672128 len 64
> Frame 115038 R WRITE NFS4ERR_BAD_STATEID
>
> In other words, the server returns stateid A in a successful LOCK
> reply, but it has already released it. Subsequent uses of the
> stateid fail.
>
> To address this, protect the generation check in nfsd4_free_stateid
> with the st_mutex. This should guarantee that only one of two
> outcomes occurs: either LOCK returns a fresh valid stateid, or
> FREE_STATEID returns NFS4ERR_LOCKS_HELD.
>
> Reported-by: Alexey Kodanev <[email protected]>
> Fix-suggested-by: Jeff Layton <[email protected]>
> Signed-off-by: Chuck Lever <[email protected]>
> ---
> fs/nfsd/nfs4state.c | 19 ++++++++++++-------
> 1 file changed, 12 insertions(+), 7 deletions(-)
>
> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> index b921123..07dc1aa 100644
> --- a/fs/nfsd/nfs4state.c
> +++ b/fs/nfsd/nfs4state.c
> @@ -4911,19 +4911,20 @@ nfsd4_free_stateid(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
> ret = nfserr_locks_held;
> break;
> case NFS4_LOCK_STID:
> + atomic_inc(&s->sc_count);
> + spin_unlock(&cl->cl_lock);
> + stp = openlockstateid(s);
> + mutex_lock(&stp->st_mutex);
> ret = check_stateid_generation(stateid, &s->sc_stateid, 1);
> if (ret)
> - break;
> - stp = openlockstateid(s);
> + goto out_mutex_unlock;
> ret = nfserr_locks_held;
> if (check_for_locks(stp->st_stid.sc_file,
> lockowner(stp->st_stateowner)))
> - break;
> - WARN_ON(!unhash_lock_stateid(stp));
> - spin_unlock(&cl->cl_lock);
> - nfs4_put_stid(s);
> + goto out_mutex_unlock;
> + release_lock_stateid(stp);
> ret = nfs_ok;
> - goto out;
> + goto out_mutex_unlock;
> case NFS4_REVOKED_DELEG_STID:
> dp = delegstateid(s);
> list_del_init(&dp->dl_recall_lru);
> @@ -4937,6 +4938,10 @@ out_unlock:
> spin_unlock(&cl->cl_lock);
> out:
> return ret;
> +out_mutex_unlock:
> + mutex_unlock(&stp->st_mutex);
> + nfs4_put_stid(s);
> + goto out;
> }
>
> static inline int
>
>

Looks good to me.

Reviewed-by: Jeff Layton <[email protected]>

2016-08-08 06:48:47

by Christoph Hellwig

Subject: Re: [PATCH v2] nfsd: Fix race between FREE_STATEID and LOCK

On Sun, Aug 07, 2016 at 02:53:07PM -0400, Chuck Lever wrote:
> When running LTP's nfslock01 test, the Linux client can send a LOCK
> and a FREE_STATEID request at the same time. The LOCK uses the same
> lockowner as the stateid sent in the FREE_STATEID request.
>
> The outcome is:
>
> Frame 115025 C FREE_STATEID stateid 2/A
> Frame 115026 C LOCK offset 672128 len 64
> Frame 115029 R FREE_STATEID NFS4_OK
> Frame 115030 R LOCK stateid 3/A
> Frame 115034 C WRITE stateid 0/A offset 672128 len 64
> Frame 115038 R WRITE NFS4ERR_BAD_STATEID
>
> In other words, the server returns stateid A in a successful LOCK
> reply, but it has already released it. Subsequent uses of the
> stateid fail.
>
> To address this, protect the generation check in nfsd4_free_stateid
> with the st_mutex. This should guarantee that only one of two
> outcomes occurs: either LOCK returns a fresh valid stateid, or
> FREE_STATEID returns NFS4ERR_LOCKS_HELD.
>
> Reported-by: Alexey Kodanev <[email protected]>
> Fix-suggested-by: Jeff Layton <[email protected]>
> Signed-off-by: Chuck Lever <[email protected]>
> ---
> fs/nfsd/nfs4state.c | 19 ++++++++++++-------
> 1 file changed, 12 insertions(+), 7 deletions(-)
>
> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> index b921123..07dc1aa 100644
> --- a/fs/nfsd/nfs4state.c
> +++ b/fs/nfsd/nfs4state.c
> @@ -4911,19 +4911,20 @@ nfsd4_free_stateid(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
> ret = nfserr_locks_held;
> break;
> case NFS4_LOCK_STID:
> + atomic_inc(&s->sc_count);
> + spin_unlock(&cl->cl_lock);
> + stp = openlockstateid(s);
> + mutex_lock(&stp->st_mutex);
> ret = check_stateid_generation(stateid, &s->sc_stateid, 1);
> if (ret)
> - break;
> - stp = openlockstateid(s);
> + goto out_mutex_unlock;
> ret = nfserr_locks_held;
> if (check_for_locks(stp->st_stid.sc_file,
> lockowner(stp->st_stateowner)))
> - break;
> - WARN_ON(!unhash_lock_stateid(stp));
> - spin_unlock(&cl->cl_lock);
> - nfs4_put_stid(s);
> + goto out_mutex_unlock;
> + release_lock_stateid(stp);
> ret = nfs_ok;
> - goto out;
> + goto out_mutex_unlock;

It would be nice to split the non-trivial cases (at least
NFS4_LOCK_STID and NFS4_REVOKED_DELEG_STID) into separate helpers
here as a follow-on patch.

2016-08-08 13:19:19

by Jeff Layton

Subject: Re: [PATCH v2] nfsd: Fix race between FREE_STATEID and LOCK

On Sun, 2016-08-07 at 18:22 -0400, Jeff Layton wrote:
> On Sun, 2016-08-07 at 14:53 -0400, Chuck Lever wrote:
> >
> > When running LTP's nfslock01 test, the Linux client can send a LOCK
> > and a FREE_STATEID request at the same time. The LOCK uses the same
> > lockowner as the stateid sent in the FREE_STATEID request.
> >
> > The outcome is:
> >
> > Frame 115025 C FREE_STATEID stateid 2/A
> > Frame 115026 C LOCK offset 672128 len 64
> > Frame 115029 R FREE_STATEID NFS4_OK
> > Frame 115030 R LOCK stateid 3/A

Oh, to be clear here -- I assume this is an lk_is_new lock (with an open
stateid in it). Right?

> > Frame 115034 C WRITE stateid 0/A offset 672128 len 64
> > Frame 115038 R WRITE NFS4ERR_BAD_STATEID
> >
> > In other words, the server returns stateid A in a successful LOCK
> > reply, but it has already released it. Subsequent uses of the
> > stateid fail.
> >
> > To address this, protect the generation check in nfsd4_free_stateid
> > with the st_mutex. This should guarantee that only one of two
> > outcomes occurs: either LOCK returns a fresh valid stateid, or
> > FREE_STATEID returns NFS4ERR_LOCKS_HELD.
> >
> > Reported-by: Alexey Kodanev <[email protected]>
> > Fix-suggested-by: Jeff Layton <[email protected]>
> > Signed-off-by: Chuck Lever <[email protected]>
> > ---
> > fs/nfsd/nfs4state.c | 19 ++++++++++++-------
> > 1 file changed, 12 insertions(+), 7 deletions(-)
> >
> > diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> > index b921123..07dc1aa 100644
> > --- a/fs/nfsd/nfs4state.c
> > +++ b/fs/nfsd/nfs4state.c
> > @@ -4911,19 +4911,20 @@ nfsd4_free_stateid(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
> > ret = nfserr_locks_held;
> > break;
> > case NFS4_LOCK_STID:
> > + atomic_inc(&s->sc_count);
> > + spin_unlock(&cl->cl_lock);
> > + stp = openlockstateid(s);
> > + mutex_lock(&stp->st_mutex);
> > ret = check_stateid_generation(stateid, &s->sc_stateid, 1);
> > if (ret)
> > - break;
> > - stp = openlockstateid(s);
> > + goto out_mutex_unlock;
> > ret = nfserr_locks_held;
> > if (check_for_locks(stp->st_stid.sc_file,
> > lockowner(stp->st_stateowner)))
> > - break;
> > - WARN_ON(!unhash_lock_stateid(stp));
> > - spin_unlock(&cl->cl_lock);
> > - nfs4_put_stid(s);
> > + goto out_mutex_unlock;
> > + release_lock_stateid(stp);
> > ret = nfs_ok;
> > - goto out;
> > + goto out_mutex_unlock;
> > case NFS4_REVOKED_DELEG_STID:
> > dp = delegstateid(s);
> > list_del_init(&dp->dl_recall_lru);
> > @@ -4937,6 +4938,10 @@ out_unlock:
> > spin_unlock(&cl->cl_lock);
> > out:
> > return ret;
> > +out_mutex_unlock:
> > + mutex_unlock(&stp->st_mutex);
> > + nfs4_put_stid(s);
> > + goto out;
> > }
> >
> > static inline int
> >
> >
>
> Looks good to me.
>
> Reviewed-by: Jeff Layton <[email protected]>

Hmm...I think this is not a complete fix though. We also need something
like this patch:

--------------[snip]---------------

[PATCH] nfsd: don't return an already-unhashed lock stateid after
taking mutex

nfsd4_lock will take the st_mutex before working with the stateid it
gets, but between the time when we drop the cl_lock and take the mutex,
the stateid could become unhashed (a'la FREE_STATEID). If that happens
the lock stateid returned to the client will be forgotten.

Fix this by first moving the st_mutex acquisition into
lookup_or_create_lock_state. Then, have it check to see if the lock
stateid is still hashed after taking the mutex. If it's not, then put
the stateid and try the find/create again.

Signed-off-by: Jeff Layton <[email protected]>
---
fs/nfsd/nfs4state.c | 25 ++++++++++++++++++++-----
1 file changed, 20 insertions(+), 5 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 5d6a28af0f42..1235b1661703 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -5653,7 +5653,7 @@ static __be32
lookup_or_create_lock_state(struct nfsd4_compound_state *cstate,
struct nfs4_ol_stateid *ost,
struct nfsd4_lock *lock,
- struct nfs4_ol_stateid **lst, bool *new)
+ struct nfs4_ol_stateid **plst, bool *new)
{
__be32 status;
struct nfs4_file *fi = ost->st_stid.sc_file;
@@ -5661,7 +5661,9 @@ lookup_or_create_lock_state(struct nfsd4_compound_state *cstate,
struct nfs4_client *cl = oo->oo_owner.so_client;
struct inode *inode = d_inode(cstate->current_fh.fh_dentry);
struct nfs4_lockowner *lo;
+ struct nfs4_ol_stateid *lst;
unsigned int strhashval;
+ bool hashed;

lo = find_lockowner_str(cl, &lock->lk_new_owner);
if (!lo) {
@@ -5677,12 +5679,27 @@ lookup_or_create_lock_state(struct nfsd4_compound_state *cstate,
goto out;
}

- *lst = find_or_create_lock_stateid(lo, fi, inode, ost, new);
- if (*lst == NULL) {
+retry:
+ lst = find_or_create_lock_stateid(lo, fi, inode, ost, new);
+ if (lst == NULL) {
status = nfserr_jukebox;
goto out;
}
+
+ mutex_lock(&lst->st_mutex);
+
+ /* See if it's still hashed to avoid race with FREE_STATEID */
+ spin_lock(&cl->cl_lock);
+ hashed = list_empty(&lst->st_perfile);
+ spin_unlock(&cl->cl_lock);
+
+ if (!hashed) {
+ mutex_unlock(&lst->st_mutex);
+ nfs4_put_stid(&lst->st_stid);
+ goto retry;
+ }
status = nfs_ok;
+ *plst = lst;
out:
nfs4_put_stateowner(&lo->lo_owner);
return status;
@@ -5752,8 +5769,6 @@ nfsd4_lock(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
goto out;
status = lookup_or_create_lock_state(cstate, open_stp, lock,
&lock_stp, &new);
- if (status == nfs_ok)
- mutex_lock(&lock_stp->st_mutex);
} else {
status = nfs4_preprocess_seqid_op(cstate,
lock->lk_old_lock_seqid,
--
2.7.4
--
Jeff Layton <[email protected]>
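
The find, lock, recheck, retry sequence described in this patch can also be modelled in user space. This is a rough sketch under stated assumptions, not the nfsd implementation: all names (model_lst, model_find_or_create, model_unhash, model_lookup_locked) are hypothetical, the "hash table" is a single slot, and a plain boolean replaces the st_perfile list test. The race_window callback is an artificial hook that lets a single-threaded caller simulate FREE_STATEID running between the lookup and the mutex acquisition.

```c
/* User-space model only; every name below is hypothetical, not nfsd code. */
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

struct model_lst {
	pthread_mutex_t st_mutex;
	int refcount;
	bool hashed;
};

static pthread_mutex_t cl_lock = PTHREAD_MUTEX_INITIALIZER;
static struct model_lst *table_slot; /* one-entry stand-in for the hash */

static void model_put(struct model_lst *lst)
{
	bool last;

	pthread_mutex_lock(&cl_lock);
	last = (--lst->refcount == 0);
	pthread_mutex_unlock(&cl_lock);
	if (last) {
		pthread_mutex_destroy(&lst->st_mutex);
		free(lst);
	}
}

/* Return the hashed entry, creating it if needed, with a caller reference. */
static struct model_lst *model_find_or_create(void)
{
	struct model_lst *lst;

	pthread_mutex_lock(&cl_lock);
	lst = table_slot;
	if (!lst) {
		lst = calloc(1, sizeof(*lst));
		if (lst) {
			pthread_mutex_init(&lst->st_mutex, NULL);
			lst->refcount = 1;   /* the table's reference */
			lst->hashed = true;
			table_slot = lst;
		}
	}
	if (lst)
		lst->refcount++;         /* the caller's reference */
	pthread_mutex_unlock(&cl_lock);
	return lst;
}

/* What FREE_STATEID does to the entry: unhash and drop the table's ref. */
static void model_unhash(struct model_lst *lst)
{
	bool was_hashed;

	pthread_mutex_lock(&cl_lock);
	was_hashed = lst->hashed;
	if (was_hashed) {
		lst->hashed = false;
		table_slot = NULL;
	}
	pthread_mutex_unlock(&cl_lock);
	if (was_hashed)
		model_put(lst);
}

/* Returns an entry that is hashed and whose st_mutex is held, or NULL. */
static struct model_lst *
model_lookup_locked(void (*race_window)(struct model_lst *), int *retries)
{
	struct model_lst *lst;
	bool hashed;

	*retries = 0;
	for (;;) {
		lst = model_find_or_create();
		if (!lst)
			return NULL;
		if (race_window) {       /* simulate a racing FREE_STATEID once */
			race_window(lst);
			race_window = NULL;
		}
		pthread_mutex_lock(&lst->st_mutex);
		pthread_mutex_lock(&cl_lock);
		hashed = lst->hashed;    /* recheck after taking the mutex */
		pthread_mutex_unlock(&cl_lock);
		if (hashed)
			return lst;
		pthread_mutex_unlock(&lst->st_mutex);
		model_put(lst);
		(*retries)++;
	}
}
```

Calling model_lookup_locked(model_unhash, &retries) on an existing entry drops the stale entry, retries once, and returns a freshly created entry that is still hashed.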

2016-08-08 16:14:40

by Chuck Lever III

Subject: Re: [PATCH v2] nfsd: Fix race between FREE_STATEID and LOCK

> On Aug 8, 2016, at 9:19 AM, Jeff Layton <[email protected]> wrote:
>
> On Sun, 2016-08-07 at 18:22 -0400, Jeff Layton wrote:
>> On Sun, 2016-08-07 at 14:53 -0400, Chuck Lever wrote:
>>>
>>> When running LTP's nfslock01 test, the Linux client can send a LOCK
>>> and a FREE_STATEID request at the same time. The LOCK uses the same
>>> lockowner as the stateid sent in the FREE_STATEID request.
>>>
>>> The outcome is:
>>>
>>> Frame 115025 C FREE_STATEID stateid 2/A
>>> Frame 115026 C LOCK offset 672128 len 64
>>> Frame 115029 R FREE_STATEID NFS4_OK
>>> Frame 115030 R LOCK stateid 3/A
>
> Oh, to be clear here -- I assume this a lk_is_new lock (with an open
> stateid in it). Right?

Opcode: LOCK (12)
locktype: WRITEW_LT (4)
reclaim?: No
offset: 672000
length: 64
new lock owner?: Yes
seqid: 0x00000000
stateid
[StateID Hash: 0x6f7e]
seqid: 0x00000002
Data: a95169579501000007000000
lock_seqid: 0x00000000
Owner
clientid: 0xa951695795010000
Data: <DATA>
length: 20
contents: <DATA>

The first appearance of that stateid is in an earlier OPEN reply:

Opcode: OPEN (18)
Status: NFS4_OK (0)
stateid
[StateID Hash: 0x6f7e]
seqid: 0x00000002
Data: a95169579501000007000000
change_info
Atomic: No
changeid (before): 0
changeid (after): 0
result flags: 0x00000004, locktype posix
.... .... .... .... .... .... .... ..0. = confirm: False
.... .... .... .... .... .... .... .1.. = locktype posix: True
.... .... .... .... .... .... .... 0... = preserve unlinked: False
.... .... .... .... .... .... ..0. .... = may notify lock: False
Delegation Type: OPEN_DELEGATE_NONE (0)

>>> Frame 115034 C WRITE stateid 0/A offset 672128 len 64
>>> Frame 115038 R WRITE NFS4ERR_BAD_STATEID
>>>
>>> In other words, the server returns stateid A in a successful LOCK
>>> reply, but it has already released it. Subsequent uses of the
>>> stateid fail.
>>>
>>> To address this, protect the generation check in nfsd4_free_stateid
>>> with the st_mutex. This should guarantee that only one of two
>>> outcomes occurs: either LOCK returns a fresh valid stateid, or
>>> FREE_STATEID returns NFS4ERR_LOCKS_HELD.
>>>
>>> Reported-by: Alexey Kodanev <[email protected]>
>>> Fix-suggested-by: Jeff Layton <[email protected]>
>>> Signed-off-by: Chuck Lever <[email protected]>
>>> ---
>>> fs/nfsd/nfs4state.c | 19 ++++++++++++-------
>>> 1 file changed, 12 insertions(+), 7 deletions(-)
>>>
>>> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
>>> index b921123..07dc1aa 100644
>>> --- a/fs/nfsd/nfs4state.c
>>> +++ b/fs/nfsd/nfs4state.c
>>> @@ -4911,19 +4911,20 @@ nfsd4_free_stateid(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
>>> ret = nfserr_locks_held;
>>> break;
>>> case NFS4_LOCK_STID:
>>> + atomic_inc(&s->sc_count);
>>> + spin_unlock(&cl->cl_lock);
>>> + stp = openlockstateid(s);
>>> + mutex_lock(&stp->st_mutex);
>>> ret = check_stateid_generation(stateid, &s->sc_stateid, 1);
>>> if (ret)
>>> - break;
>>> - stp = openlockstateid(s);
>>> + goto out_mutex_unlock;
>>> ret = nfserr_locks_held;
>>> if (check_for_locks(stp->st_stid.sc_file,
>>> lockowner(stp->st_stateowner)))
>>> - break;
>>> - WARN_ON(!unhash_lock_stateid(stp));
>>> - spin_unlock(&cl->cl_lock);
>>> - nfs4_put_stid(s);
>>> + goto out_mutex_unlock;
>>> + release_lock_stateid(stp);
>>> ret = nfs_ok;
>>> - goto out;
>>> + goto out_mutex_unlock;
>>> case NFS4_REVOKED_DELEG_STID:
>>> dp = delegstateid(s);
>>> list_del_init(&dp->dl_recall_lru);
>>> @@ -4937,6 +4938,10 @@ out_unlock:
>>> spin_unlock(&cl->cl_lock);
>>> out:
>>> return ret;
>>> +out_mutex_unlock:
>>> + mutex_unlock(&stp->st_mutex);
>>> + nfs4_put_stid(s);
>>> + goto out;
>>> }
>>>
>>> static inline int
>>>
>>>
>>
>> Looks good to me.
>>
>> Reviewed-by: Jeff Layton <[email protected]>
>
> Hmm...I think this is not a complete fix though. We also need something
> like this patch:

OK, I'll create a series and add this patch.

> --------------[snip]---------------
>
> [PATCH] nfsd: don't return an already-unhashed lock stateid after
> taking mutex
>
> nfsd4_lock will take the st_mutex before working with the stateid it
> gets, but between the time when we drop the cl_lock and take the mutex,
> the stateid could become unhashed (a'la FREE_STATEID). If that happens
> the lock stateid returned to the client will be forgotten.
>
> Fix this by first moving the st_mutex acquisition into
> lookup_or_create_lock_state. Then, have it check to see if the lock
> stateid is still hashed after taking the mutex. If it's not, then put
> the stateid and try the find/create again.
>
> Signed-off-by: Jeff Layton <[email protected]>
> ---
> fs/nfsd/nfs4state.c | 25 ++++++++++++++++++++-----
> 1 file changed, 20 insertions(+), 5 deletions(-)
>
> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> index 5d6a28af0f42..1235b1661703 100644
> --- a/fs/nfsd/nfs4state.c
> +++ b/fs/nfsd/nfs4state.c
> @@ -5653,7 +5653,7 @@ static __be32
> lookup_or_create_lock_state(struct nfsd4_compound_state *cstate,
> struct nfs4_ol_stateid *ost,
> struct nfsd4_lock *lock,
> - struct nfs4_ol_stateid **lst, bool *new)
> + struct nfs4_ol_stateid **plst, bool *new)
> {
> __be32 status;
> struct nfs4_file *fi = ost->st_stid.sc_file;
> @@ -5661,7 +5661,9 @@ lookup_or_create_lock_state(struct nfsd4_compound_state *cstate,
> struct nfs4_client *cl = oo->oo_owner.so_client;
> struct inode *inode = d_inode(cstate->current_fh.fh_dentry);
> struct nfs4_lockowner *lo;
> + struct nfs4_ol_stateid *lst;
> unsigned int strhashval;
> + bool hashed;
>
> lo = find_lockowner_str(cl, &lock->lk_new_owner);
> if (!lo) {
> @@ -5677,12 +5679,27 @@ lookup_or_create_lock_state(struct nfsd4_compound_state *cstate,
> goto out;
> }
>
> - *lst = find_or_create_lock_stateid(lo, fi, inode, ost, new);
> - if (*lst == NULL) {
> +retry:
> + lst = find_or_create_lock_stateid(lo, fi, inode, ost, new);
> + if (lst == NULL) {
> status = nfserr_jukebox;
> goto out;
> }
> +
> + mutex_lock(&lst->st_mutex);
> +
> + /* See if it's still hashed to avoid race with FREE_STATEID */
> + spin_lock(&cl->cl_lock);
> + hashed = list_empty(&lst->st_perfile);
> + spin_unlock(&cl->cl_lock);
> +
> + if (!hashed) {
> + mutex_unlock(&lst->st_mutex);
> + nfs4_put_stid(&lst->st_stid);
> + goto retry;
> + }
> status = nfs_ok;
> + *plst = lst;
> out:
> nfs4_put_stateowner(&lo->lo_owner);
> return status;
> @@ -5752,8 +5769,6 @@ nfsd4_lock(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
> goto out;
> status = lookup_or_create_lock_state(cstate, open_stp, lock,
> &lock_stp, &new);
> - if (status == nfs_ok)
> - mutex_lock(&lock_stp->st_mutex);
> } else {
> status = nfs4_preprocess_seqid_op(cstate,
> lock->lk_old_lock_seqid,
> --
> 2.7.4

--
Chuck Lever

2016-08-08 18:58:51

by Jeff Layton

Subject: Re: [PATCH v2] nfsd: Fix race between FREE_STATEID and LOCK

On Mon, 2016-08-08 at 12:14 -0400, Chuck Lever wrote:
>
> > On Aug 8, 2016, at 9:19 AM, Jeff Layton <[email protected]> wrote:
> >
> > On Sun, 2016-08-07 at 18:22 -0400, Jeff Layton wrote:
> > >
> > > On Sun, 2016-08-07 at 14:53 -0400, Chuck Lever wrote:
> > > >
> > > >
> > > > When running LTP's nfslock01 test, the Linux client can send a LOCK
> > > > and a FREE_STATEID request at the same time. The LOCK uses the same
> > > > lockowner as the stateid sent in the FREE_STATEID request.
> > > >
> > > > The outcome is:
> > > >
> > > > Frame 115025 C FREE_STATEID stateid 2/A
> > > > Frame 115026 C LOCK offset 672128 len 64
> > > > Frame 115029 R FREE_STATEID NFS4_OK
> > > > Frame 115030 R LOCK stateid 3/A
> >
> > Oh, to be clear here -- I assume this a lk_is_new lock (with an open
> > stateid in it). Right?
>
>         Opcode: LOCK (12)
>             locktype: WRITEW_LT (4)
>             reclaim?: No
>             offset: 672000
>             length: 64
>             new lock owner?: Yes
>             seqid: 0x00000000
>             stateid
>                 [StateID Hash: 0x6f7e]
>                 seqid: 0x00000002
>                 Data: a95169579501000007000000
>             lock_seqid: 0x00000000
>             Owner
>                 clientid: 0xa951695795010000
>                 Data: <DATA>
>                     length: 20
>                     contents: <DATA>
>
> The first appearance of that stateid is in an earlier OPEN reply:
>
>         Opcode: OPEN (18)
>             Status: NFS4_OK (0)
>             stateid
>                 [StateID Hash: 0x6f7e]
>                 seqid: 0x00000002
>                 Data: a95169579501000007000000
>             change_info
>                 Atomic: No
>                 changeid (before): 0
>                 changeid (after): 0
>             result flags: 0x00000004, locktype posix
>                 .... .... .... .... .... .... .... ..0. = confirm: False
>                 .... .... .... .... .... .... .... .1.. = locktype posix: True
>                 .... .... .... .... .... .... .... 0... = preserve unlinked: False
>                 .... .... .... .... .... .... ..0. .... = may notify lock: False
>             Delegation Type: OPEN_DELEGATE_NONE (0)
>
> >
> > >
> > > >
> > > > Frame 115034 C WRITE stateid 0/A offset 672128 len 64
> > > > Frame 115038 R WRITE NFS4ERR_BAD_STATEID
> > > >
> > > > In other words, the server returns stateid A in a successful LOCK
> > > > reply, but it has already released it. Subsequent uses of the
> > > > stateid fail.
> > > >
> > > > To address this, protect the generation check in nfsd4_free_stateid
> > > > with the st_mutex. This should guarantee that only one of two
> > > > outcomes occurs: either LOCK returns a fresh valid stateid, or
> > > > FREE_STATEID returns NFS4ERR_LOCKS_HELD.
> > > >
> > > > Reported-by: Alexey Kodanev <[email protected]>
> > > > Fix-suggested-by: Jeff Layton <[email protected]>
> > > > Signed-off-by: Chuck Lever <[email protected]>
> > > > ---
> > > > fs/nfsd/nfs4state.c |   19 ++++++++++++-------
> > > > 1 file changed, 12 insertions(+), 7 deletions(-)
> > > >
> > > > diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> > > > index b921123..07dc1aa 100644
> > > > --- a/fs/nfsd/nfs4state.c
> > > > +++ b/fs/nfsd/nfs4state.c
> > > > @@ -4911,19 +4911,20 @@ nfsd4_free_stateid(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
> > > > ret = nfserr_locks_held;
> > > > break;
> > > > case NFS4_LOCK_STID:
> > > > + atomic_inc(&s->sc_count);
> > > > + spin_unlock(&cl->cl_lock);
> > > > + stp = openlockstateid(s);
> > > > + mutex_lock(&stp->st_mutex);
> > > > ret = check_stateid_generation(stateid, &s->sc_stateid, 1);
> > > > if (ret)
> > > > - break;
> > > > - stp = openlockstateid(s);
> > > > + goto out_mutex_unlock;
> > > > ret = nfserr_locks_held;
> > > > if (check_for_locks(stp->st_stid.sc_file,
> > > > lockowner(stp->st_stateowner)))
> > > > - break;
> > > > - WARN_ON(!unhash_lock_stateid(stp));
> > > > - spin_unlock(&cl->cl_lock);
> > > > - nfs4_put_stid(s);
> > > > + goto out_mutex_unlock;
> > > > + release_lock_stateid(stp);
> > > > ret = nfs_ok;
> > > > - goto out;
> > > > + goto out_mutex_unlock;
> > > > case NFS4_REVOKED_DELEG_STID:
> > > > dp = delegstateid(s);
> > > > list_del_init(&dp->dl_recall_lru);
> > > > @@ -4937,6 +4938,10 @@ out_unlock:
> > > > spin_unlock(&cl->cl_lock);
> > > > out:
> > > > return ret;
> > > > +out_mutex_unlock:
> > > > + mutex_unlock(&stp->st_mutex);
> > > > + nfs4_put_stid(s);
> > > > + goto out;
> > > > }
> > > >
> > > > static inline int
> > > >
> > > >
> > >
> > > Looks good to me.
> > >
> > > Reviewed-by: Jeff Layton <[email protected]>
> >
> > Hmm...I think this is not a complete fix though. We also need something
> > like this patch:
>
> OK, I'll create a series and add this patch.
>
>

Thanks!

> >
> > --------------[snip]---------------
> >
> > [PATCH] nfsd: don't return an already-unhashed lock stateid after
> > taking mutex
> >
> > nfsd4_lock will take the st_mutex before working with the stateid it
> > gets, but between the time when we drop the cl_lock and take the mutex,
> > the stateid could become unhashed (a'la FREE_STATEID). If that happens
> > the lock stateid returned to the client will be forgotten.
> >
> > Fix this by first moving the st_mutex acquisition into
> > lookup_or_create_lock_state. Then, have it check to see if the lock
> > stateid is still hashed after taking the mutex. If it's not, then put
> > the stateid and try the find/create again.
> >
> > Signed-off-by: Jeff Layton <[email protected]>
> > ---
> > fs/nfsd/nfs4state.c | 25 ++++++++++++++++++++-----
> > 1 file changed, 20 insertions(+), 5 deletions(-)
> >
> > diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> > index 5d6a28af0f42..1235b1661703 100644
> > --- a/fs/nfsd/nfs4state.c
> > +++ b/fs/nfsd/nfs4state.c
> > @@ -5653,7 +5653,7 @@ static __be32
> > lookup_or_create_lock_state(struct nfsd4_compound_state *cstate,
> > struct nfs4_ol_stateid *ost,
> > struct nfsd4_lock *lock,
> > - struct nfs4_ol_stateid **lst, bool *new)
> > + struct nfs4_ol_stateid **plst, bool *new)
> > {
> > __be32 status;
> > struct nfs4_file *fi = ost->st_stid.sc_file;
> > @@ -5661,7 +5661,9 @@ lookup_or_create_lock_state(struct nfsd4_compound_state *cstate,
> > struct nfs4_client *cl = oo->oo_owner.so_client;
> > struct inode *inode = d_inode(cstate->current_fh.fh_dentry);
> > struct nfs4_lockowner *lo;
> > + struct nfs4_ol_stateid *lst;
> > unsigned int strhashval;
> > + bool hashed;
> >
> > lo = find_lockowner_str(cl, &lock->lk_new_owner);
> > if (!lo) {
> > @@ -5677,12 +5679,27 @@ lookup_or_create_lock_state(struct nfsd4_compound_state *cstate,
> > goto out;
> > }
> >
> > - *lst = find_or_create_lock_stateid(lo, fi, inode, ost, new);
> > - if (*lst == NULL) {
> > +retry:
> > + lst = find_or_create_lock_stateid(lo, fi, inode, ost, new);
> > + if (lst == NULL) {
> > status = nfserr_jukebox;
> > goto out;
> > }
> > +
> > + mutex_lock(&lst->st_mutex);
> > +
> > + /* See if it's still hashed to avoid race with FREE_STATEID */
> > + spin_lock(&cl->cl_lock);
> > + hashed = list_empty(&lst->st_perfile);

For those lurking on this thread...this should be:

hashed = !list_empty(&lst->st_perfile);
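
To see why the negation is needed: list_empty() on a list node returns true when the node points back at itself, which is the state list_del_init() leaves it in. A node that is still linked into a list is therefore "hashed" exactly when list_empty() is false. The sketch below re-implements the few list helpers in user space with the same semantics as the kernel's include/linux/list.h; still_hashed() is a hypothetical helper name, and it assumes the unhash path removes st_perfile with list_del_init().

```c
/* User-space re-implementation of a few kernel list helpers, for illustration. */
#include <assert.h>
#include <stdbool.h>

struct list_head {
	struct list_head *next, *prev;
};

static void INIT_LIST_HEAD(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

/* Insert entry right after head. */
static void list_add(struct list_head *entry, struct list_head *head)
{
	entry->next = head->next;
	entry->prev = head;
	head->next->prev = entry;
	head->next = entry;
}

/* Unlink entry and leave it pointing at itself. */
static void list_del_init(struct list_head *entry)
{
	entry->prev->next = entry->next;
	entry->next->prev = entry->prev;
	INIT_LIST_HEAD(entry);
}

static bool list_empty(const struct list_head *h)
{
	return h->next == h;
}

/* Hypothetical helper: a node is "hashed" while it is linked into a list. */
static bool still_hashed(const struct list_head *st_perfile)
{
	return !list_empty(st_perfile);
}
```

With this, a node reports still_hashed() as true after list_add() and false after list_del_init(), which is the sense the retry check wants.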

> > + spin_unlock(&cl->cl_lock);
> > +
> > + if (!hashed) {
> > + mutex_unlock(&lst->st_mutex);
> > + nfs4_put_stid(&lst->st_stid);
> > + goto retry;
> > + }
> > status = nfs_ok;
> > + *plst = lst;
> > out:
> > nfs4_put_stateowner(&lo->lo_owner);
> > return status;
> > @@ -5752,8 +5769,6 @@ nfsd4_lock(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
> > goto out;
> > status = lookup_or_create_lock_state(cstate, open_stp, lock,
> > &lock_stp, &new);
> > - if (status == nfs_ok)
> > - mutex_lock(&lock_stp->st_mutex);
> > } else {
> > status = nfs4_preprocess_seqid_op(cstate,
> > lock->lk_old_lock_seqid,
> > --
> > 2.7.4
>
> --
> Chuck Lever

--
Jeff Layton <[email protected]>

2016-08-08 19:53:04

by J. Bruce Fields

Subject: Re: [PATCH v2] nfsd: Fix race between FREE_STATEID and LOCK

On Mon, Aug 08, 2016 at 12:14:36PM -0400, Chuck Lever wrote:
>
> > On Aug 8, 2016, at 9:19 AM, Jeff Layton <[email protected]> wrote:
> >
> > On Sun, 2016-08-07 at 18:22 -0400, Jeff Layton wrote:
> >> On Sun, 2016-08-07 at 14:53 -0400, Chuck Lever wrote:
> >>>
> >>> When running LTP's nfslock01 test, the Linux client can send a LOCK
> >>> and a FREE_STATEID request at the same time. The LOCK uses the same
> >>> lockowner as the stateid sent in the FREE_STATEID request.
> >>>
> >>> The outcome is:
> >>>
> >>> Frame 115025 C FREE_STATEID stateid 2/A
> >>> Frame 115026 C LOCK offset 672128 len 64
> >>> Frame 115029 R FREE_STATEID NFS4_OK
> >>> Frame 115030 R LOCK stateid 3/A
> >
> > Oh, to be clear here -- I assume this a lk_is_new lock (with an open
> > stateid in it). Right?
>
> Opcode: LOCK (12)
> locktype: WRITEW_LT (4)
> reclaim?: No
> offset: 672000
> length: 64
> new lock owner?: Yes
> seqid: 0x00000000
> stateid
> [StateID Hash: 0x6f7e]
> seqid: 0x00000002
> Data: a95169579501000007000000
> lock_seqid: 0x00000000
> Owner
> clientid: 0xa951695795010000
> Data: <DATA>
> length: 20
> contents: <DATA>
>
> The first appearance of that stateid is in an earlier OPEN reply:
>
> Opcode: OPEN (18)
> Status: NFS4_OK (0)
> stateid
> [StateID Hash: 0x6f7e]
> seqid: 0x00000002
> Data: a95169579501000007000000
> change_info
> Atomic: No
> changeid (before): 0
> changeid (after): 0
> result flags: 0x00000004, locktype posix
> .... .... .... .... .... .... .... ..0. = confirm: False
> .... .... .... .... .... .... .... .1.. = locktype posix: True
> .... .... .... .... .... .... .... 0... = preserve unlinked: False
> .... .... .... .... .... .... ..0. .... = may notify lock: False
> Delegation Type: OPEN_DELEGATE_NONE (0)

Oh, the client behavior makes more sense, then.

Still, did we establish for certain that the client isn't required to
serialize here?

We'd want it fixed either way, but it'd be nice to know.

--b.

>
> >>> Frame 115034 C WRITE stateid 0/A offset 672128 len 64
> >>> Frame 115038 R WRITE NFS4ERR_BAD_STATEID
> >>>
> >>> In other words, the server returns stateid A in a successful LOCK
> >>> reply, but it has already released it. Subsequent uses of the
> >>> stateid fail.
> >>>
> >>> To address this, protect the generation check in nfsd4_free_stateid
> >>> with the st_mutex. This should guarantee that only one of two
> >>> outcomes occurs: either LOCK returns a fresh valid stateid, or
> >>> FREE_STATEID returns NFS4ERR_LOCKS_HELD.
> >>>
> >>> Reported-by: Alexey Kodanev <[email protected]>
> >>> Fix-suggested-by: Jeff Layton <[email protected]>
> >>> Signed-off-by: Chuck Lever <[email protected]>
> >>> ---
> >>> fs/nfsd/nfs4state.c | 19 ++++++++++++-------
> >>> 1 file changed, 12 insertions(+), 7 deletions(-)
> >>>
> >>> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> >>> index b921123..07dc1aa 100644
> >>> --- a/fs/nfsd/nfs4state.c
> >>> +++ b/fs/nfsd/nfs4state.c
> >>> @@ -4911,19 +4911,20 @@ nfsd4_free_stateid(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
> >>> ret = nfserr_locks_held;
> >>> break;
> >>> case NFS4_LOCK_STID:
> >>> + atomic_inc(&s->sc_count);
> >>> + spin_unlock(&cl->cl_lock);
> >>> + stp = openlockstateid(s);
> >>> + mutex_lock(&stp->st_mutex);
> >>> ret = check_stateid_generation(stateid, &s->sc_stateid, 1);
> >>> if (ret)
> >>> - break;
> >>> - stp = openlockstateid(s);
> >>> + goto out_mutex_unlock;
> >>> ret = nfserr_locks_held;
> >>> if (check_for_locks(stp->st_stid.sc_file,
> >>> lockowner(stp->st_stateowner)))
> >>> - break;
> >>> - WARN_ON(!unhash_lock_stateid(stp));
> >>> - spin_unlock(&cl->cl_lock);
> >>> - nfs4_put_stid(s);
> >>> + goto out_mutex_unlock;
> >>> + release_lock_stateid(stp);
> >>> ret = nfs_ok;
> >>> - goto out;
> >>> + goto out_mutex_unlock;
> >>> case NFS4_REVOKED_DELEG_STID:
> >>> dp = delegstateid(s);
> >>> list_del_init(&dp->dl_recall_lru);
> >>> @@ -4937,6 +4938,10 @@ out_unlock:
> >>> spin_unlock(&cl->cl_lock);
> >>> out:
> >>> return ret;
> >>> +out_mutex_unlock:
> >>> + mutex_unlock(&stp->st_mutex);
> >>> + nfs4_put_stid(s);
> >>> + goto out;
> >>> }
> >>>
> >>> static inline int
> >>>
> >>>
> >>
> >> Looks good to me.
> >>
> >> Reviewed-by: Jeff Layton <[email protected]>
> >
> > Hmm...I think this is not a complete fix though. We also need something
> > like this patch:
>
> OK, I'll create a series and add this patch.
>
>
> > --------------[snip]---------------
> >
> > [PATCH] nfsd: don't return an already-unhashed lock stateid after
> > taking mutex
> >
> > nfsd4_lock will take the st_mutex before working with the stateid it
> > gets, but between the time when we drop the cl_lock and take the mutex,
> > the stateid could become unhashed (a la FREE_STATEID). If that happens
> > the lock stateid returned to the client will be forgotten.
> >
> > Fix this by first moving the st_mutex acquisition into
> > lookup_or_create_lock_state. Then, have it check to see if the lock
> > stateid is still hashed after taking the mutex. If it's not, then put
> > the stateid and try the find/create again.
> >
> > Signed-off-by: Jeff Layton <[email protected]>
> > ---
> > fs/nfsd/nfs4state.c | 25 ++++++++++++++++++++-----
> > 1 file changed, 20 insertions(+), 5 deletions(-)
> >
> > diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> > index 5d6a28af0f42..1235b1661703 100644
> > --- a/fs/nfsd/nfs4state.c
> > +++ b/fs/nfsd/nfs4state.c
> > @@ -5653,7 +5653,7 @@ static __be32
> > lookup_or_create_lock_state(struct nfsd4_compound_state *cstate,
> > struct nfs4_ol_stateid *ost,
> > struct nfsd4_lock *lock,
> > - struct nfs4_ol_stateid **lst, bool *new)
> > + struct nfs4_ol_stateid **plst, bool *new)
> > {
> > __be32 status;
> > struct nfs4_file *fi = ost->st_stid.sc_file;
> > @@ -5661,7 +5661,9 @@ lookup_or_create_lock_state(struct nfsd4_compound_state *cstate,
> > struct nfs4_client *cl = oo->oo_owner.so_client;
> > struct inode *inode = d_inode(cstate->current_fh.fh_dentry);
> > struct nfs4_lockowner *lo;
> > + struct nfs4_ol_stateid *lst;
> > unsigned int strhashval;
> > + bool hashed;
> >
> > lo = find_lockowner_str(cl, &lock->lk_new_owner);
> > if (!lo) {
> > @@ -5677,12 +5679,27 @@ lookup_or_create_lock_state(struct nfsd4_compound_state *cstate,
> > goto out;
> > }
> >
> > - *lst = find_or_create_lock_stateid(lo, fi, inode, ost, new);
> > - if (*lst == NULL) {
> > +retry:
> > + lst = find_or_create_lock_stateid(lo, fi, inode, ost, new);
> > + if (lst == NULL) {
> > status = nfserr_jukebox;
> > goto out;
> > }
> > +
> > + mutex_lock(&lst->st_mutex);
> > +
> > + /* See if it's still hashed to avoid race with FREE_STATEID */
> > + spin_lock(&cl->cl_lock);
> > + hashed = list_empty(&lst->st_perfile);
> > + spin_unlock(&cl->cl_lock);
> > +
> > + if (!hashed) {
> > + mutex_unlock(&lst->st_mutex);
> > + nfs4_put_stid(&lst->st_stid);
> > + goto retry;
> > + }
> > status = nfs_ok;
> > + *plst = lst;
> > out:
> > nfs4_put_stateowner(&lo->lo_owner);
> > return status;
> > @@ -5752,8 +5769,6 @@ nfsd4_lock(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
> > goto out;
> > status = lookup_or_create_lock_state(cstate, open_stp, lock,
> > &lock_stp, &new);
> > - if (status == nfs_ok)
> > - mutex_lock(&lock_stp->st_mutex);
> > } else {
> > status = nfs4_preprocess_seqid_op(cstate,
> > lock->lk_old_lock_seqid,
> > --
> > 2.7.4
>
> --
> Chuck Lever
>
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
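
The ordering that the FREE_STATEID patch quoted above sets up (pin the stateid, drop cl_lock, take st_mutex, and only then check the generation and look for locks) can be modeled in user space. What follows is a minimal sketch under stated assumptions, not kernel code: struct model_stid, the MODEL_* status codes and model_free_stateid() are hypothetical names, a pthread mutex stands in for the cl_lock spinlock, and the generation check is reduced to a plain equality test.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical status codes for the model; not the kernel's nfserr_* values. */
enum model_status { MODEL_OK, MODEL_STALE, MODEL_LOCKS_HELD };

struct model_stid {
	pthread_mutex_t st_mutex; /* plays the role of st_mutex */
	uint32_t generation;      /* current seqid of the stateid */
	int refcount;             /* plays the role of sc_count */
	bool hashed;              /* still reachable by lookups? */
	int locks_held;           /* what check_for_locks() would report */
};

/* Stand-in for cl_lock, which is a spinlock in the kernel. */
static pthread_mutex_t model_cl_lock = PTHREAD_MUTEX_INITIALIZER;

/* Drop one reference (the model never frees the object). */
static void model_stid_put(struct model_stid *s)
{
	pthread_mutex_lock(&model_cl_lock);
	assert(s->refcount > 0);
	s->refcount--;
	pthread_mutex_unlock(&model_cl_lock);
}

enum model_status model_free_stateid(struct model_stid *s, uint32_t client_gen)
{
	enum model_status ret;

	/* Pin the stateid, then release cl_lock before sleeping on the mutex. */
	pthread_mutex_lock(&model_cl_lock);
	s->refcount++;
	pthread_mutex_unlock(&model_cl_lock);

	/* Everything below is serialized against a LOCK holding st_mutex. */
	pthread_mutex_lock(&s->st_mutex);
	if (client_gen != s->generation) {
		ret = MODEL_STALE;            /* simplified generation check */
	} else if (s->locks_held) {
		ret = MODEL_LOCKS_HELD;
	} else {
		pthread_mutex_lock(&model_cl_lock);
		s->hashed = false;            /* unhash the stateid */
		pthread_mutex_unlock(&model_cl_lock);
		model_stid_put(s);            /* drop the hash table's reference */
		ret = MODEL_OK;
	}
	pthread_mutex_unlock(&s->st_mutex);
	model_stid_put(s);                    /* drop the pin taken above */
	return ret;
}
```

In this model a LOCK that holds st_mutex while it bumps the generation and records a lock cannot interleave with the checks, so the free either completes before the LOCK sees the stateid or fails without unhashing it.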

2016-08-08 20:17:10

by Jeff Layton

Subject: Re: [PATCH v2] nfsd: Fix race between FREE_STATEID and LOCK

On Mon, 2016-08-08 at 15:53 -0400, J. Bruce Fields wrote:
> On Mon, Aug 08, 2016 at 12:14:36PM -0400, Chuck Lever wrote:
> >
> >
> > >
> > > On Aug 8, 2016, at 9:19 AM, Jeff Layton <[email protected]>
> > > wrote:
> > >
> > > On Sun, 2016-08-07 at 18:22 -0400, Jeff Layton wrote:
> > > >
> > > > On Sun, 2016-08-07 at 14:53 -0400, Chuck Lever wrote:
> > > > >
> > > > >
> > > > > > When running LTP's nfslock01 test, the Linux client can send a
> > > > > > LOCK and a FREE_STATEID request at the same time. The LOCK uses
> > > > > > the same lockowner as the stateid sent in the FREE_STATEID request.
> > > > >
> > > > > The outcome is:
> > > > >
> > > > > Frame 115025 C FREE_STATEID stateid 2/A
> > > > > Frame 115026 C LOCK offset 672128 len 64
> > > > > Frame 115029 R FREE_STATEID NFS4_OK
> > > > > Frame 115030 R LOCK stateid 3/A
> > >
> > > > Oh, to be clear here -- I assume this is a lk_is_new lock (with an
> > > > open stateid in it). Right?
> >
> >         Opcode: LOCK (12)
> >             locktype: WRITEW_LT (4)
> >             reclaim?: No
> >             offset: 672000
> >             length: 64
> >             new lock owner?: Yes
> >             seqid: 0x00000000
> >             stateid
> >                 [StateID Hash: 0x6f7e]
> >                 seqid: 0x00000002
> >                 Data: a95169579501000007000000
> >             lock_seqid: 0x00000000
> >             Owner
> >                 clientid: 0xa951695795010000
> >                 Data: <DATA>
> >                     length: 20
> >                     contents: <DATA>
> >
> > The first appearance of that stateid is in an earlier OPEN reply:
> >
> >         Opcode: OPEN (18)
> >             Status: NFS4_OK (0)
> >             stateid
> >                 [StateID Hash: 0x6f7e]
> >                 seqid: 0x00000002
> >                 Data: a95169579501000007000000
> >             change_info
> >                 Atomic: No
> >                 changeid (before): 0
> >                 changeid (after): 0
> >             result flags: 0x00000004, locktype posix
> >                 .... .... .... .... .... .... .... ..0. = confirm: False
> >                 .... .... .... .... .... .... .... .1.. = locktype posix: True
> >                 .... .... .... .... .... .... .... 0... = preserve unlinked: False
> >                 .... .... .... .... .... .... ..0. .... = may notify lock: False
> >             Delegation Type: OPEN_DELEGATE_NONE (0)
>
> Oh, the client behavior makes more sense, then.
>
> Still, did we establish for certain that the client isn't required to
> serialize here?
>
> We'd want it fixed either way, but it'd be nice to know.
>
> --b.
>

I don't _think_ it is, since we aren't using a LOCK stateid at this
point. There's really nothing to serialize this against, other than
pending FREE_STATEID calls. I don't think we'd want to serialize LOCK
and FREE_STATEID though, as that would prevent the client from lazily
freeing lock stateids. I think this is probably a better option.

> >
> >
> > >
> > > >
> > > > >
> > > > > Frame 115034 C WRITE stateid 0/A offset 672128 len 64
> > > > > Frame 115038 R WRITE NFS4ERR_BAD_STATEID
> > > > >
> > > > > > In other words, the server returns stateid A in a successful LOCK
> > > > > > reply, but it has already released it. Subsequent uses of the
> > > > > > stateid fail.
> > > > > >
> > > > > > To address this, protect the generation check in nfsd4_free_stateid
> > > > > > with the st_mutex. This should guarantee that only one of two
> > > > > > outcomes occurs: either LOCK returns a fresh valid stateid, or
> > > > > > FREE_STATEID returns NFS4ERR_LOCKS_HELD.
> > > > >
> > > > > Reported-by: Alexey Kodanev <[email protected]>
> > > > > Fix-suggested-by: Jeff Layton <[email protected]>
> > > > > Signed-off-by: Chuck Lever <[email protected]>
> > > > > ---
> > > > > fs/nfsd/nfs4state.c |   19 ++++++++++++-------
> > > > > 1 file changed, 12 insertions(+), 7 deletions(-)
> > > > >
> > > > > diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> > > > > index b921123..07dc1aa 100644
> > > > > --- a/fs/nfsd/nfs4state.c
> > > > > +++ b/fs/nfsd/nfs4state.c
> > > > > @@ -4911,19 +4911,20 @@ nfsd4_free_stateid(struct svc_rqst
> > > > > *rqstp,
> > > > > struct nfsd4_compound_state *cstate,
> > > > > ret = nfserr_locks_held;
> > > > > break;
> > > > > case NFS4_LOCK_STID:
> > > > > + atomic_inc(&s->sc_count);
> > > > > + spin_unlock(&cl->cl_lock);
> > > > > + stp = openlockstateid(s);
> > > > > + mutex_lock(&stp->st_mutex);
> > > > > > ret = check_stateid_generation(stateid, &s->sc_stateid, 1);
> > > > > if (ret)
> > > > > - break;
> > > > > - stp = openlockstateid(s);
> > > > > + goto out_mutex_unlock;
> > > > > ret = nfserr_locks_held;
> > > > > if (check_for_locks(stp->st_stid.sc_file,
> > > > >     lockowner(stp-
> > > > > >
> > > > > > st_stateowner)))
> > > > > - break;
> > > > > - WARN_ON(!unhash_lock_stateid(stp));
> > > > > - spin_unlock(&cl->cl_lock);
> > > > > - nfs4_put_stid(s);
> > > > > + goto out_mutex_unlock;
> > > > > + release_lock_stateid(stp);
> > > > > ret = nfs_ok;
> > > > > - goto out;
> > > > > + goto out_mutex_unlock;
> > > > > case NFS4_REVOKED_DELEG_STID:
> > > > > dp = delegstateid(s);
> > > > > list_del_init(&dp->dl_recall_lru);
> > > > > @@ -4937,6 +4938,10 @@ out_unlock:
> > > > > spin_unlock(&cl->cl_lock);
> > > > > out:
> > > > > return ret;
> > > > > +out_mutex_unlock:
> > > > > + mutex_unlock(&stp->st_mutex);
> > > > > + nfs4_put_stid(s);
> > > > > + goto out;
> > > > > }
> > > > >
> > > > > static inline int
> > > > >
> > > > >
> > > >
> > > > Looks good to me.
> > > >
> > > > Reviewed-by: Jeff Layton <[email protected]>
> > >
> > > > Hmm...I think this is not a complete fix though. We also need
> > > > something like this patch:
> >
> > OK, I'll create a series and add this patch.
> >
> >
> > >
> > > --------------[snip]---------------
> > >
> > > [PATCH] nfsd: don't return an already-unhashed lock stateid after
> > > taking mutex
> > >
> > > > nfsd4_lock will take the st_mutex before working with the stateid it
> > > > gets, but between the time when we drop the cl_lock and take the mutex,
> > > > the stateid could become unhashed (a la FREE_STATEID). If that happens
> > > > the lock stateid returned to the client will be forgotten.
> > > >
> > > > Fix this by first moving the st_mutex acquisition into
> > > > lookup_or_create_lock_state. Then, have it check to see if the lock
> > > > stateid is still hashed after taking the mutex. If it's not, then put
> > > > the stateid and try the find/create again.
> > >
> > > Signed-off-by: Jeff Layton <[email protected]>
> > > ---
> > > fs/nfsd/nfs4state.c | 25 ++++++++++++++++++++-----
> > > 1 file changed, 20 insertions(+), 5 deletions(-)
> > >
> > > diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> > > index 5d6a28af0f42..1235b1661703 100644
> > > --- a/fs/nfsd/nfs4state.c
> > > +++ b/fs/nfsd/nfs4state.c
> > > @@ -5653,7 +5653,7 @@ static __be32
> > > lookup_or_create_lock_state(struct nfsd4_compound_state *cstate,
> > >     struct nfs4_ol_stateid *ost,
> > >     struct nfsd4_lock *lock,
> > > -     struct nfs4_ol_stateid **lst, bool
> > > *new)
> > > +     struct nfs4_ol_stateid **plst, bool
> > > *new)
> > > {
> > > __be32 status;
> > > struct nfs4_file *fi = ost->st_stid.sc_file;
> > > @@ -5661,7 +5661,9 @@ lookup_or_create_lock_state(struct
> > > nfsd4_compound_state *cstate,
> > > struct nfs4_client *cl = oo->oo_owner.so_client;
> > > struct inode *inode = d_inode(cstate->current_fh.fh_dentry);
> > > struct nfs4_lockowner *lo;
> > > + struct nfs4_ol_stateid *lst;
> > > unsigned int strhashval;
> > > + bool hashed;
> > >
> > > lo = find_lockowner_str(cl, &lock->lk_new_owner);
> > > if (!lo) {
> > > @@ -5677,12 +5679,27 @@ lookup_or_create_lock_state(struct
> > > nfsd4_compound_state *cstate,
> > > goto out;
> > > }
> > >
> > > > - *lst = find_or_create_lock_stateid(lo, fi, inode, ost, new);
> > > > - if (*lst == NULL) {
> > > > +retry:
> > > > + lst = find_or_create_lock_stateid(lo, fi, inode, ost, new);
> > > + if (lst == NULL) {
> > > status = nfserr_jukebox;
> > > goto out;
> > > }
> > > +
> > > + mutex_lock(&lst->st_mutex);
> > > +
> > > > + /* See if it's still hashed to avoid race with FREE_STATEID */
> > > + spin_lock(&cl->cl_lock);
> > > + hashed = list_empty(&lst->st_perfile);
> > > + spin_unlock(&cl->cl_lock);
> > > +
> > > + if (!hashed) {
> > > + mutex_unlock(&lst->st_mutex);
> > > + nfs4_put_stid(&lst->st_stid);
> > > + goto retry;
> > > + }
> > > status = nfs_ok;
> > > + *plst = lst;
> > > out:
> > > nfs4_put_stateowner(&lo->lo_owner);
> > > return status;
> > > @@ -5752,8 +5769,6 @@ nfsd4_lock(struct svc_rqst *rqstp, struct
> > > nfsd4_compound_state *cstate,
> > > goto out;
> > > status = lookup_or_create_lock_state(cstate, open_stp,
> > > lock,
> > > &lock_stp,
> > > &new);
> > > - if (status == nfs_ok)
> > > - mutex_lock(&lock_stp->st_mutex);
> > > } else {
> > > status = nfs4_preprocess_seqid_op(cstate,
> > >        lock->lk_old_lock_seqid,
> > > --
> > > 2.7.4
> >
> > --
> > Chuck Lever
> >
> >
> >
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-
> > nfs" in
> > the body of a message to [email protected]
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html

--
Jeff Layton <[email protected]>
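
The LOCK-side half of the fix discussed in this thread (look the lock stateid up, take st_mutex, re-check that it is still hashed, and retry if a FREE_STATEID won the window) can be sketched in user space as well. This is an illustration under stated assumptions, not the patch itself: struct model_lst, struct model_table and the model_* functions are hypothetical, a pthread mutex stands in for the cl_lock spinlock, a single slot stands in for the per-file list, and the simulated_races parameter exists only to force the race deterministically. For the retry to fire on an unhashed stateid, "hashed" has to mean that the entry is still on its list (the list is not empty); the model tracks that with a plain boolean.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

struct model_lst {
	pthread_mutex_t st_mutex; /* plays the role of st_mutex */
	bool hashed;              /* still reachable through the table? */
	int refcount;
};

struct model_table {
	pthread_mutex_t cl_lock;  /* stand-in for the cl_lock spinlock */
	struct model_lst *slot;   /* one-entry "hash table" */
};

/* Drop a reference; free the stateid when the last one goes away. */
static void model_lst_put(struct model_table *t, struct model_lst *lst)
{
	bool last;

	pthread_mutex_lock(&t->cl_lock);
	assert(lst->refcount > 0);
	last = (--lst->refcount == 0);
	pthread_mutex_unlock(&t->cl_lock);
	if (last) {
		pthread_mutex_destroy(&lst->st_mutex);
		free(lst);
	}
}

/* What FREE_STATEID does to the table: unhash and drop its reference. */
static void model_unhash(struct model_table *t, struct model_lst *lst)
{
	bool was_hashed;

	pthread_mutex_lock(&t->cl_lock);
	was_hashed = lst->hashed;
	if (was_hashed) {
		lst->hashed = false;
		t->slot = NULL;
	}
	pthread_mutex_unlock(&t->cl_lock);
	if (was_hashed)
		model_lst_put(t, lst);
}

/* Find the hashed entry or create one; returns with a caller reference. */
static struct model_lst *model_find_or_create(struct model_table *t)
{
	struct model_lst *lst;

	pthread_mutex_lock(&t->cl_lock);
	lst = t->slot;
	if (lst == NULL) {
		lst = calloc(1, sizeof(*lst));
		if (lst != NULL) {
			pthread_mutex_init(&lst->st_mutex, NULL);
			lst->hashed = true;
			lst->refcount = 1;  /* the table's reference */
			t->slot = lst;
		}
	}
	if (lst != NULL)
		lst->refcount++;            /* the caller's reference */
	pthread_mutex_unlock(&t->cl_lock);
	return lst;
}

/* Returns a hashed stateid with st_mutex held, or NULL on allocation failure. */
struct model_lst *model_lookup_locked(struct model_table *t,
				      int simulated_races, int *attempts)
{
	struct model_lst *lst;
	bool hashed;

	*attempts = 0;
	for (;;) {
		(*attempts)++;
		lst = model_find_or_create(t);
		if (lst == NULL)
			return NULL;
		/* Test knob: let a FREE_STATEID win the unlocked window. */
		if (simulated_races > 0) {
			simulated_races--;
			model_unhash(t, lst);
		}
		pthread_mutex_lock(&lst->st_mutex);
		pthread_mutex_lock(&t->cl_lock);
		hashed = lst->hashed;       /* re-check under both locks */
		pthread_mutex_unlock(&t->cl_lock);
		if (hashed)
			return lst;
		pthread_mutex_unlock(&lst->st_mutex);
		model_lst_put(t, lst);      /* stale entry: drop it and retry */
	}
}
```

With simulated_races set to 1, the first pass finds its entry unhashed after taking the mutex, puts it, and the second pass creates and returns a fresh one, so the caller never hands out a stateid that has already been released.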