2014-09-17 14:50:52

by Steve Dickson

Subject: [PATCH] NFSv4: nfs4_state_manager() vs. nfs_server_remove_lists()

There is a race between nfs4_state_manager() and
nfs_server_remove_lists() that happens during an NFSv3 mount.

The v3 mount notices there is already a superblock, so
nfs_server_remove_lists() is called, which uses the nfs_client_lock
spin lock to synchronize access to the client list.

At the same time nfs4_state_manager() is running through
the client list looking for work to do, using the same
lock. When nfs4_state_manager() wins the race to the
list, a v3 client pointer is found and not ignored
properly which causes the panic.

Moving some protocol checks before the state checking
avoids the panic.

CC: Stable Tree <[email protected]>
Signed-off-by: Steve Dickson <[email protected]>
---
fs/nfs/nfs4client.c | 19 ++++++++++---------
1 file changed, 10 insertions(+), 9 deletions(-)

diff --git a/fs/nfs/nfs4client.c b/fs/nfs/nfs4client.c
index 53e435a..7ff4c02 100644
--- a/fs/nfs/nfs4client.c
+++ b/fs/nfs/nfs4client.c
@@ -622,6 +622,16 @@ int nfs41_walk_client_list(struct nfs_client *new,

 	spin_lock(&nn->nfs_client_lock);
 	list_for_each_entry(pos, &nn->nfs_client_list, cl_share_link) {
+
+		if (pos->rpc_ops != new->rpc_ops)
+			continue;
+
+		if (pos->cl_proto != new->cl_proto)
+			continue;
+
+		if (pos->cl_minorversion != new->cl_minorversion)
+			continue;
+
 		/* If "pos" isn't marked ready, we can't trust the
 		 * remaining fields in "pos", especially the client
 		 * ID and serverowner fields. Wait for CREATE_SESSION
@@ -647,15 +657,6 @@ int nfs41_walk_client_list(struct nfs_client *new,
 		if (pos->cl_cons_state != NFS_CS_READY)
 			continue;

-		if (pos->rpc_ops != new->rpc_ops)
-			continue;
-
-		if (pos->cl_proto != new->cl_proto)
-			continue;
-
-		if (pos->cl_minorversion != new->cl_minorversion)
-			continue;
-
 		if (!nfs4_match_clientids(pos, new))
 			continue;

--
1.8.3.1
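
For readers following the thread, the reordering described in the patch can be modelled in user space. This is only a sketch under stated assumptions, not the kernel code: `struct client`, `struct rpc_ops`, `CS_READY`/`CS_INITING`, `walk()` and `demo()` are simplified stand-ins invented for illustration, the protocol value 6 is arbitrary, and where the real function waits on a client that is still initializing, this model simply skips it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
enum cons_state { CS_READY, CS_INITING };

struct rpc_ops { int version; };

struct client {
	const struct rpc_ops *rpc_ops;	/* identifies the NFS version in use */
	int proto;			/* arbitrary transport id in this model */
	unsigned int minorversion;
	enum cons_state cons_state;
	struct client *next;
};

/*
 * Return the first entry that is compatible with "new" and ready.
 * The protocol comparisons come first, so an entry of another NFS
 * version is skipped before its state is looked at. *state_checks
 * counts how many entries got as far as the state check.
 */
static struct client *walk(struct client *head, const struct client *new,
			   int *state_checks)
{
	struct client *pos;

	for (pos = head; pos != NULL; pos = pos->next) {
		if (pos->rpc_ops != new->rpc_ops)
			continue;
		if (pos->proto != new->proto)
			continue;
		if (pos->minorversion != new->minorversion)
			continue;

		(*state_checks)++;
		/* Simplification: the kernel waits here; the model skips. */
		if (pos->cons_state != CS_READY)
			continue;
		return pos;
	}
	return NULL;
}

/* A v3 client that is still initializing sits ahead of a ready v4.1 one. */
static int demo(void)
{
	struct rpc_ops v3_ops = { 3 }, v4_ops = { 4 };
	struct client v41 = { &v4_ops, 6, 1, CS_READY, NULL };
	struct client v3 = { &v3_ops, 6, 0, CS_INITING, &v41 };
	struct client new = { &v4_ops, 6, 1, CS_INITING, NULL };
	int state_checks = 0;

	assert(walk(&v3, &new, &state_checks) == &v41);
	return state_checks;
}
```

Calling `demo()` from a small `main()` returns how many entries had their state inspected; in this model the v3 entry is filtered out before that point.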



2014-09-18 13:13:58

by Steve Dickson

Subject: Re: [PATCH] NFSv4: nfs4_state_manager() vs. nfs_server_remove_lists()



On 09/17/2014 10:55 AM, Trond Myklebust wrote:
> On Wed, Sep 17, 2014 at 10:50 AM, Steve Dickson <[email protected]> wrote:
>> There is a race between nfs4_state_manager() and
>> nfs_server_remove_lists() that happens during an NFSv3 mount.
>>
>> The v3 mount notices there is already a superblock, so
>> nfs_server_remove_lists() is called, which uses the nfs_client_lock
>> spin lock to synchronize access to the client list.
>>
>> At the same time nfs4_state_manager() is running through
>> the client list looking for work to do, using the same
>> lock. When nfs4_state_manager() wins the race to the
>> list, a v3 client pointer is found and not ignored
>> properly which causes the panic.
>>
>> Moving some protocol checks before the state checking
>> avoids the panic.
>>
>> CC: Stable Tree <[email protected]>
>> Signed-off-by: Steve Dickson <[email protected]>
>> ---
>> fs/nfs/nfs4client.c | 19 ++++++++++---------
>> 1 file changed, 10 insertions(+), 9 deletions(-)
>>
>> diff --git a/fs/nfs/nfs4client.c b/fs/nfs/nfs4client.c
>> index 53e435a..7ff4c02 100644
>> --- a/fs/nfs/nfs4client.c
>> +++ b/fs/nfs/nfs4client.c
>> @@ -622,6 +622,16 @@ int nfs41_walk_client_list(struct nfs_client *new,
>>
>> spin_lock(&nn->nfs_client_lock);
>> list_for_each_entry(pos, &nn->nfs_client_list, cl_share_link) {
>> +
>> + if (pos->rpc_ops != new->rpc_ops)
>> + continue;
>> +
>> + if (pos->cl_proto != new->cl_proto)
>> + continue;
>> +
>> + if (pos->cl_minorversion != new->cl_minorversion)
>> + continue;
>> +
>> /* If "pos" isn't marked ready, we can't trust the
>> * remaining fields in "pos", especially the client
>> * ID and serverowner fields. Wait for CREATE_SESSION
>> @@ -647,15 +657,6 @@ int nfs41_walk_client_list(struct nfs_client *new,
>> if (pos->cl_cons_state != NFS_CS_READY)
>> continue;
>>
>> - if (pos->rpc_ops != new->rpc_ops)
>> - continue;
>> -
>> - if (pos->cl_proto != new->cl_proto)
>> - continue;
>> -
>> - if (pos->cl_minorversion != new->cl_minorversion)
>> - continue;
>> -
>> if (!nfs4_match_clientids(pos, new))
>> continue;
>>
>> --
>> 1.8.3.1
>>
>
> Don't we need the same fix in nfs40_walk_client_list?
Yes... Just posted version 2...

steved.

>
> Cheers
> Trond
>

2014-09-18 13:17:45

by Steve Dickson

Subject: Re: [PATCH] NFSv4: nfs4_state_manager() vs. nfs_server_remove_lists()



On 09/17/2014 11:00 AM, Anna Schumaker wrote:
> On 09/17/2014 10:55 AM, Trond Myklebust wrote:
>> On Wed, Sep 17, 2014 at 10:50 AM, Steve Dickson <[email protected]> wrote:
>>> There is a race between nfs4_state_manager() and
>>> nfs_server_remove_lists() that happens during an NFSv3 mount.
>>>
>>> The v3 mount notices there is already a superblock, so
>>> nfs_server_remove_lists() is called, which uses the nfs_client_lock
>>> spin lock to synchronize access to the client list.
>>>
>>> At the same time nfs4_state_manager() is running through
>>> the client list looking for work to do, using the same
>>> lock. When nfs4_state_manager() wins the race to the
>>> list, a v3 client pointer is found and not ignored
>>> properly which causes the panic.
>>>
>>> Moving some protocol checks before the state checking
>>> avoids the panic.
>>>
>>> CC: Stable Tree <[email protected]>
>>> Signed-off-by: Steve Dickson <[email protected]>
>>> ---
>>> fs/nfs/nfs4client.c | 19 ++++++++++---------
>>> 1 file changed, 10 insertions(+), 9 deletions(-)
>>>
>>> diff --git a/fs/nfs/nfs4client.c b/fs/nfs/nfs4client.c
>>> index 53e435a..7ff4c02 100644
>>> --- a/fs/nfs/nfs4client.c
>>> +++ b/fs/nfs/nfs4client.c
>>> @@ -622,6 +622,16 @@ int nfs41_walk_client_list(struct nfs_client *new,
>>>
>>> spin_lock(&nn->nfs_client_lock);
>>> list_for_each_entry(pos, &nn->nfs_client_list, cl_share_link) {
>>> +
>>> + if (pos->rpc_ops != new->rpc_ops)
>>> + continue;
>>> +
>>> + if (pos->cl_proto != new->cl_proto)
>>> + continue;
>>> +
>>> + if (pos->cl_minorversion != new->cl_minorversion)
>>> + continue;
>>> +
>>> /* If "pos" isn't marked ready, we can't trust the
>>> * remaining fields in "pos", especially the client
>>> * ID and serverowner fields. Wait for CREATE_SESSION
>>> @@ -647,15 +657,6 @@ int nfs41_walk_client_list(struct nfs_client *new,
>>> if (pos->cl_cons_state != NFS_CS_READY)
>>> continue;
>>>
>>> - if (pos->rpc_ops != new->rpc_ops)
>>> - continue;
>>> -
>>> - if (pos->cl_proto != new->cl_proto)
>>> - continue;
>>> -
>>> - if (pos->cl_minorversion != new->cl_minorversion)
>>> - continue;
>>> -
>>> if (!nfs4_match_clientids(pos, new))
>>> continue;
>>>
>>> --
>>> 1.8.3.1
>>>
>> Don't we need the same fix in nfs40_walk_client_list?
> Bonus points for finding a way to merge these functions, since they do similar comparisons in the beginning :)
I did take a look at merging these functions... The starts
of the functions are similar, but they do differ enough after
the state check to keep them separate... IMHO...

steved.
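
To illustrate the point about the shared beginning, the three protocol comparisons could in principle live in one predicate while each walker keeps its own tail. The following is a user-space sketch only: `same_flavor()`, `match40()`, `match41()`, `demo()` and the fields of `struct client` are invented for this example, and the tail comparisons are loose placeholders for the logic that differs between the two kernel functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical, simplified stand-in for struct nfs_client. */
struct client {
	const void *rpc_ops;
	int proto;
	unsigned int minorversion;
	unsigned long long clientid;
	const char *serverowner;	/* only used by the 4.1-style tail */
};

/* The three comparisons both walkers start with (invented helper). */
static bool same_flavor(const struct client *pos, const struct client *new)
{
	return pos->rpc_ops == new->rpc_ops &&
	       pos->proto == new->proto &&
	       pos->minorversion == new->minorversion;
}

/* 4.0-style tail: placeholder comparison on the client ID only. */
static bool match40(const struct client *pos, const struct client *new)
{
	return same_flavor(pos, new) && pos->clientid == new->clientid;
}

/* 4.1-style tail: placeholder that additionally compares server owners. */
static bool match41(const struct client *pos, const struct client *new)
{
	return same_flavor(pos, new) && pos->clientid == new->clientid &&
	       strcmp(pos->serverowner, new->serverowner) == 0;
}

/* Exercise both tails against matching and non-matching entries. */
static void demo(void)
{
	int ops4, ops3;	/* only their addresses matter */
	struct client new = { &ops4, 6, 1, 42ULL, "owner-a" };
	struct client same = { &ops4, 6, 1, 42ULL, "owner-a" };
	struct client other_owner = { &ops4, 6, 1, 42ULL, "owner-b" };
	struct client v3 = { &ops3, 6, 0, 42ULL, "owner-a" };

	assert(match40(&same, &new) && match41(&same, &new));
	assert(match40(&other_owner, &new) && !match41(&other_owner, &new));
	assert(!match40(&v3, &new) && !match41(&v3, &new));
}
```

Whether such a split is worthwhile in the kernel depends on how much the two functions diverge after the state check, which is the concern raised above.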


2014-09-17 15:00:32

by Anna Schumaker

Subject: Re: [PATCH] NFSv4: nfs4_state_manager() vs. nfs_server_remove_lists()

On 09/17/2014 10:55 AM, Trond Myklebust wrote:
> On Wed, Sep 17, 2014 at 10:50 AM, Steve Dickson <[email protected]> wrote:
>> There is a race between nfs4_state_manager() and
>> nfs_server_remove_lists() that happens during an NFSv3 mount.
>>
>> The v3 mount notices there is already a superblock, so
>> nfs_server_remove_lists() is called, which uses the nfs_client_lock
>> spin lock to synchronize access to the client list.
>>
>> At the same time nfs4_state_manager() is running through
>> the client list looking for work to do, using the same
>> lock. When nfs4_state_manager() wins the race to the
>> list, a v3 client pointer is found and not ignored
>> properly which causes the panic.
>>
>> Moving some protocol checks before the state checking
>> avoids the panic.
>>
>> CC: Stable Tree <[email protected]>
>> Signed-off-by: Steve Dickson <[email protected]>
>> ---
>> fs/nfs/nfs4client.c | 19 ++++++++++---------
>> 1 file changed, 10 insertions(+), 9 deletions(-)
>>
>> diff --git a/fs/nfs/nfs4client.c b/fs/nfs/nfs4client.c
>> index 53e435a..7ff4c02 100644
>> --- a/fs/nfs/nfs4client.c
>> +++ b/fs/nfs/nfs4client.c
>> @@ -622,6 +622,16 @@ int nfs41_walk_client_list(struct nfs_client *new,
>>
>> spin_lock(&nn->nfs_client_lock);
>> list_for_each_entry(pos, &nn->nfs_client_list, cl_share_link) {
>> +
>> + if (pos->rpc_ops != new->rpc_ops)
>> + continue;
>> +
>> + if (pos->cl_proto != new->cl_proto)
>> + continue;
>> +
>> + if (pos->cl_minorversion != new->cl_minorversion)
>> + continue;
>> +
>> /* If "pos" isn't marked ready, we can't trust the
>> * remaining fields in "pos", especially the client
>> * ID and serverowner fields. Wait for CREATE_SESSION
>> @@ -647,15 +657,6 @@ int nfs41_walk_client_list(struct nfs_client *new,
>> if (pos->cl_cons_state != NFS_CS_READY)
>> continue;
>>
>> - if (pos->rpc_ops != new->rpc_ops)
>> - continue;
>> -
>> - if (pos->cl_proto != new->cl_proto)
>> - continue;
>> -
>> - if (pos->cl_minorversion != new->cl_minorversion)
>> - continue;
>> -
>> if (!nfs4_match_clientids(pos, new))
>> continue;
>>
>> --
>> 1.8.3.1
>>
> Don't we need the same fix in nfs40_walk_client_list?
Bonus points for finding a way to merge these functions, since they do similar comparisons in the beginning :)

Anna

>
> Cheers
> Trond
>


2014-09-17 14:55:13

by Trond Myklebust

Subject: Re: [PATCH] NFSv4: nfs4_state_manager() vs. nfs_server_remove_lists()

On Wed, Sep 17, 2014 at 10:50 AM, Steve Dickson <[email protected]> wrote:
> There is a race between nfs4_state_manager() and
> nfs_server_remove_lists() that happens during an NFSv3 mount.
>
> The v3 mount notices there is already a superblock, so
> nfs_server_remove_lists() is called, which uses the nfs_client_lock
> spin lock to synchronize access to the client list.
>
> At the same time nfs4_state_manager() is running through
> the client list looking for work to do, using the same
> lock. When nfs4_state_manager() wins the race to the
> list, a v3 client pointer is found and not ignored
> properly which causes the panic.
>
> Moving some protocol checks before the state checking
> avoids the panic.
>
> CC: Stable Tree <[email protected]>
> Signed-off-by: Steve Dickson <[email protected]>
> ---
> fs/nfs/nfs4client.c | 19 ++++++++++---------
> 1 file changed, 10 insertions(+), 9 deletions(-)
>
> diff --git a/fs/nfs/nfs4client.c b/fs/nfs/nfs4client.c
> index 53e435a..7ff4c02 100644
> --- a/fs/nfs/nfs4client.c
> +++ b/fs/nfs/nfs4client.c
> @@ -622,6 +622,16 @@ int nfs41_walk_client_list(struct nfs_client *new,
>
> spin_lock(&nn->nfs_client_lock);
> list_for_each_entry(pos, &nn->nfs_client_list, cl_share_link) {
> +
> + if (pos->rpc_ops != new->rpc_ops)
> + continue;
> +
> + if (pos->cl_proto != new->cl_proto)
> + continue;
> +
> + if (pos->cl_minorversion != new->cl_minorversion)
> + continue;
> +
> /* If "pos" isn't marked ready, we can't trust the
> * remaining fields in "pos", especially the client
> * ID and serverowner fields. Wait for CREATE_SESSION
> @@ -647,15 +657,6 @@ int nfs41_walk_client_list(struct nfs_client *new,
> if (pos->cl_cons_state != NFS_CS_READY)
> continue;
>
> - if (pos->rpc_ops != new->rpc_ops)
> - continue;
> -
> - if (pos->cl_proto != new->cl_proto)
> - continue;
> -
> - if (pos->cl_minorversion != new->cl_minorversion)
> - continue;
> -
> if (!nfs4_match_clientids(pos, new))
> continue;
>
> --
> 1.8.3.1
>

Don't we need the same fix in nfs40_walk_client_list?

Cheers
Trond

--
Trond Myklebust

Linux NFS client maintainer, PrimaryData

[email protected]