2014-07-29 18:39:34

by Steve Dickson

Subject: nfs4_state_manager() vs. nfs_server_remove_lists()

Hello,

I've been seeing a panic where nfs4_state_manager()
ends up processing a v3 NFS client pointer.

The panic happens at the top of nfs4_state_manager()
because clp->cl_mvops == NULL;

Looking at the pointer (via crash) it becomes obvious
it is a v3 client pointer (AKA rpc_ops = nfs_v3_clientops).

Now the reason we are in the state manager code is an NFSv4
mount doing server discovery, so it is walking the client list
in nfs41_walk_client_list().

Now looking at the entire stack with crash, the
only time that v3 client pointer appears is after
nfs41_walk_client_list() has been called so I'm 99%
sure the pointer is coming from the cl_share_link list.

So the question is: how did that v3 client pointer end up on that
list in a non NFS_CS_READY state?

Well, simultaneously a v3 mount is happening. In nfs_fs_mount_common()
it notices there is already an existing super block, so it decides to
free its server pointer, and nfs_server_remove_lists() is called.

What nfs_server_remove_lists() and nfs41_walk_client_list()
have in common is the nfs_client_lock spin lock.

Also the client pointer in the server pointer being freed is
in a non NFS_CS_READY state

To answer the question, the v3 client pointer, in a non
NFS_CS_READY state, is found by nfs41_walk_client_list()
because it beat nfs_server_remove_lists() to the
nfs_client_lock spin lock.

nfs41_walk_client_list() finds the uninitialized client
pointer that nfs_server_remove_lists() is trying to free,
processes it, and then falls over...

Note this was very hard to reproduce since a very large client
(many cores) is needed and a very fast server and a few
hours...

Question, since both v3 and v4 clients are on the cl_share_link
list should there be a check in nfs41_walk_client_list() to
process only v4 clients?

steved.



2014-07-29 19:52:16

by Trond Myklebust

Subject: Re: nfs4_state_manager() vs. nfs_server_remove_lists()

On Tue, Jul 29, 2014 at 2:39 PM, Steve Dickson <[email protected]> wrote:
> Question, since both v3 and v4 clients are on the cl_share_link
> list should there be a check in nfs41_walk_client_list() to
> process only v4 clients?
>

Hi Steve,

Let's just move up the test for "pos->rpc_ops != new->rpc_ops",
"pos->cl_minorversion != new->cl_minorversion" and "pos->cl_proto !=
new->cl_proto" so that they all happen before we try to test the value
of cl_cons_state.
As far as I can tell, all those values are guaranteed to be set as
part of the struct nfs_client allocators, before we ever put the
result on the cl_share_link list.
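
A minimal userspace sketch of that ordering, for illustration only. The struct, the MODEL_CS_* values, and every model_* name are simplified stand-ins assumed for this example; this is not the kernel code and not the eventual patch. It shows that once the rpc_ops, cl_proto and cl_minorversion tests run first, a v3 client that is still initialising never reaches the code that depends on cl_mvops.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel definitions; values are assumptions. */
enum { MODEL_CS_READY = 0, MODEL_CS_INITING = 1 };

struct model_client {
	const void *rpc_ops;          /* identifies the NFS version, set at allocation */
	int cl_proto;                 /* transport protocol, set at allocation */
	unsigned int cl_minorversion; /* set at allocation */
	int cl_cons_state;            /* > READY while still initialising */
	const void *cl_mvops;         /* NULL on a v3 client */
	struct model_client *next;    /* stand-in for the cl_share_link list */
};

int model_state_manager_runs;

/* Stand-in for the state manager: it requires a valid cl_mvops. */
void model_state_manager(struct model_client *clp)
{
	assert(clp->cl_mvops != NULL);
	model_state_manager_runs++;
	clp->cl_cons_state = MODEL_CS_READY; /* pretend recovery succeeded */
}

/* Return the first ready client compatible with 'new', or NULL. */
struct model_client *model_walk_client_list(struct model_client *head,
					    const struct model_client *new)
{
	struct model_client *pos;

	for (pos = head; pos != NULL; pos = pos->next) {
		if (pos == new)
			continue;
		/* Identity tests first: these fields are valid from allocation. */
		if (pos->rpc_ops != new->rpc_ops)
			continue;
		if (pos->cl_proto != new->cl_proto)
			continue;
		if (pos->cl_minorversion != new->cl_minorversion)
			continue;
		/* Only a matching v4.x client reaches the state tests. */
		if (pos->cl_cons_state > MODEL_CS_READY)
			model_state_manager(pos);
		if (pos->cl_cons_state != MODEL_CS_READY)
			continue;
		return pos;
	}
	return NULL;
}
```

With a list holding an initialising v3 client followed by a ready v4.1 client, the walk skips the v3 entry on the rpc_ops test and returns the v4.1 entry without ever calling the state manager stand-in.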

Cheers
Trond

--
Trond Myklebust

Linux NFS client maintainer, PrimaryData

[email protected]

2014-07-29 21:58:05

by Trond Myklebust

Subject: Re: nfs4_state_manager() vs. nfs_server_remove_lists()

On Tue, Jul 29, 2014 at 4:40 PM, Steve Dickson <[email protected]> wrote:
> On 29/07/14 15:52, Trond Myklebust wrote:
>> Let's just move up the test for "pos->rpc_ops != new->rpc_ops",
>> "pos->cl_minorversion != new->cl_minorversion" and "pos->cl_proto !=
>> new->cl_proto" so that they all happen before we try to test the value
>> of cl_cons_state.
>> As far as I can tell, all those values are guaranteed to be set as
>> part of the struct nfs_client allocators, before we ever put the
>> result on the cl_share_link list.
>
> The check for
>
>	if (pos->cl_cons_state > NFS_CS_READY)
>
> then right after that check is:
>
>	if (pos->cl_cons_state != NFS_CS_READY)
>		continue;
>
> confuses me... Is the second check even needed?
>
> steved.

Yes. The result of the lease_recovery could be that the nfs_client is
left in a state of error if, say, we get a NFS4ERR_CLID_INUSE beastie.
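
A small userspace model of why both tests matter, offered as a sketch under assumptions. The demo_* names, the DEMO_CS_* values, and the negative error value are hypothetical and are not taken from the kernel. The first test triggers a wait for recovery; the second catches the case where that recovery left the client in an error (negative) state.

```c
#include <assert.h>

/* Assumed stand-in values: READY is 0, initialising states are positive,
 * error states are negative. */
enum { DEMO_CS_READY = 0, DEMO_CS_INITING = 1 };

struct demo_client {
	int cl_cons_state;   /* current construction state */
	int recovery_result; /* hypothetical: state that recovery will leave behind */
};

/* Hypothetical stand-in for scheduling lease recovery and waiting for it. */
void demo_wait_for_recovery(struct demo_client *clp)
{
	clp->cl_cons_state = clp->recovery_result;
}

/* Return 1 if the client can be used, 0 if it must be skipped. */
int demo_client_usable(struct demo_client *pos)
{
	/* First test: still initialising, so wait for recovery to finish. */
	if (pos->cl_cons_state > DEMO_CS_READY)
		demo_wait_for_recovery(pos);
	/* Second test: recovery may have failed and left an error state. */
	if (pos->cl_cons_state != DEMO_CS_READY)
		return 0;
	return 1;
}
```

A client whose recovery succeeds passes both tests; one whose recovery fails ends up with a negative state and is rejected only by the second test.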

Cheers
Trond


2014-07-29 20:40:37

by Steve Dickson

Subject: Re: nfs4_state_manager() vs. nfs_server_remove_lists()

On 29/07/14 15:52, Trond Myklebust wrote:
> Let's just move up the test for "pos->rpc_ops != new->rpc_ops",
> "pos->cl_minorversion != new->cl_minorversion" and "pos->cl_proto !=
> new->cl_proto" so that they all happen before we try to test the value
> of cl_cons_state.
> As far as I can tell, all those values are guaranteed to be set as
> part of the struct nfs_client allocators, before we ever put the
> result on the cl_share_link list.

The check for

	if (pos->cl_cons_state > NFS_CS_READY)

then right after that check is:

	if (pos->cl_cons_state != NFS_CS_READY)
		continue;

confuses me... Is the second check even needed?

steved.

2014-09-17 12:59:31

by Steve Dickson

Subject: Re: nfs4_state_manager() vs. nfs_server_remove_lists()

Hello,

On 09/16/2014 03:28 PM, Trond Myklebust wrote:
> On Tue, Sep 16, 2014 at 2:51 PM, Fred Isaman <[email protected]> wrote:
>> > Was a patch ever submitted for this? I'm seeing something similar but can't
>> > find the fix upstream.
> As far as I can tell, the answer is no. If you are seeing the bug,
> then could you please post the discussed fix?

It looks like I dropped the ball... I too am seeing this problem
and cannot find the posted patch... Working on it....

steved.

2014-09-16 19:28:36

by Trond Myklebust

Subject: Re: nfs4_state_manager() vs. nfs_server_remove_lists()

On Tue, Sep 16, 2014 at 2:51 PM, Fred Isaman <[email protected]> wrote:
> Was a patch ever submitted for this? I'm seeing something similar but can't
> find the fix upstream.

As far as I can tell, the answer is no. If you are seeing the bug,
then could you please post the discussed fix?

Cheers
Trond
