2008-05-20 00:48:47

by Peter Leckie

Subject: Re: [PATCH 03/04] NFS/RDMA client stall patches

Talpey, Thomas wrote:
> At 09:31 PM 5/18/2008, Peter Leckie wrote:
>
>> This patch changes rpcrdma_conn_func() to directly call
>> xprt_disconnect() instead of directly waking the pending
>> task queue.
>>
>
> Does this fix some issue, or is it simply more efficient to initiate the
> disconnect immediately? Because the conn_func is called directly
> from the RDMA provider's connection upcall, it may be entered in
> an arbitrary context, so xprt_disconnect() could be a bit heavyweight
> and lead to deadlock. Have you satisfied yourself this is not the
> case?
>
Well, xprt_disconnect_done() is no more heavyweight than the
rpcrdma_conn_func() equivalent; it simply clears the connected bit and
labels the queues as not connected. The reason for calling
xprt_disconnect_done() is to have a single disconnect function.

The reason this change is needed is to allow the send and resend queues
to be drained on disconnect. Patch 02 could have been changed to also
drain these queues from rpcrdma_conn_func(); however, I think this is
a cleaner fix.
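
To make the intent concrete, here is a minimal userspace sketch of the
single-disconnect-path idea described above. It is not the kernel code:
struct model_xprt and the model_* functions are hypothetical stand-ins,
locking is omitted, and waking waiters with -ENOTCONN on disconnect is an
assumption of the model.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in; NOT the kernel's struct rpc_xprt. */
struct model_xprt {
	bool connected;        /* models the transport's "connected" bit */
	int  pending;          /* tasks currently sleeping on the pending queue */
	int  last_wake_status; /* status passed to the most recent wake-up */
	int  wakeups;          /* number of times the pending queue was woken */
};

/* Wake every pending task, handing each the given status. */
static void model_wake_pending(struct model_xprt *x, int status)
{
	x->last_wake_status = status;
	x->wakeups++;
	x->pending = 0;
}

/*
 * The one disconnect path: clear the connected bit and fail the waiters.
 * Assumption: waiters are woken with -ENOTCONN. Real code would hold the
 * transport lock around both steps; the model leaves locking out.
 */
static void model_disconnect_done(struct model_xprt *x)
{
	assert(x != NULL);
	x->connected = false;
	model_wake_pending(x, -ENOTCONN);
}

/*
 * Connection-event callback. rep_connected > 0 means the endpoint came up;
 * anything else is routed to the single disconnect function.
 */
static void model_conn_func(struct model_xprt *x, int rep_connected)
{
	assert(x != NULL);
	if (rep_connected > 0) {
		/* Wake waiters only on the not-connected -> connected edge. */
		if (!x->connected) {
			x->connected = true;
			model_wake_pending(x, 0);
		}
	} else {
		model_disconnect_done(x);
	}
}
```

In this model every event with rep_connected <= 0 funnels through
model_disconnect_done(), so any extra teardown work (such as draining
further queues) would only have to be added in one place.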

Thanks
Pete

>
>> Signed-off-by: Peter Leckie <pleckie-cP1dWloDopni96+mSzHFpQC/[email protected]>
>> Reviewed-by: Greg Banks <gnb-cP1dWloDopni96+mSzHFpQC/[email protected]>
>> X-Sgi-Pv: 970244
>> <http://bugworks/query.cgi/970244>
>> ---
>> Index: linux-2.6.25.3/net/sunrpc/xprtrdma/rpc_rdma.c
>> ===================================================================
>> --- linux-2.6.25.3.orig/net/sunrpc/xprtrdma/rpc_rdma.c
>> +++ linux-2.6.25.3/net/sunrpc/xprtrdma/rpc_rdma.c
>> @@ -680,15 +680,13 @@ rpcrdma_conn_func(struct rpcrdma_ep *ep)
>>  {
>>  	struct rpc_xprt *xprt = ep->rep_xprt;
>>
>> -	spin_lock_bh(&xprt->transport_lock);
>>  	if (ep->rep_connected > 0) {
>> +		spin_lock_bh(&xprt->transport_lock);
>>  		if (!xprt_test_and_set_connected(xprt))
>>  			xprt_wake_pending_tasks(xprt, 0);
>> -	} else {
>> -		if (xprt_test_and_clear_connected(xprt))
>> -			xprt_wake_pending_tasks(xprt, ep->rep_connected);
>> -	}
>> -	spin_unlock_bh(&xprt->transport_lock);
>> +		spin_unlock_bh(&xprt->transport_lock);
>> +	} else
>> +		xprt_disconnect(xprt);
>>  }
>>
>> /*



2008-05-20 13:20:52

by Talpey, Thomas

Subject: Re: [PATCH 03/04] NFS/RDMA client stall patches

At 08:48 PM 5/19/2008, Peter Leckie wrote:
>Well, xprt_disconnect_done() is no more heavyweight than the
>rpcrdma_conn_func() equivalent; it simply clears the connected bit and
>labels the queues as not connected. The reason for calling
>xprt_disconnect_done() is to have a single disconnect function.

Aha - I agree, and now realize that I misunderstood because your patch
calls xprt_disconnect(), which (IIRC) is an older, heavier function that's
no longer in sunrpc. Is this the right patch then?

>The reason this change is needed is to allow the send and resend queues
>to be drained on disconnect. Patch 02 could have been changed to also
>drain these queues from rpcrdma_conn_func(); however, I think this is
>a cleaner fix.

Sounds good - pending better understanding of which fix is right. :-)

...
>>> Index: linux-2.6.25.3/net/sunrpc/xprtrdma/rpc_rdma.c
>>> ===================================================================
>>> --- linux-2.6.25.3.orig/net/sunrpc/xprtrdma/rpc_rdma.c
>>> +++ linux-2.6.25.3/net/sunrpc/xprtrdma/rpc_rdma.c
>>> @@ -680,15 +680,13 @@ rpcrdma_conn_func(struct rpcrdma_ep *ep)
...
>>> + spin_unlock_bh(&xprt->transport_lock);
>>> + } else
>>> + xprt_disconnect(xprt);
>>> }
>>>

Tom.