Return-Path:
Received: from mail-out1.uio.no ([129.240.10.57]:57815 "EHLO mail-out1.uio.no"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1754176Ab0H3TS4 (ORCPT ); Mon, 30 Aug 2010 15:18:56 -0400
Subject: Re: [PATCH] sunrpc: cancel delayed connect working when conncet success
From: Trond Myklebust
To: Mi Jinlong
Cc: NFSv3 list, "J. Bruce Fields", Chuck Lever, Jeff Layton
In-Reply-To: <4C6BACAA.2060706@cn.fujitsu.com>
References: <4C6BACAA.2060706@cn.fujitsu.com>
Content-Type: text/plain; charset="UTF-8"
Date: Mon, 30 Aug 2010 15:18:47 -0400
Message-ID: <1283195927.2920.3.camel@heimdal.trondhjem.org>
Sender: linux-nfs-owner@vger.kernel.org
List-ID:
MIME-Version: 1.0

On Wed, 2010-08-18 at 17:49 +0800, Mi Jinlong wrote:
> As network partition or some other reason, when client connect
> success, maybe there is some delayed connect working in connect_work list.
>
> Aug 2 12:51:32 TEST-M kernel: RPC: xs_connect delayed xprt ccc4c800 for 96 seconds
> Aug 2 12:51:32 TEST-M kernel: RPC: xs_error_report client ccc4c800...
> Aug 2 12:51:32 TEST-M kernel: RPC: error 111
> ... snip ...
> Aug 2 12:53:08 TEST-M kernel: RPC: disconnected transport ccc4c800
> Aug 2 12:53:08 TEST-M kernel: RPC: worker connecting xprt ccc4c800 via tcp to 192.168.0.21 (port 2049)
> Aug 2 12:53:08 TEST-M kernel: RPC: ccc4c800 connect status 115 connected 0 sock state 2
> Aug 2 12:53:08 TEST-M kernel: RPC: 228 xprt_connect_status: retrying
> Aug 2 12:53:08 TEST-M kernel: RPC: 228 xprt_prepare_transmit
> Aug 2 12:53:08 TEST-M kernel: RPC: 228 xprt_transmit(136)
> Aug 2 12:53:08 TEST-M kernel: RPC: xs_tcp_send_request(136) = -11
> Aug 2 12:53:08 TEST-M kernel: RPC: 228 xmit incomplete (136 left of 136)
> Aug 2 12:53:08 TEST-M kernel: RPC: 228 xprt_connect xprt ccc4c800 is not connected
> Aug 2 12:53:08 TEST-M kernel: RPC: xs_connect delayed xprt ccc4c800 for 192 seconds
> Aug 2 12:53:08 TEST-M kernel: RPC: xs_tcp_state_change client ccc4c800...
> Aug 2 12:53:08 TEST-M kernel: RPC: state 1 conn 0 dead 0 zapped 1
> Aug 2 12:53:08 TEST-M kernel: RPC: 228 xprt_connect_status: retrying
> Aug 2 12:53:08 TEST-M kernel: RPC: 228 xprt_prepare_transmit
> Aug 2 12:53:08 TEST-M kernel: RPC: 228 xprt_transmit(136)
> Aug 2 12:53:08 TEST-M kernel: RPC: xs_tcp_send_request(136) = 136
> Aug 2 12:53:08 TEST-M kernel: RPC: 228 xmit complete
> Aug 2 12:53:08 TEST-M kernel: RPC: 229 xprt_prepare_transmit
>
> As the debug message show, "xs_connect delayed xprt ccc4c800 for 192 seconds"
> means a connecting work have be delayed at connect_worker list.
> "state 1 conn 0 dead 0 zapped 1" shows the connect have successed
> but a delayed work still alive at connect_worker list.
>
> Signed-off-by: Mi Jinlong
>
> ---
>  net/sunrpc/xprtsock.c |    4 ++++
>  1 files changed, 4 insertions(+), 0 deletions(-)
>
> diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
> index 49a62f0..823f1db 100644
> --- a/net/sunrpc/xprtsock.c
> +++ b/net/sunrpc/xprtsock.c
> @@ -1324,6 +1324,10 @@ static void xs_tcp_state_change(struct sock *sk)
>  			transport->tcp_flags =
>  				TCP_RCV_COPY_FRAGHDR | TCP_RCV_COPY_XID;
>  
> +			if (xprt_connecting(xprt) &&
> +			    cancel_delayed_work(&transport->connect_worker))
> +				xprt_clear_connecting(xprt);
> +
>  			xprt_wake_pending_tasks(xprt, -EAGAIN);
>  		}
>  		spin_unlock_bh(&xprt->transport_lock);

Wait... According to the above trace, the connect request is _failing_
due to an ECONNREFUSED error. In that case, we _want_ to delay the
reconnection in order to give the server time to set itself up.

Cheers
  Trond
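
For illustration only, the reconnect backoff visible in the trace (a 96 second
delay followed by a 192 second delay after "error 111", i.e. ECONNREFUSED) can
be sketched in plain userspace C as below. This is a hypothetical sketch and
not the code in net/sunrpc/xprtsock.c: the names demo_next_delay() and
demo_connect_done(), and the 3 second and 300 second bounds, are assumptions
made for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Assumed bounds for the example; not taken from the patch or the trace. */
#define DEMO_INIT_DELAY_S	3u
#define DEMO_MAX_DELAY_S	300u

/* Double the current delay, starting at the initial value and clamping at the cap. */
static unsigned int demo_next_delay(unsigned int cur)
{
	if (cur == 0)
		return DEMO_INIT_DELAY_S;
	if (cur >= DEMO_MAX_DELAY_S / 2)
		return DEMO_MAX_DELAY_S;
	return cur * 2;
}

/*
 * Update *delay after a connect attempt that finished with err
 * (0 on success, otherwise a positive errno such as ECONNREFUSED).
 * Returns the number of seconds to wait before the next attempt.
 */
static unsigned int demo_connect_done(int err, unsigned int *delay)
{
	assert(delay != NULL);

	if (err == 0) {
		/* Connected: reset the backoff, nothing to wait for. */
		*delay = 0;
		return 0;
	}

	/* Failed (e.g. refused): back off further so the server can come up. */
	*delay = demo_next_delay(*delay);
	return *delay;
}
```

With these assumed bounds, a refused connect at a 96 second delay yields a
192 second wait, which matches the two delays printed in the trace, and a
successful connect resets the delay to zero.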