On Fri, 2009-11-27 at 01:14 +0100, Stephen R. van den Berg wrote:
> On Fri, Nov 27, 2009 at 01:07, Stephen R. van den Berg <[email protected]> wrote:
> > RPC: worker connecting xprt cfa94400 to address: addr=1.2.3.151
> > port=2049 proto=tcp
> > RPC: cfa94400 connect status 99 connected 0 sock state 7
>
> errno 99 means EADDRNOTAVAIL. In userspace this normally is solved by
> using the REUSEADDR sockopt. In xprtsock.c we try something like:
>
> /* We're probably in TIME_WAIT. Get rid of existing socket,
> * and retry
> */
> set_bit(XPRT_CONNECTION_CLOSE, &xprt->state);
> xprt_force_disconnect(xprt);
>
> I'd guess that this needs to be fixed, or the REUSEADDR sockopt needs to be set.
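For reference, the userspace remedy mentioned in the quoted message can be sketched roughly as follows. This is only an illustration of setting SO_REUSEADDR before bind() on an ordinary TCP socket; the helper name open_reusable_tcp_socket() and the choice of the loopback address are assumptions made for the example, and nothing here is taken from xprtsock.c.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper (not kernel code): open a TCP socket bound to
 * 127.0.0.1:local_port with SO_REUSEADDR set before bind(), so that a
 * local address still held by a socket in TIME_WAIT can be bound again.
 * Returns the file descriptor, or -1 on failure with errno set by the
 * failing call.
 */
int open_reusable_tcp_socket(unsigned short local_port)
{
	int one = 1;
	struct sockaddr_in local;
	int fd = socket(AF_INET, SOCK_STREAM, 0);

	if (fd < 0)
		return -1;

	/* The option must be set before bind() to have any effect. */
	if (setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &one, sizeof(one)) < 0)
		goto fail;

	memset(&local, 0, sizeof(local));
	local.sin_family = AF_INET;
	local.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	local.sin_port = htons(local_port);	/* 0 lets the kernel pick a port */

	if (bind(fd, (struct sockaddr *)&local, sizeof(local)) < 0)
		goto fail;

	return fd;
fail:
	close(fd);
	return -1;
}
```

A caller would then connect() on the returned descriptor as usual. Whether the in-kernel RPC client should do the equivalent is exactly the open question in this thread.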
Does the following patch fix matters?
Trond
---------------------------------------------------------------------------------------------------------
SUNRPC: Ensure that we honour autoclose before attempting to reconnect
From: Trond Myklebust <[email protected]>
If the XPRT_CLOSE_WAIT flag is set, we need to ensure that we call
xprt->ops->close() while holding xprt_lock_write() before we can
start reconnecting.
Signed-off-by: Trond Myklebust <[email protected]>
---
net/sunrpc/xprt.c | 4 ++++
1 files changed, 4 insertions(+), 0 deletions(-)
diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c
index fd46d42..469de29 100644
--- a/net/sunrpc/xprt.c
+++ b/net/sunrpc/xprt.c
@@ -700,6 +700,10 @@ void xprt_connect(struct rpc_task *task)
 	}
 	if (!xprt_lock_write(xprt, task))
 		return;
+
+	if (test_and_clear_bit(XPRT_CLOSE_WAIT, &xprt->state))
+		xprt->ops->close(xprt);
+
 	if (xprt_connected(xprt))
 		xprt_release_write(xprt, task);
 	else {
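The ordering the patch aims for (honour a pending close first, then decide whether to reconnect) can be modelled in a small userspace sketch. Everything below is a toy: struct toy_xprt, the TOY_* flags and the counters are hypothetical stand-ins, C11 atomics replace the kernel's test_and_clear_bit(), and the write lock that xprt_connect() holds at this point is not modelled at all.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical flag bits; they only mirror the roles of the kernel flags. */
#define TOY_CLOSE_WAIT	(1u << 0)
#define TOY_CONNECTED	(1u << 1)

struct toy_xprt {
	atomic_uint state;
	int close_calls;	/* how often the stale socket was torn down */
	int connect_calls;	/* how often a new connection was started */
};

/* Atomically clear the bits in mask; report whether any of them was set. */
static bool toy_test_and_clear(atomic_uint *state, unsigned int mask)
{
	return (atomic_fetch_and(state, ~mask) & mask) != 0;
}

static void toy_close(struct toy_xprt *x)
{
	atomic_fetch_and(&x->state, ~TOY_CONNECTED);
	x->close_calls++;
}

void toy_connect(struct toy_xprt *x)
{
	/* A pending close is honoured before looking at the connected bit. */
	if (toy_test_and_clear(&x->state, TOY_CLOSE_WAIT))
		toy_close(x);

	if (atomic_load(&x->state) & TOY_CONNECTED)
		return;		/* still connected, nothing to do */

	x->connect_calls++;
	atomic_fetch_or(&x->state, TOY_CONNECTED);
}
```

In this model, a transport that is marked both connected and close-pending is closed exactly once and then reconnected, and a second toy_connect() call is a no-op.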
On Fri, Nov 27, 2009 at 22:23, Trond Myklebust
<[email protected]> wrote:
> On Fri, 2009-11-27 at 01:14 +0100, Stephen R. van den Berg wrote:
>> On Fri, Nov 27, 2009 at 01:07, Stephen R. van den Berg <[email protected]> wrote:
> Does the following patch fix matters?
>        if (!xprt_lock_write(xprt, task))
>                return;
> +
> +       if (test_and_clear_bit(XPRT_CLOSE_WAIT, &xprt->state))
> +               xprt->ops->close(xprt);
> +
>        if (xprt_connected(xprt))
Sorry, no go. I got the following trace, though I'm not sure it is
relevant: it is difficult to tell whether the logging corresponds to
the problem I experienced.
RPC: 14194 xprt_connect_status: retrying
RPC: 14194 xprt_prepare_transmit
RPC: 14194 xprt_transmit(112)
RPC: disconnected transport cfa82400
RPC: 14194 xprt_connect xprt cfa82400 is not connected
RPC: 14194 xprt_connect_status: retrying
RPC: 14194 xprt_prepare_transmit
RPC: 14194 xprt_transmit(112)
RPC: disconnected transport cfa82400
RPC: 14194 xprt_connect xprt cfa82400 is not connected
RPC: 14194 xprt_connect_status: retrying
RPC: 14194 xprt_prepare_transmit
RPC: 14194 xprt_transmit(112)
RPC: disconnected transport cfa82400
RPC: 14194 xprt_connect xprt cfa82400 is not connected
RPC: 14194 xprt_connect_status: retrying
RPC: 14194 xprt_prepare_transmit
RPC: 14194 xprt_transmit(112)
RPC: disconnected transport cfa82400
RPC: 14194 xprt_connect xprt cfa82400 is not connected
RPC: 14194 xprt_connect_status: retrying
RPC: 14194 xprt_prepare_transmit
RPC: 14194 xprt_transmit(112)
-- 
Sincerely,
Stephen R. van den Berg.