2008-04-08 17:38:32

by Chuck Lever

[permalink] [raw]
Subject: [PATCH] SUNRPC: RPC client's TCP transport ignores errors during connect

Commits 66af1e55 through 663b8858 recently changed how
xs_tcp_state_change() awakens RPC tasks in order to address problems
with disconnection of TCP sockets that were already connected.

The generic function xprt_connect_status() is invoked during a transport
connection operation by this wake-up. It expects a specific error value
describing the connection failure, but currently always gets ENOTCONN.

To fix this, change xprt_disconnect_done() to take an errno value
argument and pass it on to xprt_wake_pending_tasks(). Change
xs_tcp_state_change() to invoke xprt_disconnect_done() with the value
contained in the socket's sk_err field in order to report the specific
error which prevented the connect operation from completing.

Signed-off-by: Chuck Lever <[email protected]>
---

Found this while poking at the RPC server registration logic. It appears to
fix the problem I was seeing with refused TCP connections.

Comments?

include/linux/sunrpc/xprt.h | 2 +-
net/sunrpc/xprt.c | 4 ++--
net/sunrpc/xprtrdma/transport.c | 4 ++--
net/sunrpc/xprtsock.c | 9 +++++----
4 files changed, 10 insertions(+), 9 deletions(-)

diff --git a/include/linux/sunrpc/xprt.h b/include/linux/sunrpc/xprt.h
index b3ff9a8..a3543c8 100644
--- a/include/linux/sunrpc/xprt.h
+++ b/include/linux/sunrpc/xprt.h
@@ -239,7 +239,7 @@ void xprt_adjust_cwnd(struct rpc_task *task, int result);
struct rpc_rqst * xprt_lookup_rqst(struct rpc_xprt *xprt, __be32 xid);
void xprt_complete_rqst(struct rpc_task *task, int copied);
void xprt_release_rqst_cong(struct rpc_task *task);
-void xprt_disconnect_done(struct rpc_xprt *xprt);
+void xprt_disconnect_done(struct rpc_xprt *xprt, int error);
void xprt_force_disconnect(struct rpc_xprt *xprt);

/*
diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c
index d5553b8..f61b560 100644
--- a/net/sunrpc/xprt.c
+++ b/net/sunrpc/xprt.c
@@ -579,12 +579,12 @@ static void xprt_autoclose(struct work_struct *work)
* @xprt: transport to flag for disconnect
*
*/
-void xprt_disconnect_done(struct rpc_xprt *xprt)
+void xprt_disconnect_done(struct rpc_xprt *xprt, int error)
{
dprintk("RPC: disconnected transport %p\n", xprt);
spin_lock_bh(&xprt->transport_lock);
xprt_clear_connected(xprt);
- xprt_wake_pending_tasks(xprt, -ENOTCONN);
+ xprt_wake_pending_tasks(xprt, error);
spin_unlock_bh(&xprt->transport_lock);
}
EXPORT_SYMBOL_GPL(xprt_disconnect_done);
diff --git a/net/sunrpc/xprtrdma/transport.c b/net/sunrpc/xprtrdma/transport.c
index a564c1a..ac3938d 100644
--- a/net/sunrpc/xprtrdma/transport.c
+++ b/net/sunrpc/xprtrdma/transport.c
@@ -458,7 +458,7 @@ xprt_rdma_close(struct rpc_xprt *xprt)
struct rpcrdma_xprt *r_xprt = rpcx_to_rdmax(xprt);

dprintk("RPC: %s: closing\n", __func__);
- xprt_disconnect_done(xprt);
+ xprt_disconnect_done(xprt, -ENOTCONN);
(void) rpcrdma_ep_disconnect(&r_xprt->rx_ep, &r_xprt->rx_ia);
}

@@ -695,7 +695,7 @@ xprt_rdma_send_request(struct rpc_task *task)
}

if (rpcrdma_ep_post(&r_xprt->rx_ia, &r_xprt->rx_ep, req)) {
- xprt_disconnect_done(xprt);
+ xprt_disconnect_done(xprt, -ENOTCONN);
return -ENOTCONN; /* implies disconnect */
}

diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
index 896aa1e..47b37ac 100644
--- a/net/sunrpc/xprtsock.c
+++ b/net/sunrpc/xprtsock.c
@@ -771,7 +771,7 @@ clear_close_wait:
clear_bit(XPRT_CLOSE_WAIT, &xprt->state);
clear_bit(XPRT_CLOSING, &xprt->state);
smp_mb__after_clear_bit();
- xprt_disconnect_done(xprt);
+ xprt_disconnect_done(xprt, -ENOTCONN);
}

/**
@@ -1105,8 +1105,8 @@ static void xs_tcp_state_change(struct sock *sk)
if (!(xprt = xprt_from_sock(sk)))
goto out;
dprintk("RPC: xs_tcp_state_change client %p...\n", xprt);
- dprintk("RPC: state %x conn %d dead %d zapped %d\n",
- sk->sk_state, xprt_connected(xprt),
+ dprintk("RPC: state %x sk_err %d conn %d dead %d zapped %d\n",
+ sk->sk_state, sk->sk_err, xprt_connected(xprt),
sock_flag(sk, SOCK_DEAD),
sock_flag(sk, SOCK_ZAPPED));

@@ -1161,7 +1161,8 @@ static void xs_tcp_state_change(struct sock *sk)
clear_bit(XPRT_CLOSING, &xprt->state);
smp_mb__after_clear_bit();
/* Mark transport as closed and wake up all pending tasks */
- xprt_disconnect_done(xprt);
+ xprt_disconnect_done(xprt, (sk->sk_err ?
+ -sk->sk_err : -ENOTCONN));
}
out:
read_unlock(&sk->sk_callback_lock);



2008-04-08 20:21:42

by Myklebust, Trond

[permalink] [raw]
Subject: Re: [PATCH] SUNRPC: RPC client's TCP transport ignores errors during connect


On Tue, 2008-04-08 at 15:35 -0400, Chuck Lever wrote:

> Last week, you claimed:
>
> > The client currently doesn't retry in the case of a connection
> > failing for a soft RPC request.
>
> In fact, it does retry a connection attempt on soft RPC requests
> until a *major* timeout occurs.

You're right. I forgot that we changed that.

> So is the actual bug in call_timeout?

I'm confused. What should call_timeout be doing differently?

--
Trond Myklebust
Linux NFS client maintainer

NetApp
[email protected]
http://www.netapp.com

2008-04-08 17:56:06

by Myklebust, Trond

[permalink] [raw]
Subject: Re: [PATCH] SUNRPC: RPC client's TCP transport ignores errors during connect


On Tue, 2008-04-08 at 13:38 -0400, Chuck Lever wrote:
> Commits 66af1e55 through 663b8858 recently changed how
> xs_tcp_state_change() awakens RPC tasks in order to address problems
> with disconnection of TCP sockets that were already connected.
>
> The generic function xprt_connect_status() is invoked during a transport
> connection operation by this wake-up. It expects a specific error value
> describing the connection failure, but currently always gets ENOTCONN.

Does it? AFAICS, the only errors that it cares about are ETIMEDOUT,
ENOTCONN, ECONNRESET, and ECONNREFUSED, and the only effect is change
the dprintk() message.

OTOH, call_connect_status() will cause the task to exit with an EIO if
ever it gets sent an ECONNRESET or ECONNREFUSED.

IOW: I'm sceptical to this change.

--
Trond Myklebust
Linux NFS client maintainer

NetApp
[email protected]
http://www.netapp.com

2008-04-08 18:05:54

by Myklebust, Trond

[permalink] [raw]
Subject: Re: [PATCH] SUNRPC: RPC client's TCP transport ignores errors during connect


On Tue, 2008-04-08 at 13:38 -0400, Chuck Lever wrote:
> Found this while poking at the RPC server registration logic. It appears to
> fix the problem I was seeing with refused TCP connections.

BTW: Exactly how did this fix the problem?

As I said, it looks to me as if a report other than ENOTCONN or
ETIMEDOUT will automatically result in the rpc task aborting.

--
Trond Myklebust
Linux NFS client maintainer

NetApp
[email protected]
http://www.netapp.com

2008-04-08 18:23:21

by Talpey, Thomas

[permalink] [raw]
Subject: Re: [PATCH] SUNRPC: RPC client's TCP transport ignores errors during connect

At 02:00 PM 4/8/2008, Trond Myklebust wrote:
>As I said, it looks to me as if a report other than ENOTCONN or
>ETIMEDOUT will automatically result in the rpc task aborting.

FWIW, we've seen some leaks on NFS/RDMA clients that receive strange
errors from the RDMA layer when connecting to the server. I hope to be
able to test this soon to see if it eliminates the leak, but don't have the
ability to do that where I am right now.

Anyway, the RDMA layer does return some very different errors from
sockets. We attempt to "fix" them in the RPC transport before returning
them, but it's a bit of a challenge so I would welcome error hardening in
the layer(s) above.

Tom.


2008-04-08 18:43:01

by Myklebust, Trond

[permalink] [raw]
Subject: Re: [PATCH] SUNRPC: RPC client's TCP transport ignores errors during connect


On Tue, 2008-04-08 at 14:22 -0400, Talpey, Thomas wrote:
> At 02:00 PM 4/8/2008, Trond Myklebust wrote:
> >As I said, it looks to me as if a report other than ENOTCONN or
> >ETIMEDOUT will automatically result in the rpc task aborting.
>
> FWIW, we've seen some leaks on NFS/RDMA clients that receive strange
> errors from the RDMA layer when connecting to the server. I hope to be
> able to test this soon to see if it eliminates the leak, but don't have the
> ability to do that where I am right now.
>
> Anyway, the RDMA layer does return some very different errors from
> sockets. We attempt to "fix" them in the RPC transport before returning
> them, but it's a bit of a challenge so I would welcome error hardening in
> the layer(s) above.

The only justification for passing more errors up to the higher layers
is if they have different error handling requirements.
The reason why we currently transform more or less everything into
ENOTCONN is because the two other errors ECONNRESET and ECONNREFUSED
basically require the same kind of error handling (exit with EIO in the
"soft" case, and keep retrying in the "hard" case).

So, what kind of RDMA errors are these, and how are we failing to handle
them correctly today?

--
Trond Myklebust
Linux NFS client maintainer

NetApp
[email protected]
http://www.netapp.com

2008-04-08 19:36:08

by Chuck Lever

[permalink] [raw]
Subject: Re: [PATCH] SUNRPC: RPC client's TCP transport ignores errors during connect

On Apr 8, 2008, at 1:55 PM, Trond Myklebust wrote:
> On Tue, 2008-04-08 at 13:38 -0400, Chuck Lever wrote:
>> Commits 66af1e55 through 663b8858 recently changed how
>> xs_tcp_state_change() awakens RPC tasks in order to address problems
>> with disconnection of TCP sockets that were already connected.
>>
>> The generic function xprt_connect_status() is invoked during a
>> transport
>> connection operation by this wake-up. It expects a specific error
>> value
>> describing the connection failure, but currently always gets
>> ENOTCONN.
>
> Does it? AFAICS, the only errors that it cares about are ETIMEDOUT,
> ENOTCONN, ECONNRESET, and ECONNREFUSED, and the only effect is change
> the dprintk() message.
>
> OTOH, call_connect_status() will cause the task to exit with an EIO if
> ever it gets sent an ECONNRESET or ECONNREFUSED.

Well, OK... but why keep the extra switch arms in xprt_connect_status
if transports and the generic logic don't use those error codes? It
serves only to confuse.

Last week, you claimed:

> The client currently doesn't retry in the case of a connection
> failing for a soft RPC request.

In fact, it does retry a connection attempt on soft RPC requests
until a *major* timeout occurs.

So is the actual bug in call_timeout?

--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com