2007-11-07 00:37:59

by Myklebust, Trond

Subject: [PATCH 0/7] Improve the NFS/TCP reconnection code

The following series of patches attempts to fix a problem with the
current Linux RPC client's disconnect/reconnect logic for TCP.

Our current strategy, whenever we need to disconnect from the server and
then reconnect, is to force a reset of the connection (by issuing a
'special' connect(AF_UNSPEC) call that essentially causes the TCP layer
to send an RST to terminate the current connection).
This then allows us to reuse the port immediately without worrying about
TIME-WAIT states.

The problem is that an RST is not supposed to be acked by the server. We
can therefore never be entirely sure that the connection was correctly
terminated on the server side. This in turn may cause a SYN/RST loop
when we try to reconnect, since the server may think that we are trying
to reconnect from a port that is already connected.

The solution is to use a combination of the shutdown() command, and the
connect(AF_UNSPEC).

By using shutdown() to initiate the disconnection, we are able to hang
onto the socket and monitor the shutdown process via the ->state_change()
callback. Better yet, we can continue to receive replies on the socket
until the FIN from the server arrives to tell us that it is done sending.

After the connection is shut down, we can then use the connect(AF_UNSPEC)
trick in order to reset the socket without releasing the port number. Note
that because the socket is already closed, no RST is sent to the server.
A curious side-effect of this is that the TIME-WAIT state gets moved to
a different port number. I'm not sure how to avoid this...
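
In userspace terms, the sequence looks roughly like this (just a sketch,
assuming 'fd' is the connected TCP socket; the patches do the equivalent
in-kernel via the socket callbacks):

    #include <string.h>
    #include <sys/socket.h>
    #include <unistd.h>

    static void disconnect_keeping_port(int fd)
    {
            char buf[4096];
            struct sockaddr sa;

            shutdown(fd, SHUT_WR);                  /* send our FIN */
            while (read(fd, buf, sizeof(buf)) > 0)
                    ;       /* drain replies until the server's FIN */

            memset(&sa, 0, sizeof(sa));
            sa.sa_family = AF_UNSPEC;
            connect(fd, &sa, sizeof(sa));   /* reset, no RST; port kept */
    }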

Cheers
Trond



2007-11-08 16:12:40

by Myklebust, Trond

Subject: Re: [NFS] [PATCH 1/7] SUNRPC: Fix a race in xs_tcp_state_change()


On Thu, 2007-11-08 at 10:40 -0500, Chuck Lever wrote:
> That's fine. But you still clear XPRT_CLOSE_WAIT twice in a row in
> xprt_autoclose, at least for the socket transport method. Either clear
> it in xprt_autoclose, or clear it in the transport close methods. Not
> both. It doesn't break anything, but it is confusing to human readers.

Yup. I'll fix it up.

> I'm still a little confused about which state flags are supposed to be
> managed entirely by the transports, and which by the generic logic. Can
> you perhaps add a block comment in xprt.h where the flags are defined
> that describes how these flags are supposed to be used? Actually, even
> better would be to have a separate state word for the transports,
> defined in their private transport structure. The
> non-connection-oriented transports wouldn't need to manipulate the
> specific flags.

Well... Some of them want to be publicly readable, but only writable by
certain subsystems. The XPRT_BOUND and XPRT_CONNECTED flags are two
examples of this. Then there is fully private stuff like XPRT_BINDING,
XPRT_CONNECTING, and XPRT_CLOSING.

I think the right thing to do is to keep public functions like
xprt_connected(), and xprt_bound() for use by the RPC client state
machine in xprt.h, and then perhaps move the currently public
'manipulator' functions xprt_set_connected(), xprt_clear_connected(),...
into a private header file for use by xprtsock.c, and
xprtrdma/transport.c.
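
Something along these lines in xprt.h might capture that split (a sketch
of the sort of comment Chuck is asking for, not actual patch text):

    /*
     * xprt->state bit ownership (sketch):
     *
     * XPRT_CONNECTED, XPRT_BOUND - readable by everyone via
     *      xprt_connected()/xprt_bound(), but set and cleared only
     *      by the transport implementations (xprtsock.c,
     *      xprtrdma/transport.c).
     *
     * XPRT_CONNECTING, XPRT_BINDING, XPRT_CLOSING - fully private
     *      transport state; generic code must not touch these.
     */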

Trond

2007-11-09 13:37:03

by Talpey, Thomas

Subject: Re: [PATCH 2/7] SUNRPC: Fix TCP rebinding logic

At 07:39 PM 11/6/2007, Trond Myklebust wrote:
>From: Trond Myklebust <[email protected]>
>
>Currently the TCP rebinding logic assumes that if we're not using a
>reserved port, then we don't need to reconnect on the same port if a
>disconnection event occurs. This breaks most RPC duplicate reply cache
>implementations.

The good news is, in many cases the port search ends up landing on the
same port it used before, because it always starts at the same point and
moves in the same direction to find a free one. If only one connection at
a time is breaking, this does tend to work.

Of course, I like your intentional approach better! :-) Just a note that
existing kernels aren't completely broken.

Other comments largely match Chuck's, so I'll stand by for round 2.

Tom.

>
>Also take into account the fact that xprt_min_resvport and
>xprt_max_resvport may change while we're reconnecting, since the user may
>change them at any time via the sysctls. Ensure that we check the port
>boundaries every time we loop in xs_bind4/xs_bind6. Also ensure that if the
>boundaries change, we only scan the ports a maximum of 2 times.
>
>Signed-off-by: Trond Myklebust <[email protected]>
>---
>
> net/sunrpc/xprtsock.c | 59 ++++++++++++++++++++++++++++++++-----------------
> 1 files changed, 38 insertions(+), 21 deletions(-)
>
>diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
>index 322e4e2..5a83a40 100644
>--- a/net/sunrpc/xprtsock.c
>+++ b/net/sunrpc/xprtsock.c
>@@ -1272,34 +1272,53 @@ static void xs_set_port(struct rpc_xprt *xprt, unsigned short port)
> }
> }
>
>+static unsigned short xs_get_srcport(struct sock_xprt *transport, struct socket *sock)
>+{
>+ unsigned short port = transport->port;
>+
>+ if (port == 0 && transport->xprt.resvport)
>+ port = xs_get_random_port();
>+ return port;
>+}
>+
>+static unsigned short xs_next_srcport(struct sock_xprt *transport, struct socket *sock, unsigned short port)
>+{
>+ if (transport->port != 0)
>+ transport->port = 0;
>+ if (!transport->xprt.resvport)
>+ return 0;
>+ if (port <= xprt_min_resvport || port > xprt_max_resvport)
>+ return xprt_max_resvport;
>+ return --port;
>+}
>+
> static int xs_bind4(struct sock_xprt *transport, struct socket *sock)
> {
> struct sockaddr_in myaddr = {
> .sin_family = AF_INET,
> };
> struct sockaddr_in *sa;
>- int err;
>- unsigned short port = transport->port;
>+ int err, nloop = 0;
>+ unsigned short port = xs_get_srcport(transport, sock);
>+ unsigned short last;
>
>- if (!transport->xprt.resvport)
>- port = 0;
> sa = (struct sockaddr_in *)&transport->addr;
> myaddr.sin_addr = sa->sin_addr;
> do {
> myaddr.sin_port = htons(port);
> err = kernel_bind(sock, (struct sockaddr *) &myaddr,
> sizeof(myaddr));
>- if (!transport->xprt.resvport)
>+ if (port == 0)
> break;
> if (err == 0) {
> transport->port = port;
> break;
> }
>- if (port <= xprt_min_resvport)
>- port = xprt_max_resvport;
>- else
>- port--;
>- } while (err == -EADDRINUSE && port != transport->port);
>+ last = port;
>+ port = xs_next_srcport(transport, sock, port);
>+ if (port > last)
>+ nloop++;
>+ } while (err == -EADDRINUSE && nloop != 2);
> dprintk("RPC: %s "NIPQUAD_FMT":%u: %s (%d)\n",
> __FUNCTION__, NIPQUAD(myaddr.sin_addr),
> port, err ? "failed" : "ok", err);
>@@ -1312,28 +1331,27 @@ static int xs_bind6(struct sock_xprt *transport, struct socket *sock)
> .sin6_family = AF_INET6,
> };
> struct sockaddr_in6 *sa;
>- int err;
>- unsigned short port = transport->port;
>+ int err, nloop = 0;
>+ unsigned short port = xs_get_srcport(transport, sock);
>+ unsigned short last;
>
>- if (!transport->xprt.resvport)
>- port = 0;
> sa = (struct sockaddr_in6 *)&transport->addr;
> myaddr.sin6_addr = sa->sin6_addr;
> do {
> myaddr.sin6_port = htons(port);
> err = kernel_bind(sock, (struct sockaddr *) &myaddr,
> sizeof(myaddr));
>- if (!transport->xprt.resvport)
>+ if (port == 0)
> break;
> if (err == 0) {
> transport->port = port;
> break;
> }
>- if (port <= xprt_min_resvport)
>- port = xprt_max_resvport;
>- else
>- port--;
>- } while (err == -EADDRINUSE && port != transport->port);
>+ last = port;
>+ port = xs_next_srcport(transport, sock, port);
>+ if (port > last)
>+ nloop++;
>+ } while (err == -EADDRINUSE && nloop != 2);
> dprintk("RPC: xs_bind6 "NIP6_FMT":%u: %s (%d)\n",
> NIP6(myaddr.sin6_addr), port, err ? "failed" : "ok", err);
> return err;
>@@ -1815,7 +1833,6 @@ static struct rpc_xprt *xs_setup_xprt(struct xprt_create *args,
> xprt->addrlen = args->addrlen;
> if (args->srcaddr)
> memcpy(&new->addr, args->srcaddr, args->addrlen);
>- new->port = xs_get_random_port();
>
> return xprt;
> }
>
>

2007-11-09 13:38:24

by Talpey, Thomas

Subject: Re: [PATCH 4/7] SUNRPC: Use shutdown() instead of close() when disconnecting a TCP socket

At 07:39 PM 11/6/2007, Trond Myklebust wrote:
>From: Trond Myklebust <[email protected]>
>
>By using shutdown() rather than close() we allow the RPC client to wait
>for the TCP close handshake to complete before we start trying to reconnect
>using the same port.
>We use shutdown(SHUT_WR) only, rather than shutting down both directions;
>however, we wait until the server has closed the connection on its side.
>
>Signed-off-by: Trond Myklebust <[email protected]>
>---
>
> net/sunrpc/xprtsock.c | 53 +++++++++++++++++++++++++++++++++++++++++++------
> 1 files changed, 46 insertions(+), 7 deletions(-)
>
>diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
>index 99c0166..d610d28 100644
>--- a/net/sunrpc/xprtsock.c
>+++ b/net/sunrpc/xprtsock.c
>@@ -614,6 +614,34 @@ static int xs_udp_send_request(struct rpc_task *task)
> return status;
> }
>
>+static int xs_shutdown(struct socket *sock, int how)
>+{
>+ /*
>+ * Note: 'how - 1' trick converts
>+ * RCV_SHUTDOWN -> SHUT_RD = 0
>+ * SEND_SHUTDOWN -> SHUT_WR = 1
>+ * RCV_SHUTDOWN|SEND_SHUTDOWN -> SHUT_RDWR = 2
>+ */
>+ return sock->ops->shutdown(sock, how - 1);

These #defines might be better if they had XPRT_ or RPCXPRT_ in front
of them to make it clear they're flags for an RPC API. All by themselves
they kind of look like a standard networking component.

Tom.

>+}
>+
>+/**
>+ * xs_tcp_shutdown - gracefully shut down a TCP socket
>+ * @xprt: transport
>+ *
>+ * Initiates a graceful shutdown of the TCP socket by calling the
>+ * equivalent of shutdown(SHUT_WR);
>+ */
>+static void xs_tcp_shutdown(struct rpc_xprt *xprt)
>+{
>+ struct sock_xprt *transport = container_of(xprt, struct sock_xprt, xprt);
>+ struct socket *sock = transport->sock;
>+
>+ if (sock != NULL)
>+ xs_shutdown(sock, SEND_SHUTDOWN);
>+ clear_bit(XPRT_CLOSE_WAIT, &xprt->state);
>+}
>+
> static inline void xs_encode_tcp_record_marker(struct xdr_buf *buf)
> {
> u32 reclen = buf->len - sizeof(rpc_fraghdr);
>@@ -691,7 +719,7 @@ static int xs_tcp_send_request(struct rpc_task *task)
> default:
> dprintk("RPC: sendmsg returned unrecognized error %d\n",
> -status);
>- xprt_disconnect(xprt);
>+ xs_tcp_shutdown(xprt);
> break;
> }
>
>@@ -1627,8 +1655,7 @@ static void xs_tcp_connect_worker4(struct work_struct *work)
> break;
> default:
> /* get rid of existing socket, and retry */
>- xs_close(xprt);
>- break;
>+ xs_tcp_shutdown(xprt);
> }
> }
> out:
>@@ -1687,8 +1714,7 @@ static void xs_tcp_connect_worker6(struct work_struct *work)
> break;
> default:
> /* get rid of existing socket, and retry */
>- xs_close(xprt);
>- break;
>+ xs_tcp_shutdown(xprt);
> }
> }
> out:
>@@ -1735,6 +1761,19 @@ static void xs_connect(struct rpc_task *task)
> }
> }
>
>+static void xs_tcp_connect(struct rpc_task *task)
>+{
>+ struct rpc_xprt *xprt = task->tk_xprt;
>+
>+ /* Initiate graceful shutdown of the socket if not already done */
>+ if (!test_bit(XPRT_CONNECTING, &xprt->state))
>+ xs_tcp_shutdown(xprt);
>+ /* Exit if we need to wait for socket shutdown to complete */
>+ if (test_bit(XPRT_CLOSING, &xprt->state))
>+ return;
>+ xs_connect(task);
>+}
>+
> /**
> * xs_udp_print_stats - display UDP socket-specifc stats
> * @xprt: rpc_xprt struct containing statistics
>@@ -1805,12 +1844,12 @@ static struct rpc_xprt_ops xs_tcp_ops = {
> .release_xprt = xs_tcp_release_xprt,
> .rpcbind = rpcb_getport_async,
> .set_port = xs_set_port,
>- .connect = xs_connect,
>+ .connect = xs_tcp_connect,
> .buf_alloc = rpc_malloc,
> .buf_free = rpc_free,
> .send_request = xs_tcp_send_request,
> .set_retrans_timeout = xprt_set_retrans_timeout_def,
>- .close = xs_close,
>+ .close = xs_tcp_shutdown,
> .destroy = xs_destroy,
> .print_stats = xs_tcp_print_stats,
> };
>
>

2007-11-09 13:51:34

by Myklebust, Trond

Subject: Re: [NFS] [PATCH 4/7] SUNRPC: Use shutdown() instead of close() when disconnecting a TCP socket


On Fri, 2007-11-09 at 08:38 -0500, Talpey, Thomas wrote:
> At 07:39 PM 11/6/2007, Trond Myklebust wrote:
> >From: Trond Myklebust <[email protected]>
> >
> >By using shutdown() rather than close() we allow the RPC client to wait
> >for the TCP close handshake to complete before we start trying to reconnect
> >using the same port.
> >We use shutdown(SHUT_WR) only, rather than shutting down both directions;
> >however, we wait until the server has closed the connection on its side.
> >
> >Signed-off-by: Trond Myklebust <[email protected]>
> >---
> >
> > net/sunrpc/xprtsock.c | 53 +++++++++++++++++++++++++++++++++++++++++++------
> > 1 files changed, 46 insertions(+), 7 deletions(-)
> >
> >diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
> >index 99c0166..d610d28 100644
> >--- a/net/sunrpc/xprtsock.c
> >+++ b/net/sunrpc/xprtsock.c
> >@@ -614,6 +614,34 @@ static int xs_udp_send_request(struct rpc_task *task)
> > return status;
> > }
> >
> >+static int xs_shutdown(struct socket *sock, int how)
> >+{
> >+ /*
> >+ * Note: 'how - 1' trick converts
> >+ * RCV_SHUTDOWN -> SHUT_RD = 0
> >+ * SEND_SHUTDOWN -> SHUT_WR = 1
> >+ * RCV_SHUTDOWN|SEND_SHUTDOWN -> SHUT_RDWR = 2
> >+ */
> >+ return sock->ops->shutdown(sock, how - 1);
>
> These #defines might be better if they had XPRT_ or RPCXPRT_ in front
> of them to make it clear they're flags for an RPC API. All by themselves
> they kind of look like a standard networking component.

They're not #defines, but just a comment...
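
(For reference, the values that comment is playing with: RCV_SHUTDOWN and
SEND_SHUTDOWN are the kernel-internal sk->sk_shutdown flags from
include/net/sock.h, while the SHUT_* constants belong to the socket API:

    #define RCV_SHUTDOWN    1       /* include/net/sock.h */
    #define SEND_SHUTDOWN   2

    SHUT_RD = 0, SHUT_WR = 1, SHUT_RDWR = 2     /* shutdown(2) 'how' */

hence the 'how - 1' mapping.)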

However yesterday I did submit a patch to clean this up by providing the
proper functionality in the networking layer. See

http://article.gmane.org/gmane.linux.network/77163
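
(The helper proposed there looks roughly like the following sketch; the
exact name and placement are per that posting, not this series:

    enum sock_shutdown_cmd {
            SHUT_RD         = 0,
            SHUT_WR         = 1,
            SHUT_RDWR       = 2,
    };

    int kernel_sock_shutdown(struct socket *sock,
                             enum sock_shutdown_cmd how)
    {
            return sock->ops->shutdown(sock, how);
    }

so that callers can pass the standard SHUT_* values directly.)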

Cheers
Trond

2007-11-09 13:56:35

by Talpey, Thomas

Subject: Re: [NFS] [PATCH 5/7] SUNRPC: xprt_autoclose() should not call xprt_disconnect()

At 07:40 PM 11/6/2007, Trond Myklebust wrote:
>From: Trond Myklebust <[email protected]>
>
>The transport layer should do that itself whenever appropriate.
>
>Note that the RDMA transport already assumes that it needs to call
>xprt_disconnect in xprt_rdma_close().

This was an interesting issue, IIRC. With RDMA, there is no way to
cancel an individual RDMA that might arrive later from the peer;
you have to break the connection to do so. (Well, there are other
ways, but they're expensive and don't make sense when you're about
to close anyway.) But if the transport were to simply slam the connection
closed, any other RPC tasks would potentially initiate recovery when
the disconnect event bubbled up.

So, the best solution turned out to be to close the RPC door and the
transport door separately. Except for the order in which it's happening,
this is basically what you're coding here, so I'm glad to see it
consistent now.


Tom.

>For TCP sockets, we want to call xprt_disconnect() only after the
>connection has been closed by both ends.
>
>Signed-off-by: Trond Myklebust <[email protected]>
>---
>
> net/sunrpc/xprt.c | 1 -
> net/sunrpc/xprtsock.c | 3 +--
> 2 files changed, 1 insertions(+), 3 deletions(-)
>
>diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c
>index 48c5a8b..fcdc4b8 100644
>--- a/net/sunrpc/xprt.c
>+++ b/net/sunrpc/xprt.c
>@@ -568,7 +568,6 @@ static void xprt_autoclose(struct work_struct *work)
> struct rpc_xprt *xprt =
> container_of(work, struct rpc_xprt, task_cleanup);
>
>- xprt_disconnect(xprt);
> xprt->ops->close(xprt);
> clear_bit(XPRT_CLOSE_WAIT, &xprt->state);
> xprt_release_write(xprt, NULL);
>diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
>index d610d28..c0aa2b4 100644
>--- a/net/sunrpc/xprtsock.c
>+++ b/net/sunrpc/xprtsock.c
>@@ -786,10 +786,10 @@ static void xs_close(struct rpc_xprt *xprt)
> sock_release(sock);
> clear_close_wait:
> smp_mb__before_clear_bit();
>- clear_bit(XPRT_CONNECTED, &xprt->state);
> clear_bit(XPRT_CLOSE_WAIT, &xprt->state);
> clear_bit(XPRT_CLOSING, &xprt->state);
> smp_mb__after_clear_bit();
>+ xprt_disconnect(xprt);
> }
>
> /**
>@@ -805,7 +805,6 @@ static void xs_destroy(struct rpc_xprt *xprt)
>
> cancel_rearming_delayed_work(&transport->connect_worker);
>
>- xprt_disconnect(xprt);
> xs_close(xprt);
> xs_free_peer_addresses(xprt);
> kfree(xprt->slot);
>
>

2007-11-09 14:04:06

by Talpey, Thomas

Subject: Re: [PATCH 3/7] SUNRPC: Allow the client to detect if the TCP connection is closed

At 07:39 PM 11/6/2007, Trond Myklebust wrote:
>From: Trond Myklebust <[email protected]>
>
>Add an xprt->state bit to enable the TCP ->state_change() method to signal
>whether or not the TCP connection is in the process of closing down.
>This will be used by the reconnection logic in a separate patch.
>
>Signed-off-by: Trond Myklebust <[email protected]>
>---
>
> include/linux/sunrpc/xprt.h | 1 +
> net/sunrpc/xprtsock.c | 20 +++++++++++++++++---
> 2 files changed, 18 insertions(+), 3 deletions(-)
>
>diff --git a/include/linux/sunrpc/xprt.h b/include/linux/sunrpc/xprt.h
>index 6f524a9..afb9e6a 100644
>--- a/include/linux/sunrpc/xprt.h
>+++ b/include/linux/sunrpc/xprt.h
>@@ -257,6 +257,7 @@ void xprt_force_disconnect(struct rpc_xprt *xprt);
> #define XPRT_CLOSE_WAIT (3)
> #define XPRT_BOUND (4)
> #define XPRT_BINDING (5)
>+#define XPRT_CLOSING (6)
>
> static inline void xprt_set_connected(struct rpc_xprt *xprt)
> {
>diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
>index 5a83a40..99c0166 100644
>--- a/net/sunrpc/xprtsock.c
>+++ b/net/sunrpc/xprtsock.c
>@@ -758,7 +758,9 @@ static void xs_close(struct rpc_xprt *xprt)
> sock_release(sock);
> clear_close_wait:
> smp_mb__before_clear_bit();
>+ clear_bit(XPRT_CONNECTED, &xprt->state);
> clear_bit(XPRT_CLOSE_WAIT, &xprt->state);
>+ clear_bit(XPRT_CLOSING, &xprt->state);
> smp_mb__after_clear_bit();
> }
>
>@@ -1114,13 +1116,25 @@ static void xs_tcp_state_change(struct sock *sk)
> }
> spin_unlock_bh(&xprt->transport_lock);
> break;
>- case TCP_SYN_SENT:
>- case TCP_SYN_RECV:

Why are you removing these states from the switch?

>+ case TCP_FIN_WAIT1:
>+ /* The client initiated a shutdown of the socket */

If the client initiated this, why does the bit management need to wait
for TCP to change state? I think what this means is that the client
initiated a FIN, *and* the server has acked it. Both these bits could
have been set when the FIN was sent, basically. Or is there some
subtlety here, in which case the comment should say more.

Tom.


>+ set_bit(XPRT_CLOSING, &xprt->state);
>+ smp_mb__before_clear_bit();
>+ clear_bit(XPRT_CONNECTED, &xprt->state);
>+ smp_mb__after_clear_bit();
> break;
> case TCP_CLOSE_WAIT:
>+ /* The server initiated a shutdown of the socket */
>+ set_bit(XPRT_CLOSING, &xprt->state);
>+ smp_mb__before_clear_bit();
>+ clear_bit(XPRT_CONNECTED, &xprt->state);
>+ smp_mb__after_clear_bit();
> xprt_force_disconnect(xprt);
>- default:
>+ break;
>+ case TCP_CLOSE:
>+ /* Mark transport as closed and wake up all pending tasks */
> xprt_disconnect(xprt);
>+ clear_bit(XPRT_CLOSING, &xprt->state);
> }
> out:
> read_unlock(&sk->sk_callback_lock);
>
>

2007-11-09 14:33:26

by Myklebust, Trond

Subject: Re: [NFS] [PATCH 3/7] SUNRPC: Allow the client to detect if the TCP connection is closed


On Fri, 2007-11-09 at 09:04 -0500, Talpey, Thomas wrote:
> At 07:39 PM 11/6/2007, Trond Myklebust wrote:
> >From: Trond Myklebust <[email protected]>
> >
> >Add an xprt->state bit to enable the TCP ->state_change() method to signal
> >whether or not the TCP connection is in the process of closing down.
> >This will be used by the reconnection logic in a separate patch.
> >
> >Signed-off-by: Trond Myklebust <[email protected]>
> >---
> >
> > include/linux/sunrpc/xprt.h | 1 +
> > net/sunrpc/xprtsock.c | 20 +++++++++++++++++---
> > 2 files changed, 18 insertions(+), 3 deletions(-)
> >
> >diff --git a/include/linux/sunrpc/xprt.h b/include/linux/sunrpc/xprt.h
> >index 6f524a9..afb9e6a 100644
> >--- a/include/linux/sunrpc/xprt.h
> >+++ b/include/linux/sunrpc/xprt.h
> >@@ -257,6 +257,7 @@ void xprt_force_disconnect(struct rpc_xprt *xprt);
> > #define XPRT_CLOSE_WAIT (3)
> > #define XPRT_BOUND (4)
> > #define XPRT_BINDING (5)
> >+#define XPRT_CLOSING (6)
> >
> > static inline void xprt_set_connected(struct rpc_xprt *xprt)
> > {
> >diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
> >index 5a83a40..99c0166 100644
> >--- a/net/sunrpc/xprtsock.c
> >+++ b/net/sunrpc/xprtsock.c
> >@@ -758,7 +758,9 @@ static void xs_close(struct rpc_xprt *xprt)
> > sock_release(sock);
> > clear_close_wait:
> > smp_mb__before_clear_bit();
> >+ clear_bit(XPRT_CONNECTED, &xprt->state);
> > clear_bit(XPRT_CLOSE_WAIT, &xprt->state);
> >+ clear_bit(XPRT_CLOSING, &xprt->state);
> > smp_mb__after_clear_bit();
> > }
> >
> >@@ -1114,13 +1116,25 @@ static void xs_tcp_state_change(struct sock *sk)
> > }
> > spin_unlock_bh(&xprt->transport_lock);
> > break;
> >- case TCP_SYN_SENT:
> >- case TCP_SYN_RECV:
>
> Why are you removing these states from the switch?

There is no need for any special handling for these states, so just fall
through.

> >+ case TCP_FIN_WAIT1:
> >+ /* The client initiated a shutdown of the socket */
>
> If the client initiated this, why does the bit management need to wait
> for TCP to change state? I think what this means is that the client
> initiated a FIN, *and* the server has acked it. Both these bits could
> have been set when the FIN was sent, basically. Or is there some
> subtlety here, in which case the comment should say more.

No. TCP_FIN_WAIT1 is the standard name in TCP parlance for the state
where the client has sent a FIN to the server, and is waiting for an ACK
and/or a FIN in reply. See one of Stevens' famous TCP state diagrams.
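
(Roughly, the active-close corner of that diagram:

    ESTABLISHED --send FIN--> FIN_WAIT_1 --recv ACK of FIN--> FIN_WAIT_2
    FIN_WAIT_1  --recv FIN+ACK--> TIME_WAIT
    FIN_WAIT_2  --recv FIN--> TIME_WAIT

with TIME_WAIT eventually expiring to CLOSED.)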

The TCP layer is therefore notifying us of this state change via the
socket's sk->sk_state_change() callback. All we are doing in response is
setting the flags in the struct xprt to indicate that we have initiated
a shutdown of the socket.

> >+ set_bit(XPRT_CLOSING, &xprt->state);
> >+ smp_mb__before_clear_bit();
> >+ clear_bit(XPRT_CONNECTED, &xprt->state);
> >+ smp_mb__after_clear_bit();
> > break;
> > case TCP_CLOSE_WAIT:
> >+ /* The server initiated a shutdown of the socket */
> >+ set_bit(XPRT_CLOSING, &xprt->state);
> >+ smp_mb__before_clear_bit();
> >+ clear_bit(XPRT_CONNECTED, &xprt->state);
> >+ smp_mb__after_clear_bit();
> > xprt_force_disconnect(xprt);
> >- default:
> >+ break;
> >+ case TCP_CLOSE:
> >+ /* Mark transport as closed and wake up all pending tasks */
> > xprt_disconnect(xprt);
> >+ clear_bit(XPRT_CLOSING, &xprt->state);
> > }
> > out:
> > read_unlock(&sk->sk_callback_lock);
> >
> >

2007-11-09 14:35:01

by Talpey, Thomas

Subject: Re: [NFS] [PATCH 3/7] SUNRPC: Allow the client to detect if the TCP connection is closed

At 09:33 AM 11/9/2007, Trond Myklebust wrote:
>> > break;
>> >- case TCP_SYN_SENT:
>> >- case TCP_SYN_RECV:
>>
>> Why are you removing these states from the switch?
>
>There is no need for any special handling for these states, so just fall
>through.

Ah - the next line was "break;". Ack.

>
>> >+ case TCP_FIN_WAIT1:
>> >+ /* The client initiated a shutdown of the socket */
>>
>> If the client initiated this, why does the bit management need to wait
>> for TCP to change state? I think what this means is that the client
>> initiated a FIN, *and* the server has acked it. Both these bits could
>> have been set when the FIN was sent, basically. Or is there some
>> subtlety here, in which case the comment should say more.
>
>No. TCP_FIN_WAIT1 is the standard name in TCP parlance for the state
>where the client has sent a FIN to the server, and is waiting for an ACK
>and/or a FIN in reply. See one of Stevens' famous TCP state diagrams.
>
>The TCP layer is therefore notifying us of this state change via the
>socket's sk->sk_state_change() callback. All we are doing in response is
>setting the flags in the struct xprt to indicate that we have initiated
>a shutdown of the socket.

Oh - so in that case what you're actually doing is waiting for any data
to be sent, *and* then the FIN to be sent. Of course, the server hasn't
necessarily seen either yet.

I guess I return to my question - why not just set the XPRT bits when
we decided to close? Why wait for TCP to click through to this state?
It doesn't guarantee anything on the server until we see an ACK.

Tom.

>
>> >+ set_bit(XPRT_CLOSING, &xprt->state);
>> >+ smp_mb__before_clear_bit();
>> >+ clear_bit(XPRT_CONNECTED, &xprt->state);
>> >+ smp_mb__after_clear_bit();
>> > break;

2007-11-09 14:48:50

by Myklebust, Trond

Subject: Re: [NFS] [PATCH 3/7] SUNRPC: Allow the client to detect if the TCP connection is closed


On Fri, 2007-11-09 at 09:35 -0500, Talpey, Thomas wrote:
> I guess I return to my question - why not just set the XPRT bits when
> we decided to close? Why wait for TCP to click through to this state?
> It doesn't guarantee anything on the server until we see an ACK.

We could do that, but IMO it is cleaner to keep all of this
state-dependent code in one place. The state change occurs while we're
inside the call to ->shutdown(), so there is no delay.

Cheers
Trond

2007-11-09 15:25:23

by Talpey, Thomas

Subject: Re: [PATCH 3/7] SUNRPC: Allow the client to detect if the TCP connection is closed

At 09:48 AM 11/9/2007, Trond Myklebust wrote:
>
>On Fri, 2007-11-09 at 09:35 -0500, Talpey, Thomas wrote:
>> I guess I return to my question - why not just set the XPRT bits when
>> we decided to close? Why wait for TCP to click through to this state?
>> It doesn't guarantee anything on the server until we see an ACK.
>
>We could do that, but IMO it is cleaner to keep all of this
>state-dependent code in one place.

I'm fine with that.

> The state change occurs while we're
>inside the call to ->shutdown(), so there is no delay.

I don't think so, in the case that the network is disconnected and
there is some data pending in the TCP output queue. The FIN won't
be sent until the window advances to allow it, and this could happen
much later. In the meantime, the xprt isn't even marked CLOSING.

Tom.


2007-11-09 15:32:49

by Myklebust, Trond

Subject: Re: [NFS] [PATCH 3/7] SUNRPC: Allow the client to detect if the TCP connection is closed


On Fri, 2007-11-09 at 10:25 -0500, Talpey, Thomas wrote:
> At 09:48 AM 11/9/2007, Trond Myklebust wrote:
> > The state change occurs while we're
> >inside the call to ->shutdown(), so there is no delay.
>
> I don't think so, in the case that the network is disconnected and
> there is some data pending in the TCP output queue. The FIN won't
> be sent until the window advances to allow it, and this could happen
> much later. In the meantime, the xprt isn't even marked CLOSING.

Oh, I see what you mean. So, correction: the TCP_FIN_WAIT1 state means
that the FIN has been _queued_ by the TCP layer. It may or may not have
hit the wire yet.

Cheers
Trond

2007-11-09 16:54:02

by Talpey, Thomas

Subject: Re: [PATCH 3/7] SUNRPC: Allow the client to detect if the TCP connection is closed

At 10:32 AM 11/9/2007, Trond Myklebust wrote:
>
>On Fri, 2007-11-09 at 10:25 -0500, Talpey, Thomas wrote:
>> At 09:48 AM 11/9/2007, Trond Myklebust wrote:
>> > The state change occurs while we're
>> >inside the call to ->shutdown(), so there is no delay.
>>
>> I don't think so, in the case that the network is disconnected and
>> there is some data pending in the TCP output queue. The FIN won't
>> be sent until the window advances to allow it, and this could happen
>> much later. In the meantime, the xprt isn't even marked CLOSING.
>
>Oh, I see what you mean. So, correction: the TCP_FIN_WAIT1 state means
>that the FIN has been _queued_ by the TCP layer. It may or may not have
>hit the wire yet.

Correct. And even if it were sent, the FINWAIT1 state does not mean the
peer has transitioned. You need to see the ACK (which often comes in
the same packet as the peer's FIN). All this is a gawd-awful layering
violation, you know. :-)

Tom.


2007-11-09 17:35:09

by Myklebust, Trond

Subject: Re: [PATCH 3/7] SUNRPC: Allow the client to detect if the TCP connection is closed


On Fri, 2007-11-09 at 11:53 -0500, Talpey, Thomas wrote:
> At 10:32 AM 11/9/2007, Trond Myklebust wrote:
> >
> >On Fri, 2007-11-09 at 10:25 -0500, Talpey, Thomas wrote:
> >> At 09:48 AM 11/9/2007, Trond Myklebust wrote:
> >> > The state change occurs while we're
> >> >inside the call to ->shutdown(), so there is no delay.
> >>
> >> I don't think so, in the case that the network is disconnected and
> >> there is some data pending in the TCP output queue. The FIN won't
> >> be sent until the window advances to allow it, and this could happen
> >> much later. In the meantime, the xprt isn't even marked CLOSING.
> >
> >Oh, I see what you mean. So, correction: the TCP_FIN_WAIT1 state means
> >that the FIN has been _queued_ by the TCP layer. It may or may not have
> >hit the wire yet.
>
> Correct. And even if it were sent, the FINWAIT1 state does not mean the
> peer has transitioned. You need to see the ACK (which often comes in
> the same packet as the peer's FIN).

That is really of no direct concern to the RPC client. The important
thing for us is to clear the XPRT_CONNECTED flag in order to indicate
that the socket will no longer accept further requests, and we want to
set XPRT_CLOSING in order to tell anybody who tries to reconnect that
they need to wait.

> All this is a gawd-awful layering violation, you know. :-)

I disagree. This is exactly the same thing as using poll() to monitor
the state of the socket in userland.

Cheers
Trond


2007-11-09 17:52:30

by Talpey, Thomas

Subject: Re: [NFS] [PATCH 3/7] SUNRPC: Allow the client to detect if the TCP connection is closed

At 12:37 PM 11/9/2007, Trond Myklebust wrote:
>> Correct. And even if it were sent, the FINWAIT1 state does not mean the
>> peer has transitioned. You need to see the ACK (which often comes in
>> the same packet as the peer's FIN).
>
>That is really of no direct concern to the RPC client. The important
>thing for us is to clear the XPRT_CONNECTED flag in order to indicate
>that the socket will no longer accept further requests, and we want to
>set XPRT_CLOSING in order to tell anybody who tries to reconnect that
>they need to wait.

Well, my concern is that setting the XPRT_CLOSING bit may be delayed due to
network issues. What happens if a second task discovers the client while
it's pending? That's all.

>> All this is a gawd-awful layering violation, you know. :-)
>
>I disagree. This is exactly the same thing as using poll() to monitor
>the state of the socket in userland.

But but but... poll() can't detect FIN_WAIT_1. It can only detect EOF,
i.e. CLOSE. I don't think this code could be implemented in userspace.

Here's my bottom line - if the RPC client is provably more correct with
this detail of the TCP state, then it's fine. I do think, however, that
it's unnecessary; i.e., the code can work without this and simply set the
bits when the shutdown() is initiated.

Tom.

2007-11-09 18:17:57

by Myklebust, Trond

Subject: Re: [PATCH 3/7] SUNRPC: Allow the client to detect if the TCP connection is closed


On Fri, 2007-11-09 at 12:52 -0500, Talpey, Thomas wrote:
> At 12:37 PM 11/9/2007, Trond Myklebust wrote:
> >> Correct. And even if it were sent, the FINWAIT1 state does not mean the
> >> peer has transitioned. You need to see the ACK (which often comes in
> >> the same packet as the peer's FIN).
> >
> >That is really of no direct concern to the RPC client. The important
> >thing for us is to clear the XPRT_CONNECTED flag in order to indicate
> >that the socket will no longer accept further requests, and we want to
> >set XPRT_CLOSING in order to tell anybody who tries to reconnect that
> >they need to wait.
>
> Well, my concern is that the XPRT_CLOSING bit set may be delayed due to
> network issues. What happens if a second task discovers the client while
> it's pending? That's all.
>
> >> All this is a gawd-awful layering violation, you know. :-)
> >
> >I disagree. This is exactly the same thing as using poll() to monitor
> >the state of the socket in userland.
>
> But but but... poll() can't detect FIN_WAIT_1. It can only detect EOF,
> i.e. CLOSE. I don't think this code could be implemented in userspace.

There is nothing to tell you explicitly that you are entering
TCP_FIN_WAIT1, but once you do, POLLOUT will tell you that the socket is
unavailable for writing.

POLLRDHUP will tell you if the server has sent a FIN (i.e. if you are
entering TCP_CLOSING).

Finally, POLLHUP will tell you that the socket has entered TCP_CLOSED.

So yes, I could implement fully equivalent monitoring of the socket in
userspace.
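
Something like this, say (a minimal sketch; POLLRDHUP requires
_GNU_SOURCE and Linux >= 2.6.17):

    #define _GNU_SOURCE
    #include <poll.h>
    #include <stdio.h>
    #include <sys/socket.h>
    #include <unistd.h>

    /* Initiate a half-close, then wait for the server's FIN. */
    static int graceful_close(int fd)
    {
            struct pollfd pfd = { .fd = fd, .events = POLLRDHUP };

            shutdown(fd, SHUT_WR);  /* our FIN: socket enters FIN_WAIT_1 */
            if (poll(&pfd, 1, 5000) > 0 &&
                (pfd.revents & (POLLRDHUP | POLLHUP)))
                    printf("server has closed its side\n");
            return close(fd);       /* connection lingers in TIME_WAIT */
    }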

> Here's my bottom line - if the RPC client is provably more correct with
> this detail of the TCP state, then it's fine. I do think however that it's
> unnecessary, i.e. the code can work without this, and simply set the
> bits when the shutdown() is initiated.

Sure, but why split this code into a million little pieces? Do it in the
one spot where it is dead obvious what is going on (the socket's state
change notifier), and where it is easy to maintain the code.

Trond


2007-11-09 19:55:30

by J. Bruce Fields

Subject: Re: setclientid: string in use on NFS v4 share on Debian Etch & hosts file "solution"

On Wed, Nov 07, 2007 at 03:47:08PM -0800, Matt Weatherford wrote:
>
> Hi,
>
> I'm not sure this is the right place to ask, but I'm wondering if someone can
> illuminate what the following console messages mean on my NFS v4 server....
>
>
> NFSD: setclientid: string in use by client(clientid 471c1b08/000003e9)
> NFSD: setclientid: string in use by client(clientid 471c1b08/00000410)
> nfs4_cb: server 10.11.12.221 not responding, timed out
> nfs4_cb: server 10.11.12.222 not responding, timed out
> NFSD: setclientid: string in use by client(clientid 471c1b08/0000043a)
> NFSD: setclientid: string in use by client(clientid 471c1b08/0000043c)
> NFSD: setclientid: string in use by client(clientid 471c1b08/00000447)
> NFSD: setclientid: string in use by client(clientid 471c1b08/0000044d)
> NFSD: setclientid: string in use by client(clientid 471c1b08/00000453)
> nfs4_cb: server 10.11.12.221 not responding, timed out

The "nfs4_cb: server not responding" mean the server is failing to open
a connection back to the client to use for callbacks, which means it
won't give out delegations. That's not a fatal error at all, it's just
a possible lost opportunity for a performance improvement.

The "string in use" errors could be a sign that clients might fail to
recover locks when the server reboots. But if the clients aren't
complaining, I wouldn't worry about it. There have been changes to
nfs-utils and the kernel to make those clientid collisions less likely,
but I don't recall whether they went in before or after 2.6.18.

The "string in use" message is no longer logged by the server by default
in recent kernels, and the "not responding" comment needs to be removed
sometime too; I wouldn't worry about them.

>
> (and yes, those IPs (10.11.12.221, 222) are good)
>
>
> I am on Debian Etch, on x86 hardware, using multiple NFS v4 clients with
> all the latest package updates.
>
> I have multiple other NFS clients which the server is not complaining about.
> I am sharing NFSv4 shares on both (2) NICs.
>
>
> NFS v4 client:
>
> client42:~# uname -a
> Linux client42 2.6.18-5-686 #1 SMP Wed Oct 3 00:12:50 UTC 2007 i686 GNU/Linux
> client42:~# apt-show-versions | grep nfs
> libnfsidmap2/etch uptodate 0.18-0
> nfs-common/etch uptodate 1:1.0.10-6+etch.1
> client42:~#
>
>
> mount
> client42:~# mount
> ...omittted non v4 shares...
> yadda:/etch-llocal on /net/LLocal4 type nfs4
> (rw,hard,intr,rsize=32768,wsize=32768,addr=192.168.105.123)
> client42:~>
>
>
>
>
> NFS v4 server:
>
> yadda:~# apt-show-versions | grep -i nfs
> nfs-common/etch uptodate 1:1.0.10-6+etch.1
> nfs-kernel-server/etch uptodate 1:1.0.10-6+etch.1
> libnfsidmap2/etch uptodate 0.18-0
> yadda:~#
>
>
> I'm not sure that this is the answer, but somewhere I saw a thread suggesting
> that I may need to name the host like this in my hosts table on my NFSv4 client:
>
> 127.0.0.1 localhost
> 127.0.1.1 client42
>
> If this is the solution (is it?)

I suspect that'd be a step backwards.

--b.

2007-11-07 00:37:54

by Myklebust, Trond

Subject: [PATCH 2/7] SUNRPC: Fix TCP rebinding logic

From: Trond Myklebust <[email protected]>

Currently the TCP rebinding logic assumes that if we're not using a
reserved port, then we don't need to reconnect on the same port if a
disconnection event occurs. This breaks most RPC duplicate reply cache
implementations.

Also take into account the fact that xprt_min_resvport and
xprt_max_resvport may change while we're reconnecting, since the user may
change them at any time via the sysctls. Ensure that we check the port
boundaries every time we loop in xs_bind4/xs_bind6. Also ensure that if the
boundaries change, we only scan the ports a maximum of 2 times.

Signed-off-by: Trond Myklebust <[email protected]>
---

net/sunrpc/xprtsock.c | 59 ++++++++++++++++++++++++++++++++-----------------
1 files changed, 38 insertions(+), 21 deletions(-)

diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
index 322e4e2..5a83a40 100644
--- a/net/sunrpc/xprtsock.c
+++ b/net/sunrpc/xprtsock.c
@@ -1272,34 +1272,53 @@ static void xs_set_port(struct rpc_xprt *xprt, unsigned short port)
}
}

+static unsigned short xs_get_srcport(struct sock_xprt *transport, struct socket *sock)
+{
+ unsigned short port = transport->port;
+
+ if (port == 0 && transport->xprt.resvport)
+ port = xs_get_random_port();
+ return port;
+}
+
+static unsigned short xs_next_srcport(struct sock_xprt *transport, struct socket *sock, unsigned short port)
+{
+ if (transport->port != 0)
+ transport->port = 0;
+ if (!transport->xprt.resvport)
+ return 0;
+ if (port <= xprt_min_resvport || port > xprt_max_resvport)
+ return xprt_max_resvport;
+ return --port;
+}
+
static int xs_bind4(struct sock_xprt *transport, struct socket *sock)
{
struct sockaddr_in myaddr = {
.sin_family = AF_INET,
};
struct sockaddr_in *sa;
- int err;
- unsigned short port = transport->port;
+ int err, nloop = 0;
+ unsigned short port = xs_get_srcport(transport, sock);
+ unsigned short last;

- if (!transport->xprt.resvport)
- port = 0;
sa = (struct sockaddr_in *)&transport->addr;
myaddr.sin_addr = sa->sin_addr;
do {
myaddr.sin_port = htons(port);
err = kernel_bind(sock, (struct sockaddr *) &myaddr,
sizeof(myaddr));
- if (!transport->xprt.resvport)
+ if (port == 0)
break;
if (err == 0) {
transport->port = port;
break;
}
- if (port <= xprt_min_resvport)
- port = xprt_max_resvport;
- else
- port--;
- } while (err == -EADDRINUSE && port != transport->port);
+ last = port;
+ port = xs_next_srcport(transport, sock, port);
+ if (port > last)
+ nloop++;
+ } while (err == -EADDRINUSE && nloop != 2);
dprintk("RPC: %s "NIPQUAD_FMT":%u: %s (%d)\n",
__FUNCTION__, NIPQUAD(myaddr.sin_addr),
port, err ? "failed" : "ok", err);
@@ -1312,28 +1331,27 @@ static int xs_bind6(struct sock_xprt *transport, struct socket *sock)
.sin6_family = AF_INET6,
};
struct sockaddr_in6 *sa;
- int err;
- unsigned short port = transport->port;
+ int err, nloop = 0;
+ unsigned short port = xs_get_srcport(transport, sock);
+ unsigned short last;

- if (!transport->xprt.resvport)
- port = 0;
sa = (struct sockaddr_in6 *)&transport->addr;
myaddr.sin6_addr = sa->sin6_addr;
do {
myaddr.sin6_port = htons(port);
err = kernel_bind(sock, (struct sockaddr *) &myaddr,
sizeof(myaddr));
- if (!transport->xprt.resvport)
+ if (port == 0)
break;
if (err == 0) {
transport->port = port;
break;
}
- if (port <= xprt_min_resvport)
- port = xprt_max_resvport;
- else
- port--;
- } while (err == -EADDRINUSE && port != transport->port);
+ last = port;
+ port = xs_next_srcport(transport, sock, port);
+ if (port > last)
+ nloop++;
+ } while (err == -EADDRINUSE && nloop != 2);
dprintk("RPC: xs_bind6 "NIP6_FMT":%u: %s (%d)\n",
NIP6(myaddr.sin6_addr), port, err ? "failed" : "ok", err);
return err;
@@ -1815,7 +1833,6 @@ static struct rpc_xprt *xs_setup_xprt(struct xprt_create *args,
xprt->addrlen = args->addrlen;
if (args->srcaddr)
memcpy(&new->addr, args->srcaddr, args->addrlen);
- new->port = xs_get_random_port();

return xprt;
}



2007-11-07 00:37:59

by Myklebust, Trond

Subject: [PATCH 3/7] SUNRPC: Allow the client to detect if the TCP connection is closed

From: Trond Myklebust <[email protected]>

Add an xprt->state bit to enable the TCP ->state_change() method to signal
whether or not the TCP connection is in the process of closing down.
This will be used by the reconnection logic in a separate patch.

Signed-off-by: Trond Myklebust <[email protected]>
---

include/linux/sunrpc/xprt.h | 1 +
net/sunrpc/xprtsock.c | 20 +++++++++++++++++---
2 files changed, 18 insertions(+), 3 deletions(-)

diff --git a/include/linux/sunrpc/xprt.h b/include/linux/sunrpc/xprt.h
index 6f524a9..afb9e6a 100644
--- a/include/linux/sunrpc/xprt.h
+++ b/include/linux/sunrpc/xprt.h
@@ -257,6 +257,7 @@ void xprt_force_disconnect(struct rpc_xprt *xprt);
#define XPRT_CLOSE_WAIT (3)
#define XPRT_BOUND (4)
#define XPRT_BINDING (5)
+#define XPRT_CLOSING (6)

static inline void xprt_set_connected(struct rpc_xprt *xprt)
{
diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
index 5a83a40..99c0166 100644
--- a/net/sunrpc/xprtsock.c
+++ b/net/sunrpc/xprtsock.c
@@ -758,7 +758,9 @@ static void xs_close(struct rpc_xprt *xprt)
sock_release(sock);
clear_close_wait:
smp_mb__before_clear_bit();
+ clear_bit(XPRT_CONNECTED, &xprt->state);
clear_bit(XPRT_CLOSE_WAIT, &xprt->state);
+ clear_bit(XPRT_CLOSING, &xprt->state);
smp_mb__after_clear_bit();
}

@@ -1114,13 +1116,25 @@ static void xs_tcp_state_change(struct sock *sk)
}
spin_unlock_bh(&xprt->transport_lock);
break;
- case TCP_SYN_SENT:
- case TCP_SYN_RECV:
+ case TCP_FIN_WAIT1:
+ /* The client initiated a shutdown of the socket */
+ set_bit(XPRT_CLOSING, &xprt->state);
+ smp_mb__before_clear_bit();
+ clear_bit(XPRT_CONNECTED, &xprt->state);
+ smp_mb__after_clear_bit();
break;
case TCP_CLOSE_WAIT:
+ /* The server initiated a shutdown of the socket */
+ set_bit(XPRT_CLOSING, &xprt->state);
+ smp_mb__before_clear_bit();
+ clear_bit(XPRT_CONNECTED, &xprt->state);
+ smp_mb__after_clear_bit();
xprt_force_disconnect(xprt);
- default:
+ break;
+ case TCP_CLOSE:
+ /* Mark transport as closed and wake up all pending tasks */
xprt_disconnect(xprt);
+ clear_bit(XPRT_CLOSING, &xprt->state);
}
out:
read_unlock(&sk->sk_callback_lock);



2007-11-07 00:38:11

by Myklebust, Trond

Subject: [PATCH 6/7] SUNRPC: Make call_status()/call_decode() call xprt_force_disconnect()

From: Trond Myklebust <[email protected]>

Move the calls to xprt_disconnect() over to xprt_force_disconnect() in
order to enable the transport layer to manage the state of the
XPRT_CONNECTED flag.

Signed-off-by: Trond Myklebust <[email protected]>
---

net/sunrpc/clnt.c | 4 ++--
1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c
index 76be83e..046d8f6 100644
--- a/net/sunrpc/clnt.c
+++ b/net/sunrpc/clnt.c
@@ -1137,7 +1137,7 @@ call_status(struct rpc_task *task)
case -ETIMEDOUT:
task->tk_action = call_timeout;
if (task->tk_client->cl_discrtry)
- xprt_disconnect(task->tk_xprt);
+ xprt_force_disconnect(task->tk_xprt);
break;
case -ECONNREFUSED:
case -ENOTCONN:
@@ -1260,7 +1260,7 @@ out_retry:
req->rq_received = req->rq_private_buf.len = 0;
task->tk_status = 0;
if (task->tk_client->cl_discrtry)
- xprt_disconnect(task->tk_xprt);
+ xprt_force_disconnect(task->tk_xprt);
}

/*



2007-11-07 00:38:05

by Myklebust, Trond

Subject: [PATCH 5/7] SUNRPC: xprt_autoclose() should not call xprt_disconnect()

From: Trond Myklebust <[email protected]>

The transport layer should do that itself whenever appropriate.

Note that the RDMA transport already assumes that it needs to call
xprt_disconnect in xprt_rdma_close().
For TCP sockets, we want to call xprt_disconnect() only after the
connection has been closed by both ends.

Signed-off-by: Trond Myklebust <[email protected]>
---

net/sunrpc/xprt.c | 1 -
net/sunrpc/xprtsock.c | 3 +--
2 files changed, 1 insertions(+), 3 deletions(-)

diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c
index 48c5a8b..fcdc4b8 100644
--- a/net/sunrpc/xprt.c
+++ b/net/sunrpc/xprt.c
@@ -568,7 +568,6 @@ static void xprt_autoclose(struct work_struct *work)
struct rpc_xprt *xprt =
container_of(work, struct rpc_xprt, task_cleanup);

- xprt_disconnect(xprt);
xprt->ops->close(xprt);
clear_bit(XPRT_CLOSE_WAIT, &xprt->state);
xprt_release_write(xprt, NULL);
diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
index d610d28..c0aa2b4 100644
--- a/net/sunrpc/xprtsock.c
+++ b/net/sunrpc/xprtsock.c
@@ -786,10 +786,10 @@ static void xs_close(struct rpc_xprt *xprt)
sock_release(sock);
clear_close_wait:
smp_mb__before_clear_bit();
- clear_bit(XPRT_CONNECTED, &xprt->state);
clear_bit(XPRT_CLOSE_WAIT, &xprt->state);
clear_bit(XPRT_CLOSING, &xprt->state);
smp_mb__after_clear_bit();
+ xprt_disconnect(xprt);
}

/**
@@ -805,7 +805,6 @@ static void xs_destroy(struct rpc_xprt *xprt)

cancel_rearming_delayed_work(&transport->connect_worker);

- xprt_disconnect(xprt);
xs_close(xprt);
xs_free_peer_addresses(xprt);
kfree(xprt->slot);



2007-11-07 00:38:15

by Myklebust, Trond

Subject: [PATCH 7/7] SUNRPC: Rename xprt_disconnect()

From: Trond Myklebust <[email protected]>

xprt_disconnect() should really only be called when the transport shutdown
is completed, and it is time to wake up any pending tasks. Rename it to
xprt_disconnect_done() in order to reflect the semantic change.

Signed-off-by: Trond Myklebust <[email protected]>
---

include/linux/sunrpc/xprt.h | 2 +-
net/sunrpc/xprt.c | 6 +++---
net/sunrpc/xprtrdma/transport.c | 4 ++--
net/sunrpc/xprtsock.c | 6 +++---
4 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/include/linux/sunrpc/xprt.h b/include/linux/sunrpc/xprt.h
index afb9e6a..2554cd2 100644
--- a/include/linux/sunrpc/xprt.h
+++ b/include/linux/sunrpc/xprt.h
@@ -245,7 +245,7 @@ void xprt_adjust_cwnd(struct rpc_task *task, int result);
struct rpc_rqst * xprt_lookup_rqst(struct rpc_xprt *xprt, __be32 xid);
void xprt_complete_rqst(struct rpc_task *task, int copied);
void xprt_release_rqst_cong(struct rpc_task *task);
-void xprt_disconnect(struct rpc_xprt *xprt);
+void xprt_disconnect_done(struct rpc_xprt *xprt);
void xprt_force_disconnect(struct rpc_xprt *xprt);

/*
diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c
index fcdc4b8..4d71152 100644
--- a/net/sunrpc/xprt.c
+++ b/net/sunrpc/xprt.c
@@ -574,11 +574,11 @@ static void xprt_autoclose(struct work_struct *work)
}

/**
- * xprt_disconnect - mark a transport as disconnected
+ * xprt_disconnect_done - mark a transport as disconnected
* @xprt: transport to flag for disconnect
*
*/
-void xprt_disconnect(struct rpc_xprt *xprt)
+void xprt_disconnect_done(struct rpc_xprt *xprt)
{
dprintk("RPC: disconnected transport %p\n", xprt);
spin_lock_bh(&xprt->transport_lock);
@@ -586,7 +586,7 @@ void xprt_disconnect(struct rpc_xprt *xprt)
xprt_wake_pending_tasks(xprt, -ENOTCONN);
spin_unlock_bh(&xprt->transport_lock);
}
-EXPORT_SYMBOL_GPL(xprt_disconnect);
+EXPORT_SYMBOL_GPL(xprt_disconnect_done);

/**
* xprt_force_disconnect - force a transport to disconnect
diff --git a/net/sunrpc/xprtrdma/transport.c b/net/sunrpc/xprtrdma/transport.c
index dc55cc9..06e8b50 100644
--- a/net/sunrpc/xprtrdma/transport.c
+++ b/net/sunrpc/xprtrdma/transport.c
@@ -449,7 +449,7 @@ xprt_rdma_close(struct rpc_xprt *xprt)
struct rpcrdma_xprt *r_xprt = rpcx_to_rdmax(xprt);

dprintk("RPC: %s: closing\n", __func__);
- xprt_disconnect(xprt);
+ xprt_disconnect_done(xprt);
(void) rpcrdma_ep_disconnect(&r_xprt->rx_ep, &r_xprt->rx_ia);
}

@@ -682,7 +682,7 @@ xprt_rdma_send_request(struct rpc_task *task)
}

if (rpcrdma_ep_post(&r_xprt->rx_ia, &r_xprt->rx_ep, req)) {
- xprt_disconnect(xprt);
+ xprt_disconnect_done(xprt);
return -ENOTCONN; /* implies disconnect */
}

diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
index c0aa2b4..a7d53d1 100644
--- a/net/sunrpc/xprtsock.c
+++ b/net/sunrpc/xprtsock.c
@@ -789,7 +789,7 @@ clear_close_wait:
clear_bit(XPRT_CLOSE_WAIT, &xprt->state);
clear_bit(XPRT_CLOSING, &xprt->state);
smp_mb__after_clear_bit();
- xprt_disconnect(xprt);
+ xprt_disconnect_done(xprt);
}

/**
@@ -911,7 +911,7 @@ static inline void xs_tcp_read_fraghdr(struct rpc_xprt *xprt, struct xdr_skb_rea
/* Sanity check of the record length */
if (unlikely(transport->tcp_reclen < 4)) {
dprintk("RPC: invalid TCP record fragment length\n");
- xprt_disconnect(xprt);
+ xprt_force_disconnect(xprt);
return;
}
dprintk("RPC: reading TCP record fragment of length %d\n",
@@ -1160,7 +1160,7 @@ static void xs_tcp_state_change(struct sock *sk)
break;
case TCP_CLOSE:
/* Mark transport as closed and wake up all pending tasks */
- xprt_disconnect(xprt);
+ xprt_disconnect_done(xprt);
clear_bit(XPRT_CLOSING, &xprt->state);
}
out:



2007-11-07 00:38:19

by Myklebust, Trond

[permalink] [raw]
Subject: [PATCH 4/7] SUNRPC: Use shutdown() instead of close() when disconnecting a TCP socket

From: Trond Myklebust <[email protected]>

By using shutdown() rather than close() we allow the RPC client to wait
for the TCP close handshake to complete before we start trying to reconnect
using the same port.
We use shutdown(SHUT_WR) only, instead of shutting down both directions;
we then wait until the server has closed the connection on its side.
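
As an aside, the kernel-side sequence corresponds to the classic userspace
half-close idiom. A minimal sketch, assuming an already-connected socket fd
(the helper name graceful_close() is ours, not part of the patch):

#include <sys/socket.h>
#include <unistd.h>

static int graceful_close(int fd)
{
        char buf[4096];
        ssize_t n;

        if (shutdown(fd, SHUT_WR) < 0)  /* send our FIN, keep receiving */
                return -1;
        while ((n = read(fd, buf, sizeof(buf))) > 0)
                ;                       /* drain replies still in flight */
        /* n == 0 means the server's FIN arrived: the handshake is done */
        return close(fd);
}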

Signed-off-by: Trond Myklebust <[email protected]>
---

net/sunrpc/xprtsock.c | 53 +++++++++++++++++++++++++++++++++++++++++++------
1 files changed, 46 insertions(+), 7 deletions(-)

diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
index 99c0166..d610d28 100644
--- a/net/sunrpc/xprtsock.c
+++ b/net/sunrpc/xprtsock.c
@@ -614,6 +614,34 @@ static int xs_udp_send_request(struct rpc_task *task)
return status;
}

+static int xs_shutdown(struct socket *sock, int how)
+{
+ /*
+ * Note: 'how - 1' trick converts
+ * RCV_SHUTDOWN -> SHUT_RD = 0
+ * SEND_SHUTDOWN -> SHUT_WR = 1
+ * RCV_SHUTDOWN|SEND_SHUTDOWN -> SHUT_RDWR = 2
+ */
+ return sock->ops->shutdown(sock, how - 1);
+}
+
+/**
+ * xs_tcp_shutdown - gracefully shut down a TCP socket
+ * @xprt: transport
+ *
+ * Initiates a graceful shutdown of the TCP socket by calling the
+ * equivalent of shutdown(SHUT_WR);
+ */
+static void xs_tcp_shutdown(struct rpc_xprt *xprt)
+{
+ struct sock_xprt *transport = container_of(xprt, struct sock_xprt, xprt);
+ struct socket *sock = transport->sock;
+
+ if (sock != NULL)
+ xs_shutdown(sock, SEND_SHUTDOWN);
+ clear_bit(XPRT_CLOSE_WAIT, &xprt->state);
+}
+
static inline void xs_encode_tcp_record_marker(struct xdr_buf *buf)
{
u32 reclen = buf->len - sizeof(rpc_fraghdr);
@@ -691,7 +719,7 @@ static int xs_tcp_send_request(struct rpc_task *task)
default:
dprintk("RPC: sendmsg returned unrecognized error %d\n",
-status);
- xprt_disconnect(xprt);
+ xs_tcp_shutdown(xprt);
break;
}

@@ -1627,8 +1655,7 @@ static void xs_tcp_connect_worker4(struct work_struct *work)
break;
default:
/* get rid of existing socket, and retry */
- xs_close(xprt);
- break;
+ xs_tcp_shutdown(xprt);
}
}
out:
@@ -1687,8 +1714,7 @@ static void xs_tcp_connect_worker6(struct work_struct *work)
break;
default:
/* get rid of existing socket, and retry */
- xs_close(xprt);
- break;
+ xs_tcp_shutdown(xprt);
}
}
out:
@@ -1735,6 +1761,19 @@ static void xs_connect(struct rpc_task *task)
}
}

+static void xs_tcp_connect(struct rpc_task *task)
+{
+ struct rpc_xprt *xprt = task->tk_xprt;
+
+ /* Initiate graceful shutdown of the socket if not already done */
+ if (!test_bit(XPRT_CONNECTING, &xprt->state))
+ xs_tcp_shutdown(xprt);
+ /* Exit if we need to wait for socket shutdown to complete */
+ if (test_bit(XPRT_CLOSING, &xprt->state))
+ return;
+ xs_connect(task);
+}
+
/**
* xs_udp_print_stats - display UDP socket-specifc stats
* @xprt: rpc_xprt struct containing statistics
@@ -1805,12 +1844,12 @@ static struct rpc_xprt_ops xs_tcp_ops = {
.release_xprt = xs_tcp_release_xprt,
.rpcbind = rpcb_getport_async,
.set_port = xs_set_port,
- .connect = xs_connect,
+ .connect = xs_tcp_connect,
.buf_alloc = rpc_malloc,
.buf_free = rpc_free,
.send_request = xs_tcp_send_request,
.set_retrans_timeout = xprt_set_retrans_timeout_def,
- .close = xs_close,
+ .close = xs_tcp_shutdown,
.destroy = xs_destroy,
.print_stats = xs_tcp_print_stats,
};



2007-11-07 00:38:29

by Myklebust, Trond

[permalink] [raw]
Subject: [PATCH 1/7] SUNRPC: Fix a race in xs_tcp_state_change()

From: Trond Myklebust <[email protected]>

When scheduling the autoclose RPC call, we want to ensure that we don't
race against the test_bit() call in xprt_clear_locked().
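
To make the window concrete, here is a hedged userspace model of the race
(purely illustrative; the flag names mirror the kernel bits, but this is not
kernel code). Without a common lock, the releasing path can test close_wait
before it is set while the disconnect path still sees the lock held, so the
autoclose work is never queued; serializing both paths on the same lock, as
the patch does with xprt->transport_lock, closes that window:

#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

static pthread_mutex_t transport_lock = PTHREAD_MUTEX_INITIALIZER;
static bool close_wait;            /* models XPRT_CLOSE_WAIT */
static bool locked = true;         /* models XPRT_LOCKED, held by a sender */
static bool autoclose_queued;

/* A sender releasing the lock (models xprt_clear_locked()). */
static void *release_path(void *unused)
{
        pthread_mutex_lock(&transport_lock);
        if (close_wait)
                autoclose_queued = true;  /* hand the lock to autoclose */
        else
                locked = false;
        pthread_mutex_unlock(&transport_lock);
        return NULL;
}

/* The state-change callback (models xprt_force_disconnect()). */
static void *disconnect_path(void *unused)
{
        pthread_mutex_lock(&transport_lock);  /* the fix: same lock as above */
        close_wait = true;
        if (!locked) {
                locked = true;
                autoclose_queued = true;
        }
        pthread_mutex_unlock(&transport_lock);
        return NULL;
}

int main(void)
{
        pthread_t t1, t2;

        pthread_create(&t1, NULL, release_path, NULL);
        pthread_create(&t2, NULL, disconnect_path, NULL);
        pthread_join(t1, NULL);
        pthread_join(t2, NULL);
        /* With both paths serialized, this always prints "queued". */
        printf("autoclose %s\n", autoclose_queued ? "queued" : "LOST");
        return 0;
}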

Signed-off-by: Trond Myklebust <[email protected]>
---

include/linux/sunrpc/xprt.h | 1 +
net/sunrpc/xprt.c | 20 ++++++++++++++++++++
net/sunrpc/xprtsock.c | 5 +----
3 files changed, 22 insertions(+), 4 deletions(-)

diff --git a/include/linux/sunrpc/xprt.h b/include/linux/sunrpc/xprt.h
index 30b17b3..6f524a9 100644
--- a/include/linux/sunrpc/xprt.h
+++ b/include/linux/sunrpc/xprt.h
@@ -246,6 +246,7 @@ struct rpc_rqst * xprt_lookup_rqst(struct rpc_xprt *xprt, __be32 xid);
void xprt_complete_rqst(struct rpc_task *task, int copied);
void xprt_release_rqst_cong(struct rpc_task *task);
void xprt_disconnect(struct rpc_xprt *xprt);
+void xprt_force_disconnect(struct rpc_xprt *xprt);

/*
* Reserved bit positions in xprt->state
diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c
index 282a9a2..48c5a8b 100644
--- a/net/sunrpc/xprt.c
+++ b/net/sunrpc/xprt.c
@@ -570,6 +570,7 @@ static void xprt_autoclose(struct work_struct *work)

xprt_disconnect(xprt);
xprt->ops->close(xprt);
+ clear_bit(XPRT_CLOSE_WAIT, &xprt->state);
xprt_release_write(xprt, NULL);
}

@@ -588,6 +589,25 @@ void xprt_disconnect(struct rpc_xprt *xprt)
}
EXPORT_SYMBOL_GPL(xprt_disconnect);

+/**
+ * xprt_force_disconnect - force a transport to disconnect
+ * @xprt: transport to disconnect
+ *
+ */
+void xprt_force_disconnect(struct rpc_xprt *xprt)
+{
+ /* Don't race with the test_bit() in xprt_clear_locked() */
+ spin_lock_bh(&xprt->transport_lock);
+ set_bit(XPRT_CLOSE_WAIT, &xprt->state);
+ /* Try to schedule an autoclose RPC call */
+ if (test_and_set_bit(XPRT_LOCKED, &xprt->state) == 0)
+ queue_work(rpciod_workqueue, &xprt->task_cleanup);
+ else if (xprt->snd_task != NULL)
+ rpc_wake_up_task(xprt->snd_task);
+ spin_unlock_bh(&xprt->transport_lock);
+}
+EXPORT_SYMBOL_GPL(xprt_force_disconnect);
+
static void
xprt_init_autodisconnect(unsigned long data)
{
diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
index 02298f5..322e4e2 100644
--- a/net/sunrpc/xprtsock.c
+++ b/net/sunrpc/xprtsock.c
@@ -1118,10 +1118,7 @@ static void xs_tcp_state_change(struct sock *sk)
case TCP_SYN_RECV:
break;
case TCP_CLOSE_WAIT:
- /* Try to schedule an autoclose RPC calls */
- set_bit(XPRT_CLOSE_WAIT, &xprt->state);
- if (test_and_set_bit(XPRT_LOCKED, &xprt->state) == 0)
- queue_work(rpciod_workqueue, &xprt->task_cleanup);
+ xprt_force_disconnect(xprt);
default:
xprt_disconnect(xprt);
}



2007-11-07 22:32:43

by Myklebust, Trond

[permalink] [raw]
Subject: Re: [NFS] [PATCH 1/7] SUNRPC: Fix a race in xs_tcp_state_change()


On Wed, 2007-11-07 at 17:21 -0500, Chuck Lever wrote:
> Trond Myklebust wrote:
> > From: Trond Myklebust <[email protected]>
> >
> > When scheduling the autoclose RPC call, we want to ensure that we don't
> > race against the test_bit() call in xprt_clear_locked().
> >
> > Signed-off-by: Trond Myklebust <[email protected]>
> > ---
> >
> > include/linux/sunrpc/xprt.h | 1 +
> > net/sunrpc/xprt.c | 20 ++++++++++++++++++++
> > net/sunrpc/xprtsock.c | 5 +----
> > 3 files changed, 22 insertions(+), 4 deletions(-)
> >
> > diff --git a/include/linux/sunrpc/xprt.h b/include/linux/sunrpc/xprt.h
> > index 30b17b3..6f524a9 100644
> > --- a/include/linux/sunrpc/xprt.h
> > +++ b/include/linux/sunrpc/xprt.h
> > @@ -246,6 +246,7 @@ struct rpc_rqst * xprt_lookup_rqst(struct rpc_xprt *xprt, __be32 xid);
> > void xprt_complete_rqst(struct rpc_task *task, int copied);
> > void xprt_release_rqst_cong(struct rpc_task *task);
> > void xprt_disconnect(struct rpc_xprt *xprt);
> > +void xprt_force_disconnect(struct rpc_xprt *xprt);
> >
> > /*
> > * Reserved bit positions in xprt->state
> > diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c
> > index 282a9a2..48c5a8b 100644
> > --- a/net/sunrpc/xprt.c
> > +++ b/net/sunrpc/xprt.c
> > @@ -570,6 +570,7 @@ static void xprt_autoclose(struct work_struct *work)
> >
> > xprt_disconnect(xprt);
> > xprt->ops->close(xprt);
> > + clear_bit(XPRT_CLOSE_WAIT, &xprt->state);
>
> xs_close already clears the CLOSE_WAIT bit, and so does xs_tcp_shutdown
> (added in a later patch). So this hunk appears to be unnecessary.
>
> But xprt_rdma_close doesn't clear CLOSE_WAIT, it appears. Do we want to
> copy (some of) the logic from the end of xs_close into xprt_rdma_close?

Nah. I'd like to move the XPRT_CLOSE_WAIT stuff into xprt.c in order to
convert it into a generic way of telling the transport layer that we
want it to shut down. It is already in the xprt_clear_locked() logic, so
this is really just a cleanup of the current hack.

> > xprt_release_write(xprt, NULL);
> > }
> >
> > @@ -588,6 +589,25 @@ void xprt_disconnect(struct rpc_xprt *xprt)
> > }
> > EXPORT_SYMBOL_GPL(xprt_disconnect);
> >
> > +/**
> > + * xprt_force_disconnect - force a transport to disconnect
> > + * @xprt: transport to disconnect
> > + *
> > + */
> > +void xprt_force_disconnect(struct rpc_xprt *xprt)
> > +{
> > + /* Don't race with the test_bit() in xprt_clear_locked() */
> > + spin_lock_bh(&xprt->transport_lock);
> > + set_bit(XPRT_CLOSE_WAIT, &xprt->state);
> > + /* Try to schedule an autoclose RPC call */
> > + if (test_and_set_bit(XPRT_LOCKED, &xprt->state) == 0)
> > + queue_work(rpciod_workqueue, &xprt->task_cleanup);
>
> > + else if (xprt->snd_task != NULL)
> > + rpc_wake_up_task(xprt->snd_task);
>
> What's this new bit of logic for?

It ensures that if the current holder of the XPRT_LOCKED bit is
sleeping, it gets woken up so that we can disconnect. The only
reason why the current holder might be sleeping would be if it is
waiting for buffer space to transmit (or possibly it might be trying to
reconnect - we might want to add a test for XPRT_CONNECTED too).

> > + spin_unlock_bh(&xprt->transport_lock);
> > +}
> > +EXPORT_SYMBOL_GPL(xprt_force_disconnect);
> > +
> > static void
> > xprt_init_autodisconnect(unsigned long data)
> > {
> > diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
> > index 02298f5..322e4e2 100644
> > --- a/net/sunrpc/xprtsock.c
> > +++ b/net/sunrpc/xprtsock.c
> > @@ -1118,10 +1118,7 @@ static void xs_tcp_state_change(struct sock *sk)
> > case TCP_SYN_RECV:
> > break;
> > case TCP_CLOSE_WAIT:
> > - /* Try to schedule an autoclose RPC calls */
> > - set_bit(XPRT_CLOSE_WAIT, &xprt->state);
> > - if (test_and_set_bit(XPRT_LOCKED, &xprt->state) == 0)
> > - queue_work(rpciod_workqueue, &xprt->task_cleanup);
> > + xprt_force_disconnect(xprt);
> > default:
> > xprt_disconnect(xprt);
> > }

2007-11-07 23:06:02

by Myklebust, Trond

[permalink] [raw]
Subject: Re: [PATCH 2/7] SUNRPC: Fix TCP rebinding logic


On Wed, 2007-11-07 at 17:48 -0500, Chuck Lever wrote:
> Trond Myklebust wrote:
> > From: Trond Myklebust <[email protected]>
> >
> > Currently the TCP rebinding logic assumes that if we're not using a
> > reserved port, then we don't need to reconnect on the same port if a
> > disconnection event occurs.
>
> As Johnny Carson used to say: "I did not know that."
>
> I had assumed we always reused the port number whether a privileged port
> had been requested or not.

Looking at the code, this appears not to be the case.

> > This breaks most RPC duplicate reply cache
> > implementations.
> >
> > Also take into account the fact that xprt_min_resvport and
> > xprt_max_resvport may change while we're reconnecting, since the user may
> > change them at any time via the sysctls. Ensure that we check the port
> > boundaries every time we loop in xs_bind4/xs_bind6. Also ensure that if the
> > boundaries change, we only scan the ports a maximum of 2 times.
> >
> > Signed-off-by: Trond Myklebust <[email protected]>
> > ---
> >
> > net/sunrpc/xprtsock.c | 59 ++++++++++++++++++++++++++++++++-----------------
> > 1 files changed, 38 insertions(+), 21 deletions(-)
> >
> > diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
> > index 322e4e2..5a83a40 100644
> > --- a/net/sunrpc/xprtsock.c
> > +++ b/net/sunrpc/xprtsock.c
> > @@ -1272,34 +1272,53 @@ static void xs_set_port(struct rpc_xprt *xprt, unsigned short port)
> > }
> > }
> >
> > +static unsigned short xs_get_srcport(struct sock_xprt *transport, struct socket *sock)
>
> Long line.
>
> > +{
> > + unsigned short port = transport->port;
> > +
> > + if (port == 0 && transport->xprt.resvport)
> > + port = xs_get_random_port();
>
> I don't see a reason not to get rid of xs_get_random_port and move that
> logic in here.

Keeping it makes the code easier to read, so I'd prefer to do so.

> > + return port;
> > +}
> > +
> > +static unsigned short xs_next_srcport(struct sock_xprt *transport, struct socket *sock, unsigned short port)
>
> Long line.
>
> > +{
> > + if (transport->port != 0)
> > + transport->port = 0;
> > + if (!transport->xprt.resvport)
> > + return 0;
> > + if (port <= xprt_min_resvport || port > xprt_max_resvport)
> > + return xprt_max_resvport;
> > + return --port;
> > +}
> > +
> > static int xs_bind4(struct sock_xprt *transport, struct socket *sock)
> > {
> > struct sockaddr_in myaddr = {
> > .sin_family = AF_INET,
> > };
> > struct sockaddr_in *sa;
> > - int err;
> > - unsigned short port = transport->port;
> > + int err, nloop = 0;
> > + unsigned short port = xs_get_srcport(transport, sock);
> > + unsigned short last;
> >
> > - if (!transport->xprt.resvport)
> > - port = 0;
> > sa = (struct sockaddr_in *)&transport->addr;
> > myaddr.sin_addr = sa->sin_addr;
> > do {
> > myaddr.sin_port = htons(port);
> > err = kernel_bind(sock, (struct sockaddr *) &myaddr,
> > sizeof(myaddr));
> > - if (!transport->xprt.resvport)
> > + if (port == 0)
> > break;
> > if (err == 0) {
> > transport->port = port;
> > break;
> > }
> > - if (port <= xprt_min_resvport)
> > - port = xprt_max_resvport;
> > - else
> > - port--;
> > - } while (err == -EADDRINUSE && port != transport->port);
> > + last = port;
> > + port = xs_next_srcport(transport, sock, port);
> > + if (port > last)
> > + nloop++;
>
> It seems like there are cases where a user can adjust the port range and
> it would defeat this check. For example, if the port range is 30 to 40,
> and the user changes it to 10 to 20, we keep looping.

Yes, but once we hit 10, nloop gets bumped. It is guaranteed to
get bumped if we hit 0, or if the administrator increases the minimum
port number past the current one.
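
A small standalone model of the loop bound may help; it replays the scenario
above (range shrunk from 30..40 to 10..20, with every bind failing) and shows
the scan stopping after at most two passes. The helper next_srcport()
paraphrases xs_next_srcport() for the reserved-port case; the port values
are just the example's:

#include <stdio.h>

static unsigned short xprt_min_resvport = 10;   /* new, shrunken range */
static unsigned short xprt_max_resvport = 20;

static unsigned short next_srcport(unsigned short port)
{
        if (port <= xprt_min_resvport || port > xprt_max_resvport)
                return xprt_max_resvport;       /* wrap to the top */
        return --port;
}

int main(void)
{
        unsigned short port = 35;  /* stale port from the old 30..40 range */
        int nloop = 0;

        do {    /* pretend every bind attempt fails with EADDRINUSE */
                unsigned short last = port;

                port = next_srcport(port);
                if (port > last)
                        nloop++;   /* bumped on every wrap: at most 2 passes */
        } while (nloop != 2);
        printf("stopped at port %u after %d wraps\n", port, nloop);
        return 0;
}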

> Doesn't breaking out of the loop break "Hard" NFS requests?
>
> And I understand why you would want to copy the checks into a separate
> function (like, xs_bind6 uses the same checks), but it adds this extra
> little loop check at the end. I usually punt when that happens.

We've always had to deal with breaking out of the loops. If we don't, we
will deadlock the computer.
If this is a hard mount, then the only result should normally be that we
abort the connection, time out, and then try again. This has been the
behaviour for quite a while.
If, however, this is a soft mount, then it may result in an EIO getting
sent to the application.

> > + } while (err == -EADDRINUSE && nloop != 2);
> > dprintk("RPC: %s "NIPQUAD_FMT":%u: %s (%d)\n",
> > __FUNCTION__, NIPQUAD(myaddr.sin_addr),
> > port, err ? "failed" : "ok", err);
> > @@ -1312,28 +1331,27 @@ static int xs_bind6(struct sock_xprt *transport, struct socket *sock)
> > .sin6_family = AF_INET6,
> > };
> > struct sockaddr_in6 *sa;
> > - int err;
> > - unsigned short port = transport->port;
> > + int err, nloop = 0;
> > + unsigned short port = xs_get_srcport(transport, sock);
> > + unsigned short last;
> >
> > - if (!transport->xprt.resvport)
> > - port = 0;
> > sa = (struct sockaddr_in6 *)&transport->addr;
> > myaddr.sin6_addr = sa->sin6_addr;
> > do {
> > myaddr.sin6_port = htons(port);
> > err = kernel_bind(sock, (struct sockaddr *) &myaddr,
> > sizeof(myaddr));
> > - if (!transport->xprt.resvport)
> > + if (port == 0)
> > break;
> > if (err == 0) {
> > transport->port = port;
> > break;
> > }
> > - if (port <= xprt_min_resvport)
> > - port = xprt_max_resvport;
> > - else
> > - port--;
> > - } while (err == -EADDRINUSE && port != transport->port);
> > + last = port;
> > + port = xs_next_srcport(transport, sock, port);
> > + if (port > last)
> > + nloop++;
> > + } while (err == -EADDRINUSE && nloop != 2);
> > dprintk("RPC: xs_bind6 "NIP6_FMT":%u: %s (%d)\n",
> > NIP6(myaddr.sin6_addr), port, err ? "failed" : "ok", err);
> > return err;
> > @@ -1815,7 +1833,6 @@ static struct rpc_xprt *xs_setup_xprt(struct xprt_create *args,
> > xprt->addrlen = args->addrlen;
> > if (args->srcaddr)
> > memcpy(&new->addr, args->srcaddr, args->addrlen);
> > - new->port = xs_get_random_port();
> >
> > return xprt;
> > }
>
> Moving this little wart into xs_bind?() is a nice clean-up.


2007-11-07 23:45:38

by Myklebust, Trond

[permalink] [raw]
Subject: Re: [PATCH 2/7] SUNRPC: Fix TCP rebinding logic


On Wed, 2007-11-07 at 18:28 -0500, Chuck Lever wrote:
> > We've always had to deal with breaking out of the loops. If we don't, we
> > will deadlock the computer.
>
> In that case it might make sense to allow some scheduling between each
> socket bind attempt, and do only a single pass over the port range
> before breaking out. Then you could forget about whether the range
> changes -- you will use the new range the next time a connection is
> attempted.

No. We shouldn't ever schedule from an rpciod process context.

Trond


2007-11-07 23:47:28

by Matt Weatherford

[permalink] [raw]
Subject: setclientid: string in use on NFS v4 share on Debian Etch & hosts file "solution"


Hi,

I'm not sure this is the right place to ask, but I'm wondering if someone can
illuminate what the following console messages mean on my NFS v4 server...


NFSD: setclientid: string in use by client(clientid 471c1b08/000003e9)
NFSD: setclientid: string in use by client(clientid 471c1b08/00000410)
nfs4_cb: server 10.11.12.221 not responding, timed out
nfs4_cb: server 10.11.12.222 not responding, timed out
NFSD: setclientid: string in use by client(clientid 471c1b08/0000043a)
NFSD: setclientid: string in use by client(clientid 471c1b08/0000043c)
NFSD: setclientid: string in use by client(clientid 471c1b08/00000447)
NFSD: setclientid: string in use by client(clientid 471c1b08/0000044d)
NFSD: setclientid: string in use by client(clientid 471c1b08/00000453)
nfs4_cb: server 10.11.12.221 not responding, timed out

(and yes, those IPs (10.11.12.221, 222) are good)


I am on Debian Etch, on x86 hardware, using multiple NFS v4 clients with
all the latest package updates.

I have multiple other NFS clients which the server is not complaining about.
I am sharing NFSv4 shares on both (2) NICs.


NFS v4 client:

client42:~# uname -a
Linux client42 2.6.18-5-686 #1 SMP Wed Oct 3 00:12:50 UTC 2007 i686 GNU/Linux
client42:~# apt-show-versions | grep nfs
libnfsidmap2/etch uptodate 0.18-0
nfs-common/etch uptodate 1:1.0.10-6+etch.1
client42:~#


mount
client42:~# mount
...omitted non-v4 shares...
yadda:/etch-llocal on /net/LLocal4 type nfs4
(rw,hard,intr,rsize=32768,wsize=32768,addr=192.168.105.123)
client42:~>




NFS v4 server:

yadda:~# apt-show-versions | grep -i nfs
nfs-common/etch uptodate 1:1.0.10-6+etch.1
nfs-kernel-server/etch uptodate 1:1.0.10-6+etch.1
libnfsidmap2/etch uptodate 0.18-0
yadda:~#


I'm not sure that this is the answer, but somewhere I saw a thread suggesting
I may need to name the host like this in the hosts table on my NFSv4 client:

127.0.0.1 localhost
127.0.1.1 client42

If this is the solution (is it?), it's a hassle, since I already have a
hostname and an IP address for that hostname defined in a hosts table that I
push out to every cluster node, and I would like to keep them the same. Is
there another way to solve this problem that lets me keep all my hosts files
the same across my environment?

Thanks,

Matt





2007-11-07 23:59:13

by Myklebust, Trond

[permalink] [raw]
Subject: Re: [NFS] [PATCH 4/7] SUNRPC: Use shutdown() instead of close() when disconnecting a TCP socket


On Wed, 2007-11-07 at 18:11 -0500, Chuck Lever wrote:
> Trond Myklebust wrote:
> > From: Trond Myklebust <[email protected]>
> >
> > By using shutdown() rather than close() we allow the RPC client to wait
> > for the TCP close handshake to complete before we start trying to reconnect
> > using the same port.
> > We use shutdown(SHUT_WR) only, instead of shutting down both directions;
> > we then wait until the server has closed the connection on its side.
>
> Yeah, that seems like a more friendly and nuanced behavior, and is
> probably one of the more important features of this set of changes.
>
> > Signed-off-by: Trond Myklebust <[email protected]>
> > ---
> >
> > net/sunrpc/xprtsock.c | 53 +++++++++++++++++++++++++++++++++++++++++++------
> > 1 files changed, 46 insertions(+), 7 deletions(-)
> >
> > diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
> > index 99c0166..d610d28 100644
> > --- a/net/sunrpc/xprtsock.c
> > +++ b/net/sunrpc/xprtsock.c
> > @@ -614,6 +614,34 @@ static int xs_udp_send_request(struct rpc_task *task)
> > return status;
> > }
> >
> > +static int xs_shutdown(struct socket *sock, int how)
> > +{
> > + /*
> > + * Note: 'how - 1' trick converts
> > + * RCV_SHUTDOWN -> SHUT_RD = 0
> > + * SEND_SHUTDOWN -> SHUT_WR = 1
> > + * RCV_SHUTDOWN|SEND_SHUTDOWN -> SHUT_RDWR = 2
> > + */
> > + return sock->ops->shutdown(sock, how - 1);
> > +}
> > +
> > +/**
> > + * xs_tcp_shutdown - gracefully shut down a TCP socket
> > + * @xprt: transport
> > + *
> > + * Initiates a graceful shutdown of the TCP socket by calling the
> > + * equivalent of shutdown(SHUT_WR);
> > + */
> > +static void xs_tcp_shutdown(struct rpc_xprt *xprt)
> > +{
> > + struct sock_xprt *transport = container_of(xprt, struct sock_xprt, xprt);
> > + struct socket *sock = transport->sock;
> > +
> > + if (sock != NULL)
> > + xs_shutdown(sock, SEND_SHUTDOWN);
>
> I'm not sure why simply
>
> sock->ops->shutdown(sock, SHUT_WR);
>
> isn't adequate here.

trondmy@heimdal:~/devel/git/linux/kernel-2.6.x/linux-2.6$ git grep SHUT_WR
arch/um/os-Linux/file.c:#ifndef SHUT_WR
arch/um/os-Linux/file.c:#define SHUT_WR 1
arch/um/os-Linux/file.c: else if(w) what = SHUT_WR;
net/sctp/socket.c: * SHUT_WR

I suppose I could define SHUT_WR, but that would have to be part of a
separate patch, perhaps as part of a definition of a
kernel_sock_shutdown() wrapper. I wanted to get the RPC code right
first, then send in the cleanup via davem.
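
A possible shape for such a wrapper, sketched here under the assumption that
the kernel enum would mirror the userspace SHUT_* values (this is not in the
tree at this point in the thread):

/* sketch: include/linux/net.h */
enum sock_shutdown_cmd {
        SHUT_RD,
        SHUT_WR,
        SHUT_RDWR,
};

/* sketch: net/socket.c */
int kernel_sock_shutdown(struct socket *sock, enum sock_shutdown_cmd how)
{
        return sock->ops->shutdown(sock, how);
}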

BTW: Both CIFS and NBD appear to be calling sock->ops->shutdown() with a
SEND_SHUTDOWN argument. From what I can read in inet_shutdown() that is
incorrect. (Cc'ing Steve and Paul).

> > + clear_bit(XPRT_CLOSE_WAIT, &xprt->state);
> > +}
> > +
> > static inline void xs_encode_tcp_record_marker(struct xdr_buf *buf)
> > {
> > u32 reclen = buf->len - sizeof(rpc_fraghdr);
> > @@ -691,7 +719,7 @@ static int xs_tcp_send_request(struct rpc_task *task)
> > default:
> > dprintk("RPC: sendmsg returned unrecognized error %d\n",
> > -status);
> > - xprt_disconnect(xprt);
> > + xs_tcp_shutdown(xprt);
>
> Hrm. I would have thought this case was handled adequately by
> xprt_release_write?

Nope. We really do want to get rid of xprt_disconnect(). The
XPRT_CONNECTED flag needs to be managed by the transport, and the
transport alone. See Patch 7.

> > break;
> > }
> >
> > @@ -1627,8 +1655,7 @@ static void xs_tcp_connect_worker4(struct work_struct *work)
> > break;
> > default:
> > /* get rid of existing socket, and retry */
> > - xs_close(xprt);
> > - break;
> > + xs_tcp_shutdown(xprt);
> > }
> > }
> > out:
> > @@ -1687,8 +1714,7 @@ static void xs_tcp_connect_worker6(struct work_struct *work)
> > break;
> > default:
> > /* get rid of existing socket, and retry */
> > - xs_close(xprt);
> > - break;
> > + xs_tcp_shutdown(xprt);
> > }
> > }
> > out:
> > @@ -1735,6 +1761,19 @@ static void xs_connect(struct rpc_task *task)
> > }
> > }
> >
> > +static void xs_tcp_connect(struct rpc_task *task)
> > +{
> > + struct rpc_xprt *xprt = task->tk_xprt;
> > +
> > + /* Initiate graceful shutdown of the socket if not already done */
> > + if (!test_bit(XPRT_CONNECTING, &xprt->state))
> > + xs_tcp_shutdown(xprt);
> > + /* Exit if we need to wait for socket shutdown to complete */
> > + if (test_bit(XPRT_CLOSING, &xprt->state))
> > + return;
> > + xs_connect(task);
> > +}
> > +
> > /**
> > * xs_udp_print_stats - display UDP socket-specifc stats
> > * @xprt: rpc_xprt struct containing statistics
>
> Adding xs_tcp_connect will allow a lot more clean up in xs_connect,
> especially simplifying UDP transport connects.

Yup.

> > @@ -1805,12 +1844,12 @@ static struct rpc_xprt_ops xs_tcp_ops = {
> > .release_xprt = xs_tcp_release_xprt,
> > .rpcbind = rpcb_getport_async,
> > .set_port = xs_set_port,
> > - .connect = xs_connect,
> > + .connect = xs_tcp_connect,
> > .buf_alloc = rpc_malloc,
> > .buf_free = rpc_free,
> > .send_request = xs_tcp_send_request,
> > .set_retrans_timeout = xprt_set_retrans_timeout_def,
> > - .close = xs_close,
> > + .close = xs_tcp_shutdown,
> > .destroy = xs_destroy,
> > .print_stats = xs_tcp_print_stats,
> > };

2007-11-07 23:58:49

by Myklebust, Trond

[permalink] [raw]
Subject: Re: [PATCH 7/7] SUNRPC: Rename xprt_disconnect()


On Wed, 2007-11-07 at 18:16 -0500, Chuck Lever wrote:
> Oops. Looks like you did the rename in xs_tcp_read_fraghdr in this patch.

I agree. That really should be part of patch 6. I'll respin these 2
patches...


> Trond Myklebust wrote:
> > From: Trond Myklebust <[email protected]>
> >
> > xprt_disconnect() should really only be called when the transport shutdown
> > is completed, and it is time to wake up any pending tasks. Rename it to
> > xprt_disconnect_done() in order to reflect the semantic change.
> >
> > Signed-off-by: Trond Myklebust <[email protected]>
> > ---
> >
> > include/linux/sunrpc/xprt.h | 2 +-
> > net/sunrpc/xprt.c | 6 +++---
> > net/sunrpc/xprtrdma/transport.c | 4 ++--
> > net/sunrpc/xprtsock.c | 6 +++---
> > 4 files changed, 9 insertions(+), 9 deletions(-)
> >
> > diff --git a/include/linux/sunrpc/xprt.h b/include/linux/sunrpc/xprt.h
> > index afb9e6a..2554cd2 100644
> > --- a/include/linux/sunrpc/xprt.h
> > +++ b/include/linux/sunrpc/xprt.h
> > @@ -245,7 +245,7 @@ void xprt_adjust_cwnd(struct rpc_task *task, int result);
> > struct rpc_rqst * xprt_lookup_rqst(struct rpc_xprt *xprt, __be32 xid);
> > void xprt_complete_rqst(struct rpc_task *task, int copied);
> > void xprt_release_rqst_cong(struct rpc_task *task);
> > -void xprt_disconnect(struct rpc_xprt *xprt);
> > +void xprt_disconnect_done(struct rpc_xprt *xprt);
> > void xprt_force_disconnect(struct rpc_xprt *xprt);
> >
> > /*
> > diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c
> > index fcdc4b8..4d71152 100644
> > --- a/net/sunrpc/xprt.c
> > +++ b/net/sunrpc/xprt.c
> > @@ -574,11 +574,11 @@ static void xprt_autoclose(struct work_struct *work)
> > }
> >
> > /**
> > - * xprt_disconnect - mark a transport as disconnected
> > + * xprt_disconnect_done - mark a transport as disconnected
> > * @xprt: transport to flag for disconnect
> > *
> > */
> > -void xprt_disconnect(struct rpc_xprt *xprt)
> > +void xprt_disconnect_done(struct rpc_xprt *xprt)
> > {
> > dprintk("RPC: disconnected transport %p\n", xprt);
> > spin_lock_bh(&xprt->transport_lock);
> > @@ -586,7 +586,7 @@ void xprt_disconnect(struct rpc_xprt *xprt)
> > xprt_wake_pending_tasks(xprt, -ENOTCONN);
> > spin_unlock_bh(&xprt->transport_lock);
> > }
> > -EXPORT_SYMBOL_GPL(xprt_disconnect);
> > +EXPORT_SYMBOL_GPL(xprt_disconnect_done);
> >
> > /**
> > * xprt_force_disconnect - force a transport to disconnect
> > diff --git a/net/sunrpc/xprtrdma/transport.c b/net/sunrpc/xprtrdma/transport.c
> > index dc55cc9..06e8b50 100644
> > --- a/net/sunrpc/xprtrdma/transport.c
> > +++ b/net/sunrpc/xprtrdma/transport.c
> > @@ -449,7 +449,7 @@ xprt_rdma_close(struct rpc_xprt *xprt)
> > struct rpcrdma_xprt *r_xprt = rpcx_to_rdmax(xprt);
> >
> > dprintk("RPC: %s: closing\n", __func__);
> > - xprt_disconnect(xprt);
> > + xprt_disconnect_done(xprt);
> > (void) rpcrdma_ep_disconnect(&r_xprt->rx_ep, &r_xprt->rx_ia);
> > }
> >
> > @@ -682,7 +682,7 @@ xprt_rdma_send_request(struct rpc_task *task)
> > }
> >
> > if (rpcrdma_ep_post(&r_xprt->rx_ia, &r_xprt->rx_ep, req)) {
> > - xprt_disconnect(xprt);
> > + xprt_disconnect_done(xprt);
> > return -ENOTCONN; /* implies disconnect */
> > }
> >
> > diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
> > index c0aa2b4..a7d53d1 100644
> > --- a/net/sunrpc/xprtsock.c
> > +++ b/net/sunrpc/xprtsock.c
> > @@ -789,7 +789,7 @@ clear_close_wait:
> > clear_bit(XPRT_CLOSE_WAIT, &xprt->state);
> > clear_bit(XPRT_CLOSING, &xprt->state);
> > smp_mb__after_clear_bit();
> > - xprt_disconnect(xprt);
> > + xprt_disconnect_done(xprt);
> > }
> >
> > /**
> > @@ -911,7 +911,7 @@ static inline void xs_tcp_read_fraghdr(struct rpc_xprt *xprt, struct xdr_skb_rea
> > /* Sanity check of the record length */
> > if (unlikely(transport->tcp_reclen < 4)) {
> > dprintk("RPC: invalid TCP record fragment length\n");
> > - xprt_disconnect(xprt);
> > + xprt_force_disconnect(xprt);
> > return;
> > }
> > dprintk("RPC: reading TCP record fragment of length %d\n",
> > @@ -1160,7 +1160,7 @@ static void xs_tcp_state_change(struct sock *sk)
> > break;
> > case TCP_CLOSE:
> > /* Mark transport as closed and wake up all pending tasks */
> > - xprt_disconnect(xprt);
> > + xprt_disconnect_done(xprt);
> > clear_bit(XPRT_CLOSING, &xprt->state);
> > }
> > out:
> >
> >
