2018-02-02 22:40:30

by Daniel Reichelt

Subject: Re: It's back! (Re: [REGRESSION] NFS is creating a hidden port (left over from xs_bind() ))

Hi Trond, Steven,

ever since I switched from Debian Jessie to Stretch last summer, I've
been seeing the very same hidden ports on an NFS server as described in
[1], which is a follow-up to [2].

Your patch ([3], [4]) solved the issue back then. Later on, you changed
that fix again in [5], which led to the situation we're seeing today.

Reverting 4b0ab51 fixes the issue for me.

Let me know if you need more info.



Thanks
Daniel


[1] https://lkml.org/lkml/2016/6/30/341
[2] https://lkml.org/lkml/2015/6/11/803
[3] https://lkml.org/lkml/2015/6/19/759
[4] 4876cc779ff525b9c2376d8076edf47815e71f2c
[5] 4b0ab51db32eba0f48b7618254742f143364a28d



2018-02-06 00:25:31

by Trond Myklebust

Subject: Re: It's back! (Re: [REGRESSION] NFS is creating a hidden port (left over from xs_bind() ))

On Fri, 2018-02-02 at 22:31 +0100, Daniel Reichelt wrote:
> Hi Trond, Steven,
>
> ever since I switched from Debian Jessie to Stretch last summer, I've
> been seeing the very same hidden ports on an NFS server as described
> in
> [1], which is a follow-up to [2].
>
> Your patch ([3], [4]) solved the issue back then. Later on, you
> changed
> that fix again in [5], which led to the situation we're seeing
> today.
>
> Reverting 4b0ab51 fixes the issue for me.
>
> Let me know if you need more info.
>
>
>
> Thanks
> Daniel
>
>
> [1] https://lkml.org/lkml/2016/6/30/341
> [2] https://lkml.org/lkml/2015/6/11/803
> [3] https://lkml.org/lkml/2015/6/19/759
> [4] 4876cc779ff525b9c2376d8076edf47815e71f2c
> [5] 4b0ab51db32eba0f48b7618254742f143364a28d

Does the following fix the issue?

8<-----------------------------------------------
From 9b30889c548a4d45bfe6226e58de32504c1d682f Mon Sep 17 00:00:00 2001
From: Trond Myklebust <[email protected]>
Date: Mon, 5 Feb 2018 10:20:06 -0500
Subject: [PATCH] SUNRPC: Ensure we always close the socket after a connection
shuts down

Ensure that we release the TCP socket once it is in the TCP_CLOSE or
TCP_TIME_WAIT state (and only then) so that we don't confuse rkhunter
and its ilk.

Signed-off-by: Trond Myklebust <[email protected]>
---
 net/sunrpc/xprtsock.c | 23 ++++++++++-------------
 1 file changed, 10 insertions(+), 13 deletions(-)

diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
index 18803021f242..5d0108172ed3 100644
--- a/net/sunrpc/xprtsock.c
+++ b/net/sunrpc/xprtsock.c
@@ -807,13 +807,6 @@ static void xs_sock_reset_connection_flags(struct rpc_xprt *xprt)
 	smp_mb__after_atomic();
 }

-static void xs_sock_mark_closed(struct rpc_xprt *xprt)
-{
-	xs_sock_reset_connection_flags(xprt);
-	/* Mark transport as closed and wake up all pending tasks */
-	xprt_disconnect_done(xprt);
-}
-
 /**
  * xs_error_report - callback to handle TCP socket state errors
  * @sk: socket
@@ -833,9 +826,6 @@ static void xs_error_report(struct sock *sk)
 	err = -sk->sk_err;
 	if (err == 0)
 		goto out;
-	/* Is this a reset event? */
-	if (sk->sk_state == TCP_CLOSE)
-		xs_sock_mark_closed(xprt);
 	dprintk("RPC: xs_error_report client %p, error=%d...\n",
 			xprt, -err);
 	trace_rpc_socket_error(xprt, sk->sk_socket, err);
@@ -1655,9 +1645,11 @@ static void xs_tcp_state_change(struct sock *sk)
 		if (test_and_clear_bit(XPRT_SOCK_CONNECTING,
 					&transport->sock_state))
 			xprt_clear_connecting(xprt);
+		clear_bit(XPRT_CLOSING, &xprt->state);
 		if (sk->sk_err)
 			xprt_wake_pending_tasks(xprt, -sk->sk_err);
-		xs_sock_mark_closed(xprt);
+		/* Trigger the socket release */
+		xs_tcp_force_close(xprt);
 	}
  out:
 	read_unlock_bh(&sk->sk_callback_lock);
@@ -2265,14 +2257,19 @@ static void xs_tcp_shutdown(struct rpc_xprt *xprt)
 {
 	struct sock_xprt *transport = container_of(xprt, struct sock_xprt, xprt);
 	struct socket *sock = transport->sock;
+	int skst = transport->inet ? transport->inet->sk_state : TCP_CLOSE;

 	if (sock == NULL)
 		return;
-	if (xprt_connected(xprt)) {
+	switch (skst) {
+	default:
 		kernel_sock_shutdown(sock, SHUT_RDWR);
 		trace_rpc_socket_shutdown(xprt, sock);
-	} else
+		break;
+	case TCP_CLOSE:
+	case TCP_TIME_WAIT:
 		xs_reset_transport(transport);
+	}
 }

 static void xs_tcp_set_socket_timeouts(struct rpc_xprt *xprt,
--
2.14.3

--
Trond Myklebust
Linux NFS client maintainer, PrimaryData
[email protected]



2018-02-06 09:21:10

by Daniel Reichelt

Subject: Re: It's back! (Re: [REGRESSION] NFS is creating a hidden port (left over from xs_bind() ))

On 02/06/2018 01:24 AM, Trond Myklebust wrote:
> Does the following fix the issue?
>


Previously, I've seen hidden ports within 5-6 minutes after re-starting
the nfsd and re-mounting nfs-exports on clients.

With this patch applied, I don't see any hidden ports after 15 minutes. I
guess it's a valid fix.
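
For readers who want to reproduce the check: a port is commonly reported
as "hidden" when binding to it fails with EADDRINUSE although no socket
listing (netstat, ss) shows an owner for it. The following is a minimal
sketch of the bind-probe half of such a test, under that assumption. The
function name tcp_port_in_use() is hypothetical, and this is not the code
rkhunter or similar tools actually run.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper: probe one TCP port by trying to bind to it on
 * all IPv4 addresses.
 * Returns 1 if bind() fails with EADDRINUSE (something holds the port),
 *         0 if bind() succeeds (the port is free),
 *        -1 on any other error (e.g. EACCES for ports below 1024
 *           when run without privileges).
 */
int tcp_port_in_use(uint16_t port)
{
	struct sockaddr_in addr;
	int fd, ret;

	fd = socket(AF_INET, SOCK_STREAM, 0);
	if (fd < 0)
		return -1;

	memset(&addr, 0, sizeof(addr));
	addr.sin_family = AF_INET;
	addr.sin_addr.s_addr = htonl(INADDR_ANY);
	addr.sin_port = htons(port);

	if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) == 0)
		ret = 0;
	else
		ret = (errno == EADDRINUSE) ? 1 : -1;

	/* The probe socket was never connected, so closing frees the port. */
	close(fd);
	return ret;
}
```

A port for which this returns 1 while the socket listings show nothing
bound to it would be a candidate for the hidden ports discussed in this
thread.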


Thank you!

Daniel



2018-02-06 19:29:51

by Trond Myklebust

Subject: Re: It's back! (Re: [REGRESSION] NFS is creating a hidden port (left over from xs_bind() ))

On Tue, 2018-02-06 at 10:20 +0100, Daniel Reichelt wrote:
> On 02/06/2018 01:24 AM, Trond Myklebust wrote:
> > Does the following fix the issue?
>
>
> Previously, I've seen hidden ports within 5-6 minutes after
> re-starting the nfsd and re-mounting nfs-exports on clients.
>
> With this patch applied, I don't see any hidden ports after 15 minutes. I
> guess it's a valid fix.

For the record, the intention of the patch is not to adjust or correct
any connection timeout values. It merely ensures that once the socket
layer detects the connection breakage, so that the socket is no longer
usable by the RPC client, we release the socket.
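
To make the decision in the patched xs_tcp_shutdown() concrete, here is
a minimal userspace sketch of the same switch. It is a model only, not
kernel code: all model_* / MODEL_* names are made up for illustration,
and the state numbers are copied from the kernel's
include/net/tcp_states.h.

```c
#include <stdbool.h>

/* Subset of TCP socket states, numbered as in include/net/tcp_states.h. */
enum model_tcp_state {
	MODEL_TCP_ESTABLISHED = 1,
	MODEL_TCP_FIN_WAIT1   = 4,
	MODEL_TCP_TIME_WAIT   = 6,
	MODEL_TCP_CLOSE       = 7,
	MODEL_TCP_CLOSE_WAIT  = 8,
};

/* What the modelled xs_tcp_shutdown() would do. */
enum model_action {
	MODEL_NOTHING,	/* no socket attached, early return */
	MODEL_SHUTDOWN,	/* stands for kernel_sock_shutdown(sock, SHUT_RDWR) */
	MODEL_RELEASE,	/* stands for xs_reset_transport(transport) */
};

/* Hypothetical stand-in for the fields of struct sock_xprt used here. */
struct model_transport {
	bool has_sock;	/* transport->sock != NULL */
	bool has_inet;	/* transport->inet != NULL */
	int sk_state;	/* transport->inet->sk_state, valid if has_inet */
};

enum model_action model_tcp_shutdown(const struct model_transport *t)
{
	/* No struct sock attached is treated like an already closed one. */
	int skst = t->has_inet ? t->sk_state : MODEL_TCP_CLOSE;

	if (!t->has_sock)
		return MODEL_NOTHING;

	switch (skst) {
	case MODEL_TCP_CLOSE:
	case MODEL_TCP_TIME_WAIT:
		/* Connection is dead: release the socket and its port. */
		return MODEL_RELEASE;
	default:
		/* Still in some live state: shut it down first. */
		return MODEL_SHUTDOWN;
	}
}
```

In this model a socket that has already reached TCP_CLOSE or
TCP_TIME_WAIT maps to MODEL_RELEASE, while every other state still goes
through the shutdown path, independent of any timeout value.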

--
Trond Myklebust
Linux NFS client maintainer, PrimaryData
[email protected]

