2016-06-30 12:59:55

by Steven Rostedt

Subject: It's back! (Re: [REGRESSION] NFS is creating a hidden port (left over from xs_bind() ))

[ resending as a new email, as I'm assuming people do not sort their
INBOX via last email on thread, thus my last email is sitting in the
bottom of everyone's INBOX ]

I've hit this again. Not sure when it started, but I applied my old
debug trace_printk() patch (attached) and rebooted (4.5.7). I just
tested the latest kernel from Linus's tree (from last night's pull), and
it still gives me the problem.

Here's the trace I have:

kworker/3:1H-134 [003] ..s. 61.036129: inet_csk_get_port: snum 805
kworker/3:1H-134 [003] ..s. 61.036135: <stack trace>
=> sched_clock
=> inet_addr_type_table
=> security_capable
=> inet_bind
=> xs_bind
=> release_sock
=> sock_setsockopt
=> __sock_create
=> xs_create_sock.isra.19
=> xs_tcp_setup_socket
=> process_one_work
=> worker_thread
=> worker_thread
=> kthread
=> ret_from_fork
=> kthread
kworker/3:1H-134 [003] ..s. 61.036136: inet_bind_hash: add 805
kworker/3:1H-134 [003] ..s. 61.036138: <stack trace>
=> inet_csk_get_port
=> sched_clock
=> inet_addr_type_table
=> security_capable
=> inet_bind
=> xs_bind
=> release_sock
=> sock_setsockopt
=> __sock_create
=> xs_create_sock.isra.19
=> xs_tcp_setup_socket
=> process_one_work
=> worker_thread
=> worker_thread
=> kthread
=> ret_from_fork
=> kthread
kworker/3:1H-134 [003] .... 61.036139: xs_bind: RPC: xs_bind 4.136.255.255:805: ok (0)
kworker/3:1H-134 [003] .... 61.036140: xs_tcp_setup_socket: RPC: worker connecting xprt ffff880407eca800 via tcp to 192.168.23.22 (port 43651)
kworker/3:1H-134 [003] .... 61.036162: xs_tcp_setup_socket: RPC: ffff880407eca800 connect status 115 connected 0 sock state 2
<idle>-0 [001] ..s. 61.036450: xs_tcp_state_change: RPC: xs_tcp_state_change client ffff880407eca800...
<idle>-0 [001] ..s. 61.036452: xs_tcp_state_change: RPC: state 1 conn 0 dead 0 zapped 1 sk_shutdown 0
kworker/1:1H-136 [001] .... 61.036476: xprt_connect_status: RPC: 43 xprt_connect_status: retrying
kworker/1:1H-136 [001] .... 61.036478: xprt_prepare_transmit: RPC: 43 xprt_prepare_transmit
kworker/1:1H-136 [001] .... 61.036479: xprt_transmit: RPC: 43 xprt_transmit(72)
kworker/1:1H-136 [001] .... 61.036486: xs_tcp_send_request: RPC: xs_tcp_send_request(72) = 0
kworker/1:1H-136 [001] .... 61.036487: xprt_transmit: RPC: 43 xmit complete
<idle>-0 [001] ..s. 61.036789: xs_tcp_data_ready: RPC: xs_tcp_data_ready...
kworker/1:1H-136 [001] .... 61.036798: xs_tcp_data_recv: RPC: xs_tcp_data_recv started
kworker/1:1H-136 [001] .... 61.036799: xs_tcp_data_recv: RPC: reading TCP record fragment of length 24
kworker/1:1H-136 [001] .... 61.036799: xs_tcp_data_recv: RPC: reading XID (4 bytes)
kworker/1:1H-136 [001] .... 61.036800: xs_tcp_data_recv: RPC: reading request with XID 2f4c3f88
kworker/1:1H-136 [001] .... 61.036800: xs_tcp_data_recv: RPC: reading CALL/REPLY flag (4 bytes)
kworker/1:1H-136 [001] .... 61.036801: xs_tcp_data_recv: RPC: read reply XID 2f4c3f88
kworker/1:1H-136 [001] ..s. 61.036801: xs_tcp_data_recv: RPC: XID 2f4c3f88 read 16 bytes
kworker/1:1H-136 [001] ..s. 61.036802: xs_tcp_data_recv: RPC: xprt = ffff880407eca800, tcp_copied = 24, tcp_offset = 24, tcp_reclen = 24
kworker/1:1H-136 [001] ..s. 61.036802: xprt_complete_rqst: RPC: 43 xid 2f4c3f88 complete (24 bytes received)
kworker/1:1H-136 [001] .... 61.036803: xs_tcp_data_recv: RPC: xs_tcp_data_recv done
kworker/1:1H-136 [001] .... 61.036812: xprt_release: RPC: 43 release request ffff88040b270800


# unhide-tcp
Unhide-tcp 20130526
Copyright © 2013 Yago Jesus & Patrick Gouin
License GPLv3+ : GNU GPL version 3 or later
http://www.unhide-forensics.info
Used options:
[*]Starting TCP checking

Found Hidden port that not appears in ss: 805

-- Steve


Attachments:
(No filename) (3.96 kB)
debug-hidden-port-4.7.patch (2.32 kB)

2016-06-30 13:18:10

by Trond Myklebust

Subject: Re: It's back! (Re: [REGRESSION] NFS is creating a hidden port (left over from xs_bind() ))


> On Jun 30, 2016, at 08:59, Steven Rostedt <[email protected]> wrote:
>
> [ resending as a new email, as I'm assuming people do not sort their
> INBOX via last email on thread, thus my last email is sitting in the
> bottom of everyone's INBOX ]
>
> I've hit this again. Not sure when it started, but I applied my old
> debug trace_printk() patch (attached) and rebooted (4.5.7). I just
> tested the latest kernel from Linus's tree (from last night's pull), and
> it still gives me the problem.
>
> Here's the trace I have:
>
> kworker/3:1H-134 [003] ..s. 61.036129: inet_csk_get_port: snum 805
> kworker/3:1H-134 [003] ..s. 61.036135: <stack trace>
> => sched_clock
> => inet_addr_type_table
> => security_capable
> => inet_bind
> => xs_bind
> => release_sock
> => sock_setsockopt
> => __sock_create
> => xs_create_sock.isra.19
> => xs_tcp_setup_socket
> => process_one_work
> => worker_thread
> => worker_thread
> => kthread
> => ret_from_fork
> => kthread
> kworker/3:1H-134 [003] ..s. 61.036136: inet_bind_hash: add 805
> kworker/3:1H-134 [003] ..s. 61.036138: <stack trace>
> => inet_csk_get_port
> => sched_clock
> => inet_addr_type_table
> => security_capable
> => inet_bind
> => xs_bind
> => release_sock
> => sock_setsockopt
> => __sock_create
> => xs_create_sock.isra.19
> => xs_tcp_setup_socket
> => process_one_work
> => worker_thread
> => worker_thread
> => kthread
> => ret_from_fork
> => kthread
> kworker/3:1H-134 [003] .... 61.036139: xs_bind: RPC: xs_bind 4.136.255.255:805: ok (0)
> kworker/3:1H-134 [003] .... 61.036140: xs_tcp_setup_socket: RPC: worker connecting xprt ffff880407eca800 via tcp to 192.168.23.22 (port 43651)
> kworker/3:1H-134 [003] .... 61.036162: xs_tcp_setup_socket: RPC: ffff880407eca800 connect status 115 connected 0 sock state 2
> <idle>-0 [001] ..s. 61.036450: xs_tcp_state_change: RPC: xs_tcp_state_change client ffff880407eca800...
> <idle>-0 [001] ..s. 61.036452: xs_tcp_state_change: RPC: state 1 conn 0 dead 0 zapped 1 sk_shutdown 0
> kworker/1:1H-136 [001] .... 61.036476: xprt_connect_status: RPC: 43 xprt_connect_status: retrying
> kworker/1:1H-136 [001] .... 61.036478: xprt_prepare_transmit: RPC: 43 xprt_prepare_transmit
> kworker/1:1H-136 [001] .... 61.036479: xprt_transmit: RPC: 43 xprt_transmit(72)
> kworker/1:1H-136 [001] .... 61.036486: xs_tcp_send_request: RPC: xs_tcp_send_request(72) = 0
> kworker/1:1H-136 [001] .... 61.036487: xprt_transmit: RPC: 43 xmit complete
> <idle>-0 [001] ..s. 61.036789: xs_tcp_data_ready: RPC: xs_tcp_data_ready...
> kworker/1:1H-136 [001] .... 61.036798: xs_tcp_data_recv: RPC: xs_tcp_data_recv started
> kworker/1:1H-136 [001] .... 61.036799: xs_tcp_data_recv: RPC: reading TCP record fragment of length 24
> kworker/1:1H-136 [001] .... 61.036799: xs_tcp_data_recv: RPC: reading XID (4 bytes)
> kworker/1:1H-136 [001] .... 61.036800: xs_tcp_data_recv: RPC: reading request with XID 2f4c3f88
> kworker/1:1H-136 [001] .... 61.036800: xs_tcp_data_recv: RPC: reading CALL/REPLY flag (4 bytes)
> kworker/1:1H-136 [001] .... 61.036801: xs_tcp_data_recv: RPC: read reply XID 2f4c3f88
> kworker/1:1H-136 [001] ..s. 61.036801: xs_tcp_data_recv: RPC: XID 2f4c3f88 read 16 bytes
> kworker/1:1H-136 [001] ..s. 61.036802: xs_tcp_data_recv: RPC: xprt = ffff880407eca800, tcp_copied = 24, tcp_offset = 24, tcp_reclen = 24
> kworker/1:1H-136 [001] ..s. 61.036802: xprt_complete_rqst: RPC: 43 xid 2f4c3f88 complete (24 bytes received)
> kworker/1:1H-136 [001] .... 61.036803: xs_tcp_data_recv: RPC: xs_tcp_data_recv done
> kworker/1:1H-136 [001] .... 61.036812: xprt_release: RPC: 43 release request ffff88040b270800
>
>
> # unhide-tcp
> Unhide-tcp 20130526
> Copyright © 2013 Yago Jesus & Patrick Gouin
> License GPLv3+ : GNU GPL version 3 or later
> http://www.unhide-forensics.info
> Used options:
> [*]Starting TCP checking
>
> Found Hidden port that not appears in ss: 805
>

What is a “Hidden port that not appears in ss: 805”, and what does this report mean? Are we failing to close a socket?

Cheers
Trond


2016-06-30 15:23:49

by Steven Rostedt

Subject: Re: It's back! (Re: [REGRESSION] NFS is creating a hidden port (left over from xs_bind() ))

On Thu, 30 Jun 2016 13:17:47 +0000
Trond Myklebust <[email protected]> wrote:

> > On Jun 30, 2016, at 08:59, Steven Rostedt <[email protected]> wrote:
> >
> > [ resending as a new email, as I'm assuming people do not sort their
> > INBOX via last email on thread, thus my last email is sitting in the
> > bottom of everyone's INBOX ]
> >
> > I've hit this again. Not sure when it started, but I applied my old
> > debug trace_printk() patch (attached) and rebooted (4.5.7). I just
> > tested the latest kernel from Linus's tree (from last night's pull), and
> > it still gives me the problem.
> >
> > Here's the trace I have:
> >
> > kworker/3:1H-134 [003] ..s. 61.036129: inet_csk_get_port: snum 805

Here's where the port is taken

> > kworker/3:1H-134 [003] ..s. 61.036135: <stack trace>
> > => sched_clock
> > => inet_addr_type_table
> > => security_capable
> > => inet_bind
> > => xs_bind
> > => release_sock
> > => sock_setsockopt
> > => __sock_create
> > => xs_create_sock.isra.19
> > => xs_tcp_setup_socket
> > => process_one_work
> > => worker_thread
> > => worker_thread
> > => kthread
> > => ret_from_fork
> > => kthread
> > kworker/3:1H-134 [003] ..s. 61.036136: inet_bind_hash: add 805
> > kworker/3:1H-134 [003] ..s. 61.036138: <stack trace>
> > => inet_csk_get_port
> > => sched_clock
> > => inet_addr_type_table
> > => security_capable
> > => inet_bind
> > => xs_bind
> > => release_sock
> > => sock_setsockopt
> > => __sock_create
> > => xs_create_sock.isra.19
> > => xs_tcp_setup_socket
> > => process_one_work
> > => worker_thread
> > => worker_thread
> > => kthread
> > => ret_from_fork
> > => kthread
> > kworker/3:1H-134 [003] .... 61.036139: xs_bind: RPC: xs_bind 4.136.255.255:805: ok (0)

Here's where it is bound.

> > kworker/3:1H-134 [003] .... 61.036140: xs_tcp_setup_socket: RPC: worker connecting xprt ffff880407eca800 via tcp to 192.168.23.22 (port 43651)
> > kworker/3:1H-134 [003] .... 61.036162: xs_tcp_setup_socket: RPC: ffff880407eca800 connect status 115 connected 0 sock state 2
> > <idle>-0 [001] ..s. 61.036450: xs_tcp_state_change: RPC: xs_tcp_state_change client ffff880407eca800...
> > <idle>-0 [001] ..s. 61.036452: xs_tcp_state_change: RPC: state 1 conn 0 dead 0 zapped 1 sk_shutdown 0
> > kworker/1:1H-136 [001] .... 61.036476: xprt_connect_status: RPC: 43 xprt_connect_status: retrying
> > kworker/1:1H-136 [001] .... 61.036478: xprt_prepare_transmit: RPC: 43 xprt_prepare_transmit
> > kworker/1:1H-136 [001] .... 61.036479: xprt_transmit: RPC: 43 xprt_transmit(72)
> > kworker/1:1H-136 [001] .... 61.036486: xs_tcp_send_request: RPC: xs_tcp_send_request(72) = 0
> > kworker/1:1H-136 [001] .... 61.036487: xprt_transmit: RPC: 43 xmit complete
> > <idle>-0 [001] ..s. 61.036789: xs_tcp_data_ready: RPC: xs_tcp_data_ready...
> > kworker/1:1H-136 [001] .... 61.036798: xs_tcp_data_recv: RPC: xs_tcp_data_recv started
> > kworker/1:1H-136 [001] .... 61.036799: xs_tcp_data_recv: RPC: reading TCP record fragment of length 24
> > kworker/1:1H-136 [001] .... 61.036799: xs_tcp_data_recv: RPC: reading XID (4 bytes)
> > kworker/1:1H-136 [001] .... 61.036800: xs_tcp_data_recv: RPC: reading request with XID 2f4c3f88
> > kworker/1:1H-136 [001] .... 61.036800: xs_tcp_data_recv: RPC: reading CALL/REPLY flag (4 bytes)
> > kworker/1:1H-136 [001] .... 61.036801: xs_tcp_data_recv: RPC: read reply XID 2f4c3f88
> > kworker/1:1H-136 [001] ..s. 61.036801: xs_tcp_data_recv: RPC: XID 2f4c3f88 read 16 bytes
> > kworker/1:1H-136 [001] ..s. 61.036802: xs_tcp_data_recv: RPC: xprt = ffff880407eca800, tcp_copied = 24, tcp_offset = 24, tcp_reclen = 24
> > kworker/1:1H-136 [001] ..s. 61.036802: xprt_complete_rqst: RPC: 43 xid 2f4c3f88 complete (24 bytes received)
> > kworker/1:1H-136 [001] .... 61.036803: xs_tcp_data_recv: RPC: xs_tcp_data_recv done
> > kworker/1:1H-136 [001] .... 61.036812: xprt_release: RPC: 43 release request ffff88040b270800
> >
> >
> > # unhide-tcp
> > Unhide-tcp 20130526
> > Copyright © 2013 Yago Jesus & Patrick Gouin
> > License GPLv3+ : GNU GPL version 3 or later
> > http://www.unhide-forensics.info
> > Used options:
> > [*]Starting TCP checking
> >
> > Found Hidden port that not appears in ss: 805
> >
>
> What is a “Hidden port that not appears in ss: 805”, and what does this report mean? Are we failing to close a socket?

I believe hidden ports are ports that are bound to no socket:
basically a "port leak", where they sit in limbo and can never be
reused.
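
For context, here is a rough user-space sketch of that kind of check (this is
not the actual unhide-tcp source, only the idea behind it; it handles IPv4/TCP
only and should run as root so that a failed bind() on a privileged port really
means EADDRINUSE rather than EACCES): a local port looks "hidden" if bind()
says it is in use while no socket using it shows up in /proc/net/tcp, which is
where ss and netstat get their data.

#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/socket.h>
#include <unistd.h>

/* Returns 1 if the port appears in /proc/net/tcp, 0 if not, -1 on error. */
static int port_listed(unsigned short port)
{
	FILE *f = fopen("/proc/net/tcp", "r");
	char line[512];
	unsigned int lport;
	int found = 0;

	if (!f)
		return -1;
	fgets(line, sizeof(line), f);		/* skip the header line */
	while (fgets(line, sizeof(line), f)) {
		/* local_address column looks like "0100007F:0325" */
		if (sscanf(line, "%*d: %*8[0-9A-Fa-f]:%x", &lport) == 1 &&
		    lport == port) {
			found = 1;
			break;
		}
	}
	fclose(f);
	return found;
}

/* Returns 1 if bind() reports the port as already taken (EADDRINUSE). */
static int port_in_use(unsigned short port)
{
	struct sockaddr_in addr = {
		.sin_family = AF_INET,
		.sin_port = htons(port),
		.sin_addr.s_addr = htonl(INADDR_ANY),
	};
	int fd = socket(AF_INET, SOCK_STREAM, 0);
	int in_use = 0;

	if (fd < 0)
		return 0;
	if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 &&
	    errno == EADDRINUSE)
		in_use = 1;
	close(fd);
	return in_use;
}

int main(int argc, char **argv)
{
	unsigned short port = argc > 1 ? (unsigned short)atoi(argv[1]) : 805;
	int listed = port_listed(port);

	if (listed < 0) {
		perror("/proc/net/tcp");
		return 1;
	}
	if (port_in_use(port) && !listed)
		printf("port %u looks hidden: in use, but nothing owns it\n", port);
	else
		printf("port %u looks normal\n", port);
	return 0;
}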

I looked at my past report, and everything is exactly like the issue
before. When I first boot my box, the port is there and I have the above
trace. I run netstat -tapn and grep for the port, and it shows an
established socket between my box and my wife's box (I have an NFS-mounted
file system for her to copy her pictures to my server). After a
couple of minutes, the port turns from ESTABLISHED to TIME_WAIT, and
after another minute it disappears. At that moment, unhide-tcp
shows the port as hidden.

When the socket goes away (without releasing the port) I see this in my
trace:

kworker/1:1H-131 [001] ..s. 364.762537: xs_tcp_state_change: RPC: xs_tcp_state_change client ffff88040ad68800...
kworker/1:1H-131 [001] ..s. 364.762539: xs_tcp_state_change: RPC: state 4 conn 1 dead 0 zapped 1 sk_shutdown 3
<idle>-0 [001] ..s. 364.762715: xs_tcp_state_change: RPC: xs_tcp_state_change client ffff88040ad68800...
<idle>-0 [001] ..s. 364.762716: xs_tcp_state_change: RPC: state 5 conn 0 dead 0 zapped 1 sk_shutdown 3
<idle>-0 [001] ..s. 364.762728: xs_tcp_state_change: RPC: xs_tcp_state_change client ffff88040ad68800...
<idle>-0 [001] ..s. 364.762728: xs_tcp_state_change: RPC: state 7 conn 0 dead 0 zapped 1 sk_shutdown 3
<idle>-0 [001] ..s. 364.762729: xprt_disconnect_done: RPC: disconnected transport ffff88040ad68800
<idle>-0 [001] ..s. 364.762730: xs_tcp_state_change: RPC: xs_tcp_state_change client ffff88040ad68800...
<idle>-0 [001] ..s. 364.762730: xs_tcp_state_change: RPC: state 7 conn 0 dead 0 zapped 1 sk_shutdown 3
<idle>-0 [001] ..s. 364.762730: xprt_disconnect_done: RPC: disconnected transport ffff88040ad68800

I can add more trace_printk()s if it would help.

-- Steve

2016-06-30 16:24:14

by Steven Rostedt

Subject: Re: It's back! (Re: [REGRESSION] NFS is creating a hidden port (left over from xs_bind() ))

On Thu, 30 Jun 2016 11:23:41 -0400
Steven Rostedt <[email protected]> wrote:


> I can add more trace_printk()s if it would help.

I added a trace_printk() in inet_bind_bucket_destroy() to print out
some information on the socket used by xs_bind(), and it shows that the
bucket destroy is called, but the owners list is not empty.



/*
 * Caller must hold hashbucket lock for this tb with local BH disabled
 */
void inet_bind_bucket_destroy(struct kmem_cache *cachep, struct inet_bind_bucket *tb)
{
	if (!current->mm && xs_port == tb->port) {
		trace_printk("destroy %d empty=%d %p\n",
			     tb->port, hlist_empty(&tb->owners), tb);
		trace_dump_stack(1);
	}
	if (hlist_empty(&tb->owners)) {
		__hlist_del(&tb->node);
		kmem_cache_free(cachep, tb);
	}
}

I created "xs_port" to hold the port of the variable used by xs_bind,
and when it is called, the hlist_empty(&tb->owners) returns false.
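
For completeness, the other half of the hack (which lives in the attached
debug patch and is not reproduced here) could look roughly like this; the
helper name and placement are made up for illustration, the idea is simply
that xs_bind() records the port it bound so the hook above can filter on it:

/* Illustrative only, not the attached patch: record the port that
 * xs_bind() just bound so the inet_bind_bucket_destroy() hook above can
 * limit its output to that one port. */
unsigned short xs_port;

static void xs_debug_note_port(struct sockaddr *addr)
{
	if (addr->sa_family == AF_INET)
		xs_port = ntohs(((struct sockaddr_in *)addr)->sin_port);
	else if (addr->sa_family == AF_INET6)
		xs_port = ntohs(((struct sockaddr_in6 *)addr)->sin6_port);
}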

I'll add more trace_printks to find out where those owners are being
added.

-- Steve

2016-06-30 18:30:49

by Trond Myklebust

Subject: Re: It's back! (Re: [REGRESSION] NFS is creating a hidden port (left over from xs_bind() ))

> On Jun 30, 2016, at 11:23, Steven Rostedt <[email protected]> wrote:
>
> [ ... full quote of the previous message trimmed ... ]
>
> I can add more trace_printk()s if it would help.

Wait. So the NFS mount is still active, it’s just that the socket
disconnected due to no traffic? That should be OK. Granted that the port
can’t be reused by another process, but you really don’t want that: what
if there are no other ports available and you start writing to a file on
the NFS partition?

Cheers
  Trond


2016-06-30 20:07:31

by Steven Rostedt

Subject: Re: It's back! (Re: [REGRESSION] NFS is creating a hidden port (left over from xs_bind() ))

On Thu, 30 Jun 2016 18:30:42 +0000
Trond Myklebust <[email protected]> wrote:


> Wait. So the NFS mount is still active, it’s just that the socket
> disconnected due to no traffic? That should be OK. Granted that the
> port can’t be reused by another process, but you really don’t want
> that: what if there are no other ports available and you start
> writing to a file on the NFS partition?

What would cause the port to be connected to a socket again? I copied a
large file to the NFS mount, and the hidden port is still there.

Remember, this wasn't always the case; the hidden port is a recent
issue.

I ran wireshark on this, and it appears two ports are created for NFS. One
of them is closed by the client (it sends a FIN/ACK); that port is what lies
around, never to be used again, and the other port is used for all
connections after that.

When I unmount the NFS directory, the port is finally freed (but has no
socket attached to it). What is the purpose of keeping this port around?

I can reproduce this by having the client unmount and remount the
directory.

-- Steve

2016-06-30 22:01:10

by Steven Rostedt

Subject: Re: It's back! (Re: [REGRESSION] NFS is creating a hidden port (left over from xs_bind() ))

On Thu, 30 Jun 2016 16:07:26 -0400
Steven Rostedt <[email protected]> wrote:

> I can reproduce this by having the client unmount and remount the
> directory.

It gets even more interesting. When I unmount the directory, the hidden
port does not go away. It is still there. But if I mount it again, it
goes away (until it times out again).

Even more info:

When I first mount it, it creates three sockets, one of which is
immediately closed:

tcp 0 0 192.168.23.9:892 192.168.23.22:44672 TIME_WAIT -
tcp 0 0 192.168.23.9:2049 192.168.23.22:815 ESTABLISHED -
tcp 0 0 192.168.23.9:754 192.168.23.22:44672 ESTABLISHED -

(192.168.23.22 is the machine remotely mounting a directory from the
server 192.168.23.9)

The trace of port 892 is this:

kworker/u32:1-13473 [000] .... 4093.915114: xs_setup_tcp: RPC: set up xprt to 192.168.23.22 (port 44672) via tcp
kworker/u32:1-13473 [000] .... 4093.915122: xprt_create_transport: RPC: created transport ffff8803b1c38000 with 65536 slots
kworker/0:1H-129 [000] .... 4093.915152: xprt_alloc_slot: RPC: 47 reserved req ffff88040b27ca00 xid c50ccaff
kworker/0:1H-129 [000] .... 4093.915157: xprt_connect: RPC: 47 xprt_connect xprt ffff8803b1c38000 is not connected
kworker/0:1H-129 [000] .... 4093.915159: xs_connect: RPC: xs_connect scheduled xprt ffff8803b1c38000
kworker/0:1H-129 [000] ..s. 4093.915170: inet_csk_get_port: snum 892
kworker/0:1H-129 [000] ..s. 4093.915177: <stack trace>
=> sched_clock
=> inet_addr_type_table
=> security_capable
=> inet_bind
=> xs_bind
=> release_sock
=> sock_setsockopt
=> __sock_create
=> xs_create_sock.isra.19
=> xs_tcp_setup_socket
=> process_one_work
=> worker_thread
=> worker_thread
=> kthread
=> ret_from_fork
=> kthread
kworker/0:1H-129 [000] ..s. 4093.915178: inet_bind_hash: add 892 ffff8803bb9b5cc0
kworker/0:1H-129 [000] ..s. 4093.915184: <stack trace>
=> inet_csk_get_port
=> sched_clock
=> inet_addr_type_table
=> security_capable
=> inet_bind
=> xs_bind
=> release_sock
=> sock_setsockopt
=> __sock_create
=> xs_create_sock.isra.19
=> xs_tcp_setup_socket
=> process_one_work
=> worker_thread
=> worker_thread
=> kthread
=> ret_from_fork
=> kthread
kworker/0:1H-129 [000] .... 4093.915185: xs_bind: RPC: xs_bind 4.136.255.255:892: ok (0)
kworker/0:1H-129 [000] .... 4093.915186: xs_tcp_setup_socket: RPC: worker connecting xprt ffff8803b1c38000 via tcp to 192.168.23.22 (port 44672)
kworker/0:1H-129 [000] .... 4093.915221: xs_tcp_setup_socket: RPC: ffff8803b1c38000 connect status 115 connected 0 sock state 2
<idle>-0 [003] ..s. 4093.915434: xs_tcp_state_change: RPC: xs_tcp_state_change client ffff8803b1c38000...
<idle>-0 [003] ..s. 4093.915435: xs_tcp_state_change: RPC: state 1 conn 0 dead 0 zapped 1 sk_shutdown 0
kworker/3:1H-145 [003] .... 4093.915558: xprt_connect_status: RPC: 47 xprt_connect_status: retrying
kworker/3:1H-145 [003] .... 4093.915560: xprt_prepare_transmit: RPC: 47 xprt_prepare_transmit
kworker/3:1H-145 [003] .... 4093.915562: xprt_transmit: RPC: 47 xprt_transmit(72)
kworker/3:1H-145 [003] .... 4093.915588: xs_tcp_send_request: RPC: xs_tcp_send_request(72) = 0
kworker/3:1H-145 [003] .... 4093.915589: xprt_transmit: RPC: 47 xmit complete
<idle>-0 [003] ..s. 4093.915969: xs_tcp_data_ready: RPC: xs_tcp_data_ready...
kworker/3:1H-145 [003] .... 4093.916081: xs_tcp_data_recv: RPC: xs_tcp_data_recv started
kworker/3:1H-145 [003] .... 4093.916083: xs_tcp_data_recv: RPC: reading TCP record fragment of length 24
kworker/3:1H-145 [003] .... 4093.916084: xs_tcp_data_recv: RPC: reading XID (4 bytes)
kworker/3:1H-145 [003] .... 4093.916085: xs_tcp_data_recv: RPC: reading request with XID c50ccaff
kworker/3:1H-145 [003] .... 4093.916086: xs_tcp_data_recv: RPC: reading CALL/REPLY flag (4 bytes)
kworker/3:1H-145 [003] .... 4093.916087: xs_tcp_data_recv: RPC: read reply XID c50ccaff
kworker/3:1H-145 [003] ..s. 4093.916088: xs_tcp_data_recv: RPC: XID c50ccaff read 16 bytes
kworker/3:1H-145 [003] ..s. 4093.916089: xs_tcp_data_recv: RPC: xprt = ffff8803b1c38000, tcp_copied = 24, tcp_offset = 24, tcp_reclen = 24
kworker/3:1H-145 [003] ..s. 4093.916090: xprt_complete_rqst: RPC: 47 xid c50ccaff complete (24 bytes received)
kworker/3:1H-145 [003] .... 4093.916091: xs_tcp_data_recv: RPC: xs_tcp_data_recv done
kworker/3:1H-145 [003] .... 4093.916098: xprt_release: RPC: 47 release request ffff88040b27ca00
kworker/u32:1-13473 [002] .... 4093.976056: xprt_destroy: RPC: destroying transport ffff8803b1c38000
kworker/u32:1-13473 [002] .... 4093.976068: xs_destroy: RPC: xs_destroy xprt ffff8803b1c38000
kworker/u32:1-13473 [002] .... 4093.976069: xs_close: RPC: xs_close xprt ffff8803b1c38000
kworker/u32:1-13473 [002] ..s. 4093.976096: xs_tcp_state_change: RPC: xs_tcp_state_change client ffff8803b1c38000...
kworker/u32:1-13473 [002] ..s. 4093.976098: xs_tcp_state_change: RPC: state 4 conn 1 dead 0 zapped 1 sk_shutdown 3
kworker/u32:1-13473 [002] .... 4093.976103: xprt_disconnect_done: RPC: disconnected transport ffff8803b1c38000
kworker/u32:1-13473 [002] .... 4093.976104: xprt_disconnect_done: disconnect transport!
kworker/u32:1-13473 [002] .... 4093.976113: <stack trace>
=> xs_destroy
=> xprt_switch_free
=> rpc_free_client
=> rpc_release_client
=> rpc_shutdown_client
=> load_balance
=> ttwu_do_wakeup
=> nfsd4_process_cb_update.isra.14
=> __switch_to
=> pick_next_task_fair
=> nfsd4_run_cb_work
=> process_one_work
=> worker_thread
=> worker_thread
=> kthread
=> ret_from_fork
=> kthread


This is immediately followed by the setup of the port that will eventually turn into the hidden port:

kworker/u32:1-13473 [002] .... 4093.976128: xs_setup_tcp: RPC: set up xprt to 192.168.23.22 (port 44672) via tcp
kworker/u32:1-13473 [002] .... 4093.976136: xprt_create_transport: RPC: created transport ffff8803b8c22000 with 65536 slots
kworker/2:1H-144 [002] .... 4093.976209: xprt_alloc_slot: RPC: 48 reserved req ffff8803bfe89c00 xid 10c028fe
kworker/2:1H-144 [002] .... 4093.976213: xprt_connect: RPC: 48 xprt_connect xprt ffff8803b8c22000 is not connected
kworker/2:1H-144 [002] .... 4093.976215: xs_connect: RPC: xs_connect scheduled xprt ffff8803b8c22000
kworker/2:1H-144 [002] ..s. 4093.976231: inet_csk_get_port: snum 754
kworker/2:1H-144 [002] ..s. 4093.976239: <stack trace>
=> sched_clock
=> inet_addr_type_table
=> security_capable
=> inet_bind
=> xs_bind
=> release_sock
=> sock_setsockopt
=> __sock_create
=> xs_create_sock.isra.19
=> xs_tcp_setup_socket
=> process_one_work
=> worker_thread
=> worker_thread
=> kthread
=> ret_from_fork
=> kthread
kworker/2:1H-144 [002] ..s. 4093.976239: inet_bind_hash: add 754 ffff8803afc20e40
kworker/2:1H-144 [002] ..s. 4093.976247: <stack trace>
=> inet_csk_get_port
=> sched_clock
=> inet_addr_type_table
=> security_capable
=> inet_bind
=> xs_bind
=> release_sock
=> sock_setsockopt
=> __sock_create
=> xs_create_sock.isra.19
=> xs_tcp_setup_socket
=> process_one_work
=> worker_thread
=> worker_thread
=> kthread
=> ret_from_fork
=> kthread
kworker/2:1H-144 [002] .... 4093.976248: xs_bind: RPC: xs_bind 4.136.255.255:754: ok (0)
kworker/2:1H-144 [002] .... 4093.976250: xs_tcp_setup_socket: RPC: worker connecting xprt ffff8803b8c22000 via tcp to 192.168.23.22 (port 44672)
kworker/2:1H-144 [002] .... 4093.976284: xs_tcp_setup_socket: RPC: ffff8803b8c22000 connect status 115 connected 0 sock state 2
<idle>-0 [003] ..s. 4093.976456: xs_tcp_state_change: RPC: xs_tcp_state_change client ffff8803b8c22000...
<idle>-0 [003] ..s. 4093.976458: xs_tcp_state_change: RPC: state 1 conn 0 dead 0 zapped 1 sk_shutdown 0
kworker/3:1H-145 [003] .... 4093.976588: xprt_connect_status: RPC: 48 xprt_connect_status: retrying
kworker/3:1H-145 [003] .... 4093.976590: xprt_prepare_transmit: RPC: 48 xprt_prepare_transmit
kworker/3:1H-145 [003] .... 4093.976604: xprt_transmit: RPC: 48 xprt_transmit(72)
kworker/3:1H-145 [003] .... 4093.976622: xs_tcp_send_request: RPC: xs_tcp_send_request(72) = 0
kworker/3:1H-145 [003] .... 4093.976623: xprt_transmit: RPC: 48 xmit complete
<idle>-0 [003] ..s. 4093.977040: xs_tcp_data_ready: RPC: xs_tcp_data_ready...
kworker/3:1H-145 [003] .... 4093.977151: xs_tcp_data_recv: RPC: xs_tcp_data_recv started
kworker/3:1H-145 [003] .... 4093.977153: xs_tcp_data_recv: RPC: reading TCP record fragment of length 24
kworker/3:1H-145 [003] .... 4093.977154: xs_tcp_data_recv: RPC: reading XID (4 bytes)
kworker/3:1H-145 [003] .... 4093.977155: xs_tcp_data_recv: RPC: reading request with XID 10c028fe
kworker/3:1H-145 [003] .... 4093.977156: xs_tcp_data_recv: RPC: reading CALL/REPLY flag (4 bytes)
kworker/3:1H-145 [003] .... 4093.977157: xs_tcp_data_recv: RPC: read reply XID 10c028fe
kworker/3:1H-145 [003] ..s. 4093.977158: xs_tcp_data_recv: RPC: XID 10c028fe read 16 bytes
kworker/3:1H-145 [003] ..s. 4093.977159: xs_tcp_data_recv: RPC: xprt = ffff8803b8c22000, tcp_copied = 24, tcp_offset = 24, tcp_reclen = 24
kworker/3:1H-145 [003] ..s. 4093.977160: xprt_complete_rqst: RPC: 48 xid 10c028fe complete (24 bytes received)
kworker/3:1H-145 [003] .... 4093.977161: xs_tcp_data_recv: RPC: xs_tcp_data_recv done
kworker/3:1H-145 [003] .... 4093.977167: xprt_release: RPC: 48 release request ffff8803bfe89c00


That "2049" port is what is used for all transferring of data.

When the FIN/ACK is sent by the client, the socket is destroyed (no more connections
can be used) but the port is not freed up, because it appears there's still an owner
attached to it. That means this port will *never* be used again, even if the client
unmounts the directory. That port is still in limbo.
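
For reference, an "owner" here is an entry on the bind bucket's owners
list. The bucket looks roughly like this (sketched from
include/net/inet_hashtables.h of that era, with some bookkeeping fields
omitted; the exact layout varies by kernel version):

struct inet_bind_bucket {
	possible_net_t		ib_net;
	unsigned short		port;	/* the local port, e.g. 754 above */
	/* ... reuse/bookkeeping fields omitted ... */
	struct hlist_node	node;	/* link in the bind-hash (bhash) chain */
	struct hlist_head	owners;	/* sockets still bound to this port */
};

As the debug hack earlier in the thread shows, inet_bind_bucket_destroy()
frees the bucket only once owners is empty, so a socket that is shut down
but never released keeps the port pinned.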

When the FIN/ACK comes in, it goes into the TIME_WAIT state here:

kworker/3:1H-145 [003] ..s. 4394.370019: xs_tcp_state_change: RPC: xs_tcp_state_change client ffff8803b8c22000...
kworker/3:1H-145 [003] ..s. 4394.370022: xs_tcp_state_change: RPC: state 4 conn 1 dead 0 zapped 1 sk_shutdown 3
<idle>-0 [003] ..s. 4394.370352: xs_tcp_state_change: RPC: xs_tcp_state_change client ffff8803b8c22000...
<idle>-0 [003] ..s. 4394.370354: xs_tcp_state_change: RPC: state 5 conn 0 dead 0 zapped 1 sk_shutdown 3
<idle>-0 [003] ..s. 4394.370375: tcp_time_wait: moving xs_bind sock to time wait
<idle>-0 [003] ..s. 4394.370396: <stack trace>
=> tcp_data_queue
=> tcp_rcv_state_process
=> tcp_v4_inbound_md5_hash
=> tcp_v4_do_rcv
=> tcp_v4_rcv
=> ipv4_confirm
=> nf_iterate
=> ip_local_deliver_finish
=> ip_local_deliver
=> ip_local_deliver_finish
=> ip_rcv
=> packet_rcv
=> ip_rcv_finish
=> __netif_receive_skb_core
=> kmem_cache_alloc
=> netif_receive_skb_internal
=> br_pass_frame_up
=> br_flood
=> br_handle_frame
=> br_handle_frame_finish
=> enqueue_task_fair
=> br_handle_frame
=> br_handle_frame
=> find_busiest_group
=> __netif_receive_skb_core
=> inet_gro_receive
=> netif_receive_skb_internal
=> napi_gro_receive
=> e1000_clean_rx_irq
=> e1000_clean
=> net_rx_action
=> __do_softirq
=> irq_exit
=> do_IRQ
=> ret_from_intr
=> cpuidle_enter_state
=> cpuidle_enter_state
=> cpu_startup_entry
=> start_secondary


Then eventually, the socket is freed, and 'netstat' no longer shows it.

<idle>-0 [003] .Ns. 4454.371802: inet_bind_bucket_destroy: destroy 754 empty=0 ffff8803afc20e40
<idle>-0 [003] .Ns. 4454.371813: <stack trace>
=> inet_twsk_bind_unhash
=> inet_twsk_kill
=> tw_timer_handler
=> call_timer_fn
=> tw_timer_handler
=> run_timer_softirq
=> __do_softirq
=> irq_exit
=> do_IRQ
=> ret_from_intr
=> cpuidle_enter_state
=> cpuidle_enter_state
=> cpu_startup_entry
=> start_secondary


"empty=0" means the tb->owners associated with the port is not empty, and the
freeing of the port is skipped.
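
In other words, the TIME_WAIT reaper seen in the stack trace above does
roughly the following (paraphrased, not the literal kernel source): it
removes only the timewait socket from tb->owners, and the bucket survives
because another owner, the never-released socket from xs_bind(), is still
on the list:

/* Paraphrase of inet_twsk_bind_unhash() from the stack trace above (not
 * the literal source): drop the timewait socket from tb->owners, then let
 * inet_bind_bucket_destroy() free the bucket if the list ended up empty.
 * Here the leaked xs_bind() socket is still an owner, so "empty=0" and
 * the port stays pinned. */
static void twsk_bind_unhash_sketch(struct inet_timewait_sock *tw,
				    struct inet_hashinfo *hashinfo)
{
	struct inet_bind_bucket *tb = tw->tw_tb;

	if (!tb)
		return;
	__hlist_del(&tw->tw_bind_node);		/* remove tw from tb->owners */
	tw->tw_tb = NULL;
	inet_bind_bucket_destroy(hashinfo->bind_bucket_cachep, tb);
}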

Now when I go and remount the directory from the client, the code finally cleans
up the port:

kworker/u32:0-25031 [000] .... 4544.674603: xprt_destroy: RPC: destroying transport ffff8803b8c22000
kworker/u32:0-25031 [000] .... 4544.674616: xs_destroy: RPC: xs_destroy xprt ffff8803b8c22000
kworker/u32:0-25031 [000] .... 4544.674617: xs_close: RPC: xs_close xprt ffff8803b8c22000
kworker/u32:0-25031 [000] ..s. 4544.674619: xs_tcp_state_change: RPC: xs_tcp_state_change client ffff8803b8c22000...
kworker/u32:0-25031 [000] ..s. 4544.674620: xs_tcp_state_change: RPC: state 7 conn 0 dead 0 zapped 1 sk_shutdown 3
kworker/u32:0-25031 [000] ..s. 4544.674621: xprt_disconnect_done: RPC: disconnected transport ffff8803b8c22000
kworker/u32:0-25031 [000] ..s. 4544.674621: xprt_disconnect_done: disconnect transport!


kworker/u32:0-25031 [000] ..s. 4544.674647: inet_bind_bucket_destroy: destroy 754 empty=1 ffff8803afc20e40
kworker/u32:0-25031 [000] ..s. 4544.674655: <stack trace>
=> inet_put_port
=> tcp_v4_destroy_sock
=> inet_csk_destroy_sock
=> tcp_close
=> inet_release
=> sock_release
=> xs_close
=> xs_destroy
=> xprt_switch_free
=> rpc_free_client
=> rpc_release_client
=> rpc_shutdown_client
=> nfsd4_process_cb_update.isra.14
=> update_curr
=> dequeue_task_fair
=> __switch_to
=> pick_next_task_fair
=> nfsd4_run_cb_work
=> process_one_work
=> worker_thread
=> worker_thread
=> kthread
=> ret_from_fork
=> kthread
kworker/u32:0-25031 [000] .... 4544.674660: xprt_disconnect_done: RPC: disconnected transport ffff8803b8c22000

Notice that "empty=1" now, and the port is freed.

Then it goes back to doing everything all over again:

kworker/3:1H-145 [003] .... 4558.442458: xs_bind: RPC: xs_bind 4.136.255.255:973: ok (0)
kworker/3:1H-145 [003] .... 4558.442460: xs_tcp_setup_socket: RPC: worker connecting xprt ffff8803d7fd3800 via tcp to 192.168.23.22 (port 45075)
kworker/3:1H-145 [003] .... 4558.442496: xs_tcp_setup_socket: RPC: ffff8803d7fd3800 connect status 115 connected 0 sock state 2
<idle>-0 [002] ..s. 4558.442691: xs_tcp_state_change: RPC: xs_tcp_state_change client ffff8803d7fd3800...
<idle>-0 [002] ..s. 4558.442693: xs_tcp_state_change: RPC: state 1 conn 0 dead 0 zapped 1 sk_shutdown 0
kworker/2:1H-144 [002] .... 4558.442732: xprt_connect_status: RPC: 49 xprt_connect_status: retrying
kworker/2:1H-144 [002] .... 4558.442734: xprt_prepare_transmit: RPC: 49 xprt_prepare_transmit
kworker/2:1H-144 [002] .... 4558.442737: xprt_transmit: RPC: 49 xprt_transmit(72)
kworker/2:1H-144 [002] .... 4558.442753: xs_tcp_send_request: RPC: xs_tcp_send_request(72) = 0
kworker/2:1H-144 [002] .... 4558.442754: xprt_transmit: RPC: 49 xmit complete
nfsd-4382 [002] ..s. 4558.443203: xs_tcp_data_ready: RPC: xs_tcp_data_ready...
kworker/2:1H-144 [002] .... 4558.443227: xs_tcp_data_recv: RPC: xs_tcp_data_recv started
kworker/2:1H-144 [002] .... 4558.443229: xs_tcp_data_recv: RPC: reading TCP record fragment of length 24
kworker/2:1H-144 [002] .... 4558.443230: xs_tcp_data_recv: RPC: reading XID (4 bytes)
kworker/2:1H-144 [002] .... 4558.443231: xs_tcp_data_recv: RPC: reading request with XID e2e1dc21
kworker/2:1H-144 [002] .... 4558.443232: xs_tcp_data_recv: RPC: reading CALL/REPLY flag (4 bytes)
kworker/2:1H-144 [002] .... 4558.443233: xs_tcp_data_recv: RPC: read reply XID e2e1dc21
kworker/2:1H-144 [002] ..s. 4558.443235: xs_tcp_data_recv: RPC: XID e2e1dc21 read 16 bytes
kworker/2:1H-144 [002] ..s. 4558.443236: xs_tcp_data_recv: RPC: xprt = ffff8803d7fd3800, tcp_copied = 24, tcp_offset = 24, tcp_reclen = 24
kworker/2:1H-144 [002] ..s. 4558.443237: xprt_complete_rqst: RPC: 49 xid e2e1dc21 complete (24 bytes received)
kworker/2:1H-144 [002] .... 4558.443238: xs_tcp_data_recv: RPC: xs_tcp_data_recv done
kworker/2:1H-144 [002] .... 4558.443246: xprt_release: RPC: 49 release request ffff8800dba14800
kworker/u32:1-13473 [003] .... 4558.496850: xprt_destroy: RPC: destroying transport ffff8803d7fd3800
kworker/u32:1-13473 [003] .... 4558.496860: xs_destroy: RPC: xs_destroy xprt ffff8803d7fd3800
kworker/u32:1-13473 [003] .... 4558.496861: xs_close: RPC: xs_close xprt ffff8803d7fd3800
kworker/u32:1-13473 [003] ..s. 4558.496888: xs_tcp_state_change: RPC: xs_tcp_state_change client ffff8803d7fd3800...
kworker/u32:1-13473 [003] ..s. 4558.496889: xs_tcp_state_change: RPC: state 4 conn 1 dead 0 zapped 1 sk_shutdown 3
kworker/u32:1-13473 [003] .... 4558.496894: xprt_disconnect_done: RPC: disconnected transport ffff8803d7fd3800
kworker/u32:1-13473 [003] .... 4558.496895: xprt_disconnect_done: disconnect transport!

Here, 973 is the port that gets added and disconnected right away (and then freed).

kworker/3:1H-145 [003] .... 4558.496991: xs_bind: RPC: xs_bind 4.136.255.255:688: ok (0)
kworker/3:1H-145 [003] .... 4558.496993: xs_tcp_setup_socket: RPC: worker connecting xprt ffff8803bb889000 via tcp to 192.168.23.22 (port 45075)
kworker/3:1H-145 [003] .... 4558.497024: xs_tcp_setup_socket: RPC: ffff8803bb889000 connect status 115 connected 0 sock state 2
<idle>-0 [002] .Ns. 4558.497171: xs_tcp_state_change: RPC: xs_tcp_state_change client ffff8803bb889000...
<idle>-0 [002] .Ns. 4558.497173: xs_tcp_state_change: RPC: state 1 conn 0 dead 0 zapped 1 sk_shutdown 0
kworker/2:1H-144 [002] .... 4558.497196: xprt_connect_status: RPC: 50 xprt_connect_status: retrying
kworker/2:1H-144 [002] .... 4558.497197: xprt_prepare_transmit: RPC: 50 xprt_prepare_transmit
kworker/2:1H-144 [002] .... 4558.497199: xprt_transmit: RPC: 50 xprt_transmit(72)
kworker/2:1H-144 [002] .... 4558.497210: xs_tcp_send_request: RPC: xs_tcp_send_request(72) = 0
kworker/2:1H-144 [002] .... 4558.497210: xprt_transmit: RPC: 50 xmit complete
<idle>-0 [002] ..s. 4558.497475: xs_tcp_data_ready: RPC: xs_tcp_data_ready...
kworker/2:1H-144 [002] .... 4558.497569: xs_tcp_data_recv: RPC: xs_tcp_data_recv started
kworker/2:1H-144 [002] .... 4558.497571: xs_tcp_data_recv: RPC: reading TCP record fragment of length 24
kworker/2:1H-144 [002] .... 4558.497571: xs_tcp_data_recv: RPC: reading XID (4 bytes)
kworker/2:1H-144 [002] .... 4558.497572: xs_tcp_data_recv: RPC: reading request with XID a4418f34
kworker/2:1H-144 [002] .... 4558.497573: xs_tcp_data_recv: RPC: reading CALL/REPLY flag (4 bytes)
kworker/2:1H-144 [002] .... 4558.497573: xs_tcp_data_recv: RPC: read reply XID a4418f34
kworker/2:1H-144 [002] ..s. 4558.497574: xs_tcp_data_recv: RPC: XID a4418f34 read 16 bytes
kworker/2:1H-144 [002] ..s. 4558.497575: xs_tcp_data_recv: RPC: xprt = ffff8803bb889000, tcp_copied = 24, tcp_offset = 24, tcp_reclen = 24
kworker/2:1H-144 [002] ..s. 4558.497575: xprt_complete_rqst: RPC: 50 xid a4418f34 complete (24 bytes received)
kworker/2:1H-144 [002] .... 4558.497577: xs_tcp_data_recv: RPC: xs_tcp_data_recv done
kworker/2:1H-144 [002] .... 4558.497581: xprt_release: RPC: 50 release request ffff8800db28d800

And 688 is the port that becomes the new hidden port.

Thus it looks like something is not cleaning up ports properly. There are
some timing issues here.

Any thoughts on why? This is obviously wrong: not only does it waste a port
and the memory to store it, it also causes rkhunter to report it, because
leaving ports around with no socket is one of the things rootkits do.

-- Steve

2018-02-02 21:46:40

by Daniel Reichelt

Subject: Re: It's back! (Re: [REGRESSION] NFS is creating a hidden port (left over from xs_bind() ))

Hi Trond, Steven,

ever since I switched from Debian Jessie to Stretch last summer, I've
been seeing the very same hidden ports on an NFS server as described in
[1], which is a follow-up to [2].

Your patch ([3], [4]) solved the issue back then. Later on, you changed
that fix again in [5], which led to the situation we're seeing today.

Reverting 4b0ab51 fixes the issue for me.

Let me know if you need more info.



Thanks
Daniel


[1] https://lkml.org/lkml/2016/6/30/341
[2] https://lkml.org/lkml/2015/6/11/803
[3] https://lkml.org/lkml/2015/6/19/759
[4] 4876cc779ff525b9c2376d8076edf47815e71f2c
[5] 4b0ab51db32eba0f48b7618254742f143364a28d


Attachments:
signature.asc (866.00 B)
OpenPGP digital signature

2018-02-06 00:24:29

by Trond Myklebust

Subject: Re: It's back! (Re: [REGRESSION] NFS is creating a hidden port (left over from xs_bind() ))

On Fri, 2018-02-02 at 22:31 +0100, Daniel Reichelt wrote:
> Hi Trond, Steven,
>
> ever since I switched from Debian Jessie to Stretch last summer, I've
> been seeing the very same hidden ports on an NFS server as described
> in
> [1], which is a follow-up to [2].
>
> Your patch ([3], [4]) solved the issue back then. Later on, you
> changed
> that fix again in [5], which led to the situation we're seeing
> today.
>
> Reverting 4b0ab51 fixes the issue for me.
>
> Let me know if you need more info.
>
>
>
> Thanks
> Daniel
>
>
> [1] https://lkml.org/lkml/2016/6/30/341
> [2] https://lkml.org/lkml/2015/6/11/803
> [3] https://lkml.org/lkml/2015/6/19/759
> [4] 4876cc779ff525b9c2376d8076edf47815e71f2c
> [5] 4b0ab51db32eba0f48b7618254742f143364a28d

Does the following fix the issue?

8<-----------------------------------------------
From 9b30889c548a4d45bfe6226e58de32504c1d682f Mon Sep 17 00:00:00 2001
From: Trond Myklebust <[email protected]>
Date: Mon, 5 Feb 2018 10:20:06 -0500
Subject: [PATCH] SUNRPC: Ensure we always close the socket after a connection
shuts down

Ensure that we release the TCP socket once it is in the TCP_CLOSE or
TCP_TIME_WAIT state (and only then) so that we don't confuse rkhunter
and its ilk.

Signed-off-by: Trond Myklebust <[email protected]>
---
net/sunrpc/xprtsock.c | 23 ++++++++++-------------
1 file changed, 10 insertions(+), 13 deletions(-)

diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
index 18803021f242..5d0108172ed3 100644
--- a/net/sunrpc/xprtsock.c
+++ b/net/sunrpc/xprtsock.c
@@ -807,13 +807,6 @@ static void xs_sock_reset_connection_flags(struct rpc_xprt *xprt)
 	smp_mb__after_atomic();
 }
 
-static void xs_sock_mark_closed(struct rpc_xprt *xprt)
-{
-	xs_sock_reset_connection_flags(xprt);
-	/* Mark transport as closed and wake up all pending tasks */
-	xprt_disconnect_done(xprt);
-}
-
 /**
  * xs_error_report - callback to handle TCP socket state errors
  * @sk: socket
@@ -833,9 +826,6 @@ static void xs_error_report(struct sock *sk)
 	err = -sk->sk_err;
 	if (err == 0)
 		goto out;
-	/* Is this a reset event? */
-	if (sk->sk_state == TCP_CLOSE)
-		xs_sock_mark_closed(xprt);
 	dprintk("RPC: xs_error_report client %p, error=%d...\n",
 			xprt, -err);
 	trace_rpc_socket_error(xprt, sk->sk_socket, err);
@@ -1655,9 +1645,11 @@ static void xs_tcp_state_change(struct sock *sk)
 		if (test_and_clear_bit(XPRT_SOCK_CONNECTING,
 					&transport->sock_state))
 			xprt_clear_connecting(xprt);
+		clear_bit(XPRT_CLOSING, &xprt->state);
 		if (sk->sk_err)
 			xprt_wake_pending_tasks(xprt, -sk->sk_err);
-		xs_sock_mark_closed(xprt);
+		/* Trigger the socket release */
+		xs_tcp_force_close(xprt);
 	}
  out:
 	read_unlock_bh(&sk->sk_callback_lock);
@@ -2265,14 +2257,19 @@ static void xs_tcp_shutdown(struct rpc_xprt *xprt)
 {
 	struct sock_xprt *transport = container_of(xprt, struct sock_xprt, xprt);
 	struct socket *sock = transport->sock;
+	int skst = transport->inet ? transport->inet->sk_state : TCP_CLOSE;
 
 	if (sock == NULL)
 		return;
-	if (xprt_connected(xprt)) {
+	switch (skst) {
+	default:
 		kernel_sock_shutdown(sock, SHUT_RDWR);
 		trace_rpc_socket_shutdown(xprt, sock);
-	} else
+		break;
+	case TCP_CLOSE:
+	case TCP_TIME_WAIT:
 		xs_reset_transport(transport);
+	}
 }
 
 static void xs_tcp_set_socket_timeouts(struct rpc_xprt *xprt,
--
2.14.3

--
Trond Myklebust
Linux NFS client maintainer, PrimaryData
[email protected]


Attachments:
signature.asc (833.00 B)
This is a digitally signed message part

2018-02-06 09:20:27

by Daniel Reichelt

Subject: Re: It's back! (Re: [REGRESSION] NFS is creating a hidden port (left over from xs_bind() ))

On 02/06/2018 01:24 AM, Trond Myklebust wrote:
> Does the following fix the issue?
>
> 8<-----------------------------------------------
> From 9b30889c548a4d45bfe6226e58de32504c1d682f Mon Sep 17 00:00:00 2001
> From: Trond Myklebust <[email protected]>
> Date: Mon, 5 Feb 2018 10:20:06 -0500
> Subject: [PATCH] SUNRPC: Ensure we always close the socket after a connection
> shuts down
>
> Ensure that we release the TCP socket once it is in the TCP_CLOSE or
> TCP_TIME_WAIT state (and only then) so that we don't confuse rkhunter
> and its ilk.
>
> Signed-off-by: Trond Myklebust <[email protected]>
> ---
> net/sunrpc/xprtsock.c | 23 ++++++++++-------------
> 1 file changed, 10 insertions(+), 13 deletions(-)
>
> diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
> index 18803021f242..5d0108172ed3 100644
> --- a/net/sunrpc/xprtsock.c
> +++ b/net/sunrpc/xprtsock.c
> @@ -807,13 +807,6 @@ static void xs_sock_reset_connection_flags(struct rpc_xprt *xprt)
> smp_mb__after_atomic();
> }
>
> -static void xs_sock_mark_closed(struct rpc_xprt *xprt)
> -{
> - xs_sock_reset_connection_flags(xprt);
> - /* Mark transport as closed and wake up all pending tasks */
> - xprt_disconnect_done(xprt);
> -}
> -
> /**
> * xs_error_report - callback to handle TCP socket state errors
> * @sk: socket
> @@ -833,9 +826,6 @@ static void xs_error_report(struct sock *sk)
> err = -sk->sk_err;
> if (err == 0)
> goto out;
> - /* Is this a reset event? */
> - if (sk->sk_state == TCP_CLOSE)
> - xs_sock_mark_closed(xprt);
> dprintk("RPC: xs_error_report client %p, error=%d...\n",
> xprt, -err);
> trace_rpc_socket_error(xprt, sk->sk_socket, err);
> @@ -1655,9 +1645,11 @@ static void xs_tcp_state_change(struct sock *sk)
> if (test_and_clear_bit(XPRT_SOCK_CONNECTING,
> &transport->sock_state))
> xprt_clear_connecting(xprt);
> + clear_bit(XPRT_CLOSING, &xprt->state);
> if (sk->sk_err)
> xprt_wake_pending_tasks(xprt, -sk->sk_err);
> - xs_sock_mark_closed(xprt);
> + /* Trigger the socket release */
> + xs_tcp_force_close(xprt);
> }
> out:
> read_unlock_bh(&sk->sk_callback_lock);
> @@ -2265,14 +2257,19 @@ static void xs_tcp_shutdown(struct rpc_xprt *xprt)
> {
> struct sock_xprt *transport = container_of(xprt, struct sock_xprt, xprt);
> struct socket *sock = transport->sock;
> + int skst = transport->inet ? transport->inet->sk_state : TCP_CLOSE;
>
> if (sock == NULL)
> return;
> - if (xprt_connected(xprt)) {
> + switch (skst) {
> + default:
> kernel_sock_shutdown(sock, SHUT_RDWR);
> trace_rpc_socket_shutdown(xprt, sock);
> - } else
> + break;
> + case TCP_CLOSE:
> + case TCP_TIME_WAIT:
> xs_reset_transport(transport);
> + }
> }
>
> static void xs_tcp_set_socket_timeouts(struct rpc_xprt *xprt,
>


Previously, I saw hidden ports within 5-6 minutes of restarting
nfsd and re-mounting the NFS exports on the clients.

With this patch applied, I don't see any hidden ports after 15 minutes. I
guess it's a valid fix.


Thank you!

Daniel


Attachments:
signature.asc (866.00 B)
OpenPGP digital signature

2018-02-06 19:26:35

by Trond Myklebust

Subject: Re: It's back! (Re: [REGRESSION] NFS is creating a hidden port (left over from xs_bind() ))

On Tue, 2018-02-06 at 10:20 +0100, Daniel Reichelt wrote:
> On 02/06/2018 01:24 AM, Trond Myklebust wrote:
> > Does the following fix the issue?
> >
> > 8<-----------------------------------------------
> > From 9b30889c548a4d45bfe6226e58de32504c1d682f Mon Sep 17 00:00:00
> > 2001
> > From: Trond Myklebust <[email protected]>
> > Date: Mon, 5 Feb 2018 10:20:06 -0500
> > Subject: [PATCH] SUNRPC: Ensure we always close the socket after a
> > connection
> > shuts down
> >
> > Ensure that we release the TCP socket once it is in the TCP_CLOSE
> > or
> > TCP_TIME_WAIT state (and only then) so that we don't confuse
> > rkhunter
> > and its ilk.
> >
> > Signed-off-by: Trond Myklebust <[email protected]>
> > ---
> > net/sunrpc/xprtsock.c | 23 ++++++++++-------------
> > 1 file changed, 10 insertions(+), 13 deletions(-)
> >
> > diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
> > index 18803021f242..5d0108172ed3 100644
> > --- a/net/sunrpc/xprtsock.c
> > +++ b/net/sunrpc/xprtsock.c
> > @@ -807,13 +807,6 @@ static void
> > xs_sock_reset_connection_flags(struct rpc_xprt *xprt)
> > smp_mb__after_atomic();
> > }
> >
> > -static void xs_sock_mark_closed(struct rpc_xprt *xprt)
> > -{
> > - xs_sock_reset_connection_flags(xprt);
> > - /* Mark transport as closed and wake up all pending tasks
> > */
> > - xprt_disconnect_done(xprt);
> > -}
> > -
> > /**
> > * xs_error_report - callback to handle TCP socket state errors
> > * @sk: socket
> > @@ -833,9 +826,6 @@ static void xs_error_report(struct sock *sk)
> > err = -sk->sk_err;
> > if (err == 0)
> > goto out;
> > - /* Is this a reset event? */
> > - if (sk->sk_state == TCP_CLOSE)
> > - xs_sock_mark_closed(xprt);
> > dprintk("RPC: xs_error_report client %p,
> > error=%d...\n",
> > xprt, -err);
> > trace_rpc_socket_error(xprt, sk->sk_socket, err);
> > @@ -1655,9 +1645,11 @@ static void xs_tcp_state_change(struct sock
> > *sk)
> > if (test_and_clear_bit(XPRT_SOCK_CONNECTING,
> > &transport->sock_state))
> > xprt_clear_connecting(xprt);
> > + clear_bit(XPRT_CLOSING, &xprt->state);
> > if (sk->sk_err)
> > xprt_wake_pending_tasks(xprt, -sk-
> > >sk_err);
> > - xs_sock_mark_closed(xprt);
> > + /* Trigger the socket release */
> > + xs_tcp_force_close(xprt);
> > }
> > out:
> > read_unlock_bh(&sk->sk_callback_lock);
> > @@ -2265,14 +2257,19 @@ static void xs_tcp_shutdown(struct rpc_xprt
> > *xprt)
> > {
> > struct sock_xprt *transport = container_of(xprt, struct
> > sock_xprt, xprt);
> > struct socket *sock = transport->sock;
> > + int skst = transport->inet ? transport->inet->sk_state :
> > TCP_CLOSE;
> >
> > if (sock == NULL)
> > return;
> > - if (xprt_connected(xprt)) {
> > + switch (skst) {
> > + default:
> > kernel_sock_shutdown(sock, SHUT_RDWR);
> > trace_rpc_socket_shutdown(xprt, sock);
> > - } else
> > + break;
> > + case TCP_CLOSE:
> > + case TCP_TIME_WAIT:
> > xs_reset_transport(transport);
> > + }
> > }
> >
> > static void xs_tcp_set_socket_timeouts(struct rpc_xprt *xprt,
> >
>
>
> Previously, I've seen hidden ports within 5-6 minutes after re-
> starting
> the nfsd and re-mounting nfs-exports on clients.
>
> With this patch applied, I don't see any hidden ports after 15mins. I
> guess it's a valid fix.

For the record, the intention of the patch is not to adjust or correct
any connection timeout values. It is merely to ensure that once the
connection breakage is detected by the socket layer, so that the socket is
no longer usable by the RPC client, we release the socket.

--
Trond Myklebust
Linux NFS client maintainer, PrimaryData
[email protected]


Attachments:
signature.asc (833.00 B)
This is a digitally signed message part