2016-05-10 15:57:29

by Mkrtchyan, Tigran

Subject: pnfs client running out TCP port numbers


Dear NFS gurus,

We are observing an interesting problem with the pNFS client.
Our installation has ~600 DSes plus an MDS and some
regular NFSv3 and v4 mounts. After some time, the client
nodes can no longer create new mounts:

May 10 16:00:25 bXXX0 automount[5351]: attempting to mount entry /nfs/aaa/bbb
May 10 16:00:26 bXXX0 automount[5351]: >> mount.nfs: mount system call failed
May 10 16:00:26 bXXX0 automount[5351]: >> mount.nfs: mount system call failed

It turned out that the problem is in the RPC layer. There are no free source ports left:

May 10 17:05:04 bXXX0 kernel: RPC: 35575 call_connect xprt ffff880209f0b000 is not connected
May 10 17:05:04 bXXX0 kernel: RPC: 35575 xprt_connect xprt ffff880209f0b000 is not connected
May 10 17:05:04 bXXX0 kernel: RPC: 35575 sleep_on(queue "xprt_pending" time 14685414165)
May 10 17:05:04 bXXX0 kernel: RPC: 35575 added to queue ffff880209f0b258 "xprt_pending"
May 10 17:05:04 bXXX0 kernel: RPC: 35575 setting alarm for 60000 ms
May 10 17:05:04 bXXX0 kernel: RPC: xs_connect scheduled xprt ffff880209f0b000
May 10 17:05:04 bXXX0 kernel: RPC: 35575 sync task going to sleep
May 10 17:05:04 bXXX0 kernel: RPC: xs_bind 0.0.0.0:1023: failed (-98)
May 10 17:05:04 bXXX0 kernel: RPC: 35575 __rpc_wake_up_task (now 14685414165)
May 10 17:05:04 bXXX0 kernel: RPC: 35575 disabling timer
May 10 17:05:04 bXXX0 kernel: RPC: 35575 removed from queue ffff880209f0b258 "xprt_pending"
May 10 17:05:04 bXXX0 kernel: RPC: __rpc_wake_up_task done
May 10 17:05:04 bXXX0 kernel: RPC: 35575 sync task resuming
May 10 17:05:04 bXXX0 kernel: RPC: 35575 xprt_connect_status: error 98 connecting to server 1xx.xx4.xx8.xx3
May 10 17:05:04 bXXX0 kernel: RPC: wake_up_first(ffff880209f0b190 "xprt_sending")
May 10 17:05:04 bXXX0 kernel: RPC: 35575 call_connect_status (status -5)
May 10 17:05:04 bXXX0 kernel: RPC: 35575 return 0, status -5
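
For readers unfamiliar with the numeric codes in this trace, here is a small Python sketch that maps them to their symbolic names. The table is hard-coded with the Linux values (98 is EADDRINUSE, 5 is EIO), since the numbers differ on other systems; the `decode` helper and the regular expression are illustrative only and not part of any NFS tooling.

```python
import re

# Linux errno values relevant to the log above (hard-coded on purpose:
# the numbers differ on other operating systems).
LINUX_ERRNO = {5: "EIO", 98: "EADDRINUSE"}

# Matches "failed (-98)", "error 98" and "status -5" as they appear in the log.
CODE_RE = re.compile(r"(?:failed \(|error |status )-?(\d+)")

def decode(line: str):
    """Return the symbolic errno name found in a log line, or None."""
    m = CODE_RE.search(line)
    if m is None:
        return None
    return LINUX_ERRNO.get(int(m.group(1)), "unknown")

for line in (
    "RPC: xs_bind 0.0.0.0:1023: failed (-98)",
    "RPC: 35575 xprt_connect_status: error 98 connecting to server",
    "RPC: 35575 call_connect_status (status -5)",
):
    print(decode(line), "<-", line)
```

Read this way, the bind to the last reserved port fails with "address already in use", and the caller then sees a generic I/O error.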


The source port range is limited by min_resvport and max_resvport, which default to 665 and 1023, respectively.
This gives us only 358 connections. If a client accesses many DSes, we have a problem.
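
As a quick check of the arithmetic, here is a minimal Python sketch. It assumes the limits are exposed under `/proc/sys/sunrpc/` (the usual sysctl location once the sunrpc module is loaded); the helper names are made up for illustration. Counting both endpoints, 665..1023 holds 359 ports, one more than the plain difference of 358 quoted above. Either way it is well short of ~600 DSes.

```python
from pathlib import Path

SUNRPC_DIR = Path("/proc/sys/sunrpc")  # assumed sysctl location

def resvport_count(min_port: int, max_port: int) -> int:
    """Number of source ports in the inclusive range [min_port, max_port]."""
    if min_port > max_port:
        raise ValueError("min_port must not exceed max_port")
    return max_port - min_port + 1

def read_resvport_range(base: Path = SUNRPC_DIR):
    """Return (min, max) from the sunrpc sysctls, or None if unavailable."""
    try:
        lo = int((base / "min_resvport").read_text())
        hi = int((base / "max_resvport").read_text())
    except (OSError, ValueError):
        return None
    return lo, hi

if __name__ == "__main__":
    print("default range:", resvport_count(665, 1023), "ports")
    current = read_resvport_range()
    if current is not None:
        print("this host:", current, "->", resvport_count(*current), "ports")
```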

Questions:

- Why must the pNFS client use a privileged port number when it talks to a DS?

- Why does the pNFS client use each port number for only one connection? For an
IP connection, only the tuple src_addr+src_port - dst_addr+dst_port must be unique,
so a source port number could be reused for other connections as well.

- Should we just bump max_resvport to solve it (which indeed helped)?
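
The uniqueness argument in the second question can be illustrated with a toy model in Python. This is only a sketch of the TCP 4-tuple rule, not of what the kernel RPC transport does; the addresses and the `can_add` helper are invented for the example.

```python
from typing import NamedTuple, Set

class FourTuple(NamedTuple):
    src_addr: str
    src_port: int
    dst_addr: str
    dst_port: int

def can_add(table: Set[FourTuple], conn: FourTuple) -> bool:
    """A new connection is allowed only if its exact 4-tuple is not in use."""
    return conn not in table

# Hypothetical addresses, chosen only for illustration.
table: Set[FourTuple] = set()
first = FourTuple("10.0.0.1", 1023, "10.0.1.1", 2049)
table.add(first)

# Same source port, different destination address: the tuple is still unique.
other_ds = FourTuple("10.0.0.1", 1023, "10.0.1.2", 2049)
# Same source port, same destination endpoint: this would collide.
same_ds = FourTuple("10.0.0.1", 1023, "10.0.1.1", 2049)

print(can_add(table, other_ds))
print(can_add(table, same_ds))
```

Whether the explicit bind done by the RPC transport (the `xs_bind` step in the log) permits such sharing is a separate matter that this model does not cover.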


Thanks in advance,
Tigran.


2016-05-10 16:21:26

by Trond Myklebust

Subject: Re: pnfs client running out TCP port numbers

On 5/10/16, 11:57, "[email protected] on behalf of Mkrtchyan, Tigran" <[email protected] on behalf of [email protected]> wrote:

>
>Dear NFS gurus,
>
>we observe very interesting problem with pNFS client.
>We have ~600 DSes in our installation + MDS + some
>regular NFSv3 and v4 mounts. After some time we get on
>the client nodes that they can't create new mounts:
>
>May 10 16:00:25 bXXX0 automount[5351]: attempting to mount entry /nfs/aaa/bbb
>May 10 16:00:26 bXXX0 automount[5351]: >> mount.nfs: mount system call failed
>May 10 16:00:26 bXXX0 automount[5351]: >> mount.nfs: mount system call failed
>
>Turned out that problem is in RPC layer. There are no free source ports anymore:
>
>May 10 17:05:04 bXXX0 kernel: RPC: 35575 call_connect xprt ffff880209f0b000 is not connected
>May 10 17:05:04 bXXX0 kernel: RPC: 35575 xprt_connect xprt ffff880209f0b000 is not connected
>May 10 17:05:04 bXXX0 kernel: RPC: 35575 sleep_on(queue "xprt_pending" time 14685414165)
>May 10 17:05:04 bXXX0 kernel: RPC: 35575 added to queue ffff880209f0b258 "xprt_pending"
>May 10 17:05:04 bXXX0 kernel: RPC: 35575 setting alarm for 60000 ms
>May 10 17:05:04 bXXX0 kernel: RPC:       xs_connect scheduled xprt ffff880209f0b000
>May 10 17:05:04 bXXX0 kernel: RPC: 35575 sync task going to sleep
>May 10 17:05:04 bXXX0 kernel: RPC:       xs_bind 0.0.0.0:1023: failed (-98)
>May 10 17:05:04 bXXX0 kernel: RPC: 35575 __rpc_wake_up_task (now 14685414165)
>May 10 17:05:04 bXXX0 kernel: RPC: 35575 disabling timer
>May 10 17:05:04 bXXX0 kernel: RPC: 35575 removed from queue ffff880209f0b258 "xprt_pending"
>May 10 17:05:04 bXXX0 kernel: RPC:       __rpc_wake_up_task done
>May 10 17:05:04 bXXX0 kernel: RPC: 35575 sync task resuming
>May 10 17:05:04 bXXX0 kernel: RPC: 35575 xprt_connect_status: error 98 connecting to server 1xx.xx4.xx8.xx3
>May 10 17:05:04 bXXX0 kernel: RPC:       wake_up_first(ffff880209f0b190 "xprt_sending")
>May 10 17:05:04 bXXX0 kernel: RPC: 35575 call_connect_status (status -5)
>May 10 17:05:04 bXXX0 kernel: RPC: 35575 return 0, status -5
>
>
>This is limited by min_resvport and max_resvport, which are, by default, 665 and 1023, accordingly.
>This gives us only 358 connections. If a client accesses many DSes, then we have a problem.
>
>Questions:
>
>  - Why pNFS client must use privileged port number, when talks to DS?

That's a default requirement on most NFSv3 servers, particularly when using AUTH_SYS.

>
>  - Why pNFS client uses port number only for one connection as for ip connection
>    it a src_addr+src_port - dst_addr+dst_port must be unique and source port number
>    can be reused for other connections as well.

As you say above, in order to reuse the port, the connection end points need to be unique. That can sometimes be tricky if the server is acting both as an MDS and a DS.

>
>  - Should we just bump max_resvport to solve it (which in did have helped)?

You could. We could also look into handling the AUTH_TOOWEAK RPC level error by turning on privileged ports. That might allow us to default to not using privileged ports.


2016-06-02 19:29:12

by Mkrtchyan, Tigran

Subject: Re: pnfs client running out TCP port numbers



Hi Trond,

Today, during the linux-nfs phone conference, Chuck suggested using the **noresvport**
mount option. It works for the client <=> MDS connection, but is ignored for
client <=> DS connections:

[root@dcache-lab-wn002 ~]# netstat -tnC
Active Internet connections (w/o servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 131.169.161.126:860     131.169.191.142:32049   ESTABLISHED
tcp        0      0 131.169.161.126:49451   131.169.191.144:2049    ESTABLISHED
tcp        0    200 131.169.161.126:887     131.169.191.141:32049   ESTABLISHED
[root@dcache-lab-wn002 ~]#
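
To make the output above easier to read, here is a short Python sketch that marks each connection's source port as privileged (below 1024) or not. The `classify` helper is hypothetical. Reading port 2049 as the MDS and port 32049 as the DSes is an assumption based on the description above, not something netstat reports.

```python
PRIVILEGED_LIMIT = 1024  # ports below this are reserved for root on Linux

NETSTAT_LINES = """\
tcp 0 0 131.169.161.126:860 131.169.191.142:32049 ESTABLISHED
tcp 0 0 131.169.161.126:49451 131.169.191.144:2049 ESTABLISHED
tcp 0 200 131.169.161.126:887 131.169.191.141:32049 ESTABLISHED
"""

def classify(lines: str):
    """Yield (local_port, remote_endpoint, is_privileged) per connection."""
    for line in lines.splitlines():
        fields = line.split()
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue  # skip headers and blank lines
        local_port = int(fields[3].rsplit(":", 1)[1])
        yield local_port, fields[4], local_port < PRIVILEGED_LIMIT

for port, remote, privileged in classify(NETSTAT_LINES):
    kind = "privileged" if privileged else "unprivileged"
    print(f"{remote:<24} source port {port:<6} {kind}")
```

Only the connection to port 2049 uses an unprivileged source port (49451); both connections to port 32049 still come from reserved ports (860 and 887).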

Looks like it's a trivial change to fix that. I will send a patch after testing.

Tigran.

----- Original Message -----
> From: "Trond Myklebust" <[email protected]>
> To: "Mkrtchyan, Tigran" <[email protected]>, "linux-nfs list" <[email protected]>
> Cc: "yves kemp" <[email protected]>
> Sent: Tuesday, May 10, 2016 6:21:14 PM
> Subject: Re: pnfs client running out TCP port numbers

> On 5/10/16, 11:57, "[email protected] on behalf of Mkrtchyan,
> Tigran" <[email protected] on behalf of [email protected]>
> wrote:
>
>>
>>Dear NFS gurus,
>>
>>we observe very interesting problem with pNFS client.
>>We have ~600 DSes in our installation + MDS + some
>>regular NFSv3 and v4 mounts. After some time we get on
>>the client nodes that they can't create new mounts:
>>
>>May 10 16:00:25 bXXX0 automount[5351]: attempting to mount entry /nfs/aaa/bbb
>>May 10 16:00:26 bXXX0 automount[5351]: >> mount.nfs: mount system call failed
>>May 10 16:00:26 bXXX0 automount[5351]: >> mount.nfs: mount system call failed
>>
>>Turned out that problem is in RPC layer. There are no free source ports anymore:
>>
>>May 10 17:05:04 bXXX0 kernel: RPC: 35575 call_connect xprt ffff880209f0b000 is
>>not connected
>>May 10 17:05:04 bXXX0 kernel: RPC: 35575 xprt_connect xprt ffff880209f0b000 is
>>not connected
>>May 10 17:05:04 bXXX0 kernel: RPC: 35575 sleep_on(queue "xprt_pending" time
>>14685414165)
>>May 10 17:05:04 bXXX0 kernel: RPC: 35575 added to queue ffff880209f0b258
>>"xprt_pending"
>>May 10 17:05:04 bXXX0 kernel: RPC: 35575 setting alarm for 60000 ms
>>May 10 17:05:04 bXXX0 kernel: RPC: xs_connect scheduled xprt
>>ffff880209f0b000
>>May 10 17:05:04 bXXX0 kernel: RPC: 35575 sync task going to sleep
>>May 10 17:05:04 bXXX0 kernel: RPC: xs_bind 0.0.0.0:1023: failed (-98)
>>May 10 17:05:04 bXXX0 kernel: RPC: 35575 __rpc_wake_up_task (now 14685414165)
>>May 10 17:05:04 bXXX0 kernel: RPC: 35575 disabling timer
>>May 10 17:05:04 bXXX0 kernel: RPC: 35575 removed from queue ffff880209f0b258
>>"xprt_pending"
>>May 10 17:05:04 bXXX0 kernel: RPC: __rpc_wake_up_task done
>>May 10 17:05:04 bXXX0 kernel: RPC: 35575 sync task resuming
>>May 10 17:05:04 bXXX0 kernel: RPC: 35575 xprt_connect_status: error 98
>>connecting to server 1xx.xx4.xx8.xx3
>>May 10 17:05:04 bXXX0 kernel: RPC: wake_up_first(ffff880209f0b190
>>"xprt_sending")
>>May 10 17:05:04 bXXX0 kernel: RPC: 35575 call_connect_status (status -5)
>>May 10 17:05:04 bXXX0 kernel: RPC: 35575 return 0, status -5
>>
>>
>>This is limited by min_resvport and max_resvport, which are, by default, 665 and
>>1023, accordingly.
>>This gives us only 358 connections. If a client accesses many DSes, then we have
>>a problem.
>>
>>Questions:
>>
>> - Why pNFS client must use privileged port number, when talks to DS?
>
> That's a default requirement on most NFSv3 servers, particularly when using
> AUTH_SYS.
>
>>
>> - Why pNFS client uses port number only for one connection as for ip connection
>> it a src_addr+src_port - dst_addr+dst_port must be unique and source port number
>> can be reused for other connections as well.
>
> As you say above, in order to reuse the port, the connection end points need to
> be unique. That can sometimes be tricky if the server is acting both as an MDS
> and a DS.
>
>>
>> - Should we just bump max_resvport to solve it (which in did have helped)?
>
> You could. We could also look into handling the AUTH_TOOWEAK RPC level error by
> turning on privileged ports. That might allow us to default to not using
> privileged ports.
>
>
>