2020-08-28 20:44:26

by Bart Groeneveld

Subject: [PATCH v2] net: Use standardized (IANA) local port range

IANA specifies User ports as 1024-49151,
and Private ports (local/ephemeral/dynamic/w/e) as 49152-65535 [1].

This means Linux uses 32768-49151 'illegally'.
This is not just a matter of following specifications:
IANA actually assigns numbers in this range [1].

I understand that Linux uses 61000-65535 for masquerading/NAT [2],
so I left the high value at 60999.
This means the high value still does not follow the specification,
but it also doesn't conflict with it.

This change will effectively halve the available ephemeral ports,
increasing the risk of port exhaustion. But:
a) I don't think that warrants ignoring standards.
Consider for example setting up a (corporate) firewall blocking
all unknown external services.
It will only allow outgoing traffic on ports 80, 443 and 49152-65535.
A Linux computer behind such a firewall will not be able to connect
to *any* external service *half of the time*.
Of course, the firewall can be adjusted to also allow 32768-49151,
but that allows computers to use some services against the policy.
b) It is only an issue with more than 11848 *outgoing* connections.
I think that is a niche case (I know, citation needed, but still).
If someone finds themselves in such a niche case,
they can still modify ip_local_port_range.

This patch keeps the low and high value at different parity,
as to optimize port assignment [3].
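
The figures above can be checked with a few lines of C. This is only a sketch of the arithmetic: the helper names `port_count` and `different_parity` are hypothetical and are not part of this patch or of the kernel.

```c
#include <assert.h>
#include <stdbool.h>

/* Number of ports in the inclusive range [low, high].
 * Hypothetical helper for illustration, not a kernel function. */
unsigned int port_count(unsigned int low, unsigned int high)
{
	assert(low <= high);
	return high - low + 1;
}

/* True when the two bounds have different parity (one even, one odd),
 * which is what ip-sysctl.rst recommends for ip_local_port_range. */
bool different_parity(unsigned int low, unsigned int high)
{
	return ((low ^ high) & 1U) != 0;
}
```

With the current default, `port_count(32768, 60999)` gives 28232 ports; with the proposed default, `port_count(49152, 60999)` gives the 11848 mentioned in b). Both ranges pair an even low value with an odd high value, so `different_parity` holds for each.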

[1]: https://www.iana.org/assignments/service-names-port-numbers/service-names-port-numbers.txt
[2]: https://marc.info/?l=linux-kernel&m=117900026927289
[3]: See for example commit 1580ab63fc9a03593072cc5656167a75c4f1d173 ("tcp/dccp: better use of ephemeral ports in connect()")

Signed-off-by: Bart Groeneveld <[email protected]>
---
Documentation/networking/ip-sysctl.rst | 4 ++--
net/ipv4/af_inet.c | 2 +-
net/ipv4/inet_connection_sock.c | 2 +-
net/ipv4/inet_hashtables.c | 2 +-
4 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/Documentation/networking/ip-sysctl.rst b/Documentation/networking/ip-sysctl.rst
index 837d51f9e1fa..5048b326f773 100644
--- a/Documentation/networking/ip-sysctl.rst
+++ b/Documentation/networking/ip-sysctl.rst
@@ -1024,7 +1024,7 @@ ip_local_port_range - 2 INTEGERS
If possible, it is better these numbers have different parity
(one even and one odd value).
Must be greater than or equal to ip_unprivileged_port_start.
- The default values are 32768 and 60999 respectively.
+ The default values are 49152 and 60999 respectively.

ip_local_reserved_ports - list of comma separated ranges
Specify the ports which are reserved for known third-party
@@ -1047,7 +1047,7 @@ ip_local_reserved_ports - list of comma separated ranges
ip_local_port_range, e.g.::

$ cat /proc/sys/net/ipv4/ip_local_port_range
- 32000 60999
+ 49152 60999
$ cat /proc/sys/net/ipv4/ip_local_reserved_ports
8080,9148

diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
index 4307503a6f0b..f95a9ffffdc9 100644
--- a/net/ipv4/af_inet.c
+++ b/net/ipv4/af_inet.c
@@ -1838,7 +1838,7 @@ static __net_init int inet_init_net(struct net *net)
* Set defaults for local port range
*/
seqlock_init(&net->ipv4.ip_local_ports.lock);
- net->ipv4.ip_local_ports.range[0] = 32768;
+ net->ipv4.ip_local_ports.range[0] = 49152;
net->ipv4.ip_local_ports.range[1] = 60999;

seqlock_init(&net->ipv4.ping_group_range.lock);
diff --git a/net/ipv4/inet_connection_sock.c b/net/ipv4/inet_connection_sock.c
index b457dd2d6c75..322bcfce0737 100644
--- a/net/ipv4/inet_connection_sock.c
+++ b/net/ipv4/inet_connection_sock.c
@@ -196,7 +196,7 @@ inet_csk_find_open_port(struct sock *sk, struct inet_bind_bucket **tb_ret, int *
attempt_half = (sk->sk_reuse == SK_CAN_REUSE) ? 1 : 0;
other_half_scan:
inet_get_local_port_range(net, &low, &high);
- high++; /* [32768, 60999] -> [32768, 61000[ */
+ high++; /* [49152, 60999] -> [49152, 61000[ */
if (high - low < 4)
attempt_half = 0;
if (attempt_half) {
diff --git a/net/ipv4/inet_hashtables.c b/net/ipv4/inet_hashtables.c
index 239e54474b65..547b95a4891a 100644
--- a/net/ipv4/inet_hashtables.c
+++ b/net/ipv4/inet_hashtables.c
@@ -695,7 +695,7 @@ int __inet_hash_connect(struct inet_timewait_death_row *death_row,
l3mdev = inet_sk_bound_l3mdev(sk);

inet_get_local_port_range(net, &low, &high);
- high++; /* [32768, 60999] -> [32768, 61000[ */
+ high++; /* [49152, 60999] -> [49152, 61000[ */
remaining = high - low;
if (likely(remaining > 1))
remaining &= ~1U;
--
2.28.0


2020-08-28 20:46:29

by Bart Groeneveld

Subject: [PATCH v3] net: Use standardized (IANA) local port range

IANA specifies User ports as 1024-49151,
and Private ports (local/ephemeral/dynamic/w/e) as 49152-65535 [1].

This means Linux uses 32768-49151 'illegally'.
This is not just a matter of following specifications:
IANA actually assigns numbers in this range [1].

I understand that Linux uses 61000-65535 for masquerading/NAT [2],
so I left the high value at 60999.
This means the high value still does not follow the specification,
but it also doesn't conflict with it.

This change will effectively halve the available ephemeral ports,
increasing the risk of port exhaustion. But:
a) I don't think that warrants ignoring standards.
Consider for example setting up a (corporate) firewall blocking
all unknown external services.
It will only allow outgoing traffic on ports 80, 443 and 49152-65535.
A Linux computer behind such a firewall will not be able to connect
to *any* external service *half of the time*.
Of course, the firewall can be adjusted to also allow 32768-49151,
but that allows computers to use some services against the policy.
b) It is only an issue with more than 11848 *outgoing* connections.
I think that is a niche case (I know, citation needed, but still).
If someone finds themselves in such a niche case,
they can still modify ip_local_port_range.

This patch keeps the low and high value at different parity,
as to optimize port assignment [3].

[1]: https://www.iana.org/assignments/service-names-port-numbers/service-names-port-numbers.txt
[2]: https://marc.info/?l=linux-kernel&m=117900026927289
[3]: See for example commit 1580ab63fc9a03593072cc5656167a75c4f1d173 ("tcp/dccp: better use of ephemeral ports in connect()")

Signed-off-by: Bart Groeneveld <[email protected]>
---
Documentation/networking/ip-sysctl.rst | 4 ++--
net/ipv4/af_inet.c | 2 +-
net/ipv4/inet_connection_sock.c | 2 +-
net/ipv4/inet_hashtables.c | 2 +-
4 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/Documentation/networking/ip-sysctl.rst b/Documentation/networking/ip-sysctl.rst
index 837d51f9e1fa..5048b326f773 100644
--- a/Documentation/networking/ip-sysctl.rst
+++ b/Documentation/networking/ip-sysctl.rst
@@ -1024,7 +1024,7 @@ ip_local_port_range - 2 INTEGERS
If possible, it is better these numbers have different parity
(one even and one odd value).
Must be greater than or equal to ip_unprivileged_port_start.
- The default values are 32768 and 60999 respectively.
+ The default values are 49152 and 60999 respectively.

ip_local_reserved_ports - list of comma separated ranges
Specify the ports which are reserved for known third-party
@@ -1047,7 +1047,7 @@ ip_local_reserved_ports - list of comma separated ranges
ip_local_port_range, e.g.::

$ cat /proc/sys/net/ipv4/ip_local_port_range
- 32000 60999
+ 49152 60999
$ cat /proc/sys/net/ipv4/ip_local_reserved_ports
8080,9148

diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
index 4307503a6f0b..f95a9ffffdc9 100644
--- a/net/ipv4/af_inet.c
+++ b/net/ipv4/af_inet.c
@@ -1838,7 +1838,7 @@ static __net_init int inet_init_net(struct net *net)
* Set defaults for local port range
*/
seqlock_init(&net->ipv4.ip_local_ports.lock);
- net->ipv4.ip_local_ports.range[0] = 32768;
+ net->ipv4.ip_local_ports.range[0] = 49152;
net->ipv4.ip_local_ports.range[1] = 60999;

seqlock_init(&net->ipv4.ping_group_range.lock);
diff --git a/net/ipv4/inet_connection_sock.c b/net/ipv4/inet_connection_sock.c
index b457dd2d6c75..322bcfce0737 100644
--- a/net/ipv4/inet_connection_sock.c
+++ b/net/ipv4/inet_connection_sock.c
@@ -196,7 +196,7 @@ inet_csk_find_open_port(struct sock *sk, struct inet_bind_bucket **tb_ret, int *
attempt_half = (sk->sk_reuse == SK_CAN_REUSE) ? 1 : 0;
other_half_scan:
inet_get_local_port_range(net, &low, &high);
- high++; /* [32768, 60999] -> [32768, 61000[ */
+ high++; /* [49152, 60999] -> [49152, 61000[ */
if (high - low < 4)
attempt_half = 0;
if (attempt_half) {
diff --git a/net/ipv4/inet_hashtables.c b/net/ipv4/inet_hashtables.c
index 239e54474b65..547b95a4891a 100644
--- a/net/ipv4/inet_hashtables.c
+++ b/net/ipv4/inet_hashtables.c
@@ -695,7 +695,7 @@ int __inet_hash_connect(struct inet_timewait_death_row *death_row,
l3mdev = inet_sk_bound_l3mdev(sk);

inet_get_local_port_range(net, &low, &high);
- high++; /* [32768, 60999] -> [32768, 61000[ */
+ high++; /* [49152, 60999] -> [49152, 61000[ */
remaining = high - low;
if (likely(remaining > 1))
remaining &= ~1U;
--
2.28.0

2020-08-28 21:55:24

by Stephen Hemminger

Subject: Re: [PATCH v3] net: Use standardized (IANA) local port range

On Fri, 28 Aug 2020 22:44:47 +0200
Bart Groeneveld <[email protected]> wrote:

> IANA specifies User ports as 1024-49151,
> and Private ports (local/ephemeral/dynamic/w/e) as 49152-65535 [1].
>
> This means Linux uses 32768-49151 'illegally'.
> This is not just a matter of following specifications:
> IANA actually assigns numbers in this range [1].
>
> I understand that Linux uses 61000-65535 for masquerading/NAT [2],
> so I left the high value at 60999.
> This means the high value still does not follow the specification,
> but it also doesn't conflict with it.
>
> This change will effectively halve the available ephemeral ports,
> increasing the risk of port exhaustion. But:
> a) I don't think that warrants ignoring standards.
> Consider for example setting up a (corporate) firewall blocking
> all unknown external services.
> It will only allow outgoing traffic on ports 80, 443 and 49152-65535.
> A Linux computer behind such a firewall will not be able to connect
> to *any* external service *half of the time*.
> Of course, the firewall can be adjusted to also allow 32768-49151,
> but that allows computers to use some services against the policy.
> b) It is only an issue with more than 11848 *outgoing* connections.
> I think that is a niche case (I know, citation needed, but still).
> If someone finds themselves in such a niche case,
> they can still modify ip_local_port_range.
>
> This patch keeps the low and high value at different parity,
> as to optimize port assignment [3].
>
> [1]: https://www.iana.org/assignments/service-names-port-numbers/service-names-port-numbers.txt
> [2]: https://marc.info/?l=linux-kernel&m=117900026927289
> [3]: See for example commit 1580ab63fc9a03593072cc5656167a75c4f1d173 ("tcp/dccp: better use of ephemeral ports in connect()")
>
> Signed-off-by: Bart Groeneveld <[email protected]>

Changing the default range impacts existing users. Since Linux has been doing
this for so long, I don't think that a standards body deciding to reserve
some space is sufficient justification to do this.

2020-08-28 23:15:13

by David Miller

Subject: Re: [PATCH v3] net: Use standardized (IANA) local port range

From: Stephen Hemminger <[email protected]>
Date: Fri, 28 Aug 2020 14:52:03 -0700

> Changing the default range impacts existing users. Since Linux has been doing
> this for so long, I don't think that a standards body deciding to reserve
> some space is sufficient justification to do this.

Agreed, there is no way we can change this after decades of
precedent. We will definitely break things for people.

2020-08-29 10:40:53

by Michal Kubecek

Subject: Re: [PATCH v3] net: Use standardized (IANA) local port range

On Fri, Aug 28, 2020 at 10:44:47PM +0200, Bart Groeneveld wrote:
> This change will effectively halve the available ephemeral ports,
> increasing the risk of port exhaustion. But:
> ...
> b) It is only an issue with more than 11848 *outgoing* connections.
> I think that is a niche case (I know, citation needed, but still).

You don't need 11848 simultaneous connections to run into problems, as
you may also have timewait sockets left after a connection is closed.
If there are many short-lived outgoing connections to the same server,
you may run out of ephemeral ports even without having too many active
connections at any time.

Michal
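
To put a rough number on this point, here is a back-of-the-envelope sketch in C. It assumes every closed connection to one destination keeps its local port in TIME_WAIT for the full 60 seconds (Linux's TCP_TIMEWAIT_LEN) and that no port reuse mechanism applies; the helper name `max_conn_rate` is hypothetical and the model is deliberately simplified.

```c
#include <assert.h>

/* Assumed TIME_WAIT duration in seconds (Linux's TCP_TIMEWAIT_LEN is 60 s). */
#define TIMEWAIT_SECS 60U

/* Rough upper bound on new connections per second to a single destination
 * that the inclusive port range [low, high] can sustain when each closed
 * connection holds its local port for timewait_secs.
 * Hypothetical helper for illustration only. */
unsigned int max_conn_rate(unsigned int low, unsigned int high,
			   unsigned int timewait_secs)
{
	assert(low <= high && timewait_secs > 0);
	return (high - low + 1) / timewait_secs;
}
```

Under these assumptions the proposed range 49152-60999 sustains about 197 short-lived connections per second to one server, against about 470 for the current 32768-60999 default, even though few connections are active at any instant.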

2020-08-29 13:39:14

by David Laight

Subject: RE: [PATCH v2] net: Use standardized (IANA) local port range

From: Bart Groeneveld
> Sent: 28 August 2020 21:40
>
> IANA specifies User ports as 1024-49151,
> and Private ports (local/ephemeral/dynamic/w/e) as 49152-65535 [1].
>
> This means Linux uses 32768-49151 'illegally'.
> This is not just a matter of following specifications:
> IANA actually assigns numbers in this range [1].

Linux is using the 'historic' values.
IANA shouldn't really have 'grabbed' half the port number space.
Really the 'problem' of TCP port numbers identifying the service
as well as the connection should have been addressed by some other
means (eg using port 1023 and a TCP option to select the service).

Changing the default base from 32k to 48k will break some existing
systems if/when a kernel upgrade is installed.

You are also changing the numbers for UDP.
Anyone doing a lot of RTP (which typically requires 2 adjacent
UDP ports) is already constrained by the availability of ports.

David

2020-08-31 08:10:14

by Eric Dumazet

Subject: Re: [PATCH v3] net: Use standardized (IANA) local port range



On 8/28/20 2:52 PM, Stephen Hemminger wrote:
> On Fri, 28 Aug 2020 22:44:47 +0200
> Bart Groeneveld <[email protected]> wrote:
>
>> IANA specifies User ports as 1024-49151,
>> and Private ports (local/ephemeral/dynamic/w/e) as 49152-65535 [1].
>>
>> This means Linux uses 32768-49151 'illegally'.
>> This is not just a matter of following specifications:
>> IANA actually assigns numbers in this range [1].
>>
>> I understand that Linux uses 61000-65535 for masquerading/NAT [2],
>> so I left the high value at 60999.
>> This means the high value still does not follow the specification,
>> but it also doesn't conflict with it.
>>
>> This change will effectively halve the available ephemeral ports,
>> increasing the risk of port exhaustion. But:
>> a) I don't think that warrants ignoring standards.
>> Consider for example setting up a (corporate) firewall blocking
>> all unknown external services.
>> It will only allow outgoing traffic on ports 80, 443 and 49152-65535.
>> A Linux computer behind such a firewall will not be able to connect
>> to *any* external service *half of the time*.
>> Of course, the firewall can be adjusted to also allow 32768-49151,
>> but that allows computers to use some services against the policy.
>> b) It is only an issue with more than 11848 *outgoing* connections.
>> I think that is a niche case (I know, citation needed, but still).
>> If someone finds themselves in such a niche case,
>> they can still modify ip_local_port_range.
>>
>> This patch keeps the low and high value at different parity,
>> as to optimize port assignment [3].
>>
>> [1]: https://www.iana.org/assignments/service-names-port-numbers/service-names-port-numbers.txt
>> [2]: https://marc.info/?l=linux-kernel&m=117900026927289
>> [3]: See for example commit 1580ab63fc9a03593072cc5656167a75c4f1d173 ("tcp/dccp: better use of ephemeral ports in connect()")
>>
>> Signed-off-by: Bart Groeneveld <[email protected]>
>
> Changing the default range impacts existing users. Since Linux has been doing
> this for so long, I don't think that a standards body deciding to reserve
> some space is sufficient justification to do this.
>

Agreed.

There is a sysctl allowing admins/distros to opt in to whatever the IANA values
of the day are, if they really want.
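
As an illustration of inspecting that sysctl from a program, the following C sketch reads and validates the current range. The procfs path is the one shown in the patch's documentation hunk; the function names `parse_port_range` and `read_local_port_range` are hypothetical, and the validation rules here are an assumption for the example rather than the kernel's own checks.

```c
#include <assert.h>
#include <stdio.h>

/* Parse "LOW HIGH" as printed by the sysctl file.
 * Returns 0 on success, -1 on malformed or out-of-range input.
 * Hypothetical helper for illustration only. */
int parse_port_range(const char *text, unsigned int *low, unsigned int *high)
{
	unsigned int lo, hi;

	assert(text && low && high);
	if (sscanf(text, "%u %u", &lo, &hi) != 2)
		return -1;
	if (lo > hi || hi > 65535U)
		return -1;
	*low = lo;
	*high = hi;
	return 0;
}

/* Read the current range from procfs; returns -1 if the file is
 * unavailable (for example on a non-Linux system) or unparsable. */
int read_local_port_range(unsigned int *low, unsigned int *high)
{
	char buf[64];
	int ret = -1;
	FILE *f = fopen("/proc/sys/net/ipv4/ip_local_port_range", "r");

	if (!f)
		return -1;
	if (fgets(buf, sizeof(buf), f))
		ret = parse_port_range(buf, low, high);
	fclose(f);
	return ret;
}
```

An admin who wants the IANA lower bound can write `49152 60999` to the same file; a program like this only confirms which range is in effect.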

We already have many issues caused by the ephemeral range being too small.

For instance I often have to debug issues caused by some distros
changing sysctl_tcp_rfc1337 to 1, hurting some real applications.