2015-08-05 20:28:37

by Calvin Owens

[permalink] [raw]
Subject: [PATCH] net: Unbreak resetting default values for tcp_wmem/udp_wmem_min

Commit 8133534c760d4083 ("net: limit tcp/udp rmem/wmem to
SOCK_{RCV,SND}BUF_MIN") modified four sysctls to enforce that the values
written to them are not less than SOCK_MIN_{RCV,SND}BUF.

This change is fine for tcp_rmem and udp_rmem_min, since SOCK_MIN_RCVBUF
is equal to equal to TCP_SKB_MIN_TRUESIZE. But it breaks tcp_wmem and
udp_wmem_min for previously valid values because SOCK_MIN_SNDBUF is
(2 * TCP_SKB_MIN_TRUESIZE), which ends up being greater than 4KB.

Thus, 4096 is no longer accepted as a valid value, despite still being
the default for udp_wmem_min, and for 'min' in tcp_wmem. A huge number
of sysctl configurations at FB use 4096 as 'min', so this change breaks
all of them.

This patch changes the sysctls to simply enforce that the value written
is greater than or equal to the default value of SK_MEM_QUANTUM.

Fixes: 8133534c760d4083 ("net: limit tcp/udp rmem/wmem to SOCK_MIN...")
Signed-off-by: Calvin Owens <[email protected]>
---
net/ipv4/sysctl_net_ipv4.c | 11 +++++------
1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c
index 433231c..a214b6a 100644
--- a/net/ipv4/sysctl_net_ipv4.c
+++ b/net/ipv4/sysctl_net_ipv4.c
@@ -41,8 +41,7 @@ static int tcp_syn_retries_min = 1;
static int tcp_syn_retries_max = MAX_TCP_SYNCNT;
static int ip_ping_group_range_min[] = { 0, 0 };
static int ip_ping_group_range_max[] = { GID_T_MAX, GID_T_MAX };
-static int min_sndbuf = SOCK_MIN_SNDBUF;
-static int min_rcvbuf = SOCK_MIN_RCVBUF;
+static int min_buf = SK_MEM_QUANTUM;

/* Update system visible IP port range */
static void set_local_port_range(struct net *net, int range[2])
@@ -530,7 +529,7 @@ static struct ctl_table ipv4_table[] = {
.maxlen = sizeof(sysctl_tcp_wmem),
.mode = 0644,
.proc_handler = proc_dointvec_minmax,
- .extra1 = &min_sndbuf,
+ .extra1 = &min_buf,
},
{
.procname = "tcp_notsent_lowat",
@@ -545,7 +544,7 @@ static struct ctl_table ipv4_table[] = {
.maxlen = sizeof(sysctl_tcp_rmem),
.mode = 0644,
.proc_handler = proc_dointvec_minmax,
- .extra1 = &min_rcvbuf,
+ .extra1 = &min_buf,
},
{
.procname = "tcp_app_win",
@@ -758,7 +757,7 @@ static struct ctl_table ipv4_table[] = {
.maxlen = sizeof(sysctl_udp_rmem_min),
.mode = 0644,
.proc_handler = proc_dointvec_minmax,
- .extra1 = &min_rcvbuf,
+ .extra1 = &min_buf,
},
{
.procname = "udp_wmem_min",
@@ -766,7 +765,7 @@ static struct ctl_table ipv4_table[] = {
.maxlen = sizeof(sysctl_udp_wmem_min),
.mode = 0644,
.proc_handler = proc_dointvec_minmax,
- .extra1 = &min_sndbuf,
+ .extra1 = &min_buf,
},
{ }
};
--
1.8.1


2015-08-10 05:41:20

by David Miller

[permalink] [raw]
Subject: Re: [PATCH] net: Unbreak resetting default values for tcp_wmem/udp_wmem_min

From: Calvin Owens <[email protected]>
Date: Wed, 5 Aug 2015 13:26:54 -0700

> Commit 8133534c760d4083 ("net: limit tcp/udp rmem/wmem to
> SOCK_{RCV,SND}BUF_MIN") modified four sysctls to enforce that the values
> written to them are not less than SOCK_MIN_{RCV,SND}BUF.
>
> This change is fine for tcp_rmem and udp_rmem_min, since SOCK_MIN_RCVBUF
> is equal to equal to TCP_SKB_MIN_TRUESIZE. But it breaks tcp_wmem and
> udp_wmem_min for previously valid values because SOCK_MIN_SNDBUF is
> (2 * TCP_SKB_MIN_TRUESIZE), which ends up being greater than 4KB.
>
> Thus, 4096 is no longer accepted as a valid value, despite still being
> the default for udp_wmem_min, and for 'min' in tcp_wmem. A huge number
> of sysctl configurations at FB use 4096 as 'min', so this change breaks
> all of them.
>
> This patch changes the sysctls to simply enforce that the value written
> is greater than or equal to the default value of SK_MEM_QUANTUM.
>
> Fixes: 8133534c760d4083 ("net: limit tcp/udp rmem/wmem to SOCK_MIN...")
> Signed-off-by: Calvin Owens <[email protected]>

I think increasing the default makes more sense.

If we don't allow applications to set 4K, the kernel shouldn't start
with that value either.

2015-08-11 03:34:52

by Calvin Owens

[permalink] [raw]
Subject: Re: [PATCH] net: Unbreak resetting default values for tcp_wmem/udp_wmem_min

On Sunday 08/09 at 22:41 -0700, David Miller wrote:
> From: Calvin Owens <[email protected]>
> Date: Wed, 5 Aug 2015 13:26:54 -0700
>
> > Commit 8133534c760d4083 ("net: limit tcp/udp rmem/wmem to
> > SOCK_{RCV,SND}BUF_MIN") modified four sysctls to enforce that the values
> > written to them are not less than SOCK_MIN_{RCV,SND}BUF.
> >
> > This change is fine for tcp_rmem and udp_rmem_min, since SOCK_MIN_RCVBUF
> > is equal to equal to TCP_SKB_MIN_TRUESIZE. But it breaks tcp_wmem and
> > udp_wmem_min for previously valid values because SOCK_MIN_SNDBUF is
> > (2 * TCP_SKB_MIN_TRUESIZE), which ends up being greater than 4KB.
> >
> > Thus, 4096 is no longer accepted as a valid value, despite still being
> > the default for udp_wmem_min, and for 'min' in tcp_wmem. A huge number
> > of sysctl configurations at FB use 4096 as 'min', so this change breaks
> > all of them.
> >
> > This patch changes the sysctls to simply enforce that the value written
> > is greater than or equal to the default value of SK_MEM_QUANTUM.
> >
> > Fixes: 8133534c760d4083 ("net: limit tcp/udp rmem/wmem to SOCK_MIN...")
> > Signed-off-by: Calvin Owens <[email protected]>
>
> I think increasing the default makes more sense.
>
> If we don't allow applications to set 4K, the kernel shouldn't start
> with that value either.

I'm really questioning the limitation itself: why enforce a minimum of
SOCK_MIN_SNDBUF here? Why not SK_MEM_QUANTUM?

Commit 8133534c760d4083 referred to b1cb59cf2efe7971, which choose to
use the SOCK_MIN constants as the lower limits to avoid nasty bugs. But
AFAICS, a limit of SOCK_MIN_SNDBUF isn't necessary to do that: the
BUG_ON cited in the commit message for b1cb59cf2efe7971 seems to have
happened because unix_stream_sendmsg() expects a minimum of a full page
(ie SK_MEM_QUANTUM) and the math broke, not because it had less than
SOCK_MIN_SNDBUF allocated.

Nothing seems to assume that it has at least SOCK_MIN_SNDBUF to play
with, so my argument is that enforcing a minimum of SK_MEM_QUANTUM
avoids the sort of bugs commit 8133534c760d4083 was trying to avoid, and
it does so without breaking anybody's sysctl configurations. What do you
think?

Thanks very much,
Calvin

2015-08-11 03:46:35

by David Miller

[permalink] [raw]
Subject: Re: [PATCH] net: Unbreak resetting default values for tcp_wmem/udp_wmem_min

From: Calvin Owens <[email protected]>
Date: Mon, 10 Aug 2015 20:34:06 -0700

> I'm really questioning the limitation itself: why enforce a minimum of
> SOCK_MIN_SNDBUF here? Why not SK_MEM_QUANTUM?
>
> Commit 8133534c760d4083 referred to b1cb59cf2efe7971, which choose to
> use the SOCK_MIN constants as the lower limits to avoid nasty bugs. But
> AFAICS, a limit of SOCK_MIN_SNDBUF isn't necessary to do that: the
> BUG_ON cited in the commit message for b1cb59cf2efe7971 seems to have
> happened because unix_stream_sendmsg() expects a minimum of a full page
> (ie SK_MEM_QUANTUM) and the math broke, not because it had less than
> SOCK_MIN_SNDBUF allocated.
>
> Nothing seems to assume that it has at least SOCK_MIN_SNDBUF to play
> with, so my argument is that enforcing a minimum of SK_MEM_QUANTUM
> avoids the sort of bugs commit 8133534c760d4083 was trying to avoid, and
> it does so without breaking anybody's sysctl configurations. What do you
> think?

The author of said commit argues that too small values lead to really
bad performance, but I guess he should have adjusted the default if he
cared about it so much.

Ok, can you respin your patch with some added details in the commit
message like what you said above?

Thanks.

2015-08-12 04:55:01

by Calvin Owens

[permalink] [raw]
Subject: [PATCH] net: Use SK_MEM_QUANTUM as minimum for tcp/udp rmem/wmem

Commit 8133534c760d4083 ("net: limit tcp/udp rmem/wmem to
SOCK_{RCV,SND}BUF_MIN") modified four sysctls to enforce that the values
written to them are not less than SOCK_MIN_{RCV,SND}BUF.

That change causes 4096 (or SK_MEM_QUANTUM) to no longer be accepted as
a valid value for 'min' in tcp_wmem and udp_wmem_min. 4096 has been the
default for both of those sysctls for a long time, and unfortunately
seems to be an extremely popular setting. This change breaks a large
number of sysctl configurations at FB.

That commit referred to b1cb59cf2efe7971 ("net: sysctl_net_core: check
SNDBUF and RCVBUF for min length"), which choose to use the SOCK_MIN
constants as the lower limits to avoid nasty bugs. But AFAICS, a limit
of SOCK_MIN_SNDBUF isn't necessary to do that: the BUG_ON cited in the
commit message seems to have happened because unix_stream_sendmsg()
expects a minimum of a full page (ie SK_MEM_QUANTUM) and the math broke,
not because it had less than SOCK_MIN_SNDBUF allocated.

Nothing seems to assume that it has at least SOCK_MIN_SNDBUF to play
with, so I think enforcing a minimum of SK_MEM_QUANTUM avoids the sort
of bugs 8133534c was trying to avoid, and it does so without breaking
anybody's sysctl configurations.

Fixes: 8133534c760d4083 ("net: limit tcp/udp rmem/wmem to SOCK_MIN...")
Signed-off-by: Calvin Owens <[email protected]>
---
net/ipv4/sysctl_net_ipv4.c | 11 +++++------
1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c
index 433231c..a214b6a 100644
--- a/net/ipv4/sysctl_net_ipv4.c
+++ b/net/ipv4/sysctl_net_ipv4.c
@@ -41,8 +41,7 @@ static int tcp_syn_retries_min = 1;
static int tcp_syn_retries_max = MAX_TCP_SYNCNT;
static int ip_ping_group_range_min[] = { 0, 0 };
static int ip_ping_group_range_max[] = { GID_T_MAX, GID_T_MAX };
-static int min_sndbuf = SOCK_MIN_SNDBUF;
-static int min_rcvbuf = SOCK_MIN_RCVBUF;
+static int min_buf = SK_MEM_QUANTUM;

/* Update system visible IP port range */
static void set_local_port_range(struct net *net, int range[2])
@@ -530,7 +529,7 @@ static struct ctl_table ipv4_table[] = {
.maxlen = sizeof(sysctl_tcp_wmem),
.mode = 0644,
.proc_handler = proc_dointvec_minmax,
- .extra1 = &min_sndbuf,
+ .extra1 = &min_buf,
},
{
.procname = "tcp_notsent_lowat",
@@ -545,7 +544,7 @@ static struct ctl_table ipv4_table[] = {
.maxlen = sizeof(sysctl_tcp_rmem),
.mode = 0644,
.proc_handler = proc_dointvec_minmax,
- .extra1 = &min_rcvbuf,
+ .extra1 = &min_buf,
},
{
.procname = "tcp_app_win",
@@ -758,7 +757,7 @@ static struct ctl_table ipv4_table[] = {
.maxlen = sizeof(sysctl_udp_rmem_min),
.mode = 0644,
.proc_handler = proc_dointvec_minmax,
- .extra1 = &min_rcvbuf,
+ .extra1 = &min_buf,
},
{
.procname = "udp_wmem_min",
@@ -766,7 +765,7 @@ static struct ctl_table ipv4_table[] = {
.maxlen = sizeof(sysctl_udp_wmem_min),
.mode = 0644,
.proc_handler = proc_dointvec_minmax,
- .extra1 = &min_sndbuf,
+ .extra1 = &min_buf,
},
{ }
};
--
1.8.1

2015-08-12 14:21:08

by Eric Dumazet

[permalink] [raw]
Subject: Re: [PATCH] net: Use SK_MEM_QUANTUM as minimum for tcp/udp rmem/wmem

On Tue, 2015-08-11 at 21:54 -0700, Calvin Owens wrote:
> Commit 8133534c760d4083 ("net: limit tcp/udp rmem/wmem to
> SOCK_{RCV,SND}BUF_MIN") modified four sysctls to enforce that the values
> written to them are not less than SOCK_MIN_{RCV,SND}BUF.
>
> That change causes 4096 (or SK_MEM_QUANTUM) to no longer be accepted as
> a valid value for 'min' in tcp_wmem and udp_wmem_min. 4096 has been the
> default for both of those sysctls for a long time, and unfortunately
> seems to be an extremely popular setting. This change breaks a large
> number of sysctl configurations at FB.
>
> That commit referred to b1cb59cf2efe7971 ("net: sysctl_net_core: check
> SNDBUF and RCVBUF for min length"), which choose to use the SOCK_MIN
> constants as the lower limits to avoid nasty bugs. But AFAICS, a limit
> of SOCK_MIN_SNDBUF isn't necessary to do that: the BUG_ON cited in the
> commit message seems to have happened because unix_stream_sendmsg()
> expects a minimum of a full page (ie SK_MEM_QUANTUM) and the math broke,
> not because it had less than SOCK_MIN_SNDBUF allocated.
>
> Nothing seems to assume that it has at least SOCK_MIN_SNDBUF to play
> with, so I think enforcing a minimum of SK_MEM_QUANTUM avoids the sort
> of bugs 8133534c was trying to avoid, and it does so without breaking
> anybody's sysctl configurations.
>
> Fixes: 8133534c760d4083 ("net: limit tcp/udp rmem/wmem to SOCK_MIN...")
> Signed-off-by: Calvin Owens <[email protected]>
> ---

#define SK_MEM_QUANTUM ((int)PAGE_SIZE)

Some arches have PAGE_SIZE = 65536

So your patch might break scripts as well for them.

We should revert 8133534c760d4083.

2015-08-12 17:01:00

by Sorin Dumitru

[permalink] [raw]
Subject: Re: [PATCH] net: Use SK_MEM_QUANTUM as minimum for tcp/udp rmem/wmem

On Wed, Aug 12, 2015 at 5:21 PM, Eric Dumazet <[email protected]> wrote:
> On Tue, 2015-08-11 at 21:54 -0700, Calvin Owens wrote:
>> Commit 8133534c760d4083 ("net: limit tcp/udp rmem/wmem to
>> SOCK_{RCV,SND}BUF_MIN") modified four sysctls to enforce that the values
>> written to them are not less than SOCK_MIN_{RCV,SND}BUF.
>>
>> That change causes 4096 (or SK_MEM_QUANTUM) to no longer be accepted as
>> a valid value for 'min' in tcp_wmem and udp_wmem_min. 4096 has been the
>> default for both of those sysctls for a long time, and unfortunately
>> seems to be an extremely popular setting. This change breaks a large
>> number of sysctl configurations at FB.
>>
>> That commit referred to b1cb59cf2efe7971 ("net: sysctl_net_core: check
>> SNDBUF and RCVBUF for min length"), which choose to use the SOCK_MIN
>> constants as the lower limits to avoid nasty bugs. But AFAICS, a limit
>> of SOCK_MIN_SNDBUF isn't necessary to do that: the BUG_ON cited in the
>> commit message seems to have happened because unix_stream_sendmsg()
>> expects a minimum of a full page (ie SK_MEM_QUANTUM) and the math broke,
>> not because it had less than SOCK_MIN_SNDBUF allocated.
>>
>> Nothing seems to assume that it has at least SOCK_MIN_SNDBUF to play
>> with, so I think enforcing a minimum of SK_MEM_QUANTUM avoids the sort
>> of bugs 8133534c was trying to avoid, and it does so without breaking
>> anybody's sysctl configurations.
>>
>> Fixes: 8133534c760d4083 ("net: limit tcp/udp rmem/wmem to SOCK_MIN...")
>> Signed-off-by: Calvin Owens <[email protected]>
>> ---
>
> #define SK_MEM_QUANTUM ((int)PAGE_SIZE)
>
> Some arches have PAGE_SIZE = 65536
>
> So your patch might break scripts as well for them.
>
> We should revert 8133534c760d4083.
>

Would clamping the values to a min value, like setsockopt(SO_SNDBUF)
does, be an option?
I still find it odd that SO_SNDBUF limits you, while the /proc
interface doesn't. If you think it's
too much, I'm ok with reverting it since it affects scripts.

On those arches where PAGE_SIZE == 64K(or > 16K) it looks like we have
tcp_wmem[1]
smaller than tcp_wmem[0]. Shouldn't we do something about this?

2015-08-12 17:46:29

by Eric Dumazet

[permalink] [raw]
Subject: Re: [PATCH] net: Use SK_MEM_QUANTUM as minimum for tcp/udp rmem/wmem

On Wed, 2015-08-12 at 20:00 +0300, Sorin Dumitru wrote:

> Would clamping the values to a min value, like setsockopt(SO_SNDBUF)
> does, be an option?
> I still find it odd that SO_SNDBUF limits you, while the /proc
> interface doesn't. If you think it's
> too much, I'm ok with reverting it since it affects scripts.
>
> On those arches where PAGE_SIZE == 64K(or > 16K) it looks like we have
> tcp_wmem[1]
> smaller than tcp_wmem[0]. Shouldn't we do something about this?

As long as we do not crash if/when root user changes /proc/sys/net
settings, we are good.

I would not care if performance is bad if root does something really
stupid. root user is supposed to not mess things just for fun.

2015-08-13 21:07:51

by Calvin Owens

[permalink] [raw]
Subject: Re: [PATCH] net: Use SK_MEM_QUANTUM as minimum for tcp/udp rmem/wmem

On Wednesday 08/12 at 10:46 -0700, Eric Dumazet wrote:
> On Wed, 2015-08-12 at 20:00 +0300, Sorin Dumitru wrote:
>
> > Would clamping the values to a min value, like setsockopt(SO_SNDBUF)
> > does, be an option?
> > I still find it odd that SO_SNDBUF limits you, while the /proc
> > interface doesn't. If you think it's
> > too much, I'm ok with reverting it since it affects scripts.
> >
> > On those arches where PAGE_SIZE == 64K(or > 16K) it looks like we have
> > tcp_wmem[1]
> > smaller than tcp_wmem[0]. Shouldn't we do something about this?
>
> As long as we do not crash if/when root user changes /proc/sys/net
> settings, we are good.

Using "1 1 1" for tcp_{r,w}mem seems not to explode, so this sounds good
to me. I'll send a new patch reverting the original.

Thanks,
Calvin

> I would not care if performance is bad if root does something really
> stupid. root user is supposed to not mess things just for fun.

2015-08-13 21:22:16

by Calvin Owens

[permalink] [raw]
Subject: [PATCH] Revert "net: limit tcp/udp rmem/wmem to SOCK_{RCV,SND}BUF_MIN"

Commit 8133534c760d4083 ("net: limit tcp/udp rmem/wmem to
SOCK_{RCV,SND}BUF_MIN") modified four sysctls to enforce that the values
written to them are not less than SOCK_MIN_{RCV,SND}BUF.

That change causes 4096 to no longer be accepted as a valid value for
'min' in tcp_wmem and udp_wmem_min. 4096 has been the default for both
of those sysctls for a long time, and unfortunately seems to be an
extremely popular setting. This change breaks a large number of sysctl
configurations at Facebook.

That commit referred to b1cb59cf2efe7971 ("net: sysctl_net_core: check
SNDBUF and RCVBUF for min length"), which choose to use the SOCK_MIN
constants as the lower limits to avoid nasty bugs. But AFAICS, a limit
of SOCK_MIN_SNDBUF isn't necessary to do that: the BUG_ON cited in the
commit message seems to have happened because unix_stream_sendmsg()
expects a minimum of a full page (ie SK_MEM_QUANTUM) and the math broke,
not because it had less than SOCK_MIN_SNDBUF allocated.

This particular issue doesn't seem to affect TCP however: using a
setting of "1 1 1" for tcp_{r,w}mem works, although it's obviously
suboptimal. SK_MEM_QUANTUM would be a nice minimum, but it's 64K on
some archs, so there would still be breakage.

Since a value of one doesn't seem to cause any problems, we can drop the
minimum 8133534c added to fix this.

This reverts commit 8133534c760d4083f79d2cde42c636ccc0b2792e.

Fixes: 8133534c760d4083 ("net: limit tcp/udp rmem/wmem to SOCK_MIN...")
Cc: Eric Dumazet <[email protected]>
Cc: Sorin Dumitru <[email protected]>
Signed-off-by: Calvin Owens <[email protected]>
---
net/ipv4/sysctl_net_ipv4.c | 10 ++++------
1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c
index 433231c..0330ab2 100644
--- a/net/ipv4/sysctl_net_ipv4.c
+++ b/net/ipv4/sysctl_net_ipv4.c
@@ -41,8 +41,6 @@ static int tcp_syn_retries_min = 1;
static int tcp_syn_retries_max = MAX_TCP_SYNCNT;
static int ip_ping_group_range_min[] = { 0, 0 };
static int ip_ping_group_range_max[] = { GID_T_MAX, GID_T_MAX };
-static int min_sndbuf = SOCK_MIN_SNDBUF;
-static int min_rcvbuf = SOCK_MIN_RCVBUF;

/* Update system visible IP port range */
static void set_local_port_range(struct net *net, int range[2])
@@ -530,7 +528,7 @@ static struct ctl_table ipv4_table[] = {
.maxlen = sizeof(sysctl_tcp_wmem),
.mode = 0644,
.proc_handler = proc_dointvec_minmax,
- .extra1 = &min_sndbuf,
+ .extra1 = &one,
},
{
.procname = "tcp_notsent_lowat",
@@ -545,7 +543,7 @@ static struct ctl_table ipv4_table[] = {
.maxlen = sizeof(sysctl_tcp_rmem),
.mode = 0644,
.proc_handler = proc_dointvec_minmax,
- .extra1 = &min_rcvbuf,
+ .extra1 = &one,
},
{
.procname = "tcp_app_win",
@@ -758,7 +756,7 @@ static struct ctl_table ipv4_table[] = {
.maxlen = sizeof(sysctl_udp_rmem_min),
.mode = 0644,
.proc_handler = proc_dointvec_minmax,
- .extra1 = &min_rcvbuf,
+ .extra1 = &one
},
{
.procname = "udp_wmem_min",
@@ -766,7 +764,7 @@ static struct ctl_table ipv4_table[] = {
.maxlen = sizeof(sysctl_udp_wmem_min),
.mode = 0644,
.proc_handler = proc_dointvec_minmax,
- .extra1 = &min_sndbuf,
+ .extra1 = &one
},
{ }
};
--
2.5.0

2015-08-13 22:56:39

by Eric Dumazet

[permalink] [raw]
Subject: Re: [PATCH] Revert "net: limit tcp/udp rmem/wmem to SOCK_{RCV,SND}BUF_MIN"

On Thu, 2015-08-13 at 14:21 -0700, Calvin Owens wrote:
> Commit 8133534c760d4083 ("net: limit tcp/udp rmem/wmem to
> SOCK_{RCV,SND}BUF_MIN") modified four sysctls to enforce that the values
> written to them are not less than SOCK_MIN_{RCV,SND}BUF.
>
...
>
> This reverts commit 8133534c760d4083f79d2cde42c636ccc0b2792e.
>
> Fixes: 8133534c760d4083 ("net: limit tcp/udp rmem/wmem to SOCK_MIN...")
> Cc: Eric Dumazet <[email protected]>
> Cc: Sorin Dumitru <[email protected]>
> Signed-off-by: Calvin Owens <[email protected]>

Acked-by: Eric Dumazet <[email protected]>

2015-08-17 19:11:18

by David Miller

[permalink] [raw]
Subject: Re: [PATCH] Revert "net: limit tcp/udp rmem/wmem to SOCK_{RCV,SND}BUF_MIN"

From: Calvin Owens <[email protected]>
Date: Thu, 13 Aug 2015 14:21:34 -0700

> Commit 8133534c760d4083 ("net: limit tcp/udp rmem/wmem to
> SOCK_{RCV,SND}BUF_MIN") modified four sysctls to enforce that the values
> written to them are not less than SOCK_MIN_{RCV,SND}BUF.
>
> That change causes 4096 to no longer be accepted as a valid value for
> 'min' in tcp_wmem and udp_wmem_min. 4096 has been the default for both
> of those sysctls for a long time, and unfortunately seems to be an
> extremely popular setting. This change breaks a large number of sysctl
> configurations at Facebook.
>
> That commit referred to b1cb59cf2efe7971 ("net: sysctl_net_core: check
> SNDBUF and RCVBUF for min length"), which choose to use the SOCK_MIN
> constants as the lower limits to avoid nasty bugs. But AFAICS, a limit
> of SOCK_MIN_SNDBUF isn't necessary to do that: the BUG_ON cited in the
> commit message seems to have happened because unix_stream_sendmsg()
> expects a minimum of a full page (ie SK_MEM_QUANTUM) and the math broke,
> not because it had less than SOCK_MIN_SNDBUF allocated.
>
> This particular issue doesn't seem to affect TCP however: using a
> setting of "1 1 1" for tcp_{r,w}mem works, although it's obviously
> suboptimal. SK_MEM_QUANTUM would be a nice minimum, but it's 64K on
> some archs, so there would still be breakage.
>
> Since a value of one doesn't seem to cause any problems, we can drop the
> minimum 8133534c added to fix this.
>
> This reverts commit 8133534c760d4083f79d2cde42c636ccc0b2792e.
>
> Fixes: 8133534c760d4083 ("net: limit tcp/udp rmem/wmem to SOCK_MIN...")
> Cc: Eric Dumazet <[email protected]>
> Cc: Sorin Dumitru <[email protected]>
> Signed-off-by: Calvin Owens <[email protected]>

Applied, thanks.