2006-09-22 06:24:51

by William Pitcock

Subject: [PATCH 2.6.18 1/1] net/ipv4: sysctl to allow non-superuser to bypass CAP_NET_BIND_SERVICE requirement

This patch makes it possible to disable the CAP_NET_BIND_SERVICE
requirement, so that a non-superuser can bind to ports below 1024. It is
toggled by the net.ipv4.allow_lowport_bind_nonsuperuser sysctl value.

diff --git a/include/linux/sysctl.h b/include/linux/sysctl.h
index e4b1a4d..c3f7c3c 100644
--- a/include/linux/sysctl.h
+++ b/include/linux/sysctl.h
@@ -411,6 +411,7 @@ enum
NET_IPV4_TCP_WORKAROUND_SIGNED_WINDOWS=115,
NET_TCP_DMA_COPYBREAK=116,
NET_TCP_SLOW_START_AFTER_IDLE=117,
+ NET_IPV4_ALLOW_LOWPORT_BIND_NONSUPERUSER=118,
};
enum {
diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
index c84a320..a2ea829 100644
--- a/net/ipv4/af_inet.c
+++ b/net/ipv4/af_inet.c
@@ -394,6 +394,11 @@ int inet_release(struct socket *sock)
/* It is off by default, see below. */
int sysctl_ip_nonlocal_bind;

+/* When this is enabled, it allows normal users to bind to ports <=
1023.
+ * This is set by the net.ipv4.allow_lowport_bind_nonsuperuser
sysctl value.
+ */
+int sysctl_ip_allow_lowport_bind_nonsuperuser;
+
int inet_bind(struct socket *sock, struct sockaddr *uaddr, int
addr_len)
{
struct sockaddr_in *addr = (struct sockaddr_in *)uaddr;
@@ -432,7 +437,8 @@ int inet_bind(struct socket *sock, struc
snum = ntohs(addr->sin_port);
err = -EACCES;
- if (snum && snum < PROT_SOCK && !capable(CAP_NET_BIND_SERVICE))
+ if (!sysctl_ip_allow_lowport_bind_nonsuperuser && snum && snum <
PROT_SOCK &&
+ !capable(CAP_NET_BIND_SERVICE))
goto out;
/* We keep a pair of addresses. rcv_saddr is the one
@@ -1412,3 +1418,4 @@ EXPORT_SYMBOL(inet_stream_ops);
EXPORT_SYMBOL(inet_unregister_protosw);
EXPORT_SYMBOL(net_statistics);
EXPORT_SYMBOL(sysctl_ip_nonlocal_bind);
+EXPORT_SYMBOL(sysctl_ip_allow_lowport_bind_nonsuperuser);
diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c
index 70cea9d..c57ef3a 100644
--- a/net/ipv4/sysctl_net_ipv4.c
+++ b/net/ipv4/sysctl_net_ipv4.c
@@ -20,6 +20,7 @@ #include <net/tcp.h>
/* From af_inet.c */
extern int sysctl_ip_nonlocal_bind;
+extern int sysctl_ip_allow_lowport_bind_nonsuperuser;
#ifdef CONFIG_SYSCTL
static int zero;
@@ -197,6 +198,14 @@ ctl_table ipv4_table[] = {
.proc_handler = &proc_dointvec
},
{
+ .ctl_name = NET_IPV4_ALLOW_LOWPORT_BIND_NONSUPERUSER,
+ .procname = "allow_lowport_bind_nonsuperuser",
+ .data = &sysctl_ip_allow_lowport_bind_nonsuperuser,
+ .maxlen = sizeof(int),
+ .mode = 0644,
+ .proc_handler = &proc_dointvec
+ },
+ {
.ctl_name = NET_IPV4_TCP_SYN_RETRIES,
.procname = "tcp_syn_retries",
.data = &sysctl_tcp_syn_retries,


Signed-off-by: William Pitcock <[email protected]>
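
For readers following along, the condition this patch changes in
inet_bind() can be modeled in user space. This is only a sketch under
stated assumptions: lowport_bind_denied() is a hypothetical name,
allow_lowport stands in for sysctl_ip_allow_lowport_bind_nonsuperuser,
and has_cap stands in for the result of capable(CAP_NET_BIND_SERVICE).

```c
#include <stdbool.h>

#define PROT_SOCK 1024	/* first port that needs no privilege, as in the kernel */

/*
 * Hypothetical user-space model of the patched test in inet_bind().
 * Returns true when the bind would be refused with -EACCES.
 *
 *   allow_lowport: stand-in for sysctl_ip_allow_lowport_bind_nonsuperuser
 *   snum:          requested port in host byte order (0 = let the kernel pick)
 *   has_cap:       stand-in for capable(CAP_NET_BIND_SERVICE)
 */
bool lowport_bind_denied(int allow_lowport, unsigned short snum, bool has_cap)
{
	return !allow_lowport && snum && snum < PROT_SOCK && !has_cap;
}
```

With allow_lowport at 0 the result matches the unpatched check; any
non-zero value short-circuits it for every caller.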


2006-09-22 07:31:38

by William Pitcock

Subject: [PATCH 2.6.18 try 2] net/ipv4: sysctl to allow non-superuser to bypass CAP_NET_BIND_SERVICE requirement

This patch makes it possible to disable the CAP_NET_BIND_SERVICE
requirement, so that a non-superuser can bind to ports below 1024. It is
toggled by the net.ipv4.allow_lowport_bind_nonsuperuser sysctl value.

Changes:
- clean up mangling from mailer
- put my signoff in the right location (oops)

Signed-off-by: William Pitcock <[email protected]>
---
diff --git a/include/linux/sysctl.h b/include/linux/sysctl.h
index e4b1a4d..c3f7c3c 100644
--- a/include/linux/sysctl.h
+++ b/include/linux/sysctl.h
@@ -411,6 +411,7 @@ enum
NET_IPV4_TCP_WORKAROUND_SIGNED_WINDOWS=115,
NET_TCP_DMA_COPYBREAK=116,
NET_TCP_SLOW_START_AFTER_IDLE=117,
+ NET_IPV4_ALLOW_LOWPORT_BIND_NONSUPERUSER=118,
};

enum {
diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
index c84a320..a2ea829 100644
--- a/net/ipv4/af_inet.c
+++ b/net/ipv4/af_inet.c
@@ -394,6 +394,11 @@ int inet_release(struct socket *sock)
/* It is off by default, see below. */
int sysctl_ip_nonlocal_bind;

+/* When this is enabled, it allows normal users to bind to ports <= 1023.
+ * This is set by the net.ipv4.allow_lowport_bind_nonsuperuser sysctl value.
+ */
+int sysctl_ip_allow_lowport_bind_nonsuperuser;
+
int inet_bind(struct socket *sock, struct sockaddr *uaddr, int addr_len)
{
struct sockaddr_in *addr = (struct sockaddr_in *)uaddr;
@@ -432,7 +437,8 @@ int inet_bind(struct socket *sock, struc
snum = ntohs(addr->sin_port);
err = -EACCES;
- if (snum && snum < PROT_SOCK && !capable(CAP_NET_BIND_SERVICE))
+ if (!sysctl_ip_allow_lowport_bind_nonsuperuser && snum && snum < PROT_SOCK &&
+ !capable(CAP_NET_BIND_SERVICE))
goto out;

/* We keep a pair of addresses. rcv_saddr is the one
@@ -1412,3 +1418,4 @@ EXPORT_SYMBOL(inet_stream_ops);
EXPORT_SYMBOL(inet_unregister_protosw);
EXPORT_SYMBOL(net_statistics);
EXPORT_SYMBOL(sysctl_ip_nonlocal_bind);
+EXPORT_SYMBOL(sysctl_ip_allow_lowport_bind_nonsuperuser);
diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c
index 70cea9d..c57ef3a 100644
--- a/net/ipv4/sysctl_net_ipv4.c
+++ b/net/ipv4/sysctl_net_ipv4.c
@@ -20,6 +20,7 @@ #include <net/tcp.h>
/* From af_inet.c */
extern int sysctl_ip_nonlocal_bind;
+extern int sysctl_ip_allow_lowport_bind_nonsuperuser;

#ifdef CONFIG_SYSCTL
static int zero;
@@ -197,6 +198,14 @@ ctl_table ipv4_table[] = {
.proc_handler = &proc_dointvec
},
{
+ .ctl_name = NET_IPV4_ALLOW_LOWPORT_BIND_NONSUPERUSER,
+ .procname = "allow_lowport_bind_nonsuperuser",
+ .data = &sysctl_ip_allow_lowport_bind_nonsuperuser,
+ .maxlen = sizeof(int),
+ .mode = 0644,
+ .proc_handler = &proc_dointvec
+ },
+ {
.ctl_name = NET_IPV4_TCP_SYN_RETRIES,
.procname = "tcp_syn_retries",
.data = &sysctl_tcp_syn_retries,

2006-09-22 07:38:58

by YOSHIFUJI Hideaki

Subject: Re: [PATCH 2.6.18 try 2] net/ipv4: sysctl to allow non-superuser to bypass CAP_NET_BIND_SERVICE requirement

In article <[email protected]> (at Fri, 22 Sep 2006 02:31:59 -0500), William Pitcock <[email protected]> says:

> This patch allows for a user to disable the requirement to meet the
> CAP_NET_BIND_SERVICE capability for a non-superuser. It is toggled by
> the net.ipv4.allow_lowport_bind_nonsuperuser sysctl value.

Why? I don't think this is a good idea.

> diff --git a/include/linux/sysctl.h b/include/linux/sysctl.h
> index e4b1a4d..c3f7c3c 100644
> --- a/include/linux/sysctl.h
> +++ b/include/linux/sysctl.h
> @@ -411,6 +411,7 @@ enum
> NET_IPV4_TCP_WORKAROUND_SIGNED_WINDOWS=115,
> NET_TCP_DMA_COPYBREAK=116,
> NET_TCP_SLOW_START_AFTER_IDLE=117,
> + NET_IPV4_ALLOW_LOWPORT_BIND_NONSUPERUSER=118,
> };
>
> enum {

This implies all IPv4 protocols including other protocols
such as UDP, SCTP, ...

> @@ -1412,3 +1418,4 @@ EXPORT_SYMBOL(inet_stream_ops);
> EXPORT_SYMBOL(inet_unregister_protosw);
> EXPORT_SYMBOL(net_statistics);
> EXPORT_SYMBOL(sysctl_ip_nonlocal_bind);
> +EXPORT_SYMBOL(sysctl_ip_allow_lowport_bind_nonsuperuser);

Please mind the indentation.

--yoshfuji

2006-09-22 08:27:01

by William Pitcock

Subject: Re: [PATCH 2.6.18 try 2] net/ipv4: sysctl to allow non-superuser to bypass CAP_NET_BIND_SERVICE requirement

On Sep 22, 2006, at 2:41 AM, YOSHIFUJI Hideaki wrote:

> In article <[email protected]> (at
> Fri, 22 Sep 2006 02:31:59 -0500), William Pitcock
> <[email protected]> says:
>
>> This patch allows for a user to disable the requirement to meet the
>> CAP_NET_BIND_SERVICE capability for a non-superuser. It is toggled by
>> the net.ipv4.allow_lowport_bind_nonsuperuser sysctl value.
>
> Why? I don't think this is a good idea.
>

There are several reasons. To summarize, in some setups, such as
mine, it is undesirable to force applications to run as root to gain
access to 'service' ports. A more detailed list of reasons why this
patch is a good idea follows:

* People want to run restricted services such as jabber, ircd, etc.
on low ports to allow users to bypass ISP firewalls, but the
software doesn't have mechanisms for dropping privileges (most ircds,
for example, do not have such an option).

* The software is untrusted by the end user. In the event that the
software is not trustworthy, the amount of damage it can do running
as a normal user is less than as a superuser. As it is, the bind()
may have failed before the CAP_NET_BIND_SERVICE capability was
granted to the process.

* Building on that, capabilities are still Linux-specific. Other
systems, such as FreeBSD, allow you to disable this restriction via
sysctl as well. It is very likely that daemons are not capability
aware, and thus would require some sort of wrapper script (which is
likely beyond the ability of most end users). Wrapping the daemon
would still require superuser privileges as well to make sure it
worked properly, and even if it did work properly, it still opens a
race condition where the bind() may have failed before the capability
bit was granted to the process.

* Many services do not run on 'service' ports, and instead run on
unprivileged ports (1024 and above). For instance, MySQL listens on
TCP/3306 by default, and PostgreSQL listens on an unprivileged port
as well (although I cannot recall the exact port number at present).
In many cases, squid runs on port 8080, which is also unprivileged.
For this reason, it is arguable that the entire CAP_NET_BIND_SERVICE
restriction isn't very useful.

* Embedded devices (consumer routers, etc.) may want to have some
level of privilege separation internally to reduce the potential for
exploitation of their firmware. This patch makes that easier to
accomplish (just set the sysctl during initialization and go from
there).

* Other TCP stacks (Winsock2, for instance) do not impose the <= 1023
limit.

>> diff --git a/include/linux/sysctl.h b/include/linux/sysctl.h
>> index e4b1a4d..c3f7c3c 100644
>> --- a/include/linux/sysctl.h
>> +++ b/include/linux/sysctl.h
>> @@ -411,6 +411,7 @@ enum
>> NET_IPV4_TCP_WORKAROUND_SIGNED_WINDOWS=115,
>> NET_TCP_DMA_COPYBREAK=116,
>> NET_TCP_SLOW_START_AFTER_IDLE=117,
>> + NET_IPV4_ALLOW_LOWPORT_BIND_NONSUPERUSER=118,
>> };
>>
>> enum {
>
> This implies all IPv4 protocols including other protocols
> such as UDP, SCTP, ...

Yes, I'll change the sysctl name to better indicate that it is for TCP.
That is not an issue. If you have a suggestion for what it should be,
I'd love to hear it.

>
>> @@ -1412,3 +1418,4 @@ EXPORT_SYMBOL(inet_stream_ops);
>> EXPORT_SYMBOL(inet_unregister_protosw);
>> EXPORT_SYMBOL(net_statistics);
>> EXPORT_SYMBOL(sysctl_ip_nonlocal_bind);
>> +EXPORT_SYMBOL(sysctl_ip_allow_lowport_bind_nonsuperuser);
>
> Please mind the indentation.

I'll be sure to fix that, thank you.

(resent due to mailer glitch)

- nenolod


2006-09-22 08:59:32

by David Wagner

Subject: Re: [PATCH 2.6.18 try 2] net/ipv4: sysctl to allow non-superuser to bypass CAP_NET_BIND_SERVICE requirement

William Pitcock wrote:
>This patch allows for a user to disable the requirement to meet the
>CAP_NET_BIND_SERVICE capability for a non-superuser. It is toggled by
>the net.ipv4.allow_lowport_bind_nonsuperuser sysctl value.

Can't you provide this functionality (in a non-transparent way) through
user-space code alone? I'm thinking of a setuid-root program that
takes a port number as argv[1], binds to that port, dup()s the new
file descriptor onto fd 0 (say), drops root, and then forks and execs
a program specified on argv[2]. If you want to get fancy, instead of
exec-ing, you could use the standard trick to pass the file descriptor
over a Unix domain socket to some other process. Seems like you should
be able to make something like this work, as long as you're willing to
make small modifications to the program that uses the low port. Does
that work?
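
The wrapper described above could look roughly like the following. This
is a minimal sketch, not a vetted setuid program: bind_listener(),
drop_root(), parse_port() and the LOWPORT_WRAPPER_MAIN build flag are
hypothetical names, it execs directly instead of forking first, and it
assumes the target program accepts connections on fd 0. A real wrapper
would also need a policy for which users may claim which ports.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Create a TCP socket listening on `port` on all IPv4 addresses; -1 on error. */
int bind_listener(unsigned short port)
{
	struct sockaddr_in addr;
	int one = 1;
	int fd = socket(AF_INET, SOCK_STREAM, 0);

	if (fd < 0)
		return -1;
	setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &one, sizeof(one));
	memset(&addr, 0, sizeof(addr));
	addr.sin_family = AF_INET;
	addr.sin_addr.s_addr = htonl(INADDR_ANY);
	addr.sin_port = htons(port);
	if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
	    listen(fd, SOMAXCONN) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}

/* Give up root for good: group first, then user. Returns 0 on success. */
int drop_root(void)
{
	if (setgid(getgid()) != 0)
		return -1;
	if (setuid(getuid()) != 0)
		return -1;
	return 0;
}

/* Parse a decimal port number; returns -1 unless 1 <= value <= 65535. */
long parse_port(const char *s)
{
	char *end;
	long v = strtol(s, &end, 10);

	if (*s == '\0' || *end != '\0' || v < 1 || v > 65535)
		return -1;
	return v;
}

#ifdef LOWPORT_WRAPPER_MAIN	/* build with -DLOWPORT_WRAPPER_MAIN for the tool */
int main(int argc, char **argv)
{
	long port;
	int fd;

	if (argc < 3 || (port = parse_port(argv[1])) < 0) {
		fprintf(stderr, "usage: %s port program [args...]\n", argv[0]);
		return 2;
	}
	fd = bind_listener((unsigned short)port);
	if (fd < 0) {
		perror("bind");
		return 1;
	}
	if (fd != STDIN_FILENO) {	/* hand the listener over as fd 0 */
		if (dup2(fd, STDIN_FILENO) < 0) {
			perror("dup2");
			return 1;
		}
		close(fd);
	}
	if (drop_root() != 0) {		/* must happen before exec */
		perror("drop_root");
		return 1;
	}
	execvp(argv[2], &argv[2]);
	perror("execvp");
	return 1;
}
#endif
```

The order matters: bind while still privileged, drop root, and only
then exec the untrusted program.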

2006-09-22 09:19:04

by William Pitcock

Subject: Re: [PATCH 2.6.18 try 2] net/ipv4: sysctl to allow non-superuser to bypass CAP_NET_BIND_SERVICE requirement

On Sep 22, 2006, at 3:59 AM, David Wagner wrote:

> William Pitcock wrote:
>> This patch allows for a user to disable the requirement to meet the
>> CAP_NET_BIND_SERVICE capability for a non-superuser. It is toggled by
>> the net.ipv4.allow_lowport_bind_nonsuperuser sysctl value.
>
> Can't you provide this functionality (in a non-transparent way)
> through
> user-space code alone? I'm thinking of a setuid-root program that
> takes a port number as argv[1], binds to that port, dup()s the new
> file descriptor onto fd 0 (say), drops root, and then forks and execs
> a program specified on argv[2]. If you want to get fancy, instead of
> exec-ing, you could use the standard trick to pass the file descriptor
> over a Unix domain socket to some other process. Seems like you
> should
> be able to make something like this work, as long as you're willing to
> make small modifications to the program that uses the low port. Does
> that work?

While this is possible, the purpose of this patch is to allow for
such things to "just work" without any effort from the user to make
it work.

Additionally, with your solution, the program would still need to be
extensively modified. With the sysctl patch, this isn't necessary, as
the lowport bind() will be successful as long as the sysctl value is
set to a non-zero value.

On other TCP stacks, such as the one included with FreeBSD, you can
do the exact same thing this patch does, by doing:

# sysctl net.inet.ip.portrange.reservedhigh=0

The goal of this patch is to provide similar functionality, which it
currently does. It is not as fancy as FreeBSD's, because PROT_SOCK in
af_inet.c is a constant (#define), and thus not as nicely tunable.

However, that is a weak argument for not doing it that way, as I
could have done something like:

int sysctl_ip_portrange_high = PROT_SOCK;

The current way is simpler, though, than the way it is done in
FreeBSD, and I feel it covers the typical use-case very well.
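
For comparison, the tunable-threshold variant mentioned above might be
modeled like this in user space. It is only a sketch: portrange_high is
a hypothetical stand-in for the sysctl_ip_portrange_high variable,
bind_denied_by_threshold() is an invented name, and has_cap again
stands in for capable(CAP_NET_BIND_SERVICE).

```c
#include <stdbool.h>

#define PROT_SOCK 1024

/* Hypothetical tunable: ports below this value require the capability. */
int portrange_high = PROT_SOCK;

/* Returns true when a bind to port snum would be refused. */
bool bind_denied_by_threshold(unsigned short snum, bool has_cap)
{
	return snum && snum < portrange_high && !has_cap;
}
```

Setting the threshold to 0 removes the restriction entirely, while
intermediate values keep only part of the range privileged.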

That said, what you proposed is really not a bad idea. But I
still believe that the sysctl patch is more flexible, especially in
cases where you might not have the source code to what you are trying
to run (common with enterprise apps, gameserver admin panels, etc.).

2006-09-22 09:38:49

by David Wagner

Subject: Re: [PATCH 2.6.18 try 2] net/ipv4: sysctl to allow non-superuser to bypass CAP_NET_BIND_SERVICE requirement

William Pitcock wrote:
>Additionally, with your solution, the program would still need to be
>extensively modified.

I suspect "extensively" may be a little bit of an overstatement, though
it sure would take some doing. With some work, it may be possible to
write an alternative implementation of bind() that creates a Unix domain
socket, forks, execs a copy of the setuid-root program, receives a copy
of the newly opened fd passed over the Unix domain socket, and returns
that to the caller of bind(). In this way, it might be possible to
build a solution that requires only minimal modifications to the app
(just change how it is linked). It'd be messy and thoroughly unportable
(because it would only work on systems where that setuid program was
installed), but maybe doable.
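
The descriptor-passing step in that scheme uses SCM_RIGHTS ancillary
data on a Unix domain socket. A minimal sketch of the two halves
follows; send_fd() and recv_fd() are hypothetical helper names, and the
replacement bind() wrapper itself is not shown.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

/* Send descriptor `fd` over the Unix domain socket `chan`. 0 on success. */
int send_fd(int chan, int fd)
{
	char byte = 0;			/* at least one data byte must travel */
	struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
	union {				/* union keeps the buffer aligned */
		struct cmsghdr align;
		char buf[CMSG_SPACE(sizeof(int))];
	} ctrl;
	struct msghdr msg;
	struct cmsghdr *cmsg;

	memset(&msg, 0, sizeof(msg));
	memset(&ctrl, 0, sizeof(ctrl));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = ctrl.buf;
	msg.msg_controllen = sizeof(ctrl.buf);

	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

	return sendmsg(chan, &msg, 0) == 1 ? 0 : -1;
}

/* Receive a descriptor from `chan`. Returns the new fd, or -1 on error. */
int recv_fd(int chan)
{
	char byte;
	struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
	union {
		struct cmsghdr align;
		char buf[CMSG_SPACE(sizeof(int))];
	} ctrl;
	struct msghdr msg;
	struct cmsghdr *cmsg;
	int fd;

	memset(&msg, 0, sizeof(msg));
	memset(&ctrl, 0, sizeof(ctrl));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = ctrl.buf;
	msg.msg_controllen = sizeof(ctrl.buf);

	if (recvmsg(chan, &msg, 0) != 1)
		return -1;
	cmsg = CMSG_FIRSTHDR(&msg);
	if (!cmsg || cmsg->cmsg_level != SOL_SOCKET ||
	    cmsg->cmsg_type != SCM_RIGHTS ||
	    cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
		return -1;
	memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
	return fd;
}
```

In the scheme above, the privileged helper would call send_fd() with
the bound socket and the replacement bind() would call recv_fd().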

>However, that's really not a bad idea (what you proposed). But, I
>still believe that the sysctl patch is more flexible, especially in
>cases where you might not have the source-code to what you are trying
>to run (common with enterprise apps, gameserver admin panels, etc.).

Ok. Understandable. I leave it to others to comment further. I'm not
advocating anything either way.

2006-09-22 11:54:52

by Rolf Eike Beer

Subject: Re: [PATCH 2.6.18 1/1] net/ipv4: sysctl to allow non-superuser to bypass CAP_NET_BIND_SERVICE requirement

William Pitcock wrote:
> This patch allows for a user to disable the requirement to meet the
> CAP_NET_BIND_SERVICE capability for a non-superuser. It is toggled by
> the net.ipv4.allow_lowport_bind_nonsuperuser sysctl value.

I assume you are searching for accessfs.

Eike


2006-09-22 18:04:20

by David Miller

Subject: Re: [PATCH 2.6.18 try 2] net/ipv4: sysctl to allow non-superuser to bypass CAP_NET_BIND_SERVICE requirement

From: William Pitcock <[email protected]>
Date: Fri, 22 Sep 2006 03:27:22 -0500

> * The software is untrusted by the end user, in the event that the
> software is not trustworthy, the amount of damage it can do running
> as a normal user is less than as a superuser. As it is, the bind()
> may have failed before the CAP_NET_BIND_SERVICE capability was
> granted to the process.

You have the power to exec() the daemon in question with
CAP_NET_BIND_SERVICE capability inherited from the parent,
and that will be the only "extra" capability the process will
have.

So there is in fact an existing mechanism for doing this.

If you have the power to set the sysctl, you have the power
to give the capability to an arbitrary process which you
want to get lower ports but do not trust to run completely
as root.
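
A daemon started that way could confirm that it really holds the
capability before calling bind(). The sketch below is Linux-only and
uses the raw capget system call; has_net_bind_service() is a
hypothetical helper name, and how the parent arranges for the
capability to survive exec() is not shown, since that depends on the
kernel version and configuration.

```c
#include <linux/capability.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Report whether the calling process has CAP_NET_BIND_SERVICE in its
 * effective set: 1 if held, 0 if not, -1 if capget fails.
 */
int has_net_bind_service(void)
{
	struct __user_cap_header_struct hdr;
	struct __user_cap_data_struct data[_LINUX_CAPABILITY_U32S_3];

	memset(&hdr, 0, sizeof(hdr));
	memset(data, 0, sizeof(data));
	hdr.version = _LINUX_CAPABILITY_VERSION_3;
	hdr.pid = 0;			/* 0 means the calling process */

	if (syscall(SYS_capget, &hdr, data) != 0)
		return -1;

	/* Each array element holds 32 capability bits. */
	return (data[CAP_NET_BIND_SERVICE / 32].effective >>
		(CAP_NET_BIND_SERVICE % 32)) & 1u;
}
```

A process running as an ordinary user would normally see 0 here, and
a process running as root would normally see 1.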