LinuxLists.cc - [RFC] Allow user namespace inside chroot

2018-10-15 17:11:45

Subject: [RFC] Allow user namespace inside chroot

From: Nagarathnam Muthusamy <[email protected]>

Following commit disables the creation of user namespace inside
the chroot environment.

userns: Don't allow creation if the user is chrooted

commit 3151527ee007b73a0ebd296010f1c0454a919c7d

Consider a system in which a non-root user creates a combination
of user, pid and mount namespaces and confines a process to it.
The system will have multiple levels of nested namespaces.
The root namespace in the system will have lots of directories
which should not be exposed to the child confined to the set of
namespaces.

Without chroot, we will have to hide all unwanted directories
individually using bind mounts and mount namespace. Chroot enables
us to expose a handpicked list of directories which the child
can see but if we use chroot we wont be able to create nested
namespaces.

Allowing a process to create user namespace within a chroot
environment will enable it to chroot, which in turn can be used
to escape the jail.

This patch drops the chroot privilege when user namespace is
created within the chroot environment so the process cannot
use it to escape the chroot jail. The process can still modify
the view of the file system using mount namespace but for those
modifications to be useful, it needs to run a setuid program with
that intented uid directly mapped into the user namespace as it is
which is not possible for an unprivileged process.

If there were any other corner cases which were considered while
deciding to disable the creation of user namespace as a whole
within the chroot environment please let me know.

Signed-off-by: Nagarathnam Muthusamy<[email protected]>
---
kernel/user_namespace.c | 22 +++++++++++++---------
1 file changed, 13 insertions(+), 9 deletions(-)

diff --git a/kernel/user_namespace.c b/kernel/user_namespace.c
index e5222b5..83d2a70 100644
--- a/kernel/user_namespace.c
+++ b/kernel/user_namespace.c
@@ -44,7 +44,7 @@ static void dec_user_namespaces(struct ucounts *ucounts)
return dec_ucount(ucounts, UCOUNT_USER_NAMESPACES);
}

-static void set_cred_user_ns(struct cred *cred, struct user_namespace *user_ns)
+static void set_cred_user_ns(struct cred *cred, struct user_namespace *user_ns, int is_chrooted)
{
/* Start with the same capabilities as init but useless for doing
* anything as the capabilities are bound to the new user namespace.
@@ -55,6 +55,11 @@ static void set_cred_user_ns(struct cred *cred, struct user_namespace *user_ns)
cred->cap_effective = CAP_FULL_SET;
cred->cap_ambient = CAP_EMPTY_SET;
cred->cap_bset = CAP_FULL_SET;
+ if (is_chrooted) {
+ cap_lower(cred->cap_permitted, CAP_SYS_CHROOT);
+ cap_lower(cred->cap_effective, CAP_SYS_CHROOT);
+ cap_lower(cred->cap_bset, CAP_SYS_CHROOT);
+ }
#ifdef CONFIG_KEYS
key_put(cred->request_key_auth);
cred->request_key_auth = NULL;
@@ -78,6 +83,7 @@ int create_user_ns(struct cred *new)
kgid_t group = new->egid;
struct ucounts *ucounts;
int ret, i;
+ int is_chrooted = 0;

ret = -ENOSPC;
if (parent_ns->level > 32)
@@ -88,14 +94,12 @@ int create_user_ns(struct cred *new)
goto fail;

/*
- * Verify that we can not violate the policy of which files
- * may be accessed that is specified by the root directory,
- * by verifing that the root directory is at the root of the
- * mount namespace which allows all files to be accessed.
+ * Drop the chroot privilege when a user namespace is created inside
+ * chrooted environment so that the file system view presented to a
+ * non-admin process is preserved.
*/
- ret = -EPERM;
if (current_chrooted())
- goto fail_dec;
+ is_chrooted = 1;

/* The creator needs a mapping in the parent user namespace
* or else we won't be able to reasonably tell userspace who
@@ -140,7 +144,7 @@ int create_user_ns(struct cred *new)
if (!setup_userns_sysctls(ns))
goto fail_keyring;

- set_cred_user_ns(new, ns);
+ set_cred_user_ns(new, ns, is_chrooted);
return 0;
fail_keyring:
#ifdef CONFIG_PERSISTENT_KEYRINGS
@@ -1281,7 +1285,7 @@ static int userns_install(struct nsproxy *nsproxy, struct ns_common *ns)
return -ENOMEM;

put_user_ns(cred->user_ns);
- set_cred_user_ns(cred, get_user_ns(user_ns));
+ set_cred_user_ns(cred, get_user_ns(user_ns), 0);

return commit_creds(cred);
}
--
1.8.3.1

2018-10-15 17:25:10

by Jann Horn

[permalink] [raw]

Subject: Re: [RFC] Allow user namespace inside chroot

On Mon, Oct 15, 2018 at 7:10 PM <[email protected]> wrote:
> Following commit disables the creation of user namespace inside
> the chroot environment.
>
> userns: Don't allow creation if the user is chrooted
>
> commit 3151527ee007b73a0ebd296010f1c0454a919c7d
>
> Consider a system in which a non-root user creates a combination
> of user, pid and mount namespaces and confines a process to it.
> The system will have multiple levels of nested namespaces.
> The root namespace in the system will have lots of directories
> which should not be exposed to the child confined to the set of
> namespaces.
>
> Without chroot, we will have to hide all unwanted directories
> individually using bind mounts and mount namespace.

IMO what you really should be doing is to create a tmpfs, bind-mount
the directories you want into it, and then pivot_root() into that, not
the other way around.

> Chroot enables
> us to expose a handpicked list of directories which the child
> can see but if we use chroot we wont be able to create nested
> namespaces.

Uh, are you aware that pivot_root() exists? That's what you should be using.

The kernel makes pretty much no security guarantees about chroot(). If
you're using chroot() for security, you're almost certainly doing it
wrong. If you want security, use pivot_root().

> Allowing a process to create user namespace within a chroot
> environment will enable it to chroot, which in turn can be used
> to escape the jail.
>
> This patch drops the chroot privilege when user namespace is
> created within the chroot environment so the process cannot
> use it to escape the chroot jail.

"cannot" is a strong expression. More like "might not be able to".

> The process can still modify
> the view of the file system using mount namespace but for those
> modifications to be useful, it needs to run a setuid program with
> that intented uid directly mapped into the user namespace as it is
> which is not possible for an unprivileged process.
>
> If there were any other corner cases which were considered while
> deciding to disable the creation of user namespace as a whole
> within the chroot environment please let me know.
>
> Signed-off-by: Nagarathnam Muthusamy<[email protected]>
> ---
> kernel/user_namespace.c | 22 +++++++++++++---------
> 1 file changed, 13 insertions(+), 9 deletions(-)
>
> diff --git a/kernel/user_namespace.c b/kernel/user_namespace.c
> index e5222b5..83d2a70 100644
> --- a/kernel/user_namespace.c
> +++ b/kernel/user_namespace.c
> @@ -44,7 +44,7 @@ static void dec_user_namespaces(struct ucounts *ucounts)
> return dec_ucount(ucounts, UCOUNT_USER_NAMESPACES);
> }
>
> -static void set_cred_user_ns(struct cred *cred, struct user_namespace *user_ns)
> +static void set_cred_user_ns(struct cred *cred, struct user_namespace *user_ns, int is_chrooted)
> {
> /* Start with the same capabilities as init but useless for doing
> * anything as the capabilities are bound to the new user namespace.
> @@ -55,6 +55,11 @@ static void set_cred_user_ns(struct cred *cred, struct user_namespace *user_ns)
> cred->cap_effective = CAP_FULL_SET;
> cred->cap_ambient = CAP_EMPTY_SET;
> cred->cap_bset = CAP_FULL_SET;
> + if (is_chrooted) {
> + cap_lower(cred->cap_permitted, CAP_SYS_CHROOT);
> + cap_lower(cred->cap_effective, CAP_SYS_CHROOT);
> + cap_lower(cred->cap_bset, CAP_SYS_CHROOT);
> + }

This isn't going to work. For example, if the attacker can use
pivot_root() (which checks for CAP_SYS_ADMIN), you're still screwed.

> #ifdef CONFIG_KEYS
> key_put(cred->request_key_auth);
> cred->request_key_auth = NULL;
> @@ -78,6 +83,7 @@ int create_user_ns(struct cred *new)
> kgid_t group = new->egid;
> struct ucounts *ucounts;
> int ret, i;
> + int is_chrooted = 0;
>
> ret = -ENOSPC;
> if (parent_ns->level > 32)
> @@ -88,14 +94,12 @@ int create_user_ns(struct cred *new)
> goto fail;
>
> /*
> - * Verify that we can not violate the policy of which files
> - * may be accessed that is specified by the root directory,
> - * by verifing that the root directory is at the root of the
> - * mount namespace which allows all files to be accessed.
> + * Drop the chroot privilege when a user namespace is created inside
> + * chrooted environment so that the file system view presented to a
> + * non-admin process is preserved.
> */
> - ret = -EPERM;
> if (current_chrooted())
> - goto fail_dec;
> + is_chrooted = 1;
>
> /* The creator needs a mapping in the parent user namespace
> * or else we won't be able to reasonably tell userspace who
> @@ -140,7 +144,7 @@ int create_user_ns(struct cred *new)
> if (!setup_userns_sysctls(ns))
> goto fail_keyring;
>
> - set_cred_user_ns(new, ns);
> + set_cred_user_ns(new, ns, is_chrooted);
> return 0;
> fail_keyring:
> #ifdef CONFIG_PERSISTENT_KEYRINGS
> @@ -1281,7 +1285,7 @@ static int userns_install(struct nsproxy *nsproxy, struct ns_common *ns)
> return -ENOMEM;
>
> put_user_ns(cred->user_ns);
> - set_cred_user_ns(cred, get_user_ns(user_ns));
> + set_cred_user_ns(cred, get_user_ns(user_ns), 0);

This looks bogus. With this, I think your restriction can be bypassed
if process A forks a child B, B creates a new user namespace, then A
enters the user namespace with setns() and has full capabilities. Am I
missing something?

2018-10-15 17:28:09

by Andy Lutomirski

[permalink] [raw]

Subject: Re: [RFC] Allow user namespace inside chroot

On Mon, Oct 15, 2018 at 10:22 AM Jann Horn <[email protected]> wrote:
>
> On Mon, Oct 15, 2018 at 7:10 PM <[email protected]> wrote:
> > Following commit disables the creation of user namespace inside
> > the chroot environment.
> >
> > userns: Don't allow creation if the user is chrooted
> >
> > commit 3151527ee007b73a0ebd296010f1c0454a919c7d
> >
> > Consider a system in which a non-root user creates a combination
> > of user, pid and mount namespaces and confines a process to it.
> > The system will have multiple levels of nested namespaces.
> > The root namespace in the system will have lots of directories
> > which should not be exposed to the child confined to the set of
> > namespaces.
> >
> > Without chroot, we will have to hide all unwanted directories
> > individually using bind mounts and mount namespace.
>
> IMO what you really should be doing is to create a tmpfs, bind-mount
> the directories you want into it, and then pivot_root() into that, not
> the other way around.

Indeed. Or you can just recursive bind-mount the subtree you want and
then pivot_root() into it.

--Andy

2018-10-15 17:44:01

by Eric W. Biederman

[permalink] [raw]

Subject: Re: [RFC] Allow user namespace inside chroot

Have you considered using pivot_root to drop all of the pieces of the
filesystem you don't want to be visible? That should be a much better
solution overall.

It is must a matter of:
mount --bind /path/you/would/chroot/to
pivot_root /path/you/would/chroot/to /put/old
umount -l /put/old

You might need to do something like make --rprivate before calling
pivot_root to stop mount propagation to the parent. But I can't
image it to be a practical problem.

Also note that being in a chroot tends to indicate one of two things,
being in an old build system, or being in some kind of chroot jail.
Because of the jails created with chroot we want to be very careful
with enabling user namespaces in that context.

There have been some very clever people figuring out how to get out of
chroot jails by passing file descriptors between processes and using
things like pivot root.

Even if your analysis is semantically perfect there is the issue of
increasing the attack surface of preexising chroot jails. I believe
that would make the kernel more vulnerable overall, and for only
a very small simplification of implementation details.

So unless I am missing something I don't see the use case for this that
would not be better served by just properly setting up your mount
namespace, and the attack surface increase of chroot jails makes we
very relucatant to see a change like this.

Eric

[email protected] writes:

> From: Nagarathnam Muthusamy <[email protected]>
>
> Following commit disables the creation of user namespace inside
> the chroot environment.
>
> userns: Don't allow creation if the user is chrooted
>
> commit 3151527ee007b73a0ebd296010f1c0454a919c7d
>
> Consider a system in which a non-root user creates a combination
> of user, pid and mount namespaces and confines a process to it.
> The system will have multiple levels of nested namespaces.
> The root namespace in the system will have lots of directories
> which should not be exposed to the child confined to the set of
> namespaces.
>
> Without chroot, we will have to hide all unwanted directories
> individually using bind mounts and mount namespace. Chroot enables
> us to expose a handpicked list of directories which the child
> can see but if we use chroot we wont be able to create nested
> namespaces.
>
> Allowing a process to create user namespace within a chroot
> environment will enable it to chroot, which in turn can be used
> to escape the jail.
>
> This patch drops the chroot privilege when user namespace is
> created within the chroot environment so the process cannot
> use it to escape the chroot jail. The process can still modify
> the view of the file system using mount namespace but for those
> modifications to be useful, it needs to run a setuid program with
> that intented uid directly mapped into the user namespace as it is
> which is not possible for an unprivileged process.
>
> If there were any other corner cases which were considered while
> deciding to disable the creation of user namespace as a whole
> within the chroot environment please let me know.
>
> Signed-off-by: Nagarathnam Muthusamy<[email protected]>
> ---
> kernel/user_namespace.c | 22 +++++++++++++---------
> 1 file changed, 13 insertions(+), 9 deletions(-)
>
> diff --git a/kernel/user_namespace.c b/kernel/user_namespace.c
> index e5222b5..83d2a70 100644
> --- a/kernel/user_namespace.c
> +++ b/kernel/user_namespace.c
> @@ -44,7 +44,7 @@ static void dec_user_namespaces(struct ucounts *ucounts)
> return dec_ucount(ucounts, UCOUNT_USER_NAMESPACES);
> }
>
> -static void set_cred_user_ns(struct cred *cred, struct user_namespace *user_ns)
> +static void set_cred_user_ns(struct cred *cred, struct user_namespace *user_ns, int is_chrooted)
> {
> /* Start with the same capabilities as init but useless for doing
> * anything as the capabilities are bound to the new user namespace.
> @@ -55,6 +55,11 @@ static void set_cred_user_ns(struct cred *cred, struct user_namespace *user_ns)
> cred->cap_effective = CAP_FULL_SET;
> cred->cap_ambient = CAP_EMPTY_SET;
> cred->cap_bset = CAP_FULL_SET;
> + if (is_chrooted) {
> + cap_lower(cred->cap_permitted, CAP_SYS_CHROOT);
> + cap_lower(cred->cap_effective, CAP_SYS_CHROOT);
> + cap_lower(cred->cap_bset, CAP_SYS_CHROOT);
> + }
> #ifdef CONFIG_KEYS
> key_put(cred->request_key_auth);
> cred->request_key_auth = NULL;
> @@ -78,6 +83,7 @@ int create_user_ns(struct cred *new)
> kgid_t group = new->egid;
> struct ucounts *ucounts;
> int ret, i;
> + int is_chrooted = 0;
>
> ret = -ENOSPC;
> if (parent_ns->level > 32)
> @@ -88,14 +94,12 @@ int create_user_ns(struct cred *new)
> goto fail;
>
> /*
> - * Verify that we can not violate the policy of which files
> - * may be accessed that is specified by the root directory,
> - * by verifing that the root directory is at the root of the
> - * mount namespace which allows all files to be accessed.
> + * Drop the chroot privilege when a user namespace is created inside
> + * chrooted environment so that the file system view presented to a
> + * non-admin process is preserved.
> */
> - ret = -EPERM;
> if (current_chrooted())
> - goto fail_dec;
> + is_chrooted = 1;
>
> /* The creator needs a mapping in the parent user namespace
> * or else we won't be able to reasonably tell userspace who
> @@ -140,7 +144,7 @@ int create_user_ns(struct cred *new)
> if (!setup_userns_sysctls(ns))
> goto fail_keyring;
>
> - set_cred_user_ns(new, ns);
> + set_cred_user_ns(new, ns, is_chrooted);
> return 0;
> fail_keyring:
> #ifdef CONFIG_PERSISTENT_KEYRINGS
> @@ -1281,7 +1285,7 @@ static int userns_install(struct nsproxy *nsproxy, struct ns_common *ns)
> return -ENOMEM;
>
> put_user_ns(cred->user_ns);
> - set_cred_user_ns(cred, get_user_ns(user_ns));
> + set_cred_user_ns(cred, get_user_ns(user_ns), 0);
>
> return commit_creds(cred);
> }

2018-10-15 18:07:23

by Nagarathnam Muthusamy

[permalink] [raw]

Subject: Re: [RFC] Allow user namespace inside chroot

On 10/15/2018 10:42 AM, [email protected] wrote:
> Have you considered using pivot_root to drop all of the pieces of the
> filesystem you don't want to be visible? That should be a much better
> solution overall.
>
> It is must a matter of:
> mount --bind /path/you/would/chroot/to
> pivot_root /path/you/would/chroot/to /put/old
> umount -l /put/old
>
> You might need to do something like make --rprivate before calling
> pivot_root to stop mount propagation to the parent. But I can't
> image it to be a practical problem.
>
>
> Also note that being in a chroot tends to indicate one of two things,
> being in an old build system, or being in some kind of chroot jail.
> Because of the jails created with chroot we want to be very careful
> with enabling user namespaces in that context.
>
> There have been some very clever people figuring out how to get out of
> chroot jails by passing file descriptors between processes and using
> things like pivot root.
>
> Even if your analysis is semantically perfect there is the issue of
> increasing the attack surface of preexising chroot jails. I believe
> that would make the kernel more vulnerable overall, and for only
> a very small simplification of implementation details.
>
> So unless I am missing something I don't see the use case for this that
> would not be better served by just properly setting up your mount
> namespace, and the attack surface increase of chroot jails makes we
> very relucatant to see a change like this.

Thanks a lot for the feedback! I will work on solving the issue with
pivot_root and mount namespace combination.

Thanks,
Nagarathnam.
> Eric
>
> [email protected] writes:
>
>> From: Nagarathnam Muthusamy <[email protected]>
>>
>> Following commit disables the creation of user namespace inside
>> the chroot environment.
>>
>> userns: Don't allow creation if the user is chrooted
>>
>> commit 3151527ee007b73a0ebd296010f1c0454a919c7d
>>
>> Consider a system in which a non-root user creates a combination
>> of user, pid and mount namespaces and confines a process to it.
>> The system will have multiple levels of nested namespaces.
>> The root namespace in the system will have lots of directories
>> which should not be exposed to the child confined to the set of
>> namespaces.
>>
>> Without chroot, we will have to hide all unwanted directories
>> individually using bind mounts and mount namespace. Chroot enables
>> us to expose a handpicked list of directories which the child
>> can see but if we use chroot we wont be able to create nested
>> namespaces.
>>
>> Allowing a process to create user namespace within a chroot
>> environment will enable it to chroot, which in turn can be used
>> to escape the jail.
>>
>> This patch drops the chroot privilege when user namespace is
>> created within the chroot environment so the process cannot
>> use it to escape the chroot jail. The process can still modify
>> the view of the file system using mount namespace but for those
>> modifications to be useful, it needs to run a setuid program with
>> that intented uid directly mapped into the user namespace as it is
>> which is not possible for an unprivileged process.
>>
>> If there were any other corner cases which were considered while
>> deciding to disable the creation of user namespace as a whole
>> within the chroot environment please let me know.
>>
>> Signed-off-by: Nagarathnam Muthusamy<[email protected]>
>> ---
>> kernel/user_namespace.c | 22 +++++++++++++---------
>> 1 file changed, 13 insertions(+), 9 deletions(-)
>>
>> diff --git a/kernel/user_namespace.c b/kernel/user_namespace.c
>> index e5222b5..83d2a70 100644
>> --- a/kernel/user_namespace.c
>> +++ b/kernel/user_namespace.c
>> @@ -44,7 +44,7 @@ static void dec_user_namespaces(struct ucounts *ucounts)
>> return dec_ucount(ucounts, UCOUNT_USER_NAMESPACES);
>> }
>>
>> -static void set_cred_user_ns(struct cred *cred, struct user_namespace *user_ns)
>> +static void set_cred_user_ns(struct cred *cred, struct user_namespace *user_ns, int is_chrooted)
>> {
>> /* Start with the same capabilities as init but useless for doing
>> * anything as the capabilities are bound to the new user namespace.
>> @@ -55,6 +55,11 @@ static void set_cred_user_ns(struct cred *cred, struct user_namespace *user_ns)
>> cred->cap_effective = CAP_FULL_SET;
>> cred->cap_ambient = CAP_EMPTY_SET;
>> cred->cap_bset = CAP_FULL_SET;
>> + if (is_chrooted) {
>> + cap_lower(cred->cap_permitted, CAP_SYS_CHROOT);
>> + cap_lower(cred->cap_effective, CAP_SYS_CHROOT);
>> + cap_lower(cred->cap_bset, CAP_SYS_CHROOT);
>> + }
>> #ifdef CONFIG_KEYS
>> key_put(cred->request_key_auth);
>> cred->request_key_auth = NULL;
>> @@ -78,6 +83,7 @@ int create_user_ns(struct cred *new)
>> kgid_t group = new->egid;
>> struct ucounts *ucounts;
>> int ret, i;
>> + int is_chrooted = 0;
>>
>> ret = -ENOSPC;
>> if (parent_ns->level > 32)
>> @@ -88,14 +94,12 @@ int create_user_ns(struct cred *new)
>> goto fail;
>>
>> /*
>> - * Verify that we can not violate the policy of which files
>> - * may be accessed that is specified by the root directory,
>> - * by verifing that the root directory is at the root of the
>> - * mount namespace which allows all files to be accessed.
>> + * Drop the chroot privilege when a user namespace is created inside
>> + * chrooted environment so that the file system view presented to a
>> + * non-admin process is preserved.
>> */
>> - ret = -EPERM;
>> if (current_chrooted())
>> - goto fail_dec;
>> + is_chrooted = 1;
>>
>> /* The creator needs a mapping in the parent user namespace
>> * or else we won't be able to reasonably tell userspace who
>> @@ -140,7 +144,7 @@ int create_user_ns(struct cred *new)
>> if (!setup_userns_sysctls(ns))
>> goto fail_keyring;
>>
>> - set_cred_user_ns(new, ns);
>> + set_cred_user_ns(new, ns, is_chrooted);
>> return 0;
>> fail_keyring:
>> #ifdef CONFIG_PERSISTENT_KEYRINGS
>> @@ -1281,7 +1285,7 @@ static int userns_install(struct nsproxy *nsproxy, struct ns_common *ns)
>> return -ENOMEM;
>>
>> put_user_ns(cred->user_ns);
>> - set_cred_user_ns(cred, get_user_ns(user_ns));
>> + set_cred_user_ns(cred, get_user_ns(user_ns), 0);
>>
>> return commit_creds(cred);
>> }

2018-10-15 18:09:51

by Eric W. Biederman

[permalink] [raw]

Subject: Re: [RFC] Allow user namespace inside chroot

Jann Horn <[email protected]> writes:

> On Mon, Oct 15, 2018 at 7:10 PM <[email protected]> wrote:
>> @@ -1281,7 +1285,7 @@ static int userns_install(struct nsproxy *nsproxy, struct ns_common *ns)
>> return -ENOMEM;
>>
>> put_user_ns(cred->user_ns);
>> - set_cred_user_ns(cred, get_user_ns(user_ns));
>> + set_cred_user_ns(cred, get_user_ns(user_ns), 0);
>
> This looks bogus. With this, I think your restriction can be bypassed
> if process A forks a child B, B creates a new user namespace, then A
> enters the user namespace with setns() and has full capabilities. Am I
> missing something?

Nope. I feel silly for missing that angle.

Even without the full capabilities the userns_install angle will place
you at the root of the mount namespace outside of the chroot.

At which point I have visions of the special cases multiplying like
bunnies make this work. Without a very strong case I don't like this at all.

Eric