2018-06-25 16:35:26

by Jann Horn

[permalink] [raw]
Subject: [PATCH] userns: move user access out of the mutex

The old code would hold the userns_state_mutex indefinitely if
memdup_user_nul stalled due to e.g. a userfault region. Prevent that by
moving the memdup_user_nul in front of the mutex_lock().

Note: This changes the error precedence of invalid buf/count/*ppos vs
map already written / capabilities missing.

Fixes: 22d917d80e84 ("userns: Rework the user_namespace adding uid/gid...")
Cc: [email protected]
Signed-off-by: Jann Horn <[email protected]>
---
kernel/user_namespace.c | 24 ++++++++++--------------
1 file changed, 10 insertions(+), 14 deletions(-)

diff --git a/kernel/user_namespace.c b/kernel/user_namespace.c
index c3d7583fcd21..e5222b5fb4fe 100644
--- a/kernel/user_namespace.c
+++ b/kernel/user_namespace.c
@@ -859,7 +859,16 @@ static ssize_t map_write(struct file *file, const char __user *buf,
unsigned idx;
struct uid_gid_extent extent;
char *kbuf = NULL, *pos, *next_line;
- ssize_t ret = -EINVAL;
+ ssize_t ret;
+
+ /* Only allow < page size writes at the beginning of the file */
+ if ((*ppos != 0) || (count >= PAGE_SIZE))
+ return -EINVAL;
+
+ /* Slurp in the user data */
+ kbuf = memdup_user_nul(buf, count);
+ if (IS_ERR(kbuf))
+ return PTR_ERR(kbuf);

/*
* The userns_state_mutex serializes all writes to any given map.
@@ -895,19 +904,6 @@ static ssize_t map_write(struct file *file, const char __user *buf,
if (cap_valid(cap_setid) && !file_ns_capable(file, ns, CAP_SYS_ADMIN))
goto out;

- /* Only allow < page size writes at the beginning of the file */
- ret = -EINVAL;
- if ((*ppos != 0) || (count >= PAGE_SIZE))
- goto out;
-
- /* Slurp in the user data */
- kbuf = memdup_user_nul(buf, count);
- if (IS_ERR(kbuf)) {
- ret = PTR_ERR(kbuf);
- kbuf = NULL;
- goto out;
- }
-
/* Parse the user data */
ret = -EINVAL;
pos = kbuf;
--
2.18.0.rc2.346.g013aa6912e-goog



2018-06-26 14:12:16

by Christian Brauner

[permalink] [raw]
Subject: Re: [PATCH] userns: move user access out of the mutex

On Mon, Jun 25, 2018 at 06:34:19PM +0200, Jann Horn wrote:
> The old code would hold the userns_state_mutex indefinitely if
> memdup_user_nul stalled due to e.g. a userfault region. Prevent that by
> moving the memdup_user_nul in front of the mutex_lock().
>
> Note: This changes the error precedence of invalid buf/count/*ppos vs
> map already written / capabilities missing.
>
> Fixes: 22d917d80e84 ("userns: Rework the user_namespace adding uid/gid...")
> Cc: [email protected]
> Signed-off-by: Jann Horn <[email protected]>
> ---
> kernel/user_namespace.c | 24 ++++++++++--------------
> 1 file changed, 10 insertions(+), 14 deletions(-)
>
> diff --git a/kernel/user_namespace.c b/kernel/user_namespace.c
> index c3d7583fcd21..e5222b5fb4fe 100644
> --- a/kernel/user_namespace.c
> +++ b/kernel/user_namespace.c
> @@ -859,7 +859,16 @@ static ssize_t map_write(struct file *file, const char __user *buf,
> unsigned idx;
> struct uid_gid_extent extent;
> char *kbuf = NULL, *pos, *next_line;
> - ssize_t ret = -EINVAL;
> + ssize_t ret;
> +
> + /* Only allow < page size writes at the beginning of the file */
> + if ((*ppos != 0) || (count >= PAGE_SIZE))
> + return -EINVAL;
> +
> + /* Slurp in the user data */
> + kbuf = memdup_user_nul(buf, count);
> + if (IS_ERR(kbuf))
> + return PTR_ERR(kbuf);

I'm not opposed to this but what is annoying is the changed error
reporting you pointed out. It seems conceptually way cleaner if missing
permissions are reported before more specific internal errors.

The question I have is how bad it would be if memdup_user_nul() stalled
and if you see any issues security wise. Given that the mutex is only
taken on operations that are mostly performed when creating or setting
up a new user namespace

map_write()
create_user_ns()
proc_setgroups_write()
userns_may_setgroups()

and not when actually interacting with it it seems the worst that
happens is that creation of new user namespaces is not possible anymore.
That shouldn't have any effect on the host though which I would see as a
real issue. But I might be missing context. :)

Christian

>
> /*
> * The userns_state_mutex serializes all writes to any given map.
> @@ -895,19 +904,6 @@ static ssize_t map_write(struct file *file, const char __user *buf,
> if (cap_valid(cap_setid) && !file_ns_capable(file, ns, CAP_SYS_ADMIN))
> goto out;
>
> - /* Only allow < page size writes at the beginning of the file */
> - ret = -EINVAL;
> - if ((*ppos != 0) || (count >= PAGE_SIZE))
> - goto out;
> -
> - /* Slurp in the user data */
> - kbuf = memdup_user_nul(buf, count);
> - if (IS_ERR(kbuf)) {
> - ret = PTR_ERR(kbuf);
> - kbuf = NULL;
> - goto out;
> - }
> -
> /* Parse the user data */
> ret = -EINVAL;
> pos = kbuf;
> --
> 2.18.0.rc2.346.g013aa6912e-goog
>

2018-06-26 16:01:42

by Jann Horn

[permalink] [raw]
Subject: Re: [PATCH] userns: move user access out of the mutex

On Tue, Jun 26, 2018 at 3:08 PM Christian Brauner
<[email protected]> wrote:
>
> On Mon, Jun 25, 2018 at 06:34:19PM +0200, Jann Horn wrote:
> > The old code would hold the userns_state_mutex indefinitely if
> > memdup_user_nul stalled due to e.g. a userfault region. Prevent that by
> > moving the memdup_user_nul in front of the mutex_lock().
> >
> > Note: This changes the error precedence of invalid buf/count/*ppos vs
> > map already written / capabilities missing.
> >
> > Fixes: 22d917d80e84 ("userns: Rework the user_namespace adding uid/gid...")
> > Cc: [email protected]
> > Signed-off-by: Jann Horn <[email protected]>
> > ---
> > kernel/user_namespace.c | 24 ++++++++++--------------
> > 1 file changed, 10 insertions(+), 14 deletions(-)
> >
> > diff --git a/kernel/user_namespace.c b/kernel/user_namespace.c
> > index c3d7583fcd21..e5222b5fb4fe 100644
> > --- a/kernel/user_namespace.c
> > +++ b/kernel/user_namespace.c
> > @@ -859,7 +859,16 @@ static ssize_t map_write(struct file *file, const char __user *buf,
> > unsigned idx;
> > struct uid_gid_extent extent;
> > char *kbuf = NULL, *pos, *next_line;
> > - ssize_t ret = -EINVAL;
> > + ssize_t ret;
> > +
> > + /* Only allow < page size writes at the beginning of the file */
> > + if ((*ppos != 0) || (count >= PAGE_SIZE))
> > + return -EINVAL;
> > +
> > + /* Slurp in the user data */
> > + kbuf = memdup_user_nul(buf, count);
> > + if (IS_ERR(kbuf))
> > + return PTR_ERR(kbuf);
>
> I'm not opposed to this but what is annoying is the changed error
> reporting you pointed out. It seems conceptually way cleaner if missing
> permissions are reported before more specific internal errors.
>
> The question I have is how bad it would be if memdup_user_nul() stalled
> and if you see any issues security wise. Given that the mutex is only
> taken on operations that are mostly performed when creating or setting
> up a new user namespace
>
> map_write()
> create_user_ns()
> proc_setgroups_write()
> userns_may_setgroups()
>
> and not when actually interacting with it it seems the worst that
> happens is that creation of new user namespaces is not possible anymore.
> That shouldn't have any effect on the host though which I would see as a
> real issue. But I might be missing context. :)

userns_may_setgroups() is involved in sys_setgroups() via
may_setgroups(), so if one thread is blocking the userns_state_mutex,
nobody can log in anymore.

> > /*
> > * The userns_state_mutex serializes all writes to any given map.
> > @@ -895,19 +904,6 @@ static ssize_t map_write(struct file *file, const char __user *buf,
> > if (cap_valid(cap_setid) && !file_ns_capable(file, ns, CAP_SYS_ADMIN))
> > goto out;
> >
> > - /* Only allow < page size writes at the beginning of the file */
> > - ret = -EINVAL;
> > - if ((*ppos != 0) || (count >= PAGE_SIZE))
> > - goto out;
> > -
> > - /* Slurp in the user data */
> > - kbuf = memdup_user_nul(buf, count);
> > - if (IS_ERR(kbuf)) {
> > - ret = PTR_ERR(kbuf);
> > - kbuf = NULL;
> > - goto out;
> > - }
> > -
> > /* Parse the user data */
> > ret = -EINVAL;
> > pos = kbuf;
> > --
> > 2.18.0.rc2.346.g013aa6912e-goog
> >

2018-06-27 14:25:34

by Christian Brauner

[permalink] [raw]
Subject: Re: [PATCH] userns: move user access out of the mutex

On Tue, Jun 26, 2018 at 04:06:45PM +0200, Jann Horn wrote:
> On Tue, Jun 26, 2018 at 3:08 PM Christian Brauner
> <[email protected]> wrote:
> >
> > On Mon, Jun 25, 2018 at 06:34:19PM +0200, Jann Horn wrote:
> > > The old code would hold the userns_state_mutex indefinitely if
> > > memdup_user_nul stalled due to e.g. a userfault region. Prevent that by
> > > moving the memdup_user_nul in front of the mutex_lock().
> > >
> > > Note: This changes the error precedence of invalid buf/count/*ppos vs
> > > map already written / capabilities missing.
> > >
> > > Fixes: 22d917d80e84 ("userns: Rework the user_namespace adding uid/gid...")
> > > Cc: [email protected]
> > > Signed-off-by: Jann Horn <[email protected]>

Acked-by: Christian Brauner <[email protected]>

> > > ---
> > > kernel/user_namespace.c | 24 ++++++++++--------------
> > > 1 file changed, 10 insertions(+), 14 deletions(-)
> > >
> > > diff --git a/kernel/user_namespace.c b/kernel/user_namespace.c
> > > index c3d7583fcd21..e5222b5fb4fe 100644
> > > --- a/kernel/user_namespace.c
> > > +++ b/kernel/user_namespace.c
> > > @@ -859,7 +859,16 @@ static ssize_t map_write(struct file *file, const char __user *buf,
> > > unsigned idx;
> > > struct uid_gid_extent extent;
> > > char *kbuf = NULL, *pos, *next_line;
> > > - ssize_t ret = -EINVAL;
> > > + ssize_t ret;
> > > +
> > > + /* Only allow < page size writes at the beginning of the file */
> > > + if ((*ppos != 0) || (count >= PAGE_SIZE))
> > > + return -EINVAL;
> > > +
> > > + /* Slurp in the user data */
> > > + kbuf = memdup_user_nul(buf, count);
> > > + if (IS_ERR(kbuf))
> > > + return PTR_ERR(kbuf);
> >
> > I'm not opposed to this but what is annoying is the changed error
> > reporting you pointed out. It seems conceptually way cleaner if missing
> > permissions are reported before more specific internal errors.
> >
> > The question I have is how bad it would be if memdup_user_nul() stalled
> > and if you see any issues security wise. Given that the mutex is only
> > taken on operations that are mostly performed when creating or setting
> > up a new user namespace
> >
> > map_write()
> > create_user_ns()
> > proc_setgroups_write()
> > userns_may_setgroups()
> >
> > and not when actually interacting with it it seems the worst that
> > happens is that creation of new user namespaces is not possible anymore.
> > That shouldn't have any effect on the host though which I would see as a
> > real issue. But I might be missing context. :)
>
> userns_may_setgroups() is involved in sys_setgroups() via
> may_setgroups(), so if one thread is blocking the userns_state_mutex,
> nobody can log in anymore.

Thanks!

>
> > > /*
> > > * The userns_state_mutex serializes all writes to any given map.
> > > @@ -895,19 +904,6 @@ static ssize_t map_write(struct file *file, const char __user *buf,
> > > if (cap_valid(cap_setid) && !file_ns_capable(file, ns, CAP_SYS_ADMIN))
> > > goto out;
> > >
> > > - /* Only allow < page size writes at the beginning of the file */
> > > - ret = -EINVAL;
> > > - if ((*ppos != 0) || (count >= PAGE_SIZE))
> > > - goto out;
> > > -
> > > - /* Slurp in the user data */
> > > - kbuf = memdup_user_nul(buf, count);
> > > - if (IS_ERR(kbuf)) {
> > > - ret = PTR_ERR(kbuf);
> > > - kbuf = NULL;
> > > - goto out;
> > > - }
> > > -
> > > /* Parse the user data */
> > > ret = -EINVAL;
> > > pos = kbuf;
> > > --
> > > 2.18.0.rc2.346.g013aa6912e-goog
> > >

2018-06-27 19:37:47

by Serge E. Hallyn

[permalink] [raw]
Subject: Re: [PATCH] userns: move user access out of the mutex

Quoting Christian Brauner ([email protected]):
> On Tue, Jun 26, 2018 at 04:06:45PM +0200, Jann Horn wrote:
> > On Tue, Jun 26, 2018 at 3:08 PM Christian Brauner
> > <[email protected]> wrote:
> > >
> > > On Mon, Jun 25, 2018 at 06:34:19PM +0200, Jann Horn wrote:
> > > > The old code would hold the userns_state_mutex indefinitely if
> > > > memdup_user_nul stalled due to e.g. a userfault region. Prevent that by
> > > > moving the memdup_user_nul in front of the mutex_lock().
> > > >
> > > > Note: This changes the error precedence of invalid buf/count/*ppos vs
> > > > map already written / capabilities missing.
> > > >
> > > > Fixes: 22d917d80e84 ("userns: Rework the user_namespace adding uid/gid...")
> > > > Cc: [email protected]
> > > > Signed-off-by: Jann Horn <[email protected]>
>
> Acked-by: Christian Brauner <[email protected]>

Acked-by: Serge Hallyn <[email protected]>

thanks.

-serge