2019-09-10 19:28:15

by Eugene Syromiatnikov

[permalink] [raw]
Subject: [PATCH v2] fork: check exit_signal passed in clone3() call

Previously, higher 32 bits of exit_signal fields were lost when
copied to the kernel args structure (that uses int as a type for the
respective field). Moreover, as Oleg has noted[1], exit_signal is used
unchecked, so it has to be checked for sanity before use; for the legacy
syscalls, applying CSIGNAL mask guarantees that it is at least non-negative;
however, there's no such thing is done in clone3() code path, and that can
break at least thread_group_leader.

Checking user-passed exit_signal against ~CSIGNAL mask solves both
of these problems.

[1] https://lkml.org/lkml/2019/9/10/467

* kernel/fork.c (copy_clone_args_from_user): Fail with -EINVAL if
args.exit_signal has bits set outside CSIGNAL mask.
(_do_fork): Note that exit_signal is expected to be checked for the
sanity by the caller.

Fixes: 7f192e3cd316 ("fork: add clone3")
Reported-by: Oleg Nesterov <[email protected]>
Signed-off-by: Eugene Syromiatnikov <[email protected]>
---
kernel/fork.c | 12 ++++++++++++
1 file changed, 12 insertions(+)

diff --git a/kernel/fork.c b/kernel/fork.c
index 2852d0e..9dee2ab 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -2338,6 +2338,8 @@ struct mm_struct *copy_init_mm(void)
*
* It copies the process, and if successful kick-starts
* it and waits for it to finish using the VM if required.
+ *
+ * args->exit_signal is expected to be checked for sanity by the caller.
*/
long _do_fork(struct kernel_clone_args *args)
{
@@ -2562,6 +2564,16 @@ noinline static int copy_clone_args_from_user(struct kernel_clone_args *kargs,
if (copy_from_user(&args, uargs, size))
return -EFAULT;

+ /*
+ * exit_signal is confined to CSIGNAL mask in legacy syscalls,
+ * so it is used unchecked deeper in syscall handling routines;
+ * moreover, copying to struct kernel_clone_args.exit_signals
+ * trims higher 32 bits, so it is has to be checked that they
+ * are zero.
+ */
+ if (unlikely(args.exit_signal & ~((u64)CSIGNAL)))
+ return -EINVAL;
+
*kargs = (struct kernel_clone_args){
.flags = args.flags,
.pidfd = u64_to_user_ptr(args.pidfd),
--
2.1.4


2019-09-11 13:33:40

by Oleg Nesterov

[permalink] [raw]
Subject: Re: [PATCH v2] fork: check exit_signal passed in clone3() call

On 09/10, Eugene Syromiatnikov wrote:
>
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -2338,6 +2338,8 @@ struct mm_struct *copy_init_mm(void)
> *
> * It copies the process, and if successful kick-starts
> * it and waits for it to finish using the VM if required.
> + *
> + * args->exit_signal is expected to be checked for sanity by the caller.

not sure this comment is really useful but it doesn't hurt

> long _do_fork(struct kernel_clone_args *args)
> {
> @@ -2562,6 +2564,16 @@ noinline static int copy_clone_args_from_user(struct kernel_clone_args *kargs,
> if (copy_from_user(&args, uargs, size))
> return -EFAULT;
>
> + /*
> + * exit_signal is confined to CSIGNAL mask in legacy syscalls,
> + * so it is used unchecked deeper in syscall handling routines;
> + * moreover, copying to struct kernel_clone_args.exit_signals
> + * trims higher 32 bits, so it is has to be checked that they
> + * are zero.
> + */
> + if (unlikely(args.exit_signal & ~((u64)CSIGNAL)))
> + return -EINVAL;

OK, agreed. As you pointed out, this doesn't guarantee valid_signal(exit_signal).
But we do no really care as long as it is non-negative, it acts as exit_signal==0.

I have no idea if we want to deny exit_signal >= _NSIG in clone3(), this was always
allowed...

I think this needs the "CC: stable" tag.

Acked-by: Oleg Nesterov <[email protected]>

2019-09-11 13:50:18

by Christian Brauner

[permalink] [raw]
Subject: Re: [PATCH v2] fork: check exit_signal passed in clone3() call

On Wed, Sep 11, 2019 at 03:31:20PM +0200, Oleg Nesterov wrote:
> On 09/10, Eugene Syromiatnikov wrote:
> >
> > --- a/kernel/fork.c
> > +++ b/kernel/fork.c
> > @@ -2338,6 +2338,8 @@ struct mm_struct *copy_init_mm(void)
> > *
> > * It copies the process, and if successful kick-starts
> > * it and waits for it to finish using the VM if required.
> > + *
> > + * args->exit_signal is expected to be checked for sanity by the caller.
>
> not sure this comment is really useful but it doesn't hurt
>
> > long _do_fork(struct kernel_clone_args *args)
> > {
> > @@ -2562,6 +2564,16 @@ noinline static int copy_clone_args_from_user(struct kernel_clone_args *kargs,
> > if (copy_from_user(&args, uargs, size))
> > return -EFAULT;
> >
> > + /*
> > + * exit_signal is confined to CSIGNAL mask in legacy syscalls,
> > + * so it is used unchecked deeper in syscall handling routines;
> > + * moreover, copying to struct kernel_clone_args.exit_signals
> > + * trims higher 32 bits, so it is has to be checked that they
> > + * are zero.
> > + */
> > + if (unlikely(args.exit_signal & ~((u64)CSIGNAL)))
> > + return -EINVAL;
>
> OK, agreed. As you pointed out, this doesn't guarantee valid_signal(exit_signal).
> But we do no really care as long as it is non-negative, it acts as exit_signal==0.
>
> I have no idea if we want to deny exit_signal >= _NSIG in clone3(), this was always
> allowed...
>
> I think this needs the "CC: stable" tag.

No, I don't think so. clone3() is not in any released kernel. It'll be
released with 5.3. So we should just try and have this picked up this
week before the release. I'm going to send a pr for this today
hopefully.
(Sorry for the delay, conferencing makes it harder to reply to mail.)

Reviewed-by: Christian Brauner <[email protected]>

>
> Acked-by: Oleg Nesterov <[email protected]>
>

2019-09-11 13:54:13

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH v2] fork: check exit_signal passed in clone3() call

On Tue, 10 Sep 2019 18:58:52 +0100 Eugene Syromiatnikov <[email protected]> wrote:

> Previously, higher 32 bits of exit_signal fields were lost when
> copied to the kernel args structure (that uses int as a type for the
> respective field). Moreover, as Oleg has noted[1], exit_signal is used
> unchecked, so it has to be checked for sanity before use; for the legacy
> syscalls, applying CSIGNAL mask guarantees that it is at least non-negative;
> however, there's no such thing is done in clone3() code path, and that can
> break at least thread_group_leader.
>
> Checking user-passed exit_signal against ~CSIGNAL mask solves both
> of these problems.
>
> [1] https://lkml.org/lkml/2019/9/10/467
>
> * kernel/fork.c (copy_clone_args_from_user): Fail with -EINVAL if
> args.exit_signal has bits set outside CSIGNAL mask.
> (_do_fork): Note that exit_signal is expected to be checked for the
> sanity by the caller.
>
> Fixes: 7f192e3cd316 ("fork: add clone3")

What are the user-visible runtime effects of this bug?

Relatedly, should this fix be backported into -stable kernels? If so, why?

2019-09-11 13:55:47

by Christian Brauner

[permalink] [raw]
Subject: Re: [PATCH v2] fork: check exit_signal passed in clone3() call

On Wed, Sep 11, 2019 at 06:48:52AM -0700, Andrew Morton wrote:
> On Tue, 10 Sep 2019 18:58:52 +0100 Eugene Syromiatnikov <[email protected]> wrote:
>
> > Previously, higher 32 bits of exit_signal fields were lost when
> > copied to the kernel args structure (that uses int as a type for the
> > respective field). Moreover, as Oleg has noted[1], exit_signal is used
> > unchecked, so it has to be checked for sanity before use; for the legacy
> > syscalls, applying CSIGNAL mask guarantees that it is at least non-negative;
> > however, there's no such thing is done in clone3() code path, and that can
> > break at least thread_group_leader.
> >
> > Checking user-passed exit_signal against ~CSIGNAL mask solves both
> > of these problems.
> >
> > [1] https://lkml.org/lkml/2019/9/10/467
> >
> > * kernel/fork.c (copy_clone_args_from_user): Fail with -EINVAL if
> > args.exit_signal has bits set outside CSIGNAL mask.
> > (_do_fork): Note that exit_signal is expected to be checked for the
> > sanity by the caller.
> >
> > Fixes: 7f192e3cd316 ("fork: add clone3")
>
> What are the user-visible runtime effects of this bug?
>
> Relatedly, should this fix be backported into -stable kernels? If so, why?

No, as I said in my other mail clone3() is not in any released kernel
yet. clone3() is going to be released in v5.3.

Christian

2019-09-11 14:19:14

by Christian Brauner

[permalink] [raw]
Subject: Re: [PATCH v2] fork: check exit_signal passed in clone3() call

On Wed, Sep 11, 2019 at 03:52:36PM +0200, Christian Brauner wrote:
> On Wed, Sep 11, 2019 at 06:48:52AM -0700, Andrew Morton wrote:
> > On Tue, 10 Sep 2019 18:58:52 +0100 Eugene Syromiatnikov <[email protected]> wrote:
> >
> > > Previously, higher 32 bits of exit_signal fields were lost when
> > > copied to the kernel args structure (that uses int as a type for the
> > > respective field). Moreover, as Oleg has noted[1], exit_signal is used
> > > unchecked, so it has to be checked for sanity before use; for the legacy
> > > syscalls, applying CSIGNAL mask guarantees that it is at least non-negative;
> > > however, there's no such thing is done in clone3() code path, and that can
> > > break at least thread_group_leader.
> > >
> > > Checking user-passed exit_signal against ~CSIGNAL mask solves both
> > > of these problems.
> > >
> > > [1] https://lkml.org/lkml/2019/9/10/467
> > >
> > > * kernel/fork.c (copy_clone_args_from_user): Fail with -EINVAL if
> > > args.exit_signal has bits set outside CSIGNAL mask.
> > > (_do_fork): Note that exit_signal is expected to be checked for the
> > > sanity by the caller.
> > >
> > > Fixes: 7f192e3cd316 ("fork: add clone3")
> >
> > What are the user-visible runtime effects of this bug?
> >
> > Relatedly, should this fix be backported into -stable kernels? If so, why?
>
> No, as I said in my other mail clone3() is not in any released kernel
> yet. clone3() is going to be released in v5.3.

Sigh, I spoke to soon... Hm, this is placed in _do_fork(). There's a
chance that this might be visible in legacy clone if anyone passes in an
invalid signal greater than NSIG right now somehow, they'd now get
EINVAL if I'm seeing this right.

So an alternative might be to only fix this in clone3() only right now
and get this patch into 5.3 to not release clone3() with this bug from
legacy clone duplicated.
And we defer the actual legacy clone fix until after next merge window
having it stew in linux-next for a couple of rcs. Distros often pull in
rcs so if anyone notices a regression for legacy clone we'll know about
it... valid_signal() checks at process exit time when the parent is
supposed to be notifed will catch faulty signals anyway so it's not that
big of a deal.

Christian

2019-09-11 14:35:26

by Eugene Syromiatnikov

[permalink] [raw]
Subject: Re: [PATCH v2] fork: check exit_signal passed in clone3() call

On Wed, Sep 11, 2019 at 04:16:36PM +0200, Christian Brauner wrote:
> On Wed, Sep 11, 2019 at 03:52:36PM +0200, Christian Brauner wrote:
> > On Wed, Sep 11, 2019 at 06:48:52AM -0700, Andrew Morton wrote:
> > > What are the user-visible runtime effects of this bug?

The userspace can set -1 as an exit_signal, and that will break process
signalling and reaping.

> > > Relatedly, should this fix be backported into -stable kernels? If so, why?
> >
> > No, as I said in my other mail clone3() is not in any released kernel
> > yet. clone3() is going to be released in v5.3.
>
> Sigh, I spoke to soon... Hm, this is placed in _do_fork(). There's a
> chance that this might be visible in legacy clone if anyone passes in an
> invalid signal greater than NSIG right now somehow, they'd now get
> EINVAL if I'm seeing this right.
>
> So an alternative might be to only fix this in clone3() only right now
> and get this patch into 5.3 to not release clone3() with this bug from
> legacy clone duplicated.
> And we defer the actual legacy clone fix until after next merge window
> having it stew in linux-next for a couple of rcs. Distros often pull in
> rcs so if anyone notices a regression for legacy clone we'll know about
> it... valid_signal() checks at process exit time when the parent is
> supposed to be notifed will catch faulty signals anyway so it's not that
> big of a deal.

As the patch is written, only copy_clone_args_from_user is touched (which
is used only by clone3 and not legacy clone), and the check added
replicates legacy clone behaviour: userspace can set 0..CSIGNAL
as an exit_signal. Having ability to set exit_signal in NSIG..CSIGNAL
renge seems to be a bug, but at least it seems to be harmless one
and indeed may be addressed separately in the future.

2019-09-11 14:56:57

by Christian Brauner

[permalink] [raw]
Subject: Re: [PATCH v2] fork: check exit_signal passed in clone3() call

On Wed, Sep 11, 2019 at 03:32:13PM +0100, Eugene Syromiatnikov wrote:
> On Wed, Sep 11, 2019 at 04:16:36PM +0200, Christian Brauner wrote:
> > On Wed, Sep 11, 2019 at 03:52:36PM +0200, Christian Brauner wrote:
> > > On Wed, Sep 11, 2019 at 06:48:52AM -0700, Andrew Morton wrote:
> > > > What are the user-visible runtime effects of this bug?
>
> The userspace can set -1 as an exit_signal, and that will break process
> signalling and reaping.
>
> > > > Relatedly, should this fix be backported into -stable kernels? If so, why?
> > >
> > > No, as I said in my other mail clone3() is not in any released kernel
> > > yet. clone3() is going to be released in v5.3.
> >
> > Sigh, I spoke to soon... Hm, this is placed in _do_fork(). There's a
> > chance that this might be visible in legacy clone if anyone passes in an
> > invalid signal greater than NSIG right now somehow, they'd now get
> > EINVAL if I'm seeing this right.
> >
> > So an alternative might be to only fix this in clone3() only right now
> > and get this patch into 5.3 to not release clone3() with this bug from
> > legacy clone duplicated.
> > And we defer the actual legacy clone fix until after next merge window
> > having it stew in linux-next for a couple of rcs. Distros often pull in
> > rcs so if anyone notices a regression for legacy clone we'll know about
> > it... valid_signal() checks at process exit time when the parent is
> > supposed to be notifed will catch faulty signals anyway so it's not that
> > big of a deal.
>
> As the patch is written, only copy_clone_args_from_user is touched (which
> is used only by clone3 and not legacy clone), and the check added

Great!

> replicates legacy clone behaviour: userspace can set 0..CSIGNAL
> as an exit_signal. Having ability to set exit_signal in NSIG..CSIGNAL

Hm. The way I see it for clone3() it would make sense to only have <
NSIG right away. valid_signal() won't let through anything else
anyway... Since clone3() isn't out yet it doesn't make sense to
replicate the (buggy) behavior of legacy clone, right?

Christian

2019-09-11 15:13:39

by Dmitry V. Levin

[permalink] [raw]
Subject: Re: [PATCH v2] fork: check exit_signal passed in clone3() call

On Wed, Sep 11, 2019 at 04:54:47PM +0200, Christian Brauner wrote:
> On Wed, Sep 11, 2019 at 03:32:13PM +0100, Eugene Syromiatnikov wrote:
> > On Wed, Sep 11, 2019 at 04:16:36PM +0200, Christian Brauner wrote:
> > > On Wed, Sep 11, 2019 at 03:52:36PM +0200, Christian Brauner wrote:
> > > > On Wed, Sep 11, 2019 at 06:48:52AM -0700, Andrew Morton wrote:
> > > > > What are the user-visible runtime effects of this bug?
> >
> > The userspace can set -1 as an exit_signal, and that will break process
> > signalling and reaping.
> >
> > > > > Relatedly, should this fix be backported into -stable kernels? If so, why?
> > > >
> > > > No, as I said in my other mail clone3() is not in any released kernel
> > > > yet. clone3() is going to be released in v5.3.
> > >
> > > Sigh, I spoke to soon... Hm, this is placed in _do_fork(). There's a
> > > chance that this might be visible in legacy clone if anyone passes in an
> > > invalid signal greater than NSIG right now somehow, they'd now get
> > > EINVAL if I'm seeing this right.
> > >
> > > So an alternative might be to only fix this in clone3() only right now
> > > and get this patch into 5.3 to not release clone3() with this bug from
> > > legacy clone duplicated.
> > > And we defer the actual legacy clone fix until after next merge window
> > > having it stew in linux-next for a couple of rcs. Distros often pull in
> > > rcs so if anyone notices a regression for legacy clone we'll know about
> > > it... valid_signal() checks at process exit time when the parent is
> > > supposed to be notifed will catch faulty signals anyway so it's not that
> > > big of a deal.
> >
> > As the patch is written, only copy_clone_args_from_user is touched (which
> > is used only by clone3 and not legacy clone), and the check added
>
> Great!
>
> > replicates legacy clone behaviour: userspace can set 0..CSIGNAL
> > as an exit_signal. Having ability to set exit_signal in NSIG..CSIGNAL
>
> Hm. The way I see it for clone3() it would make sense to only have <
> NSIG right away. valid_signal() won't let through anything else
> anyway... Since clone3() isn't out yet it doesn't make sense to
> replicate the (buggy) behavior of legacy clone, right?

I agree, let's have a proper exit_signal check in the new syscall
from the beginning.

It should be as simple as
if (unlikely((args.exit_signal & ~((u64)CSIGNAL)) ||
!valid_signal(args.exit_signal)))
return -EINVAL;
shouldn't it?


--
ldv

2019-09-11 15:25:42

by Eugene Syromiatnikov

[permalink] [raw]
Subject: Re: [PATCH v2] fork: check exit_signal passed in clone3() call

On Wed, Sep 11, 2019 at 04:54:47PM +0200, Christian Brauner wrote:
> On Wed, Sep 11, 2019 at 03:32:13PM +0100, Eugene Syromiatnikov wrote:
> > On Wed, Sep 11, 2019 at 04:16:36PM +0200, Christian Brauner wrote:
> > > On Wed, Sep 11, 2019 at 03:52:36PM +0200, Christian Brauner wrote:
> > > > On Wed, Sep 11, 2019 at 06:48:52AM -0700, Andrew Morton wrote:
> > > > > What are the user-visible runtime effects of this bug?
> >
> > The userspace can set -1 as an exit_signal, and that will break process
> > signalling and reaping.
> >
> > > > > Relatedly, should this fix be backported into -stable kernels? If so, why?
> > > >
> > > > No, as I said in my other mail clone3() is not in any released kernel
> > > > yet. clone3() is going to be released in v5.3.
> > >
> > > Sigh, I spoke to soon... Hm, this is placed in _do_fork(). There's a
> > > chance that this might be visible in legacy clone if anyone passes in an
> > > invalid signal greater than NSIG right now somehow, they'd now get
> > > EINVAL if I'm seeing this right.
> > >
> > > So an alternative might be to only fix this in clone3() only right now
> > > and get this patch into 5.3 to not release clone3() with this bug from
> > > legacy clone duplicated.
> > > And we defer the actual legacy clone fix until after next merge window
> > > having it stew in linux-next for a couple of rcs. Distros often pull in
> > > rcs so if anyone notices a regression for legacy clone we'll know about
> > > it... valid_signal() checks at process exit time when the parent is
> > > supposed to be notifed will catch faulty signals anyway so it's not that
> > > big of a deal.
> >
> > As the patch is written, only copy_clone_args_from_user is touched (which
> > is used only by clone3 and not legacy clone), and the check added
>
> Great!
>
> > replicates legacy clone behaviour: userspace can set 0..CSIGNAL
> > as an exit_signal. Having ability to set exit_signal in NSIG..CSIGNAL
>
> Hm. The way I see it for clone3() it would make sense to only have <
> NSIG right away. valid_signal() won't let through anything else
> anyway... Since clone3() isn't out yet it doesn't make sense to
> replicate the (buggy) behavior of legacy clone, right?

I was thinking about that and in the end decided to go with CSIGNAL;
the only issue I see here is that it prevents for libc to easily
switch clone() library call implementation to clone3(), in case there
are some applications that rely on this kind of behaviour; if there's
no such userspace users, then switch to valid_signal() check for both
legacy and clone3 should be fine (along with possible switching to u64
for kernel_clone_args's exit_signal, so the check can done in one place),
but I'm agree with your hesitance to do it right now.

> Christian

2019-09-11 15:34:48

by Christian Brauner

[permalink] [raw]
Subject: Re: [PATCH v2] fork: check exit_signal passed in clone3() call

On Wed, Sep 11, 2019 at 04:20:48PM +0100, Eugene Syromiatnikov wrote:
> On Wed, Sep 11, 2019 at 04:54:47PM +0200, Christian Brauner wrote:
> > On Wed, Sep 11, 2019 at 03:32:13PM +0100, Eugene Syromiatnikov wrote:
> > > On Wed, Sep 11, 2019 at 04:16:36PM +0200, Christian Brauner wrote:
> > > > On Wed, Sep 11, 2019 at 03:52:36PM +0200, Christian Brauner wrote:
> > > > > On Wed, Sep 11, 2019 at 06:48:52AM -0700, Andrew Morton wrote:
> > > > > > What are the user-visible runtime effects of this bug?
> > >
> > > The userspace can set -1 as an exit_signal, and that will break process
> > > signalling and reaping.
> > >
> > > > > > Relatedly, should this fix be backported into -stable kernels? If so, why?
> > > > >
> > > > > No, as I said in my other mail clone3() is not in any released kernel
> > > > > yet. clone3() is going to be released in v5.3.
> > > >
> > > > Sigh, I spoke to soon... Hm, this is placed in _do_fork(). There's a
> > > > chance that this might be visible in legacy clone if anyone passes in an
> > > > invalid signal greater than NSIG right now somehow, they'd now get
> > > > EINVAL if I'm seeing this right.
> > > >
> > > > So an alternative might be to only fix this in clone3() only right now
> > > > and get this patch into 5.3 to not release clone3() with this bug from
> > > > legacy clone duplicated.
> > > > And we defer the actual legacy clone fix until after next merge window
> > > > having it stew in linux-next for a couple of rcs. Distros often pull in
> > > > rcs so if anyone notices a regression for legacy clone we'll know about
> > > > it... valid_signal() checks at process exit time when the parent is
> > > > supposed to be notifed will catch faulty signals anyway so it's not that
> > > > big of a deal.
> > >
> > > As the patch is written, only copy_clone_args_from_user is touched (which
> > > is used only by clone3 and not legacy clone), and the check added
> >
> > Great!
> >
> > > replicates legacy clone behaviour: userspace can set 0..CSIGNAL
> > > as an exit_signal. Having ability to set exit_signal in NSIG..CSIGNAL
> >
> > Hm. The way I see it for clone3() it would make sense to only have <
> > NSIG right away. valid_signal() won't let through anything else
> > anyway... Since clone3() isn't out yet it doesn't make sense to
> > replicate the (buggy) behavior of legacy clone, right?
>
> I was thinking about that and in the end decided to go with CSIGNAL;
> the only issue I see here is that it prevents for libc to easily
> switch clone() library call implementation to clone3(), in case there
> are some applications that rely on this kind of behaviour; if there's
> no such userspace users, then switch to valid_signal() check for both
> legacy and clone3 should be fine (along with possible switching to u64
> for kernel_clone_args's exit_signal, so the check can done in one place),
> but I'm agree with your hesitance to do it right now.

We already have other explicit differences between clone() and clone3().
And these differences are intentional. Two examples are that CSIGNAL is
not a valid flag anymore (due to .exit-signal) and CLONE_DETACHED will
not be simply ignored anymore but will get you EINVAL. So that shouldn't
be a concern.

I think we really want the minimal thing and only for clone3 right now:
verify whether any high-bits are set and whether it's a valid signal.
clone3() needs to have correct semantics when released!

Legacy clone on the other hand already has this buggy behavior for
n-years. It can have it for a little while longer so we can see whether
userspace relies on it.

Christian

2019-09-11 17:37:04

by Eric W. Biederman

[permalink] [raw]
Subject: Re: [PATCH v2] fork: check exit_signal passed in clone3() call

Eugene Syromiatnikov <[email protected]> writes:

> Previously, higher 32 bits of exit_signal fields were lost when
> copied to the kernel args structure (that uses int as a type for the
> respective field). Moreover, as Oleg has noted[1], exit_signal is used
> unchecked, so it has to be checked for sanity before use; for the legacy
> syscalls, applying CSIGNAL mask guarantees that it is at least non-negative;
> however, there's no such thing is done in clone3() code path, and that can
> break at least thread_group_leader.
>
> Checking user-passed exit_signal against ~CSIGNAL mask solves both
> of these problems.
>
> [1] https://lkml.org/lkml/2019/9/10/467
>
> * kernel/fork.c (copy_clone_args_from_user): Fail with -EINVAL if
> args.exit_signal has bits set outside CSIGNAL mask.
> (_do_fork): Note that exit_signal is expected to be checked for the
> sanity by the caller.
>
> Fixes: 7f192e3cd316 ("fork: add clone3")
> Reported-by: Oleg Nesterov <[email protected]>
> Signed-off-by: Eugene Syromiatnikov <[email protected]>

Acked-by: "Eric W. Biederman" <[email protected]>

> ---
> kernel/fork.c | 12 ++++++++++++
> 1 file changed, 12 insertions(+)
>
> diff --git a/kernel/fork.c b/kernel/fork.c
> index 2852d0e..9dee2ab 100644
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -2338,6 +2338,8 @@ struct mm_struct *copy_init_mm(void)
> *
> * It copies the process, and if successful kick-starts
> * it and waits for it to finish using the VM if required.
> + *
> + * args->exit_signal is expected to be checked for sanity by the caller.
> */
> long _do_fork(struct kernel_clone_args *args)
> {
> @@ -2562,6 +2564,16 @@ noinline static int copy_clone_args_from_user(struct kernel_clone_args *kargs,
> if (copy_from_user(&args, uargs, size))
> return -EFAULT;
>
> + /*
> + * exit_signal is confined to CSIGNAL mask in legacy syscalls,
> + * so it is used unchecked deeper in syscall handling routines;
> + * moreover, copying to struct kernel_clone_args.exit_signals
> + * trims higher 32 bits, so it is has to be checked that they
> + * are zero.
> + */
> + if (unlikely(args.exit_signal & ~((u64)CSIGNAL)))
> + return -EINVAL;
> +
> *kargs = (struct kernel_clone_args){
> .flags = args.flags,
> .pidfd = u64_to_user_ptr(args.pidfd),

2019-09-13 10:57:47

by Christian Brauner

[permalink] [raw]
Subject: Re: [PATCH v2] fork: check exit_signal passed in clone3() call

On Wed, Sep 11, 2019 at 03:52:36PM +0200, Christian Brauner wrote:
> On Wed, Sep 11, 2019 at 06:48:52AM -0700, Andrew Morton wrote:
> > On Tue, 10 Sep 2019 18:58:52 +0100 Eugene Syromiatnikov <[email protected]> wrote:
> >
> > > Previously, higher 32 bits of exit_signal fields were lost when
> > > copied to the kernel args structure (that uses int as a type for the
> > > respective field). Moreover, as Oleg has noted[1], exit_signal is used
> > > unchecked, so it has to be checked for sanity before use; for the legacy
> > > syscalls, applying CSIGNAL mask guarantees that it is at least non-negative;
> > > however, there's no such thing is done in clone3() code path, and that can
> > > break at least thread_group_leader.
> > >
> > > Checking user-passed exit_signal against ~CSIGNAL mask solves both
> > > of these problems.
> > >
> > > [1] https://lkml.org/lkml/2019/9/10/467
> > >
> > > * kernel/fork.c (copy_clone_args_from_user): Fail with -EINVAL if
> > > args.exit_signal has bits set outside CSIGNAL mask.
> > > (_do_fork): Note that exit_signal is expected to be checked for the
> > > sanity by the caller.
> > >
> > > Fixes: 7f192e3cd316 ("fork: add clone3")
> >
> > What are the user-visible runtime effects of this bug?
> >
> > Relatedly, should this fix be backported into -stable kernels? If so, why?
>
> No, as I said in my other mail clone3() is not in any released kernel
> yet. clone3() is going to be released in v5.3.

Applied yesteday. This is now fixed and included in mainline.

Thanks!
Christian