2022-09-02 04:40:22

by Oleksandr Tymoshenko

[permalink] [raw]
Subject: [PATCH] seccomp: fix refcounter leak if fork/clone is terminated

release_task, where the seccomp's filter refcounter is released, is not
called for the case when the fork/clone is terminated midway by a
signal. This leaves an extra reference that prevents filter from being
destroyed even after all processes using it exit leading to a BPF JIT
memory leak. Dereference the refcounter in the failure path of the
copy_process function.

Fixes: 3a15fb6ed92c ("seccomp: release filter after task is fully dead")
Cc: Christian Brauner <[email protected]>
Cc: [email protected]
Signed-off-by: Oleksandr Tymoshenko <[email protected]>
---
kernel/fork.c | 16 ++++++++++++++++
1 file changed, 16 insertions(+)

diff --git a/kernel/fork.c b/kernel/fork.c
index 90c85b17bf69..20f7a3e91354 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1763,6 +1763,21 @@ static void copy_seccomp(struct task_struct *p)
#endif
}

+static void release_seccomp(struct task_struct *p)
+{
+#ifdef CONFIG_SECCOMP
+ /*
+ * Must be called with sighand->lock held, which is common to
+ * all threads in the group. Holding cred_guard_mutex is not
+ * needed because this new task is not yet running and cannot
+ * be racing exec.
+ */
+ assert_spin_locked(&current->sighand->siglock);
+
+ seccomp_filter_release(p);
+#endif
+}
+
SYSCALL_DEFINE1(set_tid_address, int __user *, tidptr)
{
current->clear_child_tid = tidptr;
@@ -2495,6 +2510,7 @@ static __latent_entropy struct task_struct *copy_process(
return p;

bad_fork_cancel_cgroup:
+ release_seccomp(p);
sched_core_free(p);
spin_unlock(&current->sighand->siglock);
write_unlock_irq(&tasklist_lock);
--
2.37.2.789.g6183377224-goog


2022-09-05 09:46:03

by Christian Brauner

[permalink] [raw]
Subject: Re: [PATCH] seccomp: fix refcounter leak if fork/clone is terminated

On Fri, Sep 02, 2022 at 03:41:35AM +0000, Oleksandr Tymoshenko wrote:
> release_task, where the seccomp's filter refcounter is released, is not
> called for the case when the fork/clone is terminated midway by a
> signal. This leaves an extra reference that prevents filter from being
> destroyed even after all processes using it exit leading to a BPF JIT
> memory leak. Dereference the refcounter in the failure path of the
> copy_process function.
>
> Fixes: 3a15fb6ed92c ("seccomp: release filter after task is fully dead")
> Cc: Christian Brauner <[email protected]>
> Cc: [email protected]
> Signed-off-by: Oleksandr Tymoshenko <[email protected]>
> ---

Hey Oleksandr,

Thanks for the patch! I'm really puzzled as to why we never noticed this
and I'm trying to re-architect how this happend. But in any case,
there's a patch in the seccomp tree that fixes this:

https://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git/commit/?id=6d17452707ca

which is slighly different from your approach in that it moves
copy_seccomp() after the point of no return. Let us know if you see any
issues with this!

Christian

2022-09-05 23:07:08

by Oleksandr Tymoshenko

[permalink] [raw]
Subject: Re: [PATCH] seccomp: fix refcounter leak if fork/clone is terminated

Hi Christian,

The patch in the seccomp tree, adapted to 5.10 branch, fixed the
memory leak in my reproducer.
Thanks for working on this, I should have checked the seccomp tree first :)
Please disregard the patch in my submission.

Thank you

On Mon, Sep 5, 2022 at 1:39 AM Christian Brauner <[email protected]> wrote:
>
> On Fri, Sep 02, 2022 at 03:41:35AM +0000, Oleksandr Tymoshenko wrote:
> > release_task, where the seccomp's filter refcounter is released, is not
> > called for the case when the fork/clone is terminated midway by a
> > signal. This leaves an extra reference that prevents filter from being
> > destroyed even after all processes using it exit leading to a BPF JIT
> > memory leak. Dereference the refcounter in the failure path of the
> > copy_process function.
> >
> > Fixes: 3a15fb6ed92c ("seccomp: release filter after task is fully dead")
> > Cc: Christian Brauner <[email protected]>
> > Cc: [email protected]
> > Signed-off-by: Oleksandr Tymoshenko <[email protected]>
> > ---
>
> Hey Oleksandr,
>
> Thanks for the patch! I'm really puzzled as to why we never noticed this
> and I'm trying to re-architect how this happend. But in any case,
> there's a patch in the seccomp tree that fixes this:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git/commit/?id=6d17452707ca
>
> which is slighly different from your approach in that it moves
> copy_seccomp() after the point of no return. Let us know if you see any
> issues with this!
>
> Christian

2022-09-06 08:10:37

by Christian Brauner

[permalink] [raw]
Subject: Re: [PATCH] seccomp: fix refcounter leak if fork/clone is terminated

On Mon, Sep 05, 2022 at 03:53:54PM -0700, Oleksandr Tymoshenko wrote:
> Hi Christian,
>
> The patch in the seccomp tree, adapted to 5.10 branch, fixed the
> memory leak in my reproducer.
> Thanks for working on this, I should have checked the seccomp tree first :)
> Please disregard the patch in my submission.

Oh, it wasn't my patch it was someone else's. :) I just acked it! Thanks
for working on this we need things like these spotted and fixed!