2024-05-16 09:36:16

by Oleg Nesterov

Subject: Re: [PATCH 2/3] seccomp: release task filters when the task exits

(add lkml)

On 05/15, Andrei Vagin wrote:
>
> On Wed, May 15, 2024 at 5:52 AM Oleg Nesterov wrote:
> >
> > Let me repeat I forgot everything about seccomp, but let me ask
> > a couple of questions...
>
> It seems you still remember something:). Thank you for the feedback.

I just still remember how to use grep ;)

> > > @@ -2126,6 +2137,11 @@ static struct seccomp_filter *get_nth_filter(struct task_struct *task,
> > > */
> > > spin_lock_irq(&task->sighand->siglock);
> > >
> > > + if (task->flags & PF_EXITING) {
> > > + spin_unlock_irq(&task->sighand->siglock);
> > > + return ERR_PTR(-EINVAL);
> > > + }
> >
> > Why do we need the PF_EXITING check here?
> >
> > This looks unnecessary even if get_nth_filter() could race with the
> > exiting task, but this doesn't matter.
> >
> > This race is not possible, get_nth_filter() is only called from ptrace()
> > paths, but the tracee can't stop in TASK_TRACED after exit_signals() which
> > sets PF_EXITING.
>
> If we rely on using seccomp_get_filter only from ptrace, you are right.

Plus it also does __get_seccomp_filter/__put_seccomp_filter, so I guess it
should be safe without this check even if it could be used outside of ptrace.
Just like proc_pid_seccomp_cache(), see below.

> > > @@ -2494,6 +2510,11 @@ int proc_pid_seccomp_cache(struct seq_file *m, struct pid_namespace *ns,
> > > if (!lock_task_sighand(task, &flags))
> > > return -ESRCH;
> > >
> > > + if (thread->flags & PF_EXITING) {
> > > + unlock_task_sighand(task, &flags);
> > > + return 0;
> >
> > Again, do we really need this check?
> >
> > It can race with the exiting task and (without this check) do
> > __get_seccomp_filter(f) right before seccomp_filter_release()
> > takes sighand->siglock. But why is it bad?
>
> I think you are right, this check isn't required.
>
> >
> > OTOH. I guess proc_pid_seccomp_cache() is the only reason why
> > seccomp_filter_release() takes ->siglock with your patch?
>
> seccomp_sync_threads and seccomp_can_sync_threads should be considered too.

Yes. But we only need to consider them in the multi-thread case, right?
In this case exit_signals() sets PF_EXITING under ->siglock, so they can't
miss this flag, seccomp_filter_release() doesn't need to take siglock.

> If we check PF_EXITING in all of them, we don't need to take ->siglock in
> seccomp_filter_release. Does it sound right?

The problem is a single-threaded exiting task. In this case exit_signals()
sets PF_EXITING lockless. This means that in this case

- proc_pid_seccomp_cache() can't rely on the PF_EXITING check
but it can be safely removed.

- seccomp_filter_release() needs to take ->siglock to avoid the
race with proc_pid_seccomp_cache().

And this chunk from your patch

static void __seccomp_filter_orphan(struct seccomp_filter *orig)
{
+ lockdep_assert_held(&current->sighand->siglock);
+

looks unnecessary too, seccomp_filter_release() can just do

spin_lock_irq(siglock);
orig = tsk->seccomp.filter;
tsk->seccomp.filter = NULL;
spin_unlock_irq(siglock);

__seccomp_filter_release(orig);

Right?
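
The locking pattern suggested above (detach the pointer under the lock, drop the reference after unlocking) can be sketched in userspace C. This is only an illustration under stated assumptions: struct task, struct filter, filter_alloc(), filter_put(), task_get_filter() and task_release_filter() are hypothetical stand-ins, a pthread mutex plays the role of ->siglock, and none of this is the kernel code.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel objects discussed in the thread. */
struct filter {
	atomic_int refs;
};

struct task {
	pthread_mutex_t lock;	/* plays the role of ->siglock */
	struct filter *filter;	/* protected by lock */
};

static atomic_int filters_freed;	/* bookkeeping for the demo only */

static struct filter *filter_alloc(void)
{
	struct filter *f = malloc(sizeof(*f));

	assert(f != NULL);
	atomic_init(&f->refs, 1);	/* the task's own reference */
	return f;
}

/* Drop one reference; free the filter when the last one goes away. */
static void filter_put(struct filter *f)
{
	if (f && atomic_fetch_sub(&f->refs, 1) == 1) {
		free(f);
		atomic_fetch_add(&filters_freed, 1);
	}
}

/* Reader side: take a reference while holding the lock. */
static struct filter *task_get_filter(struct task *t)
{
	struct filter *f;

	pthread_mutex_lock(&t->lock);
	f = t->filter;
	if (f)
		atomic_fetch_add(&f->refs, 1);
	pthread_mutex_unlock(&t->lock);
	return f;
}

/* Exit side: detach under the lock, release outside of it. */
static void task_release_filter(struct task *t)
{
	struct filter *orig;

	pthread_mutex_lock(&t->lock);
	orig = t->filter;
	t->filter = NULL;
	pthread_mutex_unlock(&t->lock);

	filter_put(orig);
}
```

A reader that wins the race keeps the filter alive through its own reference until it calls filter_put(); a reader that loses the race sees NULL. In both cases the release itself runs with no lock held.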

Oleg.



2024-05-16 13:10:56

by Oleg Nesterov

Subject: Re: [PATCH 2/3] seccomp: release task filters when the task exits

On 05/16, Oleg Nesterov wrote:
>
> On 05/15, Andrei Vagin wrote:
> >
> > seccomp_sync_threads and seccomp_can_sync_threads should be considered too.
>
> Yes. But we only need to consider them in the multi-thread case, right?
> In this case exit_signals() sets PF_EXITING under ->siglock, so they can't
> miss this flag, seccomp_filter_release() doesn't need to take siglock.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Ah, no. seccomp_filter_release() does need to take ->siglock even if we
forget about proc_pid_seccomp_cache().

Without siglock

orig = tsk->seccomp.filter;

can leak into the critical section in exit_signals() (spin_unlock is the
one-way barrier) and this LOAD can be reordered with "flags |= PF_EXITING".

Hmm. I thought we had something like smp_mb__after_unlock(), but it seems we
don't. So we can't add a fast-path

if (!tsk->seccomp.filter)
return;

check at the start of seccomp_filter_release().


Cough... Now that I look at seccomp_can_sync_threads() I think it too
doesn't need the PF_EXITING check.

If it is called before seccomp_filter_release(), this doesn't really
differ from the case when it is called before do_exit/exit_signals.

If it is called after seccomp_filter_release(), then is_ancestor()
must be true.

But perhaps I missed something, I won't insist, up to you.

> > If we check PF_EXITING in all of them, we don't need to take ->siglock in
> > seccomp_filter_release. Does it sound right?
>
> The problem is a single-threaded exiting task. In this case exit_signals()
> sets PF_EXITING lockless. This means that in this case
>
> - proc_pid_seccomp_cache() can't rely on the PF_EXITING check
> but it can be safely removed.
>
> - seccomp_filter_release() needs to take ->siglock to avoid the
> race with proc_pid_seccomp_cache().
>
> And this chunk from your patch
>
> static void __seccomp_filter_orphan(struct seccomp_filter *orig)
> {
> + lockdep_assert_held(&current->sighand->siglock);
> +
>
> looks unnecessary too, seccomp_filter_release() can just do
>
> spin_lock_irq(siglock);
> orig = tsk->seccomp.filter;
> tsk->seccomp.filter = NULL;
> spin_unlock_irq(siglock);
>
> __seccomp_filter_release(orig);
>
> Right?
>
> Oleg.


2024-05-22 06:49:58

by Andrei Vagin

Subject: Re: [PATCH 2/3] seccomp: release task filters when the task exits

On Thu, May 16, 2024 at 6:10 AM Oleg Nesterov wrote:
>
> On 05/16, Oleg Nesterov wrote:
> >
> > On 05/15, Andrei Vagin wrote:
> > >
> > > seccomp_sync_threads and seccomp_can_sync_threads should be considered too.
> >
> > Yes. But we only need to consider them in the multi-thread case, right?
> > In this case exit_signals() sets PF_EXITING under ->siglock, so they can't
> > miss this flag, seccomp_filter_release() doesn't need to take siglock.
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>
> Ah, no. seccomp_filter_release() does need to take ->siglock even if we
> forget about proc_pid_seccomp_cache().
>
> Without siglock
>
> orig = tsk->seccomp.filter;
>
> can leak into the critical section in exit_signals() (spin_unlock is the
> one-way barrier) and this LOAD can be reordered with "flags |= PF_EXITING".
>
> Hmm. I thought we had something like smp_mb__after_unlock(), but it seems we
> don't. So we can't add a fast-path

We have smp_mb__after_unlock_lock in include/linux/rcupdate.h.

>
> if (!tsk->seccomp.filter)
> return;
>
> check at the start of seccomp_filter_release().
>
>
> Cough... Now that I look at seccomp_can_sync_threads() I think it too
> doesn't need the PF_EXITING check.
>
> If it is called before seccomp_filter_release(), this doesn't really
> differ from the case when it is called before do_exit/exit_signals.
>
> If it is called after seccomp_filter_release(), then is_ancestor()
> must be true.
>
> But perhaps I missed something, I won't insist, up to you.
>
> > > If we check PF_EXITING in all of them, we don't need to take ->siglock in
> > > seccomp_filter_release. Does it sound right?
> >
> > The problem is a single-threaded exiting task. In this case exit_signals()
> > sets PF_EXITING lockless. This means that in this case
> >
> > - proc_pid_seccomp_cache() can't rely on the PF_EXITING check
> > but it can be safely removed.
> >
> > - seccomp_filter_release() needs to take ->siglock to avoid the
> > race with proc_pid_seccomp_cache().
> >
> > And this chunk from your patch
> >
> > static void __seccomp_filter_orphan(struct seccomp_filter *orig)
> > {
> > + lockdep_assert_held(&current->sighand->siglock);
> > +
> >
> > looks unnecessary too, seccomp_filter_release() can just do
> >
> > spin_lock_irq(siglock);
> > orig = tsk->seccomp.filter;
> > tsk->seccomp.filter = NULL;
> > spin_unlock_irq(siglock);
> >
> > __seccomp_filter_release(orig);
> >
> > Right?
> >
> > Oleg.
>

2024-05-22 07:07:02

by Andrei Vagin

Subject: Re: [PATCH 2/3] seccomp: release task filters when the task exits

> On Thu, May 16, 2024 at 6:10 AM Oleg Nesterov wrote:
> >
> > On 05/16, Oleg Nesterov wrote:
> > >
> > > On 05/15, Andrei Vagin wrote:
> > > >
> > > > seccomp_sync_threads and seccomp_can_sync_threads should be considered too.
> > >
> > > Yes. But we only need to consider them in the multi-thread case, right?
> > > In this case exit_signals() sets PF_EXITING under ->siglock, so they can't
> > > miss this flag, seccomp_filter_release() doesn't need to take siglock.
> > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

PF_EXITING is set without holding ->siglock if tsk->signal has the
SIGNAL_GROUP_EXIT flag. I think it can be a case when one thread is in
seccomp_sync_threads and others are exiting. The first thread can check
that PF_EXITING isn't set for another thread. Then the second thread calls
exit_signals() and seccomp_filter_release(), and finally the first thread
installs its seccomp.filter on the second thread, after the release. If
seccomp_filter_release() takes siglock, this is handled properly.
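
To make the scenario above concrete, here is a rough userspace sketch, not the kernel code. It assumes hypothetical names (struct thread, sync_one_thread(), thread_exit_release()), a single pthread mutex standing in for the shared ->siglock, and an atomic flag standing in for PF_EXITING that may be set without the lock. Filters are only reference-counted here, never freed, to keep the sketch short.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; refcounts only, nothing is freed in this sketch. */
struct filter {
	atomic_int refs;
};

struct thread {
	atomic_bool exiting;	/* stands in for PF_EXITING */
	struct filter *filter;	/* protected by group_lock */
};

/* Stands in for the ->siglock shared by the thread group. */
static pthread_mutex_t group_lock = PTHREAD_MUTEX_INITIALIZER;

static void filter_put(struct filter *f)
{
	if (f)
		atomic_fetch_sub(&f->refs, 1);
}

/* Sync side: install newf on t unless t is already exiting. */
static bool sync_one_thread(struct thread *t, struct filter *newf)
{
	bool installed = false;

	pthread_mutex_lock(&group_lock);
	if (!atomic_load(&t->exiting)) {
		struct filter *old = t->filter;

		atomic_fetch_add(&newf->refs, 1);
		t->filter = newf;
		filter_put(old);
		installed = true;
	}
	pthread_mutex_unlock(&group_lock);
	return installed;
}

/* Exit side: the flag may be set locklessly, the release takes the lock. */
static void thread_exit_release(struct thread *t)
{
	struct filter *orig;

	atomic_store(&t->exiting, true);

	pthread_mutex_lock(&group_lock);
	orig = t->filter;
	t->filter = NULL;
	pthread_mutex_unlock(&group_lock);

	filter_put(orig);
}
```

Because both sides take the same lock, there are only two orders. If the sync side runs first, the exit side later finds and releases the filter it installed. If the exit side runs first, the flag was set before the lock was dropped, so the sync side sees it and skips the thread.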

> >
> > Ah, no. seccomp_filter_release() does need to take ->siglock even if we
> > forget about proc_pid_seccomp_cache().
> >
> > Without siglock
> >
> > orig = tsk->seccomp.filter;
> >
> > can leak into the critical section in exit_signals() (spin_unlock is the
> > one-way barrier) and this LOAD can be reordered with "flags |= PF_EXITING".
> >
> > Hmm. I thought we had something like smp_mb__after_unlock(), but it seems we
> > don't. So we can't add a fast-path

We have smp_mb__after_unlock_lock in include/linux/rcupdate.h.

> >
> > if (!tsk->seccomp.filter)
> > return;
> >
> > check at the start of seccomp_filter_release().
> >
> >
> > Cough... Now that I look at seccomp_can_sync_threads() I think it too
> > doesn't need the PF_EXITING check.
> >
> > If it is called before seccomp_filter_release(), this doesn't really
> > differ from the case when it is called before do_exit/exit_signals.
> >
> > If it is called after seccomp_filter_release(), then is_ancestor()
> > must be true.
> >
> > But perhaps I missed something, I won't insist, up to you.
> >

You are right, this check isn't required in seccomp_can_sync_threads, but
I decided that it is better to be consistent with seccomp_sync_threads.

Thanks,
Andrei

2024-05-22 10:37:35

by Oleg Nesterov

Subject: Re: [PATCH 2/3] seccomp: release task filters when the task exits

On 05/22, Andrei Vagin wrote:
>
> > On Thu, May 16, 2024 at 6:10 AM Oleg Nesterov wrote:
> > >
> > > On 05/16, Oleg Nesterov wrote:
> > > >
> > > > On 05/15, Andrei Vagin wrote:
> > > > >
> > > > > seccomp_sync_threads and seccomp_can_sync_threads should be considered too.
> > > >
> > > > Yes. But we only need to consider them in the multi-thread case, right?
> > > > In this case exit_signals() sets PF_EXITING under ->siglock, so they can't
> > > > miss this flag, seccomp_filter_release() doesn't need to take siglock.
> > > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>
> PF_EXITING is set without holding ->siglock if tsk->signal has the
> SIGNAL_GROUP_EXIT flag. I think it can be a case when one thread is in
> seccomp_sync_threads and others are exiting.

Yes, I forgot this.

> > > Hmm. I thought we had something like smp_mb__after_unlock(), but it seems we
> > > don't. So we can't add a fast-path
>
> We have smp_mb__after_unlock_lock in include/linux/rcupdate.h.

This is another thing.

But sorry for the confusion, this doesn't really matter, we could use a plain mb().
I mean, I was thinking about something like

seccomp_filter_release:

smp_mb();
if (!READ_ONCE(tsk->seccomp.filter))
return;

spin_lock_irq(siglock);
orig = tsk->seccomp.filter;
...

but then seccomp_sync_threads() should do something like


orig = READ_ONCE(thread->seccomp.filter);

smp_store_release(&thread->seccomp.filter,
caller->seccomp.filter);

smp_mb(); // pairs with mb() in seccomp_filter_release()

if (READ_ONCE(thread->flags) & PF_EXITING) {
WRITE_ONCE(thread->seccomp.filter, orig);
continue;
}
__seccomp_filter_release(orig);

...

too subtle even _if_ correct, and I am not sure at all this would be correct.
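
The barrier pairing in the sketch above is the classic store-buffering pattern: each side stores, issues a full barrier, then loads what the other side stored. A small userspace C11 analogue follows, purely as an illustration. The names exiter(), syncer() and run_once() are hypothetical, and a sequentially consistent fence is assumed to play the role of smp_mb(). It shows only the pairing guarantee (both sides cannot miss each other's store), not that the whole lockless scheme would be correct.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

static atomic_int exiting;		/* stands in for PF_EXITING */
static atomic_int filter_installed;	/* stands in for seccomp.filter != NULL */

/* Written by one thread each, read only after pthread_join(). */
static int exiter_saw_filter;
static int syncer_saw_exiting;

/* Exit side: set the flag, full barrier, then look at the filter. */
static void *exiter(void *arg)
{
	(void)arg;
	atomic_store_explicit(&exiting, 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);	/* assumed smp_mb() analogue */
	exiter_saw_filter = atomic_load_explicit(&filter_installed,
						 memory_order_relaxed);
	return NULL;
}

/* Sync side: install the filter, full barrier, then look at the flag. */
static void *syncer(void *arg)
{
	(void)arg;
	atomic_store_explicit(&filter_installed, 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);	/* pairs with the fence in exiter() */
	syncer_saw_exiting = atomic_load_explicit(&exiting,
						  memory_order_relaxed);
	return NULL;
}

/* Run both sides once; true if at least one observed the other's store. */
static bool run_once(void)
{
	pthread_t a, b;
	int rc;

	atomic_store(&exiting, 0);
	atomic_store(&filter_installed, 0);

	rc = pthread_create(&a, NULL, exiter, NULL);
	assert(rc == 0);
	rc = pthread_create(&b, NULL, syncer, NULL);
	assert(rc == 0);
	pthread_join(a, NULL);
	pthread_join(b, NULL);

	return exiter_saw_filter || syncer_saw_exiting;
}
```

With both fences in place, run_once() cannot return false. Removing either fence allows the outcome where the exit side sees no filter and the sync side sees no exiting flag, which corresponds to the missed-release case the undo step in the sketch above is meant to catch.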

> > > Cough... Now that I look at seccomp_can_sync_threads() I think it too
> > > doesn't need the PF_EXITING check.
> > >
> > > If it is called before seccomp_filter_release(), this doesn't really
> > > differ from the case when it is called before do_exit/exit_signals.
> > >
> > > If it is called after seccomp_filter_release(), then is_ancestor()
> > > must be true.
> > >
> > > But perhaps I missed something, I won't insist, up to you.
> > >
>
> You are right, this check isn't required in seccomp_can_sync_threads, but
> I decided that it is better to be consistent with seccomp_sync_threads.

OK, agreed.

Oleg.