2007-08-12 17:03:46

by Oleg Nesterov

[permalink] [raw]
Subject: [PATCH 3/4] signalfd: fix interaction with posix-timers

dequeue_signal:

if (__SI_TIMER) {
spin_unlock(&tsk->sighand->siglock);
do_schedule_next_timer(info);
spin_lock(&tsk->sighand->siglock);
}

Unless tsk == curent, this is absolutely unsafe: nothing prevents tsk from
exiting. If signalfd was passed to another process, do_schedule_next_timer()
is just wrong.

Add yet another "tsk == current" check into dequeue_signal().

This patch fixes an oopsable bug, but breaks the scheduling of posix timers
if the shared __SI_TIMER signal was fetched via signalfd attached to another
sub-thread. Mostly fixed by the next patch.

Signed-off-by: Oleg Nesterov <[email protected]>

--- t/kernel/signal.c~3_SIGFD 2007-08-09 19:59:27.000000000 +0400
+++ t/kernel/signal.c 2007-08-12 19:48:26.000000000 +0400
@@ -378,7 +378,7 @@ int dequeue_signal(struct task_struct *t
/* We only dequeue private signals from ourselves, we don't let
* signalfd steal them
*/
- if (tsk == current)
+ if (likely(tsk == current))
signr = __dequeue_signal(&tsk->pending, mask, info);
if (!signr) {
signr = __dequeue_signal(&tsk->signal->shared_pending,
@@ -425,7 +425,7 @@ int dequeue_signal(struct task_struct *t
if (!(tsk->signal->flags & SIGNAL_GROUP_EXIT))
tsk->signal->flags |= SIGNAL_STOP_DEQUEUED;
}
- if ( signr &&
+ if (signr && likely(tsk == current) &&
((info->si_code & __SI_MASK) == __SI_TIMER) &&
info->si_sys_private){
/*


2007-08-12 23:04:18

by Roland McGrath

[permalink] [raw]
Subject: Re: [PATCH 3/4] signalfd: fix interaction with posix-timers

This only affects signalfd and so the core change seems ok vs status quo.

I think it would be better overall not to have anyone like signalfd calling
dequeue_signal in its current form at all. (The function is too much an
internal piece of the core signals code. The SIGNAL_STOP_DEQUEUED code
applying for signalfd calls is probably another problem, for example.)


Thanks,
Roland

2007-08-13 13:07:35

by Oleg Nesterov

[permalink] [raw]
Subject: Re: [PATCH 3/4] signalfd: fix interaction with posix-timers

On 08/12, Roland McGrath wrote:
>
> This only affects signalfd and so the core change seems ok vs status quo.
>
> I think it would be better overall not to have anyone like signalfd calling
> dequeue_signal in its current form at all. (The function is too much an
> internal piece of the core signals code. The SIGNAL_STOP_DEQUEUED code
> applying for signalfd calls is probably another problem, for example.)

I agree, the "tsk != current" case is not nice. We can forbid fetching
signals from another thread group and remove it, this is very easy.
Until then, we can't avoid it without the code duplication.

But SIGNAL_STOP_DEQUEUED code should be OK, afaics. We only need it to
make sure do_signal_stop() can't miss SIGNAL_STOP_CONTINUED/GROUP_EXIT.

Can't we remove SIGNAL_STOP_DEQUEUED, btw?

dequeue_signal:

if (sig_kernel_stop(sig))
->flags &= ~SIGNAL_STOP_CONTINUED;

do_signal_stop:

if (flags & (SIGNAL_STOP_CONTINUED | SIGNAL_GROUP_EXIT))
return 0;

Possible?

Oleg.

2007-08-28 04:35:41

by Roland McGrath

[permalink] [raw]
Subject: SIGNAL_STOP_DEQUEUED

> But SIGNAL_STOP_DEQUEUED code should be OK, afaics. We only need it to
> make sure do_signal_stop() can't miss SIGNAL_STOP_CONTINUED/GROUP_EXIT.
>
> Can't we remove SIGNAL_STOP_DEQUEUED, btw?

No, we can't.

> dequeue_signal:
>
> if (sig_kernel_stop(sig))
> ->flags &= ~SIGNAL_STOP_CONTINUED;
>
> do_signal_stop:
>
> if (flags & (SIGNAL_STOP_CONTINUED | SIGNAL_GROUP_EXIT))
> return 0;
>
> Possible?

SIGNAL_STOP_CONTINUED exists to keep track of whether CLD_CONTINUED has
been reported to the parent's wait* call using the WCONTINUED flag.
wait_task_continued (kernel/exit.c) clears it. It should only be cleared
by the signals code when a job control stop actually happens before a wait*
call clears it. In dequeue_signal, you do not yet know whether it will
actually lead to a stop.

SIGNAL_STOP_DEQUEUED exists to fix a particular race, when we dequeue a
signal and then unlock the siglock before acting on it. This happens when
we call is_current_pgrp_orphaned(), and there might be other cases that
arise. In the window while we do not hold the lock, a kill or suchlike can
take the long to post a SIGCONT. The generation of a SIGCONT must clear
all pending stop signals from all the queues. But, the thread that
dequeued the signal and then released the lock effectively has that signal
still "pending" in its stack state, where it cannot be cleared. So, when
clearing stop signals from queues, we also clear the SIGNAL_STOP_DEQUEUED
flag. After retaking the siglock and considering the previously dequeued
stop signal, we check that SIGNAL_STOP_DEQUEUED is still set. This way, if
the stop signal had a handler, it runs, the stop signal having been
considered logically delivered before the SIGCONT was generated. The
SIGKILL/group-exit case has the same need, where rather than needing to
have stop signals cleared from queus, we need to make sure SIGKILL
overrides any stop in progress just as it would wake up the stopped thread
if it came after the race.


Thanks,
Roland

2007-08-28 09:13:19

by Oleg Nesterov

[permalink] [raw]
Subject: Re: SIGNAL_STOP_DEQUEUED

On 08/27, Roland McGrath wrote:
>
> > But SIGNAL_STOP_DEQUEUED code should be OK, afaics. We only need it to
> > make sure do_signal_stop() can't miss SIGNAL_STOP_CONTINUED/GROUP_EXIT.
> >
> > Can't we remove SIGNAL_STOP_DEQUEUED, btw?
>
> No, we can't.
>
> > dequeue_signal:
> >
> > if (sig_kernel_stop(sig))
> > ->flags &= ~SIGNAL_STOP_CONTINUED;
> >
> > do_signal_stop:
> >
> > if (flags & (SIGNAL_STOP_CONTINUED | SIGNAL_GROUP_EXIT))
> > return 0;
> >
> > Possible?
>
> SIGNAL_STOP_CONTINUED exists to keep track of whether CLD_CONTINUED has
> been reported to the parent's wait* call using the WCONTINUED flag.
> wait_task_continued (kernel/exit.c) clears it. It should only be cleared
> by the signals code when a job control stop actually happens before a wait*
> call clears it. In dequeue_signal, you do not yet know whether it will
> actually lead to a stop.

Yes. do_wait(WCONTINUED) can miss SIGNAL_STOP_CONTINUED task if the task
dequeues another sig_kernel_stop() signal. I thought this is harmless, but
it turns out I misunderstood the needed semantics.

Thanks for your explanation.

This is easy to fix, but we have to split SIGNAL_STOP_CONTINUED into 2 flags,
one for do_wait(), another to indicate that do_signal_stop() should abort.

> SIGNAL_STOP_DEQUEUED exists to fix a particular race, when we dequeue a
> signal and then unlock the siglock before acting on it. This happens when
> we call is_current_pgrp_orphaned(), and there might be other cases that
> arise. In the window while we do not hold the lock, a kill or suchlike can
> take the long to post a SIGCONT. The generation of a SIGCONT must clear
> all pending stop signals from all the queues. But, the thread that
> dequeued the signal and then released the lock effectively has that signal
> still "pending" in its stack state, where it cannot be cleared. So, when
> clearing stop signals from queues, we also clear the SIGNAL_STOP_DEQUEUED
> flag. After retaking the siglock and considering the previously dequeued
> stop signal, we check that SIGNAL_STOP_DEQUEUED is still set.

Yes. But please look at the pseudo code above. do_signal_stop() checks
"flags & (SIGNAL_STOP_CONTINUED | SIGNAL_GROUP_EXIT)" to prevent exactly
this race. Actually we need this check even if we don't temporary drop
->siglock, we shouldn't rely on the fact that SIGKILL < SIGSTOP and so
it will be dequeued first.

But, as you pointed out, this breaks do_wait(WCONTINUED), and the fix
needs another flag, so please forget.



However, is it all correct currently ? Consider 2 threads, T1 and T2,
SIGTTIN is SIG_DFL, SIGTTOU has a handler.

T1 does get_signal_to_deliver(), dequeus SIGTTIN, unlocks ->siglock.

SIGCONT comes in, nothing to do yet, just clear SIGNAL_STOP_DEQUEUED.

SIGTTOU is sent, T2 dequeues it and sets SIGNAL_STOP_DEQUEUED again.
It has a handler, we shouldn't stop, but

T1 continues, takes ->siglock, and calls do_signal_stop().

What do you think about something like the patch below? It moves the setting of
SIGNAL_STOP_DEQUEUED from dequeue_signal() to get_signal_to_deliver(), and the
flag is set only if the sig_kernel_stop() signal doesn't have a handler.

Oleg.

--- kernel/signal.c 2007-08-13 22:20:50.000000000 +0400
+++ - 2007-08-28 12:58:53.765332579 +0400
@@ -409,22 +409,7 @@ int dequeue_signal(struct task_struct *t
}
if (likely(tsk == current))
recalc_sigpending();
- if (signr && unlikely(sig_kernel_stop(signr))) {
- /*
- * Set a marker that we have dequeued a stop signal. Our
- * caller might release the siglock and then the pending
- * stop signal it is about to process is no longer in the
- * pending bitmasks, but must still be cleared by a SIGCONT
- * (and overruled by a SIGKILL). So those cases clear this
- * shared flag after we've set it. Note that this flag may
- * remain set after the signal we return is ignored or
- * handled. That doesn't matter because its only purpose
- * is to alert stop-signal processing code when another
- * processor has come along and cleared the flag.
- */
- if (!(tsk->signal->flags & SIGNAL_GROUP_EXIT))
- tsk->signal->flags |= SIGNAL_STOP_DEQUEUED;
- }
+
if ( signr &&
((info->si_code & __SI_MASK) == __SI_TIMER) &&
info->si_sys_private){
@@ -1853,6 +1838,10 @@ relock:
continue;

if (sig_kernel_stop(signr)) {
+ if (current->signal->flags & SIGNAL_GROUP_EXIT)
+ continue;
+ current->signal->flags |= SIGNAL_STOP_DEQUEUED;
+
/*
* The default action is to stop all threads in
* the thread group. The job control signals

2007-09-01 07:12:39

by Roland McGrath

[permalink] [raw]
Subject: Re: SIGNAL_STOP_DEQUEUED

> Yes. do_wait(WCONTINUED) can miss SIGNAL_STOP_CONTINUED task if the task
> dequeues another sig_kernel_stop() signal. I thought this is harmless, but
> it turns out I misunderstood the needed semantics.
>
> Thanks for your explanation.
>
> This is easy to fix, but we have to split SIGNAL_STOP_CONTINUED into 2 flags,
> one for do_wait(), another to indicate that do_signal_stop() should abort.

SIGNAL_STOP_DEQUEUED is that second flag (its lack says to abort the stop).

> However, is it all correct currently ? Consider 2 threads, T1 and T2,
> SIGTTIN is SIG_DFL, SIGTTOU has a handler.
>
> T1 does get_signal_to_deliver(), dequeus SIGTTIN, unlocks ->siglock.
>
> SIGCONT comes in, nothing to do yet, just clear SIGNAL_STOP_DEQUEUED.
>
> SIGTTOU is sent, T2 dequeues it and sets SIGNAL_STOP_DEQUEUED again.
> It has a handler, we shouldn't stop, but
>
> T1 continues, takes ->siglock, and calls do_signal_stop().

Yes, I think you are right that this could happen. It is indeed wrong that
it's possible to stop with SIGTTIN when the SIGCONT was posted second.

> What do you think about something like the patch below? It moves the setting of
> SIGNAL_STOP_DEQUEUED from dequeue_signal() to get_signal_to_deliver(), and the
> flag is set only if the sig_kernel_stop() signal doesn't have a handler.

I don't recall why this went into dequeue_signal in the first place. It's
possible there was a specific reason at the time, but I can't find one now.
Probably it just seemed natural at the time to set the flag very close the
actual dequeuing just so it's simple to be sure what the invariant is.


Thanks,
Roland