2008-10-06 14:09:18

by Jan Kiszka

[permalink] [raw]
Subject: SIGTRAP vs. sys_exit_group race

Hi,

are there any news on these ideas?

http://marc.info/?l=linux-kernel&m=121540671602971

I've been caught by a race between a ptraced thread running on a
breakpoint, thus generating a SIGTRAP and another thread in this process
issuing sys_exit_group. The discussion above, specifically Oleg's
concerns, made me think that this is a generic issue of current
mainline.

I observed this on a heavily patched 2.6.26.5 kernel which comes, among
other things, with a higher probability for latencies/reschedules
between the queuing of SIGTRAP and the actual delivery. Right into this
window, the sys_exit_group comes. It informs gdb about the termination,
sends out SIGKILL to the other threads and turns the caller into a
zombie. Now the second thread has SIGKILL + SIGTRAP pending, and it
picks SIGTRAP for delivery. At this point gdb gets confused (maybe a bug
of its own?), sends SIGSTOP to the dead thread and waits for it to enter
the traced state (which it will never do) - deadlock of gdb, only
resolvable by killing the latter.

The patch below (rebased against latest git) resolves the issue for me,
but I'm definitely not sure about all its implications and if I'm not
papering over a different issue. Could you comment on my scenario? Is it
possible with mainline as well? Will Roland's approach resolve it?

Thanks,
Jan

---
Index: b/kernel/signal.c
===================================================================
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -1528,10 +1528,11 @@ static void ptrace_stop(int exit_code, i
spin_unlock_irq(&current->sighand->siglock);
arch_ptrace_stop(exit_code, info);
spin_lock_irq(&current->sighand->siglock);
- if (sigkill_pending(current))
- return;
}

+ if (sigkill_pending(current))
+ return;
+
/*
* If there is a group stop in progress,
* we must participate in the bookkeeping.
--
Siemens AG, Corporate Technology, CT SE 2
Corporate Competence Center Embedded Linux


2008-10-16 16:56:39

by Oleg Nesterov

[permalink] [raw]
Subject: Re: SIGTRAP vs. sys_exit_group race

Roland, what do you think?

On 10/06, Jan Kiszka wrote:
>
> --- a/kernel/signal.c
> +++ b/kernel/signal.c
> @@ -1528,10 +1528,11 @@ static void ptrace_stop(int exit_code, i
> spin_unlock_irq(&current->sighand->siglock);
> arch_ptrace_stop(exit_code, info);
> spin_lock_irq(&current->sighand->siglock);
> - if (sigkill_pending(current))
> - return;
> }
>
> + if (sigkill_pending(current))
> + return;
> +

Personally, I think this change is good anyway. The tracee shouldn't
sleep in TASK_TRACED with the pending SIGKILL.

And the current code is confusing, imho. Why do we check sigkill_pending()
under arch_ptrace_stop_needed() ? Yes, it unlocks ->siglock and can sleep,
so SIGKILL can come in between. But it is quite possible that SIGKILL is
already pending when we enter ptrace_stop().


The only problem I can see this patch adds a user-visible change, even
if this change looks good to me. For example, if we send SIGKILL to
the thread group, the tracee will not send PTRACE_EVENT_EXIT.

I think we need further changes. If the thread group group was killed
by some fatal signal (but not SIGKILL) the tracee will sleep with
SIGNAL_GROUP_EXIT, this is not nice too. But imho the patch makes
sense anyway.

Oleg.

2008-12-04 00:53:34

by Roland McGrath

[permalink] [raw]
Subject: Re: SIGTRAP vs. sys_exit_group race

> Roland, what do you think?
>
> On 10/06, Jan Kiszka wrote:
> >
> > --- a/kernel/signal.c
> > +++ b/kernel/signal.c
> > @@ -1528,10 +1528,11 @@ static void ptrace_stop(int exit_code, i
> > spin_unlock_irq(&current->sighand->siglock);
> > arch_ptrace_stop(exit_code, info);
> > spin_lock_irq(&current->sighand->siglock);
> > - if (sigkill_pending(current))
> > - return;
> > }
> >
> > + if (sigkill_pending(current))
> > + return;
> > +
>
> Personally, I think this change is good anyway. The tracee shouldn't
> sleep in TASK_TRACED with the pending SIGKILL.

I think this is actually superfluous since TASK_WAKEKILL (2.6.24?).
It won't sleep in TASK_TRACED at all, because of signal_pending_state().

> I think we need further changes. If the thread group group was killed
> by some fatal signal (but not SIGKILL) the tracee will sleep with
> SIGNAL_GROUP_EXIT, this is not nice too. But imho the patch makes
> sense anyway.

When there is no (user-level) SIGKILL and no core dump synchronization, I
think it's desireable for each thread to stop in exit tracing so it can be
fully examined.

I believe the behavior Jan saw is not new, but was always possible in races
depending on when signals got sent. The SIGTRAP (or PTRACE_EVENT_*)
exit_code might be seen by gdb's wait4 call, but the task can be woken
again by SIGKILL in the next instant and have the death exit_code seen by
the next wait4 call (or not, depending on the race). Since the The size of
the window where an asynchronous SIGKILL could leave the previous SIGTRAP
(et al) exit_code to be seen by wait4 before seeing the death code might
have changed with the introduction of TASK_WAKEWILL, but it was always
there.


Thanks,
Roland

2008-12-04 18:26:19

by Oleg Nesterov

[permalink] [raw]
Subject: Re: SIGTRAP vs. sys_exit_group race

On 12/02, Roland McGrath wrote:
>
> > Roland, what do you think?
> >
> > On 10/06, Jan Kiszka wrote:
> > >
> > > --- a/kernel/signal.c
> > > +++ b/kernel/signal.c
> > > @@ -1528,10 +1528,11 @@ static void ptrace_stop(int exit_code, i
> > > spin_unlock_irq(&current->sighand->siglock);
> > > arch_ptrace_stop(exit_code, info);
> > > spin_lock_irq(&current->sighand->siglock);
> > > - if (sigkill_pending(current))
> > > - return;
> > > }
> > >
> > > + if (sigkill_pending(current))
> > > + return;
> > > +
> >
> > Personally, I think this change is good anyway. The tracee shouldn't
> > sleep in TASK_TRACED with the pending SIGKILL.
>
> I think this is actually superfluous since TASK_WAKEKILL (2.6.24?).
> It won't sleep in TASK_TRACED at all, because of signal_pending_state().

Yes. But what if the task was killed by the group-wide SIGKILL, and
already dequeued SIGKILL from ->pending ? (do_exit path).

> > I think we need further changes. If the thread group group was killed
> > by some fatal signal (but not SIGKILL) the tracee will sleep with
> > SIGNAL_GROUP_EXIT, this is not nice too. But imho the patch makes
> > sense anyway.
>
> When there is no (user-level) SIGKILL and no core dump synchronization, I
> think it's desireable for each thread to stop in exit tracing so it can be
> fully examined.

Yes. But my point was, it is not good the tracee sleeps and can't be
killed. Yes, the user can use tkill(9) to wake it up, but still.

Oleg.