Greetings,
A race in ptrace was pointed to us by a fellow Google engineer, Tavis
Ormandy. The race involves interaction between a tracer, a tracee and
an antagonist. The tracer is tracing the tracee with PTRACE_SYSCALL and
waits on the tracee. In the mean time, an antagonist blasts the tracee
with SIGCONTs.
The observed issue is that sometimes when the tracer attempts to continue the tracee
with PTRACE_SYSCALL, it gets a return value of -ESRCH, indicating that the tracee is already
running (or not being traced). It turns out that a SIGCONT wakes up the
tracee in kernel mode, and for a moment the tracee's state is TASK_RUNNING
then in ptrace_stop we hit the condition where the tracee is found to be running (and thus
not traced). If the syscall is repeated, the second time it usually
succeeds (because by that time, the tracee has been put into TASK_TRACED).
Below is a quick and dirty fix for the one instance that I did
figure out. Note that this doesn't completely close the race on
2.6.33-rc6. But on 2.6.26 it appears to be sufficient. I suspect there
are other code paths with similar issues:
Fix a race in ptrace.
Race description:
The traced process is running for a small duration
of time between when it is sent a SIGCONT and when it realizes that it needs
to be asleep in order to get traced. If during this time the tracer calls
ptrace with PTRACE_SYSCALL, it recieves an errno value of -ESRCH.
Solution:
We add a new bit to the ptrace field of task_struct. We call this PT_WAKING.
When the process is being awoken for a SIGCONT signal, we set this bit before
state changes to TASK_RUNNING. When the process is about to go to sleep, we
reset this bit after we change the state to TASK_TRACED.
Signed-off-by: Salman Qazi <[email protected]>
diff --git a/include/linux/ptrace.h b/include/linux/ptrace.h
index 56f2d63..6c6771a 100644
--- a/include/linux/ptrace.h
+++ b/include/linux/ptrace.h
@@ -67,8 +67,9 @@
#define PT_TRACE_EXEC 0x00000080
#define PT_TRACE_VFORK_DONE 0x00000100
#define PT_TRACE_EXIT 0x00000200
+#define PT_WAKING 0x00000400
-#define PT_TRACE_MASK 0x000003f4
+#define PT_TRACE_MASK 0x000007f4
/* single stepping state bits (used on ARM and PA-RISC) */
#define PT_SINGLESTEP_BIT 31
diff --git a/kernel/ptrace.c b/kernel/ptrace.c
index 23bd09c..32157f8 100644
--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -104,7 +104,8 @@ int ptrace_check_attach(struct task_struct *child, int kill)
spin_lock_irq(&child->sighand->siglock);
if (task_is_stopped(child))
child->state = TASK_TRACED;
- else if (!task_is_traced(child) && !kill)
+ else if (!task_is_traced(child) && !kill &&
+ (!(child->ptrace & PT_WAKING)))
ret = -ESRCH;
spin_unlock_irq(&child->sighand->siglock);
}
diff --git a/kernel/signal.c b/kernel/signal.c
index 934ae5e..095507e 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -697,6 +697,10 @@ static int prepare_signal(int sig, struct task_struct *p, int from_ancestor_ns)
* and wake all threads.
*/
rm_from_queue(SIG_KERNEL_STOP_MASK, &signal->shared_pending);
+ if (p->ptrace & PT_PTRACED) {
+ p->ptrace |= PT_WAKING;
+ mb();
+ }
t = p;
do {
unsigned int state;
@@ -1626,6 +1630,10 @@ static void ptrace_stop(int exit_code, int clear_code, siginfo_t *info)
/* Let the debugger run. */
__set_current_state(TASK_TRACED);
+ if (current->ptrace & PT_PTRACED) {
+ mb();
+ current->ptrace &= ~PT_WAKING;
+ }
spin_unlock_irq(¤t->sighand->siglock);
read_lock(&tasklist_lock);
if (may_ptrace_stop()) {
Salman Qazi wrote:
>
> A race in ptrace was pointed to us by a fellow Google engineer, Tavis
> Ormandy. The race involves interaction between a tracer, a tracee and
> an antagonist. The tracer is tracing the tracee with PTRACE_SYSCALL and
> waits on the tracee. In the mean time, an antagonist blasts the tracee
> with SIGCONTs.
Could you please explain how did observe this race? Do you have a
test-case, or could you explain how we can reproduce it?
Because,
> It turns out that a SIGCONT wakes up the
> tracee in kernel mode,
SIGCONT must not wake up a TASK_TRACED task.
Yes, there are some know problems with SIGCONT && ptrace, but I don't
see anthing related to the described problem.
Thanks,
Oleg.
On 02/09, Oleg Nesterov wrote:
>
> Salman Qazi wrote:
> >
> > A race in ptrace was pointed to us by a fellow Google engineer, Tavis
> > Ormandy. The race involves interaction between a tracer, a tracee and
> > an antagonist. The tracer is tracing the tracee with PTRACE_SYSCALL and
> > waits on the tracee. In the mean time, an antagonist blasts the tracee
> > with SIGCONTs.
>
> Could you please explain how did observe this race? Do you have a
> test-case, or could you explain how we can reproduce it?
>
> Because,
>
> > It turns out that a SIGCONT wakes up the
> > tracee in kernel mode,
>
> SIGCONT must not wake up a TASK_TRACED task.
In case I wasn't clear...
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -697,6 +697,10 @@ static int prepare_signal(int sig, struct task_struct *p, int from_ancestor_ns)
* and wake all threads.
*/
rm_from_queue(SIG_KERNEL_STOP_MASK, &signal->shared_pending);
+ if (p->ptrace & PT_PTRACED) {
+ p->ptrace |= PT_WAKING;
+ mb();
+ }
Please note that we are going to do wake_up_state(state), and
this state can never have __TASK_TRACED bit set.
And we can't change ->ptrace here, we can race with the tracer.
There are other problems with this patch, but the main problem
is that I can't understand what this patch tries to fix.
IOW, please provide more info ;)
Oleg.
[+tavis]
Sorry for the delayed response. I was out sick yesterday. I have
made a simpler version of tavis's test case:
#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>
#include <signal.h>
#include <sched.h>
#include <errno.h>
#include <sys/ptrace.h>
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
int child_pid;
int ant_pid;
void child(void)
{
ptrace(PTRACE_TRACEME, 0, NULL, NULL);
asm("int3");
while(1);
}
void antagonist(void)
{
while (1) {
kill(child_pid, SIGSTOP);
usleep(2);
kill(child_pid, SIGCONT);
usleep(2);
}
}
int do_fork(void (*callback)(void))
{
int pid;
pid = fork();
if (pid)
return pid;
callback();
exit(EXIT_SUCCESS);
}
int main(int argc, char **argv)
{
int status;
assert((child_pid = do_fork(child)) > 0);
assert((ant_pid = do_fork(antagonist)) > 0);
waitpid(child_pid, &status, 0);
ptrace(PTRACE_SYSCALL, child_pid, NULL, NULL);
while(1) {
if (waitpid(child_pid, &status, 0) <= 0) {
printf("Errno %d\n", errno);
exit(EXIT_FAILURE);
}
if (WIFSTOPPED(status)) {
printf("stopped: %d\n", WSTOPSIG(status));
/* This should work, but sometimes it doesn't */
if (ptrace(PTRACE_SYSCALL, child_pid,
NULL, WSTOPSIG(status)) < 0) {
/* Oddly it works the second time! */
assert (ptrace(PTRACE_SYSCALL,
child_pid, NULL, WSTOPSIG(status)) < 0);
}
}
}
}
On Wed, Feb 10, 2010 at 5:35 AM, Oleg Nesterov <[email protected]> wrote:
> On 02/09, Oleg Nesterov wrote:
>>
>> Salman Qazi wrote:
>> >
>> > A race in ptrace was pointed to us by a fellow Google engineer, Tavis
>> > Ormandy. ?The race involves interaction between a tracer, a tracee and
>> > an antagonist. ?The tracer is tracing the tracee with PTRACE_SYSCALL and
>> > waits on the tracee. ?In the mean time, an antagonist blasts the tracee
>> > with SIGCONTs.
>>
>> Could you please explain how did observe this race? Do you have a
>> test-case, or could you explain how we can reproduce it?
>>
>> Because,
>>
>> > It turns out that a SIGCONT wakes up the
>> > tracee in kernel mode,
>>
>> SIGCONT must not wake up a TASK_TRACED task.
>
> In case I wasn't clear...
>
> --- a/kernel/signal.c
> +++ b/kernel/signal.c
> @@ -697,6 +697,10 @@ static int prepare_signal(int sig, struct task_struct *p, int from_ancestor_ns)
> ? ? ? ? ? ? ? ? * and wake all threads.
> ? ? ? ? ? ? ? ? */
> ? ? ? ? ? ? ? ?rm_from_queue(SIG_KERNEL_STOP_MASK, &signal->shared_pending);
> + ? ? ? ? ? ? ? if (p->ptrace & PT_PTRACED) {
> + ? ? ? ? ? ? ? ? ? ? ? p->ptrace |= PT_WAKING;
> + ? ? ? ? ? ? ? ? ? ? ? mb();
> + ? ? ? ? ? ? ? }
>
> Please note that we are going to do wake_up_state(state), and
> this state can never have __TASK_TRACED bit set.
>
> And we can't change ->ptrace here, we can race with the tracer.
>
> There are other problems with this patch, but the main problem
> is that I can't understand what this patch tries to fix.
>
> IOW, please provide more info ;)
>
> Oleg.
>
>
On 02/10, Salman Qazi wrote:
>
> I have
> made a simpler version of tavis's test case:
Thanks, now I see what you mean.
But this all is correct, you can't expect PTRACE_SYSCALL can succeed
is the tracee is running, it must be stopped or traced.
The tracee is running because it was TASK_STOPPED and antagonist()
sends SIGCONT.
The tracee was TASK_STOPPED because the tracer passes sig = SIGSTOP
via ptrace(PTRACE_SYSCALL, WSTOPSIG(status).
Where do you see the bug?
OK, let me simplify the test-case even more:
int main(void)
{
int stat, ret;
int pid = fork();
if (!pid) {
ptrace(PTRACE_TRACEME, 0, NULL, NULL);
for (;;)
;
}
sleep(1); // wait for PTRACE_TRACEME
kill(pid, SIGSTOP);
// the child reports SIGSTOP, it is TASK_TRACED
assert(pid == wait(&stat) && WIFSTOPPED(stat));
// the tracee should stop, we pass sig = SIGSTOP
assert(ptrace(PTRACE_SYSCALL, pid, 0, WSTOPSIG(stat)) == 0);
// the child reports the group stop, it is TASK_STOPPED
assert(pid == wait(&stat) && WIFSTOPPED(stat));
// the tracee is STOPPED as requested, not TRACED,
// SIGCONT wakes it up
kill(pid, SIGCONT);
// now the tracee is _running_, and PTRACE_SYSCALL must fail
ret = ptrace(PTRACE_SYSCALL, pid, 0, WSTOPSIG(stat));
printf("should fail: ret=%d %m\n", ret);
return 0;
}
PTRACE_SYSCALL fails, and this is absolutely correct.
Now, let's look at your test-case
> int main(int argc, char **argv)
> {
> int status;
> assert((child_pid = do_fork(child)) > 0);
> assert((ant_pid = do_fork(antagonist)) > 0);
> waitpid(child_pid, &status, 0);
> ptrace(PTRACE_SYSCALL, child_pid, NULL, NULL);
> while(1) {
> if (waitpid(child_pid, &status, 0) <= 0) {
> printf("Errno %d\n", errno);
> exit(EXIT_FAILURE);
> }
> if (WIFSTOPPED(status)) {
WSTOPSIG() should be either SIGCONT or SIGSTOP
> printf("stopped: %d\n", WSTOPSIG(status));
>
> /* This should work, but sometimes it doesn't */
> if (ptrace(PTRACE_SYSCALL, child_pid,
> NULL, WSTOPSIG(status)) < 0) {
This should not work if the tracee reported the group stop (not the
fact it dequeued SIGSTOP) and antagonist() sends SIGCONT in between.
> /* Oddly it works the second time! */
> assert (ptrace(PTRACE_SYSCALL,
> child_pid, NULL, WSTOPSIG(status)) < 0);
Of couse, it _can_ work the second time, antagonist() sends a signal
(SIGCONT or SIGSTOP), the tracee dequeues the signal, and stops to
report this signal.
See?
Oleg.
On Thu, Feb 11, 2010 at 4:56 AM, Oleg Nesterov <[email protected]> wrote:
> On 02/10, Salman Qazi wrote:
>>
>> I have
>> made a simpler version of tavis's test case:
>
> Thanks, now I see what you mean.
>
> But this all is correct, you can't expect PTRACE_SYSCALL can succeed
> is the tracee is running, it must be stopped or traced.
>
> The tracee is running because it was TASK_STOPPED and antagonist()
> sends SIGCONT.
>
> The tracee was TASK_STOPPED because the tracer passes sig = SIGSTOP
> via ptrace(PTRACE_SYSCALL, WSTOPSIG(status).
>
> Where do you see the bug?
Shouldn't ptrace(PTRACE_SYSCALL, WSTOPSIG(status)...), cause any
future signals to the child be intercepted by the parent?
>
> OK, let me simplify the test-case even more:
>
> ? ? ? ?int main(void)
> ? ? ? ?{
> ? ? ? ? ? ? ? ?int stat, ret;
> ? ? ? ? ? ? ? ?int pid = fork();
>
> ? ? ? ? ? ? ? ?if (!pid) {
> ? ? ? ? ? ? ? ? ? ? ? ?ptrace(PTRACE_TRACEME, 0, NULL, NULL);
> ? ? ? ? ? ? ? ? ? ? ? ?for (;;)
> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?;
> ? ? ? ? ? ? ? ?}
>
> ? ? ? ? ? ? ? ?sleep(1); ? ? ? // wait for PTRACE_TRACEME
> ? ? ? ? ? ? ? ?kill(pid, SIGSTOP);
>
> ? ? ? ? ? ? ? ?// the child reports SIGSTOP, it is TASK_TRACED
> ? ? ? ? ? ? ? ?assert(pid == wait(&stat) && WIFSTOPPED(stat));
>
> ? ? ? ? ? ? ? ?// the tracee should stop, we pass sig = SIGSTOP
> ? ? ? ? ? ? ? ?assert(ptrace(PTRACE_SYSCALL, pid, 0, WSTOPSIG(stat)) == 0);
>
> ? ? ? ? ? ? ? ?// the child reports the group stop, it is TASK_STOPPED
> ? ? ? ? ? ? ? ?assert(pid == wait(&stat) && WIFSTOPPED(stat));
>
> ? ? ? ? ? ? ? ?// the tracee is STOPPED as requested, not TRACED,
> ? ? ? ? ? ? ? ?// SIGCONT wakes it up
> ? ? ? ? ? ? ? ?kill(pid, SIGCONT);
According to the man page, any signals to the
process are supposed to be intercepted by the parent and that is how
one is supposed to be able to control which signals make it to the
child. I am not sure if it makes any difference if the signal
originates at the parent. But in our test case, it doesn't. So, why
doesn't the parent get a notification first?
>
> ? ? ? ? ? ? ? ?// now the tracee is _running_, and PTRACE_SYSCALL must fail
> ? ? ? ? ? ? ? ?ret = ptrace(PTRACE_SYSCALL, pid, 0, WSTOPSIG(stat));
> ? ? ? ? ? ? ? ?printf("should fail: ret=%d %m\n", ret);
>
> ? ? ? ? ? ? ? ?return 0;
> ? ? ? ?}
>
> PTRACE_SYSCALL fails, and this is absolutely correct.
>
> Now, let's look at your test-case
>
>> int main(int argc, char **argv)
>> {
>> ? ? ? ? int status;
>> ? ? ? ? assert((child_pid = do_fork(child)) > 0);
>> ? ? ? ? assert((ant_pid = do_fork(antagonist)) > 0);
>> ? ? ? ? waitpid(child_pid, &status, 0);
>> ? ? ? ? ptrace(PTRACE_SYSCALL, child_pid, NULL, NULL);
>> ? ? ? ? while(1) {
>> ? ? ? ? ? ? ? ? if (waitpid(child_pid, &status, 0) <= 0) {
>> ? ? ? ? ? ? ? ? ? ? ? ? printf("Errno %d\n", errno);
>> ? ? ? ? ? ? ? ? ? ? ? ? exit(EXIT_FAILURE);
>> ? ? ? ? ? ? ? ? }
>> ? ? ? ? ? ? ? ? if (WIFSTOPPED(status)) {
>
> WSTOPSIG() should be either SIGCONT or SIGSTOP
>
>> ? ? ? ? ? ? ? ? ? ? ? ? printf("stopped: %d\n", WSTOPSIG(status));
>>
>> ? ? ? ? ? ? ? ? ? ? ? ? /* This should work, but sometimes it doesn't */
>> ? ? ? ? ? ? ? ? ? ? ? ? if (ptrace(PTRACE_SYSCALL, child_pid,
>> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? NULL, WSTOPSIG(status)) < 0) {
>
> This should not work if the tracee reported the group stop (not the
> fact it dequeued SIGSTOP) and antagonist() sends SIGCONT in between.
>
>> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? /* Oddly it works the second time! */
>> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? assert (ptrace(PTRACE_SYSCALL,
>> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? child_pid, NULL, WSTOPSIG(status)) < 0);
>
> Of couse, it _can_ work the second time, antagonist() sends a signal
> (SIGCONT or SIGSTOP), the tracee dequeues the signal, and stops to
> report this signal.
>
> See?
>
> Oleg.
>
>
On 02/11, Salman Qazi wrote:
>
> On Thu, Feb 11, 2010 at 4:56 AM, Oleg Nesterov <[email protected]> wrote:
> >
> > But this all is correct, you can't expect PTRACE_SYSCALL can succeed
> > is the tracee is running, it must be stopped or traced.
> >
> > The tracee is running because it was TASK_STOPPED and antagonist()
> > sends SIGCONT.
> >
> > The tracee was TASK_STOPPED because the tracer passes sig = SIGSTOP
> > via ptrace(PTRACE_SYSCALL, WSTOPSIG(status).
> >
> > Where do you see the bug?
>
> Shouldn't ptrace(PTRACE_SYSCALL, WSTOPSIG(status)...), cause any
> future signals to the child be intercepted by the parent?
Not sure I understand your question. Of course the tracee will report any
future signals signals, after it has a chance to dequeue a signal.
But if you mean that after, say, ptrace(PTRACE_SYSCALL, SIGTERM) the
tracee should report _this_ SIGTERM to the tracer - then no. Well,
actually "this depends", but if PTRACE_SYSCALL (or any other req)
is called after the tracee reported the signal - no. The signal was
already reported.
> > ? ? ? ?int main(void)
> > ? ? ? ?{
> > ? ? ? ? ? ? ? ?int stat, ret;
> > ? ? ? ? ? ? ? ?int pid = fork();
> >
> > ? ? ? ? ? ? ? ?if (!pid) {
> > ? ? ? ? ? ? ? ? ? ? ? ?ptrace(PTRACE_TRACEME, 0, NULL, NULL);
> > ? ? ? ? ? ? ? ? ? ? ? ?for (;;)
> > ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?;
> > ? ? ? ? ? ? ? ?}
> >
> > ? ? ? ? ? ? ? ?sleep(1); ? ? ? // wait for PTRACE_TRACEME
> > ? ? ? ? ? ? ? ?kill(pid, SIGSTOP);
> >
> > ? ? ? ? ? ? ? ?// the child reports SIGSTOP, it is TASK_TRACED
> > ? ? ? ? ? ? ? ?assert(pid == wait(&stat) && WIFSTOPPED(stat));
> >
> > ? ? ? ? ? ? ? ?// the tracee should stop, we pass sig = SIGSTOP
> > ? ? ? ? ? ? ? ?assert(ptrace(PTRACE_SYSCALL, pid, 0, WSTOPSIG(stat)) == 0);
> >
> > ? ? ? ? ? ? ? ?// the child reports the group stop, it is TASK_STOPPED
> > ? ? ? ? ? ? ? ?assert(pid == wait(&stat) && WIFSTOPPED(stat));
> >
> > ? ? ? ? ? ? ? ?// the tracee is STOPPED as requested, not TRACED,
> > ? ? ? ? ? ? ? ?// SIGCONT wakes it up
> > ? ? ? ? ? ? ? ?kill(pid, SIGCONT);
>
> According to the man page, any signals to the
> process are supposed to be intercepted by the parent and that is how
> one is supposed to be able to control which signals make it to the
> child. I am not sure if it makes any difference if the signal
> originates at the parent. But in our test case, it doesn't. So, why
> doesn't the parent get a notification first?
It does. You can insert another wait() (or just sleep(1)) between
kill(SIGCONT) and PTRACE_SYSCALL below, the tracee will stop to report
SIGCONT and the tracer will be notified. In this case the following
PTRACE_SYSCALL should succeed.
Perhaps I should have mentioned that the code above is racy. It is,
I only did it to simplify the explanations.
Oleg.
On Thu, Feb 11, 2010 at 8:50 AM, Oleg Nesterov <[email protected]> wrote:
> On 02/11, Salman Qazi wrote:
>>
>> On Thu, Feb 11, 2010 at 4:56 AM, Oleg Nesterov <[email protected]> wrote:
>> >
>> > But this all is correct, you can't expect PTRACE_SYSCALL can succeed
>> > is the tracee is running, it must be stopped or traced.
>> >
>> > The tracee is running because it was TASK_STOPPED and antagonist()
>> > sends SIGCONT.
>> >
>> > The tracee was TASK_STOPPED because the tracer passes sig = SIGSTOP
>> > via ptrace(PTRACE_SYSCALL, WSTOPSIG(status).
>> >
>> > Where do you see the bug?
>>
>> Shouldn't ptrace(PTRACE_SYSCALL, WSTOPSIG(status)...), cause any
>> future signals to the child be intercepted by the parent?
>
> Not sure I understand your question. Of course the tracee will report any
> future signals signals, after it has a chance to dequeue a signal.
>
> But if you mean that after, say, ptrace(PTRACE_SYSCALL, SIGTERM) the
> tracee should report _this_ SIGTERM to the tracer - then no. Well,
> actually "this depends", but if PTRACE_SYSCALL (or any other req)
> is called after the tracee reported the signal - no. The signal was
> already reported.
>
>> > ? ? ? ?int main(void)
>> > ? ? ? ?{
>> > ? ? ? ? ? ? ? ?int stat, ret;
>> > ? ? ? ? ? ? ? ?int pid = fork();
>> >
>> > ? ? ? ? ? ? ? ?if (!pid) {
>> > ? ? ? ? ? ? ? ? ? ? ? ?ptrace(PTRACE_TRACEME, 0, NULL, NULL);
>> > ? ? ? ? ? ? ? ? ? ? ? ?for (;;)
>> > ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?;
>> > ? ? ? ? ? ? ? ?}
>> >
>> > ? ? ? ? ? ? ? ?sleep(1); ? ? ? // wait for PTRACE_TRACEME
>> > ? ? ? ? ? ? ? ?kill(pid, SIGSTOP);
>> >
>> > ? ? ? ? ? ? ? ?// the child reports SIGSTOP, it is TASK_TRACED
>> > ? ? ? ? ? ? ? ?assert(pid == wait(&stat) && WIFSTOPPED(stat));
>> >
>> > ? ? ? ? ? ? ? ?// the tracee should stop, we pass sig = SIGSTOP
>> > ? ? ? ? ? ? ? ?assert(ptrace(PTRACE_SYSCALL, pid, 0, WSTOPSIG(stat)) == 0);
>> >
>> > ? ? ? ? ? ? ? ?// the child reports the group stop, it is TASK_STOPPED
>> > ? ? ? ? ? ? ? ?assert(pid == wait(&stat) && WIFSTOPPED(stat));
>> >
>> > ? ? ? ? ? ? ? ?// the tracee is STOPPED as requested, not TRACED,
>> > ? ? ? ? ? ? ? ?// SIGCONT wakes it up
>> > ? ? ? ? ? ? ? ?kill(pid, SIGCONT);
I am still missing something. There's probably a gap in my
understanding, so let's try to clarify it. The last "kill" call,
sends a SIGCONT. But, shouldn't this SIGCONT be intercepted by the
tracer before the tracee sees it?
>>
>> ? ? ? ? ? ? ? ? ? ?According to the man page, any signals to the
>> process are supposed to be intercepted by the parent and that is how
>> one is supposed to be able to control which signals make it to the
>> child. ?I am not sure if it makes any difference if the signal
>> originates at the parent. ?But in our test case, it doesn't. ? So, why
>> doesn't the parent get a notification first?
>
> It does. You can insert another wait() (or just sleep(1)) between
> kill(SIGCONT) and PTRACE_SYSCALL below, the tracee will stop to report
> SIGCONT and the tracer will be notified. In this case the following
> PTRACE_SYSCALL should succeed.
>
> Perhaps I should have mentioned that the code above is racy. It is,
> I only did it to simplify the explanations.
>
> Oleg.
>
>
On 02/11, Salman Qazi wrote:
>
> >> > ? ? ? ? ? ? ? ?// the tracee is STOPPED as requested, not TRACED,
> >> > ? ? ? ? ? ? ? ?// SIGCONT wakes it up
> >> > ? ? ? ? ? ? ? ?kill(pid, SIGCONT);
>
> I am still missing something. There's probably a gap in my
> understanding, so let's try to clarify it. The last "kill" call,
> sends a SIGCONT. But, shouldn't this SIGCONT be intercepted by the
> tracer before the tracee sees it?
No. The tracee resumes (again: because it was STOPPED, not TRACED),
dequeues SIGCONT, reports the signal and stops in TASK_TRACED,
see ptrace_signal(). Meanwhile, until it calls ptrace_stop(), it is
TASK_RUNNING and ptrace() fails.
Oleg.
I understand what it does. But, why is it the right thing to do?
>From the user's perspective, why should the task become untraced if we
use ptrace to deliver the signal? Doesn't this make it impossible to
intercept and control which signals are sent to a traced task?
On Thu, Feb 11, 2010 at 10:55 AM, Oleg Nesterov <[email protected]> wrote:
> On 02/11, Salman Qazi wrote:
>>
>> >> > ? ? ? ? ? ? ? ?// the tracee is STOPPED as requested, not TRACED,
>> >> > ? ? ? ? ? ? ? ?// SIGCONT wakes it up
>> >> > ? ? ? ? ? ? ? ?kill(pid, SIGCONT);
>>
>> I am still missing something. ?There's probably a gap in my
>> understanding, so let's try to clarify it. ?The last "kill" call,
>> sends a SIGCONT. ?But, shouldn't this SIGCONT be intercepted by the
>> tracer before the tracee sees it?
>
> No. The tracee resumes (again: because it was STOPPED, not TRACED),
> dequeues SIGCONT, reports the signal and stops in TASK_TRACED,
> see ptrace_signal(). Meanwhile, until it calls ptrace_stop(), it is
> TASK_RUNNING and ptrace() fails.
>
> Oleg.
>
>
On 02/11, Salman Qazi wrote:
>
> I understand what it does. But, why is it the right thing to do?
Oh. Let's not discuss the current API. Nobody thinks it is great,
but we can't change it.
But,
> From the user's perspective, why should the task become untraced if we
> use ptrace to deliver the signal?
The task does not become untraced. The tracer (in your test-case)
explicitly asks the tracee to respect SIGSTOP and stop.
> Doesn't this make it impossible to
> intercept and control which signals are sent to a traced task?
Why? The tracee reports all signals. If the tracer does
ptrace(PTRACE_WHATEVER, SIGXXX) surely it knows SIGXXX is sent to
tracee.
In any case. This is how ptrace currently works, there is no race
and the patch is not needed (in fact it is very wrong, but this
soesn't matter).
Do you agree?
Oleg.
On Thu, Feb 11, 2010 at 12:10 PM, Oleg Nesterov <[email protected]> wrote:
> On 02/11, Salman Qazi wrote:
>>
>> I understand what it does. ?But, why is it the right thing to do?
>
> Oh. Let's not discuss the current API. Nobody thinks it is great,
> but we can't change it.
>
> But,
>
>> From the user's perspective, why should the task become untraced if we
>> use ptrace to deliver the signal?
>
> The task does not become untraced. The tracer (in your test-case)
> explicitly asks the tracee to respect SIGSTOP and stop.
>
>> Doesn't this make it impossible to
>> intercept and control which signals are sent to a traced task?
>
> Why? The tracee reports all signals. If the tracer does
> ptrace(PTRACE_WHATEVER, SIGXXX) surely it knows SIGXXX is sent to
> tracee.
>
>
The ptrace syscall fails, as the child is running and so we are unable
to restart the child. I suppose it is not accurate to say "impossible
to intercept" if it eventually works. But, it's an unpleasant
behaviour. How do you distinguish between this race and the child
suddenly becoming untraced or dying?
What I don't agree with is that when we send SIGCONT later with
"kill", we wake up the child at all. It may make sense to someone who
has access to the kernel source code, but from a user's point of view
this is a surprise. The signal is intercepted and should not have an
effect on the child.
>
>
> In any case. This is how ptrace currently works, there is no race
> and the patch is not needed (in fact it is very wrong, but this
> soesn't matter).
>
> Do you agree?
I agree that the patch is wrong because of the reasons you mentioned
earlier. But I think there is an issue here. It's hard to say what
it is supposed to do, but I can certainly see it being more useful
this behaviour wasn't there. Thanks for your time and attention on
this matter.
>
> Oleg.
>
>
> What I don't agree with is that when we send SIGCONT later with
> "kill", we wake up the child at all. It may make sense to someone who
> has access to the kernel source code, but from a user's point of view
> this is a surprise. The signal is intercepted and should not have an
> effect on the child.
This is the behavior of SIGCONT and doesn't really have anything to do with
ptrace. Once you have let the SIGSTOP through, the process is in job
control stop just like if you'd sent a SIGSTOP without using ptrace at all.
The distinction that is confusing you is that *generating* SIGCONT is what
resumes the process, not *delivering* it. Another example is that if your
process has SIGCONT blocked or ignored, SIGCONT still wakes it up. Another
example is that SIGCONT wakes up all the threads in a process, before one
of those threads delivers the SIGCONT (i.e. runs a handler).
Thanks,
Roland
On 02/11, Salman Qazi wrote:
>
> On Thu, Feb 11, 2010 at 12:10 PM, Oleg Nesterov <[email protected]> wrote:
> >>
> > Why? The tracee reports all signals. If the tracer does
> > ptrace(PTRACE_WHATEVER, SIGXXX) surely it knows SIGXXX is sent to
> > tracee.
>
> The ptrace syscall fails, as the child is running and so we are unable
> to restart the child. I suppose it is not accurate to say "impossible
> to intercept" if it eventually works. But, it's an unpleasant
> behaviour. How do you distinguish between this race and the child
> suddenly becoming untraced or dying?
The child can't become untraced unless the tracer detaches. If the
tracee dies the tracer can notice this via wait(). And please note
again, this particular case is not possible when the tracee is
TASK_TRACED. The tracer explicitly instructed the tracee to stop in
TASK_STOPPED, it should take care of SIGCONT case.
But don't get me wrong, see below,
> > In any case. This is how ptrace currently works, there is no race
> > and the patch is not needed (in fact it is very wrong, but this
> > soesn't matter).
> >
> > Do you agree?
>
> I agree that the patch is wrong because of the reasons you mentioned
> earlier. But I think there is an issue here. It's hard to say what
> it is supposed to do, but I can certainly see it being more useful
> this behaviour wasn't there.
Ha. let me repeat, nobody thinks the current ptrace API is nice.
OK. Thanks Salman for your report and discussion.
Oleg.
On Thu, Feb 11, 2010 at 12:55 PM, Roland McGrath <[email protected]> wrote:
>> What I don't agree with is that when we send SIGCONT later with
>> "kill", we wake up the child at all. ?It may make sense to someone who
>> has access to the kernel source code, but from a user's point of view
>> this is a surprise. ?The signal is intercepted and should not have an
>> effect on the child.
>
> This is the behavior of SIGCONT and doesn't really have anything to do with
> ptrace. ?Once you have let the SIGSTOP through, the process is in job
> control stop just like if you'd sent a SIGSTOP without using ptrace at all.
>
> The distinction that is confusing you is that *generating* SIGCONT is what
> resumes the process, not *delivering* it. ?Another example is that if your
> process has SIGCONT blocked or ignored, SIGCONT still wakes it up. ?Another
> example is that SIGCONT wakes up all the threads in a process, before one
> of those threads delivers the SIGCONT (i.e. runs a handler).
Thanks for the clarification. This is exactly the bit of information
I was missing.
>
>
> Thanks,
> Roland
>