2008-10-27 08:55:42

by Mike Frysinger

Subject: inconsistent behavior with ptrace(TRACEME) and fork/exec

i'm hoping my understanding of ptrace is correct and that the attached test
case (reduced from LTP) isn't just completely broken ... my understanding is
that if a parent forks and the child does a ptrace(TRACEME) right before
doing an exec(), the kernel should always halt it and wait indefinitely for
the parent to start ptracing it.

unfortunately, this behavior seems to be unreliable. most of the time it
works, but sometimes the kernel does not halt the child and it gladly does
the exec() that it set out to do. i don't believe this is a race in the
user space parent, as forcing the parent to delay with a few sleep() calls
shows the same behavior.

if all goes well, we should only ever see "SUCCESS!" from the test case:
$ gcc -Wall ptrace-vfork-traceme.c -o ptrace-vfork-traceme
$ while ./ptrace-vfork-traceme ; do :; done
SUCCESS! :D
SUCCESS! :D
SUCCESS! :D
SUCCESS! :D
SUCCESS! :D
SUCCESS! :D
SUCCESS! :D
SUCCESS! :D
SUCCESS! :D
failure, child exited with 17: Child exited
wait() = 12275
status = 1407
WIFEXITED = 0
WEXITSTATUS = 5
WIFSIGNALED = 0
WTERMSIG = 127 (Unknown signal 127)

i'm testing 2.6.26/2.6.27 at the moment. both x86_64 and ppc64 seem to behave
the same. while gcc-4.3.2 / glibc-2.8 is in use in both places, i don't think
that matters.

also, while the attached test uses vfork(), same behavior can be observed with
fork(). vfork() is used because i like my test cases to work on both MMU and
no-MMU systems.
-mike


Attachments:
ptrace-vfork-traceme.c (1.53 kB)

2008-10-27 15:04:55

by Jan Kratochvil

Subject: Re: inconsistent behavior with ptrace(TRACEME) and fork/exec

--- ptrace-vfork-traceme.c-orig 2008-10-27 15:41:51.000000000 +0100
+++ ptrace-vfork-traceme.c 2008-10-27 15:47:05.000000000 +0100
@@ -18,14 +18,23 @@ do { \
 static void child_exit(int sig)
 {
 	int status;
+#if 0 /* Correct behavior. */
 	printf("failure, child exited with %i: %s\n", sig, strsignal(sig));
+#endif
 	printf("wait() = %i\n", wait(&status));
-	printf("status = %i\n", status);
+	printf("status = 0x%x\n", status);
 	printf("\tWIFEXITED = %i\n", WIFEXITED(status));
 	printf("\tWEXITSTATUS = %i\n", WEXITSTATUS(status));
 	printf("\tWIFSIGNALED = %i\n", WIFSIGNALED(status));
 	printf("\tWTERMSIG = %i (%s)\n", WTERMSIG(status), strsignal(WTERMSIG(status)));
+	/* WIFSTOPPED happens. */
+	printf("\tWIFSTOPPED = %i\n", WIFSTOPPED(status));
+	/* SIGTRAP happens. */
+	printf("\tWSTOPSIG = %i (%s)\n", WSTOPSIG(status), strsignal(WSTOPSIG(status)));
+#if 0 /* We can continue. Just calling printf() from a signal handler is not
+	 correct. */
 	exit(1);
+#endif
 }

 int main(int argc, char *argv[])
@@ -35,9 +44,15 @@ int main(int argc, char *argv[])

 	/* child process ... shouldnt be executed, but just in case ... */
 	if (argc > 1 && !strcmp(argv[1], "child"))
+#if 0 /* Parent did not kill us, after its child_exit() messages we should get
+	 here. */
 		fail("kernel should have halted me...");
+#else
+		{ puts ("child exiting"); exit (0); }
+#endif

-	pid = vfork();
+	/* vfork() child must not call ptrace(). */
+	pid = fork();
 	if (pid == -1)
 		fail("vfork() didnt work: %m");
 	else if (!pid) {
@@ -54,6 +69,10 @@ int main(int argc, char *argv[])
 	/* do the parent stuff here */
 	signal(SIGCHLD, child_exit);

+	/* We cannot PTRACE_PEEKUSER here as the child still may not have
+	   called PTRACE_TRACEME. */
+	pause ();
+
 	errno = 0;
 	pret = ptrace(PTRACE_PEEKUSER, pid, NULL, NULL);
 	if (pret && errno)


Attachments:
ptrace-vfork-traceme.c (2.15 kB)
ptrace-vfork-traceme.c.patch (1.81 kB)

2008-10-27 17:38:26

by Mike Frysinger

Subject: Re: inconsistent behavior with ptrace(TRACEME) and fork/exec

On Mon, Oct 27, 2008 at 10:56, Jan Kratochvil wrote:
> On Wed, 19 Jul 2006 21:18:29 +0200, Mike Frysinger wrote:
>> my understanding is that if a parent forks and the child does
>> a ptrace(TRACEME) right before doing an exec(), the kernel should always
>> halt it and wait indefinitely for the parent to start ptracing it.
>
> Yes, just the parent must process the event (signal). In your testcase the
> parent finished before the signal could be delivered. If the tracer exits the
> tracee's tracing is finished and it continues freely.

no signal should have been generated. the child should have gone
straight to the exec and waited for the parent to process it.

>> unfortunately, this behavior seems to be unreliable.
>
> Fixed the races in your code and I do not see there any problem, do you?
> The ptrace problems/testsuite is being maintained at:
> http://sourceware.org/systemtap/wiki/utrace/tests

there is no race condition ... it's using vfork here remember ? it is
impossible for the parent to have executed anything after the vfork()
before the child made it into the exec() and gone to sleep
-mike

2008-10-28 14:58:57

by Jan Kratochvil

Subject: Re: inconsistent behavior with ptrace(TRACEME) and fork/exec

#define _GNU_SOURCE
#include <errno.h>
#include <signal.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/ptrace.h>
#include <sys/wait.h>
#include <assert.h>

#define fail(msg, args...) \
do { \
	fprintf(stderr, "FAIL:%i: " msg "\n", __LINE__, ## args); \
	exit(1); \
} while (0)

static volatile int caught = 0;

static void child_exit(int sig)
{
	int status;
#if 0 /* Correct behavior. */
	printf("failure, child exited with %i: %s\n", sig, strsignal(sig));
#else
	assert(sig == SIGCHLD);
#endif
	printf("wait() = %i\n", wait(&status));
	printf("status = 0x%x\n", status);
	printf("\tWIFEXITED = %i\n", WIFEXITED(status));
	printf("\tWEXITSTATUS = %i\n", WEXITSTATUS(status));
	printf("\tWIFSIGNALED = %i\n", WIFSIGNALED(status));
	printf("\tWTERMSIG = %i (%s)\n", WTERMSIG(status), strsignal(WTERMSIG(status)));
	/* WIFSTOPPED happens. */
	printf("\tWIFSTOPPED = %i\n", WIFSTOPPED(status));
	/* SIGTRAP happens. */
	printf("\tWSTOPSIG = %i (%s)\n", WSTOPSIG(status), strsignal(WSTOPSIG(status)));
#if 0 /* We can continue. Just calling printf() from a signal handler is not
	 correct. */
	exit(1);
#else
	caught = 1;
#endif
}

int main(int argc, char *argv[])
{
	long pret;
	pid_t pid;

	/* child process ... shouldnt be executed, but just in case ... */
	if (argc > 1 && !strcmp(argv[1], "child"))
#if 0 /* Parent did not kill us, after its child_exit() messages we should get
	 here. */
		fail("kernel should have halted me...");
#else
		{ puts ("child exiting"); exit (0); }
#endif

	/* Setup the handler already here as after vfork() there would be
	   a race setting it up before we get signalled. */
	signal(SIGCHLD, child_exit);

	/* vfork() child must not call ptrace(). */
	pid = vfork();
	if (pid == -1)
		fail("vfork() didnt work: %m");
	else if (!pid) {
		/* do the child stuff here */
		errno = 0;
		pret = ptrace(PTRACE_TRACEME, 0, NULL, NULL);
		if (pret && errno)
			fail("ptrace(PTRACE_TRACEME) = %li: %m", pret);

		int eret = execlp(argv[0], argv[0], "child", NULL);
		fail("execlp() = %i", eret);
	}
	/* do the parent stuff here */

	/* We cannot pause() here as the signal already could occur, we would
	   have to SIG_BLOCK SIGCHLD and sigsuspend() here. */
	while (!caught);

	errno = 0;
	pret = ptrace(PTRACE_PEEKUSER, pid, NULL, NULL);
	if (pret && errno)
		fail("ptrace(PTRACE_PEEKUSER, %i) = %li: %m", pid, pret);

	puts("SUCCESS! :D");

	return 0;
}


Attachments:
ptrace-vfork-traceme.c (2.41 kB)

2008-10-29 10:27:14

by Mike Frysinger

Subject: Re: inconsistent behavior with ptrace(TRACEME) and fork/exec

On Tue, Oct 28, 2008 at 10:58, Jan Kratochvil wrote:
> On Mon, 27 Oct 2008 18:38:16 +0100, Mike Frysinger wrote:
>> On Mon, Oct 27, 2008 at 10:56, Jan Kratochvil wrote:
>> > On Wed, 19 Jul 2006 21:18:29 +0200, Mike Frysinger wrote:
>> >> my understanding is that if a parent forks and the child does
>> >> a ptrace(TRACEME) right before doing an exec(), the kernel should always
>> >> halt it and wait indefinitely for the parent to start ptracing it.
>> >
>> > Yes, just the parent must process the event (signal). In your testcase the
>> > parent finished before the signal could be delivered. If the tracer exits the
>> > tracee's tracing is finished and it continues freely.
>>
>> no signal should have been generated. the child should have gone
>> straight to the exec and waited for the parent to process it.
>
> Every ptrace event generates SIGCHLD. vfork() frees the parent at the moment
> the child calls exec() or _exit(). Here the child calls exec() which both
> frees the parent to continue the execution AND delivers SIGCHLD with the
> status WIFSTOPPED WSTOPSIG(SIGTRAP) to it. I do not see a problem here.

sorry, the only info i've gathered about the ptrace interface is from the
man page, strace, and some of the ptrace code in the kernel ... not
that strace or the kernel is really documented at all. nowhere did i see
mention of the child generating a signal due to ptrace. but in this
case, it sounds like the initial SIGCHLD with stop info is implied
rather than explicitly mentioned. every time the child is halted at a
trace point (like syscall entry/exit or a signal), the parent is
notified by a SIGCHLD with the stop bit set. and then it's up to the
parent to interrogate the child's signal info via something like
PTRACE_GETSIGINFO.

>> >> unfortunately, this behavior seems to be unreliable.
>> >
>> > Fixed the races in your code and I do not see there any problem, do you?
>> > The ptrace problems/testsuite is being maintained at:
>> > http://sourceware.org/systemtap/wiki/utrace/tests
>>
>> there is no race condition ... it's using vfork here remember ?
>
> Yes but you said that you see the same problem even with fork():

true, so in that case you are right that there is a race condition.
but let's ignore that to keep things simple. in the vfork() case, it
is impossible for the parent to execute before the child executes
PTRACE_TRACEME. if we also ignore the signal() aspect by setting
SIGCHLD to SIG_IGN, then how can the child not get stopped properly ?

> You must not rely on the vfork() specifics as vfork() can be just fork():

i'm testing the Linux interface. how anyone else handles vfork()
doesn't matter here. the assumption is that the expected behavior is
respected: the parent is halted until the child calls _exit() or an
exec() function. this appears to be the case on every Linux port i
know of, and can be checked on my system by inserting a sleep() right
after the vfork() for the child. the parent never runs until the
child hits exec().

> I hope there is no problem with the kernel at this moment, the attached
> testcase works even with vfork() reliably.

your code goes through a lot of trouble to avoid issues. i'm pushing
the acceptable limits on purpose. then again, the acceptable limits
aren't terribly well documented. you're saying that the child exec()
will first release the parent, and then start creating the child
image/etc... before the child is placed into a stopped state ? and
until that stopped state is reached, any attempt to ptrace() the child
is not allowed ? in other words, ptrace() only works on a stopped
child, not an actively running one.

that is why the parent can sometimes get:
FAIL:62: ptrace(PTRACE_PEEKUSER, 18762) = -1: No such process
FAIL:38: kernel should have halted me...
even though (1) the child has clearly enabled tracing on itself and
(2) the pid clearly exists ?

which means if a child does something like mask all signals and then execute:
while (1) ;
a process tracing it will not be able to ptrace() it at all anymore ?

thanks for that utrace URL ... some good info there. i'm guessing
that there is no real ptrace() documentation (like a spec) and it's
largely an orphaned interface nowadays.
-mike