Hello,
It is unclear from the various documentation in the kernel and glibc what
the proper behaviour should be for the case when a child process
catches a SIGNAL (say for instance, SIGTERM), and then calls exit()
from within its caught SIGNAL handler.
Since the exit() will cause a SIGCHLD to the parent, and the parent
(let's say) has a SIGCHLD sigaction (SA_SIGINFO sa_flags set), should
the parent's WIFSIGNALED(siginfo->si_status) be true?
To recap, the WIFSIGNALED section of the waitpid() manpage says:
    WIFSIGNALED(status)
        returns true if the child process was terminated by a signal.
So the dilemma: the child caught the signal, so it wasn't terminated by
a signal, but rather its signal handler (let's say) called exit.
Furthermore:
    WTERMSIG(status)
        returns the number of the signal that caused the child process
        to terminate. This macro should only be employed if WIFSIGNALED
        returned true.
Observed behaviour with 2.6.20.6 is that WIFSIGNALED(status)
returns true (possibly incorrect), and furthermore, WTERMSIG(status)
returns the exit(VALUE) VALUE from the child's exit() call, and not the
SIGNAL (let's say SIGTERM that the child caught). This may be correct,
since the siginfo_t * si_status member is:
int si_status; /* Exit value or signal */
but there's no clarity on which: exit value or signal. Since the child
exited, I'm likely to assume exit status, which is current observed
behaviour, but then, WIFSIGNALED(status) should be FALSE, which it's not
(observed with 2.6.20.6).
So could someone clarify the kernel's intent?
I can provide a short C program to illustrate above behaviour, if
needed. It also could be that I'm just misinterpreting the intent,
which is why I'm not calling this a bug, despite a possible
inconsistency in behaviour.
Thanks,
John
John,
> It is unclear from the various documentation in the kernel and glibc what
> the proper behaviour should be for the case when a child process
> catches a SIGNAL (say for instance, SIGTERM), and then calls exit()
> from within its caught SIGNAL handler.
>
> Since the exit() will cause a SIGCHLD to the parent, and the parent
> (let's say) has a SIGCHLD sigaction (SA_SIGINFO sa_flags set), should
> the parent's WIFSIGNALED(siginfo->si_status) be true?
>
> To recap, the WIFSIGNALED section of the waitpid() manpage says:
>
> WIFSIGNALED(status)
> returns true if the child process was terminated by a signal.
>
> So the dilemma: the child caught the signal, so it wasn't terminated by
> a signal, but rather its signal handler (let's say) called exit.
>
> Furthermore:
>
> WTERMSIG(status)
> returns the number of the signal that caused the child process
> to terminate. This macro should only be employed if WIFSIGNALED
> returned true.
>
> Observed behaviour with 2.6.20.6 is that WIFSIGNALED(status)
> returns true (possibly incorrect), and furthermore, WTERMSIG(status)
> returns the exit(VALUE) VALUE from the child's exit() call, and not the
> SIGNAL (let's say SIGTERM that the child caught). This may be correct,
> since the siginfo_t * si_status member is:
>
> int si_status; /* Exit value or signal */
>
> but there's no clarity on which: exit value or signal. Since the child
> exited, I'm likely to assume exit status, which is current observed
> behaviour, but then, WIFSIGNALED(status) should be FALSE, which it's not
> (observed with 2.6.20.6).
>
> So could someone clarify the kernel's intent?
>
> I can provide a short C program to illustrate above behaviour, if
> needed. It also could be that I'm just misinterpreting the intent,
> which is why I'm not calling this a bug, despite a possible
> inconsistency in behaviour.
If the child terminated by calling exit(), regardless of whether it
was done from inside a signal handler, then WIFEXITED() should test
true, but WIFSIGNALED() will not. If you are seeing otherwise, then
show a *short* program that demonstrates the behavior. (But it seems
unlikely that there would be a kernel bug on this point, so do check
your program carefully!)
Cheers,
Michael
On Sat, 2007-09-22 at 11:22 -0700, John Z. Bohach wrote:
> Hello,
>
> It is unclear from the various documentation in the kernel and glibc what
> the proper behaviour should be for the case when a child process
> catches a SIGNAL (say for instance, SIGTERM), and then calls exit()
> from within its caught SIGNAL handler.
>
> Since the exit() will cause a SIGCHLD to the parent, and the parent
> (let's say) has a SIGCHLD sigaction (SA_SIGINFO sa_flags set), should
> the parent's WIFSIGNALED(siginfo->si_status) be true?
>
> To recap, the WIFSIGNALED section of the waitpid() manpage says:
>
> WIFSIGNALED(status)
> returns true if the child process was terminated by a signal.
>
> So the dilemma: the child caught the signal, so it wasn't terminated by
> a signal, but rather its signal handler (let's say) called exit.
POSIX says
    WIFSIGNALED(stat_val)
        Evaluates to a non-zero value if status was returned for a child
        process that terminated due to the receipt of a signal that was
        not caught (see <signal.h>).
So there's no dilemma at all and Linux is non-conformant.
--
Nicholas Miell <[email protected]>
On Saturday 22 September 2007 11:49:09 Michael Kerrisk wrote:
> John,
>
...snip...
>
> If the child terminated by calling exit(), regardless of whether it
> was done from inside a signal handler, then WIFEXITED() should test
> true, but WIFSIGNALED() will not. If you are seeing otherwise, then
> show a *short* program that demonstrates the behavior. (But it seems
> unlikely that there would be a kernel bug on this point, so do check
> your program carefully!)
Attached is a (somewhat) short program that demonstrates the behavior. I
simply compile it with 'make sigtest'.
My observed behavior is:
$ ./sigtest
sigtest started
child1 started
child2 started
selecting...
sigCaught: 3366 receieved signal 15
sigtest 3366 exiting
sigChld: 3365 receieved signal 17
sigChld: 3365 child 3366 WIFEXITED with childStat 15
sigChld: 3365 child 3366 WIFSIGNALED with si_status 15
select error: Interrupted system call
selecting...
sigCaught: 3367 receieved signal 15
sigtest 3367 exiting
sigChld: 3365 receieved signal 17
sigChld: 3365 child 3367 WIFEXITED with childStat 15
sigChld: 3365 child 3367 WIFSIGNALED with si_status 15
select error: Interrupted system call
selecting...
sigCaught: 3365 receieved signal 15
sigtest 3365 exiting
$
To get this output, I ran, from another shell, the following sequence:
$ ps -eaf | grep sigtest
zoltan 3365 2307 0 13:04 pts/0 00:00:00 ./sigtest
zoltan 3366 3365 98 13:04 pts/0 00:00:06 ./sigtest
zoltan 3367 3365 98 13:04 pts/0 00:00:06 ./sigtest
$ kill -SIGTERM 3366
$ kill -SIGTERM 3367
$ kill -SIGTERM 3365
That's it. What I find odd is that the waitpid() status and the
si_status are the same for both a WIFEXITED and a WIFSIGNALED, which
should be impossible, if I read the documentation right.
Thanks for your responses,
John
"John Z. Bohach" <[email protected]> writes:
> if (WIFEXITED(siginfo->si_status))
That does not make any sense. si_status is _not_ a wait status.
Andreas.
--
Andreas Schwab, SuSE Labs, [email protected]
SuSE Linux Products GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany
PGP key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5
"And now for something completely different."
On Saturday 22 September 2007 16:35:56 Andreas Schwab wrote:
> "John Z. Bohach" <[email protected]> writes:
> > if (WIFEXITED(siginfo->si_status))
>
> That does not make any sense. si_status is _not_ a wait status.
>
> Andreas.
Thank you for clearing that up. That explains it.
--john