2008-03-08 17:20:47

by Oleg Nesterov

[permalink] [raw]
Subject: [RFC, PATCH] signals: print_fatal_signal: fix the signr "calculation"

Not sure this is needed, please review. Afaics, print_fatal_signal() is needed
to detect the early segfaults, and we don not transform the coredump signals
into the group-wide SIGKILL. With this change, for example, ^C will be reported.
I don't know whether this is good or bad. Most probably we don't need this, but
please clarify what is the desirable behaviour.

The task can dequeue SIGKILL while in fact it was killed by another fatal signal.
Consult ->group_exit_code to figure out the right signr.

Signed-off-by: Oleg Nesterov <[email protected]>

--- 25/kernel/signal.c~7_PFS_FIX 2008-03-08 17:38:50.000000000 +0300
+++ 25/kernel/signal.c 2008-03-08 19:15:00.000000000 +0300
@@ -834,6 +834,13 @@ int print_fatal_signals;

static void print_fatal_signal(struct pt_regs *regs, int signr)
{
+ if (signr == SIGKILL) {
+ if (current->signal->flags & SIGNAL_GROUP_EXIT)
+ signr = current->signal->group_exit_code & 0x7f;
+ if (!signr || signr == SIGKILL)
+ return;
+ }
+
printk("%s/%d: potentially unexpected fatal signal %d.\n",
current->comm, task_pid_nr(current), signr);

@@ -855,7 +862,7 @@ static void print_fatal_signal(struct pt

static int __init setup_print_fatal_signals(char *str)
{
- get_option (&str, &print_fatal_signals);
+ get_option(&str, &print_fatal_signals);

return 1;
}
@@ -1768,7 +1775,7 @@ relock:
* Anything else is fatal, maybe with a core dump.
*/
current->flags |= PF_SIGNALED;
- if ((signr != SIGKILL) && print_fatal_signals)
+ if (print_fatal_signals)
print_fatal_signal(regs, signr);
if (sig_kernel_coredump(signr)) {
/*


2008-03-08 19:03:52

by Roland McGrath

[permalink] [raw]
Subject: Re: [RFC, PATCH] signals: print_fatal_signal: fix the signr "calculation"

The intent of your change is to get the printk for each thread, right?
I don't really see the point. The thread that actually had the fault will
dequeue a non-SIGKILL signal and report its status. We only need one
thread per signal to the print-out.

Hmm. I see that non-coredump signals that hit the optimized fatal case in
__group_complete_signal will cause every thread to have a pending SIGKILL.
So that will be seen first and prevent the print-out. So that's what you
intend to change?

I'm not sure print-fatal-signals was really ever intended for non-coredump
signals. It doesn't seem like it would be all that useful. It's probably
even undesireable for every normal C-c killing something to cause a printk.


Thanks,
Roland

2008-03-08 20:31:23

by Oleg Nesterov

[permalink] [raw]
Subject: Re: [RFC, PATCH] signals: print_fatal_signal: fix the signr "calculation"

On 03/08, Roland McGrath wrote:
>
> Hmm. I see that non-coredump signals that hit the optimized fatal case in
> __group_complete_signal will cause every thread to have a pending SIGKILL.
> So that will be seen first and prevent the print-out. So that's what you
> intend to change?

Yes,

> I'm not sure print-fatal-signals was really ever intended for non-coredump
> signals.

Great, thanks! Please ignore this patch then. I was afraid something was
missed here.

Oleg.

2008-03-09 11:05:46

by Ingo Molnar

[permalink] [raw]
Subject: Re: [RFC, PATCH] signals: print_fatal_signal: fix the signr "calculation"


* Roland McGrath <[email protected]> wrote:

> I'm not sure print-fatal-signals was really ever intended for
> non-coredump signals. It doesn't seem like it would be all that
> useful. It's probably even undesireable for every normal C-c killing
> something to cause a printk.

correct. We used to have them for SIGKILL but even that was confusing to
users - so the intent very much is to only have them for truly
unexpected, non-user generated and 'fatal', coredump-generating signals.

Ingo

2008-03-09 16:07:47

by Oleg Nesterov

[permalink] [raw]
Subject: Re: [RFC, PATCH] signals: print_fatal_signal: fix the signr "calculation"

On 03/09, Ingo Molnar wrote:
>
> * Roland McGrath <[email protected]> wrote:
>
> > I'm not sure print-fatal-signals was really ever intended for
> > non-coredump signals. It doesn't seem like it would be all that
> > useful. It's probably even undesireable for every normal C-c killing
> > something to cause a printk.
>
> correct. We used to have them for SIGKILL but even that was confusing to
> users - so the intent very much is to only have them for truly
> unexpected, non-user generated and 'fatal', coredump-generating signals.

Yes, I see now, thanks.

Let me clarify why I started this thread. I'm thinking how we can "improve"
fatal signals. Let's look at this patch

http://marc.info/?l=linux-kernel&m=120498888702466

(under discussion, let's suppose it will be accepted). In short: with this
patch the thread-specific SIGKILL shutdowns the whole thread group early on
signal delivery, the same way like the group-wide SIGKILL does.

For various reasons we can't currently do the same for sig_kernel_coredump()
signals. But, when rlim[RLIMIT_CORE] == 0, we don't actually need coredumping?
So, we could do something like


- if (!sig_kernel_coredump(sig)) {
+ if (!sig_kernel_coredump(sig) || !signal->rlim[RLIMIT_CORE]) {

// shutdown the whole group,
// send SIGKILL to each thread

to speedup the processing of fatal coredump signals (to clarify: this issue
is minor, just for example. and the change above is not exactly right).

However, this breaks print_fatal_signal(), because with this change nobody
will dequeue the "right" signal to report, it was transformed to the "global"
SIGKILL.

So, if we change the behaviour of thread-specific coredump signals, then
we should "fix" print_fatal_signal(). At least, now I know which signals
should be reported.

Thanks!

Oleg.

2008-03-11 02:19:41

by Roland McGrath

[permalink] [raw]
Subject: Re: [RFC, PATCH] signals: print_fatal_signal: fix the signr "calculation"

> For various reasons we can't currently do the same for sig_kernel_coredump()
> signals. But, when rlim[RLIMIT_CORE] == 0, we don't actually need coredumping?
> So, we could do something like
>
>
> - if (!sig_kernel_coredump(sig)) {
> + if (!sig_kernel_coredump(sig) || !signal->rlim[RLIMIT_CORE]) {

I don't like this. I think it's better to leave logic like RLIMIT_CORE
checks to the core dump code itself.


Thanks,
Roland

2008-03-11 18:18:04

by Oleg Nesterov

[permalink] [raw]
Subject: Re: [RFC, PATCH] signals: print_fatal_signal: fix the signr "calculation"

On 03/10, Roland McGrath wrote:
>
> > For various reasons we can't currently do the same for sig_kernel_coredump()
> > signals. But, when rlim[RLIMIT_CORE] == 0, we don't actually need coredumping?
> > So, we could do something like
> >
> >
> > - if (!sig_kernel_coredump(sig)) {
> > + if (!sig_kernel_coredump(sig) || !signal->rlim[RLIMIT_CORE]) {
>
> I don't like this. I think it's better to leave logic like RLIMIT_CORE
> checks to the core dump code itself.

As I said, this was just for illustration.

What I am thinking about is how to restore the "sig_kernel_coredump() signal
freezes the whole group on delivery" behaviour, we had this before
198466b41d11dd062fb26ee0376080458d7bfcaf.

Afaics this is possible, we could add another SIGNAL_XXX flag but send SIGKILL,
this also unifies the fatal signals processing.

Oleg.