2004-06-25 23:56:58

by Paul Maurides

Subject: 2.6.x signal handler bug

The bug has been reproduced successfully using the following program
on kernel 2.6.5 and 2.6.7, and probably affects any other 2.6 kernel.

Kernel 2.4 produces the correct behavior, an endless loop of handled
signals, but on kernel 2.6 the program segfaults.

#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <setjmp.h>

volatile int len;
volatile int real;
volatile int caught;
jmp_buf env;

void catcher(int sig){
    signal(SIGSEGV,catcher);
    printf("requested: %9d malloced: %9d\n",len,real);
    longjmp(env, 1);
}

int main(){
    char* p=0;
    len = 0;
    signal(SIGSEGV,catcher);

    setjmp(env);
    len++;
    free(p);
    p = malloc(len);
    real = 0;
    while(1){
        p[real] = 0;
        real++;
    }
    return 0;
}

PS. I'm not subscribed to this list, so please include me in the cc


2004-06-26 00:05:11

by Andrew Morton

Subject: Re: 2.6.x signal handler bug

Paul Maurides wrote:
>
> void catcher(int sig){
> signal(SIGSEGV,catcher);
> printf("requested: %9d malloced: %9d\n",len,real);
> longjmp(env, 1);
> }

Use siglongjmp()

2004-06-26 01:32:53

by daw

Subject: Re: 2.6.x signal handler bug

Neither printf() nor longjmp() is safe to call from within
the signal handler. Have you tried deleting printf() and replacing
longjmp() with siglongjmp()? This is a FAQ; search the list archives
for details.

2004-06-26 14:33:27

by Steve G

Subject: Re: 2.6.x signal handler bug

Hi,

I looked at the test program and do not see anything wrong with the code.
Contrary to what's already been said in this thread, sigsetjmp/siglongjmp only
differ in that they restore the signal context. This should never cause a
segfault.

Regarding re-entrancy, longjmp is stated as one of only 2 ways to exit signal
handlers. Also, while the printf is not signal safe, it is not your problem
either. BTW, this mechanism is used by some servers to prevent crashes even in
the face of big problems. xinetd for one does this, so it's important to have it
working.

I ran the test program on my machine under 2.4 and all works as expected. Under
2.6, it definitely segfaults. I tried using Electric Fence and valgrind to trap
the error. Neither one could.

In summary, the program is valid and real world servers do this kind of thing. It
does segfault under 2.6.

Hope this helps...
-Steve Grubb

__________________________________________________
Do You Yahoo!?
Tired of spam? Yahoo! Mail has the best spam protection around
http://mail.yahoo.com

2004-06-26 16:05:43

by Davide Libenzi

Subject: Re: 2.6.x signal handler bug

On Sat, 26 Jun 2004, Steve G wrote:

> Hi,
>
> I looked at the test program and do not see anything wrong with the code.
> Contrary to what's already been said in this thread, sigsetjmp/siglongjmp only
> differ in that they restore the signal context. This should never cause a
> segfault.
>
> Regarding re-entrancy, longjmp is stated as one of only 2 ways to exit signal
> handlers. Also, while the printf is not signal safe, it is not your problem
> either. BTW, this mechanism is used by some servers to prevent crashes even in
> the face of big problems. xinetd for one does this, so it's important to have it
> working.
>
> I ran the test program on my machine under 2.4 and all works as expected. Under
> 2.6, it definitely segfaults. I tried using Electric Fence and valgrind to trap
> the error. Neither one could.
>
> In summary, the program is valid and real world servers do this kind of thing. It
> does segfault under 2.6.

You're receiving a SIGSEGV while SIGSEGV is blocked (force_sig_info). The
force_sig_info call wants to send a signal that the task can't refuse
(kinda The GodFather offers ;). The kernel will notice this and will
restore the handler to SIG_DFL. All three examples below work fine on 2.6.



- Davide


--------------------------------------------------------------------
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <setjmp.h>

volatile int len;
volatile int real;
volatile int caught;
jmp_buf env;

void catcher(int sig){
    signal(SIGSEGV, catcher);
    printf("requested: %9d malloced: %9d\n",len,real);
    longjmp(env, 1);
}

int main(){
    char* p=0;
    sigset_t m;
    len = 0;
    sigemptyset(&m);
    sigaddset(&m, SIGSEGV);
    signal(SIGSEGV, catcher);
    setjmp(env);
    /* longjmp() may leave SIGSEGV blocked; unblock it explicitly */
    sigprocmask(SIG_UNBLOCK, &m, NULL);
    printf("len %d\n", len);
    len++;
    free(p);
    p = malloc(len);
    real = 0;
    while(1){
        p[real] = 0;
        real++;
    }
    return 0;
}

--------------------------------------------------------------------
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <setjmp.h>

volatile int len;
volatile int real;
volatile int caught;
sigjmp_buf env; /* sigsetjmp()/siglongjmp() take a sigjmp_buf */

void catcher(int sig){
    signal(SIGSEGV, catcher);
    printf("requested: %9d malloced: %9d\n",len,real);
    siglongjmp(env, 1);
}

int main(){
    char* p=0;
    len = 0;
    signal(SIGSEGV, catcher);
    /* savemask = 1: siglongjmp() restores the signal mask saved here */
    sigsetjmp(env, 1);
    printf("len %d\n", len);
    len++;
    free(p);
    p = malloc(len);
    real = 0;
    while(1){
        p[real] = 0;
        real++;
    }
    return 0;
}


--------------------------------------------------------------------
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <setjmp.h>

volatile int len;
volatile int real;
volatile int caught;
jmp_buf env;

void catcher(int sig){
    printf("requested: %9d malloced: %9d\n",len,real);
    longjmp(env, 1);
}

int main(){
    char* p=0;
    struct sigaction act;

    len = 0;
    act.sa_handler = catcher;
    /* SA_NODEFER: do not block SIGSEGV while its handler runs */
    act.sa_flags = SA_NODEFER;
    sigemptyset(&act.sa_mask); /* sa_mask must not be left uninitialized */
    sigaction(SIGSEGV, &act, NULL);

    setjmp(env);
    printf("len %d\n", len);
    len++;
    free(p);
    p = malloc(len);
    real = 0;
    while(1){
        p[real] = 0;
        real++;
    }
    return 0;
}


2004-06-27 22:16:16

by Andries Brouwer

Subject: Re: 2.6.x signal handler bug

On Sat, Jun 26, 2004 at 09:05:34AM -0700, Davide Libenzi wrote:

> You're receiving a SIGSEGV while SIGSEGV is blocked (force_sig_info). The
> force_sig_info call wants to send a signal that the task can't refuse
> (kinda The GodFather offers ;). The kernel will notice this and will
> restore the handler to SIG_DFL.

Yes.

So checking whether this is POSIX conforming:

- Blocking a signal in its signal handler is explicitly allowed.
(signal(3p))
- It is unspecified what longjmp() does with the signal mask.
(longjmp(3p))
- The SIGSEGV that occurs during a stack overflow is of the GodFather kind.
(getrlimit(3p))
- If SIGSEGV is generated while blocked, the result is undefined
(sigprocmask(3p))

So, maybe the restoring to SIG_DFL was not required, but it doesn't seem
incorrect either. It may be a bit surprising.

Andries

2004-06-27 22:45:31

by Davide Libenzi

Subject: Re: 2.6.x signal handler bug

On Mon, 28 Jun 2004, Andries Brouwer wrote:

> On Sat, Jun 26, 2004 at 09:05:34AM -0700, Davide Libenzi wrote:
>
> > You're receiving a SIGSEGV while SIGSEGV is blocked (force_sig_info). The
> > force_sig_info call wants to send a signal that the task can't refuse
> > (kinda The GodFather offers ;). The kernel will notice this and will
> > restore the handler to SIG_DFL.
>
> Yes.
>
> So checking whether this is POSIX conforming:
>
> - Blocking a signal in its signal handler is explicitly allowed.
> (signal(3p))
> - It is unspecified what longjmp() does with the signal mask.
> (longjmp(3p))
> - The SIGSEGV that occurs during a stack overflow is of the GodFather kind.
> (getrlimit(3p))
> - If SIGSEGV is generated while blocked, the result is undefined
> (sigprocmask(3p))
>
> So, maybe the restoring to SIG_DFL was not required, but it doesn't seem
> incorrect either. It may be a bit surprising.

I think so. Maybe the attached patch?



- Davide




--- a/kernel/signal.c 2004-06-27 15:42:26.000000000 -0700
+++ b/kernel/signal.c 2004-06-27 15:43:28.000000000 -0700
@@ -820,8 +820,9 @@
int ret;

spin_lock_irqsave(&t->sighand->siglock, flags);
- if (sigismember(&t->blocked, sig) || t->sighand->action[sig-1].sa.sa_handler == SIG_IGN) {
- t->sighand->action[sig-1].sa.sa_handler = SIG_DFL;
+ if (sigismember(&t->blocked, sig)) {
+ if (t->sighand->action[sig-1].sa.sa_handler == SIG_IGN)
+ t->sighand->action[sig-1].sa.sa_handler = SIG_DFL;
sigdelset(&t->blocked, sig);
recalc_sigpending_tsk(t);
}

2004-06-27 22:51:56

by Davide Libenzi

Subject: Re: 2.6.x signal handler bug

On Sun, 27 Jun 2004, Davide Libenzi wrote:

> On Mon, 28 Jun 2004, Andries Brouwer wrote:
>
> > On Sat, Jun 26, 2004 at 09:05:34AM -0700, Davide Libenzi wrote:
> >
> > > You're receiving a SIGSEGV while SIGSEGV is blocked (force_sig_info). The
> > > force_sig_info call wants to send a signal that the task can't refuse
> > > (kinda The GodFather offers ;). The kernel will notice this and will
> > > restore the handler to SIG_DFL.
> >
> > Yes.
> >
> > So checking whether this is POSIX conforming:
> >
> > - Blocking a signal in its signal handler is explicitly allowed.
> > (signal(3p))
> > - It is unspecified what longjmp() does with the signal mask.
> > (longjmp(3p))
> > - The SIGSEGV that occurs during a stack overflow is of the GodFather kind.
> > (getrlimit(3p))
> > - If SIGSEGV is generated while blocked, the result is undefined
> > (sigprocmask(3p))
> >
> > So, maybe the restoring to SIG_DFL was not required, but it doesn't seem
> > incorrect either. It may be a bit surprising.
>
> I think so. Maybe the attached patch?

No, the SIG_IGN check should be there ...



- Davide




--- a/kernel/signal.c 2004-06-27 15:48:47.000000000 -0700
+++ b/kernel/signal.c 2004-06-27 15:49:14.000000000 -0700
@@ -821,7 +821,8 @@

spin_lock_irqsave(&t->sighand->siglock, flags);
if (sigismember(&t->blocked, sig) || t->sighand->action[sig-1].sa.sa_handler == SIG_IGN) {
- t->sighand->action[sig-1].sa.sa_handler = SIG_DFL;
+ if (t->sighand->action[sig-1].sa.sa_handler == SIG_IGN)
+ t->sighand->action[sig-1].sa.sa_handler = SIG_DFL;
sigdelset(&t->blocked, sig);
recalc_sigpending_tsk(t);
}

2004-06-28 02:01:11

by Steve G

Subject: Re: 2.6.x signal handler bug

> > So, maybe the restoring to SIG_DFL was not required, but it doesn't seem
> > incorrect either. It may be a bit surprising.

Right. Thanks for looking deeper, Andries. I understood Davide's explanation and
then immediately wondered why the program worked under 2.4. I suspect 2.4
was emulating the old unreliable signal semantics when signal() was used.

My main concern is that the behavior change may have broken some applications
that used to work. For example, valgrind caught & reported a problem under 2.4,
but valgrind never had a chance to catch it under 2.6.

> I think so. Maybe the attached patch?

I've applied the second patch to my kernel & started recompiling. I'll re-test it
tomorrow.

Thanks,
-Steve Grubb





2004-06-28 11:26:15

by Steve G

Subject: Re: 2.6.x signal handler bug

>> I think so. Maybe the attached patch?

>No, the SIG_IGN check should be there ...

OK. I tested the patch and now the test program runs as it did under 2.4.

Thanks,
-Steve Grubb


2004-06-28 14:59:30

by Davide Libenzi

Subject: Re: 2.6.x signal handler bug

On Mon, 28 Jun 2004, Steve G wrote:

> >> I think so. Maybe the attached patch?
>
> >No, the SIG_IGN check should be there ...
>
> OK. I tested the patch and now the test program runs as it did under 2.4.

Good. I'll toss it to Andrew's back to see if it sticks :)



- Davide

2004-06-28 21:57:17

by Jörn Engel

Subject: Re: 2.6.x signal handler bug

On Sat, 26 June 2004 02:56:51 +0300, Paul Maurides wrote:
>
> The bug has been reproduced successfully using the following program
> on kernel 2.6.5 and 2.6.7, and probably affects any other 2.6 kernel.

All, since about 2.5.71 or so.

> Kernel 2.4 produces the correct behavior, an endless loop of handled
> signals, but on kernel 2.6 the program segfaults.

The program never returns from its signal handler. Instead, it
causes yet another segfault. Any program stupid enough to cause a
segfault inside the segfault handler should be killed. Full stop.

> #include <signal.h>
> #include <stdio.h>
> #include <stdlib.h>
> #include <setjmp.h>
>
> volatile int len;
> volatile int real;
> volatile int caught;
> jmp_buf env;
>
> void catcher(int sig){
> signal(SIGSEGV,catcher);
> printf("requested: %9d malloced: %9d\n",len,real);
> longjmp(env, 1);
> }
>
> int main(){
> char* p=0;
> len = 0;
> signal(SIGSEGV,catcher);
>
> setjmp(env);
> len++;
> free(p);
> p = malloc(len);
> real = 0;
> while(1){
> p[real] = 0;
> real++;
> }
> return 0;
> }

Jörn

--
Fancy algorithms are buggier than simple ones, and they're much harder
to implement. Use simple algorithms as well as simple data structures.
-- Rob Pike