... doesn't seem to be possible anymore.
See
http://www.openoffice.org/issues/show_bug.cgi?id=27162
Is this change intentional, or a bug?
LLaP
bero
--
Ark Linux - Linux for the masses
http://www.arklinux.org/
Redistribution and processing of this message is subject to
http://www.arklinux.org/terms.php
[email protected] wrote:
> ... doesn't seem to be possible anymore.
>
> See http://www.openoffice.org/issues/show_bug.cgi?id=27162
>
> Is this change intentional, or a bug?
On 2.6.3, x86, SIGSEGV is being caught just fine in my test program,
with the correct fault address, with or without SA_SIGINFO.
-- Jamie
Jamie Lokier wrote:
> [email protected] wrote:
>
>>... doesn't seem to be possible anymore.
>>
>>See http://www.openoffice.org/issues/show_bug.cgi?id=27162
>>
>>Is this change intentional, or a bug?
>
>
> On 2.6.3, x86, SIGSEGV is being caught just fine in my test program,
> with the correct fault address, with or without SA_SIGINFO.
SA_SIGINFO implies sigaction(). The original poster was talking about
signal().
That said, it seems to work with 2.6.4 on ppc32.
Chris
Chris Friesen wrote:
> SA_SIGINFO implies sigaction(). The original poster was talking about
> signal().
>
> That said, it seems to work with 2.6.4 on ppc32.
Just tried it with 2.6.3, x86 and signal(). Works fine.
-- Jamie
On Mon, 5 Apr 2004, Jamie Lokier wrote:
> Chris Friesen wrote:
> > SA_SIGINFO implies sigaction(). The original poster was talking about
> > signal().
> >
> > That said, it seems to work with 2.6.4 on ppc32.
>
> Just tried it with 2.6.3, x86 and signal(). Works fine.
>
> -- Jamie
Are you using longjmp() to get out of the signal handler?
You may find that you can trap SIGSEGV, but you can't simply
return from the handler, because execution resumes at the
instruction that caused the trap!!!
#include <stdio.h>
#include <signal.h>

void handler(int sig) {
    fprintf(stderr, "Caught %d\n", sig);
}

int main() {
    char *foo = NULL;
    signal(SIGSEGV, handler);
    fprintf(stderr, "Send a signal....\n");
    kill(0, SIGSEGV);
    fprintf(stderr, "Okay! That worked!\n");
//  *foo = 0;
    return 0;
}
Just un-comment the null-pointer de-reference and watch!
Cheers,
Dick Johnson
Penguin : Linux version 2.4.24 on an i686 machine (797.90 BogoMips).
Note 96.31% of all statistics are fiction.
Richard B. Johnson wrote:
> Are you using a longjump to get out of the signal handler?
> You may find that you can trap SIGSEGV, but you can't exit
> from it because it will return to the instruction that
> caused the trap!!!
Thanks for stating the obvious! :)
No, actually I'm changing memory protection with mprotect() inside the
handler, so when it returns the program can continue.
But that's not relevant to the OpenOffice problem. They have a
program which traps SIGSEGV with 2.4 and terminates suddenly with 2.6.
Obviously they aren't just returning else it wouldn't work with 2.4.
-- Jamie
Richard B. Johnson wrote:
> Are you using a longjump to get out of the signal handler?
> You may find that you can trap SIGSEGV, but you can't exit
> from it because it will return to the instruction that
> caused the trap!!!
That's the same as in 2.4 though. The original poster was talking about
behaviour changes in 2.6.
Chris
On Mon, 5 Apr 2004, Jamie Lokier wrote:
> > See http://www.openoffice.org/issues/show_bug.cgi?id=27162
> >
> > Is this change intentional, or a bug?
>
> On 2.6.3, x86, SIGSEGV is being caught just fine in my test program,
> with the correct fault address, with or without SA_SIGINFO.
Seems to be triggered only by some segfaults -- a simpler test app than
the one in the OpenOffice bug report works here too, the OpenOffice one
crashes.
I'll try to debug it some more when I have some time, but that could take
a while (busy ATM)
LLaP
bero
Hi,
Just in case this helps, this is a simplified testcase of the OpenOffice.org
code in question that always worked under 2.4 kernels on multiple
architectures but fails on 2.6.X kernels on those same multiple platforms.
For some reason, the segfault generated by trying to write to address 0 can
not be properly caught anymore (or at least it appears that way to me).
Hope this helps.
Kevin
[kbhend@base1 solar]$ cat testcase.c
#include <stdio.h>
#include <stdlib.h>   /* exit() */
#include <signal.h>
#include <setjmp.h>

typedef int (*TestFunc)( void* );

static jmp_buf check_env;
static int bSignal;

void SignalHdl( int sig )
{
    bSignal = 1;
    longjmp( check_env, sig );
}

int check( TestFunc func, void* p )
{
    int result;

    bSignal = 0;
    if ( !setjmp( check_env ) )
    {
        signal( SIGSEGV, SignalHdl );
        signal( SIGBUS, SignalHdl );
        result = func( p );
        signal( SIGSEGV, SIG_DFL );
        signal( SIGBUS, SIG_DFL );
    }
    if ( bSignal )
        return -1;
    else
        return 0;
}

int GetAtAddress( void* p )
{
    return *((char*)p);
}

int SetAtAddress( void* p )
{
    return *((char*)p) = 0;
}

int CheckGetAccess( void* p )
{
    int b;
    b = -1 != check( (TestFunc)GetAtAddress, p );
    return b;
}

int CheckSetAccess( void* p )
{
    int b;
    b = -1 != check( (TestFunc)SetAtAddress, p );
    return b;
}

void InfoMemoryAccess( char* p )
{
    if ( CheckGetAccess( p ) )
        printf( "can read address %p\n", p );
    else
        printf( "can not read address %p\n", p );
    if ( CheckSetAccess( p ) )
        printf( "can write address %p\n", p );
    else
        printf( "can not write address %p\n", p );
}

int
main( int argc, char* argv[] )
{
    {
        char* p = NULL;
        InfoMemoryAccess( p );
        p = (char*)&p;
        InfoMemoryAccess( p );
    }
    exit( 0 );
}
Kevin B. Hendricks wrote:
> For some reason, the segfault generated by trying to write to address 0 can
> not be properly caught anymore (or at least it appears that way to me).
If the code were correct, you'd see the expected behavior.
> void SignalHdl( int sig )
> {
> bSignal = 1;
> longjmp( check_env, sig );
> }
Since you jump out of a signal handler, you must use siglongjmp
> int check( TestFunc func, void* p )
> {
> int result;
> bSignal = 0;
> if ( !setjmp( check_env ) )
And sigsetjmp(check_env, 1) here.
--
➧ Ulrich Drepper ➧ Red Hat, Inc. ➧ 444 Castro St ➧ Mountain View, CA ❖
Hi,
> If the code would be correct you'd see the expected behavior.
> Since you jump out of a signal handler, you must use siglongjmp
> And sigsetjmp(check_env, 1) here.
So the code has been wrong since the beginning and we were just "lucky" it
worked in all pre-2.6 kernels?
I have no doubt you are right, but forgive my ignorance here: please explain
why we must use siglongjmp when longjmping out of a signal handler, given that
1. before the next use of the handler we use signal again to properly set the
signal handler (and the set of masked signals).
and
2. the mask of blocked signals will include SIGSEGV upon entry to the signal
handler, and therefore it will be masked after the normal longjmp, since a
normal longjmp will not change the set of masked signals.
What am I missing that makes sigsetjmp and siglongjmp a requirement, or is
this just part of some specification someplace?
Kevin
Kevin B. Hendricks wrote:
> So the code has been wrong since the beginning and we were just "lucky" it
> worked in all pre-2.6 kernels?
The old code depended on undefined behavior.
> 1. before the next use of the handler we use signal again to properly set the
> signal handler (and the set of masked signals).
Where do you set the signal mask? That's the point. You don't. This
means jumping from the signal handler causes the signal to remain
blocked. And then
~~~~
If any of the SIGFPE, SIGILL, SIGSEGV, or SIGBUS signals are generated
while they are blocked, the result is undefined, unless the signal was
generated by the kill() function, the sigqueue() function, or the
raise() function.
~~~~
(see pthread_sigmask in POSIX) comes into play.
The second SIGSEGV signal is generated while the signal is blocked, and
since none of the functions mentioned in the text above generated it,
anything can happen. The old kernel queued the signal; the new kernel
terminates the process, which is much better IMO. Try the attached
program to see why. Also note that the 2.4 behavior is inconsistent: if
no handler is installed, the process is terminated regardless of the
signal being masked.
--
➧ Ulrich Drepper ➧ Red Hat, Inc. ➧ 444 Castro St ➧ Mountain View, CA ❖
Hi Ulrich,
> The old code depended on undefined behavior.
Thanks for explanation. It makes perfect sense. I appreciate it.
Our bad assumption was that using signal() to install a signal handler for a
specific signal unblocked that specific signal, but as you show it does not.
I will try to get a fix using sigsetjmp/siglongjmp or fork/wait into the
forthcoming OOo 1.1.2 tree so that no more "problems" are reported.
Kevin
Ulrich Drepper wrote:
>
> Kevin B. Hendricks wrote:
>
> > So the code has been wrong since the beginning and we were just "lucky" it
> > worked in all pre-2.6 kernels?
>
> The old code depended on undefined behavior.
Maybe it's simply *old* code, possibly written under libc5.
There, signal() used SA_RESETHAND which implies SA_NODEFER
which in turn did not block the signal and exiting from the
signal handler via longjmp was OK.
With the new signal() behaviour in glibc2 one may get results
undefined by POSIX but it still worked as before because the
sigprocmask was ignored for SIGSEGV under Linux <2.6.
It's the combination of new glibc2 and new kernel that makes
code like the mentioned one break.
It has nothing to do with POSIX - for POSIX all of this is
"undefined/implementation defined behaviour". I had chosen
to stay compatible...
Ciao, ET.
--
Not every program claims to be POSIX compliant (who reads
3600 pages of difficult to obtain specs?) - some are simply
Linux programs...