2004-11-24 21:52:13

by Matt Mackall

[permalink] [raw]
Subject: nanosleep interrupted by ignored signals

Take the following trivial program:

#include <unistd.h>

int main(void)
{
sleep(10);
return 0;
}

Run it in an xterm. Note that resizing the xterm has no effect on the
process. Now do the same with strace:

brk(0x80495bc) = 0x80495bc
brk(0x804a000) = 0x804a000
rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
rt_sigaction(SIGCHLD, NULL, {SIG_DFL}, 8) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
nanosleep({10, 0}, 0xbffff548) = -1 EINTR (Interrupted system
call)
--- SIGWINCH (Window changed) ---
_exit(0) = ?

In short, nanosleep is getting interrupted by signals that are
supposedly ignored when a process is being praced. This appears to be
a long-standing bug.

It also appears to be a long-known bug. I found some old discussion of this
problem here but no sign of any resolution:

http://www.ussg.iu.edu/hypermail/linux/kernel/0108.1/1448.html

What's the current thinking on this?

--
Mathematics is the supreme nostalgia of our time.


2004-11-25 03:32:52

by Matt Mackall

[permalink] [raw]
Subject: Re: nanosleep interrupted by ignored signals

On Wed, Nov 24, 2004 at 06:45:05PM -0800, George Anzinger wrote:
> Matt Mackall wrote:
> >Take the following trivial program:
> >
> >#include <unistd.h>
> >
> >int main(void)
> >{
> > sleep(10);
> > return 0;
> >}
> >
> >Run it in an xterm. Note that resizing the xterm has no effect on the
> >process. Now do the same with strace:
> >
> >brk(0x80495bc) = 0x80495bc
> >brk(0x804a000) = 0x804a000
> >rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
> >rt_sigaction(SIGCHLD, NULL, {SIG_DFL}, 8) = 0
> >rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
> >nanosleep({10, 0}, 0xbffff548) = -1 EINTR (Interrupted system
> >call)
> >--- SIGWINCH (Window changed) ---
> >_exit(0) = ?
> >
> >In short, nanosleep is getting interrupted by signals that are
> >supposedly ignored when a process is being praced. This appears to be
> >a long-standing bug.
> >
> >It also appears to be a long-known bug. I found some old discussion of this
> >problem here but no sign of any resolution:
> >
> >http://www.ussg.iu.edu/hypermail/linux/kernel/0108.1/1448.html
> >
> >What's the current thinking on this?
>
> This should have been resolved with the 2.6 changes, in particular, the
> restart code. What kernel are you using?

Indeed it is. Forgot I still had 2.4 on the box in question, didn't
notice the restart bit when comparing the 2.6 code against the thread
above. Mea culpa.

--
Mathematics is the supreme nostalgia of our time.

2004-11-25 05:47:55

by George Anzinger

[permalink] [raw]
Subject: Re: nanosleep interrupted by ignored signals

Matt Mackall wrote:
> Take the following trivial program:
>
> #include <unistd.h>
>
> int main(void)
> {
> sleep(10);
> return 0;
> }
>
> Run it in an xterm. Note that resizing the xterm has no effect on the
> process. Now do the same with strace:
>
> brk(0x80495bc) = 0x80495bc
> brk(0x804a000) = 0x804a000
> rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
> rt_sigaction(SIGCHLD, NULL, {SIG_DFL}, 8) = 0
> rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
> nanosleep({10, 0}, 0xbffff548) = -1 EINTR (Interrupted system
> call)
> --- SIGWINCH (Window changed) ---
> _exit(0) = ?
>
> In short, nanosleep is getting interrupted by signals that are
> supposedly ignored when a process is being praced. This appears to be
> a long-standing bug.
>
> It also appears to be a long-known bug. I found some old discussion of this
> problem here but no sign of any resolution:
>
> http://www.ussg.iu.edu/hypermail/linux/kernel/0108.1/1448.html
>
> What's the current thinking on this?

This should have been resolved with the 2.6 changes, in particular, the restart
code. What kernel are you using?
>

--
George Anzinger [email protected]
High-res-timers: http://sourceforge.net/projects/high-res-timers/

2004-11-27 01:01:19

by Marcelo Tosatti

[permalink] [raw]
Subject: Re: nanosleep interrupted by ignored signals

On Wed, Nov 24, 2004 at 07:06:27PM -0800, Matt Mackall wrote:
> On Wed, Nov 24, 2004 at 06:45:05PM -0800, George Anzinger wrote:
> > Matt Mackall wrote:
> > >Take the following trivial program:
> > >
> > >#include <unistd.h>
> > >
> > >int main(void)
> > >{
> > > sleep(10);
> > > return 0;
> > >}
> > >
> > >Run it in an xterm. Note that resizing the xterm has no effect on the
> > >process. Now do the same with strace:
> > >
> > >brk(0x80495bc) = 0x80495bc
> > >brk(0x804a000) = 0x804a000
> > >rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
> > >rt_sigaction(SIGCHLD, NULL, {SIG_DFL}, 8) = 0
> > >rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
> > >nanosleep({10, 0}, 0xbffff548) = -1 EINTR (Interrupted system
> > >call)
> > >--- SIGWINCH (Window changed) ---
> > >_exit(0) = ?
> > >
> > >In short, nanosleep is getting interrupted by signals that are
> > >supposedly ignored when a process is being praced. This appears to be
> > >a long-standing bug.
> > >
> > >It also appears to be a long-known bug. I found some old discussion of this
> > >problem here but no sign of any resolution:
> > >
> > >http://www.ussg.iu.edu/hypermail/linux/kernel/0108.1/1448.html
> > >
> > >What's the current thinking on this?
> >
> > This should have been resolved with the 2.6 changes, in particular, the
> > restart code. What kernel are you using?
>
> Indeed it is. Forgot I still had 2.4 on the box in question, didn't
> notice the restart bit when comparing the 2.6 code against the thread
> above. Mea culpa.

George,

Is it worth/necessary to fix this bug in v2.4 ?

Quoting yourself

"This is an issue for debugging also (same ptrace...). The fix is to fix
nano_sleep to match the standard which says it should only return on a
signal if the signal is delivered to the program (i.e. not on internal
"do nothing" signals). Signal in the kernel returns 1 if it calls the
task and 0 otherwise, thus nano sleep might be changed as follows: "


2004-11-29 20:03:09

by George Anzinger

[permalink] [raw]
Subject: Re: nanosleep interrupted by ignored signals

Marcelo Tosatti wrote:
> On Wed, Nov 24, 2004 at 07:06:27PM -0800, Matt Mackall wrote:
>
>>On Wed, Nov 24, 2004 at 06:45:05PM -0800, George Anzinger wrote:
>>
>>>Matt Mackall wrote:
>>>
>>>>Take the following trivial program:
>>>>
>>>>#include <unistd.h>
>>>>
>>>>int main(void)
>>>>{
>>>> sleep(10);
>>>> return 0;
>>>>}
>>>>
>>>>Run it in an xterm. Note that resizing the xterm has no effect on the
>>>>process. Now do the same with strace:
>>>>
>>>>brk(0x80495bc) = 0x80495bc
>>>>brk(0x804a000) = 0x804a000
>>>>rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
>>>>rt_sigaction(SIGCHLD, NULL, {SIG_DFL}, 8) = 0
>>>>rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
>>>>nanosleep({10, 0}, 0xbffff548) = -1 EINTR (Interrupted system
>>>>call)
>>>>--- SIGWINCH (Window changed) ---
>>>>_exit(0) = ?
>>>>
>>>>In short, nanosleep is getting interrupted by signals that are
>>>>supposedly ignored when a process is being praced. This appears to be
>>>>a long-standing bug.
>>>>
>>>>It also appears to be a long-known bug. I found some old discussion of this
>>>>problem here but no sign of any resolution:
>>>>
>>>>http://www.ussg.iu.edu/hypermail/linux/kernel/0108.1/1448.html
>>>>
>>>>What's the current thinking on this?
>>>
>>>This should have been resolved with the 2.6 changes, in particular, the
>>>restart code. What kernel are you using?
>>
>>Indeed it is. Forgot I still had 2.4 on the box in question, didn't
>>notice the restart bit when comparing the 2.6 code against the thread
>>above. Mea culpa.
>
>
> George,
>
> Is it worth/necessary to fix this bug in v2.4 ?
>
> Quoting yourself
>
> "This is an issue for debugging also (same ptrace...). The fix is to fix
> nano_sleep to match the standard which says it should only return on a
> signal if the signal is delivered to the program (i.e. not on internal
> "do nothing" signals). Signal in the kernel returns 1 if it calls the
> task and 0 otherwise, thus nano sleep might be changed as follows: "
>
Hmm, wise fellow, that :) We (MontaVista) have back ported this fix to our
kernels as part of the HRT patch, and, in fact, it is in the latest (albeit
somewhat out of date) HRT patch on sourceforge. The main issue is that it
requires changes in arch level code and so requires a cooperative effort (in
that most folks only have one or two archs to check it out on).

My take on this is that this has been in the kernel since nanosleep() was put in
and so, for a mature kernel, it is not really important to change it. Now if
you want to back port POSIX clocks and timers (i.e. clock_nanosleep()) I would
argue that you should back port this change as part of that effort.

--
George Anzinger [email protected]
High-res-timers: http://sourceforge.net/projects/high-res-timers/

2004-11-30 16:55:03

by Marcelo Tosatti

[permalink] [raw]
Subject: Re: nanosleep interrupted by ignored signals

On Mon, Nov 29, 2004 at 12:01:00PM -0800, George Anzinger wrote:
> Marcelo Tosatti wrote:
> >On Wed, Nov 24, 2004 at 07:06:27PM -0800, Matt Mackall wrote:
> >
> >>On Wed, Nov 24, 2004 at 06:45:05PM -0800, George Anzinger wrote:
> >>
> >>>Matt Mackall wrote:
> >>>
> >>>>Take the following trivial program:
> >>>>
> >>>>#include <unistd.h>
> >>>>
> >>>>int main(void)
> >>>>{
> >>>> sleep(10);
> >>>> return 0;
> >>>>}
> >>>>
> >>>>Run it in an xterm. Note that resizing the xterm has no effect on the
> >>>>process. Now do the same with strace:
> >>>>
> >>>>brk(0x80495bc) = 0x80495bc
> >>>>brk(0x804a000) = 0x804a000
> >>>>rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
> >>>>rt_sigaction(SIGCHLD, NULL, {SIG_DFL}, 8) = 0
> >>>>rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
> >>>>nanosleep({10, 0}, 0xbffff548) = -1 EINTR (Interrupted system
> >>>>call)
> >>>>--- SIGWINCH (Window changed) ---
> >>>>_exit(0) = ?
> >>>>
> >>>>In short, nanosleep is getting interrupted by signals that are
> >>>>supposedly ignored when a process is being praced. This appears to be
> >>>>a long-standing bug.
> >>>>
> >>>>It also appears to be a long-known bug. I found some old discussion of
> >>>>this
> >>>>problem here but no sign of any resolution:
> >>>>
> >>>>http://www.ussg.iu.edu/hypermail/linux/kernel/0108.1/1448.html
> >>>>
> >>>>What's the current thinking on this?
> >>>
> >>>This should have been resolved with the 2.6 changes, in particular, the
> >>>restart code. What kernel are you using?
> >>
> >>Indeed it is. Forgot I still had 2.4 on the box in question, didn't
> >>notice the restart bit when comparing the 2.6 code against the thread
> >>above. Mea culpa.
> >
> >
> >George,
> >
> >Is it worth/necessary to fix this bug in v2.4 ?
> >
> >Quoting yourself
> >
> >"This is an issue for debugging also (same ptrace...). The fix is to fix
> >nano_sleep to match the standard which says it should only return on a
> >signal if the signal is delivered to the program (i.e. not on internal
> >"do nothing" signals). Signal in the kernel returns 1 if it calls the
> >task and 0 otherwise, thus nano sleep might be changed as follows: "
> >
> Hmm, wise fellow, that :) We (MontaVista) have back ported this fix to
> our kernels as part of the HRT patch, and, in fact, it is in the latest
> (albeit somewhat out of date) HRT patch on sourceforge. The main issue is
> that it requires changes in arch level code and so requires a cooperative
> effort (in that most folks only have one or two archs to check it out on).
>
> My take on this is that this has been in the kernel since nanosleep() was
> put in and so, for a mature kernel, it is not really important to change
> it. Now if you want to back port POSIX clocks and timers (i.e.
> clock_nanosleep()) I would argue that you should back port this change as
> part of that effort.

Not really a good idea IMO - lets live with the bug. If it was easy to fix it,
then it would be interesting, but since it is not...

Thanks for your input.