Considering that the threading library for Linux uses signals to make it
work, would it be possible to change the Linux kernel to operate the way
BSD does--instead of returning EINTR, just restart the interrupted
primitive?
For example, if I'm using read(2) to read data from a file descriptor,
and a signal happens, the signal handler runs, and read(2) returns EINTR
after the system call finishes. Then I'm supposed to catch this and
re-try the system call.
I assume that this is true for _any_ system call which makes the process
block, right?
Can we _PLEASE_PLEASE_PLEASE_ not do this anymore and have the kernel do
what BSD does: re-start the interrupted call?
Please? If this is something that would be acceptable for integration
into a mainline kernel, I would do my best to help with a patch.
If I'm wrong about this, please enlighten me. Also, please cc: me off
the list, as I don't get the list directly.
Thank you for your consideration.
--
George T. Talbot
<george at moberg dot com>
[email protected] writes:
> Can we _PLEASE_PLEASE_PLEASE_ not do this anymore and have the kernel do
> what BSD does: re-start the interrupted call?
This is crap. Returning EINTR is necessary for many applications.
--
---------------. ,-. 1325 Chesapeake Terrace
Ulrich Drepper \ ,-------------------' \ Sunnyvale, CA 94089 USA
Red Hat `--' drepper at redhat.com `------------------------
Ulrich Drepper wrote:
>
> [email protected] writes:
>
> > Can we _PLEASE_PLEASE_PLEASE_ not do this anymore and have the kernel do
> > what BSD does: re-start the interrupted call?
>
> This is crap. Returning EINTR is necessary for many applications.
After reading about SA_RESTART, ok. However, couldn't those
applications that require it enable this behaviour explicitly?
The problem I'm having right now is with pthread_create() failing
because deep somewhere in either the kernel or glibc, nanosleep()
returns EINTR during said pthread_create() and pthread_create() fails.
I've got a multithreaded program written using gcc (2.95.2) and glibc
(2.1.3), and it's talking to a natively threaded Java program (tried
both Sun & Blackdown ports, both 1.2.2 and 1.3) on a 2.2.17 kernel. The
C program is listening for incoming socket connections, and the Java
program is hammering on it with many parallel connect() calls. After a
short, somewhat random interval, pthread_create() will fail in either my
program, or deep in the Java VM. I assume that the Java VM is using
pthread_create().
I don't mean to sound like a psycho on this, but I can't see why
SA_RESTART isn't the default behavior. Maybe I'm missing something
somewhere.
--
George T. Talbot
<george at moberg dot com>
Hello!
> > Can we _PLEASE_PLEASE_PLEASE_ not do this anymore and have the kernel do
> > what BSD does: re-start the interrupted call?
>
> This is crap. Returning EINTR is necessary for many applications.
Just a reminder: this "crap" is the default behaviour of Linux nowadays. 8)8)
Alexey
On Fri, 3 Nov 2000 [email protected] wrote:
> Considering that the threading library for Linux uses signals to make it
> work, would it be possible to change the Linux kernel to operate the way
> BSD does--instead of returning EINTR, just restart the interrupted
> primitive?
>
It's just how the default for signal() is set up by the 'C' runtime
library. Instead of using signal, use sigaction(), set the SA_RESTART
flag and you have BSD action.
It is also possible to compile existing applications using
-D_BSD_SIGNALS (this is from memory, it might not be exactly correct).
New applications should not use signal(); they should use sigaction(),
which gives POSIX-defined fine control over the signal handler.
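A minimal sketch of the sigaction() approach, assuming a POSIX system. The helper name install_restarting_handler is made up for illustration and does not come from any library:

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/*
 * Hypothetical helper: install `handler` for `signo` with BSD-style
 * restart semantics. Returns 0 on success, -1 on error (errno is set).
 */
static int install_restarting_handler(int signo, void (*handler)(int))
{
        struct sigaction sa;

        memset(&sa, 0, sizeof(sa));
        sa.sa_handler = handler;
        sigemptyset(&sa.sa_mask);       /* block no extra signals in the handler */
        sa.sa_flags = SA_RESTART;       /* restart interrupted calls where possible */
        return sigaction(signo, &sa, NULL);
}
```

Note that SA_RESTART only covers calls the system is able to restart; some calls still return EINTR regardless of the flag, so checking for it remains worthwhile.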
Cheers,
Dick Johnson
Penguin : Linux version 2.2.17 on an i686 machine (801.18 BogoMips).
"Memory is like gasoline. You use it up when you are running. Of
course you get it all back when you reboot..."; Actual explanation
obtained from the Micro$oft help desk.
Followup to: <[email protected]>
By author: [email protected]
In newsgroup: linux.dev.kernel
>
> Hello!
>
> > > Can we _PLEASE_PLEASE_PLEASE_ not do this anymore and have the kernel do
> > > what BSD does: re-start the interrupted call?
> >
> > This is crap. Returning EINTR is necessary for many applications.
>
> Just a reminder: this "crap" is the default behaviour of Linux nowadays. 8)8)
>
signal() is crap... I personally think it was a major lose to have
signal() change to BSD behaviour by default (an unexpected change for
most applications!!)
For sigaction() you must choose behaviour explicitly anyway, by either
specifying or not specifying SA_RESTART.
Applications should use sigaction(). Period. Full stop. signal() is
so unpredictable these days as to be practically unusable.
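The two choices can be compared side by side. The sketch below is illustrative only: the names wake_fd, on_alarm and read_across_signal are invented here, and it assumes a system where read() on a pipe is restartable. It blocks in read() on an empty pipe, lets SIGALRM arrive a second later, and has the handler make one byte available so that a restarted read() can finish.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

static int wake_fd = -1;        /* write end of the pipe, used by the handler */

/* On SIGALRM, put one byte in the pipe so a restarted read() can complete. */
static void on_alarm(int signo)
{
        int saved_errno = errno;
        ssize_t ignored;

        (void)signo;
        ignored = write(wake_fd, "x", 1);       /* write() is async-signal-safe */
        (void)ignored;
        errno = saved_errno;
}

/*
 * `flags` is 0 or SA_RESTART. Returns whatever read() returned and stores
 * the errno observed right after read() in *err.
 */
static ssize_t read_across_signal(int flags, int *err)
{
        struct sigaction sa, old;
        int fds[2];
        char c;
        ssize_t n;

        if (pipe(fds) < 0) {
                *err = errno;
                return -1;
        }
        wake_fd = fds[1];

        memset(&sa, 0, sizeof(sa));
        sa.sa_handler = on_alarm;
        sigemptyset(&sa.sa_mask);
        sa.sa_flags = flags;
        sigaction(SIGALRM, &sa, &old);

        alarm(1);                       /* signal arrives while read() blocks */
        errno = 0;
        n = read(fds[0], &c, 1);
        *err = errno;

        sigaction(SIGALRM, &old, NULL); /* put the previous disposition back */
        close(fds[0]);
        close(fds[1]);
        return n;
}
```

Called with flags 0, the read() should fail with EINTR; called with SA_RESTART, it should be restarted and return the byte the handler wrote. If the signal happens to arrive before read() is entered, both variants simply return the byte, which is why the one-second delay is used.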
-hpa
--
<[email protected]> at work, <[email protected]> in private!
"Unix gives you enough rope to shoot yourself in the foot."
http://www.zytor.com/~hpa/puzzle.txt
On Fri, 3 Nov 2000 [email protected] wrote:
> I don't mean this to sound like a rant. It's just that I can't possibly
> ascertain why someone in their right mind would want any behaviour
> different than SA_RESTART.
study apache 1.3's child_main code, you'll see an example of EINTR in use.
it's used to get out of accept() -- most specifically when the child needs
to die off (because the parent has determined that there's either too many
children, or because a shutdown/restart is occurring).
apache 1.3's BUFF code also uses EINTR for timeouts.
i eliminated signals in the 2.0 design... so it doesn't use EINTR any
more, but it restarts in userland because that's the most portable thing
to do.
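The accept() pattern described above can be sketched roughly as follows. This is not Apache's code; stop_requested, on_stop, install_stop_handler and accept_until_stopped are names invented for the illustration. The handler is installed without SA_RESTART on purpose, so that a blocked accept() returns EINTR and the loop gets a chance to notice the flag.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

static volatile sig_atomic_t stop_requested;

static void on_stop(int signo)
{
        (void)signo;
        stop_requested = 1;
}

/* Install the stop handler WITHOUT SA_RESTART so accept() gets EINTR. */
static int install_stop_handler(int signo)
{
        struct sigaction sa;

        memset(&sa, 0, sizeof(sa));
        sa.sa_handler = on_stop;
        sigemptyset(&sa.sa_mask);
        sa.sa_flags = 0;
        return sigaction(signo, &sa, NULL);
}

/*
 * Accept connections until the stop signal arrives. Returns the number of
 * connections accepted, or -1 on an accept() error other than EINTR.
 */
static int accept_until_stopped(int listen_fd)
{
        int served = 0;

        while (!stop_requested) {
                int conn = accept(listen_fd, NULL, NULL);

                if (conn < 0) {
                        if (errno == EINTR)
                                continue;       /* go back and re-check the flag */
                        return -1;
                }
                close(conn);    /* a real server would handle the connection */
                served++;
        }
        return served;
}
```

One limitation of this sketch: a signal that lands between the flag check and the accept() call is not noticed until the next connection or signal arrives. Closing that window needs something like pselect() with a signal mask, which is left out here.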
On Fri, 3 Nov 2000 [email protected] wrote:
> After reading about SA_RESTART, ok. However, couldn't those
> applications that require it enable this behaviour explicitly?
anyone sane writing modern applications will use sigaction(). signal() is
legacy.
-dean
I respectfully disagree that programs which don't surround some of the
most common system calls with
    do
    {
        rv = __some_system_call__(...);
    } while (rv == -1 && errno == EINTR);
are broken. Especially if those programs don't use signals. The problem
that I'm raising is that the default behavior of returning EINTR from
system calls is, in my opinion, an application reliability problem. The
specific problem I'm having is that glibc uses signals to implement
multiple threads and, because of the EINTR behavior, exposes multithreaded
programs to this behavior that weren't necessarily written to use signals.
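For concreteness, the retry loop in question can be folded into a small wrapper so that callers do not repeat it. This is only a sketch; read_retry is a made-up name. (glibc also offers a TEMP_FAILURE_RETRY macro in <unistd.h> when _GNU_SOURCE is defined, which serves the same purpose but is not portable.)

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical wrapper: read() that retries when interrupted by a signal. */
static ssize_t read_retry(int fd, void *buf, size_t len)
{
        ssize_t rv;

        do {
                rv = read(fd, buf, len);
        } while (rv == -1 && errno == EINTR);
        return rv;
}
```

Callers then see only real errors, end-of-file (0), or a byte count, which may still be shorter than len.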
It's a usability and portability issue, especially considering that I
might be using pthreads so that I can avoid signal handling entirely. I
might want to do this because I want to be portable to non-UNIX systems
that implement the pthreads API.
I _was_not_ too lazy to read the documentation, though I don't have a copy
of POSIX.
Does POSIX require that pthreads programs be signal-aware in the EINTR
sense? Could this be considered a bug?
--
George T. Talbot
<[email protected]>
On Sun, 5 Nov 2000, Marc Lehmann wrote:
> On Fri, Nov 03, 2000 at 02:49:37PM -0500, [email protected] wrote:
> > After reading about SA_RESTART, ok. However, couldn't those
> > applications that require it enable this behaviour explicitly?
>
> No, broken applications that require specific bsd behaviour should just be
> compiled with -D_BSD_SOURCE. If you need newer features then be prepared
> to fix the program.
>
> > The problem I'm having right now is with pthread_create() failing
> > because deep somewhere in either the kernel or glibc, nanosleep()
> > returns EINTR during said pthread_create() and pthread_create() fails.
>
> This has hardly anything to do with the signal reliability issue, unless
> you compiled your own threads library. You might want to file a bug report
> for libpthread otherwise.
>
> > I don't mean to sound like a psycho on this, but I can't see why
> > SA_RESTART isn't the default behavior. Maybe I'm missing something
> > somewhere.
>
> Yes, you are missing signal(2) or the glibc info file, so the real
> question is: why were you too lazy to read the documentation???
>
>
Date: Mon, 6 Nov 2000 09:13:25 -0500 (EST)
From: George Talbot <[email protected]>
I respectfully disagree that programs which don't surround some of the
most common system calls with
    do
    {
        rv = __some_system_call__(...);
    } while (rv == -1 && errno == EINTR);
are broken. Especially if those programs don't use signals. The problem
that I'm raising is that the default behavior of returning EINTR from
system calls is, in my opinion, an application reliability problem. The
specific problem I'm having is that glibc uses signals to implement
multiple threads and, because of the EINTR behavior, exposes multithreaded
programs to this behavior that weren't necessarily written to use
signals.
Arguably though the bug is in glibc, in that if it's using signals
behind the scenes, it should have passed SA_RESTART to sigaction.
However, from a portability point of view, you should *always* surround
certain system calls with while loops, since even if your program
doesn't use signals, if you run that program on a System-V derived Unix
system, and someone types ^Z at the wrong moment, you can also get an
EINTR. Similarly, you should always check the return value from write
and make sure that everything you asked to be written was actually
written.
What I normally do is have a full_write routine which looks something
like this:
static errcode_t full_write(int fd, void *buf, int count)
{
        char *cp = buf;
        int left = count, c;

        while (left) {
                c = write(fd, cp, left);
                if (c < 0) {
                        if (errno == EINTR || errno == EAGAIN)
                                continue;
                        return errno;
                }
                left -= c;
                cp += c;
        }
        return 0;
}
It's like checking the return value from malloc(). Not everyone does
it, but even if it's not needed 99% of the time, it's a darned good idea
to do that.
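A variant of the same idea, offered as a sketch and not as a replacement for the routine above: it uses size_t and ssize_t for the counts, returns a plain int errno value in place of errcode_t, and waits with poll() when a non-blocking descriptor reports EAGAIN so the loop does not spin. The name full_write_all is invented for this example.

```c
#include <errno.h>
#include <poll.h>
#include <stddef.h>
#include <unistd.h>

/* Write all `count` bytes. Returns 0 on success or an errno value. */
static int full_write_all(int fd, const void *buf, size_t count)
{
        const char *cp = buf;
        size_t left = count;

        while (left > 0) {
                ssize_t c = write(fd, cp, left);

                if (c < 0) {
                        if (errno == EINTR)
                                continue;       /* interrupted: just retry */
                        if (errno == EAGAIN || errno == EWOULDBLOCK) {
                                struct pollfd pfd = { .fd = fd, .events = POLLOUT };

                                /* sleep until the descriptor is writable */
                                if (poll(&pfd, 1, -1) < 0 && errno != EINTR)
                                        return errno;
                                continue;
                        }
                        return errno;
                }
                left -= (size_t)c;      /* short write: advance and go again */
                cp += c;
        }
        return 0;
}
```

Typical use is `if (full_write_all(fd, data, len) != 0) { /* report the error */ }`.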
- Ted
"Theodore Y. Ts'o" <[email protected]> writes:
> Arguably though the bug is in glibc, in that if it's using signals
> behind the scenes, it should have passed SA_RESTART to sigaction.
Why are you talking such nonsense?
>
> However, from a portability point of view, you should *always* surround
> certain system calls with while loops, since even if your program
> doesn't use signals, if you run that program on a System-V derived Unix
> system, and someone types ^Z at the wrong moment, you can also get an
> EINTR. Similarly, you should always check the return value from write
> and make sure all of what you asked to be written, was actually
> written.
>
> What I normally do is have a full_write routine which looks something
> like this:
>
> static errcode_t full_write(int fd, void *buf, int count)
> {
>         char *cp = buf;
>         int left = count, c;
>
>         while (left) {
>                 c = write(fd, cp, left);
>                 if (c < 0) {
>                         if (errno == EINTR || errno == EAGAIN)
>                                 continue;
>                         return errno;
>                 }
>                 left -= c;
>                 cp += c;
>         }
>         return 0;
> }
>
> It's like checking the return value from malloc(). Not everyone does
> it, but even if it's not needed 99% of the time, it's a darned good idea
> to do that.
>
> - Ted
--
---------------. ,-. 1325 Chesapeake Terrace
Ulrich Drepper \ ,-------------------' \ Sunnyvale, CA 94089 USA
Red Hat `--' drepper at redhat.com `------------------------
Ulrich Drepper <[email protected]> writes:
> "Theodore Y. Ts'o" <[email protected]> writes:
>
> > Arguably though the bug is in glibc, in that if it's using signals
> > behind the scenes, it should have passed SA_RESTART to sigaction.
>
> Why are you talking such nonsense?
[Note to self: remove kitten from keyboard before writing mail.]
Glibc has to use signals because there *still* is no mechanism in the
kernel to allow synchronization. After how many years?
I don't blame Linus. He has no interest in threads and therefore
doesn't spend much time thinking about it. But everybody who's
complaining about things like this has to be willing to fix the real
problems.
Get your ass up and write a fast semaphore/mutex system.
--
---------------. ,-. 1325 Chesapeake Terrace
Ulrich Drepper \ ,-------------------' \ Sunnyvale, CA 94089 USA
Red Hat `--' drepper at redhat.com `------------------------
Hello!
> Glibc has to use signals because there *still* is no mechanism in the
> kernel to allow synchronization.
Could you tell me why it uses SA_INTERRUPT for its internal signals?
Alexey
On Mon, 6 Nov 2000, George Talbot wrote:
> I respectfully disagree that programs which don't surround some of the
> most common system calls with
>
>     do
>     {
>         rv = __some_system_call__(...);
>     } while (rv == -1 && errno == EINTR);
welcome to Unix. this is how it is, and it's not just linux. and it's
not just glibc/linuxthreads. in your code do you go about setting all
signals to SA_RESTART? if not then you're subject to the vagaries of
whatever the default signal settings are.
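One way to find out what the current settings are is to query a signal's disposition without changing it. A small sketch, assuming POSIX sigaction(); the function name signal_restarts is invented for this example:

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>

/*
 * Report whether the current disposition of `signo` has SA_RESTART set.
 * Returns 1 if it does, 0 if it does not, -1 if the query itself failed.
 */
static int signal_restarts(int signo)
{
        struct sigaction cur;

        /* a NULL new action only reads the current one */
        if (sigaction(signo, NULL, &cur) < 0)
                return -1;
        return (cur.sa_flags & SA_RESTART) ? 1 : 0;
}
```

This only tells you about handlers installed in the calling process. It says nothing about interruptions that come from outside, such as job control or tracing.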
ted mentioned ^Z... there's also strace/truss/ktrace (depending on your
flavour of unix). there's also page-out/in (and on some unixes there's
swap-out/in).
it's something which bites lots of folks. gnu tar had this bug for at
least 5 years, and may still have it -- i got tired of submitting the bug
fix.
-dean
From: Ulrich Drepper <[email protected]>
Date: 06 Nov 2000 10:50:37 -0800
> Arguably though the bug is in glibc, in that if it's using signals
> behind the scenes, it should have passed SA_RESTART to sigaction.
Why are you talking such nonsense?
The claim was made that pthreads was using signals behind the scenes, so
that programs which weren't expecting system calls to get
interrupted were getting interrupted. Hence, one could make the
argument that if the pthreads code had used SA_RESTART to set up its
signal handlers, then this situation wouldn't have come up.
I haven't looked more deeply into this. As far as I'm concerned,
threads === "more rope" and use of threads should be avoided whenever
possible, even if Linux had a decent threads implementation....
- Ted