2008-07-18 01:28:17

by Robert Hancock

Subject: Re: EINTR under Linux

akineko wrote:
> Hello,
>
> I have a socket program that is running flawlessly under Solaris.
> When I re-compiled it under Linux (CentOS 5.1) and run it, I got the
> following error:
>
> recv() failed: Interrupted system call
>
> This only occurs very infrequently (probably one out of a million
> packets exchanged).
>
> select() in my program is getting EINTR.
>
> From the postings I found in the news group seem suggesting that it is
> due to GC.
>
>> The GC sends signals to each thread which causes them all to enter a stop-the-world state. When the GC
>> is finished, all the threads are resumed. When the threads are resumed, any that were blocked in a
>> blocking system call (like poll()) will return with EINTR. Normally you would just retry the system call.
>
> So, I added to check if the errno == EINTR and now my program seems
> working fine.
>
> //
>
> My question I would like to ask in this group is:
> Does this mean any system call under Linux could return empty-hand
> with EINTR due to GC?
> I usually assume fatal if system call returns -1.
> It is quite painful to check all system-call return status.
>
> My second question is:
> Does this can occur in other OS's? (free-BSD, Solaris, ...)
> Or, is this specific to Linux OS?

I'm not sure which GC you're referring to, but I assume it uses a
signal handler for that stop signal. If the handler is not installed
with the SA_RESTART flag, a system call interrupted by that signal
fails with EINTR instead of being restarted automatically. For some
system calls, EINTR can still occur even when SA_RESTART is set;
select() is one example, see:

http://www.opengroup.org/onlinepubs/007908775/xsh/select.html

This is not Linux specific, but the specs allow for some different
behavior between UNIX variants.
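
As a rough illustration of the SA_RESTART point, here is a minimal sketch of
installing a handler with sigaction() so that restartable system calls are
resumed instead of failing with EINTR. The names on_signal, got_signal and
install_restarting_handler are made up for this example, and it only applies
to handlers you install yourself; a runtime or GC that installs its own
handler is outside your control.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* Hypothetical flag set by the handler; sig_atomic_t is safe to write
 * from a signal handler. */
static volatile sig_atomic_t got_signal = 0;

/* Hypothetical handler: just records that the signal arrived. */
static void on_signal(int signo)
{
    (void)signo;
    got_signal = 1;
}

/* Install on_signal for signo with SA_RESTART set.
 * Returns 0 on success, -1 on failure (errno set by sigaction). */
int install_restarting_handler(int signo)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_signal;
    sigemptyset(&sa.sa_mask);   /* block no extra signals in the handler */
    sa.sa_flags = SA_RESTART;   /* ask the kernel to restart interrupted calls */

    return sigaction(signo, &sa, NULL);
}
```

Calling install_restarting_handler(SIGUSR1) once at startup would be a
typical use. Calls such as select() may still return EINTR with this flag
set, so the return value check cannot be dropped entirely.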


Subject: Re: EINTR under Linux

On 7/18/08, Robert Hancock wrote:
> akineko wrote:
>
> > Hello,
> >
> > I have a socket program that is running flawlessly under Solaris.
> > When I re-compiled it under Linux (CentOS 5.1) and run it, I got the
> > following error:
> >
> > recv() failed: Interrupted system call
> >
> > This only occurs very infrequently (probably one out of a million
> > packets exchanged).
> >
> > select() in my program is getting EINTR.
> >
> > From the postings I found in the news group seem suggesting that it is
> > due to GC.
> >
> >
> > > The GC sends signals to each thread which causes them all to enter a
> > > stop-the-world state. When the GC is finished, all the threads are
> > > resumed. When the threads are resumed, any that were blocked in a
> > > blocking system call (like poll()) will return with EINTR. Normally you
> > > would just retry the system call.
> > >
> >
> > So, I added to check if the errno == EINTR and now my program seems
> > working fine.
> >
> > //
> >
> > My question I would like to ask in this group is:
> > Does this mean any system call under Linux could return empty-hand
> > with EINTR due to GC?
> > I usually assume fatal if system call returns -1.
> > It is quite painful to check all system-call return status.
> >
> > My second question is:
> > Does this can occur in other OS's? (free-BSD, Solaris, ...)
> > Or, is this specific to Linux OS?
> >
>
> I'm not sure what the GC you're referring to is, but I assume it's using a
> signal handler for that stop signal. If the signal handler is not installed
> with the SA_RESTART flag, then if a system call is interrupted by that
> signal it will get EINTR instead of being restarted automatically. For some
> system calls, EINTR can still occur, for example, see:
>
> http://www.opengroup.org/onlinepubs/007908775/xsh/select.html
>
> This is not Linux specific, but the specs allow for some different behavior
> between UNIX variants.

And the signal.7 page has been very recently updated to include
Linux-specific details for most system calls. Have a look here:

http://www.kernel.org/doc/man-pages/online/pages/man7/signal.7.html

Basically, recv() is restarted if you use SA_RESTART (unless a receive
timeout has been set on the socket with SO_RCVTIMEO), but select() is
never restarted, regardless of SA_RESTART (and POSIX.1 allows this).