2001-04-10 04:44:22

by Jason Gunthorpe

Subject: Lost O_NONBLOCK (Bug?)


Hi,

I've run into the following weird behavior on my system with 2.4.0. I have
the following code:

if (fork() == 0)
{
   int Flags,dummy;
   if ((Flags = fcntl(STDIN_FILENO,F_GETFL,dummy)) < 0)
      _exit(100);
   if (fcntl(STDIN_FILENO,F_SETFL,Flags | O_NONBLOCK) < 0)
      _exit(100);
   while (read(STDIN_FILENO,&dummy,1) == 1);
   if (fcntl(STDIN_FILENO,F_SETFL,Flags & (~(long)O_NONBLOCK)) < 0)
      _exit(100);

   // exec something
}

Which works fine, unless the parent process was backgrounded by the shell
(^Z then bg). If that is the case then the O_NONBLOCK seems to be lost. I
straced this:

fcntl(0, F_GETFL) = 0x2 (flags O_RDWR)
fcntl(0, F_SETFL, O_RDWR|O_NONBLOCK) = 0
read(0, 0xbfffea38, 1) = ? ERESTARTSYS (To be restarted)
--- SIGTTIN (Stopped (tty input)) ---
--- SIGTTIN (Stopped (tty input)) ---
read(0, 0xbfffea38, 1) = ? ERESTARTSYS (To be restarted)
--- SIGTTIN (Stopped (tty input)) ---
--- SIGTTIN (Stopped (tty input)) ---
[.. etc, again and again in a tight loop ..]
--- SIGTTIN (Stopped (tty input)) ---
--- SIGTTIN (Stopped (tty input)) ---
read(0,

The last read was after the process was foregrounded. The read waits
forever; the non-block flag seems to have gone missing. It is also a
little odd, I think, that it repeatedly got SIGTTIN, which was never
actually delivered to the program. Shouldn't SIGTTIN suspend the process?

The signal mask from /proc/xx/status (the child) looks like:

SigPnd: 0000000000000000
SigBlk: 0000000000000000
SigIgn: 8000000000000000
SigCgt: 0000000000000000

Is this the expected behavior of the kernel?

Thanks,
Jason
Please CC me, I'm not on l-k today.
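
For reference, the drain-then-restore pattern in the snippet above can be
exercised on any descriptor, not just a tty. The following is a minimal
sketch, assuming a hypothetical helper name drain_nonblocking() that does
not appear in the original post. Unlike the original loop, it also retries
when read() is interrupted by a signal handler (EINTR).

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper (not from the original post): discard whatever is
 * currently readable on fd without blocking, then restore the original
 * file status flags.  Returns the number of bytes discarded, or -1 if a
 * fcntl() call fails. */
long drain_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL);   /* F_GETFL takes no third argument */
    if (flags < 0)
        return -1;
    if (fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0)
        return -1;

    long discarded = 0;
    char byte;
    for (;;) {
        ssize_t n = read(fd, &byte, 1);
        if (n == 1) {
            discarded++;
            continue;
        }
        if (n < 0 && errno == EINTR)
            continue;                 /* interrupted by a handler: retry */
        break;                        /* 0 = EOF, or -1 (EAGAIN or another error) */
    }

    if (fcntl(fd, F_SETFL, flags) < 0)
        return -1;
    return discarded;
}
```

On a pipe this returns as soon as the buffer is empty. Note that O_NONBLOCK
is a file status flag stored on the open file description, so it is shared
with every process that inherited the descriptor (the shell, for instance).
Whether that plays a role in the hang described here is not established in
this thread.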



2001-04-13 06:46:06

by Philippe Troin

Subject: Re: Lost O_NONBLOCK (Bug?)

Jason Gunthorpe writes:

> I've run into the following weird behavior on my system with 2.4.0. I have
> the following code:

Apt I guess ? It has a very strange behavior when backgrounded...

> if (fork() == 0)
> {
>    int Flags,dummy;
>    if ((Flags = fcntl(STDIN_FILENO,F_GETFL,dummy)) < 0)
>       _exit(100);
>    if (fcntl(STDIN_FILENO,F_SETFL,Flags | O_NONBLOCK) < 0)
>       _exit(100);
>    while (read(STDIN_FILENO,&dummy,1) == 1);
>    if (fcntl(STDIN_FILENO,F_SETFL,Flags & (~(long)O_NONBLOCK)) < 0)
>       _exit(100);
>
>    // exec something
> }
>
> Which works fine, unless the parent process was backgrounded by the shell
> (^Z then bg). If that is the case then the O_NONBLOCK seems to be lost. I
> straced this:
>
> fcntl(0, F_GETFL) = 0x2 (flags O_RDWR)
> fcntl(0, F_SETFL, O_RDWR|O_NONBLOCK) = 0
> read(0, 0xbfffea38, 1) = ? ERESTARTSYS (To be restarted)
> --- SIGTTIN (Stopped (tty input)) ---
> --- SIGTTIN (Stopped (tty input)) ---
> read(0, 0xbfffea38, 1) = ? ERESTARTSYS (To be restarted)
> --- SIGTTIN (Stopped (tty input)) ---
> --- SIGTTIN (Stopped (tty input)) ---
> [.. etc, again and again in a tight loop ..]
> --- SIGTTIN (Stopped (tty input)) ---
> --- SIGTTIN (Stopped (tty input)) ---
> read(0,
>
> The last read was after the process was foregrounded. The read waits
> forever; the non-block flag seems to have gone missing. It is also a
> little odd, I think, that it repeatedly got SIGTTIN, which was never
> actually delivered to the program. Shouldn't SIGTTIN suspend the process?

Strace can perturb signal delivery, especially for terminal-related
signals, so I wouldn't trust it...

O_NONBLOCK is not lost... Attempting to read from the controlling tty
even from an O_NONBLOCK descriptor will trigger SIGTTIN.

From the code, it looks like you're trying to flush stdin before
exec'ing.

Why not use tcflush(STDIN_FILENO, TCIFLUSH) rather than using
O_NONBLOCK ?

This will not prevent SIGTTIN from getting sent... You could catch it
or just ignore it...
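
Ignoring the signal could look like the sketch below. ignore_sigttin() is
a hypothetical name used only for illustration. With SIGTTIN ignored, POSIX
specifies that a background read() from the controlling terminal fails with
EIO instead of stopping the process, so the caller has to be prepared for
that error.

```c
#include <signal.h>

/* Hypothetical helper: make the process ignore SIGTTIN.
 * Returns 0 on success, -1 if signal() reports an error. */
int ignore_sigttin(void)
{
    /* After this call, a background read() from the controlling tty
     * returns -1 with errno set to EIO rather than stopping us. */
    if (signal(SIGTTIN, SIG_IGN) == SIG_ERR)
        return -1;
    return 0;
}
```

An ignored disposition is preserved across exec, so the program started
afterwards would also run with SIGTTIN ignored unless it resets it.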

But why would you want to flush stdin if you're in the background ?
Why not use:

if (fork()==0)
{
   if (tcgetpgrp(STDIN_FILENO) == getpgrp())
   {
      /* We're the foreground process of the controlling tty */
      tcflush(STDIN_FILENO, TCIFLUSH);
   }

   exec(...);
}

Here you just don't bother flushing stdin if you're not the foreground
process (which is the *right* thing to do).

There's a race condition if the process is backgrounded between the
tcgetpgrp() and the tcflush(), but you'll have to live with it...
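
A compilable variant of the check above, as a sketch only:
flush_if_foreground() is a hypothetical helper name, and it takes the
descriptor as a parameter so it can be tried on something other than
stdin. The race between the check and the flush is still present.

```c
#include <sys/types.h>
#include <termios.h>
#include <unistd.h>

/* Hypothetical helper: discard pending terminal input on fd, but only if
 * the calling process is in the foreground process group of that tty.
 * Returns 1 if input was flushed, 0 if the flush was skipped (fd is not
 * our controlling tty, or we are in the background), -1 if tcflush()
 * itself failed. */
int flush_if_foreground(int fd)
{
    pid_t fg = tcgetpgrp(fd);   /* -1 if fd is not the controlling tty */
    if (fg < 0 || fg != getpgrp())
        return 0;
    /* Race: we may be backgrounded between the check and this call. */
    return tcflush(fd, TCIFLUSH) == 0 ? 1 : -1;
}
```

The same test could also guard the fork() itself if the goal is to do
nothing at all while in the background.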

Phil.

2001-04-13 20:17:04

by Jason Gunthorpe

Subject: Re: Lost O_NONBLOCK (Bug?)


On 12 Apr 2001, Philippe Troin wrote:

> Apt I guess ? It has a very strange behavior when backgrounded...

Not really, just when it tries to run dpkg it hangs.

> > The last read was after the process was foregrounded. The read waits
> > forever; the non-block flag seems to have gone missing. It is also a
> > little odd, I think, that it repeatedly got SIGTTIN, which was never
> > actually delivered to the program. Shouldn't SIGTTIN suspend the process?

> Strace can perturb signal delivery, especially for terminal-related
> signals, so I wouldn't trust it...

I know, the problem still happens without strace.

> O_NONBLOCK is not lost... Attempting to read from the controlling tty
> even from an O_NONBLOCK descriptor will trigger SIGTTIN.

I don't really care about the SIGTTIN, what bugs me is that the read that
happens after the process has been foregrounded blocks - and that should
not be.

> Why not use tcflush(STDIN_FILENO, TCIFLUSH) rather than using
> O_NONBLOCK ?

Mm, that's probably better.

> But why would you want to flush stdin if you're in the background ?

Well, overall, I don't even want to fork if I'm in the background. Getting
suspended before forking is perfectly fine.

Jason

2001-04-19 17:47:59

by Philippe Troin

Subject: Re: Lost O_NONBLOCK (Bug?)

Jason Gunthorpe writes:

> On 12 Apr 2001, Philippe Troin wrote:
>
> > Apt I guess ? It has a very strange behavior when backgrounded...
>
> Not really, just when it tries to run dpkg it hangs.
>
> > > The last read was after the process was foregrounded. The read waits
> > > forever; the non-block flag seems to have gone missing. It is also a
> > > little odd, I think, that it repeatedly got SIGTTIN, which was never
> > > actually delivered to the program. Shouldn't SIGTTIN suspend the process?
>
> > Strace can perturb signal delivery, especially for terminal-related
> > signals, so I wouldn't trust it...
>
> I know, the problem still happens without strace.

Do you have a snippet that can reproduce the problem ? Does this
happen only with 2.4, or do both 2.2 and 2.4 have the problem ?

> > O_NONBLOCK is not lost... Attempting to read from the controlling tty
> > even from an O_NONBLOCK descriptor will trigger SIGTTIN.
>
> I don't really care about the SIGTTIN, what bugs me is that the read that
> happens after the process has been foregrounded blocks - and that should
> not be.

True.

8< snip >8

Phil.