2015-12-09 21:14:44

by Marc Aurèle La France

[permalink] [raw]
Subject: n_tty: Check the other end of pty pair before returning EAGAIN on a read()

Greetings.

The following four commits are some of the changes that have been made
to the tty layer since kernel version 3.11:

1) f95499c3030fe1bfad57745f2db1959c5b43dca8
n_tty: Don't wait for buffer work in read() loop

2) f8747d4a466ab2cafe56112c51b3379f9fdb7a12
tty: Fix pty master read() after slave closes

3) 52bce7f8d4fc633c9a9d0646eef58ba6ae9a3b73
pty, n_tty: Simplify input processing on final close

4) 1a48632ffed61352a7810ce089dc5a8bcd505a60
pty: Fix input race when closing

Commit "4)" corrected an issue whereby EIO could be prematurely
returned on a read of one end of a master/slave pty pair after the
other had been completely closed. Yet, I would argue that EAGAIN
should not be returned either when there actually is data to be
returned. This whether or not the other end has been completely
closed.

Indeed, the previous code (before commit "1)") checked the other end
of the pty pair for any data before returning EAGAIN. This mimics the
behaviour of other System-V variants (Solaris, AIX, etc.) in this
regard and ensured that EAGAIN really did mean no data was available
at the time of the call.

Portable OpenSSH, since release 4.6p1 in 2007, relies on this being
the case and has been broken since commit "1)" introduced spurious
EAGAIN returns (i.e. as of 3.12 kernels). The scenario at hand is
as follows.

After sshd has been SIGCHLD'ed about the shell's termination, it
continues to read the master pty until an error occurs. This error
will be EIO if no process has the slave pty open. Otherwise (for
example when the shell spawned long-running processes in the
background before terminating), that error is expected to be EAGAIN.
sshd cannot continue to read until an EIO in all cases, because doing
so causes the session to hang until all processes have closed the
slave pty, which is not the desired behaviour. Thus a spurious EAGAIN
return causes sshd to lose data, whether or not the slave pty is
completely closed.

I've been using the following script to reproduce the problem. It
loops until the issue is detected.

#! /bin/bash

LOG=sshlog-`date "+%F.%T"`

touch ${LOG}

while test -z "`grep -n '^Connection' ${LOG} | grep -v '0:Connection'`"
do
ssh -p 22 -tt root@localhost \
'/bin/bash -c "/bin/ping -c4 8.8.8.8"' 2>&1 | \
tee -a ${LOG}
done

It should be noted that the problem is extremely rare, but still
occurs, on real hardware. This bug is easier to replicate in a
virtual machine such as those that can be created using Google Cloud.

The patch below is a suggested fix. It was developed using a 4.3.0
kernel and should apply, modulo fuzz, to any release >= 4.0.5. My
suggested fix is modeled after commit "2)" mentionned above. Given
commit "2)" was later reworked by commit "3)", I fully expect my fix
to be reworked as well.

I volunteer to backport the fix this ends up being to any stable
release >= 3.12 deemed needed.

Please Reply-To-All.

Thanks and have a great day.

Marc.

Reported-by: Volth <[email protected]>
BugLink: https://bugzilla.mindrot.org/show_bug.cgi?id=52
BugLink: https://bugzilla.mindrot.org/show_bug.cgi?id=2492
Signed-off-by: Marc Aurele La France <[email protected]>

--- a/drivers/tty/n_tty.c
+++ a/drivers/tty/n_tty.c
@@ -2281,20 +2281,26 @@ static ssize_t n_tty_read(struct tty_struct *tty, struct file *file,
if (!timeout)
break;
if (file->f_flags & O_NONBLOCK) {
- retval = -EAGAIN;
- break;
- }
- if (signal_pending(current)) {
- retval = -ERESTARTSYS;
- break;
- }
- up_read(&tty->termios_rwsem);
+ up_read(&tty->termios_rwsem);
+ flush_work(&tty->port->buf.work);
+ down_read(&tty->termios_rwsem);
+ if (!input_available_p(tty, 0)) {
+ retval = -EAGAIN;
+ break;
+ }
+ } else {
+ if (signal_pending(current)) {
+ retval = -ERESTARTSYS;
+ break;
+ }
+ up_read(&tty->termios_rwsem);

- timeout = wait_woken(&wait, TASK_INTERRUPTIBLE,
- timeout);
+ timeout = wait_woken(&wait, TASK_INTERRUPTIBLE,
+ timeout);

- down_read(&tty->termios_rwsem);
- continue;
+ down_read(&tty->termios_rwsem);
+ continue;
+ }
}

if (ldata->icanon && !L_EXTPROC(tty)) {


2015-12-10 14:59:38

by Peter Hurley

[permalink] [raw]
Subject: Re: n_tty: Check the other end of pty pair before returning EAGAIN on a read()

On 12/09/2015 01:06 PM, Marc Aurele La France wrote:
> Greetings.
>
> The following four commits are some of the changes that have been made
> to the tty layer since kernel version 3.11:
>
> 1) f95499c3030fe1bfad57745f2db1959c5b43dca8
> n_tty: Don't wait for buffer work in read() loop
>
> 2) f8747d4a466ab2cafe56112c51b3379f9fdb7a12
> tty: Fix pty master read() after slave closes
>
> 3) 52bce7f8d4fc633c9a9d0646eef58ba6ae9a3b73
> pty, n_tty: Simplify input processing on final close
>
> 4) 1a48632ffed61352a7810ce089dc5a8bcd505a60
> pty: Fix input race when closing
>
> Commit "4)" corrected an issue whereby EIO could be prematurely
> returned on a read of one end of a master/slave pty pair after the
> other had been completely closed. Yet, I would argue that EAGAIN
> should not be returned either when there actually is data to be
> returned. This whether or not the other end has been completely
> closed.
>
> Indeed, the previous code (before commit "1)") checked the other end
> of the pty pair for any data before returning EAGAIN. This mimics the
> behaviour of other System-V variants (Solaris, AIX, etc.)
^^^^
What other SysV systems were tested?

> in this
> regard and ensured that EAGAIN really did mean no data was available
> at the time of the call.
>
> Portable OpenSSH, since release 4.6p1 in 2007, relies on this being
> the case and has been broken since commit "1)" introduced spurious
> EAGAIN returns (i.e. as of 3.12 kernels). The scenario at hand is
> as follows.
>
> After sshd has been SIGCHLD'ed about the shell's termination, it
> continues to read the master pty until an error occurs. This error
> will be EIO if no process has the slave pty open. Otherwise (for
> example when the shell spawned long-running processes in the
> background before terminating), that error is expected to be EAGAIN.
> sshd cannot continue to read until an EIO in all cases, because doing
> so causes the session to hang until all processes have closed the
> slave pty, which is not the desired behaviour. Thus a spurious EAGAIN
> return causes sshd to lose data, whether or not the slave pty is
> completely closed.

Ah, the games userspace will be up to :)


> I've been using the following script to reproduce the problem. It
> loops until the issue is detected.
>
> #! /bin/bash
>
> LOG=sshlog-`date "+%F.%T"`
>
> touch ${LOG}
>
> while test -z "`grep -n '^Connection' ${LOG} | grep -v '0:Connection'`"
> do
> ssh -p 22 -tt root@localhost \
> '/bin/bash -c "/bin/ping -c4 8.8.8.8"' 2>&1 | \
> tee -a ${LOG}
> done
>
> It should be noted that the problem is extremely rare, but still
> occurs, on real hardware. This bug is easier to replicate in a
> virtual machine such as those that can be created using Google Cloud.
>
> The patch below is a suggested fix. It was developed using a 4.3.0
> kernel and should apply, modulo fuzz, to any release >= 4.0.5. My
> suggested fix is modeled after commit "2)" mentionned above. Given
> commit "2)" was later reworked by commit "3)", I fully expect my fix
> to be reworked as well.
>
> I volunteer to backport the fix this ends up being to any stable
> release >= 3.12 deemed needed.
>
> Please Reply-To-All.
>
> Thanks and have a great day.
>
> Marc.
>
> Reported-by: Volth <[email protected]>
> BugLink: https://bugzilla.mindrot.org/show_bug.cgi?id=52
> BugLink: https://bugzilla.mindrot.org/show_bug.cgi?id=2492
> Signed-off-by: Marc Aurele La France <[email protected]>
>
> --- a/drivers/tty/n_tty.c
> +++ a/drivers/tty/n_tty.c
> @@ -2281,20 +2281,26 @@ static ssize_t n_tty_read(struct tty_struct *tty, struct file *file,
> if (!timeout)
> break;
> if (file->f_flags & O_NONBLOCK) {
> - retval = -EAGAIN;
> - break;
> - }
> - if (signal_pending(current)) {
> - retval = -ERESTARTSYS;
> - break;
> - }
> - up_read(&tty->termios_rwsem);
> + up_read(&tty->termios_rwsem);
> + flush_work(&tty->port->buf.work);
> + down_read(&tty->termios_rwsem);
> + if (!input_available_p(tty, 0)) {
> + retval = -EAGAIN;
> + break;
> + }
> + } else {
> + if (signal_pending(current)) {
> + retval = -ERESTARTSYS;
> + break;
> + }
> + up_read(&tty->termios_rwsem);

No sense in doing this just for O_NONBLOCK; might as well do it before
all the condition tests.

Which renders the earlier fixes for the slave end closing superfluous,
so might as well rip those out.

n_tty_poll() will need to be fixed as well, because if one application
used read() with O_NONBLOCK to expect to block until i/o became available,
then I guarantee some other application uses poll() with no timeout
for the same purpose.

Regards,
Peter Hurley

>
> - timeout = wait_woken(&wait, TASK_INTERRUPTIBLE,
> - timeout);
> + timeout = wait_woken(&wait, TASK_INTERRUPTIBLE,
> + timeout);
>
> - down_read(&tty->termios_rwsem);
> - continue;
> + down_read(&tty->termios_rwsem);
> + continue;
> + }
> }
>
> if (ldata->icanon && !L_EXTPROC(tty)) {
>

2015-12-10 22:49:01

by Marc Aurèle La France

[permalink] [raw]
Subject: Re: n_tty: Check the other end of pty pair before returning EAGAIN on a read()

On Thu, 10 Dec 2015, Peter Hurley wrote:
> On 12/09/2015 01:06 PM, Marc Aurele La France wrote:
>> The following four commits are some of the changes that have been made
>> to the tty layer since kernel version 3.11:

>> 1) f95499c3030fe1bfad57745f2db1959c5b43dca8
>> n_tty: Don't wait for buffer work in read() loop

>> 2) f8747d4a466ab2cafe56112c51b3379f9fdb7a12
>> tty: Fix pty master read() after slave closes

>> 3) 52bce7f8d4fc633c9a9d0646eef58ba6ae9a3b73
>> pty, n_tty: Simplify input processing on final close

>> 4) 1a48632ffed61352a7810ce089dc5a8bcd505a60
>> pty: Fix input race when closing

>> Indeed, the previous code (before commit "1)") checked the other end
>> of the pty pair for any data before returning EAGAIN. This mimics the
>> behaviour of other System-V variants (Solaris, AIX, etc.)

> What other SysV systems were tested?

The fix for mindrot bug 52 was verified to correct the issue reported
there on various versions of SunOS, OSF1, HP-UX, IRIX, Tru64 UNIX, AIX,
OpenSolaris, and a number of Linux distributions. There was a no-go
report regarding HP-UX 11.11 that was never followed up on, but I
believe it to have been resolved by the fix for mindrot bug 1467 (for
systems where EAGAIN != EWOULDBLOCK).

>> in this
>> regard and ensured that EAGAIN really did mean no data was available
>> at the time of the call.

>> After sshd has been SIGCHLD'ed about the shell's termination, it
>> continues to read the master pty until an error occurs. This error
>> will be EIO if no process has the slave pty open. Otherwise (for
>> example when the shell spawned long-running processes in the
>> background before terminating), that error is expected to be EAGAIN.
>> sshd cannot continue to read until an EIO in all cases, because doing
>> so causes the session to hang until all processes have closed the
>> slave pty, which is not the desired behaviour. Thus a spurious EAGAIN
>> return causes sshd to lose data, whether or not the slave pty is
>> completely closed.

> Ah, the games userspace will be up to :)

Not really. The fact different OSes behave differently in this regard can
hardly be said to be userland's fault. The lower the number of distinct
behaviours userland needs to deal with, the better. Furthermore, sshd
"knows" there should be data there, so it makes no sense to befuddle it
with false EAGAIN returns.

>> --- a/drivers/tty/n_tty.c
>> +++ a/drivers/tty/n_tty.c
>> @@ -2281,20 +2281,26 @@ static ssize_t n_tty_read(struct tty_struct *tty, struct file *file,
>> if (!timeout)
>> break;
>> if (file->f_flags & O_NONBLOCK) {
>> - retval = -EAGAIN;
>> - break;
>> - }
>> - if (signal_pending(current)) {
>> - retval = -ERESTARTSYS;
>> - break;
>> - }
>> - up_read(&tty->termios_rwsem);
>> + up_read(&tty->termios_rwsem);
>> + flush_work(&tty->port->buf.work);
>> + down_read(&tty->termios_rwsem);
>> + if (!input_available_p(tty, 0)) {
>> + retval = -EAGAIN;
>> + break;
>> + }
>> + } else {
>> + if (signal_pending(current)) {
>> + retval = -ERESTARTSYS;
>> + break;
>> + }
>> + up_read(&tty->termios_rwsem);

> No sense in doing this just for O_NONBLOCK; might as well do it before
> all the condition tests.

> Which renders the earlier fixes for the slave end closing superfluous,
> so might as well rip those out.

This strikes me as somewhat draconian. It seems to me the intent behind
commit "1)" was in part to speed things up a bit. I beleive that goal to
have been largely attained. It's just a matter of ensuring this speedup
doesn't change kernel<->userland semantics. I see your EIO fix and the
eventual solution to the problem here in that light.

> n_tty_poll() will need to be fixed as well, because if one application
> used read() with O_NONBLOCK to expect to block until i/o became available,
> then I guarantee some other application uses poll() with no timeout
> for the same purpose.

Agreed. which is another reason why I don't believe my suggestion to be
the correct fix. Also, I took somewhat of a hammer approach, meaning
that, for OpenSSH's purposes, it would be sufficient to remove spurious
EAGAIN only after disassociation of the slave pty.

>>
>> - timeout = wait_woken(&wait, TASK_INTERRUPTIBLE,
>> - timeout);
>> + timeout = wait_woken(&wait, TASK_INTERRUPTIBLE,
>> + timeout);
>>
>> - down_read(&tty->termios_rwsem);
>> - continue;
>> + down_read(&tty->termios_rwsem);
>> + continue;
>> + }
>> }
>>
>> if (ldata->icanon && !L_EXTPROC(tty)) {
>>

Thanks.

Marc.

2015-12-11 00:07:37

by Peter Hurley

[permalink] [raw]
Subject: Re: n_tty: Check the other end of pty pair before returning EAGAIN on a read()

On 12/10/2015 02:48 PM, Marc Aurele La France wrote:
> On Thu, 10 Dec 2015, Peter Hurley wrote:
>> On 12/09/2015 01:06 PM, Marc Aurele La France wrote:
>>> The following four commits are some of the changes that have been made
>>> to the tty layer since kernel version 3.11:
>
>>> 1) f95499c3030fe1bfad57745f2db1959c5b43dca8
>>> n_tty: Don't wait for buffer work in read() loop
>
>>> 2) f8747d4a466ab2cafe56112c51b3379f9fdb7a12
>>> tty: Fix pty master read() after slave closes
>
>>> 3) 52bce7f8d4fc633c9a9d0646eef58ba6ae9a3b73
>>> pty, n_tty: Simplify input processing on final close
>
>>> 4) 1a48632ffed61352a7810ce089dc5a8bcd505a60
>>> pty: Fix input race when closing
>
>>> Indeed, the previous code (before commit "1)") checked the other end
>>> of the pty pair for any data before returning EAGAIN. This mimics the
>>> behaviour of other System-V variants (Solaris, AIX, etc.)
>
>> What other SysV systems were tested?
>
> The fix for mindrot bug 52 was verified to correct the issue reported
> there on various versions of SunOS, OSF1, HP-UX, IRIX, Tru64 UNIX, AIX,
> OpenSolaris, and a number of Linux distributions. There was a no-go
> report regarding HP-UX 11.11 that was never followed up on, but I
> believe it to have been resolved by the fix for mindrot bug 1467 (for
> systems where EAGAIN != EWOULDBLOCK).

Ok.

>>> in this
>>> regard and ensured that EAGAIN really did mean no data was available
>>> at the time of the call.
>
>>> After sshd has been SIGCHLD'ed about the shell's termination, it
>>> continues to read the master pty until an error occurs. This error
>>> will be EIO if no process has the slave pty open. Otherwise (for
>>> example when the shell spawned long-running processes in the
>>> background before terminating), that error is expected to be EAGAIN.
>>> sshd cannot continue to read until an EIO in all cases, because doing
>>> so causes the session to hang until all processes have closed the
>>> slave pty, which is not the desired behaviour. Thus a spurious EAGAIN
>>> return causes sshd to lose data, whether or not the slave pty is
>>> completely closed.
>
>> Ah, the games userspace will be up to :)
>
> Not really.

Definitely.

The idea that a read with O_NONBLOCK set should have synchronous behavior
is ridiculous.

> The fact different OSes behave differently in this regard can
> hardly be said to be userland's fault. The lower the number of distinct
> behaviours userland needs to deal with, the better. Furthermore, sshd
> "knows" there should be data there, so it makes no sense to befuddle it
> with false EAGAIN returns.

But sshd doesn't "know". sshd "knows" the data has been sent and that's all.
sshd is extrapolating from one known condition to another unknown condition,
and assuming it "should" be that way because it has been.

For example, try the same idea with real ttys on loopback. Wouldn't work,
because it's asynchronous.

The only reason this needs fixing is because it's a userspace regression.


>>> --- a/drivers/tty/n_tty.c
>>> +++ a/drivers/tty/n_tty.c
>>> @@ -2281,20 +2281,26 @@ static ssize_t n_tty_read(struct tty_struct *tty, struct file *file,
>>> if (!timeout)
>>> break;
>>> if (file->f_flags & O_NONBLOCK) {
>>> - retval = -EAGAIN;
>>> - break;
>>> - }
>>> - if (signal_pending(current)) {
>>> - retval = -ERESTARTSYS;
>>> - break;
>>> - }
>>> - up_read(&tty->termios_rwsem);
>>> + up_read(&tty->termios_rwsem);
>>> + flush_work(&tty->port->buf.work);
>>> + down_read(&tty->termios_rwsem);
>>> + if (!input_available_p(tty, 0)) {
>>> + retval = -EAGAIN;
>>> + break;
>>> + }
>>> + } else {
>>> + if (signal_pending(current)) {
>>> + retval = -ERESTARTSYS;
>>> + break;
>>> + }
>>> + up_read(&tty->termios_rwsem);
>
>> No sense in doing this just for O_NONBLOCK; might as well do it before
>> all the condition tests.
>
>> Which renders the earlier fixes for the slave end closing superfluous,
>> so might as well rip those out.
>
> This strikes me as somewhat draconian. It seems to me the intent behind
> commit "1)" was in part to speed things up a bit. I beleive that goal to
> have been largely attained. It's just a matter of ensuring this speedup
> doesn't change kernel<->userland semantics. I see your EIO fix and the
> eventual solution to the problem here in that light.

When I first saw this report, I considered pipelining the status change
using new tty flags, but then reconsidered. It's one thing to operate within
the confines of VFS but a whole different situation to workaround this
problem with some kind of process exit hack.

Unfortunately, this condition lands squarely in the path I wanted to
avoid; O_NONBLOCK and poll(). I see no clear solution other than that
presented to prevent the regression. And the existing work since 3.12
is really pointless if we have to wait in O_NONBLOCK and poll anyway.

This is just one of those unfortunate situations where userspace has come
to rely on an unspecified behavior because it worked.

>> n_tty_poll() will need to be fixed as well, because if one application
>> used read() with O_NONBLOCK to expect to block until i/o became available,
>> then I guarantee some other application uses poll() with no timeout
>> for the same purpose.
>
> Agreed. which is another reason why I don't believe my suggestion to be
> the correct fix. Also, I took somewhat of a hammer approach, meaning
> that, for OpenSSH's purposes, it would be sufficient to remove spurious
> EAGAIN only after disassociation of the slave pty.

I don't see another solution.

Regards,
Peter Hurley

>>>
>>> - timeout = wait_woken(&wait, TASK_INTERRUPTIBLE,
>>> - timeout);
>>> + timeout = wait_woken(&wait, TASK_INTERRUPTIBLE,
>>> + timeout);
>>>
>>> - down_read(&tty->termios_rwsem);
>>> - continue;
>>> + down_read(&tty->termios_rwsem);
>>> + continue;
>>> + }
>>> }
>>>
>>> if (ldata->icanon && !L_EXTPROC(tty)) {
>>>
>
> Thanks.
>
> Marc.
>

2015-12-11 13:37:14

by Marc Aurèle La France

[permalink] [raw]
Subject: Re: n_tty: Check the other end of pty pair before returning EAGAIN on a read()

On Thu, 10 Dec 2015, Peter Hurley wrote:
> On 12/10/2015 02:48 PM, Marc Aurele La France wrote:
>> On Thu, 10 Dec 2015, Peter Hurley wrote:
>>> On 12/09/2015 01:06 PM, Marc Aurele La France wrote:

>>>> After sshd has been SIGCHLD'ed about the shell's termination, it
>>>> continues to read the master pty until an error occurs. This error
>>>> will be EIO if no process has the slave pty open. Otherwise (for
>>>> example when the shell spawned long-running processes in the
>>>> background before terminating), that error is expected to be EAGAIN.
>>>> sshd cannot continue to read until an EIO in all cases, because doing
>>>> so causes the session to hang until all processes have closed the
>>>> slave pty, which is not the desired behaviour. Thus a spurious EAGAIN
>>>> return causes sshd to lose data, whether or not the slave pty is
>>>> completely closed.

>>> Ah, the games userspace will be up to :)

>> Not really.

> Definitely.

> The idea that a read with O_NONBLOCK set should have synchronous behavior
> is ridiculous.

>> The fact different OSes behave differently in this regard can
>> hardly be said to be userland's fault. The lower the number of distinct
>> behaviours userland needs to deal with, the better. Furthermore, sshd
>> "knows" there should be data there, so it makes no sense to befuddle it
>> with false EAGAIN returns.

> But sshd doesn't "know". sshd "knows" the data has been sent and that's all.
> sshd is extrapolating from one known condition to another unknown condition,
> and assuming it "should" be that way because it has been.

> For example, try the same idea with real ttys on loopback. Wouldn't work,
> because it's asynchronous.

> The only reason this needs fixing is because it's a userspace regression.

It's the kernel that introduced this regression, not OpenSSH.

I am not asking to read data before it has been produced. I am puzzled
that despite knowing that the data exists, I can now be lied to when I
try to retrieve it, when I wasn't before. We are talking about what is
essentially a two-way pipe, not some network or serial connection with
transmission delays userland has long experience in dealing with.

These previously internal additional delays, that are now exposed to
userland, are simply an implementation detail that userland did not, and
should not, need to worry about.

> This is just one of those unfortunate situations where userspace has come
> to rely on an unspecified behavior because it worked.

Whether the behaviour is specified or not is irrelevent. This simply
means there is no standard to debunk the fact that the kernel's previous
behaviour mimics that of other systems.

So, how am I supposed to avoid these spurious EAGAINs and finally be
allowed to read the data I know exists? How long do I have to wait? Do I
have to run a calibration loop to figure that out? Why should I need to
do that only on Linux?

I don't know, but there's nonsense in here somewhere.

Marc.

2015-12-11 13:56:18

by Peter Hurley

[permalink] [raw]
Subject: Re: n_tty: Check the other end of pty pair before returning EAGAIN on a read()

On 12/11/2015 05:37 AM, Marc Aurele La France wrote:
> On Thu, 10 Dec 2015, Peter Hurley wrote:
>> On 12/10/2015 02:48 PM, Marc Aurele La France wrote:
>>> On Thu, 10 Dec 2015, Peter Hurley wrote:
>>>> On 12/09/2015 01:06 PM, Marc Aurele La France wrote:
>
>>>>> After sshd has been SIGCHLD'ed about the shell's termination, it
>>>>> continues to read the master pty until an error occurs. This error
>>>>> will be EIO if no process has the slave pty open. Otherwise (for
>>>>> example when the shell spawned long-running processes in the
>>>>> background before terminating), that error is expected to be EAGAIN.
>>>>> sshd cannot continue to read until an EIO in all cases, because doing
>>>>> so causes the session to hang until all processes have closed the
>>>>> slave pty, which is not the desired behaviour. Thus a spurious EAGAIN
>>>>> return causes sshd to lose data, whether or not the slave pty is
>>>>> completely closed.
>
>>>> Ah, the games userspace will be up to :)
>
>>> Not really.
>
>> Definitely.
>
>> The idea that a read with O_NONBLOCK set should have synchronous behavior
>> is ridiculous.
>
>>> The fact different OSes behave differently in this regard can
>>> hardly be said to be userland's fault. The lower the number of distinct
>>> behaviours userland needs to deal with, the better. Furthermore, sshd
>>> "knows" there should be data there, so it makes no sense to befuddle it
>>> with false EAGAIN returns.
>
>> But sshd doesn't "know". sshd "knows" the data has been sent and that's all.
>> sshd is extrapolating from one known condition to another unknown condition,
>> and assuming it "should" be that way because it has been.
>
>> For example, try the same idea with real ttys on loopback. Wouldn't work,
>> because it's asynchronous.
>
>> The only reason this needs fixing is because it's a userspace regression.

Misunderstanding.

"userspace regression" = kernel regression observable by userspace

> It's the kernel that introduced this regression, not OpenSSH.
>
> I am not asking to read data before it has been produced. I am puzzled that despite knowing that the data exists, I can now be lied to when I try to retrieve it, when I wasn't before. We are talking about what is essentially a two-way pipe, not some network or serial connection with transmission delays userland has long experience in dealing with.
>
> These previously internal additional delays, that are now exposed to userland, are simply an implementation detail that userland did not, and should not, need to worry about.

Your mental model is that pseudo-terminals are a synchronous pipe, which
is not true.

But this argument is pointless because the regression needs to be fixed
regardless of the merits.

Regards,
Peter Hurley

>> This is just one of those unfortunate situations where userspace has come
>> to rely on an unspecified behavior because it worked.
>
> Whether the behaviour is specified or not is irrelevent. This simply means there is no standard to debunk the fact that the kernel's previous behaviour mimics that of other systems.
>
> So, how am I supposed to avoid these spurious EAGAINs and finally be allowed to read the data I know exists? How long do I have to wait? Do I have to run a calibration loop to figure that out? Why should I need to do that only on Linux?
>
> I don't know, but there's nonsense in here somewhere.
>
> Marc.

2015-12-18 14:34:47

by Marc Aurèle La France

[permalink] [raw]
Subject: Re: n_tty: Check the other end of pty pair before returning EAGAIN on a read()

On Fri, 11 Dec 2015, Peter Hurley wrote:
> On 12/11/2015 05:37 AM, Marc Aurele La France wrote:

>> I am not asking to read data before it has been produced. I am puzzled
>> that despite knowing that the data exists, I can now be lied to when I
>> try to retrieve it, when I wasn't before. We are talking about what is
>> essentially a two-way pipe, not some network or serial connection with
>> transmission delays userland has long experience in dealing with.

>> These previously internal additional delays, that are now exposed to
>> userland, are simply an implementation detail that userland did not,
>> and should not, need to worry about.
>
> Your mental model is that pseudo-terminals are a synchronous pipe, which
> is not true.
>
> But this argument is pointless because the regression needs to be fixed
> regardless of the merits.

Fair enough.

Anything new on this?

Thanks.

Marc.

2015-12-18 16:40:04

by Peter Hurley

[permalink] [raw]
Subject: Re: n_tty: Check the other end of pty pair before returning EAGAIN on a read()

Hi Marc,

On 12/18/2015 06:26 AM, Marc Aurele La France wrote:
> On Fri, 11 Dec 2015, Peter Hurley wrote:
>> On 12/11/2015 05:37 AM, Marc Aurele La France wrote:
>
>>> I am not asking to read data before it has been produced. I am puzzled
>>> that despite knowing that the data exists, I can now be lied to when I
>>> try to retrieve it, when I wasn't before. We are talking about what is
>>> essentially a two-way pipe, not some network or serial connection with
>>> transmission delays userland has long experience in dealing with.
>
>>> These previously internal additional delays, that are now exposed to
>>> userland, are simply an implementation detail that userland did not,
>>> and should not, need to worry about.
>>
>> Your mental model is that pseudo-terminals are a synchronous pipe, which
>> is not true.
>>
>> But this argument is pointless because the regression needs to be fixed
>> regardless of the merits.
>
> Fair enough.
>
> Anything new on this?

It's on my todo list.

While considering this issue further, I was curious what ssh does
regarding the entire foreground process group and its output?

If ssh only knows that the child has terminated, how does it wait
for the rest of the foreground process group's output since those
processes may not yet have received their SIGHUP/SIGCONT signals
yet?

Regards,
Peter Hurley

2015-12-18 17:24:12

by Marc Aurèle La France

[permalink] [raw]
Subject: Re: n_tty: Check the other end of pty pair before returning EAGAIN on a read()

On Fri, 18 Dec 2015, Peter Hurley wrote:
> On 12/18/2015 06:26 AM, Marc Aurele La France wrote:
>> On Fri, 11 Dec 2015, Peter Hurley wrote:
>>> On 12/11/2015 05:37 AM, Marc Aurele La France wrote:

>>>> I am not asking to read data before it has been produced. I am puzzled
>>>> that despite knowing that the data exists, I can now be lied to when I
>>>> try to retrieve it, when I wasn't before. We are talking about what is
>>>> essentially a two-way pipe, not some network or serial connection with
>>>> transmission delays userland has long experience in dealing with.

>>>> These previously internal additional delays, that are now exposed to
>>>> userland, are simply an implementation detail that userland did not,
>>>> and should not, need to worry about.

>>> Your mental model is that pseudo-terminals are a synchronous pipe, which
>>> is not true.

>>> But this argument is pointless because the regression needs to be fixed
>>> regardless of the merits.

>> Fair enough.

>> Anything new on this?

> It's on my todo list.

> While considering this issue further, I was curious what ssh does
> regarding the entire foreground process group and its output?

> If ssh only knows that the child has terminated, how does it wait
> for the rest of the foreground process group's output since those
> processes may not yet have received their SIGHUP/SIGCONT signals
> yet?

sshd cannot know about the termination of any process other than the
session leader because any of the session leader's children are
re-parented to init. The idea is to, at minimum, collect any output the
session leader might have left behind. Yes, this could entail also
collecting output from its children that might have squeaked in, but
that's gravy that can't be avoided.

This situation is much simpler on the *BSDs. There, both ends of the
pty pair are, in effect, completely closed after disassociation of either
end, preventing (with EIO) any further output (but still allowing data
already collected to be read, after which an EIO occurs). It's
unfortunate System V variants don't do this, but that's crying over spilt
milk.

Marc.