Date: Fri, 11 Dec 2015 06:37:05 -0700 (MST)
From: Marc Aurele La France <tsi@tuyoix.net>
To: Peter Hurley <peter@hurleysoftware.com>
cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
        Jiri Slaby <jslaby@suse.com>, linux-kernel@vger.kernel.org,
        Volth <openssh@volth.com>, Damien Miller <djm@mindrot.org>
Subject: Re: n_tty: Check the other end of pty pair before returning EAGAIN
 on a read()
In-Reply-To: <566A13C2.7040803@hurleysoftware.com>
Message-ID: <alpine.LNX.2.00.1512110557160.9874@fanir.tuyoix.net>
References: <alpine.LNX.2.00.1512091358290.9574@fanir.tuyoix.net> <56699356.8040802@hurleysoftware.com> <alpine.LNX.2.00.1512101504070.6038@fanir.tuyoix.net> <566A13C2.7040803@hurleysoftware.com>
User-Agent: Alpine 2.00 (LNX 1167 2008-08-23)
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed

On Thu, 10 Dec 2015, Peter Hurley wrote:
> On 12/10/2015 02:48 PM, Marc Aurele La France wrote:
>> On Thu, 10 Dec 2015, Peter Hurley wrote:
>>> On 12/09/2015 01:06 PM, Marc Aurele La France wrote:

>>>> After sshd has been SIGCHLD'ed about the shell's termination, it
>>>> continues to read the master pty until an error occurs.  This error
>>>> will be EIO if no process has the slave pty open.  Otherwise (for
>>>> example when the shell spawned long-running processes in the
>>>> background before terminating), that error is expected to be EAGAIN.
>>>> sshd cannot continue to read until an EIO in all cases, because doing
>>>> so causes the session to hang until all processes have closed the
>>>> slave pty, which is not the desired behaviour.  Thus a spurious EAGAIN
>>>> return causes sshd to lose data, whether or not the slave pty is
>>>> completely closed.

>>> Ah, the games userspace will be up to :)

>> Not really.

> Definitely.

> The idea that a read with O_NONBLOCK set should have synchronous behavior
> is ridiculous.

>> The fact different OSes behave differently in this regard can
>> hardly be said to be userland's fault.  The lower the number of distinct
>> behaviours userland needs to deal with, the better.  Furthermore, sshd
>> "knows" there should be data there, so it makes no sense to befuddle it
>> with false EAGAIN returns.

> But sshd doesn't "know". sshd "knows" the data has been sent and that's all.
> sshd is extrapolating from one known condition to another unknown condition,
> and assuming it "should" be that way because it has been.

> For example, try the same idea with real ttys on loopback. Wouldn't work,
> because it's asynchronous.

> The only reason this needs fixing is because it's a userspace regression.

It's the kernel that introduced this regression, not OpenSSH.

I am not asking to read data before it has been produced.  I am puzzled 
that despite knowing that the data exists, I can now be lied to when I 
try to retrieve it, when I wasn't before.  We are talking about what is 
essentially a two-way pipe, not some network or serial connection with 
transmission delays userland has long experience in dealing with.

These additional delays, previously internal and now exposed to
userland, are simply an implementation detail that userland did not, and
should not, need to worry about.

> This is just one of those unfortunate situations where userspace has come
> to rely on an unspecified behavior because it worked.

Whether the behaviour is specified or not is irrelevant.  This simply
means there is no standard to debunk the fact that the kernel's previous
behaviour mimicked that of other systems.

So, how am I supposed to avoid these spurious EAGAINs and finally be 
allowed to read the data I know exists?  How long do I have to wait?  Do I 
have to run a calibration loop to figure that out?  Why should I need to 
do that only on Linux?
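
To make the question concrete, here is a minimal sketch (Python for
brevity, not sshd's actual code) of a drain loop on a non-blocking
master pty.  The names drain_master and idle_timeout are made up for
illustration, and the 0.2 second default is an arbitrary guess, which is
exactly the problem: nothing tells userland what value is long enough.

```python
import errno
import os
import select


def drain_master(master_fd, idle_timeout=0.2):
    """Read a non-blocking pty master until EIO, EOF, or an idle period.

    Returns (data, reason), where reason is "EIO", "EOF" or "EAGAIN".
    idle_timeout is a hypothetical knob: how long to keep waiting after
    an EAGAIN before concluding that no more data is coming.
    """
    chunks = []
    while True:
        try:
            data = os.read(master_fd, 4096)
        except BlockingIOError:
            # EAGAIN: nothing readable right now.  Wait a little in case
            # data written to the slave has not yet reached the master.
            ready, _, _ = select.select([master_fd], [], [], idle_timeout)
            if not ready:
                return b"".join(chunks), "EAGAIN"
            continue
        except OSError as exc:
            if exc.errno == errno.EIO:
                # Linux reports EIO once no process has the slave open.
                return b"".join(chunks), "EIO"
            raise
        if not data:
            # Some other systems signal the same condition with EOF.
            return b"".join(chunks), "EOF"
        chunks.append(data)
```

With the slave still held open by some process, this returns whatever
was read along with "EAGAIN" once the idle period expires; with every
slave descriptor closed, Linux is expected to end the loop with EIO.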

I don't know, but there's nonsense in here somewhere.

Marc.