Date: Fri, 11 Dec 2015 06:37:05 -0700 (MST)
From: Marc Aurele La France
To: Peter Hurley
cc: Greg Kroah-Hartman, Jiri Slaby, linux-kernel@vger.kernel.org, Volth,
    Damien Miller
Subject: Re: n_tty: Check the other end of pty pair before returning EAGAIN
    on a read()
In-Reply-To: <566A13C2.7040803@hurleysoftware.com>
References: <56699356.8040802@hurleysoftware.com>
    <566A13C2.7040803@hurleysoftware.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, 10 Dec 2015, Peter Hurley wrote:
> On 12/10/2015 02:48 PM, Marc Aurele La France wrote:
>> On Thu, 10 Dec 2015, Peter Hurley wrote:
>>> On 12/09/2015 01:06 PM, Marc Aurele La France wrote:

>>>> After sshd has been SIGCHLD'ed about the shell's termination, it
>>>> continues to read the master pty until an error occurs. This error
>>>> will be EIO if no process has the slave pty open.
>>>> Otherwise (for
>>>> example when the shell spawned long-running processes in the
>>>> background before terminating), that error is expected to be EAGAIN.
>>>> sshd cannot continue to read until an EIO in all cases, because doing
>>>> so causes the session to hang until all processes have closed the
>>>> slave pty, which is not the desired behaviour. Thus a spurious EAGAIN
>>>> return causes sshd to lose data, whether or not the slave pty is
>>>> completely closed.

>>> Ah, the games userspace will be up to :)

>> Not really.

> Definitely.
> The idea that a read with O_NONBLOCK set should have synchronous behavior
> is ridiculous.

>> The fact different OSes behave differently in this regard can
>> hardly be said to be userland's fault. The lower the number of distinct
>> behaviours userland needs to deal with, the better. Furthermore, sshd
>> "knows" there should be data there, so it makes no sense to befuddle it
>> with false EAGAIN returns.

> But sshd doesn't "know". sshd "knows" the data has been sent and that's all.
> sshd is extrapolating from one known condition to another unknown condition,
> and assuming it "should" be that way because it has been.

> For example, try the same idea with real ttys on loopback. Wouldn't work,
> because it's asynchronous.

> The only reason this needs fixing is because it's a userspace regression.

It's the kernel that introduced this regression, not OpenSSH.

I am not asking to read data before it has been produced. I am puzzled
that despite knowing that the data exists, I can now be lied to when I try
to retrieve it, when I wasn't before.

We are talking about what is essentially a two-way pipe, not some network
or serial connection with transmission delays userland has long experience
in dealing with. These previously internal additional delays, that are
now exposed to userland, are simply an implementation detail that userland
did not, and should not, need to worry about.

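For illustration only, the read-until-error pattern under discussion might
look roughly like the sketch below. It is not OpenSSH's actual code: the
names drain_nonblocking() and set_nonblocking() are hypothetical, and the
buffer handling is simplified on purpose.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: switch a descriptor to non-blocking mode. */
static int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL);

    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/*
 * Hypothetical helper, not taken from OpenSSH: keep calling read() on a
 * non-blocking descriptor until it stops returning data.  Returns the
 * number of bytes stored in buf.  *why is set to 0 on end-of-file or a
 * full buffer, otherwise to the errno value that ended the loop.
 */
static size_t drain_nonblocking(int fd, char *buf, size_t cap, int *why)
{
    size_t total = 0;

    assert(buf != NULL && why != NULL);
    *why = 0;
    while (total < cap) {
        ssize_t n = read(fd, buf + total, cap - total);

        if (n > 0) {
            total += (size_t)n;
            continue;
        }
        if (n == 0)             /* end-of-file */
            break;
        if (errno == EINTR)     /* interrupted by a signal: retry */
            continue;
        *why = errno;           /* e.g. EIO or EAGAIN on a pty master */
        break;
    }
    return total;
}
```

On a pty master, per the description quoted above, *why would be expected
to come back as EIO once no process has the slave open and as EAGAIN
otherwise. The complaint here is about the case where EAGAIN ends this
loop while written data has not yet become readable.
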
> This is just one of those unfortunate situations where userspace has come
> to rely on an unspecified behavior because it worked.

Whether the behaviour is specified or not is irrelevant. This simply
means there is no standard to debunk the fact that the kernel's previous
behaviour mimics that of other systems.

So, how am I supposed to avoid these spurious EAGAINs and finally be
allowed to read the data I know exists? How long do I have to wait? Do I
have to run a calibration loop to figure that out? Why should I need to
do that only on Linux?

I don't know, but there's nonsense in here somewhere.

Marc.