Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753956AbbLJO7i (ORCPT ); Thu, 10 Dec 2015 09:59:38 -0500 Received: from mail-pa0-f48.google.com ([209.85.220.48]:35591 "EHLO mail-pa0-f48.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753338AbbLJO7g (ORCPT ); Thu, 10 Dec 2015 09:59:36 -0500 Subject: Re: n_tty: Check the other end of pty pair before returning EAGAIN on a read() To: Marc Aurele La France , Greg Kroah-Hartman , Jiri Slaby , linux-kernel@vger.kernel.org References: Cc: Volth , Damien Miller From: Peter Hurley Message-ID: <56699356.8040802@hurleysoftware.com> Date: Thu, 10 Dec 2015 06:59:34 -0800 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.4.0 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 5348 Lines: 157 On 12/09/2015 01:06 PM, Marc Aurele La France wrote: > Greetings. > > The following four commits are some of the changes that have been made > to the tty layer since kernel version 3.11: > > 1) f95499c3030fe1bfad57745f2db1959c5b43dca8 > n_tty: Don't wait for buffer work in read() loop > > 2) f8747d4a466ab2cafe56112c51b3379f9fdb7a12 > tty: Fix pty master read() after slave closes > > 3) 52bce7f8d4fc633c9a9d0646eef58ba6ae9a3b73 > pty, n_tty: Simplify input processing on final close > > 4) 1a48632ffed61352a7810ce089dc5a8bcd505a60 > pty: Fix input race when closing > > Commit "4)" corrected an issue whereby EIO could be prematurely > returned on a read of one end of a master/slave pty pair after the > other had been completely closed. Yet, I would argue that EAGAIN > should not be returned either when there actually is data to be > returned. This whether or not the other end has been completely > closed. > > Indeed, the previous code (before commit "1)") checked the other end > of the pty pair for any data before returning EAGAIN. This mimics the > behaviour of other System-V variants (Solaris, AIX, etc.) ^^^^ What other SysV systems were tested? > in this > regard and ensured that EAGAIN really did mean no data was available > at the time of the call. > > Portable OpenSSH, since release 4.6p1 in 2007, relies on this being > the case and has been broken since commit "1)" introduced spurious > EAGAIN returns (i.e. as of 3.12 kernels). The scenario at hand is > as follows. > > After sshd has been SIGCHLD'ed about the shell's termination, it > continues to read the master pty until an error occurs. This error > will be EIO if no process has the slave pty open. Otherwise (for > example when the shell spawned long-running processes in the > background before terminating), that error is expected to be EAGAIN. > sshd cannot continue to read until an EIO in all cases, because doing > so causes the session to hang until all processes have closed the > slave pty, which is not the desired behaviour. Thus a spurious EAGAIN > return causes sshd to lose data, whether or not the slave pty is > completely closed. Ah, the games userspace will be up to :) > I've been using the following script to reproduce the problem. It > loops until the issue is detected. > > #! /bin/bash > > LOG=sshlog-`date "+%F.%T"` > > touch ${LOG} > > while test -z "`grep -n '^Connection' ${LOG} | grep -v '0:Connection'`" > do > ssh -p 22 -tt root@localhost \ > '/bin/bash -c "/bin/ping -c4 8.8.8.8"' 2>&1 | \ > tee -a ${LOG} > done > > It should be noted that the problem is extremely rare, but still > occurs, on real hardware. This bug is easier to replicate in a > virtual machine such as those that can be created using Google Cloud. > > The patch below is a suggested fix. It was developed using a 4.3.0 > kernel and should apply, modulo fuzz, to any release >= 4.0.5. My > suggested fix is modeled after commit "2)" mentionned above. Given > commit "2)" was later reworked by commit "3)", I fully expect my fix > to be reworked as well. > > I volunteer to backport the fix this ends up being to any stable > release >= 3.12 deemed needed. > > Please Reply-To-All. > > Thanks and have a great day. > > Marc. > > Reported-by: Volth > BugLink: https://bugzilla.mindrot.org/show_bug.cgi?id=52 > BugLink: https://bugzilla.mindrot.org/show_bug.cgi?id=2492 > Signed-off-by: Marc Aurele La France > > --- a/drivers/tty/n_tty.c > +++ a/drivers/tty/n_tty.c > @@ -2281,20 +2281,26 @@ static ssize_t n_tty_read(struct tty_struct *tty, struct file *file, > if (!timeout) > break; > if (file->f_flags & O_NONBLOCK) { > - retval = -EAGAIN; > - break; > - } > - if (signal_pending(current)) { > - retval = -ERESTARTSYS; > - break; > - } > - up_read(&tty->termios_rwsem); > + up_read(&tty->termios_rwsem); > + flush_work(&tty->port->buf.work); > + down_read(&tty->termios_rwsem); > + if (!input_available_p(tty, 0)) { > + retval = -EAGAIN; > + break; > + } > + } else { > + if (signal_pending(current)) { > + retval = -ERESTARTSYS; > + break; > + } > + up_read(&tty->termios_rwsem); No sense in doing this just for O_NONBLOCK; might as well do it before all the condition tests. Which renders the earlier fixes for the slave end closing superfluous, so might as well rip those out. n_tty_poll() will need to be fixed as well, because if one application used read() with O_NONBLOCK to expect to block until i/o became available, then I guarantee some other application uses poll() with no timeout for the same purpose. Regards, Peter Hurley > > - timeout = wait_woken(&wait, TASK_INTERRUPTIBLE, > - timeout); > + timeout = wait_woken(&wait, TASK_INTERRUPTIBLE, > + timeout); > > - down_read(&tty->termios_rwsem); > - continue; > + down_read(&tty->termios_rwsem); > + continue; > + } > } > > if (ldata->icanon && !L_EXTPROC(tty)) { > -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/