2015-02-17 21:06:18

by Aristeu Rozanski

Subject: [PATCH] n_tty_read: check for hanging tty while waiting for input

If the console has a canonical reader and the respective tty hangs up,
the reader wastes a wakeup and never releases the last ldisc reference,
so the hangup process can never finish:

n_tty_read():
	(..)
	add_wait_queue(&tty->read_wait, &wait);
	while (nr) {
		(..)
		if (!input_available_p(tty, 0)) {
			if (test_bit(TTY_OTHER_CLOSED, &tty->flags)) {
				up_read(&tty->termios_rwsem);
				tty_flush_to_ldisc(tty);
				down_read(&tty->termios_rwsem);
				if (!input_available_p(tty, 0)) {
					retval = -EIO;
					break;
				}
			} else {
->				if (tty_hung_up_p(file))
					break;

this won't work because file->f_op never gets set to &hung_up_tty_fops:

__tty_hangup():

	spin_lock(&tty_files_lock);
	/* This breaks for file handles being sent over AF_UNIX sockets ? */
	list_for_each_entry(priv, &tty->tty_files, list) {
		filp = priv->file;
		if (filp->f_op->write == redirected_tty_write)
			cons_filp = filp;
->		if (filp->f_op->write != tty_write)
->			continue;
		closecount++;
		__tty_fasync(-1, filp, 0);	/* can't block */
->		filp->f_op = &hung_up_tty_fops;
	}
	spin_unlock(&tty_files_lock);

	refs = tty_signal_session_leader(tty, exit_session);
	/* Account for the p->signal references we killed */
	while (refs--)
		tty_kref_put(tty);

	/*
	 * it drops BTM and thus races with reopen
	 * we protect the race by TTY_HUPPING
	 */
->	tty_ldisc_hangup(tty);

So while the canonical read waits for input, it'll sleep, be awakened by
tty_ldisc_hangup() and then immediately go back to sleep without
dropping the reference to the ldisc gained in tty_read(). This isn't
noticeable in a non-canonical read because it'll eventually time out.

The proposed patch checks for TTY_HUPPING flag in order to leave if
there's no input.

This is easily reproduced by opening /dev/console (my test case was a
virtual machine with serial console), setting it to canonical mode and
waiting on a read(). Then, in another session, kill the agetty that is
running on ttyS0, which will issue a hangup.

[ 240.439045] INFO: task (agetty):1323 blocked for more than 120 seconds.
[ 240.439569] Not tainted 3.13.0-rc3+ #11
[ 240.439972] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 240.440596] (agetty) D ffff88007fd94440 0 1323 1 0x00000080
[ 240.441253] ffff88007bca1c50 0000000000000086 ffff88007989b0c0 0000000000014440
[ 240.441857] ffff88007bca1fd8 0000000000014440 ffff88007989b0c0 ffff88007989b0c0
[ 240.442561] ffff88007ad46c30 7fffffffffffffff 0000000000000001 ffff88007ad46c28
[ 240.443296] Call Trace:
[ 240.443506] [<ffffffff815c8c99>] schedule+0x29/0x70
[ 240.443883] [<ffffffff815c7f59>] schedule_timeout+0x209/0x2d0
[ 240.444395] [<ffffffff810974b5>] ? check_preempt_curr+0x85/0xa0
[ 240.444850] [<ffffffff810974e9>] ? ttwu_do_wakeup+0x19/0xd0
[ 240.445343] [<ffffffff8109764d>] ? ttwu_do_activate.constprop.80+0x5d/0x70
[ 240.445868] [<ffffffff810995eb>] ? try_to_wake_up+0xeb/0x2b0
[ 240.446363] [<ffffffff815cbdaa>] ldsem_down_write+0xda/0x227
[ 240.446797] [<ffffffff81099822>] ? default_wake_function+0x12/0x20
[ 240.447359] [<ffffffff815cc43d>] tty_ldisc_lock_pair_timeout+0x7d/0x100
[ 240.447861] [<ffffffff8136e519>] tty_ldisc_hangup+0xc9/0x220
[ 240.448355] [<ffffffff81365463>] __tty_hangup+0x363/0x4b0
[ 240.448768] [<ffffffff81367cc5>] tty_ioctl+0x865/0xbb0
[ 240.449219] [<ffffffff811bb52a>] ? do_filp_open+0x3a/0x90
[ 240.449634] [<ffffffff811bd900>] do_vfs_ioctl+0x2e0/0x4c0
[ 240.450066] [<ffffffff8124ea76>] ? file_has_perm+0x86/0xa0
[ 240.450543] [<ffffffff811bdb61>] SyS_ioctl+0x81/0xa0
[ 240.450921] [<ffffffff815d4b69>] system_call_fastpath+0x16/0x1b

Cc: Greg Kroah-Hartman <[email protected]>
Cc: Jiri Slaby <[email protected]>
Cc: Peter Hurley <[email protected]>
Signed-off-by: Aristeu Rozanski <[email protected]>

diff --git a/drivers/tty/n_tty.c b/drivers/tty/n_tty.c
index 0f74945..4fb909d 100644
--- a/drivers/tty/n_tty.c
+++ b/drivers/tty/n_tty.c
@@ -2189,6 +2189,8 @@ static ssize_t n_tty_read(struct tty_struct *tty, struct file *file,
 			} else {
 				if (tty_hung_up_p(file))
 					break;
+				if (test_bit(TTY_HUPPING, &tty->flags))
+					break;
 				if (!timeout)
 					break;
 				if (file->f_flags & O_NONBLOCK) {


2015-02-17 21:28:37

by Peter Hurley

Subject: Re: [PATCH] n_tty_read: check for hanging tty while waiting for input

On 02/17/2015 04:06 PM, Aristeu Rozanski wrote:
> If the console has a canonical reader and the respective tty hangs up,
> the reader wastes a wakeup and never releases the last ldisc reference,
> so the hangup process can never finish:

This behavior is by-design; /dev/console cannot be hung-up.


> n_tty_read():
> 	(..)
> 	add_wait_queue(&tty->read_wait, &wait);
> 	while (nr) {
> 		(..)
> 		if (!input_available_p(tty, 0)) {
> 			if (test_bit(TTY_OTHER_CLOSED, &tty->flags)) {
> 				up_read(&tty->termios_rwsem);
> 				tty_flush_to_ldisc(tty);
> 				down_read(&tty->termios_rwsem);
> 				if (!input_available_p(tty, 0)) {
> 					retval = -EIO;
> 					break;
> 				}
> 			} else {
> ->				if (tty_hung_up_p(file))
> 					break;
>
> this won't work because file->f_op never gets set to &hung_up_tty_fops:
>
> __tty_hangup():
>
> 	spin_lock(&tty_files_lock);
> 	/* This breaks for file handles being sent over AF_UNIX sockets ? */
> 	list_for_each_entry(priv, &tty->tty_files, list) {
> 		filp = priv->file;
> 		if (filp->f_op->write == redirected_tty_write)
> 			cons_filp = filp;
> ->		if (filp->f_op->write != tty_write)
> ->			continue;
> 		closecount++;
> 		__tty_fasync(-1, filp, 0);	/* can't block */
> ->		filp->f_op = &hung_up_tty_fops;
> 	}
> 	spin_unlock(&tty_files_lock);
>
> 	refs = tty_signal_session_leader(tty, exit_session);
> 	/* Account for the p->signal references we killed */
> 	while (refs--)
> 		tty_kref_put(tty);
>
> 	/*
> 	 * it drops BTM and thus races with reopen
> 	 * we protect the race by TTY_HUPPING
> 	 */
> ->	tty_ldisc_hangup(tty);
>
> So while the canonical read waits for input, it'll sleep, be awakened by
> tty_ldisc_hangup() and then immediately go back to sleep without
> dropping the reference to the ldisc gained in tty_read(). This isn't
> noticeable in a non-canonical read because it'll eventually time out.
>
> The proposed patch checks for TTY_HUPPING flag in order to leave if
> there's no input.
>
> This is easily reproduced by opening /dev/console (my test case was a
> virtual machine with serial console), setting it to canonical mode and
> waiting on a read(). Then, in another session, kill the agetty that is
> running on ttyS0, which will issue a hangup.

What process is sleeping on /dev/console read() and what is its controlling
tty? I ask because console teardown usually happens when SIGHUP is
received by the process group.


> [ 240.439045] INFO: task (agetty):1323 blocked for more than 120 seconds.
> [ 240.439569] Not tainted 3.13.0-rc3+ #11
> [ 240.439972] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [ 240.440596] (agetty) D ffff88007fd94440 0 1323 1 0x00000080
> [ 240.441253] ffff88007bca1c50 0000000000000086 ffff88007989b0c0 0000000000014440
> [ 240.441857] ffff88007bca1fd8 0000000000014440 ffff88007989b0c0 ffff88007989b0c0
> [ 240.442561] ffff88007ad46c30 7fffffffffffffff 0000000000000001 ffff88007ad46c28
> [ 240.443296] Call Trace:
> [ 240.443506] [<ffffffff815c8c99>] schedule+0x29/0x70
> [ 240.443883] [<ffffffff815c7f59>] schedule_timeout+0x209/0x2d0
> [ 240.444395] [<ffffffff810974b5>] ? check_preempt_curr+0x85/0xa0
> [ 240.444850] [<ffffffff810974e9>] ? ttwu_do_wakeup+0x19/0xd0
> [ 240.445343] [<ffffffff8109764d>] ? ttwu_do_activate.constprop.80+0x5d/0x70
> [ 240.445868] [<ffffffff810995eb>] ? try_to_wake_up+0xeb/0x2b0
> [ 240.446363] [<ffffffff815cbdaa>] ldsem_down_write+0xda/0x227
> [ 240.446797] [<ffffffff81099822>] ? default_wake_function+0x12/0x20
> [ 240.447359] [<ffffffff815cc43d>] tty_ldisc_lock_pair_timeout+0x7d/0x100
> [ 240.447861] [<ffffffff8136e519>] tty_ldisc_hangup+0xc9/0x220
> [ 240.448355] [<ffffffff81365463>] __tty_hangup+0x363/0x4b0
> [ 240.448768] [<ffffffff81367cc5>] tty_ioctl+0x865/0xbb0
> [ 240.449219] [<ffffffff811bb52a>] ? do_filp_open+0x3a/0x90
> [ 240.449634] [<ffffffff811bd900>] do_vfs_ioctl+0x2e0/0x4c0
> [ 240.450066] [<ffffffff8124ea76>] ? file_has_perm+0x86/0xa0
> [ 240.450543] [<ffffffff811bdb61>] SyS_ioctl+0x81/0xa0
> [ 240.450921] [<ffffffff815d4b69>] system_call_fastpath+0x16/0x1b
>
> Cc: Greg Kroah-Hartman <[email protected]>
> Cc: Jiri Slaby <[email protected]>
> Cc: Peter Hurley <[email protected]>
> Signed-off-by: Aristeu Rozanski <[email protected]>
>
> diff --git a/drivers/tty/n_tty.c b/drivers/tty/n_tty.c
> index 0f74945..4fb909d 100644
> --- a/drivers/tty/n_tty.c
> +++ b/drivers/tty/n_tty.c
> @@ -2189,6 +2189,8 @@ static ssize_t n_tty_read(struct tty_struct *tty, struct file *file,
>  			} else {
>  				if (tty_hung_up_p(file))
>  					break;
> +				if (test_bit(TTY_HUPPING, &tty->flags))
> +					break;
>  				if (!timeout)
>  					break;
>  				if (file->f_flags & O_NONBLOCK) {
>

2015-02-17 21:50:55

by Aristeu Rozanski

Subject: Re: [PATCH] n_tty_read: check for hanging tty while waiting for input

Hi Peter,
On Tue, Feb 17, 2015 at 04:28:30PM -0500, Peter Hurley wrote:
> On 02/17/2015 04:06 PM, Aristeu Rozanski wrote:
> > If the console has a canonical reader and the respective tty hangs up,
> > the reader wastes a wakeup and never releases the last ldisc reference,
> > so the hangup process can never finish:
>
> This behavior is by-design; /dev/console cannot be hung-up.

hangup is issued on the tty that happens to be the console. In this
case, ttyS0.

> What process is sleeping on /dev/console read() and what is its controlling
> tty? I ask because console teardown usually happens when SIGHUP is
> received by the process group.

ttyS0 is the controlling tty.

--
Aristeu

2015-02-17 22:35:20

by Peter Hurley

Subject: Re: [PATCH] n_tty_read: check for hanging tty while waiting for input

Hi Aristeu,

On 02/17/2015 04:50 PM, Aristeu Rozanski wrote:
> Hi Peter,
> On Tue, Feb 17, 2015 at 04:28:30PM -0500, Peter Hurley wrote:
>> On 02/17/2015 04:06 PM, Aristeu Rozanski wrote:
>>> If the console has a canonical reader and the respective tty hangs up,
>>> the reader wastes a wakeup and never releases the last ldisc reference,
>>> so the hangup process can never finish:
>>
>> This behavior is by-design; /dev/console cannot be hung-up.
>
> hangup is issued on the tty that happens to be the console. In this
> case, ttyS0.

I realize that. But hanging up the tty that is /dev/console only affects
open descriptors that are not /dev/console.

So readers using the /dev/ttyS0 file descriptor will see a hungup fops,
but readers using /dev/console will not, and /dev/ttyS0 will _not_
be closed or released because of the still-open descriptor on /dev/console.

>> What process is sleeping on /dev/console read() and what is its controlling
>> tty? I ask because console teardown usually happens when SIGHUP is
>> received by the process group.
>
> ttyS0 is the controlling tty.

Ok, so the process sleeping on /dev/console read() should have received
SIGHUP, which would wake the process and cause it to exit the
n_tty_read() loop, thus dropping the ldisc reference it holds.
Did it ignore the signal or perhaps the signal is masked?

Of course, there is no requirement for the process sleeping on /dev/console
to respond to SIGHUP, in which case, the hangup simply fails to make
forward progress because of the open /dev/console descriptor.

Regards,
Peter Hurley

2015-02-18 14:58:17

by Aristeu Rozanski

Subject: Re: [PATCH] n_tty_read: check for hanging tty while waiting for input

Hi Peter,
On Tue, Feb 17, 2015 at 05:35:10PM -0500, Peter Hurley wrote:
> I realize that. But hanging up the tty that is /dev/console only affects
> open descriptors that are not /dev/console.
>
> So readers using the /dev/ttyS0 file descriptor will see a hungup fops,
> but readers using /dev/console will not, and /dev/ttyS0 will _not_
> be closed or released because of the still-open descriptor on /dev/console.

I see.

> Ok, so the process sleeping on /dev/console read() should have received
> SIGHUP, which would wake the process and cause it to exit the
> n_tty_read() loop, thus dropping the ldisc reference it holds.
> Did it ignore the signal or perhaps the signal is masked?

Not masked on the test case (attached). Sent sighup manually and it did
receive it.

--
Aristeu

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <termios.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/wait.h>
#include <fcntl.h>
#include <sys/ioctl.h>

static char *default_console = "/dev/console";
static char *default_tty = "/dev/ttyS0";
struct data {
	char *console;
	char *tty;
};

static void *reader(void *d)
{
	struct data *data = (struct data *)d;
	struct termios old_termio;
	char buff[512];
	int fd = -1, rc;

	while (1) {
		if (fd == -1) {
			fd = open(data->console, O_RDWR);
			if (fd < 0)
				exit(1);
			if (tcgetattr(fd, &old_termio) == -1)
				exit(1);
			old_termio.c_lflag = ICANON;
			if (tcsetattr(fd, 0, &old_termio) == -1)
				exit(1);
		}
		rc = read(fd, buff, sizeof(buff));
		if (rc < 0 && errno == EAGAIN)
			continue;
		close(fd);
		fd = -1;
	}
}

void launch(void *(*fn)(void *), struct data *data)
{
	if (fork() == 0)
		fn(data);
}

int main(int argc, char *argv[])
{
	struct data data;
	int fd;

	fd = open("/dev/ttyS0", O_RDWR);
	close(0);
	close(1);
	close(2);
	ioctl(fd, TIOCSCTTY, 1);

	data.console = default_console;
	data.tty = default_tty;

	launch(reader, &data);

	waitpid(-1, NULL, 0);

	return 0;
}

2015-02-18 15:40:17

by Peter Hurley

Subject: Re: [PATCH] n_tty_read: check for hanging tty while waiting for input

On 02/18/2015 09:58 AM, Aristeu Rozanski wrote:
> Hi Peter,
> On Tue, Feb 17, 2015 at 05:35:10PM -0500, Peter Hurley wrote:
>> I realize that. But hanging up the tty that is /dev/console only affects
>> open descriptors that are not /dev/console.
>>
>> So readers using the /dev/ttyS0 file descriptor will see a hungup fops,
>> but readers using /dev/console will not, and /dev/ttyS0 will _not_
>> be closed or released because of the still-open descriptor on /dev/console.
>
> I see.
>
>> Ok, so the process sleeping on /dev/console read() should have received
>> SIGHUP, which would wake the process and cause it to exit the
>> n_tty_read() loop, thus dropping the ldisc reference it holds.
>> Did it ignore the signal or perhaps the signal is masked?
>
> Not masked on the test case (attached). Sent sighup manually and it did
> receive it.
>

The child is not receiving SIGHUP because /dev/ttyS0 was not set as the
controlling terminal by ioctl(TIOCSCTTY), which is failing (probably
with errno == EPERM). You need to check the return value and errno.

To set the controlling tty, the calling process must be a session leader;
i.e., have called setsid() before ioctl(TIOCSCTTY). Check the return value
for that too.

FWIW, the idiom for starting a session leader is for the parent to
fork a child and exit and for the child to become the session leader with
setsid() and establish its controlling tty either with ioctl(TIOCSCTTY)
or simply opening the first tty.

The reason for this idiom is that setsid() will fail for an existing
group leader (because otherwise a group leader could abandon existing
members of its process group, leaving them without a group leader in
a different session).
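
A minimal sketch of that idiom follows. It is not taken from any
existing program: the helper names acquire_ctty(), spawn_session_leader()
and is_session_leader() are made up for illustration, and the parent is
left free to either exit or wait for the child.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: open tty_path and make it the controlling tty.
 * Must be called by a session leader with no controlling tty yet.
 * Returns the open fd, or -1 with errno set.
 */
static int acquire_ctty(const char *tty_path)
{
	int fd = open(tty_path, O_RDWR | O_NOCTTY);

	if (fd < 0)
		return -1;
	if (ioctl(fd, TIOCSCTTY, 0) < 0) {
		int saved = errno;	/* close() may clobber errno */

		close(fd);
		errno = saved;
		return -1;
	}
	return fd;
}

/* Example callback: status 0 if the caller really is a session leader. */
static int is_session_leader(void *unused)
{
	(void)unused;
	return getsid(0) == getpid() ? 0 : 1;
}

/*
 * Hypothetical helper: fork a child that becomes a session leader and
 * runs fn(arg); the child's exit status is fn's return value.
 * The parent gets the child's pid back, or -1 if fork() failed.
 */
static pid_t spawn_session_leader(int (*fn)(void *), void *arg)
{
	pid_t pid = fork();

	if (pid != 0)
		return pid;

	/* A freshly forked child is never a process group leader. */
	if (setsid() < 0) {
		fprintf(stderr, "setsid: %s\n", strerror(errno));
		_exit(127);
	}
	_exit(fn(arg));
}
```

The function passed to spawn_session_leader() would typically call
acquire_ctty("/dev/ttyS0") first and report errno if it returns -1,
which is where a failure such as EPERM would become visible.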

I highly recommend Ch 34 of Michael Kerrisk's book, "The Linux Programming
Interface", especially if this is not a toy project.

Regards,
Peter Hurley

> #include <stdio.h>
> #include <stdlib.h>
> #include <unistd.h>
> #include <termios.h>
> #include <errno.h>
> #include <signal.h>
> #include <sys/types.h>
> #include <sys/stat.h>
> #include <sys/wait.h>
> #include <fcntl.h>
> #include <sys/ioctl.h>
>
> static char *default_console = "/dev/console";
> static char *default_tty = "/dev/ttyS0";
> struct data {
> 	char *console;
> 	char *tty;
> };
>
> static void *reader(void *d)
> {
> 	struct data *data = (struct data *)d;
> 	struct termios old_termio;
> 	char buff[512];
> 	int fd = -1, rc;
>
> 	while (1) {
> 		if (fd == -1) {
> 			fd = open(data->console, O_RDWR);
> 			if (fd < 0)
> 				exit(1);
> 			if (tcgetattr(fd, &old_termio) == -1)
> 				exit(1);
> 			old_termio.c_lflag = ICANON;
> 			if (tcsetattr(fd, 0, &old_termio) == -1)
> 				exit(1);
> 		}
> 		rc = read(fd, buff, sizeof(buff));
> 		if (rc < 0 && errno == EAGAIN)
> 			continue;
> 		close(fd);
> 		fd = -1;
> 	}
> }
>
> void launch(void *(*fn)(void *), struct data *data)
> {
> 	if (fork() == 0)
> 		fn(data);
> }
>
> int main(int argc, char *argv[])
> {
> 	struct data data;
> 	int fd;
>
> 	fd = open("/dev/ttyS0", O_RDWR);
> 	close(0);
> 	close(1);
> 	close(2);
> 	ioctl(fd, TIOCSCTTY, 1);
>
> 	data.console = default_console;
> 	data.tty = default_tty;
>
> 	launch(reader, &data);
>
> 	waitpid(-1, NULL, 0);
>
> 	return 0;
> }
>

2015-02-18 15:56:38

by Aristeu Rozanski

Subject: Re: [PATCH] n_tty_read: check for hanging tty while waiting for input

Hi Peter,
On Wed, Feb 18, 2015 at 10:40:10AM -0500, Peter Hurley wrote:
> The child is not receiving SIGHUP because /dev/ttyS0 was not set as the
> controlling terminal by ioctl(TIOCSCTTY), which is failing (probably
> with errno == EPERM). You need to check the return value and errno.
>
> To set the controlling tty, the calling process must be a session leader;
> i.e., have called setsid() before ioctl(TIOCSCTTY). Check the return value
> for that too.
>
> FWIW, the idiom for starting a session leader is for the parent to
> fork a child and exit and for the child to become the session leader with
> setsid() and establish its controlling tty either with ioctl(TIOCSCTTY)
> or simply opening the first tty.
>
> The reason for this idiom is that setsid() will fail for an existing
> group leader (because otherwise a group leader could abandon existing
> members of its process group, leaving them without a group leader in
> a different session).
>
> I highly recommend Ch 34 of Michael Kerrisk's book, "The Linux Programming
> Interface", especially if this is not a toy project.

Actually wrote this trying to reproduce a problem a customer is seeing
in a commercial application, but clearly I need to read it. Specifically
about the console behavior, would you recommend the same book? Every
time I need details like this I fail to find any reference online.
Thanks for your help

--
Aristeu

2015-02-18 16:39:03

by Peter Hurley

Subject: Re: [PATCH] n_tty_read: check for hanging tty while waiting for input

On 02/18/2015 10:56 AM, Aristeu Rozanski wrote:
> Hi Peter,
> On Wed, Feb 18, 2015 at 10:40:10AM -0500, Peter Hurley wrote:
>> The child is not receiving SIGHUP because /dev/ttyS0 was not set as the
>> controlling terminal by ioctl(TIOCSCTTY), which is failing (probably
>> with errno == EPERM). You need to check the return value and errno.
>>
>> To set the controlling tty, the calling process must be a session leader;
>> i.e., have called setsid() before ioctl(TIOCSCTTY). Check the return value
>> for that too.
>>
>> FWIW, the idiom for starting a session leader is for the parent to
>> fork a child and exit and for the child to become the session leader with
>> setsid() and establish its controlling tty either with ioctl(TIOCSCTTY)
>> or simply opening the first tty.
>>
>> The reason for this idiom is that setsid() will fail for an existing
>> group leader (because otherwise a group leader could abandon existing
>> members of its process group, leaving them without a group leader in
>> a different session).
>>
>> I highly recommend Ch 34 of Michael Kerrisk's book, "The Linux Programming
>> Interface", especially if this is not a toy project.
>
> Actually wrote this trying to reproduce a problem a customer is seeing
> in a commercial application, but clearly I need to read it. Specifically
> about the console behavior, would you recommend the same book?

Ch 34 is about controlling ttys and job control, so it covers the two-tier
process hierarchy, foreground and background process groups, and job control
signals. Personally, I think the book is invaluable.

Unfortunately, consoles are not well documented anywhere.

> Every time I need details like this I fail to find any reference online.
> Thanks for your help

No problem.

Regards,
Peter Hurley