2020-11-24 00:51:22

by Corey Minyard

[permalink] [raw]
Subject: [PATCH 0/2] drivers:tty:pty: Fix a race causing data loss on close

I finally got some time to spend with this issue, and I think I have a
good fix. Not really a v2, this is a completely different fix.
Basically, calling tty_vhangup() on the slave when closing a pty master
is just a bad idea.

From what I can tell, the tty_vhangup() is there in case there was a
console redirect to the pty slave. When you close the master, you need
to release the redirect. I didn't see another reason for tty_vhangup().
So this has two parts, export the release of the tty redirect release
for the pty code to use, and call it from the pty code.

With this change, everything seems to work ok and no data is lost on the
slave side if the master closes with outstanding data. I have updated
my test program to check that all written data is read, it's available
at https://sourceforge.net/projects/ser2net/files/tmp/testpty.c/download

But, this code is quite intricate and I certainly may have missed
something.

-corey



2020-11-24 00:53:12

by Corey Minyard

[permalink] [raw]
Subject: [PATCH 1/2] tty: Export redirect release

From: Corey Minyard <[email protected]>

This will be required by the pty code when it removes tty_vhangup() on
master close.

Signed-off-by: Corey Minyard <[email protected]>
---
drivers/tty/tty_io.c | 32 ++++++++++++++++++++++++--------
include/linux/tty.h | 1 +
2 files changed, 25 insertions(+), 8 deletions(-)

diff --git a/drivers/tty/tty_io.c b/drivers/tty/tty_io.c
index 7a4c02548fb3..571b1d7d4d5a 100644
--- a/drivers/tty/tty_io.c
+++ b/drivers/tty/tty_io.c
@@ -542,6 +542,28 @@ void tty_wakeup(struct tty_struct *tty)

EXPORT_SYMBOL_GPL(tty_wakeup);

+/**
+ * tty_release_redirect - Release a redirect on a pty if present
+ * @tty: tty device
+ *
+ * This is available to the pty code so if the master closes, if the
+ * slave is a redirect it can release the redirect.
+ */
+struct file *tty_release_redirect(struct tty_struct *tty)
+{
+ struct file *f = NULL;
+
+ spin_lock(&redirect_lock);
+ if (redirect && file_tty(redirect) == tty) {
+ f = redirect;
+ redirect = NULL;
+ }
+ spin_unlock(&redirect_lock);
+
+ return f;
+}
+EXPORT_SYMBOL_GPL(tty_release_redirect);
+
/**
* __tty_hangup - actual handler for hangup events
* @tty: tty device
@@ -567,7 +589,7 @@ EXPORT_SYMBOL_GPL(tty_wakeup);
static void __tty_hangup(struct tty_struct *tty, int exit_session)
{
struct file *cons_filp = NULL;
- struct file *filp, *f = NULL;
+ struct file *filp, *f;
struct tty_file_private *priv;
int closecount = 0, n;
int refs;
@@ -575,13 +597,7 @@ static void __tty_hangup(struct tty_struct *tty, int exit_session)
if (!tty)
return;

-
- spin_lock(&redirect_lock);
- if (redirect && file_tty(redirect) == tty) {
- f = redirect;
- redirect = NULL;
- }
- spin_unlock(&redirect_lock);
+ f = tty_release_redirect(tty);

tty_lock(tty);

diff --git a/include/linux/tty.h b/include/linux/tty.h
index a99e9b8e4e31..d98319c5d195 100644
--- a/include/linux/tty.h
+++ b/include/linux/tty.h
@@ -418,6 +418,7 @@ extern void tty_kclose(struct tty_struct *tty);
extern int tty_dev_name_to_number(const char *name, dev_t *number);
extern int tty_ldisc_lock(struct tty_struct *tty, unsigned long timeout);
extern void tty_ldisc_unlock(struct tty_struct *tty);
+extern struct file *tty_release_redirect(struct tty_struct *tty);
#else
static inline void tty_kref_put(struct tty_struct *tty)
{ }
--
2.25.1

2020-11-24 19:46:32

by Corey Minyard

[permalink] [raw]
Subject: [PATCH 2/2] drivers:tty:pty: Fix a race causing data loss on close

From: Corey Minyard <[email protected]>

Remove the tty_vhangup() from the pty code and just release the
redirect. The tty_vhangup() results in data loss and data out of order
issues.

If you write to a pty master an immediately close the pty master, the
receiver might get a chunk of data dropped, but then receive some later
data. That's obviously something rather unexpected for a user. It
certainly confused my test program.

It turns out that tty_vhangup() on the slave pty gets called from
pty_close(), and that causes the data on the slave side to be flushed,
but due to races more data can be copied into the slave side's buffer
after that. Consider the following sequence:

thread1 thread2 thread3
------- ------- -------
| |-write data into buffer,
| | n_tty buffer is filled
| | along with other buffers
| |-pty_close(master)
| |--tty_vhangup(slave)
| |---tty_ldisc_hangup()
| |----n_tty_flush_buffer()
| |-----reset_buffer_flags()
|-n_tty_read() |
|--up_read(&tty->termios_rwsem);
| |------down_read(&tty->termios_rwsem)
| |------clear n_tty buffer contents
| |------up_read(&tty->termios_rwsem)
|--tty_buffer_flush_work() |
|--schedules work calling |
| flush_to_ldisc() |
| |-flush_to_ldisc()
| |--receive_buf()
| |---tty_port_default_receive_buf()
| |----tty_ldisc_receive_buf()
| |-----n_tty_receive_buf2()
| |------n_tty_receive_buf_common()
| |-------down_read(&tty->termios_rwsem)
| |-------__receive_buf()
| | copies data into n_tty buffer
| |-------up_read(&tty->termios_rwsem)
|--down_read(&tty->termios_rwsem)
|--copy buffer data to user

From this sequence, you can see that thread2 writes to the buffer then
only clears the part of the buffer in n_tty. The n_tty receive buffer
code then copies more data into the n_tty buffer.

But part of the vhangup, releasing the redirect, is still required to
avoid issues with consoles running on pty slaves. So do that.
As far as I can tell, that is all that should be required.

Signed-off-by: Corey Minyard <[email protected]>
---
drivers/tty/pty.c | 15 +++++++++++++--
drivers/tty/tty_io.c | 5 +++--
2 files changed, 16 insertions(+), 4 deletions(-)

diff --git a/drivers/tty/pty.c b/drivers/tty/pty.c
index 23368cec7ee8..29be6b985e76 100644
--- a/drivers/tty/pty.c
+++ b/drivers/tty/pty.c
@@ -67,7 +67,8 @@ static void pty_close(struct tty_struct *tty, struct file *filp)
wake_up_interruptible(&tty->link->read_wait);
wake_up_interruptible(&tty->link->write_wait);
if (tty->driver->subtype == PTY_TYPE_MASTER) {
- set_bit(TTY_OTHER_CLOSED, &tty->flags);
+ struct file *f;
+
#ifdef CONFIG_UNIX98_PTYS
if (tty->driver == ptm_driver) {
mutex_lock(&devpts_mutex);
@@ -76,7 +77,17 @@ static void pty_close(struct tty_struct *tty, struct file *filp)
mutex_unlock(&devpts_mutex);
}
#endif
- tty_vhangup(tty->link);
+
+ /*
+ * This hack is required because a program can open a
+ * pty and redirect a console to it, but if the pty is
+ * closed and the console is not released, then the
+ * slave side will never close. So release the
+ * redirect when the master closes.
+ */
+ f = tty_release_redirect(tty->link);
+ if (f)
+ fput(f);
}
}

diff --git a/drivers/tty/tty_io.c b/drivers/tty/tty_io.c
index 571b1d7d4d5a..91c33a0df3c4 100644
--- a/drivers/tty/tty_io.c
+++ b/drivers/tty/tty_io.c
@@ -547,7 +547,9 @@ EXPORT_SYMBOL_GPL(tty_wakeup);
* @tty: tty device
*
* This is available to the pty code so if the master closes, if the
- * slave is a redirect it can release the redirect.
+ * slave is a redirect it can release the redirect. It returns the
+ * filp for the redirect, which must be fput when the operations on
+ * the tty are completed.
*/
struct file *tty_release_redirect(struct tty_struct *tty)
{
@@ -562,7 +564,6 @@ struct file *tty_release_redirect(struct tty_struct *tty)

return f;
}
-EXPORT_SYMBOL_GPL(tty_release_redirect);

/**
* __tty_hangup - actual handler for hangup events
--
2.25.1

2021-01-02 14:46:27

by Corey Minyard

[permalink] [raw]
Subject: Re: [PATCH 2/2] drivers:tty:pty: Fix a race causing data loss on close

On Mon, Nov 23, 2020 at 06:49:02PM -0600, [email protected] wrote:
> From: Corey Minyard <[email protected]>
>
> Remove the tty_vhangup() from the pty code and just release the
> redirect. The tty_vhangup() results in data loss and data out of order
> issues.

It's been a while, so ping on this. I'm pretty sure this is the right
fix, the more I've thought about it.

Thankks,

-corey

>
> If you write to a pty master an immediately close the pty master, the
> receiver might get a chunk of data dropped, but then receive some later
> data. That's obviously something rather unexpected for a user. It
> certainly confused my test program.
>
> It turns out that tty_vhangup() on the slave pty gets called from
> pty_close(), and that causes the data on the slave side to be flushed,
> but due to races more data can be copied into the slave side's buffer
> after that. Consider the following sequence:
>
> thread1 thread2 thread3
> ------- ------- -------
> | |-write data into buffer,
> | | n_tty buffer is filled
> | | along with other buffers
> | |-pty_close(master)
> | |--tty_vhangup(slave)
> | |---tty_ldisc_hangup()
> | |----n_tty_flush_buffer()
> | |-----reset_buffer_flags()
> |-n_tty_read() |
> |--up_read(&tty->termios_rwsem);
> | |------down_read(&tty->termios_rwsem)
> | |------clear n_tty buffer contents
> | |------up_read(&tty->termios_rwsem)
> |--tty_buffer_flush_work() |
> |--schedules work calling |
> | flush_to_ldisc() |
> | |-flush_to_ldisc()
> | |--receive_buf()
> | |---tty_port_default_receive_buf()
> | |----tty_ldisc_receive_buf()
> | |-----n_tty_receive_buf2()
> | |------n_tty_receive_buf_common()
> | |-------down_read(&tty->termios_rwsem)
> | |-------__receive_buf()
> | | copies data into n_tty buffer
> | |-------up_read(&tty->termios_rwsem)
> |--down_read(&tty->termios_rwsem)
> |--copy buffer data to user
>
> From this sequence, you can see that thread2 writes to the buffer then
> only clears the part of the buffer in n_tty. The n_tty receive buffer
> code then copies more data into the n_tty buffer.
>
> But part of the vhangup, releasing the redirect, is still required to
> avoid issues with consoles running on pty slaves. So do that.
> As far as I can tell, that is all that should be required.
>
> Signed-off-by: Corey Minyard <[email protected]>
> ---
> drivers/tty/pty.c | 15 +++++++++++++--
> drivers/tty/tty_io.c | 5 +++--
> 2 files changed, 16 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/tty/pty.c b/drivers/tty/pty.c
> index 23368cec7ee8..29be6b985e76 100644
> --- a/drivers/tty/pty.c
> +++ b/drivers/tty/pty.c
> @@ -67,7 +67,8 @@ static void pty_close(struct tty_struct *tty, struct file *filp)
> wake_up_interruptible(&tty->link->read_wait);
> wake_up_interruptible(&tty->link->write_wait);
> if (tty->driver->subtype == PTY_TYPE_MASTER) {
> - set_bit(TTY_OTHER_CLOSED, &tty->flags);
> + struct file *f;
> +
> #ifdef CONFIG_UNIX98_PTYS
> if (tty->driver == ptm_driver) {
> mutex_lock(&devpts_mutex);
> @@ -76,7 +77,17 @@ static void pty_close(struct tty_struct *tty, struct file *filp)
> mutex_unlock(&devpts_mutex);
> }
> #endif
> - tty_vhangup(tty->link);
> +
> + /*
> + * This hack is required because a program can open a
> + * pty and redirect a console to it, but if the pty is
> + * closed and the console is not released, then the
> + * slave side will never close. So release the
> + * redirect when the master closes.
> + */
> + f = tty_release_redirect(tty->link);
> + if (f)
> + fput(f);
> }
> }
>
> diff --git a/drivers/tty/tty_io.c b/drivers/tty/tty_io.c
> index 571b1d7d4d5a..91c33a0df3c4 100644
> --- a/drivers/tty/tty_io.c
> +++ b/drivers/tty/tty_io.c
> @@ -547,7 +547,9 @@ EXPORT_SYMBOL_GPL(tty_wakeup);
> * @tty: tty device
> *
> * This is available to the pty code so if the master closes, if the
> - * slave is a redirect it can release the redirect.
> + * slave is a redirect it can release the redirect. It returns the
> + * filp for the redirect, which must be fput when the operations on
> + * the tty are completed.
> */
> struct file *tty_release_redirect(struct tty_struct *tty)
> {
> @@ -562,7 +564,6 @@ struct file *tty_release_redirect(struct tty_struct *tty)
>
> return f;
> }
> -EXPORT_SYMBOL_GPL(tty_release_redirect);
>
> /**
> * __tty_hangup - actual handler for hangup events
> --
> 2.25.1
>

2021-01-07 15:33:13

by Greg KH

[permalink] [raw]
Subject: Re: [PATCH 0/2] drivers:tty:pty: Fix a race causing data loss on close

On Mon, Nov 23, 2020 at 06:49:00PM -0600, [email protected] wrote:
> I finally got some time to spend with this issue, and I think I have a
> good fix. Not really a v2, this is a completely different fix.
> Basically, calling tty_vhangup() on the slave when closing a pty master
> is just a bad idea.
>
> >From what I can tell, the tty_vhangup() is there in case there was a
> console redirect to the pty slave. When you close the master, you need
> to release the redirect. I didn't see another reason for tty_vhangup().
> So this has two parts, export the release of the tty redirect release
> for the pty code to use, and call it from the pty code.
>
> With this change, everything seems to work ok and no data is lost on the
> slave side if the master closes with outstanding data. I have updated
> my test program to check that all written data is read, it's available
> at https://sourceforge.net/projects/ser2net/files/tmp/testpty.c/download
>
> But, this code is quite intricate and I certainly may have missed
> something.

Tricky stuff, but I think this makes sense, so let's see how the
test-builders and linux-next work with this...

thanks,

greg k-h

2021-01-07 15:33:34

by Greg KH

[permalink] [raw]
Subject: Re: [PATCH 2/2] drivers:tty:pty: Fix a race causing data loss on close

On Sat, Jan 02, 2021 at 08:41:09AM -0600, Corey Minyard wrote:
> On Mon, Nov 23, 2020 at 06:49:02PM -0600, [email protected] wrote:
> > From: Corey Minyard <[email protected]>
> >
> > Remove the tty_vhangup() from the pty code and just release the
> > redirect. The tty_vhangup() results in data loss and data out of order
> > issues.
>
> It's been a while, so ping on this. I'm pretty sure this is the right
> fix, the more I've thought about it.

Sorry, was waiting to see if someone else would review it :)

Will go do so now...

greg k-h