Subject: [PATCH] tty_io: fix race in master pty close/slave pty close path

This patch fixes a possible race that leads to double freeing an idr index.
When the master begin to close, release_dev() is called and then pty_close()
is called:
if (tty->driver->close)
tty->driver->close(tty, filp);
This is done without helding any locks other than BKL. Inside pty_close(),
being a master close, the devpts entry will be removed:
#ifdef CONFIG_UNIX98_PTYS
if (tty->driver == ptm_driver)
devpts_pty_kill(tty->index);
#endif
But devpts_pty_kill() will call get_node() that may sleep while waiting for
&devpts_root->d_inode->i_sem. When this happens and the slave is being
opened, tty_open() just found the driver and index:
driver = get_tty_driver(device, &index);
if (!driver) {
mutex_unlock(&tty_mutex);
return -ENODEV;
}
This part of the code is already protected under tty_mute. The problem
is that the slave close already got an index. Then init_dev() is called and
blocks waiting for the same &devpts_root->d_inode->i_sem.

When the master close resumes, it removes the devpts entry, and the relation
between idr index and the tty is gone. The master then sleeps waiting for the
tty_mutex on release_dev().

Slave open resumes and found no tty for that index. As result, a NULL tty is
returned and init_dev() doesn't flow to fast_track:
/* check whether we're reopening an existing tty */
if (driver->flags & TTY_DRIVER_DEVPTS_MEM) {
tty = devpts_get_tty(idx);
if (tty && driver->subtype == PTY_TYPE_MASTER)
tty = tty->link;
} else {
tty = driver->ttys[idx];
}
if (tty) goto fast_track;
The result of this, is that a new tty will be created and init_dev() returns
sucessfull. After returning, tty_mutex is dropped and master close may resume.

Master close finds it's the only use and both sides are closing, then releases
the tty and the index. At this point, the idr index is free, but slave still
has it.

Slave open then calls pty_open() and finds that tty->link->count is 0, because
there's no master and returns error. Then tty_open() calls release_dev() which
executes without any warning, as it was a case of last slave close when the
master is already closed (master->count == 0, slave->count == 1). The tty is
then released with the already released idr index.

This normally would only issue a warning on idr_remove() but in case of a
customer's critical application, it's never too simple:

thread1: opens master, gets index X
thread1: begin closing master
thread2: begin opening slave with index X
thread1: finishes closing master, index X released
thread3: opens master, gets index X, just released
thread2: fails opening slave, releases index X <----
thread4: opens master, gets index X, init_dev() then find an already in use
and healthy tty and fails

If no more indexes are released, ptmx_open() will keep failing, as the first
free index available is X, and it will make init_dev() fail because you're
trying to "reopen a master" which isn't valid.

The patch notices when this race happens and make init_dev() fail imediately.
The init_dev() function is called with tty_mutex held, so it's safe to continue
with tty till the end of function because release_dev() won't make any further
changes without grabbing the tty_mutex.

Without the patch, on some machines it's possible get easily idr warnings like
this one:
idr_remove called for id=15 which is not allocated.
[<c02555b9>] idr_remove+0x139/0x170
[<c02a1b62>] release_mem+0x182/0x230
[<c02a28e7>] release_dev+0x4b7/0x700
[<c02a0ea7>] tty_ldisc_enable+0x27/0x30
[<c02a1e64>] init_dev+0x254/0x580
[<c02a0d64>] check_tty_count+0x14/0xb0
[<c02a4f05>] tty_open+0x1c5/0x340
[<c02a4d40>] tty_open+0x0/0x340
[<c017388f>] chrdev_open+0xaf/0x180
[<c017c2ac>] open_namei+0x8c/0x760
[<c01737e0>] chrdev_open+0x0/0x180
[<c0167bc9>] __dentry_open+0xc9/0x210
[<c0167e2c>] do_filp_open+0x5c/0x70
[<c0167a91>] get_unused_fd+0x61/0xd0
[<c0167e93>] do_sys_open+0x53/0x100
[<c0167f97>] sys_open+0x27/0x30
[<c010303b>] syscall_call+0x7/0xb
using this test application available on:
http://www.ruivo.org/~aris/pty_sodomizer.c

Signed-off-by: Aristeu Sergio Rozanski Filho <[email protected]>

Index: linus-2.6/drivers/char/tty_io.c
===================================================================
--- linus-2.6.orig/drivers/char/tty_io.c
+++ linus-2.6/drivers/char/tty_io.c
@@ -1901,6 +1901,20 @@ static int init_dev(struct tty_driver *d
/* check whether we're reopening an existing tty */
if (driver->flags & TTY_DRIVER_DEVPTS_MEM) {
tty = devpts_get_tty(idx);
+ /*
+ * If we don't have a tty here on a slave open, it's because
+ * the master already started the close process and there's
+ * no relation between devpts file and tty anymore.
+ */
+ if (!tty && driver->subtype == PTY_TYPE_SLAVE) {
+ retval = -EIO;
+ goto end_init;
+ }
+ /*
+ * It's safe from now on because init_dev() is called with
+ * tty_mutex held and release_dev() won't change tty->count
+ * or tty->flags without having to grab tty_mutex
+ */
if (tty && driver->subtype == PTY_TYPE_MASTER)
tty = tty->link;
} else {


2007-02-22 18:11:04

by H. Peter Anvin

[permalink] [raw]
Subject: Re: [PATCH] tty_io: fix race in master pty close/slave pty close path

Aristeu Sergio Rozanski Filho wrote:
>
> This normally would only issue a warning on idr_remove() but in case of a
> customer's critical application, it's never too simple:
>
> thread1: opens master, gets index X
> thread1: begin closing master
> thread2: begin opening slave with index X
> thread1: finishes closing master, index X released
> thread3: opens master, gets index X, just released
> thread2: fails opening slave, releases index X <----
> thread4: opens master, gets index X, init_dev() then find an already in use
> and healthy tty and fails
>

OK, that's fiendishly subtle, as most races are, of course. I'll try to
walk through the codepaths to make sure the proposed patch doesn't
introduce a leak, but the concept sounds good.

-hpa

2007-02-22 19:01:28

by Chuck Ebbert

[permalink] [raw]
Subject: Re: [PATCH] tty_io: fix race in master pty close/slave pty close path

Aristeu Sergio Rozanski Filho wrote:
> This patch fixes a possible race that leads to double freeing an idr index.
>
> Without the patch, on some machines it's possible get easily idr warnings like
> this one:
> idr_remove called for id=15 which is not allocated.
> [<c02555b9>] idr_remove+0x139/0x170
> [<c02a1b62>] release_mem+0x182/0x230
> [<c02a28e7>] release_dev+0x4b7/0x700
> [<c02a0ea7>] tty_ldisc_enable+0x27/0x30
> [<c02a1e64>] init_dev+0x254/0x580
> [<c02a0d64>] check_tty_count+0x14/0xb0
> [<c02a4f05>] tty_open+0x1c5/0x340
> [<c02a4d40>] tty_open+0x0/0x340
> [<c017388f>] chrdev_open+0xaf/0x180
> [<c017c2ac>] open_namei+0x8c/0x760
> [<c01737e0>] chrdev_open+0x0/0x180
> [<c0167bc9>] __dentry_open+0xc9/0x210
> [<c0167e2c>] do_filp_open+0x5c/0x70
> [<c0167a91>] get_unused_fd+0x61/0xd0
> [<c0167e93>] do_sys_open+0x53/0x100
> [<c0167f97>] sys_open+0x27/0x30
> [<c010303b>] syscall_call+0x7/0xb

Would another possible trace look like this?

idr_remove called for id=1 which is not allocated.
[<c0403f10>] dump_trace+0x69/0x1af
[<c040406e>] show_trace_log_lvl+0x18/0x2c
[<c04045e9>] show_trace+0xf/0x11
[<c0404673>] dump_stack+0x15/0x17
[<c04d3fcb>] idr_remove+0xe2/0x143
[<c051ee50>] release_dev+0x63b/0x652
[<c051ee6e>] tty_release+0x7/0xa
[<c0461b2e>] __fput+0xba/0x178
[<c045f466>] filp_close+0x52/0x59
[<c041d3b4>] put_files_struct+0x64/0xa6
[<c041e3fe>] do_exit+0x248/0x747
[<c041e973>] sys_exit_group+0x0/0xd
[<f5d480e4>] 0xf5d480e4
DWARF2 unwinder stuck at 0xf5d480e4
Leftover inexact backtrace:
[<c0426b76>] get_signal_to_deliver+0x38a/0x3b2
[<c040243a>] do_notify_resume+0x75/0x62f
[<c044bfc0>] __pagevec_lru_add_active+0x95/0xa0
[<c042c814>] autoremove_wake_function+0x0/0x35
[<c05fc113>] _spin_unlock_irq+0x5/0x7
[<c05fa9cf>] schedule+0x529/0x585
[<c0402e2a>] work_notifysig+0x13/0x19


We have some bug reports:

https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=211429
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=227203

Subject: Re: [PATCH] tty_io: fix race in master pty close/slave pty close path

> Would another possible trace look like this?
>
> idr_remove called for id=1 which is not allocated.
> [<c0403f10>] dump_trace+0x69/0x1af
> [<c040406e>] show_trace_log_lvl+0x18/0x2c
> [<c04045e9>] show_trace+0xf/0x11
> [<c0404673>] dump_stack+0x15/0x17
> [<c04d3fcb>] idr_remove+0xe2/0x143
> [<c051ee50>] release_dev+0x63b/0x652
> [<c051ee6e>] tty_release+0x7/0xa
> [<c0461b2e>] __fput+0xba/0x178
> [<c045f466>] filp_close+0x52/0x59
(snip)

> https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=211429
> https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=227203
yeap, this patch may fix these too. Will build a test kernel with the
patch included and hand over to reporters to test.

--
Aristeu