2012-02-19 20:36:05

by Sasha Levin

[permalink] [raw]
Subject: TTY: NULL dereference when closing a pty file

Hi all,

I got the following BUG() when running trinity on the KVM tool:

[ 665.738774] BUG: unable to handle kernel NULL pointer dereference at 0000000000000028
[ 665.739651] IP: [<ffffffff81257e2b>] devpts_pty_kill+0x1b/0xa0
[ 665.739651] PGD 22eba067 PUD 22eaf067 PMD 0
[ 665.739651] Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
[ 665.739651] CPU 5
[ 665.739651] Pid: 3061, comm: trinity Not tainted 3.3.0-rc3-next-20120217-sasha-00001-gfa56acb #18
[ 665.739651] RIP: 0010:[<ffffffff81257e2b>] [<ffffffff81257e2b>] devpts_pty_kill+0x1b/0xa0
[ 665.739651] RSP: 0018:ffff880022ecfd88 EFLAGS: 00010286
[ 665.739651] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
[ 665.739651] RDX: 0000000000000001 RSI: 0000000000000001 RDI: ffff880022921148
[ 665.739651] RBP: ffff880022ecfda8 R08: 0000000000000000 R09: 0000000000000001
[ 665.739651] R10: 0000000000000001 R11: 0000000000000001 R12: ffff880022921148
[ 665.739651] R13: ffff880022934840 R14: ffff880026831970 R15: ffff880026831970
[ 665.739651] FS: 00007f29c344e700(0000) GS:ffff88002a400000(0000) knlGS:0000000000000000
[ 665.739651] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 665.739651] CR2: 0000000000000028 CR3: 0000000022e1c000 CR4: 00000000000406e0
[ 665.739651] DR0: ffffffff810adc50 DR1: 0000000000000000 DR2: 0000000000000000
[ 665.739651] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000600
[ 665.739651] Process trinity (pid: 3061, threadinfo ffff880022ece000, task ffff880022d88000)
[ 665.739651] Stack:
[ 665.739651] ffff880022ecfd98 ffff8800229267b0 ffff880022921148 ffff880022934840
[ 665.739651] ffff880022ecfdc8 ffffffff819407f1 ffff880022934840 ffff8800229267b0
[ 665.739651] ffff880022ecfeb8 ffffffff819374a6 ffff880022ecfe18 ffffffff81078e76
[ 665.739651] Call Trace:
[ 665.739651] [<ffffffff819407f1>] pty_close+0x121/0x140
[ 665.739651] [<ffffffff819374a6>] tty_release+0x186/0x610
[ 665.739651] [<ffffffff81078e76>] ? kvm_clock_read+0x46/0x80
[ 665.739651] [<ffffffff81054973>] ? sched_clock+0x13/0x20
[ 665.739651] [<ffffffff811d9b16>] fput+0xf6/0x330
[ 665.739651] [<ffffffff811d68b4>] filp_close+0x64/0x90
[ 665.739651] [<ffffffff811d699b>] sys_close+0xbb/0x1b0
[ 665.739651] [<ffffffff8267c079>] system_call_fastpath+0x16/0x1b
[ 665.739651] Code: 4c 8b a3 08 04 00 00 eb df 0f 0b 0f 1f 44 00 00 55 48 89 e5 48 83 ec 20 48 89 5d e8 4c 89 65 f0 4c 89 6d f8 48 8b 9f 28 04 00 00 <48> 8b 43 28 48 81 78 58 d1 1c 00 00 74 0b 48 8b 05 f0 17 26 03
[ 665.739651] RIP [<ffffffff81257e2b>] devpts_pty_kill+0x1b/0xa0
[ 665.739651] RSP <ffff880022ecfd88>
[ 665.739651] CR2: 0000000000000028
[ 665.782570] ---[ end trace 5b128b9a8217de35 ]---

Looking further, it looks like devpts_pty_kill was called with tty->link==NULL.

I've bisected it down to d3bda529 ("TTY: get rid of BTM around devpts_*"), which has moved devpts_pty_kill() out of the tty_lock() protection.


2012-02-19 21:09:15

by Jiri Slaby

[permalink] [raw]
Subject: [PATCH 1/1] TTY: fix PTY hangup vs close race

Commit d3bda5298 (TTY: get rid of BTM around devpts_*) moved
devpts_pty_kill out of BTM, but the BTM was not protecting only
devpts_pty_kill, but also tty->link. Hence move the function back at
this late stage until this gets resolved properly some time later.

I was confused by tty_vhangup(tty->link) outside BTM. But inside of
tty_vhangup, there is a check for tty == NULL. But we cannot add such
a check here. We have to have the tty and free the devpts node...

Signed-off-by: Jiri Slaby <[email protected]>
Reported-by: Sasha Levin <[email protected]>
---
drivers/tty/pty.c | 3 ++-
1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/drivers/tty/pty.c b/drivers/tty/pty.c
index fa1bd2e..95037aa 100644
--- a/drivers/tty/pty.c
+++ b/drivers/tty/pty.c
@@ -54,8 +54,9 @@ static void pty_close(struct tty_struct *tty, struct file *filp)
wake_up_interruptible(&tty->link->write_wait);
if (tty->driver->subtype == PTY_TYPE_MASTER) {
set_bit(TTY_OTHER_CLOSED, &tty->flags);
- tty_unlock();
+ /* BTM protects tty->link here */
devpts_pty_kill(tty->link);
+ tty_unlock();
tty_vhangup(tty->link);
tty_lock();
}
--
1.7.9

2012-02-19 21:10:24

by Jiri Slaby

[permalink] [raw]
Subject: Re: TTY: NULL dereference when closing a pty file

On 02/19/2012 09:36 PM, Sasha Levin wrote:
> I got the following BUG() when running trinity on the KVM tool:

Hi, I've sent a patch as a reply to your message. Could you test it?

And where can one obtain the tool?

thanks,
--
js

2012-02-19 21:19:14

by Jiri Slaby

[permalink] [raw]
Subject: [PATCH 1/1] TTY: fix PTY hangup vs close race

Commit d3bda5298 (TTY: get rid of BTM around devpts_*) moved
devpts_pty_kill out of BTM, but the BTM was not protecting only
devpts_pty_kill, but also tty->link. Hence move the function back at
this late stage until this gets resolved properly some time later.

I was confused by tty_vhangup(tty->link) outside BTM. But inside of
tty_vhangup, there is a check for tty == NULL. But we cannot add such
a check here. We have to have the tty and free the devpts node...

Signed-off-by: Jiri Slaby <[email protected]>
Reported-by: Sasha Levin <[email protected]>
---

Gee, I messed up Greg's address again...

drivers/tty/pty.c | 3 ++-
1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/drivers/tty/pty.c b/drivers/tty/pty.c
index fa1bd2e..95037aa 100644
--- a/drivers/tty/pty.c
+++ b/drivers/tty/pty.c
@@ -54,8 +54,9 @@ static void pty_close(struct tty_struct *tty, struct file *filp)
wake_up_interruptible(&tty->link->write_wait);
if (tty->driver->subtype == PTY_TYPE_MASTER) {
set_bit(TTY_OTHER_CLOSED, &tty->flags);
- tty_unlock();
+ /* BTM protects tty->link here */
devpts_pty_kill(tty->link);
+ tty_unlock();
tty_vhangup(tty->link);
tty_lock();
}
--
1.7.9

2012-02-19 21:41:54

by Jiri Slaby

[permalink] [raw]
Subject: Re: [PATCH 1/1] TTY: fix PTY hangup vs close race

On 02/19/2012 10:19 PM, Jiri Slaby wrote:
> Commit d3bda5298 (TTY: get rid of BTM around devpts_*) moved
> devpts_pty_kill out of BTM, but the BTM was not protecting only
> devpts_pty_kill, but also tty->link. Hence move the function back at
> this late stage until this gets resolved properly some time later.
>
> I was confused by tty_vhangup(tty->link) outside BTM. But inside of
> tty_vhangup, there is a check for tty == NULL. But we cannot add such
> a check here. We have to have the tty and free the devpts node...
>
> Signed-off-by: Jiri Slaby <[email protected]>
> Reported-by: Sasha Levin <[email protected]>
> ---
>
> Gee, I messed up Greg's address again...
>
> drivers/tty/pty.c | 3 ++-
> 1 files changed, 2 insertions(+), 1 deletions(-)
>
> diff --git a/drivers/tty/pty.c b/drivers/tty/pty.c
> index fa1bd2e..95037aa 100644
> --- a/drivers/tty/pty.c
> +++ b/drivers/tty/pty.c
> @@ -54,8 +54,9 @@ static void pty_close(struct tty_struct *tty, struct file *filp)
> wake_up_interruptible(&tty->link->write_wait);
> if (tty->driver->subtype == PTY_TYPE_MASTER) {
> set_bit(TTY_OTHER_CLOSED, &tty->flags);
> - tty_unlock();
> + /* BTM protects tty->link here */
> devpts_pty_kill(tty->link);
> + tty_unlock();

I'm afraid this won't help. As this is based on an assumption that
tty->link is NULL [*] and that is not just true.

Greg, please revert commit d3bda5298 completely.

[*] Your dump reveals that the code fetches tty->driver_data (mov
0x428(%rdi),%rbx) and traps at a fetch of inode->i_sbm because inode is
NULL (mov 0x28(%rbx),%rax).

Anyway I'm still interested in the tool you triggered this, because we
will need to get rid of BTM eventually.

thanks,
--
js

2012-02-19 23:08:21

by Sasha Levin

[permalink] [raw]
Subject: Re: TTY: NULL dereference when closing a pty file

On Sun, 2012-02-19 at 22:10 +0100, Jiri Slaby wrote:
> On 02/19/2012 09:36 PM, Sasha Levin wrote:
> > I got the following BUG() when running trinity on the KVM tool:
>
> Hi, I've sent a patch as a reply to your message. Could you test it?

Sure. I'll re-run and report back tomorrow.

> And where can one obtain the tool?

The KVM tool is developed at https://github.com/penberg/linux-kvm

Trinity is developed at http://codemonkey.org.uk/projects/trinity/trinity.git

If you have any questions about how to use them both feel free to message me privately.

> thanks,

2012-02-20 10:20:34

by Sasha Levin

[permalink] [raw]
Subject: Re: [PATCH 1/1] TTY: fix PTY hangup vs close race

On Sun, Feb 19, 2012 at 11:41 PM, Jiri Slaby <[email protected]> wrote:
> On 02/19/2012 10:19 PM, Jiri Slaby wrote:
>> Commit d3bda5298 (TTY: get rid of BTM around devpts_*) moved
>> devpts_pty_kill out of BTM, but the BTM was not protecting only
>> devpts_pty_kill, but also tty->link. Hence move the function back at
>> this late stage until this gets resolved properly some time later.
>>
>> I was confused by tty_vhangup(tty->link) outside BTM. But inside of
>> tty_vhangup, there is a check for tty == NULL. But we cannot add such
>> a check here. We have to have the tty and free the devpts node...
>>
>> Signed-off-by: Jiri Slaby <[email protected]>
>> Reported-by: Sasha Levin <[email protected]>
>> ---
>>
>> Gee, I messed up Greg's address again...
>>
>> ?drivers/tty/pty.c | ? ?3 ++-
>> ?1 files changed, 2 insertions(+), 1 deletions(-)
>>
>> diff --git a/drivers/tty/pty.c b/drivers/tty/pty.c
>> index fa1bd2e..95037aa 100644
>> --- a/drivers/tty/pty.c
>> +++ b/drivers/tty/pty.c
>> @@ -54,8 +54,9 @@ static void pty_close(struct tty_struct *tty, struct file *filp)
>> ? ? ? wake_up_interruptible(&tty->link->write_wait);
>> ? ? ? if (tty->driver->subtype == PTY_TYPE_MASTER) {
>> ? ? ? ? ? ? ? set_bit(TTY_OTHER_CLOSED, &tty->flags);
>> - ? ? ? ? ? ? tty_unlock();
>> + ? ? ? ? ? ? /* BTM protects tty->link here */
>> ? ? ? ? ? ? ? devpts_pty_kill(tty->link);
>> + ? ? ? ? ? ? tty_unlock();
>
> I'm afraid this won't help. As this is based on an assumption that
> tty->link is NULL [*] and that is not just true.
>
> Greg, please revert commit d3bda5298 completely.
>
> [*] Your dump reveals that the code fetches tty->driver_data (mov
> 0x428(%rdi),%rbx) and traps at a fetch of inode->i_sbm because inode is
> NULL (mov ? ?0x28(%rbx),%rax).

It actually looks even more complex than that. I reverted the patch
above, but still got the error. A quick bisection pointed me to
a50f724a432997321cabb6c9e665c28e34850f78.

Looks like reverting both actually solves the problem. Reverting just
one of them doesn't.

2012-02-20 11:15:37

by Jiri Slaby

[permalink] [raw]
Subject: Re: [PATCH 1/1] TTY: fix PTY hangup vs close race

On 02/20/2012 11:20 AM, Sasha Levin wrote:
>> [*] Your dump reveals that the code fetches tty->driver_data (mov
>> 0x428(%rdi),%rbx) and traps at a fetch of inode->i_sbm because inode is
>> NULL (mov 0x28(%rbx),%rax).
>
> It actually looks even more complex than that. I reverted the patch
> above, but still got the error. A quick bisection pointed me to
> a50f724a432997321cabb6c9e665c28e34850f78.

> Looks like reverting both actually solves the problem. Reverting just
> one of them doesn't.

Ah, I see. I suppose you have CONFIG_LEGACY_PTYS=y. I somehow overlooked
that case.

So both of them should be reverted in upstream:
d3bda5298aad98c7a27678bdd0dd9d008ab9e685
a50f724a432997321cabb6c9e665c28e34850f78

thanks,
--
js
suse labs

2012-02-24 21:59:43

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH 1/1] TTY: fix PTY hangup vs close race

On Mon, Feb 20, 2012 at 12:15:31PM +0100, Jiri Slaby wrote:
> On 02/20/2012 11:20 AM, Sasha Levin wrote:
> >> [*] Your dump reveals that the code fetches tty->driver_data (mov
> >> 0x428(%rdi),%rbx) and traps at a fetch of inode->i_sbm because inode is
> >> NULL (mov 0x28(%rbx),%rax).
> >
> > It actually looks even more complex than that. I reverted the patch
> > above, but still got the error. A quick bisection pointed me to
> > a50f724a432997321cabb6c9e665c28e34850f78.
>
> > Looks like reverting both actually solves the problem. Reverting just
> > one of them doesn't.
>
> Ah, I see. I suppose you have CONFIG_LEGACY_PTYS=y. I somehow overlooked
> that case.
>
> So both of them should be reverted in upstream:
> d3bda5298aad98c7a27678bdd0dd9d008ab9e685
> a50f724a432997321cabb6c9e665c28e34850f78

Both now reverted, thanks.

greg k-h