2001-12-11 01:52:23

by Robert Love

Subject: [PATCH] console close race fix resend

Marcelo,

[ Resend of previous patch, now against pre8. Note it (a) is a bug fix
and (b) was in Alan's tree ]

The attached is a fix originally by Andrew Morton and discovered by the
preempt-kernel patch. It is in Alan's tree but was never merged into
Linus's.

There is a race between con_close and con_flush_chars.
n_tty_receive_buf writes to the tty queue and then flushes it via
con_flush_chars. If the console closes in between these operations,
con_flush_chars barfs.

Please, for all that is righteous, apply.

Robert Love

diff -urN linux-2.4.17-pre8/drivers/char/console.c linux/drivers/char/console.c
--- linux-2.4.17-pre8/drivers/char/console.c Thu Dec 6 14:08:14 2001
+++ linux/drivers/char/console.c Thu Dec 6 14:09:06 2001
@@ -2356,8 +2356,14 @@
 		return;
 
 	pm_access(pm_con);
+
+	/*
+	 * If we raced with con_close(), `vt' may be null.
+	 * Hence this bandaid. - akpm
+	 */
 	acquire_console_sem();
-	set_cursor(vt->vc_num);
+	if (vt)
+		set_cursor(vt->vc_num);
 	release_console_sem();
 }


2001-12-11 03:14:11

by Gordon Oliver

Subject: Re: [PATCH] console close race fix resend


On 2001.12.10 17:51 Robert Love wrote:
> Marcelo,
>
> [ Resend of previous patch, now against pre8. Note it (a) is a bug fix
> and (b) was in Alan's tree ]

and (c) appears to still have a race... You should extract
the value from the structure inside the lock, otherwise you
will still race with con_close (though perhaps a smaller race)
but since the call to acquire_console_sem() can sleep, the
vt handle you have may be stale.

> Please, for all that is righteous, apply.

please fix it better first...
(unless I am mistaken).
-gordo

2001-12-11 06:05:43

by Robert Love

Subject: Re: [PATCH] console close race fix resend

On Mon, 2001-12-10 at 22:16, Gordon Oliver wrote:

> and (c) appears to still have a race... You should extract
> the value from the structure inside the lock, otherwise you
> will still race with con_close (though perhaps a smaller race)
> but since the call to acquire_console_sem() can sleep, the
> vt handle you have may be stale.

Ehh, I don't think so. Here is the whole patched function:

static void con_flush_chars(struct tty_struct *tty)
{
	struct vt_struct *vt = (struct vt_struct *)tty->driver_data;
	if (in_interrupt())	/* from flush_to_ldisc */
		return;
	pm_access(pm_con);
	acquire_console_sem();
	if (vt)
		set_cursor(vt->vc_num);
	release_console_sem();
}

When we check vt, it isn't stale. vt is a _pointer_ to the data so that
first reference against it is guaranteed to grab the correct value. The
only possible race is between the if and the set_cursor, but that isn't
an issue because we acquired the console semaphore. There is no race
here.

> > Please, for all that is righteous, apply.
>
> please fix it better first...
> (unless I am mistaken).

Thus, unless I am mistaken, it is fine. Please, apply.

Robert Love

2001-12-11 06:30:02

by Andrew Morton

Subject: Re: [PATCH] console close race fix resend

Robert Love wrote:
>
> On Mon, 2001-12-10 at 22:16, Gordon Oliver wrote:
>
> > and (c) appears to still have a race... You should extract
> > the value from the structure inside the lock, otherwise you
> > will still race with con_close (though perhaps a smaller race)
> > but since the call to acquire_console_sem() can sleep, the
> > vt handle you have may be stale.
>
> Ehh, I don't think so. Here is the whole patched function:
>
> static void con_flush_chars(struct tty_struct *tty)
> {
> 	struct vt_struct *vt = (struct vt_struct *)tty->driver_data;
> 	if (in_interrupt())	/* from flush_to_ldisc */
> 		return;
> 	pm_access(pm_con);
> 	acquire_console_sem();
> 	if (vt)
> 		set_cursor(vt->vc_num);
> 	release_console_sem();
> }
>

It could be improved - we really should test tty->driver_data inside
lock_kernel(), and after the possible sleep.

How does this look (and how does it test?)

--- linux-2.4.17-pre8/drivers/char/console.c Mon Dec 10 13:46:20 2001
+++ linux-akpm/drivers/char/console.c Mon Dec 10 22:27:05 2001
@@ -100,6 +100,7 @@
 #include <linux/tqueue.h>
 #include <linux/bootmem.h>
 #include <linux/pm.h>
+#include <linux/smp_lock.h>
 
 #include <asm/io.h>
 #include <asm/system.h>
@@ -2350,15 +2351,18 @@ static void con_start(struct tty_struct
 
 static void con_flush_chars(struct tty_struct *tty)
 {
-	struct vt_struct *vt = (struct vt_struct *)tty->driver_data;
+	struct vt_struct *vt;
 
 	if (in_interrupt())	/* from flush_to_ldisc */
 		return;
-
 	pm_access(pm_con);
+	lock_kernel();	/* versus con_close() */
 	acquire_console_sem();
-	set_cursor(vt->vc_num);
+	vt = (struct vt_struct *)tty->driver_data;
+	if (vt)
+		set_cursor(vt->vc_num);
 	release_console_sem();
+	unlock_kernel();
 }
 
 /*

2001-12-11 08:51:09

by Gordon Oliver

Subject: Re: [PATCH] console close race fix resend

On 2001.12.10 22:05 Robert Love wrote:
> Ehh, I don't think so. Here is the whole patched function:
>
> static void con_flush_chars(struct tty_struct *tty)
> {
> 	struct vt_struct *vt = (struct vt_struct *)tty->driver_data;
> 	if (in_interrupt())	/* from flush_to_ldisc */
> 		return;
> 	pm_access(pm_con);
> 	acquire_console_sem();
> 	if (vt)
> 		set_cursor(vt->vc_num);
> 	release_console_sem();
> }
>
> When we check vt, it isn't stale. vt is a _pointer_ to the data so that
> first reference against it is guaranteed to grab the correct value. The
> only possible race is between the if and the set_cursor, but that isn't
> an issue because we acquired the console semaphore. There is no race
> here.

I like the patch that Andrew Morton sent in reply to this better.
Note that if the above code runs in the following sequence, it will
use a stale pointer:

    con_flush_chars                 con_close
    vt = <>
                                    tty->driver_data = NULL
    acquire_console_sem()
    set_cursor()
    release_console_sem()

Now it _might_ be ok to act on a stale vt pointer, but it sure
feels like thin ice. I'm not sure that there is any danger of
the vt data being modified in a way that would break this, but
since the tty no longer has a reference it is bad practice.

What the earlier patch did was create some very subtle semantics
for a small race window. It fixed the glaring bug (the OOPS)
but left a possible one that would be harder to find....
-gordo
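
The interleaving described above can be replayed deterministically in
user space. What follows is a minimal sketch, not kernel code:
struct vt_model, struct tty_model, model_con_close() and both flush
functions are hypothetical stand-ins, and the close_in_window flag
simply forces con_close's assignment into the window before the
semaphore would be taken.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins for vt_struct and tty_struct. */
struct vt_model { int vc_num; };
struct tty_model { void *driver_data; };

/* Models the step in con_close() that drops the tty's reference. */
static void model_con_close(struct tty_model *tty)
{
	assert(tty != NULL);
	tty->driver_data = NULL;
}

/*
 * Shape of the first patch: vt is copied at function entry, before
 * the semaphore is taken. Returns true if set_cursor() would run.
 */
static bool flush_snapshot_early(struct tty_model *tty, bool close_in_window)
{
	struct vt_model *vt = tty->driver_data;	/* early snapshot */

	if (close_in_window)
		model_con_close(tty);	/* con_close column of the diagram */

	/* acquire_console_sem() would sit here */
	return vt != NULL;		/* tests the stale local copy */
}

/*
 * Shape of the later patch: the pointer is read only once the locks
 * are held, so a close that already happened is visible.
 */
static bool flush_read_under_lock(struct tty_model *tty, bool close_in_window)
{
	struct vt_model *vt;

	if (close_in_window)
		model_con_close(tty);	/* close completes before we lock */

	/* lock_kernel(); acquire_console_sem(); would sit here */
	vt = tty->driver_data;		/* fresh read under the lock */
	return vt != NULL;
}
```

With close_in_window set, flush_snapshot_early() still reports that it
would call set_cursor() even though the tty no longer holds a
reference, while flush_read_under_lock() sees NULL and skips the call.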

2001-12-11 09:06:32

by Robert Love

Subject: [PATCH] Re: console close race fix resend

OK, I talked to Andrew off-list and we agree his fix is more correct.
It should be applied. I tested it (I have a consistent method of
reproducing the race under preempt-kernel) and it works.

I rediffed it against pre8 because I had trouble applying his, oddly.
Ignore previous patch. Marcelo, please apply. Seriously, this time.

Robert Love

--- linux-2.4.17-pre8/drivers/char/console.c Mon Dec 10 20:48:50 2001
+++ linux/drivers/char/console.c Tue Dec 11 02:52:44 2001
@@ -100,6 +100,7 @@
 #include <linux/tqueue.h>
 #include <linux/bootmem.h>
 #include <linux/pm.h>
+#include <linux/smp_lock.h>
 
 #include <asm/io.h>
 #include <asm/system.h>
@@ -2348,17 +2349,25 @@
 	set_leds();
 }
 
+/*
+ * we can race here against con_close, so we grab the bkl
+ * and check the pointer before calling set_cursor
+ */
 static void con_flush_chars(struct tty_struct *tty)
 {
-	struct vt_struct *vt = (struct vt_struct *)tty->driver_data;
+	struct vt_struct *vt;
 
 	if (in_interrupt())	/* from flush_to_ldisc */
 		return;
 
 	pm_access(pm_con);
+	lock_kernel();
 	acquire_console_sem();
-	set_cursor(vt->vc_num);
+	vt = (struct vt_struct *)tty->driver_data;
+	if (vt)
+		set_cursor(vt->vc_num);
 	release_console_sem();
+	unlock_kernel();
 }
 
 /*
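
The pattern this final patch settles on (take the lock, then read the
shared pointer, then test it) can be sketched in user space as below.
This is an illustration under stated assumptions, not the kernel
implementation: a pthread mutex stands in for the BKL and console
semaphore together, struct vt_model, struct tty_model, model_close()
and model_flush() are hypothetical names, and the sketch assumes the
close path clears driver_data while holding the same lock, which the
"versus con_close()" comment suggests but this thread does not show.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-ins for vt_struct and tty_struct. */
struct vt_model { int vc_num; };

struct tty_model {
	pthread_mutex_t lock;	/* stand-in for BKL + console semaphore */
	void *driver_data;	/* cleared when the console is closed */
};

/* Close side: drops the reference while holding the lock (assumed). */
static void model_close(struct tty_model *tty)
{
	assert(tty != NULL);
	pthread_mutex_lock(&tty->lock);
	tty->driver_data = NULL;
	pthread_mutex_unlock(&tty->lock);
}

/*
 * Flush side: returns the console number the cursor would be set on,
 * or -1 if the console was already closed.
 */
static int model_flush(struct tty_model *tty)
{
	struct vt_model *vt;
	int num = -1;

	assert(tty != NULL);
	pthread_mutex_lock(&tty->lock);
	vt = tty->driver_data;	/* read only after the lock is held */
	if (vt)
		num = vt->vc_num;	/* where set_cursor() would be called */
	pthread_mutex_unlock(&tty->lock);
	return num;
}
```

Because both sides take the same lock, the value tested by the if is
the value used by the call, and a close that finished earlier is seen
as NULL rather than as a stale pointer.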