2007-08-08 09:58:23

by Laurent Pinchart

[permalink] [raw]
Subject: Serial buffer memory leak

Hi everybody.

Patch c5c34d4862e18ef07c1276d233507f540fb5a532 (tty: flush flip buffer on
ldisc input queue flush) introduces a race condition which can lead to memory
leaks.

The problem can be triggered when tcflush() is called when data are being
pushed to the line discipline driver by flush_to_ldisc().

flush_to_ldisc() releases tty->buf.lock when calling the line discipline
receive_buf function. At that poing tty_buffer_flush() kicks in and sets both
tty->buf.head and tty->buf.tail to NULL. When flush_to_ldisc() finishes, it
restores tty->buf.head but doesn't touch tty->buf.tail. This corrups the
buffer queue, and the next call to tty_buffer_request_room() will allocate a
new buffer and overwrite tty->buf.head. The previous buffer is then lost
forever without being released.

I'm not familiar enough with the tty code to decide what the proper fix should
be. I'll try to write a patch if someone could point me in the right
direction.

Please CC me when answering, as I'm not subscribed to the list.

Regards,

--
Laurent Pinchart
CSE Semaphore Belgium

Chauss?e de Bruxelles, 732A
B-1410 Waterloo
Belgium

T +32 (2) 387 42 59
F +32 (2) 387 42 75


2007-08-08 13:15:34

by Paul Fulghum

[permalink] [raw]
Subject: Re: Serial buffer memory leak

Laurent Pinchart wrote:
> Patch c5c34d4862e18ef07c1276d233507f540fb5a532 (tty: flush flip buffer on
> ldisc input queue flush) introduces a race condition which can lead to memory
> leaks.
>
> The problem can be triggered when tcflush() is called when data are being
> pushed to the line discipline driver by flush_to_ldisc().
>
> flush_to_ldisc() releases tty->buf.lock when calling the line discipline
> receive_buf function. At that poing tty_buffer_flush() kicks in and sets both
> tty->buf.head and tty->buf.tail to NULL. When flush_to_ldisc() finishes, it
> restores tty->buf.head but doesn't touch tty->buf.tail. This corrups the
> buffer queue, and the next call to tty_buffer_request_room() will allocate a
> new buffer and overwrite tty->buf.head. The previous buffer is then lost
> forever without being released.

Your description is clear enough, I'll make the patch.

Thanks,
Paul

2007-08-08 13:38:37

by Alan

[permalink] [raw]
Subject: Re: Serial buffer memory leak

> I'm not familiar enough with the tty code to decide what the proper fix should
> be. I'll try to write a patch if someone could point me in the right
> direction.

Something like this perhaps ?

diff -u --new-file --recursive --exclude-from /usr/src/exclude linux.vanilla-2.6.23rc1-mm1/drivers/char/tty_io.c linux-2.6.23rc1-mm1/drivers/char/tty_io.c
--- linux.vanilla-2.6.23rc1-mm1/drivers/char/tty_io.c 2007-07-26 15:02:57.000000000 +0100
+++ linux-2.6.23rc1-mm1/drivers/char/tty_io.c 2007-08-08 11:39:29.791433672 +0100
@@ -369,25 +369,50 @@
}

/**
- * tty_buffer_flush - flush full tty buffers
+ * __tty_buffer_flush - flush full tty buffers
* @tty: tty to flush
*
- * flush all the buffers containing receive data
+ * flush all the buffers containing receive data. Caller must
+ * hold the buffer lock and must have ensured no parallel flush to
+ * ldisc is running.
*
* Locking: none
*/

-static void tty_buffer_flush(struct tty_struct *tty)
+static void __tty_buffer_flush(struct tty_struct *tty)
{
struct tty_buffer *thead;
- unsigned long flags;

- spin_lock_irqsave(&tty->buf.lock, flags);
while((thead = tty->buf.head) != NULL) {
tty->buf.head = thead->next;
tty_buffer_free(tty, thead);
}
tty->buf.tail = NULL;
+}
+
+/**
+ * tty_buffer_flush - flush full tty buffers
+ * @tty: tty to flush
+ *
+ * flush all the buffers containing receive data. If the buffer is
+ * being processed by flush_to_ldisc then we defer the processing
+ * to that function
+ *
+ * Locking: none
+ */
+
+static void tty_buffer_flush(struct tty_struct *tty)
+{
+ unsigned long flags;
+ spin_lock_irqsave(&tty->buf.lock, flags);
+
+ /* If the data is being pushed to the tty layer then we can't
+ process it here. Instead set a flag and the flush_to_ldisc
+ path will process the flush request before it exits */
+ if (tty->buf.flushing)
+ tty->buf.flushpending = 1;
+ else
+ __tty_buffer_flush(tty);
spin_unlock_irqrestore(&tty->buf.lock, flags);
}

@@ -3594,6 +3619,7 @@
return;

spin_lock_irqsave(&tty->buf.lock, flags);
+ tty->buf.flushing = 1; /* So we know for tty_flush_buffers */
head = tty->buf.head;
if (head != NULL) {
tty->buf.head = NULL;
@@ -3607,6 +3633,11 @@
tty_buffer_free(tty, tbuf);
continue;
}
+ /* Ldisc or user is trying to flush the buffers
+ we are feeding to the ldisc, stop feeding the
+ line discipline as we want to empty the queue */
+ if (tty->buf.flushpending)
+ break;
if (!tty->receive_room) {
schedule_delayed_work(&tty->buf.work, 1);
break;
@@ -3620,8 +3651,16 @@
disc->receive_buf(tty, char_buf, flag_buf, count);
spin_lock_irqsave(&tty->buf.lock, flags);
}
+ /* Restore the queue head */
tty->buf.head = head;
+ /* We may have a deferred request to flush the input buffer,
+ if so do it now under the lock and empty the queue */
+ if (tty->buf.flushpending) {
+ __tty_buffer_flush(tty);
+ tty->buf.flushpending = 0;
+ }
}
+ tty->buf.flushing = 0;
spin_unlock_irqrestore(&tty->buf.lock, flags);

tty_ldisc_deref(disc);
diff -u --new-file --recursive --exclude-from /usr/src/exclude linux.vanilla-2.6.23rc1-mm1/include/linux/tty.h linux-2.6.23rc1-mm1/include/linux/tty.h
--- linux.vanilla-2.6.23rc1-mm1/include/linux/tty.h 2007-07-26 15:02:04.000000000 +0100
+++ linux-2.6.23rc1-mm1/include/linux/tty.h 2007-08-08 11:44:32.000000000 +0100
@@ -80,13 +73,10 @@
struct tty_buffer *tail; /* Active buffer */
struct tty_buffer *free; /* Free queue head */
int memory_used; /* Buffer space used excluding free queue */
+ unsigned int flushing:1; /* flush_to_ldisc active */
+ unsigned int flushpending:1; /* A request to clear the queue is pending */
};
/*
- * The pty uses char_buf and flag_buf as a contiguous buffer
- */
-#define PTY_BUF_SIZE 4*TTY_FLIPBUF_SIZE
-
-/*
* When a break, frame error, or parity error happens, these codes are
* stuffed into the flags buffer.
*/

2007-08-08 14:12:39

by Paul Fulghum

[permalink] [raw]
Subject: Re: Serial buffer memory leak

On Wed, 2007-08-08 at 14:45 +0100, Alan Cox wrote:
> > I'm not familiar enough with the tty code to decide what the proper fix should
> > be. I'll try to write a patch if someone could point me in the right
> > direction.
>
> Something like this perhaps ?

That looks good, a little better than the solution
I was first considering. I'm compiling now.

This was a nasty bug for me to introduce :-(
Good work in finding this Laurent.

--
Paul Fulghum
Microgate Systems, Ltd

2007-08-08 14:28:50

by Laurent Pinchart

[permalink] [raw]
Subject: Re: Serial buffer memory leak

On Wednesday 08 August 2007 16:11, Paul Fulghum wrote:
> On Wed, 2007-08-08 at 14:45 +0100, Alan Cox wrote:
> > > I'm not familiar enough with the tty code to decide what the proper fix
> > > should be. I'll try to write a patch if someone could point me in the
> > > right direction.
> >
> > Something like this perhaps ?
>
> That looks good, a little better than the solution
> I was first considering. I'm compiling now.

The patch fixes the problem (at least under the test conditions that lead me
to discover it in the first place). Thanks Alan.

> This was a nasty bug for me to introduce :-(
> Good work in finding this Laurent.

Thanks. It was a lot of work troubleshooting the problem. The bug report I got
was along the lines of "after the 256th incoming modem call, the application
doesn't receive data payloads anymore, but AT commands are still answered". I
guess the challenge is what made it fun.

--
Laurent Pinchart
CSE Semaphore Belgium

Chaussée de Bruxelles, 732A
B-1410 Waterloo
Belgium

T +32 (2) 387 42 59
F +32 (2) 387 42 75

2007-08-08 14:30:35

by Frederik Deweerdt

[permalink] [raw]
Subject: Re: Serial buffer memory leak

On Wed, Aug 08, 2007 at 02:45:32PM +0100, Alan Cox wrote:
> > I'm not familiar enough with the tty code to decide what the proper fix should
> > be. I'll try to write a patch if someone could point me in the right
> > direction.
>
> Something like this perhaps ?
>
[....]
> - * flush all the buffers containing receive data
> + * flush all the buffers containing receive data. Caller must
> + * hold the buffer lock and must have ensured no parallel flush to
> + * ldisc is running.
> *
> * Locking: none
^^^^
> */
Isn't the comment a bit misleading here? The caller must hold tty->buf.lock.

Frederik

2007-08-08 14:53:15

by Paul Fulghum

[permalink] [raw]
Subject: Re: Serial buffer memory leak

On Wed, 2007-08-08 at 16:28 +0200, Laurent Pinchart wrote:

> The patch fixes the problem (at least under the test conditions that lead me
> to discover it in the first place). Thanks Alan.

It works here also, but I see a problem.

By deferring the flush, ioctl(TCFLSH) returns immediately
even though there is possibly still receive data being fed
to the ldisc.

If this is followed immediately by a read() then the
application gets unexpected (stale) data defeating the
purpose of the TCFLSH.

tty_buffer_flush() needs to wait for buf.flushpending to clear.


--
Paul Fulghum
Microgate Systems, Ltd

2007-08-08 15:08:55

by Alan

[permalink] [raw]
Subject: Re: Serial buffer memory leak

> tty_buffer_flush() needs to wait for buf.flushpending to clear.

Not 100% sure it does as its no different to events really occuring with
a possible timing. Trivial to wait_event on the read queue for this
however and definitely worth doing to be sure.

Should probably make the flushpending an atomic bit op to avoid taking
and retaking the lock.

Alan

2007-08-08 15:25:51

by Alan

[permalink] [raw]
Subject: Re: Serial buffer memory leak

On Wed, 8 Aug 2007 16:16:06 +0100
Alan Cox <[email protected]> wrote:

> > tty_buffer_flush() needs to wait for buf.flushpending to clear.

> Should probably make the flushpending an atomic bit op to avoid taking
> and retaking the lock.

Ok try this for size folks. Use tty->flags bits for the flush status.
Wait for the flag to clear again before returning
Fix the doc error noted
Fix flush of empty queue leaving stale flushpending


Signed-off-by: Alan Cox <[email protected]>

diff -u --new-file --recursive --exclude-from /usr/src/exclude linux.vanilla-2.6.23rc1-mm1/drivers/char/tty_io.c linux-2.6.23rc1-mm1/drivers/char/tty_io.c
--- linux.vanilla-2.6.23rc1-mm1/drivers/char/tty_io.c 2007-07-26 15:02:57.000000000 +0100
+++ linux-2.6.23rc1-mm1/drivers/char/tty_io.c 2007-08-08 16:01:36.843558336 +0100
@@ -369,25 +369,54 @@
}

/**
- * tty_buffer_flush - flush full tty buffers
+ * __tty_buffer_flush - flush full tty buffers
* @tty: tty to flush
*
- * flush all the buffers containing receive data
+ * flush all the buffers containing receive data. Caller must
+ * hold the buffer lock and must have ensured no parallel flush to
+ * ldisc is running.
*
- * Locking: none
+ * Locking: Caller must hold tty->buf.lock
*/

-static void tty_buffer_flush(struct tty_struct *tty)
+static void __tty_buffer_flush(struct tty_struct *tty)
{
struct tty_buffer *thead;
- unsigned long flags;

- spin_lock_irqsave(&tty->buf.lock, flags);
while((thead = tty->buf.head) != NULL) {
tty->buf.head = thead->next;
tty_buffer_free(tty, thead);
}
tty->buf.tail = NULL;
+}
+
+/**
+ * tty_buffer_flush - flush full tty buffers
+ * @tty: tty to flush
+ *
+ * flush all the buffers containing receive data. If the buffer is
+ * being processed by flush_to_ldisc then we defer the processing
+ * to that function
+ *
+ * Locking: none
+ */
+
+static void tty_buffer_flush(struct tty_struct *tty)
+{
+ unsigned long flags;
+ spin_lock_irqsave(&tty->buf.lock, flags);
+
+ /* If the data is being pushed to the tty layer then we can't
+ process it here. Instead set a flag and the flush_to_ldisc
+ path will process the flush request before it exits */
+ if (test_bit(TTY_FLUSHING, &tty->flags)) {
+ set_bit(TTY_FLUSHPENDING, &tty->flags);
+ spin_unlock_irqrestore(&tty->buf.lock, flags);
+ wait_event(tty->read_wait, test_bit(TTY_FLUSHPENDING, &tty->flags) == 0);
+ return;
+ }
+ else
+ __tty_buffer_flush(tty);
spin_unlock_irqrestore(&tty->buf.lock, flags);
}

@@ -3594,6 +3622,7 @@
return;

spin_lock_irqsave(&tty->buf.lock, flags);
+ set_bit(TTY_FLUSHING, &tty->flags); /* So we know a flush is running */
head = tty->buf.head;
if (head != NULL) {
tty->buf.head = NULL;
@@ -3607,6 +3636,11 @@
tty_buffer_free(tty, tbuf);
continue;
}
+ /* Ldisc or user is trying to flush the buffers
+ we are feeding to the ldisc, stop feeding the
+ line discipline as we want to empty the queue */
+ if (test_bit(TTY_FLUSHPENDING, &tty->flags))
+ break;
if (!tty->receive_room) {
schedule_delayed_work(&tty->buf.work, 1);
break;
@@ -3620,8 +3654,16 @@
disc->receive_buf(tty, char_buf, flag_buf, count);
spin_lock_irqsave(&tty->buf.lock, flags);
}
+ /* Restore the queue head */
tty->buf.head = head;
}
+ /* We may have a deferred request to flush the input buffer,
+ if so pull the chain under the lock and empty the queue */
+ if (test_bit(TTY_FLUSHPENDING, &tty->flags)) {
+ __tty_buffer_flush(tty);
+ clear_bit(TTY_FLUSHPENDING, &tty->flags);
+ }
+ clear_bit(TTY_FLUSHING, &tty->flags);
spin_unlock_irqrestore(&tty->buf.lock, flags);

tty_ldisc_deref(disc);
diff -u --new-file --recursive --exclude-from /usr/src/exclude linux.vanilla-2.6.23rc1-mm1/include/linux/tty.h linux-2.6.23rc1-mm1/include/linux/tty.h
--- linux.vanilla-2.6.23rc1-mm1/include/linux/tty.h 2007-07-26 15:02:04.000000000 +0100
+++ linux-2.6.23rc1-mm1/include/linux/tty.h 2007-08-08 16:00:28.089010616 +0100
@@ -274,6 +262,8 @@
#define TTY_PTY_LOCK 16 /* pty private */
#define TTY_NO_WRITE_SPLIT 17 /* Preserve write boundaries to driver */
#define TTY_HUPPED 18 /* Post driver->hangup() */
+#define TTY_FLUSHING 19 /* Flushing to ldisc in progress */
+#define TTY_FLUSHPENDING 20 /* Queued buffer flush pending */

#define TTY_WRITE_FLUSH(tty) tty_write_flush((tty))

2007-08-08 17:36:06

by Paul Fulghum

[permalink] [raw]
Subject: Re: Serial buffer memory leak

On Wed, 2007-08-08 at 16:32 +0100, Alan Cox wrote:

> Ok try this for size folks. Use tty->flags bits for the flush status.
> Wait for the flag to clear again before returning
> Fix the doc error noted
> Fix flush of empty queue leaving stale flushpending
>
>
> Signed-off-by: Alan Cox <[email protected]>

It looks good and clean.

I compiled and tested the patch with interleaved
tcflush() and read() calls, allowing random amounts
of receive data to accumulate between each call.

Acked-by: Paul Fulghum <[email protected]>

Thanks,
Paul

--
Paul Fulghum
Microgate Systems, Ltd