When sending large quantities of data through a CDC ACM channel it is possible
for data to be lost when attempting to copy the data to the tty buffer. This
occurs due to the return value from tty_insert_flip_string not being checked.
This patch adds checking for how many bytes have been inserted into the tty
buffer and returns any remaining bytes back to the filled read buffer list.
Signed-off-by: Toby Gray <[email protected]>
---
drivers/usb/class/cdc-acm.c | 28 ++++++++++++++++++++++++----
1 files changed, 24 insertions(+), 4 deletions(-)
diff --git a/drivers/usb/class/cdc-acm.c b/drivers/usb/class/cdc-acm.c
index f492a7f..63133c7 100644
--- a/drivers/usb/class/cdc-acm.c
+++ b/drivers/usb/class/cdc-acm.c
@@ -392,6 +392,7 @@ static void acm_rx_tasklet(unsigned long _acm)
struct acm_ru *rcv;
unsigned long flags;
unsigned char throttled;
+ int copied;
dbg("Entering acm_rx_tasklet");
@@ -423,12 +424,14 @@ next_buffer:
dbg("acm_rx_tasklet: procesing buf 0x%p, size = %d", buf, buf->size);
+ copied = 0;
if (tty) {
spin_lock_irqsave(&acm->throttle_lock, flags);
throttled = acm->throttle;
spin_unlock_irqrestore(&acm->throttle_lock, flags);
if (!throttled) {
- tty_insert_flip_string(tty, buf->base, buf->size);
+ copied = tty_insert_flip_string(tty,
+ buf->base, buf->size);
tty_flip_buffer_push(tty);
} else {
tty_kref_put(tty);
@@ -440,9 +443,26 @@ next_buffer:
}
}
- spin_lock_irqsave(&acm->read_lock, flags);
- list_add(&buf->list, &acm->spare_read_bufs);
- spin_unlock_irqrestore(&acm->read_lock, flags);
+ if (copied == buf->size || !tty) {
+ spin_lock_irqsave(&acm->read_lock, flags);
+ list_add(&buf->list, &acm->spare_read_bufs);
+ spin_unlock_irqrestore(&acm->read_lock, flags);
+ } else {
+ tty_kref_put(tty);
+ dbg("Partial buffer fill");
+ if (copied > 0) {
+ memmove(buf->base,
+ buf->base + copied,
+ buf->size - copied);
+ buf->size -= copied;
+ }
+
+ spin_lock_irqsave(&acm->read_lock, flags);
+ list_add(&buf->list, &acm->filled_read_bufs);
+ spin_unlock_irqrestore(&acm->read_lock, flags);
+ return;
+ }
+
goto next_buffer;
urbs:
--
1.7.0.4
On Mon, 21 Mar 2011 15:52:25 +0000
Toby Gray <[email protected]> wrote:
> When sending large quantities of data through a CDC ACM channel it is possible
> for data to be lost when attempting to copy the data to the tty buffer. This
> occurs due to the return value from tty_insert_flip_string not being checked.
For a tty that is normally the right thing to do - no flow control was
asserted and the internal 64K of buffering was overrun so discard.
> This patch adds checking for how many bytes have been inserted into the tty
> buffer and returns any remaining bytes back to the filled read buffer list.
How does ACM handle flow control - is the expectation that it gets flow
controlled in hardware by not having the opportunity to send bits
to the host end ? If so this seems to make sense.
> + copied = 0;
Surely copied = buf->size is the no tty assumption "we had the lot and
discarded it"
> if (tty) {
> spin_lock_irqsave(&acm->throttle_lock, flags);
> throttled = acm->throttle;
> spin_unlock_irqrestore(&acm->throttle_lock, flags);
> if (!throttled) {
> - tty_insert_flip_string(tty, buf->base, buf->size);
> + copied = tty_insert_flip_string(tty,
> + buf->base, buf->size);
> tty_flip_buffer_push(tty);
> } else {
> tty_kref_put(tty);
> @@ -440,9 +443,26 @@ next_buffer:
> }
> }
>
> - spin_lock_irqsave(&acm->read_lock, flags);
> - list_add(&buf->list, &acm->spare_read_bufs);
> - spin_unlock_irqrestore(&acm->read_lock, flags);
> + if (copied == buf->size || !tty) {
Which would simplify this lot
> + spin_lock_irqsave(&acm->read_lock, flags);
> + list_add(&buf->list, &acm->spare_read_bufs);
> + spin_unlock_irqrestore(&acm->read_lock, flags);
> + } else {
> + tty_kref_put(tty);
> + dbg("Partial buffer fill");
> + if (copied > 0) {
> + memmove(buf->base,
> + buf->base + copied,
> + buf->size - copied);
> + buf->size -= copied;
> + }
Would it not be cleaner to add a buf->head pointer that could simply be
advanced so the code would become
buf->head += copied;
buf->size -= copied;
if (buf->size != 0)
list_add(&buf->list, &acm->filled_read_bufs);
instead of all the memmove uglies ?
On 21/03/2011 16:56, Alan Cox wrote:
> On Mon, 21 Mar 2011 15:52:25 +0000
> Toby Gray<[email protected]> wrote:
>> This patch adds checking for how many bytes have been inserted into the tty
>> buffer and returns any remaining bytes back to the filled read buffer list.
> How does ACM handle flow control - is the expectation that it gets flow
> controlled in hardware by not having the opportunity to send bits
> to the host end ? If so this seems to make sense.
Yes, the expectation is that it is flow controlled in hardware if a bulk
endpoint is used. To this end cdc-acm.ko doesn't issue any new read
requests while the tty is throttled. This works as flow control most of
the time, but not when the data is arriving quickly (a few megabytes per
second).
Thank you for your other comments. I had initially gone for the memmove
thinking that having a buffer head as well as a base would have
needlessly complicated the rest of the code.
Having made the required changes it seems I'd greatly overestimated the
changes and it is definitely far cleaner without the memmove.
Regards,
Toby
When sending large quantities of data through a CDC ACM channel it is possible
for data to be lost when attempting to copy the data to the tty buffer. This
occurs due to the return value from tty_insert_flip_string not being checked.
This patch adds checking for how many bytes have been inserted into the tty
buffer and returns any remaining bytes back to the filled read buffer list.
Signed-off-by: Toby Gray <[email protected]>
---
drivers/usb/class/cdc-acm.c | 25 +++++++++++++++++++++----
drivers/usb/class/cdc-acm.h | 1 +
2 files changed, 22 insertions(+), 4 deletions(-)
diff --git a/drivers/usb/class/cdc-acm.c b/drivers/usb/class/cdc-acm.c
index f492a7f..f24d41c 100644
--- a/drivers/usb/class/cdc-acm.c
+++ b/drivers/usb/class/cdc-acm.c
@@ -363,6 +363,7 @@ static void acm_read_bulk(struct urb *urb)
dev_dbg(&acm->data->dev, "bulk rx status %d\n", status);
buf = rcv->buffer;
+ buf->head = buf->base;
buf->size = urb->actual_length;
if (likely(status == 0)) {
@@ -392,6 +393,7 @@ static void acm_rx_tasklet(unsigned long _acm)
struct acm_ru *rcv;
unsigned long flags;
unsigned char throttled;
+ int copied;
dbg("Entering acm_rx_tasklet");
@@ -423,12 +425,14 @@ next_buffer:
dbg("acm_rx_tasklet: procesing buf 0x%p, size = %d", buf, buf->size);
+ copied = buf->size;
if (tty) {
spin_lock_irqsave(&acm->throttle_lock, flags);
throttled = acm->throttle;
spin_unlock_irqrestore(&acm->throttle_lock, flags);
if (!throttled) {
- tty_insert_flip_string(tty, buf->base, buf->size);
+ copied = tty_insert_flip_string(tty,
+ buf->head, buf->size);
tty_flip_buffer_push(tty);
} else {
tty_kref_put(tty);
@@ -440,9 +444,22 @@ next_buffer:
}
}
- spin_lock_irqsave(&acm->read_lock, flags);
- list_add(&buf->list, &acm->spare_read_bufs);
- spin_unlock_irqrestore(&acm->read_lock, flags);
+ buf->head += copied;
+ buf->size -= copied;
+
+ if (buf->size == 0) {
+ spin_lock_irqsave(&acm->read_lock, flags);
+ list_add(&buf->list, &acm->spare_read_bufs);
+ spin_unlock_irqrestore(&acm->read_lock, flags);
+ } else {
+ tty_kref_put(tty);
+ dbg("Partial buffer fill");
+ spin_lock_irqsave(&acm->read_lock, flags);
+ list_add(&buf->list, &acm->filled_read_bufs);
+ spin_unlock_irqrestore(&acm->read_lock, flags);
+ return;
+ }
+
goto next_buffer;
urbs:
diff --git a/drivers/usb/class/cdc-acm.h b/drivers/usb/class/cdc-acm.h
index 5eeb570..d7581f6 100644
--- a/drivers/usb/class/cdc-acm.h
+++ b/drivers/usb/class/cdc-acm.h
@@ -75,6 +75,7 @@ struct acm_rb {
struct list_head list;
int size;
unsigned char *base;
+ unsigned char *head;
dma_addr_t dma;
};
--
1.7.0.4
Am Montag, 21. M?rz 2011, 16:52:25 schrieb Toby Gray:
> When sending large quantities of data through a CDC ACM channel it is possible
> for data to be lost when attempting to copy the data to the tty buffer. This
> occurs due to the return value from tty_insert_flip_string not being checked.
>
> This patch adds checking for how many bytes have been inserted into the tty
> buffer and returns any remaining bytes back to the filled read buffer list.
>
Have you tested whether the driver recovers from running out of buffers?
Other than that it is looking good to me.
Regards
Oliver
Am Montag, 21. M?rz 2011, 17:56:12 schrieb Alan Cox:
> On Mon, 21 Mar 2011 15:52:25 +0000
> Toby Gray <[email protected]> wrote:
>
> > When sending large quantities of data through a CDC ACM channel it is possible
> > for data to be lost when attempting to copy the data to the tty buffer. This
> > occurs due to the return value from tty_insert_flip_string not being checked.
>
> For a tty that is normally the right thing to do - no flow control was
> asserted and the internal 64K of buffering was overrun so discard.
But should we really randomly discard a part of a buffer?
If this happens the better alternative approach would be to nuke all buffers
we currently have.
Regards
Oliver
On Mon, Mar 21, 2011 at 06:04:58PM +0000, Toby Gray wrote:
> When sending large quantities of data through a CDC ACM channel it is possible
> for data to be lost when attempting to copy the data to the tty buffer. This
> occurs due to the return value from tty_insert_flip_string not being checked.
>
> This patch adds checking for how many bytes have been inserted into the tty
> buffer and returns any remaining bytes back to the filled read buffer list.
[...]
> @@ -392,6 +393,7 @@ static void acm_rx_tasklet(unsigned long _acm)
[...]
> - spin_lock_irqsave(&acm->read_lock, flags);
> - list_add(&buf->list, &acm->spare_read_bufs);
> - spin_unlock_irqrestore(&acm->read_lock, flags);
> + buf->head += copied;
> + buf->size -= copied;
> +
> + if (buf->size == 0) {
> + spin_lock_irqsave(&acm->read_lock, flags);
> + list_add(&buf->list, &acm->spare_read_bufs);
> + spin_unlock_irqrestore(&acm->read_lock, flags);
> + } else {
> + tty_kref_put(tty);
> + dbg("Partial buffer fill");
> + spin_lock_irqsave(&acm->read_lock, flags);
> + list_add(&buf->list, &acm->filled_read_bufs);
> + spin_unlock_irqrestore(&acm->read_lock, flags);
> + return;
> + }
> +
Say you fill up the tty buffer using the last of the sixteen buffers and
return in the else clause above, how will the tasklet ever get
re-scheduled?
The problem is that the tasklet is only scheduled on urb completion and
unthrottle (after open), and if you return above no urb will get
re-submitted. So the only way this will work is if it can be guaranteed
that the line discipline will throttle and later unthrottle us. I
doubt that is the case, but perhaps Alan can give a more definite
answer?
[By the way, did you see Filippe Balbi's patch posted today claiming to
fix a bug in n_tty which could cause data loss at high speeds?]
I was just about to submit a patch series killing the rx tasklet and
heavily simplifying the cdc-acm driver when you posted last night. I
think that if this mechanism is needed it is more straight-forwardly
implemented on top of those as they removes a lot of complexity and
makes it easier to spot corner cases such as the one pointed out above.
I would also prefer a more generic solution to the problem so that we
don't need to re-introduce driver buffering again. Since we already have
the throttelling mechanism in place, if we could only be notified/find
out that the tty buffers are say half-full, we could throttle (from
within the driver) but still push the remaining buffer still on the wire
as they arrive. It would of course require a guarantee that such a
throttle-is-about-to-happen notification is actually followed by (a
throttle and) unthrottle. Thoughts on that?
Thanks,
Johan
> re-submitted. So the only way this will work is if it can be guaranteed
> that the line discipline will throttle and later unthrottle us. I
> doubt that is the case, but perhaps Alan can give a more definite
> answer?
If an ldisc throttles it should always later unthrottle. However flow
control is async so at high enough data rates it'll be too slow.
> I would also prefer a more generic solution to the problem so that we
> don't need to re-introduce driver buffering again. Since we already have
> the throttelling mechanism in place, if we could only be notified/find
> out that the tty buffers are say half-full, we could throttle (from
The tty layer actually knows this fact
> within the driver) but still push the remaining buffer still on the wire
> as they arrive. It would of course require a guarantee that such a
> throttle-is-about-to-happen notification is actually followed by (a
> throttle and) unthrottle. Thoughts on that?
tty throttling is at the ldisc layer, the tty buffers are below this. The
space left being 64K - tty->buf.memory_used
So you can certainly add the following routine
int tty_constipated(struct tty *t)
{
if (tty->buf.memory_used > 49152)
return 1;
return 0;
}
EXPORT_SYMBOL_GPL(tty_constipated);
to drivers/tty/tty_buffer.c
The wakeup side is a bit trickier.
The down side of this of course is that you are likely to run at below
peak performance as you'll keep throttling back the hardware, whereas if
you have a tickless kernel with HZ set to 1000 it's probably sufficient
to bump the buffer sizes.
Right now (see tty_buffer.c) it's simply set to 64K as a sanity check
against throttled devices not stopping, but there isn't actually any
reason it couldn't be configurable at port setup time.
On Tue, 22 Mar 2011 09:07:42 +0100
Oliver Neukum <[email protected]> wrote:
> Am Montag, 21. M?rz 2011, 17:56:12 schrieb Alan Cox:
> > On Mon, 21 Mar 2011 15:52:25 +0000
> > Toby Gray <[email protected]> wrote:
> >
> > > When sending large quantities of data through a CDC ACM channel it is possible
> > > for data to be lost when attempting to copy the data to the tty buffer. This
> > > occurs due to the return value from tty_insert_flip_string not being checked.
> >
> > For a tty that is normally the right thing to do - no flow control was
> > asserted and the internal 64K of buffering was overrun so discard.
>
> But should we really randomly discard a part of a buffer?
> If this happens the better alternative approach would be to nuke all buffers
> we currently have.
If you have a stall on a PC eg a hard disk going through a fault sequence
and jamming up the CPU (as PATA can do so well) you don't want to discard
63K of perfectly good PPP frames just because you lost the following one.
It's also of course what the hardware itself does below us so the
behaviour isn't changing, we just provide a somewhat software extended
FIFO.
On Tue, Mar 22, 2011 at 10:35:34AM +0000, Alan Cox wrote:
> > re-submitted. So the only way this will work is if it can be guaranteed
> > that the line discipline will throttle and later unthrottle us. I
> > doubt that is the case, but perhaps Alan can give a more definite
> > answer?
>
> If an ldisc throttles it should always later unthrottle. However flow
> control is async so at high enough data rates it'll be too slow.
So my suspicion that no guarantees can be made that ldisc will _eventually_
throttle us, is correct?
The patch would introduce its parallel throttling scheme (through not
rescheduling the tasklet), and will only work if a throttle request is
guaranteed to eventually arrive:
16 x read_urb_bulk
- queues up 16 urbs and buffers
rx_tasklet
- attempt to push 16 buffers -- one is only partially pushed,
e.g. 64k of tty buffers full (or out of mem?)
- returns without resubmitting any urb or rescheduling tasklet
Q: will ldisc always throttle us as the 64K worth of data is
later propagated to ldisc?
A: Yes -- ldisc will throttle and later unthrottle, thereby
rescheduling tasklet which resumes reading.
No -- read will lock up
> > I would also prefer a more generic solution to the problem so that we
> > don't need to re-introduce driver buffering again. Since we already have
> > the throttelling mechanism in place, if we could only be notified/find
> > out that the tty buffers are say half-full, we could throttle (from
>
> The tty layer actually knows this fact
>
> > within the driver) but still push the remaining buffer still on the wire
> > as they arrive. It would of course require a guarantee that such a
> > throttle-is-about-to-happen notification is actually followed by (a
> > throttle and) unthrottle. Thoughts on that?
>
> tty throttling is at the ldisc layer, the tty buffers are below this. The
> space left being 64K - tty->buf.memory_used
>
> So you can certainly add the following routine
>
> int tty_constipated(struct tty *t)
> {
> if (tty->buf.memory_used > 49152)
> return 1;
> return 0;
> }
> EXPORT_SYMBOL_GPL(tty_constipated);
>
> to drivers/tty/tty_buffer.c
>
> The wakeup side is a bit trickier.
>
> The down side of this of course is that you are likely to run at below
> peak performance as you'll keep throttling back the hardware, whereas if
> you have a tickless kernel with HZ set to 1000 it's probably sufficient
> to bump the buffer sizes.
>
> Right now (see tty_buffer.c) it's simply set to 64K as a sanity check
> against throttled devices not stopping, but there isn't actually any
> reason it couldn't be configurable at port setup time.
So you suggest increasing the tty buffering instead of solving the wake
up problem? Would this really be sufficient, though? What if a driver
pushes and resubmits from interrupt context and with sufficiently many
bulk urbs on the wire at high speeds manage to starve the tty work
queue so nothing gets flushed to the ldisc? Or this cannot happen?
Thanks,
Johan
On 22/03/2011 10:05, Johan Hovold wrote:
> Say you fill up the tty buffer using the last of the sixteen buffers and
> return in the else clause above, how will the tasklet ever get
> re-scheduled?
That's a good point. I did check that this was happening during testing,
but looking into n_tty.c some more it seems that I probably just got
lucky with n_tty_receive_buf running to completion (and so throttling)
before n_tty_read got to run (and so un-throttling).
However I'm not 100% sure that I'm correctly understanding how
n_tty_receive_buf and n_tty_read interact. I can't see why it's safe for
n_tty_receive_buf to not have any sort of synchronization mechanism
between checking tty->receive_room and calling tty_throttle.
Is there a mechanism preventing a different thread from running
n_tty_read between n_tty_receive_buf finding receive_room to be below
the threshold and tty_throttle being called? If not then isn't there a
race condition when the following happens:
1. n_tty_receive_buf fills up the buffer so that the free space
is below TTY_THRESHOLD_THROTTLE
2. n_tty_receive_buf comes to the check at the end and decide
that it needs to call tty_throttle
3. Thread rescheduling happens and a different thread runs
n_tty_read which empties the buffer
4. After emptying the buffer n_tty_read calls tty_unthrottle,
which does nothing as the throttling bit isn't set
5. The n_tty_receive_buf thread is executed again, calling
tty_throttle, causing throttling, but with an empty buffer.
Or have I not understood a complexity in the interactions within n_tty.c?
> [By the way, did you see Filippe Balbi's patch posted today claiming to
> fix a bug in n_tty which could cause data loss at high speeds?]
I'd not seen that. Thanks for pointing that out.
> I was just about to submit a patch series killing the rx tasklet and
> heavily simplifying the cdc-acm driver when you posted last night. I
> think that if this mechanism is needed it is more straight-forwardly
> implemented on top of those as they removes a lot of complexity and
> makes it easier to spot corner cases such as the one pointed out above.
Makes sense. The cdc-acm driver is certainly far more readable with your
changes.
> I would also prefer a more generic solution to the problem so that we
> don't need to re-introduce driver buffering again. Since we already have
> the throttelling mechanism in place, if we could only be notified/find
> out that the tty buffers are say half-full, we could throttle (from
> within the driver) but still push the remaining buffer still on the wire
> as they arrive. It would of course require a guarantee that such a
> throttle-is-about-to-happen notification is actually followed by (a
> throttle and) unthrottle. Thoughts on that?
I was thinking about this too. Would it be good enough to add the
ability for forcing tty throttling and to call that instead of issuing
another URB read when the space remaining in the buffer would then be
less than the pending URB reads? Or would having a notification at a
particular level be more useful for other drivers?
Regards,
Toby
> Is there a mechanism preventing a different thread from running
> n_tty_read between n_tty_receive_buf finding receive_room to be below
> the threshold and tty_throttle being called? If not then isn't there a
> race condition when the following happens:
n_tty_receive is single threaded and is going to get run in preference to
user threads.
> 1. n_tty_receive_buf fills up the buffer so that the free space
> is below TTY_THRESHOLD_THROTTLE
> 2. n_tty_receive_buf comes to the check at the end and decide
> that it needs to call tty_throttle
> 3. Thread rescheduling happens and a different thread runs
> n_tty_read which empties the buffer
> 4. After emptying the buffer n_tty_read calls tty_unthrottle,
> which does nothing as the throttling bit isn't set
> 5. The n_tty_receive_buf thread is executed again, calling
> tty_throttle, causing throttling, but with an empty buffer.
>
> Or have I not understood a complexity in the interactions within n_tty.c?
Looks possible - historically it would have been safe but not any more.
The scenario I think would have to be two thread of execution in parallel
on two processors at the same moment and with near perfect timing but I
don't see why it can't happen.
Alan
On 22/03/2011 07:43, Oliver Neukum wrote:
> Have you tested whether the driver recovers from running out of buffers?
> Other than that it is looking good to me.
I had tested the driver running out of buffers quite extensively, but I
think I must have just been lucky with the timings. Now that I've had a
look at the code in more detail, I believe it would be possible for a
throttle/unthrottle not to occur, in which case no more new URB reads
would be submitted.
I've attached a v3 version of the patch which ensures that either the
tty is throttled or the tasklet is rescheduled. I assume that it would
be preferable to wait until Johan Hovold's '[PATCH 16/16] USB: cdc-acm:
re-write read processing' has been submitted and then for me to submit
another similar patch to this one, but which doesn't need the rx tasklet.
I decided it was still worth attaching the patch as it'll allow people
to fix this issue on older kernel versions
Regards,
Toby