2007-05-04 09:33:14

by Antonino Ingargiola

[permalink] [raw]
Subject: [SOLVED] Serial buffer corruption [was Re: FTDI usb-serial possible bug]

Accidentally I've replied privately, sorry. Forwarding to ML...

---------- Forwarded message ----------
From: Antonino Ingargiola <[email protected]>
Date: May 4, 2007 11:29 AM
Subject: Re: [SOLVED] Serial buffer corruption [was Re: FTDI
usb-serial possible bug]
To: Oliver Neukum <[email protected]>


On 5/4/07, Oliver Neukum <[email protected]> wrote:
> Am Freitag, 4. Mai 2007 10:38 schrieb Antonino Ingargiola:
> > To solve the problem we must do a complete flush of all the buffer
> > chain. I do this flushing the input multiple times with a small pause
> > between them. In my case 10 flushes separated by a 10ms pause always
> > empties the whole buffer chain, so I get no corruption anymore. I'ts
> > not an elegant solution but it works (10 flushes are an overkill but I
> > want to be _really_ sure to read the correct data).
>
> How do you flush the buffers? Simply reading them out?

Nope. In python I use the flushInput() method of the serial object
defined by the pyserial library[0]. The method does just this system
call:

termios.tcflush(self.fd, TERMIOS.TCIFLUSH)

that I think is correct.

Cheers,

~ Antonio

[0]: http://pyserial.sourceforge.net/ (or python-serial debian package)


2007-05-04 13:41:42

by Oliver Neukum

[permalink] [raw]
Subject: Re: [SOLVED] Serial buffer corruption [was Re: FTDI usb-serial possible bug]


> > How do you flush the buffers? Simply reading them out?
>
> Nope. In python I use the flushInput() method of the serial object
> defined by the pyserial library[0]. The method does just this system
> call:
>
> termios.tcflush(self.fd, TERMIOS.TCIFLUSH)

case TCFLSH:
retval = tty_check_change(tty);
if (retval)
return retval;

ld = tty_ldisc_ref(tty);
switch (arg) {
case TCIFLUSH:
if (ld && ld->flush_buffer)
ld->flush_buffer(tty);
break;
case TCIOFLUSH:
if (ld && ld->flush_buffer)
ld->flush_buffer(tty);
/* fall through */
case TCOFLUSH:
if (tty->driver->flush_buffer)
tty->driver->flush_buffer(tty);
break;
default:
tty_ldisc_deref(ld);
return -EINVAL;
}
tty_ldisc_deref(ld);
return 0;

Most likely you were using n_tty.

static void n_tty_flush_buffer(struct tty_struct * tty)
{
/* clear everything and unthrottle the driver */
reset_buffer_flags(tty);

if (!tty->link)
return;

if (tty->link->packet) {
tty->ctrl_status |= TIOCPKT_FLUSHREAD;
wake_up_interruptible(&tty->link->read_wait);
}
}

It sets TIOCPKT_FLUSHREAD, which is used nowhere else in the kernel.
And I am confused. Could somebody who has some experience in the
tty layer comment on that?

Regards
Oliver

2007-05-04 13:43:28

by Paul Fulghum

[permalink] [raw]
Subject: Re: [SOLVED] Serial buffer corruption [was Re: FTDI usb-serial possible bug]

Antonino Ingargiola wrote:
> Nope. In python I use the flushInput() method of the serial object
> defined by the pyserial library[0]. The method does just this system
> call:
>
> termios.tcflush(self.fd, TERMIOS.TCIFLUSH)
>
> that I think is correct.

There is intermediate buffering between the driver and
the line discipline called the tty flip buffer.

receive data flow:
driver --> tty flip --> line discipline --> application

When you flush input, the line disciplines flush_buffer() method
is called to clear any data residing the in the line discipline.

This does not affect the tty flip buffer
or hardware receive FIFOs.

I suspect the biggest problem is the data in the
tty flip buffer.

A new function to flush the tty flip buffer needs to
be added and then called from tty_io.c:tty_ldisc_flush().

Then a call to tcflush(TCIFLUSH) will clear both buffers.

This still would not clear any data in the hardware
receive FIFOs.

--
Paul Fulghum
Microgate Systems, Ltd.

2007-05-04 13:54:27

by Paul Fulghum

[permalink] [raw]
Subject: Re: [SOLVED] Serial buffer corruption [was Re: FTDI usb-serial possible bug]

Paul Fulghum wrote:
> A new function to flush the tty flip buffer needs to
> be added and then called from tty_io.c:tty_ldisc_flush().

That is not enough.

It looks like tty_ioctl() needs to process ioctl(TCFLSH)
to clear the flip buffer before passing that ioctl
to the line discipline.

--
Paul Fulghum
Microgate Systems, Ltd.

2007-05-04 14:49:29

by Paul Fulghum

[permalink] [raw]
Subject: Re: [SOLVED] Serial buffer corruption [was Re: FTDI usb-serial possible bug]

Here is a patch against 2.6.21 which flushes the tty flip buffer
on ioctl(TCFLSH) for input.

--- a/drivers/char/tty_io.c 2007-04-25 22:08:32.000000000 -0500
+++ b/drivers/char/tty_io.c 2007-05-04 09:30:01.000000000 -0500
@@ -365,6 +365,28 @@ static void tty_buffer_free(struct tty_s
}

/**
+ * tty_buffer_flush - flush full tty buffers
+ * @tty: tty to flush
+ *
+ * flush all the buffers containing receive data
+ *
+ * Locking: none
+ */
+
+static void tty_buffer_flush(struct tty_struct *tty)
+{
+ struct tty_buffer *thead;
+ unsigned long flags;
+
+ spin_lock_irqsave(&tty->buf.lock, flags);
+ while((thead = tty->buf.head) != NULL) {
+ tty->buf.head = thead->next;
+ tty_buffer_free(tty, thead);
+ }
+ spin_unlock_irqrestore(&tty->buf.lock, flags);
+}
+
+/**
* tty_buffer_find - find a free tty buffer
* @tty: tty owning the buffer
* @size: characters wanted
@@ -1240,6 +1262,7 @@ void tty_ldisc_flush(struct tty_struct *
ld->flush_buffer(tty);
tty_ldisc_deref(ld);
}
+ tty_buffer_flush(tty);
}

EXPORT_SYMBOL_GPL(tty_ldisc_flush);
@@ -3336,6 +3359,15 @@ int tty_ioctl(struct inode * inode, stru
case TIOCMBIC:
case TIOCMBIS:
return tty_tiocmset(tty, file, cmd, p);
+ case TCFLSH:
+ switch (arg) {
+ case TCIFLUSH:
+ case TCIOFLUSH:
+ /* flush tty buffer and allow ldisc to process ioctl */
+ tty_buffer_flush(tty);
+ break;
+ }
+ break;
}
if (tty->driver->ioctl) {
retval = (tty->driver->ioctl)(tty, file, cmd, arg);


2007-05-04 16:04:15

by Antonino Ingargiola

[permalink] [raw]
Subject: Re: [SOLVED] Serial buffer corruption [was Re: FTDI usb-serial possible bug]

On 5/4/07, Paul Fulghum <[email protected]> wrote:
> Here is a patch against 2.6.21 which flushes the tty flip buffer
> on ioctl(TCFLSH) for input.
>
> --- a/drivers/char/tty_io.c 2007-04-25 22:08:32.000000000 -0500
> +++ b/drivers/char/tty_io.c 2007-05-04 09:30:01.000000000 -0500
<snip>

Thanks! I've applied the patch and I'm building the kernel. I'll
report the result.


Regards,

~ Antonio

2007-05-04 16:56:59

by Antonino Ingargiola

[permalink] [raw]
Subject: Re: [SOLVED] Serial buffer corruption [was Re: FTDI usb-serial possible bug]

On 5/4/07, Antonino Ingargiola <[email protected]> wrote:
> On 5/4/07, Paul Fulghum <[email protected]> wrote:
> > Here is a patch against 2.6.21 which flushes the tty flip buffer
> > on ioctl(TCFLSH) for input.
> >
> > --- a/drivers/char/tty_io.c 2007-04-25 22:08:32.000000000 -0500
> > +++ b/drivers/char/tty_io.c 2007-05-04 09:30:01.000000000 -0500
> <snip>
>
> Thanks! I've applied the patch and I'm building the kernel. I'll
> report the result.

The system blocks during booting. I can unblock it with SysReq+K but
then I'm unable to log into X.

The boot messages seems ok (full gzipped boot log attached):

Serial: 8250/16550 driver $Revision: 1.90 $ 4 ports, IRQ sharing disabled
kernel: serial8250: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
kernel: serial8250: ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A
kernel: 00:08: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
kernel: 00:09: ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A

and when I pres the emergency keys:

kernel: EXT3-fs: mounted filesystem with ordered data mode.
kernel: eth0: link up, 10Mbps, half-duplex, lpa 0x0000
kernel: NET: Registered protocol family 10
kernel: SysRq : SAK
kernel: SAK: killed process 1511 (vt-is-UTF8): process_session(p)==tty->session
kernel: SAK: killed process 1508 (unicode_start):
process_session(p)==tty->session
kernel: SAK: killed process 1409 (sh): process_session(p)==tty->session
kernel: SAK: killed process 301 (rc): process_session(p)==tty->session
kernel: SAK: killed process 298 (init): process_session(p)==tty->session
kernel: SAK: killed process 298 (init): process_session(p)==tty->session
kernel: SAK: killed process 301 (rc): process_session(p)==tty->session
kernel: SAK: killed process 1409 (sh): process_session(p)==tty->session
kernel: SAK: killed process 1508 (unicode_start):
process_session(p)==tty->session
kernel: SAK: killed process 1511 (vt-is-UTF8): process_session(p)==tty->session
kernel: ACPI: Processor [CPU0] (supports 2 throttling states)
kernel: [drm] Initialized drm 1.1.0 20060810
kernel: [drm] Initialized radeon 1.25.0 20060524 on minor 0
kernel: input: Power Button (FF) as /class/input/input4
kernel: ACPI: Power Button (FF) [PWRF]
kernel: input: Power Button (CM) as /class/input/input5

I'll try if I can make some serial test nevertheless from the console.


Regards,

~ Antonio


Attachments:
(No filename) (2.27 kB)
boot.log.gz (3.95 kB)
Download all attachments

2007-05-04 17:00:51

by Paul Fulghum

[permalink] [raw]
Subject: Re: [SOLVED] Serial buffer corruption [was Re: FTDI usb-serial possible bug]

Antonino Ingargiola wrote:
> The system blocks during booting. I can unblock it with SysReq+K but
> then I'm unable to log into X.

Hmmm, it tests here with no problem.

Does reversing the patch and rebuilding the kernel fix the boot?

--
Paul Fulghum
Microgate Systems, Ltd.

2007-05-04 17:13:28

by Antonino Ingargiola

[permalink] [raw]
Subject: Re: [SOLVED] Serial buffer corruption [was Re: FTDI usb-serial possible bug]

On 5/4/07, Paul Fulghum <[email protected]> wrote:
> Antonino Ingargiola wrote:
> > The system blocks during booting. I can unblock it with SysReq+K but
> > then I'm unable to log into X.
>
> Hmmm, it tests here with no problem.
>
> Does reversing the patch and rebuilding the kernel fix the boot?

The kernel to which I applied the patch was vanilla 2.6.21.1, and it
is the one I'm currently running (no config changes or other patches).

Regards,

~ Antonio

2007-05-04 17:20:54

by Paul Fulghum

[permalink] [raw]
Subject: Re: [SOLVED] Serial buffer corruption [was Re: FTDI usb-serial possible bug]

On Fri, 2007-05-04 at 19:13 +0200, Antonino Ingargiola wrote:
> On 5/4/07, Paul Fulghum <[email protected]> wrote:
> > Antonino Ingargiola wrote:
> > > The system blocks during booting. I can unblock it with SysReq+K but
> > > then I'm unable to log into X.
> >
> > Hmmm, it tests here with no problem.
> >
> > Does reversing the patch and rebuilding the kernel fix the boot?
>
> The kernel to which I applied the patch was vanilla 2.6.21.1, and it
> is the one I'm currently running (no config changes or other patches).

Are you using a serial port as console?

--
Paul Fulghum
Microgate Systems, Ltd

2007-05-04 17:25:58

by Antonino Ingargiola

[permalink] [raw]
Subject: Re: [SOLVED] Serial buffer corruption [was Re: FTDI usb-serial possible bug]

On 5/4/07, Paul Fulghum <[email protected]> wrote:
> On Fri, 2007-05-04 at 19:13 +0200, Antonino Ingargiola wrote:
> > On 5/4/07, Paul Fulghum <[email protected]> wrote:
> > > Antonino Ingargiola wrote:
> > > > The system blocks during booting. I can unblock it with SysReq+K but
> > > > then I'm unable to log into X.
> > >
> > > Hmmm, it tests here with no problem.
> > >
> > > Does reversing the patch and rebuilding the kernel fix the boot?
> >
> > The kernel to which I applied the patch was vanilla 2.6.21.1, and it
> > is the one I'm currently running (no config changes or other patches).
>
> Are you using a serial port as console?
>

No. Ehmm ... I've an usb keybord and vga monitor on a standard desktop pc.

Regards,

~ Antonio

2007-05-04 17:41:42

by Paul Fulghum

[permalink] [raw]
Subject: Re: [SOLVED] Serial buffer corruption [was Re: FTDI usb-serial possible bug]

On Fri, 2007-05-04 at 19:25 +0200, Antonino Ingargiola wrote:

> No. Ehmm ... I've an usb keybord and vga monitor on a standard desktop pc.

OK, I'm stumped.

I don't see how my patch could cause the machine to not boot
and I'm not seeing that behavior here.

--
Paul Fulghum
Microgate Systems, Ltd

2007-05-04 18:46:30

by Antonino Ingargiola

[permalink] [raw]
Subject: Re: [SOLVED] Serial buffer corruption [was Re: FTDI usb-serial possible bug]

On 5/4/07, Paul Fulghum <[email protected]> wrote:
> On Fri, 2007-05-04 at 19:25 +0200, Antonino Ingargiola wrote:
>
> > No. Ehmm ... I've an usb keybord and vga monitor on a standard desktop pc.
>
> OK, I'm stumped.
>
> I don't see how my patch could cause the machine to not boot
> and I'm not seeing that behavior here.

I can boot it hitting SysReq+K, but only in text mode. X start but I
can't do login (the mouse moves but I cant enter username and
password). Furthermore if I hit some key in GDM I can't switch to text
console anymore. If I switch to text-mode after boot all seems to
work. Maybe this is related to some important process I kill with the
SysReq key.

I'm quite confident the system blocks on "Setting consoles fonts and
modes.", and thus during the boot script console-screen.h. I'll try to
see which commands blocks...

BTW, I've tried the serial port in text mode. With the patch, flushing
the input results effectively in a complete flush. However after doing
the flush I can't read any char anymore without closing and reopening
the port. I've verified this behavior both communicating between two
serial port and both communicating with an usb-serial device (driver
cdc-acm).

Regards,

~ Antonio

2007-05-04 19:06:26

by Antonino Ingargiola

[permalink] [raw]
Subject: Re: [SOLVED] Serial buffer corruption [was Re: FTDI usb-serial possible bug]

On 5/4/07, Antonino Ingargiola <[email protected]> wrote:
[cut]
> I'm quite confident the system blocks on "Setting consoles fonts and
> modes.", and thus during the boot script console-screen.h. I'll try to
> see which commands blocks...

Filling with echo console-screen.sh I've found that the blocking command is:

unicode_start 2> /dev/null || true

or at least the echo before this command is the last shown.

Regards,

~ Antonio

2007-05-04 19:49:41

by Paul Fulghum

[permalink] [raw]
Subject: Re: [SOLVED] Serial buffer corruption [was Re: FTDI usb-serial possible bug]

On Fri, 2007-05-04 at 21:06 +0200, Antonino Ingargiola wrote:

> Filling with echo console-screen.sh I've found that the blocking command is:
>
> unicode_start 2> /dev/null || true
>
> or at least the echo before this command is the last shown.

I still don't know what is blocking.

It is possible some tty device is operating with an improperly
initialized tty structure. I vaguely remember some console code
creating its own minimally initialized 'dummy' tty structure.

This might be causing the new flush code to hang.

Try this patch which does not call the flush unless a line
discipline is attached to the tty. That should indicate
a normally initialized tty structure.

--- a/drivers/char/tty_io.c 2007-04-25 22:08:32.000000000 -0500
+++ b/drivers/char/tty_io.c 2007-05-04 12:15:18.000000000 -0500
@@ -365,6 +365,28 @@ static void tty_buffer_free(struct tty_s
}

/**
+ * tty_buffer_flush - flush full tty buffers
+ * @tty: tty to flush
+ *
+ * flush all the buffers containing receive data
+ *
+ * Locking: none
+ */
+
+static void tty_buffer_flush(struct tty_struct *tty)
+{
+ struct tty_buffer *thead;
+ unsigned long flags;
+
+ spin_lock_irqsave(&tty->buf.lock, flags);
+ while((thead = tty->buf.head) != NULL) {
+ tty->buf.head = thead->next;
+ tty_buffer_free(tty, thead);
+ }
+ spin_unlock_irqrestore(&tty->buf.lock, flags);
+}
+
+/**
* tty_buffer_find - find a free tty buffer
* @tty: tty owning the buffer
* @size: characters wanted
@@ -1236,8 +1258,10 @@ void tty_ldisc_flush(struct tty_struct *
{
struct tty_ldisc *ld = tty_ldisc_ref(tty);
if(ld) {
- if(ld->flush_buffer)
+ if(ld->flush_buffer) {
ld->flush_buffer(tty);
+ tty_buffer_flush(tty);
+ }
tty_ldisc_deref(ld);
}
}
@@ -3336,6 +3360,20 @@ int tty_ioctl(struct inode * inode, stru
case TIOCMBIC:
case TIOCMBIS:
return tty_tiocmset(tty, file, cmd, p);
+ case TCFLSH:
+ switch (arg) {
+ case TCIFLUSH:
+ case TCIOFLUSH:
+ /* flush tty buffer and allow ldisc to process ioctl */
+ ld = tty_ldisc_ref_wait(tty);
+ if (ld) {
+ if (ld->ioctl)
+ tty_buffer_flush(tty);
+ tty_ldisc_deref(ld);
+ }
+ break;
+ }
+ break;
}
if (tty->driver->ioctl) {
retval = (tty->driver->ioctl)(tty, file, cmd, arg);


2007-05-04 21:21:29

by Antonino Ingargiola

[permalink] [raw]
Subject: Re: [SOLVED] Serial buffer corruption [was Re: FTDI usb-serial possible bug]

On 5/4/07, Paul Fulghum <[email protected]> wrote:
> On Fri, 2007-05-04 at 21:06 +0200, Antonino Ingargiola wrote:
>
> > Filling with echo console-screen.sh I've found that the blocking command is:
> >
> > unicode_start 2> /dev/null || true
> >
> > or at least the echo before this command is the last shown.
>
> I still don't know what is blocking.
>
> It is possible some tty device is operating with an improperly
> initialized tty structure. I vaguely remember some console code
> creating its own minimally initialized 'dummy' tty structure.
>
> This might be causing the new flush code to hang.
>
> Try this patch which does not call the flush unless a line
> discipline is attached to the tty. That should indicate
> a normally initialized tty structure.
>

Tried. Nothing changes with this patch. I'm booting the patched kernel
commenting out the unicode_start line for now, and there aren't other
"system" problems.

I confirm the behavior previously posted for the serial devices on
flush with this patch too.

"With the patch, flushing the input results effectively in a complete flush.
However after doing the flush I can't read [further] chars [sent to
the serial port]
without closing and reopening the port. I've verified this behavior both
communicating between two serial ports and both communicating with an
usb-serial device (driver cdc-acm)."

Regards,

~ Antonio

2007-05-04 22:31:35

by Paul Fulghum

[permalink] [raw]
Subject: Re: [SOLVED] Serial buffer corruption [was Re: FTDI usb-serial possible bug]

Antonino Ingargiola wrote:
> "With the patch, flushing the input results effectively in a complete flush.
> However after doing the flush I can't read [further] chars [sent to
> the serial port]
> without closing and reopening the port. I've verified this behavior both
> communicating between two serial ports and both communicating with an
> usb-serial device (driver cdc-acm)."

OK, this behavior is so unexpected I must be missing
something basic. Not being able to reproduce this myself
is a problem. If I think of something I'll post.

--
Paul

2007-05-04 22:59:03

by Paul Fulghum

[permalink] [raw]
Subject: Re: [SOLVED] Serial buffer corruption [was Re: FTDI usb-serial possible bug]

Antonino:

Can you try two tests (with my patch applied):

1. comment out the tty_flush_buffer() call in tty_ldisc_flush() and test

2. uncomment (reenable) the above call and comment out the
tty_flush_buffer() call in tty_ioctl() and test

Thanks,
Paul

2007-05-05 15:39:14

by Antonino Ingargiola

[permalink] [raw]
Subject: Re: [SOLVED] Serial buffer corruption [was Re: FTDI usb-serial possible bug]

2007/5/5, Paul Fulghum <[email protected]>:
> On Fri, 2007-05-04 at 17:30 -0600, Paul Fulghum wrote:
> > OK, this behavior is so unexpected I must be missing
> > something basic.
>
> And so I was. Try this patch.
>
> --- a/drivers/char/tty_io.c 2007-05-04 05:46:55.000000000 -0500
> +++ b/drivers/char/tty_io.c 2007-05-05 03:23:46.000000000 -0500

Great! Really good job, Paul. The patch now boot properly and solves
completely the testcase with two serial lines:

In [1]: import serial
In [2]: s0 = serial.Serial(0, timeout=1)
In [3]: s1 = serial.Serial(1, timeout=1)
In [4]: s0.write('test\n')
In [5]: s1.inWaiting()
Out[5]: 5L
In [6]: s1.readline()
Out[6]: 'test\n'
In [7]: for i in xrange(1000):
...: s0.write(str(i).zfill(8)+'\n')
...:
In [8]: s1.inWaiting()
Out[8]: 4095L
In [9]: s1.flushInput()
In [10]: s1.inWaiting()
Out[10]: 0L
In [11]: s0.write('test\n')
In [12]: s1.inWaiting()
Out[12]: 5L
In [13]: s1.readline()
Out[13]: 'test\n'

I've done more tests with other scripts and all works perfectly (the
input buffer is really totally flushed the first time). Many thanks! I
think this patch should be included in mainline, since if one flushes
the input buffer, really want to flush the entire buffer chain and
doesn't want to read any old data _after_ a flush.

However I also tested a usb-serial device (that uses the cdc-acm
driver) and in this case I still need _two_ flushInput() to totally
flush the input buffer.

There can be another *secondary buffer* in the usb-serial driver? Can
this buffer be emptied out too?

I bet the same behavior can be reproduced with another FTDI-based
usb-serial device that I haven't at hand now (but that I can test the
next week).

Many thanks for the help so far.


Regards,

~ Antonio

2007-05-05 15:51:44

by Antonino Ingargiola

[permalink] [raw]
Subject: Re: [SOLVED] Serial buffer corruption [was Re: FTDI usb-serial possible bug]

On 5/4/07, Paul Fulghum <[email protected]> wrote:
> Antonino:
>
> Can you try two tests (with my patch applied):
>
> 1. comment out the tty_flush_buffer() call in tty_ldisc_flush() and test
>
> 2. uncomment (reenable) the above call and comment out the
> tty_flush_buffer() call in tty_ioctl() and test

I assume you meant tty_buffer_flush(). I've built kernel 1). In kernel
2), do you mean:

/*if (ld->ioctl)
tty_buffer_flush(tty);*/
tty_ldisc_deref(ld);

right? This is what I'm building... I'll report these new tests soon.

While waiting for kernel building I'll document the testing procedure.
In this way someone else can easily try to reproduce the problem.

1. Hardware.

Two serial ports required. Connect the two port witth a null-modem
cable, or patch, for each port the Tx pin with the Rx of the other
port[0].

2. Software

I assume python is installed. Install also pyserial[1] (in debian
python-serial), if you manually download the package you can put the
"serial" dir in the dir you use to perform the test (no need to
install system-wide). If you can, install also the ipython shell that
has colored output and auto-completition.

3. Test

>From the python shell:

import serial
s0 = serial.Serial(0, timeout=1)

open the /dev/ttyS0 port with default values, hit 's0' [enter] to see
the serial parameters.

s1 = serial.Serial(1, timeout=1)

with the previous comma we open /dev/ttyS1 with the same serial port
settings. To write to one serial port:

s0.write('test\n')

and to read use the .read() or .readline() method:

s1.readline()

The .inWaiting() methods gives the numbers of bytes in the input
buffer. The .flushInput() method flushes the input buffer:

In [6]: s0.write('test\n')

In [7]: s1.inWaiting()
Out[7]: 5L

In [8]: s1.readline()
Out[8]: 'test\n'

In [9]: s1.inWaiting()
Out[9]: 0L

If all works, now fill one serial port with a bunch of data (an
increasing 8-digit number):

for i in xrange(10000):
s0.write(str(i).zfill(8)+'\n')

we can see that the s1 input buffer is full:

s1.inWaiting()
Out[21]: 4095L

then empty the buffer and see what happens:

In [22]: s1.flushInput()
In [23]: s1.inWaiting()
Out[23]: 4095L

the buffer is still full. You have to flush the buffer more then once
to completely flush all the buffer chain (including the flip buffer).
With the patched kernel at this point I get correctly:

In [22]: s1.flushInput()
In [23]: s1.inWaiting()
Out[23]: 0L

but then I can't read any other byte from the serial without closing
and reopening it:

In [24]: s0.write('test\n')

In [25]: s1.inWaiting()
Out[25]: 0L

In [26]: s1.read()
Out[26]: ''

In [27]: s1.close()

In [28]: s1.open()

In [29]: s1.inWaiting()
Out[29]: 0L

In [30]: s0.write('test\n')

In [31]: s1.inWaiting()
Out[31]: 5L

In [32]: s1.readline()
Out[32]: 'test\n'

Hope the explanation is clear If someone else want to try...

Regards,

~ Antonio

2007-05-05 16:00:25

by Paul Fulghum

[permalink] [raw]
Subject: Re: [SOLVED] Serial buffer corruption [was Re: FTDI usb-serial possible bug]

Antonino Ingargiola wrote:
> The patch now boot properly and solves
> completely the testcase with two serial lines:

Good, thanks for testing.

> I think this patch should be included in mainline, since if one flushes
> the input buffer, really want to flush the entire buffer chain and
> doesn't want to read any old data _after_ a flush.

I will submit the patch, it's clearly needed.

> However I also tested a usb-serial device (that uses the cdc-acm
> driver) and in this case I still need _two_ flushInput() to totally
> flush the input buffer.
>
> There can be another *secondary buffer* in the usb-serial driver? Can
> this buffer be emptied out too?

Very possible, but I'm not familiar with that.

There is not an input flush method for the tty driver
and individual drivers don't process that ioctl.
The tty drivers I've seen immediately pass receive data to the
tty buffering and I'm not sure why a driver would
behave otherwise.

--
Paul

2007-05-05 16:10:22

by Paul Fulghum

[permalink] [raw]
Subject: Re: [SOLVED] Serial buffer corruption [was Re: FTDI usb-serial possible bug]

On Sat, 2007-05-05 at 10:43 -0600, Paul Fulghum wrote:
> There is not an input flush method for the tty driver
> and individual drivers don't process that ioctl.
> The tty drivers I've seen immediately pass receive data to the
> tty buffering and I'm not sure why a driver would
> behave otherwise.

cdc-acm does its own buffering.

In your case, the line discipline throttled the tty device
because the ldisc buffer was full.

If the line discipline throttles the driver input,
the cdc-acm driver stops giving data to the tty buffering
and instead stores them internally.

In the serial driver this usually just results in dropping
RTS to signal the remote end to stop sending. The serial
driver always immediately gives receive data to the tty buffering
without regard to the throttled state.

I would argue that cdc-acm should do the same as the serial driver.

--
Paul


2007-05-05 16:16:44

by Paul Fulghum

[permalink] [raw]
Subject: Re: [SOLVED] Serial buffer corruption [was Re: FTDI usb-serial possible bug]

On Fri, 2007-05-04 at 17:30 -0600, Paul Fulghum wrote:
> OK, this behavior is so unexpected I must be missing
> something basic.

And so I was. Try this patch.

--- a/drivers/char/tty_io.c 2007-05-04 05:46:55.000000000 -0500
+++ b/drivers/char/tty_io.c 2007-05-05 03:23:46.000000000 -0500
@@ -365,6 +365,29 @@ static void tty_buffer_free(struct tty_s
}

/**
+ * tty_buffer_flush - flush full tty buffers
+ * @tty: tty to flush
+ *
+ * flush all the buffers containing receive data
+ *
+ * Locking: none
+ */
+
+static void tty_buffer_flush(struct tty_struct *tty)
+{
+ struct tty_buffer *thead;
+ unsigned long flags;
+
+ spin_lock_irqsave(&tty->buf.lock, flags);
+ while((thead = tty->buf.head) != NULL) {
+ tty->buf.head = thead->next;
+ tty_buffer_free(tty, thead);
+ }
+ tty->buf.tail = NULL;
+ spin_unlock_irqrestore(&tty->buf.lock, flags);
+}
+
+/**
* tty_buffer_find - find a free tty buffer
* @tty: tty owning the buffer
* @size: characters wanted
@@ -1240,6 +1263,7 @@ void tty_ldisc_flush(struct tty_struct *
ld->flush_buffer(tty);
tty_ldisc_deref(ld);
}
+ tty_buffer_flush(tty);
}

EXPORT_SYMBOL_GPL(tty_ldisc_flush);
@@ -3336,6 +3360,15 @@ int tty_ioctl(struct inode * inode, stru
case TIOCMBIC:
case TIOCMBIS:
return tty_tiocmset(tty, file, cmd, p);
+ case TCFLSH:
+ switch (arg) {
+ case TCIFLUSH:
+ case TCIOFLUSH:
+ /* flush tty buffer and allow ldisc to process ioctl */
+ tty_buffer_flush(tty);
+ break;
+ }
+ break;
}
if (tty->driver->ioctl) {
retval = (tty->driver->ioctl)(tty, file, cmd, arg);


2007-05-05 16:17:22

by Paul Fulghum

[permalink] [raw]
Subject: Re: [SOLVED] Serial buffer corruption [was Re: FTDI usb-serial possible bug]

On Sat, 2007-05-05 at 11:08 -0500, Paul Fulghum wrote:
> I would argue that cdc-acm should do the same as the serial driver.

Try this patch for the usb ports. (I don't have that hardware)

--- a/drivers/usb/class/cdc-acm.c 2007-04-25 22:08:32.000000000 -0500
+++ b/drivers/usb/class/cdc-acm.c 2007-05-05 11:12:10.000000000 -0500
@@ -355,18 +355,9 @@ next_buffer:
spin_lock_irqsave(&acm->throttle_lock, flags);
throttled = acm->throttle;
spin_unlock_irqrestore(&acm->throttle_lock, flags);
- if (!throttled)
- tty_insert_flip_string(tty, buf->base, buf->size);
+ tty_insert_flip_string(tty, buf->base, buf->size);
tty_flip_buffer_push(tty);

- if (throttled) {
- dbg("Throttling noticed");
- spin_lock_irqsave(&acm->read_lock, flags);
- list_add(&buf->list, &acm->filled_read_bufs);
- spin_unlock_irqrestore(&acm->read_lock, flags);
- return;
- }
-
spin_lock_irqsave(&acm->read_lock, flags);
list_add(&buf->list, &acm->spare_read_bufs);
spin_unlock_irqrestore(&acm->read_lock, flags);


2007-05-05 16:26:38

by Antonino Ingargiola

[permalink] [raw]
Subject: Re: [SOLVED] Serial buffer corruption [was Re: FTDI usb-serial possible bug]

2007/5/5, Paul Fulghum <[email protected]>:
> On Sat, 2007-05-05 at 11:08 -0500, Paul Fulghum wrote:
> > I would argue that cdc-acm should do the same as the serial driver.
>
> Try this patch for the usb ports. (I don't have that hardware)

Building... thanks.

The HW is a cypress demo-board for their micro-controllers with an USB
interface. Linux had recognized it as a cdc_acm modem, but it's only a
very simple serial device that can stream data on request.

~ Antonio

2007-05-05 16:32:55

by Alan

[permalink] [raw]
Subject: Re: [SOLVED] Serial buffer corruption [was Re: FTDI usb-serial possible bug]

> In the serial driver this usually just results in dropping
> RTS to signal the remote end to stop sending. The serial
> driver always immediately gives receive data to the tty buffering
> without regard to the throttled state.
>
> I would argue that cdc-acm should do the same as the serial driver.

This is a bug in cdc-acm really. It should not double buffer, but to be
fair to the authors prior to the new tty buffering it *had* to do this.

Alan

2007-05-05 16:47:12

by Oliver Neukum

[permalink] [raw]
Subject: Re: [SOLVED] Serial buffer corruption [was Re: FTDI usb-serial possible bug]

Am Samstag, 5. Mai 2007 18:08 schrieb Paul Fulghum:

> If the line discipline throttles the driver input,
> the cdc-acm driver stops giving data to the tty buffering
> and instead stores them internally.

So do usb serial drivers.

> In the serial driver this usually just results in dropping
> RTS to signal the remote end to stop sending. The serial

This may take considerable time in the case of usb devices.

> driver always immediately gives receive data to the tty buffering
> without regard to the throttled state.
>
> I would argue that cdc-acm should do the same as the serial driver.

Has this been tested?
If so we could reduce the complexity of the throtteling logic in the usb
drivers.

Regards
Oliver

2007-05-05 16:54:54

by Oliver Neukum

[permalink] [raw]
Subject: Re: [SOLVED] Serial buffer corruption [was Re: FTDI usb-serial possible bug]

Am Samstag, 5. Mai 2007 18:36 schrieb Alan Cox:
> > In the serial driver this usually just results in dropping
> > RTS to signal the remote end to stop sending. The serial
> > driver always immediately gives receive data to the tty buffering
> > without regard to the throttled state.
> >
> > I would argue that cdc-acm should do the same as the serial driver.
>
> This is a bug in cdc-acm really. It should not double buffer, but to be
> fair to the authors prior to the new tty buffering it *had* to do this.

I assume this applies to all serial drivers in the wider sense?

Regards
Oliver

2007-05-05 16:58:15

by Paul Fulghum

[permalink] [raw]
Subject: Re: [SOLVED] Serial buffer corruption [was Re: FTDI usb-serial possible bug]

On Sat, 2007-05-05 at 18:46 +0200, Oliver Neukum wrote:
> Am Samstag, 5. Mai 2007 18:08 schrieb Paul Fulghum:
> > I would argue that cdc-acm should do the same as the serial driver.
>
> Has this been tested?
> If so we could reduce the complexity of the throtteling logic in the usb
> drivers.

Antonino is doing so now.

I think Alan nailed it: with the old tty buffering the extra
logic was required to avoid data loss. The new tty buffering
handles large blocks of data with no problem.

Removing this part of the throttle logic makes
the input flushing mechanism complete.

--
Paul


2007-05-05 16:58:47

by Antonino Ingargiola

[permalink] [raw]
Subject: Re: [SOLVED] Serial buffer corruption [was Re: FTDI usb-serial possible bug]

2007/5/5, Antonino Ingargiola <[email protected]>:
> 2007/5/5, Paul Fulghum <[email protected]>:
> > On Sat, 2007-05-05 at 11:08 -0500, Paul Fulghum wrote:
> > > I would argue that cdc-acm should do the same as the serial driver.
> >
> > Try this patch for the usb ports. (I don't have that hardware)
>
> Building... thanks.
>

This patch does not solve the problem with the cdc_acd driver. I still
need to flush two times to really flush the input. And the "secondary
buffer" still seems 26 bytes (as before).

Regards,

~ Antonio

2007-05-05 17:06:17

by Paul Fulghum

[permalink] [raw]
Subject: Re: [SOLVED] Serial buffer corruption [was Re: FTDI usb-serial possible bug]

On Sat, 2007-05-05 at 18:58 +0200, Antonino Ingargiola wrote:
> This patch does not solve the problem with the cdc_acd driver. I still
> need to flush two times to really flush the input. And the "secondary
> buffer" still seems 26 bytes (as before).

I missed a bit, try this.

--- a/drivers/usb/class/cdc-acm.c 2007-04-25 22:08:32.000000000 -0500
+++ b/drivers/usb/class/cdc-acm.c 2007-05-05 12:03:19.000000000 -0500
@@ -335,8 +335,6 @@ static void acm_rx_tasklet(unsigned long
spin_lock_irqsave(&acm->throttle_lock, flags);
throttled = acm->throttle;
spin_unlock_irqrestore(&acm->throttle_lock, flags);
- if (throttled)
- return;

next_buffer:
spin_lock_irqsave(&acm->read_lock, flags);
@@ -355,18 +353,9 @@ next_buffer:
spin_lock_irqsave(&acm->throttle_lock, flags);
throttled = acm->throttle;
spin_unlock_irqrestore(&acm->throttle_lock, flags);
- if (!throttled)
- tty_insert_flip_string(tty, buf->base, buf->size);
+ tty_insert_flip_string(tty, buf->base, buf->size);
tty_flip_buffer_push(tty);

- if (throttled) {
- dbg("Throttling noticed");
- spin_lock_irqsave(&acm->read_lock, flags);
- list_add(&buf->list, &acm->filled_read_bufs);
- spin_unlock_irqrestore(&acm->read_lock, flags);
- return;
- }
-
spin_lock_irqsave(&acm->read_lock, flags);
list_add(&buf->list, &acm->spare_read_bufs);
spin_unlock_irqrestore(&acm->read_lock, flags);


2007-05-05 17:09:12

by Antonino Ingargiola

[permalink] [raw]
Subject: Re: [SOLVED] Serial buffer corruption [was Re: FTDI usb-serial possible bug]

2007/5/5, Paul Fulghum <[email protected]>:
> On Sat, 2007-05-05 at 18:46 +0200, Oliver Neukum wrote:
> > Am Samstag, 5. Mai 2007 18:08 schrieb Paul Fulghum:
> > > I would argue that cdc-acm should do the same as the serial driver.
> >
> > Has this been tested?
> > If so we could reduce the complexity of the throtteling logic in the usb
> > drivers.
>
> Antonino is doing so now.
>
> I think Alan nailed it: with the old tty buffering the extra
> logic was required to avoid data loss. The new tty buffering
> handles large blocks of data with no problem.
>

When I hit the problem originally I was using an FTDI device. I've
tested it on multiple 3 linux machine machine: Ubuntu Dapper (kernel
2.6.15), Fedora Core 2 (dunno what kernel but surely older that
2.6.15) and Debian Etch (with 2.6.20 and 2.6.21) and all of then gave
me corrupted data (while windows gave correct data). So all these
kernel have this (or a related) problem.


~ Antonio

2007-05-05 18:07:35

by Oliver Neukum

[permalink] [raw]
Subject: Re: [SOLVED] Serial buffer corruption [was Re: FTDI usb-serial possible bug]

Am Samstag, 5. Mai 2007 18:36 schrieb Alan Cox:
> > In the serial driver this usually just results in dropping
> > RTS to signal the remote end to stop sending. The serial
> > driver always immediately gives receive data to the tty buffering
> > without regard to the throttled state.
> >
> > I would argue that cdc-acm should do the same as the serial driver.
>
> This is a bug in cdc-acm really. It should not double buffer, but to be
> fair to the authors prior to the new tty buffering it *had* to do this.

Hi,

should I understand this so that, if tty_buffer_request_room() returns
less than requested, the rest of the data should be dropped on the
floor?

Regards
Oliver

2007-05-05 18:07:57

by Antonino Ingargiola

[permalink] [raw]
Subject: Re: [SOLVED] Serial buffer corruption [was Re: FTDI usb-serial possible bug]

On 5/5/07, Antonino Ingargiola <[email protected]> wrote:
> On 5/4/07, Paul Fulghum <[email protected]> wrote:
> > Antonino:
> >
> > Can you try two tests (with my patch applied):
> >
> > 1. comment out the tty_flush_buffer() call in tty_ldisc_flush() and test
> >
> > 2. uncomment (reenable) the above call and comment out the
> > tty_flush_buffer() call in tty_ioctl() and test
>
> I assume you meant tty_buffer_flush(). I've built kernel 1). In kernel
> 2), do you mean:
>
> /*if (ld->ioctl)
> tty_buffer_flush(tty);*/
> tty_ldisc_deref(ld);
>
> right? This is what I'm building... I'll report these new tests soon.

Ok test done.

With kernel 1. the behavior is the same as with your plain second
patch only (flush input works but I cannot read anymore from the
serial that was flushed without closing and reopening the port). See
previous mail for the details of the test.

With kernel 2 the behavior is the same as mainline (multiple flushes
are needed to completely empty the buffers). And I _can_ read further
chars from the serial line after the flush. Here it is the ipython
session that document the test with kernel 2:

In [1]: import serial

In [2]: s0 = serial.Serial(0, timeout=1)

In [3]: s1 = serial.Serial(1, timeout=1)

In [4]: s0.write('test\n')

In [5]: s1.inWaiting()
Out[5]: 5L

In [6]: s1.readline()
Out[6]: 'test\n'

In [7]: for i in xrange(1000):
...: s0.write(str(i).zfill(8)+'\n')
...:

In [8]: s1.inWaiting()
Out[8]: 4095L

In [9]: s1.flushInput()

In [10]: s1.inWaiting()
Out[10]: 4095L # NOTE the buffer is still full!

In [11]: s1.flushInput()

In [12]: s1.inWaiting()
Out[12]: 810L # The buffer beginning
to be drained

In [13]: s1.flushInput()

In [14]: s1.inWaiting()
Out[14]: 0L # Now the buffer is empty

In [15]: s0.write('test\n') # An reading further chars works

In [16]: s1.inWaiting()
Out[16]: 5L

In [17]: s1.readline()
Out[17]: 'test\n'


Regards,

~ Antonio

2007-05-05 18:08:10

by Antonino Ingargiola

[permalink] [raw]
Subject: Re: [SOLVED] Serial buffer corruption [was Re: FTDI usb-serial possible bug]

2007/5/5, Paul Fulghum <[email protected]>:
> On Sat, 2007-05-05 at 18:58 +0200, Antonino Ingargiola wrote:
> > This patch does not solve the problem with the cdc_acd driver. I still
> > need to flush two times to really flush the input. And the "secondary
> > buffer" still seems 26 bytes (as before).
>
> I missed a bit, try this.
>

Tried and this one work right. A single flush is sufficient to
completely flush the input buffer. Again, thanks a lot :).

Now I don't want to abuse your kindness, but I (personally) would be
*really* interested in a similar fix for the FTDI usb-serial driver,
because many measurements I do use an FTDI device.

However, if I understood well seems that many drives have to be
checked against this problem.


Regards,

~ Antonio

2007-05-05 18:36:10

by Oliver Neukum

[permalink] [raw]
Subject: Re: [SOLVED] Serial buffer corruption [was Re: FTDI usb-serial possible bug]

Am Samstag, 5. Mai 2007 20:08 schrieb Antonino Ingargiola:
> Now I don't want to abuse your kindness, but I (personally) would be
> *really* interested in a similar fix for the FTDI usb-serial driver,
> because many measurements I do use an FTDI device.

Does this work?

Regards
Oliver
----

--- a/drivers/usb/serial/ftdi_sio.c 2007-05-05 20:21:41.000000000 +0200
+++ b/drivers/usb/serial/ftdi_sio.c 2007-05-05 20:27:09.000000000 +0200
@@ -1749,10 +1749,6 @@ static void ftdi_process_read (struct wo
length = 0;
}

- if (priv->rx_flags & THROTTLED) {
- dbg("%s - throttled", __FUNCTION__);
- break;
- }
if (tty_buffer_request_room(tty, length) < length) {
/* break out & wait for throttling/unthrottling to happen */
dbg("%s - receive room low", __FUNCTION__);
@@ -1825,16 +1821,6 @@ static void ftdi_process_read (struct wo
dbg("%s - incomplete, %d bytes processed, %d remain",
__FUNCTION__, packet_offset,
urb->actual_length - packet_offset);
- /* check if we were throttled while processing */
- spin_lock_irqsave(&priv->rx_lock, flags);
- if (priv->rx_flags & THROTTLED) {
- priv->rx_flags |= ACTUALLY_THROTTLED;
- spin_unlock_irqrestore(&priv->rx_lock, flags);
- dbg("%s - deferring remainder until unthrottled",
- __FUNCTION__);
- return;
- }
- spin_unlock_irqrestore(&priv->rx_lock, flags);
/* if the port is closed stop trying to read */
if (port->open_count > 0){
/* delay processing of remainder */
@@ -1856,9 +1842,15 @@ static void ftdi_process_read (struct wo
port->read_urb->transfer_buffer, port->read_urb->transfer_buffer_length,
ftdi_read_bulk_callback, port);

- result = usb_submit_urb(port->read_urb, GFP_ATOMIC);
- if (result)
- err("%s - failed resubmitting read urb, error %d", __FUNCTION__, result);
+ spin_lock_irqsave(&priv->rx_lock, flags);
+ if (priv->rx_flags & THROTTLED) {
+ priv->rx_flags |= ACTUALLY_THROTTLED;
+ } else {
+ result = usb_submit_urb(port->read_urb, GFP_ATOMIC);
+ if (result)
+ err("%s - failed resubmitting read urb, error %d", __FUNCTION__, result);
+ }
+ spin_unlock_irqrestore(&priv->rx_lock, flags);
}

return;

2007-05-05 21:45:52

by Alan

[permalink] [raw]
Subject: Re: [SOLVED] Serial buffer corruption [was Re: FTDI usb-serial possible bug]

> > This is a bug in cdc-acm really. It should not double buffer, but to be
> > fair to the authors prior to the new tty buffering it *had* to do this.
>
> I assume this applies to all serial drivers in the wider sense?

When possible at least. Obviously there will always be some buffering in
the hardware or interface glue but you should never need to work around
the tty buffers now.

Alan

2007-05-05 21:48:41

by Alan

[permalink] [raw]
Subject: Re: [SOLVED] Serial buffer corruption [was Re: FTDI usb-serial possible bug]

On Sat, 5 May 2007 20:07:15 +0200
Oliver Neukum <[email protected]> wrote:

> Am Samstag, 5. Mai 2007 18:36 schrieb Alan Cox:
> > > In the serial driver this usually just results in dropping
> > > RTS to signal the remote end to stop sending. The serial
> > > driver always immediately gives receive data to the tty buffering
> > > without regard to the throttled state.
> > >
> > > I would argue that cdc-acm should do the same as the serial driver.
> >
> > This is a bug in cdc-acm really. It should not double buffer, but to be
> > fair to the authors prior to the new tty buffering it *had* to do this.
>
> Hi,
>
> should I understand this so that, if tty_buffer_request_room() returns
> less than requested, the rest of the data should be dropped on the
> floor?

If it returns NULL then either there is > 64K buffered (we can adjust
that if anyone shows need - its just for sanity) or the system is out of
RAM.

Alan

2007-05-06 07:06:36

by Antonino Ingargiola

[permalink] [raw]
Subject: Re: [SOLVED] Serial buffer corruption [was Re: FTDI usb-serial possible bug]

2007/5/5, Oliver Neukum <[email protected]>:
> Am Samstag, 5. Mai 2007 20:08 schrieb Antonino Ingargiola:
> > Now I don't want to abuse your kindness, but I (personally) would be
> > *really* interested in a similar fix for the FTDI usb-serial driver,
> > because many measurements I do use an FTDI device.
>
> Does this work?
>

Thanks for the patch! Alas, I will have access to an FTDI device only
on Tuesday, then I'll test it and report results.

> Regards
> Oliver

Regards,

~ Antonio

2007-05-06 07:29:37

by Antonino Ingargiola

[permalink] [raw]
Subject: Re: [SOLVED] Serial buffer corruption [was Re: FTDI usb-serial possible bug]

2007/5/5, Alan Cox <[email protected]>:
> On Sat, 5 May 2007 20:07:15 +0200
> Oliver Neukum <[email protected]> wrote:
[cut]
> > should I understand this so that, if tty_buffer_request_room() returns
> > less than requested, the rest of the data should be dropped on the
> > floor?
>
> If it returns NULL then either there is > 64K buffered (we can adjust
> that if anyone shows need - its just for sanity) or the system is out of
> RAM.

For my use case would be more sensible to accept the new data and
discard the older one in the tty buffer: the tty buffer would be a
moving window of the most recent incoming data. This because if
someone does not read the incoming data maybe he's not interested in
it. When he finally reads the data (assuming there was buffer
underrun) he's likely interested to the more recent chunk of incoming
data and not to an "head" of the data firstly received. Since I
acquire measurement data from serial this make perfect sense for me.
But does this make sense in general too?

However, whatever policy the buffer uses, the fundamental point it's that
when I flush the input buffer I should be sure that each byte read
after the flush is *new* (current) data and not old one. This because
when the input stream can be stopped I can check that there are 0 byte
in the buffer, but when the stream can't be stopped I must use a
flush-and-sleep (multiple times) heuristic before I can read a single
*reliable* byte.

> Alan
>

Regards,

~ Antonio

2007-05-06 12:24:55

by Alan

[permalink] [raw]
Subject: Re: [SOLVED] Serial buffer corruption [was Re: FTDI usb-serial possible bug]

> However, whatever policy the buffer uses, the fundamental point it's that
> when I flush the input buffer I should be sure that each byte read
> after the flush is *new* (current) data and not old one. This because

Define "new" and "old" in this case. I don't believe you can give a
precise definition or that such a thing is physically possible.

> when the input stream can be stopped I can check that there are 0 byte
> in the buffer, but when the stream can't be stopped I must use a
> flush-and-sleep (multiple times) heuristic before I can read a single
> *reliable* byte.

The hardware itself has buffers at both ends of the link, there may be
buffers in modems, muxes and the like as well. We can certainly flush
input buffers in the kernel but it isn't clear we can always do so at the
hardware level, let alone at the remote end or buffers on devices on the
link.

2007-05-06 13:37:46

by Paul Fulghum

[permalink] [raw]
Subject: Re: [SOLVED] Serial buffer corruption [was Re: FTDI usb-serial possible bug]

Antonino Ingargiola wrote:
> For my use case would be more sensible to accept the new data and
> discard the older one in the tty buffer: the tty buffer would be a
> moving window of the most recent incoming data. This because if
> someone does not read the incoming data maybe he's not interested in
> it. When he finally reads the data (assuming there was buffer
> underrun) he's likely interested to the more recent chunk of incoming
> data and not to an "head" of the data firstly received. Since I
> acquire measurement data from serial this make perfect sense for me.
> But does this make sense in general too?

There is no one policy here that will make everyone happy.
Some will want all the data before some was lost,
others the data after some was lost.

The real goal is to minimize any data loss.

> However, whatever policy the buffer uses, the fundamental point it's that
> when I flush the input buffer I should be sure that each byte read
> after the flush is *new* (current) data and not old one. This because
> when the input stream can be stopped I can check that there are 0 byte
> in the buffer, but when the stream can't be stopped I must use a
> flush-and-sleep (multiple times) heuristic before I can read a single
> *reliable* byte.

The flush minimizes stale data the application must process.
As Alan said in his response there are other sources of
stale data beyond the kernel's control. But we absolutely should
be flushing all buffers we control.

In the end, if more reliability is needed the application must
implement its own discipline of framing (defining block boundaries) and
checking (CRC) on the raw data stream.

--
Paul

2007-05-06 13:51:11

by Paul Fulghum

[permalink] [raw]
Subject: Re: [SOLVED] Serial buffer corruption [was Re: FTDI usb-serial possible bug]

Alan Cox wrote:
> On Sat, 5 May 2007 20:07:15 +0200
> Oliver Neukum <[email protected]> wrote:
>> should I understand this so that, if tty_buffer_request_room() returns
>> less than requested, the rest of the data should be dropped on the
>> floor?
>
> If it returns NULL then either there is > 64K buffered (we can adjust
> that if anyone shows need - its just for sanity) or the system is out of
> RAM.


It sounds bad, but I think dropping the data make sense with the
new tty buffering code.

My interpretation of the tty buffering is that
it is intended to be the main receive buffer facility
for the driver.

It simplifies and centralizes these functions so
the driver does not need implement policies such as when to
flush for user request, expand under load, crop when too large.

It should not be the driver's responsibility to
try and work around the tty buffering if it becomes full.
That adds other complexity such as when the driver should
attempt to push the data again: when more data is received?
after a timeout?

If the tty buffering runs dry, then maybe put out an error message.
If the losses occur too often then the tty buffering code needs to
be adjusted.

--
Paul


2007-05-06 16:39:51

by Antonino Ingargiola

[permalink] [raw]
Subject: Re: [SOLVED] Serial buffer corruption [was Re: FTDI usb-serial possible bug]

2007/5/6, Alan Cox <[email protected]>:
> > However, whatever policy the buffer uses, the fundamental point it's that
> > when I flush the input buffer I should be sure that each byte read
> > after the flush is *new* (current) data and not old one. This because
>
> Define "new" and "old" in this case. I don't believe you can give a
> precise definition or that such a thing is physically possible.
>

You are right. In the case of buffers storing the *last* N bytes
received we always get "new" data. To be precise if we get a byte each
T seconds
we can read data N*T seconds old (worst case). This is a bound, known,
and perfectly acceptable jitter.

Let see instead the case a not flushable buffer in the chain (inside
the PC) uses the policy of storing the *first* N bytes received. That
is the case for the flip buffer in mainline. In this case we can read
bytes also one hour old, if we read the first byte one hour after the
external device begun to send data. This is an unbound, unknown and
thus unacceptable error in a measurement (think if we were acquiring a
physical "realtime" sampled quantity). That was exactly my original
problem causing data corruption in case of multi-byte data. Paul
Fulghum patches fix this problem for serial port and for the cdc-acm
driver making the "secondary buffers" flushable.

So, in general, all the buffers (in the host) that store the first
received N bytes (as policy) should be flushable in order to don't
read (potentially very) "old" data after we made a flush.

I understand that some internal HW buffer can be not flushable via
software, but in this case I doubt the buffer use the
store-the-first-N-bytes policy.
And even if some particular device does this, than it's *its* fault, not linux.

To conclude, the store the *last* N bytes received seems a more
reasonable policy for the input buffers managed by the kernel. If the
other policy is used (as tty does), then the kernel should flush all
the buffers he *can* physically flush, remaining inside the host
computer of course.

Hope my English is sufficiently clear.


Regards,

~ Antonio

2007-05-06 16:42:47

by Alan

[permalink] [raw]
Subject: Re: [SOLVED] Serial buffer corruption [was Re: FTDI usb-serial possible bug]

> To conclude, the store the *last* N bytes received seems a more
> reasonable policy for the input buffers managed by the kernel. If the

For your use maybe, but it is not the normal behaviour and it is not the
behaviour supported by hardware, which generally buffers the first few
bytes (and some things like Apple localtalk actually rely upon this)

> other policy is used (as tty does), then the kernel should flush all
> the buffers he *can* physically flush, remaining inside the host
> computer of course.

It certainly does no harm to try and do the job as best you can.

2007-05-06 21:51:14

by Alan Stern

[permalink] [raw]
Subject: Re: [Linux-usb-users] [SOLVED] Serial buffer corruption [was Re: FTDI usb-serial possible bug]

On Sun, 6 May 2007, Alan Cox wrote:

> > However, whatever policy the buffer uses, the fundamental point it's that
> > when I flush the input buffer I should be sure that each byte read
> > after the flush is *new* (current) data and not old one. This because
>
> Define "new" and "old" in this case. I don't believe you can give a
> precise definition or that such a thing is physically possible.

One can come close. It would make sense to say that after a flush,
subsequent reads should retrieve _contiguous_ bytes from the input stream.
In other words, rule out the possibility that the read would get bytes
1-10 (from some buffer somewhere) followed by bytes 30-60 (because bytes
11-29 were dropped by the flush). By contrast, it would be permissible
for the read to obtain bytes 20-60, even though 20-29 may have been
entered the input stream before the flush occurred.

> The hardware itself has buffers at both ends of the link, there may be
> buffers in modems, muxes and the like as well. We can certainly flush
> input buffers in the kernel but it isn't clear we can always do so at the
> hardware level, let alone at the remote end or buffers on devices on the
> link.

This is of course the fly in the ointment.

Alan Stern

2007-05-07 08:07:24

by Antonino Ingargiola

[permalink] [raw]
Subject: Re: [Linux-usb-users] [SOLVED] Serial buffer corruption [was Re: FTDI usb-serial possible bug]

2007/5/6, Alan Stern <[email protected]>:
> On Sun, 6 May 2007, Alan Cox wrote:
>
> > > However, whatever policy the buffer uses, the fundamental point it's that
> > > when I flush the input buffer I should be sure that each byte read
> > > after the flush is *new* (current) data and not old one. This because
> >
> > Define "new" and "old" in this case. I don't believe you can give a
> > precise definition or that such a thing is physically possible.
>
> One can come close. It would make sense to say that after a flush,
> subsequent reads should retrieve _contiguous_ bytes from the input stream.
> In other words, rule out the possibility that the read would get bytes
> 1-10 (from some buffer somewhere) followed by bytes 30-60 (because bytes
> 11-29 were dropped by the flush). By contrast, it would be permissible
> for the read to obtain bytes 20-60, even though 20-29 may have been
> entered the input stream before the flush occurred.
>

You've expressed in an extremely clear way what I meant. Thanks.


Regards,

~ Antonio

2007-05-07 09:41:06

by Diego Zuccato

[permalink] [raw]
Subject: Re: [Linux-usb-users] [SOLVED] Serial buffer corruption [was Re: FTDI usb-serial possible bug]

Paul Fulghum ha scritto:
> There is no one policy here that will make everyone happy.
> Some will want all the data before some was lost,
> others the data after some was lost.
IMVHO the only sane thing is ALWAYS avoid "holes" (some old data, then
the "hole" of lost data, then some new data) after a flush().
HW buffers (and buffers on the remote end) are not an issue: they
contain fresher data (usually in "sliding window" mode) than buffered.

Thinking with a 300bps modem (anybody else remembers such an ancient
thing?):
- Sender have to transmit the alphabet;
- On the sender's modem there's a 4 char buffer
- On the receiver's modem there's another 4 char buffer
- The receiver's driver have another 4 char buffer

If the receiver issues a flush() after reading 'C', then the next read()
should return FGHIJKL...Z (or anything IN SEQUENCE), but *NEVER* DEFGKLM
(skipped H,I and J).

BYtE,
Diego.

2007-05-07 16:34:56

by Alan Stern

[permalink] [raw]
Subject: Re: [Linux-usb-users] [SOLVED] Serial buffer corruption [was Re: FTDI usb-serial possible bug]

On Mon, 7 May 2007, Diego Zuccato wrote:

> Thinking with a 300bps modem (anybody else remembers such an ancient
> thing?):

I used a 110 bps modem for several years!

Alan Stern

2007-05-07 16:51:41

by Oliver Neukum

[permalink] [raw]
Subject: Re: [Linux-usb-users] [SOLVED] Serial buffer corruption [was Re: FTDI usb-serial possible bug]

Am Montag, 7. Mai 2007 18:34 schrieb Alan Stern:
> On Mon, 7 May 2007, Diego Zuccato wrote:
>
> > Thinking with a 300bps modem (anybody else remembers such an ancient
> > thing?):
>
> I used a 110 bps modem for several years!

I knew 110 bps is slow, but what took you years to transmit?

Regards
Oliver

2007-05-07 17:58:53

by Stephen Beaver

[permalink] [raw]
Subject: Re: [Linux-usb-users] [SOLVED] Serial buffer corruption [was Re: FTDI usb-serial possible bug]

Luxury! When I was a lad . . . .


On 5/7/07 12:34 PM, "Alan Stern" <[email protected]> wrote:

> On Mon, 7 May 2007, Diego Zuccato wrote:
>
>> Thinking with a 300bps modem (anybody else remembers such an ancient
>> thing?):
>
> I used a 110 bps modem for several years!
>
> Alan Stern
>


2007-05-07 18:25:43

by Alan Stern

[permalink] [raw]
Subject: Re: [Linux-usb-users] [SOLVED] Serial buffer corruption [was Re: FTDI usb-serial possible bug]

On Mon, 7 May 2007, Oliver Neukum wrote:

> Am Montag, 7. Mai 2007 18:34 schrieb Alan Stern:
> > On Mon, 7 May 2007, Diego Zuccato wrote:
> >
> > > Thinking with a 300bps modem (anybody else remembers such an ancient
> > > thing?):
> >
> > I used a 110 bps modem for several years!
>
> I knew 110 bps is slow, but what took you years to transmit?

Back then the CPUs were a lot slower as well. Doing practically anything
took a long time...

Alan Stern

2007-05-09 09:56:45

by Antonino Ingargiola

[permalink] [raw]
Subject: Re: [SOLVED] Serial buffer corruption [was Re: FTDI usb-serial possible bug]

2007/5/6, Antonino Ingargiola <[email protected]>:
> 2007/5/5, Oliver Neukum <[email protected]>:
> > Am Samstag, 5. Mai 2007 20:08 schrieb Antonino Ingargiola:
> > > Now I don't want to abuse your kindness, but I (personally) would be
> > > *really* interested in a similar fix for the FTDI usb-serial driver,
> > > because many measurements I do use an FTDI device.
> >
> > Does this work?
> >
>
> Thanks for the patch! Alas, I will have access to an FTDI device only
> on Tuesday, then I'll test it and report results.

Sorry. I've not been able to try this ftdi patch yet. My work pc has
ubuntu dapper with 2.6.15 kernel. I've compiled 2.6.21 there but the
leap is too much and there are probably udev incompatibilities so the
/dev entry is not created for any USB device.

I need to install a newer distro to work with mainline.

This mail is just to point out that the patch is not forgotten, just
need more time to try it.


Regards,

~ Antonio

2007-05-09 10:42:49

by Gene Heskett

[permalink] [raw]
Subject: Re: [SOLVED] Serial buffer corruption [was Re: FTDI usb-serial possible bug]

On Saturday 05 May 2007, Oliver Neukum wrote:
>Am Samstag, 5. Mai 2007 20:08 schrieb Antonino Ingargiola:
>> Now I don't want to abuse your kindness, but I (personally) would be
>> *really* interested in a similar fix for the FTDI usb-serial driver,
>> because many measurements I do use an FTDI device.
>
>Does this work?
>
> Regards
> Oliver
>----
>
>--- a/drivers/usb/serial/ftdi_sio.c 2007-05-05 20:21:41.000000000 +0200
>+++ b/drivers/usb/serial/ftdi_sio.c 2007-05-05 20:27:09.000000000 +0200
>@@ -1749,10 +1749,6 @@ static void ftdi_process_read (struct wo
> length = 0;
> }
>
>- if (priv->rx_flags & THROTTLED) {
>- dbg("%s - throttled", __FUNCTION__);
>- break;
>- }
> if (tty_buffer_request_room(tty, length) < length) {
> /* break out & wait for throttling/unthrottling to happen */
> dbg("%s - receive room low", __FUNCTION__);
>@@ -1825,16 +1821,6 @@ static void ftdi_process_read (struct wo
> dbg("%s - incomplete, %d bytes processed, %d remain",
> __FUNCTION__, packet_offset,
> urb->actual_length - packet_offset);
>- /* check if we were throttled while processing */
>- spin_lock_irqsave(&priv->rx_lock, flags);
>- if (priv->rx_flags & THROTTLED) {
>- priv->rx_flags |= ACTUALLY_THROTTLED;
>- spin_unlock_irqrestore(&priv->rx_lock, flags);
>- dbg("%s - deferring remainder until unthrottled",
>- __FUNCTION__);
>- return;
>- }
>- spin_unlock_irqrestore(&priv->rx_lock, flags);
> /* if the port is closed stop trying to read */
> if (port->open_count > 0){
> /* delay processing of remainder */
>@@ -1856,9 +1842,15 @@ static void ftdi_process_read (struct wo
> port->read_urb->transfer_buffer,
> port->read_urb->transfer_buffer_length, ftdi_read_bulk_callback, port);
>
>- result = usb_submit_urb(port->read_urb, GFP_ATOMIC);
>- if (result)
>- err("%s - failed resubmitting read urb, error %d", __FUNCTION__,
> result); + spin_lock_irqsave(&priv->rx_lock, flags);
>+ if (priv->rx_flags & THROTTLED) {
>+ priv->rx_flags |= ACTUALLY_THROTTLED;
>+ } else {
>+ result = usb_submit_urb(port->read_urb, GFP_ATOMIC);
>+ if (result)
>+ err("%s - failed resubmitting read urb, error %d", __FUNCTION__,
> result); + }
>+ spin_unlock_irqrestore(&priv->rx_lock, flags);
> }
>
> return;
>-

Oliver:

Sort of late, but,

I have a couple of kernels building with this patch applied, one with the
sd-0.48 patch where FTDI stuff is a 40+% cpu hog and can't be used, and one
with the cfs-v10 patch where it can be co-erced into working if I screw with
it enough.

I have also been using another serial patch that for unk reasons, also helps
the usb-serial case here, but vger won't let me repost that without stripping
off the headers. That patch has some resemblance to
this:8250-clear-on-read-bits-LSR-MSR where I usually replace spaces with
dashes so I don't have to use "to surround the name" in my scripts.

So these test kernels will have both patches.

I'll let you know how they work later this morning.

--
Cheers, Gene
"There are four boxes to be used in defense of liberty:
soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
Oh my GOD -- the SUN just fell into YANKEE STADIUM!!

2007-05-09 11:03:17

by Gene Heskett

[permalink] [raw]
Subject: Re: [SOLVED] Serial buffer corruption [was Re: FTDI usb-serial possible bug]

On Saturday 05 May 2007, Oliver Neukum wrote:
>Am Samstag, 5. Mai 2007 20:08 schrieb Antonino Ingargiola:
>> Now I don't want to abuse your kindness, but I (personally) would be
>> *really* interested in a similar fix for the FTDI usb-serial driver,
>> because many measurements I do use an FTDI device.
>
>Does this work?
>
> Regards
> Oliver
>----
>
>--- a/drivers/usb/serial/ftdi_sio.c 2007-05-05 20:21:41.000000000 +0200
>+++ b/drivers/usb/serial/ftdi_sio.c 2007-05-05 20:27:09.000000000 +0200
>@@ -1749,10 +1749,6 @@ static void ftdi_process_read (struct wo
> length = 0;
> }
>
>- if (priv->rx_flags & THROTTLED) {
>- dbg("%s - throttled", __FUNCTION__);
>- break;
>- }
> if (tty_buffer_request_room(tty, length) < length) {
> /* break out & wait for throttling/unthrottling to happen */
> dbg("%s - receive room low", __FUNCTION__);
>@@ -1825,16 +1821,6 @@ static void ftdi_process_read (struct wo
> dbg("%s - incomplete, %d bytes processed, %d remain",
> __FUNCTION__, packet_offset,
> urb->actual_length - packet_offset);
>- /* check if we were throttled while processing */
>- spin_lock_irqsave(&priv->rx_lock, flags);
>- if (priv->rx_flags & THROTTLED) {
>- priv->rx_flags |= ACTUALLY_THROTTLED;
>- spin_unlock_irqrestore(&priv->rx_lock, flags);
>- dbg("%s - deferring remainder until unthrottled",
>- __FUNCTION__);
>- return;
>- }
>- spin_unlock_irqrestore(&priv->rx_lock, flags);
> /* if the port is closed stop trying to read */
> if (port->open_count > 0){
> /* delay processing of remainder */
>@@ -1856,9 +1842,15 @@ static void ftdi_process_read (struct wo
> port->read_urb->transfer_buffer,
> port->read_urb->transfer_buffer_length, ftdi_read_bulk_callback, port);
>
>- result = usb_submit_urb(port->read_urb, GFP_ATOMIC);
>- if (result)
>- err("%s - failed resubmitting read urb, error %d", __FUNCTION__,
> result); + spin_lock_irqsave(&priv->rx_lock, flags);
>+ if (priv->rx_flags & THROTTLED) {
>+ priv->rx_flags |= ACTUALLY_THROTTLED;
>+ } else {
>+ result = usb_submit_urb(port->read_urb, GFP_ATOMIC);
>+ if (result)
>+ err("%s - failed resubmitting read urb, error %d", __FUNCTION__,
> result); + }
>+ spin_unlock_irqrestore(&priv->rx_lock, flags);
> }
>
> return;
>-
Oliver:

While running kernel-2.6.21.1-sd-0.48 + this and the other patch I mentioned,

Unforch, this patch doesn't seem to have effected my usb-serial problems, for
upsd anyway, it's running at 47% of the cpu right now, while running sd-0.48.

Occasionally a stop and restart will fix it... Nope. On the restart, its
about 3% _until_ something starts reading its output data file, which could
be gkrellmBUPS, or its own gui, either one bounces it back up to 45+% of the
cpu. On a kernel where it works sorta right, as in a recent fedora kernel,
its about 1.5%, and on even older 2.6.18ish stuff, this utility ran at .1% of
the cpu regardless of who read its output data. But that was with a pl2303
adapter, which no longer works at all so I bought some ftdi based ones.

So the next boot will be to cfs-v10 + these.



>To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
>the body of a message to [email protected]
>More majordomo info at http://vger.kernel.org/majordomo-info.html
>Please read the FAQ at http://www.tux.org/lkml/



--
Cheers, Gene
"There are four boxes to be used in defense of liberty:
soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
For certain people, after fifty, litigation takes the place of sex.
-- Gore Vidal

2007-05-09 12:01:38

by Gene Heskett

[permalink] [raw]
Subject: Re: [SOLVED] Serial buffer corruption [was Re: FTDI usb-serial possible bug]

On Saturday 05 May 2007, Oliver Neukum wrote:

You are included this time David because 3 messages I posted this morning,
whose text should resemble this one, the first of those 3, were
also /dev/nulled. this is BS when I can't even discuss a patch in the same
manner as others can.

So whats wrong with this message that its filtered?

>Am Samstag, 5. Mai 2007 20:08 schrieb Antonino Ingargiola:
>> Now I don't want to abuse your kindness, but I (personally) would be
>> *really* interested in a similar fix for the FTDI usb-serial driver,
>> because many measurements I do use an FTDI device.
>
>Does this work?
>
> Regards
> Oliver
>----
>
>--- a/drivers/usb/serial/ftdi_sio.c 2007-05-05 20:21:41.000000000 +0200
>+++ b/drivers/usb/serial/ftdi_sio.c 2007-05-05 20:27:09.000000000 +0200
>@@ -1749,10 +1749,6 @@ static void ftdi_process_read (struct wo
> length = 0;
> }
>
>- if (priv->rx_flags & THROTTLED) {
>- dbg("%s - throttled", __FUNCTION__);
>- break;
>- }
> if (tty_buffer_request_room(tty, length) < length) {
> /* break out & wait for throttling/unthrottling to happen */
> dbg("%s - receive room low", __FUNCTION__);
>@@ -1825,16 +1821,6 @@ static void ftdi_process_read (struct wo
> dbg("%s - incomplete, %d bytes processed, %d remain",
> __FUNCTION__, packet_offset,
> urb->actual_length - packet_offset);
>- /* check if we were throttled while processing */
>- spin_lock_irqsave(&priv->rx_lock, flags);
>- if (priv->rx_flags & THROTTLED) {
>- priv->rx_flags |= ACTUALLY_THROTTLED;
>- spin_unlock_irqrestore(&priv->rx_lock, flags);
>- dbg("%s - deferring remainder until unthrottled",
>- __FUNCTION__);
>- return;
>- }
>- spin_unlock_irqrestore(&priv->rx_lock, flags);
> /* if the port is closed stop trying to read */
> if (port->open_count > 0){
> /* delay processing of remainder */
>@@ -1856,9 +1842,15 @@ static void ftdi_process_read (struct wo
> port->read_urb->transfer_buffer,
> port->read_urb->transfer_buffer_length, ftdi_read_bulk_callback, port);
>
>- result = usb_submit_urb(port->read_urb, GFP_ATOMIC);
>- if (result)
>- err("%s - failed resubmitting read urb, error %d", __FUNCTION__,
> result); + spin_lock_irqsave(&priv->rx_lock, flags);
>+ if (priv->rx_flags & THROTTLED) {
>+ priv->rx_flags |= ACTUALLY_THROTTLED;
>+ } else {
>+ result = usb_submit_urb(port->read_urb, GFP_ATOMIC);
>+ if (result)
>+ err("%s - failed resubmitting read urb, error %d", __FUNCTION__,
> result); + }
>+ spin_unlock_irqrestore(&priv->rx_lock, flags);
> }
>
> return;
>-

Oliver:

Sort of late, but,

I have a couple of kernels building with this patch applied, one with the
sd-0.48 patch where FTDI stuff is a 40+% cpu hog and can't be used, and one
with the cfs-v10 patch where it can be co-erced into working if I screw with
it enough.

I have also been using another serial patch that for unk reasons, also helps
the usb-serial case here, but vger won't let me repost that without stripping
off the headers. That patch has some resemblance to
this:8250-clear-on-read-bits-LSR-MSR where I usually replace spaces with
dashes so I don't have to use "to surround the name" in my scripts.

So these test kernels will have both patches.

I'll let you know how they work later this morning.

--
Cheers, Gene
"There are four boxes to be used in defense of liberty:
soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
Oh my GOD -- the SUN just fell into YANKEE STADIUM!!