2008-06-13 21:42:46

by R.L. Horn

Subject: Re: 2.6.25.3: serial problem (minicom)

This is kind of an old thread, but I'm seeing something similar and, perhaps,
can throw some light on the subject.

On Fri, 16 May 2008, Andrew Morton wrote:

> On Thu, 15 May 2008 20:06:23 +0100 (BST) Chris Rankin <[email protected]>
> wrote:

>> I have two Linux boxes connected by a null-modem cable between their serial
>> ports; one box exports a serial console, which the other reads using the
>> minicom program. However, I have noticed that minicom can no longer use the
>> serial console when it is running on a 2.6.25.3 kernel, although it works
>> fine running on a 2.6.24.4 kernel.

>> Specifically, with minicom running on 2.6.25.3, the console does not accept
>> keystrokes although it does receive the boot log from the remote machine.

>> The serial console is being exported by a 2.6.25.3 kernel, and appears to be
>> working correctly.

> We did make some changes to serial_core.c in that timeframe which might have
> caused this, such as:

...

> Author: Yinghai Lu <[email protected]> 2008-02-04 22:27:46
> Committer: Linus Torvalds <[email protected]> 2008-02-05 09:44:09
> Parent: 149b36eae2ab6aa6056664f4bc461f3d3affc9c1 (serial: stop passing NULL to functions that expect data)
> Child: 9d778a69370cc1b643b13648df971c83ff5654ef (serial: avoid waking up closed serial ports on resume)
> Branches: many (89)
> Follows: v2.6.24
> Precedes: v2.6.25-rc1
>
> serial: keep the DTR setting for serial console.

That looks like a possibility. minicom has a DTR toggle function that drops
DTR by setting the baud rate to B0 (thereby clearing both DTR and RTS) and then
restoring the previous rate. It's called pretty early upon execution.
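
A minimal userspace sketch of that kind of B0-based DTR toggle follows. It is not minicom's actual code: the function names make_hangup_termios() and toggle_dtr() and the hold time are assumptions made for illustration, and only standard termios calls are used.

```c
#include <termios.h>
#include <unistd.h>

/* Return a copy of *cur whose output speed is B0, the POSIX "hang up" speed. */
struct termios make_hangup_termios(const struct termios *cur)
{
    struct termios t = *cur;

    cfsetospeed(&t, B0);
    return t;
}

/*
 * Hypothetical helper: request B0 for hold_seconds, then restore the
 * settings that were in effect before. Returns 0 on success, -1 if any
 * termios call fails.
 */
int toggle_dtr(int fd, unsigned hold_seconds)
{
    struct termios saved, hangup;

    if (tcgetattr(fd, &saved) < 0)
        return -1;
    hangup = make_hangup_termios(&saved);
    if (tcsetattr(fd, TCSANOW, &hangup) < 0)
        return -1;
    sleep(hold_seconds);
    /* On the affected kernels this restore is the step that no longer raises DTR/RTS. */
    return tcsetattr(fd, TCSANOW, &saved) < 0 ? -1 : 0;
}
```

A caller would pass a descriptor opened on the serial device, for example toggle_dtr(fd, 1).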

With kernels prior to 2.6.25 or so, resetting the baud rate would again raise
DTR and RTS, but I'm not seeing this with the current stable version (2.6.25.6
as of this writing).

Specifically:

Opening a serial device (e.g. 16550 as ttyS0) raises DTR and RTS.

Setting the baud rate to B0 clears both (as per SUS).

Subsequently setting the baud rate to something other than B0 leaves the
control lines low.
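
One way to observe the three steps above from userspace is to read the modem control lines with the TIOCMGET ioctl after each one. The sketch below is only an illustration; describe_lines() and read_lines() are hypothetical helper names, not part of any existing tool.

```c
#include <stdio.h>
#include <sys/ioctl.h>

/* Format the DTR and RTS bits of a TIOCMGET-style bitmask into buf. */
int describe_lines(int bits, char *buf, size_t len)
{
    return snprintf(buf, len, "DTR=%d RTS=%d",
                    (bits & TIOCM_DTR) != 0, (bits & TIOCM_RTS) != 0);
}

/* Read the modem control lines of an open serial fd; returns -1 on error. */
int read_lines(int fd, int *bits)
{
    return ioctl(fd, TIOCMGET, bits) < 0 ? -1 : 0;
}
```

Calling read_lines() after the open, after setting B0, and after restoring the rate, and printing each result with describe_lines(), should show whether DTR and RTS come back up.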

As it happens, apart from the fact that it breaks minicom, I actually prefer
this behavior.

Right now I have a patch that will fix minicom and I'm trying to convince the
maintainers to accept it. I need a definitive answer, though, as to whether
I'm seeing a bug or a feature.


If it's not too much trouble, please CC: to my address. The volume of this
list is a little overwhelming...


2008-06-14 09:46:49

by Alan

Subject: Re: 2.6.25.3: serial problem (minicom)

> Setting the baud rate to B0 clears both (as per SUS).
>
> Subsequently setting the baud rate to something other than B0 leaves the
> control lines low.
>
> As it happens, apart from the fact that it breaks minicom, I actually prefer
> this behavior.

What happens after you go from B0->Banythingelse should depend on the
termios settings at that point. This sounds like a bug therefore.

If you want that behaviour intentionally, set B0, and after that termios
call set CLOCAL before changing the baud back.

I'll take a look at this next week.

Alan

2008-06-15 07:05:06

by R.L. Horn

Subject: Re: 2.6.25.3: serial problem (minicom)

On Sat, 14 Jun 2008, Alan Cox wrote:

> What happens after you go from B0->Banythingelse should depend on the
> termios settings at that point. This sounds like a bug therefore.
>
> If you want that behaviour intentionally, set B0, and after that termios
> call set CLOCAL before changing the baud back.

I'm not seeing any change in behavior with different combinations of
CLOCAL. As it happens, minicom sets CLOCAL first thing so as to keep
SIGHUP signals from being generated.

But, then, what is the relationship between CLOCAL and DTR/RTS supposed to
be? Presumably, it should prevent a SIGHUP from being raised when DSR
goes low, but beyond that, it all seems rather ill-defined. Throw in B0,
which the SUS strongly implies should unconditionally lower DTR and RTS,
and the CRTSCTS flag, and the waters really get murky. It looks to me
like the only thing you can say for certain is that CLOCAL will cause DSR
to be ignored...and for anything else you're on your own.

Mind you, all this just suggests that POSIX kinda sucks, which is hardly
an earth-shattering revelation.

> I'll take a look at this next week.

It would be much appreciated. Thanks.

2008-06-16 10:31:20

by Alan

Subject: Re: 2.6.25.3: serial problem (minicom)

> But, then, what is the relationship between CLOCAL and DTR/RTS supposed to
> be? Presumably, it should prevent a SIGHUP from being raised when DSR
> goes low, but beyond that, it all seems rather ill-defined. Throw in B0,

CLOCAL is defined to "ignore modem control lines"

> which the SUS strongly implies should unconditionally lower DTR and RTS,
> and the CRTSCTS flag, and the waters really get murky. It looks to me
> like the only thing you can say for certain is that CLOCAL will cause DSR
> to be ignored...and for anything else you're on your own.

Whatever it implies the behaviour should not have changed between 2.6.24
and 2.6.25. Nobody AFAIK sat down and decided to change it.

>
> Mind you, all this just suggests that POSIX kinda sucks, which is hardly
> an earth-shattering revelation.

Standards are part written spec and a large part an existing tradition
around that standard. The tradition is often the most important bit.

Alan

2008-06-17 04:24:15

by R.L. Horn

Subject: Re: 2.6.25.3: serial problem (minicom)

On Mon, 16 Jun 2008, Alan Cox wrote:

> Whatever it implies the behaviour should not have changed between 2.6.24
> and 2.6.25. Nobody AFAIK sat down and decided to change it.

And, besides, I've gotten reports that the usb-serial drivers still behave
the same as with 2.6.24.

It looks like the call to tty_termios_encode_baud_rate() in
drivers/serial/8250.c is the culprit. If I comment it out, everything
appears to go back to normal (seemingly with no undesired side effects).

Why the call is there (it didn't replace anything else in the 2.6.24.7
version of 8250.c, though it did in serial_core.c) remains a mystery to
me.

2008-06-17 09:16:20

by Alan

Subject: Re: 2.6.25.3: serial problem (minicom)

On Mon, 16 Jun 2008 23:22:54 -0500 (CDT)
"R.L. Horn" <[email protected]> wrote:

> On Mon, 16 Jun 2008, Alan Cox wrote:
>
> > Whatever it implies the behaviour should not have changed between 2.6.24
> > and 2.6.25. Nobody AFAIK sat down and decided to change it.
>
> And, besides, I've gotten reports that the usb-serial drivers still behave
> the same as with 2.6.24.
>
> It looks like the call to tty_termios_encode_baud_rate() in
> drivers/serial/8250.c is the culprit. If I comment it out, everything
> appears to go back to normal (seemingly with no undesired side effects).
>
> Why the call is there (it didn't replace anything else in the 2.6.24.7
> version of 8250.c, though it did in serial_core.c) remains a mystery to
> me.

Ah ok I know what the bug is - it was fixed in 2.6.26-rc as follows

+	/* Don't rewrite B0 */
+	if (tty_termios_baud_rate(termios))
+		tty_termios_encode_baud_rate(termios, baud, baud);
 }

2008-06-17 10:51:52

by R.L. Horn

Subject: Re: 2.6.25.3: serial problem (minicom)

On Mon, 16 Jun 2008, Alan Cox wrote:

> Whatever it implies the behaviour should not have changed between 2.6.24
> and 2.6.25.

Okay, I think I may have figured this out.

serial8250_set_termios() (8250.c) gets a baud rate using
uart_get_baud_rate() (serial_core.c) which returns 9600 if the termios
rate is B0.

The function then programs the UART with that rate.

Right before returning, it calls tty_termios_encode_baud_rate() with the
abovementioned rate. That function obediently changes the c_cflag baud
bits from B0 to B9600. Presumably, this causes confusion in other
functions (they never know that B0 was set).

I've now changed:

tty_termios_encode_baud_rate(termios, baud, baud);

to:

	if ((termios->c_cflag & CBAUD) == B0)
		baud = 0;
	tty_termios_encode_baud_rate(termios, baud, baud);

And everybody's happy, I guess.
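
The sequence described above can be modelled in a few lines of plain C. This is a userspace sketch, not kernel code: struct model_termios, MODEL_B0 and the model_* functions are hypothetical stand-ins for the termios baud bits, uart_get_baud_rate() and tty_termios_encode_baud_rate().

```c
/* Hypothetical stand-ins for the baud bits in c_cflag. */
enum { MODEL_B0 = 0, MODEL_B9600 = 9600 };

struct model_termios {
    unsigned speed_bits;
};

/* Stand-in for uart_get_baud_rate(): as described, B0 comes back as 9600. */
unsigned model_get_baud(const struct model_termios *t)
{
    return t->speed_bits == MODEL_B0 ? 9600u : t->speed_bits;
}

/* Stand-in for tty_termios_encode_baud_rate(): writes the rate back. */
void model_encode_baud(struct model_termios *t, unsigned baud)
{
    t->speed_bits = baud;
}

/* Behaviour as described for 2.6.25: a B0 request is overwritten with 9600. */
void set_termios_buggy(struct model_termios *t)
{
    unsigned baud = model_get_baud(t);

    model_encode_baud(t, baud);
}

/* With the local change: B0 is detected first and written back as 0. */
void set_termios_patched(struct model_termios *t)
{
    unsigned baud = model_get_baud(t);

    if (t->speed_bits == MODEL_B0)
        baud = 0;
    model_encode_baud(t, baud);
}
```

In the buggy variant the caller's B0 is lost, so a later change to a non-zero rate no longer looks like a transition away from B0.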

It looks like the DECstation DZ driver (drivers/serial/dz.c) will have the
same problem, but I believe the others are okay (as of 2.6.25.6).

2008-06-17 11:14:08

by R.L. Horn

Subject: Re: 2.6.25.3: serial problem (minicom)

On Tue, 17 Jun 2008, Alan Cox wrote:

> Ah ok I know what the bug is - it was fixed in 2.6.26-rc as follows
>
> +	/* Don't rewrite B0 */
> +	if (tty_termios_baud_rate(termios))
> +		tty_termios_encode_baud_rate(termios, baud, baud);
>  }

Oh.

Oh well.

I guess I'll just fix it here and wait for the official fix.

2008-06-18 00:16:18

by Hans-Peter Jansen

Subject: Re: 2.6.25.3: serial problem (minicom)

On Tuesday, 17 June 2008, Alan Cox wrote:
> On Mon, 16 Jun 2008 23:22:54 -0500 (CDT)
>
> "R.L. Horn" <[email protected]> wrote:
> > On Mon, 16 Jun 2008, Alan Cox wrote:
> > > Whatever it implies the behaviour should not have changed between
> > > 2.6.24 and 2.6.25. Nobody AFAIK sat down and decided to change it.
> >
> > And, besides, I've gotten reports that the usb-serial drivers still
> > behave the same as with 2.6.24.
> >
> > It looks like the call to tty_termios_encode_baud_rate() in
> > drivers/serial/8250.c is the culprit. If I comment it out, everything
> > appears to go back to normal (seemingly with no undesired side
> > effects).
> >
> > Why the call is there (it didn't replace anything else in the 2.6.24.7
> > version of 8250.c, though it did in serial_core.c) remains a mystery to
> > me.
>
> Ah ok I know what the bug is - it was fixed in 2.6.26-rc as follows
>
> +	/* Don't rewrite B0 */
> +	if (tty_termios_baud_rate(termios))
> +		tty_termios_encode_baud_rate(termios, baud, baud);
>  }

Alan, could this issue lead to dysfunctional serial dcf77 receivers, too?
(and could you point me to the related git changeset?)

I've been using ntpd "127.127.8.0 mode 16" devices for a decade now (RAWDCF
receiver: DTR=low/RTS=high) and after upgrading to openSUSE 11.0, which is
using 2.6.25.5, those get no power anymore :-(.
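
For reference, a receiver powered from the port in this way depends on the host holding the two control lines in a fixed state. The sketch below shows one way to request DTR low and RTS high from userspace with the TIOCMGET and TIOCMSET ioctls. It is not ntpd's code; rawdcf_power_bits() and rawdcf_power_on() are hypothetical names, and the polarity is simply the one stated in this message.

```c
#include <sys/ioctl.h>

/* Compute a bitmask with DTR cleared and RTS set, leaving other bits untouched. */
int rawdcf_power_bits(int bits)
{
    return (bits & ~TIOCM_DTR) | TIOCM_RTS;
}

/* Apply that state to an open serial fd; returns -1 on error. */
int rawdcf_power_on(int fd)
{
    int bits;

    if (ioctl(fd, TIOCMGET, &bits) < 0)
        return -1;
    bits = rawdcf_power_bits(bits);
    return ioctl(fd, TIOCMSET, &bits) < 0 ? -1 : 0;
}
```

If the kernel changes the lines behind the application's back (at open time or on a baud rate change), a helper like this would have to be called again afterwards.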

Or do you have another idea? I'm currently preparing a test system and will
be ready for kernel patching tomorrow.

TIA, Pete

2008-06-18 18:13:36

by Alan

Subject: Re: 2.6.25.3: serial problem (minicom)

On Wed, 18 Jun 2008 02:15:45 +0200
"Hans-Peter Jansen" <[email protected]> wrote:

> On Tuesday, 17 June 2008, Alan Cox wrote:
> > On Mon, 16 Jun 2008 23:22:54 -0500 (CDT)
> >
> > Ah ok I know what the bug is - it was fixed in 2.6.26-rc as follows
> >
> > +	/* Don't rewrite B0 */
> > +	if (tty_termios_baud_rate(termios))
> > +		tty_termios_encode_baud_rate(termios, baud, baud);
> >  }
>
> Alan, could this issue lead to dysfunctional serial dcf77 receivers, too?
> (and could you point me to the related git changeset?)

I don't generally work via git so I don't offhand know the changeset id.
I guess it could do. If it's causing this many actual problem cases it
might also want to go into stable.

Alan

2008-06-18 20:16:36

by Olivier Galibert

Subject: Re: 2.6.25.3: serial problem (minicom)

On Wed, Jun 18, 2008 at 06:55:34PM +0100, Alan Cox wrote:
> On Wed, 18 Jun 2008 02:15:45 +0200
> "Hans-Peter Jansen" <[email protected]> wrote:
>
> > On Tuesday, 17 June 2008, Alan Cox wrote:
> > > On Mon, 16 Jun 2008 23:22:54 -0500 (CDT)
> > >
> > > Ah ok I know what the bug is - it was fixed in 2.6.26-rc as follows
> > >
> > > +	/* Don't rewrite B0 */
> > > +	if (tty_termios_baud_rate(termios))
> > > +		tty_termios_encode_baud_rate(termios, baud, baud);
> > >  }
> >
> > Alan, could this issue lead to dysfunctional serial dcf77 receivers, too?
> > (and could you point me to the related git changeset?)
>
> I don't generally work via git so I don't offhand know the changeset id.

e991a2bd4fa0b2f475b67dfe8f33e8ecbdcbb40b

git blame drivers/serial/8250.c then look for "rewrite B0" (which gets
you e991a2bd) then git show e991a2bd.

OG.

2008-06-19 08:21:33

by R.L. Horn

Subject: Re: 2.6.25.3: serial problem (minicom)

On Wed, 18 Jun 2008, Alan Cox wrote:

> I don't generally work via git so I don't offhand know the changeset id.
> I guess it could do. If it's causing this many actual problem cases it
> might also want to go into stable.

I vote for nipping it in the bud ASAP. There are currently eight stable
kernel versions that exhibit this bug (which I believe is fairly serious,
in breadth, if not depth), and it already has Adam Lackorzynski and me
scratching our heads trying to figure out what, if anything, to do with
minicom to accommodate it.

I would suggest one change to the 2.6.26-rc version. Rather than bypassing
tty_termios_encode_baud_rate() entirely in the B0 case, why not do
something like:

	if (tty_termios_baud_rate(termios))
		tty_termios_encode_baud_rate(termios, baud, baud);
	else
		tty_termios_encode_baud_rate(termios, 0, 0);

to ensure that c_ispeed and c_ospeed are set (just for the sake of
consistency)?
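
The difference between the two forms can be shown with a small userspace model. It is only a sketch: struct model_ktermios and the functions below are hypothetical stand-ins, and the model assumes the speed fields hold stale non-zero values on entry, which this thread does not establish for the real kernel path.

```c
struct model_ktermios {
    unsigned cflag_baud;   /* stand-in for c_cflag & CBAUD; 0 means B0 */
    unsigned c_ispeed;
    unsigned c_ospeed;
};

/* Stand-in for tty_termios_encode_baud_rate(): records both speeds. */
void model_encode(struct model_ktermios *t, unsigned ibaud, unsigned obaud)
{
    t->c_ispeed = ibaud;
    t->c_ospeed = obaud;
    t->cflag_baud = obaud;
}

/* 2.6.26-rc form: skip the call entirely when B0 was requested. */
void finish_rc(struct model_ktermios *t, unsigned baud)
{
    if (t->cflag_baud != 0)
        model_encode(t, baud, baud);
}

/* Suggested form: in the B0 case, still encode, but with zero speeds. */
void finish_suggested(struct model_ktermios *t, unsigned baud)
{
    if (t->cflag_baud != 0)
        model_encode(t, baud, baud);
    else
        model_encode(t, 0, 0);
}
```

Both variants leave the B0 request intact; only the suggested one also forces c_ispeed and c_ospeed to 0 in the model.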

2008-06-19 23:46:44

by Hans-Peter Jansen

Subject: Re: 2.6.25.3: serial problem (minicom)

On Wednesday, 18 June 2008, Alan Cox wrote:
> On Wed, 18 Jun 2008 02:15:45 +0200
>
> "Hans-Peter Jansen" <[email protected]> wrote:
> > On Tuesday, 17 June 2008, Alan Cox wrote:
> > > On Mon, 16 Jun 2008 23:22:54 -0500 (CDT)
> > >
> > > Ah ok I know what the bug is - it was fixed in 2.6.26-rc as follows
> > >
> > > +	/* Don't rewrite B0 */
> > > +	if (tty_termios_baud_rate(termios))
> > > +		tty_termios_encode_baud_rate(termios, baud, baud);
> > >  }
> >
> > Alan, could this issue lead to dysfunctional serial dcf77 receivers,
> > too? (and could you point me to the related git changeset?)
>
> I don't generally work via git so I don't offhand know the changeset id.
> I guess it could do. If it's causing this many actual problem cases it
> might also want to go into stable.

Alan, I tested this by applying the changeset (Olivier, many thanks for
your valuable hint; I'm sure I will reuse this knowledge soon) to the
otherwise unchanged kernel, but unfortunately it doesn't solve my
issue.

A different patch must have changed the behavior/state of some RS232 lines
in the 2.6.25 time frame, since the device still doesn't get any power at
all. Hopefully I'll get round to bisecting it this weekend, but any hints
are greatly appreciated.

As another data point: using a usb <-> rs232 converter, the dcf device got
back to life again. It still doesn't work in its entirety, but at least,
some data arrives in ntpd. Expected is something similar to:
-#--#-#####-###--D--S124--2-p------p-----21-4-24-----8-- (incomplete)
but it reads:
###############RADMLS1248124P124812P1248121241248112481248P
thus, obviously it doesn't get any 0 values back (displayed as - above).

The setting of the serial interfaces was equivalent, but maybe the usb
converter cannot handle the rather strange 50 baud setting properly.

~# stty -a < /dev/ttyUSB0
speed 50 baud; rows 0; columns 0; line = 0;
intr = <undef>; quit = <undef>; erase = <undef>; kill = <undef>; eof = <undef>; eol = <undef>; eol2 = <undef>;
swtch = <undef>; start = <undef>; stop = <undef>; susp = <undef>; rprnt = <undef>; werase = <undef>; lnext = <undef>;
flush = <undef>; min = 1; time = 0;
-parenb -parodd cs8 -hupcl -cstopb cread clocal -crtscts
-ignbrk -brkint -ignpar -parmrk -inpck -istrip -inlcr -igncr -icrnl -ixon -ixoff -iuclc -ixany -imaxbel -iutf8
-opost -olcuc -ocrnl -onlcr -onocr -onlret -ofill -ofdel nl0 cr0 tab0 bs0 vt0 ff0
-isig -icanon -iexten -echo -echoe -echok -echonl -noflsh -xcase -tostop -echoprt -echoctl -echoke

Pete

2008-06-20 11:23:28

by R.L. Horn

Subject: Re: 2.6.25.3: serial problem (minicom)

On Fri, 20 Jun 2008, Hans-Peter Jansen wrote:

> Alan, I tested this by applying the changeset (Olivier, many thanks for
> your valuable hint; I'm sure I will reuse this knowledge soon) to the
> otherwise unchanged kernel, but unfortunately it doesn't solve my
> issue.
>
> A different patch must have changed the behavior/state of some RS232 lines
> in the 2.6.25 time frame,

There was a deliberate change in DTR behavior, though I'm not up on the
details. If you have a copy of 2.6.25.something handy and want to check
that that's the problem, you might look at drivers/serial/serial_core.c.
Round about line 2160 (in uart_configure_port()), you'll see:

	/*
	 * Ensure that the modem control lines are de-activated.
	 * keep the DTR setting that is set in uart_set_options()
	 * We probably don't need a spinlock around this, but
	 */
	spin_lock_irqsave(&port->lock, flags);
	port->ops->set_mctrl(port, port->mctrl & TIOCM_DTR);

Change the last line to:

	port->ops->set_mctrl(port, 0);

which reverts to 2.6.24 behavior.
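
The effect of the two set_mctrl() arguments can be seen from the masks alone. The functions below are a hypothetical userspace illustration using the TIOCM_* constants from <sys/ioctl.h>; they are not kernel code.

```c
#include <sys/ioctl.h>

/* Lines left asserted by the 2.6.25 call: only DTR survives, if it was set. */
unsigned mctrl_keep_dtr(unsigned mctrl)
{
    return mctrl & (unsigned)TIOCM_DTR;
}

/* Lines left asserted by the 2.6.24-style call: none. */
unsigned mctrl_clear_all(unsigned mctrl)
{
    (void)mctrl;
    return 0;
}
```

For a port whose mctrl has both DTR and RTS set, the first form leaves DTR asserted and drops RTS, while the second drops both.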

If your problem isn't resolved here, please contact me off-list with some
details about your receiver, ntpd version, etc. and I'll look into it
further. I've been thinking about building a WWVB doodad, and a solution
to this might save me some grief later on.

> As another data point: using a usb <-> rs232 converter, the dcf device
> got back to life again. It still doesn't work in its entirety, but at
> least, some data arrives in ntpd. Expected is something similar to:
> -#--#-#####-###--D--S124--2-p------p-----21-4-24-----8-- (incomplete)
> but it reads:
> ###############RADMLS1248124P124812P1248121241248112481248P
> thus, obviously it doesn't get any 0 values back (displayed as - above).

That looks like a hardware problem. Odds are, something here (and it
could be the USB<->RS232 converter or the DCF receiver or both) either
isn't up to RS-232 specs or the receiver is making unfounded assumptions
about the DTE port. In particular, I'd want to know the mark/space
voltages coming out of that USB thingy.