2022-02-25 05:24:43

by Steven Rostedt

[permalink] [raw]
Subject: Strange output on the console


I've been noticing that my tests have been spitting out strange output on
the console. It would happen at boot up and then clear up. It looks like
something screwed up with the serial timing.

Attached is a dmesg of one of my test runs as an example.

I've noticed this on both 32 bit and 64 bit x86.

I haven't had time to look deeper into this, but I figured I let you know
about it.

And it always seems to happen right after:

Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled

Maybe this is a serial issue and not a printk one? :-/

-- Steve


Attachments:
(No filename) (577.00 B)
strange.txt (53.58 kB)
Download all attachments

2022-02-25 05:42:23

by Sergey Senozhatsky

[permalink] [raw]
Subject: Re: Strange output on the console

On (22/02/24 23:00), Steven Rostedt wrote:
> I've been noticing that my tests have been spitting out strange output on
> the console. It would happen at boot up and then clear up. It looks like
> something screwed up with the serial timing.
>
> Attached is a dmesg of one of my test runs as an example.
>
> I've noticed this on both 32 bit and 64 bit x86.
>
> I haven't had time to look deeper into this, but I figured I let you know
> about it.
>
> And it always seems to happen right after:
>
> Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled
>
> Maybe this is a serial issue and not a printk one? :-/

Steven, did you notice this recently?
May I perhaps suggest git bisect? (sorry)

2022-02-25 07:08:12

by Willy Tarreau

[permalink] [raw]
Subject: Re: Strange output on the console

On Fri, Feb 25, 2022 at 06:12:35AM +0000, David Laight wrote:
> From: Steven Rostedt
> > Sent: 25 February 2022 04:01
> >
> > I've been noticing that my tests have been spitting out strange output on
> > the console. It would happen at boot up and then clear up. It looks like
> > something screwed up with the serial timing.
> >
> > Attached is a dmesg of one of my test runs as an example.
> >
> > I've noticed this on both 32 bit and 64 bit x86.
> >
> > I haven't had time to look deeper into this, but I figured I let you know
> > about it.
> >
> > And it always seems to happen right after:
> >
> > Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled
> >
> > Maybe this is a serial issue and not a printk one? :-/
>
> Looks very much like the serial baud rate is being reset.

I don't think it's the baud rate, characters are still readable, it
looks more like a fifo being too short and causing lots of chars to
be dropped.

Willy

2022-02-25 08:43:55

by Sergey Senozhatsky

[permalink] [raw]
Subject: Re: Strange output on the console

On (22/02/25 13:58), Sergey Senozhatsky wrote:
> On (22/02/24 23:00), Steven Rostedt wrote:
> > I've been noticing that my tests have been spitting out strange output on
> > the console. It would happen at boot up and then clear up. It looks like
> > something screwed up with the serial timing.
> >
> > Attached is a dmesg of one of my test runs as an example.
> >
> > I've noticed this on both 32 bit and 64 bit x86.
> >
> > I haven't had time to look deeper into this, but I figured I let you know
> > about it.
> >
> > And it always seems to happen right after:
> >
> > Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled
> >
> > Maybe this is a serial issue and not a printk one? :-/
>
> Steven, did you notice this recently?
> May I perhaps suggest git bisect? (sorry)

On the off chance - do you have this revert in your kernel?

commit 6a7b9f002eca6788d346c16a6ff0c218b41f8d1d
Author: Greg Kroah-Hartman <[email protected]>
Date: Wed Jan 26 14:33:58 2022 +0100

Revert "tty: serial: Use fifo in 8250 console driver"

This reverts commit 5021d709b31b8a14317998a33cbc78be0de9ab30.

The patch is still a bit buggy, and this breaks some other hardware
types. It needs to be resubmitted in a non-buggy way, and make sure the
other hardware types also continue to work properly.

2022-02-25 11:24:42

by Willy Tarreau

[permalink] [raw]
Subject: Re: Strange output on the console

On Fri, Feb 25, 2022 at 10:11:43AM +0000, David Laight wrote:
> From: Willy Tarreau
> > Sent: 25 February 2022 06:37
> >
> > On Fri, Feb 25, 2022 at 06:12:35AM +0000, David Laight wrote:
> > > From: Steven Rostedt
> > > > Sent: 25 February 2022 04:01
> > > >
> > > > I've been noticing that my tests have been spitting out strange output on
> > > > the console. It would happen at boot up and then clear up. It looks like
> > > > something screwed up with the serial timing.
> > > >
> > > > Attached is a dmesg of one of my test runs as an example.
> > > >
> > > > I've noticed this on both 32 bit and 64 bit x86.
> > > >
> > > > I haven't had time to look deeper into this, but I figured I let you know
> > > > about it.
> > > >
> > > > And it always seems to happen right after:
> > > >
> > > > Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled
> > > >
> > > > Maybe this is a serial issue and not a printk one? :-/
> > >
> > > Looks very much like the serial baud rate is being reset.
> >
> > I don't think it's the baud rate, characters are still readable, it
> > looks more like a fifo being too short and causing lots of chars to
> > be dropped.
>
> Just before it recovers there is this output:
> ATaitoscic nitahi tuPiet mfba Ae: aD nCt AH0 nP0
> That is probably 'fifo not enabled'.
>
> But the earlier output doesn't have many different characters in it.
> Which is typical of the baud rate being wrong.

I don't think so, here's the beginning of the capture:

Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled
0SI 15Nto
LgtAsri[dnei00ieC nm
i:0eom:Soce
[ad000Ade s:ii SSLtbueludAis002h00 1)ASrPn
Ars004h00 7ARrPn

All of these chars are within the printable set, so there's a probability
of ~37% per char, so even for a short string of 9 chars like the first
line, that's a 0.01% chance of it respecting the set. That's why for me
it definitely doesn't correspond to a baud rate issue. With wrong baud
rates you get lots of garbage in the full 8-bit range.

In addition there are even upper case at the beginning of the lines which
probably correspond to the ones that are printed on these lines.

Willy

2022-02-25 11:58:28

by David Laight

[permalink] [raw]
Subject: RE: Strange output on the console

From: Steven Rostedt
> Sent: 25 February 2022 04:01
>
> I've been noticing that my tests have been spitting out strange output on
> the console. It would happen at boot up and then clear up. It looks like
> something screwed up with the serial timing.
>
> Attached is a dmesg of one of my test runs as an example.
>
> I've noticed this on both 32 bit and 64 bit x86.
>
> I haven't had time to look deeper into this, but I figured I let you know
> about it.
>
> And it always seems to happen right after:
>
> Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled
>
> Maybe this is a serial issue and not a printk one? :-/

Looks very much like the serial baud rate is being reset.

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

2022-02-25 12:57:33

by Willy Tarreau

[permalink] [raw]
Subject: Re: Strange output on the console

On Fri, Feb 25, 2022 at 11:11:49AM +0000, David Laight wrote:
> > > But the earlier output doesn't have many different characters in it.
> > > Which is typical of the baud rate being wrong.
> >
> > I don't think so, here's the beginning of the capture:
> >
> > Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled
> > 0SI 15Nto
> > LgtAsri[dnei00ieC nm
> > i:0eom:Soce
> > [ad000Ade s:ii SSLtbueludAis002h00 1)ASrPn
> > Ars004h00 7ARrPn
> >
> > All of these chars are within the printable set, so there's a probability
> > of ~37% per char, so even for a short string of 9 chars like the first
> > line, that's a 0.01% chance of it respecting the set. That's why for me
> > it definitely doesn't correspond to a baud rate issue. With wrong baud
> > rates you get lots of garbage in the full 8-bit range.
> >
> > In addition there are even upper case at the beginning of the lines which
> > probably correspond to the ones that are printed on these lines.
>
> Ok, that bit is probably output after the fifo is disabled.
>
> But a few lines later you get:
>
> TntiTet_TetkT t:TetyTvt:TntaTttTet TeteTvtnTntnTvtnT teT teTettTettKTeteTttnTnteTttaTttaTetKTttOTet:TvtnTvtltTntw Tnt:T tOTvt_Tnt:TvtKTntnT tsTvtsTttsTttsTttosTvtotTtt:TntrTntOTet T ttTvtKT tKT teTnttTettTntaTntdTttaT taTeteT tTtt TttrTntlTvt:TetkT tTtttTnteTttOTetkTvtsTeteT tlTttoTntoTetpTvtpTetsTvtvTvteTntOTet
> TttTetsTetsTntsTetKT t TttiTttT tnT tOTnt_TttrTtttTntKTetOTtteTnt:T tKTttpTvdoT dxTvdnTtd TedtTedoTvdiTedaTedT d Ttd:TedtTtd TeddTvdKTediTndgTednTtdgTtd TedKT snTesTtsOTesKTnstTvsOTvsTnsKTvs:Tts:TesTvs:Tns Tes T sdTts TtsoTesOTtslTesnTesKTesOTns
> TnsTesOT s T sKTesOT sKTvsOTtsTtsKTtsT sKTesTesKT sKTesOT siT stTnsaTvsaTesKTnsOT s
> TesKTnsT s TvsKTeslT sOTesT sKTtseTnseTvs_Tts_T sOTes TnsOT s T spTespTesdTesdTeseTtseTeseTt_rTe_rTv_oTes:T s:Tes:TnfwOTvfrKT fwT frTefpuTefpKTefpTvf:TnsOTvsOTnsoTvs Tvs TvsTtssTnsT sKTesTnsTvsTtsoTesmTnsgTesiTeslTtsTvsKTnsTvsOTtsT s_Tnsu
>
> which really doesn't have enough different characters in it to be a fifo problem.
> That looks like a UART struggling to find valid start and stop bits
> on a continuous data stream that doesn't match the baud rate.
>
> It may also be that whatever 'terminal' is being used is masking off the 0x80 bit.

That could be one option I thought about but still, that sounds quite
suspicious. You don't even get any #!:$/ etc. Or maybe the UART is
configured in 6-bit mode (most 16550 support 5-8 bits), and maybe even
the stop bit and/or parity participates.

> Another possibility is that the count of characters is about right.
> But the receiver is misaligned on the 10 bit async characters.
> Because these is no line idle, it never synchronises properly.
> ISTR a real async terminal behaving that way.
> But I think that gives a much wider range of characters - just the wrong ones.
>
> There is also a third error pattern:
> ^[[A^[[A^[[A^[[A^[[A^[[A^[[A^[[A^[[A^[[A^[[A^[[A^[[A^[[A^[[A^[[
> which might be an 'even more wrong' baud rate.

This one just looks like the Up arrow, that Steven might have pressed at
some point during the capture.

> So maybe we are both right - for different bits of the error.

:-)
... and Steven having fun reading our proposals after having made this
up entirely in preparation of the next April's fool :-)

Willy

2022-02-25 13:46:13

by David Laight

[permalink] [raw]
Subject: RE: Strange output on the console

From: Willy Tarreau
> Sent: 25 February 2022 12:26
...
> That could be one option I thought about but still, that sounds quite
> suspicious. You don't even get any #!:$/ etc. Or maybe the UART is
> configured in 6-bit mode (most 16550 support 5-8 bits), and maybe even
> the stop bit and/or parity participates.

I've been known to resort to an oscilloscope to get an async link
working :-)

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

2022-02-25 13:47:51

by Steven Rostedt

[permalink] [raw]
Subject: Re: Strange output on the console

On Fri, 25 Feb 2022 13:25:46 +0100
Willy Tarreau <[email protected]> wrote:

> > TnsTesOT s T sKTesOT sKTvsOTtsTtsKTtsT sKTesTesKT sKTesOT siT stTnsaTvsaTesKTnsOT s
> > TesKTnsT s TvsKTeslT sOTesT sKTtseTnseTvs_Tts_T sOTes TnsOT s T spTespTesdTesdTeseTtseTeseTt_rTe_rTv_oTes:T s:Tes:TnfwOTvfrKT fwT frTefpuTefpKTefpTvf:TnsOTvsOTnsoTvs Tvs TvsTtssTnsT sKTesTnsTvsTtsoTesmTnsgTesiTeslTtsTvsKTnsTvsOTtsT s_Tnsu
> >
> > which really doesn't have enough different characters in it to be a fifo problem.
> > That looks like a UART struggling to find valid start and stop bits
> > on a continuous data stream that doesn't match the baud rate.
> >
> > It may also be that whatever 'terminal' is being used is masking off the 0x80 bit.
>
> That could be one option I thought about but still, that sounds quite
> suspicious. You don't even get any #!:$/ etc. Or maybe the UART is
> configured in 6-bit mode (most 16550 support 5-8 bits), and maybe even
> the stop bit and/or parity participates.

One thing that's not easy to demonstrate here (I could try to video it), is
that this output is the final result. It shows a bunch of other characters
as it is displayed but then the cursor goes backwards and writes over it.

But the characters that are deleted still do not make sense. When watching
it, it reminds me of the Matrix characters, but running horizontal and not
vertical.

>
> > Another possibility is that the count of characters is about right.
> > But the receiver is misaligned on the 10 bit async characters.
> > Because these is no line idle, it never synchronises properly.
> > ISTR a real async terminal behaving that way.
> > But I think that gives a much wider range of characters - just the wrong ones.
> >
> > There is also a third error pattern:
> > ^[[A^[[A^[[A^[[A^[[A^[[A^[[A^[[A^[[A^[[A^[[A^[[A^[[A^[[A^[[A^[[
> > which might be an 'even more wrong' baud rate.
>
> This one just looks like the Up arrow, that Steven might have pressed at
> some point during the capture.

Yeah, sorry about that. That was the mouse wheel over my console window and
I didn't quite hit "shift" in time. "shift-mouse-up" scrolls up the
terminal window, but just "mouse-up" is the same as "up arrow", and caused
that huge string of ^[[A to appear.

>
> > So maybe we are both right - for different bits of the error.
>
> :-)
> ... and Steven having fun reading our proposals after having made this
> up entirely in preparation of the next April's fool :-)


I was thinking the same thing, but this output being the joke. Someone
planning on slipping in Matrix characters for printk?

-- Steve

2022-02-25 14:08:05

by John Ogness

[permalink] [raw]
Subject: Re: Strange output on the console

Hi Steven,

On 2022-02-25, Willy Tarreau <[email protected]> wrote:
>> > And it always seems to happen right after:
>> >
>> > Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled
>> >
>> > Maybe this is a serial issue and not a printk one? :-/

Your first message says you attached a dmesg. Does this mean you dumped
the kernel log directly into the attached file? ...or was that a screen
capture in the attached file?

It is an important distinction because if garbage is landing in the
kernel buffer, it is not related to serial at all.

John

2022-02-25 14:25:13

by Steven Rostedt

[permalink] [raw]
Subject: Re: Strange output on the console

On Fri, 25 Feb 2022 08:32:09 -0500
Steven Rostedt <[email protected]> wrote:

> On Fri, 25 Feb 2022 14:07:00 +0900
> Sergey Senozhatsky <[email protected]> wrote:
>
> > > Steven, did you notice this recently?
> > > May I perhaps suggest git bisect? (sorry)
> >
> > On the off chance - do you have this revert in your kernel?
> >
> > commit 6a7b9f002eca6788d346c16a6ff0c218b41f8d1d
> > Author: Greg Kroah-Hartman <[email protected]>
> > Date: Wed Jan 26 14:33:58 2022 +0100
> >
> > Revert "tty: serial: Use fifo in 8250 console driver"
> >
> > This reverts commit 5021d709b31b8a14317998a33cbc78be0de9ab30.
> >
> > The patch is still a bit buggy, and this breaks some other hardware
> > types. It needs to be resubmitted in a non-buggy way, and make sure the
> > other hardware types also continue to work properly.
>
> My tree does not have this.
>
> I just applied it and I'm building now. I'll let you know if this fixes the
> issue or not.

Mystery solved. Thanks Sergey!

After applying the above revert patch the funky characters go away.
Removing the revert (putting back the problem commit) the problem reappears.

Greg,

I have the hardware that the code breaks. I will be willing to test the new
changes to make sure it's done in a "non-buggy" way. At least for my
hardware (two boxes that had the issue, but slightly different).


-- Steve

2022-02-25 14:31:51

by Steven Rostedt

[permalink] [raw]
Subject: Re: Strange output on the console

On Fri, 25 Feb 2022 14:07:00 +0900
Sergey Senozhatsky <[email protected]> wrote:

> > Steven, did you notice this recently?
> > May I perhaps suggest git bisect? (sorry)
>
> On the off chance - do you have this revert in your kernel?
>
> commit 6a7b9f002eca6788d346c16a6ff0c218b41f8d1d
> Author: Greg Kroah-Hartman <[email protected]>
> Date: Wed Jan 26 14:33:58 2022 +0100
>
> Revert "tty: serial: Use fifo in 8250 console driver"
>
> This reverts commit 5021d709b31b8a14317998a33cbc78be0de9ab30.
>
> The patch is still a bit buggy, and this breaks some other hardware
> types. It needs to be resubmitted in a non-buggy way, and make sure the
> other hardware types also continue to work properly.

My tree does not have this.

I just applied it and I'm building now. I'll let you know if this fixes the
issue or not.

Thanks,

-- Steve

2022-02-25 14:33:47

by Steven Rostedt

[permalink] [raw]
Subject: Re: Strange output on the console

On Fri, 25 Feb 2022 14:40:56 +0100
Willy Tarreau <[email protected]> wrote:

> Maybe you could try to change the config on the receiver to try to resync
> with what it sent and see if you figure what setting is being used ? I
> would personally try to set it to 7 bit with 1 stop and no parity, same
> baud rate just to see.

It appears to be a bug in the kernel and the commit was reverted.

I Cc'd you on my reply.

https://lore.kernel.org/all/[email protected]/

-- Steve

2022-02-25 15:38:36

by John Ogness

[permalink] [raw]
Subject: Re: Strange output on the console

On 2022-02-25, John Ogness <[email protected]> wrote:
> Your first message says you attached a dmesg. Does this mean you dumped
> the kernel log directly into the attached file? ...or was that a screen
> capture in the attached file?

Sorry, I just noticed the userspace output at the end, clearly showing
it is a capture of the console output.

John

2022-02-25 16:53:49

by Steven Rostedt

[permalink] [raw]
Subject: Re: Strange output on the console

On Fri, 25 Feb 2022 11:11:49 +0000
David Laight <[email protected]> wrote:

> TntiTet_TetkT t:TetyTvt:TntaTttTet TeteTvtnTntnTvtnT teT teTettTettKTeteTttnTnteTttaTttaTetKTttOTet:TvtnTvtltTntw Tnt:T tOTvt_Tnt:TvtKTntnT tsTvtsTttsTttsTttosTvtotTtt:TntrTntOTet T ttTvtKT tKT teTnttTettTntaTntdTttaT taTeteT tTtt TttrTntlTvt:TetkT tTtttTnteTttOTetkTvtsTeteT tlTttoTntoTetpTvtpTetsTvtvTvteTntOTet
> TttTetsTetsTntsTetKT t TttiTttT tnT tOTnt_TttrTtttTntKTetOTtteTnt:T tKTttpTvdoT dxTvdnTtd TedtTedoTvdiTedaTedT d Ttd:TedtTtd TeddTvdKTediTndgTednTtdgTtd TedKT snTesTtsOTesKTnstTvsOTvsTnsKTvs:Tts:TesTvs:Tns Tes T sdTts TtsoTesOTtslTesnTesKTesOTns
> TnsTesOT s T sKTesOT sKTvsOTtsTtsKTtsT sKTesTesKT sKTesOT siT stTnsaTvsaTesKTnsOT s
> TesKTnsT s TvsKTeslT sOTesT sKTtseTnseTvs_Tts_T sOTes TnsOT s T spTespTesdTesdTeseTtseTeseTt_rTe_rTv_oTes:T s:Tes:TnfwOTvfrKT fwT frTefpuTefpKTefpTvf:TnsOTvsOTnsoTvs Tvs TvsTtssTnsT sKTesTnsTvsTtsoTesmTnsgTesiTeslTtsTvsKTnsTvsOTtsT s_Tnsu
>
> which really doesn't have enough different characters in it to be a fifo problem.
> That looks like a UART struggling to find valid start and stop bits
> on a continuous data stream that doesn't match the baud rate.

The output was misleading as a lot of the characters were overwritten due
to the backspace character being one of the characters that were outputted.

Anyway, it looks to have been a fifo issue considering the problem goes
away after applying:

commit 6a7b9f002eca6788d346c16a6ff0c218b41f8d1d
Author: Greg Kroah-Hartman <[email protected]>
Date: Wed Jan 26 14:33:58 2022 +0100

Revert "tty: serial: Use fifo in 8250 console driver"


-- Steve

2022-02-25 18:23:39

by Willy Tarreau

[permalink] [raw]
Subject: Re: Strange output on the console

On Fri, Feb 25, 2022 at 08:17:19AM -0500, Steven Rostedt wrote:
> On Fri, 25 Feb 2022 13:25:46 +0100
> Willy Tarreau <[email protected]> wrote:
>
> > > TnsTesOT s T sKTesOT sKTvsOTtsTtsKTtsT sKTesTesKT sKTesOT siT stTnsaTvsaTesKTnsOT s
> > > TesKTnsT s TvsKTeslT sOTesT sKTtseTnseTvs_Tts_T sOTes TnsOT s T spTespTesdTesdTeseTtseTeseTt_rTe_rTv_oTes:T s:Tes:TnfwOTvfrKT fwT frTefpuTefpKTefpTvf:TnsOTvsOTnsoTvs Tvs TvsTtssTnsT sKTesTnsTvsTtsoTesmTnsgTesiTeslTtsTvsKTnsTvsOTtsT s_Tnsu
> > >
> > > which really doesn't have enough different characters in it to be a fifo problem.
> > > That looks like a UART struggling to find valid start and stop bits
> > > on a continuous data stream that doesn't match the baud rate.
> > >
> > > It may also be that whatever 'terminal' is being used is masking off the 0x80 bit.
> >
> > That could be one option I thought about but still, that sounds quite
> > suspicious. You don't even get any #!:$/ etc. Or maybe the UART is
> > configured in 6-bit mode (most 16550 support 5-8 bits), and maybe even
> > the stop bit and/or parity participates.
>
> One thing that's not easy to demonstrate here (I could try to video it), is
> that this output is the final result. It shows a bunch of other characters
> as it is displayed but then the cursor goes backwards and writes over it.
>
> But the characters that are deleted still do not make sense. When watching
> it, it reminds me of the Matrix characters, but running horizontal and not
> vertical.

Ah, so 6-bit might definitely make sense.

> > ... and Steven having fun reading our proposals after having made this
> > up entirely in preparation of the next April's fool :-)
>
>
> I was thinking the same thing, but this output being the joke. Someone
> planning on slipping in Matrix characters for printk?

;-)

Willy

2022-02-25 18:49:13

by David Laight

[permalink] [raw]
Subject: RE: Strange output on the console

From: Willy Tarreau
> Sent: 25 February 2022 06:37
>
> On Fri, Feb 25, 2022 at 06:12:35AM +0000, David Laight wrote:
> > From: Steven Rostedt
> > > Sent: 25 February 2022 04:01
> > >
> > > I've been noticing that my tests have been spitting out strange output on
> > > the console. It would happen at boot up and then clear up. It looks like
> > > something screwed up with the serial timing.
> > >
> > > Attached is a dmesg of one of my test runs as an example.
> > >
> > > I've noticed this on both 32 bit and 64 bit x86.
> > >
> > > I haven't had time to look deeper into this, but I figured I let you know
> > > about it.
> > >
> > > And it always seems to happen right after:
> > >
> > > Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled
> > >
> > > Maybe this is a serial issue and not a printk one? :-/
> >
> > Looks very much like the serial baud rate is being reset.
>
> I don't think it's the baud rate, characters are still readable, it
> looks more like a fifo being too short and causing lots of chars to
> be dropped.

Just before it recovers there is this output:
ATaitoscic nitahi tuPiet mfba Ae: aD nCt AH0 nP0
That is probably 'fifo not enabled'.

But the earlier output doesn't have many different characters in it.
Which is typical of the baud rate being wrong.

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

2022-02-25 19:59:51

by David Laight

[permalink] [raw]
Subject: RE: Strange output on the console

From: Willy Tarreau
> Sent: 25 February 2022 10:33
>
> On Fri, Feb 25, 2022 at 10:11:43AM +0000, David Laight wrote:
> > From: Willy Tarreau
> > > Sent: 25 February 2022 06:37
> > >
> > > On Fri, Feb 25, 2022 at 06:12:35AM +0000, David Laight wrote:
> > > > From: Steven Rostedt
> > > > > Sent: 25 February 2022 04:01
> > > > >
> > > > > I've been noticing that my tests have been spitting out strange output on
> > > > > the console. It would happen at boot up and then clear up. It looks like
> > > > > something screwed up with the serial timing.
> > > > >
> > > > > Attached is a dmesg of one of my test runs as an example.
> > > > >
> > > > > I've noticed this on both 32 bit and 64 bit x86.
> > > > >
> > > > > I haven't had time to look deeper into this, but I figured I let you know
> > > > > about it.
> > > > >
> > > > > And it always seems to happen right after:
> > > > >
> > > > > Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled
> > > > >
> > > > > Maybe this is a serial issue and not a printk one? :-/
> > > >
> > > > Looks very much like the serial baud rate is being reset.
> > >
> > > I don't think it's the baud rate, characters are still readable, it
> > > looks more like a fifo being too short and causing lots of chars to
> > > be dropped.
> >
> > Just before it recovers there is this output:
> > ATaitoscic nitahi tuPiet mfba Ae: aD nCt AH0 nP0
> > That is probably 'fifo not enabled'.
> >
> > But the earlier output doesn't have many different characters in it.
> > Which is typical of the baud rate being wrong.
>
> I don't think so, here's the beginning of the capture:
>
> Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled
> 0SI 15Nto
> LgtAsri[dnei00ieC nm
> i:0eom:Soce
> [ad000Ade s:ii SSLtbueludAis002h00 1)ASrPn
> Ars004h00 7ARrPn
>
> All of these chars are within the printable set, so there's a probability
> of ~37% per char, so even for a short string of 9 chars like the first
> line, that's a 0.01% chance of it respecting the set. That's why for me
> it definitely doesn't correspond to a baud rate issue. With wrong baud
> rates you get lots of garbage in the full 8-bit range.
>
> In addition there are even upper case at the beginning of the lines which
> probably correspond to the ones that are printed on these lines.

Ok, that bit is probably output after the fifo is disabled.

But a few lines later you get:

TntiTet_TetkT t:TetyTvt:TntaTttTet TeteTvtnTntnTvtnT teT teTettTettKTeteTttnTnteTttaTttaTetKTttOTet:TvtnTvtltTntw Tnt:T tOTvt_Tnt:TvtKTntnT tsTvtsTttsTttsTttosTvtotTtt:TntrTntOTet T ttTvtKT tKT teTnttTettTntaTntdTttaT taTeteT tTtt TttrTntlTvt:TetkT tTtttTnteTttOTetkTvtsTeteT tlTttoTntoTetpTvtpTetsTvtvTvteTntOTet
TttTetsTetsTntsTetKT t TttiTttT tnT tOTnt_TttrTtttTntKTetOTtteTnt:T tKTttpTvdoT dxTvdnTtd TedtTedoTvdiTedaTedT d Ttd:TedtTtd TeddTvdKTediTndgTednTtdgTtd TedKT snTesTtsOTesKTnstTvsOTvsTnsKTvs:Tts:TesTvs:Tns Tes T sdTts TtsoTesOTtslTesnTesKTesOTns
TnsTesOT s T sKTesOT sKTvsOTtsTtsKTtsT sKTesTesKT sKTesOT siT stTnsaTvsaTesKTnsOT s
TesKTnsT s TvsKTeslT sOTesT sKTtseTnseTvs_Tts_T sOTes TnsOT s T spTespTesdTesdTeseTtseTeseTt_rTe_rTv_oTes:T s:Tes:TnfwOTvfrKT fwT frTefpuTefpKTefpTvf:TnsOTvsOTnsoTvs Tvs TvsTtssTnsT sKTesTnsTvsTtsoTesmTnsgTesiTeslTtsTvsKTnsTvsOTtsT s_Tnsu

which really doesn't have enough different characters in it to be a fifo problem.
That looks like a UART struggling to find valid start and stop bits
on a continuous data stream that doesn't match the baud rate.

It may also be that whatever 'terminal' is being used is masking off the 0x80 bit.

Another possibility is that the count of characters is about right.
But the receiver is misaligned on the 10 bit async characters.
Because these is no line idle, it never synchronises properly.
ISTR a real async terminal behaving that way.
But I think that gives a much wider range of characters - just the wrong ones.

There is also a third error pattern:
^[[A^[[A^[[A^[[A^[[A^[[A^[[A^[[A^[[A^[[A^[[A^[[A^[[A^[[A^[[A^[[
which might be an 'even more wrong' baud rate.

So maybe we are both right - for different bits of the error.

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

2022-02-26 00:16:48

by Willy Tarreau

[permalink] [raw]
Subject: Re: Strange output on the console

On Fri, Feb 25, 2022 at 08:28:50AM -0500, Steven Rostedt wrote:
> On another machine that has the same issue but with different characters, I
> captured the output from the console, but also took the output from dmesg
> (both are attached).
>
> The console has:
>
> [ 5.886056] thermal LNXTHERM:00: registered as thermal_zone0
> [ 5.891773] ACPI: thermal: Thermal Zone [TZ00] (28 C)
> [ 5.897082] thermal LNXTHERM:01: registered as thermal_zone1
> [ 5.902800] ACPI: thermal: Thermal Zone [TZ01] (30 C)
> [ 5.908053] Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled
> .0I 15
> 10y0e
> 9No
> Lt
> 2t.d)
> .t v
> 6be
> a20tp
> 0a2ni
> s
> s
> 1s
> s
> [..]
>
>
> and dmesg has:
>
> [ 5.897082] thermal LNXTHERM:01: registered as thermal_zone1
> [ 5.902800] ACPI: thermal: Thermal Zone [TZ01] (30 C)
> [ 5.908053] Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled
> [ 5.914461] 00:08: ttyS0 at I/O 0x3f8 (irq = 4, base_baud = 115200) is a 16550A
> [ 5.916204] 0000:00:16.3: ttyS1 at I/O 0xf0e0 (irq = 19, base_baud = 115200) is a 16550A
> [ 5.917161] Non-volatile memory driver v1.3
> [ 5.917617] Linux agpgart interface v0.103
> [ 5.922965] tpm_tis 00:09: 1.2 TPM (device-id 0xB, rev-id 16)
> [ 5.988960] tpm tpm0: TPM is disabled/deactivated (0x7)
> [ 6.001410] brd: module loaded
> [ 6.001865] ahci 0000:00:1f.2: version 3.0
> [ 6.012113] ahci 0000:00:1f.2: AHCI 0001.0300 32 slots 6 ports 6 Gbps 0x5 impl SATA mode
> [ 6.012804] ahci 0000:00:1f.2: flags: 64bit ncq pm led clo pio slum part ems apst
> [ 6.016433] scsi host0: ahci
> [ 6.017170] scsi host1: ahci
> [ 6.017730] scsi host2: ahci
> [ 6.018326] scsi host3: ahci
> [ 6.018880] scsi host4: ahci
> [ 6.019469] scsi host5: ahci
> [ 6.019921] ata1: SATA max UDMA/133 abar m2048@0xf7c36000 port 0xf7c36100 irq 24
>
>
> The ending has:
>
> For console:
>
> [..]
> .Kp
> 6Kt
> [K-g
> Ke
> 1iah
> 3inc
> inc
> 6inc
> 1inc
> 6inc
> 6inc
> 1inc
> 6inc
> 6itu
> 6a
> .a c
> 6Pe
> 8tm
> 6R r
> [ 16.841072] Freeing unused decrypted memory: 2036K
> [ 16.846604] Freeing unused kernel image (initmem) memory: 2372K
> [ 16.857905] Write protecting the kernel read-only data: 22528k
>
>
> For dmesg:
>
> [..]
> [ 16.624186] Key type fscrypt-provisioning registered
> [ 16.627743] Key type encrypted registered
> [ 16.628167] ima: Allocated hash algorithm: sha1
> [ 16.634951] ima: Error Communicating to TPM chip
> [ 16.645952] ima: Error Communicating to TPM chip
> [ 16.656946] ima: Error Communicating to TPM chip
> [ 16.667939] ima: Error Communicating to TPM chip
> [ 16.678937] ima: Error Communicating to TPM chip
> [ 16.689943] ima: Error Communicating to TPM chip
> [ 16.700938] ima: Error Communicating to TPM chip
> [ 16.710941] ima: Error Communicating to TPM chip
> [ 16.726787] ima: No architecture policies found
> [ 16.746928] alg: No test for 842 (842-scomp)
> [ 16.747440] alg: No test for 842 (842-generic)
> [ 16.838166] PM: Magic number: 10:503:332
> [ 16.838607] tty tty53: hash matches
> [ 16.839099] RAS: Correctable Errors collector initialized.
> [ 16.841072] Freeing unused decrypted memory: 2036K
> [ 16.846604] Freeing unused kernel image (initmem) memory: 2372K
> [ 16.857905] Write protecting the kernel read-only data: 22528k
> [ 16.864836] Freeing unused kernel image (text/rodata gap) memory: 2036K
> [ 16.871969] Freeing unused kernel image (rodata/data gap) memory: 1180K

Interesting, it looks like the characters are rare samples from the
expected output. I.e. many '1' or '6' and 'inc' for "16* communicating".
But it looks like some chars help getting them resynced. A wrong bit
count and/or stop bit count could somehow provoke this, with one char
having the bits properly placed to be detected on the other side, then
many being shifted, arriving without the proper start bit condition,
and/or with the stop bit wrong causing them to be dropped, until some
of the low-bit count chars (0x20 or 0x0A) making it easier and resyncing
on end of lines.

Maybe you could try to change the config on the receiver to try to resync
with what it sent and see if you figure what setting is being used ? I
would personally try to set it to 7 bit with 1 stop and no parity, same
baud rate just to see.

Willy

2022-02-26 01:25:17

by Willy Tarreau

[permalink] [raw]
Subject: Re: Strange output on the console

On Fri, Feb 25, 2022 at 08:44:24AM -0500, Steven Rostedt wrote:
> On Fri, 25 Feb 2022 14:40:56 +0100
> Willy Tarreau <[email protected]> wrote:
>
> > Maybe you could try to change the config on the receiver to try to resync
> > with what it sent and see if you figure what setting is being used ? I
> > would personally try to set it to 7 bit with 1 stop and no parity, same
> > baud rate just to see.
>
> It appears to be a bug in the kernel and the commit was reverted.
>
> I Cc'd you on my reply.
>
> https://lore.kernel.org/all/[email protected]/

Yep, just seen, I received it after responding. Another happy end!

Willy

2022-02-26 02:21:32

by Sergey Senozhatsky

[permalink] [raw]
Subject: Re: Strange output on the console

On (22/02/25 08:42), Steven Rostedt wrote:
> On Fri, 25 Feb 2022 08:32:09 -0500
> Steven Rostedt <[email protected]> wrote:
>
> > On Fri, 25 Feb 2022 14:07:00 +0900
> > Sergey Senozhatsky <[email protected]> wrote:
> >
> > > > Steven, did you notice this recently?
> > > > May I perhaps suggest git bisect? (sorry)
> > >
> > > On the off chance - do you have this revert in your kernel?
> > >
> > > commit 6a7b9f002eca6788d346c16a6ff0c218b41f8d1d
> > > Author: Greg Kroah-Hartman <[email protected]>
> > > Date: Wed Jan 26 14:33:58 2022 +0100
> > >
> > > Revert "tty: serial: Use fifo in 8250 console driver"
> > >
> > > This reverts commit 5021d709b31b8a14317998a33cbc78be0de9ab30.
> > >
> > > The patch is still a bit buggy, and this breaks some other hardware
> > > types. It needs to be resubmitted in a non-buggy way, and make sure the
> > > other hardware types also continue to work properly.
> >
> > My tree does not have this.
> >
> > I just applied it and I'm building now. I'll let you know if this fixes the
> > issue or not.
>
> Mystery solved. Thanks Sergey!

Awesome!

2022-02-26 02:26:45

by Steven Rostedt

[permalink] [raw]
Subject: Re: Strange output on the console

On Fri, 25 Feb 2022 13:25:46 +0100
Willy Tarreau <[email protected]> wrote:

> ... and Steven having fun reading our proposals after having made this
> up entirely in preparation of the next April's fool :-)

Here's more about the joke ;-)

On another machine that has the same issue but with different characters, I
captured the output from the console, but also took the output from dmesg
(both are attached).

The console has:

[ 5.886056] thermal LNXTHERM:00: registered as thermal_zone0
[ 5.891773] ACPI: thermal: Thermal Zone [TZ00] (28 C)
[ 5.897082] thermal LNXTHERM:01: registered as thermal_zone1
[ 5.902800] ACPI: thermal: Thermal Zone [TZ01] (30 C)
[ 5.908053] Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled
.0I 15
10y0e
9No
Lt
2t.d)
.t v
6be
a20tp
0a2ni
s
s
1s
s
[..]


and dmesg has:

[ 5.897082] thermal LNXTHERM:01: registered as thermal_zone1
[ 5.902800] ACPI: thermal: Thermal Zone [TZ01] (30 C)
[ 5.908053] Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled
[ 5.914461] 00:08: ttyS0 at I/O 0x3f8 (irq = 4, base_baud = 115200) is a 16550A
[ 5.916204] 0000:00:16.3: ttyS1 at I/O 0xf0e0 (irq = 19, base_baud = 115200) is a 16550A
[ 5.917161] Non-volatile memory driver v1.3
[ 5.917617] Linux agpgart interface v0.103
[ 5.922965] tpm_tis 00:09: 1.2 TPM (device-id 0xB, rev-id 16)
[ 5.988960] tpm tpm0: TPM is disabled/deactivated (0x7)
[ 6.001410] brd: module loaded
[ 6.001865] ahci 0000:00:1f.2: version 3.0
[ 6.012113] ahci 0000:00:1f.2: AHCI 0001.0300 32 slots 6 ports 6 Gbps 0x5 impl SATA mode
[ 6.012804] ahci 0000:00:1f.2: flags: 64bit ncq pm led clo pio slum part ems apst
[ 6.016433] scsi host0: ahci
[ 6.017170] scsi host1: ahci
[ 6.017730] scsi host2: ahci
[ 6.018326] scsi host3: ahci
[ 6.018880] scsi host4: ahci
[ 6.019469] scsi host5: ahci
[ 6.019921] ata1: SATA max UDMA/133 abar m2048@0xf7c36000 port 0xf7c36100 irq 24


The ending has:

For console:

[..]
.Kp
6Kt
[K-g
Ke
1iah
3inc
inc
6inc
1inc
6inc
6inc
1inc
6inc
6itu
6a
.a c
6Pe
8tm
6R r
[ 16.841072] Freeing unused decrypted memory: 2036K
[ 16.846604] Freeing unused kernel image (initmem) memory: 2372K
[ 16.857905] Write protecting the kernel read-only data: 22528k


For dmesg:

[..]
[ 16.624186] Key type fscrypt-provisioning registered
[ 16.627743] Key type encrypted registered
[ 16.628167] ima: Allocated hash algorithm: sha1
[ 16.634951] ima: Error Communicating to TPM chip
[ 16.645952] ima: Error Communicating to TPM chip
[ 16.656946] ima: Error Communicating to TPM chip
[ 16.667939] ima: Error Communicating to TPM chip
[ 16.678937] ima: Error Communicating to TPM chip
[ 16.689943] ima: Error Communicating to TPM chip
[ 16.700938] ima: Error Communicating to TPM chip
[ 16.710941] ima: Error Communicating to TPM chip
[ 16.726787] ima: No architecture policies found
[ 16.746928] alg: No test for 842 (842-scomp)
[ 16.747440] alg: No test for 842 (842-generic)
[ 16.838166] PM: Magic number: 10:503:332
[ 16.838607] tty tty53: hash matches
[ 16.839099] RAS: Correctable Errors collector initialized.
[ 16.841072] Freeing unused decrypted memory: 2036K
[ 16.846604] Freeing unused kernel image (initmem) memory: 2372K
[ 16.857905] Write protecting the kernel read-only data: 22528k
[ 16.864836] Freeing unused kernel image (text/rodata gap) memory: 2036K
[ 16.871969] Freeing unused kernel image (rodata/data gap) memory: 1180K

-- Steve


Attachments:
(No filename) (3.51 kB)
dmesg (72.86 kB)
console.txt (37.61 kB)
Download all attachments

2022-02-26 13:31:55

by Wander Costa

[permalink] [raw]
Subject: Re: Strange output on the console

Hi Steve and Sergey,

Could you please tell the serial controllers in your hardware?

On Fri, Feb 25, 2022 at 11:21 PM Sergey Senozhatsky
<[email protected]> wrote:
>
> On (22/02/25 08:42), Steven Rostedt wrote:
> > On Fri, 25 Feb 2022 08:32:09 -0500
> > Steven Rostedt <[email protected]> wrote:
> >
> > > On Fri, 25 Feb 2022 14:07:00 +0900
> > > Sergey Senozhatsky <[email protected]> wrote:
> > >
> > > > > Steven, did you notice this recently?
> > > > > May I perhaps suggest git bisect? (sorry)
> > > >
> > > > On the off chance - do you have this revert in your kernel?
> > > >
> > > > commit 6a7b9f002eca6788d346c16a6ff0c218b41f8d1d
> > > > Author: Greg Kroah-Hartman <[email protected]>
> > > > Date: Wed Jan 26 14:33:58 2022 +0100
> > > >
> > > > Revert "tty: serial: Use fifo in 8250 console driver"
> > > >
> > > > This reverts commit 5021d709b31b8a14317998a33cbc78be0de9ab30.
> > > >
> > > > The patch is still a bit buggy, and this breaks some other hardware
> > > > types. It needs to be resubmitted in a non-buggy way, and make sure the
> > > > other hardware types also continue to work properly.
> > >
> > > My tree does not have this.
> > >
> > > I just applied it and I'm building now. I'll let you know if this fixes the
> > > issue or not.
> >
> > Mystery solved. Thanks Sergey!
>
> Awesome!
>

2022-02-28 17:39:44

by Steven Rostedt

[permalink] [raw]
Subject: Re: Strange output on the console

On Sat, 26 Feb 2022 08:49:45 -0300
Wander Costa <[email protected]> wrote:

> Hi Steve and Sergey,
>
> Could you please tell the serial controllers in your hardware?

Not sure how to find that information, but I tried ;-)

On one machine, lspci gives:
00:16.3 Serial controller: Intel Corporation 7 Series/C210 Series Chipset Family KT Controller (rev 04)

and dmidecode gives:


Handle 0x0023, DMI type 8, 9 bytes
Port Connector Information
Internal Reference Designator: VGA/COMA
Internal Connector Type: None
External Reference Designator: COMA
External Connector Type: DB-9 male
Port Type: Serial Port 16550A Compatible

Handle 0x0024, DMI type 8, 9 bytes
Port Connector Information
Internal Reference Designator: COMB
Internal Connector Type: Proprietary
External Reference Designator: COMB
External Connector Type: DB-9 male
Port Type: Serial Port 16550A Compatible


On another machine, there's nothing in lspci, and dmidecode gives:

Handle 0x0007, DMI type 8, 9 bytes
Port Connector Information
Internal Reference Designator: J2A2A
Internal Connector Type: None
External Reference Designator: COM A
External Connector Type: DB-9 male
Port Type: Serial Port 16550A Compatible

and dmesg has:

Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled
serial8250: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A

Anything else I could do to find out more?


-- Steve

2022-03-03 00:00:39

by Wander Costa

[permalink] [raw]
Subject: Re: Strange output on the console

On Mon, Feb 28, 2022 at 1:11 PM Steven Rostedt <[email protected]> wrote:
>
> On Sat, 26 Feb 2022 08:49:45 -0300
> Wander Costa <[email protected]> wrote:
>

[snip]

>
> Anything else I could do to find out more?
>

I think I have everything I need, thank you.