2018-12-13 00:43:21

by Paul Burton

[permalink] [raw]
Subject: Re: [PATCH] serial: 8250: Fix clearing FIFOs in RS485 mode again

Hi Marek / Greg / all,

On Mon, Sep 03, 2018 at 12:44:52AM +0000, Marek Vasut wrote:
> The 8250 FIFOs indeed need to be cleared after stopping transmission in
> RS485 mode without SER_RS485_RX_DURING_TX flag set. But there are two
> problems with the approach taken by the previous patch from Fixes tag.
>
> First, serial8250_clear_fifos() should clear fifos, but what it really
> does is it enables the FIFOs unconditionally if present, clears them
> and then sets the FCR register to zero, which effectively disables the
> FIFOs. In case the FIFO is disabled, enabling it and clearing it makes
> no sense and in fact can trigger misbehavior of the 8250 core. Moreover,
> the FCR register may contain other FIFO configuration bits which may not
> be writable unconditionally and writing them incorrectly can trigger
> misbehavior of the 8250 core too. (ie. AM335x UART swallows the first
> byte and retransmits the last byte twice because of this FCR write).
>
> Second, serial8250_clear_and_reinit_fifos() completely reloads the FCR,
> but what really has to happen at the end of the RS485 transmission is
> clearing of the FIFOs and nothing else.
>
> This patch repairs serial8250_clear_fifos() so that it really only
> clears the FIFOs by operating on FCR[2:1] bits and leaves all the
> other bits alone. It also undoes serial8250_clear_and_reinit_fifos()
> from __do_stop_tx_rs485() as serial8250_clear_fifos() is sufficient.
>
> Signed-off-by: Marek Vasut <[email protected]>
> Fixes: 2bed8a8e7072 ("Clearing FIFOs in RS485 emulation mode causes subsequent transmits to break")
> Cc: Daniel Jedrychowski <[email protected]>
> Cc: Greg Kroah-Hartman <[email protected]>
> ---
> NOTE: I am not entirely certain this won't break some other hardware,
> esp. since there might be dependencies on the weird previous
> behavior of the serial8250_clear_fifos() somewhere.

This patch has broken the UART on my Ingenic JZ4780 based MIPS Creator
Ci20 board. After this patch the system seems to detect garbage input
that is recognized as SysRq triggers which spam the console & eventually
reset the system.

One thing of note is that both serial8250_do_startup() &
serial8250_do_shutdown() contain comments that explicitly state their
expectation that the FIFOs will be disabled by serial8250_clear_fifos(),
which is no longer true after this patch.

I found that restoring the old behaviour for serial8250_do_startup() is
enough to make my system work again, but this is obviously a hack:

diff --git a/drivers/tty/serial/8250/8250_port.c b/drivers/tty/serial/8250/8250_port.c
index f776b3eafb96..8def02933d19 100644
--- a/drivers/tty/serial/8250/8250_port.c
+++ b/drivers/tty/serial/8250/8250_port.c
@@ -2210,7 +2210,11 @@ int serial8250_do_startup(struct uart_port *port)
* Clear the FIFO buffers and disable them.
* (they will be reenabled in set_termios())
*/
- serial8250_clear_fifos(up);
+ if (up->capabilities & UART_CAP_FIFO) {
+ serial_port_out(port, UART_FCR, UART_FCR_ENABLE_FIFO);
+ serial8250_clear_fifos(up);
+ serial_port_out(port, UART_FCR, 0);
+ }

/*
* Clear the interrupt registers.

Any ideas? Given the mismatch between this patch & comments that clearly
expect the old behaviour I think the __do_stop_tx_rs485() case probably
needs something different to other callers.

Thanks,
Paul


2018-12-13 00:56:38

by Marek Vasut

[permalink] [raw]
Subject: Re: [PATCH] serial: 8250: Fix clearing FIFOs in RS485 mode again

On 12/13/2018 01:41 AM, Paul Burton wrote:
> Hi Marek / Greg / all,
>
> On Mon, Sep 03, 2018 at 12:44:52AM +0000, Marek Vasut wrote:
>> The 8250 FIFOs indeed need to be cleared after stopping transmission in
>> RS485 mode without SER_RS485_RX_DURING_TX flag set. But there are two
>> problems with the approach taken by the previous patch from Fixes tag.
>>
>> First, serial8250_clear_fifos() should clear fifos, but what it really
>> does is it enables the FIFOs unconditionally if present, clears them
>> and then sets the FCR register to zero, which effectively disables the
>> FIFOs. In case the FIFO is disabled, enabling it and clearing it makes
>> no sense and in fact can trigger misbehavior of the 8250 core. Moreover,
>> the FCR register may contain other FIFO configuration bits which may not
>> be writable unconditionally and writing them incorrectly can trigger
>> misbehavior of the 8250 core too. (ie. AM335x UART swallows the first
>> byte and retransmits the last byte twice because of this FCR write).
>>
>> Second, serial8250_clear_and_reinit_fifos() completely reloads the FCR,
>> but what really has to happen at the end of the RS485 transmission is
>> clearing of the FIFOs and nothing else.
>>
>> This patch repairs serial8250_clear_fifos() so that it really only
>> clears the FIFOs by operating on FCR[2:1] bits and leaves all the
>> other bits alone. It also undoes serial8250_clear_and_reinit_fifos()
>> from __do_stop_tx_rs485() as serial8250_clear_fifos() is sufficient.
>>
>> Signed-off-by: Marek Vasut <[email protected]>
>> Fixes: 2bed8a8e7072 ("Clearing FIFOs in RS485 emulation mode causes subsequent transmits to break")
>> Cc: Daniel Jedrychowski <[email protected]>
>> Cc: Greg Kroah-Hartman <[email protected]>
>> ---
>> NOTE: I am not entirely certain this won't break some other hardware,
>> esp. since there might be dependencies on the weird previous
>> behavior of the serial8250_clear_fifos() somewhere.
>
> This patch has broken the UART on my Ingenic JZ4780 based MIPS Creator
> Ci20 board. After this patch the system seems to detect garbage input
> that is recognized as SysRq triggers which spam the console & eventually
> reset the system.
>
> One thing of note is that both serial8250_do_startup() &
> serial8250_do_shutdown() contain comments that explicitly state their
> expectation that the FIFOs will be disabled by serial8250_clear_fifos(),
> which is no longer true after this patch.
>
> I found that restoring the old behaviour for serial8250_do_startup() is
> enough to make my system work again, but this is obviously a hack:
>
> diff --git a/drivers/tty/serial/8250/8250_port.c b/drivers/tty/serial/8250/8250_port.c
> index f776b3eafb96..8def02933d19 100644
> --- a/drivers/tty/serial/8250/8250_port.c
> +++ b/drivers/tty/serial/8250/8250_port.c
> @@ -2210,7 +2210,11 @@ int serial8250_do_startup(struct uart_port *port)
> * Clear the FIFO buffers and disable them.
> * (they will be reenabled in set_termios())
> */
> - serial8250_clear_fifos(up);
> + if (up->capabilities & UART_CAP_FIFO) {
> + serial_port_out(port, UART_FCR, UART_FCR_ENABLE_FIFO);
> + serial8250_clear_fifos(up);
> + serial_port_out(port, UART_FCR, 0);
> + }
>
> /*
> * Clear the interrupt registers.
>
> Any ideas? Given the mismatch between this patch & comments that clearly
> expect the old behaviour I think the __do_stop_tx_rs485() case probably
> needs something different to other callers.

The problem the original patch fixed was that too many bits were cleared
in the FCR on OMAP3/AM335x . If you have such a system (a beaglebone or
similar), you can come up with a solution which can accommodate both the
JZ4780 and AM335x. Can you take a look ? The datasheet for both should
be public too.

--
Best regards,
Marek Vasut

2018-12-13 01:50:13

by Paul Burton

[permalink] [raw]
Subject: Re: [PATCH] serial: 8250: Fix clearing FIFOs in RS485 mode again

Hi Marek,

On Thu, Dec 13, 2018 at 01:55:29AM +0100, Marek Vasut wrote:
> On 12/13/2018 01:41 AM, Paul Burton wrote:
> > This patch has broken the UART on my Ingenic JZ4780 based MIPS Creator
> > Ci20 board. After this patch the system seems to detect garbage input
> > that is recognized as SysRq triggers which spam the console & eventually
> > reset the system.
> >
> > One thing of note is that both serial8250_do_startup() &
> > serial8250_do_shutdown() contain comments that explicitly state their
> > expectation that the FIFOs will be disabled by serial8250_clear_fifos(),
> > which is no longer true after this patch.
> >
> > I found that restoring the old behaviour for serial8250_do_startup() is
> > enough to make my system work again, but this is obviously a hack:
>%
> > Any ideas? Given the mismatch between this patch & comments that clearly
> > expect the old behaviour I think the __do_stop_tx_rs485() case probably
> > needs something different to other callers.
>
> The problem the original patch fixed was that too many bits were cleared
> in the FCR on OMAP3/AM335x . If you have such a system (a beaglebone or
> similar), you can come up with a solution which can accommodate both the
> JZ4780 and AM335x. Can you take a look ? The datasheet for both should
> be public too.

I don't have such a system, but hey - the Ingenic JZ4780 manual is
public too [1] so you have just as much opportunity to fix it :)

I wonder whether the issue may be the JZ4780 UART not automatically
resetting the FIFOs when FCR[0] changes. This is a guess, but the manual
doesn't explicitly say it'll happen & the programming example it gives
says to explicitly clear the FIFOs using FCR[2:1]. Since this is what
the kernel has been doing for at least the whole git era I wouldn't be
surprised if other devices are bitten by the change as people start
trying 4.20 on them.

So I think the safest option might be to restore the original behaviour
& just keep your change for the __do_stop_tx_rs485() path specifically,
something like the below. Could you test it on your system?

Assuming it works I'll clean it up & submit. Otherwise your patch is a
regression so I think a revert would make sense until the problem is
found.

Thanks,
Paul

[1] ftp://ftp.ingenic.cn/SOC/JZ4780/JZ4780_pm.pdf

---
diff --git a/drivers/tty/serial/8250/8250_port.c b/drivers/tty/serial/8250/8250_port.c
index f776b3eafb96..0df0ed437a87 100644
--- a/drivers/tty/serial/8250/8250_port.c
+++ b/drivers/tty/serial/8250/8250_port.c
@@ -550,7 +550,7 @@ static unsigned int serial_icr_read(struct uart_8250_port *up, int offset)
/*
* FIFO support.
*/
-static void serial8250_clear_fifos(struct uart_8250_port *p)
+static void __serial8250_clear_fifos(struct uart_8250_port *p, int clr)
{
unsigned char fcr;
unsigned char clr_mask = UART_FCR_CLEAR_RCVR | UART_FCR_CLEAR_XMIT;
@@ -558,25 +558,26 @@ static void serial8250_clear_fifos(struct uart_8250_port *p)
if (p->capabilities & UART_CAP_FIFO) {
/*
* Make sure to avoid changing FCR[7:3] and ENABLE_FIFO bits.
- * In case ENABLE_FIFO is not set, there is nothing to flush
- * so just return. Furthermore, on certain implementations of
- * the 8250 core, the FCR[7:3] bits may only be changed under
- * specific conditions and changing them if those conditions
- * are not met can have nasty side effects. One such core is
- * the 8250-omap present in TI AM335x.
+ * On certain implementations of the 8250 core, the FCR[7:3]
+ * bits may only be changed under specific conditions and
+ * changing them if those conditions are not met can have nasty
+ * side effects. One such core is the 8250-omap present in TI
+ * AM335x.
*/
fcr = serial_in(p, UART_FCR);
+ serial_out(p, UART_FCR, fcr | clr_mask);
+ serial_out(p, UART_FCR, fcr & ~clr);
+ }
+}

- /* FIFO is not enabled, there's nothing to clear. */
- if (!(fcr & UART_FCR_ENABLE_FIFO))
- return;
-
- fcr |= clr_mask;
- serial_out(p, UART_FCR, fcr);
+static void serial8250_clear_fifos(struct uart_8250_port *p)
+{
+ __serial8250_clear_fifos(p, 0);
+}

- fcr &= ~clr_mask;
- serial_out(p, UART_FCR, fcr);
- }
+static void serial8250_clear_and_disable_fifos(struct uart_8250_port *p)
+{
+ __serial8250_clear_fifos(p, UART_FCR_ENABLE_FIFO);
}

static inline void serial8250_em485_rts_after_send(struct uart_8250_port *p)
@@ -595,7 +596,7 @@ static enum hrtimer_restart serial8250_em485_handle_stop_tx(struct hrtimer *t);

void serial8250_clear_and_reinit_fifos(struct uart_8250_port *p)
{
- serial8250_clear_fifos(p);
+ serial8250_clear_and_disable_fifos(p);
serial_out(p, UART_FCR, p->fcr);
}
EXPORT_SYMBOL_GPL(serial8250_clear_and_reinit_fifos);
@@ -1364,7 +1365,7 @@ static void autoconfig(struct uart_8250_port *up)
serial_out(up, UART_RSA_FRR, 0);
#endif
serial8250_out_MCR(up, save_mcr);
- serial8250_clear_fifos(up);
+ serial8250_clear_and_disable_fifos(up);
serial_in(up, UART_RX);
if (up->capabilities & UART_CAP_UUE)
serial_out(up, UART_IER, UART_IER_UUE);
@@ -2210,7 +2211,7 @@ int serial8250_do_startup(struct uart_port *port)
* Clear the FIFO buffers and disable them.
* (they will be reenabled in set_termios())
*/
- serial8250_clear_fifos(up);
+ serial8250_clear_and_disable_fifos(up);

/*
* Clear the interrupt registers.
@@ -2460,7 +2461,7 @@ void serial8250_do_shutdown(struct uart_port *port)
*/
serial_port_out(port, UART_LCR,
serial_port_in(port, UART_LCR) & ~UART_LCR_SBC);
- serial8250_clear_fifos(up);
+ serial8250_clear_and_disable_fifos(up);

#ifdef CONFIG_SERIAL_8250_RSA
/*

2018-12-13 03:03:34

by Marek Vasut

[permalink] [raw]
Subject: Re: [PATCH] serial: 8250: Fix clearing FIFOs in RS485 mode again

On 12/13/2018 02:48 AM, Paul Burton wrote:
> Hi Marek,

Hi,

> On Thu, Dec 13, 2018 at 01:55:29AM +0100, Marek Vasut wrote:
>> On 12/13/2018 01:41 AM, Paul Burton wrote:
>>> This patch has broken the UART on my Ingenic JZ4780 based MIPS Creator
>>> Ci20 board. After this patch the system seems to detect garbage input
>>> that is recognized as SysRq triggers which spam the console & eventually
>>> reset the system.
>>>
>>> One thing of note is that both serial8250_do_startup() &
>>> serial8250_do_shutdown() contain comments that explicitly state their
>>> expectation that the FIFOs will be disabled by serial8250_clear_fifos(),
>>> which is no longer true after this patch.
>>>
>>> I found that restoring the old behaviour for serial8250_do_startup() is
>>> enough to make my system work again, but this is obviously a hack:
>> %
>>> Any ideas? Given the mismatch between this patch & comments that clearly
>>> expect the old behaviour I think the __do_stop_tx_rs485() case probably
>>> needs something different to other callers.
>>
>> The problem the original patch fixed was that too many bits were cleared
>> in the FCR on OMAP3/AM335x . If you have such a system (a beaglebone or
>> similar), you can come up with a solution which can accommodate both the
>> JZ4780 and AM335x. Can you take a look ? The datasheet for both should
>> be public too.
>
> I don't have such a system, but hey - the Ingenic JZ4780 manual is
> public too [1] so you have just as much opportunity to fix it :)
>
> I wonder whether the issue may be the JZ4780 UART not automatically
> resetting the FIFOs when FCR[0] changes. This is a guess, but the manual
> doesn't explicitly say it'll happen & the programming example it gives
> says to explicitly clear the FIFOs using FCR[2:1]. Since this is what
> the kernel has been doing for at least the whole git era I wouldn't be
> surprised if other devices are bitten by the change as people start
> trying 4.20 on them.

The patch you're complaining about is doing exactly that -- it sets
UART_FCR_CLEAR_RCVR | UART_FCR_CLEAR_XMIT in FCR , and then clears it.
Those two bits are exactly FCR[2:1]. It also explicitly does not touch
any other bits in FCR.

> So I think the safest option might be to restore the original behaviour
> & just keep your change for the __do_stop_tx_rs485() path specifically,
> something like the below. Could you test it on your system?
>
> Assuming it works I'll clean it up & submit. Otherwise your patch is a
> regression so I think a revert would make sense until the problem is
> found.
>
> Thanks,
> Paul
>
> [1] ftp://ftp.ingenic.cn/SOC/JZ4780/JZ4780_pm.pdf
>
> ---
> diff --git a/drivers/tty/serial/8250/8250_port.c b/drivers/tty/serial/8250/8250_port.c
> index f776b3eafb96..0df0ed437a87 100644
> --- a/drivers/tty/serial/8250/8250_port.c
> +++ b/drivers/tty/serial/8250/8250_port.c
> @@ -550,7 +550,7 @@ static unsigned int serial_icr_read(struct uart_8250_port *up, int offset)
> /*
> * FIFO support.
> */
> -static void serial8250_clear_fifos(struct uart_8250_port *p)
> +static void __serial8250_clear_fifos(struct uart_8250_port *p, int clr)
> {
> unsigned char fcr;
> unsigned char clr_mask = UART_FCR_CLEAR_RCVR | UART_FCR_CLEAR_XMIT;
> @@ -558,25 +558,26 @@ static void serial8250_clear_fifos(struct uart_8250_port *p)
> if (p->capabilities & UART_CAP_FIFO) {
> /*
> * Make sure to avoid changing FCR[7:3] and ENABLE_FIFO bits.
> - * In case ENABLE_FIFO is not set, there is nothing to flush
> - * so just return. Furthermore, on certain implementations of
> - * the 8250 core, the FCR[7:3] bits may only be changed under
> - * specific conditions and changing them if those conditions
> - * are not met can have nasty side effects. One such core is
> - * the 8250-omap present in TI AM335x.
> + * On certain implementations of the 8250 core, the FCR[7:3]
> + * bits may only be changed under specific conditions and
> + * changing them if those conditions are not met can have nasty
> + * side effects. One such core is the 8250-omap present in TI
> + * AM335x.
> */
> fcr = serial_in(p, UART_FCR);
> + serial_out(p, UART_FCR, fcr | clr_mask);
> + serial_out(p, UART_FCR, fcr & ~clr);

Note that, if I understand the patch correctly, this will _not_ restore
the original (broken) behavior. The original behavior cleared the entire
FCR at the end, which cleared even bits that were not supposed to be
cleared .

This patch will make things worse by retaining the clr_mask set in the
FCR, thus the FCR[2:1] will be set when you return from this function.
That cannot be right ?

> + }
> +}
>
> - /* FIFO is not enabled, there's nothing to clear. */
> - if (!(fcr & UART_FCR_ENABLE_FIFO))
> - return;
> -
> - fcr |= clr_mask;
> - serial_out(p, UART_FCR, fcr);
> +static void serial8250_clear_fifos(struct uart_8250_port *p)
> +{
> + __serial8250_clear_fifos(p, 0);
> +}
>
> - fcr &= ~clr_mask;
> - serial_out(p, UART_FCR, fcr);
> - }
> +static void serial8250_clear_and_disable_fifos(struct uart_8250_port *p)
> +{
> + __serial8250_clear_fifos(p, UART_FCR_ENABLE_FIFO);
> }
>
> static inline void serial8250_em485_rts_after_send(struct uart_8250_port *p)
> @@ -595,7 +596,7 @@ static enum hrtimer_restart serial8250_em485_handle_stop_tx(struct hrtimer *t);
>
> void serial8250_clear_and_reinit_fifos(struct uart_8250_port *p)
> {
> - serial8250_clear_fifos(p);
> + serial8250_clear_and_disable_fifos(p);
> serial_out(p, UART_FCR, p->fcr);
> }
> EXPORT_SYMBOL_GPL(serial8250_clear_and_reinit_fifos);
> @@ -1364,7 +1365,7 @@ static void autoconfig(struct uart_8250_port *up)
> serial_out(up, UART_RSA_FRR, 0);
> #endif
> serial8250_out_MCR(up, save_mcr);
> - serial8250_clear_fifos(up);
> + serial8250_clear_and_disable_fifos(up);
> serial_in(up, UART_RX);
> if (up->capabilities & UART_CAP_UUE)
> serial_out(up, UART_IER, UART_IER_UUE);
> @@ -2210,7 +2211,7 @@ int serial8250_do_startup(struct uart_port *port)
> * Clear the FIFO buffers and disable them.
> * (they will be reenabled in set_termios())
> */
> - serial8250_clear_fifos(up);
> + serial8250_clear_and_disable_fifos(up);
>
> /*
> * Clear the interrupt registers.
> @@ -2460,7 +2461,7 @@ void serial8250_do_shutdown(struct uart_port *port)
> */
> serial_port_out(port, UART_LCR,
> serial_port_in(port, UART_LCR) & ~UART_LCR_SBC);
> - serial8250_clear_fifos(up);
> + serial8250_clear_and_disable_fifos(up);
>
> #ifdef CONFIG_SERIAL_8250_RSA
> /*
>


--
Best regards,
Marek Vasut

2018-12-13 03:35:19

by Paul Burton

[permalink] [raw]
Subject: Re: [PATCH] serial: 8250: Fix clearing FIFOs in RS485 mode again

Hi Marek,

On Thu, Dec 13, 2018 at 03:27:51AM +0100, Marek Vasut wrote:
> > I wonder whether the issue may be the JZ4780 UART not automatically
> > resetting the FIFOs when FCR[0] changes. This is a guess, but the manual
> > doesn't explicitly say it'll happen & the programming example it gives
> > says to explicitly clear the FIFOs using FCR[2:1]. Since this is what
> > the kernel has been doing for at least the whole git era I wouldn't be
> > surprised if other devices are bitten by the change as people start
> > trying 4.20 on them.
>
> The patch you're complaining about is doing exactly that -- it sets
> UART_FCR_CLEAR_RCVR | UART_FCR_CLEAR_XMIT in FCR , and then clears it.
> Those two bits are exactly FCR[2:1]. It also explicitly does not touch
> any other bits in FCR.

No - your patch does that *only* if the FIFO enable bit is set, and
presumes that clearing FIFOs is a no-op when they're disabled. I don't
believe that's true on my system. I also believe that not touching the
FIFO enable bit is problematic, since some callers clearly expect that
to be cleared.

> > @@ -558,25 +558,26 @@ static void serial8250_clear_fifos(struct uart_8250_port *p)
> > if (p->capabilities & UART_CAP_FIFO) {
> > /*
> > * Make sure to avoid changing FCR[7:3] and ENABLE_FIFO bits.
> > - * In case ENABLE_FIFO is not set, there is nothing to flush
> > - * so just return. Furthermore, on certain implementations of
> > - * the 8250 core, the FCR[7:3] bits may only be changed under
> > - * specific conditions and changing them if those conditions
> > - * are not met can have nasty side effects. One such core is
> > - * the 8250-omap present in TI AM335x.
> > + * On certain implementations of the 8250 core, the FCR[7:3]
> > + * bits may only be changed under specific conditions and
> > + * changing them if those conditions are not met can have nasty
> > + * side effects. One such core is the 8250-omap present in TI
> > + * AM335x.
> > */
> > fcr = serial_in(p, UART_FCR);
> > + serial_out(p, UART_FCR, fcr | clr_mask);
> > + serial_out(p, UART_FCR, fcr & ~clr);
>
> Note that, if I understand the patch correctly, this will _not_ restore
> the original (broken) behavior. The original behavior cleared the entire
> FCR at the end, which cleared even bits that were not supposed to be
> cleared .

It will restore the original behaviour with regards to disabling the
FIFOs, so long as callers that expect that use
serial8250_clear_and_disable_fifos().

> This patch will make things worse by retaining the clr_mask set in the
> FCR, thus the FCR[2:1] will be set when you return from this function.
> That cannot be right ?

You're mistaken - clr_mask (ie. the FIFO reset bits) get cleared again
by the second call to serial_out(), I just did it without modifying the
fcr variable - ie. we write the original fcr value again at the end, but
with UART_FCR_ENABLE_FIFO cleared in the
serial8250_clear_and_disable_fifos() case. It would probably be clearer
with the clr argument renamed disable or something like that.

Thanks,
Paul

2018-12-13 04:29:37

by Marek Vasut

[permalink] [raw]
Subject: Re: [PATCH] serial: 8250: Fix clearing FIFOs in RS485 mode again

On 12/13/2018 04:33 AM, Paul Burton wrote:
> Hi Marek,

Hello Paul,

> On Thu, Dec 13, 2018 at 03:27:51AM +0100, Marek Vasut wrote:
>>> I wonder whether the issue may be the JZ4780 UART not automatically
>>> resetting the FIFOs when FCR[0] changes. This is a guess, but the manual
>>> doesn't explicitly say it'll happen & the programming example it gives
>>> says to explicitly clear the FIFOs using FCR[2:1]. Since this is what
>>> the kernel has been doing for at least the whole git era I wouldn't be
>>> surprised if other devices are bitten by the change as people start
>>> trying 4.20 on them.
>>
>> The patch you're complaining about is doing exactly that -- it sets
>> UART_FCR_CLEAR_RCVR | UART_FCR_CLEAR_XMIT in FCR , and then clears it.
>> Those two bits are exactly FCR[2:1]. It also explicitly does not touch
>> any other bits in FCR.
>
> No - your patch does that *only* if the FIFO enable bit is set, and
> presumes that clearing FIFOs is a no-op when they're disabled. I don't
> believe that's true on my system. I also believe that not touching the
> FIFO enable bit is problematic, since some callers clearly expect that
> to be cleared.

Why would you disable FIFO to clear it ? This doesn't make sense to me,
the FIFO must be operational for it to be cleared. Moreover, it
contradicts your previous statement, "the programming example it gives
says to explicitly clear the FIFOs using FCR[2:1]" .

What exactly does your programming example for the JZ4780 say about the
FIFO clearing ? And does it work in that example ?

>>> @@ -558,25 +558,26 @@ static void serial8250_clear_fifos(struct uart_8250_port *p)
>>> if (p->capabilities & UART_CAP_FIFO) {
>>> /*
>>> * Make sure to avoid changing FCR[7:3] and ENABLE_FIFO bits.
>>> - * In case ENABLE_FIFO is not set, there is nothing to flush
>>> - * so just return. Furthermore, on certain implementations of
>>> - * the 8250 core, the FCR[7:3] bits may only be changed under
>>> - * specific conditions and changing them if those conditions
>>> - * are not met can have nasty side effects. One such core is
>>> - * the 8250-omap present in TI AM335x.
>>> + * On certain implementations of the 8250 core, the FCR[7:3]
>>> + * bits may only be changed under specific conditions and
>>> + * changing them if those conditions are not met can have nasty
>>> + * side effects. One such core is the 8250-omap present in TI
>>> + * AM335x.
>>> */
>>> fcr = serial_in(p, UART_FCR);
>>> + serial_out(p, UART_FCR, fcr | clr_mask);
>>> + serial_out(p, UART_FCR, fcr & ~clr);
>>
>> Note that, if I understand the patch correctly, this will _not_ restore
>> the original (broken) behavior. The original behavior cleared the entire
>> FCR at the end, which cleared even bits that were not supposed to be
>> cleared .
>
> It will restore the original behaviour with regards to disabling the
> FIFOs, so long as callers that expect that use
> serial8250_clear_and_disable_fifos().

Does your platform need the FIFO to be explicitly disabled for it to be
cleared, can you confirm that ? And if so, does this apply to other
platforms with 8250 UART or is this specific to JZ4780 implementation of
the UART block ?

I just remembered U-Boot has this patch for JZ4780 UART [1], which
touches the FCR UME bit, so the FCR behavior on JZ4780 does differ from
the generic 8250 core. Could it be that this is what you're hitting ?

[1]
http://git.denx.de/?p=u-boot.git;a=commitdiff;h=0b060eefd951fc111ecb77c7b1932b158e6a4f2c;hp=79fd9281880974f076c5b4b354b57faa6e0cc146

>> This patch will make things worse by retaining the clr_mask set in the
>> FCR, thus the FCR[2:1] will be set when you return from this function.
>> That cannot be right ?
>
> You're mistaken - clr_mask (ie. the FIFO reset bits) get cleared again
> by the second call to serial_out(), I just did it without modifying the
> fcr variable - ie. we write the original fcr value again at the end, but
> with UART_FCR_ENABLE_FIFO cleared in the
> serial8250_clear_and_disable_fifos() case. It would probably be clearer
> with the clr argument renamed disable or something like that.

Uhh, OK, thanks for clarifying.

--
Best regards,
Marek Vasut

2018-12-13 05:13:09

by Paul Burton

[permalink] [raw]
Subject: Re: [PATCH] serial: 8250: Fix clearing FIFOs in RS485 mode again

Hi Marek,

On Thu, Dec 13, 2018 at 05:18:19AM +0100, Marek Vasut wrote:
> >>> I wonder whether the issue may be the JZ4780 UART not automatically
> >>> resetting the FIFOs when FCR[0] changes. This is a guess, but the manual
> >>> doesn't explicitly say it'll happen & the programming example it gives
> >>> says to explicitly clear the FIFOs using FCR[2:1]. Since this is what
> >>> the kernel has been doing for at least the whole git era I wouldn't be
> >>> surprised if other devices are bitten by the change as people start
> >>> trying 4.20 on them.
> >>
> >> The patch you're complaining about is doing exactly that -- it sets
> >> UART_FCR_CLEAR_RCVR | UART_FCR_CLEAR_XMIT in FCR , and then clears it.
> >> Those two bits are exactly FCR[2:1]. It also explicitly does not touch
> >> any other bits in FCR.
> >
> > No - your patch does that *only* if the FIFO enable bit is set, and
> > presumes that clearing FIFOs is a no-op when they're disabled. I don't
> > believe that's true on my system. I also believe that not touching the
> > FIFO enable bit is problematic, since some callers clearly expect that
> > to be cleared.
>
> Why would you disable FIFO to clear it ? This doesn't make sense to me,
> the FIFO must be operational for it to be cleared. Moreover, it
> contradicts your previous statement, "the programming example it gives
> says to explicitly clear the FIFOs using FCR[2:1]" .

No, no, not at all... I'm saying that my theory is:

- The JZ4780 requires that the FIFO be reset using FCR[2:1] in order
to not read garbage.

- Prior to your patch serial8250_clear_fifos() did this,
unconditionally.

- After your patch serial8250_clear_fifos() only clears the FIFOs if
the FIFO is already enabled.

- When called from serial8250_do_startup() during boot, the FIFO may
not be enabled - for example if the bootloader didn't use the FIFO,
or if the UART is used with 8250_early which zeroes FCR.

- Therefore after your patch the FIFO reset bits are never set if the
UART was used with 8250_early, or if it wasn't but the bootloader
didn't enable the FIFO.

I suspect that you get away with that because according to the PC16550D
documentation the FIFOs should reset when the FIFO enable bit changes,
so when the FIFO is later enabled it gets reset. I suspect that in the
JZ4780 this does not happen, and the FIFO contains garbage. Our previous
behaviour (always set FCR[2:1] to reset the FIFO) has been around for a
long time, so I would not be surprised to find other devices relying
upon it.

I'm also saying that if the FIFOs are enabled during boot then your
patch will leave them enabled after serial8250_do_startup() calls
serial8250_clear_fifos(), which a comment in serial8250_do_startup()
clearly states is not the expected behaviour:

> Clear the FIFO buffers and disable them.
> (they will be reenabled in set_termios())

(From https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/tty/serial/8250/8250_port.c?h=v4.20-rc6#n2209)

And further, with your patch serial8250_do_shutdown() will leave the
FIFO enabled which again does not match what comments suggest it expects
("Disable break condition and FIFOs").

> What exactly does your programming example for the JZ4780 say about the
> FIFO clearing ? And does it work in that example ?

It says to set FCR[2:1] to reset the FIFO after enabling it, which as
far as I can tell from the PC16550D documentation would not be necessary
there because when you enable the FIFO it would automatically be reset.
Linux did this until your patch.

> >>> @@ -558,25 +558,26 @@ static void serial8250_clear_fifos(struct uart_8250_port *p)
> >>> if (p->capabilities & UART_CAP_FIFO) {
> >>> /*
> >>> * Make sure to avoid changing FCR[7:3] and ENABLE_FIFO bits.
> >>> - * In case ENABLE_FIFO is not set, there is nothing to flush
> >>> - * so just return. Furthermore, on certain implementations of
> >>> - * the 8250 core, the FCR[7:3] bits may only be changed under
> >>> - * specific conditions and changing them if those conditions
> >>> - * are not met can have nasty side effects. One such core is
> >>> - * the 8250-omap present in TI AM335x.
> >>> + * On certain implementations of the 8250 core, the FCR[7:3]
> >>> + * bits may only be changed under specific conditions and
> >>> + * changing them if those conditions are not met can have nasty
> >>> + * side effects. One such core is the 8250-omap present in TI
> >>> + * AM335x.
> >>> */
> >>> fcr = serial_in(p, UART_FCR);
> >>> + serial_out(p, UART_FCR, fcr | clr_mask);
> >>> + serial_out(p, UART_FCR, fcr & ~clr);
> >>
> >> Note that, if I understand the patch correctly, this will _not_ restore
> >> the original (broken) behavior. The original behavior cleared the entire
> >> FCR at the end, which cleared even bits that were not supposed to be
> >> cleared .
> >
> > It will restore the original behaviour with regards to disabling the
> > FIFOs, so long as callers that expect that use
> > serial8250_clear_and_disable_fifos().
>
> Does your platform need the FIFO to be explicitly disabled for it to be
> cleared, can you confirm that ? And if so, does this apply to other
> platforms with 8250 UART or is this specific to JZ4780 implementation of
> the UART block ?
>
> I just remembered U-Boot has this patch for JZ4780 UART [1], which
> touches the FCR UME bit, so the FCR behavior on JZ4780 does differ from
> the generic 8250 core. Could it be that this is what you're hitting ?

No - we already set the UME bit & this has all worked fine until your
patch changed the FIFO reset behaviour.

Thanks,
Paul

2018-12-13 15:27:24

by Marek Vasut

[permalink] [raw]
Subject: Re: [PATCH] serial: 8250: Fix clearing FIFOs in RS485 mode again

On 12/13/2018 06:11 AM, Paul Burton wrote:
> Hi Marek,

Hi,

> On Thu, Dec 13, 2018 at 05:18:19AM +0100, Marek Vasut wrote:
>>>>> I wonder whether the issue may be the JZ4780 UART not automatically
>>>>> resetting the FIFOs when FCR[0] changes. This is a guess, but the manual
>>>>> doesn't explicitly say it'll happen & the programming example it gives
>>>>> says to explicitly clear the FIFOs using FCR[2:1]. Since this is what
>>>>> the kernel has been doing for at least the whole git era I wouldn't be
>>>>> surprised if other devices are bitten by the change as people start
>>>>> trying 4.20 on them.
>>>>
>>>> The patch you're complaining about is doing exactly that -- it sets
>>>> UART_FCR_CLEAR_RCVR | UART_FCR_CLEAR_XMIT in FCR , and then clears it.
>>>> Those two bits are exactly FCR[2:1]. It also explicitly does not touch
>>>> any other bits in FCR.
>>>
>>> No - your patch does that *only* if the FIFO enable bit is set, and
>>> presumes that clearing FIFOs is a no-op when they're disabled. I don't
>>> believe that's true on my system. I also believe that not touching the
>>> FIFO enable bit is problematic, since some callers clearly expect that
>>> to be cleared.
>>
>> Why would you disable FIFO to clear it ? This doesn't make sense to me,
>> the FIFO must be operational for it to be cleared. Moreover, it
>> contradicts your previous statement, "the programming example it gives
>> says to explicitly clear the FIFOs using FCR[2:1]" .
>
> No, no, not at all... I'm saying that my theory is:
>
> - The JZ4780 requires that the FIFO be reset using FCR[2:1] in order
> to not read garbage.

Which we do

> - Prior to your patch serial8250_clear_fifos() did this,
> unconditionally.

btw originally, this also cleared the UME bit. Could this be what made
the difference on the JZ4780 ?

Can you try this patch on the JZ4780 , just out of curiosity ?

diff --git a/drivers/tty/serial/8250/8250_port.c
b/drivers/tty/serial/8250/8250_port.c
index c39482b961110..3dce99f4c6802 100644
--- a/drivers/tty/serial/8250/8250_port.c
+++ b/drivers/tty/serial/8250/8250_port.c
@@ -553,7 +553,7 @@ static unsigned int serial_icr_read(struct
uart_8250_port *up, int offset)
static void serial8250_clear_fifos(struct uart_8250_port *p)
{
unsigned char fcr;
- unsigned char clr_mask = UART_FCR_CLEAR_RCVR | UART_FCR_CLEAR_XMIT;
+ unsigned char clr_mask = UART_FCR_CLEAR_RCVR |
UART_FCR_CLEAR_XMIT | BIT(4) /* UME */;

if (p->capabilities & UART_CAP_FIFO) {

> - After your patch serial8250_clear_fifos() only clears the FIFOs if
> the FIFO is already enabled.

I'd suppose this is OK, since clearing a disabled FIFO makes no sense ?

> - When called from serial8250_do_startup() during boot, the FIFO may
> not be enabled - for example if the bootloader didn't use the FIFO,
> or if the UART is used with 8250_early which zeroes FCR.

If it's not enabled, do you need to clear it ?

> - Therefore after your patch the FIFO reset bits are never set if the
> UART was used with 8250_early, or if it wasn't but the bootloader
> didn't enable the FIFO.

Hence my question above, do you need to clear the FIFO even if it was
not enabled ?

> I suspect that you get away with that because according to the PC16550D
> documentation the FIFOs should reset when the FIFO enable bit changes,
> so when the FIFO is later enabled it gets reset. I suspect that in the
> JZ4780 this does not happen, and the FIFO contains garbage. Our previous
> behaviour (always set FCR[2:1] to reset the FIFO) has been around for a
> long time, so I would not be surprised to find other devices relying
> upon it.

It is well possible, but I'd like to understand what really happens
here, not just guess.

> I'm also saying that if the FIFOs are enabled during boot then your
> patch will leave them enabled after serial8250_do_startup() calls
> serial8250_clear_fifos(), which a comment in serial8250_do_startup()
> clearly states is not the expected behaviour:

In that case, that needs to be fixed. But serial8250_clear_fifos()
should only do what the name says -- clear FIFOs -- not disable them.

>> Clear the FIFO buffers and disable them.
>> (they will be reenabled in set_termios())
>
> (From https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/tty/serial/8250/8250_port.c?h=v4.20-rc6#n2209)
>
> And further, with your patch serial8250_do_shutdown() will leave the
> FIFO enabled which again does not match what comments suggest it expects
> ("Disable break condition and FIFOs").
>
>> What exactly does your programming example for the JZ4780 say about the
>> FIFO clearing ? And does it work in that example ?
>
> It says to set FCR[2:1] to reset the FIFO after enabling it, which as
> far as I can tell from the PC16550D documentation would not be necessary
> there because when you enable the FIFO it would automatically be reset.
> Linux did this until your patch.

Linux sets the FCR[2:1] if the FIFO is enabled even now.

>>>>> @@ -558,25 +558,26 @@ static void serial8250_clear_fifos(struct uart_8250_port *p)
>>>>> if (p->capabilities & UART_CAP_FIFO) {
>>>>> /*
>>>>> * Make sure to avoid changing FCR[7:3] and ENABLE_FIFO bits.
>>>>> - * In case ENABLE_FIFO is not set, there is nothing to flush
>>>>> - * so just return. Furthermore, on certain implementations of
>>>>> - * the 8250 core, the FCR[7:3] bits may only be changed under
>>>>> - * specific conditions and changing them if those conditions
>>>>> - * are not met can have nasty side effects. One such core is
>>>>> - * the 8250-omap present in TI AM335x.
>>>>> + * On certain implementations of the 8250 core, the FCR[7:3]
>>>>> + * bits may only be changed under specific conditions and
>>>>> + * changing them if those conditions are not met can have nasty
>>>>> + * side effects. One such core is the 8250-omap present in TI
>>>>> + * AM335x.
>>>>> */
>>>>> fcr = serial_in(p, UART_FCR);
>>>>> + serial_out(p, UART_FCR, fcr | clr_mask);
>>>>> + serial_out(p, UART_FCR, fcr & ~clr);
>>>>
>>>> Note that, if I understand the patch correctly, this will _not_ restore
>>>> the original (broken) behavior. The original behavior cleared the entire
>>>> FCR at the end, which cleared even bits that were not supposed to be
>>>> cleared .
>>>
>>> It will restore the original behaviour with regards to disabling the
>>> FIFOs, so long as callers that expect that use
>>> serial8250_clear_and_disable_fifos().
>>
>> Does your platform need the FIFO to be explicitly disabled for it to be
>> cleared, can you confirm that ? And if so, does this apply to other
>> platforms with 8250 UART or is this specific to JZ4780 implementation of
>> the UART block ?
>>
>> I just remembered U-Boot has this patch for JZ4780 UART [1], which
>> touches the FCR UME bit, so the FCR behavior on JZ4780 does differ from
>> the generic 8250 core. Could it be that this is what you're hitting ?
>
> No - we already set the UME bit & this has all worked fine until your
> patch changed the FIFO reset behaviour.

The UME bit is in the FCR, serial8250_clear_fifos() originally cleared
it, cfr:
- serial_out(p, UART_FCR, 0);
in f6aa5beb45be27968a4df90176ca36dfc4363d37 . So the code was originally
completely disabling the UART on JZ4780 .

--
Best regards,
Marek Vasut

2018-12-13 17:59:03

by Paul Burton

[permalink] [raw]
Subject: Re: [PATCH] serial: 8250: Fix clearing FIFOs in RS485 mode again

Hi Marek,

On Thu, Dec 13, 2018 at 03:51:02PM +0100, Marek Vasut wrote:
> >>>>> I wonder whether the issue may be the JZ4780 UART not automatically
> >>>>> resetting the FIFOs when FCR[0] changes. This is a guess, but the manual
> >>>>> doesn't explicitly say it'll happen & the programming example it gives
> >>>>> says to explicitly clear the FIFOs using FCR[2:1]. Since this is what
> >>>>> the kernel has been doing for at least the whole git era I wouldn't be
> >>>>> surprised if other devices are bitten by the change as people start
> >>>>> trying 4.20 on them.
> >>>>
> >>>> The patch you're complaining about is doing exactly that -- it sets
> >>>> UART_FCR_CLEAR_RCVR | UART_FCR_CLEAR_XMIT in FCR , and then clears it.
> >>>> Those two bits are exactly FCR[2:1]. It also explicitly does not touch
> >>>> any other bits in FCR.
> >>>
> >>> No - your patch does that *only* if the FIFO enable bit is set, and
> >>> presumes that clearing FIFOs is a no-op when they're disabled. I don't
> >>> believe that's true on my system. I also believe that not touching the
> >>> FIFO enable bit is problematic, since some callers clearly expect that
> >>> to be cleared.
> >>
> >> Why would you disable FIFO to clear it ? This doesn't make sense to me,
> >> the FIFO must be operational for it to be cleared. Moreover, it
> >> contradicts your previous statement, "the programming example it gives
> >> says to explicitly clear the FIFOs using FCR[2:1]" .
> >
> > No, no, not at all... I'm saying that my theory is:
> >
> > - The JZ4780 requires that the FIFO be reset using FCR[2:1] in order
> > to not read garbage.
>
> Which we do

No - we don't... We used to, but your patch stops that from happening if
the FIFO enable bit is clear.... And the FIFO enable bit is clear during
serial8250_do_startup() if the UART was used with earlycon, otherwise it
depends on state left behind by the bootloader. I don't know how many
ways I can say this.

> > - Prior to your patch serial8250_clear_fifos() did this,
> > unconditionally.
>
> btw originally, this also cleared the UME bit. Could this be what made
> the difference on the JZ4780 ?

It did not, you are wrong. serial_out() calls ingenic_uart_serial_out()
which unconditionally or's in the UME bit for writes to FCR.

> > - After your patch serial8250_clear_fifos() only clears the FIFOs if
> > the FIFO is already enabled.
>
> I'd suppose this is OK, since clearing a disabled FIFO makes no sense ?

This is where the brokenness seems to come from. As I said, this would
be fine according to the PC16550D documentation but I can say based on
experimentation that it is *not* fine on the JZ4780.

Resetting a disabled FIFO makes perfect sense if it's going to be
enabled later. That used to happen, and after your patch it doesn't.

> > - When called from serial8250_do_startup() during boot, the FIFO may
> > not be enabled - for example if the bootloader didn't use the FIFO,
> > or if the UART is used with 8250_early which zeroes FCR.
>
> If it's not enabled, do you need to clear it ?
>
> > - Therefore after your patch the FIFO reset bits are never set if the
> > UART was used with 8250_early, or if it wasn't but the bootloader
> > didn't enable the FIFO.
>
> Hence my question above, do you need to clear the FIFO even if it was
> not enabled ?

Yes.

What you asked before though was whether we need to disable the FIFO in
order to clear it, which is a different question.

> > I suspect that you get away with that because according to the PC16550D
> > documentation the FIFOs should reset when the FIFO enable bit changes,
> > so when the FIFO is later enabled it gets reset. I suspect that in the
> > JZ4780 this does not happen, and the FIFO contains garbage. Our previous
> > behaviour (always set FCR[2:1] to reset the FIFO) has been around for a
> > long time, so I would not be surprised to find other devices relying
> > upon it.
>
> It is well possible, but I'd like to understand what really happens
> here, not just guess.

I can only tell you what little the JZ4780 manual says & what actually
works when I run it on my system. I used the word supect to explain my
theory, but that's the best we can do with the documentation available.

> > I'm also saying that if the FIFOs are enabled during boot then your
> > patch will leave them enabled after serial8250_do_startup() calls
> > serial8250_clear_fifos(), which a comment in serial8250_do_startup()
> > clearly states is not the expected behaviour:
>
> In that case, that needs to be fixed. But serial8250_clear_fifos()
> should only do what the name says -- clear FIFOs -- not disable them.

...which is why I effectively renamed it
serial8250_clear_and_disable_fifos() in my suggested patch. Did you test
it? I have no OMAP3 system to check that I didn't break your setup, and
at this point if I can't be sure it works I'll be submitting a revert of
your broken patch.

> >> Clear the FIFO buffers and disable them.
> >> (they will be reenabled in set_termios())
> >
> > (From https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/tty/serial/8250/8250_port.c?h=v4.20-rc6#n2209)
> >
> > And further, with your patch serial8250_do_shutdown() will leave the
> > FIFO enabled which again does not match what comments suggest it expects
> > ("Disable break condition and FIFOs").
> >
> >> What exactly does your programming example for the JZ4780 say about the
> >> FIFO clearing ? And does it work in that example ?
> >
> > It says to set FCR[2:1] to reset the FIFO after enabling it, which as
> > far as I can tell from the PC16550D documentation would not be necessary
> > there because when you enable the FIFO it would automatically be reset.
> > Linux did this until your patch.
>
> Linux sets the FCR[2:1] if the FIFO is enabled even now.

Correct, but not when the FIFO is disabled in serial8250_do_startup(),
nor when the FIFO is later enabled in serial8250_do_set_termios().

> >> I just remembered U-Boot has this patch for JZ4780 UART [1], which
> >> touches the FCR UME bit, so the FCR behavior on JZ4780 does differ from
> >> the generic 8250 core. Could it be that this is what you're hitting ?
> >
> > No - we already set the UME bit & this has all worked fine until your
> > patch changed the FIFO reset behaviour.
>
> The UME bit is in the FCR, serial8250_clear_fifos() originally cleared
> it, cfr:
> - serial_out(p, UART_FCR, 0);
> in f6aa5beb45be27968a4df90176ca36dfc4363d37 . So the code was originally
> completely disabling the UART on JZ4780 .

Again, wrong. Look at what serial_out() actually does & see
ingenic_uart_serial_out(). Again, the Ingenic UARTs have worked fine
until your patch, and work with mainline after reverting it - the
problem is *entirely* with the change you made & the UME bit has nothing
to do with it.

Thanks,
Paul

2018-12-16 20:11:47

by Paul Burton

[permalink] [raw]
Subject: [PATCH] Revert "serial: 8250: Fix clearing FIFOs in RS485 mode again"

Commit f6aa5beb45be ("serial: 8250: Fix clearing FIFOs in RS485 mode
again") makes a change to FIFO clearing code which its commit message
suggests was intended to be specific to use with RS485 mode, however:

1) The change made does not just affect __do_stop_tx_rs485(), it also
affects other uses of serial8250_clear_fifos() including paths for
starting up, shutting down or auto-configuring a port regardless of
whether it's an RS485 port or not.

2) It makes the assumption that resetting the FIFOs is a no-op when
FIFOs are disabled, and as such it checks for this case & explicitly
avoids setting the FIFO reset bits when the FIFO enable bit is
clear. A reading of the PC16550D manual would suggest that this is
OK since the FIFO should automatically be reset if it is later
enabled, but we support many 16550-compatible devices and have never
required this auto-reset behaviour for at least the whole git era.
Starting to rely on it now seems risky, offers no benefit, and
indeed breaks at least the Ingenic JZ4780's UARTs which reads
garbage when the RX FIFO is enabled if we don't explicitly reset it.

3) By only resetting the FIFOs if they're enabled, the behaviour of
serial8250_do_startup() during boot now depends on what the value of
FCR is before the 8250 driver is probed. This in itself seems
questionable and leaves us with FCR=0 & no FIFO reset if the UART
was used by 8250_early, otherwise it depends upon what the
bootloader left behind.

4) Although the naming of serial8250_clear_fifos() may be unclear, it
is clear that callers of it expect that it will disable FIFOs. Both
serial8250_do_startup() & serial8250_do_shutdown() contain comments
to that effect, and other callers explicitly re-enable the FIFOs
after calling serial8250_clear_fifos(). The premise of that patch
that disabling the FIFOs is incorrect therefore seems wrong.

For these reasons, this reverts commit f6aa5beb45be ("serial: 8250: Fix
clearing FIFOs in RS485 mode again").

Signed-off-by: Paul Burton <[email protected]>
Fixes: f6aa5beb45be ("serial: 8250: Fix clearing FIFOs in RS485 mode again").
Cc: Greg Kroah-Hartman <[email protected]>
Cc: Daniel Jedrychowski <[email protected]>
Cc: Marek Vasut <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: stable <[email protected]> # 4.10+
---
I did suggest an alternative approach which would rename
serial8250_clear_fifos() and split it into 2 variants - one that
disables FIFOs & one that does not, then use the latter in
__do_stop_tx_rs485():

https://lore.kernel.org/lkml/20181213014805.77u5dzydo23cm6fq@pburton-laptop/

However I have no access to the OMAP3 hardware that Marek's patch was
attempting to fix & have heard nothing back with regards to him testing
that approach, so here's a simple revert that fixes the Ingenic JZ4780.

I've marked for stable back to v4.10 presuming that this is how far the
broken patch may be backported, given that this is where commit
2bed8a8e7072 ("Clearing FIFOs in RS485 emulation mode causes subsequent
transmits to break") that it tried to fix was introduced.
---
drivers/tty/serial/8250/8250_port.c | 29 +++++------------------------
1 file changed, 5 insertions(+), 24 deletions(-)

diff --git a/drivers/tty/serial/8250/8250_port.c b/drivers/tty/serial/8250/8250_port.c
index f776b3eafb96..3f779d25ec0c 100644
--- a/drivers/tty/serial/8250/8250_port.c
+++ b/drivers/tty/serial/8250/8250_port.c
@@ -552,30 +552,11 @@ static unsigned int serial_icr_read(struct uart_8250_port *up, int offset)
*/
static void serial8250_clear_fifos(struct uart_8250_port *p)
{
- unsigned char fcr;
- unsigned char clr_mask = UART_FCR_CLEAR_RCVR | UART_FCR_CLEAR_XMIT;
-
if (p->capabilities & UART_CAP_FIFO) {
- /*
- * Make sure to avoid changing FCR[7:3] and ENABLE_FIFO bits.
- * In case ENABLE_FIFO is not set, there is nothing to flush
- * so just return. Furthermore, on certain implementations of
- * the 8250 core, the FCR[7:3] bits may only be changed under
- * specific conditions and changing them if those conditions
- * are not met can have nasty side effects. One such core is
- * the 8250-omap present in TI AM335x.
- */
- fcr = serial_in(p, UART_FCR);
-
- /* FIFO is not enabled, there's nothing to clear. */
- if (!(fcr & UART_FCR_ENABLE_FIFO))
- return;
-
- fcr |= clr_mask;
- serial_out(p, UART_FCR, fcr);
-
- fcr &= ~clr_mask;
- serial_out(p, UART_FCR, fcr);
+ serial_out(p, UART_FCR, UART_FCR_ENABLE_FIFO);
+ serial_out(p, UART_FCR, UART_FCR_ENABLE_FIFO |
+ UART_FCR_CLEAR_RCVR | UART_FCR_CLEAR_XMIT);
+ serial_out(p, UART_FCR, 0);
}
}

@@ -1467,7 +1448,7 @@ static void __do_stop_tx_rs485(struct uart_8250_port *p)
* Enable previously disabled RX interrupts.
*/
if (!(p->port.rs485.flags & SER_RS485_RX_DURING_TX)) {
- serial8250_clear_fifos(p);
+ serial8250_clear_and_reinit_fifos(p);

p->ier |= UART_IER_RLSI | UART_IER_RDI;
serial_port_out(&p->port, UART_IER, p->ier);
--
2.20.0


2018-12-16 20:39:13

by Marek Vasut

[permalink] [raw]
Subject: Re: [PATCH] Revert "serial: 8250: Fix clearing FIFOs in RS485 mode again"

On 12/16/2018 09:10 PM, Paul Burton wrote:
> Commit f6aa5beb45be ("serial: 8250: Fix clearing FIFOs in RS485 mode
> again") makes a change to FIFO clearing code which its commit message
> suggests was intended to be specific to use with RS485 mode, however:
>
> 1) The change made does not just affect __do_stop_tx_rs485(), it also
> affects other uses of serial8250_clear_fifos() including paths for
> starting up, shutting down or auto-configuring a port regardless of
> whether it's an RS485 port or not.
>
> 2) It makes the assumption that resetting the FIFOs is a no-op when
> FIFOs are disabled, and as such it checks for this case & explicitly
> avoids setting the FIFO reset bits when the FIFO enable bit is
> clear. A reading of the PC16550D manual would suggest that this is
> OK since the FIFO should automatically be reset if it is later
> enabled, but we support many 16550-compatible devices and have never
> required this auto-reset behaviour for at least the whole git era.
> Starting to rely on it now seems risky, offers no benefit, and
> indeed breaks at least the Ingenic JZ4780's UARTs which reads
> garbage when the RX FIFO is enabled if we don't explicitly reset it.
>
> 3) By only resetting the FIFOs if they're enabled, the behaviour of
> serial8250_do_startup() during boot now depends on what the value of
> FCR is before the 8250 driver is probed. This in itself seems
> questionable and leaves us with FCR=0 & no FIFO reset if the UART
> was used by 8250_early, otherwise it depends upon what the
> bootloader left behind.
>
> 4) Although the naming of serial8250_clear_fifos() may be unclear, it
> is clear that callers of it expect that it will disable FIFOs. Both
> serial8250_do_startup() & serial8250_do_shutdown() contain comments
> to that effect, and other callers explicitly re-enable the FIFOs
> after calling serial8250_clear_fifos(). The premise of that patch
> that disabling the FIFOs is incorrect therefore seems wrong.
>
> For these reasons, this reverts commit f6aa5beb45be ("serial: 8250: Fix
> clearing FIFOs in RS485 mode again").
>
> Signed-off-by: Paul Burton <[email protected]>
> Fixes: f6aa5beb45be ("serial: 8250: Fix clearing FIFOs in RS485 mode again").
> Cc: Greg Kroah-Hartman <[email protected]>
> Cc: Daniel Jedrychowski <[email protected]>
> Cc: Marek Vasut <[email protected]>
> Cc: [email protected]
> Cc: [email protected]
> Cc: stable <[email protected]> # 4.10+
I am unable to test it on such a short notice as I'm currently ill, so I
cannot tell if your change breaks the OMAP3/AM335x boards or not. Given
that there are very few CI20 boards in use, I'd like to ask you for some
extra time to investigate this on the OMAP3 too.

btw what strikes me as curious is that this patch emerged shortly after
Ezequiel re-posted the CI20 U-Boot patches after an year-long hiatus, is
it somehow related ?

--
Best regards,
Marek Vasut

2018-12-16 21:29:38

by Marek Vasut

[permalink] [raw]
Subject: Re: [PATCH] Revert "serial: 8250: Fix clearing FIFOs in RS485 mode again"

On 12/16/2018 09:10 PM, Paul Burton wrote:
> Commit f6aa5beb45be ("serial: 8250: Fix clearing FIFOs in RS485 mode
> again") makes a change to FIFO clearing code which its commit message
> suggests was intended to be specific to use with RS485 mode, however:
>
> 1) The change made does not just affect __do_stop_tx_rs485(), it also
> affects other uses of serial8250_clear_fifos() including paths for
> starting up, shutting down or auto-configuring a port regardless of
> whether it's an RS485 port or not.
>
> 2) It makes the assumption that resetting the FIFOs is a no-op when
> FIFOs are disabled, and as such it checks for this case & explicitly
> avoids setting the FIFO reset bits when the FIFO enable bit is
> clear. A reading of the PC16550D manual would suggest that this is
> OK since the FIFO should automatically be reset if it is later
> enabled, but we support many 16550-compatible devices and have never
> required this auto-reset behaviour for at least the whole git era.
> Starting to rely on it now seems risky, offers no benefit, and
> indeed breaks at least the Ingenic JZ4780's UARTs which reads
> garbage when the RX FIFO is enabled if we don't explicitly reset it.
>
> 3) By only resetting the FIFOs if they're enabled, the behaviour of
> serial8250_do_startup() during boot now depends on what the value of
> FCR is before the 8250 driver is probed. This in itself seems
> questionable and leaves us with FCR=0 & no FIFO reset if the UART
> was used by 8250_early, otherwise it depends upon what the
> bootloader left behind.
>
> 4) Although the naming of serial8250_clear_fifos() may be unclear, it
> is clear that callers of it expect that it will disable FIFOs. Both
> serial8250_do_startup() & serial8250_do_shutdown() contain comments
> to that effect, and other callers explicitly re-enable the FIFOs
> after calling serial8250_clear_fifos(). The premise of that patch
> that disabling the FIFOs is incorrect therefore seems wrong.
>
> For these reasons, this reverts commit f6aa5beb45be ("serial: 8250: Fix
> clearing FIFOs in RS485 mode again").
>
> Signed-off-by: Paul Burton <[email protected]>
> Fixes: f6aa5beb45be ("serial: 8250: Fix clearing FIFOs in RS485 mode again").
> Cc: Greg Kroah-Hartman <[email protected]>
> Cc: Daniel Jedrychowski <[email protected]>
> Cc: Marek Vasut <[email protected]>
> Cc: [email protected]
> Cc: [email protected]
> Cc: stable <[email protected]> # 4.10+
> ---
> I did suggest an alternative approach which would rename
> serial8250_clear_fifos() and split it into 2 variants - one that
> disables FIFOs & one that does not, then use the latter in
> __do_stop_tx_rs485():
>
> https://lore.kernel.org/lkml/20181213014805.77u5dzydo23cm6fq@pburton-laptop/
>
> However I have no access to the OMAP3 hardware that Marek's patch was
> attempting to fix & have heard nothing back with regards to him testing
> that approach, so here's a simple revert that fixes the Ingenic JZ4780.
>
> I've marked for stable back to v4.10 presuming that this is how far the
> broken patch may be backported, given that this is where commit
> 2bed8a8e7072 ("Clearing FIFOs in RS485 emulation mode causes subsequent
> transmits to break") that it tried to fix was introduced.

OK, I tested this on AM335x / OMAP3 and the system is again broken, so
that's a NAK.

> ---
> drivers/tty/serial/8250/8250_port.c | 29 +++++------------------------
> 1 file changed, 5 insertions(+), 24 deletions(-)
>
> diff --git a/drivers/tty/serial/8250/8250_port.c b/drivers/tty/serial/8250/8250_port.c
> index f776b3eafb96..3f779d25ec0c 100644
> --- a/drivers/tty/serial/8250/8250_port.c
> +++ b/drivers/tty/serial/8250/8250_port.c
> @@ -552,30 +552,11 @@ static unsigned int serial_icr_read(struct uart_8250_port *up, int offset)
> */
> static void serial8250_clear_fifos(struct uart_8250_port *p)
> {
> - unsigned char fcr;
> - unsigned char clr_mask = UART_FCR_CLEAR_RCVR | UART_FCR_CLEAR_XMIT;
> -
> if (p->capabilities & UART_CAP_FIFO) {
> - /*
> - * Make sure to avoid changing FCR[7:3] and ENABLE_FIFO bits.
> - * In case ENABLE_FIFO is not set, there is nothing to flush
> - * so just return. Furthermore, on certain implementations of
> - * the 8250 core, the FCR[7:3] bits may only be changed under
> - * specific conditions and changing them if those conditions
> - * are not met can have nasty side effects. One such core is
> - * the 8250-omap present in TI AM335x.
> - */
> - fcr = serial_in(p, UART_FCR);
> -
> - /* FIFO is not enabled, there's nothing to clear. */
> - if (!(fcr & UART_FCR_ENABLE_FIFO))
> - return;
> -
> - fcr |= clr_mask;
> - serial_out(p, UART_FCR, fcr);
> -
> - fcr &= ~clr_mask;
> - serial_out(p, UART_FCR, fcr);
> + serial_out(p, UART_FCR, UART_FCR_ENABLE_FIFO);
> + serial_out(p, UART_FCR, UART_FCR_ENABLE_FIFO |
> + UART_FCR_CLEAR_RCVR | UART_FCR_CLEAR_XMIT);
> + serial_out(p, UART_FCR, 0);
> }
> }
>
> @@ -1467,7 +1448,7 @@ static void __do_stop_tx_rs485(struct uart_8250_port *p)
> * Enable previously disabled RX interrupts.
> */
> if (!(p->port.rs485.flags & SER_RS485_RX_DURING_TX)) {
> - serial8250_clear_fifos(p);
> + serial8250_clear_and_reinit_fifos(p);
>
> p->ier |= UART_IER_RLSI | UART_IER_RDI;
> serial_port_out(&p->port, UART_IER, p->ier);
>


--
Best regards,
Marek Vasut

2018-12-16 21:42:08

by Paul Burton

[permalink] [raw]
Subject: Re: [PATCH] Revert "serial: 8250: Fix clearing FIFOs in RS485 mode again"

Hi Marek,

On Sun, Dec 16, 2018 at 10:08:48PM +0100, Marek Vasut wrote:
> > I did suggest an alternative approach which would rename
> > serial8250_clear_fifos() and split it into 2 variants - one that
> > disables FIFOs & one that does not, then use the latter in
> > __do_stop_tx_rs485():
> >
> > https://lore.kernel.org/lkml/20181213014805.77u5dzydo23cm6fq@pburton-laptop/
> >
> > However I have no access to the OMAP3 hardware that Marek's patch was
> > attempting to fix & have heard nothing back with regards to him testing
> > that approach, so here's a simple revert that fixes the Ingenic JZ4780.
> >
> > I've marked for stable back to v4.10 presuming that this is how far the
> > broken patch may be backported, given that this is where commit
> > 2bed8a8e7072 ("Clearing FIFOs in RS485 emulation mode causes subsequent
> > transmits to break") that it tried to fix was introduced.
>
> OK, I tested this on AM335x / OMAP3 and the system is again broken, so
> that's a NAK.

To be clear - what did you test? This revert or the patch linked to
above?

This revert would of course reintroduce your RS485 issue because it just
undoes your change.

Either way, commit f6aa5beb45be ("serial: 8250: Fix clearing FIFOs in
RS485 mode again") breaks systems that worked before it so at this late
stage in the 4.20 cycle a revert would make sense to me. If that breaks
RS85 on OMAP3 then my question would be how much can anyone really care
if nobody noticed since v4.10? And why should that lead to you breaking
the JZ4780 which has been discovered before a stable kernel release
includes the breakage?

Thanks,
Paul

2018-12-16 21:47:46

by Marek Vasut

[permalink] [raw]
Subject: Re: [PATCH] Revert "serial: 8250: Fix clearing FIFOs in RS485 mode again"

On 12/16/2018 10:31 PM, Paul Burton wrote:
> Hi Marek,

Hi,

> On Sun, Dec 16, 2018 at 09:32:19PM +0100, Marek Vasut wrote:
>> I am unable to test it on such a short notice as I'm currently ill, so I
>> cannot tell if your change breaks the OMAP3/AM335x boards or not. Given
>> that there are very few CI20 boards in use, I'd like to ask you for some
>> extra time to investigate this on the OMAP3 too.
>
> I'm sorry to hear that you're ill, but your patch is getting awfully
> close to becoming part of a stable kernel release & it causes
> regressions. Even if it didn't break a board I use, I think the patch
> would be broken & risky for the reasons I outlined in my revert's commit
> message.

That's what the incremental releases are for, so that minor problems can
get fixed there. Sure, it's great to have things perfect in the first
release, but if that breaks other systems, too bad.

> Ultimately it's Greg's decision but it sounds like you're asking me to
> say it's OK to break the JZ4780 in a stable kernel with a patch that I
> think would be risky anyway, and I won't do that.

I am saying this revert breaks AM335x, so this is a stalemate. I had a
discussion with Ezequiel (on CC) and he seems to have a different
smaller patch coming for this problem.

[...]

--
Best regards,
Marek Vasut

2018-12-16 21:54:19

by Ezequiel Garcia

[permalink] [raw]
Subject: Re: [PATCH] Revert "serial: 8250: Fix clearing FIFOs in RS485 mode again"

On Sun, 16 Dec 2018 at 18:45, Marek Vasut <[email protected]> wrote:
[skips discussion]
>
> > Ultimately it's Greg's decision but it sounds like you're asking me to
> > say it's OK to break the JZ4780 in a stable kernel with a patch that I
> > think would be risky anyway, and I won't do that.
>
> I am saying this revert breaks AM335x, so this is a stalemate. I had a
> discussion with Ezequiel (on CC) and he seems to have a different
> smaller patch coming for this problem.
>

Can you guys test this? Note that serial8250_do_startup has a comment
stating clearly that it has the intention of disabling the FIFOs,
so it seems this is the right thing to do.

Paul, this removes the garbage on my CI20 (rev.1)

diff --git a/drivers/tty/serial/8250/8250_port.c
b/drivers/tty/serial/8250/8250_port.c
index c39482b96111..fac19cbc51d1 100644
--- a/drivers/tty/serial/8250/8250_port.c
+++ b/drivers/tty/serial/8250/8250_port.c
@@ -2209,10 +2209,11 @@ int serial8250_do_startup(struct uart_port *port)
/*
* Clear the FIFO buffers and disable them.
* (they will be reenabled in set_termios())
*/
serial8250_clear_fifos(up);
+ serial_out(up, UART_FCR, 0);

/*
* Clear the interrupt registers.
*/
serial_port_in(port, UART_LSR);


--
Ezequiel García, VanguardiaSur
http://www.vanguardiasur.com.ar

2018-12-16 22:02:47

by Marek Vasut

[permalink] [raw]
Subject: Re: [PATCH] Revert "serial: 8250: Fix clearing FIFOs in RS485 mode again"

On 12/16/2018 10:39 PM, Paul Burton wrote:
> Hi Marek,

Hi,

> On Sun, Dec 16, 2018 at 10:08:48PM +0100, Marek Vasut wrote:
>>> I did suggest an alternative approach which would rename
>>> serial8250_clear_fifos() and split it into 2 variants - one that
>>> disables FIFOs & one that does not, then use the latter in
>>> __do_stop_tx_rs485():
>>>
>>> https://lore.kernel.org/lkml/20181213014805.77u5dzydo23cm6fq@pburton-laptop/
>>>
>>> However I have no access to the OMAP3 hardware that Marek's patch was
>>> attempting to fix & have heard nothing back with regards to him testing
>>> that approach, so here's a simple revert that fixes the Ingenic JZ4780.
>>>
>>> I've marked for stable back to v4.10 presuming that this is how far the
>>> broken patch may be backported, given that this is where commit
>>> 2bed8a8e7072 ("Clearing FIFOs in RS485 emulation mode causes subsequent
>>> transmits to break") that it tried to fix was introduced.
>>
>> OK, I tested this on AM335x / OMAP3 and the system is again broken, so
>> that's a NAK.
>
> To be clear - what did you test? This revert or the patch linked to
> above?
>
> This revert would of course reintroduce your RS485 issue because it just
> undoes your change.

The revert. Which of the two patches do you need me to test.

> Either way, commit f6aa5beb45be ("serial: 8250: Fix clearing FIFOs in
> RS485 mode again") breaks systems that worked before it so at this late
> stage in the 4.20 cycle a revert would make sense to me. If that breaks
> RS85 on OMAP3 then my question would be how much can anyone really care
> if nobody noticed since v4.10? And why should that lead to you breaking
> the JZ4780 which has been discovered before a stable kernel release
> includes the breakage?

There's always a .y release where this can be properly investigated and
solved, instead of breaking one platform or the other.

Then again, see the patch from Ezequiel that was just posted, I think it
might be a far better solution.

--
Best regards,
Marek Vasut

2018-12-16 22:09:34

by Paul Burton

[permalink] [raw]
Subject: Re: [PATCH] Revert "serial: 8250: Fix clearing FIFOs in RS485 mode again"

Hi Marek,

On Sun, Dec 16, 2018 at 09:32:19PM +0100, Marek Vasut wrote:
> I am unable to test it on such a short notice as I'm currently ill, so I
> cannot tell if your change breaks the OMAP3/AM335x boards or not. Given
> that there are very few CI20 boards in use, I'd like to ask you for some
> extra time to investigate this on the OMAP3 too.

I'm sorry to hear that you're ill, but your patch is getting awfully
close to becoming part of a stable kernel release & it causes
regressions. Even if it didn't break a board I use, I think the patch
would be broken & risky for the reasons I outlined in my revert's commit
message.

Ultimately it's Greg's decision but it sounds like you're asking me to
say it's OK to break the JZ4780 in a stable kernel with a patch that I
think would be risky anyway, and I won't do that.

> btw what strikes me as curious is that this patch emerged shortly after
> Ezequiel re-posted the CI20 U-Boot patches after an year-long hiatus, is
> it somehow related ?

Not at all - I regularly test on Ci20 & found this breakage whilst
testing Paul Cercueil's Ingenic TCU patchset v8 [1]. Using a Ci20 with
mainline kernels doesn't rely on Ezequiel's U-Boot work, and indeed I
generally boot using my ci20-usb-boot tool [2] so U-Boot isn't involved
at all.

Now my Ci20 testing isn't automated so it tends to happen mostly when
there are obvious changes to the board or SoC support, but it should
become more automated soon. Kevin Hilman just got a Ci40 working with
kernelci.org infrastructure & I hope we can get Ci20 & other boards
included soon too, either via existing kernelci.org labs or by setting
up a new one.

Thanks,
Paul

[1] https://lore.kernel.org/linux-mips/[email protected]/T/#t
[2] https://github.com/paulburton/ci20-tools

2018-12-16 22:18:42

by Paul Burton

[permalink] [raw]
Subject: Re: [PATCH] Revert "serial: 8250: Fix clearing FIFOs in RS485 mode again"

On Sun, Dec 16, 2018 at 10:45:23PM +0100, Marek Vasut wrote:
> >> I am unable to test it on such a short notice as I'm currently ill, so I
> >> cannot tell if your change breaks the OMAP3/AM335x boards or not. Given
> >> that there are very few CI20 boards in use, I'd like to ask you for some
> >> extra time to investigate this on the OMAP3 too.
> >
> > I'm sorry to hear that you're ill, but your patch is getting awfully
> > close to becoming part of a stable kernel release & it causes
> > regressions. Even if it didn't break a board I use, I think the patch
> > would be broken & risky for the reasons I outlined in my revert's commit
> > message.
>
> That's what the incremental releases are for, so that minor problems can
> get fixed there. Sure, it's great to have things perfect in the first
> release, but if that breaks other systems, too bad.

I don't think the purpose of stable kernels is to intentionally break
systems in a stable release & backport fixes later...

> > Ultimately it's Greg's decision but it sounds like you're asking me to
> > say it's OK to break the JZ4780 in a stable kernel with a patch that I
> > think would be risky anyway, and I won't do that.
>
> I am saying this revert breaks AM335x, so this is a stalemate. I had a
> discussion with Ezequiel (on CC) and he seems to have a different
> smaller patch coming for this problem.

Well, no. Your patch says commit 2bed8a8e7072 ("Clearing FIFOs in RS485
emulation mode causes subsequent transmits to break") broke RS485 on
AM33x in v4.10, and it took 10 release cycles for someone to notice &
fix it. You're asking to break a system that is used & working in order
to fix a problem that seemingly wasn't even noticed for nearly 2 years.

Your fix breaks my system, but I outlined 4 reasons that I believe your
patch to be wrong anyway - breaking my system is just one part of that.

I'll rely to Ezequiel's patch now, but it again addresses just one part
of one of the 4 points I listed in the reasoning for my revert.

Thanks,
Paul

2018-12-16 22:25:12

by Marek Vasut

[permalink] [raw]
Subject: Re: [PATCH] Revert "serial: 8250: Fix clearing FIFOs in RS485 mode again"

On 12/16/2018 10:52 PM, Ezequiel Garcia wrote:
> On Sun, 16 Dec 2018 at 18:45, Marek Vasut <[email protected]> wrote:
> [skips discussion]
>>
>>> Ultimately it's Greg's decision but it sounds like you're asking me to
>>> say it's OK to break the JZ4780 in a stable kernel with a patch that I
>>> think would be risky anyway, and I won't do that.
>>
>> I am saying this revert breaks AM335x, so this is a stalemate. I had a
>> discussion with Ezequiel (on CC) and he seems to have a different
>> smaller patch coming for this problem.
>>
>
> Can you guys test this? Note that serial8250_do_startup has a comment
> stating clearly that it has the intention of disabling the FIFOs,
> so it seems this is the right thing to do.
>
> Paul, this removes the garbage on my CI20 (rev.1)
>
> diff --git a/drivers/tty/serial/8250/8250_port.c
> b/drivers/tty/serial/8250/8250_port.c
> index c39482b96111..fac19cbc51d1 100644
> --- a/drivers/tty/serial/8250/8250_port.c
> +++ b/drivers/tty/serial/8250/8250_port.c
> @@ -2209,10 +2209,11 @@ int serial8250_do_startup(struct uart_port *port)
> /*
> * Clear the FIFO buffers and disable them.
> * (they will be reenabled in set_termios())
> */
> serial8250_clear_fifos(up);
> + serial_out(up, UART_FCR, 0);
>
> /*
> * Clear the interrupt registers.
> */
> serial_port_in(port, UART_LSR);
>
>

On AM335x pocketbeagle
Tested-by: Marek Vasut <[email protected]>

--
Best regards,
Marek Vasut

2018-12-16 22:29:14

by Paul Burton

[permalink] [raw]
Subject: Re: [PATCH] Revert "serial: 8250: Fix clearing FIFOs in RS485 mode again"

Hi Ezequiel,

On Sun, Dec 16, 2018 at 06:52:53PM -0300, Ezequiel Garcia wrote:
> diff --git a/drivers/tty/serial/8250/8250_port.c
> b/drivers/tty/serial/8250/8250_port.c
> index c39482b96111..fac19cbc51d1 100644
> --- a/drivers/tty/serial/8250/8250_port.c
> +++ b/drivers/tty/serial/8250/8250_port.c
> @@ -2209,10 +2209,11 @@ int serial8250_do_startup(struct uart_port *port)
> /*
> * Clear the FIFO buffers and disable them.
> * (they will be reenabled in set_termios())
> */
> serial8250_clear_fifos(up);
> + serial_out(up, UART_FCR, 0);
>
> /*
> * Clear the interrupt registers.
> */
> serial_port_in(port, UART_LSR);
>

This helps, but it only addresses one part of one of the 4 reasons I
listed as motivation for my revert. For example serial8250_do_shutdown()
also clearly intends to disable the FIFOs.

Thanks,
Paul

2018-12-16 22:30:22

by Ezequiel Garcia

[permalink] [raw]
Subject: Re: [PATCH] Revert "serial: 8250: Fix clearing FIFOs in RS485 mode again"

On Sun, 16 Dec 2018 at 19:24, Paul Burton <[email protected]> wrote:
>
> Hi Ezequiel,
>
> On Sun, Dec 16, 2018 at 06:52:53PM -0300, Ezequiel Garcia wrote:
> > diff --git a/drivers/tty/serial/8250/8250_port.c
> > b/drivers/tty/serial/8250/8250_port.c
> > index c39482b96111..fac19cbc51d1 100644
> > --- a/drivers/tty/serial/8250/8250_port.c
> > +++ b/drivers/tty/serial/8250/8250_port.c
> > @@ -2209,10 +2209,11 @@ int serial8250_do_startup(struct uart_port *port)
> > /*
> > * Clear the FIFO buffers and disable them.
> > * (they will be reenabled in set_termios())
> > */
> > serial8250_clear_fifos(up);
> > + serial_out(up, UART_FCR, 0);
> >
> > /*
> > * Clear the interrupt registers.
> > */
> > serial_port_in(port, UART_LSR);
> >
>
> This helps, but it only addresses one part of one of the 4 reasons I
> listed as motivation for my revert. For example serial8250_do_shutdown()
> also clearly intends to disable the FIFOs.
>

OK. So, let's fix that :-)

By all means, it would be really nice to push forward and fix the garbage
issue on JZ4780, as well as the transmission issue on AM335x.

AM335x is a wildly popular platform, and it's not funny to break it.
So, let's please stop discussing which board we'll break and just fix both.
--
Ezequiel García, VanguardiaSur
http://www.vanguardiasur.com.ar

2018-12-16 22:36:50

by Paul Burton

[permalink] [raw]
Subject: Re: [PATCH] Revert "serial: 8250: Fix clearing FIFOs in RS485 mode again"

Hi Ezequiel,

On Sun, Dec 16, 2018 at 07:28:22PM -0300, Ezequiel Garcia wrote:
> On Sun, 16 Dec 2018 at 19:24, Paul Burton <[email protected]> wrote:
> > This helps, but it only addresses one part of one of the 4 reasons I
> > listed as motivation for my revert. For example serial8250_do_shutdown()
> > also clearly intends to disable the FIFOs.
> >
>
> OK. So, let's fix that :-)

I already did, or at least tried to, on Thursday [1].

> By all means, it would be really nice to push forward and fix the garbage
> issue on JZ4780, as well as the transmission issue on AM335x.
>
> AM335x is a wildly popular platform, and it's not funny to break it.

Well, clearly not if it was broken in v4.10 & only just fixed..? And
from Marek's commit message the patch in v4.10 doesn't break the whole
system just RS485.

> So, let's please stop discussing which board we'll break and just fix both.

I completely agree that would be ideal and I wrote a patch hoping to do
that on Thursday, but didn't get any response on testing. It's late in
the cycle hence a revert made sense. Simple as that.

Thanks,
Paul

[1] https://lore.kernel.org/lkml/20181213014805.77u5dzydo23cm6fq@pburton-laptop/

2018-12-16 22:43:11

by Paul Burton

[permalink] [raw]
Subject: Re: [PATCH] Revert "serial: 8250: Fix clearing FIFOs in RS485 mode again"

Hi Marek,

On Sun, Dec 16, 2018 at 11:01:18PM +0100, Marek Vasut wrote:
> >>> I did suggest an alternative approach which would rename
> >>> serial8250_clear_fifos() and split it into 2 variants - one that
> >>> disables FIFOs & one that does not, then use the latter in
> >>> __do_stop_tx_rs485():
> >>>
> >>> https://lore.kernel.org/lkml/20181213014805.77u5dzydo23cm6fq@pburton-laptop/
> >>>
> >>> However I have no access to the OMAP3 hardware that Marek's patch was
> >>> attempting to fix & have heard nothing back with regards to him testing
> >>> that approach, so here's a simple revert that fixes the Ingenic JZ4780.
> >>>
> >>> I've marked for stable back to v4.10 presuming that this is how far the
> >>> broken patch may be backported, given that this is where commit
> >>> 2bed8a8e7072 ("Clearing FIFOs in RS485 emulation mode causes subsequent
> >>> transmits to break") that it tried to fix was introduced.
> >>
> >> OK, I tested this on AM335x / OMAP3 and the system is again broken, so
> >> that's a NAK.
> >
> > To be clear - what did you test? This revert or the patch linked to
> > above?
> >
> > This revert would of course reintroduce your RS485 issue because it just
> > undoes your change.
>
> The revert. Which of the two patches do you need me to test.

The one in the email I sent on Thursday 13th at 01:48:06 UTC, linked to
at lore.kernel.org in the quote right above:

https://lore.kernel.org/lkml/20181213014805.77u5dzydo23cm6fq@pburton-laptop/

You replied with comments on the patch, you just never tested it or
never told me if you did. The lack of response means I don't know
whether that potential patch even still works for your system, hence the
revert.

Thanks,
Paul

2018-12-17 15:20:18

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH] Revert "serial: 8250: Fix clearing FIFOs in RS485 mode again"

On Sun, Dec 16, 2018 at 10:35:12PM +0000, Paul Burton wrote:
> Hi Ezequiel,
>
> On Sun, Dec 16, 2018 at 07:28:22PM -0300, Ezequiel Garcia wrote:
> > On Sun, 16 Dec 2018 at 19:24, Paul Burton <[email protected]> wrote:
> > > This helps, but it only addresses one part of one of the 4 reasons I
> > > listed as motivation for my revert. For example serial8250_do_shutdown()
> > > also clearly intends to disable the FIFOs.
> > >
> >
> > OK. So, let's fix that :-)
>
> I already did, or at least tried to, on Thursday [1].
>
> > By all means, it would be really nice to push forward and fix the garbage
> > issue on JZ4780, as well as the transmission issue on AM335x.
> >
> > AM335x is a wildly popular platform, and it's not funny to break it.
>
> Well, clearly not if it was broken in v4.10 & only just fixed..? And
> from Marek's commit message the patch in v4.10 doesn't break the whole
> system just RS485.
>
> > So, let's please stop discussing which board we'll break and just fix both.
>
> I completely agree that would be ideal and I wrote a patch hoping to do
> that on Thursday, but didn't get any response on testing. It's late in
> the cycle hence a revert made sense. Simple as that.

A revert makes sense now, I'll go queue this up, thanks.

greg k-h

2018-12-17 16:32:21

by Ezequiel Garcia

[permalink] [raw]
Subject: Re: [PATCH] Revert "serial: 8250: Fix clearing FIFOs in RS485 mode again"

On Sun, 16 Dec 2018 at 19:35, Paul Burton <[email protected]> wrote:
>
> Hi Ezequiel,
>
> On Sun, Dec 16, 2018 at 07:28:22PM -0300, Ezequiel Garcia wrote:
> > On Sun, 16 Dec 2018 at 19:24, Paul Burton <[email protected]> wrote:
> > > This helps, but it only addresses one part of one of the 4 reasons I
> > > listed as motivation for my revert. For example serial8250_do_shutdown()
> > > also clearly intends to disable the FIFOs.
> > >
> >
> > OK. So, let's fix that :-)
>
> I already did, or at least tried to, on Thursday [1].
>
> > By all means, it would be really nice to push forward and fix the garbage
> > issue on JZ4780, as well as the transmission issue on AM335x.
> >
> > AM335x is a wildly popular platform, and it's not funny to break it.
>
> Well, clearly not if it was broken in v4.10 & only just fixed..? And
> from Marek's commit message the patch in v4.10 doesn't break the whole
> system just RS485.
>

Careful here. It's naive to consider v4.10 as old in this context.

AM335x is used in hundreds of thousands of products, probably
"industrial" products.
Manufacturers don't always follow the kernel, and it's entirely
likely that they notice a regression only when developing a new product,
or when rebasing on top of a newer longterm kernel.

Then again, I don't have any details about what is Marek's original fix
actually fixing.

Marek: could you please post the test case that you used to validate your fix?
I can't find that anywhere.

> > So, let's please stop discussing which board we'll break and just fix both.
>
> I completely agree that would be ideal and I wrote a patch hoping to do
> that on Thursday, but didn't get any response on testing. It's late in
> the cycle hence a revert made sense. Simple as that.
>
> Thanks,
> Paul
>
> [1] https://lore.kernel.org/lkml/20181213014805.77u5dzydo23cm6fq@pburton-laptop/

Well, I think this patch a lot of sense. But since Greg wants to
revert Marek's fix,
it would make sense to collate both Marek's and Paul's in a single commit.
--
Ezequiel García, VanguardiaSur
http://www.vanguardiasur.com.ar

2018-12-17 19:08:18

by Marek Vasut

[permalink] [raw]
Subject: Re: [PATCH] Revert "serial: 8250: Fix clearing FIFOs in RS485 mode again"

On 12/17/2018 04:18 PM, Greg Kroah-Hartman wrote:
> On Sun, Dec 16, 2018 at 10:35:12PM +0000, Paul Burton wrote:
>> Hi Ezequiel,
>>
>> On Sun, Dec 16, 2018 at 07:28:22PM -0300, Ezequiel Garcia wrote:
>>> On Sun, 16 Dec 2018 at 19:24, Paul Burton <[email protected]> wrote:
>>>> This helps, but it only addresses one part of one of the 4 reasons I
>>>> listed as motivation for my revert. For example serial8250_do_shutdown()
>>>> also clearly intends to disable the FIFOs.
>>>>
>>>
>>> OK. So, let's fix that :-)
>>
>> I already did, or at least tried to, on Thursday [1].
>>
>>> By all means, it would be really nice to push forward and fix the garbage
>>> issue on JZ4780, as well as the transmission issue on AM335x.
>>>
>>> AM335x is a wildly popular platform, and it's not funny to break it.
>>
>> Well, clearly not if it was broken in v4.10 & only just fixed..? And
>> from Marek's commit message the patch in v4.10 doesn't break the whole
>> system just RS485.
>>
>>> So, let's please stop discussing which board we'll break and just fix both.
>>
>> I completely agree that would be ideal and I wrote a patch hoping to do
>> that on Thursday, but didn't get any response on testing. It's late in
>> the cycle hence a revert made sense. Simple as that.
>
> A revert makes sense now, I'll go queue this up, thanks.

I don't like this for multiple reasons.
1) There is a better patch posted which doesn't break the AM335x and
clearly identifies and fixes the problem on the JZ4780 / CI20
2) The JZ4780 8250 core is not a standard 8250 core, since it has extra
bits in the FCR register (like the UME bit, which disables the whole
UART block), so the revert IMO would break that core too, it just
hides the breakage. I'm still trying to understand the implications
of that in detail, but the discussion wasn't quite constructive.

I'd much rather see Ezequiel's patch applied, since that's far less
destructive approach to fixing the problem than the revert.

--
Best regards,
Marek Vasut

2018-12-17 19:36:11

by Marek Vasut

[permalink] [raw]
Subject: Re: [PATCH] Revert "serial: 8250: Fix clearing FIFOs in RS485 mode again"

On 12/17/2018 05:30 PM, Ezequiel Garcia wrote:
> On Sun, 16 Dec 2018 at 19:35, Paul Burton <[email protected]> wrote:
>>
>> Hi Ezequiel,
>>
>> On Sun, Dec 16, 2018 at 07:28:22PM -0300, Ezequiel Garcia wrote:
>>> On Sun, 16 Dec 2018 at 19:24, Paul Burton <[email protected]> wrote:
>>>> This helps, but it only addresses one part of one of the 4 reasons I
>>>> listed as motivation for my revert. For example serial8250_do_shutdown()
>>>> also clearly intends to disable the FIFOs.
>>>>
>>>
>>> OK. So, let's fix that :-)
>>
>> I already did, or at least tried to, on Thursday [1].
>>
>>> By all means, it would be really nice to push forward and fix the garbage
>>> issue on JZ4780, as well as the transmission issue on AM335x.
>>>
>>> AM335x is a wildly popular platform, and it's not funny to break it.
>>
>> Well, clearly not if it was broken in v4.10 & only just fixed..? And
>> from Marek's commit message the patch in v4.10 doesn't break the whole
>> system just RS485.
>>
>
> Careful here. It's naive to consider v4.10 as old in this context.
>
> AM335x is used in hundreds of thousands of products, probably
> "industrial" products.
> Manufacturers don't always follow the kernel, and it's entirely
> likely that they notice a regression only when developing a new product,
> or when rebasing on top of a newer longterm kernel.
>
> Then again, I don't have any details about what is Marek's original fix
> actually fixing.
>
> Marek: could you please post the test case that you used to validate your fix?
> I can't find that anywhere.

I can't share the testcase itself because it has no license and I didn't
write it, but I can tell you what it's doing. (I'll check if I can share
the testcase verbatim too, I just sent an email to the author)

The test spawns two threads, one sending , one receiving. The sending
thread sends 8 bytes of data from /dev/ttyS4 , the receiving thread
receives the data on /dev/ttyS2 and compares the pattern. The port
settings is B1000000 8N1 with rs485conf.flags set to SER_RS485_ENABLED
and SER_RS485_RTS_AFTER_SEND. Sometimes the received data do not match
the sent data, but rather look as if one character was left over from
the previous transmission in the FIFO.

Those two UARTs are connected together by two wires, without any real
RS485 hardware to minimize the hardware complexity (it's easy to
implement that on the pocketbeagle, which is the cheap option here).

--
Best regards,
Marek Vasut

2018-12-19 02:34:58

by Marek Vasut

[permalink] [raw]
Subject: Re: [PATCH] Revert "serial: 8250: Fix clearing FIFOs in RS485 mode again"

On 12/17/2018 06:20 PM, Marek Vasut wrote:
> On 12/17/2018 05:30 PM, Ezequiel Garcia wrote:
>> On Sun, 16 Dec 2018 at 19:35, Paul Burton <[email protected]> wrote:
>>>
>>> Hi Ezequiel,
>>>
>>> On Sun, Dec 16, 2018 at 07:28:22PM -0300, Ezequiel Garcia wrote:
>>>> On Sun, 16 Dec 2018 at 19:24, Paul Burton <[email protected]> wrote:
>>>>> This helps, but it only addresses one part of one of the 4 reasons I
>>>>> listed as motivation for my revert. For example serial8250_do_shutdown()
>>>>> also clearly intends to disable the FIFOs.
>>>>>
>>>>
>>>> OK. So, let's fix that :-)
>>>
>>> I already did, or at least tried to, on Thursday [1].
>>>
>>>> By all means, it would be really nice to push forward and fix the garbage
>>>> issue on JZ4780, as well as the transmission issue on AM335x.
>>>>
>>>> AM335x is a wildly popular platform, and it's not funny to break it.
>>>
>>> Well, clearly not if it was broken in v4.10 & only just fixed..? And
>>> from Marek's commit message the patch in v4.10 doesn't break the whole
>>> system just RS485.
>>>
>>
>> Careful here. It's naive to consider v4.10 as old in this context.
>>
>> AM335x is used in hundreds of thousands of products, probably
>> "industrial" products.
>> Manufacturers don't always follow the kernel, and it's entirely
>> likely that they notice a regression only when developing a new product,
>> or when rebasing on top of a newer longterm kernel.
>>
>> Then again, I don't have any details about what is Marek's original fix
>> actually fixing.
>>
>> Marek: could you please post the test case that you used to validate your fix?
>> I can't find that anywhere.
>
> I can't share the testcase itself because it has no license and I didn't
> write it, but I can tell you what it's doing. (I'll check if I can share
> the testcase verbatim too, I just sent an email to the author)
>
> The test spawns two threads, one sending , one receiving. The sending
> thread sends 8 bytes of data from /dev/ttyS4 , the receiving thread
> receives the data on /dev/ttyS2 and compares the pattern. The port
> settings is B1000000 8N1 with rs485conf.flags set to SER_RS485_ENABLED
> and SER_RS485_RTS_AFTER_SEND. Sometimes the received data do not match
> the sent data, but rather look as if one character was left over from
> the previous transmission in the FIFO.
>
> Those two UARTs are connected together by two wires, without any real
> RS485 hardware to minimize the hardware complexity (it's easy to
> implement that on the pocketbeagle, which is the cheap option here).

Test code is below . On the pocketbeagle, connect SPI0:CLK with U4:TX
and SPI0:MISO with U4:RX , apply the DT patch below and run the example.
It'll fail after a few iterations.

#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <termios.h>
#include <unistd.h>

#include <iostream>
#include <iomanip>
#include <atomic>
#include <thread>

#include <linux/serial.h>
#include <sys/ioctl.h>

std::atomic<bool> running{true};

int set_interface_attribs (int fd, int speed, int parity)
{
struct termios tty;
memset (&tty, 0, sizeof tty);
if (tcgetattr (fd, &tty) != 0)
{
std::cerr << "Error from tcgetattr" << std::endl;
return -1;
}

cfsetospeed (&tty, speed);
cfsetispeed (&tty, speed);

tty.c_cflag = (tty.c_cflag & ~CSIZE) | CS8;
tty.c_iflag &= ~IGNBRK;
tty.c_lflag = 0;

tty.c_oflag = 0;
tty.c_cc[VMIN] = 8;
tty.c_cc[VTIME] = 0;

tty.c_iflag &= ~(IXON | IXOFF | IXANY);

tty.c_cflag |= (CLOCAL | CREAD);
tty.c_cflag &= ~(PARENB | PARODD);
tty.c_cflag |= parity;
tty.c_cflag &= ~CSTOPB;
tty.c_cflag &= ~CRTSCTS;

if (tcsetattr (fd, TCSANOW, &tty) != 0)
{
std::cerr << "Error from tcsetattr" << std::endl;
return -1;
}
return 0;
}

void enable_rts(int fd, int rts)
{
struct serial_rs485 rs485conf;
if (rts) {
rs485conf.flags = SER_RS485_ENABLED ;
rs485conf.flags |= SER_RS485_RTS_AFTER_SEND;
rs485conf.flags &= ~(SER_RS485_RTS_ON_SEND);
rs485conf.delay_rts_before_send = 0;
rs485conf.delay_rts_after_send = 0;
} else {
rs485conf.flags = 0 ;
}

if (ioctl( fd, TIOCSRS485, &rs485conf) < 0){
std::cerr << "Cannot set device in RS485 mode" << std::endl;
}
}

void output_function(int fd)
{
while(running) {
write (fd, "asdfghjk", 8);
usleep (20000);
}
}

void input_function(int fd)
{
char buf [100];
size_t count=0;
std::cout << std::unitbuf;
while(running){
int n = read (fd, buf, sizeof buf);
if( (n >0) && (buf[0] != 'a') ){
std::cout << "\nunexpected receive! " << std::string(buf,8) <<
std::endl;
running = false;
} else {
++count;
if(count%100 == 0){
std::cout << "\r" << std::setw(8) << count << " possibly ok";
}
}
}
}

int main(int argc, char** argv)
{
std::string in_port = "/dev/ttyS2";
std::string out_port = "/dev/ttyS4";

int c, rts = 1;

while ((c = getopt (argc, argv, "i:o:r")) != -1)
{
switch (c)
{
case 'i':
in_port = std::string(optarg);
break;
case 'o':
out_port = std::string(optarg);
break;
case 'r':
rts = 0;
break;
case '?':
return 1;
default:
break;
}
}

std::cout << "opening output " << out_port << std::endl;
int outfd = open ( out_port.c_str(), O_RDWR | O_NOCTTY | O_SYNC);
if (outfd < 0)
{
std::cerr << "Could not open " << out_port << std::endl;
return 1;
}

std::cout << "opening input " << in_port << std::endl;
int infd = open ( in_port.c_str(), O_RDWR | O_NOCTTY | O_SYNC);
if (infd < 0)
{
std::cerr << "Could not open " << in_port << std::endl;
return 1;
}

set_interface_attribs (infd, B1000000, 0);
set_interface_attribs (outfd, B1000000, 0);

enable_rts(outfd, rts);
enable_rts(infd, rts);

std::thread in(input_function, infd);
std::thread out(output_function, outfd);

in.join();
out.join();

close(infd);
close(outfd);
}

diff --git a/arch/arm/boot/dts/am335x-pocketbeagle.dts
b/arch/arm/boot/dts/am335x-pocketbeagle.dts
index 62fe5cab9fae..d9b8f09920a6 100644
--- a/arch/arm/boot/dts/am335x-pocketbeagle.dts
+++ b/arch/arm/boot/dts/am335x-pocketbeagle.dts
@@ -92,15 +92,6 @@
>;
};

- spi0_pins: pinmux-spi0-pins {
- pinctrl-single,pins = <
- AM33XX_IOPAD(0x950, PIN_INPUT_PULLUP |
MUX_MODE0) /* (A17) spi0_sclk.spi0_sclk */
- AM33XX_IOPAD(0x954, PIN_INPUT_PULLUP |
MUX_MODE0) /* (B17) spi0_d0.spi0_d0 */
- AM33XX_IOPAD(0x958, PIN_INPUT_PULLUP |
MUX_MODE0) /* (B16) spi0_d1.spi0_d1 */
- AM33XX_IOPAD(0x95c, PIN_INPUT_PULLUP |
MUX_MODE0) /* (A16) spi0_cs0.spi0_cs0 */
- >;
- };
-
spi1_pins: pinmux-spi1-pins {
pinctrl-single,pins = <
AM33XX_IOPAD(0x964, PIN_INPUT_PULLUP |
MUX_MODE4) /* (C18) eCAP0_in_PWM0_out.spi1_sclk */
@@ -126,6 +117,13 @@
>;
};

+ uart2_pins: pinmux-uart2-pins {
+ pinctrl-single,pins = <
+ AM33XX_IOPAD(0x950, PIN_INPUT | MUX_MODE1)
/* spi0_sclk.uart2_rxd */
+ AM33XX_IOPAD(0x954, PIN_OUTPUT | MUX_MODE1)
/* spi0_d0.uart2_txd */
+ >;
+ };
+
uart4_pins: pinmux-uart4-pins {
pinctrl-single,pins = <
AM33XX_IOPAD(0x870, PIN_INPUT_PULLUP |
MUX_MODE6) /* (T17) gpmc_wait0.uart4_rxd */
@@ -199,6 +197,13 @@
status = "okay";
};

+&uart2 {
+ pinctrl-names = "default";
+ pinctrl-0 = <&uart2_pins>;
+
+ status = "okay";
+};
+
&uart4 {
pinctrl-names = "default";
pinctrl-0 = <&uart4_pins>;

--
Best regards,
Marek Vasut

2018-12-22 17:54:22

by Marek Vasut

[permalink] [raw]
Subject: Re: [PATCH] Revert "serial: 8250: Fix clearing FIFOs in RS485 mode again"

On 12/21/2018 10:08 PM, Paul Burton wrote:
> Hi Marek,
>
> On Fri, Dec 21, 2018 at 09:08:28PM +0100, Marek Vasut wrote:
>> On 12/16/2018 11:28 PM, Paul Burton wrote:
>>> On Sun, Dec 16, 2018 at 11:01:18PM +0100, Marek Vasut wrote:
>>>>>>> I did suggest an alternative approach which would rename
>>>>>>> serial8250_clear_fifos() and split it into 2 variants - one that
>>>>>>> disables FIFOs & one that does not, then use the latter in
>>>>>>> __do_stop_tx_rs485():
>>>>>>>
>>>>>>> https://lore.kernel.org/lkml/20181213014805.77u5dzydo23cm6fq@pburton-laptop/
>>>>>>>
>>>>>>> However I have no access to the OMAP3 hardware that Marek's patch was
>>>>>>> attempting to fix & have heard nothing back with regards to him testing
>>>>>>> that approach, so here's a simple revert that fixes the Ingenic JZ4780.
>>>>>>>
>>>>>>> I've marked for stable back to v4.10 presuming that this is how far the
>>>>>>> broken patch may be backported, given that this is where commit
>>>>>>> 2bed8a8e7072 ("Clearing FIFOs in RS485 emulation mode causes subsequent
>>>>>>> transmits to break") that it tried to fix was introduced.
>>>>>>
>>>>>> OK, I tested this on AM335x / OMAP3 and the system is again broken, so
>>>>>> that's a NAK.
>>>>>
>>>>> To be clear - what did you test? This revert or the patch linked to
>>>>> above?
>>>>>
>>>>> This revert would of course reintroduce your RS485 issue because it just
>>>>> undoes your change.
>>>>
>>>> The revert. Which of the two patches do you need me to test.
>>>
>>> The one in the email I sent on Thursday 13th at 01:48:06 UTC, linked to
>>> at lore.kernel.org in the quote right above:
>>>
>>> https://lore.kernel.org/lkml/20181213014805.77u5dzydo23cm6fq@pburton-laptop/
>>>
>>> You replied with comments on the patch, you just never tested it or
>>> never told me if you did. The lack of response means I don't know
>>> whether that potential patch even still works for your system, hence the
>>> revert.
>>
>> I shared the entire testcase, which now fails on AM335x due to this
>> revert. Is there any progress on a proper fix from your side which does
>> not break the AM335x ?
>
> No.
>
> Let's be clear - I didn't break your AM335x system, your broken patch
> says that Daniel did with a commit applied back in v4.10. As such I
> don't consider it my responsibility to fix your problem at all, I don't
> have any access to the hardware anyway & I won't be buying hardware to
> fix a bug for you.
>
> Despite all that I did write a patch which I expect would have solved
> the problem for both of us, which is linked *twice* in the quoted emails
> above & which as far as I can tell you *still* have not tested. I can
> only surmise that you're trying deliberately to be annoying at this
> point.
>
> If you want to try the patch I already wrote, and confirm whether it
> actually works for you, then let's talk. Otherwise we're done here.

Understood. However I did test your patch, but it was generating
spurious IRQs and overruns (ttyS ttyS2: 91 input overrun(s)) on the
AM335x, so that's not a proper solution.

I even brought the CI20 V2 I have back to life (yes, I bought one). The
patch Ezequiel posted did fix the problem on the CI20, and did not break
the AM335x, so I'd prefer if it was revisited instead of a heavy-handed
revert.

And I'd prefer to keep this thread alive, since I am still convinced
that the FIFO handling code is wrong. Moreover, considering the UME bit
on JZ4780 in FCR, the original code should confuse the UART on the
JZ4780 too, although this might be hidden by some other surrounding code
specific to the 8250 core on the JZ4780.

I am also on mipslinux IRC channel, in case you want to discuss this.

--
Best regards,
Marek Vasut

2018-12-23 07:09:07

by Marek Vasut

[permalink] [raw]
Subject: Re: [PATCH] Revert "serial: 8250: Fix clearing FIFOs in RS485 mode again"

On 12/16/2018 11:28 PM, Paul Burton wrote:
> Hi Marek,
>
> On Sun, Dec 16, 2018 at 11:01:18PM +0100, Marek Vasut wrote:
>>>>> I did suggest an alternative approach which would rename
>>>>> serial8250_clear_fifos() and split it into 2 variants - one that
>>>>> disables FIFOs & one that does not, then use the latter in
>>>>> __do_stop_tx_rs485():
>>>>>
>>>>> https://lore.kernel.org/lkml/20181213014805.77u5dzydo23cm6fq@pburton-laptop/
>>>>>
>>>>> However I have no access to the OMAP3 hardware that Marek's patch was
>>>>> attempting to fix & have heard nothing back with regards to him testing
>>>>> that approach, so here's a simple revert that fixes the Ingenic JZ4780.
>>>>>
>>>>> I've marked for stable back to v4.10 presuming that this is how far the
>>>>> broken patch may be backported, given that this is where commit
>>>>> 2bed8a8e7072 ("Clearing FIFOs in RS485 emulation mode causes subsequent
>>>>> transmits to break") that it tried to fix was introduced.
>>>>
>>>> OK, I tested this on AM335x / OMAP3 and the system is again broken, so
>>>> that's a NAK.
>>>
>>> To be clear - what did you test? This revert or the patch linked to
>>> above?
>>>
>>> This revert would of course reintroduce your RS485 issue because it just
>>> undoes your change.
>>
>> The revert. Which of the two patches do you need me to test.
>
> The one in the email I sent on Thursday 13th at 01:48:06 UTC, linked to
> at lore.kernel.org in the quote right above:
>
> https://lore.kernel.org/lkml/20181213014805.77u5dzydo23cm6fq@pburton-laptop/
>
> You replied with comments on the patch, you just never tested it or
> never told me if you did. The lack of response means I don't know
> whether that potential patch even still works for your system, hence the
> revert.

I shared the entire testcase, which now fails on AM335x due to this
revert. Is there any progress on a proper fix from your side which does
not break the AM335x ?

--
Best regards,
Marek Vasut

2018-12-23 09:13:30

by Paul Burton

[permalink] [raw]
Subject: Re: [PATCH] Revert "serial: 8250: Fix clearing FIFOs in RS485 mode again"

Hi Marek,

On Fri, Dec 21, 2018 at 09:08:28PM +0100, Marek Vasut wrote:
> On 12/16/2018 11:28 PM, Paul Burton wrote:
> > On Sun, Dec 16, 2018 at 11:01:18PM +0100, Marek Vasut wrote:
> >>>>> I did suggest an alternative approach which would rename
> >>>>> serial8250_clear_fifos() and split it into 2 variants - one that
> >>>>> disables FIFOs & one that does not, then use the latter in
> >>>>> __do_stop_tx_rs485():
> >>>>>
> >>>>> https://lore.kernel.org/lkml/20181213014805.77u5dzydo23cm6fq@pburton-laptop/
> >>>>>
> >>>>> However I have no access to the OMAP3 hardware that Marek's patch was
> >>>>> attempting to fix & have heard nothing back with regards to him testing
> >>>>> that approach, so here's a simple revert that fixes the Ingenic JZ4780.
> >>>>>
> >>>>> I've marked for stable back to v4.10 presuming that this is how far the
> >>>>> broken patch may be backported, given that this is where commit
> >>>>> 2bed8a8e7072 ("Clearing FIFOs in RS485 emulation mode causes subsequent
> >>>>> transmits to break") that it tried to fix was introduced.
> >>>>
> >>>> OK, I tested this on AM335x / OMAP3 and the system is again broken, so
> >>>> that's a NAK.
> >>>
> >>> To be clear - what did you test? This revert or the patch linked to
> >>> above?
> >>>
> >>> This revert would of course reintroduce your RS485 issue because it just
> >>> undoes your change.
> >>
> >> The revert. Which of the two patches do you need me to test.
> >
> > The one in the email I sent on Thursday 13th at 01:48:06 UTC, linked to
> > at lore.kernel.org in the quote right above:
> >
> > https://lore.kernel.org/lkml/20181213014805.77u5dzydo23cm6fq@pburton-laptop/
> >
> > You replied with comments on the patch, you just never tested it or
> > never told me if you did. The lack of response means I don't know
> > whether that potential patch even still works for your system, hence the
> > revert.
>
> I shared the entire testcase, which now fails on AM335x due to this
> revert. Is there any progress on a proper fix from your side which does
> not break the AM335x ?

No.

Let's be clear - I didn't break your AM335x system, your broken patch
says that Daniel did with a commit applied back in v4.10. As such I
don't consider it my responsibility to fix your problem at all, I don't
have any access to the hardware anyway & I won't be buying hardware to
fix a bug for you.

Despite all that I did write a patch which I expect would have solved
the problem for both of us, which is linked *twice* in the quoted emails
above & which as far as I can tell you *still* have not tested. I can
only surmise that you're trying deliberately to be annoying at this
point.

If you want to try the patch I already wrote, and confirm whether it
actually works for you, then let's talk. Otherwise we're done here.

Thanks,
Paul

2018-12-24 15:44:31

by Paul Burton

[permalink] [raw]
Subject: Re: [PATCH] Revert "serial: 8250: Fix clearing FIFOs in RS485 mode again"

Hi Marek,

On Fri, Dec 21, 2018 at 11:02:03PM +0100, Marek Vasut wrote:
> >> I shared the entire testcase, which now fails on AM335x due to this
> >> revert. Is there any progress on a proper fix from your side which does
> >> not break the AM335x ?
> >
> > No.
> >
> > Let's be clear - I didn't break your AM335x system, your broken patch
> > says that Daniel did with a commit applied back in v4.10. As such I
> > don't consider it my responsibility to fix your problem at all, I don't
> > have any access to the hardware anyway & I won't be buying hardware to
> > fix a bug for you.
> >
> > Despite all that I did write a patch which I expect would have solved
> > the problem for both of us, which is linked *twice* in the quoted emails
> > above & which as far as I can tell you *still* have not tested. I can
> > only surmise that you're trying deliberately to be annoying at this
> > point.
> >
> > If you want to try the patch I already wrote, and confirm whether it
> > actually works for you, then let's talk. Otherwise we're done here.
>
> Understood. However I did test your patch, but it was generating
> spurious IRQs and overruns (ttyS ttyS2: 91 input overrun(s)) on the
> AM335x, so that's not a proper solution.

OK, thanks for testing, and let's figure out what's going on. Since the
revert is in now I'd suggest starting from there - ie. approximately
from the code we've had since v4.10.

> I even brought the CI20 V2 I have back to life (yes, I bought one). The
> patch Ezequiel posted did fix the problem on the CI20, and did not break
> the AM335x, so I'd prefer if it was revisited instead of a heavy-handed
> revert.

As I described in an earlier email & on IRC, Ezequiel's one-liner does
not address all of the problems listed in my revert's commit message.
But again, since the revert is now in I suggest starting from there.

As neither a maintainer or significant contributor to
drivers/tty/serial, nor the author of the patch applied in v4.10 which
you say broke AM335x, nor someone with access to the affected hardware,
I'm probably not the best placed person to help you here - all I can do
is offer general debug suggestions. With that in mind, here are some
questions & ideas I have:

1) Your message for commit f6aa5beb45be ("serial: 8250: Fix clearing
FIFOs in RS485 mode again") suggests that the problem you're seeing
is specific to the __do_stop_tx_rs485() path. Does changing only
__do_stop_tx_rs485() to clear the FIFOs without disabling them,
without modifying other users of serial8250_clear_fifos(), fix your
problem? ie. does the following work?

diff --git a/drivers/tty/serial/8250/8250_port.c b/drivers/tty/serial/8250/8250_port.c
index f776b3eafb96..18d2a1a93f03 100644
--- a/drivers/tty/serial/8250/8250_port.c
+++ b/drivers/tty/serial/8250/8250_port.c
@@ -1458,17 +1458,21 @@ static void serial8250_stop_rx(struct uart_port *port)
}

static void __do_stop_tx_rs485(struct uart_8250_port *p)
{
+ unsigned char fcr;
+
serial8250_em485_rts_after_send(p);

/*
* Empty the RX FIFO, we are not interested in anything
* received during the half-duplex transmission.
* Enable previously disabled RX interrupts.
*/
if (!(p->port.rs485.flags & SER_RS485_RX_DURING_TX)) {
- serial8250_clear_fifos(p);
+ fcr = serial_port_in(&p->port, UART_FCR);
+ serial_port_out(&p->port, UART_FCR,
+ fcr | UART_FCR_CLEAR_RCVR | UART_FCR_CLEAR_XMIT);

p->ier |= UART_IER_RLSI | UART_IER_RDI;
serial_port_out(&p->port, UART_IER, p->ier);
}

Based on the comment above maybe leaving UART_FCR_CLEAR_XMIT out of
that might make sense too..?

Having __do_stop_tx_rs485() not share the FIFO clearing code with
other callers may make sense given that so far as I can see
__do_stop_tx_rs485() runs whilst the port is in use & other callers
of serial8250_clear_fifos() do not.

2) In an earlier email you said "The problem the original patch fixed
was that too many bits were cleared in the FCR on OMAP3/AM335x".
Could you clarify which bits we're talking about? From the AM335x
reference manual I can only see the DMA_MODE bit & the
[RT]X_FIFO_TRIG fields. Do you know which are problematic & why? In
your commit message you mention the AM335x UART swallowing &
retransmitting bytes - which field changing causes that?

> And I'd prefer to keep this thread alive, since I am still convinced
> that the FIFO handling code is wrong. Moreover, considering the UME bit
> on JZ4780 in FCR, the original code should confuse the UART on the
> JZ4780 too, although this might be hidden by some other surrounding code
> specific to the 8250 core on the JZ4780.

For completeness, as I said in an earlier email in the thread and as
we've since discussed on IRC, this is incorrect because
ingenic_uart_serial_out() unconditionally sets the UME bit. As such the
UME bit is not a problem, and JZ4780 works fine both before your patch &
after the revert.

Thanks,
Paul