2019-07-29 19:41:42

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.19 024/113] tty: serial: msm_serial: avoid system lockup condition

[ Upstream commit ba3684f99f1b25d2a30b6956d02d339d7acb9799 ]

The function msm_wait_for_xmitr can be taken with interrupts
disabled. In order to avoid a potential system lockup - demonstrated
under stress testing conditions on SoC QCS404/5 - make sure we wait
for a bounded amount of time.

Tested on SoC QCS404.

Signed-off-by: Jorge Ramirez-Ortiz <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/tty/serial/msm_serial.c | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/drivers/tty/serial/msm_serial.c b/drivers/tty/serial/msm_serial.c
index 0f41b936da03..310bbae515b0 100644
--- a/drivers/tty/serial/msm_serial.c
+++ b/drivers/tty/serial/msm_serial.c
@@ -383,10 +383,14 @@ static void msm_request_rx_dma(struct msm_port *msm_port, resource_size_t base)

static inline void msm_wait_for_xmitr(struct uart_port *port)
{
+ unsigned int timeout = 500000;
+
while (!(msm_read(port, UART_SR) & UART_SR_TX_EMPTY)) {
if (msm_read(port, UART_ISR) & UART_ISR_TX_READY)
break;
udelay(1);
+ if (!timeout--)
+ break;
}
msm_write(port, UART_CR_CMD_RESET_TX_READY, UART_CR);
}
--
2.20.1




2019-07-31 19:59:16

by Pavel Machek

[permalink] [raw]
Subject: Re: [PATCH 4.19 024/113] tty: serial: msm_serial: avoid system lockup condition

Hi!

> [ Upstream commit ba3684f99f1b25d2a30b6956d02d339d7acb9799 ]
>
> The function msm_wait_for_xmitr can be taken with interrupts
> disabled. In order to avoid a potential system lockup - demonstrated
> under stress testing conditions on SoC QCS404/5 - make sure we wait
> for a bounded amount of time.
>
> Tested on SoC QCS404.

How long did it take to timeout?

Because... this is supposed to loop for 0.5 second with interrupts
disabled, but 500000*udelay(1) is probably going to wait for more than
that.

Is 500msec reasonable with interrupts disabled?

Should it use something like 5000*udelay(100), instead, as that has
chance to result in closer-to-500msec wait?

> +++ b/drivers/tty/serial/msm_serial.c
> @@ -383,10 +383,14 @@ static void msm_request_rx_dma(struct msm_port *msm_port, resource_size_t base)
>
> static inline void msm_wait_for_xmitr(struct uart_port *port)
> {
> + unsigned int timeout = 500000;
> +
> while (!(msm_read(port, UART_SR) & UART_SR_TX_EMPTY)) {
> if (msm_read(port, UART_ISR) & UART_ISR_TX_READY)
> break;
> udelay(1);
> + if (!timeout--)
> + break;
> }
> msm_write(port, UART_CR_CMD_RESET_TX_READY, UART_CR);
> }

Plus, should it do some kind of dev_err() to let users know that
something went very wrong with their serial?

Thanks,
Pavel

--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


Attachments:
(No filename) (1.46 kB)
signature.asc (188.00 B)
Digital signature
Download all attachments

2019-07-31 21:02:48

by Jorge Ramirez-Ortiz

[permalink] [raw]
Subject: Re: [PATCH 4.19 024/113] tty: serial: msm_serial: avoid system lockup condition

On 7/31/19 21:05, Pavel Machek wrote:
> Hi!

hi Pavel,

>
>> [ Upstream commit ba3684f99f1b25d2a30b6956d02d339d7acb9799 ]
>>
>> The function msm_wait_for_xmitr can be taken with interrupts
>> disabled. In order to avoid a potential system lockup - demonstrated
>> under stress testing conditions on SoC QCS404/5 - make sure we wait
>> for a bounded amount of time.
>>
>> Tested on SoC QCS404.
>
> How long did it take to timeout?
>
> Because... this is supposed to loop for 0.5 second with interrupts
> disabled, but 500000*udelay(1) is probably going to wait for more than
> that.
>
> Is 500msec reasonable with interrupts disabled?

considering the original unbounded definition, it is hard to determine
what would be a good amount of time to wait (msm_serial can be used for
BT comms and I am not sure how critical that link might be for different
clients..and I didnt want to create a regression hence the half a second
delay).

yeah, I don't think disabling interrupts for half a second is a good
idea on most systems hence why I chose it that big.

>
> Should it use something like 5000*udelay(100), instead, as that has
> chance to result in closer-to-500msec wait?

the half a second timeout didnt mean to be accurate but a worst case
scenario...I am not sure accuracy matters.

>
>> +++ b/drivers/tty/serial/msm_serial.c
>> @@ -383,10 +383,14 @@ static void msm_request_rx_dma(struct msm_port *msm_port, resource_size_t base)
>>
>> static inline void msm_wait_for_xmitr(struct uart_port *port)
>> {
>> + unsigned int timeout = 500000;
>> +
>> while (!(msm_read(port, UART_SR) & UART_SR_TX_EMPTY)) {
>> if (msm_read(port, UART_ISR) & UART_ISR_TX_READY)
>> break;
>> udelay(1);
>> + if (!timeout--)
>> + break;
>> }
>> msm_write(port, UART_CR_CMD_RESET_TX_READY, UART_CR);
>> }
>
> Plus, should it do some kind of dev_err() to let users know that
> something went very wrong with their serial?

I did consider this but then I thought that 1/2 second without
interrupts on the core should not go unnoticed. But I might be wrong.

>
> Thanks,
> Pavel
>

2019-08-15 07:36:56

by Pavel Machek

[permalink] [raw]
Subject: Re: [PATCH 4.19 024/113] tty: serial: msm_serial: avoid system lockup condition

Hi!

> >> [ Upstream commit ba3684f99f1b25d2a30b6956d02d339d7acb9799 ]

> > Should it use something like 5000*udelay(100), instead, as that has
> > chance to result in closer-to-500msec wait?
>
> the half a second timeout didnt mean to be accurate but a worst case
> scenario...I am not sure accuracy matters.

Well, I'd be afraid that it would wait 5 seconds, not half a
second. udelay(1) may be very inaccurate.

> >> while (!(msm_read(port, UART_SR) & UART_SR_TX_EMPTY)) {
> >> if (msm_read(port, UART_ISR) & UART_ISR_TX_READY)
> >> break;
> >> udelay(1);
> >> + if (!timeout--)
> >> + break;
> >> }
> >> msm_write(port, UART_CR_CMD_RESET_TX_READY, UART_CR);
> >> }
> >
> > Plus, should it do some kind of dev_err() to let users know that
> > something went very wrong with their serial?
>
> I did consider this but then I thought that 1/2 second without
> interrupts on the core should not go unnoticed. But I might be wrong.

Well, maybe it will be noticed, but user will have no idea what caused
it.

Pavel
--
DENX Software Engineering GmbH, Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany


Attachments:
(No filename) (1.20 kB)
signature.asc (188.00 B)
Digital signature
Download all attachments