2024-06-04 16:03:19

by Doug Anderson

Subject: [PATCH v3 0/7] serial: qcom-geni: Overhaul TX handling to fix crashes/hangs


While trying to reproduce -EBUSY errors that our lab was getting in
suspend/resume testing, I ended up finding a whole pile of problems
with the Qualcomm GENI serial driver. I've posted a fix for the -EBUSY
issue separately [1]. This series fixes all of the Qualcomm GENI
problems that I found.

As far as I can tell, most of the problems have been in the Qualcomm
GENI serial driver since its inception, but the behavior got worse with
the new kfifo changes. Previously when the OS took data out of the
circular queue we'd just spit stale data onto the serial port. Now
we'll hard lockup. :-P

I've tried to break this series up as much as possible to make it
easier to understand but the final patch is still a lot of change at
once. Hopefully it's OK.

[1] https://lore.kernel.org/r/20240530084841.v2.1.I2395e66cf70c6e67d774c56943825c289b9c13e4@changeid

Changes in v3:
- 0xffffffff => GENMASK(31, 0)
- Reword commit message.
- Use uart_fifo_timeout() for timeout.

Changes in v2:
- Totally rework / rename patch to handle suspend while active xfer
- serial: qcom-geni: Fix arg types for qcom_geni_serial_poll_bit()
- serial: qcom-geni: Fix the timeout in qcom_geni_serial_poll_bit()
- serial: qcom-geni: Introduce qcom_geni_serial_poll_bitfield()
- serial: qcom-geni: Just set the watermark level once
- serial: qcom-geni: Rework TX in FIFO mode to fix hangs/lockups
- soc: qcom: geni-se: Add GP_LENGTH/IRQ_EN_SET/IRQ_EN_CLEAR registers

Douglas Anderson (7):
soc: qcom: geni-se: Add GP_LENGTH/IRQ_EN_SET/IRQ_EN_CLEAR registers
serial: qcom-geni: Fix the timeout in qcom_geni_serial_poll_bit()
serial: qcom-geni: Fix arg types for qcom_geni_serial_poll_bit()
serial: qcom-geni: Introduce qcom_geni_serial_poll_bitfield()
serial: qcom-geni: Just set the watermark level once
serial: qcom-geni: Fix suspend while active UART xfer
serial: qcom-geni: Rework TX in FIFO mode to fix hangs/lockups

drivers/tty/serial/qcom_geni_serial.c | 321 +++++++++++++++-----------
include/linux/soc/qcom/geni-se.h | 6 +
2 files changed, 192 insertions(+), 135 deletions(-)

--
2.45.1.288.g0e0cd299f1-goog



2024-06-04 16:03:31

by Doug Anderson

Subject: [PATCH v3 1/7] soc: qcom: geni-se: Add GP_LENGTH/IRQ_EN_SET/IRQ_EN_CLEAR registers

For UART devices the M_GP_LENGTH is the TX word count. For other
devices this is the transaction word count.

For UART devices the S_GP_LENGTH is the RX word count.

The IRQ_EN set/clear registers allow you to set or clear bits in the
IRQ_EN register without needing a read-modify-write.
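A minimal userspace sketch of why dedicated set/clear registers help, assuming the usual write-1-to-set / write-1-to-clear semantics. The `fake_` names and bit values are hypothetical and only model the behavior in software; they are not the real GENI definitions.

```c
#include <stdint.h>

/* Hypothetical bit positions, chosen only for illustration. */
#define FAKE_CMD_DONE_EN	(1u << 0)
#define FAKE_WATERMARK_EN	(1u << 30)

/* Software stand-in for an IRQ_EN register with SET/CLEAR aliases. */
struct fake_irq_regs {
	uint32_t irq_en;
};

/* Classic approach: read the register, modify it, write it back. */
static void fake_enable_rmw(struct fake_irq_regs *r, uint32_t bits)
{
	uint32_t v = r->irq_en;	/* read */

	v |= bits;		/* modify */
	r->irq_en = v;		/* write */
}

/* Modeled IRQ_EN_SET: one write, only the bits written as 1 change. */
static void fake_write_irq_en_set(struct fake_irq_regs *r, uint32_t bits)
{
	r->irq_en |= bits;
}

/* Modeled IRQ_EN_CLEAR: one write, only the bits written as 1 are cleared. */
static void fake_write_irq_en_clear(struct fake_irq_regs *r, uint32_t bits)
{
	r->irq_en &= ~bits;
}
```

Both paths end with the same register contents, but the set/clear path is a single write from the driver's point of view, so there is no window between the read and the write.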

Acked-by: Bjorn Andersson <[email protected]>
Signed-off-by: Douglas Anderson <[email protected]>
---
Since these new definitions are used in the future UART patches and
Bjorn has Acked them, I'd expect them to go through the same tree as
the UART patches that need them.

(no changes since v2)

Changes in v2:
- New

include/linux/soc/qcom/geni-se.h | 6 ++++++
1 file changed, 6 insertions(+)

diff --git a/include/linux/soc/qcom/geni-se.h b/include/linux/soc/qcom/geni-se.h
index 0f038a1a0330..8d07c442029b 100644
--- a/include/linux/soc/qcom/geni-se.h
+++ b/include/linux/soc/qcom/geni-se.h
@@ -88,11 +88,15 @@ struct geni_se {
#define SE_GENI_M_IRQ_STATUS 0x610
#define SE_GENI_M_IRQ_EN 0x614
#define SE_GENI_M_IRQ_CLEAR 0x618
+#define SE_GENI_M_IRQ_EN_SET 0x61c
+#define SE_GENI_M_IRQ_EN_CLEAR 0x620
#define SE_GENI_S_CMD0 0x630
#define SE_GENI_S_CMD_CTRL_REG 0x634
#define SE_GENI_S_IRQ_STATUS 0x640
#define SE_GENI_S_IRQ_EN 0x644
#define SE_GENI_S_IRQ_CLEAR 0x648
+#define SE_GENI_S_IRQ_EN_SET 0x64c
+#define SE_GENI_S_IRQ_EN_CLEAR 0x650
#define SE_GENI_TX_FIFOn 0x700
#define SE_GENI_RX_FIFOn 0x780
#define SE_GENI_TX_FIFO_STATUS 0x800
@@ -101,6 +105,8 @@ struct geni_se {
#define SE_GENI_RX_WATERMARK_REG 0x810
#define SE_GENI_RX_RFR_WATERMARK_REG 0x814
#define SE_GENI_IOS 0x908
+#define SE_GENI_M_GP_LENGTH 0x910
+#define SE_GENI_S_GP_LENGTH 0x914
#define SE_DMA_TX_IRQ_STAT 0xc40
#define SE_DMA_TX_IRQ_CLR 0xc44
#define SE_DMA_TX_FSM_RST 0xc58
--
2.45.1.288.g0e0cd299f1-goog


2024-06-04 16:03:57

by Doug Anderson

Subject: [PATCH v3 6/7] serial: qcom-geni: Fix suspend while active UART xfer

On devices using Qualcomm's GENI UART it is possible to get the UART
stuck such that it no longer outputs data. Specifically, logging in
via an agetty on the debug serial port (which was _not_ used for
kernel console) and running:
cat /var/log/messages
...and then (via an SSH session) forcing a few suspend/resume cycles
causes the UART to stop transmitting.

The root of the problems was with qcom_geni_serial_stop_tx_fifo()
which is called as part of the suspend process. Specific problems with
that function:
- When an in-progress "tx" command is cancelled it doesn't appear to
fully drain the FIFO. That meant qcom_geni_serial_tx_empty()
continued to report that the FIFO wasn't empty. The
qcom_geni_serial_start_tx_fifo() function didn't re-enable
interrupts in this case so the driver would never start transferring
again.
- When the driver cancelled the current "tx" command it forgot to
zero out "tx_remaining". This confused logic elsewhere in the
driver.
- From experimentation, it appears that cancelling the "tx" command
could drop some of the queued up bytes.

While qcom_geni_serial_stop_tx_fifo() could be fixed to drain the FIFO
and shut things down properly, stop_tx() isn't supposed to be a slow
function. It is run with local interrupts off and is documented to
stop transmitting "as soon as possible". Change the function to just
stop new bytes from being queued. In order to make this work, change
qcom_geni_serial_start_tx_fifo() to remove some conditions. It's
always safe to enable the watermark interrupt and the IRQ handler will
disable it if it's not needed.

For system suspend the queue still needs to be drained. Failure to do
so means that the hardware won't provide new interrupts until a
"cancel" command is sent. Add draining logic (fixing the issues noted
above) at suspend time.
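The drain-then-cancel ordering described above can be sketched in userspace as follows. This is only a toy model under stated assumptions: `struct fake_tx`, `fake_tick()`, `fake_poll_sent()` and `fake_drain()` are hypothetical names, the hardware is simulated as shifting out one byte per poll, and `sent` merely stands in for the count the driver reads back from the hardware.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model of the TX state; not the real GENI register layout. */
struct fake_tx {
	uint32_t tx_total;	/* length programmed for the active command */
	uint32_t tx_remaining;	/* bytes of the command not yet put in the FIFO */
	uint32_t sent;		/* bytes already shifted out on the wire */
	bool active;		/* a command is active in the sequencer */
};

/* One simulated poll interval: at most one queued byte leaves the FIFO. */
static void fake_tick(struct fake_tx *tx)
{
	uint32_t queued = tx->tx_total - tx->tx_remaining;

	if (tx->active && tx->sent < queued)
		tx->sent++;
	if (tx->active && tx->sent == tx->tx_total)
		tx->active = false;	/* the whole command completed */
}

/* Bounded poll until the sent count reaches the wanted value. */
static bool fake_poll_sent(struct fake_tx *tx, uint32_t want,
			   unsigned int max_polls)
{
	while (max_polls--) {
		if (tx->sent == want)
			return true;
		fake_tick(tx);
	}
	return false;
}

static void fake_drain(struct fake_tx *tx)
{
	if (!tx->active)
		return;		/* nothing in flight */

	/* Let everything already in the FIFO go out before cancelling. */
	fake_poll_sent(tx, tx->tx_total - tx->tx_remaining, 1000);

	if (!tx->active)
		return;		/* draining finished the command */

	tx->active = false;	/* models the "cancel" command */
	tx->tx_remaining = 0;	/* the cancelled command owes no more bytes */
}
```

In the model, a command for 100 bytes with 40 queued and 10 sent ends with 40 bytes sent, no active command, and nothing remaining, so the next transmit starts from a clean state.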

NOTE: It would be ideal if qcom_geni_serial_stop_tx_fifo() could
"pause" the transmitter right away. There is no obvious way to do this
in the docs and experimentation didn't find any tricks either, so
stopping TX "as soon as possible" isn't very soon but is the best
possible.

Fixes: c4f528795d1a ("tty: serial: msm_geni_serial: Add serial driver support for GENI based QUP")
Signed-off-by: Douglas Anderson <[email protected]>
---
There are still a number of problems with GENI UART after this but
I've kept this change separate to make it easier to understand.
Specifically on mainline just hitting "Ctrl-C" after dumping
/var/log/messages to the serial port hangs things after the kfifo
changes. Those issues will be addressed in future patches.

It should also be noted that the "Fixes" tag here is a bit of a
swag. I haven't gone and tested on ancient code, but at least the
problems exist on kernel 5.15 and much of the code touched here has
been here since the beginning, or at least for as long as the driver
has been stable.

Changes in v3:
- 0xffffffff => GENMASK(31, 0)
- Reword commit message.

Changes in v2:
- Totally rework / rename patch to handle suspend while active xfer

drivers/tty/serial/qcom_geni_serial.c | 97 +++++++++++++++++++++------
1 file changed, 75 insertions(+), 22 deletions(-)

diff --git a/drivers/tty/serial/qcom_geni_serial.c b/drivers/tty/serial/qcom_geni_serial.c
index 4dbc59873b34..46b6674d90c5 100644
--- a/drivers/tty/serial/qcom_geni_serial.c
+++ b/drivers/tty/serial/qcom_geni_serial.c
@@ -130,6 +130,7 @@ struct qcom_geni_serial_port {
bool brk;

unsigned int tx_remaining;
+ unsigned int tx_total;
int wakeup_irq;
bool rx_tx_swap;
bool cts_rts_swap;
@@ -311,11 +312,14 @@ static bool qcom_geni_serial_poll_bit(struct uart_port *uport,

static void qcom_geni_serial_setup_tx(struct uart_port *uport, u32 xmit_size)
{
+ struct qcom_geni_serial_port *port = to_dev_port(uport);
u32 m_cmd;

writel(xmit_size, uport->membase + SE_UART_TX_TRANS_LEN);
m_cmd = UART_START_TX << M_OPCODE_SHFT;
writel(m_cmd, uport->membase + SE_GENI_M_CMD0);
+
+ port->tx_total = xmit_size;
}

static void qcom_geni_serial_poll_tx_done(struct uart_port *uport)
@@ -335,6 +339,64 @@ static void qcom_geni_serial_poll_tx_done(struct uart_port *uport)
writel(irq_clear, uport->membase + SE_GENI_M_IRQ_CLEAR);
}

+static void qcom_geni_serial_drain_tx_fifo(struct uart_port *uport)
+{
+ struct qcom_geni_serial_port *port = to_dev_port(uport);
+
+ /*
+ * If the main sequencer is inactive it means that the TX command has
+ * been completed and all bytes have been sent. Nothing to do in that
+ * case.
+ */
+ if (!qcom_geni_serial_main_active(uport))
+ return;
+
+ /*
+ * Wait until the FIFO has been drained. We've already taken bytes out
+ * of the higher level queue in qcom_geni_serial_send_chunk_fifo() so
+ * if we don't drain the FIFO but send the "cancel" below they seem to
+ * get lost.
+ */
+ qcom_geni_serial_poll_bitfield(uport, SE_GENI_M_GP_LENGTH, GENMASK(31, 0),
+ port->tx_total - port->tx_remaining);
+
+ /*
+ * If clearing the FIFO made us inactive then we're done--no need for
+ * a cancel.
+ */
+ if (!qcom_geni_serial_main_active(uport))
+ return;
+
+ /*
+ * Cancel the current command. After this the main sequencer will
+ * stop reporting that it's active and we'll have to start a new
+ * transfer command.
+ *
+ * If we skip doing this cancel and then continue with a system
+ * suspend while there's an active command in the main sequencer
+ * then after resume time we won't get any more interrupts on the
+ * main sequencer until we send the cancel.
+ */
+ geni_se_cancel_m_cmd(&port->se);
+ if (!qcom_geni_serial_poll_bit(uport, SE_GENI_M_IRQ_STATUS,
+ M_CMD_CANCEL_EN, true)) {
+ /* The cancel failed; try an abort as a fallback. */
+ geni_se_abort_m_cmd(&port->se);
+ qcom_geni_serial_poll_bit(uport, SE_GENI_M_IRQ_STATUS,
+ M_CMD_ABORT_EN, true);
+ writel(M_CMD_ABORT_EN, uport->membase + SE_GENI_M_IRQ_CLEAR);
+ }
+ writel(M_CMD_CANCEL_EN, uport->membase + SE_GENI_M_IRQ_CLEAR);
+
+ /*
+ * We've cancelled the current command. "tx_remaining" stores how
+ * many bytes are left to finish in the current command so we know
+ * when to start a new command. Since the command was cancelled we
+ * need to zero "tx_remaining".
+ */
+ port->tx_remaining = 0;
+}
+
static void qcom_geni_serial_abort_rx(struct uart_port *uport)
{
u32 irq_clear = S_CMD_DONE_EN | S_CMD_ABORT_EN;
@@ -655,37 +717,18 @@ static void qcom_geni_serial_start_tx_fifo(struct uart_port *uport)
{
u32 irq_en;

- if (qcom_geni_serial_main_active(uport) ||
- !qcom_geni_serial_tx_empty(uport))
- return;
-
irq_en = readl(uport->membase + SE_GENI_M_IRQ_EN);
irq_en |= M_TX_FIFO_WATERMARK_EN | M_CMD_DONE_EN;
-
writel(irq_en, uport->membase + SE_GENI_M_IRQ_EN);
}

static void qcom_geni_serial_stop_tx_fifo(struct uart_port *uport)
{
u32 irq_en;
- struct qcom_geni_serial_port *port = to_dev_port(uport);

irq_en = readl(uport->membase + SE_GENI_M_IRQ_EN);
irq_en &= ~(M_CMD_DONE_EN | M_TX_FIFO_WATERMARK_EN);
writel(irq_en, uport->membase + SE_GENI_M_IRQ_EN);
- /* Possible stop tx is called multiple times. */
- if (!qcom_geni_serial_main_active(uport))
- return;
-
- geni_se_cancel_m_cmd(&port->se);
- if (!qcom_geni_serial_poll_bit(uport, SE_GENI_M_IRQ_STATUS,
- M_CMD_CANCEL_EN, true)) {
- geni_se_abort_m_cmd(&port->se);
- qcom_geni_serial_poll_bit(uport, SE_GENI_M_IRQ_STATUS,
- M_CMD_ABORT_EN, true);
- writel(M_CMD_ABORT_EN, uport->membase + SE_GENI_M_IRQ_CLEAR);
- }
- writel(M_CMD_CANCEL_EN, uport->membase + SE_GENI_M_IRQ_CLEAR);
}

static void qcom_geni_serial_handle_rx_fifo(struct uart_port *uport, bool drop)
@@ -1067,7 +1110,15 @@ static int setup_fifos(struct qcom_geni_serial_port *port)
}


-static void qcom_geni_serial_shutdown(struct uart_port *uport)
+static void qcom_geni_serial_shutdown_dma(struct uart_port *uport)
+{
+ disable_irq(uport->irq);
+
+ qcom_geni_serial_stop_tx(uport);
+ qcom_geni_serial_stop_rx(uport);
+}
+
+static void qcom_geni_serial_shutdown_fifo(struct uart_port *uport)
{
disable_irq(uport->irq);

@@ -1076,6 +1127,8 @@ static void qcom_geni_serial_shutdown(struct uart_port *uport)

qcom_geni_serial_stop_tx(uport);
qcom_geni_serial_stop_rx(uport);
+
+ qcom_geni_serial_drain_tx_fifo(uport);
}

static int qcom_geni_serial_port_setup(struct uart_port *uport)
@@ -1533,7 +1586,7 @@ static const struct uart_ops qcom_geni_console_pops = {
.startup = qcom_geni_serial_startup,
.request_port = qcom_geni_serial_request_port,
.config_port = qcom_geni_serial_config_port,
- .shutdown = qcom_geni_serial_shutdown,
+ .shutdown = qcom_geni_serial_shutdown_fifo,
.type = qcom_geni_serial_get_type,
.set_mctrl = qcom_geni_serial_set_mctrl,
.get_mctrl = qcom_geni_serial_get_mctrl,
@@ -1555,7 +1608,7 @@ static const struct uart_ops qcom_geni_uart_pops = {
.startup = qcom_geni_serial_startup,
.request_port = qcom_geni_serial_request_port,
.config_port = qcom_geni_serial_config_port,
- .shutdown = qcom_geni_serial_shutdown,
+ .shutdown = qcom_geni_serial_shutdown_dma,
.type = qcom_geni_serial_get_type,
.set_mctrl = qcom_geni_serial_set_mctrl,
.get_mctrl = qcom_geni_serial_get_mctrl,
--
2.45.1.288.g0e0cd299f1-goog


2024-06-04 16:04:46

by Doug Anderson

Subject: [PATCH v3 4/7] serial: qcom-geni: Introduce qcom_geni_serial_poll_bitfield()

With a small modification the qcom_geni_serial_poll_bit() function
could be used to poll more than just a single bit. Let's generalize
it. We'll make qcom_geni_serial_poll_bit() just a wrapper around
the general function.
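The generalization can be sketched outside the kernel like this. The register read is abstracted behind a hypothetical callback (`read_reg_fn`) and the timeout is replaced by a plain poll count, so the function names and signatures below are illustrative assumptions rather than the driver's code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical register reader; the driver would use readl() on MMIO. */
typedef uint32_t (*read_reg_fn)(void *ctx);

/* Poll until the masked register value equals "val", or give up. */
static bool poll_bitfield(read_reg_fn read_reg, void *ctx,
			  uint32_t field, uint32_t val,
			  unsigned int max_polls)
{
	while (max_polls--) {
		if ((read_reg(ctx) & field) == val)
			return true;
	}
	return false;
}

/* Single-bit convenience wrapper: wait for the bit to be set or clear. */
static bool poll_bit(read_reg_fn read_reg, void *ctx,
		     uint32_t field, bool set, unsigned int max_polls)
{
	return poll_bitfield(read_reg, ctx, field, set ? field : 0, max_polls);
}

/* Fake register for demonstration: returns 0, 1, 2, ... on each read. */
static uint32_t counting_reg(void *ctx)
{
	uint32_t *v = ctx;

	return (*v)++;
}
```

One subtlety of this wrapper shape: with a multi-bit mask, `set == true` now means that all bits in the mask are set, whereas a `(bool)(reg & field) == set` test means that any bit is set. The two are equivalent for single-bit masks.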

Signed-off-by: Douglas Anderson <[email protected]>
---
The new function isn't used yet (except by the wrapper) but will be
used in a future change.

(no changes since v2)

Changes in v2:
- New

drivers/tty/serial/qcom_geni_serial.c | 12 +++++++++---
1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/drivers/tty/serial/qcom_geni_serial.c b/drivers/tty/serial/qcom_geni_serial.c
index e5effc2f5878..c4c54359d32d 100644
--- a/drivers/tty/serial/qcom_geni_serial.c
+++ b/drivers/tty/serial/qcom_geni_serial.c
@@ -264,8 +264,8 @@ static bool qcom_geni_serial_secondary_active(struct uart_port *uport)
return readl(uport->membase + SE_GENI_STATUS) & S_GENI_CMD_ACTIVE;
}

-static bool qcom_geni_serial_poll_bit(struct uart_port *uport,
- unsigned int offset, u32 field, bool set)
+static bool qcom_geni_serial_poll_bitfield(struct uart_port *uport,
+ unsigned int offset, u32 field, u32 val)
{
u32 reg;
unsigned long timeout_us;
@@ -295,7 +295,7 @@ static bool qcom_geni_serial_poll_bit(struct uart_port *uport,
timeout_us = DIV_ROUND_UP(timeout_us, 10) * 10;
while (timeout_us) {
reg = readl(uport->membase + offset);
- if ((bool)(reg & field) == set)
+ if ((reg & field) == val)
return true;
udelay(10);
timeout_us -= 10;
@@ -303,6 +303,12 @@ static bool qcom_geni_serial_poll_bit(struct uart_port *uport,
return false;
}

+static bool qcom_geni_serial_poll_bit(struct uart_port *uport,
+ unsigned int offset, u32 field, bool set)
+{
+ return qcom_geni_serial_poll_bitfield(uport, offset, field, set ? field : 0);
+}
+
static void qcom_geni_serial_setup_tx(struct uart_port *uport, u32 xmit_size)
{
u32 m_cmd;
--
2.45.1.288.g0e0cd299f1-goog


2024-06-04 16:31:24

by Doug Anderson

Subject: [PATCH v3 5/7] serial: qcom-geni: Just set the watermark level once

There's no reason to set the TX watermark level to 0 when we disable
TX since we're disabling the interrupt anyway. Just set the watermark
level once at init time and leave it alone.

Signed-off-by: Douglas Anderson <[email protected]>
---

(no changes since v2)

Changes in v2:
- New

drivers/tty/serial/qcom_geni_serial.c | 5 +----
1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/drivers/tty/serial/qcom_geni_serial.c b/drivers/tty/serial/qcom_geni_serial.c
index c4c54359d32d..4dbc59873b34 100644
--- a/drivers/tty/serial/qcom_geni_serial.c
+++ b/drivers/tty/serial/qcom_geni_serial.c
@@ -392,7 +392,6 @@ static int qcom_geni_serial_get_char(struct uart_port *uport)
static void qcom_geni_serial_poll_put_char(struct uart_port *uport,
unsigned char c)
{
- writel(DEF_TX_WM, uport->membase + SE_GENI_TX_WATERMARK_REG);
qcom_geni_serial_setup_tx(uport, 1);
WARN_ON(!qcom_geni_serial_poll_bit(uport, SE_GENI_M_IRQ_STATUS,
M_TX_FIFO_WATERMARK_EN, true));
@@ -436,7 +435,6 @@ __qcom_geni_serial_console_write(struct uart_port *uport, const char *s,
bytes_to_send++;
}

- writel(DEF_TX_WM, uport->membase + SE_GENI_TX_WATERMARK_REG);
qcom_geni_serial_setup_tx(uport, bytes_to_send);
for (i = 0; i < count; ) {
size_t chars_to_write = 0;
@@ -664,7 +662,6 @@ static void qcom_geni_serial_start_tx_fifo(struct uart_port *uport)
irq_en = readl(uport->membase + SE_GENI_M_IRQ_EN);
irq_en |= M_TX_FIFO_WATERMARK_EN | M_CMD_DONE_EN;

- writel(DEF_TX_WM, uport->membase + SE_GENI_TX_WATERMARK_REG);
writel(irq_en, uport->membase + SE_GENI_M_IRQ_EN);
}

@@ -675,7 +672,6 @@ static void qcom_geni_serial_stop_tx_fifo(struct uart_port *uport)

irq_en = readl(uport->membase + SE_GENI_M_IRQ_EN);
irq_en &= ~(M_CMD_DONE_EN | M_TX_FIFO_WATERMARK_EN);
- writel(0, uport->membase + SE_GENI_TX_WATERMARK_REG);
writel(irq_en, uport->membase + SE_GENI_M_IRQ_EN);
/* Possible stop tx is called multiple times. */
if (!qcom_geni_serial_main_active(uport))
@@ -1127,6 +1123,7 @@ static int qcom_geni_serial_port_setup(struct uart_port *uport)
false, true, true);
geni_se_init(&port->se, UART_RX_WM, port->rx_fifo_depth - 2);
geni_se_select_mode(&port->se, port->dev_data->mode);
+ writel(DEF_TX_WM, uport->membase + SE_GENI_TX_WATERMARK_REG);
qcom_geni_serial_start_rx(uport);
port->setup = true;

--
2.45.1.288.g0e0cd299f1-goog


2024-06-04 16:37:58

by Doug Anderson

Subject: [PATCH v3 3/7] serial: qcom-geni: Fix arg types for qcom_geni_serial_poll_bit()

The "offset" passed in should be unsigned since it's always a positive
offset from our memory mapped IO.

The "field" should be u32 since we're anding it with a 32-bit value
read from the device.

Suggested-by: Stephen Boyd <[email protected]>
Signed-off-by: Douglas Anderson <[email protected]>
---

(no changes since v2)

Changes in v2:
- New

drivers/tty/serial/qcom_geni_serial.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/tty/serial/qcom_geni_serial.c b/drivers/tty/serial/qcom_geni_serial.c
index a48a15c2555e..e5effc2f5878 100644
--- a/drivers/tty/serial/qcom_geni_serial.c
+++ b/drivers/tty/serial/qcom_geni_serial.c
@@ -265,7 +265,7 @@ static bool qcom_geni_serial_secondary_active(struct uart_port *uport)
}

static bool qcom_geni_serial_poll_bit(struct uart_port *uport,
- int offset, int field, bool set)
+ unsigned int offset, u32 field, bool set)
{
u32 reg;
unsigned long timeout_us;
--
2.45.1.288.g0e0cd299f1-goog


2024-06-04 16:44:20

by Doug Anderson

Subject: [PATCH v3 2/7] serial: qcom-geni: Fix the timeout in qcom_geni_serial_poll_bit()

qcom_geni_serial_poll_bit() is supposed to be usable for polling a
bit that will become set when a TX transfer finishes. Because
of this it tries to set its timeout based on how long the UART will
take to shift out all of the queued bytes. There are two problems
here:
1. There appears to be a hidden extra word on the firmware side which
is the word that the firmware has already taken out of the FIFO and
is currently shifting out. We need to account for this.
2. The timeout calculation was assuming that it would only need 8 bits
on the wire to shift out 1 byte. This isn't true. Typically 10 bits
are used (8 data bits, 1 start bit and 1 stop bit), but as many as 13
bits could be used (14 if we allowed 9 bits per byte, which we
don't).

The too-short timeout was seen causing problems in a future patch
which more properly waited for bytes to transfer out of the UART
before cancelling.
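To put rough numbers on the two problems above, here is a small worked calculation. It assumes the driver's default FIFO geometry (16 words of 32 bits) and the 115200 fallback baud rate; real hardware may report a different depth. `old_timeout_us()` mirrors the removed formula, while `wire_time_us()` is a hypothetical helper that counts one extra in-flight word and a configurable number of bits per byte on the wire.

```c
#include <stdint.h>

#define USEC_PER_SEC	1000000ULL

/* The removed estimate: FIFO size in bits at the baud rate, plus 500us. */
static uint64_t old_timeout_us(uint32_t depth_words, uint32_t width_bits,
			       uint32_t baud)
{
	uint64_t fifo_bits = (uint64_t)depth_words * width_bits;

	return fifo_bits * USEC_PER_SEC / baud + 500;
}

/*
 * Time actually needed on the wire: every byte in the FIFO, plus one
 * extra word already taken out of the FIFO, at "frame_bits" bits per
 * byte (start + data + optional parity + stop).
 */
static uint64_t wire_time_us(uint32_t depth_words, uint32_t width_bits,
			     uint32_t frame_bits, uint32_t baud)
{
	uint64_t bytes = (uint64_t)(depth_words + 1) * (width_bits / 8);

	return bytes * frame_bits * USEC_PER_SEC / baud;
}
```

With these assumptions `old_timeout_us(16, 32, 115200)` gives 4944us, while `wire_time_us(16, 32, 10, 115200)` gives 5902us and the 13-bit worst case gives 7673us, so the old timeout could expire before a full FIFO had drained.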

Rather than fix the calculation, replace it with the core-provided
uart_fifo_timeout() function.

NOTE: during earlycon, uart_fifo_timeout() has the same limitations
about not being able to figure out the exact timeout that the old
function did. Luckily uart_fifo_timeout() returns the same default
timeout of 20ms in this case. We'll add a comment about it, though, to
make it more obvious what's happening.

Fixes: c4f528795d1a ("tty: serial: msm_geni_serial: Add serial driver support for GENI based QUP")
Suggested-by: Ilpo Järvinen <[email protected]>
Signed-off-by: Douglas Anderson <[email protected]>
---

Changes in v3:
- Use uart_fifo_timeout() for timeout.

Changes in v2:
- New

drivers/tty/serial/qcom_geni_serial.c | 37 +++++++++++++--------------
1 file changed, 18 insertions(+), 19 deletions(-)

diff --git a/drivers/tty/serial/qcom_geni_serial.c b/drivers/tty/serial/qcom_geni_serial.c
index 2bd25afe0d92..a48a15c2555e 100644
--- a/drivers/tty/serial/qcom_geni_serial.c
+++ b/drivers/tty/serial/qcom_geni_serial.c
@@ -124,7 +124,6 @@ struct qcom_geni_serial_port {
dma_addr_t tx_dma_addr;
dma_addr_t rx_dma_addr;
bool setup;
- unsigned int baud;
unsigned long clk_rate;
void *rx_buf;
u32 loopback;
@@ -269,24 +268,25 @@ static bool qcom_geni_serial_poll_bit(struct uart_port *uport,
int offset, int field, bool set)
{
u32 reg;
- struct qcom_geni_serial_port *port;
- unsigned int baud;
- unsigned int fifo_bits;
- unsigned long timeout_us = 20000;
- struct qcom_geni_private_data *private_data = uport->private_data;
+ unsigned long timeout_us;

- if (private_data->drv) {
- port = to_dev_port(uport);
- baud = port->baud;
- if (!baud)
- baud = 115200;
- fifo_bits = port->tx_fifo_depth * port->tx_fifo_width;
- /*
- * Total polling iterations based on FIFO worth of bytes to be
- * sent at current baud. Add a little fluff to the wait.
- */
- timeout_us = ((fifo_bits * USEC_PER_SEC) / baud) + 500;
- }
+ /*
+ * This function is used to poll bits, some of which (like CMD_DONE)
+ * might take as long as it takes for the FIFO plus the temp register
+ * on the geni side to drain. The Linux core calculates such a timeout
+ * for us and we can get it from uart_fifo_timeout().
+ *
+ * It should be noted that during earlycon the variables that
+ * uart_fifo_timeout() makes use of in "uport" may not be setup yet.
+ * It's difficult to set things up for earlycon since it can't
+ * necessarily figure out the baud rate and reading the FIFO depth
+ * from the wrapper means some extra MMIO maps that we don't get by
+ * default. This isn't a big problem, though, since uart_fifo_timeout()
+ * gives back its "slop" of 20ms as a minimum and that should be
+ * plenty of time for earlycon unless we're running at an extremely
+ * low baud rate.
+ */
+ timeout_us = jiffies_to_usecs(uart_fifo_timeout(uport));

/*
* Use custom implementation instead of readl_poll_atomic since ktimer
@@ -1224,7 +1224,6 @@ static void qcom_geni_serial_set_termios(struct uart_port *uport,
qcom_geni_serial_stop_rx(uport);
/* baud rate */
baud = uart_get_baud_rate(uport, termios, old, 300, 4000000);
- port->baud = baud;

sampling_rate = UART_OVERSAMPLING;
/* Sampling rate is halved for IP versions >= 2.5 */
--
2.45.1.288.g0e0cd299f1-goog


2024-06-04 16:46:30

by Doug Anderson

Subject: [PATCH v3 7/7] serial: qcom-geni: Rework TX in FIFO mode to fix hangs/lockups

The fact that the Qualcomm GENI hardware interface is based around
"packets" is really awkward to fit into Linux's UART design.
Specifically, in order to send bytes you need to start up a new
"command" saying how many bytes you want to send and then you need to
send all those bytes. Once you've committed to sending that number of
bytes it's very awkward to change your mind and send fewer, especially
if you want to do so without dropping bytes on the ground.

There may be a few cases where you might want to send fewer bytes than
you originally expected:
1. You might want to interrupt the transfer with something higher
priority, like the kernel console or kdb.
2. You might want to enter system suspend.
3. The user might have killed the program that had queued bytes for
sending over the UART.

Despite this awkwardness the Linux driver has still tried to send
bytes using large transfers. Whenever the driver started a new
transfer it would look at the number of bytes in the OS's queue and
start a transfer for that many. The idea of using larger transfers is
that it should be more efficient. When you're in the middle of a large
transfer you can get interrupted when the hardware FIFO is close to
empty and add more bytes in. Whenever you get to the end of a transfer
you have to wait until the transfer is totally done before you can add
more bytes and, depending on interrupt latency, that can cause the
UART to idle a bit.

Unfortunately there were lots of corner cases that the Linux driver
didn't handle.

One problem with the current driver is that if the user killed the
program that queued bytes for sending over the UART then bad things
would happen. Before commit 1788cf6a91d9 ("tty: serial: switch from
circ_buf to kfifo") we'd just send stale data out the UART. After that
commit we'll hard lockup.

Another problem with the current driver can be seen if you queue a
bunch of data to the UART and enter kdb. Specifically on a device
_without_ kernel console on the UART, with an agetty on the UART, and
with kgdb on the UART, doing `cat /var/log/messages` and then dropping
into kdb and resuming caused console output to stop.

Give up on trying to use large transfers in FIFO mode on GENI UART
since there doesn't appear to be any way to solve these problems
cleanly. Visually inspecting the console output even after these
patches doesn't show any big pauses.

In order to make this all work:
- Switch the watermark interrupt to just being used to prime the TX
pump. Once transfers are running, use "done" to queue the next
batch. As part of this, change the watermark to fire whenever the
queue is empty.
- Never queue more than what can fit in the FIFO. This means we don't
need to keep track of a command we're partway through.
- For the console code and kgdb code where we can safely block while
the queue empties, just do that rather than trying to queue a
command when one was already in progress (which didn't work so well
and is why there were some weird/awkward hacks in
qcom_geni_serial_console_write()).
- Leave the CMD_DONE interrupt enabled all the time since there's
never any reason we don't want to see it.
- Start using the "SE_GENI_M_IRQ_EN_SET" and "SE_GENI_M_IRQ_EN_CLEAR"
registers to avoid read-modify-write of the "SE_GENI_M_IRQ_EN"
register. This could be done in more of the driver if needed but for
now just update code that's touched.
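The "never queue more than what can fit in the FIFO" rule can be sketched as a simple chunking policy. This is an illustration under assumptions, not the driver's code: `next_chunk()` and `count_commands()` are hypothetical helpers, and the `stopped` parameter models the "don't queue any more" flag set by stop_tx().

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Size of the next TX command: never more than the FIFO can hold, and
 * nothing at all while transmission is stopped.
 */
static size_t next_chunk(size_t pending, size_t fifo_bytes, bool stopped)
{
	if (stopped)
		return 0;
	return pending < fifo_bytes ? pending : fifo_bytes;
}

/* How many separate commands it takes to send "pending" bytes. */
static unsigned int count_commands(size_t pending, size_t fifo_bytes)
{
	unsigned int cmds = 0;

	if (!fifo_bytes)
		return 0;	/* avoid looping forever on a bogus FIFO size */

	while (pending) {
		pending -= next_chunk(pending, fifo_bytes, false);
		cmds++;		/* each chunk ends with its own "done" */
	}
	return cmds;
}
```

For example, with a 64-byte FIFO and 150 bytes pending, the commands would be 64, 64 and 22 bytes long. Because each command is fully loaded into the FIFO when it starts, there is never a partially queued command to track or cancel.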

Fixes: 1788cf6a91d9 ("tty: serial: switch from circ_buf to kfifo")
Fixes: a1fee899e5be ("tty: serial: qcom_geni_serial: Fix softlock")
Signed-off-by: Douglas Anderson <[email protected]>
---
I'm listing two "fixes" commits here. The first is the kfifo change
since it is very easy to see a hardlockup after that change. Almost
certainly anyone with the kfifo patch wants this patch. I've also
listed a much earlier patch as one being fixed since that was the one
that made us send larger transfers.

I've tested this commit on an sc7180-trogdor board both with and
without kernel console going to the UART. I've tested across some
suspend/resume cycles and with kgdb. I've also confirmed that
bluetooth, which uses the DMA paths in this driver, continues to work.
That all being said, a lot of things change here so I'd love any
testing folks want to do.

I'm not explicitly CCing stable here. The only truly terrible problem
is the hardlockup introduced by the kfifo change. The rest of the
issues have been around for years. If someone wants the fixes ported
back to stable that's fine but IMO unless you're seeing problems it's
not 100% required.

Changes in v3:
- Reword commit message.

Changes in v2:
- New

drivers/tty/serial/qcom_geni_serial.c | 192 +++++++++++++-------------
1 file changed, 94 insertions(+), 98 deletions(-)

diff --git a/drivers/tty/serial/qcom_geni_serial.c b/drivers/tty/serial/qcom_geni_serial.c
index 46b6674d90c5..204c6f40d7f2 100644
--- a/drivers/tty/serial/qcom_geni_serial.c
+++ b/drivers/tty/serial/qcom_geni_serial.c
@@ -78,7 +78,7 @@
#define GENI_UART_CONS_PORTS 1
#define GENI_UART_PORTS 3
#define DEF_FIFO_DEPTH_WORDS 16
-#define DEF_TX_WM 2
+#define DEF_TX_WM 1
#define DEF_FIFO_WIDTH_BITS 32
#define UART_RX_WM 2

@@ -128,8 +128,8 @@ struct qcom_geni_serial_port {
void *rx_buf;
u32 loopback;
bool brk;
+ bool tx_fifo_stopped;

- unsigned int tx_remaining;
unsigned int tx_total;
int wakeup_irq;
bool rx_tx_swap;
@@ -337,6 +337,14 @@ static void qcom_geni_serial_poll_tx_done(struct uart_port *uport)
M_CMD_ABORT_EN, true);
}
writel(irq_clear, uport->membase + SE_GENI_M_IRQ_CLEAR);
+
+ /*
+ * Re-enable the TX watermark interrupt when we clear the "done"
+ * in case we were waiting on the "done" bit before starting a new
+ * command. The interrupt routine will re-disable this if it's not
+ * appropriate.
+ */
+ writel(M_TX_FIFO_WATERMARK_EN, uport->membase + SE_GENI_M_IRQ_EN_SET);
}

static void qcom_geni_serial_drain_tx_fifo(struct uart_port *uport)
@@ -358,7 +366,7 @@ static void qcom_geni_serial_drain_tx_fifo(struct uart_port *uport)
* get lost.
*/
qcom_geni_serial_poll_bitfield(uport, SE_GENI_M_GP_LENGTH, GENMASK(31, 0),
- port->tx_total - port->tx_remaining);
+ port->tx_total);

/*
* If clearing the FIFO made us inactive then we're done--no need for
@@ -387,14 +395,6 @@ static void qcom_geni_serial_drain_tx_fifo(struct uart_port *uport)
writel(M_CMD_ABORT_EN, uport->membase + SE_GENI_M_IRQ_CLEAR);
}
writel(M_CMD_CANCEL_EN, uport->membase + SE_GENI_M_IRQ_CLEAR);
-
- /*
- * We've cancelled the current command. "tx_remaining" stores how
- * many bytes are left to finish in the current command so we know
- * when to start a new command. Since the command was cancelled we
- * need to zero "tx_remaining".
- */
- port->tx_remaining = 0;
}

static void qcom_geni_serial_abort_rx(struct uart_port *uport)
@@ -454,11 +454,12 @@ static int qcom_geni_serial_get_char(struct uart_port *uport)
static void qcom_geni_serial_poll_put_char(struct uart_port *uport,
unsigned char c)
{
+ qcom_geni_serial_drain_tx_fifo(uport);
+
qcom_geni_serial_setup_tx(uport, 1);
WARN_ON(!qcom_geni_serial_poll_bit(uport, SE_GENI_M_IRQ_STATUS,
M_TX_FIFO_WATERMARK_EN, true));
writel(c, uport->membase + SE_GENI_TX_FIFOn);
- writel(M_TX_FIFO_WATERMARK_EN, uport->membase + SE_GENI_M_IRQ_CLEAR);
qcom_geni_serial_poll_tx_done(uport);
}
#endif
@@ -488,6 +489,8 @@ __qcom_geni_serial_console_write(struct uart_port *uport, const char *s,
int i;
u32 bytes_to_send = count;

+ qcom_geni_serial_drain_tx_fifo(uport);
+
for (i = 0; i < count; i++) {
/*
* uart_console_write() adds a carriage return for each newline.
@@ -538,7 +541,6 @@ static void qcom_geni_serial_console_write(struct console *co, const char *s,
bool locked = true;
unsigned long flags;
u32 geni_status;
- u32 irq_en;

WARN_ON(co->index < 0 || co->index >= GENI_UART_CONS_PORTS);

@@ -554,38 +556,10 @@ static void qcom_geni_serial_console_write(struct console *co, const char *s,

geni_status = readl(uport->membase + SE_GENI_STATUS);

- if (!locked) {
- /*
- * We can only get here if an oops is in progress then we were
- * unable to get the lock. This means we can't safely access
- * our state variables like tx_remaining. About the best we
- * can do is wait for the FIFO to be empty before we start our
- * transfer, so we'll do that.
- */
- qcom_geni_serial_poll_bit(uport, SE_GENI_M_IRQ_STATUS,
- M_TX_FIFO_NOT_EMPTY_EN, false);
- } else if ((geni_status & M_GENI_CMD_ACTIVE) && !port->tx_remaining) {
- /*
- * It seems we can't interrupt existing transfers if all data
- * has been sent, in which case we need to look for done first.
- */
- qcom_geni_serial_poll_tx_done(uport);
-
- if (!kfifo_is_empty(&uport->state->port.xmit_fifo)) {
- irq_en = readl(uport->membase + SE_GENI_M_IRQ_EN);
- writel(irq_en | M_TX_FIFO_WATERMARK_EN,
- uport->membase + SE_GENI_M_IRQ_EN);
- }
- }
-
__qcom_geni_serial_console_write(uport, s, count);

-
- if (locked) {
- if (port->tx_remaining)
- qcom_geni_serial_setup_tx(uport, port->tx_remaining);
+ if (locked)
uart_port_unlock_irqrestore(uport, flags);
- }
}

static void handle_rx_console(struct uart_port *uport, u32 bytes, bool drop)
@@ -662,9 +636,9 @@ static void qcom_geni_serial_stop_tx_dma(struct uart_port *uport)

if (port->tx_dma_addr) {
geni_se_tx_dma_unprep(&port->se, port->tx_dma_addr,
- port->tx_remaining);
+ port->tx_total);
port->tx_dma_addr = 0;
- port->tx_remaining = 0;
+ port->tx_total = 0;
}

geni_se_cancel_m_cmd(&port->se);
@@ -709,26 +683,27 @@ static void qcom_geni_serial_start_tx_dma(struct uart_port *uport)
qcom_geni_serial_stop_tx_dma(uport);
return;
}
-
- port->tx_remaining = xmit_size;
}

static void qcom_geni_serial_start_tx_fifo(struct uart_port *uport)
{
- u32 irq_en;
+ struct qcom_geni_serial_port *port = to_dev_port(uport);

- irq_en = readl(uport->membase + SE_GENI_M_IRQ_EN);
- irq_en |= M_TX_FIFO_WATERMARK_EN | M_CMD_DONE_EN;
- writel(irq_en, uport->membase + SE_GENI_M_IRQ_EN);
+ port->tx_fifo_stopped = false;
+
+ /* Prime the pump to get data flowing. */
+ writel(M_TX_FIFO_WATERMARK_EN, uport->membase + SE_GENI_M_IRQ_EN_SET);
}

static void qcom_geni_serial_stop_tx_fifo(struct uart_port *uport)
{
- u32 irq_en;
+ struct qcom_geni_serial_port *port = to_dev_port(uport);

- irq_en = readl(uport->membase + SE_GENI_M_IRQ_EN);
- irq_en &= ~(M_CMD_DONE_EN | M_TX_FIFO_WATERMARK_EN);
- writel(irq_en, uport->membase + SE_GENI_M_IRQ_EN);
+ /*
+ * We can't do anything to safely pause the bytes that have already
+ * been queued up so just set a flag saying we shouldn't queue any more.
+ */
+ port->tx_fifo_stopped = true;
}

static void qcom_geni_serial_handle_rx_fifo(struct uart_port *uport, bool drop)
@@ -896,10 +871,20 @@ static void qcom_geni_serial_stop_tx(struct uart_port *uport)
uport->ops->stop_tx(uport);
}

+static void qcom_geni_serial_enable_cmd_done(struct uart_port *uport)
+{
+ struct qcom_geni_serial_port *port = to_dev_port(uport);
+
+ /* If we're not in FIFO mode we don't use CMD_DONE. */
+ if (port->dev_data->mode != GENI_SE_FIFO)
+ return;
+
+ writel(M_CMD_DONE_EN, uport->membase + SE_GENI_M_IRQ_EN_SET);
+}
+
static void qcom_geni_serial_send_chunk_fifo(struct uart_port *uport,
unsigned int chunk)
{
- struct qcom_geni_serial_port *port = to_dev_port(uport);
unsigned int tx_bytes, remaining = chunk;
u8 buf[BYTES_PER_FIFO_WORD];

@@ -912,52 +897,74 @@ static void qcom_geni_serial_send_chunk_fifo(struct uart_port *uport,
iowrite32_rep(uport->membase + SE_GENI_TX_FIFOn, buf, 1);

remaining -= tx_bytes;
- port->tx_remaining -= tx_bytes;
}
}

-static void qcom_geni_serial_handle_tx_fifo(struct uart_port *uport,
- bool done, bool active)
+static void qcom_geni_serial_handle_tx_fifo(struct uart_port *uport)
{
struct qcom_geni_serial_port *port = to_dev_port(uport);
struct tty_port *tport = &uport->state->port;
size_t avail;
size_t pending;
u32 status;
- u32 irq_en;
unsigned int chunk;
+ bool active;

- status = readl(uport->membase + SE_GENI_TX_FIFO_STATUS);
-
- /* Complete the current tx command before taking newly added data */
- if (active)
- pending = port->tx_remaining;
- else
- pending = kfifo_len(&tport->xmit_fifo);
+ /*
+ * The TX watermark interrupt is only used to "prime the pump" for
+ * transfers. Once transfers have been kicked off we always use the
+ * "done" interrupt to queue the next batch. Once we're here we can
+ * always disable the TX watermark interrupt.
+ *
+ * NOTE: we use the TX watermark in this way because we don't ever
+ * kick off TX transfers larger than we can stuff into the FIFO. This
+ * is because bytes from the OS's circular queue can disappear and
+ * there's no known safe/non-blocking way to cancel the larger
+ * transfer when bytes disappear. See qcom_geni_serial_drain_tx_fifo()
+ * for an example of a safe (but blocking) way to drain, but that's
+ * not appropriate in an IRQ handler. We also can't just kick off one
+ * large transfer and queue bytes whenever because we're using 4 bytes
+ * per FIFO word and thus we can only queue a non-multiple-of-4 byte
+ * count in the last word of a transfer.
+ */
+ writel(M_TX_FIFO_WATERMARK_EN, uport->membase + SE_GENI_M_IRQ_EN_CLEAR);

- /* All data has been transmitted and acknowledged as received */
- if (!pending && !status && done) {
- qcom_geni_serial_stop_tx_fifo(uport);
+ /*
+ * If we've got an active TX command running then we expect to still
+ * see the "done" bit in the future and we can't kick off another
+ * transfer till then. Bail. NOTE: it's important that we read "active"
+ * after we've cleared the "done" interrupt (which the caller already
+ * did for us) so that we know that if we show as non-active we're
+ * guaranteed to later get "done".
+ *
+ * If nothing is pending we _also_ want to bail. Later start_tx()
+ * will start transfers again by temporarily turning on the TX
+ * watermark.
+ */
+ active = readl(uport->membase + SE_GENI_STATUS) & M_GENI_CMD_ACTIVE;
+ pending = port->tx_fifo_stopped ? 0 : kfifo_len(&tport->xmit_fifo);
+ if (active || !pending)
goto out_write_wakeup;
- }

+ /* Calculate how much space is available in the FIFO right now. */
+ status = readl(uport->membase + SE_GENI_TX_FIFO_STATUS);
avail = port->tx_fifo_depth - (status & TX_FIFO_WC);
avail *= BYTES_PER_FIFO_WORD;

- chunk = min(avail, pending);
- if (!chunk)
+ /*
+ * It's a bit odd if we get here and have bytes pending and we're
+ * handling a "done" or "TX watermark" interrupt but we don't
+ * have space in the FIFO. Stick in a warning and bail.
+ */
+ if (!avail) {
+ dev_warn(uport->dev, "FIFO unexpectedly out of space\n");
goto out_write_wakeup;
-
- if (!port->tx_remaining) {
- qcom_geni_serial_setup_tx(uport, pending);
- port->tx_remaining = pending;
-
- irq_en = readl(uport->membase + SE_GENI_M_IRQ_EN);
- if (!(irq_en & M_TX_FIFO_WATERMARK_EN))
- writel(irq_en | M_TX_FIFO_WATERMARK_EN,
- uport->membase + SE_GENI_M_IRQ_EN);
}

+
+ /* We're ready to throw some bytes into the FIFO. */
+ chunk = min(avail, pending);
+ qcom_geni_serial_setup_tx(uport, chunk);
qcom_geni_serial_send_chunk_fifo(uport, chunk);

/*
@@ -965,17 +972,9 @@ static void qcom_geni_serial_handle_tx_fifo(struct uart_port *uport,
* cleared it in qcom_geni_serial_isr it will have already reasserted
* so we must clear it again here after our writes.
*/
- writel(M_TX_FIFO_WATERMARK_EN,
- uport->membase + SE_GENI_M_IRQ_CLEAR);
+ writel(M_TX_FIFO_WATERMARK_EN, uport->membase + SE_GENI_M_IRQ_CLEAR);

out_write_wakeup:
- if (!port->tx_remaining) {
- irq_en = readl(uport->membase + SE_GENI_M_IRQ_EN);
- if (irq_en & M_TX_FIFO_WATERMARK_EN)
- writel(irq_en & ~M_TX_FIFO_WATERMARK_EN,
- uport->membase + SE_GENI_M_IRQ_EN);
- }
-
if (kfifo_len(&tport->xmit_fifo) < WAKEUP_CHARS)
uart_write_wakeup(uport);
}
@@ -985,10 +984,10 @@ static void qcom_geni_serial_handle_tx_dma(struct uart_port *uport)
struct qcom_geni_serial_port *port = to_dev_port(uport);
struct tty_port *tport = &uport->state->port;

- uart_xmit_advance(uport, port->tx_remaining);
- geni_se_tx_dma_unprep(&port->se, port->tx_dma_addr, port->tx_remaining);
+ uart_xmit_advance(uport, port->tx_total);
+ geni_se_tx_dma_unprep(&port->se, port->tx_dma_addr, port->tx_total);
port->tx_dma_addr = 0;
- port->tx_remaining = 0;
+ port->tx_total = 0;

if (!kfifo_is_empty(&tport->xmit_fifo))
qcom_geni_serial_start_tx_dma(uport);
@@ -1002,7 +1001,6 @@ static irqreturn_t qcom_geni_serial_isr(int isr, void *dev)
u32 m_irq_en;
u32 m_irq_status;
u32 s_irq_status;
- u32 geni_status;
u32 dma;
u32 dma_tx_status;
u32 dma_rx_status;
@@ -1020,7 +1018,6 @@ static irqreturn_t qcom_geni_serial_isr(int isr, void *dev)
s_irq_status = readl(uport->membase + SE_GENI_S_IRQ_STATUS);
dma_tx_status = readl(uport->membase + SE_DMA_TX_IRQ_STAT);
dma_rx_status = readl(uport->membase + SE_DMA_RX_IRQ_STAT);
- geni_status = readl(uport->membase + SE_GENI_STATUS);
dma = readl(uport->membase + SE_GENI_DMA_MODE_EN);
m_irq_en = readl(uport->membase + SE_GENI_M_IRQ_EN);
writel(m_irq_status, uport->membase + SE_GENI_M_IRQ_CLEAR);
@@ -1067,9 +1064,7 @@ static irqreturn_t qcom_geni_serial_isr(int isr, void *dev)
} else {
if (m_irq_status & m_irq_en &
(M_TX_FIFO_WATERMARK_EN | M_CMD_DONE_EN))
- qcom_geni_serial_handle_tx_fifo(uport,
- m_irq_status & M_CMD_DONE_EN,
- geni_status & M_GENI_CMD_ACTIVE);
+ qcom_geni_serial_handle_tx_fifo(uport);

if (s_irq_status & (S_RX_FIFO_WATERMARK_EN | S_RX_FIFO_LAST_EN))
qcom_geni_serial_handle_rx_fifo(uport, drop_rx);
@@ -1177,6 +1172,7 @@ static int qcom_geni_serial_port_setup(struct uart_port *uport)
geni_se_init(&port->se, UART_RX_WM, port->rx_fifo_depth - 2);
geni_se_select_mode(&port->se, port->dev_data->mode);
writel(DEF_TX_WM, uport->membase + SE_GENI_TX_WATERMARK_REG);
+ qcom_geni_serial_enable_cmd_done(uport);
qcom_geni_serial_start_rx(uport);
port->setup = true;

--
2.45.1.288.g0e0cd299f1-goog


2024-06-07 09:03:06

by Ilpo Järvinen

Subject: Re: [PATCH v3 6/7] serial: qcom-geni: Fix suspend while active UART xfer

On Tue, 4 Jun 2024, Douglas Anderson wrote:

> On devices using Qualcomm's GENI UART it is possible to get the UART
> stuck such that it no longer outputs data. Specifically, logging in
> via an agetty on the debug serial port (which was _not_ used for
> kernel console) and running:
> cat /var/log/messages
> ...and then (via an SSH session) forcing a few suspend/resume cycles
> causes the UART to stop transmitting.
>
> The root of the problems was with qcom_geni_serial_stop_tx_fifo()
> which is called as part of the suspend process. Specific problems with
> that function:
> - When an in-progress "tx" command is cancelled it doesn't appear to
> fully drain the FIFO. That meant qcom_geni_serial_tx_empty()
> continued to report that the FIFO wasn't empty. The
> qcom_geni_serial_start_tx_fifo() function didn't re-enable
> interrupts in this case so the driver would never start transferring
> again.
> - When the driver cancelled the current "tx" command it forgot to
> zero out "tx_remaining". This confused logic elsewhere in the
> driver.
> - From experimentation, it appears that cancelling the "tx" command
> could drop some of the queued up bytes.
>
> While qcom_geni_serial_stop_tx_fifo() could be fixed to drain the FIFO
> and shut things down properly, stop_tx() isn't supposed to be a slow
> function. It is run with local interrupts off and is documented to
> stop transmitting "as soon as possible". Change the function to just
> stop new bytes from being queued. In order to make this work, change
> qcom_geni_serial_start_tx_fifo() to remove some conditions. It's
> always safe to enable the watermark interrupt and the IRQ handler will
> disable it if it's not needed.
>
> For system suspend the queue still needs to be drained. Failure to do
> so means that the hardware won't provide new interrupts until a
> "cancel" command is sent. Add draining logic (fixing the issues noted
> above) at suspend time.
>
> NOTE: It would be ideal if qcom_geni_serial_stop_tx_fifo() could
> "pause" the transmitter right away. There is no obvious way to do this
> in the docs and experimentation didn't find any tricks either, so
> stopping TX "as soon as possible" isn't very soon but is the best
> possible.
>
> Fixes: c4f528795d1a ("tty: serial: msm_geni_serial: Add serial driver support for GENI based QUP")
> Signed-off-by: Douglas Anderson <[email protected]>
> ---
> There are still a number of problems with GENI UART after this but
> I've kept this change separate to make it easier to understand.
> Specifically on mainline just hitting "Ctrl-C" after dumping
> /var/log/messages to the serial port hangs things after the kfifo
> changes. Those issues will be addressed in future patches.
>
> It should also be noted that the "Fixes" tag here is a bit of a
> swag. I haven't gone and tested on ancient code, but at least the
> problems exist on kernel 5.15 and much of the code touched here has
> been here since the beginning, or at least since as long as the driver
> was stable.
>
> Changes in v3:
> - 0xffffffff => GENMASK(31, 0)
> - Reword commit message.
>
> Changes in v2:
> - Totally rework / rename patch to handle suspend while active xfer
>
> drivers/tty/serial/qcom_geni_serial.c | 97 +++++++++++++++++++++------
> 1 file changed, 75 insertions(+), 22 deletions(-)
>
> diff --git a/drivers/tty/serial/qcom_geni_serial.c b/drivers/tty/serial/qcom_geni_serial.c
> index 4dbc59873b34..46b6674d90c5 100644
> --- a/drivers/tty/serial/qcom_geni_serial.c
> +++ b/drivers/tty/serial/qcom_geni_serial.c
> @@ -130,6 +130,7 @@ struct qcom_geni_serial_port {
> bool brk;
>
> unsigned int tx_remaining;
> + unsigned int tx_total;
> int wakeup_irq;
> bool rx_tx_swap;
> bool cts_rts_swap;
> @@ -311,11 +312,14 @@ static bool qcom_geni_serial_poll_bit(struct uart_port *uport,
>
> static void qcom_geni_serial_setup_tx(struct uart_port *uport, u32 xmit_size)
> {
> + struct qcom_geni_serial_port *port = to_dev_port(uport);
> u32 m_cmd;
>
> writel(xmit_size, uport->membase + SE_UART_TX_TRANS_LEN);
> m_cmd = UART_START_TX << M_OPCODE_SHFT;

Unrelated to this patch and doesn't belong in it, but I noticed it
while reviewing. This could be converted into:

m_cmd = FIELD_PREP(M_OPCODE_MSK, UART_START_TX);

(and after converting the other use in the header file, the SHFT define
becomes unused).
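
For anyone not familiar with FIELD_PREP(), here is a minimal userspace sketch of why the two forms give the same value. The EX_-prefixed names and the bit positions are assumptions made up for illustration and are not copied from the geni-se header, and ex_field_prep() is only a stand-in for the kernel macro:

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative values only; the real definitions live in the GENI SE header. */
#define EX_M_OPCODE_SHFT	27
#define EX_M_OPCODE_MSK		(0x1fu << EX_M_OPCODE_SHFT)
#define EX_UART_START_TX	0x1u

/*
 * Userspace stand-in for FIELD_PREP(): move val up to the lowest set bit
 * of mask and drop anything that doesn't fit in the field.
 */
static uint32_t ex_field_prep(uint32_t mask, uint32_t val)
{
	assert(mask != 0);

	/* Isolate the lowest set bit of the mask. */
	uint32_t lsb = mask & (~mask + 1);

	/* Multiplying by a single set bit is the same as shifting. */
	return (val * lsb) & mask;
}
```

With only the mask needed at the call site, the separate shift define has no remaining purpose, which is the point of the suggestion.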

> writel(m_cmd, uport->membase + SE_GENI_M_CMD0);
> +
> + port->tx_total = xmit_size;
> }
>
> static void qcom_geni_serial_poll_tx_done(struct uart_port *uport)
> @@ -335,6 +339,64 @@ static void qcom_geni_serial_poll_tx_done(struct uart_port *uport)
> writel(irq_clear, uport->membase + SE_GENI_M_IRQ_CLEAR);
> }
>
> +static void qcom_geni_serial_drain_tx_fifo(struct uart_port *uport)
> +{
> + struct qcom_geni_serial_port *port = to_dev_port(uport);
> +
> + /*
> + * If the main sequencer is inactive it means that the TX command has
> + * been completed and all bytes have been sent. Nothing to do in that
> + * case.
> + */
> + if (!qcom_geni_serial_main_active(uport))
> + return;
> +
> + /*
> + * Wait until the FIFO has been drained. We've already taken bytes out
> + * of the higher level queue in qcom_geni_serial_send_chunk_fifo() so
> + * if we don't drain the FIFO but send the "cancel" below they seem to
> + * get lost.
> + */
> + qcom_geni_serial_poll_bitfield(uport, SE_GENI_M_GP_LENGTH, GENMASK(31, 0),

That GENMASK(31, 0) is a field in a register (even if it covers the
entire register)? It should be named with a define instead of creating the
field mask here in an inline fashion.
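
Roughly what I have in mind, as a userspace sketch rather than driver code. EX_GP_LENGTH_MSK is a hypothetical name, and the accessor callback and fake counter exist only so the sketch runs without hardware. The real helper also delays between reads and uses a time-based timeout, which is left out here:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical name for the field; it happens to span the whole register. */
#define EX_GP_LENGTH_MSK	0xffffffffu

/* Hypothetical register accessor so the sketch has no MMIO dependency. */
typedef uint32_t (*ex_read_reg_fn)(void *ctx);

/* Poll until the masked register equals want or max_polls reads are used. */
static bool ex_poll_bitfield(ex_read_reg_fn read_reg, void *ctx,
			     uint32_t mask, uint32_t want,
			     unsigned int max_polls)
{
	assert(read_reg);

	while (max_polls--) {
		if ((read_reg(ctx) & mask) == want)
			return true;
	}

	return false;
}

/* Fake register that grows by one per read, like a count of bytes sent. */
static uint32_t ex_fake_counter(void *ctx)
{
	uint32_t *count = ctx;

	return (*count)++;
}
```

The call site then reads as "wait for the GP_LENGTH field to reach N" instead of carrying a bare GENMASK().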

> + port->tx_total - port->tx_remaining);
> +
> + /*
> + * If clearing the FIFO made us inactive then we're done--no need for
> + * a cancel.
> + */
> + if (!qcom_geni_serial_main_active(uport))
> + return;
> +
> + /*
> + * Cancel the current command. After this the main sequencer will
> + * stop reporting that it's active and we'll have to start a new
> + * transfer command.
> + *
> + * If we skip doing this cancel and then continue with a system
> + * suspend while there's an active command in the main sequencer
> + * then after resume time we won't get any more interrupts on the
> + * main sequencer until we send the cancel.
> + */
> + geni_se_cancel_m_cmd(&port->se);
> + if (!qcom_geni_serial_poll_bit(uport, SE_GENI_M_IRQ_STATUS,
> + M_CMD_CANCEL_EN, true)) {
> + /* The cancel failed; try an abort as a fallback. */
> + geni_se_abort_m_cmd(&port->se);
> + qcom_geni_serial_poll_bit(uport, SE_GENI_M_IRQ_STATUS,
> + M_CMD_ABORT_EN, true);

Misaligned.

--
i.

> + writel(M_CMD_ABORT_EN, uport->membase + SE_GENI_M_IRQ_CLEAR);
> + }
> + writel(M_CMD_CANCEL_EN, uport->membase + SE_GENI_M_IRQ_CLEAR);
> +
> + /*
> + * We've cancelled the current command. "tx_remaining" stores how
> + * many bytes are left to finish in the current command so we know
> + * when to start a new command. Since the command was cancelled we
> + * need to zero "tx_remaining".
> + */
> + port->tx_remaining = 0;
> +}
> +
> static void qcom_geni_serial_abort_rx(struct uart_port *uport)
> {
> u32 irq_clear = S_CMD_DONE_EN | S_CMD_ABORT_EN;
> @@ -655,37 +717,18 @@ static void qcom_geni_serial_start_tx_fifo(struct uart_port *uport)
> {
> u32 irq_en;
>
> - if (qcom_geni_serial_main_active(uport) ||
> - !qcom_geni_serial_tx_empty(uport))
> - return;
> -
> irq_en = readl(uport->membase + SE_GENI_M_IRQ_EN);
> irq_en |= M_TX_FIFO_WATERMARK_EN | M_CMD_DONE_EN;
> -
> writel(irq_en, uport->membase + SE_GENI_M_IRQ_EN);
> }
>
> static void qcom_geni_serial_stop_tx_fifo(struct uart_port *uport)
> {
> u32 irq_en;
> - struct qcom_geni_serial_port *port = to_dev_port(uport);
>
> irq_en = readl(uport->membase + SE_GENI_M_IRQ_EN);
> irq_en &= ~(M_CMD_DONE_EN | M_TX_FIFO_WATERMARK_EN);
> writel(irq_en, uport->membase + SE_GENI_M_IRQ_EN);
> - /* Possible stop tx is called multiple times. */
> - if (!qcom_geni_serial_main_active(uport))
> - return;
> -
> - geni_se_cancel_m_cmd(&port->se);
> - if (!qcom_geni_serial_poll_bit(uport, SE_GENI_M_IRQ_STATUS,
> - M_CMD_CANCEL_EN, true)) {
> - geni_se_abort_m_cmd(&port->se);
> - qcom_geni_serial_poll_bit(uport, SE_GENI_M_IRQ_STATUS,
> - M_CMD_ABORT_EN, true);
> - writel(M_CMD_ABORT_EN, uport->membase + SE_GENI_M_IRQ_CLEAR);
> - }
> - writel(M_CMD_CANCEL_EN, uport->membase + SE_GENI_M_IRQ_CLEAR);
> }
>
> static void qcom_geni_serial_handle_rx_fifo(struct uart_port *uport, bool drop)
> @@ -1067,7 +1110,15 @@ static int setup_fifos(struct qcom_geni_serial_port *port)
> }
>
>
> -static void qcom_geni_serial_shutdown(struct uart_port *uport)
> +static void qcom_geni_serial_shutdown_dma(struct uart_port *uport)
> +{
> + disable_irq(uport->irq);
> +
> + qcom_geni_serial_stop_tx(uport);
> + qcom_geni_serial_stop_rx(uport);
> +}
> +
> +static void qcom_geni_serial_shutdown_fifo(struct uart_port *uport)
> {
> disable_irq(uport->irq);
>
> @@ -1076,6 +1127,8 @@ static void qcom_geni_serial_shutdown(struct uart_port *uport)
>
> qcom_geni_serial_stop_tx(uport);
> qcom_geni_serial_stop_rx(uport);
> +
> + qcom_geni_serial_drain_tx_fifo(uport);
> }
>
> static int qcom_geni_serial_port_setup(struct uart_port *uport)
> @@ -1533,7 +1586,7 @@ static const struct uart_ops qcom_geni_console_pops = {
> .startup = qcom_geni_serial_startup,
> .request_port = qcom_geni_serial_request_port,
> .config_port = qcom_geni_serial_config_port,
> - .shutdown = qcom_geni_serial_shutdown,
> + .shutdown = qcom_geni_serial_shutdown_fifo,
> .type = qcom_geni_serial_get_type,
> .set_mctrl = qcom_geni_serial_set_mctrl,
> .get_mctrl = qcom_geni_serial_get_mctrl,
> @@ -1555,7 +1608,7 @@ static const struct uart_ops qcom_geni_uart_pops = {
> .startup = qcom_geni_serial_startup,
> .request_port = qcom_geni_serial_request_port,
> .config_port = qcom_geni_serial_config_port,
> - .shutdown = qcom_geni_serial_shutdown,
> + .shutdown = qcom_geni_serial_shutdown_dma,
> .type = qcom_geni_serial_get_type,
> .set_mctrl = qcom_geni_serial_set_mctrl,
> .get_mctrl = qcom_geni_serial_get_mctrl,
>


2024-06-07 10:16:51

by Ilpo Järvinen

Subject: Re: [PATCH v3 2/7] serial: qcom-geni: Fix the timeout in qcom_geni_serial_poll_bit()

On Tue, 4 Jun 2024, Douglas Anderson wrote:

> The qcom_geni_serial_poll_bit() function is supposed to be usable to
> poll a bit that will become set when a TX transfer finishes. Because
> of this it tries to set its timeout based on how long the UART will
> take to shift out all of the queued bytes. There are two problems
> here:
> 1. There appears to be a hidden extra word on the firmware side which
> is the word that the firmware has already taken out of the FIFO and
> is currently shifting out. We need to account for this.
> 2. The timeout calculation was assuming that it would only need 8 bits
> on the wire to shift out 1 byte. This isn't true. Typically 10 bits
> are used (8 data bits, 1 start and 1 stop bit), but as much as 13
> bits could be used (14 if we allowed 9 bits per byte, which we
> don't).
>
> The too-short timeout was seen causing problems in a future patch
> which more properly waited for bytes to transfer out of the UART
> before cancelling.
>
> Rather than fix the calculation, replace it with the core-provided
> uart_fifo_timeout() function.
>
> NOTE: during earlycon, uart_fifo_timeout() has the same limitations
> about not being able to figure out the exact timeout that the old
> function did. Luckily uart_fifo_timeout() returns the same default
> timeout of 20ms in this case. We'll add a comment about it, though, to
> make it more obvious what's happening.
>
> Fixes: c4f528795d1a ("tty: serial: msm_geni_serial: Add serial driver support for GENI based QUP")
> Suggested-by: Ilpo Järvinen <[email protected]>
> Signed-off-by: Douglas Anderson <[email protected]>
> ---
>
> Changes in v3:
> - Use uart_fifo_timeout() for timeout.
>
> Changes in v2:
> - New
>
> drivers/tty/serial/qcom_geni_serial.c | 37 +++++++++++++--------------
> 1 file changed, 18 insertions(+), 19 deletions(-)
>
> diff --git a/drivers/tty/serial/qcom_geni_serial.c b/drivers/tty/serial/qcom_geni_serial.c
> index 2bd25afe0d92..a48a15c2555e 100644
> --- a/drivers/tty/serial/qcom_geni_serial.c
> +++ b/drivers/tty/serial/qcom_geni_serial.c
> @@ -124,7 +124,6 @@ struct qcom_geni_serial_port {
> dma_addr_t tx_dma_addr;
> dma_addr_t rx_dma_addr;
> bool setup;
> - unsigned int baud;
> unsigned long clk_rate;
> void *rx_buf;
> u32 loopback;
> @@ -269,24 +268,25 @@ static bool qcom_geni_serial_poll_bit(struct uart_port *uport,
> int offset, int field, bool set)
> {
> u32 reg;
> - struct qcom_geni_serial_port *port;
> - unsigned int baud;
> - unsigned int fifo_bits;
> - unsigned long timeout_us = 20000;
> - struct qcom_geni_private_data *private_data = uport->private_data;
> + unsigned long timeout_us;
>
> - if (private_data->drv) {
> - port = to_dev_port(uport);
> - baud = port->baud;
> - if (!baud)
> - baud = 115200;
> - fifo_bits = port->tx_fifo_depth * port->tx_fifo_width;
> - /*
> - * Total polling iterations based on FIFO worth of bytes to be
> - * sent at current baud. Add a little fluff to the wait.
> - */
> - timeout_us = ((fifo_bits * USEC_PER_SEC) / baud) + 500;
> - }
> + /*
> + * This function is used to poll bits, some of which (like CMD_DONE)
> + * might take as long as it takes for the FIFO plus the temp register
> + * on the geni side to drain. The Linux core calculates such a timeout
> + * for us and we can get it from uart_fifo_timeout().
> + *
> + * It should be noted that during earlycon the variables that
> + * uart_fifo_timeout() makes use of in "uport" may not be setup yet.
> + * It's difficult to set things up for earlycon since it can't
> + * necessarily figure out the baud rate and reading the FIFO depth
> + * from the wrapper means some extra MMIO maps that we don't get by
> + * default. This isn't a big problem, though, since uart_fifo_timeout()
> + * gives back its "slop" of 20ms as a minimum and that should be
> + * plenty of time for earlycon unless we're running at an extremely
> + * low baud rate.
> + */
> + timeout_us = jiffies_to_usecs(uart_fifo_timeout(uport));

Hi,

While this is not exactly incorrect, the back and forth conversions nsecs
-> jiffies -> usecs feels somewhat odd, perhaps reworking
uart_fifo_timeout()'s return type from jiffies to e.g. usecs would be
preferable. As is, the jiffies as its return type seems a small obstacle
for using uart_fifo_timeout() which has come up in other contexts too.

> @@ -1224,7 +1224,6 @@ static void qcom_geni_serial_set_termios(struct uart_port *uport,
> qcom_geni_serial_stop_rx(uport);
> /* baud rate */
> baud = uart_get_baud_rate(uport, termios, old, 300, 4000000);
> - port->baud = baud;

It's always nice to see this kind of cache variable removed, good work. :-)

--
i.

2024-06-10 22:28:05

by Doug Anderson

Subject: Re: [PATCH v3 6/7] serial: qcom-geni: Fix suspend while active UART xfer

Hi,

On Fri, Jun 7, 2024 at 12:43 AM Ilpo Järvinen
<[email protected]> wrote:
>
> > @@ -311,11 +312,14 @@ static bool qcom_geni_serial_poll_bit(struct uart_port *uport,
> >
> > static void qcom_geni_serial_setup_tx(struct uart_port *uport, u32 xmit_size)
> > {
> > + struct qcom_geni_serial_port *port = to_dev_port(uport);
> > u32 m_cmd;
> >
> > writel(xmit_size, uport->membase + SE_UART_TX_TRANS_LEN);
> > m_cmd = UART_START_TX << M_OPCODE_SHFT;
>
> Unrelated to this patch and doesn't belong in it, but I noticed it
> while reviewing. This could be converted into:
>
> m_cmd = FIELD_PREP(M_OPCODE_MSK, UART_START_TX);
>
> (and after converting the other use in the header file, the SHFT define
> becomes unused).

Sure. I'm going to leave that to someone in the future, though. I've
already spent more time than I should on this series and, if we're
going to do this, we should convert the whole driver (and perhaps all
the geni drivers).


> > @@ -335,6 +339,64 @@ static void qcom_geni_serial_poll_tx_done(struct uart_port *uport)
> > writel(irq_clear, uport->membase + SE_GENI_M_IRQ_CLEAR);
> > }
> >
> > +static void qcom_geni_serial_drain_tx_fifo(struct uart_port *uport)
> > +{
> > + struct qcom_geni_serial_port *port = to_dev_port(uport);
> > +
> > + /*
> > + * If the main sequencer is inactive it means that the TX command has
> > + * been completed and all bytes have been sent. Nothing to do in that
> > + * case.
> > + */
> > + if (!qcom_geni_serial_main_active(uport))
> > + return;
> > +
> > + /*
> > + * Wait until the FIFO has been drained. We've already taken bytes out
> > + * of the higher level queue in qcom_geni_serial_send_chunk_fifo() so
> > + * if we don't drain the FIFO but send the "cancel" below they seem to
> > + * get lost.
> > + */
> > + qcom_geni_serial_poll_bitfield(uport, SE_GENI_M_GP_LENGTH, GENMASK(31, 0),
>
> That GENMASK(31, 0) is a field in a register (even if it covers the
> entire register)? It should be named with a define instead of creating the
> field mask here in an inline fashion.

Sure. Done.


> > + port->tx_total - port->tx_remaining);
> > +
> > + /*
> > + * If clearing the FIFO made us inactive then we're done--no need for
> > + * a cancel.
> > + */
> > + if (!qcom_geni_serial_main_active(uport))
> > + return;
> > +
> > + /*
> > + * Cancel the current command. After this the main sequencer will
> > + * stop reporting that it's active and we'll have to start a new
> > + * transfer command.
> > + *
> > + * If we skip doing this cancel and then continue with a system
> > + * suspend while there's an active command in the main sequencer
> > + * then after resume time we won't get any more interrupts on the
> > + * main sequencer until we send the cancel.
> > + */
> > + geni_se_cancel_m_cmd(&port->se);
> > + if (!qcom_geni_serial_poll_bit(uport, SE_GENI_M_IRQ_STATUS,
> > + M_CMD_CANCEL_EN, true)) {
> > + /* The cancel failed; try an abort as a fallback. */
> > + geni_se_abort_m_cmd(&port->se);
> > + qcom_geni_serial_poll_bit(uport, SE_GENI_M_IRQ_STATUS,
> > + M_CMD_ABORT_EN, true);
>
> Misaligned.

Done.

2024-06-10 22:28:31

by Doug Anderson

Subject: Re: [PATCH v3 2/7] serial: qcom-geni: Fix the timeout in qcom_geni_serial_poll_bit()

Hi,

On Fri, Jun 7, 2024 at 12:50 AM Ilpo Järvinen
<[email protected]> wrote:
>
> > + /*
> > + * This function is used to poll bits, some of which (like CMD_DONE)
> > + * might take as long as it takes for the FIFO plus the temp register
> > + * on the geni side to drain. The Linux core calculates such a timeout
> > + * for us and we can get it from uart_fifo_timeout().
> > + *
> > + * It should be noted that during earlycon the variables that
> > + * uart_fifo_timeout() makes use of in "uport" may not be setup yet.
> > + * It's difficult to set things up for earlycon since it can't
> > + * necessarily figure out the baud rate and reading the FIFO depth
> > + * from the wrapper means some extra MMIO maps that we don't get by
> > + * default. This isn't a big problem, though, since uart_fifo_timeout()
> > + * gives back its "slop" of 20ms as a minimum and that should be
> > + * plenty of time for earlycon unless we're running at an extremely
> > + * low baud rate.
> > + */
> > + timeout_us = jiffies_to_usecs(uart_fifo_timeout(uport));
>
> Hi,
>
> While this is not exactly incorrect, the back and forth conversions nsecs
> -> jiffies -> usecs feels somewhat odd, perhaps reworking
> uart_fifo_timeout()'s return type from jiffies to e.g. usecs would be
> preferable. As is, the jiffies as its return type seems a small obstacle
> for using uart_fifo_timeout() which has come up in other contexts too.

Sure. I'll change it to "ms" instead of "us". We don't need the
fidelity of "us" here given that the function is adding 20 ms of slop
anyway so might as well return ms so that callers don't need to do so
much math and don't need to work with u64.

This means that I'll have to add a "* USEC_PER_MSEC" in my driver, but
it still feels like the more correct thing to do. It also has the nice
side effect of allowing the driver to remove the awkward
"DIV_ROUND_UP(timeout_us, 10) * 10" because we know that the timeout
will always be a proper multiple.

I'll also add a new function with the _ms suffix instead of changing
the old one. The suffix makes it clear to the caller what the unit of
the returned value is and we might as well leave the old wrapper
there--otherwise we just need to move the jiffies conversion into the
existing callers.
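
To make the arithmetic concrete, here is a rough userspace sketch of the idea, not the code that will be posted. All of the ex_ names are made up for illustration, the frame time is assumed to be kept in nanoseconds, and rounding up to a whole millisecond is just a choice made in this sketch:

```c
#include <assert.h>
#include <stdint.h>

#define EX_NSEC_PER_SEC		1000000000ull
#define EX_NSEC_PER_MSEC	1000000ull
#define EX_USEC_PER_MSEC	1000ull

/* Time on the wire for one character, e.g. 10 bits for 8N1, rounded up. */
static uint64_t ex_frame_time_ns(unsigned int bits_per_char, unsigned int baud)
{
	assert(baud != 0);

	return (bits_per_char * EX_NSEC_PER_SEC + baud - 1) / baud;
}

/* Time to drain a full FIFO plus 20 ms of slop, in whole milliseconds. */
static uint64_t ex_fifo_timeout_ms(uint64_t frame_time_ns, unsigned int fifosize)
{
	uint64_t timeout_ns = frame_time_ns * fifosize;

	timeout_ns += 20 * EX_NSEC_PER_MSEC;

	return (timeout_ns + EX_NSEC_PER_MSEC - 1) / EX_NSEC_PER_MSEC;
}

/* What a driver that polls in microseconds would compute. */
static uint64_t ex_poll_timeout_us(uint64_t frame_time_ns, unsigned int fifosize)
{
	return ex_fifo_timeout_ms(frame_time_ns, fifosize) * EX_USEC_PER_MSEC;
}
```

Since the result is a whole number of milliseconds, the microsecond value is automatically a multiple of 10, which is why the round-up in the driver can go away. If the frame time is still zero (the earlycon case), only the 20 ms of slop is returned.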