2020-10-09 00:27:00

by Doug Anderson

Subject: [PATCH 0/3] i2c: i2c-qcom-geni: More properly fix the DMA race

Previously I landed commit 02b9aec59243 ("i2c: i2c-qcom-geni: Fix DMA
transfer race") to fix a race we were seeing. While that most
definitely fixed the race we were seeing, it looks like it causes
problems in the TX path, which we didn't stress test until we started
trying to update firmware on devices.

Let's revert that patch and try another way: fix the original problem
by disabling the interrupts that aren't relevant to DMA transfers.
Now we can stress both TX and RX cases and see no problems. I also
can't find any place to put an msleep() that causes problems anymore.

Since this problem only affects i2c, I'm hoping for an Ack from Bjorn
and then all these patches can go through the i2c tree. However, if
maintainers want to work out a different way to land them, that's OK too.

NOTE: the 3rd patch in the series could certainly be squashed with
patch #1 or I could re-order / rejigger. To me it seemed like a good
idea to first fix the problem (and make the two functions as much of
an inverse as possible) and later try to clean things up. Yell if you
want something different.


Douglas Anderson (3):
soc: qcom: geni: More properly switch to DMA mode
Revert "i2c: i2c-qcom-geni: Fix DMA transfer race"
soc: qcom: geni: Optimize select fifo/dma mode

drivers/i2c/busses/i2c-qcom-geni.c | 6 ++--
drivers/soc/qcom/qcom-geni-se.c | 47 ++++++++++++++++++++----------
2 files changed, 34 insertions(+), 19 deletions(-)

--
2.28.0.1011.ga647a8990f-goog


2020-10-09 00:27:18

by Doug Anderson

Subject: [PATCH 1/3] soc: qcom: geni: More properly switch to DMA mode

On geni-i2c transfers using DMA, it was seen that if you program the
command (I2C_READ) before calling geni_se_rx_dma_prep() that it could
cause interrupts to fire. If we get unlucky, these interrupts can
just keep firing (and not be handled) blocking further progress and
hanging the system.

In commit 02b9aec59243 ("i2c: i2c-qcom-geni: Fix DMA transfer race")
we avoided that by making sure we didn't program the command until
after geni_se_rx_dma_prep() was called. While that avoided the
problems, it also turns out to be invalid. At least in the TX case we
started seeing sporadic corrupted transfers. This is easily seen by
adding an msleep() between the DMA prep and the writing of the
command, which makes the problem worse. That means we need to revert
that commit and find another way to fix the bogus IRQs.

Specifically, after reverting commit 02b9aec59243 ("i2c:
i2c-qcom-geni: Fix DMA transfer race"), I put some traces in. I found
that when the interrupts were firing like crazy:
- "m_stat" had bits for M_RX_IRQ_EN, M_RX_FIFO_WATERMARK_EN set.
- "dma" was set.

Further debugging showed that I could make the problem happen more
reliably by adding an "msleep(1)" any time after geni_se_setup_m_cmd()
ran up until geni_se_rx_dma_prep() programmed the length.

A rather simple fix is to change geni_se_select_dma_mode() so it's a
true inverse of geni_se_select_fifo_mode() and disables all the FIFO
related interrupts. Now the problematic interrupts can't fire and we
can program things in the correct order without worrying.

As part of this, let's also change the writel_relaxed() in the prepare
function to a writel() so that our DMA is guaranteed to be prepared
now that we can't rely on geni_se_setup_m_cmd()'s writel().

NOTE: the only current user of GENI_SE_DMA in mainline is i2c.

Fixes: 37692de5d523 ("i2c: i2c-qcom-geni: Add bus driver for the Qualcomm GENI I2C controller")
Fixes: 02b9aec59243 ("i2c: i2c-qcom-geni: Fix DMA transfer race")
Signed-off-by: Douglas Anderson <[email protected]>
---

drivers/soc/qcom/qcom-geni-se.c | 17 +++++++++++++++--
1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/drivers/soc/qcom/qcom-geni-se.c b/drivers/soc/qcom/qcom-geni-se.c
index d0e4f520cff8..751a49f6534f 100644
--- a/drivers/soc/qcom/qcom-geni-se.c
+++ b/drivers/soc/qcom/qcom-geni-se.c
@@ -289,10 +289,23 @@ static void geni_se_select_fifo_mode(struct geni_se *se)

static void geni_se_select_dma_mode(struct geni_se *se)
{
+ u32 proto = geni_se_read_proto(se);
u32 val;

geni_se_irq_clear(se);

+ val = readl_relaxed(se->base + SE_GENI_M_IRQ_EN);
+ if (proto != GENI_SE_UART) {
+ val &= ~(M_CMD_DONE_EN | M_TX_FIFO_WATERMARK_EN);
+ val &= ~(M_RX_FIFO_WATERMARK_EN | M_RX_FIFO_LAST_EN);
+ }
+ writel_relaxed(val, se->base + SE_GENI_M_IRQ_EN);
+
+ val = readl_relaxed(se->base + SE_GENI_S_IRQ_EN);
+ if (proto != GENI_SE_UART)
+ val &= ~S_CMD_DONE_EN;
+ writel_relaxed(val, se->base + SE_GENI_S_IRQ_EN);
+
val = readl_relaxed(se->base + SE_GENI_DMA_MODE_EN);
val |= GENI_DMA_MODE_EN;
writel_relaxed(val, se->base + SE_GENI_DMA_MODE_EN);
@@ -651,7 +664,7 @@ int geni_se_tx_dma_prep(struct geni_se *se, void *buf, size_t len,
writel_relaxed(lower_32_bits(*iova), se->base + SE_DMA_TX_PTR_L);
writel_relaxed(upper_32_bits(*iova), se->base + SE_DMA_TX_PTR_H);
writel_relaxed(GENI_SE_DMA_EOT_BUF, se->base + SE_DMA_TX_ATTR);
- writel_relaxed(len, se->base + SE_DMA_TX_LEN);
+ writel(len, se->base + SE_DMA_TX_LEN);
return 0;
}
EXPORT_SYMBOL(geni_se_tx_dma_prep);
@@ -688,7 +701,7 @@ int geni_se_rx_dma_prep(struct geni_se *se, void *buf, size_t len,
writel_relaxed(upper_32_bits(*iova), se->base + SE_DMA_RX_PTR_H);
/* RX does not have EOT buffer type bit. So just reset RX_ATTR */
writel_relaxed(0, se->base + SE_DMA_RX_ATTR);
- writel_relaxed(len, se->base + SE_DMA_RX_LEN);
+ writel(len, se->base + SE_DMA_RX_LEN);
return 0;
}
EXPORT_SYMBOL(geni_se_rx_dma_prep);
--
2.28.0.1011.ga647a8990f-goog
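
For readers following along without the hardware, the idea of making geni_se_select_dma_mode() the inverse of geni_se_select_fifo_mode() can be sketched in plain userspace C. This is only an illustration under stated assumptions, not the driver code: struct fake_se, the fake_* functions and the FAKE_* bit values are made up for the example, and plain struct fields stand in for the memory mapped registers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative bit positions only (assumption); the real ones live in the GENI headers. */
#define FAKE_M_CMD_DONE_EN          (1u << 0)
#define FAKE_M_RX_FIFO_WATERMARK_EN (1u << 26)
#define FAKE_M_RX_FIFO_LAST_EN      (1u << 27)
#define FAKE_M_TX_FIFO_WATERMARK_EN (1u << 30)
#define FAKE_S_CMD_DONE_EN          (1u << 0)
#define FAKE_DMA_MODE_EN            (1u << 0)

/* All main-sequencer interrupts that only matter in FIFO mode. */
#define FAKE_M_FIFO_IRQS (FAKE_M_CMD_DONE_EN | FAKE_M_TX_FIFO_WATERMARK_EN | \
			  FAKE_M_RX_FIFO_WATERMARK_EN | FAKE_M_RX_FIFO_LAST_EN)

/* Hypothetical stand-in for the serial engine: fields instead of MMIO registers. */
struct fake_se {
	uint32_t m_irq_en;
	uint32_t s_irq_en;
	uint32_t dma_mode_en;
	int is_uart;
};

/* FIFO mode: enable the FIFO-related interrupts, clear the DMA mode bit. */
static void fake_select_fifo_mode(struct fake_se *se)
{
	assert(se != NULL);
	if (!se->is_uart) {
		se->m_irq_en |= FAKE_M_FIFO_IRQS;
		se->s_irq_en |= FAKE_S_CMD_DONE_EN;
	}
	se->dma_mode_en &= ~FAKE_DMA_MODE_EN;
}

/* DMA mode: the exact inverse, so no FIFO interrupt can fire during DMA. */
static void fake_select_dma_mode(struct fake_se *se)
{
	assert(se != NULL);
	if (!se->is_uart) {
		se->m_irq_en &= ~FAKE_M_FIFO_IRQS;
		se->s_irq_en &= ~FAKE_S_CMD_DONE_EN;
	}
	se->dma_mode_en |= FAKE_DMA_MODE_EN;
}
```

Calling fake_select_fifo_mode() and then fake_select_dma_mode() on the same fake_se leaves every unrelated interrupt enable bit as it was, which is the property the patch above is after.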

2020-10-09 00:28:54

by Doug Anderson

Subject: [PATCH 2/3] Revert "i2c: i2c-qcom-geni: Fix DMA transfer race"

This reverts commit 02b9aec59243c6240fc42884acc958602146ddf6.

As talked about in the patch ("soc: qcom: geni: More properly switch
to DMA mode"), swapping the order of geni_se_setup_m_cmd() and
geni_se_xx_dma_prep() can sometimes cause corrupted transfers. Thus
we traded one problem for another. Now that we've debugged the
problem further and fixed the geni helper functions to properly disable
FIFO interrupts when we move to DMA mode, we can revert it and end up
with (hopefully) zero problems!

To be explicit, the patch ("soc: qcom: geni: More properly switch
to DMA mode") is a prerequisite for this one.

Fixes: 02b9aec59243 ("i2c: i2c-qcom-geni: Fix DMA transfer race")
Signed-off-by: Douglas Anderson <[email protected]>
---

drivers/i2c/busses/i2c-qcom-geni.c | 6 ++----
1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/i2c/busses/i2c-qcom-geni.c b/drivers/i2c/busses/i2c-qcom-geni.c
index dead5db3315a..32b2a9921b14 100644
--- a/drivers/i2c/busses/i2c-qcom-geni.c
+++ b/drivers/i2c/busses/i2c-qcom-geni.c
@@ -367,6 +367,7 @@ static int geni_i2c_rx_one_msg(struct geni_i2c_dev *gi2c, struct i2c_msg *msg,
geni_se_select_mode(se, GENI_SE_FIFO);

writel_relaxed(len, se->base + SE_I2C_RX_TRANS_LEN);
+ geni_se_setup_m_cmd(se, I2C_READ, m_param);

if (dma_buf && geni_se_rx_dma_prep(se, dma_buf, len, &rx_dma)) {
geni_se_select_mode(se, GENI_SE_FIFO);
@@ -374,8 +375,6 @@ static int geni_i2c_rx_one_msg(struct geni_i2c_dev *gi2c, struct i2c_msg *msg,
dma_buf = NULL;
}

- geni_se_setup_m_cmd(se, I2C_READ, m_param);
-
time_left = wait_for_completion_timeout(&gi2c->done, XFER_TIMEOUT);
if (!time_left)
geni_i2c_abort_xfer(gi2c);
@@ -409,6 +408,7 @@ static int geni_i2c_tx_one_msg(struct geni_i2c_dev *gi2c, struct i2c_msg *msg,
geni_se_select_mode(se, GENI_SE_FIFO);

writel_relaxed(len, se->base + SE_I2C_TX_TRANS_LEN);
+ geni_se_setup_m_cmd(se, I2C_WRITE, m_param);

if (dma_buf && geni_se_tx_dma_prep(se, dma_buf, len, &tx_dma)) {
geni_se_select_mode(se, GENI_SE_FIFO);
@@ -416,8 +416,6 @@ static int geni_i2c_tx_one_msg(struct geni_i2c_dev *gi2c, struct i2c_msg *msg,
dma_buf = NULL;
}

- geni_se_setup_m_cmd(se, I2C_WRITE, m_param);
-
if (!dma_buf) /* Get FIFO IRQ */
writel_relaxed(1, se->base + SE_GENI_TX_WATERMARK_REG);

--
2.28.0.1011.ga647a8990f-goog

2020-10-09 00:29:42

by Doug Anderson

Subject: [PATCH 3/3] soc: qcom: geni: Optimize select fifo/dma mode

The functions geni_se_select_fifo_mode() and
geni_se_select_dma_mode() are a little funny. They read/write a
bunch of memory mapped registers even if they don't change or aren't
relevant for the current protocol. Let's make them a little more
sane.

NOTE: there is no evidence at all that this makes any performance
difference and it fixes no bugs. However, it seems (to me) like it
makes the functions a little easier to understand. Decreasing the
number of times we read/write memory mapped registers is also nice,
even if we are using "relaxed" variants.

Signed-off-by: Douglas Anderson <[email protected]>
---

drivers/soc/qcom/qcom-geni-se.c | 44 ++++++++++++++++++---------------
1 file changed, 24 insertions(+), 20 deletions(-)

diff --git a/drivers/soc/qcom/qcom-geni-se.c b/drivers/soc/qcom/qcom-geni-se.c
index 751a49f6534f..746854745b15 100644
--- a/drivers/soc/qcom/qcom-geni-se.c
+++ b/drivers/soc/qcom/qcom-geni-se.c
@@ -266,49 +266,53 @@ EXPORT_SYMBOL(geni_se_init);
static void geni_se_select_fifo_mode(struct geni_se *se)
{
u32 proto = geni_se_read_proto(se);
- u32 val;
+ u32 val, val_old;

geni_se_irq_clear(se);

- val = readl_relaxed(se->base + SE_GENI_M_IRQ_EN);
if (proto != GENI_SE_UART) {
+ val_old = val = readl_relaxed(se->base + SE_GENI_M_IRQ_EN);
val |= M_CMD_DONE_EN | M_TX_FIFO_WATERMARK_EN;
val |= M_RX_FIFO_WATERMARK_EN | M_RX_FIFO_LAST_EN;
- }
- writel_relaxed(val, se->base + SE_GENI_M_IRQ_EN);
+ if (val != val_old)
+ writel_relaxed(val, se->base + SE_GENI_M_IRQ_EN);

- val = readl_relaxed(se->base + SE_GENI_S_IRQ_EN);
- if (proto != GENI_SE_UART)
- val |= S_CMD_DONE_EN;
- writel_relaxed(val, se->base + SE_GENI_S_IRQ_EN);
+ val = readl_relaxed(se->base + SE_GENI_S_IRQ_EN);
+ if (!(val & S_CMD_DONE_EN))
+ writel_relaxed(val | S_CMD_DONE_EN,
+ se->base + SE_GENI_S_IRQ_EN);
+ }

val = readl_relaxed(se->base + SE_GENI_DMA_MODE_EN);
- val &= ~GENI_DMA_MODE_EN;
- writel_relaxed(val, se->base + SE_GENI_DMA_MODE_EN);
+ if (val & GENI_DMA_MODE_EN)
+ writel_relaxed(val & ~GENI_DMA_MODE_EN,
+ se->base + SE_GENI_DMA_MODE_EN);
}

static void geni_se_select_dma_mode(struct geni_se *se)
{
u32 proto = geni_se_read_proto(se);
- u32 val;
+ u32 val, val_old;

geni_se_irq_clear(se);

- val = readl_relaxed(se->base + SE_GENI_M_IRQ_EN);
if (proto != GENI_SE_UART) {
+ val_old = val = readl_relaxed(se->base + SE_GENI_M_IRQ_EN);
val &= ~(M_CMD_DONE_EN | M_TX_FIFO_WATERMARK_EN);
val &= ~(M_RX_FIFO_WATERMARK_EN | M_RX_FIFO_LAST_EN);
- }
- writel_relaxed(val, se->base + SE_GENI_M_IRQ_EN);
+ if (val != val_old)
+ writel_relaxed(val, se->base + SE_GENI_M_IRQ_EN);

- val = readl_relaxed(se->base + SE_GENI_S_IRQ_EN);
- if (proto != GENI_SE_UART)
- val &= ~S_CMD_DONE_EN;
- writel_relaxed(val, se->base + SE_GENI_S_IRQ_EN);
+ val = readl_relaxed(se->base + SE_GENI_S_IRQ_EN);
+ if (val & S_CMD_DONE_EN)
+ writel_relaxed(val & ~S_CMD_DONE_EN,
+ se->base + SE_GENI_S_IRQ_EN);
+ }

val = readl_relaxed(se->base + SE_GENI_DMA_MODE_EN);
- val |= GENI_DMA_MODE_EN;
- writel_relaxed(val, se->base + SE_GENI_DMA_MODE_EN);
+ if (!(val & GENI_DMA_MODE_EN))
+ writel_relaxed(val | GENI_DMA_MODE_EN,
+ se->base + SE_GENI_DMA_MODE_EN);
}

/**
--
2.28.0.1011.ga647a8990f-goog

2020-10-10 07:05:34

by Stephen Boyd

Subject: Re: [PATCH 2/3] Revert "i2c: i2c-qcom-geni: Fix DMA transfer race"

Quoting Douglas Anderson (2020-10-08 15:52:34)
> This reverts commit 02b9aec59243c6240fc42884acc958602146ddf6.
>
> As talked about in the patch ("soc: qcom: geni: More properly switch
> to DMA mode"), swapping the order of geni_se_setup_m_cmd() and
> geni_se_xx_dma_prep() can sometimes cause corrupted transfers. Thus
> we traded one problem for another. Now that we've debugged the
> problem further and fixed the geni helper functions to properly disable
> FIFO interrupts when we move to DMA mode, we can revert it and end up
> with (hopefully) zero problems!
>
> To be explicit, the patch ("soc: qcom: geni: More properly switch
> to DMA mode") is a prerequisite for this one.
>
> Fixes: 02b9aec59243 ("i2c: i2c-qcom-geni: Fix DMA transfer race")
> Signed-off-by: Douglas Anderson <[email protected]>
> ---

Reviewed-by: Stephen Boyd <[email protected]>

2020-10-10 22:56:44

by Stephen Boyd

Subject: Re: [PATCH 0/3] i2c: i2c-qcom-geni: More properly fix the DMA race

+Roja

Quoting Douglas Anderson (2020-10-08 15:52:32)
> Previously I landed commit 02b9aec59243 ("i2c: i2c-qcom-geni: Fix DMA
> transfer race") to fix a race we were seeing. While that most
> definitely fixed the race we were seeing, it looks like it causes
> problems in the TX path, which we didn't stress test until we started
> trying to update firmware on devices.
>
> Let's revert that patch and try another way: fix the original problem
> by disabling the interrupts that aren't relevant to DMA transfers.
> Now we can stress both TX and RX cases and see no problems. I also
> can't find any place to put an msleep() that causes problems anymore.
>
> Since this problem only affects i2c, I'm hoping for an Ack from Bjorn
> and then all these patches can go through the i2c tree. However, if
> maintainers want to work a different way out to land that's OK too.
>
> NOTE: the 3rd patch in the series could certainly be squashed with
> patch #1 or I could re-order / rejigger. To me it seemed like a good
> idea to first fix the problem (and make the two functions as much of
> an inverse as possible) and later try to clean things up. Yell if you
> want something different.
>
>
> Douglas Anderson (3):
> soc: qcom: geni: More properly switch to DMA mode
> Revert "i2c: i2c-qcom-geni: Fix DMA transfer race"
> soc: qcom: geni: Optimize select fifo/dma mode
>
> drivers/i2c/busses/i2c-qcom-geni.c | 6 ++--
> drivers/soc/qcom/qcom-geni-se.c | 47 ++++++++++++++++++++----------
> 2 files changed, 34 insertions(+), 19 deletions(-)
>
> --
> 2.28.0.1011.ga647a8990f-goog
>

2020-10-10 23:08:30

by Dmitry Baryshkov

Subject: Re: [PATCH 0/3] i2c: i2c-qcom-geni: More properly fix the DMA race

On 09/10/2020 01:52, Douglas Anderson wrote:
> Previously I landed commit 02b9aec59243 ("i2c: i2c-qcom-geni: Fix DMA
> transfer race") to fix a race we were seeing. While that most
> definitely fixed the race we were seeing, it looks like it causes
> problems in the TX path, which we didn't stress test until we started
> trying to update firmware on devices.
>
> Let's revert that patch and try another way: fix the original problem
> by disabling the interrupts that aren't relevant to DMA transfers.
> Now we can stress both TX and RX cases and see no problems. I also
> can't find any place to put an msleep() that causes problems anymore.
>
> Since this problem only affects i2c, I'm hoping for an Ack from Bjorn
> and then all these patches can go through the i2c tree. However, if
> maintainers want to work a different way out to land that's OK too.

These patches fix the I2C DMA issues we were observing on SM8250.
Tested-by: Dmitry Baryshkov <[email protected]>


--
With best wishes
Dmitry

2020-10-11 02:53:38

by Stephen Boyd

Subject: Re: [PATCH 3/3] soc: qcom: geni: Optimize select fifo/dma mode

Quoting Douglas Anderson (2020-10-08 15:52:35)
> diff --git a/drivers/soc/qcom/qcom-geni-se.c b/drivers/soc/qcom/qcom-geni-se.c
> index 751a49f6534f..746854745b15 100644
> --- a/drivers/soc/qcom/qcom-geni-se.c
> +++ b/drivers/soc/qcom/qcom-geni-se.c
> @@ -266,49 +266,53 @@ EXPORT_SYMBOL(geni_se_init);
> static void geni_se_select_fifo_mode(struct geni_se *se)
> {
> u32 proto = geni_se_read_proto(se);
> - u32 val;
> + u32 val, val_old;
>
> geni_se_irq_clear(se);
>
> - val = readl_relaxed(se->base + SE_GENI_M_IRQ_EN);
> if (proto != GENI_SE_UART) {
> + val_old = val = readl_relaxed(se->base + SE_GENI_M_IRQ_EN);
> val |= M_CMD_DONE_EN | M_TX_FIFO_WATERMARK_EN;
> val |= M_RX_FIFO_WATERMARK_EN | M_RX_FIFO_LAST_EN;
> - }
> - writel_relaxed(val, se->base + SE_GENI_M_IRQ_EN);
> + if (val != val_old)
> + writel_relaxed(val, se->base + SE_GENI_M_IRQ_EN);
>
> - val = readl_relaxed(se->base + SE_GENI_S_IRQ_EN);
> - if (proto != GENI_SE_UART)
> - val |= S_CMD_DONE_EN;
> - writel_relaxed(val, se->base + SE_GENI_S_IRQ_EN);
> + val = readl_relaxed(se->base + SE_GENI_S_IRQ_EN);

Can we use the val_old trick here too?

> + if (!(val & S_CMD_DONE_EN))
> + writel_relaxed(val | S_CMD_DONE_EN,

Because this val | S_CMD_DONE_EN thing is just hard to read :/

> + se->base + SE_GENI_S_IRQ_EN);
> + }
>
> val = readl_relaxed(se->base + SE_GENI_DMA_MODE_EN);
> - val &= ~GENI_DMA_MODE_EN;
> - writel_relaxed(val, se->base + SE_GENI_DMA_MODE_EN);
> + if (val & GENI_DMA_MODE_EN)
> + writel_relaxed(val & ~GENI_DMA_MODE_EN,
> + se->base + SE_GENI_DMA_MODE_EN);
> }
>

2020-10-11 06:10:12

by Stephen Boyd

Subject: Re: [PATCH 1/3] soc: qcom: geni: More properly switch to DMA mode

Quoting Douglas Anderson (2020-10-08 15:52:33)
> On geni-i2c transfers using DMA, it was seen that if you program the
> command (I2C_READ) before calling geni_se_rx_dma_prep() that it could
> cause interrupts to fire. If we get unlucky, these interrupts can
> just keep firing (and not be handled) blocking further progress and
> hanging the system.
>
> In commit 02b9aec59243 ("i2c: i2c-qcom-geni: Fix DMA transfer race")
> we avoided that by making sure we didn't program the command until
> after geni_se_rx_dma_prep() was called. While that avoided the
> problems, it also turns out to be invalid. At least in the TX case we
> started seeing sporadic corrupted transfers. This is easily seen by
> adding an msleep() between the DMA prep and the writing of the
> command, which makes the problem worse. That means we need to revert
> that commit and find another way to fix the bogus IRQs.
>
> Specifically, after reverting commit 02b9aec59243 ("i2c:
> i2c-qcom-geni: Fix DMA transfer race"), I put some traces in. I found
> that when the interrupts were firing like crazy:
> - "m_stat" had bits for M_RX_IRQ_EN, M_RX_FIFO_WATERMARK_EN set.
> - "dma" was set.
>
> Further debugging showed that I could make the problem happen more
> reliably by adding an "msleep(1)" any time after geni_se_setup_m_cmd()
> ran up until geni_se_rx_dma_prep() programmed the length.
>
> A rather simple fix is to change geni_se_select_dma_mode() so it's a
> true inverse of geni_se_select_fifo_mode() and disables all the FIFO
> related interrupts. Now the problematic interrupts can't fire and we
> can program things in the correct order without worrying.
>
> As part of this, let's also change the writel_relaxed() in the prepare
> function to a writel() so that our DMA is guaranteed to be prepared
> now that we can't rely on geni_se_setup_m_cmd()'s writel().
>
> NOTE: the only current user of GENI_SE_DMA in mainline is i2c.
>
> Fixes: 37692de5d523 ("i2c: i2c-qcom-geni: Add bus driver for the Qualcomm GENI I2C controller")
> Fixes: 02b9aec59243 ("i2c: i2c-qcom-geni: Fix DMA transfer race")
> Signed-off-by: Douglas Anderson <[email protected]>
> ---

Reviewed-by: Stephen Boyd <[email protected]>

>
> drivers/soc/qcom/qcom-geni-se.c | 17 +++++++++++++++--
> 1 file changed, 15 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/soc/qcom/qcom-geni-se.c b/drivers/soc/qcom/qcom-geni-se.c
> index d0e4f520cff8..751a49f6534f 100644
> --- a/drivers/soc/qcom/qcom-geni-se.c
> +++ b/drivers/soc/qcom/qcom-geni-se.c
> @@ -289,10 +289,23 @@ static void geni_se_select_fifo_mode(struct geni_se *se)
>
> static void geni_se_select_dma_mode(struct geni_se *se)
> {
> + u32 proto = geni_se_read_proto(se);
> u32 val;
>
> geni_se_irq_clear(se);
>
> + val = readl_relaxed(se->base + SE_GENI_M_IRQ_EN);
> + if (proto != GENI_SE_UART) {

Not a problem with this patch but it would be great if there was a
comment here (and probably in geni_se_select_fifo_mode() too) indicating
why GENI_SE_UART is special. Is it because GENI_SE_UART doesn't use the
main sequencer? I think that is the reason, but I forgot and reading
this code doesn't tell me that.

Splitting the driver in this way where the logic is in the geni wrapper
and in the engine driver leads to this confusion.

> + val &= ~(M_CMD_DONE_EN | M_TX_FIFO_WATERMARK_EN);
> + val &= ~(M_RX_FIFO_WATERMARK_EN | M_RX_FIFO_LAST_EN);
> + }
> + writel_relaxed(val, se->base + SE_GENI_M_IRQ_EN);
> +
> + val = readl_relaxed(se->base + SE_GENI_S_IRQ_EN);
> + if (proto != GENI_SE_UART)
> + val &= ~S_CMD_DONE_EN;
> + writel_relaxed(val, se->base + SE_GENI_S_IRQ_EN);
> +
> val = readl_relaxed(se->base + SE_GENI_DMA_MODE_EN);
> val |= GENI_DMA_MODE_EN;
> writel_relaxed(val, se->base + SE_GENI_DMA_MODE_EN);

2020-10-12 08:21:34

by Akash Asthana

Subject: Re: [PATCH 2/3] Revert "i2c: i2c-qcom-geni: Fix DMA transfer race"


On 10/9/2020 4:22 AM, Douglas Anderson wrote:
> This reverts commit 02b9aec59243c6240fc42884acc958602146ddf6.
>
> As talked about in the patch ("soc: qcom: geni: More properly switch
> to DMA mode"), swapping the order of geni_se_setup_m_cmd() and
> geni_se_xx_dma_prep() can sometimes cause corrupted transfers. Thus
> we traded one problem for another. Now that we've debugged the
> problem further and fixed the geni helper functions to properly disable
> FIFO interrupts when we move to DMA mode, we can revert it and end up
> with (hopefully) zero problems!
>
> To be explicit, the patch ("soc: qcom: geni: More properly switch
> to DMA mode") is a prerequisite for this one.
>
> Fixes: 02b9aec59243 ("i2c: i2c-qcom-geni: Fix DMA transfer race")
> Signed-off-by: Douglas Anderson <[email protected]>
Reviewed-by: Akash Asthana <[email protected]>

--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project

2020-10-12 09:10:32

by Akash Asthana

Subject: Re: [PATCH 1/3] soc: qcom: geni: More properly switch to DMA mode

Hi Stephen,


>>
>> static void geni_se_select_dma_mode(struct geni_se *se)
>> {
>> + u32 proto = geni_se_read_proto(se);
>> u32 val;
>>
>> geni_se_irq_clear(se);
>>
>> + val = readl_relaxed(se->base + SE_GENI_M_IRQ_EN);
>> + if (proto != GENI_SE_UART) {
> Not a problem with this patch but it would be great if there was a
> comment here (and probably in geni_se_select_fifo_mode() too) indicating
> why GENI_SE_UART is special. Is it because GENI_SE_UART doesn't use the
> main sequencer? I think that is the reason, but I forgot and reading
> this code doesn't tell me that.
>
> Splitting the driver in this way where the logic is in the geni wrapper
> and in the engine driver leads to this confusion.

GENI_SE_UART uses the main sequencer for TX and the secondary sequencer
for RX transfers because it is asynchronous in nature.

That's why the RX-related bits (M_RX_FIFO_WATERMARK_EN |
M_RX_FIFO_LAST_EN) are not enabled in the main sequencer for UART.

The (M_CMD_DONE_EN | M_TX_FIFO_WATERMARK_EN) bits are controlled by the
UART driver; they get enabled and disabled multiple times from start_tx
and stop_tx respectively.


Regards,

Akash

>
>> + val &= ~(M_CMD_DONE_EN | M_TX_FIFO_WATERMARK_EN);
>> + val &= ~(M_RX_FIFO_WATERMARK_EN | M_RX_FIFO_LAST_EN);
>> + }
>> + writel_relaxed(val, se->base + SE_GENI_M_IRQ_EN);
>> +
>> + val = readl_relaxed(se->base + SE_GENI_S_IRQ_EN);
>> + if (proto != GENI_SE_UART)
>> + val &= ~S_CMD_DONE_EN;
>> + writel_relaxed(val, se->base + SE_GENI_S_IRQ_EN);
>> +
>> val = readl_relaxed(se->base + SE_GENI_DMA_MODE_EN);
>> val |= GENI_DMA_MODE_EN;
>> writel_relaxed(val, se->base + SE_GENI_DMA_MODE_EN);

--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project

2020-10-12 20:03:44

by Akash Asthana

Subject: Re: [PATCH 1/3] soc: qcom: geni: More properly switch to DMA mode


On 10/9/2020 4:22 AM, Douglas Anderson wrote:
> On geni-i2c transfers using DMA, it was seen that if you program the
> command (I2C_READ) before calling geni_se_rx_dma_prep() that it could
> cause interrupts to fire. If we get unlucky, these interrupts can
> just keep firing (and not be handled) blocking further progress and
> hanging the system.
>
> In commit 02b9aec59243 ("i2c: i2c-qcom-geni: Fix DMA transfer race")
> we avoided that by making sure we didn't program the command until
> after geni_se_rx_dma_prep() was called. While that avoided the
> problems, it also turns out to be invalid. At least in the TX case we
> started seeing sporadic corrupted transfers. This is easily seen by
> adding an msleep() between the DMA prep and the writing of the
> command, which makes the problem worse. That means we need to revert
> that commit and find another way to fix the bogus IRQs.
>
> Specifically, after reverting commit 02b9aec59243 ("i2c:
> i2c-qcom-geni: Fix DMA transfer race"), I put some traces in. I found
> that when the interrupts were firing like crazy:
> - "m_stat" had bits for M_RX_IRQ_EN, M_RX_FIFO_WATERMARK_EN set.
> - "dma" was set.
>
> Further debugging showed that I could make the problem happen more
> reliably by adding an "msleep(1)" any time after geni_se_setup_m_cmd()
> ran up until geni_se_rx_dma_prep() programmed the length.
>
> A rather simple fix is to change geni_se_select_dma_mode() so it's a
> true inverse of geni_se_select_fifo_mode() and disables all the FIFO
> related interrupts. Now the problematic interrupts can't fire and we
> can program things in the correct order without worrying.
>
> As part of this, let's also change the writel_relaxed() in the prepare
> function to a writel() so that our DMA is guaranteed to be prepared
> now that we can't rely on geni_se_setup_m_cmd()'s writel().
>
> NOTE: the only current user of GENI_SE_DMA in mainline is i2c.
>
> Fixes: 37692de5d523 ("i2c: i2c-qcom-geni: Add bus driver for the Qualcomm GENI I2C controller")
> Fixes: 02b9aec59243 ("i2c: i2c-qcom-geni: Fix DMA transfer race")
> Signed-off-by: Douglas Anderson <[email protected]>
Reviewed-by: Akash Asthana <[email protected]>

--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project

2020-10-14 09:24:00

by Doug Anderson

Subject: Re: [PATCH 1/3] soc: qcom: geni: More properly switch to DMA mode

Hi,

On Mon, Oct 12, 2020 at 2:05 AM Akash Asthana <[email protected]> wrote:
>
> Hi Stephen,
>
>
> >>
> >> static void geni_se_select_dma_mode(struct geni_se *se)
> >> {
> >> + u32 proto = geni_se_read_proto(se);
> >> u32 val;
> >>
> >> geni_se_irq_clear(se);
> >>
> >> + val = readl_relaxed(se->base + SE_GENI_M_IRQ_EN);
> >> + if (proto != GENI_SE_UART) {
> > Not a problem with this patch but it would be great if there was a
> > comment here (and probably in geni_se_select_fifo_mode() too) indicating
> > why GENI_SE_UART is special. Is it because GENI_SE_UART doesn't use the
> > main sequencer? I think that is the reason, but I forgot and reading
> > this code doesn't tell me that.
> >
> > Splitting the driver in this way where the logic is in the geni wrapper
> > and in the engine driver leads to this confusion.
>
> GENI_SE_UART uses the main sequencer for TX and the secondary sequencer
> for RX transfers because it is asynchronous in nature.
>
> That's why the RX-related bits (M_RX_FIFO_WATERMARK_EN |
> M_RX_FIFO_LAST_EN) are not enabled in the main sequencer for UART.
>
> The (M_CMD_DONE_EN | M_TX_FIFO_WATERMARK_EN) bits are controlled by the
> UART driver; they get enabled and disabled multiple times from start_tx
> and stop_tx respectively.

For now I've "solved" this by adding some comments (in the 3rd patch)
basically summarizing what Akash said. I didn't want to go further
than that for now because it felt more important to get the i2c bug
fixed sooner rather than later and re-organizing would be a big enough
change that it'd probably need a few spins.

Our bug trackers don't make it trivially easy to file a public bug
tracking this and assign it to Qualcomm, but I've filed a bug asking
folks at Qualcomm to help with re-organizing things after my patch
series lands. This is internally tracked at Google as b:170766462
("Rejigger geni_se_select_fifo_mode() / geni_se_select_dma_mode() to
not manage interrupt enables").

-Doug
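
To make the UART special case a bit more concrete, here is a rough sketch of how that reasoning could be written down as a helper plus a comment. It is not what the 3rd patch actually contains; enum fake_proto, fake_m_irq_bits_managed() and the FAKE_* bit values are hypothetical names and values chosen for the example, and the comment only restates Akash's explanation above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical protocol IDs, for illustration only. */
enum fake_proto { FAKE_PROTO_SPI, FAKE_PROTO_UART, FAKE_PROTO_I2C };

/* Illustrative bit positions only (assumption). */
#define FAKE_M_CMD_DONE_EN          (1u << 0)
#define FAKE_M_RX_FIFO_WATERMARK_EN (1u << 26)
#define FAKE_M_RX_FIFO_LAST_EN      (1u << 27)
#define FAKE_M_TX_FIFO_WATERMARK_EN (1u << 30)

/*
 * Which main-sequencer interrupt enables the mode-select helpers should
 * touch for a given protocol.
 *
 * UART is special: it uses the main sequencer for TX and the secondary
 * sequencer for RX, so the main-sequencer RX bits are never enabled for
 * it, and the UART driver itself toggles the command-done and TX
 * watermark bits from its start_tx / stop_tx paths. The helpers
 * therefore leave all of these bits alone for UART.
 */
static uint32_t fake_m_irq_bits_managed(enum fake_proto proto)
{
	assert(proto == FAKE_PROTO_SPI || proto == FAKE_PROTO_UART ||
	       proto == FAKE_PROTO_I2C);

	if (proto == FAKE_PROTO_UART)
		return 0;

	return FAKE_M_CMD_DONE_EN | FAKE_M_TX_FIFO_WATERMARK_EN |
	       FAKE_M_RX_FIFO_WATERMARK_EN | FAKE_M_RX_FIFO_LAST_EN;
}
```

Keeping the mask in one place like this is just one possible design; it would let both mode-select functions share the same explanation instead of repeating the proto check.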

2020-10-14 15:09:10

by Doug Anderson

Subject: Re: [PATCH 3/3] soc: qcom: geni: Optimize select fifo/dma mode

Hi,

On Fri, Oct 9, 2020 at 5:32 PM Stephen Boyd <[email protected]> wrote:
>
> Quoting Douglas Anderson (2020-10-08 15:52:35)
> > diff --git a/drivers/soc/qcom/qcom-geni-se.c b/drivers/soc/qcom/qcom-geni-se.c
> > index 751a49f6534f..746854745b15 100644
> > --- a/drivers/soc/qcom/qcom-geni-se.c
> > +++ b/drivers/soc/qcom/qcom-geni-se.c
> > @@ -266,49 +266,53 @@ EXPORT_SYMBOL(geni_se_init);
> > static void geni_se_select_fifo_mode(struct geni_se *se)
> > {
> > u32 proto = geni_se_read_proto(se);
> > - u32 val;
> > + u32 val, val_old;
> >
> > geni_se_irq_clear(se);
> >
> > - val = readl_relaxed(se->base + SE_GENI_M_IRQ_EN);
> > if (proto != GENI_SE_UART) {
> > + val_old = val = readl_relaxed(se->base + SE_GENI_M_IRQ_EN);
> > val |= M_CMD_DONE_EN | M_TX_FIFO_WATERMARK_EN;
> > val |= M_RX_FIFO_WATERMARK_EN | M_RX_FIFO_LAST_EN;
> > - }
> > - writel_relaxed(val, se->base + SE_GENI_M_IRQ_EN);
> > + if (val != val_old)
> > + writel_relaxed(val, se->base + SE_GENI_M_IRQ_EN);
> >
> > - val = readl_relaxed(se->base + SE_GENI_S_IRQ_EN);
> > - if (proto != GENI_SE_UART)
> > - val |= S_CMD_DONE_EN;
> > - writel_relaxed(val, se->base + SE_GENI_S_IRQ_EN);
> > + val = readl_relaxed(se->base + SE_GENI_S_IRQ_EN);
>
> Can we use the val_old trick here too?
>
> > + if (!(val & S_CMD_DONE_EN))
> > + writel_relaxed(val | S_CMD_DONE_EN,
>
> Because this val | S_CMD_DONE_EN thing is just hard to read :/

This is done in v2. Thanks for your review!

-Doug
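
For reference, the "val_old trick" Stephen asked about could look roughly like the userspace sketch below when applied to the S_CMD_DONE_EN update. This is not the v2 patch: struct fake_reg, fake_readl(), fake_writel(), fake_enable_s_cmd_done() and the FAKE_S_CMD_DONE_EN value are assumptions for the example, with a write counter standing in for the cost of a register write.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative bit position only (assumption). */
#define FAKE_S_CMD_DONE_EN (1u << 0)

/* Hypothetical stand-in for one MMIO register that counts writes. */
struct fake_reg {
	uint32_t value;
	unsigned int writes;
};

static uint32_t fake_readl(const struct fake_reg *reg)
{
	return reg->value;
}

static void fake_writel(uint32_t val, struct fake_reg *reg)
{
	reg->value = val;
	reg->writes++;
}

/* Read once, compute the new value, and write back only if it changed. */
static void fake_enable_s_cmd_done(struct fake_reg *s_irq_en)
{
	uint32_t val, val_old;

	assert(s_irq_en != NULL);

	val_old = val = fake_readl(s_irq_en);
	val |= FAKE_S_CMD_DONE_EN;
	if (val != val_old)
		fake_writel(val, s_irq_en);
}
```

Compared with testing the bit and then writing "val | FAKE_S_CMD_DONE_EN", this keeps the modification and the changed-or-not check separate, which is the readability point raised in the review. Calling the function twice on the same register performs only one write.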