2024-04-16 16:29:33

by Serge Semin

[permalink] [raw]
Subject: [PATCH 0/4] dmaengine: dw: Fix src/dst addr width misconfig

The main goal of this series is to fix the data disappearance in case of
the DW UART handled by the DW AHB DMA engine. The problem happens on a
portion of the data received when the pre-initialized DEV_TO_MEM
DMA-transfer is paused and then disabled. The data just hangs up in the
DMA-engine FIFO and isn't flushed out to the memory on the DMA-channel
suspension (see the second commit log for details). On a way to find the
denoted problem fix it was discovered that the driver doesn't verify the
peripheral device address width specified by a client driver, which in its
turn if unsupported or undefined value passed may cause DMA-transfer being
misconfigured. It's fixed in the first patch of the series.

In addition to that two cleanup patch follow the fixes described above in
order to make the DWC-engine configuration procedure more coherent. First
one simplifies the CTL_LO register setup methods. Second one simplifies
the max-burst calculation procedure and unifies it with the rest of the
verification methods. Please see the patches log for more details.

Signed-off-by: Serge Semin <[email protected]>
Cc: "Ilpo Järvinen" <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: Jiri Slaby <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]

Serge Semin (4):
dmaengine: dw: Add peripheral bus width verification
dmaengine: dw: Add memory bus width verification
dmaengine: dw: Simplify prepare CTL_LO methods
dmaengine: dw: Simplify max-burst calculation procedure

drivers/dma/dw/core.c | 97 ++++++++++++++++++++++++++++++++++++-----
drivers/dma/dw/dw.c | 43 +++++++++++-------
drivers/dma/dw/idma32.c | 25 +++++++----
drivers/dma/dw/regs.h | 1 -
4 files changed, 129 insertions(+), 37 deletions(-)

--
2.43.0



2024-04-16 16:29:45

by Serge Semin

[permalink] [raw]
Subject: [PATCH 1/4] dmaengine: dw: Add peripheral bus width verification

Currently the src_addr_width and dst_addr_width fields of the
dma_slave_config structure are mapped to the CTLx.SRC_TR_WIDTH and
CTLx.DST_TR_WIDTH fields of the peripheral bus side in order to have the
properly aligned data passed to the target device. It's done just by
converting the passed peripheral bus width to the encoded value using the
__ffs() function. This implementation has several problematic sides:

1. __ffs() is undefined if no bit exist in the passed value. Thus if the
specified addr-width is DMA_SLAVE_BUSWIDTH_UNDEFINED, __ffs() may return
unexpected value depending on the platform-specific implementation.

2. DW AHB DMA-engine permits having the power-of-2 transfer width limited
by the DMAH_Mk_HDATA_WIDTH IP-core synthesize parameter. Specifying
bus-width out of that constraints scope will definitely cause unexpected
result since the destination reg will be only partly touched than the
client driver implied.

Let's fix all of that by adding the peripheral bus width verification
method which would make sure that the passed source or destination address
width is valid and if undefined then the driver will just fallback to the
1-byte width transfer.

Fixes: 029a40e97d0d ("dmaengine: dw: provide DMA capabilities")
Signed-off-by: Serge Semin <[email protected]>
---
drivers/dma/dw/core.c | 38 ++++++++++++++++++++++++++++++++++++++
1 file changed, 38 insertions(+)

diff --git a/drivers/dma/dw/core.c b/drivers/dma/dw/core.c
index 5f7d690e3dba..c297ca9d5706 100644
--- a/drivers/dma/dw/core.c
+++ b/drivers/dma/dw/core.c
@@ -16,6 +16,7 @@
#include <linux/init.h>
#include <linux/interrupt.h>
#include <linux/io.h>
+#include <linux/log2.h>
#include <linux/mm.h>
#include <linux/module.h>
#include <linux/slab.h>
@@ -780,10 +781,43 @@ bool dw_dma_filter(struct dma_chan *chan, void *param)
}
EXPORT_SYMBOL_GPL(dw_dma_filter);

+static int dwc_verify_p_buswidth(struct dma_chan *chan)
+{
+ struct dw_dma_chan *dwc = to_dw_dma_chan(chan);
+ struct dw_dma *dw = to_dw_dma(chan->device);
+ u32 reg_width, max_width;
+
+ if (dwc->dma_sconfig.direction == DMA_MEM_TO_DEV)
+ reg_width = dwc->dma_sconfig.dst_addr_width;
+ else if (dwc->dma_sconfig.direction == DMA_DEV_TO_MEM)
+ reg_width = dwc->dma_sconfig.src_addr_width;
+ else /* DMA_MEM_TO_MEM */
+ return 0;
+
+ max_width = dw->pdata->data_width[dwc->dws.p_master];
+
+ /* Fall-back to 1byte transfer width if undefined */
+ if (reg_width == DMA_SLAVE_BUSWIDTH_UNDEFINED)
+ reg_width = DMA_SLAVE_BUSWIDTH_1_BYTE;
+ else if (!is_power_of_2(reg_width) || reg_width > max_width)
+ return -EINVAL;
+ else /* bus width is valid */
+ return 0;
+
+ /* Update undefined addr width value */
+ if (dwc->dma_sconfig.direction == DMA_MEM_TO_DEV)
+ dwc->dma_sconfig.dst_addr_width = reg_width;
+ else /* DMA_DEV_TO_MEM */
+ dwc->dma_sconfig.src_addr_width = reg_width;
+
+ return 0;
+}
+
static int dwc_config(struct dma_chan *chan, struct dma_slave_config *sconfig)
{
struct dw_dma_chan *dwc = to_dw_dma_chan(chan);
struct dw_dma *dw = to_dw_dma(chan->device);
+ int err;

memcpy(&dwc->dma_sconfig, sconfig, sizeof(*sconfig));

@@ -792,6 +826,10 @@ static int dwc_config(struct dma_chan *chan, struct dma_slave_config *sconfig)
dwc->dma_sconfig.dst_maxburst =
clamp(dwc->dma_sconfig.dst_maxburst, 0U, dwc->max_burst);

+ err = dwc_verify_p_buswidth(chan);
+ if (err)
+ return err;
+
dw->encode_maxburst(dwc, &dwc->dma_sconfig.src_maxburst);
dw->encode_maxburst(dwc, &dwc->dma_sconfig.dst_maxburst);

--
2.43.0


2024-04-16 16:29:58

by Serge Semin

[permalink] [raw]
Subject: [PATCH 2/4] dmaengine: dw: Add memory bus width verification

Currently in case of the DEV_TO_MEM or MEM_TO_DEV DMA transfers the memory
data width (single transfer width) is determined based on the buffer
length, buffer base address or DMA master-channel max address width
capability. It isn't enough in case of the channel disabling prior the
block transfer is finished. Here is what DW AHB DMA IP-core databook says
regarding the port suspension (DMA-transfer pause) implementation in the
controller:

"When CTLx.SRC_TR_WIDTH < CTLx.DST_TR_WIDTH and the CFGx.CH_SUSP bit is
high, the CFGx.FIFO_EMPTY is asserted once the contents of the FIFO do not
permit a single word of CTLx.DST_TR_WIDTH to be formed. However, there may
still be data in the channel FIFO, but not enough to form a single
transfer of CTLx.DST_TR_WIDTH. In this scenario, once the channel is
disabled, the remaining data in the channel FIFO is not transferred to the
destination peripheral."

So in case if the port gets to be suspended and then disabled it's
possible to have the data silently discarded even though the controller
reported that FIFO is empty and the CTLx.BLOCK_TS indicated the dropped
data already received from the source device. This looks as if the data
somehow got lost on a way from the peripheral device to memory and causes
problems for instance in the DW APB UART driver, which pauses and disables
the DMA-transfer as soon as the recv data timeout happens. Here is the way
it looks:

Memory <------- DMA FIFO <------ UART FIFO <---------------- UART
DST_TR_WIDTH -+--------| | |
| | | | No more data
Current lvl -+--------| |---------+- DMA-burst lvl
| | |---------+- Leftover data
| | |---------+- SRC_TR_WIDTH
-+--------+-------+---------+

In the example above: no more data is getting received over the UART port
and BLOCK_TS is not even close to be fully received; some data is left in
the UART FIFO, but not enough to perform a bursted DMA-xfer to the DMA
FIFO; some data is left in the DMA FIFO, but not enough to be passed
further to the system memory in a single transfer. In this situation the
8250 UART driver catches the recv timeout interrupt, pauses the
DMA-transfer and terminates it completely, after which the IRQ handler
manually fetches the leftover data from the UART FIFO into the
recv-buffer. But since the DMA-channel has been disabled with the data
left in the DMA FIFO, that data will be just discarded and the recv-buffer
will have a gap of the "current lvl" size in the recv-buffer at the tail
of the lately received data portion. So the data will be lost just due to
the misconfigured DMA transfer.

Note this is only relevant for the case of the transfer suspension and
_disabling_. No problem will happen if the transfer will be re-enabled
afterwards or the block transfer is fully completed. In the later case the
"FIFO flush mode" will be executed at the transfer final stage in order to
push out the data left in the DMA FIFO.

In order to fix the denoted problem the DW AHB DMA-engine driver needs to
make sure that the _bursted_ source transfer width is greater or equal to
the single destination transfer (note the HW databook describes more
strict constraint than actually required). Since the peripheral-device
side is prescribed by the client driver logic, the memory-side can be only
used for that. The solution can be easily implemented for the DEV_TO_MEM
transfers just by adjusting the memory-channel address width. Sadly it's
not that easy for the MEM_TO_DEV transfers since the mem-to-dma burst size
is normally dynamically determined by the controller. So the only thing
that can be done is to make sure that memory-side address width can be
greater than the peripheral device address width.

Fixes: a09820043c9e ("dw_dmac: autoconfigure data_width or get it via platform data")
Signed-off-by: Serge Semin <[email protected]>
---
drivers/dma/dw/core.c | 41 ++++++++++++++++++++++++++++++++++-------
1 file changed, 34 insertions(+), 7 deletions(-)

diff --git a/drivers/dma/dw/core.c b/drivers/dma/dw/core.c
index c297ca9d5706..61e026310dd8 100644
--- a/drivers/dma/dw/core.c
+++ b/drivers/dma/dw/core.c
@@ -622,12 +622,10 @@ dwc_prep_slave_sg(struct dma_chan *chan, struct scatterlist *sgl,
struct dw_desc *prev;
struct dw_desc *first;
u32 ctllo, ctlhi;
- u8 m_master = dwc->dws.m_master;
- u8 lms = DWC_LLP_LMS(m_master);
+ u8 lms = DWC_LLP_LMS(dwc->dws.m_master);
dma_addr_t reg;
unsigned int reg_width;
unsigned int mem_width;
- unsigned int data_width = dw->pdata->data_width[m_master];
unsigned int i;
struct scatterlist *sg;
size_t total_len = 0;
@@ -661,7 +659,7 @@ dwc_prep_slave_sg(struct dma_chan *chan, struct scatterlist *sgl,
mem = sg_dma_address(sg);
len = sg_dma_len(sg);

- mem_width = __ffs(data_width | mem | len);
+ mem_width = __ffs(sconfig->src_addr_width | mem | len);

slave_sg_todev_fill_desc:
desc = dwc_desc_get(dwc);
@@ -721,7 +719,7 @@ dwc_prep_slave_sg(struct dma_chan *chan, struct scatterlist *sgl,
lli_write(desc, sar, reg);
lli_write(desc, dar, mem);
lli_write(desc, ctlhi, ctlhi);
- mem_width = __ffs(data_width | mem);
+ mem_width = __ffs(sconfig->dst_addr_width | mem);
lli_write(desc, ctllo, ctllo | DWC_CTLL_DST_WIDTH(mem_width));
desc->len = dlen;

@@ -813,6 +811,31 @@ static int dwc_verify_p_buswidth(struct dma_chan *chan)
return 0;
}

+static int dwc_verify_m_buswidth(struct dma_chan *chan)
+{
+ struct dw_dma_chan *dwc = to_dw_dma_chan(chan);
+ struct dw_dma *dw = to_dw_dma(chan->device);
+ u32 reg_width, reg_burst, mem_width;
+
+ mem_width = dw->pdata->data_width[dwc->dws.m_master];
+
+ /* Make sure src and dst word widths are coherent */
+ if (dwc->dma_sconfig.direction == DMA_MEM_TO_DEV) {
+ reg_width = dwc->dma_sconfig.dst_addr_width;
+ if (mem_width < reg_width)
+ return -EINVAL;
+
+ dwc->dma_sconfig.src_addr_width = mem_width;
+ } else if (dwc->dma_sconfig.direction == DMA_DEV_TO_MEM) {
+ reg_width = dwc->dma_sconfig.src_addr_width;
+ reg_burst = rounddown_pow_of_two(dwc->dma_sconfig.src_maxburst);
+
+ dwc->dma_sconfig.dst_addr_width = min(mem_width, reg_width * reg_burst);
+ }
+
+ return 0;
+}
+
static int dwc_config(struct dma_chan *chan, struct dma_slave_config *sconfig)
{
struct dw_dma_chan *dwc = to_dw_dma_chan(chan);
@@ -822,14 +845,18 @@ static int dwc_config(struct dma_chan *chan, struct dma_slave_config *sconfig)
memcpy(&dwc->dma_sconfig, sconfig, sizeof(*sconfig));

dwc->dma_sconfig.src_maxburst =
- clamp(dwc->dma_sconfig.src_maxburst, 0U, dwc->max_burst);
+ clamp(dwc->dma_sconfig.src_maxburst, 1U, dwc->max_burst);
dwc->dma_sconfig.dst_maxburst =
- clamp(dwc->dma_sconfig.dst_maxburst, 0U, dwc->max_burst);
+ clamp(dwc->dma_sconfig.dst_maxburst, 1U, dwc->max_burst);

err = dwc_verify_p_buswidth(chan);
if (err)
return err;

+ err = dwc_verify_m_buswidth(chan);
+ if (err)
+ return err;
+
dw->encode_maxburst(dwc, &dwc->dma_sconfig.src_maxburst);
dw->encode_maxburst(dwc, &dwc->dma_sconfig.dst_maxburst);

--
2.43.0


2024-04-16 16:30:09

by Serge Semin

[permalink] [raw]
Subject: [PATCH 3/4] dmaengine: dw: Simplify prepare CTL_LO methods

Currently the CTL LO fields are calculated on the platform-specific basis.
It's implemented by means of the prepare_ctllo() callbacks using the
ternary operator within the local variables init block at the beginning of
the block scope. The functions code currently is relatively hard to
comprehend and isn't that optimal since implies four conditional
statements executed and two additional local variables defined. Let's
simplify the DW AHB DMA prepare_ctllo() method by unrolling the ternary
operators into the normal if-else statement, dropping redundant
master-interface ID variables and initializing the local variables based
on the singly evaluated DMA-transfer direction check. Thus the method will
look much more readable since now the fields content can be easily
inferred right from the if-else branch. Provide the same update in the
Intel DMA32 core driver for sake of the driver code unification.

Note besides of the effects described above this update is basically a
preparation before dropping the max burst encoding callback. It will
require calling the burst fields calculation methods right in the
prepare_ctllo() callbacks, which would have made the later function code
even more complex.

Signed-off-by: Serge Semin <[email protected]>
---
drivers/dma/dw/dw.c | 24 ++++++++++++++++++------
drivers/dma/dw/idma32.c | 14 ++++++++++++--
2 files changed, 30 insertions(+), 8 deletions(-)

diff --git a/drivers/dma/dw/dw.c b/drivers/dma/dw/dw.c
index a4862263ff14..c65438d1f1ff 100644
--- a/drivers/dma/dw/dw.c
+++ b/drivers/dma/dw/dw.c
@@ -67,12 +67,24 @@ static size_t dw_dma_block2bytes(struct dw_dma_chan *dwc, u32 block, u32 width)
static u32 dw_dma_prepare_ctllo(struct dw_dma_chan *dwc)
{
struct dma_slave_config *sconfig = &dwc->dma_sconfig;
- u8 smsize = (dwc->direction == DMA_DEV_TO_MEM) ? sconfig->src_maxburst : 0;
- u8 dmsize = (dwc->direction == DMA_MEM_TO_DEV) ? sconfig->dst_maxburst : 0;
- u8 p_master = dwc->dws.p_master;
- u8 m_master = dwc->dws.m_master;
- u8 dms = (dwc->direction == DMA_MEM_TO_DEV) ? p_master : m_master;
- u8 sms = (dwc->direction == DMA_DEV_TO_MEM) ? p_master : m_master;
+ u8 sms, smsize, dms, dmsize;
+
+ if (dwc->direction == DMA_MEM_TO_DEV) {
+ sms = dwc->dws.m_master;
+ smsize = 0;
+ dms = dwc->dws.p_master;
+ dmsize = sconfig->dst_maxburst;
+ } else if (dwc->direction == DMA_DEV_TO_MEM) {
+ sms = dwc->dws.p_master;
+ smsize = sconfig->src_maxburst;
+ dms = dwc->dws.m_master;
+ dmsize = 0;
+ } else /* DMA_MEM_TO_MEM */ {
+ sms = dwc->dws.m_master;
+ smsize = 0;
+ dms = dwc->dws.m_master;
+ dmsize = 0;
+ }

return DWC_CTLL_LLP_D_EN | DWC_CTLL_LLP_S_EN |
DWC_CTLL_DST_MSIZE(dmsize) | DWC_CTLL_SRC_MSIZE(smsize) |
diff --git a/drivers/dma/dw/idma32.c b/drivers/dma/dw/idma32.c
index 58f4078d83fe..3a1b12517655 100644
--- a/drivers/dma/dw/idma32.c
+++ b/drivers/dma/dw/idma32.c
@@ -202,8 +202,18 @@ static size_t idma32_block2bytes(struct dw_dma_chan *dwc, u32 block, u32 width)
static u32 idma32_prepare_ctllo(struct dw_dma_chan *dwc)
{
struct dma_slave_config *sconfig = &dwc->dma_sconfig;
- u8 smsize = (dwc->direction == DMA_DEV_TO_MEM) ? sconfig->src_maxburst : 0;
- u8 dmsize = (dwc->direction == DMA_MEM_TO_DEV) ? sconfig->dst_maxburst : 0;
+ u8 smsize, dmsize;
+
+ if (dwc->direction == DMA_MEM_TO_DEV) {
+ smsize = 0;
+ dmsize = sconfig->dst_maxburst;
+ } else if (dwc->direction == DMA_DEV_TO_MEM) {
+ smsize = sconfig->src_maxburst;
+ dmsize = 0;
+ } else /* DMA_MEM_TO_MEM */ {
+ smsize = 0;
+ dmsize = 0;
+ }

return DWC_CTLL_LLP_D_EN | DWC_CTLL_LLP_S_EN |
DWC_CTLL_DST_MSIZE(dmsize) | DWC_CTLL_SRC_MSIZE(smsize);
--
2.43.0


2024-04-16 16:30:30

by Serge Semin

[permalink] [raw]
Subject: [PATCH 4/4] dmaengine: dw: Simplify max-burst calculation procedure

In order to have a more coherent DW AHB DMA slave configuration method
let's simplify the source and destination channel max-burst calculation
procedure:

1. Create the max-burst verification method as it has been just done for
the memory and peripheral address widths. Thus the DWC DMA slave config
method will turn to a set of the verification methods execution.

2. Since both the generic DW AHB DMA and Intel DMA32 engines support the
power-of-2 bursts only, then the specified by the client driver max-burst
values can be converted to being power-of-2 right in the max-burst
verification method.

3. Since max-burst encoded value is required on the CTL_LO fields
calculation stage, the encode_maxburst() callback can be easily dropped
from the dw_dma structure meanwhile the encoding procedure will be
executed right in the CTL_LO register value calculation.

Thus the update will provide the next positive effects: the internal
DMA-slave config structure will contain only the real DMA-transfer config
value, which will be encoded to the DMA-controller register fields only
when it's required on the buffer mapping; the redundant encode_maxburst()
callback will be dropped simplifying the internal HW-abstraction API;
DWC-config method will look more readable executing the verification
functions one-by-one.

Signed-off-by: Serge Semin <[email protected]>
---
drivers/dma/dw/core.c | 26 +++++++++++++++++---------
drivers/dma/dw/dw.c | 23 +++++++++++------------
drivers/dma/dw/idma32.c | 15 +++++++--------
drivers/dma/dw/regs.h | 1 -
4 files changed, 35 insertions(+), 30 deletions(-)

diff --git a/drivers/dma/dw/core.c b/drivers/dma/dw/core.c
index 61e026310dd8..8b4ecd137ae2 100644
--- a/drivers/dma/dw/core.c
+++ b/drivers/dma/dw/core.c
@@ -779,6 +779,21 @@ bool dw_dma_filter(struct dma_chan *chan, void *param)
}
EXPORT_SYMBOL_GPL(dw_dma_filter);

+static void dwc_verify_maxburst(struct dma_chan *chan)
+{
+ struct dw_dma_chan *dwc = to_dw_dma_chan(chan);
+
+ dwc->dma_sconfig.src_maxburst =
+ clamp(dwc->dma_sconfig.src_maxburst, 1U, dwc->max_burst);
+ dwc->dma_sconfig.dst_maxburst =
+ clamp(dwc->dma_sconfig.dst_maxburst, 1U, dwc->max_burst);
+
+ dwc->dma_sconfig.src_maxburst =
+ rounddown_pow_of_two(dwc->dma_sconfig.src_maxburst);
+ dwc->dma_sconfig.dst_maxburst =
+ rounddown_pow_of_two(dwc->dma_sconfig.dst_maxburst);
+}
+
static int dwc_verify_p_buswidth(struct dma_chan *chan)
{
struct dw_dma_chan *dwc = to_dw_dma_chan(chan);
@@ -828,7 +843,7 @@ static int dwc_verify_m_buswidth(struct dma_chan *chan)
dwc->dma_sconfig.src_addr_width = mem_width;
} else if (dwc->dma_sconfig.direction == DMA_DEV_TO_MEM) {
reg_width = dwc->dma_sconfig.src_addr_width;
- reg_burst = rounddown_pow_of_two(dwc->dma_sconfig.src_maxburst);
+ reg_burst = dwc->dma_sconfig.src_maxburst;

dwc->dma_sconfig.dst_addr_width = min(mem_width, reg_width * reg_burst);
}
@@ -839,15 +854,11 @@ static int dwc_verify_m_buswidth(struct dma_chan *chan)
static int dwc_config(struct dma_chan *chan, struct dma_slave_config *sconfig)
{
struct dw_dma_chan *dwc = to_dw_dma_chan(chan);
- struct dw_dma *dw = to_dw_dma(chan->device);
int err;

memcpy(&dwc->dma_sconfig, sconfig, sizeof(*sconfig));

- dwc->dma_sconfig.src_maxburst =
- clamp(dwc->dma_sconfig.src_maxburst, 1U, dwc->max_burst);
- dwc->dma_sconfig.dst_maxburst =
- clamp(dwc->dma_sconfig.dst_maxburst, 1U, dwc->max_burst);
+ dwc_verify_maxburst(chan);

err = dwc_verify_p_buswidth(chan);
if (err)
@@ -857,9 +868,6 @@ static int dwc_config(struct dma_chan *chan, struct dma_slave_config *sconfig)
if (err)
return err;

- dw->encode_maxburst(dwc, &dwc->dma_sconfig.src_maxburst);
- dw->encode_maxburst(dwc, &dwc->dma_sconfig.dst_maxburst);
-
return 0;
}

diff --git a/drivers/dma/dw/dw.c b/drivers/dma/dw/dw.c
index c65438d1f1ff..c52333646edd 100644
--- a/drivers/dma/dw/dw.c
+++ b/drivers/dma/dw/dw.c
@@ -64,6 +64,15 @@ static size_t dw_dma_block2bytes(struct dw_dma_chan *dwc, u32 block, u32 width)
return DWC_CTLH_BLOCK_TS(block) << width;
}

+static inline u8 dw_dma_encode_maxburst(u32 maxburst)
+{
+ /*
+ * Fix burst size according to dw_dmac. We need to convert them as:
+ * 1 -> 0, 4 -> 1, 8 -> 2, 16 -> 3.
+ */
+ return maxburst > 1 ? fls(maxburst) - 2 : 0;
+}
+
static u32 dw_dma_prepare_ctllo(struct dw_dma_chan *dwc)
{
struct dma_slave_config *sconfig = &dwc->dma_sconfig;
@@ -73,10 +82,10 @@ static u32 dw_dma_prepare_ctllo(struct dw_dma_chan *dwc)
sms = dwc->dws.m_master;
smsize = 0;
dms = dwc->dws.p_master;
- dmsize = sconfig->dst_maxburst;
+ dmsize = dw_dma_encode_maxburst(sconfig->dst_maxburst);
} else if (dwc->direction == DMA_DEV_TO_MEM) {
sms = dwc->dws.p_master;
- smsize = sconfig->src_maxburst;
+ smsize = dw_dma_encode_maxburst(sconfig->src_maxburst);
dms = dwc->dws.m_master;
dmsize = 0;
} else /* DMA_MEM_TO_MEM */ {
@@ -91,15 +100,6 @@ static u32 dw_dma_prepare_ctllo(struct dw_dma_chan *dwc)
DWC_CTLL_DMS(dms) | DWC_CTLL_SMS(sms);
}

-static void dw_dma_encode_maxburst(struct dw_dma_chan *dwc, u32 *maxburst)
-{
- /*
- * Fix burst size according to dw_dmac. We need to convert them as:
- * 1 -> 0, 4 -> 1, 8 -> 2, 16 -> 3.
- */
- *maxburst = *maxburst > 1 ? fls(*maxburst) - 2 : 0;
-}
-
static void dw_dma_set_device_name(struct dw_dma *dw, int id)
{
snprintf(dw->name, sizeof(dw->name), "dw:dmac%d", id);
@@ -128,7 +128,6 @@ int dw_dma_probe(struct dw_dma_chip *chip)
dw->suspend_chan = dw_dma_suspend_chan;
dw->resume_chan = dw_dma_resume_chan;
dw->prepare_ctllo = dw_dma_prepare_ctllo;
- dw->encode_maxburst = dw_dma_encode_maxburst;
dw->bytes2block = dw_dma_bytes2block;
dw->block2bytes = dw_dma_block2bytes;

diff --git a/drivers/dma/dw/idma32.c b/drivers/dma/dw/idma32.c
index 3a1b12517655..428aba9fc2db 100644
--- a/drivers/dma/dw/idma32.c
+++ b/drivers/dma/dw/idma32.c
@@ -199,6 +199,11 @@ static size_t idma32_block2bytes(struct dw_dma_chan *dwc, u32 block, u32 width)
return IDMA32C_CTLH_BLOCK_TS(block);
}

+static inline u32 idma32_encode_maxburst(u32 maxburst)
+{
+ return maxburst > 1 ? fls(maxburst) - 1 : 0;
+}
+
static u32 idma32_prepare_ctllo(struct dw_dma_chan *dwc)
{
struct dma_slave_config *sconfig = &dwc->dma_sconfig;
@@ -206,9 +211,9 @@ static u32 idma32_prepare_ctllo(struct dw_dma_chan *dwc)

if (dwc->direction == DMA_MEM_TO_DEV) {
smsize = 0;
- dmsize = sconfig->dst_maxburst;
+ dmsize = idma32_encode_maxburst(sconfig->dst_maxburst);
} else if (dwc->direction == DMA_DEV_TO_MEM) {
- smsize = sconfig->src_maxburst;
+ smsize = idma32_encode_maxburst(sconfig->src_maxburst);
dmsize = 0;
} else /* DMA_MEM_TO_MEM */ {
smsize = 0;
@@ -219,11 +224,6 @@ static u32 idma32_prepare_ctllo(struct dw_dma_chan *dwc)
DWC_CTLL_DST_MSIZE(dmsize) | DWC_CTLL_SRC_MSIZE(smsize);
}

-static void idma32_encode_maxburst(struct dw_dma_chan *dwc, u32 *maxburst)
-{
- *maxburst = *maxburst > 1 ? fls(*maxburst) - 1 : 0;
-}
-
static void idma32_set_device_name(struct dw_dma *dw, int id)
{
snprintf(dw->name, sizeof(dw->name), "idma32:dmac%d", id);
@@ -280,7 +280,6 @@ int idma32_dma_probe(struct dw_dma_chip *chip)
dw->suspend_chan = idma32_suspend_chan;
dw->resume_chan = idma32_resume_chan;
dw->prepare_ctllo = idma32_prepare_ctllo;
- dw->encode_maxburst = idma32_encode_maxburst;
dw->bytes2block = idma32_bytes2block;
dw->block2bytes = idma32_block2bytes;

diff --git a/drivers/dma/dw/regs.h b/drivers/dma/dw/regs.h
index 76654bd13c1a..5969d9cc8d7a 100644
--- a/drivers/dma/dw/regs.h
+++ b/drivers/dma/dw/regs.h
@@ -327,7 +327,6 @@ struct dw_dma {
void (*suspend_chan)(struct dw_dma_chan *dwc, bool drain);
void (*resume_chan)(struct dw_dma_chan *dwc, bool drain);
u32 (*prepare_ctllo)(struct dw_dma_chan *dwc);
- void (*encode_maxburst)(struct dw_dma_chan *dwc, u32 *maxburst);
u32 (*bytes2block)(struct dw_dma_chan *dwc, size_t bytes,
unsigned int width, size_t *len);
size_t (*block2bytes)(struct dw_dma_chan *dwc, u32 block, u32 width);
--
2.43.0


2024-04-16 18:01:06

by Andy Shevchenko

[permalink] [raw]
Subject: Re: [PATCH 1/4] dmaengine: dw: Add peripheral bus width verification

On Tue, Apr 16, 2024 at 07:28:55PM +0300, Serge Semin wrote:
> Currently the src_addr_width and dst_addr_width fields of the
> dma_slave_config structure are mapped to the CTLx.SRC_TR_WIDTH and
> CTLx.DST_TR_WIDTH fields of the peripheral bus side in order to have the
> properly aligned data passed to the target device. It's done just by
> converting the passed peripheral bus width to the encoded value using the
> __ffs() function. This implementation has several problematic sides:
>
> 1. __ffs() is undefined if no bit exist in the passed value. Thus if the
> specified addr-width is DMA_SLAVE_BUSWIDTH_UNDEFINED, __ffs() may return
> unexpected value depending on the platform-specific implementation.
>
> 2. DW AHB DMA-engine permits having the power-of-2 transfer width limited
> by the DMAH_Mk_HDATA_WIDTH IP-core synthesize parameter. Specifying
> bus-width out of that constraints scope will definitely cause unexpected
> result since the destination reg will be only partly touched than the
> client driver implied.
>
> Let's fix all of that by adding the peripheral bus width verification
> method which would make sure that the passed source or destination address
> width is valid and if undefined then the driver will just fallback to the
> 1-byte width transfer.

Please, add a word that you apply the check in the dwc_config() which is
supposed to be called before preparing any transfer?

..

> +static int dwc_verify_p_buswidth(struct dma_chan *chan)
> +{
> + struct dw_dma_chan *dwc = to_dw_dma_chan(chan);
> + struct dw_dma *dw = to_dw_dma(chan->device);
> + u32 reg_width, max_width;
> +
> + if (dwc->dma_sconfig.direction == DMA_MEM_TO_DEV)
> + reg_width = dwc->dma_sconfig.dst_addr_width;
> + else if (dwc->dma_sconfig.direction == DMA_DEV_TO_MEM)
> + reg_width = dwc->dma_sconfig.src_addr_width;

> + else /* DMA_MEM_TO_MEM */

Actually not only this direction, but TBH I do not see value in these comments.

> + return 0;
> +
> + max_width = dw->pdata->data_width[dwc->dws.p_master];
> +
> + /* Fall-back to 1byte transfer width if undefined */

1-byte
(as you even used in the commit message correctly)

> + if (reg_width == DMA_SLAVE_BUSWIDTH_UNDEFINED)
> + reg_width = DMA_SLAVE_BUSWIDTH_1_BYTE;
> + else if (!is_power_of_2(reg_width) || reg_width > max_width)
> + return -EINVAL;
> + else /* bus width is valid */
> + return 0;
> +
> + /* Update undefined addr width value */
> + if (dwc->dma_sconfig.direction == DMA_MEM_TO_DEV)
> + dwc->dma_sconfig.dst_addr_width = reg_width;
> + else /* DMA_DEV_TO_MEM */
> + dwc->dma_sconfig.src_addr_width = reg_width;

So, can't you simply call clamp() for both fields in dwc_config()?

> + return 0;
> +}

..

> + int err;

Hmm... we have two functions one of which is using different name for this.
Can we have a patch to convert to err the other one?

--
With Best Regards,
Andy Shevchenko



2024-04-16 18:53:32

by Andy Shevchenko

[permalink] [raw]
Subject: Re: [PATCH 2/4] dmaengine: dw: Add memory bus width verification

On Tue, Apr 16, 2024 at 07:28:56PM +0300, Serge Semin wrote:
> Currently in case of the DEV_TO_MEM or MEM_TO_DEV DMA transfers the memory
> data width (single transfer width) is determined based on the buffer
> length, buffer base address or DMA master-channel max address width
> capability. It isn't enough in case of the channel disabling prior the
> block transfer is finished. Here is what DW AHB DMA IP-core databook says
> regarding the port suspension (DMA-transfer pause) implementation in the
> controller:
>
> "When CTLx.SRC_TR_WIDTH < CTLx.DST_TR_WIDTH and the CFGx.CH_SUSP bit is
> high, the CFGx.FIFO_EMPTY is asserted once the contents of the FIFO do not
> permit a single word of CTLx.DST_TR_WIDTH to be formed. However, there may
> still be data in the channel FIFO, but not enough to form a single
> transfer of CTLx.DST_TR_WIDTH. In this scenario, once the channel is
> disabled, the remaining data in the channel FIFO is not transferred to the
> destination peripheral."
>
> So in case if the port gets to be suspended and then disabled it's
> possible to have the data silently discarded even though the controller
> reported that FIFO is empty and the CTLx.BLOCK_TS indicated the dropped
> data already received from the source device. This looks as if the data
> somehow got lost on a way from the peripheral device to memory and causes
> problems for instance in the DW APB UART driver, which pauses and disables
> the DMA-transfer as soon as the recv data timeout happens. Here is the way
> it looks:
>
> Memory <------- DMA FIFO <------ UART FIFO <---------------- UART
> DST_TR_WIDTH -+--------| | |
> | | | | No more data
> Current lvl -+--------| |---------+- DMA-burst lvl
> | | |---------+- Leftover data
> | | |---------+- SRC_TR_WIDTH
> -+--------+-------+---------+
>
> In the example above: no more data is getting received over the UART port
> and BLOCK_TS is not even close to be fully received; some data is left in
> the UART FIFO, but not enough to perform a bursted DMA-xfer to the DMA
> FIFO; some data is left in the DMA FIFO, but not enough to be passed
> further to the system memory in a single transfer. In this situation the
> 8250 UART driver catches the recv timeout interrupt, pauses the
> DMA-transfer and terminates it completely, after which the IRQ handler
> manually fetches the leftover data from the UART FIFO into the
> recv-buffer. But since the DMA-channel has been disabled with the data
> left in the DMA FIFO, that data will be just discarded and the recv-buffer
> will have a gap of the "current lvl" size in the recv-buffer at the tail
> of the lately received data portion. So the data will be lost just due to
> the misconfigured DMA transfer.
>
> Note this is only relevant for the case of the transfer suspension and
> _disabling_. No problem will happen if the transfer will be re-enabled
> afterwards or the block transfer is fully completed. In the later case the
> "FIFO flush mode" will be executed at the transfer final stage in order to
> push out the data left in the DMA FIFO.
>
> In order to fix the denoted problem the DW AHB DMA-engine driver needs to
> make sure that the _bursted_ source transfer width is greater or equal to
> the single destination transfer (note the HW databook describes more
> strict constraint than actually required). Since the peripheral-device
> side is prescribed by the client driver logic, the memory-side can be only
> used for that. The solution can be easily implemented for the DEV_TO_MEM
> transfers just by adjusting the memory-channel address width. Sadly it's
> not that easy for the MEM_TO_DEV transfers since the mem-to-dma burst size
> is normally dynamically determined by the controller. So the only thing
> that can be done is to make sure that memory-side address width can be
> greater than the peripheral device address width.

..

> +static int dwc_verify_m_buswidth(struct dma_chan *chan)
> +{
> + struct dw_dma_chan *dwc = to_dw_dma_chan(chan);
> + struct dw_dma *dw = to_dw_dma(chan->device);
> + u32 reg_width, reg_burst, mem_width;
> +
> + mem_width = dw->pdata->data_width[dwc->dws.m_master];
> +
> + /* Make sure src and dst word widths are coherent */
> + if (dwc->dma_sconfig.direction == DMA_MEM_TO_DEV) {
> + reg_width = dwc->dma_sconfig.dst_addr_width;
> + if (mem_width < reg_width)
> + return -EINVAL;
> +
> + dwc->dma_sconfig.src_addr_width = mem_width;
> + } else if (dwc->dma_sconfig.direction == DMA_DEV_TO_MEM) {
> + reg_width = dwc->dma_sconfig.src_addr_width;
> + reg_burst = rounddown_pow_of_two(dwc->dma_sconfig.src_maxburst);
> +
> + dwc->dma_sconfig.dst_addr_width = min(mem_width, reg_width * reg_burst);

I understand the desire to go this way, but wouldn't be better to have
a symmetrical check and return an error?

> + }
> +
> + return 0;
> +}

IIRC MEM side of the DMA channel will ignore those in HW, so basically you are
(re-)using them purely for the __ffs() corrections.

..

> dwc->dma_sconfig.src_maxburst =
> - clamp(dwc->dma_sconfig.src_maxburst, 0U, dwc->max_burst);
> + clamp(dwc->dma_sconfig.src_maxburst, 1U, dwc->max_burst);
> dwc->dma_sconfig.dst_maxburst =
> - clamp(dwc->dma_sconfig.dst_maxburst, 0U, dwc->max_burst);
> + clamp(dwc->dma_sconfig.dst_maxburst, 1U, dwc->max_burst);

--
With Best Regards,
Andy Shevchenko



2024-04-16 19:12:11

by Andy Shevchenko

[permalink] [raw]
Subject: Re: [PATCH 3/4] dmaengine: dw: Simplify prepare CTL_LO methods

On Tue, Apr 16, 2024 at 07:28:57PM +0300, Serge Semin wrote:
> Currently the CTL LO fields are calculated on the platform-specific basis.
> It's implemented by means of the prepare_ctllo() callbacks using the
> ternary operator within the local variables init block at the beginning of
> the block scope. The functions code currently is relatively hard to
> comprehend and isn't that optimal since implies four conditional
> statements executed and two additional local variables defined. Let's
> simplify the DW AHB DMA prepare_ctllo() method by unrolling the ternary
> operators into the normal if-else statement, dropping redundant
> master-interface ID variables and initializing the local variables based
> on the singly evaluated DMA-transfer direction check. Thus the method will
> look much more readable since now the fields content can be easily
> inferred right from the if-else branch. Provide the same update in the
> Intel DMA32 core driver for sake of the driver code unification.
>
> Note besides of the effects described above this update is basically a
> preparation before dropping the max burst encoding callback. It will
> require calling the burst fields calculation methods right in the
> prepare_ctllo() callbacks, which would have made the later function code
> even more complex.

Yeah, this is inherited from the original driver where it used to be a macro.

..

> + if (dwc->direction == DMA_MEM_TO_DEV) {
> + sms = dwc->dws.m_master;
> + smsize = 0;
> + dms = dwc->dws.p_master;
> + dmsize = sconfig->dst_maxburst;

I would group it differently, i.e.

sms = dwc->dws.m_master;
dms = dwc->dws.p_master;
smsize = 0;
dmsize = sconfig->dst_maxburst;

> + } else if (dwc->direction == DMA_DEV_TO_MEM) {
> + sms = dwc->dws.p_master;
> + smsize = sconfig->src_maxburst;
> + dms = dwc->dws.m_master;
> + dmsize = 0;
> + } else /* DMA_MEM_TO_MEM */ {
> + sms = dwc->dws.m_master;
> + smsize = 0;
> + dms = dwc->dws.m_master;
> + dmsize = 0;
> + }

Ditto for two above cases.

> static u32 idma32_prepare_ctllo(struct dw_dma_chan *dwc)
> {
> struct dma_slave_config *sconfig = &dwc->dma_sconfig;
> - u8 smsize = (dwc->direction == DMA_DEV_TO_MEM) ? sconfig->src_maxburst : 0;
> - u8 dmsize = (dwc->direction == DMA_MEM_TO_DEV) ? sconfig->dst_maxburst : 0;
> + u8 smsize, dmsize;
> +
> + if (dwc->direction == DMA_MEM_TO_DEV) {
> + smsize = 0;
> + dmsize = sconfig->dst_maxburst;
> + } else if (dwc->direction == DMA_DEV_TO_MEM) {
> + smsize = sconfig->src_maxburst;
> + dmsize = 0;
> + } else /* DMA_MEM_TO_MEM */ {
> + smsize = 0;
> + dmsize = 0;
> + }

u8 smsize = 0, dmsize = 0;

if (dwc->direction == DMA_MEM_TO_DEV)
dmsize = sconfig->dst_maxburst;
else if (dwc->direction == DMA_DEV_TO_MEM)
smsize = sconfig->src_maxburst;

?

Something similar also can be done in the Synopsys case above, no?

--
With Best Regards,
Andy Shevchenko



2024-04-16 19:13:42

by Andy Shevchenko

[permalink] [raw]
Subject: Re: [PATCH 0/4] dmaengine: dw: Fix src/dst addr width misconfig

On Tue, Apr 16, 2024 at 07:28:54PM +0300, Serge Semin wrote:
> The main goal of this series is to fix the data disappearance in case of
> the DW UART handled by the DW AHB DMA engine. The problem happens on a
> portion of the data received when the pre-initialized DEV_TO_MEM
> DMA-transfer is paused and then disabled. The data just hangs up in the
> DMA-engine FIFO and isn't flushed out to the memory on the DMA-channel
> suspension (see the second commit log for details). On a way to find the
> denoted problem fix it was discovered that the driver doesn't verify the
> peripheral device address width specified by a client driver, which in its
> turn if unsupported or undefined value passed may cause DMA-transfer being
> misconfigured. It's fixed in the first patch of the series.
>
> In addition to that two cleanup patch follow the fixes described above in
> order to make the DWC-engine configuration procedure more coherent. First
> one simplifies the CTL_LO register setup methods. Second one simplifies
> the max-burst calculation procedure and unifies it with the rest of the
> verification methods. Please see the patches log for more details.

Thank you for this.
I have looked into all of them and most worrying (relatively to the rest) to me
is the second patch that does some tricks. The rest are the cosmetics that can
be easily addressed.

--
With Best Regards,
Andy Shevchenko



2024-04-16 19:15:05

by Andy Shevchenko

[permalink] [raw]
Subject: Re: [PATCH 4/4] dmaengine: dw: Simplify max-burst calculation procedure

On Tue, Apr 16, 2024 at 07:28:58PM +0300, Serge Semin wrote:
> In order to have a more coherent DW AHB DMA slave configuration method
> let's simplify the source and destination channel max-burst calculation
> procedure:
>
> 1. Create the max-burst verification method as it has been just done for
> the memory and peripheral address widths. Thus the DWC DMA slave config

dwc_config() method

?

> method will turn to a set of the verification methods execution.
>
> 2. Since both the generic DW AHB DMA and Intel DMA32 engines support the

"i" in iDMA 32-bit stands for "integrated", so 'Intel iDMA 32-bit'

> power-of-2 bursts only, then the specified by the client driver max-burst
> values can be converted to being power-of-2 right in the max-burst
> verification method.
>
> 3. Since max-burst encoded value is required on the CTL_LO fields
> calculation stage, the encode_maxburst() callback can be easily dropped
> from the dw_dma structure meanwhile the encoding procedure will be
> executed right in the CTL_LO register value calculation.
>
> Thus the update will provide the next positive effects: the internal
> DMA-slave config structure will contain only the real DMA-transfer config
> value, which will be encoded to the DMA-controller register fields only
> when it's required on the buffer mapping; the redundant encode_maxburst()
> callback will be dropped simplifying the internal HW-abstraction API;
> DWC-config method will look more readable executing the verification

dwc_config() method

?

> functions one-by-one.

..

> +static void dwc_verify_maxburst(struct dma_chan *chan)

It's inconsistent to the rest of _verify methods. It doesn't verify as it
doesn't return anything. Make it int or rename the function.

> +{
> + struct dw_dma_chan *dwc = to_dw_dma_chan(chan);
> +
> + dwc->dma_sconfig.src_maxburst =
> + clamp(dwc->dma_sconfig.src_maxburst, 1U, dwc->max_burst);
> + dwc->dma_sconfig.dst_maxburst =
> + clamp(dwc->dma_sconfig.dst_maxburst, 1U, dwc->max_burst);
> +
> + dwc->dma_sconfig.src_maxburst =
> + rounddown_pow_of_two(dwc->dma_sconfig.src_maxburst);
> + dwc->dma_sconfig.dst_maxburst =
> + rounddown_pow_of_two(dwc->dma_sconfig.dst_maxburst);
> +}

..

> static int dwc_verify_p_buswidth(struct dma_chan *chan)
> - reg_burst = rounddown_pow_of_two(dwc->dma_sconfig.src_maxburst);
> + reg_burst = dwc->dma_sconfig.src_maxburst;

Seems you have a dependency, need a comment below that maxburst has to be
"verified" [whatever] first.

..

> +static inline u8 dw_dma_encode_maxburst(u32 maxburst)
> +{
> + /*
> + * Fix burst size according to dw_dmac. We need to convert them as:
> + * 1 -> 0, 4 -> 1, 8 -> 2, 16 -> 3.
> + */
> + return maxburst > 1 ? fls(maxburst) - 2 : 0;
> +}

Split these moves to another preparatory patch.

--
With Best Regards,
Andy Shevchenko



2024-04-17 17:28:23

by Andy Shevchenko

[permalink] [raw]
Subject: Re: [PATCH 2/4] dmaengine: dw: Add memory bus width verification

On Wed, Apr 17, 2024 at 08:13:59PM +0300, Serge Semin wrote:
> On Tue, Apr 16, 2024 at 09:47:02PM +0300, Andy Shevchenko wrote:
> > On Tue, Apr 16, 2024 at 07:28:56PM +0300, Serge Semin wrote:
> > > Currently in case of the DEV_TO_MEM or MEM_TO_DEV DMA transfers the memory
> > > data width (single transfer width) is determined based on the buffer
> > > length, buffer base address or DMA master-channel max address width
> > > capability. It isn't enough in case of the channel disabling prior the
> > > block transfer is finished. Here is what DW AHB DMA IP-core databook says
> > > regarding the port suspension (DMA-transfer pause) implementation in the
> > > controller:
> > >
> > > "When CTLx.SRC_TR_WIDTH < CTLx.DST_TR_WIDTH and the CFGx.CH_SUSP bit is
> > > high, the CFGx.FIFO_EMPTY is asserted once the contents of the FIFO do not
> > > permit a single word of CTLx.DST_TR_WIDTH to be formed. However, there may
> > > still be data in the channel FIFO, but not enough to form a single
> > > transfer of CTLx.DST_TR_WIDTH. In this scenario, once the channel is
> > > disabled, the remaining data in the channel FIFO is not transferred to the
> > > destination peripheral."
> > >
> > > So in case if the port gets to be suspended and then disabled it's
> > > possible to have the data silently discarded even though the controller
> > > reported that FIFO is empty and the CTLx.BLOCK_TS indicated the dropped
> > > data already received from the source device. This looks as if the data
> > > somehow got lost on a way from the peripheral device to memory and causes
> > > problems for instance in the DW APB UART driver, which pauses and disables
> > > the DMA-transfer as soon as the recv data timeout happens. Here is the way
> > > it looks:
> > >
> > > Memory <------- DMA FIFO <------ UART FIFO <---------------- UART
> > > DST_TR_WIDTH -+--------| | |
> > > | | | | No more data
> > > Current lvl -+--------| |---------+- DMA-burst lvl
> > > | | |---------+- Leftover data
> > > | | |---------+- SRC_TR_WIDTH
> > > -+--------+-------+---------+
> > >
> > > In the example above: no more data is getting received over the UART port
> > > and BLOCK_TS is not even close to be fully received; some data is left in
> > > the UART FIFO, but not enough to perform a bursted DMA-xfer to the DMA
> > > FIFO; some data is left in the DMA FIFO, but not enough to be passed
> > > further to the system memory in a single transfer. In this situation the
> > > 8250 UART driver catches the recv timeout interrupt, pauses the
> > > DMA-transfer and terminates it completely, after which the IRQ handler
> > > manually fetches the leftover data from the UART FIFO into the
> > > recv-buffer. But since the DMA-channel has been disabled with the data
> > > left in the DMA FIFO, that data will be just discarded and the recv-buffer
> > > will have a gap of the "current lvl" size in the recv-buffer at the tail
> > > of the lately received data portion. So the data will be lost just due to
> > > the misconfigured DMA transfer.
> > >
> > > Note this is only relevant for the case of the transfer suspension and
> > > _disabling_. No problem will happen if the transfer will be re-enabled
> > > afterwards or the block transfer is fully completed. In the later case the
> > > "FIFO flush mode" will be executed at the transfer final stage in order to
> > > push out the data left in the DMA FIFO.
> > >
> > > In order to fix the denoted problem the DW AHB DMA-engine driver needs to
> > > make sure that the _bursted_ source transfer width is greater or equal to
> > > the single destination transfer (note the HW databook describes more
> > > strict constraint than actually required). Since the peripheral-device
> > > side is prescribed by the client driver logic, the memory-side can be only
> > > used for that. The solution can be easily implemented for the DEV_TO_MEM
> > > transfers just by adjusting the memory-channel address width. Sadly it's
> > > not that easy for the MEM_TO_DEV transfers since the mem-to-dma burst size
> > > is normally dynamically determined by the controller. So the only thing
> > > that can be done is to make sure that memory-side address width can be
> > > greater than the peripheral device address width.

..

> > > +static int dwc_verify_m_buswidth(struct dma_chan *chan)
> > > +{
> > > + struct dw_dma_chan *dwc = to_dw_dma_chan(chan);
> > > + struct dw_dma *dw = to_dw_dma(chan->device);
> > > + u32 reg_width, reg_burst, mem_width;
> > > +
> > > + mem_width = dw->pdata->data_width[dwc->dws.m_master];
> > > +
> > > + /* Make sure src and dst word widths are coherent */
> > > + if (dwc->dma_sconfig.direction == DMA_MEM_TO_DEV) {
> > > + reg_width = dwc->dma_sconfig.dst_addr_width;
> > > + if (mem_width < reg_width)
> > > + return -EINVAL;
> > > +
> > > + dwc->dma_sconfig.src_addr_width = mem_width;
> > > + } else if (dwc->dma_sconfig.direction == DMA_DEV_TO_MEM) {
> > > + reg_width = dwc->dma_sconfig.src_addr_width;
> > > + reg_burst = rounddown_pow_of_two(dwc->dma_sconfig.src_maxburst);
> > > +
> > > + dwc->dma_sconfig.dst_addr_width = min(mem_width, reg_width * reg_burst);
> >
>
> > I understand the desire to go this way, but wouldn't be better to have
> > a symmetrical check and return an error?
>
> Sadly the situation isn't symmetrical.
>
> The main idea of the solution proposed in this patch is to make sure
> that the DMA transactions would fill in the DMA FIFO in a way so in
> case of the suspension all the data would be delivered to the
> destination with nothing left in the DMA FIFO and the CFGx.FIFO_EMPTY
> flag would mean the real FIFO emptiness. It can be reached only if
> (CTLx.SRC_TR_WIDTH * CTLx.SRC_MSIZE) >= CTLx.DST_TR_WIDTH
> (calculated in the real values of course). But CTLx.SRC_MSIZE is only
> relevant for the flow-control/non-memory peripherals. Thus the

Oh, if it involves flow control, shouldn't you check for that as well?
We have a (PPC? IIRC) hardware with this IP that can have peripheral
as flow control.

> conditions under which the problem can be avoided are:
>
> DMA_MEM_TO_DEV: CTLx.SRC_TR_WIDTH >= CTLx.DST_TR_WIDTH
> DMA_DEV_TO_MEM: CTLx.SRC_TR_WIDTH * CTLx.SRC_MSIZE >= CTLx.DST_TR_WIDTH
>
> In both cases the non-memory peripheral side parameters (DEV-side)
> can't be changed because they are selected by the client drivers based
> on their specific logic (Device FIFO depth, watermarks, CSR widths,
> etc). But we can vary the memory-side transfer width as long as it's
> within the permitted limits.
>
> In case of the DMA_MEM_TO_DEV transfers we can change the
> CTLx.SRC_TR_WIDTH because it represents the memory side transfer
> width. But if the maximum memory transfer width is smaller than the
> specified destination register width, there is nothing we can do. Thus
> returning the EINVAL error. Note this is mainly a hypothetical
> situation since normally the max width of the memory master xfers is
> greater than the peripheral master xfer max width (in my case it's 128
> and 32 bits respectively).
>
> In case of the DMA_DEV_TO_MEM transfers we can change the CTLx.DST_TR_WIDTH
> parameter because it's the memory side. Thus if the maximum
> memory transfer width is smaller than the bursted source transfer,
> then we can stick to the maximum memory transfer width. But if it's
> greater than the bursted source transfer, we can freely reduce it
> so to support the safe suspension+disable DMA-usage pattern.
>
> >
> > > + }
> > > +
> > > + return 0;
> > > +}
> >
>
> > IIRC MEM side of the DMA channel will ignore those in HW, so basically you are
> > (re-)using them purely for the __ffs() corrections.
>
> No. DMAC ignores the _burst length_ parameters CTLx.SRC_MSIZE and
> CTLx.DEST_MSIZE for the memory side (also see my comment above):
>
> "The CTLx.SRC_MSIZE and CTLx.DEST_MSIZE are properties valid only for
> peripherals with a handshaking interface; they cannot be used for
> defining the burst length for memory peripherals.
>
> When the peripherals are memory, the DW_ahb_dmac is always the flow
> controller and uses DMA transfers to move blocks; thus the
> CTLx.SRC_MSIZE and CTLx.DEST_MSIZE values are not used for memory
> peripherals. The SRC_MSIZE/DEST_MSIZE limitations are used to
> accommodate devices that have limited resources, such as a FIFO.
> Memory does not normally have limitations similar to the FIFOs."
>
> In my case the problem is in the CTLx.SRC_TR_WIDTH and
> CTLx.DST_TR_WIDTH values misconfiguration. Here is the crucial comment
> in the HW-manual about that (cited in the commit messages):
>
> "When CTLx.SRC_TR_WIDTH < CTLx.DST_TR_WIDTH and the CFGx.CH_SUSP bit is
> high, the CFGx.FIFO_EMPTY is asserted once the contents of the FIFO do not
> permit a single word of CTLx.DST_TR_WIDTH to be formed. However, there may
> still be data in the channel FIFO, but not enough to form a single
> transfer of CTLx.DST_TR_WIDTH. In this scenario, once the channel is
> disabled, the remaining data in the channel FIFO is not transferred to the
> destination peripheral."
>
> See Chapter 7.7 "Disabling a Channel Prior to Transfer Completion" of
> the DW DMAC HW manual for more details.

Got it. Maybe a little summary in the code to explain all this magic?

..

> > > dwc->dma_sconfig.src_maxburst =
> > > - clamp(dwc->dma_sconfig.src_maxburst, 0U, dwc->max_burst);
> > > + clamp(dwc->dma_sconfig.src_maxburst, 1U, dwc->max_burst);
> > > dwc->dma_sconfig.dst_maxburst =
> > > - clamp(dwc->dma_sconfig.dst_maxburst, 0U, dwc->max_burst);
> > > + clamp(dwc->dma_sconfig.dst_maxburst, 1U, dwc->max_burst);

--
With Best Regards,
Andy Shevchenko



2024-04-17 17:33:21

by Serge Semin

[permalink] [raw]
Subject: Re: [PATCH 2/4] dmaengine: dw: Add memory bus width verification

On Tue, Apr 16, 2024 at 09:47:02PM +0300, Andy Shevchenko wrote:
> On Tue, Apr 16, 2024 at 07:28:56PM +0300, Serge Semin wrote:
> > Currently in case of the DEV_TO_MEM or MEM_TO_DEV DMA transfers the memory
> > data width (single transfer width) is determined based on the buffer
> > length, buffer base address or DMA master-channel max address width
> > capability. It isn't enough in case of the channel disabling prior the
> > block transfer is finished. Here is what DW AHB DMA IP-core databook says
> > regarding the port suspension (DMA-transfer pause) implementation in the
> > controller:
> >
> > "When CTLx.SRC_TR_WIDTH < CTLx.DST_TR_WIDTH and the CFGx.CH_SUSP bit is
> > high, the CFGx.FIFO_EMPTY is asserted once the contents of the FIFO do not
> > permit a single word of CTLx.DST_TR_WIDTH to be formed. However, there may
> > still be data in the channel FIFO, but not enough to form a single
> > transfer of CTLx.DST_TR_WIDTH. In this scenario, once the channel is
> > disabled, the remaining data in the channel FIFO is not transferred to the
> > destination peripheral."
> >
> > So in case if the port gets to be suspended and then disabled it's
> > possible to have the data silently discarded even though the controller
> > reported that FIFO is empty and the CTLx.BLOCK_TS indicated the dropped
> > data already received from the source device. This looks as if the data
> > somehow got lost on a way from the peripheral device to memory and causes
> > problems for instance in the DW APB UART driver, which pauses and disables
> > the DMA-transfer as soon as the recv data timeout happens. Here is the way
> > it looks:
> >
> > Memory <------- DMA FIFO <------ UART FIFO <---------------- UART
> > DST_TR_WIDTH -+--------| | |
> > | | | | No more data
> > Current lvl -+--------| |---------+- DMA-burst lvl
> > | | |---------+- Leftover data
> > | | |---------+- SRC_TR_WIDTH
> > -+--------+-------+---------+
> >
> > In the example above: no more data is getting received over the UART port
> > and BLOCK_TS is not even close to be fully received; some data is left in
> > the UART FIFO, but not enough to perform a bursted DMA-xfer to the DMA
> > FIFO; some data is left in the DMA FIFO, but not enough to be passed
> > further to the system memory in a single transfer. In this situation the
> > 8250 UART driver catches the recv timeout interrupt, pauses the
> > DMA-transfer and terminates it completely, after which the IRQ handler
> > manually fetches the leftover data from the UART FIFO into the
> > recv-buffer. But since the DMA-channel has been disabled with the data
> > left in the DMA FIFO, that data will be just discarded and the recv-buffer
> > will have a gap of the "current lvl" size in the recv-buffer at the tail
> > of the lately received data portion. So the data will be lost just due to
> > the misconfigured DMA transfer.
> >
> > Note this is only relevant for the case of the transfer suspension and
> > _disabling_. No problem will happen if the transfer will be re-enabled
> > afterwards or the block transfer is fully completed. In the later case the
> > "FIFO flush mode" will be executed at the transfer final stage in order to
> > push out the data left in the DMA FIFO.
> >
> > In order to fix the denoted problem the DW AHB DMA-engine driver needs to
> > make sure that the _bursted_ source transfer width is greater or equal to
> > the single destination transfer (note the HW databook describes more
> > strict constraint than actually required). Since the peripheral-device
> > side is prescribed by the client driver logic, the memory-side can be only
> > used for that. The solution can be easily implemented for the DEV_TO_MEM
> > transfers just by adjusting the memory-channel address width. Sadly it's
> > not that easy for the MEM_TO_DEV transfers since the mem-to-dma burst size
> > is normally dynamically determined by the controller. So the only thing
> > that can be done is to make sure that memory-side address width can be
> > greater than the peripheral device address width.
>
> ...
>
> > +static int dwc_verify_m_buswidth(struct dma_chan *chan)
> > +{
> > + struct dw_dma_chan *dwc = to_dw_dma_chan(chan);
> > + struct dw_dma *dw = to_dw_dma(chan->device);
> > + u32 reg_width, reg_burst, mem_width;
> > +
> > + mem_width = dw->pdata->data_width[dwc->dws.m_master];
> > +
> > + /* Make sure src and dst word widths are coherent */
> > + if (dwc->dma_sconfig.direction == DMA_MEM_TO_DEV) {
> > + reg_width = dwc->dma_sconfig.dst_addr_width;
> > + if (mem_width < reg_width)
> > + return -EINVAL;
> > +
> > + dwc->dma_sconfig.src_addr_width = mem_width;
> > + } else if (dwc->dma_sconfig.direction == DMA_DEV_TO_MEM) {
> > + reg_width = dwc->dma_sconfig.src_addr_width;
> > + reg_burst = rounddown_pow_of_two(dwc->dma_sconfig.src_maxburst);
> > +
> > + dwc->dma_sconfig.dst_addr_width = min(mem_width, reg_width * reg_burst);
>

> I understand the desire to go this way, but wouldn't be better to have
> a symmetrical check and return an error?

Sadly the situation isn't symmetrical.

The main idea of the solution proposed in this patch is to make sure
that the DMA transactions would fill in the DMA FIFO in a way so in
case of the suspension all the data would be delivered to the
destination with nothing left in the DMA FIFO and the CFGx.FIFO_EMPTY
flag would mean the real FIFO emptiness. It can be reached only if
(CTLx.SRC_TR_WIDTH * CTLx.SRC_MSIZE) >= CTLx.DST_TR_WIDTH
(calculated in the real values of course). But CTLx.SRC_MSIZE is only
relevant for the flow-control/non-memory peripherals. Thus the
conditions under which the problem can be avoided are:

DMA_MEM_TO_DEV: CTLx.SRC_TR_WIDTH >= CTLx.DST_TR_WIDTH
DMA_DEV_TO_MEM: CTLx.SRC_TR_WIDTH * CTLx.SRC_MSIZE >= CTLx.DST_TR_WIDTH

In both cases the non-memory peripheral side parameters (DEV-side)
can't be changed because they are selected by the client drivers based
on their specific logic (Device FIFO depth, watermarks, CSR widths,
etc). But we can vary the memory-side transfer width as long as it's
within the permitted limits.

In case of the DMA_MEM_TO_DEV transfers we can change the
CTLx.SRC_TR_WIDTH because it represents the memory side transfer
width. But if the maximum memory transfer width is smaller than the
specified destination register width, there is nothing we can do. Thus
returning the EINVAL error. Note this is mainly a hypothetical
situation since normally the max width of the memory master xfers is
greater than the peripheral master xfer max width (in my case it's 128
and 32 bits respectively).

In case of the DMA_DEV_TO_MEM transfers we can change the CTLx.DST_TR_WIDTH
parameter because it's the memory side. Thus if the maximum
memory transfer width is smaller than the bursted source transfer,
then we can stick to the maximum memory transfer width. But if it's
greater than the bursted source transfer, we can freely reduce it
so to support the safe suspension+disable DMA-usage pattern.

>
> > + }
> > +
> > + return 0;
> > +}
>

> IIRC MEM side of the DMA channel will ignore those in HW, so basically you are
> (re-)using them purely for the __ffs() corrections.

No. DMAC ignores the _burst length_ parameters CTLx.SRC_MSIZE and
CTLx.DEST_MSIZE for the memory side (also see my comment above):

"The CTLx.SRC_MSIZE and CTLx.DEST_MSIZE are properties valid only for
peripherals with a handshaking interface; they cannot be used for
defining the burst length for memory peripherals.

When the peripherals are memory, the DW_ahb_dmac is always the flow
controller and uses DMA transfers to move blocks; thus the
CTLx.SRC_MSIZE and CTLx.DEST_MSIZE values are not used for memory
peripherals. The SRC_MSIZE/DEST_MSIZE limitations are used to
accommodate devices that have limited resources, such as a FIFO.
Memory does not normally have limitations similar to the FIFOs."

In my case the problem is in the CTLx.SRC_TR_WIDTH and
CTLx.DST_TR_WIDTH values misconfiguration. Here is the crucial comment
in the HW-manual about that (cited in the commit messages):

"When CTLx.SRC_TR_WIDTH < CTLx.DST_TR_WIDTH and the CFGx.CH_SUSP bit is
high, the CFGx.FIFO_EMPTY is asserted once the contents of the FIFO do not
permit a single word of CTLx.DST_TR_WIDTH to be formed. However, there may
still be data in the channel FIFO, but not enough to form a single
transfer of CTLx.DST_TR_WIDTH. In this scenario, once the channel is
disabled, the remaining data in the channel FIFO is not transferred to the
destination peripheral."

See Chapter 7.7 "Disabling a Channel Prior to Transfer Completion" of
the DW DMAC HW manual for more details.

-Serge(y)

>
> ...
>
> > dwc->dma_sconfig.src_maxburst =
> > - clamp(dwc->dma_sconfig.src_maxburst, 0U, dwc->max_burst);
> > + clamp(dwc->dma_sconfig.src_maxburst, 1U, dwc->max_burst);
> > dwc->dma_sconfig.dst_maxburst =
> > - clamp(dwc->dma_sconfig.dst_maxburst, 0U, dwc->max_burst);
> > + clamp(dwc->dma_sconfig.dst_maxburst, 1U, dwc->max_burst);
>
> --
> With Best Regards,
> Andy Shevchenko
>
>

2024-04-17 17:34:47

by Serge Semin

[permalink] [raw]
Subject: Re: [PATCH 0/4] dmaengine: dw: Fix src/dst addr width misconfig

On Tue, Apr 16, 2024 at 10:13:23PM +0300, Andy Shevchenko wrote:
> On Tue, Apr 16, 2024 at 07:28:54PM +0300, Serge Semin wrote:
> > The main goal of this series is to fix the data disappearance in case of
> > the DW UART handled by the DW AHB DMA engine. The problem happens on a
> > portion of the data received when the pre-initialized DEV_TO_MEM
> > DMA-transfer is paused and then disabled. The data just hangs up in the
> > DMA-engine FIFO and isn't flushed out to the memory on the DMA-channel
> > suspension (see the second commit log for details). On a way to find the
> > denoted problem fix it was discovered that the driver doesn't verify the
> > peripheral device address width specified by a client driver, which in its
> > turn if unsupported or undefined value passed may cause DMA-transfer being
> > misconfigured. It's fixed in the first patch of the series.
> >
> > In addition to that two cleanup patch follow the fixes described above in
> > order to make the DWC-engine configuration procedure more coherent. First
> > one simplifies the CTL_LO register setup methods. Second one simplifies
> > the max-burst calculation procedure and unifies it with the rest of the
> > verification methods. Please see the patches log for more details.
>
> Thank you for this.

> I have looked into all of them and most worrying (relatively to the rest) to me
> is the second patch that does some tricks.

That's a crucial patch for which the series has been intended. The
rest of the patches were created to align the code around the fix.

> The rest are the cosmetics that can
> be easily addressed.

Right. I'll have a look at the rest of your comments shortly.

-Serge(y)

>
> --
> With Best Regards,
> Andy Shevchenko
>
>

2024-04-17 19:00:32

by Serge Semin

[permalink] [raw]
Subject: Re: [PATCH 2/4] dmaengine: dw: Add memory bus width verification

On Wed, Apr 17, 2024 at 08:28:06PM +0300, Andy Shevchenko wrote:
> On Wed, Apr 17, 2024 at 08:13:59PM +0300, Serge Semin wrote:
> > On Tue, Apr 16, 2024 at 09:47:02PM +0300, Andy Shevchenko wrote:
> > > On Tue, Apr 16, 2024 at 07:28:56PM +0300, Serge Semin wrote:
> > > > Currently in case of the DEV_TO_MEM or MEM_TO_DEV DMA transfers the memory
> > > > data width (single transfer width) is determined based on the buffer
> > > > length, buffer base address or DMA master-channel max address width
> > > > capability. It isn't enough in case of the channel disabling prior the
> > > > block transfer is finished. Here is what DW AHB DMA IP-core databook says
> > > > regarding the port suspension (DMA-transfer pause) implementation in the
> > > > controller:
> > > >
> > > > "When CTLx.SRC_TR_WIDTH < CTLx.DST_TR_WIDTH and the CFGx.CH_SUSP bit is
> > > > high, the CFGx.FIFO_EMPTY is asserted once the contents of the FIFO do not
> > > > permit a single word of CTLx.DST_TR_WIDTH to be formed. However, there may
> > > > still be data in the channel FIFO, but not enough to form a single
> > > > transfer of CTLx.DST_TR_WIDTH. In this scenario, once the channel is
> > > > disabled, the remaining data in the channel FIFO is not transferred to the
> > > > destination peripheral."
> > > >
> > > > So in case if the port gets to be suspended and then disabled it's
> > > > possible to have the data silently discarded even though the controller
> > > > reported that FIFO is empty and the CTLx.BLOCK_TS indicated the dropped
> > > > data already received from the source device. This looks as if the data
> > > > somehow got lost on a way from the peripheral device to memory and causes
> > > > problems for instance in the DW APB UART driver, which pauses and disables
> > > > the DMA-transfer as soon as the recv data timeout happens. Here is the way
> > > > it looks:
> > > >
> > > > Memory <------- DMA FIFO <------ UART FIFO <---------------- UART
> > > > DST_TR_WIDTH -+--------| | |
> > > > | | | | No more data
> > > > Current lvl -+--------| |---------+- DMA-burst lvl
> > > > | | |---------+- Leftover data
> > > > | | |---------+- SRC_TR_WIDTH
> > > > -+--------+-------+---------+
> > > >
> > > > In the example above: no more data is getting received over the UART port
> > > > and BLOCK_TS is not even close to be fully received; some data is left in
> > > > the UART FIFO, but not enough to perform a bursted DMA-xfer to the DMA
> > > > FIFO; some data is left in the DMA FIFO, but not enough to be passed
> > > > further to the system memory in a single transfer. In this situation the
> > > > 8250 UART driver catches the recv timeout interrupt, pauses the
> > > > DMA-transfer and terminates it completely, after which the IRQ handler
> > > > manually fetches the leftover data from the UART FIFO into the
> > > > recv-buffer. But since the DMA-channel has been disabled with the data
> > > > left in the DMA FIFO, that data will be just discarded and the recv-buffer
> > > > will have a gap of the "current lvl" size in the recv-buffer at the tail
> > > > of the lately received data portion. So the data will be lost just due to
> > > > the misconfigured DMA transfer.
> > > >
> > > > Note this is only relevant for the case of the transfer suspension and
> > > > _disabling_. No problem will happen if the transfer will be re-enabled
> > > > afterwards or the block transfer is fully completed. In the later case the
> > > > "FIFO flush mode" will be executed at the transfer final stage in order to
> > > > push out the data left in the DMA FIFO.
> > > >
> > > > In order to fix the denoted problem the DW AHB DMA-engine driver needs to
> > > > make sure that the _bursted_ source transfer width is greater or equal to
> > > > the single destination transfer (note the HW databook describes more
> > > > strict constraint than actually required). Since the peripheral-device
> > > > side is prescribed by the client driver logic, the memory-side can be only
> > > > used for that. The solution can be easily implemented for the DEV_TO_MEM
> > > > transfers just by adjusting the memory-channel address width. Sadly it's
> > > > not that easy for the MEM_TO_DEV transfers since the mem-to-dma burst size
> > > > is normally dynamically determined by the controller. So the only thing
> > > > that can be done is to make sure that memory-side address width can be
> > > > greater than the peripheral device address width.
>
> ...
>
> > > > +static int dwc_verify_m_buswidth(struct dma_chan *chan)
> > > > +{
> > > > + struct dw_dma_chan *dwc = to_dw_dma_chan(chan);
> > > > + struct dw_dma *dw = to_dw_dma(chan->device);
> > > > + u32 reg_width, reg_burst, mem_width;
> > > > +
> > > > + mem_width = dw->pdata->data_width[dwc->dws.m_master];
> > > > +
> > > > + /* Make sure src and dst word widths are coherent */
> > > > + if (dwc->dma_sconfig.direction == DMA_MEM_TO_DEV) {
> > > > + reg_width = dwc->dma_sconfig.dst_addr_width;
> > > > + if (mem_width < reg_width)
> > > > + return -EINVAL;
> > > > +
> > > > + dwc->dma_sconfig.src_addr_width = mem_width;
> > > > + } else if (dwc->dma_sconfig.direction == DMA_DEV_TO_MEM) {
> > > > + reg_width = dwc->dma_sconfig.src_addr_width;
> > > > + reg_burst = rounddown_pow_of_two(dwc->dma_sconfig.src_maxburst);
> > > > +
> > > > + dwc->dma_sconfig.dst_addr_width = min(mem_width, reg_width * reg_burst);
> > >
> >
> > > I understand the desire to go this way, but wouldn't be better to have
> > > a symmetrical check and return an error?
> >
> > Sadly the situation isn't symmetrical.
> >
> > The main idea of the solution proposed in this patch is to make sure
> > that the DMA transactions would fill in the DMA FIFO in a way so in
> > case of the suspension all the data would be delivered to the
> > destination with nothing left in the DMA FIFO and the CFGx.FIFO_EMPTY
> > flag would mean the real FIFO emptiness. It can be reached only if
> > (CTLx.SRC_TR_WIDTH * CTLx.SRC_MSIZE) >= CTLx.DST_TR_WIDTH
> > (calculated in the real values of course). But CTLx.SRC_MSIZE is only
> > relevant for the flow-control/non-memory peripherals. Thus the
>

> Oh, if it involves flow control, shouldn't you check for that as well?
> We have a (PPC? IIRC) hardware with this IP that can have peripheral
> as flow control.

Oops. Sorry, I was wrong in saying that the peripheral is a
flow-controller. All my cases imply DMAC as flow-controller. The
correct sentence would be "But CTLx.SRC_MSIZE is only relevant for the
non-memory peripherals."

>
> > conditions under which the problem can be avoided are:
> >
> > DMA_MEM_TO_DEV: CTLx.SRC_TR_WIDTH >= CTLx.DST_TR_WIDTH
> > DMA_DEV_TO_MEM: CTLx.SRC_TR_WIDTH * CTLx.SRC_MSIZE >= CTLx.DST_TR_WIDTH
> >
> > In both cases the non-memory peripheral side parameters (DEV-side)
> > can't be changed because they are selected by the client drivers based
> > on their specific logic (Device FIFO depth, watermarks, CSR widths,
> > etc). But we can vary the memory-side transfer width as long as it's
> > within the permitted limits.
> >
> > In case of the DMA_MEM_TO_DEV transfers we can change the
> > CTLx.SRC_TR_WIDTH because it represents the memory side transfer
> > width. But if the maximum memory transfer width is smaller than the
> > specified destination register width, there is nothing we can do. Thus
> > returning the EINVAL error. Note this is mainly a hypothetical
> > situation since normally the max width of the memory master xfers is
> > greater than the peripheral master xfer max width (in my case it's 128
> > and 32 bits respectively).
> >
> > In case of the DMA_DEV_TO_MEM transfers we can change the CTLx.DST_TR_WIDTH
> > parameter because it's the memory side. Thus if the maximum
> > memory transfer width is smaller than the bursted source transfer,
> > then we can stick to the maximum memory transfer width. But if it's
> > greater than the bursted source transfer, we can freely reduce it
> > so to support the safe suspension+disable DMA-usage pattern.
> >
> > >
> > > > + }
> > > > +
> > > > + return 0;
> > > > +}
> > >
> >
> > > IIRC MEM side of the DMA channel will ignore those in HW, so basically you are
> > > (re-)using them purely for the __ffs() corrections.
> >
> > No. DMAC ignores the _burst length_ parameters CTLx.SRC_MSIZE and
> > CTLx.DEST_MSIZE for the memory side (also see my comment above):
> >
> > "The CTLx.SRC_MSIZE and CTLx.DEST_MSIZE are properties valid only for
> > peripherals with a handshaking interface; they cannot be used for
> > defining the burst length for memory peripherals.
> >
> > When the peripherals are memory, the DW_ahb_dmac is always the flow
> > controller and uses DMA transfers to move blocks; thus the
> > CTLx.SRC_MSIZE and CTLx.DEST_MSIZE values are not used for memory
> > peripherals. The SRC_MSIZE/DEST_MSIZE limitations are used to
> > accommodate devices that have limited resources, such as a FIFO.
> > Memory does not normally have limitations similar to the FIFOs."
> >
> > In my case the problem is in the CTLx.SRC_TR_WIDTH and
> > CTLx.DST_TR_WIDTH values misconfiguration. Here is the crucial comment
> > in the HW-manual about that (cited in the commit messages):
> >
> > "When CTLx.SRC_TR_WIDTH < CTLx.DST_TR_WIDTH and the CFGx.CH_SUSP bit is
> > high, the CFGx.FIFO_EMPTY is asserted once the contents of the FIFO do not
> > permit a single word of CTLx.DST_TR_WIDTH to be formed. However, there may
> > still be data in the channel FIFO, but not enough to form a single
> > transfer of CTLx.DST_TR_WIDTH. In this scenario, once the channel is
> > disabled, the remaining data in the channel FIFO is not transferred to the
> > destination peripheral."
> >
> > See Chapter 7.7 "Disabling a Channel Prior to Transfer Completion" of
> > the DW DMAC HW manual for more details.
>

> Got it. Maybe a little summary in the code to explain all this magic?

Will it be enough to add something like this:
/*
* It's possible to have a data portion locked in the DMA FIFO in case
* of the channel suspension. Subsequent channel disabling will cause
* that data silent loss. In order to prevent that maintain the src
* and dst transfer widths coherency by means of the relation:
* (CTLx.SRC_TR_WIDTH * CTLx.SRC_MSIZE >= CTLx.DST_TR_WIDTH)
*/
?

-Serge(y)

>
> ...
>
> > > > dwc->dma_sconfig.src_maxburst =
> > > > - clamp(dwc->dma_sconfig.src_maxburst, 0U, dwc->max_burst);
> > > > + clamp(dwc->dma_sconfig.src_maxburst, 1U, dwc->max_burst);
> > > > dwc->dma_sconfig.dst_maxburst =
> > > > - clamp(dwc->dma_sconfig.dst_maxburst, 0U, dwc->max_burst);
> > > > + clamp(dwc->dma_sconfig.dst_maxburst, 1U, dwc->max_burst);
>
> --
> With Best Regards,
> Andy Shevchenko
>
>

2024-04-17 19:54:56

by Serge Semin

[permalink] [raw]
Subject: Re: [PATCH 1/4] dmaengine: dw: Add peripheral bus width verification

On Tue, Apr 16, 2024 at 09:00:50PM +0300, Andy Shevchenko wrote:
> On Tue, Apr 16, 2024 at 07:28:55PM +0300, Serge Semin wrote:
> > Currently the src_addr_width and dst_addr_width fields of the
> > dma_slave_config structure are mapped to the CTLx.SRC_TR_WIDTH and
> > CTLx.DST_TR_WIDTH fields of the peripheral bus side in order to have the
> > properly aligned data passed to the target device. It's done just by
> > converting the passed peripheral bus width to the encoded value using the
> > __ffs() function. This implementation has several problematic sides:
> >
> > 1. __ffs() is undefined if no bit exist in the passed value. Thus if the
> > specified addr-width is DMA_SLAVE_BUSWIDTH_UNDEFINED, __ffs() may return
> > unexpected value depending on the platform-specific implementation.
> >
> > 2. DW AHB DMA-engine permits having the power-of-2 transfer width limited
> > by the DMAH_Mk_HDATA_WIDTH IP-core synthesize parameter. Specifying
> > bus-width out of that constraints scope will definitely cause unexpected
> > result since the destination reg will be only partly touched than the
> > client driver implied.
> >
> > Let's fix all of that by adding the peripheral bus width verification
> > method which would make sure that the passed source or destination address
> > width is valid and if undefined then the driver will just fallback to the
> > 1-byte width transfer.
>

> Please, add a word that you apply the check in the dwc_config() which is
> supposed to be called before preparing any transfer?

Ok.

>
> ...
>
> > +static int dwc_verify_p_buswidth(struct dma_chan *chan)
> > +{
> > + struct dw_dma_chan *dwc = to_dw_dma_chan(chan);
> > + struct dw_dma *dw = to_dw_dma(chan->device);
> > + u32 reg_width, max_width;
> > +
> > + if (dwc->dma_sconfig.direction == DMA_MEM_TO_DEV)
> > + reg_width = dwc->dma_sconfig.dst_addr_width;
> > + else if (dwc->dma_sconfig.direction == DMA_DEV_TO_MEM)
> > + reg_width = dwc->dma_sconfig.src_addr_width;
>
> > + else /* DMA_MEM_TO_MEM */
>

> Actually not only this direction,

DW DMAC driver supports only three directions:
dw->dma.directions = BIT(DMA_DEV_TO_MEM) | BIT(DMA_MEM_TO_DEV) |
BIT(DMA_MEM_TO_MEM);
so in this case the else-clause is intended to be taken for the
DMA_MEM_TO_MEM transfers only.

> but TBH I do not see value in these comments.

Value is in signifying that DMA_MEM_TO_MEM is implied by the else
clause. If in some future point we have DMA_DEV_TO_DEV support added
to the driver, then the if-else statement must be aligned
respectively.

>
> > + return 0;
> > +
> > + max_width = dw->pdata->data_width[dwc->dws.p_master];
> > +
> > + /* Fall-back to 1byte transfer width if undefined */
>

> 1-byte
> (as you even used in the commit message correctly)

Ok.

>
> > + if (reg_width == DMA_SLAVE_BUSWIDTH_UNDEFINED)
> > + reg_width = DMA_SLAVE_BUSWIDTH_1_BYTE;
> > + else if (!is_power_of_2(reg_width) || reg_width > max_width)
> > + return -EINVAL;
> > + else /* bus width is valid */
> > + return 0;
> > +
> > + /* Update undefined addr width value */
> > + if (dwc->dma_sconfig.direction == DMA_MEM_TO_DEV)
> > + dwc->dma_sconfig.dst_addr_width = reg_width;
> > + else /* DMA_DEV_TO_MEM */
> > + dwc->dma_sconfig.src_addr_width = reg_width;
>

> So, can't you simply call clamp() for both fields in dwc_config()?

Alas I can't. Because the addr-width is the non-memory peripheral
setting. We can't change it since the client drivers calculate it on
the application-specific basis (CSR widths, transfer length, etc). So
we must make sure that the specified value is supported.

>
> > + return 0;
> > +}
>
> ...
>
> > + int err;
>

> Hmm... we have two functions one of which is using different name for this.

Right, the driver uses both variants (see of.c, platform.c, pci.c too).

> Can we have a patch to convert to err the other one?

To be honest I'd prefer to use the "ret" name instead. It better
describes the variable usage context (Although the statements like "if
(err) ..." look a bit more readable). So I'd rather convert the "err"
vars to "ret". What do you think?

-Serge(y)

>
> --
> With Best Regards,
> Andy Shevchenko
>
>

2024-04-17 20:12:03

by Serge Semin

[permalink] [raw]
Subject: Re: [PATCH 3/4] dmaengine: dw: Simplify prepare CTL_LO methods

On Tue, Apr 16, 2024 at 10:04:42PM +0300, Andy Shevchenko wrote:
> On Tue, Apr 16, 2024 at 07:28:57PM +0300, Serge Semin wrote:
> > Currently the CTL LO fields are calculated on the platform-specific basis.
> > It's implemented by means of the prepare_ctllo() callbacks using the
> > ternary operator within the local variables init block at the beginning of
> > the block scope. The functions code currently is relatively hard to
> > comprehend and isn't that optimal since implies four conditional
> > statements executed and two additional local variables defined. Let's
> > simplify the DW AHB DMA prepare_ctllo() method by unrolling the ternary
> > operators into the normal if-else statement, dropping redundant
> > master-interface ID variables and initializing the local variables based
> > on the singly evaluated DMA-transfer direction check. Thus the method will
> > look much more readable since now the fields content can be easily
> > inferred right from the if-else branch. Provide the same update in the
> > Intel DMA32 core driver for sake of the driver code unification.
> >
> > Note besides of the effects described above this update is basically a
> > preparation before dropping the max burst encoding callback. It will
> > require calling the burst fields calculation methods right in the
> > prepare_ctllo() callbacks, which would have made the later function code
> > even more complex.
>
> Yeah, this is inherited from the original driver where it used to be a macro.
>
> ...
>
> > + if (dwc->direction == DMA_MEM_TO_DEV) {
> > + sms = dwc->dws.m_master;
> > + smsize = 0;
> > + dms = dwc->dws.p_master;
> > + dmsize = sconfig->dst_maxburst;
>

> I would group it differently, i.e.
>
> sms = dwc->dws.m_master;
> dms = dwc->dws.p_master;
> smsize = 0;
> dmsize = sconfig->dst_maxburst;

Could you please clarify, why? From my point of view it was better to
group the source master ID and the source master burst size inits
together.

>
> > + } else if (dwc->direction == DMA_DEV_TO_MEM) {
> > + sms = dwc->dws.p_master;
> > + smsize = sconfig->src_maxburst;
> > + dms = dwc->dws.m_master;
> > + dmsize = 0;
> > + } else /* DMA_MEM_TO_MEM */ {
> > + sms = dwc->dws.m_master;
> > + smsize = 0;
> > + dms = dwc->dws.m_master;
> > + dmsize = 0;
> > + }
>
> Ditto for two above cases.
>
> > static u32 idma32_prepare_ctllo(struct dw_dma_chan *dwc)
> > {
> > struct dma_slave_config *sconfig = &dwc->dma_sconfig;
> > - u8 smsize = (dwc->direction == DMA_DEV_TO_MEM) ? sconfig->src_maxburst : 0;
> > - u8 dmsize = (dwc->direction == DMA_MEM_TO_DEV) ? sconfig->dst_maxburst : 0;

> > + u8 smsize, dmsize;
> > +
> > + if (dwc->direction == DMA_MEM_TO_DEV) {
> > + smsize = 0;
> > + dmsize = sconfig->dst_maxburst;
> > + } else if (dwc->direction == DMA_DEV_TO_MEM) {
> > + smsize = sconfig->src_maxburst;
> > + dmsize = 0;
> > + } else /* DMA_MEM_TO_MEM */ {
> > + smsize = 0;
> > + dmsize = 0;
> > + }
>
> u8 smsize = 0, dmsize = 0;
>
> if (dwc->direction == DMA_MEM_TO_DEV)
> dmsize = sconfig->dst_maxburst;
> else if (dwc->direction == DMA_DEV_TO_MEM)
> smsize = sconfig->src_maxburst;
>
> ?
>
> Something similar also can be done in the Synopsys case above, no?

As in case of the patch #1 the if-else statement here was designed
like that intentionally: to signify that the else clause implies the
DMA_MEM_TO_MEM transfer. Any other one (like DMA_DEV_TO_DEV) would
need to have the statement alteration. Moreover even though your
version looks smaller, but it causes one redundant store operation.
Do you think it still would be better to use your version despite of
my reasoning?

-Serge(y)

>
> --
> With Best Regards,
> Andy Shevchenko
>
>

2024-04-17 20:42:57

by Serge Semin

[permalink] [raw]
Subject: Re: [PATCH 4/4] dmaengine: dw: Simplify max-burst calculation procedure

On Tue, Apr 16, 2024 at 10:11:58PM +0300, Andy Shevchenko wrote:
> On Tue, Apr 16, 2024 at 07:28:58PM +0300, Serge Semin wrote:
> > In order to have a more coherent DW AHB DMA slave configuration method
> > let's simplify the source and destination channel max-burst calculation
> > procedure:
> >
> > 1. Create the max-burst verification method as it has been just done for
> > the memory and peripheral address widths. Thus the DWC DMA slave config
>

> dwc_config() method
>
> ?

Right. I'll just directly refer to the dwc_config() method here.

>
> > method will turn to a set of the verification methods execution.
> >
> > 2. Since both the generic DW AHB DMA and Intel DMA32 engines support the
>

> "i" in iDMA 32-bit stands for "integrated", so 'Intel iDMA 32-bit'

Ok. Thanks for clarification.

>
> > power-of-2 bursts only, then the specified by the client driver max-burst
> > values can be converted to being power-of-2 right in the max-burst
> > verification method.
> >
> > 3. Since max-burst encoded value is required on the CTL_LO fields
> > calculation stage, the encode_maxburst() callback can be easily dropped
> > from the dw_dma structure meanwhile the encoding procedure will be
> > executed right in the CTL_LO register value calculation.
> >
> > Thus the update will provide the next positive effects: the internal
> > DMA-slave config structure will contain only the real DMA-transfer config
> > value, which will be encoded to the DMA-controller register fields only
> > when it's required on the buffer mapping; the redundant encode_maxburst()
> > callback will be dropped simplifying the internal HW-abstraction API;
> > DWC-config method will look more readable executing the verification
>

> dwc_config() method
>
> ?

Ok.

>
> > functions one-by-one.
>
> ...
>
> > +static void dwc_verify_maxburst(struct dma_chan *chan)
>

> It's inconsistent to the rest of _verify methods. It doesn't verify as it
> doesn't return anything. Make it int or rename the function.

Making it int won't make much sense since currently the method doesn't
imply returning an error status. IMO using "verify" was ok, but since
you don't see it suitable please suggest a better alternative. mend,
fix, align?

>
> > +{
> > + struct dw_dma_chan *dwc = to_dw_dma_chan(chan);
> > +
> > + dwc->dma_sconfig.src_maxburst =
> > + clamp(dwc->dma_sconfig.src_maxburst, 1U, dwc->max_burst);
> > + dwc->dma_sconfig.dst_maxburst =
> > + clamp(dwc->dma_sconfig.dst_maxburst, 1U, dwc->max_burst);
> > +
> > + dwc->dma_sconfig.src_maxburst =
> > + rounddown_pow_of_two(dwc->dma_sconfig.src_maxburst);
> > + dwc->dma_sconfig.dst_maxburst =
> > + rounddown_pow_of_two(dwc->dma_sconfig.dst_maxburst);
> > +}
>
> ...
>
> > static int dwc_verify_p_buswidth(struct dma_chan *chan)
> > - reg_burst = rounddown_pow_of_two(dwc->dma_sconfig.src_maxburst);
> > + reg_burst = dwc->dma_sconfig.src_maxburst;
>

> Seems you have a dependency, need a comment below that maxburst has to be
> "verified" [whatever] first.

Ok.

>
> ...
>
> > +static inline u8 dw_dma_encode_maxburst(u32 maxburst)
> > +{
> > + /*
> > + * Fix burst size according to dw_dmac. We need to convert them as:
> > + * 1 -> 0, 4 -> 1, 8 -> 2, 16 -> 3.
> > + */
> > + return maxburst > 1 ? fls(maxburst) - 2 : 0;
> > +}
>

> Split these moves to another preparatory patch.

Ok.

-Serge(y)

>
> --
> With Best Regards,
> Andy Shevchenko
>
>

2024-04-18 09:37:23

by Andy Shevchenko

[permalink] [raw]
Subject: Re: [PATCH 2/4] dmaengine: dw: Add memory bus width verification

On Wed, Apr 17, 2024 at 09:52:42PM +0300, Serge Semin wrote:
> On Wed, Apr 17, 2024 at 08:28:06PM +0300, Andy Shevchenko wrote:
> > On Wed, Apr 17, 2024 at 08:13:59PM +0300, Serge Semin wrote:

..

> > Got it. Maybe a little summary in the code to explain all this magic?
>
> Will it be enough to add something like this:
> /*
> * It's possible to have a data portion locked in the DMA FIFO in case
> * of the channel suspension. Subsequent channel disabling will cause
> * that data silent loss. In order to prevent that maintain the src
> * and dst transfer widths coherency by means of the relation:
> * (CTLx.SRC_TR_WIDTH * CTLx.SRC_MSIZE >= CTLx.DST_TR_WIDTH)
> */

Yes, and you may add something like
"Look for the details in the commit message that brings this change."
at the end of it.

--
With Best Regards,
Andy Shevchenko



2024-04-18 09:45:02

by Andy Shevchenko

[permalink] [raw]
Subject: Re: [PATCH 1/4] dmaengine: dw: Add peripheral bus width verification

On Wed, Apr 17, 2024 at 10:54:42PM +0300, Serge Semin wrote:
> On Tue, Apr 16, 2024 at 09:00:50PM +0300, Andy Shevchenko wrote:
> > On Tue, Apr 16, 2024 at 07:28:55PM +0300, Serge Semin wrote:

..

> > > + if (reg_width == DMA_SLAVE_BUSWIDTH_UNDEFINED)
> > > + reg_width = DMA_SLAVE_BUSWIDTH_1_BYTE;
> > > + else if (!is_power_of_2(reg_width) || reg_width > max_width)
> > > + return -EINVAL;
> > > + else /* bus width is valid */
> > > + return 0;
> > > +
> > > + /* Update undefined addr width value */
> > > + if (dwc->dma_sconfig.direction == DMA_MEM_TO_DEV)
> > > + dwc->dma_sconfig.dst_addr_width = reg_width;
> > > + else /* DMA_DEV_TO_MEM */
> > > + dwc->dma_sconfig.src_addr_width = reg_width;
> >
>
> > So, can't you simply call clamp() for both fields in dwc_config()?
>
> Alas I can't. Because the addr-width is the non-memory peripheral
> setting. We can't change it since the client drivers calculate it on
> the application-specific basis (CSR widths, transfer length, etc). So
> we must make sure that the specified value is supported.

What I meant is to convert this "update" part to the clamping, so
we will have the check as the above like

_verify_()
{
if (reg_width == DMA_SLAVE_BUSWIDTH_UNDEFINED)
return -E...;
if (!is_power_of_2(reg_width) || reg_width > max_width)
return -EINVAL;

/* bus width is valid */
return 0;
}

dwc_config()
{
err = ...
if (err = ...)
clamp?
else if (err)
return err;
}

But it's up to you to choose the better variant. I just share the idea.

> > > + return 0;
> > > +}

..

> > > + int err;
>
> > Hmm... we have two functions one of which is using different name for this.
>
> Right, the driver uses both variants (see of.c, platform.c, pci.c too).
>
> > Can we have a patch to convert to err the other one?
>
> To be honest I'd prefer to use the "ret" name instead. It better
> describes the variable usage context (Although the statements like "if
> (err) ..." look a bit more readable). So I'd rather convert the "err"
> vars to "ret". What do you think?

I'm fine with any choice, just my point is to get it consistent across
the driver.

--
With Best Regards,
Andy Shevchenko



2024-04-18 11:47:48

by Andy Shevchenko

[permalink] [raw]
Subject: Re: [PATCH 3/4] dmaengine: dw: Simplify prepare CTL_LO methods

On Wed, Apr 17, 2024 at 11:11:46PM +0300, Serge Semin wrote:
> On Tue, Apr 16, 2024 at 10:04:42PM +0300, Andy Shevchenko wrote:
> > On Tue, Apr 16, 2024 at 07:28:57PM +0300, Serge Semin wrote:

..

> > > + if (dwc->direction == DMA_MEM_TO_DEV) {
> > > + sms = dwc->dws.m_master;
> > > + smsize = 0;
> > > + dms = dwc->dws.p_master;
> > > + dmsize = sconfig->dst_maxburst;
> >
>
> > I would group it differently, i.e.
> >
> > sms = dwc->dws.m_master;
> > dms = dwc->dws.p_master;
> > smsize = 0;
> > dmsize = sconfig->dst_maxburst;
>
> Could you please clarify, why? From my point of view it was better to
> group the source master ID and the source master burst size inits
> together.

Sure. The point here is that when you look at the DMA channel configuration
usually you operate with the semantically tied fields for source and
destination. At least this is my experience, I always check both sides
of the transfer for the same field, e.g., master setting, hence I want to
have them coupled.

> > > + } else if (dwc->direction == DMA_DEV_TO_MEM) {
> > > + sms = dwc->dws.p_master;
> > > + smsize = sconfig->src_maxburst;
> > > + dms = dwc->dws.m_master;
> > > + dmsize = 0;
> > > + } else /* DMA_MEM_TO_MEM */ {
> > > + sms = dwc->dws.m_master;
> > > + smsize = 0;
> > > + dms = dwc->dws.m_master;
> > > + dmsize = 0;
> > > + }
> >
> > Ditto for two above cases.

..

> > > static u32 idma32_prepare_ctllo(struct dw_dma_chan *dwc)
> > > {
> > > struct dma_slave_config *sconfig = &dwc->dma_sconfig;
> > > - u8 smsize = (dwc->direction == DMA_DEV_TO_MEM) ? sconfig->src_maxburst : 0;
> > > - u8 dmsize = (dwc->direction == DMA_MEM_TO_DEV) ? sconfig->dst_maxburst : 0;

> > > + u8 smsize, dmsize;
> > > +
> > > + if (dwc->direction == DMA_MEM_TO_DEV) {
> > > + smsize = 0;
> > > + dmsize = sconfig->dst_maxburst;
> > > + } else if (dwc->direction == DMA_DEV_TO_MEM) {
> > > + smsize = sconfig->src_maxburst;
> > > + dmsize = 0;
> > > + } else /* DMA_MEM_TO_MEM */ {
> > > + smsize = 0;
> > > + dmsize = 0;
> > > + }
> >
> > u8 smsize = 0, dmsize = 0;
> >
> > if (dwc->direction == DMA_MEM_TO_DEV)
> > dmsize = sconfig->dst_maxburst;
> > else if (dwc->direction == DMA_DEV_TO_MEM)
> > smsize = sconfig->src_maxburst;
> >
> > ?
> >
> > Something similar also can be done in the Synopsys case above, no?
>
> As in case of the patch #1 the if-else statement here was designed
> like that intentionally: to signify that the else clause implies the
> DMA_MEM_TO_MEM transfer. Any other one (like DMA_DEV_TO_DEV) would
> need to have the statement alteration.

My version as I read it:
- for M2D the dmsize is important
- for D2M the smsize is important
- for anything else use defaults (which are 0)

> Moreover even though your
> version looks smaller, but it causes one redundant store operation.

Most likely not. Any assembler here? I can check on x86_64, but I believe it
simply assigns 0 for both u8 at once using xor r16,r16 or so.

Maybe ARM or MIPS (what do you use?) sucks? :-)

> Do you think it still would be better to use your version despite of
> my reasoning?

--
With Best Regards,
Andy Shevchenko



2024-04-18 11:50:20

by Andy Shevchenko

[permalink] [raw]
Subject: Re: [PATCH 4/4] dmaengine: dw: Simplify max-burst calculation procedure

On Wed, Apr 17, 2024 at 11:35:39PM +0300, Serge Semin wrote:
> On Tue, Apr 16, 2024 at 10:11:58PM +0300, Andy Shevchenko wrote:
> > On Tue, Apr 16, 2024 at 07:28:58PM +0300, Serge Semin wrote:

..

> > > +static void dwc_verify_maxburst(struct dma_chan *chan)
>
> > It's inconsistent to the rest of _verify methods. It doesn't verify as it
> > doesn't return anything. Make it int or rename the function.
>
> Making it int won't make much sense since currently the method doesn't
> imply returning an error status. IMO using "verify" was ok, but since
> you don't see it suitable please suggest a better alternative. mend,
> fix, align?

My suggestion is (and was) to have it return 0 for now.

--
With Best Regards,
Andy Shevchenko



2024-04-18 15:47:41

by Serge Semin

[permalink] [raw]
Subject: Re: [PATCH 1/4] dmaengine: dw: Add peripheral bus width verification

On Thu, Apr 18, 2024 at 12:43:25PM +0300, Andy Shevchenko wrote:
> On Wed, Apr 17, 2024 at 10:54:42PM +0300, Serge Semin wrote:
> > On Tue, Apr 16, 2024 at 09:00:50PM +0300, Andy Shevchenko wrote:
> > > On Tue, Apr 16, 2024 at 07:28:55PM +0300, Serge Semin wrote:
>
> ...
>
> > > > + if (reg_width == DMA_SLAVE_BUSWIDTH_UNDEFINED)
> > > > + reg_width = DMA_SLAVE_BUSWIDTH_1_BYTE;
> > > > + else if (!is_power_of_2(reg_width) || reg_width > max_width)
> > > > + return -EINVAL;
> > > > + else /* bus width is valid */
> > > > + return 0;
> > > > +
> > > > + /* Update undefined addr width value */
> > > > + if (dwc->dma_sconfig.direction == DMA_MEM_TO_DEV)
> > > > + dwc->dma_sconfig.dst_addr_width = reg_width;
> > > > + else /* DMA_DEV_TO_MEM */
> > > > + dwc->dma_sconfig.src_addr_width = reg_width;
> > >
> >
> > > So, can't you simply call clamp() for both fields in dwc_config()?
> >
> > Alas I can't. Because the addr-width is the non-memory peripheral
> > setting. We can't change it since the client drivers calculate it on
> > the application-specific basis (CSR widths, transfer length, etc). So
> > we must make sure that the specified value is supported.
>
> What I meant is to convert this "update" part to the clamping, so
> we will have the check as the above like
>
> _verify_()
> {
> if (reg_width == DMA_SLAVE_BUSWIDTH_UNDEFINED)
> return -E...;
> if (!is_power_of_2(reg_width) || reg_width > max_width)
> return -EINVAL;
>
> /* bus width is valid */
> return 0;
> }
>
> dwc_config()
> {
> err = ...
> if (err = ...)
> clamp?
> else if (err)
> return err;
> }
>
> But it's up to you to choose the better variant. I just share the idea.

Ok. Thanks for the suggestion. But I'll stick to my solution then. The
specified *_addr_width values can't/shouldn't be clamped anyway and
having a single verification function will comply to what will be
implemented for the rest of the dwc_onfig() parts in this patchset.

>
> > > > + return 0;
> > > > +}
>
> ...
>
> > > > + int err;
> >
> > > Hmm... we have two functions one of which is using different name for this.
> >
> > Right, the driver uses both variants (see of.c, platform.c, pci.c too).
> >
> > > Can we have a patch to convert to err the other one?
> >
> > To be honest I'd prefer to use the "ret" name instead. It better
> > describes the variable usage context (Although the statements like "if
> > (err) ..." look a bit more readable). So I'd rather convert the "err"
> > vars to "ret". What do you think?
>

> I'm fine with any choice, just my point is to get it consistent across
> the driver.

Ok. "ret" it is then.

-Serge(y)

>
> --
> With Best Regards,
> Andy Shevchenko
>
>

2024-04-18 15:53:00

by Serge Semin

[permalink] [raw]
Subject: Re: [PATCH 2/4] dmaengine: dw: Add memory bus width verification

On Thu, Apr 18, 2024 at 12:37:09PM +0300, Andy Shevchenko wrote:
> On Wed, Apr 17, 2024 at 09:52:42PM +0300, Serge Semin wrote:
> > On Wed, Apr 17, 2024 at 08:28:06PM +0300, Andy Shevchenko wrote:
> > > On Wed, Apr 17, 2024 at 08:13:59PM +0300, Serge Semin wrote:
>
> ...
>
> > > Got it. Maybe a little summary in the code to explain all this magic?
> >
> > Will it be enough to add something like this:
> > /*
> > * It's possible to have a data portion locked in the DMA FIFO in case
> > * of the channel suspension. Subsequent channel disabling will cause
> > * that data silent loss. In order to prevent that maintain the src
> > * and dst transfer widths coherency by means of the relation:
> > * (CTLx.SRC_TR_WIDTH * CTLx.SRC_MSIZE >= CTLx.DST_TR_WIDTH)
> > */
>
> Yes, and you may add something like
> "Look for the details in the commit message that brings this change."
> at the end of it.

Agreed. Thanks.

-Serge(y)

>
> --
> With Best Regards,
> Andy Shevchenko
>
>

2024-04-18 19:00:15

by Serge Semin

[permalink] [raw]
Subject: Re: [PATCH 3/4] dmaengine: dw: Simplify prepare CTL_LO methods

On Thu, Apr 18, 2024 at 02:47:18PM +0300, Andy Shevchenko wrote:
> On Wed, Apr 17, 2024 at 11:11:46PM +0300, Serge Semin wrote:
> > On Tue, Apr 16, 2024 at 10:04:42PM +0300, Andy Shevchenko wrote:
> > > On Tue, Apr 16, 2024 at 07:28:57PM +0300, Serge Semin wrote:
>
> ...
>
> > > > + if (dwc->direction == DMA_MEM_TO_DEV) {
> > > > + sms = dwc->dws.m_master;
> > > > + smsize = 0;
> > > > + dms = dwc->dws.p_master;
> > > > + dmsize = sconfig->dst_maxburst;
> > >
> >
> > > I would group it differently, i.e.
> > >
> > > sms = dwc->dws.m_master;
> > > dms = dwc->dws.p_master;
> > > smsize = 0;
> > > dmsize = sconfig->dst_maxburst;
> >
> > Could you please clarify, why? From my point of view it was better to
> > group the source master ID and the source master burst size inits
> > together.
>

> Sure. The point here is that when you look at the DMA channel configuration
> usually you operate with the semantically tied fields for source and
> destination. At least this is my experience, I always check both sides
> of the transfer for the same field, e.g., master setting, hence I want to
> have them coupled.

Ok. I see. Thanks for clarification. I normally do that in another
order: group the functionally related fields together - all
source-related configs first, then all destination-related configs.
Honestly I don't have strong opinion about this part, it's just my
personal preference. Am I right to think that from your experience in
kernel it's normally done in the order you described?

>
> > > > + } else if (dwc->direction == DMA_DEV_TO_MEM) {
> > > > + sms = dwc->dws.p_master;
> > > > + smsize = sconfig->src_maxburst;
> > > > + dms = dwc->dws.m_master;
> > > > + dmsize = 0;
> > > > + } else /* DMA_MEM_TO_MEM */ {
> > > > + sms = dwc->dws.m_master;
> > > > + smsize = 0;
> > > > + dms = dwc->dws.m_master;
> > > > + dmsize = 0;
> > > > + }
> > >
> > > Ditto for two above cases.
>
> ...
>
> > > > static u32 idma32_prepare_ctllo(struct dw_dma_chan *dwc)
> > > > {
> > > > struct dma_slave_config *sconfig = &dwc->dma_sconfig;
> > > > - u8 smsize = (dwc->direction == DMA_DEV_TO_MEM) ? sconfig->src_maxburst : 0;
> > > > - u8 dmsize = (dwc->direction == DMA_MEM_TO_DEV) ? sconfig->dst_maxburst : 0;
>
> > > > + u8 smsize, dmsize;
> > > > +
> > > > + if (dwc->direction == DMA_MEM_TO_DEV) {
> > > > + smsize = 0;
> > > > + dmsize = sconfig->dst_maxburst;
> > > > + } else if (dwc->direction == DMA_DEV_TO_MEM) {
> > > > + smsize = sconfig->src_maxburst;
> > > > + dmsize = 0;
> > > > + } else /* DMA_MEM_TO_MEM */ {
> > > > + smsize = 0;
> > > > + dmsize = 0;
> > > > + }
> > >
> > > u8 smsize = 0, dmsize = 0;
> > >
> > > if (dwc->direction == DMA_MEM_TO_DEV)
> > > dmsize = sconfig->dst_maxburst;
> > > else if (dwc->direction == DMA_DEV_TO_MEM)
> > > smsize = sconfig->src_maxburst;
> > >
> > > ?
> > >
> > > Something similar also can be done in the Synopsys case above, no?
> >
> > As in case of the patch #1 the if-else statement here was designed
> > like that intentionally: to signify that the else clause implies the
> > DMA_MEM_TO_MEM transfer. Any other one (like DMA_DEV_TO_DEV) would
> > need to have the statement alteration.
>

> My version as I read it:
> - for M2D the dmsize is important
> - for D2M the smsize is important
> - for anything else use defaults (which are 0)
>

Ok. Let's follow your way in this case. After your how-to-read-it
comment your version no longer look less readable than what is
implemented by me. Thanks for clarification.

> > Moreover even though your
> > version looks smaller, but it causes one redundant store operation.
>
> Most likely not. Any assembler here? I can check on x86_64, but I believe it
> simply assigns 0 for both u8 at once using xor r16,r16 or so.
>
> Maybe ARM or MIPS (what do you use?) sucks? :-)

Interestingly, but asm-code in both cases match.) So the redundant
store operation in your C-code gets to be optimized away.

-Serge(y)

>
> > Do you think it still would be better to use your version despite of
> > my reasoning?
>
> --
> With Best Regards,
> Andy Shevchenko
>
>

2024-04-18 19:10:17

by Serge Semin

[permalink] [raw]
Subject: Re: [PATCH 4/4] dmaengine: dw: Simplify max-burst calculation procedure

On Thu, Apr 18, 2024 at 02:49:26PM +0300, Andy Shevchenko wrote:
> On Wed, Apr 17, 2024 at 11:35:39PM +0300, Serge Semin wrote:
> > On Tue, Apr 16, 2024 at 10:11:58PM +0300, Andy Shevchenko wrote:
> > > On Tue, Apr 16, 2024 at 07:28:58PM +0300, Serge Semin wrote:
>
> ...
>
> > > > +static void dwc_verify_maxburst(struct dma_chan *chan)
> >
> > > It's inconsistent to the rest of _verify methods. It doesn't verify as it
> > > doesn't return anything. Make it int or rename the function.
> >
> > Making it int won't make much sense since currently the method doesn't
> > imply returning an error status. IMO using "verify" was ok, but since
> > you don't see it suitable please suggest a better alternative. mend,
> > fix, align?
>

> My suggestion is (and was) to have it return 0 for now.

Ok. Let's have it returning zero then.


-Serge(y)

>
> --
> With Best Regards,
> Andy Shevchenko
>
>

2024-04-18 21:00:54

by Andy Shevchenko

[permalink] [raw]
Subject: Re: [PATCH 3/4] dmaengine: dw: Simplify prepare CTL_LO methods

On Thu, Apr 18, 2024 at 10:00:02PM +0300, Serge Semin wrote:
> On Thu, Apr 18, 2024 at 02:47:18PM +0300, Andy Shevchenko wrote:
> > On Wed, Apr 17, 2024 at 11:11:46PM +0300, Serge Semin wrote:
> > > On Tue, Apr 16, 2024 at 10:04:42PM +0300, Andy Shevchenko wrote:
> > > > On Tue, Apr 16, 2024 at 07:28:57PM +0300, Serge Semin wrote:

..

> > > > > + if (dwc->direction == DMA_MEM_TO_DEV) {
> > > > > + sms = dwc->dws.m_master;
> > > > > + smsize = 0;
> > > > > + dms = dwc->dws.p_master;
> > > > > + dmsize = sconfig->dst_maxburst;
> > >
> > > > I would group it differently, i.e.
> > > >
> > > > sms = dwc->dws.m_master;
> > > > dms = dwc->dws.p_master;
> > > > smsize = 0;
> > > > dmsize = sconfig->dst_maxburst;
> > >
> > > Could you please clarify, why? From my point of view it was better to
> > > group the source master ID and the source master burst size inits
> > > together.
>
> > Sure. The point here is that when you look at the DMA channel configuration
> > usually you operate with the semantically tied fields for source and
> > destination. At least this is my experience, I always check both sides
> > of the transfer for the same field, e.g., master setting, hence I want to
> > have them coupled.
>
> Ok. I see. Thanks for clarification. I normally do that in another
> order: group the functionally related fields together - all
> source-related configs first, then all destination-related configs.
> Honestly I don't have strong opinion about this part, it's just my
> personal preference. Am I right to think that from your experience in
> kernel it's normally done in the order you described?

In this driver I believe I have followed that one, yes.

> > > > > + } else if (dwc->direction == DMA_DEV_TO_MEM) {
> > > > > + sms = dwc->dws.p_master;
> > > > > + smsize = sconfig->src_maxburst;
> > > > > + dms = dwc->dws.m_master;
> > > > > + dmsize = 0;
> > > > > + } else /* DMA_MEM_TO_MEM */ {
> > > > > + sms = dwc->dws.m_master;
> > > > > + smsize = 0;
> > > > > + dms = dwc->dws.m_master;
> > > > > + dmsize = 0;
> > > > > + }
> > > >
> > > > Ditto for two above cases.

--
With Best Regards,
Andy Shevchenko



2024-04-19 09:20:06

by Serge Semin

[permalink] [raw]
Subject: Re: [PATCH 3/4] dmaengine: dw: Simplify prepare CTL_LO methods

On Fri, Apr 19, 2024 at 12:00:40AM +0300, Andy Shevchenko wrote:
> On Thu, Apr 18, 2024 at 10:00:02PM +0300, Serge Semin wrote:
> > On Thu, Apr 18, 2024 at 02:47:18PM +0300, Andy Shevchenko wrote:
> > > On Wed, Apr 17, 2024 at 11:11:46PM +0300, Serge Semin wrote:
> > > > On Tue, Apr 16, 2024 at 10:04:42PM +0300, Andy Shevchenko wrote:
> > > > > On Tue, Apr 16, 2024 at 07:28:57PM +0300, Serge Semin wrote:
>
> ...
>
> > > > > > + if (dwc->direction == DMA_MEM_TO_DEV) {
> > > > > > + sms = dwc->dws.m_master;
> > > > > > + smsize = 0;
> > > > > > + dms = dwc->dws.p_master;
> > > > > > + dmsize = sconfig->dst_maxburst;
> > > >
> > > > > I would group it differently, i.e.
> > > > >
> > > > > sms = dwc->dws.m_master;
> > > > > dms = dwc->dws.p_master;
> > > > > smsize = 0;
> > > > > dmsize = sconfig->dst_maxburst;
> > > >
> > > > Could you please clarify, why? From my point of view it was better to
> > > > group the source master ID and the source master burst size inits
> > > > together.
> >
> > > Sure. The point here is that when you look at the DMA channel configuration
> > > usually you operate with the semantically tied fields for source and
> > > destination. At least this is my experience, I always check both sides
> > > of the transfer for the same field, e.g., master setting, hence I want to
> > > have them coupled.
> >
> > Ok. I see. Thanks for clarification. I normally do that in another
> > order: group the functionally related fields together - all
> > source-related configs first, then all destination-related configs.
> > Honestly I don't have strong opinion about this part, it's just my
> > personal preference. Am I right to think that from your experience in
> > kernel it's normally done in the order you described?
>

> In this driver I believe I have followed that one, yes.

Agreed then. I'll change the order to the way you ask.

-Serge(y)

>
> > > > > > + } else if (dwc->direction == DMA_DEV_TO_MEM) {
> > > > > > + sms = dwc->dws.p_master;
> > > > > > + smsize = sconfig->src_maxburst;
> > > > > > + dms = dwc->dws.m_master;
> > > > > > + dmsize = 0;
> > > > > > + } else /* DMA_MEM_TO_MEM */ {
> > > > > > + sms = dwc->dws.m_master;
> > > > > > + smsize = 0;
> > > > > > + dms = dwc->dws.m_master;
> > > > > > + dmsize = 0;
> > > > > > + }
> > > > >
> > > > > Ditto for two above cases.
>
> --
> With Best Regards,
> Andy Shevchenko
>
>