2022-07-11 08:49:24

by Amelie Delaunay

Subject: [PATCH 0/4] STM32 DMA-MDMA chaining feature

This patchset (re)introduces STM32 DMA-MDMA chaining feature.

The DMA cannot generate efficient burst transfers to the DDR, so it
penalises the AXI bus when accessing the DDR, whereas it accesses the SRAM
optimally. DMA-MDMA chaining therefore places an SRAM buffer between DMA
and MDMA: the DMA deals with peripheral and SRAM, and the MDMA with SRAM
and DDR.

The feature relies on the fact that a DMA channel Transfer Complete signal
can trigger an MDMA channel transfer, and that the MDMA can clear the DMA
request by writing to the DMA Interrupt Clear register.
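
The data path described above can be pictured with a small user-space
model. This is only a sketch under stated assumptions, not driver code:
chain_dev_to_mem() and SRAM_PERIOD are hypothetical names, memcpy() stands
in for both controllers, and the two copies run one after the other,
whereas the real hardware lets DMA and MDMA work on different periods at
the same time.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SRAM_PERIOD 8u	/* hypothetical period length, in bytes */

/*
 * Software model of a DMA_DEV_TO_MEM chained transfer.
 * 'periph' stands for the peripheral data stream, 'ddr' for the final
 * buffer. 'len' is assumed to be a multiple of SRAM_PERIOD to keep the
 * sketch short. Returns the number of periods moved.
 */
static size_t chain_dev_to_mem(const uint8_t *periph, uint8_t *ddr, size_t len)
{
	uint8_t sram[2 * SRAM_PERIOD];	/* SRAM buffer split in two periods */
	size_t periods = len / SRAM_PERIOD;
	size_t i;

	for (i = 0; i < periods; i++) {
		/* Even periods use the first half, odd periods the second half */
		uint8_t *half = sram + (i & 1) * SRAM_PERIOD;

		/* "DMA": peripheral -> current SRAM period */
		memcpy(half, periph + i * SRAM_PERIOD, SRAM_PERIOD);
		/* Transfer Complete triggers the "MDMA": SRAM period -> DDR */
		memcpy(ddr + i * SRAM_PERIOD, half, SRAM_PERIOD);
	}

	return periods;
}
```

Calling chain_dev_to_mem(src, dst, 32) with the 8-byte period above moves
four periods and leaves dst identical to src.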

A deeper introduction can be found in patch 1.

The previous implementation [1] was nacked and has been dropped.
Unlike that implementation (where everything was embedded in the
stm32-dma driver), the user (a peripheral driver using DMA) now has to
configure the MDMA channel itself.

[1] https://lore.kernel.org/lkml/[email protected]/

Amelie Delaunay (4):
  docs: arm: stm32: introduce STM32 DMA-MDMA chaining feature
  dmaengine: stm32-dmamux: set dmamux channel id in dma features
    bitfield
  dmaengine: stm32-dma: add support to trigger STM32 MDMA
  dmaengine: stm32-mdma: add support to be triggered by STM32 DMA

 .../arm/stm32/stm32-dma-mdma-chaining.rst     | 365 ++++++++++++++++++
 drivers/dma/stm32-dma.c                       |  56 ++-
 drivers/dma/stm32-dmamux.c                    |   2 +-
 drivers/dma/stm32-mdma.c                      |  70 +++-
 4 files changed, 490 insertions(+), 3 deletions(-)
 create mode 100644 Documentation/arm/stm32/stm32-dma-mdma-chaining.rst

--
2.25.1


2022-07-11 08:49:43

by Amelie Delaunay

Subject: [PATCH 4/4] dmaengine: stm32-mdma: add support to be triggered by STM32 DMA

STM32 MDMA can be triggered by STM32 DMA channels transfer complete.

A non-zero struct dma_slave_config .peripheral_size means that the DMA
client wants the DMA to trigger the MDMA.

stm32-mdma driver gets the request id, the mask_addr, and the mask_data in
struct stm32_mdma_dma_config passed by DMA with struct dma_slave_config
.peripheral_config/.peripheral_size.

As the DMA is configured in Double-Buffer mode and the MDMA channel
transfers data between SRAM and DDR, bursts are optimized.

Signed-off-by: Amelie Delaunay <[email protected]>
---
drivers/dma/stm32-mdma.c | 70 +++++++++++++++++++++++++++++++++++++++-
1 file changed, 69 insertions(+), 1 deletion(-)

diff --git a/drivers/dma/stm32-mdma.c b/drivers/dma/stm32-mdma.c
index b11927ed4367..e28acbcb53f4 100644
--- a/drivers/dma/stm32-mdma.c
+++ b/drivers/dma/stm32-mdma.c
@@ -199,6 +199,7 @@ struct stm32_mdma_chan_config {
u32 transfer_config;
u32 mask_addr;
u32 mask_data;
+ bool m2m_hw; /* True when MDMA is triggered by STM32 DMA */
};

struct stm32_mdma_hwdesc {
@@ -227,6 +228,12 @@ struct stm32_mdma_desc {
struct stm32_mdma_desc_node node[];
};

+struct stm32_mdma_dma_config {
+ u32 request; /* STM32 DMA channel stream id, triggering MDMA */
+ u32 cmar; /* STM32 DMA interrupt flag clear register address */
+ u32 cmdr; /* STM32 DMA Transfer Complete flag */
+};
+
struct stm32_mdma_chan {
struct virt_dma_chan vchan;
struct dma_pool *desc_pool;
@@ -539,13 +546,23 @@ static int stm32_mdma_set_xfer_param(struct stm32_mdma_chan *chan,
dst_addr = chan->dma_config.dst_addr;

/* Set device data size */
+ if (chan_config->m2m_hw)
+ dst_addr_width = stm32_mdma_get_max_width(dst_addr, buf_len,
+ STM32_MDMA_MAX_BUF_LEN);
dst_bus_width = stm32_mdma_get_width(chan, dst_addr_width);
if (dst_bus_width < 0)
return dst_bus_width;
ctcr &= ~STM32_MDMA_CTCR_DSIZE_MASK;
ctcr |= STM32_MDMA_CTCR_DSIZE(dst_bus_width);
+ if (chan_config->m2m_hw) {
+ ctcr &= ~STM32_MDMA_CTCR_DINCOS_MASK;
+ ctcr |= STM32_MDMA_CTCR_DINCOS(dst_bus_width);
+ }

/* Set device burst value */
+ if (chan_config->m2m_hw)
+ dst_maxburst = STM32_MDMA_MAX_BUF_LEN / dst_addr_width;
+
dst_best_burst = stm32_mdma_get_best_burst(buf_len, tlen,
dst_maxburst,
dst_addr_width);
@@ -588,13 +605,24 @@ static int stm32_mdma_set_xfer_param(struct stm32_mdma_chan *chan,
src_addr = chan->dma_config.src_addr;

/* Set device data size */
+ if (chan_config->m2m_hw)
+ src_addr_width = stm32_mdma_get_max_width(src_addr, buf_len,
+ STM32_MDMA_MAX_BUF_LEN);
+
src_bus_width = stm32_mdma_get_width(chan, src_addr_width);
if (src_bus_width < 0)
return src_bus_width;
ctcr &= ~STM32_MDMA_CTCR_SSIZE_MASK;
ctcr |= STM32_MDMA_CTCR_SSIZE(src_bus_width);
+ if (chan_config->m2m_hw) {
+ ctcr &= ~STM32_MDMA_CTCR_SINCOS_MASK;
+ ctcr |= STM32_MDMA_CTCR_SINCOS(src_bus_width);
+ }

/* Set device burst value */
+ if (chan_config->m2m_hw)
+ src_maxburst = STM32_MDMA_MAX_BUF_LEN / src_addr_width;
+
src_best_burst = stm32_mdma_get_best_burst(buf_len, tlen,
src_maxburst,
src_addr_width);
@@ -702,11 +730,15 @@ static int stm32_mdma_setup_xfer(struct stm32_mdma_chan *chan,
{
struct stm32_mdma_device *dmadev = stm32_mdma_get_dev(chan);
struct dma_slave_config *dma_config = &chan->dma_config;
+ struct stm32_mdma_chan_config *chan_config = &chan->chan_config;
struct scatterlist *sg;
dma_addr_t src_addr, dst_addr;
- u32 ccr, ctcr, ctbr;
+ u32 m2m_hw_period, ccr, ctcr, ctbr;
int i, ret = 0;

+ if (chan_config->m2m_hw)
+ m2m_hw_period = sg_dma_len(sgl);
+
for_each_sg(sgl, sg, sg_len, i) {
if (sg_dma_len(sg) > STM32_MDMA_MAX_BLOCK_LEN) {
dev_err(chan2dev(chan), "Invalid block len\n");
@@ -716,6 +748,8 @@ static int stm32_mdma_setup_xfer(struct stm32_mdma_chan *chan,
if (direction == DMA_MEM_TO_DEV) {
src_addr = sg_dma_address(sg);
dst_addr = dma_config->dst_addr;
+ if (chan_config->m2m_hw && (i & 1))
+ dst_addr += m2m_hw_period;
ret = stm32_mdma_set_xfer_param(chan, direction, &ccr,
&ctcr, &ctbr, src_addr,
sg_dma_len(sg));
@@ -723,6 +757,8 @@ static int stm32_mdma_setup_xfer(struct stm32_mdma_chan *chan,
src_addr);
} else {
src_addr = dma_config->src_addr;
+ if (chan_config->m2m_hw && (i & 1))
+ src_addr += m2m_hw_period;
dst_addr = sg_dma_address(sg);
ret = stm32_mdma_set_xfer_param(chan, direction, &ccr,
&ctcr, &ctbr, dst_addr,
@@ -755,6 +791,7 @@ stm32_mdma_prep_slave_sg(struct dma_chan *c, struct scatterlist *sgl,
unsigned long flags, void *context)
{
struct stm32_mdma_chan *chan = to_stm32_mdma_chan(c);
+ struct stm32_mdma_chan_config *chan_config = &chan->chan_config;
struct stm32_mdma_desc *desc;
int i, ret;

@@ -777,6 +814,21 @@ stm32_mdma_prep_slave_sg(struct dma_chan *c, struct scatterlist *sgl,
if (ret < 0)
goto xfer_setup_err;

+ /*
+ * For an M2M HW transfer triggered by STM32 DMA in the MEM_TO_DEV direction, the
+ * transfer complete flag must not be cleared by hardware, so that the CPU can rearm the
+ * STM32 DMA with the next sg element and update some data in dmaengine framework.
+ */
+ if (chan_config->m2m_hw && direction == DMA_MEM_TO_DEV) {
+ struct stm32_mdma_hwdesc *hwdesc;
+
+ for (i = 0; i < sg_len; i++) {
+ hwdesc = desc->node[i].hwdesc;
+ hwdesc->cmar = 0;
+ hwdesc->cmdr = 0;
+ }
+ }
+
desc->cyclic = false;

return vchan_tx_prep(&chan->vchan, &desc->vdesc, flags);
@@ -798,6 +850,7 @@ stm32_mdma_prep_dma_cyclic(struct dma_chan *c, dma_addr_t buf_addr,
struct stm32_mdma_chan *chan = to_stm32_mdma_chan(c);
struct stm32_mdma_device *dmadev = stm32_mdma_get_dev(chan);
struct dma_slave_config *dma_config = &chan->dma_config;
+ struct stm32_mdma_chan_config *chan_config = &chan->chan_config;
struct stm32_mdma_desc *desc;
dma_addr_t src_addr, dst_addr;
u32 ccr, ctcr, ctbr, count;
@@ -858,8 +911,12 @@ stm32_mdma_prep_dma_cyclic(struct dma_chan *c, dma_addr_t buf_addr,
if (direction == DMA_MEM_TO_DEV) {
src_addr = buf_addr + i * period_len;
dst_addr = dma_config->dst_addr;
+ if (chan_config->m2m_hw && (i & 1))
+ dst_addr += period_len;
} else {
src_addr = dma_config->src_addr;
+ if (chan_config->m2m_hw && (i & 1))
+ src_addr += period_len;
dst_addr = buf_addr + i * period_len;
}

@@ -1244,6 +1301,17 @@ static int stm32_mdma_slave_config(struct dma_chan *c,

memcpy(&chan->dma_config, config, sizeof(*config));

+ /* Check if user is requesting STM32 DMA to trigger MDMA */
+ if (config->peripheral_size) {
+ struct stm32_mdma_dma_config *mdma_config;
+
+ mdma_config = (struct stm32_mdma_dma_config *)chan->dma_config.peripheral_config;
+ chan->chan_config.request = mdma_config->request;
+ chan->chan_config.mask_addr = mdma_config->cmar;
+ chan->chan_config.mask_data = mdma_config->cmdr;
+ chan->chan_config.m2m_hw = true;
+ }
+
return 0;
}

--
2.25.1

2022-07-11 09:08:18

by Amelie Delaunay

Subject: [PATCH 3/4] dmaengine: stm32-dma: add support to trigger STM32 MDMA

STM32 MDMA can be triggered by STM32 DMA channels transfer complete.
The "request line number" triggering STM32 MDMA is the STM32 DMAMUX channel
id set by stm32-dmamux driver in dma_spec->args[3].

stm32-dma driver fills the struct stm32_dma_mdma_config used to configure
the MDMA with struct dma_slave_config .peripheral_config/.peripheral_size.

Signed-off-by: Amelie Delaunay <[email protected]>
---
drivers/dma/stm32-dma.c | 56 ++++++++++++++++++++++++++++++++++++++++-
1 file changed, 55 insertions(+), 1 deletion(-)

diff --git a/drivers/dma/stm32-dma.c b/drivers/dma/stm32-dma.c
index adb25a11c70f..3916295fe154 100644
--- a/drivers/dma/stm32-dma.c
+++ b/drivers/dma/stm32-dma.c
@@ -142,6 +142,8 @@
#define STM32_DMA_DIRECT_MODE_GET(n) (((n) & STM32_DMA_DIRECT_MODE_MASK) >> 2)
#define STM32_DMA_ALT_ACK_MODE_MASK BIT(4)
#define STM32_DMA_ALT_ACK_MODE_GET(n) (((n) & STM32_DMA_ALT_ACK_MODE_MASK) >> 4)
+#define STM32_DMA_MDMA_STREAM_ID_MASK GENMASK(19, 16)
+#define STM32_DMA_MDMA_STREAM_ID_GET(n) (((n) & STM32_DMA_MDMA_STREAM_ID_MASK) >> 16)

enum stm32_dma_width {
STM32_DMA_BYTE,
@@ -195,6 +197,19 @@ struct stm32_dma_desc {
struct stm32_dma_sg_req sg_req[];
};

+/**
+ * struct stm32_dma_mdma_config - STM32 DMA MDMA configuration
+ * @stream_id: DMA request to trigger STM32 MDMA transfer
+ * @ifcr: DMA interrupt flag clear register address,
+ * used by STM32 MDMA to clear DMA Transfer Complete flag
+ * @tcf: DMA Transfer Complete flag
+ */
+struct stm32_dma_mdma_config {
+ u32 stream_id;
+ u32 ifcr;
+ u32 tcf;
+};
+
struct stm32_dma_chan {
struct virt_dma_chan vchan;
bool config_init;
@@ -209,6 +224,8 @@ struct stm32_dma_chan {
u32 mem_burst;
u32 mem_width;
enum dma_status status;
+ bool trig_mdma;
+ struct stm32_dma_mdma_config mdma_config;
};

struct stm32_dma_device {
@@ -388,6 +405,13 @@ static int stm32_dma_slave_config(struct dma_chan *c,

memcpy(&chan->dma_sconfig, config, sizeof(*config));

+ /* Check if user is requesting DMA to trigger STM32 MDMA */
+ if (config->peripheral_size) {
+ config->peripheral_config = &chan->mdma_config;
+ config->peripheral_size = sizeof(chan->mdma_config);
+ chan->trig_mdma = true;
+ }
+
chan->config_init = true;

return 0;
@@ -576,6 +600,10 @@ static void stm32_dma_start_transfer(struct stm32_dma_chan *chan)
sg_req = &chan->desc->sg_req[chan->next_sg];
reg = &sg_req->chan_reg;

+ /* When DMA triggers STM32 MDMA, DMA Transfer Complete is managed by STM32 MDMA */
+ if (chan->trig_mdma && chan->dma_sconfig.direction != DMA_MEM_TO_DEV)
+ reg->dma_scr &= ~STM32_DMA_SCR_TCIE;
+
reg->dma_scr &= ~STM32_DMA_SCR_EN;
stm32_dma_write(dmadev, STM32_DMA_SCR(chan->id), reg->dma_scr);
stm32_dma_write(dmadev, STM32_DMA_SPAR(chan->id), reg->dma_spar);
@@ -725,6 +753,8 @@ static void stm32_dma_handle_chan_done(struct stm32_dma_chan *chan, u32 scr)

if (chan->desc->cyclic) {
vchan_cyclic_callback(&chan->desc->vdesc);
+ if (chan->trig_mdma)
+ return;
stm32_dma_sg_inc(chan);
/* cyclic while CIRC/DBM disable => post resume reconfiguration needed */
if (!(scr & (STM32_DMA_SCR_CIRC | STM32_DMA_SCR_DBM)))
@@ -1099,6 +1129,10 @@ static struct dma_async_tx_descriptor *stm32_dma_prep_slave_sg(
else
chan->chan_reg.dma_scr &= ~STM32_DMA_SCR_PFCTRL;

+ /* Activate Double Buffer Mode if DMA triggers STM32 MDMA and more than 1 sg */
+ if (chan->trig_mdma && sg_len > 1)
+ chan->chan_reg.dma_scr |= STM32_DMA_SCR_DBM;
+
for_each_sg(sgl, sg, sg_len, i) {
ret = stm32_dma_set_xfer_param(chan, direction, &buswidth,
sg_dma_len(sg),
@@ -1120,6 +1154,8 @@ static struct dma_async_tx_descriptor *stm32_dma_prep_slave_sg(
desc->sg_req[i].chan_reg.dma_spar = chan->chan_reg.dma_spar;
desc->sg_req[i].chan_reg.dma_sm0ar = sg_dma_address(sg);
desc->sg_req[i].chan_reg.dma_sm1ar = sg_dma_address(sg);
+ if (chan->trig_mdma)
+ desc->sg_req[i].chan_reg.dma_sm1ar += sg_dma_len(sg);
desc->sg_req[i].chan_reg.dma_sndtr = nb_data_items;
}

@@ -1207,8 +1243,11 @@ static struct dma_async_tx_descriptor *stm32_dma_prep_dma_cyclic(
desc->sg_req[i].chan_reg.dma_spar = chan->chan_reg.dma_spar;
desc->sg_req[i].chan_reg.dma_sm0ar = buf_addr;
desc->sg_req[i].chan_reg.dma_sm1ar = buf_addr;
+ if (chan->trig_mdma)
+ desc->sg_req[i].chan_reg.dma_sm1ar += period_len;
desc->sg_req[i].chan_reg.dma_sndtr = nb_data_items;
- buf_addr += period_len;
+ if (!chan->trig_mdma)
+ buf_addr += period_len;
}

desc->num_sgs = num_periods;
@@ -1491,6 +1530,7 @@ static void stm32_dma_set_config(struct stm32_dma_chan *chan,
chan->threshold = STM32_DMA_FIFO_THRESHOLD_NONE;
if (STM32_DMA_ALT_ACK_MODE_GET(cfg->features))
chan->chan_reg.dma_scr |= STM32_DMA_SCR_TRBUFF;
+ chan->mdma_config.stream_id = STM32_DMA_MDMA_STREAM_ID_GET(cfg->features);
}

static struct dma_chan *stm32_dma_of_xlate(struct of_phandle_args *dma_spec,
@@ -1630,6 +1670,20 @@ static int stm32_dma_probe(struct platform_device *pdev)
chan->id = i;
chan->vchan.desc_free = stm32_dma_desc_free;
vchan_init(&chan->vchan, dd);
+
+ chan->mdma_config.ifcr = res->start;
+ chan->mdma_config.ifcr += (chan->id & 4) ? STM32_DMA_HIFCR : STM32_DMA_LIFCR;
+
+ chan->mdma_config.tcf = STM32_DMA_TCI;
+ /*
+ * bit0 of chan->id represents the need to left shift by 6
+ * bit1 of chan->id represents the need to extra left shift by 16
+ * TCIF0, chan->id = b0000; TCIF4, chan->id = b0100: left shift by 0*6 + 0*16
+ * TCIF1, chan->id = b0001; TCIF5, chan->id = b0101: left shift by 1*6 + 0*16
+ * TCIF2, chan->id = b0010; TCIF6, chan->id = b0110: left shift by 0*6 + 1*16
+ * TCIF3, chan->id = b0011; TCIF7, chan->id = b0111: left shift by 1*6 + 1*16
+ */
+ chan->mdma_config.tcf <<= (6 * (chan->id & 0x1) + 16 * ((chan->id & 0x2) >> 1));
}

ret = dma_async_device_register(dd);
--
2.25.1
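
The Transfer Complete flag mask and the flag clear register selection
computed in stm32_dma_probe() above can be checked in isolation. The
following is a minimal sketch, assuming the register offsets (0x08 for
LIFCR, 0x0c for HIFCR) and the TCI bit position (bit 5) used by the
stm32-dma driver; the DMA_* macro names and both helper names are
hypothetical and only serve the illustration.

```c
#include <stdint.h>

/* Assumed to mirror the stm32-dma driver defines; not taken from this patch */
#define DMA_LIFCR	0x08u		/* flag clear register, channels 0 to 3 */
#define DMA_HIFCR	0x0cu		/* flag clear register, channels 4 to 7 */
#define DMA_TCI		(1u << 5)	/* Transfer Complete flag of channel 0 */

/* Offset of the interrupt flag clear register serving 'chan_id' */
static uint32_t dma_ifcr_offset(unsigned int chan_id)
{
	return (chan_id & 4) ? DMA_HIFCR : DMA_LIFCR;
}

/*
 * Transfer Complete flag mask of 'chan_id' inside its flag clear register:
 * bit0 of chan_id adds a left shift by 6, bit1 adds a left shift by 16.
 */
static uint32_t dma_tcf_mask(unsigned int chan_id)
{
	return DMA_TCI << (6 * (chan_id & 0x1) + 16 * ((chan_id & 0x2) >> 1));
}
```

With these assumptions, channels 0 to 3 map to bits 5, 11, 21 and 27 of
the low register, and channels 4 to 7 map to the same bits of the high
register.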

2022-07-11 09:08:39

by Amelie Delaunay

Subject: [PATCH 1/4] docs: arm: stm32: introduce STM32 DMA-MDMA chaining feature

STM32 DMA-MDMA chaining feature is available on STM32 SoCs which embed
STM32 DMAMUX, DMA and MDMA controllers. This is the case on STM32MP1 SoCs
and also on STM32H7 SoCs, but the focus here is on STM32MP1 SoCs, which
use DDR. This documentation explains how to use the STM32 DMA-MDMA chaining
feature in drivers of STM32 peripherals that have request lines on STM32 DMA.

Signed-off-by: Amelie Delaunay <[email protected]>
---
.../arm/stm32/stm32-dma-mdma-chaining.rst | 365 ++++++++++++++++++
1 file changed, 365 insertions(+)
create mode 100644 Documentation/arm/stm32/stm32-dma-mdma-chaining.rst

diff --git a/Documentation/arm/stm32/stm32-dma-mdma-chaining.rst b/Documentation/arm/stm32/stm32-dma-mdma-chaining.rst
new file mode 100644
index 000000000000..bfbbadc45aa7
--- /dev/null
+++ b/Documentation/arm/stm32/stm32-dma-mdma-chaining.rst
@@ -0,0 +1,365 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+=======================
+STM32 DMA-MDMA chaining
+=======================
+
+
+Introduction
+------------
+
+ This document describes the STM32 DMA-MDMA chaining feature. But before going further, let's
+ introduce the peripherals involved.
+
+ To offload data transfers from the CPU, STM32 microprocessors (MPUs) embed direct memory access
+ controllers (DMA).
+
+ STM32MP1 SoCs embed both STM32 DMA and STM32 MDMA controllers. STM32 DMA request routing
+ capabilities are enhanced by a DMA request multiplexer (STM32 DMAMUX).
+
+ **STM32 DMAMUX**
+
+ STM32 DMAMUX routes any DMA request from a given peripheral to any channel of the STM32 DMA
+ controllers (STM32MP1 has two STM32 DMA controllers).
+
+ **STM32 DMA**
+
+ STM32 DMA is mainly used to implement central data buffer storage (usually in the system SRAM) for
+ different peripherals. It can access external RAMs, but without the ability to generate the
+ burst transfers that would make the best use of the AXI bus.
+
+ **STM32 MDMA**
+
+ STM32 MDMA (Master DMA) is mainly used to manage direct data transfers between RAM data buffers
+ without CPU intervention. It can also be used in a hierarchical structure that uses STM32 DMA as
+ first level data buffer interfaces for AHB peripherals, while the STM32 MDMA acts as a second
+ level DMA with better performance. As an AXI/AHB master, STM32 MDMA can take control of the AXI/AHB
+ bus.
+
+
+Principles
+----------
+
+ STM32 DMA-MDMA chaining feature relies on the strengths of STM32 DMA and STM32 MDMA controllers.
+
+ STM32 DMA has a circular Double Buffer Mode (DBM). At each end of transaction (when DMA data
+ counter - DMA_SxNDTR - reaches 0), the memory pointers (configured with DMA_SxM0AR and
+ DMA_SxM1AR) are swapped and the DMA data counter is automatically reloaded. This allows the SW or
+ the STM32 MDMA to process one memory area while the second memory area is being filled/used by the
+ STM32 DMA transfer.
+
+ With STM32 MDMA linked-list mode, a single request initiates the data array (collection of nodes)
+ to be transferred until the linked-list pointer for the channel is null. The channel transfer
+ complete of the last node is the end of transfer, unless first and last nodes are linked to each
+ other; in that case, the linked-list loops to create a circular MDMA transfer.
+
+ STM32 MDMA has direct connections with STM32 DMA. This enables autonomous communication and
+ synchronization between peripherals, thus saving CPU resources and reducing bus congestion. The
+ Transfer Complete signal of a STM32 DMA channel can trigger a STM32 MDMA transfer. STM32 MDMA can
+ clear the request generated by the STM32 DMA by writing to its Interrupt Clear register (whose
+ address is stored in MDMA_CxMAR, and bit mask in MDMA_CxMDR).
+
+ .. csv-table:: STM32 MDMA interconnect table with STM32 DMA
+ :header: "STM32 DMAMUX channels", "STM32 DMA controllers channels",
+ "STM32 DMA Transfer Complete signal", "STM32 MDMA request"
+
+ "Channel *0*", "DMA1 channel 0", dma1_tcf0, *0x00*
+ "Channel *1*", "DMA1 channel 1", dma1_tcf1, *0x01*
+ "Channel *2*", "DMA1 channel 2", dma1_tcf2, *0x02*
+ "Channel *3*", "DMA1 channel 3", dma1_tcf3, *0x03*
+ "Channel *4*", "DMA1 channel 4", dma1_tcf4, *0x04*
+ "Channel *5*", "DMA1 channel 5", dma1_tcf5, *0x05*
+ "Channel *6*", "DMA1 channel 6", dma1_tcf6, *0x06*
+ "Channel *7*", "DMA1 channel 7", dma1_tcf7, *0x07*
+ "Channel *8*", "DMA2 channel 0", dma2_tcf0, *0x08*
+ "Channel *9*", "DMA2 channel 1", dma2_tcf1, *0x09*
+ "Channel *10*", "DMA2 channel 2", dma2_tcf2, *0x0A*
+ "Channel *11*", "DMA2 channel 3", dma2_tcf3, *0x0B*
+ "Channel *12*", "DMA2 channel 4", dma2_tcf4, *0x0C*
+ "Channel *13*", "DMA2 channel 5", dma2_tcf5, *0x0D*
+ "Channel *14*", "DMA2 channel 6", dma2_tcf6, *0x0E*
+ "Channel *15*", "DMA2 channel 7", dma2_tcf7, *0x0F*
+
+ STM32 DMA-MDMA chaining feature then uses a SRAM buffer. STM32MP1 SoCs embed three fast access
+ static internal RAMs of various sizes, used for data storage. Due to STM32 DMA legacy (within
+ microcontrollers), STM32 DMA performance is poor with DDR, while it is optimal with SRAM.
+ Hence the SRAM buffer used between STM32 DMA and STM32 MDMA. This buffer is split into two equal
+ periods and STM32 DMA uses one period while STM32 MDMA uses the other period simultaneously.
+ ::
+
+ dma[1:2]-tcf[0:7]
+ .----------------.
+ ____________ ' _________ V____________
+ | STM32 DMA | / __|>_ \ | STM32 MDMA |
+ |------------| | / \ | |------------|
+ | DMA_SxM0AR |<=>| | SRAM | |<=>| []-[]...[] |
+ | DMA_SxM1AR | | \_____/ | | |
+ |____________| \___<|____/ |____________|
+
+ STM32 DMA-MDMA chaining uses (struct dma_slave_config).peripheral_config to exchange the
+ parameters needed to configure MDMA. These parameters are gathered into a u32 array with three
+ values:
+
+ * the STM32 MDMA request (which is actually the DMAMUX channel ID),
+ * the address of the STM32 DMA register to clear the Transfer Complete interrupt flag,
+ * the mask of the Transfer Complete interrupt flag of the STM32 DMA channel.
+
+Device Tree updates for STM32 DMA-MDMA chaining support
+-------------------------------------------------------
+
+ **1. Allocate a SRAM buffer**
+
+ SRAM device tree node is defined in SoC device tree. You can refer to it in your board device
+ tree to define your SRAM pool.
+ ::
+
+ &sram {
+ my_foo_device_dma_pool: dma-sram@0 {
+ reg = <0x0 0x1000>;
+ };
+ };
+
+ Be careful of the start index, in case there are other SRAM consumers.
+ Define your pool size strategically: to optimise chaining, the idea is that STM32 DMA and STM32
+ MDMA can work simultaneously, on each buffer of the SRAM.
+ If the SRAM period is greater than the expected DMA transfer, then STM32 DMA and STM32 MDMA will
+ work sequentially instead of simultaneously. It is not a functional issue but it is not optimal.
+
+ Don't forget to refer to your SRAM pool in your device node. You need to define a new property.
+ ::
+
+ &my_foo_device {
+ ...
+ my_dma_pool = &my_foo_device_dma_pool;
+ };
+
+ Then get this SRAM pool in your foo driver and allocate your SRAM buffer.
+
+ **2. Allocate a STM32 DMA channel and a STM32 MDMA channel**
+
+ You need to define an extra channel in your device tree node, in addition to the one you should
+ already have for "classic" DMA operation.
+
+ This new channel must be taken from STM32 MDMA channels, so, the phandle of the DMA controller
+ to use is the MDMA controller's one.
+ ::
+
+ &my_foo_device {
+ [...]
+ my_dma_pool = &my_foo_device_dma_pool;
+ dmas = <&dmamux1 ...>, // your STM32 DMA channel
+ <&mdma1 0 0x3 0x1200000a 0 0>; // the extra STM32 MDMA channel
+ };
+
+ Concerning STM32 MDMA bindings:
+
+ 1. The request line number: whatever the value here, it will be overwritten by MDMA driver
+ with the STM32 DMAMUX channel ID passed through (struct dma_slave_config).peripheral_config
+
+ 2. The priority level: choose Very High (0x3) so that your channel will take priority over the
+ others during request arbitration
+
+ 3. A 32-bit mask specifying the DMA channel configuration: source and destination address
+ increment, block transfer with 128 bytes per single transfer
+
+ 4. The 32-bit value specifying the register to be used to acknowledge the request: it will be
+ overwritten by MDMA driver, with the DMA channel interrupt flag clear register address passed
+ through (struct dma_slave_config).peripheral_config
+
+ 5. The 32-bit mask specifying the value to be written to acknowledge the request: it will be
+ overwritten by MDMA driver, with the DMA channel Transfer Complete flag passed through (struct
+ dma_slave_config).peripheral_config
+
+Driver updates for STM32 DMA-MDMA chaining support in foo driver
+----------------------------------------------------------------
+
+ **0. (optional) Refactor the original sg_table in case of dmaengine_prep_slave_sg()**
+
+ In case of dmaengine_prep_slave_sg(), the original sg_table can't be used as is. Two new
+ sg_tables must be created from the original one. One for STM32 DMA transfer (where memory
+ address targets now the SRAM buffer instead of DDR buffer) and one for STM32 MDMA transfer
+ (where memory address targets the DDR buffer).
+
+ The new sg_list items must fit SRAM period length. Here is an example for DMA_DEV_TO_MEM:
+ ::
+
+ /*
+ * Assuming sgl and nents, respectively the initial scatterlist and its length.
+ * Assuming sram_dma_buf and sram_period, respectively the memory allocated from
+ * the pool for DMA usage, and the length of the period, which is half of the
+ * sram_buf size.
+ */
+ struct sg_table new_dma_sgt, new_mdma_sgt;
+ struct scatterlist *s, *_sgl;
+ dma_addr_t ddr_dma_buf;
+ u32 new_nents = 0, len;
+ int i;
+
+ /* Count the number of entries needed */
+ for_each_sg(sgl, s, nents, i)
+ if (sg_dma_len(s) > sram_period)
+ new_nents += DIV_ROUND_UP(sg_dma_len(s), sram_period);
+ else
+ new_nents++;
+
+ /* Create sg table for STM32 DMA channel */
+ ret = sg_alloc_table(&new_dma_sgt, new_nents, GFP_ATOMIC);
+ if (ret)
+ dev_err(dev, "DMA sg table alloc failed\n");
+
+ for_each_sg(new_dma_sgt.sgl, s, new_dma_sgt.nents, i) {
+ _sgl = sgl;
+ sg_dma_len(s) = min(sg_dma_len(_sgl), sram_period);
+ /* Targets the beginning = first half of the sram_dma_buf */
+ s->dma_address = sram_dma_buf;
+ /*
+ * Targets the second half of the sram_buf
+ * for odd indexes of the item of the sg_list
+ */
+ if (i & 1)
+ s->dma_address += sram_period;
+ }
+
+ /* Create sg table for STM32 MDMA channel */
+ ret = sg_alloc_table(&new_mdma_sgt, new_nents, GFP_ATOMIC);
+ if (ret)
+ dev_err(dev, "MDMA sg_table alloc failed\n");
+
+ _sgl = sgl;
+ len = sg_dma_len(sgl);
+ ddr_dma_buf = sg_dma_address(sgl);
+ for_each_sg(new_mdma_sgt.sgl, s, new_mdma_sgt.nents, i) {
+ size_t bytes = min_t(size_t, len, sram_period);
+
+ sg_dma_len(s) = bytes;
+ sg_dma_address(s) = ddr_dma_buf;
+ len -= bytes;
+
+ if (!len && sg_next(_sgl)) {
+ _sgl = sg_next(_sgl);
+ len = sg_dma_len(_sgl);
+ ddr_dma_buf = sg_dma_address(_sgl);
+ } else {
+ ddr_dma_buf += bytes;
+ }
+ }
+
+ Don't forget to release these new sg_tables after getting the descriptors with
+ dmaengine_prep_slave_sg().
+
+ **1. Set controller specific parameters**
+
+ First, use dmaengine_slave_config() with a struct dma_slave_config to configure STM32 DMA
+ channel. You just have to take care of DMA addresses: the memory address (depending on the
+ transfer direction) must point to your SRAM buffer. Also set (struct dma_slave_config)
+ .peripheral_size != 0.
+
+ STM32 DMA driver will check (struct dma_slave_config).peripheral_size to determine if chaining
+ is being used or not. If it is used, then STM32 DMA driver fills (struct dma_slave_config)
+ .peripheral_config with an array of three u32: the first one containing STM32 DMAMUX channel
+ ID, the second one the channel interrupt flag clear register address, and the third one the
+ channel Transfer Complete flag mask.
+
+ Then, use dmaengine_slave_config() with another struct dma_slave_config to configure STM32 MDMA
+ channel. Take care of DMA addresses: the device address (depending on the transfer direction)
+ must point to your SRAM buffer, and the memory address must point to the buffer originally used
+ for "classic" DMA operation. Use the previous (struct dma_slave_config).peripheral_size and
+ .peripheral_config that have been updated by STM32 DMA driver, to set (struct dma_slave_config)
+ .peripheral_size and .peripheral_config of the struct dma_slave_config to configure STM32 MDMA
+ channel.
+ ::
+
+ struct dma_slave_config dma_conf;
+ struct dma_slave_config mdma_conf;
+
+ memset(&dma_conf, 0, sizeof(dma_conf));
+ [...]
+ dma_conf.direction = DMA_DEV_TO_MEM;
+ dma_conf.dst_addr = sram_dma_buf; // SRAM buffer
+ dma_conf.peripheral_size = 1; // peripheral_size != 0 => chaining
+
+ dmaengine_slave_config(dma_chan, &dma_conf);
+
+ memset(&mdma_conf, 0, sizeof(mdma_conf));
+ mdma_conf.direction = DMA_DEV_TO_MEM;
+ mdma_conf.src_addr = sram_dma_buf; // SRAM buffer
+ mdma_conf.dst_addr = rx_dma_buf; // original memory buffer
+ mdma_conf.peripheral_size = dma_conf.peripheral_size; // from dma_conf
+ mdma_conf.peripheral_config = dma_conf.peripheral_config; // from dma_conf
+
+ dmaengine_slave_config(mdma_chan, &mdma_conf);
+
+ **2. Get a descriptor for STM32 DMA channel transaction**
+
+ In the same way you get your descriptor for your "classic" DMA operation, you just have to
+ replace the original sg_list (in case of dmaengine_prep_slave_sg()) with the new sg_list using
+ SRAM buffer, or to replace the original buffer address, length and period (in case of
+ dmaengine_prep_dma_cyclic()) with the new SRAM buffer.
+
+ **3. Get a descriptor for STM32 MDMA channel transaction**
+
+ If you previously got the descriptor (for STM32 DMA) with
+
+ * dmaengine_prep_slave_sg(), then use dmaengine_prep_slave_sg() for STM32 MDMA;
+ * dmaengine_prep_dma_cyclic(), then use dmaengine_prep_dma_cyclic() for STM32 MDMA.
+
+ Use the new sg_list using SRAM buffer (in case of dmaengine_prep_slave_sg()), or, depending on
+ the transfer direction, either the original DDR buffer (in case of DMA_DEV_TO_MEM) or the SRAM
+ buffer (in case of DMA_MEM_TO_DEV), the source address being previously set with
+ dmaengine_slave_config().
+
+ **4. Submit both transactions**
+
+ Before submitting your transactions, you may need to define on which descriptor you want a
+ callback to be called at the end of the transfer (dmaengine_prep_slave_sg()) or the period
+ (dmaengine_prep_dma_cyclic()). Depending on the direction, set the callback on the descriptor
+ that finishes the overall transfer:
+
+ * DMA_DEV_TO_MEM: set the callback on the "MDMA" descriptor
+ * DMA_MEM_TO_DEV: set the callback on the "DMA" descriptor
+
+ Then, submit the descriptors, in any order, with dmaengine_tx_submit().
+
+ **5. Issue pending requests (and wait for callback notification)**
+
+ As STM32 MDMA channel transfer is triggered by STM32 DMA, you must issue STM32 MDMA channel before
+ STM32 DMA channel.
+
+ If any, your callback will be called to warn you about the end of the overall transfer or the
+ period completion.
+
+ Don't forget to terminate both channels. STM32 DMA channel is configured in cyclic Double-Buffer
+ mode so it won't be disabled by HW, you need to terminate it. STM32 MDMA channel will be stopped
+ by HW in case of sg transfer, but not in case of cyclic transfer. You can terminate it whatever
+ the kind of transfer.
+
+ **STM32 DMA-MDMA chaining DMA_MEM_TO_DEV special case**
+
+ STM32 DMA-MDMA chaining in DMA_MEM_TO_DEV is a special case. Indeed, the STM32 MDMA feeds the SRAM
+ buffer with the DDR data, and the STM32 DMA reads data from SRAM buffer. So some data (the first
+ period) must already be copied into the SRAM buffer when the STM32 DMA starts to read.
+
+ A trick could be pausing the STM32 DMA channel (that will raise a Transfer Complete signal,
+ triggering the STM32 MDMA channel), but the first data read by the STM32 DMA could be "wrong".
+ The proper way is to prepare the first SRAM period with dmaengine_prep_dma_memcpy(). Then this
+ first period should be "removed" from the sg or the cyclic transfer.
+
+ Due to this complexity, prefer the STM32 DMA-MDMA chaining for DMA_DEV_TO_MEM and keep the
+ "classic" DMA usage for DMA_MEM_TO_DEV, unless you're not afraid.
+
+Resources
+---------
+
+ Application note, datasheet and reference manual are available on ST website (STM32MP1_).
+
+ Dedicated focus on three application notes (AN5224_, AN4031_ & AN5001_) dealing with STM32 DMAMUX,
+ STM32 DMA and STM32 MDMA.
+
+.. _STM32MP1: https://www.st.com/en/microcontrollers-microprocessors/stm32mp1-series.html
+.. _AN5224: https://www.st.com/resource/en/application_note/an5224-stm32-dmamux-the-dma-request-router-stmicroelectronics.pdf
+.. _AN4031: https://www.st.com/resource/en/application_note/dm00046011-using-the-stm32f2-stm32f4-and-stm32f7-series-dma-controller-stmicroelectronics.pdf
+.. _AN5001: https://www.st.com/resource/en/application_note/an5001-stm32cube-expansion-package-for-stm32h7-series-mdma-stmicroelectronics.pdf
+
+:Authors:
+
+- Amelie Delaunay <[email protected]>
\ No newline at end of file
--
2.25.1
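
The sg_table refactoring described in step 0 of the documentation above
comes down to cutting every original entry into pieces no larger than one
SRAM period, and alternating the SRAM half used by the STM32 DMA. The
following user-space sketch shows that arithmetic only. It is an
assumption-laden illustration, not kernel code: struct chunk is a
hypothetical stand-in for a scatterlist entry, and split_by_period() and
sram_addr_for() are hypothetical helper names.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one scatterlist entry (DMA address + length) */
struct chunk {
	uint32_t addr;
	uint32_t len;
};

/*
 * Split each input entry into pieces of at most 'period' bytes.
 * Stores up to 'max_out' pieces in 'out' and returns the number of
 * pieces needed, so a first call with max_out == 0 can be used to count.
 */
static size_t split_by_period(const struct chunk *in, size_t nents,
			      uint32_t period, struct chunk *out, size_t max_out)
{
	size_t n = 0;

	if (!period)
		return 0;

	for (size_t i = 0; i < nents; i++) {
		uint32_t addr = in[i].addr;
		uint32_t left = in[i].len;

		while (left) {
			uint32_t bytes = left < period ? left : period;

			if (n < max_out) {
				out[n].addr = addr;	/* DDR side, used by MDMA */
				out[n].len = bytes;
			}
			n++;
			addr += bytes;
			left -= bytes;
		}
	}

	return n;
}

/* SRAM address used by the STM32 DMA for piece 'i': odd pieces use the second half */
static uint32_t sram_addr_for(uint32_t sram_base, uint32_t period, size_t i)
{
	return sram_base + ((i & 1) ? period : 0);
}
```

For example, two entries of 0x1800 and 0x400 bytes with a 0x800-byte
period need four pieces: three for the first entry and one for the
second.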

2022-07-11 09:10:09

by Amelie Delaunay

Subject: [PATCH 2/4] dmaengine: stm32-dmamux: set dmamux channel id in dma features bitfield

STM32 DMAMUX is used with STM32 DMA1 and DMA2:
- DMAMUX channels 0 to 7 are connected to DMA1 channels 0 to 7
- DMAMUX channels 8 to 15 are connected to DMA2 channels 0 to 7

STM32 MDMA can be triggered by DMA1 and DMA2 channels transfer complete,
and the "request line number" is the DMAMUX channel id (e.g. DMA2 channel 0
triggers MDMA with request line 8).

To configure MDMA correctly, set the DMAMUX channel id in the DMA features
bitfield, so that DMA can update struct dma_slave_config peripheral_config
properly.

Signed-off-by: Amelie Delaunay <[email protected]>
---
drivers/dma/stm32-dmamux.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/dma/stm32-dmamux.c b/drivers/dma/stm32-dmamux.c
index eee0c5aa5fb5..b431f9da9206 100644
--- a/drivers/dma/stm32-dmamux.c
+++ b/drivers/dma/stm32-dmamux.c
@@ -147,7 +147,7 @@ static void *stm32_dmamux_route_allocate(struct of_phandle_args *dma_spec,
mux->request = dma_spec->args[0];

/* craft DMA spec */
- dma_spec->args[3] = dma_spec->args[2];
+ dma_spec->args[3] = dma_spec->args[2] | mux->chan_id << 16;
dma_spec->args[2] = dma_spec->args[1];
dma_spec->args[1] = 0;
dma_spec->args[0] = mux->chan_id - min;
--
2.25.1
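
The channel numbering and the bit packing described in this patch can be
sketched in plain C. This is only an illustration under stated assumptions:
the helper and macro names below are hypothetical and do not come from the
stm32-dmamux or stm32-dma drivers, and the sketch assumes the existing DMA
features fit in the low 16 bits so that the DMAMUX channel id can occupy the
bits from 16 upwards, as the "<< 16" in the hunk suggests.

```c
#include <stdint.h>

/* Hypothetical names, for illustration only (not taken from the drivers). */
#define DMAMUX_CHANS_PER_DMA	8u	/* DMA1 and DMA2 each expose channels 0 to 7 */
#define FEATURES_MUX_ID_SHIFT	16u	/* assumed position of the DMAMUX channel id */
#define FEATURES_LOW_MASK	0xffffu	/* assumed width of the original features */

/* DMAMUX channel id for a given DMA controller (1 or 2) and channel (0 to 7). */
static uint32_t mux_chan_id_for(uint32_t dma_ctrl, uint32_t dma_chan)
{
	return (dma_ctrl - 1u) * DMAMUX_CHANS_PER_DMA + dma_chan;
}

/* Same expression as the patch: features | chan_id << 16. */
static uint32_t pack_features(uint32_t features, uint32_t mux_chan_id)
{
	return features | (mux_chan_id << FEATURES_MUX_ID_SHIFT);
}

/* Recover the DMAMUX channel id, i.e. the MDMA request line number. */
static uint32_t unpack_mux_chan_id(uint32_t packed)
{
	return packed >> FEATURES_MUX_ID_SHIFT;
}

/* Recover the original DMA features. */
static uint32_t unpack_features(uint32_t packed)
{
	return packed & FEATURES_LOW_MASK;
}
```

With these helpers, mux_chan_id_for(2, 0) gives 8, which matches the example
in the commit message (DMA2 channel 0 triggers MDMA with request line 8), and
pack_features(0x3, 8) gives 0x80003. How the stm32-dma driver actually decodes
the field is defined in patch 3/4, not here.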

2022-07-11 15:41:48

by Jonathan Corbet

Subject: Re: [PATCH 1/4] docs: arm: stm32: introduce STM32 DMA-MDMA chaining feature

Amelie Delaunay <[email protected]> writes:

> STM32 DMA-MDMA chaining feature is available on STM32 SoCs which embed
> STM32 DMAMUX, DMA and MDMA controllers. It is the case on STM32MP1 SoCs but
> also on STM32H7 SoCs. But focus is on STM32MP1 SoCs, using DDR.
> This documentation aims to explain how to use STM32 DMA-MDMA chaining
> feature in drivers of STM32 peripheral having request lines on STM32 DMA.
>
> Signed-off-by: Amelie Delaunay <[email protected]>
> ---
> .../arm/stm32/stm32-dma-mdma-chaining.rst | 365 ++++++++++++++++++
> 1 file changed, 365 insertions(+)
> create mode 100644 Documentation/arm/stm32/stm32-dma-mdma-chaining.rst

When you add a new RST file you also need to add it to index.rst
somewhere so that it becomes part of the docs build.

> diff --git a/Documentation/arm/stm32/stm32-dma-mdma-chaining.rst b/Documentation/arm/stm32/stm32-dma-mdma-chaining.rst
> new file mode 100644
> index 000000000000..bfbbadc45aa7
> --- /dev/null
> +++ b/Documentation/arm/stm32/stm32-dma-mdma-chaining.rst
> @@ -0,0 +1,365 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +
> +=======================
> +STM32 DMA-MDMA chaining
> +=======================
> +
> +
> +Introduction
> +------------
> +
> + This document describes the STM32 DMA-MDMA chaining feature. But before going further, let's
> + introduce the peripherals involved.

Please keep to the 80-column limit for documentation, it makes it easier
to read.

> + To offload data transfers from the CPU, STM32 microprocessors (MPUs) embed direct memory access
> + controllers (DMA).
> +
> + STM32MP1 SoCs embed both STM32 DMA and STM32 MDMA controllers. STM32 DMA request routing
> + capabilities are enhanced by a DMA request multiplexer (STM32 DMAMUX).
> +
> + **STM32 DMAMUX**
> +
> + STM32 DMAMUX routes any DMA request from a given peripheral to any STM32 DMA controller (STM32MP1
> + counts two STM32 DMA controllers) channels.
> +
> + **STM32 DMA**
> +
> + STM32 DMA is mainly used to implement central data buffer storage (usually in the system SRAM) for
> + different peripheral. It can access external RAMs but without the ability to generate convenient
> + burst transfer ensuring the best load of the AXI.
> +
> + **STM32 MDMA**
> +
> + STM32 MDMA (Master DMA) is mainly used to manage direct data transfers between RAM data buffers
> + without CPU intervention. It can also be used in a hierarchical structure that uses STM32 DMA as
> + first level data buffer interfaces for AHB peripherals, while the STM32 MDMA acts as a second
> + level DMA with better performance. As a AXI/AHB master, STM32 MDMA can take control of the AXI/AHB
> + bus.
> +
> +
> +Principles
> +----------
> +
> + STM32 DMA-MDMA chaining feature relies on the strengths of STM32 DMA and STM32 MDMA controllers.
> +
> + STM32 DMA has a circular Double Buffer Mode (DBM). At each end of transaction (when DMA data
> + counter - DMA_SxNDTR - reaches 0), the memory pointers (configured with DMA_SxSM0AR and
> + DMA_SxM1AR) are swapped and the DMA data counter is automatically reloaded. This allows the SW or
> + the STM32 MDMA to process one memory area while the second memory area is being filled/used by the
> + STM32 DMA transfer.
> +
> + With STM32 MDMA linked-list mode, a single request initiates the data array (collection of nodes)
> + to be transferred until the linked-list pointer for the channel is null. The channel transfer
> + complete of the last node is the end of transfer, unless first and last nodes are linked to each
> + other, in such a case, the linked-list loops on to create a circular MDMA transfer.
> +
> + STM32 MDMA has direct connections with STM32 DMA. This enables autonomous communication and
> + synchronization between peripherals, thus saving CPU resources and bus congestion. Transfer
> + Complete signal of STM32 DMA channel can triggers STM32 MDMA transfer. STM32 MDMA can clear the
> + request generated by the STM32 DMA by writing to its Interrupt Clear register (whose address is
> + stored in MDMA_CxMAR, and bit mask in MDMA_CxMDR).
> +
> + .. csv-table:: STM32 MDMA interconnect table with STM32 DMA
> + :header: "STM32 DMAMUX channels", "STM32 DMA controllers channels",
> + "STM32 DMA Transfer Complete signal", "STM32 MDMA request"

If at all possible, please use simple tables; that makes the plain text
documentation much easier to read.

[...]

Thanks,

jon

2022-07-12 14:01:01

by Amelie Delaunay

Subject: Re: [PATCH 1/4] docs: arm: stm32: introduce STM32 DMA-MDMA chaining feature



On 7/11/22 17:11, Jonathan Corbet wrote:
> Amelie Delaunay <[email protected]> writes:
>
>> STM32 DMA-MDMA chaining feature is available on STM32 SoCs which embed
>> STM32 DMAMUX, DMA and MDMA controllers. It is the case on STM32MP1 SoCs but
>> also on STM32H7 SoCs. But focus is on STM32MP1 SoCs, using DDR.
>> This documentation aims to explain how to use STM32 DMA-MDMA chaining
>> feature in drivers of STM32 peripheral having request lines on STM32 DMA.
>>
>> Signed-off-by: Amelie Delaunay <[email protected]>
>> ---
>> .../arm/stm32/stm32-dma-mdma-chaining.rst | 365 ++++++++++++++++++
>> 1 file changed, 365 insertions(+)
>> create mode 100644 Documentation/arm/stm32/stm32-dma-mdma-chaining.rst
>
> When you add a new RST file you also need to add it to index.rst
> somewhere so that it becomes part of the docs build.
>

Thanks for your review.

I'll add it to index.rst, alongside the other stm32 documentation.

>> diff --git a/Documentation/arm/stm32/stm32-dma-mdma-chaining.rst b/Documentation/arm/stm32/stm32-dma-mdma-chaining.rst
>> new file mode 100644
>> index 000000000000..bfbbadc45aa7
>> --- /dev/null
>> +++ b/Documentation/arm/stm32/stm32-dma-mdma-chaining.rst
>> @@ -0,0 +1,365 @@
>> +.. SPDX-License-Identifier: GPL-2.0
>> +
>> +=======================
>> +STM32 DMA-MDMA chaining
>> +=======================
>> +
>> +
>> +Introduction
>> +------------
>> +
>> + This document describes the STM32 DMA-MDMA chaining feature. But before going further, let's
>> + introduce the peripherals involved.
>
> Please keep to the 80-column limit for documentation, it makes it easier
> to read.
>

OK, I'll prepare a v2 with an 80-column limit for the documentation patch.

>> + To offload data transfers from the CPU, STM32 microprocessors (MPUs) embed direct memory access
>> + controllers (DMA).
>> +
>> + STM32MP1 SoCs embed both STM32 DMA and STM32 MDMA controllers. STM32 DMA request routing
>> + capabilities are enhanced by a DMA request multiplexer (STM32 DMAMUX).
>> +
>> + **STM32 DMAMUX**
>> +
>> + STM32 DMAMUX routes any DMA request from a given peripheral to any STM32 DMA controller (STM32MP1
>> + counts two STM32 DMA controllers) channels.
>> +
>> + **STM32 DMA**
>> +
>> + STM32 DMA is mainly used to implement central data buffer storage (usually in the system SRAM) for
>> + different peripheral. It can access external RAMs but without the ability to generate convenient
>> + burst transfer ensuring the best load of the AXI.
>> +
>> + **STM32 MDMA**
>> +
>> + STM32 MDMA (Master DMA) is mainly used to manage direct data transfers between RAM data buffers
>> + without CPU intervention. It can also be used in a hierarchical structure that uses STM32 DMA as
>> + first level data buffer interfaces for AHB peripherals, while the STM32 MDMA acts as a second
>> + level DMA with better performance. As a AXI/AHB master, STM32 MDMA can take control of the AXI/AHB
>> + bus.
>> +
>> +
>> +Principles
>> +----------
>> +
>> + STM32 DMA-MDMA chaining feature relies on the strengths of STM32 DMA and STM32 MDMA controllers.
>> +
>> + STM32 DMA has a circular Double Buffer Mode (DBM). At each end of transaction (when DMA data
>> + counter - DMA_SxNDTR - reaches 0), the memory pointers (configured with DMA_SxSM0AR and
>> + DMA_SxM1AR) are swapped and the DMA data counter is automatically reloaded. This allows the SW or
>> + the STM32 MDMA to process one memory area while the second memory area is being filled/used by the
>> + STM32 DMA transfer.
>> +
>> + With STM32 MDMA linked-list mode, a single request initiates the data array (collection of nodes)
>> + to be transferred until the linked-list pointer for the channel is null. The channel transfer
>> + complete of the last node is the end of transfer, unless first and last nodes are linked to each
>> + other, in such a case, the linked-list loops on to create a circular MDMA transfer.
>> +
>> + STM32 MDMA has direct connections with STM32 DMA. This enables autonomous communication and
>> + synchronization between peripherals, thus saving CPU resources and bus congestion. Transfer
>> + Complete signal of STM32 DMA channel can triggers STM32 MDMA transfer. STM32 MDMA can clear the
>> + request generated by the STM32 DMA by writing to its Interrupt Clear register (whose address is
>> + stored in MDMA_CxMAR, and bit mask in MDMA_CxMDR).
>> +
>> + .. csv-table:: STM32 MDMA interconnect table with STM32 DMA
>> + :header: "STM32 DMAMUX channels", "STM32 DMA controllers channels",
>> + "STM32 DMA Transfer Complete signal", "STM32 MDMA request"
>
> If at all possible, please use simple tables; that makes the plain text
> documentation much easier to read.
>

It is possible, with some extra lines. I'll update it in v2 coming soon.

> [...]
>
> Thanks,
>
> jon

Regards,
Amelie