2018-04-02 10:42:24

by Radhey Shyam Pandey

[permalink] [raw]
Subject: [RFC 0/6] Xilinx DMA enhancements and optimization

Some background about the patch series: Xilinx Axi Ethernet device driver
(xilinx_axienet_main.c) currently has axi-dma code inside it. The goal
is to refactor axiethernet driver and use existing AXI DMA driver using
DMAEngine API.

This patchset does feature addition and optimization to support
integration with axiethernet network driver. Once this patchset
is reviewed I will send separate patchset for axiethernet driver.

Radhey Shyam Pandey (6):
dt-bindings: dma: xilinx_dma: Add optional property
has_axieth_connected
dmaengine: xilinx_dma: Pass AXI4-Stream control words to netdev dma
client
dmaengine: xilinx_dma: Increase AXI DMA transaction segment count
dmaengine: xilinx_dma: Freeup active list based on descriptor
completion bit
dmaengine: xilinx_dma: Program interrupt delay timeout
dmaengine: xilinx_dma: Use tasklet_hi_schedule for timing critical
usecase

.../devicetree/bindings/dma/xilinx/xilinx_dma.txt | 1 +
drivers/dma/xilinx/xilinx_dma.c | 73 +++++++++++++++----
2 files changed, 58 insertions(+), 16 deletions(-)



2018-04-02 10:41:06

by Radhey Shyam Pandey

[permalink] [raw]
Subject: [RFC 1/6] dt-bindings: dma: xilinx_dma: Add optional property has_axieth_connected

Add an optional AXI DMA property 'has_axieth_connected'. This can be
specified to indicate that AXI DMA is connected to AXI Ethernet in
hardware design and dma driver needs to do some additional handling.

Signed-off-by: Radhey Shyam Pandey <[email protected]>
---
.../devicetree/bindings/dma/xilinx/xilinx_dma.txt | 1 +
1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/Documentation/devicetree/bindings/dma/xilinx/xilinx_dma.txt b/Documentation/devicetree/bindings/dma/xilinx/xilinx_dma.txt
index a2b8bfa..5d592e3 100644
--- a/Documentation/devicetree/bindings/dma/xilinx/xilinx_dma.txt
+++ b/Documentation/devicetree/bindings/dma/xilinx/xilinx_dma.txt
@@ -42,6 +42,7 @@ Optional properties:
the hardware.
Optional properties for AXI DMA:
- xlnx,mcdma: Tells whether configured for multi-channel mode in the hardware.
+- xlnx,axieth-connected: Tells whether AXI DMA is connected to AXI Ethernet.
Optional properties for VDMA:
- xlnx,flush-fsync: Tells which channel to Flush on Frame sync.
It takes following values:
--
1.7.1


2018-04-02 10:41:59

by Radhey Shyam Pandey

[permalink] [raw]
Subject: [RFC 5/6] dmaengine: xilinx_dma: Program interrupt delay timeout

Program IRQDelay for AXI DMA. The interrupt timeout mechanism causes
the DMA engine to generate an interrupt after the delay time period
has expired. It enables dmaengine to respond in real-time even though
interrupt coalescing is configured.

Signed-off-by: Radhey Shyam Pandey <[email protected]>
---
drivers/dma/xilinx/xilinx_dma.c | 16 ++++++++++++++--
1 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/drivers/dma/xilinx/xilinx_dma.c b/drivers/dma/xilinx/xilinx_dma.c
index 518465e..ab8f1b0 100644
--- a/drivers/dma/xilinx/xilinx_dma.c
+++ b/drivers/dma/xilinx/xilinx_dma.c
@@ -161,8 +161,12 @@
/* AXI DMA Specific Masks/Bit fields */
#define XILINX_DMA_MAX_TRANS_LEN GENMASK(22, 0)
#define XILINX_DMA_CR_COALESCE_MAX GENMASK(23, 16)
+#define XILINX_DMA_CR_DELAY_MAX GENMASK(31, 24)
#define XILINX_DMA_CR_CYCLIC_BD_EN_MASK BIT(4)
#define XILINX_DMA_CR_COALESCE_SHIFT 16
+#define XILINX_DMA_CR_DELAY_SHIFT 24
+#define XILINX_DMA_CR_WAITBOUND_DFT 254
+
#define XILINX_DMA_BD_SOP BIT(27)
#define XILINX_DMA_BD_EOP BIT(26)
#define XILINX_DMA_COALESCE_MAX 255
@@ -1294,6 +1298,12 @@ static void xilinx_dma_start_transfer(struct xilinx_dma_chan *chan)
reg &= ~XILINX_DMA_CR_COALESCE_MAX;
reg |= chan->desc_pendingcount <<
XILINX_DMA_CR_COALESCE_SHIFT;
+
+ if (chan->xdev->has_axieth_connected) {
+ reg &= ~XILINX_DMA_CR_DELAY_MAX;
+ reg |= XILINX_DMA_CR_WAITBOUND_DFT <<
+ XILINX_DMA_CR_DELAY_SHIFT;
+ }
dma_ctrl_write(chan, XILINX_DMA_REG_DMACR, reg);
}

@@ -1508,7 +1518,8 @@ static irqreturn_t xilinx_dma_irq_handler(int irq, void *data)
}
}

- if (status & XILINX_DMA_DMASR_DLY_CNT_IRQ) {
+ if (!chan->xdev->has_axieth_connected && (status &
+ XILINX_DMA_DMASR_DLY_CNT_IRQ)) {
/*
* Device takes too long to do the transfer when user requires
* responsiveness.
@@ -1516,7 +1527,8 @@ static irqreturn_t xilinx_dma_irq_handler(int irq, void *data)
dev_dbg(chan->dev, "Inter-packet latency too long\n");
}

- if (status & XILINX_DMA_DMASR_FRM_CNT_IRQ) {
+ if (status & (XILINX_DMA_DMASR_FRM_CNT_IRQ |
+ XILINX_DMA_DMASR_DLY_CNT_IRQ)) {
spin_lock(&chan->lock);
xilinx_dma_complete_descriptor(chan);
chan->idle = true;
--
1.7.1


2018-04-02 10:42:06

by Radhey Shyam Pandey

[permalink] [raw]
Subject: [RFC 4/6] dmaengine: xilinx_dma: Freeup active list based on descriptor completion bit

AXIDMA IP sets completion bit to 1 when the transfer is completed. Read
this bit to move descriptor from active list to the done list. This feature
is needed when interrupt delay timeout and IRQThreshold is enabled i.e
Dly_IrqEn is triggered w/o completing Interrupt Threshold.

Signed-off-by: Radhey Shyam Pandey <[email protected]>
---
drivers/dma/xilinx/xilinx_dma.c | 18 ++++++++++++++----
1 files changed, 14 insertions(+), 4 deletions(-)

diff --git a/drivers/dma/xilinx/xilinx_dma.c b/drivers/dma/xilinx/xilinx_dma.c
index 36e1ab9..518465e 100644
--- a/drivers/dma/xilinx/xilinx_dma.c
+++ b/drivers/dma/xilinx/xilinx_dma.c
@@ -103,6 +103,7 @@
#define XILINX_DMA_PARK_PTR_RD_REF_SHIFT 0
#define XILINX_DMA_PARK_PTR_RD_REF_MASK GENMASK(4, 0)
#define XILINX_DMA_REG_VDMA_VERSION 0x002c
+#define XILINX_DMA_COMP_MASK BIT(31)

/* Register Direct Mode Registers */
#define XILINX_DMA_REG_VSIZE 0x0000
@@ -1387,16 +1388,25 @@ static void xilinx_dma_issue_pending(struct dma_chan *dchan)
static void xilinx_dma_complete_descriptor(struct xilinx_dma_chan *chan)
{
struct xilinx_dma_tx_descriptor *desc, *next;
+ struct xilinx_axidma_tx_segment *seg;

/* This function was invoked with lock held */
if (list_empty(&chan->active_list))
return;

list_for_each_entry_safe(desc, next, &chan->active_list, node) {
- list_del(&desc->node);
- if (!desc->cyclic)
- dma_cookie_complete(&desc->async_tx);
- list_add_tail(&desc->node, &chan->done_list);
+
+ seg = list_last_entry(&desc->segments,
+ struct xilinx_axidma_tx_segment, node);
+ if ((seg->hw.status & XILINX_DMA_COMP_MASK) ||
+ (!chan->xdev->has_axieth_connected)) {
+ list_del(&desc->node);
+ if (!desc->cyclic)
+ dma_cookie_complete(&desc->async_tx);
+ list_add_tail(&desc->node, &chan->done_list);
+ } else {
+ break;
+ }
}
}

--
1.7.1


2018-04-02 10:42:11

by Radhey Shyam Pandey

[permalink] [raw]
Subject: [RFC 6/6] dmaengine: xilinx_dma: Use tasklet_hi_schedule for timing critical usecase

Schedule tasklet with high priority to ensure that callback processing
is prioritized. It improves throughput for netdev dma clients.

Signed-off-by: Radhey Shyam Pandey <[email protected]>
---
drivers/dma/xilinx/xilinx_dma.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/dma/xilinx/xilinx_dma.c b/drivers/dma/xilinx/xilinx_dma.c
index ab8f1b0..d45224d 100644
--- a/drivers/dma/xilinx/xilinx_dma.c
+++ b/drivers/dma/xilinx/xilinx_dma.c
@@ -1536,7 +1536,7 @@ static irqreturn_t xilinx_dma_irq_handler(int irq, void *data)
spin_unlock(&chan->lock);
}

- tasklet_schedule(&chan->tasklet);
+ tasklet_hi_schedule(&chan->tasklet);
return IRQ_HANDLED;
}

--
1.7.1


2018-04-02 10:42:18

by Radhey Shyam Pandey

[permalink] [raw]
Subject: [RFC 3/6] dmaengine: xilinx_dma: Increase AXI DMA transaction segment count

Increase AXI DMA transaction segments count to ensure that even in
high load we always get a free segment in prepare descriptor for a
DMA_SLAVE transaction.

Signed-off-by: Radhey Shyam Pandey <[email protected]>
---
drivers/dma/xilinx/xilinx_dma.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/dma/xilinx/xilinx_dma.c b/drivers/dma/xilinx/xilinx_dma.c
index 16fee30..36e1ab9 100644
--- a/drivers/dma/xilinx/xilinx_dma.c
+++ b/drivers/dma/xilinx/xilinx_dma.c
@@ -165,7 +165,7 @@
#define XILINX_DMA_BD_SOP BIT(27)
#define XILINX_DMA_BD_EOP BIT(26)
#define XILINX_DMA_COALESCE_MAX 255
-#define XILINX_DMA_NUM_DESCS 255
+#define XILINX_DMA_NUM_DESCS 512
#define XILINX_DMA_NUM_APP_WORDS 5

/* Multi-Channel DMA Descriptor offsets*/
--
1.7.1


2018-04-02 10:42:41

by Radhey Shyam Pandey

[permalink] [raw]
Subject: [RFC 2/6] dmaengine: xilinx_dma: Pass AXI4-Stream control words to netdev dma client

Read DT property to check if AXI DMA is connected to axithernet.
If connected pass AXI4-Stream control words to netdev dma client.
It is mandatory that netdev dma client reserve initial memory for
max supported control words in callback_param.

Signed-off-by: Radhey Shyam Pandey <[email protected]>
---
drivers/dma/xilinx/xilinx_dma.c | 35 +++++++++++++++++++++++++++--------
1 files changed, 27 insertions(+), 8 deletions(-)

diff --git a/drivers/dma/xilinx/xilinx_dma.c b/drivers/dma/xilinx/xilinx_dma.c
index 27b5235..16fee30 100644
--- a/drivers/dma/xilinx/xilinx_dma.c
+++ b/drivers/dma/xilinx/xilinx_dma.c
@@ -418,6 +418,7 @@ struct xilinx_dma_config {
* @rxs_clk: DMA s2mm stream clock
* @nr_channels: Number of channels DMA device supports
* @chan_id: DMA channel identifier
+ * @has_axieth_connected: AXI DMA connected to AXI ethernet
*/
struct xilinx_dma_device {
void __iomem *regs;
@@ -437,6 +438,7 @@ struct xilinx_dma_device {
struct clk *rxs_clk;
u32 nr_channels;
u32 chan_id;
+ bool has_axieth_connected;
};

/* Macros */
@@ -809,7 +811,10 @@ static void xilinx_dma_chan_handle_cyclic(struct xilinx_dma_chan *chan,
static void xilinx_dma_chan_desc_cleanup(struct xilinx_dma_chan *chan)
{
struct xilinx_dma_tx_descriptor *desc, *next;
+ struct xilinx_axidma_tx_segment *seg;
+ struct xilinx_axidma_desc_hw *hw;
unsigned long flags;
+ u32 *app_w;

spin_lock_irqsave(&chan->lock, flags);

@@ -821,17 +826,28 @@ static void xilinx_dma_chan_desc_cleanup(struct xilinx_dma_chan *chan)
break;
}

- /* Remove from the list of running transactions */
- list_del(&desc->node);
-
/* Run the link descriptor callback function */
dmaengine_desc_get_callback(&desc->async_tx, &cb);
- if (dmaengine_desc_callback_valid(&cb)) {
- spin_unlock_irqrestore(&chan->lock, flags);
- dmaengine_desc_callback_invoke(&cb, NULL);
- spin_lock_irqsave(&chan->lock, flags);
+
+ if (chan->xdev->has_axieth_connected) {
+ seg = list_first_entry(&desc->segments,
+ struct xilinx_axidma_tx_segment, node);
+ if (cb.callback_param) {
+ app_w = (u32 *) cb.callback_param;
+ hw = &seg->hw;
+ *app_w = hw->status & XILINX_DMA_MAX_TRANS_LEN;
+ memcpy(app_w, hw->app, sizeof(u32) *
+ XILINX_DMA_NUM_APP_WORDS);
+ }
}

+ /* Remove from the list of running transactions */
+ list_del(&desc->node);
+
+ spin_unlock_irqrestore(&chan->lock, flags);
+ dmaengine_desc_callback_invoke(&cb, NULL);
+ spin_lock_irqsave(&chan->lock, flags);
+
/* Run any dependencies, then free the descriptor */
dma_run_dependencies(&desc->async_tx);
xilinx_dma_free_tx_descriptor(chan, desc);
@@ -2602,8 +2618,11 @@ static int xilinx_dma_probe(struct platform_device *pdev)

/* Retrieve the DMA engine properties from the device tree */
xdev->has_sg = of_property_read_bool(node, "xlnx,include-sg");
- if (xdev->dma_config->dmatype == XDMA_TYPE_AXIDMA)
+ if (xdev->dma_config->dmatype == XDMA_TYPE_AXIDMA) {
xdev->mcdma = of_property_read_bool(node, "xlnx,mcdma");
+ xdev->has_axieth_connected = of_property_read_bool(node,
+ "xlnx,axieth-connected");
+ }

if (xdev->dma_config->dmatype == XDMA_TYPE_VDMA) {
err = of_property_read_u32(node, "xlnx,num-fstores",
--
1.7.1


2018-04-11 09:05:53

by Vinod Koul

[permalink] [raw]
Subject: Re: [RFC 1/6] dt-bindings: dma: xilinx_dma: Add optional property has_axieth_connected

On Mon, Apr 02, 2018 at 04:09:01PM +0530, Radhey Shyam Pandey wrote:
> Add an optional AXI DMA property 'has_axieth_connected'. This can be
> specified to indicate that AXI DMA is connected to AXI Ethernet in
> hardware design and dma driver needs to do some additional handling.

1. why are DT people and list not added to this? I have added that
2. should we have a property for a specific peripheral? Wouldn't it help to have
a generic property for peripheral connected which can be scaled to more IPs
being connected in future?
3. Please use right subsystem name, it damengine: not dma:

>
> Signed-off-by: Radhey Shyam Pandey <[email protected]>
> ---
> .../devicetree/bindings/dma/xilinx/xilinx_dma.txt | 1 +
> 1 files changed, 1 insertions(+), 0 deletions(-)
>
> diff --git a/Documentation/devicetree/bindings/dma/xilinx/xilinx_dma.txt b/Documentation/devicetree/bindings/dma/xilinx/xilinx_dma.txt
> index a2b8bfa..5d592e3 100644
> --- a/Documentation/devicetree/bindings/dma/xilinx/xilinx_dma.txt
> +++ b/Documentation/devicetree/bindings/dma/xilinx/xilinx_dma.txt
> @@ -42,6 +42,7 @@ Optional properties:
> the hardware.
> Optional properties for AXI DMA:
> - xlnx,mcdma: Tells whether configured for multi-channel mode in the hardware.
> +- xlnx,axieth-connected: Tells whether AXI DMA is connected to AXI Ethernet.
> Optional properties for VDMA:
> - xlnx,flush-fsync: Tells which channel to Flush on Frame sync.
> It takes following values:
> --
> 1.7.1
>

--
~Vinod

2018-04-11 09:07:43

by Vinod Koul

[permalink] [raw]
Subject: Re: [RFC 2/6] dmaengine: xilinx_dma: Pass AXI4-Stream control words to netdev dma client

On Mon, Apr 02, 2018 at 04:09:02PM +0530, Radhey Shyam Pandey wrote:

> +
> + if (chan->xdev->has_axieth_connected) {
> + seg = list_first_entry(&desc->segments,
> + struct xilinx_axidma_tx_segment, node);
> + if (cb.callback_param) {
> + app_w = (u32 *) cb.callback_param;

why are you interpreting callback_param? This is plainly wrong.
we do not know what is the interpretation of callback_param and it is
internal to submitter.

What exactly is the problem you are trying to solve?

> + hw = &seg->hw;
> + *app_w = hw->status & XILINX_DMA_MAX_TRANS_LEN;
> + memcpy(app_w, hw->app, sizeof(u32) *
> + XILINX_DMA_NUM_APP_WORDS);
> + }

--
~Vinod

2018-04-11 09:10:15

by Vinod Koul

[permalink] [raw]
Subject: Re: [RFC 4/6] dmaengine: xilinx_dma: Freeup active list based on descriptor completion bit

On Mon, Apr 02, 2018 at 04:09:04PM +0530, Radhey Shyam Pandey wrote:
> AXIDMA IP sets completion bit to 1 when the transfer is completed. Read
> this bit to move descriptor from active list to the done list. This feature
> is needed when interrupt delay timeout and IRQThreshold is enabled i.e
> Dly_IrqEn is triggered w/o completing Interrupt Threshold.
>
> Signed-off-by: Radhey Shyam Pandey <[email protected]>
> ---
> drivers/dma/xilinx/xilinx_dma.c | 18 ++++++++++++++----
> 1 files changed, 14 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/dma/xilinx/xilinx_dma.c b/drivers/dma/xilinx/xilinx_dma.c
> index 36e1ab9..518465e 100644
> --- a/drivers/dma/xilinx/xilinx_dma.c
> +++ b/drivers/dma/xilinx/xilinx_dma.c
> @@ -103,6 +103,7 @@
> #define XILINX_DMA_PARK_PTR_RD_REF_SHIFT 0
> #define XILINX_DMA_PARK_PTR_RD_REF_MASK GENMASK(4, 0)
> #define XILINX_DMA_REG_VDMA_VERSION 0x002c
> +#define XILINX_DMA_COMP_MASK BIT(31)
>
> /* Register Direct Mode Registers */
> #define XILINX_DMA_REG_VSIZE 0x0000
> @@ -1387,16 +1388,25 @@ static void xilinx_dma_issue_pending(struct dma_chan *dchan)
> static void xilinx_dma_complete_descriptor(struct xilinx_dma_chan *chan)
> {
> struct xilinx_dma_tx_descriptor *desc, *next;
> + struct xilinx_axidma_tx_segment *seg;
>
> /* This function was invoked with lock held */
> if (list_empty(&chan->active_list))
> return;
>
> list_for_each_entry_safe(desc, next, &chan->active_list, node) {
> - list_del(&desc->node);
> - if (!desc->cyclic)
> - dma_cookie_complete(&desc->async_tx);
> - list_add_tail(&desc->node, &chan->done_list);
> +
> + seg = list_last_entry(&desc->segments,
> + struct xilinx_axidma_tx_segment, node);
> + if ((seg->hw.status & XILINX_DMA_COMP_MASK) ||
> + (!chan->xdev->has_axieth_connected)) {

why the second case ? That is not expalined in log?

> + list_del(&desc->node);
> + if (!desc->cyclic)
> + dma_cookie_complete(&desc->async_tx);
> + list_add_tail(&desc->node, &chan->done_list);
> + } else {
> + break;
> + }
> }
> }
>
> --
> 1.7.1
>

--
~Vinod

2018-04-11 09:10:57

by Vinod Koul

[permalink] [raw]
Subject: Re: [RFC 5/6] dmaengine: xilinx_dma: Program interrupt delay timeout

On Mon, Apr 02, 2018 at 04:09:05PM +0530, Radhey Shyam Pandey wrote:
> Program IRQDelay for AXI DMA. The interrupt timeout mechanism causes
> the DMA engine to generate an interrupt after the delay time period
> has expired. It enables dmaengine to respond in real-time even though
> interrupt coalescing is configured.

again you are doing this only for axieth_connected, why is that?

>
> Signed-off-by: Radhey Shyam Pandey <[email protected]>
> ---
> drivers/dma/xilinx/xilinx_dma.c | 16 ++++++++++++++--
> 1 files changed, 14 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/dma/xilinx/xilinx_dma.c b/drivers/dma/xilinx/xilinx_dma.c
> index 518465e..ab8f1b0 100644
> --- a/drivers/dma/xilinx/xilinx_dma.c
> +++ b/drivers/dma/xilinx/xilinx_dma.c
> @@ -161,8 +161,12 @@
> /* AXI DMA Specific Masks/Bit fields */
> #define XILINX_DMA_MAX_TRANS_LEN GENMASK(22, 0)
> #define XILINX_DMA_CR_COALESCE_MAX GENMASK(23, 16)
> +#define XILINX_DMA_CR_DELAY_MAX GENMASK(31, 24)
> #define XILINX_DMA_CR_CYCLIC_BD_EN_MASK BIT(4)
> #define XILINX_DMA_CR_COALESCE_SHIFT 16
> +#define XILINX_DMA_CR_DELAY_SHIFT 24
> +#define XILINX_DMA_CR_WAITBOUND_DFT 254
> +
> #define XILINX_DMA_BD_SOP BIT(27)
> #define XILINX_DMA_BD_EOP BIT(26)
> #define XILINX_DMA_COALESCE_MAX 255
> @@ -1294,6 +1298,12 @@ static void xilinx_dma_start_transfer(struct xilinx_dma_chan *chan)
> reg &= ~XILINX_DMA_CR_COALESCE_MAX;
> reg |= chan->desc_pendingcount <<
> XILINX_DMA_CR_COALESCE_SHIFT;
> +
> + if (chan->xdev->has_axieth_connected) {
> + reg &= ~XILINX_DMA_CR_DELAY_MAX;
> + reg |= XILINX_DMA_CR_WAITBOUND_DFT <<
> + XILINX_DMA_CR_DELAY_SHIFT;
> + }
> dma_ctrl_write(chan, XILINX_DMA_REG_DMACR, reg);
> }
>
> @@ -1508,7 +1518,8 @@ static irqreturn_t xilinx_dma_irq_handler(int irq, void *data)
> }
> }
>
> - if (status & XILINX_DMA_DMASR_DLY_CNT_IRQ) {
> + if (!chan->xdev->has_axieth_connected && (status &
> + XILINX_DMA_DMASR_DLY_CNT_IRQ)) {
> /*
> * Device takes too long to do the transfer when user requires
> * responsiveness.
> @@ -1516,7 +1527,8 @@ static irqreturn_t xilinx_dma_irq_handler(int irq, void *data)
> dev_dbg(chan->dev, "Inter-packet latency too long\n");
> }
>
> - if (status & XILINX_DMA_DMASR_FRM_CNT_IRQ) {
> + if (status & (XILINX_DMA_DMASR_FRM_CNT_IRQ |
> + XILINX_DMA_DMASR_DLY_CNT_IRQ)) {
> spin_lock(&chan->lock);
> xilinx_dma_complete_descriptor(chan);
> chan->idle = true;
> --
> 1.7.1
>

--
~Vinod

2018-04-17 10:56:35

by Radhey Shyam Pandey

[permalink] [raw]
Subject: RE: [RFC 1/6] dt-bindings: dma: xilinx_dma: Add optional property has_axieth_connected

Hi Vinod,

> -----Original Message-----
> From: Vinod Koul [mailto:[email protected]]
> Sent: Wednesday, April 11, 2018 2:35 PM
> To: Radhey Shyam Pandey <[email protected]>; Rob Herring
> <[email protected]>; [email protected]
> Cc: [email protected]; [email protected]; Appana Durga
> Kedareswara Rao <[email protected]>; Radhey Shyam Pandey
> <[email protected]>; [email protected]; [email protected];
> [email protected]; [email protected]
> Subject: Re: [RFC 1/6] dt-bindings: dma: xilinx_dma: Add optional property
> has_axieth_connected
>
> On Mon, Apr 02, 2018 at 04:09:01PM +0530, Radhey Shyam Pandey wrote:
> > Add an optional AXI DMA property 'has_axieth_connected'. This can be
> > specified to indicate that AXI DMA is connected to AXI Ethernet in
> > hardware design and dma driver needs to do some additional handling.
>
> 1. why are DT people and list not added to this? I have added that
Sorry missed it. I will add in v2. Thanks.
> 2. should we have a property for a specific peripheral? Wouldn't it help to
> have
> a generic property for peripheral connected which can be scaled to more IPs
> being connected in future?
Good point! I will rename this property to xlnx,axistream-connected and
update description. It can then be reused for other usecases as well.
> 3. Please use right subsystem name, it damengine: not dma:
I will fix.
>
> >
> > Signed-off-by: Radhey Shyam Pandey <[email protected]>
> > ---
> > .../devicetree/bindings/dma/xilinx/xilinx_dma.txt | 1 +
> > 1 files changed, 1 insertions(+), 0 deletions(-)
> >
> > diff --git a/Documentation/devicetree/bindings/dma/xilinx/xilinx_dma.txt
> b/Documentation/devicetree/bindings/dma/xilinx/xilinx_dma.txt
> > index a2b8bfa..5d592e3 100644
> > --- a/Documentation/devicetree/bindings/dma/xilinx/xilinx_dma.txt
> > +++ b/Documentation/devicetree/bindings/dma/xilinx/xilinx_dma.txt
> > @@ -42,6 +42,7 @@ Optional properties:
> > the hardware.
> > Optional properties for AXI DMA:
> > - xlnx,mcdma: Tells whether configured for multi-channel mode in the
> hardware.
> > +- xlnx,axieth-connected: Tells whether AXI DMA is connected to AXI
> Ethernet.
> > Optional properties for VDMA:
> > - xlnx,flush-fsync: Tells which channel to Flush on Frame sync.
> > It takes following values:
> > --
> > 1.7.1
> >
>
> --
> ~Vinod

2018-04-17 11:45:36

by Radhey Shyam Pandey

[permalink] [raw]
Subject: RE: [RFC 2/6] dmaengine: xilinx_dma: Pass AXI4-Stream control words to netdev dma client

Hi Vinod,

> -----Original Message-----
> From: Vinod Koul [mailto:[email protected]]
> Sent: Wednesday, April 11, 2018 2:39 PM
> To: Radhey Shyam Pandey <[email protected]>
> Cc: [email protected]; [email protected]; Appana Durga
> Kedareswara Rao <[email protected]>; Radhey Shyam Pandey
> <[email protected]>; [email protected]; [email protected];
> [email protected]; [email protected]
> Subject: Re: [RFC 2/6] dmaengine: xilinx_dma: Pass AXI4-Stream control
> words to netdev dma client
>
> On Mon, Apr 02, 2018 at 04:09:02PM +0530, Radhey Shyam Pandey wrote:
>
> > +
> > + if (chan->xdev->has_axieth_connected) {
> > + seg = list_first_entry(&desc->segments,
> > + struct xilinx_axidma_tx_segment,
> node);
> > + if (cb.callback_param) {
> > + app_w = (u32 *) cb.callback_param;
>
> why are you interpreting callback_param? This is plainly wrong.
> we do not know what is the interpretation of callback_param and it is
> internal to submitter.
In design, if AXI DMA is connected to AXI Ethernet IP there are certain
AXI4-Stream Status fields (RX) that we need to pass to ethernet driver
along with data buffer. An example includes: checksum fields, packet
length etc. To pass these control words there is a structure defined
between dmaengine and client. Before calling the client callback
stream control words are copied to dma client callback_param struct
(only if axieth is connected).

I understand it's not an ideal way and we shouldn't be interpreting
callback_param but couldn't find any better alternative of passing
custom information from dmaengine to client driver and still be
aligned to the framework.

>
> What exactly is the problem you are trying to solve?
As mentioned above we need to pass AXI4-stream words(custom
data) from dmaengine driver to dma client driver(ethernet) for
each DMA descriptor. Current solution populates callback_param
struct (only if axieth is connected). Please let me know if there is
an alternate solution.

>
> > + hw = &seg->hw;
> > + *app_w = hw->status &
> XILINX_DMA_MAX_TRANS_LEN;
> > + memcpy(app_w, hw->app, sizeof(u32) *
> > + XILINX_DMA_NUM_APP_WORDS);
> > + }
>
> --
> ~Vinod

2018-04-17 12:31:19

by Radhey Shyam Pandey

[permalink] [raw]
Subject: RE: [RFC 4/6] dmaengine: xilinx_dma: Freeup active list based on descriptor completion bit

Hi Vinod,

> -----Original Message-----
> From: Vinod Koul [mailto:[email protected]]
> Sent: Wednesday, April 11, 2018 2:41 PM
> To: Radhey Shyam Pandey <[email protected]>
> Cc: [email protected]; [email protected]; Appana Durga
> Kedareswara Rao <[email protected]>; Radhey Shyam Pandey
> <[email protected]>; [email protected]; [email protected];
> [email protected]; [email protected]
> Subject: Re: [RFC 4/6] dmaengine: xilinx_dma: Freeup active list based on
> descriptor completion bit
>
> On Mon, Apr 02, 2018 at 04:09:04PM +0530, Radhey Shyam Pandey wrote:
> > AXIDMA IP sets completion bit to 1 when the transfer is completed. Read
> > this bit to move descriptor from active list to the done list. This feature
> > is needed when interrupt delay timeout and IRQThreshold is enabled i.e
> > Dly_IrqEn is triggered w/o completing Interrupt Threshold.
> >
> > Signed-off-by: Radhey Shyam Pandey <[email protected]>
> > ---
> > drivers/dma/xilinx/xilinx_dma.c | 18 ++++++++++++++----
> > 1 files changed, 14 insertions(+), 4 deletions(-)
> >
> > diff --git a/drivers/dma/xilinx/xilinx_dma.c
> b/drivers/dma/xilinx/xilinx_dma.c
> > index 36e1ab9..518465e 100644
> > --- a/drivers/dma/xilinx/xilinx_dma.c
> > +++ b/drivers/dma/xilinx/xilinx_dma.c
> > @@ -103,6 +103,7 @@
> > #define XILINX_DMA_PARK_PTR_RD_REF_SHIFT 0
> > #define XILINX_DMA_PARK_PTR_RD_REF_MASK GENMASK(4,
> 0)
> > #define XILINX_DMA_REG_VDMA_VERSION 0x002c
> > +#define XILINX_DMA_COMP_MASK BIT(31)
> >
> > /* Register Direct Mode Registers */
> > #define XILINX_DMA_REG_VSIZE 0x0000
> > @@ -1387,16 +1388,25 @@ static void xilinx_dma_issue_pending(struct
> dma_chan *dchan)
> > static void xilinx_dma_complete_descriptor(struct xilinx_dma_chan *chan)
> > {
> > struct xilinx_dma_tx_descriptor *desc, *next;
> > + struct xilinx_axidma_tx_segment *seg;
> >
> > /* This function was invoked with lock held */
> > if (list_empty(&chan->active_list))
> > return;
> >
> > list_for_each_entry_safe(desc, next, &chan->active_list, node) {
> > - list_del(&desc->node);
> > - if (!desc->cyclic)
> > - dma_cookie_complete(&desc->async_tx);
> > - list_add_tail(&desc->node, &chan->done_list);
> > +
> > + seg = list_last_entry(&desc->segments,
> > + struct xilinx_axidma_tx_segment, node);
> > + if ((seg->hw.status & XILINX_DMA_COMP_MASK) ||
> > + (!chan->xdev->has_axieth_connected)) {
>
> why the second case ? That is not expalined in log?
In the current implementation, delay timeout is enabled only for
has_axieth_connected usecase. For ethernet, we need real-time processing
while still having benefit of interrupt coalescing. Example: In RX interrupt
coalescing is set to 0x3. Without delay timeout, DMA engine will wait for
all frames and then issue completion interrupt. In ethernet usecase, this
can introduce huge latencies. Delay timeout interrupt will trigger if delay
time period has expired and we can notify dma client with received frames.

The second case is added to keep the previous implementation as is.(i.e when
Delay timeout interrupt is not enabled - move all active desc to done list).
Sure I will add a description for it in the commit log.

>
> > + list_del(&desc->node);
> > + if (!desc->cyclic)
> > + dma_cookie_complete(&desc->async_tx);
> > + list_add_tail(&desc->node, &chan->done_list);
> > + } else {
> > + break;
> > + }
> > }
> > }
> >
> > --
> > 1.7.1
> >
>
> --
> ~Vinod

2018-04-17 12:51:23

by Radhey Shyam Pandey

[permalink] [raw]
Subject: RE: [RFC 5/6] dmaengine: xilinx_dma: Program interrupt delay timeout

Hi Vinod,

> -----Original Message-----
> From: Vinod Koul [mailto:[email protected]]
> Sent: Wednesday, April 11, 2018 2:42 PM
> To: Radhey Shyam Pandey <[email protected]>
> Cc: [email protected]; [email protected]; Appana Durga
> Kedareswara Rao <[email protected]>; Radhey Shyam Pandey
> <[email protected]>; [email protected]; [email protected];
> [email protected]; [email protected]
> Subject: Re: [RFC 5/6] dmaengine: xilinx_dma: Program interrupt delay
> timeout
>
> On Mon, Apr 02, 2018 at 04:09:05PM +0530, Radhey Shyam Pandey wrote:
> > Program IRQDelay for AXI DMA. The interrupt timeout mechanism causes
> > the DMA engine to generate an interrupt after the delay time period
> > has expired. It enables dmaengine to respond in real-time even though
> > interrupt coalescing is configured.
>
> again you are doing this only for axieth_connected, why is that?
The initial application was axieth but yes delay timeout feature may
be needed for other use cases as well. We can make delay timeout
an optional DT property. Is that fine?

>
> >
> > Signed-off-by: Radhey Shyam Pandey <[email protected]>
> > ---
> > drivers/dma/xilinx/xilinx_dma.c | 16 ++++++++++++++--
> > 1 files changed, 14 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/dma/xilinx/xilinx_dma.c
> b/drivers/dma/xilinx/xilinx_dma.c
> > index 518465e..ab8f1b0 100644
> > --- a/drivers/dma/xilinx/xilinx_dma.c
> > +++ b/drivers/dma/xilinx/xilinx_dma.c
> > @@ -161,8 +161,12 @@
> > /* AXI DMA Specific Masks/Bit fields */
> > #define XILINX_DMA_MAX_TRANS_LEN GENMASK(22, 0)
> > #define XILINX_DMA_CR_COALESCE_MAX GENMASK(23, 16)
> > +#define XILINX_DMA_CR_DELAY_MAX GENMASK(31, 24)
> > #define XILINX_DMA_CR_CYCLIC_BD_EN_MASK BIT(4)
> > #define XILINX_DMA_CR_COALESCE_SHIFT 16
> > +#define XILINX_DMA_CR_DELAY_SHIFT 24
> > +#define XILINX_DMA_CR_WAITBOUND_DFT 254
> > +
> > #define XILINX_DMA_BD_SOP BIT(27)
> > #define XILINX_DMA_BD_EOP BIT(26)
> > #define XILINX_DMA_COALESCE_MAX 255
> > @@ -1294,6 +1298,12 @@ static void xilinx_dma_start_transfer(struct
> xilinx_dma_chan *chan)
> > reg &= ~XILINX_DMA_CR_COALESCE_MAX;
> > reg |= chan->desc_pendingcount <<
> > XILINX_DMA_CR_COALESCE_SHIFT;
> > +
> > + if (chan->xdev->has_axieth_connected) {
> > + reg &= ~XILINX_DMA_CR_DELAY_MAX;
> > + reg |= XILINX_DMA_CR_WAITBOUND_DFT <<
> > + XILINX_DMA_CR_DELAY_SHIFT;
> > + }
> > dma_ctrl_write(chan, XILINX_DMA_REG_DMACR, reg);
> > }
> >
> > @@ -1508,7 +1518,8 @@ static irqreturn_t xilinx_dma_irq_handler(int irq,
> void *data)
> > }
> > }
> >
> > - if (status & XILINX_DMA_DMASR_DLY_CNT_IRQ) {
> > + if (!chan->xdev->has_axieth_connected && (status &
> > + XILINX_DMA_DMASR_DLY_CNT_IRQ)) {
> > /*
> > * Device takes too long to do the transfer when user
> requires
> > * responsiveness.
> > @@ -1516,7 +1527,8 @@ static irqreturn_t xilinx_dma_irq_handler(int irq,
> void *data)
> > dev_dbg(chan->dev, "Inter-packet latency too long\n");
> > }
> >
> > - if (status & XILINX_DMA_DMASR_FRM_CNT_IRQ) {
> > + if (status & (XILINX_DMA_DMASR_FRM_CNT_IRQ |
> > + XILINX_DMA_DMASR_DLY_CNT_IRQ)) {
> > spin_lock(&chan->lock);
> > xilinx_dma_complete_descriptor(chan);
> > chan->idle = true;
> > --
> > 1.7.1
> >
>
> --
> ~Vinod

2018-04-17 12:56:58

by Lars-Peter Clausen

[permalink] [raw]
Subject: Re: [RFC 2/6] dmaengine: xilinx_dma: Pass AXI4-Stream control words to netdev dma client

On 04/17/2018 01:43 PM, Radhey Shyam Pandey wrote:
> Hi Vinod,
>
>> -----Original Message-----
>> From: Vinod Koul [mailto:[email protected]]
>> Sent: Wednesday, April 11, 2018 2:39 PM
>> To: Radhey Shyam Pandey <[email protected]>
>> Cc: [email protected]; [email protected]; Appana Durga
>> Kedareswara Rao <[email protected]>; Radhey Shyam Pandey
>> <[email protected]>; [email protected]; [email protected];
>> [email protected]; [email protected]
>> Subject: Re: [RFC 2/6] dmaengine: xilinx_dma: Pass AXI4-Stream control
>> words to netdev dma client
>>
>> On Mon, Apr 02, 2018 at 04:09:02PM +0530, Radhey Shyam Pandey wrote:
>>
>>> +
>>> + if (chan->xdev->has_axieth_connected) {
>>> + seg = list_first_entry(&desc->segments,
>>> + struct xilinx_axidma_tx_segment,
>> node);
>>> + if (cb.callback_param) {
>>> + app_w = (u32 *) cb.callback_param;
>>
>> why are you interpreting callback_param? This is plainly wrong.
>> we do not know what is the interpretation of callback_param and it is
>> internal to submitter.
> In design, if AXI DMA is connected to AXI Ethernet IP there are certain
> AXI4-Stream Status fields (RX) that we need to pass to ethernet driver
> along with data buffer. An example includes: checksum fields, packet
> length etc. To pass these control words there is a structure defined
> between dmaengine and client. Before calling the client callback
> stream control words are copied to dma client callback_param struct
> (only if axieth is connected).
>
> I understand it's not an ideal way and we shouldn't be interpreting
> callback_param but couldn't find any better alternative of passing
> custom information from dmaengine to client driver and still be
> aligned to the framework.
>
>>
>> What exactly is the problem you are trying to solve?
> As mentioned above we need to pass AXI4-stream words(custom
> data) from dmaengine driver to dma client driver(ethernet) for
> each DMA descriptor. Current solution populates callback_param
> struct (only if axieth is connected). Please let me know if there is
> an alternate solution.

The standard interfaces need to work in a way that a client using them does
not have to know to which DMA controller it is connected. In a similar way
the DMA controller shouldn't have any client specific logic for the generic
interfaces. Otherwise there is no point of having a generic interface.

There are two options.

Either you extend the generic interfaces so it can cover your usecase in a
generic way. E.g. the ability to attach meta data to transfer.

Or you can implement a interface that is specific to your DMA controller and
any client using this interface knows it is talking to your DMA controller.



2018-04-17 13:49:00

by Peter Ujfalusi

[permalink] [raw]
Subject: Re: [RFC 2/6] dmaengine: xilinx_dma: Pass AXI4-Stream control words to netdev dma client

On 2018-04-17 15:54, Lars-Peter Clausen wrote:
> On 04/17/2018 01:43 PM, Radhey Shyam Pandey wrote:
>> Hi Vinod,
>>
>>> -----Original Message-----
>>> From: Vinod Koul [mailto:[email protected]]
>>> Sent: Wednesday, April 11, 2018 2:39 PM
>>> To: Radhey Shyam Pandey <[email protected]>
>>> Cc: [email protected]; [email protected]; Appana Durga
>>> Kedareswara Rao <[email protected]>; Radhey Shyam Pandey
>>> <[email protected]>; [email protected]; [email protected];
>>> [email protected]; [email protected]
>>> Subject: Re: [RFC 2/6] dmaengine: xilinx_dma: Pass AXI4-Stream control
>>> words to netdev dma client
>>>
>>> On Mon, Apr 02, 2018 at 04:09:02PM +0530, Radhey Shyam Pandey wrote:
>>>
>>>> +
>>>> + if (chan->xdev->has_axieth_connected) {
>>>> + seg = list_first_entry(&desc->segments,
>>>> + struct xilinx_axidma_tx_segment,
>>> node);
>>>> + if (cb.callback_param) {
>>>> + app_w = (u32 *) cb.callback_param;
>>>
>>> why are you interpreting callback_param? This is plainly wrong.
>>> we do not know what is the interpretation of callback_param and it is
>>> internal to submitter.
>> In design, if AXI DMA is connected to AXI Ethernet IP there are certain
>> AXI4-Stream Status fields (RX) that we need to pass to ethernet driver
>> along with data buffer. An example includes: checksum fields, packet
>> length etc. To pass these control words there is a structure defined
>> between dmaengine and client. Before calling the client callback
>> stream control words are copied to dma client callback_param struct
>> (only if axieth is connected).
>>
>> I understand it's not an ideal way and we shouldn't be interpreting
>> callback_param but couldn't find any better alternative of passing
>> custom information from dmaengine to client driver and still be
>> aligned to the framework.
>>
>>>
>>> What exactly is the problem you are trying to solve?
>> As mentioned above we need to pass AXI4-stream words(custom
>> data) from dmaengine driver to dma client driver(ethernet) for
>> each DMA descriptor. Current solution populates callback_param
>> struct (only if axieth is connected). Please let me know if there is
>> an alternate solution.
>
> The standard interfaces need to work in a way that a client using them does
> not have to know to which DMA controller it is connected. In a similar way
> the DMA controller shouldn't have any client specific logic for the generic
> interfaces. Otherwise there is no point of having a generic interface.
>
> There are two options.
>
> Either you extend the generic interfaces so it can cover your usecase in a
> generic way. E.g. the ability to attach meta data to transfer.

Fwiw I have this patch as part of a bigger work to achieve similar results:

commit f7cf442a37618bbd996f2640ececc4a990ea655e
Author: Peter Ujfalusi <[email protected]>
Date: Wed Nov 22 10:21:59 2017 +0200

dmaengine: Add callback to set per descriptor metadata

Some DMA engines can send and receive side band information, commands or
parameters which is not transferred within the data stream itself. In such
case clients can set the metadata to the given descriptor and it is going
to be sent to the peripheral, or in case of DEV_TO_MEM the provided buffer
will receive the metadata.

Signed-off-by: Peter Ujfalusi <[email protected]>

diff --git a/include/linux/dmaengine.h b/include/linux/dmaengine.h
index cc887a6f74ba..fb8e9c92fb01 100644
--- a/include/linux/dmaengine.h
+++ b/include/linux/dmaengine.h
@@ -709,6 +709,11 @@ struct dma_filter {
* be called after period_len bytes have been transferred.
* @device_prep_interleaved_dma: Transfer expression in a generic way.
* @device_prep_dma_imm_data: DMA's 8 byte immediate data to the dst address
+ * @device_attach_metadata: Some DMA engines can send and receive side band
+ * information, commands or parameters which is not transferred within the
+ * data stream itself. In such case clients can set the metadata to the
+ * given descriptor and it is going to be sent to the peripheral, or in
+ * case of DEV_TO_MEM the provided buffer will receive the metadata.
* @device_config: Pushes a new configuration to a channel, return 0 or an error
* code
* @device_pause: Pauses any transfer happening on a channel. Returns
@@ -796,6 +801,9 @@ struct dma_device {
struct dma_chan *chan, dma_addr_t dst, u64 data,
unsigned long flags);

+ int (*device_attach_metadata)(struct dma_async_tx_descriptor *desc,
+ void *data, size_t len);
+
int (*device_config)(struct dma_chan *chan,
struct dma_slave_config *config);
int (*device_pause)(struct dma_chan *chan);
@@ -909,6 +917,21 @@ static inline struct dma_async_tx_descriptor *dmaengine_prep_dma_memcpy(
len, flags);
}

+static inline int dmaengine_attach_metadata(
+ struct dma_async_tx_descriptor *desc, void *data, size_t len)
+{
+ struct dma_chan *chan;
+
+ if (!desc)
+ return 0;
+
+ chan = desc->chan;
+ if (!chan || !chan->device || !chan->device->device_attach_metadata)
+ return 0;
+
+ return chan->device->device_attach_metadata(desc, data, len);
+}
+
/**
* dmaengine_terminate_all() - Terminate all active DMA transfers
* @chan: The channel for which to terminate the transfers

The drawback is that the DMA drivers need to do memcpy.
I'm considering to change this to get metadata_ptr() so clients can
directly write to the buffer w/o memcpy.

> Or you can implement a interface that is specific to your DMA controller and
> any client using this interface knows it is talking to your DMA controller.

Hrm, so we can have DMA driver specific calls? The reason why TI's keystone 2
navigator DMA support was rejected that it was introducing NAV specific calls
for clients to configure features not yet supported by the framework.

>
>
>
> _______________________________________________
> linux-arm-kernel mailing list
> [email protected]
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
>

- Péter

Texas Instruments Finland Oy, Porkkalankatu 22, 00180 Helsinki. Y-tunnus/Business ID: 0615521-4. Kotipaikka/Domicile: Helsinki

2018-04-17 14:00:16

by Lars-Peter Clausen

[permalink] [raw]
Subject: Re: [RFC 2/6] dmaengine: xilinx_dma: Pass AXI4-Stream control words to netdev dma client

On 04/17/2018 03:46 PM, Peter Ujfalusi wrote:
> On 2018-04-17 15:54, Lars-Peter Clausen wrote:
>> On 04/17/2018 01:43 PM, Radhey Shyam Pandey wrote:
>>> Hi Vinod,
>>>
>>>> -----Original Message-----
>>>> From: Vinod Koul [mailto:[email protected]]
>>>> Sent: Wednesday, April 11, 2018 2:39 PM
>>>> To: Radhey Shyam Pandey <[email protected]>
>>>> Cc: [email protected]; [email protected]; Appana Durga
>>>> Kedareswara Rao <[email protected]>; Radhey Shyam Pandey
>>>> <[email protected]>; [email protected]; [email protected];
>>>> [email protected]; [email protected]
>>>> Subject: Re: [RFC 2/6] dmaengine: xilinx_dma: Pass AXI4-Stream control
>>>> words to netdev dma client
>>>>
>>>> On Mon, Apr 02, 2018 at 04:09:02PM +0530, Radhey Shyam Pandey wrote:
>>>>
>>>>> +
>>>>> + if (chan->xdev->has_axieth_connected) {
>>>>> + seg = list_first_entry(&desc->segments,
>>>>> + struct xilinx_axidma_tx_segment,
>>>> node);
>>>>> + if (cb.callback_param) {
>>>>> + app_w = (u32 *) cb.callback_param;
>>>>
>>>> why are you interpreting callback_param? This is plainly wrong.
>>>> we do not know what is the interpretation of callback_param and it is
>>>> internal to submitter.
>>> In design, if AXI DMA is connected to AXI Ethernet IP there are certain
>>> AXI4-Stream Status fields (RX) that we need to pass to ethernet driver
>>> along with data buffer. An example includes: checksum fields, packet
>>> length etc. To pass these control words there is a structure defined
>>> between dmaengine and client. Before calling the client callback
>>> stream control words are copied to dma client callback_param struct
>>> (only if axieth is connected).
>>>
>>> I understand it's not an ideal way and we shouldn't be interpreting
>>> callback_param but couldn't find any better alternative of passing
>>> custom information from dmaengine to client driver and still be
>>> aligned to the framework.
>>>
>>>>
>>>> What exactly is the problem you are trying to solve?
>>> As mentioned above we need to pass AXI4-stream words(custom
>>> data) from dmaengine driver to dma client driver(ethernet) for
>>> each DMA descriptor. Current solution populates callback_param
>>> struct (only if axieth is connected). Please let me know if there is
>>> an alternate solution.
>>
>> The standard interfaces need to work in a way that a client using them does
>> not have to know to which DMA controller it is connected. In a similar way
>> the DMA controller shouldn't have any client specific logic for the generic
>> interfaces. Otherwise there is no point of having a generic interface.
>>
>> There are two options.
>>
>> Either you extend the generic interfaces so it can cover your usecase in a
>> generic way. E.g. the ability to attach meta data to transfer.
>
> Fwiw I have this patch as part of a bigger work to achieve similar results:

That's good stuff. Is this in a public tree somewhere?

>> Or you can implement a interface that is specific to your DMA controller and
>> any client using this interface knows it is talking to your DMA controller.
>
> Hrm, so we can have DMA driver specific calls? The reason why TI's keystone 2
> navigator DMA support was rejected that it was introducing NAV specific calls
> for clients to configure features not yet supported by the framework.

In my opinion it is OK, somebody else might have different ideas. I mean it
is not nice, but it is better than the alternative of overloading the
generic API with driver specific semantics or introducing some kind of IOCTL
catch all callback.

If there is tight coupling between the DMA core and client and there is no
intention of using a generic client the best solution might even be to no
use DMAengine at all.

2018-04-17 14:55:38

by Peter Ujfalusi

[permalink] [raw]
Subject: Re: [RFC 2/6] dmaengine: xilinx_dma: Pass AXI4-Stream control words to netdev dma client

On 2018-04-17 16:58, Lars-Peter Clausen wrote:
>>> There are two options.
>>>
>>> Either you extend the generic interfaces so it can cover your usecase in a
>>> generic way. E.g. the ability to attach meta data to transfer.
>>
>> Fwiw I have this patch as part of a bigger work to achieve similar results:
>
> That's good stuff. Is this in a public tree somewhere?

Not atm. I can not send the user of the new API and I did not wanted to
send something like this out of the blue w/o context.

But as it is a generic patch, I can send it as well. The only thing is
that the need for the memcpy, so I might end up with
ptr = get_metadata_ptr(desc, &size); /* size: in RX the valid size */

and set_metadata_size(); /* in TX to tell how the client placed */

Or something like that, the attach_metadata() as it is works just fine,
but high throughput might not like the memcpy.

>>> Or you can implement a interface that is specific to your DMA controller and
>>> any client using this interface knows it is talking to your DMA controller.
>>
>> Hrm, so we can have DMA driver specific calls? The reason why TI's keystone 2
>> navigator DMA support was rejected that it was introducing NAV specific calls
>> for clients to configure features not yet supported by the framework.
>
> In my opinion it is OK, somebody else might have different ideas. I mean it
> is not nice, but it is better than the alternative of overloading the
> generic API with driver specific semantics or introducing some kind of IOCTL
> catch all callback.

True, but the generic API can be extended as well to cover new grounds,
features. Like this metadata thing.

> If there is tight coupling between the DMA core and client and there is no
> intention of using a generic client the best solution might even be to no
> use DMAengine at all.

This is how the knav stuff ended up. Well it is only used by networking
atm, so it is 'fine' to have custom API, but it is not portable.

>

- Péter

Texas Instruments Finland Oy, Porkkalankatu 22, 00180 Helsinki.
Y-tunnus/Business ID: 0615521-4. Kotipaikka/Domicile: Helsinki

2018-04-17 15:39:32

by Vinod Koul

[permalink] [raw]
Subject: Re: [RFC 2/6] dmaengine: xilinx_dma: Pass AXI4-Stream control words to netdev dma client

On Tue, Apr 17, 2018 at 04:46:43PM +0300, Peter Ujfalusi wrote:

> @@ -709,6 +709,11 @@ struct dma_filter {
> * be called after period_len bytes have been transferred.
> * @device_prep_interleaved_dma: Transfer expression in a generic way.
> * @device_prep_dma_imm_data: DMA's 8 byte immediate data to the dst address
> + * @device_attach_metadata: Some DMA engines can send and receive side band
> + * information, commands or parameters which is not transferred within the
> + * data stream itself. In such case clients can set the metadata to the
> + * given descriptor and it is going to be sent to the peripheral, or in
> + * case of DEV_TO_MEM the provided buffer will receive the metadata.
> * @device_config: Pushes a new configuration to a channel, return 0 or an error
> * code
> * @device_pause: Pauses any transfer happening on a channel. Returns
> @@ -796,6 +801,9 @@ struct dma_device {
> struct dma_chan *chan, dma_addr_t dst, u64 data,
> unsigned long flags);
>
> + int (*device_attach_metadata)(struct dma_async_tx_descriptor *desc,
> + void *data, size_t len);

while i am okay with the concept, I would not want to go again the custom
pointer route, this is a no-go for me.

Instead lets add the vendor data, define that explicitly. We can use struct,
tokens or something else to define these. But lets try to stay away from
opaque objects please :-)

--
~Vinod

2018-04-17 15:47:16

by Lars-Peter Clausen

[permalink] [raw]
Subject: Re: [RFC 2/6] dmaengine: xilinx_dma: Pass AXI4-Stream control words to netdev dma client

On 04/17/2018 05:42 PM, Vinod Koul wrote:
> On Tue, Apr 17, 2018 at 04:46:43PM +0300, Peter Ujfalusi wrote:
>
>> @@ -709,6 +709,11 @@ struct dma_filter {
>> * be called after period_len bytes have been transferred.
>> * @device_prep_interleaved_dma: Transfer expression in a generic way.
>> * @device_prep_dma_imm_data: DMA's 8 byte immediate data to the dst address
>> + * @device_attach_metadata: Some DMA engines can send and receive side band
>> + * information, commands or parameters which is not transferred within the
>> + * data stream itself. In such case clients can set the metadata to the
>> + * given descriptor and it is going to be sent to the peripheral, or in
>> + * case of DEV_TO_MEM the provided buffer will receive the metadata.
>> * @device_config: Pushes a new configuration to a channel, return 0 or an error
>> * code
>> * @device_pause: Pauses any transfer happening on a channel. Returns
>> @@ -796,6 +801,9 @@ struct dma_device {
>> struct dma_chan *chan, dma_addr_t dst, u64 data,
>> unsigned long flags);
>>
>> + int (*device_attach_metadata)(struct dma_async_tx_descriptor *desc,
>> + void *data, size_t len);
>
> while i am okay with the concept, I would not want to go again the custom
> pointer route, this is a no-go for me.
>
> Instead lets add the vendor data, define that explicitly. We can use struct,
> tokens or something else to define these. But lets try to stay away from
> opaque objects please :-)

Well, for all the DMA core cares about the meta-data stream would be as
opaque as the data stream itself. The data is in the end client specific. It
is data that sits somewhere in memory that should be send along with the
actual data to the client.


2018-04-17 15:58:06

by Lars-Peter Clausen

[permalink] [raw]
Subject: Re: [RFC 2/6] dmaengine: xilinx_dma: Pass AXI4-Stream control words to netdev dma client

On 04/17/2018 04:53 PM, Peter Ujfalusi wrote:
> On 2018-04-17 16:58, Lars-Peter Clausen wrote:
>>>> There are two options.
>>>>
>>>> Either you extend the generic interfaces so it can cover your usecase in a
>>>> generic way. E.g. the ability to attach meta data to transfer.
>>>
>>> Fwiw I have this patch as part of a bigger work to achieve similar results:
>>
>> That's good stuff. Is this in a public tree somewhere?
>
> Not atm. I can not send the user of the new API and I did not wanted to
> send something like this out of the blue w/o context.
>
> But as it is a generic patch, I can send it as well. The only thing is
> that the need for the memcpy, so I might end up with
> ptr = get_metadata_ptr(desc, &size); /* size: in RX the valid size */
>
> and set_metadata_size(); /* in TX to tell how the client placed */
>
> Or something like that, the attach_metadata() as it is works just fine,
> but high throughput might not like the memcpy.
>

In the most abstracted way I'd say metadata and data are two different data
streams that are correlated and send/received at the same time.

Think multi-planar transfer, like for audio when the right and left channel
are in separate buffers and not interleaved. Or video with different
color/luminance components in separate buffers. This is something that is at
the moment not covered by the dmaengine API either.

>>>> Or you can implement a interface that is specific to your DMA controller and
>>>> any client using this interface knows it is talking to your DMA controller.
>>>
>>> Hrm, so we can have DMA driver specific calls? The reason why TI's keystone 2
>>> navigator DMA support was rejected that it was introducing NAV specific calls
>>> for clients to configure features not yet supported by the framework.
>>
>> In my opinion it is OK, somebody else might have different ideas. I mean it
>> is not nice, but it is better than the alternative of overloading the
>> generic API with driver specific semantics or introducing some kind of IOCTL
>> catch all callback.
>
> True, but the generic API can be extended as well to cover new grounds,
> features. Like this metadata thing.
>
>> If there is tight coupling between the DMA core and client and there is no
>> intention of using a generic client the best solution might even be to no
>> use DMAengine at all.
>
> This is how the knav stuff ended up. Well it is only used by networking
> atm, so it is 'fine' to have custom API, but it is not portable.

I totally agree generic APIs are better, but not everybody has the resources
to rewrite the whole framework just because they want to do this tiny thing
that isn't covered by the framework yet. In that case it is better to go
with a custom API (that might evolve into a generic API), rather than
overloading the generic API and putting a strain on everybody who works on
the generic API.

2018-04-18 06:33:43

by Peter Ujfalusi

[permalink] [raw]
Subject: Re: [RFC 2/6] dmaengine: xilinx_dma: Pass AXI4-Stream control words to netdev dma client


On 2018-04-17 18:54, Lars-Peter Clausen wrote:
> On 04/17/2018 04:53 PM, Peter Ujfalusi wrote:
>> On 2018-04-17 16:58, Lars-Peter Clausen wrote:
>>>>> There are two options.
>>>>>
>>>>> Either you extend the generic interfaces so it can cover your usecase in a
>>>>> generic way. E.g. the ability to attach meta data to transfer.
>>>>
>>>> Fwiw I have this patch as part of a bigger work to achieve similar results:
>>>
>>> That's good stuff. Is this in a public tree somewhere?
>>
>> Not atm. I can not send the user of the new API and I did not wanted to
>> send something like this out of the blue w/o context.
>>
>> But as it is a generic patch, I can send it as well. The only thing is
>> that the need for the memcpy, so I might end up with
>> ptr = get_metadata_ptr(desc, &size); /* size: in RX the valid size */
>>
>> and set_metadata_size(); /* in TX to tell how the client placed */
>>
>> Or something like that, the attach_metadata() as it is works just fine,
>> but high throughput might not like the memcpy.
>>
>
> In the most abstracted way I'd say metadata and data are two different data
> streams that are correlated and send/received at the same time.

In my case the meatdata is sideband information or parameters for/from
the remote end. Like timestamp, algorithm parameters, keys, etc.

It is tight to the data payload, but it is not part of it.

But the API should be generic enough to cover other use cases where
clients need to provide additional information.
For me, the metadata is part of the descriptor we give and receive back
from the DMA, others might have sideband channel to send that.

For metadata handling we could have:

struct dma_desc_metadata_ops {
/* To give a buffer for the DMA with the metadata, as it was in my
* original patch
*/
int (*desc_attach_metadata)(struct dma_async_tx_descriptor *desc,
void *data, size_t len);

void *(*desc_get_metadata_ptr)(struct dma_async_tx_descriptor *desc,
size_t *payload_len, size_t *max_len);
int (*desc_set_payload_len)(struct dma_async_tx_descriptor *desc,
size_t payload_len);
};

Probably a simple flag variable to indicate which of the two modes are
supported:
1. Client provided metadata buffer handling
Clients provide the buffer via desc_attach_metadata(), the DMA driver
will do whatever it needs to do, copy it in place, send it differently,
use parameters.
In RX the received metadata is going to be placed to the provided buffer.
2. Ability to give the metadata pointer to user to work on it.
In TX, clients can use desc_get_metadata_ptr() to get the pointer,
current payload size and maximum size of the metadata and can work
directly on the buffer to place the data. Then desc_set_payload_len() to
let the DMA know how much data is actually placed there.
In RX, desc_get_metadata_ptr() will give the user the pointer and the
payload size so it can process that information correctly.

DMA driver can implement either or both, but clients must only use
either 1 or 2 to work with the metadata.


> Think multi-planar transfer, like for audio when the right and left channel
> are in separate buffers and not interleaved. Or video with different
> color/luminance components in separate buffers. This is something that is at
> the moment not covered by the dmaengine API either.

Hrm, true, but it is hardly the metadata use case. It is more like
different DMA transfer type.

>>>>> Or you can implement a interface that is specific to your DMA controller and
>>>>> any client using this interface knows it is talking to your DMA controller.
>>>>
>>>> Hrm, so we can have DMA driver specific calls? The reason why TI's keystone 2
>>>> navigator DMA support was rejected that it was introducing NAV specific calls
>>>> for clients to configure features not yet supported by the framework.
>>>
>>> In my opinion it is OK, somebody else might have different ideas. I mean it
>>> is not nice, but it is better than the alternative of overloading the
>>> generic API with driver specific semantics or introducing some kind of IOCTL
>>> catch all callback.
>>
>> True, but the generic API can be extended as well to cover new grounds,
>> features. Like this metadata thing.
>>
>>> If there is tight coupling between the DMA core and client and there is no
>>> intention of using a generic client the best solution might even be to no
>>> use DMAengine at all.
>>
>> This is how the knav stuff ended up. Well it is only used by networking
>> atm, so it is 'fine' to have custom API, but it is not portable.
>
> I totally agree generic APIs are better, but not everybody has the resources
> to rewrite the whole framework just because they want to do this tiny thing
> that isn't covered by the framework yet. In that case it is better to go
> with a custom API (that might evolve into a generic API), rather than
> overloading the generic API and putting a strain on everybody who works on
> the generic API.

At some point a threshold is reached when the burden of maintaining a
custom API is more costly than investing on the extension of the framework.
What happens when an existing driver (using DMAengine API) need to be
supported on platform where only custom DMA code is available or if you
want to migrate to a new DMA from the custom API the drier is wired for?
It is just plain pain.
At the end what we want to do with DMA: move data from opne place to
another (in oversimplified view).

The framework can be extended to cover new features w/o much impact on
generality or drivers do not need the feature, like the metadata. The
impact is minimal for drivers who don't care.

If there is interest I can send the core patches for the metadata in
couple of days (need to update the documentation as well) and test the
new give me the metadata pointer support.
Unfortunately I can not send atm the DMA driver itself which is using this.
If this helps others to move ahead, I'm fine spending extra time to get
it in a good shape for upstream.

- Péter

Texas Instruments Finland Oy, Porkkalankatu 22, 00180 Helsinki.
Y-tunnus/Business ID: 0615521-4. Kotipaikka/Domicile: Helsinki

2018-04-18 06:40:50

by Peter Ujfalusi

[permalink] [raw]
Subject: Re: [RFC 2/6] dmaengine: xilinx_dma: Pass AXI4-Stream control words to netdev dma client



On 2018-04-17 18:42, Vinod Koul wrote:
> On Tue, Apr 17, 2018 at 04:46:43PM +0300, Peter Ujfalusi wrote:
>
>> @@ -709,6 +709,11 @@ struct dma_filter {
>> * be called after period_len bytes have been transferred.
>> * @device_prep_interleaved_dma: Transfer expression in a generic way.
>> * @device_prep_dma_imm_data: DMA's 8 byte immediate data to the dst address
>> + * @device_attach_metadata: Some DMA engines can send and receive side band
>> + * information, commands or parameters which is not transferred within the
>> + * data stream itself. In such case clients can set the metadata to the
>> + * given descriptor and it is going to be sent to the peripheral, or in
>> + * case of DEV_TO_MEM the provided buffer will receive the metadata.
>> * @device_config: Pushes a new configuration to a channel, return 0 or an error
>> * code
>> * @device_pause: Pauses any transfer happening on a channel. Returns
>> @@ -796,6 +801,9 @@ struct dma_device {
>> struct dma_chan *chan, dma_addr_t dst, u64 data,
>> unsigned long flags);
>>
>> + int (*device_attach_metadata)(struct dma_async_tx_descriptor *desc,
>> + void *data, size_t len);
>
> while i am okay with the concept, I would not want to go again the custom
> pointer route, this is a no-go for me.
>
> Instead lets add the vendor data, define that explicitly. We can use struct,
> tokens or something else to define these. But lets try to stay away from
> opaque objects please :-)

The DMA does not interpret the metadata, it is information which can be
only understood by the client driver and the remote peripheral. It is
just chunk of data (parameters, timestamps, keys, etc) that needs to
travel along with the payload.

The content is not relevant for the DMA itself.

- Péter

Texas Instruments Finland Oy, Porkkalankatu 22, 00180 Helsinki.
Y-tunnus/Business ID: 0615521-4. Kotipaikka/Domicile: Helsinki

2018-04-18 07:05:09

by Peter Ujfalusi

[permalink] [raw]
Subject: Re: [RFC 2/6] dmaengine: xilinx_dma: Pass AXI4-Stream control words to netdev dma client



On 2018-04-18 09:39, Peter Ujfalusi wrote:
>
>
> On 2018-04-17 18:42, Vinod Koul wrote:
>> On Tue, Apr 17, 2018 at 04:46:43PM +0300, Peter Ujfalusi wrote:
>>
>>> @@ -709,6 +709,11 @@ struct dma_filter {
>>> * be called after period_len bytes have been transferred.
>>> * @device_prep_interleaved_dma: Transfer expression in a generic way.
>>> * @device_prep_dma_imm_data: DMA's 8 byte immediate data to the dst address
>>> + * @device_attach_metadata: Some DMA engines can send and receive side band
>>> + * information, commands or parameters which is not transferred within the
>>> + * data stream itself. In such case clients can set the metadata to the
>>> + * given descriptor and it is going to be sent to the peripheral, or in
>>> + * case of DEV_TO_MEM the provided buffer will receive the metadata.
>>> * @device_config: Pushes a new configuration to a channel, return 0 or an error
>>> * code
>>> * @device_pause: Pauses any transfer happening on a channel. Returns
>>> @@ -796,6 +801,9 @@ struct dma_device {
>>> struct dma_chan *chan, dma_addr_t dst, u64 data,
>>> unsigned long flags);
>>>
>>> + int (*device_attach_metadata)(struct dma_async_tx_descriptor *desc,
>>> + void *data, size_t len);
>>
>> while i am okay with the concept, I would not want to go again the custom
>> pointer route, this is a no-go for me.
>>
>> Instead lets add the vendor data, define that explicitly. We can use struct,
>> tokens or something else to define these. But lets try to stay away from
>> opaque objects please :-)
>
> The DMA does not interpret the metadata, it is information which can be
> only understood by the client driver and the remote peripheral. It is
> just chunk of data (parameters, timestamps, keys, etc) that needs to
> travel along with the payload.
>
> The content is not relevant for the DMA itself.

To add: different peripherals needs to send receive different metadata
and even the same peripheral might pass different information based on
their operating mode. The size of metadata can be different as well.

So it is not really vendor specific metadata, but peripheral, operating
mode and other factors affected chunk of data.

>
> - Péter
>
> Texas Instruments Finland Oy, Porkkalankatu 22, 00180 Helsinki.
> Y-tunnus/Business ID: 0615521-4. Kotipaikka/Domicile: Helsinki
>

- Péter

Texas Instruments Finland Oy, Porkkalankatu 22, 00180 Helsinki.
Y-tunnus/Business ID: 0615521-4. Kotipaikka/Domicile: Helsinki

2018-04-18 13:09:14

by Lars-Peter Clausen

[permalink] [raw]
Subject: Re: [RFC 2/6] dmaengine: xilinx_dma: Pass AXI4-Stream control words to netdev dma client

On 04/18/2018 08:31 AM, Peter Ujfalusi wrote:
>
> On 2018-04-17 18:54, Lars-Peter Clausen wrote:
>> On 04/17/2018 04:53 PM, Peter Ujfalusi wrote:
>>> On 2018-04-17 16:58, Lars-Peter Clausen wrote:
>>>>>> There are two options.
>>>>>>
>>>>>> Either you extend the generic interfaces so it can cover your usecase in a
>>>>>> generic way. E.g. the ability to attach meta data to transfer.
>>>>>
>>>>> Fwiw I have this patch as part of a bigger work to achieve similar results:
>>>>
>>>> That's good stuff. Is this in a public tree somewhere?
>>>
>>> Not atm. I can not send the user of the new API and I did not wanted to
>>> send something like this out of the blue w/o context.
>>>
>>> But as it is a generic patch, I can send it as well. The only thing is
>>> that the need for the memcpy, so I might end up with
>>> ptr = get_metadata_ptr(desc, &size); /* size: in RX the valid size */
>>>
>>> and set_metadata_size(); /* in TX to tell how the client placed */
>>>
>>> Or something like that, the attach_metadata() as it is works just fine,
>>> but high throughput might not like the memcpy.
>>>
>>
>> In the most abstracted way I'd say metadata and data are two different data
>> streams that are correlated and send/received at the same time.
>
> In my case the meatdata is sideband information or parameters for/from
> the remote end. Like timestamp, algorithm parameters, keys, etc.
>
> It is tight to the data payload, but it is not part of it.
>
> But the API should be generic enough to cover other use cases where
> clients need to provide additional information.
> For me, the metadata is part of the descriptor we give and receive back
> from the DMA, others might have sideband channel to send that.
>
> For metadata handling we could have:
>
> struct dma_desc_metadata_ops {
> /* To give a buffer for the DMA with the metadata, as it was in my
> * original patch
> */
> int (*desc_attach_metadata)(struct dma_async_tx_descriptor *desc,
> void *data, size_t len);
>
> void *(*desc_get_metadata_ptr)(struct dma_async_tx_descriptor *desc,
> size_t *payload_len, size_t *max_len);
> int (*desc_set_payload_len)(struct dma_async_tx_descriptor *desc,
> size_t payload_len);
> };
>
> Probably a simple flag variable to indicate which of the two modes are
> supported:
> 1. Client provided metadata buffer handling
> Clients provide the buffer via desc_attach_metadata(), the DMA driver
> will do whatever it needs to do, copy it in place, send it differently,
> use parameters.
> In RX the received metadata is going to be placed to the provided buffer.
> 2. Ability to give the metadata pointer to user to work on it.
> In TX, clients can use desc_get_metadata_ptr() to get the pointer,
> current payload size and maximum size of the metadata and can work
> directly on the buffer to place the data. Then desc_set_payload_len() to
> let the DMA know how much data is actually placed there.
> In RX, desc_get_metadata_ptr() will give the user the pointer and the
> payload size so it can process that information correctly.
>
> DMA driver can implement either or both, but clients must only use
> either 1 or 2 to work with the metadata.
>
>
>> Think multi-planar transfer, like for audio when the right and left channel
>> are in separate buffers and not interleaved. Or video with different
>> color/luminance components in separate buffers. This is something that is at
>> the moment not covered by the dmaengine API either.
>
> Hrm, true, but it is hardly the metadata use case. It is more like
> different DMA transfer type.

When I look at this with my astronaut architect view from high high up above
I do not see a difference between metadata and multi-planar data.

Both split the data that is sent to the peripheral into multiple
sub-streams, each carrying part of the data. I'm sure there are peripherals
that interleave data and metadata on the same data stream. Similar to how we
have left and right channel interleaved in a audio stream.

What about metadata that is not contiguous and split into multiple segments.
How do you handle passing a sgl to the metadata interface? And then it
suddenly looks quite similar to the normal DMA descriptor interface.

But maybe that's just one abstraction level to high.

>>>>>> Or you can implement a interface that is specific to your DMA controller and
>>>>>> any client using this interface knows it is talking to your DMA controller.
>>>>>
>>>>> Hrm, so we can have DMA driver specific calls? The reason why TI's keystone 2
>>>>> navigator DMA support was rejected that it was introducing NAV specific calls
>>>>> for clients to configure features not yet supported by the framework.
>>>>
>>>> In my opinion it is OK, somebody else might have different ideas. I mean it
>>>> is not nice, but it is better than the alternative of overloading the
>>>> generic API with driver specific semantics or introducing some kind of IOCTL
>>>> catch all callback.
>>>
>>> True, but the generic API can be extended as well to cover new grounds,
>>> features. Like this metadata thing.
>>>
>>>> If there is tight coupling between the DMA core and client and there is no
>>>> intention of using a generic client the best solution might even be to no
>>>> use DMAengine at all.
>>>
>>> This is how the knav stuff ended up. Well it is only used by networking
>>> atm, so it is 'fine' to have custom API, but it is not portable.
>>
>> I totally agree generic APIs are better, but not everybody has the resources
>> to rewrite the whole framework just because they want to do this tiny thing
>> that isn't covered by the framework yet. In that case it is better to go
>> with a custom API (that might evolve into a generic API), rather than
>> overloading the generic API and putting a strain on everybody who works on
>> the generic API.
>
> At some point a threshold is reached when the burden of maintaining a
> custom API is more costly than investing on the extension of the framework.
> What happens when an existing driver (using DMAengine API) need to be
> supported on platform where only custom DMA code is available or if you
> want to migrate to a new DMA from the custom API the drier is wired for?
> It is just plain pain.
> At the end what we want to do with DMA: move data from opne place to
> another (in oversimplified view).

I think we fully agree on this :)

2018-04-19 11:42:12

by Peter Ujfalusi

[permalink] [raw]
Subject: Re: [RFC 2/6] dmaengine: xilinx_dma: Pass AXI4-Stream control words to netdev dma client


On 2018-04-18 16:06, Lars-Peter Clausen wrote:
>> Hrm, true, but it is hardly the metadata use case. It is more like
>> different DMA transfer type.
>
> When I look at this with my astronaut architect view from high high up above
> I do not see a difference between metadata and multi-planar data.

I tend to disagree.

> Both split the data that is sent to the peripheral into multiple
> sub-streams, each carrying part of the data. I'm sure there are peripherals
> that interleave data and metadata on the same data stream. Similar to how we
> have left and right channel interleaved in a audio stream.

Slimbus, S/PDIF?

> What about metadata that is not contiguous and split into multiple segments.
> How do you handle passing a sgl to the metadata interface? And then it
> suddenly looks quite similar to the normal DMA descriptor interface.

Well, the metadata is for the descriptor. The descriptor describe the
data transfer _and_ can convey additional information. Nothing is
interleaved, the data and the descriptor are different things. It is
more like TCP headers detached from the data (but pointing to it).

> But maybe that's just one abstraction level to high.

I understand your point, but at the end the metadata needs to end up in
the descriptor which is describing the data that is going to be moved.

The descriptor is not sent as a separate DMA trasnfer, it is part of the
DMA transfer, it is handled internally by the DMA.

- Péter

Texas Instruments Finland Oy, Porkkalankatu 22, 00180 Helsinki.
Y-tunnus/Business ID: 0615521-4. Kotipaikka/Domicile: Helsinki

2018-04-23 05:20:52

by Vinod Koul

[permalink] [raw]
Subject: Re: [RFC 4/6] dmaengine: xilinx_dma: Freeup active list based on descriptor completion bit

On Tue, Apr 17, 2018 at 12:28:52PM +0000, Radhey Shyam Pandey wrote:

> > > + if ((seg->hw.status & XILINX_DMA_COMP_MASK) ||
> > > + (!chan->xdev->has_axieth_connected)) {
> >
> > why the second case ? That is not expalined in log?
> In the current implementation, delay timeout is enabled only for
> has_axieth_connected usecase. For ethernet, we need real-time processing
> while still having benefit of interrupt coalescing. Example: In RX interrupt
> coalescing is set to 0x3. Without delay timeout, DMA engine will wait for
> all frames and then issue completion interrupt. In ethernet usecase, this
> can introduce huge latencies. Delay timeout interrupt will trigger if delay
> time period has expired and we can notify dma client with received frames.
>
> The second case is added to keep the previous implementation as is.(i.e when
> Delay timeout interrupt is not enabled - move all active desc to done list).
> Sure I will add a description for it in the commit log.

That should help, it didn't seem to have anything to do with log or other
changes, please keep in mind again that changelog should describe the change
and help ppl review...

--
~Vinod

2018-04-24 03:52:40

by Vinod Koul

[permalink] [raw]
Subject: Re: [RFC 2/6] dmaengine: xilinx_dma: Pass AXI4-Stream control words to netdev dma client

On Thu, Apr 19, 2018 at 02:40:26PM +0300, Peter Ujfalusi wrote:
>
> On 2018-04-18 16:06, Lars-Peter Clausen wrote:
> >> Hrm, true, but it is hardly the metadata use case. It is more like
> >> different DMA transfer type.
> >
> > When I look at this with my astronaut architect view from high high up above
> > I do not see a difference between metadata and multi-planar data.
>
> I tend to disagree.

and we will love to hear more :)

> > Both split the data that is sent to the peripheral into multiple
> > sub-streams, each carrying part of the data. I'm sure there are peripherals
> > that interleave data and metadata on the same data stream. Similar to how we
> > have left and right channel interleaved in a audio stream.
>
> Slimbus, S/PDIF?
>
> > What about metadata that is not contiguous and split into multiple segments.
> > How do you handle passing a sgl to the metadata interface? And then it
> > suddenly looks quite similar to the normal DMA descriptor interface.
>
> Well, the metadata is for the descriptor. The descriptor describe the
> data transfer _and_ can convey additional information. Nothing is
> interleaved, the data and the descriptor are different things. It is
> more like TCP headers detached from the data (but pointing to it).
>
> > But maybe that's just one abstraction level to high.
>
> I understand your point, but at the end the metadata needs to end up in
> the descriptor which is describing the data that is going to be moved.
>
> The descriptor is not sent as a separate DMA trasnfer, it is part of the
> DMA transfer, it is handled internally by the DMA.

That is bit confusing to me. I thought DMA was transparent to meta data and
would blindly collect and transfer along with the descriptor. So at high
level we are talking about two transfers (probably co-joined at hip and you
want to call one transfer) but why can't we visualize this as just a DMA
transfers. maybe you want to signal/attach to transfer, cant we do that with
additional flag DMA_METADATA etc..?

--
~Vinod

2018-04-24 14:28:50

by Peter Ujfalusi

[permalink] [raw]
Subject: Re: [RFC 2/6] dmaengine: xilinx_dma: Pass AXI4-Stream control words to netdev dma client

On 2018-04-24 06:55, Vinod Koul wrote:
> On Thu, Apr 19, 2018 at 02:40:26PM +0300, Peter Ujfalusi wrote:
>>
>> On 2018-04-18 16:06, Lars-Peter Clausen wrote:
>>>> Hrm, true, but it is hardly the metadata use case. It is more like
>>>> different DMA transfer type.
>>>
>>> When I look at this with my astronaut architect view from high high up above
>>> I do not see a difference between metadata and multi-planar data.
>>
>> I tend to disagree.
>
> and we will love to hear more :)

It is getting pretty off topic from the subject ;) and I'm sorry about that.

Multi-planar data is _data_, the metadata is
parameters/commands/information on _how_ to use the data.
It is more like a replacement or extension of:
configure peripheral
send data

to

send data with configuration

In both cases the same data is sent, but the configuration,
parametrization is 'simplified' to allow per packet changes.

>>> Both split the data that is sent to the peripheral into multiple
>>> sub-streams, each carrying part of the data. I'm sure there are peripherals
>>> that interleave data and metadata on the same data stream. Similar to how we
>>> have left and right channel interleaved in a audio stream.
>>
>> Slimbus, S/PDIF?
>>
>>> What about metadata that is not contiguous and split into multiple segments.
>>> How do you handle passing a sgl to the metadata interface? And then it
>>> suddenly looks quite similar to the normal DMA descriptor interface.
>>
>> Well, the metadata is for the descriptor. The descriptor describe the
>> data transfer _and_ can convey additional information. Nothing is
>> interleaved, the data and the descriptor are different things. It is
>> more like TCP headers detached from the data (but pointing to it).
>>
>>> But maybe that's just one abstraction level to high.
>>
>> I understand your point, but at the end the metadata needs to end up in
>> the descriptor which is describing the data that is going to be moved.
>>
>> The descriptor is not sent as a separate DMA trasnfer, it is part of the
>> DMA transfer, it is handled internally by the DMA.
>
> That is bit confusing to me. I thought DMA was transparent to meta data and
> would blindly collect and transfer along with the descriptor. So at high
> level we are talking about two transfers (probably co-joined at hip and you
> want to call one transfer)

At the end yes, both the descriptor and the data is going to be sent to
the other end.

As a reference see [1]

The metadata is not a separate entity, it is part of the descriptor
(Host Packet Descriptor - HPD).
Each transfer (packet) is described with a HPD. The HPD have optional
fields, like EPIB (Extended Packet Info Block), PSdata (Protocol
Specific data).

When the DMA reads the HPD, is going to move the data described by the
HPD to the entry point (or from the entry point to memory), copies the
EPIB/PSdata from the HPD to a destination HPD. The other end will use
the destination HPD to know the size of the data and to get the metadata
from the descriptor.

In essence every entity within the Multicore Navigator system have
pktdma, they all work in a similar way, but their capabilities might
differ. Our entry to this mesh is via the DMA.

> but why can't we visualize this as just a DMA
> transfers. maybe you want to signal/attach to transfer, cant we do that with
> additional flag DMA_METADATA etc..?

For the data we need to call dmaengine_prep_slave_* to create the
descriptor (HPD). The metadata needs to be present in the HPD, hence I
was thinking of the attach_metadata as per descriptor API.

If separate dmaengine_prep_slave_* is used for allocating the HPD and
place the metadata in it then the consequent dmaengine_prep_slave_* call
must be for the data of the transfer and it is still unclear how the
prepare call would have any idea where to look for the HPD it needs to
update with the parameters for the data transfer.

I guess the driver could store the HPD pointer in the channel data if
the prepare is called with DMA_METADATA and it would be mandatory that
the next prepare is for the data portion. The driver would pick the
pointer to the HPD we stored away and update the descriptor belonging to
different tx_desc.

But if we are here, we could have a flag like DMA_DESCRIPTOR and let
client drivers to allocate the whole descriptor, fill in the metadata
and give that to the DMA driver, which will update the rest of the HPD.

Well, let's see where this is going to go when I can send the patches
for review.

[1] http://www.ti.com/lit/ug/sprugr9h/sprugr9h.pdf

- Péter

Texas Instruments Finland Oy, Porkkalankatu 22, 00180 Helsinki.
Y-tunnus/Business ID: 0615521-4. Kotipaikka/Domicile: Helsinki

2018-05-17 06:40:07

by Radhey Shyam Pandey

[permalink] [raw]
Subject: RE: [RFC 2/6] dmaengine: xilinx_dma: Pass AXI4-Stream control words to netdev dma client

Hi,

> -----Original Message-----
> From: Peter Ujfalusi [mailto:[email protected]]
> Sent: Tuesday, April 24, 2018 3:21 PM
> To: Vinod Koul <[email protected]>
> Cc: Lars-Peter Clausen <[email protected]>; Radhey Shyam Pandey
> <[email protected]>; [email protected]; linux-
> [email protected]; [email protected];
> [email protected]; Appana Durga Kedareswara Rao
> <[email protected]>; [email protected]
> Subject: Re: [RFC 2/6] dmaengine: xilinx_dma: Pass AXI4-Stream control words
> to netdev dma client
>
> On 2018-04-24 06:55, Vinod Koul wrote:
> > On Thu, Apr 19, 2018 at 02:40:26PM +0300, Peter Ujfalusi wrote:
> >>
> >> On 2018-04-18 16:06, Lars-Peter Clausen wrote:
> >>>> Hrm, true, but it is hardly the metadata use case. It is more like
> >>>> different DMA transfer type.
> >>>
> >>> When I look at this with my astronaut architect view from high high up
> above
> >>> I do not see a difference between metadata and multi-planar data.
> >>
> >> I tend to disagree.
> >
> > and we will love to hear more :)
>
> It is getting pretty off topic from the subject ;) and I'm sorry about that.
>
> Multi-planar data is _data_, the metadata is
> parameters/commands/information on _how_ to use the data.
> It is more like a replacement or extension of:
> configure peripheral
> send data
>
> to
>
> send data with configuration
>
> In both cases the same data is sent, but the configuration,
> parametrization is 'simplified' to allow per packet changes.
>
> >>> Both split the data that is sent to the peripheral into multiple
> >>> sub-streams, each carrying part of the data. I'm sure there are peripherals
> >>> that interleave data and metadata on the same data stream. Similar to
> how we
> >>> have left and right channel interleaved in a audio stream.
> >>
> >> Slimbus, S/PDIF?
> >>
> >>> What about metadata that is not contiguous and split into multiple
> segments.
> >>> How do you handle passing a sgl to the metadata interface? And then it
> >>> suddenly looks quite similar to the normal DMA descriptor interface.
> >>
> >> Well, the metadata is for the descriptor. The descriptor describe the
> >> data transfer _and_ can convey additional information. Nothing is
> >> interleaved, the data and the descriptor are different things. It is
> >> more like TCP headers detached from the data (but pointing to it).
> >>
> >>> But maybe that's just one abstraction level to high.
> >>
> >> I understand your point, but at the end the metadata needs to end up in
> >> the descriptor which is describing the data that is going to be moved.
> >>
> >> The descriptor is not sent as a separate DMA trasnfer, it is part of the
> >> DMA transfer, it is handled internally by the DMA.
> >
> > That is bit confusing to me. I thought DMA was transparent to meta data and
> > would blindly collect and transfer along with the descriptor. So at high
> > level we are talking about two transfers (probably co-joined at hip and you
> > want to call one transfer)
>
> At the end yes, both the descriptor and the data is going to be sent to
> the other end.
>
> As a reference see [1]
>
> The metadata is not a separate entity, it is part of the descriptor
> (Host Packet Descriptor - HPD).
> Each transfer (packet) is described with a HPD. The HPD have optional
> fields, like EPIB (Extended Packet Info Block), PSdata (Protocol
> Specific data).
>
> When the DMA reads the HPD, is going to move the data described by the
> HPD to the entry point (or from the entry point to memory), copies the
> EPIB/PSdata from the HPD to a destination HPD. The other end will use
> the destination HPD to know the size of the data and to get the metadata
> from the descriptor.
>
> In essence every entity within the Multicore Navigator system have
> pktdma, they all work in a similar way, but their capabilities might
> differ. Our entry to this mesh is via the DMA.
>
> > but why can't we visualize this as just a DMA
> > transfers. maybe you want to signal/attach to transfer, cant we do that with
> > additional flag DMA_METADATA etc..?
>
> For the data we need to call dmaengine_prep_slave_* to create the
> descriptor (HPD). The metadata needs to be present in the HPD, hence I
> was thinking of the attach_metadata as per descriptor API.
>
> If separate dmaengine_prep_slave_* is used for allocating the HPD and
> place the metadata in it then the consequent dmaengine_prep_slave_* call
> must be for the data of the transfer and it is still unclear how the
> prepare call would have any idea where to look for the HPD it needs to
> update with the parameters for the data transfer.
>
> I guess the driver could store the HPD pointer in the channel data if
> the prepare is called with DMA_METADATA and it would be mandatory that
> the next prepare is for the data portion. The driver would pick the
> pointer to the HPD we stored away and update the descriptor belonging to
> different tx_desc.
>
> But if we are here, we could have a flag like DMA_DESCRIPTOR and let
> client drivers to allocate the whole descriptor, fill in the metadata
> and give that to the DMA driver, which will update the rest of the HPD.
>
> Well, let's see where this is going to go when I can send the patches
> for review.
Thanks all. @Peter: If we have metadata patchset ready may be good
to send an RFC?

>
> [1] http://www.ti.com/lit/ug/sprugr9h/sprugr9h.pdf
>
> - Péter
>
> Texas Instruments Finland Oy, Porkkalankatu 22, 00180 Helsinki.
> Y-tunnus/Business ID: 0615521-4. Kotipaikka/Domicile: Helsinki

2018-05-29 15:05:31

by Peter Ujfalusi

[permalink] [raw]
Subject: Re: [RFC 2/6] dmaengine: xilinx_dma: Pass AXI4-Stream control words to netdev dma client

Hi,

On 2018-05-17 09:39, Radhey Shyam Pandey wrote:
>> Well, let's see where this is going to go when I can send the patches
>> for review.
> Thanks all. @Peter: If we have metadata patchset ready may be good
> to send an RFC?

Sorry for the delay, I got distracted by this:
http://www.ti.com/lit/pdf/spruid7 Chapter 10.

I have given some tough to the metadata attach patches.
In my case the 'metadata' is more like private data section within the
DMA descriptor (10.1.2.2.1) which is used by the remote peripheral and
the driver for the given peripheral and it is optional.

I liked the idea of treating it as metadata as it gives more generic API
which can be adopted by other drivers if they need something similar.

Another issue I have with the attach metadata way is that it would
require one memcpy to copy the data to the DMA descriptor and in high
throughput case it is not acceptable.

For me probably a .get_private_area / .put_private_area like API would
be desirable where I can give the pointer of the 'metadata' are (and
size) to the user.

But these can co-exist in my opinion and DMA drivers can opt to
implement none, either or both of the callbacks.

In couple of days I can update the metadata patches I have atm and send
as RFC.

Is there anything from your side I should take into account when doing that?

- Péter

Texas Instruments Finland Oy, Porkkalankatu 22, 00180 Helsinki.
Y-tunnus/Business ID: 0615521-4. Kotipaikka/Domicile: Helsinki

2018-05-30 17:30:38

by Radhey Shyam Pandey

[permalink] [raw]
Subject: RE: [RFC 2/6] dmaengine: xilinx_dma: Pass AXI4-Stream control words to netdev dma client

Hi,

> -----Original Message-----
> From: Peter Ujfalusi [mailto:[email protected]]
> Sent: Tuesday, May 29, 2018 8:35 PM
> To: Radhey Shyam Pandey <[email protected]>; Vinod Koul
> <[email protected]>
> Cc: Lars-Peter Clausen <[email protected]>; [email protected]; linux-
> [email protected]; [email protected];
> [email protected]; Appana Durga Kedareswara Rao
> <[email protected]>; [email protected]
> Subject: Re: [RFC 2/6] dmaengine: xilinx_dma: Pass AXI4-Stream control words
> to netdev dma client
>
> Hi,
>
> On 2018-05-17 09:39, Radhey Shyam Pandey wrote:
> >> Well, let's see where this is going to go when I can send the patches
> >> for review.
> > Thanks all. @Peter: If we have metadata patchset ready may be good
> > to send an RFC?
>
> Sorry for the delay, I got distracted by this:
> http://www.ti.com/lit/pdf/spruid7 Chapter 10.
>
> I have given some tough to the metadata attach patches.
> In my case the 'metadata' is more like private data section within the
> DMA descriptor (10.1.2.2.1) which is used by the remote peripheral and
> the driver for the given peripheral and it is optional.
>
> I liked the idea of treating it as metadata as it gives more generic API
> which can be adopted by other drivers if they need something similar.
>
> Another issue I have with the attach metadata way is that it would
> require one memcpy to copy the data to the DMA descriptor and in high
> throughput case it is not acceptable.

I think memcpy is needed (alternative?) if dma engine doesn’t directly
update metadata buffers i.e in RX, we need to copy metadata from
dma descriptor fields to client allocated metadata buffer (sideband/
metadata info is part of Buffer descriptor fields)

memcpy(app_w, hw->app, sizeof(u32) * XILINX_DMA_NUM_APP_WORDS);

Descriptor Fields
Address Space Offset Name Description
20h APP0 User Application Field 0
24h APP1 User Application Field 1
...
30h APP5 User Application Field 5

>
> For me probably a .get_private_area / .put_private_area like API would
> be desirable where I can give the pointer of the 'metadata' are (and
> size) to the user.
>
> But these can co-exist in my opinion and DMA drivers can opt to
> implement none, either or both of the callbacks.
>
> In couple of days I can update the metadata patches I have atm and send
> as RFC.
>
> Is there anything from your side I should take into account when doing that?
I think a generic interface to attach/share metadata buffer b/w client and the
dmaengine driver is good enough. Is metadata patchset (early version)
available in TI external repos?

Thanks,
Radhey

>
> - Péter
>
> Texas Instruments Finland Oy, Porkkalankatu 22, 00180 Helsinki.
> Y-tunnus/Business ID: 0615521-4. Kotipaikka/Domicile: Helsinki

2018-06-01 10:18:08

by Peter Ujfalusi

[permalink] [raw]
Subject: Re: [RFC 2/6] dmaengine: xilinx_dma: Pass AXI4-Stream control words to netdev dma client

Hi Radhey,

On 2018-05-30 20:29, Radhey Shyam Pandey wrote:
>> In couple of days I can update the metadata patches I have atm and send
>> as RFC.
>>
>> Is there anything from your side I should take into account when doing that?
> I think a generic interface to attach/share metadata buffer b/w client and the
> dmaengine driver is good enough. Is metadata patchset (early version)
> available in TI external repos?

I don't have it in public repository, but now that the TRM is public I
can start preparing things for upstream.

I have attached the patch I ended up with, but I need to add the
documentation part.

Since the 'metadata' is part of the DMA descriptor itself I thought that
it might be better to reflect that -> the metadata_ops is part of the
dma_async_tx_descriptor struct.

DMA drivers can initialize it when it is supported by the channel or
setup. In my case it is optional, many peripherals did not use it at all.

I have two modes to deal with the metadata:
1. attach mode
Client drivers are giving a buffer and a size to the DMA driver and in
case of TX the data is copied to the descriptor's metadata part. In case
of RX when the transfer is completed the DMA driver will copy the data
from the DMA descriptor to the client provided buffer.
Here we need one memcpy for each descriptor.

2. pointer mode
If we have high throughput peripheral, the per descriptor memcpy can be
an obstacle for performance.

TX: The client dmaengine_desc_get_metadata_ptr() to get the pointer to
the metadata section of the descriptor, it will receive back the max
size and the currently used size (important for RX). This is done before
issue_pending().
The client can update the metadata directly and when it is done calls
the dmaengine_desc_set_metadata_len() to tell the DMA driver the size of
the metadata it has configured.

RX: in the DMA callback the client calls
dmaengine_desc_get_metadata_ptr() to get the pointer and the size of the
metadata we have received and can process the information w/o memcpy.

I think it might be better to rename these from metadata to client_data
or something. It is part of the DMA descriptor, passed along with the
DMA descriptor, but it is owned by the client driver.

- Péter

Texas Instruments Finland Oy, Porkkalankatu 22, 00180 Helsinki.
Y-tunnus/Business ID: 0615521-4. Kotipaikka/Domicile: Helsinki


Attachments:
0001-dmaengine-Add-metadat_ops-for-dma_async_tx_descripto.patch (3.12 kB)

2018-06-01 10:25:55

by Peter Ujfalusi

[permalink] [raw]
Subject: [RFC] dmaengine: Add metadat_ops for dma_async_tx_descriptor

If the DMA supports per descriptor metadata it can implement the attach,
get_ptr/set_len callbacks.

Client drivers must only use either attach or get_ptr/set_len to avoid
miss configuration.

Wrappers are also added for the metadata_ops:
dmaengine_desc_attach_metadata()
dmaengine_desc_get_metadata_ptr()
dmaengine_desc_set_metadata_len()

Signed-off-by: Peter Ujfalusi <[email protected]>
---
Hi,

since attachments are bouncing back, I send the patch separately

Regards,
Peter

include/linux/dmaengine.h | 50 +++++++++++++++++++++++++++++++++++++++
1 file changed, 50 insertions(+)

diff --git a/include/linux/dmaengine.h b/include/linux/dmaengine.h
index 51fbb861e84b..ac42ace36aa3 100644
--- a/include/linux/dmaengine.h
+++ b/include/linux/dmaengine.h
@@ -491,6 +491,18 @@ struct dmaengine_unmap_data {
dma_addr_t addr[0];
};

+struct dma_async_tx_descriptor;
+
+struct dma_descriptor_metadata_ops {
+ int (*attach)(struct dma_async_tx_descriptor *desc, void *data,
+ size_t len);
+
+ void *(*get_ptr)(struct dma_async_tx_descriptor *desc,
+ size_t *payload_len, size_t *max_len);
+ int (*set_len)(struct dma_async_tx_descriptor *desc,
+ size_t payload_len);
+};
+
/**
* struct dma_async_tx_descriptor - async transaction descriptor
* ---dma generic offload fields---
@@ -520,6 +532,7 @@ struct dma_async_tx_descriptor {
dma_async_tx_callback_result callback_result;
void *callback_param;
struct dmaengine_unmap_data *unmap;
+ struct dma_descriptor_metadata_ops *metadata_ops;
#ifdef CONFIG_ASYNC_TX_ENABLE_CHANNEL_SWITCH
struct dma_async_tx_descriptor *next;
struct dma_async_tx_descriptor *parent;
@@ -932,6 +945,43 @@ static inline struct dma_async_tx_descriptor *dmaengine_prep_dma_memcpy(
len, flags);
}

+static inline int dmaengine_desc_attach_metadata(
+ struct dma_async_tx_descriptor *desc, void *data, size_t len)
+{
+ if (!desc)
+ return 0;
+
+ if (!desc->metadata_ops || !desc->metadata_ops->attach)
+ return -ENOTSUPP;
+
+ return desc->metadata_ops->attach(desc, data, len);
+}
+
+static inline void *dmaengine_desc_get_metadata_ptr(
+ struct dma_async_tx_descriptor *desc, size_t *payload_len,
+ size_t *max_len)
+{
+ if (!desc)
+ return NULL;
+
+ if (!desc->metadata_ops || !desc->metadata_ops->get_ptr)
+ return ERR_PTR(-ENOTSUPP);
+
+ return desc->metadata_ops->get_ptr(desc, payload_len, max_len);
+}
+
+static inline int dmaengine_desc_set_metadata_len(
+ struct dma_async_tx_descriptor *desc, size_t payload_len)
+{
+ if (!desc)
+ return 0;
+
+ if (!desc->metadata_ops || !desc->metadata_ops->set_len)
+ return -ENOTSUPP;
+
+ return desc->metadata_ops->set_len(desc, payload_len);
+}
+
/**
* dmaengine_terminate_all() - Terminate all active DMA transfers
* @chan: The channel for which to terminate the transfers
--
Peter

Texas Instruments Finland Oy, Porkkalankatu 22, 00180 Helsinki.
Y-tunnus/Business ID: 0615521-4. Kotipaikka/Domicile: Helsinki


2018-07-02 07:08:54

by Radhey Shyam Pandey

[permalink] [raw]
Subject: RE: [RFC] dmaengine: Add metadat_ops for dma_async_tx_descriptor

> -----Original Message-----
> From: Peter Ujfalusi [mailto:[email protected]]
> Sent: Friday, June 1, 2018 3:54 PM
> To: Radhey Shyam Pandey <[email protected]>; [email protected]
> Cc: [email protected]; [email protected]; linux-
> [email protected]; [email protected];
> [email protected]; Appana Durga Kedareswara Rao
> <[email protected]>; [email protected]
> Subject: [RFC] dmaengine: Add metadat_ops for dma_async_tx_descriptor
>
> If the DMA supports per descriptor metadata it can implement the attach,
> get_ptr/set_len callbacks.
>
> Client drivers must only use either attach or get_ptr/set_len to avoid
> miss configuration.
>
> Wrappers are also added for the metadata_ops:
> dmaengine_desc_attach_metadata()
> dmaengine_desc_get_metadata_ptr()
> dmaengine_desc_set_metadata_len()
>
> Signed-off-by: Peter Ujfalusi <[email protected]>
> ---
> Hi,
>
> since attachments are bouncing back, I send the patch separately
>
> Regards,
> Peter
>
> include/linux/dmaengine.h | 50
> +++++++++++++++++++++++++++++++++++++++
> 1 file changed, 50 insertions(+)
>
> diff --git a/include/linux/dmaengine.h b/include/linux/dmaengine.h
> index 51fbb861e84b..ac42ace36aa3 100644
> --- a/include/linux/dmaengine.h
> +++ b/include/linux/dmaengine.h
> @@ -491,6 +491,18 @@ struct dmaengine_unmap_data {
> dma_addr_t addr[0];
> };
>
> +struct dma_async_tx_descriptor;
> +
> +struct dma_descriptor_metadata_ops {
> + int (*attach)(struct dma_async_tx_descriptor *desc, void *data,
> + size_t len);
> +
> + void *(*get_ptr)(struct dma_async_tx_descriptor *desc,
> + size_t *payload_len, size_t *max_len);
> + int (*set_len)(struct dma_async_tx_descriptor *desc,
> + size_t payload_len);
> +};
> +
> /**
> * struct dma_async_tx_descriptor - async transaction descriptor
> * ---dma generic offload fields---
> @@ -520,6 +532,7 @@ struct dma_async_tx_descriptor {
> dma_async_tx_callback_result callback_result;
> void *callback_param;
> struct dmaengine_unmap_data *unmap;
> + struct dma_descriptor_metadata_ops *metadata_ops;
> #ifdef CONFIG_ASYNC_TX_ENABLE_CHANNEL_SWITCH
> struct dma_async_tx_descriptor *next;
> struct dma_async_tx_descriptor *parent;
> @@ -932,6 +945,43 @@ static inline struct dma_async_tx_descriptor
> *dmaengine_prep_dma_memcpy(
> len, flags);
> }
>
> +static inline int dmaengine_desc_attach_metadata(
> + struct dma_async_tx_descriptor *desc, void *data, size_t
> len)
> +{
> + if (!desc)
> + return 0;
> +
> + if (!desc->metadata_ops || !desc->metadata_ops->attach)
> + return -ENOTSUPP;
> +
> + return desc->metadata_ops->attach(desc, data, len);
> +}
> +
> +static inline void *dmaengine_desc_get_metadata_ptr(
> + struct dma_async_tx_descriptor *desc, size_t *payload_len,
> + size_t *max_len)
> +{
> + if (!desc)
> + return NULL;
> +
> + if (!desc->metadata_ops || !desc->metadata_ops->get_ptr)
> + return ERR_PTR(-ENOTSUPP);
> +
> + return desc->metadata_ops->get_ptr(desc, payload_len, max_len);
> +}
> +
> +static inline int dmaengine_desc_set_metadata_len(
> + struct dma_async_tx_descriptor *desc, size_t payload_len)
> +{
> + if (!desc)
> + return 0;
> +
> + if (!desc->metadata_ops || !desc->metadata_ops->set_len)
> + return -ENOTSUPP;
> +
> + return desc->metadata_ops->set_len(desc, payload_len);
> +}
> +
Thanks for the RFC patchset. It looks fine to me. We also need to update
documentation. Let's wait for Vinod/Lars feedback.

> /**
> * dmaengine_terminate_all() - Terminate all active DMA transfers
> * @chan: The channel for which to terminate the transfers
> --
> Peter
>
> Texas Instruments Finland Oy, Porkkalankatu 22, 00180 Helsinki.
> Y-tunnus/Business ID: 0615521-4. Kotipaikka/Domicile: Helsinki


2018-07-10 05:53:40

by Vinod Koul

[permalink] [raw]
Subject: Re: [RFC] dmaengine: Add metadat_ops for dma_async_tx_descriptor


Hey Peter,

Sorry for late response on this..

On 01-06-18, 13:24, Peter Ujfalusi wrote:
> If the DMA supports per descriptor metadata it can implement the attach,
> get_ptr/set_len callbacks.
>
> Client drivers must only use either attach or get_ptr/set_len to avoid
> miss configuration.
>
> Wrappers are also added for the metadata_ops:
> dmaengine_desc_attach_metadata()
> dmaengine_desc_get_metadata_ptr()
> dmaengine_desc_set_metadata_len()
>
> Signed-off-by: Peter Ujfalusi <[email protected]>
> ---
> Hi,
>
> since attachments are bouncing back, I send the patch separately
>
> Regards,
> Peter
>
> include/linux/dmaengine.h | 50 +++++++++++++++++++++++++++++++++++++++
> 1 file changed, 50 insertions(+)
>
> diff --git a/include/linux/dmaengine.h b/include/linux/dmaengine.h
> index 51fbb861e84b..ac42ace36aa3 100644
> --- a/include/linux/dmaengine.h
> +++ b/include/linux/dmaengine.h
> @@ -491,6 +491,18 @@ struct dmaengine_unmap_data {
> dma_addr_t addr[0];
> };
>
> +struct dma_async_tx_descriptor;
> +
> +struct dma_descriptor_metadata_ops {
> + int (*attach)(struct dma_async_tx_descriptor *desc, void *data,
> + size_t len);

How does one detach? When should the client free up the memory, IOW when
does dma driver drop ref to data.

> +
> + void *(*get_ptr)(struct dma_async_tx_descriptor *desc,
> + size_t *payload_len, size_t *max_len);

so what is this supposed to do..?

> + int (*set_len)(struct dma_async_tx_descriptor *desc,
> + size_t payload_len);

attach already has length, so how does this help?

> +};
> +
> /**
> * struct dma_async_tx_descriptor - async transaction descriptor
> * ---dma generic offload fields---
> @@ -520,6 +532,7 @@ struct dma_async_tx_descriptor {
> dma_async_tx_callback_result callback_result;
> void *callback_param;
> struct dmaengine_unmap_data *unmap;
> + struct dma_descriptor_metadata_ops *metadata_ops;
> #ifdef CONFIG_ASYNC_TX_ENABLE_CHANNEL_SWITCH
> struct dma_async_tx_descriptor *next;
> struct dma_async_tx_descriptor *parent;
> @@ -932,6 +945,43 @@ static inline struct dma_async_tx_descriptor *dmaengine_prep_dma_memcpy(
> len, flags);
> }
>
> +static inline int dmaengine_desc_attach_metadata(
> + struct dma_async_tx_descriptor *desc, void *data, size_t len)
> +{
> + if (!desc)
> + return 0;
> +
> + if (!desc->metadata_ops || !desc->metadata_ops->attach)
> + return -ENOTSUPP;
> +
> + return desc->metadata_ops->attach(desc, data, len);
> +}
> +
> +static inline void *dmaengine_desc_get_metadata_ptr(
> + struct dma_async_tx_descriptor *desc, size_t *payload_len,
> + size_t *max_len)
> +{
> + if (!desc)
> + return NULL;
> +
> + if (!desc->metadata_ops || !desc->metadata_ops->get_ptr)
> + return ERR_PTR(-ENOTSUPP);
> +
> + return desc->metadata_ops->get_ptr(desc, payload_len, max_len);
> +}
> +
> +static inline int dmaengine_desc_set_metadata_len(
> + struct dma_async_tx_descriptor *desc, size_t payload_len)
> +{
> + if (!desc)
> + return 0;
> +
> + if (!desc->metadata_ops || !desc->metadata_ops->set_len)
> + return -ENOTSUPP;
> +
> + return desc->metadata_ops->set_len(desc, payload_len);
> +}
> +
> /**
> * dmaengine_terminate_all() - Terminate all active DMA transfers
> * @chan: The channel for which to terminate the transfers
> --
> Peter
>
> Texas Instruments Finland Oy, Porkkalankatu 22, 00180 Helsinki.
> Y-tunnus/Business ID: 0615521-4. Kotipaikka/Domicile: Helsinki
>
> --
> To unsubscribe from this list: send the line "unsubscribe dmaengine" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html

--
~Vinod

2018-07-18 10:06:54

by Peter Ujfalusi

[permalink] [raw]
Subject: Re: [RFC] dmaengine: Add metadat_ops for dma_async_tx_descriptor

Hi Vinod,

On 2018-07-10 08:52, Vinod wrote:
>
> Hey Peter,
>
> Sorry for late response on this..

No problem, I was away myself also...

> On 01-06-18, 13:24, Peter Ujfalusi wrote:
>> If the DMA supports per descriptor metadata it can implement the attach,
>> get_ptr/set_len callbacks.
>>
>> Client drivers must only use either attach or get_ptr/set_len to avoid
>> miss configuration.
>>
>> Wrappers are also added for the metadata_ops:
>> dmaengine_desc_attach_metadata()
>> dmaengine_desc_get_metadata_ptr()
>> dmaengine_desc_set_metadata_len()
>>
>> Signed-off-by: Peter Ujfalusi <[email protected]>
>> ---
>> Hi,
>>
>> since attachments are bouncing back, I send the patch separately
>>
>> Regards,
>> Peter
>>
>> include/linux/dmaengine.h | 50 +++++++++++++++++++++++++++++++++++++++
>> 1 file changed, 50 insertions(+)
>>
>> diff --git a/include/linux/dmaengine.h b/include/linux/dmaengine.h
>> index 51fbb861e84b..ac42ace36aa3 100644
>> --- a/include/linux/dmaengine.h
>> +++ b/include/linux/dmaengine.h
>> @@ -491,6 +491,18 @@ struct dmaengine_unmap_data {
>> dma_addr_t addr[0];
>> };
>>
>> +struct dma_async_tx_descriptor;
>> +
>> +struct dma_descriptor_metadata_ops {
>> + int (*attach)(struct dma_async_tx_descriptor *desc, void *data,
>> + size_t len);
>
> How does one detach?

I have not thought about detach, but clients can just attach NULL I guess.

> When should the client free up the memory, IOW when
> does dma driver drop ref to data.

The metadata is for the descriptor so the DMA driver might want to
access to it while the descriptor is valid.

Typically clients can free up their metadata storage after the dma
completion callback. On DEV_TO_MEM the metadata is going to be placed in
the provided buffer when the transfer is completed.

>
>> +
>> + void *(*get_ptr)(struct dma_async_tx_descriptor *desc,
>> + size_t *payload_len, size_t *max_len);
>
> so what is this supposed to do..?

My issue with the attach in general is that it will need additional
memcpy to move the metadata from/to the client buffer to it's place.

With get_ptr the client can get the pointer to the actual place where
the metadata resides and modify/read it in place w/o memcpy.

I know, I know... We need to trust the clients, but with high throughput
peripherals the memcpy is taxing.

>
>> + int (*set_len)(struct dma_async_tx_descriptor *desc,
>> + size_t payload_len);
>
> attach already has length, so how does this help?

So, DMA drivers can implement either or both:
1. attach()
2. get_ptr() / set_len()

Clients must not mix the two way of handling the metadata.
The set_len() is intended to tell the DMA driver the client provided
metadata size (in MEM_TO_DEV case mostly).

MEM_TO_DEV flow on client side:
get_ptr()
fill in the metadata to the pointer (not exceeding max_len)
set_len() to tell the DMA driver the amount of valid bytes written

DEV_TO_MEM flow on client side:
In the completion callback, get_ptr()
the metadata is payload_len bytes and can be accessed in the return pointer.


BTW: The driver which is going to need this is now accessible in public:
https://git.ti.com/ti-linux-kernel/ti-linux-kernel/trees/ti-linux-4.14.y/drivers/dma/ti

or in my wip tree:
https://github.com/omap-audio/linux-audio/tree/peter/ti-linux-4.14.y/wip/drivers/dma/ti

prefixed with k3-*

>
>> +};
>> +
>> /**
>> * struct dma_async_tx_descriptor - async transaction descriptor
>> * ---dma generic offload fields---
>> @@ -520,6 +532,7 @@ struct dma_async_tx_descriptor {
>> dma_async_tx_callback_result callback_result;
>> void *callback_param;
>> struct dmaengine_unmap_data *unmap;
>> + struct dma_descriptor_metadata_ops *metadata_ops;
>> #ifdef CONFIG_ASYNC_TX_ENABLE_CHANNEL_SWITCH
>> struct dma_async_tx_descriptor *next;
>> struct dma_async_tx_descriptor *parent;
>> @@ -932,6 +945,43 @@ static inline struct dma_async_tx_descriptor *dmaengine_prep_dma_memcpy(
>> len, flags);
>> }
>>
>> +static inline int dmaengine_desc_attach_metadata(
>> + struct dma_async_tx_descriptor *desc, void *data, size_t len)
>> +{
>> + if (!desc)
>> + return 0;
>> +
>> + if (!desc->metadata_ops || !desc->metadata_ops->attach)
>> + return -ENOTSUPP;
>> +
>> + return desc->metadata_ops->attach(desc, data, len);
>> +}
>> +
>> +static inline void *dmaengine_desc_get_metadata_ptr(
>> + struct dma_async_tx_descriptor *desc, size_t *payload_len,
>> + size_t *max_len)
>> +{
>> + if (!desc)
>> + return NULL;
>> +
>> + if (!desc->metadata_ops || !desc->metadata_ops->get_ptr)
>> + return ERR_PTR(-ENOTSUPP);
>> +
>> + return desc->metadata_ops->get_ptr(desc, payload_len, max_len);
>> +}
>> +
>> +static inline int dmaengine_desc_set_metadata_len(
>> + struct dma_async_tx_descriptor *desc, size_t payload_len)
>> +{
>> + if (!desc)
>> + return 0;
>> +
>> + if (!desc->metadata_ops || !desc->metadata_ops->set_len)
>> + return -ENOTSUPP;
>> +
>> + return desc->metadata_ops->set_len(desc, payload_len);
>> +}
>> +
>> /**
>> * dmaengine_terminate_all() - Terminate all active DMA transfers
>> * @chan: The channel for which to terminate the transfers
>> --
>> Peter
>>
>> Texas Instruments Finland Oy, Porkkalankatu 22, 00180 Helsinki.
>> Y-tunnus/Business ID: 0615521-4. Kotipaikka/Domicile: Helsinki
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe dmaengine" in
>> the body of a message to [email protected]
>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>

- Péter

Texas Instruments Finland Oy, Porkkalankatu 22, 00180 Helsinki.
Y-tunnus/Business ID: 0615521-4. Kotipaikka/Domicile: Helsinki

2018-07-19 09:23:30

by Vinod Koul

[permalink] [raw]
Subject: Re: [RFC] dmaengine: Add metadat_ops for dma_async_tx_descriptor

Hi Peter,

On 18-07-18, 13:06, Peter Ujfalusi wrote:

> >> +struct dma_async_tx_descriptor;
> >> +
> >> +struct dma_descriptor_metadata_ops {
> >> + int (*attach)(struct dma_async_tx_descriptor *desc, void *data,
> >> + size_t len);
> >
> > How does one detach?
>
> I have not thought about detach, but clients can just attach NULL I guess.

So what are the implication of attach and detach here, should the data
be deref by dmaengine driver and drop the ref.

Should anyone do refcounting?

>
> > When should the client free up the memory, IOW when
> > does dma driver drop ref to data.
>
> The metadata is for the descriptor so the DMA driver might want to
> access to it while the descriptor is valid.
>
> Typically clients can free up their metadata storage after the dma
> completion callback. On DEV_TO_MEM the metadata is going to be placed in
> the provided buffer when the transfer is completed.

That sounds okay to me

> >> + void *(*get_ptr)(struct dma_async_tx_descriptor *desc,
> >> + size_t *payload_len, size_t *max_len);
> >
> > so what is this supposed to do..?
>
> My issue with the attach in general is that it will need additional
> memcpy to move the metadata from/to the client buffer to it's place.
>
> With get_ptr the client can get the pointer to the actual place where
> the metadata resides and modify/read it in place w/o memcpy.
>
> I know, I know... We need to trust the clients, but with high throughput
> peripherals the memcpy is taxing.

Okay I am not sure I have understood fully, so with attach you set
a pointer (containing metdata?) so why do you need additional one..

>
> >
> >> + int (*set_len)(struct dma_async_tx_descriptor *desc,
> >> + size_t payload_len);
> >
> > attach already has length, so how does this help?
>
> So, DMA drivers can implement either or both:
> 1. attach()
> 2. get_ptr() / set_len()

Ah okay, what are the reasons for providing two methods and not a single
one

>
> Clients must not mix the two way of handling the metadata.
> The set_len() is intended to tell the DMA driver the client provided
> metadata size (in MEM_TO_DEV case mostly).
>
> MEM_TO_DEV flow on client side:
> get_ptr()
> fill in the metadata to the pointer (not exceeding max_len)
> set_len() to tell the DMA driver the amount of valid bytes written
>
> DEV_TO_MEM flow on client side:
> In the completion callback, get_ptr()
> the metadata is payload_len bytes and can be accessed in the return pointer.

I would think to unify this..

> BTW: The driver which is going to need this is now accessible in public:
> https://git.ti.com/ti-linux-kernel/ti-linux-kernel/trees/ti-linux-4.14.y/drivers/dma/ti
>
> or in my wip tree:
> https://github.com/omap-audio/linux-audio/tree/peter/ti-linux-4.14.y/wip/drivers/dma/ti
>
> prefixed with k3-*
>
--
~Vinod

2018-07-20 13:43:53

by Peter Ujfalusi

[permalink] [raw]
Subject: Re: [RFC] dmaengine: Add metadat_ops for dma_async_tx_descriptor



On 2018-07-19 12:22, Vinod wrote:
> Hi Peter,
>
> On 18-07-18, 13:06, Peter Ujfalusi wrote:
>
>>>> +struct dma_async_tx_descriptor;
>>>> +
>>>> +struct dma_descriptor_metadata_ops {
>>>> + int (*attach)(struct dma_async_tx_descriptor *desc, void *data,
>>>> + size_t len);
>>>
>>> How does one detach?
>>
>> I have not thought about detach, but clients can just attach NULL I guess.
>
> So what are the implication of attach and detach here, should the data
> be deref by dmaengine driver and drop the ref.

It largely depends on the DMA driver, but I think we must have clear
definition on how clients (and thus DMA drivers) must handle the metadata.

I think the simpler rule would be that clients _must_ attach the
metadata buffer after _prepare() and before issue_pending() and they
must make sure that the buffer is valid (not freed up) before the
completion callback is called for the given descriptor.

About the detach: If clients detaches the metadata buffer then on
completion it is not going to receive back any metadata and I think the
DMA drivers should clean and disable the metadata sending as well if the
detach happens before issue_pending().

> Should anyone do refcounting?

Need to think about that.

>>
>>> When should the client free up the memory, IOW when
>>> does dma driver drop ref to data.
>>
>> The metadata is for the descriptor so the DMA driver might want to
>> access to it while the descriptor is valid.
>>
>> Typically clients can free up their metadata storage after the dma
>> completion callback. On DEV_TO_MEM the metadata is going to be placed in
>> the provided buffer when the transfer is completed.
>
> That sounds okay to me
>
>>>> + void *(*get_ptr)(struct dma_async_tx_descriptor *desc,
>>>> + size_t *payload_len, size_t *max_len);
>>>
>>> so what is this supposed to do..?
>>
>> My issue with the attach in general is that it will need additional
>> memcpy to move the metadata from/to the client buffer to it's place.
>>
>> With get_ptr the client can get the pointer to the actual place where
>> the metadata resides and modify/read it in place w/o memcpy.
>>
>> I know, I know... We need to trust the clients, but with high throughput
>> peripherals the memcpy is taxing.
>
> Okay I am not sure I have understood fully, so with attach you set
> a pointer (containing metdata?) so why do you need additional one..
>
>>
>>>
>>>> + int (*set_len)(struct dma_async_tx_descriptor *desc,
>>>> + size_t payload_len);
>>>
>>> attach already has length, so how does this help?
>>
>> So, DMA drivers can implement either or both:
>> 1. attach()
>> 2. get_ptr() / set_len()
>
> Ah okay, what are the reasons for providing two methods and not a single
> one

For the HW I have it would be more efficient to grab pointer and do
in-place modification to metadata section (the part of the CPPI5
descriptor which is owned by the client driver).

Other vendors might have the metadata scattered, or in different way
which does not fit with the ptr mode for security or sanity point of
view - I don't want to give the whole descriptor to the client. I don't
trust ;)

>>
>> Clients must not mix the two way of handling the metadata.
>> The set_len() is intended to tell the DMA driver the client provided
>> metadata size (in MEM_TO_DEV case mostly).
>>
>> MEM_TO_DEV flow on client side:
>> get_ptr()
>> fill in the metadata to the pointer (not exceeding max_len)
>> set_len() to tell the DMA driver the amount of valid bytes written
>>
>> DEV_TO_MEM flow on client side:
>> In the completion callback, get_ptr()
>> the metadata is payload_len bytes and can be accessed in the return pointer.
>
> I would think to unify this..

I have tried it, but the attach mode and the pointer mode is hard to
handle with a generic API.
I will try to find a way to unify things in a sane way.

I have moved the metadata_ops to dma_async_tx_descriptor to emphasize
that it is per descriptor setting:
https://github.com/omap-audio/linux-audio/commit/02e095d1320a4bb3ae281ddb208ce82ead746f00#diff-92c0a79f414dc3be9dfc67a969c0dd71


>> BTW: The driver which is going to need this is now accessible in public:
>> https://git.ti.com/ti-linux-kernel/ti-linux-kernel/trees/ti-linux-4.14.y/drivers/dma/ti
>>
>> or in my wip tree:
>> https://github.com/omap-audio/linux-audio/tree/peter/ti-linux-4.14.y/wip/drivers/dma/ti
>>
>> prefixed with k3-*
>>

- Péter

Texas Instruments Finland Oy, Porkkalankatu 22, 00180 Helsinki.
Y-tunnus/Business ID: 0615521-4. Kotipaikka/Domicile: Helsinki

2018-07-24 11:15:49

by Vinod Koul

[permalink] [raw]
Subject: Re: [RFC] dmaengine: Add metadat_ops for dma_async_tx_descriptor

On 20-07-18, 16:42, Peter Ujfalusi wrote:
>
>
> On 2018-07-19 12:22, Vinod wrote:
> > Hi Peter,
> >
> > On 18-07-18, 13:06, Peter Ujfalusi wrote:
> >
> >>>> +struct dma_async_tx_descriptor;
> >>>> +
> >>>> +struct dma_descriptor_metadata_ops {
> >>>> + int (*attach)(struct dma_async_tx_descriptor *desc, void *data,
> >>>> + size_t len);
> >>>
> >>> How does one detach?
> >>
> >> I have not thought about detach, but clients can just attach NULL I guess.
> >
> > So what are the implication of attach and detach here, should the data
> > be deref by dmaengine driver and drop the ref.
>
> It largely depends on the DMA driver, but I think we must have clear
> definition on how clients (and thus DMA drivers) must handle the metadata.

Correct, defining these will help out get clarity and avoid abuse.

> I think the simpler rule would be that clients _must_ attach the
> metadata buffer after _prepare() and before issue_pending() and they
> must make sure that the buffer is valid (not freed up) before the
> completion callback is called for the given descriptor.
>
> About the detach: If clients detaches the metadata buffer then on
> completion it is not going to receive back any metadata and I think the
> DMA drivers should clean and disable the metadata sending as well if the
> detach happens before issue_pending().
>
> > Should anyone do refcounting?
>
> Need to think about that.
>
> >>
> >>> When should the client free up the memory, IOW when
> >>> does dma driver drop ref to data.
> >>
> >> The metadata is for the descriptor so the DMA driver might want to
> >> access to it while the descriptor is valid.
> >>
> >> Typically clients can free up their metadata storage after the dma
> >> completion callback. On DEV_TO_MEM the metadata is going to be placed in
> >> the provided buffer when the transfer is completed.
> >
> > That sounds okay to me
> >
> >>>> + void *(*get_ptr)(struct dma_async_tx_descriptor *desc,
> >>>> + size_t *payload_len, size_t *max_len);
> >>>
> >>> so what is this supposed to do..?
> >>
> >> My issue with the attach in general is that it will need additional
> >> memcpy to move the metadata from/to the client buffer to it's place.
> >>
> >> With get_ptr the client can get the pointer to the actual place where
> >> the metadata resides and modify/read it in place w/o memcpy.
> >>
> >> I know, I know... We need to trust the clients, but with high throughput
> >> peripherals the memcpy is taxing.
> >
> > Okay I am not sure I have understood fully, so with attach you set
> > a pointer (containing metdata?) so why do you need additional one..
> >
> >>
> >>>
> >>>> + int (*set_len)(struct dma_async_tx_descriptor *desc,
> >>>> + size_t payload_len);
> >>>
> >>> attach already has length, so how does this help?
> >>
> >> So, DMA drivers can implement either or both:
> >> 1. attach()
> >> 2. get_ptr() / set_len()
> >
> > Ah okay, what are the reasons for providing two methods and not a single
> > one
>
> For the HW I have it would be more efficient to grab pointer and do
> in-place modification to metadata section (the part of the CPPI5
> descriptor which is owned by the client driver).
>
> Other vendors might have the metadata scattered, or in different way
> which does not fit with the ptr mode for security or sanity point of
> view - I don't want to give the whole descriptor to the client. I don't
> trust ;)
>
> >>
> >> Clients must not mix the two way of handling the metadata.
> >> The set_len() is intended to tell the DMA driver the client provided
> >> metadata size (in MEM_TO_DEV case mostly).
> >>
> >> MEM_TO_DEV flow on client side:
> >> get_ptr()
> >> fill in the metadata to the pointer (not exceeding max_len)
> >> set_len() to tell the DMA driver the amount of valid bytes written
> >>
> >> DEV_TO_MEM flow on client side:
> >> In the completion callback, get_ptr()
> >> the metadata is payload_len bytes and can be accessed in the return pointer.
> >
> > I would think to unify this..
>
> I have tried it, but the attach mode and the pointer mode is hard to
> handle with a generic API.
> I will try to find a way to unify things in a sane way.

Hmmm, looking from the description they will be for different methods,
so lets make them orthogonal and not allow driver to register both.

>
> I have moved the metadata_ops to dma_async_tx_descriptor to emphasize
> that it is per descriptor setting:
> https://github.com/omap-audio/linux-audio/commit/02e095d1320a4bb3ae281ddb208ce82ead746f00#diff-92c0a79f414dc3be9dfc67a969c0dd71
>
>
> >> BTW: The driver which is going to need this is now accessible in public:
> >> https://git.ti.com/ti-linux-kernel/ti-linux-kernel/trees/ti-linux-4.14.y/drivers/dma/ti
> >>
> >> or in my wip tree:
> >> https://github.com/omap-audio/linux-audio/tree/peter/ti-linux-4.14.y/wip/drivers/dma/ti
> >>
> >> prefixed with k3-*
> >>
>
> - P?ter
>
> Texas Instruments Finland Oy, Porkkalankatu 22, 00180 Helsinki.
> Y-tunnus/Business ID: 0615521-4. Kotipaikka/Domicile: Helsinki

--
~Vinod

2018-07-30 09:49:48

by Peter Ujfalusi

[permalink] [raw]
Subject: Re: [RFC] dmaengine: Add metadat_ops for dma_async_tx_descriptor

Vinod,

On 2018-07-24 14:14, Vinod wrote:
>>>> Clients must not mix the two way of handling the metadata.
>>>> The set_len() is intended to tell the DMA driver the client provided
>>>> metadata size (in MEM_TO_DEV case mostly).
>>>>
>>>> MEM_TO_DEV flow on client side:
>>>> get_ptr()
>>>> fill in the metadata to the pointer (not exceeding max_len)
>>>> set_len() to tell the DMA driver the amount of valid bytes written
>>>>
>>>> DEV_TO_MEM flow on client side:
>>>> In the completion callback, get_ptr()
>>>> the metadata is payload_len bytes and can be accessed in the return pointer.
>>>
>>> I would think to unify this..
>>
>> I have tried it, but the attach mode and the pointer mode is hard to
>> handle with a generic API.
>> I will try to find a way to unify things in a sane way.
>
> Hmmm, looking from the description they will be for different methods,
> so lets make them orthogonal and not allow driver to register both.

I would allow DMA drivers to register both, but somehow enforce that
clients are not mixing the two distinct way of dealing with the metadata.

The reason for that is for example the attach mode is the simplest (I
implemented it first and I have a client using it), but if the pointer
mode is found to be more efficient and feasible for the DMA then the DMA
driver can implement that mode and the client can move as well w/o
breaking anything.

>
>>
>> I have moved the metadata_ops to dma_async_tx_descriptor to emphasize
>> that it is per descriptor setting:
>> https://github.com/omap-audio/linux-audio/commit/02e095d1320a4bb3ae281ddb208ce82ead746f00#diff-92c0a79f414dc3be9dfc67a969c0dd71
>>
>>
>>>> BTW: The driver which is going to need this is now accessible in public:
>>>> https://git.ti.com/ti-linux-kernel/ti-linux-kernel/trees/ti-linux-4.14.y/drivers/dma/ti
>>>>
>>>> or in my wip tree:
>>>> https://github.com/omap-audio/linux-audio/tree/peter/ti-linux-4.14.y/wip/drivers/dma/ti
>>>>
>>>> prefixed with k3-*
>>>>
>>
>> - Péter
>>
>> Texas Instruments Finland Oy, Porkkalankatu 22, 00180 Helsinki.
>> Y-tunnus/Business ID: 0615521-4. Kotipaikka/Domicile: Helsinki
>

- Péter

Texas Instruments Finland Oy, Porkkalankatu 22, 00180 Helsinki.
Y-tunnus/Business ID: 0615521-4. Kotipaikka/Domicile: Helsinki

2018-07-31 04:30:10

by Vinod Koul

[permalink] [raw]
Subject: Re: [RFC] dmaengine: Add metadat_ops for dma_async_tx_descriptor

On 30-07-18, 12:46, Peter Ujfalusi wrote:
> Vinod,
>
> On 2018-07-24 14:14, Vinod wrote:
> >>>> Clients must not mix the two way of handling the metadata.
> >>>> The set_len() is intended to tell the DMA driver the client provided
> >>>> metadata size (in MEM_TO_DEV case mostly).
> >>>>
> >>>> MEM_TO_DEV flow on client side:
> >>>> get_ptr()
> >>>> fill in the metadata to the pointer (not exceeding max_len)
> >>>> set_len() to tell the DMA driver the amount of valid bytes written
> >>>>
> >>>> DEV_TO_MEM flow on client side:
> >>>> In the completion callback, get_ptr()
> >>>> the metadata is payload_len bytes and can be accessed in the return pointer.
> >>>
> >>> I would think to unify this..
> >>
> >> I have tried it, but the attach mode and the pointer mode is hard to
> >> handle with a generic API.
> >> I will try to find a way to unify things in a sane way.
> >
> > Hmmm, looking from the description they will be for different methods,
> > so lets make them orthogonal and not allow driver to register both.
>
> I would allow DMA drivers to register both, but somehow enforce that
> clients are not mixing the two distinct way of dealing with the metadata.
>
> The reason for that is for example the attach mode is the simplest (I
> implemented it first and I have a client using it), but if the pointer
> mode is found to be more efficient and feasible for the DMA then the DMA
> driver can implement that mode and the client can move as well w/o
> breaking anything.

Sounds reasonable...

--
~Vinod