LinuxLists.cc - [PATCH v4 0/6] dma: edma: Support scatter-lists of any length

2013-08-29 23:06:37

Subject: [PATCH v4 0/6] dma: edma: Support scatter-lists of any length

The following series adds support to EDMA driver to enable DMA of
scatter-gather lists of arbitrary length, but still make use of only
a certain MAX number of slots at a time for a given channel. Thus
free-ing up the rest of the slots to other slaves/channels. With this
there is no need for slave drivers to query the EDMA driver about how
much is the MAX it can send at a time as done in [1]. Drivers can send
SG lists of any number of entries to DMA. Reference discussion at [2].

With this, all the patches for MMC and EDMA related to "sg limits" can be
dropped.

Tested omap-aes and omap_hsmmc drivers with different MAX number of slots,
even just 1. In the case where it is 1, only 1-slot is used to DMA an
entire scatter list of arbitrary length.
Since this series touches EDMA private API code also shared with davinci-pcm,
playback of a 16-bit 44.1KHz audio file with davinci-pcm has been tested.

Sample test run with 1 vs 16 (MAX number of slots/SG) in omap-aes driver:
MAX slots = 1:
(128 bit key, 8192 byte blocks): 1266 operations in 1 seconds (10371072 bytes)
MAX slots = 16:
(128 bit key, 8192 byte blocks): 1601 operations in 1 seconds (13115392 bytes)

Note: For the above test, 8K buffer is mapped into SG list of size 2 so
only 2 slots are required. So beyond size 2, there will not be any noticeable
performance improvement. But above experiment just shows as proof of concept
that even using 1 slot is managed by just DMA'ing 1 SG entry at a time.

Patch series history:
v1: The initial patch series
v2: Several style related cleanups.
v3: Rewrote major portions of the series to avoid usage of error interrupts.
v4: Go back to v1: After performing tests, it is determined that the v1
series is a good enough approach and that v3 triggers problems in certain cases.
Tests have shown that there is not any noticeable performance difference between
using vs not using error interrupts. v4 also squashes some patches in v1.
At this point, we are freezing the overall architecture of the patch series to v4.

[1] https://lkml.org/lkml/2013/7/18/432
[2] http://marc.info/?l=linux-omap&m=137416733628831&w=2

Joel Fernandes (6):
dma: edma: Setup parameters to DMA MAX_NR_SG at a time
dma: edma: Write out and handle MAX_NR_SG at a given time
ARM: edma: Add function to manually trigger an EDMA channel
dma: edma: Find missed events and issue them
dma: edma: Leave linked to Null slot instead of DUMMY slot
dma: edma: Remove limits on number of slots

arch/arm/common/edma.c | 17 ++++
drivers/dma/edma.c | 164 ++++++++++++++++++++++++++++---------
include/linux/platform_data/edma.h | 2 +
3 files changed, 144 insertions(+), 39 deletions(-)

--
1.8.1.2

2013-08-29 23:06:41

by Joel Fernandes

[permalink] [raw]

Subject: [PATCH v4 3/6] ARM: edma: Add function to manually trigger an EDMA channel

Manual trigger for events missed as a result of splitting a
scatter gather list and DMA'ing it in batches. Add a helper
function to trigger a channel incase any such events are missed.

Signed-off-by: Joel Fernandes <[email protected]>
---
arch/arm/common/edma.c | 17 +++++++++++++++++
include/linux/platform_data/edma.h | 2 ++
2 files changed, 19 insertions(+)

diff --git a/arch/arm/common/edma.c b/arch/arm/common/edma.c
index 3567ba1..67e8cf5 100644
--- a/arch/arm/common/edma.c
+++ b/arch/arm/common/edma.c
@@ -1236,6 +1236,23 @@ void edma_resume(unsigned channel)
}
EXPORT_SYMBOL(edma_resume);

+int edma_trigger_channel(unsigned channel)
+{
+ unsigned ctlr;
+ unsigned int mask;
+
+ ctlr = EDMA_CTLR(channel);
+ channel = EDMA_CHAN_SLOT(channel);
+ mask = BIT(channel & 0x1f);
+
+ edma_shadow0_write_array(ctlr, SH_ESR, (channel >> 5), mask);
+
+ pr_debug("EDMA: ESR%d %08x\n", (channel >> 5),
+ edma_shadow0_read_array(ctlr, SH_ESR, (channel >> 5)));
+ return 0;
+}
+EXPORT_SYMBOL(edma_trigger_channel);
+
/**
* edma_start - start dma on a channel
* @channel: channel being activated
diff --git a/include/linux/platform_data/edma.h b/include/linux/platform_data/edma.h
index 57300fd..179fb91 100644
--- a/include/linux/platform_data/edma.h
+++ b/include/linux/platform_data/edma.h
@@ -180,4 +180,6 @@ struct edma_soc_info {
const s16 (*xbar_chans)[2];
};

+int edma_trigger_channel(unsigned);
+
#endif
--
1.8.1.2

2013-08-29 23:06:40

by Joel Fernandes

[permalink] [raw]

Subject: [PATCH v4 5/6] dma: edma: Leave linked to Null slot instead of DUMMY slot

Dummy slot has been used as a way for missed-events not to be
reported as missing. This has been particularly troublesome for cases
where we might want to temporarily pause all incoming events.

For EDMA DMAC, there is no way to do any such pausing of events as
the occurence of the "next" event is not software controlled.
Using "edma_pause" in IRQ handlers doesn't help as by then the event
in concern from the slave is already missed.

Linking a dummy slot, is seen to absorb these events which we didn't
want to miss. So we don't link to dummy, but instead leave it linked
to NULL set, allow an error condition and detect the channel that
missed it.

Signed-off-by: Joel Fernandes <[email protected]>

dma: edma: Link to dummy slot only for last SG list split

Consider the case where we have a scatter-list like:
SG1->SG2->SG3->SG4->SG5->SG6->Null

For ex, for a MAX_NR_SG of 2, earlier we were splitting this as:
SG1->SG2->Null
SG3->SG4->Null
SG5->SG6->Null

Now we split it as
SG1->SG2->Null
SG3->SG4->Null
SG5->SG6->Dummy

This approach results in lesser unwanted interrupts that occur
for the last list split. The Dummy slot has the property of not
raising an error condition if events are missed unlike the Null
slot. We are OK with this as we're done with processing the
whole list once we reach Dummy.

Signed-off-by: Joel Fernandes <[email protected]>
---
drivers/dma/edma.c | 11 ++++++++---
1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/drivers/dma/edma.c b/drivers/dma/edma.c
index 0e77b57..25e47d4 100644
--- a/drivers/dma/edma.c
+++ b/drivers/dma/edma.c
@@ -158,13 +158,18 @@ static void edma_execute(struct edma_chan *echan)
/* Link to the previous slot if not the last set */
if (i != (nslots - 1))
edma_link(echan->slot[i], echan->slot[i+1]);
- /* Final pset links to the dummy pset */
- else
- edma_link(echan->slot[i], echan->ecc->dummy_slot);
}

edesc->processed += nslots;

+ /*
+ * If this is either the last set in a set of SG-list transactions
+ * then setup a link to the dummy slot, this results in all future
+ * events being absorbed and that's OK because we're done
+ */
+ if (edesc->processed == edesc->pset_nr)
+ edma_link(echan->slot[nslots-1], echan->ecc->dummy_slot);
+
edma_resume(echan->ch_num);

if (edesc->processed <= MAX_NR_SG) {
--
1.8.1.2

2013-08-29 23:07:05

by Joel Fernandes

[permalink] [raw]

Subject: [PATCH v4 1/6] dma: edma: Setup parameters to DMA MAX_NR_SG at a time

Changes are made here for configuring existing parameters to support
DMA'ing them out in batches as needed.

Also allocate as many as slots as needed by the SG list, but not more
than MAX_NR_SG. Then these slots will be reused accordingly.
For ex, if MAX_NR_SG=10, and number of SG entries is 40, still only
10 slots will be allocated to DMA the entire SG list of size 40.

Also enable TC interrupts for slots that are a last in a current
iteration, or that fall on a MAX_NR_SG boundary.

Signed-off-by: Joel Fernandes <[email protected]>
---
drivers/dma/edma.c | 18 +++++++++++++++---
1 file changed, 15 insertions(+), 3 deletions(-)

diff --git a/drivers/dma/edma.c b/drivers/dma/edma.c
index 5f3e532..e522ad5 100644
--- a/drivers/dma/edma.c
+++ b/drivers/dma/edma.c
@@ -222,9 +222,9 @@ static struct dma_async_tx_descriptor *edma_prep_slave_sg(
enum dma_slave_buswidth dev_width;
u32 burst;
struct scatterlist *sg;
- int i;
int acnt, bcnt, ccnt, src, dst, cidx;
int src_bidx, dst_bidx, src_cidx, dst_cidx;
+ int i, nslots;

if (unlikely(!echan || !sgl || !sg_len))
return NULL;
@@ -262,8 +262,10 @@ static struct dma_async_tx_descriptor *edma_prep_slave_sg(

edesc->pset_nr = sg_len;

- for_each_sg(sgl, sg, sg_len, i) {
- /* Allocate a PaRAM slot, if needed */
+ /* Allocate a PaRAM slot, if needed */
+ nslots = min_t(unsigned, MAX_NR_SG, sg_len);
+
+ for (i = 0; i < nslots; i++) {
if (echan->slot[i] < 0) {
echan->slot[i] =
edma_alloc_slot(EDMA_CTLR(echan->ch_num),
@@ -273,6 +275,10 @@ static struct dma_async_tx_descriptor *edma_prep_slave_sg(
return NULL;
}
}
+ }
+
+ /* Configure PaRAM sets for each SG */
+ for_each_sg(sgl, sg, sg_len, i) {

acnt = dev_width;

@@ -330,6 +336,12 @@ static struct dma_async_tx_descriptor *edma_prep_slave_sg(
/* Configure A or AB synchronized transfers */
if (edesc->absync)
edesc->pset[i].opt |= SYNCDIM;
+
+ /* If this is the last in a current SG set of transactions,
+ enable interrupts so that next set is processed */
+ if (!((i+1) % MAX_NR_SG))
+ edesc->pset[i].opt |= TCINTEN;
+
/* If this is the last set, enable completion interrupt flag */
if (i == sg_len - 1)
edesc->pset[i].opt |= TCINTEN;
--
1.8.1.2

2013-08-29 23:06:36

by Joel Fernandes

[permalink] [raw]

Subject: [PATCH v4 4/6] dma: edma: Find missed events and issue them

In an effort to move to using Scatter gather lists of any size with
EDMA as discussed at [1] instead of placing limitations on the driver,
we work through the limitations of the EDMAC hardware to find missed
events and issue them.

The sequence of events that require this are:

For the scenario where MAX slots for an EDMA channel is 3:

SG1 -> SG2 -> SG3 -> SG4 -> SG5 -> SG6 -> Null

The above SG list will have to be DMA'd in 2 sets:

(1) SG1 -> SG2 -> SG3 -> Null
(2) SG4 -> SG5 -> SG6 -> Null

After (1) is succesfully transferred, the events from the MMC controller
donot stop coming and are missed by the time we have setup the transfer
for (2). So here, we catch the events missed as an error condition and
issue them manually.

In the second part of the patch, we make handle the NULL slot cases:
For crypto IP, we continue to receive events even continuously in
NULL slot, the setup of the next set of SG elements happens after
the error handler executes. This is results in some recursion problems.
Due to this, we continously receive error interrupts when we manually
trigger an event from the error handler.

We fix this, by first detecting if the Channel is currently transferring
from a NULL slot or not, that's where the edma_read_slot in the error
callback from interrupt handler comes in. With this we can determine if
the set up of the next SG list has completed, and we manually trigger
only in this case. If the setup has _not_ completed, we are still in NULL
so we just set a missed flag and allow the manual triggerring to happen
in edma_execute which will be eventually called. This fixes the above
mentioned race conditions seen with the crypto drivers.

[1] http://marc.info/?l=linux-omap&m=137416733628831&w=2

Signed-off-by: Joel Fernandes <[email protected]>
---
drivers/dma/edma.c | 50 +++++++++++++++++++++++++++++++++++++++++++++++++-
1 file changed, 49 insertions(+), 1 deletion(-)

diff --git a/drivers/dma/edma.c b/drivers/dma/edma.c
index 732829b..0e77b57 100644
--- a/drivers/dma/edma.c
+++ b/drivers/dma/edma.c
@@ -70,6 +70,7 @@ struct edma_chan {
int ch_num;
bool alloced;
int slot[EDMA_MAX_SLOTS];
+ int missed;
struct dma_slave_config cfg;
};

@@ -170,6 +171,20 @@ static void edma_execute(struct edma_chan *echan)
dev_dbg(dev, "first transfer starting %d\n", echan->ch_num);
edma_start(echan->ch_num);
}
+
+ /*
+ * This happens due to setup times between intermediate transfers
+ * in long SG lists which have to be broken up into transfers of
+ * MAX_NR_SG
+ */
+ if (echan->missed) {
+ dev_dbg(dev, "missed event in execute detected\n");
+ edma_clean_channel(echan->ch_num);
+ edma_stop(echan->ch_num);
+ edma_start(echan->ch_num);
+ edma_trigger_channel(echan->ch_num);
+ echan->missed = 0;
+ }
}

static int edma_terminate_all(struct edma_chan *echan)
@@ -387,6 +402,7 @@ static void edma_callback(unsigned ch_num, u16 ch_status, void *data)
struct device *dev = echan->vchan.chan.device->dev;
struct edma_desc *edesc;
unsigned long flags;
+ struct edmacc_param p;

/* Pause the channel */
edma_pause(echan->ch_num);
@@ -414,7 +430,39 @@ static void edma_callback(unsigned ch_num, u16 ch_status, void *data)

break;
case DMA_CC_ERROR:
- dev_dbg(dev, "transfer error on channel %d\n", ch_num);
+ spin_lock_irqsave(&echan->vchan.lock, flags);
+
+ edma_read_slot(EDMA_CHAN_SLOT(echan->slot[0]), &p);
+
+ /*
+ * Issue later based on missed flag which will be sure
+ * to happen as:
+ * (1) we finished transmitting an intermediate slot and
+ * edma_execute is coming up.
+ * (2) or we finished current transfer and issue will
+ * call edma_execute.
+ *
+ * Important note: issuing can be dangerous here and
+ * lead to some nasty recursion when we are in a NULL
+ * slot. So we avoid doing so and set the missed flag.
+ */
+ if (p.a_b_cnt == 0 && p.ccnt == 0) {
+ dev_dbg(dev, "Error occurred, looks like slot is null, just setting miss\n");
+ echan->missed = 1;
+ } else {
+ /*
+ * The slot is already programmed but the event got
+ * missed, so its safe to issue it here.
+ */
+ dev_dbg(dev, "Error occurred but slot is non-null, TRIGGERING\n");
+ edma_clean_channel(echan->ch_num);
+ edma_stop(echan->ch_num);
+ edma_start(echan->ch_num);
+ edma_trigger_channel(echan->ch_num);
+ }
+
+ spin_unlock_irqrestore(&echan->vchan.lock, flags);
+
break;
default:
break;
--
1.8.1.2

2013-08-29 23:09:23

by Joel Fernandes

[permalink] [raw]

Subject: [PATCH v4 2/6] dma: edma: Write out and handle MAX_NR_SG at a given time

Process SG-elements in batches of MAX_NR_SG if they are greater
than MAX_NR_SG. Due to this, at any given time only those many
slots will be used in the given channel no matter how long the
scatter list is. We keep track of how much has been written
inorder to process the next batch of elements in the scatter-list
and detect completion.

For such intermediate transfer completions (one batch of MAX_NR_SG),
make use of pause and resume functions instead of start and stop
when such intermediate transfer is in progress or completed as we
donot want to clear any pending events.

Signed-off-by: Joel Fernandes <[email protected]>
---
drivers/dma/edma.c | 79 ++++++++++++++++++++++++++++++++++++------------------
1 file changed, 53 insertions(+), 26 deletions(-)

diff --git a/drivers/dma/edma.c b/drivers/dma/edma.c
index e522ad5..732829b 100644
--- a/drivers/dma/edma.c
+++ b/drivers/dma/edma.c
@@ -56,6 +56,7 @@ struct edma_desc {
struct list_head node;
int absync;
int pset_nr;
+ int processed;
struct edmacc_param pset[0];
};

@@ -104,22 +105,34 @@ static void edma_desc_free(struct virt_dma_desc *vdesc)
/* Dispatch a queued descriptor to the controller (caller holds lock) */
static void edma_execute(struct edma_chan *echan)
{
- struct virt_dma_desc *vdesc = vchan_next_desc(&echan->vchan);
+ struct virt_dma_desc *vdesc;
struct edma_desc *edesc;
- int i;
-
- if (!vdesc) {
- echan->edesc = NULL;
- return;
+ struct device *dev = echan->vchan.chan.device->dev;
+ int i, j, left, nslots;
+
+ /* If either we processed all psets or we're still not started */
+ if (!echan->edesc ||
+ echan->edesc->pset_nr == echan->edesc->processed) {
+ /* Get next vdesc */
+ vdesc = vchan_next_desc(&echan->vchan);
+ if (!vdesc) {
+ echan->edesc = NULL;
+ return;
+ }
+ list_del(&vdesc->node);
+ echan->edesc = to_edma_desc(&vdesc->tx);
}

- list_del(&vdesc->node);
+ edesc = echan->edesc;

- echan->edesc = edesc = to_edma_desc(&vdesc->tx);
+ /* Find out how many left */
+ left = edesc->pset_nr - edesc->processed;
+ nslots = min(MAX_NR_SG, left);

/* Write descriptor PaRAM set(s) */
- for (i = 0; i < edesc->pset_nr; i++) {
- edma_write_slot(echan->slot[i], &edesc->pset[i]);
+ for (i = 0; i < nslots; i++) {
+ j = i + edesc->processed;
+ edma_write_slot(echan->slot[i], &edesc->pset[j]);
dev_dbg(echan->vchan.chan.device->dev,
"\n pset[%d]:\n"
" chnum\t%d\n"
@@ -132,24 +145,31 @@ static void edma_execute(struct edma_chan *echan)
" bidx\t%08x\n"
" cidx\t%08x\n"
" lkrld\t%08x\n",
- i, echan->ch_num, echan->slot[i],
- edesc->pset[i].opt,
- edesc->pset[i].src,
- edesc->pset[i].dst,
- edesc->pset[i].a_b_cnt,
- edesc->pset[i].ccnt,
- edesc->pset[i].src_dst_bidx,
- edesc->pset[i].src_dst_cidx,
- edesc->pset[i].link_bcntrld);
+ j, echan->ch_num, echan->slot[i],
+ edesc->pset[j].opt,
+ edesc->pset[j].src,
+ edesc->pset[j].dst,
+ edesc->pset[j].a_b_cnt,
+ edesc->pset[j].ccnt,
+ edesc->pset[j].src_dst_bidx,
+ edesc->pset[j].src_dst_cidx,
+ edesc->pset[j].link_bcntrld);
/* Link to the previous slot if not the last set */
- if (i != (edesc->pset_nr - 1))
+ if (i != (nslots - 1))
edma_link(echan->slot[i], echan->slot[i+1]);
/* Final pset links to the dummy pset */
else
edma_link(echan->slot[i], echan->ecc->dummy_slot);
}

- edma_start(echan->ch_num);
+ edesc->processed += nslots;
+
+ edma_resume(echan->ch_num);
+
+ if (edesc->processed <= MAX_NR_SG) {
+ dev_dbg(dev, "first transfer starting %d\n", echan->ch_num);
+ edma_start(echan->ch_num);
+ }
}

static int edma_terminate_all(struct edma_chan *echan)
@@ -368,19 +388,26 @@ static void edma_callback(unsigned ch_num, u16 ch_status, void *data)
struct edma_desc *edesc;
unsigned long flags;

- /* Stop the channel */
- edma_stop(echan->ch_num);
+ /* Pause the channel */
+ edma_pause(echan->ch_num);

switch (ch_status) {
case DMA_COMPLETE:
- dev_dbg(dev, "transfer complete on channel %d\n", ch_num);
-
spin_lock_irqsave(&echan->vchan.lock, flags);

edesc = echan->edesc;
if (edesc) {
+ if (edesc->processed == edesc->pset_nr) {
+ dev_dbg(dev, "transfer complete." \
+ " stopping channel %d\n", ch_num);
+ edma_stop(echan->ch_num);
+ vchan_cookie_complete(&edesc->vdesc);
+ } else {
+ dev_dbg(dev, "Intermediate transfer complete" \
+ " on channel %d\n", ch_num);
+ }
+
edma_execute(echan);
- vchan_cookie_complete(&edesc->vdesc);
}

spin_unlock_irqrestore(&echan->vchan.lock, flags);
--
1.8.1.2

2013-08-29 23:09:58

by Joel Fernandes

[permalink] [raw]

Subject: [PATCH v4 6/6] dma: edma: Remove limits on number of slots

With this series, this check is no longer required and
we shouldn't need to reject drivers DMA'ing more than the
MAX number of slots.

Signed-off-by: Joel Fernandes <[email protected]>
---
drivers/dma/edma.c | 6 ------
1 file changed, 6 deletions(-)

diff --git a/drivers/dma/edma.c b/drivers/dma/edma.c
index 25e47d4..d966413 100644
--- a/drivers/dma/edma.c
+++ b/drivers/dma/edma.c
@@ -287,12 +287,6 @@ static struct dma_async_tx_descriptor *edma_prep_slave_sg(
return NULL;
}

- if (sg_len > MAX_NR_SG) {
- dev_err(dev, "Exceeded max SG segments %d > %d\n",
- sg_len, MAX_NR_SG);
- return NULL;
- }
-
edesc = kzalloc(sizeof(*edesc) + sg_len *
sizeof(edesc->pset[0]), GFP_ATOMIC);
if (!edesc) {
--
1.8.1.2

2013-09-02 12:46:21

by Vinod Koul

[permalink] [raw]

Subject: Re: [PATCH v4 0/6] dma: edma: Support scatter-lists of any length

On Thu, 2013-08-29 at 18:05 -0500, Joel Fernandes wrote:
> The following series adds support to EDMA driver to enable DMA of
> scatter-gather lists of arbitrary length, but still make use of only
> a certain MAX number of slots at a time for a given channel. Thus
> free-ing up the rest of the slots to other slaves/channels. With this
> there is no need for slave drivers to query the EDMA driver about how
> much is the MAX it can send at a time as done in [1]. Drivers can send
> SG lists of any number of entries to DMA. Reference discussion at [2].
>
> With this, all the patches for MMC and EDMA related to "sg limits" can be
> dropped.
>
> Tested omap-aes and omap_hsmmc drivers with different MAX number of slots,
> even just 1. In the case where it is 1, only 1-slot is used to DMA an
> entire scatter list of arbitrary length.
> Since this series touches EDMA private API code also shared with davinci-pcm,
> playback of a 16-bit 44.1KHz audio file with davinci-pcm has been tested.
>
> Sample test run with 1 vs 16 (MAX number of slots/SG) in omap-aes driver:
> MAX slots = 1:
> (128 bit key, 8192 byte blocks): 1266 operations in 1 seconds (10371072 bytes)
> MAX slots = 16:
> (128 bit key, 8192 byte blocks): 1601 operations in 1 seconds (13115392 bytes)
>
> Note: For the above test, 8K buffer is mapped into SG list of size 2 so
> only 2 slots are required. So beyond size 2, there will not be any noticeable
> performance improvement. But above experiment just shows as proof of concept
> that even using 1 slot is managed by just DMA'ing 1 SG entry at a time.
much better series, thanks

I think i am okay with this, if anyone has objections pls speak up. Also
I need ack on the ARM patch 3/6 before I can carry this.

--
Vinod Koul
Intel Corp.

2013-09-02 14:09:32

by Sekhar Nori

[permalink] [raw]

Subject: Re: [PATCH v4 3/6] ARM: edma: Add function to manually trigger an EDMA channel

On 8/30/2013 4:35 AM, Joel Fernandes wrote:
> Manual trigger for events missed as a result of splitting a
> scatter gather list and DMA'ing it in batches. Add a helper
> function to trigger a channel incase any such events are missed.
>
> Signed-off-by: Joel Fernandes <[email protected]>

Acked-by: Sekhar Nori <[email protected]>

Thanks,
Sekhar

2013-09-03 04:54:44

by Vinod Koul

[permalink] [raw]

Subject: Re: [PATCH v4 2/6] dma: edma: Write out and handle MAX_NR_SG at a given time

On Thu, Aug 29, 2013 at 06:05:41PM -0500, Joel Fernandes wrote:
> Process SG-elements in batches of MAX_NR_SG if they are greater
> than MAX_NR_SG. Due to this, at any given time only those many
> slots will be used in the given channel no matter how long the
> scatter list is. We keep track of how much has been written
> inorder to process the next batch of elements in the scatter-list
> and detect completion.
>
> For such intermediate transfer completions (one batch of MAX_NR_SG),
> make use of pause and resume functions instead of start and stop
> when such intermediate transfer is in progress or completed as we
> donot want to clear any pending events.
>
> Signed-off-by: Joel Fernandes <[email protected]>
> ---
> drivers/dma/edma.c | 79 ++++++++++++++++++++++++++++++++++++------------------
> 1 file changed, 53 insertions(+), 26 deletions(-)
>
> diff --git a/drivers/dma/edma.c b/drivers/dma/edma.c
> index e522ad5..732829b 100644
> --- a/drivers/dma/edma.c
> +++ b/drivers/dma/edma.c
> @@ -56,6 +56,7 @@ struct edma_desc {
> struct list_head node;
> int absync;
> int pset_nr;
> + int processed;
> struct edmacc_param pset[0];
> };
>
> @@ -104,22 +105,34 @@ static void edma_desc_free(struct virt_dma_desc *vdesc)
> /* Dispatch a queued descriptor to the controller (caller holds lock) */
> static void edma_execute(struct edma_chan *echan)
> {
> - struct virt_dma_desc *vdesc = vchan_next_desc(&echan->vchan);
> + struct virt_dma_desc *vdesc;
> struct edma_desc *edesc;
> - int i;
> -
> - if (!vdesc) {
> - echan->edesc = NULL;
> - return;
> + struct device *dev = echan->vchan.chan.device->dev;
> + int i, j, left, nslots;
> +
> + /* If either we processed all psets or we're still not started */
> + if (!echan->edesc ||
> + echan->edesc->pset_nr == echan->edesc->processed) {
> + /* Get next vdesc */
> + vdesc = vchan_next_desc(&echan->vchan);
> + if (!vdesc) {
> + echan->edesc = NULL;
> + return;
> + }
> + list_del(&vdesc->node);
> + echan->edesc = to_edma_desc(&vdesc->tx);
> }
>
> - list_del(&vdesc->node);
> + edesc = echan->edesc;
>
> - echan->edesc = edesc = to_edma_desc(&vdesc->tx);
> + /* Find out how many left */
> + left = edesc->pset_nr - edesc->processed;
> + nslots = min(MAX_NR_SG, left);
>
> /* Write descriptor PaRAM set(s) */
> - for (i = 0; i < edesc->pset_nr; i++) {
> - edma_write_slot(echan->slot[i], &edesc->pset[i]);
> + for (i = 0; i < nslots; i++) {
> + j = i + edesc->processed;
> + edma_write_slot(echan->slot[i], &edesc->pset[j]);
> dev_dbg(echan->vchan.chan.device->dev,
> "\n pset[%d]:\n"
> " chnum\t%d\n"
> @@ -132,24 +145,31 @@ static void edma_execute(struct edma_chan *echan)
> " bidx\t%08x\n"
> " cidx\t%08x\n"
> " lkrld\t%08x\n",
> - i, echan->ch_num, echan->slot[i],
> - edesc->pset[i].opt,
> - edesc->pset[i].src,
> - edesc->pset[i].dst,
> - edesc->pset[i].a_b_cnt,
> - edesc->pset[i].ccnt,
> - edesc->pset[i].src_dst_bidx,
> - edesc->pset[i].src_dst_cidx,
> - edesc->pset[i].link_bcntrld);
> + j, echan->ch_num, echan->slot[i],
> + edesc->pset[j].opt,
> + edesc->pset[j].src,
> + edesc->pset[j].dst,
> + edesc->pset[j].a_b_cnt,
> + edesc->pset[j].ccnt,
> + edesc->pset[j].src_dst_bidx,
> + edesc->pset[j].src_dst_cidx,
> + edesc->pset[j].link_bcntrld);
> /* Link to the previous slot if not the last set */
> - if (i != (edesc->pset_nr - 1))
> + if (i != (nslots - 1))
> edma_link(echan->slot[i], echan->slot[i+1]);
> /* Final pset links to the dummy pset */
> else
> edma_link(echan->slot[i], echan->ecc->dummy_slot);
> }
>
> - edma_start(echan->ch_num);
> + edesc->processed += nslots;
> +
> + edma_resume(echan->ch_num);
> +
> + if (edesc->processed <= MAX_NR_SG) {
> + dev_dbg(dev, "first transfer starting %d\n", echan->ch_num);
> + edma_start(echan->ch_num);
> + }
> }
>
> static int edma_terminate_all(struct edma_chan *echan)
> @@ -368,19 +388,26 @@ static void edma_callback(unsigned ch_num, u16 ch_status, void *data)
> struct edma_desc *edesc;
> unsigned long flags;
>
> - /* Stop the channel */
> - edma_stop(echan->ch_num);
> + /* Pause the channel */
> + edma_pause(echan->ch_num);
>
> switch (ch_status) {
> case DMA_COMPLETE:
> - dev_dbg(dev, "transfer complete on channel %d\n", ch_num);
> -
> spin_lock_irqsave(&echan->vchan.lock, flags);
>
> edesc = echan->edesc;
> if (edesc) {
> + if (edesc->processed == edesc->pset_nr) {
> + dev_dbg(dev, "transfer complete." \
> + " stopping channel %d\n", ch_num);
> + edma_stop(echan->ch_num);
> + vchan_cookie_complete(&edesc->vdesc);
> + } else {
> + dev_dbg(dev, "Intermediate transfer complete" \
> + " on channel %d\n", ch_num);
No, these two not right... Consider someone seeing this message will try to
grep "transfer complete stopping channel" and will fail miserable to find the
offending line. And this makes it uglier...

Also I believe checkpatch has this check, don't you run that before sending
patches?

Quoting CodingStyle:

Chapter 2: Breaking long lines and strings

Coding style is all about readability and maintainability using commonly
available tools.

The limit on the length of lines is 80 columns and this is a strongly
preferred limit.

Statements longer than 80 columns will be broken into sensible chunks, unless
exceeding 80 columns significantly increases readability and does not hide
information. Descendants are always substantially shorter than the parent and
are placed substantially to the right. The same applies to function headers
with a long argument list. However, never break user-visible strings such as
printk messages, because that breaks the ability to grep for them.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

~Vinod

2013-09-03 15:03:38

by Joel Fernandes

[permalink] [raw]

Subject: Re: [PATCH v4 2/6] dma: edma: Write out and handle MAX_NR_SG at a given time

On 09/02/2013 11:08 PM, Vinod Koul wrote:
> On Thu, Aug 29, 2013 at 06:05:41PM -0500, Joel Fernandes wrote:
>> Process SG-elements in batches of MAX_NR_SG if they are greater
>> than MAX_NR_SG. Due to this, at any given time only those many
>> slots will be used in the given channel no matter how long the
>> scatter list is. We keep track of how much has been written
>> inorder to process the next batch of elements in the scatter-list
>> and detect completion.
>>
>> For such intermediate transfer completions (one batch of MAX_NR_SG),
>> make use of pause and resume functions instead of start and stop
>> when such intermediate transfer is in progress or completed as we
>> donot want to clear any pending events.
>>
>> Signed-off-by: Joel Fernandes <[email protected]>
>> ---
>> drivers/dma/edma.c | 79 ++++++++++++++++++++++++++++++++++++------------------
>> 1 file changed, 53 insertions(+), 26 deletions(-)
>>
>> diff --git a/drivers/dma/edma.c b/drivers/dma/edma.c
>> index e522ad5..732829b 100644
>> --- a/drivers/dma/edma.c
>> +++ b/drivers/dma/edma.c
>> @@ -56,6 +56,7 @@ struct edma_desc {
>> struct list_head node;
>> int absync;
>> int pset_nr;
>> + int processed;
>> struct edmacc_param pset[0];
>> };
>>
>> @@ -104,22 +105,34 @@ static void edma_desc_free(struct virt_dma_desc *vdesc)
>> /* Dispatch a queued descriptor to the controller (caller holds lock) */
>> static void edma_execute(struct edma_chan *echan)
>> {
>> - struct virt_dma_desc *vdesc = vchan_next_desc(&echan->vchan);
>> + struct virt_dma_desc *vdesc;
>> struct edma_desc *edesc;
>> - int i;
>> -
>> - if (!vdesc) {
>> - echan->edesc = NULL;
>> - return;
>> + struct device *dev = echan->vchan.chan.device->dev;
>> + int i, j, left, nslots;
>> +
>> + /* If either we processed all psets or we're still not started */
>> + if (!echan->edesc ||
>> + echan->edesc->pset_nr == echan->edesc->processed) {
>> + /* Get next vdesc */
>> + vdesc = vchan_next_desc(&echan->vchan);
>> + if (!vdesc) {
>> + echan->edesc = NULL;
>> + return;
>> + }
>> + list_del(&vdesc->node);
>> + echan->edesc = to_edma_desc(&vdesc->tx);
>> }
>>
>> - list_del(&vdesc->node);
>> + edesc = echan->edesc;
>>
>> - echan->edesc = edesc = to_edma_desc(&vdesc->tx);
>> + /* Find out how many left */
>> + left = edesc->pset_nr - edesc->processed;
>> + nslots = min(MAX_NR_SG, left);
>>
>> /* Write descriptor PaRAM set(s) */
>> - for (i = 0; i < edesc->pset_nr; i++) {
>> - edma_write_slot(echan->slot[i], &edesc->pset[i]);
>> + for (i = 0; i < nslots; i++) {
>> + j = i + edesc->processed;
>> + edma_write_slot(echan->slot[i], &edesc->pset[j]);
>> dev_dbg(echan->vchan.chan.device->dev,
>> "\n pset[%d]:\n"
>> " chnum\t%d\n"
>> @@ -132,24 +145,31 @@ static void edma_execute(struct edma_chan *echan)
>> " bidx\t%08x\n"
>> " cidx\t%08x\n"
>> " lkrld\t%08x\n",
>> - i, echan->ch_num, echan->slot[i],
>> - edesc->pset[i].opt,
>> - edesc->pset[i].src,
>> - edesc->pset[i].dst,
>> - edesc->pset[i].a_b_cnt,
>> - edesc->pset[i].ccnt,
>> - edesc->pset[i].src_dst_bidx,
>> - edesc->pset[i].src_dst_cidx,
>> - edesc->pset[i].link_bcntrld);
>> + j, echan->ch_num, echan->slot[i],
>> + edesc->pset[j].opt,
>> + edesc->pset[j].src,
>> + edesc->pset[j].dst,
>> + edesc->pset[j].a_b_cnt,
>> + edesc->pset[j].ccnt,
>> + edesc->pset[j].src_dst_bidx,
>> + edesc->pset[j].src_dst_cidx,
>> + edesc->pset[j].link_bcntrld);
>> /* Link to the previous slot if not the last set */
>> - if (i != (edesc->pset_nr - 1))
>> + if (i != (nslots - 1))
>> edma_link(echan->slot[i], echan->slot[i+1]);
>> /* Final pset links to the dummy pset */
>> else
>> edma_link(echan->slot[i], echan->ecc->dummy_slot);
>> }
>>
>> - edma_start(echan->ch_num);
>> + edesc->processed += nslots;
>> +
>> + edma_resume(echan->ch_num);
>> +
>> + if (edesc->processed <= MAX_NR_SG) {
>> + dev_dbg(dev, "first transfer starting %d\n", echan->ch_num);
>> + edma_start(echan->ch_num);
>> + }
>> }
>>
>> static int edma_terminate_all(struct edma_chan *echan)
>> @@ -368,19 +388,26 @@ static void edma_callback(unsigned ch_num, u16 ch_status, void *data)
>> struct edma_desc *edesc;
>> unsigned long flags;
>>
>> - /* Stop the channel */
>> - edma_stop(echan->ch_num);
>> + /* Pause the channel */
>> + edma_pause(echan->ch_num);
>>
>> switch (ch_status) {
>> case DMA_COMPLETE:
>> - dev_dbg(dev, "transfer complete on channel %d\n", ch_num);
>> -
>> spin_lock_irqsave(&echan->vchan.lock, flags);
>>
>> edesc = echan->edesc;
>> if (edesc) {
>> + if (edesc->processed == edesc->pset_nr) {
>> + dev_dbg(dev, "transfer complete." \
>> + " stopping channel %d\n", ch_num);
>> + edma_stop(echan->ch_num);
>> + vchan_cookie_complete(&edesc->vdesc);
>> + } else {
>> + dev_dbg(dev, "Intermediate transfer complete" \
>> + " on channel %d\n", ch_num);
> No, these two not right... Consider someone seeing this message will try to
> grep "transfer complete stopping channel" and will fail miserable to find the
> offending line. And this makes it uglier...
>
> Also I believe checkpatch has this check, don't you run that before sending
> patches?

Sorry I did run checkpatch but the >80 char warnings were still there. Thanks
for pointing out that user-visible strings should not be split.

>
> Quoting CodingStyle:
>
> Chapter 2: Breaking long lines and strings
>
> Coding style is all about readability and maintainability using commonly
> available tools.
>
> The limit on the length of lines is 80 columns and this is a strongly
> preferred limit.
>
> Statements longer than 80 columns will be broken into sensible chunks, unless
> exceeding 80 columns significantly increases readability and does not hide
> information. Descendants are always substantially shorter than the parent and
> are placed substantially to the right. The same applies to function headers
> with a long argument list. However, never break user-visible strings such as
> printk messages, because that breaks the ability to grep for them.
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Sure, I removed the string split and a generate a new patch below, Thanks.

---8<---
From: Joel Fernandes <[email protected]>
Subject: [PATCH v4 2/6] dma: edma: Write out and handle MAX_NR_SG at a given
time

Process SG-elements in batches of MAX_NR_SG if they are greater
than MAX_NR_SG. Due to this, at any given time only those many
slots will be used in the given channel no matter how long the
scatter list is. We keep track of how much has been written
inorder to process the next batch of elements in the scatter-list
and detect completion.

For such intermediate transfer completions (one batch of MAX_NR_SG),
make use of pause and resume functions instead of start and stop
when such intermediate transfer is in progress or completed as we
donot want to clear any pending events.

Signed-off-by: Joel Fernandes <[email protected]>
---
drivers/dma/edma.c | 77 ++++++++++++++++++++++++++++++++++++------------------
1 file changed, 51 insertions(+), 26 deletions(-)

diff --git a/drivers/dma/edma.c b/drivers/dma/edma.c
index e522ad5..f5232bc 100644
--- a/drivers/dma/edma.c
+++ b/drivers/dma/edma.c
@@ -56,6 +56,7 @@ struct edma_desc {
struct list_head node;
int absync;
int pset_nr;
+ int processed;
struct edmacc_param pset[0];
};

@@ -104,22 +105,34 @@ static void edma_desc_free(struct virt_dma_desc *vdesc)
/* Dispatch a queued descriptor to the controller (caller holds lock) */
static void edma_execute(struct edma_chan *echan)
{
- struct virt_dma_desc *vdesc = vchan_next_desc(&echan->vchan);
+ struct virt_dma_desc *vdesc;
struct edma_desc *edesc;
- int i;
-
- if (!vdesc) {
- echan->edesc = NULL;
- return;
+ struct device *dev = echan->vchan.chan.device->dev;
+ int i, j, left, nslots;
+
+ /* If either we processed all psets or we're still not started */
+ if (!echan->edesc ||
+ echan->edesc->pset_nr == echan->edesc->processed) {
+ /* Get next vdesc */
+ vdesc = vchan_next_desc(&echan->vchan);
+ if (!vdesc) {
+ echan->edesc = NULL;
+ return;
+ }
+ list_del(&vdesc->node);
+ echan->edesc = to_edma_desc(&vdesc->tx);
}

- list_del(&vdesc->node);
+ edesc = echan->edesc;

- echan->edesc = edesc = to_edma_desc(&vdesc->tx);
+ /* Find out how many left */
+ left = edesc->pset_nr - edesc->processed;
+ nslots = min(MAX_NR_SG, left);

/* Write descriptor PaRAM set(s) */
- for (i = 0; i < edesc->pset_nr; i++) {
- edma_write_slot(echan->slot[i], &edesc->pset[i]);
+ for (i = 0; i < nslots; i++) {
+ j = i + edesc->processed;
+ edma_write_slot(echan->slot[i], &edesc->pset[j]);
dev_dbg(echan->vchan.chan.device->dev,
"\n pset[%d]:\n"
" chnum\t%d\n"
@@ -132,24 +145,31 @@ static void edma_execute(struct edma_chan *echan)
" bidx\t%08x\n"
" cidx\t%08x\n"
" lkrld\t%08x\n",
- i, echan->ch_num, echan->slot[i],
- edesc->pset[i].opt,
- edesc->pset[i].src,
- edesc->pset[i].dst,
- edesc->pset[i].a_b_cnt,
- edesc->pset[i].ccnt,
- edesc->pset[i].src_dst_bidx,
- edesc->pset[i].src_dst_cidx,
- edesc->pset[i].link_bcntrld);
+ j, echan->ch_num, echan->slot[i],
+ edesc->pset[j].opt,
+ edesc->pset[j].src,
+ edesc->pset[j].dst,
+ edesc->pset[j].a_b_cnt,
+ edesc->pset[j].ccnt,
+ edesc->pset[j].src_dst_bidx,
+ edesc->pset[j].src_dst_cidx,
+ edesc->pset[j].link_bcntrld);
/* Link to the previous slot if not the last set */
- if (i != (edesc->pset_nr - 1))
+ if (i != (nslots - 1))
edma_link(echan->slot[i], echan->slot[i+1]);
/* Final pset links to the dummy pset */
else
edma_link(echan->slot[i], echan->ecc->dummy_slot);
}

- edma_start(echan->ch_num);
+ edesc->processed += nslots;
+
+ edma_resume(echan->ch_num);
+
+ if (edesc->processed <= MAX_NR_SG) {
+ dev_dbg(dev, "first transfer starting %d\n", echan->ch_num);
+ edma_start(echan->ch_num);
+ }
}

static int edma_terminate_all(struct edma_chan *echan)
@@ -368,19 +388,24 @@ static void edma_callback(unsigned ch_num, u16 ch_status,
void *data)
struct edma_desc *edesc;
unsigned long flags;

- /* Stop the channel */
- edma_stop(echan->ch_num);
+ /* Pause the channel */
+ edma_pause(echan->ch_num);

switch (ch_status) {
case DMA_COMPLETE:
- dev_dbg(dev, "transfer complete on channel %d\n", ch_num);
-
spin_lock_irqsave(&echan->vchan.lock, flags);

edesc = echan->edesc;
if (edesc) {
+ if (edesc->processed == edesc->pset_nr) {
+ dev_dbg(dev, "Transfer complete, stopping channel %d\n", ch_num);
+ edma_stop(echan->ch_num);
+ vchan_cookie_complete(&edesc->vdesc);
+ } else {
+ dev_dbg(dev, "Intermediate transfer complete on channel %d\n", ch_num);
+ }
+
edma_execute(echan);
- vchan_cookie_complete(&edesc->vdesc);
}

spin_unlock_irqrestore(&echan->vchan.lock, flags);
--
1.8.1.2