Subject: [PATCH 00/20] DMA: DMA unmap fixes

Hi,

Currently the DMA subsystem does DMA mapping in the core code while
DMA unmapping is left to device drivers. This is counterintuitive
and causes code duplication and subtle errors (some drivers, such
as the PL330 one, don't implement DMA unmapping at all). The
following patchset modifies the DMA subsystem to do DMA unmapping
in the core code. This results in simpler code and less duplication
(more than 400 lines of code are gone), and it fixes the missing
DMA unmapping in the affected drivers. Additionally, many cases
where DMA wasn't unmapped on a failure are also fixed.


patches #1-3 add missing DMA unmap on failure to the async_tx core
code (async_memcpy()) and to the ioat and fsmc_nand drivers

patch #4 fixes DMA flags used by carma-fpga driver

patches #5-7 fix the core code and the dmatest driver to do DMA
unmap for MEMCPY operations

patch #8 adds missing DMA unmap on failure to ioat3 driver

patch #9 fixes build for async_memset.c

patch #10 adds missing DMA unmap on failure to the async_tx core
code (async_memset())

patches #11-18 fix the async_tx core code and the dmatest driver to
do DMA unmap for MEMSET, XOR, XOR_VAL, PQ and PQ_VAL operations

patches #19-20 remove the no longer needed DMA unmap code from
device drivers and the DMA unmap flags from the core code


This patchset was tested on a PL330 DMA controller using MEMCPY
operations. It would be great if somebody could test it on a
more advanced controller capable of MEMSET, XOR, XOR_VAL,
PQ and PQ_VAL operations (especially since the conversion of
the XOR and PQ operations was not obvious).


Bartlomiej Zolnierkiewicz (20):
async_tx: add missing DMA unmap to async_memcpy()
ioat: add missing DMA unmap to ioat_dma_self_test()
mtd: fsmc_nand: add missing DMA unmap to dma_xfer()
carma-fpga: pass correct flags to ->device_prep_dma_memcpy()
dmatest: do DMA unmap for MEMCPY operations
DMA: do DMA unmap in core for MEMCPY operations
async_tx: do DMA unmap in core for MEMCPY operations
ioat3: add missing DMA unmap to ioat_xor_val_self_test()
async_tx: fix build for async_memset
async_tx: add missing DMA unmap to async_memset()
async_tx: do DMA unmap in core for MEMSET operations
dmatest: do DMA unmap for XOR operations
async_tx: do DMA unmap in core for XOR operations
async_tx: do DMA unmap in core for XOR_VAL operations
dmatest: do DMA unmap for PQ operations
async_tx: do DMA unmap in async_raid6_recov.c for PQ operations
async_tx: do DMA unmap in core for PQ operations
async_tx: do DMA unmap in core for PQ_VAL operations
DMA: remove DMA unmap from drivers
DMA: remove DMA unmap flags

arch/arm/include/asm/hardware/iop3xx-adma.h | 30 ----
arch/arm/mach-iop13xx/include/mach/adma.h | 26 ---
crypto/async_tx/async_memcpy.c | 27 ++-
crypto/async_tx/async_memset.c | 23 ++-
crypto/async_tx/async_pq.c | 129 +++++++++----
crypto/async_tx/async_raid6_recov.c | 42 ++++-
crypto/async_tx/async_tx.c | 25 ++-
crypto/async_tx/async_xor.c | 98 +++++++---
drivers/ata/pata_arasan_cf.c | 3 +-
drivers/dma/amba-pl08x.c | 31 ----
drivers/dma/at_hdmac.c | 25 ---
drivers/dma/dmaengine.c | 59 +++++-
drivers/dma/dmatest.c | 14 +-
drivers/dma/dw_dmac.c | 20 ---
drivers/dma/ep93xx_dma.c | 32 +---
drivers/dma/fsldma.c | 16 --
drivers/dma/ioat/dma.c | 28 +--
drivers/dma/ioat/dma.h | 12 --
drivers/dma/ioat/dma_v2.c | 1 -
drivers/dma/ioat/dma_v3.c | 179 +++++-------------
drivers/dma/iop-adma.c | 70 +-------
drivers/dma/mv_xor.c | 45 +----
drivers/dma/ppc4xx/adma.c | 270 ----------------------------
drivers/dma/timb_dma.c | 36 ----
drivers/dma/txx9dmac.c | 24 ---
drivers/media/platform/m2m-deinterlace.c | 3 +-
drivers/media/platform/timblogiw.c | 2 +-
drivers/misc/carma/carma-fpga.c | 3 +-
drivers/mtd/nand/atmel_nand.c | 3 +-
drivers/mtd/nand/fsmc_nand.c | 20 ++-
drivers/net/ethernet/micrel/ks8842.c | 6 +-
drivers/spi/spi-dw-mid.c | 4 +-
include/linux/async_tx.h | 4 +
include/linux/dmaengine.h | 34 ++--
34 files changed, 446 insertions(+), 898 deletions(-)

--
1.8.0


Subject: [PATCH 02/20] ioat: add missing DMA unmap to ioat_dma_self_test()

Make ioat_dma_self_test() do DMA unmapping itself and fix handling
of failure cases.

Cc: Dan Williams <[email protected]>
Cc: Tomasz Figa <[email protected]>
Signed-off-by: Bartlomiej Zolnierkiewicz <[email protected]>
Signed-off-by: Kyungmin Park <[email protected]>
---
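The shape of the fix in isolation (condensed; see the diff below for
the real thing): with both DMA_COMPL_SKIP_*_UNMAP flags set the core
no longer unmaps on completion, so every exit taken after the
dma_map_single() calls must now go through the new unmap_dma label
instead of leaking the mappings:

	dma_src = dma_map_single(dev, src, IOAT_TEST_SIZE,
				 DMA_TO_DEVICE);
	dma_dest = dma_map_single(dev, dest, IOAT_TEST_SIZE,
				 DMA_FROM_DEVICE);
	...
	/* any failure from here on: goto unmap_dma, not free_resources */
	...
unmap_dma:
	dma_unmap_single(dev, dma_src, IOAT_TEST_SIZE, DMA_TO_DEVICE);
	dma_unmap_single(dev, dma_dest, IOAT_TEST_SIZE, DMA_FROM_DEVICE);
free_resources:
	dma->device_free_chan_resources(dma_chan);
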
drivers/dma/ioat/dma.c | 11 +++++++----
1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/drivers/dma/ioat/dma.c b/drivers/dma/ioat/dma.c
index 73b2b65..464138a 100644
--- a/drivers/dma/ioat/dma.c
+++ b/drivers/dma/ioat/dma.c
@@ -833,14 +833,14 @@ int __devinit ioat_dma_self_test(struct ioatdma_device *device)

dma_src = dma_map_single(dev, src, IOAT_TEST_SIZE, DMA_TO_DEVICE);
dma_dest = dma_map_single(dev, dest, IOAT_TEST_SIZE, DMA_FROM_DEVICE);
- flags = DMA_COMPL_SRC_UNMAP_SINGLE | DMA_COMPL_DEST_UNMAP_SINGLE |
+ flags = DMA_COMPL_SKIP_SRC_UNMAP | DMA_COMPL_SKIP_DEST_UNMAP |
DMA_PREP_INTERRUPT;
tx = device->common.device_prep_dma_memcpy(dma_chan, dma_dest, dma_src,
IOAT_TEST_SIZE, flags);
if (!tx) {
dev_err(dev, "Self-test prep failed, disabling\n");
err = -ENODEV;
- goto free_resources;
+ goto unmap_dma;
}

async_tx_ack(tx);
@@ -851,7 +851,7 @@ int __devinit ioat_dma_self_test(struct ioatdma_device *device)
if (cookie < 0) {
dev_err(dev, "Self-test setup failed, disabling\n");
err = -ENODEV;
- goto free_resources;
+ goto unmap_dma;
}
dma->device_issue_pending(dma_chan);

@@ -862,7 +862,7 @@ int __devinit ioat_dma_self_test(struct ioatdma_device *device)
!= DMA_SUCCESS) {
dev_err(dev, "Self-test copy timed out, disabling\n");
err = -ENODEV;
- goto free_resources;
+ goto unmap_dma;
}
if (memcmp(src, dest, IOAT_TEST_SIZE)) {
dev_err(dev, "Self-test copy failed compare, disabling\n");
@@ -870,6 +870,9 @@ int __devinit ioat_dma_self_test(struct ioatdma_device *device)
goto free_resources;
}

+unmap_dma:
+ dma_unmap_single(dev, dma_src, IOAT_TEST_SIZE, DMA_TO_DEVICE);
+ dma_unmap_single(dev, dma_dest, IOAT_TEST_SIZE, DMA_FROM_DEVICE);
free_resources:
dma->device_free_chan_resources(dma_chan);
out:
--
1.8.0

Subject: [PATCH 03/20] mtd: fsmc_nand: add missing DMA unmap to dma_xfer()

Make dma_xfer() do DMA unmapping itself and fix handling
of failure cases.

Cc: David Woodhouse <[email protected]>
Cc: Vipin Kumar <[email protected]>
Cc: Tomasz Figa <[email protected]>
Signed-off-by: Bartlomiej Zolnierkiewicz <[email protected]>
Signed-off-by: Kyungmin Park <[email protected]>
---
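The resulting control flow funnels the success path and all failure
paths taken after dma_map_single() through one unmap label (condensed
from the diff below):

	/* ... transfer submitted and completed ... */
	ret = 0;
	/* fall through: the success path unmaps too */
unmap_dma:
	dma_unmap_single(dma_dev->dev, dma_addr, len, direction);
	return ret;
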
drivers/mtd/nand/fsmc_nand.c | 22 ++++++++++++++--------
1 file changed, 14 insertions(+), 8 deletions(-)

diff --git a/drivers/mtd/nand/fsmc_nand.c b/drivers/mtd/nand/fsmc_nand.c
index 38d2624..679ede8 100644
--- a/drivers/mtd/nand/fsmc_nand.c
+++ b/drivers/mtd/nand/fsmc_nand.c
@@ -569,23 +569,22 @@ static int dma_xfer(struct fsmc_nand_data *host, void *buffer, int len,
dma_dev = chan->device;
dma_addr = dma_map_single(dma_dev->dev, buffer, len, direction);

+ flags |= DMA_COMPL_SKIP_SRC_UNMAP | DMA_COMPL_SKIP_DEST_UNMAP;
+
if (direction == DMA_TO_DEVICE) {
dma_src = dma_addr;
dma_dst = host->data_pa;
- flags |= DMA_COMPL_SRC_UNMAP_SINGLE | DMA_COMPL_SKIP_DEST_UNMAP;
} else {
dma_src = host->data_pa;
dma_dst = dma_addr;
- flags |= DMA_COMPL_DEST_UNMAP_SINGLE | DMA_COMPL_SKIP_SRC_UNMAP;
}

tx = dma_dev->device_prep_dma_memcpy(chan, dma_dst, dma_src,
len, flags);
-
if (!tx) {
dev_err(host->dev, "device_prep_dma_memcpy error\n");
- dma_unmap_single(dma_dev->dev, dma_addr, len, direction);
- return -EIO;
+ ret = -EIO;
+ goto unmap_dma;
}

tx->callback = dma_complete;
@@ -595,7 +594,7 @@ static int dma_xfer(struct fsmc_nand_data *host, void *buffer, int len,
ret = dma_submit_error(cookie);
if (ret) {
dev_err(host->dev, "dma_submit_error %d\n", cookie);
- return ret;
+ goto unmap_dma;
}

dma_async_issue_pending(chan);
@@ -606,10 +605,17 @@ static int dma_xfer(struct fsmc_nand_data *host, void *buffer, int len,
if (ret <= 0) {
chan->device->device_control(chan, DMA_TERMINATE_ALL, 0);
dev_err(host->dev, "wait_for_completion_timeout\n");
- return ret ? ret : -ETIMEDOUT;
+ if (!ret)
+ ret = -ETIMEDOUT;
+ goto unmap_dma;
}

- return 0;
+ ret = 0;
+
+unmap_dma:
+ dma_unmap_single(dma_dev->dev, dma_addr, len, direction);
+
+ return ret;
}

/*
--
1.8.0

Subject: [PATCH 06/20] DMA: do DMA unmap in core for MEMCPY operations

Add dma_src, dma_dst and dma_len to struct dma_async_tx_descriptor
for storing DMA mapping data and convert core DMA engine code
(dma_async_memcpy_buf_to_buf(), dma_async_memcpy_buf_to_pg() and
dma_async_memcpy_pg_to_pg()) to do DMA unmapping itself using
the ->callback functionality.

Cc: Vinod Koul <[email protected]>
Cc: Tomasz Figa <[email protected]>
Signed-off-by: Bartlomiej Zolnierkiewicz <[email protected]>
Signed-off-by: Kyungmin Park <[email protected]>
---
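In short, each of the three memcpy helpers now does the following
(buf_to_buf shown; the _pg variants differ only in using
dma_map_page()/dma_unmap_page() for the page-backed side):

	dma_src = dma_map_single(dev->dev, src, len, DMA_TO_DEVICE);
	dma_dest = dma_map_single(dev->dev, dest, len, DMA_FROM_DEVICE);
	tx = dev->device_prep_dma_memcpy(chan, dma_dest, dma_src, len,
					 flags);
	...
	/* remember the mappings; unmap from our own callback */
	tx->dma_src = dma_src;
	tx->dma_dst = dma_dest;
	tx->dma_len = len;
	tx->callback = dma_async_memcpy_buf_to_buf_cb;
	tx->callback_param = tx;
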
drivers/dma/dmaengine.c | 62 +++++++++++++++++++++++++++++++++++++++++------
include/linux/dmaengine.h | 6 +++++
2 files changed, 60 insertions(+), 8 deletions(-)

diff --git a/drivers/dma/dmaengine.c b/drivers/dma/dmaengine.c
index a815d44..1b9c02a 100644
--- a/drivers/dma/dmaengine.c
+++ b/drivers/dma/dmaengine.c
@@ -853,6 +853,15 @@ void dma_async_device_unregister(struct dma_device *device)
}
EXPORT_SYMBOL(dma_async_device_unregister);

+static void dma_async_memcpy_buf_to_buf_cb(void *dma_async_param)
+{
+ struct dma_async_tx_descriptor *tx = dma_async_param;
+ struct dma_device *dev = tx->chan->device;
+
+ dma_unmap_single(dev->dev, tx->dma_src, tx->dma_len, DMA_TO_DEVICE);
+ dma_unmap_single(dev->dev, tx->dma_dst, tx->dma_len, DMA_FROM_DEVICE);
+}
+
/**
* dma_async_memcpy_buf_to_buf - offloaded copy between virtual addresses
* @chan: DMA channel to offload copy to
@@ -877,9 +886,8 @@ dma_async_memcpy_buf_to_buf(struct dma_chan *chan, void *dest,

dma_src = dma_map_single(dev->dev, src, len, DMA_TO_DEVICE);
dma_dest = dma_map_single(dev->dev, dest, len, DMA_FROM_DEVICE);
- flags = DMA_CTRL_ACK |
- DMA_COMPL_SRC_UNMAP_SINGLE |
- DMA_COMPL_DEST_UNMAP_SINGLE;
+ flags = DMA_CTRL_ACK | DMA_COMPL_SKIP_SRC_UNMAP |
+ DMA_COMPL_SKIP_DEST_UNMAP;
tx = dev->device_prep_dma_memcpy(chan, dma_dest, dma_src, len, flags);

if (!tx) {
@@ -888,7 +896,13 @@ dma_async_memcpy_buf_to_buf(struct dma_chan *chan, void *dest,
return -ENOMEM;
}

- tx->callback = NULL;
+ tx->dma_src = dma_src;
+ tx->dma_dst = dma_dest;
+ tx->dma_len = len;
+
+ tx->callback = dma_async_memcpy_buf_to_buf_cb;
+ tx->callback_param = tx;
+
cookie = tx->tx_submit(tx);

preempt_disable();
@@ -900,6 +914,15 @@ dma_async_memcpy_buf_to_buf(struct dma_chan *chan, void *dest,
}
EXPORT_SYMBOL(dma_async_memcpy_buf_to_buf);

+static void dma_async_memcpy_buf_to_pg_cb(void *dma_async_param)
+{
+ struct dma_async_tx_descriptor *tx = dma_async_param;
+ struct dma_device *dev = tx->chan->device;
+
+ dma_unmap_single(dev->dev, tx->dma_src, tx->dma_len, DMA_TO_DEVICE);
+ dma_unmap_page(dev->dev, tx->dma_dst, tx->dma_len, DMA_FROM_DEVICE);
+}
+
/**
* dma_async_memcpy_buf_to_pg - offloaded copy from address to page
* @chan: DMA channel to offload copy to
@@ -925,7 +948,8 @@ dma_async_memcpy_buf_to_pg(struct dma_chan *chan, struct page *page,

dma_src = dma_map_single(dev->dev, kdata, len, DMA_TO_DEVICE);
dma_dest = dma_map_page(dev->dev, page, offset, len, DMA_FROM_DEVICE);
- flags = DMA_CTRL_ACK | DMA_COMPL_SRC_UNMAP_SINGLE;
+ flags = DMA_CTRL_ACK | DMA_COMPL_SKIP_SRC_UNMAP |
+ DMA_COMPL_SKIP_DEST_UNMAP;
tx = dev->device_prep_dma_memcpy(chan, dma_dest, dma_src, len, flags);

if (!tx) {
@@ -934,7 +958,13 @@ dma_async_memcpy_buf_to_pg(struct dma_chan *chan, struct page *page,
return -ENOMEM;
}

- tx->callback = NULL;
+ tx->dma_src = dma_src;
+ tx->dma_dst = dma_dest;
+ tx->dma_len = len;
+
+ tx->callback = dma_async_memcpy_buf_to_pg_cb;
+ tx->callback_param = tx;
+
cookie = tx->tx_submit(tx);

preempt_disable();
@@ -946,6 +976,15 @@ dma_async_memcpy_buf_to_pg(struct dma_chan *chan, struct page *page,
}
EXPORT_SYMBOL(dma_async_memcpy_buf_to_pg);

+static void dma_async_memcpy_pg_to_pg_cb(void *dma_async_param)
+{
+ struct dma_async_tx_descriptor *tx = dma_async_param;
+ struct dma_device *dev = tx->chan->device;
+
+ dma_unmap_page(dev->dev, tx->dma_src, tx->dma_len, DMA_TO_DEVICE);
+ dma_unmap_page(dev->dev, tx->dma_dst, tx->dma_len, DMA_FROM_DEVICE);
+}
+
/**
* dma_async_memcpy_pg_to_pg - offloaded copy from page to page
* @chan: DMA channel to offload copy to
@@ -974,7 +1013,8 @@ dma_async_memcpy_pg_to_pg(struct dma_chan *chan, struct page *dest_pg,
dma_src = dma_map_page(dev->dev, src_pg, src_off, len, DMA_TO_DEVICE);
dma_dest = dma_map_page(dev->dev, dest_pg, dest_off, len,
DMA_FROM_DEVICE);
- flags = DMA_CTRL_ACK;
+ flags = DMA_CTRL_ACK | DMA_COMPL_SKIP_SRC_UNMAP |
+ DMA_COMPL_SKIP_DEST_UNMAP;
tx = dev->device_prep_dma_memcpy(chan, dma_dest, dma_src, len, flags);

if (!tx) {
@@ -983,7 +1023,13 @@ dma_async_memcpy_pg_to_pg(struct dma_chan *chan, struct page *dest_pg,
return -ENOMEM;
}

- tx->callback = NULL;
+ tx->dma_src = dma_src;
+ tx->dma_dst = dma_dest;
+ tx->dma_len = len;
+
+ tx->callback = dma_async_memcpy_pg_to_pg_cb;
+ tx->callback_param = tx;
+
cookie = tx->tx_submit(tx);

preempt_disable();
diff --git a/include/linux/dmaengine.h b/include/linux/dmaengine.h
index d3201e4..8741d57 100644
--- a/include/linux/dmaengine.h
+++ b/include/linux/dmaengine.h
@@ -402,6 +402,9 @@ typedef void (*dma_async_tx_callback)(void *dma_async_param);
* @phys: physical address of the descriptor
* @chan: target channel for this operation
* @tx_submit: set the prepared descriptor(s) to be executed by the engine
+ * @dma_src: DMA source address (needed for DMA unmap)
+ * @dma_dst: DMA destination address (needed for DMA unmap)
+ * @dma_len: DMA length (needed for DMA unmap)
* @callback: routine to call after this operation is complete
* @callback_param: general parameter to pass to the callback routine
* ---async_tx api specific fields---
@@ -415,6 +418,9 @@ struct dma_async_tx_descriptor {
dma_addr_t phys;
struct dma_chan *chan;
dma_cookie_t (*tx_submit)(struct dma_async_tx_descriptor *tx);
+ dma_addr_t dma_src;
+ dma_addr_t dma_dst;
+ size_t dma_len;
dma_async_tx_callback callback;
void *callback_param;
#ifdef CONFIG_ASYNC_TX_ENABLE_CHANNEL_SWITCH
--
1.8.0

Subject: [PATCH 07/20] async_tx: do DMA unmap in core for MEMCPY operations

Add orig_callback and orig_callback_param to struct
dma_async_tx_descriptor for storing the original
dma_async_tx_callback function and its parameter. Rename
async_tx_submit() to __async_tx_submit() and teach it to take the
new callback and parameter, add an async_tx_submit() wrapper to
preserve the original interface, and convert the core async_tx
code (async_memcpy()) to do DMA unmapping itself using the
->callback functionality.

Cc: Dan Williams <[email protected]>
Cc: Tomasz Figa <[email protected]>
Signed-off-by: Bartlomiej Zolnierkiewicz <[email protected]>
Signed-off-by: Kyungmin Park <[email protected]>
---
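The behaviour visible to async_tx clients is unchanged: a client's
completion callback still runs, only now after the core's unmap
callback. A hypothetical user (my_done/my_arg are made-up names):

	struct async_submit_ctl submit;

	init_async_submit(&submit, ASYNC_TX_ACK, NULL, my_done, my_arg,
			  NULL);
	tx = async_memcpy(dest_page, src_page, 0, 0, PAGE_SIZE, &submit);
	/*
	 * async_memcpy() stashes my_done/my_arg in tx->orig_callback
	 * and tx->orig_callback_param; async_memcpy_cb() unmaps both
	 * pages and then calls my_done(my_arg).
	 */
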
crypto/async_tx/async_memcpy.c | 24 +++++++++++++++++++++---
crypto/async_tx/async_tx.c | 25 +++++++++++++++++++++----
include/linux/async_tx.h | 4 ++++
include/linux/dmaengine.h | 4 ++++
4 files changed, 50 insertions(+), 7 deletions(-)

diff --git a/crypto/async_tx/async_memcpy.c b/crypto/async_tx/async_memcpy.c
index 9e62fef..b6d5dab 100644
--- a/crypto/async_tx/async_memcpy.c
+++ b/crypto/async_tx/async_memcpy.c
@@ -30,6 +30,18 @@
#include <linux/dma-mapping.h>
#include <linux/async_tx.h>

+static void async_memcpy_cb(void *dma_async_param)
+{
+ struct dma_async_tx_descriptor *tx = dma_async_param;
+ struct dma_device *dev = tx->chan->device;
+
+ dma_unmap_page(dev->dev, tx->dma_src, tx->dma_len, DMA_TO_DEVICE);
+ dma_unmap_page(dev->dev, tx->dma_dst, tx->dma_len, DMA_FROM_DEVICE);
+
+ if (tx->orig_callback)
+ tx->orig_callback(tx->orig_callback_param);
+}
+
/**
* async_memcpy - attempt to copy memory with a dma engine.
* @dest: destination page
@@ -50,10 +62,11 @@ async_memcpy(struct page *dest, struct page *src, unsigned int dest_offset,
&dest, 1, &src, 1, len);
struct dma_device *device = chan ? chan->device : NULL;
struct dma_async_tx_descriptor *tx = NULL;
+ dma_addr_t dma_dest, dma_src;

if (device && is_dma_copy_aligned(device, src_offset, dest_offset, len)) {
- dma_addr_t dma_dest, dma_src;
- unsigned long dma_prep_flags = 0;
+ unsigned long dma_prep_flags = DMA_COMPL_SKIP_SRC_UNMAP |
+ DMA_COMPL_SKIP_DEST_UNMAP;

if (submit->cb_fn)
dma_prep_flags |= DMA_PREP_INTERRUPT;
@@ -77,7 +90,12 @@ async_memcpy(struct page *dest, struct page *src, unsigned int dest_offset,

if (tx) {
pr_debug("%s: (async) len: %zu\n", __func__, len);
- async_tx_submit(chan, tx, submit);
+
+ tx->dma_src = dma_src;
+ tx->dma_dst = dma_dest;
+ tx->dma_len = len;
+
+ __async_tx_submit(chan, tx, async_memcpy_cb, tx, submit);
} else {
void *dest_buf, *src_buf;
pr_debug("%s: (sync) len: %zu\n", __func__, len);
diff --git a/crypto/async_tx/async_tx.c b/crypto/async_tx/async_tx.c
index 8421209..d4335fe 100644
--- a/crypto/async_tx/async_tx.c
+++ b/crypto/async_tx/async_tx.c
@@ -153,13 +153,22 @@ enum submit_disposition {
};

void
-async_tx_submit(struct dma_chan *chan, struct dma_async_tx_descriptor *tx,
- struct async_submit_ctl *submit)
+__async_tx_submit(struct dma_chan *chan, struct dma_async_tx_descriptor *tx,
+ dma_async_tx_callback cb_fn, void *cb_param,
+ struct async_submit_ctl *submit)
{
struct dma_async_tx_descriptor *depend_tx = submit->depend_tx;

- tx->callback = submit->cb_fn;
- tx->callback_param = submit->cb_param;
+ if (cb_fn) {
+ tx->orig_callback = submit->cb_fn;
+ tx->orig_callback_param = submit->cb_param;
+
+ tx->callback = cb_fn;
+ tx->callback_param = cb_param;
+ } else {
+ tx->callback = submit->cb_fn;
+ tx->callback_param = submit->cb_param;
+ }

if (depend_tx) {
enum submit_disposition s;
@@ -220,6 +229,14 @@ async_tx_submit(struct dma_chan *chan, struct dma_async_tx_descriptor *tx,
if (depend_tx)
async_tx_ack(depend_tx);
}
+EXPORT_SYMBOL_GPL(__async_tx_submit);
+
+void
+async_tx_submit(struct dma_chan *chan, struct dma_async_tx_descriptor *tx,
+ struct async_submit_ctl *submit)
+{
+ __async_tx_submit(chan, tx, NULL, NULL, submit);
+}
EXPORT_SYMBOL_GPL(async_tx_submit);

/**
diff --git a/include/linux/async_tx.h b/include/linux/async_tx.h
index a1c486a..cf21d49 100644
--- a/include/linux/async_tx.h
+++ b/include/linux/async_tx.h
@@ -165,6 +165,10 @@ init_async_submit(struct async_submit_ctl *args, enum async_tx_flags flags,
args->scribble = scribble;
}

+void __async_tx_submit(struct dma_chan *chan,
+ struct dma_async_tx_descriptor *tx,
+ dma_async_tx_callback cb_fn, void *cb_param,
+ struct async_submit_ctl *submit);
void async_tx_submit(struct dma_chan *chan, struct dma_async_tx_descriptor *tx,
struct async_submit_ctl *submit);

diff --git a/include/linux/dmaengine.h b/include/linux/dmaengine.h
index 8741d57..440b609 100644
--- a/include/linux/dmaengine.h
+++ b/include/linux/dmaengine.h
@@ -408,6 +408,8 @@ typedef void (*dma_async_tx_callback)(void *dma_async_param);
* @callback: routine to call after this operation is complete
* @callback_param: general parameter to pass to the callback routine
* ---async_tx api specific fields---
+ * @orig_callback: optional routine to call from the callback routine
+ * @orig_callback_param: parameter to pass to the orig_callback routine
* @next: at completion submit this descriptor
* @parent: pointer to the next level up in the dependency chain
* @lock: protect the parent and next pointers
@@ -423,6 +425,8 @@ struct dma_async_tx_descriptor {
size_t dma_len;
dma_async_tx_callback callback;
void *callback_param;
+ dma_async_tx_callback orig_callback;
+ void *orig_callback_param;
#ifdef CONFIG_ASYNC_TX_ENABLE_CHANNEL_SWITCH
struct dma_async_tx_descriptor *next;
struct dma_async_tx_descriptor *parent;
--
1.8.0

Subject: [PATCH 08/20] ioat3: add missing DMA unmap to ioat_xor_val_self_test()

Make ioat_xor_val_self_test() do DMA unmapping itself and fix handling
of failure cases.

Cc: Dan Williams <[email protected]>
Cc: Tomasz Figa <[email protected]>
Signed-off-by: Bartlomiej Zolnierkiewicz <[email protected]>
Signed-off-by: Kyungmin Park <[email protected]>
---
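Since the function runs several tests back-to-back, a single error
label cannot know which mappings are currently live, so the patch
tracks that with an op variable set before each test (condensed from
the diff below):

	op = IOAT_OP_XOR;	/* set before starting the xor test */
	...
dma_unmap:
	if (op == IOAT_OP_XOR) {
		dma_unmap_page(dev, dest_dma, PAGE_SIZE, DMA_FROM_DEVICE);
		for (i = 0; i < IOAT_NUM_SRC_TEST; i++)
			dma_unmap_page(dev, dma_srcs[i], PAGE_SIZE,
				       DMA_TO_DEVICE);
	} else if (op == IOAT_OP_XOR_VAL) {
		/* unmap the IOAT_NUM_SRC_TEST + 1 validate sources */
	} else if (op == IOAT_OP_FILL)
		dma_unmap_page(dev, dma_addr, PAGE_SIZE, DMA_FROM_DEVICE);
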
drivers/dma/ioat/dma_v3.c | 76 ++++++++++++++++++++++++++++++++++++-----------
1 file changed, 59 insertions(+), 17 deletions(-)

diff --git a/drivers/dma/ioat/dma_v3.c b/drivers/dma/ioat/dma_v3.c
index f7f1dc6..6456f7d 100644
--- a/drivers/dma/ioat/dma_v3.c
+++ b/drivers/dma/ioat/dma_v3.c
@@ -863,6 +863,7 @@ static int __devinit ioat_xor_val_self_test(struct ioatdma_device *device)
unsigned long tmo;
struct device *dev = &device->pdev->dev;
struct dma_device *dma = &device->common;
+ u8 op = 0;

dev_dbg(dev, "%s\n", __func__);

@@ -908,18 +909,22 @@ static int __devinit ioat_xor_val_self_test(struct ioatdma_device *device)
}

/* test xor */
+ op = IOAT_OP_XOR;
+
dest_dma = dma_map_page(dev, dest, 0, PAGE_SIZE, DMA_FROM_DEVICE);
for (i = 0; i < IOAT_NUM_SRC_TEST; i++)
dma_srcs[i] = dma_map_page(dev, xor_srcs[i], 0, PAGE_SIZE,
DMA_TO_DEVICE);
tx = dma->device_prep_dma_xor(dma_chan, dest_dma, dma_srcs,
IOAT_NUM_SRC_TEST, PAGE_SIZE,
- DMA_PREP_INTERRUPT);
+ DMA_PREP_INTERRUPT |
+ DMA_COMPL_SKIP_SRC_UNMAP |
+ DMA_COMPL_SKIP_DEST_UNMAP);

if (!tx) {
dev_err(dev, "Self-test xor prep failed\n");
err = -ENODEV;
- goto free_resources;
+ goto dma_unmap;
}

async_tx_ack(tx);
@@ -930,7 +935,7 @@ static int __devinit ioat_xor_val_self_test(struct ioatdma_device *device)
if (cookie < 0) {
dev_err(dev, "Self-test xor setup failed\n");
err = -ENODEV;
- goto free_resources;
+ goto dma_unmap;
}
dma->device_issue_pending(dma_chan);

@@ -939,9 +944,13 @@ static int __devinit ioat_xor_val_self_test(struct ioatdma_device *device)
if (dma->device_tx_status(dma_chan, cookie, NULL) != DMA_SUCCESS) {
dev_err(dev, "Self-test xor timed out\n");
err = -ENODEV;
- goto free_resources;
+ goto dma_unmap;
}

+ dma_unmap_page(dev, dest_dma, PAGE_SIZE, DMA_FROM_DEVICE);
+ for (i = 0; i < IOAT_NUM_SRC_TEST; i++)
+ dma_unmap_page(dev, dma_srcs[i], PAGE_SIZE, DMA_TO_DEVICE);
+
dma_sync_single_for_cpu(dev, dest_dma, PAGE_SIZE, DMA_FROM_DEVICE);
for (i = 0; i < (PAGE_SIZE / sizeof(u32)); i++) {
u32 *ptr = page_address(dest);
@@ -957,6 +966,8 @@ static int __devinit ioat_xor_val_self_test(struct ioatdma_device *device)
if (!dma_has_cap(DMA_XOR_VAL, dma_chan->device->cap_mask))
goto free_resources;

+ op = IOAT_OP_XOR_VAL;
+
/* validate the sources with the destintation page */
for (i = 0; i < IOAT_NUM_SRC_TEST; i++)
xor_val_srcs[i] = xor_srcs[i];
@@ -969,11 +980,13 @@ static int __devinit ioat_xor_val_self_test(struct ioatdma_device *device)
DMA_TO_DEVICE);
tx = dma->device_prep_dma_xor_val(dma_chan, dma_srcs,
IOAT_NUM_SRC_TEST + 1, PAGE_SIZE,
- &xor_val_result, DMA_PREP_INTERRUPT);
+ &xor_val_result, DMA_PREP_INTERRUPT |
+ DMA_COMPL_SKIP_SRC_UNMAP |
+ DMA_COMPL_SKIP_DEST_UNMAP);
if (!tx) {
dev_err(dev, "Self-test zero prep failed\n");
err = -ENODEV;
- goto free_resources;
+ goto dma_unmap;
}

async_tx_ack(tx);
@@ -984,7 +997,7 @@ static int __devinit ioat_xor_val_self_test(struct ioatdma_device *device)
if (cookie < 0) {
dev_err(dev, "Self-test zero setup failed\n");
err = -ENODEV;
- goto free_resources;
+ goto dma_unmap;
}
dma->device_issue_pending(dma_chan);

@@ -993,9 +1006,12 @@ static int __devinit ioat_xor_val_self_test(struct ioatdma_device *device)
if (dma->device_tx_status(dma_chan, cookie, NULL) != DMA_SUCCESS) {
dev_err(dev, "Self-test validate timed out\n");
err = -ENODEV;
- goto free_resources;
+ goto dma_unmap;
}

+ for (i = 0; i < IOAT_NUM_SRC_TEST + 1; i++)
+ dma_unmap_page(dev, dma_srcs[i], PAGE_SIZE, DMA_TO_DEVICE);
+
if (xor_val_result != 0) {
dev_err(dev, "Self-test validate failed compare\n");
err = -ENODEV;
@@ -1007,14 +1023,18 @@ static int __devinit ioat_xor_val_self_test(struct ioatdma_device *device)
goto free_resources;

/* test memset */
+ op = IOAT_OP_FILL;
+
dma_addr = dma_map_page(dev, dest, 0,
PAGE_SIZE, DMA_FROM_DEVICE);
tx = dma->device_prep_dma_memset(dma_chan, dma_addr, 0, PAGE_SIZE,
- DMA_PREP_INTERRUPT);
+ DMA_PREP_INTERRUPT |
+ DMA_COMPL_SKIP_SRC_UNMAP |
+ DMA_COMPL_SKIP_DEST_UNMAP);
if (!tx) {
dev_err(dev, "Self-test memset prep failed\n");
err = -ENODEV;
- goto free_resources;
+ goto dma_unmap;
}

async_tx_ack(tx);
@@ -1025,7 +1045,7 @@ static int __devinit ioat_xor_val_self_test(struct ioatdma_device *device)
if (cookie < 0) {
dev_err(dev, "Self-test memset setup failed\n");
err = -ENODEV;
- goto free_resources;
+ goto dma_unmap;
}
dma->device_issue_pending(dma_chan);

@@ -1034,9 +1054,11 @@ static int __devinit ioat_xor_val_self_test(struct ioatdma_device *device)
if (dma->device_tx_status(dma_chan, cookie, NULL) != DMA_SUCCESS) {
dev_err(dev, "Self-test memset timed out\n");
err = -ENODEV;
- goto free_resources;
+ goto dma_unmap;
}

+ dma_unmap_page(dev, dma_addr, PAGE_SIZE, DMA_FROM_DEVICE);
+
for (i = 0; i < PAGE_SIZE/sizeof(u32); i++) {
u32 *ptr = page_address(dest);
if (ptr[i]) {
@@ -1047,17 +1069,21 @@ static int __devinit ioat_xor_val_self_test(struct ioatdma_device *device)
}

/* test for non-zero parity sum */
+ op = IOAT_OP_XOR_VAL;
+
xor_val_result = 0;
for (i = 0; i < IOAT_NUM_SRC_TEST + 1; i++)
dma_srcs[i] = dma_map_page(dev, xor_val_srcs[i], 0, PAGE_SIZE,
DMA_TO_DEVICE);
tx = dma->device_prep_dma_xor_val(dma_chan, dma_srcs,
IOAT_NUM_SRC_TEST + 1, PAGE_SIZE,
- &xor_val_result, DMA_PREP_INTERRUPT);
+ &xor_val_result, DMA_PREP_INTERRUPT |
+ DMA_COMPL_SKIP_SRC_UNMAP |
+ DMA_COMPL_SKIP_DEST_UNMAP);
if (!tx) {
dev_err(dev, "Self-test 2nd zero prep failed\n");
err = -ENODEV;
- goto free_resources;
+ goto dma_unmap;
}

async_tx_ack(tx);
@@ -1068,7 +1094,7 @@ static int __devinit ioat_xor_val_self_test(struct ioatdma_device *device)
if (cookie < 0) {
dev_err(dev, "Self-test 2nd zero setup failed\n");
err = -ENODEV;
- goto free_resources;
+ goto dma_unmap;
}
dma->device_issue_pending(dma_chan);

@@ -1077,15 +1103,31 @@ static int __devinit ioat_xor_val_self_test(struct ioatdma_device *device)
if (dma->device_tx_status(dma_chan, cookie, NULL) != DMA_SUCCESS) {
dev_err(dev, "Self-test 2nd validate timed out\n");
err = -ENODEV;
- goto free_resources;
+ goto dma_unmap;
}

if (xor_val_result != SUM_CHECK_P_RESULT) {
dev_err(dev, "Self-test validate failed compare\n");
err = -ENODEV;
- goto free_resources;
+ goto dma_unmap;
}

+ for (i = 0; i < IOAT_NUM_SRC_TEST + 1; i++)
+ dma_unmap_page(dev, dma_srcs[i], PAGE_SIZE, DMA_TO_DEVICE);
+
+ goto free_resources;
+dma_unmap:
+ if (op == IOAT_OP_XOR) {
+ dma_unmap_page(dev, dest_dma, PAGE_SIZE, DMA_FROM_DEVICE);
+ for (i = 0; i < IOAT_NUM_SRC_TEST; i++)
+ dma_unmap_page(dev, dma_srcs[i], PAGE_SIZE,
+ DMA_TO_DEVICE);
+ } else if (op == IOAT_OP_XOR_VAL) {
+ for (i = 0; i < IOAT_NUM_SRC_TEST + 1; i++)
+ dma_unmap_page(dev, dma_srcs[i], PAGE_SIZE,
+ DMA_TO_DEVICE);
+ } else if (op == IOAT_OP_FILL)
+ dma_unmap_page(dev, dma_addr, PAGE_SIZE, DMA_FROM_DEVICE);
free_resources:
dma->device_free_chan_resources(dma_chan);
out:
--
1.8.0

Subject: [PATCH 10/20] async_tx: add missing DMA unmap to async_memset()

Do DMA unmap on ->device_prep_dma_memset failure.

Cc: Dan Williams <[email protected]>
Cc: Tomasz Figa <[email protected]>
Signed-off-by: Bartlomiej Zolnierkiewicz <[email protected]>
Signed-off-by: Kyungmin Park <[email protected]>
---
crypto/async_tx/async_memset.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/crypto/async_tx/async_memset.c b/crypto/async_tx/async_memset.c
index 05a4d1e..a6a667b 100644
--- a/crypto/async_tx/async_memset.c
+++ b/crypto/async_tx/async_memset.c
@@ -61,6 +61,9 @@ async_memset(struct page *dest, int val, unsigned int offset, size_t len,

tx = device->device_prep_dma_memset(chan, dma_dest, val, len,
dma_prep_flags);
+ if (!tx)
+ dma_unmap_page(device->dev, dma_dest, len,
+ DMA_FROM_DEVICE);
}

if (tx) {
--
1.8.0

Subject: [PATCH 09/20] async_tx: fix build for async_memset

Add missing <linux/module.h> include.

Cc: Dan Williams <[email protected]>
Cc: Tomasz Figa <[email protected]>
Signed-off-by: Bartlomiej Zolnierkiewicz <[email protected]>
Signed-off-by: Kyungmin Park <[email protected]>
---
crypto/async_tx/async_memset.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/crypto/async_tx/async_memset.c b/crypto/async_tx/async_memset.c
index 58e4a87..05a4d1e 100644
--- a/crypto/async_tx/async_memset.c
+++ b/crypto/async_tx/async_memset.c
@@ -25,6 +25,7 @@
*/
#include <linux/kernel.h>
#include <linux/interrupt.h>
+#include <linux/module.h>
#include <linux/mm.h>
#include <linux/dma-mapping.h>
#include <linux/async_tx.h>
--
1.8.0

Subject: [PATCH 15/20] dmatest: do DMA unmap for PQ operations

Make the dmatest driver do DMA unmap for PQ operations.

Cc: Vinod Koul <[email protected]>
Cc: Havard Skinnemoen <[email protected]>
Cc: Tomasz Figa <[email protected]>
Signed-off-by: Bartlomiej Zolnierkiewicz <[email protected]>
Signed-off-by: Kyungmin Park <[email protected]>
---
drivers/dma/dmatest.c | 26 ++++++++------------------
1 file changed, 8 insertions(+), 18 deletions(-)

diff --git a/drivers/dma/dmatest.c b/drivers/dma/dmatest.c
index eabb230..3b36890 100644
--- a/drivers/dma/dmatest.c
+++ b/drivers/dma/dmatest.c
@@ -304,18 +304,9 @@ static int dmatest_func(void *data)

set_user_nice(current, 10);

- /*
- * src buffers are freed by the DMAEngine code with dma_unmap_single()
- * (except DMA_MEMCPY and DMA_XOR operations)
- * dst buffers are freed by ourselves below
- */
- flags = DMA_CTRL_ACK | DMA_PREP_INTERRUPT
- | DMA_COMPL_SKIP_DEST_UNMAP;
-
- if (thread->type == DMA_MEMCPY || thread->type == DMA_XOR)
- flags |= DMA_COMPL_SKIP_SRC_UNMAP;
- else
- flags |= DMA_COMPL_SRC_UNMAP_SINGLE;
+ /* src and dst buffers are freed by ourselves below */
+ flags = DMA_CTRL_ACK | DMA_PREP_INTERRUPT |
+ DMA_COMPL_SKIP_SRC_UNMAP | DMA_COMPL_SKIP_DEST_UNMAP;

while (!kthread_should_stop()
&& !(iterations && total_tests >= iterations)) {
@@ -448,12 +439,11 @@ static int dmatest_func(void *data)
continue;
}

- /* Unmap by myself (see DMA_COMPL_SKIP_DEST_UNMAP above) */
- if (thread->type == DMA_MEMCPY || thread->type == DMA_XOR) {
- for (i = 0; i < src_cnt; i++)
- dma_unmap_single(dev->dev, dma_srcs[i], len,
- DMA_TO_DEVICE);
- }
+ /* Unmap by myself */
+ for (i = 0; i < src_cnt; i++)
+ dma_unmap_single(dev->dev, dma_srcs[i], len,
+ DMA_TO_DEVICE);
+
for (i = 0; i < dst_cnt; i++)
dma_unmap_single(dev->dev, dma_dsts[i], test_buf_size,
DMA_BIDIRECTIONAL);
--
1.8.0

Subject: [PATCH 17/20] async_tx: do DMA unmap in core for PQ operations

Convert core async_tx code (do_async_gen_syndrome()) to do
DMA unmapping itself using the ->callback functionality.

Cc: Dan Williams <[email protected]>
Cc: Tomasz Figa <[email protected]>
Signed-off-by: Bartlomiej Zolnierkiewicz <[email protected]>
Signed-off-by: Kyungmin Park <[email protected]>
---
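The key restructuring: the sources used to be mapped once up front
and consumed with a sliding src_off across loop iterations. After
this patch each iteration maps only its own pq_src_cnt sources plus
the P/Q destinations and records them in the descriptor, so that the
per-descriptor callback can unmap exactly what was mapped for it
(condensed from the diff below):

	while (src_cnt > 0) {
		/* map P, Q and this chunk's sources here ... */
		tx = dma->device_prep_dma_pq(chan, dma_dest, &dma_src[0],
					     pq_src_cnt, &coefs[0], len,
					     dma_flags);
		...
		tx->dma_dst[0] = dma_dest[0];
		tx->dma_dst[1] = dma_dest[1];
		tx->dma_len = len;
		__async_tx_submit(chan, tx, do_async_gen_syndrome_cb, tx,
				  submit);
		src_cnt -= pq_src_cnt;
	}
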
crypto/async_tx/async_pq.c | 90 +++++++++++++++++++++++++++++++++-------------
include/linux/dmaengine.h | 2 +-
2 files changed, 66 insertions(+), 26 deletions(-)

diff --git a/crypto/async_tx/async_pq.c b/crypto/async_tx/async_pq.c
index 91d5d38..2848fe8 100644
--- a/crypto/async_tx/async_pq.c
+++ b/crypto/async_tx/async_pq.c
@@ -42,6 +42,26 @@ static struct page *pq_scribble_page;
#define P(b, d) (b[d-2])
#define Q(b, d) (b[d-1])

+static void do_async_gen_syndrome_cb(void *dma_async_param)
+{
+ struct dma_async_tx_descriptor *tx = dma_async_param;
+ struct dma_device *dev = tx->chan->device;
+ int i;
+
+ for (i = 0; i < 2; i++) {
+ if (tx->dma_dst[i])
+ dma_unmap_page(dev->dev, tx->dma_dst[i], tx->dma_len,
+ DMA_BIDIRECTIONAL);
+ }
+
+ for (i = 0; i < tx->dma_src_cnt; i++)
+ dma_unmap_page(dev->dev, tx->dma_src[i], tx->dma_len,
+ DMA_TO_DEVICE);
+
+ if (tx->orig_callback)
+ tx->orig_callback(tx->orig_callback_param);
+}
+
/**
* do_async_gen_syndrome - asynchronously calculate P and/or Q
*/
@@ -61,37 +81,21 @@ do_async_gen_syndrome(struct dma_chan *chan, struct page **blocks,
unsigned char coefs[src_cnt];
unsigned short pq_src_cnt;
dma_addr_t dma_dest[2];
- int src_off = 0;
+ int blocks_cnt = 0;
int idx;
int i;

- /* DMAs use destinations as sources, so use BIDIRECTIONAL mapping */
- if (P(blocks, disks))
- dma_dest[0] = dma_map_page(dma->dev, P(blocks, disks), offset,
- len, DMA_BIDIRECTIONAL);
- else
- dma_flags |= DMA_PREP_PQ_DISABLE_P;
- if (Q(blocks, disks))
- dma_dest[1] = dma_map_page(dma->dev, Q(blocks, disks), offset,
- len, DMA_BIDIRECTIONAL);
- else
- dma_flags |= DMA_PREP_PQ_DISABLE_Q;
-
- /* convert source addresses being careful to collapse 'empty'
- * sources and update the coefficients accordingly
- */
for (i = 0, idx = 0; i < src_cnt; i++) {
if (blocks[i] == NULL)
continue;
- dma_src[idx] = dma_map_page(dma->dev, blocks[i], offset, len,
- DMA_TO_DEVICE);
- coefs[idx] = scfs[i];
idx++;
}
src_cnt = idx;

while (src_cnt > 0) {
submit->flags = flags_orig;
+ dma_flags = DMA_COMPL_SKIP_SRC_UNMAP |
+ DMA_COMPL_SKIP_DEST_UNMAP;
pq_src_cnt = min(src_cnt, dma_maxpq(dma, dma_flags));
/* if we are submitting additional pqs, leave the chain open,
* clear the callback parameters, and leave the destination
@@ -100,11 +104,9 @@ do_async_gen_syndrome(struct dma_chan *chan, struct page **blocks,
if (src_cnt > pq_src_cnt) {
submit->flags &= ~ASYNC_TX_ACK;
submit->flags |= ASYNC_TX_FENCE;
- dma_flags |= DMA_COMPL_SKIP_DEST_UNMAP;
submit->cb_fn = NULL;
submit->cb_param = NULL;
} else {
- dma_flags &= ~DMA_COMPL_SKIP_DEST_UNMAP;
submit->cb_fn = cb_fn_orig;
submit->cb_param = cb_param_orig;
if (cb_fn_orig)
@@ -113,15 +115,46 @@ do_async_gen_syndrome(struct dma_chan *chan, struct page **blocks,
if (submit->flags & ASYNC_TX_FENCE)
dma_flags |= DMA_PREP_FENCE;

+ /*
+ * DMAs use destinations as sources,
+ * so use BIDIRECTIONAL mapping
+ */
+ if (P(blocks, disks))
+ dma_dest[0] = dma_map_page(dma->dev, P(blocks, disks),
+ offset, len, DMA_BIDIRECTIONAL);
+ else {
+ dma_dest[0] = 0;
+ dma_flags |= DMA_PREP_PQ_DISABLE_P;
+ }
+ if (Q(blocks, disks))
+ dma_dest[1] = dma_map_page(dma->dev, Q(blocks, disks),
+ offset, len, DMA_BIDIRECTIONAL);
+ else {
+ dma_dest[1] = 0;
+ dma_flags |= DMA_PREP_PQ_DISABLE_Q;
+ }
+
+ /* convert source addresses being careful to collapse 'empty'
+ * sources and update the coefficients accordingly
+ */
+ for (i = blocks_cnt, idx = 0; idx < pq_src_cnt; i++) {
+ if (blocks[i] == NULL)
+ continue;
+ dma_src[idx] = dma_map_page(dma->dev, blocks[i], offset,
+ len, DMA_TO_DEVICE);
+ coefs[idx] = scfs[i];
+ idx++;
+ }
+
/* Since we have clobbered the src_list we are committed
* to doing this asynchronously. Drivers force forward
* progress in case they can not provide a descriptor
*/
for (;;) {
tx = dma->device_prep_dma_pq(chan, dma_dest,
- &dma_src[src_off],
+ &dma_src[0],
pq_src_cnt,
- &coefs[src_off], len,
+ &coefs[0], len,
dma_flags);
if (likely(tx))
break;
@@ -129,12 +162,19 @@ do_async_gen_syndrome(struct dma_chan *chan, struct page **blocks,
dma_async_issue_pending(chan);
}

- async_tx_submit(chan, tx, submit);
+ for (i = 0; i < pq_src_cnt; i++)
+ tx->dma_src[i] = dma_src[i];
+ tx->dma_src_cnt = pq_src_cnt;
+ tx->dma_dst[0] = dma_dest[0];
+ tx->dma_dst[1] = dma_dest[1];
+ tx->dma_len = len;
+
+ __async_tx_submit(chan, tx, do_async_gen_syndrome_cb, tx,
+ submit);
submit->depend_tx = tx;

/* drop completed sources */
src_cnt -= pq_src_cnt;
- src_off += pq_src_cnt;

dma_flags |= DMA_PREP_CONTINUE;
}
diff --git a/include/linux/dmaengine.h b/include/linux/dmaengine.h
index 0df69f1..af3b941 100644
--- a/include/linux/dmaengine.h
+++ b/include/linux/dmaengine.h
@@ -393,7 +393,7 @@ typedef bool (*dma_filter_fn)(struct dma_chan *chan, void *filter_param);

typedef void (*dma_async_tx_callback)(void *dma_async_param);

-/* max value of ->max_xor from struct dma_device */
+/* max value of ->max_[xor,pq] from struct dma_device */
#define DMA_ASYNC_TX_MAX_ENT 128

/**
--
1.8.0

Subject: [PATCH 20/20] DMA: remove DMA unmap flags

Remove no longer needed DMA unmap flags:
- DMA_COMPL_SKIP_SRC_UNMAP
- DMA_COMPL_SKIP_DEST_UNMAP
- DMA_COMPL_SRC_UNMAP_SINGLE
- DMA_COMPL_DEST_UNMAP_SINGLE

Cc: Vinod Koul <[email protected]>
Cc: Tomasz Figa <[email protected]>
Signed-off-by: Bartlomiej Zolnierkiewicz <[email protected]>
Signed-off-by: Kyungmin Park <[email protected]>
---
crypto/async_tx/async_memcpy.c | 3 +--
crypto/async_tx/async_memset.c | 3 +--
crypto/async_tx/async_pq.c | 3 +--
crypto/async_tx/async_raid6_recov.c | 8 ++------
crypto/async_tx/async_xor.c | 6 ++----
drivers/ata/pata_arasan_cf.c | 3 +--
drivers/dma/dmaengine.c | 9 +++------
drivers/dma/dmatest.c | 3 +--
drivers/dma/ioat/dma.c | 3 +--
drivers/dma/ioat/dma_v3.c | 16 ++++------------
drivers/media/platform/m2m-deinterlace.c | 3 +--
drivers/media/platform/timblogiw.c | 2 +-
drivers/misc/carma/carma-fpga.c | 3 +--
drivers/mtd/nand/atmel_nand.c | 3 +--
drivers/mtd/nand/fsmc_nand.c | 2 --
drivers/net/ethernet/micrel/ks8842.c | 6 ++----
drivers/spi/spi-dw-mid.c | 4 ++--
include/linux/dmaengine.h | 18 ++++--------------
18 files changed, 29 insertions(+), 69 deletions(-)

diff --git a/crypto/async_tx/async_memcpy.c b/crypto/async_tx/async_memcpy.c
index cb0628e..ff5e803 100644
--- a/crypto/async_tx/async_memcpy.c
+++ b/crypto/async_tx/async_memcpy.c
@@ -65,8 +65,7 @@ async_memcpy(struct page *dest, struct page *src, unsigned int dest_offset,
dma_addr_t dma_dest, dma_src;

if (device && is_dma_copy_aligned(device, src_offset, dest_offset, len)) {
- unsigned long dma_prep_flags = DMA_COMPL_SKIP_SRC_UNMAP |
- DMA_COMPL_SKIP_DEST_UNMAP;
+ unsigned long dma_prep_flags = 0;

if (submit->cb_fn)
dma_prep_flags |= DMA_PREP_INTERRUPT;
diff --git a/crypto/async_tx/async_memset.c b/crypto/async_tx/async_memset.c
index cf30bf1..7775852 100644
--- a/crypto/async_tx/async_memset.c
+++ b/crypto/async_tx/async_memset.c
@@ -61,8 +61,7 @@ async_memset(struct page *dest, int val, unsigned int offset, size_t len,
dma_addr_t dma_dest;

if (device && is_dma_fill_aligned(device, offset, 0, len)) {
- unsigned long dma_prep_flags = DMA_COMPL_SKIP_SRC_UNMAP |
- DMA_COMPL_SKIP_DEST_UNMAP;
+ unsigned long dma_prep_flags = 0;

if (submit->cb_fn)
dma_prep_flags |= DMA_PREP_INTERRUPT;
diff --git a/crypto/async_tx/async_pq.c b/crypto/async_tx/async_pq.c
index 9e5500e..0963de5 100644
--- a/crypto/async_tx/async_pq.c
+++ b/crypto/async_tx/async_pq.c
@@ -94,8 +94,7 @@ do_async_gen_syndrome(struct dma_chan *chan, struct page **blocks,

while (src_cnt > 0) {
submit->flags = flags_orig;
- dma_flags = DMA_COMPL_SKIP_SRC_UNMAP |
- DMA_COMPL_SKIP_DEST_UNMAP;
+ dma_flags = 0;
pq_src_cnt = min(src_cnt, dma_maxpq(dma, dma_flags));
/* if we are submitting additional pqs, leave the chain open,
* clear the callback parameters, and leave the destination
diff --git a/crypto/async_tx/async_raid6_recov.c b/crypto/async_tx/async_raid6_recov.c
index 3db97aa..b0fceb1 100644
--- a/crypto/async_tx/async_raid6_recov.c
+++ b/crypto/async_tx/async_raid6_recov.c
@@ -57,9 +57,7 @@ async_sum_product(struct page *dest, struct page **srcs, unsigned char *coef,
dma_addr_t dma_src[2];
struct device *dev = dma->dev;
struct dma_async_tx_descriptor *tx;
- enum dma_ctrl_flags dma_flags = DMA_COMPL_SKIP_SRC_UNMAP |
- DMA_COMPL_SKIP_DEST_UNMAP |
- DMA_PREP_PQ_DISABLE_P;
+ enum dma_ctrl_flags dma_flags = DMA_PREP_PQ_DISABLE_P;

if (submit->flags & ASYNC_TX_FENCE)
dma_flags |= DMA_PREP_FENCE;
@@ -133,9 +131,7 @@ async_mult(struct page *dest, struct page *src, u8 coef, size_t len,
dma_addr_t dma_src[1];
struct device *dev = dma->dev;
struct dma_async_tx_descriptor *tx;
- enum dma_ctrl_flags dma_flags = DMA_COMPL_SKIP_SRC_UNMAP |
- DMA_COMPL_SKIP_DEST_UNMAP |
- DMA_PREP_PQ_DISABLE_P;
+ enum dma_ctrl_flags dma_flags = DMA_PREP_PQ_DISABLE_P;

if (submit->flags & ASYNC_TX_FENCE)
dma_flags |= DMA_PREP_FENCE;
diff --git a/crypto/async_tx/async_xor.c b/crypto/async_tx/async_xor.c
index d44da16..e632913 100644
--- a/crypto/async_tx/async_xor.c
+++ b/crypto/async_tx/async_xor.c
@@ -78,8 +78,7 @@ do_async_xor(struct dma_chan *chan, struct page *dest, struct page **src_list,

while (src_cnt) {
submit->flags = flags_orig;
- dma_flags = DMA_COMPL_SKIP_SRC_UNMAP |
- DMA_COMPL_SKIP_DEST_UNMAP;
+ dma_flags = 0;
xor_src_cnt = min(src_cnt, (int)dma->max_xor);
/* if we are submitting additional xors, leave the chain open
* and clear the callback parameters
@@ -333,8 +332,7 @@ async_xor_val(struct page *dest, struct page **src_list, unsigned int offset,

if (dma_src && device && src_cnt <= device->max_xor &&
is_dma_xor_aligned(device, offset, 0, len)) {
- unsigned long dma_prep_flags = DMA_COMPL_SKIP_SRC_UNMAP |
- DMA_COMPL_SKIP_DEST_UNMAP;
+ unsigned long dma_prep_flags = 0;
int i;

pr_debug("%s: (async) len: %zu\n", __func__, len);
diff --git a/drivers/ata/pata_arasan_cf.c b/drivers/ata/pata_arasan_cf.c
index 26201eb..9a6d38d 100644
--- a/drivers/ata/pata_arasan_cf.c
+++ b/drivers/ata/pata_arasan_cf.c
@@ -393,8 +393,7 @@ dma_xfer(struct arasan_cf_dev *acdev, dma_addr_t src, dma_addr_t dest, u32 len)
struct dma_async_tx_descriptor *tx;
struct dma_chan *chan = acdev->dma_chan;
dma_cookie_t cookie;
- unsigned long flags = DMA_PREP_INTERRUPT | DMA_COMPL_SKIP_SRC_UNMAP |
- DMA_COMPL_SKIP_DEST_UNMAP;
+ unsigned long flags = DMA_PREP_INTERRUPT;
int ret = 0;

tx = chan->device->device_prep_dma_memcpy(chan, dest, src, len, flags);
diff --git a/drivers/dma/dmaengine.c b/drivers/dma/dmaengine.c
index 5573e86..b137121 100644
--- a/drivers/dma/dmaengine.c
+++ b/drivers/dma/dmaengine.c
@@ -888,8 +888,7 @@ dma_async_memcpy_buf_to_buf(struct dma_chan *chan, void *dest,

dma_src = dma_map_single(dev->dev, src, len, DMA_TO_DEVICE);
dma_dest = dma_map_single(dev->dev, dest, len, DMA_FROM_DEVICE);
- flags = DMA_CTRL_ACK | DMA_COMPL_SKIP_SRC_UNMAP |
- DMA_COMPL_SKIP_DEST_UNMAP;
+ flags = DMA_CTRL_ACK;
tx = dev->device_prep_dma_memcpy(chan, dma_dest, dma_src, len, flags);

if (!tx) {
@@ -950,8 +949,7 @@ dma_async_memcpy_buf_to_pg(struct dma_chan *chan, struct page *page,

dma_src = dma_map_single(dev->dev, kdata, len, DMA_TO_DEVICE);
dma_dest = dma_map_page(dev->dev, page, offset, len, DMA_FROM_DEVICE);
- flags = DMA_CTRL_ACK | DMA_COMPL_SKIP_SRC_UNMAP |
- DMA_COMPL_SKIP_DEST_UNMAP;
+ flags = DMA_CTRL_ACK;
tx = dev->device_prep_dma_memcpy(chan, dma_dest, dma_src, len, flags);

if (!tx) {
@@ -1015,8 +1013,7 @@ dma_async_memcpy_pg_to_pg(struct dma_chan *chan, struct page *dest_pg,
dma_src = dma_map_page(dev->dev, src_pg, src_off, len, DMA_TO_DEVICE);
dma_dest = dma_map_page(dev->dev, dest_pg, dest_off, len,
DMA_FROM_DEVICE);
- flags = DMA_CTRL_ACK | DMA_COMPL_SKIP_SRC_UNMAP |
- DMA_COMPL_SKIP_DEST_UNMAP;
+ flags = DMA_CTRL_ACK;
tx = dev->device_prep_dma_memcpy(chan, dma_dest, dma_src, len, flags);

if (!tx) {
diff --git a/drivers/dma/dmatest.c b/drivers/dma/dmatest.c
index 3b36890..df78702 100644
--- a/drivers/dma/dmatest.c
+++ b/drivers/dma/dmatest.c
@@ -305,8 +305,7 @@ static int dmatest_func(void *data)
set_user_nice(current, 10);

/* src and dst buffers are freed by ourselves below */
- flags = DMA_CTRL_ACK | DMA_PREP_INTERRUPT |
- DMA_COMPL_SKIP_SRC_UNMAP | DMA_COMPL_SKIP_DEST_UNMAP;
+ flags = DMA_CTRL_ACK | DMA_PREP_INTERRUPT;

while (!kthread_should_stop()
&& !(iterations && total_tests >= iterations)) {
diff --git a/drivers/dma/ioat/dma.c b/drivers/dma/ioat/dma.c
index 0851ded..cc18aaa 100644
--- a/drivers/dma/ioat/dma.c
+++ b/drivers/dma/ioat/dma.c
@@ -817,8 +817,7 @@ int __devinit ioat_dma_self_test(struct ioatdma_device *device)

dma_src = dma_map_single(dev, src, IOAT_TEST_SIZE, DMA_TO_DEVICE);
dma_dest = dma_map_single(dev, dest, IOAT_TEST_SIZE, DMA_FROM_DEVICE);
- flags = DMA_COMPL_SKIP_SRC_UNMAP | DMA_COMPL_SKIP_DEST_UNMAP |
- DMA_PREP_INTERRUPT;
+ flags = DMA_PREP_INTERRUPT;
tx = device->common.device_prep_dma_memcpy(dma_chan, dma_dest, dma_src,
IOAT_TEST_SIZE, flags);
if (!tx) {
diff --git a/drivers/dma/ioat/dma_v3.c b/drivers/dma/ioat/dma_v3.c
index 5027c0b..64b3744 100644
--- a/drivers/dma/ioat/dma_v3.c
+++ b/drivers/dma/ioat/dma_v3.c
@@ -798,9 +798,7 @@ static int __devinit ioat_xor_val_self_test(struct ioatdma_device *device)
DMA_TO_DEVICE);
tx = dma->device_prep_dma_xor(dma_chan, dest_dma, dma_srcs,
IOAT_NUM_SRC_TEST, PAGE_SIZE,
- DMA_PREP_INTERRUPT |
- DMA_COMPL_SKIP_SRC_UNMAP |
- DMA_COMPL_SKIP_DEST_UNMAP);
+ DMA_PREP_INTERRUPT);

if (!tx) {
dev_err(dev, "Self-test xor prep failed\n");
@@ -861,9 +859,7 @@ static int __devinit ioat_xor_val_self_test(struct ioatdma_device *device)
DMA_TO_DEVICE);
tx = dma->device_prep_dma_xor_val(dma_chan, dma_srcs,
IOAT_NUM_SRC_TEST + 1, PAGE_SIZE,
- &xor_val_result, DMA_PREP_INTERRUPT |
- DMA_COMPL_SKIP_SRC_UNMAP |
- DMA_COMPL_SKIP_DEST_UNMAP);
+ &xor_val_result, DMA_PREP_INTERRUPT);
if (!tx) {
dev_err(dev, "Self-test zero prep failed\n");
err = -ENODEV;
@@ -909,9 +905,7 @@ static int __devinit ioat_xor_val_self_test(struct ioatdma_device *device)
dma_addr = dma_map_page(dev, dest, 0,
PAGE_SIZE, DMA_FROM_DEVICE);
tx = dma->device_prep_dma_memset(dma_chan, dma_addr, 0, PAGE_SIZE,
- DMA_PREP_INTERRUPT |
- DMA_COMPL_SKIP_SRC_UNMAP |
- DMA_COMPL_SKIP_DEST_UNMAP);
+ DMA_PREP_INTERRUPT);
if (!tx) {
dev_err(dev, "Self-test memset prep failed\n");
err = -ENODEV;
@@ -958,9 +952,7 @@ static int __devinit ioat_xor_val_self_test(struct ioatdma_device *device)
DMA_TO_DEVICE);
tx = dma->device_prep_dma_xor_val(dma_chan, dma_srcs,
IOAT_NUM_SRC_TEST + 1, PAGE_SIZE,
- &xor_val_result, DMA_PREP_INTERRUPT |
- DMA_COMPL_SKIP_SRC_UNMAP |
- DMA_COMPL_SKIP_DEST_UNMAP);
+ &xor_val_result, DMA_PREP_INTERRUPT);
if (!tx) {
dev_err(dev, "Self-test 2nd zero prep failed\n");
err = -ENODEV;
diff --git a/drivers/media/platform/m2m-deinterlace.c b/drivers/media/platform/m2m-deinterlace.c
index 45164c4..8c63b93 100644
--- a/drivers/media/platform/m2m-deinterlace.c
+++ b/drivers/media/platform/m2m-deinterlace.c
@@ -344,8 +344,7 @@ static void deinterlace_issue_dma(struct deinterlace_ctx *ctx, int op,
ctx->xt->dir = DMA_MEM_TO_MEM;
ctx->xt->src_sgl = false;
ctx->xt->dst_sgl = true;
- flags = DMA_CTRL_ACK | DMA_PREP_INTERRUPT |
- DMA_COMPL_SKIP_DEST_UNMAP | DMA_COMPL_SKIP_SRC_UNMAP;
+ flags = DMA_CTRL_ACK | DMA_PREP_INTERRUPT;

tx = dmadev->device_prep_interleaved_dma(chan, ctx->xt, flags);
if (tx == NULL) {
diff --git a/drivers/media/platform/timblogiw.c b/drivers/media/platform/timblogiw.c
index 02194c0..8ae630c 100644
--- a/drivers/media/platform/timblogiw.c
+++ b/drivers/media/platform/timblogiw.c
@@ -566,7 +566,7 @@ static void buffer_queue(struct videobuf_queue *vq, struct videobuf_buffer *vb)

desc = dmaengine_prep_slave_sg(fh->chan,
buf->sg, sg_elems, DMA_DEV_TO_MEM,
- DMA_PREP_INTERRUPT | DMA_COMPL_SKIP_SRC_UNMAP);
+ DMA_PREP_INTERRUPT);
if (!desc) {
spin_lock_irq(&fh->queue_lock);
list_del_init(&vb->queue);
diff --git a/drivers/misc/carma/carma-fpga.c b/drivers/misc/carma/carma-fpga.c
index 6b43f8c..ae4289b 100644
--- a/drivers/misc/carma/carma-fpga.c
+++ b/drivers/misc/carma/carma-fpga.c
@@ -631,8 +631,7 @@ static int data_submit_dma(struct fpga_device *priv, struct data_buf *buf)
struct dma_async_tx_descriptor *tx;
dma_cookie_t cookie;
dma_addr_t dst, src;
- unsigned long dma_flags = DMA_COMPL_SKIP_DEST_UNMAP |
- DMA_COMPL_SKIP_SRC_UNMAP;
+ unsigned long dma_flags = 0;

dst_sg = buf->vb.sglist;
dst_nents = buf->vb.sglen;
diff --git a/drivers/mtd/nand/atmel_nand.c b/drivers/mtd/nand/atmel_nand.c
index 9144557..5e1f88f 100644
--- a/drivers/mtd/nand/atmel_nand.c
+++ b/drivers/mtd/nand/atmel_nand.c
@@ -238,8 +238,7 @@ static int atmel_nand_dma_op(struct mtd_info *mtd, void *buf, int len,

dma_dev = host->dma_chan->device;

- flags = DMA_CTRL_ACK | DMA_PREP_INTERRUPT | DMA_COMPL_SKIP_SRC_UNMAP |
- DMA_COMPL_SKIP_DEST_UNMAP;
+ flags = DMA_CTRL_ACK | DMA_PREP_INTERRUPT;

phys_addr = dma_map_single(dma_dev->dev, p, len, dir);
if (dma_mapping_error(dma_dev->dev, phys_addr)) {
diff --git a/drivers/mtd/nand/fsmc_nand.c b/drivers/mtd/nand/fsmc_nand.c
index 679ede8..fdb98f0 100644
--- a/drivers/mtd/nand/fsmc_nand.c
+++ b/drivers/mtd/nand/fsmc_nand.c
@@ -569,8 +569,6 @@ static int dma_xfer(struct fsmc_nand_data *host, void *buffer, int len,
dma_dev = chan->device;
dma_addr = dma_map_single(dma_dev->dev, buffer, len, direction);

- flags |= DMA_COMPL_SKIP_SRC_UNMAP | DMA_COMPL_SKIP_DEST_UNMAP;
-
if (direction == DMA_TO_DEVICE) {
dma_src = dma_addr;
dma_dst = host->data_pa;
diff --git a/drivers/net/ethernet/micrel/ks8842.c b/drivers/net/ethernet/micrel/ks8842.c
index 24fb049..f657760 100644
--- a/drivers/net/ethernet/micrel/ks8842.c
+++ b/drivers/net/ethernet/micrel/ks8842.c
@@ -459,8 +459,7 @@ static int ks8842_tx_frame_dma(struct sk_buff *skb, struct net_device *netdev)
sg_dma_len(&ctl->sg) += 4 - sg_dma_len(&ctl->sg) % 4;

ctl->adesc = dmaengine_prep_slave_sg(ctl->chan,
- &ctl->sg, 1, DMA_MEM_TO_DEV,
- DMA_PREP_INTERRUPT | DMA_COMPL_SKIP_SRC_UNMAP);
+ &ctl->sg, 1, DMA_MEM_TO_DEV, DMA_PREP_INTERRUPT);
if (!ctl->adesc)
return NETDEV_TX_BUSY;

@@ -571,8 +570,7 @@ static int __ks8842_start_new_rx_dma(struct net_device *netdev)
sg_dma_len(sg) = DMA_BUFFER_SIZE;

ctl->adesc = dmaengine_prep_slave_sg(ctl->chan,
- sg, 1, DMA_DEV_TO_MEM,
- DMA_PREP_INTERRUPT | DMA_COMPL_SKIP_SRC_UNMAP);
+ sg, 1, DMA_DEV_TO_MEM, DMA_PREP_INTERRUPT);

if (!ctl->adesc)
goto out;
diff --git a/drivers/spi/spi-dw-mid.c b/drivers/spi/spi-dw-mid.c
index b9f0192..6d207af 100644
--- a/drivers/spi/spi-dw-mid.c
+++ b/drivers/spi/spi-dw-mid.c
@@ -150,7 +150,7 @@ static int mid_spi_dma_transfer(struct dw_spi *dws, int cs_change)
&dws->tx_sgl,
1,
DMA_MEM_TO_DEV,
- DMA_PREP_INTERRUPT | DMA_COMPL_SKIP_DEST_UNMAP);
+ DMA_PREP_INTERRUPT);
txdesc->callback = dw_spi_dma_done;
txdesc->callback_param = dws;

@@ -173,7 +173,7 @@ static int mid_spi_dma_transfer(struct dw_spi *dws, int cs_change)
&dws->rx_sgl,
1,
DMA_DEV_TO_MEM,
- DMA_PREP_INTERRUPT | DMA_COMPL_SKIP_DEST_UNMAP);
+ DMA_PREP_INTERRUPT);
rxdesc->callback = dw_spi_dma_done;
rxdesc->callback_param = dws;

diff --git a/include/linux/dmaengine.h b/include/linux/dmaengine.h
index af3b941..b1269ed 100644
--- a/include/linux/dmaengine.h
+++ b/include/linux/dmaengine.h
@@ -169,12 +169,6 @@ struct dma_interleaved_template {
* @DMA_CTRL_ACK - if clear, the descriptor cannot be reused until the client
* acknowledges receipt, i.e. has has a chance to establish any dependency
* chains
- * @DMA_COMPL_SKIP_SRC_UNMAP - set to disable dma-unmapping the source buffer(s)
- * @DMA_COMPL_SKIP_DEST_UNMAP - set to disable dma-unmapping the destination(s)
- * @DMA_COMPL_SRC_UNMAP_SINGLE - set to do the source dma-unmapping as single
- * (if not set, do the source dma-unmapping as page)
- * @DMA_COMPL_DEST_UNMAP_SINGLE - set to do the destination dma-unmapping as single
- * (if not set, do the destination dma-unmapping as page)
* @DMA_PREP_PQ_DISABLE_P - prevent generation of P while generating Q
* @DMA_PREP_PQ_DISABLE_Q - prevent generation of Q while generating P
* @DMA_PREP_CONTINUE - indicate to a driver that it is reusing buffers as
@@ -186,14 +180,10 @@ struct dma_interleaved_template {
enum dma_ctrl_flags {
DMA_PREP_INTERRUPT = (1 << 0),
DMA_CTRL_ACK = (1 << 1),
- DMA_COMPL_SKIP_SRC_UNMAP = (1 << 2),
- DMA_COMPL_SKIP_DEST_UNMAP = (1 << 3),
- DMA_COMPL_SRC_UNMAP_SINGLE = (1 << 4),
- DMA_COMPL_DEST_UNMAP_SINGLE = (1 << 5),
- DMA_PREP_PQ_DISABLE_P = (1 << 6),
- DMA_PREP_PQ_DISABLE_Q = (1 << 7),
- DMA_PREP_CONTINUE = (1 << 8),
- DMA_PREP_FENCE = (1 << 9),
+ DMA_PREP_PQ_DISABLE_P = (1 << 2),
+ DMA_PREP_PQ_DISABLE_Q = (1 << 3),
+ DMA_PREP_CONTINUE = (1 << 4),
+ DMA_PREP_FENCE = (1 << 5),
};

/**
--
1.8.0

Subject: [PATCH 19/20] DMA: remove DMA unmap from drivers

Remove support for DMA unmapping from drivers as it is no longer
needed (the DMA core code now handles it).

Cc: Vinod Koul <[email protected]>
Cc: Tomasz Figa <[email protected]>
Signed-off-by: Bartlomiej Zolnierkiewicz <[email protected]>
Signed-off-by: Kyungmin Park <[email protected]>
---
arch/arm/include/asm/hardware/iop3xx-adma.h | 30 ----
arch/arm/mach-iop13xx/include/mach/adma.h | 26 ---
drivers/dma/amba-pl08x.c | 31 ----
drivers/dma/at_hdmac.c | 25 ---
drivers/dma/dw_dmac.c | 20 ---
drivers/dma/ep93xx_dma.c | 32 +---
drivers/dma/fsldma.c | 16 --
drivers/dma/ioat/dma.c | 16 --
drivers/dma/ioat/dma.h | 12 --
drivers/dma/ioat/dma_v2.c | 1 -
drivers/dma/ioat/dma_v3.c | 119 ------------
drivers/dma/iop-adma.c | 70 +-------
drivers/dma/mv_xor.c | 45 +----
drivers/dma/ppc4xx/adma.c | 270 ----------------------------
drivers/dma/timb_dma.c | 36 ----
drivers/dma/txx9dmac.c | 24 ---
16 files changed, 4 insertions(+), 769 deletions(-)

diff --git a/arch/arm/include/asm/hardware/iop3xx-adma.h b/arch/arm/include/asm/hardware/iop3xx-adma.h
index 9b28f12..240b29e 100644
--- a/arch/arm/include/asm/hardware/iop3xx-adma.h
+++ b/arch/arm/include/asm/hardware/iop3xx-adma.h
@@ -393,36 +393,6 @@ static inline int iop_chan_zero_sum_slot_count(size_t len, int src_cnt,
return slot_cnt;
}

-static inline int iop_desc_is_pq(struct iop_adma_desc_slot *desc)
-{
- return 0;
-}
-
-static inline u32 iop_desc_get_dest_addr(struct iop_adma_desc_slot *desc,
- struct iop_adma_chan *chan)
-{
- union iop3xx_desc hw_desc = { .ptr = desc->hw_desc, };
-
- switch (chan->device->id) {
- case DMA0_ID:
- case DMA1_ID:
- return hw_desc.dma->dest_addr;
- case AAU_ID:
- return hw_desc.aau->dest_addr;
- default:
- BUG();
- }
- return 0;
-}
-
-
-static inline u32 iop_desc_get_qdest_addr(struct iop_adma_desc_slot *desc,
- struct iop_adma_chan *chan)
-{
- BUG();
- return 0;
-}
-
static inline u32 iop_desc_get_byte_count(struct iop_adma_desc_slot *desc,
struct iop_adma_chan *chan)
{
diff --git a/arch/arm/mach-iop13xx/include/mach/adma.h b/arch/arm/mach-iop13xx/include/mach/adma.h
index 6d3782d..a86fd0e 100644
--- a/arch/arm/mach-iop13xx/include/mach/adma.h
+++ b/arch/arm/mach-iop13xx/include/mach/adma.h
@@ -218,20 +218,6 @@ iop_chan_xor_slot_count(size_t len, int src_cnt, int *slots_per_op)
#define iop_chan_pq_slot_count iop_chan_xor_slot_count
#define iop_chan_pq_zero_sum_slot_count iop_chan_xor_slot_count

-static inline u32 iop_desc_get_dest_addr(struct iop_adma_desc_slot *desc,
- struct iop_adma_chan *chan)
-{
- struct iop13xx_adma_desc_hw *hw_desc = desc->hw_desc;
- return hw_desc->dest_addr;
-}
-
-static inline u32 iop_desc_get_qdest_addr(struct iop_adma_desc_slot *desc,
- struct iop_adma_chan *chan)
-{
- struct iop13xx_adma_desc_hw *hw_desc = desc->hw_desc;
- return hw_desc->q_dest_addr;
-}
-
static inline u32 iop_desc_get_byte_count(struct iop_adma_desc_slot *desc,
struct iop_adma_chan *chan)
{
@@ -350,18 +336,6 @@ iop_desc_init_pq(struct iop_adma_desc_slot *desc, int src_cnt,
hw_desc->desc_ctrl = u_desc_ctrl.value;
}

-static inline int iop_desc_is_pq(struct iop_adma_desc_slot *desc)
-{
- struct iop13xx_adma_desc_hw *hw_desc = desc->hw_desc;
- union {
- u32 value;
- struct iop13xx_adma_desc_ctrl field;
- } u_desc_ctrl;
-
- u_desc_ctrl.value = hw_desc->desc_ctrl;
- return u_desc_ctrl.field.pq_xfer_en;
-}
-
static inline void
iop_desc_init_pq_zero_sum(struct iop_adma_desc_slot *desc, int src_cnt,
unsigned long flags)
diff --git a/drivers/dma/amba-pl08x.c b/drivers/dma/amba-pl08x.c
index d1cc579..06baca0 100644
--- a/drivers/dma/amba-pl08x.c
+++ b/drivers/dma/amba-pl08x.c
@@ -1050,42 +1050,11 @@ static void pl08x_free_txd(struct pl08x_driver_data *pl08x,
kfree(txd);
}

-static void pl08x_unmap_buffers(struct pl08x_txd *txd)
-{
- struct device *dev = txd->vd.tx.chan->device->dev;
- struct pl08x_sg *dsg;
-
- if (!(txd->vd.tx.flags & DMA_COMPL_SKIP_SRC_UNMAP)) {
- if (txd->vd.tx.flags & DMA_COMPL_SRC_UNMAP_SINGLE)
- list_for_each_entry(dsg, &txd->dsg_list, node)
- dma_unmap_single(dev, dsg->src_addr, dsg->len,
- DMA_TO_DEVICE);
- else {
- list_for_each_entry(dsg, &txd->dsg_list, node)
- dma_unmap_page(dev, dsg->src_addr, dsg->len,
- DMA_TO_DEVICE);
- }
- }
- if (!(txd->vd.tx.flags & DMA_COMPL_SKIP_DEST_UNMAP)) {
- if (txd->vd.tx.flags & DMA_COMPL_DEST_UNMAP_SINGLE)
- list_for_each_entry(dsg, &txd->dsg_list, node)
- dma_unmap_single(dev, dsg->dst_addr, dsg->len,
- DMA_FROM_DEVICE);
- else
- list_for_each_entry(dsg, &txd->dsg_list, node)
- dma_unmap_page(dev, dsg->dst_addr, dsg->len,
- DMA_FROM_DEVICE);
- }
-}
-
static void pl08x_desc_free(struct virt_dma_desc *vd)
{
struct pl08x_txd *txd = to_pl08x_txd(&vd->tx);
struct pl08x_dma_chan *plchan = to_pl08x_chan(vd->tx.chan);

- if (!plchan->slave)
- pl08x_unmap_buffers(txd);
-
if (!txd->done)
pl08x_release_mux(plchan);

diff --git a/drivers/dma/at_hdmac.c b/drivers/dma/at_hdmac.c
index 13a02f4..2da1f9a 100644
--- a/drivers/dma/at_hdmac.c
+++ b/drivers/dma/at_hdmac.c
@@ -252,31 +252,6 @@ atc_chain_complete(struct at_dma_chan *atchan, struct at_desc *desc)
/* move myself to free_list */
list_move(&desc->desc_node, &atchan->free_list);

- /* unmap dma addresses (not on slave channels) */
- if (!atchan->chan_common.private) {
- struct device *parent = chan2parent(&atchan->chan_common);
- if (!(txd->flags & DMA_COMPL_SKIP_DEST_UNMAP)) {
- if (txd->flags & DMA_COMPL_DEST_UNMAP_SINGLE)
- dma_unmap_single(parent,
- desc->lli.daddr,
- desc->len, DMA_FROM_DEVICE);
- else
- dma_unmap_page(parent,
- desc->lli.daddr,
- desc->len, DMA_FROM_DEVICE);
- }
- if (!(txd->flags & DMA_COMPL_SKIP_SRC_UNMAP)) {
- if (txd->flags & DMA_COMPL_SRC_UNMAP_SINGLE)
- dma_unmap_single(parent,
- desc->lli.saddr,
- desc->len, DMA_TO_DEVICE);
- else
- dma_unmap_page(parent,
- desc->lli.saddr,
- desc->len, DMA_TO_DEVICE);
- }
- }
-
/* for cyclic transfers,
* no need to replay callback function while stopping */
if (!atc_chan_is_cyclic(atchan)) {
diff --git a/drivers/dma/dw_dmac.c b/drivers/dma/dw_dmac.c
index c4b0eb3..abc7994 100644
--- a/drivers/dma/dw_dmac.c
+++ b/drivers/dma/dw_dmac.c
@@ -326,26 +326,6 @@ dwc_descriptor_complete(struct dw_dma_chan *dwc, struct dw_desc *desc,
list_splice_init(&desc->tx_list, &dwc->free_list);
list_move(&desc->desc_node, &dwc->free_list);

- if (!dwc->chan.private) {
- struct device *parent = chan2parent(&dwc->chan);
- if (!(txd->flags & DMA_COMPL_SKIP_DEST_UNMAP)) {
- if (txd->flags & DMA_COMPL_DEST_UNMAP_SINGLE)
- dma_unmap_single(parent, desc->lli.dar,
- desc->len, DMA_FROM_DEVICE);
- else
- dma_unmap_page(parent, desc->lli.dar,
- desc->len, DMA_FROM_DEVICE);
- }
- if (!(txd->flags & DMA_COMPL_SKIP_SRC_UNMAP)) {
- if (txd->flags & DMA_COMPL_SRC_UNMAP_SINGLE)
- dma_unmap_single(parent, desc->lli.sar,
- desc->len, DMA_TO_DEVICE);
- else
- dma_unmap_page(parent, desc->lli.sar,
- desc->len, DMA_TO_DEVICE);
- }
- }
-
spin_unlock_irqrestore(&dwc->lock, flags);

if (callback_required && callback)
diff --git a/drivers/dma/ep93xx_dma.c b/drivers/dma/ep93xx_dma.c
index bcfde40..9a934e0 100644
--- a/drivers/dma/ep93xx_dma.c
+++ b/drivers/dma/ep93xx_dma.c
@@ -733,28 +733,6 @@ static void ep93xx_dma_advance_work(struct ep93xx_dma_chan *edmac)
spin_unlock_irqrestore(&edmac->lock, flags);
}

-static void ep93xx_dma_unmap_buffers(struct ep93xx_dma_desc *desc)
-{
- struct device *dev = desc->txd.chan->device->dev;
-
- if (!(desc->txd.flags & DMA_COMPL_SKIP_SRC_UNMAP)) {
- if (desc->txd.flags & DMA_COMPL_SRC_UNMAP_SINGLE)
- dma_unmap_single(dev, desc->src_addr, desc->size,
- DMA_TO_DEVICE);
- else
- dma_unmap_page(dev, desc->src_addr, desc->size,
- DMA_TO_DEVICE);
- }
- if (!(desc->txd.flags & DMA_COMPL_SKIP_DEST_UNMAP)) {
- if (desc->txd.flags & DMA_COMPL_DEST_UNMAP_SINGLE)
- dma_unmap_single(dev, desc->dst_addr, desc->size,
- DMA_FROM_DEVICE);
- else
- dma_unmap_page(dev, desc->dst_addr, desc->size,
- DMA_FROM_DEVICE);
- }
-}
-
static void ep93xx_dma_tasklet(unsigned long data)
{
struct ep93xx_dma_chan *edmac = (struct ep93xx_dma_chan *)data;
@@ -786,16 +764,8 @@ static void ep93xx_dma_tasklet(unsigned long data)
ep93xx_dma_advance_work(edmac);

/* Now we can release all the chained descriptors */
- list_for_each_entry_safe(desc, d, &list, node) {
- /*
- * For the memcpy channels the API requires us to unmap the
- * buffers unless requested otherwise.
- */
- if (!edmac->chan.private)
- ep93xx_dma_unmap_buffers(desc);
-
+ list_for_each_entry_safe(desc, d, &list, node)
ep93xx_dma_desc_put(edmac, desc);
- }

if (callback)
callback(callback_param);
diff --git a/drivers/dma/fsldma.c b/drivers/dma/fsldma.c
index 094437b..dc650e1 100644
--- a/drivers/dma/fsldma.c
+++ b/drivers/dma/fsldma.c
@@ -868,22 +868,6 @@ static void fsldma_cleanup_descriptor(struct fsldma_chan *chan,
/* Run any dependencies */
dma_run_dependencies(txd);

- /* Unmap the dst buffer, if requested */
- if (!(txd->flags & DMA_COMPL_SKIP_DEST_UNMAP)) {
- if (txd->flags & DMA_COMPL_DEST_UNMAP_SINGLE)
- dma_unmap_single(dev, dst, len, DMA_FROM_DEVICE);
- else
- dma_unmap_page(dev, dst, len, DMA_FROM_DEVICE);
- }
-
- /* Unmap the src buffer, if requested */
- if (!(txd->flags & DMA_COMPL_SKIP_SRC_UNMAP)) {
- if (txd->flags & DMA_COMPL_SRC_UNMAP_SINGLE)
- dma_unmap_single(dev, src, len, DMA_TO_DEVICE);
- else
- dma_unmap_page(dev, src, len, DMA_TO_DEVICE);
- }
-
#ifdef FSL_DMA_LD_DEBUG
chan_dbg(chan, "LD %p free\n", desc);
#endif
diff --git a/drivers/dma/ioat/dma.c b/drivers/dma/ioat/dma.c
index 464138a..0851ded 100644
--- a/drivers/dma/ioat/dma.c
+++ b/drivers/dma/ioat/dma.c
@@ -531,21 +531,6 @@ static void ioat1_cleanup_event(unsigned long data)
writew(IOAT_CHANCTRL_RUN, ioat->base.reg_base + IOAT_CHANCTRL_OFFSET);
}

-void ioat_dma_unmap(struct ioat_chan_common *chan, enum dma_ctrl_flags flags,
- size_t len, struct ioat_dma_descriptor *hw)
-{
- struct pci_dev *pdev = chan->device->pdev;
- size_t offset = len - hw->size;
-
- if (!(flags & DMA_COMPL_SKIP_DEST_UNMAP))
- ioat_unmap(pdev, hw->dst_addr - offset, len,
- PCI_DMA_FROMDEVICE, flags, 1);
-
- if (!(flags & DMA_COMPL_SKIP_SRC_UNMAP))
- ioat_unmap(pdev, hw->src_addr - offset, len,
- PCI_DMA_TODEVICE, flags, 0);
-}
-
dma_addr_t ioat_get_current_completion(struct ioat_chan_common *chan)
{
dma_addr_t phys_complete;
@@ -602,7 +587,6 @@ static void __cleanup(struct ioat_dma_chan *ioat, dma_addr_t phys_complete)
dump_desc_dbg(ioat, desc);
if (tx->cookie) {
dma_cookie_complete(tx);
- ioat_dma_unmap(chan, tx->flags, desc->len, desc->hw);
ioat->active -= desc->hw->tx_cnt;
if (tx->callback) {
tx->callback(tx->callback_param);
diff --git a/drivers/dma/ioat/dma.h b/drivers/dma/ioat/dma.h
index 5e8fe01..4e9e027 100644
--- a/drivers/dma/ioat/dma.h
+++ b/drivers/dma/ioat/dma.h
@@ -293,16 +293,6 @@ static inline bool is_ioat_bug(unsigned long err)
return !!err;
}

-static inline void ioat_unmap(struct pci_dev *pdev, dma_addr_t addr, size_t len,
- int direction, enum dma_ctrl_flags flags, bool dst)
-{
- if ((dst && (flags & DMA_COMPL_DEST_UNMAP_SINGLE)) ||
- (!dst && (flags & DMA_COMPL_SRC_UNMAP_SINGLE)))
- pci_unmap_single(pdev, addr, len, direction);
- else
- pci_unmap_page(pdev, addr, len, direction);
-}
-
int __devinit ioat_probe(struct ioatdma_device *device);
int __devinit ioat_register(struct ioatdma_device *device);
int __devinit ioat1_dma_probe(struct ioatdma_device *dev, int dca);
@@ -315,8 +305,6 @@ void ioat_init_channel(struct ioatdma_device *device,
struct ioat_chan_common *chan, int idx);
enum dma_status ioat_dma_tx_status(struct dma_chan *c, dma_cookie_t cookie,
struct dma_tx_state *txstate);
-void ioat_dma_unmap(struct ioat_chan_common *chan, enum dma_ctrl_flags flags,
- size_t len, struct ioat_dma_descriptor *hw);
bool ioat_cleanup_preamble(struct ioat_chan_common *chan,
dma_addr_t *phys_complete);
void ioat_kobject_add(struct ioatdma_device *device, struct kobj_type *type);
diff --git a/drivers/dma/ioat/dma_v2.c b/drivers/dma/ioat/dma_v2.c
index b9d6678..d9abfba 100644
--- a/drivers/dma/ioat/dma_v2.c
+++ b/drivers/dma/ioat/dma_v2.c
@@ -148,7 +148,6 @@ static void __cleanup(struct ioat2_dma_chan *ioat, dma_addr_t phys_complete)
tx = &desc->txd;
dump_desc_dbg(ioat, desc);
if (tx->cookie) {
- ioat_dma_unmap(chan, tx->flags, desc->len, desc->hw);
dma_cookie_complete(tx);
if (tx->callback) {
tx->callback(tx->callback_param);
diff --git a/drivers/dma/ioat/dma_v3.c b/drivers/dma/ioat/dma_v3.c
index 6456f7d..5027c0b 100644
--- a/drivers/dma/ioat/dma_v3.c
+++ b/drivers/dma/ioat/dma_v3.c
@@ -111,124 +111,6 @@ static void pq_set_src(struct ioat_raw_descriptor *descs[2],
pq->coef[idx] = coef;
}

-static void ioat3_dma_unmap(struct ioat2_dma_chan *ioat,
- struct ioat_ring_ent *desc, int idx)
-{
- struct ioat_chan_common *chan = &ioat->base;
- struct pci_dev *pdev = chan->device->pdev;
- size_t len = desc->len;
- size_t offset = len - desc->hw->size;
- struct dma_async_tx_descriptor *tx = &desc->txd;
- enum dma_ctrl_flags flags = tx->flags;
-
- switch (desc->hw->ctl_f.op) {
- case IOAT_OP_COPY:
- if (!desc->hw->ctl_f.null) /* skip 'interrupt' ops */
- ioat_dma_unmap(chan, flags, len, desc->hw);
- break;
- case IOAT_OP_FILL: {
- struct ioat_fill_descriptor *hw = desc->fill;
-
- if (!(flags & DMA_COMPL_SKIP_DEST_UNMAP))
- ioat_unmap(pdev, hw->dst_addr - offset, len,
- PCI_DMA_FROMDEVICE, flags, 1);
- break;
- }
- case IOAT_OP_XOR_VAL:
- case IOAT_OP_XOR: {
- struct ioat_xor_descriptor *xor = desc->xor;
- struct ioat_ring_ent *ext;
- struct ioat_xor_ext_descriptor *xor_ex = NULL;
- int src_cnt = src_cnt_to_sw(xor->ctl_f.src_cnt);
- struct ioat_raw_descriptor *descs[2];
- int i;
-
- if (src_cnt > 5) {
- ext = ioat2_get_ring_ent(ioat, idx + 1);
- xor_ex = ext->xor_ex;
- }
-
- if (!(flags & DMA_COMPL_SKIP_SRC_UNMAP)) {
- descs[0] = (struct ioat_raw_descriptor *) xor;
- descs[1] = (struct ioat_raw_descriptor *) xor_ex;
- for (i = 0; i < src_cnt; i++) {
- dma_addr_t src = xor_get_src(descs, i);
-
- ioat_unmap(pdev, src - offset, len,
- PCI_DMA_TODEVICE, flags, 0);
- }
-
- /* dest is a source in xor validate operations */
- if (xor->ctl_f.op == IOAT_OP_XOR_VAL) {
- ioat_unmap(pdev, xor->dst_addr - offset, len,
- PCI_DMA_TODEVICE, flags, 1);
- break;
- }
- }
-
- if (!(flags & DMA_COMPL_SKIP_DEST_UNMAP))
- ioat_unmap(pdev, xor->dst_addr - offset, len,
- PCI_DMA_FROMDEVICE, flags, 1);
- break;
- }
- case IOAT_OP_PQ_VAL:
- case IOAT_OP_PQ: {
- struct ioat_pq_descriptor *pq = desc->pq;
- struct ioat_ring_ent *ext;
- struct ioat_pq_ext_descriptor *pq_ex = NULL;
- int src_cnt = src_cnt_to_sw(pq->ctl_f.src_cnt);
- struct ioat_raw_descriptor *descs[2];
- int i;
-
- if (src_cnt > 3) {
- ext = ioat2_get_ring_ent(ioat, idx + 1);
- pq_ex = ext->pq_ex;
- }
-
- /* in the 'continue' case don't unmap the dests as sources */
- if (dmaf_p_disabled_continue(flags))
- src_cnt--;
- else if (dmaf_continue(flags))
- src_cnt -= 3;
-
- if (!(flags & DMA_COMPL_SKIP_SRC_UNMAP)) {
- descs[0] = (struct ioat_raw_descriptor *) pq;
- descs[1] = (struct ioat_raw_descriptor *) pq_ex;
- for (i = 0; i < src_cnt; i++) {
- dma_addr_t src = pq_get_src(descs, i);
-
- ioat_unmap(pdev, src - offset, len,
- PCI_DMA_TODEVICE, flags, 0);
- }
-
- /* the dests are sources in pq validate operations */
- if (pq->ctl_f.op == IOAT_OP_XOR_VAL) {
- if (!(flags & DMA_PREP_PQ_DISABLE_P))
- ioat_unmap(pdev, pq->p_addr - offset,
- len, PCI_DMA_TODEVICE, flags, 0);
- if (!(flags & DMA_PREP_PQ_DISABLE_Q))
- ioat_unmap(pdev, pq->q_addr - offset,
- len, PCI_DMA_TODEVICE, flags, 0);
- break;
- }
- }
-
- if (!(flags & DMA_COMPL_SKIP_DEST_UNMAP)) {
- if (!(flags & DMA_PREP_PQ_DISABLE_P))
- ioat_unmap(pdev, pq->p_addr - offset, len,
- PCI_DMA_BIDIRECTIONAL, flags, 1);
- if (!(flags & DMA_PREP_PQ_DISABLE_Q))
- ioat_unmap(pdev, pq->q_addr - offset, len,
- PCI_DMA_BIDIRECTIONAL, flags, 1);
- }
- break;
- }
- default:
- dev_err(&pdev->dev, "%s: unknown op type: %#x\n",
- __func__, desc->hw->ctl_f.op);
- }
-}
-
static bool desc_has_ext(struct ioat_ring_ent *desc)
{
struct ioat_dma_descriptor *hw = desc->hw;
@@ -279,7 +161,6 @@ static void __cleanup(struct ioat2_dma_chan *ioat, dma_addr_t phys_complete)
tx = &desc->txd;
if (tx->cookie) {
dma_cookie_complete(tx);
- ioat3_dma_unmap(ioat, desc, idx + i);
if (tx->callback) {
tx->callback(tx->callback_param);
tx->callback = NULL;
diff --git a/drivers/dma/iop-adma.c b/drivers/dma/iop-adma.c
index 79e3eba..9623b5e 100644
--- a/drivers/dma/iop-adma.c
+++ b/drivers/dma/iop-adma.c
@@ -64,77 +64,15 @@ static void iop_adma_free_slots(struct iop_adma_desc_slot *slot)
static void
iop_desc_unmap(struct iop_adma_chan *iop_chan, struct iop_adma_desc_slot *desc)
{
- struct dma_async_tx_descriptor *tx = &desc->async_tx;
- struct iop_adma_desc_slot *unmap = desc->group_head;
- struct device *dev = &iop_chan->device->pdev->dev;
- u32 len = unmap->unmap_len;
- enum dma_ctrl_flags flags = tx->flags;
- u32 src_cnt;
- dma_addr_t addr;
- dma_addr_t dest;
-
- src_cnt = unmap->unmap_src_cnt;
- dest = iop_desc_get_dest_addr(unmap, iop_chan);
- if (!(flags & DMA_COMPL_SKIP_DEST_UNMAP)) {
- enum dma_data_direction dir;
-
- if (src_cnt > 1) /* is xor? */
- dir = DMA_BIDIRECTIONAL;
- else
- dir = DMA_FROM_DEVICE;
-
- dma_unmap_page(dev, dest, len, dir);
- }
-
- if (!(flags & DMA_COMPL_SKIP_SRC_UNMAP)) {
- while (src_cnt--) {
- addr = iop_desc_get_src_addr(unmap, iop_chan, src_cnt);
- if (addr == dest)
- continue;
- dma_unmap_page(dev, addr, len, DMA_TO_DEVICE);
- }
- }
desc->group_head = NULL;
}

static void
iop_desc_unmap_pq(struct iop_adma_chan *iop_chan, struct iop_adma_desc_slot *desc)
{
- struct dma_async_tx_descriptor *tx = &desc->async_tx;
- struct iop_adma_desc_slot *unmap = desc->group_head;
- struct device *dev = &iop_chan->device->pdev->dev;
- u32 len = unmap->unmap_len;
- enum dma_ctrl_flags flags = tx->flags;
- u32 src_cnt = unmap->unmap_src_cnt;
- dma_addr_t pdest = iop_desc_get_dest_addr(unmap, iop_chan);
- dma_addr_t qdest = iop_desc_get_qdest_addr(unmap, iop_chan);
- int i;
-
- if (tx->flags & DMA_PREP_CONTINUE)
- src_cnt -= 3;
-
- if (!(flags & DMA_COMPL_SKIP_DEST_UNMAP) && !desc->pq_check_result) {
- dma_unmap_page(dev, pdest, len, DMA_BIDIRECTIONAL);
- dma_unmap_page(dev, qdest, len, DMA_BIDIRECTIONAL);
- }
-
- if (!(flags & DMA_COMPL_SKIP_SRC_UNMAP)) {
- dma_addr_t addr;
-
- for (i = 0; i < src_cnt; i++) {
- addr = iop_desc_get_src_addr(unmap, iop_chan, i);
- dma_unmap_page(dev, addr, len, DMA_TO_DEVICE);
- }
- if (desc->pq_check_result) {
- dma_unmap_page(dev, pdest, len, DMA_TO_DEVICE);
- dma_unmap_page(dev, qdest, len, DMA_TO_DEVICE);
- }
- }
-
desc->group_head = NULL;
}

-
static dma_cookie_t
iop_adma_run_tx_complete_actions(struct iop_adma_desc_slot *desc,
struct iop_adma_chan *iop_chan, dma_cookie_t cookie)
@@ -155,12 +93,8 @@ iop_adma_run_tx_complete_actions(struct iop_adma_desc_slot *desc,
/* unmap dma addresses
* (unmap_single vs unmap_page?)
*/
- if (desc->group_head && desc->unmap_len) {
- if (iop_desc_is_pq(desc))
- iop_desc_unmap_pq(iop_chan, desc);
- else
- iop_desc_unmap(iop_chan, desc);
- }
+ if (desc->group_head && desc->unmap_len)
+ desc->group_head = NULL;
}

/* run dependent operations */
diff --git a/drivers/dma/mv_xor.c b/drivers/dma/mv_xor.c
index e362e2b..a97f0ba 100644
--- a/drivers/dma/mv_xor.c
+++ b/drivers/dma/mv_xor.c
@@ -57,14 +57,6 @@ static u32 mv_desc_get_dest_addr(struct mv_xor_desc_slot *desc)
return hw_desc->phy_dest_addr;
}

-static u32 mv_desc_get_src_addr(struct mv_xor_desc_slot *desc,
- int src_idx)
-{
- struct mv_xor_desc *hw_desc = desc->hw_desc;
- return hw_desc->phy_src_addr[src_idx];
-}
-
-
static void mv_desc_set_byte_count(struct mv_xor_desc_slot *desc,
u32 byte_count)
{
@@ -303,43 +295,8 @@ mv_xor_run_tx_complete_actions(struct mv_xor_desc_slot *desc,
desc->async_tx.callback(
desc->async_tx.callback_param);

- /* unmap dma addresses
- * (unmap_single vs unmap_page?)
- */
- if (desc->group_head && desc->unmap_len) {
- struct mv_xor_desc_slot *unmap = desc->group_head;
- struct device *dev =
- &mv_chan->device->pdev->dev;
- u32 len = unmap->unmap_len;
- enum dma_ctrl_flags flags = desc->async_tx.flags;
- u32 src_cnt;
- dma_addr_t addr;
- dma_addr_t dest;
-
- src_cnt = unmap->unmap_src_cnt;
- dest = mv_desc_get_dest_addr(unmap);
- if (!(flags & DMA_COMPL_SKIP_DEST_UNMAP)) {
- enum dma_data_direction dir;
-
- if (src_cnt > 1) /* is xor ? */
- dir = DMA_BIDIRECTIONAL;
- else
- dir = DMA_FROM_DEVICE;
- dma_unmap_page(dev, dest, len, dir);
- }
-
- if (!(flags & DMA_COMPL_SKIP_SRC_UNMAP)) {
- while (src_cnt--) {
- addr = mv_desc_get_src_addr(unmap,
- src_cnt);
- if (addr == dest)
- continue;
- dma_unmap_page(dev, addr, len,
- DMA_TO_DEVICE);
- }
- }
+ if (desc->group_head && desc->unmap_len)
desc->group_head = NULL;
- }
}

/* run dependent operations */
diff --git a/drivers/dma/ppc4xx/adma.c b/drivers/dma/ppc4xx/adma.c
index f72348d..a03909f 100644
--- a/drivers/dma/ppc4xx/adma.c
+++ b/drivers/dma/ppc4xx/adma.c
@@ -802,218 +802,6 @@ static void ppc440spe_desc_set_link(struct ppc440spe_adma_chan *chan,
}

/**
- * ppc440spe_desc_get_src_addr - extract the source address from the descriptor
- */
-static u32 ppc440spe_desc_get_src_addr(struct ppc440spe_adma_desc_slot *desc,
- struct ppc440spe_adma_chan *chan, int src_idx)
-{
- struct dma_cdb *dma_hw_desc;
- struct xor_cb *xor_hw_desc;
-
- switch (chan->device->id) {
- case PPC440SPE_DMA0_ID:
- case PPC440SPE_DMA1_ID:
- dma_hw_desc = desc->hw_desc;
- /* May have 0, 1, 2, or 3 sources */
- switch (dma_hw_desc->opc) {
- case DMA_CDB_OPC_NO_OP:
- case DMA_CDB_OPC_DFILL128:
- return 0;
- case DMA_CDB_OPC_DCHECK128:
- if (unlikely(src_idx)) {
- printk(KERN_ERR "%s: try to get %d source for"
- " DCHECK128\n", __func__, src_idx);
- BUG();
- }
- return le32_to_cpu(dma_hw_desc->sg1l);
- case DMA_CDB_OPC_MULTICAST:
- case DMA_CDB_OPC_MV_SG1_SG2:
- if (unlikely(src_idx > 2)) {
- printk(KERN_ERR "%s: try to get %d source from"
- " DMA descr\n", __func__, src_idx);
- BUG();
- }
- if (src_idx) {
- if (le32_to_cpu(dma_hw_desc->sg1u) &
- DMA_CUED_XOR_WIN_MSK) {
- u8 region;
-
- if (src_idx == 1)
- return le32_to_cpu(
- dma_hw_desc->sg1l) +
- desc->unmap_len;
-
- region = (le32_to_cpu(
- dma_hw_desc->sg1u)) >>
- DMA_CUED_REGION_OFF;
-
- region &= DMA_CUED_REGION_MSK;
- switch (region) {
- case DMA_RXOR123:
- return le32_to_cpu(
- dma_hw_desc->sg1l) +
- (desc->unmap_len << 1);
- case DMA_RXOR124:
- return le32_to_cpu(
- dma_hw_desc->sg1l) +
- (desc->unmap_len * 3);
- case DMA_RXOR125:
- return le32_to_cpu(
- dma_hw_desc->sg1l) +
- (desc->unmap_len << 2);
- default:
- printk(KERN_ERR
- "%s: try to"
- " get src3 for region %02x"
- "PPC440SPE_DESC_RXOR12?\n",
- __func__, region);
- BUG();
- }
- } else {
- printk(KERN_ERR
- "%s: try to get %d"
- " source for non-cued descr\n",
- __func__, src_idx);
- BUG();
- }
- }
- return le32_to_cpu(dma_hw_desc->sg1l);
- default:
- printk(KERN_ERR "%s: unknown OPC 0x%02x\n",
- __func__, dma_hw_desc->opc);
- BUG();
- }
- return le32_to_cpu(dma_hw_desc->sg1l);
- case PPC440SPE_XOR_ID:
- /* May have up to 16 sources */
- xor_hw_desc = desc->hw_desc;
- return xor_hw_desc->ops[src_idx].l;
- }
- return 0;
-}
-
-/**
- * ppc440spe_desc_get_dest_addr - extract the destination address from the
- * descriptor
- */
-static u32 ppc440spe_desc_get_dest_addr(struct ppc440spe_adma_desc_slot *desc,
- struct ppc440spe_adma_chan *chan, int idx)
-{
- struct dma_cdb *dma_hw_desc;
- struct xor_cb *xor_hw_desc;
-
- switch (chan->device->id) {
- case PPC440SPE_DMA0_ID:
- case PPC440SPE_DMA1_ID:
- dma_hw_desc = desc->hw_desc;
-
- if (likely(!idx))
- return le32_to_cpu(dma_hw_desc->sg2l);
- return le32_to_cpu(dma_hw_desc->sg3l);
- case PPC440SPE_XOR_ID:
- xor_hw_desc = desc->hw_desc;
- return xor_hw_desc->cbtal;
- }
- return 0;
-}
-
-/**
- * ppc440spe_desc_get_src_num - extract the number of source addresses from
- * the descriptor
- */
-static u32 ppc440spe_desc_get_src_num(struct ppc440spe_adma_desc_slot *desc,
- struct ppc440spe_adma_chan *chan)
-{
- struct dma_cdb *dma_hw_desc;
- struct xor_cb *xor_hw_desc;
-
- switch (chan->device->id) {
- case PPC440SPE_DMA0_ID:
- case PPC440SPE_DMA1_ID:
- dma_hw_desc = desc->hw_desc;
-
- switch (dma_hw_desc->opc) {
- case DMA_CDB_OPC_NO_OP:
- case DMA_CDB_OPC_DFILL128:
- return 0;
- case DMA_CDB_OPC_DCHECK128:
- return 1;
- case DMA_CDB_OPC_MV_SG1_SG2:
- case DMA_CDB_OPC_MULTICAST:
- /*
- * Only for RXOR operations we have more than
- * one source
- */
- if (le32_to_cpu(dma_hw_desc->sg1u) &
- DMA_CUED_XOR_WIN_MSK) {
- /* RXOR op, there are 2 or 3 sources */
- if (((le32_to_cpu(dma_hw_desc->sg1u) >>
- DMA_CUED_REGION_OFF) &
- DMA_CUED_REGION_MSK) == DMA_RXOR12) {
- /* RXOR 1-2 */
- return 2;
- } else {
- /* RXOR 1-2-3/1-2-4/1-2-5 */
- return 3;
- }
- }
- return 1;
- default:
- printk(KERN_ERR "%s: unknown OPC 0x%02x\n",
- __func__, dma_hw_desc->opc);
- BUG();
- }
- case PPC440SPE_XOR_ID:
- /* up to 16 sources */
- xor_hw_desc = desc->hw_desc;
- return xor_hw_desc->cbc & XOR_CDCR_OAC_MSK;
- default:
- BUG();
- }
- return 0;
-}
-
-/**
- * ppc440spe_desc_get_dst_num - get the number of destination addresses in
- * this descriptor
- */
-static u32 ppc440spe_desc_get_dst_num(struct ppc440spe_adma_desc_slot *desc,
- struct ppc440spe_adma_chan *chan)
-{
- struct dma_cdb *dma_hw_desc;
-
- switch (chan->device->id) {
- case PPC440SPE_DMA0_ID:
- case PPC440SPE_DMA1_ID:
- /* May be 1 or 2 destinations */
- dma_hw_desc = desc->hw_desc;
- switch (dma_hw_desc->opc) {
- case DMA_CDB_OPC_NO_OP:
- case DMA_CDB_OPC_DCHECK128:
- return 0;
- case DMA_CDB_OPC_MV_SG1_SG2:
- case DMA_CDB_OPC_DFILL128:
- return 1;
- case DMA_CDB_OPC_MULTICAST:
- if (desc->dst_cnt == 2)
- return 2;
- else
- return 1;
- default:
- printk(KERN_ERR "%s: unknown OPC 0x%02x\n",
- __func__, dma_hw_desc->opc);
- BUG();
- }
- case PPC440SPE_XOR_ID:
- /* Always only 1 destination */
- return 1;
- default:
- BUG();
- }
- return 0;
-}
-
-/**
* ppc440spe_desc_get_link - get the address of the descriptor that
* follows this one
*/
@@ -1705,43 +1493,6 @@ static void ppc440spe_adma_free_slots(struct ppc440spe_adma_desc_slot *slot,
}
}

-static void ppc440spe_adma_unmap(struct ppc440spe_adma_chan *chan,
- struct ppc440spe_adma_desc_slot *desc)
-{
- u32 src_cnt, dst_cnt;
- dma_addr_t addr;
-
- /*
- * get the number of sources & destination
- * included in this descriptor and unmap
- * them all
- */
- src_cnt = ppc440spe_desc_get_src_num(desc, chan);
- dst_cnt = ppc440spe_desc_get_dst_num(desc, chan);
-
- /* unmap destinations */
- if (!(desc->async_tx.flags & DMA_COMPL_SKIP_DEST_UNMAP)) {
- while (dst_cnt--) {
- addr = ppc440spe_desc_get_dest_addr(
- desc, chan, dst_cnt);
- dma_unmap_page(chan->device->dev,
- addr, desc->unmap_len,
- DMA_FROM_DEVICE);
- }
- }
-
- /* unmap sources */
- if (!(desc->async_tx.flags & DMA_COMPL_SKIP_SRC_UNMAP)) {
- while (src_cnt--) {
- addr = ppc440spe_desc_get_src_addr(
- desc, chan, src_cnt);
- dma_unmap_page(chan->device->dev,
- addr, desc->unmap_len,
- DMA_TO_DEVICE);
- }
- }
-}
-
/**
* ppc440spe_adma_run_tx_complete_actions - call functions to be called
* upon completion
@@ -1764,27 +1515,6 @@ static dma_cookie_t ppc440spe_adma_run_tx_complete_actions(
if (desc->async_tx.callback)
desc->async_tx.callback(
desc->async_tx.callback_param);
-
- /* unmap dma addresses
- * (unmap_single vs unmap_page?)
- *
- * actually, ppc's dma_unmap_page() functions are empty, so
- * the following code is just for the sake of completeness
- */
- if (chan && chan->needs_unmap && desc->group_head &&
- desc->unmap_len) {
- struct ppc440spe_adma_desc_slot *unmap =
- desc->group_head;
- /* assume 1 slot per op always */
- u32 slot_count = unmap->slot_cnt;
-
- /* Run through the group list and unmap addresses */
- for (i = 0; i < slot_count; i++) {
- BUG_ON(!unmap);
- ppc440spe_adma_unmap(chan, unmap);
- unmap = unmap->hw_next;
- }
- }
}

/* run dependent operations */
diff --git a/drivers/dma/timb_dma.c b/drivers/dma/timb_dma.c
index 4e0dff5..034031b 100644
--- a/drivers/dma/timb_dma.c
+++ b/drivers/dma/timb_dma.c
@@ -154,38 +154,6 @@ static bool __td_dma_done_ack(struct timb_dma_chan *td_chan)
return done;
}

-static void __td_unmap_desc(struct timb_dma_chan *td_chan, const u8 *dma_desc,
- bool single)
-{
- dma_addr_t addr;
- int len;
-
- addr = (dma_desc[7] << 24) | (dma_desc[6] << 16) | (dma_desc[5] << 8) |
- dma_desc[4];
-
- len = (dma_desc[3] << 8) | dma_desc[2];
-
- if (single)
- dma_unmap_single(chan2dev(&td_chan->chan), addr, len,
- DMA_TO_DEVICE);
- else
- dma_unmap_page(chan2dev(&td_chan->chan), addr, len,
- DMA_TO_DEVICE);
-}
-
-static void __td_unmap_descs(struct timb_dma_desc *td_desc, bool single)
-{
- struct timb_dma_chan *td_chan = container_of(td_desc->txd.chan,
- struct timb_dma_chan, chan);
- u8 *descs;
-
- for (descs = td_desc->desc_list; ; descs += TIMB_DMA_DESC_SIZE) {
- __td_unmap_desc(td_chan, descs, single);
- if (descs[0] & 0x02)
- break;
- }
-}
-
static int td_fill_desc(struct timb_dma_chan *td_chan, u8 *dma_desc,
struct scatterlist *sg, bool last)
{
@@ -293,10 +261,6 @@ static void __td_finish(struct timb_dma_chan *td_chan)

list_move(&td_desc->desc_node, &td_chan->free_list);

- if (!(txd->flags & DMA_COMPL_SKIP_SRC_UNMAP))
- __td_unmap_descs(td_desc,
- txd->flags & DMA_COMPL_SRC_UNMAP_SINGLE);
-
/*
* The API requires that no submissions are done from a
* callback, so we don't need to drop the lock here
diff --git a/drivers/dma/txx9dmac.c b/drivers/dma/txx9dmac.c
index 913f55c..00a45ba 100644
--- a/drivers/dma/txx9dmac.c
+++ b/drivers/dma/txx9dmac.c
@@ -419,30 +419,6 @@ txx9dmac_descriptor_complete(struct txx9dmac_chan *dc,
list_splice_init(&desc->tx_list, &dc->free_list);
list_move(&desc->desc_node, &dc->free_list);

- if (!ds) {
- dma_addr_t dmaaddr;
- if (!(txd->flags & DMA_COMPL_SKIP_DEST_UNMAP)) {
- dmaaddr = is_dmac64(dc) ?
- desc->hwdesc.DAR : desc->hwdesc32.DAR;
- if (txd->flags & DMA_COMPL_DEST_UNMAP_SINGLE)
- dma_unmap_single(chan2parent(&dc->chan),
- dmaaddr, desc->len, DMA_FROM_DEVICE);
- else
- dma_unmap_page(chan2parent(&dc->chan),
- dmaaddr, desc->len, DMA_FROM_DEVICE);
- }
- if (!(txd->flags & DMA_COMPL_SKIP_SRC_UNMAP)) {
- dmaaddr = is_dmac64(dc) ?
- desc->hwdesc.SAR : desc->hwdesc32.SAR;
- if (txd->flags & DMA_COMPL_SRC_UNMAP_SINGLE)
- dma_unmap_single(chan2parent(&dc->chan),
- dmaaddr, desc->len, DMA_TO_DEVICE);
- else
- dma_unmap_page(chan2parent(&dc->chan),
- dmaaddr, desc->len, DMA_TO_DEVICE);
- }
- }
-
/*
* The API requires that no submissions are done from a
* callback, so we don't need to drop the lock here
--
1.8.0

Subject: [PATCH 16/20] async_tx: do DMA unmap in async_raid6_recov.c for PQ operations

Convert core async_tx code (async_sum_product() and async_mult())
to do DMA unmapping itself using the ->callback functionality.
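
The mechanism mirrors the earlier conversions: the mapped addresses are
stashed on the descriptor at prepare time, and __async_tx_submit()
installs an unmap callback which chains to the client's original one.
A minimal sketch of such a callback follows (illustrative only; the
exact addresses and directions for async_sum_product() and
async_mult() are in the diff below):

	/* illustrative only -- see the real callbacks in the diff */
	static void example_unmap_cb(void *dma_async_param)
	{
		struct dma_async_tx_descriptor *tx = dma_async_param;
		struct dma_device *dev = tx->chan->device;

		/* undo the mappings created when the op was prepared */
		dma_unmap_page(dev->dev, tx->dma_dst[1], tx->dma_len,
			       DMA_BIDIRECTIONAL);
		dma_unmap_page(dev->dev, tx->dma_src[0], tx->dma_len,
			       DMA_TO_DEVICE);

		/* then chain to the callback the client asked for */
		if (tx->orig_callback)
			tx->orig_callback(tx->orig_callback_param);
	}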

Cc: Dan Williams <[email protected]>
Cc: Tomasz Figa <[email protected]>
Signed-off-by: Bartlomiej Zolnierkiewicz <[email protected]>
Signed-off-by: Kyungmin Park <[email protected]>
---
crypto/async_tx/async_raid6_recov.c | 50 ++++++++++++++++++++++++++++++++++---
1 file changed, 46 insertions(+), 4 deletions(-)

diff --git a/crypto/async_tx/async_raid6_recov.c b/crypto/async_tx/async_raid6_recov.c
index a9f08a6..3db97aa 100644
--- a/crypto/async_tx/async_raid6_recov.c
+++ b/crypto/async_tx/async_raid6_recov.c
@@ -27,6 +27,20 @@
#include <linux/raid/pq.h>
#include <linux/async_tx.h>

+static void async_sum_product_cb(void *dma_async_param)
+{
+ struct dma_async_tx_descriptor *tx = dma_async_param;
+ struct dma_device *dev = tx->chan->device;
+
+ dma_unmap_page(dev->dev, tx->dma_dst[1], tx->dma_len,
+ DMA_BIDIRECTIONAL);
+ dma_unmap_page(dev->dev, tx->dma_src[0], tx->dma_len, DMA_TO_DEVICE);
+ dma_unmap_page(dev->dev, tx->dma_src[1], tx->dma_len, DMA_TO_DEVICE);
+
+ if (tx->orig_callback)
+ tx->orig_callback(tx->orig_callback_param);
+}
+
static struct dma_async_tx_descriptor *
async_sum_product(struct page *dest, struct page **srcs, unsigned char *coef,
size_t len, struct async_submit_ctl *submit)
@@ -43,7 +57,9 @@ async_sum_product(struct page *dest, struct page **srcs, unsigned char *coef,
dma_addr_t dma_src[2];
struct device *dev = dma->dev;
struct dma_async_tx_descriptor *tx;
- enum dma_ctrl_flags dma_flags = DMA_PREP_PQ_DISABLE_P;
+ enum dma_ctrl_flags dma_flags = DMA_COMPL_SKIP_SRC_UNMAP |
+ DMA_COMPL_SKIP_DEST_UNMAP |
+ DMA_PREP_PQ_DISABLE_P;

if (submit->flags & ASYNC_TX_FENCE)
dma_flags |= DMA_PREP_FENCE;
@@ -53,7 +69,13 @@ async_sum_product(struct page *dest, struct page **srcs, unsigned char *coef,
tx = dma->device_prep_dma_pq(chan, dma_dest, dma_src, 2, coef,
len, dma_flags);
if (tx) {
- async_tx_submit(chan, tx, submit);
+ tx->dma_dst[1] = dma_dest[1];
+ tx->dma_src[0] = dma_src[0];
+ tx->dma_src[1] = dma_src[1];
+ tx->dma_len = len;
+
+ __async_tx_submit(chan, tx, async_sum_product_cb, tx,
+ submit);
return tx;
}

@@ -82,6 +104,20 @@ async_sum_product(struct page *dest, struct page **srcs, unsigned char *coef,
return NULL;
}

+static void async_mult_cb(void *dma_async_param)
+{
+ struct dma_async_tx_descriptor *tx = dma_async_param;
+ struct dma_device *dev = tx->chan->device;
+
+ dma_unmap_page(dev->dev, tx->dma_dst[1], tx->dma_len,
+ DMA_BIDIRECTIONAL);
+ dma_unmap_page(dev->dev, tx->dma_src[0], tx->dma_len,
+ DMA_TO_DEVICE);
+
+ if (tx->orig_callback)
+ tx->orig_callback(tx->orig_callback_param);
+}
+
static struct dma_async_tx_descriptor *
async_mult(struct page *dest, struct page *src, u8 coef, size_t len,
struct async_submit_ctl *submit)
@@ -97,7 +133,9 @@ async_mult(struct page *dest, struct page *src, u8 coef, size_t len,
dma_addr_t dma_src[1];
struct device *dev = dma->dev;
struct dma_async_tx_descriptor *tx;
- enum dma_ctrl_flags dma_flags = DMA_PREP_PQ_DISABLE_P;
+ enum dma_ctrl_flags dma_flags = DMA_COMPL_SKIP_SRC_UNMAP |
+ DMA_COMPL_SKIP_DEST_UNMAP |
+ DMA_PREP_PQ_DISABLE_P;

if (submit->flags & ASYNC_TX_FENCE)
dma_flags |= DMA_PREP_FENCE;
@@ -106,7 +144,11 @@ async_mult(struct page *dest, struct page *src, u8 coef, size_t len,
tx = dma->device_prep_dma_pq(chan, dma_dest, dma_src, 1, &coef,
len, dma_flags);
if (tx) {
- async_tx_submit(chan, tx, submit);
+ tx->dma_dst[1] = dma_dest[1];
+ tx->dma_src[0] = dma_src[0];
+ tx->dma_len = len;
+
+ __async_tx_submit(chan, tx, async_mult_cb, tx, submit);
return tx;
}

--
1.8.0

Subject: [PATCH 18/20] async_tx: do DMA unmap in core for PQ_VAL operations

Convert core async_tx code (async_syndrome_val()) to do
DMA unmapping itself using the ->callback functionality.
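
Note the layout assumed for the stashed addresses (a sketch of the
convention, mirroring the diff below; 0 is used as a "not mapped"
marker since P or Q may be disabled):

	/*
	 *   tx->dma_src[0 .. dma_src_cnt-1]  data blocks
	 *   tx->dma_src[dma_src_cnt]         P, or 0 if DMA_PREP_PQ_DISABLE_P
	 *   tx->dma_src[dma_src_cnt + 1]     Q, or 0 if DMA_PREP_PQ_DISABLE_Q
	 *
	 * All of them are mapped DMA_TO_DEVICE (P and Q are sources in a
	 * validate operation), so a uniform unmap loop would also work:
	 */
	for (i = 0; i < tx->dma_src_cnt + 2; i++)
		if (tx->dma_src[i])
			dma_unmap_page(dev->dev, tx->dma_src[i],
				       tx->dma_len, DMA_TO_DEVICE);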

Cc: Dan Williams <[email protected]>
Cc: Tomasz Figa <[email protected]>
Signed-off-by: Bartlomiej Zolnierkiewicz <[email protected]>
Signed-off-by: Kyungmin Park <[email protected]>
---
crypto/async_tx/async_pq.c | 40 +++++++++++++++++++++++++++++++++++-----
1 file changed, 35 insertions(+), 5 deletions(-)

diff --git a/crypto/async_tx/async_pq.c b/crypto/async_tx/async_pq.c
index 2848fe8..9e5500e 100644
--- a/crypto/async_tx/async_pq.c
+++ b/crypto/async_tx/async_pq.c
@@ -292,6 +292,26 @@ pq_val_chan(struct async_submit_ctl *submit, struct page **blocks, int disks, si
disks, len);
}

+static void async_syndrome_val_cb(void *dma_async_param)
+{
+ struct dma_async_tx_descriptor *tx = dma_async_param;
+ struct dma_device *dev = tx->chan->device;
+ int i;
+
+ for (i = tx->dma_src_cnt; i < tx->dma_src_cnt + 2; i++) {
+ if (tx->dma_src[i])
+ dma_unmap_page(dev->dev, tx->dma_src[i], tx->dma_len,
+ DMA_TO_DEVICE);
+ }
+
+ for (i = 0; i < tx->dma_src_cnt; i++)
+ dma_unmap_page(dev->dev, tx->dma_src[i], tx->dma_len,
+ DMA_TO_DEVICE);
+
+ if (tx->orig_callback)
+ tx->orig_callback(tx->orig_callback_param);
+}
+
/**
* async_syndrome_val - asynchronously validate a raid6 syndrome
* @blocks: source blocks from idx 0..disks-3, P @ disks-2 and Q @ disks-1
@@ -335,15 +355,17 @@ async_syndrome_val(struct page **blocks, unsigned int offset, int disks,

pr_debug("%s: (async) disks: %d len: %zu\n",
__func__, disks, len);
- if (!P(blocks, disks))
+ if (!P(blocks, disks)) {
+ pq[0] = 0;
dma_flags |= DMA_PREP_PQ_DISABLE_P;
- else
+ } else
pq[0] = dma_map_page(dev, P(blocks, disks),
offset, len,
DMA_TO_DEVICE);
- if (!Q(blocks, disks))
+ if (!Q(blocks, disks)) {
+ pq[1] = 0;
dma_flags |= DMA_PREP_PQ_DISABLE_Q;
- else
+ } else
pq[1] = dma_map_page(dev, Q(blocks, disks),
offset, len,
DMA_TO_DEVICE);
@@ -370,7 +392,15 @@ async_syndrome_val(struct page **blocks, unsigned int offset, int disks,
async_tx_quiesce(&submit->depend_tx);
dma_async_issue_pending(chan);
}
- async_tx_submit(chan, tx, submit);
+
+ tx->dma_src[src_cnt] = pq[0];
+ tx->dma_src[src_cnt + 1] = pq[1];
+ for (i = 0; i < src_cnt; i++)
+ tx->dma_src[i] = dma_src[i];
+ tx->dma_src_cnt = src_cnt;
+ tx->dma_len = len;
+
+ __async_tx_submit(chan, tx, async_syndrome_val_cb, tx, submit);

return tx;
} else {
--
1.8.0

Subject: [PATCH 14/20] async_tx: do DMA unmap in core for XOR_VAL operations

Convert core async_tx code (async_xor_val()) to do DMA unmapping
itself using the ->callback functionality.

Cc: Dan Williams <[email protected]>
Cc: Tomasz Figa <[email protected]>
Signed-off-by: Bartlomiej Zolnierkiewicz <[email protected]>
Signed-off-by: Kyungmin Park <[email protected]>
---
crypto/async_tx/async_xor.c | 24 ++++++++++++++++++++++--
1 file changed, 22 insertions(+), 2 deletions(-)

diff --git a/crypto/async_tx/async_xor.c b/crypto/async_tx/async_xor.c
index 59a4af3..d44da16 100644
--- a/crypto/async_tx/async_xor.c
+++ b/crypto/async_tx/async_xor.c
@@ -284,6 +284,20 @@ xor_val_chan(struct async_submit_ctl *submit, struct page *dest,
src_cnt, len);
}

+static void async_xor_val_cb(void *dma_async_param)
+{
+ struct dma_async_tx_descriptor *tx = dma_async_param;
+ struct dma_device *dev = tx->chan->device;
+ int i;
+
+ for (i = 0; i < tx->dma_src_cnt; i++)
+ dma_unmap_page(dev->dev, tx->dma_src[i], tx->dma_len,
+ DMA_TO_DEVICE);
+
+ if (tx->orig_callback)
+ tx->orig_callback(tx->orig_callback_param);
+}
+
/**
* async_xor_val - attempt a xor parity check with a dma engine.
* @dest: destination page used if the xor is performed synchronously
@@ -319,7 +333,8 @@ async_xor_val(struct page *dest, struct page **src_list, unsigned int offset,

if (dma_src && device && src_cnt <= device->max_xor &&
is_dma_xor_aligned(device, offset, 0, len)) {
- unsigned long dma_prep_flags = 0;
+ unsigned long dma_prep_flags = DMA_COMPL_SKIP_SRC_UNMAP |
+ DMA_COMPL_SKIP_DEST_UNMAP;
int i;

pr_debug("%s: (async) len: %zu\n", __func__, len);
@@ -346,7 +361,12 @@ async_xor_val(struct page *dest, struct page **src_list, unsigned int offset,
}
}

- async_tx_submit(chan, tx, submit);
+ for (i = 0; i < src_cnt; i++)
+ tx->dma_src[i] = dma_src[i];
+ tx->dma_src_cnt = src_cnt;
+ tx->dma_len = len;
+
+ __async_tx_submit(chan, tx, async_xor_val_cb, tx, submit);
} else {
enum async_tx_flags flags_orig = submit->flags;

--
1.8.0

Subject: [PATCH 11/20] async_tx: do DMA unmap in core for MEMSET operations

Convert core async_tx code (async_memset()) to do DMA unmapping
itself using the ->callback functionality.

Cc: Dan Williams <[email protected]>
Cc: Tomasz Figa <[email protected]>
Signed-off-by: Bartlomiej Zolnierkiewicz <[email protected]>
Signed-off-by: Kyungmin Park <[email protected]>
---
crypto/async_tx/async_memset.c | 22 +++++++++++++++++++---
1 file changed, 19 insertions(+), 3 deletions(-)

diff --git a/crypto/async_tx/async_memset.c b/crypto/async_tx/async_memset.c
index a6a667b..cf30bf1 100644
--- a/crypto/async_tx/async_memset.c
+++ b/crypto/async_tx/async_memset.c
@@ -30,6 +30,17 @@
#include <linux/dma-mapping.h>
#include <linux/async_tx.h>

+static void async_memset_cb(void *dma_async_param)
+{
+ struct dma_async_tx_descriptor *tx = dma_async_param;
+ struct dma_device *dev = tx->chan->device;
+
+ dma_unmap_page(dev->dev, tx->dma_dst, tx->dma_len, DMA_FROM_DEVICE);
+
+ if (tx->orig_callback)
+ tx->orig_callback(tx->orig_callback_param);
+}
+
/**
* async_memset - attempt to fill memory with a dma engine.
* @dest: destination page
@@ -47,10 +58,11 @@ async_memset(struct page *dest, int val, unsigned int offset, size_t len,
&dest, 1, NULL, 0, len);
struct dma_device *device = chan ? chan->device : NULL;
struct dma_async_tx_descriptor *tx = NULL;
+ dma_addr_t dma_dest;

if (device && is_dma_fill_aligned(device, offset, 0, len)) {
- dma_addr_t dma_dest;
- unsigned long dma_prep_flags = 0;
+ unsigned long dma_prep_flags = DMA_COMPL_SKIP_SRC_UNMAP |
+ DMA_COMPL_SKIP_DEST_UNMAP;

if (submit->cb_fn)
dma_prep_flags |= DMA_PREP_INTERRUPT;
@@ -68,7 +80,11 @@ async_memset(struct page *dest, int val, unsigned int offset, size_t len,

if (tx) {
pr_debug("%s: (async) len: %zu\n", __func__, len);
- async_tx_submit(chan, tx, submit);
+
+ tx->dma_dst = dma_dest;
+ tx->dma_len = len;
+
+ __async_tx_submit(chan, tx, async_memset_cb, tx, submit);
} else { /* run the memset synchronously */
void *dest_buf;
pr_debug("%s: (sync) len: %zu\n", __func__, len);
--
1.8.0

Subject: [PATCH 13/20] async_tx: do DMA unmap in core for XOR operations

In struct dma_async_tx_descriptor, convert the dma_src and dma_dst
fields to arrays and add a dma_src_cnt field. Then convert the core
async_tx code (do_async_xor()) to do DMA unmapping itself using the
->callback functionality.
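
A condensed sketch of the resulting submit path (illustrative; it
ignores the chunking done when src_cnt exceeds dma->max_xor and the
case where dest also appears in src_list -- see do_async_xor() in the
diff for the full logic):

	/* map the dest bidirectional as it may be re-used as a source */
	dma_dest = dma_map_page(dma->dev, dest, offset, len,
				DMA_BIDIRECTIONAL);
	for (i = 0; i < xor_src_cnt; i++)
		dma_src[i] = dma_map_page(dma->dev, src_list[i], offset,
					  len, DMA_TO_DEVICE);

	tx = dma->device_prep_dma_xor(chan, dma_dest, dma_src,
				      xor_src_cnt, len, dma_flags);

	/* stash everything the unmap callback will need */
	for (i = 0; i < xor_src_cnt; i++)
		tx->dma_src[i] = dma_src[i];
	tx->dma_src_cnt = xor_src_cnt;
	tx->dma_dst[0] = dma_dest;
	tx->dma_len = len;

	/* do_async_xor_cb() unmaps, then calls the original callback */
	__async_tx_submit(chan, tx, do_async_xor_cb, tx, submit);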

Cc: Dan Williams <[email protected]>
Cc: Tomasz Figa <[email protected]>
Signed-off-by: Bartlomiej Zolnierkiewicz <[email protected]>
Signed-off-by: Kyungmin Park <[email protected]>
---
crypto/async_tx/async_memcpy.c | 8 ++---
crypto/async_tx/async_xor.c | 80 +++++++++++++++++++++++++++++++-----------
drivers/dma/dmaengine.c | 26 +++++++-------
include/linux/dmaengine.h | 14 +++++---
4 files changed, 87 insertions(+), 41 deletions(-)

diff --git a/crypto/async_tx/async_memcpy.c b/crypto/async_tx/async_memcpy.c
index b6d5dab..cb0628e 100644
--- a/crypto/async_tx/async_memcpy.c
+++ b/crypto/async_tx/async_memcpy.c
@@ -35,8 +35,8 @@ static void async_memcpy_cb(void *dma_async_param)
struct dma_async_tx_descriptor *tx = dma_async_param;
struct dma_device *dev = tx->chan->device;

- dma_unmap_page(dev->dev, tx->dma_src, tx->dma_len, DMA_TO_DEVICE);
- dma_unmap_page(dev->dev, tx->dma_dst, tx->dma_len, DMA_FROM_DEVICE);
+ dma_unmap_page(dev->dev, tx->dma_src[0], tx->dma_len, DMA_TO_DEVICE);
+ dma_unmap_page(dev->dev, tx->dma_dst[0], tx->dma_len, DMA_FROM_DEVICE);

if (tx->orig_callback)
tx->orig_callback(tx->orig_callback_param);
@@ -91,8 +91,8 @@ async_memcpy(struct page *dest, struct page *src, unsigned int dest_offset,
if (tx) {
pr_debug("%s: (async) len: %zu\n", __func__, len);

- tx->dma_src = dma_src;
- tx->dma_dst = dma_dest;
+ tx->dma_src[0] = dma_src;
+ tx->dma_dst[0] = dma_dest;
tx->dma_len = len;

__async_tx_submit(chan, tx, async_memcpy_cb, tx, submit);
diff --git a/crypto/async_tx/async_xor.c b/crypto/async_tx/async_xor.c
index 154cc84..59a4af3 100644
--- a/crypto/async_tx/async_xor.c
+++ b/crypto/async_tx/async_xor.c
@@ -31,6 +31,26 @@
#include <linux/raid/xor.h>
#include <linux/async_tx.h>

+static void do_async_xor_cb(void *dma_async_param)
+{
+ struct dma_async_tx_descriptor *tx = dma_async_param;
+ struct dma_device *dev = tx->chan->device;
+ int i;
+
+ dma_unmap_page(dev->dev, tx->dma_dst[0], tx->dma_len,
+ DMA_BIDIRECTIONAL);
+
+ for (i = 0; i < tx->dma_src_cnt; i++) {
+ if (tx->dma_src[i] == tx->dma_dst[0])
+ continue;
+ dma_unmap_page(dev->dev, tx->dma_src[i], tx->dma_len,
+ DMA_TO_DEVICE);
+ }
+
+ if (tx->orig_callback)
+ tx->orig_callback(tx->orig_callback_param);
+}
+
/* do_async_xor - dma map the pages and perform the xor with an engine */
static __async_inline struct dma_async_tx_descriptor *
do_async_xor(struct dma_chan *chan, struct page *dest, struct page **src_list,
@@ -39,42 +59,34 @@ do_async_xor(struct dma_chan *chan, struct page *dest, struct page **src_list,
{
struct dma_device *dma = chan->device;
struct dma_async_tx_descriptor *tx = NULL;
- int src_off = 0;
- int i;
+ int i, j;
dma_async_tx_callback cb_fn_orig = submit->cb_fn;
void *cb_param_orig = submit->cb_param;
enum async_tx_flags flags_orig = submit->flags;
enum dma_ctrl_flags dma_flags;
int xor_src_cnt = 0;
+ int src_list_cnt = 0;
+ int extra_ent = 0;
dma_addr_t dma_dest;

- /* map the dest bidrectional in case it is re-used as a source */
- dma_dest = dma_map_page(dma->dev, dest, offset, len, DMA_BIDIRECTIONAL);
for (i = 0; i < src_cnt; i++) {
- /* only map the dest once */
if (!src_list[i])
continue;
- if (unlikely(src_list[i] == dest)) {
- dma_src[xor_src_cnt++] = dma_dest;
- continue;
- }
- dma_src[xor_src_cnt++] = dma_map_page(dma->dev, src_list[i], offset,
- len, DMA_TO_DEVICE);
+ xor_src_cnt++;
}
src_cnt = xor_src_cnt;

while (src_cnt) {
submit->flags = flags_orig;
- dma_flags = 0;
+ dma_flags = DMA_COMPL_SKIP_SRC_UNMAP |
+ DMA_COMPL_SKIP_DEST_UNMAP;
xor_src_cnt = min(src_cnt, (int)dma->max_xor);
- /* if we are submitting additional xors, leave the chain open,
- * clear the callback parameters, and leave the destination
- * buffer mapped
+ /* if we are submitting additional xors, leave the chain open
+ * and clear the callback parameters
*/
if (src_cnt > xor_src_cnt) {
submit->flags &= ~ASYNC_TX_ACK;
submit->flags |= ASYNC_TX_FENCE;
- dma_flags = DMA_COMPL_SKIP_DEST_UNMAP;
submit->cb_fn = NULL;
submit->cb_param = NULL;
} else {
@@ -85,11 +97,32 @@ do_async_xor(struct dma_chan *chan, struct page *dest, struct page **src_list,
dma_flags |= DMA_PREP_INTERRUPT;
if (submit->flags & ASYNC_TX_FENCE)
dma_flags |= DMA_PREP_FENCE;
+
+ /* map it bidirectional as it can be re-used as a source */
+ dma_dest = dma_map_page(dma->dev, dest, offset, len,
+ DMA_BIDIRECTIONAL);
+ j = 0;
+ if (extra_ent)
+ dma_src[j++] = dma_dest;
+ for (i = src_list_cnt; j < xor_src_cnt; i++) {
+ /* only map the dest once */
+ if (!src_list[i])
+ continue;
+ if (unlikely(src_list[i] == dest)) {
+ dma_src[j++] = dma_dest;
+ continue;
+ }
+ dma_src[j++] = dma_map_page(dma->dev, src_list[i],
+ offset, len, DMA_TO_DEVICE);
+ }
+
+ src_list_cnt = i;
+
/* Since we have clobbered the src_list we are committed
* to doing this asynchronously. Drivers force forward progress
* in case they can not provide a descriptor
*/
- tx = dma->device_prep_dma_xor(chan, dma_dest, &dma_src[src_off],
+ tx = dma->device_prep_dma_xor(chan, dma_dest, &dma_src[0],
xor_src_cnt, len, dma_flags);

if (unlikely(!tx))
@@ -99,22 +132,27 @@ do_async_xor(struct dma_chan *chan, struct page *dest, struct page **src_list,
while (unlikely(!tx)) {
dma_async_issue_pending(chan);
tx = dma->device_prep_dma_xor(chan, dma_dest,
- &dma_src[src_off],
+ &dma_src[0],
xor_src_cnt, len,
dma_flags);
}

- async_tx_submit(chan, tx, submit);
+ for (i = 0; i < xor_src_cnt; i++)
+ tx->dma_src[i] = dma_src[i];
+ tx->dma_src_cnt = xor_src_cnt;
+ tx->dma_dst[0] = dma_dest;
+ tx->dma_len = len;
+
+ __async_tx_submit(chan, tx, do_async_xor_cb, tx, submit);
submit->depend_tx = tx;

if (src_cnt > xor_src_cnt) {
/* drop completed sources */
src_cnt -= xor_src_cnt;
- src_off += xor_src_cnt;

/* use the intermediate result a source */
- dma_src[--src_off] = dma_dest;
src_cnt++;
+ extra_ent = 1;
} else
break;
}
diff --git a/drivers/dma/dmaengine.c b/drivers/dma/dmaengine.c
index 1b9c02a..5573e86 100644
--- a/drivers/dma/dmaengine.c
+++ b/drivers/dma/dmaengine.c
@@ -858,8 +858,10 @@ static void dma_async_memcpy_buf_to_buf_cb(void *dma_async_param)
struct dma_async_tx_descriptor *tx = dma_async_param;
struct dma_device *dev = tx->chan->device;

- dma_unmap_single(dev->dev, tx->dma_src, tx->dma_len, DMA_TO_DEVICE);
- dma_unmap_single(dev->dev, tx->dma_dst, tx->dma_len, DMA_FROM_DEVICE);
+ dma_unmap_single(dev->dev, tx->dma_src[0], tx->dma_len,
+ DMA_TO_DEVICE);
+ dma_unmap_single(dev->dev, tx->dma_dst[0], tx->dma_len,
+ DMA_FROM_DEVICE);
}

/**
@@ -896,8 +898,8 @@ dma_async_memcpy_buf_to_buf(struct dma_chan *chan, void *dest,
return -ENOMEM;
}

- tx->dma_src = dma_src;
- tx->dma_dst = dma_dest;
+ tx->dma_src[0] = dma_src;
+ tx->dma_dst[0] = dma_dest;
tx->dma_len = len;

tx->callback = dma_async_memcpy_buf_to_buf_cb;
@@ -919,8 +921,8 @@ static void dma_async_memcpy_buf_to_pg_cb(void *dma_async_param)
struct dma_async_tx_descriptor *tx = dma_async_param;
struct dma_device *dev = tx->chan->device;

- dma_unmap_single(dev->dev, tx->dma_src, tx->dma_len, DMA_TO_DEVICE);
- dma_unmap_page(dev->dev, tx->dma_dst, tx->dma_len, DMA_FROM_DEVICE);
+ dma_unmap_single(dev->dev, tx->dma_src[0], tx->dma_len, DMA_TO_DEVICE);
+ dma_unmap_page(dev->dev, tx->dma_dst[0], tx->dma_len, DMA_FROM_DEVICE);
}

/**
@@ -958,8 +960,8 @@ dma_async_memcpy_buf_to_pg(struct dma_chan *chan, struct page *page,
return -ENOMEM;
}

- tx->dma_src = dma_src;
- tx->dma_dst = dma_dest;
+ tx->dma_src[0] = dma_src;
+ tx->dma_dst[0] = dma_dest;
tx->dma_len = len;

tx->callback = dma_async_memcpy_buf_to_pg_cb;
@@ -981,8 +983,8 @@ static void dma_async_memcpy_pg_to_pg_cb(void *dma_async_param)
struct dma_async_tx_descriptor *tx = dma_async_param;
struct dma_device *dev = tx->chan->device;

- dma_unmap_page(dev->dev, tx->dma_src, tx->dma_len, DMA_TO_DEVICE);
- dma_unmap_page(dev->dev, tx->dma_dst, tx->dma_len, DMA_FROM_DEVICE);
+ dma_unmap_page(dev->dev, tx->dma_src[0], tx->dma_len, DMA_TO_DEVICE);
+ dma_unmap_page(dev->dev, tx->dma_dst[0], tx->dma_len, DMA_FROM_DEVICE);
}

/**
@@ -1023,8 +1025,8 @@ dma_async_memcpy_pg_to_pg(struct dma_chan *chan, struct page *dest_pg,
return -ENOMEM;
}

- tx->dma_src = dma_src;
- tx->dma_dst = dma_dest;
+ tx->dma_src[0] = dma_src;
+ tx->dma_dst[0] = dma_dest;
tx->dma_len = len;

tx->callback = dma_async_memcpy_pg_to_pg_cb;
diff --git a/include/linux/dmaengine.h b/include/linux/dmaengine.h
index 440b609..0df69f1 100644
--- a/include/linux/dmaengine.h
+++ b/include/linux/dmaengine.h
@@ -392,6 +392,10 @@ void dma_chan_cleanup(struct kref *kref);
typedef bool (*dma_filter_fn)(struct dma_chan *chan, void *filter_param);

typedef void (*dma_async_tx_callback)(void *dma_async_param);
+
+/* max value of ->max_xor from struct dma_device */
+#define DMA_ASYNC_TX_MAX_ENT 128
+
/**
* struct dma_async_tx_descriptor - async transaction descriptor
* ---dma generic offload fields---
@@ -402,8 +406,9 @@ typedef void (*dma_async_tx_callback)(void *dma_async_param);
* @phys: physical address of the descriptor
* @chan: target channel for this operation
* @tx_submit: set the prepared descriptor(s) to be executed by the engine
- * @dma_src: DMA source address (needed for DMA unmap)
- * @dma_dst: DMA destination address (needed for DMA unmap)
+ * @dma_src: DMA source addresses (needed for DMA unmap)
+ * @dma_src_cnt: number of DMA source addresses (needed for DMA unmap)
+ * @dma_dst: DMA destination addresses (needed for DMA unmap)
* @dma_len: DMA length (needed for DMA unmap)
* @callback: routine to call after this operation is complete
* @callback_param: general parameter to pass to the callback routine
@@ -420,8 +425,9 @@ struct dma_async_tx_descriptor {
dma_addr_t phys;
struct dma_chan *chan;
dma_cookie_t (*tx_submit)(struct dma_async_tx_descriptor *tx);
- dma_addr_t dma_src;
- dma_addr_t dma_dst;
+ dma_addr_t dma_src[DMA_ASYNC_TX_MAX_ENT];
+ unsigned int dma_src_cnt;
+ dma_addr_t dma_dst[DMA_ASYNC_TX_MAX_ENT];
size_t dma_len;
dma_async_tx_callback callback;
void *callback_param;
--
1.8.0

Subject: [PATCH 12/20] dmatest: do DMA unmap for XOR operations

Make the dmatest driver do DMA unmapping for XOR operations.

Cc: Vinod Koul <[email protected]>
Cc: Havard Skinnemoen <[email protected]>
Cc: Tomasz Figa <[email protected]>
Signed-off-by: Bartlomiej Zolnierkiewicz <[email protected]>
Signed-off-by: Kyungmin Park <[email protected]>
---
drivers/dma/dmatest.c | 12 +++++++-----
1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/drivers/dma/dmatest.c b/drivers/dma/dmatest.c
index 22655a7..eabb230 100644
--- a/drivers/dma/dmatest.c
+++ b/drivers/dma/dmatest.c
@@ -306,13 +306,13 @@ static int dmatest_func(void *data)

/*
* src buffers are freed by the DMAEngine code with dma_unmap_single()
- * (except DMA_MEMCPY operations)
+ * (except DMA_MEMCPY and DMA_XOR operations)
* dst buffers are freed by ourselves below
*/
flags = DMA_CTRL_ACK | DMA_PREP_INTERRUPT
| DMA_COMPL_SKIP_DEST_UNMAP;

- if (thread->type == DMA_MEMCPY)
+ if (thread->type == DMA_MEMCPY || thread->type == DMA_XOR)
flags |= DMA_COMPL_SKIP_SRC_UNMAP;
else
flags |= DMA_COMPL_SRC_UNMAP_SINGLE;
@@ -449,9 +449,11 @@ static int dmatest_func(void *data)
}

/* Unmap by myself (see DMA_COMPL_SKIP_DEST_UNMAP above) */
- if (thread->type == DMA_MEMCPY)
- dma_unmap_single(dev->dev, dma_srcs[0], len,
- DMA_TO_DEVICE);
+ if (thread->type == DMA_MEMCPY || thread->type == DMA_XOR) {
+ for (i = 0; i < src_cnt; i++)
+ dma_unmap_single(dev->dev, dma_srcs[i], len,
+ DMA_TO_DEVICE);
+ }
for (i = 0; i < dst_cnt; i++)
dma_unmap_single(dev->dev, dma_dsts[i], test_buf_size,
DMA_BIDIRECTIONAL);
--
1.8.0

Subject: [PATCH 05/20] dmatest: do DMA unmap for MEMCPY operations

Make the dmatest driver do DMA unmapping for MEMCPY operations.

Cc: Vinod Koul <[email protected]>
Cc: Havard Skinnemoen <[email protected]>
Cc: Tomasz Figa <[email protected]>
Signed-off-by: Bartlomiej Zolnierkiewicz <[email protected]>
Signed-off-by: Kyungmin Park <[email protected]>
---
drivers/dma/dmatest.c | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/drivers/dma/dmatest.c b/drivers/dma/dmatest.c
index 24225f0..22655a7 100644
--- a/drivers/dma/dmatest.c
+++ b/drivers/dma/dmatest.c
@@ -306,10 +306,16 @@ static int dmatest_func(void *data)

/*
* src buffers are freed by the DMAEngine code with dma_unmap_single()
+ * (except DMA_MEMCPY operations)
* dst buffers are freed by ourselves below
*/
flags = DMA_CTRL_ACK | DMA_PREP_INTERRUPT
- | DMA_COMPL_SKIP_DEST_UNMAP | DMA_COMPL_SRC_UNMAP_SINGLE;
+ | DMA_COMPL_SKIP_DEST_UNMAP;
+
+ if (thread->type == DMA_MEMCPY)
+ flags |= DMA_COMPL_SKIP_SRC_UNMAP;
+ else
+ flags |= DMA_COMPL_SRC_UNMAP_SINGLE;

while (!kthread_should_stop()
&& !(iterations && total_tests >= iterations)) {
@@ -443,6 +449,9 @@ static int dmatest_func(void *data)
}

/* Unmap by myself (see DMA_COMPL_SKIP_DEST_UNMAP above) */
+ if (thread->type == DMA_MEMCPY)
+ dma_unmap_single(dev->dev, dma_srcs[0], len,
+ DMA_TO_DEVICE);
for (i = 0; i < dst_cnt; i++)
dma_unmap_single(dev->dev, dma_dsts[i], test_buf_size,
DMA_BIDIRECTIONAL);
--
1.8.0

Subject: [PATCH 04/20] carma-fpga: pass correct flags to ->device_prep_dma_memcpy()

DMA unmapping is handled by the carma-fpga driver itself, so tell the
fsldma.c driver (the DMA engine driver used by carma-fpga) to skip
unmapping the destination and source buffers.

Cc: Ira W. Snyder <[email protected]>
Cc: Tomasz Figa <[email protected]>
Signed-off-by: Bartlomiej Zolnierkiewicz <[email protected]>
Signed-off-by: Kyungmin Park <[email protected]>
---
drivers/misc/carma/carma-fpga.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/misc/carma/carma-fpga.c b/drivers/misc/carma/carma-fpga.c
index 8835eab..6b43f8c 100644
--- a/drivers/misc/carma/carma-fpga.c
+++ b/drivers/misc/carma/carma-fpga.c
@@ -631,6 +631,8 @@ static int data_submit_dma(struct fpga_device *priv, struct data_buf *buf)
struct dma_async_tx_descriptor *tx;
dma_cookie_t cookie;
dma_addr_t dst, src;
+ unsigned long dma_flags = DMA_COMPL_SKIP_DEST_UNMAP |
+ DMA_COMPL_SKIP_SRC_UNMAP;

dst_sg = buf->vb.sglist;
dst_nents = buf->vb.sglen;
@@ -666,7 +668,7 @@ static int data_submit_dma(struct fpga_device *priv, struct data_buf *buf)
src = SYS_FPGA_BLOCK;
tx = chan->device->device_prep_dma_memcpy(chan, dst, src,
REG_BLOCK_SIZE,
- 0);
+ dma_flags);
if (!tx) {
dev_err(priv->dev, "unable to prep SYS-FPGA DMA\n");
return -ENOMEM;
--
1.8.0

Subject: [PATCH 01/20] async_tx: add missing DMA unmap to async_memcpy()

Do DMA unmap on ->device_prep_dma_memcpy failure.

Cc: Dan Williams <[email protected]>
Cc: Tomasz Figa <[email protected]>
Signed-off-by: Bartlomiej Zolnierkiewicz <[email protected]>
Signed-off-by: Kyungmin Park <[email protected]>
---
crypto/async_tx/async_memcpy.c | 6 ++++++
1 file changed, 6 insertions(+)

diff --git a/crypto/async_tx/async_memcpy.c b/crypto/async_tx/async_memcpy.c
index 361b5e8..9e62fef 100644
--- a/crypto/async_tx/async_memcpy.c
+++ b/crypto/async_tx/async_memcpy.c
@@ -67,6 +67,12 @@ async_memcpy(struct page *dest, struct page *src, unsigned int dest_offset,

tx = device->device_prep_dma_memcpy(chan, dma_dest, dma_src,
len, dma_prep_flags);
+ if (!tx) {
+ dma_unmap_page(device->dev, dma_dest, len,
+ DMA_FROM_DEVICE);
+ dma_unmap_page(device->dev, dma_src, len,
+ DMA_TO_DEVICE);
+ }
}

if (tx) {
--
1.8.0

2012-11-05 19:44:39

by Ira W. Snyder

Subject: Re: [PATCH 00/20] DMA: DMA unmap fixes

On Mon, Nov 05, 2012 at 11:00:11AM +0100, Bartlomiej Zolnierkiewicz wrote:
> Hi,
>
> [...]

Hello Bartlomiej Zolnierkiewicz,

The carma-fpga and fsldma parts look good to me. For those parts,

Acked-by: Ira W. Snyder <[email protected]>

Thanks,
Ira

2012-11-06 10:13:17

by Tomasz Figa

Subject: Re: [PATCH 00/20] DMA: DMA unmap fixes

Hi Bart,

On Monday 05 of November 2012 11:00:11 Bartlomiej Zolnierkiewicz wrote:
> Hi,
>
> [...]

For all patches:

Reviewed-by: Tomasz Figa <[email protected]>

Best regards,
--
Tomasz Figa
Samsung Poland R&D Center
SW Solution Development, Linux Platform

2012-11-07 20:40:59

by Dan Williams

[permalink] [raw]
Subject: Re: [PATCH 00/20] DMA: DMA unmap fixes

On Mon, Nov 5, 2012 at 2:00 AM, Bartlomiej Zolnierkiewicz
<[email protected]> wrote:
> [introduction trimmed]
>
> patches #1-3 add missing DMA unmap on failure to async_tx core
> code (async_memcpy()), ioat and fsmc_nand drivers
>
> patch #4 fixes DMA flags used by carma-fpga driver

Ack patches 1-4

> patches #5-7 fix core code and dmatest driver to DMA unmap for
> MEMCPY operations

Yes, this needs to move out of the drivers, but it needs to move past
the core and into the dmaengine clients directly (md/raid and
net_dma). The slave_dma usage model already does this, as each client
takes responsibility for dma mapping. Doing this in the core has some
downsides, though maybe they're negligible: 1/ all drivers suffer the
size increase to dma_async_tx_descriptor; 2/ pure polling usage models
like net_dma will now trigger dma channel interrupts to run the callback.
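
To make 2/ concrete, here is a rough, untested sketch of the
client-managed model, where the client maps, polls and unmaps by
itself and never asks for a completion interrupt (the function name is
invented for illustration; the dmaengine calls are the stock API):

#include <linux/dmaengine.h>
#include <linux/dma-mapping.h>

/* sketch only: error handling trimmed, 'chan' assumed to be a
 * MEMCPY-capable channel obtained via dma_request_channel() */
static int client_managed_memcpy(struct dma_chan *chan,
				 void *dst, void *src, size_t len)
{
	struct device *dev = chan->device->dev;
	struct dma_async_tx_descriptor *tx;
	dma_addr_t dma_src, dma_dst;
	dma_cookie_t cookie;
	int ret = 0;

	/* the client owns both mappings... */
	dma_src = dma_map_single(dev, src, len, DMA_TO_DEVICE);
	dma_dst = dma_map_single(dev, dst, len, DMA_FROM_DEVICE);

	tx = chan->device->device_prep_dma_memcpy(chan, dma_dst, dma_src,
						  len, DMA_CTRL_ACK);
	if (!tx) {
		ret = -ENOMEM;
		goto unmap;
	}

	cookie = tx->tx_submit(tx);
	dma_async_issue_pending(chan);

	/* ...and polls, so no DMA_PREP_INTERRUPT and no callback */
	while (dma_async_is_tx_complete(chan, cookie, NULL, NULL) ==
	       DMA_IN_PROGRESS)
		cpu_relax();
unmap:
	dma_unmap_single(dev, dma_src, len, DMA_TO_DEVICE);
	dma_unmap_single(dev, dma_dst, len, DMA_FROM_DEVICE);
	return ret;
}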

That being said, this cleanup does get us a step closer to where
dmaengine needs to go, and I have not found time to do the "move it to
the client" patches mentioned above. So I'm inclined to proceed with
these patches. Does anyone still see positive benefits from net_dma?

> patch #8 adds missing DMA unmap on failure to ioat3 driver
>
> patch #9 fixes build for async_memset.c
>
> patch #10 adds missing DMA unmap on failure to async tx core
> code (async_memset())

Ack patches 8-10

>
> patches #11-18 fix async_tx core code and dmatest driver to do
> DMA unmap for MEMSET, XOR, XOR_VAL, PQ and PQ_VAL operations
>

Some comments in the patch.

> patches #19-20 remove no longer needed DMA unmap code from
> device drivers and DMA unmap flags from core code
>
>
> This patchset was tested on PL330 DMA controller using MEMCPY
> operations. It would be great if somebody could test it on
> more advanced controller capable of MEMSET, XOR, XOR_VAL,
> PQ and PQ_VAL operations (especially since the conversion of
> XOR and PQ operations was not obvious).

async_tx has a bug in that it permits overlapping dma mappings or
operations that cross dma boundaries, but these patches don't make
this worse. The interim "fix" I am proposing for this is to drop
support for channel configurations that require an operation chain to
switch channels. This will allow the removal of the async_tx_ack()
machinery and hopefully prompt channel-switching users to contribute
to the md changes needed to support this properly. In the meantime
the async_tx_ack bits are just adding needless complexity to the
drivers that don't care about that functionality (see the sketch
below).
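
For reference, a minimal sketch of the pattern in question, as seen in
the cleanup loops of iop-adma and friends (the driver-local types here
are invented; async_tx_test_ack() is the real helper):

#include <linux/dmaengine.h>
#include <linux/list.h>

struct my_desc {			/* invented driver-local type */
	struct dma_async_tx_descriptor txd;
	struct list_head node;
};

/* a completed but un-acked descriptor may still gain a dependent
 * operation (possibly on another channel), so it cannot be reused yet */
static void recycle_if_acked(struct my_desc *desc,
			     struct list_head *free_list)
{
	if (async_tx_test_ack(&desc->txd))
		list_move_tail(&desc->node, free_list);
}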

--
Dan

Date: 2012-11-07 20:56:22
From: Dan Williams
Subject: Re: [PATCH 13/20] async_tx: do DMA unmap in core for XOR operations

[resend]

On Mon, Nov 5, 2012 at 2:00 AM, Bartlomiej Zolnierkiewicz
<[email protected]> wrote:
> diff --git a/include/linux/dmaengine.h b/include/linux/dmaengine.h
> index 440b609..0df69f1 100644
> --- a/include/linux/dmaengine.h
> +++ b/include/linux/dmaengine.h
> @@ -392,6 +392,10 @@ void dma_chan_cleanup(struct kref *kref);
> typedef bool (*dma_filter_fn)(struct dma_chan *chan, void *filter_param);
>
> typedef void (*dma_async_tx_callback)(void *dma_async_param);
> +
> +/* max value of ->max_xor from struct dma_device */
> +#define DMA_ASYNC_TX_MAX_ENT 128

This balloons the descriptor size. Looks like the ppc4xx driver will
try to do 16MB allocations after this. I think this should be limited
in the core to something like 16 or at most 32. ppc4xx is also going
to be impacted by the removal of channel switching support in the
core. Adding Anatolij as a heads up.
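
(Back of the envelope, assuming a 64-bit dma_addr_t: the two new
arrays add 2 * 128 * 8 = 2048 bytes to every descriptor, so a
preallocated pool of ~8192 descriptors grows by ~16 MB; capping the
entry count at 16 would cut the overhead to 256 bytes per descriptor.)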

> +
> /**
> * struct dma_async_tx_descriptor - async transaction descriptor
> * ---dma generic offload fields---
> @@ -402,8 +406,9 @@ typedef void (*dma_async_tx_callback)(void *dma_async_param);
> * @phys: physical address of the descriptor
> * @chan: target channel for this operation
> * @tx_submit: set the prepared descriptor(s) to be executed by the engine
> - * @dma_src: DMA source address (needed for DMA unmap)
> - * @dma_dst: DMA destination address (needed for DMA unmap)
> + * @dma_src: DMA source addresses (needed for DMA unmap)
> + * @dma_src_cnt: number of DMA source addresses (needed for DMA unmap)
> + * @dma_dst: DMA destination addresses (needed for DMA unmap)
> * @dma_len: DMA length (needed for DMA unmap)
> * @callback: routine to call after this operation is complete
> * @callback_param: general parameter to pass to the callback routine
> @@ -420,8 +425,9 @@ struct dma_async_tx_descriptor {
> dma_addr_t phys;
> struct dma_chan *chan;
> dma_cookie_t (*tx_submit)(struct dma_async_tx_descriptor *tx);
> - dma_addr_t dma_src;
> - dma_addr_t dma_dst;
> + dma_addr_t dma_src[DMA_ASYNC_TX_MAX_ENT];
> + unsigned int dma_src_cnt;
> + dma_addr_t dma_dst[DMA_ASYNC_TX_MAX_ENT];
> size_t dma_len;
> dma_async_tx_callback callback;
> void *callback_param;

For engines that don't care about raid, this unmap data should be at
the end, to hopefully get the more frequently used callback parameters
into the same cacheline as the rest (see the layout sketch below).
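
I.e., roughly this kind of layout (sketch only -- field set taken from
the hunk above, dependency bookkeeping omitted):

struct dma_async_tx_descriptor {
	/* hot: the completion path touches these */
	dma_cookie_t cookie;
	enum dma_ctrl_flags flags;
	dma_addr_t phys;
	struct dma_chan *chan;
	dma_cookie_t (*tx_submit)(struct dma_async_tx_descriptor *tx);
	dma_async_tx_callback callback;
	void *callback_param;

	/* cold: only read by the unmap code at completion time */
	size_t dma_len;
	unsigned int dma_src_cnt;
	dma_addr_t dma_src[DMA_ASYNC_TX_MAX_ENT];
	dma_addr_t dma_dst[DMA_ASYNC_TX_MAX_ENT];
};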

Date: 2012-11-19 10:02:56
From: Dan Williams
Subject: Re: [PATCH 13/20] async_tx: do DMA unmap in core for XOR operations



On 11/5/12 2:00 AM, "Bartlomiej Zolnierkiewicz" <[email protected]>
wrote:

>In struct dma_async_tx_descriptor convert dma_[src,dst] fields to
>arrays and also add dma_src_cnt field. Then convert core async_tx
>code (do_async_xor()) to do DMA unmapping itself using the ->callback
>functionality.
>
>Cc: Dan Williams <[email protected]>
>Cc: Tomasz Figa <[email protected]>
>Signed-off-by: Bartlomiej Zolnierkiewicz <[email protected]>
>Signed-off-by: Kyungmin Park <[email protected]>
>---
> crypto/async_tx/async_memcpy.c | 8 ++---
> crypto/async_tx/async_xor.c | 80 +++++++++++++++++++++++++++++++-----------
> drivers/dma/dmaengine.c | 26 +++++++-------
> include/linux/dmaengine.h | 14 +++++---
> 4 files changed, 87 insertions(+), 41 deletions(-)
>
>diff --git a/crypto/async_tx/async_memcpy.c b/crypto/async_tx/async_memcpy.c
>index b6d5dab..cb0628e 100644
>--- a/crypto/async_tx/async_memcpy.c
>+++ b/crypto/async_tx/async_memcpy.c
>@@ -35,8 +35,8 @@ static void async_memcpy_cb(void *dma_async_param)
> struct dma_async_tx_descriptor *tx = dma_async_param;
> struct dma_device *dev = tx->chan->device;
>
>- dma_unmap_page(dev->dev, tx->dma_src, tx->dma_len, DMA_TO_DEVICE);
>- dma_unmap_page(dev->dev, tx->dma_dst, tx->dma_len, DMA_FROM_DEVICE);
>+ dma_unmap_page(dev->dev, tx->dma_src[0], tx->dma_len, DMA_TO_DEVICE);
>+ dma_unmap_page(dev->dev, tx->dma_dst[0], tx->dma_len, DMA_FROM_DEVICE);
>
> if (tx->orig_callback)
> tx->orig_callback(tx->orig_callback_param);
>@@ -91,8 +91,8 @@ async_memcpy(struct page *dest, struct page *src, unsigned int dest_offset,
> if (tx) {
> pr_debug("%s: (async) len: %zu\n", __func__, len);
>
>- tx->dma_src = dma_src;
>- tx->dma_dst = dma_dest;
>+ tx->dma_src[0] = dma_src;
>+ tx->dma_dst[0] = dma_dest;
> tx->dma_len = len;
>
> __async_tx_submit(chan, tx, async_memcpy_cb, tx, submit);
>diff --git a/crypto/async_tx/async_xor.c b/crypto/async_tx/async_xor.c
>index 154cc84..59a4af3 100644
>--- a/crypto/async_tx/async_xor.c
>+++ b/crypto/async_tx/async_xor.c
>@@ -31,6 +31,26 @@
> #include <linux/raid/xor.h>
> #include <linux/async_tx.h>
>
>+static void do_async_xor_cb(void *dma_async_param)
>+{
>+ struct dma_async_tx_descriptor *tx = dma_async_param;
>+ struct dma_device *dev = tx->chan->device;
>+ int i;
>+
>+ dma_unmap_page(dev->dev, tx->dma_dst[0], tx->dma_len,
>+ DMA_BIDIRECTIONAL);
>+
>+ for (i = 0; i < tx->dma_src_cnt; i++) {
>+ if (tx->dma_src[i] == tx->dma_dst[0])
>+ continue;
>+ dma_unmap_page(dev->dev, tx->dma_src[i], tx->dma_len,
>+ DMA_TO_DEVICE);
>+ }
>+
>+ if (tx->orig_callback)
>+ tx->orig_callback(tx->orig_callback_param);
>+}
>+
> /* do_async_xor - dma map the pages and perform the xor with an engine */
> static __async_inline struct dma_async_tx_descriptor *
> do_async_xor(struct dma_chan *chan, struct page *dest, struct page **src_list,
>@@ -39,42 +39,34 @@ do_async_xor(struct dma_chan *chan, struct page *dest, struct page **src_list,
> {
> struct dma_device *dma = chan->device;
> struct dma_async_tx_descriptor *tx = NULL;
>- int src_off = 0;
>- int i;
>+ int i, j;
> dma_async_tx_callback cb_fn_orig = submit->cb_fn;
> void *cb_param_orig = submit->cb_param;
> enum async_tx_flags flags_orig = submit->flags;
> enum dma_ctrl_flags dma_flags;
> int xor_src_cnt = 0;
>+ int src_list_cnt = 0;
>+ int extra_ent = 0;
> dma_addr_t dma_dest;
>
>- /* map the dest bidrectional in case it is re-used as a source */
>- dma_dest = dma_map_page(dma->dev, dest, offset, len, DMA_BIDIRECTIONAL);
> for (i = 0; i < src_cnt; i++) {
>- /* only map the dest once */
> if (!src_list[i])
> continue;
>- if (unlikely(src_list[i] == dest)) {
>- dma_src[xor_src_cnt++] = dma_dest;
>- continue;
>- }
>- dma_src[xor_src_cnt++] = dma_map_page(dma->dev, src_list[i], offset,
>- len, DMA_TO_DEVICE);
>+ xor_src_cnt++;
> }
> src_cnt = xor_src_cnt;
>
> while (src_cnt) {
> submit->flags = flags_orig;
>- dma_flags = 0;
>+ dma_flags = DMA_COMPL_SKIP_SRC_UNMAP |
>+ DMA_COMPL_SKIP_DEST_UNMAP;
> xor_src_cnt = min(src_cnt, (int)dma->max_xor);
>- /* if we are submitting additional xors, leave the chain open,
>- * clear the callback parameters, and leave the destination
>- * buffer mapped
>+ /* if we are submitting additional xors, leave the chain open
>+ * and clear the callback parameters
> */
> if (src_cnt > xor_src_cnt) {
> submit->flags &= ~ASYNC_TX_ACK;
> submit->flags |= ASYNC_TX_FENCE;
>- dma_flags = DMA_COMPL_SKIP_DEST_UNMAP;
> submit->cb_fn = NULL;
> submit->cb_param = NULL;
> } else {
>@@ -85,11 +97,32 @@ do_async_xor(struct dma_chan *chan, struct page *dest, struct page **src_list,
> dma_flags |= DMA_PREP_INTERRUPT;
> if (submit->flags & ASYNC_TX_FENCE)
> dma_flags |= DMA_PREP_FENCE;
>+
>+ /* map it bidirectional as it can be re-used as a source */
>+ dma_dest = dma_map_page(dma->dev, dest, offset, len,
>+ DMA_BIDIRECTIONAL);

This maps the destination unconditionally on every pass through the
loop. More critically, it gets unmapped at the first completion, but
it needs to remain mapped until all operations in the chain complete.

So, we need a separate unmap object with a different lifetime than the
descriptor. I'm putting together some patches along those lines to
dynamically allocate some unmap data and hang it off the descriptor.
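
Something along these lines, perhaps (hypothetical sketch, names
invented -- not the final implementation):

#include <linux/dma-mapping.h>
#include <linux/kref.h>
#include <linux/slab.h>

/* one refcounted unmap object shared by all descriptors in a chain;
 * the mappings are torn down only when the last reference drops */
struct dma_unmap_data {
	struct kref kref;
	struct device *dev;
	size_t len;
	unsigned int to_cnt;	/* leading DMA_TO_DEVICE entries */
	unsigned int bidi_cnt;	/* trailing DMA_BIDIRECTIONAL entries */
	dma_addr_t addr[];	/* sources first, then destination(s) */
};

static void dma_unmap_release(struct kref *kref)
{
	struct dma_unmap_data *u =
		container_of(kref, struct dma_unmap_data, kref);
	unsigned int i;

	for (i = 0; i < u->to_cnt; i++)
		dma_unmap_page(u->dev, u->addr[i], u->len, DMA_TO_DEVICE);
	for (; i < u->to_cnt + u->bidi_cnt; i++)
		dma_unmap_page(u->dev, u->addr[i], u->len,
			       DMA_BIDIRECTIONAL);
	kfree(u);
}

/* each prep in the chain takes kref_get(&u->kref); every completion
 * does kref_put(&u->kref, dma_unmap_release), so the destination
 * stays mapped until the whole chain has finished */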

--
Dan

From: Bartlomiej Zolnierkiewicz
Subject: Re: [PATCH 00/20] DMA: DMA unmap fixes

On Wednesday 07 November 2012 21:40:57 Dan Williams wrote:
> On Mon, Nov 5, 2012 at 2:00 AM, Bartlomiej Zolnierkiewicz
> <[email protected]> wrote:
> > [introduction trimmed]
> >
> > patches #1-3 add missing DMA unmap on failure to async_tx core
> > code (async_memcpy()), ioat and fsmc_nand drivers
> >
> > patch #4 fixes DMA flags used by carma-fpga driver
>
> Ack patches 1-4

[...]

> > patch #8 adds missing DMA unmap on failure to ioat3 driver
> >
> > patch #9 fixes build for async_memset.c
> >
> > patch #10 adds missing DMA unmap on failure to async tx core
> > code (async_memset())
>
> Ack patches 8-10

Thank you.

Could these patches be merged for 3.8 through either your tree or Vinod's?

Best regards,
--
Bartlomiej Zolnierkiewicz
Samsung Poland R&D Center