2008-02-13 07:04:56

by Dan Williams

Subject: [PATCH 0/4] async_tx: fix dependency handling and related cleanups

Injecting channel-switch-interrupts has been broken for a while now. It
has not been a problem in practice because the only in-tree driver that
relied on this functionality was the iop3xx version of iop-adma, and it
had a bug-masking local workaround. Three side benefits arise from this
fix:

1/ dma_async_tx_descriptor sheds two list_heads
2/ Locking is made sane in that dma drivers no longer need to directly
touch dma_async_tx_descriptor.lock
3/ dma_device.device_dependency_added is no longer needed
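
As a rough illustration of item 1/, the user-space sketch below compares only the dependency-tracking fields before and after the series. The struct names old_dep_links and new_dep_links and the helper dep_links_saved() are hypothetical and exist only for this comparison; the real descriptor carries more fields (cookie, callbacks, lock), and struct list_head is re-declared locally so the snippet builds outside the kernel.

```c
#include <assert.h>
#include <stddef.h>

/* Local stand-in for the kernel's doubly linked list node. */
struct list_head {
	struct list_head *next, *prev;
};

/* Dependency fields before the series (hypothetical wrapper name). */
struct old_dep_links {
	struct list_head depend_list;	/* children to submit at completion */
	struct list_head depend_node;	/* our entry in the parent's list */
	struct old_dep_links *parent;
};

/* Dependency fields after the series (hypothetical wrapper name). */
struct new_dep_links {
	struct new_dep_links *next;	/* at most one dependent descriptor */
	struct new_dep_links *parent;
};

/* Bytes saved per descriptor by replacing two list_heads with one pointer. */
static size_t dep_links_saved(void)
{
	return sizeof(struct old_dep_links) - sizeof(struct new_dep_links);
}
```

On ABIs where all data pointers share one size and alignment, dep_links_saved() comes to three pointers per descriptor.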

Testing shows that iop-adma now gets by without the 'watchdog'
workaround.

---

Dan Williams (4):
      iop-adma: remove the workaround for missed interrupts on iop3xx
      async_tx: kill ->device_dependency_added
      async_tx: fix multiple dependency submission
      async_tx: checkpatch says s/__FUNCTION__/__func__/g


 crypto/async_tx/async_memcpy.c         |    6 -
 crypto/async_tx/async_memset.c         |    6 -
 crypto/async_tx/async_tx.c             |  203 ++++++++++++++++++++++++++------
 crypto/async_tx/async_xor.c            |   12 +-
 drivers/dma/dmaengine.c                |    3
 drivers/dma/ioat_dma.c                 |   12 --
 drivers/dma/iop-adma.c                 |   21 +--
 include/asm-arm/arch-iop13xx/adma.h    |    5 -
 include/asm-arm/hardware/iop3xx-adma.h |    8 -
 include/asm-arm/hardware/iop_adma.h    |    2
 include/linux/dmaengine.h              |   11 --
 11 files changed, 185 insertions(+), 104 deletions(-)

--
Dan


2008-02-13 07:05:17

by Dan Williams

Subject: [PATCH 3/4] async_tx: kill ->device_dependency_added

DMA drivers no longer need to be notified of dependency submission events as
async_tx_run_dependencies and async_tx_channel_switch will handle the
scheduling and execution of dependent operations.
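
To make the hand-off concrete, here is a simplified user-space model of what the core now does when a descriptor completes. It follows the loop that patch 2/4 adds to async_tx_run_dependencies, with the spinlock left out and only a single-channel chain exercised. All model_* names are illustrative stand-ins and not kernel API.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal stand-ins for the kernel types; all names here are illustrative. */
struct model_chan {
	int issue_pending_calls;	/* how often the engine was poked */
};

struct model_tx {
	struct model_chan *chan;
	struct model_tx *next;		/* descriptor to submit at completion */
	struct model_tx *parent;
	int submitted;
};

/*
 * Called when 'tx' completes: submit the descriptors queued behind it on
 * one channel, then poke that channel once. Locking is omitted.
 */
static void model_run_dependencies(struct model_tx *tx)
{
	struct model_tx *next = tx->next;
	struct model_chan *chan;

	if (!next)
		return;

	tx->next = NULL;
	chan = next->chan;

	while (next && next->chan == chan) {
		struct model_tx *after = next->next;

		next->parent = NULL;
		next->next = NULL;
		next->submitted = 1;	/* stands in for ->tx_submit() */
		next = after;
	}

	chan->issue_pending_calls++;	/* stands in for ->device_issue_pending() */
}

/* Build a -> b -> c with b and c on a second channel, then complete 'a'. */
static int model_demo(void)
{
	struct model_chan c0 = { 0 }, c1 = { 0 };
	struct model_tx a = { &c0, NULL, NULL, 1 };
	struct model_tx b = { &c1, NULL, &a, 0 };
	struct model_tx c = { &c1, NULL, &b, 0 };

	a.next = &b;
	b.next = &c;

	model_run_dependencies(&a);

	return b.submitted && c.submitted && a.next == NULL &&
	       c1.issue_pending_calls == 1 && c0.issue_pending_calls == 0;
}
```

In this model the engine is poked by the core itself after the dependents are submitted, which is why a per-driver notification hook has nothing left to do.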

Signed-off-by: Dan Williams <[email protected]>
---

drivers/dma/dmaengine.c | 1 -
drivers/dma/ioat_dma.c | 12 ------------
drivers/dma/iop-adma.c | 7 -------
include/linux/dmaengine.h | 2 --
4 files changed, 0 insertions(+), 22 deletions(-)


diff --git a/drivers/dma/dmaengine.c b/drivers/dma/dmaengine.c
index 9369781..5ca0d94 100644
--- a/drivers/dma/dmaengine.c
+++ b/drivers/dma/dmaengine.c
@@ -362,7 +362,6 @@ int dma_async_device_register(struct dma_device *device)

BUG_ON(!device->device_alloc_chan_resources);
BUG_ON(!device->device_free_chan_resources);
- BUG_ON(!device->device_dependency_added);
BUG_ON(!device->device_is_tx_complete);
BUG_ON(!device->device_issue_pending);
BUG_ON(!device->dev);
diff --git a/drivers/dma/ioat_dma.c b/drivers/dma/ioat_dma.c
index dff38ac..05ace54 100644
--- a/drivers/dma/ioat_dma.c
+++ b/drivers/dma/ioat_dma.c
@@ -922,17 +922,6 @@ static void ioat_dma_memcpy_cleanup(struct ioat_dma_chan *ioat_chan)
spin_unlock_bh(&ioat_chan->cleanup_lock);
}

-static void ioat_dma_dependency_added(struct dma_chan *chan)
-{
- struct ioat_dma_chan *ioat_chan = to_ioat_chan(chan);
- spin_lock_bh(&ioat_chan->desc_lock);
- if (ioat_chan->pending == 0) {
- spin_unlock_bh(&ioat_chan->desc_lock);
- ioat_dma_memcpy_cleanup(ioat_chan);
- } else
- spin_unlock_bh(&ioat_chan->desc_lock);
-}
-
/**
* ioat_dma_is_complete - poll the status of a IOAT DMA transaction
* @chan: IOAT DMA channel handle
@@ -1314,7 +1303,6 @@ struct ioatdma_device *ioat_dma_probe(struct pci_dev *pdev,

dma_cap_set(DMA_MEMCPY, device->common.cap_mask);
device->common.device_is_tx_complete = ioat_dma_is_complete;
- device->common.device_dependency_added = ioat_dma_dependency_added;
switch (device->version) {
case IOAT_VER_1_2:
device->common.device_prep_dma_memcpy = ioat1_dma_prep_memcpy;
diff --git a/drivers/dma/iop-adma.c b/drivers/dma/iop-adma.c
index a6171da..1cb4284 100644
--- a/drivers/dma/iop-adma.c
+++ b/drivers/dma/iop-adma.c
@@ -672,12 +672,6 @@ iop_adma_prep_dma_zero_sum(struct dma_chan *chan, dma_addr_t *dma_src,
return sw_desc ? &sw_desc->async_tx : NULL;
}

-static void iop_adma_dependency_added(struct dma_chan *chan)
-{
- struct iop_adma_chan *iop_chan = to_iop_adma_chan(chan);
- tasklet_schedule(&iop_chan->irq_tasklet);
-}
-
static void iop_adma_free_chan_resources(struct dma_chan *chan)
{
struct iop_adma_chan *iop_chan = to_iop_adma_chan(chan);
@@ -1178,7 +1172,6 @@ static int __devinit iop_adma_probe(struct platform_device *pdev)
dma_dev->device_free_chan_resources = iop_adma_free_chan_resources;
dma_dev->device_is_tx_complete = iop_adma_is_complete;
dma_dev->device_issue_pending = iop_adma_issue_pending;
- dma_dev->device_dependency_added = iop_adma_dependency_added;
dma_dev->dev = &pdev->dev;

/* set prep routines based on capability */
diff --git a/include/linux/dmaengine.h b/include/linux/dmaengine.h
index d04b169..e2538b4 100644
--- a/include/linux/dmaengine.h
+++ b/include/linux/dmaengine.h
@@ -258,7 +258,6 @@ struct dma_async_tx_descriptor {
* @device_prep_dma_zero_sum: prepares a zero_sum operation
* @device_prep_dma_memset: prepares a memset operation
* @device_prep_dma_interrupt: prepares an end of chain interrupt operation
- * @device_dependency_added: async_tx notifies the channel about new deps
* @device_issue_pending: push pending transactions to hardware
*/
struct dma_device {
@@ -293,7 +292,6 @@ struct dma_device {
struct dma_async_tx_descriptor *(*device_prep_dma_interrupt)(
struct dma_chan *chan);

- void (*device_dependency_added)(struct dma_chan *chan);
enum dma_status (*device_is_tx_complete)(struct dma_chan *chan,
dma_cookie_t cookie, dma_cookie_t *last,
dma_cookie_t *used);

2008-02-13 07:05:43

by Dan Williams

Subject: [PATCH 1/4] async_tx: checkpatch says s/__FUNCTION__/__func__/g

Signed-off-by: Dan Williams <[email protected]>
---

crypto/async_tx/async_memcpy.c | 6 +++---
crypto/async_tx/async_memset.c | 6 +++---
crypto/async_tx/async_tx.c | 6 +++---
crypto/async_tx/async_xor.c | 12 ++++++------
4 files changed, 15 insertions(+), 15 deletions(-)


diff --git a/crypto/async_tx/async_memcpy.c b/crypto/async_tx/async_memcpy.c
index 0f62822..84caa4e 100644
--- a/crypto/async_tx/async_memcpy.c
+++ b/crypto/async_tx/async_memcpy.c
@@ -66,11 +66,11 @@ async_memcpy(struct page *dest, struct page *src, unsigned int dest_offset,
}

if (tx) {
- pr_debug("%s: (async) len: %zu\n", __FUNCTION__, len);
+ pr_debug("%s: (async) len: %zu\n", __func__, len);
async_tx_submit(chan, tx, flags, depend_tx, cb_fn, cb_param);
} else {
void *dest_buf, *src_buf;
- pr_debug("%s: (sync) len: %zu\n", __FUNCTION__, len);
+ pr_debug("%s: (sync) len: %zu\n", __func__, len);

/* wait for any prerequisite operations */
if (depend_tx) {
@@ -80,7 +80,7 @@ async_memcpy(struct page *dest, struct page *src, unsigned int dest_offset,
BUG_ON(depend_tx->ack);
if (dma_wait_for_async_tx(depend_tx) == DMA_ERROR)
panic("%s: DMA_ERROR waiting for depend_tx\n",
- __FUNCTION__);
+ __func__);
}

dest_buf = kmap_atomic(dest, KM_USER0) + dest_offset;
diff --git a/crypto/async_tx/async_memset.c b/crypto/async_tx/async_memset.c
index 09c0e83..f5ff390 100644
--- a/crypto/async_tx/async_memset.c
+++ b/crypto/async_tx/async_memset.c
@@ -63,11 +63,11 @@ async_memset(struct page *dest, int val, unsigned int offset,
}

if (tx) {
- pr_debug("%s: (async) len: %zu\n", __FUNCTION__, len);
+ pr_debug("%s: (async) len: %zu\n", __func__, len);
async_tx_submit(chan, tx, flags, depend_tx, cb_fn, cb_param);
} else { /* run the memset synchronously */
void *dest_buf;
- pr_debug("%s: (sync) len: %zu\n", __FUNCTION__, len);
+ pr_debug("%s: (sync) len: %zu\n", __func__, len);

dest_buf = (void *) (((char *) page_address(dest)) + offset);

@@ -79,7 +79,7 @@ async_memset(struct page *dest, int val, unsigned int offset,
BUG_ON(depend_tx->ack);
if (dma_wait_for_async_tx(depend_tx) == DMA_ERROR)
panic("%s: DMA_ERROR waiting for depend_tx\n",
- __FUNCTION__);
+ __func__);
}

memset(dest_buf, val, len);
diff --git a/crypto/async_tx/async_tx.c b/crypto/async_tx/async_tx.c
index 5628821..2be3bae 100644
--- a/crypto/async_tx/async_tx.c
+++ b/crypto/async_tx/async_tx.c
@@ -472,11 +472,11 @@ async_trigger_callback(enum async_tx_flags flags,
tx = NULL;

if (tx) {
- pr_debug("%s: (async)\n", __FUNCTION__);
+ pr_debug("%s: (async)\n", __func__);

async_tx_submit(chan, tx, flags, depend_tx, cb_fn, cb_param);
} else {
- pr_debug("%s: (sync)\n", __FUNCTION__);
+ pr_debug("%s: (sync)\n", __func__);

/* wait for any prerequisite operations */
if (depend_tx) {
@@ -486,7 +486,7 @@ async_trigger_callback(enum async_tx_flags flags,
BUG_ON(depend_tx->ack);
if (dma_wait_for_async_tx(depend_tx) == DMA_ERROR)
panic("%s: DMA_ERROR waiting for depend_tx\n",
- __FUNCTION__);
+ __func__);
}

async_tx_sync_epilog(flags, depend_tx, cb_fn, cb_param);
diff --git a/crypto/async_tx/async_xor.c b/crypto/async_tx/async_xor.c
index 2259a4f..7a9db35 100644
--- a/crypto/async_tx/async_xor.c
+++ b/crypto/async_tx/async_xor.c
@@ -47,7 +47,7 @@ do_async_xor(struct dma_device *device,
int i;
unsigned long dma_prep_flags = cb_fn ? DMA_PREP_INTERRUPT : 0;

- pr_debug("%s: len: %zu\n", __FUNCTION__, len);
+ pr_debug("%s: len: %zu\n", __func__, len);

dma_dest = dma_map_page(device->dev, dest, offset, len,
DMA_FROM_DEVICE);
@@ -86,7 +86,7 @@ do_sync_xor(struct page *dest, struct page **src_list, unsigned int offset,
void *_dest;
int i;

- pr_debug("%s: len: %zu\n", __FUNCTION__, len);
+ pr_debug("%s: len: %zu\n", __func__, len);

/* reuse the 'src_list' array to convert to buffer pointers */
for (i = 0; i < src_cnt; i++)
@@ -196,7 +196,7 @@ async_xor(struct page *dest, struct page **src_list, unsigned int offset,
DMA_ERROR)
panic("%s: DMA_ERROR waiting for "
"depend_tx\n",
- __FUNCTION__);
+ __func__);
}

do_sync_xor(dest, &src_list[src_off], offset,
@@ -276,7 +276,7 @@ async_xor_zero_sum(struct page *dest, struct page **src_list,
unsigned long dma_prep_flags = cb_fn ? DMA_PREP_INTERRUPT : 0;
int i;

- pr_debug("%s: (async) len: %zu\n", __FUNCTION__, len);
+ pr_debug("%s: (async) len: %zu\n", __func__, len);

for (i = 0; i < src_cnt; i++)
dma_src[i] = dma_map_page(device->dev, src_list[i],
@@ -299,7 +299,7 @@ async_xor_zero_sum(struct page *dest, struct page **src_list,
} else {
unsigned long xor_flags = flags;

- pr_debug("%s: (sync) len: %zu\n", __FUNCTION__, len);
+ pr_debug("%s: (sync) len: %zu\n", __func__, len);

xor_flags |= ASYNC_TX_XOR_DROP_DST;
xor_flags &= ~ASYNC_TX_ACK;
@@ -310,7 +310,7 @@ async_xor_zero_sum(struct page *dest, struct page **src_list,
if (tx) {
if (dma_wait_for_async_tx(tx) == DMA_ERROR)
panic("%s: DMA_ERROR waiting for tx\n",
- __FUNCTION__);
+ __func__);
async_tx_ack(tx);
}

2008-02-13 07:07:29

by Dan Williams

Subject: [PATCH 4/4] iop-adma: remove the workaround for missed interrupts on iop3xx

This workaround was covering the dependency submission bug in async_tx.

Signed-off-by: Dan Williams <[email protected]>
---

drivers/dma/iop-adma.c | 5 -----
include/asm-arm/arch-iop13xx/adma.h | 5 -----
include/asm-arm/hardware/iop3xx-adma.h | 8 --------
include/asm-arm/hardware/iop_adma.h | 2 --
4 files changed, 0 insertions(+), 20 deletions(-)


diff --git a/drivers/dma/iop-adma.c b/drivers/dma/iop-adma.c
index 1cb4284..821bd17 100644
--- a/drivers/dma/iop-adma.c
+++ b/drivers/dma/iop-adma.c
@@ -255,8 +255,6 @@ static void __iop_adma_slot_cleanup(struct iop_adma_chan *iop_chan)

BUG_ON(!seen_current);

- iop_chan_idle(busy, iop_chan);
-
if (cookie > 0) {
iop_chan->completed_cookie = cookie;
pr_debug("\tcompleted cookie %d\n", cookie);
@@ -1226,9 +1224,6 @@ static int __devinit iop_adma_probe(struct platform_device *pdev)
}

spin_lock_init(&iop_chan->lock);
- init_timer(&iop_chan->cleanup_watchdog);
- iop_chan->cleanup_watchdog.data = (unsigned long) iop_chan;
- iop_chan->cleanup_watchdog.function = iop_adma_tasklet;
INIT_LIST_HEAD(&iop_chan->chain);
INIT_LIST_HEAD(&iop_chan->all_slots);
INIT_RCU_HEAD(&iop_chan->common.rcu);
diff --git a/include/asm-arm/arch-iop13xx/adma.h b/include/asm-arm/arch-iop13xx/adma.h
index efd9a5e..90d14ee 100644
--- a/include/asm-arm/arch-iop13xx/adma.h
+++ b/include/asm-arm/arch-iop13xx/adma.h
@@ -454,11 +454,6 @@ static inline void iop_chan_append(struct iop_adma_chan *chan)
__raw_writel(adma_accr, ADMA_ACCR(chan));
}

-static inline void iop_chan_idle(int busy, struct iop_adma_chan *chan)
-{
- do { } while (0);
-}
-
static inline u32 iop_chan_get_status(struct iop_adma_chan *chan)
{
return __raw_readl(ADMA_ACSR(chan));
diff --git a/include/asm-arm/hardware/iop3xx-adma.h b/include/asm-arm/hardware/iop3xx-adma.h
index 5c529e6..84d635b 100644
--- a/include/asm-arm/hardware/iop3xx-adma.h
+++ b/include/asm-arm/hardware/iop3xx-adma.h
@@ -767,20 +767,12 @@ static inline int iop_desc_get_zero_result(struct iop_adma_desc_slot *desc)
static inline void iop_chan_append(struct iop_adma_chan *chan)
{
u32 dma_chan_ctrl;
- /* workaround dropped interrupts on 3xx */
- mod_timer(&chan->cleanup_watchdog, jiffies + msecs_to_jiffies(3));

dma_chan_ctrl = __raw_readl(DMA_CCR(chan));
dma_chan_ctrl |= 0x2;
__raw_writel(dma_chan_ctrl, DMA_CCR(chan));
}

-static inline void iop_chan_idle(int busy, struct iop_adma_chan *chan)
-{
- if (!busy)
- del_timer(&chan->cleanup_watchdog);
-}
-
static inline u32 iop_chan_get_status(struct iop_adma_chan *chan)
{
return __raw_readl(DMA_CSR(chan));
diff --git a/include/asm-arm/hardware/iop_adma.h b/include/asm-arm/hardware/iop_adma.h
index ca8e71f..cb7e361 100644
--- a/include/asm-arm/hardware/iop_adma.h
+++ b/include/asm-arm/hardware/iop_adma.h
@@ -51,7 +51,6 @@ struct iop_adma_device {
* @common: common dmaengine channel object members
* @last_used: place holder for allocation to continue from where it left off
* @all_slots: complete domain of slots usable by the channel
- * @cleanup_watchdog: workaround missed interrupts on iop3xx
* @slots_allocated: records the actual size of the descriptor slot pool
* @irq_tasklet: bottom half where iop_adma_slot_cleanup runs
*/
@@ -65,7 +64,6 @@ struct iop_adma_chan {
struct dma_chan common;
struct iop_adma_desc_slot *last_used;
struct list_head all_slots;
- struct timer_list cleanup_watchdog;
int slots_allocated;
struct tasklet_struct irq_tasklet;
};

2008-02-13 07:09:21

by Dan Williams

Subject: [PATCH 2/4] async_tx: fix multiple dependency submission

Shrink struct dma_async_tx_descriptor and introduce
async_tx_channel_switch to properly inject a channel switch interrupt in
the descriptor stream. This simplifies the locking model as drivers no
longer need to handle dma_async_tx_descriptor.lock.
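
The decision that async_tx_submit takes while holding depend_tx->lock can be restated as a small pure function. This is only a sketch: pick_disposition() and its two boolean parameters are hypothetical and do not appear in the patch, while the enum values are the ones the patch introduces.

```c
#include <assert.h>
#include <stdbool.h>

enum submit_disposition {
	ASYNC_TX_SUBMITTED,
	ASYNC_TX_CHANNEL_SWITCH,
	ASYNC_TX_DIRECT_SUBMIT,
};

/*
 * Hypothetical helper, not part of the patch: it restates the if/else
 * ladder that async_tx_submit runs under depend_tx->lock.
 *
 * dep_has_parent: depend_tx->parent != NULL, so depend_tx cannot be
 *                 followed by a direct submission
 * same_chan:      depend_tx->chan == chan
 */
static enum submit_disposition
pick_disposition(bool dep_has_parent, bool same_chan)
{
	/* crossing channels always needs the switch path */
	if (!same_chan)
		return ASYNC_TX_CHANNEL_SWITCH;

	/* same channel: append behind a pending parent, else submit now */
	return dep_has_parent ? ASYNC_TX_SUBMITTED : ASYNC_TX_DIRECT_SUBMIT;
}
```

Only the ASYNC_TX_SUBMITTED case does its work under the lock; the other two outcomes are acted on after the lock is dropped, which is what keeps drivers that call async_tx_run_dependencies with a channel lock held out of a circular locking dependency.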

Signed-off-by: Dan Williams <[email protected]>
---

crypto/async_tx/async_tx.c | 197 ++++++++++++++++++++++++++++++++++++--------
drivers/dma/dmaengine.c | 2
drivers/dma/iop-adma.c | 9 +-
include/linux/dmaengine.h | 9 +-
4 files changed, 170 insertions(+), 47 deletions(-)


diff --git a/crypto/async_tx/async_tx.c b/crypto/async_tx/async_tx.c
index 2be3bae..6975616 100644
--- a/crypto/async_tx/async_tx.c
+++ b/crypto/async_tx/async_tx.c
@@ -89,13 +89,19 @@ dma_wait_for_async_tx(struct dma_async_tx_descriptor *tx)
iter = tx;

/* find the root of the unsubmitted dependency chain */
- while (iter->cookie == -EBUSY) {
+ do {
parent = iter->parent;
- if (parent && parent->cookie == -EBUSY)
- iter = iter->parent;
- else
+ if (!parent)
break;
- }
+ else
+ iter = parent;
+ } while (parent);
+
+ /* there is a small window for ->parent == NULL and
+ * ->cookie == -EBUSY
+ */
+ while (iter->cookie == -EBUSY)
+ cpu_relax();

status = dma_sync_wait(iter->chan, iter->cookie);
} while (status == DMA_IN_PROGRESS || (iter != tx));
@@ -111,24 +117,33 @@ EXPORT_SYMBOL_GPL(dma_wait_for_async_tx);
 void
 async_tx_run_dependencies(struct dma_async_tx_descriptor *tx)
 {
-	struct dma_async_tx_descriptor *dep_tx, *_dep_tx;
-	struct dma_device *dev;
+	struct dma_async_tx_descriptor *next = tx->next;
 	struct dma_chan *chan;

-	list_for_each_entry_safe(dep_tx, _dep_tx, &tx->depend_list,
-				 depend_node) {
-		chan = dep_tx->chan;
-		dev = chan->device;
-		/* we can't depend on ourselves */
-		BUG_ON(chan == tx->chan);
-		list_del(&dep_tx->depend_node);
-		tx->tx_submit(dep_tx);
-
-		/* we need to poke the engine as client code does not
-		 * know about dependency submission events
-		 */
-		dev->device_issue_pending(chan);
+	if (!next)
+		return;
+
+	tx->next = NULL;
+	chan = next->chan;
+
+	/* keep submitting up until a channel switch is detected
+	 * in that case we will be called again as a result of
+	 * processing the interrupt from async_tx_channel_switch
+	 */
+	while (next && next->chan == chan) {
+		struct dma_async_tx_descriptor *_next;
+
+		spin_lock_bh(&next->lock);
+		next->parent = NULL;
+		_next = next->next;
+		next->next = NULL;
+		spin_unlock_bh(&next->lock);
+
+		next->tx_submit(next);
+		next = _next;
 	}
+
+	chan->device->device_issue_pending(chan);
 }
 EXPORT_SYMBOL_GPL(async_tx_run_dependencies);

@@ -397,6 +412,92 @@ static void __exit async_tx_exit(void)
 }
 #endif

+
+/**
+ * async_tx_channel_switch - queue an interrupt descriptor with a dependency
+ *	pre-attached.
+ * @depend_tx: the operation that must finish before the new operation runs
+ * @tx: the new operation
+ */
+static void
+async_tx_channel_switch(struct dma_async_tx_descriptor *depend_tx,
+			struct dma_async_tx_descriptor *tx)
+{
+	struct dma_chan *chan;
+	struct dma_device *device;
+	struct dma_async_tx_descriptor *intr_tx = (void *) ~0;
+
+	/* first check to see if we can still append to depend_tx */
+	spin_lock_bh(&depend_tx->lock);
+	if (depend_tx->parent && depend_tx->chan == tx->chan) {
+		tx->parent = depend_tx;
+		depend_tx->next = tx;
+		intr_tx = NULL;
+	}
+	spin_unlock_bh(&depend_tx->lock);
+
+	if (!intr_tx)
+		return;
+
+	chan = depend_tx->chan;
+	device = chan->device;
+
+	/* see if we can schedule an interrupt
+	 * otherwise poll for completion
+	 */
+	if (dma_has_cap(DMA_INTERRUPT, device->cap_mask))
+		intr_tx = device->device_prep_dma_interrupt(chan);
+	else
+		intr_tx = NULL;
+
+	if (intr_tx) {
+		intr_tx->callback = NULL;
+		intr_tx->callback_param = NULL;
+		tx->parent = intr_tx;
+		/* safe to set ->next outside the lock since we know we are
+		 * not submitted yet
+		 */
+		intr_tx->next = tx;
+
+		/* check if we need to append */
+		spin_lock_bh(&depend_tx->lock);
+		if (depend_tx->parent) {
+			intr_tx->parent = depend_tx;
+			depend_tx->next = intr_tx;
+			async_tx_ack(intr_tx);
+			intr_tx = NULL;
+		}
+		spin_unlock_bh(&depend_tx->lock);
+
+		if (intr_tx) {
+			intr_tx->parent = NULL;
+			intr_tx->tx_submit(intr_tx);
+			async_tx_ack(intr_tx);
+		}
+	} else {
+		if (dma_wait_for_async_tx(depend_tx) == DMA_ERROR)
+			panic("%s: DMA_ERROR waiting for depend_tx\n",
+			      __func__);
+		tx->tx_submit(tx);
+	}
+}
+
+
+/**
+ * submit_disposition - while holding depend_tx->lock we must avoid submitting
+ *	new operations to prevent a circular locking dependency with
+ *	drivers that already hold a channel lock when calling
+ *	async_tx_run_dependencies.
+ * @ASYNC_TX_SUBMITTED: we were able to append the new operation under the lock
+ * @ASYNC_TX_CHANNEL_SWITCH: when the lock is dropped schedule a channel switch
+ * @ASYNC_TX_DIRECT_SUBMIT: when the lock is dropped submit directly
+ */
+enum submit_disposition {
+	ASYNC_TX_SUBMITTED,
+	ASYNC_TX_CHANNEL_SWITCH,
+	ASYNC_TX_DIRECT_SUBMIT,
+};
+
 void
 async_tx_submit(struct dma_chan *chan, struct dma_async_tx_descriptor *tx,
 	enum async_tx_flags flags, struct dma_async_tx_descriptor *depend_tx,
@@ -405,28 +506,54 @@ async_tx_submit(struct dma_chan *chan, struct dma_async_tx_descriptor *tx,
 	tx->callback = cb_fn;
 	tx->callback_param = cb_param;

-	/* set this new tx to run after depend_tx if:
-	 * 1/ a dependency exists (depend_tx is !NULL)
-	 * 2/ the tx can not be submitted to the current channel
-	 */
-	if (depend_tx && depend_tx->chan != chan) {
-		/* if ack is already set then we cannot be sure
+	if (depend_tx) {
+		enum submit_disposition s;
+
+		/* sanity check the dependency chain:
+		 * 1/ if ack is already set then we cannot be sure
 		 * we are referring to the correct operation
+		 * 2/ dependencies are 1:1 i.e. two transactions can
+		 * not depend on the same parent
 		 */
-		BUG_ON(depend_tx->ack);
+		BUG_ON(depend_tx->ack || depend_tx->next || tx->parent);

-		tx->parent = depend_tx;
+		/* the lock prevents async_tx_run_dependencies from missing
+		 * the setting of ->next when ->parent != NULL
+		 */
 		spin_lock_bh(&depend_tx->lock);
-		list_add_tail(&tx->depend_node, &depend_tx->depend_list);
-		if (depend_tx->cookie == 0) {
-			struct dma_chan *dep_chan = depend_tx->chan;
-			struct dma_device *dep_dev = dep_chan->device;
-			dep_dev->device_dependency_added(dep_chan);
+		if (depend_tx->parent) {
+			/* we have a parent so we can not submit directly
+			 * if we are staying on the same channel: append
+			 * else: channel switch
+			 */
+			if (depend_tx->chan == chan) {
+				tx->parent = depend_tx;
+				depend_tx->next = tx;
+				s = ASYNC_TX_SUBMITTED;
+			} else
+				s = ASYNC_TX_CHANNEL_SWITCH;
+		} else {
+			/* we do not have a parent so we may be able to submit
+			 * directly if we are staying on the same channel
+			 */
+			if (depend_tx->chan == chan)
+				s = ASYNC_TX_DIRECT_SUBMIT;
+			else
+				s = ASYNC_TX_CHANNEL_SWITCH;
 		}
 		spin_unlock_bh(&depend_tx->lock);

-		/* schedule an interrupt to trigger the channel switch */
-		async_trigger_callback(ASYNC_TX_ACK, depend_tx, NULL, NULL);
+		switch (s) {
+		case ASYNC_TX_SUBMITTED:
+			break;
+		case ASYNC_TX_CHANNEL_SWITCH:
+			async_tx_channel_switch(depend_tx, tx);
+			break;
+		case ASYNC_TX_DIRECT_SUBMIT:
+			tx->parent = NULL;
+			tx->tx_submit(tx);
+			break;
+		}
 	} else {
 		tx->parent = NULL;
 		tx->tx_submit(tx);
diff --git a/drivers/dma/dmaengine.c b/drivers/dma/dmaengine.c
index 2996523..9369781 100644
--- a/drivers/dma/dmaengine.c
+++ b/drivers/dma/dmaengine.c
@@ -600,8 +600,6 @@ void dma_async_tx_descriptor_init(struct dma_async_tx_descriptor *tx,
{
tx->chan = chan;
spin_lock_init(&tx->lock);
- INIT_LIST_HEAD(&tx->depend_node);
- INIT_LIST_HEAD(&tx->depend_list);
}
EXPORT_SYMBOL(dma_async_tx_descriptor_init);

diff --git a/drivers/dma/iop-adma.c b/drivers/dma/iop-adma.c
index 3986d54..a6171da 100644
--- a/drivers/dma/iop-adma.c
+++ b/drivers/dma/iop-adma.c
@@ -63,7 +63,6 @@ iop_adma_run_tx_complete_actions(struct iop_adma_desc_slot *desc,
struct iop_adma_chan *iop_chan, dma_cookie_t cookie)
{
BUG_ON(desc->async_tx.cookie < 0);
- spin_lock_bh(&desc->async_tx.lock);
if (desc->async_tx.cookie > 0) {
cookie = desc->async_tx.cookie;
desc->async_tx.cookie = 0;
@@ -101,7 +100,6 @@ iop_adma_run_tx_complete_actions(struct iop_adma_desc_slot *desc,

/* run dependent operations */
async_tx_run_dependencies(&desc->async_tx);
- spin_unlock_bh(&desc->async_tx.lock);

return cookie;
}
@@ -275,8 +273,11 @@ iop_adma_slot_cleanup(struct iop_adma_chan *iop_chan)

static void iop_adma_tasklet(unsigned long data)
{
- struct iop_adma_chan *chan = (struct iop_adma_chan *) data;
- __iop_adma_slot_cleanup(chan);
+ struct iop_adma_chan *iop_chan = (struct iop_adma_chan *) data;
+
+ spin_lock(&iop_chan->lock);
+ __iop_adma_slot_cleanup(iop_chan);
+ spin_unlock(&iop_chan->lock);
}

static struct iop_adma_desc_slot *
diff --git a/include/linux/dmaengine.h b/include/linux/dmaengine.h
index acbb364..d04b169 100644
--- a/include/linux/dmaengine.h
+++ b/include/linux/dmaengine.h
@@ -221,11 +221,9 @@ typedef void (*dma_async_tx_callback)(void *dma_async_param);
* @callback: routine to call after this operation is complete
* @callback_param: general parameter to pass to the callback routine
* ---async_tx api specific fields---
- * @depend_list: at completion this list of transactions are submitted
- * @depend_node: allow this transaction to be executed after another
- * transaction has completed, possibly on another channel
+ * @next: at completion submit this descriptor
* @parent: pointer to the next level up in the dependency chain
- * @lock: protect the dependency list
+ * @lock: protect the parent and next pointers
*/
struct dma_async_tx_descriptor {
dma_cookie_t cookie;
@@ -236,8 +234,7 @@ struct dma_async_tx_descriptor {
dma_cookie_t (*tx_submit)(struct dma_async_tx_descriptor *tx);
dma_async_tx_callback callback;
void *callback_param;
- struct list_head depend_list;
- struct list_head depend_node;
+ struct dma_async_tx_descriptor *next;
struct dma_async_tx_descriptor *parent;
spinlock_t lock;
};

2008-02-13 16:07:33

by Shannon Nelson

Subject: RE: [PATCH 3/4] async_tx: kill ->device_dependency_added

>-----Original Message-----
>From: Williams, Dan J
>Sent: Tuesday, February 12, 2008 11:03 PM
>To: [email protected]
>Cc: [email protected]; Nelson, Shannon; [email protected];
>[email protected]
>Subject: [PATCH 3/4] async_tx: kill ->device_dependency_added
>
>DMA drivers no longer need to be notified of dependency
>submission events as
>async_tx_run_dependencies and async_tx_channel_switch will handle the
>scheduling and execution of dependent operations.
>
>Signed-off-by: Dan Williams <[email protected]>
>---
>
> drivers/dma/dmaengine.c | 1 -
> drivers/dma/ioat_dma.c | 12 ------------
> drivers/dma/iop-adma.c | 7 -------
> include/linux/dmaengine.h | 2 --
> 4 files changed, 0 insertions(+), 22 deletions(-)
>
>
>diff --git a/drivers/dma/dmaengine.c b/drivers/dma/dmaengine.c
>index 9369781..5ca0d94 100644
>--- a/drivers/dma/dmaengine.c
>+++ b/drivers/dma/dmaengine.c
>@@ -362,7 +362,6 @@ int dma_async_device_register(struct
>dma_device *device)
>
> BUG_ON(!device->device_alloc_chan_resources);
> BUG_ON(!device->device_free_chan_resources);
>- BUG_ON(!device->device_dependency_added);
> BUG_ON(!device->device_is_tx_complete);
> BUG_ON(!device->device_issue_pending);
> BUG_ON(!device->dev);
>diff --git a/drivers/dma/ioat_dma.c b/drivers/dma/ioat_dma.c
>index dff38ac..05ace54 100644
>--- a/drivers/dma/ioat_dma.c
>+++ b/drivers/dma/ioat_dma.c
>@@ -922,17 +922,6 @@ static void
>ioat_dma_memcpy_cleanup(struct ioat_dma_chan *ioat_chan)
> spin_unlock_bh(&ioat_chan->cleanup_lock);
> }
>
>-static void ioat_dma_dependency_added(struct dma_chan *chan)
>-{
>- struct ioat_dma_chan *ioat_chan = to_ioat_chan(chan);
>- spin_lock_bh(&ioat_chan->desc_lock);
>- if (ioat_chan->pending == 0) {
>- spin_unlock_bh(&ioat_chan->desc_lock);
>- ioat_dma_memcpy_cleanup(ioat_chan);
>- } else
>- spin_unlock_bh(&ioat_chan->desc_lock);
>-}
>-
> /**
> * ioat_dma_is_complete - poll the status of a IOAT DMA transaction
> * @chan: IOAT DMA channel handle
>@@ -1314,7 +1303,6 @@ struct ioatdma_device
>*ioat_dma_probe(struct pci_dev *pdev,
>
> dma_cap_set(DMA_MEMCPY, device->common.cap_mask);
> device->common.device_is_tx_complete = ioat_dma_is_complete;
>- device->common.device_dependency_added =
>ioat_dma_dependency_added;
> switch (device->version) {
> case IOAT_VER_1_2:
> device->common.device_prep_dma_memcpy =
>ioat1_dma_prep_memcpy;
>diff --git a/drivers/dma/iop-adma.c b/drivers/dma/iop-adma.c
>index a6171da..1cb4284 100644
>--- a/drivers/dma/iop-adma.c
>+++ b/drivers/dma/iop-adma.c
>@@ -672,12 +672,6 @@ iop_adma_prep_dma_zero_sum(struct
>dma_chan *chan, dma_addr_t *dma_src,
> return sw_desc ? &sw_desc->async_tx : NULL;
> }
>
>-static void iop_adma_dependency_added(struct dma_chan *chan)
>-{
>- struct iop_adma_chan *iop_chan = to_iop_adma_chan(chan);
>- tasklet_schedule(&iop_chan->irq_tasklet);
>-}
>-
> static void iop_adma_free_chan_resources(struct dma_chan *chan)
> {
> struct iop_adma_chan *iop_chan = to_iop_adma_chan(chan);
>@@ -1178,7 +1172,6 @@ static int __devinit
>iop_adma_probe(struct platform_device *pdev)
> dma_dev->device_free_chan_resources =
>iop_adma_free_chan_resources;
> dma_dev->device_is_tx_complete = iop_adma_is_complete;
> dma_dev->device_issue_pending = iop_adma_issue_pending;
>- dma_dev->device_dependency_added = iop_adma_dependency_added;
> dma_dev->dev = &pdev->dev;
>
> /* set prep routines based on capability */
>diff --git a/include/linux/dmaengine.h b/include/linux/dmaengine.h
>index d04b169..e2538b4 100644
>--- a/include/linux/dmaengine.h
>+++ b/include/linux/dmaengine.h
>@@ -258,7 +258,6 @@ struct dma_async_tx_descriptor {
> * @device_prep_dma_zero_sum: prepares a zero_sum operation
> * @device_prep_dma_memset: prepares a memset operation
> * @device_prep_dma_interrupt: prepares an end of chain
>interrupt operation
>- * @device_dependency_added: async_tx notifies the channel
>about new deps
> * @device_issue_pending: push pending transactions to hardware
> */
> struct dma_device {
>@@ -293,7 +292,6 @@ struct dma_device {
> struct dma_async_tx_descriptor *(*device_prep_dma_interrupt)(
> struct dma_chan *chan);
>
>- void (*device_dependency_added)(struct dma_chan *chan);
> enum dma_status (*device_is_tx_complete)(struct dma_chan *chan,
> dma_cookie_t cookie, dma_cookie_t *last,
> dma_cookie_t *used);
>
>

Acked-by: Shannon Nelson <[email protected]>

2008-02-13 16:11:57

by Shannon Nelson

Subject: RE: [PATCH 2/4] async_tx: fix multiple dependency submission

>-----Original Message-----
>From: Williams, Dan J
>Sent: Tuesday, February 12, 2008 11:03 PM
>To: [email protected]
>Cc: [email protected]; Nelson, Shannon; [email protected];
>[email protected]
>Subject: [PATCH 2/4] async_tx: fix multiple dependency submission
>
>Shrink struct dma_async_tx_descriptor and introduce
>async_tx_channel_switch to properly inject a channel switch
>interrupt in
>the descriptor stream. This simplifies the locking model as drivers no
>longer need to handle dma_async_tx_descriptor.lock.
>
>Signed-off-by: Dan Williams <[email protected]>
>---
>
> crypto/async_tx/async_tx.c | 197
>++++++++++++++++++++++++++++++++++++--------
> drivers/dma/dmaengine.c | 2
> drivers/dma/iop-adma.c | 9 +-
> include/linux/dmaengine.h | 9 +-
> 4 files changed, 170 insertions(+), 47 deletions(-)
>
>


Acked-by: Shannon Nelson <[email protected]>

2008-02-15 08:39:18

by Haavard Skinnemoen

Subject: Re: [PATCH 0/4] async_tx: fix dependency handling and related cleanups

On Wed, 13 Feb 2008 00:02:52 -0700
Dan Williams <[email protected]> wrote:

> Dan Williams (4):
> iop-adma: remove the workaround for missed interrupts on iop3xx
> async_tx: kill ->device_dependency_added
> async_tx: fix multiple dependency submission
> async_tx: checkpatch says s/__FUNCTION__/__func__/g

All of the above changes are fine with me. I'll rebase my dmaslave
patches on top of them; that will probably allow me to get rid of the
dma_slave_descriptor struct.

Haavard