2013-04-10 23:44:17

by Dave Jiang

Subject: [PATCH v2 0/5] ioatdma: Intel S1200 support patches

These are the updated patches from the first submission, rebased against the
for-linus branch of Vinod's slave-dma git tree.

Patches 1 & 4 have been updated after discussion with Dan. Patch 5 was acked by
Dan but requires additional review from him, since I had to make additional
modifications to get raid6test working.

---

Dave Jiang (5):
ioatdma: Removing hw bug workaround for CB3.x .2 and earlier
ioatdma: Adding support for 16 src PQ ops and super extended descriptors
ioatdma: S1200 platforms ioatdma channel 2 and 3 falsely advertise RAID cap
ioatdma: Adding write back descriptor error status support for ioatdma 3.3
async_tx: allow generic async_memcpy() not be effected by channel switch


crypto/async_tx/async_memcpy.c | 76 ++++--
drivers/dma/ioat/dma.h | 18 +
drivers/dma/ioat/dma_v2.h | 2
drivers/dma/ioat/dma_v3.c | 538 +++++++++++++++++++++++++++++++++++++---
drivers/dma/ioat/hw.h | 60 ++++
drivers/dma/ioat/pci.c | 3
drivers/dma/ioat/registers.h | 2
drivers/md/raid5.c | 15 +
include/linux/async_tx.h | 5
include/linux/dmaengine.h | 34 ---
10 files changed, 647 insertions(+), 106 deletions(-)


2013-04-10 23:44:25

by Dave Jiang

Subject: [PATCH v2 1/5] ioatdma: Removing hw bug workaround for CB3.x .2 and earlier

CB3.2 and earlier hardware has silicon bugs whose workarounds are no longer
needed on the new hardware. We no longer have to use a NULL op to signal the
interrupt for RAID ops. This code makes sure the legacy workarounds are only
applied on legacy hardware.
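
As a rough illustration of the ring-space accounting this implies, the sketch
below models how many ring slots a PQ operation reserves. It is a standalone
userspace model, not driver code: pq_ring_slots() and the VER_* constants are
hypothetical names introduced for the example, and only the ordering of the
version values matters.

```c
#include <assert.h>

/* Hypothetical version codes for this sketch; only their ordering matters. */
enum { VER_3_2 = 0x32, VER_3_3 = 0x33 };

/*
 * Number of ring slots a PQ operation needs. Hardware older than 3.3
 * needs one extra slot for the NULL descriptor that carries the
 * interrupt and completion-write bits; 3.3 and later set those bits
 * on the last PQ descriptor itself.
 */
static int pq_ring_slots(int version, int num_descs)
{
	int legacy = version < VER_3_3;

	assert(num_descs > 0);
	return num_descs + (legacy ? 1 : 0);
}
```

With this model, a four-descriptor operation reserves five slots on legacy
hardware and four on 3.3.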

Signed-off-by: Dave Jiang <[email protected]>
---
drivers/dma/ioat/dma_v3.c | 31 ++++++++++++++++++++-----------
1 file changed, 20 insertions(+), 11 deletions(-)

diff --git a/drivers/dma/ioat/dma_v3.c b/drivers/dma/ioat/dma_v3.c
index cf97e3f..6393115 100644
--- a/drivers/dma/ioat/dma_v3.c
+++ b/drivers/dma/ioat/dma_v3.c
@@ -837,6 +837,7 @@ __ioat3_prep_pq_lock(struct dma_chan *c, enum sum_check_flags *result,
{
struct ioat2_dma_chan *ioat = to_ioat2_chan(c);
struct ioat_chan_common *chan = &ioat->base;
+ struct ioatdma_device *device = chan->device;
struct ioat_ring_ent *compl_desc;
struct ioat_ring_ent *desc;
struct ioat_ring_ent *ext;
@@ -847,6 +848,7 @@ __ioat3_prep_pq_lock(struct dma_chan *c, enum sum_check_flags *result,
u32 offset = 0;
u8 op = result ? IOAT_OP_PQ_VAL : IOAT_OP_PQ;
int i, s, idx, with_ext, num_descs;
+ int cb32 = (device->version < IOAT_VER_3_3) ? 1 : 0;

dev_dbg(to_dev(chan), "%s\n", __func__);
/* the engine requires at least two sources (we provide
@@ -872,7 +874,7 @@ __ioat3_prep_pq_lock(struct dma_chan *c, enum sum_check_flags *result,
* order.
*/
if (likely(num_descs) &&
- ioat2_check_space_lock(ioat, num_descs+1) == 0)
+ ioat2_check_space_lock(ioat, num_descs + cb32) == 0)
idx = ioat->head;
else
return NULL;
@@ -926,16 +928,23 @@ __ioat3_prep_pq_lock(struct dma_chan *c, enum sum_check_flags *result,
pq->ctl_f.fence = !!(flags & DMA_PREP_FENCE);
dump_pq_desc_dbg(ioat, desc, ext);

- /* completion descriptor carries interrupt bit */
- compl_desc = ioat2_get_ring_ent(ioat, idx + i);
- compl_desc->txd.flags = flags & DMA_PREP_INTERRUPT;
- hw = compl_desc->hw;
- hw->ctl = 0;
- hw->ctl_f.null = 1;
- hw->ctl_f.int_en = !!(flags & DMA_PREP_INTERRUPT);
- hw->ctl_f.compl_write = 1;
- hw->size = NULL_DESC_BUFFER_SIZE;
- dump_desc_dbg(ioat, compl_desc);
+ if (!cb32) {
+ pq->ctl_f.int_en = !!(flags & DMA_PREP_INTERRUPT);
+ pq->ctl_f.compl_write = 1;
+ compl_desc = desc;
+ } else {
+ /* completion descriptor carries interrupt bit */
+ compl_desc = ioat2_get_ring_ent(ioat, idx + i);
+ compl_desc->txd.flags = flags & DMA_PREP_INTERRUPT;
+ hw = compl_desc->hw;
+ hw->ctl = 0;
+ hw->ctl_f.null = 1;
+ hw->ctl_f.int_en = !!(flags & DMA_PREP_INTERRUPT);
+ hw->ctl_f.compl_write = 1;
+ hw->size = NULL_DESC_BUFFER_SIZE;
+ dump_desc_dbg(ioat, compl_desc);
+ }
+

/* we leave the channel locked to ensure in order submission */
return &compl_desc->txd;

2013-04-10 23:44:32

by Dave Jiang

Subject: [PATCH v2 2/5] ioatdma: Adding support for 16 src PQ ops and super extended descriptors

v3.3 introduced PQ operations with 16 sources, along with super extended
descriptors (SEDs) to support them. This patch adds support for the 16-source
ops and, in turn, for the super extended descriptors those ops need.
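
The way 16 sources are spread over the base descriptor and the SED can be
sketched with the two lookup tables from this patch. The model below is
userspace C, not driver code: struct raw_desc, set_src() and get_src() are
hypothetical stand-ins for the driver's raw descriptor and its pq16 helpers,
and coefficients are left out.

```c
#include <assert.h>
#include <stdint.h>

/* One 64-byte hardware descriptor viewed as eight 64-bit fields. */
struct raw_desc {
	uint64_t field[8];
};

/*
 * Tables copied from the patch. idx_to_desc picks the descriptor that
 * holds source i (0 = base PQ descriptor, 1 and 2 = first and second
 * 64-byte block of the SED); idx_to_field picks the field inside it.
 */
static const uint8_t idx_to_desc[16] = { 0, 0, 1, 1, 1, 1, 1, 1, 1,
					 2, 2, 2, 2, 2, 2, 2 };
static const uint8_t idx_to_field[16] = { 1, 4, 1, 2, 3, 4, 5, 6, 7,
					  0, 1, 2, 3, 4, 5, 6 };

/* Store a source address in the slot assigned to source idx. */
static void set_src(struct raw_desc *descs[3], uint64_t addr, int idx)
{
	assert(idx >= 0 && idx < 16);
	descs[idx_to_desc[idx]]->field[idx_to_field[idx]] = addr;
}

/* Read back the source address stored for source idx. */
static uint64_t get_src(struct raw_desc *descs[3], int idx)
{
	assert(idx >= 0 && idx < 16);
	return descs[idx_to_desc[idx]]->field[idx_to_field[idx]];
}
```

Sources 0 and 1 stay in the base descriptor, sources 2 to 8 go to the first
SED block, and sources 9 to 15 go to the second.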

Five SED pools are created, one per descriptor size. An SED is 64 bytes or
larger and must be physically contiguous. A kmem cache pool is created for
allocating the software descriptor that manages the hardware descriptor. The
super extended descriptor takes the place of the extended descriptor for
certain operations and is "attached" to the op descriptor during the
operation. This is a new feature for ioatdma v3.3.
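
For reference, the pool sizing used by the probe code in this patch can be
modeled as below. SED_SIZE and MAX_SED_POOLS are taken from the patch;
sed_pool_size() is a hypothetical helper that exists only in this sketch.

```c
#include <assert.h>
#include <stddef.h>

#define SED_SIZE	64	/* from hw.h in this patch */
#define MAX_SED_POOLS	5	/* from dma.h in this patch */

/*
 * Element size in bytes of hardware SED pool 'pool'. The probe code
 * creates pool i with elements of SED_SIZE * (i + 1) bytes.
 */
static size_t sed_pool_size(unsigned int pool)
{
	assert(pool < MAX_SED_POOLS);
	return (size_t)SED_SIZE * (pool + 1);
}
```

Under this sizing the five pools hold elements of 64, 128, 192, 256 and 320
bytes.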

Signed-off-by: Dave Jiang <[email protected]>
Acked-by: Dan Williams <[email protected]>
---
drivers/dma/ioat/dma.h | 17 ++
drivers/dma/ioat/dma_v2.h | 2
drivers/dma/ioat/dma_v3.c | 399 ++++++++++++++++++++++++++++++++++++++++--
drivers/dma/ioat/hw.h | 43 ++++-
drivers/dma/ioat/pci.c | 3
drivers/dma/ioat/registers.h | 1
6 files changed, 443 insertions(+), 22 deletions(-)

diff --git a/drivers/dma/ioat/dma.h b/drivers/dma/ioat/dma.h
index 976eba8..35d7402 100644
--- a/drivers/dma/ioat/dma.h
+++ b/drivers/dma/ioat/dma.h
@@ -81,6 +81,9 @@ struct ioatdma_device {
void __iomem *reg_base;
struct pci_pool *dma_pool;
struct pci_pool *completion_pool;
+#define MAX_SED_POOLS 5
+ struct dma_pool *sed_hw_pool[MAX_SED_POOLS];
+ struct kmem_cache *sed_pool;
struct dma_device common;
u8 version;
struct msix_entry msix_entries[4];
@@ -141,6 +144,20 @@ struct ioat_dma_chan {
u16 active;
};

+/**
+ * struct ioat_sed_ent - wrapper around super extended hardware descriptor
+ * @hw: hardware SED
+ * @dma: dma address for the SED
+ * @parent: point to the dma descriptor that's the parent
+ * @hw_pool: descriptor pool index
+ */
+struct ioat_sed_ent {
+ struct ioat_sed_raw_descriptor *hw;
+ dma_addr_t dma;
+ struct ioat_ring_ent *parent;
+ unsigned int hw_pool;
+};
+
static inline struct ioat_chan_common *to_chan_common(struct dma_chan *c)
{
return container_of(c, struct ioat_chan_common, common);
diff --git a/drivers/dma/ioat/dma_v2.h b/drivers/dma/ioat/dma_v2.h
index e100f64..29bf944 100644
--- a/drivers/dma/ioat/dma_v2.h
+++ b/drivers/dma/ioat/dma_v2.h
@@ -137,6 +137,7 @@ struct ioat_ring_ent {
#ifdef DEBUG
int id;
#endif
+ struct ioat_sed_ent *sed;
};

static inline struct ioat_ring_ent *
@@ -157,6 +158,7 @@ static inline void ioat2_set_chainaddr(struct ioat2_dma_chan *ioat, u64 addr)

int ioat2_dma_probe(struct ioatdma_device *dev, int dca);
int ioat3_dma_probe(struct ioatdma_device *dev, int dca);
+void ioat3_dma_remove(struct ioatdma_device *dev);
struct dca_provider *ioat2_dca_init(struct pci_dev *pdev, void __iomem *iobase);
struct dca_provider *ioat3_dca_init(struct pci_dev *pdev, void __iomem *iobase);
int ioat2_check_space_lock(struct ioat2_dma_chan *ioat, int num_descs);
diff --git a/drivers/dma/ioat/dma_v3.c b/drivers/dma/ioat/dma_v3.c
index 6393115..02afdb5 100644
--- a/drivers/dma/ioat/dma_v3.c
+++ b/drivers/dma/ioat/dma_v3.c
@@ -55,7 +55,7 @@
/*
* Support routines for v3+ hardware
*/
-
+#include <linux/module.h>
#include <linux/pci.h>
#include <linux/gfp.h>
#include <linux/dmaengine.h>
@@ -70,6 +70,10 @@
/* ioat hardware assumes at least two sources for raid operations */
#define src_cnt_to_sw(x) ((x) + 2)
#define src_cnt_to_hw(x) ((x) - 2)
+#define ndest_to_sw(x) ((x) + 1)
+#define ndest_to_hw(x) ((x) - 1)
+#define src16_cnt_to_sw(x) ((x) + 9)
+#define src16_cnt_to_hw(x) ((x) - 9)

/* provide a lookup table for setting the source address in the base or
* extended descriptor of an xor or pq descriptor
@@ -77,7 +81,18 @@
static const u8 xor_idx_to_desc = 0xe0;
static const u8 xor_idx_to_field[] = { 1, 4, 5, 6, 7, 0, 1, 2 };
static const u8 pq_idx_to_desc = 0xf8;
+static const u8 pq16_idx_to_desc[] = { 0, 0, 1, 1, 1, 1, 1, 1, 1,
+ 2, 2, 2, 2, 2, 2, 2 };
static const u8 pq_idx_to_field[] = { 1, 4, 5, 0, 1, 2, 4, 5 };
+static const u8 pq16_idx_to_field[] = { 1, 4, 1, 2, 3, 4, 5, 6, 7,
+ 0, 1, 2, 3, 4, 5, 6 };
+
+/*
+ * technically sources 1 and 2 do not require SED, but the op will have
+ * at least 9 descriptors so that's irrelevant.
+ */
+static const u8 pq16_idx_to_sed[] = { 0, 0, 0, 0, 0, 0, 0, 0, 0,
+ 1, 1, 1, 1, 1, 1, 1 };

static void ioat3_eh(struct ioat2_dma_chan *ioat);

@@ -103,6 +118,13 @@ static dma_addr_t pq_get_src(struct ioat_raw_descriptor *descs[2], int idx)
return raw->field[pq_idx_to_field[idx]];
}

+static dma_addr_t pq16_get_src(struct ioat_raw_descriptor *desc[3], int idx)
+{
+ struct ioat_raw_descriptor *raw = desc[pq16_idx_to_desc[idx]];
+
+ return raw->field[pq16_idx_to_field[idx]];
+}
+
static void pq_set_src(struct ioat_raw_descriptor *descs[2],
dma_addr_t addr, u32 offset, u8 coef, int idx)
{
@@ -113,6 +135,12 @@ static void pq_set_src(struct ioat_raw_descriptor *descs[2],
pq->coef[idx] = coef;
}

+static int sed_get_pq16_pool_idx(int src_cnt)
+{
+
+ return pq16_idx_to_sed[src_cnt];
+}
+
static bool is_jf_ioat(struct pci_dev *pdev)
{
switch (pdev->device) {
@@ -210,6 +238,57 @@ static bool is_bwd_ioat(struct pci_dev *pdev)
}
}

+static void pq16_set_src(struct ioat_raw_descriptor *desc[3],
+ dma_addr_t addr, u32 offset, u8 coef, int idx)
+{
+ struct ioat_pq_descriptor *pq = (struct ioat_pq_descriptor *)desc[0];
+ struct ioat_pq16a_descriptor *pq16 =
+ (struct ioat_pq16a_descriptor *)desc[1];
+ struct ioat_raw_descriptor *raw = desc[pq16_idx_to_desc[idx]];
+
+ raw->field[pq16_idx_to_field[idx]] = addr + offset;
+
+ if (idx < 8)
+ pq->coef[idx] = coef;
+ else
+ pq16->coef[idx - 8] = coef;
+}
+
+struct ioat_sed_ent *
+ioat3_alloc_sed(struct ioatdma_device *device, unsigned int hw_pool)
+{
+ struct ioat_sed_ent *sed;
+ gfp_t flags = __GFP_ZERO;
+
+ if (in_atomic())
+ flags |= GFP_ATOMIC;
+ else
+ flags |= GFP_KERNEL;
+
+ sed = kmem_cache_alloc(device->sed_pool, flags);
+ if (!sed)
+ return NULL;
+
+ sed->hw_pool = hw_pool;
+ sed->hw = dma_pool_alloc(device->sed_hw_pool[hw_pool],
+ flags, &sed->dma);
+ if (!sed->hw) {
+ kmem_cache_free(device->sed_pool, sed);
+ return NULL;
+ }
+
+ return sed;
+}
+
+void ioat3_free_sed(struct ioatdma_device *device, struct ioat_sed_ent *sed)
+{
+ if (!sed)
+ return;
+
+ dma_pool_free(device->sed_hw_pool[sed->hw_pool], sed->hw, sed->dma);
+ kmem_cache_free(device->sed_pool, sed);
+}
+
static void ioat3_dma_unmap(struct ioat2_dma_chan *ioat,
struct ioat_ring_ent *desc, int idx)
{
@@ -322,6 +401,54 @@ static void ioat3_dma_unmap(struct ioat2_dma_chan *ioat,
}
break;
}
+ case IOAT_OP_PQ_16S:
+ case IOAT_OP_PQ_VAL_16S: {
+ struct ioat_pq_descriptor *pq = desc->pq;
+ int src_cnt = src16_cnt_to_sw(pq->ctl_f.src_cnt);
+ struct ioat_raw_descriptor *descs[4];
+ int i;
+
+ /* in the 'continue' case don't unmap the dests as sources */
+ if (dmaf_p_disabled_continue(flags))
+ src_cnt--;
+ else if (dmaf_continue(flags))
+ src_cnt -= 3;
+
+ if (!(flags & DMA_COMPL_SKIP_SRC_UNMAP)) {
+ descs[0] = (struct ioat_raw_descriptor *)pq;
+ descs[1] = (struct ioat_raw_descriptor *)(desc->sed->hw);
+ descs[2] = (struct ioat_raw_descriptor *)(&desc->sed->hw->b[0]);
+ for (i = 0; i < src_cnt; i++) {
+ dma_addr_t src = pq16_get_src(descs, i);
+
+ ioat_unmap(pdev, src - offset, len,
+ PCI_DMA_TODEVICE, flags, 0);
+ }
+
+ /* the dests are sources in pq validate operations */
+ if (pq->ctl_f.op == IOAT_OP_XOR_VAL) {
+ if (!(flags & DMA_PREP_PQ_DISABLE_P))
+ ioat_unmap(pdev, pq->p_addr - offset,
+ len, PCI_DMA_TODEVICE,
+ flags, 0);
+ if (!(flags & DMA_PREP_PQ_DISABLE_Q))
+ ioat_unmap(pdev, pq->q_addr - offset,
+ len, PCI_DMA_TODEVICE,
+ flags, 0);
+ break;
+ }
+ }
+
+ if (!(flags & DMA_COMPL_SKIP_DEST_UNMAP)) {
+ if (!(flags & DMA_PREP_PQ_DISABLE_P))
+ ioat_unmap(pdev, pq->p_addr - offset, len,
+ PCI_DMA_BIDIRECTIONAL, flags, 1);
+ if (!(flags & DMA_PREP_PQ_DISABLE_Q))
+ ioat_unmap(pdev, pq->q_addr - offset, len,
+ PCI_DMA_BIDIRECTIONAL, flags, 1);
+ }
+ break;
+ }
default:
dev_err(&pdev->dev, "%s: unknown op type: %#x\n",
__func__, desc->hw->ctl_f.op);
@@ -386,6 +513,7 @@ static bool ioat3_cleanup_preamble(struct ioat_chan_common *chan,
static void __cleanup(struct ioat2_dma_chan *ioat, dma_addr_t phys_complete)
{
struct ioat_chan_common *chan = &ioat->base;
+ struct ioatdma_device *device = chan->device;
struct ioat_ring_ent *desc;
bool seen_current = false;
int idx = ioat->tail, i;
@@ -430,6 +558,12 @@ static void __cleanup(struct ioat2_dma_chan *ioat, dma_addr_t phys_complete)
BUG_ON(i + 1 >= active);
i++;
}
+
+ /* cleanup super extended descriptors */
+ if (desc->sed) {
+ ioat3_free_sed(device, desc->sed);
+ desc->sed = NULL;
+ }
}
smp_mb(); /* finish all descriptor reads before incrementing tail */
ioat->tail = idx + i;
@@ -522,6 +656,7 @@ static void ioat3_eh(struct ioat2_dma_chan *ioat)
}
break;
case IOAT_OP_PQ_VAL:
+ case IOAT_OP_PQ_VAL_16S:
if (chanerr & IOAT_CHANERR_XOR_P_OR_CRC_ERR) {
*desc->result |= SUM_CHECK_P_RESULT;
err_handled |= IOAT_CHANERR_XOR_P_OR_CRC_ERR;
@@ -814,7 +949,8 @@ dump_pq_desc_dbg(struct ioat2_dma_chan *ioat, struct ioat_ring_ent *desc, struct
int i;

dev_dbg(dev, "desc[%d]: (%#llx->%#llx) flags: %#x"
- " sz: %#10.8x ctl: %#x (op: %#x int: %d compl: %d pq: '%s%s' src_cnt: %d)\n",
+ " sz: %#10.8x ctl: %#x (op: %#x int: %d compl: %d pq: '%s%s'"
+ " src_cnt: %d)\n",
desc_id(desc), (unsigned long long) desc->txd.phys,
(unsigned long long) (pq_ex ? pq_ex->next : pq->next),
desc->txd.flags, pq->size, pq->ctl, pq->ctl_f.op, pq->ctl_f.int_en,
@@ -829,6 +965,41 @@ dump_pq_desc_dbg(struct ioat2_dma_chan *ioat, struct ioat_ring_ent *desc, struct
dev_dbg(dev, "\tNEXT: %#llx\n", pq->next);
}

+static void dump_pq16_desc_dbg(struct ioat2_dma_chan *ioat,
+ struct ioat_ring_ent *desc)
+{
+ struct device *dev = to_dev(&ioat->base);
+ struct ioat_pq_descriptor *pq = desc->pq;
+ struct ioat_raw_descriptor *descs[] = { (void *)pq,
+ (void *)pq,
+ (void *)pq };
+ int src_cnt = src16_cnt_to_sw(pq->ctl_f.src_cnt);
+ int i;
+
+ if (desc->sed) {
+ descs[1] = (void *)desc->sed->hw;
+ descs[2] = (void *)desc->sed->hw + 64;
+ }
+
+ dev_dbg(dev, "desc[%d]: (%#llx->%#llx) flags: %#x"
+ " sz: %#x ctl: %#x (op: %#x int: %d compl: %d pq: '%s%s'"
+ " src_cnt: %d)\n",
+ desc_id(desc), (unsigned long long) desc->txd.phys,
+ (unsigned long long) pq->next,
+ desc->txd.flags, pq->size, pq->ctl,
+ pq->ctl_f.op, pq->ctl_f.int_en,
+ pq->ctl_f.compl_write,
+ pq->ctl_f.p_disable ? "" : "p", pq->ctl_f.q_disable ? "" : "q",
+ pq->ctl_f.src_cnt);
+ for (i = 0; i < src_cnt; i++) {
+ dev_dbg(dev, "\tsrc[%d]: %#llx coef: %#x\n", i,
+ (unsigned long long) pq16_get_src(descs, i),
+ pq->coef[i]);
+ }
+ dev_dbg(dev, "\tP: %#llx\n", pq->p_addr);
+ dev_dbg(dev, "\tQ: %#llx\n", pq->q_addr);
+}
+
static struct dma_async_tx_descriptor *
__ioat3_prep_pq_lock(struct dma_chan *c, enum sum_check_flags *result,
const dma_addr_t *dst, const dma_addr_t *src,
@@ -951,10 +1122,114 @@ __ioat3_prep_pq_lock(struct dma_chan *c, enum sum_check_flags *result,
}

static struct dma_async_tx_descriptor *
+__ioat3_prep_pq16_lock(struct dma_chan *c, enum sum_check_flags *result,
+ const dma_addr_t *dst, const dma_addr_t *src,
+ unsigned int src_cnt, const unsigned char *scf,
+ size_t len, unsigned long flags)
+{
+ struct ioat2_dma_chan *ioat = to_ioat2_chan(c);
+ struct ioat_chan_common *chan = &ioat->base;
+ struct ioatdma_device *device = chan->device;
+ struct ioat_ring_ent *desc;
+ size_t total_len = len;
+ struct ioat_pq_descriptor *pq;
+ u32 offset = 0;
+ u8 op;
+ int i, s, idx, num_descs;
+
+ /* this function only handles src_cnt 9 - 16 */
+ BUG_ON(src_cnt < 9);
+
+ /* this function is only called with 9-16 sources */
+ op = result ? IOAT_OP_PQ_VAL_16S : IOAT_OP_PQ_16S;
+
+ dev_dbg(to_dev(chan), "%s\n", __func__);
+
+ num_descs = ioat2_xferlen_to_descs(ioat, len);
+
+ /*
+ * 16 source pq is only available on cb3.3 and has no completion
+ * write hw bug.
+ */
+ if (num_descs && ioat2_check_space_lock(ioat, num_descs) == 0)
+ idx = ioat->head;
+ else
+ return NULL;
+
+ i = 0;
+
+ do {
+ struct ioat_raw_descriptor *descs[4];
+ size_t xfer_size = min_t(size_t, len, 1 << ioat->xfercap_log);
+
+ desc = ioat2_get_ring_ent(ioat, idx + i);
+ pq = desc->pq;
+
+ descs[0] = (struct ioat_raw_descriptor *) pq;
+
+ desc->sed = ioat3_alloc_sed(device,
+ sed_get_pq16_pool_idx(src_cnt));
+ if (!desc->sed) {
+ dev_err(to_dev(chan),
+ "%s: no free sed entries\n", __func__);
+ return NULL;
+ }
+
+ pq->sed_addr = desc->sed->dma;
+ desc->sed->parent = desc;
+
+ descs[1] = (struct ioat_raw_descriptor *)desc->sed->hw;
+ descs[2] = (void *)descs[1] + 64;
+
+ for (s = 0; s < src_cnt; s++)
+ pq16_set_src(descs, src[s], offset, scf[s], s);
+
+ /* see the comment for dma_maxpq in include/linux/dmaengine.h */
+ if (dmaf_p_disabled_continue(flags))
+ pq16_set_src(descs, dst[1], offset, 1, s++);
+ else if (dmaf_continue(flags)) {
+ pq16_set_src(descs, dst[0], offset, 0, s++);
+ pq16_set_src(descs, dst[1], offset, 1, s++);
+ pq16_set_src(descs, dst[1], offset, 0, s++);
+ }
+
+ pq->size = xfer_size;
+ pq->p_addr = dst[0] + offset;
+ pq->q_addr = dst[1] + offset;
+ pq->ctl = 0;
+ pq->ctl_f.op = op;
+ pq->ctl_f.src_cnt = src16_cnt_to_hw(s);
+ pq->ctl_f.p_disable = !!(flags & DMA_PREP_PQ_DISABLE_P);
+ pq->ctl_f.q_disable = !!(flags & DMA_PREP_PQ_DISABLE_Q);
+
+ len -= xfer_size;
+ offset += xfer_size;
+ } while (++i < num_descs);
+
+ /* last pq descriptor carries the unmap parameters and fence bit */
+ desc->txd.flags = flags;
+ desc->len = total_len;
+ if (result)
+ desc->result = result;
+ pq->ctl_f.fence = !!(flags & DMA_PREP_FENCE);
+
+ /* with cb3.3 we should be able to do completion w/o a null desc */
+ pq->ctl_f.int_en = !!(flags & DMA_PREP_INTERRUPT);
+ pq->ctl_f.compl_write = 1;
+
+ dump_pq16_desc_dbg(ioat, desc);
+
+ /* we leave the channel locked to ensure in order submission */
+ return &desc->txd;
+}
+
+static struct dma_async_tx_descriptor *
ioat3_prep_pq(struct dma_chan *chan, dma_addr_t *dst, dma_addr_t *src,
unsigned int src_cnt, const unsigned char *scf, size_t len,
unsigned long flags)
{
+ struct dma_device *dma = chan->device;
+
/* specify valid address for disabled result */
if (flags & DMA_PREP_PQ_DISABLE_P)
dst[0] = dst[1];
@@ -974,11 +1249,20 @@ ioat3_prep_pq(struct dma_chan *chan, dma_addr_t *dst, dma_addr_t *src,
single_source_coef[0] = scf[0];
single_source_coef[1] = 0;

- return __ioat3_prep_pq_lock(chan, NULL, dst, single_source, 2,
- single_source_coef, len, flags);
- } else
- return __ioat3_prep_pq_lock(chan, NULL, dst, src, src_cnt, scf,
- len, flags);
+ return (src_cnt > 8) && (dma->max_pq > 8) ?
+ __ioat3_prep_pq16_lock(chan, NULL, dst, single_source,
+ 2, single_source_coef, len,
+ flags) :
+ __ioat3_prep_pq_lock(chan, NULL, dst, single_source, 2,
+ single_source_coef, len, flags);
+
+ } else {
+ return (src_cnt > 8) && (dma->max_pq > 8) ?
+ __ioat3_prep_pq16_lock(chan, NULL, dst, src, src_cnt,
+ scf, len, flags) :
+ __ioat3_prep_pq_lock(chan, NULL, dst, src, src_cnt,
+ scf, len, flags);
+ }
}

struct dma_async_tx_descriptor *
@@ -986,6 +1270,8 @@ ioat3_prep_pq_val(struct dma_chan *chan, dma_addr_t *pq, dma_addr_t *src,
unsigned int src_cnt, const unsigned char *scf, size_t len,
enum sum_check_flags *pqres, unsigned long flags)
{
+ struct dma_device *dma = chan->device;
+
/* specify valid address for disabled result */
if (flags & DMA_PREP_PQ_DISABLE_P)
pq[0] = pq[1];
@@ -997,14 +1283,18 @@ ioat3_prep_pq_val(struct dma_chan *chan, dma_addr_t *pq, dma_addr_t *src,
*/
*pqres = 0;

- return __ioat3_prep_pq_lock(chan, pqres, pq, src, src_cnt, scf, len,
- flags);
+ return (src_cnt > 8) && (dma->max_pq > 8) ?
+ __ioat3_prep_pq16_lock(chan, pqres, pq, src, src_cnt, scf, len,
+ flags) :
+ __ioat3_prep_pq_lock(chan, pqres, pq, src, src_cnt, scf, len,
+ flags);
}

static struct dma_async_tx_descriptor *
ioat3_prep_pqxor(struct dma_chan *chan, dma_addr_t dst, dma_addr_t *src,
unsigned int src_cnt, size_t len, unsigned long flags)
{
+ struct dma_device *dma = chan->device;
unsigned char scf[src_cnt];
dma_addr_t pq[2];

@@ -1013,8 +1303,11 @@ ioat3_prep_pqxor(struct dma_chan *chan, dma_addr_t dst, dma_addr_t *src,
flags |= DMA_PREP_PQ_DISABLE_Q;
pq[1] = dst; /* specify valid address for disabled result */

- return __ioat3_prep_pq_lock(chan, NULL, pq, src, src_cnt, scf, len,
- flags);
+ return (src_cnt > 8) && (dma->max_pq > 8) ?
+ __ioat3_prep_pq16_lock(chan, NULL, pq, src, src_cnt, scf, len,
+ flags) :
+ __ioat3_prep_pq_lock(chan, NULL, pq, src, src_cnt, scf, len,
+ flags);
}

struct dma_async_tx_descriptor *
@@ -1022,6 +1315,7 @@ ioat3_prep_pqxor_val(struct dma_chan *chan, dma_addr_t *src,
unsigned int src_cnt, size_t len,
enum sum_check_flags *result, unsigned long flags)
{
+ struct dma_device *dma = chan->device;
unsigned char scf[src_cnt];
dma_addr_t pq[2];

@@ -1035,8 +1329,12 @@ ioat3_prep_pqxor_val(struct dma_chan *chan, dma_addr_t *src,
flags |= DMA_PREP_PQ_DISABLE_Q;
pq[1] = pq[0]; /* specify valid address for disabled result */

- return __ioat3_prep_pq_lock(chan, result, pq, &src[1], src_cnt - 1, scf,
- len, flags);
+
+ return (src_cnt > 8) && (dma->max_pq > 8) ?
+ __ioat3_prep_pq16_lock(chan, result, pq, &src[1], src_cnt - 1,
+ scf, len, flags) :
+ __ioat3_prep_pq_lock(chan, result, pq, &src[1], src_cnt - 1,
+ scf, len, flags);
}

static struct dma_async_tx_descriptor *
@@ -1533,11 +1831,17 @@ int ioat3_dma_probe(struct ioatdma_device *device, int dca)

if (cap & IOAT_CAP_PQ) {
is_raid_device = true;
- dma_set_maxpq(dma, 8, 0);
- if (is_xeon_cb32(pdev))
- dma->pq_align = 6;
- else
+
+ if (cap & IOAT_CAP_RAID16SS) {
+ dma_set_maxpq(dma, 16, 0);
dma->pq_align = 0;
+ } else {
+ dma_set_maxpq(dma, 8, 0);
+ if (is_xeon_cb32(pdev))
+ dma->pq_align = 6;
+ else
+ dma->pq_align = 0;
+ }

dma_cap_set(DMA_PQ, dma->cap_mask);
dma->device_prep_dma_pq = ioat3_prep_pq;
@@ -1546,11 +1850,16 @@ int ioat3_dma_probe(struct ioatdma_device *device, int dca)
dma->device_prep_dma_pq_val = ioat3_prep_pq_val;

if (!(cap & IOAT_CAP_XOR)) {
- dma->max_xor = 8;
- if (is_xeon_cb32(pdev))
- dma->xor_align = 6;
- else
+ if (cap & IOAT_CAP_RAID16SS) {
+ dma->max_xor = 16;
dma->xor_align = 0;
+ } else {
+ dma->max_xor = 8;
+ if (is_xeon_cb32(pdev))
+ dma->xor_align = 6;
+ else
+ dma->xor_align = 0;
+ }

dma_cap_set(DMA_XOR, dma->cap_mask);
dma->device_prep_dma_xor = ioat3_prep_pqxor;
@@ -1578,6 +1887,30 @@ int ioat3_dma_probe(struct ioatdma_device *device, int dca)
dma->device_prep_dma_pq_val = NULL;
}

+ /* starting with CB3.3 super extended descriptors are supported */
+ if (cap & IOAT_CAP_RAID16SS) {
+ char pool_name[14];
+ int i;
+
+ /* allocate sw descriptor pool for SED */
+ device->sed_pool = kmem_cache_create("ioat_sed",
+ sizeof(struct ioat_sed_ent), 0, 0, NULL);
+ if (!device->sed_pool)
+ return -ENOMEM;
+
+ for (i = 0; i < MAX_SED_POOLS; i++) {
+ snprintf(pool_name, 14, "ioat_hw%d_sed", i);
+
+ /* allocate SED DMA pool */
+ device->sed_hw_pool[i] = dma_pool_create(pool_name,
+ &pdev->dev,
+ SED_SIZE * (i + 1), 64, 0);
+ if (!device->sed_hw_pool[i])
+ goto sed_pool_cleanup;
+
+ }
+ }
+
err = ioat_probe(device);
if (err)
return err;
@@ -1599,4 +1932,28 @@ int ioat3_dma_probe(struct ioatdma_device *device, int dca)
device->dca = ioat3_dca_init(pdev, device->reg_base);

return 0;
+
+sed_pool_cleanup:
+ if (device->sed_pool) {
+ int i;
+ kmem_cache_destroy(device->sed_pool);
+
+ for (i = 0; i < MAX_SED_POOLS; i++)
+ if (device->sed_hw_pool[i])
+ dma_pool_destroy(device->sed_hw_pool[i]);
+ }
+
+ return -ENOMEM;
+}
+
+void ioat3_dma_remove(struct ioatdma_device *device)
+{
+ if (device->sed_pool) {
+ int i;
+ kmem_cache_destroy(device->sed_pool);
+
+ for (i = 0; i < MAX_SED_POOLS; i++)
+ if (device->sed_hw_pool[i])
+ dma_pool_destroy(device->sed_hw_pool[i]);
+ }
}
diff --git a/drivers/dma/ioat/hw.h b/drivers/dma/ioat/hw.h
index ce431f5..d10570d 100644
--- a/drivers/dma/ioat/hw.h
+++ b/drivers/dma/ioat/hw.h
@@ -183,6 +183,8 @@ struct ioat_pq_descriptor {
unsigned int rsvd:11;
#define IOAT_OP_PQ 0x89
#define IOAT_OP_PQ_VAL 0x8a
+ #define IOAT_OP_PQ_16S 0xa0
+ #define IOAT_OP_PQ_VAL_16S 0xa1
unsigned int op:8;
} ctl_f;
};
@@ -190,7 +192,10 @@ struct ioat_pq_descriptor {
uint64_t p_addr;
uint64_t next;
uint64_t src_addr2;
- uint64_t src_addr3;
+ union {
+ uint64_t src_addr3;
+ uint64_t sed_addr;
+ };
uint8_t coef[8];
uint64_t q_addr;
};
@@ -239,4 +244,40 @@ struct ioat_pq_update_descriptor {
struct ioat_raw_descriptor {
uint64_t field[8];
};
+
+struct ioat_pq16a_descriptor {
+ uint8_t coef[8];
+ uint64_t src_addr3;
+ uint64_t src_addr4;
+ uint64_t src_addr5;
+ uint64_t src_addr6;
+ uint64_t src_addr7;
+ uint64_t src_addr8;
+ uint64_t src_addr9;
+};
+
+struct ioat_pq16b_descriptor {
+ uint64_t src_addr10;
+ uint64_t src_addr11;
+ uint64_t src_addr12;
+ uint64_t src_addr13;
+ uint64_t src_addr14;
+ uint64_t src_addr15;
+ uint64_t src_addr16;
+ uint64_t rsvd;
+};
+
+union ioat_sed_pq_descriptor {
+ struct ioat_pq16a_descriptor a;
+ struct ioat_pq16b_descriptor b;
+};
+
+#define SED_SIZE 64
+
+struct ioat_sed_raw_descriptor {
+ uint64_t a[8];
+ uint64_t b[8];
+ uint64_t c[8];
+};
+
#endif
diff --git a/drivers/dma/ioat/pci.c b/drivers/dma/ioat/pci.c
index 1f63296..2c8d560 100644
--- a/drivers/dma/ioat/pci.c
+++ b/drivers/dma/ioat/pci.c
@@ -207,6 +207,9 @@ static void ioat_remove(struct pci_dev *pdev)
if (!device)
return;

+ if (device->version >= IOAT_VER_3_0)
+ ioat3_dma_remove(device);
+
dev_err(&pdev->dev, "Removing dma and dca services\n");
if (device->dca) {
unregister_dca_provider(device->dca, &pdev->dev);
diff --git a/drivers/dma/ioat/registers.h b/drivers/dma/ioat/registers.h
index c1ad194..efdd47e 100644
--- a/drivers/dma/ioat/registers.h
+++ b/drivers/dma/ioat/registers.h
@@ -79,6 +79,7 @@
#define IOAT_CAP_APIC 0x00000080
#define IOAT_CAP_XOR 0x00000100
#define IOAT_CAP_PQ 0x00000200
+#define IOAT_CAP_RAID16SS 0x00020000

#define IOAT_CHANNEL_MMIO_SIZE 0x80 /* Each Channel MMIO space is this size */

2013-04-10 23:44:36

by Dave Jiang

Subject: [PATCH v2 3/5] ioatdma: S1200 platforms ioatdma channel 2 and 3 falsely advertise RAID cap

This workaround checks for channels 2 and 3 and removes the RAID capabilities.
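
A minimal userspace sketch of the masking is shown below. The capability bit
values match registers.h in this series, but the DEV_* identifiers are
placeholders rather than the real PCI device IDs, and is_noraid() and
effective_cap() are hypothetical names used only for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Capability bits, same values as registers.h in this series. */
#define CAP_XOR		0x00000100u
#define CAP_PQ		0x00000200u
#define CAP_RAID16SS	0x00020000u

/* Placeholder channel identifiers for the sketch, not real PCI IDs. */
enum { DEV_BWD0, DEV_BWD1, DEV_BWD2, DEV_BWD3 };

/* Channels 2 and 3 advertise RAID support they do not have. */
static bool is_noraid(int dev)
{
	return dev == DEV_BWD2 || dev == DEV_BWD3;
}

/* Capability word the driver should act on for a given channel. */
static uint32_t effective_cap(int dev, uint32_t cap)
{
	assert(dev >= DEV_BWD0 && dev <= DEV_BWD3);
	if (is_noraid(dev))
		cap &= ~(CAP_XOR | CAP_PQ | CAP_RAID16SS);
	return cap;
}
```

Only the three RAID-related bits are cleared; any other capability bits read
from the hardware pass through unchanged.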

Signed-off-by: Dave Jiang <[email protected]>
Acked-by: Dan Williams <[email protected]>
---
drivers/dma/ioat/dma_v3.c | 15 +++++++++++++++
1 file changed, 15 insertions(+)

diff --git a/drivers/dma/ioat/dma_v3.c b/drivers/dma/ioat/dma_v3.c
index 02afdb5..343320c 100644
--- a/drivers/dma/ioat/dma_v3.c
+++ b/drivers/dma/ioat/dma_v3.c
@@ -238,6 +238,18 @@ static bool is_bwd_ioat(struct pci_dev *pdev)
}
}

+static bool is_bwd_noraid(struct pci_dev *pdev)
+{
+ switch (pdev->device) {
+ case PCI_DEVICE_ID_INTEL_IOAT_BWD2:
+ case PCI_DEVICE_ID_INTEL_IOAT_BWD3:
+ return true;
+ default:
+ return false;
+ }
+
+}
+
static void pq16_set_src(struct ioat_raw_descriptor *desc[3],
dma_addr_t addr, u32 offset, u8 coef, int idx)
{
@@ -1813,6 +1825,9 @@ int ioat3_dma_probe(struct ioatdma_device *device, int dca)

cap = readl(device->reg_base + IOAT_DMA_CAP_OFFSET);

+ if (is_bwd_noraid(pdev))
+ cap &= ~(IOAT_CAP_XOR | IOAT_CAP_PQ | IOAT_CAP_RAID16SS);
+
/* dca is incompatible with raid operations */
if (dca_en && (cap & (IOAT_CAP_XOR|IOAT_CAP_PQ)))
cap &= ~(IOAT_CAP_XOR|IOAT_CAP_PQ);

2013-04-10 23:44:44

by Dave Jiang

Subject: [PATCH v2 4/5] ioatdma: Adding write back descriptor error status support for ioatdma 3.3

v3.3 provides support for write back descriptor error status. This allows
errors to be reported in a descriptor field. With this support, certain
errors such as P/Q validation errors no longer halt the channel. The DMA
engine can continue to execute until the end of the chain, and software can
then report the "errors" up the stack. We also mask those error interrupts
and handle the errors once the "chain" has completed.
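
The cleanup-time handling can be sketched as follows. This is a userspace
model under stated assumptions: struct wb_status is a hypothetical stand-in
for the write back status bits of a PQ descriptor (the real bit layout is
defined in hw.h by this patch), and the CHECK_* flags are modeled after the
P and Q result flags that the driver sets in *desc->result.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Result flags for the sketch, modeled after the P/Q check results. */
enum { CHECK_P_FAILED = 1 << 0, CHECK_Q_FAILED = 1 << 1 };

/* Hypothetical model of the status written back into a descriptor. */
struct wb_status {
	bool written;	/* hardware wrote an error status */
	bool p_val_err;	/* P validation failed */
	bool q_val_err;	/* Q validation failed */
};

/*
 * Fold a descriptor's write back status into the caller's result
 * flags. If nothing was written back, the result is unchanged.
 */
static unsigned int wb_to_result(const struct wb_status *st,
				 unsigned int result)
{
	assert(st != NULL);
	if (!st->written)
		return result;
	if (st->p_val_err)
		result |= CHECK_P_FAILED;
	if (st->q_val_err)
		result |= CHECK_Q_FAILED;
	return result;
}
```

Because the status travels with the descriptor, it can be read during normal
cleanup instead of in a halted-channel error handler.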

Signed-off-by: Dave Jiang <[email protected]>
---
drivers/dma/ioat/dma.h | 1
drivers/dma/ioat/dma_v3.c | 111 +++++++++++++++++++++++++++++++++---------
drivers/dma/ioat/hw.h | 17 ++++++
drivers/dma/ioat/registers.h | 1
4 files changed, 105 insertions(+), 25 deletions(-)

diff --git a/drivers/dma/ioat/dma.h b/drivers/dma/ioat/dma.h
index 35d7402..54fb7b9 100644
--- a/drivers/dma/ioat/dma.h
+++ b/drivers/dma/ioat/dma.h
@@ -90,6 +90,7 @@ struct ioatdma_device {
struct ioat_chan_common *idx[4];
struct dca_provider *dca;
enum ioat_irq_mode irq_mode;
+ u32 cap;
void (*intr_quirk)(struct ioatdma_device *device);
int (*enumerate_channels)(struct ioatdma_device *device);
int (*reset_hw)(struct ioat_chan_common *chan);
diff --git a/drivers/dma/ioat/dma_v3.c b/drivers/dma/ioat/dma_v3.c
index 343320c..2802c92 100644
--- a/drivers/dma/ioat/dma_v3.c
+++ b/drivers/dma/ioat/dma_v3.c
@@ -515,6 +515,36 @@ static bool ioat3_cleanup_preamble(struct ioat_chan_common *chan,
return true;
}

+static void
+desc_get_errstat(struct ioat2_dma_chan *ioat, struct ioat_ring_ent *desc)
+{
+ struct ioat_dma_descriptor *hw = desc->hw;
+
+ switch (hw->ctl_f.op) {
+ case IOAT_OP_PQ_VAL:
+ case IOAT_OP_PQ_VAL_16S:
+ {
+ struct ioat_pq_descriptor *pq = desc->pq;
+
+ /* check if there's error written */
+ if (!pq->dwbes_f.wbes)
+ return;
+
+ /* need to set a chanerr var for checking to clear later */
+
+ if (pq->dwbes_f.p_val_err)
+ *desc->result |= SUM_CHECK_P_RESULT;
+
+ if (pq->dwbes_f.q_val_err)
+ *desc->result |= SUM_CHECK_Q_RESULT;
+
+ return;
+ }
+ default:
+ return;
+ }
+}
+
/**
* __cleanup - reclaim used descriptors
* @ioat: channel (ring) to clean
@@ -552,6 +582,11 @@ static void __cleanup(struct ioat2_dma_chan *ioat, dma_addr_t phys_complete)
prefetch(ioat2_get_ring_ent(ioat, idx + i + 1));
desc = ioat2_get_ring_ent(ioat, idx + i);
dump_desc_dbg(ioat, desc);
+
+ /* set err stat if we are using dwbes */
+ if (device->cap & IOAT_CAP_DWBES)
+ desc_get_errstat(ioat, desc);
+
tx = &desc->txd;
if (tx->cookie) {
dma_cookie_complete(tx);
@@ -1095,6 +1130,9 @@ __ioat3_prep_pq_lock(struct dma_chan *c, enum sum_check_flags *result,
pq->q_addr = dst[1] + offset;
pq->ctl = 0;
pq->ctl_f.op = op;
+ /* we turn on descriptor write back error status */
+ if (device->cap & IOAT_CAP_DWBES)
+ pq->ctl_f.wb_en = result ? 1 : 0;
pq->ctl_f.src_cnt = src_cnt_to_hw(s);
pq->ctl_f.p_disable = !!(flags & DMA_PREP_PQ_DISABLE_P);
pq->ctl_f.q_disable = !!(flags & DMA_PREP_PQ_DISABLE_Q);
@@ -1211,6 +1249,9 @@ __ioat3_prep_pq16_lock(struct dma_chan *c, enum sum_check_flags *result,
pq->ctl = 0;
pq->ctl_f.op = op;
pq->ctl_f.src_cnt = src16_cnt_to_hw(s);
+ /* we turn on descriptor write back error status */
+ if (device->cap & IOAT_CAP_DWBES)
+ pq->ctl_f.wb_en = result ? 1 : 0;
pq->ctl_f.p_disable = !!(flags & DMA_PREP_PQ_DISABLE_P);
pq->ctl_f.q_disable = !!(flags & DMA_PREP_PQ_DISABLE_Q);

@@ -1797,6 +1838,32 @@ static int ioat3_reset_hw(struct ioat_chan_common *chan)
return err;
}

+static void ioat3_intr_quirk(struct ioatdma_device *device)
+{
+ struct dma_device *dma;
+ struct dma_chan *c;
+ struct ioat_chan_common *chan;
+ u32 errmask;
+
+ dma = &device->common;
+
+ /*
+ * if we have descriptor write back error status, we mask the
+ * error interrupts
+ */
+ if (device->cap & IOAT_CAP_DWBES) {
+ list_for_each_entry(c, &dma->channels, device_node) {
+ chan = to_chan_common(c);
+ errmask = readl(chan->reg_base +
+ IOAT_CHANERR_MASK_OFFSET);
+ errmask |= IOAT_CHANERR_XOR_P_OR_CRC_ERR |
+ IOAT_CHANERR_XOR_Q_ERR;
+ writel(errmask, chan->reg_base +
+ IOAT_CHANERR_MASK_OFFSET);
+ }
+ }
+}
+
int ioat3_dma_probe(struct ioatdma_device *device, int dca)
{
struct pci_dev *pdev = device->pdev;
@@ -1806,11 +1873,11 @@ int ioat3_dma_probe(struct ioatdma_device *device, int dca)
struct ioat_chan_common *chan;
bool is_raid_device = false;
int err;
- u32 cap;

device->enumerate_channels = ioat2_enumerate_channels;
device->reset_hw = ioat3_reset_hw;
device->self_test = ioat3_dma_self_test;
+ device->intr_quirk = ioat3_intr_quirk;
dma = &device->common;
dma->device_prep_dma_memcpy = ioat2_dma_prep_memcpy_lock;
dma->device_issue_pending = ioat2_issue_pending;
@@ -1823,16 +1890,16 @@ int ioat3_dma_probe(struct ioatdma_device *device, int dca)
dma_cap_set(DMA_INTERRUPT, dma->cap_mask);
dma->device_prep_dma_interrupt = ioat3_prep_interrupt_lock;

- cap = readl(device->reg_base + IOAT_DMA_CAP_OFFSET);
+ device->cap = readl(device->reg_base + IOAT_DMA_CAP_OFFSET);

if (is_bwd_noraid(pdev))
- cap &= ~(IOAT_CAP_XOR | IOAT_CAP_PQ | IOAT_CAP_RAID16SS);
+ device->cap &= ~(IOAT_CAP_XOR | IOAT_CAP_PQ | IOAT_CAP_RAID16SS);

/* dca is incompatible with raid operations */
- if (dca_en && (cap & (IOAT_CAP_XOR|IOAT_CAP_PQ)))
- cap &= ~(IOAT_CAP_XOR|IOAT_CAP_PQ);
+ if (dca_en && (device->cap & (IOAT_CAP_XOR|IOAT_CAP_PQ)))
+ device->cap &= ~(IOAT_CAP_XOR|IOAT_CAP_PQ);

- if (cap & IOAT_CAP_XOR) {
+ if (device->cap & IOAT_CAP_XOR) {
is_raid_device = true;
dma->max_xor = 8;
dma->xor_align = 6;
@@ -1844,10 +1911,15 @@ int ioat3_dma_probe(struct ioatdma_device *device, int dca)
dma->device_prep_dma_xor_val = ioat3_prep_xor_val;
}

- if (cap & IOAT_CAP_PQ) {
+ if (device->cap & IOAT_CAP_PQ) {
is_raid_device = true;

- if (cap & IOAT_CAP_RAID16SS) {
+ dma->device_prep_dma_pq = ioat3_prep_pq;
+ dma->device_prep_dma_pq_val = ioat3_prep_pq_val;
+ dma_cap_set(DMA_PQ, dma->cap_mask);
+ dma_cap_set(DMA_PQ_VAL, dma->cap_mask);
+
+ if (device->cap & IOAT_CAP_RAID16SS) {
dma_set_maxpq(dma, 16, 0);
dma->pq_align = 0;
} else {
@@ -1858,14 +1930,13 @@ int ioat3_dma_probe(struct ioatdma_device *device, int dca)
dma->pq_align = 0;
}

- dma_cap_set(DMA_PQ, dma->cap_mask);
- dma->device_prep_dma_pq = ioat3_prep_pq;
-
- dma_cap_set(DMA_PQ_VAL, dma->cap_mask);
- dma->device_prep_dma_pq_val = ioat3_prep_pq_val;
+ if (!(device->cap & IOAT_CAP_XOR)) {
+ dma->device_prep_dma_xor = ioat3_prep_pqxor;
+ dma->device_prep_dma_xor_val = ioat3_prep_pqxor_val;
+ dma_cap_set(DMA_XOR, dma->cap_mask);
+ dma_cap_set(DMA_XOR_VAL, dma->cap_mask);

- if (!(cap & IOAT_CAP_XOR)) {
- if (cap & IOAT_CAP_RAID16SS) {
+ if (device->cap & IOAT_CAP_RAID16SS) {
dma->max_xor = 16;
dma->xor_align = 0;
} else {
@@ -1875,16 +1946,10 @@ int ioat3_dma_probe(struct ioatdma_device *device, int dca)
else
dma->xor_align = 0;
}
-
- dma_cap_set(DMA_XOR, dma->cap_mask);
- dma->device_prep_dma_xor = ioat3_prep_pqxor;
-
- dma_cap_set(DMA_XOR_VAL, dma->cap_mask);
- dma->device_prep_dma_xor_val = ioat3_prep_pqxor_val;
}
}

- if (is_raid_device && (cap & IOAT_CAP_FILL_BLOCK)) {
+ if (is_raid_device && (device->cap & IOAT_CAP_FILL_BLOCK)) {
dma_cap_set(DMA_MEMSET, dma->cap_mask);
dma->device_prep_dma_memset = ioat3_prep_memset_lock;
}
@@ -1903,7 +1968,7 @@ int ioat3_dma_probe(struct ioatdma_device *device, int dca)
}

/* starting with CB3.3 super extended descriptors are supported */
- if (cap & IOAT_CAP_RAID16SS) {
+ if (device->cap & IOAT_CAP_RAID16SS) {
char pool_name[14];
int i;

diff --git a/drivers/dma/ioat/hw.h b/drivers/dma/ioat/hw.h
index d10570d..5ee57d4 100644
--- a/drivers/dma/ioat/hw.h
+++ b/drivers/dma/ioat/hw.h
@@ -165,7 +165,17 @@ struct ioat_xor_ext_descriptor {
};

struct ioat_pq_descriptor {
- uint32_t size;
+ union {
+ uint32_t size;
+ uint32_t dwbes;
+ struct {
+ unsigned int rsvd:25;
+ unsigned int p_val_err:1;
+ unsigned int q_val_err:1;
+ unsigned int rsvd1:4;
+ unsigned int wbes:1;
+ } dwbes_f;
+ };
union {
uint32_t ctl;
struct {
@@ -180,7 +190,10 @@ struct ioat_pq_descriptor {
unsigned int hint:1;
unsigned int p_disable:1;
unsigned int q_disable:1;
- unsigned int rsvd:11;
+ unsigned int rsvd2:2;
+ unsigned int wb_en:1;
+ unsigned int prl_en:1;
+ unsigned int rsvd3:7;
#define IOAT_OP_PQ 0x89
#define IOAT_OP_PQ_VAL 0x8a
#define IOAT_OP_PQ_16S 0xa0
diff --git a/drivers/dma/ioat/registers.h b/drivers/dma/ioat/registers.h
index efdd47e..2f1cfa0 100644
--- a/drivers/dma/ioat/registers.h
+++ b/drivers/dma/ioat/registers.h
@@ -79,6 +79,7 @@
#define IOAT_CAP_APIC 0x00000080
#define IOAT_CAP_XOR 0x00000100
#define IOAT_CAP_PQ 0x00000200
+#define IOAT_CAP_DWBES 0x00002000
#define IOAT_CAP_RAID16SS 0x00020000

#define IOAT_CHANNEL_MMIO_SIZE 0x80 /* Each Channel MMIO space is this size */
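
Editorial sketch (not part of the patch): the hunks above enable write back
with wb_en when the caller passes a result pointer and the device advertises
IOAT_CAP_DWBES, and they add the dwbes_f bitfield for the status the hardware
writes back. The userspace model below shows one way that status word could be
decoded. The bit positions assume LSB-first bitfield allocation (as GCC does
on x86), the demo_* names and DEMO_* flags are hypothetical, and treating a
clear wbes bit as "no status written" is an assumption rather than something
this patch states.

```c
#include <assert.h>	/* for callers that want to assert on the results */
#include <stdint.h>

/* Capability bit as defined in registers.h by the patch. */
#define IOAT_CAP_DWBES		0x00002000

/*
 * Assumed bit positions, derived from the dwbes_f bitfield in hw.h:
 * rsvd[24:0], p_val_err[25], q_val_err[26], rsvd1[30:27], wbes[31].
 */
#define DWBES_P_VAL_ERR		(UINT32_C(1) << 25)
#define DWBES_Q_VAL_ERR		(UINT32_C(1) << 26)
#define DWBES_WBES		(UINT32_C(1) << 31)

/* Hypothetical result flags, not the kernel's enum sum_check_flags. */
#define DEMO_CHECK_P_FAILED	0x1u
#define DEMO_CHECK_Q_FAILED	0x2u

/* Mirrors "if (cap & IOAT_CAP_DWBES) wb_en = result ? 1 : 0". */
unsigned int demo_wb_en(uint32_t cap, int have_result)
{
	if (!(cap & IOAT_CAP_DWBES))
		return 0;
	return have_result ? 1 : 0;
}

/* Translate a written-back status word into the demo result flags. */
unsigned int demo_decode_dwbes(uint32_t dwbes)
{
	unsigned int result = 0;

	/* Assumption: without wbes set, no status has been written back. */
	if (!(dwbes & DWBES_WBES))
		return 0;

	if (dwbes & DWBES_P_VAL_ERR)
		result |= DEMO_CHECK_P_FAILED;
	if (dwbes & DWBES_Q_VAL_ERR)
		result |= DEMO_CHECK_Q_FAILED;

	return result;
}
```

For example, demo_decode_dwbes(DWBES_WBES | DWBES_Q_VAL_ERR) yields
DEMO_CHECK_Q_FAILED, while the same error bit without DWBES_WBES yields 0.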

2013-04-10 23:45:19

by Dave Jiang

Subject: [PATCH v2 5/5] async_tx: allow generic async_memcpy() not be effected by channel switch

This adds a generic async_memcpy() for DMA engines that cannot do channel
switching. Previously the DMA_ASYNC_TX check would exclude all DMA engines
that do not have equal capabilities for all ops. The RAID version of async
memcpy, async_raid_memcpy(), only requests RAID capable channels. This allows
us to remove the ifdef for the channel switching fixup.

Signed-off-by: Dave Jiang <[email protected]>
---
crypto/async_tx/async_memcpy.c | 76 +++++++++++++++++++++++++++++++---------
drivers/md/raid5.c | 15 +++++---
include/linux/async_tx.h | 5 +++
include/linux/dmaengine.h | 34 ------------------
4 files changed, 73 insertions(+), 57 deletions(-)

diff --git a/crypto/async_tx/async_memcpy.c b/crypto/async_tx/async_memcpy.c
index 9e62fef..3cecb49 100644
--- a/crypto/async_tx/async_memcpy.c
+++ b/crypto/async_tx/async_memcpy.c
@@ -30,24 +30,11 @@
#include <linux/dma-mapping.h>
#include <linux/async_tx.h>

-/**
- * async_memcpy - attempt to copy memory with a dma engine.
- * @dest: destination page
- * @src: src page
- * @dest_offset: offset into 'dest' to start transaction
- * @src_offset: offset into 'src' to start transaction
- * @len: length in bytes
- * @submit: submission / completion modifiers
- *
- * honored flags: ASYNC_TX_ACK
- */
-struct dma_async_tx_descriptor *
-async_memcpy(struct page *dest, struct page *src, unsigned int dest_offset,
- unsigned int src_offset, size_t len,
- struct async_submit_ctl *submit)
+static struct dma_async_tx_descriptor *
+__async_memcpy(struct dma_chan *chan, struct page *dest, struct page *src,
+ unsigned int dest_offset, unsigned int src_offset, size_t len,
+ struct async_submit_ctl *submit)
{
- struct dma_chan *chan = async_tx_find_channel(submit, DMA_MEMCPY,
- &dest, 1, &src, 1, len);
struct dma_device *device = chan ? chan->device : NULL;
struct dma_async_tx_descriptor *tx = NULL;

@@ -98,8 +85,63 @@ async_memcpy(struct page *dest, struct page *src, unsigned int dest_offset,

return tx;
}
+
+/**
+ * async_memcpy - attempt to copy memory with a dma engine.
+ * @dest: destination page
+ * @src: src page
+ * @dest_offset: offset into 'dest' to start transaction
+ * @src_offset: offset into 'src' to start transaction
+ * @len: length in bytes
+ * @submit: submission / completion modifiers
+ *
+ * The function will grab any channel that has DMA_MEMCPY cap. This allows
+ * generic DMA memcpy without having to worry about channel switch issues.
+ */
+struct dma_async_tx_descriptor *
+async_memcpy(struct page *dest, struct page *src, unsigned int dest_offset,
+ unsigned int src_offset, size_t len,
+ struct async_submit_ctl *submit)
+{
+ struct dma_chan *chan;
+ struct dma_async_tx_descriptor *depend_tx = submit->depend_tx;
+
+ if (depend_tx &&
+ dma_has_cap(DMA_MEMCPY, depend_tx->chan->device->cap_mask))
+ chan = depend_tx->chan;
+ else
+ chan = dma_find_channel(DMA_MEMCPY);
+
+ return __async_memcpy(chan, dest, src, dest_offset, src_offset,
+ len, submit);
+}
EXPORT_SYMBOL_GPL(async_memcpy);

+
+/**
+ * async_raid_memcpy - attempt to copy memory with a dma engine.
+ * @dest: destination page
+ * @src: src page
+ * @dest_offset: offset into 'dest' to start transaction
+ * @src_offset: offset into 'src' to start transaction
+ * @len: length in bytes
+ * @submit: submission / completion modifiers
+ *
+ * honored flags: ASYNC_TX_ACK
+ */
+struct dma_async_tx_descriptor *
+async_raid_memcpy(struct page *dest, struct page *src, unsigned int dest_offset,
+ unsigned int src_offset, size_t len,
+ struct async_submit_ctl *submit)
+{
+ struct dma_chan *chan = async_tx_find_channel(submit, DMA_XOR,
+ &dest, 1, &src, 1, len);
+
+ return __async_memcpy(chan, dest, src, dest_offset, src_offset,
+ len, submit);
+}
+EXPORT_SYMBOL_GPL(async_raid_memcpy);
+
MODULE_AUTHOR("Intel Corporation");
MODULE_DESCRIPTION("asynchronous memcpy api");
MODULE_LICENSE("GPL");
diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 24909eb..8779266 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -760,10 +760,11 @@ async_copy_data(int frombio, struct bio *bio, struct page *page,
b_offset += bvl->bv_offset;
bio_page = bvl->bv_page;
if (frombio)
- tx = async_memcpy(page, bio_page, page_offset,
- b_offset, clen, &submit);
+ tx = async_raid_memcpy(page, bio_page,
+ page_offset, b_offset,
+ clen, &submit);
else
- tx = async_memcpy(bio_page, page, b_offset,
+ tx = async_raid_memcpy(bio_page, page, b_offset,
page_offset, clen, &submit);
}
/* chain the operations */
@@ -915,7 +916,8 @@ ops_run_compute5(struct stripe_head *sh, struct raid5_percpu *percpu)
init_async_submit(&submit, ASYNC_TX_FENCE|ASYNC_TX_XOR_ZERO_DST, NULL,
ops_complete_compute, sh, to_addr_conv(sh, percpu));
if (unlikely(count == 1))
- tx = async_memcpy(xor_dest, xor_srcs[0], 0, 0, STRIPE_SIZE, &submit);
+ tx = async_raid_memcpy(xor_dest, xor_srcs[0], 0, 0,
+ STRIPE_SIZE, &submit);
else
tx = async_xor(xor_dest, xor_srcs, 0, count, STRIPE_SIZE, &submit);

@@ -1302,7 +1304,8 @@ ops_run_reconstruct5(struct stripe_head *sh, struct raid5_percpu *percpu,
init_async_submit(&submit, flags, tx, ops_complete_reconstruct, sh,
to_addr_conv(sh, percpu));
if (unlikely(count == 1))
- tx = async_memcpy(xor_dest, xor_srcs[0], 0, 0, STRIPE_SIZE, &submit);
+ tx = async_raid_memcpy(xor_dest, xor_srcs[0], 0, 0,
+ STRIPE_SIZE, &submit);
else
tx = async_xor(xor_dest, xor_srcs, 0, count, STRIPE_SIZE, &submit);
}
@@ -3211,7 +3214,7 @@ static void handle_stripe_expansion(struct r5conf *conf, struct stripe_head *sh)

/* place all the copies on one channel */
init_async_submit(&submit, 0, tx, NULL, NULL, NULL);
- tx = async_memcpy(sh2->dev[dd_idx].page,
+ tx = async_raid_memcpy(sh2->dev[dd_idx].page,
sh->dev[i].page, 0, 0, STRIPE_SIZE,
&submit);

diff --git a/include/linux/async_tx.h b/include/linux/async_tx.h
index a1c486a..ce9be3d 100644
--- a/include/linux/async_tx.h
+++ b/include/linux/async_tx.h
@@ -183,6 +183,11 @@ async_memcpy(struct page *dest, struct page *src, unsigned int dest_offset,
struct async_submit_ctl *submit);

struct dma_async_tx_descriptor *
+async_raid_memcpy(struct page *dest, struct page *src, unsigned int dest_offset,
+ unsigned int src_offset, size_t len,
+ struct async_submit_ctl *submit);
+
+struct dma_async_tx_descriptor *
async_memset(struct page *dest, int val, unsigned int offset,
size_t len, struct async_submit_ctl *submit);

diff --git a/include/linux/dmaengine.h b/include/linux/dmaengine.h
index dd6d21b..a4d30e8 100644
--- a/include/linux/dmaengine.h
+++ b/include/linux/dmaengine.h
@@ -417,40 +417,11 @@ struct dma_async_tx_descriptor {
dma_cookie_t (*tx_submit)(struct dma_async_tx_descriptor *tx);
dma_async_tx_callback callback;
void *callback_param;
-#ifdef CONFIG_ASYNC_TX_ENABLE_CHANNEL_SWITCH
struct dma_async_tx_descriptor *next;
struct dma_async_tx_descriptor *parent;
spinlock_t lock;
-#endif
};

-#ifndef CONFIG_ASYNC_TX_ENABLE_CHANNEL_SWITCH
-static inline void txd_lock(struct dma_async_tx_descriptor *txd)
-{
-}
-static inline void txd_unlock(struct dma_async_tx_descriptor *txd)
-{
-}
-static inline void txd_chain(struct dma_async_tx_descriptor *txd, struct dma_async_tx_descriptor *next)
-{
- BUG();
-}
-static inline void txd_clear_parent(struct dma_async_tx_descriptor *txd)
-{
-}
-static inline void txd_clear_next(struct dma_async_tx_descriptor *txd)
-{
-}
-static inline struct dma_async_tx_descriptor *txd_next(struct dma_async_tx_descriptor *txd)
-{
- return NULL;
-}
-static inline struct dma_async_tx_descriptor *txd_parent(struct dma_async_tx_descriptor *txd)
-{
- return NULL;
-}
-
-#else
static inline void txd_lock(struct dma_async_tx_descriptor *txd)
{
spin_lock_bh(&txd->lock);
@@ -480,7 +451,6 @@ static inline struct dma_async_tx_descriptor *txd_next(struct dma_async_tx_descr
{
return txd->next;
}
-#endif

/**
* struct dma_tx_state - filled in to report the status of
@@ -820,11 +790,7 @@ static inline void net_dmaengine_put(void)
#ifdef CONFIG_ASYNC_TX_DMA
#define async_dmaengine_get() dmaengine_get()
#define async_dmaengine_put() dmaengine_put()
-#ifndef CONFIG_ASYNC_TX_ENABLE_CHANNEL_SWITCH
-#define async_dma_find_channel(type) dma_find_channel(DMA_ASYNC_TX)
-#else
#define async_dma_find_channel(type) dma_find_channel(type)
-#endif /* CONFIG_ASYNC_TX_ENABLE_CHANNEL_SWITCH */
#else
static inline void async_dmaengine_get(void)
{
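
Editorial sketch (not part of the patch): the channel selection order in the
patched async_memcpy() can be modelled in userspace as below. If the
dependency's channel can do memcpy, the copy stays on that channel. Otherwise
any memcpy capable channel is used. All demo_* types and functions are
hypothetical stand-ins for dma_chan, dma_async_tx_descriptor, dma_has_cap()
and dma_find_channel(). The model uses a plain bitmask where the kernel uses
dma_cap_mask_t.

```c
#include <assert.h>	/* for callers that want to assert on the results */
#include <stddef.h>

/* Hypothetical capability bits for the model. */
#define DEMO_CAP_MEMCPY	0x1u
#define DEMO_CAP_XOR	0x2u

struct demo_chan {
	const char *name;
	unsigned int caps;
};

struct demo_tx {
	struct demo_chan *chan;
};

/* Stand-in for dma_find_channel(): first channel advertising 'cap'. */
struct demo_chan *demo_find_channel(struct demo_chan *chans, size_t n,
				    unsigned int cap)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (chans[i].caps & cap)
			return &chans[i];
	return NULL;
}

/*
 * Mirrors the selection in the patched async_memcpy(). The extra NULL
 * check on depend_tx->chan exists only to keep the model safe.
 */
struct demo_chan *demo_pick_memcpy_chan(const struct demo_tx *depend_tx,
					struct demo_chan *chans, size_t n)
{
	if (depend_tx && depend_tx->chan &&
	    (depend_tx->chan->caps & DEMO_CAP_MEMCPY))
		return depend_tx->chan;

	return demo_find_channel(chans, n, DEMO_CAP_MEMCPY);
}
```

In this model, a dependency on an XOR-only channel falls back to the first
memcpy capable channel, which is the case where the copy runs on a different
channel than the operation it depends on.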

2013-04-12 01:05:00

by Dan Williams

Subject: Re: [PATCH v2 0/5] ioatdma: Intel S1200 support patches

On Wed, Apr 10, 2013 at 4:44 PM, Dave Jiang <[email protected]> wrote:
> These are the updated patches from first submission series and rebased against
> vinod's slave-dma git tree for-linus branch.
>
> Patches 1 & 4 have been updated after discussion with Dan.

Patches 1-4 acked.

> Patch 5 was acked by
> Dan but requires additional review by Dan. I had to make additional modification
> to get raid6test working.

Yeah, we need to find a way to support this without allowing silent
channel switching on engines that can't support it.

--
Dan

2013-04-15 17:47:37

by Vinod Koul

Subject: Re: [PATCH v2 0/5] ioatdma: Intel S1200 support patches

On Wed, Apr 10, 2013 at 04:44:14PM -0700, Dave Jiang wrote:
> These are the updated patches from first submission series and rebased against
> vinod's slave-dma git tree for-linus branch.
>
> Patches 1 & 4 have been updated after discussion with Dan. Patch 5 was acked by
> Dan but requires additional review by Dan. I had to make additional modification
> to get raid6test working.
Applied 1, 3, 4 and v3

--
~Vinod
>
> ---
>
> Dave Jiang (5):
> ioatdma: Removing hw bug workaround for CB3.x .2 and earlier
> ioatdma: Adding support for 16 src PQ ops and super extended descriptors
> ioatdma: S1200 platforms ioatdma channel 2 and 3 falsely advertise RAID cap
> ioatdma: Adding write back descriptor error status support for ioatdma 3.3
> async_tx: allow generic async_memcpy() not be effected by channel switch
>
>
> crypto/async_tx/async_memcpy.c | 76 ++++--
> drivers/dma/ioat/dma.h | 18 +
> drivers/dma/ioat/dma_v2.h | 2
> drivers/dma/ioat/dma_v3.c | 538 +++++++++++++++++++++++++++++++++++++---
> drivers/dma/ioat/hw.h | 60 ++++
> drivers/dma/ioat/pci.c | 3
> drivers/dma/ioat/registers.h | 2
> drivers/md/raid5.c | 15 +
> include/linux/async_tx.h | 5
> include/linux/dmaengine.h | 34 ---
> 10 files changed, 647 insertions(+), 106 deletions(-)