2012-05-25 16:09:05

by Phil Sutter

Subject: RFC: support for MV_CESA with TDMA

Hi,

The following patch series adds support for the TDMA engine built into
Marvell's Kirkwood-based SoCs, and enhances mv_cesa.c to use it for
speeding up crypto operations. Kirkwood hardware contains a security
accelerator which can control the DMA as well as the crypto engine,
allowing operation with minimal software intervention. The following
patches implement exactly that: using a chain of DMA descriptors, data
input, configuration, engine startup and data output repeat fully
automatically until the whole input data has been handled.
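
In pseudo code, a single request then turns into building one DMA
descriptor chain and triggering it once (a sketch using the function
names introduced by the later patches, with error handling omitted; see
the actual hunks for the verbatim implementation):

	setup_data_in();		/* chain: gather input data into SRAM */
	mv_init_crypt_config(req);	/* chain: copy full config to SRAM */
	mv_tdma_separator();		/* hand-over point: crypto engine runs */
	dma_copy_buf_to_dst(&cpg->p,	/* chain: scatter output data */
			cpg->sram_phys + SRAM_DATA_OUT_START,
			cpg->p.crypt_len);
	cpg->p.hw_processed_bytes += cpg->p.crypt_len;
	while (cpg->p.hw_processed_bytes < cpg->p.hw_nbytes) {
		cpg->p.crypt_len = 0;
		setup_data_in();
		mv_update_crypt_config();	/* patch enc_len only */
		mv_tdma_separator();
		dma_copy_buf_to_dst(&cpg->p,
				cpg->sram_phys + SRAM_DATA_OUT_START,
				cpg->p.crypt_len);
		cpg->p.hw_processed_bytes += cpg->p.crypt_len;
	}
	/* GO */
	mv_setup_timer();
	mv_tdma_trigger();
	writel(SEC_CMD_EN_SEC_ACCL0, cpg->reg + SEC_ACCEL_CMD);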

The reason for this being an RFC is backwards compatibility: earlier
hardware (Orion) ships a (slightly) different DMA engine (IDMA) along
with the same crypto engine, so in fact mv_cesa.c is in use on these
platforms, too. But since I don't possess hardware of this kind, I am
not able to make this code IDMA-compatible. Also, due to the quite
massive reorganisation of code flow, I don't really see how to make TDMA
support optional in mv_cesa.c.

Greetings, Phil


2012-05-25 16:08:53

by Phil Sutter

Subject: [PATCH 11/13] mv_cesa: implement descriptor chaining for hashes, too


Signed-off-by: Phil Sutter <[email protected]>
---
drivers/crypto/mv_cesa.c | 89 ++++++++++++++++++----------------------------
1 files changed, 35 insertions(+), 54 deletions(-)

diff --git a/drivers/crypto/mv_cesa.c b/drivers/crypto/mv_cesa.c
index 5dba9df..9afed2d 100644
--- a/drivers/crypto/mv_cesa.c
+++ b/drivers/crypto/mv_cesa.c
@@ -536,34 +536,14 @@ static void mv_init_hash_config(struct ahash_request *req)
sizeof(struct sec_accel_sram), DMA_TO_DEVICE);
mv_tdma_memcpy(cpg->sram_phys + SRAM_CONFIG, cpg->sa_sram_dma,
sizeof(struct sec_accel_sram));
-
- mv_tdma_separator();
-
- if (req->result) {
- req_ctx->result_dma = dma_map_single(cpg->dev, req->result,
- req_ctx->digestsize, DMA_FROM_DEVICE);
- mv_tdma_memcpy(req_ctx->result_dma,
- cpg->sram_phys + SRAM_DIGEST_BUF, req_ctx->digestsize);
- } else {
- /* XXX: this fixes some ugly register fuckup bug in the tdma engine
- * (no need to sync since the data is ignored anyway) */
- mv_tdma_memcpy(cpg->sa_sram_dma,
- cpg->sram_phys + SRAM_CONFIG, 1);
- }
-
- /* GO */
- mv_setup_timer();
- mv_tdma_trigger();
- writel(SEC_CMD_EN_SEC_ACCL0, cpg->reg + SEC_ACCEL_CMD);
}

-static void mv_update_hash_config(void)
+static void mv_update_hash_config(struct ahash_request *req)
{
- struct ahash_request *req = ahash_request_cast(cpg->cur_req);
struct mv_req_hash_ctx *req_ctx = ahash_request_ctx(req);
struct req_progress *p = &cpg->p;
- struct sec_accel_config *op = &cpg->sa_sram.op;
int is_last;
+ u32 val;

/* update only the config (for changed fragment state) and
* mac_digest (for changed frag len) fields */
@@ -571,10 +551,10 @@ static void mv_update_hash_config(void)
switch (req_ctx->op) {
case COP_SHA1:
default:
- op->config = CFG_OP_MAC_ONLY | CFG_MACM_SHA1;
+ val = CFG_OP_MAC_ONLY | CFG_MACM_SHA1;
break;
case COP_HMAC_SHA1:
- op->config = CFG_OP_MAC_ONLY | CFG_MACM_HMAC_SHA1;
+ val = CFG_OP_MAC_ONLY | CFG_MACM_HMAC_SHA1;
break;
}

@@ -582,36 +562,11 @@ static void mv_update_hash_config(void)
&& (p->hw_processed_bytes + p->crypt_len >= p->hw_nbytes)
&& (req_ctx->count <= MAX_HW_HASH_SIZE);

- op->config |= is_last ? CFG_LAST_FRAG : CFG_MID_FRAG;
- dma_sync_single_for_device(cpg->dev, cpg->sa_sram_dma,
- sizeof(u32), DMA_TO_DEVICE);
- mv_tdma_memcpy(cpg->sram_phys + SRAM_CONFIG,
- cpg->sa_sram_dma, sizeof(u32));
-
- op->mac_digest = MAC_DIGEST_P(SRAM_DIGEST_BUF) | MAC_FRAG_LEN(p->crypt_len);
- dma_sync_single_for_device(cpg->dev, cpg->sa_sram_dma + 6 * sizeof(u32),
- sizeof(u32), DMA_TO_DEVICE);
- mv_tdma_memcpy(cpg->sram_phys + SRAM_CONFIG + 6 * sizeof(u32),
- cpg->sa_sram_dma + 6 * sizeof(u32), sizeof(u32));
-
- mv_tdma_separator();
-
- if (req->result) {
- req_ctx->result_dma = dma_map_single(cpg->dev, req->result,
- req_ctx->digestsize, DMA_FROM_DEVICE);
- mv_tdma_memcpy(req_ctx->result_dma,
- cpg->sram_phys + SRAM_DIGEST_BUF, req_ctx->digestsize);
- } else {
- /* XXX: this fixes some ugly register fuckup bug in the tdma engine
- * (no need to sync since the data is ignored anyway) */
- mv_tdma_memcpy(cpg->sa_sram_dma,
- cpg->sram_phys + SRAM_CONFIG, 1);
- }
+ val |= is_last ? CFG_LAST_FRAG : CFG_MID_FRAG;
+ mv_tdma_u32_copy(cpg->sram_phys + SRAM_CONFIG, val);

- /* GO */
- mv_setup_timer();
- mv_tdma_trigger();
- writel(SEC_CMD_EN_SEC_ACCL0, cpg->reg + SEC_ACCEL_CMD);
+ val = MAC_DIGEST_P(SRAM_DIGEST_BUF) | MAC_FRAG_LEN(p->crypt_len);
+ mv_tdma_u32_copy(cpg->sram_phys + SRAM_CONFIG + 6 * sizeof(u32), val);
}

static inline int mv_hash_import_sha1_ctx(const struct mv_req_hash_ctx *ctx,
@@ -835,7 +790,6 @@ static void mv_start_new_hash_req(struct ahash_request *req)

p->hw_nbytes = hw_bytes;
p->complete = mv_hash_algo_completion;
- p->process = mv_update_hash_config;

if (unlikely(old_extra_bytes)) {
dma_sync_single_for_device(cpg->dev, ctx->buffer_dma,
@@ -847,6 +801,33 @@ static void mv_start_new_hash_req(struct ahash_request *req)

setup_data_in();
mv_init_hash_config(req);
+ mv_tdma_separator();
+ cpg->p.hw_processed_bytes += cpg->p.crypt_len;
+ while (cpg->p.hw_processed_bytes < cpg->p.hw_nbytes) {
+ cpg->p.crypt_len = 0;
+
+ setup_data_in();
+ mv_update_hash_config(req);
+ mv_tdma_separator();
+ cpg->p.hw_processed_bytes += cpg->p.crypt_len;
+ }
+ if (req->result) {
+ ctx->result_dma = dma_map_single(cpg->dev, req->result,
+ ctx->digestsize, DMA_FROM_DEVICE);
+ mv_tdma_memcpy(ctx->result_dma,
+ cpg->sram_phys + SRAM_DIGEST_BUF,
+ ctx->digestsize);
+ } else {
+ /* XXX: this fixes some ugly register fuckup bug in the tdma engine
+ * (no need to sync since the data is ignored anyway) */
+ mv_tdma_memcpy(cpg->sa_sram_dma,
+ cpg->sram_phys + SRAM_CONFIG, 1);
+ }
+
+ /* GO */
+ mv_setup_timer();
+ mv_tdma_trigger();
+ writel(SEC_CMD_EN_SEC_ACCL0, cpg->reg + SEC_ACCEL_CMD);
}

static int queue_manag(void *data)
--
1.7.3.4

2012-05-25 16:08:53

by Phil Sutter

Subject: [PATCH 04/13] mv_cesa: split up processing callbacks

Have a dedicated function initialise the full SRAM config, then use a
minimal callback to change only the relevant parts of it.
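
Roughly, the resulting call pattern (a sketch using the names this
patch introduces):

	/* first block of a request */
	setup_data_in();
	mv_init_crypt_config(req);	/* or mv_init_hash_config(req) */

	/* every further block, via dequeue_complete_req() */
	setup_data_in();
	cpg->p.process();		/* mv_update_crypt_config() or
					 * mv_update_hash_config() */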

Signed-off-by: Phil Sutter <[email protected]>
---
drivers/crypto/mv_cesa.c | 87 +++++++++++++++++++++++++++++++++------------
1 files changed, 64 insertions(+), 23 deletions(-)

diff --git a/drivers/crypto/mv_cesa.c b/drivers/crypto/mv_cesa.c
index 68b83d8..4a989ea 100644
--- a/drivers/crypto/mv_cesa.c
+++ b/drivers/crypto/mv_cesa.c
@@ -62,7 +62,7 @@ struct req_progress {
struct scatterlist *src_sg;
struct scatterlist *dst_sg;
void (*complete) (void);
- void (*process) (int is_first);
+ void (*process) (void);

/* src mostly */
int sg_src_left;
@@ -265,9 +265,8 @@ static void setup_data_in(void)
p->crypt_len = data_in_sram;
}

-static void mv_process_current_q(int first_block)
+static void mv_init_crypt_config(struct ablkcipher_request *req)
{
- struct ablkcipher_request *req = ablkcipher_request_cast(cpg->cur_req);
struct mv_ctx *ctx = crypto_tfm_ctx(req->base.tfm);
struct mv_req_ctx *req_ctx = ablkcipher_request_ctx(req);
struct sec_accel_config *op = &cpg->sa_sram.op;
@@ -281,8 +280,6 @@ static void mv_process_current_q(int first_block)
op->config = CFG_OP_CRYPT_ONLY | CFG_ENCM_AES | CFG_ENC_MODE_CBC;
op->enc_iv = ENC_IV_POINT(SRAM_DATA_IV) |
ENC_IV_BUF_POINT(SRAM_DATA_IV_BUF);
- if (!first_block)
- memcpy(req->info, cpg->sram + SRAM_DATA_IV_BUF, 16);
memcpy(cpg->sa_sram.sa_iv, req->info, 16);
break;
}
@@ -308,9 +305,8 @@ static void mv_process_current_q(int first_block)
op->enc_p = ENC_P_SRC(SRAM_DATA_IN_START) |
ENC_P_DST(SRAM_DATA_OUT_START);
op->enc_key_p = SRAM_DATA_KEY_P;
-
- setup_data_in();
op->enc_len = cpg->p.crypt_len;
+
memcpy(cpg->sram + SRAM_CONFIG, &cpg->sa_sram,
sizeof(struct sec_accel_sram));

@@ -319,6 +315,17 @@ static void mv_process_current_q(int first_block)
writel(SEC_CMD_EN_SEC_ACCL0, cpg->reg + SEC_ACCEL_CMD);
}

+static void mv_update_crypt_config(void)
+{
+ /* update the enc_len field only */
+ memcpy(cpg->sram + SRAM_CONFIG + 2 * sizeof(u32),
+ &cpg->p.crypt_len, sizeof(u32));
+
+ /* GO */
+ mv_setup_timer();
+ writel(SEC_CMD_EN_SEC_ACCL0, cpg->reg + SEC_ACCEL_CMD);
+}
+
static void mv_crypto_algo_completion(void)
{
struct ablkcipher_request *req = ablkcipher_request_cast(cpg->cur_req);
@@ -330,9 +337,8 @@ static void mv_crypto_algo_completion(void)
memcpy(req->info, cpg->sram + SRAM_DATA_IV_BUF, 16);
}

-static void mv_process_hash_current(int first_block)
+static void mv_init_hash_config(struct ahash_request *req)
{
- struct ahash_request *req = ahash_request_cast(cpg->cur_req);
const struct mv_tfm_hash_ctx *tfm_ctx = crypto_tfm_ctx(req->base.tfm);
struct mv_req_hash_ctx *req_ctx = ahash_request_ctx(req);
struct req_progress *p = &cpg->p;
@@ -355,8 +361,6 @@ static void mv_process_hash_current(int first_block)
MAC_SRC_DATA_P(SRAM_DATA_IN_START) |
MAC_SRC_TOTAL_LEN((u32)req_ctx->count);

- setup_data_in();
-
op->mac_digest =
MAC_DIGEST_P(SRAM_DIGEST_BUF) | MAC_FRAG_LEN(p->crypt_len);
op->mac_iv =
@@ -379,13 +383,11 @@ static void mv_process_hash_current(int first_block)
else
op->config |= CFG_MID_FRAG;

- if (first_block) {
- writel(req_ctx->state[0], cpg->reg + DIGEST_INITIAL_VAL_A);
- writel(req_ctx->state[1], cpg->reg + DIGEST_INITIAL_VAL_B);
- writel(req_ctx->state[2], cpg->reg + DIGEST_INITIAL_VAL_C);
- writel(req_ctx->state[3], cpg->reg + DIGEST_INITIAL_VAL_D);
- writel(req_ctx->state[4], cpg->reg + DIGEST_INITIAL_VAL_E);
- }
+ writel(req_ctx->state[0], cpg->reg + DIGEST_INITIAL_VAL_A);
+ writel(req_ctx->state[1], cpg->reg + DIGEST_INITIAL_VAL_B);
+ writel(req_ctx->state[2], cpg->reg + DIGEST_INITIAL_VAL_C);
+ writel(req_ctx->state[3], cpg->reg + DIGEST_INITIAL_VAL_D);
+ writel(req_ctx->state[4], cpg->reg + DIGEST_INITIAL_VAL_E);
}

memcpy(cpg->sram + SRAM_CONFIG, &cpg->sa_sram,
@@ -396,6 +398,42 @@ static void mv_process_hash_current(int first_block)
writel(SEC_CMD_EN_SEC_ACCL0, cpg->reg + SEC_ACCEL_CMD);
}

+static void mv_update_hash_config(void)
+{
+ struct ahash_request *req = ahash_request_cast(cpg->cur_req);
+ struct mv_req_hash_ctx *req_ctx = ahash_request_ctx(req);
+ struct req_progress *p = &cpg->p;
+ struct sec_accel_config *op = &cpg->sa_sram.op;
+ int is_last;
+
+ /* update only the config (for changed fragment state) and
+ * mac_digest (for changed frag len) fields */
+
+ switch (req_ctx->op) {
+ case COP_SHA1:
+ default:
+ op->config = CFG_OP_MAC_ONLY | CFG_MACM_SHA1;
+ break;
+ case COP_HMAC_SHA1:
+ op->config = CFG_OP_MAC_ONLY | CFG_MACM_HMAC_SHA1;
+ break;
+ }
+
+ is_last = req_ctx->last_chunk
+ && (p->hw_processed_bytes + p->crypt_len >= p->hw_nbytes)
+ && (req_ctx->count <= MAX_HW_HASH_SIZE);
+
+ op->config |= is_last ? CFG_LAST_FRAG : CFG_MID_FRAG;
+ memcpy(cpg->sram + SRAM_CONFIG, &op->config, sizeof(u32));
+
+ op->mac_digest = MAC_DIGEST_P(SRAM_DIGEST_BUF) | MAC_FRAG_LEN(p->crypt_len);
+ memcpy(cpg->sram + SRAM_CONFIG + 6 * sizeof(u32), &op->mac_digest, sizeof(u32));
+
+ /* GO */
+ mv_setup_timer();
+ writel(SEC_CMD_EN_SEC_ACCL0, cpg->reg + SEC_ACCEL_CMD);
+}
+
static inline int mv_hash_import_sha1_ctx(const struct mv_req_hash_ctx *ctx,
struct shash_desc *desc)
{
@@ -507,7 +545,8 @@ static void dequeue_complete_req(void)
if (cpg->p.hw_processed_bytes < cpg->p.hw_nbytes) {
/* process next scatter list entry */
cpg->eng_st = ENGINE_BUSY;
- cpg->p.process(0);
+ setup_data_in();
+ cpg->p.process();
} else {
cpg->p.complete();
cpg->eng_st = ENGINE_IDLE;
@@ -542,7 +581,7 @@ static void mv_start_new_crypt_req(struct ablkcipher_request *req)
memset(p, 0, sizeof(struct req_progress));
p->hw_nbytes = req->nbytes;
p->complete = mv_crypto_algo_completion;
- p->process = mv_process_current_q;
+ p->process = mv_update_crypt_config;
p->copy_back = 1;

p->src_sg = req->src;
@@ -554,7 +593,8 @@ static void mv_start_new_crypt_req(struct ablkcipher_request *req)
p->sg_dst_left = req->dst->length;
}

- mv_process_current_q(1);
+ setup_data_in();
+ mv_init_crypt_config(req);
}

static void mv_start_new_hash_req(struct ahash_request *req)
@@ -583,7 +623,7 @@ static void mv_start_new_hash_req(struct ahash_request *req)
if (hw_bytes) {
p->hw_nbytes = hw_bytes;
p->complete = mv_hash_algo_completion;
- p->process = mv_process_hash_current;
+ p->process = mv_update_hash_config;

if (unlikely(old_extra_bytes)) {
memcpy(cpg->sram + SRAM_DATA_IN_START, ctx->buffer,
@@ -591,7 +631,8 @@ static void mv_start_new_hash_req(struct ahash_request *req)
p->crypt_len = old_extra_bytes;
}

- mv_process_hash_current(1);
+ setup_data_in();
+ mv_init_hash_config(req);
} else {
copy_src_to_buf(p, ctx->buffer + old_extra_bytes,
ctx->extra_bytes - old_extra_bytes);
--
1.7.3.4

2012-05-25 16:08:55

by Phil Sutter

Subject: [PATCH 08/13] mv_cesa: fetch extra_bytes via TDMA engine, too


Signed-off-by: Phil Sutter <[email protected]>
---
drivers/crypto/mv_cesa.c | 12 ++++++++++--
1 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/drivers/crypto/mv_cesa.c b/drivers/crypto/mv_cesa.c
index d099aa0..bc2692e 100644
--- a/drivers/crypto/mv_cesa.c
+++ b/drivers/crypto/mv_cesa.c
@@ -156,6 +156,7 @@ struct mv_req_hash_ctx {
u64 count;
u32 state[SHA1_DIGEST_SIZE / 4];
u8 buffer[SHA1_BLOCK_SIZE];
+ dma_addr_t buffer_dma;
int first_hash; /* marks that we don't have previous state */
int last_chunk; /* marks that this is the 'final' request */
int extra_bytes; /* unprocessed bytes in buffer */
@@ -636,6 +637,9 @@ static void mv_hash_algo_completion(void)
dma_unmap_single(cpg->dev, ctx->result_dma,
ctx->digestsize, DMA_FROM_DEVICE);

+ dma_unmap_single(cpg->dev, ctx->buffer_dma,
+ SHA1_BLOCK_SIZE, DMA_TO_DEVICE);
+
if (unlikely(ctx->count > MAX_HW_HASH_SIZE)) {
mv_save_digest_state(ctx);
mv_hash_final_fallback(req);
@@ -755,8 +759,10 @@ static void mv_start_new_hash_req(struct ahash_request *req)
p->process = mv_update_hash_config;

if (unlikely(old_extra_bytes)) {
- memcpy(cpg->sram + SRAM_DATA_IN_START, ctx->buffer,
- old_extra_bytes);
+ dma_sync_single_for_device(cpg->dev, ctx->buffer_dma,
+ SHA1_BLOCK_SIZE, DMA_TO_DEVICE);
+ mv_tdma_memcpy(cpg->sram_phys + SRAM_DATA_IN_START,
+ ctx->buffer_dma, old_extra_bytes);
p->crypt_len = old_extra_bytes;
}

@@ -901,6 +907,8 @@ static void mv_init_hash_req_ctx(struct mv_req_hash_ctx *ctx, int op,
ctx->first_hash = 1;
ctx->last_chunk = is_last;
ctx->count_add = count_add;
+ ctx->buffer_dma = dma_map_single(cpg->dev, ctx->buffer,
+ SHA1_BLOCK_SIZE, DMA_TO_DEVICE);
}

static void mv_update_hash_req_ctx(struct mv_req_hash_ctx *ctx, int is_last,
--
1.7.3.4

2012-05-25 16:08:56

by Phil Sutter

Subject: [PATCH 07/13] mv_cesa: have TDMA copy back the digest result


Signed-off-by: Phil Sutter <[email protected]>
---
drivers/crypto/mv_cesa.c | 40 +++++++++++++++++++++++++++++-----------
1 files changed, 29 insertions(+), 11 deletions(-)

diff --git a/drivers/crypto/mv_cesa.c b/drivers/crypto/mv_cesa.c
index e10da2b..d099aa0 100644
--- a/drivers/crypto/mv_cesa.c
+++ b/drivers/crypto/mv_cesa.c
@@ -159,8 +159,10 @@ struct mv_req_hash_ctx {
int first_hash; /* marks that we don't have previous state */
int last_chunk; /* marks that this is the 'final' request */
int extra_bytes; /* unprocessed bytes in buffer */
+ int digestsize; /* size of the digest */
enum hash_op op;
int count_add;
+ dma_addr_t result_dma;
};

static void mv_completion_timer_callback(unsigned long unused)
@@ -497,9 +499,17 @@ static void mv_init_hash_config(struct ahash_request *req)

mv_tdma_separator();

- /* XXX: this fixes some ugly register fuckup bug in the tdma engine
- * (no need to sync since the data is ignored anyway) */
- mv_tdma_memcpy(cpg->sa_sram_dma, cpg->sram_phys + SRAM_CONFIG, 1);
+ if (req->result) {
+ req_ctx->result_dma = dma_map_single(cpg->dev, req->result,
+ req_ctx->digestsize, DMA_FROM_DEVICE);
+ mv_tdma_memcpy(req_ctx->result_dma,
+ cpg->sram_phys + SRAM_DIGEST_BUF, req_ctx->digestsize);
+ } else {
+ /* XXX: this fixes some ugly register fuckup bug in the tdma engine
+ * (no need to sync since the data is ignored anyway) */
+ mv_tdma_memcpy(cpg->sa_sram_dma,
+ cpg->sram_phys + SRAM_CONFIG, 1);
+ }

/* GO */
mv_setup_timer();
@@ -546,9 +556,17 @@ static void mv_update_hash_config(void)

mv_tdma_separator();

- /* XXX: this fixes some ugly register fuckup bug in the tdma engine
- * (no need to sync since the data is ignored anyway) */
- mv_tdma_memcpy(cpg->sa_sram_dma, cpg->sram_phys + SRAM_CONFIG, 1);
+ if (req->result) {
+ req_ctx->result_dma = dma_map_single(cpg->dev, req->result,
+ req_ctx->digestsize, DMA_FROM_DEVICE);
+ mv_tdma_memcpy(req_ctx->result_dma,
+ cpg->sram_phys + SRAM_DIGEST_BUF, req_ctx->digestsize);
+ } else {
+ /* XXX: this fixes some ugly register fuckup bug in the tdma engine
+ * (no need to sync since the data is ignored anyway) */
+ mv_tdma_memcpy(cpg->sa_sram_dma,
+ cpg->sram_phys + SRAM_CONFIG, 1);
+ }

/* GO */
mv_setup_timer();
@@ -615,11 +633,10 @@ static void mv_hash_algo_completion(void)
copy_src_to_buf(&cpg->p, ctx->buffer, ctx->extra_bytes);

if (likely(ctx->last_chunk)) {
- if (likely(ctx->count <= MAX_HW_HASH_SIZE)) {
- memcpy(req->result, cpg->sram + SRAM_DIGEST_BUF,
- crypto_ahash_digestsize(crypto_ahash_reqtfm
- (req)));
- } else {
+ dma_unmap_single(cpg->dev, ctx->result_dma,
+ ctx->digestsize, DMA_FROM_DEVICE);
+
+ if (unlikely(ctx->count > MAX_HW_HASH_SIZE)) {
mv_save_digest_state(ctx);
mv_hash_final_fallback(req);
}
@@ -717,6 +734,7 @@ static void mv_start_new_hash_req(struct ahash_request *req)
memset(p, 0, sizeof(struct req_progress));
hw_bytes = req->nbytes + ctx->extra_bytes;
old_extra_bytes = ctx->extra_bytes;
+ ctx->digestsize = crypto_ahash_digestsize(crypto_ahash_reqtfm(req));

ctx->extra_bytes = hw_bytes % SHA1_BLOCK_SIZE;
if (ctx->extra_bytes != 0
--
1.7.3.4

2012-05-25 16:08:59

by Phil Sutter

Subject: [PATCH 10/13] mv_cesa: reorganise mv_start_new_hash_req a bit

Check early whether CESA can be used at all, and exit if not.
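
The resulting structure, roughly sketched:

	if (unlikely(!hw_bytes)) {	/* too little data for CESA */
		/* buffer the input, fall back to software hashing
		 * for a final chunk, then complete and return */
		return;
	}
	/* main path: map req->src, set up the progress state and
	 * build the DMA descriptor chain */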

Signed-off-by: Phil Sutter <[email protected]>
---
drivers/crypto/mv_cesa.c | 61 +++++++++++++++++++++++++---------------------
1 files changed, 33 insertions(+), 28 deletions(-)

diff --git a/drivers/crypto/mv_cesa.c b/drivers/crypto/mv_cesa.c
index 8e66080..5dba9df 100644
--- a/drivers/crypto/mv_cesa.c
+++ b/drivers/crypto/mv_cesa.c
@@ -804,35 +804,13 @@ static void mv_start_new_hash_req(struct ahash_request *req)
else
ctx->extra_bytes = 0;

- p->src_sg = req->src;
- if (req->nbytes) {
- BUG_ON(!req->src);
- p->sg_src_left = req->src->length;
- }
-
- if (hw_bytes) {
- p->hw_nbytes = hw_bytes;
- p->complete = mv_hash_algo_completion;
- p->process = mv_update_hash_config;
-
- if (unlikely(old_extra_bytes)) {
- dma_sync_single_for_device(cpg->dev, ctx->buffer_dma,
- SHA1_BLOCK_SIZE, DMA_TO_DEVICE);
- mv_tdma_memcpy(cpg->sram_phys + SRAM_DATA_IN_START,
- ctx->buffer_dma, old_extra_bytes);
- p->crypt_len = old_extra_bytes;
+ if (unlikely(!hw_bytes)) { /* too little data for CESA */
+ if (req->nbytes) {
+ p->src_sg = req->src;
+ p->sg_src_left = req->src->length;
+ copy_src_to_buf(p, ctx->buffer + old_extra_bytes,
+ req->nbytes);
}
-
- if (!mv_dma_map_sg(req->src, req->nbytes, DMA_TO_DEVICE)) {
- printk(KERN_ERR "%s: out of memory\n", __func__);
- return;
- }
-
- setup_data_in();
- mv_init_hash_config(req);
- } else {
- copy_src_to_buf(p, ctx->buffer + old_extra_bytes,
- ctx->extra_bytes - old_extra_bytes);
if (ctx->last_chunk)
rc = mv_hash_final_fallback(req);
else
@@ -841,7 +819,34 @@ static void mv_start_new_hash_req(struct ahash_request *req)
local_bh_disable();
req->base.complete(&req->base, rc);
local_bh_enable();
+ return;
}
+
+ if (likely(req->nbytes)) {
+ BUG_ON(!req->src);
+
+ if (!mv_dma_map_sg(req->src, req->nbytes, DMA_TO_DEVICE)) {
+ printk(KERN_ERR "%s: out of memory\n", __func__);
+ return;
+ }
+ p->sg_src_left = sg_dma_len(req->src);
+ p->src_sg = req->src;
+ }
+
+ p->hw_nbytes = hw_bytes;
+ p->complete = mv_hash_algo_completion;
+ p->process = mv_update_hash_config;
+
+ if (unlikely(old_extra_bytes)) {
+ dma_sync_single_for_device(cpg->dev, ctx->buffer_dma,
+ SHA1_BLOCK_SIZE, DMA_TO_DEVICE);
+ mv_tdma_memcpy(cpg->sram_phys + SRAM_DATA_IN_START,
+ ctx->buffer_dma, old_extra_bytes);
+ p->crypt_len = old_extra_bytes;
+ }
+
+ setup_data_in();
+ mv_init_hash_config(req);
}

static int queue_manag(void *data)
--
1.7.3.4

2012-05-25 16:08:59

by Phil Sutter

Subject: [PATCH 02/13] mv_cesa: minor formatting cleanup, will all make sense soon

This is just to keep formatting changes out of the following commit,
hopefully simplifying it a bit.

Signed-off-by: Phil Sutter <[email protected]>
---
drivers/crypto/mv_cesa.c | 14 ++++++--------
1 files changed, 6 insertions(+), 8 deletions(-)

diff --git a/drivers/crypto/mv_cesa.c b/drivers/crypto/mv_cesa.c
index c305350..3862a93 100644
--- a/drivers/crypto/mv_cesa.c
+++ b/drivers/crypto/mv_cesa.c
@@ -267,12 +267,10 @@ static void mv_process_current_q(int first_block)
}
if (req_ctx->decrypt) {
op.config |= CFG_DIR_DEC;
- memcpy(cpg->sram + SRAM_DATA_KEY_P, ctx->aes_dec_key,
- AES_KEY_LEN);
+ memcpy(cpg->sram + SRAM_DATA_KEY_P, ctx->aes_dec_key, AES_KEY_LEN);
} else {
op.config |= CFG_DIR_ENC;
- memcpy(cpg->sram + SRAM_DATA_KEY_P, ctx->aes_enc_key,
- AES_KEY_LEN);
+ memcpy(cpg->sram + SRAM_DATA_KEY_P, ctx->aes_enc_key, AES_KEY_LEN);
}

switch (ctx->key_len) {
@@ -333,9 +331,8 @@ static void mv_process_hash_current(int first_block)
}

op.mac_src_p =
- MAC_SRC_DATA_P(SRAM_DATA_IN_START) | MAC_SRC_TOTAL_LEN((u32)
- req_ctx->
- count);
+ MAC_SRC_DATA_P(SRAM_DATA_IN_START) |
+ MAC_SRC_TOTAL_LEN((u32)req_ctx->count);

setup_data_in();

@@ -370,7 +367,8 @@ static void mv_process_hash_current(int first_block)
}
}

- memcpy(cpg->sram + SRAM_CONFIG, &op, sizeof(struct sec_accel_config));
+ memcpy(cpg->sram + SRAM_CONFIG, &op,
+ sizeof(struct sec_accel_config));

/* GO */
mv_setup_timer();
--
1.7.3.4

2012-05-25 16:08:54

by Phil Sutter

Subject: [PATCH 13/13] mv_cesa, mv_tdma: outsource common dma-pool handling code
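
Both mv_cesa.c and mv_tdma.c carried nearly identical code for managing
a growable pool of DMA-able items; move it into the shared header
dma_desclist.h. Typical use, sketched from the new API (itemsize stands
in for the per-driver item size):

	struct dma_desclist dl;

	if (init_dma_desclist(&dl, dev, itemsize, MV_DMA_ALIGN, 0))
		return -ENOMEM;		/* dma_pool creation failed */
	if (set_dma_desclist_size(&dl, MV_DMA_INIT_POOLSIZE))
		return -ENOMEM;		/* preallocation failed */
	/* access items via DESCLIST_ITEM()/DESCLIST_ITEM_DMA(),
	 * grow on demand when DESCLIST_FULL() */
	fini_dma_desclist(&dl);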


Signed-off-by: Phil Sutter <[email protected]>
---
drivers/crypto/dma_desclist.h | 79 +++++++++++++++++++++++++++++++++++
drivers/crypto/mv_cesa.c | 81 +++++++++----------------------------
drivers/crypto/mv_tdma.c | 91 ++++++++++++-----------------------------
3 files changed, 125 insertions(+), 126 deletions(-)
create mode 100644 drivers/crypto/dma_desclist.h

diff --git a/drivers/crypto/dma_desclist.h b/drivers/crypto/dma_desclist.h
new file mode 100644
index 0000000..c471ad6
--- /dev/null
+++ b/drivers/crypto/dma_desclist.h
@@ -0,0 +1,79 @@
+#ifndef __DMA_DESCLIST__
+#define __DMA_DESCLIST__
+
+struct dma_desc {
+ void *virt;
+ dma_addr_t phys;
+};
+
+struct dma_desclist {
+ struct dma_pool *itempool;
+ struct dma_desc *desclist;
+ unsigned long length;
+ unsigned long usage;
+};
+
+#define DESCLIST_ITEM(dl, x) ((dl).desclist[(x)].virt)
+#define DESCLIST_ITEM_DMA(dl, x) ((dl).desclist[(x)].phys)
+#define DESCLIST_FULL(dl) ((dl).length == (dl).usage)
+
+static inline int
+init_dma_desclist(struct dma_desclist *dl, struct device *dev,
+ size_t size, size_t align, size_t boundary)
+{
+#define STRX(x) #x
+#define STR(x) STRX(x)
+ dl->itempool = dma_pool_create(
+ "DMA Desclist Pool at "__FILE__"("STR(__LINE__)")",
+ dev, size, align, boundary);
+#undef STR
+#undef STRX
+ if (!dl->itempool)
+ return 1;
+ dl->desclist = NULL;
+ dl->length = dl->usage = 0;
+ return 0;
+}
+
+static inline int
+set_dma_desclist_size(struct dma_desclist *dl, unsigned long nelem)
+{
+ /* need to increase size first if requested */
+ if (nelem > dl->length) {
+ struct dma_desc *newmem;
+ int newsize = nelem * sizeof(struct dma_desc);
+
+ newmem = krealloc(dl->desclist, newsize, GFP_KERNEL);
+ if (!newmem)
+ return -ENOMEM;
+ dl->desclist = newmem;
+ }
+
+ /* allocate/free dma descriptors, adjusting dl->length on the go */
+ for (; dl->length < nelem; dl->length++) {
+ DESCLIST_ITEM(*dl, dl->length) = dma_pool_alloc(dl->itempool,
+ GFP_KERNEL, &DESCLIST_ITEM_DMA(*dl, dl->length));
+ if (!DESCLIST_ITEM(*dl, dl->length))
+ return -ENOMEM;
+ }
+ for (; dl->length > nelem; dl->length--)
+ dma_pool_free(dl->itempool, DESCLIST_ITEM(*dl, dl->length - 1),
+ DESCLIST_ITEM_DMA(*dl, dl->length - 1));
+
+ /* ignore size decreases but those to zero */
+ if (!nelem) {
+ kfree(dl->desclist);
+ dl->desclist = 0;
+ }
+ return 0;
+}
+
+static inline void
+fini_dma_desclist(struct dma_desclist *dl)
+{
+ set_dma_desclist_size(dl, 0);
+ dma_pool_destroy(dl->itempool);
+ dl->length = dl->usage = 0;
+}
+
+#endif /* __DMA_DESCLIST__ */
diff --git a/drivers/crypto/mv_cesa.c b/drivers/crypto/mv_cesa.c
index 9a2f413..367aa18 100644
--- a/drivers/crypto/mv_cesa.c
+++ b/drivers/crypto/mv_cesa.c
@@ -23,6 +23,7 @@

#include "mv_cesa.h"
#include "mv_tdma.h"
+#include "dma_desclist.h"

#define MV_CESA "MV-CESA:"
#define MAX_HW_HASH_SIZE 0xFFFF
@@ -99,11 +100,6 @@ struct sec_accel_sram {
#define sa_ivo type.hash.ivo
} __attribute__((packed));

-struct u32_mempair {
- u32 *vaddr;
- dma_addr_t daddr;
-};
-
struct crypto_priv {
struct device *dev;
void __iomem *reg;
@@ -127,14 +123,14 @@ struct crypto_priv {
struct sec_accel_sram sa_sram;
dma_addr_t sa_sram_dma;

- struct dma_pool *u32_pool;
- struct u32_mempair *u32_list;
- int u32_list_len;
- int u32_usage;
+ struct dma_desclist desclist;
};

static struct crypto_priv *cpg;

+#define ITEM(x) ((u32 *)DESCLIST_ITEM(cpg->desclist, x))
+#define ITEM_DMA(x) DESCLIST_ITEM_DMA(cpg->desclist, x)
+
struct mv_ctx {
u8 aes_enc_key[AES_KEY_LEN];
u32 aes_dec_key[8];
@@ -202,52 +198,17 @@ static void mv_setup_timer(void)
jiffies + msecs_to_jiffies(MV_CESA_EXPIRE));
}

-#define U32_ITEM(x) (cpg->u32_list[x].vaddr)
-#define U32_ITEM_DMA(x) (cpg->u32_list[x].daddr)
-
-static inline int set_u32_poolsize(int nelem)
-{
- /* need to increase size first if requested */
- if (nelem > cpg->u32_list_len) {
- struct u32_mempair *newmem;
- int newsize = nelem * sizeof(struct u32_mempair);
-
- newmem = krealloc(cpg->u32_list, newsize, GFP_KERNEL);
- if (!newmem)
- return -ENOMEM;
- cpg->u32_list = newmem;
- }
-
- /* allocate/free dma descriptors, adjusting cpg->u32_list_len on the go */
- for (; cpg->u32_list_len < nelem; cpg->u32_list_len++) {
- U32_ITEM(cpg->u32_list_len) = dma_pool_alloc(cpg->u32_pool,
- GFP_KERNEL, &U32_ITEM_DMA(cpg->u32_list_len));
- if (!U32_ITEM((cpg->u32_list_len)))
- return -ENOMEM;
- }
- for (; cpg->u32_list_len > nelem; cpg->u32_list_len--)
- dma_pool_free(cpg->u32_pool, U32_ITEM(cpg->u32_list_len - 1),
- U32_ITEM_DMA(cpg->u32_list_len - 1));
-
- /* ignore size decreases but those to zero */
- if (!nelem) {
- kfree(cpg->u32_list);
- cpg->u32_list = 0;
- }
- return 0;
-}
-
static inline void mv_tdma_u32_copy(dma_addr_t dst, u32 val)
{
- if (unlikely(cpg->u32_usage == cpg->u32_list_len)
- && set_u32_poolsize(cpg->u32_list_len << 1)) {
- printk(KERN_ERR MV_CESA "resizing poolsize to %d failed\n",
- cpg->u32_list_len << 1);
+ if (unlikely(DESCLIST_FULL(cpg->desclist)) &&
+ set_dma_desclist_size(&cpg->desclist, cpg->desclist.length << 1)) {
+ printk(KERN_ERR MV_CESA "resizing poolsize to %lu failed\n",
+ cpg->desclist.length << 1);
return;
}
- *(U32_ITEM(cpg->u32_usage)) = val;
- mv_tdma_memcpy(dst, U32_ITEM_DMA(cpg->u32_usage), sizeof(u32));
- cpg->u32_usage++;
+ *ITEM(cpg->desclist.usage) = val;
+ mv_tdma_memcpy(dst, ITEM_DMA(cpg->desclist.usage), sizeof(u32));
+ cpg->desclist.usage++;
}

static inline bool
@@ -649,7 +610,7 @@ static void dequeue_complete_req(void)
struct crypto_async_request *req = cpg->cur_req;

mv_tdma_clear();
- cpg->u32_usage = 0;
+ cpg->desclist.usage = 0;

BUG_ON(cpg->eng_st != ENGINE_W_DEQUEUE);

@@ -1326,13 +1287,12 @@ static int mv_probe(struct platform_device *pdev)
cp->sa_sram_dma = dma_map_single(&pdev->dev, &cp->sa_sram,
sizeof(struct sec_accel_sram), DMA_TO_DEVICE);

- cpg->u32_pool = dma_pool_create("CESA U32 Item Pool",
- &pdev->dev, sizeof(u32), MV_DMA_ALIGN, 0);
- if (!cpg->u32_pool) {
+ if (init_dma_desclist(&cpg->desclist, &pdev->dev,
+ sizeof(u32), MV_DMA_ALIGN, 0)) {
ret = -ENOMEM;
goto err_mapping;
}
- if (set_u32_poolsize(MV_DMA_INIT_POOLSIZE)) {
+ if (set_dma_desclist_size(&cpg->desclist, MV_DMA_INIT_POOLSIZE)) {
printk(KERN_ERR MV_CESA "failed to initialise poolsize\n");
goto err_pool;
}
@@ -1341,7 +1301,7 @@ static int mv_probe(struct platform_device *pdev)
if (ret) {
printk(KERN_WARNING MV_CESA
"Could not register aes-ecb driver\n");
- goto err_poolsize;
+ goto err_pool;
}

ret = crypto_register_alg(&mv_aes_alg_cbc);
@@ -1368,10 +1328,8 @@ static int mv_probe(struct platform_device *pdev)
return 0;
err_unreg_ecb:
crypto_unregister_alg(&mv_aes_alg_ecb);
-err_poolsize:
- set_u32_poolsize(0);
err_pool:
- dma_pool_destroy(cpg->u32_pool);
+ fini_dma_desclist(&cpg->desclist);
err_mapping:
dma_unmap_single(&pdev->dev, cpg->sa_sram_dma,
sizeof(struct sec_accel_sram), DMA_TO_DEVICE);
@@ -1403,8 +1361,7 @@ static int mv_remove(struct platform_device *pdev)
free_irq(cp->irq, cp);
dma_unmap_single(&pdev->dev, cpg->sa_sram_dma,
sizeof(struct sec_accel_sram), DMA_TO_DEVICE);
- set_u32_poolsize(0);
- dma_pool_destroy(cpg->u32_pool);
+ fini_dma_desclist(&cpg->desclist);
memset(cp->sram, 0, cp->sram_size);
iounmap(cp->sram);
iounmap(cp->reg);
diff --git a/drivers/crypto/mv_tdma.c b/drivers/crypto/mv_tdma.c
index aa5316a..d8e8c3f 100644
--- a/drivers/crypto/mv_tdma.c
+++ b/drivers/crypto/mv_tdma.c
@@ -17,6 +17,7 @@
#include <linux/platform_device.h>

#include "mv_tdma.h"
+#include "dma_desclist.h"

#define MV_TDMA "MV-TDMA: "

@@ -30,57 +31,17 @@ struct tdma_desc {
u32 next;
} __attribute__((packed));

-struct desc_mempair {
- struct tdma_desc *vaddr;
- dma_addr_t daddr;
-};
-
struct tdma_priv {
struct device *dev;
void __iomem *reg;
int irq;
/* protecting the dma descriptors and stuff */
spinlock_t lock;
- struct dma_pool *descpool;
- struct desc_mempair *desclist;
- int desclist_len;
- int desc_usage;
+ struct dma_desclist desclist;
} tpg;

-#define DESC(x) (tpg.desclist[x].vaddr)
-#define DESC_DMA(x) (tpg.desclist[x].daddr)
-
-static inline int set_poolsize(int nelem)
-{
- /* need to increase size first if requested */
- if (nelem > tpg.desclist_len) {
- struct desc_mempair *newmem;
- int newsize = nelem * sizeof(struct desc_mempair);
-
- newmem = krealloc(tpg.desclist, newsize, GFP_KERNEL);
- if (!newmem)
- return -ENOMEM;
- tpg.desclist = newmem;
- }
-
- /* allocate/free dma descriptors, adjusting tpg.desclist_len on the go */
- for (; tpg.desclist_len < nelem; tpg.desclist_len++) {
- DESC(tpg.desclist_len) = dma_pool_alloc(tpg.descpool,
- GFP_KERNEL, &DESC_DMA(tpg.desclist_len));
- if (!DESC((tpg.desclist_len)))
- return -ENOMEM;
- }
- for (; tpg.desclist_len > nelem; tpg.desclist_len--)
- dma_pool_free(tpg.descpool, DESC(tpg.desclist_len - 1),
- DESC_DMA(tpg.desclist_len - 1));
-
- /* ignore size decreases but those to zero */
- if (!nelem) {
- kfree(tpg.desclist);
- tpg.desclist = 0;
- }
- return 0;
-}
+#define ITEM(x) ((struct tdma_desc *)DESCLIST_ITEM(tpg.desclist, x))
+#define ITEM_DMA(x) DESCLIST_ITEM_DMA(tpg.desclist, x)

static inline void wait_for_tdma_idle(void)
{
@@ -100,17 +61,18 @@ static inline void switch_tdma_engine(bool state)

static struct tdma_desc *get_new_last_desc(void)
{
- if (unlikely(tpg.desc_usage == tpg.desclist_len) &&
- set_poolsize(tpg.desclist_len << 1)) {
- printk(KERN_ERR MV_TDMA "failed to increase DMA pool to %d\n",
- tpg.desclist_len << 1);
+ if (unlikely(DESCLIST_FULL(tpg.desclist)) &&
+ set_dma_desclist_size(&tpg.desclist, tpg.desclist.length << 1)) {
+ printk(KERN_ERR MV_TDMA "failed to increase DMA pool to %lu\n",
+ tpg.desclist.length << 1);
return NULL;
}

- if (likely(tpg.desc_usage))
- DESC(tpg.desc_usage - 1)->next = DESC_DMA(tpg.desc_usage);
+ if (likely(tpg.desclist.usage))
+ ITEM(tpg.desclist.usage - 1)->next =
+ ITEM_DMA(tpg.desclist.usage);

- return DESC(tpg.desc_usage++);
+ return ITEM(tpg.desclist.usage++);
}

static inline void mv_tdma_desc_dump(void)
@@ -118,17 +80,17 @@ static inline void mv_tdma_desc_dump(void)
struct tdma_desc *tmp;
int i;

- if (!tpg.desc_usage) {
+ if (!tpg.desclist.usage) {
printk(KERN_WARNING MV_TDMA "DMA descriptor list is empty\n");
return;
}

printk(KERN_WARNING MV_TDMA "DMA descriptor list:\n");
- for (i = 0; i < tpg.desc_usage; i++) {
- tmp = DESC(i);
+ for (i = 0; i < tpg.desclist.usage; i++) {
+ tmp = ITEM(i);
printk(KERN_WARNING MV_TDMA "entry %d at 0x%x: dma addr 0x%x, "
"src 0x%x, dst 0x%x, count %u, own %d, next 0x%x", i,
- (u32)tmp, DESC_DMA(i) , tmp->src, tmp->dst,
+ (u32)tmp, ITEM_DMA(i) , tmp->src, tmp->dst,
tmp->count & ~TDMA_OWN_BIT, !!(tmp->count & TDMA_OWN_BIT),
tmp->next);
}
@@ -167,7 +129,7 @@ void mv_tdma_clear(void)
writel(0, tpg.reg + TDMA_CURR_DESC);
writel(0, tpg.reg + TDMA_NEXT_DESC);

- tpg.desc_usage = 0;
+ tpg.desclist.usage = 0;

switch_tdma_engine(1);

@@ -183,7 +145,7 @@ void mv_tdma_trigger(void)

spin_lock(&tpg.lock);

- writel(DESC_DMA(0), tpg.reg + TDMA_NEXT_DESC);
+ writel(ITEM_DMA(0), tpg.reg + TDMA_NEXT_DESC);

spin_unlock(&tpg.lock);
}
@@ -287,13 +249,15 @@ static int mv_probe(struct platform_device *pdev)
goto out_unmap_reg;
}

- tpg.descpool = dma_pool_create("TDMA Descriptor Pool", tpg.dev,
- sizeof(struct tdma_desc), MV_DMA_ALIGN, 0);
- if (!tpg.descpool) {
+ if (init_dma_desclist(&tpg.desclist, tpg.dev,
+ sizeof(struct tdma_desc), MV_DMA_ALIGN, 0)) {
rc = -ENOMEM;
goto out_free_irq;
}
- set_poolsize(MV_DMA_INIT_POOLSIZE);
+ if (set_dma_desclist_size(&tpg.desclist, MV_DMA_INIT_POOLSIZE)) {
+ rc = -ENOMEM;
+ goto out_free_desclist;
+ }

platform_set_drvdata(pdev, &tpg);

@@ -327,8 +291,8 @@ static int mv_probe(struct platform_device *pdev)
out_free_all:
switch_tdma_engine(0);
platform_set_drvdata(pdev, NULL);
- set_poolsize(0);
- dma_pool_destroy(tpg.descpool);
+out_free_desclist:
+ fini_dma_desclist(&tpg.desclist);
out_free_irq:
free_irq(tpg.irq, &tpg);
out_unmap_reg:
@@ -341,8 +305,7 @@ static int mv_remove(struct platform_device *pdev)
{
switch_tdma_engine(0);
platform_set_drvdata(pdev, NULL);
- set_poolsize(0);
- dma_pool_destroy(tpg.descpool);
+ fini_dma_desclist(&tpg.desclist);
free_irq(tpg.irq, &tpg);
iounmap(tpg.reg);
tpg.dev = NULL;
--
1.7.3.4

2012-05-25 16:08:57

by Phil Sutter

Subject: [PATCH 06/13] mv_cesa: use TDMA engine for data transfers

Simply choose the same DMA mask value as for mvsdio and ehci.

Signed-off-by: Phil Sutter <[email protected]>
---
arch/arm/plat-orion/common.c | 6 +
drivers/crypto/mv_cesa.c | 214 +++++++++++++++++++++++++++++++++---------
2 files changed, 175 insertions(+), 45 deletions(-)

diff --git a/arch/arm/plat-orion/common.c b/arch/arm/plat-orion/common.c
index 74daf5e..dd3a327 100644
--- a/arch/arm/plat-orion/common.c
+++ b/arch/arm/plat-orion/common.c
@@ -916,9 +916,15 @@ static struct resource orion_crypto_resources[] = {
},
};

+static u64 mv_crypto_dmamask = DMA_BIT_MASK(32);
+
static struct platform_device orion_crypto = {
.name = "mv_crypto",
.id = -1,
+ .dev = {
+ .dma_mask = &mv_crypto_dmamask,
+ .coherent_dma_mask = DMA_BIT_MASK(32),
+ },
};

void __init orion_crypto_init(unsigned long mapbase,
diff --git a/drivers/crypto/mv_cesa.c b/drivers/crypto/mv_cesa.c
index 4a989ea..e10da2b 100644
--- a/drivers/crypto/mv_cesa.c
+++ b/drivers/crypto/mv_cesa.c
@@ -9,6 +9,7 @@
#include <crypto/aes.h>
#include <crypto/algapi.h>
#include <linux/crypto.h>
+#include <linux/dma-mapping.h>
#include <linux/interrupt.h>
#include <linux/io.h>
#include <linux/kthread.h>
@@ -20,11 +21,14 @@
#include <crypto/sha.h>

#include "mv_cesa.h"
+#include "mv_tdma.h"

#define MV_CESA "MV-CESA:"
#define MAX_HW_HASH_SIZE 0xFFFF
#define MV_CESA_EXPIRE 500 /* msec */

+static int count_sgs(struct scatterlist *, unsigned int);
+
/*
* STM:
* /---------------------------------------\
@@ -49,7 +53,6 @@ enum engine_status {
* @src_start: offset to add to src start position (scatter list)
* @crypt_len: length of current hw crypt/hash process
* @hw_nbytes: total bytes to process in hw for this request
- * @copy_back: whether to copy data back (crypt) or not (hash)
* @sg_dst_left: bytes left dst to process in this scatter list
* @dst_start: offset to add to dst start position (scatter list)
* @hw_processed_bytes: number of bytes processed by hw (request).
@@ -70,7 +73,6 @@ struct req_progress {
int crypt_len;
int hw_nbytes;
/* dst mostly */
- int copy_back;
int sg_dst_left;
int dst_start;
int hw_processed_bytes;
@@ -95,8 +97,10 @@ struct sec_accel_sram {
} __attribute__((packed));

struct crypto_priv {
+ struct device *dev;
void __iomem *reg;
void __iomem *sram;
+ u32 sram_phys;
int irq;
struct task_struct *queue_th;

@@ -113,6 +117,7 @@ struct crypto_priv {
int has_hmac_sha1;

struct sec_accel_sram sa_sram;
+ dma_addr_t sa_sram_dma;
};

static struct crypto_priv *cpg;
@@ -181,6 +186,23 @@ static void mv_setup_timer(void)
jiffies + msecs_to_jiffies(MV_CESA_EXPIRE));
}

+static inline bool
+mv_dma_map_sg(struct scatterlist *sg, int nbytes, enum dma_data_direction dir)
+{
+ int nents = count_sgs(sg, nbytes);
+
+ if (nbytes && dma_map_sg(cpg->dev, sg, nents, dir) != nents)
+ return false;
+ return true;
+}
+
+static inline void
+mv_dma_unmap_sg(struct scatterlist *sg, int nbytes, enum dma_data_direction dir)
+{
+ if (nbytes)
+ dma_unmap_sg(cpg->dev, sg, count_sgs(sg, nbytes), dir);
+}
+
static void compute_aes_dec_key(struct mv_ctx *ctx)
{
struct crypto_aes_ctx gen_aes_key;
@@ -255,12 +277,66 @@ static void copy_src_to_buf(struct req_progress *p, char *dbuf, int len)
}
}

+static void dma_copy_src_to_buf(struct req_progress *p, dma_addr_t dbuf, int len)
+{
+ dma_addr_t sbuf;
+ int copy_len;
+
+ while (len) {
+ if (!p->sg_src_left) {
+ /* next sg please */
+ p->src_sg = sg_next(p->src_sg);
+ BUG_ON(!p->src_sg);
+ p->sg_src_left = sg_dma_len(p->src_sg);
+ p->src_start = 0;
+ }
+
+ sbuf = sg_dma_address(p->src_sg) + p->src_start;
+
+ copy_len = min(p->sg_src_left, len);
+ mv_tdma_memcpy(dbuf, sbuf, copy_len);
+
+ p->src_start += copy_len;
+ p->sg_src_left -= copy_len;
+
+ len -= copy_len;
+ dbuf += copy_len;
+ }
+}
+
+static void dma_copy_buf_to_dst(struct req_progress *p, dma_addr_t sbuf, int len)
+{
+ dma_addr_t dbuf;
+ int copy_len;
+
+ while (len) {
+ if (!p->sg_dst_left) {
+ /* next sg please */
+ p->dst_sg = sg_next(p->dst_sg);
+ BUG_ON(!p->dst_sg);
+ p->sg_dst_left = sg_dma_len(p->dst_sg);
+ p->dst_start = 0;
+ }
+
+ dbuf = sg_dma_address(p->dst_sg) + p->dst_start;
+
+ copy_len = min(p->sg_dst_left, len);
+ mv_tdma_memcpy(dbuf, sbuf, copy_len);
+
+ p->dst_start += copy_len;
+ p->sg_dst_left -= copy_len;
+
+ len -= copy_len;
+ sbuf += copy_len;
+ }
+}
+
static void setup_data_in(void)
{
struct req_progress *p = &cpg->p;
int data_in_sram =
min(p->hw_nbytes - p->hw_processed_bytes, cpg->max_req_size);
- copy_src_to_buf(p, cpg->sram + SRAM_DATA_IN_START + p->crypt_len,
+ dma_copy_src_to_buf(p, cpg->sram_phys + SRAM_DATA_IN_START + p->crypt_len,
data_in_sram - p->crypt_len);
p->crypt_len = data_in_sram;
}
@@ -307,22 +383,39 @@ static void mv_init_crypt_config(struct ablkcipher_request *req)
op->enc_key_p = SRAM_DATA_KEY_P;
op->enc_len = cpg->p.crypt_len;

- memcpy(cpg->sram + SRAM_CONFIG, &cpg->sa_sram,
+ dma_sync_single_for_device(cpg->dev, cpg->sa_sram_dma,
+ sizeof(struct sec_accel_sram), DMA_TO_DEVICE);
+ mv_tdma_memcpy(cpg->sram_phys + SRAM_CONFIG, cpg->sa_sram_dma,
sizeof(struct sec_accel_sram));

+ mv_tdma_separator();
+ dma_copy_buf_to_dst(&cpg->p, cpg->sram_phys + SRAM_DATA_OUT_START, cpg->p.crypt_len);
+
/* GO */
mv_setup_timer();
+ mv_tdma_trigger();
writel(SEC_CMD_EN_SEC_ACCL0, cpg->reg + SEC_ACCEL_CMD);
}

static void mv_update_crypt_config(void)
{
+ struct sec_accel_config *op = &cpg->sa_sram.op;
+
/* update the enc_len field only */
- memcpy(cpg->sram + SRAM_CONFIG + 2 * sizeof(u32),
- &cpg->p.crypt_len, sizeof(u32));
+
+ op->enc_len = cpg->p.crypt_len;
+
+ dma_sync_single_for_device(cpg->dev, cpg->sa_sram_dma + 2 * sizeof(u32),
+ sizeof(u32), DMA_TO_DEVICE);
+ mv_tdma_memcpy(cpg->sram_phys + SRAM_CONFIG + 2 * sizeof(u32),
+ cpg->sa_sram_dma + 2 * sizeof(u32), sizeof(u32));
+
+ mv_tdma_separator();
+ dma_copy_buf_to_dst(&cpg->p, cpg->sram_phys + SRAM_DATA_OUT_START, cpg->p.crypt_len);

/* GO */
mv_setup_timer();
+ mv_tdma_trigger();
writel(SEC_CMD_EN_SEC_ACCL0, cpg->reg + SEC_ACCEL_CMD);
}

@@ -331,6 +424,13 @@ static void mv_crypto_algo_completion(void)
struct ablkcipher_request *req = ablkcipher_request_cast(cpg->cur_req);
struct mv_req_ctx *req_ctx = ablkcipher_request_ctx(req);

+ if (req->src == req->dst) {
+ mv_dma_unmap_sg(req->src, req->nbytes, DMA_BIDIRECTIONAL);
+ } else {
+ mv_dma_unmap_sg(req->src, req->nbytes, DMA_TO_DEVICE);
+ mv_dma_unmap_sg(req->dst, req->nbytes, DMA_FROM_DEVICE);
+ }
+
if (req_ctx->op != COP_AES_CBC)
return ;

@@ -390,11 +490,20 @@ static void mv_init_hash_config(struct ahash_request *req)
writel(req_ctx->state[4], cpg->reg + DIGEST_INITIAL_VAL_E);
}

- memcpy(cpg->sram + SRAM_CONFIG, &cpg->sa_sram,
+ dma_sync_single_for_device(cpg->dev, cpg->sa_sram_dma,
+ sizeof(struct sec_accel_sram), DMA_TO_DEVICE);
+ mv_tdma_memcpy(cpg->sram_phys + SRAM_CONFIG, cpg->sa_sram_dma,
sizeof(struct sec_accel_sram));

+ mv_tdma_separator();
+
+ /* XXX: this fixes some ugly register fuckup bug in the tdma engine
+ * (no need to sync since the data is ignored anyway) */
+ mv_tdma_memcpy(cpg->sa_sram_dma, cpg->sram_phys + SRAM_CONFIG, 1);
+
/* GO */
mv_setup_timer();
+ mv_tdma_trigger();
writel(SEC_CMD_EN_SEC_ACCL0, cpg->reg + SEC_ACCEL_CMD);
}

@@ -424,13 +533,26 @@ static void mv_update_hash_config(void)
&& (req_ctx->count <= MAX_HW_HASH_SIZE);

op->config |= is_last ? CFG_LAST_FRAG : CFG_MID_FRAG;
- memcpy(cpg->sram + SRAM_CONFIG, &op->config, sizeof(u32));
+ dma_sync_single_for_device(cpg->dev, cpg->sa_sram_dma,
+ sizeof(u32), DMA_TO_DEVICE);
+ mv_tdma_memcpy(cpg->sram_phys + SRAM_CONFIG,
+ cpg->sa_sram_dma, sizeof(u32));

op->mac_digest = MAC_DIGEST_P(SRAM_DIGEST_BUF) | MAC_FRAG_LEN(p->crypt_len);
- memcpy(cpg->sram + SRAM_CONFIG + 6 * sizeof(u32), &op->mac_digest, sizeof(u32));
+ dma_sync_single_for_device(cpg->dev, cpg->sa_sram_dma + 6 * sizeof(u32),
+ sizeof(u32), DMA_TO_DEVICE);
+ mv_tdma_memcpy(cpg->sram_phys + SRAM_CONFIG + 6 * sizeof(u32),
+ cpg->sa_sram_dma + 6 * sizeof(u32), sizeof(u32));
+
+ mv_tdma_separator();
+
+ /* XXX: this fixes some ugly register fuckup bug in the tdma engine
+ * (no need to sync since the data is ignored anyway) */
+ mv_tdma_memcpy(cpg->sa_sram_dma, cpg->sram_phys + SRAM_CONFIG, 1);

/* GO */
mv_setup_timer();
+ mv_tdma_trigger();
writel(SEC_CMD_EN_SEC_ACCL0, cpg->reg + SEC_ACCEL_CMD);
}

@@ -504,43 +626,18 @@ static void mv_hash_algo_completion(void)
} else {
mv_save_digest_state(ctx);
}
+
+ mv_dma_unmap_sg(req->src, req->nbytes, DMA_TO_DEVICE);
}

static void dequeue_complete_req(void)
{
struct crypto_async_request *req = cpg->cur_req;
- void *buf;
cpg->p.hw_processed_bytes += cpg->p.crypt_len;
- if (cpg->p.copy_back) {
- int need_copy_len = cpg->p.crypt_len;
- int sram_offset = 0;
- do {
- int dst_copy;
-
- if (!cpg->p.sg_dst_left) {
- /* next sg please */
- cpg->p.dst_sg = sg_next(cpg->p.dst_sg);
- BUG_ON(!cpg->p.dst_sg);
- cpg->p.sg_dst_left = cpg->p.dst_sg->length;
- cpg->p.dst_start = 0;
- }
-
- buf = sg_virt(cpg->p.dst_sg) + cpg->p.dst_start;
-
- dst_copy = min(need_copy_len, cpg->p.sg_dst_left);
-
- memcpy(buf,
- cpg->sram + SRAM_DATA_OUT_START + sram_offset,
- dst_copy);
- sram_offset += dst_copy;
- cpg->p.sg_dst_left -= dst_copy;
- need_copy_len -= dst_copy;
- cpg->p.dst_start += dst_copy;
- } while (need_copy_len > 0);
- }
-
cpg->p.crypt_len = 0;

+ mv_tdma_clear();
+
BUG_ON(cpg->eng_st != ENGINE_W_DEQUEUE);
if (cpg->p.hw_processed_bytes < cpg->p.hw_nbytes) {
/* process next scatter list entry */
@@ -582,15 +679,28 @@ static void mv_start_new_crypt_req(struct ablkcipher_request *req)
p->hw_nbytes = req->nbytes;
p->complete = mv_crypto_algo_completion;
p->process = mv_update_crypt_config;
- p->copy_back = 1;
+
+ /* assume inplace request */
+ if (req->src == req->dst) {
+ if (!mv_dma_map_sg(req->src, req->nbytes, DMA_BIDIRECTIONAL))
+ return;
+ } else {
+ if (!mv_dma_map_sg(req->src, req->nbytes, DMA_TO_DEVICE))
+ return;
+
+ if (!mv_dma_map_sg(req->dst, req->nbytes, DMA_FROM_DEVICE)) {
+ mv_dma_unmap_sg(req->src, req->nbytes, DMA_TO_DEVICE);
+ return;
+ }
+ }

p->src_sg = req->src;
p->dst_sg = req->dst;
if (req->nbytes) {
BUG_ON(!req->src);
BUG_ON(!req->dst);
- p->sg_src_left = req->src->length;
- p->sg_dst_left = req->dst->length;
+ p->sg_src_left = sg_dma_len(req->src);
+ p->sg_dst_left = sg_dma_len(req->dst);
}

setup_data_in();
@@ -602,6 +712,7 @@ static void mv_start_new_hash_req(struct ahash_request *req)
struct req_progress *p = &cpg->p;
struct mv_req_hash_ctx *ctx = ahash_request_ctx(req);
int hw_bytes, old_extra_bytes, rc;
+
cpg->cur_req = &req->base;
memset(p, 0, sizeof(struct req_progress));
hw_bytes = req->nbytes + ctx->extra_bytes;
@@ -631,6 +742,11 @@ static void mv_start_new_hash_req(struct ahash_request *req)
p->crypt_len = old_extra_bytes;
}

+ if (!mv_dma_map_sg(req->src, req->nbytes, DMA_TO_DEVICE)) {
+ printk(KERN_ERR "%s: out of memory\n", __func__);
+ return;
+ }
+
setup_data_in();
mv_init_hash_config(req);
} else {
@@ -966,14 +1082,14 @@ irqreturn_t crypto_int(int irq, void *priv)
u32 val;

val = readl(cpg->reg + SEC_ACCEL_INT_STATUS);
- if (!(val & SEC_INT_ACCEL0_DONE))
+ if (!(val & SEC_INT_ACC0_IDMA_DONE))
return IRQ_NONE;

if (!del_timer(&cpg->completion_timer)) {
printk(KERN_WARNING MV_CESA
"got an interrupt but no pending timer?\n");
}
- val &= ~SEC_INT_ACCEL0_DONE;
+ val &= ~SEC_INT_ACC0_IDMA_DONE;
writel(val, cpg->reg + SEC_ACCEL_INT_STATUS);
BUG_ON(cpg->eng_st != ENGINE_BUSY);
cpg->eng_st = ENGINE_W_DEQUEUE;
@@ -1112,6 +1228,7 @@ static int mv_probe(struct platform_device *pdev)
}
cp->sram_size = resource_size(res);
cp->max_req_size = cp->sram_size - SRAM_CFG_SPACE;
+ cp->sram_phys = res->start;
cp->sram = ioremap(res->start, cp->sram_size);
if (!cp->sram) {
ret = -ENOMEM;
@@ -1127,6 +1244,7 @@ static int mv_probe(struct platform_device *pdev)

platform_set_drvdata(pdev, cp);
cpg = cp;
+ cpg->dev = &pdev->dev;

cp->queue_th = kthread_run(queue_manag, cp, "mv_crypto");
if (IS_ERR(cp->queue_th)) {
@@ -1140,10 +1258,14 @@ static int mv_probe(struct platform_device *pdev)
goto err_thread;

writel(0, cpg->reg + SEC_ACCEL_INT_STATUS);
- writel(SEC_INT_ACCEL0_DONE, cpg->reg + SEC_ACCEL_INT_MASK);
- writel(SEC_CFG_STOP_DIG_ERR, cpg->reg + SEC_ACCEL_CFG);
+ writel(SEC_INT_ACC0_IDMA_DONE, cpg->reg + SEC_ACCEL_INT_MASK);
+ writel((SEC_CFG_STOP_DIG_ERR | SEC_CFG_CH0_W_IDMA |
+ SEC_CFG_ACT_CH0_IDMA), cpg->reg + SEC_ACCEL_CFG);
writel(SRAM_CONFIG, cpg->reg + SEC_ACCEL_DESC_P0);

+ cp->sa_sram_dma = dma_map_single(&pdev->dev, &cp->sa_sram,
+ sizeof(struct sec_accel_sram), DMA_TO_DEVICE);
+
ret = crypto_register_alg(&mv_aes_alg_ecb);
if (ret) {
printk(KERN_WARNING MV_CESA
@@ -1202,6 +1324,8 @@ static int mv_remove(struct platform_device *pdev)
crypto_unregister_ahash(&mv_hmac_sha1_alg);
kthread_stop(cp->queue_th);
free_irq(cp->irq, cp);
+ dma_unmap_single(&pdev->dev, cpg->sa_sram_dma,
+ sizeof(struct sec_accel_sram), DMA_TO_DEVICE);
memset(cp->sram, 0, cp->sram_size);
iounmap(cp->sram);
iounmap(cp->reg);
--
1.7.3.4

2012-05-25 16:09:00

by Phil Sutter

Subject: [PATCH 12/13] mv_cesa: drop the now unused process callback

And while at it, simplify dequeue_complete_req() a bit.

Signed-off-by: Phil Sutter <[email protected]>
---
drivers/crypto/mv_cesa.c | 21 ++++++---------------
1 files changed, 6 insertions(+), 15 deletions(-)

diff --git a/drivers/crypto/mv_cesa.c b/drivers/crypto/mv_cesa.c
index 9afed2d..9a2f413 100644
--- a/drivers/crypto/mv_cesa.c
+++ b/drivers/crypto/mv_cesa.c
@@ -69,7 +69,6 @@ struct req_progress {
struct scatterlist *src_sg;
struct scatterlist *dst_sg;
void (*complete) (void);
- void (*process) (void);

/* src mostly */
int sg_src_left;
@@ -648,25 +647,17 @@ static void mv_hash_algo_completion(void)
static void dequeue_complete_req(void)
{
struct crypto_async_request *req = cpg->cur_req;
- cpg->p.hw_processed_bytes += cpg->p.crypt_len;
- cpg->p.crypt_len = 0;

mv_tdma_clear();
cpg->u32_usage = 0;

BUG_ON(cpg->eng_st != ENGINE_W_DEQUEUE);
- if (cpg->p.hw_processed_bytes < cpg->p.hw_nbytes) {
- /* process next scatter list entry */
- cpg->eng_st = ENGINE_BUSY;
- setup_data_in();
- cpg->p.process();
- } else {
- cpg->p.complete();
- cpg->eng_st = ENGINE_IDLE;
- local_bh_disable();
- req->complete(req, 0);
- local_bh_enable();
- }
+
+ cpg->p.complete();
+ cpg->eng_st = ENGINE_IDLE;
+ local_bh_disable();
+ req->complete(req, 0);
+ local_bh_enable();
}

static int count_sgs(struct scatterlist *sl, unsigned int total_bytes)
--
1.7.3.4

2012-05-25 16:09:03

by Phil Sutter

Subject: [PATCH 09/13] mv_cesa: implementing packet chain mode, only aes for now

This introduces a pool of four-byte DMA buffers for security
accelerator config updates.
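
With this, updating a single config word in SRAM reduces to one call
that enqueues the copy on the TDMA chain, e.g. for the enc_len field
(as done in mv_update_crypt_config() below):

	mv_tdma_u32_copy(cpg->sram_phys + SRAM_CONFIG + 2 * sizeof(u32),
			(u32)cpg->p.crypt_len);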

Signed-off-by: Phil Sutter <[email protected]>
---
drivers/crypto/mv_cesa.c | 134 ++++++++++++++++++++++++++++++++++++----------
drivers/crypto/mv_cesa.h | 1 +
2 files changed, 106 insertions(+), 29 deletions(-)

diff --git a/drivers/crypto/mv_cesa.c b/drivers/crypto/mv_cesa.c
index bc2692e..8e66080 100644
--- a/drivers/crypto/mv_cesa.c
+++ b/drivers/crypto/mv_cesa.c
@@ -10,6 +10,7 @@
#include <crypto/algapi.h>
#include <linux/crypto.h>
#include <linux/dma-mapping.h>
+#include <linux/dmapool.h>
#include <linux/interrupt.h>
#include <linux/io.h>
#include <linux/kthread.h>
@@ -27,6 +28,9 @@
#define MAX_HW_HASH_SIZE 0xFFFF
#define MV_CESA_EXPIRE 500 /* msec */

+#define MV_DMA_INIT_POOLSIZE 16
+#define MV_DMA_ALIGN 16
+
static int count_sgs(struct scatterlist *, unsigned int);

/*
@@ -96,6 +100,11 @@ struct sec_accel_sram {
#define sa_ivo type.hash.ivo
} __attribute__((packed));

+struct u32_mempair {
+ u32 *vaddr;
+ dma_addr_t daddr;
+};
+
struct crypto_priv {
struct device *dev;
void __iomem *reg;
@@ -118,6 +127,11 @@ struct crypto_priv {

struct sec_accel_sram sa_sram;
dma_addr_t sa_sram_dma;
+
+ struct dma_pool *u32_pool;
+ struct u32_mempair *u32_list;
+ int u32_list_len;
+ int u32_usage;
};

static struct crypto_priv *cpg;
@@ -189,6 +203,54 @@ static void mv_setup_timer(void)
jiffies + msecs_to_jiffies(MV_CESA_EXPIRE));
}

+#define U32_ITEM(x) (cpg->u32_list[x].vaddr)
+#define U32_ITEM_DMA(x) (cpg->u32_list[x].daddr)
+
+static inline int set_u32_poolsize(int nelem)
+{
+ /* need to increase size first if requested */
+ if (nelem > cpg->u32_list_len) {
+ struct u32_mempair *newmem;
+ int newsize = nelem * sizeof(struct u32_mempair);
+
+ newmem = krealloc(cpg->u32_list, newsize, GFP_KERNEL);
+ if (!newmem)
+ return -ENOMEM;
+ cpg->u32_list = newmem;
+ }
+
+ /* allocate/free dma descriptors, adjusting cpg->u32_list_len on the go */
+ for (; cpg->u32_list_len < nelem; cpg->u32_list_len++) {
+ U32_ITEM(cpg->u32_list_len) = dma_pool_alloc(cpg->u32_pool,
+ GFP_KERNEL, &U32_ITEM_DMA(cpg->u32_list_len));
+ if (!U32_ITEM((cpg->u32_list_len)))
+ return -ENOMEM;
+ }
+ for (; cpg->u32_list_len > nelem; cpg->u32_list_len--)
+ dma_pool_free(cpg->u32_pool, U32_ITEM(cpg->u32_list_len - 1),
+ U32_ITEM_DMA(cpg->u32_list_len - 1));
+
+ /* ignore size decreases but those to zero */
+ if (!nelem) {
+ kfree(cpg->u32_list);
+ cpg->u32_list = 0;
+ }
+ return 0;
+}
+
+static inline void mv_tdma_u32_copy(dma_addr_t dst, u32 val)
+{
+ if (unlikely(cpg->u32_usage == cpg->u32_list_len)
+ && set_u32_poolsize(cpg->u32_list_len << 1)) {
+ printk(KERN_ERR MV_CESA "resizing poolsize to %d failed\n",
+ cpg->u32_list_len << 1);
+ return;
+ }
+ *(U32_ITEM(cpg->u32_usage)) = val;
+ mv_tdma_memcpy(dst, U32_ITEM_DMA(cpg->u32_usage), sizeof(u32));
+ cpg->u32_usage++;
+}
+
static inline bool
mv_dma_map_sg(struct scatterlist *sg, int nbytes, enum dma_data_direction dir)
{
@@ -390,36 +452,13 @@ static void mv_init_crypt_config(struct ablkcipher_request *req)
sizeof(struct sec_accel_sram), DMA_TO_DEVICE);
mv_tdma_memcpy(cpg->sram_phys + SRAM_CONFIG, cpg->sa_sram_dma,
sizeof(struct sec_accel_sram));
-
- mv_tdma_separator();
- dma_copy_buf_to_dst(&cpg->p, cpg->sram_phys + SRAM_DATA_OUT_START, cpg->p.crypt_len);
-
- /* GO */
- mv_setup_timer();
- mv_tdma_trigger();
- writel(SEC_CMD_EN_SEC_ACCL0, cpg->reg + SEC_ACCEL_CMD);
}

static void mv_update_crypt_config(void)
{
- struct sec_accel_config *op = &cpg->sa_sram.op;
-
/* update the enc_len field only */
-
- op->enc_len = cpg->p.crypt_len;
-
- dma_sync_single_for_device(cpg->dev, cpg->sa_sram_dma + 2 * sizeof(u32),
- sizeof(u32), DMA_TO_DEVICE);
- mv_tdma_memcpy(cpg->sram_phys + SRAM_CONFIG + 2 * sizeof(u32),
- cpg->sa_sram_dma + 2 * sizeof(u32), sizeof(u32));
-
- mv_tdma_separator();
- dma_copy_buf_to_dst(&cpg->p, cpg->sram_phys + SRAM_DATA_OUT_START, cpg->p.crypt_len);
-
- /* GO */
- mv_setup_timer();
- mv_tdma_trigger();
- writel(SEC_CMD_EN_SEC_ACCL0, cpg->reg + SEC_ACCEL_CMD);
+ mv_tdma_u32_copy(cpg->sram_phys + SRAM_CONFIG + 2 * sizeof(u32),
+ (u32)cpg->p.crypt_len);
}

static void mv_crypto_algo_completion(void)
@@ -658,6 +697,7 @@ static void dequeue_complete_req(void)
cpg->p.crypt_len = 0;

mv_tdma_clear();
+ cpg->u32_usage = 0;

BUG_ON(cpg->eng_st != ENGINE_W_DEQUEUE);
if (cpg->p.hw_processed_bytes < cpg->p.hw_nbytes) {
@@ -699,7 +739,6 @@ static void mv_start_new_crypt_req(struct ablkcipher_request *req)
memset(p, 0, sizeof(struct req_progress));
p->hw_nbytes = req->nbytes;
p->complete = mv_crypto_algo_completion;
- p->process = mv_update_crypt_config;

/* assume inplace request */
if (req->src == req->dst) {
@@ -726,6 +765,24 @@ static void mv_start_new_crypt_req(struct ablkcipher_request *req)

setup_data_in();
mv_init_crypt_config(req);
+ mv_tdma_separator();
+ dma_copy_buf_to_dst(&cpg->p, cpg->sram_phys + SRAM_DATA_OUT_START, cpg->p.crypt_len);
+ cpg->p.hw_processed_bytes += cpg->p.crypt_len;
+ while (cpg->p.hw_processed_bytes < cpg->p.hw_nbytes) {
+ cpg->p.crypt_len = 0;
+
+ setup_data_in();
+ mv_update_crypt_config();
+ mv_tdma_separator();
+ dma_copy_buf_to_dst(&cpg->p, cpg->sram_phys + SRAM_DATA_OUT_START, cpg->p.crypt_len);
+ cpg->p.hw_processed_bytes += cpg->p.crypt_len;
+ }
+
+
+ /* GO */
+ mv_setup_timer();
+ mv_tdma_trigger();
+ writel(SEC_CMD_EN_SEC_ACCL0, cpg->reg + SEC_ACCEL_CMD);
}

static void mv_start_new_hash_req(struct ahash_request *req)
@@ -1285,18 +1342,29 @@ static int mv_probe(struct platform_device *pdev)

writel(0, cpg->reg + SEC_ACCEL_INT_STATUS);
writel(SEC_INT_ACC0_IDMA_DONE, cpg->reg + SEC_ACCEL_INT_MASK);
- writel((SEC_CFG_STOP_DIG_ERR | SEC_CFG_CH0_W_IDMA |
+ writel((SEC_CFG_STOP_DIG_ERR | SEC_CFG_CH0_W_IDMA | SEC_CFG_MP_CHAIN |
SEC_CFG_ACT_CH0_IDMA), cpg->reg + SEC_ACCEL_CFG);
writel(SRAM_CONFIG, cpg->reg + SEC_ACCEL_DESC_P0);

cp->sa_sram_dma = dma_map_single(&pdev->dev, &cp->sa_sram,
sizeof(struct sec_accel_sram), DMA_TO_DEVICE);

+ cpg->u32_pool = dma_pool_create("CESA U32 Item Pool",
+ &pdev->dev, sizeof(u32), MV_DMA_ALIGN, 0);
+ if (!cpg->u32_pool) {
+ ret = -ENOMEM;
+ goto err_mapping;
+ }
+ if (set_u32_poolsize(MV_DMA_INIT_POOLSIZE)) {
+ printk(KERN_ERR MV_CESA "failed to initialise poolsize\n");
+ goto err_pool;
+ }
+
ret = crypto_register_alg(&mv_aes_alg_ecb);
if (ret) {
printk(KERN_WARNING MV_CESA
"Could not register aes-ecb driver\n");
- goto err_irq;
+ goto err_poolsize;
}

ret = crypto_register_alg(&mv_aes_alg_cbc);
@@ -1323,7 +1391,13 @@ static int mv_probe(struct platform_device *pdev)
return 0;
err_unreg_ecb:
crypto_unregister_alg(&mv_aes_alg_ecb);
-err_irq:
+err_poolsize:
+ set_u32_poolsize(0);
+err_pool:
+ dma_pool_destroy(cpg->u32_pool);
+err_mapping:
+ dma_unmap_single(&pdev->dev, cpg->sa_sram_dma,
+ sizeof(struct sec_accel_sram), DMA_TO_DEVICE);
free_irq(irq, cp);
err_thread:
kthread_stop(cp->queue_th);
@@ -1352,6 +1426,8 @@ static int mv_remove(struct platform_device *pdev)
free_irq(cp->irq, cp);
dma_unmap_single(&pdev->dev, cpg->sa_sram_dma,
sizeof(struct sec_accel_sram), DMA_TO_DEVICE);
+ set_u32_poolsize(0);
+ dma_pool_destroy(cpg->u32_pool);
memset(cp->sram, 0, cp->sram_size);
iounmap(cp->sram);
iounmap(cp->reg);
diff --git a/drivers/crypto/mv_cesa.h b/drivers/crypto/mv_cesa.h
index 81ce109..83730ca 100644
--- a/drivers/crypto/mv_cesa.h
+++ b/drivers/crypto/mv_cesa.h
@@ -24,6 +24,7 @@
#define SEC_CFG_CH1_W_IDMA (1 << 8)
#define SEC_CFG_ACT_CH0_IDMA (1 << 9)
#define SEC_CFG_ACT_CH1_IDMA (1 << 10)
+#define SEC_CFG_MP_CHAIN (1 << 11)

#define SEC_ACCEL_STATUS 0xde0c
#define SEC_ST_ACT_0 (1 << 0)
--
1.7.3.4

2012-05-25 16:08:59

by Phil Sutter

Subject: [PATCH 05/13] add a driver for the Marvell TDMA engine

This is a DMA engine integrated into the Marvell Kirkwood SoC, designed
to offload data transfers from/to the CESA crypto engine.
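
The interface it exposes to mv_cesa.c is deliberately small; sketched
below with the names the other patches in this series use:

	mv_tdma_memcpy(dst_dma, src_dma, len);	/* enqueue a copy descriptor */
	mv_tdma_separator();	/* mark a hand-over point to the CESA engine */
	mv_tdma_trigger();	/* start processing the descriptor chain */
	mv_tdma_clear();	/* reset the chain once a request is done */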

Signed-off-by: Phil Sutter <[email protected]>
---
arch/arm/mach-kirkwood/common.c | 33 +++
arch/arm/mach-kirkwood/include/mach/irqs.h | 1 +
drivers/crypto/Kconfig | 5 +
drivers/crypto/Makefile | 3 +-
drivers/crypto/mv_tdma.c | 377 ++++++++++++++++++++++++++++
drivers/crypto/mv_tdma.h | 50 ++++
6 files changed, 468 insertions(+), 1 deletions(-)
create mode 100644 drivers/crypto/mv_tdma.c
create mode 100644 drivers/crypto/mv_tdma.h

diff --git a/arch/arm/mach-kirkwood/common.c b/arch/arm/mach-kirkwood/common.c
index 3ad0373..adc6eff 100644
--- a/arch/arm/mach-kirkwood/common.c
+++ b/arch/arm/mach-kirkwood/common.c
@@ -269,9 +269,42 @@ void __init kirkwood_uart1_init(void)
/*****************************************************************************
* Cryptographic Engines and Security Accelerator (CESA)
****************************************************************************/
+static struct resource kirkwood_tdma_res[] = {
+ {
+ .name = "regs deco",
+ .start = CRYPTO_PHYS_BASE + 0xA00,
+ .end = CRYPTO_PHYS_BASE + 0xA24,
+ .flags = IORESOURCE_MEM,
+ }, {
+ .name = "regs control and error",
+ .start = CRYPTO_PHYS_BASE + 0x800,
+ .end = CRYPTO_PHYS_BASE + 0x8CF,
+ .flags = IORESOURCE_MEM,
+ }, {
+ .name = "crypto error",
+ .start = IRQ_KIRKWOOD_TDMA_ERR,
+ .end = IRQ_KIRKWOOD_TDMA_ERR,
+ .flags = IORESOURCE_IRQ,
+ },
+};
+
+static u64 mv_tdma_dma_mask = 0xffffffffUL;
+
+static struct platform_device kirkwood_tdma_device = {
+ .name = "mv_tdma",
+ .id = -1,
+ .dev = {
+ .dma_mask = &mv_tdma_dma_mask,
+ .coherent_dma_mask = 0xffffffff,
+ },
+ .num_resources = ARRAY_SIZE(kirkwood_tdma_res),
+ .resource = kirkwood_tdma_res,
+};
+
void __init kirkwood_crypto_init(void)
{
kirkwood_clk_ctrl |= CGC_CRYPTO;
+ platform_device_register(&kirkwood_tdma_device);
orion_crypto_init(CRYPTO_PHYS_BASE, KIRKWOOD_SRAM_PHYS_BASE,
KIRKWOOD_SRAM_SIZE, IRQ_KIRKWOOD_CRYPTO);
}
diff --git a/arch/arm/mach-kirkwood/include/mach/irqs.h b/arch/arm/mach-kirkwood/include/mach/irqs.h
index 2bf8161..a66aa3f 100644
--- a/arch/arm/mach-kirkwood/include/mach/irqs.h
+++ b/arch/arm/mach-kirkwood/include/mach/irqs.h
@@ -51,6 +51,7 @@
#define IRQ_KIRKWOOD_GPIO_HIGH_16_23 41
#define IRQ_KIRKWOOD_GE00_ERR 46
#define IRQ_KIRKWOOD_GE01_ERR 47
+#define IRQ_KIRKWOOD_TDMA_ERR 49
#define IRQ_KIRKWOOD_RTC 53

/*
diff --git a/drivers/crypto/Kconfig b/drivers/crypto/Kconfig
index 1092a77..17becf3 100644
--- a/drivers/crypto/Kconfig
+++ b/drivers/crypto/Kconfig
@@ -159,6 +159,10 @@ config CRYPTO_GHASH_S390

It is available as of z196.

+config CRYPTO_DEV_MV_TDMA
+ tristate
+ default n
+
config CRYPTO_DEV_MV_CESA
tristate "Marvell's Cryptographic Engine"
depends on PLAT_ORION
@@ -166,6 +170,7 @@ config CRYPTO_DEV_MV_CESA
select CRYPTO_AES
select CRYPTO_BLKCIPHER2
select CRYPTO_HASH
+ select CRYPTO_DEV_MV_TDMA
help
This driver allows you to utilize the Cryptographic Engines and
Security Accelerator (CESA) which can be found on the Marvell Orion
diff --git a/drivers/crypto/Makefile b/drivers/crypto/Makefile
index 0139032..65806e8 100644
--- a/drivers/crypto/Makefile
+++ b/drivers/crypto/Makefile
@@ -4,6 +4,7 @@ obj-$(CONFIG_CRYPTO_DEV_GEODE) += geode-aes.o
obj-$(CONFIG_CRYPTO_DEV_NIAGARA2) += n2_crypto.o
n2_crypto-y := n2_core.o n2_asm.o
obj-$(CONFIG_CRYPTO_DEV_HIFN_795X) += hifn_795x.o
+obj-$(CONFIG_CRYPTO_DEV_MV_TDMA) += mv_tdma.o
obj-$(CONFIG_CRYPTO_DEV_MV_CESA) += mv_cesa.o
obj-$(CONFIG_CRYPTO_DEV_TALITOS) += talitos.o
obj-$(CONFIG_CRYPTO_DEV_FSL_CAAM) += caam/
@@ -14,4 +15,4 @@ obj-$(CONFIG_CRYPTO_DEV_OMAP_AES) += omap-aes.o
obj-$(CONFIG_CRYPTO_DEV_PICOXCELL) += picoxcell_crypto.o
obj-$(CONFIG_CRYPTO_DEV_S5P) += s5p-sss.o
obj-$(CONFIG_CRYPTO_DEV_TEGRA_AES) += tegra-aes.o
-obj-$(CONFIG_CRYPTO_DEV_UX500) += ux500/
\ No newline at end of file
+obj-$(CONFIG_CRYPTO_DEV_UX500) += ux500/
diff --git a/drivers/crypto/mv_tdma.c b/drivers/crypto/mv_tdma.c
new file mode 100644
index 0000000..aa5316a
--- /dev/null
+++ b/drivers/crypto/mv_tdma.c
@@ -0,0 +1,377 @@
+/*
+ * Support for Marvell's TDMA engine found on Kirkwood chips,
+ * used exclusively by the CESA crypto accelerator.
+ *
+ * Based on unpublished code for IDMA written by Sebastian Siewior.
+ *
+ * Copyright (C) 2012 Phil Sutter <[email protected]>
+ * License: GPLv2
+ */
+
+#include <linux/delay.h>
+#include <linux/dma-mapping.h>
+#include <linux/dmapool.h>
+#include <linux/interrupt.h>
+#include <linux/module.h>
+#include <linux/slab.h>
+#include <linux/platform_device.h>
+
+#include "mv_tdma.h"
+
+#define MV_TDMA "MV-TDMA: "
+
+#define MV_DMA_INIT_POOLSIZE 16
+#define MV_DMA_ALIGN 16
+
+struct tdma_desc {
+ u32 count;
+ u32 src;
+ u32 dst;
+ u32 next;
+} __attribute__((packed));
+
+struct desc_mempair {
+ struct tdma_desc *vaddr;
+ dma_addr_t daddr;
+};
+
+struct tdma_priv {
+ struct device *dev;
+ void __iomem *reg;
+ int irq;
+ /* protects the DMA descriptor list and related state */
+ spinlock_t lock;
+ struct dma_pool *descpool;
+ struct desc_mempair *desclist;
+ int desclist_len;
+ int desc_usage;
+} tpg;
+
+#define DESC(x) (tpg.desclist[x].vaddr)
+#define DESC_DMA(x) (tpg.desclist[x].daddr)
+
+static inline int set_poolsize(int nelem)
+{
+ /* need to increase size first if requested */
+ if (nelem > tpg.desclist_len) {
+ struct desc_mempair *newmem;
+ int newsize = nelem * sizeof(struct desc_mempair);
+
+ newmem = krealloc(tpg.desclist, newsize, GFP_KERNEL);
+ if (!newmem)
+ return -ENOMEM;
+ tpg.desclist = newmem;
+ }
+
+ /* allocate/free dma descriptors, adjusting tpg.desclist_len on the go */
+ for (; tpg.desclist_len < nelem; tpg.desclist_len++) {
+ DESC(tpg.desclist_len) = dma_pool_alloc(tpg.descpool,
+ GFP_KERNEL, &DESC_DMA(tpg.desclist_len));
+ if (!DESC((tpg.desclist_len)))
+ return -ENOMEM;
+ }
+ for (; tpg.desclist_len > nelem; tpg.desclist_len--)
+ dma_pool_free(tpg.descpool, DESC(tpg.desclist_len - 1),
+ DESC_DMA(tpg.desclist_len - 1));
+
+ /* ignore size decreases, except those down to zero */
+ if (!nelem) {
+ kfree(tpg.desclist);
+ tpg.desclist = 0;
+ }
+ return 0;
+}
+
+static inline void wait_for_tdma_idle(void)
+{
+ while (readl(tpg.reg + TDMA_CTRL) & TDMA_CTRL_ACTIVE)
+ mdelay(100);
+}
+
+static inline void switch_tdma_engine(bool state)
+{
+ u32 val = readl(tpg.reg + TDMA_CTRL);
+
+ val |= ( state * TDMA_CTRL_ENABLE);
+ val &= ~(!state * TDMA_CTRL_ENABLE);
+
+ writel(val, tpg.reg + TDMA_CTRL);
+}
+
+static struct tdma_desc *get_new_last_desc(void)
+{
+ if (unlikely(tpg.desc_usage == tpg.desclist_len) &&
+ set_poolsize(tpg.desclist_len << 1)) {
+ printk(KERN_ERR MV_TDMA "failed to increase DMA pool to %d\n",
+ tpg.desclist_len << 1);
+ return NULL;
+ }
+
+ if (likely(tpg.desc_usage))
+ DESC(tpg.desc_usage - 1)->next = DESC_DMA(tpg.desc_usage);
+
+ return DESC(tpg.desc_usage++);
+}
+
+static inline void mv_tdma_desc_dump(void)
+{
+ struct tdma_desc *tmp;
+ int i;
+
+ if (!tpg.desc_usage) {
+ printk(KERN_WARNING MV_TDMA "DMA descriptor list is empty\n");
+ return;
+ }
+
+ printk(KERN_WARNING MV_TDMA "DMA descriptor list:\n");
+ for (i = 0; i < tpg.desc_usage; i++) {
+ tmp = DESC(i);
+ printk(KERN_WARNING MV_TDMA "entry %d at 0x%x: dma addr 0x%x, "
+ "src 0x%x, dst 0x%x, count %u, own %d, next 0x%x", i,
+ (u32)tmp, DESC_DMA(i) , tmp->src, tmp->dst,
+ tmp->count & ~TDMA_OWN_BIT, !!(tmp->count & TDMA_OWN_BIT),
+ tmp->next);
+ }
+}
+
+static inline void mv_tdma_reg_dump(void)
+{
+#define PRINTREG(offset) \
+ printk(KERN_WARNING MV_TDMA "tpg.reg + " #offset " = 0x%x\n", \
+ readl(tpg.reg + offset))
+
+ PRINTREG(TDMA_CTRL);
+ PRINTREG(TDMA_BYTE_COUNT);
+ PRINTREG(TDMA_SRC_ADDR);
+ PRINTREG(TDMA_DST_ADDR);
+ PRINTREG(TDMA_NEXT_DESC);
+ PRINTREG(TDMA_CURR_DESC);
+
+#undef PRINTREG
+}
+
+void mv_tdma_clear(void)
+{
+ if (!tpg.dev)
+ return;
+
+ spin_lock(&tpg.lock);
+
+ /* make sure tdma is idle */
+ wait_for_tdma_idle();
+ switch_tdma_engine(0);
+ wait_for_tdma_idle();
+
+ /* clear descriptor registers */
+ writel(0, tpg.reg + TDMA_BYTE_COUNT);
+ writel(0, tpg.reg + TDMA_CURR_DESC);
+ writel(0, tpg.reg + TDMA_NEXT_DESC);
+
+ tpg.desc_usage = 0;
+
+ switch_tdma_engine(1);
+
+ /* finally release the lock again */
+ spin_unlock(&tpg.lock);
+}
+EXPORT_SYMBOL_GPL(mv_tdma_clear);
+
+void mv_tdma_trigger(void)
+{
+ if (!tpg.dev)
+ return;
+
+ spin_lock(&tpg.lock);
+
+ writel(DESC_DMA(0), tpg.reg + TDMA_NEXT_DESC);
+
+ spin_unlock(&tpg.lock);
+}
+EXPORT_SYMBOL_GPL(mv_tdma_trigger);
+
+void mv_tdma_separator(void)
+{
+ struct tdma_desc *tmp;
+
+ if (!tpg.dev)
+ return;
+
+ spin_lock(&tpg.lock);
+
+ tmp = get_new_last_desc();
+ memset(tmp, 0, sizeof(*tmp));
+
+ spin_unlock(&tpg.lock);
+}
+EXPORT_SYMBOL_GPL(mv_tdma_separator);
+
+void mv_tdma_memcpy(dma_addr_t dst, dma_addr_t src, unsigned int size)
+{
+ struct tdma_desc *tmp;
+
+ if (!tpg.dev)
+ return;
+
+ spin_lock(&tpg.lock);
+
+ tmp = get_new_last_desc();
+ tmp->count = size | TDMA_OWN_BIT;
+ tmp->src = src;
+ tmp->dst = dst;
+ tmp->next = 0;
+
+ spin_unlock(&tpg.lock);
+}
+EXPORT_SYMBOL_GPL(mv_tdma_memcpy);
+
+irqreturn_t tdma_int(int irq, void *priv)
+{
+ u32 val;
+
+ val = readl(tpg.reg + TDMA_ERR_CAUSE);
+
+ if (val & TDMA_INT_MISS)
+ printk(KERN_ERR MV_TDMA "%s: miss!\n", __func__);
+ if (val & TDMA_INT_DOUBLE_HIT)
+ printk(KERN_ERR MV_TDMA "%s: double hit!\n", __func__);
+ if (val & TDMA_INT_BOTH_HIT)
+ printk(KERN_ERR MV_TDMA "%s: both hit!\n", __func__);
+ if (val & TDMA_INT_DATA_ERROR)
+ printk(KERN_ERR MV_TDMA "%s: data error!\n", __func__);
+ if (val) {
+ mv_tdma_reg_dump();
+ mv_tdma_desc_dump();
+ }
+
+ switch_tdma_engine(0);
+ wait_for_tdma_idle();
+
+ /* clear descriptor registers */
+ writel(0, tpg.reg + TDMA_BYTE_COUNT);
+ writel(0, tpg.reg + TDMA_SRC_ADDR);
+ writel(0, tpg.reg + TDMA_DST_ADDR);
+ writel(0, tpg.reg + TDMA_CURR_DESC);
+
+ /* clear error cause register */
+ writel(0, tpg.reg + TDMA_ERR_CAUSE);
+
+ /* initialize control register (also enables engine) */
+ writel(TDMA_CTRL_INIT_VALUE, tpg.reg + TDMA_CTRL);
+ wait_for_tdma_idle();
+
+ return (val ? IRQ_HANDLED : IRQ_NONE);
+}
+
+static int mv_probe(struct platform_device *pdev)
+{
+ struct resource *res;
+ int rc;
+
+ if (tpg.dev) {
+ printk(KERN_ERR MV_TDMA "second TDMA device?!\n");
+ return -ENXIO;
+ }
+ tpg.dev = &pdev->dev;
+
+ res = platform_get_resource_byname(pdev,
+ IORESOURCE_MEM, "regs control and error");
+ if (!res)
+ return -ENXIO;
+
+ if (!(tpg.reg = ioremap(res->start, resource_size(res))))
+ return -ENOMEM;
+
+ tpg.irq = platform_get_irq(pdev, 0);
+ if (tpg.irq < 0 || tpg.irq == NO_IRQ) {
+ rc = -ENXIO;
+ goto out_unmap_reg;
+ }
+
+ tpg.descpool = dma_pool_create("TDMA Descriptor Pool", tpg.dev,
+ sizeof(struct tdma_desc), MV_DMA_ALIGN, 0);
+ if (!tpg.descpool) {
+ rc = -ENOMEM;
+ goto out_free_irq;
+ }
+ set_poolsize(MV_DMA_INIT_POOLSIZE);
+
+ platform_set_drvdata(pdev, &tpg);
+
+ switch_tdma_engine(0);
+ wait_for_tdma_idle();
+
+ /* clear descriptor registers */
+ writel(0, tpg.reg + TDMA_BYTE_COUNT);
+ writel(0, tpg.reg + TDMA_SRC_ADDR);
+ writel(0, tpg.reg + TDMA_DST_ADDR);
+ writel(0, tpg.reg + TDMA_CURR_DESC);
+
+ /* listen for error interrupts */
+ writel(TDMA_INT_ALL, tpg.reg + TDMA_ERR_MASK);
+ writel(0, tpg.reg + TDMA_ERR_CAUSE);
+
+ /* initialize control register (also enables engine) */
+ writel(TDMA_CTRL_INIT_VALUE, tpg.reg + TDMA_CTRL);
+ wait_for_tdma_idle();
+
+ if (request_irq(tpg.irq, tdma_int, IRQF_DISABLED,
+ dev_name(tpg.dev), &tpg)) {
+ rc = -ENXIO;
+ goto out_free_all;
+ }
+
+ spin_lock_init(&tpg.lock);
+
+ printk(KERN_INFO MV_TDMA "up and running, IRQ %d\n", tpg.irq);
+ return 0;
+out_free_all:
+ switch_tdma_engine(0);
+ platform_set_drvdata(pdev, NULL);
+ set_poolsize(0);
+ dma_pool_destroy(tpg.descpool);
+out_free_irq:
+ free_irq(tpg.irq, &tpg);
+out_unmap_reg:
+ iounmap(tpg.reg);
+ tpg.dev = NULL;
+ return rc;
+}
+
+static int mv_remove(struct platform_device *pdev)
+{
+ switch_tdma_engine(0);
+ platform_set_drvdata(pdev, NULL);
+ set_poolsize(0);
+ dma_pool_destroy(tpg.descpool);
+ free_irq(tpg.irq, &tpg);
+ iounmap(tpg.reg);
+ tpg.dev = NULL;
+ return 0;
+}
+
+static struct platform_driver marvell_tdma = {
+ .probe = mv_probe,
+ .remove = mv_remove,
+ .driver = {
+ .owner = THIS_MODULE,
+ .name = "mv_tdma",
+ },
+};
+MODULE_ALIAS("platform:mv_tdma");
+
+static int __init mv_tdma_init(void)
+{
+ return platform_driver_register(&marvell_tdma);
+}
+module_init(mv_tdma_init);
+
+static void __exit mv_tdma_exit(void)
+{
+ platform_driver_unregister(&marvell_tdma);
+}
+module_exit(mv_tdma_exit);
+
+MODULE_AUTHOR("Phil Sutter <[email protected]>");
+MODULE_DESCRIPTION("Support for Marvell's TDMA engine");
+MODULE_LICENSE("GPL");
+
diff --git a/drivers/crypto/mv_tdma.h b/drivers/crypto/mv_tdma.h
new file mode 100644
index 0000000..3efa44c3
--- /dev/null
+++ b/drivers/crypto/mv_tdma.h
@@ -0,0 +1,50 @@
+#ifndef _MV_TDMA_H
+#define _MV_TDMA_H
+
+/* TDMA_CTRL register bits */
+#define TDMA_CTRL_DST_BURST(x) (x)
+#define TDMA_CTRL_DST_BURST_32 TDMA_CTRL_DST_BURST(3)
+#define TDMA_CTRL_DST_BURST_128 TDMA_CTRL_DST_BURST(4)
+#define TDMA_CTRL_OUTST_RD_EN (1 << 4)
+#define TDMA_CTRL_SRC_BURST(x) (x << 6)
+#define TDMA_CTRL_SRC_BURST_32 TDMA_CTRL_SRC_BURST(3)
+#define TDMA_CTRL_SRC_BURST_128 TDMA_CTRL_SRC_BURST(4)
+#define TDMA_CTRL_NO_CHAIN_MODE (1 << 9)
+#define TDMA_CTRL_NO_BYTE_SWAP (1 << 11)
+#define TDMA_CTRL_ENABLE (1 << 12)
+#define TDMA_CTRL_FETCH_ND (1 << 13)
+#define TDMA_CTRL_ACTIVE (1 << 14)
+
+#define TDMA_CTRL_INIT_VALUE ( \
+ TDMA_CTRL_DST_BURST_128 | TDMA_CTRL_SRC_BURST_128 | \
+ TDMA_CTRL_NO_BYTE_SWAP | TDMA_CTRL_ENABLE \
+)
+
+/* TDMA_ERR_CAUSE bits */
+#define TDMA_INT_MISS (1 << 0)
+#define TDMA_INT_DOUBLE_HIT (1 << 1)
+#define TDMA_INT_BOTH_HIT (1 << 2)
+#define TDMA_INT_DATA_ERROR (1 << 3)
+#define TDMA_INT_ALL 0x0f
+
+/* offsets of registers, starting at "regs control and error" */
+#define TDMA_BYTE_COUNT 0x00
+#define TDMA_SRC_ADDR 0x10
+#define TDMA_DST_ADDR 0x20
+#define TDMA_NEXT_DESC 0x30
+#define TDMA_CTRL 0x40
+#define TDMA_CURR_DESC 0x70
+#define TDMA_ERR_CAUSE 0xc8
+#define TDMA_ERR_MASK 0xcc
+
+/* Owner bit in TDMA_BYTE_COUNT and descriptors' count field, used
+ * to signal TDMA in descriptor chain when input data is complete. */
+#define TDMA_OWN_BIT (1 << 31)
+
+extern void mv_tdma_memcpy(dma_addr_t, dma_addr_t, unsigned int);
+extern void mv_tdma_separator(void);
+extern void mv_tdma_clear(void);
+extern void mv_tdma_trigger(void);
+
+
+#endif /* _MV_TDMA_H */
--
1.7.3.4
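
To make the intended usage concrete, here is a minimal sketch of a client
of this API; the four mv_tdma_* calls and their signatures come from the
patch above, while the function and its buffer parameters are
hypothetical:

#include <linux/dma-mapping.h>
#include "mv_tdma.h"

/* hypothetical client: stage a config copy into the crypto engine's
 * SRAM, a barrier, and a result copy back out -- then start the chain */
static void example_client(dma_addr_t sram_cfg, dma_addr_t dram_cfg,
                           dma_addr_t dram_out, dma_addr_t sram_out,
                           unsigned int cfg_len, unsigned int out_len)
{
        mv_tdma_memcpy(sram_cfg, dram_cfg, cfg_len);    /* dst, src, size */
        mv_tdma_separator();    /* engine pauses here until restarted */
        mv_tdma_memcpy(dram_out, sram_out, out_len);
        mv_tdma_trigger();      /* point TDMA_NEXT_DESC at the chain head */
}

/* once the request has completed (e.g. from the done interrupt),
 * mv_tdma_clear() resets the descriptor list for the next request */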

2012-05-25 16:09:02

by Phil Sutter

Subject: [PATCH 03/13] mv_cesa: prepare the full sram config in dram

This way, reconfiguring the cryptographic accelerator consists of a
single step (a memcpy here), which in the future can be done by the TDMA
engine.

This patch introduces some ugly IV copying, necessary for input buffers
above 1920 bytes. But this will go away again later in the series.

Signed-off-by: Phil Sutter <[email protected]>
---
drivers/crypto/mv_cesa.c | 83 ++++++++++++++++++++++++++++-----------------
1 files changed, 52 insertions(+), 31 deletions(-)

diff --git a/drivers/crypto/mv_cesa.c b/drivers/crypto/mv_cesa.c
index 3862a93..68b83d8 100644
--- a/drivers/crypto/mv_cesa.c
+++ b/drivers/crypto/mv_cesa.c
@@ -76,6 +76,24 @@ struct req_progress {
int hw_processed_bytes;
};

+struct sec_accel_sram {
+ struct sec_accel_config op;
+ union {
+ struct {
+ u32 key[8];
+ u32 iv[4];
+ } crypt;
+ struct {
+ u32 ivi[5];
+ u32 ivo[5];
+ } hash;
+ } type;
+#define sa_key type.crypt.key
+#define sa_iv type.crypt.iv
+#define sa_ivi type.hash.ivi
+#define sa_ivo type.hash.ivo
+} __attribute__((packed));
+
struct crypto_priv {
void __iomem *reg;
void __iomem *sram;
@@ -93,6 +111,8 @@ struct crypto_priv {
int sram_size;
int has_sha1;
int has_hmac_sha1;
+
+ struct sec_accel_sram sa_sram;
};

static struct crypto_priv *cpg;
@@ -250,48 +270,49 @@ static void mv_process_current_q(int first_block)
struct ablkcipher_request *req = ablkcipher_request_cast(cpg->cur_req);
struct mv_ctx *ctx = crypto_tfm_ctx(req->base.tfm);
struct mv_req_ctx *req_ctx = ablkcipher_request_ctx(req);
- struct sec_accel_config op;
+ struct sec_accel_config *op = &cpg->sa_sram.op;

switch (req_ctx->op) {
case COP_AES_ECB:
- op.config = CFG_OP_CRYPT_ONLY | CFG_ENCM_AES | CFG_ENC_MODE_ECB;
+ op->config = CFG_OP_CRYPT_ONLY | CFG_ENCM_AES | CFG_ENC_MODE_ECB;
break;
case COP_AES_CBC:
default:
- op.config = CFG_OP_CRYPT_ONLY | CFG_ENCM_AES | CFG_ENC_MODE_CBC;
- op.enc_iv = ENC_IV_POINT(SRAM_DATA_IV) |
+ op->config = CFG_OP_CRYPT_ONLY | CFG_ENCM_AES | CFG_ENC_MODE_CBC;
+ op->enc_iv = ENC_IV_POINT(SRAM_DATA_IV) |
ENC_IV_BUF_POINT(SRAM_DATA_IV_BUF);
- if (first_block)
- memcpy(cpg->sram + SRAM_DATA_IV, req->info, 16);
+ if (!first_block)
+ memcpy(req->info, cpg->sram + SRAM_DATA_IV_BUF, 16);
+ memcpy(cpg->sa_sram.sa_iv, req->info, 16);
break;
}
if (req_ctx->decrypt) {
- op.config |= CFG_DIR_DEC;
- memcpy(cpg->sram + SRAM_DATA_KEY_P, ctx->aes_dec_key, AES_KEY_LEN);
+ op->config |= CFG_DIR_DEC;
+ memcpy(cpg->sa_sram.sa_key, ctx->aes_dec_key, AES_KEY_LEN);
} else {
- op.config |= CFG_DIR_ENC;
- memcpy(cpg->sram + SRAM_DATA_KEY_P, ctx->aes_enc_key, AES_KEY_LEN);
+ op->config |= CFG_DIR_ENC;
+ memcpy(cpg->sa_sram.sa_key, ctx->aes_enc_key, AES_KEY_LEN);
}

switch (ctx->key_len) {
case AES_KEYSIZE_128:
- op.config |= CFG_AES_LEN_128;
+ op->config |= CFG_AES_LEN_128;
break;
case AES_KEYSIZE_192:
- op.config |= CFG_AES_LEN_192;
+ op->config |= CFG_AES_LEN_192;
break;
case AES_KEYSIZE_256:
- op.config |= CFG_AES_LEN_256;
+ op->config |= CFG_AES_LEN_256;
break;
}
- op.enc_p = ENC_P_SRC(SRAM_DATA_IN_START) |
+ op->enc_p = ENC_P_SRC(SRAM_DATA_IN_START) |
ENC_P_DST(SRAM_DATA_OUT_START);
- op.enc_key_p = SRAM_DATA_KEY_P;
+ op->enc_key_p = SRAM_DATA_KEY_P;

setup_data_in();
- op.enc_len = cpg->p.crypt_len;
- memcpy(cpg->sram + SRAM_CONFIG, &op,
- sizeof(struct sec_accel_config));
+ op->enc_len = cpg->p.crypt_len;
+ memcpy(cpg->sram + SRAM_CONFIG, &cpg->sa_sram,
+ sizeof(struct sec_accel_sram));

/* GO */
mv_setup_timer();
@@ -315,30 +336,30 @@ static void mv_process_hash_current(int first_block)
const struct mv_tfm_hash_ctx *tfm_ctx = crypto_tfm_ctx(req->base.tfm);
struct mv_req_hash_ctx *req_ctx = ahash_request_ctx(req);
struct req_progress *p = &cpg->p;
- struct sec_accel_config op = { 0 };
+ struct sec_accel_config *op = &cpg->sa_sram.op;
int is_last;

switch (req_ctx->op) {
case COP_SHA1:
default:
- op.config = CFG_OP_MAC_ONLY | CFG_MACM_SHA1;
+ op->config = CFG_OP_MAC_ONLY | CFG_MACM_SHA1;
break;
case COP_HMAC_SHA1:
- op.config = CFG_OP_MAC_ONLY | CFG_MACM_HMAC_SHA1;
- memcpy(cpg->sram + SRAM_HMAC_IV_IN,
+ op->config = CFG_OP_MAC_ONLY | CFG_MACM_HMAC_SHA1;
+ memcpy(cpg->sa_sram.sa_ivi,
tfm_ctx->ivs, sizeof(tfm_ctx->ivs));
break;
}

- op.mac_src_p =
+ op->mac_src_p =
MAC_SRC_DATA_P(SRAM_DATA_IN_START) |
MAC_SRC_TOTAL_LEN((u32)req_ctx->count);

setup_data_in();

- op.mac_digest =
+ op->mac_digest =
MAC_DIGEST_P(SRAM_DIGEST_BUF) | MAC_FRAG_LEN(p->crypt_len);
- op.mac_iv =
+ op->mac_iv =
MAC_INNER_IV_P(SRAM_HMAC_IV_IN) |
MAC_OUTER_IV_P(SRAM_HMAC_IV_OUT);

@@ -347,16 +368,16 @@ static void mv_process_hash_current(int first_block)
&& (req_ctx->count <= MAX_HW_HASH_SIZE);
if (req_ctx->first_hash) {
if (is_last)
- op.config |= CFG_NOT_FRAG;
+ op->config |= CFG_NOT_FRAG;
else
- op.config |= CFG_FIRST_FRAG;
+ op->config |= CFG_FIRST_FRAG;

req_ctx->first_hash = 0;
} else {
if (is_last)
- op.config |= CFG_LAST_FRAG;
+ op->config |= CFG_LAST_FRAG;
else
- op.config |= CFG_MID_FRAG;
+ op->config |= CFG_MID_FRAG;

if (first_block) {
writel(req_ctx->state[0], cpg->reg + DIGEST_INITIAL_VAL_A);
@@ -367,8 +388,8 @@ static void mv_process_hash_current(int first_block)
}
}

- memcpy(cpg->sram + SRAM_CONFIG, &op,
- sizeof(struct sec_accel_config));
+ memcpy(cpg->sram + SRAM_CONFIG, &cpg->sa_sram,
+ sizeof(struct sec_accel_sram));

/* GO */
mv_setup_timer();
--
1.7.3.4
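
Condensed, the new flow looks like this (a sketch assembled from the
hunks above, showing only the AES-128/CBC/encrypt case; the pointer and
length fields of the config are elided):

static void example_cbc_encrypt_config(struct ablkcipher_request *req,
                                       struct mv_ctx *ctx)
{
        struct sec_accel_sram *sa = &cpg->sa_sram;

        /* 1. prepare the complete engine config in the DRAM shadow */
        sa->op.config = CFG_OP_CRYPT_ONLY | CFG_ENCM_AES |
                        CFG_ENC_MODE_CBC | CFG_DIR_ENC | CFG_AES_LEN_128;
        memcpy(sa->sa_key, ctx->aes_enc_key, AES_KEY_LEN);
        memcpy(sa->sa_iv, req->info, 16);

        /* 2. reconfigure the accelerator in a single step */
        memcpy(cpg->sram + SRAM_CONFIG, sa, sizeof(struct sec_accel_sram));
}

Once reconfiguration is this single memcpy(), a later patch can replace
it with one DMA descriptor copying the shadow struct from DRAM into the
engine's SRAM.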

2012-05-25 16:09:05

by Phil Sutter

Subject: [PATCH 01/13] mv_cesa: do not use scatterlist iterators

The big problem is that they cannot be used to iterate over DMA-mapped
scatterlists, so get rid of them in preparation for adding DMA
functionality to mv_cesa.

Signed-off-by: Phil Sutter <[email protected]>
---
drivers/crypto/mv_cesa.c | 57 ++++++++++++++++++++++-----------------------
1 files changed, 28 insertions(+), 29 deletions(-)

diff --git a/drivers/crypto/mv_cesa.c b/drivers/crypto/mv_cesa.c
index 3cc9237..c305350 100644
--- a/drivers/crypto/mv_cesa.c
+++ b/drivers/crypto/mv_cesa.c
@@ -43,8 +43,8 @@ enum engine_status {

/**
* struct req_progress - used for every crypt request
- * @src_sg_it: sg iterator for src
- * @dst_sg_it: sg iterator for dst
+ * @src_sg: sg list for src
+ * @dst_sg: sg list for dst
* @sg_src_left: bytes left in src to process (scatter list)
* @src_start: offset to add to src start position (scatter list)
* @crypt_len: length of current hw crypt/hash process
@@ -59,8 +59,8 @@ enum engine_status {
* track of progress within current scatterlist.
*/
struct req_progress {
- struct sg_mapping_iter src_sg_it;
- struct sg_mapping_iter dst_sg_it;
+ struct scatterlist *src_sg;
+ struct scatterlist *dst_sg;
void (*complete) (void);
void (*process) (int is_first);

@@ -210,19 +210,19 @@ static int mv_setkey_aes(struct crypto_ablkcipher *cipher, const u8 *key,

static void copy_src_to_buf(struct req_progress *p, char *dbuf, int len)
{
- int ret;
void *sbuf;
int copy_len;

while (len) {
if (!p->sg_src_left) {
- ret = sg_miter_next(&p->src_sg_it);
- BUG_ON(!ret);
- p->sg_src_left = p->src_sg_it.length;
+ /* next sg please */
+ p->src_sg = sg_next(p->src_sg);
+ BUG_ON(!p->src_sg);
+ p->sg_src_left = p->src_sg->length;
p->src_start = 0;
}

- sbuf = p->src_sg_it.addr + p->src_start;
+ sbuf = sg_virt(p->src_sg) + p->src_start;

copy_len = min(p->sg_src_left, len);
memcpy(dbuf, sbuf, copy_len);
@@ -305,9 +305,6 @@ static void mv_crypto_algo_completion(void)
struct ablkcipher_request *req = ablkcipher_request_cast(cpg->cur_req);
struct mv_req_ctx *req_ctx = ablkcipher_request_ctx(req);

- sg_miter_stop(&cpg->p.src_sg_it);
- sg_miter_stop(&cpg->p.dst_sg_it);
-
if (req_ctx->op != COP_AES_CBC)
return ;

@@ -437,7 +434,6 @@ static void mv_hash_algo_completion(void)

if (ctx->extra_bytes)
copy_src_to_buf(&cpg->p, ctx->buffer, ctx->extra_bytes);
- sg_miter_stop(&cpg->p.src_sg_it);

if (likely(ctx->last_chunk)) {
if (likely(ctx->count <= MAX_HW_HASH_SIZE)) {
@@ -457,7 +453,6 @@ static void dequeue_complete_req(void)
{
struct crypto_async_request *req = cpg->cur_req;
void *buf;
- int ret;
cpg->p.hw_processed_bytes += cpg->p.crypt_len;
if (cpg->p.copy_back) {
int need_copy_len = cpg->p.crypt_len;
@@ -466,14 +461,14 @@ static void dequeue_complete_req(void)
int dst_copy;

if (!cpg->p.sg_dst_left) {
- ret = sg_miter_next(&cpg->p.dst_sg_it);
- BUG_ON(!ret);
- cpg->p.sg_dst_left = cpg->p.dst_sg_it.length;
+ /* next sg please */
+ cpg->p.dst_sg = sg_next(cpg->p.dst_sg);
+ BUG_ON(!cpg->p.dst_sg);
+ cpg->p.sg_dst_left = cpg->p.dst_sg->length;
cpg->p.dst_start = 0;
}

- buf = cpg->p.dst_sg_it.addr;
- buf += cpg->p.dst_start;
+ buf = sg_virt(cpg->p.dst_sg) + cpg->p.dst_start;

dst_copy = min(need_copy_len, cpg->p.sg_dst_left);

@@ -523,7 +518,6 @@ static int count_sgs(struct scatterlist *sl, unsigned int total_bytes)
static void mv_start_new_crypt_req(struct ablkcipher_request *req)
{
struct req_progress *p = &cpg->p;
- int num_sgs;

cpg->cur_req = &req->base;
memset(p, 0, sizeof(struct req_progress));
@@ -532,11 +526,14 @@ static void mv_start_new_crypt_req(struct ablkcipher_request *req)
p->process = mv_process_current_q;
p->copy_back = 1;

- num_sgs = count_sgs(req->src, req->nbytes);
- sg_miter_start(&p->src_sg_it, req->src, num_sgs, SG_MITER_FROM_SG);
-
- num_sgs = count_sgs(req->dst, req->nbytes);
- sg_miter_start(&p->dst_sg_it, req->dst, num_sgs, SG_MITER_TO_SG);
+ p->src_sg = req->src;
+ p->dst_sg = req->dst;
+ if (req->nbytes) {
+ BUG_ON(!req->src);
+ BUG_ON(!req->dst);
+ p->sg_src_left = req->src->length;
+ p->sg_dst_left = req->dst->length;
+ }

mv_process_current_q(1);
}
@@ -545,7 +542,7 @@ static void mv_start_new_hash_req(struct ahash_request *req)
{
struct req_progress *p = &cpg->p;
struct mv_req_hash_ctx *ctx = ahash_request_ctx(req);
- int num_sgs, hw_bytes, old_extra_bytes, rc;
+ int hw_bytes, old_extra_bytes, rc;
cpg->cur_req = &req->base;
memset(p, 0, sizeof(struct req_progress));
hw_bytes = req->nbytes + ctx->extra_bytes;
@@ -558,8 +555,11 @@ static void mv_start_new_hash_req(struct ahash_request *req)
else
ctx->extra_bytes = 0;

- num_sgs = count_sgs(req->src, req->nbytes);
- sg_miter_start(&p->src_sg_it, req->src, num_sgs, SG_MITER_FROM_SG);
+ p->src_sg = req->src;
+ if (req->nbytes) {
+ BUG_ON(!req->src);
+ p->sg_src_left = req->src->length;
+ }

if (hw_bytes) {
p->hw_nbytes = hw_bytes;
@@ -576,7 +576,6 @@ static void mv_start_new_hash_req(struct ahash_request *req)
} else {
copy_src_to_buf(p, ctx->buffer + old_extra_bytes,
ctx->extra_bytes - old_extra_bytes);
- sg_miter_stop(&p->src_sg_it);
if (ctx->last_chunk)
rc = mv_hash_final_fallback(req);
else
--
1.7.3.4
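
Background: sg_mapping_iter hands out kmap()ed page addresses, which are
useless once the list has been mapped for DMA. After dma_map_sg(), only
the plain sg_next() walk together with the DMA accessors is valid,
roughly like this (a sketch; queue_dma_copy() is a hypothetical stand-in
for whatever consumes the bus addresses):

static void walk_mapped_sg(struct scatterlist *sg, int nbytes)
{
        int left = sg_dma_len(sg);      /* NOT sg->length once mapped */
        int offset = 0;

        while (nbytes) {
                int copy;

                if (!left) {            /* current entry exhausted */
                        sg = sg_next(sg);       /* plain walk, no kmap */
                        BUG_ON(!sg);
                        left = sg_dma_len(sg);
                        offset = 0;
                }
                copy = min(left, nbytes);
                /* sg_dma_address() yields the bus address a DMA engine
                 * needs; sg_virt()/kmap() are off limits here */
                queue_dma_copy(sg_dma_address(sg) + offset, copy);
                offset += copy;
                left -= copy;
                nbytes -= copy;
        }
}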

2012-05-27 14:03:16

by cloudy.linux

Subject: Re: RFC: support for MV_CESA with TDMA

On 2012-5-26 0:08, Phil Sutter wrote:
> Hi,
>
> The following patch series adds support for the TDMA engine built into
> Marvell's Kirkwood-based SoCs, and enhances mv_cesa.c in order to use it
> for speeding up crypto operations. Kirkwood hardware contains a security
> accelerator, which can control DMA as well as crypto engines. It allows
> for operation with minimal software intervenience, which the following
> patches implement: using a chain of DMA descriptors, data input,
> configuration, engine startup and data output repeat fully automatically
> until the whole input data has been handled.
>
> The point for this being RFC is backwards-compatibility: earlier
> hardware (Orion) ships a (slightly) different DMA engine (IDMA) along
> with the same crypto engine, so in fact mv_cesa.c is in use on these
> platforms, too. But since I don't possess hardware of this kind, I am
> not able to make this code IDMA-compatible. Also, due to the quite
> massive reorganisation of code flow, I don't really see how to make TDMA
> support optional in mv_cesa.c.
>
> Greetings, Phil

Could the source code from the manufacturers of hardware using Kirkwood
be helpful? I saw that the source code for the Buffalo LS-WVL contains a
driver for CESA, and it deals with both IDMA and TDMA. If you need it, I
can send you the download link.

I also have to point out that the CESA of some Orion revisions has
hardware flaws that need to be addressed but currently aren't. Information
about those flaws can be found in 88F5182_Functional_Errata.pdf, which is
available on the net.

2012-05-29 11:34:47

by Phil Sutter

Subject: Re: RFC: support for MV_CESA with TDMA

Hi,

On Sun, May 27, 2012 at 10:03:07PM +0800, cloudy.linux wrote:
> Could the source code from the manufacturers of hardware using Kirkwood
> be helpful? I saw that the source code for the Buffalo LS-WVL contains a
> driver for CESA, and it deals with both IDMA and TDMA. If you need it, I
> can send you the download link.

Actually, I do have the sources. Just had doubts about how useful it
would be to write code for something I couldn't test at all. OTOH,
that's probably a better start than nothing.

> I also have to point out that the CESA of some Orion revisions has
> hardware flaws that need to be addressed but currently aren't. Information
> about those flaws can be found in 88F5182_Functional_Errata.pdf, which is
> available on the net.

OK, thanks for the pointer! Looks like implementing combined
(crypto/digest) operation for Orion will be no fun, to say the least.

Greetings, Phil




2012-06-12 10:04:41

by Herbert Xu

Subject: Re: RFC: support for MV_CESA with TDMA

On Fri, May 25, 2012 at 06:08:26PM +0200, Phil Sutter wrote:
>
> The point for this being RFC is backwards-compatibility: earlier
> hardware (Orion) ships a (slightly) different DMA engine (IDMA) along
> with the same crypto engine, so in fact mv_cesa.c is in use on these
> platforms, too. But since I don't possess hardware of this kind, I am
> not able to make this code IDMA-compatible. Also, due to the quite
> massive reorganisation of code flow, I don't really see how to make TDMA
> support optional in mv_cesa.c.

So does this break existing functionality or not?

Cheers,
--
Email: Herbert Xu <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

2012-06-12 10:25:03

by Phil Sutter

Subject: Re: RFC: support for MV_CESA with TDMA

On Tue, Jun 12, 2012 at 06:04:37PM +0800, Herbert Xu wrote:
> On Fri, May 25, 2012 at 06:08:26PM +0200, Phil Sutter wrote:
> >
> > The point for this being RFC is backwards-compatibility: earlier
> > hardware (Orion) ships a (slightly) different DMA engine (IDMA) along
> > with the same crypto engine, so in fact mv_cesa.c is in use on these
> > platforms, too. But since I don't possess hardware of this kind, I am
> > not able to make this code IDMA-compatible. Also, due to the quite
> > massive reorganisation of code flow, I don't really see how to make TDMA
> > support optional in mv_cesa.c.
>
> So does this break existing functionality or not?

It does break mv_cesa on Orion-based devices (precisely those with IDMA
instead of TDMA). I am currently working on a version which supports
IDMA, too. Since all CESA-equipped hardware comes with either TDMA or
IDMA, that version should then improve all platforms without breaking
any.

Greetings, Phil



2012-06-12 11:39:44

by Herbert Xu

Subject: Re: RFC: support for MV_CESA with TDMA

On Tue, Jun 12, 2012 at 12:24:52PM +0200, Phil Sutter wrote:
>
> It does break mv_cesa on Orion-based devices (precisely those with IDMA
> instead of TDMA). I am currently working on a version which supports
> IDMA, too. Since all CESA-equipped hardware comes with either TDMA or
> IDMA, that version then should improve all platforms without breaking
> any.

Thanks for the explanation. I'll wait for your new patches :)
--
Email: Herbert Xu <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

2012-06-12 17:17:40

by Phil Sutter

Subject: [PATCH 10/13] mv_cesa: reorganise mv_start_new_hash_req a bit

Check early whether CESA can be used at all, and bail out if not.

Signed-off-by: Phil Sutter <[email protected]>
---
drivers/crypto/mv_cesa.c | 61 +++++++++++++++++++++++++---------------------
1 files changed, 33 insertions(+), 28 deletions(-)

diff --git a/drivers/crypto/mv_cesa.c b/drivers/crypto/mv_cesa.c
index 7917d1a..9c65980 100644
--- a/drivers/crypto/mv_cesa.c
+++ b/drivers/crypto/mv_cesa.c
@@ -806,35 +806,13 @@ static void mv_start_new_hash_req(struct ahash_request *req)
else
ctx->extra_bytes = 0;

- p->src_sg = req->src;
- if (req->nbytes) {
- BUG_ON(!req->src);
- p->sg_src_left = req->src->length;
- }
-
- if (hw_bytes) {
- p->hw_nbytes = hw_bytes;
- p->complete = mv_hash_algo_completion;
- p->process = mv_update_hash_config;
-
- if (unlikely(old_extra_bytes)) {
- dma_sync_single_for_device(cpg->dev, ctx->buffer_dma,
- SHA1_BLOCK_SIZE, DMA_TO_DEVICE);
- mv_dma_memcpy(cpg->sram_phys + SRAM_DATA_IN_START,
- ctx->buffer_dma, old_extra_bytes);
- p->crypt_len = old_extra_bytes;
+ if (unlikely(!hw_bytes)) { /* too little data for CESA */
+ if (req->nbytes) {
+ p->src_sg = req->src;
+ p->sg_src_left = req->src->length;
+ copy_src_to_buf(p, ctx->buffer + old_extra_bytes,
+ req->nbytes);
}
-
- if (!mv_dma_map_sg(req->src, req->nbytes, DMA_TO_DEVICE)) {
- printk(KERN_ERR "%s: out of memory\n", __func__);
- return;
- }
-
- setup_data_in();
- mv_init_hash_config(req);
- } else {
- copy_src_to_buf(p, ctx->buffer + old_extra_bytes,
- ctx->extra_bytes - old_extra_bytes);
if (ctx->last_chunk)
rc = mv_hash_final_fallback(req);
else
@@ -843,7 +821,34 @@ static void mv_start_new_hash_req(struct ahash_request *req)
local_bh_disable();
req->base.complete(&req->base, rc);
local_bh_enable();
+ return;
}
+
+ if (likely(req->nbytes)) {
+ BUG_ON(!req->src);
+
+ if (!mv_dma_map_sg(req->src, req->nbytes, DMA_TO_DEVICE)) {
+ printk(KERN_ERR "%s: out of memory\n", __func__);
+ return;
+ }
+ p->sg_src_left = sg_dma_len(req->src);
+ p->src_sg = req->src;
+ }
+
+ p->hw_nbytes = hw_bytes;
+ p->complete = mv_hash_algo_completion;
+ p->process = mv_update_hash_config;
+
+ if (unlikely(old_extra_bytes)) {
+ dma_sync_single_for_device(cpg->dev, ctx->buffer_dma,
+ SHA1_BLOCK_SIZE, DMA_TO_DEVICE);
+ mv_dma_memcpy(cpg->sram_phys + SRAM_DATA_IN_START,
+ ctx->buffer_dma, old_extra_bytes);
+ p->crypt_len = old_extra_bytes;
+ }
+
+ setup_data_in();
+ mv_init_hash_config(req);
}

static int queue_manag(void *data)
--
1.7.3.4
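
The reorganisation boils down to replacing the nested if/else with a
guard clause, so that the hardware path reads straight through (a
schematic sketch; both helper names are hypothetical):

static void software_fallback(void)
{
        /* hypothetical: buffer the data, run the fallback ahash,
         * complete the request */
}

static void hardware_path(void)
{
        /* hypothetical: map the scatterlist, build the DMA chain,
         * trigger the engine */
}

static void start_hash_req(int hw_bytes)
{
        if (unlikely(!hw_bytes)) {      /* too little data for CESA */
                software_fallback();
                return;                 /* done: early exit */
        }

        hardware_path();        /* main path, no longer nested */
}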

2012-06-12 17:17:40

by Phil Sutter

Subject: [PATCH 12/13] mv_cesa: drop the now unused process callback

While at it, simplify dequeue_complete_req() a bit.

Signed-off-by: Phil Sutter <[email protected]>
---
drivers/crypto/mv_cesa.c | 21 ++++++---------------
1 files changed, 6 insertions(+), 15 deletions(-)

diff --git a/drivers/crypto/mv_cesa.c b/drivers/crypto/mv_cesa.c
index 86b73d1..7b2b693 100644
--- a/drivers/crypto/mv_cesa.c
+++ b/drivers/crypto/mv_cesa.c
@@ -70,7 +70,6 @@ struct req_progress {
struct scatterlist *src_sg;
struct scatterlist *dst_sg;
void (*complete) (void);
- void (*process) (void);

/* src mostly */
int sg_src_left;
@@ -650,25 +649,17 @@ static void mv_hash_algo_completion(void)
static void dequeue_complete_req(void)
{
struct crypto_async_request *req = cpg->cur_req;
- cpg->p.hw_processed_bytes += cpg->p.crypt_len;
- cpg->p.crypt_len = 0;

mv_dma_clear();
cpg->u32_usage = 0;

BUG_ON(cpg->eng_st != ENGINE_W_DEQUEUE);
- if (cpg->p.hw_processed_bytes < cpg->p.hw_nbytes) {
- /* process next scatter list entry */
- cpg->eng_st = ENGINE_BUSY;
- setup_data_in();
- cpg->p.process();
- } else {
- cpg->p.complete();
- cpg->eng_st = ENGINE_IDLE;
- local_bh_disable();
- req->complete(req, 0);
- local_bh_enable();
- }
+
+ cpg->p.complete();
+ cpg->eng_st = ENGINE_IDLE;
+ local_bh_disable();
+ req->complete(req, 0);
+ local_bh_enable();
}

static int count_sgs(struct scatterlist *sl, unsigned int total_bytes)
--
1.7.3.4

2012-06-12 17:17:43

by Phil Sutter

Subject: [PATCH 09/13] mv_cesa: implementing packet chain mode, only aes for now

This introduces a pool of four-byte DMA buffers for security
accelerator config updates.

Signed-off-by: Phil Sutter <[email protected]>
---
drivers/crypto/mv_cesa.c | 134 ++++++++++++++++++++++++++++++++++++----------
drivers/crypto/mv_cesa.h | 1 +
2 files changed, 106 insertions(+), 29 deletions(-)

diff --git a/drivers/crypto/mv_cesa.c b/drivers/crypto/mv_cesa.c
index 7dfab85..7917d1a 100644
--- a/drivers/crypto/mv_cesa.c
+++ b/drivers/crypto/mv_cesa.c
@@ -10,6 +10,7 @@
#include <crypto/algapi.h>
#include <linux/crypto.h>
#include <linux/dma-mapping.h>
+#include <linux/dmapool.h>
#include <linux/interrupt.h>
#include <linux/io.h>
#include <linux/kthread.h>
@@ -28,6 +29,9 @@
#define MAX_HW_HASH_SIZE 0xFFFF
#define MV_CESA_EXPIRE 500 /* msec */

+#define MV_DMA_INIT_POOLSIZE 16
+#define MV_DMA_ALIGN 16
+
static int count_sgs(struct scatterlist *, unsigned int);

/*
@@ -97,6 +101,11 @@ struct sec_accel_sram {
#define sa_ivo type.hash.ivo
} __attribute__((packed));

+struct u32_mempair {
+ u32 *vaddr;
+ dma_addr_t daddr;
+};
+
struct crypto_priv {
struct device *dev;
void __iomem *reg;
@@ -120,6 +129,11 @@ struct crypto_priv {

struct sec_accel_sram sa_sram;
dma_addr_t sa_sram_dma;
+
+ struct dma_pool *u32_pool;
+ struct u32_mempair *u32_list;
+ int u32_list_len;
+ int u32_usage;
};

static struct crypto_priv *cpg;
@@ -191,6 +205,54 @@ static void mv_setup_timer(void)
jiffies + msecs_to_jiffies(MV_CESA_EXPIRE));
}

+#define U32_ITEM(x) (cpg->u32_list[x].vaddr)
+#define U32_ITEM_DMA(x) (cpg->u32_list[x].daddr)
+
+static inline int set_u32_poolsize(int nelem)
+{
+ /* need to increase size first if requested */
+ if (nelem > cpg->u32_list_len) {
+ struct u32_mempair *newmem;
+ int newsize = nelem * sizeof(struct u32_mempair);
+
+ newmem = krealloc(cpg->u32_list, newsize, GFP_KERNEL);
+ if (!newmem)
+ return -ENOMEM;
+ cpg->u32_list = newmem;
+ }
+
+ /* allocate/free dma descriptors, adjusting cpg->u32_list_len on the go */
+ for (; cpg->u32_list_len < nelem; cpg->u32_list_len++) {
+ U32_ITEM(cpg->u32_list_len) = dma_pool_alloc(cpg->u32_pool,
+ GFP_KERNEL, &U32_ITEM_DMA(cpg->u32_list_len));
+ if (!U32_ITEM((cpg->u32_list_len)))
+ return -ENOMEM;
+ }
+ for (; cpg->u32_list_len > nelem; cpg->u32_list_len--)
+ dma_pool_free(cpg->u32_pool, U32_ITEM(cpg->u32_list_len - 1),
+ U32_ITEM_DMA(cpg->u32_list_len - 1));
+
+ /* ignore size decreases, except those down to zero */
+ if (!nelem) {
+ kfree(cpg->u32_list);
+ cpg->u32_list = 0;
+ }
+ return 0;
+}
+
+static inline void mv_dma_u32_copy(dma_addr_t dst, u32 val)
+{
+ if (unlikely(cpg->u32_usage == cpg->u32_list_len)
+ && set_u32_poolsize(cpg->u32_list_len << 1)) {
+ printk(KERN_ERR MV_CESA "resizing poolsize to %d failed\n",
+ cpg->u32_list_len << 1);
+ return;
+ }
+ *(U32_ITEM(cpg->u32_usage)) = val;
+ mv_dma_memcpy(dst, U32_ITEM_DMA(cpg->u32_usage), sizeof(u32));
+ cpg->u32_usage++;
+}
+
static inline bool
mv_dma_map_sg(struct scatterlist *sg, int nbytes, enum dma_data_direction dir)
{
@@ -392,36 +454,13 @@ static void mv_init_crypt_config(struct ablkcipher_request *req)
sizeof(struct sec_accel_sram), DMA_TO_DEVICE);
mv_dma_memcpy(cpg->sram_phys + SRAM_CONFIG, cpg->sa_sram_dma,
sizeof(struct sec_accel_sram));
-
- mv_dma_separator();
- dma_copy_buf_to_dst(&cpg->p, cpg->sram_phys + SRAM_DATA_OUT_START, cpg->p.crypt_len);
-
- /* GO */
- mv_setup_timer();
- mv_dma_trigger();
- writel(SEC_CMD_EN_SEC_ACCL0, cpg->reg + SEC_ACCEL_CMD);
}

static void mv_update_crypt_config(void)
{
- struct sec_accel_config *op = &cpg->sa_sram.op;
-
/* update the enc_len field only */
-
- op->enc_len = cpg->p.crypt_len;
-
- dma_sync_single_for_device(cpg->dev, cpg->sa_sram_dma + 2 * sizeof(u32),
- sizeof(u32), DMA_TO_DEVICE);
- mv_dma_memcpy(cpg->sram_phys + SRAM_CONFIG + 2 * sizeof(u32),
- cpg->sa_sram_dma + 2 * sizeof(u32), sizeof(u32));
-
- mv_dma_separator();
- dma_copy_buf_to_dst(&cpg->p, cpg->sram_phys + SRAM_DATA_OUT_START, cpg->p.crypt_len);
-
- /* GO */
- mv_setup_timer();
- mv_dma_trigger();
- writel(SEC_CMD_EN_SEC_ACCL0, cpg->reg + SEC_ACCEL_CMD);
+ mv_dma_u32_copy(cpg->sram_phys + SRAM_CONFIG + 2 * sizeof(u32),
+ (u32)cpg->p.crypt_len);
}

static void mv_crypto_algo_completion(void)
@@ -660,6 +699,7 @@ static void dequeue_complete_req(void)
cpg->p.crypt_len = 0;

mv_dma_clear();
+ cpg->u32_usage = 0;

BUG_ON(cpg->eng_st != ENGINE_W_DEQUEUE);
if (cpg->p.hw_processed_bytes < cpg->p.hw_nbytes) {
@@ -701,7 +741,6 @@ static void mv_start_new_crypt_req(struct ablkcipher_request *req)
memset(p, 0, sizeof(struct req_progress));
p->hw_nbytes = req->nbytes;
p->complete = mv_crypto_algo_completion;
- p->process = mv_update_crypt_config;

/* assume inplace request */
if (req->src == req->dst) {
@@ -728,6 +767,24 @@ static void mv_start_new_crypt_req(struct ablkcipher_request *req)

setup_data_in();
mv_init_crypt_config(req);
+ mv_dma_separator();
+ dma_copy_buf_to_dst(&cpg->p, cpg->sram_phys + SRAM_DATA_OUT_START, cpg->p.crypt_len);
+ cpg->p.hw_processed_bytes += cpg->p.crypt_len;
+ while (cpg->p.hw_processed_bytes < cpg->p.hw_nbytes) {
+ cpg->p.crypt_len = 0;
+
+ setup_data_in();
+ mv_update_crypt_config();
+ mv_dma_separator();
+ dma_copy_buf_to_dst(&cpg->p, cpg->sram_phys + SRAM_DATA_OUT_START, cpg->p.crypt_len);
+ cpg->p.hw_processed_bytes += cpg->p.crypt_len;
+ }
+
+
+ /* GO */
+ mv_setup_timer();
+ mv_dma_trigger();
+ writel(SEC_CMD_EN_SEC_ACCL0, cpg->reg + SEC_ACCEL_CMD);
}

static void mv_start_new_hash_req(struct ahash_request *req)
@@ -1294,18 +1351,29 @@ static int mv_probe(struct platform_device *pdev)

writel(0, cpg->reg + SEC_ACCEL_INT_STATUS);
writel(SEC_INT_ACC0_IDMA_DONE, cpg->reg + SEC_ACCEL_INT_MASK);
- writel((SEC_CFG_STOP_DIG_ERR | SEC_CFG_CH0_W_IDMA |
+ writel((SEC_CFG_STOP_DIG_ERR | SEC_CFG_CH0_W_IDMA | SEC_CFG_MP_CHAIN |
SEC_CFG_ACT_CH0_IDMA), cpg->reg + SEC_ACCEL_CFG);
writel(SRAM_CONFIG, cpg->reg + SEC_ACCEL_DESC_P0);

cp->sa_sram_dma = dma_map_single(&pdev->dev, &cp->sa_sram,
sizeof(struct sec_accel_sram), DMA_TO_DEVICE);

+ cpg->u32_pool = dma_pool_create("CESA U32 Item Pool",
+ &pdev->dev, sizeof(u32), MV_DMA_ALIGN, 0);
+ if (!cpg->u32_pool) {
+ ret = -ENOMEM;
+ goto err_mapping;
+ }
+ if (set_u32_poolsize(MV_DMA_INIT_POOLSIZE)) {
+ printk(KERN_ERR MV_CESA "failed to initialise poolsize\n");
+ goto err_pool;
+ }
+
ret = crypto_register_alg(&mv_aes_alg_ecb);
if (ret) {
printk(KERN_WARNING MV_CESA
"Could not register aes-ecb driver\n");
- goto err_irq;
+ goto err_poolsize;
}

ret = crypto_register_alg(&mv_aes_alg_cbc);
@@ -1332,7 +1400,13 @@ static int mv_probe(struct platform_device *pdev)
return 0;
err_unreg_ecb:
crypto_unregister_alg(&mv_aes_alg_ecb);
-err_irq:
+err_poolsize:
+ set_u32_poolsize(0);
+err_pool:
+ dma_pool_destroy(cpg->u32_pool);
+err_mapping:
+ dma_unmap_single(&pdev->dev, cpg->sa_sram_dma,
+ sizeof(struct sec_accel_sram), DMA_TO_DEVICE);
free_irq(irq, cp);
err_thread:
kthread_stop(cp->queue_th);
@@ -1361,6 +1435,8 @@ static int mv_remove(struct platform_device *pdev)
free_irq(cp->irq, cp);
dma_unmap_single(&pdev->dev, cpg->sa_sram_dma,
sizeof(struct sec_accel_sram), DMA_TO_DEVICE);
+ set_u32_poolsize(0);
+ dma_pool_destroy(cpg->u32_pool);
memset(cp->sram, 0, cp->sram_size);
iounmap(cp->sram);
iounmap(cp->reg);
diff --git a/drivers/crypto/mv_cesa.h b/drivers/crypto/mv_cesa.h
index 08fcb11..866c437 100644
--- a/drivers/crypto/mv_cesa.h
+++ b/drivers/crypto/mv_cesa.h
@@ -24,6 +24,7 @@
#define SEC_CFG_CH1_W_IDMA (1 << 8)
#define SEC_CFG_ACT_CH0_IDMA (1 << 9)
#define SEC_CFG_ACT_CH1_IDMA (1 << 10)
+#define SEC_CFG_MP_CHAIN (1 << 11)

#define SEC_ACCEL_STATUS 0xde0c
#define SEC_ST_ACT_0 (1 << 0)
--
1.7.3.4
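
The essential sequence behind mv_dma_u32_copy(), for reference: memory
from a dma_pool is coherent, which is what makes the
dma_sync_single_for_device() dance of the earlier patches unnecessary for
these config words. A sketch (dev, sram_dst and val stand in for the
obvious driver state; pool reuse and error unwinding are elided):

static int example_u32_copy(struct device *dev, dma_addr_t sram_dst, u32 val)
{
        struct dma_pool *pool;
        dma_addr_t daddr;
        u32 *vaddr;

        /* pool of DMA-coherent 4-byte buffers, 16-byte aligned */
        pool = dma_pool_create("CESA U32 Item Pool", dev, sizeof(u32),
                               MV_DMA_ALIGN, 0);
        if (!pool)
                return -ENOMEM;

        vaddr = dma_pool_alloc(pool, GFP_KERNEL, &daddr);
        if (!vaddr)
                return -ENOMEM;

        *vaddr = val;   /* coherent: immediately visible to the device */
        mv_dma_memcpy(sram_dst, daddr, sizeof(u32));    /* queue the copy */
        return 0;
}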

2012-06-12 17:17:53

by Phil Sutter

Subject: [PATCH 11/13] mv_cesa: implement descriptor chaining for hashes, too


Signed-off-by: Phil Sutter <[email protected]>
---
drivers/crypto/mv_cesa.c | 89 ++++++++++++++++++----------------------------
1 files changed, 35 insertions(+), 54 deletions(-)

diff --git a/drivers/crypto/mv_cesa.c b/drivers/crypto/mv_cesa.c
index 9c65980..86b73d1 100644
--- a/drivers/crypto/mv_cesa.c
+++ b/drivers/crypto/mv_cesa.c
@@ -538,34 +538,14 @@ static void mv_init_hash_config(struct ahash_request *req)
sizeof(struct sec_accel_sram), DMA_TO_DEVICE);
mv_dma_memcpy(cpg->sram_phys + SRAM_CONFIG, cpg->sa_sram_dma,
sizeof(struct sec_accel_sram));
-
- mv_dma_separator();
-
- if (req->result) {
- req_ctx->result_dma = dma_map_single(cpg->dev, req->result,
- req_ctx->digestsize, DMA_FROM_DEVICE);
- mv_dma_memcpy(req_ctx->result_dma,
- cpg->sram_phys + SRAM_DIGEST_BUF, req_ctx->digestsize);
- } else {
- /* XXX: this fixes some ugly register fuckup bug in the tdma engine
- * (no need to sync since the data is ignored anyway) */
- mv_dma_memcpy(cpg->sa_sram_dma,
- cpg->sram_phys + SRAM_CONFIG, 1);
- }
-
- /* GO */
- mv_setup_timer();
- mv_dma_trigger();
- writel(SEC_CMD_EN_SEC_ACCL0, cpg->reg + SEC_ACCEL_CMD);
}

-static void mv_update_hash_config(void)
+static void mv_update_hash_config(struct ahash_request *req)
{
- struct ahash_request *req = ahash_request_cast(cpg->cur_req);
struct mv_req_hash_ctx *req_ctx = ahash_request_ctx(req);
struct req_progress *p = &cpg->p;
- struct sec_accel_config *op = &cpg->sa_sram.op;
int is_last;
+ u32 val;

/* update only the config (for changed fragment state) and
* mac_digest (for changed frag len) fields */
@@ -573,10 +553,10 @@ static void mv_update_hash_config(void)
switch (req_ctx->op) {
case COP_SHA1:
default:
- op->config = CFG_OP_MAC_ONLY | CFG_MACM_SHA1;
+ val = CFG_OP_MAC_ONLY | CFG_MACM_SHA1;
break;
case COP_HMAC_SHA1:
- op->config = CFG_OP_MAC_ONLY | CFG_MACM_HMAC_SHA1;
+ val = CFG_OP_MAC_ONLY | CFG_MACM_HMAC_SHA1;
break;
}

@@ -584,36 +564,11 @@ static void mv_update_hash_config(void)
&& (p->hw_processed_bytes + p->crypt_len >= p->hw_nbytes)
&& (req_ctx->count <= MAX_HW_HASH_SIZE);

- op->config |= is_last ? CFG_LAST_FRAG : CFG_MID_FRAG;
- dma_sync_single_for_device(cpg->dev, cpg->sa_sram_dma,
- sizeof(u32), DMA_TO_DEVICE);
- mv_dma_memcpy(cpg->sram_phys + SRAM_CONFIG,
- cpg->sa_sram_dma, sizeof(u32));
-
- op->mac_digest = MAC_DIGEST_P(SRAM_DIGEST_BUF) | MAC_FRAG_LEN(p->crypt_len);
- dma_sync_single_for_device(cpg->dev, cpg->sa_sram_dma + 6 * sizeof(u32),
- sizeof(u32), DMA_TO_DEVICE);
- mv_dma_memcpy(cpg->sram_phys + SRAM_CONFIG + 6 * sizeof(u32),
- cpg->sa_sram_dma + 6 * sizeof(u32), sizeof(u32));
-
- mv_dma_separator();
-
- if (req->result) {
- req_ctx->result_dma = dma_map_single(cpg->dev, req->result,
- req_ctx->digestsize, DMA_FROM_DEVICE);
- mv_dma_memcpy(req_ctx->result_dma,
- cpg->sram_phys + SRAM_DIGEST_BUF, req_ctx->digestsize);
- } else {
- /* XXX: this fixes some ugly register fuckup bug in the tdma engine
- * (no need to sync since the data is ignored anyway) */
- mv_dma_memcpy(cpg->sa_sram_dma,
- cpg->sram_phys + SRAM_CONFIG, 1);
- }
+ val |= is_last ? CFG_LAST_FRAG : CFG_MID_FRAG;
+ mv_dma_u32_copy(cpg->sram_phys + SRAM_CONFIG, val);

- /* GO */
- mv_setup_timer();
- mv_dma_trigger();
- writel(SEC_CMD_EN_SEC_ACCL0, cpg->reg + SEC_ACCEL_CMD);
+ val = MAC_DIGEST_P(SRAM_DIGEST_BUF) | MAC_FRAG_LEN(p->crypt_len);
+ mv_dma_u32_copy(cpg->sram_phys + SRAM_CONFIG + 6 * sizeof(u32), val);
}

static inline int mv_hash_import_sha1_ctx(const struct mv_req_hash_ctx *ctx,
@@ -837,7 +792,6 @@ static void mv_start_new_hash_req(struct ahash_request *req)

p->hw_nbytes = hw_bytes;
p->complete = mv_hash_algo_completion;
- p->process = mv_update_hash_config;

if (unlikely(old_extra_bytes)) {
dma_sync_single_for_device(cpg->dev, ctx->buffer_dma,
@@ -849,6 +803,33 @@ static void mv_start_new_hash_req(struct ahash_request *req)

setup_data_in();
mv_init_hash_config(req);
+ mv_dma_separator();
+ cpg->p.hw_processed_bytes += cpg->p.crypt_len;
+ while (cpg->p.hw_processed_bytes < cpg->p.hw_nbytes) {
+ cpg->p.crypt_len = 0;
+
+ setup_data_in();
+ mv_update_hash_config(req);
+ mv_dma_separator();
+ cpg->p.hw_processed_bytes += cpg->p.crypt_len;
+ }
+ if (req->result) {
+ ctx->result_dma = dma_map_single(cpg->dev, req->result,
+ ctx->digestsize, DMA_FROM_DEVICE);
+ mv_dma_memcpy(ctx->result_dma,
+ cpg->sram_phys + SRAM_DIGEST_BUF,
+ ctx->digestsize);
+ } else {
+ /* XXX: this fixes some ugly register fuckup bug in the tdma engine
+ * (no need to sync since the data is ignored anyway) */
+ mv_dma_memcpy(cpg->sa_sram_dma,
+ cpg->sram_phys + SRAM_CONFIG, 1);
+ }
+
+ /* GO */
+ mv_setup_timer();
+ mv_dma_trigger();
+ writel(SEC_CMD_EN_SEC_ACCL0, cpg->reg + SEC_ACCEL_CMD);
}

static int queue_manag(void *data)
--
1.7.3.4

2012-06-12 17:17:52

by Phil Sutter

Subject: [PATCH 08/13] mv_cesa: fetch extra_bytes via DMA engine, too


Signed-off-by: Phil Sutter <[email protected]>
---
drivers/crypto/mv_cesa.c | 12 ++++++++++--
1 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/drivers/crypto/mv_cesa.c b/drivers/crypto/mv_cesa.c
index 4b08137..7dfab85 100644
--- a/drivers/crypto/mv_cesa.c
+++ b/drivers/crypto/mv_cesa.c
@@ -158,6 +158,7 @@ struct mv_req_hash_ctx {
u64 count;
u32 state[SHA1_DIGEST_SIZE / 4];
u8 buffer[SHA1_BLOCK_SIZE];
+ dma_addr_t buffer_dma;
int first_hash; /* marks that we don't have previous state */
int last_chunk; /* marks that this is the 'final' request */
int extra_bytes; /* unprocessed bytes in buffer */
@@ -638,6 +639,9 @@ static void mv_hash_algo_completion(void)
dma_unmap_single(cpg->dev, ctx->result_dma,
ctx->digestsize, DMA_FROM_DEVICE);

+ dma_unmap_single(cpg->dev, ctx->buffer_dma,
+ SHA1_BLOCK_SIZE, DMA_TO_DEVICE);
+
if (unlikely(ctx->count > MAX_HW_HASH_SIZE)) {
mv_save_digest_state(ctx);
mv_hash_final_fallback(req);
@@ -757,8 +761,10 @@ static void mv_start_new_hash_req(struct ahash_request *req)
p->process = mv_update_hash_config;

if (unlikely(old_extra_bytes)) {
- memcpy(cpg->sram + SRAM_DATA_IN_START, ctx->buffer,
- old_extra_bytes);
+ dma_sync_single_for_device(cpg->dev, ctx->buffer_dma,
+ SHA1_BLOCK_SIZE, DMA_TO_DEVICE);
+ mv_dma_memcpy(cpg->sram_phys + SRAM_DATA_IN_START,
+ ctx->buffer_dma, old_extra_bytes);
p->crypt_len = old_extra_bytes;
}

@@ -903,6 +909,8 @@ static void mv_init_hash_req_ctx(struct mv_req_hash_ctx *ctx, int op,
ctx->first_hash = 1;
ctx->last_chunk = is_last;
ctx->count_add = count_add;
+ ctx->buffer_dma = dma_map_single(cpg->dev, ctx->buffer,
+ SHA1_BLOCK_SIZE, DMA_TO_DEVICE);
}

static void mv_update_hash_req_ctx(struct mv_req_hash_ctx *ctx, int is_last,
--
1.7.3.4
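
Taken together, the hunks establish the usual streaming-DMA lifecycle for
ctx->buffer (fragments assembled from the patch above):

/* request initialisation: map the block buffer once */
ctx->buffer_dma = dma_map_single(cpg->dev, ctx->buffer,
                                 SHA1_BLOCK_SIZE, DMA_TO_DEVICE);

/* before queueing the engine's read: the CPU has written to the
 * buffer, so flush those writes out to memory */
dma_sync_single_for_device(cpg->dev, ctx->buffer_dma,
                           SHA1_BLOCK_SIZE, DMA_TO_DEVICE);
mv_dma_memcpy(cpg->sram_phys + SRAM_DATA_IN_START,
              ctx->buffer_dma, old_extra_bytes);

/* request completion: release the mapping again */
dma_unmap_single(cpg->dev, ctx->buffer_dma,
                 SHA1_BLOCK_SIZE, DMA_TO_DEVICE);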

2012-06-12 17:17:49

by Phil Sutter

Subject: [PATCH 06/13] mv_cesa: use DMA engine for data transfers


Signed-off-by: Phil Sutter <[email protected]>
---
arch/arm/plat-orion/common.c | 6 +
drivers/crypto/mv_cesa.c | 214 +++++++++++++++++++++++++++++++++---------
2 files changed, 175 insertions(+), 45 deletions(-)

diff --git a/arch/arm/plat-orion/common.c b/arch/arm/plat-orion/common.c
index 61fd837..0c6c695 100644
--- a/arch/arm/plat-orion/common.c
+++ b/arch/arm/plat-orion/common.c
@@ -924,9 +924,15 @@ static struct resource orion_crypto_resources[] = {
},
};

+static u64 mv_crypto_dmamask = DMA_BIT_MASK(32);
+
static struct platform_device orion_crypto = {
.name = "mv_crypto",
.id = -1,
+ .dev = {
+ .dma_mask = &mv_crypto_dmamask,
+ .coherent_dma_mask = DMA_BIT_MASK(32),
+ },
};

void __init orion_crypto_init(unsigned long mapbase,
diff --git a/drivers/crypto/mv_cesa.c b/drivers/crypto/mv_cesa.c
index ad21c72..cdbc82e 100644
--- a/drivers/crypto/mv_cesa.c
+++ b/drivers/crypto/mv_cesa.c
@@ -9,6 +9,7 @@
#include <crypto/aes.h>
#include <crypto/algapi.h>
#include <linux/crypto.h>
+#include <linux/dma-mapping.h>
#include <linux/interrupt.h>
#include <linux/io.h>
#include <linux/kthread.h>
@@ -21,11 +22,14 @@
#include <crypto/sha.h>

#include "mv_cesa.h"
+#include "mv_dma.h"

#define MV_CESA "MV-CESA:"
#define MAX_HW_HASH_SIZE 0xFFFF
#define MV_CESA_EXPIRE 500 /* msec */

+static int count_sgs(struct scatterlist *, unsigned int);
+
/*
* STM:
* /---------------------------------------\
@@ -50,7 +54,6 @@ enum engine_status {
* @src_start: offset to add to src start position (scatter list)
* @crypt_len: length of current hw crypt/hash process
* @hw_nbytes: total bytes to process in hw for this request
- * @copy_back: whether to copy data back (crypt) or not (hash)
* @sg_dst_left: bytes left dst to process in this scatter list
* @dst_start: offset to add to dst start position (scatter list)
* @hw_processed_bytes: number of bytes processed by hw (request).
@@ -71,7 +74,6 @@ struct req_progress {
int crypt_len;
int hw_nbytes;
/* dst mostly */
- int copy_back;
int sg_dst_left;
int dst_start;
int hw_processed_bytes;
@@ -96,8 +98,10 @@ struct sec_accel_sram {
} __attribute__((packed));

struct crypto_priv {
+ struct device *dev;
void __iomem *reg;
void __iomem *sram;
+ u32 sram_phys;
int irq;
struct clk *clk;
struct task_struct *queue_th;
@@ -115,6 +119,7 @@ struct crypto_priv {
int has_hmac_sha1;

struct sec_accel_sram sa_sram;
+ dma_addr_t sa_sram_dma;
};

static struct crypto_priv *cpg;
@@ -183,6 +188,23 @@ static void mv_setup_timer(void)
jiffies + msecs_to_jiffies(MV_CESA_EXPIRE));
}

+static inline bool
+mv_dma_map_sg(struct scatterlist *sg, int nbytes, enum dma_data_direction dir)
+{
+ int nents = count_sgs(sg, nbytes);
+
+ if (nbytes && dma_map_sg(cpg->dev, sg, nents, dir) != nents)
+ return false;
+ return true;
+}
+
+static inline void
+mv_dma_unmap_sg(struct scatterlist *sg, int nbytes, enum dma_data_direction dir)
+{
+ if (nbytes)
+ dma_unmap_sg(cpg->dev, sg, count_sgs(sg, nbytes), dir);
+}
+
static void compute_aes_dec_key(struct mv_ctx *ctx)
{
struct crypto_aes_ctx gen_aes_key;
@@ -257,12 +279,66 @@ static void copy_src_to_buf(struct req_progress *p, char *dbuf, int len)
}
}

+static void dma_copy_src_to_buf(struct req_progress *p, dma_addr_t dbuf, int len)
+{
+ dma_addr_t sbuf;
+ int copy_len;
+
+ while (len) {
+ if (!p->sg_src_left) {
+ /* next sg please */
+ p->src_sg = sg_next(p->src_sg);
+ BUG_ON(!p->src_sg);
+ p->sg_src_left = sg_dma_len(p->src_sg);
+ p->src_start = 0;
+ }
+
+ sbuf = sg_dma_address(p->src_sg) + p->src_start;
+
+ copy_len = min(p->sg_src_left, len);
+ mv_dma_memcpy(dbuf, sbuf, copy_len);
+
+ p->src_start += copy_len;
+ p->sg_src_left -= copy_len;
+
+ len -= copy_len;
+ dbuf += copy_len;
+ }
+}
+
+static void dma_copy_buf_to_dst(struct req_progress *p, dma_addr_t sbuf, int len)
+{
+ dma_addr_t dbuf;
+ int copy_len;
+
+ while (len) {
+ if (!p->sg_dst_left) {
+ /* next sg please */
+ p->dst_sg = sg_next(p->dst_sg);
+ BUG_ON(!p->dst_sg);
+ p->sg_dst_left = sg_dma_len(p->dst_sg);
+ p->dst_start = 0;
+ }
+
+ dbuf = sg_dma_address(p->dst_sg) + p->dst_start;
+
+ copy_len = min(p->sg_dst_left, len);
+ mv_dma_memcpy(dbuf, sbuf, copy_len);
+
+ p->dst_start += copy_len;
+ p->sg_dst_left -= copy_len;
+
+ len -= copy_len;
+ sbuf += copy_len;
+ }
+}
+
static void setup_data_in(void)
{
struct req_progress *p = &cpg->p;
int data_in_sram =
min(p->hw_nbytes - p->hw_processed_bytes, cpg->max_req_size);
- copy_src_to_buf(p, cpg->sram + SRAM_DATA_IN_START + p->crypt_len,
+ dma_copy_src_to_buf(p, cpg->sram_phys + SRAM_DATA_IN_START + p->crypt_len,
data_in_sram - p->crypt_len);
p->crypt_len = data_in_sram;
}
@@ -309,22 +385,39 @@ static void mv_init_crypt_config(struct ablkcipher_request *req)
op->enc_key_p = SRAM_DATA_KEY_P;
op->enc_len = cpg->p.crypt_len;

- memcpy(cpg->sram + SRAM_CONFIG, &cpg->sa_sram,
+ dma_sync_single_for_device(cpg->dev, cpg->sa_sram_dma,
+ sizeof(struct sec_accel_sram), DMA_TO_DEVICE);
+ mv_dma_memcpy(cpg->sram_phys + SRAM_CONFIG, cpg->sa_sram_dma,
sizeof(struct sec_accel_sram));

+ mv_dma_separator();
+ dma_copy_buf_to_dst(&cpg->p, cpg->sram_phys + SRAM_DATA_OUT_START, cpg->p.crypt_len);
+
/* GO */
mv_setup_timer();
+ mv_dma_trigger();
writel(SEC_CMD_EN_SEC_ACCL0, cpg->reg + SEC_ACCEL_CMD);
}

static void mv_update_crypt_config(void)
{
+ struct sec_accel_config *op = &cpg->sa_sram.op;
+
/* update the enc_len field only */
- memcpy(cpg->sram + SRAM_CONFIG + 2 * sizeof(u32),
- &cpg->p.crypt_len, sizeof(u32));
+
+ op->enc_len = cpg->p.crypt_len;
+
+ dma_sync_single_for_device(cpg->dev, cpg->sa_sram_dma + 2 * sizeof(u32),
+ sizeof(u32), DMA_TO_DEVICE);
+ mv_dma_memcpy(cpg->sram_phys + SRAM_CONFIG + 2 * sizeof(u32),
+ cpg->sa_sram_dma + 2 * sizeof(u32), sizeof(u32));
+
+ mv_dma_separator();
+ dma_copy_buf_to_dst(&cpg->p, cpg->sram_phys + SRAM_DATA_OUT_START, cpg->p.crypt_len);

/* GO */
mv_setup_timer();
+ mv_dma_trigger();
writel(SEC_CMD_EN_SEC_ACCL0, cpg->reg + SEC_ACCEL_CMD);
}

@@ -333,6 +426,13 @@ static void mv_crypto_algo_completion(void)
struct ablkcipher_request *req = ablkcipher_request_cast(cpg->cur_req);
struct mv_req_ctx *req_ctx = ablkcipher_request_ctx(req);

+ if (req->src == req->dst) {
+ mv_dma_unmap_sg(req->src, req->nbytes, DMA_BIDIRECTIONAL);
+ } else {
+ mv_dma_unmap_sg(req->src, req->nbytes, DMA_TO_DEVICE);
+ mv_dma_unmap_sg(req->dst, req->nbytes, DMA_FROM_DEVICE);
+ }
+
if (req_ctx->op != COP_AES_CBC)
return ;

@@ -392,11 +492,20 @@ static void mv_init_hash_config(struct ahash_request *req)
writel(req_ctx->state[4], cpg->reg + DIGEST_INITIAL_VAL_E);
}

- memcpy(cpg->sram + SRAM_CONFIG, &cpg->sa_sram,
+ dma_sync_single_for_device(cpg->dev, cpg->sa_sram_dma,
+ sizeof(struct sec_accel_sram), DMA_TO_DEVICE);
+ mv_dma_memcpy(cpg->sram_phys + SRAM_CONFIG, cpg->sa_sram_dma,
sizeof(struct sec_accel_sram));

+ mv_dma_separator();
+
+ /* XXX: this fixes some ugly register fuckup bug in the tdma engine
+ * (no need to sync since the data is ignored anyway) */
+ mv_dma_memcpy(cpg->sa_sram_dma, cpg->sram_phys + SRAM_CONFIG, 1);
+
/* GO */
mv_setup_timer();
+ mv_dma_trigger();
writel(SEC_CMD_EN_SEC_ACCL0, cpg->reg + SEC_ACCEL_CMD);
}

@@ -426,13 +535,26 @@ static void mv_update_hash_config(void)
&& (req_ctx->count <= MAX_HW_HASH_SIZE);

op->config |= is_last ? CFG_LAST_FRAG : CFG_MID_FRAG;
- memcpy(cpg->sram + SRAM_CONFIG, &op->config, sizeof(u32));
+ dma_sync_single_for_device(cpg->dev, cpg->sa_sram_dma,
+ sizeof(u32), DMA_TO_DEVICE);
+ mv_dma_memcpy(cpg->sram_phys + SRAM_CONFIG,
+ cpg->sa_sram_dma, sizeof(u32));

op->mac_digest = MAC_DIGEST_P(SRAM_DIGEST_BUF) | MAC_FRAG_LEN(p->crypt_len);
- memcpy(cpg->sram + SRAM_CONFIG + 6 * sizeof(u32), &op->mac_digest, sizeof(u32));
+ dma_sync_single_for_device(cpg->dev, cpg->sa_sram_dma + 6 * sizeof(u32),
+ sizeof(u32), DMA_TO_DEVICE);
+ mv_dma_memcpy(cpg->sram_phys + SRAM_CONFIG + 6 * sizeof(u32),
+ cpg->sa_sram_dma + 6 * sizeof(u32), sizeof(u32));
+
+ mv_dma_separator();
+
+ /* XXX: this fixes some ugly register fuckup bug in the tdma engine
+ * (no need to sync since the data is ignored anyway) */
+ mv_dma_memcpy(cpg->sa_sram_dma, cpg->sram_phys + SRAM_CONFIG, 1);

/* GO */
mv_setup_timer();
+ mv_dma_trigger();
writel(SEC_CMD_EN_SEC_ACCL0, cpg->reg + SEC_ACCEL_CMD);
}

@@ -506,43 +628,18 @@ static void mv_hash_algo_completion(void)
} else {
mv_save_digest_state(ctx);
}
+
+ mv_dma_unmap_sg(req->src, req->nbytes, DMA_TO_DEVICE);
}

static void dequeue_complete_req(void)
{
struct crypto_async_request *req = cpg->cur_req;
- void *buf;
cpg->p.hw_processed_bytes += cpg->p.crypt_len;
- if (cpg->p.copy_back) {
- int need_copy_len = cpg->p.crypt_len;
- int sram_offset = 0;
- do {
- int dst_copy;
-
- if (!cpg->p.sg_dst_left) {
- /* next sg please */
- cpg->p.dst_sg = sg_next(cpg->p.dst_sg);
- BUG_ON(!cpg->p.dst_sg);
- cpg->p.sg_dst_left = cpg->p.dst_sg->length;
- cpg->p.dst_start = 0;
- }
-
- buf = sg_virt(cpg->p.dst_sg) + cpg->p.dst_start;
-
- dst_copy = min(need_copy_len, cpg->p.sg_dst_left);
-
- memcpy(buf,
- cpg->sram + SRAM_DATA_OUT_START + sram_offset,
- dst_copy);
- sram_offset += dst_copy;
- cpg->p.sg_dst_left -= dst_copy;
- need_copy_len -= dst_copy;
- cpg->p.dst_start += dst_copy;
- } while (need_copy_len > 0);
- }
-
cpg->p.crypt_len = 0;

+ mv_dma_clear();
+
BUG_ON(cpg->eng_st != ENGINE_W_DEQUEUE);
if (cpg->p.hw_processed_bytes < cpg->p.hw_nbytes) {
/* process next scatter list entry */
@@ -584,15 +681,28 @@ static void mv_start_new_crypt_req(struct ablkcipher_request *req)
p->hw_nbytes = req->nbytes;
p->complete = mv_crypto_algo_completion;
p->process = mv_update_crypt_config;
- p->copy_back = 1;
+
+ /* assume inplace request */
+ if (req->src == req->dst) {
+ if (!mv_dma_map_sg(req->src, req->nbytes, DMA_BIDIRECTIONAL))
+ return;
+ } else {
+ if (!mv_dma_map_sg(req->src, req->nbytes, DMA_TO_DEVICE))
+ return;
+
+ if (!mv_dma_map_sg(req->dst, req->nbytes, DMA_FROM_DEVICE)) {
+ mv_dma_unmap_sg(req->src, req->nbytes, DMA_TO_DEVICE);
+ return;
+ }
+ }

p->src_sg = req->src;
p->dst_sg = req->dst;
if (req->nbytes) {
BUG_ON(!req->src);
BUG_ON(!req->dst);
- p->sg_src_left = req->src->length;
- p->sg_dst_left = req->dst->length;
+ p->sg_src_left = sg_dma_len(req->src);
+ p->sg_dst_left = sg_dma_len(req->dst);
}

setup_data_in();
@@ -604,6 +714,7 @@ static void mv_start_new_hash_req(struct ahash_request *req)
struct req_progress *p = &cpg->p;
struct mv_req_hash_ctx *ctx = ahash_request_ctx(req);
int hw_bytes, old_extra_bytes, rc;
+
cpg->cur_req = &req->base;
memset(p, 0, sizeof(struct req_progress));
hw_bytes = req->nbytes + ctx->extra_bytes;
@@ -633,6 +744,11 @@ static void mv_start_new_hash_req(struct ahash_request *req)
p->crypt_len = old_extra_bytes;
}

+ if (!mv_dma_map_sg(req->src, req->nbytes, DMA_TO_DEVICE)) {
+ printk(KERN_ERR "%s: out of memory\n", __func__);
+ return;
+ }
+
setup_data_in();
mv_init_hash_config(req);
} else {
@@ -968,14 +1084,14 @@ irqreturn_t crypto_int(int irq, void *priv)
u32 val;

val = readl(cpg->reg + SEC_ACCEL_INT_STATUS);
- if (!(val & SEC_INT_ACCEL0_DONE))
+ if (!(val & SEC_INT_ACC0_IDMA_DONE))
return IRQ_NONE;

if (!del_timer(&cpg->completion_timer)) {
printk(KERN_WARNING MV_CESA
"got an interrupt but no pending timer?\n");
}
- val &= ~SEC_INT_ACCEL0_DONE;
+ val &= ~SEC_INT_ACC0_IDMA_DONE;
writel(val, cpg->reg + FPGA_INT_STATUS);
writel(val, cpg->reg + SEC_ACCEL_INT_STATUS);
BUG_ON(cpg->eng_st != ENGINE_BUSY);
@@ -1115,6 +1231,7 @@ static int mv_probe(struct platform_device *pdev)
}
cp->sram_size = resource_size(res);
cp->max_req_size = cp->sram_size - SRAM_CFG_SPACE;
+ cp->sram_phys = res->start;
cp->sram = ioremap(res->start, cp->sram_size);
if (!cp->sram) {
ret = -ENOMEM;
@@ -1130,6 +1247,7 @@ static int mv_probe(struct platform_device *pdev)

platform_set_drvdata(pdev, cp);
cpg = cp;
+ cpg->dev = &pdev->dev;

cp->queue_th = kthread_run(queue_manag, cp, "mv_crypto");
if (IS_ERR(cp->queue_th)) {
@@ -1149,10 +1267,14 @@ static int mv_probe(struct platform_device *pdev)
clk_prepare_enable(cp->clk);

writel(0, cpg->reg + SEC_ACCEL_INT_STATUS);
- writel(SEC_INT_ACCEL0_DONE, cpg->reg + SEC_ACCEL_INT_MASK);
- writel(SEC_CFG_STOP_DIG_ERR, cpg->reg + SEC_ACCEL_CFG);
+ writel(SEC_INT_ACC0_IDMA_DONE, cpg->reg + SEC_ACCEL_INT_MASK);
+ writel((SEC_CFG_STOP_DIG_ERR | SEC_CFG_CH0_W_IDMA |
+ SEC_CFG_ACT_CH0_IDMA), cpg->reg + SEC_ACCEL_CFG);
writel(SRAM_CONFIG, cpg->reg + SEC_ACCEL_DESC_P0);

+ cp->sa_sram_dma = dma_map_single(&pdev->dev, &cp->sa_sram,
+ sizeof(struct sec_accel_sram), DMA_TO_DEVICE);
+
ret = crypto_register_alg(&mv_aes_alg_ecb);
if (ret) {
printk(KERN_WARNING MV_CESA
@@ -1211,6 +1333,8 @@ static int mv_remove(struct platform_device *pdev)
crypto_unregister_ahash(&mv_hmac_sha1_alg);
kthread_stop(cp->queue_th);
free_irq(cp->irq, cp);
+ dma_unmap_single(&pdev->dev, cpg->sa_sram_dma,
+ sizeof(struct sec_accel_sram), DMA_TO_DEVICE);
memset(cp->sram, 0, cp->sram_size);
iounmap(cp->sram);
iounmap(cp->reg);
--
1.7.3.4
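
Taken together, the hunks above set up one fully chained processing
pass per chunk. Condensed, the flow looks like this (names as in the
patch; a sketch for orientation, not compilable driver code):

/*
 * setup_data_in()          -- queue copy: source sg -> SRAM data-in
 * mv_dma_memcpy(...)       -- queue copy: staged config (sa_sram_dma)
 *                             -> SRAM config area
 * mv_dma_separator()       -- fence descriptor: crypto engine runs here
 * dma_copy_buf_to_dst(...) -- queue copy: SRAM data-out -> destination sg
 * mv_setup_timer()
 * mv_dma_trigger()         -- point the DMA engine at the chain head
 * writel(SEC_CMD_EN_SEC_ACCL0, ...) -- start the security accelerator
 */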

2012-06-12 17:17:50

by Phil Sutter

Subject: [PATCH 03/13] mv_cesa: prepare the full sram config in dram

This way, reconfiguring the cryptographic accelerator consists of a
single step (a memcpy here), which in the future can be done by the
TDMA engine.

This patch introduces some ugly IV copying, necessary for input buffers
above 1920 bytes. But this will go away again in a later patch.
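
The idea in a nutshell, as a minimal model (the field layout is
abbreviated here; the authoritative definition is struct sec_accel_sram
in the patch below, whose config block is eight u32 words):

#include <stdint.h>
#include <string.h>

/* abbreviated stand-in for struct sec_accel_sram from the patch */
struct sa_model {
    uint32_t op[8];               /* the sec_accel_config words        */
    union {
        struct { uint32_t key[8], iv[4]; } crypt;
        struct { uint32_t ivi[5], ivo[5]; } hash;
    } type;                       /* crypto and hashes share this area */
} __attribute__((packed));

/* reconfiguration boils down to one copy -- exactly the step a DMA
 * engine can take over later */
static void push_config(void *sram_config, const struct sa_model *sa)
{
    memcpy(sram_config, sa, sizeof(*sa));
}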

Signed-off-by: Phil Sutter <[email protected]>
---
drivers/crypto/mv_cesa.c | 83 ++++++++++++++++++++++++++++-----------------
1 files changed, 52 insertions(+), 31 deletions(-)

diff --git a/drivers/crypto/mv_cesa.c b/drivers/crypto/mv_cesa.c
index 59c2ed2..80dcf16 100644
--- a/drivers/crypto/mv_cesa.c
+++ b/drivers/crypto/mv_cesa.c
@@ -77,6 +77,24 @@ struct req_progress {
int hw_processed_bytes;
};

+struct sec_accel_sram {
+ struct sec_accel_config op;
+ union {
+ struct {
+ u32 key[8];
+ u32 iv[4];
+ } crypt;
+ struct {
+ u32 ivi[5];
+ u32 ivo[5];
+ } hash;
+ } type;
+#define sa_key type.crypt.key
+#define sa_iv type.crypt.iv
+#define sa_ivi type.hash.ivi
+#define sa_ivo type.hash.ivo
+} __attribute__((packed));
+
struct crypto_priv {
void __iomem *reg;
void __iomem *sram;
@@ -95,6 +113,8 @@ struct crypto_priv {
int sram_size;
int has_sha1;
int has_hmac_sha1;
+
+ struct sec_accel_sram sa_sram;
};

static struct crypto_priv *cpg;
@@ -252,48 +272,49 @@ static void mv_process_current_q(int first_block)
struct ablkcipher_request *req = ablkcipher_request_cast(cpg->cur_req);
struct mv_ctx *ctx = crypto_tfm_ctx(req->base.tfm);
struct mv_req_ctx *req_ctx = ablkcipher_request_ctx(req);
- struct sec_accel_config op;
+ struct sec_accel_config *op = &cpg->sa_sram.op;

switch (req_ctx->op) {
case COP_AES_ECB:
- op.config = CFG_OP_CRYPT_ONLY | CFG_ENCM_AES | CFG_ENC_MODE_ECB;
+ op->config = CFG_OP_CRYPT_ONLY | CFG_ENCM_AES | CFG_ENC_MODE_ECB;
break;
case COP_AES_CBC:
default:
- op.config = CFG_OP_CRYPT_ONLY | CFG_ENCM_AES | CFG_ENC_MODE_CBC;
- op.enc_iv = ENC_IV_POINT(SRAM_DATA_IV) |
+ op->config = CFG_OP_CRYPT_ONLY | CFG_ENCM_AES | CFG_ENC_MODE_CBC;
+ op->enc_iv = ENC_IV_POINT(SRAM_DATA_IV) |
ENC_IV_BUF_POINT(SRAM_DATA_IV_BUF);
- if (first_block)
- memcpy(cpg->sram + SRAM_DATA_IV, req->info, 16);
+ if (!first_block)
+ memcpy(req->info, cpg->sram + SRAM_DATA_IV_BUF, 16);
+ memcpy(cpg->sa_sram.sa_iv, req->info, 16);
break;
}
if (req_ctx->decrypt) {
- op.config |= CFG_DIR_DEC;
- memcpy(cpg->sram + SRAM_DATA_KEY_P, ctx->aes_dec_key, AES_KEY_LEN);
+ op->config |= CFG_DIR_DEC;
+ memcpy(cpg->sa_sram.sa_key, ctx->aes_dec_key, AES_KEY_LEN);
} else {
- op.config |= CFG_DIR_ENC;
- memcpy(cpg->sram + SRAM_DATA_KEY_P, ctx->aes_enc_key, AES_KEY_LEN);
+ op->config |= CFG_DIR_ENC;
+ memcpy(cpg->sa_sram.sa_key, ctx->aes_enc_key, AES_KEY_LEN);
}

switch (ctx->key_len) {
case AES_KEYSIZE_128:
- op.config |= CFG_AES_LEN_128;
+ op->config |= CFG_AES_LEN_128;
break;
case AES_KEYSIZE_192:
- op.config |= CFG_AES_LEN_192;
+ op->config |= CFG_AES_LEN_192;
break;
case AES_KEYSIZE_256:
- op.config |= CFG_AES_LEN_256;
+ op->config |= CFG_AES_LEN_256;
break;
}
- op.enc_p = ENC_P_SRC(SRAM_DATA_IN_START) |
+ op->enc_p = ENC_P_SRC(SRAM_DATA_IN_START) |
ENC_P_DST(SRAM_DATA_OUT_START);
- op.enc_key_p = SRAM_DATA_KEY_P;
+ op->enc_key_p = SRAM_DATA_KEY_P;

setup_data_in();
- op.enc_len = cpg->p.crypt_len;
- memcpy(cpg->sram + SRAM_CONFIG, &op,
- sizeof(struct sec_accel_config));
+ op->enc_len = cpg->p.crypt_len;
+ memcpy(cpg->sram + SRAM_CONFIG, &cpg->sa_sram,
+ sizeof(struct sec_accel_sram));

/* GO */
mv_setup_timer();
@@ -317,30 +338,30 @@ static void mv_process_hash_current(int first_block)
const struct mv_tfm_hash_ctx *tfm_ctx = crypto_tfm_ctx(req->base.tfm);
struct mv_req_hash_ctx *req_ctx = ahash_request_ctx(req);
struct req_progress *p = &cpg->p;
- struct sec_accel_config op = { 0 };
+ struct sec_accel_config *op = &cpg->sa_sram.op;
int is_last;

switch (req_ctx->op) {
case COP_SHA1:
default:
- op.config = CFG_OP_MAC_ONLY | CFG_MACM_SHA1;
+ op->config = CFG_OP_MAC_ONLY | CFG_MACM_SHA1;
break;
case COP_HMAC_SHA1:
- op.config = CFG_OP_MAC_ONLY | CFG_MACM_HMAC_SHA1;
- memcpy(cpg->sram + SRAM_HMAC_IV_IN,
+ op->config = CFG_OP_MAC_ONLY | CFG_MACM_HMAC_SHA1;
+ memcpy(cpg->sa_sram.sa_ivi,
tfm_ctx->ivs, sizeof(tfm_ctx->ivs));
break;
}

- op.mac_src_p =
+ op->mac_src_p =
MAC_SRC_DATA_P(SRAM_DATA_IN_START) |
MAC_SRC_TOTAL_LEN((u32)req_ctx->count);

setup_data_in();

- op.mac_digest =
+ op->mac_digest =
MAC_DIGEST_P(SRAM_DIGEST_BUF) | MAC_FRAG_LEN(p->crypt_len);
- op.mac_iv =
+ op->mac_iv =
MAC_INNER_IV_P(SRAM_HMAC_IV_IN) |
MAC_OUTER_IV_P(SRAM_HMAC_IV_OUT);

@@ -349,16 +370,16 @@ static void mv_process_hash_current(int first_block)
&& (req_ctx->count <= MAX_HW_HASH_SIZE);
if (req_ctx->first_hash) {
if (is_last)
- op.config |= CFG_NOT_FRAG;
+ op->config |= CFG_NOT_FRAG;
else
- op.config |= CFG_FIRST_FRAG;
+ op->config |= CFG_FIRST_FRAG;

req_ctx->first_hash = 0;
} else {
if (is_last)
- op.config |= CFG_LAST_FRAG;
+ op->config |= CFG_LAST_FRAG;
else
- op.config |= CFG_MID_FRAG;
+ op->config |= CFG_MID_FRAG;

if (first_block) {
writel(req_ctx->state[0], cpg->reg + DIGEST_INITIAL_VAL_A);
@@ -369,8 +390,8 @@ static void mv_process_hash_current(int first_block)
}
}

- memcpy(cpg->sram + SRAM_CONFIG, &op,
- sizeof(struct sec_accel_config));
+ memcpy(cpg->sram + SRAM_CONFIG, &cpg->sa_sram,
+ sizeof(struct sec_accel_sram));

/* GO */
mv_setup_timer();
--
1.7.3.4

2012-06-12 17:17:44

by Phil Sutter

Subject: [PATCH 07/13] mv_cesa: have DMA engine copy back the digest result


Signed-off-by: Phil Sutter <[email protected]>
---
drivers/crypto/mv_cesa.c | 40 +++++++++++++++++++++++++++++-----------
1 files changed, 29 insertions(+), 11 deletions(-)

diff --git a/drivers/crypto/mv_cesa.c b/drivers/crypto/mv_cesa.c
index cdbc82e..4b08137 100644
--- a/drivers/crypto/mv_cesa.c
+++ b/drivers/crypto/mv_cesa.c
@@ -161,8 +161,10 @@ struct mv_req_hash_ctx {
int first_hash; /* marks that we don't have previous state */
int last_chunk; /* marks that this is the 'final' request */
int extra_bytes; /* unprocessed bytes in buffer */
+ int digestsize; /* size of the digest */
enum hash_op op;
int count_add;
+ dma_addr_t result_dma;
};

static void mv_completion_timer_callback(unsigned long unused)
@@ -499,9 +501,17 @@ static void mv_init_hash_config(struct ahash_request *req)

mv_dma_separator();

- /* XXX: this fixes some ugly register fuckup bug in the tdma engine
- * (no need to sync since the data is ignored anyway) */
- mv_dma_memcpy(cpg->sa_sram_dma, cpg->sram_phys + SRAM_CONFIG, 1);
+ if (req->result) {
+ req_ctx->result_dma = dma_map_single(cpg->dev, req->result,
+ req_ctx->digestsize, DMA_FROM_DEVICE);
+ mv_dma_memcpy(req_ctx->result_dma,
+ cpg->sram_phys + SRAM_DIGEST_BUF, req_ctx->digestsize);
+ } else {
+ /* XXX: this fixes some ugly register fuckup bug in the tdma engine
+ * (no need to sync since the data is ignored anyway) */
+ mv_dma_memcpy(cpg->sa_sram_dma,
+ cpg->sram_phys + SRAM_CONFIG, 1);
+ }

/* GO */
mv_setup_timer();
@@ -548,9 +558,17 @@ static void mv_update_hash_config(void)

mv_dma_separator();

- /* XXX: this fixes some ugly register fuckup bug in the tdma engine
- * (no need to sync since the data is ignored anyway) */
- mv_dma_memcpy(cpg->sa_sram_dma, cpg->sram_phys + SRAM_CONFIG, 1);
+ if (req->result) {
+ req_ctx->result_dma = dma_map_single(cpg->dev, req->result,
+ req_ctx->digestsize, DMA_FROM_DEVICE);
+ mv_dma_memcpy(req_ctx->result_dma,
+ cpg->sram_phys + SRAM_DIGEST_BUF, req_ctx->digestsize);
+ } else {
+ /* XXX: this fixes some ugly register fuckup bug in the tdma engine
+ * (no need to sync since the data is ignored anyway) */
+ mv_dma_memcpy(cpg->sa_sram_dma,
+ cpg->sram_phys + SRAM_CONFIG, 1);
+ }

/* GO */
mv_setup_timer();
@@ -617,11 +635,10 @@ static void mv_hash_algo_completion(void)
copy_src_to_buf(&cpg->p, ctx->buffer, ctx->extra_bytes);

if (likely(ctx->last_chunk)) {
- if (likely(ctx->count <= MAX_HW_HASH_SIZE)) {
- memcpy(req->result, cpg->sram + SRAM_DIGEST_BUF,
- crypto_ahash_digestsize(crypto_ahash_reqtfm
- (req)));
- } else {
+ dma_unmap_single(cpg->dev, ctx->result_dma,
+ ctx->digestsize, DMA_FROM_DEVICE);
+
+ if (unlikely(ctx->count > MAX_HW_HASH_SIZE)) {
mv_save_digest_state(ctx);
mv_hash_final_fallback(req);
}
@@ -719,6 +736,7 @@ static void mv_start_new_hash_req(struct ahash_request *req)
memset(p, 0, sizeof(struct req_progress));
hw_bytes = req->nbytes + ctx->extra_bytes;
old_extra_bytes = ctx->extra_bytes;
+ ctx->digestsize = crypto_ahash_digestsize(crypto_ahash_reqtfm(req));

ctx->extra_bytes = hw_bytes % SHA1_BLOCK_SIZE;
if (ctx->extra_bytes != 0
--
1.7.3.4

2012-06-12 17:17:47

by Phil Sutter

Subject: [PATCH 04/13] mv_cesa: split up processing callbacks

Add a dedicated function that initialises the full SRAM config, then
use a minimal callback to change only the relevant parts of it.
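
A toy user-space model of that split (dummy values; the offset
2 * sizeof(u32) for enc_len matches the memcpy in the patch below):

#include <stdint.h>
#include <string.h>
#include <stdio.h>

static uint32_t sram[16];    /* stand-in for the SRAM config area */
static uint32_t shadow[16];  /* staged copy in DRAM ("sa_sram")   */

static void init_config(uint32_t enc_len)
{
    shadow[0] = 0x100;       /* config word -- dummy value here   */
    shadow[2] = enc_len;     /* enc_len is the third u32          */
    memcpy(sram, shadow, sizeof(shadow)); /* full write, once per request */
}

static void update_config(uint32_t enc_len)
{
    /* per-chunk callback: patch the one field that changed */
    memcpy(&sram[2], &enc_len, sizeof(uint32_t));
}

int main(void)
{
    init_config(1920);
    update_config(512);
    printf("enc_len now %u\n", (unsigned)sram[2]);
    return 0;
}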

Signed-off-by: Phil Sutter <[email protected]>
---
drivers/crypto/mv_cesa.c | 87 +++++++++++++++++++++++++++++++++------------
1 files changed, 64 insertions(+), 23 deletions(-)

diff --git a/drivers/crypto/mv_cesa.c b/drivers/crypto/mv_cesa.c
index 80dcf16..ad21c72 100644
--- a/drivers/crypto/mv_cesa.c
+++ b/drivers/crypto/mv_cesa.c
@@ -63,7 +63,7 @@ struct req_progress {
struct scatterlist *src_sg;
struct scatterlist *dst_sg;
void (*complete) (void);
- void (*process) (int is_first);
+ void (*process) (void);

/* src mostly */
int sg_src_left;
@@ -267,9 +267,8 @@ static void setup_data_in(void)
p->crypt_len = data_in_sram;
}

-static void mv_process_current_q(int first_block)
+static void mv_init_crypt_config(struct ablkcipher_request *req)
{
- struct ablkcipher_request *req = ablkcipher_request_cast(cpg->cur_req);
struct mv_ctx *ctx = crypto_tfm_ctx(req->base.tfm);
struct mv_req_ctx *req_ctx = ablkcipher_request_ctx(req);
struct sec_accel_config *op = &cpg->sa_sram.op;
@@ -283,8 +282,6 @@ static void mv_process_current_q(int first_block)
op->config = CFG_OP_CRYPT_ONLY | CFG_ENCM_AES | CFG_ENC_MODE_CBC;
op->enc_iv = ENC_IV_POINT(SRAM_DATA_IV) |
ENC_IV_BUF_POINT(SRAM_DATA_IV_BUF);
- if (!first_block)
- memcpy(req->info, cpg->sram + SRAM_DATA_IV_BUF, 16);
memcpy(cpg->sa_sram.sa_iv, req->info, 16);
break;
}
@@ -310,9 +307,8 @@ static void mv_process_current_q(int first_block)
op->enc_p = ENC_P_SRC(SRAM_DATA_IN_START) |
ENC_P_DST(SRAM_DATA_OUT_START);
op->enc_key_p = SRAM_DATA_KEY_P;
-
- setup_data_in();
op->enc_len = cpg->p.crypt_len;
+
memcpy(cpg->sram + SRAM_CONFIG, &cpg->sa_sram,
sizeof(struct sec_accel_sram));

@@ -321,6 +317,17 @@ static void mv_process_current_q(int first_block)
writel(SEC_CMD_EN_SEC_ACCL0, cpg->reg + SEC_ACCEL_CMD);
}

+static void mv_update_crypt_config(void)
+{
+ /* update the enc_len field only */
+ memcpy(cpg->sram + SRAM_CONFIG + 2 * sizeof(u32),
+ &cpg->p.crypt_len, sizeof(u32));
+
+ /* GO */
+ mv_setup_timer();
+ writel(SEC_CMD_EN_SEC_ACCL0, cpg->reg + SEC_ACCEL_CMD);
+}
+
static void mv_crypto_algo_completion(void)
{
struct ablkcipher_request *req = ablkcipher_request_cast(cpg->cur_req);
@@ -332,9 +339,8 @@ static void mv_crypto_algo_completion(void)
memcpy(req->info, cpg->sram + SRAM_DATA_IV_BUF, 16);
}

-static void mv_process_hash_current(int first_block)
+static void mv_init_hash_config(struct ahash_request *req)
{
- struct ahash_request *req = ahash_request_cast(cpg->cur_req);
const struct mv_tfm_hash_ctx *tfm_ctx = crypto_tfm_ctx(req->base.tfm);
struct mv_req_hash_ctx *req_ctx = ahash_request_ctx(req);
struct req_progress *p = &cpg->p;
@@ -357,8 +363,6 @@ static void mv_process_hash_current(int first_block)
MAC_SRC_DATA_P(SRAM_DATA_IN_START) |
MAC_SRC_TOTAL_LEN((u32)req_ctx->count);

- setup_data_in();
-
op->mac_digest =
MAC_DIGEST_P(SRAM_DIGEST_BUF) | MAC_FRAG_LEN(p->crypt_len);
op->mac_iv =
@@ -381,13 +385,11 @@ static void mv_process_hash_current(int first_block)
else
op->config |= CFG_MID_FRAG;

- if (first_block) {
- writel(req_ctx->state[0], cpg->reg + DIGEST_INITIAL_VAL_A);
- writel(req_ctx->state[1], cpg->reg + DIGEST_INITIAL_VAL_B);
- writel(req_ctx->state[2], cpg->reg + DIGEST_INITIAL_VAL_C);
- writel(req_ctx->state[3], cpg->reg + DIGEST_INITIAL_VAL_D);
- writel(req_ctx->state[4], cpg->reg + DIGEST_INITIAL_VAL_E);
- }
+ writel(req_ctx->state[0], cpg->reg + DIGEST_INITIAL_VAL_A);
+ writel(req_ctx->state[1], cpg->reg + DIGEST_INITIAL_VAL_B);
+ writel(req_ctx->state[2], cpg->reg + DIGEST_INITIAL_VAL_C);
+ writel(req_ctx->state[3], cpg->reg + DIGEST_INITIAL_VAL_D);
+ writel(req_ctx->state[4], cpg->reg + DIGEST_INITIAL_VAL_E);
}

memcpy(cpg->sram + SRAM_CONFIG, &cpg->sa_sram,
@@ -398,6 +400,42 @@ static void mv_process_hash_current(int first_block)
writel(SEC_CMD_EN_SEC_ACCL0, cpg->reg + SEC_ACCEL_CMD);
}

+static void mv_update_hash_config(void)
+{
+ struct ahash_request *req = ahash_request_cast(cpg->cur_req);
+ struct mv_req_hash_ctx *req_ctx = ahash_request_ctx(req);
+ struct req_progress *p = &cpg->p;
+ struct sec_accel_config *op = &cpg->sa_sram.op;
+ int is_last;
+
+ /* update only the config (for changed fragment state) and
+ * mac_digest (for changed frag len) fields */
+
+ switch (req_ctx->op) {
+ case COP_SHA1:
+ default:
+ op->config = CFG_OP_MAC_ONLY | CFG_MACM_SHA1;
+ break;
+ case COP_HMAC_SHA1:
+ op->config = CFG_OP_MAC_ONLY | CFG_MACM_HMAC_SHA1;
+ break;
+ }
+
+ is_last = req_ctx->last_chunk
+ && (p->hw_processed_bytes + p->crypt_len >= p->hw_nbytes)
+ && (req_ctx->count <= MAX_HW_HASH_SIZE);
+
+ op->config |= is_last ? CFG_LAST_FRAG : CFG_MID_FRAG;
+ memcpy(cpg->sram + SRAM_CONFIG, &op->config, sizeof(u32));
+
+ op->mac_digest = MAC_DIGEST_P(SRAM_DIGEST_BUF) | MAC_FRAG_LEN(p->crypt_len);
+ memcpy(cpg->sram + SRAM_CONFIG + 6 * sizeof(u32), &op->mac_digest, sizeof(u32));
+
+ /* GO */
+ mv_setup_timer();
+ writel(SEC_CMD_EN_SEC_ACCL0, cpg->reg + SEC_ACCEL_CMD);
+}
+
static inline int mv_hash_import_sha1_ctx(const struct mv_req_hash_ctx *ctx,
struct shash_desc *desc)
{
@@ -509,7 +547,8 @@ static void dequeue_complete_req(void)
if (cpg->p.hw_processed_bytes < cpg->p.hw_nbytes) {
/* process next scatter list entry */
cpg->eng_st = ENGINE_BUSY;
- cpg->p.process(0);
+ setup_data_in();
+ cpg->p.process();
} else {
cpg->p.complete();
cpg->eng_st = ENGINE_IDLE;
@@ -544,7 +583,7 @@ static void mv_start_new_crypt_req(struct ablkcipher_request *req)
memset(p, 0, sizeof(struct req_progress));
p->hw_nbytes = req->nbytes;
p->complete = mv_crypto_algo_completion;
- p->process = mv_process_current_q;
+ p->process = mv_update_crypt_config;
p->copy_back = 1;

p->src_sg = req->src;
@@ -556,7 +595,8 @@ static void mv_start_new_crypt_req(struct ablkcipher_request *req)
p->sg_dst_left = req->dst->length;
}

- mv_process_current_q(1);
+ setup_data_in();
+ mv_init_crypt_config(req);
}

static void mv_start_new_hash_req(struct ahash_request *req)
@@ -585,7 +625,7 @@ static void mv_start_new_hash_req(struct ahash_request *req)
if (hw_bytes) {
p->hw_nbytes = hw_bytes;
p->complete = mv_hash_algo_completion;
- p->process = mv_process_hash_current;
+ p->process = mv_update_hash_config;

if (unlikely(old_extra_bytes)) {
memcpy(cpg->sram + SRAM_DATA_IN_START, ctx->buffer,
@@ -593,7 +633,8 @@ static void mv_start_new_hash_req(struct ahash_request *req)
p->crypt_len = old_extra_bytes;
}

- mv_process_hash_current(1);
+ setup_data_in();
+ mv_init_hash_config(req);
} else {
copy_src_to_buf(p, ctx->buffer + old_extra_bytes,
ctx->extra_bytes - old_extra_bytes);
--
1.7.3.4

2012-06-12 17:17:46

by Phil Sutter

Subject: [PATCH 13/13] mv_cesa, mv_dma: outsource common dma-pool handling code


Signed-off-by: Phil Sutter <[email protected]>
---
drivers/crypto/dma_desclist.h | 79 +++++++++++++++++++++++++++++++++++
drivers/crypto/mv_cesa.c | 81 +++++++++----------------------------
drivers/crypto/mv_dma.c | 91 ++++++++++++-----------------------------
3 files changed, 125 insertions(+), 126 deletions(-)
create mode 100644 drivers/crypto/dma_desclist.h

diff --git a/drivers/crypto/dma_desclist.h b/drivers/crypto/dma_desclist.h
new file mode 100644
index 0000000..c471ad6
--- /dev/null
+++ b/drivers/crypto/dma_desclist.h
@@ -0,0 +1,79 @@
+#ifndef __DMA_DESCLIST__
+#define __DMA_DESCLIST__
+
+struct dma_desc {
+ void *virt;
+ dma_addr_t phys;
+};
+
+struct dma_desclist {
+ struct dma_pool *itempool;
+ struct dma_desc *desclist;
+ unsigned long length;
+ unsigned long usage;
+};
+
+#define DESCLIST_ITEM(dl, x) ((dl).desclist[(x)].virt)
+#define DESCLIST_ITEM_DMA(dl, x) ((dl).desclist[(x)].phys)
+#define DESCLIST_FULL(dl) ((dl).length == (dl).usage)
+
+static inline int
+init_dma_desclist(struct dma_desclist *dl, struct device *dev,
+ size_t size, size_t align, size_t boundary)
+{
+#define STRX(x) #x
+#define STR(x) STRX(x)
+ dl->itempool = dma_pool_create(
+ "DMA Desclist Pool at "__FILE__"("STR(__LINE__)")",
+ dev, size, align, boundary);
+#undef STR
+#undef STRX
+ if (!dl->itempool)
+ return 1;
+ dl->desclist = NULL;
+ dl->length = dl->usage = 0;
+ return 0;
+}
+
+static inline int
+set_dma_desclist_size(struct dma_desclist *dl, unsigned long nelem)
+{
+ /* need to increase size first if requested */
+ if (nelem > dl->length) {
+ struct dma_desc *newmem;
+ int newsize = nelem * sizeof(struct dma_desc);
+
+ newmem = krealloc(dl->desclist, newsize, GFP_KERNEL);
+ if (!newmem)
+ return -ENOMEM;
+ dl->desclist = newmem;
+ }
+
+ /* allocate/free dma descriptors, adjusting dl->length on the go */
+ for (; dl->length < nelem; dl->length++) {
+ DESCLIST_ITEM(*dl, dl->length) = dma_pool_alloc(dl->itempool,
+ GFP_KERNEL, &DESCLIST_ITEM_DMA(*dl, dl->length));
+ if (!DESCLIST_ITEM(*dl, dl->length))
+ return -ENOMEM;
+ }
+ for (; dl->length > nelem; dl->length--)
+ dma_pool_free(dl->itempool, DESCLIST_ITEM(*dl, dl->length - 1),
+ DESCLIST_ITEM_DMA(*dl, dl->length - 1));
+
+ /* ignore size decreases but those to zero */
+ if (!nelem) {
+ kfree(dl->desclist);
+ dl->desclist = 0;
+ }
+ return 0;
+}
+
+static inline void
+fini_dma_desclist(struct dma_desclist *dl)
+{
+ set_dma_desclist_size(dl, 0);
+ dma_pool_destroy(dl->itempool);
+ dl->length = dl->usage = 0;
+}
+
+#endif /* __DMA_DESCLIST__ */
diff --git a/drivers/crypto/mv_cesa.c b/drivers/crypto/mv_cesa.c
index 7b2b693..2a9fe8a 100644
--- a/drivers/crypto/mv_cesa.c
+++ b/drivers/crypto/mv_cesa.c
@@ -24,6 +24,7 @@

#include "mv_cesa.h"
#include "mv_dma.h"
+#include "dma_desclist.h"

#define MV_CESA "MV-CESA:"
#define MAX_HW_HASH_SIZE 0xFFFF
@@ -100,11 +101,6 @@ struct sec_accel_sram {
#define sa_ivo type.hash.ivo
} __attribute__((packed));

-struct u32_mempair {
- u32 *vaddr;
- dma_addr_t daddr;
-};
-
struct crypto_priv {
struct device *dev;
void __iomem *reg;
@@ -129,14 +125,14 @@ struct crypto_priv {
struct sec_accel_sram sa_sram;
dma_addr_t sa_sram_dma;

- struct dma_pool *u32_pool;
- struct u32_mempair *u32_list;
- int u32_list_len;
- int u32_usage;
+ struct dma_desclist desclist;
};

static struct crypto_priv *cpg;

+#define ITEM(x) ((u32 *)DESCLIST_ITEM(cpg->desclist, x))
+#define ITEM_DMA(x) DESCLIST_ITEM_DMA(cpg->desclist, x)
+
struct mv_ctx {
u8 aes_enc_key[AES_KEY_LEN];
u32 aes_dec_key[8];
@@ -204,52 +200,17 @@ static void mv_setup_timer(void)
jiffies + msecs_to_jiffies(MV_CESA_EXPIRE));
}

-#define U32_ITEM(x) (cpg->u32_list[x].vaddr)
-#define U32_ITEM_DMA(x) (cpg->u32_list[x].daddr)
-
-static inline int set_u32_poolsize(int nelem)
-{
- /* need to increase size first if requested */
- if (nelem > cpg->u32_list_len) {
- struct u32_mempair *newmem;
- int newsize = nelem * sizeof(struct u32_mempair);
-
- newmem = krealloc(cpg->u32_list, newsize, GFP_KERNEL);
- if (!newmem)
- return -ENOMEM;
- cpg->u32_list = newmem;
- }
-
- /* allocate/free dma descriptors, adjusting cpg->u32_list_len on the go */
- for (; cpg->u32_list_len < nelem; cpg->u32_list_len++) {
- U32_ITEM(cpg->u32_list_len) = dma_pool_alloc(cpg->u32_pool,
- GFP_KERNEL, &U32_ITEM_DMA(cpg->u32_list_len));
- if (!U32_ITEM((cpg->u32_list_len)))
- return -ENOMEM;
- }
- for (; cpg->u32_list_len > nelem; cpg->u32_list_len--)
- dma_pool_free(cpg->u32_pool, U32_ITEM(cpg->u32_list_len - 1),
- U32_ITEM_DMA(cpg->u32_list_len - 1));
-
- /* ignore size decreases but those to zero */
- if (!nelem) {
- kfree(cpg->u32_list);
- cpg->u32_list = 0;
- }
- return 0;
-}
-
static inline void mv_dma_u32_copy(dma_addr_t dst, u32 val)
{
- if (unlikely(cpg->u32_usage == cpg->u32_list_len)
- && set_u32_poolsize(cpg->u32_list_len << 1)) {
- printk(KERN_ERR MV_CESA "resizing poolsize to %d failed\n",
- cpg->u32_list_len << 1);
+ if (unlikely(DESCLIST_FULL(cpg->desclist)) &&
+ set_dma_desclist_size(&cpg->desclist, cpg->desclist.length << 1)) {
+ printk(KERN_ERR MV_CESA "resizing poolsize to %lu failed\n",
+ cpg->desclist.length << 1);
return;
}
- *(U32_ITEM(cpg->u32_usage)) = val;
- mv_dma_memcpy(dst, U32_ITEM_DMA(cpg->u32_usage), sizeof(u32));
- cpg->u32_usage++;
+ *ITEM(cpg->desclist.usage) = val;
+ mv_dma_memcpy(dst, ITEM_DMA(cpg->desclist.usage), sizeof(u32));
+ cpg->desclist.usage++;
}

static inline bool
@@ -651,7 +612,7 @@ static void dequeue_complete_req(void)
struct crypto_async_request *req = cpg->cur_req;

mv_dma_clear();
- cpg->u32_usage = 0;
+ cpg->desclist.usage = 0;

BUG_ON(cpg->eng_st != ENGINE_W_DEQUEUE);

@@ -1335,13 +1296,12 @@ static int mv_probe(struct platform_device *pdev)
cp->sa_sram_dma = dma_map_single(&pdev->dev, &cp->sa_sram,
sizeof(struct sec_accel_sram), DMA_TO_DEVICE);

- cpg->u32_pool = dma_pool_create("CESA U32 Item Pool",
- &pdev->dev, sizeof(u32), MV_DMA_ALIGN, 0);
- if (!cpg->u32_pool) {
+ if (init_dma_desclist(&cpg->desclist, &pdev->dev,
+ sizeof(u32), MV_DMA_ALIGN, 0)) {
ret = -ENOMEM;
goto err_mapping;
}
- if (set_u32_poolsize(MV_DMA_INIT_POOLSIZE)) {
+ if (set_dma_desclist_size(&cpg->desclist, MV_DMA_INIT_POOLSIZE)) {
printk(KERN_ERR MV_CESA "failed to initialise poolsize\n");
goto err_pool;
}
@@ -1350,7 +1310,7 @@ static int mv_probe(struct platform_device *pdev)
if (ret) {
printk(KERN_WARNING MV_CESA
"Could not register aes-ecb driver\n");
- goto err_poolsize;
+ goto err_pool;
}

ret = crypto_register_alg(&mv_aes_alg_cbc);
@@ -1377,10 +1337,8 @@ static int mv_probe(struct platform_device *pdev)
return 0;
err_unreg_ecb:
crypto_unregister_alg(&mv_aes_alg_ecb);
-err_poolsize:
- set_u32_poolsize(0);
err_pool:
- dma_pool_destroy(cpg->u32_pool);
+ fini_dma_desclist(&cpg->desclist);
err_mapping:
dma_unmap_single(&pdev->dev, cpg->sa_sram_dma,
sizeof(struct sec_accel_sram), DMA_TO_DEVICE);
@@ -1412,8 +1370,7 @@ static int mv_remove(struct platform_device *pdev)
free_irq(cp->irq, cp);
dma_unmap_single(&pdev->dev, cpg->sa_sram_dma,
sizeof(struct sec_accel_sram), DMA_TO_DEVICE);
- set_u32_poolsize(0);
- dma_pool_destroy(cpg->u32_pool);
+ fini_dma_desclist(&cpg->desclist);
memset(cp->sram, 0, cp->sram_size);
iounmap(cp->sram);
iounmap(cp->reg);
diff --git a/drivers/crypto/mv_dma.c b/drivers/crypto/mv_dma.c
index 24c5256..b84ff80 100644
--- a/drivers/crypto/mv_dma.c
+++ b/drivers/crypto/mv_dma.c
@@ -17,6 +17,7 @@
#include <linux/platform_device.h>

#include "mv_dma.h"
+#include "dma_desclist.h"

#define MV_DMA "MV-DMA: "

@@ -30,11 +31,6 @@ struct mv_dma_desc {
u32 next;
} __attribute__((packed));

-struct desc_mempair {
- struct mv_dma_desc *vaddr;
- dma_addr_t daddr;
-};
-
struct mv_dma_priv {
bool idma_registered, tdma_registered;
struct device *dev;
@@ -42,47 +38,12 @@ struct mv_dma_priv {
int irq;
/* protecting the dma descriptors and stuff */
spinlock_t lock;
- struct dma_pool *descpool;
- struct desc_mempair *desclist;
- int desclist_len;
- int desc_usage;
+ struct dma_desclist desclist;
u32 (*print_and_clear_irq)(void);
} tpg;

-#define DESC(x) (tpg.desclist[x].vaddr)
-#define DESC_DMA(x) (tpg.desclist[x].daddr)
-
-static inline int set_poolsize(int nelem)
-{
- /* need to increase size first if requested */
- if (nelem > tpg.desclist_len) {
- struct desc_mempair *newmem;
- int newsize = nelem * sizeof(struct desc_mempair);
-
- newmem = krealloc(tpg.desclist, newsize, GFP_KERNEL);
- if (!newmem)
- return -ENOMEM;
- tpg.desclist = newmem;
- }
-
- /* allocate/free dma descriptors, adjusting tpg.desclist_len on the go */
- for (; tpg.desclist_len < nelem; tpg.desclist_len++) {
- DESC(tpg.desclist_len) = dma_pool_alloc(tpg.descpool,
- GFP_KERNEL, &DESC_DMA(tpg.desclist_len));
- if (!DESC((tpg.desclist_len)))
- return -ENOMEM;
- }
- for (; tpg.desclist_len > nelem; tpg.desclist_len--)
- dma_pool_free(tpg.descpool, DESC(tpg.desclist_len - 1),
- DESC_DMA(tpg.desclist_len - 1));
-
- /* ignore size decreases but those to zero */
- if (!nelem) {
- kfree(tpg.desclist);
- tpg.desclist = 0;
- }
- return 0;
-}
+#define ITEM(x) ((struct mv_dma_desc *)DESCLIST_ITEM(tpg.desclist, x))
+#define ITEM_DMA(x) DESCLIST_ITEM_DMA(tpg.desclist, x)

static inline void wait_for_dma_idle(void)
{
@@ -102,17 +63,18 @@ static inline void switch_dma_engine(bool state)

static struct mv_dma_desc *get_new_last_desc(void)
{
- if (unlikely(tpg.desc_usage == tpg.desclist_len) &&
- set_poolsize(tpg.desclist_len << 1)) {
- printk(KERN_ERR MV_DMA "failed to increase DMA pool to %d\n",
- tpg.desclist_len << 1);
+ if (unlikely(DESCLIST_FULL(tpg.desclist)) &&
+ set_dma_desclist_size(&tpg.desclist, tpg.desclist.length << 1)) {
+ printk(KERN_ERR MV_DMA "failed to increase DMA pool to %lu\n",
+ tpg.desclist.length << 1);
return NULL;
}

- if (likely(tpg.desc_usage))
- DESC(tpg.desc_usage - 1)->next = DESC_DMA(tpg.desc_usage);
+ if (likely(tpg.desclist.usage))
+ ITEM(tpg.desclist.usage - 1)->next =
+ ITEM_DMA(tpg.desclist.usage);

- return DESC(tpg.desc_usage++);
+ return ITEM(tpg.desclist.usage++);
}

static inline void mv_dma_desc_dump(void)
@@ -120,17 +82,17 @@ static inline void mv_dma_desc_dump(void)
struct mv_dma_desc *tmp;
int i;

- if (!tpg.desc_usage) {
+ if (!tpg.desclist.usage) {
printk(KERN_WARNING MV_DMA "DMA descriptor list is empty\n");
return;
}

printk(KERN_WARNING MV_DMA "DMA descriptor list:\n");
- for (i = 0; i < tpg.desc_usage; i++) {
- tmp = DESC(i);
+ for (i = 0; i < tpg.desclist.usage; i++) {
+ tmp = ITEM(i);
printk(KERN_WARNING MV_DMA "entry %d at 0x%x: dma addr 0x%x, "
"src 0x%x, dst 0x%x, count %u, own %d, next 0x%x", i,
- (u32)tmp, DESC_DMA(i) , tmp->src, tmp->dst,
+ (u32)tmp, ITEM_DMA(i) , tmp->src, tmp->dst,
tmp->count & DMA_BYTE_COUNT_MASK, !!(tmp->count & DMA_OWN_BIT),
tmp->next);
}
@@ -176,7 +138,7 @@ void mv_dma_clear(void)
/* clear descriptor registers */
mv_dma_clear_desc_reg();

- tpg.desc_usage = 0;
+ tpg.desclist.usage = 0;

switch_dma_engine(1);

@@ -192,7 +154,7 @@ void mv_dma_trigger(void)

spin_lock(&tpg.lock);

- writel(DESC_DMA(0), tpg.reg + DMA_NEXT_DESC);
+ writel(ITEM_DMA(0), tpg.reg + DMA_NEXT_DESC);

spin_unlock(&tpg.lock);
}
@@ -331,13 +293,15 @@ static int mv_init_engine(struct platform_device *pdev,
}

/* initialise DMA descriptor list */
- tpg.descpool = dma_pool_create("MV_DMA Descriptor Pool", tpg.dev,
- sizeof(struct mv_dma_desc), MV_DMA_ALIGN, 0);
- if (!tpg.descpool) {
+ if (init_dma_desclist(&tpg.desclist, tpg.dev,
+ sizeof(struct mv_dma_desc), MV_DMA_ALIGN, 0)) {
rc = -ENOMEM;
goto out_free_irq;
}
- set_poolsize(MV_DMA_INIT_POOLSIZE);
+ if (set_dma_desclist_size(&tpg.desclist, MV_DMA_INIT_POOLSIZE)) {
+ rc = -ENOMEM;
+ goto out_free_desclist;
+ }

platform_set_drvdata(pdev, &tpg);

@@ -364,8 +328,8 @@ static int mv_init_engine(struct platform_device *pdev,
out_free_all:
switch_dma_engine(0);
platform_set_drvdata(pdev, NULL);
- set_poolsize(0);
- dma_pool_destroy(tpg.descpool);
+out_free_desclist:
+ fini_dma_desclist(&tpg.desclist);
out_free_irq:
free_irq(tpg.irq, &tpg);
out_unmap_reg:
@@ -378,8 +342,7 @@ static int mv_remove(struct platform_device *pdev)
{
switch_dma_engine(0);
platform_set_drvdata(pdev, NULL);
- set_poolsize(0);
- dma_pool_destroy(tpg.descpool);
+ fini_dma_desclist(&tpg.desclist);
free_irq(tpg.irq, &tpg);
iounmap(tpg.reg);
tpg.dev = NULL;
--
1.7.3.4
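
A hypothetical consumer of these helpers could look like the following
sketch (kernel context assumed; the grow-by-doubling pattern mirrors
both users above):

#include <linux/device.h>
#include <linux/dmapool.h>
#include <linux/slab.h>

#include "dma_desclist.h"

static struct dma_desclist dl;

static int example_init(struct device *dev)
{
    /* pool of u32-sized, 16-byte aligned items, 16 entries to start */
    if (init_dma_desclist(&dl, dev, sizeof(u32), 16, 0))
        return -ENOMEM;
    if (set_dma_desclist_size(&dl, 16)) {
        fini_dma_desclist(&dl);
        return -ENOMEM;
    }
    return 0;
}

static int example_queue_word(u32 val)
{
    /* grow on demand, doubling just like mv_cesa.c/mv_dma.c above */
    if (DESCLIST_FULL(dl) &&
        set_dma_desclist_size(&dl, dl.length << 1))
        return -ENOMEM;

    *(u32 *)DESCLIST_ITEM(dl, dl.usage) = val;
    /* DESCLIST_ITEM_DMA(dl, dl.usage) is what the hardware would get */
    dl.usage++;
    return 0;
}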

2012-06-12 17:17:57

by Phil Sutter

Subject: [PATCH 02/13] mv_cesa: minor formatting cleanup, will all make sense soon

This is just to keep formatting changes out of the following commit,
hopefully simplifying it a bit.

Signed-off-by: Phil Sutter <[email protected]>
---
drivers/crypto/mv_cesa.c | 14 ++++++--------
1 files changed, 6 insertions(+), 8 deletions(-)

diff --git a/drivers/crypto/mv_cesa.c b/drivers/crypto/mv_cesa.c
index 818a5c7..59c2ed2 100644
--- a/drivers/crypto/mv_cesa.c
+++ b/drivers/crypto/mv_cesa.c
@@ -269,12 +269,10 @@ static void mv_process_current_q(int first_block)
}
if (req_ctx->decrypt) {
op.config |= CFG_DIR_DEC;
- memcpy(cpg->sram + SRAM_DATA_KEY_P, ctx->aes_dec_key,
- AES_KEY_LEN);
+ memcpy(cpg->sram + SRAM_DATA_KEY_P, ctx->aes_dec_key, AES_KEY_LEN);
} else {
op.config |= CFG_DIR_ENC;
- memcpy(cpg->sram + SRAM_DATA_KEY_P, ctx->aes_enc_key,
- AES_KEY_LEN);
+ memcpy(cpg->sram + SRAM_DATA_KEY_P, ctx->aes_enc_key, AES_KEY_LEN);
}

switch (ctx->key_len) {
@@ -335,9 +333,8 @@ static void mv_process_hash_current(int first_block)
}

op.mac_src_p =
- MAC_SRC_DATA_P(SRAM_DATA_IN_START) | MAC_SRC_TOTAL_LEN((u32)
- req_ctx->
- count);
+ MAC_SRC_DATA_P(SRAM_DATA_IN_START) |
+ MAC_SRC_TOTAL_LEN((u32)req_ctx->count);

setup_data_in();

@@ -372,7 +369,8 @@ static void mv_process_hash_current(int first_block)
}
}

- memcpy(cpg->sram + SRAM_CONFIG, &op, sizeof(struct sec_accel_config));
+ memcpy(cpg->sram + SRAM_CONFIG, &op,
+ sizeof(struct sec_accel_config));

/* GO */
mv_setup_timer();
--
1.7.3.4

2012-06-12 17:17:57

by Phil Sutter

Subject: RFC: support for MV_CESA with IDMA or TDMA

Hi,

The following patch series adds support for the TDMA engine built into
Marvell's Kirkwood-based SoCs, as well as the IDMA engine built into
Marvell's Orion-based SoCs, and enhances mv_cesa.c to use them for
speeding up crypto operations. The hardware contains a security
accelerator, which can control DMA as well as crypto engines. It allows
for operation with minimal software intervention, which the following
patches implement: using a chain of DMA descriptors, data input,
configuration, engine startup and data output repeat fully automatically
until the whole input data has been handled.
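
Sketched, the chain the DMA hardware walks for each chunk looks like
this (OWN is the descriptor ownership bit, see patch 05/13; this is an
illustration, not literal driver output):

/*
 *   [ copy: source data   -> SRAM data-in ]   OWN=1
 *   [ copy: configuration -> SRAM config  ]   OWN=1
 *   [ all-zero separator                  ]   OWN=0  <- crypto engine
 *   [ copy: SRAM data-out -> destination  ]   OWN=1     takes over here
 *       ... next chunk, same pattern ...
 */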

The reason this is an RFC is that I lack the hardware to test the IDMA
support. I'd highly appreciate it if someone with Orion hardware could
test this, preferably using the hmac_comp tool shipped with
cryptodev-linux, as it does more extensive testing (with bigger buffer
sizes, at least) than tcrypt or the standard kernel-internal use cases.

Greetings, Phil

2012-06-12 17:17:58

by Phil Sutter

Subject: [PATCH 05/13] add a driver for the Marvell IDMA/TDMA engines

These are DMA engines integrated into the Marvell Orion/Kirkwood SoCs,
designed to offload data transfers from/to the CESA crypto engine.
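
To make the descriptor format concrete, here is a small user-space
model of struct mv_dma_desc from this patch (the addresses are dummies;
the real driver allocates descriptors from a dma_pool and links them by
physical address):

#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define DMA_OWN_BIT (1u << 31)

/* same layout as struct mv_dma_desc in the patch */
struct desc {
    uint32_t count;  /* byte count, bit 31 = ownership */
    uint32_t src;    /* physical source address        */
    uint32_t dst;    /* physical destination address   */
    uint32_t next;   /* physical address of next desc  */
};

int main(void)
{
    struct desc chain[3];
    memset(chain, 0, sizeof(chain));

    chain[0] = (struct desc){ 64 | DMA_OWN_BIT, 0x1000, 0x2000, 0x10 };
    chain[1] = (struct desc){ 32 | DMA_OWN_BIT, 0x3000, 0x2040, 0x20 };
    /* chain[2] stays all-zero: OWN clear, the "separator" at which the
     * DMA engine hands off to the crypto engine */

    for (int i = 0; i < 3; i++)
        printf("desc %d: count=%u own=%d src=%#x dst=%#x next=%#x\n", i,
               (unsigned)(chain[i].count & ~DMA_OWN_BIT),
               !!(chain[i].count & DMA_OWN_BIT),
               (unsigned)chain[i].src, (unsigned)chain[i].dst,
               (unsigned)chain[i].next);
    return 0;
}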

Signed-off-by: Phil Sutter <[email protected]>
---
arch/arm/mach-kirkwood/common.c | 33 ++
arch/arm/mach-kirkwood/include/mach/irqs.h | 1 +
arch/arm/mach-orion5x/common.c | 33 ++
arch/arm/mach-orion5x/include/mach/orion5x.h | 2 +
drivers/crypto/Kconfig | 5 +
drivers/crypto/Makefile | 3 +-
drivers/crypto/mv_dma.c | 464 ++++++++++++++++++++++++++
drivers/crypto/mv_dma.h | 127 +++++++
8 files changed, 667 insertions(+), 1 deletions(-)
create mode 100644 drivers/crypto/mv_dma.c
create mode 100644 drivers/crypto/mv_dma.h

diff --git a/arch/arm/mach-kirkwood/common.c b/arch/arm/mach-kirkwood/common.c
index 25fb3fd..dcd1327 100644
--- a/arch/arm/mach-kirkwood/common.c
+++ b/arch/arm/mach-kirkwood/common.c
@@ -426,8 +426,41 @@ void __init kirkwood_uart1_init(void)
/*****************************************************************************
* Cryptographic Engines and Security Accelerator (CESA)
****************************************************************************/
+static struct resource kirkwood_tdma_res[] = {
+ {
+ .name = "regs deco",
+ .start = CRYPTO_PHYS_BASE + 0xA00,
+ .end = CRYPTO_PHYS_BASE + 0xA24,
+ .flags = IORESOURCE_MEM,
+ }, {
+ .name = "regs control and error",
+ .start = CRYPTO_PHYS_BASE + 0x800,
+ .end = CRYPTO_PHYS_BASE + 0x8CF,
+ .flags = IORESOURCE_MEM,
+ }, {
+ .name = "crypto error",
+ .start = IRQ_KIRKWOOD_TDMA_ERR,
+ .end = IRQ_KIRKWOOD_TDMA_ERR,
+ .flags = IORESOURCE_IRQ,
+ },
+};
+
+static u64 mv_tdma_dma_mask = DMA_BIT_MASK(32);
+
+static struct platform_device kirkwood_tdma_device = {
+ .name = "mv_tdma",
+ .id = -1,
+ .dev = {
+ .dma_mask = &mv_tdma_dma_mask,
+ .coherent_dma_mask = DMA_BIT_MASK(32),
+ },
+ .num_resources = ARRAY_SIZE(kirkwood_tdma_res),
+ .resource = kirkwood_tdma_res,
+};
+
void __init kirkwood_crypto_init(void)
{
+ platform_device_register(&kirkwood_tdma_device);
orion_crypto_init(CRYPTO_PHYS_BASE, KIRKWOOD_SRAM_PHYS_BASE,
KIRKWOOD_SRAM_SIZE, IRQ_KIRKWOOD_CRYPTO);
}
diff --git a/arch/arm/mach-kirkwood/include/mach/irqs.h b/arch/arm/mach-kirkwood/include/mach/irqs.h
index 2bf8161..a66aa3f 100644
--- a/arch/arm/mach-kirkwood/include/mach/irqs.h
+++ b/arch/arm/mach-kirkwood/include/mach/irqs.h
@@ -51,6 +51,7 @@
#define IRQ_KIRKWOOD_GPIO_HIGH_16_23 41
#define IRQ_KIRKWOOD_GE00_ERR 46
#define IRQ_KIRKWOOD_GE01_ERR 47
+#define IRQ_KIRKWOOD_TDMA_ERR 49
#define IRQ_KIRKWOOD_RTC 53

/*
diff --git a/arch/arm/mach-orion5x/common.c b/arch/arm/mach-orion5x/common.c
index 9148b22..553ccf2 100644
--- a/arch/arm/mach-orion5x/common.c
+++ b/arch/arm/mach-orion5x/common.c
@@ -181,9 +181,42 @@ void __init orion5x_xor_init(void)
/*****************************************************************************
* Cryptographic Engines and Security Accelerator (CESA)
****************************************************************************/
+static struct resource orion_idma_res[] = {
+ {
+ .name = "regs deco",
+ .start = ORION5X_IDMA_PHYS_BASE + 0xA00,
+ .end = ORION5X_IDMA_PHYS_BASE + 0xA24,
+ .flags = IORESOURCE_MEM,
+ }, {
+ .name = "regs control and error",
+ .start = ORION5X_IDMA_PHYS_BASE + 0x800,
+ .end = ORION5X_IDMA_PHYS_BASE + 0x8CF,
+ .flags = IORESOURCE_MEM,
+ }, {
+ .name = "crypto error",
+ .start = IRQ_ORION5X_IDMA_ERR,
+ .end = IRQ_ORION5X_IDMA_ERR,
+ .flags = IORESOURCE_IRQ,
+ },
+};
+
+static u64 mv_idma_dma_mask = DMA_BIT_MASK(32);
+
+static struct platform_device orion_idma_device = {
+ .name = "mv_idma",
+ .id = -1,
+ .dev = {
+ .dma_mask = &mv_idma_dma_mask,
+ .coherent_dma_mask = DMA_BIT_MASK(32),
+ },
+ .num_resources = ARRAY_SIZE(orion_idma_res),
+ .resource = orion_idma_res,
+};
+
static void __init orion5x_crypto_init(void)
{
orion5x_setup_sram_win();
+ platform_device_register(&orion_idma_device);
orion_crypto_init(ORION5X_CRYPTO_PHYS_BASE, ORION5X_SRAM_PHYS_BASE,
SZ_8K, IRQ_ORION5X_CESA);
}
diff --git a/arch/arm/mach-orion5x/include/mach/orion5x.h b/arch/arm/mach-orion5x/include/mach/orion5x.h
index 2745f5d..a31ac88 100644
--- a/arch/arm/mach-orion5x/include/mach/orion5x.h
+++ b/arch/arm/mach-orion5x/include/mach/orion5x.h
@@ -90,6 +90,8 @@
#define ORION5X_USB0_PHYS_BASE (ORION5X_REGS_PHYS_BASE | 0x50000)
#define ORION5X_USB0_VIRT_BASE (ORION5X_REGS_VIRT_BASE | 0x50000)

+#define ORION5X_IDMA_PHYS_BASE (ORION5X_REGS_PHYS_BASE | 0x60000)
+
#define ORION5X_XOR_PHYS_BASE (ORION5X_REGS_PHYS_BASE | 0x60900)
#define ORION5X_XOR_VIRT_BASE (ORION5X_REGS_VIRT_BASE | 0x60900)

diff --git a/drivers/crypto/Kconfig b/drivers/crypto/Kconfig
index 1092a77..3709f38 100644
--- a/drivers/crypto/Kconfig
+++ b/drivers/crypto/Kconfig
@@ -159,6 +159,10 @@ config CRYPTO_GHASH_S390

It is available as of z196.

+config CRYPTO_DEV_MV_DMA
+ tristate
+ default no
+
config CRYPTO_DEV_MV_CESA
tristate "Marvell's Cryptographic Engine"
depends on PLAT_ORION
@@ -166,6 +170,7 @@ config CRYPTO_DEV_MV_CESA
select CRYPTO_AES
select CRYPTO_BLKCIPHER2
select CRYPTO_HASH
+ select CRYPTO_DEV_MV_DMA
help
This driver allows you to utilize the Cryptographic Engines and
Security Accelerator (CESA) which can be found on the Marvell Orion
diff --git a/drivers/crypto/Makefile b/drivers/crypto/Makefile
index 0139032..cb655ad 100644
--- a/drivers/crypto/Makefile
+++ b/drivers/crypto/Makefile
@@ -4,6 +4,7 @@ obj-$(CONFIG_CRYPTO_DEV_GEODE) += geode-aes.o
obj-$(CONFIG_CRYPTO_DEV_NIAGARA2) += n2_crypto.o
n2_crypto-y := n2_core.o n2_asm.o
obj-$(CONFIG_CRYPTO_DEV_HIFN_795X) += hifn_795x.o
+obj-$(CONFIG_CRYPTO_DEV_MV_DMA) += mv_dma.o
obj-$(CONFIG_CRYPTO_DEV_MV_CESA) += mv_cesa.o
obj-$(CONFIG_CRYPTO_DEV_TALITOS) += talitos.o
obj-$(CONFIG_CRYPTO_DEV_FSL_CAAM) += caam/
@@ -14,4 +15,4 @@ obj-$(CONFIG_CRYPTO_DEV_OMAP_AES) += omap-aes.o
obj-$(CONFIG_CRYPTO_DEV_PICOXCELL) += picoxcell_crypto.o
obj-$(CONFIG_CRYPTO_DEV_S5P) += s5p-sss.o
obj-$(CONFIG_CRYPTO_DEV_TEGRA_AES) += tegra-aes.o
-obj-$(CONFIG_CRYPTO_DEV_UX500) += ux500/
\ No newline at end of file
+obj-$(CONFIG_CRYPTO_DEV_UX500) += ux500/
diff --git a/drivers/crypto/mv_dma.c b/drivers/crypto/mv_dma.c
new file mode 100644
index 0000000..24c5256
--- /dev/null
+++ b/drivers/crypto/mv_dma.c
@@ -0,0 +1,464 @@
+/*
+ * Support for Marvell's IDMA/TDMA engines found on Orion/Kirkwood chips,
+ * used exclusively by the CESA crypto accelerator.
+ *
+ * Based on unpublished code for IDMA written by Sebastian Siewior.
+ *
+ * Copyright (C) 2012 Phil Sutter <[email protected]>
+ * License: GPLv2
+ */
+
+#include <linux/delay.h>
+#include <linux/dma-mapping.h>
+#include <linux/dmapool.h>
+#include <linux/interrupt.h>
+#include <linux/module.h>
+#include <linux/slab.h>
+#include <linux/platform_device.h>
+
+#include "mv_dma.h"
+
+#define MV_DMA "MV-DMA: "
+
+#define MV_DMA_INIT_POOLSIZE 16
+#define MV_DMA_ALIGN 16
+
+struct mv_dma_desc {
+ u32 count;
+ u32 src;
+ u32 dst;
+ u32 next;
+} __attribute__((packed));
+
+struct desc_mempair {
+ struct mv_dma_desc *vaddr;
+ dma_addr_t daddr;
+};
+
+struct mv_dma_priv {
+ bool idma_registered, tdma_registered;
+ struct device *dev;
+ void __iomem *reg;
+ int irq;
+ /* protecting the dma descriptors and stuff */
+ spinlock_t lock;
+ struct dma_pool *descpool;
+ struct desc_mempair *desclist;
+ int desclist_len;
+ int desc_usage;
+ u32 (*print_and_clear_irq)(void);
+} tpg;
+
+#define DESC(x) (tpg.desclist[x].vaddr)
+#define DESC_DMA(x) (tpg.desclist[x].daddr)
+
+static inline int set_poolsize(int nelem)
+{
+ /* need to increase size first if requested */
+ if (nelem > tpg.desclist_len) {
+ struct desc_mempair *newmem;
+ int newsize = nelem * sizeof(struct desc_mempair);
+
+ newmem = krealloc(tpg.desclist, newsize, GFP_KERNEL);
+ if (!newmem)
+ return -ENOMEM;
+ tpg.desclist = newmem;
+ }
+
+ /* allocate/free dma descriptors, adjusting tpg.desclist_len on the go */
+ for (; tpg.desclist_len < nelem; tpg.desclist_len++) {
+ DESC(tpg.desclist_len) = dma_pool_alloc(tpg.descpool,
+ GFP_KERNEL, &DESC_DMA(tpg.desclist_len));
+ if (!DESC((tpg.desclist_len)))
+ return -ENOMEM;
+ }
+ for (; tpg.desclist_len > nelem; tpg.desclist_len--)
+ dma_pool_free(tpg.descpool, DESC(tpg.desclist_len - 1),
+ DESC_DMA(tpg.desclist_len - 1));
+
+ /* ignore size decreases but those to zero */
+ if (!nelem) {
+ kfree(tpg.desclist);
+ tpg.desclist = 0;
+ }
+ return 0;
+}
+
+static inline void wait_for_dma_idle(void)
+{
+ while (readl(tpg.reg + DMA_CTRL) & DMA_CTRL_ACTIVE)
+ mdelay(100);
+}
+
+static inline void switch_dma_engine(bool state)
+{
+ u32 val = readl(tpg.reg + DMA_CTRL);
+
+ val |= ( state * DMA_CTRL_ENABLE);
+ val &= ~(!state * DMA_CTRL_ENABLE);
+
+ writel(val, tpg.reg + DMA_CTRL);
+}
+
+static struct mv_dma_desc *get_new_last_desc(void)
+{
+ if (unlikely(tpg.desc_usage == tpg.desclist_len) &&
+ set_poolsize(tpg.desclist_len << 1)) {
+ printk(KERN_ERR MV_DMA "failed to increase DMA pool to %d\n",
+ tpg.desclist_len << 1);
+ return NULL;
+ }
+
+ if (likely(tpg.desc_usage))
+ DESC(tpg.desc_usage - 1)->next = DESC_DMA(tpg.desc_usage);
+
+ return DESC(tpg.desc_usage++);
+}
+
+static inline void mv_dma_desc_dump(void)
+{
+ struct mv_dma_desc *tmp;
+ int i;
+
+ if (!tpg.desc_usage) {
+ printk(KERN_WARNING MV_DMA "DMA descriptor list is empty\n");
+ return;
+ }
+
+ printk(KERN_WARNING MV_DMA "DMA descriptor list:\n");
+ for (i = 0; i < tpg.desc_usage; i++) {
+ tmp = DESC(i);
+ printk(KERN_WARNING MV_DMA "entry %d at 0x%x: dma addr 0x%x, "
+ "src 0x%x, dst 0x%x, count %u, own %d, next 0x%x", i,
+ (u32)tmp, DESC_DMA(i) , tmp->src, tmp->dst,
+ tmp->count & DMA_BYTE_COUNT_MASK, !!(tmp->count & DMA_OWN_BIT),
+ tmp->next);
+ }
+}
+
+static inline void mv_dma_reg_dump(void)
+{
+#define PRINTREG(offset) \
+ printk(KERN_WARNING MV_DMA "tpg.reg + " #offset " = 0x%x\n", \
+ readl(tpg.reg + offset))
+
+ PRINTREG(DMA_CTRL);
+ PRINTREG(DMA_BYTE_COUNT);
+ PRINTREG(DMA_SRC_ADDR);
+ PRINTREG(DMA_DST_ADDR);
+ PRINTREG(DMA_NEXT_DESC);
+ PRINTREG(DMA_CURR_DESC);
+
+#undef PRINTREG
+}
+
+static inline void mv_dma_clear_desc_reg(void)
+{
+ writel(0, tpg.reg + DMA_BYTE_COUNT);
+ writel(0, tpg.reg + DMA_SRC_ADDR);
+ writel(0, tpg.reg + DMA_DST_ADDR);
+ writel(0, tpg.reg + DMA_CURR_DESC);
+ writel(0, tpg.reg + DMA_NEXT_DESC);
+}
+
+void mv_dma_clear(void)
+{
+ if (!tpg.dev)
+ return;
+
+ spin_lock(&tpg.lock);
+
+ /* make sure engine is idle */
+ wait_for_dma_idle();
+ switch_dma_engine(0);
+ wait_for_dma_idle();
+
+ /* clear descriptor registers */
+ mv_dma_clear_desc_reg();
+
+ tpg.desc_usage = 0;
+
+ switch_dma_engine(1);
+
+ /* finally free system lock again */
+ spin_unlock(&tpg.lock);
+}
+EXPORT_SYMBOL_GPL(mv_dma_clear);
+
+void mv_dma_trigger(void)
+{
+ if (!tpg.dev)
+ return;
+
+ spin_lock(&tpg.lock);
+
+ writel(DESC_DMA(0), tpg.reg + DMA_NEXT_DESC);
+
+ spin_unlock(&tpg.lock);
+}
+EXPORT_SYMBOL_GPL(mv_dma_trigger);
+
+void mv_dma_separator(void)
+{
+ struct mv_dma_desc *tmp;
+
+ if (!tpg.dev)
+ return;
+
+ spin_lock(&tpg.lock);
+
+ tmp = get_new_last_desc();
+ memset(tmp, 0, sizeof(*tmp));
+
+ spin_unlock(&tpg.lock);
+}
+EXPORT_SYMBOL_GPL(mv_dma_separator);
+
+void mv_dma_memcpy(dma_addr_t dst, dma_addr_t src, unsigned int size)
+{
+ struct mv_dma_desc *tmp;
+
+ if (!tpg.dev)
+ return;
+
+ spin_lock(&tpg.lock);
+
+ tmp = get_new_last_desc();
+ tmp->count = size | DMA_OWN_BIT;
+ tmp->src = src;
+ tmp->dst = dst;
+ tmp->next = 0;
+
+ spin_unlock(&tpg.lock);
+}
+EXPORT_SYMBOL_GPL(mv_dma_memcpy);
+
+static u32 idma_print_and_clear_irq(void)
+{
+ u32 val, val2, addr;
+
+ val = readl(tpg.reg + IDMA_INT_CAUSE);
+ val2 = readl(tpg.reg + IDMA_ERR_SELECT);
+ addr = readl(tpg.reg + IDMA_ERR_ADDR);
+
+ if (val & IDMA_INT_MISS(0))
+ printk(KERN_ERR MV_DMA "%s: address miss @%x!\n",
+ __func__, val2 & IDMA_INT_MISS(0) ? addr : 0);
+ if (val & IDMA_INT_APROT(0))
+ printk(KERN_ERR MV_DMA "%s: access protection @%x!\n",
+ __func__, val2 & IDMA_INT_APROT(0) ? addr : 0);
+ if (val & IDMA_INT_WPROT(0))
+ printk(KERN_ERR MV_DMA "%s: write protection @%x!\n",
+ __func__, val2 & IDMA_INT_WPROT(0) ? addr : 0);
+
+ /* clear interrupt cause register */
+ writel(0, tpg.reg + IDMA_INT_CAUSE);
+
+ return val;
+}
+
+static u32 tdma_print_and_clear_irq(void)
+{
+ u32 val;
+
+ val = readl(tpg.reg + TDMA_ERR_CAUSE);
+
+ if (val & TDMA_INT_MISS)
+ printk(KERN_ERR MV_DMA "%s: miss!\n", __func__);
+ if (val & TDMA_INT_DOUBLE_HIT)
+ printk(KERN_ERR MV_DMA "%s: double hit!\n", __func__);
+ if (val & TDMA_INT_BOTH_HIT)
+ printk(KERN_ERR MV_DMA "%s: both hit!\n", __func__);
+ if (val & TDMA_INT_DATA_ERROR)
+ printk(KERN_ERR MV_DMA "%s: data error!\n", __func__);
+
+ /* clear error cause register */
+ writel(0, tpg.reg + TDMA_ERR_CAUSE);
+
+ return val;
+}
+
+irqreturn_t mv_dma_int(int irq, void *priv)
+{
+ int handled;
+
+ handled = (*tpg.print_and_clear_irq)();
+
+ if (handled) {
+ mv_dma_reg_dump();
+ mv_dma_desc_dump();
+ }
+
+ switch_dma_engine(0);
+ wait_for_dma_idle();
+
+ /* clear descriptor registers */
+ mv_dma_clear_desc_reg();
+
+ switch_dma_engine(1);
+ wait_for_dma_idle();
+
+ return (handled ? IRQ_HANDLED : IRQ_NONE);
+}
+
+/* initialise the global tpg structure */
+static int mv_init_engine(struct platform_device *pdev,
+ u32 ctrl_init_val, u32 (*print_and_clear_irq)(void))
+{
+ struct resource *res;
+ int rc;
+
+ if (tpg.dev) {
+ printk(KERN_ERR MV_DMA "second DMA device?!\n");
+ return -ENXIO;
+ }
+ tpg.dev = &pdev->dev;
+ tpg.print_and_clear_irq = print_and_clear_irq;
+
+ /* get register start address */
+ res = platform_get_resource_byname(pdev,
+ IORESOURCE_MEM, "regs control and error");
+ if (!res)
+ return -ENXIO;
+ if (!(tpg.reg = ioremap(res->start, resource_size(res))))
+ return -ENOMEM;
+
+ /* get the IRQ */
+ tpg.irq = platform_get_irq(pdev, 0);
+ if (tpg.irq < 0 || tpg.irq == NO_IRQ) {
+ rc = -ENXIO;
+ goto out_unmap_reg;
+ }
+
+ /* initialise DMA descriptor list */
+ tpg.descpool = dma_pool_create("MV_DMA Descriptor Pool", tpg.dev,
+ sizeof(struct mv_dma_desc), MV_DMA_ALIGN, 0);
+ if (!tpg.descpool) {
+ rc = -ENOMEM;
+ goto out_free_irq;
+ }
+ set_poolsize(MV_DMA_INIT_POOLSIZE);
+
+ platform_set_drvdata(pdev, &tpg);
+
+ spin_lock_init(&tpg.lock);
+
+ switch_dma_engine(0);
+ wait_for_dma_idle();
+
+ /* clear descriptor registers */
+ mv_dma_clear_desc_reg();
+
+ /* initialize control register (also enables engine) */
+ writel(ctrl_init_val, tpg.reg + DMA_CTRL);
+ wait_for_dma_idle();
+
+ if (request_irq(tpg.irq, mv_dma_int, IRQF_DISABLED,
+ dev_name(tpg.dev), &tpg)) {
+ rc = -ENXIO;
+ goto out_free_all;
+ }
+
+ return 0;
+
+out_free_all:
+ switch_dma_engine(0);
+ platform_set_drvdata(pdev, NULL);
+ set_poolsize(0);
+ dma_pool_destroy(tpg.descpool);
+out_free_irq:
+ free_irq(tpg.irq, &tpg);
+out_unmap_reg:
+ iounmap(tpg.reg);
+ tpg.dev = NULL;
+ return rc;
+}
+
+static int mv_remove(struct platform_device *pdev)
+{
+ switch_dma_engine(0);
+ platform_set_drvdata(pdev, NULL);
+ set_poolsize(0);
+ dma_pool_destroy(tpg.descpool);
+ free_irq(tpg.irq, &tpg);
+ iounmap(tpg.reg);
+ tpg.dev = NULL;
+ return 0;
+}
+
+static int mv_probe_tdma(struct platform_device *pdev)
+{
+ int rc;
+
+ rc = mv_init_engine(pdev, TDMA_CTRL_INIT_VALUE,
+ &tdma_print_and_clear_irq);
+ if (rc)
+ return rc;
+
+ /* have an ear for occurring errors */
+ writel(TDMA_INT_ALL, tpg.reg + TDMA_ERR_MASK);
+ writel(0, tpg.reg + TDMA_ERR_CAUSE);
+
+ printk(KERN_INFO MV_DMA
+ "TDMA engine up and running, IRQ %d\n", tpg.irq);
+ return 0;
+}
+
+static int mv_probe_idma(struct platform_device *pdev)
+{
+ int rc;
+
+ rc = mv_init_engine(pdev, IDMA_CTRL_INIT_VALUE,
+ &idma_print_and_clear_irq);
+ if (rc)
+ return rc;
+
+ /* have an ear for occurring errors */
+ writel(IDMA_INT_MISS(0) | IDMA_INT_APROT(0) | IDMA_INT_WPROT(0),
+ tpg.reg + IDMA_INT_MASK);
+ writel(0, tpg.reg + IDMA_INT_CAUSE);
+
+ printk(KERN_INFO MV_DMA
+ "IDMA engine up and running, IRQ %d\n", tpg.irq);
+ return 0;
+}
+
+static struct platform_driver marvell_tdma = {
+ .probe = mv_probe_tdma,
+ .remove = mv_remove,
+ .driver = {
+ .owner = THIS_MODULE,
+ .name = "mv_tdma",
+ },
+}, marvell_idma = {
+ .probe = mv_probe_idma,
+ .remove = mv_remove,
+ .driver = {
+ .owner = THIS_MODULE,
+ .name = "mv_idma",
+ },
+};
+MODULE_ALIAS("platform:mv_tdma");
+MODULE_ALIAS("platform:mv_idma");
+
+static int __init mv_dma_init(void)
+{
+ tpg.tdma_registered = !platform_driver_register(&marvell_tdma);
+ tpg.idma_registered = !platform_driver_register(&marvell_idma);
+ return !(tpg.tdma_registered || tpg.idma_registered);
+}
+module_init(mv_dma_init);
+
+static void __exit mv_dma_exit(void)
+{
+ if (tpg.tdma_registered)
+ platform_driver_unregister(&marvell_tdma);
+ if (tpg.idma_registered)
+ platform_driver_unregister(&marvell_idma);
+}
+module_exit(mv_dma_exit);
+
+MODULE_AUTHOR("Phil Sutter <[email protected]>");
+MODULE_DESCRIPTION("Support for Marvell's IDMA/TDMA engines");
+MODULE_LICENSE("GPL");
+
diff --git a/drivers/crypto/mv_dma.h b/drivers/crypto/mv_dma.h
new file mode 100644
index 0000000..d0c9d0c
--- /dev/null
+++ b/drivers/crypto/mv_dma.h
@@ -0,0 +1,127 @@
+#ifndef _MV_DMA_H
+#define _MV_DMA_H
+
+/* common TDMA_CTRL/IDMA_CTRL_LOW bits */
+#define DMA_CTRL_DST_BURST(x) (x)
+#define DMA_CTRL_SRC_BURST(x) (x << 6)
+#define DMA_CTRL_NO_CHAIN_MODE (1 << 9)
+#define DMA_CTRL_ENABLE (1 << 12)
+#define DMA_CTRL_FETCH_ND (1 << 13)
+#define DMA_CTRL_ACTIVE (1 << 14)
+
+/* TDMA_CTRL register bits */
+#define TDMA_CTRL_DST_BURST_32 DMA_CTRL_DST_BURST(3)
+#define TDMA_CTRL_DST_BURST_128 DMA_CTRL_DST_BURST(4)
+#define TDMA_CTRL_OUTST_RD_EN (1 << 4)
+#define TDMA_CTRL_SRC_BURST_32 DMA_CTRL_SRC_BURST(3)
+#define TDMA_CTRL_SRC_BURST_128 DMA_CTRL_SRC_BURST(4)
+#define TDMA_CTRL_NO_BYTE_SWAP (1 << 11)
+
+#define TDMA_CTRL_INIT_VALUE ( \
+ TDMA_CTRL_DST_BURST_128 | TDMA_CTRL_SRC_BURST_128 | \
+ TDMA_CTRL_NO_BYTE_SWAP | DMA_CTRL_ENABLE \
+)
+
+/* IDMA_CTRL_LOW register bits */
+#define IDMA_CTRL_DST_BURST_8 DMA_CTRL_DST_BURST(0)
+#define IDMA_CTRL_DST_BURST_16 DMA_CTRL_DST_BURST(1)
+#define IDMA_CTRL_DST_BURST_32 DMA_CTRL_DST_BURST(3)
+#define IDMA_CTRL_DST_BURST_64 DMA_CTRL_DST_BURST(7)
+#define IDMA_CTRL_DST_BURST_128 DMA_CTRL_DST_BURST(4)
+#define IDMA_CTRL_SRC_HOLD (1 << 3)
+#define IDMA_CTRL_DST_HOLD (1 << 5)
+#define IDMA_CTRL_SRC_BURST_8 DMA_CTRL_SRC_BURST(0)
+#define IDMA_CTRL_SRC_BURST_16 DMA_CTRL_SRC_BURST(1)
+#define IDMA_CTRL_SRC_BURST_32 DMA_CTRL_SRC_BURST(3)
+#define IDMA_CTRL_SRC_BURST_64 DMA_CTRL_SRC_BURST(7)
+#define IDMA_CTRL_SRC_BURST_128 DMA_CTRL_SRC_BURST(4)
+#define IDMA_CTRL_INT_MODE (1 << 10)
+#define IDMA_CTRL_BLOCK_MODE (1 << 11)
+#define IDMA_CTRL_CLOSE_DESC (1 << 17)
+#define IDMA_CTRL_ABORT (1 << 20)
+#define IDMA_CTRL_SADDR_OVR(x) (x << 21)
+#define IDMA_CTRL_NO_SADDR_OVR IDMA_CTRL_SADDR_OVR(0)
+#define IDMA_CTRL_SADDR_OVR_1 IDMA_CTRL_SADDR_OVR(1)
+#define IDMA_CTRL_SADDR_OVR_2 IDMA_CTRL_SADDR_OVR(2)
+#define IDMA_CTRL_SADDR_OVR_3 IDMA_CTRL_SADDR_OVR(3)
+#define IDMA_CTRL_DADDR_OVR(x) (x << 23)
+#define IDMA_CTRL_NO_DADDR_OVR IDMA_CTRL_DADDR_OVR(0)
+#define IDMA_CTRL_DADDR_OVR_1 IDMA_CTRL_DADDR_OVR(1)
+#define IDMA_CTRL_DADDR_OVR_2 IDMA_CTRL_DADDR_OVR(2)
+#define IDMA_CTRL_DADDR_OVR_3 IDMA_CTRL_DADDR_OVR(3)
+#define IDMA_CTRL_NADDR_OVR(x) (x << 25)
+#define IDMA_CTRL_NO_NADDR_OVR IDMA_CTRL_NADDR_OVR(0)
+#define IDMA_CTRL_NADDR_OVR_1 IDMA_CTRL_NADDR_OVR(1)
+#define IDMA_CTRL_NADDR_OVR_2 IDMA_CTRL_NADDR_OVR(2)
+#define IDMA_CTRL_NADDR_OVR_3 IDMA_CTRL_NADDR_OVR(3)
+#define IDMA_CTRL_DESC_MODE_16M (1 << 31)
+
+#define IDMA_CTRL_INIT_VALUE ( \
+ IDMA_CTRL_DST_BURST_128 | IDMA_CTRL_SRC_BURST_128 | \
+ IDMA_CTRL_INT_MODE | IDMA_CTRL_BLOCK_MODE | \
+ DMA_CTRL_ENABLE | IDMA_CTRL_DESC_MODE_16M \
+)
+
+/* TDMA_ERR_CAUSE bits */
+#define TDMA_INT_MISS (1 << 0)
+#define TDMA_INT_DOUBLE_HIT (1 << 1)
+#define TDMA_INT_BOTH_HIT (1 << 2)
+#define TDMA_INT_DATA_ERROR (1 << 3)
+#define TDMA_INT_ALL 0x0f
+
+/* offsets of registers, starting at "regs control and error" */
+#define TDMA_BYTE_COUNT 0x00
+#define TDMA_SRC_ADDR 0x10
+#define TDMA_DST_ADDR 0x20
+#define TDMA_NEXT_DESC 0x30
+#define TDMA_CTRL 0x40
+#define TDMA_CURR_DESC 0x70
+#define TDMA_ERR_CAUSE 0xc8
+#define TDMA_ERR_MASK 0xcc
+
+#define IDMA_BYTE_COUNT(chan) (0x00 + (chan) * 4)
+#define IDMA_SRC_ADDR(chan) (0x10 + (chan) * 4)
+#define IDMA_DST_ADDR(chan) (0x20 + (chan) * 4)
+#define IDMA_NEXT_DESC(chan) (0x30 + (chan) * 4)
+#define IDMA_CTRL_LOW(chan) (0x40 + (chan) * 4)
+#define IDMA_CURR_DESC(chan) (0x70 + (chan) * 4)
+#define IDMA_CTRL_HIGH(chan) (0x80 + (chan) * 4)
+#define IDMA_INT_CAUSE (0xc0)
+#define IDMA_INT_MASK (0xc4)
+#define IDMA_ERR_ADDR (0xc8)
+#define IDMA_ERR_SELECT (0xcc)
+
+/* register offsets common to TDMA and IDMA channel 0 */
+#define DMA_BYTE_COUNT TDMA_BYTE_COUNT
+#define DMA_SRC_ADDR TDMA_SRC_ADDR
+#define DMA_DST_ADDR TDMA_DST_ADDR
+#define DMA_NEXT_DESC TDMA_NEXT_DESC
+#define DMA_CTRL TDMA_CTRL
+#define DMA_CURR_DESC TDMA_CURR_DESC
+
+/* IDMA_INT_CAUSE and IDMA_INT_MASK bits */
+#define IDMA_INT_COMP(chan) ((1 << 0) << ((chan) * 8))
+#define IDMA_INT_MISS(chan) ((1 << 1) << ((chan) * 8))
+#define IDMA_INT_APROT(chan) ((1 << 2) << ((chan) * 8))
+#define IDMA_INT_WPROT(chan) ((1 << 3) << ((chan) * 8))
+#define IDMA_INT_OWN(chan) ((1 << 4) << ((chan) * 8))
+#define IDMA_INT_ALL(chan) (0x1f << (chan) * 8)
+
+/* Owner bit in DMA_BYTE_COUNT and descriptors' count field, used
+ * to signal input data completion in descriptor chain */
+#define DMA_OWN_BIT (1 << 31)
+
+/* IDMA also has a "Left Byte Count" bit,
+ * indicating not everything was transferred */
+#define IDMA_LEFT_BYTE_COUNT (1 << 30)
+
+/* filter the actual byte count value from the DMA_BYTE_COUNT field */
+#define DMA_BYTE_COUNT_MASK (~(DMA_OWN_BIT | IDMA_LEFT_BYTE_COUNT))
+
+extern void mv_dma_memcpy(dma_addr_t, dma_addr_t, unsigned int);
+extern void mv_dma_separator(void);
+extern void mv_dma_clear(void);
+extern void mv_dma_trigger(void);
+
+
+#endif /* _MV_DMA_H */
--
1.7.3.4
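
The four extern helpers at the bottom of this header are the whole
interface mv_cesa.c drives (mv_dma_clear() presumably resets the chain
between requests). A rough sketch of the intended call sequence for one
crypto operation - the wrapper itself and the SRAM_DATA_IN/SRAM_DATA_OUT
offsets are illustrative assumptions, not literal driver code:

/* Sketch only: chain copy-in descriptors, a separator handing control
 * to the CESA engine, then copy-out descriptors, and start the DMA. */
static void sketch_one_operation(dma_addr_t sram_phys, dma_addr_t cfg_dma,
				 unsigned int cfg_len, dma_addr_t src,
				 dma_addr_t dst, unsigned int len)
{
	/* copy engine configuration and input data into the CESA SRAM */
	mv_dma_memcpy(sram_phys + SRAM_CONFIG, cfg_dma, cfg_len);
	mv_dma_memcpy(sram_phys + SRAM_DATA_IN, src, len);

	/* special descriptor: the DMA engine stops here and signals
	 * CESA to start the crypto operation */
	mv_dma_separator();

	/* processed only after CESA signals completion */
	mv_dma_memcpy(dst, sram_phys + SRAM_DATA_OUT, len);

	/* start processing the descriptor chain */
	mv_dma_trigger();
}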

2012-06-12 17:17:59

by Phil Sutter

[permalink] [raw]
Subject: [PATCH 01/13] mv_cesa: do not use scatterlist iterators

The big problem is they cannot be used to iterate over DMA mapped
scatterlists, so get rid of them in order to add DMA functionality to
mv_cesa.

Signed-off-by: Phil Sutter <[email protected]>
---
drivers/crypto/mv_cesa.c | 57 ++++++++++++++++++++++-----------------------
1 files changed, 28 insertions(+), 29 deletions(-)

diff --git a/drivers/crypto/mv_cesa.c b/drivers/crypto/mv_cesa.c
index 0d40717..818a5c7 100644
--- a/drivers/crypto/mv_cesa.c
+++ b/drivers/crypto/mv_cesa.c
@@ -44,8 +44,8 @@ enum engine_status {

/**
* struct req_progress - used for every crypt request
- * @src_sg_it: sg iterator for src
- * @dst_sg_it: sg iterator for dst
+ * @src_sg: sg list for src
+ * @dst_sg: sg list for dst
* @sg_src_left: bytes left in src to process (scatter list)
* @src_start: offset to add to src start position (scatter list)
* @crypt_len: length of current hw crypt/hash process
@@ -60,8 +60,8 @@ enum engine_status {
* track of progress within current scatterlist.
*/
struct req_progress {
- struct sg_mapping_iter src_sg_it;
- struct sg_mapping_iter dst_sg_it;
+ struct scatterlist *src_sg;
+ struct scatterlist *dst_sg;
void (*complete) (void);
void (*process) (int is_first);

@@ -212,19 +212,19 @@ static int mv_setkey_aes(struct crypto_ablkcipher *cipher, const u8 *key,

static void copy_src_to_buf(struct req_progress *p, char *dbuf, int len)
{
- int ret;
void *sbuf;
int copy_len;

while (len) {
if (!p->sg_src_left) {
- ret = sg_miter_next(&p->src_sg_it);
- BUG_ON(!ret);
- p->sg_src_left = p->src_sg_it.length;
+ /* next sg please */
+ p->src_sg = sg_next(p->src_sg);
+ BUG_ON(!p->src_sg);
+ p->sg_src_left = p->src_sg->length;
p->src_start = 0;
}

- sbuf = p->src_sg_it.addr + p->src_start;
+ sbuf = sg_virt(p->src_sg) + p->src_start;

copy_len = min(p->sg_src_left, len);
memcpy(dbuf, sbuf, copy_len);
@@ -307,9 +307,6 @@ static void mv_crypto_algo_completion(void)
struct ablkcipher_request *req = ablkcipher_request_cast(cpg->cur_req);
struct mv_req_ctx *req_ctx = ablkcipher_request_ctx(req);

- sg_miter_stop(&cpg->p.src_sg_it);
- sg_miter_stop(&cpg->p.dst_sg_it);
-
if (req_ctx->op != COP_AES_CBC)
return ;

@@ -439,7 +436,6 @@ static void mv_hash_algo_completion(void)

if (ctx->extra_bytes)
copy_src_to_buf(&cpg->p, ctx->buffer, ctx->extra_bytes);
- sg_miter_stop(&cpg->p.src_sg_it);

if (likely(ctx->last_chunk)) {
if (likely(ctx->count <= MAX_HW_HASH_SIZE)) {
@@ -459,7 +455,6 @@ static void dequeue_complete_req(void)
{
struct crypto_async_request *req = cpg->cur_req;
void *buf;
- int ret;
cpg->p.hw_processed_bytes += cpg->p.crypt_len;
if (cpg->p.copy_back) {
int need_copy_len = cpg->p.crypt_len;
@@ -468,14 +463,14 @@ static void dequeue_complete_req(void)
int dst_copy;

if (!cpg->p.sg_dst_left) {
- ret = sg_miter_next(&cpg->p.dst_sg_it);
- BUG_ON(!ret);
- cpg->p.sg_dst_left = cpg->p.dst_sg_it.length;
+ /* next sg please */
+ cpg->p.dst_sg = sg_next(cpg->p.dst_sg);
+ BUG_ON(!cpg->p.dst_sg);
+ cpg->p.sg_dst_left = cpg->p.dst_sg->length;
cpg->p.dst_start = 0;
}

- buf = cpg->p.dst_sg_it.addr;
- buf += cpg->p.dst_start;
+ buf = sg_virt(cpg->p.dst_sg) + cpg->p.dst_start;

dst_copy = min(need_copy_len, cpg->p.sg_dst_left);

@@ -525,7 +520,6 @@ static int count_sgs(struct scatterlist *sl, unsigned int total_bytes)
static void mv_start_new_crypt_req(struct ablkcipher_request *req)
{
struct req_progress *p = &cpg->p;
- int num_sgs;

cpg->cur_req = &req->base;
memset(p, 0, sizeof(struct req_progress));
@@ -534,11 +528,14 @@ static void mv_start_new_crypt_req(struct ablkcipher_request *req)
p->process = mv_process_current_q;
p->copy_back = 1;

- num_sgs = count_sgs(req->src, req->nbytes);
- sg_miter_start(&p->src_sg_it, req->src, num_sgs, SG_MITER_FROM_SG);
-
- num_sgs = count_sgs(req->dst, req->nbytes);
- sg_miter_start(&p->dst_sg_it, req->dst, num_sgs, SG_MITER_TO_SG);
+ p->src_sg = req->src;
+ p->dst_sg = req->dst;
+ if (req->nbytes) {
+ BUG_ON(!req->src);
+ BUG_ON(!req->dst);
+ p->sg_src_left = req->src->length;
+ p->sg_dst_left = req->dst->length;
+ }

mv_process_current_q(1);
}
@@ -547,7 +544,7 @@ static void mv_start_new_hash_req(struct ahash_request *req)
{
struct req_progress *p = &cpg->p;
struct mv_req_hash_ctx *ctx = ahash_request_ctx(req);
- int num_sgs, hw_bytes, old_extra_bytes, rc;
+ int hw_bytes, old_extra_bytes, rc;
cpg->cur_req = &req->base;
memset(p, 0, sizeof(struct req_progress));
hw_bytes = req->nbytes + ctx->extra_bytes;
@@ -560,8 +557,11 @@ static void mv_start_new_hash_req(struct ahash_request *req)
else
ctx->extra_bytes = 0;

- num_sgs = count_sgs(req->src, req->nbytes);
- sg_miter_start(&p->src_sg_it, req->src, num_sgs, SG_MITER_FROM_SG);
+ p->src_sg = req->src;
+ if (req->nbytes) {
+ BUG_ON(!req->src);
+ p->sg_src_left = req->src->length;
+ }

if (hw_bytes) {
p->hw_nbytes = hw_bytes;
@@ -578,7 +578,6 @@ static void mv_start_new_hash_req(struct ahash_request *req)
} else {
copy_src_to_buf(p, ctx->buffer + old_extra_bytes,
ctx->extra_bytes - old_extra_bytes);
- sg_miter_stop(&p->src_sg_it);
if (ctx->last_chunk)
rc = mv_hash_final_fallback(req);
else
--
1.7.3.4
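
The underlying issue, for reference: sg_miter kmaps pages and walks the
CPU's view of the scatterlist, while a DMA-mapped list has to be walked
entry by entry through its DMA view. A minimal sketch of such a walk,
assuming the list was mapped with dma_map_sg() (not part of the patch):

#include <linux/dma-mapping.h>
#include <linux/scatterlist.h>

/* Sketch only: iterate the DMA view of a mapped scatterlist. 'nents'
 * is the value returned by dma_map_sg(). */
static void walk_mapped_sgl(struct scatterlist *sgl, int nents)
{
	struct scatterlist *sg;
	int i;

	for_each_sg(sgl, sg, nents, i) {
		dma_addr_t addr = sg_dma_address(sg);
		unsigned int len = sg_dma_len(sg);

		pr_debug("segment %d: %pad + %u\n", i, &addr, len);
	}
}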

2012-06-15 01:40:53

by cloudy.linux

[permalink] [raw]
Subject: Re: RFC: support for MV_CESA with IDMA or TDMA

On 2012-6-13 1:17, Phil Sutter wrote:
> Hi,
>
> The following patch series adds support for the TDMA engine built into
> Marvell's Kirkwood-based SoCs as well as the IDMA engine built into
> Marvell's Orion-based SoCs and enhances mv_cesa.c in order to use it for
> speeding up crypto operations. The hardware contains a security
> accelerator, which can control DMA as well as crypto engines. It allows
> for operation with minimal software intervention, which the following
> patches implement: using a chain of DMA descriptors, data input,
> configuration, engine startup and data output repeat fully automatically
> until the whole input data has been handled.
>
> The point for this being RFC is lack of hardware on my side for testing
> the IDMA support. I'd highly appreciate if someone with Orion hardware
> could test this, preferably using the hmac_comp tool shipped with
> cryptodev-linux as it does a more extensive testing (with bigger buffer
> sizes at least) than tcrypt or the standard kernel-internal use cases.
>
> Greetings, Phil
>

I would like to give those patches a try. But what kernel version
should I apply them to?

Thanks.

2012-06-15 09:51:33

by Phil Sutter

[permalink] [raw]
Subject: Re: RFC: support for MV_CESA with IDMA or TDMA

Hi,

On Fri, Jun 15, 2012 at 09:40:28AM +0800, cloudy.linux wrote:
> I would like to have a try on those patches. But what version of kernel
> should I apply those patches on?

Sorry for the confusion. I have applied those patches to Linus' git,
preceded by the three already-accepted ones of the earlier four. Yay.

Long story short, please just fetch git://nwl.cc/~n0-1/linux.git and
check out the 'cesa-dma' branch. It's exactly what I formatted the
patches from.

Greetings, Phil



2012-06-16 00:21:00

by Simon Baatz

[permalink] [raw]
Subject: [PATCH 0/2] Fixes for MV_CESA with IDMA or TDMA

Hi Phil,

thanks for providing these patches; it's great to finally see DMA
support for CESA in the kernel. Additionally, the implementation seems
to be fine regarding cache incoherencies (at least my test in [*]
works).

I have two patches for your patchset...

- Fix for mv_init_engine error handling

- My system locked up hard when mv_dma and mv_cesa were built as
modules. mv_cesa has code to enable the crypto clock in 3.5, but
mv_dma already accesses the CESA engine before. Thus, we need to
enable this clock here, too.

[*] http://www.spinics.net/lists/arm-kernel/msg176913.html

Simon Baatz (2):
mv_dma: fix mv_init_engine() error case
ARM: Orion: mv_dma: Add support for clk

arch/arm/mach-kirkwood/common.c | 1 +
drivers/crypto/mv_dma.c | 18 +++++++++++++++---
2 files changed, 16 insertions(+), 3 deletions(-)

--
1.7.9.5

2012-06-16 00:21:00

by Simon Baatz

[permalink] [raw]
Subject: [PATCH 1/2] mv_dma: fix mv_init_engine() error case

Fix a wrongly placed free_irq() in mv_init_engine()'s error recovery.
In fact, we can remove the respective label, since request_irq() is the
last thing the function does anyway.

Signed-off-by: Simon Baatz <[email protected]>
---
drivers/crypto/mv_dma.c | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/crypto/mv_dma.c b/drivers/crypto/mv_dma.c
index b84ff80..125dfee 100644
--- a/drivers/crypto/mv_dma.c
+++ b/drivers/crypto/mv_dma.c
@@ -296,7 +296,7 @@ static int mv_init_engine(struct platform_device *pdev,
if (init_dma_desclist(&tpg.desclist, tpg.dev,
sizeof(struct mv_dma_desc), MV_DMA_ALIGN, 0)) {
rc = -ENOMEM;
- goto out_free_irq;
+ goto out_unmap_reg;
}
if (set_dma_desclist_size(&tpg.desclist, MV_DMA_INIT_POOLSIZE)) {
rc = -ENOMEM;
@@ -330,8 +330,6 @@ out_free_all:
platform_set_drvdata(pdev, NULL);
out_free_desclist:
fini_dma_desclist(&tpg.desclist);
-out_free_irq:
- free_irq(tpg.irq, &tpg);
out_unmap_reg:
iounmap(tpg.reg);
tpg.dev = NULL;
--
1.7.9.5

2012-06-16 00:21:00

by Simon Baatz

[permalink] [raw]
Subject: [PATCH 2/2] ARM: Orion: mv_dma: Add support for clk

mv_dma needs the crypto clock. Some Orion platforms support gating of
this clock. If the clock exists, enable/disable it as appropriate.

Signed-off-by: Simon Baatz <[email protected]>
---
arch/arm/mach-kirkwood/common.c | 1 +
drivers/crypto/mv_dma.c | 14 ++++++++++++++
2 files changed, 15 insertions(+)

diff --git a/arch/arm/mach-kirkwood/common.c b/arch/arm/mach-kirkwood/common.c
index 560b920..e7bbc60 100644
--- a/arch/arm/mach-kirkwood/common.c
+++ b/arch/arm/mach-kirkwood/common.c
@@ -234,6 +234,7 @@ void __init kirkwood_clk_init(void)
orion_clkdev_add(NULL, "orion-ehci.0", usb0);
orion_clkdev_add(NULL, "orion_nand", runit);
orion_clkdev_add(NULL, "mvsdio", sdio);
+ orion_clkdev_add(NULL, "mv_tdma", crypto);
orion_clkdev_add(NULL, "mv_crypto", crypto);
orion_clkdev_add(NULL, MV_XOR_SHARED_NAME ".0", xor0);
orion_clkdev_add(NULL, MV_XOR_SHARED_NAME ".1", xor1);
diff --git a/drivers/crypto/mv_dma.c b/drivers/crypto/mv_dma.c
index 125dfee..9fdb7be 100644
--- a/drivers/crypto/mv_dma.c
+++ b/drivers/crypto/mv_dma.c
@@ -13,6 +13,7 @@
#include <linux/dmapool.h>
#include <linux/interrupt.h>
#include <linux/module.h>
+#include <linux/clk.h>
#include <linux/slab.h>
#include <linux/platform_device.h>

@@ -36,6 +37,7 @@ struct mv_dma_priv {
struct device *dev;
void __iomem *reg;
int irq;
+ struct clk *clk;
/* protecting the dma descriptors and stuff */
spinlock_t lock;
struct dma_desclist desclist;
@@ -292,6 +294,12 @@ static int mv_init_engine(struct platform_device *pdev,
goto out_unmap_reg;
}

+ /* Not all platforms can gate the clock, so it is not
+ an error if the clock does not exists. */
+ tpg.clk = clk_get(&pdev->dev, NULL);
+ if (!IS_ERR(tpg.clk))
+ clk_prepare_enable(tpg.clk);
+
/* initialise DMA descriptor list */
if (init_dma_desclist(&tpg.desclist, tpg.dev,
sizeof(struct mv_dma_desc), MV_DMA_ALIGN, 0)) {
@@ -343,6 +351,12 @@ static int mv_remove(struct platform_device *pdev)
fini_dma_desclist(&tpg.desclist);
free_irq(tpg.irq, &tpg);
iounmap(tpg.reg);
+
+ if (!IS_ERR(tpg.clk)) {
+ clk_disable_unprepare(tpg.clk);
+ clk_put(tpg.clk);
+ }
+
tpg.dev = NULL;
return 0;
}
--
1.7.9.5

2012-06-18 13:47:28

by Phil Sutter

[permalink] [raw]
Subject: Re: [PATCH 0/2] Fixes for MV_CESA with IDMA or TDMA

Hi Simon,

On Sat, Jun 16, 2012 at 02:20:19AM +0200, Simon Baatz wrote:
> thanks for providing these patches; it's great to finally see DMA
> support for CESA in the kernel. Additionally, the implementation seems
> to be fine regarding cache incoherencies (at least my test in [*]
> works).

Thanks for testing and the fixes. Could you also specify the platform
you are testing on?

> I have two patches for your patchset...
>
> - Fix for mv_init_engine error handling
>
> - My system locked up hard when mv_dma and mv_cesa were built as
> modules. mv_cesa has code to enable the crypto clock in 3.5, but
> mv_dma already accesses the CESA engine before. Thus, we need to
> enable this clock here, too.

I have folded them into my patch series, thanks again. I somewhat miss
the orion_clkdev_add() part for orion5x platforms, but I also fail to
find any equivalent place in the corresponding subdirectory. So I hope
it is OK like this.

The updated patch series is available at git://nwl.cc/~n0-1/linux.git,
branch 'cesa-dma'. My push changed history, so you have to either
reset --hard to its HEAD, or rebase, skipping the outdated patches.

Greetings, Phil



2012-06-18 20:12:39

by Simon Baatz

[permalink] [raw]
Subject: Re: [PATCH 0/2] Fixes for MV_CESA with IDMA or TDMA

Hi Phil,

On Mon, Jun 18, 2012 at 03:47:18PM +0200, Phil Sutter wrote:
> On Sat, Jun 16, 2012 at 02:20:19AM +0200, Simon Baatz wrote:
> > thanks for providing these patches; it's great to finally see DMA
> > support for CESA in the kernel. Additionally, the implementation seems
> > to be fine regarding cache incoherencies (at least my test in [*]
> > works).
>
> Thanks for testing and the fixes. Could you also specify the platform
> you are testing on?

This is a Marvell Kirkwood MV88F6281-A1.

I see one effect that I don't fully understand.
Similar to the previous implementation, the system is mostly in
kernel space when accessing an encrypted dm-crypt device:

# cryptsetup --cipher=aes-cbc-plain --key-size=128 create c_sda2 /dev/sda2
Enter passphrase:
# dd if=/dev/mapper/c_sda2 of=/dev/null bs=64k count=2048
2048+0 records in
2048+0 records out
134217728 bytes (134 MB) copied, 10.7324 s, 12.5 MB/s

Doing an "mpstat 1" at the same time gives:

21:21:42 CPU %usr %nice %sys %iowait %irq %soft ...
21:21:45 all 0.00 0.00 0.00 0.00 0.00 0.00
21:21:46 all 0.00 0.00 79.00 0.00 0.00 2.00
21:21:47 all 0.00 0.00 95.00 0.00 0.00 5.00
21:21:48 all 0.00 0.00 94.00 0.00 0.00 6.00
21:21:49 all 0.00 0.00 96.00 0.00 0.00 4.00
...

The underlying device is a SATA drive and should not be the limit:

# dd if=/dev/sda2 of=/dev/null bs=64k count=2048
2048+0 records in
2048+0 records out
134217728 bytes (134 MB) copied, 1.79804 s, 74.6 MB/s

I did not dare hope the DMA implementation would be much faster than
the old one, but I would have expected rather low CPU usage when using
DMA. Do you have an idea where the kernel spends its time? (Am I
hitting a non- or only partially accelerated path here?)

> > - My system locked up hard when mv_dma and mv_cesa were built as
> > modules. mv_cesa has code to enable the crypto clock in 3.5, but
> > mv_dma already accesses the CESA engine before. Thus, we need to
> > enable this clock here, too.
>
> I have folded them into my patch series, thanks again. I somewhat miss
> the orion_clkdev_add() part for orion5x platforms, but also fail to find
> any equivalent place in the correspondent subdirectory. So I hope it is
> OK like this.

The change follows the original clk changes by Andrew. I don't know
orion5x, but apparently, only kirkwood has such fine grained clock
gates:

/* Create clkdev entries for all orion platforms except kirkwood.
Kirkwood has gated clocks for some of its peripherals, so creates
its own clkdev entries. For all the other orion devices, create
clkdev entries to the tclk. */

(from plat-orion/common.c)

This is why the clock enabling code in the modules ignores the case
that the clock can't be found. I think the clocks defined by
plat-orion are for those drivers that need the actual TCLK rate (but
there is no clock gate functionality here).
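
In code terms, this is just the optional-clock pattern both modules
use, condensed (sketch only):

	struct clk *clk;

	clk = clk_get(dev, NULL);	/* fails where no gated clock exists */
	if (!IS_ERR(clk))
		clk_prepare_enable(clk);

	/* ... access the hardware ... */

	if (!IS_ERR(clk)) {
		clk_disable_unprepare(clk);
		clk_put(clk);
	}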

- Simon

2012-06-19 11:51:32

by Phil Sutter

[permalink] [raw]
Subject: Re: [PATCH 0/2] Fixes for MV_CESA with IDMA or TDMA

Hi Simon,

On Mon, Jun 18, 2012 at 10:12:36PM +0200, Simon Baatz wrote:
> On Mon, Jun 18, 2012 at 03:47:18PM +0200, Phil Sutter wrote:
> > On Sat, Jun 16, 2012 at 02:20:19AM +0200, Simon Baatz wrote:
> > > thanks for providing these patches; it's great to finally see DMA
> > > support for CESA in the kernel. Additionally, the implementation seems
> > > to be fine regarding cache incoherencies (at least my test in [*]
> > > works).
> >
> > Thanks for testing and the fixes. Could you also specify the platform
> > you are testing on?
>
> This is a Marvell Kirkwood MV88F6281-A1.

OK, thanks. Just wanted to be sure it's not already the Orion test I'm
hoping for. :)

> I see one effect that I don't fully understand.
> Similar to the previous implementation, the system is mostly in
> kernel space when accessing an encrypted dm-crypt device:
>
> # cryptsetup --cipher=aes-cbc-plain --key-size=128 create c_sda2 /dev/sda2
> Enter passphrase:
> # dd if=/dev/mapper/c_sda2 of=/dev/null bs=64k count=2048
> 2048+0 records in
> 2048+0 records out
> 134217728 bytes (134 MB) copied, 10.7324 s, 12.5 MB/s
>
> Doing an "mpstat 1" at the same time gives:
>
> 21:21:42 CPU %usr %nice %sys %iowait %irq %soft ...
> 21:21:45 all 0.00 0.00 0.00 0.00 0.00 0.00
> 21:21:46 all 0.00 0.00 79.00 0.00 0.00 2.00
> 21:21:47 all 0.00 0.00 95.00 0.00 0.00 5.00
> 21:21:48 all 0.00 0.00 94.00 0.00 0.00 6.00
> 21:21:49 all 0.00 0.00 96.00 0.00 0.00 4.00
> ...
>
> The underlying device is a SATA drive and should not be the limit:
>
> # dd if=/dev/sda2 of=/dev/null bs=64k count=2048
> 2048+0 records in
> 2048+0 records out
> 134217728 bytes (134 MB) copied, 1.79804 s, 74.6 MB/s
>
> I did not dare hope the DMA implementation to be much faster than the
> old one, but I would have expected a rather low CPU usage using DMA.
> Do you have an idea where the kernel spends its time? (Am I hitting
> a non/only partially accelerated path here?)

Hmm. Though you passed bs=64k to dd, block sizes may still be the
bottleneck. No idea if the parameter is really passed down to dm-crypt
or if it uses the underlying device's block size anyway. I just did a
short speed test on the 2.6.39.2 kernel we're using in production:

| Testing AES-128-CBC cipher:
| Encrypting in chunks of 512 bytes: done. 46.19 MB in 5.00 secs: 9.24 MB/sec
| Encrypting in chunks of 1024 bytes: done. 81.82 MB in 5.00 secs: 16.36 MB/sec
| Encrypting in chunks of 2048 bytes: done. 124.63 MB in 5.00 secs: 24.93 MB/sec
| Encrypting in chunks of 4096 bytes: done. 162.88 MB in 5.00 secs: 32.58 MB/sec
| Encrypting in chunks of 8192 bytes: done. 200.47 MB in 5.00 secs: 40.09 MB/sec
| Encrypting in chunks of 16384 bytes: done. 226.61 MB in 5.00 secs: 45.32 MB/sec
| Encrypting in chunks of 32768 bytes: done. 242.78 MB in 5.00 secs: 48.55 MB/sec
| Encrypting in chunks of 65536 bytes: done. 251.85 MB in 5.00 secs: 50.36 MB/sec
|
| Testing AES-256-CBC cipher:
| Encrypting in chunks of 512 bytes: done. 45.15 MB in 5.00 secs: 9.03 MB/sec
| Encrypting in chunks of 1024 bytes: done. 78.72 MB in 5.00 secs: 15.74 MB/sec
| Encrypting in chunks of 2048 bytes: done. 117.59 MB in 5.00 secs: 23.52 MB/sec
| Encrypting in chunks of 4096 bytes: done. 151.59 MB in 5.00 secs: 30.32 MB/sec
| Encrypting in chunks of 8192 bytes: done. 182.95 MB in 5.00 secs: 36.59 MB/sec
| Encrypting in chunks of 16384 bytes: done. 204.00 MB in 5.00 secs: 40.80 MB/sec
| Encrypting in chunks of 32768 bytes: done. 216.17 MB in 5.00 secs: 43.23 MB/sec
| Encrypting in chunks of 65536 bytes: done. 223.22 MB in 5.00 secs: 44.64 MB/sec

Observing top while it was running revealed that system load was
decreasing with increased block sizes - ~75% at 512B, ~20% at 32kB. I
fear this is a limitation we have to live with: the overhead of setting
up DMA descriptors and handling the returned data is quite high,
especially when compared to the time it takes the engine to encrypt
512B. I was playing around with descriptor preparation at some point
(i.e. preparing the next descriptor chain while the engine is active),
but without satisfactory results. Maybe I should have another look at
it, especially regarding the case of small chunk sizes. OTOH this all
makes sense only when used asynchronously, and I have no idea whether
dm-crypt (or fellows like IPsec) makes use of that interface at all.
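
For what it's worth, the chunk-size scaling above is easy to reproduce
from userspace without cryptodev, via the kernel's AF_ALG interface. A
rough sketch (error checking omitted, dummy key, default zero IV; the
AF_ALG constants are standard uapi, everything else here is made up):

#include <stdio.h>
#include <time.h>
#include <unistd.h>
#include <sys/socket.h>
#include <linux/if_alg.h>

/* issue one synchronous encrypt request on the AF_ALG operation socket */
static void encrypt_chunk(int opfd, unsigned char *buf, size_t len)
{
	union {
		char buf[CMSG_SPACE(sizeof(__u32))];
		struct cmsghdr align;
	} ctl = { { 0 } };
	struct iovec iov = { .iov_base = buf, .iov_len = len };
	struct msghdr msg = {
		.msg_control = ctl.buf,
		.msg_controllen = sizeof(ctl.buf),
		.msg_iov = &iov,
		.msg_iovlen = 1,
	};
	struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);

	cmsg->cmsg_level = SOL_ALG;
	cmsg->cmsg_type = ALG_SET_OP;
	cmsg->cmsg_len = CMSG_LEN(sizeof(__u32));
	*(__u32 *)CMSG_DATA(cmsg) = ALG_OP_ENCRYPT;

	sendmsg(opfd, &msg, 0);
	read(opfd, buf, len);		/* fetch the ciphertext */
}

int main(void)
{
	struct sockaddr_alg sa = {
		.salg_family = AF_ALG,
		.salg_type = "skcipher",
		.salg_name = "cbc(aes)",
	};
	unsigned char key[16] = { 0 };	/* dummy key */
	static unsigned char buf[65536];
	int tfmfd, opfd;
	size_t chunk;

	tfmfd = socket(AF_ALG, SOCK_SEQPACKET, 0);
	bind(tfmfd, (struct sockaddr *)&sa, sizeof(sa));
	setsockopt(tfmfd, SOL_ALG, ALG_SET_KEY, key, sizeof(key));
	opfd = accept(tfmfd, NULL, 0);

	for (chunk = 512; chunk <= sizeof(buf); chunk <<= 1) {
		struct timespec t0, t1;
		size_t total = 0;
		double secs;

		clock_gettime(CLOCK_MONOTONIC, &t0);
		while (total < (64 << 20)) {	/* 64 MB per chunk size */
			encrypt_chunk(opfd, buf, chunk);
			total += chunk;
		}
		clock_gettime(CLOCK_MONOTONIC, &t1);
		secs = (t1.tv_sec - t0.tv_sec) +
		       (t1.tv_nsec - t0.tv_nsec) / 1e9;
		printf("chunks of %5zu bytes: %6.2f MB/sec\n",
		       chunk, total / secs / (1 << 20));
	}
	return 0;
}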

> > > - My system locked up hard when mv_dma and mv_cesa were built as
> > > modules. mv_cesa has code to enable the crypto clock in 3.5, but
> > > mv_dma already accesses the CESA engine before. Thus, we need to
> > > enable this clock here, too.
> >
> > I have folded them into my patch series, thanks again. I somewhat miss
> > the orion_clkdev_add() part for orion5x platforms, but also fail to find
> > any equivalent place in the correspondent subdirectory. So I hope it is
> > OK like this.
>
> The change follows the original clk changes by Andrew. I don't know
> orion5x, but apparently, only kirkwood has such fine grained clock
> gates:
>
> /* Create clkdev entries for all orion platforms except kirkwood.
> Kirkwood has gated clocks for some of its peripherals, so creates
> its own clkdev entries. For all the other orion devices, create
> clkdev entries to the tclk. */
>
> (from plat-orion/common.c)
>
> This is why the clock enabling code in the modules ignores the case
> that the clock can't be found. I think the clocks defined by
> plat-orion are for those drivers that need the actual TCLK rate (but
> there is no clock gate functionality here).

Ah, OK. Reading helps, they say. Thanks anyway for your explanation.

Greetings, Phil


Phil Sutter
Software Engineer

--


Viprinet GmbH
Mainzer Str. 43
55411 Bingen am Rhein
Germany

Phone/Zentrale: +49-6721-49030-0
Direct line/Durchwahl: +49-6721-49030-134
Fax: +49-6721-49030-209

[email protected]
http://www.viprinet.com

Registered office/Sitz der Gesellschaft: Bingen am Rhein
Commercial register/Handelsregister: Amtsgericht Mainz HRB40380
CEO/Geschäftsführer: Simon Kissel

2012-06-19 15:10:06

by cloudy.linux

[permalink] [raw]
Subject: Re: [PATCH 0/2] Fixes for MV_CESA with IDMA or TDMA

On 2012-6-19 19:51, Phil Sutter wrote:
> Hi Simon,
>
> On Mon, Jun 18, 2012 at 10:12:36PM +0200, Simon Baatz wrote:
>> On Mon, Jun 18, 2012 at 03:47:18PM +0200, Phil Sutter wrote:
>>> On Sat, Jun 16, 2012 at 02:20:19AM +0200, Simon Baatz wrote:
>>>> thanks for providing these patches; it's great to finally see DMA
>>>> support for CESA in the kernel. Additionally, the implementation seems
>>>> to be fine regarding cache incoherencies (at least my test in [*]
>>>> works).
>>>
>>> Thanks for testing and the fixes. Could you also specify the platform
>>> you are testing on?
>>
>> This is a Marvell Kirkwood MV88F6281-A1.
>
> OK, thanks. Just wanted to be sure it's not already the Orion test I'm
> hoping for. :)
>

OK, here comes the Orion test result - a Linkstation Pro with an
88F5182 A2. I didn't enable any debug options yet (I don't know which
ones should be enabled, in fact). I hope the mv_cesa- and
mv_dma-related kernel messages below are helpful though:

...

MV-DMA: IDMA engine up and running, IRQ 23
MV-DMA: idma_print_and_clear_irq: address miss @0!
MV-DMA: tpg.reg + DMA_CTRL = 0x80001d04
MV-DMA: tpg.reg + DMA_BYTE_COUNT = 0x80000010
MV-DMA: tpg.reg + DMA_SRC_ADDR = 0x79b4008
MV-DMA: tpg.reg + DMA_DST_ADDR = 0xf2200080
MV-DMA: tpg.reg + DMA_NEXT_DESC = 0x79b1010
MV-DMA: tpg.reg + DMA_CURR_DESC = 0x79b1000
MV-DMA: DMA descriptor list:
MV-DMA: entry 0 at 0xffdbb000: dma addr 0x79b1000, src 0x79b4000, dst
0xf2200080, count 16, own 1, next 0x79b1010
MV-DMA: entry 1 at 0xffdbb010: dma addr 0x79b1010, src 0x799c28c, dst
0xf2200000, count 80, own 1, next 0x79b1020
MV-DMA: entry 2 at 0xffdbb020: dma addr 0x79b1020, src 0x0, dst 0x0,
count 0, own 0, next 0x79b1030
MV-DMA: entry 3 at 0xffdbb030: dma addr 0x79b1030, src 0xf2200080, dst
0x79b4000, count 16, own 1, next 0x0
MV-CESA:got an interrupt but no pending timer?
alg: skcipher: Test 1 failed on encryption for mv-ecb-aes
00000000: 00 11 22 33 44 55 66 77 88 99 aa bb cc dd ee ff
...

MV-CESA:completion timer expired (CESA active), cleaning up.
MV-CESA:mv_completion_timer_callback: waiting for engine finishing
MV-CESA:mv_completion_timer_callback: waiting for engine finishing

Then the console was flooded with the "waiting for engine finishing"
message and the boot couldn't finish.

I'll be happy to help debug this. Just tell me how.

2012-06-19 17:13:13

by Phil Sutter

[permalink] [raw]
Subject: Re: [PATCH 0/2] Fixes for MV_CESA with IDMA or TDMA

Hi,

On Tue, Jun 19, 2012 at 11:09:43PM +0800, cloudy.linux wrote:
> On 2012-6-19 19:51, Phil Sutter wrote:
> > Hi Simon,
> >
> > On Mon, Jun 18, 2012 at 10:12:36PM +0200, Simon Baatz wrote:
> >> On Mon, Jun 18, 2012 at 03:47:18PM +0200, Phil Sutter wrote:
> >>> On Sat, Jun 16, 2012 at 02:20:19AM +0200, Simon Baatz wrote:
> >>>> thanks for providing these patches; it's great to finally see DMA
> >>>> support for CESA in the kernel. Additionally, the implementation seems
> >>>> to be fine regarding cache incoherencies (at least my test in [*]
> >>>> works).
> >>>
> >>> Thanks for testing and the fixes. Could you also specify the platform
> >>> you are testing on?
> >>
> >> This is a Marvell Kirkwood MV88F6281-A1.
> >
> > OK, thanks. Just wanted to be sure it's not already the Orion test I'm
> > hoping for. :)
> >
>
> OK, here comes the Orion test result - Linkstation Pro with 88F5182 A2.
> I didn't enable any debug option yet (I don't know what to be enabled in
> fact). Hope the mv_cesa and mv_dma related kernel messages below could
> be helpful though:
>
> ...
>
> MV-DMA: IDMA engine up and running, IRQ 23
> MV-DMA: idma_print_and_clear_irq: address miss @0!
> MV-DMA: tpg.reg + DMA_CTRL = 0x80001d04
> MV-DMA: tpg.reg + DMA_BYTE_COUNT = 0x80000010
> MV-DMA: tpg.reg + DMA_SRC_ADDR = 0x79b4008
> MV-DMA: tpg.reg + DMA_DST_ADDR = 0xf2200080
> MV-DMA: tpg.reg + DMA_NEXT_DESC = 0x79b1010
> MV-DMA: tpg.reg + DMA_CURR_DESC = 0x79b1000
> MV-DMA: DMA descriptor list:
> MV-DMA: entry 0 at 0xffdbb000: dma addr 0x79b1000, src 0x79b4000, dst
> 0xf2200080, count 16, own 1, next 0x79b1010
> MV-DMA: entry 1 at 0xffdbb010: dma addr 0x79b1010, src 0x799c28c, dst
> 0xf2200000, count 80, own 1, next 0x79b1020
> MV-DMA: entry 2 at 0xffdbb020: dma addr 0x79b1020, src 0x0, dst 0x0,
> count 0, own 0, next 0x79b1030
> MV-DMA: entry 3 at 0xffdbb030: dma addr 0x79b1030, src 0xf2200080, dst
> 0x79b4000, count 16, own 1, next 0x0
> MV-CESA:got an interrupt but no pending timer?
> alg: skcipher: Test 1 failed on encryption for mv-ecb-aes
> 00000000: 00 11 22 33 44 55 66 77 88 99 aa bb cc dd ee ff
> ...
>
> MV-CESA:completion timer expired (CESA active), cleaning up.
> MV-CESA:mv_completion_timer_callback: waiting for engine finishing
> MV-CESA:mv_completion_timer_callback: waiting for engine finishing
>
> Then the console was flooded by the "waiting for engine finshing"
> message and the boot can't finish.
>
> I'll be happy to help to debug this. Just tell me how.

OK. IDMA bailing out was more or less expected, but the error path
flooding the log makes me deserve the Darwin Award. ;)

I suspect address decoding to be the real problem here (Kirkwood seems
not to need any setup, so I completely skipped that); at least the IDMA
interrupt cause points in that direction. OTOH I found out that CESA
wasn't exactly configured as stated in the specs, so could you please
test the attached diff? (It should also sanitise the error case a bit.)

In any case, thanks a lot for your time!



Phil Sutter
Software Engineer

--


Viprinet GmbH
Mainzer Str. 43
55411 Bingen am Rhein
Germany

Phone/Zentrale: +49-6721-49030-0
Direct line/Durchwahl: +49-6721-49030-134
Fax: +49-6721-49030-209

[email protected]
http://www.viprinet.com

Registered office/Sitz der Gesellschaft: Bingen am Rhein
Commercial register/Handelsregister: Amtsgericht Mainz HRB40380
CEO/Geschäftsführer: Simon Kissel


Attachments:
cesa_test.diff (1.95 kB)

2012-06-20 01:16:51

by cloudy.linux

[permalink] [raw]
Subject: Re: [PATCH 0/2] Fixes for MV_CESA with IDMA or TDMA

The CESA still didn't work as expected. But this time the machine
managed to finish the boot.

...

MV-DMA: IDMA engine up and running, IRQ 23
MV-DMA: idma_print_and_clear_irq: address miss @0!
MV-DMA: tpg.reg + DMA_CTRL = 0x80001d04
MV-DMA: tpg.reg + DMA_BYTE_COUNT = 0x80000010
MV-DMA: tpg.reg + DMA_SRC_ADDR = 0x79b4008
MV-DMA: tpg.reg + DMA_DST_ADDR = 0xf2200080
MV-DMA: tpg.reg + DMA_NEXT_DESC = 0x79b1010
MV-DMA: tpg.reg + DMA_CURR_DESC = 0x79b1000
MV-DMA: DMA descriptor list:
MV-DMA: entry 0 at 0xffdbb000: dma addr 0x79b1000, src 0x79b4000, dst
0xf2200080, count 16, own 1, next 0x79b1010
MV-DMA: entry 1 at 0xffdbb010: dma addr 0x79b1010, src 0x799c28c, dst
0xf2200000, count 80, own 1, next 0x79b1020
MV-DMA: entry 2 at 0xffdbb020: dma addr 0x79b1020, src 0x0, dst 0x0,
count 0, own 0, next 0x79b1030
MV-DMA: entry 3 at 0xffdbb030: dma addr 0x79b1030, src 0xf2200080, dst
0x79b4000, count 16, own 1, next 0x0
MV-CESA:got an interrupt but no pending timer?
alg: skcipher: Test 1 failed on encryption for mv-ecb-aes
00000000: 00 11 22 33 44 55 66 77 88 99 aa bb cc dd ee ff

...

MV-CESA:completion timer expired (CESA active), cleaning up.
MV-CESA:mv_completion_timer_callback: waiting for engine finishing (5)
MV-CESA:mv_completion_timer_callback: waiting for engine finishing (4)
MV-CESA:mv_completion_timer_callback: waiting for engine finishing (3)
MV-CESA:mv_completion_timer_callback: waiting for engine finishing (2)
MV-CESA:mv_completion_timer_callback: waiting for engine finishing (1)
alg: hash: Test 1 failed for mv-sha1
00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00000010: 00 00 00 00
ata2: SATA link down (SStatus 0 SControl 300)
MV-CESA:completion timer expired (CESA active), cleaning up.
MV-CESA:mv_completion_timer_callback: waiting for engine finishing (5)
MV-CESA:mv_completion_timer_callback: waiting for engine finishing (4)
MV-CESA:mv_completion_timer_callback: waiting for engine finishing (3)
MV-CESA:mv_completion_timer_callback: waiting for engine finishing (2)
MV-CESA:mv_completion_timer_callback: waiting for engine finishing (1)
alg: hash: Test 1 failed for mv-hmac-sha1
00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00000010: 00 00 00 00

...

Regards
Cloudy

2012-06-20 13:31:25

by cloudy.linux

[permalink] [raw]
Subject: Re: [PATCH 0/2] Fixes for MV_CESA with IDMA or TDMA

Hi Phil

On 2012-6-19 4:12, Simon Baatz wrote:
> I see one effect that I don't fully understand.
> Similar to the previous implementation, the system is mostly in
> kernel space when accessing an encrypted dm-crypt device:

Today I also compiled the patched 3.5.0-rc3 for another NAS box with an
MV88F6282-Rev-A0 (LS-WVL). I noticed that when the CESA engine was in
use, the interrupt count of mv_crypto kept rising, but the interrupt
count of mv_tdma stayed at zero.

$ cat /proc/interrupts
CPU0
1: 31296 orion_irq orion_tick
5: 2 orion_irq mv_xor.0
6: 2 orion_irq mv_xor.1
7: 2 orion_irq mv_xor.2
8: 2 orion_irq mv_xor.3
11: 23763 orion_irq eth0
19: 0 orion_irq ehci_hcd:usb1
21: 4696 orion_irq sata_mv
22: 64907 orion_irq mv_crypto
33: 432 orion_irq serial
46: 51 orion_irq mv643xx_eth
49: 0 orion_irq mv_tdma
53: 0 orion_irq rtc-mv
107: 0 - GPIO fan alarm
109: 0 - function
110: 0 - power-on
111: 0 - power-auto
Err: 0

Is this normal?

Regards
Cloudy

2012-06-20 15:41:40

by Phil Sutter

[permalink] [raw]
Subject: Re: [PATCH 0/2] Fixes for MV_CESA with IDMA or TDMA

Hi Cloudy,

On Wed, Jun 20, 2012 at 09:31:10PM +0800, cloudy.linux wrote:
> On 2012-6-19 4:12, Simon Baatz wrote:
> > I see one effect that I don't fully understand.
> > Similar to the previous implementation, the system is mostly in
> > kernel space when accessing an encrypted dm-crypt device:
>
> Today I also compiled the patched 3.5.0-rc3 for another NAS box with
> MV88F6282-Rev-A0 (LS-WVL), I noticed one thing that when the CESA engine
> was used, the interrupt number of mv_crypto kept rising, but the
> interrupt number of mv_tdma was always zero.

Yes, that is exactly how it should be: the DMA engine is configured to
run "attached" to CESA, meaning that when CESA is triggered from
mv_cesa.c, it first enables the DMA engine. Using a special descriptor
in the chain, the DMA engine knows when to stop and signals CESA again
so it can start the crypto operation. Afterwards, CESA triggers the DMA
engine again to copy back the results (or more specifically: to process
the remaining descriptors in the chain after the special one). After a
descriptor whose next-descriptor field is zero has been handled, CESA
is signaled again, which in turn generates the interrupt to notify the
software. So no DMA interrupt is needed, and of course no software
interaction between data copying and the crypto operation. :)
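
Laid out as data, such a chain - matching the descriptor dumps quoted
earlier in this thread - looks roughly like this; note the field order
of the descriptor struct is an assumption, not taken from the patch:

/* One complete chain, as in the dumps quoted above:
 *
 *   desc 0..n: copy-in (engine config and input data into CESA SRAM),
 *              count has DMA_OWN_BIT set
 *   separator: src = 0, dst = 0, count = 0 - DMA stops, CESA starts
 *   desc m..k: copy-out (results from CESA SRAM back to memory)
 *   last desc: next = 0 - end of chain, CESA raises the one interrupt
 */
struct mv_dma_desc {
	u32 count;	/* byte count; DMA_OWN_BIT while owned by hardware */
	u32 src;	/* source address */
	u32 dst;	/* destination address */
	u32 next;	/* DMA address of next descriptor, 0 terminates */
};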

Greetings, Phil

PS: I am currently working on the address decoding problem, and will
get back to you in a few days when I have something to test. So stay
tuned!


2012-06-25 13:41:00

by Phil Sutter

[permalink] [raw]
Subject: Re: [PATCH 0/2] Fixes for MV_CESA with IDMA or TDMA

Hi,

On Wed, Jun 20, 2012 at 05:41:31PM +0200, Phil Sutter wrote:
> PS: I am currently working at the address decoding problem, will get
> back to in a few days when I have something to test. So stay tuned!

I have updated the cesa-dma branch at git://nwl.cc/~n0-1/linux.git with
code setting the decoding windows. I hope this fixes the issues on
orion. I decided not to publish the changes regarding the second DMA
channel for now, as this seems to be support for a second crypto session
(handled consecutively, so no real improvement) which is not supported
anyway.

Greetings, Phil



2012-06-25 14:25:13

by cloudy.linux

[permalink] [raw]
Subject: Re: [PATCH 0/2] Fixes for MV_CESA with IDMA or TDMA

On 2012-6-25 21:40, Phil Sutter wrote:
> Hi,
>
> On Wed, Jun 20, 2012 at 05:41:31PM +0200, Phil Sutter wrote:
>> PS: I am currently working at the address decoding problem, will get
>> back to in a few days when I have something to test. So stay tuned!
>
> I have updated the cesa-dma branch at git://nwl.cc/~n0-1/linux.git with
> code setting the decoding windows. I hope this fixes the issues on
> orion. I decided not to publish the changes regarding the second DMA
> channel for now, as this seems to be support for a second crypto session
> (handled consecutively, so no real improvement) which is not supported
> anyway.
>
> Greetings, Phil
>

Thanks Phil. I'm cloning your git now but the speed is really slow.
Last time I tried to do this I had to cancel after hours of downloading
(at only about 20% progress). So the previous tests were actually done
with 3.5-rc3 (I tried the up-to-date Linus linux-git, but ran into a
compile problem), of course with your patch and Simon's. Could you
provide a diff based on your last round of patches (a diff against the
unpatched kernel should also be fine, I think)?

In the mean time, I'm still trying with a cloning speed of 5KiB/s ...

Regards
Cloudy

2012-06-25 14:36:59

by Phil Sutter

[permalink] [raw]
Subject: Re: [PATCH 0/2] Fixes for MV_CESA with IDMA or TDMA

Hi,

On Mon, Jun 25, 2012 at 10:25:01PM +0800, cloudy.linux wrote:
> On 2012-6-25 21:40, Phil Sutter wrote:
> > Hi,
> >
> > On Wed, Jun 20, 2012 at 05:41:31PM +0200, Phil Sutter wrote:
> >> PS: I am currently working at the address decoding problem, will get
> >> back to in a few days when I have something to test. So stay tuned!
> >
> > I have updated the cesa-dma branch at git://nwl.cc/~n0-1/linux.git with
> > code setting the decoding windows. I hope this fixes the issues on
> > orion. I decided not to publish the changes regarding the second DMA
> > channel for now, as this seems to be support for a second crypto session
> > (handled consecutively, so no real improvement) which is not supported
> > anyway.
> >
> > Greetings, Phil
> >
>
> Thanks Phil. I'm cloning your git now but the speed is really slow. Last
> time I tried to do this but had to cancel after hours of downloading (at
> only about 20% progress). So the previous tests were actually done with
> 3.5-rc3 (I tried the up-to-date Linus' linux-git, but met compiling
> problem), of course with your patch and Simon's. Could you provide a
> diff based on your last round patch (diff to the not patched kernel
> should also be good, I think)?
>
> In the mean time, I'm still trying with a cloning speed of 5KiB/s ...

Ugh, that's horrible. No idea what's going wrong there, and I have no
access to the management interface right now. In the meantime, please
refer to the attached patch. It is based on 94fa83c in Linus' git but
should apply cleanly to its current HEAD, too.

Greetings, Phil




Attachments:
mv_dma_full.diff (53.73 kB)

2012-06-25 16:06:10

by cloudy.linux

[permalink] [raw]
Subject: Re: [PATCH 0/2] Fixes for MV_CESA with IDMA or TDMA

On 2012-6-25 22:25, cloudy.linux wrote:
> On 2012-6-25 21:40, Phil Sutter wrote:
>> Hi,
>>
>> On Wed, Jun 20, 2012 at 05:41:31PM +0200, Phil Sutter wrote:
>>> PS: I am currently working at the address decoding problem, will get
>>> back to in a few days when I have something to test. So stay tuned!
>>
>> I have updated the cesa-dma branch at git://nwl.cc/~n0-1/linux.git with
>> code setting the decoding windows. I hope this fixes the issues on
>> orion. I decided not to publish the changes regarding the second DMA
>> channel for now, as this seems to be support for a second crypto session
>> (handled consecutively, so no real improvement) which is not supported
>> anyway.
>>
>> Greetings, Phil
>>
>
> Thanks Phil. I'm cloning your git now but the speed is really slow. Last
> time I tried to do this but had to cancel after hours of downloading (at
> only about 20% progress). So the previous tests were actually done with
> 3.5-rc3 (I tried the up-to-date Linus' linux-git, but met compiling
> problem), of course with your patch and Simon's. Could you provide a
> diff based on your last round patch (diff to the not patched kernel
> should also be good, I think)?
>
> In the mean time, I'm still trying with a cloning speed of 5KiB/s ...
>
> Regards
> Cloudy

Hi Phil

This time the machine can't finish the boot again and the console was
flooded by the message like below:

...

MV-DMA: IDMA engine up and running, IRQ 23
MV-DMA: idma_print_and_clear_irq: address miss @0!
MV-DMA: tpg.reg + DMA_CTRL = 0x80001d04
MV-DMA: tpg.reg + DMA_BYTE_COUNT = 0x0
MV-DMA: tpg.reg + DMA_SRC_ADDR = 0x0
MV-DMA: tpg.reg + DMA_DST_ADDR = 0x0
MV-DMA: tpg.reg + DMA_NEXT_DESC = 0x79b1000
MV-DMA: tpg.reg + DMA_CURR_DESC = 0x0
MV-DMA: DMA descriptor list:
MV-DMA: entry 0 at 0xffdbb000: dma addr 0x79b1000, src 0x79b4000, dst
0xf2200080, count 16, own 1, next 0x79b1010
MV-DMA: entry 1 at 0xffdbb010: dma addr 0x79b1010, src 0x799c28c, dst
0xf2200000, count 80, own 1, next 0x79b1020
MV-DMA: entry 2 at 0xffdbb020: dma addr 0x79b1020, src 0x0, dst 0x0,
count 0, own 0, next 0x79b1030
MV-DMA: entry 3 at 0xffdbb030: dma addr 0x79b1030, src 0xf2200080, dst
0x79b4000, count 16, own 1, next 0x0
MV-CESA:got an interrupt but no pending timer?
MV-DMA: idma_print_and_clear_irq: address miss @0!
MV-DMA: tpg.reg + DMA_CTRL = 0x80001d04
MV-DMA: tpg.reg + DMA_BYTE_COUNT = 0x0
MV-DMA: tpg.reg + DMA_SRC_ADDR = 0x0
MV-DMA: tpg.reg + DMA_DST_ADDR = 0x0
MV-DMA: tpg.reg + DMA_NEXT_DESC = 0x0
MV-DMA: tpg.reg + DMA_CURR_DESC = 0x0
MV-DMA: DMA descriptor list:
MV-DMA: entry 0 at 0xffdbb000: dma addr 0x79b1000, src 0x79b4000, dst
0xf2200080, count 16, own 1, next 0x79b1010
MV-DMA: entry 1 at 0xffdbb010: dma addr 0x79b1010, src 0x799c28c, dst
0xf2200000, count 80, own 1, next 0x79b1020
MV-DMA: entry 2 at 0xffdbb020: dma addr 0x79b1020, src 0x0, dst 0x0,
count 0, own 0, next 0x79b1030
MV-DMA: entry 3 at 0xffdbb030: dma addr 0x79b1030, src 0xf2200080, dst
0x79b4000, count 16, own 1, next 0x0
MV-CESA:got an interrupt but no pending timer?
MV-DMA: idma_print_and_clear_irq: address miss @0!
MV-DMA: tpg.reg + DMA_CTRL = 0x80001d04
MV-DMA: tpg.reg + DMA_BYTE_COUNT = 0x0
MV-DMA: tpg.reg + DMA_SRC_ADDR = 0x0
MV-DMA: tpg.reg + DMA_DST_ADDR = 0x0
MV-DMA: tpg.reg + DMA_NEXT_DESC = 0x0
MV-DMA: tpg.reg + DMA_CURR_DESC = 0x0
MV-DMA: DMA descriptor list:
MV-DMA: entry 0 at 0xffdbb000: dma addr 0x79b1000, src 0x79b4000, dst
0xf2200080, count 16, own 1, next 0x79b1010
MV-DMA: entry 1 at 0xffdbb010: dma addr 0x79b1010, src 0x799c28c, dst
0xf2200000, count 80, own 1, next 0x79b1020
MV-DMA: entry 2 at 0xffdbb020: dma addr 0x79b1020, src 0x0, dst 0x0,
count 0, own 0, next 0x79b1030
MV-DMA: entry 3 at 0xffdbb030: dma addr 0x79b1030, src 0xf2200080, dst
0x79b4000, count 16, own 1, next 0x0
MV-CESA:got an interrupt but no pending timer?
MV-DMA: idma_print_and_clear_irq: address miss @0!
MV-DMA: tpg.reg + DMA_CTRL = 0x80001d04
MV-DMA: tpg.reg + DMA_BYTE_COUNT = 0x0
MV-DMA: tpg.reg + DMA_SRC_ADDR = 0x0
MV-DMA: tpg.reg + DMA_DST_ADDR = 0x0
MV-DMA: tpg.reg + DMA_NEXT_DESC = 0x0
MV-DMA: tpg.reg + DMA_CURR_DESC = 0x0
MV-DMA: DMA descriptor list:
MV-DMA: entry 0 at 0xffdbb000: dma addr 0x79b1000, src 0x79b4000, dst
0xf2200080, count 16, own 1, next 0x79b1010
MV-DMA: entry 1 at 0xffdbb010: dma addr 0x79b1010, src 0x799c28c, dst
0xf2200000, count 80, own 1, next 0x79b1020
MV-DMA: entry 2 at 0xffdbb020: dma addr 0x79b1020, src 0x0, dst 0x0,
count 0, own 0, next 0x79b1030
MV-DMA: entry 3 at 0xffdbb030: dma addr 0x79b1030, src 0xf2200080, dst
0x79b4000, count 16, own 1, next 0x0

Also, I had to make some modifications to
arch/arm/mach-orion5x/common.c to get it to compile:
1. Add an include of mv_dma.h
2. Add a macro defining TARGET_SRAM as 9 (it is defined in addr-map.c,
so I think the clean solution would be to move it into addr-map.h?
Anyway, as a quick fix the source finally compiled)

Regards
Cloudy

2012-06-25 21:59:38

by Phil Sutter

[permalink] [raw]
Subject: Re: [PATCH 0/2] Fixes for MV_CESA with IDMA or TDMA

Hi,

On Tue, Jun 26, 2012 at 12:05:55AM +0800, cloudy.linux wrote:
> This time the machine can't finish the boot again and the console was
> flooded by the message like below:

Oh well. I decided to drop that BUG_ON() again, since I saw it once
being triggered while in interrupt context. But since the error is
non-recoverable anyway, I guess it may stay there as well.

> Also, I had to make some modifications to the
> arch/arm/mach-orion5x/common.c to let it compile successfully:
> 1 Add including of mv_dma.h
> 2 Add macro to define TARGET_SRAM as 9 (which is defined in addr-map.c,
> so I think the clean solution should be modify the addr-map.h? Anyway,
> as a quick solution the source finally got compiled)

Hmm, yeah. Test-compiling for the platform one is writing code for is
still a good idea. But it's even worse than that: according to the
specs, for IDMA the SRAM target ID is 5, not 9 like it is for the CPU.

Please apply the attached patch on top of the one I sent earlier,
without your modifications (the necessary parts are contained in it).
Also, I've added some log output to the decode window setter, so we see
what's going on there.

Anyway, thanks a lot for your help so far! I hope next try shows some
progress at least.

Greetings, Phil




Attachments:
mv_dma_fixup.diff (2.15 kB)

2012-06-26 11:25:06

by cloudy.linux

[permalink] [raw]
Subject: Re: [PATCH 0/2] Fixes for MV_CESA with IDMA or TDMA

On 2012-6-26 5:59, Phil Sutter wrote:
> Hi,
>
> On Tue, Jun 26, 2012 at 12:05:55AM +0800, cloudy.linux wrote:
>> This time the machine can't finish the boot again and the console was
>> flooded by the message like below:
>
> Oh well. I decided to drop that BUG_ON() again, since I saw it once
> being triggered while in interrupt context. But since the error is
> non-recovering anyway, I guess it may stay there as well.
>
>> Also, I had to make some modifications to the
>> arch/arm/mach-orion5x/common.c to let it compile successfully:
>> 1 Add including of mv_dma.h
>> 2 Add macro to define TARGET_SRAM as 9 (which is defined in addr-map.c,
>> so I think the clean solution should be modify the addr-map.h? Anyway,
>> as a quick solution the source finally got compiled)
>
> Hmm, yeah. Test-compiling for the platform one is writing code for is
> still a good idea. But it's even worse than that: according to the
> specs, for IDMA the SRAM target ID is 5, not 9 like it is for the CPU.
>
> Please apply the attached patch on top of the one I sent earlier,
> without your modifications (the necessary parts are contained in it).
> Also, I've added some log output to the decode window setter, so we see
> what's going on there.
>
> Anyway, thanks a lot for your help so far! I hope next try shows some
> progress at least.
>
> Greetings, Phil
>

Kernel messages after applying the latest patch:

MV-DMA: window at bar0: target 0, attr 14, base 0, size 8000000
MV-DMA: window at bar1: target 5, attr 0, base f2200000, size 10000
MV-DMA: IDMA engine up and running, IRQ 23
MV-DMA: idma_print_and_clear_irq: address miss @0!
MV-DMA: tpg.reg + DMA_CTRL = 0x80001d04
MV-DMA: tpg.reg + DMA_BYTE_COUNT = 0x0
MV-DMA: tpg.reg + DMA_SRC_ADDR = 0x0
MV-DMA: tpg.reg + DMA_DST_ADDR = 0x0
MV-DMA: tpg.reg + DMA_NEXT_DESC = 0x79b1000
MV-DMA: tpg.reg + DMA_CURR_DESC = 0x0
MV-DMA: DMA descriptor list:
MV-DMA: entry 0 at 0xffdbb000: dma addr 0x79b1000, src 0x79b4000, dst
0xf2200080, count 16, own 1, next 0x79b1010
MV-DMA: entry 1 at 0xffdbb010: dma addr 0x79b1010, src 0x799c28c, dst
0xf2200000, count 80, own 1, next 0x79b1020
MV-DMA: entry 2 at 0xffdbb020: dma addr 0x79b1020, src 0x0, dst 0x0,
count 0, own 0, next 0x79b1030
MV-DMA: entry 3 at 0xffdbb030: dma addr 0x79b1030, src 0xf2200080, dst
0x79b4000, count 16, own 1, next 0x0
MV-CESA:got an interrupt but no pending timer?
------------[ cut here ]------------
kernel BUG at drivers/crypto/mv_cesa.c:1126!
Internal error: Oops - BUG: 0 [#1] ARM
Modules linked in:
CPU: 0 Not tainted (3.5.0-rc4+ #2)
pc : [<c01dfcd0>] lr : [<c0015c20>] psr: 20000093
sp : c79b9e58 ip : c79b9da8 fp : c79b9e6c
r10: c02f4184 r9 : c0308342 r8 : 0000001c
r7 : 00000000 r6 : 00000000 r5 : c03149a4 r4 : 00000002
r3 : c799c200 r2 : 0000de20 r1 : fdd90000 r0 : fdd90000
Flags: nzCv IRQs off FIQs on Mode SVC_32 ISA ARM Segment kernel
Control: a005317f Table: 00004000 DAC: 00000017
Process mv_crypto (pid: 276, stack limit = 0xc79b8270)
Stack: (0xc79b9e58 to 0xc79ba000)
9e40: c79afc40
0000001c
9e60: c79b9ea4 c79b9e70 c0047aa8 c01dfc48 c788929c 4bfb2bc9 00000000
c02f4184
9e80: 0000001c 00000000 c79b9f4c c79bbe18 c02eec18 c02f10a0 c79b9ebc
c79b9ea8
9ea0: c0047c58 c0047a64 00022000 c02f4184 c79b9ed4 c79b9ec0 c0049f60
c0047c38
9ec0: c0049ed8 c0301758 c79b9ee4 c79b9ed8 c00473e4 c0049ee8 c79b9f04
c79b9ee8
9ee0: c000985c c00473c4 c01debf4 c025dfdc a0000013 fdd20200 c79b9f14
c79b9f08
9f00: c0008170 c0009834 c79b9f6c c79b9f18 c0008c14 c0008170 00000000
00000001
9f20: c79b9f60 c78897a0 c03149a4 c799c200 c7936cc0 c79ac780 c79bbe18
c02eec18
9f40: c02f10a0 c79b9f6c c79b9f70 c79b9f60 c01debf4 c025dfdc a0000013
ffffffff
9f60: c79b9fbc c79b9f70 c01debf4 c025dfd0 00000000 c79b9f80 c025dd54
c0035418
9f80: c78897a0 c7827de8 c79b8000 c02eec18 00000013 c7827de8 c799c200
c01dea10
9fa0: 00000013 00000000 00000000 00000000 c79b9ff4 c79b9fc0 c002de90
c01dea20
9fc0: c7827de8 00000000 c799c200 00000000 c79b9fd0 c79b9fd0 00000000
c7827de8
9fe0: c002de00 c001877c 00000000 c79b9ff8 c001877c c002de10 01e6e7fe
01e6e7ff
Backtrace:
Function entered at [<c01dfc38>] from [<c0047aa8>]
r5:0000001c r4:c79afc40
Function entered at [<c0047a54>] from [<c0047c58>]
Function entered at [<c0047c28>] from [<c0049f60>]
r4:c02f4184 r3:00022000
Function entered at [<c0049ed8>] from [<c00473e4>]
r4:c0301758 r3:c0049ed8
Function entered at [<c00473b4>] from [<c000985c>]
Function entered at [<c0009824>] from [<c0008170>]
r6:fdd20200 r5:a0000013 r4:c025dfdc r3:c01debf4
Function entered at [<c0008160>] from [<c0008c14>]
Exception stack(0xc79b9f18 to 0xc79b9f60)
9f00: 00000000
00000001
9f20: c79b9f60 c78897a0 c03149a4 c799c200 c7936cc0 c79ac780 c79bbe18
c02eec18
9f40: c02f10a0 c79b9f6c c79b9f70 c79b9f60 c01debf4 c025dfdc a0000013
ffffffff
Function entered at [<c025dfc0>] from [<c01debf4>]
Function entered at [<c01dea10>] from [<c002de90>]
Function entered at [<c002de00>] from [<c001877c>]
r6:c001877c r5:c002de00 r4:c7827de8
Code: e89da830 e59f000c eb01ec50 eaffffe9 (e7f001f2)
MV-DMA: idma_print_and_clear_irq: address miss @0!
MV-DMA: tpg.reg + DMA_CTRL = 0x80001d04
MV-DMA: tpg.reg + DMA_BYTE_COUNT = 0x0
MV-DMA: tpg.reg + DMA_SRC_ADDR = 0x0
MV-DMA: tpg.reg + DMA_DST_ADDR = 0x0
MV-DMA: tpg.reg + DMA_NEXT_DESC = 0x0
MV-DMA: tpg.reg + DMA_CURR_DESC = 0x0
MV-DMA: DMA descriptor list:
MV-DMA: entry 0 at 0xffdbb000: dma addr 0x79b1000, src 0x79b4000, dst
0xf2200080, count 16, own 1, next 0x79b1010
MV-DMA: entry 1 at 0xffdbb010: dma addr 0x79b1010, src 0x799c28c, dst
0xf2200000, count 80, own 1, next 0x79b1020
MV-DMA: entry 2 at 0xffdbb020: dma addr 0x79b1020, src 0x0, dst 0x0,
count 0, own 0, next 0x79b1030
MV-DMA: entry 3 at 0xffdbb030: dma addr 0x79b1030, src 0xf2200080, dst
0x79b4000, count 16, own 1, next 0x0

2012-06-26 20:32:16

by Simon Baatz

[permalink] [raw]
Subject: [PATCH 1/1] mv_dma: mv_cesa: fixes for clock init

mv_dma tries to access CESA engine registers before the CESA clock is
enabled. Move the clock enable code to the proper position.

Additionally, neither mv_dma nor mv_cesa disabled the clock if
something went wrong during init.

Signed-off-by: Simon Baatz <[email protected]>
---
drivers/crypto/mv_cesa.c | 7 ++++++-
drivers/crypto/mv_dma.c | 44 +++++++++++++++++++++++++++++---------------
2 files changed, 35 insertions(+), 16 deletions(-)

diff --git a/drivers/crypto/mv_cesa.c b/drivers/crypto/mv_cesa.c
index b75fdf5..aa05567 100644
--- a/drivers/crypto/mv_cesa.c
+++ b/drivers/crypto/mv_cesa.c
@@ -1308,7 +1308,8 @@ static int mv_probe(struct platform_device *pdev)
ret = -ENOMEM;
goto err_mapping;
}
- if (set_dma_desclist_size(&cpg->desclist, MV_DMA_INIT_POOLSIZE)) {
+ ret = set_dma_desclist_size(&cpg->desclist, MV_DMA_INIT_POOLSIZE);
+ if (ret) {
printk(KERN_ERR MV_CESA "failed to initialise poolsize\n");
goto err_pool;
}
@@ -1350,6 +1351,10 @@ err_mapping:
dma_unmap_single(&pdev->dev, cpg->sa_sram_dma,
sizeof(struct sec_accel_sram), DMA_TO_DEVICE);
free_irq(irq, cp);
+ if (!IS_ERR(cp->clk)) {
+ clk_disable_unprepare(cp->clk);
+ clk_put(cp->clk);
+ }
err_thread:
kthread_stop(cp->queue_th);
err_unmap_sram:
diff --git a/drivers/crypto/mv_dma.c b/drivers/crypto/mv_dma.c
index dd1ce02..9440fbc 100644
--- a/drivers/crypto/mv_dma.c
+++ b/drivers/crypto/mv_dma.c
@@ -350,23 +350,39 @@ static int mv_init_engine(struct platform_device *pdev, u32 ctrl_init_val,
tpg.dev = &pdev->dev;
tpg.print_and_clear_irq = pc_irq;

+ /* Not all platforms can gate the clock, so it is not
+ an error if the clock does not exists. */
+ tpg.clk = clk_get(&pdev->dev, NULL);
+ if (!IS_ERR(tpg.clk))
+ clk_prepare_enable(tpg.clk);
+
/* setup address decoding */
res = platform_get_resource_byname(pdev,
IORESOURCE_MEM, "regs deco");
- if (!res)
- return -ENXIO;
- if (!(deco = ioremap(res->start, resource_size(res))))
- return -ENOMEM;
+ if (!res) {
+ rc = -ENXIO;
+ goto out_disable_clk;
+ }
+ deco = ioremap(res->start, resource_size(res));
+ if (!deco) {
+ rc = -ENOMEM;
+ goto out_disable_clk;
+ }
setup_mbus_windows(deco, pdev->dev.platform_data, win_setter);
iounmap(deco);

/* get register start address */
res = platform_get_resource_byname(pdev,
IORESOURCE_MEM, "regs control and error");
- if (!res)
- return -ENXIO;
- if (!(tpg.reg = ioremap(res->start, resource_size(res))))
- return -ENOMEM;
+ if (!res) {
+ rc = -ENXIO;
+ goto out_disable_clk;
+ }
+ tpg.reg = ioremap(res->start, resource_size(res));
+ if (!tpg.reg) {
+ rc = -ENOMEM;
+ goto out_disable_clk;
+ }

/* get the IRQ */
tpg.irq = platform_get_irq(pdev, 0);
@@ -375,12 +391,6 @@ static int mv_init_engine(struct platform_device *pdev, u32 ctrl_init_val,
goto out_unmap_reg;
}

- /* Not all platforms can gate the clock, so it is not
- an error if the clock does not exists. */
- tpg.clk = clk_get(&pdev->dev, NULL);
- if (!IS_ERR(tpg.clk))
- clk_prepare_enable(tpg.clk);
-
/* initialise DMA descriptor list */
if (init_dma_desclist(&tpg.desclist, tpg.dev,
sizeof(struct mv_dma_desc), MV_DMA_ALIGN, 0)) {
@@ -421,6 +431,11 @@ out_free_desclist:
fini_dma_desclist(&tpg.desclist);
out_unmap_reg:
iounmap(tpg.reg);
+out_disable_clk:
+ if (!IS_ERR(tpg.clk)) {
+ clk_disable_unprepare(tpg.clk);
+ clk_put(tpg.clk);
+ }
tpg.dev = NULL;
return rc;
}
@@ -517,4 +532,3 @@ module_exit(mv_dma_exit);
MODULE_AUTHOR("Phil Sutter <[email protected]>");
MODULE_DESCRIPTION("Support for Marvell's IDMA/TDMA engines");
MODULE_LICENSE("GPL");
-
--
1.7.9.5

2012-06-26 20:32:05

by Simon Baatz

[permalink] [raw]
Subject: [PATCH 0/1] MV_CESA with DMA: Clk init fixes

Hi Phil,

I just found the time to test your updates. Alas, the mv_dma module
hangs at boot again. The culprit seems to be setup_mbus_windows(),
which is called before the clock is turned on but accesses the DMA
engine.

I shifted the clock init code a bit and, while doing so, fixed some
error-case handling for mv_dma and mv_cesa. See the proposed patch in
the next mail.

- Simon

Simon Baatz (1):
mv_dma: mv_cesa: fixes for clock init

drivers/crypto/mv_cesa.c | 7 ++++++-
drivers/crypto/mv_dma.c | 44 +++++++++++++++++++++++++++++---------------
2 files changed, 35 insertions(+), 16 deletions(-)

--
1.7.9.5

2012-06-30 07:36:08

by cloudy.linux

[permalink] [raw]
Subject: Re: [PATCH 0/2] Fixes for MV_CESA with IDMA or TDMA

Hi Phil

Although I have no idea what's wrong, I looked in the functional
errata (again) and found what's attached (the doc I got from the
Internet was a protected PDF, which is why I had to use a screen
capture). Is this relevant? Or have you perhaps already addressed this
in the code (I can only read some simple C code)?

Regards
Cloudy

On 2012-6-26 5:59, Phil Sutter wrote:
> Hi,
>
> On Tue, Jun 26, 2012 at 12:05:55AM +0800, cloudy.linux wrote:
>> This time the machine can't finish the boot again, and the console
>> was flooded with messages like the one below:
>
> Oh well. I decided to drop that BUG_ON() again, since I saw it once
> being triggered while in interrupt context. But since the error is
> non-recoverable anyway, I guess it may stay there as well.
>
>> Also, I had to make some modifications to
>> arch/arm/mach-orion5x/common.c to let it compile successfully:
>> 1. Add an include of mv_dma.h
>> 2. Add a macro defining TARGET_SRAM as 9 (which is defined in
>> addr-map.c, so I think the clean solution would be to modify
>> addr-map.h? Anyway, as a quick solution the source finally compiled)
>
> Hmm, yeah. Test-compiling for the platform one is writing code for is
> still a good idea. But it's even worse than that: according to the
> specs, for IDMA the SRAM target ID is 5, not 9 like it is for the CPU.
>
> Please apply the attached patch on top of the one I sent earlier,
> without your modifications (the necessary parts are contained in it).
> Also, I've added some log output to the decode window setter, so we see
> what's going on there.
>
> Anyway, thanks a lot for your help so far! I hope next try shows some
> progress at least.
>
> Greetings, Phil
>
>
> Phil Sutter
> Software Engineer
>



Attachments:
gl-cesa-110.PNG (61.45 kB)

2012-07-06 15:06:04

by Phil Sutter

[permalink] [raw]
Subject: Re: [PATCH 0/1] MV_CESA with DMA: Clk init fixes

Hi Simon,

On Tue, Jun 26, 2012 at 10:31:51PM +0200, Simon Baatz wrote:
> I just found the time to test your updates. Alas, the mv_dma module
> hangs at boot again. The culprit seems to be setup_mbus_windows(),
> which is called before the clock is turned on but accesses the DMA
> engine.
>
> I shifted the clock init code a bit and, while doing so, fixed some
> error case handling for mv_dma and mv_cesa. See the proposed patch in
> the next mail.

I applied that to my public git, thanks a lot!

Greetings, Phil


Phil Sutter
Software Engineer


2012-07-06 15:30:27

by Phil Sutter

[permalink] [raw]
Subject: Re: [PATCH 0/2] Fixes for MV_CESA with IDMA or TDMA

Hi Cloudy,

On Sat, Jun 30, 2012 at 03:35:48PM +0800, cloudy.linux wrote:
> Although I have no idea what's wrong, I looked in the functional
> errata (again) and found what's attached (the doc I got from the
> Internet was a protected PDF, which is why I had to use a screen
> capture). Is this relevant? Or have you perhaps already addressed this
> in the code (I can only read some simple C code)?

To me, this doesn't read like a real problem, just a guideline for
doing things. From the output you sent me in your previous mail, I'd
rather suspect fetching of the first descriptor to be faulty: the next
descriptor pointer contains the first descriptor's DMA address, all
other fields are zero (this is the situation when triggering the
engine, as on Kirkwood all I have to do is fill in the first
descriptor's address and TDMA does the rest), and IDMA triggers an
address miss interrupt at address 0x0. So probably IDMA starts up and
tries to look up decoding windows for the still-zero source and
destination addresses.

According to the specs, when using the next descriptor field for
fetching the first descriptor, one also has to set the FETCH_ND field
in the DMA_CTRL register, for TDMA as well. Though, on my hardware the
only working configuration is the implemented one, i.e. without
FETCH_ND being set.

I have implemented a separate approach just for IDMA, which instead of
just writing the first descriptor's address to NEXT_DESC does:
1. clear CTRL_ENABLE bit
2. fill NEXT_DESC
3. set CTRL_ENABLE along with FETCH_ND
Hopefully this is the way to go on Orion; a rough sketch follows
below. Since Marvell's BSP doesn't implement *DMA attached to CESA, I
have nowhere to look this up. Getting it right for TDMA was just a
matter of trial and error.
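
In code, the trigger sequence looks about like this (a minimal sketch
only; DMA_CTRL, DMA_NEXT_DESC, CTRL_ENABLE and FETCH_ND are the
driver's register/bit names, and "first_desc_dma" stands for the first
descriptor's DMA address):

	u32 ctrl = readl(tpg.reg + DMA_CTRL);

	/* 1. clear CTRL_ENABLE bit */
	writel(ctrl & ~CTRL_ENABLE, tpg.reg + DMA_CTRL);
	/* 2. fill NEXT_DESC with the first descriptor's address */
	writel(first_desc_dma, tpg.reg + DMA_NEXT_DESC);
	/* 3. set CTRL_ENABLE along with FETCH_ND */
	writel(ctrl | CTRL_ENABLE | FETCH_ND, tpg.reg + DMA_CTRL);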

My public git got a few updates, including the code described above.
Would be great if you could give it a try.

Greetings, Phil



Phil Sutter
Software Engineer


2012-07-08 05:39:07

by cloudy.linux

[permalink] [raw]
Subject: Re: [PATCH 0/2] Fixes for MV_CESA with IDMA or TDMA

On 2012-7-6 23:30, Phil Sutter wrote:
> Hi Cloudy,
>
> On Sat, Jun 30, 2012 at 03:35:48PM +0800, cloudy.linux wrote:
>> Although I have no idea what's wrong, I looked in the functional
>> errata (again) and found what's attached (the doc I got from the
>> Internet was a protected PDF, which is why I had to use a screen
>> capture). Is this relevant? Or have you perhaps already addressed this
>> in the code (I can only read some simple C code)?
>
> To me, this doesn't read like a real problem, just a guideline for
> doing things. From the output you sent me in your previous mail, I'd
> rather suspect fetching of the first descriptor to be faulty: the next
> descriptor pointer contains the first descriptor's DMA address, all
> other fields are zero (this is the situation when triggering the
> engine, as on Kirkwood all I have to do is fill in the first
> descriptor's address and TDMA does the rest), and IDMA triggers an
> address miss interrupt at address 0x0. So probably IDMA starts up and
> tries to look up decoding windows for the still-zero source and
> destination addresses.
>
> According to the specs, when using the next descriptor field for
> fetching the first descriptor, one also has to set the FETCH_ND field
> in the DMA_CTRL register, for TDMA as well. Though, on my hardware the
> only working configuration is the implemented one, i.e. without
> FETCH_ND being set.
>
> I have implemented a separate approach just for IDMA, which instead of
> just writing the first descriptor's address to NEXT_DESC does:
> 1. clear CTRL_ENABLE bit
> 2. fill NEXT_DESC
> 3. set CTRL_ENABLE along with FETCH_ND
> Hopefully this is the way to go on Orion. Since Marvell's BSP doesn't
> implement *DMA attached to CESA, I have nowhere to look this up. Getting
> it right for TDMA was just a matter of trial and error.
>
> My public git got a few updates, including the code described above.
> Would be great if you could give it a try.
>
> Greetings, Phil
>
>
>
> Phil Sutter
> Software Engineer
>

Hi

Newest result. Still couldn't boot up. This time the source was cloned
from your git repository.

MV-DMA: window at bar0: target 0, attr 14, base 0, size 8000000
MV-DMA: window at bar1: target 5, attr 0, base f2200000, size 10000
MV-DMA: IDMA engine up and running, IRQ 23
MV-DMA: idma_print_and_clear_irq: address miss @0!
MV-DMA: tpg.reg + DMA_CTRL = 0x80001d04
MV-DMA: tpg.reg + DMA_BYTE_COUNT = 0x0
MV-DMA: tpg.reg + DMA_SRC_ADDR = 0x0
MV-DMA: tpg.reg + DMA_DST_ADDR = 0x0
MV-DMA: tpg.reg + DMA_NEXT_DESC = 0x79b1000
MV-DMA: tpg.reg + DMA_CURR_DESC = 0x0
MV-DMA: DMA descriptor list:
MV-DMA: entry 0 at 0xffdbb000: dma addr 0x79b1000, src 0x79b4000, dst 0xf2200080, count 16, own 1, next 0x79b1010
MV-DMA: entry 1 at 0xffdbb010: dma addr 0x79b1010, src 0x799c28c, dst 0xf2200000, count 80, own 1, next 0x79b1020
MV-DMA: entry 2 at 0xffdbb020: dma addr 0x79b1020, src 0x0, dst 0x0, count 0, own 0, next 0x79b1030
MV-DMA: entry 3 at 0xffdbb030: dma addr 0x79b1030, src 0xf2200080, dst 0x79b4000, count 16, own 1, next 0x0
MV-CESA:got an interrupt but no pending timer?
------------[ cut here ]------------
kernel BUG at drivers/crypto/mv_cesa.c:1126!
Internal error: Oops - BUG: 0 [#1] ARM
Modules linked in:
CPU: 0 Not tainted (3.5.0-rc2+ #3)
pc : [<c01df8e0>] lr : [<c0015810>] psr: 20000093
sp : c79b9e58 ip : c79b9da8 fp : c79b9e6c
r10: c02f2164 r9 : c0306322 r8 : 0000001c
r7 : 00000000 r6 : 00000000 r5 : c0312988 r4 : 00000002
r3 : c799c200 r2 : 0000de20 r1 : fdd90000 r0 : fdd90000
Flags: nzCv IRQs off FIQs on Mode SVC_32 ISA ARM Segment kernel
Control: a005317f Table: 00004000 DAC: 00000017
Process mv_crypto (pid: 276, stack limit = 0xc79b8270)
Stack: (0xc79b9e58 to 0xc79ba000)
9e40: c79afc40 0000001c
9e60: c79b9ea4 c79b9e70 c0047694 c01df858 c788929c 4bf287d9 00000000 c02f2164
9e80: 0000001c 00000000 c79b9f4c c79bbe18 c02ecc18 c02ef0a0 c79b9ebc c79b9ea8
9ea0: c0047844 c0047650 00022000 c02f2164 c79b9ed4 c79b9ec0 c0049b4c c0047824
9ec0: c0049ac4 c02ff738 c79b9ee4 c79b9ed8 c0046fd0 c0049ad4 c79b9f04 c79b9ee8
9ee0: c000985c c0046fb0 c01de804 c025db80 a0000013 fdd20200 c79b9f14 c79b9f08
9f00: c0008170 c0009834 c79b9f6c c79b9f18 c0008c14 c0008170 00000000 00000001
9f20: c79b9f60 0000de00 c0312988 c799c200 c7936cc0 c79ac780 c79bbe18 c02ecc18
9f40: c02ef0a0 c79b9f6c c79b9f70 c79b9f60 c01de804 c025db80 a0000013 ffffffff
9f60: c79b9fbc c79b9f70 c01de804 c025db80 00000000 c79b9f80 c025d904 c0035000
9f80: c78897a0 c7827de8 c79b8000 c02ecc18 00000013 c7827de8 c799c200 c01de620
9fa0: 00000013 00000000 00000000 00000000 c79b9ff4 c79b9fc0 c002da78 c01de630
9fc0: c7827de8 00000000 c799c200 00000000 c79b9fd0 c79b9fd0 00000000 c7827de8
9fe0: c002d9e8 c0018354 00000000 c79b9ff8 c0018354 c002d9f8 01e6e7fe 01e6e7ff
Backtrace:
Function entered at [<c01df848>] from [<c0047694>]
r5:0000001c r4:c79afc40
Function entered at [<c0047640>] from [<c0047844>]
Function entered at [<c0047814>] from [<c0049b4c>]
r4:c02f2164 r3:00022000
Function entered at [<c0049ac4>] from [<c0046fd0>]
r4:c02ff738 r3:c0049ac4
Function entered at [<c0046fa0>] from [<c000985c>]
Function entered at [<c0009824>] from [<c0008170>]
r6:fdd20200 r5:a0000013 r4:c025db80 r3:c01de804
Function entered at [<c0008160>] from [<c0008c14>]
Exception stack(0xc79b9f18 to 0xc79b9f60)
9f00: 00000000 00000001
9f20: c79b9f60 0000de00 c0312988 c799c200 c7936cc0 c79ac780 c79bbe18 c02ecc18
9f40: c02ef0a0 c79b9f6c c79b9f70 c79b9f60 c01de804 c025db80 a0000013 ffffffff
Function entered at [<c025db70>] from [<c01de804>]
Function entered at [<c01de620>] from [<c002da78>]
Function entered at [<c002d9e8>] from [<c0018354>]
r6:c0018354 r5:c002d9e8 r4:c7827de8
Code: e89da830 e59f000c eb01ec39 eaffffe9 (e7f001f2)
MV-DMA: idma_print_and_clear_irq: address miss @0!
MV-DMA: tpg.reg + DMA_CTRL = 0x80001d04
MV-DMA: tpg.reg + DMA_BYTE_COUNT = 0x0
MV-DMA: tpg.reg + DMA_SRC_ADDR = 0x0
MV-DMA: tpg.reg + DMA_DST_ADDR = 0x0
MV-DMA: tpg.reg + DMA_NEXT_DESC = 0x0
MV-DMA: tpg.reg + DMA_CURR_DESC = 0x0
MV-DMA: DMA descriptor list:
MV-DMA: entry 0 at 0xffdbb000: dma addr 0x79b1000, src 0x79b4000, dst 0xf2200080, count 16, own 1, next 0x79b1010
MV-DMA: entry 1 at 0xffdbb010: dma addr 0x79b1010, src 0x799c28c, dst 0xf2200000, count 80, own 1, next 0x79b1020
MV-DMA: entry 2 at 0xffdbb020: dma addr 0x79b1020, src 0x0, dst 0x0, count 0, own 0, next 0x79b1030
MV-DMA: entry 3 at 0xffdbb030: dma addr 0x79b1030, src 0xf2200080, dst 0x79b4000, count 16, own 1, next 0x0
MV-DMA: idma_print_and_clear_irq: address miss @0!
MV-DMA: tpg.reg + DMA_CTRL = 0x80001d04
MV-DMA: tpg.reg + DMA_BYTE_COUNT = 0x0
MV-DMA: tpg.reg + DMA_SRC_ADDR = 0x0
MV-DMA: tpg.reg + DMA_DST_ADDR = 0x0
MV-DMA: tpg.reg + DMA_NEXT_DESC = 0x0
MV-DMA: tpg.reg + DMA_CURR_DESC = 0x0
MV-DMA: DMA descriptor list:
MV-DMA: entry 0 at 0xffdbb000: dma addr 0x79b1000, src 0x79b4000, dst 0xf2200080, count 16, own 1, next 0x79b1010
MV-DMA: entry 1 at 0xffdbb010: dma addr 0x79b1010, src 0x799c28c, dst 0xf2200000, count 80, own 1, next 0x79b1020
MV-DMA: entry 2 at 0xffdbb020: dma addr 0x79b1020, src 0x0, dst 0x0, count 0, own 0, next 0x79b1030
MV-DMA: entry 3 at 0xffdbb030: dma addr 0x79b1030, src 0xf2200080, dst 0x79b4000, count 16, own 1, next 0x0

Regards

2012-07-09 12:54:59

by Phil Sutter

[permalink] [raw]
Subject: Re: [PATCH 0/2] Fixes for MV_CESA with IDMA or TDMA

Hi,

On Sun, Jul 08, 2012 at 01:38:47PM +0800, cloudy.linux wrote:
> Newest result. Still couldn't boot up. This time the source was cloned
> from your git repository.
>
> MV-DMA: window at bar0: target 0, attr 14, base 0, size 8000000
> MV-DMA: window at bar1: target 5, attr 0, base f2200000, size 10000
> MV-DMA: IDMA engine up and running, IRQ 23
> MV-DMA: idma_print_and_clear_irq: address miss @0!
> MV-DMA: tpg.reg + DMA_CTRL = 0x80001d04
> MV-DMA: tpg.reg + DMA_BYTE_COUNT = 0x0
> MV-DMA: tpg.reg + DMA_SRC_ADDR = 0x0
> MV-DMA: tpg.reg + DMA_DST_ADDR = 0x0
> MV-DMA: tpg.reg + DMA_NEXT_DESC = 0x79b1000
> MV-DMA: tpg.reg + DMA_CURR_DESC = 0x0
> MV-DMA: DMA descriptor list:
> MV-DMA: entry 0 at 0xffdbb000: dma addr 0x79b1000, src 0x79b4000, dst 0xf2200080, count 16, own 1, next 0x79b1010
> MV-DMA: entry 1 at 0xffdbb010: dma addr 0x79b1010, src 0x799c28c, dst 0xf2200000, count 80, own 1, next 0x79b1020
> MV-DMA: entry 2 at 0xffdbb020: dma addr 0x79b1020, src 0x0, dst 0x0, count 0, own 0, next 0x79b1030
> MV-DMA: entry 3 at 0xffdbb030: dma addr 0x79b1030, src 0xf2200080, dst 0x79b4000, count 16, own 1, next 0x0
> MV-CESA:got an interrupt but no pending timer?

Sucks. What makes me wonder here is that address decoding of address
0x0 actually shouldn't fail, since window 0 includes this address.

For now, I have pushed two new commits to my public git, adding more
debugging output for the decoding window logic and the interrupt case,
as well as a decoding window permission fix and a change from FETCH_ND
to programming the first DMA descriptor's values manually (see the
sketch below).
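
The manual variant boils down to something like this (again just a
sketch; the register names are those from the debug output above, and
"first_desc" is assumed to point at the first entry of the chain, with
fields named as in the descriptor list dumps):

	/* Program the first descriptor's fields by hand instead of
	 * having the engine fetch them via FETCH_ND; the engine then
	 * continues with the chain through DMA_NEXT_DESC. */
	writel(first_desc->src, tpg.reg + DMA_SRC_ADDR);
	writel(first_desc->dst, tpg.reg + DMA_DST_ADDR);
	writel(first_desc->next, tpg.reg + DMA_NEXT_DESC);
	/* the byte count register carries the ownership bit, as seen
	 * in the register dumps above */
	writel(first_desc->count, tpg.reg + DMA_BYTE_COUNT);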

In the long term, I probably should try to get access to some
appropriate hardware myself. This is more of a quiz game than actual
bug tracking.

Greetings, Phil


Phil Sutter
Software Engineer


2012-07-16 09:32:41

by Andrew Lunn

[permalink] [raw]
Subject: Re: [PATCH 0/2] Fixes for MV_CESA with IDMA or TDMA

Hi Cloudy

I've not been following this thread too closely..

Do you have any patches you want included into mainline?

Thanks
Andrew

2012-07-16 13:52:25

by Phil Sutter

[permalink] [raw]
Subject: Re: [PATCH 0/2] Fixes for MV_CESA with IDMA or TDMA

Hey Andrew,

On Mon, Jul 16, 2012 at 11:32:25AM +0200, Andrew Lunn wrote:
> I've not been following this thread too closely..
>
> Do you have any patches you want included into mainline?

No need to fix anything mainline, he's just testing my RFC-state DMA
engine addon to MV_CESA. The current state is that operation fails on
IDMA-based machines due to errors in the hardware configuration that I
have not been able to track down yet. On Kirkwood (i.e. TDMA), the only
hardware I have access to, the same code runs fine.

Anyway, I am not sure why he decided to put you on Cc in the first
place.

Greetings, Phil


Phil Sutter
Software Engineer


2012-07-16 14:03:56

by Andrew Lunn

[permalink] [raw]
Subject: Re: [PATCH 0/2] Fixes for MV_CESA with IDMA or TDMA

On Mon, Jul 16, 2012 at 03:52:16PM +0200, Phil Sutter wrote:
> Hey Andrew,
>
> On Mon, Jul 16, 2012 at 11:32:25AM +0200, Andrew Lunn wrote:
> > I've not been following this thread too closely..
> >
> > Do you have any patches you want included into mainline?
>
> No need to fix anything mainline

O.K. I thought there was a problem with user space using it, some
flushes missing somewhere? Or VM mapping problem? Has that been fixed?

Thanks
Andrew

2012-07-16 14:53:26

by Phil Sutter

[permalink] [raw]
Subject: Re: [PATCH 0/2] Fixes for MV_CESA with IDMA or TDMA

On Mon, Jul 16, 2012 at 04:03:44PM +0200, Andrew Lunn wrote:
> On Mon, Jul 16, 2012 at 03:52:16PM +0200, Phil Sutter wrote:
> > Hey Andrew,
> >
> > On Mon, Jul 16, 2012 at 11:32:25AM +0200, Andrew Lunn wrote:
> > > I've not been following this thread too closely..
> > >
> > > Do you have any patches you want included into mainline?
> >
> > No need to fix anything mainline
>
> O.K. I thought there was a problem with user space using it, some
> flushes missing somewhere? Or VM mapping problem? Has that been fixed?

Hmm, there was some discussion about an issue like that on this list at
the end of February/beginning of March, which was resolved then. On the
other hand, there is an unanswered mail from Cloudy from April 20 about
a failing kernel hash test. Maybe he can elaborate on this?

Greetings, Phil


Phil Sutter
Software Engineer


2012-07-16 17:32:37

by Simon Baatz

[permalink] [raw]
Subject: Re: [PATCH 0/2] Fixes for MV_CESA with IDMA or TDMA

Hi Andrew, Phil,

On Mon, Jul 16, 2012 at 04:53:18PM +0200, Phil Sutter wrote:
> On Mon, Jul 16, 2012 at 04:03:44PM +0200, Andrew Lunn wrote:
> > On Mon, Jul 16, 2012 at 03:52:16PM +0200, Phil Sutter wrote:
> > > Hey Andrew,
> > >
> > > On Mon, Jul 16, 2012 at 11:32:25AM +0200, Andrew Lunn wrote:
> > > > I've not been following this thread too closely..
> > > >
> > > > Do you have any patches you want included into mainline?
> > >
> > > No need to fix anything mainline
> >
> > O.K. I thought there was a problem with user space using it, some
> > flushes missing somewhere? Or VM mapping problem? Has that been fixed?
>
> Hmm, there was some discussion about an issue like that in this list at
> end of February/beginning of March, which was resolved then. On the
> other hand there is an unanswered mail from Cloudy at 20. April about a
> failing kernel hash test. Maybe he can elaborate on this?
>

I think the problem is not in mv_cesa but in
flush_kernel_dcache_page(). I have proposed a fix here:

http://www.spinics.net/lists/arm-kernel/msg176913.html

There was a little bit of discussion on the patch, but it has not
been picked up yet.

- Simon

2012-07-16 18:00:24

by Andrew Lunn

[permalink] [raw]
Subject: Re: [PATCH 0/2] Fixes for MV_CESA with IDMA or TDMA

> I think the problem is not in mv_cesa but in
> flush_kernel_dcache_page(). I have proposed a fix here:
>
> http://www.spinics.net/lists/arm-kernel/msg176913.html
>
> There was a little bit of discussion on the patch, but it has not
> been picked up yet.

Hi Simon

This is core code, not an area I feel comfortable with.

I suggest you repost it, CC: Catalin and Russell.

Andrew

2012-07-31 12:12:19

by cloudy.linux

[permalink] [raw]
Subject: Re: [PATCH 0/2] Fixes for MV_CESA with IDMA or TDMA

Hi Phil

On 2012-7-9 20:54, Phil Sutter wrote:
> Hi,
>
> On Sun, Jul 08, 2012 at 01:38:47PM +0800, cloudy.linux wrote:
>> Newest result. Still couldn't boot up. This time the source was cloned
>> from your git repository.
>>
>> MV-DMA: window at bar0: target 0, attr 14, base 0, size 8000000
>> MV-DMA: window at bar1: target 5, attr 0, base f2200000, size 10000
>> MV-DMA: IDMA engine up and running, IRQ 23
>> MV-DMA: idma_print_and_clear_irq: address miss @0!
>> MV-DMA: tpg.reg + DMA_CTRL = 0x80001d04
>> MV-DMA: tpg.reg + DMA_BYTE_COUNT = 0x0
>> MV-DMA: tpg.reg + DMA_SRC_ADDR = 0x0
>> MV-DMA: tpg.reg + DMA_DST_ADDR = 0x0
>> MV-DMA: tpg.reg + DMA_NEXT_DESC = 0x79b1000
>> MV-DMA: tpg.reg + DMA_CURR_DESC = 0x0
>> MV-DMA: DMA descriptor list:
>> MV-DMA: entry 0 at 0xffdbb000: dma addr 0x79b1000, src 0x79b4000, dst 0xf2200080, count 16, own 1, next 0x79b1010
>> MV-DMA: entry 1 at 0xffdbb010: dma addr 0x79b1010, src 0x799c28c, dst 0xf2200000, count 80, own 1, next 0x79b1020
>> MV-DMA: entry 2 at 0xffdbb020: dma addr 0x79b1020, src 0x0, dst 0x0, count 0, own 0, next 0x79b1030
>> MV-DMA: entry 3 at 0xffdbb030: dma addr 0x79b1030, src 0xf2200080, dst 0x79b4000, count 16, own 1, next 0x0
>> MV-CESA:got an interrupt but no pending timer?
>
> Sucks. What makes me wonder here is that address decoding of address
> 0x0 actually shouldn't fail, since window 0 includes this address.
>
> For now, I have pushed two new commits to my public git, adding more
> debugging output for the decoding window logic and the interrupt case,
> as well as a decoding window permission fix and a change from FETCH_ND
> to programming the first DMA descriptor's values manually.
>
> In the long term, I probably should try to get access to some
> appropriate hardware myself. This is more of a quiz game than actual
> bug tracking.
>
> Greetings, Phil
>
>
> Phil Sutter
> Software Engineer
>

Sorry for taking such a long time to try the latest code. I just came
back from a vacation and needed several days to catch up on sleep.

The latest console output:

MV-DMA: window at bar0: target 0, attr 14, base 0, size 8000000
MV-DMA: idma_set_deco_win: win(0): BAR 0x7ff0000, size 0x0, enable 0xffff, prot 0xc031295c
MV-DMA: window at bar1: target 5, attr 0, base f2200000, size 10000
MV-DMA: idma_set_deco_win: win(1): BAR 0x0, size 0x0, enable 0xffff, prot 0xc031295c
MV-DMA: IDMA engine up and running, IRQ 23
MV-DMA: idma_print_and_clear_irq: cause 0x3, select 0x1, addr 0x79b4000
MV-DMA: idma_print_and_clear_irq: address miss @0!
MV-DMA: tpg.reg + DMA_CTRL = 0x80001d04
MV-DMA: tpg.reg + DMA_BYTE_COUNT = 0x80000010
MV-DMA: tpg.reg + DMA_SRC_ADDR = 0x79b4000
MV-DMA: tpg.reg + DMA_DST_ADDR = 0xf2200080
MV-DMA: tpg.reg + DMA_NEXT_DESC = 0x79b1010
MV-DMA: tpg.reg + DMA_CURR_DESC = 0x0
MV-DMA: DMA descriptor list:
MV-DMA: entry 0 at 0xffdbb000: dma addr 0x79b1000, src 0x79b4000, dst 0xf2200080, count 16, own 1, next 0x79b1010
MV-DMA: entry 1 at 0xffdbb010: dma addr 0x79b1010, src 0x799c28c, dst 0xf2200000, count 80, own 1, next 0x79b1020
MV-DMA: entry 2 at 0xffdbb020: dma addr 0x79b1020, src 0x0, dst 0x0, count 0, own 0, next 0x79b1030
MV-DMA: entry 3 at 0xffdbb030: dma addr 0x79b1030, src 0xf2200080, dst 0x79b4000, count 16, own 1, next 0x0
MV-CESA:got an interrupt but no pending timer?
------------[ cut here ]------------
kernel BUG at drivers/crypto/mv_cesa.c:1126!
Internal error: Oops - BUG: 0 [#1] ARM
Modules linked in:
CPU: 0 Not tainted (3.5.0-rc2+ #4)
pc : [<c01df9d8>] lr : [<c0015810>] psr: 20000093
sp : c79b9e68 ip : c79b9db8 fp : c79b9e7c
r10: c02f2164 r9 : c0306322 r8 : 0000001c
r7 : 00000000 r6 : 00000000 r5 : c0312988 r4 : 00000002
r3 : c799c200 r2 : 0000de20 r1 : fdd90000 r0 : fdd90000
Flags: nzCv IRQs off FIQs on Mode SVC_32 ISA ARM Segment kernel
Control: a005317f Table: 00004000 DAC: 00000017
Process mv_crypto (pid: 276, stack limit = 0xc79b8270)
Stack: (0xc79b9e68 to 0xc79ba000)
9e60: c79afc40 0000001c c79b9eb4 c79b9e80 c0047694 c01df950
9e80: c7824540 00000001 c79b9eac c02f2164 0000001c 00000000 c79b9f5c c79bbe18
9ea0: c02ecc18 c02ef0a0 c79b9ecc c79b9eb8 c0047844 c0047650 00022000 c02f2164
9ec0: c79b9ee4 c79b9ed0 c0049b4c c0047824 c0049ac4 c02ff738 c79b9ef4 c79b9ee8
9ee0: c0046fd0 c0049ad4 c79b9f14 c79b9ef8 c000985c c0046fb0 c01dc330 c01dee08
9f00: a0000013 fdd20200 c79b9f24 c79b9f18 c0008170 c0009834 c79b9fbc c79b9f28
9f20: c0008c14 c0008170 0000009b 00000001 fdd90000 0000de00 c0312988 c799c200
9f40: c7936cc0 c79ac780 c79bbe18 c02ecc18 c02ef0a0 c79b9fbc 00000010 c79b9f70
9f60: c01dc330 c01dee08 a0000013 ffffffff 00000000 c79b9f80 c025d9fc c0035000
9f80: c78897a0 c7827de8 c79b8000 c02ecc18 00000013 c7827de8 c799c200 c01de718
9fa0: 00000013 00000000 00000000 00000000 c79b9ff4 c79b9fc0 c002da78 c01de728
9fc0: c7827de8 00000000 c799c200 00000000 c79b9fd0 c79b9fd0 00000000 c7827de8
9fe0: c002d9e8 c0018354 00000000 c79b9ff8 c0018354 c002d9f8 01e6e7fe 01e6e7ff
Backtrace:
Function entered at [<c01df940>] from [<c0047694>]
r5:0000001c r4:c79afc40
Function entered at [<c0047640>] from [<c0047844>]
Function entered at [<c0047814>] from [<c0049b4c>]
r4:c02f2164 r3:00022000
Function entered at [<c0049ac4>] from [<c0046fd0>]
r4:c02ff738 r3:c0049ac4
Function entered at [<c0046fa0>] from [<c000985c>]
Function entered at [<c0009824>] from [<c0008170>]
r6:fdd20200 r5:a0000013 r4:c01dee08 r3:c01dc330
Function entered at [<c0008160>] from [<c0008c14>]
Exception stack(0xc79b9f28 to 0xc79b9f70)
9f20: 0000009b 00000001 fdd90000 0000de00 c0312988 c799c200
9f40: c7936cc0 c79ac780 c79bbe18 c02ecc18 c02ef0a0 c79b9fbc 00000010 c79b9f70
9f60: c01dc330 c01dee08 a0000013 ffffffff
Function entered at [<c01de718>] from [<c002da78>]
Function entered at [<c002d9e8>] from [<c0018354>]
r6:c0018354 r5:c002d9e8 r4:c7827de8
Code: e89da830 e59f000c eb01ec39 eaffffe9 (e7f001f2)
MV-DMA: idma_print_and_clear_irq: cause 0x3, select 0x1, addr 0x0
MV-DMA: idma_print_and_clear_irq: address miss @0!
MV-DMA: tpg.reg + DMA_CTRL = 0x80001d04
MV-DMA: tpg.reg + DMA_BYTE_COUNT = 0x0
MV-DMA: tpg.reg + DMA_SRC_ADDR = 0x0
MV-DMA: tpg.reg + DMA_DST_ADDR = 0x0
MV-DMA: tpg.reg + DMA_NEXT_DESC = 0x0
MV-DMA: tpg.reg + DMA_CURR_DESC = 0x0
MV-DMA: DMA descriptor list:
MV-DMA: entry 0 at 0xffdbb000: dma addr 0x79b1000, src 0x79b4000, dst 0xf2200080, count 16, own 1, next 0x79b1010
MV-DMA: entry 1 at 0xffdbb010: dma addr 0x79b1010, src 0x799c28c, dst 0xf2200000, count 80, own 1, next 0x79b1020
MV-DMA: entry 2 at 0xffdbb020: dma addr 0x79b1020, src 0x0, dst 0x0, count 0, own 0, next 0x79b1030
MV-DMA: entry 3 at 0xffdbb030: dma addr 0x79b1030, src 0xf2200080, dst 0x79b4000, count 16, own 1, next 0x0
MV-DMA: idma_print_and_clear_irq: cause 0x3, select 0x1, addr 0x0
MV-DMA: idma_print_and_clear_irq: address miss @0!
MV-DMA: tpg.reg + DMA_CTRL = 0x80001d04
MV-DMA: tpg.reg + DMA_BYTE_COUNT = 0x0
MV-DMA: tpg.reg + DMA_SRC_ADDR = 0x0
MV-DMA: tpg.reg + DMA_DST_ADDR = 0x0
MV-DMA: tpg.reg + DMA_NEXT_DESC = 0x0
MV-DMA: tpg.reg + DMA_CURR_DESC = 0x0
MV-DMA: DMA descriptor list:
MV-DMA: entry 0 at 0xffdbb000: dma addr 0x79b1000, src 0x79b4000, dst 0xf2200080, count 16, own 1, next 0x79b1010
MV-DMA: entry 1 at 0xffdbb010: dma addr 0x79b1010, src 0x799c28c, dst 0xf2200000, count 80, own 1, next 0x79b1020
MV-DMA: entry 2 at 0xffdbb020: dma addr 0x79b1020, src 0x0, dst 0x0, count 0, own 0, next 0x79b1030
MV-DMA: entry 3 at 0xffdbb030: dma addr 0x79b1030, src 0xf2200080, dst 0x79b4000, count 16, own 1, next 0x0
MV-DMA: idma_print_and_clear_irq: cause 0x3, select 0x1, addr 0x0
MV-DMA: idma_print_and_clear_irq: address miss @0!
MV-DMA: tpg.reg + DMA_CTRL = 0x80001d04
MV-DMA: tpg.reg + DMA_BYTE_COUNT = 0x0
MV-DMA: tpg.reg + DMA_SRC_ADDR = 0x0
MV-DMA: tpg.reg + DMA_DST_ADDR = 0x0
MV-DMA: tpg.reg + DMA_NEXT_DESC = 0x0
MV-DMA: tpg.reg + DMA_CURR_DESC = 0x0
...

Best Regards
Cloudy

2012-10-23 17:18:36

by Phil Sutter

[permalink] [raw]
Subject: Re: [PATCH 0/2] Fixes for MV_CESA with IDMA or TDMA

Hey,

On Tue, Jul 31, 2012 at 08:12:02PM +0800, cloudy.linux wrote:
> Sorry for taking such a long time to try the latest code. I just came
> back from a vacation and needed several days to catch up on sleep.

My apologies for the ~3 month lag. Somehow I had totally forgotten
about your mail in my inbox and only recently found it again.

Luckily, I received testing hardware from a colleague a few days ago on
which I can reproduce the problems at hand. So for now, I can do the
testing on my own. Thanks a lot for yours!

I'll get back to you (probably via linux-crypto) as soon as I have some
useful progress. It could be that I have to implement bigger changes to
the code flow for Orion, as IDMA seems to lack the "Enhanced Software
Flow" functionality (as the Kirkwood datasheet calls it) that I am
relying upon in my current code. We will see.

Best wishes, Phil