2016-06-01 08:56:01

by Tero Kristo

Subject: [PATCH 00/28] crypto: omap fixes / support additions

Hi,

This series adds support for crypto hardware accelerators on TI DRA7xx
and AM43xx SoCs, and fixes a number of bugs in the existing codebase.
This series also addresses performance issues with the AES / SHA
accelerators, doing some optimizations on these.

Patches #7 and #13 change the generic crypto API implementation.
Without #7, omap-sham export/import does not work; #13 is an
optional performance improvement.

Patches 16+ should be picked-up / acked by Tony, but they have
dependencies on the preceding patches; at least the AES dual core
support must be in before applying the rest, otherwise bad things
will happen.

-Tero


2016-06-01 08:57:42

by Tero Kristo

Subject: [PATCH 04/28] crypto: omap: do not call dmaengine_terminate_all

From: Lokesh Vutla <[email protected]>

The extra call to dmaengine_terminate_all is not needed, as the DMA
is not running at this point. This improves performance slightly.

Signed-off-by: Lokesh Vutla <[email protected]>
Signed-off-by: Tero Kristo <[email protected]>
---
drivers/crypto/omap-aes.c | 2 --
drivers/crypto/omap-sham.c | 1 -
2 files changed, 3 deletions(-)

diff --git a/drivers/crypto/omap-aes.c b/drivers/crypto/omap-aes.c
index 4a0e6a5..8178632 100644
--- a/drivers/crypto/omap-aes.c
+++ b/drivers/crypto/omap-aes.c
@@ -528,8 +528,6 @@ static int omap_aes_crypt_dma_stop(struct omap_aes_dev *dd)

omap_aes_dma_stop(dd);

- dmaengine_terminate_all(dd->dma_lch_in);
- dmaengine_terminate_all(dd->dma_lch_out);

return 0;
}
diff --git a/drivers/crypto/omap-sham.c b/drivers/crypto/omap-sham.c
index 71eac5c..287bc43 100644
--- a/drivers/crypto/omap-sham.c
+++ b/drivers/crypto/omap-sham.c
@@ -805,7 +805,6 @@ static int omap_sham_update_dma_stop(struct omap_sham_dev *dd)
{
struct omap_sham_reqctx *ctx = ahash_request_ctx(dd->req);

- dmaengine_terminate_all(dd->dma_lch);

if (ctx->flags & BIT(FLAGS_SG)) {
dma_unmap_sg(dd->dev, ctx->sg, 1, DMA_TO_DEVICE);
--
1.9.1

2016-06-01 08:57:42

by Tero Kristo

Subject: [PATCH 03/28] crypto: omap-sham: change queue size from 1 to 10

Change crypto queue size from 1 to 10 for omap SHA driver. This should
allow clients to enqueue requests more effectively to avoid serializing
whole crypto sequences, giving extra performance.

Signed-off-by: Tero Kristo <[email protected]>
---
drivers/crypto/omap-sham.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/crypto/omap-sham.c b/drivers/crypto/omap-sham.c
index bd0258f..71eac5c 100644
--- a/drivers/crypto/omap-sham.c
+++ b/drivers/crypto/omap-sham.c
@@ -173,7 +173,7 @@ struct omap_sham_ctx {
struct omap_sham_hmac_ctx base[0];
};

-#define OMAP_SHAM_QUEUE_LENGTH 1
+#define OMAP_SHAM_QUEUE_LENGTH 10

struct omap_sham_algs_info {
struct ahash_alg *algs_list;
--
1.9.1

2016-06-01 08:57:44

by Tero Kristo

Subject: [PATCH 02/28] crypto: omap-sham: Don't idle/start SHA device between Encrypt operations

From: Lokesh Vutla <[email protected]>

Calling the runtime PM API for every block causes a serious performance
hit for crypto operations on a long buffer. As crypto is performed on
page boundaries, processing a large buffer is split into a series of
per-page crypto operations, and the runtime PM API is called once for
each of them.

Call pm_runtime_get_sync only at the beginning of the session (cra_init)
and pm_runtime_put at the end. This results in up to a 50% speedup.
This does not make the driver keep the system awake, as runtime get/put
is only called during a crypto session, which usually completes quickly.

Signed-off-by: Lokesh Vutla <[email protected]>
Signed-off-by: Tero Kristo <[email protected]>
---
drivers/crypto/omap-sham.c | 27 +++++++++++++++++----------
1 file changed, 17 insertions(+), 10 deletions(-)

diff --git a/drivers/crypto/omap-sham.c b/drivers/crypto/omap-sham.c
index 6eefaa2..bd0258f 100644
--- a/drivers/crypto/omap-sham.c
+++ b/drivers/crypto/omap-sham.c
@@ -360,14 +360,6 @@ static void omap_sham_copy_ready_hash(struct ahash_request *req)

static int omap_sham_hw_init(struct omap_sham_dev *dd)
{
- int err;
-
- err = pm_runtime_get_sync(dd->dev);
- if (err < 0) {
- dev_err(dd->dev, "failed to get sync: %d\n", err);
- return err;
- }
-
if (!test_bit(FLAGS_INIT, &dd->flags)) {
set_bit(FLAGS_INIT, &dd->flags);
dd->err = 0;
@@ -999,8 +991,6 @@ static void omap_sham_finish_req(struct ahash_request *req, int err)
dd->flags &= ~(BIT(FLAGS_BUSY) | BIT(FLAGS_FINAL) | BIT(FLAGS_CPU) |
BIT(FLAGS_DMA_READY) | BIT(FLAGS_OUTPUT_READY));

- pm_runtime_put(dd->dev);
-
if (req->base.complete)
req->base.complete(&req->base, err);

@@ -1239,6 +1229,7 @@ static int omap_sham_cra_init_alg(struct crypto_tfm *tfm, const char *alg_base)
{
struct omap_sham_ctx *tctx = crypto_tfm_ctx(tfm);
const char *alg_name = crypto_tfm_alg_name(tfm);
+ struct omap_sham_dev *dd;

/* Allocate a fallback and abort if it failed. */
tctx->fallback = crypto_alloc_shash(alg_name, 0,
@@ -1266,6 +1257,13 @@ static int omap_sham_cra_init_alg(struct crypto_tfm *tfm, const char *alg_base)

}

+ spin_lock_bh(&sham.lock);
+ list_for_each_entry(dd, &sham.dev_list, list) {
+ break;
+ }
+ spin_unlock_bh(&sham.lock);
+
+ pm_runtime_get_sync(dd->dev);
return 0;
}

@@ -1307,6 +1305,7 @@ static int omap_sham_cra_sha512_init(struct crypto_tfm *tfm)
static void omap_sham_cra_exit(struct crypto_tfm *tfm)
{
struct omap_sham_ctx *tctx = crypto_tfm_ctx(tfm);
+ struct omap_sham_dev *dd;

crypto_free_shash(tctx->fallback);
tctx->fallback = NULL;
@@ -1315,6 +1314,14 @@ static void omap_sham_cra_exit(struct crypto_tfm *tfm)
struct omap_sham_hmac_ctx *bctx = tctx->base;
crypto_free_shash(bctx->shash);
}
+
+ spin_lock_bh(&sham.lock);
+ list_for_each_entry(dd, &sham.dev_list, list) {
+ break;
+ }
+ spin_unlock_bh(&sham.lock);
+
+ pm_runtime_get_sync(dd->dev);
}

static struct ahash_alg algs_sha1_md5[] = {
--
1.9.1
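
The cost model behind patch 02/28 can be sketched in plain userspace C.
This is only an illustration under stated assumptions: struct fake_dev,
fake_pm_get() and fake_pm_put() are hypothetical stand-ins for a device
and the runtime PM calls, not kernel APIs. The sketch follows the commit
message, which describes a put at the end of the session; note that the
omap_sham_cra_exit() hunk as posted calls pm_runtime_get_sync() there.

```c
#include <assert.h>

/* Hypothetical device with a runtime-PM style usage counter. */
struct fake_dev {
	int usage;     /* current reference count */
	int pm_calls;  /* total get/put calls, a rough proxy for overhead */
};

static void fake_pm_get(struct fake_dev *d)
{
	d->usage++;
	d->pm_calls++;
}

static void fake_pm_put(struct fake_dev *d)
{
	assert(d->usage > 0);  /* a put must always pair with an earlier get */
	d->usage--;
	d->pm_calls++;
}

/* Old scheme: take and release the reference around every block. */
static void hash_per_block(struct fake_dev *d, int blocks)
{
	for (int i = 0; i < blocks; i++) {
		fake_pm_get(d);
		/* ... one block would be processed here ... */
		fake_pm_put(d);
	}
}

/* New scheme: one get when the session starts, one put when it ends. */
static void hash_per_session(struct fake_dev *d, int blocks)
{
	fake_pm_get(d);
	for (int i = 0; i < blocks; i++) {
		/* ... one block would be processed here ... */
	}
	fake_pm_put(d);
}
```

In this model the number of PM calls no longer grows with the number of
blocks, and the usage counter returns to zero in both schemes.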

2016-06-01 08:56:02

by Tero Kristo

Subject: [PATCH 01/28] crypto: omap-aes: Fix registration of algorithms

From: Lokesh Vutla <[email protected]>

Algorithms can be registered only once. So skip registration of
algorithms if already registered (i.e. in case we have two AES cores
in the system.)

Signed-off-by: Lokesh Vutla <[email protected]>
Signed-off-by: Tero Kristo <[email protected]>
---
drivers/crypto/omap-aes.c | 18 ++++++++++--------
1 file changed, 10 insertions(+), 8 deletions(-)

diff --git a/drivers/crypto/omap-aes.c b/drivers/crypto/omap-aes.c
index ce174d3..4a0e6a5 100644
--- a/drivers/crypto/omap-aes.c
+++ b/drivers/crypto/omap-aes.c
@@ -1185,17 +1185,19 @@ static int omap_aes_probe(struct platform_device *pdev)
spin_unlock(&list_lock);

for (i = 0; i < dd->pdata->algs_info_size; i++) {
- for (j = 0; j < dd->pdata->algs_info[i].size; j++) {
- algp = &dd->pdata->algs_info[i].algs_list[j];
+ if (!dd->pdata->algs_info[i].registered) {
+ for (j = 0; j < dd->pdata->algs_info[i].size; j++) {
+ algp = &dd->pdata->algs_info[i].algs_list[j];

- pr_debug("reg alg: %s\n", algp->cra_name);
- INIT_LIST_HEAD(&algp->cra_list);
+ pr_debug("reg alg: %s\n", algp->cra_name);
+ INIT_LIST_HEAD(&algp->cra_list);

- err = crypto_register_alg(algp);
- if (err)
- goto err_algs;
+ err = crypto_register_alg(algp);
+ if (err)
+ goto err_algs;

- dd->pdata->algs_info[i].registered++;
+ dd->pdata->algs_info[i].registered++;
+ }
}
}

--
1.9.1
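
The guard added by patch 01/28 can be illustrated with a small userspace
sketch. It assumes a hypothetical struct algs_info shared by all cores
and a fake_register_alg() stand-in for crypto_register_alg(); neither is
the driver's real data structure or the kernel API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one group of algorithms shared by all cores. */
struct algs_info {
	size_t size;        /* number of algorithms in the group */
	size_t registered;  /* how many of them have been registered */
};

static int register_calls;  /* counts calls to the fake registration hook */

/* Stand-in for crypto_register_alg(); always succeeds in this sketch. */
static int fake_register_alg(void)
{
	register_calls++;
	return 0;
}

/* Called once per probed core; skips groups an earlier core registered. */
static int probe_core(struct algs_info *infos, size_t n)
{
	assert(infos != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (infos[i].registered)
			continue;  /* already registered by another core */

		for (size_t j = 0; j < infos[i].size; j++) {
			int err = fake_register_alg();

			if (err)
				return err;
			infos[i].registered++;
		}
	}
	return 0;
}
```

Probing a second core with the same table leaves the registration count
unchanged, which is the behaviour the commit message asks for.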

2016-06-01 08:57:46

by Tero Kristo

Subject: [PATCH 05/28] crypto: omap-sham: set sw fallback to 240 bytes

From: Bin Liu <[email protected]>

Use the software fallback for small crypto requests (under 240 bytes).
For these, DMA is undesirable, as setting it up is a rather heavy
operation in itself. This gives about 40% extra performance in the IPsec
use case.

Signed-off-by: Bin Liu <[email protected]>
[[email protected]: dropped the extra traces, updated some comments
on the code]
Signed-off-by: Tero Kristo <[email protected]>
---
drivers/crypto/omap-sham.c | 12 ++++++++----
1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/drivers/crypto/omap-sham.c b/drivers/crypto/omap-sham.c
index 287bc43..a5c823b 100644
--- a/drivers/crypto/omap-sham.c
+++ b/drivers/crypto/omap-sham.c
@@ -1082,7 +1082,7 @@ static int omap_sham_update(struct ahash_request *req)
ctx->offset = 0;

if (ctx->flags & BIT(FLAGS_FINUP)) {
- if ((ctx->digcnt + ctx->bufcnt + ctx->total) < 9) {
+ if ((ctx->digcnt + ctx->bufcnt + ctx->total) < 240) {
/*
* OMAP HW accel works only with buffers >= 9
* will switch to bypass in final()
@@ -1138,9 +1138,13 @@ static int omap_sham_final(struct ahash_request *req)
if (ctx->flags & BIT(FLAGS_ERROR))
return 0; /* uncompleted hash is not needed */

- /* OMAP HW accel works only with buffers >= 9 */
- /* HMAC is always >= 9 because ipad == block size */
- if ((ctx->digcnt + ctx->bufcnt) < 9)
+ /*
+ * OMAP HW accel works only with buffers >= 9.
+ * HMAC is always >= 9 because ipad == block size.
+ * If buffersize is less than 240, we use fallback SW encoding,
+ * as using DMA + HW in this case doesn't provide any benefit.
+ */
+ if ((ctx->digcnt + ctx->bufcnt) < 240)
return omap_sham_final_shash(req);
else if (ctx->bufcnt)
return omap_sham_enqueue(req, OP_FINAL);
--
1.9.1
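
The size-based dispatch in patch 05/28 boils down to a threshold check.
The sketch below is a userspace illustration only: choose_path() and
enum hash_path are hypothetical names, and only the 240-byte value and
the "digcnt + bufcnt" shape of the test come from the diff.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SW_FALLBACK_THRESHOLD 240  /* bytes; value taken from the patch */

enum hash_path { PATH_SOFTWARE, PATH_HARDWARE };

/* Sums the already-digested and still-buffered byte counts, then picks a path. */
static enum hash_path choose_path(size_t digcnt, size_t bufcnt)
{
	assert(bufcnt <= SIZE_MAX - digcnt);  /* guard the addition below */

	if (digcnt + bufcnt < SW_FALLBACK_THRESHOLD)
		return PATH_SOFTWARE;  /* DMA setup would cost more than it saves */
	return PATH_HARDWARE;
}
```

A request of exactly 240 bytes still goes to the hardware, because the
comparison in the patch is a strict "less than".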

2016-06-01 08:57:53

by Tero Kristo

Subject: [PATCH 09/28] crypto: omap-des: Fix support for unequal lengths

From: Lokesh Vutla <[email protected]>

When the total length of the input SG list differs from the length of
the input data to encrypt, the omap-des driver crashes. This happens
when IPsec tries to use the omap-des driver.

To avoid this, copy all the pages from the input SG list into a
contiguous buffer and prepare a single-element SG list for this buffer,
with its length set to the total number of bytes to crypt. This is
similar to what is already done for unaligned lengths.

Signed-off-by: Lokesh Vutla <[email protected]>
Tested-by: Aparna Balasubramanian <[email protected]>
Signed-off-by: Tero Kristo <[email protected]>
---
drivers/crypto/omap-des.c | 27 +++++++++++++++++----------
1 file changed, 17 insertions(+), 10 deletions(-)

diff --git a/drivers/crypto/omap-des.c b/drivers/crypto/omap-des.c
index 3eedb03..e4c87bc 100644
--- a/drivers/crypto/omap-des.c
+++ b/drivers/crypto/omap-des.c
@@ -521,29 +521,36 @@ static int omap_des_crypt_dma_stop(struct omap_des_dev *dd)
return 0;
}

-static int omap_des_copy_needed(struct scatterlist *sg)
+static int omap_des_copy_needed(struct scatterlist *sg, int total)
{
+ int len = 0;
+
+ if (!IS_ALIGNED(total, DES_BLOCK_SIZE))
+ return -1;
+
while (sg) {
if (!IS_ALIGNED(sg->offset, 4))
return -1;
if (!IS_ALIGNED(sg->length, DES_BLOCK_SIZE))
return -1;
+
+ len += sg->length;
sg = sg_next(sg);
}
+
+ if (len != total)
+ return -1;
+
return 0;
}

static int omap_des_copy_sgs(struct omap_des_dev *dd)
{
void *buf_in, *buf_out;
- int pages;
-
- pages = dd->total >> PAGE_SHIFT;
-
- if (dd->total & (PAGE_SIZE-1))
- pages++;
+ int pages, total;

- BUG_ON(!pages);
+ total = ALIGN(dd->total, DES_BLOCK_SIZE);
+ pages = get_order(total);

buf_in = (void *)__get_free_pages(GFP_ATOMIC, pages);
buf_out = (void *)__get_free_pages(GFP_ATOMIC, pages);
@@ -595,8 +602,8 @@ static int omap_des_prepare_req(struct crypto_engine *engine,
dd->in_sg = req->src;
dd->out_sg = req->dst;

- if (omap_des_copy_needed(dd->in_sg) ||
- omap_des_copy_needed(dd->out_sg)) {
+ if (omap_des_copy_needed(dd->in_sg, dd->total) ||
+ omap_des_copy_needed(dd->out_sg, dd->total)) {
if (omap_des_copy_sgs(dd))
pr_err("Failed to copy SGs for unaligned cases\n");
dd->sgs_copied = 1;
--
1.9.1
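
The extended check in patch 09/28 can be modelled outside the kernel.
The sketch below assumes a hypothetical flat array of struct seg entries
in place of a real scatterlist, and returns 1 rather than -1 when a copy
is needed; the three per-entry and per-request conditions mirror the
diff.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_BLOCK_SIZE 8  /* DES block size in bytes */

/* Hypothetical flat stand-in for one scatterlist entry. */
struct seg {
	size_t offset;
	size_t length;
};

/* Returns 1 when the data must be bounced through a contiguous buffer. */
static int copy_needed(const struct seg *segs, size_t nsegs, size_t total)
{
	size_t len = 0;

	assert(segs != NULL || nsegs == 0);

	if (total % SKETCH_BLOCK_SIZE)
		return 1;  /* request length is not a whole number of blocks */

	for (size_t i = 0; i < nsegs; i++) {
		if (segs[i].offset % 4)
			return 1;  /* entry is not 4-byte aligned */
		if (segs[i].length % SKETCH_BLOCK_SIZE)
			return 1;  /* entry is not a whole number of blocks */
		len += segs[i].length;
	}

	/* New condition: the list length must match the request length. */
	return len != total;
}
```

The last comparison is the case the commit message describes: a list
whose entries are individually aligned but which is longer or shorter
than the data to crypt.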

2016-06-01 08:57:48

by Tero Kristo

Subject: [PATCH 06/28] crypto: omap-sham: avoid executing tasklet where not needed

Some of the call paths of OMAP SHA driver can avoid executing the next
step of the crypto queue under tasklet; instead, execute the next step
directly via function call. This avoids a costly round-trip via the
scheduler giving a slight performance boost.

Signed-off-by: Tero Kristo <[email protected]>
---
drivers/crypto/omap-sham.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/crypto/omap-sham.c b/drivers/crypto/omap-sham.c
index a5c823b..34ebe1d 100644
--- a/drivers/crypto/omap-sham.c
+++ b/drivers/crypto/omap-sham.c
@@ -240,6 +240,8 @@ static struct omap_sham_drv sham = {
.lock = __SPIN_LOCK_UNLOCKED(sham.lock),
};

+static void omap_sham_done_task(unsigned long data);
+
static inline u32 omap_sham_read(struct omap_sham_dev *dd, u32 offset)
{
return __raw_readl(dd->io_base + offset);
@@ -994,7 +996,7 @@ static void omap_sham_finish_req(struct ahash_request *req, int err)
req->base.complete(&req->base, err);

/* handle new request */
- tasklet_schedule(&dd->done_task);
+ omap_sham_done_task((unsigned long)dd);
}

static int omap_sham_handle_queue(struct omap_sham_dev *dd,
--
1.9.1

2016-06-01 08:57:58

by Tero Kristo

Subject: [PATCH 11/28] crypto: omap-aes: Add support for multiple cores

From: Lokesh Vutla <[email protected]>

Some SoCs like omap4/omap5/dra7 contain multiple AES crypto accelerator
cores. Adapt the driver to support this. The driver picks the least
recently used device from the list of AES devices.

Signed-off-by: Lokesh Vutla <[email protected]>
[[email protected]: forward ported to 4.7 kernel]
Signed-off-by: Tero Kristo <[email protected]>
---
drivers/crypto/omap-aes.c | 20 ++++++--------------
1 file changed, 6 insertions(+), 14 deletions(-)

diff --git a/drivers/crypto/omap-aes.c b/drivers/crypto/omap-aes.c
index cf53d3f..f710602 100644
--- a/drivers/crypto/omap-aes.c
+++ b/drivers/crypto/omap-aes.c
@@ -319,20 +319,12 @@ static void omap_aes_dma_stop(struct omap_aes_dev *dd)

static struct omap_aes_dev *omap_aes_find_dev(struct omap_aes_ctx *ctx)
{
- struct omap_aes_dev *dd = NULL, *tmp;
+ struct omap_aes_dev *dd;

spin_lock_bh(&list_lock);
- if (!ctx->dd) {
- list_for_each_entry(tmp, &dev_list, list) {
- /* FIXME: take fist available aes core */
- dd = tmp;
- break;
- }
- ctx->dd = dd;
- } else {
- /* already found before */
- dd = ctx->dd;
- }
+ dd = list_first_entry(&dev_list, struct omap_aes_dev, list);
+ list_move_tail(&dd->list, &dev_list);
+ ctx->dd = dd;
spin_unlock_bh(&list_lock);

return dd;
@@ -600,7 +592,7 @@ static int omap_aes_prepare_req(struct crypto_engine *engine,
{
struct omap_aes_ctx *ctx = crypto_ablkcipher_ctx(
crypto_ablkcipher_reqtfm(req));
- struct omap_aes_dev *dd = omap_aes_find_dev(ctx);
+ struct omap_aes_dev *dd = ctx->dd;
struct omap_aes_reqctx *rctx;
int len;

@@ -644,7 +636,7 @@ static int omap_aes_crypt_req(struct crypto_engine *engine,
{
struct omap_aes_ctx *ctx = crypto_ablkcipher_ctx(
crypto_ablkcipher_reqtfm(req));
- struct omap_aes_dev *dd = omap_aes_find_dev(ctx);
+ struct omap_aes_dev *dd = ctx->dd;

if (!dd)
return -ENODEV;
--
1.9.1
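
The selection in omap_aes_find_dev() after patch 11/28 takes the head of
the device list and moves it to the tail, which rotates through the
cores. A minimal userspace sketch of that rotation follows; struct
core_list and pick_core() are hypothetical, and an array replaces the
kernel's linked list.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CORES 4

/* Hypothetical device list, kept as a small array in use order. */
struct core_list {
	int ids[MAX_CORES];
	size_t count;
};

/* Take the head entry and move it to the tail, like list_move_tail(). */
static int pick_core(struct core_list *l)
{
	int picked;

	assert(l->count > 0 && l->count <= MAX_CORES);

	picked = l->ids[0];
	for (size_t i = 1; i < l->count; i++)
		l->ids[i - 1] = l->ids[i];  /* shift the rest forward */
	l->ids[l->count - 1] = picked;      /* head becomes the tail */

	return picked;
}
```

With two cores, successive requests alternate between them. The real
driver does this under list_lock; the sketch leaves locking out.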

2016-06-01 08:57:53

by Tero Kristo

Subject: [PATCH 10/28] crypto: omap-aes - Fix enabling clocks

From: Lokesh Vutla <[email protected]>

Enable clocks for all cores before starting a session.
The driver has to pick the AES core dynamically based on the queue length.

Signed-off-by: Lokesh Vutla <[email protected]>
---
drivers/crypto/omap-aes.c | 23 +++++++----------------
1 file changed, 7 insertions(+), 16 deletions(-)

diff --git a/drivers/crypto/omap-aes.c b/drivers/crypto/omap-aes.c
index 8178632..cf53d3f 100644
--- a/drivers/crypto/omap-aes.c
+++ b/drivers/crypto/omap-aes.c
@@ -760,18 +760,13 @@ static int omap_aes_cra_init(struct crypto_tfm *tfm)
struct omap_aes_dev *dd = NULL;
int err;

- /* Find AES device, currently picks the first device */
- spin_lock_bh(&list_lock);
list_for_each_entry(dd, &dev_list, list) {
- break;
- }
- spin_unlock_bh(&list_lock);
-
- err = pm_runtime_get_sync(dd->dev);
- if (err < 0) {
- dev_err(dd->dev, "%s: failed to get_sync(%d)\n",
- __func__, err);
- return err;
+ err = pm_runtime_get_sync(dd->dev);
+ if (err < 0) {
+ dev_err(dd->dev, "%s: failed to get_sync(%d)\n",
+ __func__, err);
+ return err;
+ }
}

tfm->crt_ablkcipher.reqsize = sizeof(struct omap_aes_reqctx);
@@ -783,14 +778,10 @@ static void omap_aes_cra_exit(struct crypto_tfm *tfm)
{
struct omap_aes_dev *dd = NULL;

- /* Find AES device, currently picks the first device */
- spin_lock_bh(&list_lock);
list_for_each_entry(dd, &dev_list, list) {
- break;
+ pm_runtime_put_sync(dd->dev);
}
- spin_unlock_bh(&list_lock);

- pm_runtime_put_sync(dd->dev);
}

/* ********************** ALGS ************************************ */
--
1.9.1

2016-06-01 08:57:51

by Tero Kristo

Subject: [PATCH 08/28] crypto: omap-sham: implement context export/import APIs

Context export/import are now required for ahash algorithms due to
required support in algif_hash. Implement these for OMAP SHA driver,
saving and restoring the internal state of the driver.

Signed-off-by: Tero Kristo <[email protected]>
---
drivers/crypto/omap-sham.c | 40 ++++++++++++++++++++++++++++++++++++++--
1 file changed, 38 insertions(+), 2 deletions(-)

diff --git a/drivers/crypto/omap-sham.c b/drivers/crypto/omap-sham.c
index 34ebe1d..321c097 100644
--- a/drivers/crypto/omap-sham.c
+++ b/drivers/crypto/omap-sham.c
@@ -1329,6 +1329,35 @@ static void omap_sham_cra_exit(struct crypto_tfm *tfm)
pm_runtime_get_sync(dd->dev);
}

+static int omap_sham_export(struct ahash_request *req, void *out)
+{
+ struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
+ struct omap_sham_reqctx *rctx = ahash_request_ctx(req);
+ struct omap_sham_ctx *ctx = crypto_ahash_ctx(tfm);
+ struct omap_sham_hmac_ctx *bctx = ctx->base;
+
+ memcpy(out, rctx, sizeof(*rctx) + BUFLEN);
+ memcpy(out + sizeof(*rctx) + BUFLEN, ctx, sizeof(*ctx));
+ memcpy(out + sizeof(*rctx) + BUFLEN + sizeof(*ctx), bctx,
+ sizeof(*bctx));
+
+ return 0;
+}
+
+static int omap_sham_import(struct ahash_request *req, const void *in)
+{
+ struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
+ struct omap_sham_reqctx *rctx = ahash_request_ctx(req);
+ struct omap_sham_ctx *ctx = crypto_ahash_ctx(tfm);
+ struct omap_sham_hmac_ctx *bctx = ctx->base;
+
+ memcpy(rctx, in, sizeof(*rctx) + BUFLEN);
+ memcpy(ctx, in + sizeof(*rctx) + BUFLEN, sizeof(*ctx));
+ memcpy(bctx, in + sizeof(*rctx) + BUFLEN + sizeof(*ctx), sizeof(*bctx));
+
+ return 0;
+}
+
static struct ahash_alg algs_sha1_md5[] = {
{
.init = omap_sham_init,
@@ -1980,8 +2009,15 @@ static int omap_sham_probe(struct platform_device *pdev)

for (i = 0; i < dd->pdata->algs_info_size; i++) {
for (j = 0; j < dd->pdata->algs_info[i].size; j++) {
- err = crypto_register_ahash(
- &dd->pdata->algs_info[i].algs_list[j]);
+ struct ahash_alg *alg;
+
+ alg = &dd->pdata->algs_info[i].algs_list[j];
+ alg->export = omap_sham_export;
+ alg->import = omap_sham_import;
+ alg->halg.statesize = sizeof(struct omap_sham_reqctx) +
+ sizeof(struct omap_sham_ctx) +
+ sizeof(struct omap_sham_hmac_ctx) + BUFLEN;
+ err = crypto_register_ahash(alg);
if (err)
goto err_algs;

--
1.9.1
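
The export/import pair in patch 08/28 serializes several context
structures back to back into one flat state buffer. The round trip can
be sketched in userspace as below. All names are hypothetical, the
buffer is embedded in struct req_state instead of trailing it as in the
driver, and SKETCH_BUFLEN is an arbitrary small size chosen for the
example.

```c
#include <assert.h>
#include <string.h>

#define SKETCH_BUFLEN 64  /* assumption: small stand-in for the driver's BUFLEN */

/* Hypothetical per-request state, including the data buffer. */
struct req_state {
	int digcnt;
	int bufcnt;
	unsigned char buf[SKETCH_BUFLEN];
};

/* Hypothetical per-transform state. */
struct tfm_state {
	int flags;
};

/* Size a caller must reserve; plays the role of halg.statesize. */
#define STATE_SIZE (sizeof(struct req_state) + sizeof(struct tfm_state))

/* Pack both structures back to back into the caller's buffer. */
static void state_export(const struct req_state *r,
			 const struct tfm_state *t, void *out)
{
	unsigned char *p = out;

	assert(r && t && out);
	memcpy(p, r, sizeof(*r));
	memcpy(p + sizeof(*r), t, sizeof(*t));
}

/* Unpack in the same order and at the same offsets as state_export(). */
static void state_import(struct req_state *r, struct tfm_state *t,
			 const void *in)
{
	const unsigned char *p = in;

	assert(r && t && in);
	memcpy(r, p, sizeof(*r));
	memcpy(t, p + sizeof(*r), sizeof(*t));
}
```

The important property is that the advertised state size covers every
piece that export writes, which is why the patch sums all the structure
sizes plus BUFLEN into halg.statesize.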

2016-06-01 08:58:05

by Tero Kristo

Subject: [PATCH 12/28] crypto: omap-aes: Add fallback support

From: Lokesh Vutla <[email protected]>

As setting up the DMA operations is quite costly, add software fallback
support for requests smaller than 200 bytes. This change gives some 10%
extra performance in ipsec use case.

Signed-off-by: Lokesh Vutla <[email protected]>
Signed-off-by: Tero Kristo <[email protected]>
---
drivers/crypto/Kconfig | 3 +++
drivers/crypto/omap-aes.c | 45 ++++++++++++++++++++++++++++++++++++++++++---
2 files changed, 45 insertions(+), 3 deletions(-)

diff --git a/drivers/crypto/Kconfig b/drivers/crypto/Kconfig
index d77ba2f..0c57ac9 100644
--- a/drivers/crypto/Kconfig
+++ b/drivers/crypto/Kconfig
@@ -305,6 +305,9 @@ config CRYPTO_DEV_OMAP_AES
select CRYPTO_AES
select CRYPTO_BLKCIPHER
select CRYPTO_ENGINE
+ select CRYPTO_CBC
+ select CRYPTO_ECB
+ select CRYPTO_CTR
help
OMAP processors have AES module accelerator. Select this if you
want to use the OMAP module for AES algorithms.
diff --git a/drivers/crypto/omap-aes.c b/drivers/crypto/omap-aes.c
index f710602..867e56a 100644
--- a/drivers/crypto/omap-aes.c
+++ b/drivers/crypto/omap-aes.c
@@ -103,6 +103,7 @@ struct omap_aes_ctx {
int keylen;
u32 key[AES_KEYSIZE_256 / sizeof(u32)];
unsigned long flags;
+ struct crypto_ablkcipher *fallback;
};

struct omap_aes_reqctx {
@@ -680,15 +681,28 @@ static void omap_aes_done_task(unsigned long data)

static int omap_aes_crypt(struct ablkcipher_request *req, unsigned long mode)
{
+ struct crypto_tfm *tfm =
+ crypto_ablkcipher_tfm(crypto_ablkcipher_reqtfm(req));
struct omap_aes_ctx *ctx = crypto_ablkcipher_ctx(
crypto_ablkcipher_reqtfm(req));
struct omap_aes_reqctx *rctx = ablkcipher_request_ctx(req);
struct omap_aes_dev *dd;
+ int ret;

pr_debug("nbytes: %d, enc: %d, cbc: %d\n", req->nbytes,
!!(mode & FLAGS_ENCRYPT),
!!(mode & FLAGS_CBC));

+ if (req->nbytes < 200) {
+ ablkcipher_request_set_tfm(req, ctx->fallback);
+
+ if (mode & FLAGS_ENCRYPT)
+ ret = crypto_ablkcipher_encrypt(req);
+ else
+ ret = crypto_ablkcipher_decrypt(req);
+ ablkcipher_request_set_tfm(req, __crypto_ablkcipher_cast(tfm));
+ return ret;
+ }
dd = omap_aes_find_dev(ctx);
if (!dd)
return -ENODEV;
@@ -704,6 +718,7 @@ static int omap_aes_setkey(struct crypto_ablkcipher *tfm, const u8 *key,
unsigned int keylen)
{
struct omap_aes_ctx *ctx = crypto_ablkcipher_ctx(tfm);
+ int ret;

if (keylen != AES_KEYSIZE_128 && keylen != AES_KEYSIZE_192 &&
keylen != AES_KEYSIZE_256)
@@ -714,6 +729,14 @@ static int omap_aes_setkey(struct crypto_ablkcipher *tfm, const u8 *key,
memcpy(ctx->key, key, keylen);
ctx->keylen = keylen;

+ ctx->fallback->base.crt_flags &= ~CRYPTO_TFM_REQ_MASK;
+ ctx->fallback->base.crt_flags |=
+ tfm->base.crt_flags & CRYPTO_TFM_REQ_MASK;
+
+ ret = crypto_ablkcipher_setkey(ctx->fallback, key, keylen);
+ if (!ret)
+ return 0;
+
return 0;
}

@@ -751,6 +774,11 @@ static int omap_aes_cra_init(struct crypto_tfm *tfm)
{
struct omap_aes_dev *dd = NULL;
int err;
+ const char *name = crypto_tfm_alg_name(tfm);
+ const u32 flags = CRYPTO_ALG_ASYNC | CRYPTO_ALG_NEED_FALLBACK;
+ struct omap_aes_ctx *ctx = crypto_tfm_ctx(tfm);
+ struct crypto_ablkcipher *blk;
+

list_for_each_entry(dd, &dev_list, list) {
err = pm_runtime_get_sync(dd->dev);
@@ -761,6 +789,12 @@ static int omap_aes_cra_init(struct crypto_tfm *tfm)
}
}

+ blk = crypto_alloc_ablkcipher(name, 0, flags);
+ if (IS_ERR(blk))
+ return PTR_ERR(blk);
+
+ ctx->fallback = blk;
+
tfm->crt_ablkcipher.reqsize = sizeof(struct omap_aes_reqctx);

return 0;
@@ -769,11 +803,16 @@ static int omap_aes_cra_init(struct crypto_tfm *tfm)
static void omap_aes_cra_exit(struct crypto_tfm *tfm)
{
struct omap_aes_dev *dd = NULL;
+ struct omap_aes_ctx *ctx = crypto_tfm_ctx(tfm);

list_for_each_entry(dd, &dev_list, list) {
pm_runtime_put_sync(dd->dev);
}

+ if (ctx->fallback)
+ crypto_free_ablkcipher(ctx->fallback);
+
+ ctx->fallback = NULL;
}

/* ********************** ALGS ************************************ */
@@ -785,7 +824,7 @@ static struct crypto_alg algs_ecb_cbc[] = {
.cra_priority = 300,
.cra_flags = CRYPTO_ALG_TYPE_ABLKCIPHER |
CRYPTO_ALG_KERN_DRIVER_ONLY |
- CRYPTO_ALG_ASYNC,
+ CRYPTO_ALG_ASYNC | CRYPTO_ALG_NEED_FALLBACK,
.cra_blocksize = AES_BLOCK_SIZE,
.cra_ctxsize = sizeof(struct omap_aes_ctx),
.cra_alignmask = 0,
@@ -807,7 +846,7 @@ static struct crypto_alg algs_ecb_cbc[] = {
.cra_priority = 300,
.cra_flags = CRYPTO_ALG_TYPE_ABLKCIPHER |
CRYPTO_ALG_KERN_DRIVER_ONLY |
- CRYPTO_ALG_ASYNC,
+ CRYPTO_ALG_ASYNC | CRYPTO_ALG_NEED_FALLBACK,
.cra_blocksize = AES_BLOCK_SIZE,
.cra_ctxsize = sizeof(struct omap_aes_ctx),
.cra_alignmask = 0,
@@ -833,7 +872,7 @@ static struct crypto_alg algs_ctr[] = {
.cra_priority = 300,
.cra_flags = CRYPTO_ALG_TYPE_ABLKCIPHER |
CRYPTO_ALG_KERN_DRIVER_ONLY |
- CRYPTO_ALG_ASYNC,
+ CRYPTO_ALG_ASYNC | CRYPTO_ALG_NEED_FALLBACK,
.cra_blocksize = AES_BLOCK_SIZE,
.cra_ctxsize = sizeof(struct omap_aes_ctx),
.cra_alignmask = 0,
--
1.9.1

2016-06-01 08:58:16

by Tero Kristo

Subject: [PATCH 13/28] crypto: engine: avoid unnecessary context switches

The crypto engine now hijacks the currently running thread to execute
crypto functionality. The kthread is scheduled only when no thread
context is available (i.e. in interrupt context).

This will improve performance of crypto operations using crypto engine.

Signed-off-by: Tero Kristo <[email protected]>
---
crypto/crypto_engine.c | 17 +++++++++++++----
1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/crypto/crypto_engine.c b/crypto/crypto_engine.c
index a55c82d..aac5870 100644
--- a/crypto/crypto_engine.c
+++ b/crypto/crypto_engine.c
@@ -136,6 +136,14 @@ static void crypto_pump_work(struct kthread_work *work)
crypto_pump_requests(engine, true);
}

+static void queue_pump_work(struct crypto_engine *engine)
+{
+ if (in_interrupt())
+ queue_kthread_work(&engine->kworker, &engine->pump_requests);
+ else
+ crypto_pump_requests(engine, true);
+}
+
/**
* crypto_transfer_request - transfer the new request into the engine queue
* @engine: the hardware engine
@@ -156,10 +164,11 @@ int crypto_transfer_request(struct crypto_engine *engine,

ret = ablkcipher_enqueue_request(&engine->queue, req);

+ spin_unlock_irqrestore(&engine->queue_lock, flags);
+
if (!engine->busy && need_pump)
- queue_kthread_work(&engine->kworker, &engine->pump_requests);
+ queue_pump_work(engine);

- spin_unlock_irqrestore(&engine->queue_lock, flags);
return ret;
}
EXPORT_SYMBOL_GPL(crypto_transfer_request);
@@ -210,7 +219,7 @@ void crypto_finalize_request(struct crypto_engine *engine,

req->base.complete(&req->base, err);

- queue_kthread_work(&engine->kworker, &engine->pump_requests);
+ queue_pump_work(engine);
}
EXPORT_SYMBOL_GPL(crypto_finalize_request);

@@ -234,7 +243,7 @@ int crypto_engine_start(struct crypto_engine *engine)
engine->running = true;
spin_unlock_irqrestore(&engine->queue_lock, flags);

- queue_kthread_work(&engine->kworker, &engine->pump_requests);
+ queue_pump_work(engine);

return 0;
}
--
1.9.1
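
The decision made by queue_pump_work() in patch 13/28 can be shown with
a small userspace model. struct fake_engine and pump() are hypothetical,
and the boolean parameter stands in for the kernel's in_interrupt()
check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical engine that records how each pump request was dispatched. */
struct fake_engine {
	int deferred;     /* handed over to the worker thread */
	int inline_runs;  /* executed directly in the caller's context */
};

/* Defer only when the caller cannot run the work itself. */
static void pump(struct fake_engine *e, bool in_atomic_context)
{
	assert(e != NULL);

	if (in_atomic_context)
		e->deferred++;     /* models queue_kthread_work() */
	else
		e->inline_runs++;  /* models calling crypto_pump_requests() */
}
```

The diff also moves the spin_unlock_irqrestore() in
crypto_transfer_request() ahead of the pump call, presumably because the
pump may now run inline and must not be entered with queue_lock held.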

2016-06-01 08:57:47

by Tero Kristo

Subject: [PATCH 07/28] crypto: ahash: increase the maximum allowed statesize

The statesize is used to determine the maximum size for saved ahash
context. In some cases, this can be much larger than what is currently
allowed, for example the omap-sham driver uses a buffer size of
PAGE_SIZE. Increase the maximum allowed statesize to accommodate this.

Signed-off-by: Tero Kristo <[email protected]>
---
crypto/ahash.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/crypto/ahash.c b/crypto/ahash.c
index 3887a98..375bbd7 100644
--- a/crypto/ahash.c
+++ b/crypto/ahash.c
@@ -536,7 +536,7 @@ static int ahash_prepare_alg(struct ahash_alg *alg)
struct crypto_alg *base = &alg->halg.base;

if (alg->halg.digestsize > PAGE_SIZE / 8 ||
- alg->halg.statesize > PAGE_SIZE / 8 ||
+ alg->halg.statesize > PAGE_SIZE * 2 ||
alg->halg.statesize == 0)
return -EINVAL;

--
1.9.1
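
The effect of the limit change in patch 07/28 is easy to check with
concrete numbers. The sketch below assumes 4 KiB pages
(ASSUMED_PAGE_SIZE is not a kernel symbol) and reproduces only the two
statesize conditions from ahash_prepare_alg(), before and after the
patch.

```c
#include <assert.h>
#include <stddef.h>

#define ASSUMED_PAGE_SIZE 4096u  /* assumption: 4 KiB pages */

/* Limit before the patch: at most PAGE_SIZE / 8 (512 bytes here). */
static int statesize_ok_old(size_t statesize)
{
	return statesize != 0 && statesize <= ASSUMED_PAGE_SIZE / 8;
}

/* Limit after the patch: at most PAGE_SIZE * 2 (8192 bytes here). */
static int statesize_ok_new(size_t statesize)
{
	return statesize != 0 && statesize <= ASSUMED_PAGE_SIZE * 2;
}
```

A state that holds a PAGE_SIZE buffer plus some bookkeeping, say 4096 +
200 bytes, is rejected by the old check and accepted by the new one.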

2016-06-01 08:58:16

by Tero Kristo

Subject: [PATCH 15/28] crypto: omap-des: fix crypto engine initialization order

The crypto engine must be initialized before registering algorithms,
otherwise the test manager will crash as it attempts to execute
tests for the algos while they are being registered.

Fixes: f1b77aaca85a ("crypto: omap-des - Integrate with the crypto engine framework")
Signed-off-by: Tero Kristo <[email protected]>
---
drivers/crypto/omap-des.c | 28 +++++++++++++++-------------
1 file changed, 15 insertions(+), 13 deletions(-)

diff --git a/drivers/crypto/omap-des.c b/drivers/crypto/omap-des.c
index e4c87bc..f5bf0d1 100644
--- a/drivers/crypto/omap-des.c
+++ b/drivers/crypto/omap-des.c
@@ -1079,6 +1079,17 @@ static int omap_des_probe(struct platform_device *pdev)
list_add_tail(&dd->list, &dev_list);
spin_unlock(&list_lock);

+ /* Initialize des crypto engine */
+ dd->engine = crypto_engine_alloc_init(dev, 1);
+ if (!dd->engine)
+ goto err_engine;
+
+ dd->engine->prepare_request = omap_des_prepare_req;
+ dd->engine->crypt_one_request = omap_des_crypt_req;
+ err = crypto_engine_start(dd->engine);
+ if (err)
+ goto err_engine;
+
for (i = 0; i < dd->pdata->algs_info_size; i++) {
for (j = 0; j < dd->pdata->algs_info[i].size; j++) {
algp = &dd->pdata->algs_info[i].algs_list[j];
@@ -1094,27 +1105,18 @@ static int omap_des_probe(struct platform_device *pdev)
}
}

- /* Initialize des crypto engine */
- dd->engine = crypto_engine_alloc_init(dev, 1);
- if (!dd->engine)
- goto err_algs;
-
- dd->engine->prepare_request = omap_des_prepare_req;
- dd->engine->crypt_one_request = omap_des_crypt_req;
- err = crypto_engine_start(dd->engine);
- if (err)
- goto err_engine;
-
return 0;

-err_engine:
- crypto_engine_exit(dd->engine);
err_algs:
for (i = dd->pdata->algs_info_size - 1; i >= 0; i--)
for (j = dd->pdata->algs_info[i].registered - 1; j >= 0; j--)
crypto_unregister_alg(
&dd->pdata->algs_info[i].algs_list[j]);

+err_engine:
+ if (dd->engine)
+ crypto_engine_exit(dd->engine);
+
omap_des_dma_cleanup(dd);
err_irq:
tasklet_kill(&dd->done_task);
--
1.9.1

2016-06-01 08:58:16

by Tero Kristo

Subject: [PATCH 14/28] crypto: omap-aes: fix crypto engine initialization order

The crypto engine must be initialized before registering algorithms,
otherwise the test manager will crash as it attempts to execute
tests for the algos while they are being registered.

Fixes: 0529900a01cb ("crypto: omap-aes - Support crypto engine framework")
Signed-off-by: Tero Kristo <[email protected]>
---
drivers/crypto/omap-aes.c | 28 +++++++++++++++-------------
1 file changed, 15 insertions(+), 13 deletions(-)

diff --git a/drivers/crypto/omap-aes.c b/drivers/crypto/omap-aes.c
index 867e56a..2d0978a 100644
--- a/drivers/crypto/omap-aes.c
+++ b/drivers/crypto/omap-aes.c
@@ -1204,6 +1204,17 @@ static int omap_aes_probe(struct platform_device *pdev)
list_add_tail(&dd->list, &dev_list);
spin_unlock(&list_lock);

+ /* Initialize crypto engine */
+ dd->engine = crypto_engine_alloc_init(dev, 1);
+ if (!dd->engine)
+ goto err_engine;
+
+ dd->engine->prepare_request = omap_aes_prepare_req;
+ dd->engine->crypt_one_request = omap_aes_crypt_req;
+ err = crypto_engine_start(dd->engine);
+ if (err)
+ goto err_engine;
+
for (i = 0; i < dd->pdata->algs_info_size; i++) {
if (!dd->pdata->algs_info[i].registered) {
for (j = 0; j < dd->pdata->algs_info[i].size; j++) {
@@ -1221,26 +1232,17 @@ static int omap_aes_probe(struct platform_device *pdev)
}
}

- /* Initialize crypto engine */
- dd->engine = crypto_engine_alloc_init(dev, 1);
- if (!dd->engine)
- goto err_algs;
-
- dd->engine->prepare_request = omap_aes_prepare_req;
- dd->engine->crypt_one_request = omap_aes_crypt_req;
- err = crypto_engine_start(dd->engine);
- if (err)
- goto err_engine;
-
return 0;
-err_engine:
- crypto_engine_exit(dd->engine);
err_algs:
for (i = dd->pdata->algs_info_size - 1; i >= 0; i--)
for (j = dd->pdata->algs_info[i].registered - 1; j >= 0; j--)
crypto_unregister_alg(
&dd->pdata->algs_info[i].algs_list[j]);

+err_engine:
+ if (dd->engine)
+ crypto_engine_exit(dd->engine);
+
omap_aes_dma_cleanup(dd);
err_irq:
tasklet_kill(&dd->done_task);
--
1.9.1
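
The ordering fixed by patches 14/28 and 15/28, engine first and
algorithms second, with error labels unwinding in reverse, can be
sketched in userspace. fake_probe() and struct fake_dev are
hypothetical; the assert inside the loop models the self-tests that run
as soon as an algorithm is registered.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical device state tracked during probe. */
struct fake_dev {
	bool engine_up;
	int algs_registered;
};

/*
 * engine_fails:  simulate a failure while bringing up the engine.
 * fail_at_alg:   index of the algorithm whose registration fails, or -1.
 */
static int fake_probe(struct fake_dev *d, bool engine_fails,
		      int fail_at_alg, int nalgs)
{
	int err = 0;

	/* Step 1: the engine must be running before anything is registered. */
	if (engine_fails) {
		err = -1;
		goto err_engine;
	}
	d->engine_up = true;

	/* Step 2: register algorithms; self-tests may fire immediately. */
	for (int i = 0; i < nalgs; i++) {
		if (i == fail_at_alg) {
			err = -2;
			goto err_algs;
		}
		assert(d->engine_up);
		d->algs_registered++;
	}
	return 0;

	/* Unwind in the reverse order of setup. */
err_algs:
	d->algs_registered = 0;
err_engine:
	d->engine_up = false;
	return err;
}
```

Because err_algs falls through into err_engine, a failed registration
also tears down the engine, matching the label order in the patched
probe functions.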

2016-06-01 09:05:31

by Tero Kristo

Subject: [PATCH 16/28] ARM: DRA7: hwmod: Add data for DES IP

From: Joel Fernandes <[email protected]>

DRA7 SoC contains DES crypto hardware accelerator. Add hwmod data for
this IP so that it can be utilized by crypto frameworks.

Signed-off-by: Joel Fernandes <[email protected]>
Signed-off-by: Lokesh Vutla <[email protected]>
Signed-off-by: Tero Kristo <[email protected]>
---
arch/arm/mach-omap2/omap_hwmod_7xx_data.c | 37 +++++++++++++++++++++++++++++++
1 file changed, 37 insertions(+)

diff --git a/arch/arm/mach-omap2/omap_hwmod_7xx_data.c b/arch/arm/mach-omap2/omap_hwmod_7xx_data.c
index d0e7e525..13e4ea2 100644
--- a/arch/arm/mach-omap2/omap_hwmod_7xx_data.c
+++ b/arch/arm/mach-omap2/omap_hwmod_7xx_data.c
@@ -2541,6 +2541,34 @@ static struct omap_hwmod dra7xx_uart10_hwmod = {
},
};

+/* DES (the 'P' (public) device) */
+static struct omap_hwmod_class_sysconfig dra7xx_des_sysc = {
+ .rev_offs = 0x0030,
+ .sysc_offs = 0x0034,
+ .syss_offs = 0x0038,
+ .sysc_flags = SYSS_HAS_RESET_STATUS,
+};
+
+static struct omap_hwmod_class dra7xx_des_hwmod_class = {
+ .name = "des",
+ .sysc = &dra7xx_des_sysc,
+};
+
+/* DES */
+static struct omap_hwmod dra7xx_des_hwmod = {
+ .name = "des",
+ .class = &dra7xx_des_hwmod_class,
+ .clkdm_name = "l4sec_clkdm",
+ .main_clk = "l3_iclk_div",
+ .prcm = {
+ .omap4 = {
+ .clkctrl_offs = DRA7XX_CM_L4SEC_DES3DES_CLKCTRL_OFFSET,
+ .context_offs = DRA7XX_RM_L4SEC_DES3DES_CONTEXT_OFFSET,
+ .modulemode = MODULEMODE_HWCTRL,
+ },
+ },
+};
+
/*
* 'usb_otg_ss' class
*
@@ -3683,6 +3711,14 @@ static struct omap_hwmod_ocp_if dra7xx_l4_per2__uart7 = {
.user = OCP_USER_MPU | OCP_USER_SDMA,
};

+/* l4_per1 -> des */
+static struct omap_hwmod_ocp_if dra7xx_l4_per1__des = {
+ .master = &dra7xx_l4_per1_hwmod,
+ .slave = &dra7xx_des_hwmod,
+ .clk = "l3_iclk_div",
+ .user = OCP_USER_MPU | OCP_USER_SDMA,
+};
+
/* l4_per2 -> uart8 */
static struct omap_hwmod_ocp_if dra7xx_l4_per2__uart8 = {
.master = &dra7xx_l4_per2_hwmod,
@@ -3916,6 +3952,7 @@ static struct omap_hwmod_ocp_if *dra7xx_hwmod_ocp_ifs[] __initdata = {
&dra7xx_l4_per2__uart8,
&dra7xx_l4_per2__uart9,
&dra7xx_l4_wkup__uart10,
+ &dra7xx_l4_per1__des,
&dra7xx_l4_per3__usb_otg_ss1,
&dra7xx_l4_per3__usb_otg_ss2,
&dra7xx_l4_per3__usb_otg_ss3,
--
1.9.1

2016-06-01 09:07:12

by Tero Kristo

Subject: [PATCH 17/28] ARM: DRA7: hwmod: Add data for AES IP

From: Joel Fernandes <[email protected]>

The DRA7 SoC contains an AES crypto hardware accelerator. Add hwmod data for
this IP so that it can be used by the crypto framework.

Signed-off-by: Joel Fernandes <[email protected]>
Signed-off-by: Lokesh Vutla <[email protected]>
[[email protected]: squash in support for both AES1 and AES2 cores]
Signed-off-by: Tero Kristo <[email protected]>
---
arch/arm/mach-omap2/omap_hwmod_7xx_data.c | 62 +++++++++++++++++++++++++++++++
1 file changed, 62 insertions(+)

diff --git a/arch/arm/mach-omap2/omap_hwmod_7xx_data.c b/arch/arm/mach-omap2/omap_hwmod_7xx_data.c
index 13e4ea2..ceb1b42 100644
--- a/arch/arm/mach-omap2/omap_hwmod_7xx_data.c
+++ b/arch/arm/mach-omap2/omap_hwmod_7xx_data.c
@@ -690,6 +690,50 @@ static struct omap_hwmod dra7xx_dss_hdmi_hwmod = {
.parent_hwmod = &dra7xx_dss_hwmod,
};

+/* AES (the 'P' (public) device) */
+static struct omap_hwmod_class_sysconfig dra7xx_aes_sysc = {
+ .rev_offs = 0x0080,
+ .sysc_offs = 0x0084,
+ .syss_offs = 0x0088,
+ .sysc_flags = SYSS_HAS_RESET_STATUS,
+};
+
+static struct omap_hwmod_class dra7xx_aes_hwmod_class = {
+ .name = "aes",
+ .sysc = &dra7xx_aes_sysc,
+ .rev = 2,
+};
+
+/* AES1 */
+static struct omap_hwmod dra7xx_aes1_hwmod = {
+ .name = "aes1",
+ .class = &dra7xx_aes_hwmod_class,
+ .clkdm_name = "l4sec_clkdm",
+ .main_clk = "l3_iclk_div",
+ .prcm = {
+ .omap4 = {
+ .clkctrl_offs = DRA7XX_CM_L4SEC_AES1_CLKCTRL_OFFSET,
+ .context_offs = DRA7XX_RM_L4SEC_AES1_CONTEXT_OFFSET,
+ .modulemode = MODULEMODE_HWCTRL,
+ },
+ },
+};
+
+/* AES2 */
+static struct omap_hwmod dra7xx_aes2_hwmod = {
+ .name = "aes2",
+ .class = &dra7xx_aes_hwmod_class,
+ .clkdm_name = "l4sec_clkdm",
+ .main_clk = "l3_iclk_div",
+ .prcm = {
+ .omap4 = {
+ .clkctrl_offs = DRA7XX_CM_L4SEC_AES2_CLKCTRL_OFFSET,
+ .context_offs = DRA7XX_RM_L4SEC_AES2_CONTEXT_OFFSET,
+ .modulemode = MODULEMODE_HWCTRL,
+ },
+ },
+};
+
/*
* 'elm' class
*
@@ -2988,6 +3032,22 @@ static struct omap_hwmod_ocp_if dra7xx_l3_main_1__hdmi = {
.user = OCP_USER_MPU | OCP_USER_SDMA,
};

+/* l3_main_1 -> aes1 */
+static struct omap_hwmod_ocp_if dra7xx_l3_main_1__aes1 = {
+ .master = &dra7xx_l3_main_1_hwmod,
+ .slave = &dra7xx_aes1_hwmod,
+ .clk = "l3_iclk_div",
+ .user = OCP_USER_MPU | OCP_USER_SDMA,
+};
+
+/* l3_main_1 -> aes2 */
+static struct omap_hwmod_ocp_if dra7xx_l3_main_1__aes2 = {
+ .master = &dra7xx_l3_main_1_hwmod,
+ .slave = &dra7xx_aes2_hwmod,
+ .clk = "l3_iclk_div",
+ .user = OCP_USER_MPU | OCP_USER_SDMA,
+};
+
/* l4_per2 -> mcasp1 */
static struct omap_hwmod_ocp_if dra7xx_l4_per2__mcasp1 = {
.master = &dra7xx_l4_per2_hwmod,
@@ -3877,6 +3937,8 @@ static struct omap_hwmod_ocp_if *dra7xx_hwmod_ocp_ifs[] __initdata = {
&dra7xx_l3_main_1__dss,
&dra7xx_l3_main_1__dispc,
&dra7xx_l3_main_1__hdmi,
+ &dra7xx_l3_main_1__aes1,
+ &dra7xx_l3_main_1__aes2,
&dra7xx_l4_per1__elm,
&dra7xx_l4_wkup__gpio1,
&dra7xx_l4_per1__gpio2,
--
1.9.1

2016-06-01 09:07:23

by Tero Kristo

Subject: [PATCH 18/28] ARM: DRA7: hwmod: Add data for SHA IP

From: Lokesh Vutla <[email protected]>

The DRA7 SoC contains a SHA crypto hardware accelerator. Add hwmod data for
this IP so that it can be used by the crypto framework.

Signed-off-by: Lokesh Vutla <[email protected]>
Signed-off-by: Tero Kristo <[email protected]>
---
arch/arm/mach-omap2/omap_hwmod_7xx_data.c | 37 +++++++++++++++++++++++++++++++
1 file changed, 37 insertions(+)

diff --git a/arch/arm/mach-omap2/omap_hwmod_7xx_data.c b/arch/arm/mach-omap2/omap_hwmod_7xx_data.c
index ceb1b42..8932619 100644
--- a/arch/arm/mach-omap2/omap_hwmod_7xx_data.c
+++ b/arch/arm/mach-omap2/omap_hwmod_7xx_data.c
@@ -734,6 +734,34 @@ static struct omap_hwmod dra7xx_aes2_hwmod = {
},
};

+/* sha0 HIB2 (the 'P' (public) device) */
+static struct omap_hwmod_class_sysconfig dra7xx_sha0_sysc = {
+ .rev_offs = 0x100,
+ .sysc_offs = 0x110,
+ .syss_offs = 0x114,
+ .sysc_flags = SYSS_HAS_RESET_STATUS,
+};
+
+static struct omap_hwmod_class dra7xx_sha0_hwmod_class = {
+ .name = "sham",
+ .sysc = &dra7xx_sha0_sysc,
+ .rev = 2,
+};
+
+static struct omap_hwmod dra7xx_sha0_hwmod = {
+ .name = "sham",
+ .class = &dra7xx_sha0_hwmod_class,
+ .clkdm_name = "l4sec_clkdm",
+ .main_clk = "l3_iclk_div",
+ .prcm = {
+ .omap4 = {
+ .clkctrl_offs = DRA7XX_CM_L4SEC_SHA2MD51_CLKCTRL_OFFSET,
+ .context_offs = DRA7XX_RM_L4SEC_SHA2MD51_CONTEXT_OFFSET,
+ .modulemode = MODULEMODE_HWCTRL,
+ },
+ },
+};
+
/*
* 'elm' class
*
@@ -3048,6 +3076,14 @@ static struct omap_hwmod_ocp_if dra7xx_l3_main_1__aes2 = {
.user = OCP_USER_MPU | OCP_USER_SDMA,
};

+/* l3_main_1 -> sha0 */
+static struct omap_hwmod_ocp_if dra7xx_l3_main_1__sha0 = {
+ .master = &dra7xx_l3_main_1_hwmod,
+ .slave = &dra7xx_sha0_hwmod,
+ .clk = "l3_iclk_div",
+ .user = OCP_USER_MPU | OCP_USER_SDMA,
+};
+
/* l4_per2 -> mcasp1 */
static struct omap_hwmod_ocp_if dra7xx_l4_per2__mcasp1 = {
.master = &dra7xx_l4_per2_hwmod,
@@ -3939,6 +3975,7 @@ static struct omap_hwmod_ocp_if *dra7xx_hwmod_ocp_ifs[] __initdata = {
&dra7xx_l3_main_1__hdmi,
&dra7xx_l3_main_1__aes1,
&dra7xx_l3_main_1__aes2,
+ &dra7xx_l3_main_1__sha0,
&dra7xx_l4_per1__elm,
&dra7xx_l4_wkup__gpio1,
&dra7xx_l4_per1__gpio2,
--
1.9.1

2016-06-01 09:07:26

by Tero Kristo

Subject: [PATCH 19/28] ARM: DRA7: hwmod: Add data for RNG IP

From: Joel Fernandes <[email protected]>

The DRA7 SoC contains a hardware random number generator. Add hwmod data for
this IP so that it can be used.

Signed-off-by: Joel Fernandes <[email protected]>
Signed-off-by: Lokesh Vutla <[email protected]>
[[email protected]: squashed the RNG hwmod IP flag fixes from Lokesh,
squashed the HS chip fix from Daniel Allred]
Signed-off-by: Tero Kristo <[email protected]>
---
arch/arm/mach-omap2/omap_hwmod_7xx_data.c | 36 +++++++++++++++++++++++++++++++
1 file changed, 36 insertions(+)

diff --git a/arch/arm/mach-omap2/omap_hwmod_7xx_data.c b/arch/arm/mach-omap2/omap_hwmod_7xx_data.c
index 8932619..0508067 100644
--- a/arch/arm/mach-omap2/omap_hwmod_7xx_data.c
+++ b/arch/arm/mach-omap2/omap_hwmod_7xx_data.c
@@ -2641,6 +2641,34 @@ static struct omap_hwmod dra7xx_des_hwmod = {
},
};

+/* rng */
+static struct omap_hwmod_class_sysconfig dra7xx_rng_sysc = {
+ .rev_offs = 0x1fe0,
+ .sysc_offs = 0x1fe4,
+ .sysc_flags = SYSC_HAS_AUTOIDLE | SYSC_HAS_SIDLEMODE,
+ .idlemodes = SIDLE_FORCE | SIDLE_NO,
+ .sysc_fields = &omap_hwmod_sysc_type1,
+};
+
+static struct omap_hwmod_class dra7xx_rng_hwmod_class = {
+ .name = "rng",
+ .sysc = &dra7xx_rng_sysc,
+};
+
+static struct omap_hwmod dra7xx_rng_hwmod = {
+ .name = "rng",
+ .class = &dra7xx_rng_hwmod_class,
+ .flags = HWMOD_SWSUP_SIDLE,
+ .clkdm_name = "l4sec_clkdm",
+ .prcm = {
+ .omap4 = {
+ .clkctrl_offs = DRA7XX_CM_L4SEC_RNG_CLKCTRL_OFFSET,
+ .context_offs = DRA7XX_RM_L4SEC_RNG_CONTEXT_OFFSET,
+ .modulemode = MODULEMODE_HWCTRL,
+ },
+ },
+};
+
/*
* 'usb_otg_ss' class
*
@@ -3839,6 +3867,13 @@ static struct omap_hwmod_ocp_if dra7xx_l4_wkup__uart10 = {
.user = OCP_USER_MPU | OCP_USER_SDMA,
};

+/* l4_per1 -> rng */
+static struct omap_hwmod_ocp_if dra7xx_l4_per1__rng = {
+ .master = &dra7xx_l4_per1_hwmod,
+ .slave = &dra7xx_rng_hwmod,
+ .user = OCP_USER_MPU,
+};
+
/* l4_per3 -> usb_otg_ss1 */
static struct omap_hwmod_ocp_if dra7xx_l4_per3__usb_otg_ss1 = {
.master = &dra7xx_l4_per3_hwmod,
@@ -4069,6 +4104,7 @@ static struct omap_hwmod_ocp_if *dra7xx_hwmod_ocp_ifs[] __initdata = {
/* GP-only hwmod links */
static struct omap_hwmod_ocp_if *dra7xx_gp_hwmod_ocp_ifs[] __initdata = {
&dra7xx_l4_wkup__timer12,
+ &dra7xx_l4_per1__rng,
NULL,
};

--
1.9.1

2016-06-01 09:07:42

by Tero Kristo

Subject: [PATCH 21/28] ARM: AM43xx: hwmod: Add data for DES

From: Lokesh Vutla <[email protected]>

The AM43xx SoC contains a DES crypto hardware accelerator. Add hwmod data for
this IP so that it can be used by the crypto framework.

Signed-off-by: Lokesh Vutla <[email protected]>
Signed-off-by: Tero Kristo <[email protected]>
---
arch/arm/mach-omap2/omap_hwmod_43xx_data.c | 33 ++++++++++++++++++++++++++++++
arch/arm/mach-omap2/prcm43xx.h | 1 +
2 files changed, 34 insertions(+)

diff --git a/arch/arm/mach-omap2/omap_hwmod_43xx_data.c b/arch/arm/mach-omap2/omap_hwmod_43xx_data.c
index 97fd399..b54eeaa 100644
--- a/arch/arm/mach-omap2/omap_hwmod_43xx_data.c
+++ b/arch/arm/mach-omap2/omap_hwmod_43xx_data.c
@@ -463,6 +463,31 @@ static struct omap_hwmod am43xx_adc_tsc_hwmod = {
},
};

+static struct omap_hwmod_class_sysconfig am43xx_des_sysc = {
+ .rev_offs = 0x30,
+ .sysc_offs = 0x34,
+ .syss_offs = 0x38,
+ .sysc_flags = SYSS_HAS_RESET_STATUS,
+};
+
+static struct omap_hwmod_class am43xx_des_hwmod_class = {
+ .name = "des",
+ .sysc = &am43xx_des_sysc,
+};
+
+static struct omap_hwmod am43xx_des_hwmod = {
+ .name = "des",
+ .class = &am43xx_des_hwmod_class,
+ .clkdm_name = "l3_clkdm",
+ .main_clk = "l3_gclk",
+ .prcm = {
+ .omap4 = {
+ .clkctrl_offs = AM43XX_CM_PER_DES_CLKCTRL_OFFSET,
+ .modulemode = MODULEMODE_SWCTRL,
+ },
+ },
+};
+
/* dss */

static struct omap_hwmod am43xx_dss_core_hwmod = {
@@ -912,6 +937,13 @@ static struct omap_hwmod_ocp_if am43xx_l4_ls__vpfe1 = {
.user = OCP_USER_MPU | OCP_USER_SDMA,
};

+static struct omap_hwmod_ocp_if am43xx_l3_main__des = {
+ .master = &am33xx_l3_main_hwmod,
+ .slave = &am43xx_des_hwmod,
+ .clk = "l3_gclk",
+ .user = OCP_USER_MPU,
+};
+
static struct omap_hwmod_ocp_if *am43xx_hwmod_ocp_ifs[] __initdata = {
&am33xx_l4_wkup__synctimer,
&am43xx_l4_ls__timer8,
@@ -1004,6 +1036,7 @@ static struct omap_hwmod_ocp_if *am43xx_hwmod_ocp_ifs[] __initdata = {
&am33xx_cpgmac0__mdio,
&am33xx_l3_main__sha0,
&am33xx_l3_main__aes0,
+ &am43xx_l3_main__des,
&am43xx_l4_ls__ocp2scp0,
&am43xx_l4_ls__ocp2scp1,
&am43xx_l3_s__usbotgss0,
diff --git a/arch/arm/mach-omap2/prcm43xx.h b/arch/arm/mach-omap2/prcm43xx.h
index 7c34c44e..593482e 100644
--- a/arch/arm/mach-omap2/prcm43xx.h
+++ b/arch/arm/mach-omap2/prcm43xx.h
@@ -132,6 +132,7 @@
#define AM43XX_CM_PER_OCMCRAM_CLKCTRL_OFFSET 0x0050
#define AM43XX_CM_PER_SHA0_CLKCTRL_OFFSET 0x0058
#define AM43XX_CM_PER_AES0_CLKCTRL_OFFSET 0x0028
+#define AM43XX_CM_PER_DES_CLKCTRL_OFFSET 0x0030
#define AM43XX_CM_PER_TIMER8_CLKCTRL_OFFSET 0x0560
#define AM43XX_CM_PER_TIMER9_CLKCTRL_OFFSET 0x0568
#define AM43XX_CM_PER_TIMER10_CLKCTRL_OFFSET 0x0570
--
1.9.1

2016-06-01 09:07:46

by Tero Kristo

Subject: [PATCH 20/28] ARM: OMAP: DRA7xx: Make L4SEC clock domain SWSUP only

From: Joel Fernandes <[email protected]>

Using HWSUP for the l4sec clock domain causes warnings in the HWMOD code for
DRA7. Based on some observations, once the clock domain goes into an IDLE
state (because of no activity etc.), the IDLEST for the module goes to the
'0x2' value, which means the Interface IDLE condition. So far so good. However,
once the MODULEMODE is set to disabled for the particular IP, the IDLEST for
the module should go to '0x3', per the HW AUTO IDLE protocol. This is not
observed, and there is no reason per the protocol for the transition not to
happen. This could potentially be a bug in the HW AUTO state machine.

The workaround is to use SWSUP only for this clock domain. With this, all
IDLEST transitions happen correctly and the warnings don't occur.

Signed-off-by: Joel Fernandes <[email protected]>
Signed-off-by: Lokesh Vutla <[email protected]>
Signed-off-by: Tero Kristo <[email protected]>
---
arch/arm/mach-omap2/clockdomains7xx_data.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm/mach-omap2/clockdomains7xx_data.c b/arch/arm/mach-omap2/clockdomains7xx_data.c
index ef9ed36..6c67965 100644
--- a/arch/arm/mach-omap2/clockdomains7xx_data.c
+++ b/arch/arm/mach-omap2/clockdomains7xx_data.c
@@ -409,7 +409,7 @@ static struct clockdomain l4sec_7xx_clkdm = {
.dep_bit = DRA7XX_L4SEC_STATDEP_SHIFT,
.wkdep_srcs = l4sec_wkup_sleep_deps,
.sleepdep_srcs = l4sec_wkup_sleep_deps,
- .flags = CLKDM_CAN_HWSUP_SWSUP,
+ .flags = CLKDM_CAN_SWSUP,
};

static struct clockdomain l3main1_7xx_clkdm = {
--
1.9.1
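
The IDLEST values named in the commit message above can be decoded as sketched below. This is a stand-alone illustration, assuming the OMAP4-style CLKCTRL register layout with the IDLEST field in bits 17:16. The helper and constant names are invented for the example and are not the kernel's.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed OMAP4-style layout: IDLEST occupies bits 17:16 of CLKCTRL. */
#define MODEL_IDLEST_SHIFT	16
#define MODEL_IDLEST_MASK	(0x3u << MODEL_IDLEST_SHIFT)

static_assert(MODEL_IDLEST_MASK == 0x00030000u, "IDLEST is a 2-bit field");

enum model_idlest {
	MODEL_IDLEST_FUNCTIONAL     = 0x0, /* module fully functional */
	MODEL_IDLEST_TRANSITION     = 0x1, /* waking up or going to sleep */
	MODEL_IDLEST_INTERFACE_IDLE = 0x2, /* interface idle, the '0x2' above */
	MODEL_IDLEST_DISABLED       = 0x3, /* module disabled, the '0x3' above */
};

/* Extract the IDLEST field from a raw CLKCTRL register value. */
static inline enum model_idlest model_clkctrl_idlest(uint32_t clkctrl)
{
	return (enum model_idlest)((clkctrl & MODEL_IDLEST_MASK) >>
				   MODEL_IDLEST_SHIFT);
}

/*
 * After MODULEMODE is set to disabled, the module is expected to report
 * IDLEST == 0x3. The commit message describes it staying at 0x2 under HWSUP.
 */
static inline int model_disable_completed(uint32_t clkctrl)
{
	return model_clkctrl_idlest(clkctrl) == MODEL_IDLEST_DISABLED;
}
```

In these terms, the reported symptom is a register that keeps reading back as interface idle, so a check like model_disable_completed() never succeeds and the hwmod code warns.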

2016-06-01 09:07:49

by Tero Kristo

Subject: [PATCH 24/28] ARM: dts: DRA7: Add DT nodes for AES IP

From: Joel Fernandes <[email protected]>

DRA7 SoC has the same AES IP as OMAP4. Add DT entries for both AES cores.

Signed-off-by: Joel Fernandes <[email protected]>
Signed-off-by: Lokesh Vutla <[email protected]>
[[email protected]: squashed in the change to use EDMA, squashed in
support for two AES cores]
Signed-off-by: Tero Kristo <[email protected]>
---
arch/arm/boot/dts/dra7.dtsi | 22 ++++++++++++++++++++++
1 file changed, 22 insertions(+)

diff --git a/arch/arm/boot/dts/dra7.dtsi b/arch/arm/boot/dts/dra7.dtsi
index 959f99b..da31a72 100644
--- a/arch/arm/boot/dts/dra7.dtsi
+++ b/arch/arm/boot/dts/dra7.dtsi
@@ -1744,6 +1744,28 @@
};
};

+ aes1: aes@4b500000 {
+ compatible = "ti,omap4-aes";
+ ti,hwmods = "aes1";
+ reg = <0x4b500000 0xa0>;
+ interrupts = <GIC_SPI 80 IRQ_TYPE_LEVEL_HIGH>;
+ dmas = <&edma_xbar 111 0>, <&edma_xbar 110 0>;
+ dma-names = "tx", "rx";
+ clocks = <&l3_iclk_div>;
+ clock-names = "fck";
+ };
+
+ aes2: aes@4b700000 {
+ compatible = "ti,omap4-aes";
+ ti,hwmods = "aes2";
+ reg = <0x4b700000 0xa0>;
+ interrupts = <GIC_SPI 59 IRQ_TYPE_LEVEL_HIGH>;
+ dmas = <&edma_xbar 114 0>, <&edma_xbar 113 0>;
+ dma-names = "tx", "rx";
+ clocks = <&l3_iclk_div>;
+ clock-names = "fck";
+ };
+
des: des@480a5000 {
compatible = "ti,omap4-des";
ti,hwmods = "des";
--
1.9.1

2016-06-01 09:07:47

by Tero Kristo

Subject: [PATCH 22/28] ARM: AMx3xx: hwmod: Add data for RNG

From: Lokesh Vutla <[email protected]>

A hardware random number generator is present in both AM33xx and AM43xx
SoCs, so move the hwmod data to the common data files.

Signed-off-by: Lokesh Vutla <[email protected]>
Signed-off-by: Tero Kristo <[email protected]>
---
.../mach-omap2/omap_hwmod_33xx_43xx_common_data.h | 2 ++
.../omap_hwmod_33xx_43xx_interconnect_data.c | 8 +++++
.../mach-omap2/omap_hwmod_33xx_43xx_ipblock_data.c | 29 ++++++++++++++++++
arch/arm/mach-omap2/omap_hwmod_33xx_data.c | 35 ----------------------
arch/arm/mach-omap2/omap_hwmod_43xx_data.c | 1 +
arch/arm/mach-omap2/prcm43xx.h | 1 +
6 files changed, 41 insertions(+), 35 deletions(-)

diff --git a/arch/arm/mach-omap2/omap_hwmod_33xx_43xx_common_data.h b/arch/arm/mach-omap2/omap_hwmod_33xx_43xx_common_data.h
index 7f73796..968ce46 100644
--- a/arch/arm/mach-omap2/omap_hwmod_33xx_43xx_common_data.h
+++ b/arch/arm/mach-omap2/omap_hwmod_33xx_43xx_common_data.h
@@ -77,6 +77,7 @@ extern struct omap_hwmod_ocp_if am33xx_l4_ls__uart6;
extern struct omap_hwmod_ocp_if am33xx_l3_main__ocmc;
extern struct omap_hwmod_ocp_if am33xx_l3_main__sha0;
extern struct omap_hwmod_ocp_if am33xx_l3_main__aes0;
+extern struct omap_hwmod_ocp_if am33xx_l4_per__rng;

extern struct omap_hwmod am33xx_l3_main_hwmod;
extern struct omap_hwmod am33xx_l3_s_hwmod;
@@ -89,6 +90,7 @@ extern struct omap_hwmod am33xx_gfx_hwmod;
extern struct omap_hwmod am33xx_prcm_hwmod;
extern struct omap_hwmod am33xx_aes0_hwmod;
extern struct omap_hwmod am33xx_sha0_hwmod;
+extern struct omap_hwmod am33xx_rng_hwmod;
extern struct omap_hwmod am33xx_ocmcram_hwmod;
extern struct omap_hwmod am33xx_smartreflex0_hwmod;
extern struct omap_hwmod am33xx_smartreflex1_hwmod;
diff --git a/arch/arm/mach-omap2/omap_hwmod_33xx_43xx_interconnect_data.c b/arch/arm/mach-omap2/omap_hwmod_33xx_43xx_interconnect_data.c
index 1c210cb..b99d6ea 100644
--- a/arch/arm/mach-omap2/omap_hwmod_33xx_43xx_interconnect_data.c
+++ b/arch/arm/mach-omap2/omap_hwmod_33xx_43xx_interconnect_data.c
@@ -611,3 +611,11 @@ struct omap_hwmod_ocp_if am33xx_l3_main__aes0 = {
.addr = am33xx_aes0_addrs,
.user = OCP_USER_MPU | OCP_USER_SDMA,
};
+
+/* l4 per -> rng */
+struct omap_hwmod_ocp_if am33xx_l4_per__rng = {
+ .master = &am33xx_l4_ls_hwmod,
+ .slave = &am33xx_rng_hwmod,
+ .clk = "rng_fck",
+ .user = OCP_USER_MPU,
+};
diff --git a/arch/arm/mach-omap2/omap_hwmod_33xx_43xx_ipblock_data.c b/arch/arm/mach-omap2/omap_hwmod_33xx_43xx_ipblock_data.c
index aed3362..d2f0bb4 100644
--- a/arch/arm/mach-omap2/omap_hwmod_33xx_43xx_ipblock_data.c
+++ b/arch/arm/mach-omap2/omap_hwmod_33xx_43xx_ipblock_data.c
@@ -267,6 +267,33 @@ struct omap_hwmod am33xx_sha0_hwmod = {
},
};

+/* rng */
+static struct omap_hwmod_class_sysconfig am33xx_rng_sysc = {
+ .rev_offs = 0x1fe0,
+ .sysc_offs = 0x1fe4,
+ .sysc_flags = SYSC_HAS_AUTOIDLE | SYSC_HAS_SIDLEMODE,
+ .idlemodes = SIDLE_FORCE | SIDLE_NO,
+ .sysc_fields = &omap_hwmod_sysc_type1,
+};
+
+static struct omap_hwmod_class am33xx_rng_hwmod_class = {
+ .name = "rng",
+ .sysc = &am33xx_rng_sysc,
+};
+
+struct omap_hwmod am33xx_rng_hwmod = {
+ .name = "rng",
+ .class = &am33xx_rng_hwmod_class,
+ .clkdm_name = "l4ls_clkdm",
+ .flags = HWMOD_SWSUP_SIDLE,
+ .main_clk = "rng_fck",
+ .prcm = {
+ .omap4 = {
+ .modulemode = MODULEMODE_SWCTRL,
+ },
+ },
+};
+
/* ocmcram */
static struct omap_hwmod_class am33xx_ocmcram_hwmod_class = {
.name = "ocmcram",
@@ -1397,6 +1424,7 @@ static void omap_hwmod_am33xx_clkctrl(void)
CLKCTRL(am33xx_ocmcram_hwmod , AM33XX_CM_PER_OCMCRAM_CLKCTRL_OFFSET);
CLKCTRL(am33xx_sha0_hwmod , AM33XX_CM_PER_SHA0_CLKCTRL_OFFSET);
CLKCTRL(am33xx_aes0_hwmod , AM33XX_CM_PER_AES0_CLKCTRL_OFFSET);
+ CLKCTRL(am33xx_rng_hwmod, AM33XX_CM_PER_RNG_CLKCTRL_OFFSET);
}

static void omap_hwmod_am33xx_rst(void)
@@ -1470,6 +1498,7 @@ static void omap_hwmod_am43xx_clkctrl(void)
CLKCTRL(am33xx_ocmcram_hwmod , AM43XX_CM_PER_OCMCRAM_CLKCTRL_OFFSET);
CLKCTRL(am33xx_sha0_hwmod , AM43XX_CM_PER_SHA0_CLKCTRL_OFFSET);
CLKCTRL(am33xx_aes0_hwmod , AM43XX_CM_PER_AES0_CLKCTRL_OFFSET);
+ CLKCTRL(am33xx_rng_hwmod, AM43XX_CM_PER_RNG_CLKCTRL_OFFSET);
}

static void omap_hwmod_am43xx_rst(void)
diff --git a/arch/arm/mach-omap2/omap_hwmod_33xx_data.c b/arch/arm/mach-omap2/omap_hwmod_33xx_data.c
index cc0791d..f43ab86 100644
--- a/arch/arm/mach-omap2/omap_hwmod_33xx_data.c
+++ b/arch/arm/mach-omap2/omap_hwmod_33xx_data.c
@@ -503,41 +503,6 @@ static struct omap_hwmod_ocp_if am33xx_l3_s__usbss = {
.flags = OCPIF_SWSUP_IDLE,
};

-/* rng */
-static struct omap_hwmod_class_sysconfig am33xx_rng_sysc = {
- .rev_offs = 0x1fe0,
- .sysc_offs = 0x1fe4,
- .sysc_flags = SYSC_HAS_AUTOIDLE | SYSC_HAS_SIDLEMODE,
- .idlemodes = SIDLE_FORCE | SIDLE_NO,
- .sysc_fields = &omap_hwmod_sysc_type1,
-};
-
-static struct omap_hwmod_class am33xx_rng_hwmod_class = {
- .name = "rng",
- .sysc = &am33xx_rng_sysc,
-};
-
-static struct omap_hwmod am33xx_rng_hwmod = {
- .name = "rng",
- .class = &am33xx_rng_hwmod_class,
- .clkdm_name = "l4ls_clkdm",
- .flags = HWMOD_SWSUP_SIDLE,
- .main_clk = "rng_fck",
- .prcm = {
- .omap4 = {
- .clkctrl_offs = AM33XX_CM_PER_RNG_CLKCTRL_OFFSET,
- .modulemode = MODULEMODE_SWCTRL,
- },
- },
-};
-
-static struct omap_hwmod_ocp_if am33xx_l4_per__rng = {
- .master = &am33xx_l4_ls_hwmod,
- .slave = &am33xx_rng_hwmod,
- .clk = "rng_fck",
- .user = OCP_USER_MPU,
-};
-
static struct omap_hwmod_ocp_if *am33xx_hwmod_ocp_ifs[] __initdata = {
&am33xx_l3_main__emif,
&am33xx_mpu__l3_main,
diff --git a/arch/arm/mach-omap2/omap_hwmod_43xx_data.c b/arch/arm/mach-omap2/omap_hwmod_43xx_data.c
index b54eeaa..1cb12ea 100644
--- a/arch/arm/mach-omap2/omap_hwmod_43xx_data.c
+++ b/arch/arm/mach-omap2/omap_hwmod_43xx_data.c
@@ -994,6 +994,7 @@ static struct omap_hwmod_ocp_if *am43xx_hwmod_ocp_ifs[] __initdata = {
&am33xx_l4_per__i2c2,
&am33xx_l4_per__i2c3,
&am33xx_l4_per__mailbox,
+ &am33xx_l4_per__rng,
&am33xx_l4_ls__mcasp0,
&am33xx_l4_ls__mcasp1,
&am33xx_l4_ls__mmc0,
diff --git a/arch/arm/mach-omap2/prcm43xx.h b/arch/arm/mach-omap2/prcm43xx.h
index 593482e..75976b4 100644
--- a/arch/arm/mach-omap2/prcm43xx.h
+++ b/arch/arm/mach-omap2/prcm43xx.h
@@ -91,6 +91,7 @@
#define AM43XX_CM_PER_MAILBOX0_CLKCTRL_OFFSET 0x04b8
#define AM43XX_CM_PER_MMC0_CLKCTRL_OFFSET 0x04c0
#define AM43XX_CM_PER_MMC1_CLKCTRL_OFFSET 0x04c8
+#define AM43XX_CM_PER_RNG_CLKCTRL_OFFSET 0x04e0
#define AM43XX_CM_PER_SPI0_CLKCTRL_OFFSET 0x0500
#define AM43XX_CM_PER_SPI1_CLKCTRL_OFFSET 0x0508
#define AM43XX_CM_PER_SPINLOCK_CLKCTRL_OFFSET 0x0528
--
1.9.1
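
The patch above drops .clkctrl_offs from the shared RNG hwmod definition and sets it per SoC from the clkctrl init functions instead. A rough user-space sketch of that pattern follows. The struct, the MODEL_* names and model_clkctrl_init() are hypothetical simplifications, and the AM33xx offset is a placeholder value that is not taken from this patch.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-in for struct omap_hwmod: only the field needed here. */
struct hwmod_model {
	const char *name;
	uint16_t clkctrl_offs;
};

/* Same idea as the CLKCTRL() helper in the hunk: set the offset at init. */
#define MODEL_CLKCTRL(oh, offs)	((oh).clkctrl_offs = (offs))

#define MODEL_AM33XX_RNG_CLKCTRL_OFFSET	0x0090 /* placeholder, not from the patch */
#define MODEL_AM43XX_RNG_CLKCTRL_OFFSET	0x04e0 /* value added to prcm43xx.h */

enum soc_model { SOC_AM33XX, SOC_AM43XX };

/* One shared definition, with no SoC-specific offset in the initializer. */
struct hwmod_model model_rng_hwmod = { .name = "rng" };

/* Each SoC fills in its own offset before the hwmod data is registered. */
void model_clkctrl_init(enum soc_model soc)
{
	switch (soc) {
	case SOC_AM33XX:
		MODEL_CLKCTRL(model_rng_hwmod, MODEL_AM33XX_RNG_CLKCTRL_OFFSET);
		break;
	case SOC_AM43XX:
		MODEL_CLKCTRL(model_rng_hwmod, MODEL_AM43XX_RNG_CLKCTRL_OFFSET);
		break;
	default:
		assert(!"unknown SoC model");
	}
}
```

The shared structure stays identical for both SoCs, and only the register offset differs, which is why the definition can live in the common file.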

2016-06-01 09:07:53

by Tero Kristo

Subject: [PATCH 25/28] ARM: dts: DRA7: Add support for SHA IP

From: Lokesh Vutla <[email protected]>

DRA7 SoC has the same SHA IP as OMAP5. Add DT entry for the same.

Signed-off-by: Lokesh Vutla <[email protected]>
[[email protected]: changed SHA to use EDMA instead of SDMA]
Signed-off-by: Tero Kristo <[email protected]>
---
arch/arm/boot/dts/dra7.dtsi | 11 +++++++++++
1 file changed, 11 insertions(+)

diff --git a/arch/arm/boot/dts/dra7.dtsi b/arch/arm/boot/dts/dra7.dtsi
index da31a72..64759e1 100644
--- a/arch/arm/boot/dts/dra7.dtsi
+++ b/arch/arm/boot/dts/dra7.dtsi
@@ -1776,6 +1776,17 @@
clocks = <&l3_iclk_div>;
clock-names = "fck";
};
+
+ sham: sham@4b101000 {
+ compatible = "ti,omap5-sham";
+ ti,hwmods = "sham";
+ reg = <0x4b101000 0x300>;
+ interrupts = <GIC_SPI 46 IRQ_TYPE_LEVEL_HIGH>;
+ dmas = <&edma_xbar 119 0>;
+ dma-names = "rx";
+ clocks = <&l3_iclk_div>;
+ clock-names = "fck";
+ };
};

thermal_zones: thermal-zones {
--
1.9.1

2016-06-01 09:07:52

by Tero Kristo

Subject: [PATCH 26/28] ARM: dts: DRA7: Add DT node for RNG IP

From: Lokesh Vutla <[email protected]>

Add a DT node for the hardware random number generator IP.

Signed-off-by: Lokesh Vutla <[email protected]>
---
arch/arm/boot/dts/dra7.dtsi | 9 +++++++++
1 file changed, 9 insertions(+)

diff --git a/arch/arm/boot/dts/dra7.dtsi b/arch/arm/boot/dts/dra7.dtsi
index 64759e1..16ff083 100644
--- a/arch/arm/boot/dts/dra7.dtsi
+++ b/arch/arm/boot/dts/dra7.dtsi
@@ -1787,6 +1787,15 @@
clocks = <&l3_iclk_div>;
clock-names = "fck";
};
+
+ rng: rng@48090000 {
+ compatible = "ti,omap4-rng";
+ ti,hwmods = "rng";
+ reg = <0x48090000 0x2000>;
+ interrupts = <GIC_SPI 47 IRQ_TYPE_LEVEL_HIGH>;
+ clocks = <&l3_iclk_div>;
+ clock-names = "fck";
+ };
};

thermal_zones: thermal-zones {
--
1.9.1

2016-06-01 09:07:54

by Tero Kristo

Subject: [PATCH 27/28] ARM: dts: AM43xx: clk: Add RNG clk node

From: Lokesh Vutla <[email protected]>

Add clk node for RNG module.

Signed-off-by: Lokesh Vutla <[email protected]>
---
arch/arm/boot/dts/am43xx-clocks.dtsi | 8 ++++++++
drivers/clk/ti/clk-43xx.c | 1 +
2 files changed, 9 insertions(+)

diff --git a/arch/arm/boot/dts/am43xx-clocks.dtsi b/arch/arm/boot/dts/am43xx-clocks.dtsi
index 7630ba1..d1d73b7 100644
--- a/arch/arm/boot/dts/am43xx-clocks.dtsi
+++ b/arch/arm/boot/dts/am43xx-clocks.dtsi
@@ -104,6 +104,14 @@
clock-div = <1>;
};

+ rng_fck: rng_fck {
+ #clock-cells = <0>;
+ compatible = "fixed-factor-clock";
+ clocks = <&sys_clkin_ck>;
+ clock-mult = <1>;
+ clock-div = <1>;
+ };
+
ehrpwm0_tbclk: ehrpwm0_tbclk@664 {
#clock-cells = <0>;
compatible = "ti,gate-clock";
diff --git a/drivers/clk/ti/clk-43xx.c b/drivers/clk/ti/clk-43xx.c
index 097fc90..3f157a4 100644
--- a/drivers/clk/ti/clk-43xx.c
+++ b/drivers/clk/ti/clk-43xx.c
@@ -58,6 +58,7 @@ static struct ti_dt_clk am43xx_clks[] = {
DT_CLK(NULL, "smartreflex1_fck", "smartreflex1_fck"),
DT_CLK(NULL, "sha0_fck", "sha0_fck"),
DT_CLK(NULL, "aes0_fck", "aes0_fck"),
+ DT_CLK(NULL, "rng_fck", "rng_fck"),
DT_CLK(NULL, "timer1_fck", "timer1_fck"),
DT_CLK(NULL, "timer2_fck", "timer2_fck"),
DT_CLK(NULL, "timer3_fck", "timer3_fck"),
--
1.9.1

2016-06-01 09:07:57

by Tero Kristo

Subject: [PATCH 28/28] ARM: dts: AM43xx: Add node for RNG

From: Lokesh Vutla <[email protected]>

Add a DT node for the hardware random number generator.

Signed-off-by: Lokesh Vutla <[email protected]>
---
arch/arm/boot/dts/am4372.dtsi | 7 +++++++
1 file changed, 7 insertions(+)

diff --git a/arch/arm/boot/dts/am4372.dtsi b/arch/arm/boot/dts/am4372.dtsi
index 12fcde4..a44ee94 100644
--- a/arch/arm/boot/dts/am4372.dtsi
+++ b/arch/arm/boot/dts/am4372.dtsi
@@ -843,6 +843,13 @@
dma-names = "tx", "rx";
};

+ rng: rng@48310000 {
+ compatible = "ti,omap4-rng";
+ ti,hwmods = "rng";
+ reg = <0x48310000 0x2000>;
+ interrupts = <GIC_SPI 111 IRQ_TYPE_LEVEL_HIGH>;
+ };
+
mcasp0: mcasp@48038000 {
compatible = "ti,am33xx-mcasp-audio";
ti,hwmods = "mcasp0";
--
1.9.1

2016-06-01 09:07:50

by Tero Kristo

Subject: [PATCH 23/28] ARM: dts: DRA7: Add DT node for DES IP

From: Joel Fernandes <[email protected]>

DRA7xx SoCs have a DES3DES IP. Add DT data for the same.

Signed-off-by: Joel Fernandes <[email protected]>
---
arch/arm/boot/dts/dra7.dtsi | 11 +++++++++++
1 file changed, 11 insertions(+)

diff --git a/arch/arm/boot/dts/dra7.dtsi b/arch/arm/boot/dts/dra7.dtsi
index e007401..959f99b 100644
--- a/arch/arm/boot/dts/dra7.dtsi
+++ b/arch/arm/boot/dts/dra7.dtsi
@@ -1743,6 +1743,17 @@
clock-names = "fck", "sys_clk";
};
};
+
+ des: des@480a5000 {
+ compatible = "ti,omap4-des";
+ ti,hwmods = "des";
+ reg = <0x480a5000 0xa0>;
+ interrupts = <GIC_SPI 77 IRQ_TYPE_LEVEL_HIGH>;
+ dmas = <&sdma_xbar 117>, <&sdma_xbar 116>;
+ dma-names = "tx", "rx";
+ clocks = <&l3_iclk_div>;
+ clock-names = "fck";
+ };
};

thermal_zones: thermal-zones {
--
1.9.1

2016-06-01 09:54:11

by Grygorii Strashko

Subject: Re: [PATCH 02/28] crypto: omap-sham: Don't idle/start SHA device between Encrypt operations

On 06/01/2016 11:56 AM, Tero Kristo wrote:
> From: Lokesh Vutla <[email protected]>
>
> Calling runtime PM API for every block causes serious perf hit to
> crypto operations that are done on a long buffer.
> As crypto is performed on a page boundary, encrypting large buffers can
> cause a series of crypto operations divided by page. The runtime PM API
> is also called those many times.
>
> We call runtime_pm_get_sync only at beginning on the session (cra_init)
> and runtime_pm_put at the end. This result in upto a 50% speedup.
> This doesn't make the driver to keep the system awake as runtime get/put
> is only called during a crypto session which completes usually quickly.
>
> Signed-off-by: Lokesh Vutla <[email protected]>
> Signed-off-by: Tero Kristo <[email protected]>
> ---
> drivers/crypto/omap-sham.c | 27 +++++++++++++++++----------
> 1 file changed, 17 insertions(+), 10 deletions(-)
>
> diff --git a/drivers/crypto/omap-sham.c b/drivers/crypto/omap-sham.c
> index 6eefaa2..bd0258f 100644
> --- a/drivers/crypto/omap-sham.c
> +++ b/drivers/crypto/omap-sham.c
> @@ -360,14 +360,6 @@ static void omap_sham_copy_ready_hash(struct ahash_request *req)
>
> static int omap_sham_hw_init(struct omap_sham_dev *dd)
> {
> - int err;
> -
> - err = pm_runtime_get_sync(dd->dev);
> - if (err < 0) {
> - dev_err(dd->dev, "failed to get sync: %d\n", err);
> - return err;
> - }
> -
> if (!test_bit(FLAGS_INIT, &dd->flags)) {
> set_bit(FLAGS_INIT, &dd->flags);
> dd->err = 0;
> @@ -999,8 +991,6 @@ static void omap_sham_finish_req(struct ahash_request *req, int err)
> dd->flags &= ~(BIT(FLAGS_BUSY) | BIT(FLAGS_FINAL) | BIT(FLAGS_CPU) |
> BIT(FLAGS_DMA_READY) | BIT(FLAGS_OUTPUT_READY));
>
> - pm_runtime_put(dd->dev);
> -
> if (req->base.complete)
> req->base.complete(&req->base, err);
>
> @@ -1239,6 +1229,7 @@ static int omap_sham_cra_init_alg(struct crypto_tfm *tfm, const char *alg_base)
> {
> struct omap_sham_ctx *tctx = crypto_tfm_ctx(tfm);
> const char *alg_name = crypto_tfm_alg_name(tfm);
> + struct omap_sham_dev *dd;
>
> /* Allocate a fallback and abort if it failed. */
> tctx->fallback = crypto_alloc_shash(alg_name, 0,
> @@ -1266,6 +1257,13 @@ static int omap_sham_cra_init_alg(struct crypto_tfm *tfm, const char *alg_base)
>
> }
>
> + spin_lock_bh(&sham.lock);
> + list_for_each_entry(dd, &sham.dev_list, list) {
> + break;
> + }
> + spin_unlock_bh(&sham.lock);
> +
> + pm_runtime_get_sync(dd->dev);
> return 0;
> }
>
> @@ -1307,6 +1305,7 @@ static int omap_sham_cra_sha512_init(struct crypto_tfm *tfm)
> static void omap_sham_cra_exit(struct crypto_tfm *tfm)
> {
> struct omap_sham_ctx *tctx = crypto_tfm_ctx(tfm);
> + struct omap_sham_dev *dd;
>
> crypto_free_shash(tctx->fallback);
> tctx->fallback = NULL;
> @@ -1315,6 +1314,14 @@ static void omap_sham_cra_exit(struct crypto_tfm *tfm)
> struct omap_sham_hmac_ctx *bctx = tctx->base;
> crypto_free_shash(bctx->shash);
> }
> +
> + spin_lock_bh(&sham.lock);
> + list_for_each_entry(dd, &sham.dev_list, list) {
> + break;
> + }
> + spin_unlock_bh(&sham.lock);
> +
> + pm_runtime_get_sync(dd->dev);

Should this be pm_runtime_put() rather than pm_runtime_get_sync()?

> }
>
> static struct ahash_alg algs_sha1_md5[] = {
>


--
regards,
-grygorii
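
The concern raised in the review comment above can be shown with a tiny usage-count model. This is only a sketch, assuming a single counter per device. model_get(), model_put() and model_session() are hypothetical names standing in for the runtime PM get/put calls made from cra_init and cra_exit.

```c
#include <assert.h>

/* Hypothetical model of a device's runtime PM usage count. */
struct dev_model {
	int usage;
};

/* Take a reference: the device must stay powered while usage > 0. */
void model_get(struct dev_model *d)
{
	d->usage++;
}

/* Drop a reference: the device may idle once usage reaches 0. */
void model_put(struct dev_model *d)
{
	assert(d->usage > 0);
	d->usage--;
}

/*
 * One crypto session: a get at init, then either a put (balanced) or a
 * second get (as in the quoted hunk) at exit. Returns the final count.
 */
int model_session(struct dev_model *d, int exit_uses_put)
{
	model_get(d);			/* cra_init */
	if (exit_uses_put)
		model_put(d);		/* balanced cra_exit */
	else
		model_get(d);		/* cra_exit as posted */
	return d->usage;
}
```

With the exit path as posted, every session leaves the count two higher than before, so in this model the device could never become idle again.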

2016-06-01 23:04:37

by Dave Gerlach

Subject: Re: [PATCH 02/28] crypto: omap-sham: Don't idle/start SHA device between Encrypt operations

On 06/01/2016 04:53 AM, Grygorii Strashko wrote:
> On 06/01/2016 11:56 AM, Tero Kristo wrote:
>> From: Lokesh Vutla <[email protected]>
>>
>> Calling runtime PM API for every block causes serious perf hit to
>> crypto operations that are done on a long buffer.
>> As crypto is performed on a page boundary, encrypting large buffers can
>> cause a series of crypto operations divided by page. The runtime PM API
>> is also called those many times.
>>
>> We call runtime_pm_get_sync only at beginning on the session (cra_init)
>> and runtime_pm_put at the end. This result in upto a 50% speedup.
>> This doesn't make the driver to keep the system awake as runtime get/put
>> is only called during a crypto session which completes usually quickly.
>>
>> Signed-off-by: Lokesh Vutla <[email protected]>
>> Signed-off-by: Tero Kristo <[email protected]>
>> ---
>> drivers/crypto/omap-sham.c | 27 +++++++++++++++++----------
>> 1 file changed, 17 insertions(+), 10 deletions(-)
>>
>> diff --git a/drivers/crypto/omap-sham.c b/drivers/crypto/omap-sham.c
>> index 6eefaa2..bd0258f 100644
>> --- a/drivers/crypto/omap-sham.c
>> +++ b/drivers/crypto/omap-sham.c
>> @@ -360,14 +360,6 @@ static void omap_sham_copy_ready_hash(struct
>> ahash_request *req)
>>
>> static int omap_sham_hw_init(struct omap_sham_dev *dd)
>> {
>> - int err;
>> -
>> - err = pm_runtime_get_sync(dd->dev);
>> - if (err < 0) {
>> - dev_err(dd->dev, "failed to get sync: %d\n", err);
>> - return err;
>> - }
>> -

Would it be worth investigating a pm_runtime autosuspend approach
rather than knocking runtime PM out here completely? It is not clear to me
whether the overhead comes from the pm_runtime calls themselves or from the
actual idling of the IP. If idling the IP is what causes the slowdown, then
with a large enough autosuspend_delay we would not sleep between blocks, but
would still suspend after a long enough period of idle time.

Regards,
Dave
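
The autosuspend idea above can be illustrated with a small user-space simulation. This is a sketch under stated assumptions, not driver code: struct pm_model and the model_* functions are hypothetical, time is counted in arbitrary ticks, and a delay of 0 stands for suspending as soon as the last reference is dropped. In the kernel, the corresponding runtime PM helpers would be pm_runtime_use_autosuspend(), pm_runtime_set_autosuspend_delay(), pm_runtime_mark_last_busy() and pm_runtime_put_autosuspend().

```c
#include <assert.h>

/* Hypothetical model of a device under runtime PM. */
struct pm_model {
	int usage;	/* usage count */
	int active;	/* 1 while the device is powered */
	long last_busy;	/* tick of the most recent put */
	long delay;	/* autosuspend delay in ticks; 0 = suspend on last put */
	int resumes;	/* number of power-up transitions seen so far */
};

/* Suspend the device if it is unused and the delay has expired. */
void model_tick(struct pm_model *pm, long now)
{
	if (pm->active && pm->usage == 0 && now - pm->last_busy >= pm->delay)
		pm->active = 0;
}

/* Take a reference, powering the device up if it was suspended. */
void model_get(struct pm_model *pm, long now)
{
	model_tick(pm, now);
	if (!pm->active) {
		pm->active = 1;
		pm->resumes++;
	}
	pm->usage++;
}

/* Drop a reference and remember when the device was last busy. */
void model_put(struct pm_model *pm, long now)
{
	assert(pm->usage > 0);
	pm->usage--;
	pm->last_busy = now;
	model_tick(pm, now);
}

/* Process nblocks blocks, one tick each, with `gap` idle ticks between. */
int model_hash_blocks(struct pm_model *pm, int nblocks, long gap)
{
	long now = 0;
	int i;

	for (i = 0; i < nblocks; i++) {
		model_get(pm, now);
		now += 1;		/* one block processed */
		model_put(pm, now);
		now += gap;
	}
	return pm->resumes;
}
```

In this model, a delay of 0 powers the device up once per block, while a delay longer than the gap between blocks powers it up once per burst and still lets it suspend after a long idle period. Whether that matches the real overhead depends on the open question in the message, namely whether the cost lies in the calls or in idling the IP.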

>> if (!test_bit(FLAGS_INIT, &dd->flags)) {
>> set_bit(FLAGS_INIT, &dd->flags);
>> dd->err = 0;
>> @@ -999,8 +991,6 @@ static void omap_sham_finish_req(struct
>> ahash_request *req, int err)
>> dd->flags &= ~(BIT(FLAGS_BUSY) | BIT(FLAGS_FINAL) |
>> BIT(FLAGS_CPU) |
>> BIT(FLAGS_DMA_READY) | BIT(FLAGS_OUTPUT_READY));
>>
>> - pm_runtime_put(dd->dev);
>> -
>> if (req->base.complete)
>> req->base.complete(&req->base, err);
>>
>> @@ -1239,6 +1229,7 @@ static int omap_sham_cra_init_alg(struct
>> crypto_tfm *tfm, const char *alg_base)
>> {
>> struct omap_sham_ctx *tctx = crypto_tfm_ctx(tfm);
>> const char *alg_name = crypto_tfm_alg_name(tfm);
>> + struct omap_sham_dev *dd;
>>
>> /* Allocate a fallback and abort if it failed. */
>> tctx->fallback = crypto_alloc_shash(alg_name, 0,
>> @@ -1266,6 +1257,13 @@ static int omap_sham_cra_init_alg(struct
>> crypto_tfm *tfm, const char *alg_base)
>>
>> }
>>
>> + spin_lock_bh(&sham.lock);
>> + list_for_each_entry(dd, &sham.dev_list, list) {
>> + break;
>> + }
>> + spin_unlock_bh(&sham.lock);
>> +
>> + pm_runtime_get_sync(dd->dev);
>> return 0;
>> }
>>
>> @@ -1307,6 +1305,7 @@ static int omap_sham_cra_sha512_init(struct
>> crypto_tfm *tfm)
>> static void omap_sham_cra_exit(struct crypto_tfm *tfm)
>> {
>> struct omap_sham_ctx *tctx = crypto_tfm_ctx(tfm);
>> + struct omap_sham_dev *dd;
>>
>> crypto_free_shash(tctx->fallback);
>> tctx->fallback = NULL;
>> @@ -1315,6 +1314,14 @@ static void omap_sham_cra_exit(struct
>> crypto_tfm *tfm)
>> struct omap_sham_hmac_ctx *bctx = tctx->base;
>> crypto_free_shash(bctx->shash);
>> }
>> +
>> + spin_lock_bh(&sham.lock);
>> + list_for_each_entry(dd, &sham.dev_list, list) {
>> + break;
>> + }
>> + spin_unlock_bh(&sham.lock);
>> +
>> + pm_runtime_get_sync(dd->dev);
>
> May be put_?
>
>> }
>>
>> static struct ahash_alg algs_sha1_md5[] = {
>>
>
>

2016-06-07 10:08:37

by Herbert Xu

Subject: Re: [PATCH 02/28] crypto: omap-sham: Don't idle/start SHA device between Encrypt operations

On Wed, Jun 01, 2016 at 06:03:52PM -0500, Dave Gerlach wrote:
> On 06/01/2016 04:53 AM, Grygorii Strashko wrote:
> >On 06/01/2016 11:56 AM, Tero Kristo wrote:
> >>From: Lokesh Vutla <[email protected]>
> >>
> >>Calling runtime PM API for every block causes serious perf hit to
> >>crypto operations that are done on a long buffer.
> >>As crypto is performed on a page boundary, encrypting large buffers can
> >>cause a series of crypto operations divided by page. The runtime PM API
> >>is also called those many times.
> >>
> >>We call runtime_pm_get_sync only at beginning on the session (cra_init)
> >>and runtime_pm_put at the end. This result in upto a 50% speedup.
> >>This doesn't make the driver to keep the system awake as runtime get/put
> >>is only called during a crypto session which completes usually quickly.
> >>
> >>Signed-off-by: Lokesh Vutla <[email protected]>
> >>Signed-off-by: Tero Kristo <[email protected]>
> >>---
> >> drivers/crypto/omap-sham.c | 27 +++++++++++++++++----------
> >> 1 file changed, 17 insertions(+), 10 deletions(-)
> >>
> >>diff --git a/drivers/crypto/omap-sham.c b/drivers/crypto/omap-sham.c
> >>index 6eefaa2..bd0258f 100644
> >>--- a/drivers/crypto/omap-sham.c
> >>+++ b/drivers/crypto/omap-sham.c
> >>@@ -360,14 +360,6 @@ static void omap_sham_copy_ready_hash(struct
> >>ahash_request *req)
> >>
> >> static int omap_sham_hw_init(struct omap_sham_dev *dd)
> >> {
> >>- int err;
> >>-
> >>- err = pm_runtime_get_sync(dd->dev);
> >>- if (err < 0) {
> >>- dev_err(dd->dev, "failed to get sync: %d\n", err);
> >>- return err;
> >>- }
> >>-
>
> Would it be worth it to investigate a pm_runtime autosuspend
> approach rather than knocking runtime PM out here completely? I am
> not clear if the overhead is coming from the pm_runtime calls
> themselves or the actual idling of the IP, but if it's the idling of
> the IP causing the slowdown, with a large enough autosuspend_delay
> we don't actually sleep between each block but after a long enough
> period of idle time we would actually suspend.

Indeed, I think this patch is bogus. cra_init is associated
with the tfm object which is usually long-lived. So doing power
management there makes no sense.

Cheers,
--
Email: Herbert Xu <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

2016-06-07 10:48:37

by Herbert Xu

Subject: Re: [PATCH 01/28] crypto: omap-aes: Fix registration of algorithms

On Wed, Jun 01, 2016 at 11:56:02AM +0300, Tero Kristo wrote:
> From: Lokesh Vutla <[email protected]>
>
> Algorithms can be registered only once. So skip registration of
> algorithms if already registered (i.e. in case we have two AES cores
> in the system.)
>
> Signed-off-by: Lokesh Vutla <[email protected]>
> Signed-off-by: Tero Kristo <[email protected]>

Patch applied. Thanks.
--
Email: Herbert Xu <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

2016-06-07 11:53:51

by Tero Kristo

Subject: Re: [PATCH 02/28] crypto: omap-sham: Don't idle/start SHA device between Encrypt operations

On 07/06/16 13:08, Herbert Xu wrote:
> On Wed, Jun 01, 2016 at 06:03:52PM -0500, Dave Gerlach wrote:
>> On 06/01/2016 04:53 AM, Grygorii Strashko wrote:
>>> On 06/01/2016 11:56 AM, Tero Kristo wrote:
>>>> From: Lokesh Vutla <[email protected]>
>>>>
>>>> Calling runtime PM API for every block causes serious perf hit to
>>>> crypto operations that are done on a long buffer.
>>>> As crypto is performed on a page boundary, encrypting large buffers can
>>>> cause a series of crypto operations divided by page. The runtime PM API
>>>> is also called those many times.
>>>>
>>>> We call runtime_pm_get_sync only at beginning on the session (cra_init)
>>>> and runtime_pm_put at the end. This result in upto a 50% speedup.
>>>> This doesn't make the driver to keep the system awake as runtime get/put
>>>> is only called during a crypto session which completes usually quickly.
>>>>
>>>> Signed-off-by: Lokesh Vutla <[email protected]>
>>>> Signed-off-by: Tero Kristo <[email protected]>
>>>> ---
>>>> drivers/crypto/omap-sham.c | 27 +++++++++++++++++----------
>>>> 1 file changed, 17 insertions(+), 10 deletions(-)
>>>>
>>>> diff --git a/drivers/crypto/omap-sham.c b/drivers/crypto/omap-sham.c
>>>> index 6eefaa2..bd0258f 100644
>>>> --- a/drivers/crypto/omap-sham.c
>>>> +++ b/drivers/crypto/omap-sham.c
>>>> @@ -360,14 +360,6 @@ static void omap_sham_copy_ready_hash(struct
>>>> ahash_request *req)
>>>>
>>>> static int omap_sham_hw_init(struct omap_sham_dev *dd)
>>>> {
>>>> - int err;
>>>> -
>>>> - err = pm_runtime_get_sync(dd->dev);
>>>> - if (err < 0) {
>>>> - dev_err(dd->dev, "failed to get sync: %d\n", err);
>>>> - return err;
>>>> - }
>>>> -
>>
>> Would it be worth it to investigate a pm_runtime autosuspend
>> approach rather than knocking runtime PM out here completely? I am
>> not clear if the overhead is coming from the pm_runtime calls
>> themselves or the actual idling of the IP, but if it's the idling of
>> the IP causing the slowdown, with a large enough autosuspend_delay
>> we don't actually sleep between each block but after a long enough
>> period of idle time we would actually suspend.
>
> Indeed, I think this patch is bogus. cra_init is associated
> with the tfm object which is usually long-lived. So doing power
> management there makes no sense.
>
> Cheers,
>

I can investigate this further, but I believe this patch itself gave a
noticeable performance boost.

This is an optimization anyway, and not critical for functionality.

-Tero

2016-06-07 12:25:16

by Grygorii Strashko

Subject: Re: [PATCH 02/28] crypto: omap-sham: Don't idle/start SHA device between Encrypt operations

On 06/07/2016 02:52 PM, Tero Kristo wrote:
> On 07/06/16 13:08, Herbert Xu wrote:
>> On Wed, Jun 01, 2016 at 06:03:52PM -0500, Dave Gerlach wrote:
>>> On 06/01/2016 04:53 AM, Grygorii Strashko wrote:
>>>> On 06/01/2016 11:56 AM, Tero Kristo wrote:
>>>>> From: Lokesh Vutla <[email protected]>
>>>>>
>>>>> Calling runtime PM API for every block causes serious perf hit to
>>>>> crypto operations that are done on a long buffer.
>>>>> As crypto is performed on a page boundary, encrypting large buffers
>>>>> can
>>>>> cause a series of crypto operations divided by page. The runtime PM
>>>>> API
>>>>> is also called those many times.
>>>>>
>>>>> We call runtime_pm_get_sync only at beginning on the session
>>>>> (cra_init)
>>>>> and runtime_pm_put at the end. This result in upto a 50% speedup.
>>>>> This doesn't make the driver to keep the system awake as runtime
>>>>> get/put
>>>>> is only called during a crypto session which completes usually
>>>>> quickly.
>>>>>
>>>>> Signed-off-by: Lokesh Vutla <[email protected]>
>>>>> Signed-off-by: Tero Kristo <[email protected]>
>>>>> ---
>>>>> drivers/crypto/omap-sham.c | 27 +++++++++++++++++----------
>>>>> 1 file changed, 17 insertions(+), 10 deletions(-)
>>>>>
>>>>> diff --git a/drivers/crypto/omap-sham.c b/drivers/crypto/omap-sham.c
>>>>> index 6eefaa2..bd0258f 100644
>>>>> --- a/drivers/crypto/omap-sham.c
>>>>> +++ b/drivers/crypto/omap-sham.c
>>>>> @@ -360,14 +360,6 @@ static void omap_sham_copy_ready_hash(struct
>>>>> ahash_request *req)
>>>>>
>>>>> static int omap_sham_hw_init(struct omap_sham_dev *dd)
>>>>> {
>>>>> - int err;
>>>>> -
>>>>> - err = pm_runtime_get_sync(dd->dev);
>>>>> - if (err < 0) {
>>>>> - dev_err(dd->dev, "failed to get sync: %d\n", err);
>>>>> - return err;
>>>>> - }
>>>>> -
>>>
>>> Would it be worth it to investigate a pm_runtime autosuspend
>>> approach rather than knocking runtime PM out here completely? I am
>>> not clear if the overhead is coming from the pm_runtime calls
>>> themselves or the actual idling of the IP, but if it's the idling of
>>> the IP causing the slowdown, with a large enough autosuspend_delay
>>> we don't actually sleep between each block but after a long enough
>>> period of idle time we would actually suspend.
>>
>> Indeed, I think this patch is bogus. cra_init is associated
>> with the tfm object which is usually long-lived. So doing power
>> management there makes no sense.
>>
>> Cheers,
>>
>
> I can investigate this further, but I believe this patch itself gave a
> noticeable performance boost.
>
> This is an optimization anyway, and not critical for functionality.
>

It is non-critical only if the code below does not introduce races:
+ spin_lock_bh(&sham.lock);
+ list_for_each_entry(dd, &sham.dev_list, list) {
+ break;
+ }
+ spin_unlock_bh(&sham.lock);

Is it guaranteed that dd will always be alive at this moment?

+
+ pm_runtime_get_sync(dd->dev);
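
To illustrate the concern, here is a minimal userspace sketch of a
safer lookup. It is not the driver's code: dev_model, dev_list,
first_dev_get, dev_put and tfm_init are hypothetical stand-ins, and the
locking is only described in comments. Two things are modelled: the
lookup reports an empty list explicitly (in the kernel,
list_for_each_entry() over an empty list leaves the cursor pointing at
the list head's container rather than at NULL, so a helper such as
list_first_entry_or_null() is one option), and a reference is taken
before the lock would be dropped.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver's device and device list. */
struct dev_model {
	struct dev_model *next;
	int refcount;            /* users currently holding this device */
};

struct dev_list {
	struct dev_model *head;  /* NULL when no device is registered */
};

/*
 * Return the first device with a reference taken, or NULL if the list
 * is empty. In a driver this would run with the list lock held, so the
 * entry cannot be unlinked between the lookup and the reference.
 */
static struct dev_model *first_dev_get(struct dev_list *list)
{
	struct dev_model *dd = list->head;

	if (dd)
		dd->refcount++;
	return dd;
}

/* Drop a reference taken by first_dev_get(). */
static void dev_put(struct dev_model *dd)
{
	assert(dd->refcount > 0);
	dd->refcount--;
}

/* Model of an init path: fail instead of using a missing device. */
static int tfm_init(struct dev_list *list, struct dev_model **out)
{
	struct dev_model *dd = first_dev_get(list);

	if (!dd)
		return -1;       /* a driver might return -ENODEV here */
	*out = dd;
	return 0;
}
```

The matching exit path would call dev_put() on the same device, which
also avoids looking the device up a second time.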



--
regards,
-grygorii

2016-06-10 11:38:40

by Tony Lindgren

Subject: Re: [PATCH 23/28] ARM: dts: DRA7: Add DT node for DES IP

* Tero Kristo <[email protected]> [160601 02:09]:
> From: Joel Fernandes <[email protected]>
>
> DRA7xx SoCs have a DES3DES IP. Add DT data for the same.

Are these dts changes safe to apply separately or do they
cause issues like extra warnings during boot?

Regards,

Tony

2016-06-20 12:11:54

by Tero Kristo

Subject: Re: [PATCH 01/28] crypto: omap-aes: Fix registration of algorithms

On 07/06/16 13:48, Herbert Xu wrote:
> On Wed, Jun 01, 2016 at 11:56:02AM +0300, Tero Kristo wrote:
>> From: Lokesh Vutla <[email protected]>
>>
>> Algorithms can be registered only once. So skip registration of
>> algorithms if already registered (i.e. in case we have two AES cores
>> in the system.)
>>
>> Signed-off-by: Lokesh Vutla <[email protected]>
>> Signed-off-by: Tero Kristo <[email protected]>
>
> Patch applied. Thanks.
>

Thanks,

Did you check the rest of the series? I only got feedback on this and
patch #2 of the series. Shall I repost the remainder of the series as a
whole or...?

-Tero

2016-06-20 23:49:29

by Herbert Xu

Subject: Re: [PATCH 01/28] crypto: omap-aes: Fix registration of algorithms

On Mon, Jun 20, 2016 at 03:11:54PM +0300, Tero Kristo wrote:
>
> Did you check the rest of the series? I only got feedback for this
> and patch #2 on the series, shall I repost the remainder of the
> series as a whole or...?

Yes please repost them.

Thanks,
--
Email: Herbert Xu <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

2016-06-21 17:56:44

by Tero Kristo

Subject: Re: [PATCH 23/28] ARM: dts: DRA7: Add DT node for DES IP

On 10/06/16 14:38, Tony Lindgren wrote:
> * Tero Kristo <[email protected]> [160601 02:09]:
>> From: Joel Fernandes <[email protected]>
>>
>> DRA7xx SoCs have a DES3DES IP. Add DT data for the same.
>
> Are these dts changes safe to apply separately or do they
> cause issues like extra warnings during boot?

The DTS changes are fine to merge separately as is, though some crypto
functionality might not work properly (well, the support is somewhat
broken on am43xx/dra7 anyway). I just gave the kernel a boot test on
am43xx/dra7 platforms with only the DT data applied, and that part
worked fine.

I noticed, though, that if you merge the hwmod changes before the DTS
data is in, you will get some extra boot warnings of the following kind
on am43xx/dra7:

<snip>
[ 0.315623] omap_hwmod: aes1: no dt node
[ 0.315631] ------------[ cut here ]------------
[ 0.315656] WARNING: CPU: 0 PID: 1 at arch/arm/mach-omap2/omap_hwmod.c:2497 _init+0x1d0/0x41c
[ 0.315663] omap_hwmod: aes1: doesn't have mpu register target base
<snip>

Do you want to pick up the DTS changes from this revision of the series
as is, or shall I repost those as well? I think the series would need to
be re-ordered so that the DTS changes are posted before the hwmod data.

-Tero

2016-06-22 07:58:40

by Tony Lindgren

Subject: Re: [PATCH 23/28] ARM: dts: DRA7: Add DT node for DES IP

* Tero Kristo <[email protected]> [160621 10:58]:
>
> Do you want to pick-up the DTS changes from this revision of series as is or
> shall I repost those also? I think the series would require a re-ordering of
> posting the DTS changes before the hwmod data.

I'll pick the dts changes from this series into omap-for-v4.8/dt,
thanks. Please split the rest of the patches into separate hwmod
and driver changes, and describe any dependencies on the order in
which they should get merged.

Regards,

Tony

2016-06-22 09:17:09

by Tero Kristo

Subject: Re: [PATCH 02/28] crypto: omap-sham: Don't idle/start SHA device between Encrypt operations

On 07/06/16 15:24, Grygorii Strashko wrote:
> On 06/07/2016 02:52 PM, Tero Kristo wrote:
>> On 07/06/16 13:08, Herbert Xu wrote:
>>> On Wed, Jun 01, 2016 at 06:03:52PM -0500, Dave Gerlach wrote:
>>>> On 06/01/2016 04:53 AM, Grygorii Strashko wrote:
>>>>> On 06/01/2016 11:56 AM, Tero Kristo wrote:
>>>>>> From: Lokesh Vutla <[email protected]>
>>>>>>
>>>>>> Calling runtime PM API for every block causes serious perf hit to
>>>>>> crypto operations that are done on a long buffer.
>>>>>> As crypto is performed on a page boundary, encrypting large buffers
>>>>>> can
>>>>>> cause a series of crypto operations divided by page. The runtime PM
>>>>>> API
>>>>>> is also called those many times.
>>>>>>
>>>>>> We call runtime_pm_get_sync only at beginning on the session
>>>>>> (cra_init)
>>>>>> and runtime_pm_put at the end. This result in upto a 50% speedup.
>>>>>> This doesn't make the driver to keep the system awake as runtime
>>>>>> get/put
>>>>>> is only called during a crypto session which completes usually
>>>>>> quickly.
>>>>>>
>>>>>> Signed-off-by: Lokesh Vutla <[email protected]>
>>>>>> Signed-off-by: Tero Kristo <[email protected]>
>>>>>> ---
>>>>>> drivers/crypto/omap-sham.c | 27 +++++++++++++++++----------
>>>>>> 1 file changed, 17 insertions(+), 10 deletions(-)
>>>>>>
>>>>>> diff --git a/drivers/crypto/omap-sham.c b/drivers/crypto/omap-sham.c
>>>>>> index 6eefaa2..bd0258f 100644
>>>>>> --- a/drivers/crypto/omap-sham.c
>>>>>> +++ b/drivers/crypto/omap-sham.c
>>>>>> @@ -360,14 +360,6 @@ static void omap_sham_copy_ready_hash(struct
>>>>>> ahash_request *req)
>>>>>>
>>>>>> static int omap_sham_hw_init(struct omap_sham_dev *dd)
>>>>>> {
>>>>>> - int err;
>>>>>> -
>>>>>> - err = pm_runtime_get_sync(dd->dev);
>>>>>> - if (err < 0) {
>>>>>> - dev_err(dd->dev, "failed to get sync: %d\n", err);
>>>>>> - return err;
>>>>>> - }
>>>>>> -
>>>>
>>>> Would it be worth it to investigate a pm_runtime autosuspend
>>>> approach rather than knocking runtime PM out here completely? I am
>>>> not clear if the overhead is coming from the pm_runtime calls
>>>> themselves or the actual idling of the IP, but if it's the idling of
>>>> the IP causing the slowdown, with a large enough autosuspend_delay
>>>> we don't actually sleep between each block but after a long enough
>>>> period of idle time we would actually suspend.
>>>
>>> Indeed, I think this patch is bogus. cra_init is associated
>>> with the tfm object which is usually long-lived. So doing power
>>> management there makes no sense.
>>>
>>> Cheers,
>>>
>>
>> I can investigate this further, but I believe this patch itself gave a
>> noticeable performance boost.
>>
>> This is an optimization anyway, and not critical for functionality.
>>
>
> It is not critical only if below code would not introduce races

I don't get your point here. This patch is an optimization, and the
driver works fine without it.

> + spin_lock_bh(&sham.lock);
> + list_for_each_entry(dd, &sham.dev_list, list) {
> + break;
> + }
> + spin_unlock_bh(&sham.lock);
>
> Is it guaranteed that dd will alive always at this moment?

Typically yes, but I think there might be a race condition here if the
driver is removed during operation. Anyway, I'll drop this patch and
change the optimization to use autosuspend as Dave suggested; that gives
almost the same performance boost as this one (I lose a couple of
percent in overall performance, but I can live with that).

-Tero

>
> +
> + pm_runtime_get_sync(dd->dev);
>
>
>