2013-08-14 23:13:47

by Joel Fernandes

[permalink] [raw]
Subject: [PATCH 00/10] crypto: omap-aes: DMA and PIO mode improvements

This patch series is a rewrite of the DMA portion of omap-aes driver
and also adds support for PIO mode. Both these modes, give better
performance than before.

Earlier, only a single SG was used for DMA purpose, and the SG-list
passed from the crypto layer was being copied and DMA'd one entry at
a time. This turns out to be quite inefficient, so we replace it with
much simpler code that directly passes the SG-list from crypto to the
DMA layer.

We also add PIO mode support to the driver, and switch to PIO mode
whenever the DMA channel allocation is not available. This is only for
OMAP4 platform will work on any platform on which IRQ information is
populated.

Tests performed on am33xx and omap4 SoCs , notice the 50% perf improvement
for large 8K blocks:

Sample run on am33xx (beaglebone):
With DMA rewrite:
[ 26.410052] test 0 (128 bit key, 16 byte blocks): 4318 operations in 1 seconds (69088 bytes)
[ 27.414314] test 1 (128 bit key, 64 byte blocks): 4360 operations in 1 seconds (279040 bytes)
[ 28.414406] test 2 (128 bit key, 256 byte blocks): 3609 operations in 1 seconds (923904 bytes)
[ 29.414410] test 3 (128 bit key, 1024 byte blocks): 3418 operations in 1 seconds (3500032 bytes)
[ 30.414510] test 4 (128 bit key, 8192 byte blocks): 1766 operations in 1 seconds (14467072 bytes)

Without DMA rewrite:
[ 31.920519] test 0 (128 bit key, 16 byte blocks): 4417 operations in 1 seconds (70672 bytes)
[ 32.925997] test 1 (128 bit key, 64 byte blocks): 4221 operations in 1 seconds (270144 bytes)
[ 33.926194] test 2 (128 bit key, 256 byte blocks): 3528 operations in 1 seconds (903168 bytes)
[ 34.926225] test 3 (128 bit key, 1024 byte blocks): 3281 operations in 1 seconds (3359744 bytes)
[ 35.926385] test 4 (128 bit key, 8192 byte blocks): 1460 operations in 1 seconds (11960320 bytes)

With PIO mode, note the tremndous boost in performance for small blocks there:
[ 27.294905] test 0 (128 bit key, 16 byte blocks): 20585 operations in 1 seconds (329360 bytes)
[ 28.302282] test 1 (128 bit key, 64 byte blocks): 8106 operations in 1 seconds (518784 bytes)
[ 29.302374] test 2 (128 bit key, 256 byte blocks): 2359 operations in 1 seconds (603904 bytes)
[ 30.302575] test 3 (128 bit key, 1024 byte blocks): 605 operations in 1 seconds (619520 bytes)
[ 31.303781] test 4 (128 bit key, 8192 byte blocks): 79 operations in 1 seconds (647168 bytes)

Future work in this direction would be to dynamically change between
PIO/DMA mode based on the block size.

Joel Fernandes (10):
crypto: scatterwalk: Add support for calculating number of SG
elements
crypto: omap-aes: Add useful debug macros
crypto: omap-aes: Populate number of SG elements
crypto: omap-aes: Simplify DMA usage by using direct SGs
crypto: omap-aes: Sync SG before DMA operation
crypto: omap-aes: Remove previously used intermediate buffers
crypto: omap-aes: Add IRQ info and helper macros
crypto: omap-aes: PIO mode: Add IRQ handler and walk SGs
crypto: omap-aes: PIO mode: platform data for OMAP4 and trigger it
crypto: omap-aes: Switch to PIO mode in probe function

crypto/scatterwalk.c | 22 +++
drivers/crypto/omap-aes.c | 400 ++++++++++++++++++++----------------------
include/crypto/scatterwalk.h | 2 +
3 files changed, 217 insertions(+), 207 deletions(-)

--
1.7.9.5


2013-08-14 23:13:44

by Joel Fernandes

[permalink] [raw]
Subject: [PATCH 08/10] crypto: omap-aes: PIO mode: Add IRQ handler and walk SGs

We add an IRQ handler that implements a state-machine for PIO-mode
and data structures for walking the scatter-gather list. The IRQ handler
is called in succession both when data is available to read or next
data can be sent for processing. This process continues till the entire
in/out SG lists have been walked. Once the SG-list has been completely
walked, the IRQ handler schedules the done_task tasklet.

Also add a useful macro that is used through out the IRQ code for a
common pattern of calculating how much an SG list has been walked.
This improves code readability and avoids checkpatch errors.

Signed-off-by: Joel Fernandes <[email protected]>
---
drivers/crypto/omap-aes.c | 90 +++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 90 insertions(+)

diff --git a/drivers/crypto/omap-aes.c b/drivers/crypto/omap-aes.c
index c057eac..891455b 100644
--- a/drivers/crypto/omap-aes.c
+++ b/drivers/crypto/omap-aes.c
@@ -46,6 +46,8 @@
#define DST_MAXBURST 4
#define DMA_MIN (DST_MAXBURST * sizeof(u32))

+#define _calc_walked(inout) (dd->inout##_walk.offset - dd->inout##_sg->offset)
+
/* OMAP TRM gives bitfields as start:end, where start is the higher bit
number. For example 7:0 */
#define FLD_MASK(start, end) (((1 << ((start) - (end) + 1)) - 1) << (end))
@@ -98,6 +100,8 @@
#define FLAGS_FAST BIT(5)
#define FLAGS_BUSY BIT(6)

+#define AES_BLOCK_WORDS (AES_BLOCK_SIZE >> 2)
+
struct omap_aes_ctx {
struct omap_aes_dev *dd;

@@ -163,6 +167,8 @@ struct omap_aes_dev {
size_t total;
struct scatterlist *in_sg;
struct scatterlist *out_sg;
+ struct scatter_walk in_walk;
+ struct scatter_walk out_walk;
int dma_in;
struct dma_chan *dma_lch_in;
int dma_out;
@@ -862,6 +868,90 @@ static const struct omap_aes_pdata omap_aes_pdata_omap4 = {
.minor_shift = 0,
};

+static irqreturn_t omap_aes_irq(int irq, void *dev_id)
+{
+ struct omap_aes_dev *dd = dev_id;
+ u32 status, i;
+ u32 *src, *dst;
+
+ status = omap_aes_read(dd, AES_REG_IRQ_STATUS(dd));
+ if (status & AES_REG_IRQ_DATA_IN) {
+ omap_aes_write(dd, AES_REG_IRQ_ENABLE(dd), 0x0);
+
+ BUG_ON(!dd->in_sg);
+
+ BUG_ON(_calc_walked(in) > dd->in_sg->length);
+
+ src = sg_virt(dd->in_sg) + _calc_walked(in);
+
+ for (i = 0; i < AES_BLOCK_WORDS; i++) {
+ omap_aes_write(dd, AES_REG_DATA_N(dd, i), *src);
+
+ scatterwalk_advance(&dd->in_walk, 4);
+ if (dd->in_sg->length == _calc_walked(in)) {
+ dd->in_sg = scatterwalk_sg_next(dd->in_sg);
+ if (dd->in_sg) {
+ scatterwalk_start(&dd->in_walk,
+ dd->in_sg);
+ src = sg_virt(dd->in_sg) +
+ _calc_walked(in);
+ }
+ } else {
+ src++;
+ }
+ }
+
+ /* Clear IRQ status */
+ status &= ~AES_REG_IRQ_DATA_IN;
+ omap_aes_write(dd, AES_REG_IRQ_STATUS(dd), status);
+
+ /* Enable DATA_OUT interrupt */
+ omap_aes_write(dd, AES_REG_IRQ_ENABLE(dd), 0x4);
+
+ } else if (status & AES_REG_IRQ_DATA_OUT) {
+ omap_aes_write(dd, AES_REG_IRQ_ENABLE(dd), 0x0);
+
+ BUG_ON(!dd->out_sg);
+
+ BUG_ON(_calc_walked(out) > dd->out_sg->length);
+
+ dst = sg_virt(dd->out_sg) + _calc_walked(out);
+
+ for (i = 0; i < AES_BLOCK_WORDS; i++) {
+ *dst = omap_aes_read(dd, AES_REG_DATA_N(dd, i));
+ scatterwalk_advance(&dd->out_walk, 4);
+ if (dd->out_sg->length == _calc_walked(out)) {
+ dd->out_sg = scatterwalk_sg_next(dd->out_sg);
+ if (dd->out_sg) {
+ scatterwalk_start(&dd->out_walk,
+ dd->out_sg);
+ dst = sg_virt(dd->out_sg) +
+ _calc_walked(out);
+ }
+ } else {
+ dst++;
+ }
+ }
+
+ dd->total -= AES_BLOCK_SIZE;
+
+ BUG_ON(dd->total < 0);
+
+ /* Clear IRQ status */
+ status &= ~AES_REG_IRQ_DATA_OUT;
+ omap_aes_write(dd, AES_REG_IRQ_STATUS(dd), status);
+
+ if (!dd->total)
+ /* All bytes read! */
+ tasklet_schedule(&dd->done_task);
+ else
+ /* Enable DATA_IN interrupt for next block */
+ omap_aes_write(dd, AES_REG_IRQ_ENABLE(dd), 0x2);
+ }
+
+ return IRQ_HANDLED;
+}
+
static const struct of_device_id omap_aes_of_match[] = {
{
.compatible = "ti,omap2-aes",
--
1.7.9.5

2013-08-14 23:13:54

by Joel Fernandes

[permalink] [raw]
Subject: [PATCH 06/10] crypto: omap-aes: Remove previously used intermediate buffers

Intermdiate buffers were allocated, mapped and used for DMA.
These are no longer required as we use the SGs from crypto layer
directly in previous commits in the series. Also along with it,
remove the logic for copying SGs etc as they are no longer used,
and all the associated variables in omap_aes_device.

Signed-off-by: Joel Fernandes <[email protected]>
---
drivers/crypto/omap-aes.c | 90 ---------------------------------------------
1 file changed, 90 deletions(-)

diff --git a/drivers/crypto/omap-aes.c b/drivers/crypto/omap-aes.c
index df2f867..71da52b 100644
--- a/drivers/crypto/omap-aes.c
+++ b/drivers/crypto/omap-aes.c
@@ -156,25 +156,13 @@ struct omap_aes_dev {
struct ablkcipher_request *req;
size_t total;
struct scatterlist *in_sg;
- struct scatterlist in_sgl;
- size_t in_offset;
struct scatterlist *out_sg;
- struct scatterlist out_sgl;
- size_t out_offset;
-
- size_t buflen;
- void *buf_in;
- size_t dma_size;
int dma_in;
struct dma_chan *dma_lch_in;
- dma_addr_t dma_addr_in;
- void *buf_out;
int dma_out;
struct dma_chan *dma_lch_out;
int in_sg_len;
int out_sg_len;
- dma_addr_t dma_addr_out;
-
const struct omap_aes_pdata *pdata;
};

@@ -362,33 +350,6 @@ static int omap_aes_dma_init(struct omap_aes_dev *dd)
dd->dma_lch_out = NULL;
dd->dma_lch_in = NULL;

- dd->buf_in = (void *)__get_free_pages(GFP_KERNEL, OMAP_AES_CACHE_SIZE);
- dd->buf_out = (void *)__get_free_pages(GFP_KERNEL, OMAP_AES_CACHE_SIZE);
- dd->buflen = PAGE_SIZE << OMAP_AES_CACHE_SIZE;
- dd->buflen &= ~(AES_BLOCK_SIZE - 1);
-
- if (!dd->buf_in || !dd->buf_out) {
- dev_err(dd->dev, "unable to alloc pages.\n");
- goto err_alloc;
- }
-
- /* MAP here */
- dd->dma_addr_in = dma_map_single(dd->dev, dd->buf_in, dd->buflen,
- DMA_TO_DEVICE);
- if (dma_mapping_error(dd->dev, dd->dma_addr_in)) {
- dev_err(dd->dev, "dma %d bytes error\n", dd->buflen);
- err = -EINVAL;
- goto err_map_in;
- }
-
- dd->dma_addr_out = dma_map_single(dd->dev, dd->buf_out, dd->buflen,
- DMA_FROM_DEVICE);
- if (dma_mapping_error(dd->dev, dd->dma_addr_out)) {
- dev_err(dd->dev, "dma %d bytes error\n", dd->buflen);
- err = -EINVAL;
- goto err_map_out;
- }
-
dma_cap_zero(mask);
dma_cap_set(DMA_SLAVE, mask);

@@ -415,14 +376,6 @@ static int omap_aes_dma_init(struct omap_aes_dev *dd)
err_dma_out:
dma_release_channel(dd->dma_lch_in);
err_dma_in:
- dma_unmap_single(dd->dev, dd->dma_addr_out, dd->buflen,
- DMA_FROM_DEVICE);
-err_map_out:
- dma_unmap_single(dd->dev, dd->dma_addr_in, dd->buflen, DMA_TO_DEVICE);
-err_map_in:
- free_pages((unsigned long)dd->buf_out, OMAP_AES_CACHE_SIZE);
- free_pages((unsigned long)dd->buf_in, OMAP_AES_CACHE_SIZE);
-err_alloc:
if (err)
pr_err("error: %d\n", err);
return err;
@@ -432,11 +385,6 @@ static void omap_aes_dma_cleanup(struct omap_aes_dev *dd)
{
dma_release_channel(dd->dma_lch_out);
dma_release_channel(dd->dma_lch_in);
- dma_unmap_single(dd->dev, dd->dma_addr_out, dd->buflen,
- DMA_FROM_DEVICE);
- dma_unmap_single(dd->dev, dd->dma_addr_in, dd->buflen, DMA_TO_DEVICE);
- free_pages((unsigned long)dd->buf_out, OMAP_AES_CACHE_SIZE);
- free_pages((unsigned long)dd->buf_in, OMAP_AES_CACHE_SIZE);
}

static void sg_copy_buf(void *buf, struct scatterlist *sg,
@@ -453,42 +401,6 @@ static void sg_copy_buf(void *buf, struct scatterlist *sg,
scatterwalk_done(&walk, out, 0);
}

-static int sg_copy(struct scatterlist **sg, size_t *offset, void *buf,
- size_t buflen, size_t total, int out)
-{
- unsigned int count, off = 0;
-
- while (buflen && total) {
- count = min((*sg)->length - *offset, total);
- count = min(count, buflen);
-
- if (!count)
- return off;
-
- /*
- * buflen and total are AES_BLOCK_SIZE size aligned,
- * so count should be also aligned
- */
-
- sg_copy_buf(buf + off, *sg, *offset, count, out);
-
- off += count;
- buflen -= count;
- *offset += count;
- total -= count;
-
- if (*offset == (*sg)->length) {
- *sg = sg_next(*sg);
- if (*sg)
- *offset = 0;
- else
- total = 0;
- }
- }
-
- return off;
-}
-
static int omap_aes_crypt_dma(struct crypto_tfm *tfm,
struct scatterlist *in_sg, struct scatterlist *out_sg,
int in_sg_len, int out_sg_len)
@@ -653,9 +565,7 @@ static int omap_aes_handle_queue(struct omap_aes_dev *dd,
/* assign new request to device */
dd->req = req;
dd->total = req->nbytes;
- dd->in_offset = 0;
dd->in_sg = req->src;
- dd->out_offset = 0;
dd->out_sg = req->dst;

dd->in_sg_len = scatterwalk_bytes_sglen(dd->in_sg, dd->total);
--
1.7.9.5

2013-08-14 23:13:51

by Joel Fernandes

[permalink] [raw]
Subject: [PATCH 01/10] crypto: scatterwalk: Add support for calculating number of SG elements

Crypto layer only passes nbytes to encrypt but in omap-aes driver we
need to know number of SG elements to pass to dmaengine slave API.
We add function for the same to scatterwalk library.

Signed-off-by: Joel Fernandes <[email protected]>
---
crypto/scatterwalk.c | 22 ++++++++++++++++++++++
include/crypto/scatterwalk.h | 2 ++
2 files changed, 24 insertions(+)

diff --git a/crypto/scatterwalk.c b/crypto/scatterwalk.c
index 7281b8a..79ca227 100644
--- a/crypto/scatterwalk.c
+++ b/crypto/scatterwalk.c
@@ -124,3 +124,25 @@ void scatterwalk_map_and_copy(void *buf, struct scatterlist *sg,
scatterwalk_done(&walk, out, 0);
}
EXPORT_SYMBOL_GPL(scatterwalk_map_and_copy);
+
+int scatterwalk_bytes_sglen(struct scatterlist *sg, int num_bytes)
+{
+ int offset = 0, n = 0;
+
+ /* num_bytes is too small */
+ if (num_bytes < sg->length)
+ return -1;
+
+ do {
+ offset += sg->length;
+ n++;
+ sg = scatterwalk_sg_next(sg);
+
+ /* num_bytes is too large */
+ if (unlikely(!sg && (num_bytes < offset)))
+ return -1;
+ } while (sg && (num_bytes > offset));
+
+ return n;
+}
+EXPORT_SYMBOL_GPL(scatterwalk_bytes_sglen);
diff --git a/include/crypto/scatterwalk.h b/include/crypto/scatterwalk.h
index 3744d2a..13621cc 100644
--- a/include/crypto/scatterwalk.h
+++ b/include/crypto/scatterwalk.h
@@ -113,4 +113,6 @@ void scatterwalk_done(struct scatter_walk *walk, int out, int more);
void scatterwalk_map_and_copy(void *buf, struct scatterlist *sg,
unsigned int start, unsigned int nbytes, int out);

+int scatterwalk_bytes_sglen(struct scatterlist *sg, int num_bytes);
+
#endif /* _CRYPTO_SCATTERWALK_H */
--
1.7.9.5

2013-08-14 23:14:35

by Joel Fernandes

[permalink] [raw]
Subject: [PATCH 04/10] crypto: omap-aes: Simplify DMA usage by using direct SGs

In early version of this driver, assumptions were made such as
DMA layer requires contiguous buffers etc. Due to this, new buffers
were allocated, mapped and used for DMA. These assumptions are no
longer true and DMAEngine scatter-gather DMA doesn't have such
requirements. We simply the DMA operations by directly using the
scatter-gather buffers provided by the crypto layer instead of
creating our own.

Lot of logic that handled DMA'ing only X number of bytes of the
total, or as much as fitted into a 3rd party buffer is removed
and is no longer required.

Also, drastically improves performance (atleast 35% seen with
encrypting a buffer size of 8K (2800 ops/sec vs 1800 ops/sec).
Also DMA usage is much more simplified and coherent with rest
of the code.

Signed-off-by: Joel Fernandes <[email protected]>
---
drivers/crypto/omap-aes.c | 147 ++++++++-------------------------------------
1 file changed, 25 insertions(+), 122 deletions(-)

diff --git a/drivers/crypto/omap-aes.c b/drivers/crypto/omap-aes.c
index b7189a8..9104a7f 100644
--- a/drivers/crypto/omap-aes.c
+++ b/drivers/crypto/omap-aes.c
@@ -490,22 +490,14 @@ static int sg_copy(struct scatterlist **sg, size_t *offset, void *buf,
}

static int omap_aes_crypt_dma(struct crypto_tfm *tfm,
- struct scatterlist *in_sg, struct scatterlist *out_sg)
+ struct scatterlist *in_sg, struct scatterlist *out_sg,
+ int in_sg_len, int out_sg_len)
{
struct omap_aes_ctx *ctx = crypto_tfm_ctx(tfm);
struct omap_aes_dev *dd = ctx->dd;
struct dma_async_tx_descriptor *tx_in, *tx_out;
struct dma_slave_config cfg;
- dma_addr_t dma_addr_in = sg_dma_address(in_sg);
- int ret, length = sg_dma_len(in_sg);
-
- pr_debug("len: %d\n", length);
-
- dd->dma_size = length;
-
- if (!(dd->flags & FLAGS_FAST))
- dma_sync_single_for_device(dd->dev, dma_addr_in, length,
- DMA_TO_DEVICE);
+ int ret;

memset(&cfg, 0, sizeof(cfg));

@@ -524,7 +516,7 @@ static int omap_aes_crypt_dma(struct crypto_tfm *tfm,
return ret;
}

- tx_in = dmaengine_prep_slave_sg(dd->dma_lch_in, in_sg, 1,
+ tx_in = dmaengine_prep_slave_sg(dd->dma_lch_in, in_sg, in_sg_len,
DMA_MEM_TO_DEV,
DMA_PREP_INTERRUPT | DMA_CTRL_ACK);
if (!tx_in) {
@@ -543,7 +535,7 @@ static int omap_aes_crypt_dma(struct crypto_tfm *tfm,
return ret;
}

- tx_out = dmaengine_prep_slave_sg(dd->dma_lch_out, out_sg, 1,
+ tx_out = dmaengine_prep_slave_sg(dd->dma_lch_out, out_sg, out_sg_len,
DMA_DEV_TO_MEM,
DMA_PREP_INTERRUPT | DMA_CTRL_ACK);
if (!tx_out) {
@@ -561,7 +553,7 @@ static int omap_aes_crypt_dma(struct crypto_tfm *tfm,
dma_async_issue_pending(dd->dma_lch_out);

/* start DMA */
- dd->pdata->trigger(dd, length);
+ dd->pdata->trigger(dd, dd->total);

return 0;
}
@@ -570,93 +562,28 @@ static int omap_aes_crypt_dma_start(struct omap_aes_dev *dd)
{
struct crypto_tfm *tfm = crypto_ablkcipher_tfm(
crypto_ablkcipher_reqtfm(dd->req));
- int err, fast = 0, in, out;
- size_t count;
- dma_addr_t addr_in, addr_out;
- struct scatterlist *in_sg, *out_sg;
- int len32;
+ int err;

pr_debug("total: %d\n", dd->total);

- if (sg_is_last(dd->in_sg) && sg_is_last(dd->out_sg)) {
- /* check for alignment */
- in = IS_ALIGNED((u32)dd->in_sg->offset, sizeof(u32));
- out = IS_ALIGNED((u32)dd->out_sg->offset, sizeof(u32));
-
- fast = in && out;
+ err = dma_map_sg(dd->dev, dd->in_sg, dd->in_sg_len, DMA_TO_DEVICE);
+ if (!err) {
+ dev_err(dd->dev, "dma_map_sg() error\n");
+ return -EINVAL;
}

- if (fast) {
- count = min(dd->total, sg_dma_len(dd->in_sg));
- count = min(count, sg_dma_len(dd->out_sg));
-
- if (count != dd->total) {
- pr_err("request length != buffer length\n");
- return -EINVAL;
- }
-
- pr_debug("fast\n");
-
- err = dma_map_sg(dd->dev, dd->in_sg, 1, DMA_TO_DEVICE);
- if (!err) {
- dev_err(dd->dev, "dma_map_sg() error\n");
- return -EINVAL;
- }
-
- err = dma_map_sg(dd->dev, dd->out_sg, 1, DMA_FROM_DEVICE);
- if (!err) {
- dev_err(dd->dev, "dma_map_sg() error\n");
- dma_unmap_sg(dd->dev, dd->in_sg, 1, DMA_TO_DEVICE);
- return -EINVAL;
- }
-
- addr_in = sg_dma_address(dd->in_sg);
- addr_out = sg_dma_address(dd->out_sg);
-
- in_sg = dd->in_sg;
- out_sg = dd->out_sg;
-
- dd->flags |= FLAGS_FAST;
-
- } else {
- /* use cache buffers */
- count = sg_copy(&dd->in_sg, &dd->in_offset, dd->buf_in,
- dd->buflen, dd->total, 0);
-
- len32 = DIV_ROUND_UP(count, DMA_MIN) * DMA_MIN;
-
- /*
- * The data going into the AES module has been copied
- * to a local buffer and the data coming out will go
- * into a local buffer so set up local SG entries for
- * both.
- */
- sg_init_table(&dd->in_sgl, 1);
- dd->in_sgl.offset = dd->in_offset;
- sg_dma_len(&dd->in_sgl) = len32;
- sg_dma_address(&dd->in_sgl) = dd->dma_addr_in;
-
- sg_init_table(&dd->out_sgl, 1);
- dd->out_sgl.offset = dd->out_offset;
- sg_dma_len(&dd->out_sgl) = len32;
- sg_dma_address(&dd->out_sgl) = dd->dma_addr_out;
-
- in_sg = &dd->in_sgl;
- out_sg = &dd->out_sgl;
-
- addr_in = dd->dma_addr_in;
- addr_out = dd->dma_addr_out;
-
- dd->flags &= ~FLAGS_FAST;
-
+ err = dma_map_sg(dd->dev, dd->out_sg, dd->out_sg_len, DMA_FROM_DEVICE);
+ if (!err) {
+ dev_err(dd->dev, "dma_map_sg() error\n");
+ return -EINVAL;
}

- dd->total -= count;
-
- err = omap_aes_crypt_dma(tfm, in_sg, out_sg);
+ err = omap_aes_crypt_dma(tfm, dd->in_sg, dd->out_sg, dd->in_sg_len,
+ dd->out_sg_len);
if (err) {
- dma_unmap_sg(dd->dev, dd->in_sg, 1, DMA_TO_DEVICE);
- dma_unmap_sg(dd->dev, dd->out_sg, 1, DMA_TO_DEVICE);
+ dma_unmap_sg(dd->dev, dd->in_sg, dd->in_sg_len, DMA_TO_DEVICE);
+ dma_unmap_sg(dd->dev, dd->out_sg, dd->out_sg_len,
+ DMA_FROM_DEVICE);
}

return err;
@@ -677,7 +604,6 @@ static void omap_aes_finish_req(struct omap_aes_dev *dd, int err)
static int omap_aes_crypt_dma_stop(struct omap_aes_dev *dd)
{
int err = 0;
- size_t count;

pr_debug("total: %d\n", dd->total);

@@ -686,21 +612,8 @@ static int omap_aes_crypt_dma_stop(struct omap_aes_dev *dd)
dmaengine_terminate_all(dd->dma_lch_in);
dmaengine_terminate_all(dd->dma_lch_out);

- if (dd->flags & FLAGS_FAST) {
- dma_unmap_sg(dd->dev, dd->out_sg, 1, DMA_FROM_DEVICE);
- dma_unmap_sg(dd->dev, dd->in_sg, 1, DMA_TO_DEVICE);
- } else {
- dma_sync_single_for_device(dd->dev, dd->dma_addr_out,
- dd->dma_size, DMA_FROM_DEVICE);
-
- /* copy data */
- count = sg_copy(&dd->out_sg, &dd->out_offset, dd->buf_out,
- dd->buflen, dd->dma_size, 1);
- if (count != dd->dma_size) {
- err = -EINVAL;
- pr_err("not all data converted: %u\n", count);
- }
- }
+ dma_unmap_sg(dd->dev, dd->in_sg, dd->in_sg_len, DMA_TO_DEVICE);
+ dma_unmap_sg(dd->dev, dd->out_sg, dd->out_sg_len, DMA_FROM_DEVICE);

return err;
}
@@ -770,21 +683,11 @@ static int omap_aes_handle_queue(struct omap_aes_dev *dd,
static void omap_aes_done_task(unsigned long data)
{
struct omap_aes_dev *dd = (struct omap_aes_dev *)data;
- int err;
-
- pr_debug("enter\n");

- err = omap_aes_crypt_dma_stop(dd);
-
- err = dd->err ? : err;
-
- if (dd->total && !err) {
- err = omap_aes_crypt_dma_start(dd);
- if (!err)
- return; /* DMA started. Not fininishing. */
- }
+ pr_debug("enter done_task\n");

- omap_aes_finish_req(dd, err);
+ omap_aes_crypt_dma_stop(dd);
+ omap_aes_finish_req(dd, 0);
omap_aes_handle_queue(dd, NULL);

pr_debug("exit\n");
--
1.7.9.5

2013-08-14 23:15:00

by Joel Fernandes

[permalink] [raw]
Subject: [PATCH 05/10] crypto: omap-aes: Sync SG before DMA operation

Earlier functions that did a similar sync are replaced by
the dma_sync_sg_* which can operate on entire SG list.

Signed-off-by: Joel Fernandes <[email protected]>
---
drivers/crypto/omap-aes.c | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/drivers/crypto/omap-aes.c b/drivers/crypto/omap-aes.c
index 9104a7f..df2f867 100644
--- a/drivers/crypto/omap-aes.c
+++ b/drivers/crypto/omap-aes.c
@@ -499,6 +499,8 @@ static int omap_aes_crypt_dma(struct crypto_tfm *tfm,
struct dma_slave_config cfg;
int ret;

+ dma_sync_sg_for_device(dd->dev, dd->in_sg, in_sg_len, DMA_TO_DEVICE);
+
memset(&cfg, 0, sizeof(cfg));

cfg.src_addr = dd->phys_base + AES_REG_DATA_N(dd, 0);
@@ -686,6 +688,8 @@ static void omap_aes_done_task(unsigned long data)

pr_debug("enter done_task\n");

+ dma_sync_sg_for_cpu(dd->dev, dd->in_sg, dd->in_sg_len, DMA_FROM_DEVICE);
+
omap_aes_crypt_dma_stop(dd);
omap_aes_finish_req(dd, 0);
omap_aes_handle_queue(dd, NULL);
--
1.7.9.5

2013-08-14 23:15:30

by Joel Fernandes

[permalink] [raw]
Subject: [PATCH 10/10] crypto: omap-aes: Switch to PIO mode in probe function

In cases where requesting for DMA channels fails for some reason, or channel
numbers are not provided in DT or platform data, we switch to PIO-only mode
also checking if platform provides IRQ numbers and interrupt register offsets
in DT and platform data. All dma-only paths are avoided in this mode.

Signed-off-by: Joel Fernandes <[email protected]>
---
drivers/crypto/omap-aes.c | 28 +++++++++++++++++++++++-----
1 file changed, 23 insertions(+), 5 deletions(-)

diff --git a/drivers/crypto/omap-aes.c b/drivers/crypto/omap-aes.c
index 54f2729..34e3b77 100644
--- a/drivers/crypto/omap-aes.c
+++ b/drivers/crypto/omap-aes.c
@@ -1074,7 +1074,7 @@ static int omap_aes_probe(struct platform_device *pdev)
struct omap_aes_dev *dd;
struct crypto_alg *algp;
struct resource res;
- int err = -ENOMEM, i, j;
+ int err = -ENOMEM, i, j, irq = -1;
u32 reg;

dd = kzalloc(sizeof(struct omap_aes_dev), GFP_KERNEL);
@@ -1118,8 +1118,23 @@ static int omap_aes_probe(struct platform_device *pdev)
tasklet_init(&dd->queue_task, omap_aes_queue_task, (unsigned long)dd);

err = omap_aes_dma_init(dd);
- if (err)
- goto err_dma;
+ if (err && AES_REG_IRQ_STATUS(dd) && AES_REG_IRQ_ENABLE(dd)) {
+ dd->pio_only = 1;
+
+ irq = platform_get_irq(pdev, 0);
+ if (irq < 0) {
+ dev_err(dev, "can't get IRQ resource\n");
+ goto err_irq;
+ }
+
+ err = request_irq(irq, omap_aes_irq, 0,
+ dev_name(dev), dd);
+ if (err) {
+ dev_err(dev, "Unable to grab omap-aes IRQ\n");
+ goto err_irq;
+ }
+ }
+

INIT_LIST_HEAD(&dd->list);
spin_lock(&list_lock);
@@ -1147,8 +1162,11 @@ err_algs:
for (j = dd->pdata->algs_info[i].registered - 1; j >= 0; j--)
crypto_unregister_alg(
&dd->pdata->algs_info[i].algs_list[j]);
- omap_aes_dma_cleanup(dd);
-err_dma:
+ if (dd->pio_only)
+ free_irq(irq, dd);
+ else
+ omap_aes_dma_cleanup(dd);
+err_irq:
tasklet_kill(&dd->done_task);
tasklet_kill(&dd->queue_task);
pm_runtime_disable(dev);
--
1.7.9.5

2013-08-14 23:15:55

by Joel Fernandes

[permalink] [raw]
Subject: [PATCH 09/10] crypto: omap-aes: PIO mode: platform data for OMAP4 and trigger it

We initialize the scatter gather walk lists needed for PIO mode
and avoid all DMA paths such as mapping/unmapping buffers by
checking for the pio_only flag.

Signed-off-by: Joel Fernandes <[email protected]>
---
drivers/crypto/omap-aes.c | 43 ++++++++++++++++++++++++++++++-------------
1 file changed, 30 insertions(+), 13 deletions(-)

diff --git a/drivers/crypto/omap-aes.c b/drivers/crypto/omap-aes.c
index 891455b..54f2729 100644
--- a/drivers/crypto/omap-aes.c
+++ b/drivers/crypto/omap-aes.c
@@ -175,6 +175,7 @@ struct omap_aes_dev {
struct dma_chan *dma_lch_out;
int in_sg_len;
int out_sg_len;
+ int pio_only;
const struct omap_aes_pdata *pdata;
};

@@ -423,6 +424,16 @@ static int omap_aes_crypt_dma(struct crypto_tfm *tfm,
struct dma_slave_config cfg;
int ret;

+ if (dd->pio_only) {
+ scatterwalk_start(&dd->in_walk, dd->in_sg);
+ scatterwalk_start(&dd->out_walk, dd->out_sg);
+
+ /* Enable DATAIN interrupt and let it take
+ care of the rest */
+ omap_aes_write(dd, AES_REG_IRQ_ENABLE(dd), 0x2);
+ return 0;
+ }
+
dma_sync_sg_for_device(dd->dev, dd->in_sg, in_sg_len, DMA_TO_DEVICE);

memset(&cfg, 0, sizeof(cfg));
@@ -492,21 +503,25 @@ static int omap_aes_crypt_dma_start(struct omap_aes_dev *dd)

pr_debug("total: %d\n", dd->total);

- err = dma_map_sg(dd->dev, dd->in_sg, dd->in_sg_len, DMA_TO_DEVICE);
- if (!err) {
- dev_err(dd->dev, "dma_map_sg() error\n");
- return -EINVAL;
- }
+ if (!dd->pio_only) {
+ err = dma_map_sg(dd->dev, dd->in_sg, dd->in_sg_len,
+ DMA_TO_DEVICE);
+ if (!err) {
+ dev_err(dd->dev, "dma_map_sg() error\n");
+ return -EINVAL;
+ }

- err = dma_map_sg(dd->dev, dd->out_sg, dd->out_sg_len, DMA_FROM_DEVICE);
- if (!err) {
- dev_err(dd->dev, "dma_map_sg() error\n");
- return -EINVAL;
+ err = dma_map_sg(dd->dev, dd->out_sg, dd->out_sg_len,
+ DMA_FROM_DEVICE);
+ if (!err) {
+ dev_err(dd->dev, "dma_map_sg() error\n");
+ return -EINVAL;
+ }
}

err = omap_aes_crypt_dma(tfm, dd->in_sg, dd->out_sg, dd->in_sg_len,
dd->out_sg_len);
- if (err) {
+ if (err && !dd->pio_only) {
dma_unmap_sg(dd->dev, dd->in_sg, dd->in_sg_len, DMA_TO_DEVICE);
dma_unmap_sg(dd->dev, dd->out_sg, dd->out_sg_len,
DMA_FROM_DEVICE);
@@ -610,9 +625,11 @@ static void omap_aes_done_task(unsigned long data)

pr_debug("enter done_task\n");

- dma_sync_sg_for_cpu(dd->dev, dd->in_sg, dd->in_sg_len, DMA_FROM_DEVICE);
-
- omap_aes_crypt_dma_stop(dd);
+ if (!dd->pio_only) {
+ dma_sync_sg_for_device(dd->dev, dd->out_sg, dd->out_sg_len,
+ DMA_FROM_DEVICE);
+ omap_aes_crypt_dma_stop(dd);
+ }
omap_aes_finish_req(dd, 0);
omap_aes_handle_queue(dd, NULL);

--
1.7.9.5

2013-08-14 23:13:42

by Joel Fernandes

[permalink] [raw]
Subject: [PATCH 03/10] crypto: omap-aes: Populate number of SG elements

Crypto layer only passes nbytes but number of SG elements is needed
for mapping/unmapping SGs at one time using dma_map* API and also
needed to pass in for dmaengine prep function.

We call function added to scatterwalk for this purpose in omap_aes_handle_queue
to populate the values which are used later.

Signed-off-by: Joel Fernandes <[email protected]>
---
drivers/crypto/omap-aes.c | 6 ++++++
1 file changed, 6 insertions(+)

diff --git a/drivers/crypto/omap-aes.c b/drivers/crypto/omap-aes.c
index 3838e0a..b7189a8 100644
--- a/drivers/crypto/omap-aes.c
+++ b/drivers/crypto/omap-aes.c
@@ -171,6 +171,8 @@ struct omap_aes_dev {
void *buf_out;
int dma_out;
struct dma_chan *dma_lch_out;
+ int in_sg_len;
+ int out_sg_len;
dma_addr_t dma_addr_out;

const struct omap_aes_pdata *pdata;
@@ -741,6 +743,10 @@ static int omap_aes_handle_queue(struct omap_aes_dev *dd,
dd->out_offset = 0;
dd->out_sg = req->dst;

+ dd->in_sg_len = scatterwalk_bytes_sglen(dd->in_sg, dd->total);
+ dd->out_sg_len = scatterwalk_bytes_sglen(dd->out_sg, dd->total);
+ BUG_ON(dd->in_sg_len < 0 || dd->out_sg_len < 0);
+
rctx = ablkcipher_request_ctx(req);
ctx = crypto_ablkcipher_ctx(crypto_ablkcipher_reqtfm(req));
rctx->mode &= FLAGS_MODE_MASK;
--
1.7.9.5

2013-08-14 23:16:18

by Joel Fernandes

[permalink] [raw]
Subject: [PATCH 07/10] crypto: omap-aes: Add IRQ info and helper macros

Add IRQ information to pdata and helper macros. These are
required for PIO-mode support.

Signed-off-by: Joel Fernandes <[email protected]>
---
drivers/crypto/omap-aes.c | 8 ++++++++
1 file changed, 8 insertions(+)

diff --git a/drivers/crypto/omap-aes.c b/drivers/crypto/omap-aes.c
index 71da52b..c057eac 100644
--- a/drivers/crypto/omap-aes.c
+++ b/drivers/crypto/omap-aes.c
@@ -82,6 +82,10 @@

#define AES_REG_LENGTH_N(x) (0x54 + ((x) * 0x04))

+#define AES_REG_IRQ_STATUS(dd) ((dd)->pdata->irq_status_ofs)
+#define AES_REG_IRQ_ENABLE(dd) ((dd)->pdata->irq_enable_ofs)
+#define AES_REG_IRQ_DATA_IN BIT(1)
+#define AES_REG_IRQ_DATA_OUT BIT(2)
#define DEFAULT_TIMEOUT (5*HZ)

#define FLAGS_MODE_MASK 0x000f
@@ -127,6 +131,8 @@ struct omap_aes_pdata {
u32 data_ofs;
u32 rev_ofs;
u32 mask_ofs;
+ u32 irq_enable_ofs;
+ u32 irq_status_ofs;

u32 dma_enable_in;
u32 dma_enable_out;
@@ -846,6 +852,8 @@ static const struct omap_aes_pdata omap_aes_pdata_omap4 = {
.data_ofs = 0x60,
.rev_ofs = 0x80,
.mask_ofs = 0x84,
+ .irq_status_ofs = 0x8c,
+ .irq_enable_ofs = 0x90,
.dma_enable_in = BIT(5),
.dma_enable_out = BIT(6),
.major_mask = 0x0700,
--
1.7.9.5

2013-08-14 23:16:42

by Joel Fernandes

[permalink] [raw]
Subject: [PATCH 02/10] crypto: omap-aes: Add useful debug macros

When DEBUG is enabled, these macros can be used to print variables
in integer and hex format, and clearly display which registers,
offsets and values are being read/written , including printing the
names of the offsets and their values.

Signed-off-by: Joel Fernandes <[email protected]>
---
drivers/crypto/omap-aes.c | 30 ++++++++++++++++++++++++++++++
1 file changed, 30 insertions(+)

diff --git a/drivers/crypto/omap-aes.c b/drivers/crypto/omap-aes.c
index ee15b0f..3838e0a 100644
--- a/drivers/crypto/omap-aes.c
+++ b/drivers/crypto/omap-aes.c
@@ -15,6 +15,14 @@

#define pr_fmt(fmt) "%s: " fmt, __func__

+#ifdef DEBUG
+#define prn(num) printk(#num "=%d\n", num)
+#define prx(num) printk(#num "=%x\n", num)
+#else
+#define prn(num) do { } while (0)
+#define prx(num) do { } while (0)
+#endif
+
#include <linux/err.h>
#include <linux/module.h>
#include <linux/init.h>
@@ -172,13 +180,35 @@ struct omap_aes_dev {
static LIST_HEAD(dev_list);
static DEFINE_SPINLOCK(list_lock);

+#ifdef DEBUG
+#define omap_aes_read(dd, offset) \
+ do { \
+ omap_aes_read_1(dd, offset); \
+ pr_debug("omap_aes_read(" #offset ")\n"); \
+ } while (0)
+
+static inline u32 omap_aes_read_1(struct omap_aes_dev *dd, u32 offset)
+#else
static inline u32 omap_aes_read(struct omap_aes_dev *dd, u32 offset)
+#endif
{
return __raw_readl(dd->io_base + offset);
}

+#ifdef DEBUG
+#define omap_aes_write(dd, offset, value) \
+ do { \
+ pr_debug("omap_aes_write(" #offset "=%x) value=%d\n", \
+ offset, value); \
+ omap_aes_write_1(dd, offset, value); \
+ } while (0)
+
+static inline void omap_aes_write_1(struct omap_aes_dev *dd, u32 offset,
+ u32 value)
+#else
static inline void omap_aes_write(struct omap_aes_dev *dd, u32 offset,
u32 value)
+#endif
{
__raw_writel(value, dd->io_base + offset);
}
--
1.7.9.5

2013-08-14 23:30:01

by Joe Perches

[permalink] [raw]
Subject: Re: [PATCH 02/10] crypto: omap-aes: Add useful debug macros

On Wed, 2013-08-14 at 18:12 -0500, Joel Fernandes wrote:
> When DEBUG is enabled, these macros can be used to print variables
> in integer and hex format, and clearly display which registers,
> offsets and values are being read/written , including printing the
> names of the offsets and their values.
[]
> diff --git a/drivers/crypto/omap-aes.c b/drivers/crypto/omap-aes.c
[]
> @@ -15,6 +15,14 @@
>
> #define pr_fmt(fmt) "%s: " fmt, __func__
>
> +#ifdef DEBUG
> +#define prn(num) printk(#num "=%d\n", num)
> +#define prx(num) printk(#num "=%x\n", num)

pr_debug?

> +#else
> +#define prn(num) do { } while (0)
> +#define prx(num) do { } while (0)
> +#endif
[]
> @@ -172,13 +180,35 @@ struct omap_aes_dev {
> static LIST_HEAD(dev_list);
> static DEFINE_SPINLOCK(list_lock);
>
> +#ifdef DEBUG
> +#define omap_aes_read(dd, offset) \
> + do { \
> + omap_aes_read_1(dd, offset); \
> + pr_debug("omap_aes_read(" #offset ")\n"); \
> + } while (0)
> +
> +static inline u32 omap_aes_read_1(struct omap_aes_dev *dd, u32 offset)
> +#else
> static inline u32 omap_aes_read(struct omap_aes_dev *dd, u32 offset)
> +#endif
> {
> return __raw_readl(dd->io_base + offset);
> }
>
> +#ifdef DEBUG
> +#define omap_aes_write(dd, offset, value) \
> + do { \
> + pr_debug("omap_aes_write(" #offset "=%x) value=%d\n", \
> + offset, value); \
> + omap_aes_write_1(dd, offset, value); \
> + } while (0)
> +
> +static inline void omap_aes_write_1(struct omap_aes_dev *dd, u32 offset,
> + u32 value)
> +#else
> static inline void omap_aes_write(struct omap_aes_dev *dd, u32 offset,
> u32 value)
> +#endif
> {
> __raw_writel(value, dd->io_base + offset);
> }

Umm, yuck?

Is there any real value in read_1 and write_1?

2013-08-14 23:32:29

by Joel Fernandes

[permalink] [raw]
Subject: Re: [PATCH 00/10] crypto: omap-aes: DMA and PIO mode improvements

On 08/14/2013 06:12 PM, Joel Fernandes wrote:
> This patch series is a rewrite of the DMA portion of omap-aes driver
> and also adds support for PIO mode. Both these modes, give better
> performance than before.
>
> Earlier, only a single SG was used for DMA purpose, and the SG-list
> passed from the crypto layer was being copied and DMA'd one entry at
> a time. This turns out to be quite inefficient, so we replace it with
> much simpler code that directly passes the SG-list from crypto to the
> DMA layer.
>
> We also add PIO mode support to the driver, and switch to PIO mode
> whenever the DMA channel allocation is not available. This is only for
> OMAP4 platform will work on any platform on which IRQ information is
> populated.
>
> Tests performed on am33xx and omap4 SoCs , notice the 50% perf improvement

Just correcting, this is more like 35% not 50% when using DMA.

Thanks,

-Joel





> for large 8K blocks:
>
> Sample run on am33xx (beaglebone):
> With DMA rewrite:
> [ 26.410052] test 0 (128 bit key, 16 byte blocks): 4318 operations in 1 seconds (69088 bytes)
> [ 27.414314] test 1 (128 bit key, 64 byte blocks): 4360 operations in 1 seconds (279040 bytes)
> [ 28.414406] test 2 (128 bit key, 256 byte blocks): 3609 operations in 1 seconds (923904 bytes)
> [ 29.414410] test 3 (128 bit key, 1024 byte blocks): 3418 operations in 1 seconds (3500032 bytes)
> [ 30.414510] test 4 (128 bit key, 8192 byte blocks): 1766 operations in 1 seconds (14467072 bytes)
>
> Without DMA rewrite:
> [ 31.920519] test 0 (128 bit key, 16 byte blocks): 4417 operations in 1 seconds (70672 bytes)
> [ 32.925997] test 1 (128 bit key, 64 byte blocks): 4221 operations in 1 seconds (270144 bytes)
> [ 33.926194] test 2 (128 bit key, 256 byte blocks): 3528 operations in 1 seconds (903168 bytes)
> [ 34.926225] test 3 (128 bit key, 1024 byte blocks): 3281 operations in 1 seconds (3359744 bytes)
> [ 35.926385] test 4 (128 bit key, 8192 byte blocks): 1460 operations in 1 seconds (11960320 bytes)
>
> With PIO mode, note the tremndous boost in performance for small blocks there:
> [ 27.294905] test 0 (128 bit key, 16 byte blocks): 20585 operations in 1 seconds (329360 bytes)
> [ 28.302282] test 1 (128 bit key, 64 byte blocks): 8106 operations in 1 seconds (518784 bytes)
> [ 29.302374] test 2 (128 bit key, 256 byte blocks): 2359 operations in 1 seconds (603904 bytes)
> [ 30.302575] test 3 (128 bit key, 1024 byte blocks): 605 operations in 1 seconds (619520 bytes)
> [ 31.303781] test 4 (128 bit key, 8192 byte blocks): 79 operations in 1 seconds (647168 bytes)
>
> Future work in this direction would be to dynamically change between
> PIO/DMA mode based on the block size.
>
> Joel Fernandes (10):
> crypto: scatterwalk: Add support for calculating number of SG
> elements
> crypto: omap-aes: Add useful debug macros
> crypto: omap-aes: Populate number of SG elements
> crypto: omap-aes: Simplify DMA usage by using direct SGs
> crypto: omap-aes: Sync SG before DMA operation
> crypto: omap-aes: Remove previously used intermediate buffers
> crypto: omap-aes: Add IRQ info and helper macros
> crypto: omap-aes: PIO mode: Add IRQ handler and walk SGs
> crypto: omap-aes: PIO mode: platform data for OMAP4 and trigger it
> crypto: omap-aes: Switch to PIO mode in probe function
>
> crypto/scatterwalk.c | 22 +++
> drivers/crypto/omap-aes.c | 400 ++++++++++++++++++++----------------------
> include/crypto/scatterwalk.h | 2 +
> 3 files changed, 217 insertions(+), 207 deletions(-)
>

2013-08-14 23:41:12

by Joel Fernandes

[permalink] [raw]
Subject: Re: [PATCH 02/10] crypto: omap-aes: Add useful debug macros

On 08/14/2013 06:29 PM, Joe Perches wrote:
> On Wed, 2013-08-14 at 18:12 -0500, Joel Fernandes wrote:
>> When DEBUG is enabled, these macros can be used to print variables
>> in integer and hex format, and clearly display which registers,
>> offsets and values are being read/written , including printing the
>> names of the offsets and their values.
> []
>> diff --git a/drivers/crypto/omap-aes.c b/drivers/crypto/omap-aes.c
> []
>> @@ -15,6 +15,14 @@
>>
>> #define pr_fmt(fmt) "%s: " fmt, __func__
>>
>> +#ifdef DEBUG
>> +#define prn(num) printk(#num "=%d\n", num)
>> +#define prx(num) printk(#num "=%x\n", num)
>
> pr_debug?

Sure, can change that.

>> +#else
>> +#define prn(num) do { } while (0)
>> +#define prx(num) do { } while (0)
>> +#endif
> []
>> @@ -172,13 +180,35 @@ struct omap_aes_dev {
>> static LIST_HEAD(dev_list);
>> static DEFINE_SPINLOCK(list_lock);
>>
>> +#ifdef DEBUG
>> +#define omap_aes_read(dd, offset) \
>> + do { \
>> + omap_aes_read_1(dd, offset); \
>> + pr_debug("omap_aes_read(" #offset ")\n"); \
>> + } while (0)
>> +
>> +static inline u32 omap_aes_read_1(struct omap_aes_dev *dd, u32 offset)
>> +#else
>> static inline u32 omap_aes_read(struct omap_aes_dev *dd, u32 offset)
>> +#endif
>> {
>> return __raw_readl(dd->io_base + offset);
>> }
>>
>> +#ifdef DEBUG
>> +#define omap_aes_write(dd, offset, value) \
>> + do { \
>> + pr_debug("omap_aes_write(" #offset "=%x) value=%d\n", \
>> + offset, value); \
>> + omap_aes_write_1(dd, offset, value); \
>> + } while (0)
>> +
>> +static inline void omap_aes_write_1(struct omap_aes_dev *dd, u32 offset,
>> + u32 value)
>> +#else
>> static inline void omap_aes_write(struct omap_aes_dev *dd, u32 offset,
>> u32 value)
>> +#endif
>> {
>> __raw_writel(value, dd->io_base + offset);
>> }
>
> Umm, yuck?
>
> Is there any real value in read_1 and write_1?

Can you be more descriptive? There is a lot of value in them for debug
to show clearly sequence of read/writes. Moreover, they are no-op'd when
DEBUG is disabled.

Other than the obvious use, I guess it could be rewritten to be bit
cleaner. I can do that.

-Joel

2013-08-15 00:47:19

by Joe Perches

[permalink] [raw]
Subject: Re: [PATCH 02/10] crypto: omap-aes: Add useful debug macros

On Wed, 2013-08-14 at 18:40 -0500, Joel Fernandes wrote:
> On 08/14/2013 06:29 PM, Joe Perches wrote:
> > On Wed, 2013-08-14 at 18:12 -0500, Joel Fernandes wrote:
> >> When DEBUG is enabled, these macros can be used to print variables
> >> in integer and hex format, and clearly display which registers,
> >> offsets and values are being read/written , including printing the
> >> names of the offsets and their values.
> > []
> >> diff --git a/drivers/crypto/omap-aes.c b/drivers/crypto/omap-aes.c
> > []
> >> @@ -15,6 +15,14 @@
> >>
> >> #define pr_fmt(fmt) "%s: " fmt, __func__
> >>
> >> +#ifdef DEBUG
> >> +#define prn(num) printk(#num "=%d\n", num)
> >> +#define prx(num) printk(#num "=%x\n", num)
> >
> > pr_debug?
>
> Sure, can change that.
>
> >> +#else
> >> +#define prn(num) do { } while (0)
> >> +#define prx(num) do { } while (0)
> >> +#endif
> > []
> >> @@ -172,13 +180,35 @@ struct omap_aes_dev {
> >> static LIST_HEAD(dev_list);
> >> static DEFINE_SPINLOCK(list_lock);
> >>
> >> +#ifdef DEBUG
> >> +#define omap_aes_read(dd, offset) \
> >> + do { \
> >> + omap_aes_read_1(dd, offset); \
> >> + pr_debug("omap_aes_read(" #offset ")\n"); \
> >> + } while (0)
> >> +
> >> +static inline u32 omap_aes_read_1(struct omap_aes_dev *dd, u32 offset)
> >> +#else
> >> static inline u32 omap_aes_read(struct omap_aes_dev *dd, u32 offset)
> >> +#endif
> >> {
> >> return __raw_readl(dd->io_base + offset);
> >> }
> >>
> >> +#ifdef DEBUG
> >> +#define omap_aes_write(dd, offset, value) \
> >> + do { \
> >> + pr_debug("omap_aes_write(" #offset "=%x) value=%d\n", \
> >> + offset, value); \
> >> + omap_aes_write_1(dd, offset, value); \
> >> + } while (0)
> >> +
> >> +static inline void omap_aes_write_1(struct omap_aes_dev *dd, u32 offset,
> >> + u32 value)
> >> +#else
> >> static inline void omap_aes_write(struct omap_aes_dev *dd, u32 offset,
> >> u32 value)
> >> +#endif
> >> {
> >> __raw_writel(value, dd->io_base + offset);
> >> }
> >
> > Umm, yuck?
> >
> > Is there any real value in read_1 and write_1?
>
> Can you be more descriptive? There is a lot of value in them for debug
> to show clearly sequence of read/writes. Moreover, they are no-op'd when
> DEBUG is disabled.

pr_debug is no-op'd when DEBUG is not #defined.
so just use a single

omap_aes_write(...)
{
pr_debug(...)
__raw_writel(...);
}

2013-08-15 05:14:13

by Joel Fernandes

[permalink] [raw]
Subject: Re: [PATCH 02/10] crypto: omap-aes: Add useful debug macros

On 08/14/2013 07:47 PM, Joe Perches wrote:
> On Wed, 2013-08-14 at 18:40 -0500, Joel Fernandes wrote:
>> On 08/14/2013 06:29 PM, Joe Perches wrote:
>>> On Wed, 2013-08-14 at 18:12 -0500, Joel Fernandes wrote:
>>>> When DEBUG is enabled, these macros can be used to print variables
>>>> in integer and hex format, and clearly display which registers,
>>>> offsets and values are being read/written , including printing the
>>>> names of the offsets and their values.
>>> []
>>>> diff --git a/drivers/crypto/omap-aes.c b/drivers/crypto/omap-aes.c
>>> []
>>>> @@ -15,6 +15,14 @@
>>>>
>>>> #define pr_fmt(fmt) "%s: " fmt, __func__
>>>>
>>>> +#ifdef DEBUG
>>>> +#define prn(num) printk(#num "=%d\n", num)
>>>> +#define prx(num) printk(#num "=%x\n", num)
>>>
>>> pr_debug?
>>
>> Sure, can change that.
>>
>>>> +#else
>>>> +#define prn(num) do { } while (0)
>>>> +#define prx(num) do { } while (0)
>>>> +#endif
>>> []
>>>> @@ -172,13 +180,35 @@ struct omap_aes_dev {
>>>> static LIST_HEAD(dev_list);
>>>> static DEFINE_SPINLOCK(list_lock);
>>>>
>>>> +#ifdef DEBUG
>>>> +#define omap_aes_read(dd, offset) \
>>>> + do { \
>>>> + omap_aes_read_1(dd, offset); \
>>>> + pr_debug("omap_aes_read(" #offset ")\n"); \
>>>> + } while (0)
>>>> +
>>>> +static inline u32 omap_aes_read_1(struct omap_aes_dev *dd, u32 offset)
>>>> +#else
>>>> static inline u32 omap_aes_read(struct omap_aes_dev *dd, u32 offset)
>>>> +#endif
>>>> {
>>>> return __raw_readl(dd->io_base + offset);
>>>> }
>>>>
>>>> +#ifdef DEBUG
>>>> +#define omap_aes_write(dd, offset, value) \
>>>> + do { \
>>>> + pr_debug("omap_aes_write(" #offset "=%x) value=%d\n", \
>>>> + offset, value); \
>>>> + omap_aes_write_1(dd, offset, value); \
>>>> + } while (0)
>>>> +
>>>> +static inline void omap_aes_write_1(struct omap_aes_dev *dd, u32 offset,
>>>> + u32 value)
>>>> +#else
>>>> static inline void omap_aes_write(struct omap_aes_dev *dd, u32 offset,
>>>> u32 value)
>>>> +#endif
>>>> {
>>>> __raw_writel(value, dd->io_base + offset);
>>>> }
>>>
>>> Umm, yuck?
>>>
>>> Is there any real value in read_1 and write_1?
>>
>> Can you be more descriptive? There is a lot of value in them for debug
>> to show clearly sequence of read/writes. Moreover, they are no-op'd when
>> DEBUG is disabled.
>
> pr_debug is no-op'd when DEBUG is not #defined.
> so just use a single
>
> omap_aes_write(...)
> {
> pr_debug(...)
> __raw_writel(...);
> }

Actually this doesn't work, as the pr_debug cannot print the name of the
offset as my original patch set does using "#offset".

There are many places where named offsets are used to pass to
read/write, and this macro helps to visually see which offset is being
written to by name.

So the original patch would stand in its current form except for a small
rewrite of the write debug part of it as follows to be cleaner getting
rid of the _1. For the read , we still need it as we need to return the
value from a function which cannot be done in a macro.

So the new patch would look something like this:

#ifdef DEBUG
#define omap_aes_read(dd, offset) \
omap_aes_read_1(dd, offset); \
pr_debug("omap_aes_read(" #offset ")\n");
static inline u32 omap_aes_read_1(struct omap_aes_dev *dd, u32 offset)
#else
static inline u32 omap_aes_read(struct omap_aes_dev *dd, u32 offset)
#endif
{
return __raw_readl(dd->io_base + offset);
}

#ifdef DEBUG
#define omap_aes_write(dd, offset, value) \
do { \
pr_debug("omap_aes_write(" #offset "=%x) value=%d\n", \
offset, value); \
__raw_writel(value, dd->io_base + offset); \
} while (0)
#else
static inline void omap_aes_write(struct omap_aes_dev *dd, u32 offset,
u32 value)
{
__raw_writel(value, dd->io_base + offset);
}
#endif



Thanks,

-Joel

2013-08-15 05:58:48

by Dmitry Kasatkin

[permalink] [raw]
Subject: Re: [PATCH 00/10] crypto: omap-aes: DMA and PIO mode improvements

On 15/08/13 02:30, Joel Fernandes wrote:
> On 08/14/2013 06:12 PM, Joel Fernandes wrote:
>> This patch series is a rewrite of the DMA portion of omap-aes driver
>> and also adds support for PIO mode. Both these modes, give better
>> performance than before.
>>
>> Earlier, only a single SG was used for DMA purpose, and the SG-list
>> passed from the crypto layer was being copied and DMA'd one entry at
>> a time. This turns out to be quite inefficient, so we replace it with
>> much simpler code that directly passes the SG-list from crypto to the
>> DMA layer.
>>
>> We also add PIO mode support to the driver, and switch to PIO mode
>> whenever the DMA channel allocation is not available. This is only for
>> OMAP4 platform will work on any platform on which IRQ information is
>> populated.
>>
>> Tests performed on am33xx and omap4 SoCs , notice the 50% perf improvement
> Just correcting, this is more like 35% not 50% when using DMA.

Hmm :)

1766/1460 = ~20%

>
> Thanks,
>
> -Joel
>
>
>
>
>
>> for large 8K blocks:
>>
>> Sample run on am33xx (beaglebone):
>> With DMA rewrite:
>> [ 26.410052] test 0 (128 bit key, 16 byte blocks): 4318 operations in 1 seconds (69088 bytes)
>> [ 27.414314] test 1 (128 bit key, 64 byte blocks): 4360 operations in 1 seconds (279040 bytes)
>> [ 28.414406] test 2 (128 bit key, 256 byte blocks): 3609 operations in 1 seconds (923904 bytes)
>> [ 29.414410] test 3 (128 bit key, 1024 byte blocks): 3418 operations in 1 seconds (3500032 bytes)
>> [ 30.414510] test 4 (128 bit key, 8192 byte blocks): 1766 operations in 1 seconds (14467072 bytes)
>>
>> Without DMA rewrite:
>> [ 31.920519] test 0 (128 bit key, 16 byte blocks): 4417 operations in 1 seconds (70672 bytes)
>> [ 32.925997] test 1 (128 bit key, 64 byte blocks): 4221 operations in 1 seconds (270144 bytes)
>> [ 33.926194] test 2 (128 bit key, 256 byte blocks): 3528 operations in 1 seconds (903168 bytes)
>> [ 34.926225] test 3 (128 bit key, 1024 byte blocks): 3281 operations in 1 seconds (3359744 bytes)
>> [ 35.926385] test 4 (128 bit key, 8192 byte blocks): 1460 operations in 1 seconds (11960320 bytes)
>>
>> With PIO mode, note the tremndous boost in performance for small blocks there:
>> [ 27.294905] test 0 (128 bit key, 16 byte blocks): 20585 operations in 1 seconds (329360 bytes)
>> [ 28.302282] test 1 (128 bit key, 64 byte blocks): 8106 operations in 1 seconds (518784 bytes)
>> [ 29.302374] test 2 (128 bit key, 256 byte blocks): 2359 operations in 1 seconds (603904 bytes)
>> [ 30.302575] test 3 (128 bit key, 1024 byte blocks): 605 operations in 1 seconds (619520 bytes)
>> [ 31.303781] test 4 (128 bit key, 8192 byte blocks): 79 operations in 1 seconds (647168 bytes)
>>
>> Future work in this direction would be to dynamically change between
>> PIO/DMA mode based on the block size.
>>
>> Joel Fernandes (10):
>> crypto: scatterwalk: Add support for calculating number of SG
>> elements
>> crypto: omap-aes: Add useful debug macros
>> crypto: omap-aes: Populate number of SG elements
>> crypto: omap-aes: Simplify DMA usage by using direct SGs
>> crypto: omap-aes: Sync SG before DMA operation
>> crypto: omap-aes: Remove previously used intermediate buffers
>> crypto: omap-aes: Add IRQ info and helper macros
>> crypto: omap-aes: PIO mode: Add IRQ handler and walk SGs
>> crypto: omap-aes: PIO mode: platform data for OMAP4 and trigger it
>> crypto: omap-aes: Switch to PIO mode in probe function
>>
>> crypto/scatterwalk.c | 22 +++
>> drivers/crypto/omap-aes.c | 400 ++++++++++++++++++++----------------------
>> include/crypto/scatterwalk.h | 2 +
>> 3 files changed, 217 insertions(+), 207 deletions(-)
>>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>

2013-08-15 06:23:32

by Dmitry Kasatkin

[permalink] [raw]
Subject: Re: [PATCH 02/10] crypto: omap-aes: Add useful debug macros

On 15/08/13 06:12, Joel Fernandes wrote:
> On 08/14/2013 07:47 PM, Joe Perches wrote:
>> On Wed, 2013-08-14 at 18:40 -0500, Joel Fernandes wrote:
>>> On 08/14/2013 06:29 PM, Joe Perches wrote:
>>>> On Wed, 2013-08-14 at 18:12 -0500, Joel Fernandes wrote:
>>>>> When DEBUG is enabled, these macros can be used to print variables
>>>>> in integer and hex format, and clearly display which registers,
>>>>> offsets and values are being read/written , including printing the
>>>>> names of the offsets and their values.
>>>> []
>>>>> diff --git a/drivers/crypto/omap-aes.c b/drivers/crypto/omap-aes.c
>>>> []
>>>>> @@ -15,6 +15,14 @@
>>>>>
>>>>> #define pr_fmt(fmt) "%s: " fmt, __func__
>>>>>
>>>>> +#ifdef DEBUG
>>>>> +#define prn(num) printk(#num "=%d\n", num)
>>>>> +#define prx(num) printk(#num "=%x\n", num)
>>>> pr_debug?
>>> Sure, can change that.
>>>
>>>>> +#else
>>>>> +#define prn(num) do { } while (0)
>>>>> +#define prx(num) do { } while (0)
>>>>> +#endif
>>>> []
>>>>> @@ -172,13 +180,35 @@ struct omap_aes_dev {
>>>>> static LIST_HEAD(dev_list);
>>>>> static DEFINE_SPINLOCK(list_lock);
>>>>>
>>>>> +#ifdef DEBUG
>>>>> +#define omap_aes_read(dd, offset) \
>>>>> + do { \
>>>>> + omap_aes_read_1(dd, offset); \
>>>>> + pr_debug("omap_aes_read(" #offset ")\n"); \
>>>>> + } while (0)
>>>>> +
>>>>> +static inline u32 omap_aes_read_1(struct omap_aes_dev *dd, u32 offset)
>>>>> +#else
>>>>> static inline u32 omap_aes_read(struct omap_aes_dev *dd, u32 offset)
>>>>> +#endif
>>>>> {
>>>>> return __raw_readl(dd->io_base + offset);
>>>>> }
>>>>>
>>>>> +#ifdef DEBUG
>>>>> +#define omap_aes_write(dd, offset, value) \
>>>>> + do { \
>>>>> + pr_debug("omap_aes_write(" #offset "=%x) value=%d\n", \
>>>>> + offset, value); \
>>>>> + omap_aes_write_1(dd, offset, value); \
>>>>> + } while (0)
>>>>> +
>>>>> +static inline void omap_aes_write_1(struct omap_aes_dev *dd, u32 offset,
>>>>> + u32 value)
>>>>> +#else
>>>>> static inline void omap_aes_write(struct omap_aes_dev *dd, u32 offset,
>>>>> u32 value)
>>>>> +#endif
>>>>> {
>>>>> __raw_writel(value, dd->io_base + offset);
>>>>> }
>>>> Umm, yuck?
>>>>
>>>> Is there any real value in read_1 and write_1?
>>> Can you be more descriptive? There is a lot of value in them for debug
>>> to show clearly sequence of read/writes. Moreover, they are no-op'd when
>>> DEBUG is disabled.
>> pr_debug is no-op'd when DEBUG is not #defined.
>> so just use a single
>>
>> omap_aes_write(...)
>> {
>> pr_debug(...)
>> __raw_writel(...);
>> }
> Actually this doesn't work, as the pr_debug cannot print the name of the
> offset as my original patch set does using "#offset".
>
> There are many places where named offsets are used to pass to
> read/write, and this macro helps to visually see which offset is being
> written to by name.
>
> So the original patch would stand in its current form except for a small
> rewrite of the write debug part of it as follows to be cleaner getting
> rid of the _1. For the read , we still need it as we need to return the
> value from a function which cannot be done in a macro.
>
> So the new patch would look something like this:
>
> #ifdef DEBUG
> #define omap_aes_read(dd, offset) \
> omap_aes_read_1(dd, offset); \
> pr_debug("omap_aes_read(" #offset ")\n");
> static inline u32 omap_aes_read_1(struct omap_aes_dev *dd, u32 offset)
> #else
> static inline u32 omap_aes_read(struct omap_aes_dev *dd, u32 offset)
> #endif
> {
> return __raw_readl(dd->io_base + offset);
> }

Bellow version "write" looks much more readable - never re-define
function signature by macro.
Above should be similar as well...

> #ifdef DEBUG
> #define omap_aes_write(dd, offset, value) \
> do { \
> pr_debug("omap_aes_write(" #offset "=%x) value=%d\n", \
> offset, value); \
> __raw_writel(value, dd->io_base + offset); \
> } while (0)
> #else
> static inline void omap_aes_write(struct omap_aes_dev *dd, u32 offset,
> u32 value)
> {
> __raw_writel(value, dd->io_base + offset);
> }
> #endif

>
>
> Thanks,
>
> -Joel
> --
> To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>

2013-08-15 07:03:41

by Joel Fernandes

[permalink] [raw]
Subject: Re: [PATCH 00/10] crypto: omap-aes: DMA and PIO mode improvements

On 08/15/2013 12:58 AM, Dmitry Kasatkin wrote:
> On 15/08/13 02:30, Joel Fernandes wrote:
>> On 08/14/2013 06:12 PM, Joel Fernandes wrote:
>>> This patch series is a rewrite of the DMA portion of omap-aes driver
>>> and also adds support for PIO mode. Both these modes, give better
>>> performance than before.
>>>
>>> Earlier, only a single SG was used for DMA purpose, and the SG-list
>>> passed from the crypto layer was being copied and DMA'd one entry at
>>> a time. This turns out to be quite inefficient, so we replace it with
>>> much simpler code that directly passes the SG-list from crypto to the
>>> DMA layer.
>>>
>>> We also add PIO mode support to the driver, and switch to PIO mode
>>> whenever the DMA channel allocation is not available. This is only for
>>> OMAP4 platform will work on any platform on which IRQ information is
>>> populated.
>>>
>>> Tests performed on am33xx and omap4 SoCs , notice the 50% perf improvement
>> Just correcting, this is more like 35% not 50% when using DMA.
>
> Hmm :)
>
> 1766/1460 = ~20%

Yes sorry, I messed the cover letter up. If I resend the series again,
I'll update this number.

On OMAP4 though, I saw 2800 ops/sec vs 1800 ops/sec which is around 50%
so it depends on SoC and DMA controller. OMAP4 uses SDMA while AM335x
uses EDMA.

Also with very large blocks, this improvement will be much higher as we
will not be doing all the intermediate copy but I haven't tested with
such large blocks.

Thanks,

-Joel

2013-08-15 07:27:52

by Joel Fernandes

[permalink] [raw]
Subject: Re: [PATCH 02/10] crypto: omap-aes: Add useful debug macros

On 08/15/2013 01:23 AM, Dmitry Kasatkin wrote:
> On 15/08/13 06:12, Joel Fernandes wrote:
>> On 08/14/2013 07:47 PM, Joe Perches wrote:
>>> On Wed, 2013-08-14 at 18:40 -0500, Joel Fernandes wrote:
>>>> On 08/14/2013 06:29 PM, Joe Perches wrote:
>>>>> On Wed, 2013-08-14 at 18:12 -0500, Joel Fernandes wrote:
>>>>>> When DEBUG is enabled, these macros can be used to print variables
>>>>>> in integer and hex format, and clearly display which registers,
>>>>>> offsets and values are being read/written , including printing the
>>>>>> names of the offsets and their values.
>>>>> []
>>>>>> diff --git a/drivers/crypto/omap-aes.c b/drivers/crypto/omap-aes.c
>>>>> []
>>>>>> @@ -15,6 +15,14 @@
>>>>>>
>>>>>> #define pr_fmt(fmt) "%s: " fmt, __func__
>>>>>>
>>>>>> +#ifdef DEBUG
>>>>>> +#define prn(num) printk(#num "=%d\n", num)
>>>>>> +#define prx(num) printk(#num "=%x\n", num)
>>>>> pr_debug?
>>>> Sure, can change that.
>>>>
>>>>>> +#else
>>>>>> +#define prn(num) do { } while (0)
>>>>>> +#define prx(num) do { } while (0)
>>>>>> +#endif
>>>>> []
>>>>>> @@ -172,13 +180,35 @@ struct omap_aes_dev {
>>>>>> static LIST_HEAD(dev_list);
>>>>>> static DEFINE_SPINLOCK(list_lock);
>>>>>>
>>>>>> +#ifdef DEBUG
>>>>>> +#define omap_aes_read(dd, offset) \
>>>>>> + do { \
>>>>>> + omap_aes_read_1(dd, offset); \
>>>>>> + pr_debug("omap_aes_read(" #offset ")\n"); \
>>>>>> + } while (0)
>>>>>> +
>>>>>> +static inline u32 omap_aes_read_1(struct omap_aes_dev *dd, u32 offset)
>>>>>> +#else
>>>>>> static inline u32 omap_aes_read(struct omap_aes_dev *dd, u32 offset)
>>>>>> +#endif
>>>>>> {
>>>>>> return __raw_readl(dd->io_base + offset);
>>>>>> }
>>>>>>
>>>>>> +#ifdef DEBUG
>>>>>> +#define omap_aes_write(dd, offset, value) \
>>>>>> + do { \
>>>>>> + pr_debug("omap_aes_write(" #offset "=%x) value=%d\n", \
>>>>>> + offset, value); \
>>>>>> + omap_aes_write_1(dd, offset, value); \
>>>>>> + } while (0)
>>>>>> +
>>>>>> +static inline void omap_aes_write_1(struct omap_aes_dev *dd, u32 offset,
>>>>>> + u32 value)
>>>>>> +#else
>>>>>> static inline void omap_aes_write(struct omap_aes_dev *dd, u32 offset,
>>>>>> u32 value)
>>>>>> +#endif
>>>>>> {
>>>>>> __raw_writel(value, dd->io_base + offset);
>>>>>> }
>>>>> Umm, yuck?
>>>>>
>>>>> Is there any real value in read_1 and write_1?
>>>> Can you be more descriptive? There is a lot of value in them for debug
>>>> to show clearly sequence of read/writes. Moreover, they are no-op'd when
>>>> DEBUG is disabled.
>>> pr_debug is no-op'd when DEBUG is not #defined.
>>> so just use a single
>>>
>>> omap_aes_write(...)
>>> {
>>> pr_debug(...)
>>> __raw_writel(...);
>>> }
>> Actually this doesn't work, as the pr_debug cannot print the name of the
>> offset as my original patch set does using "#offset".
>>
>> There are many places where named offsets are used to pass to
>> read/write, and this macro helps to visually see which offset is being
>> written to by name.
>>
>> So the original patch would stand in its current form except for a small
>> rewrite of the write debug part of it as follows to be cleaner getting
>> rid of the _1. For the read , we still need it as we need to return the
>> value from a function which cannot be done in a macro.
>>
>> So the new patch would look something like this:
>>
>> #ifdef DEBUG
>> #define omap_aes_read(dd, offset) \
>> omap_aes_read_1(dd, offset); \
>> pr_debug("omap_aes_read(" #offset ")\n");
>> static inline u32 omap_aes_read_1(struct omap_aes_dev *dd, u32 offset)
>> #else
>> static inline u32 omap_aes_read(struct omap_aes_dev *dd, u32 offset)
>> #endif
>> {
>> return __raw_readl(dd->io_base + offset);
>> }
>
> Bellow version "write" looks much more readable - never re-define
> function signature by macro.
> Above should be similar as well...

Yes, I'll write the read in the final version of this patch like the
write. Its certainly cleaner.

-Joel



>> #ifdef DEBUG
>> #define omap_aes_write(dd, offset, value) \
>> do { \
>> pr_debug("omap_aes_write(" #offset "=%x) value=%d\n", \
>> offset, value); \
>> __raw_writel(value, dd->io_base + offset); \
>> } while (0)
>> #else
>> static inline void omap_aes_write(struct omap_aes_dev *dd, u32 offset,
>> u32 value)
>> {
>> __raw_writel(value, dd->io_base + offset);
>> }
>> #endif
>
>>
>>
>> Thanks,
>>
>> -Joel
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
>> the body of a message to [email protected]
>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>>
>