2022-12-02 09:25:31

by Herbert Xu

Subject: [PATCH 0/10] crypto: Driver conversions for DMA alignment

These are the rest of the driver conversions in order for arm64
to safely lower the kmalloc alignment below that required for DMA.

My criteria for inclusion are:

1) the driver can be built on arm64.
2) the driver may perform DMA on the context structure.

I have worked through all the drivers in crypto but if you think
I've missed something please let me know.
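
For reference, a minimal sketch of the conversion pattern these patches
apply, assuming the CRYPTO_DMA_ALIGN/crypto_*_ctx_dma() helpers this
series builds on; the "foo" driver and its context structure are made
up for illustration:

#include <crypto/internal/skcipher.h>

struct foo_ctx {
	u8 key[32];		/* fetched by the device via DMA */
	unsigned int keylen;
};

static int foo_init_tfm(struct crypto_skcipher *tfm)
{
	/* the _dma variant returns the context at a DMA-aligned offset */
	struct foo_ctx *ctx = crypto_skcipher_ctx_dma(tfm);

	memset(ctx, 0, sizeof(*ctx));
	return 0;
}

At registration time the extra headroom is then reserved by setting
.cra_ctxsize to sizeof(struct foo_ctx) + crypto_dma_padding().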

Thanks,
--
Email: Herbert Xu <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


2022-12-02 10:13:24

by liulongfang

Subject: Re: [PATCH 0/10] crypto: Driver conversions for DMA alignment

On 2022/12/2 17:19, Herbert Xu wrote:
> These are the rest of the driver conversions in order for arm64
> to safely lower the kmalloc alignment below that required for DMA.
>
> My criteria for inclusion are:
>
> 1) the driver can be built on arm64.
> 2) the driver may perform DMA on the context structure.
>
> I have worked through all the drivers in crypto but if you think
> I've missed something please let me know.
>

Hi, Herbert:
There are also skcipher_request_ctx and aead_request_ctx in the hisilicon/sec2 driver
that need to be updated.
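
A sketch of what that conversion would look like, assuming the
request-context helpers from the same series (function and structure
names here are illustrative, not the actual sec2 code):

struct sec_req_ctx {
	u8 c_ivin[16];		/* IV written back by the device */
};

static int sec_skcipher_tfm_init(struct crypto_skcipher *tfm)
{
	/* reserves DMA alignment padding in front of the request ctx */
	crypto_skcipher_set_reqsize_dma(tfm, sizeof(struct sec_req_ctx));
	return 0;
}

static int sec_skcipher_encrypt(struct skcipher_request *sk_req)
{
	/* returns the request ctx at its DMA-aligned offset */
	struct sec_req_ctx *ctx = skcipher_request_ctx_dma(sk_req);

	memset(ctx->c_ivin, 0, sizeof(ctx->c_ivin));
	return 0;
}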

Thanks,
Longfang.
> Thanks,
>

2022-12-04 09:38:29

by Christoph Hellwig

Subject: Re: [PATCH 0/10] crypto: Driver conversions for DMA alignment

On Fri, Dec 02, 2022 at 05:19:43PM +0800, Herbert Xu wrote:
> These are the rest of the driver conversions in order for arm64
> to safely lower the kmalloc alignment below that required for DMA.

Btw, drivers/crypto/ has a lot of weird and most likely bogus uses
of GFP_DMA. Can someone look into those while we're doing DMA audits
of the crypto drivers?

2022-12-06 04:26:33

by Herbert Xu

Subject: Re: [PATCH 0/10] crypto: Driver conversions for DMA alignment

On Sun, Dec 04, 2022 at 01:32:08AM -0800, Christoph Hellwig wrote:
> On Fri, Dec 02, 2022 at 05:19:43PM +0800, Herbert Xu wrote:
> > These are the rest of the driver conversions in order for arm64
> > to safely lower the kmalloc alignment below that required for DMA.
>
> Btw, drivers/crypto/ has a lot of weird and most likely bogus uses
> of GFP_DMA. Can someone look into those while we're doing DMA audits
> of the crypto drivers?

Yes, they're clearly bogus. Basically they are saying they want
memory that is aligned sufficiently for DMA. So if Catalin's
patch-set breaks this assumption, then all the GFP_DMA allocations
in drivers/crypto will need to be enlarged to take this into
account.
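
Concretely, the enlargement amounts to something like the following
sketch, using the generic dma_get_cache_alignment() helper (this is
the pattern the follow-up patches below apply):

-	buf = kmalloc(len, GFP_KERNEL | GFP_DMA);
+	buf = kmalloc(ALIGN(len, dma_get_cache_alignment()), GFP_KERNEL);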

Cheers,
--
Email: Herbert Xu <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

2022-12-06 06:46:22

by Christoph Hellwig

Subject: Re: [PATCH 0/10] crypto: Driver conversions for DMA alignment

On Tue, Dec 06, 2022 at 12:13:52PM +0800, Herbert Xu wrote:
> Yes, they're clearly bogus. Basically they are saying they want
> memory that is aligned sufficiently for DMA. So if Catalin's
> patch-set breaks this assumption, then all the GFP_DMA allocations
> in drivers/crypto will need to be enlarged to take this into
> account.

But GFP_DMA never did do anything at all about alignment. It picks
allocations from ZONE_DMA (which on x86 is the first 16MB only).
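
In other words, GFP_DMA constrains where the allocation may sit in
physical memory, not how it is aligned; a sketch:

	/* physical address restricted to ZONE_DMA (on x86: below 16MB),
	 * but only ARCH_KMALLOC_MINALIGN-byte alignment is guaranteed
	 */
	void *p = kmalloc(64, GFP_KERNEL | GFP_DMA);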

2022-12-06 08:49:36

by Herbert Xu

Subject: Re: [PATCH 0/10] crypto: Driver conversions for DMA alignment

On Mon, Dec 05, 2022 at 10:27:58PM -0800, Christoph Hellwig wrote:
> But GFP_DMA never did do anything at all about alignment. It picks
> allocations from ZONE_DMA (which on x86 is the first 16MB only).

Right. I'm not arguing that they are correct or anything. I'm
just saying that they currently work on arm64 because of the
large minimum kmalloc alignment, and they will all be broken
afterwards.
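
The arm64 status quo in question, sketched with the constants involved
(values as in arch/arm64/include/asm/cache.h at the time; what
Catalin's series changes is the kmalloc side):

	#define ARCH_DMA_MINALIGN	128	/* largest arm64 cache line */
	/* today ARCH_KMALLOC_MINALIGN follows ARCH_DMA_MINALIGN, so every
	 * kmalloc() buffer owns its cache lines; after the series it can
	 * drop to 8, and small buffers may share a line with a neighbour
	 */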

Cheers,
--
Email: Herbert Xu <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

2022-12-29 09:12:50

by Herbert Xu

Subject: [PATCH] crypto: sun8i-ss - Remove GFP_DMA and add DMA alignment padding

GFP_DMA does not guarantee that the returned memory is aligned
for DMA. In fact for sun8i-ss it is superfluous and can be removed.

However, kmalloc may start returning DMA-unaligned memory in future
so fix this by adding the alignment by hand.

Signed-off-by: Herbert Xu <[email protected]>

diff --git a/drivers/crypto/allwinner/sun8i-ss/sun8i-ss-cipher.c b/drivers/crypto/allwinner/sun8i-ss/sun8i-ss-cipher.c
index 902f6be057ec..83c6dfad77e1 100644
--- a/drivers/crypto/allwinner/sun8i-ss/sun8i-ss-cipher.c
+++ b/drivers/crypto/allwinner/sun8i-ss/sun8i-ss-cipher.c
@@ -452,7 +452,7 @@ int sun8i_ss_aes_setkey(struct crypto_skcipher *tfm, const u8 *key,
}
kfree_sensitive(op->key);
op->keylen = keylen;
- op->key = kmemdup(key, keylen, GFP_KERNEL | GFP_DMA);
+ op->key = kmemdup(key, keylen, GFP_KERNEL);
if (!op->key)
return -ENOMEM;

@@ -475,7 +475,7 @@ int sun8i_ss_des3_setkey(struct crypto_skcipher *tfm, const u8 *key,

kfree_sensitive(op->key);
op->keylen = keylen;
- op->key = kmemdup(key, keylen, GFP_KERNEL | GFP_DMA);
+ op->key = kmemdup(key, keylen, GFP_KERNEL);
if (!op->key)
return -ENOMEM;

diff --git a/drivers/crypto/allwinner/sun8i-ss/sun8i-ss-core.c b/drivers/crypto/allwinner/sun8i-ss/sun8i-ss-core.c
index ac2329e2b0e5..c9dc06f97857 100644
--- a/drivers/crypto/allwinner/sun8i-ss/sun8i-ss-core.c
+++ b/drivers/crypto/allwinner/sun8i-ss/sun8i-ss-core.c
@@ -16,6 +16,7 @@
#include <linux/interrupt.h>
#include <linux/io.h>
#include <linux/irq.h>
+#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/of.h>
#include <linux/of_device.h>
@@ -527,7 +528,7 @@ static int allocate_flows(struct sun8i_ss_dev *ss)
init_completion(&ss->flows[i].complete);

ss->flows[i].biv = devm_kmalloc(ss->dev, AES_BLOCK_SIZE,
- GFP_KERNEL | GFP_DMA);
+ GFP_KERNEL);
if (!ss->flows[i].biv) {
err = -ENOMEM;
goto error_engine;
@@ -535,7 +536,7 @@ static int allocate_flows(struct sun8i_ss_dev *ss)

for (j = 0; j < MAX_SG; j++) {
ss->flows[i].iv[j] = devm_kmalloc(ss->dev, AES_BLOCK_SIZE,
- GFP_KERNEL | GFP_DMA);
+ GFP_KERNEL);
if (!ss->flows[i].iv[j]) {
err = -ENOMEM;
goto error_engine;
@@ -544,13 +545,15 @@ static int allocate_flows(struct sun8i_ss_dev *ss)

/* the padding could be up to two block. */
ss->flows[i].pad = devm_kmalloc(ss->dev, MAX_PAD_SIZE,
- GFP_KERNEL | GFP_DMA);
+ GFP_KERNEL);
if (!ss->flows[i].pad) {
err = -ENOMEM;
goto error_engine;
}
- ss->flows[i].result = devm_kmalloc(ss->dev, SHA256_DIGEST_SIZE,
- GFP_KERNEL | GFP_DMA);
+ ss->flows[i].result =
+ devm_kmalloc(ss->dev, max(SHA256_DIGEST_SIZE,
+ dma_get_cache_alignment()),
+ GFP_KERNEL);
if (!ss->flows[i].result) {
err = -ENOMEM;
goto error_engine;
diff --git a/drivers/crypto/allwinner/sun8i-ss/sun8i-ss-hash.c b/drivers/crypto/allwinner/sun8i-ss/sun8i-ss-hash.c
index 36a82b22953c..577bf636f7fb 100644
--- a/drivers/crypto/allwinner/sun8i-ss/sun8i-ss-hash.c
+++ b/drivers/crypto/allwinner/sun8i-ss/sun8i-ss-hash.c
@@ -79,10 +79,10 @@ int sun8i_ss_hmac_setkey(struct crypto_ahash *ahash, const u8 *key,
memcpy(tfmctx->key, key, keylen);
}

- tfmctx->ipad = kzalloc(bs, GFP_KERNEL | GFP_DMA);
+ tfmctx->ipad = kzalloc(bs, GFP_KERNEL);
if (!tfmctx->ipad)
return -ENOMEM;
- tfmctx->opad = kzalloc(bs, GFP_KERNEL | GFP_DMA);
+ tfmctx->opad = kzalloc(bs, GFP_KERNEL);
if (!tfmctx->opad) {
ret = -ENOMEM;
goto err_opad;
diff --git a/drivers/crypto/allwinner/sun8i-ss/sun8i-ss-prng.c b/drivers/crypto/allwinner/sun8i-ss/sun8i-ss-prng.c
index dd677e9ed06f..70c7b5d571b8 100644
--- a/drivers/crypto/allwinner/sun8i-ss/sun8i-ss-prng.c
+++ b/drivers/crypto/allwinner/sun8i-ss/sun8i-ss-prng.c
@@ -11,6 +11,8 @@
*/
#include "sun8i-ss.h"
#include <linux/dma-mapping.h>
+#include <linux/kernel.h>
+#include <linux/mm.h>
#include <linux/pm_runtime.h>
#include <crypto/internal/rng.h>

@@ -25,7 +27,7 @@ int sun8i_ss_prng_seed(struct crypto_rng *tfm, const u8 *seed,
ctx->seed = NULL;
}
if (!ctx->seed)
- ctx->seed = kmalloc(slen, GFP_KERNEL | GFP_DMA);
+ ctx->seed = kmalloc(slen, GFP_KERNEL);
if (!ctx->seed)
return -ENOMEM;

@@ -58,6 +60,7 @@ int sun8i_ss_prng_generate(struct crypto_rng *tfm, const u8 *src,
struct sun8i_ss_rng_tfm_ctx *ctx = crypto_rng_ctx(tfm);
struct rng_alg *alg = crypto_rng_alg(tfm);
struct sun8i_ss_alg_template *algt;
+ unsigned int todo_with_padding;
struct sun8i_ss_dev *ss;
dma_addr_t dma_iv, dma_dst;
unsigned int todo;
@@ -81,7 +84,11 @@ int sun8i_ss_prng_generate(struct crypto_rng *tfm, const u8 *src,
todo = dlen + PRNG_SEED_SIZE + PRNG_DATA_SIZE;
todo -= todo % PRNG_DATA_SIZE;

- d = kzalloc(todo, GFP_KERNEL | GFP_DMA);
+ todo_with_padding = ALIGN(todo, dma_get_cache_alignment());
+ if (todo_with_padding < todo || todo < dlen)
+ return -EOVERFLOW;
+
+ d = kzalloc(todo_with_padding, GFP_KERNEL);
if (!d)
return -ENOMEM;

--
Email: Herbert Xu <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

2022-12-30 05:43:29

by Herbert Xu

Subject: [PATCH] crypto: caam - Remove GFP_DMA and add DMA alignment padding

GFP_DMA does not guarantee that the returned memory is aligned
for DMA. It should be removed where it is superfluous.

However, kmalloc may start returning DMA-unaligned memory in future
so fix this by adding the alignment by hand.

Signed-off-by: Herbert Xu <[email protected]>

diff --git a/drivers/crypto/caam/blob_gen.c b/drivers/crypto/caam/blob_gen.c
index 1f65df489847..0c347ba6e692 100644
--- a/drivers/crypto/caam/blob_gen.c
+++ b/drivers/crypto/caam/blob_gen.c
@@ -83,7 +83,7 @@ int caam_process_blob(struct caam_blob_priv *priv,
output_len = info->input_len - CAAM_BLOB_OVERHEAD;
}

- desc = kzalloc(CAAM_BLOB_DESC_BYTES_MAX, GFP_KERNEL | GFP_DMA);
+ desc = kzalloc(CAAM_BLOB_DESC_BYTES_MAX, GFP_KERNEL);
if (!desc)
return -ENOMEM;

diff --git a/drivers/crypto/caam/caamalg.c b/drivers/crypto/caam/caamalg.c
index ecc15bc521db..4a9b998a8d26 100644
--- a/drivers/crypto/caam/caamalg.c
+++ b/drivers/crypto/caam/caamalg.c
@@ -59,6 +59,8 @@
#include <crypto/engine.h>
#include <crypto/xts.h>
#include <asm/unaligned.h>
+#include <linux/dma-mapping.h>
+#include <linux/kernel.h>

/*
* crypto alg
@@ -1379,8 +1381,7 @@ static struct aead_edesc *aead_edesc_alloc(struct aead_request *req,
sec4_sg_bytes = sec4_sg_len * sizeof(struct sec4_sg_entry);

/* allocate space for base edesc and hw desc commands, link tables */
- edesc = kzalloc(sizeof(*edesc) + desc_bytes + sec4_sg_bytes,
- GFP_DMA | flags);
+ edesc = kzalloc(sizeof(*edesc) + desc_bytes + sec4_sg_bytes, flags);
if (!edesc) {
caam_unmap(jrdev, req->src, req->dst, src_nents, dst_nents, 0,
0, 0, 0);
@@ -1608,6 +1609,7 @@ static struct skcipher_edesc *skcipher_edesc_alloc(struct skcipher_request *req,
u8 *iv;
int ivsize = crypto_skcipher_ivsize(skcipher);
int dst_sg_idx, sec4_sg_ents, sec4_sg_bytes;
+ unsigned int aligned_size;

src_nents = sg_nents_for_len(req->src, req->cryptlen);
if (unlikely(src_nents < 0)) {
@@ -1681,15 +1683,18 @@ static struct skcipher_edesc *skcipher_edesc_alloc(struct skcipher_request *req,
/*
* allocate space for base edesc and hw desc commands, link tables, IV
*/
- edesc = kzalloc(sizeof(*edesc) + desc_bytes + sec4_sg_bytes + ivsize,
- GFP_DMA | flags);
- if (!edesc) {
+ aligned_size = ALIGN(ivsize, __alignof__(*edesc));
+ aligned_size += sizeof(*edesc) + desc_bytes + sec4_sg_bytes;
+ aligned_size = ALIGN(aligned_size, dma_get_cache_alignment());
+ iv = kzalloc(aligned_size, flags);
+ if (!iv) {
dev_err(jrdev, "could not allocate extended descriptor\n");
caam_unmap(jrdev, req->src, req->dst, src_nents, dst_nents, 0,
0, 0, 0);
return ERR_PTR(-ENOMEM);
}

+ edesc = (void *)(iv + ALIGN(ivsize, __alignof__(*edesc)));
edesc->src_nents = src_nents;
edesc->dst_nents = dst_nents;
edesc->mapped_src_nents = mapped_src_nents;
@@ -1701,7 +1706,6 @@ static struct skcipher_edesc *skcipher_edesc_alloc(struct skcipher_request *req,

/* Make sure IV is located in a DMAable area */
if (ivsize) {
- iv = (u8 *)edesc->sec4_sg + sec4_sg_bytes;
memcpy(iv, req->iv, ivsize);

iv_dma = dma_map_single(jrdev, iv, ivsize, DMA_BIDIRECTIONAL);
diff --git a/drivers/crypto/caam/caamalg_qi.c b/drivers/crypto/caam/caamalg_qi.c
index c37b67be0492..5e218bf20d5b 100644
--- a/drivers/crypto/caam/caamalg_qi.c
+++ b/drivers/crypto/caam/caamalg_qi.c
@@ -20,6 +20,8 @@
#include "caamalg_desc.h"
#include <crypto/xts.h>
#include <asm/unaligned.h>
+#include <linux/dma-mapping.h>
+#include <linux/kernel.h>

/*
* crypto alg
@@ -959,7 +961,7 @@ static struct aead_edesc *aead_edesc_alloc(struct aead_request *req,
return (struct aead_edesc *)drv_ctx;

/* allocate space for base edesc and hw desc commands, link tables */
- edesc = qi_cache_alloc(GFP_DMA | flags);
+ edesc = qi_cache_alloc(flags);
if (unlikely(!edesc)) {
dev_err(qidev, "could not allocate extended descriptor\n");
return ERR_PTR(-ENOMEM);
@@ -1317,8 +1319,9 @@ static struct skcipher_edesc *skcipher_edesc_alloc(struct skcipher_request *req,
qm_sg_ents = 1 + pad_sg_nents(qm_sg_ents);

qm_sg_bytes = qm_sg_ents * sizeof(struct qm_sg_entry);
- if (unlikely(offsetof(struct skcipher_edesc, sgt) + qm_sg_bytes +
- ivsize > CAAM_QI_MEMCACHE_SIZE)) {
+ if (unlikely(ALIGN(ivsize, __alignof__(*edesc)) +
+ offsetof(struct skcipher_edesc, sgt) + qm_sg_bytes >
+ CAAM_QI_MEMCACHE_SIZE)) {
dev_err(qidev, "No space for %d S/G entries and/or %dB IV\n",
qm_sg_ents, ivsize);
caam_unmap(qidev, req->src, req->dst, src_nents, dst_nents, 0,
@@ -1327,17 +1330,18 @@ static struct skcipher_edesc *skcipher_edesc_alloc(struct skcipher_request *req,
}

/* allocate space for base edesc, link tables and IV */
- edesc = qi_cache_alloc(GFP_DMA | flags);
- if (unlikely(!edesc)) {
+ iv = qi_cache_alloc(flags);
+ if (unlikely(!iv)) {
dev_err(qidev, "could not allocate extended descriptor\n");
caam_unmap(qidev, req->src, req->dst, src_nents, dst_nents, 0,
0, DMA_NONE, 0, 0);
return ERR_PTR(-ENOMEM);
}

+ edesc = (void *)(iv + ALIGN(ivsize, __alignof__(*edesc)));
+
/* Make sure IV is located in a DMAable area */
sg_table = &edesc->sgt[0];
- iv = (u8 *)(sg_table + qm_sg_ents);
memcpy(iv, req->iv, ivsize);

iv_dma = dma_map_single(qidev, iv, ivsize, DMA_BIDIRECTIONAL);
diff --git a/drivers/crypto/caam/caamalg_qi2.c b/drivers/crypto/caam/caamalg_qi2.c
index 1b0dd742c53f..0ddef9a033a1 100644
--- a/drivers/crypto/caam/caamalg_qi2.c
+++ b/drivers/crypto/caam/caamalg_qi2.c
@@ -16,7 +16,9 @@
#include "caamalg_desc.h"
#include "caamhash_desc.h"
#include "dpseci-debugfs.h"
+#include <linux/dma-mapping.h>
#include <linux/fsl/mc.h>
+#include <linux/kernel.h>
#include <soc/fsl/dpaa2-io.h>
#include <soc/fsl/dpaa2-fd.h>
#include <crypto/xts.h>
@@ -370,7 +372,7 @@ static struct aead_edesc *aead_edesc_alloc(struct aead_request *req,
struct dpaa2_sg_entry *sg_table;

/* allocate space for base edesc, link tables and IV */
- edesc = qi_cache_zalloc(GFP_DMA | flags);
+ edesc = qi_cache_zalloc(flags);
if (unlikely(!edesc)) {
dev_err(dev, "could not allocate extended descriptor\n");
return ERR_PTR(-ENOMEM);
@@ -1189,7 +1191,7 @@ static struct skcipher_edesc *skcipher_edesc_alloc(struct skcipher_request *req)
}

/* allocate space for base edesc, link tables and IV */
- edesc = qi_cache_zalloc(GFP_DMA | flags);
+ edesc = qi_cache_zalloc(flags);
if (unlikely(!edesc)) {
dev_err(dev, "could not allocate extended descriptor\n");
caam_unmap(dev, req->src, req->dst, src_nents, dst_nents, 0,
@@ -3220,14 +3222,14 @@ static int hash_digest_key(struct caam_hash_ctx *ctx, u32 *keylen, u8 *key,
int ret = -ENOMEM;
struct dpaa2_fl_entry *in_fle, *out_fle;

- req_ctx = kzalloc(sizeof(*req_ctx), GFP_KERNEL | GFP_DMA);
+ req_ctx = kzalloc(sizeof(*req_ctx), GFP_KERNEL);
if (!req_ctx)
return -ENOMEM;

in_fle = &req_ctx->fd_flt[1];
out_fle = &req_ctx->fd_flt[0];

- flc = kzalloc(sizeof(*flc), GFP_KERNEL | GFP_DMA);
+ flc = kzalloc(sizeof(*flc), GFP_KERNEL);
if (!flc)
goto err_flc;

@@ -3316,7 +3318,13 @@ static int ahash_setkey(struct crypto_ahash *ahash, const u8 *key,
dev_dbg(ctx->dev, "keylen %d blocksize %d\n", keylen, blocksize);

if (keylen > blocksize) {
- hashed_key = kmemdup(key, keylen, GFP_KERNEL | GFP_DMA);
+ unsigned int aligned_len =
+ ALIGN(keylen, dma_get_cache_alignment());
+
+ if (aligned_len < keylen)
+ return -EOVERFLOW;
+
+ hashed_key = kmemdup(key, aligned_len, GFP_KERNEL);
if (!hashed_key)
return -ENOMEM;
ret = hash_digest_key(ctx, &keylen, hashed_key, digestsize);
@@ -3560,7 +3568,7 @@ static int ahash_update_ctx(struct ahash_request *req)
}

/* allocate space for base edesc and link tables */
- edesc = qi_cache_zalloc(GFP_DMA | flags);
+ edesc = qi_cache_zalloc(flags);
if (!edesc) {
dma_unmap_sg(ctx->dev, req->src, src_nents,
DMA_TO_DEVICE);
@@ -3654,7 +3662,7 @@ static int ahash_final_ctx(struct ahash_request *req)
int ret;

/* allocate space for base edesc and link tables */
- edesc = qi_cache_zalloc(GFP_DMA | flags);
+ edesc = qi_cache_zalloc(flags);
if (!edesc)
return -ENOMEM;

@@ -3743,7 +3751,7 @@ static int ahash_finup_ctx(struct ahash_request *req)
}

/* allocate space for base edesc and link tables */
- edesc = qi_cache_zalloc(GFP_DMA | flags);
+ edesc = qi_cache_zalloc(flags);
if (!edesc) {
dma_unmap_sg(ctx->dev, req->src, src_nents, DMA_TO_DEVICE);
return -ENOMEM;
@@ -3836,7 +3844,7 @@ static int ahash_digest(struct ahash_request *req)
}

/* allocate space for base edesc and link tables */
- edesc = qi_cache_zalloc(GFP_DMA | flags);
+ edesc = qi_cache_zalloc(flags);
if (!edesc) {
dma_unmap_sg(ctx->dev, req->src, src_nents, DMA_TO_DEVICE);
return ret;
@@ -3913,7 +3921,7 @@ static int ahash_final_no_ctx(struct ahash_request *req)
int ret = -ENOMEM;

/* allocate space for base edesc and link tables */
- edesc = qi_cache_zalloc(GFP_DMA | flags);
+ edesc = qi_cache_zalloc(flags);
if (!edesc)
return ret;

@@ -4012,7 +4020,7 @@ static int ahash_update_no_ctx(struct ahash_request *req)
}

/* allocate space for base edesc and link tables */
- edesc = qi_cache_zalloc(GFP_DMA | flags);
+ edesc = qi_cache_zalloc(flags);
if (!edesc) {
dma_unmap_sg(ctx->dev, req->src, src_nents,
DMA_TO_DEVICE);
@@ -4125,7 +4133,7 @@ static int ahash_finup_no_ctx(struct ahash_request *req)
}

/* allocate space for base edesc and link tables */
- edesc = qi_cache_zalloc(GFP_DMA | flags);
+ edesc = qi_cache_zalloc(flags);
if (!edesc) {
dma_unmap_sg(ctx->dev, req->src, src_nents, DMA_TO_DEVICE);
return ret;
@@ -4230,7 +4238,7 @@ static int ahash_update_first(struct ahash_request *req)
}

/* allocate space for base edesc and link tables */
- edesc = qi_cache_zalloc(GFP_DMA | flags);
+ edesc = qi_cache_zalloc(flags);
if (!edesc) {
dma_unmap_sg(ctx->dev, req->src, src_nents,
DMA_TO_DEVICE);
@@ -4926,6 +4934,7 @@ static int dpaa2_dpseci_congestion_setup(struct dpaa2_caam_priv *priv,
{
struct dpseci_congestion_notification_cfg cong_notif_cfg = { 0 };
struct device *dev = priv->dev;
+ unsigned int alignmask;
int err;

/*
@@ -4936,13 +4945,14 @@ static int dpaa2_dpseci_congestion_setup(struct dpaa2_caam_priv *priv,
!(priv->dpseci_attr.options & DPSECI_OPT_HAS_CG))
return 0;

- priv->cscn_mem = kzalloc(DPAA2_CSCN_SIZE + DPAA2_CSCN_ALIGN,
- GFP_KERNEL | GFP_DMA);
+ alignmask = DPAA2_CSCN_ALIGN - 1;
+ alignmask |= dma_get_cache_alignment() - 1;
+ priv->cscn_mem = kzalloc(ALIGN(DPAA2_CSCN_SIZE, alignmask + 1),
+ GFP_KERNEL);
if (!priv->cscn_mem)
return -ENOMEM;

- priv->cscn_mem_aligned = PTR_ALIGN(priv->cscn_mem, DPAA2_CSCN_ALIGN);
- priv->cscn_dma = dma_map_single(dev, priv->cscn_mem_aligned,
+ priv->cscn_dma = dma_map_single(dev, priv->cscn_mem,
DPAA2_CSCN_SIZE, DMA_FROM_DEVICE);
if (dma_mapping_error(dev, priv->cscn_dma)) {
dev_err(dev, "Error mapping CSCN memory area\n");
@@ -5174,7 +5184,7 @@ static int dpaa2_caam_probe(struct fsl_mc_device *dpseci_dev)
priv->domain = iommu_get_domain_for_dev(dev);

qi_cache = kmem_cache_create("dpaa2_caamqicache", CAAM_QI_MEMCACHE_SIZE,
- 0, SLAB_CACHE_DMA, NULL);
+ 0, 0, NULL);
if (!qi_cache) {
dev_err(dev, "Can't allocate SEC cache\n");
return -ENOMEM;
@@ -5451,7 +5461,7 @@ int dpaa2_caam_enqueue(struct device *dev, struct caam_request *req)
dma_sync_single_for_cpu(priv->dev, priv->cscn_dma,
DPAA2_CSCN_SIZE,
DMA_FROM_DEVICE);
- if (unlikely(dpaa2_cscn_state_congested(priv->cscn_mem_aligned))) {
+ if (unlikely(dpaa2_cscn_state_congested(priv->cscn_mem))) {
dev_dbg_ratelimited(dev, "Dropping request\n");
return -EBUSY;
}
diff --git a/drivers/crypto/caam/caamalg_qi2.h b/drivers/crypto/caam/caamalg_qi2.h
index d35253407ade..abb502bb675c 100644
--- a/drivers/crypto/caam/caamalg_qi2.h
+++ b/drivers/crypto/caam/caamalg_qi2.h
@@ -7,13 +7,14 @@
#ifndef _CAAMALG_QI2_H_
#define _CAAMALG_QI2_H_

+#include <crypto/internal/skcipher.h>
+#include <linux/compiler_attributes.h>
#include <soc/fsl/dpaa2-io.h>
#include <soc/fsl/dpaa2-fd.h>
#include <linux/threads.h>
#include <linux/netdevice.h>
#include "dpseci.h"
#include "desc_constr.h"
-#include <crypto/skcipher.h>

#define DPAA2_CAAM_STORE_SIZE 16
/* NAPI weight *must* be a multiple of the store size. */
@@ -36,8 +37,6 @@
* @tx_queue_attr: array of Tx queue attributes
* @cscn_mem: pointer to memory region containing the congestion SCN
* it's size is larger than to accommodate alignment
- * @cscn_mem_aligned: pointer to congestion SCN; it is computed as
- * PTR_ALIGN(cscn_mem, DPAA2_CSCN_ALIGN)
* @cscn_dma: dma address used by the QMAN to write CSCN messages
* @dev: device associated with the DPSECI object
* @mc_io: pointer to MC portal's I/O object
@@ -58,7 +57,6 @@ struct dpaa2_caam_priv {

/* congestion */
void *cscn_mem;
- void *cscn_mem_aligned;
dma_addr_t cscn_dma;

struct device *dev;
@@ -158,7 +156,7 @@ struct ahash_edesc {
struct caam_flc {
u32 flc[16];
u32 sh_desc[MAX_SDLEN];
-} ____cacheline_aligned;
+} __aligned(CRYPTO_DMA_ALIGN);

enum optype {
ENCRYPT = 0,
@@ -180,7 +178,7 @@ enum optype {
* @edesc: extended descriptor; points to one of {skcipher,aead}_edesc
*/
struct caam_request {
- struct dpaa2_fl_entry fd_flt[2];
+ struct dpaa2_fl_entry fd_flt[2] __aligned(CRYPTO_DMA_ALIGN);
dma_addr_t fd_flt_dma;
struct caam_flc *flc;
dma_addr_t flc_dma;
diff --git a/drivers/crypto/caam/caamhash.c b/drivers/crypto/caam/caamhash.c
index 1050e965a438..1f357f48c473 100644
--- a/drivers/crypto/caam/caamhash.c
+++ b/drivers/crypto/caam/caamhash.c
@@ -66,6 +66,8 @@
#include "key_gen.h"
#include "caamhash_desc.h"
#include <crypto/engine.h>
+#include <linux/dma-mapping.h>
+#include <linux/kernel.h>

#define CAAM_CRA_PRIORITY 3000

@@ -365,7 +367,7 @@ static int hash_digest_key(struct caam_hash_ctx *ctx, u32 *keylen, u8 *key,
dma_addr_t key_dma;
int ret;

- desc = kmalloc(CAAM_CMD_SZ * 8 + CAAM_PTR_SZ * 2, GFP_KERNEL | GFP_DMA);
+ desc = kmalloc(CAAM_CMD_SZ * 8 + CAAM_PTR_SZ * 2, GFP_KERNEL);
if (!desc) {
dev_err(jrdev, "unable to allocate key input memory\n");
return -ENOMEM;
@@ -432,7 +434,13 @@ static int ahash_setkey(struct crypto_ahash *ahash,
dev_dbg(jrdev, "keylen %d\n", keylen);

if (keylen > blocksize) {
- hashed_key = kmemdup(key, keylen, GFP_KERNEL | GFP_DMA);
+ unsigned int aligned_len =
+ ALIGN(keylen, dma_get_cache_alignment());
+
+ if (aligned_len < keylen)
+ return -EOVERFLOW;
+
+ hashed_key = kmemdup(key, aligned_len, GFP_KERNEL);
if (!hashed_key)
return -ENOMEM;
ret = hash_digest_key(ctx, &keylen, hashed_key, digestsize);
@@ -702,7 +710,7 @@ static struct ahash_edesc *ahash_edesc_alloc(struct ahash_request *req,
struct ahash_edesc *edesc;
unsigned int sg_size = sg_num * sizeof(struct sec4_sg_entry);

- edesc = kzalloc(sizeof(*edesc) + sg_size, GFP_DMA | flags);
+ edesc = kzalloc(sizeof(*edesc) + sg_size, flags);
if (!edesc) {
dev_err(ctx->jrdev, "could not allocate extended descriptor\n");
return NULL;
diff --git a/drivers/crypto/caam/caampkc.c b/drivers/crypto/caam/caampkc.c
index aef031946f33..e40614fef39d 100644
--- a/drivers/crypto/caam/caampkc.c
+++ b/drivers/crypto/caam/caampkc.c
@@ -16,6 +16,8 @@
#include "desc_constr.h"
#include "sg_sw_sec4.h"
#include "caampkc.h"
+#include <linux/dma-mapping.h>
+#include <linux/kernel.h>

#define DESC_RSA_PUB_LEN (2 * CAAM_CMD_SZ + SIZEOF_RSA_PUB_PDB)
#define DESC_RSA_PRIV_F1_LEN (2 * CAAM_CMD_SZ + \
@@ -310,8 +312,7 @@ static struct rsa_edesc *rsa_edesc_alloc(struct akcipher_request *req,
sec4_sg_bytes = sec4_sg_len * sizeof(struct sec4_sg_entry);

/* allocate space for base edesc, hw desc commands and link tables */
- edesc = kzalloc(sizeof(*edesc) + desclen + sec4_sg_bytes,
- GFP_DMA | flags);
+ edesc = kzalloc(sizeof(*edesc) + desclen + sec4_sg_bytes, flags);
if (!edesc)
goto dst_fail;

@@ -898,7 +899,7 @@ static u8 *caam_read_rsa_crt(const u8 *ptr, size_t nbytes, size_t dstlen)
if (!nbytes)
return NULL;

- dst = kzalloc(dstlen, GFP_DMA | GFP_KERNEL);
+ dst = kzalloc(dstlen, GFP_KERNEL);
if (!dst)
return NULL;

@@ -910,7 +911,7 @@ static u8 *caam_read_rsa_crt(const u8 *ptr, size_t nbytes, size_t dstlen)
/**
* caam_read_raw_data - Read a raw byte stream as a positive integer.
* The function skips buffer's leading zeros, copies the remained data
- * to a buffer allocated in the GFP_DMA | GFP_KERNEL zone and returns
+ * to a buffer allocated in the GFP_KERNEL zone and returns
* the address of the new buffer.
*
* @buf : The data to read
@@ -923,7 +924,7 @@ static inline u8 *caam_read_raw_data(const u8 *buf, size_t *nbytes)
if (!*nbytes)
return NULL;

- return kmemdup(buf, *nbytes, GFP_DMA | GFP_KERNEL);
+ return kmemdup(buf, *nbytes, GFP_KERNEL);
}

static int caam_rsa_check_key_length(unsigned int len)
@@ -949,13 +950,13 @@ static int caam_rsa_set_pub_key(struct crypto_akcipher *tfm, const void *key,
return ret;

/* Copy key in DMA zone */
- rsa_key->e = kmemdup(raw_key.e, raw_key.e_sz, GFP_DMA | GFP_KERNEL);
+ rsa_key->e = kmemdup(raw_key.e, raw_key.e_sz, GFP_KERNEL);
if (!rsa_key->e)
goto err;

/*
* Skip leading zeros and copy the positive integer to a buffer
- * allocated in the GFP_DMA | GFP_KERNEL zone. The decryption descriptor
+ * allocated in the GFP_KERNEL zone. The decryption descriptor
* expects a positive integer for the RSA modulus and uses its length as
* decryption output length.
*/
@@ -983,6 +984,7 @@ static void caam_rsa_set_priv_key_form(struct caam_rsa_ctx *ctx,
struct caam_rsa_key *rsa_key = &ctx->key;
size_t p_sz = raw_key->p_sz;
size_t q_sz = raw_key->q_sz;
+ unsigned aligned_size;

rsa_key->p = caam_read_raw_data(raw_key->p, &p_sz);
if (!rsa_key->p)
@@ -994,11 +996,13 @@ static void caam_rsa_set_priv_key_form(struct caam_rsa_ctx *ctx,
goto free_p;
rsa_key->q_sz = q_sz;

- rsa_key->tmp1 = kzalloc(raw_key->p_sz, GFP_DMA | GFP_KERNEL);
+ aligned_size = ALIGN(raw_key->p_sz, dma_get_cache_alignment());
+ rsa_key->tmp1 = kzalloc(aligned_size, GFP_KERNEL);
if (!rsa_key->tmp1)
goto free_q;

- rsa_key->tmp2 = kzalloc(raw_key->q_sz, GFP_DMA | GFP_KERNEL);
+ aligned_size = ALIGN(raw_key->q_sz, dma_get_cache_alignment());
+ rsa_key->tmp2 = kzalloc(aligned_size, GFP_KERNEL);
if (!rsa_key->tmp2)
goto free_tmp1;

@@ -1051,17 +1055,17 @@ static int caam_rsa_set_priv_key(struct crypto_akcipher *tfm, const void *key,
return ret;

/* Copy key in DMA zone */
- rsa_key->d = kmemdup(raw_key.d, raw_key.d_sz, GFP_DMA | GFP_KERNEL);
+ rsa_key->d = kmemdup(raw_key.d, raw_key.d_sz, GFP_KERNEL);
if (!rsa_key->d)
goto err;

- rsa_key->e = kmemdup(raw_key.e, raw_key.e_sz, GFP_DMA | GFP_KERNEL);
+ rsa_key->e = kmemdup(raw_key.e, raw_key.e_sz, GFP_KERNEL);
if (!rsa_key->e)
goto err;

/*
* Skip leading zeros and copy the positive integer to a buffer
- * allocated in the GFP_DMA | GFP_KERNEL zone. The decryption descriptor
+ * allocated in the GFP_KERNEL zone. The decryption descriptor
* expects a positive integer for the RSA modulus and uses its length as
* decryption output length.
*/
@@ -1185,8 +1189,7 @@ int caam_pkc_init(struct device *ctrldev)
return 0;

/* allocate zero buffer, used for padding input */
- zero_buffer = kzalloc(CAAM_RSA_MAX_INPUT_SIZE - 1, GFP_DMA |
- GFP_KERNEL);
+ zero_buffer = kzalloc(CAAM_RSA_MAX_INPUT_SIZE - 1, GFP_KERNEL);
if (!zero_buffer)
return -ENOMEM;

diff --git a/drivers/crypto/caam/caamprng.c b/drivers/crypto/caam/caamprng.c
index 4839e66300a2..6e4c1191cb28 100644
--- a/drivers/crypto/caam/caamprng.c
+++ b/drivers/crypto/caam/caamprng.c
@@ -8,6 +8,8 @@

#include <linux/completion.h>
#include <crypto/internal/rng.h>
+#include <linux/dma-mapping.h>
+#include <linux/kernel.h>
#include "compat.h"
#include "regs.h"
#include "intern.h"
@@ -75,6 +77,7 @@ static int caam_prng_generate(struct crypto_rng *tfm,
const u8 *src, unsigned int slen,
u8 *dst, unsigned int dlen)
{
+ unsigned int aligned_dlen = ALIGN(dlen, dma_get_cache_alignment());
struct caam_prng_ctx ctx;
struct device *jrdev;
dma_addr_t dst_dma;
@@ -82,7 +85,10 @@ static int caam_prng_generate(struct crypto_rng *tfm,
u8 *buf;
int ret;

- buf = kzalloc(dlen, GFP_KERNEL);
+ if (aligned_dlen < dlen)
+ return -EOVERFLOW;
+
+ buf = kzalloc(aligned_dlen, GFP_KERNEL);
if (!buf)
return -ENOMEM;

@@ -94,7 +100,7 @@ static int caam_prng_generate(struct crypto_rng *tfm,
return ret;
}

- desc = kzalloc(CAAM_PRNG_MAX_DESC_LEN, GFP_KERNEL | GFP_DMA);
+ desc = kzalloc(CAAM_PRNG_MAX_DESC_LEN, GFP_KERNEL);
if (!desc) {
ret = -ENOMEM;
goto out1;
@@ -156,7 +162,7 @@ static int caam_prng_seed(struct crypto_rng *tfm,
return ret;
}

- desc = kzalloc(CAAM_PRNG_MAX_DESC_LEN, GFP_KERNEL | GFP_DMA);
+ desc = kzalloc(CAAM_PRNG_MAX_DESC_LEN, GFP_KERNEL);
if (!desc) {
caam_jr_free(jrdev);
return -ENOMEM;
diff --git a/drivers/crypto/caam/caamrng.c b/drivers/crypto/caam/caamrng.c
index 1f0e82050976..1fd8ff965006 100644
--- a/drivers/crypto/caam/caamrng.c
+++ b/drivers/crypto/caam/caamrng.c
@@ -12,6 +12,8 @@
#include <linux/hw_random.h>
#include <linux/completion.h>
#include <linux/atomic.h>
+#include <linux/dma-mapping.h>
+#include <linux/kernel.h>
#include <linux/kfifo.h>

#include "compat.h"
@@ -176,17 +178,18 @@ static int caam_init(struct hwrng *rng)
int err;

ctx->desc_sync = devm_kzalloc(ctx->ctrldev, CAAM_RNG_DESC_LEN,
- GFP_DMA | GFP_KERNEL);
+ GFP_KERNEL);
if (!ctx->desc_sync)
return -ENOMEM;

ctx->desc_async = devm_kzalloc(ctx->ctrldev, CAAM_RNG_DESC_LEN,
- GFP_DMA | GFP_KERNEL);
+ GFP_KERNEL);
if (!ctx->desc_async)
return -ENOMEM;

- if (kfifo_alloc(&ctx->fifo, CAAM_RNG_MAX_FIFO_STORE_SIZE,
- GFP_DMA | GFP_KERNEL))
+ if (kfifo_alloc(&ctx->fifo, ALIGN(CAAM_RNG_MAX_FIFO_STORE_SIZE,
+ dma_get_cache_alignment()),
+ GFP_KERNEL))
return -ENOMEM;

INIT_WORK(&ctx->worker, caam_rng_worker);
diff --git a/drivers/crypto/caam/ctrl.c b/drivers/crypto/caam/ctrl.c
index 32253a064d0f..6278afb951c3 100644
--- a/drivers/crypto/caam/ctrl.c
+++ b/drivers/crypto/caam/ctrl.c
@@ -199,7 +199,7 @@ static int deinstantiate_rng(struct device *ctrldev, int state_handle_mask)
u32 *desc, status;
int sh_idx, ret = 0;

- desc = kmalloc(CAAM_CMD_SZ * 3, GFP_KERNEL | GFP_DMA);
+ desc = kmalloc(CAAM_CMD_SZ * 3, GFP_KERNEL);
if (!desc)
return -ENOMEM;

@@ -276,7 +276,7 @@ static int instantiate_rng(struct device *ctrldev, int state_handle_mask,
int ret = 0, sh_idx;

ctrl = (struct caam_ctrl __iomem *)ctrlpriv->ctrl;
- desc = kmalloc(CAAM_CMD_SZ * 7, GFP_KERNEL | GFP_DMA);
+ desc = kmalloc(CAAM_CMD_SZ * 7, GFP_KERNEL);
if (!desc)
return -ENOMEM;

diff --git a/drivers/crypto/caam/key_gen.c b/drivers/crypto/caam/key_gen.c
index b0e8a4939b4f..88cc4fe2a585 100644
--- a/drivers/crypto/caam/key_gen.c
+++ b/drivers/crypto/caam/key_gen.c
@@ -64,7 +64,7 @@ int gen_split_key(struct device *jrdev, u8 *key_out,
if (local_max > max_keylen)
return -EINVAL;

- desc = kmalloc(CAAM_CMD_SZ * 6 + CAAM_PTR_SZ * 2, GFP_KERNEL | GFP_DMA);
+ desc = kmalloc(CAAM_CMD_SZ * 6 + CAAM_PTR_SZ * 2, GFP_KERNEL);
if (!desc) {
dev_err(jrdev, "unable to allocate key input memory\n");
return ret;
diff --git a/drivers/crypto/caam/qi.c b/drivers/crypto/caam/qi.c
index c36f27376d7e..4c52c9365558 100644
--- a/drivers/crypto/caam/qi.c
+++ b/drivers/crypto/caam/qi.c
@@ -614,7 +614,7 @@ static int alloc_rsp_fq_cpu(struct device *qidev, unsigned int cpu)
struct qman_fq *fq;
int ret;

- fq = kzalloc(sizeof(*fq), GFP_KERNEL | GFP_DMA);
+ fq = kzalloc(sizeof(*fq), GFP_KERNEL);
if (!fq)
return -ENOMEM;

@@ -756,7 +756,7 @@ int caam_qi_init(struct platform_device *caam_pdev)
}

qi_cache = kmem_cache_create("caamqicache", CAAM_QI_MEMCACHE_SIZE, 0,
- SLAB_CACHE_DMA, NULL);
+ 0, NULL);
if (!qi_cache) {
dev_err(qidev, "Can't allocate CAAM cache\n");
free_rsp_fqs();
diff --git a/drivers/crypto/caam/qi.h b/drivers/crypto/caam/qi.h
index 5894f16f8fe3..a96e3d213c06 100644
--- a/drivers/crypto/caam/qi.h
+++ b/drivers/crypto/caam/qi.h
@@ -9,6 +9,8 @@
#ifndef __QI_H__
#define __QI_H__

+#include <crypto/algapi.h>
+#include <linux/compiler_attributes.h>
#include <soc/fsl/qman.h>
#include "compat.h"
#include "desc.h"
@@ -58,8 +60,10 @@ enum optype {
* @qidev: device pointer for CAAM/QI backend
*/
struct caam_drv_ctx {
- u32 prehdr[2];
- u32 sh_desc[MAX_SDLEN];
+ struct {
+ u32 prehdr[2];
+ u32 sh_desc[MAX_SDLEN];
+ } __aligned(CRYPTO_DMA_ALIGN);
dma_addr_t context_a;
struct qman_fq *req_fq;
struct qman_fq *rsp_fq;
@@ -67,7 +71,7 @@ struct caam_drv_ctx {
int cpu;
enum optype op_type;
struct device *qidev;
-} ____cacheline_aligned;
+};

/**
* caam_drv_req - The request structure the driver application should fill while
@@ -88,7 +92,7 @@ struct caam_drv_req {
struct caam_drv_ctx *drv_ctx;
caam_qi_cbk cbk;
void *app_ctx;
-} ____cacheline_aligned;
+} __aligned(CRYPTO_DMA_ALIGN);

/**
* caam_drv_ctx_init - Initialise a CAAM/QI driver context
--
Email: Herbert Xu <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

2022-12-30 07:32:59

by Herbert Xu

Subject: [PATCH] crypto: talitos - Remove GFP_DMA and add DMA alignment padding

GFP_DMA does not guarantee that the returned memory is aligned
for DMA. It should be removed where it is superfluous.

However, kmalloc may start returning DMA-unaligned memory in future
so fix this by adding the alignment by hand.

Signed-off-by: Herbert Xu <[email protected]>

diff --git a/drivers/crypto/talitos.c b/drivers/crypto/talitos.c
index 71db6450b6aa..d62ec68e3183 100644
--- a/drivers/crypto/talitos.c
+++ b/drivers/crypto/talitos.c
@@ -1393,7 +1393,7 @@ static struct talitos_edesc *talitos_edesc_alloc(struct device *dev,
alloc_len += sizeof(struct talitos_desc);
alloc_len += ivsize;

- edesc = kmalloc(alloc_len, GFP_DMA | flags);
+ edesc = kmalloc(ALIGN(alloc_len, dma_get_cache_alignment()), flags);
if (!edesc)
return ERR_PTR(-ENOMEM);
if (ivsize) {
--
Email: Herbert Xu <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

2023-01-03 06:52:29

by Corentin Labbe

Subject: Re: [PATCH] crypto: sun8i-ss - Remove GFP_DMA and add DMA alignment padding

On Thu, Dec 29, 2022 at 04:58:21PM +0800, Herbert Xu wrote:
> GFP_DMA does not guarantee that the returned memory is aligned
> for DMA. In fact for sun8i-ss it is superfluous and can be removed.
>
> However, kmalloc may start returning DMA-unaligned memory in future
> so fix this by adding the alignment by hand.
>
> Signed-off-by: Herbert Xu <[email protected]>

Hello

Tested-by: Corentin Labbe <[email protected]>
Tested-on: sun8i-a83t-bananapi-m3
Acked-by: Corentin Labbe <[email protected]>

This means that a lot of the other crypto drivers (sun8i-ce, amlogic, etc.) I maintain need the same fix, right?

Regards

2023-01-03 07:31:48

by Herbert Xu

Subject: Re: [PATCH] crypto: sun8i-ss - Remove GFP_DMA and add DMA alignment padding

On Tue, Jan 03, 2023 at 07:51:24AM +0100, Corentin Labbe wrote:
> Tested-by: Corentin Labbe <[email protected]>
> Tested-on: sun8i-a83t-bananapi-m3
> Acked-by: Corentin Labbe <[email protected]>

Thanks,

> This means that a lot of the other crypto drivers (sun8i-ce, amlogic, etc.) I maintain need the same fix, right?

I checked the other drivers and none of them seem to be broken.
Only DMA from the device (DMA_FROM_DEVICE or DMA_BIDIRECTIONAL)
needs to be aligned appropriately.

However, please do remove any unnecessary uses of GFP_DMA in those
drivers.
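
The direction distinction, sketched with the streaming DMA API on a
non-coherent system:

	/* CPU -> device: the lines covering buf are cleaned (written
	 * back), so data sharing a cache line with buf is unaffected
	 */
	dma_addr_t a = dma_map_single(dev, buf, len, DMA_TO_DEVICE);

	/* device -> CPU: the lines covering buf are invalidated, so
	 * data sharing a line with buf can be lost unless buf is
	 * cache-line aligned and padded
	 */
	dma_addr_t b = dma_map_single(dev, buf, len, DMA_FROM_DEVICE);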

Cheers,
--
Email: Herbert Xu <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

2023-01-09 07:22:29

by Christophe Leroy

Subject: Re: [PATCH] crypto: talitos - Remove GFP_DMA and add DMA alignment padding



On 30/12/2022 at 08:31, Herbert Xu wrote:
> GFP_DMA does not guarantee that the returned memory is aligned
> for DMA. It should be removed where it is superfluous.

Doesn't GFP_DMA guarantee that the provided memory is addressable for
DMA? Or do we assume that all memory returned by kmalloc can be used
for DMA?

>
> However, kmalloc may start returning DMA-unaligned memory in future
> so fix this by adding the alignment by hand.

kmalloc() already returns memory that is not DMA-aligned, so why does
it become a problem now?
OK, that may be suboptimal, but is that a problem at all?

By the way, I'm not sure I understand the solution; what's the added
value of aligning the allocation length to the cache alignment?

>
> Signed-off-by: Herbert Xu <[email protected]>
>
> diff --git a/drivers/crypto/talitos.c b/drivers/crypto/talitos.c
> index 71db6450b6aa..d62ec68e3183 100644
> --- a/drivers/crypto/talitos.c
> +++ b/drivers/crypto/talitos.c
> @@ -1393,7 +1393,7 @@ static struct talitos_edesc *talitos_edesc_alloc(struct device *dev,
> alloc_len += sizeof(struct talitos_desc);
> alloc_len += ivsize;
>
> - edesc = kmalloc(alloc_len, GFP_DMA | flags);
> + edesc = kmalloc(ALIGN(alloc_len, dma_get_cache_alignment()), flags);
> if (!edesc)
> return ERR_PTR(-ENOMEM);
> if (ivsize) {

2023-01-09 07:39:09

by Herbert Xu

Subject: Re: [PATCH] crypto: talitos - Remove GFP_DMA and add DMA alignment padding

On Mon, Jan 09, 2023 at 07:18:27AM +0000, Christophe Leroy wrote:
>
> Doesn't GFP_DMA guarantee that the provided memory is addressable for
> DMA? Or do we assume that all memory returned by kmalloc can be used
> for DMA?

No, it does not. Please refer to the DMA API documentation.

> kmalloc() already returns memory that is not DMA-aligned, so why does
> it become a problem now?

As it stands kmalloc on arm64 returns DMA-aligned memory. This
will soon no longer be the case.
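
As for the value of padding the allocation length: kmalloc() serves a
request from the smallest size class that fits, and objects in the
cache-line-sized-and-larger classes stay naturally aligned to at least
a cache line (a property Catalin's series keeps). Rounding the length
up therefore gives the buffer exclusive use of its cache lines even
once the minimum alignment drops; a sketch:

	len = ALIGN(len, dma_get_cache_alignment());	/* e.g. 13 -> 128 */
	buf = kmalloc(len, GFP_KERNEL);	/* 128-byte object, 128-byte aligned */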

Cheers,
--
Email: Herbert Xu <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt