2020-01-01 22:28:22

by Christian Lamparter

[permalink] [raw]
Subject: [PATCH 1/2] crypto: crypto4xx - reduce memory fragmentation

With recent kernels (>5.2), the driver fails to probe, as the
allocation of the driver's scatter buffer fails with -ENOMEM.

This happens in crypto4xx_build_sdr(). Where the driver tries
to get 512KiB (=PPC4XX_SD_BUFFER_SIZE * PPC4XX_NUM_SD) of
continuous memory. This big chunk is by design, since the driver
uses this circumstance in the crypto4xx_copy_pkt_to_dst() to
its advantage:
"all scatter-buffers are all neatly organized in one big
continuous ringbuffer; So scatterwalk_map_and_copy() can be
instructed to copy a range of buffers in one go."

The PowerPC arch does not have support for DMA_CMA. Hence,
this patch reorganizes the order in which the memory
allocations are done. Since the driver itself is responsible
for some of the issues.

Signed-off-by: Christian Lamparter <[email protected]>
---
drivers/crypto/amcc/crypto4xx_core.c | 27 +++++++++++++--------------
1 file changed, 13 insertions(+), 14 deletions(-)

diff --git a/drivers/crypto/amcc/crypto4xx_core.c b/drivers/crypto/amcc/crypto4xx_core.c
index 7d6b695c4ab3..3ce5f0a24cbc 100644
--- a/drivers/crypto/amcc/crypto4xx_core.c
+++ b/drivers/crypto/amcc/crypto4xx_core.c
@@ -286,7 +286,8 @@ static u32 crypto4xx_build_gdr(struct crypto4xx_device *dev)

static inline void crypto4xx_destroy_gdr(struct crypto4xx_device *dev)
{
- dma_free_coherent(dev->core_dev->device,
+ if (dev->gdr)
+ dma_free_coherent(dev->core_dev->device,
sizeof(struct ce_gd) * PPC4XX_NUM_GD,
dev->gdr, dev->gdr_pa);
}
@@ -354,13 +355,6 @@ static u32 crypto4xx_build_sdr(struct crypto4xx_device *dev)
{
int i;

- /* alloc memory for scatter descriptor ring */
- dev->sdr = dma_alloc_coherent(dev->core_dev->device,
- sizeof(struct ce_sd) * PPC4XX_NUM_SD,
- &dev->sdr_pa, GFP_ATOMIC);
- if (!dev->sdr)
- return -ENOMEM;
-
dev->scatter_buffer_va =
dma_alloc_coherent(dev->core_dev->device,
PPC4XX_SD_BUFFER_SIZE * PPC4XX_NUM_SD,
@@ -368,6 +362,13 @@ static u32 crypto4xx_build_sdr(struct crypto4xx_device *dev)
if (!dev->scatter_buffer_va)
return -ENOMEM;

+ /* alloc memory for scatter descriptor ring */
+ dev->sdr = dma_alloc_coherent(dev->core_dev->device,
+ sizeof(struct ce_sd) * PPC4XX_NUM_SD,
+ &dev->sdr_pa, GFP_ATOMIC);
+ if (!dev->sdr)
+ return -ENOMEM;
+
for (i = 0; i < PPC4XX_NUM_SD; i++) {
dev->sdr[i].ptr = dev->scatter_buffer_pa +
PPC4XX_SD_BUFFER_SIZE * i;
@@ -1439,15 +1440,14 @@ static int crypto4xx_probe(struct platform_device *ofdev)
spin_lock_init(&core_dev->lock);
INIT_LIST_HEAD(&core_dev->dev->alg_list);
ratelimit_default_init(&core_dev->dev->aead_ratelimit);
+ rc = crypto4xx_build_sdr(core_dev->dev);
+ if (rc)
+ goto err_build_sdr;
rc = crypto4xx_build_pdr(core_dev->dev);
if (rc)
- goto err_build_pdr;
+ goto err_build_sdr;

rc = crypto4xx_build_gdr(core_dev->dev);
- if (rc)
- goto err_build_pdr;
-
- rc = crypto4xx_build_sdr(core_dev->dev);
if (rc)
goto err_build_sdr;

@@ -1493,7 +1493,6 @@ static int crypto4xx_probe(struct platform_device *ofdev)
err_build_sdr:
crypto4xx_destroy_sdr(core_dev->dev);
crypto4xx_destroy_gdr(core_dev->dev);
-err_build_pdr:
crypto4xx_destroy_pdr(core_dev->dev);
kfree(core_dev->dev);
err_alloc_dev:
--
2.25.0.rc0


2020-01-01 22:28:56

by Christian Lamparter

[permalink] [raw]
Subject: [PATCH 2/2] crypto: crypto4xx - use GFP_KERNEL for big allocations

The driver should use GFP_KERNEL for the bigger allocation
during the driver's crypto4xx_probe() and not GFP_ATOMIC in
my opinion.

Signed-off-by: Christian Lamparter <[email protected]>
---
drivers/crypto/amcc/crypto4xx_core.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/crypto/amcc/crypto4xx_core.c b/drivers/crypto/amcc/crypto4xx_core.c
index 3ce5f0a24cbc..981de43ea5e2 100644
--- a/drivers/crypto/amcc/crypto4xx_core.c
+++ b/drivers/crypto/amcc/crypto4xx_core.c
@@ -169,7 +169,7 @@ static u32 crypto4xx_build_pdr(struct crypto4xx_device *dev)
int i;
dev->pdr = dma_alloc_coherent(dev->core_dev->device,
sizeof(struct ce_pd) * PPC4XX_NUM_PD,
- &dev->pdr_pa, GFP_ATOMIC);
+ &dev->pdr_pa, GFP_KERNEL);
if (!dev->pdr)
return -ENOMEM;

@@ -185,13 +185,13 @@ static u32 crypto4xx_build_pdr(struct crypto4xx_device *dev)
dev->shadow_sa_pool = dma_alloc_coherent(dev->core_dev->device,
sizeof(union shadow_sa_buf) * PPC4XX_NUM_PD,
&dev->shadow_sa_pool_pa,
- GFP_ATOMIC);
+ GFP_KERNEL);
if (!dev->shadow_sa_pool)
return -ENOMEM;

dev->shadow_sr_pool = dma_alloc_coherent(dev->core_dev->device,
sizeof(struct sa_state_record) * PPC4XX_NUM_PD,
- &dev->shadow_sr_pool_pa, GFP_ATOMIC);
+ &dev->shadow_sr_pool_pa, GFP_KERNEL);
if (!dev->shadow_sr_pool)
return -ENOMEM;
for (i = 0; i < PPC4XX_NUM_PD; i++) {
@@ -277,7 +277,7 @@ static u32 crypto4xx_build_gdr(struct crypto4xx_device *dev)
{
dev->gdr = dma_alloc_coherent(dev->core_dev->device,
sizeof(struct ce_gd) * PPC4XX_NUM_GD,
- &dev->gdr_pa, GFP_ATOMIC);
+ &dev->gdr_pa, GFP_KERNEL);
if (!dev->gdr)
return -ENOMEM;

@@ -358,14 +358,14 @@ static u32 crypto4xx_build_sdr(struct crypto4xx_device *dev)
dev->scatter_buffer_va =
dma_alloc_coherent(dev->core_dev->device,
PPC4XX_SD_BUFFER_SIZE * PPC4XX_NUM_SD,
- &dev->scatter_buffer_pa, GFP_ATOMIC);
+ &dev->scatter_buffer_pa, GFP_KERNEL);
if (!dev->scatter_buffer_va)
return -ENOMEM;

/* alloc memory for scatter descriptor ring */
dev->sdr = dma_alloc_coherent(dev->core_dev->device,
sizeof(struct ce_sd) * PPC4XX_NUM_SD,
- &dev->sdr_pa, GFP_ATOMIC);
+ &dev->sdr_pa, GFP_KERNEL);
if (!dev->sdr)
return -ENOMEM;

--
2.25.0.rc0

2020-01-09 05:14:47

by Herbert Xu

[permalink] [raw]
Subject: Re: [PATCH 1/2] crypto: crypto4xx - reduce memory fragmentation

On Wed, Jan 01, 2020 at 11:27:01PM +0100, Christian Lamparter wrote:
> With recent kernels (>5.2), the driver fails to probe, as the
> allocation of the driver's scatter buffer fails with -ENOMEM.
>
> This happens in crypto4xx_build_sdr(). Where the driver tries
> to get 512KiB (=PPC4XX_SD_BUFFER_SIZE * PPC4XX_NUM_SD) of
> continuous memory. This big chunk is by design, since the driver
> uses this circumstance in the crypto4xx_copy_pkt_to_dst() to
> its advantage:
> "all scatter-buffers are all neatly organized in one big
> continuous ringbuffer; So scatterwalk_map_and_copy() can be
> instructed to copy a range of buffers in one go."
>
> The PowerPC arch does not have support for DMA_CMA. Hence,
> this patch reorganizes the order in which the memory
> allocations are done. Since the driver itself is responsible
> for some of the issues.
>
> Signed-off-by: Christian Lamparter <[email protected]>
> ---
> drivers/crypto/amcc/crypto4xx_core.c | 27 +++++++++++++--------------
> 1 file changed, 13 insertions(+), 14 deletions(-)

All applied. Thanks.
--
Email: Herbert Xu <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt