2017-07-21 15:42:54

by Ard Biesheuvel

Subject: [PATCH v2 0/3] crypto: scompress - defer allocation of percpu scratch buffers

This is a followup to 'crypto: scompress - eliminate percpu scratch buffers',
which attempted to replace the scompress per-CPU buffer entirely, but as
Herbert pointed out, this is not going to fly in the targeted use cases.

Instead, move the alloc/free of the buffers into the tfm init/exit hooks,
so that we only have the per-CPU buffers allocated if there are any
acomp ciphers of the right kind (i.e., ones that encapsulate a synchronous
implementation) in use (#3).
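As a userspace analogy (hypothetical names; the kernel code serializes the
tfm init/exit hooks with scomp_lock, which plays the role of the mutex-held
section here), the deferred-allocation scheme boils down to a lazily
populated shared buffer that lives only while at least one user exists:

```c
#include <stdlib.h>

#define SCRATCH_SIZE (128 * 1024)

/* Shared scratch buffer: stays NULL until the first user appears. */
static void *scratch;
static int scratch_users;

/* Analogue of the tfm init hook; the kernel code holds scomp_lock
 * around this, so the lazy NULL check cannot race. */
int scratch_get(void)
{
    if (!scratch) {
        scratch = malloc(SCRATCH_SIZE);
        if (!scratch)
            return -1;
    }
    scratch_users++;
    return 0;
}

/* Analogue of the tfm exit hook, also under scomp_lock: release the
 * buffer once the last user goes away. */
void scratch_put(void)
{
    if (--scratch_users == 0) {
        free(scratch);
        scratch = NULL;
    }
}
```

This is only a sketch of the idea; the real code allocates one src and one
dst buffer per CPU rather than a single buffer.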

Patches #1 and #2 are fixes for issues I spotted when working on this code.

Ard Biesheuvel (3):
crypto: scompress - don't sleep with preemption disabled
crypto: scompress - free partially allocated scratch buffers on
failure
crypto: scompress - defer allocation of scratch buffer to first use

crypto/scompress.c | 55 ++++++++------------
1 file changed, 22 insertions(+), 33 deletions(-)

--
2.11.0


2017-07-21 15:43:01

by Ard Biesheuvel

Subject: [PATCH v2 2/3] crypto: scompress - free partially allocated scratch buffers on failure

When allocating the per-CPU scratch buffers, we allocate the source
and destination buffers separately, but bail immediately if the second
allocation fails, without freeing the first one. Fix that.
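The bug class is the classic partial-cleanup-on-failure leak. A minimal
userspace sketch of the pattern the patch restores (hypothetical names,
plain malloc standing in for the per-CPU vmalloc'ed scratches):

```c
#include <stdlib.h>

struct scratch_pair {
    void *src;
    void *dst;
};

/* Allocate two related buffers; if the second allocation fails, the
 * first must be released before returning, or it leaks. */
int scratch_pair_alloc(struct scratch_pair *p, size_t src_len, size_t dst_len)
{
    p->src = malloc(src_len);
    if (!p->src)
        return -1;
    p->dst = malloc(dst_len);
    if (!p->dst) {
        free(p->src);   /* the fix: undo the partial allocation */
        p->src = NULL;
        return -1;
    }
    return 0;
}

void scratch_pair_free(struct scratch_pair *p)
{
    free(p->src);
    free(p->dst);
    p->src = p->dst = NULL;
}
```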

Signed-off-by: Ard Biesheuvel <[email protected]>
---
crypto/scompress.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/crypto/scompress.c b/crypto/scompress.c
index 0b40d991d65f..2c07648305ad 100644
--- a/crypto/scompress.c
+++ b/crypto/scompress.c
@@ -125,8 +125,11 @@ static int crypto_scomp_alloc_all_scratches(void)
if (!scomp_src_scratches)
return -ENOMEM;
scomp_dst_scratches = crypto_scomp_alloc_scratches();
- if (!scomp_dst_scratches)
+ if (!scomp_dst_scratches) {
+ crypto_scomp_free_scratches(scomp_src_scratches);
+ scomp_src_scratches = NULL;
return -ENOMEM;
+ }
}
return 0;
}
--
2.11.0

2017-07-21 15:42:59

by Ard Biesheuvel

Subject: [PATCH v2 1/3] crypto: scompress - don't sleep with preemption disabled

Due to the use of per-CPU buffers, scomp_acomp_comp_decomp() executes
with preemption disabled, and so whether the CRYPTO_TFM_REQ_MAY_SLEEP
flag is set is irrelevant, since we cannot sleep anyway. So disregard
the flag, and use GFP_ATOMIC unconditionally.

Cc: <[email protected]> # v4.10+
Signed-off-by: Ard Biesheuvel <[email protected]>
---
crypto/scompress.c | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/crypto/scompress.c b/crypto/scompress.c
index ae1d3cf209e4..0b40d991d65f 100644
--- a/crypto/scompress.c
+++ b/crypto/scompress.c
@@ -211,9 +211,7 @@ static int scomp_acomp_comp_decomp(struct acomp_req *req, int dir)
scratch_dst, &req->dlen, *ctx);
if (!ret) {
if (!req->dst) {
- req->dst = crypto_scomp_sg_alloc(req->dlen,
- req->base.flags & CRYPTO_TFM_REQ_MAY_SLEEP ?
- GFP_KERNEL : GFP_ATOMIC);
+ req->dst = crypto_scomp_sg_alloc(req->dlen, GFP_ATOMIC);
if (!req->dst)
goto out;
}
--
2.11.0

2017-07-21 15:43:03

by Ard Biesheuvel

Subject: [PATCH v2 3/3] crypto: scompress - defer allocation of scratch buffer to first use

The scompress code allocates 2 x 128 KB of scratch buffers for each CPU,
so that clients of the async API can use synchronous implementations
even from atomic context. However, on systems such as Cavium Thunderx
(which has 96 cores), this adds up to a non-negligible 24 MB. Also,
32-bit systems may prefer to use their precious vmalloc space for other
things, especially since there don't appear to be any clients for the
async compression API yet.

So let's defer allocation of the scratch buffers until the first time
we allocate an acompress cipher based on an scompress implementation.
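For reference, the 24 MB figure follows directly from the buffer geometry
(two 128 KiB scratches per CPU, 96 CPUs on the system cited above):

```c
/* Total scratch footprint: two buffers (src + dst) of 128 KiB each,
 * replicated per CPU. 2 * 128 KiB * 96 CPUs = 24 MiB. */
static long scratch_total_bytes(int ncpus)
{
    const long buf_size = 128 * 1024;   /* one scratch buffer */
    return 2 * buf_size * ncpus;        /* src + dst, per CPU */
}
```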

Signed-off-by: Ard Biesheuvel <[email protected]>
---
crypto/scompress.c | 46 ++++++++------------
1 file changed, 17 insertions(+), 29 deletions(-)

diff --git a/crypto/scompress.c b/crypto/scompress.c
index 2c07648305ad..2075e2c4e7df 100644
--- a/crypto/scompress.c
+++ b/crypto/scompress.c
@@ -65,11 +65,6 @@ static void crypto_scomp_show(struct seq_file *m, struct crypto_alg *alg)
seq_puts(m, "type : scomp\n");
}

-static int crypto_scomp_init_tfm(struct crypto_tfm *tfm)
-{
- return 0;
-}
-
static void crypto_scomp_free_scratches(void * __percpu *scratches)
{
int i;
@@ -134,6 +129,17 @@ static int crypto_scomp_alloc_all_scratches(void)
return 0;
}

+static int crypto_scomp_init_tfm(struct crypto_tfm *tfm)
+{
+ int ret;
+
+ mutex_lock(&scomp_lock);
+ ret = crypto_scomp_alloc_all_scratches();
+ mutex_unlock(&scomp_lock);
+
+ return ret;
+}
+
static void crypto_scomp_sg_free(struct scatterlist *sgl)
{
int i, n;
@@ -241,6 +247,10 @@ static void crypto_exit_scomp_ops_async(struct crypto_tfm *tfm)
struct crypto_scomp **ctx = crypto_tfm_ctx(tfm);

crypto_free_scomp(*ctx);
+
+ mutex_lock(&scomp_lock);
+ crypto_scomp_free_all_scratches();
+ mutex_unlock(&scomp_lock);
}

int crypto_init_scomp_ops_async(struct crypto_tfm *tfm)
@@ -317,40 +327,18 @@ static const struct crypto_type crypto_scomp_type = {
int crypto_register_scomp(struct scomp_alg *alg)
{
struct crypto_alg *base = &alg->base;
- int ret = -ENOMEM;
-
- mutex_lock(&scomp_lock);
- if (crypto_scomp_alloc_all_scratches())
- goto error;

base->cra_type = &crypto_scomp_type;
base->cra_flags &= ~CRYPTO_ALG_TYPE_MASK;
base->cra_flags |= CRYPTO_ALG_TYPE_SCOMPRESS;

- ret = crypto_register_alg(base);
- if (ret)
- goto error;
-
- mutex_unlock(&scomp_lock);
- return ret;
-
-error:
- crypto_scomp_free_all_scratches();
- mutex_unlock(&scomp_lock);
- return ret;
+ return crypto_register_alg(base);
}
EXPORT_SYMBOL_GPL(crypto_register_scomp);

int crypto_unregister_scomp(struct scomp_alg *alg)
{
- int ret;
-
- mutex_lock(&scomp_lock);
- ret = crypto_unregister_alg(&alg->base);
- crypto_scomp_free_all_scratches();
- mutex_unlock(&scomp_lock);
-
- return ret;
+ return crypto_unregister_alg(&alg->base);
}
EXPORT_SYMBOL_GPL(crypto_unregister_scomp);

--
2.11.0

2017-07-25 23:37:12

by Cabiddu, Giovanni

Subject: Re: [PATCH v2 3/3] crypto: scompress - defer allocation of scratch buffer to first use

Hi Ard,

On Fri, Jul 21, 2017 at 04:42:38PM +0100, Ard Biesheuvel wrote:
> +static int crypto_scomp_init_tfm(struct crypto_tfm *tfm)
> +{
> + int ret;
> +
> + mutex_lock(&scomp_lock);
> + ret = crypto_scomp_alloc_all_scratches();
> + mutex_unlock(&scomp_lock);
> +
> + return ret;
> +}
If you allocate the scratch buffers at init_tfm, don't you end
up with a situation where if you allocate two tfms of the same algo
then you get twice the number of scratches?
If that is the case, we should implement a reference count
mechanism.
Am I missing something?

Regards,

--
Giovanni

2017-07-26 00:07:40

by Ard Biesheuvel

Subject: Re: [PATCH v2 3/3] crypto: scompress - defer allocation of scratch buffer to first use


> On 26 Jul 2017, at 00:36, Giovanni Cabiddu <[email protected]> wrote:
>
> Hi Ard,
>
>> On Fri, Jul 21, 2017 at 04:42:38PM +0100, Ard Biesheuvel wrote:
>> +static int crypto_scomp_init_tfm(struct crypto_tfm *tfm)
>> +{
>> + int ret;
>> +
>> + mutex_lock(&scomp_lock);
>> + ret = crypto_scomp_alloc_all_scratches();
>> + mutex_unlock(&scomp_lock);
>> +
>> + return ret;
>> +}
> If you allocate the scratch buffers at init_tfm, don't you end
> up with a situation where if you allocate two tfms of the same algo
> then you get twice the number of scratches?
> If that is the case, we should implement a reference count
> mechanism.
> Am I missing something?
>

Isn't the mutex supposed to take care of that?

2017-07-28 09:13:58

by Cabiddu, Giovanni

Subject: Re: [PATCH v2 3/3] crypto: scompress - defer allocation of scratch buffer to first use

On Wed, Jul 26, 2017 at 01:07:34AM +0100, Ard Biesheuvel wrote:
>
> > On 26 Jul 2017, at 00:36, Giovanni Cabiddu <[email protected]> wrote:
> >
> > Hi Ard,
> >
> >> On Fri, Jul 21, 2017 at 04:42:38PM +0100, Ard Biesheuvel wrote:
> >> +static int crypto_scomp_init_tfm(struct crypto_tfm *tfm)
> >> +{
> >> + int ret;
> >> +
> >> + mutex_lock(&scomp_lock);
> >> + ret = crypto_scomp_alloc_all_scratches();
> >> + mutex_unlock(&scomp_lock);
> >> +
> >> + return ret;
> >> +}
> > If you allocate the scratch buffers at init_tfm, don't you end
> > up with a situation where if you allocate two tfms of the same algo
> > then you get twice the number of scratches?
> > If that is the case, we should implement a reference count
> > mechanism.
> > Am I missing something?
> >
>
> Isn't the mutex supposed to take care of that?
Ignore my previous question. There is already a reference count
mechanism that prevents that from happening.
Your change sounds correct to me.

Regards,

--
Giovanni

2017-08-03 06:24:48

by Herbert Xu

Subject: Re: [PATCH v2 0/3] crypto: scompress - defer allocation of percpu scratch buffers

On Fri, Jul 21, 2017 at 04:42:35PM +0100, Ard Biesheuvel wrote:
> This is a followup to 'crypto: scompress - eliminate percpu scratch buffers',
> which attempted to replace the scompress per-CPU buffer entirely, but as
> Herbert pointed out, this is not going to fly in the targeted use cases.
>
> Instead, move the alloc/free of the buffers into the tfm init/exit hooks,
> so that we only have the per-CPU buffers allocated if there are any
> acomp ciphers of the right kind (i.e., ones that encapsulate a synchronous
> implementation) in use (#3)
>
> Patches #1 and #2 are fixes for issues I spotted when working on this code.
>
> Ard Biesheuvel (3):
> crypto: scompress - don't sleep with preemption disabled
> crypto: scompress - free partially allocated scratch buffers on
> failure
> crypto: scompress - defer allocation of scratch buffer to first use

All patches applied. Thanks.
--
Email: Herbert Xu <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt