2021-05-12 21:08:40

by Ard Biesheuvel

Subject: [PATCH v3 0/7] running kernel mode SIMD with softirqs disabled

This is a follow-up to [0], but given that the arm64 architectural
pieces have been merged, the only remaining changes are crypto
specific. Therefore, the audience has been reduced to those people who
are likely to care about these specifics.

Patch #1 addresses an issue in the skcipher walker, which does not
handle zero-sized AEAD inputs entirely consistently; this is uncovered
by the change in patch #7.

Patches #2 and #3 add some sanity checks to the public AEAD and skcipher
APIs to limit their availability to task or softirq context (which is
the only way in which they are currently being used). Adding this
restriction permits the arm64 crypto code to get rid of all scalar
fallbacks: on this architecture, softirqs are no longer served while
the SIMD unit is being used in kernel mode, so the scalar fallbacks are
never needed. They are removed in the remaining four patches.

[0] https://lore.kernel.org/linux-arm-kernel/[email protected]/
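(As a minimal illustration, not taken from these patches: with the
restriction in place, a synchronous arm64 SIMD routine can enter
kernel-mode NEON unconditionally instead of checking
crypto_simd_usable() and carrying a scalar fallback. Everything below
other than kernel_neon_begin()/kernel_neon_end() is a hypothetical
name.)

#include <linux/types.h>
#include <asm/neon.h>

static void toy_simd_transform(u8 dst[], u8 const src[], int blocks)
{
	/* task or softirq context only, so the NEON unit is always usable */
	kernel_neon_begin();
	toy_neon_core(dst, src, blocks);	/* hypothetical NEON helper */
	kernel_neon_end();
}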

Ard Biesheuvel (7):
crypto: handle zero sized AEAD inputs correctly
crypto: aead - disallow en/decrypt for non-task or non-softirq context
crypto: skcipher - disallow en/decrypt for non-task or non-softirq
context
crypto: arm64/gcm-aes-ce - remove non-SIMD fallback path
crypto: arm64/aes-neonbs - stop using SIMD helper for skciphers
crypto: arm64/aes-ce - stop using SIMD helper for skciphers
crypto: arm64/aes-ccm - remove non-SIMD fallback path

arch/arm64/crypto/Kconfig | 6 -
arch/arm64/crypto/aes-ce-ccm-core.S | 1 +
arch/arm64/crypto/aes-ce-ccm-glue.c | 183 +++++------------
arch/arm64/crypto/aes-glue.c | 102 ++--------
arch/arm64/crypto/aes-neonbs-glue.c | 122 +-----------
arch/arm64/crypto/ghash-ce-glue.c | 209 +++++---------------
crypto/aead.c | 10 +
crypto/skcipher.c | 12 ++
8 files changed, 148 insertions(+), 497 deletions(-)

--
2.20.1


2021-05-12 21:08:53

by Ard Biesheuvel

Subject: [PATCH v3 2/7] crypto: aead - disallow en/decrypt for non-task or non-softirq context

In order to ensure that kernel mode SIMD routines will not need a scalar
fallback if they run with softirqs disabled, disallow any use of the
AEAD encrypt and decrypt routines from outside of task or softirq context.

Signed-off-by: Ard Biesheuvel <[email protected]>
---
crypto/aead.c | 10 ++++++++++
1 file changed, 10 insertions(+)

diff --git a/crypto/aead.c b/crypto/aead.c
index 16991095270d..b5304b3d3314 100644
--- a/crypto/aead.c
+++ b/crypto/aead.c
@@ -87,6 +87,11 @@ int crypto_aead_encrypt(struct aead_request *req)
unsigned int cryptlen = req->cryptlen;
int ret;

+ if (!(alg->cra_flags & CRYPTO_ALG_ASYNC) &&
+ WARN_ONCE(!in_task() && !in_serving_softirq(),
+ "synchronous call from invalid context\n"))
+ return -EBUSY;
+
crypto_stats_get(alg);
if (crypto_aead_get_flags(aead) & CRYPTO_TFM_NEED_KEY)
ret = -ENOKEY;
@@ -104,6 +109,11 @@ int crypto_aead_decrypt(struct aead_request *req)
unsigned int cryptlen = req->cryptlen;
int ret;

+ if (!(alg->cra_flags & CRYPTO_ALG_ASYNC) &&
+ WARN_ONCE(!in_task() && !in_serving_softirq(),
+ "synchronous call from invalid context\n"))
+ return -EBUSY;
+
crypto_stats_get(alg);
if (crypto_aead_get_flags(aead) & CRYPTO_TFM_NEED_KEY)
ret = -ENOKEY;
--
2.20.1

2021-05-12 21:09:04

by Ard Biesheuvel

Subject: [PATCH v3 5/7] crypto: arm64/aes-neonbs - stop using SIMD helper for skciphers

Calls into the skcipher API can only occur from contexts where the SIMD
unit is available, so there is no need for the SIMD helper.

Signed-off-by: Ard Biesheuvel <[email protected]>
---
arch/arm64/crypto/Kconfig | 2 -
arch/arm64/crypto/aes-neonbs-glue.c | 122 ++------------------
2 files changed, 9 insertions(+), 115 deletions(-)

diff --git a/arch/arm64/crypto/Kconfig b/arch/arm64/crypto/Kconfig
index b8eb0453123d..ed1e8cadeb3a 100644
--- a/arch/arm64/crypto/Kconfig
+++ b/arch/arm64/crypto/Kconfig
@@ -122,8 +122,6 @@ config CRYPTO_AES_ARM64_BS
depends on KERNEL_MODE_NEON
select CRYPTO_SKCIPHER
select CRYPTO_AES_ARM64_NEON_BLK
- select CRYPTO_AES_ARM64
select CRYPTO_LIB_AES
- select CRYPTO_SIMD

endif
diff --git a/arch/arm64/crypto/aes-neonbs-glue.c b/arch/arm64/crypto/aes-neonbs-glue.c
index fb507d569922..8df6ad8cb09d 100644
--- a/arch/arm64/crypto/aes-neonbs-glue.c
+++ b/arch/arm64/crypto/aes-neonbs-glue.c
@@ -63,11 +63,6 @@ struct aesbs_cbc_ctx {
u32 enc[AES_MAX_KEYLENGTH_U32];
};

-struct aesbs_ctr_ctx {
- struct aesbs_ctx key; /* must be first member */
- struct crypto_aes_ctx fallback;
-};
-
struct aesbs_xts_ctx {
struct aesbs_ctx key;
u32 twkey[AES_MAX_KEYLENGTH_U32];
@@ -207,25 +202,6 @@ static int cbc_decrypt(struct skcipher_request *req)
return err;
}

-static int aesbs_ctr_setkey_sync(struct crypto_skcipher *tfm, const u8 *in_key,
- unsigned int key_len)
-{
- struct aesbs_ctr_ctx *ctx = crypto_skcipher_ctx(tfm);
- int err;
-
- err = aes_expandkey(&ctx->fallback, in_key, key_len);
- if (err)
- return err;
-
- ctx->key.rounds = 6 + key_len / 4;
-
- kernel_neon_begin();
- aesbs_convert_key(ctx->key.rk, ctx->fallback.key_enc, ctx->key.rounds);
- kernel_neon_end();
-
- return 0;
-}
-
static int ctr_encrypt(struct skcipher_request *req)
{
struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
@@ -292,29 +268,6 @@ static int aesbs_xts_setkey(struct crypto_skcipher *tfm, const u8 *in_key,
return aesbs_setkey(tfm, in_key, key_len);
}

-static void ctr_encrypt_one(struct crypto_skcipher *tfm, const u8 *src, u8 *dst)
-{
- struct aesbs_ctr_ctx *ctx = crypto_skcipher_ctx(tfm);
- unsigned long flags;
-
- /*
- * Temporarily disable interrupts to avoid races where
- * cachelines are evicted when the CPU is interrupted
- * to do something else.
- */
- local_irq_save(flags);
- aes_encrypt(&ctx->fallback, dst, src);
- local_irq_restore(flags);
-}
-
-static int ctr_encrypt_sync(struct skcipher_request *req)
-{
- if (!crypto_simd_usable())
- return crypto_ctr_encrypt_walk(req, ctr_encrypt_one);
-
- return ctr_encrypt(req);
-}
-
static int __xts_crypt(struct skcipher_request *req, bool encrypt,
void (*fn)(u8 out[], u8 const in[], u8 const rk[],
int rounds, int blocks, u8 iv[]))
@@ -431,13 +384,12 @@ static int xts_decrypt(struct skcipher_request *req)
}

static struct skcipher_alg aes_algs[] = { {
- .base.cra_name = "__ecb(aes)",
- .base.cra_driver_name = "__ecb-aes-neonbs",
+ .base.cra_name = "ecb(aes)",
+ .base.cra_driver_name = "ecb-aes-neonbs",
.base.cra_priority = 250,
.base.cra_blocksize = AES_BLOCK_SIZE,
.base.cra_ctxsize = sizeof(struct aesbs_ctx),
.base.cra_module = THIS_MODULE,
- .base.cra_flags = CRYPTO_ALG_INTERNAL,

.min_keysize = AES_MIN_KEY_SIZE,
.max_keysize = AES_MAX_KEY_SIZE,
@@ -446,13 +398,12 @@ static struct skcipher_alg aes_algs[] = { {
.encrypt = ecb_encrypt,
.decrypt = ecb_decrypt,
}, {
- .base.cra_name = "__cbc(aes)",
- .base.cra_driver_name = "__cbc-aes-neonbs",
+ .base.cra_name = "cbc(aes)",
+ .base.cra_driver_name = "cbc-aes-neonbs",
.base.cra_priority = 250,
.base.cra_blocksize = AES_BLOCK_SIZE,
.base.cra_ctxsize = sizeof(struct aesbs_cbc_ctx),
.base.cra_module = THIS_MODULE,
- .base.cra_flags = CRYPTO_ALG_INTERNAL,

.min_keysize = AES_MIN_KEY_SIZE,
.max_keysize = AES_MAX_KEY_SIZE,
@@ -462,13 +413,12 @@ static struct skcipher_alg aes_algs[] = { {
.encrypt = cbc_encrypt,
.decrypt = cbc_decrypt,
}, {
- .base.cra_name = "__ctr(aes)",
- .base.cra_driver_name = "__ctr-aes-neonbs",
+ .base.cra_name = "ctr(aes)",
+ .base.cra_driver_name = "ctr-aes-neonbs",
.base.cra_priority = 250,
.base.cra_blocksize = 1,
.base.cra_ctxsize = sizeof(struct aesbs_ctx),
.base.cra_module = THIS_MODULE,
- .base.cra_flags = CRYPTO_ALG_INTERNAL,

.min_keysize = AES_MIN_KEY_SIZE,
.max_keysize = AES_MAX_KEY_SIZE,
@@ -479,29 +429,12 @@ static struct skcipher_alg aes_algs[] = { {
.encrypt = ctr_encrypt,
.decrypt = ctr_encrypt,
}, {
- .base.cra_name = "ctr(aes)",
- .base.cra_driver_name = "ctr-aes-neonbs",
- .base.cra_priority = 250 - 1,
- .base.cra_blocksize = 1,
- .base.cra_ctxsize = sizeof(struct aesbs_ctr_ctx),
- .base.cra_module = THIS_MODULE,
-
- .min_keysize = AES_MIN_KEY_SIZE,
- .max_keysize = AES_MAX_KEY_SIZE,
- .chunksize = AES_BLOCK_SIZE,
- .walksize = 8 * AES_BLOCK_SIZE,
- .ivsize = AES_BLOCK_SIZE,
- .setkey = aesbs_ctr_setkey_sync,
- .encrypt = ctr_encrypt_sync,
- .decrypt = ctr_encrypt_sync,
-}, {
- .base.cra_name = "__xts(aes)",
- .base.cra_driver_name = "__xts-aes-neonbs",
+ .base.cra_name = "xts(aes)",
+ .base.cra_driver_name = "xts-aes-neonbs",
.base.cra_priority = 250,
.base.cra_blocksize = AES_BLOCK_SIZE,
.base.cra_ctxsize = sizeof(struct aesbs_xts_ctx),
.base.cra_module = THIS_MODULE,
- .base.cra_flags = CRYPTO_ALG_INTERNAL,

.min_keysize = 2 * AES_MIN_KEY_SIZE,
.max_keysize = 2 * AES_MAX_KEY_SIZE,
@@ -512,54 +445,17 @@ static struct skcipher_alg aes_algs[] = { {
.decrypt = xts_decrypt,
} };

-static struct simd_skcipher_alg *aes_simd_algs[ARRAY_SIZE(aes_algs)];
-
static void aes_exit(void)
{
- int i;
-
- for (i = 0; i < ARRAY_SIZE(aes_simd_algs); i++)
- if (aes_simd_algs[i])
- simd_skcipher_free(aes_simd_algs[i]);
-
crypto_unregister_skciphers(aes_algs, ARRAY_SIZE(aes_algs));
}

static int __init aes_init(void)
{
- struct simd_skcipher_alg *simd;
- const char *basename;
- const char *algname;
- const char *drvname;
- int err;
- int i;
-
if (!cpu_have_named_feature(ASIMD))
return -ENODEV;

- err = crypto_register_skciphers(aes_algs, ARRAY_SIZE(aes_algs));
- if (err)
- return err;
-
- for (i = 0; i < ARRAY_SIZE(aes_algs); i++) {
- if (!(aes_algs[i].base.cra_flags & CRYPTO_ALG_INTERNAL))
- continue;
-
- algname = aes_algs[i].base.cra_name + 2;
- drvname = aes_algs[i].base.cra_driver_name + 2;
- basename = aes_algs[i].base.cra_driver_name;
- simd = simd_skcipher_create_compat(algname, drvname, basename);
- err = PTR_ERR(simd);
- if (IS_ERR(simd))
- goto unregister_simds;
-
- aes_simd_algs[i] = simd;
- }
- return 0;
-
-unregister_simds:
- aes_exit();
- return err;
+ return crypto_register_skciphers(aes_algs, ARRAY_SIZE(aes_algs));
}

module_init(aes_init);
--
2.20.1

2021-05-12 21:09:06

by Eric Biggers

Subject: Re: [PATCH v3 2/7] crypto: aead - disallow en/decrypt for non-task or non-softirq context

On Wed, May 12, 2021 at 08:44:34PM +0200, Ard Biesheuvel wrote:
> In order to ensure that kernel mode SIMD routines will not need a scalar
> fallback if they run with softirqs disabled, disallow any use of the
> AEAD encrypt and decrypt routines from outside of task or softirq context.
>
> Signed-off-by: Ard Biesheuvel <[email protected]>
> ---
> crypto/aead.c | 10 ++++++++++
> 1 file changed, 10 insertions(+)
>
> diff --git a/crypto/aead.c b/crypto/aead.c
> index 16991095270d..b5304b3d3314 100644
> --- a/crypto/aead.c
> +++ b/crypto/aead.c
> @@ -87,6 +87,11 @@ int crypto_aead_encrypt(struct aead_request *req)
> unsigned int cryptlen = req->cryptlen;
> int ret;
>
> + if (!(alg->cra_flags & CRYPTO_ALG_ASYNC) &&
> + WARN_ONCE(!in_task() && !in_serving_softirq(),
> + "synchronous call from invalid context\n"))
> + return -EBUSY;
> +
> crypto_stats_get(alg);
> if (crypto_aead_get_flags(aead) & CRYPTO_TFM_NEED_KEY)
> ret = -ENOKEY;
> @@ -104,6 +109,11 @@ int crypto_aead_decrypt(struct aead_request *req)
> unsigned int cryptlen = req->cryptlen;
> int ret;
>
> + if (!(alg->cra_flags & CRYPTO_ALG_ASYNC) &&
> + WARN_ONCE(!in_task() && !in_serving_softirq(),
> + "synchronous call from invalid context\n"))
> + return -EBUSY;
> +
> crypto_stats_get(alg);
> if (crypto_aead_get_flags(aead) & CRYPTO_TFM_NEED_KEY)
> ret = -ENOKEY;

This probably should go after crypto_stats_get() so that the error gets counted
in the stats (if stats are enabled) -- analogous to how the ENOKEY error is
counted.

Likewise for the skcipher patch.

- Eric
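(A rough sketch of what this suggestion could look like for
crypto_aead_encrypt(), not the patch that was eventually applied: the
context check moves below crypto_stats_get() so that a failure is
reported via crypto_stats_aead_encrypt(), just like the ENOKEY case.)

int crypto_aead_encrypt(struct aead_request *req)
{
	struct crypto_aead *aead = crypto_aead_reqtfm(req);
	struct crypto_alg *alg = aead->base.__crt_alg;
	unsigned int cryptlen = req->cryptlen;
	int ret;

	crypto_stats_get(alg);
	if (!(alg->cra_flags & CRYPTO_ALG_ASYNC) &&
	    WARN_ONCE(!in_task() && !in_serving_softirq(),
		      "synchronous call from invalid context\n"))
		ret = -EBUSY;		/* now counted in the stats below */
	else if (crypto_aead_get_flags(aead) & CRYPTO_TFM_NEED_KEY)
		ret = -ENOKEY;
	else
		ret = crypto_aead_alg(aead)->encrypt(req);
	crypto_stats_aead_encrypt(cryptlen, alg, ret);
	return ret;
}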

2021-05-12 21:10:13

by Ard Biesheuvel

Subject: [PATCH v3 7/7] crypto: arm64/aes-ccm - remove non-SIMD fallback path

AES/CCM on arm64 is implemented as a synchronous AEAD, and so it is
guaranteed by the API that it is only invoked in task or softirq
context. Since softirqs are now only handled when the SIMD unit is not
being used in the task context that was interrupted to service the
softirq, we no longer need a fallback path. Let's remove it.

Signed-off-by: Ard Biesheuvel <[email protected]>
---
arch/arm64/crypto/aes-ce-ccm-core.S | 1 +
arch/arm64/crypto/aes-ce-ccm-glue.c | 183 ++++++--------------
2 files changed, 53 insertions(+), 131 deletions(-)

diff --git a/arch/arm64/crypto/aes-ce-ccm-core.S b/arch/arm64/crypto/aes-ce-ccm-core.S
index 99a028e298ed..8adff299fcd3 100644
--- a/arch/arm64/crypto/aes-ce-ccm-core.S
+++ b/arch/arm64/crypto/aes-ce-ccm-core.S
@@ -124,6 +124,7 @@ SYM_FUNC_START(ce_aes_ccm_final)
SYM_FUNC_END(ce_aes_ccm_final)

.macro aes_ccm_do_crypt,enc
+ cbz x2, 5f
ldr x8, [x6, #8] /* load lower ctr */
ld1 {v0.16b}, [x5] /* load mac */
CPU_LE( rev x8, x8 ) /* keep swabbed ctr in reg */
diff --git a/arch/arm64/crypto/aes-ce-ccm-glue.c b/arch/arm64/crypto/aes-ce-ccm-glue.c
index f6d19b0dc893..a36df98f6fae 100644
--- a/arch/arm64/crypto/aes-ce-ccm-glue.c
+++ b/arch/arm64/crypto/aes-ce-ccm-glue.c
@@ -99,36 +99,8 @@ static int ccm_init_mac(struct aead_request *req, u8 maciv[], u32 msglen)
static void ccm_update_mac(struct crypto_aes_ctx *key, u8 mac[], u8 const in[],
u32 abytes, u32 *macp)
{
- if (crypto_simd_usable()) {
- kernel_neon_begin();
- ce_aes_ccm_auth_data(mac, in, abytes, macp, key->key_enc,
- num_rounds(key));
- kernel_neon_end();
- } else {
- if (*macp > 0 && *macp < AES_BLOCK_SIZE) {
- int added = min(abytes, AES_BLOCK_SIZE - *macp);
-
- crypto_xor(&mac[*macp], in, added);
-
- *macp += added;
- in += added;
- abytes -= added;
- }
-
- while (abytes >= AES_BLOCK_SIZE) {
- aes_encrypt(key, mac, mac);
- crypto_xor(mac, in, AES_BLOCK_SIZE);
-
- in += AES_BLOCK_SIZE;
- abytes -= AES_BLOCK_SIZE;
- }
-
- if (abytes > 0) {
- aes_encrypt(key, mac, mac);
- crypto_xor(mac, in, abytes);
- *macp = abytes;
- }
- }
+ ce_aes_ccm_auth_data(mac, in, abytes, macp, key->key_enc,
+ num_rounds(key));
}

static void ccm_calculate_auth_mac(struct aead_request *req, u8 mac[])
@@ -171,54 +143,6 @@ static void ccm_calculate_auth_mac(struct aead_request *req, u8 mac[])
} while (len);
}

-static int ccm_crypt_fallback(struct skcipher_walk *walk, u8 mac[], u8 iv0[],
- struct crypto_aes_ctx *ctx, bool enc)
-{
- u8 buf[AES_BLOCK_SIZE];
- int err = 0;
-
- while (walk->nbytes) {
- int blocks = walk->nbytes / AES_BLOCK_SIZE;
- u32 tail = walk->nbytes % AES_BLOCK_SIZE;
- u8 *dst = walk->dst.virt.addr;
- u8 *src = walk->src.virt.addr;
- u32 nbytes = walk->nbytes;
-
- if (nbytes == walk->total && tail > 0) {
- blocks++;
- tail = 0;
- }
-
- do {
- u32 bsize = AES_BLOCK_SIZE;
-
- if (nbytes < AES_BLOCK_SIZE)
- bsize = nbytes;
-
- crypto_inc(walk->iv, AES_BLOCK_SIZE);
- aes_encrypt(ctx, buf, walk->iv);
- aes_encrypt(ctx, mac, mac);
- if (enc)
- crypto_xor(mac, src, bsize);
- crypto_xor_cpy(dst, src, buf, bsize);
- if (!enc)
- crypto_xor(mac, dst, bsize);
- dst += bsize;
- src += bsize;
- nbytes -= bsize;
- } while (--blocks);
-
- err = skcipher_walk_done(walk, tail);
- }
-
- if (!err) {
- aes_encrypt(ctx, buf, iv0);
- aes_encrypt(ctx, mac, mac);
- crypto_xor(mac, buf, AES_BLOCK_SIZE);
- }
- return err;
-}
-
static int ccm_encrypt(struct aead_request *req)
{
struct crypto_aead *aead = crypto_aead_reqtfm(req);
@@ -233,41 +157,40 @@ static int ccm_encrypt(struct aead_request *req)
if (err)
return err;

- if (req->assoclen)
- ccm_calculate_auth_mac(req, mac);
-
/* preserve the original iv for the final round */
memcpy(buf, req->iv, AES_BLOCK_SIZE);

err = skcipher_walk_aead_encrypt(&walk, req, false);
+ if (unlikely(err))
+ return err;

- if (crypto_simd_usable()) {
- while (walk.nbytes) {
- u32 tail = walk.nbytes % AES_BLOCK_SIZE;
+ kernel_neon_begin();

- if (walk.nbytes == walk.total)
- tail = 0;
+ if (req->assoclen)
+ ccm_calculate_auth_mac(req, mac);

- kernel_neon_begin();
- ce_aes_ccm_encrypt(walk.dst.virt.addr,
- walk.src.virt.addr,
- walk.nbytes - tail, ctx->key_enc,
- num_rounds(ctx), mac, walk.iv);
- kernel_neon_end();
+ do {
+ u32 tail = walk.nbytes % AES_BLOCK_SIZE;

- err = skcipher_walk_done(&walk, tail);
- }
- if (!err) {
+ if (walk.nbytes == walk.total)
+ tail = 0;
+
+ ce_aes_ccm_encrypt(walk.dst.virt.addr, walk.src.virt.addr,
+ walk.nbytes - tail, ctx->key_enc,
+ num_rounds(ctx), mac, walk.iv);
+
+ if (walk.nbytes == walk.total)
+ ce_aes_ccm_final(mac, buf, ctx->key_enc, num_rounds(ctx));
+
+ kernel_neon_end();
+
+ err = skcipher_walk_done(&walk, tail);
+ if (unlikely(err))
+ return err;
+
+ if (unlikely(walk.nbytes))
kernel_neon_begin();
- ce_aes_ccm_final(mac, buf, ctx->key_enc,
- num_rounds(ctx));
- kernel_neon_end();
- }
- } else {
- err = ccm_crypt_fallback(&walk, mac, buf, ctx, true);
- }
- if (err)
- return err;
+ } while (walk.nbytes);

/* copy authtag to end of dst */
scatterwalk_map_and_copy(mac, req->dst, req->assoclen + req->cryptlen,
@@ -291,42 +214,40 @@ static int ccm_decrypt(struct aead_request *req)
if (err)
return err;

- if (req->assoclen)
- ccm_calculate_auth_mac(req, mac);
-
/* preserve the original iv for the final round */
memcpy(buf, req->iv, AES_BLOCK_SIZE);

err = skcipher_walk_aead_decrypt(&walk, req, false);
+ if (unlikely(err))
+ return err;

- if (crypto_simd_usable()) {
- while (walk.nbytes) {
- u32 tail = walk.nbytes % AES_BLOCK_SIZE;
+ kernel_neon_begin();

- if (walk.nbytes == walk.total)
- tail = 0;
+ if (req->assoclen)
+ ccm_calculate_auth_mac(req, mac);

- kernel_neon_begin();
- ce_aes_ccm_decrypt(walk.dst.virt.addr,
- walk.src.virt.addr,
- walk.nbytes - tail, ctx->key_enc,
- num_rounds(ctx), mac, walk.iv);
- kernel_neon_end();
+ do {
+ u32 tail = walk.nbytes % AES_BLOCK_SIZE;

- err = skcipher_walk_done(&walk, tail);
- }
- if (!err) {
- kernel_neon_begin();
- ce_aes_ccm_final(mac, buf, ctx->key_enc,
- num_rounds(ctx));
- kernel_neon_end();
- }
- } else {
- err = ccm_crypt_fallback(&walk, mac, buf, ctx, false);
- }
+ if (walk.nbytes == walk.total)
+ tail = 0;

- if (err)
- return err;
+ ce_aes_ccm_decrypt(walk.dst.virt.addr, walk.src.virt.addr,
+ walk.nbytes - tail, ctx->key_enc,
+ num_rounds(ctx), mac, walk.iv);
+
+ if (walk.nbytes == walk.total)
+ ce_aes_ccm_final(mac, buf, ctx->key_enc, num_rounds(ctx));
+
+ kernel_neon_end();
+
+ err = skcipher_walk_done(&walk, tail);
+ if (unlikely(err))
+ return err;
+
+ if (unlikely(walk.nbytes))
+ kernel_neon_begin();
+ } while (walk.nbytes);

/* compare calculated auth tag with the stored one */
scatterwalk_map_and_copy(buf, req->src,
--
2.20.1

2021-05-12 21:10:23

by Ard Biesheuvel

Subject: [PATCH v3 3/7] crypto: skcipher - disallow en/decrypt for non-task or non-softirq context

In order to ensure that kernel mode SIMD routines will not need a scalar
fallback if they run with softirqs disabled, disallow any use of the
skcipher encrypt and decrypt routines from outside of task or softirq
context.

Signed-off-by: Ard Biesheuvel <[email protected]>
---
crypto/skcipher.c | 10 ++++++++++
1 file changed, 10 insertions(+)

diff --git a/crypto/skcipher.c b/crypto/skcipher.c
index 93fdacf49697..9bce5350008b 100644
--- a/crypto/skcipher.c
+++ b/crypto/skcipher.c
@@ -625,6 +625,11 @@ int crypto_skcipher_encrypt(struct skcipher_request *req)
unsigned int cryptlen = req->cryptlen;
int ret;

+ if (!(alg->cra_flags & CRYPTO_ALG_ASYNC) &&
+ WARN_ONCE(!in_task() && !in_serving_softirq(),
+ "synchronous call from invalid context\n"))
+ return -EBUSY;
+
crypto_stats_get(alg);
if (crypto_skcipher_get_flags(tfm) & CRYPTO_TFM_NEED_KEY)
ret = -ENOKEY;
@@ -642,6 +647,11 @@ int crypto_skcipher_decrypt(struct skcipher_request *req)
unsigned int cryptlen = req->cryptlen;
int ret;

+ if (!(alg->cra_flags & CRYPTO_ALG_ASYNC) &&
+ WARN_ONCE(!in_task() && !in_serving_softirq(),
+ "synchronous call from invalid context\n"))
+ return -EBUSY;
+
crypto_stats_get(alg);
if (crypto_skcipher_get_flags(tfm) & CRYPTO_TFM_NEED_KEY)
ret = -ENOKEY;
--
2.20.1

2021-05-12 21:10:23

by Ard Biesheuvel

Subject: [PATCH v3 4/7] crypto: arm64/gcm-aes-ce - remove non-SIMD fallback path

Now that kernel mode SIMD is guaranteed to be available when executing
in task or softirq context, we no longer need scalar fallbacks to use
when the NEON is unavailable. So get rid of them.

Signed-off-by: Ard Biesheuvel <[email protected]>
---
arch/arm64/crypto/ghash-ce-glue.c | 209 +++++---------------
1 file changed, 51 insertions(+), 158 deletions(-)

diff --git a/arch/arm64/crypto/ghash-ce-glue.c b/arch/arm64/crypto/ghash-ce-glue.c
index 720cd3a58da3..15794fe21a0b 100644
--- a/arch/arm64/crypto/ghash-ce-glue.c
+++ b/arch/arm64/crypto/ghash-ce-glue.c
@@ -362,84 +362,36 @@ static int gcm_encrypt(struct aead_request *req)

err = skcipher_walk_aead_encrypt(&walk, req, false);

- if (likely(crypto_simd_usable())) {
- do {
- const u8 *src = walk.src.virt.addr;
- u8 *dst = walk.dst.virt.addr;
- int nbytes = walk.nbytes;
-
- tag = (u8 *)&lengths;
-
- if (unlikely(nbytes > 0 && nbytes < AES_BLOCK_SIZE)) {
- src = dst = memcpy(buf + sizeof(buf) - nbytes,
- src, nbytes);
- } else if (nbytes < walk.total) {
- nbytes &= ~(AES_BLOCK_SIZE - 1);
- tag = NULL;
- }
-
- kernel_neon_begin();
- pmull_gcm_encrypt(nbytes, dst, src, ctx->ghash_key.h,
- dg, iv, ctx->aes_key.key_enc, nrounds,
- tag);
- kernel_neon_end();
-
- if (unlikely(!nbytes))
- break;
-
- if (unlikely(nbytes > 0 && nbytes < AES_BLOCK_SIZE))
- memcpy(walk.dst.virt.addr,
- buf + sizeof(buf) - nbytes, nbytes);
-
- err = skcipher_walk_done(&walk, walk.nbytes - nbytes);
- } while (walk.nbytes);
- } else {
- while (walk.nbytes >= AES_BLOCK_SIZE) {
- int blocks = walk.nbytes / AES_BLOCK_SIZE;
- const u8 *src = walk.src.virt.addr;
- u8 *dst = walk.dst.virt.addr;
- int remaining = blocks;
-
- do {
- aes_encrypt(&ctx->aes_key, buf, iv);
- crypto_xor_cpy(dst, src, buf, AES_BLOCK_SIZE);
- crypto_inc(iv, AES_BLOCK_SIZE);
-
- dst += AES_BLOCK_SIZE;
- src += AES_BLOCK_SIZE;
- } while (--remaining > 0);
-
- ghash_do_update(blocks, dg, walk.dst.virt.addr,
- &ctx->ghash_key, NULL);
-
- err = skcipher_walk_done(&walk,
- walk.nbytes % AES_BLOCK_SIZE);
- }
-
- /* handle the tail */
- if (walk.nbytes) {
- aes_encrypt(&ctx->aes_key, buf, iv);
+ do {
+ const u8 *src = walk.src.virt.addr;
+ u8 *dst = walk.dst.virt.addr;
+ int nbytes = walk.nbytes;

- crypto_xor_cpy(walk.dst.virt.addr, walk.src.virt.addr,
- buf, walk.nbytes);
+ tag = (u8 *)&lengths;

- memcpy(buf, walk.dst.virt.addr, walk.nbytes);
- memset(buf + walk.nbytes, 0, sizeof(buf) - walk.nbytes);
+ if (unlikely(nbytes > 0 && nbytes < AES_BLOCK_SIZE)) {
+ src = dst = memcpy(buf + sizeof(buf) - nbytes,
+ src, nbytes);
+ } else if (nbytes < walk.total) {
+ nbytes &= ~(AES_BLOCK_SIZE - 1);
+ tag = NULL;
}

- tag = (u8 *)&lengths;
- ghash_do_update(1, dg, tag, &ctx->ghash_key,
- walk.nbytes ? buf : NULL);
+ kernel_neon_begin();
+ pmull_gcm_encrypt(nbytes, dst, src, ctx->ghash_key.h,
+ dg, iv, ctx->aes_key.key_enc, nrounds,
+ tag);
+ kernel_neon_end();

- if (walk.nbytes)
- err = skcipher_walk_done(&walk, 0);
+ if (unlikely(!nbytes))
+ break;

- put_unaligned_be64(dg[1], tag);
- put_unaligned_be64(dg[0], tag + 8);
- put_unaligned_be32(1, iv + GCM_IV_SIZE);
- aes_encrypt(&ctx->aes_key, iv, iv);
- crypto_xor(tag, iv, AES_BLOCK_SIZE);
- }
+ if (unlikely(nbytes > 0 && nbytes < AES_BLOCK_SIZE))
+ memcpy(walk.dst.virt.addr,
+ buf + sizeof(buf) - nbytes, nbytes);
+
+ err = skcipher_walk_done(&walk, walk.nbytes - nbytes);
+ } while (walk.nbytes);

if (err)
return err;
@@ -464,6 +416,7 @@ static int gcm_decrypt(struct aead_request *req)
u64 dg[2] = {};
be128 lengths;
u8 *tag;
+ int ret;
int err;

lengths.a = cpu_to_be64(req->assoclen * 8);
@@ -481,101 +434,41 @@ static int gcm_decrypt(struct aead_request *req)

err = skcipher_walk_aead_decrypt(&walk, req, false);

- if (likely(crypto_simd_usable())) {
- int ret;
-
- do {
- const u8 *src = walk.src.virt.addr;
- u8 *dst = walk.dst.virt.addr;
- int nbytes = walk.nbytes;
-
- tag = (u8 *)&lengths;
-
- if (unlikely(nbytes > 0 && nbytes < AES_BLOCK_SIZE)) {
- src = dst = memcpy(buf + sizeof(buf) - nbytes,
- src, nbytes);
- } else if (nbytes < walk.total) {
- nbytes &= ~(AES_BLOCK_SIZE - 1);
- tag = NULL;
- }
-
- kernel_neon_begin();
- ret = pmull_gcm_decrypt(nbytes, dst, src,
- ctx->ghash_key.h,
- dg, iv, ctx->aes_key.key_enc,
- nrounds, tag, otag, authsize);
- kernel_neon_end();
-
- if (unlikely(!nbytes))
- break;
-
- if (unlikely(nbytes > 0 && nbytes < AES_BLOCK_SIZE))
- memcpy(walk.dst.virt.addr,
- buf + sizeof(buf) - nbytes, nbytes);
-
- err = skcipher_walk_done(&walk, walk.nbytes - nbytes);
- } while (walk.nbytes);
-
- if (err)
- return err;
- if (ret)
- return -EBADMSG;
- } else {
- while (walk.nbytes >= AES_BLOCK_SIZE) {
- int blocks = walk.nbytes / AES_BLOCK_SIZE;
- const u8 *src = walk.src.virt.addr;
- u8 *dst = walk.dst.virt.addr;
-
- ghash_do_update(blocks, dg, walk.src.virt.addr,
- &ctx->ghash_key, NULL);
-
- do {
- aes_encrypt(&ctx->aes_key, buf, iv);
- crypto_xor_cpy(dst, src, buf, AES_BLOCK_SIZE);
- crypto_inc(iv, AES_BLOCK_SIZE);
-
- dst += AES_BLOCK_SIZE;
- src += AES_BLOCK_SIZE;
- } while (--blocks > 0);
+ do {
+ const u8 *src = walk.src.virt.addr;
+ u8 *dst = walk.dst.virt.addr;
+ int nbytes = walk.nbytes;

- err = skcipher_walk_done(&walk,
- walk.nbytes % AES_BLOCK_SIZE);
- }
+ tag = (u8 *)&lengths;

- /* handle the tail */
- if (walk.nbytes) {
- memcpy(buf, walk.src.virt.addr, walk.nbytes);
- memset(buf + walk.nbytes, 0, sizeof(buf) - walk.nbytes);
+ if (unlikely(nbytes > 0 && nbytes < AES_BLOCK_SIZE)) {
+ src = dst = memcpy(buf + sizeof(buf) - nbytes,
+ src, nbytes);
+ } else if (nbytes < walk.total) {
+ nbytes &= ~(AES_BLOCK_SIZE - 1);
+ tag = NULL;
}

- tag = (u8 *)&lengths;
- ghash_do_update(1, dg, tag, &ctx->ghash_key,
- walk.nbytes ? buf : NULL);
-
- if (walk.nbytes) {
- aes_encrypt(&ctx->aes_key, buf, iv);
+ kernel_neon_begin();
+ ret = pmull_gcm_decrypt(nbytes, dst, src, ctx->ghash_key.h,
+ dg, iv, ctx->aes_key.key_enc,
+ nrounds, tag, otag, authsize);
+ kernel_neon_end();

- crypto_xor_cpy(walk.dst.virt.addr, walk.src.virt.addr,
- buf, walk.nbytes);
+ if (unlikely(!nbytes))
+ break;

- err = skcipher_walk_done(&walk, 0);
- }
+ if (unlikely(nbytes > 0 && nbytes < AES_BLOCK_SIZE))
+ memcpy(walk.dst.virt.addr,
+ buf + sizeof(buf) - nbytes, nbytes);

- if (err)
- return err;
+ err = skcipher_walk_done(&walk, walk.nbytes - nbytes);
+ } while (walk.nbytes);

- put_unaligned_be64(dg[1], tag);
- put_unaligned_be64(dg[0], tag + 8);
- put_unaligned_be32(1, iv + GCM_IV_SIZE);
- aes_encrypt(&ctx->aes_key, iv, iv);
- crypto_xor(tag, iv, AES_BLOCK_SIZE);
+ if (err)
+ return err;

- if (crypto_memneq(tag, otag, authsize)) {
- memzero_explicit(tag, AES_BLOCK_SIZE);
- return -EBADMSG;
- }
- }
- return 0;
+ return ret ? -EBADMSG : 0;
}

static struct aead_alg gcm_aes_alg = {
--
2.20.1

2021-05-12 21:10:23

by Eric Biggers

Subject: Re: [PATCH v3 0/7] running kernel mode SIMD with softirqs disabled

On Wed, May 12, 2021 at 08:44:32PM +0200, Ard Biesheuvel wrote:
> This is a follow-up to [0], but given that the arm64 architectural
> pieces have been merged, the only remaining changes are crypto
> specific. Therefore, the audience has been reduced to those people who
> are likely to care about these specifics.
>
> Patch #1 addresses an issue in the skcipher walker, which does not
> handle zero-sized AEAD inputs entirely consistently; this is uncovered
> by the change in patch #7.
>
> Patches #2 and #3 add some sanity checks to the public AEAD and skcipher
> APIs to limit their availability to task or softirq context (which is
> the only way in which they are currently being used). Adding this
> restriction permits the arm64 crypto code to get rid of all scalar
> fallbacks: on this architecture, softirqs are no longer served while
> the SIMD unit is being used in kernel mode, so the scalar fallbacks are
> never needed. They are removed in the remaining four patches.
>
> [0] https://lore.kernel.org/linux-arm-kernel/[email protected]/

Did you check whether any updates to the self-tests in testmgr.c are warranted?
Specifically, is disabling the use of SIMD for testing still something that
makes sense?

- Eric

2021-05-12 22:24:46

by Ard Biesheuvel

Subject: Re: [PATCH v3 2/7] crypto: aead - disallow en/decrypt for non-task or non-softirq context

On Wed, 12 May 2021 at 22:06, Eric Biggers <[email protected]> wrote:
>
> On Wed, May 12, 2021 at 08:44:34PM +0200, Ard Biesheuvel wrote:
> > In order to ensure that kernel mode SIMD routines will not need a scalar
> > fallback if they run with softirqs disabled, disallow any use of the
> > AEAD encrypt and decrypt routines from outside of task or softirq context.
> >
> > Signed-off-by: Ard Biesheuvel <[email protected]>
> > ---
> > crypto/aead.c | 10 ++++++++++
> > 1 file changed, 10 insertions(+)
> >
> > diff --git a/crypto/aead.c b/crypto/aead.c
> > index 16991095270d..b5304b3d3314 100644
> > --- a/crypto/aead.c
> > +++ b/crypto/aead.c
> > @@ -87,6 +87,11 @@ int crypto_aead_encrypt(struct aead_request *req)
> > unsigned int cryptlen = req->cryptlen;
> > int ret;
> >
> > + if (!(alg->cra_flags & CRYPTO_ALG_ASYNC) &&
> > + WARN_ONCE(!in_task() && !in_serving_softirq(),
> > + "synchronous call from invalid context\n"))
> > + return -EBUSY;
> > +
> > crypto_stats_get(alg);
> > if (crypto_aead_get_flags(aead) & CRYPTO_TFM_NEED_KEY)
> > ret = -ENOKEY;
> > @@ -104,6 +109,11 @@ int crypto_aead_decrypt(struct aead_request *req)
> > unsigned int cryptlen = req->cryptlen;
> > int ret;
> >
> > + if (!(alg->cra_flags & CRYPTO_ALG_ASYNC) &&
> > + WARN_ONCE(!in_task() && !in_serving_softirq(),
> > + "synchronous call from invalid context\n"))
> > + return -EBUSY;
> > +
> > crypto_stats_get(alg);
> > if (crypto_aead_get_flags(aead) & CRYPTO_TFM_NEED_KEY)
> > ret = -ENOKEY;
>
> This probably should go after crypto_stats_get() so that the error gets counted
> in the stats (if stats are enabled) -- analogous to how the ENOKEY error is
> counted.
>
> Likewise for the skcipher patch.
>

Good point, I'll fix that.

2021-05-12 22:25:07

by Ard Biesheuvel

Subject: Re: [PATCH v3 0/7] running kernel mode SIMD with softirqs disabled

On Wed, 12 May 2021 at 22:11, Eric Biggers <[email protected]> wrote:
>
> On Wed, May 12, 2021 at 08:44:32PM +0200, Ard Biesheuvel wrote:
> > This is a follow-up to [0], but given that the arm64 architectural
> > pieces have been merged, the only remaining changes are crypto
> > specific. Therefore, the audience has been reduced to those people who
> > are likely to care about these specifics.
> >
> > Patch #1 addresses an issue in the skcipher walker, which does not
> > handle zero-sized AEAD inputs entirely consistently; this is uncovered
> > by the change in patch #7.
> >
> > Patches #2 and #3 add some sanity checks to the public AEAD and skcipher
> > APIs to limit their availability to task or softirq context (which is
> > the only way in which they are currently being used). Adding this
> > restriction permits the arm64 crypto code to get rid of all scalar
> > fallbacks: on this architecture, softirqs are no longer served while
> > the SIMD unit is being used in kernel mode, so the scalar fallbacks are
> > never needed. They are removed in the remaining four patches.
> >
> > [0] https://lore.kernel.org/linux-arm-kernel/[email protected]/
>
> Did you check whether any updates to the self-tests in testmgr.c are warranted?
> Specifically, is disabling the use of SIMD for testing still something that
> makes sense?
>

The situation is not ideal, but I am not sure what we can do about
this: the scalar fallbacks are gone, which means that the SIMD unit
will be used in the test even if testmgr attempts to disable it. But
keeping the scalar fallbacks just for the test suite makes no sense
either. So I don't think we should change anything, other than perhaps
document this somewhere (any suggestions on a place to put that?).

Note that the library routines, as well as shashes (which are
sometimes exposed via library routines, e.g., CRC-T10DIF and CRC-32,
and maybe others) are different, which is why their scalar fallbacks
are retained. There, we need the testmgr to override SIMD availability
to ensure that combinations of the SIMD and scalar code are tested.
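(For illustration, a rough sketch of the pattern the last paragraph
refers to, with hypothetical toy_* names: shash and library code keeps
a crypto_simd_usable() check, which testmgr can force to return false,
so that both the SIMD path and the scalar fallback get exercised.)

#include <linux/types.h>
#include <asm/neon.h>
#include <crypto/internal/simd.h>

static void toy_hash_update(struct toy_hash_state *st, const u8 *data,
			    unsigned int len)
{
	/* crypto_simd_usable() is false when testmgr simulates !SIMD */
	if (crypto_simd_usable()) {
		kernel_neon_begin();
		toy_hash_neon(st, data, len);	/* hypothetical NEON helper */
		kernel_neon_end();
	} else {
		toy_hash_scalar(st, data, len);	/* hypothetical scalar fallback */
	}
}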