2019-07-02 19:42:36

by Ard Biesheuvel

[permalink] [raw]
Subject: [PATCH v4 00/32] crypto: AES cleanup

This started out as an attempt to provide synchronous SIMD based GCM
on 32-bit ARM, but along the way, I ended up changing and cleaning up
so many things that it is more of a general AES cleanup now rather than
anything else.

Changes since v3:
- fix build warning due to missing array dimensions of exported sboxes
- rename the internal sparc64 AES routines so they don't clash with the
new library symbols
- fix a couple of comments that got truncated

Changes since v2:
- fix a bug in the CTR helper function - use chunksize not blocksize of the
skcipher as the blocksize of the CTR transformation
- add a couple of patches so all AES implementation share the forward and
inverse Sboxes that are in the AES library.
- unexpert data structures and helpers that are not actually (or should be)
used outside the drivers that define them

Code can be found here:
https://git.kernel.org/pub/scm/linux/kernel/git/ardb/linux.git/log/?h=aes-cleanup-v4

Changes since v1/RFC:
- rename x86 AES-NI and padlock routines as well, in order to avoid clashes (#2)
- move irq en/disabling out of the AES library into the callers (aes-ti
and the skcipher helper for sync ctr(aes) added in #17)
- align 32-bit ARM CE key schedule endianness with other AES drivers, to
avoid problems on BE systems when using the synchronous ctr fallback (#18)
- replace some occurrences where a "aes" or "aes-generic" cipher was allocated
explicitly, and use library calls instead.
- use a generic helper in crypto/ctr.h instead of adding a CTR helper to the
AES library for providing the synchronous CTR fallback code.

Some users of the AES cipher are being switched to other algorithms (i.e.,
SipHash for TCP fastopen and CCM or cbcmac for wusb and lib80211). These
have been posted separately, since they have no build time interdependencies.

----- Original blurb below ------

On 32-bit ARM, asynchronous GCM can be provided by the following drivers:

| cycles per byte on low end Si
gcm_base(ctr(aes-generic),ghash-generic) | 65.3
gcm_base(ctr-aes-neonbs,ghash-ce) [*] | 27.7
gcm_base(ctr-aes-ce,ghash-ce) [**] | 3.7

[*] ghash-ce using vmull.p8 instructions
[**] ghash-ce using optional vmull.p64 instructions

The third and fastest option is actually only available on 32-bit cores that
implement the v8 Crypto Extensions, which are rare, but the NEON based runner
up is obviously a huge improvement over the generic code, not only in terms of
performance, but also because it is time invariant (generic AES and generic
GHASH are both implemented using lookup tables, which are susceptible to
cache timing attacks)

However, when allocating the gcm(aes) skcipher in synchronous mode, we end up
with the generic code, due to the fact that the NEON code has no handling for
invocations that occur from a context where the NEON cannot be used, and so
it defers the processing to a kthread, which is only permitted for asynchronous
ciphers.

So let's implement this fallback handling, by reusing some of the logic that
has already been implemented for arm64. Note that these fallbacks are rarely
called in practice, but the API requires the functionality to be there.
This is implemented in patches 16-22.

All the patches leading up to that are cleanups for the AES code, to reduce
the dependency on the generic table based AES code, or in some cases, hardcoded
dependencies on the scalar arm64 asm code which suffers from the same problem.
It also removes redundant key expansion routines, and gets rid of the x86
scalar asm code, which is a maintenance burden and is not actually faster than
the generic code built with a modern compiler.

Ard Biesheuvel (32):
crypto: arm/aes-ce - cosmetic/whitespace cleanup
crypto: aes - rename local routines to prevent future clashes
crypto: aes/fixed-time - align key schedule with other implementations
crypto: aes - create AES library based on the fixed time AES code
crypto: x86/aes-ni - switch to generic for fallback and key routines
crypto: x86/aes - drop scalar assembler implementations
crypto: padlock/aes - switch to library version of key expansion
routine
crypto: cesa/aes - switch to library version of key expansion routine
crypto: safexcel/aes - switch to library version of key expansion
routine
crypto: arm64/ghash - switch to AES library
crypto: arm/aes-neonbs - switch to library version of key expansion
routine
crypto: arm64/aes-ccm - switch to AES library
crypto: arm64/aes-neonbs - switch to library version of key expansion
routine
crypto: arm64/aes-ce - switch to library version of key expansion
routine
crypto: generic/aes - drop key expansion routine in favor of library
version
crypto: ctr - add helper for performing a CTR encryption walk
crypto: aes - move sync ctr(aes) to AES library and generic helper
crypto: arm64/aes-ce-cipher - use AES library as fallback
crypto: aes/arm - use native endiannes for key schedule
crypto: arm/aes-ce - provide a synchronous version of ctr(aes)
crypto: arm/aes-neonbs - provide a synchronous version of ctr(aes)
crypto: arm/ghash - provide a synchronous version
bluetooth: switch to AES library
crypto: amcc/aes - switch to AES library for GCM key derivation
crypto: ccp - move to AES library for CMAC key derivation
crypto: chelsio/aes - replace AES cipher calls with library calls
crypto: aes/generic - unexport last-round AES tables
crypto: lib/aes - export sbox and inverse sbox
crypto: arm64/aes-neon - switch to shared AES Sboxes
crypto: arm/aes-cipher - switch to shared AES inverse Sbox
crypto: arm64/aes-cipher - switch to shared AES inverse Sbox
crypto: arm/aes-scalar - unexport en/decryption routines

arch/arm/crypto/Kconfig | 2 +-
arch/arm/crypto/aes-ce-core.S | 20 +-
arch/arm/crypto/aes-ce-glue.c | 168 ++++----
arch/arm/crypto/aes-cipher-core.S | 40 +-
arch/arm/crypto/aes-cipher-glue.c | 11 +-
arch/arm/crypto/aes-neonbs-glue.c | 69 +++-
arch/arm/crypto/ghash-ce-glue.c | 78 ++--
arch/arm64/crypto/Kconfig | 10 +-
arch/arm64/crypto/aes-ce-ccm-glue.c | 18 +-
arch/arm64/crypto/aes-ce-glue.c | 7 +-
arch/arm64/crypto/aes-cipher-core.S | 40 +-
arch/arm64/crypto/aes-cipher-glue.c | 11 +-
arch/arm64/crypto/aes-ctr-fallback.h | 53 ---
arch/arm64/crypto/aes-glue.c | 39 +-
arch/arm64/crypto/aes-neon.S | 74 +---
arch/arm64/crypto/aes-neonbs-glue.c | 29 +-
arch/arm64/crypto/ghash-ce-glue.c | 30 +-
arch/sparc/crypto/aes_glue.c | 8 +-
arch/x86/crypto/Makefile | 4 -
arch/x86/crypto/aes-i586-asm_32.S | 362 ------------------
arch/x86/crypto/aes-x86_64-asm_64.S | 185 ---------
arch/x86/crypto/aes_glue.c | 70 ----
arch/x86/crypto/aesni-intel_glue.c | 23 +-
arch/x86/include/asm/crypto/aes.h | 12 -
crypto/Kconfig | 52 +--
crypto/aes_generic.c | 167 +-------
crypto/aes_ti.c | 317 +--------------
drivers/crypto/Kconfig | 8 +-
drivers/crypto/amcc/crypto4xx_alg.c | 24 +-
drivers/crypto/ccp/Kconfig | 1 +
drivers/crypto/ccp/ccp-crypto-aes-cmac.c | 25 +-
drivers/crypto/ccp/ccp-crypto.h | 3 -
drivers/crypto/chelsio/Kconfig | 1 +
drivers/crypto/chelsio/chcr_algo.c | 46 +--
drivers/crypto/chelsio/chcr_crypto.h | 1 -
drivers/crypto/chelsio/chcr_ipsec.c | 19 +-
drivers/crypto/chelsio/chtls/chtls_hw.c | 20 +-
.../crypto/inside-secure/safexcel_cipher.c | 2 +-
drivers/crypto/marvell/cipher.c | 2 +-
drivers/crypto/padlock-aes.c | 10 +-
include/crypto/aes.h | 41 +-
include/crypto/ctr.h | 50 +++
lib/crypto/Makefile | 3 +
lib/crypto/aes.c | 356 +++++++++++++++++
net/bluetooth/Kconfig | 3 +-
net/bluetooth/smp.c | 103 ++---
46 files changed, 877 insertions(+), 1740 deletions(-)
delete mode 100644 arch/arm64/crypto/aes-ctr-fallback.h
delete mode 100644 arch/x86/crypto/aes-i586-asm_32.S
delete mode 100644 arch/x86/crypto/aes-x86_64-asm_64.S
delete mode 100644 arch/x86/crypto/aes_glue.c
delete mode 100644 arch/x86/include/asm/crypto/aes.h
create mode 100644 lib/crypto/aes.c

--
2.17.1


2019-07-02 19:42:36

by Ard Biesheuvel

[permalink] [raw]
Subject: [PATCH v4 04/32] crypto: aes - create AES library based on the fixed time AES code

Take the existing small footprint and mostly time invariant C code
and turn it into a AES library that can be used for non-performance
critical, casual use of AES, and as a fallback for, e.g., SIMD code
that needs a secondary path that can be taken in contexts where the
SIMD unit is off limits (e.g., in hard interrupts taken from kernel
context)

Signed-off-by: Ard Biesheuvel <[email protected]>
---
crypto/Kconfig | 4 +
crypto/aes_ti.c | 307 +----------------
include/crypto/aes.h | 34 ++
lib/crypto/Makefile | 3 +
lib/crypto/aes.c | 350 ++++++++++++++++++++
5 files changed, 395 insertions(+), 303 deletions(-)

diff --git a/crypto/Kconfig b/crypto/Kconfig
index e801450bcb1c..091ebbbc9655 100644
--- a/crypto/Kconfig
+++ b/crypto/Kconfig
@@ -1066,6 +1066,9 @@ config CRYPTO_GHASH_CLMUL_NI_INTEL

comment "Ciphers"

+config CRYPTO_LIB_AES
+ tristate
+
config CRYPTO_AES
tristate "AES cipher algorithms"
select CRYPTO_ALGAPI
@@ -1089,6 +1092,7 @@ config CRYPTO_AES
config CRYPTO_AES_TI
tristate "Fixed time AES cipher"
select CRYPTO_ALGAPI
+ select CRYPTO_LIB_AES
help
This is a generic implementation of AES that attempts to eliminate
data dependent latencies as much as possible without affecting
diff --git a/crypto/aes_ti.c b/crypto/aes_ti.c
index fd70dc322634..339915db9aeb 100644
--- a/crypto/aes_ti.c
+++ b/crypto/aes_ti.c
@@ -1,259 +1,27 @@
+// SPDX-License-Identifier: GPL-2.0
/*
* Scalar fixed time AES core transform
*
* Copyright (C) 2017 Linaro Ltd <[email protected]>
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License version 2 as
- * published by the Free Software Foundation.
*/

#include <crypto/aes.h>
#include <linux/crypto.h>
#include <linux/module.h>
-#include <asm/unaligned.h>
-
-/*
- * Emit the sbox as volatile const to prevent the compiler from doing
- * constant folding on sbox references involving fixed indexes.
- */
-static volatile const u8 __cacheline_aligned __aesti_sbox[] = {
- 0x63, 0x7c, 0x77, 0x7b, 0xf2, 0x6b, 0x6f, 0xc5,
- 0x30, 0x01, 0x67, 0x2b, 0xfe, 0xd7, 0xab, 0x76,
- 0xca, 0x82, 0xc9, 0x7d, 0xfa, 0x59, 0x47, 0xf0,
- 0xad, 0xd4, 0xa2, 0xaf, 0x9c, 0xa4, 0x72, 0xc0,
- 0xb7, 0xfd, 0x93, 0x26, 0x36, 0x3f, 0xf7, 0xcc,
- 0x34, 0xa5, 0xe5, 0xf1, 0x71, 0xd8, 0x31, 0x15,
- 0x04, 0xc7, 0x23, 0xc3, 0x18, 0x96, 0x05, 0x9a,
- 0x07, 0x12, 0x80, 0xe2, 0xeb, 0x27, 0xb2, 0x75,
- 0x09, 0x83, 0x2c, 0x1a, 0x1b, 0x6e, 0x5a, 0xa0,
- 0x52, 0x3b, 0xd6, 0xb3, 0x29, 0xe3, 0x2f, 0x84,
- 0x53, 0xd1, 0x00, 0xed, 0x20, 0xfc, 0xb1, 0x5b,
- 0x6a, 0xcb, 0xbe, 0x39, 0x4a, 0x4c, 0x58, 0xcf,
- 0xd0, 0xef, 0xaa, 0xfb, 0x43, 0x4d, 0x33, 0x85,
- 0x45, 0xf9, 0x02, 0x7f, 0x50, 0x3c, 0x9f, 0xa8,
- 0x51, 0xa3, 0x40, 0x8f, 0x92, 0x9d, 0x38, 0xf5,
- 0xbc, 0xb6, 0xda, 0x21, 0x10, 0xff, 0xf3, 0xd2,
- 0xcd, 0x0c, 0x13, 0xec, 0x5f, 0x97, 0x44, 0x17,
- 0xc4, 0xa7, 0x7e, 0x3d, 0x64, 0x5d, 0x19, 0x73,
- 0x60, 0x81, 0x4f, 0xdc, 0x22, 0x2a, 0x90, 0x88,
- 0x46, 0xee, 0xb8, 0x14, 0xde, 0x5e, 0x0b, 0xdb,
- 0xe0, 0x32, 0x3a, 0x0a, 0x49, 0x06, 0x24, 0x5c,
- 0xc2, 0xd3, 0xac, 0x62, 0x91, 0x95, 0xe4, 0x79,
- 0xe7, 0xc8, 0x37, 0x6d, 0x8d, 0xd5, 0x4e, 0xa9,
- 0x6c, 0x56, 0xf4, 0xea, 0x65, 0x7a, 0xae, 0x08,
- 0xba, 0x78, 0x25, 0x2e, 0x1c, 0xa6, 0xb4, 0xc6,
- 0xe8, 0xdd, 0x74, 0x1f, 0x4b, 0xbd, 0x8b, 0x8a,
- 0x70, 0x3e, 0xb5, 0x66, 0x48, 0x03, 0xf6, 0x0e,
- 0x61, 0x35, 0x57, 0xb9, 0x86, 0xc1, 0x1d, 0x9e,
- 0xe1, 0xf8, 0x98, 0x11, 0x69, 0xd9, 0x8e, 0x94,
- 0x9b, 0x1e, 0x87, 0xe9, 0xce, 0x55, 0x28, 0xdf,
- 0x8c, 0xa1, 0x89, 0x0d, 0xbf, 0xe6, 0x42, 0x68,
- 0x41, 0x99, 0x2d, 0x0f, 0xb0, 0x54, 0xbb, 0x16,
-};
-
-static volatile const u8 __cacheline_aligned __aesti_inv_sbox[] = {
- 0x52, 0x09, 0x6a, 0xd5, 0x30, 0x36, 0xa5, 0x38,
- 0xbf, 0x40, 0xa3, 0x9e, 0x81, 0xf3, 0xd7, 0xfb,
- 0x7c, 0xe3, 0x39, 0x82, 0x9b, 0x2f, 0xff, 0x87,
- 0x34, 0x8e, 0x43, 0x44, 0xc4, 0xde, 0xe9, 0xcb,
- 0x54, 0x7b, 0x94, 0x32, 0xa6, 0xc2, 0x23, 0x3d,
- 0xee, 0x4c, 0x95, 0x0b, 0x42, 0xfa, 0xc3, 0x4e,
- 0x08, 0x2e, 0xa1, 0x66, 0x28, 0xd9, 0x24, 0xb2,
- 0x76, 0x5b, 0xa2, 0x49, 0x6d, 0x8b, 0xd1, 0x25,
- 0x72, 0xf8, 0xf6, 0x64, 0x86, 0x68, 0x98, 0x16,
- 0xd4, 0xa4, 0x5c, 0xcc, 0x5d, 0x65, 0xb6, 0x92,
- 0x6c, 0x70, 0x48, 0x50, 0xfd, 0xed, 0xb9, 0xda,
- 0x5e, 0x15, 0x46, 0x57, 0xa7, 0x8d, 0x9d, 0x84,
- 0x90, 0xd8, 0xab, 0x00, 0x8c, 0xbc, 0xd3, 0x0a,
- 0xf7, 0xe4, 0x58, 0x05, 0xb8, 0xb3, 0x45, 0x06,
- 0xd0, 0x2c, 0x1e, 0x8f, 0xca, 0x3f, 0x0f, 0x02,
- 0xc1, 0xaf, 0xbd, 0x03, 0x01, 0x13, 0x8a, 0x6b,
- 0x3a, 0x91, 0x11, 0x41, 0x4f, 0x67, 0xdc, 0xea,
- 0x97, 0xf2, 0xcf, 0xce, 0xf0, 0xb4, 0xe6, 0x73,
- 0x96, 0xac, 0x74, 0x22, 0xe7, 0xad, 0x35, 0x85,
- 0xe2, 0xf9, 0x37, 0xe8, 0x1c, 0x75, 0xdf, 0x6e,
- 0x47, 0xf1, 0x1a, 0x71, 0x1d, 0x29, 0xc5, 0x89,
- 0x6f, 0xb7, 0x62, 0x0e, 0xaa, 0x18, 0xbe, 0x1b,
- 0xfc, 0x56, 0x3e, 0x4b, 0xc6, 0xd2, 0x79, 0x20,
- 0x9a, 0xdb, 0xc0, 0xfe, 0x78, 0xcd, 0x5a, 0xf4,
- 0x1f, 0xdd, 0xa8, 0x33, 0x88, 0x07, 0xc7, 0x31,
- 0xb1, 0x12, 0x10, 0x59, 0x27, 0x80, 0xec, 0x5f,
- 0x60, 0x51, 0x7f, 0xa9, 0x19, 0xb5, 0x4a, 0x0d,
- 0x2d, 0xe5, 0x7a, 0x9f, 0x93, 0xc9, 0x9c, 0xef,
- 0xa0, 0xe0, 0x3b, 0x4d, 0xae, 0x2a, 0xf5, 0xb0,
- 0xc8, 0xeb, 0xbb, 0x3c, 0x83, 0x53, 0x99, 0x61,
- 0x17, 0x2b, 0x04, 0x7e, 0xba, 0x77, 0xd6, 0x26,
- 0xe1, 0x69, 0x14, 0x63, 0x55, 0x21, 0x0c, 0x7d,
-};
-
-static u32 mul_by_x(u32 w)
-{
- u32 x = w & 0x7f7f7f7f;
- u32 y = w & 0x80808080;
-
- /* multiply by polynomial 'x' (0b10) in GF(2^8) */
- return (x << 1) ^ (y >> 7) * 0x1b;
-}
-
-static u32 mul_by_x2(u32 w)
-{
- u32 x = w & 0x3f3f3f3f;
- u32 y = w & 0x80808080;
- u32 z = w & 0x40404040;
-
- /* multiply by polynomial 'x^2' (0b100) in GF(2^8) */
- return (x << 2) ^ (y >> 7) * 0x36 ^ (z >> 6) * 0x1b;
-}
-
-static u32 mix_columns(u32 x)
-{
- /*
- * Perform the following matrix multiplication in GF(2^8)
- *
- * | 0x2 0x3 0x1 0x1 | | x[0] |
- * | 0x1 0x2 0x3 0x1 | | x[1] |
- * | 0x1 0x1 0x2 0x3 | x | x[2] |
- * | 0x3 0x1 0x1 0x2 | | x[3] |
- */
- u32 y = mul_by_x(x) ^ ror32(x, 16);
-
- return y ^ ror32(x ^ y, 8);
-}
-
-static u32 inv_mix_columns(u32 x)
-{
- /*
- * Perform the following matrix multiplication in GF(2^8)
- *
- * | 0xe 0xb 0xd 0x9 | | x[0] |
- * | 0x9 0xe 0xb 0xd | | x[1] |
- * | 0xd 0x9 0xe 0xb | x | x[2] |
- * | 0xb 0xd 0x9 0xe | | x[3] |
- *
- * which can conveniently be reduced to
- *
- * | 0x2 0x3 0x1 0x1 | | 0x5 0x0 0x4 0x0 | | x[0] |
- * | 0x1 0x2 0x3 0x1 | | 0x0 0x5 0x0 0x4 | | x[1] |
- * | 0x1 0x1 0x2 0x3 | x | 0x4 0x0 0x5 0x0 | x | x[2] |
- * | 0x3 0x1 0x1 0x2 | | 0x0 0x4 0x0 0x5 | | x[3] |
- */
- u32 y = mul_by_x2(x);
-
- return mix_columns(x ^ y ^ ror32(y, 16));
-}
-
-static __always_inline u32 subshift(u32 in[], int pos)
-{
- return (__aesti_sbox[in[pos] & 0xff]) ^
- (__aesti_sbox[(in[(pos + 1) % 4] >> 8) & 0xff] << 8) ^
- (__aesti_sbox[(in[(pos + 2) % 4] >> 16) & 0xff] << 16) ^
- (__aesti_sbox[(in[(pos + 3) % 4] >> 24) & 0xff] << 24);
-}
-
-static __always_inline u32 inv_subshift(u32 in[], int pos)
-{
- return (__aesti_inv_sbox[in[pos] & 0xff]) ^
- (__aesti_inv_sbox[(in[(pos + 3) % 4] >> 8) & 0xff] << 8) ^
- (__aesti_inv_sbox[(in[(pos + 2) % 4] >> 16) & 0xff] << 16) ^
- (__aesti_inv_sbox[(in[(pos + 1) % 4] >> 24) & 0xff] << 24);
-}
-
-static u32 subw(u32 in)
-{
- return (__aesti_sbox[in & 0xff]) ^
- (__aesti_sbox[(in >> 8) & 0xff] << 8) ^
- (__aesti_sbox[(in >> 16) & 0xff] << 16) ^
- (__aesti_sbox[(in >> 24) & 0xff] << 24);
-}
-
-static int aesti_expand_key(struct crypto_aes_ctx *ctx, const u8 *in_key,
- unsigned int key_len)
-{
- u32 kwords = key_len / sizeof(u32);
- u32 rc, i, j;
-
- if (key_len != AES_KEYSIZE_128 &&
- key_len != AES_KEYSIZE_192 &&
- key_len != AES_KEYSIZE_256)
- return -EINVAL;
-
- ctx->key_length = key_len;
-
- for (i = 0; i < kwords; i++)
- ctx->key_enc[i] = get_unaligned_le32(in_key + i * sizeof(u32));
-
- for (i = 0, rc = 1; i < 10; i++, rc = mul_by_x(rc)) {
- u32 *rki = ctx->key_enc + (i * kwords);
- u32 *rko = rki + kwords;
-
- rko[0] = ror32(subw(rki[kwords - 1]), 8) ^ rc ^ rki[0];
- rko[1] = rko[0] ^ rki[1];
- rko[2] = rko[1] ^ rki[2];
- rko[3] = rko[2] ^ rki[3];
-
- if (key_len == 24) {
- if (i >= 7)
- break;
- rko[4] = rko[3] ^ rki[4];
- rko[5] = rko[4] ^ rki[5];
- } else if (key_len == 32) {
- if (i >= 6)
- break;
- rko[4] = subw(rko[3]) ^ rki[4];
- rko[5] = rko[4] ^ rki[5];
- rko[6] = rko[5] ^ rki[6];
- rko[7] = rko[6] ^ rki[7];
- }
- }
-
- /*
- * Generate the decryption keys for the Equivalent Inverse Cipher.
- * This involves reversing the order of the round keys, and applying
- * the Inverse Mix Columns transformation to all but the first and
- * the last one.
- */
- ctx->key_dec[0] = ctx->key_enc[key_len + 24];
- ctx->key_dec[1] = ctx->key_enc[key_len + 25];
- ctx->key_dec[2] = ctx->key_enc[key_len + 26];
- ctx->key_dec[3] = ctx->key_enc[key_len + 27];
-
- for (i = 4, j = key_len + 20; j > 0; i += 4, j -= 4) {
- ctx->key_dec[i] = inv_mix_columns(ctx->key_enc[j]);
- ctx->key_dec[i + 1] = inv_mix_columns(ctx->key_enc[j + 1]);
- ctx->key_dec[i + 2] = inv_mix_columns(ctx->key_enc[j + 2]);
- ctx->key_dec[i + 3] = inv_mix_columns(ctx->key_enc[j + 3]);
- }

- ctx->key_dec[i] = ctx->key_enc[0];
- ctx->key_dec[i + 1] = ctx->key_enc[1];
- ctx->key_dec[i + 2] = ctx->key_enc[2];
- ctx->key_dec[i + 3] = ctx->key_enc[3];
-
- return 0;
-}

static int aesti_set_key(struct crypto_tfm *tfm, const u8 *in_key,
unsigned int key_len)
{
struct crypto_aes_ctx *ctx = crypto_tfm_ctx(tfm);

- return aesti_expand_key(ctx, in_key, key_len);
+ return aes_expandkey(ctx, in_key, key_len);
}

static void aesti_encrypt(struct crypto_tfm *tfm, u8 *out, const u8 *in)
{
const struct crypto_aes_ctx *ctx = crypto_tfm_ctx(tfm);
- const u32 *rkp = ctx->key_enc + 4;
- int rounds = 6 + ctx->key_length / 4;
- u32 st0[4], st1[4];
unsigned long flags;
- int round;
-
- st0[0] = ctx->key_enc[0] ^ get_unaligned_le32(in);
- st0[1] = ctx->key_enc[1] ^ get_unaligned_le32(in + 4);
- st0[2] = ctx->key_enc[2] ^ get_unaligned_le32(in + 8);
- st0[3] = ctx->key_enc[3] ^ get_unaligned_le32(in + 12);

/*
* Temporarily disable interrupts to avoid races where cachelines are
@@ -261,36 +29,7 @@ static void aesti_encrypt(struct crypto_tfm *tfm, u8 *out, const u8 *in)
*/
local_irq_save(flags);

- /*
- * Force the compiler to emit data independent Sbox references,
- * by xoring the input with Sbox values that are known to add up
- * to zero. This pulls the entire Sbox into the D-cache before any
- * data dependent lookups are done.
- */
- st0[0] ^= __aesti_sbox[ 0] ^ __aesti_sbox[ 64] ^ __aesti_sbox[134] ^ __aesti_sbox[195];
- st0[1] ^= __aesti_sbox[16] ^ __aesti_sbox[ 82] ^ __aesti_sbox[158] ^ __aesti_sbox[221];
- st0[2] ^= __aesti_sbox[32] ^ __aesti_sbox[ 96] ^ __aesti_sbox[160] ^ __aesti_sbox[234];
- st0[3] ^= __aesti_sbox[48] ^ __aesti_sbox[112] ^ __aesti_sbox[186] ^ __aesti_sbox[241];
-
- for (round = 0;; round += 2, rkp += 8) {
- st1[0] = mix_columns(subshift(st0, 0)) ^ rkp[0];
- st1[1] = mix_columns(subshift(st0, 1)) ^ rkp[1];
- st1[2] = mix_columns(subshift(st0, 2)) ^ rkp[2];
- st1[3] = mix_columns(subshift(st0, 3)) ^ rkp[3];
-
- if (round == rounds - 2)
- break;
-
- st0[0] = mix_columns(subshift(st1, 0)) ^ rkp[4];
- st0[1] = mix_columns(subshift(st1, 1)) ^ rkp[5];
- st0[2] = mix_columns(subshift(st1, 2)) ^ rkp[6];
- st0[3] = mix_columns(subshift(st1, 3)) ^ rkp[7];
- }
-
- put_unaligned_le32(subshift(st1, 0) ^ rkp[4], out);
- put_unaligned_le32(subshift(st1, 1) ^ rkp[5], out + 4);
- put_unaligned_le32(subshift(st1, 2) ^ rkp[6], out + 8);
- put_unaligned_le32(subshift(st1, 3) ^ rkp[7], out + 12);
+ aes_encrypt(ctx, out, in);

local_irq_restore(flags);
}
@@ -298,16 +37,7 @@ static void aesti_encrypt(struct crypto_tfm *tfm, u8 *out, const u8 *in)
static void aesti_decrypt(struct crypto_tfm *tfm, u8 *out, const u8 *in)
{
const struct crypto_aes_ctx *ctx = crypto_tfm_ctx(tfm);
- const u32 *rkp = ctx->key_dec + 4;
- int rounds = 6 + ctx->key_length / 4;
- u32 st0[4], st1[4];
unsigned long flags;
- int round;
-
- st0[0] = ctx->key_dec[0] ^ get_unaligned_le32(in);
- st0[1] = ctx->key_dec[1] ^ get_unaligned_le32(in + 4);
- st0[2] = ctx->key_dec[2] ^ get_unaligned_le32(in + 8);
- st0[3] = ctx->key_dec[3] ^ get_unaligned_le32(in + 12);

/*
* Temporarily disable interrupts to avoid races where cachelines are
@@ -315,36 +45,7 @@ static void aesti_decrypt(struct crypto_tfm *tfm, u8 *out, const u8 *in)
*/
local_irq_save(flags);

- /*
- * Force the compiler to emit data independent Sbox references,
- * by xoring the input with Sbox values that are known to add up
- * to zero. This pulls the entire Sbox into the D-cache before any
- * data dependent lookups are done.
- */
- st0[0] ^= __aesti_inv_sbox[ 0] ^ __aesti_inv_sbox[ 64] ^ __aesti_inv_sbox[129] ^ __aesti_inv_sbox[200];
- st0[1] ^= __aesti_inv_sbox[16] ^ __aesti_inv_sbox[ 83] ^ __aesti_inv_sbox[150] ^ __aesti_inv_sbox[212];
- st0[2] ^= __aesti_inv_sbox[32] ^ __aesti_inv_sbox[ 96] ^ __aesti_inv_sbox[160] ^ __aesti_inv_sbox[236];
- st0[3] ^= __aesti_inv_sbox[48] ^ __aesti_inv_sbox[112] ^ __aesti_inv_sbox[187] ^ __aesti_inv_sbox[247];
-
- for (round = 0;; round += 2, rkp += 8) {
- st1[0] = inv_mix_columns(inv_subshift(st0, 0)) ^ rkp[0];
- st1[1] = inv_mix_columns(inv_subshift(st0, 1)) ^ rkp[1];
- st1[2] = inv_mix_columns(inv_subshift(st0, 2)) ^ rkp[2];
- st1[3] = inv_mix_columns(inv_subshift(st0, 3)) ^ rkp[3];
-
- if (round == rounds - 2)
- break;
-
- st0[0] = inv_mix_columns(inv_subshift(st1, 0)) ^ rkp[4];
- st0[1] = inv_mix_columns(inv_subshift(st1, 1)) ^ rkp[5];
- st0[2] = inv_mix_columns(inv_subshift(st1, 2)) ^ rkp[6];
- st0[3] = inv_mix_columns(inv_subshift(st1, 3)) ^ rkp[7];
- }
-
- put_unaligned_le32(inv_subshift(st1, 0) ^ rkp[4], out);
- put_unaligned_le32(inv_subshift(st1, 1) ^ rkp[5], out + 4);
- put_unaligned_le32(inv_subshift(st1, 2) ^ rkp[6], out + 8);
- put_unaligned_le32(inv_subshift(st1, 3) ^ rkp[7], out + 12);
+ aes_decrypt(ctx, out, in);

local_irq_restore(flags);
}
diff --git a/include/crypto/aes.h b/include/crypto/aes.h
index 0fdb542c70cd..d0067fca0cd0 100644
--- a/include/crypto/aes.h
+++ b/include/crypto/aes.h
@@ -37,4 +37,38 @@ int crypto_aes_set_key(struct crypto_tfm *tfm, const u8 *in_key,
unsigned int key_len);
int crypto_aes_expand_key(struct crypto_aes_ctx *ctx, const u8 *in_key,
unsigned int key_len);
+
+/**
+ * aes_expandkey - Expands the AES key as described in FIPS-197
+ * @ctx: The location where the computed key will be stored.
+ * @in_key: The supplied key.
+ * @key_len: The length of the supplied key.
+ *
+ * Returns 0 on success. The function fails only if an invalid key size (or
+ * pointer) is supplied.
+ * The expanded key size is 240 bytes (max of 14 rounds with a unique 16 bytes
+ * key schedule plus a 16 bytes key which is used before the first round).
+ * The decryption key is prepared for the "Equivalent Inverse Cipher" as
+ * described in FIPS-197. The first slot (16 bytes) of each key (enc or dec) is
+ * for the initial combination, the second slot for the first round and so on.
+ */
+int aes_expandkey(struct crypto_aes_ctx *ctx, const u8 *in_key,
+ unsigned int key_len);
+
+/**
+ * aes_encrypt - Encrypt a single AES block
+ * @ctx: Context struct containing the key schedule
+ * @out: Buffer to store the ciphertext
+ * @in: Buffer containing the plaintext
+ */
+void aes_encrypt(const struct crypto_aes_ctx *ctx, u8 *out, const u8 *in);
+
+/**
+ * aes_decrypt - Decrypt a single AES block
+ * @ctx: Context struct containing the key schedule
+ * @out: Buffer to store the plaintext
+ * @in: Buffer containing the ciphertext
+ */
+void aes_decrypt(const struct crypto_aes_ctx *ctx, u8 *out, const u8 *in);
+
#endif
diff --git a/lib/crypto/Makefile b/lib/crypto/Makefile
index 88195c34932d..42a91c62d96d 100644
--- a/lib/crypto/Makefile
+++ b/lib/crypto/Makefile
@@ -1,4 +1,7 @@
# SPDX-License-Identifier: GPL-2.0

+obj-$(CONFIG_CRYPTO_LIB_AES) += libaes.o
+libaes-y := aes.o
+
obj-$(CONFIG_CRYPTO_LIB_ARC4) += libarc4.o
libarc4-y := arc4.o
diff --git a/lib/crypto/aes.c b/lib/crypto/aes.c
new file mode 100644
index 000000000000..9928b23e0a8a
--- /dev/null
+++ b/lib/crypto/aes.c
@@ -0,0 +1,350 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2017-2019 Linaro Ltd <[email protected]>
+ */
+
+#include <crypto/aes.h>
+#include <linux/crypto.h>
+#include <linux/module.h>
+#include <asm/unaligned.h>
+
+/*
+ * Emit the sbox as volatile const to prevent the compiler from doing
+ * constant folding on sbox references involving fixed indexes.
+ */
+static volatile const u8 __cacheline_aligned aes_sbox[] = {
+ 0x63, 0x7c, 0x77, 0x7b, 0xf2, 0x6b, 0x6f, 0xc5,
+ 0x30, 0x01, 0x67, 0x2b, 0xfe, 0xd7, 0xab, 0x76,
+ 0xca, 0x82, 0xc9, 0x7d, 0xfa, 0x59, 0x47, 0xf0,
+ 0xad, 0xd4, 0xa2, 0xaf, 0x9c, 0xa4, 0x72, 0xc0,
+ 0xb7, 0xfd, 0x93, 0x26, 0x36, 0x3f, 0xf7, 0xcc,
+ 0x34, 0xa5, 0xe5, 0xf1, 0x71, 0xd8, 0x31, 0x15,
+ 0x04, 0xc7, 0x23, 0xc3, 0x18, 0x96, 0x05, 0x9a,
+ 0x07, 0x12, 0x80, 0xe2, 0xeb, 0x27, 0xb2, 0x75,
+ 0x09, 0x83, 0x2c, 0x1a, 0x1b, 0x6e, 0x5a, 0xa0,
+ 0x52, 0x3b, 0xd6, 0xb3, 0x29, 0xe3, 0x2f, 0x84,
+ 0x53, 0xd1, 0x00, 0xed, 0x20, 0xfc, 0xb1, 0x5b,
+ 0x6a, 0xcb, 0xbe, 0x39, 0x4a, 0x4c, 0x58, 0xcf,
+ 0xd0, 0xef, 0xaa, 0xfb, 0x43, 0x4d, 0x33, 0x85,
+ 0x45, 0xf9, 0x02, 0x7f, 0x50, 0x3c, 0x9f, 0xa8,
+ 0x51, 0xa3, 0x40, 0x8f, 0x92, 0x9d, 0x38, 0xf5,
+ 0xbc, 0xb6, 0xda, 0x21, 0x10, 0xff, 0xf3, 0xd2,
+ 0xcd, 0x0c, 0x13, 0xec, 0x5f, 0x97, 0x44, 0x17,
+ 0xc4, 0xa7, 0x7e, 0x3d, 0x64, 0x5d, 0x19, 0x73,
+ 0x60, 0x81, 0x4f, 0xdc, 0x22, 0x2a, 0x90, 0x88,
+ 0x46, 0xee, 0xb8, 0x14, 0xde, 0x5e, 0x0b, 0xdb,
+ 0xe0, 0x32, 0x3a, 0x0a, 0x49, 0x06, 0x24, 0x5c,
+ 0xc2, 0xd3, 0xac, 0x62, 0x91, 0x95, 0xe4, 0x79,
+ 0xe7, 0xc8, 0x37, 0x6d, 0x8d, 0xd5, 0x4e, 0xa9,
+ 0x6c, 0x56, 0xf4, 0xea, 0x65, 0x7a, 0xae, 0x08,
+ 0xba, 0x78, 0x25, 0x2e, 0x1c, 0xa6, 0xb4, 0xc6,
+ 0xe8, 0xdd, 0x74, 0x1f, 0x4b, 0xbd, 0x8b, 0x8a,
+ 0x70, 0x3e, 0xb5, 0x66, 0x48, 0x03, 0xf6, 0x0e,
+ 0x61, 0x35, 0x57, 0xb9, 0x86, 0xc1, 0x1d, 0x9e,
+ 0xe1, 0xf8, 0x98, 0x11, 0x69, 0xd9, 0x8e, 0x94,
+ 0x9b, 0x1e, 0x87, 0xe9, 0xce, 0x55, 0x28, 0xdf,
+ 0x8c, 0xa1, 0x89, 0x0d, 0xbf, 0xe6, 0x42, 0x68,
+ 0x41, 0x99, 0x2d, 0x0f, 0xb0, 0x54, 0xbb, 0x16,
+};
+
+static volatile const u8 __cacheline_aligned aes_inv_sbox[] = {
+ 0x52, 0x09, 0x6a, 0xd5, 0x30, 0x36, 0xa5, 0x38,
+ 0xbf, 0x40, 0xa3, 0x9e, 0x81, 0xf3, 0xd7, 0xfb,
+ 0x7c, 0xe3, 0x39, 0x82, 0x9b, 0x2f, 0xff, 0x87,
+ 0x34, 0x8e, 0x43, 0x44, 0xc4, 0xde, 0xe9, 0xcb,
+ 0x54, 0x7b, 0x94, 0x32, 0xa6, 0xc2, 0x23, 0x3d,
+ 0xee, 0x4c, 0x95, 0x0b, 0x42, 0xfa, 0xc3, 0x4e,
+ 0x08, 0x2e, 0xa1, 0x66, 0x28, 0xd9, 0x24, 0xb2,
+ 0x76, 0x5b, 0xa2, 0x49, 0x6d, 0x8b, 0xd1, 0x25,
+ 0x72, 0xf8, 0xf6, 0x64, 0x86, 0x68, 0x98, 0x16,
+ 0xd4, 0xa4, 0x5c, 0xcc, 0x5d, 0x65, 0xb6, 0x92,
+ 0x6c, 0x70, 0x48, 0x50, 0xfd, 0xed, 0xb9, 0xda,
+ 0x5e, 0x15, 0x46, 0x57, 0xa7, 0x8d, 0x9d, 0x84,
+ 0x90, 0xd8, 0xab, 0x00, 0x8c, 0xbc, 0xd3, 0x0a,
+ 0xf7, 0xe4, 0x58, 0x05, 0xb8, 0xb3, 0x45, 0x06,
+ 0xd0, 0x2c, 0x1e, 0x8f, 0xca, 0x3f, 0x0f, 0x02,
+ 0xc1, 0xaf, 0xbd, 0x03, 0x01, 0x13, 0x8a, 0x6b,
+ 0x3a, 0x91, 0x11, 0x41, 0x4f, 0x67, 0xdc, 0xea,
+ 0x97, 0xf2, 0xcf, 0xce, 0xf0, 0xb4, 0xe6, 0x73,
+ 0x96, 0xac, 0x74, 0x22, 0xe7, 0xad, 0x35, 0x85,
+ 0xe2, 0xf9, 0x37, 0xe8, 0x1c, 0x75, 0xdf, 0x6e,
+ 0x47, 0xf1, 0x1a, 0x71, 0x1d, 0x29, 0xc5, 0x89,
+ 0x6f, 0xb7, 0x62, 0x0e, 0xaa, 0x18, 0xbe, 0x1b,
+ 0xfc, 0x56, 0x3e, 0x4b, 0xc6, 0xd2, 0x79, 0x20,
+ 0x9a, 0xdb, 0xc0, 0xfe, 0x78, 0xcd, 0x5a, 0xf4,
+ 0x1f, 0xdd, 0xa8, 0x33, 0x88, 0x07, 0xc7, 0x31,
+ 0xb1, 0x12, 0x10, 0x59, 0x27, 0x80, 0xec, 0x5f,
+ 0x60, 0x51, 0x7f, 0xa9, 0x19, 0xb5, 0x4a, 0x0d,
+ 0x2d, 0xe5, 0x7a, 0x9f, 0x93, 0xc9, 0x9c, 0xef,
+ 0xa0, 0xe0, 0x3b, 0x4d, 0xae, 0x2a, 0xf5, 0xb0,
+ 0xc8, 0xeb, 0xbb, 0x3c, 0x83, 0x53, 0x99, 0x61,
+ 0x17, 0x2b, 0x04, 0x7e, 0xba, 0x77, 0xd6, 0x26,
+ 0xe1, 0x69, 0x14, 0x63, 0x55, 0x21, 0x0c, 0x7d,
+};
+
+static u32 mul_by_x(u32 w)
+{
+ u32 x = w & 0x7f7f7f7f;
+ u32 y = w & 0x80808080;
+
+ /* multiply by polynomial 'x' (0b10) in GF(2^8) */
+ return (x << 1) ^ (y >> 7) * 0x1b;
+}
+
+static u32 mul_by_x2(u32 w)
+{
+ u32 x = w & 0x3f3f3f3f;
+ u32 y = w & 0x80808080;
+ u32 z = w & 0x40404040;
+
+ /* multiply by polynomial 'x^2' (0b100) in GF(2^8) */
+ return (x << 2) ^ (y >> 7) * 0x36 ^ (z >> 6) * 0x1b;
+}
+
+static u32 mix_columns(u32 x)
+{
+ /*
+ * Perform the following matrix multiplication in GF(2^8)
+ *
+ * | 0x2 0x3 0x1 0x1 | | x[0] |
+ * | 0x1 0x2 0x3 0x1 | | x[1] |
+ * | 0x1 0x1 0x2 0x3 | x | x[2] |
+ * | 0x3 0x1 0x1 0x2 | | x[3] |
+ */
+ u32 y = mul_by_x(x) ^ ror32(x, 16);
+
+ return y ^ ror32(x ^ y, 8);
+}
+
+static u32 inv_mix_columns(u32 x)
+{
+ /*
+ * Perform the following matrix multiplication in GF(2^8)
+ *
+ * | 0xe 0xb 0xd 0x9 | | x[0] |
+ * | 0x9 0xe 0xb 0xd | | x[1] |
+ * | 0xd 0x9 0xe 0xb | x | x[2] |
+ * | 0xb 0xd 0x9 0xe | | x[3] |
+ *
+ * which can conveniently be reduced to
+ *
+ * | 0x2 0x3 0x1 0x1 | | 0x5 0x0 0x4 0x0 | | x[0] |
+ * | 0x1 0x2 0x3 0x1 | | 0x0 0x5 0x0 0x4 | | x[1] |
+ * | 0x1 0x1 0x2 0x3 | x | 0x4 0x0 0x5 0x0 | x | x[2] |
+ * | 0x3 0x1 0x1 0x2 | | 0x0 0x4 0x0 0x5 | | x[3] |
+ */
+ u32 y = mul_by_x2(x);
+
+ return mix_columns(x ^ y ^ ror32(y, 16));
+}
+
+static __always_inline u32 subshift(u32 in[], int pos)
+{
+ return (aes_sbox[in[pos] & 0xff]) ^
+ (aes_sbox[(in[(pos + 1) % 4] >> 8) & 0xff] << 8) ^
+ (aes_sbox[(in[(pos + 2) % 4] >> 16) & 0xff] << 16) ^
+ (aes_sbox[(in[(pos + 3) % 4] >> 24) & 0xff] << 24);
+}
+
+static __always_inline u32 inv_subshift(u32 in[], int pos)
+{
+ return (aes_inv_sbox[in[pos] & 0xff]) ^
+ (aes_inv_sbox[(in[(pos + 3) % 4] >> 8) & 0xff] << 8) ^
+ (aes_inv_sbox[(in[(pos + 2) % 4] >> 16) & 0xff] << 16) ^
+ (aes_inv_sbox[(in[(pos + 1) % 4] >> 24) & 0xff] << 24);
+}
+
+static u32 subw(u32 in)
+{
+ return (aes_sbox[in & 0xff]) ^
+ (aes_sbox[(in >> 8) & 0xff] << 8) ^
+ (aes_sbox[(in >> 16) & 0xff] << 16) ^
+ (aes_sbox[(in >> 24) & 0xff] << 24);
+}
+
+/**
+ * aes_expandkey - Expands the AES key as described in FIPS-197
+ * @ctx: The location where the computed key will be stored.
+ * @in_key: The supplied key.
+ * @key_len: The length of the supplied key.
+ *
+ * Returns 0 on success. The function fails only if an invalid key size (or
+ * pointer) is supplied.
+ * The expanded key size is 240 bytes (max of 14 rounds with a unique 16 bytes
+ * key schedule plus a 16 bytes key which is used before the first round).
+ * The decryption key is prepared for the "Equivalent Inverse Cipher" as
+ * described in FIPS-197. The first slot (16 bytes) of each key (enc or dec) is
+ * for the initial combination, the second slot for the first round and so on.
+ */
+int aes_expandkey(struct crypto_aes_ctx *ctx, const u8 *in_key,
+ unsigned int key_len)
+{
+ u32 kwords = key_len / sizeof(u32);
+ u32 rc, i, j;
+
+ if (key_len != AES_KEYSIZE_128 &&
+ key_len != AES_KEYSIZE_192 &&
+ key_len != AES_KEYSIZE_256)
+ return -EINVAL;
+
+ ctx->key_length = key_len;
+
+ for (i = 0; i < kwords; i++)
+ ctx->key_enc[i] = get_unaligned_le32(in_key + i * sizeof(u32));
+
+ for (i = 0, rc = 1; i < 10; i++, rc = mul_by_x(rc)) {
+ u32 *rki = ctx->key_enc + (i * kwords);
+ u32 *rko = rki + kwords;
+
+ rko[0] = ror32(subw(rki[kwords - 1]), 8) ^ rc ^ rki[0];
+ rko[1] = rko[0] ^ rki[1];
+ rko[2] = rko[1] ^ rki[2];
+ rko[3] = rko[2] ^ rki[3];
+
+ if (key_len == AES_KEYSIZE_192) {
+ if (i >= 7)
+ break;
+ rko[4] = rko[3] ^ rki[4];
+ rko[5] = rko[4] ^ rki[5];
+ } else if (key_len == AES_KEYSIZE_256) {
+ if (i >= 6)
+ break;
+ rko[4] = subw(rko[3]) ^ rki[4];
+ rko[5] = rko[4] ^ rki[5];
+ rko[6] = rko[5] ^ rki[6];
+ rko[7] = rko[6] ^ rki[7];
+ }
+ }
+
+ /*
+ * Generate the decryption keys for the Equivalent Inverse Cipher.
+ * This involves reversing the order of the round keys, and applying
+ * the Inverse Mix Columns transformation to all but the first and
+ * the last one.
+ */
+ ctx->key_dec[0] = ctx->key_enc[key_len + 24];
+ ctx->key_dec[1] = ctx->key_enc[key_len + 25];
+ ctx->key_dec[2] = ctx->key_enc[key_len + 26];
+ ctx->key_dec[3] = ctx->key_enc[key_len + 27];
+
+ for (i = 4, j = key_len + 20; j > 0; i += 4, j -= 4) {
+ ctx->key_dec[i] = inv_mix_columns(ctx->key_enc[j]);
+ ctx->key_dec[i + 1] = inv_mix_columns(ctx->key_enc[j + 1]);
+ ctx->key_dec[i + 2] = inv_mix_columns(ctx->key_enc[j + 2]);
+ ctx->key_dec[i + 3] = inv_mix_columns(ctx->key_enc[j + 3]);
+ }
+
+ ctx->key_dec[i] = ctx->key_enc[0];
+ ctx->key_dec[i + 1] = ctx->key_enc[1];
+ ctx->key_dec[i + 2] = ctx->key_enc[2];
+ ctx->key_dec[i + 3] = ctx->key_enc[3];
+
+ return 0;
+}
+EXPORT_SYMBOL(aes_expandkey);
+
+/**
+ * aes_encrypt - Encrypt a single AES block
+ * @ctx: Context struct containing the key schedule
+ * @out: Buffer to store the ciphertext
+ * @in: Buffer containing the plaintext
+ */
+void aes_encrypt(const struct crypto_aes_ctx *ctx, u8 *out, const u8 *in)
+{
+ const u32 *rkp = ctx->key_enc + 4;
+ int rounds = 6 + ctx->key_length / 4;
+ u32 st0[4], st1[4];
+ int round;
+
+ st0[0] = ctx->key_enc[0] ^ get_unaligned_le32(in);
+ st0[1] = ctx->key_enc[1] ^ get_unaligned_le32(in + 4);
+ st0[2] = ctx->key_enc[2] ^ get_unaligned_le32(in + 8);
+ st0[3] = ctx->key_enc[3] ^ get_unaligned_le32(in + 12);
+
+ /*
+ * Force the compiler to emit data independent Sbox references,
+ * by xoring the input with Sbox values that are known to add up
+ * to zero. This pulls the entire Sbox into the D-cache before any
+ * data dependent lookups are done.
+ */
+ st0[0] ^= aes_sbox[ 0] ^ aes_sbox[ 64] ^ aes_sbox[134] ^ aes_sbox[195];
+ st0[1] ^= aes_sbox[16] ^ aes_sbox[ 82] ^ aes_sbox[158] ^ aes_sbox[221];
+ st0[2] ^= aes_sbox[32] ^ aes_sbox[ 96] ^ aes_sbox[160] ^ aes_sbox[234];
+ st0[3] ^= aes_sbox[48] ^ aes_sbox[112] ^ aes_sbox[186] ^ aes_sbox[241];
+
+ for (round = 0;; round += 2, rkp += 8) {
+ st1[0] = mix_columns(subshift(st0, 0)) ^ rkp[0];
+ st1[1] = mix_columns(subshift(st0, 1)) ^ rkp[1];
+ st1[2] = mix_columns(subshift(st0, 2)) ^ rkp[2];
+ st1[3] = mix_columns(subshift(st0, 3)) ^ rkp[3];
+
+ if (round == rounds - 2)
+ break;
+
+ st0[0] = mix_columns(subshift(st1, 0)) ^ rkp[4];
+ st0[1] = mix_columns(subshift(st1, 1)) ^ rkp[5];
+ st0[2] = mix_columns(subshift(st1, 2)) ^ rkp[6];
+ st0[3] = mix_columns(subshift(st1, 3)) ^ rkp[7];
+ }
+
+ put_unaligned_le32(subshift(st1, 0) ^ rkp[4], out);
+ put_unaligned_le32(subshift(st1, 1) ^ rkp[5], out + 4);
+ put_unaligned_le32(subshift(st1, 2) ^ rkp[6], out + 8);
+ put_unaligned_le32(subshift(st1, 3) ^ rkp[7], out + 12);
+}
+EXPORT_SYMBOL(aes_encrypt);
+
+/**
+ * aes_decrypt - Decrypt a single AES block
+ * @ctx: Context struct containing the key schedule
+ * @out: Buffer to store the plaintext
+ * @in: Buffer containing the ciphertext
+ */
+void aes_decrypt(const struct crypto_aes_ctx *ctx, u8 *out, const u8 *in)
+{
+ const u32 *rkp = ctx->key_dec + 4;
+ int rounds = 6 + ctx->key_length / 4;
+ u32 st0[4], st1[4];
+ int round;
+
+ st0[0] = ctx->key_dec[0] ^ get_unaligned_le32(in);
+ st0[1] = ctx->key_dec[1] ^ get_unaligned_le32(in + 4);
+ st0[2] = ctx->key_dec[2] ^ get_unaligned_le32(in + 8);
+ st0[3] = ctx->key_dec[3] ^ get_unaligned_le32(in + 12);
+
+ /*
+ * Force the compiler to emit data independent Sbox references,
+ * by xoring the input with Sbox values that are known to add up
+ * to zero. This pulls the entire Sbox into the D-cache before any
+ * data dependent lookups are done.
+ */
+ st0[0] ^= aes_inv_sbox[ 0] ^ aes_inv_sbox[ 64] ^ aes_inv_sbox[129] ^ aes_inv_sbox[200];
+ st0[1] ^= aes_inv_sbox[16] ^ aes_inv_sbox[ 83] ^ aes_inv_sbox[150] ^ aes_inv_sbox[212];
+ st0[2] ^= aes_inv_sbox[32] ^ aes_inv_sbox[ 96] ^ aes_inv_sbox[160] ^ aes_inv_sbox[236];
+ st0[3] ^= aes_inv_sbox[48] ^ aes_inv_sbox[112] ^ aes_inv_sbox[187] ^ aes_inv_sbox[247];
+
+ for (round = 0;; round += 2, rkp += 8) {
+ st1[0] = inv_mix_columns(inv_subshift(st0, 0)) ^ rkp[0];
+ st1[1] = inv_mix_columns(inv_subshift(st0, 1)) ^ rkp[1];
+ st1[2] = inv_mix_columns(inv_subshift(st0, 2)) ^ rkp[2];
+ st1[3] = inv_mix_columns(inv_subshift(st0, 3)) ^ rkp[3];
+
+ if (round == rounds - 2)
+ break;
+
+ st0[0] = inv_mix_columns(inv_subshift(st1, 0)) ^ rkp[4];
+ st0[1] = inv_mix_columns(inv_subshift(st1, 1)) ^ rkp[5];
+ st0[2] = inv_mix_columns(inv_subshift(st1, 2)) ^ rkp[6];
+ st0[3] = inv_mix_columns(inv_subshift(st1, 3)) ^ rkp[7];
+ }
+
+ put_unaligned_le32(inv_subshift(st1, 0) ^ rkp[4], out);
+ put_unaligned_le32(inv_subshift(st1, 1) ^ rkp[5], out + 4);
+ put_unaligned_le32(inv_subshift(st1, 2) ^ rkp[6], out + 8);
+ put_unaligned_le32(inv_subshift(st1, 3) ^ rkp[7], out + 12);
+}
+EXPORT_SYMBOL(aes_decrypt);
+
+MODULE_DESCRIPTION("Generic AES library");
+MODULE_AUTHOR("Ard Biesheuvel <[email protected]>");
+MODULE_LICENSE("GPL v2");
--
2.17.1

2019-07-02 19:42:37

by Ard Biesheuvel

[permalink] [raw]
Subject: [PATCH v4 07/32] crypto: padlock/aes - switch to library version of key expansion routine

Switch to the new AES library that also provides an implementation of
the AES key expansion routine. This removes the dependency on the
generic AES cipher, allowing it to be omitted entirely in the future.

Signed-off-by: Ard Biesheuvel <[email protected]>
---
drivers/crypto/Kconfig | 2 +-
drivers/crypto/padlock-aes.c | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/crypto/Kconfig b/drivers/crypto/Kconfig
index 67af688d7d84..3fca5f7e38f0 100644
--- a/drivers/crypto/Kconfig
+++ b/drivers/crypto/Kconfig
@@ -26,7 +26,7 @@ config CRYPTO_DEV_PADLOCK_AES
tristate "PadLock driver for AES algorithm"
depends on CRYPTO_DEV_PADLOCK
select CRYPTO_BLKCIPHER
- select CRYPTO_AES
+ select CRYPTO_LIB_AES
help
Use VIA PadLock for AES algorithm.

diff --git a/drivers/crypto/padlock-aes.c b/drivers/crypto/padlock-aes.c
index 854539512c35..af90138eddb7 100644
--- a/drivers/crypto/padlock-aes.c
+++ b/drivers/crypto/padlock-aes.c
@@ -144,7 +144,7 @@ static int aes_set_key(struct crypto_tfm *tfm, const u8 *in_key,
ctx->cword.encrypt.keygen = 1;
ctx->cword.decrypt.keygen = 1;

- if (crypto_aes_expand_key(&gen_aes, in_key, key_len)) {
+ if (aes_expandkey(&gen_aes, in_key, key_len)) {
*flags |= CRYPTO_TFM_RES_BAD_KEY_LEN;
return -EINVAL;
}
--
2.17.1

2019-07-02 19:42:40

by Ard Biesheuvel

[permalink] [raw]
Subject: [PATCH v4 08/32] crypto: cesa/aes - switch to library version of key expansion routine

Switch to the new AES library that also provides an implementation of
the AES key expansion routine. This removes the dependency on the
generic AES cipher, allowing it to be omitted entirely in the future.

Signed-off-by: Ard Biesheuvel <[email protected]>
---
drivers/crypto/Kconfig | 2 +-
drivers/crypto/marvell/cipher.c | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/crypto/Kconfig b/drivers/crypto/Kconfig
index 3fca5f7e38f0..fdccadc94819 100644
--- a/drivers/crypto/Kconfig
+++ b/drivers/crypto/Kconfig
@@ -213,7 +213,7 @@ config CRYPTO_CRC32_S390
config CRYPTO_DEV_MARVELL_CESA
tristate "Marvell's Cryptographic Engine driver"
depends on PLAT_ORION || ARCH_MVEBU
- select CRYPTO_AES
+ select CRYPTO_LIB_AES
select CRYPTO_DES
select CRYPTO_BLKCIPHER
select CRYPTO_HASH
diff --git a/drivers/crypto/marvell/cipher.c b/drivers/crypto/marvell/cipher.c
index 2fd936b19c6d..debe7d9f00ae 100644
--- a/drivers/crypto/marvell/cipher.c
+++ b/drivers/crypto/marvell/cipher.c
@@ -257,7 +257,7 @@ static int mv_cesa_aes_setkey(struct crypto_skcipher *cipher, const u8 *key,
int ret;
int i;

- ret = crypto_aes_expand_key(&ctx->aes, key, len);
+ ret = aes_expandkey(&ctx->aes, key, len);
if (ret) {
crypto_skcipher_set_flags(cipher, CRYPTO_TFM_RES_BAD_KEY_LEN);
return ret;
--
2.17.1

2019-07-02 19:42:40

by Ard Biesheuvel

[permalink] [raw]
Subject: [PATCH v4 06/32] crypto: x86/aes - drop scalar assembler implementations

The AES assembler code for x86 isn't actually faster than code
generated by the compiler from aes_generic.c, and considering
the disproportionate maintenance burden of assembler code on
x86, it is better just to drop it entirely. Modern x86 systems
will use AES-NI anyway, and given that the modules being removed
have a dependency on aes_generic already, we can remove them
without running the risk of regressions.

Signed-off-by: Ard Biesheuvel <[email protected]>
---
arch/x86/crypto/Makefile | 4 -
arch/x86/crypto/aes-i586-asm_32.S | 362 --------------------
arch/x86/crypto/aes-x86_64-asm_64.S | 185 ----------
arch/x86/crypto/aes_glue.c | 70 ----
crypto/Kconfig | 44 ---
5 files changed, 665 deletions(-)

diff --git a/arch/x86/crypto/Makefile b/arch/x86/crypto/Makefile
index 45734e1cf967..b96a14e67ab0 100644
--- a/arch/x86/crypto/Makefile
+++ b/arch/x86/crypto/Makefile
@@ -14,11 +14,9 @@ sha256_ni_supported :=$(call as-instr,sha256msg1 %xmm0$(comma)%xmm1,yes,no)

obj-$(CONFIG_CRYPTO_GLUE_HELPER_X86) += glue_helper.o

-obj-$(CONFIG_CRYPTO_AES_586) += aes-i586.o
obj-$(CONFIG_CRYPTO_TWOFISH_586) += twofish-i586.o
obj-$(CONFIG_CRYPTO_SERPENT_SSE2_586) += serpent-sse2-i586.o

-obj-$(CONFIG_CRYPTO_AES_X86_64) += aes-x86_64.o
obj-$(CONFIG_CRYPTO_DES3_EDE_X86_64) += des3_ede-x86_64.o
obj-$(CONFIG_CRYPTO_CAMELLIA_X86_64) += camellia-x86_64.o
obj-$(CONFIG_CRYPTO_BLOWFISH_X86_64) += blowfish-x86_64.o
@@ -68,11 +66,9 @@ ifeq ($(avx2_supported),yes)
obj-$(CONFIG_CRYPTO_MORUS1280_AVX2) += morus1280-avx2.o
endif

-aes-i586-y := aes-i586-asm_32.o aes_glue.o
twofish-i586-y := twofish-i586-asm_32.o twofish_glue.o
serpent-sse2-i586-y := serpent-sse2-i586-asm_32.o serpent_sse2_glue.o

-aes-x86_64-y := aes-x86_64-asm_64.o aes_glue.o
des3_ede-x86_64-y := des3_ede-asm_64.o des3_ede_glue.o
camellia-x86_64-y := camellia-x86_64-asm_64.o camellia_glue.o
blowfish-x86_64-y := blowfish-x86_64-asm_64.o blowfish_glue.o
diff --git a/arch/x86/crypto/aes-i586-asm_32.S b/arch/x86/crypto/aes-i586-asm_32.S
deleted file mode 100644
index 2849dbc59e11..000000000000
--- a/arch/x86/crypto/aes-i586-asm_32.S
+++ /dev/null
@@ -1,362 +0,0 @@
-// -------------------------------------------------------------------------
-// Copyright (c) 2001, Dr Brian Gladman < >, Worcester, UK.
-// All rights reserved.
-//
-// LICENSE TERMS
-//
-// The free distribution and use of this software in both source and binary
-// form is allowed (with or without changes) provided that:
-//
-// 1. distributions of this source code include the above copyright
-// notice, this list of conditions and the following disclaimer//
-//
-// 2. distributions in binary form include the above copyright
-// notice, this list of conditions and the following disclaimer
-// in the documentation and/or other associated materials//
-//
-// 3. the copyright holder's name is not used to endorse products
-// built using this software without specific written permission.
-//
-//
-// ALTERNATIVELY, provided that this notice is retained in full, this product
-// may be distributed under the terms of the GNU General Public License (GPL),
-// in which case the provisions of the GPL apply INSTEAD OF those given above.
-//
-// Copyright (c) 2004 Linus Torvalds <[email protected]>
-// Copyright (c) 2004 Red Hat, Inc., James Morris <[email protected]>
-
-// DISCLAIMER
-//
-// This software is provided 'as is' with no explicit or implied warranties
-// in respect of its properties including, but not limited to, correctness
-// and fitness for purpose.
-// -------------------------------------------------------------------------
-// Issue Date: 29/07/2002
-
-.file "aes-i586-asm.S"
-.text
-
-#include <linux/linkage.h>
-#include <asm/asm-offsets.h>
-
-#define tlen 1024 // length of each of 4 'xor' arrays (256 32-bit words)
-
-/* offsets to parameters with one register pushed onto stack */
-#define ctx 8
-#define out_blk 12
-#define in_blk 16
-
-/* offsets in crypto_aes_ctx structure */
-#define klen (480)
-#define ekey (0)
-#define dkey (240)
-
-// register mapping for encrypt and decrypt subroutines
-
-#define r0 eax
-#define r1 ebx
-#define r2 ecx
-#define r3 edx
-#define r4 esi
-#define r5 edi
-
-#define eaxl al
-#define eaxh ah
-#define ebxl bl
-#define ebxh bh
-#define ecxl cl
-#define ecxh ch
-#define edxl dl
-#define edxh dh
-
-#define _h(reg) reg##h
-#define h(reg) _h(reg)
-
-#define _l(reg) reg##l
-#define l(reg) _l(reg)
-
-// This macro takes a 32-bit word representing a column and uses
-// each of its four bytes to index into four tables of 256 32-bit
-// words to obtain values that are then xored into the appropriate
-// output registers r0, r1, r4 or r5.
-
-// Parameters:
-// table table base address
-// %1 out_state[0]
-// %2 out_state[1]
-// %3 out_state[2]
-// %4 out_state[3]
-// idx input register for the round (destroyed)
-// tmp scratch register for the round
-// sched key schedule
-
-#define do_col(table, a1,a2,a3,a4, idx, tmp) \
- movzx %l(idx),%tmp; \
- xor table(,%tmp,4),%a1; \
- movzx %h(idx),%tmp; \
- shr $16,%idx; \
- xor table+tlen(,%tmp,4),%a2; \
- movzx %l(idx),%tmp; \
- movzx %h(idx),%idx; \
- xor table+2*tlen(,%tmp,4),%a3; \
- xor table+3*tlen(,%idx,4),%a4;
-
-// initialise output registers from the key schedule
-// NB1: original value of a3 is in idx on exit
-// NB2: original values of a1,a2,a4 aren't used
-#define do_fcol(table, a1,a2,a3,a4, idx, tmp, sched) \
- mov 0 sched,%a1; \
- movzx %l(idx),%tmp; \
- mov 12 sched,%a2; \
- xor table(,%tmp,4),%a1; \
- mov 4 sched,%a4; \
- movzx %h(idx),%tmp; \
- shr $16,%idx; \
- xor table+tlen(,%tmp,4),%a2; \
- movzx %l(idx),%tmp; \
- movzx %h(idx),%idx; \
- xor table+3*tlen(,%idx,4),%a4; \
- mov %a3,%idx; \
- mov 8 sched,%a3; \
- xor table+2*tlen(,%tmp,4),%a3;
-
-// initialise output registers from the key schedule
-// NB1: original value of a3 is in idx on exit
-// NB2: original values of a1,a2,a4 aren't used
-#define do_icol(table, a1,a2,a3,a4, idx, tmp, sched) \
- mov 0 sched,%a1; \
- movzx %l(idx),%tmp; \
- mov 4 sched,%a2; \
- xor table(,%tmp,4),%a1; \
- mov 12 sched,%a4; \
- movzx %h(idx),%tmp; \
- shr $16,%idx; \
- xor table+tlen(,%tmp,4),%a2; \
- movzx %l(idx),%tmp; \
- movzx %h(idx),%idx; \
- xor table+3*tlen(,%idx,4),%a4; \
- mov %a3,%idx; \
- mov 8 sched,%a3; \
- xor table+2*tlen(,%tmp,4),%a3;
-
-
-// original Gladman had conditional saves to MMX regs.
-#define save(a1, a2) \
- mov %a2,4*a1(%esp)
-
-#define restore(a1, a2) \
- mov 4*a2(%esp),%a1
-
-// These macros perform a forward encryption cycle. They are entered with
-// the first previous round column values in r0,r1,r4,r5 and
-// exit with the final values in the same registers, using stack
-// for temporary storage.
-
-// round column values
-// on entry: r0,r1,r4,r5
-// on exit: r2,r1,r4,r5
-#define fwd_rnd1(arg, table) \
- save (0,r1); \
- save (1,r5); \
- \
- /* compute new column values */ \
- do_fcol(table, r2,r5,r4,r1, r0,r3, arg); /* idx=r0 */ \
- do_col (table, r4,r1,r2,r5, r0,r3); /* idx=r4 */ \
- restore(r0,0); \
- do_col (table, r1,r2,r5,r4, r0,r3); /* idx=r1 */ \
- restore(r0,1); \
- do_col (table, r5,r4,r1,r2, r0,r3); /* idx=r5 */
-
-// round column values
-// on entry: r2,r1,r4,r5
-// on exit: r0,r1,r4,r5
-#define fwd_rnd2(arg, table) \
- save (0,r1); \
- save (1,r5); \
- \
- /* compute new column values */ \
- do_fcol(table, r0,r5,r4,r1, r2,r3, arg); /* idx=r2 */ \
- do_col (table, r4,r1,r0,r5, r2,r3); /* idx=r4 */ \
- restore(r2,0); \
- do_col (table, r1,r0,r5,r4, r2,r3); /* idx=r1 */ \
- restore(r2,1); \
- do_col (table, r5,r4,r1,r0, r2,r3); /* idx=r5 */
-
-// These macros performs an inverse encryption cycle. They are entered with
-// the first previous round column values in r0,r1,r4,r5 and
-// exit with the final values in the same registers, using stack
-// for temporary storage
-
-// round column values
-// on entry: r0,r1,r4,r5
-// on exit: r2,r1,r4,r5
-#define inv_rnd1(arg, table) \
- save (0,r1); \
- save (1,r5); \
- \
- /* compute new column values */ \
- do_icol(table, r2,r1,r4,r5, r0,r3, arg); /* idx=r0 */ \
- do_col (table, r4,r5,r2,r1, r0,r3); /* idx=r4 */ \
- restore(r0,0); \
- do_col (table, r1,r4,r5,r2, r0,r3); /* idx=r1 */ \
- restore(r0,1); \
- do_col (table, r5,r2,r1,r4, r0,r3); /* idx=r5 */
-
-// round column values
-// on entry: r2,r1,r4,r5
-// on exit: r0,r1,r4,r5
-#define inv_rnd2(arg, table) \
- save (0,r1); \
- save (1,r5); \
- \
- /* compute new column values */ \
- do_icol(table, r0,r1,r4,r5, r2,r3, arg); /* idx=r2 */ \
- do_col (table, r4,r5,r0,r1, r2,r3); /* idx=r4 */ \
- restore(r2,0); \
- do_col (table, r1,r4,r5,r0, r2,r3); /* idx=r1 */ \
- restore(r2,1); \
- do_col (table, r5,r0,r1,r4, r2,r3); /* idx=r5 */
-
-// AES (Rijndael) Encryption Subroutine
-/* void aes_enc_blk(struct crypto_aes_ctx *ctx, u8 *out_blk, const u8 *in_blk) */
-
-.extern crypto_ft_tab
-.extern crypto_fl_tab
-
-ENTRY(aes_enc_blk)
- push %ebp
- mov ctx(%esp),%ebp
-
-// CAUTION: the order and the values used in these assigns
-// rely on the register mappings
-
-1: push %ebx
- mov in_blk+4(%esp),%r2
- push %esi
- mov klen(%ebp),%r3 // key size
- push %edi
-#if ekey != 0
- lea ekey(%ebp),%ebp // key pointer
-#endif
-
-// input four columns and xor in first round key
-
- mov (%r2),%r0
- mov 4(%r2),%r1
- mov 8(%r2),%r4
- mov 12(%r2),%r5
- xor (%ebp),%r0
- xor 4(%ebp),%r1
- xor 8(%ebp),%r4
- xor 12(%ebp),%r5
-
- sub $8,%esp // space for register saves on stack
- add $16,%ebp // increment to next round key
- cmp $24,%r3
- jb 4f // 10 rounds for 128-bit key
- lea 32(%ebp),%ebp
- je 3f // 12 rounds for 192-bit key
- lea 32(%ebp),%ebp
-
-2: fwd_rnd1( -64(%ebp), crypto_ft_tab) // 14 rounds for 256-bit key
- fwd_rnd2( -48(%ebp), crypto_ft_tab)
-3: fwd_rnd1( -32(%ebp), crypto_ft_tab) // 12 rounds for 192-bit key
- fwd_rnd2( -16(%ebp), crypto_ft_tab)
-4: fwd_rnd1( (%ebp), crypto_ft_tab) // 10 rounds for 128-bit key
- fwd_rnd2( +16(%ebp), crypto_ft_tab)
- fwd_rnd1( +32(%ebp), crypto_ft_tab)
- fwd_rnd2( +48(%ebp), crypto_ft_tab)
- fwd_rnd1( +64(%ebp), crypto_ft_tab)
- fwd_rnd2( +80(%ebp), crypto_ft_tab)
- fwd_rnd1( +96(%ebp), crypto_ft_tab)
- fwd_rnd2(+112(%ebp), crypto_ft_tab)
- fwd_rnd1(+128(%ebp), crypto_ft_tab)
- fwd_rnd2(+144(%ebp), crypto_fl_tab) // last round uses a different table
-
-// move final values to the output array. CAUTION: the
-// order of these assigns rely on the register mappings
-
- add $8,%esp
- mov out_blk+12(%esp),%ebp
- mov %r5,12(%ebp)
- pop %edi
- mov %r4,8(%ebp)
- pop %esi
- mov %r1,4(%ebp)
- pop %ebx
- mov %r0,(%ebp)
- pop %ebp
- ret
-ENDPROC(aes_enc_blk)
-
-// AES (Rijndael) Decryption Subroutine
-/* void aes_dec_blk(struct crypto_aes_ctx *ctx, u8 *out_blk, const u8 *in_blk) */
-
-.extern crypto_it_tab
-.extern crypto_il_tab
-
-ENTRY(aes_dec_blk)
- push %ebp
- mov ctx(%esp),%ebp
-
-// CAUTION: the order and the values used in these assigns
-// rely on the register mappings
-
-1: push %ebx
- mov in_blk+4(%esp),%r2
- push %esi
- mov klen(%ebp),%r3 // key size
- push %edi
-#if dkey != 0
- lea dkey(%ebp),%ebp // key pointer
-#endif
-
-// input four columns and xor in first round key
-
- mov (%r2),%r0
- mov 4(%r2),%r1
- mov 8(%r2),%r4
- mov 12(%r2),%r5
- xor (%ebp),%r0
- xor 4(%ebp),%r1
- xor 8(%ebp),%r4
- xor 12(%ebp),%r5
-
- sub $8,%esp // space for register saves on stack
- add $16,%ebp // increment to next round key
- cmp $24,%r3
- jb 4f // 10 rounds for 128-bit key
- lea 32(%ebp),%ebp
- je 3f // 12 rounds for 192-bit key
- lea 32(%ebp),%ebp
-
-2: inv_rnd1( -64(%ebp), crypto_it_tab) // 14 rounds for 256-bit key
- inv_rnd2( -48(%ebp), crypto_it_tab)
-3: inv_rnd1( -32(%ebp), crypto_it_tab) // 12 rounds for 192-bit key
- inv_rnd2( -16(%ebp), crypto_it_tab)
-4: inv_rnd1( (%ebp), crypto_it_tab) // 10 rounds for 128-bit key
- inv_rnd2( +16(%ebp), crypto_it_tab)
- inv_rnd1( +32(%ebp), crypto_it_tab)
- inv_rnd2( +48(%ebp), crypto_it_tab)
- inv_rnd1( +64(%ebp), crypto_it_tab)
- inv_rnd2( +80(%ebp), crypto_it_tab)
- inv_rnd1( +96(%ebp), crypto_it_tab)
- inv_rnd2(+112(%ebp), crypto_it_tab)
- inv_rnd1(+128(%ebp), crypto_it_tab)
- inv_rnd2(+144(%ebp), crypto_il_tab) // last round uses a different table
-
-// move final values to the output array. CAUTION: the
-// order of these assigns rely on the register mappings
-
- add $8,%esp
- mov out_blk+12(%esp),%ebp
- mov %r5,12(%ebp)
- pop %edi
- mov %r4,8(%ebp)
- pop %esi
- mov %r1,4(%ebp)
- pop %ebx
- mov %r0,(%ebp)
- pop %ebp
- ret
-ENDPROC(aes_dec_blk)
diff --git a/arch/x86/crypto/aes-x86_64-asm_64.S b/arch/x86/crypto/aes-x86_64-asm_64.S
deleted file mode 100644
index 8739cf7795de..000000000000
--- a/arch/x86/crypto/aes-x86_64-asm_64.S
+++ /dev/null
@@ -1,185 +0,0 @@
-/* AES (Rijndael) implementation (FIPS PUB 197) for x86_64
- *
- * Copyright (C) 2005 Andreas Steinmetz, <[email protected]>
- *
- * License:
- * This code can be distributed under the terms of the GNU General Public
- * License (GPL) Version 2 provided that the above header down to and
- * including this sentence is retained in full.
- */
-
-.extern crypto_ft_tab
-.extern crypto_it_tab
-.extern crypto_fl_tab
-.extern crypto_il_tab
-
-.text
-
-#include <linux/linkage.h>
-#include <asm/asm-offsets.h>
-
-#define R1 %rax
-#define R1E %eax
-#define R1X %ax
-#define R1H %ah
-#define R1L %al
-#define R2 %rbx
-#define R2E %ebx
-#define R2X %bx
-#define R2H %bh
-#define R2L %bl
-#define R3 %rcx
-#define R3E %ecx
-#define R3X %cx
-#define R3H %ch
-#define R3L %cl
-#define R4 %rdx
-#define R4E %edx
-#define R4X %dx
-#define R4H %dh
-#define R4L %dl
-#define R5 %rsi
-#define R5E %esi
-#define R6 %rdi
-#define R6E %edi
-#define R7 %r9 /* don't use %rbp; it breaks stack traces */
-#define R7E %r9d
-#define R8 %r8
-#define R10 %r10
-#define R11 %r11
-
-#define prologue(FUNC,KEY,B128,B192,r1,r2,r5,r6,r7,r8,r9,r10,r11) \
- ENTRY(FUNC); \
- movq r1,r2; \
- leaq KEY+48(r8),r9; \
- movq r10,r11; \
- movl (r7),r5 ## E; \
- movl 4(r7),r1 ## E; \
- movl 8(r7),r6 ## E; \
- movl 12(r7),r7 ## E; \
- movl 480(r8),r10 ## E; \
- xorl -48(r9),r5 ## E; \
- xorl -44(r9),r1 ## E; \
- xorl -40(r9),r6 ## E; \
- xorl -36(r9),r7 ## E; \
- cmpl $24,r10 ## E; \
- jb B128; \
- leaq 32(r9),r9; \
- je B192; \
- leaq 32(r9),r9;
-
-#define epilogue(FUNC,r1,r2,r5,r6,r7,r8,r9) \
- movq r1,r2; \
- movl r5 ## E,(r9); \
- movl r6 ## E,4(r9); \
- movl r7 ## E,8(r9); \
- movl r8 ## E,12(r9); \
- ret; \
- ENDPROC(FUNC);
-
-#define round(TAB,OFFSET,r1,r2,r3,r4,r5,r6,r7,r8,ra,rb,rc,rd) \
- movzbl r2 ## H,r5 ## E; \
- movzbl r2 ## L,r6 ## E; \
- movl TAB+1024(,r5,4),r5 ## E;\
- movw r4 ## X,r2 ## X; \
- movl TAB(,r6,4),r6 ## E; \
- roll $16,r2 ## E; \
- shrl $16,r4 ## E; \
- movzbl r4 ## L,r7 ## E; \
- movzbl r4 ## H,r4 ## E; \
- xorl OFFSET(r8),ra ## E; \
- xorl OFFSET+4(r8),rb ## E; \
- xorl TAB+3072(,r4,4),r5 ## E;\
- xorl TAB+2048(,r7,4),r6 ## E;\
- movzbl r1 ## L,r7 ## E; \
- movzbl r1 ## H,r4 ## E; \
- movl TAB+1024(,r4,4),r4 ## E;\
- movw r3 ## X,r1 ## X; \
- roll $16,r1 ## E; \
- shrl $16,r3 ## E; \
- xorl TAB(,r7,4),r5 ## E; \
- movzbl r3 ## L,r7 ## E; \
- movzbl r3 ## H,r3 ## E; \
- xorl TAB+3072(,r3,4),r4 ## E;\
- xorl TAB+2048(,r7,4),r5 ## E;\
- movzbl r1 ## L,r7 ## E; \
- movzbl r1 ## H,r3 ## E; \
- shrl $16,r1 ## E; \
- xorl TAB+3072(,r3,4),r6 ## E;\
- movl TAB+2048(,r7,4),r3 ## E;\
- movzbl r1 ## L,r7 ## E; \
- movzbl r1 ## H,r1 ## E; \
- xorl TAB+1024(,r1,4),r6 ## E;\
- xorl TAB(,r7,4),r3 ## E; \
- movzbl r2 ## H,r1 ## E; \
- movzbl r2 ## L,r7 ## E; \
- shrl $16,r2 ## E; \
- xorl TAB+3072(,r1,4),r3 ## E;\
- xorl TAB+2048(,r7,4),r4 ## E;\
- movzbl r2 ## H,r1 ## E; \
- movzbl r2 ## L,r2 ## E; \
- xorl OFFSET+8(r8),rc ## E; \
- xorl OFFSET+12(r8),rd ## E; \
- xorl TAB+1024(,r1,4),r3 ## E;\
- xorl TAB(,r2,4),r4 ## E;
-
-#define move_regs(r1,r2,r3,r4) \
- movl r3 ## E,r1 ## E; \
- movl r4 ## E,r2 ## E;
-
-#define entry(FUNC,KEY,B128,B192) \
- prologue(FUNC,KEY,B128,B192,R2,R8,R1,R3,R4,R6,R10,R5,R11)
-
-#define return(FUNC) epilogue(FUNC,R8,R2,R5,R6,R3,R4,R11)
-
-#define encrypt_round(TAB,OFFSET) \
- round(TAB,OFFSET,R1,R2,R3,R4,R5,R6,R7,R10,R5,R6,R3,R4) \
- move_regs(R1,R2,R5,R6)
-
-#define encrypt_final(TAB,OFFSET) \
- round(TAB,OFFSET,R1,R2,R3,R4,R5,R6,R7,R10,R5,R6,R3,R4)
-
-#define decrypt_round(TAB,OFFSET) \
- round(TAB,OFFSET,R2,R1,R4,R3,R6,R5,R7,R10,R5,R6,R3,R4) \
- move_regs(R1,R2,R5,R6)
-
-#define decrypt_final(TAB,OFFSET) \
- round(TAB,OFFSET,R2,R1,R4,R3,R6,R5,R7,R10,R5,R6,R3,R4)
-
-/* void aes_enc_blk(stuct crypto_tfm *tfm, u8 *out, const u8 *in) */
-
- entry(aes_enc_blk,0,.Le128,.Le192)
- encrypt_round(crypto_ft_tab,-96)
- encrypt_round(crypto_ft_tab,-80)
-.Le192: encrypt_round(crypto_ft_tab,-64)
- encrypt_round(crypto_ft_tab,-48)
-.Le128: encrypt_round(crypto_ft_tab,-32)
- encrypt_round(crypto_ft_tab,-16)
- encrypt_round(crypto_ft_tab, 0)
- encrypt_round(crypto_ft_tab, 16)
- encrypt_round(crypto_ft_tab, 32)
- encrypt_round(crypto_ft_tab, 48)
- encrypt_round(crypto_ft_tab, 64)
- encrypt_round(crypto_ft_tab, 80)
- encrypt_round(crypto_ft_tab, 96)
- encrypt_final(crypto_fl_tab,112)
- return(aes_enc_blk)
-
-/* void aes_dec_blk(struct crypto_tfm *tfm, u8 *out, const u8 *in) */
-
- entry(aes_dec_blk,240,.Ld128,.Ld192)
- decrypt_round(crypto_it_tab,-96)
- decrypt_round(crypto_it_tab,-80)
-.Ld192: decrypt_round(crypto_it_tab,-64)
- decrypt_round(crypto_it_tab,-48)
-.Ld128: decrypt_round(crypto_it_tab,-32)
- decrypt_round(crypto_it_tab,-16)
- decrypt_round(crypto_it_tab, 0)
- decrypt_round(crypto_it_tab, 16)
- decrypt_round(crypto_it_tab, 32)
- decrypt_round(crypto_it_tab, 48)
- decrypt_round(crypto_it_tab, 64)
- decrypt_round(crypto_it_tab, 80)
- decrypt_round(crypto_it_tab, 96)
- decrypt_final(crypto_il_tab,112)
- return(aes_dec_blk)
diff --git a/arch/x86/crypto/aes_glue.c b/arch/x86/crypto/aes_glue.c
deleted file mode 100644
index e26984f7ab8d..000000000000
--- a/arch/x86/crypto/aes_glue.c
+++ /dev/null
@@ -1,70 +0,0 @@
-/*
- * Glue Code for the asm optimized version of the AES Cipher Algorithm
- *
- */
-
-#include <linux/module.h>
-#include <crypto/aes.h>
-#include <asm/crypto/aes.h>
-
-asmlinkage void aes_enc_blk(struct crypto_aes_ctx *ctx, u8 *out, const u8 *in);
-asmlinkage void aes_dec_blk(struct crypto_aes_ctx *ctx, u8 *out, const u8 *in);
-
-void crypto_aes_encrypt_x86(struct crypto_aes_ctx *ctx, u8 *dst, const u8 *src)
-{
- aes_enc_blk(ctx, dst, src);
-}
-EXPORT_SYMBOL_GPL(crypto_aes_encrypt_x86);
-
-void crypto_aes_decrypt_x86(struct crypto_aes_ctx *ctx, u8 *dst, const u8 *src)
-{
- aes_dec_blk(ctx, dst, src);
-}
-EXPORT_SYMBOL_GPL(crypto_aes_decrypt_x86);
-
-static void aes_encrypt(struct crypto_tfm *tfm, u8 *dst, const u8 *src)
-{
- aes_enc_blk(crypto_tfm_ctx(tfm), dst, src);
-}
-
-static void aes_decrypt(struct crypto_tfm *tfm, u8 *dst, const u8 *src)
-{
- aes_dec_blk(crypto_tfm_ctx(tfm), dst, src);
-}
-
-static struct crypto_alg aes_alg = {
- .cra_name = "aes",
- .cra_driver_name = "aes-asm",
- .cra_priority = 200,
- .cra_flags = CRYPTO_ALG_TYPE_CIPHER,
- .cra_blocksize = AES_BLOCK_SIZE,
- .cra_ctxsize = sizeof(struct crypto_aes_ctx),
- .cra_module = THIS_MODULE,
- .cra_u = {
- .cipher = {
- .cia_min_keysize = AES_MIN_KEY_SIZE,
- .cia_max_keysize = AES_MAX_KEY_SIZE,
- .cia_setkey = crypto_aes_set_key,
- .cia_encrypt = aes_encrypt,
- .cia_decrypt = aes_decrypt
- }
- }
-};
-
-static int __init aes_init(void)
-{
- return crypto_register_alg(&aes_alg);
-}
-
-static void __exit aes_fini(void)
-{
- crypto_unregister_alg(&aes_alg);
-}
-
-module_init(aes_init);
-module_exit(aes_fini);
-
-MODULE_DESCRIPTION("Rijndael (AES) Cipher Algorithm, asm optimized");
-MODULE_LICENSE("GPL");
-MODULE_ALIAS_CRYPTO("aes");
-MODULE_ALIAS_CRYPTO("aes-asm");
diff --git a/crypto/Kconfig b/crypto/Kconfig
index 20af58068e6b..df6f0be66574 100644
--- a/crypto/Kconfig
+++ b/crypto/Kconfig
@@ -1108,50 +1108,6 @@ config CRYPTO_AES_TI
block. Interrupts are also disabled to avoid races where cachelines
are evicted when the CPU is interrupted to do something else.

-config CRYPTO_AES_586
- tristate "AES cipher algorithms (i586)"
- depends on (X86 || UML_X86) && !64BIT
- select CRYPTO_ALGAPI
- select CRYPTO_AES
- help
- AES cipher algorithms (FIPS-197). AES uses the Rijndael
- algorithm.
-
- Rijndael appears to be consistently a very good performer in
- both hardware and software across a wide range of computing
- environments regardless of its use in feedback or non-feedback
- modes. Its key setup time is excellent, and its key agility is
- good. Rijndael's very low memory requirements make it very well
- suited for restricted-space environments, in which it also
- demonstrates excellent performance. Rijndael's operations are
- among the easiest to defend against power and timing attacks.
-
- The AES specifies three key sizes: 128, 192 and 256 bits
-
- See <http://csrc.nist.gov/encryption/aes/> for more information.
-
-config CRYPTO_AES_X86_64
- tristate "AES cipher algorithms (x86_64)"
- depends on (X86 || UML_X86) && 64BIT
- select CRYPTO_ALGAPI
- select CRYPTO_AES
- help
- AES cipher algorithms (FIPS-197). AES uses the Rijndael
- algorithm.
-
- Rijndael appears to be consistently a very good performer in
- both hardware and software across a wide range of computing
- environments regardless of its use in feedback or non-feedback
- modes. Its key setup time is excellent, and its key agility is
- good. Rijndael's very low memory requirements make it very well
- suited for restricted-space environments, in which it also
- demonstrates excellent performance. Rijndael's operations are
- among the easiest to defend against power and timing attacks.
-
- The AES specifies three key sizes: 128, 192 and 256 bits
-
- See <http://csrc.nist.gov/encryption/aes/> for more information.
-
config CRYPTO_AES_NI_INTEL
tristate "AES cipher algorithms (AES-NI)"
depends on X86
--
2.17.1

2019-07-02 19:42:41

by Ard Biesheuvel

[permalink] [raw]
Subject: [PATCH v4 09/32] crypto: safexcel/aes - switch to library version of key expansion routine

Switch to the new AES library that also provides an implementation of
the AES key expansion routine. This removes the dependency on the
generic AES cipher, allowing it to be omitted entirely in the future.

Signed-off-by: Ard Biesheuvel <[email protected]>
---
drivers/crypto/Kconfig | 2 +-
drivers/crypto/inside-secure/safexcel_cipher.c | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/crypto/Kconfig b/drivers/crypto/Kconfig
index fdccadc94819..b30b84089d11 100644
--- a/drivers/crypto/Kconfig
+++ b/drivers/crypto/Kconfig
@@ -718,7 +718,7 @@ config CRYPTO_DEV_SAFEXCEL
tristate "Inside Secure's SafeXcel cryptographic engine driver"
depends on OF
depends on (ARM64 && ARCH_MVEBU) || (COMPILE_TEST && 64BIT)
- select CRYPTO_AES
+ select CRYPTO_LIB_AES
select CRYPTO_AUTHENC
select CRYPTO_BLKCIPHER
select CRYPTO_DES
diff --git a/drivers/crypto/inside-secure/safexcel_cipher.c b/drivers/crypto/inside-secure/safexcel_cipher.c
index 8cdbdbe35681..19ec086dce4f 100644
--- a/drivers/crypto/inside-secure/safexcel_cipher.c
+++ b/drivers/crypto/inside-secure/safexcel_cipher.c
@@ -178,7 +178,7 @@ static int safexcel_skcipher_aes_setkey(struct crypto_skcipher *ctfm,
struct crypto_aes_ctx aes;
int ret, i;

- ret = crypto_aes_expand_key(&aes, key, len);
+ ret = aes_expandkey(&aes, key, len);
if (ret) {
crypto_skcipher_set_flags(ctfm, CRYPTO_TFM_RES_BAD_KEY_LEN);
return ret;
--
2.17.1

2019-07-02 19:42:42

by Ard Biesheuvel

[permalink] [raw]
Subject: [PATCH v4 10/32] crypto: arm64/ghash - switch to AES library

The GHASH code uses the generic AES key expansion routines, and calls
directly into the scalar table based AES cipher for arm64 from the
fallback path, and since this implementation is known to be non-time
invariant, doing so from a time invariant SIMD cipher is a bit nasty.

So let's switch to the AES library - this makes the code more robust,
and drops the dependency on the generic AES cipher, allowing us to
omit it entirely in the future.

Signed-off-by: Ard Biesheuvel <[email protected]>
---
arch/arm64/crypto/Kconfig | 3 +-
arch/arm64/crypto/ghash-ce-glue.c | 30 +++++++-------------
2 files changed, 11 insertions(+), 22 deletions(-)

diff --git a/arch/arm64/crypto/Kconfig b/arch/arm64/crypto/Kconfig
index d9a523ecdd83..1762055e7093 100644
--- a/arch/arm64/crypto/Kconfig
+++ b/arch/arm64/crypto/Kconfig
@@ -58,8 +58,7 @@ config CRYPTO_GHASH_ARM64_CE
depends on KERNEL_MODE_NEON
select CRYPTO_HASH
select CRYPTO_GF128MUL
- select CRYPTO_AES
- select CRYPTO_AES_ARM64
+ select CRYPTO_LIB_AES

config CRYPTO_CRCT10DIF_ARM64_CE
tristate "CRCT10DIF digest algorithm using PMULL instructions"
diff --git a/arch/arm64/crypto/ghash-ce-glue.c b/arch/arm64/crypto/ghash-ce-glue.c
index b39ed99b06fb..90496765d22f 100644
--- a/arch/arm64/crypto/ghash-ce-glue.c
+++ b/arch/arm64/crypto/ghash-ce-glue.c
@@ -73,8 +73,6 @@ asmlinkage void pmull_gcm_decrypt(int blocks, u64 dg[], u8 dst[],
asmlinkage void pmull_gcm_encrypt_block(u8 dst[], u8 const src[],
u32 const rk[], int rounds);

-asmlinkage void __aes_arm64_encrypt(u32 *rk, u8 *out, const u8 *in, int rounds);
-
static int ghash_init(struct shash_desc *desc)
{
struct ghash_desc_ctx *ctx = shash_desc_ctx(desc);
@@ -312,14 +310,13 @@ static int gcm_setkey(struct crypto_aead *tfm, const u8 *inkey,
u8 key[GHASH_BLOCK_SIZE];
int ret;

- ret = crypto_aes_expand_key(&ctx->aes_key, inkey, keylen);
+ ret = aes_expandkey(&ctx->aes_key, inkey, keylen);
if (ret) {
tfm->base.crt_flags |= CRYPTO_TFM_RES_BAD_KEY_LEN;
return -EINVAL;
}

- __aes_arm64_encrypt(ctx->aes_key.key_enc, key, (u8[AES_BLOCK_SIZE]){},
- num_rounds(&ctx->aes_key));
+ aes_encrypt(&ctx->aes_key, key, (u8[AES_BLOCK_SIZE]){});

return __ghash_setkey(&ctx->ghash_key, key, sizeof(be128));
}
@@ -470,7 +467,7 @@ static int gcm_encrypt(struct aead_request *req)
rk = ctx->aes_key.key_enc;
} while (walk.nbytes >= 2 * AES_BLOCK_SIZE);
} else {
- __aes_arm64_encrypt(ctx->aes_key.key_enc, tag, iv, nrounds);
+ aes_encrypt(&ctx->aes_key, tag, iv);
put_unaligned_be32(2, iv + GCM_IV_SIZE);

while (walk.nbytes >= (2 * AES_BLOCK_SIZE)) {
@@ -481,8 +478,7 @@ static int gcm_encrypt(struct aead_request *req)
int remaining = blocks;

do {
- __aes_arm64_encrypt(ctx->aes_key.key_enc,
- ks, iv, nrounds);
+ aes_encrypt(&ctx->aes_key, ks, iv);
crypto_xor_cpy(dst, src, ks, AES_BLOCK_SIZE);
crypto_inc(iv, AES_BLOCK_SIZE);

@@ -498,13 +494,10 @@ static int gcm_encrypt(struct aead_request *req)
walk.nbytes % (2 * AES_BLOCK_SIZE));
}
if (walk.nbytes) {
- __aes_arm64_encrypt(ctx->aes_key.key_enc, ks, iv,
- nrounds);
+ aes_encrypt(&ctx->aes_key, ks, iv);
if (walk.nbytes > AES_BLOCK_SIZE) {
crypto_inc(iv, AES_BLOCK_SIZE);
- __aes_arm64_encrypt(ctx->aes_key.key_enc,
- ks + AES_BLOCK_SIZE, iv,
- nrounds);
+ aes_encrypt(&ctx->aes_key, ks + AES_BLOCK_SIZE, iv);
}
}
}
@@ -608,7 +601,7 @@ static int gcm_decrypt(struct aead_request *req)
rk = ctx->aes_key.key_enc;
} while (walk.nbytes >= 2 * AES_BLOCK_SIZE);
} else {
- __aes_arm64_encrypt(ctx->aes_key.key_enc, tag, iv, nrounds);
+ aes_encrypt(&ctx->aes_key, tag, iv);
put_unaligned_be32(2, iv + GCM_IV_SIZE);

while (walk.nbytes >= (2 * AES_BLOCK_SIZE)) {
@@ -621,8 +614,7 @@ static int gcm_decrypt(struct aead_request *req)
pmull_ghash_update_p64);

do {
- __aes_arm64_encrypt(ctx->aes_key.key_enc,
- buf, iv, nrounds);
+ aes_encrypt(&ctx->aes_key, buf, iv);
crypto_xor_cpy(dst, src, buf, AES_BLOCK_SIZE);
crypto_inc(iv, AES_BLOCK_SIZE);

@@ -640,11 +632,9 @@ static int gcm_decrypt(struct aead_request *req)
memcpy(iv2, iv, AES_BLOCK_SIZE);
crypto_inc(iv2, AES_BLOCK_SIZE);

- __aes_arm64_encrypt(ctx->aes_key.key_enc, iv2,
- iv2, nrounds);
+ aes_encrypt(&ctx->aes_key, iv2, iv2);
}
- __aes_arm64_encrypt(ctx->aes_key.key_enc, iv, iv,
- nrounds);
+ aes_encrypt(&ctx->aes_key, iv, iv);
}
}

--
2.17.1

2019-07-02 19:42:43

by Ard Biesheuvel

[permalink] [raw]
Subject: [PATCH v4 11/32] crypto: arm/aes-neonbs - switch to library version of key expansion routine

Switch to the new AES library that also provides an implementation of
the AES key expansion routine. This removes the dependency on the
generic AES cipher, allowing it to be omitted entirely in the future.

Signed-off-by: Ard Biesheuvel <[email protected]>
---
arch/arm/crypto/Kconfig | 2 +-
arch/arm/crypto/aes-neonbs-glue.c | 4 ++--
2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/arm/crypto/Kconfig b/arch/arm/crypto/Kconfig
index a95322b59799..b24df84a1d7a 100644
--- a/arch/arm/crypto/Kconfig
+++ b/arch/arm/crypto/Kconfig
@@ -82,8 +82,8 @@ config CRYPTO_AES_ARM_BS
tristate "Bit sliced AES using NEON instructions"
depends on KERNEL_MODE_NEON
select CRYPTO_BLKCIPHER
+ select CRYPTO_LIB_AES
select CRYPTO_SIMD
- select CRYPTO_AES
help
Use a faster and more secure NEON based implementation of AES in CBC,
CTR and XTS modes
diff --git a/arch/arm/crypto/aes-neonbs-glue.c b/arch/arm/crypto/aes-neonbs-glue.c
index 617c2c99ebfb..f43c9365b6a9 100644
--- a/arch/arm/crypto/aes-neonbs-glue.c
+++ b/arch/arm/crypto/aes-neonbs-glue.c
@@ -64,7 +64,7 @@ static int aesbs_setkey(struct crypto_skcipher *tfm, const u8 *in_key,
struct crypto_aes_ctx rk;
int err;

- err = crypto_aes_expand_key(&rk, in_key, key_len);
+ err = aes_expandkey(&rk, in_key, key_len);
if (err)
return err;

@@ -123,7 +123,7 @@ static int aesbs_cbc_setkey(struct crypto_skcipher *tfm, const u8 *in_key,
struct crypto_aes_ctx rk;
int err;

- err = crypto_aes_expand_key(&rk, in_key, key_len);
+ err = aes_expandkey(&rk, in_key, key_len);
if (err)
return err;

--
2.17.1

2019-07-02 19:42:45

by Ard Biesheuvel

[permalink] [raw]
Subject: [PATCH v4 12/32] crypto: arm64/aes-ccm - switch to AES library

The CCM code calls directly into the scalar table based AES cipher for
arm64 from the fallback path, and since this implementation is known to
be non-time invariant, doing so from a time invariant SIMD cipher is a
bit nasty.

So let's switch to the AES library - this makes the code more robust,
and drops the dependency on the generic AES cipher, allowing us to
omit it entirely in the future.

Signed-off-by: Ard Biesheuvel <[email protected]>
---
arch/arm64/crypto/Kconfig | 2 +-
arch/arm64/crypto/aes-ce-ccm-glue.c | 18 ++++++------------
2 files changed, 7 insertions(+), 13 deletions(-)

diff --git a/arch/arm64/crypto/Kconfig b/arch/arm64/crypto/Kconfig
index 1762055e7093..c6032bfb44fb 100644
--- a/arch/arm64/crypto/Kconfig
+++ b/arch/arm64/crypto/Kconfig
@@ -80,8 +80,8 @@ config CRYPTO_AES_ARM64_CE_CCM
depends on ARM64 && KERNEL_MODE_NEON
select CRYPTO_ALGAPI
select CRYPTO_AES_ARM64_CE
- select CRYPTO_AES_ARM64
select CRYPTO_AEAD
+ select CRYPTO_LIB_AES

config CRYPTO_AES_ARM64_CE_BLK
tristate "AES in ECB/CBC/CTR/XTS modes using ARMv8 Crypto Extensions"
diff --git a/arch/arm64/crypto/aes-ce-ccm-glue.c b/arch/arm64/crypto/aes-ce-ccm-glue.c
index cb89c80800b5..b9b7cf4b5a8f 100644
--- a/arch/arm64/crypto/aes-ce-ccm-glue.c
+++ b/arch/arm64/crypto/aes-ce-ccm-glue.c
@@ -46,8 +46,6 @@ asmlinkage void ce_aes_ccm_decrypt(u8 out[], u8 const in[], u32 cbytes,
asmlinkage void ce_aes_ccm_final(u8 mac[], u8 const ctr[], u32 const rk[],
u32 rounds);

-asmlinkage void __aes_arm64_encrypt(u32 *rk, u8 *out, const u8 *in, int rounds);
-
static int ccm_setkey(struct crypto_aead *tfm, const u8 *in_key,
unsigned int key_len)
{
@@ -127,8 +125,7 @@ static void ccm_update_mac(struct crypto_aes_ctx *key, u8 mac[], u8 const in[],
}

while (abytes >= AES_BLOCK_SIZE) {
- __aes_arm64_encrypt(key->key_enc, mac, mac,
- num_rounds(key));
+ aes_encrypt(key, mac, mac);
crypto_xor(mac, in, AES_BLOCK_SIZE);

in += AES_BLOCK_SIZE;
@@ -136,8 +133,7 @@ static void ccm_update_mac(struct crypto_aes_ctx *key, u8 mac[], u8 const in[],
}

if (abytes > 0) {
- __aes_arm64_encrypt(key->key_enc, mac, mac,
- num_rounds(key));
+ aes_encrypt(key, mac, mac);
crypto_xor(mac, in, abytes);
*macp = abytes;
}
@@ -209,10 +205,8 @@ static int ccm_crypt_fallback(struct skcipher_walk *walk, u8 mac[], u8 iv0[],
bsize = nbytes;

crypto_inc(walk->iv, AES_BLOCK_SIZE);
- __aes_arm64_encrypt(ctx->key_enc, buf, walk->iv,
- num_rounds(ctx));
- __aes_arm64_encrypt(ctx->key_enc, mac, mac,
- num_rounds(ctx));
+ aes_encrypt(ctx, buf, walk->iv);
+ aes_encrypt(ctx, mac, mac);
if (enc)
crypto_xor(mac, src, bsize);
crypto_xor_cpy(dst, src, buf, bsize);
@@ -227,8 +221,8 @@ static int ccm_crypt_fallback(struct skcipher_walk *walk, u8 mac[], u8 iv0[],
}

if (!err) {
- __aes_arm64_encrypt(ctx->key_enc, buf, iv0, num_rounds(ctx));
- __aes_arm64_encrypt(ctx->key_enc, mac, mac, num_rounds(ctx));
+ aes_encrypt(ctx, buf, iv0);
+ aes_encrypt(ctx, mac, mac);
crypto_xor(mac, buf, AES_BLOCK_SIZE);
}
return err;
--
2.17.1

2019-07-02 19:42:46

by Ard Biesheuvel

[permalink] [raw]
Subject: [PATCH v4 13/32] crypto: arm64/aes-neonbs - switch to library version of key expansion routine

Switch to the new AES library that also provides an implementation of
the AES key expansion routine. This removes the dependency on the
generic AES cipher, allowing it to be omitted entirely in the future.

Signed-off-by: Ard Biesheuvel <[email protected]>
---
arch/arm64/crypto/Kconfig | 1 +
arch/arm64/crypto/aes-neonbs-glue.c | 8 ++++----
2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/arch/arm64/crypto/Kconfig b/arch/arm64/crypto/Kconfig
index c6032bfb44fb..17bf5dc10aad 100644
--- a/arch/arm64/crypto/Kconfig
+++ b/arch/arm64/crypto/Kconfig
@@ -116,6 +116,7 @@ config CRYPTO_AES_ARM64_BS
select CRYPTO_BLKCIPHER
select CRYPTO_AES_ARM64_NEON_BLK
select CRYPTO_AES_ARM64
+ select CRYPTO_LIB_AES
select CRYPTO_SIMD

endif
diff --git a/arch/arm64/crypto/aes-neonbs-glue.c b/arch/arm64/crypto/aes-neonbs-glue.c
index 02b65d9eb947..cb8d90f795a0 100644
--- a/arch/arm64/crypto/aes-neonbs-glue.c
+++ b/arch/arm64/crypto/aes-neonbs-glue.c
@@ -77,7 +77,7 @@ static int aesbs_setkey(struct crypto_skcipher *tfm, const u8 *in_key,
struct crypto_aes_ctx rk;
int err;

- err = crypto_aes_expand_key(&rk, in_key, key_len);
+ err = aes_expandkey(&rk, in_key, key_len);
if (err)
return err;

@@ -136,7 +136,7 @@ static int aesbs_cbc_setkey(struct crypto_skcipher *tfm, const u8 *in_key,
struct crypto_aes_ctx rk;
int err;

- err = crypto_aes_expand_key(&rk, in_key, key_len);
+ err = aes_expandkey(&rk, in_key, key_len);
if (err)
return err;

@@ -208,7 +208,7 @@ static int aesbs_ctr_setkey_sync(struct crypto_skcipher *tfm, const u8 *in_key,
struct aesbs_ctr_ctx *ctx = crypto_skcipher_ctx(tfm);
int err;

- err = crypto_aes_expand_key(&ctx->fallback, in_key, key_len);
+ err = aes_expandkey(&ctx->fallback, in_key, key_len);
if (err)
return err;

@@ -274,7 +274,7 @@ static int aesbs_xts_setkey(struct crypto_skcipher *tfm, const u8 *in_key,
return err;

key_len /= 2;
- err = crypto_aes_expand_key(&rk, in_key + key_len, key_len);
+ err = aes_expandkey(&rk, in_key + key_len, key_len);
if (err)
return err;

--
2.17.1

2019-07-02 19:42:49

by Ard Biesheuvel

[permalink] [raw]
Subject: [PATCH v4 14/32] crypto: arm64/aes-ce - switch to library version of key expansion routine

Switch to the new AES library that also provides an implementation of
the AES key expansion routine. This removes the dependency on the
generic AES cipher, allowing it to be omitted entirely in the future.

While at it, remove some references to the table based arm64 version
of AES and replace them with AES library calls as well.

Signed-off-by: Ard Biesheuvel <[email protected]>
---
arch/arm64/crypto/Kconfig | 2 +-
arch/arm64/crypto/aes-glue.c | 17 ++++++++++-------
2 files changed, 11 insertions(+), 8 deletions(-)

diff --git a/arch/arm64/crypto/Kconfig b/arch/arm64/crypto/Kconfig
index 17bf5dc10aad..66dea518221c 100644
--- a/arch/arm64/crypto/Kconfig
+++ b/arch/arm64/crypto/Kconfig
@@ -96,7 +96,7 @@ config CRYPTO_AES_ARM64_NEON_BLK
depends on KERNEL_MODE_NEON
select CRYPTO_BLKCIPHER
select CRYPTO_AES_ARM64
- select CRYPTO_AES
+ select CRYPTO_LIB_AES
select CRYPTO_SIMD

config CRYPTO_CHACHA20_NEON
diff --git a/arch/arm64/crypto/aes-glue.c b/arch/arm64/crypto/aes-glue.c
index f0ceb545bd1e..3c80345d914f 100644
--- a/arch/arm64/crypto/aes-glue.c
+++ b/arch/arm64/crypto/aes-glue.c
@@ -26,7 +26,6 @@
#ifdef USE_V8_CRYPTO_EXTENSIONS
#define MODE "ce"
#define PRIO 300
-#define aes_setkey ce_aes_setkey
#define aes_expandkey ce_aes_expandkey
#define aes_ecb_encrypt ce_aes_ecb_encrypt
#define aes_ecb_decrypt ce_aes_ecb_decrypt
@@ -42,8 +41,6 @@ MODULE_DESCRIPTION("AES-ECB/CBC/CTR/XTS using ARMv8 Crypto Extensions");
#else
#define MODE "neon"
#define PRIO 200
-#define aes_setkey crypto_aes_set_key
-#define aes_expandkey crypto_aes_expand_key
#define aes_ecb_encrypt neon_aes_ecb_encrypt
#define aes_ecb_decrypt neon_aes_ecb_decrypt
#define aes_cbc_encrypt neon_aes_cbc_encrypt
@@ -121,7 +118,14 @@ struct mac_desc_ctx {
static int skcipher_aes_setkey(struct crypto_skcipher *tfm, const u8 *in_key,
unsigned int key_len)
{
- return aes_setkey(crypto_skcipher_tfm(tfm), in_key, key_len);
+ struct crypto_aes_ctx *ctx = crypto_skcipher_ctx(tfm);
+ int ret;
+
+ ret = aes_expandkey(ctx, in_key, key_len);
+ if (ret)
+ crypto_skcipher_set_flags(tfm, CRYPTO_TFM_RES_BAD_KEY_LEN);
+
+ return ret;
}

static int xts_set_key(struct crypto_skcipher *tfm, const u8 *in_key,
@@ -649,15 +653,14 @@ static void mac_do_update(struct crypto_aes_ctx *ctx, u8 const in[], int blocks,
kernel_neon_end();
} else {
if (enc_before)
- __aes_arm64_encrypt(ctx->key_enc, dg, dg, rounds);
+ aes_encrypt(ctx, dg, dg);

while (blocks--) {
crypto_xor(dg, in, AES_BLOCK_SIZE);
in += AES_BLOCK_SIZE;

if (blocks || enc_after)
- __aes_arm64_encrypt(ctx->key_enc, dg, dg,
- rounds);
+ aes_encrypt(ctx, dg, dg);
}
}
}
--
2.17.1

2019-07-02 19:42:52

by Ard Biesheuvel

[permalink] [raw]
Subject: [PATCH v4 16/32] crypto: ctr - add helper for performing a CTR encryption walk

Add a static inline helper modeled after crypto_cbc_encrypt_walk()
that can be reused for SIMD algorithms that need to implement a
non-SIMD fallback for performing CTR encryption.

Signed-off-by: Ard Biesheuvel <[email protected]>
---
include/crypto/ctr.h | 50 ++++++++++++++++++++
1 file changed, 50 insertions(+)

diff --git a/include/crypto/ctr.h b/include/crypto/ctr.h
index 4180fc080e3b..d64017fae41c 100644
--- a/include/crypto/ctr.h
+++ b/include/crypto/ctr.h
@@ -13,8 +13,58 @@
#ifndef _CRYPTO_CTR_H
#define _CRYPTO_CTR_H

+#include <crypto/algapi.h>
+#include <crypto/internal/skcipher.h>
+#include <linux/string.h>
+#include <linux/types.h>
+
#define CTR_RFC3686_NONCE_SIZE 4
#define CTR_RFC3686_IV_SIZE 8
#define CTR_RFC3686_BLOCK_SIZE 16

+static inline int crypto_ctr_encrypt_walk(struct skcipher_request *req,
+ void (*fn)(struct crypto_skcipher *,
+ const u8 *, u8 *))
+{
+ struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
+ int blocksize = crypto_skcipher_chunksize(tfm);
+ u8 buf[MAX_CIPHER_BLOCKSIZE];
+ struct skcipher_walk walk;
+ int err;
+
+ /* avoid integer division due to variable blocksize parameter */
+ if (WARN_ON_ONCE(!is_power_of_2(blocksize)))
+ return -EINVAL;
+
+ err = skcipher_walk_virt(&walk, req, false);
+
+ while (walk.nbytes > 0) {
+ u8 *dst = walk.dst.virt.addr;
+ u8 *src = walk.src.virt.addr;
+ int nbytes = walk.nbytes;
+ int tail = 0;
+
+ if (nbytes < walk.total) {
+ tail = walk.nbytes & (blocksize - 1);
+ nbytes -= tail;
+ }
+
+ do {
+ int bsize = min(nbytes, blocksize);
+
+ fn(tfm, walk.iv, buf);
+
+ crypto_xor_cpy(dst, src, buf, bsize);
+ crypto_inc(walk.iv, blocksize);
+
+ dst += bsize;
+ src += bsize;
+ nbytes -= bsize;
+ } while (nbytes > 0);
+
+ err = skcipher_walk_done(&walk, tail);
+ }
+ return err;
+}
+
#endif /* _CRYPTO_CTR_H */
--
2.17.1

2019-07-02 19:42:52

by Ard Biesheuvel

[permalink] [raw]
Subject: [PATCH v4 15/32] crypto: generic/aes - drop key expansion routine in favor of library version

Drop aes-generic's version of crypto_aes_expand_key(), and switch to
the key expansion routine provided by the AES library. AES key expansion
is not performance critical, and it is better to have a single version
shared by all AES implementations.

Signed-off-by: Ard Biesheuvel <[email protected]>
---
crypto/Kconfig | 1 +
crypto/aes_generic.c | 153 +-------------------
include/crypto/aes.h | 2 -
3 files changed, 3 insertions(+), 153 deletions(-)

diff --git a/crypto/Kconfig b/crypto/Kconfig
index df6f0be66574..80ea118600ab 100644
--- a/crypto/Kconfig
+++ b/crypto/Kconfig
@@ -1072,6 +1072,7 @@ config CRYPTO_LIB_AES
config CRYPTO_AES
tristate "AES cipher algorithms"
select CRYPTO_ALGAPI
+ select CRYPTO_LIB_AES
help
AES cipher algorithms (FIPS-197). AES uses the Rijndael
algorithm.
diff --git a/crypto/aes_generic.c b/crypto/aes_generic.c
index 3aa4a715c216..426deb437f19 100644
--- a/crypto/aes_generic.c
+++ b/crypto/aes_generic.c
@@ -1125,155 +1125,6 @@ EXPORT_SYMBOL_GPL(crypto_fl_tab);
EXPORT_SYMBOL_GPL(crypto_it_tab);
EXPORT_SYMBOL_GPL(crypto_il_tab);

-/* initialise the key schedule from the user supplied key */
-
-#define star_x(x) (((x) & 0x7f7f7f7f) << 1) ^ ((((x) & 0x80808080) >> 7) * 0x1b)
-
-#define imix_col(y, x) do { \
- u = star_x(x); \
- v = star_x(u); \
- w = star_x(v); \
- t = w ^ (x); \
- (y) = u ^ v ^ w; \
- (y) ^= ror32(u ^ t, 8) ^ \
- ror32(v ^ t, 16) ^ \
- ror32(t, 24); \
-} while (0)
-
-#define ls_box(x) \
- crypto_fl_tab[0][byte(x, 0)] ^ \
- crypto_fl_tab[1][byte(x, 1)] ^ \
- crypto_fl_tab[2][byte(x, 2)] ^ \
- crypto_fl_tab[3][byte(x, 3)]
-
-#define loop4(i) do { \
- t = ror32(t, 8); \
- t = ls_box(t) ^ rco_tab[i]; \
- t ^= ctx->key_enc[4 * i]; \
- ctx->key_enc[4 * i + 4] = t; \
- t ^= ctx->key_enc[4 * i + 1]; \
- ctx->key_enc[4 * i + 5] = t; \
- t ^= ctx->key_enc[4 * i + 2]; \
- ctx->key_enc[4 * i + 6] = t; \
- t ^= ctx->key_enc[4 * i + 3]; \
- ctx->key_enc[4 * i + 7] = t; \
-} while (0)
-
-#define loop6(i) do { \
- t = ror32(t, 8); \
- t = ls_box(t) ^ rco_tab[i]; \
- t ^= ctx->key_enc[6 * i]; \
- ctx->key_enc[6 * i + 6] = t; \
- t ^= ctx->key_enc[6 * i + 1]; \
- ctx->key_enc[6 * i + 7] = t; \
- t ^= ctx->key_enc[6 * i + 2]; \
- ctx->key_enc[6 * i + 8] = t; \
- t ^= ctx->key_enc[6 * i + 3]; \
- ctx->key_enc[6 * i + 9] = t; \
- t ^= ctx->key_enc[6 * i + 4]; \
- ctx->key_enc[6 * i + 10] = t; \
- t ^= ctx->key_enc[6 * i + 5]; \
- ctx->key_enc[6 * i + 11] = t; \
-} while (0)
-
-#define loop8tophalf(i) do { \
- t = ror32(t, 8); \
- t = ls_box(t) ^ rco_tab[i]; \
- t ^= ctx->key_enc[8 * i]; \
- ctx->key_enc[8 * i + 8] = t; \
- t ^= ctx->key_enc[8 * i + 1]; \
- ctx->key_enc[8 * i + 9] = t; \
- t ^= ctx->key_enc[8 * i + 2]; \
- ctx->key_enc[8 * i + 10] = t; \
- t ^= ctx->key_enc[8 * i + 3]; \
- ctx->key_enc[8 * i + 11] = t; \
-} while (0)
-
-#define loop8(i) do { \
- loop8tophalf(i); \
- t = ctx->key_enc[8 * i + 4] ^ ls_box(t); \
- ctx->key_enc[8 * i + 12] = t; \
- t ^= ctx->key_enc[8 * i + 5]; \
- ctx->key_enc[8 * i + 13] = t; \
- t ^= ctx->key_enc[8 * i + 6]; \
- ctx->key_enc[8 * i + 14] = t; \
- t ^= ctx->key_enc[8 * i + 7]; \
- ctx->key_enc[8 * i + 15] = t; \
-} while (0)
-
-/**
- * crypto_aes_expand_key - Expands the AES key as described in FIPS-197
- * @ctx: The location where the computed key will be stored.
- * @in_key: The supplied key.
- * @key_len: The length of the supplied key.
- *
- * Returns 0 on success. The function fails only if an invalid key size (or
- * pointer) is supplied.
- * The expanded key size is 240 bytes (max of 14 rounds with a unique 16 bytes
- * key schedule plus a 16 bytes key which is used before the first round).
- * The decryption key is prepared for the "Equivalent Inverse Cipher" as
- * described in FIPS-197. The first slot (16 bytes) of each key (enc or dec) is
- * for the initial combination, the second slot for the first round and so on.
- */
-int crypto_aes_expand_key(struct crypto_aes_ctx *ctx, const u8 *in_key,
- unsigned int key_len)
-{
- u32 i, t, u, v, w, j;
-
- if (key_len != AES_KEYSIZE_128 && key_len != AES_KEYSIZE_192 &&
- key_len != AES_KEYSIZE_256)
- return -EINVAL;
-
- ctx->key_length = key_len;
-
- ctx->key_enc[0] = get_unaligned_le32(in_key);
- ctx->key_enc[1] = get_unaligned_le32(in_key + 4);
- ctx->key_enc[2] = get_unaligned_le32(in_key + 8);
- ctx->key_enc[3] = get_unaligned_le32(in_key + 12);
-
- ctx->key_dec[key_len + 24] = ctx->key_enc[0];
- ctx->key_dec[key_len + 25] = ctx->key_enc[1];
- ctx->key_dec[key_len + 26] = ctx->key_enc[2];
- ctx->key_dec[key_len + 27] = ctx->key_enc[3];
-
- switch (key_len) {
- case AES_KEYSIZE_128:
- t = ctx->key_enc[3];
- for (i = 0; i < 10; ++i)
- loop4(i);
- break;
-
- case AES_KEYSIZE_192:
- ctx->key_enc[4] = get_unaligned_le32(in_key + 16);
- t = ctx->key_enc[5] = get_unaligned_le32(in_key + 20);
- for (i = 0; i < 8; ++i)
- loop6(i);
- break;
-
- case AES_KEYSIZE_256:
- ctx->key_enc[4] = get_unaligned_le32(in_key + 16);
- ctx->key_enc[5] = get_unaligned_le32(in_key + 20);
- ctx->key_enc[6] = get_unaligned_le32(in_key + 24);
- t = ctx->key_enc[7] = get_unaligned_le32(in_key + 28);
- for (i = 0; i < 6; ++i)
- loop8(i);
- loop8tophalf(i);
- break;
- }
-
- ctx->key_dec[0] = ctx->key_enc[key_len + 24];
- ctx->key_dec[1] = ctx->key_enc[key_len + 25];
- ctx->key_dec[2] = ctx->key_enc[key_len + 26];
- ctx->key_dec[3] = ctx->key_enc[key_len + 27];
-
- for (i = 4; i < key_len + 24; ++i) {
- j = key_len + 24 - (i & ~3) + (i & 3);
- imix_col(ctx->key_dec[j], ctx->key_enc[i]);
- }
- return 0;
-}
-EXPORT_SYMBOL_GPL(crypto_aes_expand_key);
-
/**
* crypto_aes_set_key - Set the AES key.
* @tfm: The %crypto_tfm that is used in the context.
@@ -1281,7 +1132,7 @@ EXPORT_SYMBOL_GPL(crypto_aes_expand_key);
* @key_len: The size of the key.
*
* Returns 0 on success, on failure the %CRYPTO_TFM_RES_BAD_KEY_LEN flag in tfm
- * is set. The function uses crypto_aes_expand_key() to expand the key.
+ * is set. The function uses aes_expand_key() to expand the key.
* &crypto_aes_ctx _must_ be the private data embedded in @tfm which is
* retrieved with crypto_tfm_ctx().
*/
@@ -1292,7 +1143,7 @@ int crypto_aes_set_key(struct crypto_tfm *tfm, const u8 *in_key,
u32 *flags = &tfm->crt_flags;
int ret;

- ret = crypto_aes_expand_key(ctx, in_key, key_len);
+ ret = aes_expandkey(ctx, in_key, key_len);
if (!ret)
return 0;

diff --git a/include/crypto/aes.h b/include/crypto/aes.h
index d0067fca0cd0..0a64a977f9b3 100644
--- a/include/crypto/aes.h
+++ b/include/crypto/aes.h
@@ -35,8 +35,6 @@ extern const u32 crypto_il_tab[4][256] ____cacheline_aligned;

int crypto_aes_set_key(struct crypto_tfm *tfm, const u8 *in_key,
unsigned int key_len);
-int crypto_aes_expand_key(struct crypto_aes_ctx *ctx, const u8 *in_key,
- unsigned int key_len);

/**
* aes_expandkey - Expands the AES key as described in FIPS-197
--
2.17.1

2019-07-02 19:42:53

by Ard Biesheuvel

[permalink] [raw]
Subject: [PATCH v4 18/32] crypto: arm64/aes-ce-cipher - use AES library as fallback

Instead of calling into the table based scalar AES code in situations
where the SIMD unit may not be used, use the generic AES code, which
is more appropriate since it is less likely to be susceptible to
timing attacks.

Signed-off-by: Ard Biesheuvel <[email protected]>
---
arch/arm64/crypto/Kconfig | 2 +-
arch/arm64/crypto/aes-ce-glue.c | 7 ++-----
arch/arm64/crypto/aes-cipher-glue.c | 3 ---
3 files changed, 3 insertions(+), 9 deletions(-)

diff --git a/arch/arm64/crypto/Kconfig b/arch/arm64/crypto/Kconfig
index 66dea518221c..4922c4451e7c 100644
--- a/arch/arm64/crypto/Kconfig
+++ b/arch/arm64/crypto/Kconfig
@@ -73,7 +73,7 @@ config CRYPTO_AES_ARM64_CE
tristate "AES core cipher using ARMv8 Crypto Extensions"
depends on ARM64 && KERNEL_MODE_NEON
select CRYPTO_ALGAPI
- select CRYPTO_AES_ARM64
+ select CRYPTO_LIB_AES

config CRYPTO_AES_ARM64_CE_CCM
tristate "AES in CCM mode using ARMv8 Crypto Extensions"
diff --git a/arch/arm64/crypto/aes-ce-glue.c b/arch/arm64/crypto/aes-ce-glue.c
index 3213843fcb46..6890e003b8f1 100644
--- a/arch/arm64/crypto/aes-ce-glue.c
+++ b/arch/arm64/crypto/aes-ce-glue.c
@@ -23,9 +23,6 @@ MODULE_DESCRIPTION("Synchronous AES cipher using ARMv8 Crypto Extensions");
MODULE_AUTHOR("Ard Biesheuvel <[email protected]>");
MODULE_LICENSE("GPL v2");

-asmlinkage void __aes_arm64_encrypt(u32 *rk, u8 *out, const u8 *in, int rounds);
-asmlinkage void __aes_arm64_decrypt(u32 *rk, u8 *out, const u8 *in, int rounds);
-
struct aes_block {
u8 b[AES_BLOCK_SIZE];
};
@@ -54,7 +51,7 @@ static void aes_cipher_encrypt(struct crypto_tfm *tfm, u8 dst[], u8 const src[])
struct crypto_aes_ctx *ctx = crypto_tfm_ctx(tfm);

if (!crypto_simd_usable()) {
- __aes_arm64_encrypt(ctx->key_enc, dst, src, num_rounds(ctx));
+ aes_encrypt(ctx, dst, src);
return;
}

@@ -68,7 +65,7 @@ static void aes_cipher_decrypt(struct crypto_tfm *tfm, u8 dst[], u8 const src[])
struct crypto_aes_ctx *ctx = crypto_tfm_ctx(tfm);

if (!crypto_simd_usable()) {
- __aes_arm64_decrypt(ctx->key_dec, dst, src, num_rounds(ctx));
+ aes_decrypt(ctx, dst, src);
return;
}

diff --git a/arch/arm64/crypto/aes-cipher-glue.c b/arch/arm64/crypto/aes-cipher-glue.c
index 0e90b06ebcec..bf32cc6489e1 100644
--- a/arch/arm64/crypto/aes-cipher-glue.c
+++ b/arch/arm64/crypto/aes-cipher-glue.c
@@ -13,10 +13,7 @@
#include <linux/module.h>

asmlinkage void __aes_arm64_encrypt(u32 *rk, u8 *out, const u8 *in, int rounds);
-EXPORT_SYMBOL(__aes_arm64_encrypt);
-
asmlinkage void __aes_arm64_decrypt(u32 *rk, u8 *out, const u8 *in, int rounds);
-EXPORT_SYMBOL(__aes_arm64_decrypt);

static void aes_arm64_encrypt(struct crypto_tfm *tfm, u8 *out, const u8 *in)
{
--
2.17.1

2019-07-02 19:42:54

by Ard Biesheuvel

[permalink] [raw]
Subject: [PATCH v4 17/32] crypto: aes - move sync ctr(aes) to AES library and generic helper

In preparation of duplicating the sync ctr(aes) functionality to modules
under arch/arm, move the helper function from a inline .h file to the
AES library, which is already depended upon by the drivers that use this
fallback.

Signed-off-by: Ard Biesheuvel <[email protected]>
---
arch/arm64/crypto/aes-ctr-fallback.h | 53 --------------------
arch/arm64/crypto/aes-glue.c | 22 ++++++--
arch/arm64/crypto/aes-neonbs-glue.c | 21 ++++++--
3 files changed, 33 insertions(+), 63 deletions(-)

diff --git a/arch/arm64/crypto/aes-ctr-fallback.h b/arch/arm64/crypto/aes-ctr-fallback.h
deleted file mode 100644
index c9285717b6b5..000000000000
--- a/arch/arm64/crypto/aes-ctr-fallback.h
+++ /dev/null
@@ -1,53 +0,0 @@
-/*
- * Fallback for sync aes(ctr) in contexts where kernel mode NEON
- * is not allowed
- *
- * Copyright (C) 2017 Linaro Ltd <[email protected]>
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License version 2 as
- * published by the Free Software Foundation.
- */
-
-#include <crypto/aes.h>
-#include <crypto/internal/skcipher.h>
-
-asmlinkage void __aes_arm64_encrypt(u32 *rk, u8 *out, const u8 *in, int rounds);
-
-static inline int aes_ctr_encrypt_fallback(struct crypto_aes_ctx *ctx,
- struct skcipher_request *req)
-{
- struct skcipher_walk walk;
- u8 buf[AES_BLOCK_SIZE];
- int err;
-
- err = skcipher_walk_virt(&walk, req, true);
-
- while (walk.nbytes > 0) {
- u8 *dst = walk.dst.virt.addr;
- u8 *src = walk.src.virt.addr;
- int nbytes = walk.nbytes;
- int tail = 0;
-
- if (nbytes < walk.total) {
- nbytes = round_down(nbytes, AES_BLOCK_SIZE);
- tail = walk.nbytes % AES_BLOCK_SIZE;
- }
-
- do {
- int bsize = min(nbytes, AES_BLOCK_SIZE);
-
- __aes_arm64_encrypt(ctx->key_enc, buf, walk.iv,
- 6 + ctx->key_length / 4);
- crypto_xor_cpy(dst, src, buf, bsize);
- crypto_inc(walk.iv, AES_BLOCK_SIZE);
-
- dst += AES_BLOCK_SIZE;
- src += AES_BLOCK_SIZE;
- nbytes -= AES_BLOCK_SIZE;
- } while (nbytes > 0);
-
- err = skcipher_walk_done(&walk, tail);
- }
- return err;
-}
diff --git a/arch/arm64/crypto/aes-glue.c b/arch/arm64/crypto/aes-glue.c
index 3c80345d914f..60303ea625e6 100644
--- a/arch/arm64/crypto/aes-glue.c
+++ b/arch/arm64/crypto/aes-glue.c
@@ -12,6 +12,7 @@
#include <asm/hwcap.h>
#include <asm/simd.h>
#include <crypto/aes.h>
+#include <crypto/ctr.h>
#include <crypto/internal/hash.h>
#include <crypto/internal/simd.h>
#include <crypto/internal/skcipher.h>
@@ -21,7 +22,6 @@
#include <crypto/xts.h>

#include "aes-ce-setkey.h"
-#include "aes-ctr-fallback.h"

#ifdef USE_V8_CRYPTO_EXTENSIONS
#define MODE "ce"
@@ -404,13 +404,25 @@ static int ctr_encrypt(struct skcipher_request *req)
return err;
}

-static int ctr_encrypt_sync(struct skcipher_request *req)
+static void ctr_encrypt_one(struct crypto_skcipher *tfm, const u8 *src, u8 *dst)
{
- struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
- struct crypto_aes_ctx *ctx = crypto_skcipher_ctx(tfm);
+ const struct crypto_aes_ctx *ctx = crypto_skcipher_ctx(tfm);
+ unsigned long flags;
+
+ /*
+ * Temporarily disable interrupts to avoid races where
+ * cachelines are evicted when the CPU is interrupted
+ * to do something else.
+ */
+ local_irq_save(flags);
+ aes_encrypt(ctx, dst, src);
+ local_irq_restore(flags);
+}

+static int ctr_encrypt_sync(struct skcipher_request *req)
+{
if (!crypto_simd_usable())
- return aes_ctr_encrypt_fallback(ctx, req);
+ return crypto_ctr_encrypt_walk(req, ctr_encrypt_one);

return ctr_encrypt(req);
}
diff --git a/arch/arm64/crypto/aes-neonbs-glue.c b/arch/arm64/crypto/aes-neonbs-glue.c
index cb8d90f795a0..73c17ccb347d 100644
--- a/arch/arm64/crypto/aes-neonbs-glue.c
+++ b/arch/arm64/crypto/aes-neonbs-glue.c
@@ -11,13 +11,12 @@
#include <asm/neon.h>
#include <asm/simd.h>
#include <crypto/aes.h>
+#include <crypto/ctr.h>
#include <crypto/internal/simd.h>
#include <crypto/internal/skcipher.h>
#include <crypto/xts.h>
#include <linux/module.h>

-#include "aes-ctr-fallback.h"
-
MODULE_AUTHOR("Ard Biesheuvel <[email protected]>");
MODULE_LICENSE("GPL v2");

@@ -283,13 +282,25 @@ static int aesbs_xts_setkey(struct crypto_skcipher *tfm, const u8 *in_key,
return aesbs_setkey(tfm, in_key, key_len);
}

-static int ctr_encrypt_sync(struct skcipher_request *req)
+static void ctr_encrypt_one(struct crypto_skcipher *tfm, const u8 *src, u8 *dst)
{
- struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
struct aesbs_ctr_ctx *ctx = crypto_skcipher_ctx(tfm);
+ unsigned long flags;
+
+ /*
+ * Temporarily disable interrupts to avoid races where
+ * cachelines are evicted when the CPU is interrupted
+ * to do something else.
+ */
+ local_irq_save(flags);
+ aes_encrypt(&ctx->fallback, dst, src);
+ local_irq_restore(flags);
+}

+static int ctr_encrypt_sync(struct skcipher_request *req)
+{
if (!crypto_simd_usable())
- return aes_ctr_encrypt_fallback(&ctx->fallback, req);
+ return crypto_ctr_encrypt_walk(req, ctr_encrypt_one);

return ctr_encrypt(req);
}
--
2.17.1

2019-07-02 19:42:55

by Ard Biesheuvel

[permalink] [raw]
Subject: [PATCH v4 19/32] crypto: aes/arm - use native endiannes for key schedule

Align ARM's hw instruction based AES implementation with other versions
that keep the key schedule in native endianness. This will allow us to
merge the various implementations going forward.

Signed-off-by: Ard Biesheuvel <[email protected]>
---
arch/arm/crypto/aes-ce-core.S | 20 ++++++++++----------
arch/arm/crypto/aes-ce-glue.c | 9 +++------
2 files changed, 13 insertions(+), 16 deletions(-)

diff --git a/arch/arm/crypto/aes-ce-core.S b/arch/arm/crypto/aes-ce-core.S
index bc53bcaa772e..3692b8735ef7 100644
--- a/arch/arm/crypto/aes-ce-core.S
+++ b/arch/arm/crypto/aes-ce-core.S
@@ -91,19 +91,19 @@

.macro do_block, dround, fround
cmp r3, #12 @ which key size?
- vld1.8 {q10-q11}, [ip]!
+ vld1.32 {q10-q11}, [ip]!
\dround q8, q9
- vld1.8 {q12-q13}, [ip]!
+ vld1.32 {q12-q13}, [ip]!
\dround q10, q11
- vld1.8 {q10-q11}, [ip]!
+ vld1.32 {q10-q11}, [ip]!
\dround q12, q13
- vld1.8 {q12-q13}, [ip]!
+ vld1.32 {q12-q13}, [ip]!
\dround q10, q11
blo 0f @ AES-128: 10 rounds
- vld1.8 {q10-q11}, [ip]!
+ vld1.32 {q10-q11}, [ip]!
\dround q12, q13
beq 1f @ AES-192: 12 rounds
- vld1.8 {q12-q13}, [ip]
+ vld1.32 {q12-q13}, [ip]
\dround q10, q11
0: \fround q12, q13, q14
bx lr
@@ -152,8 +152,8 @@ ENDPROC(aes_decrypt_3x)

.macro prepare_key, rk, rounds
add ip, \rk, \rounds, lsl #4
- vld1.8 {q8-q9}, [\rk] @ load first 2 round keys
- vld1.8 {q14}, [ip] @ load last round key
+ vld1.32 {q8-q9}, [\rk] @ load first 2 round keys
+ vld1.32 {q14}, [ip] @ load last round key
.endm

/*
@@ -508,8 +508,8 @@ ENDPROC(ce_aes_sub)
* operation on round key *src
*/
ENTRY(ce_aes_invert)
- vld1.8 {q0}, [r1]
+ vld1.32 {q0}, [r1]
aesimc.8 q0, q0
- vst1.8 {q0}, [r0]
+ vst1.32 {q0}, [r0]
bx lr
ENDPROC(ce_aes_invert)
diff --git a/arch/arm/crypto/aes-ce-glue.c b/arch/arm/crypto/aes-ce-glue.c
index 04ba66903674..e6da3e30018b 100644
--- a/arch/arm/crypto/aes-ce-glue.c
+++ b/arch/arm/crypto/aes-ce-glue.c
@@ -10,6 +10,7 @@

#include <asm/hwcap.h>
#include <asm/neon.h>
+#include <asm/unaligned.h>
#include <crypto/aes.h>
#include <crypto/internal/simd.h>
#include <crypto/internal/skcipher.h>
@@ -80,21 +81,17 @@ static int ce_aes_expandkey(struct crypto_aes_ctx *ctx, const u8 *in_key,
key_len != AES_KEYSIZE_256)
return -EINVAL;

- memcpy(ctx->key_enc, in_key, key_len);
ctx->key_length = key_len;
+ for (i = 0; i < kwords; i++)
+ ctx->key_enc[i] = get_unaligned_le32(in_key + i * sizeof(u32));

kernel_neon_begin();
for (i = 0; i < sizeof(rcon); i++) {
u32 *rki = ctx->key_enc + (i * kwords);
u32 *rko = rki + kwords;

-#ifndef CONFIG_CPU_BIG_ENDIAN
rko[0] = ror32(ce_aes_sub(rki[kwords - 1]), 8);
rko[0] = rko[0] ^ rki[0] ^ rcon[i];
-#else
- rko[0] = rol32(ce_aes_sub(rki[kwords - 1]), 8);
- rko[0] = rko[0] ^ rki[0] ^ (rcon[i] << 24);
-#endif
rko[1] = rko[0] ^ rki[1];
rko[2] = rko[1] ^ rki[2];
rko[3] = rko[2] ^ rki[3];
--
2.17.1

2019-07-02 19:42:57

by Ard Biesheuvel

[permalink] [raw]
Subject: [PATCH v4 20/32] crypto: arm/aes-ce - provide a synchronous version of ctr(aes)

AES in CTR mode is used by modes such as GCM and CCM, which are often
used in contexts where only synchronous ciphers are permitted. So
provide a synchronous version of ctr(aes) based on the existing code.
This requires a non-SIMD fallback to deal with invocations occurring
from a context where SIMD instructions may not be used. We have a
helper for this now in the AES library, so wire that up.

Signed-off-by: Ard Biesheuvel <[email protected]>
---
arch/arm/crypto/aes-ce-glue.c | 43 ++++++++++++++++++++
1 file changed, 43 insertions(+)

diff --git a/arch/arm/crypto/aes-ce-glue.c b/arch/arm/crypto/aes-ce-glue.c
index e6da3e30018b..1d93da29d03a 100644
--- a/arch/arm/crypto/aes-ce-glue.c
+++ b/arch/arm/crypto/aes-ce-glue.c
@@ -10,8 +10,10 @@

#include <asm/hwcap.h>
#include <asm/neon.h>
+#include <asm/simd.h>
#include <asm/unaligned.h>
#include <crypto/aes.h>
+#include <crypto/ctr.h>
#include <crypto/internal/simd.h>
#include <crypto/internal/skcipher.h>
#include <linux/cpufeature.h>
@@ -289,6 +291,29 @@ static int ctr_encrypt(struct skcipher_request *req)
return err;
}

+static void ctr_encrypt_one(struct crypto_skcipher *tfm, const u8 *src, u8 *dst)
+{
+ struct crypto_aes_ctx *ctx = crypto_skcipher_ctx(tfm);
+ unsigned long flags;
+
+ /*
+ * Temporarily disable interrupts to avoid races where
+ * cachelines are evicted when the CPU is interrupted
+ * to do something else.
+ */
+ local_irq_save(flags);
+ aes_encrypt(ctx, dst, src);
+ local_irq_restore(flags);
+}
+
+static int ctr_encrypt_sync(struct skcipher_request *req)
+{
+ if (!crypto_simd_usable())
+ return crypto_ctr_encrypt_walk(req, ctr_encrypt_one);
+
+ return ctr_encrypt(req);
+}
+
static int xts_encrypt(struct skcipher_request *req)
{
struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
@@ -378,6 +403,21 @@ static struct skcipher_alg aes_algs[] = { {
.setkey = ce_aes_setkey,
.encrypt = ctr_encrypt,
.decrypt = ctr_encrypt,
+}, {
+ .base.cra_name = "ctr(aes)",
+ .base.cra_driver_name = "ctr-aes-ce-sync",
+ .base.cra_priority = 300 - 1,
+ .base.cra_blocksize = 1,
+ .base.cra_ctxsize = sizeof(struct crypto_aes_ctx),
+ .base.cra_module = THIS_MODULE,
+
+ .min_keysize = AES_MIN_KEY_SIZE,
+ .max_keysize = AES_MAX_KEY_SIZE,
+ .ivsize = AES_BLOCK_SIZE,
+ .chunksize = AES_BLOCK_SIZE,
+ .setkey = ce_aes_setkey,
+ .encrypt = ctr_encrypt_sync,
+ .decrypt = ctr_encrypt_sync,
}, {
.base.cra_name = "__xts(aes)",
.base.cra_driver_name = "__xts-aes-ce",
@@ -421,6 +461,9 @@ static int __init aes_init(void)
return err;

for (i = 0; i < ARRAY_SIZE(aes_algs); i++) {
+ if (!(aes_algs[i].base.cra_flags & CRYPTO_ALG_INTERNAL))
+ continue;
+
algname = aes_algs[i].base.cra_name + 2;
drvname = aes_algs[i].base.cra_driver_name + 2;
basename = aes_algs[i].base.cra_driver_name;
--
2.17.1

2019-07-02 19:42:59

by Ard Biesheuvel

[permalink] [raw]
Subject: [PATCH v4 21/32] crypto: arm/aes-neonbs - provide a synchronous version of ctr(aes)

AES in CTR mode is used by modes such as GCM and CCM, which are often
used in contexts where only synchronous ciphers are permitted. So
provide a synchronous version of ctr(aes) based on the existing code.
This requires a non-SIMD fallback to deal with invocations occurring
from a context where SIMD instructions may not be used. We have a
helper for this now in the AES library, so wire that up.

Signed-off-by: Ard Biesheuvel <[email protected]>
---
arch/arm/crypto/aes-neonbs-glue.c | 65 ++++++++++++++++++++
1 file changed, 65 insertions(+)

diff --git a/arch/arm/crypto/aes-neonbs-glue.c b/arch/arm/crypto/aes-neonbs-glue.c
index f43c9365b6a9..6eecdbb7e9b6 100644
--- a/arch/arm/crypto/aes-neonbs-glue.c
+++ b/arch/arm/crypto/aes-neonbs-glue.c
@@ -9,8 +9,10 @@
*/

#include <asm/neon.h>
+#include <asm/simd.h>
#include <crypto/aes.h>
#include <crypto/cbc.h>
+#include <crypto/ctr.h>
#include <crypto/internal/simd.h>
#include <crypto/internal/skcipher.h>
#include <crypto/xts.h>
@@ -57,6 +59,11 @@ struct aesbs_xts_ctx {
struct crypto_cipher *tweak_tfm;
};

+struct aesbs_ctr_ctx {
+ struct aesbs_ctx key; /* must be first member */
+ struct crypto_aes_ctx fallback;
+};
+
static int aesbs_setkey(struct crypto_skcipher *tfm, const u8 *in_key,
unsigned int key_len)
{
@@ -192,6 +199,25 @@ static void cbc_exit(struct crypto_tfm *tfm)
crypto_free_cipher(ctx->enc_tfm);
}

+static int aesbs_ctr_setkey_sync(struct crypto_skcipher *tfm, const u8 *in_key,
+ unsigned int key_len)
+{
+ struct aesbs_ctr_ctx *ctx = crypto_skcipher_ctx(tfm);
+ int err;
+
+ err = aes_expandkey(&ctx->fallback, in_key, key_len);
+ if (err)
+ return err;
+
+ ctx->key.rounds = 6 + key_len / 4;
+
+ kernel_neon_begin();
+ aesbs_convert_key(ctx->key.rk, ctx->fallback.key_enc, ctx->key.rounds);
+ kernel_neon_end();
+
+ return 0;
+}
+
static int ctr_encrypt(struct skcipher_request *req)
{
struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
@@ -234,6 +260,29 @@ static int ctr_encrypt(struct skcipher_request *req)
return err;
}

+static void ctr_encrypt_one(struct crypto_skcipher *tfm, const u8 *src, u8 *dst)
+{
+ struct aesbs_ctr_ctx *ctx = crypto_skcipher_ctx(tfm);
+ unsigned long flags;
+
+ /*
+ * Temporarily disable interrupts to avoid races where
+ * cachelines are evicted when the CPU is interrupted
+ * to do something else.
+ */
+ local_irq_save(flags);
+ aes_encrypt(&ctx->fallback, dst, src);
+ local_irq_restore(flags);
+}
+
+static int ctr_encrypt_sync(struct skcipher_request *req)
+{
+ if (!crypto_simd_usable())
+ return crypto_ctr_encrypt_walk(req, ctr_encrypt_one);
+
+ return ctr_encrypt(req);
+}
+
static int aesbs_xts_setkey(struct crypto_skcipher *tfm, const u8 *in_key,
unsigned int key_len)
{
@@ -361,6 +410,22 @@ static struct skcipher_alg aes_algs[] = { {
.setkey = aesbs_setkey,
.encrypt = ctr_encrypt,
.decrypt = ctr_encrypt,
+}, {
+ .base.cra_name = "ctr(aes)",
+ .base.cra_driver_name = "ctr-aes-neonbs-sync",
+ .base.cra_priority = 250 - 1,
+ .base.cra_blocksize = 1,
+ .base.cra_ctxsize = sizeof(struct aesbs_ctr_ctx),
+ .base.cra_module = THIS_MODULE,
+
+ .min_keysize = AES_MIN_KEY_SIZE,
+ .max_keysize = AES_MAX_KEY_SIZE,
+ .chunksize = AES_BLOCK_SIZE,
+ .walksize = 8 * AES_BLOCK_SIZE,
+ .ivsize = AES_BLOCK_SIZE,
+ .setkey = aesbs_ctr_setkey_sync,
+ .encrypt = ctr_encrypt_sync,
+ .decrypt = ctr_encrypt_sync,
}, {
.base.cra_name = "__xts(aes)",
.base.cra_driver_name = "__xts-aes-neonbs",
--
2.17.1

2019-07-02 19:43:04

by Ard Biesheuvel

[permalink] [raw]
Subject: [PATCH v4 22/32] crypto: arm/ghash - provide a synchronous version

GHASH is used by the GCM mode, which is often used in contexts where
only synchronous ciphers are permitted. So provide a synchronous version
of GHASH based on the existing code. This requires a non-SIMD fallback
to deal with invocations occurring from a context where SIMD instructions
may not be used.

Signed-off-by: Ard Biesheuvel <[email protected]>
---
arch/arm/crypto/ghash-ce-glue.c | 78 +++++++++++++-------
1 file changed, 52 insertions(+), 26 deletions(-)

diff --git a/arch/arm/crypto/ghash-ce-glue.c b/arch/arm/crypto/ghash-ce-glue.c
index 39d1ccec1aab..ebb237ca874b 100644
--- a/arch/arm/crypto/ghash-ce-glue.c
+++ b/arch/arm/crypto/ghash-ce-glue.c
@@ -12,6 +12,7 @@
#include <asm/neon.h>
#include <asm/simd.h>
#include <asm/unaligned.h>
+#include <crypto/b128ops.h>
#include <crypto/cryptd.h>
#include <crypto/internal/hash.h>
#include <crypto/internal/simd.h>
@@ -33,6 +34,8 @@ struct ghash_key {
u64 h2[2];
u64 h3[2];
u64 h4[2];
+
+ be128 k;
};

struct ghash_desc_ctx {
@@ -65,6 +68,36 @@ static int ghash_init(struct shash_desc *desc)
return 0;
}

+static void ghash_do_update(int blocks, u64 dg[], const char *src,
+ struct ghash_key *key, const char *head)
+{
+ if (likely(crypto_simd_usable())) {
+ kernel_neon_begin();
+ pmull_ghash_update(blocks, dg, src, key, head);
+ kernel_neon_end();
+ } else {
+ be128 dst = { cpu_to_be64(dg[1]), cpu_to_be64(dg[0]) };
+
+ do {
+ const u8 *in = src;
+
+ if (head) {
+ in = head;
+ blocks++;
+ head = NULL;
+ } else {
+ src += GHASH_BLOCK_SIZE;
+ }
+
+ crypto_xor((u8 *)&dst, in, GHASH_BLOCK_SIZE);
+ gf128mul_lle(&dst, &key->k);
+ } while (--blocks);
+
+ dg[0] = be64_to_cpu(dst.b);
+ dg[1] = be64_to_cpu(dst.a);
+ }
+}
+
static int ghash_update(struct shash_desc *desc, const u8 *src,
unsigned int len)
{
@@ -88,10 +121,8 @@ static int ghash_update(struct shash_desc *desc, const u8 *src,
blocks = len / GHASH_BLOCK_SIZE;
len %= GHASH_BLOCK_SIZE;

- kernel_neon_begin();
- pmull_ghash_update(blocks, ctx->digest, src, key,
- partial ? ctx->buf : NULL);
- kernel_neon_end();
+ ghash_do_update(blocks, ctx->digest, src, key,
+ partial ? ctx->buf : NULL);
src += blocks * GHASH_BLOCK_SIZE;
partial = 0;
}
@@ -109,9 +140,7 @@ static int ghash_final(struct shash_desc *desc, u8 *dst)
struct ghash_key *key = crypto_shash_ctx(desc->tfm);

memset(ctx->buf + partial, 0, GHASH_BLOCK_SIZE - partial);
- kernel_neon_begin();
- pmull_ghash_update(1, ctx->digest, ctx->buf, key, NULL);
- kernel_neon_end();
+ ghash_do_update(1, ctx->digest, ctx->buf, key, NULL);
}
put_unaligned_be64(ctx->digest[1], dst);
put_unaligned_be64(ctx->digest[0], dst + 8);
@@ -135,24 +164,25 @@ static int ghash_setkey(struct crypto_shash *tfm,
const u8 *inkey, unsigned int keylen)
{
struct ghash_key *key = crypto_shash_ctx(tfm);
- be128 h, k;
+ be128 h;

if (keylen != GHASH_BLOCK_SIZE) {
crypto_shash_set_flags(tfm, CRYPTO_TFM_RES_BAD_KEY_LEN);
return -EINVAL;
}

- memcpy(&k, inkey, GHASH_BLOCK_SIZE);
- ghash_reflect(key->h, &k);
+ /* needed for the fallback */
+ memcpy(&key->k, inkey, GHASH_BLOCK_SIZE);
+ ghash_reflect(key->h, &key->k);

- h = k;
- gf128mul_lle(&h, &k);
+ h = key->k;
+ gf128mul_lle(&h, &key->k);
ghash_reflect(key->h2, &h);

- gf128mul_lle(&h, &k);
+ gf128mul_lle(&h, &key->k);
ghash_reflect(key->h3, &h);

- gf128mul_lle(&h, &k);
+ gf128mul_lle(&h, &key->k);
ghash_reflect(key->h4, &h);

return 0;
@@ -165,15 +195,13 @@ static struct shash_alg ghash_alg = {
.final = ghash_final,
.setkey = ghash_setkey,
.descsize = sizeof(struct ghash_desc_ctx),
- .base = {
- .cra_name = "__ghash",
- .cra_driver_name = "__driver-ghash-ce",
- .cra_priority = 0,
- .cra_flags = CRYPTO_ALG_INTERNAL,
- .cra_blocksize = GHASH_BLOCK_SIZE,
- .cra_ctxsize = sizeof(struct ghash_key),
- .cra_module = THIS_MODULE,
- },
+
+ .base.cra_name = "ghash",
+ .base.cra_driver_name = "ghash-ce-sync",
+ .base.cra_priority = 300 - 1,
+ .base.cra_blocksize = GHASH_BLOCK_SIZE,
+ .base.cra_ctxsize = sizeof(struct ghash_key),
+ .base.cra_module = THIS_MODULE,
};

static int ghash_async_init(struct ahash_request *req)
@@ -288,9 +316,7 @@ static int ghash_async_init_tfm(struct crypto_tfm *tfm)
struct cryptd_ahash *cryptd_tfm;
struct ghash_async_ctx *ctx = crypto_tfm_ctx(tfm);

- cryptd_tfm = cryptd_alloc_ahash("__driver-ghash-ce",
- CRYPTO_ALG_INTERNAL,
- CRYPTO_ALG_INTERNAL);
+ cryptd_tfm = cryptd_alloc_ahash("ghash-ce-sync", 0, 0);
if (IS_ERR(cryptd_tfm))
return PTR_ERR(cryptd_tfm);
ctx->cryptd_tfm = cryptd_tfm;
--
2.17.1

2019-07-02 19:43:06

by Ard Biesheuvel

[permalink] [raw]
Subject: [PATCH v4 23/32] bluetooth: switch to AES library

The bluetooth code uses a bare AES cipher for the encryption operations.
Given that it carries out a set_key() operation right before every
encryption operation, this is clearly not a hot path, and so the use of
the cipher interface (which provides the best implementation available
on the system) is not really required.

In fact, when using a cipher like AES-NI or AES-CE, both the set_key()
and the encrypt() operations involve en/disabling preemption as well as
stacking and unstacking the SIMD context, and this is most certainly
not worth it for encrypting 16 bytes of data.

So let's switch to the new lightweight library interface instead.

Signed-off-by: Ard Biesheuvel <[email protected]>
---
net/bluetooth/Kconfig | 3 +-
net/bluetooth/smp.c | 103 ++++++--------------
2 files changed, 33 insertions(+), 73 deletions(-)

diff --git a/net/bluetooth/Kconfig b/net/bluetooth/Kconfig
index db82a40875e8..a9d83ec4ee33 100644
--- a/net/bluetooth/Kconfig
+++ b/net/bluetooth/Kconfig
@@ -9,7 +9,8 @@ menuconfig BT
select CRC16
select CRYPTO
select CRYPTO_BLKCIPHER
- select CRYPTO_AES
+ select CRYPTO_LIB_AES
+ imply CRYPTO_AES
select CRYPTO_CMAC
select CRYPTO_ECB
select CRYPTO_SHA256
diff --git a/net/bluetooth/smp.c b/net/bluetooth/smp.c
index e68c715f8d37..b5045b57ead3 100644
--- a/net/bluetooth/smp.c
+++ b/net/bluetooth/smp.c
@@ -23,6 +23,7 @@
#include <linux/debugfs.h>
#include <linux/scatterlist.h>
#include <linux/crypto.h>
+#include <crypto/aes.h>
#include <crypto/algapi.h>
#include <crypto/b128ops.h>
#include <crypto/hash.h>
@@ -88,7 +89,6 @@ struct smp_dev {
u8 local_rand[16];
bool debug_key;

- struct crypto_cipher *tfm_aes;
struct crypto_shash *tfm_cmac;
struct crypto_kpp *tfm_ecdh;
};
@@ -127,7 +127,6 @@ struct smp_chan {
u8 dhkey[32];
u8 mackey[16];

- struct crypto_cipher *tfm_aes;
struct crypto_shash *tfm_cmac;
struct crypto_kpp *tfm_ecdh;
};
@@ -377,22 +376,18 @@ static int smp_h7(struct crypto_shash *tfm_cmac, const u8 w[16],
* s1 and ah.
*/

-static int smp_e(struct crypto_cipher *tfm, const u8 *k, u8 *r)
+static int smp_e(const u8 *k, u8 *r)
{
+ struct crypto_aes_ctx ctx;
uint8_t tmp[16], data[16];
int err;

SMP_DBG("k %16phN r %16phN", k, r);

- if (!tfm) {
- BT_ERR("tfm %p", tfm);
- return -EINVAL;
- }
-
/* The most significant octet of key corresponds to k[0] */
swap_buf(k, tmp, 16);

- err = crypto_cipher_setkey(tfm, tmp, 16);
+ err = aes_expandkey(&ctx, tmp, 16);
if (err) {
BT_ERR("cipher setkey failed: %d", err);
return err;
@@ -401,17 +396,18 @@ static int smp_e(struct crypto_cipher *tfm, const u8 *k, u8 *r)
/* Most significant octet of plaintextData corresponds to data[0] */
swap_buf(r, data, 16);

- crypto_cipher_encrypt_one(tfm, data, data);
+ aes_encrypt(&ctx, data, data);

/* Most significant octet of encryptedData corresponds to data[0] */
swap_buf(data, r, 16);

SMP_DBG("r %16phN", r);

+ memzero_explicit(&ctx, sizeof (ctx));
return err;
}

-static int smp_c1(struct crypto_cipher *tfm_aes, const u8 k[16],
+static int smp_c1(const u8 k[16],
const u8 r[16], const u8 preq[7], const u8 pres[7], u8 _iat,
const bdaddr_t *ia, u8 _rat, const bdaddr_t *ra, u8 res[16])
{
@@ -436,7 +432,7 @@ static int smp_c1(struct crypto_cipher *tfm_aes, const u8 k[16],
u128_xor((u128 *) res, (u128 *) r, (u128 *) p1);

/* res = e(k, res) */
- err = smp_e(tfm_aes, k, res);
+ err = smp_e(k, res);
if (err) {
BT_ERR("Encrypt data error");
return err;
@@ -453,14 +449,14 @@ static int smp_c1(struct crypto_cipher *tfm_aes, const u8 k[16],
u128_xor((u128 *) res, (u128 *) res, (u128 *) p2);

/* res = e(k, res) */
- err = smp_e(tfm_aes, k, res);
+ err = smp_e(k, res);
if (err)
BT_ERR("Encrypt data error");

return err;
}

-static int smp_s1(struct crypto_cipher *tfm_aes, const u8 k[16],
+static int smp_s1(const u8 k[16],
const u8 r1[16], const u8 r2[16], u8 _r[16])
{
int err;
@@ -469,15 +465,14 @@ static int smp_s1(struct crypto_cipher *tfm_aes, const u8 k[16],
memcpy(_r, r2, 8);
memcpy(_r + 8, r1, 8);

- err = smp_e(tfm_aes, k, _r);
+ err = smp_e(k, _r);
if (err)
BT_ERR("Encrypt data error");

return err;
}

-static int smp_ah(struct crypto_cipher *tfm, const u8 irk[16],
- const u8 r[3], u8 res[3])
+static int smp_ah(const u8 irk[16], const u8 r[3], u8 res[3])
{
u8 _res[16];
int err;
@@ -486,7 +481,7 @@ static int smp_ah(struct crypto_cipher *tfm, const u8 irk[16],
memcpy(_res, r, 3);
memset(_res + 3, 0, 13);

- err = smp_e(tfm, irk, _res);
+ err = smp_e(irk, _res);
if (err) {
BT_ERR("Encrypt error");
return err;
@@ -518,7 +513,7 @@ bool smp_irk_matches(struct hci_dev *hdev, const u8 irk[16],

BT_DBG("RPA %pMR IRK %*phN", bdaddr, 16, irk);

- err = smp_ah(smp->tfm_aes, irk, &bdaddr->b[3], hash);
+ err = smp_ah(irk, &bdaddr->b[3], hash);
if (err)
return false;

@@ -541,7 +536,7 @@ int smp_generate_rpa(struct hci_dev *hdev, const u8 irk[16], bdaddr_t *rpa)
rpa->b[5] &= 0x3f; /* Clear two most significant bits */
rpa->b[5] |= 0x40; /* Set second most significant bit */

- err = smp_ah(smp->tfm_aes, irk, &rpa->b[3], rpa->b);
+ err = smp_ah(irk, &rpa->b[3], rpa->b);
if (err < 0)
return err;

@@ -768,7 +763,6 @@ static void smp_chan_destroy(struct l2cap_conn *conn)
kzfree(smp->slave_csrk);
kzfree(smp->link_key);

- crypto_free_cipher(smp->tfm_aes);
crypto_free_shash(smp->tfm_cmac);
crypto_free_kpp(smp->tfm_ecdh);

@@ -957,7 +951,7 @@ static u8 smp_confirm(struct smp_chan *smp)

BT_DBG("conn %p", conn);

- ret = smp_c1(smp->tfm_aes, smp->tk, smp->prnd, smp->preq, smp->prsp,
+ ret = smp_c1(smp->tk, smp->prnd, smp->preq, smp->prsp,
conn->hcon->init_addr_type, &conn->hcon->init_addr,
conn->hcon->resp_addr_type, &conn->hcon->resp_addr,
cp.confirm_val);
@@ -983,12 +977,9 @@ static u8 smp_random(struct smp_chan *smp)
u8 confirm[16];
int ret;

- if (IS_ERR_OR_NULL(smp->tfm_aes))
- return SMP_UNSPECIFIED;
-
BT_DBG("conn %p %s", conn, conn->hcon->out ? "master" : "slave");

- ret = smp_c1(smp->tfm_aes, smp->tk, smp->rrnd, smp->preq, smp->prsp,
+ ret = smp_c1(smp->tk, smp->rrnd, smp->preq, smp->prsp,
hcon->init_addr_type, &hcon->init_addr,
hcon->resp_addr_type, &hcon->resp_addr, confirm);
if (ret)
@@ -1005,7 +996,7 @@ static u8 smp_random(struct smp_chan *smp)
__le64 rand = 0;
__le16 ediv = 0;

- smp_s1(smp->tfm_aes, smp->tk, smp->rrnd, smp->prnd, stk);
+ smp_s1(smp->tk, smp->rrnd, smp->prnd, stk);

if (test_and_set_bit(HCI_CONN_ENCRYPT_PEND, &hcon->flags))
return SMP_UNSPECIFIED;
@@ -1021,7 +1012,7 @@ static u8 smp_random(struct smp_chan *smp)
smp_send_cmd(conn, SMP_CMD_PAIRING_RANDOM, sizeof(smp->prnd),
smp->prnd);

- smp_s1(smp->tfm_aes, smp->tk, smp->prnd, smp->rrnd, stk);
+ smp_s1(smp->tk, smp->prnd, smp->rrnd, stk);

if (hcon->pending_sec_level == BT_SECURITY_HIGH)
auth = 1;
@@ -1389,16 +1380,10 @@ static struct smp_chan *smp_chan_create(struct l2cap_conn *conn)
if (!smp)
return NULL;

- smp->tfm_aes = crypto_alloc_cipher("aes", 0, 0);
- if (IS_ERR(smp->tfm_aes)) {
- BT_ERR("Unable to create AES crypto context");
- goto zfree_smp;
- }
-
smp->tfm_cmac = crypto_alloc_shash("cmac(aes)", 0, 0);
if (IS_ERR(smp->tfm_cmac)) {
BT_ERR("Unable to create CMAC crypto context");
- goto free_cipher;
+ goto zfree_smp;
}

smp->tfm_ecdh = crypto_alloc_kpp("ecdh", CRYPTO_ALG_INTERNAL, 0);
@@ -1420,8 +1405,6 @@ static struct smp_chan *smp_chan_create(struct l2cap_conn *conn)

free_shash:
crypto_free_shash(smp->tfm_cmac);
-free_cipher:
- crypto_free_cipher(smp->tfm_aes);
zfree_smp:
kzfree(smp);
return NULL;
@@ -3219,7 +3202,6 @@ static struct l2cap_chan *smp_add_cid(struct hci_dev *hdev, u16 cid)
{
struct l2cap_chan *chan;
struct smp_dev *smp;
- struct crypto_cipher *tfm_aes;
struct crypto_shash *tfm_cmac;
struct crypto_kpp *tfm_ecdh;

@@ -3232,17 +3214,9 @@ static struct l2cap_chan *smp_add_cid(struct hci_dev *hdev, u16 cid)
if (!smp)
return ERR_PTR(-ENOMEM);

- tfm_aes = crypto_alloc_cipher("aes", 0, 0);
- if (IS_ERR(tfm_aes)) {
- BT_ERR("Unable to create AES crypto context");
- kzfree(smp);
- return ERR_CAST(tfm_aes);
- }
-
tfm_cmac = crypto_alloc_shash("cmac(aes)", 0, 0);
if (IS_ERR(tfm_cmac)) {
BT_ERR("Unable to create CMAC crypto context");
- crypto_free_cipher(tfm_aes);
kzfree(smp);
return ERR_CAST(tfm_cmac);
}
@@ -3251,13 +3225,11 @@ static struct l2cap_chan *smp_add_cid(struct hci_dev *hdev, u16 cid)
if (IS_ERR(tfm_ecdh)) {
BT_ERR("Unable to create ECDH crypto context");
crypto_free_shash(tfm_cmac);
- crypto_free_cipher(tfm_aes);
kzfree(smp);
return ERR_CAST(tfm_ecdh);
}

smp->local_oob = false;
- smp->tfm_aes = tfm_aes;
smp->tfm_cmac = tfm_cmac;
smp->tfm_ecdh = tfm_ecdh;

@@ -3265,7 +3237,6 @@ static struct l2cap_chan *smp_add_cid(struct hci_dev *hdev, u16 cid)
chan = l2cap_chan_create();
if (!chan) {
if (smp) {
- crypto_free_cipher(smp->tfm_aes);
crypto_free_shash(smp->tfm_cmac);
crypto_free_kpp(smp->tfm_ecdh);
kzfree(smp);
@@ -3313,7 +3284,6 @@ static void smp_del_chan(struct l2cap_chan *chan)
smp = chan->data;
if (smp) {
chan->data = NULL;
- crypto_free_cipher(smp->tfm_aes);
crypto_free_shash(smp->tfm_cmac);
crypto_free_kpp(smp->tfm_ecdh);
kzfree(smp);
@@ -3569,7 +3539,7 @@ static int __init test_debug_key(struct crypto_kpp *tfm_ecdh)
return 0;
}

-static int __init test_ah(struct crypto_cipher *tfm_aes)
+static int __init test_ah(void)
{
const u8 irk[16] = {
0x9b, 0x7d, 0x39, 0x0a, 0xa6, 0x10, 0x10, 0x34,
@@ -3579,7 +3549,7 @@ static int __init test_ah(struct crypto_cipher *tfm_aes)
u8 res[3];
int err;

- err = smp_ah(tfm_aes, irk, r, res);
+ err = smp_ah(irk, r, res);
if (err)
return err;

@@ -3589,7 +3559,7 @@ static int __init test_ah(struct crypto_cipher *tfm_aes)
return 0;
}

-static int __init test_c1(struct crypto_cipher *tfm_aes)
+static int __init test_c1(void)
{
const u8 k[16] = {
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
@@ -3609,7 +3579,7 @@ static int __init test_c1(struct crypto_cipher *tfm_aes)
u8 res[16];
int err;

- err = smp_c1(tfm_aes, k, r, preq, pres, _iat, &ia, _rat, &ra, res);
+ err = smp_c1(k, r, preq, pres, _iat, &ia, _rat, &ra, res);
if (err)
return err;

@@ -3619,7 +3589,7 @@ static int __init test_c1(struct crypto_cipher *tfm_aes)
return 0;
}

-static int __init test_s1(struct crypto_cipher *tfm_aes)
+static int __init test_s1(void)
{
const u8 k[16] = {
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
@@ -3634,7 +3604,7 @@ static int __init test_s1(struct crypto_cipher *tfm_aes)
u8 res[16];
int err;

- err = smp_s1(tfm_aes, k, r1, r2, res);
+ err = smp_s1(k, r1, r2, res);
if (err)
return err;

@@ -3815,8 +3785,7 @@ static const struct file_operations test_smp_fops = {
.llseek = default_llseek,
};

-static int __init run_selftests(struct crypto_cipher *tfm_aes,
- struct crypto_shash *tfm_cmac,
+static int __init run_selftests(struct crypto_shash *tfm_cmac,
struct crypto_kpp *tfm_ecdh)
{
ktime_t calltime, delta, rettime;
@@ -3831,19 +3800,19 @@ static int __init run_selftests(struct crypto_cipher *tfm_aes,
goto done;
}

- err = test_ah(tfm_aes);
+ err = test_ah();
if (err) {
BT_ERR("smp_ah test failed");
goto done;
}

- err = test_c1(tfm_aes);
+ err = test_c1();
if (err) {
BT_ERR("smp_c1 test failed");
goto done;
}

- err = test_s1(tfm_aes);
+ err = test_s1();
if (err) {
BT_ERR("smp_s1 test failed");
goto done;
@@ -3900,21 +3869,13 @@ static int __init run_selftests(struct crypto_cipher *tfm_aes,

int __init bt_selftest_smp(void)
{
- struct crypto_cipher *tfm_aes;
struct crypto_shash *tfm_cmac;
struct crypto_kpp *tfm_ecdh;
int err;

- tfm_aes = crypto_alloc_cipher("aes", 0, 0);
- if (IS_ERR(tfm_aes)) {
- BT_ERR("Unable to create AES crypto context");
- return PTR_ERR(tfm_aes);
- }
-
tfm_cmac = crypto_alloc_shash("cmac(aes)", 0, 0);
if (IS_ERR(tfm_cmac)) {
BT_ERR("Unable to create CMAC crypto context");
- crypto_free_cipher(tfm_aes);
return PTR_ERR(tfm_cmac);
}

@@ -3922,14 +3883,12 @@ int __init bt_selftest_smp(void)
if (IS_ERR(tfm_ecdh)) {
BT_ERR("Unable to create ECDH crypto context");
crypto_free_shash(tfm_cmac);
- crypto_free_cipher(tfm_aes);
return PTR_ERR(tfm_ecdh);
}

- err = run_selftests(tfm_aes, tfm_cmac, tfm_ecdh);
+ err = run_selftests(tfm_cmac, tfm_ecdh);

crypto_free_shash(tfm_cmac);
- crypto_free_cipher(tfm_aes);
crypto_free_kpp(tfm_ecdh);

return err;
--
2.17.1

2019-07-02 19:43:08

by Ard Biesheuvel

[permalink] [raw]
Subject: [PATCH v4 24/32] crypto: amcc/aes - switch to AES library for GCM key derivation

The AMCC code for GCM key derivation allocates a AES cipher to
perform a single block encryption. So let's switch to the new
and more lightweight AES library instead.

Signed-off-by: Ard Biesheuvel <[email protected]>
---
drivers/crypto/Kconfig | 2 +-
drivers/crypto/amcc/crypto4xx_alg.c | 24 +++++++-------------
2 files changed, 9 insertions(+), 17 deletions(-)

diff --git a/drivers/crypto/Kconfig b/drivers/crypto/Kconfig
index b30b84089d11..c7ac1e6d23d4 100644
--- a/drivers/crypto/Kconfig
+++ b/drivers/crypto/Kconfig
@@ -311,7 +311,7 @@ config CRYPTO_DEV_PPC4XX
depends on PPC && 4xx
select CRYPTO_HASH
select CRYPTO_AEAD
- select CRYPTO_AES
+ select CRYPTO_LIB_AES
select CRYPTO_CCM
select CRYPTO_CTR
select CRYPTO_GCM
diff --git a/drivers/crypto/amcc/crypto4xx_alg.c b/drivers/crypto/amcc/crypto4xx_alg.c
index 26f86fd7532b..d3660703a36c 100644
--- a/drivers/crypto/amcc/crypto4xx_alg.c
+++ b/drivers/crypto/amcc/crypto4xx_alg.c
@@ -536,28 +536,20 @@ static int crypto4xx_aes_gcm_validate_keylen(unsigned int keylen)
static int crypto4xx_compute_gcm_hash_key_sw(__le32 *hash_start, const u8 *key,
unsigned int keylen)
{
- struct crypto_cipher *aes_tfm = NULL;
+ struct crypto_aes_ctx ctx;
uint8_t src[16] = { 0 };
- int rc = 0;
-
- aes_tfm = crypto_alloc_cipher("aes", 0, CRYPTO_ALG_NEED_FALLBACK);
- if (IS_ERR(aes_tfm)) {
- rc = PTR_ERR(aes_tfm);
- pr_warn("could not load aes cipher driver: %d\n", rc);
- return rc;
- }
+ int rc;

- rc = crypto_cipher_setkey(aes_tfm, key, keylen);
+ rc = aes_expandkey(&ctx, key, keylen);
if (rc) {
- pr_err("setkey() failed: %d\n", rc);
- goto out;
+ pr_err("aes_expandkey() failed: %d\n", rc);
+ return rc;
}

- crypto_cipher_encrypt_one(aes_tfm, src, src);
+ aes_encrypt(&ctx, src, src);
crypto4xx_memcpy_to_le32(hash_start, src, 16);
-out:
- crypto_free_cipher(aes_tfm);
- return rc;
+ memzero_explicit(&ctx, sizeof(ctx));
+ return 0;
}

int crypto4xx_setkey_aes_gcm(struct crypto_aead *cipher,
--
2.17.1

2019-07-02 19:43:13

by Ard Biesheuvel

[permalink] [raw]
Subject: [PATCH v4 32/32] crypto: arm/aes-scalar - unexport en/decryption routines

The scalar table based AES routines are not used by other drivers, so
let's keep it that way and unexport the symbols.

Signed-off-by: Ard Biesheuvel <[email protected]>
---
arch/arm/crypto/aes-cipher-glue.c | 3 ---
1 file changed, 3 deletions(-)

diff --git a/arch/arm/crypto/aes-cipher-glue.c b/arch/arm/crypto/aes-cipher-glue.c
index f6c07867b8ff..26a2b81c2c12 100644
--- a/arch/arm/crypto/aes-cipher-glue.c
+++ b/arch/arm/crypto/aes-cipher-glue.c
@@ -14,10 +14,7 @@
#include <linux/module.h>

asmlinkage void __aes_arm_encrypt(u32 *rk, int rounds, const u8 *in, u8 *out);
-EXPORT_SYMBOL(__aes_arm_encrypt);
-
asmlinkage void __aes_arm_decrypt(u32 *rk, int rounds, const u8 *in, u8 *out);
-EXPORT_SYMBOL(__aes_arm_decrypt);

static void aes_arm_encrypt(struct crypto_tfm *tfm, u8 *out, const u8 *in)
{
--
2.17.1

2019-07-02 19:43:14

by Ard Biesheuvel

[permalink] [raw]
Subject: [PATCH v4 30/32] crypto: arm/aes-cipher - switch to shared AES inverse Sbox

Signed-off-by: Ard Biesheuvel <[email protected]>
---
arch/arm/crypto/aes-cipher-core.S | 40 +-------------------
1 file changed, 1 insertion(+), 39 deletions(-)

diff --git a/arch/arm/crypto/aes-cipher-core.S b/arch/arm/crypto/aes-cipher-core.S
index f2d67c095e59..180d8555a09c 100644
--- a/arch/arm/crypto/aes-cipher-core.S
+++ b/arch/arm/crypto/aes-cipher-core.S
@@ -222,43 +222,5 @@ ENDPROC(__aes_arm_encrypt)

.align 5
ENTRY(__aes_arm_decrypt)
- do_crypt iround, crypto_it_tab, __aes_arm_inverse_sbox, 0
+ do_crypt iround, crypto_it_tab, crypto_aes_inv_sbox, 0
ENDPROC(__aes_arm_decrypt)
-
- .section ".rodata", "a"
- .align L1_CACHE_SHIFT
- .type __aes_arm_inverse_sbox, %object
-__aes_arm_inverse_sbox:
- .byte 0x52, 0x09, 0x6a, 0xd5, 0x30, 0x36, 0xa5, 0x38
- .byte 0xbf, 0x40, 0xa3, 0x9e, 0x81, 0xf3, 0xd7, 0xfb
- .byte 0x7c, 0xe3, 0x39, 0x82, 0x9b, 0x2f, 0xff, 0x87
- .byte 0x34, 0x8e, 0x43, 0x44, 0xc4, 0xde, 0xe9, 0xcb
- .byte 0x54, 0x7b, 0x94, 0x32, 0xa6, 0xc2, 0x23, 0x3d
- .byte 0xee, 0x4c, 0x95, 0x0b, 0x42, 0xfa, 0xc3, 0x4e
- .byte 0x08, 0x2e, 0xa1, 0x66, 0x28, 0xd9, 0x24, 0xb2
- .byte 0x76, 0x5b, 0xa2, 0x49, 0x6d, 0x8b, 0xd1, 0x25
- .byte 0x72, 0xf8, 0xf6, 0x64, 0x86, 0x68, 0x98, 0x16
- .byte 0xd4, 0xa4, 0x5c, 0xcc, 0x5d, 0x65, 0xb6, 0x92
- .byte 0x6c, 0x70, 0x48, 0x50, 0xfd, 0xed, 0xb9, 0xda
- .byte 0x5e, 0x15, 0x46, 0x57, 0xa7, 0x8d, 0x9d, 0x84
- .byte 0x90, 0xd8, 0xab, 0x00, 0x8c, 0xbc, 0xd3, 0x0a
- .byte 0xf7, 0xe4, 0x58, 0x05, 0xb8, 0xb3, 0x45, 0x06
- .byte 0xd0, 0x2c, 0x1e, 0x8f, 0xca, 0x3f, 0x0f, 0x02
- .byte 0xc1, 0xaf, 0xbd, 0x03, 0x01, 0x13, 0x8a, 0x6b
- .byte 0x3a, 0x91, 0x11, 0x41, 0x4f, 0x67, 0xdc, 0xea
- .byte 0x97, 0xf2, 0xcf, 0xce, 0xf0, 0xb4, 0xe6, 0x73
- .byte 0x96, 0xac, 0x74, 0x22, 0xe7, 0xad, 0x35, 0x85
- .byte 0xe2, 0xf9, 0x37, 0xe8, 0x1c, 0x75, 0xdf, 0x6e
- .byte 0x47, 0xf1, 0x1a, 0x71, 0x1d, 0x29, 0xc5, 0x89
- .byte 0x6f, 0xb7, 0x62, 0x0e, 0xaa, 0x18, 0xbe, 0x1b
- .byte 0xfc, 0x56, 0x3e, 0x4b, 0xc6, 0xd2, 0x79, 0x20
- .byte 0x9a, 0xdb, 0xc0, 0xfe, 0x78, 0xcd, 0x5a, 0xf4
- .byte 0x1f, 0xdd, 0xa8, 0x33, 0x88, 0x07, 0xc7, 0x31
- .byte 0xb1, 0x12, 0x10, 0x59, 0x27, 0x80, 0xec, 0x5f
- .byte 0x60, 0x51, 0x7f, 0xa9, 0x19, 0xb5, 0x4a, 0x0d
- .byte 0x2d, 0xe5, 0x7a, 0x9f, 0x93, 0xc9, 0x9c, 0xef
- .byte 0xa0, 0xe0, 0x3b, 0x4d, 0xae, 0x2a, 0xf5, 0xb0
- .byte 0xc8, 0xeb, 0xbb, 0x3c, 0x83, 0x53, 0x99, 0x61
- .byte 0x17, 0x2b, 0x04, 0x7e, 0xba, 0x77, 0xd6, 0x26
- .byte 0xe1, 0x69, 0x14, 0x63, 0x55, 0x21, 0x0c, 0x7d
- .size __aes_arm_inverse_sbox, . - __aes_arm_inverse_sbox
--
2.17.1

2019-07-02 19:43:14

by Ard Biesheuvel

[permalink] [raw]
Subject: [PATCH v4 25/32] crypto: ccp - move to AES library for CMAC key derivation

Use the AES library instead of the cipher interface to perform
the single block of AES processing involved in updating the key
of the cmac(aes) hash.

Signed-off-by: Ard Biesheuvel <[email protected]>
---
drivers/crypto/ccp/Kconfig | 1 +
drivers/crypto/ccp/ccp-crypto-aes-cmac.c | 25 ++++----------------
drivers/crypto/ccp/ccp-crypto.h | 3 ---
3 files changed, 5 insertions(+), 24 deletions(-)

diff --git a/drivers/crypto/ccp/Kconfig b/drivers/crypto/ccp/Kconfig
index b9dfae47aefd..ee06d0fccdb5 100644
--- a/drivers/crypto/ccp/Kconfig
+++ b/drivers/crypto/ccp/Kconfig
@@ -29,6 +29,7 @@ config CRYPTO_DEV_CCP_CRYPTO
select CRYPTO_BLKCIPHER
select CRYPTO_AUTHENC
select CRYPTO_RSA
+ select CRYPTO_LIB_AES
help
Support for using the cryptographic API with the AMD Cryptographic
Coprocessor. This module supports offload of SHA and AES algorithms.
diff --git a/drivers/crypto/ccp/ccp-crypto-aes-cmac.c b/drivers/crypto/ccp/ccp-crypto-aes-cmac.c
index f6e252c1d6fb..c8f4b29bf044 100644
--- a/drivers/crypto/ccp/ccp-crypto-aes-cmac.c
+++ b/drivers/crypto/ccp/ccp-crypto-aes-cmac.c
@@ -264,6 +264,7 @@ static int ccp_aes_cmac_setkey(struct crypto_ahash *tfm, const u8 *key,
ccp_crypto_ahash_alg(crypto_ahash_tfm(tfm));
u64 k0_hi, k0_lo, k1_hi, k1_lo, k2_hi, k2_lo;
u64 rb_hi = 0x00, rb_lo = 0x87;
+ struct crypto_aes_ctx aes;
__be64 *gk;
int ret;

@@ -287,14 +288,14 @@ static int ccp_aes_cmac_setkey(struct crypto_ahash *tfm, const u8 *key,
ctx->u.aes.key_len = 0;

/* Set the key for the AES cipher used to generate the keys */
- ret = crypto_cipher_setkey(ctx->u.aes.tfm_cipher, key, key_len);
+ ret = aes_expandkey(&aes, key, key_len);
if (ret)
return ret;

/* Encrypt a block of zeroes - use key area in context */
memset(ctx->u.aes.key, 0, sizeof(ctx->u.aes.key));
- crypto_cipher_encrypt_one(ctx->u.aes.tfm_cipher, ctx->u.aes.key,
- ctx->u.aes.key);
+ aes_encrypt(&aes, ctx->u.aes.key, ctx->u.aes.key);
+ memzero_explicit(&aes, sizeof(aes));

/* Generate K1 and K2 */
k0_hi = be64_to_cpu(*((__be64 *)ctx->u.aes.key));
@@ -339,32 +340,15 @@ static int ccp_aes_cmac_cra_init(struct crypto_tfm *tfm)
{
struct ccp_ctx *ctx = crypto_tfm_ctx(tfm);
struct crypto_ahash *ahash = __crypto_ahash_cast(tfm);
- struct crypto_cipher *cipher_tfm;

ctx->complete = ccp_aes_cmac_complete;
ctx->u.aes.key_len = 0;

crypto_ahash_set_reqsize(ahash, sizeof(struct ccp_aes_cmac_req_ctx));

- cipher_tfm = crypto_alloc_cipher("aes", 0, CRYPTO_ALG_NEED_FALLBACK);
- if (IS_ERR(cipher_tfm)) {
- pr_warn("could not load aes cipher driver\n");
- return PTR_ERR(cipher_tfm);
- }
- ctx->u.aes.tfm_cipher = cipher_tfm;
-
return 0;
}

-static void ccp_aes_cmac_cra_exit(struct crypto_tfm *tfm)
-{
- struct ccp_ctx *ctx = crypto_tfm_ctx(tfm);
-
- if (ctx->u.aes.tfm_cipher)
- crypto_free_cipher(ctx->u.aes.tfm_cipher);
- ctx->u.aes.tfm_cipher = NULL;
-}
-
int ccp_register_aes_cmac_algs(struct list_head *head)
{
struct ccp_crypto_ahash_alg *ccp_alg;
@@ -404,7 +388,6 @@ int ccp_register_aes_cmac_algs(struct list_head *head)
base->cra_ctxsize = sizeof(struct ccp_ctx);
base->cra_priority = CCP_CRA_PRIORITY;
base->cra_init = ccp_aes_cmac_cra_init;
- base->cra_exit = ccp_aes_cmac_cra_exit;
base->cra_module = THIS_MODULE;

ret = crypto_register_ahash(alg);
diff --git a/drivers/crypto/ccp/ccp-crypto.h b/drivers/crypto/ccp/ccp-crypto.h
index 28819e11db96..9100df77a7b3 100644
--- a/drivers/crypto/ccp/ccp-crypto.h
+++ b/drivers/crypto/ccp/ccp-crypto.h
@@ -90,9 +90,6 @@ struct ccp_aes_ctx {
/* Fallback cipher for XTS with unsupported unit sizes */
struct crypto_sync_skcipher *tfm_skcipher;

- /* Cipher used to generate CMAC K1/K2 keys */
- struct crypto_cipher *tfm_cipher;
-
enum ccp_engine engine;
enum ccp_aes_type type;
enum ccp_aes_mode mode;
--
2.17.1

2019-07-02 19:43:14

by Ard Biesheuvel

[permalink] [raw]
Subject: [PATCH v4 27/32] crypto: aes/generic - unexport last-round AES tables

The versions of the AES lookup tables that are only used during the last
round are never used outside of the driver, so there is no need to
export their symbols.

Signed-off-by: Ard Biesheuvel <[email protected]>
---
crypto/aes_generic.c | 6 ++----
include/crypto/aes.h | 2 --
2 files changed, 2 insertions(+), 6 deletions(-)

diff --git a/crypto/aes_generic.c b/crypto/aes_generic.c
index 426deb437f19..71a5c190d360 100644
--- a/crypto/aes_generic.c
+++ b/crypto/aes_generic.c
@@ -328,7 +328,7 @@ __visible const u32 crypto_ft_tab[4][256] ____cacheline_aligned = {
}
};

-__visible const u32 crypto_fl_tab[4][256] ____cacheline_aligned = {
+static const u32 crypto_fl_tab[4][256] ____cacheline_aligned = {
{
0x00000063, 0x0000007c, 0x00000077, 0x0000007b,
0x000000f2, 0x0000006b, 0x0000006f, 0x000000c5,
@@ -856,7 +856,7 @@ __visible const u32 crypto_it_tab[4][256] ____cacheline_aligned = {
}
};

-__visible const u32 crypto_il_tab[4][256] ____cacheline_aligned = {
+static const u32 crypto_il_tab[4][256] ____cacheline_aligned = {
{
0x00000052, 0x00000009, 0x0000006a, 0x000000d5,
0x00000030, 0x00000036, 0x000000a5, 0x00000038,
@@ -1121,9 +1121,7 @@ __visible const u32 crypto_il_tab[4][256] ____cacheline_aligned = {
};

EXPORT_SYMBOL_GPL(crypto_ft_tab);
-EXPORT_SYMBOL_GPL(crypto_fl_tab);
EXPORT_SYMBOL_GPL(crypto_it_tab);
-EXPORT_SYMBOL_GPL(crypto_il_tab);

/**
* crypto_aes_set_key - Set the AES key.
diff --git a/include/crypto/aes.h b/include/crypto/aes.h
index 0a64a977f9b3..df8426fd8051 100644
--- a/include/crypto/aes.h
+++ b/include/crypto/aes.h
@@ -29,9 +29,7 @@ struct crypto_aes_ctx {
};

extern const u32 crypto_ft_tab[4][256] ____cacheline_aligned;
-extern const u32 crypto_fl_tab[4][256] ____cacheline_aligned;
extern const u32 crypto_it_tab[4][256] ____cacheline_aligned;
-extern const u32 crypto_il_tab[4][256] ____cacheline_aligned;

int crypto_aes_set_key(struct crypto_tfm *tfm, const u8 *in_key,
unsigned int key_len);
--
2.17.1

2019-07-02 19:43:40

by Ard Biesheuvel

[permalink] [raw]
Subject: [PATCH v4 31/32] crypto: arm64/aes-cipher - switch to shared AES inverse Sbox

Signed-off-by: Ard Biesheuvel <[email protected]>
---
arch/arm64/crypto/aes-cipher-core.S | 40 +-------------------
1 file changed, 1 insertion(+), 39 deletions(-)

diff --git a/arch/arm64/crypto/aes-cipher-core.S b/arch/arm64/crypto/aes-cipher-core.S
index 3a44eada2347..27dac259b359 100644
--- a/arch/arm64/crypto/aes-cipher-core.S
+++ b/arch/arm64/crypto/aes-cipher-core.S
@@ -131,43 +131,5 @@ ENDPROC(__aes_arm64_encrypt)

.align 5
ENTRY(__aes_arm64_decrypt)
- do_crypt iround, crypto_it_tab, __aes_arm64_inverse_sbox, 0
+ do_crypt iround, crypto_it_tab, crypto_aes_inv_sbox, 0
ENDPROC(__aes_arm64_decrypt)
-
- .section ".rodata", "a"
- .align L1_CACHE_SHIFT
- .type __aes_arm64_inverse_sbox, %object
-__aes_arm64_inverse_sbox:
- .byte 0x52, 0x09, 0x6a, 0xd5, 0x30, 0x36, 0xa5, 0x38
- .byte 0xbf, 0x40, 0xa3, 0x9e, 0x81, 0xf3, 0xd7, 0xfb
- .byte 0x7c, 0xe3, 0x39, 0x82, 0x9b, 0x2f, 0xff, 0x87
- .byte 0x34, 0x8e, 0x43, 0x44, 0xc4, 0xde, 0xe9, 0xcb
- .byte 0x54, 0x7b, 0x94, 0x32, 0xa6, 0xc2, 0x23, 0x3d
- .byte 0xee, 0x4c, 0x95, 0x0b, 0x42, 0xfa, 0xc3, 0x4e
- .byte 0x08, 0x2e, 0xa1, 0x66, 0x28, 0xd9, 0x24, 0xb2
- .byte 0x76, 0x5b, 0xa2, 0x49, 0x6d, 0x8b, 0xd1, 0x25
- .byte 0x72, 0xf8, 0xf6, 0x64, 0x86, 0x68, 0x98, 0x16
- .byte 0xd4, 0xa4, 0x5c, 0xcc, 0x5d, 0x65, 0xb6, 0x92
- .byte 0x6c, 0x70, 0x48, 0x50, 0xfd, 0xed, 0xb9, 0xda
- .byte 0x5e, 0x15, 0x46, 0x57, 0xa7, 0x8d, 0x9d, 0x84
- .byte 0x90, 0xd8, 0xab, 0x00, 0x8c, 0xbc, 0xd3, 0x0a
- .byte 0xf7, 0xe4, 0x58, 0x05, 0xb8, 0xb3, 0x45, 0x06
- .byte 0xd0, 0x2c, 0x1e, 0x8f, 0xca, 0x3f, 0x0f, 0x02
- .byte 0xc1, 0xaf, 0xbd, 0x03, 0x01, 0x13, 0x8a, 0x6b
- .byte 0x3a, 0x91, 0x11, 0x41, 0x4f, 0x67, 0xdc, 0xea
- .byte 0x97, 0xf2, 0xcf, 0xce, 0xf0, 0xb4, 0xe6, 0x73
- .byte 0x96, 0xac, 0x74, 0x22, 0xe7, 0xad, 0x35, 0x85
- .byte 0xe2, 0xf9, 0x37, 0xe8, 0x1c, 0x75, 0xdf, 0x6e
- .byte 0x47, 0xf1, 0x1a, 0x71, 0x1d, 0x29, 0xc5, 0x89
- .byte 0x6f, 0xb7, 0x62, 0x0e, 0xaa, 0x18, 0xbe, 0x1b
- .byte 0xfc, 0x56, 0x3e, 0x4b, 0xc6, 0xd2, 0x79, 0x20
- .byte 0x9a, 0xdb, 0xc0, 0xfe, 0x78, 0xcd, 0x5a, 0xf4
- .byte 0x1f, 0xdd, 0xa8, 0x33, 0x88, 0x07, 0xc7, 0x31
- .byte 0xb1, 0x12, 0x10, 0x59, 0x27, 0x80, 0xec, 0x5f
- .byte 0x60, 0x51, 0x7f, 0xa9, 0x19, 0xb5, 0x4a, 0x0d
- .byte 0x2d, 0xe5, 0x7a, 0x9f, 0x93, 0xc9, 0x9c, 0xef
- .byte 0xa0, 0xe0, 0x3b, 0x4d, 0xae, 0x2a, 0xf5, 0xb0
- .byte 0xc8, 0xeb, 0xbb, 0x3c, 0x83, 0x53, 0x99, 0x61
- .byte 0x17, 0x2b, 0x04, 0x7e, 0xba, 0x77, 0xd6, 0x26
- .byte 0xe1, 0x69, 0x14, 0x63, 0x55, 0x21, 0x0c, 0x7d
- .size __aes_arm64_inverse_sbox, . - __aes_arm64_inverse_sbox
--
2.17.1

2019-07-02 19:43:42

by Ard Biesheuvel

[permalink] [raw]
Subject: [PATCH v4 29/32] crypto: arm64/aes-neon - switch to shared AES Sboxes

Signed-off-by: Ard Biesheuvel <[email protected]>
---
arch/arm64/crypto/aes-neon.S | 74 +-------------------
1 file changed, 3 insertions(+), 71 deletions(-)

diff --git a/arch/arm64/crypto/aes-neon.S b/arch/arm64/crypto/aes-neon.S
index 29100f692e8a..169e86d8ae36 100644
--- a/arch/arm64/crypto/aes-neon.S
+++ b/arch/arm64/crypto/aes-neon.S
@@ -50,7 +50,7 @@

/* do preload for encryption */
.macro enc_prepare, ignore0, ignore1, temp
- prepare .LForward_Sbox, .LForward_ShiftRows, \temp
+ prepare crypto_aes_sbox, .LForward_ShiftRows, \temp
.endm

.macro enc_switch_key, ignore0, ignore1, temp
@@ -59,7 +59,7 @@

/* do preload for decryption */
.macro dec_prepare, ignore0, ignore1, temp
- prepare .LReverse_Sbox, .LReverse_ShiftRows, \temp
+ prepare crypto_aes_inv_sbox, .LReverse_ShiftRows, \temp
.endm

/* apply SubBytes transformation using the the preloaded Sbox */
@@ -279,75 +279,7 @@
#include "aes-modes.S"

.section ".rodata", "a"
- .align 6
-.LForward_Sbox:
- .byte 0x63, 0x7c, 0x77, 0x7b, 0xf2, 0x6b, 0x6f, 0xc5
- .byte 0x30, 0x01, 0x67, 0x2b, 0xfe, 0xd7, 0xab, 0x76
- .byte 0xca, 0x82, 0xc9, 0x7d, 0xfa, 0x59, 0x47, 0xf0
- .byte 0xad, 0xd4, 0xa2, 0xaf, 0x9c, 0xa4, 0x72, 0xc0
- .byte 0xb7, 0xfd, 0x93, 0x26, 0x36, 0x3f, 0xf7, 0xcc
- .byte 0x34, 0xa5, 0xe5, 0xf1, 0x71, 0xd8, 0x31, 0x15
- .byte 0x04, 0xc7, 0x23, 0xc3, 0x18, 0x96, 0x05, 0x9a
- .byte 0x07, 0x12, 0x80, 0xe2, 0xeb, 0x27, 0xb2, 0x75
- .byte 0x09, 0x83, 0x2c, 0x1a, 0x1b, 0x6e, 0x5a, 0xa0
- .byte 0x52, 0x3b, 0xd6, 0xb3, 0x29, 0xe3, 0x2f, 0x84
- .byte 0x53, 0xd1, 0x00, 0xed, 0x20, 0xfc, 0xb1, 0x5b
- .byte 0x6a, 0xcb, 0xbe, 0x39, 0x4a, 0x4c, 0x58, 0xcf
- .byte 0xd0, 0xef, 0xaa, 0xfb, 0x43, 0x4d, 0x33, 0x85
- .byte 0x45, 0xf9, 0x02, 0x7f, 0x50, 0x3c, 0x9f, 0xa8
- .byte 0x51, 0xa3, 0x40, 0x8f, 0x92, 0x9d, 0x38, 0xf5
- .byte 0xbc, 0xb6, 0xda, 0x21, 0x10, 0xff, 0xf3, 0xd2
- .byte 0xcd, 0x0c, 0x13, 0xec, 0x5f, 0x97, 0x44, 0x17
- .byte 0xc4, 0xa7, 0x7e, 0x3d, 0x64, 0x5d, 0x19, 0x73
- .byte 0x60, 0x81, 0x4f, 0xdc, 0x22, 0x2a, 0x90, 0x88
- .byte 0x46, 0xee, 0xb8, 0x14, 0xde, 0x5e, 0x0b, 0xdb
- .byte 0xe0, 0x32, 0x3a, 0x0a, 0x49, 0x06, 0x24, 0x5c
- .byte 0xc2, 0xd3, 0xac, 0x62, 0x91, 0x95, 0xe4, 0x79
- .byte 0xe7, 0xc8, 0x37, 0x6d, 0x8d, 0xd5, 0x4e, 0xa9
- .byte 0x6c, 0x56, 0xf4, 0xea, 0x65, 0x7a, 0xae, 0x08
- .byte 0xba, 0x78, 0x25, 0x2e, 0x1c, 0xa6, 0xb4, 0xc6
- .byte 0xe8, 0xdd, 0x74, 0x1f, 0x4b, 0xbd, 0x8b, 0x8a
- .byte 0x70, 0x3e, 0xb5, 0x66, 0x48, 0x03, 0xf6, 0x0e
- .byte 0x61, 0x35, 0x57, 0xb9, 0x86, 0xc1, 0x1d, 0x9e
- .byte 0xe1, 0xf8, 0x98, 0x11, 0x69, 0xd9, 0x8e, 0x94
- .byte 0x9b, 0x1e, 0x87, 0xe9, 0xce, 0x55, 0x28, 0xdf
- .byte 0x8c, 0xa1, 0x89, 0x0d, 0xbf, 0xe6, 0x42, 0x68
- .byte 0x41, 0x99, 0x2d, 0x0f, 0xb0, 0x54, 0xbb, 0x16
-
-.LReverse_Sbox:
- .byte 0x52, 0x09, 0x6a, 0xd5, 0x30, 0x36, 0xa5, 0x38
- .byte 0xbf, 0x40, 0xa3, 0x9e, 0x81, 0xf3, 0xd7, 0xfb
- .byte 0x7c, 0xe3, 0x39, 0x82, 0x9b, 0x2f, 0xff, 0x87
- .byte 0x34, 0x8e, 0x43, 0x44, 0xc4, 0xde, 0xe9, 0xcb
- .byte 0x54, 0x7b, 0x94, 0x32, 0xa6, 0xc2, 0x23, 0x3d
- .byte 0xee, 0x4c, 0x95, 0x0b, 0x42, 0xfa, 0xc3, 0x4e
- .byte 0x08, 0x2e, 0xa1, 0x66, 0x28, 0xd9, 0x24, 0xb2
- .byte 0x76, 0x5b, 0xa2, 0x49, 0x6d, 0x8b, 0xd1, 0x25
- .byte 0x72, 0xf8, 0xf6, 0x64, 0x86, 0x68, 0x98, 0x16
- .byte 0xd4, 0xa4, 0x5c, 0xcc, 0x5d, 0x65, 0xb6, 0x92
- .byte 0x6c, 0x70, 0x48, 0x50, 0xfd, 0xed, 0xb9, 0xda
- .byte 0x5e, 0x15, 0x46, 0x57, 0xa7, 0x8d, 0x9d, 0x84
- .byte 0x90, 0xd8, 0xab, 0x00, 0x8c, 0xbc, 0xd3, 0x0a
- .byte 0xf7, 0xe4, 0x58, 0x05, 0xb8, 0xb3, 0x45, 0x06
- .byte 0xd0, 0x2c, 0x1e, 0x8f, 0xca, 0x3f, 0x0f, 0x02
- .byte 0xc1, 0xaf, 0xbd, 0x03, 0x01, 0x13, 0x8a, 0x6b
- .byte 0x3a, 0x91, 0x11, 0x41, 0x4f, 0x67, 0xdc, 0xea
- .byte 0x97, 0xf2, 0xcf, 0xce, 0xf0, 0xb4, 0xe6, 0x73
- .byte 0x96, 0xac, 0x74, 0x22, 0xe7, 0xad, 0x35, 0x85
- .byte 0xe2, 0xf9, 0x37, 0xe8, 0x1c, 0x75, 0xdf, 0x6e
- .byte 0x47, 0xf1, 0x1a, 0x71, 0x1d, 0x29, 0xc5, 0x89
- .byte 0x6f, 0xb7, 0x62, 0x0e, 0xaa, 0x18, 0xbe, 0x1b
- .byte 0xfc, 0x56, 0x3e, 0x4b, 0xc6, 0xd2, 0x79, 0x20
- .byte 0x9a, 0xdb, 0xc0, 0xfe, 0x78, 0xcd, 0x5a, 0xf4
- .byte 0x1f, 0xdd, 0xa8, 0x33, 0x88, 0x07, 0xc7, 0x31
- .byte 0xb1, 0x12, 0x10, 0x59, 0x27, 0x80, 0xec, 0x5f
- .byte 0x60, 0x51, 0x7f, 0xa9, 0x19, 0xb5, 0x4a, 0x0d
- .byte 0x2d, 0xe5, 0x7a, 0x9f, 0x93, 0xc9, 0x9c, 0xef
- .byte 0xa0, 0xe0, 0x3b, 0x4d, 0xae, 0x2a, 0xf5, 0xb0
- .byte 0xc8, 0xeb, 0xbb, 0x3c, 0x83, 0x53, 0x99, 0x61
- .byte 0x17, 0x2b, 0x04, 0x7e, 0xba, 0x77, 0xd6, 0x26
- .byte 0xe1, 0x69, 0x14, 0x63, 0x55, 0x21, 0x0c, 0x7d
-
+ .align 4
.LForward_ShiftRows:
.octa 0x0b06010c07020d08030e09040f0a0500

--
2.17.1

2019-07-02 19:45:21

by Ard Biesheuvel

[permalink] [raw]
Subject: [PATCH v4 28/32] crypto: lib/aes - export sbox and inverse sbox

There are a few copies of the AES S-boxes floating around, so export
the ones from the AES library so that we can reuse them in other
modules.

Signed-off-by: Ard Biesheuvel <[email protected]>
---
include/crypto/aes.h | 3 +++
lib/crypto/aes.c | 6 ++++++
2 files changed, 9 insertions(+)

diff --git a/include/crypto/aes.h b/include/crypto/aes.h
index df8426fd8051..8e0f4cf948e5 100644
--- a/include/crypto/aes.h
+++ b/include/crypto/aes.h
@@ -67,4 +67,7 @@ void aes_encrypt(const struct crypto_aes_ctx *ctx, u8 *out, const u8 *in);
*/
void aes_decrypt(const struct crypto_aes_ctx *ctx, u8 *out, const u8 *in);

+extern const u8 crypto_aes_sbox[];
+extern const u8 crypto_aes_inv_sbox[];
+
#endif
diff --git a/lib/crypto/aes.c b/lib/crypto/aes.c
index 9928b23e0a8a..4e100af38c51 100644
--- a/lib/crypto/aes.c
+++ b/lib/crypto/aes.c
@@ -82,6 +82,12 @@ static volatile const u8 __cacheline_aligned aes_inv_sbox[] = {
0xe1, 0x69, 0x14, 0x63, 0x55, 0x21, 0x0c, 0x7d,
};

+extern const u8 crypto_aes_sbox[256] __alias(aes_sbox);
+extern const u8 crypto_aes_inv_sbox[256] __alias(aes_inv_sbox);
+
+EXPORT_SYMBOL(crypto_aes_sbox);
+EXPORT_SYMBOL(crypto_aes_inv_sbox);
+
static u32 mul_by_x(u32 w)
{
u32 x = w & 0x7f7f7f7f;
--
2.17.1

2019-07-02 19:45:36

by Ard Biesheuvel

[permalink] [raw]
Subject: [PATCH v4 26/32] crypto: chelsio/aes - replace AES cipher calls with library calls

Replace a couple of occurrences where the "aes-generic" cipher is
instantiated explicitly and only used for encryption of a single block.
Use AES library calls instead.

Signed-off-by: Ard Biesheuvel <[email protected]>
---
drivers/crypto/chelsio/Kconfig | 1 +
drivers/crypto/chelsio/chcr_algo.c | 46 ++++++--------------
drivers/crypto/chelsio/chcr_crypto.h | 1 -
drivers/crypto/chelsio/chcr_ipsec.c | 19 +++-----
drivers/crypto/chelsio/chtls/chtls_hw.c | 20 +++------
5 files changed, 26 insertions(+), 61 deletions(-)

diff --git a/drivers/crypto/chelsio/Kconfig b/drivers/crypto/chelsio/Kconfig
index 930d82d991f2..36402ba63b50 100644
--- a/drivers/crypto/chelsio/Kconfig
+++ b/drivers/crypto/chelsio/Kconfig
@@ -1,6 +1,7 @@
config CRYPTO_DEV_CHELSIO
tristate "Chelsio Crypto Co-processor Driver"
depends on CHELSIO_T4
+ select CRYPTO_LIB_AES
select CRYPTO_SHA1
select CRYPTO_SHA256
select CRYPTO_SHA512
diff --git a/drivers/crypto/chelsio/chcr_algo.c b/drivers/crypto/chelsio/chcr_algo.c
index 177f572b9589..38ee38b37ae6 100644
--- a/drivers/crypto/chelsio/chcr_algo.c
+++ b/drivers/crypto/chelsio/chcr_algo.c
@@ -1023,22 +1023,21 @@ static int chcr_update_tweak(struct ablkcipher_request *req, u8 *iv,
struct crypto_ablkcipher *tfm = crypto_ablkcipher_reqtfm(req);
struct ablk_ctx *ablkctx = ABLK_CTX(c_ctx(tfm));
struct chcr_blkcipher_req_ctx *reqctx = ablkcipher_request_ctx(req);
- struct crypto_cipher *cipher;
+ struct crypto_aes_ctx aes;
int ret, i;
u8 *key;
unsigned int keylen;
int round = reqctx->last_req_len / AES_BLOCK_SIZE;
int round8 = round / 8;

- cipher = ablkctx->aes_generic;
memcpy(iv, reqctx->iv, AES_BLOCK_SIZE);

keylen = ablkctx->enckey_len / 2;
key = ablkctx->key + keylen;
- ret = crypto_cipher_setkey(cipher, key, keylen);
+ ret = aes_expandkey(&aes, key, keylen);
if (ret)
- goto out;
- crypto_cipher_encrypt_one(cipher, iv, iv);
+ return ret;
+ aes_encrypt(&aes, iv, iv);
for (i = 0; i < round8; i++)
gf128mul_x8_ble((le128 *)iv, (le128 *)iv);

@@ -1046,9 +1045,10 @@ static int chcr_update_tweak(struct ablkcipher_request *req, u8 *iv,
gf128mul_x_ble((le128 *)iv, (le128 *)iv);

if (!isfinal)
- crypto_cipher_decrypt_one(cipher, iv, iv);
-out:
- return ret;
+ aes_decrypt(&aes, iv, iv);
+
+ memzero_explicit(&aes, sizeof(aes));
+ return 0;
}

static int chcr_update_cipher_iv(struct ablkcipher_request *req,
@@ -1411,16 +1411,6 @@ static int chcr_cra_init(struct crypto_tfm *tfm)
return PTR_ERR(ablkctx->sw_cipher);
}

- if (get_cryptoalg_subtype(tfm) == CRYPTO_ALG_SUB_TYPE_XTS) {
- /* To update tweak*/
- ablkctx->aes_generic = crypto_alloc_cipher("aes-generic", 0, 0);
- if (IS_ERR(ablkctx->aes_generic)) {
- pr_err("failed to allocate aes cipher for tweak\n");
- return PTR_ERR(ablkctx->aes_generic);
- }
- } else
- ablkctx->aes_generic = NULL;
-
tfm->crt_ablkcipher.reqsize = sizeof(struct chcr_blkcipher_req_ctx);
return chcr_device_init(crypto_tfm_ctx(tfm));
}
@@ -1451,8 +1441,6 @@ static void chcr_cra_exit(struct crypto_tfm *tfm)
struct ablk_ctx *ablkctx = ABLK_CTX(ctx);

crypto_free_sync_skcipher(ablkctx->sw_cipher);
- if (ablkctx->aes_generic)
- crypto_free_cipher(ablkctx->aes_generic);
}

static int get_alg_config(struct algo_param *params,
@@ -3364,9 +3352,9 @@ static int chcr_gcm_setkey(struct crypto_aead *aead, const u8 *key,
{
struct chcr_aead_ctx *aeadctx = AEAD_CTX(a_ctx(aead));
struct chcr_gcm_ctx *gctx = GCM_CTX(aeadctx);
- struct crypto_cipher *cipher;
unsigned int ck_size;
int ret = 0, key_ctx_size = 0;
+ struct crypto_aes_ctx aes;

aeadctx->enckey_len = 0;
crypto_aead_clear_flags(aeadctx->sw_cipher, CRYPTO_TFM_REQ_MASK);
@@ -3409,23 +3397,15 @@ static int chcr_gcm_setkey(struct crypto_aead *aead, const u8 *key,
/* Calculate the H = CIPH(K, 0 repeated 16 times).
* It will go in key context
*/
- cipher = crypto_alloc_cipher("aes-generic", 0, 0);
- if (IS_ERR(cipher)) {
- aeadctx->enckey_len = 0;
- ret = -ENOMEM;
- goto out;
- }
-
- ret = crypto_cipher_setkey(cipher, key, keylen);
+ ret = aes_expandkey(&aes, key, keylen);
if (ret) {
aeadctx->enckey_len = 0;
- goto out1;
+ goto out;
}
memset(gctx->ghash_h, 0, AEAD_H_SIZE);
- crypto_cipher_encrypt_one(cipher, gctx->ghash_h, gctx->ghash_h);
+ aes_encrypt(&aes, gctx->ghash_h, gctx->ghash_h);
+ memzero_explicit(&aes, sizeof(aes));

-out1:
- crypto_free_cipher(cipher);
out:
return ret;
}
diff --git a/drivers/crypto/chelsio/chcr_crypto.h b/drivers/crypto/chelsio/chcr_crypto.h
index 655606f2e4d0..993c97e70565 100644
--- a/drivers/crypto/chelsio/chcr_crypto.h
+++ b/drivers/crypto/chelsio/chcr_crypto.h
@@ -172,7 +172,6 @@ static inline struct chcr_context *h_ctx(struct crypto_ahash *tfm)

struct ablk_ctx {
struct crypto_sync_skcipher *sw_cipher;
- struct crypto_cipher *aes_generic;
__be32 key_ctx_hdr;
unsigned int enckey_len;
unsigned char ciph_mode;
diff --git a/drivers/crypto/chelsio/chcr_ipsec.c b/drivers/crypto/chelsio/chcr_ipsec.c
index f429aae72542..24355680f30a 100644
--- a/drivers/crypto/chelsio/chcr_ipsec.c
+++ b/drivers/crypto/chelsio/chcr_ipsec.c
@@ -132,11 +132,11 @@ static inline int chcr_ipsec_setauthsize(struct xfrm_state *x,
static inline int chcr_ipsec_setkey(struct xfrm_state *x,
struct ipsec_sa_entry *sa_entry)
{
- struct crypto_cipher *cipher;
int keylen = (x->aead->alg_key_len + 7) / 8;
unsigned char *key = x->aead->alg_key;
int ck_size, key_ctx_size = 0;
unsigned char ghash_h[AEAD_H_SIZE];
+ struct crypto_aes_ctx aes;
int ret = 0;

if (keylen > 3) {
@@ -170,26 +170,19 @@ static inline int chcr_ipsec_setkey(struct xfrm_state *x,
/* Calculate the H = CIPH(K, 0 repeated 16 times).
* It will go in key context
*/
- cipher = crypto_alloc_cipher("aes-generic", 0, 0);
- if (IS_ERR(cipher)) {
- sa_entry->enckey_len = 0;
- ret = -ENOMEM;
- goto out;
- }
-
- ret = crypto_cipher_setkey(cipher, key, keylen);
+ ret = aes_expandkey(&aes, key, keylen);
if (ret) {
sa_entry->enckey_len = 0;
- goto out1;
+ goto out;
}
memset(ghash_h, 0, AEAD_H_SIZE);
- crypto_cipher_encrypt_one(cipher, ghash_h, ghash_h);
+ aes_encrypt(&aes, ghash_h, ghash_h);
+ memzero_explicit(&aes, sizeof(aes));
+
memcpy(sa_entry->key + (DIV_ROUND_UP(sa_entry->enckey_len, 16) *
16), ghash_h, AEAD_H_SIZE);
sa_entry->kctx_len = ((DIV_ROUND_UP(sa_entry->enckey_len, 16)) << 4) +
AEAD_H_SIZE;
-out1:
- crypto_free_cipher(cipher);
out:
return ret;
}
diff --git a/drivers/crypto/chelsio/chtls/chtls_hw.c b/drivers/crypto/chelsio/chtls/chtls_hw.c
index 490960755864..a6f0278f3597 100644
--- a/drivers/crypto/chelsio/chtls/chtls_hw.c
+++ b/drivers/crypto/chelsio/chtls/chtls_hw.c
@@ -216,8 +216,8 @@ static int chtls_key_info(struct chtls_sock *csk,
unsigned char key[AES_KEYSIZE_128];
struct tls12_crypto_info_aes_gcm_128 *gcm_ctx;
unsigned char ghash_h[AEAD_H_SIZE];
- struct crypto_cipher *cipher;
int ck_size, key_ctx_size;
+ struct crypto_aes_ctx aes;
int ret;

gcm_ctx = (struct tls12_crypto_info_aes_gcm_128 *)
@@ -237,18 +237,13 @@ static int chtls_key_info(struct chtls_sock *csk,
/* Calculate the H = CIPH(K, 0 repeated 16 times).
* It will go in key context
*/
- cipher = crypto_alloc_cipher("aes", 0, 0);
- if (IS_ERR(cipher)) {
- ret = -ENOMEM;
- goto out;
- }
-
- ret = crypto_cipher_setkey(cipher, key, keylen);
+ ret = aes_expandkey(&aes, key, keylen);
if (ret)
- goto out1;
+ return ret;

memset(ghash_h, 0, AEAD_H_SIZE);
- crypto_cipher_encrypt_one(cipher, ghash_h, ghash_h);
+ aes_encrypt(&aes, ghash_h, ghash_h);
+ memzero_explicit(&aes, sizeof(aes));
csk->tlshws.keylen = key_ctx_size;

/* Copy the Key context */
@@ -272,10 +267,7 @@ static int chtls_key_info(struct chtls_sock *csk,
/* erase key info from driver */
memset(gcm_ctx->key, 0, keylen);

-out1:
- crypto_free_cipher(cipher);
-out:
- return ret;
+ return 0;
}

static void chtls_set_scmd(struct chtls_sock *csk)
--
2.17.1

2019-07-26 12:32:24

by Herbert Xu

[permalink] [raw]
Subject: Re: [PATCH v4 00/32] crypto: AES cleanup

Ard Biesheuvel <[email protected]> wrote:
> This started out as an attempt to provide synchronous SIMD based GCM
> on 32-bit ARM, but along the way, I ended up changing and cleaning up
> so many things that it is more of a general AES cleanup now rather than
> anything else.
>
> Changes since v3:
> - fix build warning due to missing array dimensions of exported sboxes
> - rename the internal sparc64 AES routines so they don't clash with the
> new library symbols
> - fix a couple of comments that got truncated
>
> Changes since v2:
> - fix a bug in the CTR helper function - use chunksize not blocksize of the
> skcipher as the blocksize of the CTR transformation
> - add a couple of patches so all AES implementation share the forward and
> inverse Sboxes that are in the AES library.
> - unexpert data structures and helpers that are not actually (or should be)
> used outside the drivers that define them
>
> Code can be found here:
> https://git.kernel.org/pub/scm/linux/kernel/git/ardb/linux.git/log/?h=aes-cleanup-v4
>
> Changes since v1/RFC:
> - rename x86 AES-NI and padlock routines as well, in order to avoid clashes (#2)
> - move irq en/disabling out of the AES library into the callers (aes-ti
> and the skcipher helper for sync ctr(aes) added in #17)
> - align 32-bit ARM CE key schedule endianness with other AES drivers, to
> avoid problems on BE systems when using the synchronous ctr fallback (#18)
> - replace some occurrences where a "aes" or "aes-generic" cipher was allocated
> explicitly, and use library calls instead.
> - use a generic helper in crypto/ctr.h instead of adding a CTR helper to the
> AES library for providing the synchronous CTR fallback code.
>
> Some users of the AES cipher are being switched to other algorithms (i.e.,
> SipHash for TCP fastopen and CCM or cbcmac for wusb and lib80211). These
> have been posted separately, since they have no build time interdependencies.
>
> ----- Original blurb below ------
>
> On 32-bit ARM, asynchronous GCM can be provided by the following drivers:
>
> | cycles per byte on low end Si
> gcm_base(ctr(aes-generic),ghash-generic) | 65.3
> gcm_base(ctr-aes-neonbs,ghash-ce) [*] | 27.7
> gcm_base(ctr-aes-ce,ghash-ce) [**] | 3.7
>
> [*] ghash-ce using vmull.p8 instructions
> [**] ghash-ce using optional vmull.p64 instructions
>
> The third and fastest option is actually only available on 32-bit cores that
> implement the v8 Crypto Extensions, which are rare, but the NEON based runner
> up is obviously a huge improvement over the generic code, not only in terms of
> performance, but also because it is time invariant (generic AES and generic
> GHASH are both implemented using lookup tables, which are susceptible to
> cache timing attacks)
>
> However, when allocating the gcm(aes) skcipher in synchronous mode, we end up
> with the generic code, due to the fact that the NEON code has no handling for
> invocations that occur from a context where the NEON cannot be used, and so
> it defers the processing to a kthread, which is only permitted for asynchronous
> ciphers.
>
> So let's implement this fallback handling, by reusing some of the logic that
> has already been implemented for arm64. Note that these fallbacks are rarely
> called in practice, but the API requires the functionality to be there.
> This is implemented in patches 16-22.
>
> All the patches leading up to that are cleanups for the AES code, to reduce
> the dependency on the generic table based AES code, or in some cases, hardcoded
> dependencies on the scalar arm64 asm code which suffers from the same problem.
> It also removes redundant key expansion routines, and gets rid of the x86
> scalar asm code, which is a maintenance burden and is not actually faster than
> the generic code built with a modern compiler.
>
> Ard Biesheuvel (32):
> crypto: arm/aes-ce - cosmetic/whitespace cleanup
> crypto: aes - rename local routines to prevent future clashes
> crypto: aes/fixed-time - align key schedule with other implementations
> crypto: aes - create AES library based on the fixed time AES code
> crypto: x86/aes-ni - switch to generic for fallback and key routines
> crypto: x86/aes - drop scalar assembler implementations
> crypto: padlock/aes - switch to library version of key expansion
> routine
> crypto: cesa/aes - switch to library version of key expansion routine
> crypto: safexcel/aes - switch to library version of key expansion
> routine
> crypto: arm64/ghash - switch to AES library
> crypto: arm/aes-neonbs - switch to library version of key expansion
> routine
> crypto: arm64/aes-ccm - switch to AES library
> crypto: arm64/aes-neonbs - switch to library version of key expansion
> routine
> crypto: arm64/aes-ce - switch to library version of key expansion
> routine
> crypto: generic/aes - drop key expansion routine in favor of library
> version
> crypto: ctr - add helper for performing a CTR encryption walk
> crypto: aes - move sync ctr(aes) to AES library and generic helper
> crypto: arm64/aes-ce-cipher - use AES library as fallback
> crypto: aes/arm - use native endiannes for key schedule
> crypto: arm/aes-ce - provide a synchronous version of ctr(aes)
> crypto: arm/aes-neonbs - provide a synchronous version of ctr(aes)
> crypto: arm/ghash - provide a synchronous version
> bluetooth: switch to AES library
> crypto: amcc/aes - switch to AES library for GCM key derivation
> crypto: ccp - move to AES library for CMAC key derivation
> crypto: chelsio/aes - replace AES cipher calls with library calls
> crypto: aes/generic - unexport last-round AES tables
> crypto: lib/aes - export sbox and inverse sbox
> crypto: arm64/aes-neon - switch to shared AES Sboxes
> crypto: arm/aes-cipher - switch to shared AES inverse Sbox
> crypto: arm64/aes-cipher - switch to shared AES inverse Sbox
> crypto: arm/aes-scalar - unexport en/decryption routines
>
> arch/arm/crypto/Kconfig | 2 +-
> arch/arm/crypto/aes-ce-core.S | 20 +-
> arch/arm/crypto/aes-ce-glue.c | 168 ++++----
> arch/arm/crypto/aes-cipher-core.S | 40 +-
> arch/arm/crypto/aes-cipher-glue.c | 11 +-
> arch/arm/crypto/aes-neonbs-glue.c | 69 +++-
> arch/arm/crypto/ghash-ce-glue.c | 78 ++--
> arch/arm64/crypto/Kconfig | 10 +-
> arch/arm64/crypto/aes-ce-ccm-glue.c | 18 +-
> arch/arm64/crypto/aes-ce-glue.c | 7 +-
> arch/arm64/crypto/aes-cipher-core.S | 40 +-
> arch/arm64/crypto/aes-cipher-glue.c | 11 +-
> arch/arm64/crypto/aes-ctr-fallback.h | 53 ---
> arch/arm64/crypto/aes-glue.c | 39 +-
> arch/arm64/crypto/aes-neon.S | 74 +---
> arch/arm64/crypto/aes-neonbs-glue.c | 29 +-
> arch/arm64/crypto/ghash-ce-glue.c | 30 +-
> arch/sparc/crypto/aes_glue.c | 8 +-
> arch/x86/crypto/Makefile | 4 -
> arch/x86/crypto/aes-i586-asm_32.S | 362 ------------------
> arch/x86/crypto/aes-x86_64-asm_64.S | 185 ---------
> arch/x86/crypto/aes_glue.c | 70 ----
> arch/x86/crypto/aesni-intel_glue.c | 23 +-
> arch/x86/include/asm/crypto/aes.h | 12 -
> crypto/Kconfig | 52 +--
> crypto/aes_generic.c | 167 +-------
> crypto/aes_ti.c | 317 +--------------
> drivers/crypto/Kconfig | 8 +-
> drivers/crypto/amcc/crypto4xx_alg.c | 24 +-
> drivers/crypto/ccp/Kconfig | 1 +
> drivers/crypto/ccp/ccp-crypto-aes-cmac.c | 25 +-
> drivers/crypto/ccp/ccp-crypto.h | 3 -
> drivers/crypto/chelsio/Kconfig | 1 +
> drivers/crypto/chelsio/chcr_algo.c | 46 +--
> drivers/crypto/chelsio/chcr_crypto.h | 1 -
> drivers/crypto/chelsio/chcr_ipsec.c | 19 +-
> drivers/crypto/chelsio/chtls/chtls_hw.c | 20 +-
> .../crypto/inside-secure/safexcel_cipher.c | 2 +-
> drivers/crypto/marvell/cipher.c | 2 +-
> drivers/crypto/padlock-aes.c | 10 +-
> include/crypto/aes.h | 41 +-
> include/crypto/ctr.h | 50 +++
> lib/crypto/Makefile | 3 +
> lib/crypto/aes.c | 356 +++++++++++++++++
> net/bluetooth/Kconfig | 3 +-
> net/bluetooth/smp.c | 103 ++---
> 46 files changed, 877 insertions(+), 1740 deletions(-)
> delete mode 100644 arch/arm64/crypto/aes-ctr-fallback.h
> delete mode 100644 arch/x86/crypto/aes-i586-asm_32.S
> delete mode 100644 arch/x86/crypto/aes-x86_64-asm_64.S
> delete mode 100644 arch/x86/crypto/aes_glue.c
> delete mode 100644 arch/x86/include/asm/crypto/aes.h
> create mode 100644 lib/crypto/aes.c

All applied. Thanks.
--
Email: Herbert Xu <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

2019-10-27 10:09:57

by Christian Lamparter

[permalink] [raw]
Subject: Re: [PATCH v4 24/32] crypto: amcc/aes - switch to AES library for GCM key derivation

Hi,

On Tuesday, July 2, 2019 9:41:42 PM CET Ard Biesheuvel wrote:
> The AMCC code for GCM key derivation allocates a AES cipher to
> perform a single block encryption. So let's switch to the new
> and more lightweight AES library instead.
>
> Signed-off-by: Ard Biesheuvel <[email protected]>
> ---
> drivers/crypto/Kconfig | 2 +-
> drivers/crypto/amcc/crypto4xx_alg.c | 24 +++++++-------------
> 2 files changed, 9 insertions(+), 17 deletions(-)
>
> diff --git a/drivers/crypto/Kconfig b/drivers/crypto/Kconfig
> index b30b84089d11..c7ac1e6d23d4 100644
> --- a/drivers/crypto/Kconfig
> +++ b/drivers/crypto/Kconfig
> @@ -311,7 +311,7 @@ config CRYPTO_DEV_PPC4XX
> depends on PPC && 4xx
> select CRYPTO_HASH
> select CRYPTO_AEAD
> - select CRYPTO_AES
> + select CRYPTO_LIB_AES

I think that getting rid of CRYPTO_AES was not a good idea here.
Reason being that the crypto4xx driver registers fallbacks to cover
edge-cases for AES-CTR, AES-CCM and AES-GCM modes that the hardware
is incapbale of handling itself.

So without the dependency of CRYPTO_AES, I think there's now a way
to build the crypto4xx module without necessarily having CRYPTO_AES.
And if that's the case then the necessary fallbacks cannot be
instantiated and the driver will not provide the afromentioned modes.

Can somebody clarify?

Regards,
Christian