2022-12-20 05:44:33

by Eric Biggers

Subject: [PATCH 0/3] crypto: x86/ghash cleanups

These patches are a replacement for Peter Zijlstra's patch
"[RFC][PATCH 02/12] crypto/ghash-clmulni: Use (struct) be128"
(https://lore.kernel.org/r/[email protected]).

Eric Biggers (3):
crypto: x86/ghash - fix unaligned access in ghash_setkey()
crypto: x86/ghash - use le128 instead of u128
crypto: x86/ghash - add comment and fix broken link

arch/x86/crypto/ghash-clmulni-intel_asm.S | 6 +--
arch/x86/crypto/ghash-clmulni-intel_glue.c | 45 +++++++++++++++-------
2 files changed, 35 insertions(+), 16 deletions(-)


base-commit: 6feb57c2fd7c787aecf2846a535248899e7b70fa
--
2.39.0


2022-12-20 05:45:14

by Eric Biggers

Subject: [PATCH 1/3] crypto: x86/ghash - fix unaligned access in ghash_setkey()

From: Eric Biggers <[email protected]>

The key can be unaligned, so use the unaligned memory access helpers.

Fixes: 8ceee72808d1 ("crypto: ghash-clmulni-intel - use C implementation for setkey()")
Signed-off-by: Eric Biggers <[email protected]>
---
arch/x86/crypto/ghash-clmulni-intel_glue.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/x86/crypto/ghash-clmulni-intel_glue.c b/arch/x86/crypto/ghash-clmulni-intel_glue.c
index 1f1a95f3dd0c..c0ab0ff4af65 100644
--- a/arch/x86/crypto/ghash-clmulni-intel_glue.c
+++ b/arch/x86/crypto/ghash-clmulni-intel_glue.c
@@ -19,6 +19,7 @@
#include <crypto/internal/simd.h>
#include <asm/cpu_device_id.h>
#include <asm/simd.h>
+#include <asm/unaligned.h>

#define GHASH_BLOCK_SIZE 16
#define GHASH_DIGEST_SIZE 16
@@ -54,15 +55,14 @@ static int ghash_setkey(struct crypto_shash *tfm,
const u8 *key, unsigned int keylen)
{
struct ghash_ctx *ctx = crypto_shash_ctx(tfm);
- be128 *x = (be128 *)key;
u64 a, b;

if (keylen != GHASH_BLOCK_SIZE)
return -EINVAL;

/* perform multiplication by 'x' in GF(2^128) */
- a = be64_to_cpu(x->a);
- b = be64_to_cpu(x->b);
+ a = get_unaligned_be64(key);
+ b = get_unaligned_be64(key + 8);

ctx->shash.a = (b << 1) | (a >> 63);
ctx->shash.b = (a << 1) | (b >> 63);
--
2.39.0
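
As an aside on the helper used here: the sketch below is a rough user-space
equivalent of get_unaligned_be64() (an illustration only; the kernel's
<asm/unaligned.h> version is per-architecture and more efficient). Assembling
the value byte by byte means no aligned 64-bit load is ever issued, which is
exactly the property the fix relies on.

/*
 * Rough user-space sketch of what get_unaligned_be64() guarantees
 * (illustration only, not the kernel implementation): build the value
 * one byte at a time, most significant byte first, so the result is
 * independent of host endianness and no alignment is assumed.
 */
#include <stdint.h>

static inline uint64_t sketch_get_unaligned_be64(const void *p)
{
        const uint8_t *bytes = p;
        uint64_t v = 0;

        for (int i = 0; i < 8; i++)
                v = (v << 8) | bytes[i];        /* big-endian byte order */
        return v;
}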

2022-12-20 05:45:34

by Eric Biggers

Subject: [PATCH 3/3] crypto: x86/ghash - add comment and fix broken link

From: Eric Biggers <[email protected]>

Add a comment that explains what ghash_setkey() is doing, as it's hard
to understand otherwise. Also fix a broken hyperlink.

Signed-off-by: Eric Biggers <[email protected]>
---
arch/x86/crypto/ghash-clmulni-intel_asm.S | 2 +-
arch/x86/crypto/ghash-clmulni-intel_glue.c | 27 ++++++++++++++++++----
2 files changed, 24 insertions(+), 5 deletions(-)

diff --git a/arch/x86/crypto/ghash-clmulni-intel_asm.S b/arch/x86/crypto/ghash-clmulni-intel_asm.S
index 9dfeb4d31b92..257ed9446f3e 100644
--- a/arch/x86/crypto/ghash-clmulni-intel_asm.S
+++ b/arch/x86/crypto/ghash-clmulni-intel_asm.S
@@ -4,7 +4,7 @@
* instructions. This file contains accelerated part of ghash
* implementation. More information about PCLMULQDQ can be found at:
*
- * http://software.intel.com/en-us/articles/carry-less-multiplication-and-its-usage-for-computing-the-gcm-mode/
+ * https://www.intel.com/content/dam/develop/external/us/en/documents/clmul-wp-rev-2-02-2014-04-20.pdf
*
* Copyright (c) 2009 Intel Corp.
* Author: Huang Ying <[email protected]>
diff --git a/arch/x86/crypto/ghash-clmulni-intel_glue.c b/arch/x86/crypto/ghash-clmulni-intel_glue.c
index 9453b094bb3b..700ecaee9a08 100644
--- a/arch/x86/crypto/ghash-clmulni-intel_glue.c
+++ b/arch/x86/crypto/ghash-clmulni-intel_glue.c
@@ -60,16 +60,35 @@ static int ghash_setkey(struct crypto_shash *tfm,
if (keylen != GHASH_BLOCK_SIZE)
return -EINVAL;

- /* perform multiplication by 'x' in GF(2^128) */
+ /*
+ * GHASH maps bits to polynomial coefficients backwards, which makes it
+ * hard to implement. But it can be shown that the GHASH multiplication
+ *
+ * D * K (mod x^128 + x^7 + x^2 + x + 1)
+ *
+ * (where D is a data block and K is the key) is equivalent to:
+ *
+ * bitreflect(D) * bitreflect(K) * x^(-127)
+ * (mod x^128 + x^127 + x^126 + x^121 + 1)
+ *
+ * So, the code below precomputes:
+ *
+ * bitreflect(K) * x^(-127) (mod x^128 + x^127 + x^126 + x^121 + 1)
+ *
+ * ... but in Montgomery form (so that Montgomery multiplication can be
+ * used), i.e. with an extra x^128 factor, which means actually:
+ *
+ * bitreflect(K) * x (mod x^128 + x^127 + x^126 + x^121 + 1)
+ *
+ * The within-a-byte part of bitreflect() cancels out GHASH's built-in
+ * reflection, and thus bitreflect() is actually a byteswap.
+ */
a = get_unaligned_be64(key);
b = get_unaligned_be64(key + 8);
-
ctx->shash.a = cpu_to_le64((a << 1) | (b >> 63));
ctx->shash.b = cpu_to_le64((b << 1) | (a >> 63));
-
if (a >> 63)
ctx->shash.a ^= cpu_to_le64((u64)0xc2 << 56);
-
return 0;
}

--
2.39.0
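
To make the shift-and-reduce structure of the code above easier to experiment
with, here is a standalone sketch of the same precomputation (the struct and
function names are invented for this example; a and b are the high and low
halves of the key, already loaded big-endian as in ghash_setkey()):

/*
 * Standalone sketch of the key precomputation described in the comment
 * above (illustration only). The 128-bit value (a:b) is shifted left by
 * one bit, and if a bit fell off the top, the 0xc2 reduction constant is
 * folded back in -- i.e. bitreflect(K) is multiplied by x modulo
 * x^128 + x^127 + x^126 + x^121 + 1.
 */
#include <stdint.h>

struct ghash_key {
        uint64_t hi;    /* high-order 64 coefficients */
        uint64_t lo;    /* low-order 64 coefficients */
};

static void ghash_precompute_key(struct ghash_key *out, uint64_t a, uint64_t b)
{
        out->hi = (a << 1) | (b >> 63); /* shift the 128-bit value left by 1 */
        out->lo = (b << 1) | (a >> 63);
        if (a >> 63)                    /* the bit shifted out of bit 127 */
                out->hi ^= (uint64_t)0xc2 << 56;        /* reduce */
}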

2022-12-20 05:45:35

by Eric Biggers

Subject: [PATCH 2/3] crypto: x86/ghash - use le128 instead of u128

From: Eric Biggers <[email protected]>

The u128 struct type is going away, so make ghash-clmulni-intel use
le128 instead. Note that the field names a and b are swapped, as they were
backwards with u128 (a is meant to be the high-order half and b the
low-order half).

Signed-off-by: Eric Biggers <[email protected]>
---
arch/x86/crypto/ghash-clmulni-intel_asm.S | 4 ++--
arch/x86/crypto/ghash-clmulni-intel_glue.c | 12 ++++++------
2 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/arch/x86/crypto/ghash-clmulni-intel_asm.S b/arch/x86/crypto/ghash-clmulni-intel_asm.S
index 2bf871899920..9dfeb4d31b92 100644
--- a/arch/x86/crypto/ghash-clmulni-intel_asm.S
+++ b/arch/x86/crypto/ghash-clmulni-intel_asm.S
@@ -88,7 +88,7 @@ SYM_FUNC_START_LOCAL(__clmul_gf128mul_ble)
RET
SYM_FUNC_END(__clmul_gf128mul_ble)

-/* void clmul_ghash_mul(char *dst, const u128 *shash) */
+/* void clmul_ghash_mul(char *dst, const le128 *shash) */
SYM_FUNC_START(clmul_ghash_mul)
FRAME_BEGIN
movups (%rdi), DATA
@@ -104,7 +104,7 @@ SYM_FUNC_END(clmul_ghash_mul)

/*
* void clmul_ghash_update(char *dst, const char *src, unsigned int srclen,
- * const u128 *shash);
+ * const le128 *shash);
*/
SYM_FUNC_START(clmul_ghash_update)
FRAME_BEGIN
diff --git a/arch/x86/crypto/ghash-clmulni-intel_glue.c b/arch/x86/crypto/ghash-clmulni-intel_glue.c
index c0ab0ff4af65..9453b094bb3b 100644
--- a/arch/x86/crypto/ghash-clmulni-intel_glue.c
+++ b/arch/x86/crypto/ghash-clmulni-intel_glue.c
@@ -24,17 +24,17 @@
#define GHASH_BLOCK_SIZE 16
#define GHASH_DIGEST_SIZE 16

-void clmul_ghash_mul(char *dst, const u128 *shash);
+void clmul_ghash_mul(char *dst, const le128 *shash);

void clmul_ghash_update(char *dst, const char *src, unsigned int srclen,
- const u128 *shash);
+ const le128 *shash);

struct ghash_async_ctx {
struct cryptd_ahash *cryptd_tfm;
};

struct ghash_ctx {
- u128 shash;
+ le128 shash;
};

struct ghash_desc_ctx {
@@ -64,11 +64,11 @@ static int ghash_setkey(struct crypto_shash *tfm,
a = get_unaligned_be64(key);
b = get_unaligned_be64(key + 8);

- ctx->shash.a = (b << 1) | (a >> 63);
- ctx->shash.b = (a << 1) | (b >> 63);
+ ctx->shash.a = cpu_to_le64((a << 1) | (b >> 63));
+ ctx->shash.b = cpu_to_le64((b << 1) | (a >> 63));

if (a >> 63)
- ctx->shash.b ^= ((u64)0xc2) << 56;
+ ctx->shash.a ^= cpu_to_le64((u64)0xc2 << 56);

return 0;
}
--
2.39.0
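
For context on why the a/b assignments swap: the type definitions below are
quoted as I recall them from include/crypto/b128ops.h (not from this series,
so treat them as an assumption). In both be128 and le128, 'a' names the
high-order half, but le128 stores the low-order half first.

typedef struct {
        u64 a, b;               /* the old type: no byte order implied */
} u128;

typedef struct {
        __be64 a, b;            /* a = high-order half, stored first */
} be128;

typedef struct {
        __le64 b, a;            /* b = low-order half, stored first; a = high-order half */
} le128;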

2022-12-20 10:10:52

by Peter Zijlstra

Subject: Re: [PATCH 0/3] crypto: x86/ghash cleanups

On Mon, Dec 19, 2022 at 09:40:39PM -0800, Eric Biggers wrote:
> These patches are a replacement for Peter Zijlstra's patch
> "[RFC][PATCH 02/12] crypto/ghash-clmulni: Use (struct) be128"
> (https://lore.kernel.org/r/[email protected]).
>
> Eric Biggers (3):
> crypto: x86/ghash - fix unaligned access in ghash_setkey()
> crypto: x86/ghash - use le128 instead of u128
> crypto: x86/ghash - add comment and fix broken link
>
> arch/x86/crypto/ghash-clmulni-intel_asm.S | 6 +--
> arch/x86/crypto/ghash-clmulni-intel_glue.c | 45 +++++++++++++++-------
> 2 files changed, 35 insertions(+), 16 deletions(-)

Thanks! Lemme go rebase on top of this.

2022-12-20 10:22:03

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH 3/3] crypto: x86/ghash - add comment and fix broken link

On Mon, Dec 19, 2022 at 09:40:42PM -0800, Eric Biggers wrote:
> From: Eric Biggers <[email protected]>
>
> Add a comment that explains what ghash_setkey() is doing, as it's hard
> to understand otherwise. Also fix a broken hyperlink.
>
> Signed-off-by: Eric Biggers <[email protected]>
> ---
> arch/x86/crypto/ghash-clmulni-intel_asm.S | 2 +-
> arch/x86/crypto/ghash-clmulni-intel_glue.c | 27 ++++++++++++++++++----
> 2 files changed, 24 insertions(+), 5 deletions(-)
>
> diff --git a/arch/x86/crypto/ghash-clmulni-intel_asm.S b/arch/x86/crypto/ghash-clmulni-intel_asm.S
> index 9dfeb4d31b92..257ed9446f3e 100644
> --- a/arch/x86/crypto/ghash-clmulni-intel_asm.S
> +++ b/arch/x86/crypto/ghash-clmulni-intel_asm.S
> @@ -4,7 +4,7 @@
> * instructions. This file contains accelerated part of ghash
> * implementation. More information about PCLMULQDQ can be found at:
> *
> - * http://software.intel.com/en-us/articles/carry-less-multiplication-and-its-usage-for-computing-the-gcm-mode/
> + * https://www.intel.com/content/dam/develop/external/us/en/documents/clmul-wp-rev-2-02-2014-04-20.pdf

Since these things have a habit of changing, we tend to prefer to host
a copy on kernel.org somewhere (it used to be bugzilla, but perhaps there's
a better place these days).

> *
> * Copyright (c) 2009 Intel Corp.
> * Author: Huang Ying <[email protected]>
> diff --git a/arch/x86/crypto/ghash-clmulni-intel_glue.c b/arch/x86/crypto/ghash-clmulni-intel_glue.c
> index 9453b094bb3b..700ecaee9a08 100644
> --- a/arch/x86/crypto/ghash-clmulni-intel_glue.c
> +++ b/arch/x86/crypto/ghash-clmulni-intel_glue.c
> @@ -60,16 +60,35 @@ static int ghash_setkey(struct crypto_shash *tfm,
> if (keylen != GHASH_BLOCK_SIZE)
> return -EINVAL;
>
> - /* perform multiplication by 'x' in GF(2^128) */
> + /*
> + * GHASH maps bits to polynomial coefficients backwards, which makes it
> + * hard to implement. But it can be shown that the GHASH multiplication
> + *
> + * D * K (mod x^128 + x^7 + x^2 + x + 1)
> + *
> + * (where D is a data block and K is the key) is equivalent to:
> + *
> + * bitreflect(D) * bitreflect(K) * x^(-127)
> + * (mod x^128 + x^127 + x^126 + x^121 + 1)
> + *
> + * So, the code below precomputes:
> + *
> + * bitreflect(K) * x^(-127) (mod x^128 + x^127 + x^126 + x^121 + 1)
> + *
> + * ... but in Montgomery form (so that Montgomery multiplication can be
> + * used), i.e. with an extra x^128 factor, which means actually:
> + *
> + * bitreflect(K) * x (mod x^128 + x^127 + x^126 + x^121 + 1)
> + *
> + * The within-a-byte part of bitreflect() cancels out GHASH's built-in
> + * reflection, and thus bitreflect() is actually a byteswap.
> + */

Whee, thanks, that was indeed entirely non-obvious.

2022-12-30 09:17:54

by Herbert Xu

[permalink] [raw]
Subject: Re: [PATCH 3/3] crypto: x86/ghash - add comment and fix broken link

On Tue, Dec 20, 2022 at 11:09:16AM +0100, Peter Zijlstra wrote:
> On Mon, Dec 19, 2022 at 09:40:42PM -0800, Eric Biggers wrote:
>
> > - * http://software.intel.com/en-us/articles/carry-less-multiplication-and-its-usage-for-computing-the-gcm-mode/
> > + * https://www.intel.com/content/dam/develop/external/us/en/documents/clmul-wp-rev-2-02-2014-04-20.pdf
>
> Since these things have a habit of changing, we tend to prefer to host
> a copy on kernel.org somewhere (it used to be bugzilla, but perhaps there's
> a better place these days).

Alternatively just save a copy at web.archive.org when you submit
the patch and use that link. In fact it seems that the original
link is still available via web.archive.org (sans pictures).

Cheers,
--
Email: Herbert Xu <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

2022-12-30 15:20:56

by Herbert Xu

Subject: Re: [PATCH 0/3] crypto: x86/ghash cleanups

On Mon, Dec 19, 2022 at 09:40:39PM -0800, Eric Biggers wrote:
> These patches are a replacement for Peter Zijlstra's patch
> "[RFC][PATCH 02/12] crypto/ghash-clmulni: Use (struct) be128"
> (https://lore.kernel.org/r/[email protected]).
>
> Eric Biggers (3):
> crypto: x86/ghash - fix unaligned access in ghash_setkey()
> crypto: x86/ghash - use le128 instead of u128
> crypto: x86/ghash - add comment and fix broken link
>
> arch/x86/crypto/ghash-clmulni-intel_asm.S | 6 +--
> arch/x86/crypto/ghash-clmulni-intel_glue.c | 45 +++++++++++++++-------
> 2 files changed, 35 insertions(+), 16 deletions(-)
>
>
> base-commit: 6feb57c2fd7c787aecf2846a535248899e7b70fa
> --
> 2.39.0

All applied. Thanks.
--
Email: Herbert Xu <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt