2022-11-18 19:47:08

by Eric Biggers

Subject: [PATCH v2 00/12] crypto: CFI fixes

This series fixes some crashes when CONFIG_CFI_CLANG (Control Flow
Integrity) is enabled, with the new CFI implementation that was merged
in 6.1 and is supported on x86. Some of them were unconditional
crashes, while others depended on whether the compiler optimized out the
indirect calls or not. This series also simplifies some code that was
intended to work around limitations of the old CFI implementation and is
unnecessary for the new CFI implementation.
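
To illustrate the failure mode, here is a minimal composite sketch (not
lifted from any single patch) of the pattern the glue code uses; the
call through the function pointer is what CFI instruments:

#include <linux/linkage.h>
#include <linux/types.h>

struct sha256_state;	/* from <crypto/sha2.h> */

/*
 * Assembly implementation.  Its CFI type hash is only emitted if the
 * .S file defines it with SYM_TYPED_FUNC_START.
 */
asmlinkage void sha256_transform_avx(struct sha256_state *state,
				     const u8 *data, int blocks);

typedef void (sha256_block_fn)(struct sha256_state *state,
			       const u8 *data, int blocks);

static void do_blocks(struct sha256_state *state, const u8 *data,
		      int blocks, sha256_block_fn *block_fn)
{
	/*
	 * With CONFIG_CFI_CLANG=y, clang checks block_fn's type hash
	 * before this indirect call; a callee defined with plain
	 * SYM_FUNC_START has no hash, so the check fails and traps.
	 */
	block_fn(state, data, blocks);
}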

Changed in v2:
- Added patch "crypto: x86/sm4 - fix crash with CFI enabled"
- Restored accidentally-deleted include of <asm/assembler.h>
- Tweaked some commit messages and added Reviewed-by and Acked-by tags

Eric Biggers (12):
crypto: x86/aegis128 - fix possible crash with CFI enabled
crypto: x86/aria - fix crash with CFI enabled
crypto: x86/nhpoly1305 - eliminate unnecessary CFI wrappers
crypto: x86/sha1 - fix possible crash with CFI enabled
crypto: x86/sha256 - fix possible crash with CFI enabled
crypto: x86/sha512 - fix possible crash with CFI enabled
crypto: x86/sm3 - fix possible crash with CFI enabled
crypto: x86/sm4 - fix crash with CFI enabled
crypto: arm64/nhpoly1305 - eliminate unnecessary CFI wrapper
crypto: arm64/sm3 - fix possible crash with CFI enabled
crypto: arm/nhpoly1305 - eliminate unnecessary CFI wrapper
Revert "crypto: shash - avoid comparing pointers to exported functions
under CFI"

arch/arm/crypto/nh-neon-core.S | 2 +-
arch/arm/crypto/nhpoly1305-neon-glue.c | 11 ++---------
arch/arm64/crypto/nh-neon-core.S | 5 +++--
arch/arm64/crypto/nhpoly1305-neon-glue.c | 11 ++---------
arch/arm64/crypto/sm3-neon-core.S | 3 ++-
arch/x86/crypto/aegis128-aesni-asm.S | 9 +++++----
arch/x86/crypto/aria-aesni-avx-asm_64.S | 13 +++++++------
arch/x86/crypto/nh-avx2-x86_64.S | 5 +++--
arch/x86/crypto/nh-sse2-x86_64.S | 5 +++--
arch/x86/crypto/nhpoly1305-avx2-glue.c | 11 ++---------
arch/x86/crypto/nhpoly1305-sse2-glue.c | 11 ++---------
arch/x86/crypto/sha1_ni_asm.S | 3 ++-
arch/x86/crypto/sha1_ssse3_asm.S | 3 ++-
arch/x86/crypto/sha256-avx-asm.S | 3 ++-
arch/x86/crypto/sha256-avx2-asm.S | 3 ++-
arch/x86/crypto/sha256-ssse3-asm.S | 3 ++-
arch/x86/crypto/sha256_ni_asm.S | 3 ++-
arch/x86/crypto/sha512-avx-asm.S | 3 ++-
arch/x86/crypto/sha512-avx2-asm.S | 3 ++-
arch/x86/crypto/sha512-ssse3-asm.S | 3 ++-
arch/x86/crypto/sm3-avx-asm_64.S | 3 ++-
arch/x86/crypto/sm4-aesni-avx-asm_64.S | 7 ++++---
arch/x86/crypto/sm4-aesni-avx2-asm_64.S | 7 ++++---
crypto/shash.c | 18 +++---------------
include/crypto/internal/hash.h | 8 +++++++-
25 files changed, 70 insertions(+), 86 deletions(-)


base-commit: 75df46b598b5b46b0857ee7d2410deaf215e23d1
--
2.38.1



2022-11-18 19:48:04

by Eric Biggers

Subject: [PATCH v2 05/12] crypto: x86/sha256 - fix possible crash with CFI enabled

From: Eric Biggers <[email protected]>

sha256_transform_ssse3(), sha256_transform_avx(),
sha256_transform_rorx(), and sha256_ni_transform() are called via
indirect function calls. Therefore they need to use
SYM_TYPED_FUNC_START instead of SYM_FUNC_START to cause their type
hashes to be emitted when the kernel is built with CONFIG_CFI_CLANG=y.
Otherwise, the code crashes with a CFI failure (if the compiler didn't
happen to optimize out the indirect calls).
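
For reference, SYM_TYPED_FUNC_START is what makes the hash appear: with
CONFIG_CFI_CLANG=y it prefixes the function with a local __cfi_<name>
preamble whose immediate is the type id clang derives from the C
prototype.  Simplified sketch of the x86 expansion, quoted from memory
(the real definitions are in <linux/cfi_types.h> and <asm/linkage.h>):

#define __CFI_TYPE(name)					\
	SYM_START(__cfi_##name, SYM_L_LOCAL, SYM_A_NONE)	\
	.fill 11, 1, 0x90 ASM_NL	/* 11 NOP bytes    */	\
	.byte 0xb8 ASM_NL		/* movl $..., %eax */	\
	.long __kcfi_typeid_##name ASM_NL			\
	SYM_FUNC_END(__cfi_##name)

SYM_TYPED_FUNC_START(name) emits __CFI_TYPE(name) immediately before
"name:", so an indirect caller can load the expected hash from just
before the function's entry point.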

Fixes: ccace936eec7 ("x86: Add types to indirectly called assembly functions")
Acked-by: Peter Zijlstra (Intel) <[email protected]>
Reviewed-by: Sami Tolvanen <[email protected]>
Signed-off-by: Eric Biggers <[email protected]>
---
arch/x86/crypto/sha256-avx-asm.S | 3 ++-
arch/x86/crypto/sha256-avx2-asm.S | 3 ++-
arch/x86/crypto/sha256-ssse3-asm.S | 3 ++-
arch/x86/crypto/sha256_ni_asm.S | 3 ++-
4 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/arch/x86/crypto/sha256-avx-asm.S b/arch/x86/crypto/sha256-avx-asm.S
index 3baa1ec390974..06ea30c20828d 100644
--- a/arch/x86/crypto/sha256-avx-asm.S
+++ b/arch/x86/crypto/sha256-avx-asm.S
@@ -48,6 +48,7 @@
########################################################################

#include <linux/linkage.h>
+#include <linux/cfi_types.h>

## assume buffers not aligned
#define VMOVDQ vmovdqu
@@ -346,7 +347,7 @@ a = TMP_
## arg 3 : Num blocks
########################################################################
.text
-SYM_FUNC_START(sha256_transform_avx)
+SYM_TYPED_FUNC_START(sha256_transform_avx)
.align 32
pushq %rbx
pushq %r12
diff --git a/arch/x86/crypto/sha256-avx2-asm.S b/arch/x86/crypto/sha256-avx2-asm.S
index 9bcdbc47b8b4b..2d2be531a11ed 100644
--- a/arch/x86/crypto/sha256-avx2-asm.S
+++ b/arch/x86/crypto/sha256-avx2-asm.S
@@ -49,6 +49,7 @@
########################################################################

#include <linux/linkage.h>
+#include <linux/cfi_types.h>

## assume buffers not aligned
#define VMOVDQ vmovdqu
@@ -523,7 +524,7 @@ STACK_SIZE = _CTX + _CTX_SIZE
## arg 3 : Num blocks
########################################################################
.text
-SYM_FUNC_START(sha256_transform_rorx)
+SYM_TYPED_FUNC_START(sha256_transform_rorx)
.align 32
pushq %rbx
pushq %r12
diff --git a/arch/x86/crypto/sha256-ssse3-asm.S b/arch/x86/crypto/sha256-ssse3-asm.S
index c4a5db612c327..7db28839108dd 100644
--- a/arch/x86/crypto/sha256-ssse3-asm.S
+++ b/arch/x86/crypto/sha256-ssse3-asm.S
@@ -47,6 +47,7 @@
########################################################################

#include <linux/linkage.h>
+#include <linux/cfi_types.h>

## assume buffers not aligned
#define MOVDQ movdqu
@@ -355,7 +356,7 @@ a = TMP_
## arg 3 : Num blocks
########################################################################
.text
-SYM_FUNC_START(sha256_transform_ssse3)
+SYM_TYPED_FUNC_START(sha256_transform_ssse3)
.align 32
pushq %rbx
pushq %r12
diff --git a/arch/x86/crypto/sha256_ni_asm.S b/arch/x86/crypto/sha256_ni_asm.S
index 94d50dd27cb53..47f93937f798a 100644
--- a/arch/x86/crypto/sha256_ni_asm.S
+++ b/arch/x86/crypto/sha256_ni_asm.S
@@ -54,6 +54,7 @@
*/

#include <linux/linkage.h>
+#include <linux/cfi_types.h>

#define DIGEST_PTR %rdi /* 1st arg */
#define DATA_PTR %rsi /* 2nd arg */
@@ -97,7 +98,7 @@

.text
.align 32
-SYM_FUNC_START(sha256_ni_transform)
+SYM_TYPED_FUNC_START(sha256_ni_transform)

shl $6, NUM_BLKS /* convert to bytes */
jz .Ldone_hash
--
2.38.1


2022-11-18 19:48:06

by Eric Biggers

Subject: [PATCH v2 03/12] crypto: x86/nhpoly1305 - eliminate unnecessary CFI wrappers

From: Eric Biggers <[email protected]>

Since the CFI implementation now supports indirect calls to assembly
functions, take advantage of that rather than use wrapper functions.
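
With the wrappers gone, the asmlinkage declarations below change from
u8 hash[NH_HASH_BYTES] to __le64 hash[NH_NUM_PASSES]: they must now
exactly match the function pointer type that
crypto_nhpoly1305_update_helper() takes, since under CFI a mismatched
type hash would trap at the indirect call.  For reference, that pointer
type (assuming the nh_t typedef in <crypto/nhpoly1305.h>):

typedef void (*nh_t)(const u32 *key, const u8 *message,
		     size_t message_len, __le64 hash[NH_NUM_PASSES]);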

Acked-by: Peter Zijlstra (Intel) <[email protected]>
Reviewed-by: Sami Tolvanen <[email protected]>
Signed-off-by: Eric Biggers <[email protected]>
---
arch/x86/crypto/nh-avx2-x86_64.S | 5 +++--
arch/x86/crypto/nh-sse2-x86_64.S | 5 +++--
arch/x86/crypto/nhpoly1305-avx2-glue.c | 11 ++---------
arch/x86/crypto/nhpoly1305-sse2-glue.c | 11 ++---------
4 files changed, 10 insertions(+), 22 deletions(-)

diff --git a/arch/x86/crypto/nh-avx2-x86_64.S b/arch/x86/crypto/nh-avx2-x86_64.S
index 6a0b15e7196a8..ef73a3ab87263 100644
--- a/arch/x86/crypto/nh-avx2-x86_64.S
+++ b/arch/x86/crypto/nh-avx2-x86_64.S
@@ -8,6 +8,7 @@
*/

#include <linux/linkage.h>
+#include <linux/cfi_types.h>

#define PASS0_SUMS %ymm0
#define PASS1_SUMS %ymm1
@@ -65,11 +66,11 @@

/*
* void nh_avx2(const u32 *key, const u8 *message, size_t message_len,
- * u8 hash[NH_HASH_BYTES])
+ * __le64 hash[NH_NUM_PASSES])
*
* It's guaranteed that message_len % 16 == 0.
*/
-SYM_FUNC_START(nh_avx2)
+SYM_TYPED_FUNC_START(nh_avx2)

vmovdqu 0x00(KEY), K0
vmovdqu 0x10(KEY), K1
diff --git a/arch/x86/crypto/nh-sse2-x86_64.S b/arch/x86/crypto/nh-sse2-x86_64.S
index 34c567bbcb4fa..75fb994b6d177 100644
--- a/arch/x86/crypto/nh-sse2-x86_64.S
+++ b/arch/x86/crypto/nh-sse2-x86_64.S
@@ -8,6 +8,7 @@
*/

#include <linux/linkage.h>
+#include <linux/cfi_types.h>

#define PASS0_SUMS %xmm0
#define PASS1_SUMS %xmm1
@@ -67,11 +68,11 @@

/*
* void nh_sse2(const u32 *key, const u8 *message, size_t message_len,
- * u8 hash[NH_HASH_BYTES])
+ * __le64 hash[NH_NUM_PASSES])
*
* It's guaranteed that message_len % 16 == 0.
*/
-SYM_FUNC_START(nh_sse2)
+SYM_TYPED_FUNC_START(nh_sse2)

movdqu 0x00(KEY), K0
movdqu 0x10(KEY), K1
diff --git a/arch/x86/crypto/nhpoly1305-avx2-glue.c b/arch/x86/crypto/nhpoly1305-avx2-glue.c
index 8ea5ab0f1ca74..46b036204ed91 100644
--- a/arch/x86/crypto/nhpoly1305-avx2-glue.c
+++ b/arch/x86/crypto/nhpoly1305-avx2-glue.c
@@ -14,14 +14,7 @@
#include <asm/simd.h>

asmlinkage void nh_avx2(const u32 *key, const u8 *message, size_t message_len,
- u8 hash[NH_HASH_BYTES]);
-
-/* wrapper to avoid indirect call to assembly, which doesn't work with CFI */
-static void _nh_avx2(const u32 *key, const u8 *message, size_t message_len,
- __le64 hash[NH_NUM_PASSES])
-{
- nh_avx2(key, message, message_len, (u8 *)hash);
-}
+ __le64 hash[NH_NUM_PASSES]);

static int nhpoly1305_avx2_update(struct shash_desc *desc,
const u8 *src, unsigned int srclen)
@@ -33,7 +26,7 @@ static int nhpoly1305_avx2_update(struct shash_desc *desc,
unsigned int n = min_t(unsigned int, srclen, SZ_4K);

kernel_fpu_begin();
- crypto_nhpoly1305_update_helper(desc, src, n, _nh_avx2);
+ crypto_nhpoly1305_update_helper(desc, src, n, nh_avx2);
kernel_fpu_end();
src += n;
srclen -= n;
diff --git a/arch/x86/crypto/nhpoly1305-sse2-glue.c b/arch/x86/crypto/nhpoly1305-sse2-glue.c
index 2b353d42ed13f..4a4970d751076 100644
--- a/arch/x86/crypto/nhpoly1305-sse2-glue.c
+++ b/arch/x86/crypto/nhpoly1305-sse2-glue.c
@@ -14,14 +14,7 @@
#include <asm/simd.h>

asmlinkage void nh_sse2(const u32 *key, const u8 *message, size_t message_len,
- u8 hash[NH_HASH_BYTES]);
-
-/* wrapper to avoid indirect call to assembly, which doesn't work with CFI */
-static void _nh_sse2(const u32 *key, const u8 *message, size_t message_len,
- __le64 hash[NH_NUM_PASSES])
-{
- nh_sse2(key, message, message_len, (u8 *)hash);
-}
+ __le64 hash[NH_NUM_PASSES]);

static int nhpoly1305_sse2_update(struct shash_desc *desc,
const u8 *src, unsigned int srclen)
@@ -33,7 +26,7 @@ static int nhpoly1305_sse2_update(struct shash_desc *desc,
unsigned int n = min_t(unsigned int, srclen, SZ_4K);

kernel_fpu_begin();
- crypto_nhpoly1305_update_helper(desc, src, n, _nh_sse2);
+ crypto_nhpoly1305_update_helper(desc, src, n, nh_sse2);
kernel_fpu_end();
src += n;
srclen -= n;
--
2.38.1


2022-11-18 19:49:39

by Eric Biggers

Subject: [PATCH v2 10/12] crypto: arm64/sm3 - fix possible crash with CFI enabled

From: Eric Biggers <[email protected]>

sm3_neon_transform() is called via indirect function calls. Therefore
it needs to use SYM_TYPED_FUNC_START instead of SYM_FUNC_START to cause
its type hash to be emitted when the kernel is built with
CONFIG_CFI_CLANG=y. Otherwise, the code crashes with a CFI failure (if
the compiler didn't happen to optimize out the indirect call).

Fixes: c50d32859e70 ("arm64: Add types to indirect called assembly functions")
Acked-by: Peter Zijlstra (Intel) <[email protected]>
Reviewed-by: Sami Tolvanen <[email protected]>
Signed-off-by: Eric Biggers <[email protected]>
---
arch/arm64/crypto/sm3-neon-core.S | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/crypto/sm3-neon-core.S b/arch/arm64/crypto/sm3-neon-core.S
index 3e3b4e5c736fc..4357e0e51be38 100644
--- a/arch/arm64/crypto/sm3-neon-core.S
+++ b/arch/arm64/crypto/sm3-neon-core.S
@@ -9,6 +9,7 @@
*/

#include <linux/linkage.h>
+#include <linux/cfi_types.h>
#include <asm/assembler.h>

/* Context structure */
@@ -351,7 +352,7 @@
*/
.text
.align 3
-SYM_FUNC_START(sm3_neon_transform)
+SYM_TYPED_FUNC_START(sm3_neon_transform)
ldp ra, rb, [RSTATE, #0]
ldp rc, rd, [RSTATE, #8]
ldp re, rf, [RSTATE, #16]
--
2.38.1


2022-11-18 19:50:22

by Eric Biggers

Subject: [PATCH v2 12/12] Revert "crypto: shash - avoid comparing pointers to exported functions under CFI"

From: Eric Biggers <[email protected]>

This reverts commit 22ca9f4aaf431a9413dcc115dd590123307f274f because CFI
no longer breaks cross-module function address equality, so
crypto_shash_alg_has_setkey() can now be an inline function like before.

This commit should not be backported to kernels that don't have the new
CFI implementation.
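
For context, here is a conceptual sketch (not code from the kernel) of
why the comparison was unreliable under the old jump-table CFI and is
fine under kCFI:

#include <crypto/internal/hash.h>

/*
 * The old CFI rewrote function addresses into per-module jump tables,
 * so vmlinux and a module could observe different "addresses" for the
 * same exported function:
 *
 *	vmlinux:  shash_no_setkey == vmlinux's jump table entry
 *	module:   shash_no_setkey == that module's jump table entry
 *
 * which made the comparison below wrongly true in modules.  kCFI
 * checks a type hash at the call site and leaves addresses alone, so
 * an inline comparison is safe again:
 */
static inline bool alg_has_setkey(struct shash_alg *alg)
{
	return alg->setkey != shash_no_setkey;
}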

Acked-by: Peter Zijlstra (Intel) <[email protected]>
Reviewed-by: Sami Tolvanen <[email protected]>
Signed-off-by: Eric Biggers <[email protected]>
---
crypto/shash.c | 18 +++---------------
include/crypto/internal/hash.h | 8 +++++++-
2 files changed, 10 insertions(+), 16 deletions(-)

diff --git a/crypto/shash.c b/crypto/shash.c
index 4c88e63b3350f..0f85431588267 100644
--- a/crypto/shash.c
+++ b/crypto/shash.c
@@ -20,24 +20,12 @@

static const struct crypto_type crypto_shash_type;

-static int shash_no_setkey(struct crypto_shash *tfm, const u8 *key,
- unsigned int keylen)
+int shash_no_setkey(struct crypto_shash *tfm, const u8 *key,
+ unsigned int keylen)
{
return -ENOSYS;
}
-
-/*
- * Check whether an shash algorithm has a setkey function.
- *
- * For CFI compatibility, this must not be an inline function. This is because
- * when CFI is enabled, modules won't get the same address for shash_no_setkey
- * (if it were exported, which inlining would require) as the core kernel will.
- */
-bool crypto_shash_alg_has_setkey(struct shash_alg *alg)
-{
- return alg->setkey != shash_no_setkey;
-}
-EXPORT_SYMBOL_GPL(crypto_shash_alg_has_setkey);
+EXPORT_SYMBOL_GPL(shash_no_setkey);

static int shash_setkey_unaligned(struct crypto_shash *tfm, const u8 *key,
unsigned int keylen)
diff --git a/include/crypto/internal/hash.h b/include/crypto/internal/hash.h
index 25806141db591..0a288dddcf5be 100644
--- a/include/crypto/internal/hash.h
+++ b/include/crypto/internal/hash.h
@@ -75,7 +75,13 @@ void crypto_unregister_ahashes(struct ahash_alg *algs, int count);
int ahash_register_instance(struct crypto_template *tmpl,
struct ahash_instance *inst);

-bool crypto_shash_alg_has_setkey(struct shash_alg *alg);
+int shash_no_setkey(struct crypto_shash *tfm, const u8 *key,
+ unsigned int keylen);
+
+static inline bool crypto_shash_alg_has_setkey(struct shash_alg *alg)
+{
+ return alg->setkey != shash_no_setkey;
+}

static inline bool crypto_shash_alg_needs_key(struct shash_alg *alg)
{
--
2.38.1


2022-11-18 19:53:37

by Eric Biggers

Subject: [PATCH v2 02/12] crypto: x86/aria - fix crash with CFI enabled

From: Eric Biggers <[email protected]>

aria_aesni_avx_encrypt_16way(), aria_aesni_avx_decrypt_16way(),
aria_aesni_avx_ctr_crypt_16way(), aria_aesni_avx_gfni_encrypt_16way(),
aria_aesni_avx_gfni_decrypt_16way(), and
aria_aesni_avx_gfni_ctr_crypt_16way() are called via indirect function
calls. Therefore they need to use SYM_TYPED_FUNC_START instead of
SYM_FUNC_START to cause their type hashes to be emitted when the kernel
is built with CONFIG_CFI_CLANG=y. Otherwise, the code crashes with a
CFI failure.

Fixes: ccace936eec7 ("x86: Add types to indirectly called assembly functions")
Acked-by: Peter Zijlstra (Intel) <[email protected]>
Reviewed-by: Sami Tolvanen <[email protected]>
Cc: Taehee Yoo <[email protected]>
Signed-off-by: Eric Biggers <[email protected]>
---
arch/x86/crypto/aria-aesni-avx-asm_64.S | 13 +++++++------
1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/arch/x86/crypto/aria-aesni-avx-asm_64.S b/arch/x86/crypto/aria-aesni-avx-asm_64.S
index c75fd7d015ed8..03ae4cd1d976a 100644
--- a/arch/x86/crypto/aria-aesni-avx-asm_64.S
+++ b/arch/x86/crypto/aria-aesni-avx-asm_64.S
@@ -7,6 +7,7 @@
*/

#include <linux/linkage.h>
+#include <linux/cfi_types.h>
#include <asm/frame.h>

/* struct aria_ctx: */
@@ -913,7 +914,7 @@ SYM_FUNC_START_LOCAL(__aria_aesni_avx_crypt_16way)
RET;
SYM_FUNC_END(__aria_aesni_avx_crypt_16way)

-SYM_FUNC_START(aria_aesni_avx_encrypt_16way)
+SYM_TYPED_FUNC_START(aria_aesni_avx_encrypt_16way)
/* input:
* %rdi: ctx, CTX
* %rsi: dst
@@ -938,7 +939,7 @@ SYM_FUNC_START(aria_aesni_avx_encrypt_16way)
RET;
SYM_FUNC_END(aria_aesni_avx_encrypt_16way)

-SYM_FUNC_START(aria_aesni_avx_decrypt_16way)
+SYM_TYPED_FUNC_START(aria_aesni_avx_decrypt_16way)
/* input:
* %rdi: ctx, CTX
* %rsi: dst
@@ -1039,7 +1040,7 @@ SYM_FUNC_START_LOCAL(__aria_aesni_avx_ctr_gen_keystream_16way)
RET;
SYM_FUNC_END(__aria_aesni_avx_ctr_gen_keystream_16way)

-SYM_FUNC_START(aria_aesni_avx_ctr_crypt_16way)
+SYM_TYPED_FUNC_START(aria_aesni_avx_ctr_crypt_16way)
/* input:
* %rdi: ctx
* %rsi: dst
@@ -1208,7 +1209,7 @@ SYM_FUNC_START_LOCAL(__aria_aesni_avx_gfni_crypt_16way)
RET;
SYM_FUNC_END(__aria_aesni_avx_gfni_crypt_16way)

-SYM_FUNC_START(aria_aesni_avx_gfni_encrypt_16way)
+SYM_TYPED_FUNC_START(aria_aesni_avx_gfni_encrypt_16way)
/* input:
* %rdi: ctx, CTX
* %rsi: dst
@@ -1233,7 +1234,7 @@ SYM_FUNC_START(aria_aesni_avx_gfni_encrypt_16way)
RET;
SYM_FUNC_END(aria_aesni_avx_gfni_encrypt_16way)

-SYM_FUNC_START(aria_aesni_avx_gfni_decrypt_16way)
+SYM_TYPED_FUNC_START(aria_aesni_avx_gfni_decrypt_16way)
/* input:
* %rdi: ctx, CTX
* %rsi: dst
@@ -1258,7 +1259,7 @@ SYM_FUNC_START(aria_aesni_avx_gfni_decrypt_16way)
RET;
SYM_FUNC_END(aria_aesni_avx_gfni_decrypt_16way)

-SYM_FUNC_START(aria_aesni_avx_gfni_ctr_crypt_16way)
+SYM_TYPED_FUNC_START(aria_aesni_avx_gfni_ctr_crypt_16way)
/* input:
* %rdi: ctx
* %rsi: dst
--
2.38.1


2022-11-18 19:53:38

by Eric Biggers

Subject: [PATCH v2 04/12] crypto: x86/sha1 - fix possible crash with CFI enabled

From: Eric Biggers <[email protected]>

sha1_transform_ssse3(), sha1_transform_avx(), and sha1_ni_transform()
(but not sha1_transform_avx2()) are called via indirect function calls.
Therefore they need to use SYM_TYPED_FUNC_START instead of
SYM_FUNC_START to cause their type hashes to be emitted when the kernel
is built with CONFIG_CFI_CLANG=y. Otherwise, the code crashes with a
CFI failure (if the compiler didn't happen to optimize out the indirect
calls).

Fixes: ccace936eec7 ("x86: Add types to indirectly called assembly functions")
Acked-by: Peter Zijlstra (Intel) <[email protected]>
Reviewed-by: Sami Tolvanen <[email protected]>
Signed-off-by: Eric Biggers <[email protected]>
---
arch/x86/crypto/sha1_ni_asm.S | 3 ++-
arch/x86/crypto/sha1_ssse3_asm.S | 3 ++-
2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/x86/crypto/sha1_ni_asm.S b/arch/x86/crypto/sha1_ni_asm.S
index 2f94ec0e763bf..3cae5a1bb3d6e 100644
--- a/arch/x86/crypto/sha1_ni_asm.S
+++ b/arch/x86/crypto/sha1_ni_asm.S
@@ -54,6 +54,7 @@
*/

#include <linux/linkage.h>
+#include <linux/cfi_types.h>

#define DIGEST_PTR %rdi /* 1st arg */
#define DATA_PTR %rsi /* 2nd arg */
@@ -93,7 +94,7 @@
*/
.text
.align 32
-SYM_FUNC_START(sha1_ni_transform)
+SYM_TYPED_FUNC_START(sha1_ni_transform)
push %rbp
mov %rsp, %rbp
sub $FRAME_SIZE, %rsp
diff --git a/arch/x86/crypto/sha1_ssse3_asm.S b/arch/x86/crypto/sha1_ssse3_asm.S
index 263f916362e02..f54988c80eb40 100644
--- a/arch/x86/crypto/sha1_ssse3_asm.S
+++ b/arch/x86/crypto/sha1_ssse3_asm.S
@@ -25,6 +25,7 @@
*/

#include <linux/linkage.h>
+#include <linux/cfi_types.h>

#define CTX %rdi // arg1
#define BUF %rsi // arg2
@@ -67,7 +68,7 @@
* param: function's name
*/
.macro SHA1_VECTOR_ASM name
- SYM_FUNC_START(\name)
+ SYM_TYPED_FUNC_START(\name)

push %rbx
push %r12
--
2.38.1


2022-11-18 19:53:38

by Eric Biggers

Subject: [PATCH v2 01/12] crypto: x86/aegis128 - fix possible crash with CFI enabled

From: Eric Biggers <[email protected]>

crypto_aegis128_aesni_enc(), crypto_aegis128_aesni_enc_tail(),
crypto_aegis128_aesni_dec(), and crypto_aegis128_aesni_dec_tail() are
called via indirect function calls. Therefore they need to use
SYM_TYPED_FUNC_START instead of SYM_FUNC_START to cause their type
hashes to be emitted when the kernel is built with CONFIG_CFI_CLANG=y.
Otherwise, the code crashes with a CFI failure (if the compiler didn't
happen to optimize out the indirect calls).

Fixes: ccace936eec7 ("x86: Add types to indirectly called assembly functions")
Acked-by: Peter Zijlstra (Intel) <[email protected]>
Reviewed-by: Sami Tolvanen <[email protected]>
Signed-off-by: Eric Biggers <[email protected]>
---
arch/x86/crypto/aegis128-aesni-asm.S | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/arch/x86/crypto/aegis128-aesni-asm.S b/arch/x86/crypto/aegis128-aesni-asm.S
index b48ddebb47489..cdf3215ec272c 100644
--- a/arch/x86/crypto/aegis128-aesni-asm.S
+++ b/arch/x86/crypto/aegis128-aesni-asm.S
@@ -7,6 +7,7 @@
*/

#include <linux/linkage.h>
+#include <linux/cfi_types.h>
#include <asm/frame.h>

#define STATE0 %xmm0
@@ -402,7 +403,7 @@ SYM_FUNC_END(crypto_aegis128_aesni_ad)
* void crypto_aegis128_aesni_enc(void *state, unsigned int length,
* const void *src, void *dst);
*/
-SYM_FUNC_START(crypto_aegis128_aesni_enc)
+SYM_TYPED_FUNC_START(crypto_aegis128_aesni_enc)
FRAME_BEGIN

cmp $0x10, LEN
@@ -499,7 +500,7 @@ SYM_FUNC_END(crypto_aegis128_aesni_enc)
* void crypto_aegis128_aesni_enc_tail(void *state, unsigned int length,
* const void *src, void *dst);
*/
-SYM_FUNC_START(crypto_aegis128_aesni_enc_tail)
+SYM_TYPED_FUNC_START(crypto_aegis128_aesni_enc_tail)
FRAME_BEGIN

/* load the state: */
@@ -556,7 +557,7 @@ SYM_FUNC_END(crypto_aegis128_aesni_enc_tail)
* void crypto_aegis128_aesni_dec(void *state, unsigned int length,
* const void *src, void *dst);
*/
-SYM_FUNC_START(crypto_aegis128_aesni_dec)
+SYM_TYPED_FUNC_START(crypto_aegis128_aesni_dec)
FRAME_BEGIN

cmp $0x10, LEN
@@ -653,7 +654,7 @@ SYM_FUNC_END(crypto_aegis128_aesni_dec)
* void crypto_aegis128_aesni_dec_tail(void *state, unsigned int length,
* const void *src, void *dst);
*/
-SYM_FUNC_START(crypto_aegis128_aesni_dec_tail)
+SYM_TYPED_FUNC_START(crypto_aegis128_aesni_dec_tail)
FRAME_BEGIN

/* load the state: */
--
2.38.1


2022-11-18 19:53:38

by Eric Biggers

Subject: [PATCH v2 08/12] crypto: x86/sm4 - fix crash with CFI enabled

From: Eric Biggers <[email protected]>

sm4_aesni_avx_ctr_enc_blk8(), sm4_aesni_avx_cbc_dec_blk8(),
sm4_aesni_avx_cfb_dec_blk8(), sm4_aesni_avx2_ctr_enc_blk16(),
sm4_aesni_avx2_cbc_dec_blk16(), and sm4_aesni_avx2_cfb_dec_blk16() are
called via indirect function calls. Therefore they need to use
SYM_TYPED_FUNC_START instead of SYM_FUNC_START to cause their type
hashes to be emitted when the kernel is built with CONFIG_CFI_CLANG=y.
Otherwise, the code crashes with a CFI failure.

(Or at least that should be the case. For some reason the CFI checks in
sm4_avx_cbc_decrypt(), sm4_avx_cfb_decrypt(), and sm4_avx_ctr_crypt()
are not always being generated when using current tip-of-tree clang.
Either way, this patch is a good idea.)

Fixes: ccace936eec7 ("x86: Add types to indirectly called assembly functions")
Signed-off-by: Eric Biggers <[email protected]>
---
arch/x86/crypto/sm4-aesni-avx-asm_64.S | 7 ++++---
arch/x86/crypto/sm4-aesni-avx2-asm_64.S | 7 ++++---
2 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/arch/x86/crypto/sm4-aesni-avx-asm_64.S b/arch/x86/crypto/sm4-aesni-avx-asm_64.S
index 4767ab61ff489..22b6560eb9e1e 100644
--- a/arch/x86/crypto/sm4-aesni-avx-asm_64.S
+++ b/arch/x86/crypto/sm4-aesni-avx-asm_64.S
@@ -14,6 +14,7 @@
*/

#include <linux/linkage.h>
+#include <linux/cfi_types.h>
#include <asm/frame.h>

#define rRIP (%rip)
@@ -420,7 +421,7 @@ SYM_FUNC_END(sm4_aesni_avx_crypt8)
* const u8 *src, u8 *iv)
*/
.align 8
-SYM_FUNC_START(sm4_aesni_avx_ctr_enc_blk8)
+SYM_TYPED_FUNC_START(sm4_aesni_avx_ctr_enc_blk8)
/* input:
* %rdi: round key array, CTX
* %rsi: dst (8 blocks)
@@ -495,7 +496,7 @@ SYM_FUNC_END(sm4_aesni_avx_ctr_enc_blk8)
* const u8 *src, u8 *iv)
*/
.align 8
-SYM_FUNC_START(sm4_aesni_avx_cbc_dec_blk8)
+SYM_TYPED_FUNC_START(sm4_aesni_avx_cbc_dec_blk8)
/* input:
* %rdi: round key array, CTX
* %rsi: dst (8 blocks)
@@ -545,7 +546,7 @@ SYM_FUNC_END(sm4_aesni_avx_cbc_dec_blk8)
* const u8 *src, u8 *iv)
*/
.align 8
-SYM_FUNC_START(sm4_aesni_avx_cfb_dec_blk8)
+SYM_TYPED_FUNC_START(sm4_aesni_avx_cfb_dec_blk8)
/* input:
* %rdi: round key array, CTX
* %rsi: dst (8 blocks)
diff --git a/arch/x86/crypto/sm4-aesni-avx2-asm_64.S b/arch/x86/crypto/sm4-aesni-avx2-asm_64.S
index 4732fe8bb65b6..23ee39a8ada8c 100644
--- a/arch/x86/crypto/sm4-aesni-avx2-asm_64.S
+++ b/arch/x86/crypto/sm4-aesni-avx2-asm_64.S
@@ -14,6 +14,7 @@
*/

#include <linux/linkage.h>
+#include <linux/cfi_types.h>
#include <asm/frame.h>

#define rRIP (%rip)
@@ -282,7 +283,7 @@ SYM_FUNC_END(__sm4_crypt_blk16)
* const u8 *src, u8 *iv)
*/
.align 8
-SYM_FUNC_START(sm4_aesni_avx2_ctr_enc_blk16)
+SYM_TYPED_FUNC_START(sm4_aesni_avx2_ctr_enc_blk16)
/* input:
* %rdi: round key array, CTX
* %rsi: dst (16 blocks)
@@ -395,7 +396,7 @@ SYM_FUNC_END(sm4_aesni_avx2_ctr_enc_blk16)
* const u8 *src, u8 *iv)
*/
.align 8
-SYM_FUNC_START(sm4_aesni_avx2_cbc_dec_blk16)
+SYM_TYPED_FUNC_START(sm4_aesni_avx2_cbc_dec_blk16)
/* input:
* %rdi: round key array, CTX
* %rsi: dst (16 blocks)
@@ -449,7 +450,7 @@ SYM_FUNC_END(sm4_aesni_avx2_cbc_dec_blk16)
* const u8 *src, u8 *iv)
*/
.align 8
-SYM_FUNC_START(sm4_aesni_avx2_cfb_dec_blk16)
+SYM_TYPED_FUNC_START(sm4_aesni_avx2_cfb_dec_blk16)
/* input:
* %rdi: round key array, CTX
* %rsi: dst (16 blocks)
--
2.38.1


2022-11-18 19:53:51

by Eric Biggers

Subject: [PATCH v2 06/12] crypto: x86/sha512 - fix possible crash with CFI enabled

From: Eric Biggers <[email protected]>

sha512_transform_ssse3(), sha512_transform_avx(), and
sha512_transform_rorx() are called via indirect function calls.
Therefore they need to use SYM_TYPED_FUNC_START instead of
SYM_FUNC_START to cause their type hashes to be emitted when the kernel
is built with CONFIG_CFI_CLANG=y. Otherwise, the code crashes with a
CFI failure (if the compiler didn't happen to optimize out the indirect
calls).

Fixes: ccace936eec7 ("x86: Add types to indirectly called assembly functions")
Acked-by: Peter Zijlstra (Intel) <[email protected]>
Reviewed-by: Sami Tolvanen <[email protected]>
Signed-off-by: Eric Biggers <[email protected]>
---
arch/x86/crypto/sha512-avx-asm.S | 3 ++-
arch/x86/crypto/sha512-avx2-asm.S | 3 ++-
arch/x86/crypto/sha512-ssse3-asm.S | 3 ++-
3 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/arch/x86/crypto/sha512-avx-asm.S b/arch/x86/crypto/sha512-avx-asm.S
index 1fefe6dd3a9e2..b0984f19fdb40 100644
--- a/arch/x86/crypto/sha512-avx-asm.S
+++ b/arch/x86/crypto/sha512-avx-asm.S
@@ -48,6 +48,7 @@
########################################################################

#include <linux/linkage.h>
+#include <linux/cfi_types.h>

.text

@@ -273,7 +274,7 @@ frame_size = frame_WK + WK_SIZE
# of SHA512 message blocks.
# "blocks" is the message length in SHA512 blocks
########################################################################
-SYM_FUNC_START(sha512_transform_avx)
+SYM_TYPED_FUNC_START(sha512_transform_avx)
test msglen, msglen
je nowork

diff --git a/arch/x86/crypto/sha512-avx2-asm.S b/arch/x86/crypto/sha512-avx2-asm.S
index 5cdaab7d69015..b1ca99055ef99 100644
--- a/arch/x86/crypto/sha512-avx2-asm.S
+++ b/arch/x86/crypto/sha512-avx2-asm.S
@@ -50,6 +50,7 @@
########################################################################

#include <linux/linkage.h>
+#include <linux/cfi_types.h>

.text

@@ -565,7 +566,7 @@ frame_size = frame_CTX + CTX_SIZE
# of SHA512 message blocks.
# "blocks" is the message length in SHA512 blocks
########################################################################
-SYM_FUNC_START(sha512_transform_rorx)
+SYM_TYPED_FUNC_START(sha512_transform_rorx)
# Save GPRs
push %rbx
push %r12
diff --git a/arch/x86/crypto/sha512-ssse3-asm.S b/arch/x86/crypto/sha512-ssse3-asm.S
index b84c22e06c5f7..c06afb5270e5f 100644
--- a/arch/x86/crypto/sha512-ssse3-asm.S
+++ b/arch/x86/crypto/sha512-ssse3-asm.S
@@ -48,6 +48,7 @@
########################################################################

#include <linux/linkage.h>
+#include <linux/cfi_types.h>

.text

@@ -274,7 +275,7 @@ frame_size = frame_WK + WK_SIZE
# of SHA512 message blocks.
# "blocks" is the message length in SHA512 blocks.
########################################################################
-SYM_FUNC_START(sha512_transform_ssse3)
+SYM_TYPED_FUNC_START(sha512_transform_ssse3)

test msglen, msglen
je nowork
--
2.38.1


2022-11-18 19:54:11

by Eric Biggers

Subject: [PATCH v2 11/12] crypto: arm/nhpoly1305 - eliminate unnecessary CFI wrapper

From: Eric Biggers <[email protected]>

The arm architecture doesn't support CFI yet, and even if it did, the
new CFI implementation supports indirect calls to assembly functions.
Therefore, there's no need to use a wrapper function for nh_neon().

Acked-by: Peter Zijlstra (Intel) <[email protected]>
Reviewed-by: Sami Tolvanen <[email protected]>
Signed-off-by: Eric Biggers <[email protected]>
---
arch/arm/crypto/nh-neon-core.S | 2 +-
arch/arm/crypto/nhpoly1305-neon-glue.c | 11 ++---------
2 files changed, 3 insertions(+), 10 deletions(-)

diff --git a/arch/arm/crypto/nh-neon-core.S b/arch/arm/crypto/nh-neon-core.S
index 434d80ab531c2..01620a0782ca9 100644
--- a/arch/arm/crypto/nh-neon-core.S
+++ b/arch/arm/crypto/nh-neon-core.S
@@ -69,7 +69,7 @@

/*
* void nh_neon(const u32 *key, const u8 *message, size_t message_len,
- * u8 hash[NH_HASH_BYTES])
+ * __le64 hash[NH_NUM_PASSES])
*
* It's guaranteed that message_len % 16 == 0.
*/
diff --git a/arch/arm/crypto/nhpoly1305-neon-glue.c b/arch/arm/crypto/nhpoly1305-neon-glue.c
index ffa8d73fe722c..e93e41ff26566 100644
--- a/arch/arm/crypto/nhpoly1305-neon-glue.c
+++ b/arch/arm/crypto/nhpoly1305-neon-glue.c
@@ -14,14 +14,7 @@
#include <linux/module.h>

asmlinkage void nh_neon(const u32 *key, const u8 *message, size_t message_len,
- u8 hash[NH_HASH_BYTES]);
-
-/* wrapper to avoid indirect call to assembly, which doesn't work with CFI */
-static void _nh_neon(const u32 *key, const u8 *message, size_t message_len,
- __le64 hash[NH_NUM_PASSES])
-{
- nh_neon(key, message, message_len, (u8 *)hash);
-}
+ __le64 hash[NH_NUM_PASSES]);

static int nhpoly1305_neon_update(struct shash_desc *desc,
const u8 *src, unsigned int srclen)
@@ -33,7 +26,7 @@ static int nhpoly1305_neon_update(struct shash_desc *desc,
unsigned int n = min_t(unsigned int, srclen, SZ_4K);

kernel_neon_begin();
- crypto_nhpoly1305_update_helper(desc, src, n, _nh_neon);
+ crypto_nhpoly1305_update_helper(desc, src, n, nh_neon);
kernel_neon_end();
src += n;
srclen -= n;
--
2.38.1


2022-11-18 19:54:45

by Eric Biggers

Subject: [PATCH v2 09/12] crypto: arm64/nhpoly1305 - eliminate unnecessary CFI wrapper

From: Eric Biggers <[email protected]>

Since the CFI implementation now supports indirect calls to assembly
functions, take advantage of that rather than use a wrapper function.

Acked-by: Peter Zijlstra (Intel) <[email protected]>
Reviewed-by: Sami Tolvanen <[email protected]>
Signed-off-by: Eric Biggers <[email protected]>
---
arch/arm64/crypto/nh-neon-core.S | 5 +++--
arch/arm64/crypto/nhpoly1305-neon-glue.c | 11 ++---------
2 files changed, 5 insertions(+), 11 deletions(-)

diff --git a/arch/arm64/crypto/nh-neon-core.S b/arch/arm64/crypto/nh-neon-core.S
index 51c0a534ef87c..13eda08fda1e5 100644
--- a/arch/arm64/crypto/nh-neon-core.S
+++ b/arch/arm64/crypto/nh-neon-core.S
@@ -8,6 +8,7 @@
*/

#include <linux/linkage.h>
+#include <linux/cfi_types.h>

KEY .req x0
MESSAGE .req x1
@@ -58,11 +59,11 @@

/*
* void nh_neon(const u32 *key, const u8 *message, size_t message_len,
- * u8 hash[NH_HASH_BYTES])
+ * __le64 hash[NH_NUM_PASSES])
*
* It's guaranteed that message_len % 16 == 0.
*/
-SYM_FUNC_START(nh_neon)
+SYM_TYPED_FUNC_START(nh_neon)

ld1 {K0.4s,K1.4s}, [KEY], #32
movi PASS0_SUMS.2d, #0
diff --git a/arch/arm64/crypto/nhpoly1305-neon-glue.c b/arch/arm64/crypto/nhpoly1305-neon-glue.c
index c5405e6a6db76..cd882c35d9252 100644
--- a/arch/arm64/crypto/nhpoly1305-neon-glue.c
+++ b/arch/arm64/crypto/nhpoly1305-neon-glue.c
@@ -14,14 +14,7 @@
#include <linux/module.h>

asmlinkage void nh_neon(const u32 *key, const u8 *message, size_t message_len,
- u8 hash[NH_HASH_BYTES]);
-
-/* wrapper to avoid indirect call to assembly, which doesn't work with CFI */
-static void _nh_neon(const u32 *key, const u8 *message, size_t message_len,
- __le64 hash[NH_NUM_PASSES])
-{
- nh_neon(key, message, message_len, (u8 *)hash);
-}
+ __le64 hash[NH_NUM_PASSES]);

static int nhpoly1305_neon_update(struct shash_desc *desc,
const u8 *src, unsigned int srclen)
@@ -33,7 +26,7 @@ static int nhpoly1305_neon_update(struct shash_desc *desc,
unsigned int n = min_t(unsigned int, srclen, SZ_4K);

kernel_neon_begin();
- crypto_nhpoly1305_update_helper(desc, src, n, _nh_neon);
+ crypto_nhpoly1305_update_helper(desc, src, n, nh_neon);
kernel_neon_end();
src += n;
srclen -= n;
--
2.38.1


2022-11-18 19:55:01

by Eric Biggers

Subject: [PATCH v2 07/12] crypto: x86/sm3 - fix possible crash with CFI enabled

From: Eric Biggers <[email protected]>

sm3_transform_avx() is called via indirect function calls. Therefore it
needs to use SYM_TYPED_FUNC_START instead of SYM_FUNC_START to cause its
type hash to be emitted when the kernel is built with
CONFIG_CFI_CLANG=y. Otherwise, the code crashes with a CFI failure (if
the compiler didn't happen to optimize out the indirect call).

Fixes: ccace936eec7 ("x86: Add types to indirectly called assembly functions")
Acked-by: Peter Zijlstra (Intel) <[email protected]>
Reviewed-by: Sami Tolvanen <[email protected]>
Signed-off-by: Eric Biggers <[email protected]>
---
arch/x86/crypto/sm3-avx-asm_64.S | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/x86/crypto/sm3-avx-asm_64.S b/arch/x86/crypto/sm3-avx-asm_64.S
index b12b9efb5ec51..8fc5ac681fd63 100644
--- a/arch/x86/crypto/sm3-avx-asm_64.S
+++ b/arch/x86/crypto/sm3-avx-asm_64.S
@@ -12,6 +12,7 @@
*/

#include <linux/linkage.h>
+#include <linux/cfi_types.h>
#include <asm/frame.h>

/* Context structure */
@@ -328,7 +329,7 @@
* const u8 *data, int nblocks);
*/
.align 16
-SYM_FUNC_START(sm3_transform_avx)
+SYM_TYPED_FUNC_START(sm3_transform_avx)
/* input:
* %rdi: ctx, CTX
* %rsi: data (64*nblks bytes)
--
2.38.1


2022-11-18 20:21:28

by Eric Biggers

Subject: Re: [PATCH v2 08/12] crypto: x86/sm4 - fix crash with CFI enabled

On Fri, Nov 18, 2022 at 11:44:17AM -0800, Eric Biggers wrote:
> From: Eric Biggers <[email protected]>
>
> sm4_aesni_avx_ctr_enc_blk8(), sm4_aesni_avx_cbc_dec_blk8(),
> sm4_aesni_avx_cfb_dec_blk8(), sm4_aesni_avx2_ctr_enc_blk16(),
> sm4_aesni_avx2_cbc_dec_blk16(), and sm4_aesni_avx2_cfb_dec_blk16() are
> called via indirect function calls. Therefore they need to use
> SYM_TYPED_FUNC_START instead of SYM_FUNC_START to cause their type
> hashes to be emitted when the kernel is built with CONFIG_CFI_CLANG=y.
> Otherwise, the code crashes with a CFI failure.
>
> (Or at least that should be the case. For some reason the CFI checks in
> sm4_avx_cbc_decrypt(), sm4_avx_cfb_decrypt(), and sm4_avx_ctr_crypt()
> are not always being generated when using current tip-of-tree clang.
> Either way, this patch is a good idea.)

Sami, is it expected that a CFI check isn't being generated for the indirect
call to 'func' in sm4_avx_cbc_decrypt()? I'm using LLVM commit 4a7be42d922af0.

- Eric

2022-11-18 20:33:02

by Sami Tolvanen

Subject: Re: [PATCH v2 08/12] crypto: x86/sm4 - fix crash with CFI enabled

On Fri, Nov 18, 2022 at 12:10 PM Eric Biggers <[email protected]> wrote:
>
> On Fri, Nov 18, 2022 at 11:44:17AM -0800, Eric Biggers wrote:
> > From: Eric Biggers <[email protected]>
> >
> > sm4_aesni_avx_ctr_enc_blk8(), sm4_aesni_avx_cbc_dec_blk8(),
> > sm4_aesni_avx_cfb_dec_blk8(), sm4_aesni_avx2_ctr_enc_blk16(),
> > sm4_aesni_avx2_cbc_dec_blk16(), and sm4_aesni_avx2_cfb_dec_blk16() are
> > called via indirect function calls. Therefore they need to use
> > SYM_TYPED_FUNC_START instead of SYM_FUNC_START to cause their type
> > hashes to be emitted when the kernel is built with CONFIG_CFI_CLANG=y.
> > Otherwise, the code crashes with a CFI failure.
> >
> > (Or at least that should be the case. For some reason the CFI checks in
> > sm4_avx_cbc_decrypt(), sm4_avx_cfb_decrypt(), and sm4_avx_ctr_crypt()
> > are not always being generated when using current tip-of-tree clang.
> > Either way, this patch is a good idea.)
>
> Sami, is it expected that a CFI check isn't being generated for the indirect
> call to 'func' in sm4_avx_cbc_decrypt()? I'm using LLVM commit 4a7be42d922af0.

If the compiler emits an indirect call, it should also emit a CFI
check. What's the assembly code it generates here?

Sami

2022-11-18 21:05:50

by Eric Biggers

Subject: Re: [PATCH v2 08/12] crypto: x86/sm4 - fix crash with CFI enabled

On Fri, Nov 18, 2022 at 12:27:46PM -0800, Sami Tolvanen wrote:
> On Fri, Nov 18, 2022 at 12:10 PM Eric Biggers <[email protected]> wrote:
> >
> > On Fri, Nov 18, 2022 at 11:44:17AM -0800, Eric Biggers wrote:
> > > From: Eric Biggers <[email protected]>
> > >
> > > sm4_aesni_avx_ctr_enc_blk8(), sm4_aesni_avx_cbc_dec_blk8(),
> > > sm4_aesni_avx_cfb_dec_blk8(), sm4_aesni_avx2_ctr_enc_blk16(),
> > > sm4_aesni_avx2_cbc_dec_blk16(), and sm4_aesni_avx2_cfb_dec_blk16() are
> > > called via indirect function calls. Therefore they need to use
> > > SYM_TYPED_FUNC_START instead of SYM_FUNC_START to cause their type
> > > hashes to be emitted when the kernel is built with CONFIG_CFI_CLANG=y.
> > > Otherwise, the code crashes with a CFI failure.
> > >
> > > (Or at least that should be the case. For some reason the CFI checks in
> > > sm4_avx_cbc_decrypt(), sm4_avx_cfb_decrypt(), and sm4_avx_ctr_crypt()
> > > are not always being generated when using current tip-of-tree clang.
> > > Either way, this patch is a good idea.)
> >
> > Sami, is it expected that a CFI check isn't being generated for the indirect
> > call to 'func' in sm4_avx_cbc_decrypt()? I'm using LLVM commit 4a7be42d922af0.
>
> If the compiler emits an indirect call, it should also emit a CFI
> check. What's the assembly code it generates here?
>

Here it is. I think the indirect call to 'func', without a preceding CFI check,
happens at address ffffffff81143599. 'func' is an argument to
sm4_avx_cbc_decrypt(), but it was spilled to the stack -- maybe that made a
difference somehow. I've attached the kconfig I'm using, in case you need that.

I also notice that sm4_avx_cbc_decrypt never has its address taken, yet it
still has a CFI type hash. Is that expected? Maybe it comes from the
EXPORT_SYMBOL_GPL?

ffffffff811434b0 <__cfi_sm4_avx_cbc_decrypt>:
ffffffff811434b0: 90 nop
ffffffff811434b1: 90 nop
ffffffff811434b2: 90 nop
ffffffff811434b3: 90 nop
ffffffff811434b4: 90 nop
ffffffff811434b5: 90 nop
ffffffff811434b6: 90 nop
ffffffff811434b7: 90 nop
ffffffff811434b8: 90 nop
ffffffff811434b9: 90 nop
ffffffff811434ba: 90 nop
ffffffff811434bb: b8 bd 55 47 3e mov $0x3e4755bd,%eax

ffffffff811434c0 <sm4_avx_cbc_decrypt>:
ffffffff811434c0: 55 push %rbp
ffffffff811434c1: 48 89 e5 mov %rsp,%rbp
ffffffff811434c4: 41 57 push %r15
ffffffff811434c6: 41 56 push %r14
ffffffff811434c8: 41 55 push %r13
ffffffff811434ca: 41 54 push %r12
ffffffff811434cc: 53 push %rbx
ffffffff811434cd: 48 83 e4 f0 and $0xfffffffffffffff0,%rsp
ffffffff811434d1: 48 81 ec 60 01 00 00 sub $0x160,%rsp
ffffffff811434d8: 49 89 d4 mov %rdx,%r12
ffffffff811434db: 41 89 f5 mov %esi,%r13d
ffffffff811434de: 48 89 fb mov %rdi,%rbx
ffffffff811434e1: 65 48 8b 04 25 28 00 mov %gs:0x28,%rax
ffffffff811434e8: 00 00
ffffffff811434ea: 48 89 84 24 50 01 00 mov %rax,0x150(%rsp)
ffffffff811434f1: 00
ffffffff811434f2: 4c 8b 77 40 mov 0x40(%rdi),%r14
ffffffff811434f6: 4c 8d 7c 24 28 lea 0x28(%rsp),%r15
ffffffff811434fb: ba 98 00 00 00 mov $0x98,%edx
ffffffff81143500: 4c 89 ff mov %r15,%rdi
ffffffff81143503: 31 f6 xor %esi,%esi
ffffffff81143505: e8 f6 09 72 00 call ffffffff81863f00 <__memset>
ffffffff8114350a: 4c 89 ff mov %r15,%rdi
ffffffff8114350d: 48 89 de mov %rbx,%rsi
ffffffff81143510: 31 d2 xor %edx,%edx
ffffffff81143512: e8 59 ec 2e 00 call ffffffff81432170 <skcipher_walk_virt>
ffffffff81143517: 44 8b 7c 24 58 mov 0x58(%rsp),%r15d
ffffffff8114351c: 45 85 ff test %r15d,%r15d
ffffffff8114351f: 0f 84 d5 03 00 00 je ffffffff811438fa <sm4_avx_cbc_decrypt+0x43a>
ffffffff81143525: 49 81 c6 98 00 00 00 add $0x98,%r14
ffffffff8114352c: 4c 89 74 24 10 mov %r14,0x10(%rsp)
ffffffff81143531: 44 89 e8 mov %r13d,%eax
ffffffff81143534: 48 89 44 24 18 mov %rax,0x18(%rsp)
ffffffff81143539: 45 89 ee mov %r13d,%r14d
ffffffff8114353c: 4c 89 64 24 20 mov %r12,0x20(%rsp)
ffffffff81143541: 44 89 6c 24 0c mov %r13d,0xc(%rsp)
ffffffff81143546: eb 20 jmp ffffffff81143568 <sm4_avx_cbc_decrypt+0xa8>
ffffffff81143548: e8 33 57 ef ff call ffffffff81038c80 <kernel_fpu_end>
ffffffff8114354d: 48 8d 7c 24 28 lea 0x28(%rsp),%rdi
ffffffff81143552: 44 89 fe mov %r15d,%esi
ffffffff81143555: e8 06 e6 2e 00 call ffffffff81431b60 <skcipher_walk_done>
ffffffff8114355a: 44 8b 7c 24 58 mov 0x58(%rsp),%r15d
ffffffff8114355f: 45 85 ff test %r15d,%r15d
ffffffff81143562: 0f 84 92 03 00 00 je ffffffff811438fa <sm4_avx_cbc_decrypt+0x43a>
ffffffff81143568: 48 8b 5c 24 30 mov 0x30(%rsp),%rbx
ffffffff8114356d: 4c 8b 6c 24 40 mov 0x40(%rsp),%r13
ffffffff81143572: bf 02 00 00 00 mov $0x2,%edi
ffffffff81143577: e8 64 55 ef ff call ffffffff81038ae0 <kernel_fpu_begin_mask>
ffffffff8114357c: 45 39 f7 cmp %r14d,%r15d
ffffffff8114357f: 4c 8b 64 24 18 mov 0x18(%rsp),%r12
ffffffff81143584: 72 25 jb ffffffff811435ab <sm4_avx_cbc_decrypt+0xeb>
ffffffff81143586: 48 8b 8c 24 a0 00 00 mov 0xa0(%rsp),%rcx
ffffffff8114358d: 00
ffffffff8114358e: 48 8b 7c 24 10 mov 0x10(%rsp),%rdi
ffffffff81143593: 4c 89 ee mov %r13,%rsi
ffffffff81143596: 48 89 da mov %rbx,%rdx
ffffffff81143599: ff 54 24 20 call *0x20(%rsp)
ffffffff8114359d: 4d 01 e5 add %r12,%r13
ffffffff811435a0: 4c 01 e3 add %r12,%rbx
ffffffff811435a3: 45 29 f7 sub %r14d,%r15d
ffffffff811435a6: 45 39 f7 cmp %r14d,%r15d
ffffffff811435a9: 73 db jae ffffffff81143586 <sm4_avx_cbc_decrypt+0xc6>
ffffffff811435ab: 41 83 ff 10 cmp $0x10,%r15d
ffffffff811435af: 73 6c jae ffffffff8114361d <sm4_avx_cbc_decrypt+0x15d>
ffffffff811435b1: eb 95 jmp ffffffff81143548 <sm4_avx_cbc_decrypt+0x88>
ffffffff811435b3: 48 83 c1 f0 add $0xfffffffffffffff0,%rcx
ffffffff811435b7: 49 83 c5 f0 add $0xfffffffffffffff0,%r13
ffffffff811435bb: 44 8b 74 24 0c mov 0xc(%rsp),%r14d
ffffffff811435c0: 48 8b 94 24 a0 00 00 mov 0xa0(%rsp),%rdx
ffffffff811435c7: 00
ffffffff811435c8: 48 8b b4 24 d0 00 00 mov 0xd0(%rsp),%rsi
ffffffff811435cf: 00
ffffffff811435d0: 48 33 32 xor (%rdx),%rsi
ffffffff811435d3: 49 89 75 00 mov %rsi,0x0(%r13)
ffffffff811435d7: 48 8b b4 24 d8 00 00 mov 0xd8(%rsp),%rsi
ffffffff811435de: 00
ffffffff811435df: 48 33 72 08 xor 0x8(%rdx),%rsi
ffffffff811435e3: 49 89 75 08 mov %rsi,0x8(%r13)
ffffffff811435e7: 48 8b 94 24 a0 00 00 mov 0xa0(%rsp),%rdx
ffffffff811435ee: 00
ffffffff811435ef: 48 8b b4 24 c0 00 00 mov 0xc0(%rsp),%rsi
ffffffff811435f6: 00
ffffffff811435f7: 48 8b bc 24 c8 00 00 mov 0xc8(%rsp),%rdi
ffffffff811435fe: 00
ffffffff811435ff: 48 89 7a 08 mov %rdi,0x8(%rdx)
ffffffff81143603: 48 89 32 mov %rsi,(%rdx)
ffffffff81143606: 89 c2 mov %eax,%edx
ffffffff81143608: 49 01 d5 add %rdx,%r13
ffffffff8114360b: 48 8d 5c 0a 10 lea 0x10(%rdx,%rcx,1),%rbx
ffffffff81143610: 41 29 c7 sub %eax,%r15d
ffffffff81143613: 41 83 ff 0f cmp $0xf,%r15d
ffffffff81143617: 0f 86 2b ff ff ff jbe ffffffff81143548 <sm4_avx_cbc_decrypt+0x88>
ffffffff8114361d: 4d 89 ee mov %r13,%r14
ffffffff81143620: 48 c7 84 24 48 01 00 movq $0x0,0x148(%rsp)
ffffffff81143627: 00 00 00 00 00
ffffffff8114362c: 48 c7 84 24 40 01 00 movq $0x0,0x140(%rsp)
ffffffff81143633: 00 00 00 00 00
ffffffff81143638: 48 c7 84 24 38 01 00 movq $0x0,0x138(%rsp)
ffffffff8114363f: 00 00 00 00 00
ffffffff81143644: 48 c7 84 24 30 01 00 movq $0x0,0x130(%rsp)
ffffffff8114364b: 00 00 00 00 00
ffffffff81143650: 48 c7 84 24 28 01 00 movq $0x0,0x128(%rsp)
ffffffff81143657: 00 00 00 00 00
ffffffff8114365c: 48 c7 84 24 20 01 00 movq $0x0,0x120(%rsp)
ffffffff81143663: 00 00 00 00 00
ffffffff81143668: 48 c7 84 24 18 01 00 movq $0x0,0x118(%rsp)
ffffffff8114366f: 00 00 00 00 00
ffffffff81143674: 48 c7 84 24 10 01 00 movq $0x0,0x110(%rsp)
ffffffff8114367b: 00 00 00 00 00
ffffffff81143680: 48 c7 84 24 08 01 00 movq $0x0,0x108(%rsp)
ffffffff81143687: 00 00 00 00 00
ffffffff8114368c: 48 c7 84 24 00 01 00 movq $0x0,0x100(%rsp)
ffffffff81143693: 00 00 00 00 00
ffffffff81143698: 48 c7 84 24 f8 00 00 movq $0x0,0xf8(%rsp)
ffffffff8114369f: 00 00 00 00 00
ffffffff811436a4: 48 c7 84 24 f0 00 00 movq $0x0,0xf0(%rsp)
ffffffff811436ab: 00 00 00 00 00
ffffffff811436b0: 48 c7 84 24 e8 00 00 movq $0x0,0xe8(%rsp)
ffffffff811436b7: 00 00 00 00 00
ffffffff811436bc: 48 c7 84 24 e0 00 00 movq $0x0,0xe0(%rsp)
ffffffff811436c3: 00 00 00 00 00
ffffffff811436c8: 48 c7 84 24 d8 00 00 movq $0x0,0xd8(%rsp)
ffffffff811436cf: 00 00 00 00 00
ffffffff811436d4: 48 c7 84 24 d0 00 00 movq $0x0,0xd0(%rsp)
ffffffff811436db: 00 00 00 00 00
ffffffff811436e0: 48 c7 84 24 c8 00 00 movq $0x0,0xc8(%rsp)
ffffffff811436e7: 00 00 00 00 00
ffffffff811436ec: 48 c7 84 24 c0 00 00 movq $0x0,0xc0(%rsp)
ffffffff811436f3: 00 00 00 00 00
ffffffff811436f8: 45 89 fc mov %r15d,%r12d
ffffffff811436fb: 41 c1 ec 04 shr $0x4,%r12d
ffffffff811436ff: 41 83 fc 08 cmp $0x8,%r12d
ffffffff81143703: b8 08 00 00 00 mov $0x8,%eax
ffffffff81143708: 44 0f 43 e0 cmovae %eax,%r12d
ffffffff8114370c: 48 8b 7c 24 10 mov 0x10(%rsp),%rdi
ffffffff81143711: 48 8d b4 24 d0 00 00 lea 0xd0(%rsp),%rsi
ffffffff81143718: 00
ffffffff81143719: 48 89 da mov %rbx,%rdx
ffffffff8114371c: 44 89 e1 mov %r12d,%ecx
ffffffff8114371f: e8 bc f7 ff ff call ffffffff81142ee0 <sm4_aesni_avx_crypt8>
ffffffff81143724: 44 89 e0 mov %r12d,%eax
ffffffff81143727: c1 e0 04 shl $0x4,%eax
ffffffff8114372a: 48 8d 4c 18 e0 lea -0x20(%rax,%rbx,1),%rcx
ffffffff8114372f: 44 8d 68 f0 lea -0x10(%rax),%r13d
ffffffff81143733: 4d 01 f5 add %r14,%r13
ffffffff81143736: 48 8b 54 18 f0 mov -0x10(%rax,%rbx,1),%rdx
ffffffff8114373b: 48 8b 74 18 f8 mov -0x8(%rax,%rbx,1),%rsi
ffffffff81143740: 48 89 b4 24 c8 00 00 mov %rsi,0xc8(%rsp)
ffffffff81143747: 00
ffffffff81143748: 48 89 94 24 c0 00 00 mov %rdx,0xc0(%rsp)
ffffffff8114374f: 00
ffffffff81143750: 41 83 fc 02 cmp $0x2,%r12d
ffffffff81143754: 0f 82 61 fe ff ff jb ffffffff811435bb <sm4_avx_cbc_decrypt+0xfb>
ffffffff8114375a: 44 89 e2 mov %r12d,%edx
ffffffff8114375d: 48 8d 72 ff lea -0x1(%rdx),%rsi
ffffffff81143761: 48 89 f7 mov %rsi,%rdi
ffffffff81143764: 48 c1 e7 04 shl $0x4,%rdi
ffffffff81143768: 4c 8b 84 3c d0 00 00 mov 0xd0(%rsp,%rdi,1),%r8
ffffffff8114376f: 00
ffffffff81143770: 4c 33 01 xor (%rcx),%r8
ffffffff81143773: 4d 89 45 00 mov %r8,0x0(%r13)
ffffffff81143777: 48 8b bc 3c d8 00 00 mov 0xd8(%rsp,%rdi,1),%rdi
ffffffff8114377e: 00
ffffffff8114377f: 48 33 79 08 xor 0x8(%rcx),%rdi
ffffffff81143783: 49 89 7d 08 mov %rdi,0x8(%r13)
ffffffff81143787: 41 83 fc 03 cmp $0x3,%r12d
ffffffff8114378b: 0f 82 22 fe ff ff jb ffffffff811435b3 <sm4_avx_cbc_decrypt+0xf3>
ffffffff81143791: 48 8d 7a fe lea -0x2(%rdx),%rdi
ffffffff81143795: 49 89 f8 mov %rdi,%r8
ffffffff81143798: 49 c1 e0 04 shl $0x4,%r8
ffffffff8114379c: 4e 8b 8c 04 d0 00 00 mov 0xd0(%rsp,%r8,1),%r9
ffffffff811437a3: 00
ffffffff811437a4: 4c 33 49 f0 xor -0x10(%rcx),%r9
ffffffff811437a8: 4d 89 4d f0 mov %r9,-0x10(%r13)
ffffffff811437ac: 4e 8b 84 04 d8 00 00 mov 0xd8(%rsp,%r8,1),%r8
ffffffff811437b3: 00
ffffffff811437b4: 4c 33 41 f8 xor -0x8(%rcx),%r8
ffffffff811437b8: 4d 89 45 f8 mov %r8,-0x8(%r13)
ffffffff811437bc: 48 83 fe 03 cmp $0x3,%rsi
ffffffff811437c0: 44 8b 74 24 0c mov 0xc(%rsp),%r14d
ffffffff811437c5: 0f 82 ee 00 00 00 jb ffffffff811438b9 <sm4_avx_cbc_decrypt+0x3f9>
ffffffff811437cb: 48 8d 72 fd lea -0x3(%rdx),%rsi
ffffffff811437cf: 49 89 f0 mov %rsi,%r8
ffffffff811437d2: 49 c1 e0 04 shl $0x4,%r8
ffffffff811437d6: 4e 8b 8c 04 d0 00 00 mov 0xd0(%rsp,%r8,1),%r9
ffffffff811437dd: 00
ffffffff811437de: 4c 33 49 e0 xor -0x20(%rcx),%r9
ffffffff811437e2: 4d 89 4d e0 mov %r9,-0x20(%r13)
ffffffff811437e6: 4e 8b 84 04 d8 00 00 mov 0xd8(%rsp,%r8,1),%r8
ffffffff811437ed: 00
ffffffff811437ee: 4c 33 41 e8 xor -0x18(%rcx),%r8
ffffffff811437f2: 4d 89 45 e8 mov %r8,-0x18(%r13)
ffffffff811437f6: 48 83 ff 03 cmp $0x3,%rdi
ffffffff811437fa: 0f 82 c6 00 00 00 jb ffffffff811438c6 <sm4_avx_cbc_decrypt+0x406>
ffffffff81143800: 48 8d 7a fc lea -0x4(%rdx),%rdi
ffffffff81143804: 49 89 f8 mov %rdi,%r8
ffffffff81143807: 49 c1 e0 04 shl $0x4,%r8
ffffffff8114380b: 4e 8b 8c 04 d0 00 00 mov 0xd0(%rsp,%r8,1),%r9
ffffffff81143812: 00
ffffffff81143813: 4c 33 49 d0 xor -0x30(%rcx),%r9
ffffffff81143817: 4d 89 4d d0 mov %r9,-0x30(%r13)
ffffffff8114381b: 4e 8b 84 04 d8 00 00 mov 0xd8(%rsp,%r8,1),%r8
ffffffff81143822: 00
ffffffff81143823: 4c 33 41 d8 xor -0x28(%rcx),%r8
ffffffff81143827: 4d 89 45 d8 mov %r8,-0x28(%r13)
ffffffff8114382b: 48 83 fe 03 cmp $0x3,%rsi
ffffffff8114382f: 0f 82 9e 00 00 00 jb ffffffff811438d3 <sm4_avx_cbc_decrypt+0x413>
ffffffff81143835: 48 8d 72 fb lea -0x5(%rdx),%rsi
ffffffff81143839: 49 89 f0 mov %rsi,%r8
ffffffff8114383c: 49 c1 e0 04 shl $0x4,%r8
ffffffff81143840: 4e 8b 8c 04 d0 00 00 mov 0xd0(%rsp,%r8,1),%r9
ffffffff81143847: 00
ffffffff81143848: 4c 33 49 c0 xor -0x40(%rcx),%r9
ffffffff8114384c: 4d 89 4d c0 mov %r9,-0x40(%r13)
ffffffff81143850: 4e 8b 84 04 d8 00 00 mov 0xd8(%rsp,%r8,1),%r8
ffffffff81143857: 00
ffffffff81143858: 4c 33 41 c8 xor -0x38(%rcx),%r8
ffffffff8114385c: 4d 89 45 c8 mov %r8,-0x38(%r13)
ffffffff81143860: 48 83 ff 03 cmp $0x3,%rdi
ffffffff81143864: 72 7a jb ffffffff811438e0 <sm4_avx_cbc_decrypt+0x420>
ffffffff81143866: 48 c1 e2 04 shl $0x4,%rdx
ffffffff8114386a: 4c 8d 84 24 d0 00 00 lea 0xd0(%rsp),%r8
ffffffff81143871: 00
ffffffff81143872: 4a 8b 7c 02 a0 mov -0x60(%rdx,%r8,1),%rdi
ffffffff81143877: 48 33 79 b0 xor -0x50(%rcx),%rdi
ffffffff8114387b: 49 89 7d b0 mov %rdi,-0x50(%r13)
ffffffff8114387f: 4a 8b 7c 02 a8 mov -0x58(%rdx,%r8,1),%rdi
ffffffff81143884: 48 33 79 b8 xor -0x48(%rcx),%rdi
ffffffff81143888: 49 89 7d b8 mov %rdi,-0x48(%r13)
ffffffff8114388c: 48 83 fe 03 cmp $0x3,%rsi
ffffffff81143890: 72 5b jb ffffffff811438ed <sm4_avx_cbc_decrypt+0x42d>
ffffffff81143892: 4a 8b 74 02 90 mov -0x70(%rdx,%r8,1),%rsi
ffffffff81143897: 48 33 71 a0 xor -0x60(%rcx),%rsi
ffffffff8114389b: 49 89 75 a0 mov %rsi,-0x60(%r13)
ffffffff8114389f: 4a 8b 54 02 98 mov -0x68(%rdx,%r8,1),%rdx
ffffffff811438a4: 48 33 51 a8 xor -0x58(%rcx),%rdx
ffffffff811438a8: 49 89 55 a8 mov %rdx,-0x58(%r13)
ffffffff811438ac: 48 83 c1 90 add $0xffffffffffffff90,%rcx
ffffffff811438b0: 49 83 c5 90 add $0xffffffffffffff90,%r13
ffffffff811438b4: e9 07 fd ff ff jmp ffffffff811435c0 <sm4_avx_cbc_decrypt+0x100>
ffffffff811438b9: 48 83 c1 e0 add $0xffffffffffffffe0,%rcx
ffffffff811438bd: 49 83 c5 e0 add $0xffffffffffffffe0,%r13
ffffffff811438c1: e9 fa fc ff ff jmp ffffffff811435c0 <sm4_avx_cbc_decrypt+0x100>
ffffffff811438c6: 48 83 c1 d0 add $0xffffffffffffffd0,%rcx
ffffffff811438ca: 49 83 c5 d0 add $0xffffffffffffffd0,%r13
ffffffff811438ce: e9 ed fc ff ff jmp ffffffff811435c0 <sm4_avx_cbc_decrypt+0x100>
ffffffff811438d3: 48 83 c1 c0 add $0xffffffffffffffc0,%rcx
ffffffff811438d7: 49 83 c5 c0 add $0xffffffffffffffc0,%r13
ffffffff811438db: e9 e0 fc ff ff jmp ffffffff811435c0 <sm4_avx_cbc_decrypt+0x100>
ffffffff811438e0: 48 83 c1 b0 add $0xffffffffffffffb0,%rcx
ffffffff811438e4: 49 83 c5 b0 add $0xffffffffffffffb0,%r13
ffffffff811438e8: e9 d3 fc ff ff jmp ffffffff811435c0 <sm4_avx_cbc_decrypt+0x100>
ffffffff811438ed: 48 83 c1 a0 add $0xffffffffffffffa0,%rcx
ffffffff811438f1: 49 83 c5 a0 add $0xffffffffffffffa0,%r13
ffffffff811438f5: e9 c6 fc ff ff jmp ffffffff811435c0 <sm4_avx_cbc_decrypt+0x100>
ffffffff811438fa: 65 48 8b 0c 25 28 00 mov %gs:0x28,%rcx
ffffffff81143901: 00 00
ffffffff81143903: 48 3b 8c 24 50 01 00 cmp 0x150(%rsp),%rcx
ffffffff8114390a: 00
ffffffff8114390b: 75 0f jne ffffffff8114391c <sm4_avx_cbc_decrypt+0x45c>
ffffffff8114390d: 48 8d 65 d8 lea -0x28(%rbp),%rsp
ffffffff81143911: 5b pop %rbx
ffffffff81143912: 41 5c pop %r12
ffffffff81143914: 41 5d pop %r13
ffffffff81143916: 41 5e pop %r14
ffffffff81143918: 41 5f pop %r15
ffffffff8114391a: 5d pop %rbp
ffffffff8114391b: c3 ret
ffffffff8114391c: e8 9f 9d 72 00 call ffffffff8186d6c0 <__stack_chk_fail>
ffffffff81143921: 66 2e 0f 1f 84 00 00 cs nopw 0x0(%rax,%rax,1)
ffffffff81143928: 00 00 00
ffffffff8114392b: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)


Attachments:
config.txt (66.45 kB)

2022-11-18 21:07:07

by Sami Tolvanen

Subject: Re: [PATCH v2 08/12] crypto: x86/sm4 - fix crash with CFI enabled

On Fri, Nov 18, 2022 at 12:27 PM Sami Tolvanen <[email protected]> wrote:
>
> On Fri, Nov 18, 2022 at 12:10 PM Eric Biggers <[email protected]> wrote:
> > Sami, is it expected that a CFI check isn't being generated for the indirect
> > call to 'func' in sm4_avx_cbc_decrypt()? I'm using LLVM commit 4a7be42d922af0.
>
> If the compiler emits an indirect call, it should also emit a CFI
> check. What's the assembly code it generates here?

With CONFIG_RETPOLINE, the check is emitted as expected, but I can
reproduce this issue without retpolines. It looks like the cfi-type
attribute is dropped from the machine instruction in one of the X86
specific passes. I'll take a look.
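
In case it helps, conceptually the instrumentation that is missing here
boils down to a pre-call hash comparison.  This is a sketch of the
semantics only -- the real check is emitted inline by clang, and the
hash placement shown is the x86 layout:

#include <linux/bug.h>
#include <linux/types.h>

/*
 * Each kCFI-typed function carries a 32-bit hash of its prototype
 * immediately before its entry point; every instrumented caller
 * verifies the hash before jumping.
 */
static inline void cfi_checked_call(void (*func)(void *), void *arg,
				    u32 expected)
{
	u32 actual = *(const u32 *)((const char *)func - 4);

	if (actual != expected)
		BUG();		/* clang emits a ud2 trap here */
	func(arg);
}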

Sami

2022-11-18 22:05:57

by Sami Tolvanen

Subject: Re: [PATCH v2 08/12] crypto: x86/sm4 - fix crash with CFI enabled

On Fri, Nov 18, 2022 at 12:53 PM Sami Tolvanen <[email protected]> wrote:
>
> On Fri, Nov 18, 2022 at 12:27 PM Sami Tolvanen <[email protected]> wrote:
> >
> > On Fri, Nov 18, 2022 at 12:10 PM Eric Biggers <[email protected]> wrote:
> > > Sami, is it expected that a CFI check isn't being generated for the indirect
> > > call to 'func' in sm4_avx_cbc_decrypt()? I'm using LLVM commit 4a7be42d922af0.
> >
> > If the compiler emits an indirect call, it should also emit a CFI
> > check. What's the assembly code it generates here?
>
> With CONFIG_RETPOLINE, the check is emitted as expected, but I can
> reproduce this issue without retpolines. It looks like the cfi-type
> attribute is dropped from the machine instruction in one of the X86
> specific passes. I'll take a look.

This should now be fixed in ToT LLVM after commit 7c96f61aaa4c. Thanks
for spotting the issue!

Sami

2022-11-18 22:39:17

by Eric Biggers

Subject: Re: [PATCH v2 08/12] crypto: x86/sm4 - fix crash with CFI enabled

On Fri, Nov 18, 2022 at 02:01:40PM -0800, Sami Tolvanen wrote:
> On Fri, Nov 18, 2022 at 12:53 PM Sami Tolvanen <[email protected]> wrote:
> >
> > On Fri, Nov 18, 2022 at 12:27 PM Sami Tolvanen <[email protected]> wrote:
> > >
> > > On Fri, Nov 18, 2022 at 12:10 PM Eric Biggers <[email protected]> wrote:
> > > > Sami, is it expected that a CFI check isn't being generated for the indirect
> > > > call to 'func' in sm4_avx_cbc_decrypt()? I'm using LLVM commit 4a7be42d922af0.
> > >
> > > If the compiler emits an indirect call, it should also emit a CFI
> > > check. What's the assembly code it generates here?
> >
> > With CONFIG_RETPOLINE, the check is emitted as expected, but I can
> > reproduce this issue without retpolines. It looks like the cfi-type
> > attribute is dropped from the machine instruction in one of the X86
> > specific passes. I'll take a look.
>
> This should now be fixed in ToT LLVM after commit 7c96f61aaa4c. Thanks
> for spotting the issue!
>

Thanks, it seems to work now. (If I revert my sm4 fix, I get a CFI failure as
expected.)

- Eric

2022-11-25 09:49:04

by Herbert Xu

Subject: Re: [PATCH v2 00/12] crypto: CFI fixes

Eric Biggers <[email protected]> wrote:
> This series fixes some crashes when CONFIG_CFI_CLANG (Control Flow
> Integrity) is enabled, with the new CFI implementation that was merged
> in 6.1 and is supported on x86. Some of them were unconditional
> crashes, while others depended on whether the compiler optimized out the
> indirect calls or not. This series also simplifies some code that was
> intended to work around limitations of the old CFI implementation and is
> unnecessary for the new CFI implementation.
>
> Changed in v2:
> - Added patch "crypto: x86/sm4 - fix crash with CFI enabled"
> - Restored accidentally-deleted include of <asm/assembler.h>
> - Tweaked some commit messages and added Reviewed-by and Acked-by tags
>
> Eric Biggers (12):
> crypto: x86/aegis128 - fix possible crash with CFI enabled
> crypto: x86/aria - fix crash with CFI enabled
> crypto: x86/nhpoly1305 - eliminate unnecessary CFI wrappers
> crypto: x86/sha1 - fix possible crash with CFI enabled
> crypto: x86/sha256 - fix possible crash with CFI enabled
> crypto: x86/sha512 - fix possible crash with CFI enabled
> crypto: x86/sm3 - fix possible crash with CFI enabled
> crypto: x86/sm4 - fix crash with CFI enabled
> crypto: arm64/nhpoly1305 - eliminate unnecessary CFI wrapper
> crypto: arm64/sm3 - fix possible crash with CFI enabled
> crypto: arm/nhpoly1305 - eliminate unnecessary CFI wrapper
> Revert "crypto: shash - avoid comparing pointers to exported functions
> under CFI"
>
> arch/arm/crypto/nh-neon-core.S | 2 +-
> arch/arm/crypto/nhpoly1305-neon-glue.c | 11 ++---------
> arch/arm64/crypto/nh-neon-core.S | 5 +++--
> arch/arm64/crypto/nhpoly1305-neon-glue.c | 11 ++---------
> arch/arm64/crypto/sm3-neon-core.S | 3 ++-
> arch/x86/crypto/aegis128-aesni-asm.S | 9 +++++----
> arch/x86/crypto/aria-aesni-avx-asm_64.S | 13 +++++++------
> arch/x86/crypto/nh-avx2-x86_64.S | 5 +++--
> arch/x86/crypto/nh-sse2-x86_64.S | 5 +++--
> arch/x86/crypto/nhpoly1305-avx2-glue.c | 11 ++---------
> arch/x86/crypto/nhpoly1305-sse2-glue.c | 11 ++---------
> arch/x86/crypto/sha1_ni_asm.S | 3 ++-
> arch/x86/crypto/sha1_ssse3_asm.S | 3 ++-
> arch/x86/crypto/sha256-avx-asm.S | 3 ++-
> arch/x86/crypto/sha256-avx2-asm.S | 3 ++-
> arch/x86/crypto/sha256-ssse3-asm.S | 3 ++-
> arch/x86/crypto/sha256_ni_asm.S | 3 ++-
> arch/x86/crypto/sha512-avx-asm.S | 3 ++-
> arch/x86/crypto/sha512-avx2-asm.S | 3 ++-
> arch/x86/crypto/sha512-ssse3-asm.S | 3 ++-
> arch/x86/crypto/sm3-avx-asm_64.S | 3 ++-
> arch/x86/crypto/sm4-aesni-avx-asm_64.S | 7 ++++---
> arch/x86/crypto/sm4-aesni-avx2-asm_64.S | 7 ++++---
> crypto/shash.c | 18 +++---------------
> include/crypto/internal/hash.h | 8 +++++++-
> 25 files changed, 70 insertions(+), 86 deletions(-)
>
>
> base-commit: 75df46b598b5b46b0857ee7d2410deaf215e23d1

All applied. Thanks.
--
Email: Herbert Xu <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt