2022-01-21 19:16:35

by Jason A. Donenfeld

Subject: [PATCH] lib/crypto: blake2s: avoid indirect calls to compression function for Clang CFI

blake2s_compress_generic is weakly aliased to blake2s_generic. The
current harness for function selection uses a function pointer, which is
ordinarily inlined and resolved at compile time. But when Clang's CFI is
enabled, CFI still triggers when making an indirect call via a weak
symbol. This seems like a bug in Clang's CFI, as though it's bucketing
weak symbols and strong symbols differently. It also only seems to
trigger when "full LTO" mode is used, rather than "thin LTO".

[ 0.000000][ T0] Kernel panic - not syncing: CFI failure (target: blake2s_compress_generic+0x0/0x1444)
[ 0.000000][ T0] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.16.0-mainline-06981-g076c855b846e #1
[ 0.000000][ T0] Hardware name: MT6873 (DT)
[ 0.000000][ T0] Call trace:
[ 0.000000][ T0] dump_backtrace+0xfc/0x1dc
[ 0.000000][ T0] dump_stack_lvl+0xa8/0x11c
[ 0.000000][ T0] panic+0x194/0x464
[ 0.000000][ T0] __cfi_check_fail+0x54/0x58
[ 0.000000][ T0] __cfi_slowpath_diag+0x354/0x4b0
[ 0.000000][ T0] blake2s_update+0x14c/0x178
[ 0.000000][ T0] _extract_entropy+0xf4/0x29c
[ 0.000000][ T0] crng_initialize_primary+0x24/0x94
[ 0.000000][ T0] rand_initialize+0x2c/0x6c
[ 0.000000][ T0] start_kernel+0x2f8/0x65c
[ 0.000000][ T0] __primary_switched+0xc4/0x7be4
[ 0.000000][ T0] Rebooting in 5 seconds..

Nonetheless, the function pointer method isn't so terrific anyway, so
this patch replaces it with a simple boolean, which also gets inlined
away. This successfully works around the Clang bug.

In general, I'm not too keen on all of the indirection involved here; it
clearly does more harm than good. Hopefully the whole thing can get
cleaned up down the road when lib/crypto is overhauled more
comprehensively. But for now, we go with a simple bandaid.

Fixes: 6048fdcc5f26 ("lib/crypto: blake2s: include as built-in")
Reported-by: Miles Chen <[email protected]>
Cc: Nick Desaulniers <[email protected]>
Cc: Sami Tolvanen <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Signed-off-by: Jason A. Donenfeld <[email protected]>
---
arch/arm/crypto/blake2s-shash.c | 4 ++--
arch/x86/crypto/blake2s-shash.c | 4 ++--
crypto/blake2s_generic.c | 4 ++--
include/crypto/internal/blake2s.h | 36 ++++++++++++++++++-------------
lib/crypto/blake2s.c | 4 ++--
5 files changed, 29 insertions(+), 23 deletions(-)

diff --git a/arch/arm/crypto/blake2s-shash.c b/arch/arm/crypto/blake2s-shash.c
index 17c1c3bfe2f5..763c73beea2d 100644
--- a/arch/arm/crypto/blake2s-shash.c
+++ b/arch/arm/crypto/blake2s-shash.c
@@ -13,12 +13,12 @@
static int crypto_blake2s_update_arm(struct shash_desc *desc,
const u8 *in, unsigned int inlen)
{
- return crypto_blake2s_update(desc, in, inlen, blake2s_compress);
+ return crypto_blake2s_update(desc, in, inlen, false);
}

static int crypto_blake2s_final_arm(struct shash_desc *desc, u8 *out)
{
- return crypto_blake2s_final(desc, out, blake2s_compress);
+ return crypto_blake2s_final(desc, out, false);
}

#define BLAKE2S_ALG(name, driver_name, digest_size) \
diff --git a/arch/x86/crypto/blake2s-shash.c b/arch/x86/crypto/blake2s-shash.c
index f9e2fecdb761..59ae28abe35c 100644
--- a/arch/x86/crypto/blake2s-shash.c
+++ b/arch/x86/crypto/blake2s-shash.c
@@ -18,12 +18,12 @@
static int crypto_blake2s_update_x86(struct shash_desc *desc,
const u8 *in, unsigned int inlen)
{
- return crypto_blake2s_update(desc, in, inlen, blake2s_compress);
+ return crypto_blake2s_update(desc, in, inlen, false);
}

static int crypto_blake2s_final_x86(struct shash_desc *desc, u8 *out)
{
- return crypto_blake2s_final(desc, out, blake2s_compress);
+ return crypto_blake2s_final(desc, out, false);
}

#define BLAKE2S_ALG(name, driver_name, digest_size) \
diff --git a/crypto/blake2s_generic.c b/crypto/blake2s_generic.c
index 72fe480f9bd6..5f96a21f8788 100644
--- a/crypto/blake2s_generic.c
+++ b/crypto/blake2s_generic.c
@@ -15,12 +15,12 @@
static int crypto_blake2s_update_generic(struct shash_desc *desc,
const u8 *in, unsigned int inlen)
{
- return crypto_blake2s_update(desc, in, inlen, blake2s_compress_generic);
+ return crypto_blake2s_update(desc, in, inlen, true);
}

static int crypto_blake2s_final_generic(struct shash_desc *desc, u8 *out)
{
- return crypto_blake2s_final(desc, out, blake2s_compress_generic);
+ return crypto_blake2s_final(desc, out, true);
}

#define BLAKE2S_ALG(name, driver_name, digest_size) \
diff --git a/include/crypto/internal/blake2s.h b/include/crypto/internal/blake2s.h
index d39cfa0d333e..e2da9b413d44 100644
--- a/include/crypto/internal/blake2s.h
+++ b/include/crypto/internal/blake2s.h
@@ -24,14 +24,11 @@ static inline void blake2s_set_lastblock(struct blake2s_state *state)
state->f[0] = -1;
}

-typedef void (*blake2s_compress_t)(struct blake2s_state *state,
- const u8 *block, size_t nblocks, u32 inc);
-
/* Helper functions for BLAKE2s shared by the library and shash APIs */

-static inline void __blake2s_update(struct blake2s_state *state,
- const u8 *in, size_t inlen,
- blake2s_compress_t compress)
+static __always_inline void __blake2s_update(struct blake2s_state *state,
+ const u8 *in, size_t inlen,
+ bool force_generic)
{
const size_t fill = BLAKE2S_BLOCK_SIZE - state->buflen;

@@ -39,7 +36,10 @@ static inline void __blake2s_update(struct blake2s_state *state,
return;
if (inlen > fill) {
memcpy(state->buf + state->buflen, in, fill);
- (*compress)(state, state->buf, 1, BLAKE2S_BLOCK_SIZE);
+ if (force_generic)
+ blake2s_compress_generic(state, state->buf, 1, BLAKE2S_BLOCK_SIZE);
+ else
+ blake2s_compress(state, state->buf, 1, BLAKE2S_BLOCK_SIZE);
state->buflen = 0;
in += fill;
inlen -= fill;
@@ -47,7 +47,10 @@ static inline void __blake2s_update(struct blake2s_state *state,
if (inlen > BLAKE2S_BLOCK_SIZE) {
const size_t nblocks = DIV_ROUND_UP(inlen, BLAKE2S_BLOCK_SIZE);
/* Hash one less (full) block than strictly possible */
- (*compress)(state, in, nblocks - 1, BLAKE2S_BLOCK_SIZE);
+ if (force_generic)
+ blake2s_compress_generic(state, in, nblocks - 1, BLAKE2S_BLOCK_SIZE);
+ else
+ blake2s_compress(state, in, nblocks - 1, BLAKE2S_BLOCK_SIZE);
in += BLAKE2S_BLOCK_SIZE * (nblocks - 1);
inlen -= BLAKE2S_BLOCK_SIZE * (nblocks - 1);
}
@@ -55,13 +58,16 @@ static inline void __blake2s_update(struct blake2s_state *state,
state->buflen += inlen;
}

-static inline void __blake2s_final(struct blake2s_state *state, u8 *out,
- blake2s_compress_t compress)
+static __always_inline void __blake2s_final(struct blake2s_state *state, u8 *out,
+ bool force_generic)
{
blake2s_set_lastblock(state);
memset(state->buf + state->buflen, 0,
BLAKE2S_BLOCK_SIZE - state->buflen); /* Padding */
- (*compress)(state, state->buf, 1, state->buflen);
+ if (force_generic)
+ blake2s_compress_generic(state, state->buf, 1, state->buflen);
+ else
+ blake2s_compress(state, state->buf, 1, state->buflen);
cpu_to_le32_array(state->h, ARRAY_SIZE(state->h));
memcpy(out, state->h, state->outlen);
}
@@ -99,20 +105,20 @@ static inline int crypto_blake2s_init(struct shash_desc *desc)

static inline int crypto_blake2s_update(struct shash_desc *desc,
const u8 *in, unsigned int inlen,
- blake2s_compress_t compress)
+ bool force_generic)
{
struct blake2s_state *state = shash_desc_ctx(desc);

- __blake2s_update(state, in, inlen, compress);
+ __blake2s_update(state, in, inlen, force_generic);
return 0;
}

static inline int crypto_blake2s_final(struct shash_desc *desc, u8 *out,
- blake2s_compress_t compress)
+ bool force_generic)
{
struct blake2s_state *state = shash_desc_ctx(desc);

- __blake2s_final(state, out, compress);
+ __blake2s_final(state, out, force_generic);
return 0;
}

diff --git a/lib/crypto/blake2s.c b/lib/crypto/blake2s.c
index 9364f79937b8..c71c09621c09 100644
--- a/lib/crypto/blake2s.c
+++ b/lib/crypto/blake2s.c
@@ -18,14 +18,14 @@

void blake2s_update(struct blake2s_state *state, const u8 *in, size_t inlen)
{
- __blake2s_update(state, in, inlen, blake2s_compress);
+ __blake2s_update(state, in, inlen, false);
}
EXPORT_SYMBOL(blake2s_update);

void blake2s_final(struct blake2s_state *state, u8 *out)
{
WARN_ON(IS_ENABLED(DEBUG) && !out);
- __blake2s_final(state, out, blake2s_compress);
+ __blake2s_final(state, out, false);
memzero_explicit(state, sizeof(*state));
}
EXPORT_SYMBOL(blake2s_final);
--
2.34.1
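
The change in the patch above can be sketched outside the kernel as follows. This is a minimal userspace illustration, not the kernel code: every demo_ name is hypothetical, the two stand-in "compress" functions do toy arithmetic rather than BLAKE2s, and the demo_last_path variable exists only so the chosen path can be observed. It contrasts selecting the callee through a function pointer (an indirect call unless the compiler folds it) with selecting it through a compile-time-constant bool that picks between two direct calls.

```c
#include <stdbool.h>

/* Records which stand-in was called; for illustration only. */
enum demo_path { DEMO_NONE, DEMO_GENERIC, DEMO_ARCH };
static enum demo_path demo_last_path = DEMO_NONE;

/* Hypothetical stand-in for blake2s_compress_generic(). */
static unsigned int demo_compress_generic(unsigned int h, unsigned int block)
{
	demo_last_path = DEMO_GENERIC;
	return h * 31u + block;
}

/* Hypothetical stand-in for the arch-optimized blake2s_compress(). */
static unsigned int demo_compress_arch(unsigned int h, unsigned int block)
{
	demo_last_path = DEMO_ARCH;
	return h * 31u + block;
}

/* Before: the callee arrives as a function pointer. */
typedef unsigned int (*demo_compress_t)(unsigned int h, unsigned int block);

static inline unsigned int demo_update_ptr(unsigned int h, unsigned int block,
					   demo_compress_t compress)
{
	return (*compress)(h, block);	/* indirect call site */
}

/* After: a bool selects between two direct calls. */
static inline __attribute__((always_inline)) unsigned int
demo_update_bool(unsigned int h, unsigned int block, bool force_generic)
{
	if (force_generic)
		return demo_compress_generic(h, block);
	return demo_compress_arch(h, block);
}
```

Calling demo_update_bool(h, block, true) with a literal argument leaves the compiler a constant branch to fold once the helper is inlined, so only one direct call remains and no indirect call site is left for CFI to instrument.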


2022-01-21 21:13:53

by Jason A. Donenfeld

Subject: Re: [PATCH] lib/crypto: blake2s: avoid indirect calls to compression function for Clang CFI

On 1/19/22, Nathan Chancellor <[email protected]> wrote:
> On Wed, Jan 19, 2022 at 02:54:50PM +0100, Jason A. Donenfeld wrote:
>> blake2s_compress_generic is weakly aliased to blake2s_generic. The
>
> Isn't it the other way around?

Thanks. Changed that 'to' to a 'by'.

Jason

2022-01-22 08:06:52

by Eric Biggers

Subject: Re: [PATCH] lib/crypto: blake2s: avoid indirect calls to compression function for Clang CFI

On Wed, Jan 19, 2022 at 02:54:50PM +0100, Jason A. Donenfeld wrote:
> blake2s_compress_generic is weakly aliased to blake2s_generic. The
> current harness for function selection uses a function pointer, which is
> ordinarily inlined and resolved at compile time. But when Clang's CFI is
> enabled, CFI still triggers when making an indirect call via a weak
> symbol. This seems like a bug in Clang's CFI, as though it's bucketing
> weak symbols and strong symbols differently. It also only seems to
> trigger when "full LTO" mode is used, rather than "thin LTO".
>
> [ 0.000000][ T0] Kernel panic - not syncing: CFI failure (target: blake2s_compress_generic+0x0/0x1444)
> [ 0.000000][ T0] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.16.0-mainline-06981-g076c855b846e #1
> [ 0.000000][ T0] Hardware name: MT6873 (DT)
> [ 0.000000][ T0] Call trace:
> [ 0.000000][ T0] dump_backtrace+0xfc/0x1dc
> [ 0.000000][ T0] dump_stack_lvl+0xa8/0x11c
> [ 0.000000][ T0] panic+0x194/0x464
> [ 0.000000][ T0] __cfi_check_fail+0x54/0x58
> [ 0.000000][ T0] __cfi_slowpath_diag+0x354/0x4b0
> [ 0.000000][ T0] blake2s_update+0x14c/0x178
> [ 0.000000][ T0] _extract_entropy+0xf4/0x29c
> [ 0.000000][ T0] crng_initialize_primary+0x24/0x94
> [ 0.000000][ T0] rand_initialize+0x2c/0x6c
> [ 0.000000][ T0] start_kernel+0x2f8/0x65c
> [ 0.000000][ T0] __primary_switched+0xc4/0x7be4
> [ 0.000000][ T0] Rebooting in 5 seconds..
>
> Nonetheless, the function pointer method isn't so terrific anyway, so
> this patch replaces it with a simple boolean, which also gets inlined
> away. This successfully works around the Clang bug.
>
> In general, I'm not too keen on all of the indirection involved here; it
> clearly does more harm than good. Hopefully the whole thing can get
> cleaned up down the road when lib/crypto is overhauled more
> comprehensively. But for now, we go with a simple bandaid.
>
> Fixes: 6048fdcc5f26 ("lib/crypto: blake2s: include as built-in")
> Reported-by: Miles Chen <[email protected]>
> Cc: Nick Desaulniers <[email protected]>
> Cc: Sami Tolvanen <[email protected]>
> Cc: Ard Biesheuvel <[email protected]>
> Signed-off-by: Jason A. Donenfeld <[email protected]>
> ---
> arch/arm/crypto/blake2s-shash.c | 4 ++--
> arch/x86/crypto/blake2s-shash.c | 4 ++--
> crypto/blake2s_generic.c | 4 ++--
> include/crypto/internal/blake2s.h | 36 ++++++++++++++++++-------------
> lib/crypto/blake2s.c | 4 ++--
> 5 files changed, 29 insertions(+), 23 deletions(-)

There are some lines over 80 columns in this patch.

Otherwise this looks fine. It would be really nice to fix this in clang,
though.

- Eric

2022-01-22 08:08:49

by Sami Tolvanen

Subject: Re: [PATCH] lib/crypto: blake2s: avoid indirect calls to compression function for Clang CFI

Hi Jason,

On Fri, Jan 21, 2022 at 12:23 PM Jason A. Donenfeld <[email protected]> wrote:
>
> On Fri, Jan 21, 2022 at 8:54 PM Eric Biggers <[email protected]> wrote:
> > There are some lines over 80 columns in this patch.
>
> I'll fix that up.
>
> > Otherwise this looks fine. It would be really nice to fix this in clang,
> > though.
>
> I agree. I'm wondering whether it makes sense to roll with this patch,
> or to just not change anything and lean on Clang upstream to fix their
> bug. Full LTO is marked as "experimental" still. On the other hand,
> Android ships with it turned on, so how "experimental" can it really
> be? I don't have a strong read on the Clang ecosystem to know what
> makes most sense, to apply this or not. Do you have an opinion?

We'll work on fixing the compiler, but please note that it's most
likely going to take a while, so I would prefer to work around the
issue in the kernel for now.

Sami

2022-01-24 22:18:30

by Jason A. Donenfeld

Subject: [PATCH v2] lib/crypto: blake2s: avoid indirect calls to compression function for Clang CFI

blake2s_compress_generic is weakly aliased by blake2s_generic. The
current harness for function selection uses a function pointer, which is
ordinarily inlined and resolved at compile time. But when Clang's CFI is
enabled, CFI still triggers when making an indirect call via a weak
symbol. This seems like a bug in Clang's CFI, as though it's bucketing
weak symbols and strong symbols differently. It also only seems to
trigger when "full LTO" mode is used, rather than "thin LTO".

[ 0.000000][ T0] Kernel panic - not syncing: CFI failure (target: blake2s_compress_generic+0x0/0x1444)
[ 0.000000][ T0] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.16.0-mainline-06981-g076c855b846e #1
[ 0.000000][ T0] Hardware name: MT6873 (DT)
[ 0.000000][ T0] Call trace:
[ 0.000000][ T0] dump_backtrace+0xfc/0x1dc
[ 0.000000][ T0] dump_stack_lvl+0xa8/0x11c
[ 0.000000][ T0] panic+0x194/0x464
[ 0.000000][ T0] __cfi_check_fail+0x54/0x58
[ 0.000000][ T0] __cfi_slowpath_diag+0x354/0x4b0
[ 0.000000][ T0] blake2s_update+0x14c/0x178
[ 0.000000][ T0] _extract_entropy+0xf4/0x29c
[ 0.000000][ T0] crng_initialize_primary+0x24/0x94
[ 0.000000][ T0] rand_initialize+0x2c/0x6c
[ 0.000000][ T0] start_kernel+0x2f8/0x65c
[ 0.000000][ T0] __primary_switched+0xc4/0x7be4
[ 0.000000][ T0] Rebooting in 5 seconds..

Nonetheless, the function pointer method isn't so terrific anyway, so
this patch replaces it with a simple boolean, which also gets inlined
away. This successfully works around the Clang bug.

In general, I'm not too keen on all of the indirection involved here; it
clearly does more harm than good. Hopefully the whole thing can get
cleaned up down the road when lib/crypto is overhauled more
comprehensively. But for now, we go with a simple bandaid.

Fixes: 6048fdcc5f26 ("lib/crypto: blake2s: include as built-in")
Reported-by: Miles Chen <[email protected]>
Tested-by: Miles Chen <[email protected]>
Tested-by: Nathan Chancellor <[email protected]>
Link: https://github.com/ClangBuiltLinux/linux/issues/1567
Cc: Nick Desaulniers <[email protected]>
Cc: Sami Tolvanen <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Signed-off-by: Jason A. Donenfeld <[email protected]>
---
Changes v1->v2:
- Wrapped columns at 80 for Eric.

arch/arm/crypto/blake2s-shash.c | 4 ++--
arch/x86/crypto/blake2s-shash.c | 4 ++--
crypto/blake2s_generic.c | 4 ++--
include/crypto/internal/blake2s.h | 40 +++++++++++++++++++------------
lib/crypto/blake2s.c | 4 ++--
5 files changed, 33 insertions(+), 23 deletions(-)

diff --git a/arch/arm/crypto/blake2s-shash.c b/arch/arm/crypto/blake2s-shash.c
index 17c1c3bfe2f5..763c73beea2d 100644
--- a/arch/arm/crypto/blake2s-shash.c
+++ b/arch/arm/crypto/blake2s-shash.c
@@ -13,12 +13,12 @@
static int crypto_blake2s_update_arm(struct shash_desc *desc,
const u8 *in, unsigned int inlen)
{
- return crypto_blake2s_update(desc, in, inlen, blake2s_compress);
+ return crypto_blake2s_update(desc, in, inlen, false);
}

static int crypto_blake2s_final_arm(struct shash_desc *desc, u8 *out)
{
- return crypto_blake2s_final(desc, out, blake2s_compress);
+ return crypto_blake2s_final(desc, out, false);
}

#define BLAKE2S_ALG(name, driver_name, digest_size) \
diff --git a/arch/x86/crypto/blake2s-shash.c b/arch/x86/crypto/blake2s-shash.c
index f9e2fecdb761..59ae28abe35c 100644
--- a/arch/x86/crypto/blake2s-shash.c
+++ b/arch/x86/crypto/blake2s-shash.c
@@ -18,12 +18,12 @@
static int crypto_blake2s_update_x86(struct shash_desc *desc,
const u8 *in, unsigned int inlen)
{
- return crypto_blake2s_update(desc, in, inlen, blake2s_compress);
+ return crypto_blake2s_update(desc, in, inlen, false);
}

static int crypto_blake2s_final_x86(struct shash_desc *desc, u8 *out)
{
- return crypto_blake2s_final(desc, out, blake2s_compress);
+ return crypto_blake2s_final(desc, out, false);
}

#define BLAKE2S_ALG(name, driver_name, digest_size) \
diff --git a/crypto/blake2s_generic.c b/crypto/blake2s_generic.c
index 72fe480f9bd6..5f96a21f8788 100644
--- a/crypto/blake2s_generic.c
+++ b/crypto/blake2s_generic.c
@@ -15,12 +15,12 @@
static int crypto_blake2s_update_generic(struct shash_desc *desc,
const u8 *in, unsigned int inlen)
{
- return crypto_blake2s_update(desc, in, inlen, blake2s_compress_generic);
+ return crypto_blake2s_update(desc, in, inlen, true);
}

static int crypto_blake2s_final_generic(struct shash_desc *desc, u8 *out)
{
- return crypto_blake2s_final(desc, out, blake2s_compress_generic);
+ return crypto_blake2s_final(desc, out, true);
}

#define BLAKE2S_ALG(name, driver_name, digest_size) \
diff --git a/include/crypto/internal/blake2s.h b/include/crypto/internal/blake2s.h
index d39cfa0d333e..52363eee2b20 100644
--- a/include/crypto/internal/blake2s.h
+++ b/include/crypto/internal/blake2s.h
@@ -24,14 +24,11 @@ static inline void blake2s_set_lastblock(struct blake2s_state *state)
state->f[0] = -1;
}

-typedef void (*blake2s_compress_t)(struct blake2s_state *state,
- const u8 *block, size_t nblocks, u32 inc);
-
/* Helper functions for BLAKE2s shared by the library and shash APIs */

-static inline void __blake2s_update(struct blake2s_state *state,
- const u8 *in, size_t inlen,
- blake2s_compress_t compress)
+static __always_inline void
+__blake2s_update(struct blake2s_state *state, const u8 *in, size_t inlen,
+ bool force_generic)
{
const size_t fill = BLAKE2S_BLOCK_SIZE - state->buflen;

@@ -39,7 +36,12 @@ static inline void __blake2s_update(struct blake2s_state *state,
return;
if (inlen > fill) {
memcpy(state->buf + state->buflen, in, fill);
- (*compress)(state, state->buf, 1, BLAKE2S_BLOCK_SIZE);
+ if (force_generic)
+ blake2s_compress_generic(state, state->buf, 1,
+ BLAKE2S_BLOCK_SIZE);
+ else
+ blake2s_compress(state, state->buf, 1,
+ BLAKE2S_BLOCK_SIZE);
state->buflen = 0;
in += fill;
inlen -= fill;
@@ -47,7 +49,12 @@ static inline void __blake2s_update(struct blake2s_state *state,
if (inlen > BLAKE2S_BLOCK_SIZE) {
const size_t nblocks = DIV_ROUND_UP(inlen, BLAKE2S_BLOCK_SIZE);
/* Hash one less (full) block than strictly possible */
- (*compress)(state, in, nblocks - 1, BLAKE2S_BLOCK_SIZE);
+ if (force_generic)
+ blake2s_compress_generic(state, in, nblocks - 1,
+ BLAKE2S_BLOCK_SIZE);
+ else
+ blake2s_compress(state, in, nblocks - 1,
+ BLAKE2S_BLOCK_SIZE);
in += BLAKE2S_BLOCK_SIZE * (nblocks - 1);
inlen -= BLAKE2S_BLOCK_SIZE * (nblocks - 1);
}
@@ -55,13 +62,16 @@ static inline void __blake2s_update(struct blake2s_state *state,
state->buflen += inlen;
}

-static inline void __blake2s_final(struct blake2s_state *state, u8 *out,
- blake2s_compress_t compress)
+static __always_inline void
+__blake2s_final(struct blake2s_state *state, u8 *out, bool force_generic)
{
blake2s_set_lastblock(state);
memset(state->buf + state->buflen, 0,
BLAKE2S_BLOCK_SIZE - state->buflen); /* Padding */
- (*compress)(state, state->buf, 1, state->buflen);
+ if (force_generic)
+ blake2s_compress_generic(state, state->buf, 1, state->buflen);
+ else
+ blake2s_compress(state, state->buf, 1, state->buflen);
cpu_to_le32_array(state->h, ARRAY_SIZE(state->h));
memcpy(out, state->h, state->outlen);
}
@@ -99,20 +109,20 @@ static inline int crypto_blake2s_init(struct shash_desc *desc)

static inline int crypto_blake2s_update(struct shash_desc *desc,
const u8 *in, unsigned int inlen,
- blake2s_compress_t compress)
+ bool force_generic)
{
struct blake2s_state *state = shash_desc_ctx(desc);

- __blake2s_update(state, in, inlen, compress);
+ __blake2s_update(state, in, inlen, force_generic);
return 0;
}

static inline int crypto_blake2s_final(struct shash_desc *desc, u8 *out,
- blake2s_compress_t compress)
+ bool force_generic)
{
struct blake2s_state *state = shash_desc_ctx(desc);

- __blake2s_final(state, out, compress);
+ __blake2s_final(state, out, force_generic);
return 0;
}

diff --git a/lib/crypto/blake2s.c b/lib/crypto/blake2s.c
index 9364f79937b8..c71c09621c09 100644
--- a/lib/crypto/blake2s.c
+++ b/lib/crypto/blake2s.c
@@ -18,14 +18,14 @@

void blake2s_update(struct blake2s_state *state, const u8 *in, size_t inlen)
{
- __blake2s_update(state, in, inlen, blake2s_compress);
+ __blake2s_update(state, in, inlen, false);
}
EXPORT_SYMBOL(blake2s_update);

void blake2s_final(struct blake2s_state *state, u8 *out)
{
WARN_ON(IS_ENABLED(DEBUG) && !out);
- __blake2s_final(state, out, blake2s_compress);
+ __blake2s_final(state, out, false);
memzero_explicit(state, sizeof(*state));
}
EXPORT_SYMBOL(blake2s_final);
--
2.34.1
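
The "Hash one less (full) block than strictly possible" comment in __blake2s_update is easier to follow with numbers. The sketch below is an assumption-laden illustration, not kernel code: the demo_ helpers are hypothetical, the block size is hard-coded to 64 bytes, and the internal buffer is assumed to be empty on entry. It reproduces only the arithmetic: when the input is longer than one block, all but the last (possibly full) block is compressed immediately and the rest stays buffered, so that the final call always has a block left to compress with the last-block flag set.

```c
#include <stddef.h>

#define DEMO_BLOCK_SIZE 64
#define DEMO_DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/*
 * Blocks compressed straight from the input for an inlen-byte update,
 * assuming the buffer is empty on entry.
 */
static size_t demo_blocks_hashed(size_t inlen)
{
	if (inlen > DEMO_BLOCK_SIZE)
		return DEMO_DIV_ROUND_UP(inlen, DEMO_BLOCK_SIZE) - 1;
	return 0;
}

/* Bytes left in the buffer afterwards, under the same assumption. */
static size_t demo_bytes_buffered(size_t inlen)
{
	return inlen - DEMO_BLOCK_SIZE * demo_blocks_hashed(inlen);
}
```

For example, a 128-byte update compresses one block and buffers the other 64 bytes, while a 129-byte update compresses two blocks and buffers a single byte.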

2022-01-25 16:53:14

by Jason A. Donenfeld

Subject: Re: [PATCH v2] lib/crypto: blake2s: avoid indirect calls to compression function for Clang CFI

On 1/25/22, Eric Biggers <[email protected]> wrote:
> On Mon, Jan 24, 2022 at 08:28:49PM +0100, Jason A. Donenfeld wrote:
>> blake2s_compress_generic is weakly aliased by blake2s_generic. The
>
> Don't you mean "weakly aliased by blake2s_compress"?

Grrrr. Thanks.

>> Changes v1->v2:
>> - Wrapped columns at 80 for Eric.
>
> It is the recommended coding style, so not just for me :-)

I was under the impression this increased to 100 mid-2020 or so, and
checkpatch.pl now makes noise at that width instead.
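
For readers unfamiliar with the construct being corrected above, a weak alias can be sketched in plain C as follows. This is an illustration only, assuming GCC or Clang on an ELF target; the demo_ names are hypothetical and this is not the kernel source. The generic function is the real definition, and the second name is a weak alias that resolves to it unless some other object file supplies a strong definition of that name.

```c
/* Generic implementation: always present. */
int demo_compress_generic(int x)
{
	return x + 1;
}

/*
 * Weak alias: demo_compress refers to demo_compress_generic unless a
 * strong definition of demo_compress is linked in from elsewhere
 * (for example, an arch-optimized implementation).
 */
int demo_compress(int x) __attribute__((weak, alias("demo_compress_generic")));
```

With nothing else linked in, demo_compress(41) and demo_compress_generic(41) run the same code and return the same value.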

2022-01-27 02:56:38

by John Stultz

Subject: Re: [PATCH] lib/crypto: blake2s: avoid indirect calls to compression function for Clang CFI

On Fri, Jan 21, 2022 at 11:17 AM Jason A. Donenfeld <[email protected]> wrote:
>
> blake2s_compress_generic is weakly aliased to blake2s_generic. The
> current harness for function selection uses a function pointer, which is
> ordinarily inlined and resolved at compile time. But when Clang's CFI is
> enabled, CFI still triggers when making an indirect call via a weak
> symbol. This seems like a bug in Clang's CFI, as though it's bucketing
> weak symbols and strong symbols differently. It also only seems to
> trigger when "full LTO" mode is used, rather than "thin LTO".
>
> [ 0.000000][ T0] Kernel panic - not syncing: CFI failure (target: blake2s_compress_generic+0x0/0x1444)
> [ 0.000000][ T0] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.16.0-mainline-06981-g076c855b846e #1
> [ 0.000000][ T0] Hardware name: MT6873 (DT)
> [ 0.000000][ T0] Call trace:
> [ 0.000000][ T0] dump_backtrace+0xfc/0x1dc
> [ 0.000000][ T0] dump_stack_lvl+0xa8/0x11c
> [ 0.000000][ T0] panic+0x194/0x464
> [ 0.000000][ T0] __cfi_check_fail+0x54/0x58
> [ 0.000000][ T0] __cfi_slowpath_diag+0x354/0x4b0
> [ 0.000000][ T0] blake2s_update+0x14c/0x178
> [ 0.000000][ T0] _extract_entropy+0xf4/0x29c
> [ 0.000000][ T0] crng_initialize_primary+0x24/0x94
> [ 0.000000][ T0] rand_initialize+0x2c/0x6c
> [ 0.000000][ T0] start_kernel+0x2f8/0x65c
> [ 0.000000][ T0] __primary_switched+0xc4/0x7be4
> [ 0.000000][ T0] Rebooting in 5 seconds..

YongQin also reported hitting this issue (also only in the LTO=full
case) on the db845c dev board. Sami pointed me to this patch, and I
just wanted to confirm it gets things booting again.

Reported-by: YongQin Liu <[email protected]>
Tested-by: John Stultz <[email protected]>

Thanks so much for the quick analysis and fix!
-john

2022-01-27 02:56:44

by Eric Biggers

Subject: Re: [PATCH v2] lib/crypto: blake2s: avoid indirect calls to compression function for Clang CFI

On Tue, Jan 25, 2022 at 01:23:34PM +0100, Jason A. Donenfeld wrote:
> On 1/25/22, Eric Biggers <[email protected]> wrote:
> > On Mon, Jan 24, 2022 at 08:28:49PM +0100, Jason A. Donenfeld wrote:
> >> blake2s_compress_generic is weakly aliased by blake2s_generic. The
> >
> > Don't you mean "weakly aliased by blake2s_compress"?
>
> Grrrr. Thanks.
>
> >> Changes v1->v2:
> >> - Wrapped columns at 80 for Eric.
> >
> > It is the recommended coding style, so not just for me :-)
>
> I was under the impression this increased to 100 mid-2020 or so, and
> checkpatch.pl now makes noise at that width instead.

From Documentation/process/coding-style.rst:

"The preferred limit on the length of a single line is 80 columns.

Statements longer than 80 columns should be broken into sensible chunks,
unless exceeding 80 columns significantly increases readability and does
not hide information."

It's not as strict as it used to be, but checkpatch seems to be overly lenient.
I always run it with --max-line-length=80.

- Eric