Since there are plenty of uses for the new-in-4.10 BPF digest feature
that would be problematic if malicious users could produce collisions,
the BPF digest should be collision-resistant. SHA-1 is no longer
considered collision-resistant, so switch it to SHA-256.
The actual switchover is trivial. Most of this series consists of
cleanups to the SHA256 code to make it usable as a standalone library
(since BPF should not depend on crypto).
The cleaned up library is much more user-friendly than the SHA-1 code,
so this also significantly tidies up the BPF digest code.
This is intended for 4.10. If this series misses 4.10 and nothing
takes its place, then we'll have an unpleasant ABI stability
situation.
Andy Lutomirski (6):
crypto/sha256: Refactor the API so it can be used without shash
crypto/sha256: Make the sha256 library functions selectable
bpf: Use SHA256 instead of SHA1 for bpf digests
bpf: Avoid copying the entire BPF program when hashing it
bpf: Rename fdinfo's prog_digest to prog_sha256
net: Rename TCA*BPF_DIGEST to ..._SHA256
arch/arm/crypto/sha2-ce-glue.c | 10 +++---
arch/arm/crypto/sha256_glue.c | 23 +++++++++-----
arch/arm/crypto/sha256_neon_glue.c | 34 ++++++++++----------
arch/arm64/crypto/sha2-ce-glue.c | 13 ++++----
arch/arm64/crypto/sha256-glue.c | 59 +++++++++++++++++++---------------
arch/x86/crypto/sha256_ssse3_glue.c | 46 ++++++++++++++++-----------
arch/x86/purgatory/purgatory.c | 2 +-
arch/x86/purgatory/sha256.c | 25 ++-------------
arch/x86/purgatory/sha256.h | 22 -------------
crypto/Kconfig | 8 +++++
crypto/Makefile | 2 +-
crypto/sha256_generic.c | 54 +++++++++++++++++++++++--------
include/crypto/sha.h | 33 ++++++++++++++++---
include/crypto/sha256_base.h | 40 +++++++----------------
include/linux/filter.h | 11 ++-----
include/uapi/linux/pkt_cls.h | 2 +-
include/uapi/linux/tc_act/tc_bpf.h | 2 +-
init/Kconfig | 1 +
kernel/bpf/core.c | 63 +++++++++----------------------------
kernel/bpf/syscall.c | 2 +-
net/sched/act_bpf.c | 2 +-
net/sched/cls_bpf.c | 2 +-
22 files changed, 225 insertions(+), 231 deletions(-)
delete mode 100644 arch/x86/purgatory/sha256.h
--
2.9.3
This will let other kernel code call into sha256_init(), etc. without
pulling in the core crypto code.
Signed-off-by: Andy Lutomirski <[email protected]>
---
crypto/Kconfig | 8 ++++++++
crypto/Makefile | 2 +-
crypto/sha256_generic.c | 4 ++++
include/crypto/sha.h | 4 ++++
4 files changed, 17 insertions(+), 1 deletion(-)
diff --git a/crypto/Kconfig b/crypto/Kconfig
index 160f08e721cc..85a2b3440c2b 100644
--- a/crypto/Kconfig
+++ b/crypto/Kconfig
@@ -10,6 +10,13 @@ config XOR_BLOCKS
source "crypto/async_tx/Kconfig"
#
+# Cryptographic algorithms that are usable without the Crypto API.
+# None of these should have visible config options.
+#
+config CRYPTO_SHA256_LIB
+ bool
+
+#
# Cryptographic API Configuration
#
menuconfig CRYPTO
@@ -763,6 +770,7 @@ config CRYPTO_SHA512_MB
config CRYPTO_SHA256
tristate "SHA224 and SHA256 digest algorithm"
+ select CRYPTO_SHA256_LIB
select CRYPTO_HASH
help
SHA256 secure hash standard (DFIPS 180-2).
diff --git a/crypto/Makefile b/crypto/Makefile
index b8f0e3eb0791..d147d4c911f5 100644
--- a/crypto/Makefile
+++ b/crypto/Makefile
@@ -71,7 +71,7 @@ obj-$(CONFIG_CRYPTO_RMD160) += rmd160.o
obj-$(CONFIG_CRYPTO_RMD256) += rmd256.o
obj-$(CONFIG_CRYPTO_RMD320) += rmd320.o
obj-$(CONFIG_CRYPTO_SHA1) += sha1_generic.o
-obj-$(CONFIG_CRYPTO_SHA256) += sha256_generic.o
+obj-$(CONFIG_CRYPTO_SHA256_LIB) += sha256_generic.o
obj-$(CONFIG_CRYPTO_SHA512) += sha512_generic.o
obj-$(CONFIG_CRYPTO_SHA3) += sha3_generic.o
obj-$(CONFIG_CRYPTO_WP512) += wp512.o
diff --git a/crypto/sha256_generic.c b/crypto/sha256_generic.c
index f2747893402c..9df71ac66dc4 100644
--- a/crypto/sha256_generic.c
+++ b/crypto/sha256_generic.c
@@ -261,6 +261,8 @@ void sha256_final(struct sha256_state *sctx, u8 *out)
}
EXPORT_SYMBOL(sha256_final);
+#ifdef CONFIG_CRYPTO_HASH
+
static int crypto_sha256_update(struct shash_desc *desc, const u8 *data,
unsigned int len)
{
@@ -328,6 +330,8 @@ static void __exit sha256_generic_mod_fini(void)
module_init(sha256_generic_mod_init);
module_exit(sha256_generic_mod_fini);
+#endif /* CONFIG_CRYPTO_HASH */
+
MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("SHA-224 and SHA-256 Secure Hash Algorithm");
diff --git a/include/crypto/sha.h b/include/crypto/sha.h
index 2b6978471605..381ba7fa5e3f 100644
--- a/include/crypto/sha.h
+++ b/include/crypto/sha.h
@@ -96,6 +96,8 @@ extern int crypto_sha1_update(struct shash_desc *desc, const u8 *data,
extern int crypto_sha1_finup(struct shash_desc *desc, const u8 *data,
unsigned int len, u8 *hash);
+#ifdef CONFIG_CRYPTO_SHA256_LIB
+
static inline void sha256_init(struct sha256_state *sctx)
{
sctx->state[0] = SHA256_H0;
@@ -121,6 +123,8 @@ static inline void sha256_finup(struct sha256_state *sctx, const u8 *data,
sha256_final(sctx, hash);
}
+#endif /* CONFIG_CRYPTO_SHA256_LIB */
+
extern int crypto_sha512_update(struct shash_desc *desc, const u8 *data,
unsigned int len);
--
2.9.3
This makes it easier to add another digest algorithm down the road if
needed. It also serves to force any programs that might have been
written against a kernel that had the old field name to notice the
change and make any necessary changes.
This shouldn't violate any stable API policies, as no released kernel
has ever had TCA*BPF_DIGEST.
Cc: Daniel Borkmann <[email protected]>
Cc: Alexei Starovoitov <[email protected]>
Signed-off-by: Andy Lutomirski <[email protected]>
---
include/uapi/linux/pkt_cls.h | 2 +-
include/uapi/linux/tc_act/tc_bpf.h | 2 +-
net/sched/act_bpf.c | 2 +-
net/sched/cls_bpf.c | 2 +-
4 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/include/uapi/linux/pkt_cls.h b/include/uapi/linux/pkt_cls.h
index cb4bcdc58543..ac6b300c1550 100644
--- a/include/uapi/linux/pkt_cls.h
+++ b/include/uapi/linux/pkt_cls.h
@@ -397,7 +397,7 @@ enum {
TCA_BPF_NAME,
TCA_BPF_FLAGS,
TCA_BPF_FLAGS_GEN,
- TCA_BPF_DIGEST,
+ TCA_BPF_SHA256,
__TCA_BPF_MAX,
};
diff --git a/include/uapi/linux/tc_act/tc_bpf.h b/include/uapi/linux/tc_act/tc_bpf.h
index a6b88a6f7f71..eae18a7430eb 100644
--- a/include/uapi/linux/tc_act/tc_bpf.h
+++ b/include/uapi/linux/tc_act/tc_bpf.h
@@ -27,7 +27,7 @@ enum {
TCA_ACT_BPF_FD,
TCA_ACT_BPF_NAME,
TCA_ACT_BPF_PAD,
- TCA_ACT_BPF_DIGEST,
+ TCA_ACT_BPF_SHA256,
__TCA_ACT_BPF_MAX,
};
#define TCA_ACT_BPF_MAX (__TCA_ACT_BPF_MAX - 1)
diff --git a/net/sched/act_bpf.c b/net/sched/act_bpf.c
index 1c60317f0121..3868a66d0b24 100644
--- a/net/sched/act_bpf.c
+++ b/net/sched/act_bpf.c
@@ -123,7 +123,7 @@ static int tcf_bpf_dump_ebpf_info(const struct tcf_bpf *prog,
nla_put_string(skb, TCA_ACT_BPF_NAME, prog->bpf_name))
return -EMSGSIZE;
- nla = nla_reserve(skb, TCA_ACT_BPF_DIGEST,
+ nla = nla_reserve(skb, TCA_ACT_BPF_SHA256,
sizeof(prog->filter->digest));
if (nla == NULL)
return -EMSGSIZE;
diff --git a/net/sched/cls_bpf.c b/net/sched/cls_bpf.c
index adc776048d1a..6fc110321621 100644
--- a/net/sched/cls_bpf.c
+++ b/net/sched/cls_bpf.c
@@ -555,7 +555,7 @@ static int cls_bpf_dump_ebpf_info(const struct cls_bpf_prog *prog,
nla_put_string(skb, TCA_BPF_NAME, prog->bpf_name))
return -EMSGSIZE;
- nla = nla_reserve(skb, TCA_BPF_DIGEST, sizeof(prog->filter->digest));
+ nla = nla_reserve(skb, TCA_BPF_SHA256, sizeof(prog->filter->digest));
if (nla == NULL)
return -EMSGSIZE;
--
2.9.3
This makes it easier to add another digest algorithm down the road
if needed. It also serves to force any programs that might have
been written against a kernel that had 'prog_digest' to be updated.
This shouldn't violate any stable API policies, as no released
kernel has ever had 'prog_digest'.
Cc: Daniel Borkmann <[email protected]>
Cc: Alexei Starovoitov <[email protected]>
Signed-off-by: Andy Lutomirski <[email protected]>
---
kernel/bpf/syscall.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index e89acea22ecf..956370b80296 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -694,7 +694,7 @@ static void bpf_prog_show_fdinfo(struct seq_file *m, struct file *filp)
seq_printf(m,
"prog_type:\t%u\n"
"prog_jited:\t%u\n"
- "prog_digest:\t%s\n"
+ "prog_sha256:\t%s\n"
"memlock:\t%llu\n",
prog->type,
prog->jited,
--
2.9.3
The sha256 helpers can consume a message incrementally, so there's no need
to allocate a buffer to store the whole blob to be hashed.
This may be a slight slowdown for very long messages because gcc can't
inline the sha256_update() calls. For reasonably-sized programs,
however, this should be a considerable speedup as vmalloc() is quite
slow.
Cc: Daniel Borkmann <[email protected]>
Cc: Alexei Starovoitov <[email protected]>
Signed-off-by: Andy Lutomirski <[email protected]>
---
kernel/bpf/core.c | 34 ++++++++++++++--------------------
1 file changed, 14 insertions(+), 20 deletions(-)
diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index 911993863799..1c2931f505af 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -149,43 +149,37 @@ void __bpf_prog_free(struct bpf_prog *fp)
int bpf_prog_calc_digest(struct bpf_prog *fp)
{
struct sha256_state sha;
- u32 i, psize;
- struct bpf_insn *dst;
+ u32 i;
bool was_ld_map;
- u8 *raw;
-
- psize = bpf_prog_insn_size(fp);
- raw = vmalloc(psize);
- if (!raw)
- return -ENOMEM;
sha256_init(&sha);
/* We need to take out the map fd for the digest calculation
* since they are unstable from user space side.
*/
- dst = (void *)raw;
for (i = 0, was_ld_map = false; i < fp->len; i++) {
- dst[i] = fp->insnsi[i];
+ struct bpf_insn insn = fp->insnsi[i];
+
if (!was_ld_map &&
- dst[i].code == (BPF_LD | BPF_IMM | BPF_DW) &&
- dst[i].src_reg == BPF_PSEUDO_MAP_FD) {
+ insn.code == (BPF_LD | BPF_IMM | BPF_DW) &&
+ insn.src_reg == BPF_PSEUDO_MAP_FD) {
was_ld_map = true;
- dst[i].imm = 0;
+ insn.imm = 0;
} else if (was_ld_map &&
- dst[i].code == 0 &&
- dst[i].dst_reg == 0 &&
- dst[i].src_reg == 0 &&
- dst[i].off == 0) {
+ insn.code == 0 &&
+ insn.dst_reg == 0 &&
+ insn.src_reg == 0 &&
+ insn.off == 0) {
was_ld_map = false;
- dst[i].imm = 0;
+ insn.imm = 0;
} else {
was_ld_map = false;
}
+
+ sha256_update(&sha, (const u8 *)&insn, sizeof(insn));
}
- sha256_finup(&sha, raw, psize, fp->digest);
- vfree(raw);
+ sha256_final(&sha, fp->digest);
return 0;
}
--
2.9.3
There are some pieecs of kernel code that want to compute SHA256
directly without going through the crypto core. Adjust the exported
API to decouple it from the crypto core.
I suspect this will very slightly speed up the SHA256 shash operations
as well by reducing the amount of indirection involved.
Cc: Ard Biesheuvel <[email protected]>
Cc: Herbert Xu <[email protected]>
Signed-off-by: Andy Lutomirski <[email protected]>
---
arch/arm/crypto/sha2-ce-glue.c | 10 ++++---
arch/arm/crypto/sha256_glue.c | 23 ++++++++++-----
arch/arm/crypto/sha256_neon_glue.c | 34 +++++++++++----------
arch/arm64/crypto/sha2-ce-glue.c | 13 ++++----
arch/arm64/crypto/sha256-glue.c | 59 +++++++++++++++++++++----------------
arch/x86/crypto/sha256_ssse3_glue.c | 46 +++++++++++++++++------------
arch/x86/purgatory/purgatory.c | 2 +-
arch/x86/purgatory/sha256.c | 25 ++--------------
arch/x86/purgatory/sha256.h | 22 --------------
crypto/sha256_generic.c | 50 +++++++++++++++++++++++--------
include/crypto/sha.h | 29 ++++++++++++++----
include/crypto/sha256_base.h | 40 ++++++++-----------------
12 files changed, 184 insertions(+), 169 deletions(-)
delete mode 100644 arch/x86/purgatory/sha256.h
diff --git a/arch/arm/crypto/sha2-ce-glue.c b/arch/arm/crypto/sha2-ce-glue.c
index 0755b2d657f3..8832c2f85591 100644
--- a/arch/arm/crypto/sha2-ce-glue.c
+++ b/arch/arm/crypto/sha2-ce-glue.c
@@ -38,7 +38,7 @@ static int sha2_ce_update(struct shash_desc *desc, const u8 *data,
return crypto_sha256_arm_update(desc, data, len);
kernel_neon_begin();
- sha256_base_do_update(desc, data, len,
+ sha256_base_do_update(sctx, data, len,
(sha256_block_fn *)sha2_ce_transform);
kernel_neon_end();
@@ -48,17 +48,19 @@ static int sha2_ce_update(struct shash_desc *desc, const u8 *data,
static int sha2_ce_finup(struct shash_desc *desc, const u8 *data,
unsigned int len, u8 *out)
{
+ struct sha256_state *sctx = shash_desc_ctx(desc);
+
if (!may_use_simd())
return crypto_sha256_arm_finup(desc, data, len, out);
kernel_neon_begin();
if (len)
- sha256_base_do_update(desc, data, len,
+ sha256_base_do_update(sctx, data, len,
(sha256_block_fn *)sha2_ce_transform);
- sha256_base_do_finalize(desc, (sha256_block_fn *)sha2_ce_transform);
+ sha256_base_do_finalize(sctx, (sha256_block_fn *)sha2_ce_transform);
kernel_neon_end();
- return sha256_base_finish(desc, out);
+ return crypto_sha2_final(desc, out);
}
static int sha2_ce_final(struct shash_desc *desc, u8 *out)
diff --git a/arch/arm/crypto/sha256_glue.c b/arch/arm/crypto/sha256_glue.c
index a84e869ef900..405a29a9a9d3 100644
--- a/arch/arm/crypto/sha256_glue.c
+++ b/arch/arm/crypto/sha256_glue.c
@@ -36,27 +36,34 @@ asmlinkage void sha256_block_data_order(u32 *digest, const void *data,
int crypto_sha256_arm_update(struct shash_desc *desc, const u8 *data,
unsigned int len)
{
+ struct sha256_state *sctx = shash_desc_ctx(desc);
+
/* make sure casting to sha256_block_fn() is safe */
BUILD_BUG_ON(offsetof(struct sha256_state, state) != 0);
- return sha256_base_do_update(desc, data, len,
+ sha256_base_do_update(sctx, data, len,
(sha256_block_fn *)sha256_block_data_order);
+ return 0;
}
EXPORT_SYMBOL(crypto_sha256_arm_update);
-static int sha256_final(struct shash_desc *desc, u8 *out)
+static int sha256_arm_final(struct shash_desc *desc, u8 *out)
{
- sha256_base_do_finalize(desc,
+ struct sha256_state *sctx = shash_desc_ctx(desc);
+
+ sha256_base_do_finalize(sctx,
(sha256_block_fn *)sha256_block_data_order);
- return sha256_base_finish(desc, out);
+ return crypto_sha2_final(desc, out);
}
int crypto_sha256_arm_finup(struct shash_desc *desc, const u8 *data,
unsigned int len, u8 *out)
{
- sha256_base_do_update(desc, data, len,
+ struct sha256_state *sctx = shash_desc_ctx(desc);
+
+ sha256_base_do_update(sctx, data, len,
(sha256_block_fn *)sha256_block_data_order);
- return sha256_final(desc, out);
+ return crypto_sha2_final(desc, out);
}
EXPORT_SYMBOL(crypto_sha256_arm_finup);
@@ -64,7 +71,7 @@ static struct shash_alg algs[] = { {
.digestsize = SHA256_DIGEST_SIZE,
.init = sha256_base_init,
.update = crypto_sha256_arm_update,
- .final = sha256_final,
+ .final = sha256_arm_final,
.finup = crypto_sha256_arm_finup,
.descsize = sizeof(struct sha256_state),
.base = {
@@ -79,7 +86,7 @@ static struct shash_alg algs[] = { {
.digestsize = SHA224_DIGEST_SIZE,
.init = sha224_base_init,
.update = crypto_sha256_arm_update,
- .final = sha256_final,
+ .final = sha256_arm_final,
.finup = crypto_sha256_arm_finup,
.descsize = sizeof(struct sha256_state),
.base = {
diff --git a/arch/arm/crypto/sha256_neon_glue.c b/arch/arm/crypto/sha256_neon_glue.c
index 39ccd658817e..40c85d1d4c1e 100644
--- a/arch/arm/crypto/sha256_neon_glue.c
+++ b/arch/arm/crypto/sha256_neon_glue.c
@@ -29,8 +29,8 @@
asmlinkage void sha256_block_data_order_neon(u32 *digest, const void *data,
unsigned int num_blks);
-static int sha256_update(struct shash_desc *desc, const u8 *data,
- unsigned int len)
+static int sha256_neon_update(struct shash_desc *desc, const u8 *data,
+ unsigned int len)
{
struct sha256_state *sctx = shash_desc_ctx(desc);
@@ -39,41 +39,43 @@ static int sha256_update(struct shash_desc *desc, const u8 *data,
return crypto_sha256_arm_update(desc, data, len);
kernel_neon_begin();
- sha256_base_do_update(desc, data, len,
+ sha256_base_do_update(sctx, data, len,
(sha256_block_fn *)sha256_block_data_order_neon);
kernel_neon_end();
return 0;
}
-static int sha256_finup(struct shash_desc *desc, const u8 *data,
- unsigned int len, u8 *out)
+static int sha256_neon_finup(struct shash_desc *desc, const u8 *data,
+ unsigned int len, u8 *out)
{
+ struct sha256_state *sctx = shash_desc_ctx(desc);
+
if (!may_use_simd())
return crypto_sha256_arm_finup(desc, data, len, out);
kernel_neon_begin();
if (len)
- sha256_base_do_update(desc, data, len,
+ sha256_base_do_update(sctx, data, len,
(sha256_block_fn *)sha256_block_data_order_neon);
- sha256_base_do_finalize(desc,
+ sha256_base_do_finalize(sctx,
(sha256_block_fn *)sha256_block_data_order_neon);
kernel_neon_end();
- return sha256_base_finish(desc, out);
+ return crypto_sha2_final(desc, out);
}
-static int sha256_final(struct shash_desc *desc, u8 *out)
+static int sha256_neon_final(struct shash_desc *desc, u8 *out)
{
- return sha256_finup(desc, NULL, 0, out);
+ return sha256_neon_finup(desc, NULL, 0, out);
}
struct shash_alg sha256_neon_algs[] = { {
.digestsize = SHA256_DIGEST_SIZE,
.init = sha256_base_init,
- .update = sha256_update,
- .final = sha256_final,
- .finup = sha256_finup,
+ .update = sha256_neon_update,
+ .final = sha256_neon_final,
+ .finup = sha256_neon_finup,
.descsize = sizeof(struct sha256_state),
.base = {
.cra_name = "sha256",
@@ -86,9 +88,9 @@ struct shash_alg sha256_neon_algs[] = { {
}, {
.digestsize = SHA224_DIGEST_SIZE,
.init = sha224_base_init,
- .update = sha256_update,
- .final = sha256_final,
- .finup = sha256_finup,
+ .update = sha256_neon_update,
+ .final = sha256_neon_final,
+ .finup = sha256_neon_finup,
.descsize = sizeof(struct sha256_state),
.base = {
.cra_name = "sha224",
diff --git a/arch/arm64/crypto/sha2-ce-glue.c b/arch/arm64/crypto/sha2-ce-glue.c
index 7cd587564a41..e38dd301abce 100644
--- a/arch/arm64/crypto/sha2-ce-glue.c
+++ b/arch/arm64/crypto/sha2-ce-glue.c
@@ -39,7 +39,7 @@ static int sha256_ce_update(struct shash_desc *desc, const u8 *data,
sctx->finalize = 0;
kernel_neon_begin_partial(28);
- sha256_base_do_update(desc, data, len,
+ sha256_base_do_update(&sctx->sst, data, len,
(sha256_block_fn *)sha2_ce_transform);
kernel_neon_end();
@@ -64,13 +64,13 @@ static int sha256_ce_finup(struct shash_desc *desc, const u8 *data,
sctx->finalize = finalize;
kernel_neon_begin_partial(28);
- sha256_base_do_update(desc, data, len,
+ sha256_base_do_update(&sctx->sst, data, len,
(sha256_block_fn *)sha2_ce_transform);
if (!finalize)
- sha256_base_do_finalize(desc,
+ sha256_base_do_finalize(&sctx->sst,
(sha256_block_fn *)sha2_ce_transform);
kernel_neon_end();
- return sha256_base_finish(desc, out);
+ return crypto_sha2_final(desc, out);
}
static int sha256_ce_final(struct shash_desc *desc, u8 *out)
@@ -79,9 +79,10 @@ static int sha256_ce_final(struct shash_desc *desc, u8 *out)
sctx->finalize = 0;
kernel_neon_begin_partial(28);
- sha256_base_do_finalize(desc, (sha256_block_fn *)sha2_ce_transform);
+ sha256_base_do_finalize(&sctx->sst,
+ (sha256_block_fn *)sha2_ce_transform);
kernel_neon_end();
- return sha256_base_finish(desc, out);
+ return crypto_sha2_final(desc, out);
}
static struct shash_alg algs[] = { {
diff --git a/arch/arm64/crypto/sha256-glue.c b/arch/arm64/crypto/sha256-glue.c
index a2226f841960..132a1ef89a71 100644
--- a/arch/arm64/crypto/sha256-glue.c
+++ b/arch/arm64/crypto/sha256-glue.c
@@ -33,36 +33,39 @@ asmlinkage void sha256_block_data_order(u32 *digest, const void *data,
asmlinkage void sha256_block_neon(u32 *digest, const void *data,
unsigned int num_blks);
-static int sha256_update(struct shash_desc *desc, const u8 *data,
- unsigned int len)
+static int sha256_update_arm64(struct shash_desc *desc, const u8 *data,
+ unsigned int len)
{
- return sha256_base_do_update(desc, data, len,
- (sha256_block_fn *)sha256_block_data_order);
+ sha256_base_do_update(shash_desc_ctx(desc), data, len,
+ (sha256_block_fn *)sha256_block_data_order);
+ return 0;
}
-static int sha256_finup(struct shash_desc *desc, const u8 *data,
- unsigned int len, u8 *out)
+static int sha256_finup_arm64(struct shash_desc *desc, const u8 *data,
+ unsigned int len, u8 *out)
{
+ struct sha256_state *sctx = shash_desc_ctx(desc);
+
if (len)
- sha256_base_do_update(desc, data, len,
+ sha256_base_do_update(sctx, data, len,
(sha256_block_fn *)sha256_block_data_order);
- sha256_base_do_finalize(desc,
+ sha256_base_do_finalize(sctx,
(sha256_block_fn *)sha256_block_data_order);
- return sha256_base_finish(desc, out);
+ return crypto_sha2_final(desc, out);
}
-static int sha256_final(struct shash_desc *desc, u8 *out)
+static int sha256_final_arm64(struct shash_desc *desc, u8 *out)
{
- return sha256_finup(desc, NULL, 0, out);
+ return sha256_finup_arm64(desc, NULL, 0, out);
}
static struct shash_alg algs[] = { {
.digestsize = SHA256_DIGEST_SIZE,
.init = sha256_base_init,
- .update = sha256_update,
- .final = sha256_final,
- .finup = sha256_finup,
+ .update = sha256_update_arm64,
+ .final = sha256_final_arm64,
+ .finup = sha256_finup_arm64,
.descsize = sizeof(struct sha256_state),
.base.cra_name = "sha256",
.base.cra_driver_name = "sha256-arm64",
@@ -73,9 +76,9 @@ static struct shash_alg algs[] = { {
}, {
.digestsize = SHA224_DIGEST_SIZE,
.init = sha224_base_init,
- .update = sha256_update,
- .final = sha256_final,
- .finup = sha256_finup,
+ .update = sha256_update_arm64,
+ .final = sha256_final_arm64,
+ .finup = sha256_finup_arm64,
.descsize = sizeof(struct sha256_state),
.base.cra_name = "sha224",
.base.cra_driver_name = "sha224-arm64",
@@ -88,18 +91,22 @@ static struct shash_alg algs[] = { {
static int sha256_update_neon(struct shash_desc *desc, const u8 *data,
unsigned int len)
{
+ struct sha256_state *sctx = shash_desc_ctx(desc);
+
/*
* Stacking and unstacking a substantial slice of the NEON register
* file may significantly affect performance for small updates when
* executing in interrupt context, so fall back to the scalar code
* in that case.
*/
- if (!may_use_simd())
- return sha256_base_do_update(desc, data, len,
+ if (!may_use_simd()) {
+ sha256_base_do_update(sctx, data, len,
(sha256_block_fn *)sha256_block_data_order);
+ return 0;
+ }
kernel_neon_begin();
- sha256_base_do_update(desc, data, len,
+ sha256_base_do_update(sctx, data, len,
(sha256_block_fn *)sha256_block_neon);
kernel_neon_end();
@@ -109,22 +116,24 @@ static int sha256_update_neon(struct shash_desc *desc, const u8 *data,
static int sha256_finup_neon(struct shash_desc *desc, const u8 *data,
unsigned int len, u8 *out)
{
+ struct sha256_state *sctx = shash_desc_ctx(desc);
+
if (!may_use_simd()) {
if (len)
- sha256_base_do_update(desc, data, len,
+ sha256_base_do_update(sctx, data, len,
(sha256_block_fn *)sha256_block_data_order);
- sha256_base_do_finalize(desc,
+ sha256_base_do_finalize(sctx,
(sha256_block_fn *)sha256_block_data_order);
} else {
kernel_neon_begin();
if (len)
- sha256_base_do_update(desc, data, len,
+ sha256_base_do_update(sctx, data, len,
(sha256_block_fn *)sha256_block_neon);
- sha256_base_do_finalize(desc,
+ sha256_base_do_finalize(sctx,
(sha256_block_fn *)sha256_block_neon);
kernel_neon_end();
}
- return sha256_base_finish(desc, out);
+ return crypto_sha2_final(desc, out);
}
static int sha256_final_neon(struct shash_desc *desc, u8 *out)
diff --git a/arch/x86/crypto/sha256_ssse3_glue.c b/arch/x86/crypto/sha256_ssse3_glue.c
index 9e79baf03a4b..e722fbaf0558 100644
--- a/arch/x86/crypto/sha256_ssse3_glue.c
+++ b/arch/x86/crypto/sha256_ssse3_glue.c
@@ -44,52 +44,60 @@ asmlinkage void sha256_transform_ssse3(u32 *digest, const char *data,
u64 rounds);
typedef void (sha256_transform_fn)(u32 *digest, const char *data, u64 rounds);
-static int sha256_update(struct shash_desc *desc, const u8 *data,
- unsigned int len, sha256_transform_fn *sha256_xform)
+static int sha256_fpu_update(struct shash_desc *desc, const u8 *data,
+ unsigned int len,
+ sha256_transform_fn *sha256_xform)
{
struct sha256_state *sctx = shash_desc_ctx(desc);
if (!irq_fpu_usable() ||
- (sctx->count % SHA256_BLOCK_SIZE) + len < SHA256_BLOCK_SIZE)
- return crypto_sha256_update(desc, data, len);
+ (sctx->count % SHA256_BLOCK_SIZE) + len < SHA256_BLOCK_SIZE) {
+ sha256_update(sctx, data, len);
+ return 0;
+ }
/* make sure casting to sha256_block_fn() is safe */
BUILD_BUG_ON(offsetof(struct sha256_state, state) != 0);
kernel_fpu_begin();
- sha256_base_do_update(desc, data, len,
+ sha256_base_do_update(sctx, data, len,
(sha256_block_fn *)sha256_xform);
kernel_fpu_end();
return 0;
}
-static int sha256_finup(struct shash_desc *desc, const u8 *data,
+static int sha256_fpu_finup(struct shash_desc *desc, const u8 *data,
unsigned int len, u8 *out, sha256_transform_fn *sha256_xform)
{
- if (!irq_fpu_usable())
- return crypto_sha256_finup(desc, data, len, out);
+ struct sha256_state *sctx = shash_desc_ctx(desc);
+
+ if (!irq_fpu_usable()) {
+ sha256_finup(sctx, data, len, out);
+ return 0;
+ }
kernel_fpu_begin();
if (len)
- sha256_base_do_update(desc, data, len,
+ sha256_base_do_update(sctx, data, len,
(sha256_block_fn *)sha256_xform);
- sha256_base_do_finalize(desc, (sha256_block_fn *)sha256_xform);
+ sha256_base_do_finalize(sctx, (sha256_block_fn *)sha256_xform);
kernel_fpu_end();
- return sha256_base_finish(desc, out);
+ crypto_sha2_final(desc, out);
+ return 0;
}
static int sha256_ssse3_update(struct shash_desc *desc, const u8 *data,
unsigned int len)
{
- return sha256_update(desc, data, len, sha256_transform_ssse3);
+ return sha256_fpu_update(desc, data, len, sha256_transform_ssse3);
}
static int sha256_ssse3_finup(struct shash_desc *desc, const u8 *data,
unsigned int len, u8 *out)
{
- return sha256_finup(desc, data, len, out, sha256_transform_ssse3);
+ return sha256_fpu_finup(desc, data, len, out, sha256_transform_ssse3);
}
/* Add padding and return the message digest. */
@@ -152,13 +160,13 @@ asmlinkage void sha256_transform_avx(u32 *digest, const char *data,
static int sha256_avx_update(struct shash_desc *desc, const u8 *data,
unsigned int len)
{
- return sha256_update(desc, data, len, sha256_transform_avx);
+ return sha256_fpu_update(desc, data, len, sha256_transform_avx);
}
static int sha256_avx_finup(struct shash_desc *desc, const u8 *data,
unsigned int len, u8 *out)
{
- return sha256_finup(desc, data, len, out, sha256_transform_avx);
+ return sha256_fpu_finup(desc, data, len, out, sha256_transform_avx);
}
static int sha256_avx_final(struct shash_desc *desc, u8 *out)
@@ -236,13 +244,13 @@ asmlinkage void sha256_transform_rorx(u32 *digest, const char *data,
static int sha256_avx2_update(struct shash_desc *desc, const u8 *data,
unsigned int len)
{
- return sha256_update(desc, data, len, sha256_transform_rorx);
+ return sha256_fpu_update(desc, data, len, sha256_transform_rorx);
}
static int sha256_avx2_finup(struct shash_desc *desc, const u8 *data,
unsigned int len, u8 *out)
{
- return sha256_finup(desc, data, len, out, sha256_transform_rorx);
+ return sha256_fpu_finup(desc, data, len, out, sha256_transform_rorx);
}
static int sha256_avx2_final(struct shash_desc *desc, u8 *out)
@@ -318,13 +326,13 @@ asmlinkage void sha256_ni_transform(u32 *digest, const char *data,
static int sha256_ni_update(struct shash_desc *desc, const u8 *data,
unsigned int len)
{
- return sha256_update(desc, data, len, sha256_ni_transform);
+ return sha256_fpu_update(desc, data, len, sha256_ni_transform);
}
static int sha256_ni_finup(struct shash_desc *desc, const u8 *data,
unsigned int len, u8 *out)
{
- return sha256_finup(desc, data, len, out, sha256_ni_transform);
+ return sha256_fpu_finup(desc, data, len, out, sha256_ni_transform);
}
static int sha256_ni_final(struct shash_desc *desc, u8 *out)
diff --git a/arch/x86/purgatory/purgatory.c b/arch/x86/purgatory/purgatory.c
index 25e068ba3382..ed6e80b844cf 100644
--- a/arch/x86/purgatory/purgatory.c
+++ b/arch/x86/purgatory/purgatory.c
@@ -10,7 +10,7 @@
* Version 2. See the file COPYING for more details.
*/
-#include "sha256.h"
+#include <crypto/sha.h>
#include "../boot/string.h"
struct sha_region {
diff --git a/arch/x86/purgatory/sha256.c b/arch/x86/purgatory/sha256.c
index 548ca675a14a..724925d5da61 100644
--- a/arch/x86/purgatory/sha256.c
+++ b/arch/x86/purgatory/sha256.c
@@ -17,7 +17,7 @@
#include <linux/bitops.h>
#include <asm/byteorder.h>
-#include "sha256.h"
+#include <crypto/sha.h>
#include "../boot/string.h"
static inline u32 Ch(u32 x, u32 y, u32 z)
@@ -208,22 +208,7 @@ static void sha256_transform(u32 *state, const u8 *input)
memset(W, 0, 64 * sizeof(u32));
}
-int sha256_init(struct sha256_state *sctx)
-{
- sctx->state[0] = SHA256_H0;
- sctx->state[1] = SHA256_H1;
- sctx->state[2] = SHA256_H2;
- sctx->state[3] = SHA256_H3;
- sctx->state[4] = SHA256_H4;
- sctx->state[5] = SHA256_H5;
- sctx->state[6] = SHA256_H6;
- sctx->state[7] = SHA256_H7;
- sctx->count = 0;
-
- return 0;
-}
-
-int sha256_update(struct sha256_state *sctx, const u8 *data, unsigned int len)
+void sha256_update(struct sha256_state *sctx, const u8 *data, unsigned int len)
{
unsigned int partial, done;
const u8 *src;
@@ -249,11 +234,9 @@ int sha256_update(struct sha256_state *sctx, const u8 *data, unsigned int len)
partial = 0;
}
memcpy(sctx->buf + partial, src, len - done);
-
- return 0;
}
-int sha256_final(struct sha256_state *sctx, u8 *out)
+void sha256_final(struct sha256_state *sctx, u8 *out)
{
__be32 *dst = (__be32 *)out;
__be64 bits;
@@ -278,6 +261,4 @@ int sha256_final(struct sha256_state *sctx, u8 *out)
/* Zeroize sensitive information. */
memset(sctx, 0, sizeof(*sctx));
-
- return 0;
}
diff --git a/arch/x86/purgatory/sha256.h b/arch/x86/purgatory/sha256.h
deleted file mode 100644
index bd15a4127735..000000000000
--- a/arch/x86/purgatory/sha256.h
+++ /dev/null
@@ -1,22 +0,0 @@
-/*
- * Copyright (C) 2014 Red Hat Inc.
- *
- * Author: Vivek Goyal <[email protected]>
- *
- * This source code is licensed under the GNU General Public License,
- * Version 2. See the file COPYING for more details.
- */
-
-#ifndef SHA256_H
-#define SHA256_H
-
-
-#include <linux/types.h>
-#include <crypto/sha.h>
-
-extern int sha256_init(struct sha256_state *sctx);
-extern int sha256_update(struct sha256_state *sctx, const u8 *input,
- unsigned int length);
-extern int sha256_final(struct sha256_state *sctx, u8 *hash);
-
-#endif /* SHA256_H */
diff --git a/crypto/sha256_generic.c b/crypto/sha256_generic.c
index 8f9c47e1a96e..f2747893402c 100644
--- a/crypto/sha256_generic.c
+++ b/crypto/sha256_generic.c
@@ -231,6 +231,13 @@ static void sha256_transform(u32 *state, const u8 *input)
memzero_explicit(W, 64 * sizeof(u32));
}
+int sha256_base_init(struct shash_desc *desc)
+{
+ sha256_init(shash_desc_ctx(desc));
+ return 0;
+}
+EXPORT_SYMBOL(sha256_base_init);
+
static void sha256_generic_block_fn(struct sha256_state *sst, u8 const *src,
int blocks)
{
@@ -240,32 +247,49 @@ static void sha256_generic_block_fn(struct sha256_state *sst, u8 const *src,
}
}
-int crypto_sha256_update(struct shash_desc *desc, const u8 *data,
+void sha256_update(struct sha256_state *sctx, const u8 *data,
unsigned int len)
{
- return sha256_base_do_update(desc, data, len, sha256_generic_block_fn);
+ sha256_base_do_update(sctx, data, len, sha256_generic_block_fn);
+}
+EXPORT_SYMBOL(sha256_update);
+
+void sha256_final(struct sha256_state *sctx, u8 *out)
+{
+ sha256_base_do_finalize(sctx, sha256_generic_block_fn);
+ sha256_base_finish(sctx, out);
}
-EXPORT_SYMBOL(crypto_sha256_update);
+EXPORT_SYMBOL(sha256_final);
-static int sha256_final(struct shash_desc *desc, u8 *out)
+static int crypto_sha256_update(struct shash_desc *desc, const u8 *data,
+ unsigned int len)
{
- sha256_base_do_finalize(desc, sha256_generic_block_fn);
- return sha256_base_finish(desc, out);
+ sha256_update(shash_desc_ctx(desc), data, len);
+ return 0;
+}
+
+int crypto_sha2_final(struct shash_desc *desc, u8 *out)
+{
+ struct sha256_state *sctx = shash_desc_ctx(desc);
+
+ sha256_base_do_finalize(sctx, sha256_generic_block_fn);
+ sha2_base_finish(sctx, crypto_shash_digestsize(desc->tfm), out);
+ return 0;
}
+EXPORT_SYMBOL(crypto_sha2_final);
-int crypto_sha256_finup(struct shash_desc *desc, const u8 *data,
- unsigned int len, u8 *hash)
+static int crypto_sha256_finup(struct shash_desc *desc, const u8 *data,
+ unsigned int len, u8 *hash)
{
- sha256_base_do_update(desc, data, len, sha256_generic_block_fn);
- return sha256_final(desc, hash);
+ sha256_finup(shash_desc_ctx(desc), data, len, hash);
+ return 0;
}
-EXPORT_SYMBOL(crypto_sha256_finup);
static struct shash_alg sha256_algs[2] = { {
.digestsize = SHA256_DIGEST_SIZE,
.init = sha256_base_init,
.update = crypto_sha256_update,
- .final = sha256_final,
+ .final = crypto_sha2_final,
.finup = crypto_sha256_finup,
.descsize = sizeof(struct sha256_state),
.base = {
@@ -279,7 +303,7 @@ static struct shash_alg sha256_algs[2] = { {
.digestsize = SHA224_DIGEST_SIZE,
.init = sha224_base_init,
.update = crypto_sha256_update,
- .final = sha256_final,
+ .final = crypto_sha2_final,
.finup = crypto_sha256_finup,
.descsize = sizeof(struct sha256_state),
.base = {
diff --git a/include/crypto/sha.h b/include/crypto/sha.h
index c94d3eb1cefd..2b6978471605 100644
--- a/include/crypto/sha.h
+++ b/include/crypto/sha.h
@@ -96,11 +96,30 @@ extern int crypto_sha1_update(struct shash_desc *desc, const u8 *data,
extern int crypto_sha1_finup(struct shash_desc *desc, const u8 *data,
unsigned int len, u8 *hash);
-extern int crypto_sha256_update(struct shash_desc *desc, const u8 *data,
- unsigned int len);
-
-extern int crypto_sha256_finup(struct shash_desc *desc, const u8 *data,
- unsigned int len, u8 *hash);
+static inline void sha256_init(struct sha256_state *sctx)
+{
+ sctx->state[0] = SHA256_H0;
+ sctx->state[1] = SHA256_H1;
+ sctx->state[2] = SHA256_H2;
+ sctx->state[3] = SHA256_H3;
+ sctx->state[4] = SHA256_H4;
+ sctx->state[5] = SHA256_H5;
+ sctx->state[6] = SHA256_H6;
+ sctx->state[7] = SHA256_H7;
+ sctx->count = 0;
+}
+
+extern void sha256_update(struct sha256_state *sctx, const u8 *data,
+ unsigned int len);
+
+extern void sha256_final(struct sha256_state *sctx, u8 *out);
+
+static inline void sha256_finup(struct sha256_state *sctx, const u8 *data,
+ unsigned int len, u8 *hash)
+{
+ sha256_update(sctx, data, len);
+ sha256_final(sctx, hash);
+}
extern int crypto_sha512_update(struct shash_desc *desc, const u8 *data,
unsigned int len);
diff --git a/include/crypto/sha256_base.h b/include/crypto/sha256_base.h
index d1f2195bb7de..f65d9a516b36 100644
--- a/include/crypto/sha256_base.h
+++ b/include/crypto/sha256_base.h
@@ -35,29 +35,13 @@ static inline int sha224_base_init(struct shash_desc *desc)
return 0;
}
-static inline int sha256_base_init(struct shash_desc *desc)
-{
- struct sha256_state *sctx = shash_desc_ctx(desc);
-
- sctx->state[0] = SHA256_H0;
- sctx->state[1] = SHA256_H1;
- sctx->state[2] = SHA256_H2;
- sctx->state[3] = SHA256_H3;
- sctx->state[4] = SHA256_H4;
- sctx->state[5] = SHA256_H5;
- sctx->state[6] = SHA256_H6;
- sctx->state[7] = SHA256_H7;
- sctx->count = 0;
-
- return 0;
-}
+extern int sha256_base_init(struct shash_desc *desc);
-static inline int sha256_base_do_update(struct shash_desc *desc,
+static inline void sha256_base_do_update(struct sha256_state *sctx,
const u8 *data,
unsigned int len,
sha256_block_fn *block_fn)
{
- struct sha256_state *sctx = shash_desc_ctx(desc);
unsigned int partial = sctx->count % SHA256_BLOCK_SIZE;
sctx->count += len;
@@ -86,15 +70,12 @@ static inline int sha256_base_do_update(struct shash_desc *desc,
}
if (len)
memcpy(sctx->buf + partial, data, len);
-
- return 0;
}
-static inline int sha256_base_do_finalize(struct shash_desc *desc,
+static inline void sha256_base_do_finalize(struct sha256_state *sctx,
sha256_block_fn *block_fn)
{
const int bit_offset = SHA256_BLOCK_SIZE - sizeof(__be64);
- struct sha256_state *sctx = shash_desc_ctx(desc);
__be64 *bits = (__be64 *)(sctx->buf + bit_offset);
unsigned int partial = sctx->count % SHA256_BLOCK_SIZE;
@@ -109,14 +90,11 @@ static inline int sha256_base_do_finalize(struct shash_desc *desc,
memset(sctx->buf + partial, 0x0, bit_offset - partial);
*bits = cpu_to_be64(sctx->count << 3);
block_fn(sctx, sctx->buf, 1);
-
- return 0;
}
-static inline int sha256_base_finish(struct shash_desc *desc, u8 *out)
+static inline void sha2_base_finish(struct sha256_state *sctx,
+ unsigned int digest_size, u8 *out)
{
- unsigned int digest_size = crypto_shash_digestsize(desc->tfm);
- struct sha256_state *sctx = shash_desc_ctx(desc);
__be32 *digest = (__be32 *)out;
int i;
@@ -124,5 +102,11 @@ static inline int sha256_base_finish(struct shash_desc *desc, u8 *out)
put_unaligned_be32(sctx->state[i], digest++);
*sctx = (struct sha256_state){};
- return 0;
}
+
+static inline void sha256_base_finish(struct sha256_state *sctx, u8 *out)
+{
+ sha2_base_finish(sctx, SHA256_DIGEST_SIZE, out);
+}
+
+extern int crypto_sha2_final(struct shash_desc *desc, u8 *out);
--
2.9.3
BPF digests are intended to be used to avoid reloading programs that
are already loaded. For use cases (CRIU?) where untrusted programs
are involved, intentional hash collisions could cause the wrong BPF
program to execute. Additionally, if BPF digests are ever used
in-kernel to skip verification, a hash collision could give privilege
escalation directly.
SHA1 is no longer considered adequately collision-resistant (see, for
example, all the major browsers dropping support for SHA1
certificates). Use SHA256 instead.
I moved the digest field to keep all of the bpf program metadata in
the same cache line.
Cc: Daniel Borkmann <[email protected]>
Cc: Alexei Starovoitov <[email protected]>
Signed-off-by: Andy Lutomirski <[email protected]>
---
include/linux/filter.h | 11 +++--------
init/Kconfig | 1 +
kernel/bpf/core.c | 41 +++++++----------------------------------
3 files changed, 11 insertions(+), 42 deletions(-)
diff --git a/include/linux/filter.h b/include/linux/filter.h
index 702314253797..23df2574e30c 100644
--- a/include/linux/filter.h
+++ b/include/linux/filter.h
@@ -14,7 +14,8 @@
#include <linux/workqueue.h>
#include <linux/sched.h>
#include <linux/capability.h>
-#include <linux/cryptohash.h>
+
+#include <crypto/sha.h>
#include <net/sch_generic.h>
@@ -408,11 +409,11 @@ struct bpf_prog {
kmemcheck_bitfield_end(meta);
enum bpf_prog_type type; /* Type of BPF program */
u32 len; /* Number of filter blocks */
- u32 digest[SHA_DIGEST_WORDS]; /* Program digest */
struct bpf_prog_aux *aux; /* Auxiliary fields */
struct sock_fprog_kern *orig_prog; /* Original BPF program */
unsigned int (*bpf_func)(const void *ctx,
const struct bpf_insn *insn);
+ u8 digest[SHA256_DIGEST_SIZE]; /* Program digest */
/* Instructions for interpreter */
union {
struct sock_filter insns[0];
@@ -519,12 +520,6 @@ static inline u32 bpf_prog_insn_size(const struct bpf_prog *prog)
return prog->len * sizeof(struct bpf_insn);
}
-static inline u32 bpf_prog_digest_scratch_size(const struct bpf_prog *prog)
-{
- return round_up(bpf_prog_insn_size(prog) +
- sizeof(__be64) + 1, SHA_MESSAGE_BYTES);
-}
-
static inline unsigned int bpf_prog_size(unsigned int proglen)
{
return max(sizeof(struct bpf_prog),
diff --git a/init/Kconfig b/init/Kconfig
index 223b734abccd..5a4e2d99cc38 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1634,6 +1634,7 @@ config BPF_SYSCALL
bool "Enable bpf() system call"
select ANON_INODES
select BPF
+ select CRYPTO_SHA256_LIB
default n
help
Enable the bpf() system call that allows to manipulate eBPF
diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index 1eb4f1303756..911993863799 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -148,22 +148,18 @@ void __bpf_prog_free(struct bpf_prog *fp)
int bpf_prog_calc_digest(struct bpf_prog *fp)
{
- const u32 bits_offset = SHA_MESSAGE_BYTES - sizeof(__be64);
- u32 raw_size = bpf_prog_digest_scratch_size(fp);
- u32 ws[SHA_WORKSPACE_WORDS];
- u32 i, bsize, psize, blocks;
+ struct sha256_state sha;
+ u32 i, psize;
struct bpf_insn *dst;
bool was_ld_map;
- u8 *raw, *todo;
- __be32 *result;
- __be64 *bits;
+ u8 *raw;
- raw = vmalloc(raw_size);
+ psize = bpf_prog_insn_size(fp);
+ raw = vmalloc(psize);
if (!raw)
return -ENOMEM;
- sha_init(fp->digest);
- memset(ws, 0, sizeof(ws));
+ sha256_init(&sha);
/* We need to take out the map fd for the digest calculation
* since they are unstable from user space side.
@@ -188,30 +184,7 @@ int bpf_prog_calc_digest(struct bpf_prog *fp)
}
}
- psize = bpf_prog_insn_size(fp);
- memset(&raw[psize], 0, raw_size - psize);
- raw[psize++] = 0x80;
-
- bsize = round_up(psize, SHA_MESSAGE_BYTES);
- blocks = bsize / SHA_MESSAGE_BYTES;
- todo = raw;
- if (bsize - psize >= sizeof(__be64)) {
- bits = (__be64 *)(todo + bsize - sizeof(__be64));
- } else {
- bits = (__be64 *)(todo + bsize + bits_offset);
- blocks++;
- }
- *bits = cpu_to_be64((psize - 1) << 3);
-
- while (blocks--) {
- sha_transform(fp->digest, todo, ws);
- todo += SHA_MESSAGE_BYTES;
- }
-
- result = (__force __be32 *)fp->digest;
- for (i = 0; i < SHA_DIGEST_WORDS; i++)
- result[i] = cpu_to_be32(fp->digest[i]);
-
+ sha256_finup(&sha, raw, psize, fp->digest);
vfree(raw);
return 0;
}
--
2.9.3
On Fri, Dec 23, 2016 at 6:22 PM, Andy Lutomirski <[email protected]> wrote:
> There are some pieecs of kernel code that want to compute SHA256
> directly without going through the crypto core. Adjust the exported
> API to decouple it from the crypto core.
>
> I suspect this will very slightly speed up the SHA256 shash operations
> as well by reducing the amount of indirection involved.
>
I should also mention: there's a nice potential cleanup that's
possible on top of this. Currently, most of the accelerated SHA256
implementations just swap out the block function. Another approach to
enabling this would be to restructure sha256_update along the lines
of:
sha256_block_fn_t fn = arch_sha256_block_fn(len);
sha256_base_do_update(sctx, data, len, arch_sha256_block_fn(len));
The idea being that arch code can decide whether to use an accelerated
block function based on context (x86, for example, can't always use
xmm regs) and length (on x86, using the accelerated versions for short
digests is very slow due to the state save/restore that happens) and
then the core code can just use it.
This would allow a lot of the boilerplate that this patch was forced
to modify to be deleted outright.
--Andy
Hi Andy,
On 24 December 2016 at 02:22, Andy Lutomirski <[email protected]> wrote:
> There are some pieecs of kernel code that want to compute SHA256
> directly without going through the crypto core. Adjust the exported
> API to decouple it from the crypto core.
>
There are a bunch of things happening at the same time in this patch,
i.e., unnecessary renames of functions with static linkage, return
type changes to the base prototypes (int (*)(...) to void (*)(...))
and the change for the base functions to take a struct sha256_state
ctx rather than a shash_desc. I suppose you are mainly after the
latter, so could we please drop the other changes?
For the name clashes, could we simply use the crypto_ prefix for the
globally visible functions rather than using names that are already in
use? (and having to go around clean up the conflicts)
As for the return type changes, the base functions intentionally
return int to allow tail calls from the functions exposed by the
crypto API (whose prototypes cannot be changed). Unlikely to matter in
the grand scheme of things (especially now that the base layer
consists of static inline functions primarily), but it is equally
pointless to go around and change them to return void IMO.
So what remains is the struct shash_desc to struct sha256_state
change, which makes sense given that you are after a sha256_digest()
function that does not require the crypto API. But it seems your use
case does not rely on incremental hashing, and so there is no reason
for the state to be exposed outside of the implementation, and we
could simply expose a crypto_sha256_digest() routine from the
sha256_generic.c implementation instead.
Also, I strongly feel that crypto and other security related patches
should be tested before being posted, even if they are only RFC,
especially when they are posted by high profile kernel devs like
yourself. (Your code incorrectly calls crypto_sha2_final() in a couple
of places, resulting in the finalization being performed twice, once
with the accelerated block transform and again with the generic
transform)
> I suspect this will very slightly speed up the SHA256 shash operations
> as well by reducing the amount of indirection involved.
>
I think you have a valid point when it comes to the complexity of the
crypto API in general. But the struct sha256_state is embedded in the
shash_desc rather than referred to via a pointer, so the level of
indirection does not appear to change. And given how 99.9% of the
SHA256 execution time is spent in the block transform routine anyway,
I expect the performance delta to be in the noise tbh.
Finally, another thing to keep in mind is that the base layers of
SHA-1, SHA-256 and SHA-512 are intentionally structured in the same
way. If there is a need for a digest() entry point, I'd prefer to add
them for all flavours.
E.g, something like
"""
@@ -126,3 +128,23 @@ static inline int sha256_base_finish(struct
shash_desc *desc, u8 *out)
*sctx = (struct sha256_state){};
return 0;
}
+
+static inline int sha256_base_finish(struct shash_desc *desc, u8 *out)
+{
+ unsigned int digest_size = crypto_shash_digestsize(desc->tfm);
+ struct sha256_state *sctx = shash_desc_ctx(desc);
+
+ return __sha256_base_finish(sctx, out, digest_size);
+}
+
+static inline int sha256_base_digest(const u8 *data, unsigned int len, u8 *out,
+ sha256_block_fn *block_fn)
+{
+ struct sha256_state sctx;
+
+ __sha256_base_init(&sctx);
+ sha256_base_do_update(&sctx, data, len, block_fn);
+ sha256_base_do_finalize(&sctx, block_fn);
+
+ return __sha256_base_finish(&sctx, out, SHA256_DIGEST_SIZE);
+}
"""
(where __sha256_base_init() and __sha256_base_finish() are the
existing functions modified to take a struct sha256_state rather than
a shash_desc) should be sufficient to allow a generic
crypto_sha256_digest() to be composed that does not rely on the crypto
API.
Whether this still belongs under crypto or under lib/sha256.c as a
library function (allowing archs to override it) is open for debate.
If hashing BPF programs becomes a hot spot, we probably have bigger
problems.
Regards,
Ard.
P.S. I do take your point regarding the arch_sha256_block_transform()
proposed in your follow up email, but there are some details (SIMD,
availability of the instructions etc) that would make it only suitable
for the generic implementation anyway, and the base layer was already
a huge improvement compared to the open coded implementations of the
SHA boilerplate.
> Cc: Ard Biesheuvel <[email protected]>
> Cc: Herbert Xu <[email protected]>
> Signed-off-by: Andy Lutomirski <[email protected]>
> ---
> arch/arm/crypto/sha2-ce-glue.c | 10 ++++---
> arch/arm/crypto/sha256_glue.c | 23 ++++++++++-----
> arch/arm/crypto/sha256_neon_glue.c | 34 +++++++++++----------
> arch/arm64/crypto/sha2-ce-glue.c | 13 ++++----
> arch/arm64/crypto/sha256-glue.c | 59 +++++++++++++++++++++----------------
> arch/x86/crypto/sha256_ssse3_glue.c | 46 +++++++++++++++++------------
> arch/x86/purgatory/purgatory.c | 2 +-
> arch/x86/purgatory/sha256.c | 25 ++--------------
> arch/x86/purgatory/sha256.h | 22 --------------
> crypto/sha256_generic.c | 50 +++++++++++++++++++++++--------
> include/crypto/sha.h | 29 ++++++++++++++----
> include/crypto/sha256_base.h | 40 ++++++++-----------------
> 12 files changed, 184 insertions(+), 169 deletions(-)
> delete mode 100644 arch/x86/purgatory/sha256.h
>
> diff --git a/arch/arm/crypto/sha2-ce-glue.c b/arch/arm/crypto/sha2-ce-glue.c
> index 0755b2d657f3..8832c2f85591 100644
> --- a/arch/arm/crypto/sha2-ce-glue.c
> +++ b/arch/arm/crypto/sha2-ce-glue.c
> @@ -38,7 +38,7 @@ static int sha2_ce_update(struct shash_desc *desc, const u8 *data,
> return crypto_sha256_arm_update(desc, data, len);
>
> kernel_neon_begin();
> - sha256_base_do_update(desc, data, len,
> + sha256_base_do_update(sctx, data, len,
> (sha256_block_fn *)sha2_ce_transform);
> kernel_neon_end();
>
> @@ -48,17 +48,19 @@ static int sha2_ce_update(struct shash_desc *desc, const u8 *data,
> static int sha2_ce_finup(struct shash_desc *desc, const u8 *data,
> unsigned int len, u8 *out)
> {
> + struct sha256_state *sctx = shash_desc_ctx(desc);
> +
> if (!may_use_simd())
> return crypto_sha256_arm_finup(desc, data, len, out);
>
> kernel_neon_begin();
> if (len)
> - sha256_base_do_update(desc, data, len,
> + sha256_base_do_update(sctx, data, len,
> (sha256_block_fn *)sha2_ce_transform);
> - sha256_base_do_finalize(desc, (sha256_block_fn *)sha2_ce_transform);
> + sha256_base_do_finalize(sctx, (sha256_block_fn *)sha2_ce_transform);
> kernel_neon_end();
>
> - return sha256_base_finish(desc, out);
> + return crypto_sha2_final(desc, out);
> }
>
> static int sha2_ce_final(struct shash_desc *desc, u8 *out)
> diff --git a/arch/arm/crypto/sha256_glue.c b/arch/arm/crypto/sha256_glue.c
> index a84e869ef900..405a29a9a9d3 100644
> --- a/arch/arm/crypto/sha256_glue.c
> +++ b/arch/arm/crypto/sha256_glue.c
> @@ -36,27 +36,34 @@ asmlinkage void sha256_block_data_order(u32 *digest, const void *data,
> int crypto_sha256_arm_update(struct shash_desc *desc, const u8 *data,
> unsigned int len)
> {
> + struct sha256_state *sctx = shash_desc_ctx(desc);
> +
> /* make sure casting to sha256_block_fn() is safe */
> BUILD_BUG_ON(offsetof(struct sha256_state, state) != 0);
>
> - return sha256_base_do_update(desc, data, len,
> + sha256_base_do_update(sctx, data, len,
> (sha256_block_fn *)sha256_block_data_order);
> + return 0;
> }
> EXPORT_SYMBOL(crypto_sha256_arm_update);
>
> -static int sha256_final(struct shash_desc *desc, u8 *out)
> +static int sha256_arm_final(struct shash_desc *desc, u8 *out)
> {
> - sha256_base_do_finalize(desc,
> + struct sha256_state *sctx = shash_desc_ctx(desc);
> +
> + sha256_base_do_finalize(sctx,
> (sha256_block_fn *)sha256_block_data_order);
> - return sha256_base_finish(desc, out);
> + return crypto_sha2_final(desc, out);
This is wrong: your crypto_sha2_final also calls sha256_base_do_finalize()
> }
>
> int crypto_sha256_arm_finup(struct shash_desc *desc, const u8 *data,
> unsigned int len, u8 *out)
> {
> - sha256_base_do_update(desc, data, len,
> + struct sha256_state *sctx = shash_desc_ctx(desc);
> +
> + sha256_base_do_update(sctx, data, len,
> (sha256_block_fn *)sha256_block_data_order);
> - return sha256_final(desc, out);
> + return crypto_sha2_final(desc, out);
> }
> EXPORT_SYMBOL(crypto_sha256_arm_finup);
>
> @@ -64,7 +71,7 @@ static struct shash_alg algs[] = { {
> .digestsize = SHA256_DIGEST_SIZE,
> .init = sha256_base_init,
> .update = crypto_sha256_arm_update,
> - .final = sha256_final,
> + .final = sha256_arm_final,
> .finup = crypto_sha256_arm_finup,
> .descsize = sizeof(struct sha256_state),
> .base = {
> @@ -79,7 +86,7 @@ static struct shash_alg algs[] = { {
> .digestsize = SHA224_DIGEST_SIZE,
> .init = sha224_base_init,
> .update = crypto_sha256_arm_update,
> - .final = sha256_final,
> + .final = sha256_arm_final,
> .finup = crypto_sha256_arm_finup,
> .descsize = sizeof(struct sha256_state),
> .base = {
> diff --git a/arch/arm/crypto/sha256_neon_glue.c b/arch/arm/crypto/sha256_neon_glue.c
> index 39ccd658817e..40c85d1d4c1e 100644
> --- a/arch/arm/crypto/sha256_neon_glue.c
> +++ b/arch/arm/crypto/sha256_neon_glue.c
> @@ -29,8 +29,8 @@
> asmlinkage void sha256_block_data_order_neon(u32 *digest, const void *data,
> unsigned int num_blks);
>
> -static int sha256_update(struct shash_desc *desc, const u8 *data,
> - unsigned int len)
> +static int sha256_neon_update(struct shash_desc *desc, const u8 *data,
> + unsigned int len)
> {
> struct sha256_state *sctx = shash_desc_ctx(desc);
>
> @@ -39,41 +39,43 @@ static int sha256_update(struct shash_desc *desc, const u8 *data,
> return crypto_sha256_arm_update(desc, data, len);
>
> kernel_neon_begin();
> - sha256_base_do_update(desc, data, len,
> + sha256_base_do_update(sctx, data, len,
> (sha256_block_fn *)sha256_block_data_order_neon);
> kernel_neon_end();
>
> return 0;
> }
>
> -static int sha256_finup(struct shash_desc *desc, const u8 *data,
> - unsigned int len, u8 *out)
> +static int sha256_neon_finup(struct shash_desc *desc, const u8 *data,
> + unsigned int len, u8 *out)
> {
> + struct sha256_state *sctx = shash_desc_ctx(desc);
> +
> if (!may_use_simd())
> return crypto_sha256_arm_finup(desc, data, len, out);
>
> kernel_neon_begin();
> if (len)
> - sha256_base_do_update(desc, data, len,
> + sha256_base_do_update(sctx, data, len,
> (sha256_block_fn *)sha256_block_data_order_neon);
> - sha256_base_do_finalize(desc,
> + sha256_base_do_finalize(sctx,
> (sha256_block_fn *)sha256_block_data_order_neon);
> kernel_neon_end();
>
> - return sha256_base_finish(desc, out);
> + return crypto_sha2_final(desc, out);
> }
>
> -static int sha256_final(struct shash_desc *desc, u8 *out)
> +static int sha256_neon_final(struct shash_desc *desc, u8 *out)
> {
> - return sha256_finup(desc, NULL, 0, out);
> + return sha256_neon_finup(desc, NULL, 0, out);
> }
>
> struct shash_alg sha256_neon_algs[] = { {
> .digestsize = SHA256_DIGEST_SIZE,
> .init = sha256_base_init,
> - .update = sha256_update,
> - .final = sha256_final,
> - .finup = sha256_finup,
> + .update = sha256_neon_update,
> + .final = sha256_neon_final,
> + .finup = sha256_neon_finup,
> .descsize = sizeof(struct sha256_state),
> .base = {
> .cra_name = "sha256",
> @@ -86,9 +88,9 @@ struct shash_alg sha256_neon_algs[] = { {
> }, {
> .digestsize = SHA224_DIGEST_SIZE,
> .init = sha224_base_init,
> - .update = sha256_update,
> - .final = sha256_final,
> - .finup = sha256_finup,
> + .update = sha256_neon_update,
> + .final = sha256_neon_final,
> + .finup = sha256_neon_finup,
> .descsize = sizeof(struct sha256_state),
> .base = {
> .cra_name = "sha224",
> diff --git a/arch/arm64/crypto/sha2-ce-glue.c b/arch/arm64/crypto/sha2-ce-glue.c
> index 7cd587564a41..e38dd301abce 100644
> --- a/arch/arm64/crypto/sha2-ce-glue.c
> +++ b/arch/arm64/crypto/sha2-ce-glue.c
> @@ -39,7 +39,7 @@ static int sha256_ce_update(struct shash_desc *desc, const u8 *data,
>
> sctx->finalize = 0;
> kernel_neon_begin_partial(28);
> - sha256_base_do_update(desc, data, len,
> + sha256_base_do_update(&sctx->sst, data, len,
> (sha256_block_fn *)sha2_ce_transform);
> kernel_neon_end();
>
> @@ -64,13 +64,13 @@ static int sha256_ce_finup(struct shash_desc *desc, const u8 *data,
> sctx->finalize = finalize;
>
> kernel_neon_begin_partial(28);
> - sha256_base_do_update(desc, data, len,
> + sha256_base_do_update(&sctx->sst, data, len,
> (sha256_block_fn *)sha2_ce_transform);
> if (!finalize)
> - sha256_base_do_finalize(desc,
> + sha256_base_do_finalize(&sctx->sst,
> (sha256_block_fn *)sha2_ce_transform);
> kernel_neon_end();
> - return sha256_base_finish(desc, out);
> + return crypto_sha2_final(desc, out);
> }
>
> static int sha256_ce_final(struct shash_desc *desc, u8 *out)
> @@ -79,9 +79,10 @@ static int sha256_ce_final(struct shash_desc *desc, u8 *out)
>
> sctx->finalize = 0;
> kernel_neon_begin_partial(28);
> - sha256_base_do_finalize(desc, (sha256_block_fn *)sha2_ce_transform);
> + sha256_base_do_finalize(&sctx->sst,
> + (sha256_block_fn *)sha2_ce_transform);
> kernel_neon_end();
> - return sha256_base_finish(desc, out);
> + return crypto_sha2_final(desc, out);
> }
>
> static struct shash_alg algs[] = { {
> diff --git a/arch/arm64/crypto/sha256-glue.c b/arch/arm64/crypto/sha256-glue.c
> index a2226f841960..132a1ef89a71 100644
> --- a/arch/arm64/crypto/sha256-glue.c
> +++ b/arch/arm64/crypto/sha256-glue.c
> @@ -33,36 +33,39 @@ asmlinkage void sha256_block_data_order(u32 *digest, const void *data,
> asmlinkage void sha256_block_neon(u32 *digest, const void *data,
> unsigned int num_blks);
>
> -static int sha256_update(struct shash_desc *desc, const u8 *data,
> - unsigned int len)
> +static int sha256_update_arm64(struct shash_desc *desc, const u8 *data,
> + unsigned int len)
> {
> - return sha256_base_do_update(desc, data, len,
> - (sha256_block_fn *)sha256_block_data_order);
> + sha256_base_do_update(shash_desc_ctx(desc), data, len,
> + (sha256_block_fn *)sha256_block_data_order);
> + return 0;
> }
>
> -static int sha256_finup(struct shash_desc *desc, const u8 *data,
> - unsigned int len, u8 *out)
> +static int sha256_finup_arm64(struct shash_desc *desc, const u8 *data,
> + unsigned int len, u8 *out)
> {
> + struct sha256_state *sctx = shash_desc_ctx(desc);
> +
> if (len)
> - sha256_base_do_update(desc, data, len,
> + sha256_base_do_update(sctx, data, len,
> (sha256_block_fn *)sha256_block_data_order);
> - sha256_base_do_finalize(desc,
> + sha256_base_do_finalize(sctx,
> (sha256_block_fn *)sha256_block_data_order);
>
> - return sha256_base_finish(desc, out);
> + return crypto_sha2_final(desc, out);
> }
>
> -static int sha256_final(struct shash_desc *desc, u8 *out)
> +static int sha256_final_arm64(struct shash_desc *desc, u8 *out)
> {
> - return sha256_finup(desc, NULL, 0, out);
> + return sha256_finup_arm64(desc, NULL, 0, out);
> }
>
> static struct shash_alg algs[] = { {
> .digestsize = SHA256_DIGEST_SIZE,
> .init = sha256_base_init,
> - .update = sha256_update,
> - .final = sha256_final,
> - .finup = sha256_finup,
> + .update = sha256_update_arm64,
> + .final = sha256_final_arm64,
> + .finup = sha256_finup_arm64,
> .descsize = sizeof(struct sha256_state),
> .base.cra_name = "sha256",
> .base.cra_driver_name = "sha256-arm64",
> @@ -73,9 +76,9 @@ static struct shash_alg algs[] = { {
> }, {
> .digestsize = SHA224_DIGEST_SIZE,
> .init = sha224_base_init,
> - .update = sha256_update,
> - .final = sha256_final,
> - .finup = sha256_finup,
> + .update = sha256_update_arm64,
> + .final = sha256_final_arm64,
> + .finup = sha256_finup_arm64,
> .descsize = sizeof(struct sha256_state),
> .base.cra_name = "sha224",
> .base.cra_driver_name = "sha224-arm64",
> @@ -88,18 +91,22 @@ static struct shash_alg algs[] = { {
> static int sha256_update_neon(struct shash_desc *desc, const u8 *data,
> unsigned int len)
> {
> + struct sha256_state *sctx = shash_desc_ctx(desc);
> +
> /*
> * Stacking and unstacking a substantial slice of the NEON register
> * file may significantly affect performance for small updates when
> * executing in interrupt context, so fall back to the scalar code
> * in that case.
> */
> - if (!may_use_simd())
> - return sha256_base_do_update(desc, data, len,
> + if (!may_use_simd()) {
> + sha256_base_do_update(sctx, data, len,
> (sha256_block_fn *)sha256_block_data_order);
> + return 0;
> + }
>
> kernel_neon_begin();
> - sha256_base_do_update(desc, data, len,
> + sha256_base_do_update(sctx, data, len,
> (sha256_block_fn *)sha256_block_neon);
> kernel_neon_end();
>
> @@ -109,22 +116,24 @@ static int sha256_update_neon(struct shash_desc *desc, const u8 *data,
> static int sha256_finup_neon(struct shash_desc *desc, const u8 *data,
> unsigned int len, u8 *out)
> {
> + struct sha256_state *sctx = shash_desc_ctx(desc);
> +
> if (!may_use_simd()) {
> if (len)
> - sha256_base_do_update(desc, data, len,
> + sha256_base_do_update(sctx, data, len,
> (sha256_block_fn *)sha256_block_data_order);
> - sha256_base_do_finalize(desc,
> + sha256_base_do_finalize(sctx,
> (sha256_block_fn *)sha256_block_data_order);
> } else {
> kernel_neon_begin();
> if (len)
> - sha256_base_do_update(desc, data, len,
> + sha256_base_do_update(sctx, data, len,
> (sha256_block_fn *)sha256_block_neon);
> - sha256_base_do_finalize(desc,
> + sha256_base_do_finalize(sctx,
> (sha256_block_fn *)sha256_block_neon);
> kernel_neon_end();
> }
> - return sha256_base_finish(desc, out);
> + return crypto_sha2_final(desc, out);
> }
>
> static int sha256_final_neon(struct shash_desc *desc, u8 *out)
> diff --git a/arch/x86/crypto/sha256_ssse3_glue.c b/arch/x86/crypto/sha256_ssse3_glue.c
> index 9e79baf03a4b..e722fbaf0558 100644
> --- a/arch/x86/crypto/sha256_ssse3_glue.c
> +++ b/arch/x86/crypto/sha256_ssse3_glue.c
> @@ -44,52 +44,60 @@ asmlinkage void sha256_transform_ssse3(u32 *digest, const char *data,
> u64 rounds);
> typedef void (sha256_transform_fn)(u32 *digest, const char *data, u64 rounds);
>
> -static int sha256_update(struct shash_desc *desc, const u8 *data,
> - unsigned int len, sha256_transform_fn *sha256_xform)
> +static int sha256_fpu_update(struct shash_desc *desc, const u8 *data,
> + unsigned int len,
> + sha256_transform_fn *sha256_xform)
> {
> struct sha256_state *sctx = shash_desc_ctx(desc);
>
> if (!irq_fpu_usable() ||
> - (sctx->count % SHA256_BLOCK_SIZE) + len < SHA256_BLOCK_SIZE)
> - return crypto_sha256_update(desc, data, len);
> + (sctx->count % SHA256_BLOCK_SIZE) + len < SHA256_BLOCK_SIZE) {
> + sha256_update(sctx, data, len);
> + return 0;
> + }
>
> /* make sure casting to sha256_block_fn() is safe */
> BUILD_BUG_ON(offsetof(struct sha256_state, state) != 0);
>
> kernel_fpu_begin();
> - sha256_base_do_update(desc, data, len,
> + sha256_base_do_update(sctx, data, len,
> (sha256_block_fn *)sha256_xform);
> kernel_fpu_end();
>
> return 0;
> }
>
> -static int sha256_finup(struct shash_desc *desc, const u8 *data,
> +static int sha256_fpu_finup(struct shash_desc *desc, const u8 *data,
> unsigned int len, u8 *out, sha256_transform_fn *sha256_xform)
> {
> - if (!irq_fpu_usable())
> - return crypto_sha256_finup(desc, data, len, out);
> + struct sha256_state *sctx = shash_desc_ctx(desc);
> +
> + if (!irq_fpu_usable()) {
> + sha256_finup(sctx, data, len, out);
> + return 0;
> + }
>
> kernel_fpu_begin();
> if (len)
> - sha256_base_do_update(desc, data, len,
> + sha256_base_do_update(sctx, data, len,
> (sha256_block_fn *)sha256_xform);
> - sha256_base_do_finalize(desc, (sha256_block_fn *)sha256_xform);
> + sha256_base_do_finalize(sctx, (sha256_block_fn *)sha256_xform);
> kernel_fpu_end();
>
> - return sha256_base_finish(desc, out);
> + crypto_sha2_final(desc, out);
> + return 0;
> }
>
> static int sha256_ssse3_update(struct shash_desc *desc, const u8 *data,
> unsigned int len)
> {
> - return sha256_update(desc, data, len, sha256_transform_ssse3);
> + return sha256_fpu_update(desc, data, len, sha256_transform_ssse3);
> }
>
> static int sha256_ssse3_finup(struct shash_desc *desc, const u8 *data,
> unsigned int len, u8 *out)
> {
> - return sha256_finup(desc, data, len, out, sha256_transform_ssse3);
> + return sha256_fpu_finup(desc, data, len, out, sha256_transform_ssse3);
> }
>
> /* Add padding and return the message digest. */
> @@ -152,13 +160,13 @@ asmlinkage void sha256_transform_avx(u32 *digest, const char *data,
> static int sha256_avx_update(struct shash_desc *desc, const u8 *data,
> unsigned int len)
> {
> - return sha256_update(desc, data, len, sha256_transform_avx);
> + return sha256_fpu_update(desc, data, len, sha256_transform_avx);
> }
>
> static int sha256_avx_finup(struct shash_desc *desc, const u8 *data,
> unsigned int len, u8 *out)
> {
> - return sha256_finup(desc, data, len, out, sha256_transform_avx);
> + return sha256_fpu_finup(desc, data, len, out, sha256_transform_avx);
> }
>
> static int sha256_avx_final(struct shash_desc *desc, u8 *out)
> @@ -236,13 +244,13 @@ asmlinkage void sha256_transform_rorx(u32 *digest, const char *data,
> static int sha256_avx2_update(struct shash_desc *desc, const u8 *data,
> unsigned int len)
> {
> - return sha256_update(desc, data, len, sha256_transform_rorx);
> + return sha256_fpu_update(desc, data, len, sha256_transform_rorx);
> }
>
> static int sha256_avx2_finup(struct shash_desc *desc, const u8 *data,
> unsigned int len, u8 *out)
> {
> - return sha256_finup(desc, data, len, out, sha256_transform_rorx);
> + return sha256_fpu_finup(desc, data, len, out, sha256_transform_rorx);
> }
>
> static int sha256_avx2_final(struct shash_desc *desc, u8 *out)
> @@ -318,13 +326,13 @@ asmlinkage void sha256_ni_transform(u32 *digest, const char *data,
> static int sha256_ni_update(struct shash_desc *desc, const u8 *data,
> unsigned int len)
> {
> - return sha256_update(desc, data, len, sha256_ni_transform);
> + return sha256_fpu_update(desc, data, len, sha256_ni_transform);
> }
>
> static int sha256_ni_finup(struct shash_desc *desc, const u8 *data,
> unsigned int len, u8 *out)
> {
> - return sha256_finup(desc, data, len, out, sha256_ni_transform);
> + return sha256_fpu_finup(desc, data, len, out, sha256_ni_transform);
> }
>
> static int sha256_ni_final(struct shash_desc *desc, u8 *out)
> diff --git a/arch/x86/purgatory/purgatory.c b/arch/x86/purgatory/purgatory.c
> index 25e068ba3382..ed6e80b844cf 100644
> --- a/arch/x86/purgatory/purgatory.c
> +++ b/arch/x86/purgatory/purgatory.c
> @@ -10,7 +10,7 @@
> * Version 2. See the file COPYING for more details.
> */
>
> -#include "sha256.h"
> +#include <crypto/sha.h>
> #include "../boot/string.h"
>
> struct sha_region {
> diff --git a/arch/x86/purgatory/sha256.c b/arch/x86/purgatory/sha256.c
> index 548ca675a14a..724925d5da61 100644
> --- a/arch/x86/purgatory/sha256.c
> +++ b/arch/x86/purgatory/sha256.c
> @@ -17,7 +17,7 @@
>
> #include <linux/bitops.h>
> #include <asm/byteorder.h>
> -#include "sha256.h"
> +#include <crypto/sha.h>
> #include "../boot/string.h"
>
> static inline u32 Ch(u32 x, u32 y, u32 z)
> @@ -208,22 +208,7 @@ static void sha256_transform(u32 *state, const u8 *input)
> memset(W, 0, 64 * sizeof(u32));
> }
>
> -int sha256_init(struct sha256_state *sctx)
> -{
> - sctx->state[0] = SHA256_H0;
> - sctx->state[1] = SHA256_H1;
> - sctx->state[2] = SHA256_H2;
> - sctx->state[3] = SHA256_H3;
> - sctx->state[4] = SHA256_H4;
> - sctx->state[5] = SHA256_H5;
> - sctx->state[6] = SHA256_H6;
> - sctx->state[7] = SHA256_H7;
> - sctx->count = 0;
> -
> - return 0;
> -}
> -
> -int sha256_update(struct sha256_state *sctx, const u8 *data, unsigned int len)
> +void sha256_update(struct sha256_state *sctx, const u8 *data, unsigned int len)
> {
> unsigned int partial, done;
> const u8 *src;
> @@ -249,11 +234,9 @@ int sha256_update(struct sha256_state *sctx, const u8 *data, unsigned int len)
> partial = 0;
> }
> memcpy(sctx->buf + partial, src, len - done);
> -
> - return 0;
> }
>
> -int sha256_final(struct sha256_state *sctx, u8 *out)
> +void sha256_final(struct sha256_state *sctx, u8 *out)
> {
> __be32 *dst = (__be32 *)out;
> __be64 bits;
> @@ -278,6 +261,4 @@ int sha256_final(struct sha256_state *sctx, u8 *out)
>
> /* Zeroize sensitive information. */
> memset(sctx, 0, sizeof(*sctx));
> -
> - return 0;
> }
> diff --git a/arch/x86/purgatory/sha256.h b/arch/x86/purgatory/sha256.h
> deleted file mode 100644
> index bd15a4127735..000000000000
> --- a/arch/x86/purgatory/sha256.h
> +++ /dev/null
> @@ -1,22 +0,0 @@
> -/*
> - * Copyright (C) 2014 Red Hat Inc.
> - *
> - * Author: Vivek Goyal <[email protected]>
> - *
> - * This source code is licensed under the GNU General Public License,
> - * Version 2. See the file COPYING for more details.
> - */
> -
> -#ifndef SHA256_H
> -#define SHA256_H
> -
> -
> -#include <linux/types.h>
> -#include <crypto/sha.h>
> -
> -extern int sha256_init(struct sha256_state *sctx);
> -extern int sha256_update(struct sha256_state *sctx, const u8 *input,
> - unsigned int length);
> -extern int sha256_final(struct sha256_state *sctx, u8 *hash);
> -
> -#endif /* SHA256_H */
> diff --git a/crypto/sha256_generic.c b/crypto/sha256_generic.c
> index 8f9c47e1a96e..f2747893402c 100644
> --- a/crypto/sha256_generic.c
> +++ b/crypto/sha256_generic.c
> @@ -231,6 +231,13 @@ static void sha256_transform(u32 *state, const u8 *input)
> memzero_explicit(W, 64 * sizeof(u32));
> }
>
> +int sha256_base_init(struct shash_desc *desc)
> +{
> + sha256_init(shash_desc_ctx(desc));
> + return 0;
> +}
> +EXPORT_SYMBOL(sha256_base_init);
> +
> static void sha256_generic_block_fn(struct sha256_state *sst, u8 const *src,
> int blocks)
> {
> @@ -240,32 +247,49 @@ static void sha256_generic_block_fn(struct sha256_state *sst, u8 const *src,
> }
> }
>
> -int crypto_sha256_update(struct shash_desc *desc, const u8 *data,
> +void sha256_update(struct sha256_state *sctx, const u8 *data,
> unsigned int len)
> {
> - return sha256_base_do_update(desc, data, len, sha256_generic_block_fn);
> + sha256_base_do_update(sctx, data, len, sha256_generic_block_fn);
> +}
> +EXPORT_SYMBOL(sha256_update);
> +
> +void sha256_final(struct sha256_state *sctx, u8 *out)
> +{
> + sha256_base_do_finalize(sctx, sha256_generic_block_fn);
> + sha256_base_finish(sctx, out);
> }
> -EXPORT_SYMBOL(crypto_sha256_update);
> +EXPORT_SYMBOL(sha256_final);
>
> -static int sha256_final(struct shash_desc *desc, u8 *out)
> +static int crypto_sha256_update(struct shash_desc *desc, const u8 *data,
> + unsigned int len)
> {
> - sha256_base_do_finalize(desc, sha256_generic_block_fn);
> - return sha256_base_finish(desc, out);
> + sha256_update(shash_desc_ctx(desc), data, len);
> + return 0;
> +}
> +
> +int crypto_sha2_final(struct shash_desc *desc, u8 *out)
> +{
> + struct sha256_state *sctx = shash_desc_ctx(desc);
> +
> + sha256_base_do_finalize(sctx, sha256_generic_block_fn);
> + sha2_base_finish(sctx, crypto_shash_digestsize(desc->tfm), out);
> + return 0;
> }
> +EXPORT_SYMBOL(crypto_sha2_final);
>
> -int crypto_sha256_finup(struct shash_desc *desc, const u8 *data,
> - unsigned int len, u8 *hash)
> +static int crypto_sha256_finup(struct shash_desc *desc, const u8 *data,
> + unsigned int len, u8 *hash)
> {
> - sha256_base_do_update(desc, data, len, sha256_generic_block_fn);
> - return sha256_final(desc, hash);
> + sha256_finup(shash_desc_ctx(desc), data, len, hash);
> + return 0;
> }
> -EXPORT_SYMBOL(crypto_sha256_finup);
>
> static struct shash_alg sha256_algs[2] = { {
> .digestsize = SHA256_DIGEST_SIZE,
> .init = sha256_base_init,
> .update = crypto_sha256_update,
> - .final = sha256_final,
> + .final = crypto_sha2_final,
> .finup = crypto_sha256_finup,
> .descsize = sizeof(struct sha256_state),
> .base = {
> @@ -279,7 +303,7 @@ static struct shash_alg sha256_algs[2] = { {
> .digestsize = SHA224_DIGEST_SIZE,
> .init = sha224_base_init,
> .update = crypto_sha256_update,
> - .final = sha256_final,
> + .final = crypto_sha2_final,
> .finup = crypto_sha256_finup,
> .descsize = sizeof(struct sha256_state),
> .base = {
> diff --git a/include/crypto/sha.h b/include/crypto/sha.h
> index c94d3eb1cefd..2b6978471605 100644
> --- a/include/crypto/sha.h
> +++ b/include/crypto/sha.h
> @@ -96,11 +96,30 @@ extern int crypto_sha1_update(struct shash_desc *desc, const u8 *data,
> extern int crypto_sha1_finup(struct shash_desc *desc, const u8 *data,
> unsigned int len, u8 *hash);
>
> -extern int crypto_sha256_update(struct shash_desc *desc, const u8 *data,
> - unsigned int len);
> -
> -extern int crypto_sha256_finup(struct shash_desc *desc, const u8 *data,
> - unsigned int len, u8 *hash);
> +static inline void sha256_init(struct sha256_state *sctx)
> +{
> + sctx->state[0] = SHA256_H0;
> + sctx->state[1] = SHA256_H1;
> + sctx->state[2] = SHA256_H2;
> + sctx->state[3] = SHA256_H3;
> + sctx->state[4] = SHA256_H4;
> + sctx->state[5] = SHA256_H5;
> + sctx->state[6] = SHA256_H6;
> + sctx->state[7] = SHA256_H7;
> + sctx->count = 0;
> +}
> +
> +extern void sha256_update(struct sha256_state *sctx, const u8 *data,
> + unsigned int len);
> +
> +extern void sha256_final(struct sha256_state *sctx, u8 *out);
> +
> +static inline void sha256_finup(struct sha256_state *sctx, const u8 *data,
> + unsigned int len, u8 *hash)
> +{
> + sha256_update(sctx, data, len);
> + sha256_final(sctx, hash);
> +}
>
> extern int crypto_sha512_update(struct shash_desc *desc, const u8 *data,
> unsigned int len);
> diff --git a/include/crypto/sha256_base.h b/include/crypto/sha256_base.h
> index d1f2195bb7de..f65d9a516b36 100644
> --- a/include/crypto/sha256_base.h
> +++ b/include/crypto/sha256_base.h
> @@ -35,29 +35,13 @@ static inline int sha224_base_init(struct shash_desc *desc)
> return 0;
> }
>
> -static inline int sha256_base_init(struct shash_desc *desc)
> -{
> - struct sha256_state *sctx = shash_desc_ctx(desc);
> -
> - sctx->state[0] = SHA256_H0;
> - sctx->state[1] = SHA256_H1;
> - sctx->state[2] = SHA256_H2;
> - sctx->state[3] = SHA256_H3;
> - sctx->state[4] = SHA256_H4;
> - sctx->state[5] = SHA256_H5;
> - sctx->state[6] = SHA256_H6;
> - sctx->state[7] = SHA256_H7;
> - sctx->count = 0;
> -
> - return 0;
> -}
> +extern int sha256_base_init(struct shash_desc *desc);
>
> -static inline int sha256_base_do_update(struct shash_desc *desc,
> +static inline void sha256_base_do_update(struct sha256_state *sctx,
> const u8 *data,
> unsigned int len,
> sha256_block_fn *block_fn)
> {
> - struct sha256_state *sctx = shash_desc_ctx(desc);
> unsigned int partial = sctx->count % SHA256_BLOCK_SIZE;
>
> sctx->count += len;
> @@ -86,15 +70,12 @@ static inline int sha256_base_do_update(struct shash_desc *desc,
> }
> if (len)
> memcpy(sctx->buf + partial, data, len);
> -
> - return 0;
> }
>
> -static inline int sha256_base_do_finalize(struct shash_desc *desc,
> +static inline void sha256_base_do_finalize(struct sha256_state *sctx,
> sha256_block_fn *block_fn)
> {
> const int bit_offset = SHA256_BLOCK_SIZE - sizeof(__be64);
> - struct sha256_state *sctx = shash_desc_ctx(desc);
> __be64 *bits = (__be64 *)(sctx->buf + bit_offset);
> unsigned int partial = sctx->count % SHA256_BLOCK_SIZE;
>
> @@ -109,14 +90,11 @@ static inline int sha256_base_do_finalize(struct shash_desc *desc,
> memset(sctx->buf + partial, 0x0, bit_offset - partial);
> *bits = cpu_to_be64(sctx->count << 3);
> block_fn(sctx, sctx->buf, 1);
> -
> - return 0;
> }
>
> -static inline int sha256_base_finish(struct shash_desc *desc, u8 *out)
> +static inline void sha2_base_finish(struct sha256_state *sctx,
> + unsigned int digest_size, u8 *out)
> {
> - unsigned int digest_size = crypto_shash_digestsize(desc->tfm);
> - struct sha256_state *sctx = shash_desc_ctx(desc);
> __be32 *digest = (__be32 *)out;
> int i;
>
> @@ -124,5 +102,11 @@ static inline int sha256_base_finish(struct shash_desc *desc, u8 *out)
> put_unaligned_be32(sctx->state[i], digest++);
>
> *sctx = (struct sha256_state){};
> - return 0;
> }
> +
> +static inline void sha256_base_finish(struct sha256_state *sctx, u8 *out)
> +{
> + sha2_base_finish(sctx, SHA256_DIGEST_SIZE, out);
> +}
> +
> +extern int crypto_sha2_final(struct shash_desc *desc, u8 *out);
> --
> 2.9.3
>
On Sat, Dec 24, 2016 at 2:33 AM, Ard Biesheuvel
<[email protected]> wrote:
> Hi Andy,
>
> On 24 December 2016 at 02:22, Andy Lutomirski <[email protected]> wrote:
>> There are some pieecs of kernel code that want to compute SHA256
>> directly without going through the crypto core. Adjust the exported
>> API to decouple it from the crypto core.
>>
>
> There are a bunch of things happening at the same time in this patch,
> i.e., unnecessary renames of functions with static linkage, return
> type changes to the base prototypes (int (*)(...) to void (*)(...))
> and the change for the base functions to take a struct sha256_state
> ctx rather than a shash_desc. I suppose you are mainly after the
> latter, so could we please drop the other changes?
>
> For the name clashes, could we simply use the crypto_ prefix for the
> globally visible functions rather than using names that are already in
> use? (and having to go around clean up the conflicts)
> As for the return type changes, the base functions intentionally
> return int to allow tail calls from the functions exposed by the
> crypto API (whose prototypes cannot be changed). Unlikely to matter in
> the grand scheme of things (especially now that the base layer
> consists of static inline functions primarily), but it is equally
> pointless to go around and change them to return void IMO.
>
> So what remains is the struct shash_desc to struct sha256_state
> change, which makes sense given that you are after a sha256_digest()
> function that does not require the crypto API. But it seems your use
> case does not rely on incremental hashing, and so there is no reason
> for the state to be exposed outside of the implementation, and we
> could simply expose a crypto_sha256_digest() routine from the
> sha256_generic.c implementation instead.
I actually do use incremental hashing later on. BPF currently
vmallocs() a big temporary buffer just so it can fill it and hash it.
I change it to hash as it goes.
I painted the bike shed the other way because I thought that crypto_
names should indicate that they're the versions compatible with the
crypto API, but I take your point about churn. Part of the reason I
didn't want to use crypto_sha256_update is because that function is
currently like this:
int crypto_sha256_update(struct shash_desc *desc, const u8 *data,
unsigned int len)
and I wanted to avoid churn. The sha256_update() functions scattered
all over were static, so I didn't worry about them.
I'm going to give this another try as a split-up series that tries to
avoid making any changes beyond simple function renames to the
drivers.
>
> Also, I strongly feel that crypto and other security related patches
> should be tested before being posted, even if they are only RFC,
> especially when they are posted by high profile kernel devs like
> yourself. (Your code incorrectly calls crypto_sha2_final() in a couple
> of places, resulting in the finalization being performed twice, once
> with the accelerated block transform and again with the generic
> transform)
>
I tested it, albeit poorly. I wanted feedback on the API (thanks!)
and I figured I could more carefully check the implementation once the
API survives a bit of review. Since it looks like I have to rework
this, I'd need to re-test anyway.
>> I suspect this will very slightly speed up the SHA256 shash operations
>> as well by reducing the amount of indirection involved.
>>
>
> I think you have a valid point when it comes to the complexity of the
> crypto API in general. But the struct sha256_state is embedded in the
> shash_desc rather than referred to via a pointer, so the level of
> indirection does not appear to change. And given how 99.9% of the
> SHA256 execution time is spent in the block transform routine anyway,
> I expect the performance delta to be in the noise tbh.
s/very slightly/negligibly? There's an extra speedup from avoiding a
variable-length stack allocation, but that probably doesn't matter
much either.
>
> Finally, another thing to keep in mind is that the base layers of
> SHA-1, SHA-256 and SHA-512 are intentionally structured in the same
> way. If there is a need for a digest() entry point, I'd prefer to add
> them for all flavours.
I want to get sha256 right first. Once it's in good shape, making the
same changes to the other variants should be easy.
>
> Whether this still belongs under crypto or under lib/sha256.c as a
> library function (allowing archs to override it) is open for debate.
> If hashing BPF programs becomes a hot spot, we probably have bigger
> problems.
>
> Regards,
> Ard.
>
> P.S. I do take your point regarding the arch_sha256_block_transform()
> proposed in your follow up email, but there are some details (SIMD,
> availability of the instructions etc) that would make it only suitable
> for the generic implementation anyway, and the base layer was already
> a huge improvement compared to the open coded implementations of the
> SHA boilerplate.
Agreed, and my model may not be quite right. It might have to be
something like:
if (arch_begin_sha256(len)) {
... do it with arch helpers...
arch_end_sha256();
} else {
... do it generically ...
}
>> - return sha256_base_finish(desc, out);
>> + return crypto_sha2_final(desc, out);
>
> This is wrong: your crypto_sha2_final also calls sha256_base_do_finalize()
Ugh, right. I clearly need to organize this change better to avoid
this kind of mistake.
On 12/24/2016 03:22 AM, Andy Lutomirski wrote:
> BPF digests are intended to be used to avoid reloading programs that
> are already loaded. For use cases (CRIU?) where untrusted programs
> are involved, intentional hash collisions could cause the wrong BPF
> program to execute. Additionally, if BPF digests are ever used
> in-kernel to skip verification, a hash collision could give privilege
> escalation directly.
Just for the record, digests will never ever be used to skip the
verification step, so I don't know why this idea even comes up
here (?) or is part of the changelog? As this will never be done
anyway, rather drop that part so we can avoid confusion on this?
Wrt untrusted programs, I don't see much of a use on this facility
in general for them. Something like a tail call map would quite
likely only be private to the application. And again, I really doubt
we'll have something like user namespace support in the foreseeable
future. Anyway, that said, I don't really have a big issue if you
want to switch to sha256, though.
> SHA1 is no longer considered adequately collision-resistant (see, for
> example, all the major browsers dropping support for SHA1
> certificates). Use SHA256 instead.
>
> I moved the digest field to keep all of the bpf program metadata in
> the same cache line.
>
> Cc: Daniel Borkmann <[email protected]>
> Cc: Alexei Starovoitov <[email protected]>
> Signed-off-by: Andy Lutomirski <[email protected]>
On Sat, Dec 24, 2016 at 09:57:53AM -0800, Andy Lutomirski wrote:
>
> I actually do use incremental hashing later on. BPF currently
> vmallocs() a big temporary buffer just so it can fill it and hash it.
> I change it to hash as it goes.
How much data is this supposed to hash on average? If it's a large
amount then perhaps using the existing crypto API would be a better
option than adding this.
Cheers,
--
Email: Herbert Xu <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
Andy Lutomirski <[email protected]> wrote:
> Since there are plenty of uses for the new-in-4.10 BPF digest feature
> that would be problematic if malicious users could produce collisions,
> the BPF digest should be collision-resistant. SHA-1 is no longer
> considered collision-resistant, so switch it to SHA-256.
>
> The actual switchover is trivial. Most of this series consists of
> cleanups to the SHA256 code to make it usable as a standalone library
> (since BPF should not depend on crypto).
>
> The cleaned up library is much more user-friendly than the SHA-1 code,
> so this also significantly tidies up the BPF digest code.
>
> This is intended for 4.10. If this series misses 4.10 and nothing
> takes its place, then we'll have an unpleasant ABI stability
> situation.
Can you please explain why BPF needs to be able to use SHA directly
rather than through the crypto API?
Thanks,
--
Email: Herbert Xu <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
On 26 December 2016 at 07:57, Herbert Xu <[email protected]> wrote:
> On Sat, Dec 24, 2016 at 09:57:53AM -0800, Andy Lutomirski wrote:
>>
>> I actually do use incremental hashing later on. BPF currently
>> vmallocs() a big temporary buffer just so it can fill it and hash it.
>> I change it to hash as it goes.
>
> How much data is this supposed to hash on average? If it's a large
> amount then perhaps using the existing crypto API would be a better
> option than adding this.
>
This is a good point actually: you didn't explain *why* BPF shouldn't
depend on the crypto API.
On Mon, Dec 26, 2016 at 9:51 AM, Ard Biesheuvel
<[email protected]> wrote:
> On 26 December 2016 at 07:57, Herbert Xu <[email protected]> wrote:
>> On Sat, Dec 24, 2016 at 09:57:53AM -0800, Andy Lutomirski wrote:
>>>
>>> I actually do use incremental hashing later on. BPF currently
>>> vmallocs() a big temporary buffer just so it can fill it and hash it.
>>> I change it to hash as it goes.
>>
>> How much data is this supposed to hash on average? If it's a large
>> amount then perhaps using the existing crypto API would be a better
>> option than adding this.
>>
>
> This is a good point actually: you didn't explain *why* BPF shouldn't
> depend on the crypto API.
According to Daniel, the networking folks want to let embedded systems
include BPF without requiring the crypto core.
At some point, I'd also like to use modern hash functions for module
verification. If doing so would require the crypto core to be
available when modules are loaded, then the crypto core couldn't be
modular. (Although it occurs to me that my patches get that wrong --
if this change happens, I need to split the code so that the library
functions can be built in even if CRYPTO=m.)
Daniel, would you be okay with BPF selecting CRYPTO and CRYPTO_HASH?
Also, as a bikeshed thought: I could call the functions
sha256_init_direct(), etc. Then there wouldn't be namespace
collisions and the fact that they bypass accelerated drivers would be
more obvious.
--Andy
On Sat, Dec 24, 2016 at 08:59:53PM +0100, Daniel Borkmann wrote:
> On 12/24/2016 03:22 AM, Andy Lutomirski wrote:
> >BPF digests are intended to be used to avoid reloading programs that
> >are already loaded. For use cases (CRIU?) where untrusted programs
> >are involved, intentional hash collisions could cause the wrong BPF
> >program to execute. Additionally, if BPF digests are ever used
> >in-kernel to skip verification, a hash collision could give privilege
> >escalation directly.
>
> Just for the record, digests will never ever be used to skip the
> verification step, so I don't know why this idea even comes up
> here (?) or is part of the changelog? As this will never be done
> anyway, rather drop that part so we can avoid confusion on this?
+1 to what Daniel said above.
For the others let me explain what this patch set is actually
trying to accomplish.
Andy had an idea that sha256 of the program can somehow be used
to bypass kernel verifier during program loading. Furthemore
he thinks that such 'bypass' would be useful for criu of bpf programs,
hence see vigorously attacking existing prog_digest (sha1) because
it's not as secure as sha256 and hence cannot be used for such 'bypass'.
The problem with criu of bpf programs is same as criu of kernel modules.
For the main tracing and networking use cases, we cannot stop the kernel,
so criu is out of question already.
Even if we could stop all the events that trigger bpf program execution,
the sha256 or memcmp() of the full program is not enough to guarantee
that two programs are the same.
Ex. bpf_map_lookup() may be safe for one program and not for another
depending on how map was created. Two programs of different types
are not comparable either. Etc, etc.
Therefore the idea of using sha256 for such purpose is bogus,
and I have an obvious NACK for bpf related patches 3,4,5,6.
For the questions raised in other threads:
I'm not ok with making BPF depend on CRYPTO, since it's the same as
requiring CRYPTO to select BPF for no good reason.
And 0/6 commit log:
> Since there are plenty of uses for the new-in-4.10 BPF digest feature
> that would be problematic if malicious users could produce collisions,
> the BPF digest should be collision-resistant.
This statement is also bogus. The only reason we added prog_digest is
to improve debuggability and introspection of bpf programs.
As I said in the previous thread "collisions are ok" and we could have
used jhash here to avoid patches like this ever appearing
and wasting everyones time.
sha1 is 20 bytes which is already a bit long to print and copy paste by humans.
whereas 4 byte jhash is a bit too short, since collisions are not that rare
and may lead to incorrect assumptions from the users that develop the programs.
I would prefer something in 6-10 byte range that prevents collisions most of
the time and short to print as hex, but I don't know of anything like this
in the existing kernel and inventing bpf specific hash is not great.
Another requirement for debugging (and prog_digest) that user space
should be able to produce the same hash without asking kernel, so
sha1 fits that as well, since it's well known and easy to put into library.
sha256 doesn't fit either of these requirements. 32-bytes are too long to print
and when we use it as a substitue for the prog name for jited ksym, looking
at long function names will screw up all tools like perf, which we don't
want. sha256 is equally not easy for user space app like iproute2,
so not an acceptable choice from that pov either.
On Mon, Dec 26, 2016 at 5:36 PM, Alexei Starovoitov
<[email protected]> wrote:
> On Sat, Dec 24, 2016 at 08:59:53PM +0100, Daniel Borkmann wrote:
>> On 12/24/2016 03:22 AM, Andy Lutomirski wrote:
>> >BPF digests are intended to be used to avoid reloading programs that
>> >are already loaded. For use cases (CRIU?) where untrusted programs
>> >are involved, intentional hash collisions could cause the wrong BPF
>> >program to execute. Additionally, if BPF digests are ever used
>> >in-kernel to skip verification, a hash collision could give privilege
>> >escalation directly.
>>
>> Just for the record, digests will never ever be used to skip the
>> verification step, so I don't know why this idea even comes up
>> here (?) or is part of the changelog? As this will never be done
>> anyway, rather drop that part so we can avoid confusion on this?
>
> +1 to what Daniel said above.
>
> For the others let me explain what this patch set is actually
> trying to accomplish.
The patch:
a) cleans up the code and
b) uses a cryptographic hash that is actually believed to satisfy the
definition of a cryptographic hash.
There's no excuse for not doing b.
> and I have an obvious NACK for bpf related patches 3,4,5,6.
Did you *read* the ones that were pure cleanups?
>
> sha1 is 20 bytes which is already a bit long to print and copy paste by humans.
> whereas 4 byte jhash is a bit too short, since collisions are not that rare
> and may lead to incorrect assumptions from the users that develop the programs.
> I would prefer something in 6-10 byte range that prevents collisions most of
> the time and short to print as hex, but I don't know of anything like this
> in the existing kernel and inventing bpf specific hash is not great.
> Another requirement for debugging (and prog_digest) that user space
> should be able to produce the same hash without asking kernel, so
> sha1 fits that as well, since it's well known and easy to put into library.
Then truncate them in user space.
On Mon, Dec 26, 2016 at 10:08:48AM -0800, Andy Lutomirski wrote:
>
> According to Daniel, the networking folks want to let embedded systems
> include BPF without requiring the crypto core.
Last I checked the IPv4 stack depended on the crypto API so this
sounds bogus.
Cheers,
--
Email: Herbert Xu <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
On 12/27/2016 10:58 AM, Herbert Xu wrote:
> On Mon, Dec 26, 2016 at 10:08:48AM -0800, Andy Lutomirski wrote:
>>
>> According to Daniel, the networking folks want to let embedded systems
>> include BPF without requiring the crypto core.
>
> Last I checked the IPv4 stack depended on the crypto API so this
> sounds bogus.
I think there's a bit of a mixup here with what I said. To clarify,
requirement back then from tracing folks was that bpf engine and
therefore bpf syscall can be build w/o networking enabled for small
devices, so dependencies preferably need to be kept on a absolute
minimum, same counts for either making it suddenly a depend on
CRYPTO or a select CRYPTO for just those few lines that can be
pulled in from lib/ code instead.
On Tue, Dec 27, 2016 at 6:16 AM, Daniel Borkmann <[email protected]> wrote:
> On 12/27/2016 10:58 AM, Herbert Xu wrote:
>>
>> On Mon, Dec 26, 2016 at 10:08:48AM -0800, Andy Lutomirski wrote:
>>>
>>>
>>> According to Daniel, the networking folks want to let embedded systems
>>> include BPF without requiring the crypto core.
>>
>>
>> Last I checked the IPv4 stack depended on the crypto API so this
>> sounds bogus.
>
>
> I think there's a bit of a mixup here with what I said. To clarify,
> requirement back then from tracing folks was that bpf engine and
> therefore bpf syscall can be build w/o networking enabled for small
> devices, so dependencies preferably need to be kept on a absolute
> minimum, same counts for either making it suddenly a depend on
> CRYPTO or a select CRYPTO for just those few lines that can be
> pulled in from lib/ code instead.
Somehow I had that in my head as "networking" not "tracing", probably
because of the TCA stuff. Whoops.
Anyway, I'm rewriting the crypto part of the patch completely based on
Ard's feedback.