2018-08-07 21:18:34

by Kees Cook

Subject: [PATCH v8 0/9] crypto: Remove VLA usage

v8 cover letter:

I continue to hope this can land in v4.19, but I realize that's unlikely.
It would be nice, though, if some of the "trivial" patches could get taken
(e.g. cbc, xcbc, ccm VLA removals) so I don't have to keep repeating them.
*fingers crossed*

Series cover letter:

This is nearly the last of the VLA removals[1], but it's one of the
largest because crypto gets used in lots of places. After looking
through code, usage, reading the threads Gustavo started, and comparing
the use-cases to the other VLA removals that have landed in the kernel,
I think this series is likely the best way forward to shut the door on
VLAs forever.

For background, the crypto stack usage is for callers to do an immediate
bit of work that doesn't allocate new memory. This means that other VLA
removal techniques (like just using kmalloc) aren't workable, and the
next common technique is needed: examination of maximum stack usage and
the addition of sanity checks. This series does that, and in several
cases, these maximums were already implicit in the code.
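
As a rough illustration of the pattern used throughout the series (a sketch
only, with hypothetical names, not code lifted from any of the patches): a
VLA sized by a runtime value becomes a fixed buffer sized by the known
maximum, guarded by a sanity check:

	#define EXAMPLE_MAX_SIZE 64	/* assumed upper bound, for the sketch only */

	static int example_do_work(const u8 *data, unsigned int len)
	{
		u8 buf[EXAMPLE_MAX_SIZE];	/* was: u8 buf[len]; */

		if (WARN_ON(len > sizeof(buf)))
			return -EINVAL;

		memcpy(buf, data, len);
		/* ... operate on buf[0..len) ... */
		return 0;
	}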

This series is intended to land via the crypto tree for 4.19, though it
touches dm, networking, and a few other things as well, since there are
dependent patches (new crypto #defines being used, etc).

Thanks!

-Kees

[1] https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qPXydAacU1RqZWA@mail.gmail.com

Changelog:

v8:
- remove "impossible condition" BUG-check in cbc.

v7:
- refresh and reorganize without ahash->shash conversions
- merge hash maximums for both shash and ahash

v6:
- make xcbc blocksize unconditional
- add ahash-to-shash conversion patch series to entirely remove
AHASH_REQUEST_ON_STACK from the kernel

v5:
- limit AHASH_REQUEST_ON_STACK size only to non-async hash wrapping.
- sanity-check ahash reqsize only when doing shash wrapping.
- remove frame_warn changes in favor of shash conversions and other fixes.
- send ahash to shash conversion patches and other fixes separately.

v4:
- add back *_REQUEST_ON_STACK patches.
- programmatically find stack sizes for *_REQUEST_ON_STACK patches.
- whitelist some code that trips FRAME_WARN on 32-bit builds.
- fix alignment patches.

v3:
- drop *_REQUEST_ON_STACK patches. The rest of this series is pretty
straightforward, and I'd like to get them tested in -next while
we continue to chip away at the *_REQUEST_ON_STACK VLA removal patches
separately. "Part 2" will continue with those.

v2:
- use 512 instead of PAGE_SIZE / 8 to avoid bloat on large-page archs.
- switch xcbc to "16" max universally.
- fix typo in bounds check for dm buffer size.
- examine actual reqsizes for skcipher and ahash instead of guessing.
- improve names and comments for alg maxes.





Ard Biesheuvel (1):
crypto: ccm: Remove VLA usage

Kees Cook (8):
crypto: xcbc: Remove VLA usage
crypto: cbc: Remove VLA usage
crypto: hash: Remove VLA usage
dm: Remove VLA usage from hashes
crypto alg: Introduce generic max blocksize and alignmask
crypto: qat: Remove VLA usage
crypto: shash: Remove VLA usage in unaligned hashing
crypto: skcipher: Remove VLA usage for SKCIPHER_REQUEST_ON_STACK

crypto/ahash.c | 4 +--
crypto/algapi.c | 7 ++++-
crypto/algif_hash.c | 2 +-
crypto/ccm.c | 9 ++++---
crypto/shash.c | 33 ++++++++++++++----------
crypto/xcbc.c | 8 +++---
drivers/crypto/qat/qat_common/qat_algs.c | 8 ++++--
drivers/md/dm-integrity.c | 23 ++++++++++++-----
drivers/md/dm-verity-fec.c | 5 +++-
include/crypto/algapi.h | 4 ++-
include/crypto/cbc.h | 2 +-
include/crypto/hash.h | 6 ++++-
include/crypto/internal/skcipher.h | 1 +
include/crypto/skcipher.h | 4 ++-
include/linux/compiler-gcc.h | 1 -
15 files changed, 79 insertions(+), 38 deletions(-)

--
2.17.1


2018-08-07 21:18:37

by Kees Cook

Subject: [PATCH v8 3/9] crypto: ccm: Remove VLA usage

From: Ard Biesheuvel <[email protected]>

In the quest to remove all stack VLA usage from the kernel[1], this drops
AHASH_REQUEST_ON_STACK by preallocating the ahash request area combined
with the skcipher area (which are not used at the same time).

[1] https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qPXydAacU1RqZWA@mail.gmail.com

Signed-off-by: Ard Biesheuvel <[email protected]>
Signed-off-by: Kees Cook <[email protected]>
---
crypto/ccm.c | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/crypto/ccm.c b/crypto/ccm.c
index 0a083342ec8c..b242fd0d3262 100644
--- a/crypto/ccm.c
+++ b/crypto/ccm.c
@@ -50,7 +50,10 @@ struct crypto_ccm_req_priv_ctx {
u32 flags;
struct scatterlist src[3];
struct scatterlist dst[3];
- struct skcipher_request skreq;
+ union {
+ struct ahash_request ahreq;
+ struct skcipher_request skreq;
+ };
};

struct cbcmac_tfm_ctx {
@@ -181,7 +184,7 @@ static int crypto_ccm_auth(struct aead_request *req, struct scatterlist *plain,
struct crypto_ccm_req_priv_ctx *pctx = crypto_ccm_reqctx(req);
struct crypto_aead *aead = crypto_aead_reqtfm(req);
struct crypto_ccm_ctx *ctx = crypto_aead_ctx(aead);
- AHASH_REQUEST_ON_STACK(ahreq, ctx->mac);
+ struct ahash_request *ahreq = &pctx->ahreq;
unsigned int assoclen = req->assoclen;
struct scatterlist sg[3];
u8 *odata = pctx->odata;
@@ -427,7 +430,7 @@ static int crypto_ccm_init_tfm(struct crypto_aead *tfm)
crypto_aead_set_reqsize(
tfm,
align + sizeof(struct crypto_ccm_req_priv_ctx) +
- crypto_skcipher_reqsize(ctr));
+ max(crypto_ahash_reqsize(mac), crypto_skcipher_reqsize(ctr)));

return 0;

--
2.17.1

2018-08-07 21:18:35

by Kees Cook

Subject: [PATCH v8 1/9] crypto: xcbc: Remove VLA usage

In the quest to remove all stack VLA usage from the kernel[1], this uses
the maximum blocksize and adds a sanity check. For xcbc, the blocksize
must always be 16, so use that, since it's already being enforced during
instantiation.

[1] https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qPXydAacU1RqZWA@mail.gmail.com

Signed-off-by: Kees Cook <[email protected]>
---
crypto/xcbc.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/crypto/xcbc.c b/crypto/xcbc.c
index 25c75af50d3f..c055f57fab11 100644
--- a/crypto/xcbc.c
+++ b/crypto/xcbc.c
@@ -57,15 +57,17 @@ struct xcbc_desc_ctx {
u8 ctx[];
};

+#define XCBC_BLOCKSIZE 16
+
static int crypto_xcbc_digest_setkey(struct crypto_shash *parent,
const u8 *inkey, unsigned int keylen)
{
unsigned long alignmask = crypto_shash_alignmask(parent);
struct xcbc_tfm_ctx *ctx = crypto_shash_ctx(parent);
- int bs = crypto_shash_blocksize(parent);
u8 *consts = PTR_ALIGN(&ctx->ctx[0], alignmask + 1);
int err = 0;
- u8 key1[bs];
+ u8 key1[XCBC_BLOCKSIZE];
+ int bs = sizeof(key1);

if ((err = crypto_cipher_setkey(ctx->child, inkey, keylen)))
return err;
@@ -212,7 +214,7 @@ static int xcbc_create(struct crypto_template *tmpl, struct rtattr **tb)
return PTR_ERR(alg);

switch(alg->cra_blocksize) {
- case 16:
+ case XCBC_BLOCKSIZE:
break;
default:
goto out_put_alg;
--
2.17.1

2018-08-07 21:18:39

by Kees Cook

Subject: [PATCH v8 5/9] dm: Remove VLA usage from hashes

In the quest to remove all stack VLA usage from the kernel[1], this uses
the new HASH_MAX_DIGESTSIZE from the crypto layer to allocate the upper
bounds on stack usage.

[1] https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qPXydAacU1RqZWA@mail.gmail.com

Signed-off-by: Kees Cook <[email protected]>
---
drivers/md/dm-integrity.c | 23 +++++++++++++++++------
drivers/md/dm-verity-fec.c | 5 ++++-
2 files changed, 21 insertions(+), 7 deletions(-)

diff --git a/drivers/md/dm-integrity.c b/drivers/md/dm-integrity.c
index 86438b2f10dd..884edd7cf1d0 100644
--- a/drivers/md/dm-integrity.c
+++ b/drivers/md/dm-integrity.c
@@ -521,7 +521,12 @@ static void section_mac(struct dm_integrity_c *ic, unsigned section, __u8 result
}
memset(result + size, 0, JOURNAL_MAC_SIZE - size);
} else {
- __u8 digest[size];
+ __u8 digest[HASH_MAX_DIGESTSIZE];
+
+ if (WARN_ON(size > sizeof(digest))) {
+ dm_integrity_io_error(ic, "digest_size", -EINVAL);
+ goto err;
+ }
r = crypto_shash_final(desc, digest);
if (unlikely(r)) {
dm_integrity_io_error(ic, "crypto_shash_final", r);
@@ -1244,7 +1249,7 @@ static void integrity_metadata(struct work_struct *w)
struct bio *bio = dm_bio_from_per_bio_data(dio, sizeof(struct dm_integrity_io));
char *checksums;
unsigned extra_space = unlikely(digest_size > ic->tag_size) ? digest_size - ic->tag_size : 0;
- char checksums_onstack[ic->tag_size + extra_space];
+ char checksums_onstack[HASH_MAX_DIGESTSIZE];
unsigned sectors_to_process = dio->range.n_sectors;
sector_t sector = dio->range.logical_sector;

@@ -1253,8 +1258,14 @@ static void integrity_metadata(struct work_struct *w)

checksums = kmalloc((PAGE_SIZE >> SECTOR_SHIFT >> ic->sb->log2_sectors_per_block) * ic->tag_size + extra_space,
GFP_NOIO | __GFP_NORETRY | __GFP_NOWARN);
- if (!checksums)
+ if (!checksums) {
checksums = checksums_onstack;
+ if (WARN_ON(extra_space &&
+ digest_size > sizeof(checksums_onstack))) {
+ r = -EINVAL;
+ goto error;
+ }
+ }

__bio_for_each_segment(bv, bio, iter, dio->orig_bi_iter) {
unsigned pos;
@@ -1466,7 +1477,7 @@ static bool __journal_read_write(struct dm_integrity_io *dio, struct bio *bio,
} while (++s < ic->sectors_per_block);
#ifdef INTERNAL_VERIFY
if (ic->internal_hash) {
- char checksums_onstack[max(crypto_shash_digestsize(ic->internal_hash), ic->tag_size)];
+ char checksums_onstack[max(HASH_MAX_DIGESTSIZE, MAX_TAG_SIZE)];

integrity_sector_checksum(ic, logical_sector, mem + bv.bv_offset, checksums_onstack);
if (unlikely(memcmp(checksums_onstack, journal_entry_tag(ic, je), ic->tag_size))) {
@@ -1516,7 +1527,7 @@ static bool __journal_read_write(struct dm_integrity_io *dio, struct bio *bio,
if (ic->internal_hash) {
unsigned digest_size = crypto_shash_digestsize(ic->internal_hash);
if (unlikely(digest_size > ic->tag_size)) {
- char checksums_onstack[digest_size];
+ char checksums_onstack[HASH_MAX_DIGESTSIZE];
integrity_sector_checksum(ic, logical_sector, (char *)js, checksums_onstack);
memcpy(journal_entry_tag(ic, je), checksums_onstack, ic->tag_size);
} else
@@ -1937,7 +1948,7 @@ static void do_journal_write(struct dm_integrity_c *ic, unsigned write_start,
unlikely(from_replay) &&
#endif
ic->internal_hash) {
- char test_tag[max(crypto_shash_digestsize(ic->internal_hash), ic->tag_size)];
+ char test_tag[max_t(size_t, HASH_MAX_DIGESTSIZE, MAX_TAG_SIZE)];

integrity_sector_checksum(ic, sec + ((l - j) << ic->sb->log2_sectors_per_block),
(char *)access_journal_data(ic, i, l), test_tag);
diff --git a/drivers/md/dm-verity-fec.c b/drivers/md/dm-verity-fec.c
index 684af08d0747..0ce04e5b4afb 100644
--- a/drivers/md/dm-verity-fec.c
+++ b/drivers/md/dm-verity-fec.c
@@ -212,12 +212,15 @@ static int fec_read_bufs(struct dm_verity *v, struct dm_verity_io *io,
struct dm_verity_fec_io *fio = fec_io(io);
u64 block, ileaved;
u8 *bbuf, *rs_block;
- u8 want_digest[v->digest_size];
+ u8 want_digest[HASH_MAX_DIGESTSIZE];
unsigned n, k;

if (neras)
*neras = 0;

+ if (WARN_ON(v->digest_size > sizeof(want_digest)))
+ return -EINVAL;
+
/*
* read each of the rsn data blocks that are part of the RS block, and
* interleave contents to available bufs
--
2.17.1

2018-08-07 21:18:42

by Kees Cook

Subject: [PATCH v8 8/9] crypto: shash: Remove VLA usage in unaligned hashing

In the quest to remove all stack VLA usage from the kernel[1], this uses
the newly defined max alignment to perform unaligned hashing to avoid
VLAs, and drops the helper function while adding sanity checks on the
resulting buffer sizes. Additionally, the __aligned_largest macro is
removed since this helper was the only user.

[1] https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qPXydAacU1RqZWA@mail.gmail.com

Signed-off-by: Kees Cook <[email protected]>
---
crypto/shash.c | 27 ++++++++++++++++-----------
include/linux/compiler-gcc.h | 1 -
2 files changed, 16 insertions(+), 12 deletions(-)

diff --git a/crypto/shash.c b/crypto/shash.c
index 86d76b5c626c..d21f04d70dce 100644
--- a/crypto/shash.c
+++ b/crypto/shash.c
@@ -73,13 +73,6 @@ int crypto_shash_setkey(struct crypto_shash *tfm, const u8 *key,
}
EXPORT_SYMBOL_GPL(crypto_shash_setkey);

-static inline unsigned int shash_align_buffer_size(unsigned len,
- unsigned long mask)
-{
- typedef u8 __aligned_largest u8_aligned;
- return len + (mask & ~(__alignof__(u8_aligned) - 1));
-}
-
static int shash_update_unaligned(struct shash_desc *desc, const u8 *data,
unsigned int len)
{
@@ -88,11 +81,17 @@ static int shash_update_unaligned(struct shash_desc *desc, const u8 *data,
unsigned long alignmask = crypto_shash_alignmask(tfm);
unsigned int unaligned_len = alignmask + 1 -
((unsigned long)data & alignmask);
- u8 ubuf[shash_align_buffer_size(unaligned_len, alignmask)]
- __aligned_largest;
+ /*
+ * We cannot count on __aligned() working for large values:
+ * https://patchwork.kernel.org/patch/9507697/
+ */
+ u8 ubuf[MAX_ALGAPI_ALIGNMASK * 2];
u8 *buf = PTR_ALIGN(&ubuf[0], alignmask + 1);
int err;

+ if (WARN_ON(buf + unaligned_len > ubuf + sizeof(ubuf)))
+ return -EINVAL;
+
if (unaligned_len > len)
unaligned_len = len;

@@ -124,11 +123,17 @@ static int shash_final_unaligned(struct shash_desc *desc, u8 *out)
unsigned long alignmask = crypto_shash_alignmask(tfm);
struct shash_alg *shash = crypto_shash_alg(tfm);
unsigned int ds = crypto_shash_digestsize(tfm);
- u8 ubuf[shash_align_buffer_size(ds, alignmask)]
- __aligned_largest;
+ /*
+ * We cannot count on __aligned() working for large values:
+ * https://patchwork.kernel.org/patch/9507697/
+ */
+ u8 ubuf[MAX_ALGAPI_ALIGNMASK + HASH_MAX_DIGESTSIZE];
u8 *buf = PTR_ALIGN(&ubuf[0], alignmask + 1);
int err;

+ if (WARN_ON(buf + ds > ubuf + sizeof(ubuf)))
+ return -EINVAL;
+
err = shash->final(desc, buf);
if (err)
goto out;
diff --git a/include/linux/compiler-gcc.h b/include/linux/compiler-gcc.h
index f1a7492a5cc8..1f1cdef36a82 100644
--- a/include/linux/compiler-gcc.h
+++ b/include/linux/compiler-gcc.h
@@ -125,7 +125,6 @@
*/
#define __pure __attribute__((pure))
#define __aligned(x) __attribute__((aligned(x)))
-#define __aligned_largest __attribute__((aligned))
#define __printf(a, b) __attribute__((format(printf, a, b)))
#define __scanf(a, b) __attribute__((format(scanf, a, b)))
#define __attribute_const__ __attribute__((__const__))
--
2.17.1

2018-08-07 21:18:41

by Kees Cook

Subject: [PATCH v8 7/9] crypto: qat: Remove VLA usage

In the quest to remove all stack VLA usage from the kernel[1], this uses
the new upper bound for the stack buffer. Also adds a sanity check.

[1] https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qPXydAacU1RqZWA@mail.gmail.com

Signed-off-by: Kees Cook <[email protected]>
---
drivers/crypto/qat/qat_common/qat_algs.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/crypto/qat/qat_common/qat_algs.c b/drivers/crypto/qat/qat_common/qat_algs.c
index 1138e41d6805..a28edf7b792f 100644
--- a/drivers/crypto/qat/qat_common/qat_algs.c
+++ b/drivers/crypto/qat/qat_common/qat_algs.c
@@ -153,8 +153,8 @@ static int qat_alg_do_precomputes(struct icp_qat_hw_auth_algo_blk *hash,
struct sha512_state sha512;
int block_size = crypto_shash_blocksize(ctx->hash_tfm);
int digest_size = crypto_shash_digestsize(ctx->hash_tfm);
- char ipad[block_size];
- char opad[block_size];
+ char ipad[MAX_ALGAPI_BLOCKSIZE];
+ char opad[MAX_ALGAPI_BLOCKSIZE];
__be32 *hash_state_out;
__be64 *hash512_state_out;
int i, offset;
@@ -164,6 +164,10 @@ static int qat_alg_do_precomputes(struct icp_qat_hw_auth_algo_blk *hash,
shash->tfm = ctx->hash_tfm;
shash->flags = 0x0;

+ if (WARN_ON(block_size > sizeof(ipad) ||
+ sizeof(ipad) != sizeof(opad)))
+ return -EINVAL;
+
if (auth_keylen > block_size) {
int ret = crypto_shash_digest(shash, auth_key,
auth_keylen, ipad);
--
2.17.1

2018-08-07 21:18:43

by Kees Cook

Subject: [PATCH v8 9/9] crypto: skcipher: Remove VLA usage for SKCIPHER_REQUEST_ON_STACK

In the quest to remove all stack VLA usage from the kernel[1], this
caps the skcipher request size similar to other limits and adds a sanity
check at registration. Looking at instrumented tcrypt output, the largest
is for lrw:

crypt: testing lrw(aes)
crypto_skcipher_set_reqsize: 8
crypto_skcipher_set_reqsize: 88
crypto_skcipher_set_reqsize: 472
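
For context, a minimal sketch of how a caller typically uses the on-stack
request macro (generic names, not taken from this patch); with the fixed
SKCIPHER_MAX_REQSIZE bound the whole request fits on the stack regardless of
which skcipher implementation backs the tfm:

	static int example_encrypt(struct crypto_skcipher *tfm,
				   struct scatterlist *src, struct scatterlist *dst,
				   unsigned int nbytes, u8 *iv)
	{
		SKCIPHER_REQUEST_ON_STACK(req, tfm);
		int err;

		skcipher_request_set_tfm(req, tfm);
		skcipher_request_set_callback(req, 0, NULL, NULL);
		skcipher_request_set_crypt(req, src, dst, nbytes, iv);
		err = crypto_skcipher_encrypt(req);
		skcipher_request_zero(req);
		return err;
	}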

[1] https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qPXydAacU1RqZWA@mail.gmail.com

Signed-off-by: Kees Cook <[email protected]>
---
include/crypto/internal/skcipher.h | 1 +
include/crypto/skcipher.h | 4 +++-
2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/include/crypto/internal/skcipher.h b/include/crypto/internal/skcipher.h
index e42f7063f245..5035482cbe68 100644
--- a/include/crypto/internal/skcipher.h
+++ b/include/crypto/internal/skcipher.h
@@ -130,6 +130,7 @@ static inline struct crypto_skcipher *crypto_spawn_skcipher(
static inline void crypto_skcipher_set_reqsize(
struct crypto_skcipher *skcipher, unsigned int reqsize)
{
+ BUG_ON(reqsize > SKCIPHER_MAX_REQSIZE);
skcipher->reqsize = reqsize;
}

diff --git a/include/crypto/skcipher.h b/include/crypto/skcipher.h
index 2f327f090c3e..c48e194438cf 100644
--- a/include/crypto/skcipher.h
+++ b/include/crypto/skcipher.h
@@ -139,9 +139,11 @@ struct skcipher_alg {
struct crypto_alg base;
};

+#define SKCIPHER_MAX_REQSIZE 472
+
#define SKCIPHER_REQUEST_ON_STACK(name, tfm) \
char __##name##_desc[sizeof(struct skcipher_request) + \
- crypto_skcipher_reqsize(tfm)] CRYPTO_MINALIGN_ATTR; \
+ SKCIPHER_MAX_REQSIZE] CRYPTO_MINALIGN_ATTR; \
struct skcipher_request *name = (void *)__##name##_desc

/**
--
2.17.1

2018-08-07 21:18:38

by Kees Cook

Subject: [PATCH v8 4/9] crypto: hash: Remove VLA usage

In the quest to remove all stack VLA usage from the kernel[1], this
removes the VLAs in SHASH_DESC_ON_STACK (via crypto_shash_descsize())
by using the maximum allowable size (which is now more clearly captured
in a macro), along with a few other cases. Similar limits are turned into
macros as well.

A review of existing sizes shows that SHA512_DIGEST_SIZE (64) is the
largest digest size and that sizeof(struct sha3_state) (360) is the
largest descriptor size. The corresponding maximums are reduced.
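
For context, a minimal sketch of a typical SHASH_DESC_ON_STACK caller
(generic names, not code from this patch); with HASH_MAX_DESCSIZE the stack
buffer no longer depends on crypto_shash_descsize() of the particular tfm:

	static int example_digest(struct crypto_shash *tfm,
				  const u8 *data, unsigned int len, u8 *out)
	{
		SHASH_DESC_ON_STACK(desc, tfm);
		int err;

		desc->tfm = tfm;
		desc->flags = 0;
		err = crypto_shash_digest(desc, data, len, out);
		shash_desc_zero(desc);
		return err;
	}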

[1] https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qPXydAacU1RqZWA@mail.gmail.com

Signed-off-by: Kees Cook <[email protected]>
---
crypto/ahash.c | 4 ++--
crypto/algif_hash.c | 2 +-
crypto/shash.c | 6 +++---
include/crypto/hash.h | 6 +++++-
4 files changed, 11 insertions(+), 7 deletions(-)

diff --git a/crypto/ahash.c b/crypto/ahash.c
index a64c143165b1..78aaf2158c43 100644
--- a/crypto/ahash.c
+++ b/crypto/ahash.c
@@ -550,8 +550,8 @@ static int ahash_prepare_alg(struct ahash_alg *alg)
{
struct crypto_alg *base = &alg->halg.base;

- if (alg->halg.digestsize > PAGE_SIZE / 8 ||
- alg->halg.statesize > PAGE_SIZE / 8 ||
+ if (alg->halg.digestsize > HASH_MAX_DIGESTSIZE ||
+ alg->halg.statesize > HASH_MAX_STATESIZE ||
alg->halg.statesize == 0)
return -EINVAL;

diff --git a/crypto/algif_hash.c b/crypto/algif_hash.c
index bfcf595fd8f9..d0cde541beb6 100644
--- a/crypto/algif_hash.c
+++ b/crypto/algif_hash.c
@@ -239,7 +239,7 @@ static int hash_accept(struct socket *sock, struct socket *newsock, int flags,
struct alg_sock *ask = alg_sk(sk);
struct hash_ctx *ctx = ask->private;
struct ahash_request *req = &ctx->req;
- char state[crypto_ahash_statesize(crypto_ahash_reqtfm(req)) ? : 1];
+ char state[HASH_MAX_STATESIZE];
struct sock *sk2;
struct alg_sock *ask2;
struct hash_ctx *ctx2;
diff --git a/crypto/shash.c b/crypto/shash.c
index 5d732c6bb4b2..86d76b5c626c 100644
--- a/crypto/shash.c
+++ b/crypto/shash.c
@@ -458,9 +458,9 @@ static int shash_prepare_alg(struct shash_alg *alg)
{
struct crypto_alg *base = &alg->base;

- if (alg->digestsize > PAGE_SIZE / 8 ||
- alg->descsize > PAGE_SIZE / 8 ||
- alg->statesize > PAGE_SIZE / 8)
+ if (alg->digestsize > HASH_MAX_DIGESTSIZE ||
+ alg->descsize > HASH_MAX_DESCSIZE ||
+ alg->statesize > HASH_MAX_STATESIZE)
return -EINVAL;

base->cra_type = &crypto_shash_type;
diff --git a/include/crypto/hash.h b/include/crypto/hash.h
index 76e432cab75d..21587011ab0f 100644
--- a/include/crypto/hash.h
+++ b/include/crypto/hash.h
@@ -151,9 +151,13 @@ struct shash_desc {
void *__ctx[] CRYPTO_MINALIGN_ATTR;
};

+#define HASH_MAX_DIGESTSIZE 64
+#define HASH_MAX_DESCSIZE 360
+#define HASH_MAX_STATESIZE 512
+
#define SHASH_DESC_ON_STACK(shash, ctx) \
char __##shash##_desc[sizeof(struct shash_desc) + \
- crypto_shash_descsize(ctx)] CRYPTO_MINALIGN_ATTR; \
+ HASH_MAX_DESCSIZE] CRYPTO_MINALIGN_ATTR; \
struct shash_desc *shash = (struct shash_desc *)__##shash##_desc

/**
--
2.17.1

2018-08-07 21:18:40

by Kees Cook

Subject: [PATCH v8 6/9] crypto alg: Introduce generic max blocksize and alignmask

In the quest to remove all stack VLA usage from the kernel[1], this
exposes a new general upper bound on crypto blocksize and alignmask
(higher than for the existing cipher limits) for VLA removal,
and introduces new checks.

At present, the highest cra_alignmask in the kernel is 63. The highest
cra_blocksize is 144 (SHA3_224_BLOCK_SIZE, 18 8-byte words). For the
new blocksize limit, I went with 160 (20 8-byte words).
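
A minimal sketch of how these maximums are meant to be consumed for stack
buffers (assumed names, not code from this patch): size the buffer for the
worst-case alignmask plus blocksize, align the pointer at runtime, and
sanity-check the result:

	static int example_use_block(struct crypto_shash *tfm, unsigned int bs)
	{
		unsigned long alignmask = crypto_shash_alignmask(tfm);
		u8 ubuf[MAX_ALGAPI_ALIGNMASK + MAX_ALGAPI_BLOCKSIZE];
		u8 *buf = PTR_ALIGN(&ubuf[0], alignmask + 1);

		if (WARN_ON(buf + bs > ubuf + sizeof(ubuf)))
			return -EINVAL;

		/* ... buf is now (alignmask + 1)-aligned and holds up to bs bytes ... */
		return 0;
	}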

[1] https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qPXydAacU1RqZWA@mail.gmail.com

Signed-off-by: Kees Cook <[email protected]>
---
crypto/algapi.c | 7 ++++++-
include/crypto/algapi.h | 4 +++-
2 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/crypto/algapi.c b/crypto/algapi.c
index c0755cf4f53f..496fc51bf215 100644
--- a/crypto/algapi.c
+++ b/crypto/algapi.c
@@ -57,9 +57,14 @@ static int crypto_check_alg(struct crypto_alg *alg)
if (alg->cra_alignmask & (alg->cra_alignmask + 1))
return -EINVAL;

- if (alg->cra_blocksize > PAGE_SIZE / 8)
+ /* General maximums for all algs. */
+ if (alg->cra_alignmask > MAX_ALGAPI_ALIGNMASK)
return -EINVAL;

+ if (alg->cra_blocksize > MAX_ALGAPI_BLOCKSIZE)
+ return -EINVAL;
+
+ /* Lower maximums for specific alg types. */
if (!alg->cra_type && (alg->cra_flags & CRYPTO_ALG_TYPE_MASK) ==
CRYPTO_ALG_TYPE_CIPHER) {
if (alg->cra_alignmask > MAX_CIPHER_ALIGNMASK)
diff --git a/include/crypto/algapi.h b/include/crypto/algapi.h
index bd5e8ccf1687..21371ac8f355 100644
--- a/include/crypto/algapi.h
+++ b/include/crypto/algapi.h
@@ -20,8 +20,10 @@
/*
* Maximum values for blocksize and alignmask, used to allocate
* static buffers that are big enough for any combination of
- * ciphers and architectures.
+ * algs and architectures. Ciphers have a lower maximum size.
*/
+#define MAX_ALGAPI_BLOCKSIZE 160
+#define MAX_ALGAPI_ALIGNMASK 63
#define MAX_CIPHER_BLOCKSIZE 16
#define MAX_CIPHER_ALIGNMASK 15

--
2.17.1

2018-08-07 21:18:36

by Kees Cook

Subject: [PATCH v8 2/9] crypto: cbc: Remove VLA usage

In the quest to remove all stack VLA usage from the kernel[1], this
uses the upper bounds on blocksize. Since this is always a cipher
blocksize, use the existing cipher max blocksize.

[1] https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qPXydAacU1RqZWA@mail.gmail.com

Signed-off-by: Kees Cook <[email protected]>
---
include/crypto/cbc.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/crypto/cbc.h b/include/crypto/cbc.h
index f5b8bfc22e6d..3bf28beefa33 100644
--- a/include/crypto/cbc.h
+++ b/include/crypto/cbc.h
@@ -113,7 +113,7 @@ static inline int crypto_cbc_decrypt_inplace(
unsigned int bsize = crypto_skcipher_blocksize(tfm);
unsigned int nbytes = walk->nbytes;
u8 *src = walk->src.virt.addr;
- u8 last_iv[bsize];
+ u8 last_iv[MAX_CIPHER_BLOCKSIZE];

/* Start of the last block. */
src += nbytes - (nbytes & (bsize - 1)) - bsize;
--
2.17.1

2018-09-04 03:13:18

by Herbert Xu

Subject: Re: [PATCH v8 5/9] dm: Remove VLA usage from hashes

On Tue, Aug 07, 2018 at 02:18:39PM -0700, Kees Cook wrote:
> In the quest to remove all stack VLA usage from the kernel[1], this uses
> the new HASH_MAX_DIGESTSIZE from the crypto layer to allocate the upper
> bounds on stack usage.
>
> [1] https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qPXydAacU1RqZWA@mail.gmail.com
>
> Signed-off-by: Kees Cook <[email protected]>

Can the dm folks please review this patch?

Thanks,
--
Email: Herbert Xu <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

2018-09-04 03:15:09

by Herbert Xu

Subject: Re: [PATCH v8 9/9] crypto: skcipher: Remove VLA usage for SKCIPHER_REQUEST_ON_STACK

On Tue, Aug 07, 2018 at 02:18:43PM -0700, Kees Cook wrote:
> In the quest to remove all stack VLA usage from the kernel[1], this
> caps the skcipher request size similar to other limits and adds a sanity
> check at registration. Looking at instrumented tcrypt output, the largest
> is for lrw:
>
> crypt: testing lrw(aes)
> crypto_skcipher_set_reqsize: 8
> crypto_skcipher_set_reqsize: 88
> crypto_skcipher_set_reqsize: 472
>
> [1] https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qPXydAacU1RqZWA@mail.gmail.com
>
> Signed-off-by: Kees Cook <[email protected]>
> ---
> include/crypto/internal/skcipher.h | 1 +
> include/crypto/skcipher.h | 4 +++-
> 2 files changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/include/crypto/internal/skcipher.h b/include/crypto/internal/skcipher.h
> index e42f7063f245..5035482cbe68 100644
> --- a/include/crypto/internal/skcipher.h
> +++ b/include/crypto/internal/skcipher.h
> @@ -130,6 +130,7 @@ static inline struct crypto_skcipher *crypto_spawn_skcipher(
> static inline void crypto_skcipher_set_reqsize(
> struct crypto_skcipher *skcipher, unsigned int reqsize)
> {
> + BUG_ON(reqsize > SKCIPHER_MAX_REQSIZE);

Please do not add these BUG_ONs. Instead allow this function to
fail and check for the failure in the caller.
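
One possible shape for that, sketched purely as an assumption here rather
than an actual patch from this thread: let the setter return an error and
have callers (e.g. an init_tfm handler) propagate it:

	static inline int crypto_skcipher_set_reqsize(
		struct crypto_skcipher *skcipher, unsigned int reqsize)
	{
		if (WARN_ON(reqsize > SKCIPHER_MAX_REQSIZE))
			return -EINVAL;

		skcipher->reqsize = reqsize;
		return 0;
	}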

Thanks,
--
Email: Herbert Xu <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

2018-09-04 05:19:05

by Herbert Xu

Subject: Re: [PATCH v8 0/9] crypto: Remove VLA usage

On Tue, Aug 07, 2018 at 02:18:34PM -0700, Kees Cook wrote:
> v8 cover letter:
>
> I continue to hope this can land in v4.19, but I realize that's unlikely.
> It would be nice, though, if some of the "trivial" patches could get taken
> (e.g. cbc, xcbc, ccm VLA removals) so I don't have to keep repeating them.
> *fingers crossed*
>
> Series cover letter:
>
> This is nearly the last of the VLA removals[1], but it's one of the
> largest because crypto gets used in lots of places. After looking
> through code, usage, reading the threads Gustavo started, and comparing
> the use-cases to the other VLA removals that have landed in the kernel,
> I think this series is likely the best way forward to shut the door on
> VLAs forever.
>
> For background, the crypto stack usage is for callers to do an immediate
> bit of work that doesn't allocate new memory. This means that other VLA
> removal techniques (like just using kmalloc) aren't workable, and the
> next common technique is needed: examination of maximum stack usage and
> the addition of sanity checks. This series does that, and in several
> cases, these maximums were already implicit in the code.
>
> This series is intended to land via the crypto tree for 4.19, though it
> touches dm, networking, and a few other things as well, since there are
> dependent patches (new crypto #defines being used, etc).

I have applied patches 1-4 and 6-8. I'd like to get an ack from
the dm folks regarding patch 5. As to patch 9, please fix it so
it doesn't rely on the BUG_ON to catch things.

Thanks,
--
Email: Herbert Xu <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

2018-09-04 05:50:41

by Kees Cook

Subject: Re: [PATCH v8 0/9] crypto: Remove VLA usage

On Mon, Sep 3, 2018 at 10:19 PM, Herbert Xu <[email protected]> wrote:
> On Tue, Aug 07, 2018 at 02:18:34PM -0700, Kees Cook wrote:
>> v8 cover letter:
>>
>> I continue to hope this can land in v4.19, but I realize that's unlikely.
>> It would be nice, though, if some of the "trivial" patches could get taken
>> (e.g. cbc, xcbc, ccm VLA removals) so I don't have to keep repeating them.
>> *fingers crossed*
>>
>> Series cover letter:
>>
>> This is nearly the last of the VLA removals[1], but it's one of the
>> largest because crypto gets used in lots of places. After looking
>> through code, usage, reading the threads Gustavo started, and comparing
>> the use-cases to the other VLA removals that have landed in the kernel,
>> I think this series is likely the best way forward to shut the door on
>> VLAs forever.
>>
>> For background, the crypto stack usage is for callers to do an immediate
>> bit of work that doesn't allocate new memory. This means that other VLA
>> removal techniques (like just using kmalloc) aren't workable, and the
>> next common technique is needed: examination of maximum stack usage and
>> the addition of sanity checks. This series does that, and in several
>> cases, these maximums were already implicit in the code.
>>
>> This series is intended to land via the crypto tree for 4.19, though it
>> touches dm, networking, and a few other things as well, since there are
>> dependent patches (new crypto #defines being used, etc).
>
> I have applied patches 1-4 and 6-8. I'd like to get an ack from
> the dm folks regarding patch 5. As to patch 9, please fix it so
> it doesn't rely on the BUG_ON to catch things.

Great! Thanks very much. I'll get a patch prepared to plumb
crypto_skcipher_set_reqsize() failures.

-Kees

--
Kees Cook
Pixel Security

2018-09-13 16:41:45

by Kees Cook

Subject: Re: [PATCH v8 5/9] dm: Remove VLA usage from hashes

On Mon, Sep 3, 2018 at 8:13 PM, Herbert Xu <[email protected]> wrote:
> On Tue, Aug 07, 2018 at 02:18:39PM -0700, Kees Cook wrote:
>> In the quest to remove all stack VLA usage from the kernel[1], this uses
>> the new HASH_MAX_DIGESTSIZE from the crypto layer to allocate the upper
>> bounds on stack usage.
>>
>> [1] https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qPXydAacU1RqZWA@mail.gmail.com
>>
>> Signed-off-by: Kees Cook <[email protected]>
>
> Can the dm folks please review this patch?

Mike or Alasdair, can you Ack this patch so Herbert can include it in
the crypto tree? This is blocking some VLA removals[1]...

Thanks!

-Kees

[1] https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qPXydAacU1RqZWA@mail.gmail.com

--
Kees Cook
Pixel Security

2018-09-13 17:54:39

by Mike Snitzer

Subject: Re: [PATCH v8 5/9] dm: Remove VLA usage from hashes

On Tue, Aug 07 2018 at 5:18pm -0400,
Kees Cook <[email protected]> wrote:

> In the quest to remove all stack VLA usage from the kernel[1], this uses
> the new HASH_MAX_DIGESTSIZE from the crypto layer to allocate the upper
> bounds on stack usage.
>
> [1] https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qPXydAacU1RqZWA@mail.gmail.com
>
> Signed-off-by: Kees Cook <[email protected]>
> ---
> drivers/md/dm-integrity.c | 23 +++++++++++++++++------
> drivers/md/dm-verity-fec.c | 5 ++++-
> 2 files changed, 21 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/md/dm-integrity.c b/drivers/md/dm-integrity.c
> index 86438b2f10dd..884edd7cf1d0 100644
> --- a/drivers/md/dm-integrity.c
> +++ b/drivers/md/dm-integrity.c
> @@ -521,7 +521,12 @@ static void section_mac(struct dm_integrity_c *ic, unsigned section, __u8 result
> }
> memset(result + size, 0, JOURNAL_MAC_SIZE - size);
> } else {
> - __u8 digest[size];
> + __u8 digest[HASH_MAX_DIGESTSIZE];
> +
> + if (WARN_ON(size > sizeof(digest))) {
> + dm_integrity_io_error(ic, "digest_size", -EINVAL);
> + goto err;
> + }
> r = crypto_shash_final(desc, digest);
> if (unlikely(r)) {
> dm_integrity_io_error(ic, "crypto_shash_final", r);
> @@ -1244,7 +1249,7 @@ static void integrity_metadata(struct work_struct *w)
> struct bio *bio = dm_bio_from_per_bio_data(dio, sizeof(struct dm_integrity_io));
> char *checksums;
> unsigned extra_space = unlikely(digest_size > ic->tag_size) ? digest_size - ic->tag_size : 0;
> - char checksums_onstack[ic->tag_size + extra_space];
> + char checksums_onstack[HASH_MAX_DIGESTSIZE];
> unsigned sectors_to_process = dio->range.n_sectors;
> sector_t sector = dio->range.logical_sector;
>
> @@ -1253,8 +1258,14 @@ static void integrity_metadata(struct work_struct *w)
>
> checksums = kmalloc((PAGE_SIZE >> SECTOR_SHIFT >> ic->sb->log2_sectors_per_block) * ic->tag_size + extra_space,
> GFP_NOIO | __GFP_NORETRY | __GFP_NOWARN);
> - if (!checksums)
> + if (!checksums) {
> checksums = checksums_onstack;
> + if (WARN_ON(extra_space &&
> + digest_size > sizeof(checksums_onstack))) {
> + r = -EINVAL;
> + goto error;
> + }
> + }
>
> __bio_for_each_segment(bv, bio, iter, dio->orig_bi_iter) {
> unsigned pos;

Given the length of the kmalloc() just prior to this new WARN_ON() line
I'm not seeing why you've elected to split the WARN_ON across multiple
lines.

But that style nit aside:

Acked-by: Mike Snitzer <[email protected]>

2018-09-14 06:10:36

by Herbert Xu

Subject: Re: [PATCH v8 5/9] dm: Remove VLA usage from hashes

On Thu, Sep 13, 2018 at 01:54:39PM -0400, Mike Snitzer wrote:
>
> Acked-by: Mike Snitzer <[email protected]>

Patch applied. Thanks.
--
Email: Herbert Xu <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

2018-09-25 16:12:10

by Arnd Bergmann

Subject: Re: [PATCH v8 7/9] crypto: qat: Remove VLA usage

On Tue, Aug 7, 2018 at 11:18 PM Kees Cook <[email protected]> wrote:
>
> In the quest to remove all stack VLA usage from the kernel[1], this uses
> the new upper bound for the stack buffer. Also adds a sanity check.
>
> [1] https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qPXydAacU1RqZWA@mail.gmail.com
>
> Signed-off-by: Kees Cook <[email protected]>

After rebasing to linux-next, I now get a warning about this file:

drivers/crypto/qat/qat_common/qat_algs.c: In function 'qat_alg_do_precomputes':
drivers/crypto/qat/qat_common/qat_algs.c:257:1: error: the frame size
of 1112 bytes is larger than 1024 bytes [-Werror=frame-larger-than=]

I assume it was already possible to get into that state with the VLA,
but it seems bad enough that I think we need to do something
about it.

The large stack variables add up to 1084 bytes, which fully explains
how we got here:

SHASH_DESC_ON_STACK(shash, ctx->hash_tfm); /* 360 */
struct sha1_state sha1; /* 92 */
struct sha256_state sha256; /* 104 */
struct sha512_state sha512; /* 208 */
char ipad[MAX_ALGAPI_BLOCKSIZE]; /* 160 */
char opad[MAX_ALGAPI_BLOCKSIZE]; /* 160 */

The question is what we can do about it. One simple step I've tried
is to move the sha1/sha256/sha512 into a union, which saves around
200 bytes and should bring us (slightly) below the warning
limit, but I suspect we can do better than that. Any ideas?
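
For reference, a sketch of that first step (assumed, not a posted patch):
the three hash states overlap, so only the largest (sha512, 208 bytes) is
charged to the frame:

	union {
		struct sha1_state sha1;		/* 92 bytes */
		struct sha256_state sha256;	/* 104 bytes */
		struct sha512_state sha512;	/* 208 bytes */
	} hash_state;	/* ~208 bytes on the stack instead of ~404 */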

Arnd

2018-09-26 08:44:41

by Ard Biesheuvel

Subject: Re: [PATCH v8 7/9] crypto: qat: Remove VLA usage

On Tue, 25 Sep 2018 at 18:12, Arnd Bergmann <[email protected]> wrote:
>
> On Tue, Aug 7, 2018 at 11:18 PM Kees Cook <[email protected]> wrote:
> >
> > In the quest to remove all stack VLA usage from the kernel[1], this uses
> > the new upper bound for the stack buffer. Also adds a sanity check.
> >
> > [1] https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qPXydAacU1RqZWA@mail.gmail.com
> >
> > Signed-off-by: Kees Cook <[email protected]>
>
> After rebasing to linux-next, I now get a warning about this file:
>
> drivers/crypto/qat/qat_common/qat_algs.c: In function 'qat_alg_do_precomputes':
> drivers/crypto/qat/qat_common/qat_algs.c:257:1: error: the frame size
> of 1112 bytes is larger than 1024 bytes [-Werror=frame-larger-than=]
>
> I assume it was already possible to get into that state with the VLA,
> but it seems bad enough that I think we need to do something
> about it.
>
> The large stack variables add up to 1084 bytes, which fully explains
> how we got here:
>
> SHASH_DESC_ON_STACK(shash, ctx->hash_tfm); /* 360 */
> struct sha1_state sha1; /* 92 */
> struct sha256_state sha256; /* 104 */
> struct sha512_state sha512; /* 208 */
> char ipad[MAX_ALGAPI_BLOCKSIZE]; /* 160 */
> char opad[MAX_ALGAPI_BLOCKSIZE]; /* 160 */
>
> The question is what we can do about it. One simple step I've tried
> is to move the sha1/sha256/sha512 into a union, which saves around
> 200 bytes and should bring us (slightly) below the warning
> limit, but I suspect we can do better than that. Any ideas?
>

All the processing takes place in the context of a setkey() operation,
which means only one such operation should be in flight per tfm at any
given time. So we could move all these pieces into the tfm context
struct instead. Something like the below [untested] (whitespace
mangling courtesy of Gmail)

diff --git a/drivers/crypto/qat/qat_common/qat_algs.c
b/drivers/crypto/qat/qat_common/qat_algs.c
index 1138e41d6805..c14f1906bdf4 100644
--- a/drivers/crypto/qat/qat_common/qat_algs.c
+++ b/drivers/crypto/qat/qat_common/qat_algs.c
@@ -113,6 +113,13 @@ struct qat_alg_aead_ctx {
struct crypto_shash *hash_tfm;
enum icp_qat_hw_auth_algo qat_hash_alg;
struct qat_crypto_instance *inst;
+ union {
+ struct sha1_state sha1;
+ struct sha256_state sha256;
+ struct sha512_state sha512;
+ };
+ char ipad[MAX_ALGAPI_BLOCKSIZE];
+ char opad[MAX_ALGAPI_BLOCKSIZE];
};

struct qat_alg_ablkcipher_ctx {
@@ -148,37 +155,32 @@ static int qat_alg_do_precomputes(struct
icp_qat_hw_auth_algo_blk *hash,
unsigned int auth_keylen)
{
SHASH_DESC_ON_STACK(shash, ctx->hash_tfm);
- struct sha1_state sha1;
- struct sha256_state sha256;
- struct sha512_state sha512;
int block_size = crypto_shash_blocksize(ctx->hash_tfm);
int digest_size = crypto_shash_digestsize(ctx->hash_tfm);
- char ipad[block_size];
- char opad[block_size];
__be32 *hash_state_out;
__be64 *hash512_state_out;
int i, offset;

- memset(ipad, 0, block_size);
- memset(opad, 0, block_size);
+ memset(ctx->ipad, 0, block_size);
+ memset(ctx->opad, 0, block_size);
shash->tfm = ctx->hash_tfm;
shash->flags = 0x0;

if (auth_keylen > block_size) {
int ret = crypto_shash_digest(shash, auth_key,
- auth_keylen, ipad);
+ auth_keylen, ctx->ipad);
if (ret)
return ret;

- memcpy(opad, ipad, digest_size);
+ memcpy(ctx->opad, ctx->ipad, digest_size);
} else {
- memcpy(ipad, auth_key, auth_keylen);
- memcpy(opad, auth_key, auth_keylen);
+ memcpy(ctx->ipad, auth_key, auth_keylen);
+ memcpy(ctx->opad, auth_key, auth_keylen);
}

for (i = 0; i < block_size; i++) {
- char *ipad_ptr = ipad + i;
- char *opad_ptr = opad + i;
+ char *ipad_ptr = ctx->ipad + i;
+ char *opad_ptr = ctx->opad + i;
*ipad_ptr ^= HMAC_IPAD_VALUE;
*opad_ptr ^= HMAC_OPAD_VALUE;
}
@@ -186,7 +188,7 @@ static int qat_alg_do_precomputes(struct
icp_qat_hw_auth_algo_blk *hash,
if (crypto_shash_init(shash))
return -EFAULT;

- if (crypto_shash_update(shash, ipad, block_size))
+ if (crypto_shash_update(shash, ctx->ipad, block_size))
return -EFAULT;

hash_state_out = (__be32 *)hash->sha.state1;
@@ -194,22 +196,22 @@ static int qat_alg_do_precomputes(struct
icp_qat_hw_auth_algo_blk *hash,

switch (ctx->qat_hash_alg) {
case ICP_QAT_HW_AUTH_ALGO_SHA1:
- if (crypto_shash_export(shash, &sha1))
+ if (crypto_shash_export(shash, &ctx->sha1))
return -EFAULT;
for (i = 0; i < digest_size >> 2; i++, hash_state_out++)
- *hash_state_out = cpu_to_be32(*(sha1.state + i));
+ *hash_state_out = cpu_to_be32(*(ctx->sha1.state + i));
break;
case ICP_QAT_HW_AUTH_ALGO_SHA256:
- if (crypto_shash_export(shash, &sha256))
+ if (crypto_shash_export(shash, &ctx->sha256))
return -EFAULT;
for (i = 0; i < digest_size >> 2; i++, hash_state_out++)
- *hash_state_out = cpu_to_be32(*(sha256.state + i));
+ *hash_state_out = cpu_to_be32(*(ctx->sha256.state + i));
break;
case ICP_QAT_HW_AUTH_ALGO_SHA512:
- if (crypto_shash_export(shash, &sha512))
+ if (crypto_shash_export(shash, &ctx->sha512))
return -EFAULT;
for (i = 0; i < digest_size >> 3; i++, hash512_state_out++)
- *hash512_state_out = cpu_to_be64(*(sha512.state + i));
+ *hash512_state_out = cpu_to_be64(*(ctx->sha512.state + i));
break;
default:
return -EFAULT;
@@ -218,7 +220,7 @@ static int qat_alg_do_precomputes(struct
icp_qat_hw_auth_algo_blk *hash,
if (crypto_shash_init(shash))
return -EFAULT;

- if (crypto_shash_update(shash, opad, block_size))
+ if (crypto_shash_update(shash, ctx->opad, block_size))
return -EFAULT;

offset = round_up(qat_get_inter_state_size(ctx->qat_hash_alg), 8);
@@ -227,28 +229,28 @@ static int qat_alg_do_precomputes(struct
icp_qat_hw_auth_algo_blk *hash,

switch (ctx->qat_hash_alg) {
case ICP_QAT_HW_AUTH_ALGO_SHA1:
- if (crypto_shash_export(shash, &sha1))
+ if (crypto_shash_export(shash, &ctx->sha1))
return -EFAULT;
for (i = 0; i < digest_size >> 2; i++, hash_state_out++)
- *hash_state_out = cpu_to_be32(*(sha1.state + i));
+ *hash_state_out = cpu_to_be32(*(ctx->sha1.state + i));
break;
case ICP_QAT_HW_AUTH_ALGO_SHA256:
- if (crypto_shash_export(shash, &sha256))
+ if (crypto_shash_export(shash, &ctx->sha256))
return -EFAULT;
for (i = 0; i < digest_size >> 2; i++, hash_state_out++)
- *hash_state_out = cpu_to_be32(*(sha256.state + i));
+ *hash_state_out = cpu_to_be32(*(ctx->sha256.state + i));
break;
case ICP_QAT_HW_AUTH_ALGO_SHA512:
- if (crypto_shash_export(shash, &sha512))
+ if (crypto_shash_export(shash, &ctx->sha512))
return -EFAULT;
for (i = 0; i < digest_size >> 3; i++, hash512_state_out++)
- *hash512_state_out = cpu_to_be64(*(sha512.state + i));
+ *hash512_state_out = cpu_to_be64(*(ctx->sha512.state + i));
break;
default:
return -EFAULT;
}
- memzero_explicit(ipad, block_size);
- memzero_explicit(opad, block_size);
+ memzero_explicit(ctx->ipad, block_size);
+ memzero_explicit(ctx->opad, block_size);
return 0;
}

2018-09-26 08:53:42

by Arnd Bergmann

Subject: Re: [PATCH v8 7/9] crypto: qat: Remove VLA usage

On Wed, Sep 26, 2018 at 10:44 AM Ard Biesheuvel
<[email protected]> wrote:
>
> On Tue, 25 Sep 2018 at 18:12, Arnd Bergmann <[email protected]> wrote:
> >
> > On Tue, Aug 7, 2018 at 11:18 PM Kees Cook <[email protected]> wrote:
> > >
> > > In the quest to remove all stack VLA usage from the kernel[1], this uses
> > > the new upper bound for the stack buffer. Also adds a sanity check.
> > >
> > > [1] https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qPXydAacU1RqZWA@mail.gmail.com
> > >
> > > Signed-off-by: Kees Cook <[email protected]>
> >
> > After rebasing to linux-next, I now get a warning about this file:
> >
> > drivers/crypto/qat/qat_common/qat_algs.c: In function 'qat_alg_do_precomputes':
> > drivers/crypto/qat/qat_common/qat_algs.c:257:1: error: the frame size
> > of 1112 bytes is larger than 1024 bytes [-Werror=frame-larger-than=]
> >
> > I assume it was already possible to get into that state with the VLA,
> > but it seems bad enough that I think we need to do something
> > about it.
> >
> > The large stack variables add up to 1084 bytes, which fully explains
> > how we got here:
> >
> > SHASH_DESC_ON_STACK(shash, ctx->hash_tfm); /* 360 */
> > struct sha1_state sha1; /* 92 */
> > struct sha256_state sha256; /* 104 */
> > struct sha512_state sha512; /* 208 */
> > char ipad[MAX_ALGAPI_BLOCKSIZE]; /* 160 */
> > char opad[MAX_ALGAPI_BLOCKSIZE]; /* 160 */
> >
> > The question is what we can do about it. One simple step I've tried
> > is to move the sha1/sha256/sha512 into a union, which saves around
> > 200 bytes and should bring us (slightly) below the warning
> > limit, but I suspect we can do better than that. Any ideas?
> >
>
> All the processing takes place in the context of a setkey() operation,
> which means only one such operation should be in flight per tfm at any
> given time. So we could move all these pieces into the tfm context
> struct instead. Something like the below [untested] (whitespace
> mangling courtesy of Gmail)

Ah, right, this is what I was hoping for. I assume we already guarantee
that this context is never put on the stack somewhere else, right?

Arnd

2018-09-26 08:55:10

by Ard Biesheuvel

Subject: Re: [PATCH v8 7/9] crypto: qat: Remove VLA usage

On Wed, 26 Sep 2018 at 10:54, Arnd Bergmann <[email protected]> wrote:
>
> On Wed, Sep 26, 2018 at 10:44 AM Ard Biesheuvel
> <[email protected]> wrote:
> >
> > On Tue, 25 Sep 2018 at 18:12, Arnd Bergmann <[email protected]> wrote:
> > >
> > > On Tue, Aug 7, 2018 at 11:18 PM Kees Cook <[email protected]> wrote:
> > > >
> > > > In the quest to remove all stack VLA usage from the kernel[1], this uses
> > > > the new upper bound for the stack buffer. Also adds a sanity check.
> > > >
> > > > [1] https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qPXydAacU1RqZWA@mail.gmail.com
> > > >
> > > > Signed-off-by: Kees Cook <[email protected]>
> > >
> > > After rebasing to linux-next, I now get a warning about this file:
> > >
> > > drivers/crypto/qat/qat_common/qat_algs.c: In function 'qat_alg_do_precomputes':
> > > drivers/crypto/qat/qat_common/qat_algs.c:257:1: error: the frame size
> > > of 1112 bytes is larger than 1024 bytes [-Werror=frame-larger-than=]
> > >
> > > I assume it was already possible to get into that state with the VLA,
> > > but it seems bad enough that I think we need to do something
> > > about it.
> > >
> > > The large stack variables add up to 1084 bytes, which fully explains
> > > how we got here:
> > >
> > > SHASH_DESC_ON_STACK(shash, ctx->hash_tfm); /* 360 */
> > > struct sha1_state sha1; /* 92 */
> > > struct sha256_state sha256; /* 104 */
> > > struct sha512_state sha512; /* 208 */
> > > char ipad[MAX_ALGAPI_BLOCKSIZE]; /* 160 */
> > > char opad[MAX_ALGAPI_BLOCKSIZE]; /* 160 */
> > >
> > > The question is what we can do about it. One simple step I've tried
> > > is to move the sha1/sha256/sha512 into a union, which saves around
> > > 200 bytes and should bring us (slightly) below the warning
> > > limit, but I suspect we can do better than that. Any ideas?
> > >
> >
> > All the processing takes place in the context of a setkey() operation,
> > which means only one such operation should be in flight per tfm at any
> > given time. So we could move all these pieces into the tfm context
> > struct instead. Something like the below [untested] (whitespace
> > mangling courtesy of Gmail)
>
> Ah, right, this is what I was hoping for. I assume we already guarantee
> that this context is never put on the stack somewhere else, right?
>

Yes.