v3:
- drop *_REQUEST_ON_STACK patches. The rest of this series is pretty
straight-forward, and I'd like to get them tested in -next while
we continue to chip away at the *_REQUEST_ON_STACK VLA removal patches
separately. "Part 2" will continue with those.
v2:
- use 512 instead of PAGE_SIZE / 8 to avoid bloat on large-page archs.
- swtich xcbc to "16" max universally.
- fix typo in bounds check for dm buffer size.
- examine actual reqsizes for skcipher and ahash instead of guessing.
- improve names and comments for alg maxes
This is nearly the last of the VLA removals[1], but it's one of the
largest because crypto gets used in lots of places. After looking
through code, usage, reading the threads Gustavo started, and comparing
the use-cases to the other VLA removals that have landed in the kernel,
I think this series is likely the best way forward to shut the door on
VLAs forever.
As background, the crypto stack usage is for callers to do an immediate
bit of work that doesn't allocate new memory. This means that other VLA
removal techniques (like just using kmalloc) aren't workable, and the
next common technique is needed: examination of maximum stack usage and
the addition of sanity checks. This series does that, and in several
cases, these maximums were already implicit in the code.
This series is intended to land via the crypto tree, though it touches
dm as well, since there are dependent patches (new crypto #defines
being used).
Thanks!
-Kees
[1] https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qPXydAacU1RqZWA@mail.gmail.com
Kees Cook (9):
crypto: xcbc: Remove VLA usage
crypto: cbc: Remove VLA usage
crypto: shash: Remove VLA usage
dm integrity: Remove VLA usage
crypto: ahash: Remove VLA usage
dm verity fec: Remove VLA usage
crypto alg: Introduce generic max blocksize and alignmask
crypto: qat: Remove VLA usage
crypto: shash: Remove VLA usage in unaligned hashing
crypto/ahash.c | 4 ++--
crypto/algapi.c | 7 ++++++-
crypto/algif_hash.c | 2 +-
crypto/shash.c | 25 +++++++++++-------------
crypto/xcbc.c | 9 +++++++--
drivers/crypto/qat/qat_common/qat_algs.c | 8 ++++++--
drivers/md/dm-integrity.c | 23 ++++++++++++++++------
drivers/md/dm-verity-fec.c | 5 ++++-
include/crypto/algapi.h | 4 +++-
include/crypto/cbc.h | 4 +++-
include/crypto/hash.h | 9 ++++++++-
include/linux/compiler-gcc.h | 1 -
12 files changed, 68 insertions(+), 33 deletions(-)
--
2.17.1
In the quest to remove all stack VLA usage from the kernel[1], this
removes the VLAs in SHASH_DESC_ON_STACK (via crypto_shash_descsize())
by using the maximum allowable size (which is now more clearly captured
in a macro). Similar limits are turned into macros as well.
[1] https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qPXydAacU1RqZWA@mail.gmail.com
Signed-off-by: Kees Cook <[email protected]>
---
crypto/shash.c | 6 +++---
include/crypto/hash.h | 6 +++++-
2 files changed, 8 insertions(+), 4 deletions(-)
diff --git a/crypto/shash.c b/crypto/shash.c
index 5d732c6bb4b2..ab6902c6dae7 100644
--- a/crypto/shash.c
+++ b/crypto/shash.c
@@ -458,9 +458,9 @@ static int shash_prepare_alg(struct shash_alg *alg)
{
struct crypto_alg *base = &alg->base;
- if (alg->digestsize > PAGE_SIZE / 8 ||
- alg->descsize > PAGE_SIZE / 8 ||
- alg->statesize > PAGE_SIZE / 8)
+ if (alg->digestsize > SHASH_MAX_DIGESTSIZE ||
+ alg->descsize > SHASH_MAX_DESCSIZE ||
+ alg->statesize > SHASH_MAX_STATESIZE)
return -EINVAL;
base->cra_type = &crypto_shash_type;
diff --git a/include/crypto/hash.h b/include/crypto/hash.h
index 76e432cab75d..804e87a9510e 100644
--- a/include/crypto/hash.h
+++ b/include/crypto/hash.h
@@ -151,9 +151,13 @@ struct shash_desc {
void *__ctx[] CRYPTO_MINALIGN_ATTR;
};
+#define SHASH_MAX_DIGESTSIZE 512
+#define SHASH_MAX_DESCSIZE 512
+#define SHASH_MAX_STATESIZE 512
+
#define SHASH_DESC_ON_STACK(shash, ctx) \
char __##shash##_desc[sizeof(struct shash_desc) + \
- crypto_shash_descsize(ctx)] CRYPTO_MINALIGN_ATTR; \
+ SHASH_MAX_DESCSIZE] CRYPTO_MINALIGN_ATTR; \
struct shash_desc *shash = (struct shash_desc *)__##shash##_desc
/**
--
2.17.1
In the quest to remove all stack VLA usage from the kernel[1], this uses
the new upper bound for the stack buffer. Also adds a sanity check.
[1] https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qPXydAacU1RqZWA@mail.gmail.com
Signed-off-by: Kees Cook <[email protected]>
---
drivers/crypto/qat/qat_common/qat_algs.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/drivers/crypto/qat/qat_common/qat_algs.c b/drivers/crypto/qat/qat_common/qat_algs.c
index 1138e41d6805..a28edf7b792f 100644
--- a/drivers/crypto/qat/qat_common/qat_algs.c
+++ b/drivers/crypto/qat/qat_common/qat_algs.c
@@ -153,8 +153,8 @@ static int qat_alg_do_precomputes(struct icp_qat_hw_auth_algo_blk *hash,
struct sha512_state sha512;
int block_size = crypto_shash_blocksize(ctx->hash_tfm);
int digest_size = crypto_shash_digestsize(ctx->hash_tfm);
- char ipad[block_size];
- char opad[block_size];
+ char ipad[MAX_ALGAPI_BLOCKSIZE];
+ char opad[MAX_ALGAPI_BLOCKSIZE];
__be32 *hash_state_out;
__be64 *hash512_state_out;
int i, offset;
@@ -164,6 +164,10 @@ static int qat_alg_do_precomputes(struct icp_qat_hw_auth_algo_blk *hash,
shash->tfm = ctx->hash_tfm;
shash->flags = 0x0;
+ if (WARN_ON(block_size > sizeof(ipad) ||
+ sizeof(ipad) != sizeof(opad)))
+ return -EINVAL;
+
if (auth_keylen > block_size) {
int ret = crypto_shash_digest(shash, auth_key,
auth_keylen, ipad);
--
2.17.1
In the quest to remove all stack VLA usage from the kernel[1], this uses
the newly defined max alignment to perform unaligned hashing to avoid
VLAs, and drops the helper function while adding sanity checks on the
resulting buffer sizes. Additionally, the __aligned_largest macro is
removed since this helper was the only user.
[1] https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qPXydAacU1RqZWA@mail.gmail.com
Signed-off-by: Kees Cook <[email protected]>
---
crypto/shash.c | 19 ++++++++-----------
include/linux/compiler-gcc.h | 1 -
2 files changed, 8 insertions(+), 12 deletions(-)
diff --git a/crypto/shash.c b/crypto/shash.c
index ab6902c6dae7..8081c5e03770 100644
--- a/crypto/shash.c
+++ b/crypto/shash.c
@@ -73,13 +73,6 @@ int crypto_shash_setkey(struct crypto_shash *tfm, const u8 *key,
}
EXPORT_SYMBOL_GPL(crypto_shash_setkey);
-static inline unsigned int shash_align_buffer_size(unsigned len,
- unsigned long mask)
-{
- typedef u8 __aligned_largest u8_aligned;
- return len + (mask & ~(__alignof__(u8_aligned) - 1));
-}
-
static int shash_update_unaligned(struct shash_desc *desc, const u8 *data,
unsigned int len)
{
@@ -88,11 +81,13 @@ static int shash_update_unaligned(struct shash_desc *desc, const u8 *data,
unsigned long alignmask = crypto_shash_alignmask(tfm);
unsigned int unaligned_len = alignmask + 1 -
((unsigned long)data & alignmask);
- u8 ubuf[shash_align_buffer_size(unaligned_len, alignmask)]
- __aligned_largest;
+ u8 ubuf[MAX_ALGAPI_ALIGNMASK + 1];
u8 *buf = PTR_ALIGN(&ubuf[0], alignmask + 1);
int err;
+ if (WARN_ON(buf + unaligned_len > ubuf + sizeof(ubuf)))
+ return -EINVAL;
+
if (unaligned_len > len)
unaligned_len = len;
@@ -124,11 +119,13 @@ static int shash_final_unaligned(struct shash_desc *desc, u8 *out)
unsigned long alignmask = crypto_shash_alignmask(tfm);
struct shash_alg *shash = crypto_shash_alg(tfm);
unsigned int ds = crypto_shash_digestsize(tfm);
- u8 ubuf[shash_align_buffer_size(ds, alignmask)]
- __aligned_largest;
+ u8 ubuf[SHASH_MAX_DIGESTSIZE];
u8 *buf = PTR_ALIGN(&ubuf[0], alignmask + 1);
int err;
+ if (WARN_ON(buf + ds > ubuf + sizeof(ubuf)))
+ return -EINVAL;
+
err = shash->final(desc, buf);
if (err)
goto out;
diff --git a/include/linux/compiler-gcc.h b/include/linux/compiler-gcc.h
index f1a7492a5cc8..1f1cdef36a82 100644
--- a/include/linux/compiler-gcc.h
+++ b/include/linux/compiler-gcc.h
@@ -125,7 +125,6 @@
*/
#define __pure __attribute__((pure))
#define __aligned(x) __attribute__((aligned(x)))
-#define __aligned_largest __attribute__((aligned))
#define __printf(a, b) __attribute__((format(printf, a, b)))
#define __scanf(a, b) __attribute__((format(scanf, a, b)))
#define __attribute_const__ __attribute__((__const__))
--
2.17.1
In the quest to remove all stack VLA usage from the kernel[1], this uses
the new SHASH_MAX_DIGESTSIZE from the crypto layer to allocate the upper
bounds on stack usage.
[1] https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qPXydAacU1RqZWA@mail.gmail.com
Signed-off-by: Kees Cook <[email protected]>
---
drivers/md/dm-integrity.c | 23 +++++++++++++++++------
1 file changed, 17 insertions(+), 6 deletions(-)
diff --git a/drivers/md/dm-integrity.c b/drivers/md/dm-integrity.c
index 86438b2f10dd..85e8ce1625a2 100644
--- a/drivers/md/dm-integrity.c
+++ b/drivers/md/dm-integrity.c
@@ -521,7 +521,12 @@ static void section_mac(struct dm_integrity_c *ic, unsigned section, __u8 result
}
memset(result + size, 0, JOURNAL_MAC_SIZE - size);
} else {
- __u8 digest[size];
+ __u8 digest[SHASH_MAX_DIGESTSIZE];
+
+ if (WARN_ON(size > sizeof(digest))) {
+ dm_integrity_io_error(ic, "digest_size", -EINVAL);
+ goto err;
+ }
r = crypto_shash_final(desc, digest);
if (unlikely(r)) {
dm_integrity_io_error(ic, "crypto_shash_final", r);
@@ -1244,7 +1249,7 @@ static void integrity_metadata(struct work_struct *w)
struct bio *bio = dm_bio_from_per_bio_data(dio, sizeof(struct dm_integrity_io));
char *checksums;
unsigned extra_space = unlikely(digest_size > ic->tag_size) ? digest_size - ic->tag_size : 0;
- char checksums_onstack[ic->tag_size + extra_space];
+ char checksums_onstack[SHASH_MAX_DIGESTSIZE];
unsigned sectors_to_process = dio->range.n_sectors;
sector_t sector = dio->range.logical_sector;
@@ -1253,8 +1258,14 @@ static void integrity_metadata(struct work_struct *w)
checksums = kmalloc((PAGE_SIZE >> SECTOR_SHIFT >> ic->sb->log2_sectors_per_block) * ic->tag_size + extra_space,
GFP_NOIO | __GFP_NORETRY | __GFP_NOWARN);
- if (!checksums)
+ if (!checksums) {
checksums = checksums_onstack;
+ if (WARN_ON(extra_space &&
+ digest_size > sizeof(checksums_onstack))) {
+ r = -EINVAL;
+ goto error;
+ }
+ }
__bio_for_each_segment(bv, bio, iter, dio->orig_bi_iter) {
unsigned pos;
@@ -1466,7 +1477,7 @@ static bool __journal_read_write(struct dm_integrity_io *dio, struct bio *bio,
} while (++s < ic->sectors_per_block);
#ifdef INTERNAL_VERIFY
if (ic->internal_hash) {
- char checksums_onstack[max(crypto_shash_digestsize(ic->internal_hash), ic->tag_size)];
+ char checksums_onstack[max(SHASH_MAX_DIGESTSIZE, MAX_TAG_SIZE)];
integrity_sector_checksum(ic, logical_sector, mem + bv.bv_offset, checksums_onstack);
if (unlikely(memcmp(checksums_onstack, journal_entry_tag(ic, je), ic->tag_size))) {
@@ -1516,7 +1527,7 @@ static bool __journal_read_write(struct dm_integrity_io *dio, struct bio *bio,
if (ic->internal_hash) {
unsigned digest_size = crypto_shash_digestsize(ic->internal_hash);
if (unlikely(digest_size > ic->tag_size)) {
- char checksums_onstack[digest_size];
+ char checksums_onstack[SHASH_MAX_DIGESTSIZE];
integrity_sector_checksum(ic, logical_sector, (char *)js, checksums_onstack);
memcpy(journal_entry_tag(ic, je), checksums_onstack, ic->tag_size);
} else
@@ -1937,7 +1948,7 @@ static void do_journal_write(struct dm_integrity_c *ic, unsigned write_start,
unlikely(from_replay) &&
#endif
ic->internal_hash) {
- char test_tag[max(crypto_shash_digestsize(ic->internal_hash), ic->tag_size)];
+ char test_tag[max_t(size_t, SHASH_MAX_DIGESTSIZE, MAX_TAG_SIZE)];
integrity_sector_checksum(ic, sec + ((l - j) << ic->sb->log2_sectors_per_block),
(char *)access_journal_data(ic, i, l), test_tag);
--
2.17.1
In the quest to remove all stack VLA usage from the kernel[1], this
exposes a new general upper bound on crypto blocksize and alignmask
(higher than for the existing cipher limits) for VLA removal,
and introduces new checks.
At present, the highest cra_alignmask in the kernel is 63. The highest
cra_blocksize is 144 (SHA3_224_BLOCK_SIZE, 18 8-byte words). For the
new blocksize limit, I went with 160 (20 8-byte words).
[1] https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qPXydAacU1RqZWA@mail.gmail.com
Signed-off-by: Kees Cook <[email protected]>
---
crypto/algapi.c | 7 ++++++-
include/crypto/algapi.h | 4 +++-
2 files changed, 9 insertions(+), 2 deletions(-)
diff --git a/crypto/algapi.c b/crypto/algapi.c
index c0755cf4f53f..496fc51bf215 100644
--- a/crypto/algapi.c
+++ b/crypto/algapi.c
@@ -57,9 +57,14 @@ static int crypto_check_alg(struct crypto_alg *alg)
if (alg->cra_alignmask & (alg->cra_alignmask + 1))
return -EINVAL;
- if (alg->cra_blocksize > PAGE_SIZE / 8)
+ /* General maximums for all algs. */
+ if (alg->cra_alignmask > MAX_ALGAPI_ALIGNMASK)
return -EINVAL;
+ if (alg->cra_blocksize > MAX_ALGAPI_BLOCKSIZE)
+ return -EINVAL;
+
+ /* Lower maximums for specific alg types. */
if (!alg->cra_type && (alg->cra_flags & CRYPTO_ALG_TYPE_MASK) ==
CRYPTO_ALG_TYPE_CIPHER) {
if (alg->cra_alignmask > MAX_CIPHER_ALIGNMASK)
diff --git a/include/crypto/algapi.h b/include/crypto/algapi.h
index bd5e8ccf1687..21371ac8f355 100644
--- a/include/crypto/algapi.h
+++ b/include/crypto/algapi.h
@@ -20,8 +20,10 @@
/*
* Maximum values for blocksize and alignmask, used to allocate
* static buffers that are big enough for any combination of
- * ciphers and architectures.
+ * algs and architectures. Ciphers have a lower maximum size.
*/
+#define MAX_ALGAPI_BLOCKSIZE 160
+#define MAX_ALGAPI_ALIGNMASK 63
#define MAX_CIPHER_BLOCKSIZE 16
#define MAX_CIPHER_ALIGNMASK 15
--
2.17.1
In the quest to remove all stack VLA usage from the kernel[1], this
uses the upper bounds on blocksize. Since this is always a cipher
blocksize, use the existing cipher max blocksize.
[1] https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qPXydAacU1RqZWA@mail.gmail.com
Signed-off-by: Kees Cook <[email protected]>
---
include/crypto/cbc.h | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/include/crypto/cbc.h b/include/crypto/cbc.h
index f5b8bfc22e6d..47db0aac2ab9 100644
--- a/include/crypto/cbc.h
+++ b/include/crypto/cbc.h
@@ -113,7 +113,9 @@ static inline int crypto_cbc_decrypt_inplace(
unsigned int bsize = crypto_skcipher_blocksize(tfm);
unsigned int nbytes = walk->nbytes;
u8 *src = walk->src.virt.addr;
- u8 last_iv[bsize];
+ u8 last_iv[MAX_CIPHER_BLOCKSIZE];
+
+ BUG_ON(bsize > sizeof(last_iv));
/* Start of the last block. */
src += nbytes - (nbytes & (bsize - 1)) - bsize;
--
2.17.1
In the quest to remove all stack VLA usage from the kernel[1], this
uses the newly defined max digest size macro. Also adds a sanity-check
at use-time.
[1] https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qPXydAacU1RqZWA@mail.gmail.com
Signed-off-by: Kees Cook <[email protected]>
---
drivers/md/dm-verity-fec.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/md/dm-verity-fec.c b/drivers/md/dm-verity-fec.c
index 684af08d0747..fe5cfd1a5fa5 100644
--- a/drivers/md/dm-verity-fec.c
+++ b/drivers/md/dm-verity-fec.c
@@ -212,12 +212,15 @@ static int fec_read_bufs(struct dm_verity *v, struct dm_verity_io *io,
struct dm_verity_fec_io *fio = fec_io(io);
u64 block, ileaved;
u8 *bbuf, *rs_block;
- u8 want_digest[v->digest_size];
+ u8 want_digest[AHASH_MAX_DIGESTSIZE];
unsigned n, k;
if (neras)
*neras = 0;
+ if (WARN_ON(v->digest_size > sizeof(want_digest)))
+ return -EINVAL;
+
/*
* read each of the rsn data blocks that are part of the RS block, and
* interleave contents to available bufs
--
2.17.1
In the quest to remove all stack VLA usage from the kernel[1], this
introduces max size macros for ahash, as already done for shash, and
adjust the crypto user to max state size.
[1] https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qPXydAacU1RqZWA@mail.gmail.com
Signed-off-by: Kees Cook <[email protected]>
---
crypto/ahash.c | 4 ++--
crypto/algif_hash.c | 2 +-
include/crypto/hash.h | 3 +++
3 files changed, 6 insertions(+), 3 deletions(-)
diff --git a/crypto/ahash.c b/crypto/ahash.c
index a64c143165b1..6435bdbe42fd 100644
--- a/crypto/ahash.c
+++ b/crypto/ahash.c
@@ -550,8 +550,8 @@ static int ahash_prepare_alg(struct ahash_alg *alg)
{
struct crypto_alg *base = &alg->halg.base;
- if (alg->halg.digestsize > PAGE_SIZE / 8 ||
- alg->halg.statesize > PAGE_SIZE / 8 ||
+ if (alg->halg.digestsize > AHASH_MAX_DIGESTSIZE ||
+ alg->halg.statesize > AHASH_MAX_STATESIZE ||
alg->halg.statesize == 0)
return -EINVAL;
diff --git a/crypto/algif_hash.c b/crypto/algif_hash.c
index bfcf595fd8f9..8974ee8ebead 100644
--- a/crypto/algif_hash.c
+++ b/crypto/algif_hash.c
@@ -239,7 +239,7 @@ static int hash_accept(struct socket *sock, struct socket *newsock, int flags,
struct alg_sock *ask = alg_sk(sk);
struct hash_ctx *ctx = ask->private;
struct ahash_request *req = &ctx->req;
- char state[crypto_ahash_statesize(crypto_ahash_reqtfm(req)) ? : 1];
+ char state[AHASH_MAX_STATESIZE];
struct sock *sk2;
struct alg_sock *ask2;
struct hash_ctx *ctx2;
diff --git a/include/crypto/hash.h b/include/crypto/hash.h
index 804e87a9510e..5d79e2f0936e 100644
--- a/include/crypto/hash.h
+++ b/include/crypto/hash.h
@@ -64,6 +64,9 @@ struct ahash_request {
void *__ctx[] CRYPTO_MINALIGN_ATTR;
};
+#define AHASH_MAX_DIGESTSIZE 512
+#define AHASH_MAX_STATESIZE 512
+
#define AHASH_REQUEST_ON_STACK(name, ahash) \
char __##name##_desc[sizeof(struct ahash_request) + \
crypto_ahash_reqsize(ahash)] CRYPTO_MINALIGN_ATTR; \
--
2.17.1
In the quest to remove all stack VLA usage from the kernel[1], this uses
the maximum blocksize and adds a sanity check. For xcbc, the blocksize
must always be 16, so use that, since it's already being enforced during
instantiation.
[1] https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qPXydAacU1RqZWA@mail.gmail.com
Signed-off-by: Kees Cook <[email protected]>
---
crypto/xcbc.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/crypto/xcbc.c b/crypto/xcbc.c
index 25c75af50d3f..7aa03beed795 100644
--- a/crypto/xcbc.c
+++ b/crypto/xcbc.c
@@ -57,6 +57,8 @@ struct xcbc_desc_ctx {
u8 ctx[];
};
+#define XCBC_BLOCKSIZE 16
+
static int crypto_xcbc_digest_setkey(struct crypto_shash *parent,
const u8 *inkey, unsigned int keylen)
{
@@ -65,7 +67,10 @@ static int crypto_xcbc_digest_setkey(struct crypto_shash *parent,
int bs = crypto_shash_blocksize(parent);
u8 *consts = PTR_ALIGN(&ctx->ctx[0], alignmask + 1);
int err = 0;
- u8 key1[bs];
+ u8 key1[XCBC_BLOCKSIZE];
+
+ if (WARN_ON(bs > sizeof(key1)))
+ return -EINVAL;
if ((err = crypto_cipher_setkey(ctx->child, inkey, keylen)))
return err;
@@ -212,7 +217,7 @@ static int xcbc_create(struct crypto_template *tmpl, struct rtattr **tb)
return PTR_ERR(alg);
switch(alg->cra_blocksize) {
- case 16:
+ case XCBC_BLOCKSIZE:
break;
default:
goto out_put_alg;
--
2.17.1
On Fri, Jun 29, 2018 at 2:28 AM, Kees Cook <[email protected]> wrote:
> diff --git a/drivers/md/dm-integrity.c b/drivers/md/dm-integrity.c
> index 86438b2f10dd..85e8ce1625a2 100644
> --- a/drivers/md/dm-integrity.c
> +++ b/drivers/md/dm-integrity.c
> @@ -521,7 +521,12 @@ static void section_mac(struct dm_integrity_c *ic, unsigned section, __u8 result
> }
> memset(result + size, 0, JOURNAL_MAC_SIZE - size);
> } else {
> - __u8 digest[size];
> + __u8 digest[SHASH_MAX_DIGESTSIZE];
> +
> + if (WARN_ON(size > sizeof(digest))) {
> + dm_integrity_io_error(ic, "digest_size", -EINVAL);
> + goto err;
> + }
I'm still slightly worried that some patches like this one could make
things worse
and lead to an actual stack overflow. You define SHASH_MAX_DIGESTSIZE
as '512', which is still quite a lot to put on the kernel stack. The
function also
uses SHASH_DESC_ON_STACK(), so now you have two copies. Then you
could call shash_final_unaligned(), which seems to put a third copy on
the stack,
so replacing each one with a fixed-size buffer adds quite a bit of bloat.
Is there actually a digest that can be used in dm-integrity with more than 64
byte output (matching JOURNAL_MAC_SIZE) here?
Arnd
On Fri, Jun 29, 2018 at 1:43 PM, Arnd Bergmann <[email protected]> wrote:
> On Fri, Jun 29, 2018 at 2:28 AM, Kees Cook <[email protected]> wrote:
>
>> diff --git a/drivers/md/dm-integrity.c b/drivers/md/dm-integrity.c
>> index 86438b2f10dd..85e8ce1625a2 100644
>> --- a/drivers/md/dm-integrity.c
>> +++ b/drivers/md/dm-integrity.c
>> @@ -521,7 +521,12 @@ static void section_mac(struct dm_integrity_c *ic, unsigned section, __u8 result
>> }
>> memset(result + size, 0, JOURNAL_MAC_SIZE - size);
>> } else {
>> - __u8 digest[size];
>> + __u8 digest[SHASH_MAX_DIGESTSIZE];
>> +
>> + if (WARN_ON(size > sizeof(digest))) {
>> + dm_integrity_io_error(ic, "digest_size", -EINVAL);
>> + goto err;
>> + }
>
> I'm still slightly worried that some patches like this one could make
> things worse and lead to an actual stack overflow.
As in stack exhaustion? Yeah, this has been a concern of mine for the
crypto stuff because some combinations get BIG. My thinking has been
mainly that it means ALL cases will lead to a bad state instead of
only corner cases, which makes it easier to find and fix.
> You define SHASH_MAX_DIGESTSIZE
> as '512', which is still quite a lot to put on the kernel stack. The
> function also
> uses SHASH_DESC_ON_STACK(), so now you have two copies. Then you
> could call shash_final_unaligned(), which seems to put a third copy on
> the stack,
> so replacing each one with a fixed-size buffer adds quite a bit of bloat.
>
> Is there actually a digest that can be used in dm-integrity with more than 64
> byte output (matching JOURNAL_MAC_SIZE) here?
This conversion was following the existing check (PAGE_SIZE / 8), and
not via an analysis of alg.digestsize users. Let me double-check. For
predefined stuff, it looks like the largest is:
SKEIN1024_DIGEST_BIT_SIZE/8 == 128
I can drop this from 512 down to 128...
-Kees
--
Kees Cook
Pixel Security
On Thu, Jun 28, 2018 at 05:28:43PM -0700, Kees Cook wrote:
> In the quest to remove all stack VLA usage from the kernel[1], this uses
> the newly defined max alignment to perform unaligned hashing to avoid
> VLAs, and drops the helper function while adding sanity checks on the
> resulting buffer sizes. Additionally, the __aligned_largest macro is
> removed since this helper was the only user.
>
> [1] https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qPXydAacU1RqZWA@mail.gmail.com
>
> Signed-off-by: Kees Cook <[email protected]>
> ---
> crypto/shash.c | 19 ++++++++-----------
> include/linux/compiler-gcc.h | 1 -
> 2 files changed, 8 insertions(+), 12 deletions(-)
>
> diff --git a/crypto/shash.c b/crypto/shash.c
> index ab6902c6dae7..8081c5e03770 100644
> --- a/crypto/shash.c
> +++ b/crypto/shash.c
> @@ -73,13 +73,6 @@ int crypto_shash_setkey(struct crypto_shash *tfm, const u8 *key,
> }
> EXPORT_SYMBOL_GPL(crypto_shash_setkey);
>
> -static inline unsigned int shash_align_buffer_size(unsigned len,
> - unsigned long mask)
> -{
> - typedef u8 __aligned_largest u8_aligned;
> - return len + (mask & ~(__alignof__(u8_aligned) - 1));
> -}
> -
> static int shash_update_unaligned(struct shash_desc *desc, const u8 *data,
> unsigned int len)
> {
> @@ -88,11 +81,13 @@ static int shash_update_unaligned(struct shash_desc *desc, const u8 *data,
> unsigned long alignmask = crypto_shash_alignmask(tfm);
> unsigned int unaligned_len = alignmask + 1 -
> ((unsigned long)data & alignmask);
> - u8 ubuf[shash_align_buffer_size(unaligned_len, alignmask)]
> - __aligned_largest;
> + u8 ubuf[MAX_ALGAPI_ALIGNMASK + 1];
> u8 *buf = PTR_ALIGN(&ubuf[0], alignmask + 1);
> int err;
>
> + if (WARN_ON(buf + unaligned_len > ubuf + sizeof(ubuf)))
> + return -EINVAL;
> +
How is 'ubuf' guaranteed to be large enough? You removed the __aligned
attribute, so 'ubuf' can have any alignment. So the aligned pointer 'buf' may
be as high as '&ubuf[alignmask]'. Then, up to 'alignmask' bytes of data will be
copied into 'buf'... resulting in up to '2 * alignmask' bytes needed in 'ubuf'.
But you've only guaranteed 'alignmask + 1' bytes.
> if (unaligned_len > len)
> unaligned_len = len;
>
> @@ -124,11 +119,13 @@ static int shash_final_unaligned(struct shash_desc *desc, u8 *out)
> unsigned long alignmask = crypto_shash_alignmask(tfm);
> struct shash_alg *shash = crypto_shash_alg(tfm);
> unsigned int ds = crypto_shash_digestsize(tfm);
> - u8 ubuf[shash_align_buffer_size(ds, alignmask)]
> - __aligned_largest;
> + u8 ubuf[SHASH_MAX_DIGESTSIZE];
> u8 *buf = PTR_ALIGN(&ubuf[0], alignmask + 1);
> int err;
>
> + if (WARN_ON(buf + ds > ubuf + sizeof(ubuf)))
> + return -EINVAL;
> +
Similar problem here. Wouldn't 'ubuf' need to be of size 'alignmask + ds'?
> err = shash->final(desc, buf);
> if (err)
> goto out;
> diff --git a/include/linux/compiler-gcc.h b/include/linux/compiler-gcc.h
> index f1a7492a5cc8..1f1cdef36a82 100644
> --- a/include/linux/compiler-gcc.h
> +++ b/include/linux/compiler-gcc.h
> @@ -125,7 +125,6 @@
> */
> #define __pure __attribute__((pure))
> #define __aligned(x) __attribute__((aligned(x)))
> -#define __aligned_largest __attribute__((aligned))
> #define __printf(a, b) __attribute__((format(printf, a, b)))
> #define __scanf(a, b) __attribute__((format(scanf, a, b)))
> #define __attribute_const__ __attribute__((__const__))
> --
> 2.17.1
>
> --
> dm-devel mailing list
> [email protected]
> https://www.redhat.com/mailman/listinfo/dm-devel
On Fri, Jun 29, 2018 at 02:56:37PM -0700, Kees Cook wrote:
>
> This conversion was following the existing check (PAGE_SIZE / 8), and
> not via an analysis of alg.digestsize users. Let me double-check. For
> predefined stuff, it looks like the largest is:
>
> SKEIN1024_DIGEST_BIT_SIZE/8 == 128
This should be removed. We shouldn't allow generic or new crypto
algorithms in staging.
Thanks,
--
Email: Herbert Xu <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
On Sat, Jun 30, 2018 at 12:03 AM, Eric Biggers <[email protected]> wrote:
> On Thu, Jun 28, 2018 at 05:28:43PM -0700, Kees Cook wrote:
>> @@ -88,11 +81,13 @@ static int shash_update_unaligned(struct shash_desc *desc, const u8 *data,
>> unsigned long alignmask = crypto_shash_alignmask(tfm);
>> unsigned int unaligned_len = alignmask + 1 -
>> ((unsigned long)data & alignmask);
>> - u8 ubuf[shash_align_buffer_size(unaligned_len, alignmask)]
>> - __aligned_largest;
>> + u8 ubuf[MAX_ALGAPI_ALIGNMASK + 1];
>> u8 *buf = PTR_ALIGN(&ubuf[0], alignmask + 1);
>> int err;
>>
>> + if (WARN_ON(buf + unaligned_len > ubuf + sizeof(ubuf)))
>> + return -EINVAL;
>> +
>
> How is 'ubuf' guaranteed to be large enough? You removed the __aligned
> attribute, so 'ubuf' can have any alignment. So the aligned pointer 'buf' may
> be as high as '&ubuf[alignmask]'. Then, up to 'alignmask' bytes of data will be
> copied into 'buf'... resulting in up to '2 * alignmask' bytes needed in 'ubuf'.
> But you've only guaranteed 'alignmask + 1' bytes.
Hm, good point. Adding __aligned(MAX_ALGAPI_ALIGNMASK + 1) looks to
fix this, yes?
Also, if __aligned() is used here, can't PTR_ALIGN() be dropped? (I
think you pointed this out earlier.)
Also, is "unaligned_len" being calculated correctly? Let's say
alignmask is 63. If data is binary ...111111, then unaligned_len will
be 64 - 63 == 1, which is fine: we copy 1 byte out, bump the address
by 1, and we're happily aligned to ...000000. If data is ...000000,
then unaligned_len will be 64. But it should be 0. Shouldn't this be:
unsigned int unaligned_len;
unaligned_len = (unsigned long)data & alignmask;
if (unaligned_len)
unaligned_len = alignmask + 1 - unaligned_len;
And then ubuf only needs to be MAX_ALGAPI_ALIGNMASK, without the +1?
-Kees
--
Kees Cook
Pixel Security
On Sun, Jul 01, 2018 at 10:04:59AM -0700, Kees Cook wrote:
> On Sat, Jun 30, 2018 at 12:03 AM, Eric Biggers <[email protected]> wrote:
> > On Thu, Jun 28, 2018 at 05:28:43PM -0700, Kees Cook wrote:
> >> @@ -88,11 +81,13 @@ static int shash_update_unaligned(struct shash_desc *desc, const u8 *data,
> >> unsigned long alignmask = crypto_shash_alignmask(tfm);
> >> unsigned int unaligned_len = alignmask + 1 -
> >> ((unsigned long)data & alignmask);
> >> - u8 ubuf[shash_align_buffer_size(unaligned_len, alignmask)]
> >> - __aligned_largest;
> >> + u8 ubuf[MAX_ALGAPI_ALIGNMASK + 1];
> >> u8 *buf = PTR_ALIGN(&ubuf[0], alignmask + 1);
> >> int err;
> >>
> >> + if (WARN_ON(buf + unaligned_len > ubuf + sizeof(ubuf)))
> >> + return -EINVAL;
> >> +
> >
> > How is 'ubuf' guaranteed to be large enough? You removed the __aligned
> > attribute, so 'ubuf' can have any alignment. So the aligned pointer 'buf' may
> > be as high as '&ubuf[alignmask]'. Then, up to 'alignmask' bytes of data will be
> > copied into 'buf'... resulting in up to '2 * alignmask' bytes needed in 'ubuf'.
> > But you've only guaranteed 'alignmask + 1' bytes.
>
> Hm, good point. Adding __aligned(MAX_ALGAPI_ALIGNMASK + 1) looks to
> fix this, yes?
>
> Also, if __aligned() is used here, can't PTR_ALIGN() be dropped? (I
> think you pointed this out earlier.)
Sure, I'm just not sure whether __aligned() with such a large alignment is
guaranteed to work on stack variables on all architectures. See e.g.
https://patchwork.kernel.org/patch/9507697/.
>
> Also, is "unaligned_len" being calculated correctly? Let's say
> alignmask is 63. If data is binary ...111111, then unaligned_len will
> be 64 - 63 == 1, which is fine: we copy 1 byte out, bump the address
> by 1, and we're happily aligned to ...000000. If data is ...000000,
> then unaligned_len will be 64. But it should be 0. Shouldn't this be:
>
> unsigned int unaligned_len;
>
> unaligned_len = (unsigned long)data & alignmask;
> if (unaligned_len)
> unaligned_len = alignmask + 1 - unaligned_len;
>
> And then ubuf only needs to be MAX_ALGAPI_ALIGNMASK, without the +1?
shash_update_unaligned() is only called when 'data & alignmask'.
Similarly with shash_final_unaligned().
Though, calculating 'unaligned_len' could be simplified to
unsigned int unaligned_len = -(unsigned long)data & alignmask;
which works either way.
- Eric
On Sun, Jul 1, 2018 at 10:20 AM, Eric Biggers <[email protected]> wrote:
> On Sun, Jul 01, 2018 at 10:04:59AM -0700, Kees Cook wrote:
>> On Sat, Jun 30, 2018 at 12:03 AM, Eric Biggers <[email protected]> wrote:
>> > On Thu, Jun 28, 2018 at 05:28:43PM -0700, Kees Cook wrote:
>> >> @@ -88,11 +81,13 @@ static int shash_update_unaligned(struct shash_desc *desc, const u8 *data,
>> >> unsigned long alignmask = crypto_shash_alignmask(tfm);
>> >> unsigned int unaligned_len = alignmask + 1 -
>> >> ((unsigned long)data & alignmask);
>> >> - u8 ubuf[shash_align_buffer_size(unaligned_len, alignmask)]
>> >> - __aligned_largest;
>> >> + u8 ubuf[MAX_ALGAPI_ALIGNMASK + 1];
>> >> u8 *buf = PTR_ALIGN(&ubuf[0], alignmask + 1);
>> >> int err;
>> >>
>> >> + if (WARN_ON(buf + unaligned_len > ubuf + sizeof(ubuf)))
>> >> + return -EINVAL;
>> >> +
>> >
>> > How is 'ubuf' guaranteed to be large enough? You removed the __aligned
>> > attribute, so 'ubuf' can have any alignment. So the aligned pointer 'buf' may
>> > be as high as '&ubuf[alignmask]'. Then, up to 'alignmask' bytes of data will be
>> > copied into 'buf'... resulting in up to '2 * alignmask' bytes needed in 'ubuf'.
>> > But you've only guaranteed 'alignmask + 1' bytes.
>>
>> Hm, good point. Adding __aligned(MAX_ALGAPI_ALIGNMASK + 1) looks to
>> fix this, yes?
>>
>> Also, if __aligned() is used here, can't PTR_ALIGN() be dropped? (I
>> think you pointed this out earlier.)
>
> Sure, I'm just not sure whether __aligned() with such a large alignment is
> guaranteed to work on stack variables on all architectures. See e.g.
> https://patchwork.kernel.org/patch/9507697/.
That's terrible. :( That seems like a compiler bug, but okay.
>> Also, is "unaligned_len" being calculated correctly? Let's say
>> alignmask is 63. If data is binary ...111111, then unaligned_len will
>> be 64 - 63 == 1, which is fine: we copy 1 byte out, bump the address
>> by 1, and we're happily aligned to ...000000. If data is ...000000,
>> then unaligned_len will be 64. But it should be 0. Shouldn't this be:
>>
>> unsigned int unaligned_len;
>>
>> unaligned_len = (unsigned long)data & alignmask;
>> if (unaligned_len)
>> unaligned_len = alignmask + 1 - unaligned_len;
>>
>> And then ubuf only needs to be MAX_ALGAPI_ALIGNMASK, without the +1?
>
> shash_update_unaligned() is only called when 'data & alignmask'.
> Similarly with shash_final_unaligned().
Ah! I see that now.
> Though, calculating 'unaligned_len' could be simplified to
>
> unsigned int unaligned_len = -(unsigned long)data & alignmask;
>
> which works either way.
So, since we can't depend on __aligned() working, I'll just keep the
PTR_ALIGN and add MAX_ALGAPI_ALIGNMASK to each array. That'll be less
memory-efficient, but it'll actually get aligned correctly.
-Kees
--
Kees Cook
Pixel Security