2018-09-04 18:18:00

by Kees Cook

Subject: [PATCH 0/2] crypto: Remove VLA usage from skcipher

This removes VLAs[1] from SKCIPHER_REQUEST_ON_STACK by making it possible
for crypto_skcipher_set_reqsize() to fail. Callers are updated to handle
the error condition.

-Kees

[1] https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qPXydAacU1RqZWA@mail.gmail.com

Kees Cook (2):
crypto: skcipher: Allow crypto_skcipher_set_reqsize() to fail
crypto: skcipher: Remove VLA usage for SKCIPHER_REQUEST_ON_STACK

crypto/cryptd.c | 7 +++++--
crypto/ctr.c | 7 +++++--
crypto/cts.c | 7 +++++--
crypto/lrw.c | 9 ++++++---
crypto/simd.c | 7 +++++--
crypto/xts.c | 11 ++++++++---
drivers/crypto/amcc/crypto4xx_core.c | 8 +++++++-
drivers/crypto/cavium/nitrox/nitrox_algs.c | 9 +++++++--
drivers/crypto/ccree/cc_cipher.c | 6 ++++--
drivers/crypto/hisilicon/sec/sec_algs.c | 5 ++++-
drivers/crypto/inside-secure/safexcel_cipher.c | 5 ++++-
drivers/crypto/marvell/cipher.c | 4 +---
drivers/crypto/sunxi-ss/sun4i-ss-cipher.c | 4 +---
include/crypto/internal/skcipher.h | 7 ++++++-
include/crypto/skcipher.h | 4 +++-
15 files changed, 71 insertions(+), 29 deletions(-)

--
2.17.1



2018-09-04 18:18:03

by Kees Cook

Subject: [PATCH 2/2] crypto: skcipher: Remove VLA usage for SKCIPHER_REQUEST_ON_STACK

In the quest to remove all stack VLA usage from the kernel[1], this
caps the skcipher request size similar to other limits and adds a sanity
check at registration. Looking at instrumented tcrypt output, the largest
is for lrw:

crypt: testing lrw(aes)
crypto_skcipher_set_reqsize: 8
crypto_skcipher_set_reqsize: 88
crypto_skcipher_set_reqsize: 472

[1] https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qPXydAacU1RqZWA@mail.gmail.com

Signed-off-by: Kees Cook <[email protected]>
---
include/crypto/internal/skcipher.h | 3 +++
include/crypto/skcipher.h | 4 +++-
2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/include/crypto/internal/skcipher.h b/include/crypto/internal/skcipher.h
index d2926ecae2ac..6da811c0747e 100644
--- a/include/crypto/internal/skcipher.h
+++ b/include/crypto/internal/skcipher.h
@@ -130,6 +130,9 @@ static inline struct crypto_skcipher *crypto_spawn_skcipher(
static inline int crypto_skcipher_set_reqsize(
struct crypto_skcipher *skcipher, unsigned int reqsize)
{
+ if (WARN_ON(reqsize > SKCIPHER_MAX_REQSIZE))
+ return -EINVAL;
+
skcipher->reqsize = reqsize;

return 0;
diff --git a/include/crypto/skcipher.h b/include/crypto/skcipher.h
index 2f327f090c3e..c48e194438cf 100644
--- a/include/crypto/skcipher.h
+++ b/include/crypto/skcipher.h
@@ -139,9 +139,11 @@ struct skcipher_alg {
struct crypto_alg base;
};

+#define SKCIPHER_MAX_REQSIZE 472
+
#define SKCIPHER_REQUEST_ON_STACK(name, tfm) \
char __##name##_desc[sizeof(struct skcipher_request) + \
- crypto_skcipher_reqsize(tfm)] CRYPTO_MINALIGN_ATTR; \
+ SKCIPHER_MAX_REQSIZE] CRYPTO_MINALIGN_ATTR; \
struct skcipher_request *name = (void *)__##name##_desc

/**
--
2.17.1


2018-09-04 18:18:09

by Kees Cook

Subject: [PATCH 1/2] crypto: skcipher: Allow crypto_skcipher_set_reqsize() to fail

In the quest to remove all stack VLA usage from the kernel[1], we must
put an upper limit on how large an skcipher's reqsize can grow. In order
to cleanly handle this limit, crypto_skcipher_set_reqsize() must report
whether the desired reqsize is allowed. This means all callers need to
check the new return value and handle any cleanup now. This patch adds
the return value and updates all the callers to check the result and
act appropriately. A followup patch will add the new bounds checking to
crypto_skcipher_set_reqsize().

[1] https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qPXydAacU1RqZWA@mail.gmail.com
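For illustration, the shape of the change can be sketched as a standalone userspace example (not kernel code: the struct layout, the cap value, and the literal error number are simplified stand-ins for the kernel's real types and the followup patch's bounds check):

```c
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types and limit. */
#define SKCIPHER_MAX_REQSIZE 472

struct crypto_skcipher {
	unsigned int reqsize;
};

/* After this series: returns 0 on success, negative errno on failure. */
static int crypto_skcipher_set_reqsize(struct crypto_skcipher *skcipher,
				       unsigned int reqsize)
{
	if (reqsize > SKCIPHER_MAX_REQSIZE)
		return -22; /* -EINVAL */

	skcipher->reqsize = reqsize;
	return 0;
}

/* Caller pattern after this series: check the result and unwind. */
static int example_init_tfm(struct crypto_skcipher *tfm)
{
	int ret = crypto_skcipher_set_reqsize(tfm, 88);

	if (ret) {
		/* free any child tfm allocated earlier in init */
		return ret;
	}
	return 0;
}
```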

Signed-off-by: Kees Cook <[email protected]>
---
crypto/cryptd.c | 7 +++++--
crypto/ctr.c | 7 +++++--
crypto/cts.c | 7 +++++--
crypto/lrw.c | 9 ++++++---
crypto/simd.c | 7 +++++--
crypto/xts.c | 11 ++++++++---
drivers/crypto/amcc/crypto4xx_core.c | 8 +++++++-
drivers/crypto/cavium/nitrox/nitrox_algs.c | 9 +++++++--
drivers/crypto/ccree/cc_cipher.c | 6 ++++--
drivers/crypto/hisilicon/sec/sec_algs.c | 5 ++++-
drivers/crypto/inside-secure/safexcel_cipher.c | 5 ++++-
drivers/crypto/marvell/cipher.c | 4 +---
drivers/crypto/sunxi-ss/sun4i-ss-cipher.c | 4 +---
include/crypto/internal/skcipher.h | 4 +++-
14 files changed, 65 insertions(+), 28 deletions(-)

diff --git a/crypto/cryptd.c b/crypto/cryptd.c
index addca7bae33f..e0131907a537 100644
--- a/crypto/cryptd.c
+++ b/crypto/cryptd.c
@@ -563,15 +563,18 @@ static int cryptd_skcipher_init_tfm(struct crypto_skcipher *tfm)
struct crypto_skcipher_spawn *spawn = &ictx->spawn;
struct cryptd_skcipher_ctx *ctx = crypto_skcipher_ctx(tfm);
struct crypto_skcipher *cipher;
+ int ret;

cipher = crypto_spawn_skcipher(spawn);
if (IS_ERR(cipher))
return PTR_ERR(cipher);

ctx->child = cipher;
- crypto_skcipher_set_reqsize(
+ ret = crypto_skcipher_set_reqsize(
tfm, sizeof(struct cryptd_skcipher_request_ctx));
- return 0;
+ if (ret)
+ crypto_free_skcipher(ctx->child);
+ return ret;
}

static void cryptd_skcipher_exit_tfm(struct crypto_skcipher *tfm)
diff --git a/crypto/ctr.c b/crypto/ctr.c
index 435b75bd619e..70b8496ee569 100644
--- a/crypto/ctr.c
+++ b/crypto/ctr.c
@@ -319,6 +319,7 @@ static int crypto_rfc3686_init_tfm(struct crypto_skcipher *tfm)
struct crypto_skcipher *cipher;
unsigned long align;
unsigned int reqsize;
+ int ret;

cipher = crypto_spawn_skcipher(spawn);
if (IS_ERR(cipher))
@@ -330,9 +331,11 @@ static int crypto_rfc3686_init_tfm(struct crypto_skcipher *tfm)
align &= ~(crypto_tfm_ctx_alignment() - 1);
reqsize = align + sizeof(struct crypto_rfc3686_req_ctx) +
crypto_skcipher_reqsize(cipher);
- crypto_skcipher_set_reqsize(tfm, reqsize);
+ ret = crypto_skcipher_set_reqsize(tfm, reqsize);
+ if (ret)
+ crypto_free_skcipher(ctx->child);

- return 0;
+ return ret;
}

static void crypto_rfc3686_exit_tfm(struct crypto_skcipher *tfm)
diff --git a/crypto/cts.c b/crypto/cts.c
index 4e28d83ae37d..f04c29f4197f 100644
--- a/crypto/cts.c
+++ b/crypto/cts.c
@@ -289,6 +289,7 @@ static int crypto_cts_init_tfm(struct crypto_skcipher *tfm)
unsigned reqsize;
unsigned bsize;
unsigned align;
+ int ret;

cipher = crypto_spawn_skcipher(spawn);
if (IS_ERR(cipher))
@@ -303,9 +304,11 @@ static int crypto_cts_init_tfm(struct crypto_skcipher *tfm)
crypto_tfm_ctx_alignment()) +
(align & ~(crypto_tfm_ctx_alignment() - 1)) + bsize;

- crypto_skcipher_set_reqsize(tfm, reqsize);
+ ret = crypto_skcipher_set_reqsize(tfm, reqsize);
+ if (ret)
+ crypto_free_skcipher(ctx->child);

- return 0;
+ return ret;
}

static void crypto_cts_exit_tfm(struct crypto_skcipher *tfm)
diff --git a/crypto/lrw.c b/crypto/lrw.c
index 393a782679c7..dc344046b637 100644
--- a/crypto/lrw.c
+++ b/crypto/lrw.c
@@ -426,6 +426,7 @@ static int init_tfm(struct crypto_skcipher *tfm)
struct crypto_skcipher_spawn *spawn = skcipher_instance_ctx(inst);
struct priv *ctx = crypto_skcipher_ctx(tfm);
struct crypto_skcipher *cipher;
+ int ret;

cipher = crypto_spawn_skcipher(spawn);
if (IS_ERR(cipher))
@@ -433,10 +434,12 @@ static int init_tfm(struct crypto_skcipher *tfm)

ctx->child = cipher;

- crypto_skcipher_set_reqsize(tfm, crypto_skcipher_reqsize(cipher) +
- sizeof(struct rctx));
+ ret = crypto_skcipher_set_reqsize(tfm,
+ crypto_skcipher_reqsize(cipher) + sizeof(struct rctx));
+ if (ret)
+ crypto_free_skcipher(ctx->child);

- return 0;
+ return ret;
}

static void exit_tfm(struct crypto_skcipher *tfm)
diff --git a/crypto/simd.c b/crypto/simd.c
index ea7240be3001..bf1a27057e92 100644
--- a/crypto/simd.c
+++ b/crypto/simd.c
@@ -112,6 +112,7 @@ static int simd_skcipher_init(struct crypto_skcipher *tfm)
struct simd_skcipher_alg *salg;
struct skcipher_alg *alg;
unsigned reqsize;
+ int ret;

alg = crypto_skcipher_alg(tfm);
salg = container_of(alg, struct simd_skcipher_alg, alg);
@@ -127,9 +128,11 @@ static int simd_skcipher_init(struct crypto_skcipher *tfm)
reqsize = sizeof(struct skcipher_request);
reqsize += crypto_skcipher_reqsize(&cryptd_tfm->base);

- crypto_skcipher_set_reqsize(tfm, reqsize);
+ ret = crypto_skcipher_set_reqsize(tfm, reqsize);
+ if (ret)
+ cryptd_free_skcipher(ctx->cryptd_tfm);

- return 0;
+ return ret;
}

struct simd_skcipher_alg *simd_skcipher_create_compat(const char *algname,
diff --git a/crypto/xts.c b/crypto/xts.c
index ccf55fbb8bc2..d7a85abb9723 100644
--- a/crypto/xts.c
+++ b/crypto/xts.c
@@ -364,6 +364,7 @@ static int init_tfm(struct crypto_skcipher *tfm)
struct priv *ctx = crypto_skcipher_ctx(tfm);
struct crypto_skcipher *child;
struct crypto_cipher *tweak;
+ int ret;

child = crypto_spawn_skcipher(&ictx->spawn);
if (IS_ERR(child))
@@ -379,10 +380,14 @@ static int init_tfm(struct crypto_skcipher *tfm)

ctx->tweak = tweak;

- crypto_skcipher_set_reqsize(tfm, crypto_skcipher_reqsize(child) +
- sizeof(struct rctx));
+ ret = crypto_skcipher_set_reqsize(tfm,
+ crypto_skcipher_reqsize(child) + sizeof(struct rctx));
+ if (ret) {
+ crypto_free_cipher(ctx->tweak);
+ crypto_free_skcipher(ctx->child);
+ }

- return 0;
+ return ret;
}

static void exit_tfm(struct crypto_skcipher *tfm)
diff --git a/drivers/crypto/amcc/crypto4xx_core.c b/drivers/crypto/amcc/crypto4xx_core.c
index 6eaec9ba0f68..de41bc35629c 100644
--- a/drivers/crypto/amcc/crypto4xx_core.c
+++ b/drivers/crypto/amcc/crypto4xx_core.c
@@ -951,6 +951,8 @@ static int crypto4xx_sk_init(struct crypto_skcipher *sk)
struct crypto4xx_ctx *ctx = crypto_skcipher_ctx(sk);

if (alg->base.cra_flags & CRYPTO_ALG_NEED_FALLBACK) {
+ int ret;
+
ctx->sw_cipher.cipher =
crypto_alloc_skcipher(alg->base.cra_name, 0,
CRYPTO_ALG_NEED_FALLBACK |
@@ -958,9 +960,13 @@ static int crypto4xx_sk_init(struct crypto_skcipher *sk)
if (IS_ERR(ctx->sw_cipher.cipher))
return PTR_ERR(ctx->sw_cipher.cipher);

- crypto_skcipher_set_reqsize(sk,
+ ret = crypto_skcipher_set_reqsize(sk,
sizeof(struct skcipher_request) + 32 +
crypto_skcipher_reqsize(ctx->sw_cipher.cipher));
+ if (ret) {
+ crypto_free_skcipher(ctx->sw_cipher.cipher);
+ return ret;
+ }
}

amcc_alg = container_of(alg, struct crypto4xx_alg, alg.u.cipher);
diff --git a/drivers/crypto/cavium/nitrox/nitrox_algs.c b/drivers/crypto/cavium/nitrox/nitrox_algs.c
index 2ae6124e5da6..f0e688d37da6 100644
--- a/drivers/crypto/cavium/nitrox/nitrox_algs.c
+++ b/drivers/crypto/cavium/nitrox/nitrox_algs.c
@@ -74,6 +74,7 @@ static int nitrox_skcipher_init(struct crypto_skcipher *tfm)
{
struct nitrox_crypto_ctx *nctx = crypto_skcipher_ctx(tfm);
void *fctx;
+ int ret;

/* get the first device */
nctx->ndev = nitrox_get_first_device();
@@ -87,9 +88,13 @@ static int nitrox_skcipher_init(struct crypto_skcipher *tfm)
return -ENOMEM;
}
nctx->u.ctx_handle = (uintptr_t)fctx;
- crypto_skcipher_set_reqsize(tfm, crypto_skcipher_reqsize(tfm) +
+ ret = crypto_skcipher_set_reqsize(tfm, crypto_skcipher_reqsize(tfm) +
sizeof(struct nitrox_kcrypt_request));
- return 0;
+ if (ret) {
+ crypto_free_context(fctx);
+ nitrox_put_device(nctx->ndev);
+ }
+ return ret;
}

static void nitrox_skcipher_exit(struct crypto_skcipher *tfm)
diff --git a/drivers/crypto/ccree/cc_cipher.c b/drivers/crypto/ccree/cc_cipher.c
index 7623b29911af..ec8e9506f4c5 100644
--- a/drivers/crypto/ccree/cc_cipher.c
+++ b/drivers/crypto/ccree/cc_cipher.c
@@ -136,13 +136,15 @@ static int cc_cipher_init(struct crypto_tfm *tfm)
skcipher_alg.base);
struct device *dev = drvdata_to_dev(cc_alg->drvdata);
unsigned int max_key_buf_size = cc_alg->skcipher_alg.max_keysize;
- int rc = 0;
+ int rc;

dev_dbg(dev, "Initializing context @%p for %s\n", ctx_p,
crypto_tfm_alg_name(tfm));

- crypto_skcipher_set_reqsize(__crypto_skcipher_cast(tfm),
+ rc = crypto_skcipher_set_reqsize(__crypto_skcipher_cast(tfm),
sizeof(struct cipher_req_ctx));
+ if (rc)
+ return rc;

ctx_p->cipher_mode = cc_alg->cipher_mode;
ctx_p->flow_mode = cc_alg->flow_mode;
diff --git a/drivers/crypto/hisilicon/sec/sec_algs.c b/drivers/crypto/hisilicon/sec/sec_algs.c
index f7d6d690116e..b10ff7202718 100644
--- a/drivers/crypto/hisilicon/sec/sec_algs.c
+++ b/drivers/crypto/hisilicon/sec/sec_algs.c
@@ -880,10 +880,13 @@ static int sec_alg_skcipher_decrypt(struct skcipher_request *req)
static int sec_alg_skcipher_init(struct crypto_skcipher *tfm)
{
struct sec_alg_tfm_ctx *ctx = crypto_skcipher_ctx(tfm);
+ int ret;

mutex_init(&ctx->lock);
INIT_LIST_HEAD(&ctx->backlog);
- crypto_skcipher_set_reqsize(tfm, sizeof(struct sec_request));
+ ret = crypto_skcipher_set_reqsize(tfm, sizeof(struct sec_request));
+ if (ret)
+ return ret;

ctx->queue = sec_queue_alloc_start_safe();
if (IS_ERR(ctx->queue))
diff --git a/drivers/crypto/inside-secure/safexcel_cipher.c b/drivers/crypto/inside-secure/safexcel_cipher.c
index 3aef1d43e435..b64a245a00fb 100644
--- a/drivers/crypto/inside-secure/safexcel_cipher.c
+++ b/drivers/crypto/inside-secure/safexcel_cipher.c
@@ -782,9 +782,12 @@ static int safexcel_skcipher_cra_init(struct crypto_tfm *tfm)
struct safexcel_alg_template *tmpl =
container_of(tfm->__crt_alg, struct safexcel_alg_template,
alg.skcipher.base);
+ int ret;

- crypto_skcipher_set_reqsize(__crypto_skcipher_cast(tfm),
+ ret = crypto_skcipher_set_reqsize(__crypto_skcipher_cast(tfm),
sizeof(struct safexcel_cipher_req));
+ if (ret)
+ return ret;

ctx->priv = tmpl->priv;

diff --git a/drivers/crypto/marvell/cipher.c b/drivers/crypto/marvell/cipher.c
index 0ae84ec9e21c..41a2e047beb6 100644
--- a/drivers/crypto/marvell/cipher.c
+++ b/drivers/crypto/marvell/cipher.c
@@ -241,10 +241,8 @@ static int mv_cesa_skcipher_cra_init(struct crypto_tfm *tfm)

ctx->ops = &mv_cesa_skcipher_req_ops;

- crypto_skcipher_set_reqsize(__crypto_skcipher_cast(tfm),
+ return crypto_skcipher_set_reqsize(__crypto_skcipher_cast(tfm),
sizeof(struct mv_cesa_skcipher_req));
-
- return 0;
}

static int mv_cesa_aes_setkey(struct crypto_skcipher *cipher, const u8 *key,
diff --git a/drivers/crypto/sunxi-ss/sun4i-ss-cipher.c b/drivers/crypto/sunxi-ss/sun4i-ss-cipher.c
index 5cf64746731a..d01fb1054b77 100644
--- a/drivers/crypto/sunxi-ss/sun4i-ss-cipher.c
+++ b/drivers/crypto/sunxi-ss/sun4i-ss-cipher.c
@@ -465,10 +465,8 @@ int sun4i_ss_cipher_init(struct crypto_tfm *tfm)
alg.crypto.base);
op->ss = algt->ss;

- crypto_skcipher_set_reqsize(__crypto_skcipher_cast(tfm),
+ return crypto_skcipher_set_reqsize(__crypto_skcipher_cast(tfm),
sizeof(struct sun4i_cipher_req_ctx));
-
- return 0;
}

/* check and set the AES key, prepare the mode to be used */
diff --git a/include/crypto/internal/skcipher.h b/include/crypto/internal/skcipher.h
index e42f7063f245..d2926ecae2ac 100644
--- a/include/crypto/internal/skcipher.h
+++ b/include/crypto/internal/skcipher.h
@@ -127,10 +127,12 @@ static inline struct crypto_skcipher *crypto_spawn_skcipher(
return crypto_spawn_tfm2(&spawn->base);
}

-static inline void crypto_skcipher_set_reqsize(
+static inline int crypto_skcipher_set_reqsize(
struct crypto_skcipher *skcipher, unsigned int reqsize)
{
skcipher->reqsize = reqsize;
+
+ return 0;
}

int crypto_register_skcipher(struct skcipher_alg *alg);
--
2.17.1


2018-09-05 06:14:30

by Alexander Stein

Subject: Re: [PATCH 2/2] crypto: skcipher: Remove VLA usage for SKCIPHER_REQUEST_ON_STACK

On Tuesday, September 4, 2018, 8:16:29 PM CEST Kees Cook wrote:
> In the quest to remove all stack VLA usage from the kernel[1], this
> caps the skcipher request size similar to other limits and adds a sanity
> check at registration. Looking at instrumented tcrypt output, the largest
> is for lrw:
>
> crypt: testing lrw(aes)
> crypto_skcipher_set_reqsize: 8
> crypto_skcipher_set_reqsize: 88
> crypto_skcipher_set_reqsize: 472
>
> [1] https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qPXydAacU1RqZWA@mail.gmail.com
>
> Signed-off-by: Kees Cook <[email protected]>
> ---
> include/crypto/internal/skcipher.h | 3 +++
> include/crypto/skcipher.h | 4 +++-
> 2 files changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/include/crypto/internal/skcipher.h b/include/crypto/internal/skcipher.h
> index d2926ecae2ac..6da811c0747e 100644
> --- a/include/crypto/internal/skcipher.h
> +++ b/include/crypto/internal/skcipher.h
> @@ -130,6 +130,9 @@ static inline struct crypto_skcipher *crypto_spawn_skcipher(
> static inline int crypto_skcipher_set_reqsize(
> struct crypto_skcipher *skcipher, unsigned int reqsize)
> {
> + if (WARN_ON(reqsize > SKCIPHER_MAX_REQSIZE))
> + return -EINVAL;
> +
> skcipher->reqsize = reqsize;
>
> return 0;
> diff --git a/include/crypto/skcipher.h b/include/crypto/skcipher.h
> index 2f327f090c3e..c48e194438cf 100644
> --- a/include/crypto/skcipher.h
> +++ b/include/crypto/skcipher.h
> @@ -139,9 +139,11 @@ struct skcipher_alg {
> struct crypto_alg base;
> };
>
> +#define SKCIPHER_MAX_REQSIZE 472
> +
> #define SKCIPHER_REQUEST_ON_STACK(name, tfm) \
> char __##name##_desc[sizeof(struct skcipher_request) + \
> - crypto_skcipher_reqsize(tfm)] CRYPTO_MINALIGN_ATTR; \
> + SKCIPHER_MAX_REQSIZE] CRYPTO_MINALIGN_ATTR; \
> struct skcipher_request *name = (void *)__##name##_desc

Now tfm could be removed from the macro arguments, no?

Best regards,
Alexander
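Alexander's observation, sketched as a standalone userspace example: once the on-stack buffer is a fixed worst-case size, the macro no longer reads anything from the tfm, so the argument could in principle be dropped (the struct here is a simplified stand-in, and the kernel's CRYPTO_MINALIGN_ATTR alignment attribute is omitted):

```c
#include <stddef.h>

#define SKCIPHER_MAX_REQSIZE 472

/* Simplified stand-in for the kernel's struct skcipher_request. */
struct skcipher_request {
	unsigned int cryptlen;
	/* ... request context of up to SKCIPHER_MAX_REQSIZE bytes follows */
};

/* Hypothetical tfm-less variant of SKCIPHER_REQUEST_ON_STACK(). */
#define SKCIPHER_REQUEST_ON_STACK(name) \
	char __##name##_desc[sizeof(struct skcipher_request) + \
			     SKCIPHER_MAX_REQSIZE]; \
	struct skcipher_request *name = (void *)__##name##_desc
```

Keeping the tfm argument, on the other hand, avoids churn at every existing call site.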




2018-09-05 09:20:06

by Ard Biesheuvel

Subject: Re: [PATCH 2/2] crypto: skcipher: Remove VLA usage for SKCIPHER_REQUEST_ON_STACK

On 4 September 2018 at 20:16, Kees Cook <[email protected]> wrote:
> In the quest to remove all stack VLA usage from the kernel[1], this
> caps the skcipher request size similar to other limits and adds a sanity
> check at registration. Looking at instrumented tcrypt output, the largest
> is for lrw:
>
> crypt: testing lrw(aes)
> crypto_skcipher_set_reqsize: 8
> crypto_skcipher_set_reqsize: 88
> crypto_skcipher_set_reqsize: 472
>

Are you sure this is a representative sampling? I haven't double
checked myself, but we have plenty of drivers for peripherals in
drivers/crypto that implement block ciphers, and they would not turn
up in tcrypt unless you are running on a platform that provides the
hardware in question.

> [1] https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qPXydAacU1RqZWA@mail.gmail.com
>
> Signed-off-by: Kees Cook <[email protected]>
> ---
> include/crypto/internal/skcipher.h | 3 +++
> include/crypto/skcipher.h | 4 +++-
> 2 files changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/include/crypto/internal/skcipher.h b/include/crypto/internal/skcipher.h
> index d2926ecae2ac..6da811c0747e 100644
> --- a/include/crypto/internal/skcipher.h
> +++ b/include/crypto/internal/skcipher.h
> @@ -130,6 +130,9 @@ static inline struct crypto_skcipher *crypto_spawn_skcipher(
> static inline int crypto_skcipher_set_reqsize(
> struct crypto_skcipher *skcipher, unsigned int reqsize)
> {
> + if (WARN_ON(reqsize > SKCIPHER_MAX_REQSIZE))
> + return -EINVAL;
> +
> skcipher->reqsize = reqsize;
>
> return 0;
> diff --git a/include/crypto/skcipher.h b/include/crypto/skcipher.h
> index 2f327f090c3e..c48e194438cf 100644
> --- a/include/crypto/skcipher.h
> +++ b/include/crypto/skcipher.h
> @@ -139,9 +139,11 @@ struct skcipher_alg {
> struct crypto_alg base;
> };
>
> +#define SKCIPHER_MAX_REQSIZE 472
> +
> #define SKCIPHER_REQUEST_ON_STACK(name, tfm) \
> char __##name##_desc[sizeof(struct skcipher_request) + \
> - crypto_skcipher_reqsize(tfm)] CRYPTO_MINALIGN_ATTR; \
> + SKCIPHER_MAX_REQSIZE] CRYPTO_MINALIGN_ATTR; \
> struct skcipher_request *name = (void *)__##name##_desc
>
> /**
> --
> 2.17.1
>

2018-09-05 21:14:29

by Kees Cook

Subject: Re: [PATCH 2/2] crypto: skcipher: Remove VLA usage for SKCIPHER_REQUEST_ON_STACK

On Wed, Sep 5, 2018 at 2:18 AM, Ard Biesheuvel
<[email protected]> wrote:
> On 4 September 2018 at 20:16, Kees Cook <[email protected]> wrote:
>> In the quest to remove all stack VLA usage from the kernel[1], this
>> caps the skcipher request size similar to other limits and adds a sanity
>> check at registration. Looking at instrumented tcrypt output, the largest
>> is for lrw:
>>
>> crypt: testing lrw(aes)
>> crypto_skcipher_set_reqsize: 8
>> crypto_skcipher_set_reqsize: 88
>> crypto_skcipher_set_reqsize: 472
>>
>
> Are you sure this is a representative sampling? I haven't double
> checked myself, but we have plenty of drivers for peripherals in
> drivers/crypto that implement block ciphers, and they would not turn
> up in tcrypt unless you are running on a platform that provides the
> hardware in question.

Hrm, excellent point. Looking at this again:

The core part of the VLA is using this in the ON_STACK macro:

static inline unsigned int crypto_skcipher_reqsize(struct crypto_skcipher *tfm)
{
	return tfm->reqsize;
}

I don't find any struct crypto_skcipher .reqsize static initializers,
and the initial reqsize is here:

static int crypto_init_skcipher_ops_ablkcipher(struct crypto_tfm *tfm)
{
	...
	skcipher->reqsize = crypto_ablkcipher_reqsize(ablkcipher) +
			    sizeof(struct ablkcipher_request);

with updates via crypto_skcipher_set_reqsize().

So I have to examine ablkcipher reqsize too:

static inline unsigned int crypto_ablkcipher_reqsize(
	struct crypto_ablkcipher *tfm)
{
	return crypto_ablkcipher_crt(tfm)->reqsize;
}

And of the crt_ablkcipher.reqsize assignments/initializers, I found:

ablkcipher reqsize:
1 struct dcp_aes_req_ctx
8 struct atmel_tdes_reqctx
8 struct cryptd_blkcipher_request_ctx
8 struct mtk_aes_reqctx
8 struct omap_des_reqctx
8 struct s5p_aes_reqctx
8 struct sahara_aes_reqctx
8 struct stm32_cryp_reqctx
8 struct stm32_cryp_reqctx
16 struct ablk_ctx
24 struct atmel_aes_reqctx
48 struct omap_aes_reqctx
48 struct omap_aes_reqctx
48 struct qat_crypto_request
56 struct artpec6_crypto_request_context
64 struct chcr_blkcipher_req_ctx
80 struct spacc_req
80 struct virtio_crypto_sym_request
136 struct qce_cipher_reqctx
168 struct n2_request_context
328 struct ccp_des3_req_ctx
400 struct ccp_aes_req_ctx
536 struct hifn_request_context
992 struct cvm_req_ctx
2456 struct iproc_reqctx_s

The base ablkcipher wrapper is:
80 struct ablkcipher_request

And in my earlier skcipher wrapper analysis, lrw was the largest
skcipher wrapper:
384 struct rctx

iproc_reqctx_s is an extreme outlier, with cvm_req_ctx at a bit less than half.

Making this a 2920 byte fixed array doesn't seem sensible at all
(though that's what's already possible to use with existing
SKCIPHER_REQUEST_ON_STACK users).

What's the right path forward here?

-Kees

--
Kees Cook
Pixel Security

2018-09-05 22:52:52

by Ard Biesheuvel

Subject: Re: [PATCH 2/2] crypto: skcipher: Remove VLA usage for SKCIPHER_REQUEST_ON_STACK

On 5 September 2018 at 23:05, Kees Cook <[email protected]> wrote:
> On Wed, Sep 5, 2018 at 2:18 AM, Ard Biesheuvel
> <[email protected]> wrote:
>> On 4 September 2018 at 20:16, Kees Cook <[email protected]> wrote:
>>> In the quest to remove all stack VLA usage from the kernel[1], this
>>> caps the skcipher request size similar to other limits and adds a sanity
>>> check at registration. Looking at instrumented tcrypt output, the largest
>>> is for lrw:
>>>
>>> crypt: testing lrw(aes)
>>> crypto_skcipher_set_reqsize: 8
>>> crypto_skcipher_set_reqsize: 88
>>> crypto_skcipher_set_reqsize: 472
>>>
>>
>> Are you sure this is a representative sampling? I haven't double
>> checked myself, but we have plenty of drivers for peripherals in
>> drivers/crypto that implement block ciphers, and they would not turn
>> up in tcrypt unless you are running on a platform that provides the
>> hardware in question.
>
> Hrm, excellent point. Looking at this again:
>
> The core part of the VLA is using this in the ON_STACK macro:
>
> static inline unsigned int crypto_skcipher_reqsize(struct crypto_skcipher *tfm)
> {
> return tfm->reqsize;
> }
>
> I don't find any struct crypto_skcipher .reqsize static initializers,
> and the initial reqsize is here:
>
> static int crypto_init_skcipher_ops_ablkcipher(struct crypto_tfm *tfm)
> {
> ...
> skcipher->reqsize = crypto_ablkcipher_reqsize(ablkcipher) +
> sizeof(struct ablkcipher_request);
>
> with updates via crypto_skcipher_set_reqsize().
>
> So I have to examine ablkcipher reqsize too:
>
> static inline unsigned int crypto_ablkcipher_reqsize(
> struct crypto_ablkcipher *tfm)
> {
> return crypto_ablkcipher_crt(tfm)->reqsize;
> }
>
> And of the crt_ablkcipher.reqsize assignments/initializers, I found:
>
> ablkcipher reqsize:
> 1 struct dcp_aes_req_ctx
> 8 struct atmel_tdes_reqctx
> 8 struct cryptd_blkcipher_request_ctx
> 8 struct mtk_aes_reqctx
> 8 struct omap_des_reqctx
> 8 struct s5p_aes_reqctx
> 8 struct sahara_aes_reqctx
> 8 struct stm32_cryp_reqctx
> 8 struct stm32_cryp_reqctx
> 16 struct ablk_ctx
> 24 struct atmel_aes_reqctx
> 48 struct omap_aes_reqctx
> 48 struct omap_aes_reqctx
> 48 struct qat_crypto_request
> 56 struct artpec6_crypto_request_context
> 64 struct chcr_blkcipher_req_ctx
> 80 struct spacc_req
> 80 struct virtio_crypto_sym_request
> 136 struct qce_cipher_reqctx
> 168 struct n2_request_context
> 328 struct ccp_des3_req_ctx
> 400 struct ccp_aes_req_ctx
> 536 struct hifn_request_context
> 992 struct cvm_req_ctx
> 2456 struct iproc_reqctx_s
>
> The base ablkcipher wrapper is:
> 80 struct ablkcipher_request
>
> And in my earlier skcipher wrapper analysis, lrw was the largest
> skcipher wrapper:
> 384 struct rctx
>
> iproc_reqctx_s is an extreme outlier, with cvm_req_ctx at a bit less than half.
>
> Making this a 2920 byte fixed array doesn't seem sensible at all
> (though that's what's already possible to use with existing
> SKCIPHER_REQUEST_ON_STACK users).
>
> What's the right path forward here?
>

The skcipher implementations based on crypto IP blocks are typically
asynchronous, and I wouldn't be surprised if a fair number of
SKCIPHER_REQUEST_ON_STACK() users are limited to synchronous
skciphers.

So we could formalize this and limit SKCIPHER_REQUEST_ON_STACK() to
synchronous skciphers, which implies that the reqsize limit only has
to apply to synchronous skciphers as well. But before we can do this, we
have to identify the remaining occurrences that allow asynchronous
skciphers to be used, and replace them with heap allocations.
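The heap-allocation replacement Ard describes can be sketched as a standalone userspace example; the types and helpers here are simplified stand-ins for the kernel's skcipher_request_alloc()/skcipher_request_free() pattern, where the request header and the driver's per-request context are allocated as one block:

```c
#include <stdlib.h>

/* Simplified stand-ins for the kernel types. */
struct crypto_skcipher {
	unsigned int reqsize;
};

struct skcipher_request {
	struct crypto_skcipher *tfm;
	/* request context of tfm->reqsize bytes follows */
};

/* Allocate request header plus per-tfm context in one heap block,
 * instead of reserving a worst-case buffer on the stack. */
static struct skcipher_request *request_alloc(struct crypto_skcipher *tfm)
{
	struct skcipher_request *req;

	req = malloc(sizeof(*req) + tfm->reqsize);
	if (req)
		req->tfm = tfm;
	return req;
}

static void request_free(struct skcipher_request *req)
{
	free(req);
}
```

With this pattern even an outlier like a 2456-byte request context costs nothing on the stack.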

2018-09-06 00:45:26

by Kees Cook

Subject: Re: [PATCH 2/2] crypto: skcipher: Remove VLA usage for SKCIPHER_REQUEST_ON_STACK

On Wed, Sep 5, 2018 at 3:49 PM, Ard Biesheuvel
<[email protected]> wrote:
> On 5 September 2018 at 23:05, Kees Cook <[email protected]> wrote:
>> On Wed, Sep 5, 2018 at 2:18 AM, Ard Biesheuvel
>> <[email protected]> wrote:
>>> On 4 September 2018 at 20:16, Kees Cook <[email protected]> wrote:
>>>> In the quest to remove all stack VLA usage from the kernel[1], this
>>>> caps the skcipher request size similar to other limits and adds a sanity
>>>> check at registration. Looking at instrumented tcrypt output, the largest
>>>> is for lrw:
>>>>
>>>> crypt: testing lrw(aes)
>>>> crypto_skcipher_set_reqsize: 8
>>>> crypto_skcipher_set_reqsize: 88
>>>> crypto_skcipher_set_reqsize: 472
>>>>
>>>
>>> Are you sure this is a representative sampling? I haven't double
>>> checked myself, but we have plenty of drivers for peripherals in
>>> drivers/crypto that implement block ciphers, and they would not turn
>>> up in tcrypt unless you are running on a platform that provides the
>>> hardware in question.
>>
>> Hrm, excellent point. Looking at this again:
>>
>> The core part of the VLA is using this in the ON_STACK macro:
>>
>> static inline unsigned int crypto_skcipher_reqsize(struct crypto_skcipher *tfm)
>> {
>> return tfm->reqsize;
>> }
>>
>> I don't find any struct crypto_skcipher .reqsize static initializers,
>> and the initial reqsize is here:
>>
>> static int crypto_init_skcipher_ops_ablkcipher(struct crypto_tfm *tfm)
>> {
>> ...
>> skcipher->reqsize = crypto_ablkcipher_reqsize(ablkcipher) +
>> sizeof(struct ablkcipher_request);
>>
>> with updates via crypto_skcipher_set_reqsize().
>>
>> So I have to examine ablkcipher reqsize too:
>>
>> static inline unsigned int crypto_ablkcipher_reqsize(
>> struct crypto_ablkcipher *tfm)
>> {
>> return crypto_ablkcipher_crt(tfm)->reqsize;
>> }
>>
>> And of the crt_ablkcipher.reqsize assignments/initializers, I found:
>>
>> ablkcipher reqsize:
>> 1 struct dcp_aes_req_ctx
>> 8 struct atmel_tdes_reqctx
>> 8 struct cryptd_blkcipher_request_ctx
>> 8 struct mtk_aes_reqctx
>> 8 struct omap_des_reqctx
>> 8 struct s5p_aes_reqctx
>> 8 struct sahara_aes_reqctx
>> 8 struct stm32_cryp_reqctx
>> 8 struct stm32_cryp_reqctx
>> 16 struct ablk_ctx
>> 24 struct atmel_aes_reqctx
>> 48 struct omap_aes_reqctx
>> 48 struct omap_aes_reqctx
>> 48 struct qat_crypto_request
>> 56 struct artpec6_crypto_request_context
>> 64 struct chcr_blkcipher_req_ctx
>> 80 struct spacc_req
>> 80 struct virtio_crypto_sym_request
>> 136 struct qce_cipher_reqctx
>> 168 struct n2_request_context
>> 328 struct ccp_des3_req_ctx
>> 400 struct ccp_aes_req_ctx
>> 536 struct hifn_request_context
>> 992 struct cvm_req_ctx
>> 2456 struct iproc_reqctx_s
>>
>> The base ablkcipher wrapper is:
>> 80 struct ablkcipher_request
>>
>> And in my earlier skcipher wrapper analysis, lrw was the largest
>> skcipher wrapper:
>> 384 struct rctx
>>
>> iproc_reqctx_s is an extreme outlier, with cvm_req_ctx at a bit less than half.
>>
>> Making this a 2920 byte fixed array doesn't seem sensible at all
>> (though that's what's already possible to use with existing
>> SKCIPHER_REQUEST_ON_STACK users).
>>
>> What's the right path forward here?
>>
>
> The skcipher implementations based on crypto IP blocks are typically
> asynchronous, and I wouldn't be surprised if a fair number of
> SKCIPHER_REQUEST_ON_STACK() users are limited to synchronous
> skciphers.

Looks similar to ahash vs shash. :) Yes, so nearly all
crypto_alloc_skcipher() users explicitly mask away ASYNC. What's left
appears to be:

crypto/drbg.c: sk_tfm = crypto_alloc_skcipher(ctr_name, 0, 0);
crypto/tcrypt.c: tfm = crypto_alloc_skcipher(algo, 0, async ? 0 : CRYPTO_ALG_ASYNC);
drivers/crypto/omap-aes.c: ctx->ctr = crypto_alloc_skcipher("ecb(aes)", 0, 0);
drivers/md/dm-crypt.c: cc->cipher_tfm.tfms[i] = crypto_alloc_skcipher(ciphermode, 0, 0);
drivers/md/dm-integrity.c: ic->journal_crypt = crypto_alloc_skcipher(ic->journal_crypt_alg.alg_string, 0, 0);
fs/crypto/keyinfo.c: struct crypto_skcipher *tfm = crypto_alloc_skcipher("ecb(aes)", 0, 0);
fs/crypto/keyinfo.c: ctfm = crypto_alloc_skcipher(mode->cipher_str, 0, 0);
fs/ecryptfs/crypto.c: crypt_stat->tfm = crypto_alloc_skcipher(full_alg_name, 0, 0);

I'll cross-reference this with SKCIPHER_REQUEST_ON_STACK...

> So we could formalize this and limit SKCIPHER_REQUEST_ON_STACK() to
> synchronous skciphers, which implies that the reqsize limit only has
> to apply to synchronous skciphers as well. But before we can do this, we
> have to identify the remaining occurrences that allow asynchronous
> skciphers to be used, and replace them with heap allocations.

Sounds good; thanks!

-Kees

--
Kees Cook
Pixel Security

2018-09-06 04:55:26

by Gilad Ben-Yossef

Subject: Re: [PATCH 2/2] crypto: skcipher: Remove VLA usage for SKCIPHER_REQUEST_ON_STACK

On Thu, Sep 6, 2018 at 1:49 AM, Ard Biesheuvel
<[email protected]> wrote:
> On 5 September 2018 at 23:05, Kees Cook <[email protected]> wrote:
>> On Wed, Sep 5, 2018 at 2:18 AM, Ard Biesheuvel
>> <[email protected]> wrote:
>>> On 4 September 2018 at 20:16, Kees Cook <[email protected]> wrote:
>>>> In the quest to remove all stack VLA usage from the kernel[1], this
>>>> caps the skcipher request size similar to other limits and adds a sanity
>>>> check at registration. Looking at instrumented tcrypt output, the largest
>>>> is for lrw:
>>>>
>>>> crypt: testing lrw(aes)
>>>> crypto_skcipher_set_reqsize: 8
>>>> crypto_skcipher_set_reqsize: 88
>>>> crypto_skcipher_set_reqsize: 472
>>>>
>>>
>>> Are you sure this is a representative sampling? I haven't double
>>> checked myself, but we have plenty of drivers for peripherals in
>>> drivers/crypto that implement block ciphers, and they would not turn
>>> up in tcrypt unless you are running on a platform that provides the
>>> hardware in question.
>>
>> Hrm, excellent point. Looking at this again:
>>
>> The core part of the VLA is using this in the ON_STACK macro:
>>
>> static inline unsigned int crypto_skcipher_reqsize(struct crypto_skcipher *tfm)
>> {
>> return tfm->reqsize;
>> }
>>
>> I don't find any struct crypto_skcipher .reqsize static initializers,
>> and the initial reqsize is here:
>>
>> static int crypto_init_skcipher_ops_ablkcipher(struct crypto_tfm *tfm)
>> {
>> ...
>> skcipher->reqsize = crypto_ablkcipher_reqsize(ablkcipher) +
>> sizeof(struct ablkcipher_request);
>>
>> with updates via crypto_skcipher_set_reqsize().
>>
>> So I have to examine ablkcipher reqsize too:
>>
>> static inline unsigned int crypto_ablkcipher_reqsize(
>> struct crypto_ablkcipher *tfm)
>> {
>> return crypto_ablkcipher_crt(tfm)->reqsize;
>> }
>>
>> And of the crt_ablkcipher.reqsize assignments/initializers, I found:
>>
>> ablkcipher reqsize:
>> 1 struct dcp_aes_req_ctx
>> 8 struct atmel_tdes_reqctx
>> 8 struct cryptd_blkcipher_request_ctx
>> 8 struct mtk_aes_reqctx
>> 8 struct omap_des_reqctx
>> 8 struct s5p_aes_reqctx
>> 8 struct sahara_aes_reqctx
>> 8 struct stm32_cryp_reqctx
>> 8 struct stm32_cryp_reqctx
>> 16 struct ablk_ctx
>> 24 struct atmel_aes_reqctx
>> 48 struct omap_aes_reqctx
>> 48 struct omap_aes_reqctx
>> 48 struct qat_crypto_request
>> 56 struct artpec6_crypto_request_context
>> 64 struct chcr_blkcipher_req_ctx
>> 80 struct spacc_req
>> 80 struct virtio_crypto_sym_request
>> 136 struct qce_cipher_reqctx
>> 168 struct n2_request_context
>> 328 struct ccp_des3_req_ctx
>> 400 struct ccp_aes_req_ctx
>> 536 struct hifn_request_context
>> 992 struct cvm_req_ctx
>> 2456 struct iproc_reqctx_s
>>
>> The base ablkcipher wrapper is:
>> 80 struct ablkcipher_request
>>
>> And in my earlier skcipher wrapper analysis, lrw was the largest
>> skcipher wrapper:
>> 384 struct rctx
>>
>> iproc_reqctx_s is an extreme outlier, with cvm_req_ctx at a bit less than half.
>>
>> Making this a 2920 byte fixed array doesn't seem sensible at all
>> (though that's what's already possible to use with existing
>> SKCIPHER_REQUEST_ON_STACK users).
>>
>> What's the right path forward here?
>>
>
> The skcipher implementations based on crypto IP blocks are typically
> asynchronous, and I wouldn't be surprised if a fair number of
> SKCIPHER_REQUEST_ON_STACK() users are limited to synchronous
> skciphers.

According to Herbert, SKCIPHER_REQUEST_ON_STACK() may only be used
for invoking synchronous ciphers.

In fact, due to the way the crypto API is built, if you try using it
with any transformation that uses DMA, you would most probably end up
trying to DMA to/from the stack, which, as we all know, is not a great
idea.

>
> So we could formalize this and limit SKCIPHER_REQUEST_ON_STACK() to
> synchronous skciphers, which implies that the reqsize limit only has
> to apply to synchronous skciphers as well. But before we can do this, we
> have to identify the remaining occurrences that allow asynchronous
> skciphers to be used, and replace them with heap allocations.

Any such occurrences are almost for sure broken already due to the DMA
issue I've mentioned.

Gilad

--
Gilad Ben-Yossef
Chief Coffee Drinker

values of β will give rise to dom!

2018-09-06 08:16:26

by Ard Biesheuvel

[permalink] [raw]
Subject: Re: [PATCH 2/2] crypto: skcipher: Remove VLA usage for SKCIPHER_REQUEST_ON_STACK

On 6 September 2018 at 09:21, Ard Biesheuvel <[email protected]> wrote:
> On 6 September 2018 at 06:53, Gilad Ben-Yossef <[email protected]> wrote:
>> On Thu, Sep 6, 2018 at 1:49 AM, Ard Biesheuvel
>> <[email protected]> wrote:
>>> On 5 September 2018 at 23:05, Kees Cook <[email protected]> wrote:
>>>> On Wed, Sep 5, 2018 at 2:18 AM, Ard Biesheuvel
>>>> <[email protected]> wrote:
>>>>> On 4 September 2018 at 20:16, Kees Cook <[email protected]> wrote:
>>>>>> In the quest to remove all stack VLA usage from the kernel[1], this
>>>>>> caps the skcipher request size similar to other limits and adds a sanity
>>>>>> check at registration. Looking at instrumented tcrypt output, the largest
>>>>>> is for lrw:
>>>>>>
>>>>>> crypt: testing lrw(aes)
>>>>>> crypto_skcipher_set_reqsize: 8
>>>>>> crypto_skcipher_set_reqsize: 88
>>>>>> crypto_skcipher_set_reqsize: 472
>>>>>>
>>>>>
>>>>> Are you sure this is a representative sampling? I haven't double
>>>>> checked myself, but we have plenty of drivers for peripherals in
>>>>> drivers/crypto that implement block ciphers, and they would not turn
>>>>> up in tcrypt unless you are running on a platform that provides the
>>>>> hardware in question.
>>>>
>>>> Hrm, excellent point. Looking at this again:
>>>>
>>>> The core part of the VLA is using this in the ON_STACK macro:
>>>>
>>>> static inline unsigned int crypto_skcipher_reqsize(struct crypto_skcipher *tfm)
>>>> {
>>>> return tfm->reqsize;
>>>> }
>>>>
>>>> I don't find any struct crypto_skcipher .reqsize static initializers,
>>>> and the initial reqsize is here:
>>>>
>>>> static int crypto_init_skcipher_ops_ablkcipher(struct crypto_tfm *tfm)
>>>> {
>>>> ...
>>>> skcipher->reqsize = crypto_ablkcipher_reqsize(ablkcipher) +
>>>> sizeof(struct ablkcipher_request);
>>>>
>>>> with updates via crypto_skcipher_set_reqsize().
>>>>
>>>> So I have to examine ablkcipher reqsize too:
>>>>
>>>> static inline unsigned int crypto_ablkcipher_reqsize(
>>>> struct crypto_ablkcipher *tfm)
>>>> {
>>>> return crypto_ablkcipher_crt(tfm)->reqsize;
>>>> }
>>>>
>>>> And of the crt_ablkcipher.reqsize assignments/initializers, I found:
>>>>
>>>> ablkcipher reqsize:
>>>> 1 struct dcp_aes_req_ctx
>>>> 8 struct atmel_tdes_reqctx
>>>> 8 struct cryptd_blkcipher_request_ctx
>>>> 8 struct mtk_aes_reqctx
>>>> 8 struct omap_des_reqctx
>>>> 8 struct s5p_aes_reqctx
>>>> 8 struct sahara_aes_reqctx
>>>> 8 struct stm32_cryp_reqctx
>>>> 8 struct stm32_cryp_reqctx
>>>> 16 struct ablk_ctx
>>>> 24 struct atmel_aes_reqctx
>>>> 48 struct omap_aes_reqctx
>>>> 48 struct omap_aes_reqctx
>>>> 48 struct qat_crypto_request
>>>> 56 struct artpec6_crypto_request_context
>>>> 64 struct chcr_blkcipher_req_ctx
>>>> 80 struct spacc_req
>>>> 80 struct virtio_crypto_sym_request
>>>> 136 struct qce_cipher_reqctx
>>>> 168 struct n2_request_context
>>>> 328 struct ccp_des3_req_ctx
>>>> 400 struct ccp_aes_req_ctx
>>>> 536 struct hifn_request_context
>>>> 992 struct cvm_req_ctx
>>>> 2456 struct iproc_reqctx_s
>>>>
>>>> The base ablkcipher wrapper is:
>>>> 80 struct ablkcipher_request
>>>>
>>>> And in my earlier skcipher wrapper analysis, lrw was the largest
>>>> skcipher wrapper:
>>>> 384 struct rctx
>>>>
>>>> iproc_reqctx_s is an extreme outlier, with cvm_req_ctx at a bit less than half.
>>>>
>>>> Making this a 2920 byte fixed array doesn't seem sensible at all
>>>> (though that's what's already possible to use with existing
>>>> SKCIPHER_REQUEST_ON_STACK users).
>>>>
>>>> What's the right path forward here?
>>>>
>>>
>>> The skcipher implementations based on crypto IP blocks are typically
>>> asynchronous, and I wouldn't be surprised if a fair number of
>>> SKCIPHER_REQUEST_ON_STACK() users are limited to synchronous
>>> skciphers.
>>
>> According to Herbert, SKCIPHER_REQUEST_ON_STACK() may only be used
>> for invoking synchronous ciphers.
>>
>> In fact, due to the way the crypto API is built, if you try using it
>> with any transformation that uses DMA
>> you would most probably end up trying to DMA to/from the stack which
>> as we all know is not a great idea.
>>
>
> Ah yes, I found [0] which contains that quote.
>
> So that means that Kees can disregard the occurrences that are async
> only, but it still implies that we cannot limit the reqsize like he
> proposes unless we take the sync/async nature into account.
> It also means we should probably BUG() or WARN() in
> SKCIPHER_REQUEST_ON_STACK() when used with an async algo.
>

Something like this should do the trick:

diff --git a/include/crypto/skcipher.h b/include/crypto/skcipher.h
index 2f327f090c3e..70584e0f26bc 100644
--- a/include/crypto/skcipher.h
+++ b/include/crypto/skcipher.h
@@ -142,7 +142,9 @@ struct skcipher_alg {
#define SKCIPHER_REQUEST_ON_STACK(name, tfm) \
char __##name##_desc[sizeof(struct skcipher_request) + \
crypto_skcipher_reqsize(tfm)] CRYPTO_MINALIGN_ATTR; \
- struct skcipher_request *name = (void *)__##name##_desc
+ struct skcipher_request *name = WARN_ON( \
+ crypto_skcipher_alg(tfm)->base.cra_flags & CRYPTO_ALG_ASYNC) \
+ ? NULL : (void *)__##name##_desc

/**
* DOC: Symmetric Key Cipher API

That way, we will almost certainly oops on a NULL pointer dereference
right after, but we at least avoid the stack corruption.

>>>
>>> So we could formalize this and limit SKCIPHER_REQUEST_ON_STACK() to
>>> synchronous skciphers, which implies that the reqsize limit only has
>>> to apply to synchronous skciphers as well. But before we can do this, we
>>> have to identify the remaining occurrences that allow asynchronous
>>> skciphers to be used, and replace them with heap allocations.
>>
>> Any such occurrences are almost for sure broken already due to the DMA
>> issue I've mentioned.
>>
>
> I am not convinced of this. The skcipher request struct does not
> contain any payload buffers, and whether the algo specific ctx struct
> is used for DMA is completely up to the driver. So I am quite sure
> there are plenty of async algos that work fine with
> SKCIPHER_REQUEST_ON_STACK() and vmapped stacks.
>
>> Gilad
>>
>> --
>> Gilad Ben-Yossef
>> Chief Coffee Drinker
>>
>> values of β will give rise to dom!
>
> [0] https://www.redhat.com/archives/dm-devel/2018-January/msg00087.html

2018-09-06 08:26:58

by Gilad Ben-Yossef

[permalink] [raw]
Subject: Re: [PATCH 2/2] crypto: skcipher: Remove VLA usage for SKCIPHER_REQUEST_ON_STACK

On Thu, Sep 6, 2018 at 10:21 AM, Ard Biesheuvel
<[email protected]> wrote:

>>> The skcipher implementations based on crypto IP blocks are typically
>>> asynchronous, and I wouldn't be surprised if a fair number of
>>> SKCIPHER_REQUEST_ON_STACK() users are limited to synchronous
>>> skciphers.
>>
>> According to Herbert, SKCIPHER_REQUEST_ON_STACK() may only be used
>> for invoking synchronous ciphers.
>>
>> In fact, due to the way the crypto API is built, if you try using it
>> with any transformation that uses DMA
>> you would most probably end up trying to DMA to/from the stack which
>> as we all know is not a great idea.
>>
>
> Ah yes, I found [0] which contains that quote.
>
> So that means that Kees can disregard the occurrences that are async
> only, but it still implies that we cannot limit the reqsize like he
> proposes unless we take the sync/async nature into account.
> It also means we should probably BUG() or WARN() in
> SKCIPHER_REQUEST_ON_STACK() when used with an async algo.
>
>>>
>>> So we could formalize this and limit SKCIPHER_REQUEST_ON_STACK() to
>>> synchronous skciphers, which implies that the reqsize limit only has
>>> to apply to synchronous skciphers as well. But before we can do this, we
>>> have to identify the remaining occurrences that allow asynchronous
>>> skciphers to be used, and replace them with heap allocations.
>>
>> Any such occurrences are almost for sure broken already due to the DMA
>> issue I've mentioned.
>>
>
> I am not convinced of this. The skcipher request struct does not
> contain any payload buffers, and whether the algo specific ctx struct
> is used for DMA is completely up to the driver. So I am quite sure
> there are plenty of async algos that work fine with
> SKCIPHER_REQUEST_ON_STACK() and vmapped stacks.


You are right that it is up to the driver, but the cost is an extra
memory allocation and release *per request* for any per-request data
that needs to be DMAable beyond the actual plaintext and ciphertext
buffers, such as the IV, so driver writers have an incentive against
doing that :-)

Gilad


--
Gilad Ben-Yossef
Chief Coffee Drinker

values of β will give rise to dom!

2018-09-06 08:52:54

by Herbert Xu

[permalink] [raw]
Subject: Re: [PATCH 2/2] crypto: skcipher: Remove VLA usage for SKCIPHER_REQUEST_ON_STACK

On Thu, Sep 06, 2018 at 10:11:59AM +0200, Ard Biesheuvel wrote:
>
> That way, we will almost certainly oops on a NULL pointer dereference
> right after, but we at least avoid the stack corruption.

A crash is just as bad as a BUG_ON.

Is this even a real problem? Do we have any users of this construct
that are using it on async algorithms?

Cheers,
--
Email: Herbert Xu <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

2018-09-06 09:37:36

by Ard Biesheuvel

[permalink] [raw]
Subject: Re: [PATCH 2/2] crypto: skcipher: Remove VLA usage for SKCIPHER_REQUEST_ON_STACK

On 6 September 2018 at 06:53, Gilad Ben-Yossef <[email protected]> wrote:
> On Thu, Sep 6, 2018 at 1:49 AM, Ard Biesheuvel
> <[email protected]> wrote:
>> On 5 September 2018 at 23:05, Kees Cook <[email protected]> wrote:
>>> On Wed, Sep 5, 2018 at 2:18 AM, Ard Biesheuvel
>>> <[email protected]> wrote:
>>>> On 4 September 2018 at 20:16, Kees Cook <[email protected]> wrote:
>>>>> In the quest to remove all stack VLA usage from the kernel[1], this
>>>>> caps the skcipher request size similar to other limits and adds a sanity
>>>>> check at registration. Looking at instrumented tcrypt output, the largest
>>>>> is for lrw:
>>>>>
>>>>> crypt: testing lrw(aes)
>>>>> crypto_skcipher_set_reqsize: 8
>>>>> crypto_skcipher_set_reqsize: 88
>>>>> crypto_skcipher_set_reqsize: 472
>>>>>
>>>>
>>>> Are you sure this is a representative sampling? I haven't double
>>>> checked myself, but we have plenty of drivers for peripherals in
>>>> drivers/crypto that implement block ciphers, and they would not turn
>>>> up in tcrypt unless you are running on a platform that provides the
>>>> hardware in question.
>>>
>>> Hrm, excellent point. Looking at this again:
>>>
>>> The core part of the VLA is using this in the ON_STACK macro:
>>>
>>> static inline unsigned int crypto_skcipher_reqsize(struct crypto_skcipher *tfm)
>>> {
>>> return tfm->reqsize;
>>> }
>>>
>>> I don't find any struct crypto_skcipher .reqsize static initializers,
>>> and the initial reqsize is here:
>>>
>>> static int crypto_init_skcipher_ops_ablkcipher(struct crypto_tfm *tfm)
>>> {
>>> ...
>>> skcipher->reqsize = crypto_ablkcipher_reqsize(ablkcipher) +
>>> sizeof(struct ablkcipher_request);
>>>
>>> with updates via crypto_skcipher_set_reqsize().
>>>
>>> So I have to examine ablkcipher reqsize too:
>>>
>>> static inline unsigned int crypto_ablkcipher_reqsize(
>>> struct crypto_ablkcipher *tfm)
>>> {
>>> return crypto_ablkcipher_crt(tfm)->reqsize;
>>> }
>>>
>>> And of the crt_ablkcipher.reqsize assignments/initializers, I found:
>>>
>>> ablkcipher reqsize:
>>> 1 struct dcp_aes_req_ctx
>>> 8 struct atmel_tdes_reqctx
>>> 8 struct cryptd_blkcipher_request_ctx
>>> 8 struct mtk_aes_reqctx
>>> 8 struct omap_des_reqctx
>>> 8 struct s5p_aes_reqctx
>>> 8 struct sahara_aes_reqctx
>>> 8 struct stm32_cryp_reqctx
>>> 8 struct stm32_cryp_reqctx
>>> 16 struct ablk_ctx
>>> 24 struct atmel_aes_reqctx
>>> 48 struct omap_aes_reqctx
>>> 48 struct omap_aes_reqctx
>>> 48 struct qat_crypto_request
>>> 56 struct artpec6_crypto_request_context
>>> 64 struct chcr_blkcipher_req_ctx
>>> 80 struct spacc_req
>>> 80 struct virtio_crypto_sym_request
>>> 136 struct qce_cipher_reqctx
>>> 168 struct n2_request_context
>>> 328 struct ccp_des3_req_ctx
>>> 400 struct ccp_aes_req_ctx
>>> 536 struct hifn_request_context
>>> 992 struct cvm_req_ctx
>>> 2456 struct iproc_reqctx_s
>>>
>>> The base ablkcipher wrapper is:
>>> 80 struct ablkcipher_request
>>>
>>> And in my earlier skcipher wrapper analysis, lrw was the largest
>>> skcipher wrapper:
>>> 384 struct rctx
>>>
>>> iproc_reqctx_s is an extreme outlier, with cvm_req_ctx at a bit less than half.
>>>
>>> Making this a 2920 byte fixed array doesn't seem sensible at all
>>> (though that's what's already possible to use with existing
>>> SKCIPHER_REQUEST_ON_STACK users).
>>>
>>> What's the right path forward here?
>>>
>>
>> The skcipher implementations based on crypto IP blocks are typically
>> asynchronous, and I wouldn't be surprised if a fair number of
>> SKCIPHER_REQUEST_ON_STACK() users are limited to synchronous
>> skciphers.
>
> According to Herbert, SKCIPHER_REQUEST_ON_STACK() may only be used
> for invoking synchronous ciphers.
>
> In fact, due to the way the crypto API is built, if you try using it
> with any transformation that uses DMA
> you would most probably end up trying to DMA to/from the stack which
> as we all know is not a great idea.
>

Ah yes, I found [0] which contains that quote.

So that means that Kees can disregard the occurrences that are async
only, but it still implies that we cannot limit the reqsize like he
proposes unless we take the sync/async nature into account.
It also means we should probably BUG() or WARN() in
SKCIPHER_REQUEST_ON_STACK() when used with an async algo.

>>
>> So we could formalize this and limit SKCIPHER_REQUEST_ON_STACK() to
>> synchronous skciphers, which implies that the reqsize limit only has
>> to apply to synchronous skciphers as well. But before we can do this, we
>> have to identify the remaining occurrences that allow asynchronous
>> skciphers to be used, and replace them with heap allocations.
>
> Any such occurrences are almost for sure broken already due to the DMA
> issue I've mentioned.
>

I am not convinced of this. The skcipher request struct does not
contain any payload buffers, and whether the algo specific ctx struct
is used for DMA is completely up to the driver. So I am quite sure
there are plenty of async algos that work fine with
SKCIPHER_REQUEST_ON_STACK() and vmapped stacks.

> Gilad
>
> --
> Gilad Ben-Yossef
> Chief Coffee Drinker
>
> values of β will give rise to dom!

[0] https://www.redhat.com/archives/dm-devel/2018-January/msg00087.html

2018-09-06 11:05:15

by Ard Biesheuvel

[permalink] [raw]
Subject: Re: [PATCH 2/2] crypto: skcipher: Remove VLA usage for SKCIPHER_REQUEST_ON_STACK

On 6 September 2018 at 10:51, Herbert Xu <[email protected]> wrote:
> On Thu, Sep 06, 2018 at 10:11:59AM +0200, Ard Biesheuvel wrote:
>>
>> That way, we will almost certainly oops on a NULL pointer dereference
>> right after, but we at least avoid the stack corruption.
>
> A crash is just as bad as a BUG_ON.
>
> Is this even a real problem? Do we have any users of this construct
> that are using it on async algorithms?
>

Perhaps not, but it is not enforced atm.

In any case, limiting the reqsize is going to break things, so that
needs to occur based on the sync/async nature of the algo. That also
means we'll corrupt the stack if we ever end up using
SKCIPHER_REQUEST_ON_STACK() with an async algo whose reqsize is
greater than the sync reqsize limit, so I do think some additional
sanity check is appropriate.

2018-09-06 13:14:49

by Herbert Xu

[permalink] [raw]
Subject: Re: [PATCH 2/2] crypto: skcipher: Remove VLA usage for SKCIPHER_REQUEST_ON_STACK

On Thu, Sep 06, 2018 at 11:29:41AM +0200, Ard Biesheuvel wrote:
>
> Perhaps not, but it is not enforced atm.
>
> In any case, limiting the reqsize is going to break things, so that
> needs to occur based on the sync/async nature of the algo. That also
> means we'll corrupt the stack if we ever end up using
> SKCIPHER_REQUEST_ON_STACK() with an async algo whose reqsize is
> greater than the sync reqsize limit, so I do think some additional
> sanity check is appropriate.

I'd prefer compile-time based checks. Perhaps we can introduce
a wrapper around crypto_skcipher, say crypto_skcipher_sync which
could then be used by SKCIPHER_REQUEST_ON_STACK to ensure that
only sync algorithms can use this construct.

Cheers,
--
Email: Herbert Xu <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

2018-09-06 14:52:41

by Ard Biesheuvel

[permalink] [raw]
Subject: Re: [PATCH 2/2] crypto: skcipher: Remove VLA usage for SKCIPHER_REQUEST_ON_STACK

On 6 September 2018 at 15:11, Herbert Xu <[email protected]> wrote:
> On Thu, Sep 06, 2018 at 11:29:41AM +0200, Ard Biesheuvel wrote:
>>
>> Perhaps not, but it is not enforced atm.
>>
>> In any case, limiting the reqsize is going to break things, so that
>> needs to occur based on the sync/async nature of the algo. That also
>> means we'll corrupt the stack if we ever end up using
>> SKCIPHER_REQUEST_ON_STACK() with an async algo whose reqsize is
>> greater than the sync reqsize limit, so I do think some additional
>> sanity check is appropriate.
>
> I'd prefer compile-time based checks. Perhaps we can introduce
> a wrapper around crypto_skcipher, say crypto_skcipher_sync which
> could then be used by SKCIPHER_REQUEST_ON_STACK to ensure that
> only sync algorithms can use this construct.
>

That would require lots of changes in the callers, including ones that
already take care to use sync algos only.

How about we do something like the below instead?

diff --git a/include/crypto/skcipher.h b/include/crypto/skcipher.h
index 2f327f090c3e..ace707d59cd9 100644
--- a/include/crypto/skcipher.h
+++ b/include/crypto/skcipher.h
@@ -19,6 +19,7 @@

/**
* struct skcipher_request - Symmetric key cipher request
+ * @__onstack: 1 if the request was allocated by SKCIPHER_REQUEST_ON_STACK
* @cryptlen: Number of bytes to encrypt or decrypt
* @iv: Initialisation Vector
* @src: Source SG list
@@ -27,6 +28,7 @@
* @__ctx: Start of private context data
*/
struct skcipher_request {
+ unsigned char __onstack;
unsigned int cryptlen;

u8 *iv;
@@ -141,7 +143,7 @@ struct skcipher_alg {

#define SKCIPHER_REQUEST_ON_STACK(name, tfm) \
char __##name##_desc[sizeof(struct skcipher_request) + \
- crypto_skcipher_reqsize(tfm)] CRYPTO_MINALIGN_ATTR; \
+ crypto_skcipher_reqsize(tfm)] CRYPTO_MINALIGN_ATTR = { 1 }; \
struct skcipher_request *name = (void *)__##name##_desc

/**
@@ -437,6 +439,10 @@ static inline int crypto_skcipher_encrypt(struct skcipher_request *req)
{
struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);

+ if (req->__onstack &&
+ (crypto_skcipher_alg(tfm)->base.cra_flags & CRYPTO_ALG_ASYNC))
+ return -EINVAL;
+
if (crypto_skcipher_get_flags(tfm) & CRYPTO_TFM_NEED_KEY)
return -ENOKEY;

@@ -458,6 +464,10 @@ static inline int crypto_skcipher_decrypt(struct skcipher_request *req)
{
struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);

+ if (req->__onstack &&
+ (crypto_skcipher_alg(tfm)->base.cra_flags & CRYPTO_ALG_ASYNC))
+ return -EINVAL;
+
if (crypto_skcipher_get_flags(tfm) & CRYPTO_TFM_NEED_KEY)
return -ENOKEY;

2018-09-06 21:17:33

by Kees Cook

[permalink] [raw]
Subject: Re: [PATCH 2/2] crypto: skcipher: Remove VLA usage for SKCIPHER_REQUEST_ON_STACK

On Thu, Sep 6, 2018 at 7:49 AM, Ard Biesheuvel
<[email protected]> wrote:
> On 6 September 2018 at 15:11, Herbert Xu <[email protected]> wrote:
>> On Thu, Sep 06, 2018 at 11:29:41AM +0200, Ard Biesheuvel wrote:
>>>
>>> Perhaps not, but it is not enforced atm.
>>>
>>> In any case, limiting the reqsize is going to break things, so that
>>> needs to occur based on the sync/async nature of the algo. That also
>>> means we'll corrupt the stack if we ever end up using
>>> SKCIPHER_REQUEST_ON_STACK() with an async algo whose reqsize is
>>> greater than the sync reqsize limit, so I do think some additional
>>> sanity check is appropriate.
>>
>> I'd prefer compile-time based checks. Perhaps we can introduce
>> a wrapper around crypto_skcipher, say crypto_skcipher_sync which
>> could then be used by SKCIPHER_REQUEST_ON_STACK to ensure that
>> only sync algorithms can use this construct.
>>
>
> That would require lots of changes in the callers, including ones that
> already take care to use sync algos only.
>
> How about we do something like the below instead?

Oh, I like this, thanks!

-Kees

--
Kees Cook
Pixel Security

2018-09-06 21:25:06

by Kees Cook

[permalink] [raw]
Subject: Re: [PATCH 2/2] crypto: skcipher: Remove VLA usage for SKCIPHER_REQUEST_ON_STACK

On Wed, Sep 5, 2018 at 5:43 PM, Kees Cook <[email protected]> wrote:
> On Wed, Sep 5, 2018 at 3:49 PM, Ard Biesheuvel
> <[email protected]> wrote:
>> On 5 September 2018 at 23:05, Kees Cook <[email protected]> wrote:
>>> On Wed, Sep 5, 2018 at 2:18 AM, Ard Biesheuvel
>>> <[email protected]> wrote:
>>>> On 4 September 2018 at 20:16, Kees Cook <[email protected]> wrote:
>>>>> In the quest to remove all stack VLA usage from the kernel[1], this
>>>>> caps the skcipher request size similar to other limits and adds a sanity
>>>>> check at registration. Looking at instrumented tcrypt output, the largest
>>>>> is for lrw:
>>>>>
>>>>> crypt: testing lrw(aes)
>>>>> crypto_skcipher_set_reqsize: 8
>>>>> crypto_skcipher_set_reqsize: 88
>>>>> crypto_skcipher_set_reqsize: 472
>>>>>
>>>>
>>>> Are you sure this is a representative sampling? I haven't double
>>>> checked myself, but we have plenty of drivers for peripherals in
>>>> drivers/crypto that implement block ciphers, and they would not turn
>>>> up in tcrypt unless you are running on a platform that provides the
>>>> hardware in question.
>>>
>>> Hrm, excellent point. Looking at this again:
>>> [...]
>>> And of the crt_ablkcipher.reqsize assignments/initializers, I found:
>>>
>>> ablkcipher reqsize:
>>> 1 struct dcp_aes_req_ctx
>>> 8 struct atmel_tdes_reqctx
>>> 8 struct cryptd_blkcipher_request_ctx
>>> 8 struct mtk_aes_reqctx
>>> 8 struct omap_des_reqctx
>>> 8 struct s5p_aes_reqctx
>>> 8 struct sahara_aes_reqctx
>>> 8 struct stm32_cryp_reqctx
>>> 8 struct stm32_cryp_reqctx
>>> 16 struct ablk_ctx
>>> 24 struct atmel_aes_reqctx
>>> 48 struct omap_aes_reqctx
>>> 48 struct omap_aes_reqctx
>>> 48 struct qat_crypto_request
>>> 56 struct artpec6_crypto_request_context
>>> 64 struct chcr_blkcipher_req_ctx
>>> 80 struct spacc_req
>>> 80 struct virtio_crypto_sym_request
>>> 136 struct qce_cipher_reqctx
>>> 168 struct n2_request_context
>>> 328 struct ccp_des3_req_ctx
>>> 400 struct ccp_aes_req_ctx
>>> 536 struct hifn_request_context
>>> 992 struct cvm_req_ctx
>>> 2456 struct iproc_reqctx_s

All of these are ASYNC (they're all crt_ablkcipher), so IIUC, I can ignore them.

>>> The base ablkcipher wrapper is:
>>> 80 struct ablkcipher_request
>>>
>>> And in my earlier skcipher wrapper analysis, lrw was the largest
>>> skcipher wrapper:
>>> 384 struct rctx
>>>
>>> iproc_reqctx_s is an extreme outlier, with cvm_req_ctx at a bit less than half.
>>>
>>> Making this a 2920 byte fixed array doesn't seem sensible at all
>>> (though that's what's already possible to use with existing
>>> SKCIPHER_REQUEST_ON_STACK users).
>>>
>>> What's the right path forward here?
>>>
>>
>> The skcipher implementations based on crypto IP blocks are typically
>> asynchronous, and I wouldn't be surprised if a fair number of
>> SKCIPHER_REQUEST_ON_STACK() users are limited to synchronous
>> skciphers.
>
> Looks similar to ahash vs shash. :) Yes, so nearly all
> crypto_alloc_skcipher() users explicitly mask away ASYNC. What's left
> appears to be:
>
> crypto/drbg.c: sk_tfm = crypto_alloc_skcipher(ctr_name, 0, 0);
> crypto/tcrypt.c: tfm = crypto_alloc_skcipher(algo, 0, async ? 0
> : CRYPTO_ALG_ASYNC);
> drivers/crypto/omap-aes.c: ctx->ctr =
> crypto_alloc_skcipher("ecb(aes)", 0, 0);
> drivers/md/dm-crypt.c: cc->cipher_tfm.tfms[i] =
> crypto_alloc_skcipher(ciphermode, 0, 0);
> drivers/md/dm-integrity.c: ic->journal_crypt =
> crypto_alloc_skcipher(ic->journal_crypt_alg.alg_string, 0, 0);
> fs/crypto/keyinfo.c: struct crypto_skcipher *tfm =
> crypto_alloc_skcipher("ecb(aes)", 0, 0);
> fs/crypto/keyinfo.c: ctfm = crypto_alloc_skcipher(mode->cipher_str, 0, 0);
> fs/ecryptfs/crypto.c: crypt_stat->tfm =
> crypto_alloc_skcipher(full_alg_name, 0, 0);
>
> I'll cross-reference this with SKCIPHER_REQUEST_ON_STACK...

None of these use SKCIPHER_REQUEST_ON_STACK that I can find.

>> So we could formalize this and limit SKCIPHER_REQUEST_ON_STACK() to
>> synchronous skciphers, which implies that the reqsize limit only has
>> to apply to synchronous skciphers as well. But before we can do this, we
>> have to identify the remaining occurrences that allow asynchronous
>> skciphers to be used, and replace them with heap allocations.
>
> Sounds good; thanks!

crypto_init_skcipher_ops_blkcipher() doesn't touch reqsize at all, so
the only places I can find it gets changed are with direct callers of
crypto_skcipher_set_reqsize(), which, when wrapping a sync blkcipher
start with a reqsize == 0. So, the remaining non-ASYNC callers ask
for:

  4 struct sun4i_cipher_req_ctx
 96 struct crypto_rfc3686_req_ctx
375 sum of:
    160 crypto_skcipher_blocksize(cipher) (max)
    152 struct crypto_cts_reqctx
     63 align_mask (max)
384 struct rctx

So, following your patch to encrypt/decrypt, I can add reqsize check
there. How does this look, on top of your patch?

--- a/include/crypto/skcipher.h
+++ b/include/crypto/skcipher.h
@@ -144,9 +144,10 @@ struct skcipher_alg {
/*
* This must only ever be used with synchronous algorithms.
*/
+#define MAX_SYNC_SKCIPHER_REQSIZE 384
#define SKCIPHER_REQUEST_ON_STACK(name, tfm) \
char __##name##_desc[sizeof(struct skcipher_request) + \
- crypto_skcipher_reqsize(tfm)] CRYPTO_MINALIGN_ATTR = { 1 }; \
+ MAX_SYNC_SKCIPHER_REQSIZE] CRYPTO_MINALIGN_ATTR = { 1 }; \
struct skcipher_request *name = (void *)__##name##_desc

/**
@@ -442,10 +443,14 @@ static inline int crypto_skcipher_encrypt(struct skcipher_request *req)
{
struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);

- if (req->__onstack &&
- WARN_ON(crypto_skcipher_alg(tfm)->base.cra_flags &
- CRYPTO_ALG_ASYNC))
- return -EINVAL;
+ if (req->__onstack) {
+ if (WARN_ON(crypto_skcipher_alg(tfm)->base.cra_flags &
+ CRYPTO_ALG_ASYNC))
+ return -EINVAL;
+ if (WARN_ON(crypto_skcipher_reqsize(tfm) >
+ MAX_SYNC_SKCIPHER_REQSIZE))
+ return -ENOSPC;
+ }
...etc

--
Kees Cook
Pixel Security

2018-09-06 22:41:58

by Kees Cook

[permalink] [raw]
Subject: Re: [PATCH 2/2] crypto: skcipher: Remove VLA usage for SKCIPHER_REQUEST_ON_STACK

On Thu, Sep 6, 2018 at 1:22 PM, Kees Cook <[email protected]> wrote:
> On Wed, Sep 5, 2018 at 5:43 PM, Kees Cook <[email protected]> wrote:
>> On Wed, Sep 5, 2018 at 3:49 PM, Ard Biesheuvel
>> <[email protected]> wrote:
>>> On 5 September 2018 at 23:05, Kees Cook <[email protected]> wrote:
>>>> On Wed, Sep 5, 2018 at 2:18 AM, Ard Biesheuvel
>>>> <[email protected]> wrote:
>>>>> On 4 September 2018 at 20:16, Kees Cook <[email protected]> wrote:
>>>>>> In the quest to remove all stack VLA usage from the kernel[1], this
>>>>>> caps the skcipher request size similar to other limits and adds a sanity
>>>>>> check at registration. Looking at instrumented tcrypt output, the largest
>>>>>> is for lrw:
>>>>>>
>>>>>> crypt: testing lrw(aes)
>>>>>> crypto_skcipher_set_reqsize: 8
>>>>>> crypto_skcipher_set_reqsize: 88
>>>>>> crypto_skcipher_set_reqsize: 472
>>>>>>
>>>>>
>>>>> Are you sure this is a representative sampling? I haven't double
>>>>> checked myself, but we have plenty of drivers for peripherals in
>>>>> drivers/crypto that implement block ciphers, and they would not turn
>>>>> up in tcrypt unless you are running on a platform that provides the
>>>>> hardware in question.
>>>>
>>>> Hrm, excellent point. Looking at this again:
>>>> [...]
>>>> And of the crt_ablkcipher.reqsize assignments/initializers, I found:
>>>>
>>>> ablkcipher reqsize:
>>>> 1 struct dcp_aes_req_ctx
>>>> 8 struct atmel_tdes_reqctx
>>>> 8 struct cryptd_blkcipher_request_ctx
>>>> 8 struct mtk_aes_reqctx
>>>> 8 struct omap_des_reqctx
>>>> 8 struct s5p_aes_reqctx
>>>> 8 struct sahara_aes_reqctx
>>>> 8 struct stm32_cryp_reqctx
>>>> 8 struct stm32_cryp_reqctx
>>>> 16 struct ablk_ctx
>>>> 24 struct atmel_aes_reqctx
>>>> 48 struct omap_aes_reqctx
>>>> 48 struct omap_aes_reqctx
>>>> 48 struct qat_crypto_request
>>>> 56 struct artpec6_crypto_request_context
>>>> 64 struct chcr_blkcipher_req_ctx
>>>> 80 struct spacc_req
>>>> 80 struct virtio_crypto_sym_request
>>>> 136 struct qce_cipher_reqctx
>>>> 168 struct n2_request_context
>>>> 328 struct ccp_des3_req_ctx
>>>> 400 struct ccp_aes_req_ctx
>>>> 536 struct hifn_request_context
>>>> 992 struct cvm_req_ctx
>>>> 2456 struct iproc_reqctx_s
>
> All of these are ASYNC (they're all crt_ablkcipher), so IIUC, I can ignore them.
>
>>>> The base ablkcipher wrapper is:
>>>> 80 struct ablkcipher_request
>>>>
>>>> And in my earlier skcipher wrapper analysis, lrw was the largest
>>>> skcipher wrapper:
>>>> 384 struct rctx
>>>>
>>>> iproc_reqctx_s is an extreme outlier, with cvm_req_ctx at a bit less than half.
>>>>
>>>> Making this a 2920 byte fixed array doesn't seem sensible at all
>>>> (though that's effectively the stack footprint existing
>>>> SKCIPHER_REQUEST_ON_STACK users can already end up with).
>>>>
>>>> What's the right path forward here?
>>>>
>>>
>>> The skcipher implementations based on crypto IP blocks are typically
>>> asynchronous, and I wouldn't be surprised if a fair number of
>>> SKCIPHER_REQUEST_ON_STACK() users are limited to synchronous
>>> skciphers.
>>
>> Looks similar to ahash vs shash. :) Yes, so nearly all
>> crypto_alloc_skcipher() users explicitly mask away ASYNC. What's left
>> appears to be:
>>
>> crypto/drbg.c: sk_tfm = crypto_alloc_skcipher(ctr_name, 0, 0);
>> crypto/tcrypt.c: tfm = crypto_alloc_skcipher(algo, 0, async ? 0 : CRYPTO_ALG_ASYNC);
>> drivers/crypto/omap-aes.c: ctx->ctr = crypto_alloc_skcipher("ecb(aes)", 0, 0);
>> drivers/md/dm-crypt.c: cc->cipher_tfm.tfms[i] = crypto_alloc_skcipher(ciphermode, 0, 0);
>> drivers/md/dm-integrity.c: ic->journal_crypt = crypto_alloc_skcipher(ic->journal_crypt_alg.alg_string, 0, 0);
>> fs/crypto/keyinfo.c: struct crypto_skcipher *tfm = crypto_alloc_skcipher("ecb(aes)", 0, 0);
>> fs/crypto/keyinfo.c: ctfm = crypto_alloc_skcipher(mode->cipher_str, 0, 0);
>> fs/ecryptfs/crypto.c: crypt_stat->tfm = crypto_alloc_skcipher(full_alg_name, 0, 0);
>>
>> I'll cross-reference this with SKCIPHER_REQUEST_ON_STACK...
>
> None of these use SKCIPHER_REQUEST_ON_STACK that I can find.
>
>>> So we could formalize this and limit SKCIPHER_REQUEST_ON_STACK() to
>>> synchronous skciphers, which implies that the reqsize limit only has
>>> to apply synchronous skciphers as well. But before we can do this, we
>>> have to identify the remaining occurrences that allow asynchronous
>>> skciphers to be used, and replace them with heap allocations.
>>
>> Sounds good; thanks!
>
> crypto_init_skcipher_ops_blkcipher() doesn't touch reqsize at all, so
> the only places reqsize gets changed are direct callers of
> crypto_skcipher_set_reqsize(), which, when wrapping a sync blkcipher,
> start from a reqsize of 0. So, the remaining non-ASYNC callers ask
> for:
>
>   4 struct sun4i_cipher_req_ctx
>  96 struct crypto_rfc3686_req_ctx
> 375 sum of:
>        160 crypto_skcipher_blocksize(cipher) (max)
>        152 struct crypto_cts_reqctx
>         63 align_mask (max)
> 384 struct rctx
>
> So, following your patch to encrypt/decrypt, I can add reqsize check
> there. How does this look, on top of your patch?
>
> --- a/include/crypto/skcipher.h
> +++ b/include/crypto/skcipher.h
> @@ -144,9 +144,10 @@ struct skcipher_alg {
>  /*
>   * This must only ever be used with synchronous algorithms.
>   */
> +#define MAX_SYNC_SKCIPHER_REQSIZE	384
>  #define SKCIPHER_REQUEST_ON_STACK(name, tfm) \
>  	char __##name##_desc[sizeof(struct skcipher_request) + \
> -		crypto_skcipher_reqsize(tfm)] CRYPTO_MINALIGN_ATTR = { 1 }; \
> +		MAX_SYNC_SKCIPHER_REQSIZE] CRYPTO_MINALIGN_ATTR = { 1 }; \
>  	struct skcipher_request *name = (void *)__##name##_desc

If the lack of named initializer is too ugly, we could do something crazy like:

#define MAX_SYNC_SKCIPHER_REQSIZE	384

struct skcipher_request_on_stack {
	union {
		struct skcipher_request req;
		char bytes[sizeof(struct skcipher_request) +
			   MAX_SYNC_SKCIPHER_REQSIZE];
	};
};

/*
 * This must only ever be used with synchronous algorithms.
 */
#define SKCIPHER_REQUEST_ON_STACK(name) \
	struct skcipher_request_on_stack __##name##_req = \
		{ .req.__onstack = 1 }; \
	struct skcipher_request *name = &(__##name##_req.req)


-Kees

--
Kees Cook
Pixel Security