2020-12-17 17:24:45

by Daniele Alessandrelli

[permalink] [raw]
Subject: [RFC PATCH 0/6] Keem Bay OCS ECC crypto driver

Hi,

I'm posting this patch series as RFC since there are a few open points on
which I'd like to have the opinion of the crypto maintainers and the rest of
kernel community.

The patch series adds the Intel Keem Bay OCS ECC crypto driver, which
enables hardware-accelerated ECDH on the Intel Keem Bay SoC.

The open points are the followings:

1. The OCS ECC HW supports the NIST P-384 curve, which, however, is not
supported by Linux ECDH software implementation ('crypto/ecdh.c').
Support for P-384 is added to the driver anyway by reserving a curve id
for P-384 in 'include/crypto/ecdh.h' and defining the cure parameters in
'drivers/crypto/keembay/ocs-ecc-curve-defs.h'. Is this reasonable?

2. The OCS ECC HW does not support the NIST P-192 curve. We were planning to
add SW fallback for P-192 in the driver, but the Intel Crypto team
(which, internally, has to approve any code involving cryptography)
advised against it, because they consider P-192 weak. As a result, the
driver is not passing crypto self-tests. Is there any possible solution
to this? Is it reasonable to change the self-tests to only test the
curves actually supported by the tested driver? (not fully sure how to do
that).

3. Another request from our crypto team was to make private key generation
optional in the driver, since they advice against automatic key
generation. As a result, the driver passes the P-256 self test only when
CONFIG_CRYPTO_DEV_KEEMBAY_OCS_ECDH_GEN_PRIV_KEY_SUPPORT=y. Is that
acceptable?


Daniele Alessandrelli (2):
crypto: ecc - Move ecc.h to include/crypto/internal
crypto: ecc - Export additional helper functions

Prabhjot Khurana (4):
crypto: engine - Add KPP Support to Crypto Engine
crypto: ecdh - Add Curve ID for NIST P-384
dt-bindings: crypto: Add Keem Bay ECC bindings
crypto: keembay-ocs-ecc - Add Keem Bay OCS ECC Driver

Documentation/crypto/crypto_engine.rst | 4 +
.../crypto/intel,keembay-ocs-ecc.yaml | 47 +
MAINTAINERS | 11 +
crypto/crypto_engine.c | 27 +
crypto/ecc.c | 18 +-
crypto/ecdh.c | 2 +-
crypto/ecrdsa.c | 2 +-
crypto/ecrdsa_defs.h | 2 +-
drivers/crypto/keembay/Kconfig | 31 +
drivers/crypto/keembay/Makefile | 2 +
drivers/crypto/keembay/keembay-ocs-ecc.c | 1003 +++++++++++++++++
drivers/crypto/keembay/ocs-ecc-curve-defs.h | 68 ++
include/crypto/ecdh.h | 1 +
include/crypto/engine.h | 5 +
{crypto => include/crypto/internal}/ecc.h | 44 +
15 files changed, 1257 insertions(+), 10 deletions(-)
create mode 100644 Documentation/devicetree/bindings/crypto/intel,keembay-ocs-ecc.yaml
create mode 100644 drivers/crypto/keembay/keembay-ocs-ecc.c
create mode 100644 drivers/crypto/keembay/ocs-ecc-curve-defs.h
rename {crypto => include/crypto/internal}/ecc.h (87%)


base-commit: 90cc8cf2d1ab87d708ebc311ac104ccbbefad9fc
--
2.26.2


2020-12-17 17:25:09

by Daniele Alessandrelli

[permalink] [raw]
Subject: [RFC PATCH 2/6] crypto: ecc - Move ecc.h to include/crypto/internal

From: Daniele Alessandrelli <[email protected]>

Move ecc.h header file to 'include/crypto/internal' so that it can be
easily imported from everywhere in the kernel tree.

This change is done to allow crypto device drivers to re-use the symbols
exported by 'crypto/ecc.c', thus avoiding code duplication.

Signed-off-by: Daniele Alessandrelli <[email protected]>
---
crypto/ecc.c | 2 +-
crypto/ecdh.c | 2 +-
crypto/ecrdsa.c | 2 +-
crypto/ecrdsa_defs.h | 2 +-
{crypto => include/crypto/internal}/ecc.h | 0
5 files changed, 4 insertions(+), 4 deletions(-)
rename {crypto => include/crypto/internal}/ecc.h (100%)

diff --git a/crypto/ecc.c b/crypto/ecc.c
index c80aa25994a0..9aa801315a32 100644
--- a/crypto/ecc.c
+++ b/crypto/ecc.c
@@ -31,10 +31,10 @@
#include <linux/fips.h>
#include <crypto/ecdh.h>
#include <crypto/rng.h>
+#include <crypto/internal/ecc.h>
#include <asm/unaligned.h>
#include <linux/ratelimit.h>

-#include "ecc.h"
#include "ecc_curve_defs.h"

typedef struct {
diff --git a/crypto/ecdh.c b/crypto/ecdh.c
index d56b8603dec9..edb06ce90d1d 100644
--- a/crypto/ecdh.c
+++ b/crypto/ecdh.c
@@ -6,11 +6,11 @@
*/

#include <linux/module.h>
+#include <crypto/internal/ecc.h>
#include <crypto/internal/kpp.h>
#include <crypto/kpp.h>
#include <crypto/ecdh.h>
#include <linux/scatterlist.h>
-#include "ecc.h"

struct ecdh_ctx {
unsigned int curve_id;
diff --git a/crypto/ecrdsa.c b/crypto/ecrdsa.c
index 6a3fd09057d0..b32ffcaad9ad 100644
--- a/crypto/ecrdsa.c
+++ b/crypto/ecrdsa.c
@@ -20,12 +20,12 @@
#include <linux/crypto.h>
#include <crypto/streebog.h>
#include <crypto/internal/akcipher.h>
+#include <crypto/internal/ecc.h>
#include <crypto/akcipher.h>
#include <linux/oid_registry.h>
#include <linux/scatterlist.h>
#include "ecrdsa_params.asn1.h"
#include "ecrdsa_pub_key.asn1.h"
-#include "ecc.h"
#include "ecrdsa_defs.h"

#define ECRDSA_MAX_SIG_SIZE (2 * 512 / 8)
diff --git a/crypto/ecrdsa_defs.h b/crypto/ecrdsa_defs.h
index 170baf039007..0056335b9d03 100644
--- a/crypto/ecrdsa_defs.h
+++ b/crypto/ecrdsa_defs.h
@@ -13,7 +13,7 @@
#ifndef _CRYTO_ECRDSA_DEFS_H
#define _CRYTO_ECRDSA_DEFS_H

-#include "ecc.h"
+#include <crypto/internal/ecc.h>

#define ECRDSA_MAX_SIG_SIZE (2 * 512 / 8)
#define ECRDSA_MAX_DIGITS (512 / 64)
diff --git a/crypto/ecc.h b/include/crypto/internal/ecc.h
similarity index 100%
rename from crypto/ecc.h
rename to include/crypto/internal/ecc.h
--
2.26.2

2020-12-17 17:25:24

by Daniele Alessandrelli

[permalink] [raw]
Subject: [RFC PATCH 3/6] crypto: ecc - Export additional helper functions

From: Daniele Alessandrelli <[email protected]>

Export the following additional ECC helper functions:
- ecc_alloc_point()
- ecc_free_point()
- vli_num_bits()
- ecc_point_is_zero()
- ecc_swap_digits()

This is done to allow future KPP device drivers to re-use existing code,
thus simplifying their implementation.

Functions are exported using EXPORT_SYMBOL() (instead of
EXPORT_SYMBOL_GPL()) to be consistent with the functions already
exported by crypto/ecc.c.

Exported functions are documented in crypto/ecc.h.

Signed-off-by: Daniele Alessandrelli <[email protected]>
---
crypto/ecc.c | 16 ++++++++-----
include/crypto/internal/ecc.h | 44 +++++++++++++++++++++++++++++++++++
2 files changed, 54 insertions(+), 6 deletions(-)

diff --git a/crypto/ecc.c b/crypto/ecc.c
index 9aa801315a32..e8176db071d8 100644
--- a/crypto/ecc.c
+++ b/crypto/ecc.c
@@ -70,7 +70,7 @@ static void ecc_free_digits_space(u64 *space)
kfree_sensitive(space);
}

-static struct ecc_point *ecc_alloc_point(unsigned int ndigits)
+struct ecc_point *ecc_alloc_point(unsigned int ndigits)
{
struct ecc_point *p = kmalloc(sizeof(*p), GFP_KERNEL);

@@ -95,8 +95,9 @@ static struct ecc_point *ecc_alloc_point(unsigned int ndigits)
kfree(p);
return NULL;
}
+EXPORT_SYMBOL(ecc_alloc_point);

-static void ecc_free_point(struct ecc_point *p)
+void ecc_free_point(struct ecc_point *p)
{
if (!p)
return;
@@ -105,6 +106,7 @@ static void ecc_free_point(struct ecc_point *p)
kfree_sensitive(p->y);
kfree_sensitive(p);
}
+EXPORT_SYMBOL(ecc_free_point);

static void vli_clear(u64 *vli, unsigned int ndigits)
{
@@ -154,7 +156,7 @@ static unsigned int vli_num_digits(const u64 *vli, unsigned int ndigits)
}

/* Counts the number of bits required for vli. */
-static unsigned int vli_num_bits(const u64 *vli, unsigned int ndigits)
+unsigned int vli_num_bits(const u64 *vli, unsigned int ndigits)
{
unsigned int i, num_digits;
u64 digit;
@@ -169,6 +171,7 @@ static unsigned int vli_num_bits(const u64 *vli, unsigned int ndigits)

return ((num_digits - 1) * 64 + i);
}
+EXPORT_SYMBOL(vli_num_bits);

/* Set dest from unaligned bit string src. */
void vli_from_be64(u64 *dest, const void *src, unsigned int ndigits)
@@ -933,11 +936,12 @@ EXPORT_SYMBOL(vli_mod_inv);
/* ------ Point operations ------ */

/* Returns true if p_point is the point at infinity, false otherwise. */
-static bool ecc_point_is_zero(const struct ecc_point *point)
+bool ecc_point_is_zero(const struct ecc_point *point)
{
return (vli_is_zero(point->x, point->ndigits) &&
vli_is_zero(point->y, point->ndigits));
}
+EXPORT_SYMBOL(ecc_point_is_zero);

/* Point multiplication algorithm using Montgomery's ladder with co-Z
* coordinates. From https://eprint.iacr.org/2011/338.pdf
@@ -1281,8 +1285,7 @@ void ecc_point_mult_shamir(const struct ecc_point *result,
}
EXPORT_SYMBOL(ecc_point_mult_shamir);

-static inline void ecc_swap_digits(const u64 *in, u64 *out,
- unsigned int ndigits)
+void ecc_swap_digits(const u64 *in, u64 *out, unsigned int ndigits)
{
const __be64 *src = (__force __be64 *)in;
int i;
@@ -1290,6 +1293,7 @@ static inline void ecc_swap_digits(const u64 *in, u64 *out,
for (i = 0; i < ndigits; i++)
out[i] = be64_to_cpu(src[ndigits - 1 - i]);
}
+EXPORT_SYMBOL(ecc_swap_digits);

static int __ecc_is_key_valid(const struct ecc_curve *curve,
const u64 *private_key, unsigned int ndigits)
diff --git a/include/crypto/internal/ecc.h b/include/crypto/internal/ecc.h
index d4e546b9ad79..2f6ac78b64f3 100644
--- a/include/crypto/internal/ecc.h
+++ b/include/crypto/internal/ecc.h
@@ -239,6 +239,41 @@ void vli_mod_inv(u64 *result, const u64 *input, const u64 *mod,
void vli_mod_mult_slow(u64 *result, const u64 *left, const u64 *right,
const u64 *mod, unsigned int ndigits);

+/**
+ * vli_num_bits() - Counts the number of bits required for vli.
+ *
+ * @vli: vli to check.
+ * @ndigits: Length of the @vli
+ *
+ * Return: The number of bits required to represent @vli.
+ */
+unsigned int vli_num_bits(const u64 *vli, unsigned int ndigits);
+
+/**
+ * ecc_aloc_point() - Allocate ECC point.
+ *
+ * @ndigits: Length of vlis in u64 qwords.
+ *
+ * Return: Pointer to the allocated point or NULL if allocation failed.
+ */
+struct ecc_point *ecc_alloc_point(unsigned int ndigits);
+
+/**
+ * ecc_free_point() - Free ECC point.
+ *
+ * @p: The point to free.
+ */
+void ecc_free_point(struct ecc_point *p);
+
+/**
+ * ecc_point_is_zero() - Check if point is zero.
+ *
+ * @p: Point to check for zero.
+ *
+ * Return: true if point is the point at infinity, false otherwise.
+ */
+bool ecc_point_is_zero(const struct ecc_point *point);
+
/**
* ecc_point_mult_shamir() - Add two points multiplied by scalars
*
@@ -256,4 +291,13 @@ void ecc_point_mult_shamir(const struct ecc_point *result,
const u64 *x, const struct ecc_point *p,
const u64 *y, const struct ecc_point *q,
const struct ecc_curve *curve);
+
+/**
+ * ecc_swap_digits() - Swap digits in vli.
+ *
+ * @in: The input vli (whose digits will be swapped).
+ * @out: The output vli (where swapped digits will be saved).
+ * @ndigits: Length of vli to be swapped.
+ */
+void ecc_swap_digits(const u64 *in, u64 *out, unsigned int ndigits);
#endif
--
2.26.2

2020-12-17 17:26:34

by Daniele Alessandrelli

[permalink] [raw]
Subject: [RFC PATCH 4/6] crypto: ecdh - Add Curve ID for NIST P-384

From: Prabhjot Khurana <[email protected]>

Reserve ECC curve id for NIST P-384 curve.

This is done to prepare for future support of the P-384 curve by KPP
device drivers.

Signed-off-by: Prabhjot Khurana <[email protected]>
Signed-off-by: Daniele Alessandrelli <[email protected]>
---
include/crypto/ecdh.h | 1 +
1 file changed, 1 insertion(+)

diff --git a/include/crypto/ecdh.h b/include/crypto/ecdh.h
index a5b805b5526d..e4ba1de961e4 100644
--- a/include/crypto/ecdh.h
+++ b/include/crypto/ecdh.h
@@ -25,6 +25,7 @@
/* Curves IDs */
#define ECC_CURVE_NIST_P192 0x0001
#define ECC_CURVE_NIST_P256 0x0002
+#define ECC_CURVE_NIST_P384 0x0003

/**
* struct ecdh - define an ECDH private key
--
2.26.2

2020-12-17 17:26:37

by Daniele Alessandrelli

[permalink] [raw]
Subject: [RFC PATCH 5/6] dt-bindings: crypto: Add Keem Bay ECC bindings

From: Prabhjot Khurana <[email protected]>

Add Keem Bay Offload and Crypto Subsystem (OCS) Elliptic Curve
Cryptography (ECC) device tree bindings.

Signed-off-by: Prabhjot Khurana <[email protected]>
Signed-off-by: Daniele Alessandrelli <[email protected]>
---
.../crypto/intel,keembay-ocs-ecc.yaml | 47 +++++++++++++++++++
MAINTAINERS | 7 +++
2 files changed, 54 insertions(+)
create mode 100644 Documentation/devicetree/bindings/crypto/intel,keembay-ocs-ecc.yaml

diff --git a/Documentation/devicetree/bindings/crypto/intel,keembay-ocs-ecc.yaml b/Documentation/devicetree/bindings/crypto/intel,keembay-ocs-ecc.yaml
new file mode 100644
index 000000000000..a3c16451b1ad
--- /dev/null
+++ b/Documentation/devicetree/bindings/crypto/intel,keembay-ocs-ecc.yaml
@@ -0,0 +1,47 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/crypto/intel,keembay-ocs-ecc.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Intel Keem Bay OCS ECC Device Tree Bindings
+
+maintainers:
+ - Daniele Alessandrelli <[email protected]>
+ - Prabhjot Khurana <[email protected]>
+
+description:
+ The Intel Keem Bay Offload and Crypto Subsystem (OCS) Elliptic Curve
+ Cryptography (ECC) device provides hardware acceleration for elliptic curve
+ cryptography using the NIST P-256 and NIST P-384 elliptic curves.
+
+properties:
+ compatible:
+ const: intel,keembay-ocs-ecc
+
+ reg:
+ maxItems: 1
+
+ interrupts:
+ maxItems: 1
+
+ clocks:
+ maxItems: 1
+
+required:
+ - compatible
+ - reg
+ - interrupts
+ - clocks
+
+additionalProperties: false
+
+examples:
+ - |
+ #include <dt-bindings/interrupt-controller/arm-gic.h>
+ crypto@30001000 {
+ compatible = "intel,keembay-ocs-ecc";
+ reg = <0x30001000 0x1000>;
+ interrupts = <GIC_SPI 120 IRQ_TYPE_LEVEL_HIGH>;
+ clocks = <&scmi_clk 95>;
+ };
diff --git a/MAINTAINERS b/MAINTAINERS
index f5eafee83bc6..fa5bc7c4c9fe 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -9070,6 +9070,13 @@ F: drivers/crypto/keembay/keembay-ocs-aes-core.c
F: drivers/crypto/keembay/ocs-aes.c
F: drivers/crypto/keembay/ocs-aes.h

+INTEL KEEM BAY OCS ECC CRYPTO DRIVER
+M: Daniele Alessandrelli <[email protected]>
+M: Prabhjot Khurana <[email protected]>
+M: Mark Gross <[email protected]>
+S: Maintained
+F: Documentation/devicetree/bindings/crypto/intel,keembay-ocs-ecc.yaml
+
INTEL MANAGEMENT ENGINE (mei)
M: Tomas Winkler <[email protected]>
L: [email protected]
--
2.26.2

2020-12-17 17:28:00

by Daniele Alessandrelli

[permalink] [raw]
Subject: [RFC PATCH 6/6] crypto: keembay-ocs-ecc - Add Keem Bay OCS ECC Driver

From: Prabhjot Khurana <[email protected]>

The Intel Keem Bay SoC can provide hardware acceleration of Elliptic
Curve Cryptography (ECC) by means of its Offload and Crypto Subsystem
(OCS).

Add the Keem Bay OCS ECC driver which leverages such hardware
capabilities to provide hardware-acceleration of ECDH-256 and ECDH-384.

Signed-off-by: Prabhjot Khurana <[email protected]>
Co-developed-by: Daniele Alessandrelli <[email protected]>
Signed-off-by: Daniele Alessandrelli <[email protected]>
---
MAINTAINERS | 4 +
drivers/crypto/keembay/Kconfig | 31 +
drivers/crypto/keembay/Makefile | 2 +
drivers/crypto/keembay/keembay-ocs-ecc.c | 1003 +++++++++++++++++++
drivers/crypto/keembay/ocs-ecc-curve-defs.h | 68 ++
5 files changed, 1108 insertions(+)
create mode 100644 drivers/crypto/keembay/keembay-ocs-ecc.c
create mode 100644 drivers/crypto/keembay/ocs-ecc-curve-defs.h

diff --git a/MAINTAINERS b/MAINTAINERS
index fa5bc7c4c9fe..a9ba4989fb45 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -9076,6 +9076,10 @@ M: Prabhjot Khurana <[email protected]>
M: Mark Gross <[email protected]>
S: Maintained
F: Documentation/devicetree/bindings/crypto/intel,keembay-ocs-ecc.yaml
+F: drivers/crypto/keembay/Kconfig
+F: drivers/crypto/keembay/Makefile
+F: drivers/crypto/keembay/keembay-ocs-ecc.c
+F: drivers/crypto/keembay/ocs-ecc-curve-defs.h

INTEL MANAGEMENT ENGINE (mei)
M: Tomas Winkler <[email protected]>
diff --git a/drivers/crypto/keembay/Kconfig b/drivers/crypto/keembay/Kconfig
index 3c16797b25b9..01f9d4ecfa92 100644
--- a/drivers/crypto/keembay/Kconfig
+++ b/drivers/crypto/keembay/Kconfig
@@ -37,3 +37,34 @@ config CRYPTO_DEV_KEEMBAY_OCS_AES_SM4_CTS
Provides OCS version of cts(cbc(aes)) and cts(cbc(sm4)).

Intel does not recommend use of CTS mode with AES/SM4.
+
+config CRYPTO_DEV_KEEMBAY_OCS_ECC
+ tristate "Support for Intel Keem Bay OCS ECC HW acceleration"
+ depends on ARCH_KEEMBAY || COMPILE_TEST
+ depends on OF || COMPILE_TEST
+ depends on HAS_IOMEM
+ select CRYPTO_ECDH
+ select CRYPTO_ENGINE
+ help
+ Support for Intel Keem Bay Offload and Crypto Subsystem (OCS)
+ Elliptic Curve Cryptography (ECC) hardware acceleration for use with
+ Crypto API.
+
+ Provides OCS acceleration for ECDH-256, ECDH-384.
+
+ Say Y or M if you are compiling for the Intel Keem Bay SoC. The
+ module will be called keembay-ocs-ecc.
+
+ If unsure, say N.
+
+config CRYPTO_DEV_KEEMBAY_OCS_ECDH_GEN_PRIV_KEY_SUPPORT
+ bool "Add ECDH private key generation in Keem Bay OCS ECC driver"
+ depends on CRYPTO_DEV_KEEMBAY_OCS_ECC
+ help
+ Add ECDH private key generation in the Intel Keem Bay OCS ECC driver.
+
+ Intel does not recommend use of private key generation for ECDH
+ computations, which, however, is required to pass crypto self-tests.
+
+ Say Y if you need the driver to pass crypto self-tests. If unsure,
+ say N.
diff --git a/drivers/crypto/keembay/Makefile b/drivers/crypto/keembay/Makefile
index f21e2c4ab3b3..6ac1182871f7 100644
--- a/drivers/crypto/keembay/Makefile
+++ b/drivers/crypto/keembay/Makefile
@@ -3,3 +3,5 @@
#
obj-$(CONFIG_CRYPTO_DEV_KEEMBAY_OCS_AES_SM4) += keembay-ocs-aes.o
keembay-ocs-aes-objs := keembay-ocs-aes-core.o ocs-aes.o
+
+obj-$(CONFIG_CRYPTO_DEV_KEEMBAY_OCS_ECC) += keembay-ocs-ecc.o
diff --git a/drivers/crypto/keembay/keembay-ocs-ecc.c b/drivers/crypto/keembay/keembay-ocs-ecc.c
new file mode 100644
index 000000000000..7775ac864fff
--- /dev/null
+++ b/drivers/crypto/keembay/keembay-ocs-ecc.c
@@ -0,0 +1,1003 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Intel Keem Bay OCS ECC Crypto Driver.
+ *
+ * Copyright (C) 2019-2020 Intel Corporation
+ */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
+#include <linux/clk.h>
+#include <linux/completion.h>
+#include <linux/crypto.h>
+#include <linux/delay.h>
+#include <linux/fips.h>
+#include <linux/interrupt.h>
+#include <linux/io.h>
+#include <linux/iopoll.h>
+#include <linux/irq.h>
+#include <linux/module.h>
+#include <linux/of.h>
+#include <linux/platform_device.h>
+#include <linux/scatterlist.h>
+#include <linux/slab.h>
+#include <linux/types.h>
+
+#include <crypto/ecdh.h>
+#include <crypto/engine.h>
+#include <crypto/kpp.h>
+#include <crypto/rng.h>
+
+#include <crypto/internal/ecc.h>
+#include <crypto/internal/kpp.h>
+
+#include "ocs-ecc-curve-defs.h"
+
+#define DRV_NAME "keembay-ocs-ecc"
+
+#define KMB_OCS_ECC_PRIORITY 350
+
+#define HW_OFFS_OCS_ECC_COMMAND 0x00000000
+#define HW_OFFS_OCS_ECC_STATUS 0x00000004
+#define HW_OFFS_OCS_ECC_DATA_IN 0x00000080
+#define HW_OFFS_OCS_ECC_CX_DATA_OUT 0x00000100
+#define HW_OFFS_OCS_ECC_CY_DATA_OUT 0x00000180
+#define HW_OFFS_OCS_ECC_ISR 0x00000400
+#define HW_OFFS_OCS_ECC_IER 0x00000404
+
+#define HW_OCS_ECC_ISR_INT_STATUS_DONE BIT(0)
+#define HW_OCS_ECC_COMMAND_INS_BP BIT(0)
+
+#define HW_OCS_ECC_COMMAND_START_VAL BIT(0)
+
+#define OCS_ECC_OP_SIZE_384 BIT(8)
+#define OCS_ECC_OP_SIZE_256 0
+
+/* ECC Instruction : for ECC_COMMAND */
+#define OCS_ECC_INST_WRITE_AX (0x01 << HW_OCS_ECC_COMMAND_INS_BP)
+#define OCS_ECC_INST_WRITE_AY (0x02 << HW_OCS_ECC_COMMAND_INS_BP)
+#define OCS_ECC_INST_WRITE_BX_D (0x03 << HW_OCS_ECC_COMMAND_INS_BP)
+#define OCS_ECC_INST_WRITE_BY_L (0x04 << HW_OCS_ECC_COMMAND_INS_BP)
+#define OCS_ECC_INST_WRITE_P (0x05 << HW_OCS_ECC_COMMAND_INS_BP)
+#define OCS_ECC_INST_WRITE_A (0x06 << HW_OCS_ECC_COMMAND_INS_BP)
+#define OCS_ECC_INST_CALC_D_IDX_A (0x08 << HW_OCS_ECC_COMMAND_INS_BP)
+#define OCS_ECC_INST_CALC_A_POW_B_MODP (0xB << HW_OCS_ECC_COMMAND_INS_BP)
+#define OCS_ECC_INST_CALC_A_MUL_B_MODP (0xB << HW_OCS_ECC_COMMAND_INS_BP)
+#define OCS_ECC_INST_CALC_A_ADD_B_MODP (0xB << HW_OCS_ECC_COMMAND_INS_BP)
+
+#define ECC_ENABLE_INTR 1
+
+#define POLL_USEC 100
+#define TIMEOUT_USEC 10000
+
+#define ECC_CURVE_NIST_P384_DIGITS (384 / 64)
+#define KMB_ECC_VLI_MAX_DIGITS ECC_CURVE_NIST_P384_DIGITS
+#define KMB_ECC_VLI_MAX_BYTES (KMB_ECC_VLI_MAX_DIGITS \
+ << ECC_DIGITS_TO_BYTES_SHIFT)
+
+#define POW_CUBE 3
+
+/**
+ * struct ocs_ecc_dev - ECC device context
+ * @list: List of device contexts
+ * @dev: OCS ECC device
+ * @base_reg: IO base address of OCS ECC
+ * @engine: Crypto engine for the device
+ * @irq_done: IRQ done completion.
+ * @irq: IRQ number
+ */
+struct ocs_ecc_dev {
+ struct list_head list;
+ struct device *dev;
+ void __iomem *base_reg;
+ struct crypto_engine *engine;
+ struct completion irq_done;
+ int irq;
+};
+
+/**
+ * struct ocs_ecc_ctx - Transformation context.
+ * @engine_ctx: Crypto engine ctx.
+ * @ecc_dev: The ECC driver associated with this context.
+ * @curve: The elliptic curve used by this transformation.
+ * @private_key: The private key.
+ */
+struct ocs_ecc_ctx {
+ struct crypto_engine_ctx engine_ctx;
+ struct ocs_ecc_dev *ecc_dev;
+ const struct ecc_curve *curve;
+ u64 private_key[KMB_ECC_VLI_MAX_DIGITS];
+};
+
+/* Driver data. */
+struct ocs_ecc_drv {
+ struct list_head dev_list;
+ spinlock_t lock; /* Protects dev_list. */
+};
+
+/* Global variable holding the list of OCS ECC devices (only one expected). */
+static struct ocs_ecc_drv ocs_ecc = {
+ .dev_list = LIST_HEAD_INIT(ocs_ecc.dev_list),
+ .lock = __SPIN_LOCK_UNLOCKED(ocs_ecc.lock),
+};
+
+/* Get OCS ECC tfm context from kpp_request. */
+static inline struct ocs_ecc_ctx *kmb_ocs_ecc_tctx(struct kpp_request *req)
+{
+ return kpp_tfm_ctx(crypto_kpp_reqtfm(req));
+}
+
+/* Converts number of digits to number of bytes. */
+static inline unsigned int digits_to_bytes(unsigned int n)
+{
+ return n << ECC_DIGITS_TO_BYTES_SHIFT;
+}
+
+static inline const struct ecc_curve *kmb_ecc_get_curve(unsigned int curve_id)
+{
+ switch (curve_id) {
+ case ECC_CURVE_NIST_P256:
+ return &nist_p256;
+ case ECC_CURVE_NIST_P384:
+ return &nist_p384;
+ default:
+ return NULL;
+ }
+}
+
+/*
+ * Wait for ECC idle i.e when an operation (other than write operations)
+ * is done.
+ */
+static inline int ocs_ecc_wait_idle(struct ocs_ecc_dev *dev)
+{
+ u32 value;
+
+ return readl_poll_timeout((dev->base_reg + HW_OFFS_OCS_ECC_STATUS),
+ value,
+ !(value & HW_OCS_ECC_ISR_INT_STATUS_DONE),
+ POLL_USEC, TIMEOUT_USEC);
+}
+
+static void ocs_ecc_cmd_start(struct ocs_ecc_dev *ecc_dev, u32 op_size)
+{
+ iowrite32(op_size | HW_OCS_ECC_COMMAND_START_VAL,
+ ecc_dev->base_reg + HW_OFFS_OCS_ECC_COMMAND);
+}
+
+/* Direct write of u32 buffer to ECC engine with associated instruction. */
+static void ocs_ecc_write_cmd_and_data(struct ocs_ecc_dev *dev,
+ u32 op_size,
+ u32 inst,
+ const void *data_in,
+ size_t data_size)
+{
+ iowrite32(op_size | inst, dev->base_reg + HW_OFFS_OCS_ECC_COMMAND);
+
+ /* MMIO Write src uint32 to dst. */
+ memcpy_toio(dev->base_reg + HW_OFFS_OCS_ECC_DATA_IN, data_in,
+ data_size);
+}
+
+/* Start OCS ECC operation and wait for its completion. */
+static int ocs_ecc_trigger_op(struct ocs_ecc_dev *ecc_dev, u32 op_size,
+ u32 inst)
+{
+ reinit_completion(&ecc_dev->irq_done);
+
+ iowrite32(ECC_ENABLE_INTR, ecc_dev->base_reg + HW_OFFS_OCS_ECC_IER);
+ iowrite32(op_size | inst, ecc_dev->base_reg + HW_OFFS_OCS_ECC_COMMAND);
+
+ return wait_for_completion_interruptible(&ecc_dev->irq_done);
+}
+
+/**
+ * ocs_ecc_read_cx_out() - Read the CX data output buffer.
+ * @dev: The OCS ECC device to read from.
+ * @cx_out: The buffer where to store the CX value. Must be at least
+ * @byte_count byte long.
+ * @byte_count: The amount of data to read.
+ */
+static inline void ocs_ecc_read_cx_out(struct ocs_ecc_dev *dev, void *cx_out,
+ size_t byte_count)
+{
+ memcpy_fromio(cx_out, dev->base_reg + HW_OFFS_OCS_ECC_CX_DATA_OUT,
+ byte_count);
+}
+
+/**
+ * ocs_ecc_read_cy_out() - Read the CX data output buffer.
+ * @dev: The OCS ECC device to read from.
+ * @cy_out: The buffer where to store the CY value. Must be at least
+ * @byte_count byte long.
+ * @byte_count: The amount of data to read.
+ */
+static inline void ocs_ecc_read_cy_out(struct ocs_ecc_dev *dev, void *cy_out,
+ size_t byte_count)
+{
+ memcpy_fromio(cy_out, dev->base_reg + HW_OFFS_OCS_ECC_CY_DATA_OUT,
+ byte_count);
+}
+
+static struct ocs_ecc_dev *kmb_ocs_ecc_find_dev(struct ocs_ecc_ctx *tctx)
+{
+ if (tctx->ecc_dev)
+ return tctx->ecc_dev;
+
+ spin_lock(&ocs_ecc.lock);
+
+ /* Only a single OCS device available. */
+ tctx->ecc_dev = list_first_entry(&ocs_ecc.dev_list, struct ocs_ecc_dev,
+ list);
+
+ spin_unlock(&ocs_ecc.lock);
+
+ return tctx->ecc_dev;
+}
+
+/* Do point multiplication using OCS ECC HW. */
+static int kmb_ecc_point_mult(struct ocs_ecc_dev *ecc_dev,
+ struct ecc_point *result,
+ const struct ecc_point *point,
+ u64 *scalar,
+ const struct ecc_curve *curve)
+{
+ u8 sca[KMB_ECC_VLI_MAX_BYTES]; /* Use the maximum data size. */
+ u32 op_size = (curve->g.ndigits > ECC_CURVE_NIST_P256_DIGITS) ?
+ OCS_ECC_OP_SIZE_384 : OCS_ECC_OP_SIZE_256;
+ size_t nbytes = digits_to_bytes(curve->g.ndigits);
+ int rc = 0;
+
+ /* Generate random nbytes for Simple and Differential SCA protection. */
+ rc = crypto_get_default_rng();
+ if (rc)
+ return rc;
+
+ rc = crypto_rng_get_bytes(crypto_default_rng, sca, nbytes);
+ crypto_put_default_rng();
+ if (rc)
+ return rc;
+
+ /* Wait engine to be idle before starting new operation. */
+ rc = ocs_ecc_wait_idle(ecc_dev);
+ if (rc)
+ return rc;
+
+ /* Send ecc_start pulse as well as indicating operation size. */
+ ocs_ecc_cmd_start(ecc_dev, op_size);
+
+ /* Write ax param; Base point (Gx). */
+ ocs_ecc_write_cmd_and_data(ecc_dev, op_size, OCS_ECC_INST_WRITE_AX,
+ point->x, nbytes);
+
+ /* Write ay param; Base point (Gy). */
+ ocs_ecc_write_cmd_and_data(ecc_dev, op_size, OCS_ECC_INST_WRITE_AY,
+ point->y, nbytes);
+
+ /*
+ * Write the private key into DATA_IN reg.
+ *
+ * Since DATA_IN register is used to write different values during the
+ * computation private Key value is overwritten with
+ * side-channel-resistance value.
+ */
+ ocs_ecc_write_cmd_and_data(ecc_dev, op_size, OCS_ECC_INST_WRITE_BX_D,
+ scalar, nbytes);
+
+ /* Write operand by/l. */
+ ocs_ecc_write_cmd_and_data(ecc_dev, op_size, OCS_ECC_INST_WRITE_BY_L,
+ sca, nbytes);
+ memzero_explicit(sca, sizeof(sca));
+
+ /* Write p = curve prime(GF modulus). */
+ ocs_ecc_write_cmd_and_data(ecc_dev, op_size, OCS_ECC_INST_WRITE_P,
+ curve->p, nbytes);
+
+ /* Write a = curve coefficient. */
+ ocs_ecc_write_cmd_and_data(ecc_dev, op_size, OCS_ECC_INST_WRITE_A,
+ curve->a, nbytes);
+
+ /* Make hardware perform the multiplication. */
+ rc = ocs_ecc_trigger_op(ecc_dev, op_size, OCS_ECC_INST_CALC_D_IDX_A);
+ if (rc)
+ return rc;
+
+ /* Read result. */
+ ocs_ecc_read_cx_out(ecc_dev, result->x, nbytes);
+ ocs_ecc_read_cy_out(ecc_dev, result->y, nbytes);
+
+ return 0;
+}
+
+/**
+ * kmb_ecc_do_scalar_op() - Perform Scalar operation using OCS ECC HW.
+ * @ecc_dev: The OCS ECC device to use.
+ * @scalar_out: Where to store the output scalar.
+ * @scalar_a: Input scalar operand 'a'.
+ * @scalar_b: Input scalar operand 'b'
+ * @curve: The curve on which the operation is performed.
+ * @ndigits: The size of the operands (in digits).
+ * @inst: The operation to perform (as an OCS ECC instruction).
+ *
+ * Return: 0 on success, negative error code otherwise.
+ */
+static int kmb_ecc_do_scalar_op(struct ocs_ecc_dev *ecc_dev, u64 *scalar_out,
+ const u64 *scalar_a, const u64 *scalar_b,
+ const struct ecc_curve *curve,
+ unsigned int ndigits, const u32 inst)
+{
+ u32 op_size = (ndigits > ECC_CURVE_NIST_P256_DIGITS) ?
+ OCS_ECC_OP_SIZE_384 : OCS_ECC_OP_SIZE_256;
+ size_t nbytes = digits_to_bytes(ndigits);
+ size_t data_size_u8;
+ int rc;
+
+ /* Wait engine to be idle before starting new operation. */
+ rc = ocs_ecc_wait_idle(ecc_dev);
+
+ if (rc)
+ return rc;
+
+ /* Send ecc_start pulse as well as indicating operation size. */
+ ocs_ecc_cmd_start(ecc_dev, op_size);
+
+ /* Write ax param (Base point (Gx).*/
+ ocs_ecc_write_cmd_and_data(ecc_dev, op_size, OCS_ECC_INST_WRITE_AX,
+ scalar_a, nbytes);
+
+ /* Write ay param Base point (Gy).*/
+ ocs_ecc_write_cmd_and_data(ecc_dev, op_size, OCS_ECC_INST_WRITE_AY,
+ scalar_b, nbytes);
+
+ /* Write p = curve prime(GF modulus).*/
+ ocs_ecc_write_cmd_and_data(ecc_dev, op_size, OCS_ECC_INST_WRITE_P,
+ curve->p, nbytes);
+
+ /* Give instruction A.B or A+B to ECC engine. */
+ rc = ocs_ecc_trigger_op(ecc_dev, op_size, inst);
+ if (rc)
+ return rc;
+
+ data_size_u8 = digits_to_bytes(ndigits);
+
+ ocs_ecc_read_cx_out(ecc_dev, scalar_out, data_size_u8);
+
+ if (vli_is_zero(scalar_out, ndigits))
+ return -EINVAL;
+
+ return 0;
+}
+
+/* SP800-56A section 5.6.2.3.4 partial verification: ephemeral keys only */
+static int kmb_ocs_ecc_is_pubkey_valid_partial(struct ocs_ecc_dev *ecc_dev,
+ const struct ecc_curve *curve,
+ struct ecc_point *pk)
+{
+ u64 xxx[KMB_ECC_VLI_MAX_DIGITS];
+ u64 yy[KMB_ECC_VLI_MAX_DIGITS];
+ u64 w[KMB_ECC_VLI_MAX_DIGITS];
+ int rc;
+
+ if (WARN_ON(pk->ndigits != curve->g.ndigits))
+ return -EINVAL;
+
+ /* Check 1: Verify key is not the zero point. */
+ if (ecc_point_is_zero(pk))
+ return -EINVAL;
+
+ /* Check 2: Verify key is in the range [0, p-1]. */
+ if (vli_cmp(curve->p, pk->x, pk->ndigits) != 1)
+ return -EINVAL;
+
+ if (vli_cmp(curve->p, pk->y, pk->ndigits) != 1)
+ return -EINVAL;
+
+ /* Check 3: Verify that y^2 == (x^3 + a·x + b) mod p */
+
+ /* y^2 */
+ /* Compute y^2 -> store in yy */
+ rc = kmb_ecc_do_scalar_op(ecc_dev, yy, pk->y, pk->y, curve, pk->ndigits,
+ OCS_ECC_INST_CALC_A_MUL_B_MODP);
+ if (!rc)
+ goto exit;
+
+ /* x^3 */
+ /* Assigning w = 3, used for calculating x^3. */
+ w[0] = POW_CUBE;
+ /* Load the next stage.*/
+ rc = kmb_ecc_do_scalar_op(ecc_dev, xxx, pk->x, w, curve, pk->ndigits,
+ OCS_ECC_INST_CALC_A_POW_B_MODP);
+ if (!rc)
+ goto exit;
+
+ /* Do a*x -> store in w. */
+ rc = kmb_ecc_do_scalar_op(ecc_dev, w, curve->a, pk->x, curve,
+ pk->ndigits,
+ OCS_ECC_INST_CALC_A_MUL_B_MODP);
+ if (!rc)
+ goto exit;
+
+ /* Do ax + b == w + b; store in w. */
+ rc = kmb_ecc_do_scalar_op(ecc_dev, w, w, curve->b, curve,
+ pk->ndigits,
+ OCS_ECC_INST_CALC_A_ADD_B_MODP);
+ if (!rc)
+ goto exit;
+
+ /* x^3 + ax + b == x^3 + w -> store in w. */
+ rc = kmb_ecc_do_scalar_op(ecc_dev, w, xxx, w, curve, pk->ndigits,
+ OCS_ECC_INST_CALC_A_ADD_B_MODP);
+ if (!rc)
+ goto exit;
+
+ /* Compare y^2 == x^3 + a·x + b. */
+ rc = vli_cmp(yy, w, pk->ndigits);
+ if (rc)
+ rc = -EINVAL;
+
+exit:
+ memzero_explicit(xxx, sizeof(xxx));
+ memzero_explicit(yy, sizeof(yy));
+ memzero_explicit(w, sizeof(w));
+
+ return rc;
+}
+
+/* SP800-56A section 5.6.2.3.3 full verification */
+static int kmb_ocs_ecc_is_pubkey_valid_full(struct ocs_ecc_dev *ecc_dev,
+ const struct ecc_curve *curve,
+ struct ecc_point *pk)
+{
+ struct ecc_point *nQ;
+ int rc;
+
+ /* Checks 1 through 3 */
+ rc = kmb_ocs_ecc_is_pubkey_valid_partial(ecc_dev, curve, pk);
+ if (rc)
+ return rc;
+
+ /* Check 4: Verify that nQ is the zero point. */
+ nQ = ecc_alloc_point(pk->ndigits);
+ if (!nQ)
+ return -ENOMEM;
+
+ rc = kmb_ecc_point_mult(ecc_dev, nQ, pk, curve->n, curve);
+ if (rc)
+ goto exit;
+
+ if (!ecc_point_is_zero(nQ))
+ rc = -EINVAL;
+
+exit:
+ ecc_free_point(nQ);
+
+ return rc;
+}
+
+static int kmb_ecc_is_key_valid(const struct ecc_curve *curve,
+ const u64 *private_key, size_t private_key_len)
+{
+ size_t ndigits = curve->g.ndigits;
+ u64 one[KMB_ECC_VLI_MAX_DIGITS] = {1};
+ u64 res[KMB_ECC_VLI_MAX_DIGITS];
+
+ if (private_key_len != digits_to_bytes(ndigits))
+ return -EINVAL;
+
+ if (!private_key)
+ return -EINVAL;
+
+ /* Make sure the private key is in the range [2, n-3]. */
+ if (vli_cmp(one, private_key, ndigits) != -1)
+ return -EINVAL;
+
+ vli_sub(res, curve->n, one, ndigits);
+ vli_sub(res, res, one, ndigits);
+ if (vli_cmp(res, private_key, ndigits) != 1)
+ return -EINVAL;
+
+ return 0;
+}
+
+#ifdef CONFIG_CRYPTO_DEV_KEEMBAY_OCS_ECDH_GEN_PRIV_KEY_SUPPORT
+/*
+ * ECC private keys are generated using the method of extra random bits,
+ * equivalent to that described in FIPS 186-4, Appendix B.4.1.
+ *
+ * d = (c mod(n–1)) + 1 where c is a string of random bits, 64 bits longer
+ * than requested
+ * 0 <= c mod(n-1) <= n-2 and implies that
+ * 1 <= d <= n-1
+ *
+ * This method generates a private key uniformly distributed in the range
+ * [1, n-1].
+ */
+static int kmb_ecc_gen_privkey(const struct ecc_curve *curve, u64 *privkey)
+{
+ size_t nbytes = digits_to_bytes(curve->g.ndigits);
+ u64 priv[KMB_ECC_VLI_MAX_DIGITS];
+ size_t nbits;
+ int rc;
+
+ nbits = vli_num_bits(curve->n, curve->g.ndigits);
+
+ /* Check that N is included in Table 1 of FIPS 186-4, section 6.1.1 */
+ if (nbits < 160 || curve->g.ndigits > ARRAY_SIZE(priv))
+ return -EINVAL;
+
+ /*
+ * FIPS 186-4 recommends that the private key should be obtained from a
+ * RBG with a security strength equal to or greater than the security
+ * strength associated with N.
+ *
+ * The maximum security strength identified by NIST SP800-57pt1r4 for
+ * ECC is 256 (N >= 512).
+ *
+ * This condition is met by the default RNG because it selects a favored
+ * DRBG with a security strength of 256.
+ */
+ if (crypto_get_default_rng())
+ return -EFAULT;
+
+ rc = crypto_rng_get_bytes(crypto_default_rng, (u8 *)priv, nbytes);
+ crypto_put_default_rng();
+ if (rc)
+ goto cleanup;
+
+ rc = kmb_ecc_is_key_valid(curve, priv, nbytes);
+ if (rc)
+ goto cleanup;
+
+ ecc_swap_digits(priv, privkey, curve->g.ndigits);
+
+cleanup:
+ memzero_explicit(&priv, sizeof(priv));
+
+ return rc;
+}
+#else /* !CONFIG_CRYPTO_DEV_KEEMBAY_OCS_ECDH_GEN_PRIV_KEY_SUPPORT */
+/* If key generation is not enabled, always return error. */
+static int kmb_ecc_gen_privkey(const struct ecc_curve *curve, u64 *privkey)
+{
+ return -EINVAL;
+}
+#endif /* !CONFIG_CRYPTO_DEV_KEEMBAY_OCS_ECDH_GEN_PRIV_KEY_SUPPORT */
+
+static int kmb_ocs_ecdh_set_secret(struct crypto_kpp *tfm, const void *buf,
+ unsigned int len)
+{
+ struct ocs_ecc_ctx *tctx = kpp_tfm_ctx(tfm);
+ struct ecdh params;
+ int rc = 0;
+
+ rc = crypto_ecdh_decode_key(buf, len, &params);
+ if (rc)
+ goto cleanup;
+
+ tctx->curve = kmb_ecc_get_curve(params.curve_id);
+ if (!tctx->curve) {
+ rc = -EOPNOTSUPP;
+ goto cleanup;
+ }
+
+ if (!params.key || !params.key_size) {
+ rc = kmb_ecc_gen_privkey(tctx->curve, tctx->private_key);
+ goto cleanup;
+ }
+
+ rc = kmb_ecc_is_key_valid(tctx->curve, (const u64 *)params.key,
+ params.key_size);
+ if (rc)
+ goto cleanup;
+
+ ecc_swap_digits((const u64 *)params.key, tctx->private_key,
+ tctx->curve->g.ndigits);
+cleanup:
+ memzero_explicit(&params, sizeof(params));
+
+ if (rc)
+ tctx->curve = NULL;
+
+ return rc;
+}
+
+/* Compute shared secret. */
+static int kmb_ecc_do_shared_secret(struct ocs_ecc_ctx *tctx,
+ struct kpp_request *req)
+{
+ struct ocs_ecc_dev *ecc_dev = tctx->ecc_dev;
+ const struct ecc_curve *curve = tctx->curve;
+ u64 shared_secret[KMB_ECC_VLI_MAX_DIGITS];
+ u64 pubk_buf[KMB_ECC_VLI_MAX_DIGITS * 2];
+ size_t copied, nbytes, pubk_len;
+ struct ecc_point *pk, *result;
+ int rc;
+
+ nbytes = digits_to_bytes(curve->g.ndigits);
+
+ /* Public key is a point, thus it has two coordinates */
+ pubk_len = 2 * nbytes;
+
+ /* Copy public key from SG list to pubk_buf. */
+ copied = sg_copy_to_buffer(req->src,
+ sg_nents_for_len(req->src, pubk_len),
+ pubk_buf, pubk_len);
+ if (copied != pubk_len)
+ return -EINVAL;
+
+ /* Allocate and initialize public key point. */
+ pk = ecc_alloc_point(curve->g.ndigits);
+ if (!pk)
+ return -ENOMEM;
+
+ ecc_swap_digits(pubk_buf, pk->x, curve->g.ndigits);
+ ecc_swap_digits(&pubk_buf[curve->g.ndigits], pk->y, curve->g.ndigits);
+
+ /*
+ * Check the public key for following
+ * Check 1: Verify key is not the zero point.
+ * Check 2: Verify key is in the range [1, p-1].
+ * Check 3: Verify that y^2 == (x^3 + a·x + b) mod p
+ */
+ rc = kmb_ocs_ecc_is_pubkey_valid_partial(tctx->ecc_dev, curve, pk);
+ if (rc)
+ goto exit_free_pk;
+
+ /* Allocate point for storing computed shared secret. */
+ result = ecc_alloc_point(pk->ndigits);
+ if (!result) {
+ rc = -ENOMEM;
+ goto exit_free_pk;
+ }
+
+ /* Calculate the shared secret.*/
+ rc = kmb_ecc_point_mult(ecc_dev, result, pk, tctx->private_key, curve);
+ if (rc)
+ goto exit_free_result;
+
+ if (ecc_point_is_zero(result)) {
+ rc = -EFAULT;
+ goto exit_free_result;
+ }
+
+ /* Copy shared secret from point to buffer. */
+ ecc_swap_digits(result->x, shared_secret, result->ndigits);
+
+ /* Request might ask for less bytes than what we have. */
+ nbytes = min_t(size_t, nbytes, req->dst_len);
+
+ copied = sg_copy_from_buffer(req->dst,
+ sg_nents_for_len(req->dst, nbytes),
+ shared_secret, nbytes);
+
+ if (copied != nbytes)
+ rc = -EINVAL;
+
+ memzero_explicit(shared_secret, sizeof(shared_secret));
+
+exit_free_result:
+ ecc_free_point(result);
+
+exit_free_pk:
+ ecc_free_point(pk);
+
+ if (rc)
+ return rc;
+
+ crypto_finalize_kpp_request(ecc_dev->engine, req, 0);
+
+ return 0;
+}
+
+/* Compute public key. */
+static int kmb_ecc_do_public_key(struct ocs_ecc_ctx *tctx,
+ struct kpp_request *req)
+{
+ const struct ecc_curve *curve = tctx->curve;
+ u64 pubk_buf[KMB_ECC_VLI_MAX_DIGITS * 2];
+ struct ecc_point *pk;
+ size_t pubk_len;
+ size_t copied;
+ int rc;
+
+ /* Public key is a point, so it has double the digits. */
+ pubk_len = 2 * digits_to_bytes(curve->g.ndigits);
+
+ pk = ecc_alloc_point(curve->g.ndigits);
+ if (!pk)
+ return -ENOMEM;
+
+ /* Public Key(pk) = priv * G. */
+ rc = kmb_ecc_point_mult(tctx->ecc_dev, pk, &curve->g, tctx->private_key,
+ curve);
+ if (rc)
+ goto exit;
+
+ /* SP800-56A rev 3 5.6.2.1.3 key check */
+ if (kmb_ocs_ecc_is_pubkey_valid_full(tctx->ecc_dev, curve, pk)) {
+ rc = -EAGAIN;
+ goto exit;
+ }
+
+ /* Copy public key from point to buffer. */
+ ecc_swap_digits(pk->x, pubk_buf, pk->ndigits);
+ ecc_swap_digits(pk->y, &pubk_buf[pk->ndigits], pk->ndigits);
+
+ /* Copy public key to req->dst. */
+ copied = sg_copy_from_buffer(req->dst,
+ sg_nents_for_len(req->dst, pubk_len),
+ pubk_buf, pubk_len);
+
+ if (copied != pubk_len)
+ rc = -EINVAL;
+
+exit:
+ ecc_free_point(pk);
+
+ /* If there was an error, return. */
+ if (rc)
+ return rc;
+
+ /* Otherwise finalize request. */
+ crypto_finalize_kpp_request(tctx->ecc_dev->engine, req, 0);
+
+ return 0;
+}
+
+static int kmb_ocs_ecc_do_one_request(struct crypto_engine *engine,
+ void *areq)
+{
+ struct kpp_request *req = container_of(areq, struct kpp_request, base);
+ struct ocs_ecc_ctx *tctx = kmb_ocs_ecc_tctx(req);
+
+ if (req->src)
+ return kmb_ecc_do_shared_secret(tctx, req);
+ else
+ return kmb_ecc_do_public_key(tctx, req);
+}
+
+static int kmb_ocs_ecdh_generate_public_key(struct kpp_request *req)
+{
+ struct ocs_ecc_ctx *tctx = kmb_ocs_ecc_tctx(req);
+ const struct ecc_curve *curve = tctx->curve;
+
+ /* Ensure kmb_ocs_ecdh_set_secret() has been successfully called. */
+ if (!tctx->curve)
+ return -EINVAL;
+
+ /* Ensure dst is present. */
+ if (!req->dst)
+ return -EINVAL;
+
+ /* Check the request dst is big enough to hold the public key. */
+ if (req->dst_len < (2 * digits_to_bytes(curve->g.ndigits)))
+ return -EINVAL;
+
+ /* 'src' is not supposed to be present when generate pubk is called. */
+ if (req->src)
+ return -EINVAL;
+
+ return crypto_transfer_kpp_request_to_engine(tctx->ecc_dev->engine,
+ req);
+}
+
+static int kmb_ocs_ecdh_compute_shared_secret(struct kpp_request *req)
+{
+ struct ocs_ecc_ctx *tctx = kmb_ocs_ecc_tctx(req);
+ const struct ecc_curve *curve = tctx->curve;
+
+ /* Ensure kmb_ocs_ecdh_set_secret() has been successfully called. */
+ if (!tctx->curve)
+ return -EINVAL;
+
+ /* Ensure dst is present. */
+ if (!req->dst)
+ return -EINVAL;
+
+ /* Ensure src is present. */
+ if (!req->src)
+ return -EINVAL;
+
+ /*
+ * req->src is expected to the (other-side) public key, so its length
+ * must be 2 * coordinate size (in bytes).
+ */
+ if (req->src_len != 2 * digits_to_bytes(curve->g.ndigits))
+ return -EINVAL;
+
+ return crypto_transfer_kpp_request_to_engine(tctx->ecc_dev->engine,
+ req);
+}
+
+static int kmb_ocs_ecdh_init_tfm(struct crypto_kpp *tfm)
+{
+ struct ocs_ecc_ctx *tctx = kpp_tfm_ctx(tfm);
+
+ memset(tctx, 0, sizeof(*tctx));
+
+ tctx->ecc_dev = kmb_ocs_ecc_find_dev(tctx);
+
+ if (IS_ERR(tctx->ecc_dev)) {
+ pr_err("Failed to find the device : %ld\n",
+ PTR_ERR(tctx->ecc_dev));
+ return PTR_ERR(tctx->ecc_dev);
+ }
+
+ tctx->engine_ctx.op.prepare_request = NULL;
+ tctx->engine_ctx.op.do_one_request = kmb_ocs_ecc_do_one_request;
+ tctx->engine_ctx.op.unprepare_request = NULL;
+
+ return 0;
+}
+
+static void kmb_ocs_ecdh_exit_tfm(struct crypto_kpp *tfm)
+{
+ struct ocs_ecc_ctx *tctx = kpp_tfm_ctx(tfm);
+
+ memzero_explicit(tctx->private_key, sizeof(*tctx->private_key));
+}
+
+static unsigned int kmb_ocs_ecdh_max_size(struct crypto_kpp *tfm)
+{
+ struct ocs_ecc_ctx *tctx = kpp_tfm_ctx(tfm);
+
+ /* Public key is made of two coordinates, so double the digits. */
+ return digits_to_bytes(tctx->curve->g.ndigits) * 2;
+}
+
+static struct kpp_alg ocs_ecc_algs = {
+ .set_secret = kmb_ocs_ecdh_set_secret,
+ .generate_public_key = kmb_ocs_ecdh_generate_public_key,
+ .compute_shared_secret = kmb_ocs_ecdh_compute_shared_secret,
+ .init = kmb_ocs_ecdh_init_tfm,
+ .exit = kmb_ocs_ecdh_exit_tfm,
+ .max_size = kmb_ocs_ecdh_max_size,
+ .base = {
+ .cra_name = "ecdh",
+ .cra_driver_name = "ecdh-keembay-ocs",
+ .cra_priority = KMB_OCS_ECC_PRIORITY,
+ .cra_module = THIS_MODULE,
+ .cra_ctxsize = sizeof(struct ocs_ecc_ctx),
+ },
+};
+
+static irqreturn_t ocs_ecc_irq_handler(int irq, void *dev_id)
+{
+ struct ocs_ecc_dev *ecc_dev = dev_id;
+ u32 status;
+
+ /*
+ * Read the status register and write it back to clear the
+ * DONE_INT_STATUS bit.
+ */
+ status = ioread32(ecc_dev->base_reg + HW_OFFS_OCS_ECC_ISR);
+ iowrite32(status, ecc_dev->base_reg + HW_OFFS_OCS_ECC_ISR);
+
+ if (!(status & HW_OCS_ECC_ISR_INT_STATUS_DONE))
+ return IRQ_NONE;
+
+ complete(&ecc_dev->irq_done);
+
+ return IRQ_HANDLED;
+}
+
+static int kmb_ocs_ecc_probe(struct platform_device *pdev)
+{
+ struct device *dev = &pdev->dev;
+ struct ocs_ecc_dev *ecc_dev;
+ int rc;
+
+ ecc_dev = devm_kzalloc(dev, sizeof(*ecc_dev), GFP_KERNEL);
+ if (!ecc_dev)
+ return -ENOMEM;
+
+ ecc_dev->dev = dev;
+
+ platform_set_drvdata(pdev, ecc_dev);
+
+ INIT_LIST_HEAD(&ecc_dev->list);
+ init_completion(&ecc_dev->irq_done);
+
+ /* Get base register address. */
+ ecc_dev->base_reg = devm_platform_ioremap_resource(pdev, 0);
+ if (IS_ERR(ecc_dev->base_reg)) {
+ dev_err(dev, "Failed to get base address\n");
+ rc = PTR_ERR(ecc_dev->base_reg);
+ goto list_del;
+ }
+
+ /* Get and request IRQ */
+ ecc_dev->irq = platform_get_irq(pdev, 0);
+ if (ecc_dev->irq < 0) {
+ rc = ecc_dev->irq;
+ goto list_del;
+ }
+
+ rc = devm_request_threaded_irq(dev, ecc_dev->irq, ocs_ecc_irq_handler,
+ NULL, 0, "keembay-ocs-ecc", ecc_dev);
+ if (rc < 0) {
+ dev_err(dev, "Could not request IRQ\n");
+ goto list_del;
+ }
+
+ /* Add device to the list of OCS ECC devices. */
+ spin_lock(&ocs_ecc.lock);
+ list_add_tail(&ecc_dev->list, &ocs_ecc.dev_list);
+ spin_unlock(&ocs_ecc.lock);
+
+ /* Initialize crypto engine. */
+ ecc_dev->engine = crypto_engine_alloc_init(dev, 1);
+ if (!ecc_dev->engine) {
+ dev_err(dev, "Could not allocate crypto engine\n");
+ goto list_del;
+ }
+
+ rc = crypto_engine_start(ecc_dev->engine);
+ if (rc) {
+ dev_err(dev, "Could not start crypto engine\n");
+ goto cleanup;
+ }
+
+ /* Register the KPP algo. */
+ rc = crypto_register_kpp(&ocs_ecc_algs);
+ if (rc) {
+ dev_err(dev,
+ "Could not register OCS algorithms with Crypto API\n");
+ goto cleanup;
+ }
+
+ return 0;
+
+cleanup:
+ crypto_engine_exit(ecc_dev->engine);
+
+list_del:
+ spin_lock(&ocs_ecc.lock);
+ list_del(&ecc_dev->list);
+ spin_unlock(&ocs_ecc.lock);
+
+ return rc;
+}
+
+static int kmb_ocs_ecc_remove(struct platform_device *pdev)
+{
+ struct ocs_ecc_dev *ecc_dev;
+
+ ecc_dev = platform_get_drvdata(pdev);
+ if (!ecc_dev)
+ return -ENODEV;
+
+ crypto_unregister_kpp(&ocs_ecc_algs);
+
+ spin_lock(&ocs_ecc.lock);
+ list_del(&ecc_dev->list);
+ spin_unlock(&ocs_ecc.lock);
+
+ crypto_engine_exit(ecc_dev->engine);
+
+ return 0;
+}
+
+/* Device tree driver match. */
+static const struct of_device_id kmb_ocs_ecc_of_match[] = {
+ {
+ .compatible = "intel,keembay-ocs-ecc",
+ },
+ {}
+};
+
+/* The OCS driver is a platform device. */
+static struct platform_driver kmb_ocs_ecc_driver = {
+ .probe = kmb_ocs_ecc_probe,
+ .remove = kmb_ocs_ecc_remove,
+ .driver = {
+ .name = DRV_NAME,
+ .of_match_table = kmb_ocs_ecc_of_match,
+ },
+};
+module_platform_driver(kmb_ocs_ecc_driver);
+
+MODULE_LICENSE("GPL");
+MODULE_DESCRIPTION("Intel Keem Bay OCS ECC Driver");
+MODULE_ALIAS_CRYPTO("ecdh");
+MODULE_ALIAS_CRYPTO("ecdh-keembay-ocs");
diff --git a/drivers/crypto/keembay/ocs-ecc-curve-defs.h b/drivers/crypto/keembay/ocs-ecc-curve-defs.h
new file mode 100644
index 000000000000..9e5273828559
--- /dev/null
+++ b/drivers/crypto/keembay/ocs-ecc-curve-defs.h
@@ -0,0 +1,68 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Intel Keem Bay OCS ECC Curve Definitions.
+ *
+ * Copyright (C) 2019-2020 Intel Corporation
+ */
+#ifndef _CRYPTO_KEEMBAY_OCS_ECC_CURVE_DEFS_H
+#define _CRYPTO_KEEMBAY_OCS_ECC_CURVE_DEFS_H
+
+/* NIST P-256: a = p - 3 */
+static u64 nist_p256_g_x[] = { 0xF4A13945D898C296ull, 0x77037D812DEB33A0ull,
+ 0xF8BCE6E563A440F2ull, 0x6B17D1F2E12C4247ull };
+static u64 nist_p256_g_y[] = { 0xCBB6406837BF51F5ull, 0x2BCE33576B315ECEull,
+ 0x8EE7EB4A7C0F9E16ull, 0x4FE342E2FE1A7F9Bull };
+static u64 nist_p256_p[] = { 0xFFFFFFFFFFFFFFFFull, 0x00000000FFFFFFFFull,
+ 0x0000000000000000ull, 0xFFFFFFFF00000001ull };
+static u64 nist_p256_n[] = { 0xF3B9CAC2FC632551ull, 0xBCE6FAADA7179E84ull,
+ 0xFFFFFFFFFFFFFFFFull, 0xFFFFFFFF00000000ull };
+static u64 nist_p256_a[] = { 0xFFFFFFFFFFFFFFFCull, 0x00000000FFFFFFFFull,
+ 0x0000000000000000ull, 0xFFFFFFFF00000001ull };
+static u64 nist_p256_b[] = { 0x3BCE3C3E27D2604Bull, 0x651D06B0CC53B0F6ull,
+ 0xB3EBBD55769886BCull, 0x5AC635D8AA3A93E7ull };
+static struct ecc_curve nist_p256 = {
+ .name = "nist_256",
+ .g = {
+ .x = nist_p256_g_x,
+ .y = nist_p256_g_y,
+ .ndigits = 4,
+ },
+ .p = nist_p256_p,
+ .n = nist_p256_n,
+ .a = nist_p256_a,
+ .b = nist_p256_b
+};
+
+/* NIST P-384: a = p - 3 */
+static u64 nist_p384_g_x[] = { 0x3A545E3872760AB7ull, 0x5502F25DBF55296Cull,
+ 0x59F741E082542A38ull, 0x6E1D3B628BA79B98ull,
+ 0x8EB1C71EF320AD74ull, 0xAA87CA22BE8B0537ull };
+static u64 nist_p384_g_y[] = { 0x7A431D7C90EA0E5F, 0x0A60B1CE1D7E819Dull,
+ 0xE9DA3113B5F0B8C0ull, 0xF8F41DBD289A147Cull,
+ 0x5D9E98BF9292DC29ull, 0x3617DE4A96262C6Full };
+static u64 nist_p384_p[] = { 0x00000000FFFFFFFFull, 0xFFFFFFFF00000000ull,
+ 0xFFFFFFFFFFFFFFFEull, 0xFFFFFFFFFFFFFFFFull,
+ 0xFFFFFFFFFFFFFFFFull, 0xFFFFFFFFFFFFFFFFull };
+static u64 nist_p384_n[] = { 0xECEC196ACCC52973ull, 0x581A0DB248B0A77Aull,
+ 0xC7634D81F4372DDFull, 0xFFFFFFFFFFFFFFFF,
+ 0xFFFFFFFFFFFFFFFFull, 0xFFFFFFFFFFFFFFFFull};
+static u64 nist_p384_a[] = { 0x00000000FFFFFFFCull, 0xFFFFFFFF00000000ull,
+ 0xFFFFFFFFFFFFFFFEull, 0xFFFFFFFFFFFFFFFFull,
+ 0xFFFFFFFFFFFFFFFFull, 0xFFFFFFFFFFFFFFFFull};
+static u64 nist_p384_b[] = { 0x2A85C8EDD3EC2AEFull, 0xC656398D8A2ED19Dull,
+ 0x0314088F5013875Aull, 0x181D9C6EFE814112ull,
+ 0x988E056BE3F82D19ull, 0xB3312FA7E23EE7E4ull };
+static struct ecc_curve nist_p384 = {
+ .name = "nist_384",
+ .g = {
+ .x = nist_p384_g_x,
+ .y = nist_p384_g_y,
+ .ndigits = 6,
+ },
+ .p = nist_p384_p,
+ .n = nist_p384_n,
+ .a = nist_p384_a,
+ .b = nist_p384_b
+};
+
+#endif
--
2.26.2

2020-12-21 22:56:30

by Rob Herring

[permalink] [raw]
Subject: Re: [RFC PATCH 5/6] dt-bindings: crypto: Add Keem Bay ECC bindings

On Thu, 17 Dec 2020 17:21:00 +0000, Daniele Alessandrelli wrote:
> From: Prabhjot Khurana <[email protected]>
>
> Add Keem Bay Offload and Crypto Subsystem (OCS) Elliptic Curve
> Cryptography (ECC) device tree bindings.
>
> Signed-off-by: Prabhjot Khurana <[email protected]>
> Signed-off-by: Daniele Alessandrelli <[email protected]>
> ---
> .../crypto/intel,keembay-ocs-ecc.yaml | 47 +++++++++++++++++++
> MAINTAINERS | 7 +++
> 2 files changed, 54 insertions(+)
> create mode 100644 Documentation/devicetree/bindings/crypto/intel,keembay-ocs-ecc.yaml
>

Reviewed-by: Rob Herring <[email protected]>

2021-01-04 08:10:40

by Elena Reshetova

[permalink] [raw]
Subject: RE: [RFC PATCH 0/6] Keem Bay OCS ECC crypto driver

> 2. The OCS ECC HW does not support the NIST P-192 curve. We were planning to
> add SW fallback for P-192 in the driver, but the Intel Crypto team
> (which, internally, has to approve any code involving cryptography)
> advised against it, because they consider P-192 weak. As a result, the
> driver is not passing crypto self-tests. Is there any possible solution
> to this? Is it reasonable to change the self-tests to only test the
> curves actually supported by the tested driver? (not fully sure how to do
> that).

An additional reason against the P-192 SW fallback is the fact that it can
potentially trigger unsafe behavior which is not even "visible" to the end user
of the ECC functionality. If I request (by my developer mistake) a P-192
weaker curve from ECC Keem Bay HW driver, it is much safer to return a
"not supported" error that proceed behind my back with a SW code
implementation making me believe that I am actually getting a HW-backed up
functionality (since I don't think there is a way for me to check that I am using
SW fallback).

Best Regards,
Elena



2021-01-04 11:34:15

by Herbert Xu

[permalink] [raw]
Subject: Re: [RFC PATCH 0/6] Keem Bay OCS ECC crypto driver

On Mon, Jan 04, 2021 at 08:04:15AM +0000, Reshetova, Elena wrote:
> > 2. The OCS ECC HW does not support the NIST P-192 curve. We were planning to
> > add SW fallback for P-192 in the driver, but the Intel Crypto team
> > (which, internally, has to approve any code involving cryptography)
> > advised against it, because they consider P-192 weak. As a result, the
> > driver is not passing crypto self-tests. Is there any possible solution
> > to this? Is it reasonable to change the self-tests to only test the
> > curves actually supported by the tested driver? (not fully sure how to do
> > that).
>
> An additional reason against the P-192 SW fallback is the fact that it can
> potentially trigger unsafe behavior which is not even "visible" to the end user
> of the ECC functionality. If I request (by my developer mistake) a P-192
> weaker curve from ECC Keem Bay HW driver, it is much safer to return a
> "not supported" error that proceed behind my back with a SW code
> implementation making me believe that I am actually getting a HW-backed up
> functionality (since I don't think there is a way for me to check that I am using
> SW fallback).

Sorry, but if you break the Crypto API requirement then your driver
isn't getting merged.

Cheers,
--
Email: Herbert Xu <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

2021-01-04 14:42:23

by Elena Reshetova

[permalink] [raw]
Subject: RE: [RFC PATCH 0/6] Keem Bay OCS ECC crypto driver

> On Mon, Jan 04, 2021 at 08:04:15AM +0000, Reshetova, Elena wrote:
> > > 2. The OCS ECC HW does not support the NIST P-192 curve. We were planning to
> > > add SW fallback for P-192 in the driver, but the Intel Crypto team
> > > (which, internally, has to approve any code involving cryptography)
> > > advised against it, because they consider P-192 weak. As a result, the
> > > driver is not passing crypto self-tests. Is there any possible solution
> > > to this? Is it reasonable to change the self-tests to only test the
> > > curves actually supported by the tested driver? (not fully sure how to do
> > > that).
> >
> > An additional reason against the P-192 SW fallback is the fact that it can
> > potentially trigger unsafe behavior which is not even "visible" to the end user
> > of the ECC functionality. If I request (by my developer mistake) a P-192
> > weaker curve from ECC Keem Bay HW driver, it is much safer to return a
> > "not supported" error that proceed behind my back with a SW code
> > implementation making me believe that I am actually getting a HW-backed up
> > functionality (since I don't think there is a way for me to check that I am using
> > SW fallback).
>
> Sorry, but if you break the Crypto API requirement then your driver
> isn't getting merged.

But should not we think what behavior would make sense for good crypto drivers in future?
As cryptography moves forward (especially for the post quantum era), we will have
lengths for all existing algorithms increased (in addition to having a bunch of new ones),
and we surely should not expect the new generation of HW drivers to implement
the old/weaker lengths, so why there the requirement to support them? It is not a
part of crypto API definition on what bit lengths should be supported, because it
cannot be part of API to begin with since it is always changing parameter (algorithms and attacks
develop all the time).

Best Regards,
Elena.

2021-01-14 10:26:25

by Elena Reshetova

[permalink] [raw]
Subject: RE: [RFC PATCH 0/6] Keem Bay OCS ECC crypto driver

> > On Mon, Jan 04, 2021 at 08:04:15AM +0000, Reshetova, Elena wrote:
> > > > 2. The OCS ECC HW does not support the NIST P-192 curve. We were planning
> to
> > > > add SW fallback for P-192 in the driver, but the Intel Crypto team
> > > > (which, internally, has to approve any code involving cryptography)
> > > > advised against it, because they consider P-192 weak. As a result, the
> > > > driver is not passing crypto self-tests. Is there any possible solution
> > > > to this? Is it reasonable to change the self-tests to only test the
> > > > curves actually supported by the tested driver? (not fully sure how to do
> > > > that).
> > >
> > > An additional reason against the P-192 SW fallback is the fact that it can
> > > potentially trigger unsafe behavior which is not even "visible" to the end user
> > > of the ECC functionality. If I request (by my developer mistake) a P-192
> > > weaker curve from ECC Keem Bay HW driver, it is much safer to return a
> > > "not supported" error that proceed behind my back with a SW code
> > > implementation making me believe that I am actually getting a HW-backed up
> > > functionality (since I don't think there is a way for me to check that I am using
> > > SW fallback).
> >
> > Sorry, but if you break the Crypto API requirement then your driver
> > isn't getting merged.
>
> But should not we think what behavior would make sense for good crypto drivers in
> future?
> As cryptography moves forward (especially for the post quantum era), we will have
> lengths for all existing algorithms increased (in addition to having a bunch of new
> ones),
> and we surely should not expect the new generation of HW drivers to implement
> the old/weaker lengths, so why there the requirement to support them? It is not a
> part of crypto API definition on what bit lengths should be supported, because it
> cannot be part of API to begin with since it is always changing parameter (algorithms
> and attacks
> develop all the time).

I would really appreciate, if someone helps us to understand here. Maybe there is a
correct way to address this, but we just don't see it. The question is not even about
this particular crypto driver and the fact whenever it gests merged or not, but the
logic of the crypto API subsystem.

As far as I understand the implementations that are provided by the specialized drivers
(like our Keem Bay OCS ECC driver example here) have a higher priority vs. generic
Implementations that exists in kernel, which makes sense because we expect these drivers
(and the security HW they talk to) to provide both more efficient and more secure
implementations than a pure SW implementation in kernel can do (even if it utilizes special
instructions, like SIMD, AESNI, etc.). However, naturally these drivers are bound by
what security HW can do, and if it does not support a certain size/param of the algorithm
(P-192 curve in our case), it is pointless and wrong for them to reimplement what SW is
already doing in kernel, so they should not do so and currently they re-direct to core kernel
implementation. So far good.

But now comes my biggest worry is that this redirection as far
as I can see is *internal to driver itself*, i.e. it does a callback to these core functions from the driver
code, which again, unless I misunderstand smth, leads to the fact that the end user gets
P-192 curve ECC implementation from the core kernel that has been "promoted" to a highest
priority (given that ECC KeemBay driver for example got priority 300 to begin with). So, if
we say we have another HW Driver 'Foo', which happens to implement P-192 curves more securely,
but happens to have a lower priority than ECC KeemBay driver, its implementation would never
be chosen, but core kernel implementation will be used (via SW fallback internal to ECC Keem
Bay driver).

Another problem is that for a user of crypto API I don't see a way (and perhaps I am wrong here)
to guarantee that all my calls to perform crypto operations will end up being performed on a
security HW I want (maybe because this is the only thing I trust). It seems to be possible in theory,
but in practice would require careful evaluation of a kernel setup and a sync between what
end user requests and what driver can provide. Let me try to explain a potential scenario.
Lets say we had an end user that used to ask for both P-192 and P-384 curve-based ECC operations
and let's say we had a driver and security HW that implemented it. The end user made sure that
this driver implementation is always preferred vs. other existing implementations. Now, time moves, a new
security HW comes instead that only supports P-384, and the driver now has been updated to
support P-192 via the SW fallback (like we are asked now).
Now, how does an end user notice that when it asks for a P-192 based operations, his operations
are not done in security HW anymore? The only way seems to be
is to know that driver and security HW has been updated, algorithms and sizes changed, etc.
It might take a while before the end user realizes this and for example stops using P-192 altogether,
but what if this silent redirect by the driver actually breaks some security assumptions (side-channel
resistance being one potential example) made by this end user? The consequences can be very bad.
You might say: "this is the end user problem to verify this", but shouldn't we do smth to prevent or
at least indicate such potential issues to them?

Best Regards,
Elena.


2021-01-14 18:30:29

by Ard Biesheuvel

[permalink] [raw]
Subject: Re: [RFC PATCH 0/6] Keem Bay OCS ECC crypto driver

On Thu, 14 Jan 2021 at 11:25, Reshetova, Elena
<[email protected]> wrote:
>
> > > On Mon, Jan 04, 2021 at 08:04:15AM +0000, Reshetova, Elena wrote:
> > > > > 2. The OCS ECC HW does not support the NIST P-192 curve. We were planning
> > to
> > > > > add SW fallback for P-192 in the driver, but the Intel Crypto team
> > > > > (which, internally, has to approve any code involving cryptography)
> > > > > advised against it, because they consider P-192 weak. As a result, the
> > > > > driver is not passing crypto self-tests. Is there any possible solution
> > > > > to this? Is it reasonable to change the self-tests to only test the
> > > > > curves actually supported by the tested driver? (not fully sure how to do
> > > > > that).
> > > >
> > > > An additional reason against the P-192 SW fallback is the fact that it can
> > > > potentially trigger unsafe behavior which is not even "visible" to the end user
> > > > of the ECC functionality. If I request (by my developer mistake) a P-192
> > > > weaker curve from ECC Keem Bay HW driver, it is much safer to return a
> > > > "not supported" error that proceed behind my back with a SW code
> > > > implementation making me believe that I am actually getting a HW-backed up
> > > > functionality (since I don't think there is a way for me to check that I am using
> > > > SW fallback).
> > >
> > > Sorry, but if you break the Crypto API requirement then your driver
> > > isn't getting merged.
> >
> > But should not we think what behavior would make sense for good crypto drivers in
> > future?
> > As cryptography moves forward (especially for the post quantum era), we will have
> > lengths for all existing algorithms increased (in addition to having a bunch of new
> > ones),
> > and we surely should not expect the new generation of HW drivers to implement
> > the old/weaker lengths, so why there the requirement to support them? It is not a
> > part of crypto API definition on what bit lengths should be supported, because it
> > cannot be part of API to begin with since it is always changing parameter (algorithms
> > and attacks
> > develop all the time).
>
> I would really appreciate, if someone helps us to understand here. Maybe there is a
> correct way to address this, but we just don't see it. The question is not even about
> this particular crypto driver and the fact whenever it gests merged or not, but the
> logic of the crypto API subsystem.
>
> As far as I understand the implementations that are provided by the specialized drivers
> (like our Keem Bay OCS ECC driver example here) have a higher priority vs. generic
> Implementations that exists in kernel, which makes sense because we expect these drivers
> (and the security HW they talk to) to provide both more efficient and more secure
> implementations than a pure SW implementation in kernel can do (even if it utilizes special
> instructions, like SIMD, AESNI, etc.). However, naturally these drivers are bound by
> what security HW can do, and if it does not support a certain size/param of the algorithm
> (P-192 curve in our case), it is pointless and wrong for them to reimplement what SW is
> already doing in kernel, so they should not do so and currently they re-direct to core kernel
> implementation. So far good.
>
> But now comes my biggest worry is that this redirection as far
> as I can see is *internal to driver itself*, i.e. it does a callback to these core functions from the driver
> code, which again, unless I misunderstand smth, leads to the fact that the end user gets
> P-192 curve ECC implementation from the core kernel that has been "promoted" to a highest
> priority (given that ECC KeemBay driver for example got priority 300 to begin with). So, if
> we say we have another HW Driver 'Foo', which happens to implement P-192 curves more securely,
> but happens to have a lower priority than ECC KeemBay driver, its implementation would never
> be chosen, but core kernel implementation will be used (via SW fallback internal to ECC Keem
> Bay driver).
>

No, this is incorrect. If you allocate a fallback algorithm in the
correct way, the crypto API will resolve the allocation in the usual
manner, and select whichever of the remaining implementations has the
highest priority (provided that it does not require a fallback
itself).

> Another problem is that for a user of crypto API I don't see a way (and perhaps I am wrong here)
> to guarantee that all my calls to perform crypto operations will end up being performed on a
> security HW I want (maybe because this is the only thing I trust). It seems to be possible in theory,
> but in practice would require careful evaluation of a kernel setup and a sync between what
> end user requests and what driver can provide. Let me try to explain a potential scenario.
> Lets say we had an end user that used to ask for both P-192 and P-384 curve-based ECC operations
> and let's say we had a driver and security HW that implemented it. The end user made sure that
> this driver implementation is always preferred vs. other existing implementations. Now, time moves, a new
> security HW comes instead that only supports P-384, and the driver now has been updated to
> support P-192 via the SW fallback (like we are asked now).
> Now, how does an end user notice that when it asks for a P-192 based operations, his operations
> are not done in security HW anymore? The only way seems to be
> is to know that driver and security HW has been updated, algorithms and sizes changed, etc.
> It might take a while before the end user realizes this and for example stops using P-192 altogether,
> but what if this silent redirect by the driver actually breaks some security assumptions (side-channel
> resistance being one potential example) made by this end user? The consequences can be very bad.
> You might say: "this is the end user problem to verify this", but shouldn't we do smth to prevent or
> at least indicate such potential issues to them?
>

I don't think it is possible at the API level to define rules that
will always produce the most secure combination of drivers. The
priority fields are only used to convey relative performance (which is
already semantically murky, given the lack of distinction between
hardware with a single queue vs software algorithms that can be
executed by all CPUs in parallel).

When it comes to comparative security, trustworthiness or robustness
of implementations, it is simply left up to the user to blacklist
modules that they prefer not to use. When fallback allocations are
made in the correct way, the remaining available implementations will
be used in priority order.

2021-01-18 12:14:26

by Elena Reshetova

[permalink] [raw]
Subject: RE: [RFC PATCH 0/6] Keem Bay OCS ECC crypto driver

> On Thu, 14 Jan 2021 at 11:25, Reshetova, Elena
> <[email protected]> wrote:
> >
> > > > On Mon, Jan 04, 2021 at 08:04:15AM +0000, Reshetova, Elena wrote:
> > > > > > 2. The OCS ECC HW does not support the NIST P-192 curve. We were
> planning
> > > to
> > > > > > add SW fallback for P-192 in the driver, but the Intel Crypto team
> > > > > > (which, internally, has to approve any code involving cryptography)
> > > > > > advised against it, because they consider P-192 weak. As a result, the
> > > > > > driver is not passing crypto self-tests. Is there any possible solution
> > > > > > to this? Is it reasonable to change the self-tests to only test the
> > > > > > curves actually supported by the tested driver? (not fully sure how to do
> > > > > > that).
> > > > >
> > > > > An additional reason against the P-192 SW fallback is the fact that it can
> > > > > potentially trigger unsafe behavior which is not even "visible" to the end user
> > > > > of the ECC functionality. If I request (by my developer mistake) a P-192
> > > > > weaker curve from ECC Keem Bay HW driver, it is much safer to return a
> > > > > "not supported" error that proceed behind my back with a SW code
> > > > > implementation making me believe that I am actually getting a HW-backed up
> > > > > functionality (since I don't think there is a way for me to check that I am using
> > > > > SW fallback).
> > > >
> > > > Sorry, but if you break the Crypto API requirement then your driver
> > > > isn't getting merged.
> > >
> > > But should not we think what behavior would make sense for good crypto drivers
> in
> > > future?
> > > As cryptography moves forward (especially for the post quantum era), we will
> have
> > > lengths for all existing algorithms increased (in addition to having a bunch of new
> > > ones),
> > > and we surely should not expect the new generation of HW drivers to implement
> > > the old/weaker lengths, so why there the requirement to support them? It is not
> a
> > > part of crypto API definition on what bit lengths should be supported, because it
> > > cannot be part of API to begin with since it is always changing parameter
> (algorithms
> > > and attacks
> > > develop all the time).
> >
> > I would really appreciate, if someone helps us to understand here. Maybe there is a
> > correct way to address this, but we just don't see it. The question is not even about
> > this particular crypto driver and the fact whenever it gests merged or not, but the
> > logic of the crypto API subsystem.
> >
> > As far as I understand the implementations that are provided by the specialized
> drivers
> > (like our Keem Bay OCS ECC driver example here) have a higher priority vs. generic
> > Implementations that exists in kernel, which makes sense because we expect these
> drivers
> > (and the security HW they talk to) to provide both more efficient and more secure
> > implementations than a pure SW implementation in kernel can do (even if it utilizes
> special
> > instructions, like SIMD, AESNI, etc.). However, naturally these drivers are bound by
> > what security HW can do, and if it does not support a certain size/param of the
> algorithm
> > (P-192 curve in our case), it is pointless and wrong for them to reimplement what
> SW is
> > already doing in kernel, so they should not do so and currently they re-direct to
> core kernel
> > implementation. So far good.
> >
> > But now comes my biggest worry is that this redirection as far
> > as I can see is *internal to driver itself*, i.e. it does a callback to these core
> functions from the driver
> > code, which again, unless I misunderstand smth, leads to the fact that the end user
> gets
> > P-192 curve ECC implementation from the core kernel that has been "promoted"
> to a highest
> > priority (given that ECC KeemBay driver for example got priority 300 to begin with).
> So, if
> > we say we have another HW Driver 'Foo', which happens to implement P-192
> curves more securely,
> > but happens to have a lower priority than ECC KeemBay driver, its implementation
> would never
> > be chosen, but core kernel implementation will be used (via SW fallback internal to
> ECC Keem
> > Bay driver).
> >
>
> No, this is incorrect. If you allocate a fallback algorithm in the
> correct way, the crypto API will resolve the allocation in the usual
> manner, and select whichever of the remaining implementations has the
> highest priority (provided that it does not require a fallback
> itself).

Thank you very much Ard for the important correction here!
See below if I got it now correctly to the end for the use case in question.

>
> > Another problem is that for a user of crypto API I don't see a way (and perhaps I
> am wrong here)
> > to guarantee that all my calls to perform crypto operations will end up being
> performed on a
> > security HW I want (maybe because this is the only thing I trust). It seems to be
> possible in theory,
> > but in practice would require careful evaluation of a kernel setup and a sync
> between what
> > end user requests and what driver can provide. Let me try to explain a potential
> scenario.
> > Lets say we had an end user that used to ask for both P-192 and P-384 curve-based
> ECC operations
> > and let's say we had a driver and security HW that implemented it. The end user
> made sure that
> > this driver implementation is always preferred vs. other existing implementations.
> Now, time moves, a new
> > security HW comes instead that only supports P-384, and the driver now has been
> updated to
> > support P-192 via the SW fallback (like we are asked now).
> > Now, how does an end user notice that when it asks for a P-192 based operations,
> his operations
> > are not done in security HW anymore? The only way seems to be
> > is to know that driver and security HW has been updated, algorithms and sizes
> changed, etc.
> > It might take a while before the end user realizes this and for example stops using
> P-192 altogether,
> > but what if this silent redirect by the driver actually breaks some security
> assumptions (side-channel
> > resistance being one potential example) made by this end user? The consequences
> can be very bad.
> > You might say: "this is the end user problem to verify this", but shouldn't we do
> smth to prevent or
> > at least indicate such potential issues to them?
> >
>
> I don't think it is possible at the API level to define rules that
> will always produce the most secure combination of drivers. The
> priority fields are only used to convey relative performance (which is
> already semantically murky, given the lack of distinction between
> hardware with a single queue vs software algorithms that can be
> executed by all CPUs in parallel).
>
> When it comes to comparative security, trustworthiness or robustness
> of implementations, it is simply left up to the user to blacklist
> modules that they prefer not to use. When fallback allocations are
> made in the correct way, the remaining available implementations will
> be used in priority order.

So, let me see if I understand the full picture correctly now and how to utilize
the blacklisting of modules as a user. Suppose I want to blacklist everything but
my OSC driver module. So, if I am as a user refer to a specific driver implementation
using a unique driver name (ecdh-keembay-ocs in our case), then regardless of the
fact that a driver implements this SW fallback for P-192 curve, if I am as a user to
ask for P-192 curve (or any other param that results in SW fallback), I will be notified
that this requested implementation does not provide it?

Best Regards,
Elena.

2021-01-18 12:15:51

by Ard Biesheuvel

[permalink] [raw]
Subject: Re: [RFC PATCH 0/6] Keem Bay OCS ECC crypto driver

On Mon, 18 Jan 2021 at 12:55, Reshetova, Elena
<[email protected]> wrote:
>
> > On Thu, 14 Jan 2021 at 11:25, Reshetova, Elena
> > <[email protected]> wrote:
> > >
> > > > > On Mon, Jan 04, 2021 at 08:04:15AM +0000, Reshetova, Elena wrote:
> > > > > > > 2. The OCS ECC HW does not support the NIST P-192 curve. We were
> > planning
> > > > to
> > > > > > > add SW fallback for P-192 in the driver, but the Intel Crypto team
> > > > > > > (which, internally, has to approve any code involving cryptography)
> > > > > > > advised against it, because they consider P-192 weak. As a result, the
> > > > > > > driver is not passing crypto self-tests. Is there any possible solution
> > > > > > > to this? Is it reasonable to change the self-tests to only test the
> > > > > > > curves actually supported by the tested driver? (not fully sure how to do
> > > > > > > that).
> > > > > >
> > > > > > An additional reason against the P-192 SW fallback is the fact that it can
> > > > > > potentially trigger unsafe behavior which is not even "visible" to the end user
> > > > > > of the ECC functionality. If I request (by my developer mistake) a P-192
> > > > > > weaker curve from ECC Keem Bay HW driver, it is much safer to return a
> > > > > > "not supported" error that proceed behind my back with a SW code
> > > > > > implementation making me believe that I am actually getting a HW-backed up
> > > > > > functionality (since I don't think there is a way for me to check that I am using
> > > > > > SW fallback).
> > > > >
> > > > > Sorry, but if you break the Crypto API requirement then your driver
> > > > > isn't getting merged.
> > > >
> > > > But should not we think what behavior would make sense for good crypto drivers
> > in
> > > > future?
> > > > As cryptography moves forward (especially for the post quantum era), we will
> > have
> > > > lengths for all existing algorithms increased (in addition to having a bunch of new
> > > > ones),
> > > > and we surely should not expect the new generation of HW drivers to implement
> > > > the old/weaker lengths, so why there the requirement to support them? It is not
> > a
> > > > part of crypto API definition on what bit lengths should be supported, because it
> > > > cannot be part of API to begin with since it is always changing parameter
> > (algorithms
> > > > and attacks
> > > > develop all the time).
> > >
> > > I would really appreciate, if someone helps us to understand here. Maybe there is a
> > > correct way to address this, but we just don't see it. The question is not even about
> > > this particular crypto driver and the fact whenever it gests merged or not, but the
> > > logic of the crypto API subsystem.
> > >
> > > As far as I understand the implementations that are provided by the specialized
> > drivers
> > > (like our Keem Bay OCS ECC driver example here) have a higher priority vs. generic
> > > Implementations that exists in kernel, which makes sense because we expect these
> > drivers
> > > (and the security HW they talk to) to provide both more efficient and more secure
> > > implementations than a pure SW implementation in kernel can do (even if it utilizes
> > special
> > > instructions, like SIMD, AESNI, etc.). However, naturally these drivers are bound by
> > > what security HW can do, and if it does not support a certain size/param of the
> > algorithm
> > > (P-192 curve in our case), it is pointless and wrong for them to reimplement what
> > SW is
> > > already doing in kernel, so they should not do so and currently they re-direct to
> > core kernel
> > > implementation. So far good.
> > >
> > > But now comes my biggest worry is that this redirection as far
> > > as I can see is *internal to driver itself*, i.e. it does a callback to these core
> > functions from the driver
> > > code, which again, unless I misunderstand smth, leads to the fact that the end user
> > gets
> > > P-192 curve ECC implementation from the core kernel that has been "promoted"
> > to a highest
> > > priority (given that ECC KeemBay driver for example got priority 300 to begin with).
> > So, if
> > > we say we have another HW Driver 'Foo', which happens to implement P-192
> > curves more securely,
> > > but happens to have a lower priority than ECC KeemBay driver, its implementation
> > would never
> > > be chosen, but core kernel implementation will be used (via SW fallback internal to
> > ECC Keem
> > > Bay driver).
> > >
> >
> > No, this is incorrect. If you allocate a fallback algorithm in the
> > correct way, the crypto API will resolve the allocation in the usual
> > manner, and select whichever of the remaining implementations has the
> > highest priority (provided that it does not require a fallback
> > itself).
>
> Thank you very much Ard for the important correction here!
> See below if I got it now correctly to the end for the use case in question.
>
> >
> > > Another problem is that for a user of crypto API I don't see a way (and perhaps I
> > am wrong here)
> > > to guarantee that all my calls to perform crypto operations will end up being
> > performed on a
> > > security HW I want (maybe because this is the only thing I trust). It seems to be
> > possible in theory,
> > > but in practice would require careful evaluation of a kernel setup and a sync
> > between what
> > > end user requests and what driver can provide. Let me try to explain a potential
> > scenario.
> > > Lets say we had an end user that used to ask for both P-192 and P-384 curve-based
> > ECC operations
> > > and let's say we had a driver and security HW that implemented it. The end user
> > made sure that
> > > this driver implementation is always preferred vs. other existing implementations.
> > Now, time moves, a new
> > > security HW comes instead that only supports P-384, and the driver now has been
> > updated to
> > > support P-192 via the SW fallback (like we are asked now).
> > > Now, how does an end user notice that when it asks for a P-192 based operations,
> > his operations
> > > are not done in security HW anymore? The only way seems to be
> > > is to know that driver and security HW has been updated, algorithms and sizes
> > changed, etc.
> > > It might take a while before the end user realizes this and for example stops using
> > P-192 altogether,
> > > but what if this silent redirect by the driver actually breaks some security
> > assumptions (side-channel
> > > resistance being one potential example) made by this end user? The consequences
> > can be very bad.
> > > You might say: "this is the end user problem to verify this", but shouldn't we do
> > smth to prevent or
> > > at least indicate such potential issues to them?
> > >
> >
> > I don't think it is possible at the API level to define rules that
> > will always produce the most secure combination of drivers. The
> > priority fields are only used to convey relative performance (which is
> > already semantically murky, given the lack of distinction between
> > hardware with a single queue vs software algorithms that can be
> > executed by all CPUs in parallel).
> >
> > When it comes to comparative security, trustworthiness or robustness
> > of implementations, it is simply left up to the user to blacklist
> > modules that they prefer not to use. When fallback allocations are
> > made in the correct way, the remaining available implementations will
> > be used in priority order.
>
> So, let me see if I understand the full picture correctly now and how to utilize
> the blacklisting of modules as a user. Suppose I want to blacklist everything but
> my OSC driver module. So, if I am as a user refer to a specific driver implementation
> using a unique driver name (ecdh-keembay-ocs in our case), then regardless of the
> fact that a driver implements this SW fallback for P-192 curve, if I am as a user to
> ask for P-192 curve (or any other param that results in SW fallback), I will be notified
> that this requested implementation does not provide it?
>

This is rather unusual compared with how the crypto API is typically
used, but if this is really what you want to implement, you can do so
by:
- having a "ecdh" implementation that implements the entire range, and
uses a fallback for curves that it does not implement
- export the same implementation again as "ecdh" and with a known
driver name "ecdh-keembay-ocs", but with a slightly lower priority,
and in this case, return an error when the unimplemented curve is
requested.

That way, you fully adhere to the API, by providing implementations of
all curves by default. And if a user requests "ecdh-keembay-ocs"
explicitly, it will not be able to use the P192 curve inadvertently.

But policing which curves are secure and which are not is really not
the job of the API. We have implementations of MD5 and RC4 in the
kernel that we would *love* to remove but we simply cannot do so as
long as they are still being used. The same applies to P192: we simply
cannot fail requests for that curve for use cases that were previously
deemed valid. It is perfectly reasonable to omit the implementation
from your hardware, but banning its use outright on the grounds that
is no longer secure conflicts with our requirement not to break
existing use cases.

2021-01-19 14:53:46

by Elena Reshetova

[permalink] [raw]
Subject: RE: [RFC PATCH 0/6] Keem Bay OCS ECC crypto driver

> On Mon, 18 Jan 2021 at 12:55, Reshetova, Elena
> <[email protected]> wrote:
> >
> > > On Thu, 14 Jan 2021 at 11:25, Reshetova, Elena
> > > <[email protected]> wrote:
> > > >
> > > > > > On Mon, Jan 04, 2021 at 08:04:15AM +0000, Reshetova, Elena wrote:
> > > > > > > > 2. The OCS ECC HW does not support the NIST P-192 curve. We were
> > > planning
> > > > > to
> > > > > > > > add SW fallback for P-192 in the driver, but the Intel Crypto team
> > > > > > > > (which, internally, has to approve any code involving cryptography)
> > > > > > > > advised against it, because they consider P-192 weak. As a result, the
> > > > > > > > driver is not passing crypto self-tests. Is there any possible solution
> > > > > > > > to this? Is it reasonable to change the self-tests to only test the
> > > > > > > > curves actually supported by the tested driver? (not fully sure how to
> do
> > > > > > > > that).
> > > > > > >
> > > > > > > An additional reason against the P-192 SW fallback is the fact that it can
> > > > > > > potentially trigger unsafe behavior which is not even "visible" to the end
> user
> > > > > > > of the ECC functionality. If I request (by my developer mistake) a P-192
> > > > > > > weaker curve from ECC Keem Bay HW driver, it is much safer to return a
> > > > > > > "not supported" error that proceed behind my back with a SW code
> > > > > > > implementation making me believe that I am actually getting a HW-
> backed up
> > > > > > > functionality (since I don't think there is a way for me to check that I am
> using
> > > > > > > SW fallback).
> > > > > >
> > > > > > Sorry, but if you break the Crypto API requirement then your driver
> > > > > > isn't getting merged.
> > > > >
> > > > > But should not we think what behavior would make sense for good crypto
> drivers
> > > in
> > > > > future?
> > > > > As cryptography moves forward (especially for the post quantum era), we will
> > > have
> > > > > lengths for all existing algorithms increased (in addition to having a bunch of
> new
> > > > > ones),
> > > > > and we surely should not expect the new generation of HW drivers to
> implement
> > > > > the old/weaker lengths, so why there the requirement to support them? It is
> not
> > > a
> > > > > part of crypto API definition on what bit lengths should be supported,
> because it
> > > > > cannot be part of API to begin with since it is always changing parameter
> > > (algorithms
> > > > > and attacks
> > > > > develop all the time).
> > > >
> > > > I would really appreciate, if someone helps us to understand here. Maybe there
> is a
> > > > correct way to address this, but we just don't see it. The question is not even
> about
> > > > this particular crypto driver and the fact whenever it gests merged or not, but
> the
> > > > logic of the crypto API subsystem.
> > > >
> > > > As far as I understand the implementations that are provided by the specialized
> > > drivers
> > > > (like our Keem Bay OCS ECC driver example here) have a higher priority vs.
> generic
> > > > Implementations that exists in kernel, which makes sense because we expect
> these
> > > drivers
> > > > (and the security HW they talk to) to provide both more efficient and more
> secure
> > > > implementations than a pure SW implementation in kernel can do (even if it
> utilizes
> > > special
> > > > instructions, like SIMD, AESNI, etc.). However, naturally these drivers are bound
> by
> > > > what security HW can do, and if it does not support a certain size/param of the
> > > algorithm
> > > > (P-192 curve in our case), it is pointless and wrong for them to reimplement
> what
> > > SW is
> > > > already doing in kernel, so they should not do so and currently they re-direct to
> > > core kernel
> > > > implementation. So far good.
> > > >
> > > > But now comes my biggest worry is that this redirection as far
> > > > as I can see is *internal to driver itself*, i.e. it does a callback to these core
> > > functions from the driver
> > > > code, which again, unless I misunderstand smth, leads to the fact that the end
> user
> > > gets
> > > > P-192 curve ECC implementation from the core kernel that has been
> "promoted"
> > > to a highest
> > > > priority (given that ECC KeemBay driver for example got priority 300 to begin
> with).
> > > So, if
> > > > we say we have another HW Driver 'Foo', which happens to implement P-192
> > > curves more securely,
> > > > but happens to have a lower priority than ECC KeemBay driver, its
> implementation
> > > would never
> > > > be chosen, but core kernel implementation will be used (via SW fallback
> internal to
> > > ECC Keem
> > > > Bay driver).
> > > >
> > >
> > > No, this is incorrect. If you allocate a fallback algorithm in the
> > > correct way, the crypto API will resolve the allocation in the usual
> > > manner, and select whichever of the remaining implementations has the
> > > highest priority (provided that it does not require a fallback
> > > itself).
> >
> > Thank you very much Ard for the important correction here!
> > See below if I got it now correctly to the end for the use case in question.
> >
> > >
> > > > Another problem is that for a user of crypto API I don't see a way (and perhaps I
> > > am wrong here)
> > > > to guarantee that all my calls to perform crypto operations will end up being
> > > performed on a
> > > > security HW I want (maybe because this is the only thing I trust). It seems to be
> > > possible in theory,
> > > > but in practice would require careful evaluation of a kernel setup and a sync
> > > between what
> > > > end user requests and what driver can provide. Let me try to explain a potential
> > > scenario.
> > > > Lets say we had an end user that used to ask for both P-192 and P-384 curve-
> based
> > > ECC operations
> > > > and let's say we had a driver and security HW that implemented it. The end user
> > > made sure that
> > > > this driver implementation is always preferred vs. other existing
> implementations.
> > > Now, time moves, a new
> > > > security HW comes instead that only supports P-384, and the driver now has
> been
> > > updated to
> > > > support P-192 via the SW fallback (like we are asked now).
> > > > Now, how does an end user notice that when it asks for a P-192 based
> operations,
> > > his operations
> > > > are not done in security HW anymore? The only way seems to be
> > > > is to know that driver and security HW has been updated, algorithms and sizes
> > > changed, etc.
> > > > It might take a while before the end user realizes this and for example stops
> using
> > > P-192 altogether,
> > > > but what if this silent redirect by the driver actually breaks some security
> > > assumptions (side-channel
> > > > resistance being one potential example) made by this end user? The
> consequences
> > > can be very bad.
> > > > You might say: "this is the end user problem to verify this", but shouldn't we do
> > > smth to prevent or
> > > > at least indicate such potential issues to them?
> > > >
> > >
> > > I don't think it is possible at the API level to define rules that
> > > will always produce the most secure combination of drivers. The
> > > priority fields are only used to convey relative performance (which is
> > > already semantically murky, given the lack of distinction between
> > > hardware with a single queue vs software algorithms that can be
> > > executed by all CPUs in parallel).
> > >
> > > When it comes to comparative security, trustworthiness or robustness
> > > of implementations, it is simply left up to the user to blacklist
> > > modules that they prefer not to use. When fallback allocations are
> > > made in the correct way, the remaining available implementations will
> > > be used in priority order.
> >
> > So, let me see if I understand the full picture correctly now and how to utilize
> > the blacklisting of modules as a user. Suppose I want to blacklist everything but
> > my OSC driver module. So, if I am as a user refer to a specific driver
> implementation
> > using a unique driver name (ecdh-keembay-ocs in our case), then regardless of the
> > fact that a driver implements this SW fallback for P-192 curve, if I am as a user to
> > ask for P-192 curve (or any other param that results in SW fallback), I will be
> notified
> > that this requested implementation does not provide it?
> >
>
> This is rather unusual compared with how the crypto API is typically
> used,

The interesting part here is that I think the use case I am describing is
pretty generic, it is strange to see why noone has raised this before.
One would think that this is the purpose of having these more secure crypto
HW implementations supported via drivers to make sure that the users
that care about high-grade security actually get what they expect and
the risk of mistakes is minimized.

but if this is really what you want to implement, you can do so
> by:
> - having a "ecdh" implementation that implements the entire range, and
> uses a fallback for curves that it does not implement
> - export the same implementation again as "ecdh" and with a known
> driver name "ecdh-keembay-ocs", but with a slightly lower priority,
> and in this case, return an error when the unimplemented curve is
> requested.
>
> That way, you fully adhere to the API, by providing implementations of
> all curves by default. And if a user requests "ecdh-keembay-ocs"
> explicitly, it will not be able to use the P192 curve inadvertently.

Thank you very much Ard for this explanation! We will see how such
implementation definition would look in practice first before thinking
on the way to proceed. It might end up looking strange and confusing,
and if so, it would at the end destroy the original idea to make "it simple
and secure for the driver users to use the provided functionality".
I have not seen this being done at all before at any existing crypto drivers,
that's why I was thinking this is not supported to begin with.

>
> But policing which curves are secure and which are not is really not
> the job of the API. We have implementations of MD5 and RC4 in the
> kernel that we would *love* to remove but we simply cannot do so as
> long as they are still being used. The same applies to P192: we simply
> cannot fail requests for that curve for use cases that were previously
> deemed valid. It is perfectly reasonable to omit the implementation
> from your hardware, but banning its use outright on the grounds that
> is no longer secure conflicts with our requirement not to break
> existing use cases.

I agree with the above logic. In the light that the crypto API is a toolbox for the
crypto and not the security subsystem that aims to make any guarantees on
the end security properties it all makes sense. I guess I was only asking if this
toolbox can help its users to minimize potential mistakes and be configurable
enough to support different usage scenarios. But if the above works, I guess
it supports it, just hope the end declaration is not too messy to create other
(different) usage problems.

Best Regards,
Elena.

2021-01-20 19:02:14

by Alessandrelli, Daniele

[permalink] [raw]
Subject: Re: [RFC PATCH 0/6] Keem Bay OCS ECC crypto driver

Hi Ard,

Thank you very much for your valuable feedback.

On Mon, 2021-01-18 at 13:09 +0100, Ard Biesheuvel wrote:
> This is rather unusual compared with how the crypto API is typically
> used, but if this is really what you want to implement, you can do so
> by:
> - having a "ecdh" implementation that implements the entire range, and
> uses a fallback for curves that it does not implement
> - export the same implementation again as "ecdh" and with a known
> driver name "ecdh-keembay-ocs", but with a slightly lower priority,
> and in this case, return an error when the unimplemented curve is
> requested.
>
> That way, you fully adhere to the API, by providing implementations of
> all curves by default. And if a user requests "ecdh-keembay-ocs"
> explicitly, it will not be able to use the P192 curve inadvertently.

I tried to implement this, but it looks like the driver name is
mandatory, so I specified one also for the first implementation.

Basically I defined two 'struct kpp_alg' variables; both with cra_name
= "ecdh", but with different 'cra_driver_name' (one with
cra_driver_name = "ecdh-keembay-ocs-fallback" and the other one with
cra_driver_name = "ecdh-keembay-ocs").

Is this what you were suggesting?

Anyway, that works (i.e., 'ecdh-keembay-ocs' returns an error when the
unimplemented curve is requested; while 'ecdh-keembay-ocs' and 'ecdh'
work fine with any curve), but I have to set the priority of 'ecdh-
keembay-ocs' to something lower than the 'ecdh_generic' priority.
Otherwise the implementation with fallback ends up using my "ecdh-
keembay-ocs" as fallback (so it ends up using a fallback that still
does not support the P-192 curve).

Also, the implementation without fallback is still failing crypto self-
tests (as expected I guess).

Therefore, I tried with a slightly different solution. Still two
implementations, but with different cra_names (one with cra_name =
"ecdh" and the other one with cra_name = "ecdh-keembay"). This solution
seems to be working, since, the "ecdh-keembay" is not tested by the
self tests and is not picked up as fallback for "ecdh" (since, if I
understand it correctly, it's like if I'm defining a new kind of kpp
algorithm), but it's still picked when calling crypto_alloc_kpp("ecdh-
keembay").

Does this second solution looks okay to you? Or does it have some
pitfall that I'm missing?

Regards,
Daniele


2021-01-21 09:56:20

by Ard Biesheuvel

[permalink] [raw]
Subject: Re: [RFC PATCH 0/6] Keem Bay OCS ECC crypto driver

On Wed, 20 Jan 2021 at 20:00, Alessandrelli, Daniele
<[email protected]> wrote:
>
> Hi Ard,
>
> Thank you very much for your valuable feedback.
>
> On Mon, 2021-01-18 at 13:09 +0100, Ard Biesheuvel wrote:
> > This is rather unusual compared with how the crypto API is typically
> > used, but if this is really what you want to implement, you can do so
> > by:
> > - having a "ecdh" implementation that implements the entire range, and
> > uses a fallback for curves that it does not implement
> > - export the same implementation again as "ecdh" and with a known
> > driver name "ecdh-keembay-ocs", but with a slightly lower priority,
> > and in this case, return an error when the unimplemented curve is
> > requested.
> >
> > That way, you fully adhere to the API, by providing implementations of
> > all curves by default. And if a user requests "ecdh-keembay-ocs"
> > explicitly, it will not be able to use the P192 curve inadvertently.
>
> I tried to implement this, but it looks like the driver name is
> mandatory, so I specified one also for the first implementation.
>
> Basically I defined two 'struct kpp_alg' variables; both with cra_name
> = "ecdh", but with different 'cra_driver_name' (one with
> cra_driver_name = "ecdh-keembay-ocs-fallback" and the other one with
> cra_driver_name = "ecdh-keembay-ocs").
>
> Is this what you were suggesting?
>
> Anyway, that works (i.e., 'ecdh-keembay-ocs' returns an error when the
> unimplemented curve is requested; while 'ecdh-keembay-ocs' and 'ecdh'
> work fine with any curve), but I have to set the priority of 'ecdh-
> keembay-ocs' to something lower than the 'ecdh_generic' priority.
> Otherwise the implementation with fallback ends up using my "ecdh-
> keembay-ocs" as fallback (so it ends up using a fallback that still
> does not support the P-192 curve).
>
> Also, the implementation without fallback is still failing crypto self-
> tests (as expected I guess).
>
> Therefore, I tried with a slightly different solution. Still two
> implementations, but with different cra_names (one with cra_name =
> "ecdh" and the other one with cra_name = "ecdh-keembay"). This solution
> seems to be working, since, the "ecdh-keembay" is not tested by the
> self tests and is not picked up as fallback for "ecdh" (since, if I
> understand it correctly, it's like if I'm defining a new kind of kpp
> algorithm), but it's still picked when calling crypto_alloc_kpp("ecdh-
> keembay").
>
> Does this second solution looks okay to you? Or does it have some
> pitfall that I'm missing?
>

You should set the CRYPTO_ALG_NEED_FALLBACK flag on both
implementations, to ensure that neither of them are considered as
fallbacks themselves.

2021-01-21 16:16:10

by Alessandrelli, Daniele

[permalink] [raw]
Subject: Re: [RFC PATCH 0/6] Keem Bay OCS ECC crypto driver

On Thu, 2021-01-21 at 10:52 +0100, Ard Biesheuvel wrote:
> On Wed, 20 Jan 2021 at 20:00, Alessandrelli, Daniele
> <[email protected]> wrote:
> > Hi Ard,
> >
> > Thank you very much for your valuable feedback.
> >
> > On Mon, 2021-01-18 at 13:09 +0100, Ard Biesheuvel wrote:
> > > This is rather unusual compared with how the crypto API is
> > > typically
> > > used, but if this is really what you want to implement, you can
> > > do so
> > > by:
> > > - having a "ecdh" implementation that implements the entire
> > > range, and
> > > uses a fallback for curves that it does not implement
> > > - export the same implementation again as "ecdh" and with a known
> > > driver name "ecdh-keembay-ocs", but with a slightly lower
> > > priority,
> > > and in this case, return an error when the unimplemented curve is
> > > requested.
> > >
> > > That way, you fully adhere to the API, by providing
> > > implementations of
> > > all curves by default. And if a user requests "ecdh-keembay-ocs"
> > > explicitly, it will not be able to use the P192 curve
> > > inadvertently.
> >
> > I tried to implement this, but it looks like the driver name is
> > mandatory, so I specified one also for the first implementation.
> >
> > Basically I defined two 'struct kpp_alg' variables; both with
> > cra_name
> > = "ecdh", but with different 'cra_driver_name' (one with
> > cra_driver_name = "ecdh-keembay-ocs-fallback" and the other one
> > with
> > cra_driver_name = "ecdh-keembay-ocs").
> >
> > Is this what you were suggesting?
> >
> > Anyway, that works (i.e., 'ecdh-keembay-ocs' returns an error when
> > the
> > unimplemented curve is requested; while 'ecdh-keembay-ocs' and
> > 'ecdh'
> > work fine with any curve), but I have to set the priority of 'ecdh-
> > keembay-ocs' to something lower than the 'ecdh_generic' priority.
> > Otherwise the implementation with fallback ends up using my "ecdh-
> > keembay-ocs" as fallback (so it ends up using a fallback that still
> > does not support the P-192 curve).
> >
> > Also, the implementation without fallback is still failing crypto
> > self-
> > tests (as expected I guess).
> >
> > Therefore, I tried with a slightly different solution. Still two
> > implementations, but with different cra_names (one with cra_name =
> > "ecdh" and the other one with cra_name = "ecdh-keembay"). This
> > solution
> > seems to be working, since, the "ecdh-keembay" is not tested by the
> > self tests and is not picked up as fallback for "ecdh" (since, if I
> > understand it correctly, it's like if I'm defining a new kind of
> > kpp
> > algorithm), but it's still picked when calling
> > crypto_alloc_kpp("ecdh-
> > keembay").
> >
> > Does this second solution looks okay to you? Or does it have some
> > pitfall that I'm missing?
> >
>
> You should set the CRYPTO_ALG_NEED_FALLBACK flag on both
> implementations, to ensure that neither of them are considered as
> fallbacks themselves.

Thanks again!

I was setting that flag only for the first implementation (the one with
fallback), but I see now how it's needed for the second one as well.

With that, the second implementation (i.e., the one without fallback)
is not used anymore as a fallback for the first one.

As expected, the second implementation does not pass self-tests and
crypto_alloc_kpp() returns -ELIBBAD when trying to allocate it, but
I've seen that I can avoid the error (and have it allocated properly)
by passing the CRYPTO_ALG_TESTED flag in the 'type' argument, like
below:

crypto_alloc_kpp("ecdh-keembay-ocs", CRYPTO_ALG_TESTED, 0);

Is that the right way to tell crypto_alloc_kpp() that we are fine using
an implementation that fails self-tests?



2021-01-21 20:06:01

by Herbert Xu

[permalink] [raw]
Subject: Re: [RFC PATCH 0/6] Keem Bay OCS ECC crypto driver

On Thu, Jan 21, 2021 at 04:13:51PM +0000, Alessandrelli, Daniele wrote:
>
> As expected, the second implementation does not pass self-tests and
> crypto_alloc_kpp() returns -ELIBBAD when trying to allocate it, but
> I've seen that I can avoid the error (and have it allocated properly)
> by passing the CRYPTO_ALG_TESTED flag in the 'type' argument, like
> below:

Did you set your algorithm's name to ecdh? I think Ard was suggesting
you to not do that. As long as you're not using the same name as a
recognised algorithm, then you won't need to pass any self-tests at
all.

Cheers,
--
Email: Herbert Xu <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

2021-01-22 12:10:40

by Alessandrelli, Daniele

[permalink] [raw]
Subject: Re: [RFC PATCH 0/6] Keem Bay OCS ECC crypto driver

On Fri, 2021-01-22 at 07:02 +1100, Herbert Xu wrote:
> On Thu, Jan 21, 2021 at 04:13:51PM +0000, Alessandrelli, Daniele
> wrote:
> > As expected, the second implementation does not pass self-tests and
> > crypto_alloc_kpp() returns -ELIBBAD when trying to allocate it, but
> > I've seen that I can avoid the error (and have it allocated
> > properly)
> > by passing the CRYPTO_ALG_TESTED flag in the 'type' argument, like
> > below:
>
> Did you set your algorithm's name to ecdh? I think Ard was suggesting
> you to not do that. As long as you're not using the same name as a
> recognised algorithm, then you won't need to pass any self-tests at
> all.
>

Oh, looks like I misunderstood Ard suggestion.

I will change the second implementation to use a different algorithm
name (cra_name), something like 'ecdh-keembay-ocs'.

Thanks,
Daniele