2015-03-08 20:08:00

by Aaro Koskinen

Subject: [PATCH 0/7] crypto: OCTEON MD5 bugfix + SHA modules

Hi,

The first patch is a bug fix for OCTEON MD5, aimed at the 4.0-rc cycle.

The remaining patches add SHA1/SHA256/SHA512 modules. I have tested
these on the following OCTEON boards and CPUs with 4.0-rc2:

D-Link DSR-1000N: CN5010p1.1-500-SCP
EdgeRouter Lite: CN5020p1.1-500-SCP
EdgeRouter Pro: CN6120p1.1-1000-NSP

All selftests pass. With tcrypt, I get the following speedups over
the generic modules:

SHA1:

test 0 ( 16 byte blocks, 16 bytes per update, 1 updates): 1.25x faster
test 1 ( 64 byte blocks, 16 bytes per update, 4 updates): 1.20x faster
test 2 ( 64 byte blocks, 64 bytes per update, 1 updates): 1.47x faster
test 3 ( 256 byte blocks, 16 bytes per update, 16 updates): 1.15x faster
test 4 ( 256 byte blocks, 64 bytes per update, 4 updates): 1.56x faster
test 5 ( 256 byte blocks, 256 bytes per update, 1 updates): 2.27x faster
test 6 ( 1024 byte blocks, 16 bytes per update, 64 updates): 1.13x faster
test 7 ( 1024 byte blocks, 256 bytes per update, 4 updates): 2.74x faster
test 8 ( 1024 byte blocks, 1024 bytes per update, 1 updates): 3.60x faster
test 9 ( 2048 byte blocks, 16 bytes per update, 128 updates): 1.13x faster
test 10 ( 2048 byte blocks, 256 bytes per update, 8 updates): 2.87x faster
test 11 ( 2048 byte blocks, 1024 bytes per update, 2 updates): 3.90x faster
test 12 ( 2048 byte blocks, 2048 bytes per update, 1 updates): 4.18x faster
test 13 ( 4096 byte blocks, 16 bytes per update, 256 updates): 1.13x faster
test 14 ( 4096 byte blocks, 256 bytes per update, 16 updates): 2.95x faster
test 15 ( 4096 byte blocks, 1024 bytes per update, 4 updates): 4.09x faster
test 16 ( 4096 byte blocks, 4096 bytes per update, 1 updates): 4.57x faster
test 17 ( 8192 byte blocks, 16 bytes per update, 512 updates): 1.13x faster
test 18 ( 8192 byte blocks, 256 bytes per update, 32 updates): 2.99x faster
test 19 ( 8192 byte blocks, 1024 bytes per update, 8 updates): 4.20x faster
test 20 ( 8192 byte blocks, 4096 bytes per update, 2 updates): 4.72x faster
test 21 ( 8192 byte blocks, 8192 bytes per update, 1 updates): 4.73x faster

SHA256:

test 0 ( 16 byte blocks, 16 bytes per update, 1 updates): 2.72x faster
test 1 ( 64 byte blocks, 16 bytes per update, 4 updates): 2.45x faster
test 2 ( 64 byte blocks, 64 bytes per update, 1 updates): 3.65x faster
test 3 ( 256 byte blocks, 16 bytes per update, 16 updates): 2.18x faster
test 4 ( 256 byte blocks, 64 bytes per update, 4 updates): 3.74x faster
test 5 ( 256 byte blocks, 256 bytes per update, 1 updates): 5.72x faster
test 6 ( 1024 byte blocks, 16 bytes per update, 64 updates): 2.08x faster
test 7 ( 1024 byte blocks, 256 bytes per update, 4 updates): 6.54x faster
test 8 ( 1024 byte blocks, 1024 bytes per update, 1 updates): 8.19x faster
test 9 ( 2048 byte blocks, 16 bytes per update, 128 updates): 2.06x faster
test 10 ( 2048 byte blocks, 256 bytes per update, 8 updates): 6.77x faster
test 11 ( 2048 byte blocks, 1024 bytes per update, 2 updates): 8.56x faster
test 12 ( 2048 byte blocks, 2048 bytes per update, 1 updates): 9.01x faster
test 13 ( 4096 byte blocks, 16 bytes per update, 256 updates): 2.05x faster
test 14 ( 4096 byte blocks, 256 bytes per update, 16 updates): 6.89x faster
test 15 ( 4096 byte blocks, 1024 bytes per update, 4 updates): 8.82x faster
test 16 ( 4096 byte blocks, 4096 bytes per update, 1 updates): 9.50x faster
test 17 ( 8192 byte blocks, 16 bytes per update, 512 updates): 2.04x faster
test 18 ( 8192 byte blocks, 256 bytes per update, 32 updates): 6.96x faster
test 19 ( 8192 byte blocks, 1024 bytes per update, 8 updates): 8.95x faster
test 20 ( 8192 byte blocks, 4096 bytes per update, 2 updates): 9.66x faster
test 21 ( 8192 byte blocks, 8192 bytes per update, 1 updates): 9.67x faster

SHA512:

test 0 ( 16 byte blocks, 16 bytes per update, 1 updates): 3.19x faster
test 1 ( 64 byte blocks, 16 bytes per update, 4 updates): 2.18x faster
test 2 ( 64 byte blocks, 64 bytes per update, 1 updates): 3.19x faster
test 3 ( 256 byte blocks, 16 bytes per update, 16 updates): 2.12x faster
test 4 ( 256 byte blocks, 64 bytes per update, 4 updates): 3.54x faster
test 5 ( 256 byte blocks, 256 bytes per update, 1 updates): 5.16x faster
test 6 ( 1024 byte blocks, 16 bytes per update, 64 updates): 1.92x faster
test 7 ( 1024 byte blocks, 256 bytes per update, 4 updates): 5.80x faster
test 8 ( 1024 byte blocks, 1024 bytes per update, 1 updates): 8.07x faster
test 9 ( 2048 byte blocks, 16 bytes per update, 128 updates): 1.88x faster
test 10 ( 2048 byte blocks, 256 bytes per update, 8 updates): 6.00x faster
test 11 ( 2048 byte blocks, 1024 bytes per update, 2 updates): 8.64x faster
test 12 ( 2048 byte blocks, 2048 bytes per update, 1 updates): 9.40x faster
test 13 ( 4096 byte blocks, 16 bytes per update, 256 updates): 1.86x faster
test 14 ( 4096 byte blocks, 256 bytes per update, 16 updates): 6.12x faster
test 15 ( 4096 byte blocks, 1024 bytes per update, 4 updates): 9.03x faster
test 16 ( 4096 byte blocks, 4096 bytes per update, 1 updates): 10.31x faster
test 17 ( 8192 byte blocks, 16 bytes per update, 512 updates): 1.85x faster
test 18 ( 8192 byte blocks, 256 bytes per update, 32 updates): 6.18x faster
test 19 ( 8192 byte blocks, 1024 bytes per update, 8 updates): 9.26x faster
test 20 ( 8192 byte blocks, 4096 bytes per update, 2 updates): 10.64x faster
test 21 ( 8192 byte blocks, 8192 bytes per update, 1 updates): 10.65x faster

A.

Aaro Koskinen (7):
crypto: octeon - don't disable bottom half in octeon-md5
crypto: octeon - always disable preemption when using crypto engine
crypto: octeon - add instruction definitions for SHA1/256/512
crypto: octeon - add SHA1 module
crypto: octeon - add SHA256 module
crypto: octeon - add SHA512 module
crypto: octeon - enable OCTEON SHA1/256/512 module selection

arch/mips/cavium-octeon/crypto/Makefile | 5 +-
arch/mips/cavium-octeon/crypto/octeon-crypto.c | 4 +-
arch/mips/cavium-octeon/crypto/octeon-crypto.h | 83 +++++++-
arch/mips/cavium-octeon/crypto/octeon-md5.c | 8 -
arch/mips/cavium-octeon/crypto/octeon-sha1.c | 241 +++++++++++++++++++++
arch/mips/cavium-octeon/crypto/octeon-sha256.c | 280 +++++++++++++++++++++++++
arch/mips/cavium-octeon/crypto/octeon-sha512.c | 277 ++++++++++++++++++++++++
crypto/Kconfig | 27 +++
8 files changed, 911 insertions(+), 14 deletions(-)
create mode 100644 arch/mips/cavium-octeon/crypto/octeon-sha1.c
create mode 100644 arch/mips/cavium-octeon/crypto/octeon-sha256.c
create mode 100644 arch/mips/cavium-octeon/crypto/octeon-sha512.c

--
2.2.0


2015-03-08 20:07:43

by Aaro Koskinen

Subject: [PATCH 3/7] crypto: octeon - add instruction definitions for SHA1/256/512

Add instruction definitions for SHA1/256/512.

Signed-off-by: Aaro Koskinen <[email protected]>
---
arch/mips/cavium-octeon/crypto/octeon-crypto.h | 83 ++++++++++++++++++++++++--
1 file changed, 79 insertions(+), 4 deletions(-)

diff --git a/arch/mips/cavium-octeon/crypto/octeon-crypto.h b/arch/mips/cavium-octeon/crypto/octeon-crypto.h
index e2a4aec..3550725 100644
--- a/arch/mips/cavium-octeon/crypto/octeon-crypto.h
+++ b/arch/mips/cavium-octeon/crypto/octeon-crypto.h
@@ -5,7 +5,8 @@
*
* Copyright (C) 2012-2013 Cavium Inc., All Rights Reserved.
*
- * MD5 instruction definitions added by Aaro Koskinen <[email protected]>.
+ * MD5/SHA1/SHA256/SHA512 instruction definitions added by
+ * Aaro Koskinen <[email protected]>.
*
*/
#ifndef __LINUX_OCTEON_CRYPTO_H
@@ -21,11 +22,11 @@ extern void octeon_crypto_disable(struct octeon_cop2_state *state,
unsigned long flags);

/*
- * Macros needed to implement MD5:
+ * Macros needed to implement MD5/SHA1/SHA256:
*/

/*
- * The index can be 0-1.
+ * The index can be 0-1 (MD5) or 0-2 (SHA1), 0-3 (SHA256).
*/
#define write_octeon_64bit_hash_dword(value, index) \
do { \
@@ -36,7 +37,7 @@ do { \
} while (0)

/*
- * The index can be 0-1.
+ * The index can be 0-1 (MD5) or 0-2 (SHA1), 0-3 (SHA256).
*/
#define read_octeon_64bit_hash_dword(index) \
({ \
@@ -72,4 +73,78 @@ do { \
: [rt] "d" (value)); \
} while (0)

+/*
+ * The value is the final block dword (64-bit).
+ */
+#define octeon_sha1_start(value) \
+do { \
+ __asm__ __volatile__ ( \
+ "dmtc2 %[rt],0x4057" \
+ : \
+ : [rt] "d" (value)); \
+} while (0)
+
+/*
+ * The value is the final block dword (64-bit).
+ */
+#define octeon_sha256_start(value) \
+do { \
+ __asm__ __volatile__ ( \
+ "dmtc2 %[rt],0x404f" \
+ : \
+ : [rt] "d" (value)); \
+} while (0)
+
+/*
+ * Macros needed to implement SHA512:
+ */
+
+/*
+ * The index can be 0-7.
+ */
+#define write_octeon_64bit_hash_sha512(value, index) \
+do { \
+ __asm__ __volatile__ ( \
+ "dmtc2 %[rt],0x0250+" STR(index) \
+ : \
+ : [rt] "d" (value)); \
+} while (0)
+
+/*
+ * The index can be 0-7.
+ */
+#define read_octeon_64bit_hash_sha512(index) \
+({ \
+ u64 __value; \
+ \
+ __asm__ __volatile__ ( \
+ "dmfc2 %[rt],0x0250+" STR(index) \
+ : [rt] "=d" (__value) \
+ : ); \
+ \
+ __value; \
+})
+
+/*
+ * The index can be 0-14.
+ */
+#define write_octeon_64bit_block_sha512(value, index) \
+do { \
+ __asm__ __volatile__ ( \
+ "dmtc2 %[rt],0x0240+" STR(index) \
+ : \
+ : [rt] "d" (value)); \
+} while (0)
+
+/*
+ * The value is the final block word (64-bit).
+ */
+#define octeon_sha512_start(value) \
+do { \
+ __asm__ __volatile__ ( \
+ "dmtc2 %[rt],0x424f" \
+ : \
+ : [rt] "d" (value)); \
+} while (0)
+
#endif /* __LINUX_OCTEON_CRYPTO_H */
--
2.2.0

2015-03-08 20:07:41

by Aaro Koskinen

Subject: [PATCH 1/7] crypto: octeon - don't disable bottom half in octeon-md5

Don't disable bottom halves while the crypto engine is in use, as it
should be unnecessary: all kernel crypto engine usage is wrapped with
crypto engine state save/restore, so if we get interrupted by a softirq
that uses crypto, it should save and restore our context.

This actually fixes an issue when running OCTEON MD5 with interrupts
disabled (tcrypt mode=302). There's a WARNING because the module is
trying to enable the bottom half with irqs disabled:

[ 52.656610] ------------[ cut here ]------------
[ 52.661439] WARNING: CPU: 1 PID: 428 at /home/aaro/git/linux/kernel/softirq.c:150 __local_bh_enable_ip+0x9c/0xd8()
[ 52.671780] Modules linked in: tcrypt(+)
[...]
[ 52.763539] [<ffffffff8114082c>] warn_slowpath_common+0x94/0xd8
[ 52.769465] [<ffffffff81144614>] __local_bh_enable_ip+0x9c/0xd8
[ 52.775390] [<ffffffff81119574>] octeon_md5_final+0x12c/0x1e8
[ 52.781144] [<ffffffff81337050>] shash_compat_digest+0xd0/0x1b0

Signed-off-by: Aaro Koskinen <[email protected]>
---
arch/mips/cavium-octeon/crypto/octeon-md5.c | 4 ----
1 file changed, 4 deletions(-)

diff --git a/arch/mips/cavium-octeon/crypto/octeon-md5.c b/arch/mips/cavium-octeon/crypto/octeon-md5.c
index b909881..3dd8845 100644
--- a/arch/mips/cavium-octeon/crypto/octeon-md5.c
+++ b/arch/mips/cavium-octeon/crypto/octeon-md5.c
@@ -97,7 +97,6 @@ static int octeon_md5_update(struct shash_desc *desc, const u8 *data,
memcpy((char *)mctx->block + (sizeof(mctx->block) - avail), data,
avail);

- local_bh_disable();
preempt_disable();
flags = octeon_crypto_enable(&state);
octeon_md5_store_hash(mctx);
@@ -115,7 +114,6 @@ static int octeon_md5_update(struct shash_desc *desc, const u8 *data,
octeon_md5_read_hash(mctx);
octeon_crypto_disable(&state, flags);
preempt_enable();
- local_bh_enable();

memcpy(mctx->block, data, len);

@@ -133,7 +131,6 @@ static int octeon_md5_final(struct shash_desc *desc, u8 *out)

*p++ = 0x80;

- local_bh_disable();
preempt_disable();
flags = octeon_crypto_enable(&state);
octeon_md5_store_hash(mctx);
@@ -153,7 +150,6 @@ static int octeon_md5_final(struct shash_desc *desc, u8 *out)
octeon_md5_read_hash(mctx);
octeon_crypto_disable(&state, flags);
preempt_enable();
- local_bh_enable();

memcpy(out, mctx->hash, sizeof(mctx->hash));
memset(mctx, 0, sizeof(*mctx));
--
2.2.0
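
The reasoning in the commit message above (a nested user of the crypto engine saves and restores the interrupted user's context) can be modelled in plain user-space C. This is only a sketch under stated assumptions: every `sim_*` name is hypothetical, the two-dword register file is not the real COP2 layout, and the real octeon_crypto_enable()/octeon_crypto_disable() also handle ST0_CU2 and IRQ flags, which are left out here.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the COP2 hash registers; not the real layout. */
struct sim_cop2_state {
	uint64_t hash[2];
};

/* The single simulated "hardware" register file shared by all users. */
static struct sim_cop2_state sim_hw;

/* Save whatever the previous user left in the engine. */
static void sim_crypto_enable(struct sim_cop2_state *saved)
{
	assert(saved != NULL);
	memcpy(saved, &sim_hw, sizeof(*saved));
}

/* Put the previous user's context back. */
static void sim_crypto_disable(const struct sim_cop2_state *saved)
{
	assert(saved != NULL);
	memcpy(&sim_hw, saved, sizeof(sim_hw));
}

/* Models a softirq that uses the engine while another user is mid-operation. */
static void sim_softirq_user(void)
{
	struct sim_cop2_state saved;

	sim_crypto_enable(&saved);
	sim_hw.hash[0] = 0xdeadbeefULL;	/* clobber the registers */
	sim_hw.hash[1] = 0xfeedfaceULL;
	sim_crypto_disable(&saved);
}

/* Returns 1 if the outer user's register contents survive the interruption. */
static int sim_outer_context_survives(void)
{
	struct sim_cop2_state saved;
	int ok;

	sim_crypto_enable(&saved);
	sim_hw.hash[0] = 0x1111;
	sim_hw.hash[1] = 0x2222;

	sim_softirq_user();	/* the "interrupt" arrives here */

	ok = sim_hw.hash[0] == 0x1111 && sim_hw.hash[1] == 0x2222;
	sim_crypto_disable(&saved);
	return ok;
}
```

In this model the outer user never needs to block the inner one, because the inner one leaves the registers exactly as it found them. That is the property the patch relies on when it drops local_bh_disable().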

2015-03-08 20:07:46

by Aaro Koskinen

Subject: [PATCH 6/7] crypto: octeon - add SHA512 module

Add OCTEON SHA512 module.

Signed-off-by: Aaro Koskinen <[email protected]>
---
arch/mips/cavium-octeon/crypto/Makefile | 1 +
arch/mips/cavium-octeon/crypto/octeon-sha512.c | 277 +++++++++++++++++++++++++
2 files changed, 278 insertions(+)
create mode 100644 arch/mips/cavium-octeon/crypto/octeon-sha512.c

diff --git a/arch/mips/cavium-octeon/crypto/Makefile b/arch/mips/cavium-octeon/crypto/Makefile
index 47806a5..f7aa9d5 100644
--- a/arch/mips/cavium-octeon/crypto/Makefile
+++ b/arch/mips/cavium-octeon/crypto/Makefile
@@ -7,3 +7,4 @@ obj-y += octeon-crypto.o
obj-$(CONFIG_CRYPTO_MD5_OCTEON) += octeon-md5.o
obj-$(CONFIG_CRYPTO_SHA1_OCTEON) += octeon-sha1.o
obj-$(CONFIG_CRYPTO_SHA256_OCTEON) += octeon-sha256.o
+obj-$(CONFIG_CRYPTO_SHA512_OCTEON) += octeon-sha512.o
diff --git a/arch/mips/cavium-octeon/crypto/octeon-sha512.c b/arch/mips/cavium-octeon/crypto/octeon-sha512.c
new file mode 100644
index 0000000..d5fb3c6
--- /dev/null
+++ b/arch/mips/cavium-octeon/crypto/octeon-sha512.c
@@ -0,0 +1,277 @@
+/*
+ * Cryptographic API.
+ *
+ * SHA-512 and SHA-384 Secure Hash Algorithm.
+ *
+ * Adapted for OCTEON by Aaro Koskinen <[email protected]>.
+ *
+ * Based on crypto/sha512_generic.c, which is:
+ *
+ * Copyright (c) Jean-Luc Cooke <[email protected]>
+ * Copyright (c) Andrew McDonald <[email protected]>
+ * Copyright (c) 2003 Kyle McMartin <[email protected]>
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License as published by the
+ * Free Software Foundation; either version 2, or (at your option) any
+ * later version.
+ */
+
+#include <linux/mm.h>
+#include <crypto/sha.h>
+#include <linux/init.h>
+#include <linux/types.h>
+#include <linux/module.h>
+#include <asm/byteorder.h>
+#include <asm/octeon/octeon.h>
+#include <crypto/internal/hash.h>
+
+#include "octeon-crypto.h"
+
+/*
+ * We pass everything as 64-bit. OCTEON can handle misaligned data.
+ */
+
+static void octeon_sha512_store_hash(struct sha512_state *sctx)
+{
+ write_octeon_64bit_hash_sha512(sctx->state[0], 0);
+ write_octeon_64bit_hash_sha512(sctx->state[1], 1);
+ write_octeon_64bit_hash_sha512(sctx->state[2], 2);
+ write_octeon_64bit_hash_sha512(sctx->state[3], 3);
+ write_octeon_64bit_hash_sha512(sctx->state[4], 4);
+ write_octeon_64bit_hash_sha512(sctx->state[5], 5);
+ write_octeon_64bit_hash_sha512(sctx->state[6], 6);
+ write_octeon_64bit_hash_sha512(sctx->state[7], 7);
+}
+
+static void octeon_sha512_read_hash(struct sha512_state *sctx)
+{
+ sctx->state[0] = read_octeon_64bit_hash_sha512(0);
+ sctx->state[1] = read_octeon_64bit_hash_sha512(1);
+ sctx->state[2] = read_octeon_64bit_hash_sha512(2);
+ sctx->state[3] = read_octeon_64bit_hash_sha512(3);
+ sctx->state[4] = read_octeon_64bit_hash_sha512(4);
+ sctx->state[5] = read_octeon_64bit_hash_sha512(5);
+ sctx->state[6] = read_octeon_64bit_hash_sha512(6);
+ sctx->state[7] = read_octeon_64bit_hash_sha512(7);
+}
+
+static void octeon_sha512_transform(const void *_block)
+{
+ const u64 *block = _block;
+
+ write_octeon_64bit_block_sha512(block[0], 0);
+ write_octeon_64bit_block_sha512(block[1], 1);
+ write_octeon_64bit_block_sha512(block[2], 2);
+ write_octeon_64bit_block_sha512(block[3], 3);
+ write_octeon_64bit_block_sha512(block[4], 4);
+ write_octeon_64bit_block_sha512(block[5], 5);
+ write_octeon_64bit_block_sha512(block[6], 6);
+ write_octeon_64bit_block_sha512(block[7], 7);
+ write_octeon_64bit_block_sha512(block[8], 8);
+ write_octeon_64bit_block_sha512(block[9], 9);
+ write_octeon_64bit_block_sha512(block[10], 10);
+ write_octeon_64bit_block_sha512(block[11], 11);
+ write_octeon_64bit_block_sha512(block[12], 12);
+ write_octeon_64bit_block_sha512(block[13], 13);
+ write_octeon_64bit_block_sha512(block[14], 14);
+ octeon_sha512_start(block[15]);
+}
+
+static int octeon_sha512_init(struct shash_desc *desc)
+{
+ struct sha512_state *sctx = shash_desc_ctx(desc);
+
+ sctx->state[0] = SHA512_H0;
+ sctx->state[1] = SHA512_H1;
+ sctx->state[2] = SHA512_H2;
+ sctx->state[3] = SHA512_H3;
+ sctx->state[4] = SHA512_H4;
+ sctx->state[5] = SHA512_H5;
+ sctx->state[6] = SHA512_H6;
+ sctx->state[7] = SHA512_H7;
+ sctx->count[0] = sctx->count[1] = 0;
+
+ return 0;
+}
+
+static int octeon_sha384_init(struct shash_desc *desc)
+{
+ struct sha512_state *sctx = shash_desc_ctx(desc);
+
+ sctx->state[0] = SHA384_H0;
+ sctx->state[1] = SHA384_H1;
+ sctx->state[2] = SHA384_H2;
+ sctx->state[3] = SHA384_H3;
+ sctx->state[4] = SHA384_H4;
+ sctx->state[5] = SHA384_H5;
+ sctx->state[6] = SHA384_H6;
+ sctx->state[7] = SHA384_H7;
+ sctx->count[0] = sctx->count[1] = 0;
+
+ return 0;
+}
+
+static void __octeon_sha512_update(struct sha512_state *sctx, const u8 *data,
+ unsigned int len)
+{
+ unsigned int part_len;
+ unsigned int index;
+ unsigned int i;
+
+ /* Compute number of bytes mod 128. */
+ index = sctx->count[0] % SHA512_BLOCK_SIZE;
+
+ /* Update number of bytes. */
+ if ((sctx->count[0] += len) < len)
+ sctx->count[1]++;
+
+ part_len = SHA512_BLOCK_SIZE - index;
+
+ /* Transform as many times as possible. */
+ if (len >= part_len) {
+ memcpy(&sctx->buf[index], data, part_len);
+ octeon_sha512_transform(sctx->buf);
+
+ for (i = part_len; i + SHA512_BLOCK_SIZE <= len;
+ i += SHA512_BLOCK_SIZE)
+ octeon_sha512_transform(&data[i]);
+
+ index = 0;
+ } else {
+ i = 0;
+ }
+
+ /* Buffer remaining input. */
+ memcpy(&sctx->buf[index], &data[i], len - i);
+}
+
+static int octeon_sha512_update(struct shash_desc *desc, const u8 *data,
+ unsigned int len)
+{
+ struct sha512_state *sctx = shash_desc_ctx(desc);
+ struct octeon_cop2_state state;
+ unsigned long flags;
+
+ /*
+ * Small updates never reach the crypto engine, so the generic sha512 is
+ * faster because of the heavyweight octeon_crypto_enable() /
+ * octeon_crypto_disable().
+ */
+ if ((sctx->count[0] % SHA512_BLOCK_SIZE) + len < SHA512_BLOCK_SIZE)
+ return crypto_sha512_update(desc, data, len);
+
+ flags = octeon_crypto_enable(&state);
+ octeon_sha512_store_hash(sctx);
+
+ __octeon_sha512_update(sctx, data, len);
+
+ octeon_sha512_read_hash(sctx);
+ octeon_crypto_disable(&state, flags);
+
+ return 0;
+}
+
+static int octeon_sha512_final(struct shash_desc *desc, u8 *hash)
+{
+ struct sha512_state *sctx = shash_desc_ctx(desc);
+ static u8 padding[128] = { 0x80, };
+ struct octeon_cop2_state state;
+ __be64 *dst = (__be64 *)hash;
+ unsigned int pad_len;
+ unsigned long flags;
+ unsigned int index;
+ __be64 bits[2];
+ int i;
+
+ /* Save number of bits. */
+ bits[1] = cpu_to_be64(sctx->count[0] << 3);
+ bits[0] = cpu_to_be64(sctx->count[1] << 3 | sctx->count[0] >> 61);
+
+ /* Pad out to 112 mod 128. */
+ index = sctx->count[0] & 0x7f;
+ pad_len = (index < 112) ? (112 - index) : ((128+112) - index);
+
+ flags = octeon_crypto_enable(&state);
+ octeon_sha512_store_hash(sctx);
+
+ __octeon_sha512_update(sctx, padding, pad_len);
+
+ /* Append length (before padding). */
+ __octeon_sha512_update(sctx, (const u8 *)bits, sizeof(bits));
+
+ octeon_sha512_read_hash(sctx);
+ octeon_crypto_disable(&state, flags);
+
+ /* Store state in digest. */
+ for (i = 0; i < 8; i++)
+ dst[i] = cpu_to_be64(sctx->state[i]);
+
+ /* Zeroize sensitive information. */
+ memset(sctx, 0, sizeof(struct sha512_state));
+
+ return 0;
+}
+
+static int octeon_sha384_final(struct shash_desc *desc, u8 *hash)
+{
+ u8 D[64];
+
+ octeon_sha512_final(desc, D);
+
+ memcpy(hash, D, 48);
+ memzero_explicit(D, 64);
+
+ return 0;
+}
+
+static struct shash_alg octeon_sha512_algs[2] = { {
+ .digestsize = SHA512_DIGEST_SIZE,
+ .init = octeon_sha512_init,
+ .update = octeon_sha512_update,
+ .final = octeon_sha512_final,
+ .descsize = sizeof(struct sha512_state),
+ .base = {
+ .cra_name = "sha512",
+ .cra_driver_name= "octeon-sha512",
+ .cra_priority = OCTEON_CR_OPCODE_PRIORITY,
+ .cra_flags = CRYPTO_ALG_TYPE_SHASH,
+ .cra_blocksize = SHA512_BLOCK_SIZE,
+ .cra_module = THIS_MODULE,
+ }
+}, {
+ .digestsize = SHA384_DIGEST_SIZE,
+ .init = octeon_sha384_init,
+ .update = octeon_sha512_update,
+ .final = octeon_sha384_final,
+ .descsize = sizeof(struct sha512_state),
+ .base = {
+ .cra_name = "sha384",
+ .cra_driver_name= "octeon-sha384",
+ .cra_priority = OCTEON_CR_OPCODE_PRIORITY,
+ .cra_flags = CRYPTO_ALG_TYPE_SHASH,
+ .cra_blocksize = SHA384_BLOCK_SIZE,
+ .cra_module = THIS_MODULE,
+ }
+} };
+
+static int __init octeon_sha512_mod_init(void)
+{
+ if (!octeon_has_crypto())
+ return -ENOTSUPP;
+ return crypto_register_shashes(octeon_sha512_algs,
+ ARRAY_SIZE(octeon_sha512_algs));
+}
+
+static void __exit octeon_sha512_mod_fini(void)
+{
+ crypto_unregister_shashes(octeon_sha512_algs,
+ ARRAY_SIZE(octeon_sha512_algs));
+}
+
+module_init(octeon_sha512_mod_init);
+module_exit(octeon_sha512_mod_fini);
+
+MODULE_LICENSE("GPL");
+MODULE_DESCRIPTION("SHA-512 and SHA-384 Secure Hash Algorithms (OCTEON)");
+MODULE_AUTHOR("Aaro Koskinen <[email protected]>");
--
2.2.0
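
The padding arithmetic in octeon_sha512_final() above ("Pad out to 112 mod 128", then append a 128-bit length) can be checked in isolation. The following is a user-space sketch rather than kernel code: the `sim_*` helpers are hypothetical names, and they only mirror the index, pad_len, and bits computations from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Padding bytes (0x80 followed by zeros) needed to reach 112 mod 128. */
static unsigned int sim_sha512_pad_len(uint64_t count_lo)
{
	unsigned int index = (unsigned int)(count_lo & 0x7f);

	return (index < 112) ? (112 - index) : ((128 + 112) - index);
}

/* Convert a 128-bit byte count {hi, lo} into a 128-bit bit count. */
static void sim_sha512_bit_count(uint64_t count_lo, uint64_t count_hi,
				 uint64_t *bits_hi, uint64_t *bits_lo)
{
	assert(bits_hi && bits_lo);
	*bits_lo = count_lo << 3;
	/* The top three bits of the low word carry into the high word. */
	*bits_hi = (count_hi << 3) | (count_lo >> 61);
}

/* Message bytes + padding + 16-byte length field; always a multiple of 128. */
static uint64_t sim_sha512_padded_len(uint64_t count_lo)
{
	return count_lo + sim_sha512_pad_len(count_lo) + 16;
}
```

For a 112-byte message the 16-byte length field no longer fits in the first block, so 128 bytes of padding are added and two blocks are hashed in total.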

2015-03-08 20:07:44

by Aaro Koskinen

Subject: [PATCH 4/7] crypto: octeon - add SHA1 module

Add OCTEON SHA1 module.

Signed-off-by: Aaro Koskinen <[email protected]>
---
arch/mips/cavium-octeon/crypto/Makefile | 3 +-
arch/mips/cavium-octeon/crypto/octeon-sha1.c | 241 +++++++++++++++++++++++++++
2 files changed, 243 insertions(+), 1 deletion(-)
create mode 100644 arch/mips/cavium-octeon/crypto/octeon-sha1.c

diff --git a/arch/mips/cavium-octeon/crypto/Makefile b/arch/mips/cavium-octeon/crypto/Makefile
index a74f76d..3f671d6 100644
--- a/arch/mips/cavium-octeon/crypto/Makefile
+++ b/arch/mips/cavium-octeon/crypto/Makefile
@@ -4,4 +4,5 @@

obj-y += octeon-crypto.o

-obj-$(CONFIG_CRYPTO_MD5_OCTEON) += octeon-md5.o
+obj-$(CONFIG_CRYPTO_MD5_OCTEON) += octeon-md5.o
+obj-$(CONFIG_CRYPTO_SHA1_OCTEON) += octeon-sha1.o
diff --git a/arch/mips/cavium-octeon/crypto/octeon-sha1.c b/arch/mips/cavium-octeon/crypto/octeon-sha1.c
new file mode 100644
index 0000000..2b74b5b
--- /dev/null
+++ b/arch/mips/cavium-octeon/crypto/octeon-sha1.c
@@ -0,0 +1,241 @@
+/*
+ * Cryptographic API.
+ *
+ * SHA1 Secure Hash Algorithm.
+ *
+ * Adapted for OCTEON by Aaro Koskinen <[email protected]>.
+ *
+ * Based on crypto/sha1_generic.c, which is:
+ *
+ * Copyright (c) Alan Smithee.
+ * Copyright (c) Andrew McDonald <[email protected]>
+ * Copyright (c) Jean-Francois Dive <[email protected]>
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License as published by the Free
+ * Software Foundation; either version 2 of the License, or (at your option)
+ * any later version.
+ */
+
+#include <linux/mm.h>
+#include <crypto/sha.h>
+#include <linux/init.h>
+#include <linux/types.h>
+#include <linux/module.h>
+#include <asm/byteorder.h>
+#include <asm/octeon/octeon.h>
+#include <crypto/internal/hash.h>
+
+#include "octeon-crypto.h"
+
+/*
+ * We pass everything as 64-bit. OCTEON can handle misaligned data.
+ */
+
+static void octeon_sha1_store_hash(struct sha1_state *sctx)
+{
+ u64 *hash = (u64 *)sctx->state;
+ union {
+ u32 word[2];
+ u64 dword;
+ } hash_tail = { { sctx->state[4], } };
+
+ write_octeon_64bit_hash_dword(hash[0], 0);
+ write_octeon_64bit_hash_dword(hash[1], 1);
+ write_octeon_64bit_hash_dword(hash_tail.dword, 2);
+ memzero_explicit(&hash_tail.word[0], sizeof(hash_tail.word[0]));
+}
+
+static void octeon_sha1_read_hash(struct sha1_state *sctx)
+{
+ u64 *hash = (u64 *)sctx->state;
+ union {
+ u32 word[2];
+ u64 dword;
+ } hash_tail;
+
+ hash[0] = read_octeon_64bit_hash_dword(0);
+ hash[1] = read_octeon_64bit_hash_dword(1);
+ hash_tail.dword = read_octeon_64bit_hash_dword(2);
+ sctx->state[4] = hash_tail.word[0];
+ memzero_explicit(&hash_tail.dword, sizeof(hash_tail.dword));
+}
+
+static void octeon_sha1_transform(const void *_block)
+{
+ const u64 *block = _block;
+
+ write_octeon_64bit_block_dword(block[0], 0);
+ write_octeon_64bit_block_dword(block[1], 1);
+ write_octeon_64bit_block_dword(block[2], 2);
+ write_octeon_64bit_block_dword(block[3], 3);
+ write_octeon_64bit_block_dword(block[4], 4);
+ write_octeon_64bit_block_dword(block[5], 5);
+ write_octeon_64bit_block_dword(block[6], 6);
+ octeon_sha1_start(block[7]);
+}
+
+static int octeon_sha1_init(struct shash_desc *desc)
+{
+ struct sha1_state *sctx = shash_desc_ctx(desc);
+
+ sctx->state[0] = SHA1_H0;
+ sctx->state[1] = SHA1_H1;
+ sctx->state[2] = SHA1_H2;
+ sctx->state[3] = SHA1_H3;
+ sctx->state[4] = SHA1_H4;
+ sctx->count = 0;
+
+ return 0;
+}
+
+static void __octeon_sha1_update(struct sha1_state *sctx, const u8 *data,
+ unsigned int len)
+{
+ unsigned int partial;
+ unsigned int done;
+ const u8 *src;
+
+ partial = sctx->count % SHA1_BLOCK_SIZE;
+ sctx->count += len;
+ done = 0;
+ src = data;
+
+ if ((partial + len) >= SHA1_BLOCK_SIZE) {
+ if (partial) {
+ done = -partial;
+ memcpy(sctx->buffer + partial, data,
+ done + SHA1_BLOCK_SIZE);
+ src = sctx->buffer;
+ }
+
+ do {
+ octeon_sha1_transform(src);
+ done += SHA1_BLOCK_SIZE;
+ src = data + done;
+ } while (done + SHA1_BLOCK_SIZE <= len);
+
+ partial = 0;
+ }
+ memcpy(sctx->buffer + partial, src, len - done);
+}
+
+static int octeon_sha1_update(struct shash_desc *desc, const u8 *data,
+ unsigned int len)
+{
+ struct sha1_state *sctx = shash_desc_ctx(desc);
+ struct octeon_cop2_state state;
+ unsigned long flags;
+
+ /*
+ * Small updates never reach the crypto engine, so the generic sha1 is
+ * faster because of the heavyweight octeon_crypto_enable() /
+ * octeon_crypto_disable().
+ */
+ if ((sctx->count % SHA1_BLOCK_SIZE) + len < SHA1_BLOCK_SIZE)
+ return crypto_sha1_update(desc, data, len);
+
+ flags = octeon_crypto_enable(&state);
+ octeon_sha1_store_hash(sctx);
+
+ __octeon_sha1_update(sctx, data, len);
+
+ octeon_sha1_read_hash(sctx);
+ octeon_crypto_disable(&state, flags);
+
+ return 0;
+}
+
+static int octeon_sha1_final(struct shash_desc *desc, u8 *out)
+{
+ struct sha1_state *sctx = shash_desc_ctx(desc);
+ static const u8 padding[64] = { 0x80, };
+ struct octeon_cop2_state state;
+ __be32 *dst = (__be32 *)out;
+ unsigned int pad_len;
+ unsigned long flags;
+ unsigned int index;
+ __be64 bits;
+ int i;
+
+ /* Save number of bits. */
+ bits = cpu_to_be64(sctx->count << 3);
+
+ /* Pad out to 56 mod 64. */
+ index = sctx->count & 0x3f;
+ pad_len = (index < 56) ? (56 - index) : ((64+56) - index);
+
+ flags = octeon_crypto_enable(&state);
+ octeon_sha1_store_hash(sctx);
+
+ __octeon_sha1_update(sctx, padding, pad_len);
+
+ /* Append length (before padding). */
+ __octeon_sha1_update(sctx, (const u8 *)&bits, sizeof(bits));
+
+ octeon_sha1_read_hash(sctx);
+ octeon_crypto_disable(&state, flags);
+
+ /* Store state in digest */
+ for (i = 0; i < 5; i++)
+ dst[i] = cpu_to_be32(sctx->state[i]);
+
+ /* Zeroize sensitive information. */
+ memset(sctx, 0, sizeof(*sctx));
+
+ return 0;
+}
+
+static int octeon_sha1_export(struct shash_desc *desc, void *out)
+{
+ struct sha1_state *sctx = shash_desc_ctx(desc);
+
+ memcpy(out, sctx, sizeof(*sctx));
+ return 0;
+}
+
+static int octeon_sha1_import(struct shash_desc *desc, const void *in)
+{
+ struct sha1_state *sctx = shash_desc_ctx(desc);
+
+ memcpy(sctx, in, sizeof(*sctx));
+ return 0;
+}
+
+static struct shash_alg octeon_sha1_alg = {
+ .digestsize = SHA1_DIGEST_SIZE,
+ .init = octeon_sha1_init,
+ .update = octeon_sha1_update,
+ .final = octeon_sha1_final,
+ .export = octeon_sha1_export,
+ .import = octeon_sha1_import,
+ .descsize = sizeof(struct sha1_state),
+ .statesize = sizeof(struct sha1_state),
+ .base = {
+ .cra_name = "sha1",
+ .cra_driver_name= "octeon-sha1",
+ .cra_priority = OCTEON_CR_OPCODE_PRIORITY,
+ .cra_flags = CRYPTO_ALG_TYPE_SHASH,
+ .cra_blocksize = SHA1_BLOCK_SIZE,
+ .cra_module = THIS_MODULE,
+ }
+};
+
+static int __init octeon_sha1_mod_init(void)
+{
+ if (!octeon_has_crypto())
+ return -ENOTSUPP;
+ return crypto_register_shash(&octeon_sha1_alg);
+}
+
+static void __exit octeon_sha1_mod_fini(void)
+{
+ crypto_unregister_shash(&octeon_sha1_alg);
+}
+
+module_init(octeon_sha1_mod_init);
+module_exit(octeon_sha1_mod_fini);
+
+MODULE_LICENSE("GPL");
+MODULE_DESCRIPTION("SHA1 Secure Hash Algorithm (OCTEON)");
+MODULE_AUTHOR("Aaro Koskinen <[email protected]>");
--
2.2.0
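
octeon_sha1_store_hash() and octeon_sha1_read_hash() above move the five 32-bit SHA-1 state words through three 64-bit hash registers, with the fifth word occupying half of the last dword. The sketch below spells out that packing with explicit shifts. It assumes a big-endian memory layout, which is what the pointer cast and the union in the patch appear to rely on; the `sim_sha1_*` names are hypothetical and not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Pack five 32-bit state words into three 64-bit dwords (big-endian view). */
static void sim_sha1_pack(const uint32_t state[5], uint64_t dword[3])
{
	assert(state && dword);
	dword[0] = ((uint64_t)state[0] << 32) | state[1];
	dword[1] = ((uint64_t)state[2] << 32) | state[3];
	/* The fifth word sits in the upper half; the lower half is zero. */
	dword[2] = (uint64_t)state[4] << 32;
}

/* Inverse of sim_sha1_pack(): recover the five state words. */
static void sim_sha1_unpack(const uint64_t dword[3], uint32_t state[5])
{
	assert(state && dword);
	state[0] = (uint32_t)(dword[0] >> 32);
	state[1] = (uint32_t)dword[0];
	state[2] = (uint32_t)(dword[1] >> 32);
	state[3] = (uint32_t)dword[1];
	state[4] = (uint32_t)(dword[2] >> 32);
}
```

Using shifts instead of a cast makes the sketch independent of host endianness, so it can be run on any machine to see which register half each state word lands in.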

2015-03-08 20:08:04

by Aaro Koskinen

Subject: [PATCH 7/7] crypto: octeon - enable OCTEON SHA1/256/512 module selection

Allow the user to select the OCTEON SHA1/256/512 modules.

Signed-off-by: Aaro Koskinen <[email protected]>
---
crypto/Kconfig | 27 +++++++++++++++++++++++++++
1 file changed, 27 insertions(+)

diff --git a/crypto/Kconfig b/crypto/Kconfig
index 50f4da4..38b2315 100644
--- a/crypto/Kconfig
+++ b/crypto/Kconfig
@@ -546,6 +546,15 @@ config CRYPTO_SHA512_SSSE3
Extensions version 1 (AVX1), or Advanced Vector Extensions
version 2 (AVX2) instructions, when available.

+config CRYPTO_SHA1_OCTEON
+ tristate "SHA1 digest algorithm (OCTEON)"
+ depends on CPU_CAVIUM_OCTEON
+ select CRYPTO_SHA1
+ select CRYPTO_HASH
+ help
+ SHA-1 secure hash standard (FIPS 180-1/DFIPS 180-2) implemented
+ using OCTEON crypto instructions, when available.
+
config CRYPTO_SHA1_SPARC64
tristate "SHA1 digest algorithm (SPARC64)"
depends on SPARC64
@@ -610,6 +619,15 @@ config CRYPTO_SHA256
This code also includes SHA-224, a 224 bit hash with 112 bits
of security against collision attacks.

+config CRYPTO_SHA256_OCTEON
+ tristate "SHA224 and SHA256 digest algorithm (OCTEON)"
+ depends on CPU_CAVIUM_OCTEON
+ select CRYPTO_SHA256
+ select CRYPTO_HASH
+ help
+ SHA-256 secure hash standard (DFIPS 180-2) implemented
+ using OCTEON crypto instructions, when available.
+
config CRYPTO_SHA256_SPARC64
tristate "SHA224 and SHA256 digest algorithm (SPARC64)"
depends on SPARC64
@@ -631,6 +649,15 @@ config CRYPTO_SHA512
This code also includes SHA-384, a 384 bit hash with 192 bits
of security against collision attacks.

+config CRYPTO_SHA512_OCTEON
+ tristate "SHA384 and SHA512 digest algorithms (OCTEON)"
+ depends on CPU_CAVIUM_OCTEON
+ select CRYPTO_SHA512
+ select CRYPTO_HASH
+ help
+ SHA-512 secure hash standard (DFIPS 180-2) implemented
+ using OCTEON crypto instructions, when available.
+
config CRYPTO_SHA512_SPARC64
tristate "SHA384 and SHA512 digest algorithm (SPARC64)"
depends on SPARC64
--
2.2.0

2015-03-08 20:08:04

by Aaro Koskinen

Subject: [PATCH 2/7] crypto: octeon - always disable preemption when using crypto engine

Always disable preemption on behalf of the drivers when the crypto
engine is taken into use. This simplifies usage for the drivers.

Signed-off-by: Aaro Koskinen <[email protected]>
---
arch/mips/cavium-octeon/crypto/octeon-crypto.c | 4 +++-
arch/mips/cavium-octeon/crypto/octeon-md5.c | 4 ----
2 files changed, 3 insertions(+), 5 deletions(-)

diff --git a/arch/mips/cavium-octeon/crypto/octeon-crypto.c b/arch/mips/cavium-octeon/crypto/octeon-crypto.c
index 7c82ff4..f66bd1a 100644
--- a/arch/mips/cavium-octeon/crypto/octeon-crypto.c
+++ b/arch/mips/cavium-octeon/crypto/octeon-crypto.c
@@ -17,7 +17,7 @@
* crypto operations in calls to octeon_crypto_enable/disable in order to make
* sure the state of COP2 isn't corrupted if userspace is also performing
* hardware crypto operations. Allocate the state parameter on the stack.
- * Preemption must be disabled to prevent context switches.
+ * Returns with preemption disabled.
*
* @state: Pointer to state structure to store current COP2 state in.
*
@@ -28,6 +28,7 @@ unsigned long octeon_crypto_enable(struct octeon_cop2_state *state)
int status;
unsigned long flags;

+ preempt_disable();
local_irq_save(flags);
status = read_c0_status();
write_c0_status(status | ST0_CU2);
@@ -62,5 +63,6 @@ void octeon_crypto_disable(struct octeon_cop2_state *state,
else
write_c0_status(read_c0_status() & ~ST0_CU2);
local_irq_restore(flags);
+ preempt_enable();
}
EXPORT_SYMBOL_GPL(octeon_crypto_disable);
diff --git a/arch/mips/cavium-octeon/crypto/octeon-md5.c b/arch/mips/cavium-octeon/crypto/octeon-md5.c
index 3dd8845..12dccdb 100644
--- a/arch/mips/cavium-octeon/crypto/octeon-md5.c
+++ b/arch/mips/cavium-octeon/crypto/octeon-md5.c
@@ -97,7 +97,6 @@ static int octeon_md5_update(struct shash_desc *desc, const u8 *data,
memcpy((char *)mctx->block + (sizeof(mctx->block) - avail), data,
avail);

- preempt_disable();
flags = octeon_crypto_enable(&state);
octeon_md5_store_hash(mctx);

@@ -113,7 +112,6 @@ static int octeon_md5_update(struct shash_desc *desc, const u8 *data,

octeon_md5_read_hash(mctx);
octeon_crypto_disable(&state, flags);
- preempt_enable();

memcpy(mctx->block, data, len);

@@ -131,7 +129,6 @@ static int octeon_md5_final(struct shash_desc *desc, u8 *out)

*p++ = 0x80;

- preempt_disable();
flags = octeon_crypto_enable(&state);
octeon_md5_store_hash(mctx);

@@ -149,7 +146,6 @@ static int octeon_md5_final(struct shash_desc *desc, u8 *out)

octeon_md5_read_hash(mctx);
octeon_crypto_disable(&state, flags);
- preempt_enable();

memcpy(out, mctx->hash, sizeof(mctx->hash));
memset(mctx, 0, sizeof(*mctx));
--
2.2.0
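
Editor's note: the calling convention that results from the patch above can be modelled in user space. The sketch below is only an illustration under stated assumptions: every `model_*` function and both global variables are hypothetical stand-ins, not kernel APIs, and the real octeon_crypto_enable()/octeon_crypto_disable() also save and restore COP2 state and IRQ flags, which is omitted here.

```c
#include <assert.h>

/* Hypothetical user-space stand-ins for kernel state; not kernel APIs. */
static int preempt_depth;	/* models the preemption count */
static int cop2_enabled;	/* models the ST0_CU2 bit in the status register */

static void model_preempt_disable(void)
{
	preempt_depth++;
}

static void model_preempt_enable(void)
{
	assert(preempt_depth > 0);
	preempt_depth--;
}

/*
 * Models octeon_crypto_enable() after the patch: it disables preemption
 * itself and returns whether COP2 was already enabled.
 */
static int model_crypto_enable(void)
{
	int was_enabled;

	model_preempt_disable();
	was_enabled = cop2_enabled;
	cop2_enabled = 1;
	return was_enabled;
}

/* Models octeon_crypto_disable(): restore COP2, then re-enable preemption. */
static void model_crypto_disable(int was_enabled)
{
	cop2_enabled = was_enabled;
	model_preempt_enable();
}

/*
 * A driver-style caller. No explicit preempt calls are needed any more.
 * Returns the preemption depth observed while the engine was held.
 */
static int model_hash_update(void)
{
	int flags = model_crypto_enable();
	int depth_inside = preempt_depth;	/* engine work would happen here */

	model_crypto_disable(flags);
	return depth_inside;
}
```

In this model a call to model_hash_update() observes a depth of 1 while the engine is held and leaves both counters at zero afterwards, which is the pairing the octeon-md5.c hunks rely on once their own preempt calls are removed.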

2015-03-08 20:07:45

by Aaro Koskinen

Subject: [PATCH 5/7] crypto: octeon - add SHA256 module

Add OCTEON SHA256 module.

Signed-off-by: Aaro Koskinen <[email protected]>
---
arch/mips/cavium-octeon/crypto/Makefile | 1 +
arch/mips/cavium-octeon/crypto/octeon-sha256.c | 280 +++++++++++++++++++++++++
2 files changed, 281 insertions(+)
create mode 100644 arch/mips/cavium-octeon/crypto/octeon-sha256.c

diff --git a/arch/mips/cavium-octeon/crypto/Makefile b/arch/mips/cavium-octeon/crypto/Makefile
index 3f671d6..47806a5 100644
--- a/arch/mips/cavium-octeon/crypto/Makefile
+++ b/arch/mips/cavium-octeon/crypto/Makefile
@@ -6,3 +6,4 @@ obj-y += octeon-crypto.o

obj-$(CONFIG_CRYPTO_MD5_OCTEON) += octeon-md5.o
obj-$(CONFIG_CRYPTO_SHA1_OCTEON) += octeon-sha1.o
+obj-$(CONFIG_CRYPTO_SHA256_OCTEON) += octeon-sha256.o
diff --git a/arch/mips/cavium-octeon/crypto/octeon-sha256.c b/arch/mips/cavium-octeon/crypto/octeon-sha256.c
new file mode 100644
index 0000000..97e96fe
--- /dev/null
+++ b/arch/mips/cavium-octeon/crypto/octeon-sha256.c
@@ -0,0 +1,280 @@
+/*
+ * Cryptographic API.
+ *
+ * SHA-224 and SHA-256 Secure Hash Algorithm.
+ *
+ * Adapted for OCTEON by Aaro Koskinen <[email protected]>.
+ *
+ * Based on crypto/sha256_generic.c, which is:
+ *
+ * Copyright (c) Jean-Luc Cooke <[email protected]>
+ * Copyright (c) Andrew McDonald <[email protected]>
+ * Copyright (c) 2002 James Morris <[email protected]>
+ * SHA224 Support Copyright 2007 Intel Corporation <[email protected]>
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License as published by the Free
+ * Software Foundation; either version 2 of the License, or (at your option)
+ * any later version.
+ */
+
+#include <linux/mm.h>
+#include <crypto/sha.h>
+#include <linux/init.h>
+#include <linux/types.h>
+#include <linux/module.h>
+#include <asm/byteorder.h>
+#include <asm/octeon/octeon.h>
+#include <crypto/internal/hash.h>
+
+#include "octeon-crypto.h"
+
+/*
+ * We pass everything as 64-bit. OCTEON can handle misaligned data.
+ */
+
+static void octeon_sha256_store_hash(struct sha256_state *sctx)
+{
+ u64 *hash = (u64 *)sctx->state;
+
+ write_octeon_64bit_hash_dword(hash[0], 0);
+ write_octeon_64bit_hash_dword(hash[1], 1);
+ write_octeon_64bit_hash_dword(hash[2], 2);
+ write_octeon_64bit_hash_dword(hash[3], 3);
+}
+
+static void octeon_sha256_read_hash(struct sha256_state *sctx)
+{
+ u64 *hash = (u64 *)sctx->state;
+
+ hash[0] = read_octeon_64bit_hash_dword(0);
+ hash[1] = read_octeon_64bit_hash_dword(1);
+ hash[2] = read_octeon_64bit_hash_dword(2);
+ hash[3] = read_octeon_64bit_hash_dword(3);
+}
+
+static void octeon_sha256_transform(const void *_block)
+{
+ const u64 *block = _block;
+
+ write_octeon_64bit_block_dword(block[0], 0);
+ write_octeon_64bit_block_dword(block[1], 1);
+ write_octeon_64bit_block_dword(block[2], 2);
+ write_octeon_64bit_block_dword(block[3], 3);
+ write_octeon_64bit_block_dword(block[4], 4);
+ write_octeon_64bit_block_dword(block[5], 5);
+ write_octeon_64bit_block_dword(block[6], 6);
+ octeon_sha256_start(block[7]);
+}
+
+static int octeon_sha224_init(struct shash_desc *desc)
+{
+ struct sha256_state *sctx = shash_desc_ctx(desc);
+
+ sctx->state[0] = SHA224_H0;
+ sctx->state[1] = SHA224_H1;
+ sctx->state[2] = SHA224_H2;
+ sctx->state[3] = SHA224_H3;
+ sctx->state[4] = SHA224_H4;
+ sctx->state[5] = SHA224_H5;
+ sctx->state[6] = SHA224_H6;
+ sctx->state[7] = SHA224_H7;
+ sctx->count = 0;
+
+ return 0;
+}
+
+static int octeon_sha256_init(struct shash_desc *desc)
+{
+ struct sha256_state *sctx = shash_desc_ctx(desc);
+
+ sctx->state[0] = SHA256_H0;
+ sctx->state[1] = SHA256_H1;
+ sctx->state[2] = SHA256_H2;
+ sctx->state[3] = SHA256_H3;
+ sctx->state[4] = SHA256_H4;
+ sctx->state[5] = SHA256_H5;
+ sctx->state[6] = SHA256_H6;
+ sctx->state[7] = SHA256_H7;
+ sctx->count = 0;
+
+ return 0;
+}
+
+static void __octeon_sha256_update(struct sha256_state *sctx, const u8 *data,
+ unsigned int len)
+{
+ unsigned int partial;
+ unsigned int done;
+ const u8 *src;
+
+ partial = sctx->count % SHA256_BLOCK_SIZE;
+ sctx->count += len;
+ done = 0;
+ src = data;
+
+ if ((partial + len) >= SHA256_BLOCK_SIZE) {
+ if (partial) {
+ done = -partial;
+ memcpy(sctx->buf + partial, data,
+ done + SHA256_BLOCK_SIZE);
+ src = sctx->buf;
+ }
+
+ do {
+ octeon_sha256_transform(src);
+ done += SHA256_BLOCK_SIZE;
+ src = data + done;
+ } while (done + SHA256_BLOCK_SIZE <= len);
+
+ partial = 0;
+ }
+ memcpy(sctx->buf + partial, src, len - done);
+}
+
+static int octeon_sha256_update(struct shash_desc *desc, const u8 *data,
+ unsigned int len)
+{
+ struct sha256_state *sctx = shash_desc_ctx(desc);
+ struct octeon_cop2_state state;
+ unsigned long flags;
+
+ /*
+ * Small updates never reach the crypto engine, so the generic sha256 is
+ * faster because of the heavyweight octeon_crypto_enable() /
+ * octeon_crypto_disable().
+ */
+ if ((sctx->count % SHA256_BLOCK_SIZE) + len < SHA256_BLOCK_SIZE)
+ return crypto_sha256_update(desc, data, len);
+
+ flags = octeon_crypto_enable(&state);
+ octeon_sha256_store_hash(sctx);
+
+ __octeon_sha256_update(sctx, data, len);
+
+ octeon_sha256_read_hash(sctx);
+ octeon_crypto_disable(&state, flags);
+
+ return 0;
+}
+
+static int octeon_sha256_final(struct shash_desc *desc, u8 *out)
+{
+ struct sha256_state *sctx = shash_desc_ctx(desc);
+ static const u8 padding[64] = { 0x80, };
+ struct octeon_cop2_state state;
+ __be32 *dst = (__be32 *)out;
+ unsigned int pad_len;
+ unsigned long flags;
+ unsigned int index;
+ __be64 bits;
+ int i;
+
+ /* Save number of bits. */
+ bits = cpu_to_be64(sctx->count << 3);
+
+ /* Pad out to 56 mod 64. */
+ index = sctx->count & 0x3f;
+ pad_len = (index < 56) ? (56 - index) : ((64+56) - index);
+
+ flags = octeon_crypto_enable(&state);
+ octeon_sha256_store_hash(sctx);
+
+ __octeon_sha256_update(sctx, padding, pad_len);
+
+ /* Append length (before padding). */
+ __octeon_sha256_update(sctx, (const u8 *)&bits, sizeof(bits));
+
+ octeon_sha256_read_hash(sctx);
+ octeon_crypto_disable(&state, flags);
+
+ /* Store state in digest */
+ for (i = 0; i < 8; i++)
+ dst[i] = cpu_to_be32(sctx->state[i]);
+
+ /* Zeroize sensitive information. */
+ memset(sctx, 0, sizeof(*sctx));
+
+ return 0;
+}
+
+static int octeon_sha224_final(struct shash_desc *desc, u8 *hash)
+{
+ u8 D[SHA256_DIGEST_SIZE];
+
+ octeon_sha256_final(desc, D);
+
+ memcpy(hash, D, SHA224_DIGEST_SIZE);
+ memzero_explicit(D, SHA256_DIGEST_SIZE);
+
+ return 0;
+}
+
+static int octeon_sha256_export(struct shash_desc *desc, void *out)
+{
+ struct sha256_state *sctx = shash_desc_ctx(desc);
+
+ memcpy(out, sctx, sizeof(*sctx));
+ return 0;
+}
+
+static int octeon_sha256_import(struct shash_desc *desc, const void *in)
+{
+ struct sha256_state *sctx = shash_desc_ctx(desc);
+
+ memcpy(sctx, in, sizeof(*sctx));
+ return 0;
+}
+
+static struct shash_alg octeon_sha256_algs[2] = { {
+ .digestsize = SHA256_DIGEST_SIZE,
+ .init = octeon_sha256_init,
+ .update = octeon_sha256_update,
+ .final = octeon_sha256_final,
+ .export = octeon_sha256_export,
+ .import = octeon_sha256_import,
+ .descsize = sizeof(struct sha256_state),
+ .statesize = sizeof(struct sha256_state),
+ .base = {
+ .cra_name = "sha256",
+ .cra_driver_name= "octeon-sha256",
+ .cra_priority = OCTEON_CR_OPCODE_PRIORITY,
+ .cra_flags = CRYPTO_ALG_TYPE_SHASH,
+ .cra_blocksize = SHA256_BLOCK_SIZE,
+ .cra_module = THIS_MODULE,
+ }
+}, {
+ .digestsize = SHA224_DIGEST_SIZE,
+ .init = octeon_sha224_init,
+ .update = octeon_sha256_update,
+ .final = octeon_sha224_final,
+ .descsize = sizeof(struct sha256_state),
+ .base = {
+ .cra_name = "sha224",
+ .cra_driver_name= "octeon-sha224",
+ .cra_flags = CRYPTO_ALG_TYPE_SHASH,
+ .cra_blocksize = SHA224_BLOCK_SIZE,
+ .cra_module = THIS_MODULE,
+ }
+} };
+
+static int __init octeon_sha256_mod_init(void)
+{
+ if (!octeon_has_crypto())
+ return -ENOTSUPP;
+ return crypto_register_shashes(octeon_sha256_algs,
+ ARRAY_SIZE(octeon_sha256_algs));
+}
+
+static void __exit octeon_sha256_mod_fini(void)
+{
+ crypto_unregister_shashes(octeon_sha256_algs,
+ ARRAY_SIZE(octeon_sha256_algs));
+}
+
+module_init(octeon_sha256_mod_init);
+module_exit(octeon_sha256_mod_fini);
+
+MODULE_LICENSE("GPL");
+MODULE_DESCRIPTION("SHA-224 and SHA-256 Secure Hash Algorithm (OCTEON)");
+MODULE_AUTHOR("Aaro Koskinen <[email protected]>");
--
2.2.0
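
Editor's note: two pieces of arithmetic in the SHA256 patch above are easy to check in isolation: the "pad out to 56 mod 64" computation in octeon_sha256_final() and the test in octeon_sha256_update() that decides whether an update completes a full block. The following user-space sketch copies those expressions; the function names and the `BLOCK_SIZE` macro are assumptions made for illustration and do not appear in the patch.

```c
#include <assert.h>
#include <stdint.h>

#define BLOCK_SIZE 64u	/* SHA-256 block size in bytes */

/*
 * Number of padding bytes (0x80 followed by zeros) needed so that the
 * message length becomes 56 mod 64, leaving room for the 8-byte bit count.
 */
static unsigned int sha256_pad_len(uint64_t count)
{
	unsigned int index = (unsigned int)(count & 0x3f);

	return (index < 56) ? (56 - index) : ((64 + 56) - index);
}

/*
 * Non-zero when an update of len bytes completes at least one full block,
 * i.e. when the patch takes the crypto-engine path rather than falling
 * back to the generic sha256 update.
 */
static int sha256_update_uses_engine(uint64_t count, unsigned int len)
{
	return (count % BLOCK_SIZE) + len >= BLOCK_SIZE;
}

/* Check that data + padding + length field always ends on a block boundary. */
static void sha256_pad_selfcheck(void)
{
	uint64_t count;

	for (count = 0; count < 2 * BLOCK_SIZE; count++)
		assert((count + sha256_pad_len(count) + 8) % BLOCK_SIZE == 0);
}
```

For example, a 55-byte message needs a single padding byte, while a 56-byte message needs 64, because the length field no longer fits in the current block.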

2015-03-10 09:58:22

by Herbert Xu

Subject: Re: [PATCH 0/7] crypto: OCTEON MD5 bugfix + SHA modules

On Sun, Mar 08, 2015 at 10:07:40PM +0200, Aaro Koskinen wrote:
> Hi,
>
> The first patch is a bug fix for OCTEON MD5 aimed for 4.0-rc cycle.

Please send such bug fixes in a separate series in future.
For this one in particular it does not appear to be critical
enough to go in straight away so I have pushed it into cryptodev
along with the rest.

> The remaining patches add SHA1/SHA256/SHA512 modules. I have tested
> these on the following OCTEON boards and CPUs with 4.0-rc2:
>
> D-Link DSR-1000N: CN5010p1.1-500-SCP
> EdgeRouter Lite: CN5020p1.1-500-SCP
> EdgeRouter Pro: CN6120p1.1-1000-NSP
>
> All selftests are passing. With tcrypt, I get the following numbers
> on speed compared to the generic modules:

All applied. Thanks!
--
Email: Herbert Xu <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt