2024-01-16 06:36:03

by Tony W Wang-oc

Subject: [PATCH 0/3] Add Zhaoxin hardware engine driver support for SHA

Zhaoxin CPUs implement SHA (Secure Hash Algorithm) as CPU instructions,
covering SHA1, SHA256, SHA384 and SHA512, which conform to the Secure
Hash Algorithms specified by FIPS 180-3.

Because SHA is implemented in hardware rather than in software,
applications can be developed with higher performance, more security and
more flexibility.

The table below summarizes tcrypt speed tests of the different crypto
algorithm drivers on the Zhaoxin KH-40000 platform:
-----------------------------------------------------------------------------
tcrypt     driver         16*      64     256    1024    2048    4096    8192
-----------------------------------------------------------------------------
           zhaoxin**   442.80 1309.21 3257.53 5221.56 5813.45 6136.39 6264.50***
403:SHA1   generic**   341.44  813.27 1458.98 1818.03 1896.60 1940.71 1939.06
           ratio         1.30    1.61    2.23    2.87    3.07    3.16    3.23
-----------------------------------------------------------------------------
           zhaoxin     451.70 1313.65 2958.71 4658.55 5109.16 5359.08 5459.13
404:SHA256 generic     202.62  463.55  845.01 1070.50 1117.51 1144.79 1155.68
           ratio         2.23    2.83    3.50    4.35    4.57    4.68    4.72
-----------------------------------------------------------------------------
           zhaoxin     350.90 1406.42 3166.16 5736.39 6627.77 7182.01 7429.18
405:SHA384 generic     161.76  654.88  979.06 1350.56 1423.08 1496.57 1513.12
           ratio         2.17    2.15    3.23    4.25    4.66    4.80    4.91
-----------------------------------------------------------------------------
           zhaoxin     334.49 1394.71 3159.93 5728.86 6625.33 7169.23 7407.80
406:SHA512 generic     161.80  653.84  979.42 1351.41 1444.14 1495.35 1518.43
           ratio         2.07    2.13    3.23    4.24    4.59    4.79    4.88
-----------------------------------------------------------------------------
*: The length of each data block processed by one complete SHA
sequence, namely one INIT, multiple UPDATEs and one FINAL.
**: The crypto algorithm driver used by tcrypt: "zhaoxin" is the
zhaoxin-sha driver and "generic" is the generic software SHA driver.
***: The speed at which each crypto algorithm driver processes data
blocks of the given length, in Mb/s.

The ratios in the table show that the zhaoxin-sha driver outperforms the
generic software sha1/sha256/sha384/sha512 drivers at every block length,
by a factor ranging from 1.30 to 4.91.
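As a quick check on how the "ratio" rows in the table above relate to the
speed rows, here is a minimal sketch that divides the zhaoxin speed by the
generic speed and rounds to two decimals. The helper name ratio_x100 is
hypothetical and is not part of the patch series or of tcrypt; it returns
the ratio in hundredths so the result can be compared exactly.

```c
#include <stdio.h>

/*
 * Hypothetical helper, for illustration only: speedup of the hardware
 * driver over the generic driver, in hundredths (e.g. 130 means 1.30x),
 * rounded to the nearest hundredth.
 */
static long ratio_x100(double zhaoxin_speed, double generic_speed)
{
	return (long)(zhaoxin_speed / generic_speed * 100.0 + 0.5);
}

/* Print a ratio in the same two-decimal form used by the table. */
static void print_ratio(const char *label, double zhaoxin_speed,
			double generic_speed)
{
	long r = ratio_x100(zhaoxin_speed, generic_speed);

	printf("%s: %ld.%02ld\n", label, r / 100, r % 100);
}
```

For the SHA1 row at block length 16, ratio_x100(442.80, 341.44) gives 130,
matching the 1.30 entry in the table.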

To support the zhaoxin-sha driver, make the padlock-sha driver explicitly
match CENTAUR CPUs with Family == 6, and add two Zhaoxin Hash Engine
cpufeatures.

Tony W Wang-oc (3):
crypto: padlock-sha: Matches CPU with Family with 6 explicitly
x86/cpufeatures: Add CPU feature flags for Zhaoxin Hash Engine
crypto: Zhaoxin: Hardware Engine Driver for SHA1/256/384/512

arch/x86/include/asm/cpufeatures.h | 4 +-
drivers/crypto/Kconfig | 15 +
drivers/crypto/Makefile | 1 +
drivers/crypto/padlock-sha.c | 2 +-
drivers/crypto/zhaoxin-sha.c | 500 +++++++++++++++++++++++
drivers/crypto/zhaoxin-sha.h | 16 +
tools/arch/x86/include/asm/cpufeatures.h | 4 +-
7 files changed, 539 insertions(+), 3 deletions(-)
create mode 100644 drivers/crypto/zhaoxin-sha.c
create mode 100644 drivers/crypto/zhaoxin-sha.h

--
2.25.1



2024-01-16 06:47:34

by Tony W Wang-oc

Subject: [PATCH 2/3] x86/cpufeatures: Add CPU feature flags for Zhaoxin Hash Engine

Zhaoxin CPUs implement SHA (Secure Hash Algorithm) as CPU instructions.
Add two CPU feature flags, indicated by CPUID.(EAX=C0000001,ECX=0):EDX
bits 25 and 26, which will be used by the Zhaoxin SHA driver.

Signed-off-by: Tony W Wang-oc <[email protected]>
---
arch/x86/include/asm/cpufeatures.h | 4 +++-
tools/arch/x86/include/asm/cpufeatures.h | 4 +++-
2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index 29cb275a219d..28b0e62dbdf5 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -145,7 +145,7 @@
#define X86_FEATURE_RDRAND ( 4*32+30) /* RDRAND instruction */
#define X86_FEATURE_HYPERVISOR ( 4*32+31) /* Running on a hypervisor */

-/* VIA/Cyrix/Centaur-defined CPU features, CPUID level 0xC0000001, word 5 */
+/* VIA/Cyrix/Centaur/Zhaoxin-defined CPU features, CPUID level 0xC0000001, word 5 */
#define X86_FEATURE_XSTORE ( 5*32+ 2) /* "rng" RNG present (xstore) */
#define X86_FEATURE_XSTORE_EN ( 5*32+ 3) /* "rng_en" RNG enabled */
#define X86_FEATURE_XCRYPT ( 5*32+ 6) /* "ace" on-CPU crypto (xcrypt) */
@@ -156,6 +156,8 @@
#define X86_FEATURE_PHE_EN ( 5*32+11) /* PHE enabled */
#define X86_FEATURE_PMM ( 5*32+12) /* PadLock Montgomery Multiplier */
#define X86_FEATURE_PMM_EN ( 5*32+13) /* PMM enabled */
+#define X86_FEATURE_PHE2 ( 5*32+25) /* "phe2" Zhaoxin Hash Engine */
+#define X86_FEATURE_PHE2_EN ( 5*32+26) /* "phe2_en" PHE2 enabled */

/* More extended AMD flags: CPUID level 0x80000001, ECX, word 6 */
#define X86_FEATURE_LAHF_LM ( 6*32+ 0) /* LAHF/SAHF in long mode */
diff --git a/tools/arch/x86/include/asm/cpufeatures.h b/tools/arch/x86/include/asm/cpufeatures.h
index f4542d2718f4..21caba9d070b 100644
--- a/tools/arch/x86/include/asm/cpufeatures.h
+++ b/tools/arch/x86/include/asm/cpufeatures.h
@@ -145,7 +145,7 @@
#define X86_FEATURE_RDRAND ( 4*32+30) /* RDRAND instruction */
#define X86_FEATURE_HYPERVISOR ( 4*32+31) /* Running on a hypervisor */

-/* VIA/Cyrix/Centaur-defined CPU features, CPUID level 0xC0000001, word 5 */
+/* VIA/Cyrix/Centaur/Zhaoxin-defined CPU features, CPUID level 0xC0000001, word 5 */
#define X86_FEATURE_XSTORE ( 5*32+ 2) /* "rng" RNG present (xstore) */
#define X86_FEATURE_XSTORE_EN ( 5*32+ 3) /* "rng_en" RNG enabled */
#define X86_FEATURE_XCRYPT ( 5*32+ 6) /* "ace" on-CPU crypto (xcrypt) */
@@ -156,6 +156,8 @@
#define X86_FEATURE_PHE_EN ( 5*32+11) /* PHE enabled */
#define X86_FEATURE_PMM ( 5*32+12) /* PadLock Montgomery Multiplier */
#define X86_FEATURE_PMM_EN ( 5*32+13) /* PMM enabled */
+#define X86_FEATURE_PHE2 ( 5*32+25) /* "phe2" Zhaoxin Hash Engine */
+#define X86_FEATURE_PHE2_EN ( 5*32+26) /* "phe2_en" PHE2 enabled */

/* More extended AMD flags: CPUID level 0x80000001, ECX, word 6 */
#define X86_FEATURE_LAHF_LM ( 6*32+ 0) /* LAHF/SAHF in long mode */
--
2.25.1
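
To illustrate how the two new flags relate to the CPUID bits named in the
commit message, the following is a minimal userspace sketch, not kernel
code. It assumes the usual cpufeatures encoding of (word * 32 + bit), with
word 5 holding CPUID 0xC0000001 EDX as the comment in cpufeatures.h says.
The SKETCH_* names and phe2_usable() are hypothetical, and requiring both
the "present" and "enabled" bits is an assumption about how a driver might
use them, not something this patch states.

```c
#include <stdbool.h>
#include <stdint.h>

/* Same encoding as the patch: word 5, bits 25 and 26. */
#define SKETCH_FEATURE_PHE2	(5 * 32 + 25)
#define SKETCH_FEATURE_PHE2_EN	(5 * 32 + 26)

/* Split an encoded feature number into its word index and bit index. */
#define SKETCH_FEATURE_WORD(f)	((f) / 32)
#define SKETCH_FEATURE_BIT(f)	((f) % 32)

/*
 * Given the EDX value returned by CPUID.(EAX=0xC0000001,ECX=0), report
 * whether the hash engine is both present (bit 25) and enabled (bit 26).
 * Treating "both bits set" as the usability condition is an assumption.
 */
static bool phe2_usable(uint32_t edx)
{
	uint32_t mask =
		(UINT32_C(1) << SKETCH_FEATURE_BIT(SKETCH_FEATURE_PHE2)) |
		(UINT32_C(1) << SKETCH_FEATURE_BIT(SKETCH_FEATURE_PHE2_EN));

	return (edx & mask) == mask;
}
```

With this encoding, an EDX value of 0x06000000 has both bits set, while
0x02000000 has only the "present" bit set.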