This patch series provides an accelerated/optimized Chacha20 and Poly1305
implementation for Power10 or later CPUs (ppc64le). The module
implements the algorithms specified in RFC 7539. The implementation
provides 3.5X better performance than the baseline for Chacha20 and
Poly1305 individually, and a 1.5X improvement for the Chacha20/Poly1305
operation.
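For reference, the core Chacha20 operation from RFC 7539 (the quarter
round, section 2.1) can be sketched in scalar C as below. This is only an
illustrative sketch and not the code in this series: the assembly in
chacha-p10le-8x.S is vectorized, and the helper names used here (rotl32,
chacha_quarter_round, chacha_quarter_round_selftest) are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate a 32-bit word left by n bits (0 < n < 32). */
static inline uint32_t rotl32(uint32_t v, unsigned int n)
{
	return (v << n) | (v >> (32 - n));
}

/* ChaCha quarter round as defined in RFC 7539, section 2.1. */
static void chacha_quarter_round(uint32_t *a, uint32_t *b,
				 uint32_t *c, uint32_t *d)
{
	*a += *b; *d ^= *a; *d = rotl32(*d, 16);
	*c += *d; *b ^= *c; *b = rotl32(*b, 12);
	*a += *b; *d ^= *a; *d = rotl32(*d, 8);
	*c += *d; *b ^= *c; *b = rotl32(*b, 7);
}

/* Check the sketch against the test vector in RFC 7539, section 2.1.1. */
static void chacha_quarter_round_selftest(void)
{
	uint32_t a = 0x11111111, b = 0x01020304;
	uint32_t c = 0x9b8d6f43, d = 0x01234567;

	chacha_quarter_round(&a, &b, &c, &d);
	assert(a == 0xea2a92f4 && b == 0xcb1cf8ce);
	assert(c == 0x4581472e && d == 0x5881c4bb);
}
```

A full Chacha20 block applies this quarter round to the columns and
diagonals of a 4x4 state of 32-bit words for 20 rounds, as described in
the RFC.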
This patch series has been tested with the kernel crypto module tcrypt.ko
and has passed the selftests. It has also been tested with
CONFIG_CRYPTO_MANAGER_EXTRA_TESTS enabled.
Danny Tsen (5):
An optimized Chacha20 implementation with 8-way unrolling for ppc64le.
Glue code for optimized Chacha20 implementation for ppc64le.
An optimized Poly1305 implementation with 4-way unrolling for ppc64le.
Glue code for optimized Poly1305 implementation for ppc64le.
Update Kconfig and Makefile.
arch/powerpc/crypto/Kconfig | 26 +
arch/powerpc/crypto/Makefile | 4 +
arch/powerpc/crypto/chacha-p10-glue.c | 223 +++++
arch/powerpc/crypto/chacha-p10le-8x.S | 842 ++++++++++++++++++
arch/powerpc/crypto/poly1305-p10-glue.c | 186 ++++
arch/powerpc/crypto/poly1305-p10le_64.S | 1075 +++++++++++++++++++++++
6 files changed, 2356 insertions(+)
create mode 100644 arch/powerpc/crypto/chacha-p10-glue.c
create mode 100644 arch/powerpc/crypto/chacha-p10le-8x.S
create mode 100644 arch/powerpc/crypto/poly1305-p10-glue.c
create mode 100644 arch/powerpc/crypto/poly1305-p10le_64.S
--
2.31.1