2012-09-25 21:50:09

by Tim Chen

Subject: [PATCH 0/3] Optimize CRC32C calculation using PCLMULQDQ in crc32c-intel module

This patch series optimizes CRC32C calculation in the crc32c-intel
module using the PCLMULQDQ instruction. It speeds up the original
implementation by 1.6x for a 1K buffer and by 3x for buffers of 4K
or more. The tcrypt module is extended with a speed test for
crc32c calculations.

Tim

Signed-off-by: Tim Chen <[email protected]>
---
Tim Chen (3):
  Rename crc32c-intel.c to crc32c-intel_glue.c
  Optimize CRC32C calculation with PCLMULQDQ instruction
  Added speed test in tcrypt for crc32c

 arch/x86/crypto/Makefile                           |    1 +
 .../crypto/{crc32c-intel.c => crc32c-intel_glue.c} |   75 ++++
 arch/x86/crypto/crc32c-pcl-intel-asm.S             |  460 ++++++++++++++++++++
 crypto/tcrypt.c                                    |    4 +
 4 files changed, 540 insertions(+), 0 deletions(-)
 rename arch/x86/crypto/{crc32c-intel.c => crc32c-intel_glue.c} (70%)
 create mode 100644 arch/x86/crypto/crc32c-pcl-intel-asm.S

--
1.7.7.6
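
For readers who want a baseline to compare an optimized CRC32C routine
against, here is a minimal bit-at-a-time sketch in C. It is not code from
this patch series. The function names are hypothetical, and it assumes the
usual CRC32C convention (reflected Castagnoli polynomial, initial value of
all ones, final inversion).

```c
#include <stddef.h>
#include <stdint.h>

/* Reflected form of the Castagnoli polynomial 0x1EDC6F41. */
#define CRC32C_POLY_REFLECTED 0x82F63B78u

/*
 * Hypothetical helper: fold len bytes into a raw CRC32C state,
 * one bit at a time. Slow, but easy to check by hand.
 */
static uint32_t crc32c_update_bitwise(uint32_t crc,
                                      const unsigned char *buf, size_t len)
{
    while (len--) {
        crc ^= *buf++;
        for (int bit = 0; bit < 8; bit++) {
            /* mask is all ones when the low bit is set, else zero */
            uint32_t mask = 0u - (crc & 1u);
            crc = (crc >> 1) ^ (CRC32C_POLY_REFLECTED & mask);
        }
    }
    return crc;
}

/*
 * Hypothetical one-shot wrapper: start from all ones and invert the
 * result (an assumption about the convention, stated above).
 */
static uint32_t crc32c_bitwise(const void *buf, size_t len)
{
    uint32_t crc = crc32c_update_bitwise(0xFFFFFFFFu, buf, len);
    return crc ^ 0xFFFFFFFFu;
}
```

Calling crc32c_bitwise("123456789", 9) should return 0xE3069283, the
standard CRC32C check value, which makes this a convenient reference when
validating a faster implementation on buffers of various sizes.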


2012-09-26 16:55:24

by David Miller

Subject: Re: [PATCH 0/3] Optimize CRC32C calculation using PCLMULQDQ in crc32c-intel module

From: Tim Chen <[email protected]>
Date: Tue, 25 Sep 2012 14:50:08 -0700

> This patch series optimizes CRC32C calculation in the crc32c-intel
> module using the PCLMULQDQ instruction. It speeds up the original
> implementation by 1.6x for a 1K buffer and by 3x for buffers of 4K
> or more. The tcrypt module is extended with a speed test for
> crc32c calculations.
>
> Signed-off-by: Tim Chen <[email protected]>

Great work.

I intend to do something nearly identical on sparc64, since we have
similar instructions in the form of "xmulx" and "xmulxhi", which return
the lower and upper 64 bits (respectively) of an XOR multiply.
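
To make the "XOR multiply" above concrete, here is a portable C sketch of
a 64x64 carry-less multiplication that returns the lower and upper 64 bits
of the 128-bit product separately. The struct and function names are
hypothetical. It models the arithmetic only and makes no claim about how
the sparc64 or x86 instructions are encoded or used in the kernel.

```c
#include <stdint.h>

/* Hypothetical container for the 128-bit carry-less product. */
struct clmul128 {
    uint64_t lo;   /* lower 64 bits of the product */
    uint64_t hi;   /* upper 64 bits of the product */
};

/*
 * Carry-less (XOR) multiply: like schoolbook binary multiplication,
 * but the partial products are combined with XOR instead of addition,
 * so no carries propagate between bit positions.
 */
static struct clmul128 clmul64(uint64_t a, uint64_t b)
{
    struct clmul128 r = { 0, 0 };

    for (int i = 0; i < 64; i++) {
        if ((b >> i) & 1) {
            r.lo ^= a << i;
            /* a shift by 64 is undefined in C, so skip i == 0 */
            if (i != 0)
                r.hi ^= a >> (64 - i);
        }
    }
    return r;
}
```

For example, clmul64(3, 3) yields lo = 5 and hi = 0, because
(x + 1)^2 = x^2 + 1 when coefficients are added modulo 2, whereas an
ordinary integer multiply would give 9.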

2012-09-27 05:46:00

by Herbert Xu

Subject: Re: [PATCH 0/3] Optimize CRC32C calculation using PCLMULQDQ in crc32c-intel module

On Tue, Sep 25, 2012 at 02:50:08PM -0700, Tim Chen wrote:
> This patch series optimizes CRC32C calculation in the crc32c-intel
> module using the PCLMULQDQ instruction. It speeds up the original
> implementation by 1.6x for a 1K buffer and by 3x for buffers of 4K
> or more. The tcrypt module is extended with a speed test for
> crc32c calculations.
>
> Tim
>
> Signed-off-by: Tim Chen <[email protected]>

All applied. Thanks Tim!
--
Email: Herbert Xu <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt