2020-12-21 19:30:14

by H. Peter Anvin

[permalink] [raw]
Subject: Re: [PATCH] crypto: x86/crc32c-intel - Don't match some Zhaoxin CPUs

On December 20, 2020 6:46:25 PM PST, [email protected] wrote:
>On December 16, 2020 1:56:45 AM GMT+08:00, Eric Biggers
><[email protected]> wrote:
>>On Tue, Dec 15, 2020 at 10:15:29AM +0800, Tony W Wang-oc wrote:
>>>
>>> On 15/12/2020 04:41, Eric Biggers wrote:
>>> > On Mon, Dec 14, 2020 at 10:28:19AM +0800, Tony W Wang-oc wrote:
>>> >> On 12/12/2020 01:43, Eric Biggers wrote:
>>> >>> On Fri, Dec 11, 2020 at 07:29:04PM +0800, Tony W Wang-oc wrote:
>>> >>>> The driver crc32c-intel match CPUs supporting
>>X86_FEATURE_XMM4_2.
>>> >>>> On platforms with Zhaoxin CPUs supporting this X86 feature,
>When
>>> >>>> crc32c-intel and crc32c-generic are both registered, system
>will
>>> >>>> use crc32c-intel because its .cra_priority is greater than
>>> >>>> crc32c-generic. This case expect to use crc32c-generic driver
>>for
>>> >>>> some Zhaoxin CPUs to get performance gain, So remove these
>>Zhaoxin
>>> >>>> CPUs support from crc32c-intel.
>>> >>>>
>>> >>>> Signed-off-by: Tony W Wang-oc <[email protected]>
>>> >>>
>>> >>> Does this mean that the performance of the crc32c instruction on
>>those CPUs is
>>> >>> actually slower than a regular C implementation? That's very
>>weird.
>>> >>>
>>> >>
>>> >> From the lmbench3 Create and Delete file test on those chips, I
>>think yes.
>>> >>
>>> >
>>> > Did you try measuring the performance of the hashing itself, and
>>not some
>>> > higher-level filesystem operations?
>>> >
>>>
>>> Yes. Was testing on these Zhaoxin CPUs, the result is that with the
>>same
>>> input value the generic C implementation takes fewer time than the
>>> crc32c instruction implementation.
>>>
>>
>>And that is really "working as intended"?
>
>These CPU's crc32c instruction is not working as intended.
>
> Why do these CPUs even
>>declare that
>>they support the crc32c instruction, when it is so slow?
>>
>
>The presence of crc32c and some other instructions supports are
>enumerated by CPUID.01:ECX[SSE4.2] = 1, other instructions are ok
>except the crc32c instruction.
>
>>Are there any other instruction sets (AES-NI, PCLMUL, SSE, SSE2, AVX,
>>etc.) that
>>these CPUs similarly declare support for but they are uselessly slow?
>
>No.
>
>Sincerely
>Tonyw
>
>>
>>- Eric

Then the right thing to do is to disable the CPUID bit in the vendor-specific startup code.
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.