Kamlesh Gurudasani <[email protected]> writes:
...
> Hi Eric, thanks for your detailed and valuable inputs.
>
> As per your suggestion, we did some profiling.
>
> The use case is to calculate CRC32/CRC64 over a file supplied from user space.
>
> Instead of implementing a PMULL-based CRC64 directly, we first compared the following three cases:
> Case 1.
> CRC32 (splice() + kernel space SW driver)
> https://gist.github.com/ti-kamlesh/5be75dbde292e122135ddf795fad9f21
>
> Case 2.
> CRC32 (mmap() + userspace implementation using the ARMv8 crc32 instructions)
> (We also tried read() to get the file contents, but it was slower than mmap(), so those numbers are omitted here.)
> https://gist.github.com/ti-kamlesh/002df094dd522422c6cb62069e15c40d
>
> Case 3.
> CRC64 (splice() + MCRC64 HW)
> https://gist.github.com/ti-kamlesh/98b1fc36c9a7c3defcc2dced4136b8a0
>
>
> Overall, the overhead of userspace + AF_ALG + driver in Case 1 and
> Case 3 is ~0.025s, and it is roughly constant across the file sizes measured.
> It is calculated as:
> real time to calculate the CRC - driver time (time spent inside init() + update() + final()) = overhead (~0.025s)
>
>
>
> +----------------+----------------------------+---------------------+---------------------+---------------------+
> | File size      | 120 MB (ideal size for us) | 20 MB               | 15 MB               | 5 MB                |
> +================+============================+=====================+=====================+=====================+
> | CRC32 (Case 1) | Driver time 0.155s         | Driver time 0.0325s | Driver time 0.019s  | Driver time 0.0062s |
> |                | Real time 0.18s            | Real time 0.06s     | Real time 0.04s     | Real time 0.03s     |
> |                | Overhead 0.025s            | Overhead 0.025s     | Overhead 0.021s     | Overhead ~0.023s    |
> +----------------+----------------------------+---------------------+---------------------+---------------------+
> | CRC32 (Case 2) | Real time 0.30s            | Real time 0.05s     | Real time 0.04s     | Real time 0.02s     |
> +----------------+----------------------------+---------------------+---------------------+---------------------+
> | CRC64 (Case 3) | Driver time 0.385s         | Driver time 0.0665s | Driver time 0.0515s | Driver time 0.019s  |
> |                | Real time 0.41s            | Real time 0.09s     | Real time 0.08s     | Real time 0.04s     |
> |                | Overhead 0.025s            | Overhead 0.025s     | Overhead ~0.025s    | Overhead ~0.021s    |
> +----------------+----------------------------+---------------------+---------------------+---------------------+
>
> If we assume that a PMULL-based CRC64 implementation would perform similarly to
> CRC32 (Case 2), then MCRC64 saves a good number of CPU cycles
> for files larger than 5-10 MB, since most of the time is spent in the HW offload.
>
> Regards,
> Kamlesh
Hi Eric,
Please let me know if the above numbers make sense to you and whether I should send
the next revision.
Regards,
Kamlesh
Kamlesh Gurudasani <[email protected]> writes:
>>
>> If we assume that a PMULL-based CRC64 implementation would perform similarly to
>> CRC32 (Case 2), then MCRC64 saves a good number of CPU cycles
>> for files larger than 5-10 MB, since most of the time is spent in the HW offload.
>>
>> Regards,
>> Kamlesh
>
> Hi Eric,
>
> Please let me know if the above numbers make sense to you and whether I should send
> the next revision.
Hi Eric,
I understand that there is no in-kernel user for crc64-iso3309 and that this
is a new algorithm that we are trying to add to the Linux kernel.
As per your suggestion, we did the measurements, and it turns out that we
save a good number of CPU cycles with the HW offload.
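For reference, here is a minimal Python sketch of how the overhead and the
driver's share of wall-clock time follow from the table quoted earlier in this
thread. It is not part of the driver or the gists; the names overhead(),
driver_share() and summarize() are only illustrative. Because the real times
in the table are rounded to two decimals, the recomputed overheads can differ
from the table entries by a few milliseconds.

```python
# Timings in seconds, copied from the table earlier in this thread.
# Keys are file sizes in MB; values are (driver_time, real_time).
CASE1_CRC32_SW = {120: (0.155, 0.18), 20: (0.0325, 0.06),
                  15: (0.019, 0.04), 5: (0.0062, 0.03)}
CASE3_CRC64_HW = {120: (0.385, 0.41), 20: (0.0665, 0.09),
                  15: (0.0515, 0.08), 5: (0.019, 0.04)}


def overhead(driver_time, real_time):
    """Time spent outside init() + update() + final()."""
    return real_time - driver_time


def driver_share(driver_time, real_time):
    """Fraction of the wall-clock time spent inside the driver."""
    return driver_time / real_time


def summarize(name, timings):
    """Print overhead and driver share per file size, largest file first."""
    for size_mb, (drv, real) in sorted(timings.items(), reverse=True):
        print(f"{name} {size_mb:>4} MB: "
              f"overhead {overhead(drv, real):.4f}s, "
              f"driver share {driver_share(drv, real):.0%}")


if __name__ == "__main__":
    summarize("CRC32 (Case 1)", CASE1_CRC32_SW)
    summarize("CRC64 (Case 3)", CASE3_CRC64_HW)
```

In Case 3 the driver share grows with file size, which is the basis for the
statement that most of the time is spent in the HW offload for larger files.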
Also, some automotive customers have a safety
requirement to offload any parameters that are in Linux to ensure FFI
(freedom from interference).
Let me know if you are willing to accept this driver, so that I can put in
the effort to send the next revision.
Regards,
Kamlesh