2017-07-11 07:52:12

by Kamil Konieczny

Subject: HELP writing crypto driver for HW with blocksize hash

Hi,

I am writing a crypto driver for the MD5/SHA1/SHA256 hashes on
Exynos 4412, and I am facing some (minor?) difficulties.

In the old days, hardware (HW) could only do the basic hash block
operation, so at the end the hash had to be finalized, and the driver
needed to write some bits into a buffer to get the message hash.
Time passed, and hardware was designed with improved capabilities:
it can append those bits after the message itself and calculate the
hash, and it can stop, export its state, import it and resume,
but ... it cannot process a null-length (or zero-length) ending.
So there is no final method anymore.

It must be fed at least one message byte to produce a hash.

One more constraint is that data must be processed in constant-sized
chunks, here 64 bytes (the MD5/SHA1/SHA256 block size),
i.e. it cannot stop and export state in the middle of a block.
For example, if we feed 16 bytes, we must then feed 48 more bytes
before it can stop, but ideally we should always feed it 64 bytes.
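
The arithmetic behind this constraint can be sketched in plain
user-space C (not kernel code). The names HASH_BLOCK_SIZE and
split_request are hypothetical and exist only for this illustration;
the sketch only shows how many bytes could go to the HW and how many
would have to wait.

```c
#include <assert.h>
#include <stddef.h>

#define HASH_BLOCK_SIZE 64u /* chunk size stated above */

/* Hypothetical result type: what can be fed now, what must wait. */
struct split {
    size_t to_hw; /* always a multiple of HASH_BLOCK_SIZE, may be 0 */
    size_t carry; /* leftover bytes, always < HASH_BLOCK_SIZE */
};

/*
 * buffered: bytes already held back from earlier requests
 * nbytes:   bytes arriving with the current request
 */
static struct split split_request(size_t buffered, size_t nbytes)
{
    struct split s;
    size_t total = buffered + nbytes;

    /* A full block would already have been fed, so less must be held. */
    assert(buffered < HASH_BLOCK_SIZE);

    s.carry = total % HASH_BLOCK_SIZE;
    s.to_hw = total - s.carry;
    return s;
}
```

With 16 bytes in the first request nothing can be fed yet; once the
next 48 bytes arrive, exactly one 64-byte block is ready for the HW.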

Some crypto drivers with similar problem(s):

omap-sham.c - no final and blocksize needed,
broadcom bcm/spu.c - no final,
ccp/ccp-crypto-sha.c - no final,
nx/nx-sha256.c - blocksize needed,

One more thing - in the algorithm description for the methods
final, finup, update, digest, export, import
there is a note that finup is for hardware
that cannot do final, but again,

it looks like the crypto framework ignores that, and every finup
is translated into "update, final".

The HW driver will do the opposite - it will translate final into finup.

From this it follows that the authors of every such HW crypto driver
duplicate code for keeping at least a blocksize cache of the message.

One more point - the use of block size in the algo structure.
It is there to inform the framework about the HW limitation,
but it seems to be ignored again...

Any suggestions?

Can I keep some bytes from an ahash_request unfed
and return -EINPROGRESS?
Should I set a timer and copy the remaining bytes after some timeout
during which no more requests arrive? Or not, because it is
async mode?
Can I wait for more requests before processing the waiting one?

--
Best regards,
Kamil Konieczny
Samsung R&D Institute Poland


2017-07-11 10:30:44

by Gilad Ben-Yossef

Subject: Re: HELP writing crypto driver for HW with blocksize hash

On Tue, Jul 11, 2017 at 10:52 AM, Kamil Konieczny
<[email protected]> wrote:
>
> Hi,
>
> I am writing crypto driver for hash MD5/SHA1/SHA256 on Exynos 4412,
> and I am facing some (minor?) difficulties.
>
> In old days, hadware (HW) can only do basic hash block operation,
> so at the end it needed to finalize hash, and driver need to
> write some bits into buffer to get message hash. Time passes,
> and hardware (HW) was designed with improved cappabilities,
> so it can add itself bits after message and calculate hash,
> it can stop then export state, import and resume,
> but ... it can not process null-length (or zero-length) ending.
> So there is no more final method.
>
> It must be feeded with at least one message byte to produce hash.
>
> One more constrain is to process data in constant-sized chunks,
> here it is 64 bytes, the same as for AES block,
> i.e. it cannot stop and export state while in middle of block,
> example - if we feed 16 bytes, we must then feed 48 bytes
> to stop, but ideally we should feed it always 64 bytes.


>
> Some crypto drivers with similar problem(s):
>
> omap-sham.c - no final and blocksize needed,
> broadcom bcm/spu.c - no final,
> ccp/ccp-crypto-sha.c - no final,
> nx/nx-sha256.c - blocksize needed,
>
> One more thing - in algorithm description for methods:
> final, finup, update, digest, export, import
> there is note that finup is for those hardware
> that cannot do final, but again,
>
> it looks like crypto framework is ignoring that and every finup
> is translated into "update, final".
>
> HW driver will do opposite - it will translate final into finup.
>
> From this follows that for every such HW crypto drivers authors
> duplicate code for keeping at least blocksize cache of message.
>
> One more point - use of block size in algo structure.
> It is for informing framework about HW limitation,
> but seems to be ignored again...
>
> Any suggestions ?
>
> Can i keep some bytes unfeeded from ahash_request
> and return -EINPROGRESS ?
> Should i set timer and copy rest bytes after some timeout,
> where no more requests are incoming ? Or not ? cause it is
> async mode ?
> can i wait for more requests for processing waiting one ?
>

Your two constraints are actually inter-related -

If you can only feed the HW a constant size chunk, then you indeed
need to keep the bytes fed to the driver that are below the chunk size
in a software buffer, but then you need the final() method to feed
these bytes (padded as needed) to the HW if they are the last bytes.
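
A minimal user-space model of that buffering, as a sketch only. It is
not kernel code and uses no crypto API calls: struct hash_ctx,
ctx_update, ctx_final and the hw_* counters are hypothetical names that
stand in for a real driver and real hardware. One assumption goes
beyond the paragraph above: the model holds back up to a full block
(1..64 bytes) until more data arrives, so that final() always has at
least one byte to feed, which is the zero-length limitation described
in the original message. An entirely empty message is not handled here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BLK 64u /* HW chunk size from the thread */

/* Hypothetical context; hw_* fields only model the HW by counting. */
struct hash_ctx {
    unsigned char buf[BLK]; /* software buffer for bytes not yet fed */
    size_t buflen;          /* 0..BLK bytes currently held */
    size_t hw_bytes;        /* bytes fed as whole blocks (model only) */
    size_t hw_last;         /* bytes fed by the last request (model only) */
};

/* Stand-in for starting a HW operation on whole blocks. */
static void hw_feed_blocks(struct hash_ctx *c, const unsigned char *p,
                           size_t len)
{
    (void)p;
    assert(len % BLK == 0);
    c->hw_bytes += len;
}

static void ctx_update(struct hash_ctx *c, const unsigned char *data,
                       size_t len)
{
    size_t take, direct;

    /* Everything still fits: only buffer it, the HW is not touched. */
    if (c->buflen + len <= BLK) {
        memcpy(c->buf + c->buflen, data, len);
        c->buflen += len;
        return;
    }

    /* More data follows, so the held bytes can be completed and fed. */
    if (c->buflen) {
        take = BLK - c->buflen;
        memcpy(c->buf + c->buflen, data, take);
        hw_feed_blocks(c, c->buf, BLK);
        data += take;
        len -= take;
        c->buflen = 0;
    }

    /* Feed whole blocks from the caller's data, keep 1..BLK bytes back. */
    direct = ((len - 1) / BLK) * BLK;
    if (direct)
        hw_feed_blocks(c, data, direct);
    memcpy(c->buf, data + direct, len - direct);
    c->buflen = len - direct;
}

/* Returns 0 on success, -1 for an empty message (not modelled). */
static int ctx_final(struct hash_ctx *c)
{
    if (c->buflen == 0)
        return -1;
    c->hw_last = c->buflen; /* last bytes go to the HW here */
    c->buflen = 0;
    return 0;
}
```

In this model the only copy per update is the held-back tail of at
most BLK bytes; whole blocks in the middle of a request are fed
straight from the caller's data.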

Gilad




--
Gilad Ben-Yossef
Chief Coffee Drinker

"If you take a class in large-scale robotics, can you end up in a
situation where the homework eats your dog?"
-- Jean-Baptiste Queru

2017-07-11 10:50:08

by Kamil Konieczny

Subject: Re: HELP writing crypto driver for HW with blocksize hash

On 11.07.2017 12:30, Gilad Ben-Yossef wrote:
> On Tue, Jul 11, 2017 at 10:52 AM, Kamil Konieczny
> <[email protected]> wrote:
>>
>>
>> I am writing crypto driver for hash MD5/SHA1/SHA256 on Exynos 4412,
>> and I am facing some (minor?) difficulties. [...]

>> So there is no [...] final method.
>>
>> It must be feeded with at least one message byte to produce hash.
>>
>> One more constrain is to process data in constant-sized chunks,
>> here it is 64 bytes, the same as for AES block,
>> i.e. it cannot stop and export state while in middle of block,
>> example - if we feed 16 bytes, we must then feed 48 bytes
>> to stop, but ideally we should feed it always 64 bytes. [...]
>>
>> Any suggestions ?
>>
>> Can i keep some bytes unfeeded from ahash_request
>> and return -EINPROGRESS ?
>> Should i set timer and copy rest bytes after some timeout,
>> where no more requests are incoming ? Or not ? cause it is
>> async mode ?
>> can i wait for more requests for processing waiting one ?
> Your two constraints are actually inter-related -
>
> If you can only feed the HW a constant size chunk, than indeed need to
> keey bytes fed
> to the driver the are below the chunk size in a software buffer, but
> than you need the final()
> method to feed these bytes (padded as needed) to the HW if they are
> the last bytes
>
> Gilad

The HW will do the padding.

I want to avoid memcpy if possible for async hash.

--
Best regards,
Kamil Konieczny
Samsung R&D Institute Poland

2017-07-11 10:59:27

by Gilad Ben-Yossef

Subject: Re: HELP writing crypto driver for HW with blocksize hash

On Tue, Jul 11, 2017 at 1:50 PM, Kamil Konieczny
<[email protected]> wrote:
> On 11.07.2017 12:30, Gilad Ben-Yossef wrote:
>> On Tue, Jul 11, 2017 at 10:52 AM, Kamil Konieczny
>> <[email protected]> wrote:
>>>
>>>
>>> I am writing crypto driver for hash MD5/SHA1/SHA256 on Exynos 4412,
>>> and I am facing some (minor?) difficulties. [...]
>
>>> So there is no [...] final method.
>>>
>>> It must be feeded with at least one message byte to produce hash.
>>>
>>> One more constrain is to process data in constant-sized chunks,
>>> here it is 64 bytes, the same as for AES block,
>>> i.e. it cannot stop and export state while in middle of block,
>>> example - if we feed 16 bytes, we must then feed 48 bytes
>>> to stop, but ideally we should feed it always 64 bytes. [...]
>>>
>>> Any suggestions ?
>>>
>>> Can i keep some bytes unfeeded from ahash_request
>>> and return -EINPROGRESS ?
>>> Should i set timer and copy rest bytes after some timeout,
>>> where no more requests are incoming ? Or not ? cause it is
>>> async mode ?
>>> can i wait for more requests for processing waiting one ?
>> Your two constraints are actually inter-related -
>>
>> If you can only feed the HW a constant size chunk, than indeed need to
>> keey bytes fed
>> to the driver the are below the chunk size in a software buffer, but
>> than you need the final()
>> method to feed these bytes (padded as needed) to the HW if they are
>> the last bytes
>>
>> Gilad
>
> HW will done padding
>
> I want to avoid memcpy iff possible for async hash

Why? You're talking about a copy of a single cache line at most.
That hardly seems worth the trouble.

Gilad

>
> --
> Best regards,
> Kamil Konieczny
> Samsung R&D Institute Poland



--
Gilad Ben-Yossef
Chief Coffee Drinker

"If you take a class in large-scale robotics, can you end up in a
situation where the homework eats your dog?"
-- Jean-Baptiste Queru