Hi Herbert,
TI's crypto accelerator SA2UL doesn't support partial hashing (the
intermediate hash can't be exported) but it does support incremental
hashing (it can process multiple updates).
Currently, only digest() is offloaded to hardware in the driver;
init + update + final are implemented using the software fallback.
Several of our customers want to use SA2UL with OpenSSL, which
basically ends up calling init + n x update + final, and therefore
takes the software fallback (CPU) path.
For incremental hashing, we have to keep the FRAG bit set. For the
last packet, we have to clear the FRAG bit before sending it in, but
we have no way of knowing that a given packet is the last one.
We could implement update() to offload to hardware and rely on
finup() for the last packet, but that would fail the selftest
combination of init + update + final: since update() cannot know it is
handling the last packet, every packet goes in with the FRAG bit set,
and once final() is called without a packet having been sent with FRAG
cleared, the SHA can't be extracted from SA2UL.
I found an old thread [0] where Tero tried to solve this by
accumulating all the data from the updates in one buffer and dumping
it into SA2UL in one shot. His idea was rejected because
"These states could potentially be placed on the stack so they
must not be large."
But now we know that the FRAG bit can be used, and the problem of not
knowing which packet is the last can be solved if we lag one update
packet behind.
That is, the data from the first update is stored in a buffer kept in
the context; when the second update arrives, we pass the first
update's data from the buffer to SA2UL and store the second update's
data in the buffer instead. This way, when final() is called, the
last packet is still sitting in the buffer and can be sent with the
FRAG bit cleared to get the SHA.
All we need is a buffer big enough to hold the maximum amount of data
that a single update from AF_ALG or cryptodev can deliver.
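To make the scheme concrete, here is a rough stand-alone model of the
lag-of-one-update idea (this is not driver code: sa2ul_hw_process(),
SA2UL_FRAG and UPDATE_MAX are placeholders standing in for the real
descriptor submission and limits):

#include <stdio.h>
#include <string.h>

#define UPDATE_MAX (64 * 1024)  /* max data one update can deliver */
#define SA2UL_FRAG 0x1          /* "more fragments follow" flag */

struct sa2ul_ctx {
        unsigned char buf[UPDATE_MAX];  /* previous update's data */
        size_t buf_len;                 /* 0 until first update */
};

/* Stub: the real driver would build and submit a descriptor here. */
static void sa2ul_hw_process(const void *data, size_t len, int flags)
{
        printf("HW: %zu bytes, FRAG=%d\n", len, !!(flags & SA2UL_FRAG));
}

static void model_init(struct sa2ul_ctx *ctx)
{
        ctx->buf_len = 0;
}

static void model_update(struct sa2ul_ctx *ctx, const void *data,
                         size_t len)
{
        /* The staged chunk can't be the last one: another update just
         * arrived, so flush it with FRAG set. */
        if (ctx->buf_len)
                sa2ul_hw_process(ctx->buf, ctx->buf_len, SA2UL_FRAG);
        /* Stage the new chunk; we don't yet know if it is the last. */
        memcpy(ctx->buf, data, len);
        ctx->buf_len = len;
}

static void model_final(struct sa2ul_ctx *ctx)
{
        /* Whatever is staged now IS the last packet: send it with
         * FRAG cleared and the digest becomes extractable. */
        sa2ul_hw_process(ctx->buf, ctx->buf_len, 0);
        ctx->buf_len = 0;
}

int main(void)
{
        struct sa2ul_ctx ctx;

        model_init(&ctx);
        model_update(&ctx, "chunk1", 6);
        model_update(&ctx, "chunk2", 6);
        model_final(&ctx);      /* init + n*update + final now works */
        return 0;
}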
With this solution we still can't support intermediate hashes, so the
export function will not export an intermediate hash. But the import
and export functions are not marked as mandatory, and we don't really
need them either.
Let me know if this solution is upstreamable or if you have any other suggestions.
Thanks,
Kamlesh
[0]https://patchwork.kernel.org/project/linux-crypto/patch/[email protected]/#23454245
Herbert Xu <[email protected]> writes:
> On Wed, May 22, 2024 at 04:42:52PM +0530, Kamlesh Gurudasani wrote:
>>
>> For incremental hashing, we have to keep the FRAG bit set. For the
>> last packet, we have to clear the FRAG bit before sending it in, but
>> we have no way of knowing that a given packet is the last one.
>
> I don't understand. Can't your user just submit the request as
> a digest instead of init+update+final? Wouldn't that already work
> with your driver as is?
Thanks for the response, Herbert.
The way I understand it, algif_hash calls the digest function [0] if
the entire file (the file whose SHA needs to be calculated) is smaller
than 16 * PAGE_SIZE (PAGE_SIZE is 4 KB for us, so 64 KB). For any
file larger than 64 KB, it falls back to init -> update -> finup [1].
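A simplified model of that path choice (the real logic in [0]/[1] is
more involved; the constant names here are illustrative and just
mirror the 16 * PAGE_SIZE cut-off):

#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

#define PAGE_SIZE     4096      /* 4 KB for us */
#define ALG_MAX_PAGES 16        /* 16 * 4 KB = 64 KB cut-off */

static const char *hash_path(size_t len, bool msg_more)
{
        /* Small one-shot requests take the digest() path, which our
         * driver already offloads to SA2UL... */
        if (!msg_more && len <= ALG_MAX_PAGES * PAGE_SIZE)
                return "digest()";
        /* ...everything else takes the update path, which currently
         * means the software fallback. */
        return "init + n*update + finup/final";
}

int main(void)
{
        printf("64 KB file:    %s\n", hash_path(64 * 1024, false));
        printf("100 MB file:   %s\n", hash_path(100 * 1024 * 1024, false));
        printf("with MSG_MORE: %s\n", hash_path(4096, true));
        return 0;
}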
We could implement init -> update -> finup (SA2UL supports this), but
then the selftest would fail for init -> update -> final (SA2UL
doesn't support this).
Hence the suggestion to save 64 KB (the size of one update) in the
context, so that in final() we still have one update's worth of data
available (a kind of virtual finup). This would pass the selftest as
well.
Let me know what you think.
Kamlesh
[0]https://elixir.bootlin.com/linux/latest/source/crypto/algif_hash.c#L137
[1]https://elixir.bootlin.com/linux/latest/source/crypto/algif_hash.c#L151
Herbert Xu <[email protected]> writes:
> On Wed, Jun 05, 2024 at 06:12:22PM +0530, Kamlesh Gurudasani wrote:
>>
>> The way I understand it, algif_hash calls the digest function [0] if
>> the entire file (the file whose SHA needs to be calculated) is
>> smaller than 16 * PAGE_SIZE (PAGE_SIZE is 4 KB for us, so 64 KB).
>
> If that is the concern we should explore increasing the limit,
> or at least making it configurable. The limit exists in order
> to prevent user-space from exhausting kernel memory, so it is
> meant to be admin-configurable.
Increasing the limit is doable, but even if we increase it to ~1 MB,
hashing a file bigger than that will again fall back to
init -> update -> finup.
Also, one sg list holds 204 entries [0], which comes to about 800 KB
of data. I'm not sure if that is still true; it's an old article.
With chaining we could hold more data, but I'm not sure how the HW
handles that.
So if the file size is about 100 MB, it will again fall back to
init -> update -> finup.
If 800 KB is the most I can get, then I think it would be better to
save 64 KB in the context and implement the virtual finup for
init -> update -> final: allocating 800 KB in the algif_hash module vs
64 KB in the sa2ul driver. This would be a generic solution that fits
all scenarios.
If we rely on digest, there will always be some limitation. For
example, if the user allocates only a 5 MB buffer for a 10 MB file,
the data comes in as two update() calls instead of one digest() (the
MSG_MORE flag scenario in algif_hash).
[0]https://lwn.net/Articles/234617/
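For illustration, the MSG_MORE scenario from user space looks roughly
like this (standard AF_ALG usage; error handling omitted, and the
5 MB/10 MB split is just the example above):

#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <linux/if_alg.h>

static char chunk[5 * 1024 * 1024];     /* user's 5 MB staging buffer */

int main(void)
{
        struct sockaddr_alg sa = {
                .salg_family = AF_ALG,
                .salg_type   = "hash",
                .salg_name   = "sha256",
        };
        unsigned char digest[32];
        int tfm, op, i;

        tfm = socket(AF_ALG, SOCK_SEQPACKET, 0);
        bind(tfm, (struct sockaddr *)&sa, sizeof(sa));
        op = accept(tfm, NULL, NULL);

        memset(chunk, 'a', sizeof(chunk));

        /* A 10 MB file hashed through a 5 MB buffer: the first send
         * carries MSG_MORE (maps to update()), the last one doesn't
         * (maps to final), so the one-shot digest() path is never
         * taken no matter how large the digest limit is. */
        send(op, chunk, sizeof(chunk), MSG_MORE);
        send(op, chunk, sizeof(chunk), 0);

        read(op, digest, sizeof(digest));
        for (i = 0; i < 32; i++)
                printf("%02x", digest[i]);
        printf("\n");

        close(op);
        close(tfm);
        return 0;
}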
Cheers,
Kamlesh
>
> Cheers,
> --
> Email: Herbert Xu <[email protected]>
> Home Page: http://gondor.apana.org.au/~herbert/
> PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
Herbert Xu <[email protected]> writes:
> On Thu, Jun 06, 2024 at 05:33:41PM +0530, Kamlesh Gurudasani wrote:
>>
>> Also, one sg list holds 204 entries [0], which comes to about 800 KB
>> of data. I'm not sure if that is still true; it's an old article.
>> With chaining we could hold more data, but I'm not sure how the HW
>> handles that.
>
> If it's the SG list size that's limiting you then we should look
> into allowing bigger SG lists for af_alg to be constructed.
>
Increasing the size of the sg list would solve our problem to a
certain extent, but not completely. There are a few scenarios where we
still fail:
1. If we increase the sg list size to handle 100 MB, a file larger
than 100 MB will still fall back to init->update->finup. Also, the
SA2UL HW supports at most 4 MB in one shot, so internally the driver
has to break the sg list into 4 MB chunks anyway (see the sketch after
this list) and takes a performance hit.
2. There is the MSG_MORE scenario: if the user has to break the file
into multiple chunks because of memory limitations and then send them
to SA2UL, we again fall back to init->update->final.
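The 4 MB chunking from item 1, as a stand-alone sketch (same
placeholder sa2ul_hw_process()/SA2UL_FRAG as in my earlier mail; the
4 MB limit is the HW one-shot maximum mentioned above):

#include <stddef.h>
#include <stdio.h>

#define SA2UL_MAX_SHOT (4 * 1024 * 1024)        /* HW limit per shot */
#define SA2UL_FRAG 0x1

static void sa2ul_hw_process(size_t len, int flags)
{
        printf("HW: %zu bytes, FRAG=%d\n", len, !!(flags & SA2UL_FRAG));
}

/* However big the request, the driver must feed it to SA2UL in
 * <= 4 MB pieces, keeping FRAG set on all but the last one. */
static void sa2ul_digest(size_t total)
{
        size_t off = 0;

        while (off < total) {
                size_t piece = total - off;

                if (piece > SA2UL_MAX_SHOT)
                        piece = SA2UL_MAX_SHOT;
                off += piece;
                sa2ul_hw_process(piece, off < total ? SA2UL_FRAG : 0);
        }
}

int main(void)
{
        sa2ul_digest((size_t)10 * 1024 * 1024); /* 10 MB -> 3 shots */
        return 0;
}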
All these scenarios will work if we can have a 64 KB buffer for one
cra_init->init->n*update->final->cra_exit cycle. I think this would be
a better solution than increasing the sg list size to a large value in
af_alg.
With the solution of saving 64 KB in the context, we still won't be
able to export the partially hashed state (a HW limitation), but we
don't need partially hashed states anyway, and we can then utilize our
HW to the fullest since it supports incremental hashing.
Our HW will work with init->n*update->finup WITHOUT 64 KB saved in
the context, but it will FAIL the selftest with init->update->final.
With 64 KB (AFALG_MAX_PAGES * PAGE_SIZE) saved in the context, the
selftest will also PASS, and we don't have to tinker around for use
cases that rely on init -> update -> final as the standard offload
path.
Kamlesh
> Cheers,
> --
> Email: Herbert Xu <[email protected]>
> Home Page: http://gondor.apana.org.au/~herbert/
> PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt