2022-08-23 16:37:48

by Paul Menzel

Subject: kdf108_init() takes over 250 ms

Dear Stephan,


On the Dell XPS 13 9370 with Debian sid/unstable, I noticed with Linux
5.18.16, that `crypto_kdf108_init()` takes 263 ms to run even with
disabled self-tests:

```
[ 0.000000] Linux version 5.18.0-4-amd64
([email protected]) (gcc-11 (Debian 11.3.0-5) 11.3.0, GNU
ld (GNU Binutils for Debian) 2.38.90.20220713) #1 SMP PREEMPT_DYNAMIC
Debian 5.18.16-1 (2022-08-10)
[ 0.000000] Command line: BOOT_IMAGE=/vmlinuz-5.18.0-4-amd64
root=UUID=56f398e0-1e25-4fda-aa9f-611dece4b333 ro quiet
module_blacklist=psmouse initcall_debug log_buf_len=4M cryptomgr.notests
[…]
[ 0.000000] DMI: Dell Inc. XPS 13 9370/0RMYH9, BIOS 1.21.0 07/06/2022
[…]
[ 0.272123] calling x509_key_init+0x0/0x11 @ 1
[ 0.272125] Asymmetric key parser 'x509' registered
[ 0.272126] initcall x509_key_init+0x0/0x11 returned 0 after 1 usecs
[ 0.272127] calling crypto_kdf108_init+0x0/0x149 @ 1
[ 0.530787] Freeing initrd memory: 39332K
[ 0.534667] alg: self-tests disabled
[ 0.534701] alg: self-tests for CTR-KDF (hmac(sha256)) passed
[ 0.534703] initcall crypto_kdf108_init+0x0/0x149 returned 0 after
262573 usecs
[ 0.534708] calling blkdev_init+0x0/0x20 @ 1
[ 0.534716] initcall blkdev_init+0x0/0x20 returned 0 after 5 usecs
[ 0.534718] calling proc_genhd_init+0x0/0x46 @ 1
[ 0.534723] initcall proc_genhd_init+0x0/0x46 returned 0 after 3 usecs
```

With self-tests enabled, it takes less than a millisecond longer.

```
[ 0.282389] calling crypto_kdf108_init+0x0/0x149 @ 1
[ 0.541096] Freeing initrd memory: 39332K
[ 0.545674] alg: self-tests for CTR-KDF (hmac(sha256)) passed
[ 0.545676] initcall crypto_kdf108_init+0x0/0x149 returned 0 after
263284 usecs
```
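
For picking slow initcalls out of a longer `initcall_debug` log, a small filter can help. This is only a sketch: the regular expression is modeled on the lines shown above, and the function name and the 10 ms default threshold are arbitrary choices made for this example.

```python
import re

# Modeled on the "initcall ... returned ... after N usecs" lines above.
# \s+ between tokens also tolerates lines that a mail client wrapped.
INITCALL_RE = re.compile(
    r"initcall\s+(?P<name>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+\s+"
    r"returned\s+(?P<ret>-?\d+)\s+after\s+(?P<usecs>\d+)\s+usecs"
)


def slow_initcalls(log_text, threshold_usecs=10_000):
    """Return (name, usecs) pairs at or above the threshold, slowest first.

    The name and the default threshold are hypothetical, chosen for
    illustration only.
    """
    hits = [
        (m.group("name"), int(m.group("usecs")))
        for m in INITCALL_RE.finditer(log_text)
    ]
    slow = [h for h in hits if h[1] >= threshold_usecs]
    return sorted(slow, key=lambda h: h[1], reverse=True)


if __name__ == "__main__":
    sample = (
        "[ 0.272126] initcall x509_key_init+0x0/0x11 returned 0 after 1 usecs\n"
        "[ 0.534703] initcall crypto_kdf108_init+0x0/0x149 returned 0 after\n"
        "262573 usecs\n"
        "[ 0.534716] initcall blkdev_init+0x0/0x20 returned 0 after 5 usecs\n"
    )
    for name, usecs in slow_initcalls(sample):
        print(f"{name}: {usecs / 1000:.1f} ms")
```

Feeding it the output of `dmesg` from a boot with `initcall_debug` should list only the initcalls worth a closer look.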


Kind regards,

Paul


Subject: RE: kdf108_init() takes over 250 ms


> -----Original Message-----
> From: Paul Menzel <[email protected]>
> Sent: Tuesday, August 23, 2022 9:52 AM
> To: Stephan Müller <[email protected]>
> Cc: Herbert Xu <[email protected]>; David S. Miller
> <[email protected]>; [email protected]; LKML <linux-
> [email protected]>
> Subject: kdf108_init() takes over 250 ms
>
> Dear Stephan,
>
> On the Dell XPS 13 9370 with Debian sid/unstable, I noticed with Linux
> 5.18.16, that `crypto_kdf108_init()` takes 263 ms to run even with
> disabled self-tests:
>
...
> [ 0.000000] Command line: BOOT_IMAGE=/vmlinuz-5.18.0-4-amd64
> root=UUID=56f398e0-1e25-4fda-aa9f-611dece4b333 ro quiet
> module_blacklist=psmouse initcall_debug log_buf_len=4M cryptomgr.notests
...
> [ 0.272127] calling crypto_kdf108_init+0x0/0x149 @ 1
> [ 0.530787] Freeing initrd memory: 39332K
> [ 0.534667] alg: self-tests disabled
> [ 0.534701] alg: self-tests for CTR-KDF (hmac(sha256)) passed
> [ 0.534703] initcall crypto_kdf108_init+0x0/0x149 returned 0 after
> 262573 usecs
...
>
> With self-tests enabled, it takes less than a millisecond longer.
>
> ```
> [ 0.282389] calling crypto_kdf108_init+0x0/0x149 @ 1
> [ 0.541096] Freeing initrd memory: 39332K
> [ 0.545674] alg: self-tests for CTR-KDF (hmac(sha256)) passed
> [ 0.545676] initcall crypto_kdf108_init+0x0/0x149 returned 0 after
> 263284 usecs
> ```

crypto_kdf108_init() calls its self-test function directly rather
than alg_test(), which implements the notests flag. Maybe it
should go through alg_test().

Apart from that, check that Tim's x86-optimized SHA-256 module
is loaded, so that it is used rather than the generic implementation.
On my system, that improves the kdf108 initialization time
from 1.4 s to 0.38 s:

With sha256_generic:
initcall sha256_generic_mod_init+0x0/0x16 returned 0 after 0 usecs
...
initcall crypto_kdf108_init+0x0/0x18d returned 0 after 1425640 usecs

With sha256_ssse3 (using its AVX2 implementation):
initcall sha256_ssse3_mod_init+0x0/0x1bf returned 0 after 12148 usecs
...
initcall crypto_kdf108_init+0x0/0x153 returned 0 after 382799 usecs

That's controlled by CONFIG_CRYPTO_SHA256_SSSE3.
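
To put the two measurements in proportion, the arithmetic can be spelled out as below. The two constants are copied from the initcall lines above; the function name is made up for this sketch.

```python
# Durations in microseconds, taken from the two initcall lines above.
GENERIC_USECS = 1_425_640  # crypto_kdf108_init with sha256_generic
SSSE3_USECS = 382_799      # crypto_kdf108_init with sha256_ssse3 (AVX2)


def compare(slow_usecs, fast_usecs):
    """Return (speedup factor, time saved in milliseconds)."""
    speedup = slow_usecs / fast_usecs
    saved_ms = (slow_usecs - fast_usecs) / 1000
    return speedup, saved_ms


if __name__ == "__main__":
    speedup, saved_ms = compare(GENERIC_USECS, SSSE3_USECS)
    print(f"speedup: {speedup:.2f}x, saved: {saved_ms:.0f} ms")
```

That works out to a speedup of roughly 3.7x, or about one second less boot time on this system.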


2022-08-26 07:54:22

by Stephan Müller

Subject: Re: kdf108_init() takes over 250 ms

On Tuesday, 23 August 2022, 22:10:01 CEST, Elliott, Robert (Servers) wrote:

Hi Robert,

> > -----Original Message-----
> > From: Paul Menzel <[email protected]>
> > Sent: Tuesday, August 23, 2022 9:52 AM
> > To: Stephan Müller <[email protected]>
> > Cc: Herbert Xu <[email protected]>; David S. Miller
> > <[email protected]>; [email protected]; LKML <linux-
> > [email protected]>
> > Subject: kdf108_init() takes over 250 ms
> >
> > Dear Stephan,
> >
> > On the Dell XPS 13 9370 with Debian sid/unstable, I noticed with Linux
> > 5.18.16, that `crypto_kdf108_init()` takes 263 ms to run even with
> > disabled self-tests:
> >
>
> ...
>
> > [ 0.000000] Command line: BOOT_IMAGE=/vmlinuz-5.18.0-4-amd64
> > root=UUID=56f398e0-1e25-4fda-aa9f-611dece4b333 ro quiet
> > module_blacklist=psmouse initcall_debug log_buf_len=4M cryptomgr.notests
>
> ...
>
> > [ 0.272127] calling crypto_kdf108_init+0x0/0x149 @ 1
> > [ 0.530787] Freeing initrd memory: 39332K
> > [ 0.534667] alg: self-tests disabled
> > [ 0.534701] alg: self-tests for CTR-KDF (hmac(sha256)) passed
> > [ 0.534703] initcall crypto_kdf108_init+0x0/0x149 returned 0 after
> > 262573 usecs
>
> ...
>
> >
> > With self-tests enabled, it takes less than a millisecond longer.
> >
> > ```
> > [ 0.282389] calling crypto_kdf108_init+0x0/0x149 @ 1
> > [ 0.541096] Freeing initrd memory: 39332K
> > [ 0.545674] alg: self-tests for CTR-KDF (hmac(sha256)) passed
> > [ 0.545676] initcall crypto_kdf108_init+0x0/0x149 returned 0 after
> > 263284 usecs
> > ```
>
>
> crypto_kdf108_init() calls its self-test function directly rather
> than alg_test(), which implements the notests flag. Maybe it
> should go through alg_test().

You are right that it does not use alg_test(). This is because the KDF is
just a helper and not implemented as a template. I initially wanted to turn
the KDFs into templates, which would then be able to go through alg_test(),
and provided a patch for that. It was not accepted; only service functions
were accepted instead.

The reason for not accepting the template approach was that a completely new
API is needed to accommodate the KDFs. Initially I called the API "rng" because
a KDF and a PRNG are very similar in nature: they take an arbitrary string
as input (the seed/key/personalization/additional info/label string) and
provide an arbitrary output (mathematically you can even use both
interchangeably for the same purposes - although cryptographically speaking
you do not want that). As this concept cannot be covered by the existing
APIs, a KDF cannot be rolled into those existing APIs as a template. Side note:
the same question about such a new API will come up as soon as somebody asks
for SHAKE to be added.
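
To illustrate the "arbitrary string in, arbitrary output out" shape of a KDF, here is a minimal userspace sketch of a counter-mode KDF in the style of SP 800-108, using HMAC-SHA-256 as the PRF. It is not the kernel code: the function name, the 32-bit big-endian counter starting at 1, and the treatment of `info` as one opaque byte string are assumptions made for this example.

```python
import hashlib
import hmac
import struct


def kdf108_ctr(key: bytes, info: bytes, out_len: int) -> bytes:
    """Counter-mode KDF sketch: K(i) = HMAC-SHA-256(key, [i] || info).

    Assumptions (not taken from the thread): the counter is a 32-bit
    big-endian integer starting at 1, and `info` already contains whatever
    label/context encoding the caller wants.
    """
    digest_len = hashlib.sha256().digest_size
    blocks = []
    counter = 1
    produced = 0
    while produced < out_len:
        mac = hmac.new(key, struct.pack(">I", counter) + info, hashlib.sha256)
        blocks.append(mac.digest())
        produced += digest_len
        counter += 1
    # Truncate the last block so the caller gets exactly out_len bytes.
    return b"".join(blocks)[:out_len]


if __name__ == "__main__":
    derived = kdf108_ctr(b"\x01" * 32, b"example label", 70)
    print(len(derived), derived.hex())
```

The caller chooses both the input string and the output length, which is the property that makes a KDF resemble a PRNG interface more than a hash or cipher interface.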

A low-hanging fruit would be to also skip the KDF self-test when the notests
option is set.

>
> Apart from that, check that Tim's x86-optimized SHA-256 module
> is loaded, so that it is used rather than the generic implementation.
> On my system, that improves the kdf108 initialization time
> from 1.4 s to 0.38 s:
>
> With sha256_generic:
> initcall sha256_generic_mod_init+0x0/0x16 returned 0 after 0 usecs
> ...
> initcall crypto_kdf108_init+0x0/0x18d returned 0 after 1425640 usecs
>
> With sha256_ssse3 (using its AVX2 implementation):
> initcall sha256_ssse3_mod_init+0x0/0x1bf returned 0 after 12148 usecs
> ...
> initcall crypto_kdf108_init+0x0/0x153 returned 0 after 382799 usecs
>
> That's controlled by CONFIG_CRYPTO_SHA256_SSSE3.

The test is performed at kernel boot time with whatever implementation is
available - the self-test uses "hmac(sha256)". If the AVX2 implementation is
not registered with the kernel crypto API at that time, it will not be
available for use. But it is not possible to hard-code the use of the AVX
implementation or any other implementation, as it is not guaranteed to be
present.

The issue would be alleviated if it went through alg_test(), though.



Ciao
Stephan