Hi Jason,
On Tue, Jan 11, 2022 at 2:49 PM Jason A. Donenfeld <[email protected]> wrote:
> Re-wind the loops entirely on kernels optimized for code size. This is
> really not good at all performance-wise. But on m68k, it shaves off 4k
> of code size, which is apparently important.
On arm32:
add/remove: 1/0 grow/shrink: 0/1 up/down: 160/-4212 (-4052)
Function                         old     new   delta
blake2s_sigma                      -     160    +160
blake2s_compress_generic        4872     660   -4212
Total: Before=9846148, After=9842096, chg -0.04%
On arm64:
add/remove: 1/2 grow/shrink: 0/1 up/down: 160/-4584 (-4424)
Function                         old     new   delta
blake2s_sigma                      -     160    +160
e843419@0710_00007634_e8a0         8       -      -8
e843419@0441_0000423a_178c         8       -      -8
blake2s_compress_generic        5088     520   -4568
Total: Before=32800278, After=32795854, chg -0.01%
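For anyone following along, here is a rough sketch of what "re-winding" the rounds looks like. This is not the patch itself, only an illustration under assumptions: the constants and round structure are taken from RFC 7693, and the helper names (g, blake2s_compress_rolled, blake2s_single_block) are made up for this example. The new 160-byte blake2s_sigma symbol above is consistent with a 10 x 16 table of one-byte message indices, which the rolled loops have to look up at run time, where unrolled code can bake the indices in.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* BLAKE2s initialization vector (RFC 7693). */
static const uint32_t blake2s_iv[8] = {
	0x6A09E667, 0xBB67AE85, 0x3C6EF372, 0xA54FF53A,
	0x510E527F, 0x9B05688C, 0x1F83D9AB, 0x5BE0CD19
};

/* Message schedule: 10 rounds x 16 one-byte indices = 160 bytes. */
static const uint8_t blake2s_sigma[10][16] = {
	{ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15 },
	{ 14, 10, 4, 8, 9, 15, 13, 6, 1, 12, 0, 2, 11, 7, 5, 3 },
	{ 11, 8, 12, 0, 5, 2, 15, 13, 10, 14, 3, 6, 7, 1, 9, 4 },
	{ 7, 9, 3, 1, 13, 12, 11, 14, 2, 6, 5, 10, 4, 0, 15, 8 },
	{ 9, 0, 5, 7, 2, 4, 10, 15, 14, 1, 11, 12, 6, 8, 3, 13 },
	{ 2, 12, 6, 10, 0, 11, 8, 3, 4, 13, 7, 5, 15, 14, 1, 9 },
	{ 12, 5, 1, 15, 14, 13, 4, 10, 0, 7, 6, 3, 9, 2, 8, 11 },
	{ 13, 11, 7, 14, 12, 1, 3, 9, 5, 0, 15, 4, 8, 6, 2, 10 },
	{ 6, 15, 14, 9, 11, 3, 0, 8, 12, 2, 13, 7, 1, 4, 10, 5 },
	{ 10, 2, 8, 4, 7, 6, 1, 5, 15, 11, 9, 14, 3, 12, 13, 0 },
};

static uint32_t ror32(uint32_t x, unsigned int n)
{
	return (x >> n) | (x << (32 - n));
}

/* One G mixing step on four state words and two message words. */
static void g(uint32_t v[16], int a, int b, int c, int d,
	      uint32_t x, uint32_t y)
{
	v[a] += v[b] + x; v[d] = ror32(v[d] ^ v[a], 16);
	v[c] += v[d];     v[b] = ror32(v[b] ^ v[c], 12);
	v[a] += v[b] + y; v[d] = ror32(v[d] ^ v[a], 8);
	v[c] += v[d];     v[b] = ror32(v[b] ^ v[c], 7);
}

/*
 * Compress one 64-byte block with fully rolled rounds (hypothetical helper).
 * t is the low word of the byte counter; the high word is assumed to be 0.
 */
static void blake2s_compress_rolled(uint32_t h[8], const uint8_t block[64],
				    uint32_t t, int last)
{
	uint32_t m[16], v[16];
	int i, j, r;

	for (i = 0; i < 16; i++)	/* little-endian message load */
		m[i] = (uint32_t)block[4 * i] |
		       (uint32_t)block[4 * i + 1] << 8 |
		       (uint32_t)block[4 * i + 2] << 16 |
		       (uint32_t)block[4 * i + 3] << 24;
	for (i = 0; i < 8; i++) {
		v[i] = h[i];
		v[i + 8] = blake2s_iv[i];
	}
	v[12] ^= t;
	if (last)
		v[14] = ~v[14];		/* finalization flag f0 */

	for (r = 0; r < 10; r++) {
		const uint8_t *s = blake2s_sigma[r];

		for (j = 0; j < 4; j++)	/* column step */
			g(v, j, 4 + j, 8 + j, 12 + j,
			  m[s[2 * j]], m[s[2 * j + 1]]);
		for (j = 0; j < 4; j++)	/* diagonal step */
			g(v, j, 4 + ((j + 1) & 3), 8 + ((j + 2) & 3),
			  12 + ((j + 3) & 3),
			  m[s[8 + 2 * j]], m[s[9 + 2 * j]]);
	}
	for (i = 0; i < 8; i++)
		h[i] ^= v[i] ^ v[i + 8];
}

/* Unkeyed BLAKE2s-256 of a message of at most 64 bytes; state left in h[]. */
static void blake2s_single_block(uint32_t h[8], const void *msg, size_t len)
{
	uint8_t block[64] = { 0 };

	memcpy(h, blake2s_iv, sizeof(blake2s_iv));
	h[0] ^= 0x01010020;	/* digest length 32, no key, fanout 1, depth 1 */
	memcpy(block, msg, len);
	blake2s_compress_rolled(h, block, (uint32_t)len, 1);
}
```

Calling blake2s_single_block(h, "abc", 3) should leave the RFC 7693 test-vector state in h[]. The trade-off is the one described in the commit message: a single copy of G and a table lookup per message word keep the code small, but the compiler can no longer schedule each round with constant indices.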
> Signed-off-by: Jason A. Donenfeld <[email protected]>
For the size reduction:
Tested-by: Geert Uytterhoeven <[email protected]>
Gr{oetje,eeting}s,
Geert
--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]
In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
Hi Geert,
Thanks for testing this. However, I've *abandoned* this patch because of
its unacceptable performance hit, and because it turns out we can get
basically the same size reduction with a smaller hit by modifying the
obsolete sha1 implementation instead.
Herbert - please do not apply this patch. Instead, later versions of
this patchset (e.g. v3 [1] and potentially later if it comes to that)
are what should be applied.
Jason
[1] https://lore.kernel.org/linux-crypto/[email protected]/