2022-05-19 05:45:59

by Sandy Harris

Subject: [RFC] random: use blake2b instead of blake2s?

The original random.c used a 4k-bit input pool, I think mainly because
sometimes (e.g. for a large RSA key) we might need up to 4k bits of
high-grade output. The current driver uses only a 512-bit hash
context, basically a Yarrow-like design. This is quite plausible since
we trust both the hash and the chacha output mechanism, but it seems
to me there are open questions.

One is that the Yarrow designers no longer support it; they have moved
on to a more complex design called Fortuna, so one might wonder if the
driver should use some Fortuna-like design. I'd say no; we do not need
extra complexity in the kernel & it is not clear there'd be a large
advantage.

Similarly, there's a Blake 3 that might replace Blake 2 in the driver;
the designers say it is considerably faster. I regard that as an open
question, but will not address it here.

What I do want to address is that the Yarrow paper says the
cryptographic strength of the output is at most the size of the hash
context, 160 bits for them & 512 bits for our current driver. Or,
since we use only 256 bits to rekey, our strength might be only 256
bits. These
numbers are likely adequate, but if we can increase them easily, why
not?

Blake comes in two variants, blake2s and blake2b; presumably b and s
are for big & small. The kernel crypto library has both & the driver
currently uses 2s. 2s has a 512-bit context (16 32-bit words) and can
give up to 256 bits of output; 2b has a 1024-bit context (16 64-bit
words) and can give up to 512.
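
To make the size difference concrete, here is a rough sketch in the
shape of the driver's code, not an actual patch. The blake2s_* calls
are the real <crypto/blake2s.h> library interface that random.c
already uses; the blake2b_* names and <crypto/blake2b.h> are
assumptions, a mirror-image library interface that would have to
exist (or be added) for the switch.

#include <crypto/blake2s.h>	/* 512-bit state, digest up to 256 bits */
/* #include <crypto/blake2b.h>	   1024-bit state, digest up to 512 bits (assumed) */

static struct blake2s_state pool_2s;	/* current: 16 x 32-bit words */
/* static struct blake2b_state pool_2b;	   proposed: 16 x 64-bit words */

static void mix_and_extract(const u8 *in, size_t len,
			    u8 key[BLAKE2S_HASH_SIZE])
{
	blake2s_update(&pool_2s, in, len);	/* absorb input */
	blake2s_final(&pool_2s, key);		/* 256-bit extract; 2b would give 512 */
	blake2s_init(&pool_2s, BLAKE2S_HASH_SIZE);
	blake2s_update(&pool_2s, key, BLAKE2S_HASH_SIZE); /* restart pool from it */
}

The call pattern would stay the same with 2b; only the state and
digest sizes double.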

To me, it looks like switching to 2b would be an obvious improvement,
though not at all urgent. Benchmarks I've seen seem to say it is
faster on 64-bit CPUs and slower on 32-bit ones, but neither
difference is huge.


2022-05-19 06:54:11

by Sandy Harris

Subject: Re: [RFC] random: use blake2b instead of blake2s?

Sandy Harris <[email protected]> wrote:

> The original random.c used a 4k-bit input pool, ...

> Blake comes in two variants, blake2s and blake2b; ...
>
> To me, it looks like switching to 2b would be an obvious improvement,
> though not at all urgent.

I'd actually go a bit further and use 2k bits of input pool:
two blake2b contexts, probably with inputs alternating
between them. It would also be possible to feed each
input into both pools; a rough sketch follows.
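
Something like this, in the shape of kernel C but with assumed
names: the blake2b_* calls mirror the existing <crypto/blake2s.h>
library API (blake2b_init/update/final on a struct blake2b_state)
and are not a current kernel interface.

static struct blake2b_state pool[2];	/* two 1024-bit contexts = 2k bits of pool */
static unsigned int in_toggle;		/* which pool takes the next input */

static void mix_input(const u8 *in, size_t len)
{
	/* alternate inputs between the two pools ... */
	blake2b_update(&pool[in_toggle], in, len);
	in_toggle ^= 1;

	/*
	 * ... or, the other option, put every input into both:
	 *
	 *	blake2b_update(&pool[0], in, len);
	 *	blake2b_update(&pool[1], in, len);
	 */
}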

For output, have a flip-flop variable and alternate between
the pools with some sequence like:
: mix some extra entropy into pool
: generate 512 bits output
: mix that back into the other 2b context
: run 8-round chacha on that output
: mix the chacha result into the chacha context

Mixing output from one context into the other ties the
two together so in effect we have a 2k-bit input pool.
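
As code, the output path might look roughly like this, using pool[]
from the sketch above; it is only an illustration of the sequence,
with the same caveat about the assumed blake2b_* library API.
chacha8_block() and crng_rekey() are hypothetical helpers standing
in for an 8-round chacha block function and for folding new key
material into the chacha output context.

static unsigned int out_flip;		/* flip-flop between the two pools */

static void rekey_output(const u8 *extra, size_t extra_len)
{
	u8 buf[64];			/* 512 bits of blake2b output */
	u8 whitened[64];
	unsigned int p = out_flip;

	out_flip ^= 1;

	/* mix some extra entropy into the pool used this time */
	blake2b_update(&pool[p], extra, extra_len);

	/* generate 512 bits of output, then restart that pool from it */
	blake2b_final(&pool[p], buf);
	blake2b_init(&pool[p], sizeof(buf));
	blake2b_update(&pool[p], buf, sizeof(buf));

	/* mix the same output into the other 2b context, tying the
	 * two pools together into an effective 2k-bit input pool */
	blake2b_update(&pool[p ^ 1], buf, sizeof(buf));

	/* run 8-round chacha over the blake output ... */
	chacha8_block(buf, whitened);		/* hypothetical helper */

	/* ... and mix only the chacha8 result into the chacha output
	 * context, so raw blake output never reaches it directly */
	crng_rekey(whitened);			/* hypothetical helper */

	memzero_explicit(buf, sizeof(buf));
	memzero_explicit(whitened, sizeof(whitened));
}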

Chacha is designed to be non-invertible, so even the
8-round instance prevents a rather unlikely
attack. Even if an enemy manages to get the
chacha state & infer some of the rekeying inputs,
they do not get direct access to blake output.
They would need to repeatedly break chacha8
to get any data that might let them attack blake.