2019-04-02 22:02:09

by Neil Horman

Subject: [PATCH] Fix xoring of arch_get_random_long into crng->state array

When _extract_crng is called on an arch that has a registered
arch_get_random_long method, it attempts to mix an unsigned long value
into the crng->state buffer. However, it only mixes in 32 of the 64 bits
available, because the state buffer is an array of u32 values, even
though two u32 entries are expected to be filled (indexes 14 and 15).

Bring the behavior into line with that expectation by casting the address
of index 14 to an unsigned long pointer, and xoring that in instead.

Tested successfully by myself

Signed-off-by: Neil Horman <[email protected]>
Reported-by: Steve Grubb <[email protected]>
CC: "Theodore Ts'o" <[email protected]>
CC: Arnd Bergmann <[email protected]>
CC: Greg Kroah-Hartman <[email protected]>
---
drivers/char/random.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/char/random.c b/drivers/char/random.c
index 38c6d1af6d1c..8178618458ac 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -975,14 +975,16 @@ static void _extract_crng(struct crng_state *crng,
 			  __u8 out[CHACHA_BLOCK_SIZE])
 {
 	unsigned long v, flags;
-
+	unsigned long *archrnd;
 	if (crng_ready() &&
 	    (time_after(crng_global_init_time, crng->init_time) ||
 	     time_after(jiffies, crng->init_time + CRNG_RESEED_INTERVAL)))
 		crng_reseed(crng, crng == &primary_crng ? &input_pool : NULL);
 	spin_lock_irqsave(&crng->lock, flags);
-	if (arch_get_random_long(&v))
-		crng->state[14] ^= v;
+	if (arch_get_random_long(&v)) {
+		archrnd = (unsigned long *)&crng->state[14];
+		*archrnd ^= v;
+	}
 	chacha20_block(&crng->state[0], out);
 	if (crng->state[12] == 0)
 		crng->state[13]++;
--
2.20.1


2019-05-29 13:46:01

by Neil Horman

Subject: Re: [PATCH] Fix xoring of arch_get_random_long into crng->state array

On Tue, Apr 02, 2019 at 06:00:25PM -0400, Neil Horman wrote:
> When _crng_extract is called, any arch that has a registered
> arch_get_random_long method, attempts to mix an unsigned long value into
> the crng->state buffer, it only mixes in 32 of the 64 bits available,
> because the state buffer is an array of u32 values, even though 2 u32
> are expected to be filled (owing to the fact that it expects indexes 14
> and 15 to be filled).
>
> Bring the expected behavior into alignment by casting index 14 to an
> unsigned long pointer, and xoring that in instead.
>
> Tested successfully by myself
>
> Signed-off-by: Neil Horman <[email protected]>
> Reported-by: Steve Grubb <[email protected]>
> CC: "Theodore Ts'o" <[email protected]>
> CC: Arnd Bergmann <[email protected]>
> CC: Greg Kroah-Hartman <[email protected]>
> ---
> drivers/char/random.c | 8 +++++---
> 1 file changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/char/random.c b/drivers/char/random.c
> index 38c6d1af6d1c..8178618458ac 100644
> --- a/drivers/char/random.c
> +++ b/drivers/char/random.c
> @@ -975,14 +975,16 @@ static void _extract_crng(struct crng_state *crng,
> __u8 out[CHACHA_BLOCK_SIZE])
> {
> unsigned long v, flags;
> -
> + unsigned long *archrnd;
> if (crng_ready() &&
> (time_after(crng_global_init_time, crng->init_time) ||
> time_after(jiffies, crng->init_time + CRNG_RESEED_INTERVAL)))
> crng_reseed(crng, crng == &primary_crng ? &input_pool : NULL);
> spin_lock_irqsave(&crng->lock, flags);
> - if (arch_get_random_long(&v))
> - crng->state[14] ^= v;
> + if (arch_get_random_long(&v)) {
> + archrnd = (unsigned long *)&crng->state[14];
> + *archrnd ^= v;
> + }
> chacha20_block(&crng->state[0], out);
> if (crng->state[12] == 0)
> crng->state[13]++;
> --
> 2.20.1
>
>

Ping, Arnd, Ted, Greg, any comment here?
Neil

2019-05-29 13:53:06

by David Laight

Subject: RE: [PATCH] Fix xoring of arch_get_random_long into crng->state array

From: Neil Horman
> Sent: 29 May 2019 14:42
> On Tue, Apr 02, 2019 at 06:00:25PM -0400, Neil Horman wrote:
> > When _crng_extract is called, any arch that has a registered
> > arch_get_random_long method, attempts to mix an unsigned long value into
> > the crng->state buffer, it only mixes in 32 of the 64 bits available,
> > because the state buffer is an array of u32 values, even though 2 u32
> > are expected to be filled (owing to the fact that it expects indexes 14
> > and 15 to be filled).
> >
> > Bring the expected behavior into alignment by casting index 14 to an
> > unsigned long pointer, and xoring that in instead.
...
> > diff --git a/drivers/char/random.c b/drivers/char/random.c
> > index 38c6d1af6d1c..8178618458ac 100644
> > --- a/drivers/char/random.c
> > +++ b/drivers/char/random.c
> > @@ -975,14 +975,16 @@ static void _extract_crng(struct crng_state *crng,
> > __u8 out[CHACHA_BLOCK_SIZE])
> > {
> > unsigned long v, flags;
> > -
> > + unsigned long *archrnd;
> > if (crng_ready() &&
> > (time_after(crng_global_init_time, crng->init_time) ||
> > time_after(jiffies, crng->init_time + CRNG_RESEED_INTERVAL)))
> > crng_reseed(crng, crng == &primary_crng ? &input_pool : NULL);
> > spin_lock_irqsave(&crng->lock, flags);
> > - if (arch_get_random_long(&v))
> > - crng->state[14] ^= v;
> > + if (arch_get_random_long(&v)) {
> > + archrnd = (unsigned long *)&crng->state[14];
> > + *archrnd ^= v;
> > + }

Isn't that likely to generate a misaligned memory access?

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

2019-05-29 15:55:02

by Neil Horman

Subject: Re: [PATCH] Fix xoring of arch_get_random_long into crng->state array

On Wed, May 29, 2019 at 01:51:24PM +0000, David Laight wrote:
> From: Neil Horman
> > Sent: 29 May 2019 14:42
> > On Tue, Apr 02, 2019 at 06:00:25PM -0400, Neil Horman wrote:
> > > When _crng_extract is called, any arch that has a registered
> > > arch_get_random_long method, attempts to mix an unsigned long value into
> > > the crng->state buffer, it only mixes in 32 of the 64 bits available,
> > > because the state buffer is an array of u32 values, even though 2 u32
> > > are expected to be filled (owing to the fact that it expects indexes 14
> > > and 15 to be filled).
> > >
> > > Bring the expected behavior into alignment by casting index 14 to an
> > > unsigned long pointer, and xoring that in instead.
> ...
> > > diff --git a/drivers/char/random.c b/drivers/char/random.c
> > > index 38c6d1af6d1c..8178618458ac 100644
> > > --- a/drivers/char/random.c
> > > +++ b/drivers/char/random.c
> > > @@ -975,14 +975,16 @@ static void _extract_crng(struct crng_state *crng,
> > > __u8 out[CHACHA_BLOCK_SIZE])
> > > {
> > > unsigned long v, flags;
> > > -
> > > + unsigned long *archrnd;
> > > if (crng_ready() &&
> > > (time_after(crng_global_init_time, crng->init_time) ||
> > > time_after(jiffies, crng->init_time + CRNG_RESEED_INTERVAL)))
> > > crng_reseed(crng, crng == &primary_crng ? &input_pool : NULL);
> > > spin_lock_irqsave(&crng->lock, flags);
> > > - if (arch_get_random_long(&v))
> > > - crng->state[14] ^= v;
> > > + if (arch_get_random_long(&v)) {
> > > + archrnd = (unsigned long *)&crng->state[14];
> > > + *archrnd ^= v;
> > > + }
>
> Isn't that likely to generate a misaligned memory access?
>
I'm not quite sure how it would; crng->state is an array of u32s, so every
even element should be on a 64-bit boundary.

Neil

> David

2019-05-29 16:00:02

by David Laight

Subject: RE: [PATCH] Fix xoring of arch_get_random_long into crng->state array

From: Neil Horman [mailto:[email protected]]
> Sent: 29 May 2019 16:52
> On Wed, May 29, 2019 at 01:51:24PM +0000, David Laight wrote:
> > From: Neil Horman
> > > Sent: 29 May 2019 14:42
> > > On Tue, Apr 02, 2019 at 06:00:25PM -0400, Neil Horman wrote:
> > > > When _crng_extract is called, any arch that has a registered
> > > > arch_get_random_long method, attempts to mix an unsigned long value into
> > > > the crng->state buffer, it only mixes in 32 of the 64 bits available,
> > > > because the state buffer is an array of u32 values, even though 2 u32
> > > > are expected to be filled (owing to the fact that it expects indexes 14
> > > > and 15 to be filled).
> > > >
> > > > Bring the expected behavior into alignment by casting index 14 to an
> > > > unsigned long pointer, and xoring that in instead.
> > ...
> > > > diff --git a/drivers/char/random.c b/drivers/char/random.c
> > > > index 38c6d1af6d1c..8178618458ac 100644
> > > > --- a/drivers/char/random.c
> > > > +++ b/drivers/char/random.c
> > > > @@ -975,14 +975,16 @@ static void _extract_crng(struct crng_state *crng,
> > > > __u8 out[CHACHA_BLOCK_SIZE])
> > > > {
> > > > unsigned long v, flags;
> > > > -
> > > > + unsigned long *archrnd;
> > > > if (crng_ready() &&
> > > > (time_after(crng_global_init_time, crng->init_time) ||
> > > > time_after(jiffies, crng->init_time + CRNG_RESEED_INTERVAL)))
> > > > crng_reseed(crng, crng == &primary_crng ? &input_pool : NULL);
> > > > spin_lock_irqsave(&crng->lock, flags);
> > > > - if (arch_get_random_long(&v))
> > > > - crng->state[14] ^= v;
> > > > + if (arch_get_random_long(&v)) {
> > > > + archrnd = (unsigned long *)&crng->state[14];
> > > > + *archrnd ^= v;
> > > > + }
> >
> > Isn't that likely to generate a misaligned memory access?
> >
> I'm not quite sure how it would, crng->state is an array of _u32's, and so every
> even element should be on a 64 bit boundary.

Only if the first item is aligned....
Add a u32 before it and you'll probably flip the alignment.
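
The alignment flip can be checked with offsetof. What follows is a minimal userspace sketch, not kernel code: both structs and all helper names are hypothetical, and it assumes the struct itself starts on an 8-byte boundary. With the array first, state[14] sits at byte offset 56; with one u32 in front, it moves to offset 60, which is no longer a multiple of 8.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layouts for illustration; not the kernel's struct crng_state. */
struct state_first {
	uint32_t state[16];
	unsigned long init_time;
};

struct state_shifted {
	uint32_t extra;		/* one u32 placed in front of the array */
	uint32_t state[16];
	unsigned long init_time;
};

/* Byte offset of state[14] when the array is the first member. */
static size_t off14_first(void)
{
	return offsetof(struct state_first, state) + 14 * sizeof(uint32_t);
}

/* Byte offset of state[14] when a u32 precedes the array. */
static size_t off14_shifted(void)
{
	return offsetof(struct state_shifted, state) + 14 * sizeof(uint32_t);
}

/*
 * Returns 1 if an 8-byte access at this offset would be naturally
 * aligned, assuming the enclosing struct starts on an 8-byte boundary.
 */
static int aligned8(size_t off)
{
	return off % 8 == 0;
}
```

Calling aligned8(off14_first()) and aligned8(off14_shifted()) from a small test program shows the difference between the two layouts.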

David


2019-05-29 16:29:28

by Neil Horman

Subject: Re: [PATCH] Fix xoring of arch_get_random_long into crng->state array

On Wed, May 29, 2019 at 03:57:07PM +0000, David Laight wrote:
> From: Neil Horman [mailto:[email protected]]
> > Sent: 29 May 2019 16:52
> > On Wed, May 29, 2019 at 01:51:24PM +0000, David Laight wrote:
> > > From: Neil Horman
> > > > Sent: 29 May 2019 14:42
> > > > On Tue, Apr 02, 2019 at 06:00:25PM -0400, Neil Horman wrote:
> > > > > When _crng_extract is called, any arch that has a registered
> > > > > arch_get_random_long method, attempts to mix an unsigned long value into
> > > > > the crng->state buffer, it only mixes in 32 of the 64 bits available,
> > > > > because the state buffer is an array of u32 values, even though 2 u32
> > > > > are expected to be filled (owing to the fact that it expects indexes 14
> > > > > and 15 to be filled).
> > > > >
> > > > > Bring the expected behavior into alignment by casting index 14 to an
> > > > > unsigned long pointer, and xoring that in instead.
> > > ...
> > > > > diff --git a/drivers/char/random.c b/drivers/char/random.c
> > > > > index 38c6d1af6d1c..8178618458ac 100644
> > > > > --- a/drivers/char/random.c
> > > > > +++ b/drivers/char/random.c
> > > > > @@ -975,14 +975,16 @@ static void _extract_crng(struct crng_state *crng,
> > > > > __u8 out[CHACHA_BLOCK_SIZE])
> > > > > {
> > > > > unsigned long v, flags;
> > > > > -
> > > > > + unsigned long *archrnd;
> > > > > if (crng_ready() &&
> > > > > (time_after(crng_global_init_time, crng->init_time) ||
> > > > > time_after(jiffies, crng->init_time + CRNG_RESEED_INTERVAL)))
> > > > > crng_reseed(crng, crng == &primary_crng ? &input_pool : NULL);
> > > > > spin_lock_irqsave(&crng->lock, flags);
> > > > > - if (arch_get_random_long(&v))
> > > > > - crng->state[14] ^= v;
> > > > > + if (arch_get_random_long(&v)) {
> > > > > + archrnd = (unsigned long *)&crng->state[14];
> > > > > + *archrnd ^= v;
> > > > > + }
> > >
> > > Isn't that likely to generate a misaligned memory access?
> > >
> > I'm not quite sure how it would, crng->state is an array of _u32's, and so every
> > even element should be on a 64 bit boundary.
>
> Only if the first item is aligned....
> Add a u32 before it and you'll probably flip the alignment.
>
Sure (assuming no padding by the compiler of leading elements), but that's not
the case here; state is the first member of the struct. I suppose we could add
an __attribute__((aligned(8))) to the member if you think it would help.
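
As a rough sketch of what that attribute would buy (userspace C with the GCC/Clang attribute syntax; the struct and helper name are hypothetical, and a u32 is deliberately placed first to show that the attribute, rather than member order, provides the alignment):

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical struct for illustration; the kernel's struct crng_state
 * has different members. The leading u32 would normally push the array
 * to offset 4, but the attribute forces it onto an 8-byte boundary.
 */
struct demo_state {
	uint32_t extra;
	uint32_t state[16] __attribute__((aligned(8)));
};

/* Fails at compile time if state[14] is not on an 8-byte offset. */
_Static_assert((offsetof(struct demo_state, state) +
		14 * sizeof(uint32_t)) % 8 == 0,
	       "state[14] must be 8-byte aligned");

/* Offset of the array within the struct, padding included. */
static size_t demo_state_offset(void)
{
	return offsetof(struct demo_state, state);
}
```

The compile-time assertion documents the assumption that the pointer cast relies on, so a later reordering of members would break the build instead of silently misaligning the access. It only addresses alignment; whether the cast is otherwise desirable is a separate question.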

Neil

> David

2019-05-30 04:18:20

by Theodore Ts'o

Subject: Re: [PATCH] Fix xoring of arch_get_random_long into crng->state array

On Tue, Apr 02, 2019 at 06:00:25PM -0400, Neil Horman wrote:
> When _crng_extract is called, any arch that has a registered
> arch_get_random_long method, attempts to mix an unsigned long value into
> the crng->state buffer, it only mixes in 32 of the 64 bits available,
> because the state buffer is an array of u32 values, even though 2 u32
> are expected to be filled (owing to the fact that it expects indexes 14
> and 15 to be filled).

Index 15 does get initialized; in fact, it's changed each time
crng_reseed() is called.

The way things currently work is that we use state[12] and state[13]
as a 64-bit counter (it gets incremented each time we call
_extract_crng), and state[14] and state[15] are nonce values. After
crng->state has been in use for five minutes, we reseed the crng by
grabbing randomness from the input pool, and using that to initialize
state[4..15]. (State[0..3] are always set to the ChaCha20 constant of
"expand 32-byte k".)

If the CPU provides an RDRAND-like instruction (which can be the case
for x86, PPC, and S390), we xor it into state[14]. Whether we xor any
extra entropy into state[15], to be honest, really doesn't matter much.
I think I was trying to keep things simple, and it wasn't worth it to
call RDRAND twice on a 32-bit x86. (And there isn't an
arch_get_random_long_long. :-)
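
For reference, the word layout described above can be written down as a small userspace sketch. The enum and function names are hypothetical; the constant words are the standard little-endian encoding of "expand 32-byte k", and the counter helper folds the increment and carry into one step for brevity, which is an assumption of this sketch and not how the kernel splits the work.

```c
#include <stdint.h>

/* Index of the first word of each region in the 16-word ChaCha20 state. */
enum {
	CONST_FIRST = 0,	/* state[0..3]:   "expand 32-byte k" */
	KEY_FIRST   = 4,	/* state[4..11]:  256-bit key */
	CTR_FIRST   = 12,	/* state[12..13]: 64-bit block counter */
	NONCE_FIRST = 14,	/* state[14..15]: nonce words */
};

/* Fill in the four constant words. */
static void demo_init_constants(uint32_t state[16])
{
	state[CONST_FIRST + 0] = 0x61707865;	/* "expa" */
	state[CONST_FIRST + 1] = 0x3320646e;	/* "nd 3" */
	state[CONST_FIRST + 2] = 0x79622d32;	/* "2-by" */
	state[CONST_FIRST + 3] = 0x6b206574;	/* "te k" */
}

/* Advance the counter: state[12] is the low word, state[13] the high word. */
static void demo_bump_counter(uint32_t state[16])
{
	if (++state[CTR_FIRST] == 0)
		state[CTR_FIRST + 1]++;
}
```

The RDRAND value is xored into the first nonce word, state[NONCE_FIRST], which is why only that word changes between reseeds.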

Why do we do this at all? Well, the goal was to feed in some
contributing randomness from RDRAND when we turn the CRNG crank. (The
reason why we don't just XOR the RDRAND output into the output of the
CRNG is mainly to assuage critics who worry that a hypothetical RDRAND
backdoor has access to the CPU registers. So we perturb the inputs to the
CRNG, on the theory that if malicious firmware can reverse
CHACHA20... we've got bigger problems. :-) We get up to 20 bytes out
of a single turn of the CRNG crank, so whether we mix in 4 bytes or 8
bytes from RDRAND, we're never going to be depending on RDRAND
completely in any case.

The bottom line is that I'm not at all convinced it's worth the effort
to mix in 8 bytes versus 4 bytes from RDRAND. This is really a CRNG,
and the RDRAND inputs really don't change that.
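
To illustrate the 4-byte versus 8-byte difference being discussed, here is a minimal userspace sketch. The function names are hypothetical and uint64_t stands in for a 64-bit unsigned long. The first helper mirrors the existing `state[14] ^= v` statement, where the result is converted back to u32 so only the low half of v survives. The second shows one cast-free way to use all 64 bits, which sidesteps the alignment question entirely; it is not what the posted patch does, and on big-endian machines it places the halves differently than the pointer cast would.

```c
#include <stdint.h>

/*
 * Mirrors `state[14] ^= v` with a 64-bit v: the XOR result is converted
 * back to a 32-bit word, so only the low 32 bits of v are mixed in.
 */
static void demo_mix_low(uint32_t state[16], uint64_t v)
{
	state[14] ^= v;
}

/*
 * Cast-free alternative: low half into state[14], high half into
 * state[15]. No 64-bit memory access is made, so alignment of the
 * array does not matter.
 */
static void demo_mix_both(uint32_t state[16], uint64_t v)
{
	state[14] ^= (uint32_t)v;
	state[15] ^= (uint32_t)(v >> 32);
}
```

Starting from an all-zero state and mixing in 0x1122334455667788, the first helper leaves state[15] untouched while the second spreads the value over both nonce words.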

- Ted

2019-05-30 11:17:41

by Neil Horman

Subject: Re: [PATCH] Fix xoring of arch_get_random_long into crng->state array

On Wed, May 29, 2019 at 11:12:01PM -0400, Theodore Ts'o wrote:
> On Tue, Apr 02, 2019 at 06:00:25PM -0400, Neil Horman wrote:
> > When _crng_extract is called, any arch that has a registered
> > arch_get_random_long method, attempts to mix an unsigned long value into
> > the crng->state buffer, it only mixes in 32 of the 64 bits available,
> > because the state buffer is an array of u32 values, even though 2 u32
> > are expected to be filled (owing to the fact that it expects indexes 14
> > and 15 to be filled).
>
> Index 15 does get initialized; in fact, it's changed each time
> crng_reseed() is called.
>
> The way things currently work is that we use state[12] and state[13]
> as 64-bit counter (it gets incremented each time we call
> _extract_crng), and state[14] and state[15] are nonce values. After
> crng->state has been in use for five minutes, we reseed the crng by
> grabbing randomness from the input pool, and using that to initialize
> state[4..15]. (State[0..3] are always set to the ChaCha20 constant of
> "expand 32-byte k".)
>
> If the CPU provides an RDRAND-like instruction (which can be the case
> for x86, PPC, and S390), we xor it into state[14]. Whether we xor any
> extra entropy into state[15] to be honest, really doesn't matter much.
> I think I was trying to keep things simple, and it wasn't worth it to
> call RDRAND twice on a 32-bit x86. (And there isn't an
> arch_get_random_long_long. :-)
>
> Why do we do this at all? Well, the goal was to feed in some
> contributing randomness from RDRAND when we turn the CRNG crank. (The
> reason why we don't just XOR in the RDRAND into the output of the
> CRNG is mainly to assuage critics that hypothetical RDRAND backdoor
> has access to the CPU registers. So we perturb the inputs to the
> CRNG, on the theory that if malicious firmware can reverse
> CHACHA20... we've got bigger problems. :-) We get up to 20 bytes out
> of a single turn of the CRNG crank, so whether we mix in 4 bytes or 8
> bytes from RDRAND, we're never going to be depending on RDRAND
> completely in any case.
>
> The bottom line is that I'm not at all convinced it's worth the effort
> to mix in 8 bytes versus 4 bytes from RDRAND. This is really a CRNG,
> and the RDRAND inputs really don't change that.
>
Ok, so what I'm getting is that the exclusion of the second 32-bit word here,
&crng->state[15], isn't an oversight; it's just skipped because it's not
worth taking the time for the extra write there, and this is not a bug. I'm ok
with that.

Thanks for the explanation
Neil

> - Ted
>