2013-10-14 13:38:38

by Sandy Harris

[permalink] [raw]
Subject: Fwd: [PATCH] CPU Jitter RNG: inclusion into kernel crypto API and /dev/random

Stephan Mueller <[email protected]> wrote:

>>Paper has:
>>
>>" the time delta is partitioned into chunks of 1 bit starting at the
>>lowest bit " .... The 64 1 bit chunks of the time value are XORed with
>>each other to " form a 1 bit value.
>>
>>As I read that, you are just taking the parity. Why not use that
>>simpler description & possibly one of several possible optimised
>>algorithms for the task:
>>http://graphics.stanford.edu/~seander/bithacks.html
>
> I am fully aware that the bit operation is inefficient. Yet it is
> deliberately inefficient, because that "folding loop" performs actual
> work for the RNG (the collapse of 64 bits into one bit) and at the very
> same time, it is the fixed instruction set over which I measure the time
> variations.
>
> Thus, the folding loop can be considered as the entropy source ...
>
> As the RNG is not primarily about speed, the folding operation should
> stay inefficient.

OK, that makes sense.

>>If what you are doing is not a parity computation, then you need a
>>better description so people like me do not misread it.
>
> It is not a parity computation that the folding loop performs. The code
> XORs each individual bit of the 64 bit stream with each other, whereas
> your cited document implies an ANDing of the bits (see section
> "Computing parity the naive way" of the cited document).

No. The AND is used in a trick; x&(x-1) gives a value with exactly
one bit set, the lowest bit set in x. The code there just counts that
way for efficiency.

Parity asks whether the number of set bits is odd or even. For
example this is another way to find the parity of x.

for( p = 0; x ; x >>= 1 )
p ^= (x&1) ;

>From your description (I haven't looked at the code) you are
computing parity. If so, say that. If not, explain.

>>This appears to be missing the cryptographically strong
>>mixing step which most RNG code includes. If that is
>>what you are doing, you need to provide some strong
>>arguments to justify it.
>
> The key here is that there is no cryptographic function needed for
> mixing as in fact I am not really mixing things. I am simply
> concatenating the new bit to the previously obtained bits. That is it.
>
> The argument for doing that is that the time deltas are independent of
> each other. ...

> ... each bit from the folding operation therefore contains
> one bit of entropy. So, why should I mix it even further with some
> crypto function?

That does make sense, but in security code I tend to prefer a
belt-and-suspenders approach. Even believing that each
individual bit is truly random, I'd still mix some just in case.

> Can you please help me understand why you think that a whitening
> function (cryptographic or not) is needed in the case of the CPU Jitter
> RNG, provided that I can show that each individual bit coming from the
> folding operation has one bit of entropy?

Basically, sheer paranoia. I'd mix and whiten just on general
principles. Since efficiency is not a large concern, there is little
reason not to.

On the other hand, most RNGs use a hash because they need
to distill some large amount of low-entropy input into a smaller
high-entropy output. With high input entropy, you do not need
the hash and can choose some cheaper mixer.

>>> I will present the RNG at the Linux Symposium in Ottawa this year.
>>> ....
>>
>>I live in Ottawa, ...
>
> As mentioned before, I would really like to meet you there to have a cup
> of coffee over that matter.

Sounds good. Ted, will you be around?


2013-10-14 14:12:41

by Stephan Müller

[permalink] [raw]
Subject: Re: Fwd: [PATCH] CPU Jitter RNG: inclusion into kernel crypto API and /dev/random

Am Montag, 14. Oktober 2013, 09:38:34 schrieb Sandy Harris:

Hi Sandy,

>Stephan Mueller <[email protected]> wrote:
>
>>>If what you are doing is not a parity computation, then you need a
>>>better description so people like me do not misread it.
>>>
>> It is not a parity computation that the folding loop performs. The
>> code XORs each individual bit of the 64 bit stream with each other,
>> whereas your cited document implies an ANDing of the bits (see
>> section "Computing parity the naive way" of the cited document).
>
>No. The AND is used in a trick; x&(x-1) gives a value with exactly
>one bit set, the lowest bit set in x. The code there just counts that
>way for efficiency.
>
>Parity asks whether the number of set bits is odd or even. For
>example this is another way to find the parity of x.
>
> for( p = 0; x ; x >>= 1 )
> p ^= (x&1) ;
>
>From your description (I haven't looked at the code) you are
>computing parity. If so, say that. If not, explain.

I re-read the referenced documentation. Yes, what I do is a parity
calculation. But I do not want to think that way, although technically
it is.

The reason and the goal of the implementation is the following: To
maintain a mathematical model in which the entropy is preserved, I can
only perform the following operations:

1. Use of concatenation which in terms of entropy implies that two
values concatenated with each other add the entropy the two values
contain. For example: string a contains 4 bits of entropy, string b
contains 6 bits of entropy. Now, when concatenating both, I get 10 bits
of entropy in the combined string.

2. The use of XOR implies that the resulting value has the maximum
entropy of the two individual strings. With the same example above, a
XOR b implies that the resulting string has 6 bits of entropy.

Any other operation will loose entropy in a way that is not easily to be
modeled.

Now, back to the folding loop: I know that the lower bits have some
entropy. When collapsing them into one single bit, I "slice" the 64 bit
timer value into 64 1 bit values and XOR the 64 bit values together. The
goal is to preserve the entropy each bit holds.

(PS: I am aware that in case none of the individual bits would contain
one full bit of entropy, the folding operation may --mathematically
spoken-- not deliver one full bit of entropy. However, after speaking
with a mathematician, that slight inconsistency is ok, if I can show
that the distribution of the folded bit is 50% zeros and 50% ones. That
is what I did in section 5.2.1. Thus, the conclusion is that I receive
one bit of entropy after the folding loop.)

Yes, that is equal to parity calculation. But the reference to a parity
calculation is not defined when you want to apply a mathematical model
to entropy maintenance. Thus, I would rather like to have a consistent
design description that I can easily apply to the entropy discussion.

>
>>>This appears to be missing the cryptographically strong
>>>mixing step which most RNG code includes. If that is
>>>what you are doing, you need to provide some strong
>>>arguments to justify it.
>>>
>> The key here is that there is no cryptographic function needed for
>> mixing as in fact I am not really mixing things. I am simply
>> concatenating the new bit to the previously obtained bits. That is
>> it.
>>
>> The argument for doing that is that the time deltas are independent
>> of
>> each other. ...
>>
>> ... each bit from the folding operation therefore contains
>> one bit of entropy. So, why should I mix it even further with some
>> crypto function?
>
>That does make sense, but in security code I tend to prefer a
>belt-and-suspenders approach. Even believing that each
>individual bit is truly random, I'd still mix some just in case.

I am with you on the potential for further mixing. However, please note
that the standard hashes are all surjective and not bijective. Thus,
they implicitly loose entropy (albeit it is not much). Therefore, a good
mixing function would be a symmetric cipher.

Also, I tried to have my code portable to a large variety of target
systems. All of them have crypto libraries with a great number of cipher
implementation, but which suffer from the lack of entropy. Thus, I
concluded that I do not want to re-invent the wheel by reimplementing
some cipher, but only provide what is missing: a decentral entropy
source that delivers entropy on demand in kernel and user space.

Who would be a user of the CPU Jitter RNG? I hardly believe that a
normal end user would use it, but rather it would be provided via some
crypto lib. And the authors of a crypto lib surely know how to handle
the seed source (I hope).

This all is the reason why I implemented the different links to various
crypto libraries, where the kernel crypto API is one of it. The CPU
Jitter RNG is no a stand-alone system, but always linked with a DRNG
that is frequently seeded by the CPU Jitter RNG.
>
>> Can you please help me understand why you think that a whitening
>> function (cryptographic or not) is needed in the case of the CPU
>> Jitter RNG, provided that I can show that each individual bit coming
>> from the folding operation has one bit of entropy?
>
>Basically, sheer paranoia. I'd mix and whiten just on general
>principles. Since efficiency is not a large concern, there is little
>reason not to.

I am ok with that, when you consider the surjective/bijective argument
above. Still, as mentioned above, I think that will be covered by the
"users" of the CPU jitter RNG.
>
>On the other hand, most RNGs use a hash because they need
>to distill some large amount of low-entropy input into a smaller
>high-entropy output. With high input entropy, you do not need
>the hash and can choose some cheaper mixer.

IMHO, that decision is left to the user of the code. Some users are
forced to use some DRNGs (e.g. FIPS comes to mind). Thus, I do not want
to do things that will need to be re-done in a slightly different way
again. This is what I see as inefficient complexity that I truly want to
avoid.
>
>>>> I will present the RNG at the Linux Symposium in Ottawa this year.
>>>> ....
>>>
>>>I live in Ottawa, ...
>>>
>> As mentioned before, I would really like to meet you there to have a
>> cup of coffee over that matter.
>
>Sounds good. Ted, will you be around?


Thanks
Stephan

2013-10-14 14:14:03

by Sandy Harris

[permalink] [raw]
Subject: Re: [PATCH] CPU Jitter RNG: inclusion into kernel crypto API and /dev/random

On Mon, Oct 14, 2013 at 9:38 AM, Sandy Harris <[email protected]> wrote:

> Stephan Mueller <[email protected]> wrote:

>> Can you please help me understand why you think that a whitening
>> function (cryptographic or not) is needed in the case of the CPU Jitter
>> RNG, provided that I can show that each individual bit coming from the
>> folding operation has one bit of entropy?
>
> Basically, sheer paranoia. I'd mix and whiten just on general
> principles. Since efficiency is not a large concern, there is little
> reason not to.
>
> On the other hand, most RNGs use a hash because they need
> to distill some large amount of low-entropy input into a smaller
> high-entropy output. With high input entropy, you do not need
> the hash and can choose some cheaper mixer.

You could use strong mixing/whitening:

Feed into random(4) and let it do the mixing.

Use some early outputs from your RNG to key an AES
instance. Then encrypt later outputs; this gives a 64 in 64
out mixer that is cryptographically strong but perhaps a bit
slow in the context.

Alternately, quite a few plausible components for fast cheap
mixing are readily available.

The Aria cipher has one that is 128 in 128 out. It multiplies
a 128-bit object by a fixed Boolean matrix, makes every
output bit depend on many input bits. It is fairly cheap,
used in every round and the cipher is acceptably fast.

The column transform from AES is 32 in 32 out and makes
every output byte depend on every input byte. It is fast; has
to be since it is used four times in every round.

A two-way pseudo-Hadamard transform (PHT) is 2n bits in
and 2n out, requires only two additions, makes both n-bit
outputs depend on both inputs.

PHT can be applied recursively to mix 4n, 8n, ...

My QHT is 32 in 32 out, makes every /bit/ of output
depend on every bit of input. It is a tad expensive;
two multiplications & two modulo operations. File
qht.c at:
ftp://ftp.cs.sjtu.edu.cn:990/sandy/maxwell/

To mix 64 bits, I'd use two qht() calls to mix the 32-bit
halves then a two-way PHT.

2013-10-14 14:26:29

by Stephan Müller

[permalink] [raw]
Subject: Re: Fwd: [PATCH] CPU Jitter RNG: inclusion into kernel crypto API and /dev/random

Am Montag, 14. Oktober 2013, 16:12:24 schrieb Stephan Mueller:

Hi Sandy,

>
>(PS: I am aware that in case none of the individual bits would contain
>one full bit of entropy, the folding operation may --mathematically
>spoken-- not deliver one full bit of entropy. However, after speaking
>with a mathematician, that slight inconsistency is ok, if I can show
>that the distribution of the folded bit is 50% zeros and 50% ones. That
>is what I did in section 5.2.1. Thus, the conclusion is that I receive
>one bit of entropy after the folding loop.)

One followup on this issue: if one believes that he has a problem with
that consideration, he can initialize the CPU Jitter RNG with an
oversampling rate. That rate simply performs the folding operation 64
times oversampling rate.

To fill the entropy pool completely once, the RNG needs 64 time deltas
which are folded into the one bit. All the oversampling rate now does is
to calculate more than once the complete entropy pool.

For example, when you set the oversampling rate to 2, you need twice as
long for the random value as each bit the random value is calculated
twice. And the two independent 64 bit random values are simply XORed
together.

You can set the oversampling rate to any value above 1.

Ciao
Stephan

2013-10-14 14:41:09

by Stephan Müller

[permalink] [raw]
Subject: Re: [PATCH] CPU Jitter RNG: inclusion into kernel crypto API and /dev/random

Am Montag, 14. Oktober 2013, 10:14:00 schrieb Sandy Harris:

Hi Sandy,

>On Mon, Oct 14, 2013 at 9:38 AM, Sandy Harris <[email protected]>
wrote:
>> Stephan Mueller <[email protected]> wrote:
>>> Can you please help me understand why you think that a whitening
>>> function (cryptographic or not) is needed in the case of the CPU
>>> Jitter RNG, provided that I can show that each individual bit
>>> coming from the folding operation has one bit of entropy?
>>
>> Basically, sheer paranoia. I'd mix and whiten just on general
>> principles. Since efficiency is not a large concern, there is little
>> reason not to.
>>
>> On the other hand, most RNGs use a hash because they need
>> to distill some large amount of low-entropy input into a smaller
>> high-entropy output. With high input entropy, you do not need
>> the hash and can choose some cheaper mixer.
>
>You could use strong mixing/whitening:
>
>Feed into random(4) and let it do the mixing.

That is exactly the goal with the patch found in patches/linux-3.9-
random.patch in the code distribution.

And that approach is exactly what I do in the linking code / patches for
other crypto libs:

- kernel crypto API

- OpenSSL (implementation as an Engine that uses the internal DRNGs or a
hook into RAND_poll that implements the seeding for the DRNGs)

- libgcrypt (hook into the seeding of the DRNGs)

>
>Use some early outputs from your RNG to key an AES
>instance. Then encrypt later outputs; this gives a 64 in 64
>out mixer that is cryptographically strong but perhaps a bit
>slow in the context.

That is exactly what the SP800-90A CTR DRBG or the X9.31 does. As these
DRNGs are available for different crypto libs, I am simply reusing them
with the crypto lib linking code.
>
>Alternately, quite a few plausible components for fast cheap
>mixing are readily available.

Thank you for the references. I have seen that in your maxwell(8)
documentation. But again, I do not re-invent the wheel with the CPU
Jitter RNG and therefore skipped the whitening step based on the reasons
above.

Another thing: when you start adding whitening functions, other people
are starting (and did -- thus I added section 4.3 to my documentation)
to complain that you hide your weaknesses behind the whiteners. I simply
want to counter that argument and show that RNG produces white noise
without a whitener.

Ciao
Stephan

2013-10-14 15:18:19

by Sandy Harris

[permalink] [raw]
Subject: Re: [PATCH] CPU Jitter RNG: inclusion into kernel crypto API and /dev/random

On Mon, Oct 14, 2013 at 10:40 AM, Stephan Mueller <[email protected]> wrote:

> Another thing: when you start adding whitening functions, other people
> are starting (and did -- thus I added section 4.3 to my documentation)
> to complain that you hide your weaknesses behind the whiteners. I simply
> want to counter that argument and show that RNG produces white noise
> without a whitener.

Yes, you absolutely have to test the unwhitened input entropy, and
provide a way for others to test it so they can have confidence in your
code and it can be tested again if it is going to be used on some new
host. You do a fine job of that; your paper has the most detailed
analysis I have seen. Bravo.

However, having done that, I see no reason not to add mixing.
Using bit() for getting one bit of input and rotl(x) for rotating
left one bit, your code is basically, with 64-bit x:

for( i=0, x = 0 ; i < 64; i++, x =rotl(x) )
x |= bit()

Why not declare some 64-bit constant C with a significant
number of bits set and do this:

for( i=0, x = 0 ; i < 64; i++, x =rotl(x) ) // same loop control
if( bit() ) x ^= C ;

This makes every output bit depend on many input bits
and costs almost nothing extra.

In the unlikely event that the overhead here matters,
your deliberately inefficient parity calculation in bit()
could easily be made faster to compensate.

2013-10-14 15:27:09

by Stephan Müller

[permalink] [raw]
Subject: Re: [PATCH] CPU Jitter RNG: inclusion into kernel crypto API and /dev/random

Am Montag, 14. Oktober 2013, 11:18:16 schrieb Sandy Harris:

Hi Sandy,

>On Mon, Oct 14, 2013 at 10:40 AM, Stephan Mueller <[email protected]>
wrote:
>> Another thing: when you start adding whitening functions, other
>> people
>> are starting (and did -- thus I added section 4.3 to my
>> documentation)
>> to complain that you hide your weaknesses behind the whiteners. I
>> simply want to counter that argument and show that RNG produces
>> white noise without a whitener.
>
>Yes, you absolutely have to test the unwhitened input entropy, and
>provide a way for others to test it so they can have confidence in your
>code and it can be tested again if it is going to be used on some new
>host. You do a fine job of that; your paper has the most detailed
>analysis I have seen. Bravo.

Thank you very much.
>
>However, having done that, I see no reason not to add mixing.
>Using bit() for getting one bit of input and rotl(x) for rotating
>left one bit, your code is basically, with 64-bit x:
>
> for( i=0, x = 0 ; i < 64; i++, x =rotl(x) )
> x |= bit()

Ok, let me play a bit with that. Maybe I can add another flag to the
allocation function so that the caller can decide whether to use that.
If the user is another RNG, you skip that mixing function, otherwise you
should take it.

But I need a whitening / mixing function that should not add more
dependencies on the underlying host. The code you have above would be
ok.
>
>Why not declare some 64-bit constant C with a significant

Which constant would you take? The CRC twist values? The SHA-1 initial
values? Or the first few from SHA-256?

>number of bits set and do this:
>
> for( i=0, x = 0 ; i < 64; i++, x =rotl(x) ) // same loop control
> if( bit() ) x ^= C ;
>
>This makes every output bit depend on many input bits
>and costs almost nothing extra.

Good point.
>
>In the unlikely event that the overhead here matters,
>your deliberately inefficient parity calculation in bit()
>could easily be made faster to compensate.


I will come back to you on this suggestion.


Ciao
Stephan

2013-10-14 15:46:21

by Sandy Harris

[permalink] [raw]
Subject: Re: [PATCH] CPU Jitter RNG: inclusion into kernel crypto API and /dev/random

On Mon, Oct 14, 2013 at 11:26 AM, Stephan Mueller <[email protected]> wrote:

>>Why not declare some 64-bit constant C with a significant
>
> Which constant would you take? The CRC twist values? The SHA-1 initial
> values? Or the first few from SHA-256?

The only essential requirement is that it not be something stupidly
regular like a 64-bit string 0x5555555555555555.

I'd pick an odd number so the low bit always changes, and a
constant with about half the bits set, maybe 24 < n < 40 or
some such. I'm not certain either of those is strictly required
but I'd do them anyway.

2013-10-14 21:33:16

by Sandy Harris

[permalink] [raw]
Subject: Re: [PATCH] CPU Jitter RNG: inclusion into kernel crypto API and /dev/random

Stephan Mueller <[email protected]> wrote:

> [quoting me]

>> ...your code is basically, with 64-bit x:
>>
>> for( i=0, x = 0 ; i < 64; i++, x =rotl(x) )
>> x |= bit()
>
> Why not declare some 64-bit constant C with a significant
>>number of bits set and do this:
>>
>> for( i=0, x = 0 ; i < 64; i++, x =rotl(x) ) // same loop control
>> if( bit() ) x ^= C ;
>>
>>This makes every output bit depend on many input bits
>>and costs almost nothing extra.

> Ok, let me play a bit with that. Maybe I can add another flag to the
> allocation function so that the caller can decide whether to use that.
> If the user is another RNG, you skip that mixing function, otherwise you
> should take it.

I'd say just do it. It is cheap enough and using it does no harm
even where it is not strictly needed. Adding a flag just gives the
calling code a chance to get it wrong. Better not to take that risk
if you don't have to.

2013-10-15 06:23:54

by Stephan Müller

[permalink] [raw]
Subject: Re: [PATCH] CPU Jitter RNG: inclusion into kernel crypto API and /dev/random

Am Montag, 14. Oktober 2013, 11:18:16 schrieb Sandy Harris:

Hi Sandy,

Could you please review the following code to see that the mix is
function right in your eyes?
>
>However, having done that, I see no reason not to add mixing.
>Using bit() for getting one bit of input and rotl(x) for rotating
>left one bit, your code is basically, with 64-bit x:
>
> for( i=0, x = 0 ; i < 64; i++, x =rotl(x) )
> x |= bit()
>
>Why not declare some 64-bit constant C with a significant
>number of bits set and do this:
>
> for( i=0, x = 0 ; i < 64; i++, x =rotl(x) ) // same loop control
> if( bit() ) x ^= C ;

I only want to use the XOR function as this is bijective and fits to my
mathematical model.

The entropy_collector->data contains the random number. The code first
produces the mixer value that is XORed as often as set bits are
available in the input random number. Finally, it is XORed with the
random number.

The function is currently called unconditionally after the 64 bit random
number is generated from the noise source.

static inline void jent_stir_pool(struct rand_data *entropy_collector)
{
/* This constant is derived from the first two 32 bit
initialization
* vectors of SHA-1 -- 32 bits are set and 32 are unset */
__u64 constant = 0x67452301efcdab89;
__u64 mixer = 0;
int i = 0;

for(i = 0; i < DATA_SIZE_BITS; i++)
{
/* get the i-th bit of the input random number and
* XOR the constant into the mixer value only when that
bit
* is set */
if((entropy_collector->data >> i) & 0x0000000000000001)
mixer ^= constant;
mixer = rol64(mixer, 1);
}
entropy_collector->data ^= mixer;
}

The statistical behavior of the output looks good so far (just tested it
with the ent tool -- the Chi Square value is good). It also does not
compress with bzip2.

Thanks a lot
Stephan