2013-10-11 18:39:05

by Stephan Müller

[permalink] [raw]
Subject: [PATCH] CPU Jitter RNG: inclusion into kernel crypto API and /dev/random

Hi,

the CPU Jitter RNG [1] is a true random number generator that is
intended to work in user and kernel space equally well on a large number
of different CPUs. The heart of the RNG is about 30 lines of code. The
current implementation allows seamless hooking into the kernel crypto
API as well as the Linux /dev/random driver. With its inherent non-
blocking behavior, it could solve the problem of a blocking /dev/random.

Over the last months, new tests were executed. The list of tests now
covers all major operating systems and CPU types as well as the
microkernels of NOVA, Fiasco.OC and Pistachio. More than 200 different
systems were tested, and for all of them the tests show that the Jitter
RNG produces high-quality output. See [2] appendix F for details.

When talking with developers from different chip manufacturers, I was
told that even they see the execution timing jitter in their tests and
cannot eliminate it. Nor are they able to explain with 100% certainty
the root cause of the jitter. Therefore, the noise source looks
appropriate for general use.

I am asking whether this RNG would be a good candidate for inclusion into
the Linux kernel for:

- kernel crypto API to provide a true random number generator as part of
this API (see [2] appendix B for a description)

- inclusion into /dev/random as an entropy provider of last resort when
the entropy estimator falls low.

Patches for both are provided in the source code tarball at [1].

A full description of the RNG together with all testing is provided at
[2] or [3].

I will present the RNG at the Linux Symposium in Ottawa this year. There
I can give a detailed description of the design and testing.

[1] http://www.chronox.de

[2] http://www.chronox.de/jent/doc/CPU-Jitter-NPTRNG.html

[3] http://www.chronox.de/jent/doc/CPU-Jitter-NPTRNG.pdf

Ciao
Stephan


2013-10-12 01:46:00

by Sandy Harris

[permalink] [raw]
Subject: Re: [PATCH] CPU Jitter RNG: inclusion into kernel crypto API and /dev/random

On Fri, Oct 11, 2013 at 2:38 PM, Stephan Mueller <[email protected]> wrote:

I like the basic idea. Here I'm alternately reading the email and the
page you link to & commenting on both.

A nitpick in the paper is that you cite RFC 1750. That was superseded
some years back by RFC 4086
http://tools.ietf.org/html/rfc4086

(Ted's comments in the actual driver had the same problem last
I looked. That is excusable since they were written long ago.)

I think you may be missing some other citations that should be
there, to previous work along similar lines. One is the HAVEGE
work, another:
McGuire, Okech & Schiesser,
Analysis of inherent randomness of the Linux kernel,
http://lwn.net/images/conf/rtlws11/random-hardware.pdf

Paper has:

" the time delta is partitioned into chunks of 1 bit starting at the lowest bit
" .... The 64 1 bit chunks of the time value are XORed with each other to
" form a 1 bit value.

As I read that, you are just taking the parity. Why not use that simpler
description & possibly one of the several optimised algorithms
for the task: http://graphics.stanford.edu/~seander/bithacks.html

If what you are doing is not a parity computation, then you need a
better description so people like me do not misread it.
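
For reference, the 64-bit parity Sandy has in mind can be computed with
the standard shift-and-XOR fold from the bithacks page; a minimal,
illustrative sketch (not taken from either code base):

#include <stdint.h>

/* Parity of a 64-bit word: each step XORs the upper half onto the
 * lower half, so after six steps bit 0 is the XOR of all 64 input bits. */
static unsigned int parity64(uint64_t x)
{
        x ^= x >> 32;
        x ^= x >> 16;
        x ^= x >> 8;
        x ^= x >> 4;
        x ^= x >> 2;
        x ^= x >> 1;
        return (unsigned int)(x & 1);
}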

A bit later you have:

" After obtaining the 1 bit folded and unbiased time stamp,
" how is it mixed into the entropy pool? ... The 1 bit folded
" value is XORed with 1 bit from the entropy pool.

This appears to be missing the cryptographically strong
mixing step which most RNG code includes. If that is
what you are doing, you need to provide some strong
arguments to justify it.

Sometimes doing without is justified; for example my
code along these lines
ftp://ftp.cs.sjtu.edu.cn:990/sandy/maxwell/
does more mixing than I see in yours, but probably
not enough overall. That's OK because I am just
feeding into /dev/random which has plenty of
mixing.

It is OK for your code too if you are feeding into
/dev/random, but it looks problematic if your code
is expected to stand alone.

Ah! You talk about whitening a bit later. However,
you seem to make it optional, up to the user. I
cannot see that that is a good idea.

At the very least I think you need something like
the linear transform from the ARIA cipher -- fast
and cheap, 128 bits in & 128 out and it makes
every output bit depend on every input bit. That
might not be enough, though.

You require compilation without optimisation. How does
that interact with kernel makefiles? Can you avoid
undesirable optimisations in some other way, such as
volatile declarations?

> I am asking whether this RNG would good as an inclusion into the Linux
> kernel for:
>
> - kernel crypto API to provide a true random number generator as part of
> this API (see [2] appendix B for a description)

My first reaction is no. We have /dev/random for the userspace
API and there is a decent kernel API too. I may change my
mind here as I look more at your appendix & maybe the code.

> - inclusion into /dev/random as an entropy provider of last resort when
> the entropy estimator falls low.

Why only 'of last resort'? If it can provide good entropy, we should
use it often.

> I will present the RNG at the Linux Symposium in Ottawa this year. There
> I can give a detailed description of the design and testing.

I live in Ottawa, don't know if I'll make it to the Symposium this
year. Ted, I saw you at one Symposium; are you coming this
year?

2013-10-12 03:28:43

by Theodore Ts'o

[permalink] [raw]
Subject: Re: [PATCH] CPU Jitter RNG: inclusion into kernel crypto API and /dev/random

Hi Stephan,

I haven't had a chance to look at your paper in detail, yet, but a
quick scan has found a huge red flag for me that puts the rest of your
analysis in severe doubt for me.

You say that you got really good results and perfect statistical
entropy on a number of platforms, including on a MIPS embedded
system. You also say that you are harvesting jitter by using
get_cycles() yes?

Well, on the MIPS platform, here is the definition of get_cycles:

static inline cycles_t get_cycles(void)
{
        return 0;
}

So if you are getting great entropy results when in effect you
couldn't possibly be harvesting any jitter at all, then something is
really, Really, REALLY wrong with your tests.

One might be that you are just getting great statistical results
because of the whitening step. This is why I have very little faith
in statistical tests of randomness, given that they will return
perfect results for the following "random number generator"

AES_ENCRYPT(i++, NSA_KEY)

Regards,

- Ted

2013-10-12 19:04:43

by Stephan Müller

[permalink] [raw]
Subject: Re: [PATCH] CPU Jitter RNG: inclusion into kernel crypto API and /dev/random

On Friday, October 11, 2013, 23:28:35, Theodore Ts'o wrote:

Hi Theodore,

>Hi Stephan,
>
>I haven't had a chance to look at your paper in detail, yet, but a
>quick scan has found a huge red flag for me that puts the rest of your
>analysis in severe doubt for me.
>
>You say that you got really good results and perfect statistical
>entropy on a number of platforms, including on an MIPS embedded
>system. You also say that you are harvesting jitter by using
>get_cycles() yes?
>
>Well, on the MIPS platform, here is the definition of get_cycles:
>
>static inline cycles_t get_cycles(void)
>{
> return 0;
>}

There are multiple catches to this issue:

- First, if the time gathering function does not work or is too coarse,
the function jent_entropy_init() returns an error. As outlined in
jitterentropy(3), the result of this function must be honored before
using the RNG.

- Second, the time stamp function jent_get_nstime() uses
__getnstimeofday() in case get_cycles() returns zero (see the
implementation of jent_get_nstime()). On MIPS systems with a missing
get_cycles(), the RNG would therefore fall back to __getnstimeofday().
When using the RNG in user space, it calls clock_gettime(CLOCK_REALTIME),
which on MIPS is backed by the same timer as __getnstimeofday().
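
For illustration, a minimal sketch of such a fallback, assuming the
getnstimeofday() interface of the 3.x kernels; the actual
jent_get_nstime() in the tarball may be structured differently:

#include <linux/types.h>
#include <linux/timex.h>        /* get_cycles() */
#include <linux/time.h>         /* getnstimeofday(), struct timespec */

/* Return the highest-resolution time stamp available: the CPU cycle
 * counter if the architecture provides one, else nanosecond wall time. */
static __u64 jent_get_nstime_sketch(void)
{
        struct timespec ts;
        cycles_t c = get_cycles();

        if (c)
                return (__u64)c;

        /* get_cycles() returns 0 e.g. on MIPS; use the nanosecond clock */
        getnstimeofday(&ts);
        return (__u64)ts.tv_sec * 1000000000ULL + ts.tv_nsec;
}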

Please consider the use of the jent_entropy_init function in the two
patches for /dev/random and kernel crypto API:

/dev/random:

+       /* we are uninitialized, try to initialize */
+       if (jent_entropy_init()) {
+               /* there is no CPU Jitter, disable the entropy collector */
+               r->jent_enable = 0;
+               return;
+       }


kernel crypto API:

static int __init jent_drng_init(void)
{
        ...
        ret = jent_entropy_init();
        if (ret) {
                printk(DRIVER_NAME ": Initialization failed with host not compliant with requirements: %d\n", ret);
                return -EFAULT;
        }

>
>So if you are getting great entropy results when in effect you
>couldn't possibly be harvesting any jitter at all, then something is
>really, Really, REALLY wrong with your tests.
>
>One might be that you are just getting great statistical results
>because of the whitening step. This is why I have very little faith

There is *no* whitening function (cryptographic or otherwise) involved
in the generation of random data. All is done by harvesting time deltas
and aligning them appropriately. This is the sole reason why the heart
of the RNG is only 30 lines of code.

I have added arguments about broken time stamp collections in section
4.3 of the documentation in [2]. These anti-tests clearly show that
broken time stamps would be immediately visible and not disguised by
some whitening function.

Note, the testing of the 200+ systems is done by measuring the jitter of
the core of the RNG. The measurement is logically similar to measuring
the different add_*_randomness functions of random.c. Thus, even the
logic to arrange the timing values into a random bit stream does not
affect the measurements.

>in statistical tests of randomness, given that they will return
>perfect results for the following "random number generator"
>
> AES_ENCRYPT(i++, NSA_KEY)
>
>Regards,
>
> - Ted


Ciao
Stephan

2013-10-12 20:12:25

by Stephan Müller

[permalink] [raw]
Subject: Re: [PATCH] CPU Jitter RNG: inclusion into kernel crypto API and /dev/random

On Friday, October 11, 2013, 21:45:57, Sandy Harris wrote:

Hi Sandy,

>On Fri, Oct 11, 2013 at 2:38 PM, Stephan Mueller <[email protected]>
>wrote:
>
>I like the basic idea. Here I'm alternately reading the email and the
>page you link to & commenting on both.

Thank you very much for reviewing both and sharing your thoughts.
>
>A nitpick in the paper is that you cite RFC 1750. That was superseded
>some years back by RFC 4086
>http://tools.ietf.org/html/rfc4086

Thank you for the link. I updated the reference in the paper. Though,
the Von Neumann unbiasing operation is much older than this RFC and I
just listed the RFC to give people a reference.
>
>(Ted's comments in the actual driver had the same problem last
>I looked. That is excusable since they were written long ago.)
>
>I think you may be missing some other citations that should be
>there, to previous work along similar lines. One is the HAVEGE
>work, another:
>McGuire, Okech & Schiesser,
>Analysis of inherent randomness of the Linux kernel,
>http://lwn.net/images/conf/rtlws11/random-hardware.pdf

Thank you for the reminder. I added section 1.1.

>
>Paper has:
>
>" the time delta is partitioned into chunks of 1 bit starting at the
>lowest bit " .... The 64 1 bit chunks of the time value are XORed with
>each other to " form a 1 bit value.
>
>As I read that, you are just taking the parity. Why not use that
>simpler description & possibly one of several possible optimised
>algorithms for the task:
>http://graphics.stanford.edu/~seander/bithacks.html

I am fully aware that the bit operation is inefficient. Yet it is
deliberately inefficient, because that "folding loop" performs actual
work for the RNG (the collapse of 64 bits into one bit) and at the very
same time, it is the fixed instruction set over which I measure the time
variations.

Thus, the folding loop can be considered the entropy source itself.
When making it shorter or more efficient, we alter the duration of the
loop. With significantly shorter durations, the jitter measurements will
get smaller.

That issue is the sole reason why that code part of the RNG shall be
compiled without optimizations. The compiler is clever enough to arrange
my inefficient operation into something faster when optimizing. You
clearly see that with Clang: the unoptimized folding loop code takes
about 300 nanoseconds on my i7 2nd gen. Optimized, it takes about 40 ns!
At the same time, the jitter is lower (it is still sufficient, but the
cushion is smaller).

As the RNG is not primarily about speed, the folding operation should
stay inefficient.
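
For illustration, a rough sketch of such a deliberately bit-wise folding
loop; the actual loop in jitterentropy-base.c may differ in detail, and
fold_delta is a name chosen here, not taken from the code:

#include <stdint.h>

/* Collapse a 64-bit time delta into one bit, one bit per iteration.
 * The loop is intentionally naive: compiled with -O0 it doubles as the
 * fixed instruction sequence whose execution time jitter is measured.
 * The result equals the parity of the delta, but speed is not the point. */
static uint64_t fold_delta(uint64_t delta)
{
        uint64_t folded = 0;
        int i;

        for (i = 0; i < 64; i++)
                folded ^= (delta >> i) & 1;

        return folded;
}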
>
>If what you are doing is not a parity computation, then you need a
>better description so people like me do not misread it.

It is not a parity computation that the folding loop performs. The code
XORs each individual bit of the 64 bit stream with each other, whereas
your cited document implies an ANDing of the bits (see the section
"Computing parity the naive way" of the cited document). The reason is
that the different bits contain entropy -- the lower the bit index, the
more entropy they are assumed to have. The only operation that retains
entropy while collapsing a bit string is XOR. With any other operation,
you can no longer make any statement about the entropy behavior.

The rationale about the folding loop is given in section 5.1 [2].

>
>A bit later you have:
>
>" After obtaining the 1 bit folded and unbiased time stamp,
>" how is it mixed into the entropy pool? ... The 1 bit folded
>" value is XORed with 1 bit from the entropy pool.
>
>This appears to be missing the cryptographically strong
>mixing step which most RNG code includes. If that is
>what you are doing, you need to provide some strong
>arguments to justify it.

The key here is that there is no cryptographic function needed for
mixing as in fact I am not really mixing things. I am simply
concatenating the new bit to the previously obtained bits. That is it.

The argument for doing that is that the time deltas are independent of
each other. Yet, these deltas show variations which are collapsed into
the one bit that is the result of the folding loop. Therefore, the
individual bits are independent of each other and yet contain the
variations of the jitter. Now, there is no need to apply a cryptographic
function for further mixing.

The key here is that the variations in the time deltas must be *larger*
than one bit. What does that mean? For example, if the variation were,
say, 50% at 250 ns and 50% at 251 ns, we have 1 bit of variation
(considering that each individual measurement is independent of the
others). Or, if you have 30% at 249 ns, 20% at 250 ns, 25% at 251 ns and
25% at 252 ns, you have more than one bit. You see that this line of
thought brings us to the Shannon entropy formula. And now we come to the
200+ test systems that I tested on: all, with the exception of one very
old system, showed a variation with a lower boundary of more than one
bit. Therefore, each bit from the folding operation contains one bit of
entropy. So, why should I mix it even further with some crypto function?
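
As a worked illustration of that line of thought (the percentages are the
hypothetical ones from the paragraph above, not measured values), the
Shannon entropy of the two example distributions can be computed
directly; compile with -lm:

#include <math.h>
#include <stdio.h>

/* H = -sum p_i * log2(p_i) over an observed time-delta distribution */
static double shannon_entropy(const double *p, int n)
{
        double h = 0.0;
        int i;

        for (i = 0; i < n; i++)
                if (p[i] > 0.0)
                        h -= p[i] * log2(p[i]);
        return h;
}

int main(void)
{
        double two[]  = { 0.50, 0.50 };             /* 250 ns / 251 ns */
        double four[] = { 0.30, 0.20, 0.25, 0.25 }; /* 249..252 ns */

        printf("%.3f bits\n", shannon_entropy(two, 2));  /* 1.000 bits */
        printf("%.3f bits\n", shannon_entropy(four, 4)); /* ~1.985 bits */
        return 0;
}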

>
>Sometimes doing without is justified; for example my
>code along these lines
>ftp://ftp.cs.sjtu.edu.cn:990/sandy/maxwell/
>does more mixing than I see in yours, but probably
>not enough overall. That's OK because I am just
>feeding into /dev/random which has plenty of
>mixing.
>
>It is OK for your code too if you are feeding into
>/dev/random, but it looks problematic if your code
>is expected to stand alone.

Please consider the use case of the CPU Jitter RNG: it is intended as a
seed source for deterministic RNGs. That is how the various integration
patches for the kernel crypto API, /dev/random, OpenSSL, libgcrypt or
even the stand-alone entropy feeding daemon operate. Thus, you have a
whitening function when using it together with such DRNGs.

Yet, the code allows other users that may want entropy (NSS or even
stand-alone DRNGs come to mind) to simply use that noise source without
the burden of another whitening function to their DRNGs.

>
>Ah! You talk about whitening a bit later. However,
>you seem to make it optional, up to the user. I
>cannot see that that is a good idea.

Why is whitening a good idea in the first place when the statistical
properties of your output already are white noise? When applying all
kinds of statistical tests to my non-whitened output, I do not see any
statistically significant odd behavior -- ent (Chi-Square test, mean
test, Monte Carlo simulation), dieharder tests, various tests from the
German BSI, NIST test suite.

And if you say that your entropy source has problems, then your
whitening function only disguises that fact but does not remedy it.
Please refer to section 4.3 of my document in [2] where I have some
"anti" tests with broken timers. Such brokenness is immediately visible
to anybody when no whitening function is used (as in the jitter RNG).
But if the jitter RNG used a whitening function, the missing entropy
would not be visible.

Also, the RNG is intended as a seeding mechanism for all kinds of use
cases. The typical one is the seeding of a DRNG of your choice. This is
one more reason why I consider the use of a whitening function as not
helpful.

>
>At the very least I think you need something like
>the linear transform from the ARIA cipher -- fast
>and cheap, 128 bits in & 128 out and it makes
>every output bit depend on every input bit. That
>might not be enough, though.

Can you please help me understand why you think that a whitening
function (cryptographic or not) is needed in the case of the CPU Jitter
RNG, provided that I can show that each individual bit coming from the
folding operation has one bit of entropy?
>
>You require compilation without optimisation. How does
>that interact with kernel makefiles? Can you avoid
>undesirable optimisations in some other way, such as
>volatile declarations?

That is very easy to implement, see the kernel Makefile for my code:

...
obj-m += jitterentropy.o
CFLAGS_jitterentropy-base.o = -O0
jitterentropy-y := jitterentropy-base.o jitterentropy-drng.o
...

This code implies that only the compilation of jitterentropy-base.c is
performed with -O0. Any other C file from my code or even the remainder
of the kernel is compiled with the optimization level the kernel
developers deemed right.

>
>> I am asking whether this RNG would good as an inclusion into the
>> Linux
>> kernel for:
>>
>> - kernel crypto API to provide a true random number generator as part
>> of this API (see [2] appendix B for a description)
>
>My first reaction is no. We have /dev/random for the userspace
>API and there is a decent kernel API too. I may change my
>mind here as I look more at your appendix & maybe the code.

The concern with /dev/random and its get_random_bytes() function is that
inside the kernel, we only have the equivalent of /dev/urandom.
Therefore, in the worst case, you have *no* guarantee that you obtain
random numbers that are freshly seeded with hardware entropy.

For example, when you look into libgcrypt and you want to have "strong
random numbers", libgcrypt seeds from /dev/random every 1000 or so
random numbers produced from the DRNG.

Because user space can drain the nonblocking_pool via /dev/urandom, and
get_random_bytes() uses the same pool, random numbers as strong as those
that, for example, libgcrypt produces would not be available.

Therefore, the code linking with the kernel crypto API registers three
RNGs:

- one X9.31 DRNG that is seeded "once in a while" with the CPU jitter

- one X9.31 DRNG that is seeded "frequently" with the CPU jitter

- and one direct access to the CPU jitter RNG

This is logically equivalent to what is done in libgcrypt.

Of course, we may replace the X9.31 with, say, an SP800-90A DRBG, if
that becomes available.

Furthermore, one of the core ideas of the CPU jitter RNG is that you get
away from the centralized noise source of /dev/random. Such central
services are also the focus for hackers. The CPU Jitter RNG allows
multiple instantiations of the noise source into independent components.
Every "user" of random data may now have its own private copy of a noise
source and its operation.
>
>> - inclusion into /dev/random as an entropy provider of last resort
>> when the entropy estimator falls low.
>
>Why only 'of last resort'? If it can provide good entropy, we should
>use it often.

I did not dare to be more than a provider of last resort at this point.
The CPU jitter RNG is new and I guess people may want to study it first.

From my point of view, it surely can become one "regular" source. But as
it delivers entropy on demand, it may simply outpace any other entropy
source of /dev/random. This means, in some normal use cases, my CPU
jitter RNG would always be the provider of the majority of the entropy,
by far. Therefore, my CPU Jitter RNG would alter the nature of
/dev/random significantly.

If Ted thinks that this is ok, I am all for it and will change the
patch. But I tried to be cautious and not to overwhelm people :-)
>
>> I will present the RNG at the Linux Symposium in Ottawa this year.
>> There I can give a detailed description of the design and testing.
>
>I live in Ottawa, don't know if I'll make it to the Symposium this
>year. Ted; I saw you at one Symposium; are you coming this
>year?

As mentioned before, I would really like to meet you there to have a cup
of coffee over that matter.


Ciao
Stephan

2013-10-28 15:40:39

by Stephan Müller

[permalink] [raw]
Subject: Re: [PATCH] CPU Jitter RNG: inclusion into kernel crypto API and /dev/random

On Friday, October 11, 2013, 20:38:51, Stephan Mueller wrote:

Hi Ted,

>Hi,
>
>the CPU Jitter RNG [1] is a true random number generator that is
>intended to work in user and kernel space equally well on a large
>number of different CPUs. The heart of the RNG is about 30 lines of
>code. The current implementation allows seamless hooking into the
>kernel crypto API as well as the Linux /dev/random driver. With its
>inherent non- blocking behavior, it could solve the problem of a
>blocking /dev/random.
>
>Over the last months, new tests were executed. The list of tests now
>covers all major operating systems and CPU types as well as the
>microkernels of NOVA, Fiasco.OC and Pistachio. More than 200 different
>systems were tested, and for all of them the tests show that the Jitter
>RNG produces high-quality output. See [2] appendix F for details.

Apart from adding more test results from more systems (now including
Windows), I made more updates:

- The structure of the Linux kernel code is updated such that the common
C code can go straight to the lib/ directory or any other directory that
seems suitable for common code. If it is of help, I can create a patch
file to add the CPU Jitter RNG to the Linux kernel code instead of
manually copying it into a kernel tree for testing it with random.c.

- Based on Sandy Harris' discussion in
http://permalink.gmane.org/gmane.comp.encryption.general/16219, the
patch for random.c is updated so that the initialization function of the
entropy pools, init_std_data, now contains a call to the CPU Jitter RNG
to mix in 256 bits of entropy when the entropy pool is filled.

If it is accepted that the CPU Jitter RNG delivers entropy, the latter
update may now allow us to get rid of storing the seed file during
shutdown and restoring it during the next boot sequence.

Please see the latest patch to random.c in the file patches/linux-3.11-
random.patch delivered with [1].

Ciao
Stephan

[1] http://www.chronox.de/jent/jitterentropy-20131028.tar.bz2

by Henrique de Moraes Holschuh

[permalink] [raw]
Subject: Re: [PATCH] CPU Jitter RNG: inclusion into kernel crypto API and /dev/random

On Mon, 28 Oct 2013, Stephan Mueller wrote:
> If it is accepted that the CPU Jitter RNG delivers entropy, the latter
> update may now allow us to get rid of storing the seed file during
> shutdown and restoring it during the next boot sequence.

That's a 4096-bit safety net (uncredited entropy) which at least Debian
shall not remove.

I think Debian also dumps some low-entropy-per-bit crap into /dev/random
during boot (again, not credited), such as the boot kernel logs. We could
increase the density of that entropy a lot using gzip -0 or something like
that... is an uncredited low-entropy-per-bit dump into the pool detrimental
to its quality?

--
"One disk to rule them all, One disk to find them. One disk to bring
them all and in the darkness grind them. In the Land of Redmond
where the shadows lie." -- The Silicon Valley Tarot
Henrique Holschuh

2013-10-28 16:16:07

by Stephan Müller

[permalink] [raw]
Subject: Re: [PATCH] CPU Jitter RNG: inclusion into kernel crypto API and /dev/random

On Monday, October 28, 2013, 14:06:23, Henrique de Moraes Holschuh wrote:

Hi Henrique,

>On Mon, 28 Oct 2013, Stephan Mueller wrote:
>> If it is accepted that the CPU Jitter RNG delivers entropy, the
>> latter
>> update may now allow us to get rid of storing the seed file during
>> shutdown and restoring it during the next boot sequence.
>
>That's a 4096-bit safety net (uncredited entropy) which at least Debian
>shall not remove.

That is correct, and I did not want to have such a safety net removed.

I have to correct my initial statement: there is a possibility where the
CPU Jitter RNG may not deliver data: when the timer resolution is too
coarse. Thus, please disregard my notion to remove the user space
seeding.

My goal is that the entropy pools are filled with entropy at the time
when they are created so that they will deliver good random data if the
system has a high resolution timer (which is almost always the case as
shown with my tests).

>
>I think Debian also dumps some low-entropy-per-bit crap into
>/dev/random during boot (again, not credited), such as the boot kernel
>logs. We could increase the density of that entropy a lot using gzip
>-0 or something like that... is an uncredited low-entropy-per-bit dump
>into the pool detrimental to its quality?

Any mixing of data into the entropy pools can never diminish the
entropy, but just further mix the data.

Note, a simple write of data into the device files will never update the
entropy estimator that is behind the blocking of /dev/random. All that
is done by the write to the device files is the mixing of the entropy
pools of blocking_pool and nonblocking_pool (and not input_pool).
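
To illustrate the difference, a small user-space sketch: a plain write()
to the device file only mixes data into the pools, while the root-only
RNDADDENTROPY ioctl both mixes the data in and credits the entropy
estimator. The all-zero payload here is a placeholder, not a suggestion
for real entropy input:

#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/random.h>       /* RNDADDENTROPY, struct rand_pool_info */

int main(void)
{
        unsigned char buf[16] = { 0 };          /* placeholder data */
        struct rand_pool_info *info;
        int fd = open("/dev/random", O_WRONLY);

        if (fd < 0)
                return 1;

        /* Mixed into the pools, but the entropy estimator is untouched. */
        write(fd, buf, sizeof(buf));

        /* Mixed in *and* credited (here: 128 bits); needs CAP_SYS_ADMIN. */
        info = malloc(sizeof(*info) + sizeof(buf));
        info->entropy_count = 8 * sizeof(buf);  /* credit in bits */
        info->buf_size = sizeof(buf);
        memcpy(info->buf, buf, sizeof(buf));
        if (ioctl(fd, RNDADDENTROPY, info) < 0)
                perror("RNDADDENTROPY");

        free(info);
        close(fd);
        return 0;
}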

Ciao
Stephan

2013-10-28 22:34:15

by Theodore Ts'o

[permalink] [raw]
Subject: Re: [PATCH] CPU Jitter RNG: inclusion into kernel crypto API and /dev/random

Fundamentally, what worries me about this scheme (actually, causes the
hair on the back of my neck to rise up on end) is this statement in
your documentation[1]:

When looking at the sequence of time deltas gathered
during testing [D] , no pattern can be detected. Therefore, the
fluctuation and the resulting distribution are not based on a
repeating pattern and must be considered random.

[1] http://www.chronox.de/jent/doc/CPU-Jitter-NPTRNG.html

Just because we can't detect a pattern does **not** mean that it is
not based on a repeating pattern, and therefore must be considered
random. We can't detect a pattern in RDRAND, so does that mean it's
automatically random? Why, no.

If all you have is the output of "AES_ENCRYPT(NSA_KEY, i++)", and
NSA_KEY is not known to you, you won't be able to detect a pattern,
either. But I can guarantee to you that it's not random...

It may be that there is some very complex state which is hidden inside
the CPU execution pipeline, the L1 cache, etc., etc. But just
because *you* can't figure it out, and just because *I* can't figure
it out doesn't mean that it is ipso facto something which a really
bright NSA analyst working in Fort Meade can't figure out. (Or heck,
a really clever Intel engineer who has full visibility into the
internal design of an Intel CPU....)

Now, it may be that in practice, an adversary won't be able to carry
out a practical attack because there will be external interrupts that
the adversary won't be able to put into his or her model of your CPU
--- for example, from network interrupts or keyboard interrupts. But
in that case, it's better to measure just the interrupts, because it
may be that the 32 interrupts that you got while extracting 128 bits of
entropy from your jitter engine were worth only 32 bits of entropy, and
the rest could be determined by someone with sufficient knowledge and
understanding of the internal guts of the CPU. (Treating this
obscurity as security is probably not a good idea; we have to assume
the NSA can get its hands on anything it wants, even internal,
super-secret, "black cover" Intel documents. :-)

To be honest, I have exactly the same worry about relying on HDD
interrupts. The theoretical basis of this resulting in true
randomness is based on a 1994 paper by Don Davis: "Cryptographic
randomness from air turbulence in disk drives"[2]:

[2] http://world.std.com/~dtd/random/forward.pdf

The problem is that almost two decades later, the technology of HDDs,
and certainly SSDs (which didn't exist back then), has changed quite a
lot. It is not obvious to me how much entropy you can really get from
observing the disk completion times if you assume that the adversary
has complete knowledge of the relative timing and block numbers of the
disk accesses from the OS (for example, if we boot multiple mobile
phones from flash for the first time, how many bits of entropy are
there really?)

But at least back in 1994, there was an attempt to come up with a
physical theory as to where the entropy was coming from, and then as
much work as possible to rule out other possible causes of the
uncertainty.

So if you want to really convince the world that CPU jitter is random,
it's not enough to claim that you can't see a pattern. What you
need to do is to remove all possible sources of the uncertainty, and
show that there is still no discernible pattern after you do things
like (a) run in kernel space, on an otherwise quiescent computer, (b)
disable interrupts, so that any uncertainty can't be coming from
interrupts, etc. Try to rule it all out, and then see if you still
get uncertainty.

If you think it is from DRAM timing, first try accessing the same
memory location in kernel code with the interrupts off, over and over
again, so that the memory is pinned into L1 cache. You should be able
to get consistent results. If you can, then try to read
from DRAM with the L1 and L2 caches disabled, and with interrupts
turned off, etc, and see if you get consistent results or inconsistent
results. If you get consistent results in both cases, then your
hypothesis is disproven. If you get consistent results with the
memory pinned in L1 cache, and inconsistent results when the L1 and L2
cache are disabled, then maybe the timing of DRAM reads really is
introducing entropy. But the point is you need to test each part of
the system in isolation, so you can point at a specific part of the
system and say, *that*'s where at least some of the uncertainty which
an adversary can not reverse engineer comes from, and here is the
physical process -- the chaotic air patterns, or quantum effects, etc.
-- which is hypothesized to cause the uncertainty.
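
As a user-space approximation of the first step described above (the
real test would run in kernel space with interrupts disabled, and rdtsc
serialization details are glossed over here), one could record raw
access-time deltas for a single pinned location like this:

#include <stdint.h>
#include <stdio.h>

/* Read the x86 time stamp counter. */
static inline uint64_t rdtsc(void)
{
        uint32_t lo, hi;

        __asm__ __volatile__("rdtsc" : "=a"(lo), "=d"(hi));
        return ((uint64_t)hi << 32) | lo;
}

int main(void)
{
        volatile uint64_t sink = 0;     /* the single location under test */
        uint64_t delta[1000];
        int i;

        /* Time repeated accesses to the same location; it stays hot in L1
         * after the first touch.  Print the raw deltas afterwards so that
         * I/O does not disturb the measurement loop. */
        for (i = 0; i < 1000; i++) {
                uint64_t before = rdtsc();
                sink += 1;
                delta[i] = rdtsc() - before;
        }

        for (i = 0; i < 1000; i++)
                printf("%llu\n", (unsigned long long)delta[i]);
        return 0;
}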

And note that when you do this, you can't use any unbiasing or
whitening techniques --- you want to use the raw timings, and then do
things like look very hard for any kind of patterns; Don Davis used
FFTs because he wanted to look for any patterns that might be
introduced by the rotating platter, which presumably would show
up in a frequency domain analysis even if it was invisible in the time
domain.

If you don't do all of this work, there is no way to know for sure
where the entropy is coming from. And if you don't know, that's when
you have to be very, very conservative, and use a very large
engineering safety margin. Currently we use the high resolution CPU
counter, plus the interrupted IP, and we mix all of this together from
64 interrupts, and we count this as a single bit of entropy. I *hope*
that at least one of those interrupts has sufficient unpredictability,
perhaps because the remote attacker can't know when a LAN interrupt
has happened, such that we have a single bit of entropy.

Maybe someone can prove that there is more entropy because of some
instability between the oscillator used by the CPU clock and the one
used by the ethernet NIC, and so I'm being hopelessly
over-conservative. Perhaps; but until we know for sure, using a
similar analysis to what I described above, I'd much rather be slow
than be potentially insecure.

The jitter "entropy collector" may be able to generate more
"randomness" much more quickly, but is the resulting numbers really
more secure? Other people will have to judge for themselves, but this
is why I'm not convinced.

Best regards,

- Ted

2013-10-29 08:42:51

by Stephan Müller

[permalink] [raw]
Subject: Re: [PATCH] CPU Jitter RNG: inclusion into kernel crypto API and /dev/random

On Monday, October 28, 2013, 17:45:49, Theodore Ts'o wrote:

Hi Theodore,

first of all, thank you for your thoughts.

And, before we continue any discussion, please consider that all the
extensive testing done so far to analyze the jitter (a) did not include
any whitening scheme (cryptographic or otherwise) and (b) did not even
include the processing done inside the RNG. The testing in appendix F of
the documentation just measures the execution time of some instructions
-- the very heart of the RNG, and nothing more. Only if these show
variations do I conclude that the RNG can be used.

[...]
>
>It may be that there is some very complex state which is hidden inside
>the the CPU execution pipeline, the L1 cache, etc., etc. But just
>because *you* can't figure it out, and just because *I* can't figure
>it out doesn't mean that it is ipso facto something which a really
>bright NSA analyst working in Fort Meade can't figure out. (Or heck,
>a really clever Intel engineer who has full visibility into the
>internal design of an Intel CPU....)

I concur here. But the same applies to all noise sources of /dev/random.
As you have outlined later, your HDD fluctuations may not be as
trustworthy as we think. The key strokes and their timings can be
obtained from electromagnetic emanation. Lastly, the use of the
fast_pool fed by interrupts may still show a correlation with the other
noise sources as they all generate interrupts. But I digress, as we are
talking about my RNG and not analyzing random.c.

So, I guess we all agree on the notion that entropy is *relative*. Some
information may be more entropic to one than to the other. However, for
us, it shall be entropy enough to counter our adversary.
>
>Now, it may be that in practice, an adversary won't be able to carry
>out a practical attack because there will be external interrupts that
>the adversary won't be able to put into his or her model of your CPU
>--- for example, from network interrupts or keyboard interrupts. But
>in that case, it's to measure just the interrupt, because it may be
>that the 32 interrupts that you got while extracting 128 bits of
>entropy from your jitter engine was only 32 bits of entropy, and the
>rest could be determined by someone with sufficient knowledge and
>understanding of the internal guts of the CPU. (Treating this
>obscurity as security is probably not a good idea; we have to assume
>the NSA can get its hands on anything it wants, even internal,
>super-secret, "black cover" Intel documents. :-)

Again, I concur. But since I have seen the jitter with quite similar
size on all the major CPUs we have around us (Intel, AMD, Sparc, POWER,
PowerPC, ARM, MIPS, zSeries), I guess you need to update your statement
to "... even internal, super-secret, "black cover" documents that are
synchronized among all the different chip vendors". :-)

[...]

Thanks again for your ideas below on testing the issue further.
>
>So if you want to really convince the world that CPU jitter is random,
>it's not enough to claim that it you can't see a pattern. What you
>need to do is to remove all possible sources of the uncertainty, and
>show that there is still no discernable pattern after you do things
>like (a) run in kernel space, on an otherwise quiscent computer, (b)

Re: (a) that is what I already did. The kernel implementation of the RNG
is capable of that testing. Moreover, that is what I already did in
section 5.1. It is easy for everybody to redo the testing by simply
compiling the kernel module, load it and look into
/sys/kernel/debug/jitterentropy. There you find some files that are
direct interfaces to the RNG. In particular, the file stat-fold is the
key to redo the testing that covers appendix F of my document (as
mentioned above, there is no postprocessing of the raw variations when
you read that file).

>disable interrupts, so that any uncertainty can't be coming from
>interrupts, etc., Try to rule it all out, and then see if you still
>get uncertainty.

When I did the testing on all systems, interrupts were easily visible as
the larger "variations". When compiling the test results in appendix F,
all measurements that are a tad higher than the majority of the
variations were simply removed to focus on the worst case. I.e. the
measurements and the results *already* exclude any interrupt and
scheduling impacts.

Regarding caches, may I ask you to look into appendix F.46 of the
current document version? I conducted tests that tried to disable /
remove the impact of: system call context switches, flushing the
instruction pipeline, flushing of all caches, disabling preemption,
flushing the TLB, executing the code exclusively on one CPU core, and
disabling of power management and frequency scaling.

All these tests show *no* deterioration in jitter, i.e. the jitter is
still there. The only exception is the power management where I see some
small jitter drop off, which is analyzed and concluded to be
unproblematic.

>
>If you think it is from DRAM timing, first try accessing the same
>memory location in kernel code with the interrupts off, over and over
>again, so that the memory is pinned into L1 cache. You should be able

That is what the testing already does. I constantly access some piece of
memory millions of times and measure the execution time of the operation
on that memory location. As mentioned above, interrupts are disregarded
in any case.

And, jitter is there.

>to get consistent results. If you can, then if you then try to read
>from DRAM with the L1 and L2 caches disabled, and with interrupts

Based on this suggestion, I now added the tests in Appendix F.46.8 where
I disable the caches and the tests in Appendix F.46.9 where I disable
the caches and interrupts.

The results show that the jitter even goes way up -- thus, sufficient
jitter is even more present when the caches and interrupts are
disabled.

>turned off, etc, and see if you get consistent results or inconsistent
>results. If you get consistent results in both cases, then your
>hypothesis is disproven. If you get consistent results with the

Currently, the hypothesis is *not* disproven.

>memory pinned in L1 cache, and inconsistent results when the L1 and L2
>cache are disabled, then maybe the timing of DRAM reads really are
>introducing entropy. But the point is you need to test each part of
>the system in isolation, so you can point at a specific part of the
>system and say, *that*'s where at least some uncertainty which an
>adversary can not reverse engineer, and here is the physical process
>from which the choatic air patterns, or quantum effects, etc., which
>is hypothesized to cause the uncertainty.

As I tried quite a number of different variations on disabling /
enabling features in appendix F.46, I am out of ideas what else I should
try.
>
>And note that when you do this, you can't use any unbiasing or
>whitening techniques --- you want to use the raw timings, and then do
>things like look very hard for any kind of patterns; Don Davis used

Again, there is no whitening, and not even the RNG processing involved.
All I am doing is simple timing analysis of some fixed set of
instructions -- i.e. the very heart of the RNG.

[..]
>
>The jitter "entropy collector" may be able to generate more
>"randomness" much more quickly, but is the resulting numbers really
>more secure? Other people will have to judge for themselves, but this
>is why I'm not convinced.

May I ask to recheck appendix F.46 again?

Thanks
Stephan

2013-10-29 13:24:54

by Theodore Ts'o

[permalink] [raw]
Subject: Re: [PATCH] CPU Jitter RNG: inclusion into kernel crypto API and /dev/random

On Tue, Oct 29, 2013 at 09:42:30AM +0100, Stephan Mueller wrote:
> Based on this suggestion, I now added the tests in Appendix F.46.8 where
> I disable the caches and the tests in Appendix F.46.9 where I disable
> the caches and interrupts.

What you've added in F.46 is a good start, but as a suggestion,
instead of disabling one thing at a time, try disabling *everything*
and then see what you get, and then enabling one thing at a time. The
best thing is if you can get to the point where the amount of entropy
is close to zero. Then as you add things back, there's a much better
sense of where the unpredictability might be coming from, and whether
the unpredictability is fundamentally arising from something which is
a chaotic or quantum effect, or just because we don't have a good way
of modelling the behavior of the L1/L2 cache (for example) and that is
spoofing your entropy estimator.

Regards,

- Ted

2013-10-29 14:00:50

by Stephan Müller

[permalink] [raw]
Subject: Re: [PATCH] CPU Jitter RNG: inclusion into kernel crypto API and /dev/random

On Tuesday, October 29, 2013, 09:24:48, Theodore Ts'o wrote:

Hi Theodore,

>On Tue, Oct 29, 2013 at 09:42:30AM +0100, Stephan Mueller wrote:
>> Based on this suggestion, I now added the tests in Appendix F.46.8
>> where I disable the caches and the tests in Appendix F.46.9 where I
>> disable the caches and interrupts.
>
>What you've added in F.46 is a good start, but as a suggestion,
>instead of disabling one thing at a time, try disabling *everything*
>and then see what you get, and then enabling one thing at a time. The
>best thing is if you can get to the point where the amount of entropy
>is close to zero. Then as you add things back, there's a much better

I will try to do that.

But please see the different lower boundary values in the different
subsections of F.46: none of them fall when I disable or change anything
in the base system (except the power management -- where I added
additional analyses). Some of the changes imply that the jitter
increases when I disable certain support.

Thus, I expect that we will not see a significant drop in the jitter, as
you fear or expect.

Yet, I will try and report back.

Though, does anybody have an idea how to flush/disable branch prediction
on x86 (short of using an ARM where I can disable the branch prediction
unit with CP15)? That is the last unit that I do not have a handle on.


>sense of where the unpredictability might be coming from, and whether
>the unpredictability is coming from something which is fundamentally
>arising from something which is chaotic or quantum effect, or just
>because we don't have a good way of modelling the behavior of the
>L1/L2 cache (for example) and that is spoofing your entropy estimator.

Please note: if the jitter really comes from the oscillator effect of
the RAM clock vs. the CPU clock (which I suspect), we will not be able
to alter the jitter software wise.

The reason why I suspect that oscillating effect is the following: I
spoke with two different employees from the research departments of
major chip vendors. They mentioned that they see the very same jitter in
their measurements albeit they tried to get rid of it. So, when they
cannot get rid of that, I guess we will not be able to lower or even
eliminate the jitter significantly.

Ciao
Stephan

2013-10-29 22:25:44

by Stephan Müller

[permalink] [raw]
Subject: Re: [PATCH] CPU Jitter RNG: inclusion into kernel crypto API and /dev/random

On Tuesday, October 29, 2013, 15:00:31, Stephan Mueller wrote:

Hi Ted,

>On Tuesday, October 29, 2013, 09:24:48, Theodore Ts'o wrote:
>
>Hi Theodore,
>
>>On Tue, Oct 29, 2013 at 09:42:30AM +0100, Stephan Mueller wrote:
>>> Based on this suggestion, I now added the tests in Appendix F.46.8
>>> where I disable the caches and the tests in Appendix F.46.9 where I
>>> disable the caches and interrupts.
>>
>>What you've added in F.46 is a good start, but as a suggestion,
>>instead of disabling one thing at a time, try disabling *everything*
>>and then see what you get, and then enabling one thing at a time. The
>>best thing is if you can get to the point where the amount of entropy
>>is close to zero. Then as you add things back, there's a much better
>
>I will try to do that.
>
>But please see the different lower boundary values in the different
>subsections of F.46: none of them fall when I disable or change
>anything in the base system (except the power management -- where I
>added additional analyses). Some of the changes imply that the jitter
>increases when I disable certain support.
>
>Thus, expect that we will not see a significant drop in the jitter as
>you fear or expect.
>
>Yet, I will try and report back.

Please have a look at the updated documentation, appendix F.46.10
provided in [1].

The interesting result is that some combinations of disabling CPU
support do reduce the CPU execution jitter. However, disabling all
support does not yield the lowest jitter measurement.

Though, none of the tried combinations deteriorate the measurement so
much that the execution jitter would be insufficient for use in the RNG.

[...]
>>sense of where the unpredictability might be coming from, and whether
>>the unpredictability is coming from something which is fundamentally
>>arising from something which is chaotic or quantum effect, or just
>>because we don't have a good way of modelling the behavior of the
>>L1/L2 cache (for example) and that is spoofing your entropy estimator.
>
>Please note: if the jitter really comes from the oscillator effect of
>the RAM clock vs. the CPU clock (which I suspect), we will not be able
>to alter the jitter software wise.

My current conclusion is that software can have some impact on the
execution jitter (as is visible when executing different OSes on the
same hardware system, as outlined in appendix F.45).

But none of the software impacts can deteriorate the jitter to the level
that it is not usable any more for the RNG and still keep the claim that
each output bit would have one bit of entropy.

[1] http://www.chronox.de/jent/doc/CPU-Jitter-NPTRNG.html

Ciao
Stephan

2013-10-30 12:59:29

by Sandy Harris

[permalink] [raw]
Subject: Re: [PATCH] CPU Jitter RNG: inclusion into kernel crypto API and /dev/random

Theodore Ts'o <[email protected]> wrote:

> Fundamentally, what worries me about this scheme (actually, causes the
> hair on the back of my neck to rise up on end) is this statement in
> your documentation[1]:
>
> When looking at the sequence of time deltas gathered
> during testing [D] , no pattern can be detected. Therefore, the
> fluctuation and the resulting distribution are not based on a
> repeating pattern and must be considered random.
>
> [1] http://www.chronox.de/jent/doc/CPU-Jitter-NPTRNG.html
>
> Just because we can't detect a pattern does **not** mean that it is
> not based on a repeating pattern, and therefore must be considered
> random. We can't detect a pattern in RDRAND, so does that mean it's
> automatically random? Why, no.
> ...
> It may be that there is some very complex state which is hidden inside
> the the CPU execution pipeline, the L1 cache, etc., etc. But just
> because *you* can't figure it out, and just because *I* can't figure
> it out doesn't mean that it is ipso facto something which a really
> bright NSA analyst working in Fort Meade can't figure out. (Or heck,
> a really clever Intel engineer who has full visibility into the
> internal design of an Intel CPU....)
>
> Now, it may be that in practice, an adversary won't be able to carry
> out a practical attack ...

It seems worth noting here that Ted's reasons for skepticism
apply not just to Stephan's Jitter generator, but to others such
as Havege (already in Debian) which are based on differences
in speed of arithmetic operations, presumably due to cache
& TLB misses, pipeline stalls, etc. Also to ones based on
variations in delays from timer calls such as my maxwell(8).

It is also worth mentioning that, while Stephan has done
thorough testing on a range of CPUs, others have test
& rationale info as well. The Havege papers have a lot,
my maxwell paper has a little, and there's:
McGuire, Okech & Schiesser,
Analysis of inherent randomness of the Linux kernel,
http://lwn.net/images/conf/rtlws11/random-hardware.pdf

I know my stuff is not an adequate answer to Ted, but
I suspect some of the others may be.