2013-10-11 18:39:04

by Stephan Müller

[permalink] [raw]
Subject: [PATCH] CPU Jitter RNG: inclusion into kernel crypto API and /dev/random

Hi,

the CPU Jitter RNG [1] is a true random number generator that is
intended to work in user and kernel space equally well on a large number
of different CPUs. The heart of the RNG is about 30 lines of code. The
current implementation allows seamless hooking into the kernel crypto
API as well as the Linux /dev/random driver. With its inherent non-
blocking behavior, it could solve the problem of a blocking /dev/random.

Over the last months, new tests were executed. The list of tests now
covers all major operating systems and CPU types as well as the
microkernels of NOVA, Fiasco.OC and Pistachio. More than 200 different
systems have been tested, and for those the tests show that the Jitter
RNG produces high-quality output. See [2] appendix F for details.

When talking with developers from different chip manufacturers, I was
told that even they see the execution timing jitter in their tests and
cannot eliminate it. Nor can they explain the root cause of the jitter
with 100% certainty. Therefore, the noise source looks appropriate for
general use.

I am asking whether this RNG would be suitable for inclusion into the
Linux kernel for:

- kernel crypto API to provide a true random number generator as part of
this API (see [2] appendix B for a description)

- inclusion into /dev/random as an entropy provider of last resort when
the entropy estimator falls low.

Patches for both are provided in the source code tarball at [1].

A full description of the RNG together with all testing is provided at
[2] or [3].

I will present the RNG at the Linux Symposium in Ottawa this year. There
I can give a detailed description of the design and testing.

[1] http://www.chronox.de

[2] http://www.chronox.de/jent/doc/CPU-Jitter-NPTRNG.html

[3] http://www.chronox.de/jent/doc/CPU-Jitter-NPTRNG.pdf

Ciao
Stephan


2013-10-12 01:45:58

by Sandy Harris

[permalink] [raw]
Subject: Re: [PATCH] CPU Jitter RNG: inclusion into kernel crypto API and /dev/random

On Fri, Oct 11, 2013 at 2:38 PM, Stephan Mueller <[email protected]> wrote:

I like the basic idea. Here I'm alternately reading the email and the
page you link to & commenting on both.

A nitpick in the paper is that you cite RFC 1750. That was superseded
some years back by RFC 4086
http://tools.ietf.org/html/rfc4086

(Ted's comments in the actual driver had the same problem last
I looked. That is excusable since they were written long ago.)

I think you may be missing some other citations that should be
there, to previous work along similar lines. One is the HAVEGE
work, another:
McGuire, Okech & Schiesser,
Analysis of inherent randomness of the Linux kernel,
http://lwn.net/images/conf/rtlws11/random-hardware.pdf

Paper has:

" the time delta is partitioned into chunks of 1 bit starting at the lowest bit
" .... The 64 1 bit chunks of the time value are XORed with each other to
" form a 1 bit value.

As I read that, you are just taking the parity. Why not use that simpler
description & possibly one of several possible optimised algorithms
for the task: http://graphics.stanford.edu/~seander/bithacks.html

If what you are doing is not a parity computation, then you need a
better description so people like me do not misread it.
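
For example, the standard shift-and-XOR reduction (sketched here from
the bit hacks page, not taken from your code) computes 64-bit parity
in six operations:

	/* parity of a 64-bit word: fold the halves together until one
	 * bit is left; the result equals XORing all 64 bits */
	static inline unsigned int parity64(unsigned long long v)
	{
		v ^= v >> 32;
		v ^= v >> 16;
		v ^= v >> 8;
		v ^= v >> 4;
		v ^= v >> 2;
		v ^= v >> 1;
		return (unsigned int)(v & 1);
	}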

A bit later you have:

" After obtaining the 1 bit folded and unbiased time stamp,
" how is it mixed into the entropy pool? ... The 1 bit folded
" value is XORed with 1 bit from the entropy pool.

This appears to be missing the cryptographically strong
mixing step which most RNG code includes. If that is
what you are doing, you need to provide some strong
arguments to justify it.

Sometimes doing without is justified; for example my
code along these lines
ftp://ftp.cs.sjtu.edu.cn:990/sandy/maxwell/
does more mixing than I see in yours, but probably
not enough overall. That's OK because I am just
feeding into /dev/random which has plenty of
mixing.

It is OK for your code too if you are feeding into
/dev/random, but it looks problematic if your code
is expected to stand alone.

Ah! You talk about whitening a bit later. However,
you seem to make it optional, up to the user. I
cannot see that that is a good idea.

At the very least I think you need something like
the linear transform from the ARIA cipher -- fast
and cheap, 128 bits in & 128 out and it makes
every output bit depend on every input bit. That
might not be enough, though.

You require compilation without optimisation. How does
that interact with kernel makefiles? Can you avoid
undesirable optimisations in some other way, such as
volatile declarations?

> I am asking whether this RNG would be suitable for inclusion into the Linux
> kernel for:
>
> - kernel crypto API to provide a true random number generator as part of
> this API (see [2] appendix B for a description)

My first reaction is no. We have /dev/random for the userspace
API and there is a decent kernel API too. I may change my
mind here as I look more at your appendix & maybe the code.

> - inclusion into /dev/random as an entropy provider of last resort when
> the entropy estimator falls low.

Why only 'of last resort'? If it can provide good entropy, we should
use it often.

> I will present the RNG at the Linux Symposium in Ottawa this year. There
> I can give a detailed description of the design and testing.

I live in Ottawa, don't know if I'll make it to the Symposium this
year. Ted; I saw you at one Symposium; are you coming this
year?

2013-10-12 03:28:42

by Theodore Ts'o

[permalink] [raw]
Subject: Re: [PATCH] CPU Jitter RNG: inclusion into kernel crypto API and /dev/random

Hi Stephan,

I haven't had a chance to look at your paper in detail yet, but a
quick scan has found a huge red flag that puts the rest of your
analysis in severe doubt for me.

You say that you got really good results and perfect statistical
entropy on a number of platforms, including on a MIPS embedded
system. You also say that you are harvesting jitter by using
get_cycles(), yes?

Well, on the MIPS platform, here is the definition of get_cycles:

static inline cycles_t get_cycles(void)
{
	return 0;
}

So if you are getting great entropy results when in effect you
couldn't possibly be harvesting any jitter at all, then something is
really, Really, REALLY wrong with your tests.

One possible reason might be that you are just getting great
statistical results because of the whitening step. This is why I have
very little faith in statistical tests of randomness, given that they
will return perfect results for the following "random number
generator":

AES_ENCRYPT(i++, NSA_KEY)

Regards,

- Ted

2013-10-12 19:04:35

by Stephan Müller

[permalink] [raw]
Subject: Re: [PATCH] CPU Jitter RNG: inclusion into kernel crypto API and /dev/random

On Friday, 11 October 2013, 23:28:35, Theodore Ts'o wrote:

Hi Theodore,

>Hi Stephan,
>
>I haven't had a chance to look at your paper in detail, yet, but a
>quick scan has found a huge red flag for me that puts the rest of your
>analysis in severe doubt for me.
>
>You say that you got really good results and perfect statistical
>entropy on a number of platforms, including on an MIPS embedded
>system. You also say that you are harvesting jitter by using
>get_cycles() yes?
>
>Well, on the MIPS platform, here is the definition of get_cycles:
>
>static inline cycles_t get_cycles(void)
>{
> return 0;
>}

There are multiple catches to this issue:

- First, if the time gathering function does not work or is too
coarse, the function jent_entropy_init() returns an error. As outlined
in jitterentropy(3), the result of this function must be honored before
using the RNG.

- Second, the time stamp function jent_get_nstime() uses
__getnstimeofday() in case get_cycles() returns zero (see the
implementation of jent_get_nstime()). On MIPS systems with a missing
get_cycles(), the RNG therefore falls back to __getnstimeofday(). When
using the RNG in user space, it calls clock_gettime(CLOCK_REALTIME),
which is backed by the same timer as __getnstimeofday() on MIPS.
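
In condensed form, the fallback logic looks like this (a sketch only;
the exact code ships in the tarball at [1]):

	/* sketch of jent_get_nstime(): prefer the cycle counter, fall
	 * back to a nanosecond wall-clock read where get_cycles() is a
	 * stub returning 0 (e.g. on MIPS) */
	static inline void jent_get_nstime(__u64 *out)
	{
		struct timespec ts;
		__u64 tmp = get_cycles();

		if (tmp == 0) {
			__getnstimeofday(&ts);
			tmp = (__u64)ts.tv_sec * NSEC_PER_SEC + ts.tv_nsec;
		}
		*out = tmp;
	}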

Please consider the use of the jent_entropy_init function in the two
patches for /dev/random and kernel crypto API:

/dev/random:

+	/* we are uninitialized, try to initialize */
+	if (jent_entropy_init()) {
+		/* there is no CPU jitter, disable the entropy collector */
+		r->jent_enable = 0;
+		return;
+	}


kernel crypto API:

static int __init jent_drng_init(void)
{
	...
	ret = jent_entropy_init();
	if (ret) {
		printk(DRIVER_NAME ": Initialization failed with host not compliant with requirements: %d\n", ret);
		return -EFAULT;
	}

>
>So if you are getting great entropy results when in effect you
>couldn't possibly be harvesting any jitter at all, then something is
>really, Really, REALLY wrong with your tests.
>
>One might be that you are just getting great statistical results
>because of the whitening step. This is why I have very little faith

There is *no* whitening function (cryptographic or otherwise) involved
in the generation of random data. All is done by harvesting time deltas
and aligning them appropriately. This is the sole reason why the heart
of the RNG is only 30 lines of code.

I have added arguments about broken time stamp collections in section
4.3 of the documentation in [2]. These anti-tests clearly show that
broken time stamps would be immediately visible and not disguised by
some whitening function.

Note, the testing of the 200+ systems is done by measuring the jitter
of the core of the RNG. The measurement is logically similar to
measuring the different add_*_randomness functions of random.c. Thus,
even the logic that arranges the timing values into a random bit stream
does not affect the measurements.

>in statistical tests of randomness, given that they will return
>perfect results for the following "random number generator"
>
> AES_ENCRYPT(i++, NSA_KEY)
>
>Regards,
>
> - Ted


Ciao
Stephan

2013-10-12 20:12:26

by Stephan Müller

[permalink] [raw]
Subject: Re: [PATCH] CPU Jitter RNG: inclusion into kernel crypto API and /dev/random

On Friday, 11 October 2013, 21:45:57, Sandy Harris wrote:

Hi Sandy,

>On Fri, Oct 11, 2013 at 2:38 PM, Stephan Mueller <[email protected]>
>wrote:
>
>I like the basic idea. Here I'm alternately reading the email and the
>page you link to & commenting on both.

Thank you very much for reviewing both and sharing your thoughts.
>
>A nitpick in the paper is that you cite RFC 1750. That was superceded
>some years back by RFC 4086
>http://tools.ietf.org/html/rfc4086

Thank you for the link. I updated the reference in the paper. Though
the Von Neumann unbiasing operation is way older than this RFC; I just
listed the RFC to give people a reference.
>
>(Ted's comments in the actual driver had the same problem last
>I looked. That is excusable since they were written long ago.)
>
>I think you may be missing some other citations that should be
>there, to previous work along similar lines. One is the HAVEGE
>work, another:
>McGuire, Okech & Schiesser,
>Analysis of inherent randomness of the Linux kernel,
>http://lwn.net/images/conf/rtlws11/random-hardware.pdf

Thank you for the reminder. I added section 1.1.

>
>Paper has:
>
>" the time delta is partitioned into chunks of 1 bit starting at the
>lowest bit " .... The 64 1 bit chunks of the time value are XORed with
>each other to " form a 1 bit value.
>
>As I read that, you are just taking the parity. Why not use that
>simpler description & possibly one of several possible optimised
>algorithms for the task:
>http://graphics.stanford.edu/~seander/bithacks.html

I am fully aware that the bit operation is inefficient. Yet it is
deliberately inefficient, because that "folding loop" performs actual
work for the RNG (the collapse of 64 bits into one bit) and, at the
very same time, it is the fixed instruction set over which I measure
the time variations.

The folding loop can therefore be considered the entropy source
itself. When making it shorter or more efficient, we alter the
duration of the loop; with significantly shorter durations, the jitter
measurements will get smaller.

That issue is the sole reason why that code part of the RNG shall be
compiled without optimizations. The compiler is clever enough to arrange
my inefficient operation into something faster when optimizing. You
clearly see that with Clang: the unoptimized folding loop code takes
about 300 nanoseconds on my i7 2nd gen. Optimized, it takes about 40 ns!
At the same time, the jitter is lower (it is still sufficient, but the
cushion is smaller).

As the RNG is not primarily about speed, the folding operation should
stay inefficient.
>
>If what you are doing is not a parity computation, then you need a
>better description so people like me do not misread it.

It is not a parity computation that the folding loop performs. The code
XORs each individual bit of the 64 bit stream with each other, whereas
your cited document implies an ANDing of the bits (see section
"Computing parity the naive way" of the cited document). The reason is
that the different bits contain entropy -- the lower the bit index, the
more entropy they are assumed to have. The only operation that retains
entropy and yet collapses a bit string is XOR. With any other
operation, you cannot make any statement about the entropy behavior any
more.

The rationale about the folding loop is given in section 5.1 [2].
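
In condensed form, the folding operation looks like this (a sketch;
the function name is illustrative, and the real code in [1] in
addition measures the execution time of exactly this loop):

	/* XOR all 64 one-bit chunks of the time delta into one bit;
	 * deliberately written bit by bit, because the execution time
	 * of this very loop is what gets measured */
	static __u64 jent_fold_delta(__u64 delta)
	{
		__u64 folded = 0;
		int i;

		for (i = 0; i < 64; i++)
			folded ^= (delta >> i) & 1;
		return folded;
	}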

>
>A bit later you have:
>
>" After obtaining the 1 bit folded and unbiased time stamp,
>" how is it mixed into the entropy pool? ... The 1 bit folded
>" value is XORed with 1 bit from the entropy pool.
>
>This appears to be missing the cryptographically strong
>mixing step which most RNG code includes. If that is
>what you are doing, you need to provide some strong
>arguments to justify it.

The key here is that there is no cryptographic function needed for
mixing as in fact I am not really mixing things. I am simply
concatenating the new bit to the previously obtained bits. That is it.

The argument for doing that is that the time deltas are independent of
each other. Yet, these deltas show variations which are collapsed into
the one bit that is the result of the folding loop. Therefore, the
individual bits are independent of each other and yet contain the
variations of the jitter. Now, there is no need to apply a cryptographic
function for further mixing.

The key here is that the variations in the time deltas must be *larger*
than one bit. What does that mean? For example, if the variation were,
say, 50% of 250 ns and 50% of 251 ns, we have 1 bit of variation
(considering that each individual measurement is independent of the
others). Or, if you have 30% with 249 ns, 20% with 250 ns, 25% with
251 ns and 25% with 252 ns, you have more than one bit. You see that
this line of thought brings us to the Shannon entropy formula. And now
we come to the 200+ test systems that I tested on: all, with the
exception of one very old system, showed a variation with a lower
boundary of more than one bit. Therefore, each bit from the folding
operation contains one bit of entropy. So, why should I mix it even
further with some crypto function?
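
Worked out for the second example above (my arithmetic, for
illustration), the Shannon entropy formula gives

	H = -\sum_i p_i \log_2 p_i
	  = -(0.30 \log_2 0.30 + 0.20 \log_2 0.20 + 2 \times 0.25 \log_2 0.25)
	  \approx 0.521 + 0.464 + 1.000
	  = 1.99 bits

i.e. more than one bit per time delta.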

>
>Sometimes doing without is justified; for example my
>code along these lines
>ftp://ftp.cs.sjtu.edu.cn:990/sandy/maxwell/
>does more mixing than I see in yours, but probably
>not enough overall. That's OK because I am just
>feeding into /dev/random which has plenty of
>mixing.
>
>It is OK for your code too if you are feeding into
>/dev/random, but it looks problematic if your code
>is expected to stand alone.

Please consider the use case of the CPU Jitter RNG: it is intended as a
seed source for deterministic RNGs. That is how the various integration
patches for the kernel crypto API, /dev/random, OpenSSL, libgcrypt or
even the stand-alone entropy feeding daemon operate. Thus, you have a
whitening function when using it together with such DRNGs.

Yet, the code allows other users that may want entropy (NSS or even
stand-alone DRNGs come to mind) to simply use the noise source without
the burden of adding another whitening function.

>
>Ah! You talk about whitening a bit later. However,
>you seem to make it optional, up to the user. I
>cannot see that that is a good idea.

Why is whitening a good idea in the first place when the statistical
properties of the output already are white noise? When applying all
kinds of statistical tests to my non-whitened output, I do not see any
statistically significant odd behavior -- ent (Chi-square test, mean
test, Monte Carlo simulation), the dieharder tests, various tests from
the German BSI, the NIST test suite.

And if you say that your entropy source has problems, then your
whitening function only disguises that fact but does not remedy it.
Please refer to section 4.3 of my document in [2] where I have some
"anti" tests with broken timers. Such brokenness would be immediately
visible to anybody without a whitening function (as in the jitter
RNG). But if the jitter RNG used a whitening function, the missing
entropy would not be visible.

Also, the RNG is intended as a seeding mechanism for all kinds of use
cases. The typical one is the seeding of a DRNG of your choice. This is
one more reason why I consider the use of a whitening function as not
helpful.

>
>At the very least I think you need something like
>the linear transform from the ARIA cipher -- fast
>and cheap, 128 bits in & 128 out and it makes
>every output bit depend on every input bit. That
>might not be enough, though.

Can you please help me understand why you think that a whitening
function (cryptographic or not) is needed in the case of the CPU Jitter
RNG, provided that I can show that each individual bit coming from the
folding operation has one bit of entropy?
>
>You require compilation without optimisation. How does
>that interact with kernel makefiles? Can you avoid
>undesirable optimisations in some other way, such as
>volatile declartions?

That is very easy to implement, see the kernel Makefile for my code:

...
obj-m += jitterentropy.o
CFLAGS_jitterentropy-base.o = -O0
jitterentropy-y := jitterentropy-base.o jitterentropy-drng.o
...

This code implies that only the compilation of jitterentropy-base.c is
performed with -O0. Any other C file from my code, and even the
remainder of the kernel, is compiled with the optimization the kernel
developers deemed right.

>
>> I am asking whether this RNG would be suitable for inclusion into
>> the Linux kernel for:
>>
>> - kernel crypto API to provide a true random number generator as part
>> of this API (see [2] appendix B for a description)
>
>My first reaction is no. We have /dev/random for the userspace
>API and there is a decent kernel API too. I may change my
>mind here as I look more at your appendix & maybe the code.

The concern with /dev/random and its get_random_bytes() function is that
inside the kernel, we only have the equivalent of /dev/urandom.
Therefore, in the worst case, you have *no* guarantee that you obtain
random numbers that are freshly seeded with hardware entropy.

For example, when you look into libgcrypt and you want to have "strong
random numbers", libgcrypt seeds from /dev/random every 1000 or so
random numbers produced from the DRNG.

Because user space can obtain data from the nonblocking_pool via
/dev/urandom, and get_random_bytes() draws from that same pool, random
numbers as strong as those libgcrypt produces would not be available in
the kernel.

Therefore, the code linking with the kernel crypto API registers three
RNGs:

- one X9.31 DRNG that is seeded "once in a while" with the CPU jitter

- one X9.31 DRNG that is seeded "frequently" with the CPU jitter

- and one direct access to the CPU jitter RNG

This is logically equivalent to what is done in libgcrypt.

Of course, we may replace the X9.31 with, say, an SP800-90A DRBG, if
that becomes available.

Furthermore, one of the core ideas of the CPU jitter RNG is that you get
away from the centralized noise source of /dev/random. Such central
services are also the focus for hackers. The CPU Jitter RNG allows
multiple instantiations of the noise source into independent components.
Every "user" of random data may now have its own private copy of a noise
source and its operation.
>
>> - inclusion into /dev/random as an entropy provider of last resort
>> when the entropy estimator falls low.
>
>Why only 'of last resort'? If it can provide good entropy, we should
>use it often.

I did not dare to be more than a provider of last resort at this point.
The CPU jitter RNG is new and I guess people may want to study it first.

From my point of view, it surely can become one "regular" source. But
as it delivers entropy on demand, it may simply outpace any other
entropy source of /dev/random. This means that, in some normal use
cases, my CPU Jitter RNG would always provide the majority of the
entropy, by far. Therefore, my CPU Jitter RNG would alter the nature of
/dev/random significantly.

If Ted thinks that this is ok, I am all for it and will change the
patch. But I tried to be cautious and not to overwhelm people :-)
>
>> I will present the RNG at the Linux Symposium in Ottawa this year.
>> There I can give a detailed description of the design and testing.
>
>I live in Ottawa, don't know if I'll make it to the Symposium this
>year. Ted; I saw you at one Symposium; are you coming this
>year?

As mentioned before, I would really like to meet you there to have a cup
of coffee over that matter.


Ciao
Stephan

2013-10-28 15:40:34

by Stephan Müller

[permalink] [raw]
Subject: Re: [PATCH] CPU Jitter RNG: inclusion into kernel crypto API and /dev/random

On Friday, 11 October 2013, 20:38:51, Stephan Mueller wrote:

Hi Ted,

>Hi,
>
>the CPU Jitter RNG [1] is a true random number generator that is
>intended to work in user and kernel space equally well on a large
>number of different CPUs. The heart of the RNG is about 30 lines of
>code. The current implementation allows seamless hooking into the
>kernel crypto API as well as the Linux /dev/random driver. With its
>inherent non- blocking behavior, it could solve the problem of a
>blocking /dev/random.
>
>Over the last months, new tests were executed. The list of tests now
>covers all major operating systems and CPU types as well as the
>microkernels of NOVA, Fiasco.OC and Pistachio. More than 200 different
>systems have been tested, and for those the tests show that the Jitter
>RNG produces high-quality output. See [2] appendix F for details.

Apart from adding test results from more systems (now including
Windows), I made the following updates:

- The structure of the Linux kernel code is updated such that the
common C code can go straight into the lib/ directory or any other
directory that seems suitable for common code. If it is of help, I can
create a patch file to add the CPU Jitter RNG to the Linux kernel code
instead of manually copying it into a kernel tree for testing it with
random.c.

- Based on Sandy Harris' discussion in
http://permalink.gmane.org/gmane.comp.encryption.general/16219, the
patch for random.c is updated so that the initialization function of
the entropy pools, init_std_data(), now contains a call to the CPU
Jitter RNG to mix in 256 bits of entropy when the entropy pool is
filled.

If it is accepted that the CPU Jitter RNG delivers entropy, the latter
update may now allow us to get rid of storing the seed file during
shutdown and restoring it during the next boot sequence.

Please see the latest patch to random.c in the file
patches/linux-3.11-random.patch delivered with [1].

Ciao
Stephan

[1] http://www.chronox.de/jent/jitterentropy-20131028.tar.bz2

2013-10-28 14:06:23

by Henrique de Moraes Holschuh

[permalink] [raw]
Subject: Re: [PATCH] CPU Jitter RNG: inclusion into kernel crypto API and /dev/random

On Mon, 28 Oct 2013, Stephan Mueller wrote:
> If it is accepted that the CPU Jitter RNG delivers entropy, the latter
> update may now allow us to get rid of storing the seed file during
> shutdown and restoring it during the next boot sequence.

That's a 4096-bit safety net (uncredited entropy) which at least Debian
shall not remove.

I think Debian also dumps some low-entropy-per-bit crap into /dev/random
during boot (again, not credited), such as the boot kernel logs. We could
increase the density of that entropy a lot using gzip -0 or something like
that... is an uncredited low-entropy-per-bit dump into the pool detrimental
to its quality?

--
"One disk to rule them all, One disk to find them. One disk to bring
them all and in the darkness grind them. In the Land of Redmond
where the shadows lie." -- The Silicon Valley Tarot
Henrique Holschuh

2013-10-28 16:15:38

by Stephan Müller

[permalink] [raw]
Subject: Re: [PATCH] CPU Jitter RNG: inclusion into kernel crypto API and /dev/random

On Monday, 28 October 2013, 14:06:23, Henrique de Moraes Holschuh
wrote:

Hi Henrique,

>On Mon, 28 Oct 2013, Stephan Mueller wrote:
>> If it is accepted that the CPU Jitter RNG delivers entropy, the
>> latter
>> update may now allow us to get rid of storing the seed file during
>> shutdown and restoring it during the next boot sequence.
>
>That's a 4096-bit safety net (uncredited entropy) which at least Debian
>shall not remove.

That is correct, and I did not want to have such a safety net removed.

I have to correct my initial statement: there is a possibility where the
CPU Jitter RNG may not deliver data: when the timer resolution is too
coarse. Thus, please disregard my notion to remove the user space
seeding.

My goal is that the entropy pools are filled with entropy at the time
they are created so that they will deliver good random data if the
system has a high-resolution timer (which is almost always the case, as
shown by my tests).

>
>I think Debian also dumps some low-entropy-per-bit crap into
>/dev/random during boot (again, not credited), such as the boot kernel
>logs. We could increase the density of that entropy a lot using gzip
>-0 or something like that... is an uncredited low-entropy-per-bit dump
>into the pool detrimental to its quality?

Any mixing of data into the entropy pools can never diminish the
entropy; it only stirs the pool further.

Note, a simple write of data into the device files will never update
the entropy estimator that is behind the blocking behavior of
/dev/random. All that a write to the device files does is mix the data
into the blocking_pool and the nonblocking_pool (and not the
input_pool).
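
For reference, the write path of the driver looks roughly like this
(abridged from my reading of the 3.x random.c -- please check the tree
for the exact code):

	/* writes are mixed into both output pools, but
	 * credit_entropy_bits() is never called, so the entropy
	 * estimator stays untouched */
	static ssize_t random_write(struct file *file,
				    const char __user *buffer,
				    size_t count, loff_t *ppos)
	{
		ssize_t ret;

		ret = write_pool(&blocking_pool, buffer, count);
		if (ret)
			return ret;
		ret = write_pool(&nonblocking_pool, buffer, count);
		if (ret)
			return ret;
		return (ssize_t)count;
	}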

Ciao
Stephan

2013-10-28 22:34:10

by Theodore Ts'o

[permalink] [raw]
Subject: Re: [PATCH] CPU Jitter RNG: inclusion into kernel crypto API and /dev/random

Fundamentally, what worries me about this scheme (actually, causes the
hair on the back of my neck to rise up on end) is this statement in
your documentation[1]:

When looking at the sequence of time deltas gathered
during testing [D] , no pattern can be detected. Therefore, the
fluctuation and the resulting distribution are not based on a
repeating pattern and must be considered random.

[1] http://www.chronox.de/jent/doc/CPU-Jitter-NPTRNG.html

Just because we can't detect a pattern does **not** mean that it is
not based on a repeating pattern, and therefore must be considered
random. We can't detect a pattern in RDRAND, so does that mean it's
automatically random? Why, no.

If all you have is the output of "AES_ENCRYPT(NSA_KEY, i++)", and
NSA_KEY is not known to you, you won't be able to detect a pattern,
either. But I can guarantee to you that it's not random...

It may be that there is some very complex state which is hidden inside
the CPU execution pipeline, the L1 cache, etc., etc. But just because
*you* can't figure it out, and just because *I* can't figure it out,
doesn't mean that it is ipso facto something which a really bright NSA
analyst working in Fort Meade can't figure out. (Or heck, a really
clever Intel engineer who has full visibility into the internal design
of an Intel CPU....)

Now, it may be that in practice, an adversary won't be able to carry
out a practical attack because there will be external interrupts that
the adversary won't be able to put into his or her model of your CPU
--- for example, from network interrupts or keyboard interrupts. But
in that case, it's better to measure just the interrupts, because it
may be that the 32 interrupts that you got while extracting 128 bits
of entropy from your jitter engine contained only 32 bits of entropy,
and the rest could be determined by someone with sufficient knowledge
and understanding of the internal guts of the CPU. (Treating this
obscurity as security is probably not a good idea; we have to assume
the NSA can get its hands on anything it wants, even internal,
super-secret, "black cover" Intel documents. :-)

To be honest, I have exactly the same worry about relying on HDD
interrupts. The theoretical basis of this resulting in true
randomness is based on a 1994 paper by Don Davis: "Cryptographic
randomness from air turbulence in disk drives"[2]:

[2] http://world.std.com/~dtd/random/forward.pdf

The problem is that almost two decades later, the technology of HDDs,
and certainly SSDs (which didn't exist back then), have changed quite a
lot. It is not obvious to me how much entropy you can really get from
observing the disk completion times if you assume that the adversary
has complete knowledge of the relative timing and block numbers of the
disk accesses from the OS (for example, if we boot multiple mobile
phones from flash for the first time, how many bits of entropy are
there really?)

But at least back in 1994, there was an attempt to come up with a
physical theory as to where the entropy was coming from, and then as
much work as possible to rule out other possible causes of the
uncertainty.

So if you want to really convince the world that CPU jitter is random,
it's not enough to claim that you can't see a pattern. What you need
to do is to remove all possible sources of the uncertainty, and show
that there is still no discernible pattern after you do things like
(a) run in kernel space, on an otherwise quiescent computer, (b)
disable interrupts, so that any uncertainty can't be coming from
interrupts, etc. Try to rule it all out, and then see if you still
get uncertainty.

If you think it is from DRAM timing, first try accessing the same
memory location in kernel code with the interrupts off, over and over
again, so that the memory is pinned into L1 cache. You should be able
to get consistent results. If you can, then try to read from DRAM
with the L1 and L2 caches disabled and with interrupts turned off,
etc., and see if you get consistent or inconsistent results. If you
get consistent results in both cases, then your hypothesis is
disproven. If you get consistent results with the memory pinned in L1
cache, and inconsistent results when the L1 and L2 caches are
disabled, then maybe the timing of DRAM reads really is introducing
entropy. But the point is you need to test each part of the system in
isolation, so you can point at a specific part of the system and say:
*that* is where at least some uncertainty comes from which an
adversary can not reverse engineer, and here is the physical process
(chaotic air patterns, quantum effects, etc.) which is hypothesized to
cause the uncertainty.
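
As a concrete starting point, the first of these experiments could be
as simple as the following sketch (sample count and the reporting are
arbitrary; get_cycles(), local_irq_save() and pr_info() are the stock
kernel interfaces):

	#define SAMPLES 1024
	static cycles_t delta[SAMPLES];

	/* time repeated loads of one location with interrupts off, so
	 * the cacheline stays pinned in L1 */
	static void time_pinned_reads(void)
	{
		static volatile unsigned long probe;
		unsigned long flags;
		int i;

		local_irq_save(flags);
		for (i = 0; i < SAMPLES; i++) {
			cycles_t before = get_cycles();
			(void)probe;		/* load that should hit L1 */
			delta[i] = get_cycles() - before;
		}
		local_irq_restore(flags);

		for (i = 0; i < SAMPLES; i++)
			pr_info("delta %llu\n", (unsigned long long)delta[i]);
	}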

And note that when you do this, you can't use any unbiasing or
whitening techniques --- you want to use the raw timings, and then do
things like look very hard for any kind of patterns; Don Davis used
FFTs because he wanted to look for any patterns that might be
introduced by the rotating platter, which would presumably show up in
a frequency domain analysis even if it was invisible in the time
domain.

If you don't do all of this work, there is no way to know for sure
where the entropy is coming from. And if you don't know, that's when
you have to be very, very conservative, and use a very large
engineering safety margin. Currently we use the high resolution CPU
counter, plus the interrupted IP, and we mix all of this together from
64 interrupts, and we count this as a single bit of entropy. I *hope*
that at least one of those interrupts has sufficient unpredictability,
perhaps because the remote attacker can't know when a LAN interrupt
has happened, such that we have a single bit of entropy.

Maybe someone can prove that there is more entropy because of some
instability between the oscillator used by the CPU clock and the one
used by the ethernet NIC, and so I'm being hopelessly
over-conservative. Perhaps; but until we know for sure, using a
similar analysis to what I described above, I'd much rather be slow
than be potentially insecure.

The jitter "entropy collector" may be able to generate more
"randomness" much more quickly, but is the resulting numbers really
more secure? Other people will have to judge for themselves, but this
is why I'm not convinced.

Best regards,

- Ted

2013-10-29 08:42:47

by Stephan Müller

[permalink] [raw]
Subject: Re: [PATCH] CPU Jitter RNG: inclusion into kernel crypto API and /dev/random

On Monday, 28 October 2013, 17:45:49, Theodore Ts'o wrote:

Hi Theodore,

first of all, thank you for your thoughts.

And, before we continue any discussion, please consider that all the
big testing that was done to analyze the jitter so far (a) did not
include any whitening scheme (cryptographic or otherwise) and (b) did
not even include the processing done inside the RNG. The testing in
appendix F of the documentation just measures the execution time of
some instructions -- the very heart of the RNG, and nothing more. Only
if these show variations do I conclude that the RNG can be used.

[...]
>
>It may be that there is some very complex state which is hidden inside
>the the CPU execution pipeline, the L1 cache, etc., etc. But just
>because *you* can't figure it out, and just because *I* can't figure
>it out doesn't mean that it is ipso facto something which a really
>bright NSA analyst working in Fort Meade can't figure out. (Or heck,
>a really clever Intel engineer who has full visibility into the
>internal design of an Intel CPU....)

I concur here. But the same applies to all noise sources of
/dev/random. As you have outlined later, your HDD fluctuations may not
be as trustworthy as we think. Keystrokes and their timings can be
obtained from electromagnetic emanation. Lastly, the fast_pool fed by
interrupts may still show a correlation with the other noise sources,
as they all generate interrupts. But I digress, as we are talking
about my RNG and not analyzing random.c.

So, I guess we all agree on the notion that entropy is *relative*.
Some information may be more entropic to one party than to another.
For us, however, it shall be enough entropy to counter our adversary.
>
>Now, it may be that in practice, an adversary won't be able to carry
>out a practical attack because there will be external interrupts that
>the adversary won't be able to put into his or her model of your CPU
>--- for example, from network interrupts or keyboard interrupts. But
>in that case, it's to measure just the interrupt, because it may be
>that the 32 interrupts that you got while extracting 128 bits of
>entropy from your jitter engine was only 32 bits of entropy, and the
>rest could be determined by someone with sufficient knowledge and
>understanding of the internal guts of the CPU. (Treating this
>obscurity as security is probably not a good idea; we have to assume
>the NSA can get its hands on anything it wants, even internal,
>super-secret, "black cover" Intel documents. :-)

Again, I concur. But since I have seen the jitter with quite similar
size on all the major CPUs we have around us (Intel, AMD, Sparc, POWER,
PowerPC, ARM, MIPS, zSeries), I guess you need to update your statement
to "... even internal, super-secret, "black cover" documents that are
synchronized among all the different chip vendors". :-)

[...]

Thanks again for your ideas below on testing the issue further.
>
>So if you want to really convince the world that CPU jitter is random,
>it's not enough to claim that you can't see a pattern. What you
>need to do is to remove all possible sources of the uncertainty, and
>show that there is still no discernible pattern after you do things
>like (a) run in kernel space, on an otherwise quiescent computer, (b)

Re: (a) that is what I already did. The kernel implementation of the
RNG is capable of that testing. Moreover, that is what I already did
in section 5.1. It is easy for everybody to redo the testing by simply
compiling the kernel module, loading it and looking into
/sys/kernel/debug/jitterentropy. There you find some files that are
direct interfaces to the RNG. In particular, the file stat-fold is the
key to redoing the testing that covers appendix F of my document (as
mentioned above, there is no postprocessing of the raw variations when
you read that file).

>disable interrupts, so that any uncertainty can't be coming from
>interrupts, etc. Try to rule it all out, and then see if you still
>get uncertainty.

When I did the testing on all systems, interrupts are easily visible
as larger "variations". When compiling the test results in appendix F,
all measurements that are a tad higher than the majority of the
variations were simply removed to focus on the worst case. I.e., the
measurements and the results *already* exclude any interrupt and
scheduling impacts.

Regarding caches, may I ask you to look into appendix F.46 of the
current document version? I conducted tests that tried to disable or
remove the impact of: system call context switches, flushing the
instruction pipeline, flushing of all caches, disabling preemption,
flushing the TLB, executing the code exclusively on one CPU core, and
disabling power management and frequency scaling.

All these tests show *no* deterioration in jitter, i.e. the jitter is
still there. The only exception is the power management, where I see a
small drop-off in jitter, which is analyzed and concluded to be
unproblematic.

>
>If you think it is from DRAM timing, first try accessing the same
>memory location in kernel code with the interrupts off, over and over
>again, so that the memory is pinned into L1 cache. You should be able

That is what the testing already does. I constantly access some piece of
memory millions of times and measure the execution time of the operation
on that memory location. As mentioned above, interrupts are disregarded
in any case.

And, jitter is there.

>to get consistent results. If you can, then try to read
>from DRAM with the L1 and L2 caches disabled, and with interrupts

Based on this suggestion, I now added the tests in Appendix F.46.8 where
I disable the caches and the tests in Appendix F.46.9 where I disable
the caches and interrupts.

The results show that the jitter even goes way up -- thus, sufficient
jitter is even more present when the caches and interrupts are
disabled.

>turned off, etc, and see if you get consistent results or inconsistent
>results. If you get consistent results in both cases, then your
>hypothesis is disproven. If you get consistent results with the

Currently, the hypothesis is *not* disproven.

>memory pinned in L1 cache, and inconsistent results when the L1 and L2
>cache are disabled, then maybe the timing of DRAM reads really are
>introducing entropy. But the point is you need to test each part of
>the system in isolation, so you can point at a specific part of the
>system and say, *that*'s where at least some uncertainty which an
>adversary can not reverse engineer, and here is the physical process
>from which the chaotic air patterns, or quantum effects, etc., which
>is hypothesized to cause the uncertainty.

As I tried quite a number of different variations on disabling /
enabling features in appendix F.46, I am out of ideas what else I should
try.
>
>And note that when you do this, you can't use any unbiasing or
>whitening techniques --- you want to use the raw timings, and then do
>things like look very hard for any kind of patterns; Don Davis used

Again, there is no whitening, and not even the RNG processing involved.
All I am doing is simple timing analysis of some fixed set of
instructions -- i.e. the very heart of the RNG.

[..]
>
>The jitter "entropy collector" may be able to generate more
>"randomness" much more quickly, but is the resulting numbers really
>more secure? Other people will have to judge for themselves, but this
>is why I'm not convinced.

May I ask to recheck appendix F.46 again?

Thanks
Stephan

2013-10-29 13:24:53

by Theodore Ts'o

[permalink] [raw]
Subject: Re: [PATCH] CPU Jitter RNG: inclusion into kernel crypto API and /dev/random

On Tue, Oct 29, 2013 at 09:42:30AM +0100, Stephan Mueller wrote:
> Based on this suggestion, I now added the tests in Appendix F.46.8 where
> I disable the caches and the tests in Appendix F.46.9 where I disable
> the caches and interrupts.

What you've added in F.46 is a good start, but as a suggestion,
instead of disabling one thing at a time, try disabling *everything*
and then see what you get, and then enable one thing at a time. The
best thing is if you can get to the point where the amount of entropy
is close to zero. Then as you add things back, there's a much better
sense of where the unpredictability might be coming from, and whether
the unpredictability is fundamentally arising from something which is
a chaotic or quantum effect, or just because we don't have a good way
of modelling the behavior of the L1/L2 cache (for example) and that is
spoofing your entropy estimator.

Regards,

- Ted

2013-10-29 14:00:31

by Stephan Müller

[permalink] [raw]
Subject: Re: [PATCH] CPU Jitter RNG: inclusion into kernel crypto API and /dev/random

On Tuesday, 29 October 2013, 09:24:48, Theodore Ts'o wrote:

Hi Theodore,

>On Tue, Oct 29, 2013 at 09:42:30AM +0100, Stephan Mueller wrote:
>> Based on this suggestion, I now added the tests in Appendix F.46.8
>> where I disable the caches and the tests in Appendix F.46.9 where I
>> disable the caches and interrupts.
>
>What you've added in F.46 is a good start, but as a suggestion,
>instead of disabling one thing at a time, try disabling *everything*
>and then see what you get, and then enabling one thing at a time. The
>best thing is if you can get to the point where the amount of entropy
>is close to zero. Then as you add things back, there's a much better

I will try to do that.

But please see the different lower boundary values in the different
subsections of F.46: none of them fall when I disable or change
anything in the base system (except for the power management, where I
added additional analyses). Some of the changes imply that the jitter
increases when I disable certain support.

Thus, I expect that we will not see a significant drop in the jitter
as you fear.

Yet, I will try and report back.

Though, does anybody have an idea how to flush/disable branch
prediction on x86 (short of using an ARM where I can disable the
branch prediction unit via CP15)? That is the last unit that I do not
have a handle on.


>sense of where the unpredictability might be coming from, and whether
>the unpredictability is coming from something which is fundamentally
>arising from something which is chaotic or quantum effect, or just
>because we don't have a good way of modelling the behavior of the
>L1/L2 cache (for example) and that is spoofing your entropy estimator.

Please note: if the jitter really comes from the oscillator effect of
the RAM clock vs. the CPU clock (which I suspect), we will not be able
to alter the jitter software wise.

The reason why I suspect that oscillating effect is the following: I
spoke with two different employees from the research departments of
major chip vendors. They mentioned that they see the very same jitter in
their measurements albeit they tried to get rid of it. So, when they
cannot get rid of that, I guess we will not be able to lower or even
eliminate the jitter significantly.

Ciao
Stephan

2013-10-29 22:25:28

by Stephan Müller

[permalink] [raw]
Subject: Re: [PATCH] CPU Jitter RNG: inclusion into kernel crypto API and /dev/random

On Tuesday, 29 October 2013, 15:00:31, Stephan Mueller wrote:

Hi Ted,

>On Tuesday, 29 October 2013, 09:24:48, Theodore Ts'o wrote:
>
>Hi Theodore,
>
>>On Tue, Oct 29, 2013 at 09:42:30AM +0100, Stephan Mueller wrote:
>>> Based on this suggestion, I now added the tests in Appendix F.46.8
>>> where I disable the caches and the tests in Appendix F.46.9 where I
>>> disable the caches and interrupts.
>>
>>What you've added in F.46 is a good start, but as a suggestion,
>>instead of disabling one thing at a time, try disabling *everything*
>>and then see what you get, and then enabling one thing at a time. The
>>best thing is if you can get to the point where the amount of entropy
>>is close to zero. Then as you add things back, there's a much better
>
>I will try to do that.
>
>But please see the different lower boundary values in the different
>subsections of F.46: none of them fall when I disable or change
>anything in the base system (except the power management -- where I
>added additional analyses). Some of the changes imply that the jitter
>increases when I disable certain support.
>
>Thus, expect that we will not see a significant drop in the jitter as
>you fear or expect.
>
>Yet, I will try and report back.

Please have a look at the updated documentation, appendix F.46.10
provided in [1].

The interesting result is that some combinations of disabled CPU
support do reduce the CPU execution jitter. However, disabling all
support does not yield the lowest jitter measurement.

None of the tried combinations, though, deteriorates the measurements
so much that the execution jitter would be insufficient for use in the
RNG.

[...]
>>sense of where the unpredictability might be coming from, and whether
>>the unpredictability is coming from something which is fundamentally
>>arising from something which is chaotic or quantum effect, or just
>>because we don't have a good way of modelling the behavior of the
>>L1/L2 cache (for example) and that is spoofing your entropy estimator.
>
>Please note: if the jitter really comes from the oscillator effect of
>the RAM clock vs. the CPU clock (which I suspect), we will not be able
>to alter the jitter software wise.

My current conclusion is that software can have some impact on the
execution jitter (as is visible when executing different OSes on the
same hardware system, as outlined in appendix F.45).

But none of the software impacts deteriorates the jitter to a level
where it is no longer usable for the RNG while keeping the claim that
each output bit has one bit of entropy.

[1] http://www.chronox.de/jent/doc/CPU-Jitter-NPTRNG.html

Ciao
Stephan

2013-10-30 12:59:25

by Sandy Harris

[permalink] [raw]
Subject: Re: [PATCH] CPU Jitter RNG: inclusion into kernel crypto API and /dev/random

Theodore Ts'o <[email protected]> wrote:

> Fundamentally, what worries me about this scheme (actually, causes the
> hair on the back of my neck to rise up on end) is this statement in
> your documentation[1]:
>
> When looking at the sequence of time deltas gathered
> during testing [D] , no pattern can be detected. Therefore, the
> fluctuation and the resulting distribution are not based on a
> repeating pattern and must be considered random.
>
> [1] http://www.chronox.de/jent/doc/CPU-Jitter-NPTRNG.html
>
> Just because we can't detect a pattern does **not** mean that it is
> not based on a repeating pattern, and therefore must be considered
> random. We can't detect a pattern in RDRAND, so does that mean it's
> automatically random? Why, no.
> ...
> It may be that there is some very complex state which is hidden inside
> the the CPU execution pipeline, the L1 cache, etc., etc. But just
> because *you* can't figure it out, and just because *I* can't figure
> it out doesn't mean that it is ipso facto something which a really
> bright NSA analyst working in Fort Meade can't figure out. (Or heck,
> a really clever Intel engineer who has full visibility into the
> internal design of an Intel CPU....)
>
> Now, it may be that in practice, an adversary won't be able to carry
> out a practical attack ...

It seems worth noting here that Ted's reasons for skepticism
apply not just to Stephan's Jitter generator, but to others such
as Havege (already in Debian) which are based on differences
in speed of arithmetic operations, presumably due to cache
& TLB misses, pipeline stalls, etc. Also to ones based on
variations in delays from timer calls such as my maxwell(8).

It is also worth mentioning that, while Stephan has done
thorough testing on a range of CPUs, others have test
& rationale info as well. The Havege papers have a lot,
my maxwell paper has a little, and there's:
McGuire, Okech & Schiesser,
Analysis of inherent randomness of the Linux kernel,
http://lwn.net/images/conf/rtlws11/random-hardware.pdf

I know my stuff is not an adequate answer to Ted, but
I suspect some of the others may be.

2013-11-02 11:01:15

by Pavel Machek

[permalink] [raw]
Subject: Re: [PATCH] CPU Jitter RNG: inclusion into kernel crypto API and /dev/random

Hi!

> >sense of where the unpredictability might be coming from, and whether
> >the unpredictability is coming from something which is fundamentally
> >arising from something which is chaotic or quantum effect, or just
> >because we don't have a good way of modelling the behavior of the
> >L1/L2 cache (for example) and that is spoofing your entropy estimator.
>
> Please note: if the jitter really comes from the oscillator effect of
> the RAM clock vs. the CPU clock (which I suspect), we will not be able
> to alter the jitter software wise.

Well... if it is really an oscillator effect, there should be _no_
entropy when running with L1/L2 enabled (because DRAM will not be
accessed at all in that case).

There should be a way to extract entropy more directly from various
oscillator effects, no?

For DRAM, just perform timing, have entropy. Plus we could for example
measure PIT vs. other timer sources... (but I suspect that on PCs we
already have enough entropy...)
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

2013-11-02 11:12:26

by Pavel Machek

[permalink] [raw]
Subject: Re: [PATCH] CPU Jitter RNG: inclusion into kernel crypto API and /dev/random

On Sat 2013-11-02 12:01:12, Pavel Machek wrote:
> Hi!
>
> > >sense of where the unpredictability might be coming from, and whether
> > >the unpredictability is coming from something which is fundamentally
> > >arising from something which is chaotic or quantum effect, or just
> > >because we don't have a good way of modelling the behavior of the
> > >L1/L2 cache (for example) and that is spoofing your entropy estimator.
> >
> > Please note: if the jitter really comes from the oscillator effect of
> > the RAM clock vs. the CPU clock (which I suspect), we will not be able
> > to alter the jitter software wise.
>
> Well... if it is really an oscillator effect, there should be _no_
> entropy when running with L1/L2 enabled (because DRAM will not be
> accessed at all in that case).
>
> There should be a way to extract entropy more directly from various
> oscillator effects, no?

Actually... the bogomips calibrating loop already has jitter, measured
from the difference between the CPU clock and the system clock... Could
that be used as an entropy source? As a bonus, we should have it on
most platforms...
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

2013-11-03 07:20:43

by Stephan Müller

[permalink] [raw]
Subject: Re: [PATCH] CPU Jitter RNG: inclusion into kernel crypto API and /dev/random

On Saturday, 2 November 2013, 12:01:13, Pavel Machek wrote:

Hi Pavel,

>Hi!
>
>> >sense of where the unpredictability might be coming from, and
>> >whether
>> >the unpredictability is coming from something which is fundamentally
>> >arising from something which is chaotic or quantum effect, or just
>> >because we don't have a good way of modelling the behavior of the
>> >L1/L2 cache (for example) and that is spoofing your entropy
>> >estimator.
>>
>> Please note: if the jitter really comes from the oscillator effect of
>> the RAM clock vs. the CPU clock (which I suspect), we will not be
>> able
>> to alter the jitter software wise.
>
>Well... if it is really an oscillator effect, there should be _no_
>entropy when running with L1/L2 enabled (because DRAM will not be
>accessed at all in that case).

That is a good one!

Another friend of mine mentioned that he assumes the rise and fall
times of transistors vary very slightly and could be the main reason
for the jitter. I do not think that this is really the case, because
the gates that form the CPU instructions comprise many transistors.
The combined rise/fall jitter should cancel out.

That said, the full root cause is not really known at this time,
considering that the major chip vendors have no real clue either.

>
>There should be a way to extract entropy more directly from various
>oscillator effects, no?

I am working on a different way of measuring such oscillator effects:
using the high-resolution timer of the CPU and measuring it against a
jiffies-based snapshotting window. Here I would combine two timers
that are generated differently. If their frequencies are relatively
prime to each other, we would measure a direct oscillator effect.

This was already tried in the cryptlib RNG, but I am taking a slightly
different approach.

Yet, nothing is complete at this point.
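
The core of the idea in condensed form (a sketch; the name is mine
and nothing here is final):

	/* count CPU cycles elapsing within one full jiffies tick; the
	 * low bits of the count reflect the drift between the two
	 * clock sources */
	static __u64 cycles_per_jiffy(void)
	{
		unsigned long tick = jiffies;
		cycles_t begin, end;

		while (jiffies == tick)		/* align to a tick boundary */
			cpu_relax();
		begin = get_cycles();
		tick = jiffies;
		while (jiffies == tick)		/* wait out one full tick */
			cpu_relax();
		end = get_cycles();
		return end - begin;
	}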
>
>For DRAM, just perform timing, have entropy. Plus we could for example
>measure PIT vs. other timer sources... (but I suspect that on PCs we
>already have enough entropy...)

I suspect so too, but my measurements as of now are not conclusive.


Ciao
Stephan

2013-11-03 12:41:40

by Theodore Ts'o

[permalink] [raw]
Subject: Re: [PATCH] CPU Jitter RNG: inclusion into kernel crypto API and /dev/random

On Sun, Nov 03, 2013 at 08:20:34AM +0100, Stephan Mueller wrote:
> Another friend of mine mentioned that he assumes the rise and fall times
> of transistors varies very slightly and could be the main reason for the
> jitter. I do not think that this is really the case, because our gates
> that form the CPU instructions comprise of many transistors. The
> combined raise/fall jitter should cancel each other out.

The whole point of using a clocked architecture for digital circuitry
is to keep differences in the rise/fall times of transistors from
leading to non-deterministic behavior --- which is important if you
want to make sure that 2 * 2 == 4 and not 3.99999999.....

The one place where you might find differences is when you have
different subsystems which are clocked from different clock sources,
and then there is the extra circuitry that is needed to interface
between those differently clocked circuit areas.

The problem, though, is that over time these boundaries can change; it
may be that on certain chipsets, things that had been in different
clock circuits have later been integrated onto a single system-on-chip
device and all run off of a single clock source.

> That said, the full root cause is not really known at this time,
> considering that the major chip vendors have no real clue either.

I have trouble believing that statement; they might not be willing to
comment, since to do so would expose internal CPU designs which they
consider secret, or worse, it might cause people to depend on certain
implementation details which might not be true in future chipsets ---
which may either tie their hands, or they may even know that it won't
be true for CPU versions currently under development but not yet
released.

Sandy Harris pointed out a very good paper that I would definitely
recommend that people read:

http://lwn.net/images/conf/rtlws11/random-hardware.pdf

It basically describes some efforts made in 2009 by folks to do
exactly the sort of experiments I was advocating. What I actually
think is more important, though, is not the doing of the experiments,
but the development of the tools to do them. If people can create
kernel modules (and they have to be done in the kernel, since you need
to be able to disable interrupts, L1 caches, etc., while you run these
tests), then it will be possible to do these experiments each time a
new CPU comes out from Intel, or each time an ARM hardware vendor
comes out with a new ARM SoC.

It's important that these tests are done all the time, and not: "OK,
we did some tests in 2009 or 2013, and things looked good, and we
don't have to worry about this any more; CPU-generated entropy is
guaranteed to be good!" We need to make sure we can easily do these
tests all the time, for every new piece of hardware out there --- and
when we can't explain where the entropy is coming from, we need to be
doubly on our guard.

Regards,

- Ted

2013-11-03 23:32:09

by Pavel Machek

[permalink] [raw]
Subject: Re: [PATCH] CPU Jitter RNG: inclusion into kernel crypto API and /dev/random

Hi!

> Another friend of mine mentioned that he assumes the rise and fall times
> of transistors vary very slightly and could be the main reason for the
> jitter. I do not think that this is really the case, because our gates
> that form the CPU instructions comprise many transistors. The
> combined rise/fall jitter should cancel each other out.

Plus, there's a clock that should make sure that this jitter does not
matter.

> >There should be a way to extract entropy more directly from various
> >oscillator effects, no?
>
> I am working on a different way of measuring such oscillator effects by
> using the high-resolution timer of the CPU and measuring it with a
> Jiffies-based snapshotting window. So, here I would combine two timers
> that are generated differently. If their frequencies are relatively
> prime to each other, we would measure a direct oscillator effect.

I guess the main problem is machines that do not have a high-resolution
timer on the CPU (rdtsc). They do not get enough entropy during boot,
and all hell breaks loose.

But they usually _do_ have an RTC or other clock, not driven by the CPU
oscillator. Good.

What about just

while (!enough_entropy) {
	cur_time = read_rtc();
	simulated_tsc = 0;
	while (cur_time == read_rtc())
		simulated_tsc++;
	gain_entropy_from(simulated_tsc);
}

(Where enough_entropy should be something like 128 bits).

This should work, we know why it works (drift between rtc and cpu
clock) and it does _not_ need rdtsc-style fast source.

The disadvantage is that it burns CPU, but, well, you only need 128
bits. Assuming the RTC used has 100Hz resolution, enough entropy should
be collected in under 2 seconds. That's an acceptable addition to the
time it takes to generate ssh keys on a slow CPU.
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

2013-11-05 12:21:11

by Stephan Müller

[permalink] [raw]
Subject: Re: [PATCH] CPU Jitter RNG: inclusion into kernel crypto API and /dev/random

Am Sonntag, 3. November 2013, 07:41:35 schrieb Theodore Ts'o:

Hi Theodore,

>On Sun, Nov 03, 2013 at 08:20:34AM +0100, Stephan Mueller wrote:
>
>Sandy Harris pointed out a very good paper that I would definitely
>recommend that people read:
>
>http://lwn.net/images/conf/rtlws11/random-hardware.pdf
>
>It basically describes some efforts made in 2009 by folks to do
>exactly the sort of experiments I was advocating. What I actually

I am wondering whether you have seen my last measurements where I
effectively performed the tests you were asking for: disabling all
possible CPU features and selectively enabling them.

The tests described in the above-mentioned document and many more are
all already in the test suite and test results I present here.

>think is more important, though, is not the doing of the experiments,
>but the development the tools to do these experiments. If people can
>create kernel modules (and they have to be done in the kernel, since
>you need to be able to disable interrupts, L1 caches, etc., while you
>run these tests), then it will be possible to do these experiments
>each time a new CPU comes out from Intel, or each time an ARM hardware
>vendor comes out with a new ARM SOC.

The test code is available and it is executed in kernel space.
>
>It's important that these tests are done all time, and not, "OK, we

Again, the code is there for the taking, including the analysis part.
Yes, it can easily be converted into a fully automated test that at
the end yields a result of "CPU shows sufficient variations for the
RNG" or not.

Therefore, I am asking again:

- Which other CPU mechanisms can I disable that I have not disabled so far?

- The execution time measurements when disabling CPU features show that
there are still significant variations available. Given the fact that an
adversary is not able to disable the features as I did, he will not be
able to reduce the variations induced by the features. He may alter them
potentially, but there are still variations which he cannot affect, let
alone predict. Therefore, how shall an adversary make predictions of the
variations to weaken the RNG?

I heard a nice statement from the developer who implemented the
/dev/random device of a different, respected operating system: the last
step to accept the underlying root cause of uncertainty for an RNG
always requires a leap of faith. Looking at typical noise sources that
sounds about right. For example:

- how can we be sure that nobody who measures the key stroke interrupts
can do that with a precision higher than the entropy estimate the key
stroke is awarded (note, an adversary has the means to observe key
strokes)? Same applies to mouse movements. Note that X11 allows you
to measure these events precisely (the xev application should give a
glimpse).

- how can we be sure that fast_pool exhibits no correlation with the
other noise sources?

- how can we be sure that the HDD fluctuations are random?

We simply accept that these issues do not allow predicting sequences to
an extent that would weaken the RNG.

My goal is to give another seed source to add even more uncertainty into
the Linux RNG in addition to the existing seed sources. This would also
support environments that were typically left in the rain so far, such
as virtual machines, early boot sequences, Live CDs, or headless systems
without a spinning disk.

Ciao
Stephan

2013-11-05 12:25:40

by Stephan Müller

[permalink] [raw]
Subject: Re: [PATCH] CPU Jitter RNG: inclusion into kernel crypto API and /dev/random

Am Montag, 4. November 2013, 00:32:07 schrieb Pavel Machek:

Hi Pavel,

>Hi!
>
>> Another friend of mine mentioned that he assumes the rise and fall
>> times of transistors vary very slightly and could be the main
>> reason for the jitter. I do not think that this is really the case,
>> because our gates that form the CPU instructions comprise many
>> transistors. The combined rise/fall jitter should cancel each other
>> out.
>
>Plus, there's a clock that should make sure that this jitter does not
>matter.
>
>> >There should be a way to extract entropy more directly from various
>> >oscillator effects, no?
>>
>> I am working on a different way of measuring such oscillator effects by
>> using the high-resolution timer of the CPU and measuring it with a
>> Jiffies-based snapshotting window. So, here I would combine two timers
>> that are generated differently. If their frequencies are relatively
>> prime to each other, we would measure a direct oscillator effect.
>
>I guess the main problem is machines that do not have a high-resolution
>timer on the CPU (rdtsc). They do not get enough entropy during boot,
>and all hell breaks loose.

That is right. That is also why I try to use the clocksource framework
if the get_cycles high-resolution timer is not available.
>
>But they usually _do_ have an RTC or other clock, not driven by the CPU
>oscillator. Good.
>
>What about just
>
>while (!enough_entropy) {
>	cur_time = read_rtc();
>	simulated_tsc = 0;
>	while (cur_time == read_rtc())
>		simulated_tsc++;
>	gain_entropy_from(simulated_tsc);
>}

That is an interesting piece of code -- what would you do in the
gain_entropy_from function?
>
>(Where enough_entropy should be something like 128 bits).
>
>This should work, we know why it works (drift between rtc and cpu
>clock) and it does _not_ need rdtsc-style fast source.
>
>The disadvantage is that it burns CPU, but, well, you only need 128

That disadvantage should be no problem, because at the time we need
entropy, burning some CPU cycles is ok. Encryption burns even more CPU
cycles :-)

>bits. Assuming the RTC used has 100Hz resolution, enough entropy should
>be collected in under 2 seconds. That's an acceptable addition to the
>time it takes to generate ssh keys on a slow CPU.
>
Pavel


Ciao
Stephan

2013-11-05 15:00:34

by Stephan Müller

[permalink] [raw]
Subject: Re: [PATCH] CPU Jitter RNG: inclusion into kernel crypto API and /dev/random

Am Dienstag, 5. November 2013, 13:25:40 schrieb Stephan Mueller:

Hi Pavel,

>Am Montag, 4. November 2013, 00:32:07 schrieb Pavel Machek:

>>But they usually _do_ have an RTC or other clock, not driven by the CPU
>>oscillator. Good.
>>
>>What about just
>>
>>while (!enough_entropy) {
>>	cur_time = read_rtc();
>>	simulated_tsc = 0;
>>	while (cur_time == read_rtc())
>>		simulated_tsc++;
>>	gain_entropy_from(simulated_tsc);
>>}
>
>That is an interesting piece of code -- what would you do in the
>gain_entropy_from function?

Please disregard my question.

I plugged that idea into my current Jitter RNG processing and disabled
the other jitter measurements to get a clear, isolated picture.

The result is also white noise! And it is even quite fast.

That means that with this approach, yet another noise source is
available that I could combine with the jitter measurements.

I will have to perform more tests on that noise source. But the smoke
test is already quite interesting.


Ciao
Stephan

2013-11-06 11:42:47

by Stephan Müller

[permalink] [raw]
Subject: Re: [PATCH] CPU Jitter RNG: inclusion into kernel crypto API and /dev/random

Am Dienstag, 5. November 2013, 14:45:58 schrieb Stephan Mueller:

Hi Pavel,

>Am Dienstag, 5. November 2013, 13:25:40 schrieb Stephan Mueller:
>
>Hi Pavel,
>
>>Am Montag, 4. November 2013, 00:32:07 schrieb Pavel Machek:
>>>But they usually _do_ have an RTC or other clock, not driven by the CPU
>>>oscillator. Good.
>>>
>>>What about just
>>>
>>>while (!enough_entropy) {
>>>	cur_time = read_rtc();
>>>	simulated_tsc = 0;
>>>	while (cur_time == read_rtc())
>>>		simulated_tsc++;
>>>	gain_entropy_from(simulated_tsc);
>>>}
>>
>>That is an interesting piece of code -- what would you do in the
>>gain_entropy_from function?
>
>Please disregard my question.
>
>I plugged that idea into my current Jitter RNG processing and disabled
>the other jitter measurements to get a clear, isolated picture.
>
>The result is also white noise! And it is even quite fast.

After doing some more research on this approach, I have to admit that
the output is not good (i.e. not white noise) in all situations.
Therefore, I dropped that (for now).

But thank you very much for your suggestion.

Ciao
Stephan

2013-11-06 11:49:45

by Stephan Müller

[permalink] [raw]
Subject: Re: [PATCH] CPU Jitter RNG: inclusion into kernel crypto API and /dev/random

Am Dienstag, 5. November 2013, 13:20:57 schrieb Stephan Mueller:

Hi Ted,

>Am Sonntag, 3. November 2013, 07:41:35 schrieb Theodore Ts'o:
>
>Hi Theodore,
>
>>On Sun, Nov 03, 2013 at 08:20:34AM +0100, Stephan Mueller wrote:
>>
>>Sandy Harris pointed out a very good paper that I would definitely
>>recommend that people read:
>>
>>http://lwn.net/images/conf/rtlws11/random-hardware.pdf
>>
>>It basically describes some efforts made in 2009 by folks to do
>>exactly the sort of experiments I was advocating. What I actually
>
>I am wondering whether you have seen my last measurements where I
>effectively performed the tests you were asking for: disabling all
>possible CPU features and selectively enabling them.
>
>The tests described in the above-mentioned document and many more are
>all already in the test suite and test results I present here.

After this comment, I got back to one of the authors of the cited paper
(he is in CC).

Here is a quote from his answer to my question whether he was able to
identify the root cause:

"its inherent in the microtiming of Hardware and there is nothing you
can do about it if you want the root cause is quantum physics"

That means, no matter how much CPU support you disable, you will always
have some jitter -- as I showed in my latest test results in appendix
F.46 of [1]. This statement is supported by my tests even on
microkernels which have no other job running than my test application.
Furthermore, as we see that phenomenon on every tested CPU type, on
every tested operating system, with every tested compiler, I am
wondering what other argument is needed to have this solution
considered.

[1] http://www.chronox.de/jent/doc/CPU-Jitter-NPTRNG.html

Ciao
Stephan

2013-11-06 12:43:54

by Theodore Ts'o

[permalink] [raw]
Subject: Re: [PATCH] CPU Jitter RNG: inclusion into kernel crypto API and /dev/random

On Wed, Nov 06, 2013 at 12:49:45PM +0100, Stephan Mueller wrote:
> Here is a quote from his answer to my question whether he was able to
> identify the root cause:
>
> "its inherent in the microtiming of Hardware and there is nothing you
> can do about it if you want the root cause is quantum physics"

That's unfortunate, since it leaves open the question of whether this
jitter is something that could be at least somewhat predictable if you
had a lot more information about the internal works of the CPU or not....

- Ted

2013-11-06 12:51:17

by Stephan Müller

[permalink] [raw]
Subject: Re: [PATCH] CPU Jitter RNG: inclusion into kernel crypto API and /dev/random

Am Mittwoch, 6. November 2013, 07:43:54 schrieb Theodore Ts'o:

Hi Theodore,

>On Wed, Nov 06, 2013 at 12:49:45PM +0100, Stephan Mueller wrote:
>> Here is a quote from his answer to my question whether he was able to
>> identify the root cause:
>>
>> "its inherent in the microtiming of Hardware and there is nothing you
>> can do about it if you want the root cause is quantum physics"
>
>That's unfortunate, since it leaves open the question of whether this
>jitter is something that could be at least somewhat predictable if you
>had a lot more information about the internal works of the CPU or
>not....

I do not understand that answer: I thought we are talking about the
search of non-predictable noise sources. If you cannot predict the
sequence even if you have the state of the CPU, that is what we are
looking for, is it not?

Besides, how on earth shall an attacker even gain knowledge about the
state of the CPU or disable CPU mechanisms? Oh, I forgot, your NSA guy.
But if he is able to do that, all discussions are moot because he simply
disables any noise sources by flipping a bit, reads the memory that is
used to hold the state of the RNG or just overwrites the memory
locations where data is collected, because the general protection
mechanisms offered by the kernel and the underlying hardware are broken.

Also, does your answer mean you would disregard radioactive decay that
is not predictable due to quantum physics and Heisenberg?

Ciao
Stephan

2013-11-06 13:04:40

by Theodore Ts'o

[permalink] [raw]
Subject: Re: [PATCH] CPU Jitter RNG: inclusion into kernel crypto API and /dev/random

On Wed, Nov 06, 2013 at 01:51:17PM +0100, Stephan Mueller wrote:
> >That's unfortunate, since it leaves open the question of whether this
> >jitter is something that could be at least somewhat predictable if you
> >had a lot more information about the internal works of the CPU or
> >not....
>
> I do not understand that answer: I thought we are talking about the
> search of non-predictable noise sources. If you cannot predict the
> sequence even if you have the state of the CPU, that is what we are
> looking for, is it not?

I was asking the question about whether someone who knew more about
the internal _workings_ of the CPU, not of the state of the CPU.
This is not necessarily "the NSA guy", but someone who knows more
about the internal workings of the Intel CPU (such as an Intel
engineer --- and I've had Intel express misgivings about approaches
which depend on "CPU jitter" approaches), or just someone who has
spent a lot more time trying to examine the black box of the Intel CPU
from the outside.

I remember making my own home-grown encryption algorithm before I
entered college, secure in my arrogance that I had created something
so complicated that no one could crack it. Of course, as I got older
and wiser, I realized that it just meant it was just something *I*
couldn't yet anaylze and crack. The argument "the internal state of
the CPU is sooooo large, and the state transitions ar sooooo complex
that it *must* be unpredictable, because *I* can't predict them" seems
to be a very similar mistake that I made many years ago when I was in
high school.

Of course, some of the state in the CPU may be unknown to the
attacker, if it is derived from external events that are not visible to
the attacker, such as a network interrupt. But if that's the case,
why not measure network interrupts directly? We're much less likely
to overestimate the amount of entropy we can extract from the system in
that case.

> Also, does your answer mean you would disregard radioactive decay that
> is not predictable due to quantum physics and Heisenberg?

No, because that's uncertainty that we can tie to a physical source.
My concern about CPU state is that we haven't yet tied that to a
physical source of entropy, or to exactly what external inputs that
are not available to an external attacker. We know that quantum
events are not predictable. We also know that digital circuits in
general are very carefully designed to *be* predictable.

Regards,

- Ted

2013-11-06 13:24:59

by Pavel Machek

[permalink] [raw]
Subject: Re: [PATCH] CPU Jitter RNG: inclusion into kernel crypto API and /dev/random

Hi!

> Of course, some of the state in the CPU may not be unknown to the
> attacker, if it is derived by external events that are not visible to
> the attacker, such as a network interrupt. But if that's the case,
> why not measure network interrupts directly? We're much less likely
> to overestimate the amount of entropy we can extract the system in
> that case.

Actually, I believe Stephan is up to something here.

We _can't_ measure network interrupts directly, because we do not have
TSC. (And TSC-less machines are the ones that are problematic, right?)

Extracting entropy from the CPU will allow us to pick up entropy from
network packets (and timer interrupt jitter) even on machines that
lack TSC. And that counts as a very cool feature.

(And yes, we could just increment a variable to get TSC emulation in
the idle loop, and then extract entropy from that. But we would not be
able to enter low power states at that point, and it would not work
when the CPU is busy computing.)

Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

2013-11-06 13:26:35

by Pavel Machek

[permalink] [raw]
Subject: Re: [PATCH] CPU Jitter RNG: inclusion into kernel crypto API and /dev/random

Hi!

> >I plugged that idea into my current Jitter RNG processing and disabled
> >the other jitter measurements to get a clear, isolated picture.
> >
> >The result is also white noise! And it is even quite fast.
>
> After doing some more research on this approach, I have to admit that
> the output is not good (i.e. not white noise) in all situations.
> Therefore, I dropped that (for now).

Is there chance to extract at least some entropy from it? (Can you
post the code you used for testing?) Because in this case we know
where the entropy comes from, which is important for Ted.

Thanks,
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

2013-11-07 00:46:31

by Nicholas Mc Guire

[permalink] [raw]
Subject: Re: [PATCH] CPU Jitter RNG: inclusion into kernel crypto API and /dev/random

On Wed, 06 Nov 2013, Pavel Machek wrote:

> Hi!
>
> > Of course, some of the state in the CPU may be unknown to the
> > attacker, if it is derived from external events that are not visible to
> > the attacker, such as a network interrupt. But if that's the case,
> > why not measure network interrupts directly? We're much less likely
> > to overestimate the amount of entropy we can extract from the system in
> > that case.
>
> Actually, I believe Stephan is up to something here.
>
> We _can't_ measure network interrupts directly, because we do not have
> TSC. (And TSC-less machines are the ones that are problematic, right?)
>

If you are interested in entropy then there is no need to compare events
against timestamps (which are just one event class); you can use any
uncorrelated event (the assumption in using timestamps is precisely that
they would be uncorrelated). If you are using jitter then the only
question is how to extract it and how your extractor can be assured to
be sensitive to time differences that are smaller than what can be
impacted by external events. Specifically, this also means that no
matter how you extract jitter you cannot assume it is unbiased, and you
would need to unbias it (e.g. using the von Neumann/Peres method) --
e.g. look at the jitter distributions in the OSADL QA-Farm; it is clear
that they are in none of the cases bias-free.

E.g. on practically all systems we looked at, the jitter distribution is
sensitive to the amount of memory in the box.

> Extracting entropy from the CPU will allow us to pick up entropy from
> network packets (and timer interrupt jitter) even on machines that
> lack TSC. And that counts as a very cool feature.

if you use external events then I do not see the relation to using
CPU-inherent non-determinism - why do you want to bind the selected
events to an external and potentially controllable event at all?

>
> (And yes, we could just increment a variable to get TSC emulation in
> the idle loop, and then extract entropy from that. But we would not be
> able to enter low power states at that point, and it would not work
> when the CPU is busy computing.)
>
there is so much global state information in the kernel that can be
used to harvest entropy that I do not see why we would need to
implement a dedicated method (counter loop or the like) at all -
rather, all that would be needed is an extractor.

Note that using global state, though, is not directly bound to
microtiming (it might nevertheless be too complex to skew
systematically, but that would need to be demonstrated). The only thing
that I believe is directly bound to microtiming are race conditions
(also not unbiased though).

thx!
hofrat

2013-11-07 01:03:59

by Nicholas Mc Guire

[permalink] [raw]
Subject: Re: [PATCH] CPU Jitter RNG: inclusion into kernel crypto API and /dev/random

On Wed, 06 Nov 2013, Stephan Mueller wrote:

> Am Mittwoch, 6. November 2013, 07:43:54 schrieb Theodore Ts'o:
>
> Hi Theodore,
>
> >On Wed, Nov 06, 2013 at 12:49:45PM +0100, Stephan Mueller wrote:
> >> Here is a quote from his answer to my question whether he was able to
> >> identify the root cause:
> >>
> >> "its inherent in the microtiming of Hardware and there is nothing you
> >> can do about it if you want the root cause is quantum physics"
> >
> >That's unfortunate, since it leaves open the question of whether this
> >jitter is something that could be at least somewhat predictable if you
> >had a lot more information about the internal works of the CPU or
> >not....
>
> I do not understand that answer: I thought we are talking about the
> search of non-predictable noise sources. If you cannot predict the
> sequence even if you have the state of the CPU, that is what we are
> looking for, is it not?

Unpredictability of the individual event does not imply that you do not
have the ability to "guess" with more than 50% success - just because it
is not predictable does not mean it is bias-free (and, more importantly
here, that the bias is not influenced by controllable factors like
load). Attached is a simple demonstration of this problem (gauss.c)
that runs in user space and harvests entropy by race occurrence; while
the distribution is a nice normal distribution (on multicore it is an
overlay of multiple normal distributions - one per possible core
pairing), it is not load independent. Nevertheless the individual event
(race/no-race) is not predictable.

>
> Besides, how on earth shall an attacker even gain knowledge about the
> state of the CPU or disable CPU mechanisms? Oh, I forgot, your NSA guy.
> But if he is able to do that, all discussions are moot because he simply
> disables any noise sources by flipping a bit, reads the memory that is
> used to hold the state of the RNG or just overwrites the memory
> locations where data is collected, because the general protection
> mechanisms offered by the kernel and the underlying hardware are broken.

No need to gain knowledge of the internal CPU state; it would be
sufficient to be able to put the CPU in a sub-state-space in which
the distribution is shifted. It may be enough to reduce the truly
random bits of some key by only a few bits to make it susceptible to
brute force attacks.

>
> Also, does your answer mean you would disregard radioactive decay that
> is not predictable due to quantum physics and Heisenberg?
>
maybe I missed something - but what does radioactive-decay-induced
emission have to do with Heisenberg?

thx!
hofrat


Attachments:
(No filename) (2.60 kB)
gauss.c (4.26 kB)

2013-11-07 03:13:03

by Stephan Müller

[permalink] [raw]
Subject: Re: [PATCH] CPU Jitter RNG: inclusion into kernel crypto API and /dev/random

Am Mittwoch, 6. November 2013, 14:26:35 schrieb Pavel Machek:

Hi Pavel,

>Hi!
>
>> >I plugged that idea into my current Jitter RNG processing and disabled
>> >the other jitter measurements to get a clear, isolated picture.
>> >
>> >The result is also white noise! And it is even quite fast.
>>
>> After doing some more research on this approach, I have to admit that
>> the output is not good (i.e. not white noise) in all situations.
>> Therefore, I dropped that (for now).
>
>Is there chance to extract at least some entropy from it? (Can you
>post the code you used for testing?) Because in this case we know
>where the entropy comes from, which is important for Ted.

The code is as follows -- it hooks into the framework of the RNG I
already have, so the code folds the obtained data into one bit (use the
following function as a drop-in replacement in my RNG code):

static __u64 jent_measure_jitter(struct rand_data *entropy_collector)
{
	__u64 starttime = 0;
	__u64 currtime = 0;
	__u64 counter = 0;
	__u64 data = 0;

	jent_get_ustime(&starttime);
	jent_get_ustime(&currtime);
	while (starttime == currtime) {
		jent_get_ustime(&currtime);
		counter++;
	}
	jent_fold_time(counter, &data, 1);
	return data;
}

Consider the following in addition:

static inline void jent_get_ustime(__u64 *out)
{
	__u64 tmp = 0;
	struct timeval time;

	if (gettimeofday(&time, NULL) == 0)
		tmp = time.tv_usec;
	*out = tmp;
}

For the kernel land, I implemented jent_get_ustime to be identical to
do_gettimeofday().
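
A minimal sketch of what that kernel-land variant could look like (my
illustration only; it mirrors the microsecond semantics of the user
space version above):

#include <linux/time.h>		/* struct timeval, do_gettimeofday() */

static inline void jent_get_ustime(__u64 *out)
{
	struct timeval time;

	do_gettimeofday(&time);	/* kernel replacement for gettimeofday() */
	*out = time.tv_usec;
}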

The result is the following on my i7 2nd gen without using the Von-
Neumann unbias operation:

- user space: looks like good white noise based on the results of ent
(Chi square, etc). When I print out the counter variable above and
calculate the Shannon Entropy, I get about 1.5 bits, so we have
variations. But when you look at the data manually, you see quite some
streaks that alternate between two values. Here is an example:

4
6
10
2
3
2
3
4
4
4
4
4
5
3
4
5
4
4
4
5
4
4
5
4
4
5
4
4
5
4
4
5
4
4
4
5
4
4


- kernel space: the resulting binary string is not very good: the chi
square is very bad. Moreover, the resulting data string is slightly
skewed. The reason becomes obvious when looking at the counter value,
which I obtained with another debugfs file: there are very, very long
streaks of the same or alternating values.

So, I guess you may get some entropy, but I am not sure how much.

Also, when I coarsen the timer value to look something like this:

	if (gettimeofday(&time, NULL) == 0)
		tmp = time.tv_usec >> 3;

the counter value is not getting really better; it is still alternating
between two or three values.


>
>Thanks,
> Pavel


Ciao
Stephan

2013-11-07 05:21:48

by Stephan Müller

[permalink] [raw]
Subject: Re: [PATCH] CPU Jitter RNG: inclusion into kernel crypto API and /dev/random

Am Mittwoch, 6. November 2013, 08:04:32 schrieb Theodore Ts'o:

Hi Theodore,

>On Wed, Nov 06, 2013 at 01:51:17PM +0100, Stephan Mueller wrote:
>> >That's unfortunate, since it leaves open the question of whether this
>> >jitter is something that could be at least somewhat predictable if you
>> >had a lot more information about the internal works of the CPU or
>> >not....
>>
>> I do not understand that answer: I thought we are talking about the
>> search of non-predictable noise sources. If you cannot predict the
>> sequence even if you have the state of the CPU, that is what we are
>> looking for, is it not?
>
>I was asking the question about whether someone who knew more about
>the internal _workings_ of the CPU, not of the state of the CPU.
>This is not necessarily "the NSA guy", but someone who knows more
>about the internal workings of the Intel CPU (such as an Intel
>engineer --- and I've had Intel express misgivings about approaches
>which depend on "CPU jitter" approaches), or just someone who has
>spent a lot more time trying to examine the black box of the Intel CPU
>from the outside.

I try to get more information from my contacts to other vendors. But I
am wondering what shall we do if the answer is (maybe even proven with
some test results) that they see the same issue themselves and have no
handle on it?

I mean, what is it that I would need to test and demonstrate to prove or
disprove my RNG? We can certainly test very much, but one thing we
cannot prove, and that is the fundamental jitter, provided it is a
result of quantum fluctuations. Just as with any other noise source,
basic fundamental principles are hard if not impossible to test.

I would again like to stress the fact that we see the root cause to a
similar degree on all major CPUs that we have around us. All of these
CPUs work differently and yet they show a common behavior, the execution
time variations. I wonder whether this is already a good indication.
>
>I remember making my own home-grown encryption algorithm before I
>entered college, secure in my arrogance that I had created something
>so complicated that no one could crack it. Of course, as I got older
>and wiser, I realized that it just meant it was something *I*
>couldn't yet analyze and crack. The argument "the internal state of
>the CPU is sooooo large, and the state transitions are sooooo complex
>that it *must* be unpredictable, because *I* can't predict them" seems
>to be a very similar mistake that I made many years ago when I was in
>high school.

Agreed, I do not want to state that nobody is able to obtain the state
(especially considering that the CPUs *must* have some hardware
debugging hooks somewhere that should allow dumping out the current
state of the entire CPU to aid CPU development). However, we also have
to consider our threat model: to me, an attacker can only reside in user
state of the CPU. All that he can observe there can be used against us.
Anything that he can only do when in supervisor state is not of
interest, because gathering and maintaining entropy is the least of our
worries in this case.
>
>Of course, some of the state in the CPU may be unknown to the
>attacker, if it is derived from external events that are not visible to
>the attacker, such as a network interrupt. But if that's the case,
>why not measure network interrupts directly? We're much less likely
>to overestimate the amount of entropy we can extract from the system in
>that case.

To aid that discussion, let us assume that the variations are solely
based on the CPU state (which indications show is not the case). When
we focus on only the interrupts (or only one interrupt for that
matter), we limit the base from which we try to obtain entropy.

If you are concerned about the entropy level we add with the proposed
jitter RNG, what about the following:

- when mixing in bits from the jitter RNG into the entropy pools of
random.c, we increase the entropy estimator not by the equivalent amount
of bits, but some fraction. For example:

...
ret = jent_read_entropy(&r->entropy_collector, rand, JENTBLOCKSIZE);
...
_mix_pool_bytes(r, rand, ret, NULL);
credit_entropy_bits(r, (ret * 8) / 2);

This code would only increase the entropy estimator by half of the
obtained bits.

- The jitter RNG allows specifying an oversampling rate where it
oversamples the jitter by multiples of the given value. The default is 1
(i.e. no oversampling), but we could set it to 2 (double oversampling).
The relevant assignment in the code is:

r->entropy_collector.osr = 1;

Ciao
Stephan

2013-11-07 05:26:06

by Stephan Müller

[permalink] [raw]
Subject: Re: [PATCH] CPU Jitter RNG: inclusion into kernel crypto API and /dev/random

Am Donnerstag, 7. November 2013, 02:03:57 schrieb Nicholas Mc Guire:

Hi Nicholas,

>On Wed, 06 Nov 2013, Stephan Mueller wrote:
>> Am Mittwoch, 6. November 2013, 07:43:54 schrieb Theodore Ts'o:
>>
>> Hi Theodore,
>>
>> >On Wed, Nov 06, 2013 at 12:49:45PM +0100, Stephan Mueller wrote:
>> >> Here is a quote from his answer to my question whether he was able
>> >> to
>> >> identify the root cause:
>> >>
>> >> "its inherent in the microtiming of Hardware and there is nothing
>> >> you
>> >> can do about it if you want the root cause is quantum physics"
>> >
>> >That's unfortunate, since it leaves open the question of whether this
>> >jitter is something that could be at least somewhat predictable if you
>> >had a lot more information about the internal works of the CPU or
>> >not....
>>
>> I do not understand that answer: I thought we are talking about the
>> search of non-predictable noise sources. If you cannot predict the
>> sequence even if you have the state of the CPU, that is what we are
>> looking for, is it not?
>
>Unpredictability of the individual event does not imply that you do
>not have the ability to "guess" with more than 50% success - just
>because it is not predictable does not mean it is bias-free (and, more
>importantly here, that the bias is not influenced by controllable
>factors like load). Attached is a simple demonstration of this problem
>(gauss.c) that runs in user space and harvests entropy by race
>occurrence; while the distribution is a nice normal distribution (on
>multicore it is an overlay of multiple normal distributions - one per
>possible core pairing), it is not load independent. Nevertheless the
>individual event (race/no-race) is not predictable.

Thank you for sharing your ideas and code.

There is one deviation of my RNG from your considerations. My RNG is not
built around race contention (i.e. your software-based uncertainty) but
around a simple measurement of how long some instructions take to
execute.

So, how can I induce skews? By trying to disable CPU logic or fudging
with the CPU support (like cache attacks, etc.). My tests have shown
that disabling some CPU logic may diminish timing variations, but they
are still sufficiently large to support the RNG.

So, when we already have variations that are sufficient (i.e. the
entropy is sufficient), re-enabling the CPU features that initially had
been disabled adds on top of an already sufficient set of variations
and cannot diminish the entropy that is already present. Even when we
assume we have full control over the CPU features, I do not see how we
can use these features to cancel out the already existing variations.

Or do you have an idea how that can be established? Note, we all must
assume that an attacker is only in user state. Anything else is
meaningless.

>> Besides, how on earth shall an attacker even gain knowledge about the
>> state of the CPU or disable CPU mechanisms? Oh, I forgot, your NSA
>> guy. But if he is able to do that, all discussions are moot because
>> he simply disables any noise sources by flipping a bit, reads the
>> memory that is used to hold the state of the RNG or just overwrites
>> the memory locations where data is collected, because the general
>> protection mechanisms offered by the kernel and the underlying
>> hardware are broken.
>No need to gain knowledge of the internal CPU state; it would be
>sufficient to be able to put the CPU in a sub-state-space in which
>the distribution is shifted. It may be enough to reduce the truly
>random bits of some key by only a few bits to make it susceptible to
>brute force attacks.

Note, the proposed RNG contains an unbias operation (the Von-Neumann
unbiaser) which is proven to remove any bias when it is established that
the individual observations are independent. And the way the
observations are generated ensures that they are independent. Thus, a
skew should not be a concern here.
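
For reference, the core of the Von-Neumann unbiaser is trivial to
sketch (illustrative code only, not my exact implementation): it
consumes pairs of bits, emits the first bit of an unequal pair, and
discards equal pairs.

static int von_neumann_unbias(int bit1, int bit2)
{
	if (bit1 == bit2)
		return -1;	/* 00 or 11: discard this pair */
	return bit1;		/* 01 -> 0, 10 -> 1 */
}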

Ciao
Stephan

2013-11-09 22:04:07

by Clemens Ladisch

[permalink] [raw]
Subject: Re: [PATCH] CPU Jitter RNG: inclusion into kernel crypto API and /dev/random

Stephan Mueller wrote:
> Am Donnerstag, 7. November 2013, 02:03:57 schrieb Nicholas Mc Guire:
>> On Wed, 06 Nov 2013, Stephan Mueller wrote:
>>> Besides, how on earth shall an attacker even gain knowledge about the
>>> state of the CPU or disable CPU mechanisms? Oh, I forgot, your NSA
>>> guy. But if he is able to do that, all discussions are moot because
>>> he simply disables any noise sources by flipping a bit, reads the
>>> memory that is used to hold the state of the RNG or just overwrites
>>> the memory locations where data is collected, because the general
>>> protection mechanisms offered by the kernel and the underlying
>>> hardware are broken.
>>
>> No need to gain knowledge of the internal CPU state; it would be
>> sufficient to be able to put the CPU in a sub-state-space in which
>> the distribution is shifted. It may be enough to reduce the truly
>> random bits of some key by only a few bits to make it susceptible to
>> brute force attacks.
>
> Note, the proposed RNG contains an unbias operation (the Von-Neumann
> unbiaser) which is proven to remove any bias when it is established that
> the individual observations are independent. And the way the
> observations are generated ensures that they are independent.

"Independent" does not mean that your own code avoids reusing data from
the previous loop iteration; it means that the _entire_ process that
generates the bits is not affected by any memory of the past.

The observations are derived from the internal CPU state, which is *not*
reset between measurements.


Regards,
Clemens

2013-11-09 22:04:49

by Clemens Ladisch

[permalink] [raw]
Subject: Re: [PATCH] CPU Jitter RNG: inclusion into kernel crypto API and /dev/random

Stephan Mueller wrote:
> Am Mittwoch, 6. November 2013, 08:04:32 schrieb Theodore Ts'o:
>> On Wed, Nov 06, 2013 at 01:51:17PM +0100, Stephan Mueller wrote:
>>>> That's unfortunate, since it leaves open the question of whether this
>>>> jitter is something that could be at least somewhat predictable if you
>>>> had a lot more information about the internal works of the CPU or not....
>>>
>>> I do not understand that answer: I thought we are talking about the
>>> search of non-predictable noise sources. If you cannot predict the
>>> sequence even if you have the state of the CPU, that is what we are
>>> looking for, is it not?
>>
>> I was asking the question about whether someone who knew more about
>> the internal _workings_ of the CPU, not of the state of the CPU.
>> This is not necessarily "the NSA guy", but someone who knows more
>> about the internal workings of the Intel CPU (such as an Intel
>> engineer --- and I've had Intel express misgivings about approaches
>> which depend on "CPU jitter" approaches), or just someone who has
>> spent a lot more time trying to examine the black box of the Intel CPU
>> from the outside.
>
> I try to get more information from my contacts to other vendors. But I
> am wondering what shall we do if the answer is (maybe even proven with
> some test results) that they see the same issue themselves and have no
> handle on it?
>
> I mean, what is it that I would need to test and demonstrate to prove or
> disprove my RNG?

You need to prove that the CPU will never get into an internal state
where the loop execution times happen to form a predictable pattern.
Alternatively, detect this so that the measurements can be thrown away.

> We can certainly test very much, but one thing we cannot prove, and
> that is the fundamental jitter, provided it is a result of quantum
> fluctuations. Just as with any other noise source, basic fundamental
> principles are hard if not impossible to test.

You cannot test if the noise source was replaced with fake hardware.
But if you know the characteristics of the noise source, you can test
for likely failure modes, such as the output value being stuck or
oscillating.

In the case of CPU jitter measurements, you do not have direct access to
the noise source; you measure it indirectly through the CPU's internal
state. So you need to know how the delta times of a noisy CPU are
different from the delta times of a CPU without a noise source or with
an unsuitable one.


Regards,
Clemens

2013-11-10 01:10:18

by Stephan Müller

[permalink] [raw]
Subject: Re: [PATCH] CPU Jitter RNG: inclusion into kernel crypto API and /dev/random

Am Samstag, 9. November 2013, 23:04:49 schrieb Clemens Ladisch:

Hi Clemens,

> Stephan Mueller wrote:
> > Am Mittwoch, 6. November 2013, 08:04:32 schrieb Theodore Ts'o:
> >> On Wed, Nov 06, 2013 at 01:51:17PM +0100, Stephan Mueller wrote:
> >>>> That's unfortunate, since it leaves open the question of whether this
> >>>> jitter is something that could be at least somewhat predictable if you
> >>>> had a lot more information about the internal works of the CPU or
> >>>> not....
> >>>
> >>> I do not understand that answer: I thought we are talking about the
> >>> search of non-predictable noise sources. If you cannot predict the
> >>> sequence even if you have the state of the CPU, that is what we are
> >>> looking for, is it not?
> >>
> >> I was asking the question about whether someone who knew more about
> >> the internal _workings_ of the CPU, not of the state of the CPU.
> >> This is not necessarily "the NSA guy", but someone who knows more
> >> about the internal workings of the Intel CPU (such as an Intel
> >> engineer --- and I've had Intel express misgivings about approaches
> >> which depend on "CPU jitter" approaches), or just someone who has
> >> spent a lot more time trying to examine the black box of the Intel CPU
> >> from the outside.
> >
> > I try to get more information from my contacts to other vendors. But I
> > am wondering what shall we do if the answer is (maybe even proven with
> > some test results) that they see the same issue themselves and have no
> > handle on it?
> >
> > I mean, what is it that I would need to test and demonstrate to prove or
> > disprove my RNG?
>
> You need to prove that the CPU will never get into an internal state
> where the loop execution times happen to form a predictable pattern.
> Alternatively, detect this so that the measurements can be thrown away.

That statement sounds very nice, and I would love to prove it. But I do not
see any way to do that except applying statistical tests on a large number of
different systems and by disabling CPU features -- as I have done.

I am fully aware that statistical tests cannot prove the conclusion that the
noise source is proper.

But we have to keep requirements to my RNG in league with the research applied
to other noise sources. Let us look at physical noise sources we all know and
love:

- The conclusion that radioactive decay is random is based on quantum
theory. That theory contains a number of formulas which were all validated
with a plethora of measurements. Yet, there is no proof (in the mathematical
sense) that the formulas are correct. These formulas are based on inductive
science but *not* deductive science (like math).

- Oscillators: Again, the conclusion of oscillators being appropriate depends
on inductive science.

- Shot noise, Johnson noise, etc. of resistors and transistors are again
based on inductive science.

For software:

- The noise sources of interrupts, HID events, HDD fluctuations are all based
on inductive science. There are not even formulas or another mathematical
model behind them to state that they are good for RNGs.

So, there was never a proof in the sense of a deductive science for any
noise source. That means I cannot be expected to show a fully formulated
deductive proof for the noise source I offer here.

Yet, I meet the inductive science approach with my RNG as I base my
conclusions on a large array of measurements. And I give the tools to perform
the measurements to everybody. So, everybody can easily redo the testing, or,
if possible, prove me wrong!

You may look into [1] and [2]. [1] mentions that inductive methods cannot
reach absolute proof.
>
> > We can certainly test very much, but one thing we cannot prove, and
> > that is the fundamental jitter, provided it is a result of quantum
> > fluctuations. Just as with any other noise source, basic fundamental
> > principles are hard if not impossible to test.
>
> You cannot test if the noise source was replaced with fake hardware.
> But if you know the characteristics of the noise source, you can test
> for likely failure modes, such as the output value being stuck or
> oscillating.

And here I am asking for help! Here is what I did so far:

- Tested the CPU in kernel and user mode.

- Selectively and mutually disabled CPU features (see appendix F.46 of my
documentation).

- Tested on quiet and heavily loaded CPUs.

- Tested on the same physical system / CPU with different operating systems.

What else can I do?

When you ask for testing of stuck values, what shall I really test for? Shall
I test adjacent measurements for the same or alternating values? The test for
the same values is caught with the Von-Neumann unbiaser. If I test for
alternating values, other people may come in and ask to check for pattern x or
y.

But then, section 4.3 of my document contains an analysis of patterns. As I do
not use a whitener, any pattern WILL be visible with the Chi-Square test
result. All tests I conducted show a good Chi-Square value! That leads me to
the conclusion that there is NO pattern visible.
>
> In the case of CPU jitter measurements, you do not have direct access to
> the noise source; you measure it indirectly through the CPU's internal
> state. So you need to know how the delta times of a noisy CPU are
> different from the delta times of a CPU without or with unsuitable
> noise source.

Please give me a practical example! This statement is again a nice theoretical
wish, but my considerations above for the different noise sources can again be
applied here: I have never seen elaborate stuck tests on any other physical or
non-physical RNG. Yes, I have seen the FIPS 140-2 continuous test, but that is
already implemented in my code.
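
For reference, the core of such a continuous test is simple; a sketch
(not my exact code) that compares each newly generated block against
its predecessor:

#include <string.h>

static int fips_continuous_test(const unsigned char *cur,
				unsigned char *prev, size_t len)
{
	/* identical consecutive blocks indicate a stuck noise source */
	int stuck = !memcmp(cur, prev, len);

	memcpy(prev, cur, len);	/* remember the block for the next call */
	return stuck;		/* 1 --> discard cur and signal an error */
}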

[1] https://en.wikipedia.org/wiki/Inductive_reasoning

[2] https://en.wikipedia.org/wiki/Deductive_reasoning

Ciao
Stephan
--
| Cui bono? |

2013-11-10 01:16:13

by Stephan Müller

[permalink] [raw]
Subject: Re: [PATCH] CPU Jitter RNG: inclusion into kernel crypto API and /dev/random

Am Samstag, 9. November 2013, 23:04:07 schrieb Clemens Ladisch:

Hi Clemens,

> Stephan Mueller wrote:
> > Am Donnerstag, 7. November 2013, 02:03:57 schrieb Nicholas Mc Guire:
> >> On Wed, 06 Nov 2013, Stephan Mueller wrote:
> >>> Besides, how on earth shall an attacker even gain knowledge about the
> >>> state of the CPU or disable CPU mechanisms? Oh, I forgot, your NSA
> >>> guy. But if he is able to do that, all discussions are moot because
> >>> he simply disables any noise sources by flipping a bit, reads the
> >>> memory that is used to hold the state of the RNG or just overwrites
> >>> the memory locations where data is collected, because the general
> >>> protection mechanisms offered by the kernel and the underlying
> >>> hardware are broken.
> >>
> >> No need to gain knowledge of the internal CPU state; it would be
> >> sufficient to be able to put the CPU in a sub-state-space in which
> >> the distribution is shifted. It may be enough to reduce the truly
> >> random bits of some key by only a few bits to make it susceptible to
> >> brute force attacks.
> >
> > Note, the proposed RNG contains an unbias operation (the Von-Neumann
> > unbiaser) which is proven to remove any bias when it is established that
> > the individual observations are independent. And the way the
> > observations are generated ensures that they are independent.
>
> "Independent" does not mean that your own code avoids reusing data from
> the previous loop iteration; it means that the _entire_ process that
> generates the bits is not affected by any memory of the past.

In the other email, I explained the different types of tests I performed. All
of these tests show proper statistical results.

Now, I also performed these tests without the Von-Neumann unbiaser. All of the
statistical tests results still showed a white noise (note, in the next code
release, I will have an allocation flag added that you can use to very simply
deactivate the Von-Neumann unbiaser for testing).

So, the Von-Neumann unbiaser is to be considered a line of defence against
(not yet observed, but potential) skews. Similarly, the optional whitening
(non-cryptographic) function of jent_stir_pool is yet another line of defence.

So, bottom line: I fully concur that using two separate measurements may not
imply that they are independent. But testing shows that it does not matter.
>
> The observations are derived from the internal CPU state, which is *not*
> reset between measurements.
>
>
> Regards,
> Clemens


Ciao
Stephan
--
| Cui bono? |

2013-11-10 16:31:47

by Clemens Ladisch

[permalink] [raw]
Subject: Re: [PATCH] CPU Jitter RNG: inclusion into kernel crypto API and /dev/random

Stephan Mueller wrote:
> Am Samstag, 9. November 2013, 23:04:49 schrieb Clemens Ladisch:
>> Stephan Mueller wrote:
>>> Am Mittwoch, 6. November 2013, 08:04:32 schrieb Theodore Ts'o:
>>>> On Wed, Nov 06, 2013 at 01:51:17PM +0100, Stephan Mueller wrote:
>>>>>> That's unfortunate, since it leaves open the question of whether this
>>>>>> jitter is something that could be at least somewhat predictable if you
>>>>>> had a lot more information about the internal works of the CPU or
>>>>>> not....
>>>>>
>>>>> I do not understand that answer: I thought we are talking about the
>>>>> search of non-predictable noise sources. If you cannot predict the
>>>>> sequence even if you have the state of the CPU, that is what we are
>>>>> looking for, is it not?
>>>>
>>>> I was asking the question about whether someone who knew more about
>>>> the internal _workings_ of the CPU, not of the state of the CPU.
>>>> This is not necessarily "the NSA guy", but someone who knows more
>>>> about the internal workings of the Intel CPU (such as an Intel
>>>> engineer --- and I've had Intel express misgivings about approaches
>>>> which depend on "CPU jitter" approaches), or just someone who has
>>>> spent a lot more time trying to examine the black box of the Intel CPU
>>>> from the outside.
>>>
>>> I try to get more information from my contacts to other vendors. But I
>>> am wondering what shall we do if the answer is (maybe even proven with
>>> some test results) that they see the same issue themselves and have no
>>> handle on it?
>>>
>>> I mean, what is it that I would need to test and demonstrate to prove or
>>> disprove my RNG?
>>
>> You need to prove that the CPU will never get into an internal state
>> where the loop execution times happen to form a predictable pattern.
>> Alternatively, detect this so that the measurements can be thrown away.
>
> That statement sounds very nice, and I would love to prove it. But I do not
> see any way to do that except applying statistical tests on a large number
> of different systems
Then you cannot say anything about unknown future CPU models.

> But we have to keep requirements to my RNG in league with the research applied
> to other noise sources. Let us look at physical noise sources we all know and
> love:
>
> - The conclusion that radioactive decay is random is based on the quantum
> theory. That theory contains a number of formulas which were all validated
> with a plethora of measurements. Yet, there is no proof (in the mathematical
> sense) that the formulas are correct.

But we assume that the unpredictability of quantum mechanics is a limit
for _everyone_. In the case of CPUs, the jitter you observe in delta
times results in part from the complexities of the inner state, and in
part from real random noise. The first part is deterministic and might
be predicted by anyone who has enough knowledge about the CPU's
internals.

Additionally, physical noise sources allow us to estimate the entropy of
our measurements. There is no such estimate for the CPU jitter RNG.

(BTW, the entropy calculations in the paper are meaningless, because the
Shannon formula assumes the values are created independently. Entropy
calculation always depends on the process that created those values.
For example, the entropy of 11001001000011111101101010100010001000 is
zero if you know that this is just pi. To take a more relevant example,
assume a completely deterministic CPU where each measuring loop takes
exactly one timer tick; the delta times would look like this:
1 1 1 1 1 1 1 1 1 1 1 1 ...
Now assume that the timer runs at a speed of 4/3 relative to the CPU,
i.e., every third loop, another tick has accumulated:
1 1 2 1 1 2 1 1 2 1 1 2 ...
Now assume that in every fourth loop iteration, the instruction decoder
falls behind and stalls the pipeline for the same time as one loop
iteration:
1 1 2 2 2 1 1 3 1 2 1 3 ...
This sequence still has zero entropy.)
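
A small user-space illustration of that point (toy code of mine,
unrelated to the RNG itself): the Shannon formula applied to the value
histogram of the fully deterministic sequence 1 1 2 1 1 2 ... reports
about 0.918 bits per symbol, even though the real entropy is zero.

#include <math.h>
#include <stdio.h>

int main(void)
{
	static const int seq[] = { 1, 1, 2, 1, 1, 2, 1, 1, 2, 1, 1, 2 };
	const int n = sizeof(seq) / sizeof(seq[0]);
	int count[3] = { 0 };
	double h = 0.0;
	int i;

	for (i = 0; i < n; i++)
		count[seq[i]]++;	/* histogram of the delta time values */
	for (i = 0; i < 3; i++)
		if (count[i])
			h -= (double)count[i] / n * log2((double)count[i] / n);
	printf("Shannon entropy: %.3f bits/symbol\n", h);	/* ~0.918 */
	return 0;
}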

> For software:
>
> - The noise sources of interrupts, HID events, HDD fluctuations are all based
> on inductive science. There are not even formulas or another mathematical
> model behind them to state that they are good for RNGs.

Indeed. However, in the absence of a 'real' RNG, these are the best
that the kernel has. And at least these events come from multiple
sources and are mostly independent.

>>> We can certainly test very much, but one thing we cannot prove, and
>>> that is the fundamental jitter, provided it is a result of quantum
>>> fluctuations. Just as with any other noise source, basic fundamental
>>> principles are hard if not impossible to test.
>>
>> You cannot test if the noise source was replaced with fake hardware.
>> But if you know the characteristics of the noise source, you can test
>> for likely failure modes, such as the output value being stuck or
>> oscillating.
>
> And here I am asking for help! [...]
> When you ask for testing of stuck values, what shall I really test for?
> Shall I test adjacent measurements for the same or alternating values?

Same or alternating delta time values happen even on random CPUs. You
need a theory of how random and non-random CPUs work, and how this
difference affects the delta times, before you can test for that.

(You might take the deterministic toy CPU from above, use more realistic
timings, and add several layers of cache, write buffers, timer I/O wait
states, and out-of-order execution, while still staying deterministic.
Then add real randomness to the I/O or bus clocks, or power management
delays, and check if the result has the same characteristics as your
measurements of real CPUs.)

> The test for the same values is caught with the Von-Neumann unbiaser.

No, the von Neumann unbiaser is run on the whitened bitstream, i.e.,
_after_ the folding operation.

The unbiasing should happen _before_ the whitener. This implies that
you cannot use the von Neumann unbiaser because the sequence of delta
times is not a sequence of single, independent bits.

(Even for single bits, independence is strictly needed. Otherwise,
a von Neumann unbiaser could _remove_ entropy. For example, consider
a noise source that outputs bits in groups of two, where the first bit
is perfectly random, but the second bit is always zero, or the same as
the first.)
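
To make that concrete: if the source emits pairs (b, b) with b perfectly
random, every pair is 00 or 11, so the unbiaser discards all of them --
one bit of entropy per pair is thrown away entirely. If the second bit
is always zero, only the 10 pairs produce output, and the output is a
constant stream of ones.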

> If I test for alternating values, other people may come in and ask to
> check for pattern x or y.

Well, if those people know a pattern that can _actually_ differentiate
between random and non-random delta times, then you should test for it.
(And I'd buy them a beer; with an entropy estimate, the CPU jitter RNG
would be perfect.)


Regards,
Clemens

2013-11-10 17:22:05

by Stephan Müller

[permalink] [raw]
Subject: Re: [PATCH] CPU Jitter RNG: inclusion into kernel crypto API and /dev/random

Am Sonntag, 10. November 2013, 17:31:07 schrieb Clemens Ladisch:

Hi Clemens,

> Stephan Mueller wrote:
> > Am Samstag, 9. November 2013, 23:04:49 schrieb Clemens Ladisch:
> >> Stephan Mueller wrote:
> >>> Am Mittwoch, 6. November 2013, 08:04:32 schrieb Theodore Ts'o:
> >>>> On Wed, Nov 06, 2013 at 01:51:17PM +0100, Stephan Mueller wrote:
> >>>>>> That's unfortunate, since it leaves open the question of whether this
> >>>>>> jitter is something that could be at least somewhat predictable if
> >>>>>> you
> >>>>>> had a lot more information about the internal works of the CPU or
> >>>>>> not....
> >>>>>
> >>>>> I do not understand that answer: I thought we are talking about the
> >>>>> search of non-predictable noise sources. If you cannot predict the
> >>>>> sequence even if you have the state of the CPU, that is what we are
> >>>>> looking for, is it not?
> >>>>
> >>>> I was asking the question about whether someone who knew more about
> >>>> the internal _workings_ of the CPU, not of the state of the CPU.
> >>>> This is not necessarily "the NSA guy", but someone who knows more
> >>>> about the internal workings of the Intel CPU (such as an Intel
> >>>> engineer --- and I've had Intel express misgivings about approaches
> >>>> which depend on "CPU jitter" approaches), or just someone who has
> >>>> spent a lot more time trying to examine the black box of the Intel CPU
> >>>> from the outside.
> >>>
> >>> I try to get more information from my contacts at other vendors. But I
> >>> am wondering what shall we do if the answer is (maybe even proven with
> >>> some test results) that they see the same issue themselves and have no
> >>> handle on it?
> >>>
> >>> I mean, what is it that I would need to test and demonstrate to prove or
> >>> disprove my RNG?
> >>
> >> You need to prove that the CPU will never get into an internal state
> >> where the loop execution times happen to form a predictable pattern.
> >> Alternatively, detect this so that the measurements can be thrown away.
> >
> > That statement sounds very nice, and I would love to prove it. But I do
> > not
> > see any way to do that except applying statistical tests on a large number
> > of different systems
>
> Then you cannot say anything about unknown future CPU models.

I concur.

But neither can we say anything about future HDDs (note that they are now
being filled with helium instead of air -- does that change anything in the
disturbances?) or timers or interrupt hardware. Also, we know for a fact that
the reliance on HDDs as seed sources breaks down even today with SSDs!

Again, we are not discussing random.c, so please let us not further
elaborate on HDDs/SSDs and their significance. But I am asking that my RNG be
treated with the *same* level of rigor we apply to other seed sources. Thus,
we can safely assume that future CPUs have that jitter too, because chip
vendors will NOT simply drop all their existing effort and start from zero
designing a new CPU.

Besides, all my measurements show that the newer the CPU, the more jitter
it shows. Nonetheless, even the oldest CPUs I tested show good jitter.
>
> > But we have to keep the requirements on my RNG in line with the research
> > applied to other noise sources. Let us look at physical noise sources we
> > all know and love:
> >
> > - The conclusion that radioactive decay is random is based on the quantum
> > theory. That theory contains a number of formulas which were all validated
> > with a plethora of measurements. Yet, there is no proof (in the
> > mathematical sense) that the formulas are correct.
>
> But we assume that the unpredictability of quantum mechanics is a limit

Here you say it: we *assume*. And please bear in mind that we all know for a
fact that the theory behind quantum physics is not complete, because it does
not work together with the theory of relativity. That again is a hint that
there is NO proof that decay is really unpredictable.

But we digress again, and I will skip further discussion of that theory.

> for _everyone_. In the case of CPUs, the jitter you observe in delta
> times results in part from the complexities of the inner state, and in
> part from real random noise. The first part is deterministic and might
> be predicted by anyone who has enough knowledge about the CPU's
> internals.

Right, and that is why I tried to eliminate the CPU mechanisms that may have
a deterministic impact. If I missed a mechanism or you have other
suggestions, please help me.
>
> Additionally, physical noise sources allow us to estimate the entropy of
> our measurements. There is no such estimate for the CPU jitter RNG.

Well, I thought I gave hints on the entropy estimate in all my graphs by
calculating a Shannon entropy value. Granted, that is calculated over the
execution duration of my folding loop. Yet, it gives a good hint about the
variations in place.
>
> (BTW, the entropy calculations in the paper are meaningless, because the
> Shannon formula assumes the values are created independently. Entropy
> calculation always depends on the process that created those values.
> For example, the entropy of 11001001000011111101101010100010001000 is
> zero if you know that this is just pi. To take a more relevant example,
> assume a completely deterministic CPU where each measuring loop takes
> exactly one timer tick; the delta times would look like this:
> 1 1 1 1 1 1 1 1 1 1 1 1 ...
> Now assume that the timer runs at a speed of 4/3 relative to the CPU,
> i.e., every third loop, another tick has accumulated:
> 1 1 2 1 1 2 1 1 2 1 1 2 ...
> Now assume that in every fourth loop iteration, the instruction decoder
> falls behind and stalls the pipeline for the same time as one loop
> iteration:
> 1 1 2 2 2 1 1 3 1 2 1 3 ...
> This sequence still has zero entropy.)

First, the patterns you mention will be immediately obvious in the Chi-
Square measurements -- see section 4.3 of my document. So, I disregard the
discussion of patterns.

Second, we have to establish that entropy is *relative*. What is entropic to
me may be fully deterministic to you (when I look at my wife's desk, it is
all chaos to me, but when I ask her to get me some information, she needs 5
seconds to dig through her heaps and find it -- i.e., it is no chaos for
her). That said, of course, if I had full knowledge of all mechanisms that
affect my RNG, I would have zero entropy. But what I am saying is that I do
not have that knowledge. The key now is to explain that nobody else has it
either.
>
> > For software:
> >
> > - The noise sources of interrupts, HID events, HDD fluctuations are all
> > based on deductive science. There are not even formulas or other
> > mathematical models behind them to state that they are good for RNGs.
>
> Indeed. However, in the absence of a 'real' RNG, these are the best
> that the kernel has. And at least these events come from multiple
> sources and are mostly independent.

Great, here we agree. All I am offering is yet *another* noise
source that is *added* to the existing ones.

The only big difference is that my noise source works on demand, whereas the
others are triggered by kernel operations. Therefore, I tried to provide the
patch in a way that my RNG is used as the last resort before /dev/random
blocks.

Yet, my RNG as a seed source is available much earlier than any of the other
seed sources. Moreover, it works in environments where the others start to
fail. That is why I am pushing my RNG: I have an itch (the missing
/dev/random entropy in environments I care about -- embedded systems, Live
CDs and virtual environments) that I want to scratch.
>
> >>> We can certainly test very much, but one thing we cannot prove, and
> >>> that is the fundamental jitter, provided it is a result of quantum
> >>> fluctuations. Just as with any other noise source, basic fundamental
> >>> principles are hard if not impossible to test.
> >>
> >> You cannot test if the noise source was replaced with fake hardware.
> >> But if you know the characteristics of the noise source, you can test
> >> for likely failure modes, such as the output value being stuck or
> >> oscillating.
> >
> > And here I am asking for help! [...]
> > When you ask for testing of stuck values, what shall I really test for?
> > Shall I test adjacent measurements for the same or alternating values?
>
> Same or alternating delta time values happen even on random CPUs. You
> need a theory of how random and non-random CPUs work, and how this
> difference affects the delta times, before you can test for that.

Are you telling me that I should invent a formula and apply it? Where have you
seen such an approach in existing RNGs?
>
> (You might take the deterministic toy CPU from above, use more realistic

Your toy CPU from above contains some artificially created deterministic
behavior. Sure, I can check for that. But then somebody else will ask why I
am not checking for pattern X or Y. So that effort is bound to fail, IMHO.

> timings, and add several layers of cache, write buffers, timer I/O wait
> states, and out-of-order execution, while still staying deterministic.
> Then add real randomness to the I/O or bus clocks, or power management
> delays, and check if the result has the same characteristics as your
> measurements of real CPUs.)

I have no idea what that proposal would look like in real life.
>
> > The case of identical values is caught by the Von-Neumann unbiaser.
>
> No, the von Neumann unbiaser is run on the whitened bitstream, i.e.,
> _after_ the folding operation.

The folding is whitening? How do you reach that conclusion? Yes, the folding
is my (very simple) post-processing. But I am not calling it whitening, as
all statistical problems of the underlying variations *will* still be visible
in the folded value. Patterns in the underlying timer *will* be visible -- and
I have demonstrated that in section 4.3.
>
> The unbiasing should happen _before_ the whitener. This implies that
> you cannot use the von Neumann unbiaser because the sequence of delta
> times is not a sequence of single, independent bits.

You would be right if the processing were whitening. But it is not! Whitening
implies that you are somehow able to hide statistical problems behind the
applied transformation. My folding DOES NOT do that. So it is no whitening.
>
> (Even for single bits, independence is strictly needed. Otherwise,
> a von Neumann unbiaser could _remove_ entropy. For example, consider
> a noise source that outputs bits in groups of two, where the first bit
> is perfectly random, but the second bit is always zero, or the same as
> the first.)

I know that. But from what I see, there are no patterns so far and therefore
the only conclusion I can reach is that they are independent.

But if you do not like the Von-Neumann unbiaser, take it out. And still, the
statistical tests will pass.
>
> > If I test for alternating values, other people may come in and ask to
> > check for pattern x or y.
>
> Well, if those people know a pattern that can _actually_ differentiate
> between random and non-random delta times, then you should test for it.
> (And I'd buy them a beer; with an entropy estimate, the CPU jitter RNG
> would be perfect.)

What would you expect me to do to come up with an entropy estimate beyond
what I have already done? I make so many assessments of entropy that I am
surprised to be told I have no entropy assessment.

Ciao
Stephan
--
| Cui bono? |

2013-11-10 20:28:06

by Clemens Ladisch

[permalink] [raw]
Subject: Re: [PATCH] CPU Jitter RNG: inclusion into kernel crypto API and /dev/random

Stephan Mueller wrote:
> Am Sonntag, 10. November 2013, 17:31:07 schrieb Clemens Ladisch:
>> In the case of CPUs, the jitter you observe in delta
>> times results in part from the complexities of the inner state, and in
>> part from real random noise. The first part is deterministic and might
>> be predicted by anyone who has enough knowledge about the CPU's
>> internals.
>
> Right, and that is why I tried to eliminate the CPU mechanisms that may have
> a deterministic impact. If I missed a mechanism or you have other
> suggestions, please help me.

Many CPUs allow disabling branch prediction, but this is very vendor
specific (try to find MSR documentation). The biggest offender probably
is the out-of-order execution engine, which cannot be disabled.

>>> When you ask for testing of stuck values, what shall I really test for?
>>> Shall I test adjacent measurements for the same or alternating values?
>>
>> Same or alternating delta time values happen even on random CPUs. You
>> need a theory of how random and non-random CPUs work, and how this
>> difference affects the delta times, before you can test for that.
>
> Are you telling me that I should invent a formula and apply it?

I was not implying that the theory has nothing to do with the physical
device. It must correctly _describe_ the relevant physical processes.

>>> The case of identical values is caught by the Von-Neumann unbiaser.
>>
>> No, the von Neumann unbiaser is run on the whitened bitstream, i.e.,
>> _after_ the folding operation.
>
> The folding is whitening? How do you reach that conclusion? Yes, the folding
> is my (very simple) post-processing. But I am not calling it whitening, as
> all statistical problems of the underlying variations *will* still be visible
> in the folded value.

If you don't want to call it "whitening", call it "randomness extraction"
instead. But its input is a series of delta times like this:
00000000000000000000000001010011
00000000000000000000000010011010
00000000000000000000000001011011
00000000000000000000000001100100
00000000000000000000000010111000
and the purpose of the folding is to remove these zero patterns.

> What would you expect me to do to come up with an entropy estimate beyond
> what I have already done?

I do not expect you (or anybody) to be able to come up with a correct
entropy estimate for CPU jitter.

> I make so many assessments of entropy that I am surprised to be told I
> have no entropy assessment.

Again: Shannon entropy assumes that you have a sequence of independent
and identically distributed random variables. And you cannot prove
these properties from the output; you need to know the process that
generates the values.
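
(For reference, the formula in question: for a source that emits symbol i
with probability p_i, the Shannon entropy is

    H = - sum_i p_i * log2(p_i)

bits per symbol. Estimating the p_i from observed frequencies and plugging
them in is meaningful only under exactly this i.i.d. assumption.)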


Regards,
Clemens

2013-11-11 02:59:22

by H. Peter Anvin

[permalink] [raw]
Subject: Re: [PATCH] CPU Jitter RNG: inclusion into kernel crypto API and /dev/random

On 11/10/2013 09:21 AM, Stephan Mueller wrote:
>
> Here you say it: we *assume*. And please bear in mind that we all know for a
> fact that the theory behind quantum physics is not complete, because it does
> not work together with the theory of relativity. That again is a hint that
> there is NO proof that decay is really unpredictable.
>

Honestly, if you want to be taken seriously, this is not the kind of
thing to put in your discussion.

-hpa

2013-11-13 03:12:47

by Stephan Müller

[permalink] [raw]
Subject: Re: [PATCH] CPU Jitter RNG: inclusion into kernel crypto API and /dev/random

Am Sonntag, 10. November 2013, 21:28:06 schrieb Clemens Ladisch:

Hi Clemens,

> Stephan Mueller wrote:
> > Am Sonntag, 10. November 2013, 17:31:07 schrieb Clemens Ladisch:
> >> In the case of CPUs, the jitter you observe in delta
> >> times results in part from the complexities of the inner state, and in
> >> part from real random noise. The first part is deterministic and might
> >> be predicted by anyone who has enough knowledge about the CPU's
> >> internals.
> >
> > Right, and that is why I tried to eliminate the CPU mechanisms that may
> > have a deterministic impact. If I missed a mechanism or you have other
> > suggestions, please help me.
>
> Many CPUs allow disabling branch prediction, but this is very vendor
> specific (try to find MSR documentation). The biggest offender probably
> is the out-of-order execution engine, which cannot be disabled.

I was also digging around in this area. My research so far showed that only
on ARM can one disable the branch prediction unit. Unfortunately, I do not
have an ARM system available where I can do that. I have my Samsung phone,
but it runs Android and I am not sure how to build a kernel module for it.

For x86, I have not found a way to disable the unit. Nonetheless, I tried to
reduce the effect by "warming up" the caches and the branch predictor
(see section 6.1.1 of the new version of the documentation). There I execute
the test 1000 times and use only the last result for further analysis.
>
> >>> When you ask for testing of stuck values, what shall I really test for?
> >>> Shall I test adjacent measurements for the same or alternating values?
> >>
> >> Same or alternating delta time values happen even on random CPUs. You
> >> need a theory of how random and non-random CPUs work, and how this
> >> difference affects the delta times, before you can test for that.
> >
> > Are you telling me that I should invent a formula and apply it?
>
> I was not implying that the theory has nothing to do with the physical
> device. It must correctly _describe_ the relevant physical processes.

Right, but currently I am not sure how I can find such a description. In
particular, I created a new round of testing which is even more interesting,
as the results do not allow pinpointing the exact root cause. More on that in
a separate email.

Do you have an idea?
>
> >>> The case of identical values is caught by the Von-Neumann unbiaser.
> >>
> >> No, the von Neumann unbiaser is run on the whitened bitstream, i.e.,
> >> _after_ the folding operation.
> >
> > The folding is whitening? How do you reach that conclusion? Yes, the
> > folding is my (very simple) post-processing. But I am not calling it
> > whitening, as all statistical problems of the underlying variations
> > *will* still be visible in the folded value.
>
> If you don't want to call it "whitening", call it "randomness extraction"
> instead. But its input is a series of delta times like this:
> 00000000000000000000000001010011
> 00000000000000000000000010011010
> 00000000000000000000000001011011
> 00000000000000000000000001100100
> 00000000000000000000000010111000
> and the purpose of the folding is to remove these zero patterns.

Not quite. Let me explain the motivation behind the folding loop. To maintain
entropy mathematically, there are only a few operations allowed. One of them
is a bijective operation, which implies that two strings combined with a
bijective operation will form a new string that contains the maximum entropy
of the initial strings. XOR is a bijective operation.

Hence, if you have a string with 10 bits that holds 5 bits of entropy and XOR
it with a 20 bit string that holds 2 bits of entropy, you receive a string
that is 20 bits in length and holds 5 bits of entropy.

In any case, with a bijective operation it is not possible to lose entropy,
even when you use it to add fully deterministic values to a bit stream that
is believed to hold entropy.
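
(In formulas: for independent X and Y, H(X xor Y) >= max(H(X), H(Y)), so
XORing in a value that is fully known to an attacker cannot reduce the
entropy that is already present.)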

That said, the folding loop uses that line of thought. The loop XORs each bit
with every other bit to obtain one bit at the end. The idea is to collapse
the initial bit stream while still retaining the entropy present in each
individual bit. The goal is that the resulting bit will hold one bit of
entropy by "collecting" the combined entropy found in the individual bits.

That folding operation will lose entropy if the overall entropy in the
folded bit stream is more than one bit. But that point is to our advantage,
because it provides a safety margin if the value to be folded really holds
more than one bit of entropy. All my measurements in section 5.1 and appendix
F try to show that on every CPU there is always more than one bit of
entropy.

There is a catch, however: what happens if each individual bit of the bit
stream holds less than one bit of entropy? I.e., the entire bit stream may
hold more than one bit, but when chopping up the bit stream, none of the
individual bits may hold a full bit of entropy. As XOR only retains entropy
but never enlarges it (as a concatenation would do), this would be a problem
for the RNG. However, the measurements and analyses given in section 5.2.1
come to the rescue: they show that the folded value behaves as if it holds
one bit of entropy.

To support that conclusion, the folding loop is sometimes executed in
multiples of 64. This looks strange at first glance, but these multiples
ensure that the distribution of the folding-loop bits discussed in section
5.2.1 stabilizes quickly. And with these multiple executions of the folding
loop, the upper boundary of the entropy is defined.

Now coming back to your concern: it is surely possible to put the Von-Neumann
unbiaser before the folding loop. But considering the argument above, no
information is lost apart from entropy exceeding one bit, due to the use of a
bijective operation. This means it would not matter whether the Von-Neumann
unbiaser is placed before or after the folding. However, when placing it
after the folding, the unbiaser will throw away more values, which makes the
approach more conservative (it is more likely to find adjacent 0 0 or 1 1
combinations than identical adjacent time deltas).

Finally, the Von-Neumann unbiaser applies to individual bits. I am unsure how
it would fit the 64-bit time delta value that is the input to the folding
loop.

> > I make so many assessments of entropy that I am surprised to be told I
> > have no entropy assessment.
>
> Again: Shannon entropy assumes that you have a sequence of independent
> and identically distributed random variables. And you cannot prove
> these properties from the output; you need to know the process that
> generates the values.

I am absolutely on your side here. And as we cannot (yet?) fully conclude that
the observations are independent, a safety margin must always be considered.
The RNG has the following safety margins where it is more conservative than
measurements indicate:

- the folding loop produces one bit, whereas measurements of the time deltas
that go into the folding loop typically show a lower boundary above 2 bits.
The upper boundary is typically at 6 bits or higher. Some systems (especially
older CPUs) show variations whose lower boundary is less than 2 bits, but all
of them are well above one bit. (Note: my first designs had a folding loop
that results in three bits, but I then reduced it to two and finally to one
bit to ensure a safety margin.)

- the placement of the Von-Neumann unbiaser will throw away more bits than it
would if it were placed before the folding loop. That means the noise source
must produce more bits which are assembled into the final random
number

- the execution of the folding loop in multiples of 64, chosen with another
time stamp, adds more uncertainty to the folding timing -- therefore we have
an upper boundary for the entropy estimation (see the sketch below). I am
almost certain that the selection of the multiplier value is not independent
of the time stamps gathered for the time delta measurement (due to using the
four lowest bits of that time stamp to determine the multiplier value). That
means that the upper boundary calculated with the Shannon entropy formula is
certainly too high. But the "real" value is above the lower boundary.

Thanks for your thoughts.

Ciao
Stephan
--
| Cui bono? |

2013-11-13 03:37:27

by Stephan Müller

[permalink] [raw]
Subject: Re: [PATCH] CPU Jitter RNG: Executing time variation tests on bare metal

Am Dienstag, 29. Oktober 2013, 09:24:48 schrieb Theodore Ts'o:

Hi Theodore,

> On Tue, Oct 29, 2013 at 09:42:30AM +0100, Stephan Mueller wrote:
> > Based on this suggestion, I now added the tests in Appendix F.46.8 where
> > I disable the caches and the tests in Appendix F.46.9 where I disable
> > the caches and interrupts.
>
> What you've added in F.46 is a good start, but as a suggestion,
> instead of disabling one thing at a time, try disabling *everything*
> and then see what you get, and then enabling one thing at a time. The
> best thing is if you can get to the point where the amount of entropy
> is close to zero. Then as you add things back, there's a much better
> sense of where the unpredictability might be coming from, and whether
> the unpredictability is coming from something which is fundamentally
> arising from something which is chaotic or quantum effect, or just
> because we don't have a good way of modelling the behavior of the
> L1/L2 cache (for example) and that is spoofing your entropy estimator.

I was now able to implement two more test buckets that had been on my mind
for quite some time. They are documented in the new sections 6.3 and 6.4 in [1].

The tests for the time variation measurements are now executed on bare metal,
i.e. without *any* operating system underneath. To achieve that, I used the
memtest86+ tool, ripped out the memory tests and added the time variation
testing in their place.

The time variation tests now execute single threaded without any interference
from an underlying operating system. Again, I added all the variations of
disabling CPU support (TLB flushes, L1/2 flushes, cache disabling, ...).

And, surprise: all the jitter is still there.

Furthermore, I use the same vehicle to just measure the variations by
obtaining two timestamps immediately after each other and calculating the
difference. As before, there are various tests which disable the different CPU
mechanisms.

And, surprise: there are still variations visible. Granted, these variations
are smaller than the ones for the folding loop. But the smallest variations
still show way more than 1 bit when applying the Shannon entropy formula.

The code is uploaded to [2] and is available to play with.

In addition, I added test results with varying loads as explained in section
6.2 (thanks to Nicholas Mc Guire for helping here).

[1] http://www.chronox.de/jent/doc/CPU-Jitter-NPTRNG.html

[2] http://www.chronox.de/

Ciao
Stephan
--
| Cui bono? |

2013-11-13 11:51:52

by Clemens Ladisch

[permalink] [raw]
Subject: Re: [PATCH] CPU Jitter RNG: inclusion into kernel crypto API and /dev/random

Stephan Mueller wrote:
> Am Sonntag, 10. November 2013, 21:28:06 schrieb Clemens Ladisch:
>> Many CPUs allow disabling branch prediction, but this is very vendor
>> specific (try to find MSR documentation). The biggest offender probably
>> is the out-of-order execution engine, which cannot be disabled.
>
> For x86, I have not found a way to disable the unit.

My AMD processor can do this in the IC_CFG/DC_CFG MSRs. (See the BKDG.)

The Intel Core family has other settings (such as data prefetching) that
are configurable in the IA32_MISC_ENABLE MSR.

(And any setting that increases accesses to main memory is likely to
introduce more entropy due to clock drift between the processor and the
memory bus. Or where do you assume the entropy comes from?)

BTW: MFENCE is not guaranteed to flush the instruction pipeline; you
need CPUID for that.

> XOR is a bijective operation.

Only XOR with a constant, which is not what you're doing.

>>> I make so many assessments of entropy that I am surprised to be told I
>>> have no entropy assessment.
>>
>> Again: Shannon entropy assumes that you have a sequence of independent
>> and identically distributed random variables. And you cannot prove
>> these properties from the output; you need to know the process that
>> generates the values.
>
> I am absolutely on your side here. And as we cannot (yet?) fully conclude that
> the observations are independent, a safety margin must always be considered.

If the observations are not independent, you cannot just assume that the
estimate is off by a certain factor. It means that the estimate is not
applicable _at all_.

> The RNG has the following safety margins where it is more conservative than
> measurements indicate:

You _cannot_ measure entropy by looking at the output. How would you
differentiate between /dev/random and /dev/urandom?


Regards,
Clemens

2013-11-13 15:15:26

by Stephan Müller

[permalink] [raw]
Subject: Re: [PATCH] CPU Jitter RNG: inclusion into kernel crypto API and /dev/random

Am Mittwoch, 13. November 2013, 12:51:44 schrieb Clemens Ladisch:

Hi Clemens,

>Stephan Mueller wrote:
>> Am Sonntag, 10. November 2013, 21:28:06 schrieb Clemens Ladisch:
>>> Many CPUs allow disabling branch prediction, but this is very
>>> vendor specific (try to find MSR documentation). The biggest offender
>>> probably is the out-of-order execution engine, which cannot be
>>> disabled.
>>
>> For x86, I have not found a way to disable the unit.
>
>My AMD processor can do this in the IC_CFG/DC_CFG MSRs. (See the
>BKDG.)

Thanks for the hint, I will look into that one.
>
>The Intel Core family has other settings (such as data prefetching)
>that are configurable in the IA32_MISC_ENABLE MSR.

Thanks for the hint, I will check that as well.
>
>(And any setting that increases accesses to main memory is likely to
>introduce more entropy due to clock drift between the processor and the
>memory bus. Or where do you assume the entropy comes from?)

You nailed it. When I disable the caches using the CR0 setting, I get a
massive increase in variations and thus entropy.
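
For reference, a minimal ring-0 sketch of that CR0 setting, following the
architecture manuals rather than my actual test code:

/* Disable caching on x86: set CR0.CD (bit 30), then write back and
 * invalidate the caches.  Must run in kernel mode. */
static void disable_caches(void)
{
    unsigned long cr0;

    asm volatile("mov %%cr0, %0" : "=r"(cr0));
    cr0 |= 1UL << 30;                   /* CR0.CD: cache disable */
    asm volatile("mov %0, %%cr0" : : "r"(cr0));
    asm volatile("wbinvd");             /* flush caches */
}
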
>
>BTW: MFENCE is not guaranteed to flush the instruction pipeline; you
>need CPUID for that.

I was not aware of that. Can I simply call the CPUID instruction to read
it or do I have to do something special?
>
>> XOR is a bijective operation.
>
>Only XOR with a constant, which is not what you're doing.

If you want to regain the full 64 bit input bit stream, then you are
right.

But still, XOR is bijective with respect to the two bits that go into
the operation. Before I released the RNG work, I had the mathematical
side and the entropy considerations analyzed by a mathematician who
specializes in the field of cryptography and statistics. She actually
said that this folding based on XOR is an appropriate way to collapse
the string and yet retain entropy.
>
>>>> I make so many assessments of entropy that I am surprised to be
>>>> told I have no entropy assessment.
>>>
>>> Again: Shannon entropy assumes that you have a sequence of independent
>>> and identically distributed random variables. And you cannot prove
>>> these properties from the output; you need to know the process that
>>> generates the values.
>>
>> I am absolutely on your side here. And as we cannot (yet?) fully
>> conclude that the observations are independent, a safety margin must
>> always be considered.
>If the observations are not independent, you cannot just assume that
>the estimate is off by a certain factor. It means that the estimate
>is not applicable _at all_.

Well, I am not so sure here. If there are concerns about independence
(which I currently do not have, but more on that below), I have always
seen safety margins put in place. And that is what I tried here
as well.
>
>> The RNG has the following safety margins where it is more
>> conservative than measurements indicate:
>You _cannot_ measure entropy by looking at the output. How would you
>differentiate between /dev/random and /dev/urandom?

I think regarding the independence you can very well draw the conclusion
based on the output, because any dependencies will ultimately form a
pattern. That pattern would initially be visible in the measurements
that go into the folding loop. What I now must show is that these
measurements do not have a pattern.

In addition, we need to keep in mind that the independence claim is
relative just like the entropy itself.

If you have an output string that has no detectable patterns, i.e. looks
like white noise, you can only assume that the observations are
independent of each other. If there are patterns, you know that there
are dependencies.

That statement may *not* apply any more if you look at the internals of
the RNG. In such a case, you may have even strong dependencies.

The best example in this matter are deterministic RNGs with a
cryptographic output function. When you see the output string, you only
see white noise. As you cannot detect a pattern, you have to assume that
the string provides independent observations. At least you, looking
from the outside, cannot draw any conclusions about the future bit
stream from observing some output bit stream. Yet, when you know the
state of the RNG, the entire future output bit stream has no
independence.

Coming back to the jitter RNG: I applied a large array of statistical
tests and all of them concluded that the output is white noise, as
explained in the documentation. That means that, when looking from the
outside, there are no dependencies visible. Now you may say that this
conclusion does not mean that there are no dependencies -- but I reply
again: if there were any, none of the array of statistical tests could
detect them. So, how would an attacker detect patterns? And again, I
cannot prove it, just as nobody else can prove it for other hardware
noise sources.

In addition, when we conclude that the output bit stream has no
dependencies due to the absence of patterns, and considering that the
folding operation does not disguise patterns from the underlying
measurements, I draw the conclusion that there are no patterns in the
underlying measurements. The "Anti-Tests" I outline in section 4.3 show
that if there were patterns in the underlying measurements, they would
be immediately visible in the output bit stream.
>
>
>Regards,
>Clemens


Ciao
Stephan

2013-11-13 17:14:55

by Pavel Machek

[permalink] [raw]
Subject: Re: [PATCH] CPU Jitter RNG: inclusion into kernel crypto API and /dev/random

Hi!

> >BTW: MFENCE is not guaranteed to flush the instruction pipeline; you
> >need CPUID for that.
>
> I was not aware of that. Can I simply call the CPUID instruction to read
> it or do I have to do something special?

Simply call CPUID (warning: it clobbers some registers).
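
A minimal GCC sketch for x86-64; the wrapper name is illustrative:

#include <stdint.h>

/* Serialized TSC read: CPUID drains the instruction pipeline before
 * RDTSC executes.  CPUID clobbers eax, ebx, ecx and edx. */
static inline uint64_t serialized_rdtsc(void)
{
    uint32_t lo, hi, a, b, c, d;

    asm volatile("cpuid"
                 : "=a"(a), "=b"(b), "=c"(c), "=d"(d)
                 : "a"(0), "c"(0));
    asm volatile("rdtsc" : "=a"(lo), "=d"(hi));
    return ((uint64_t)hi << 32) | lo;
}
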
Pavel

--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

2013-11-14 10:51:10

by Clemens Ladisch

[permalink] [raw]
Subject: Re: [PATCH] CPU Jitter RNG: inclusion into kernel crypto API and /dev/random

Stephan Mueller wrote:
> Am Mittwoch, 13. November 2013, 12:51:44 schrieb Clemens Ladisch:
>> (And any setting that increases accesses to main memory is likely to
>> introduce more entropy due to clock drift between the processor and the
>> memory bus. Or where do you assume the entropy comes from?)
>
> You nailed it. When I disable the caches using the CR0 setting, I get a
> massive increase in variations and thus entropy.

Now this would be an opportunity to use the number of main memory
accesses to estimate entropy. (And when you're running out of the
cache, i.e., deterministically, is there any entropy?)

>>> XOR is a bijective operation.
>>
>> Only XOR with a constant, which is not what you're doing.
>
> If you want to regain the full 64 bit input bit stream, then you are
> right.
>
> But still, XOR is bijective with respect to the two bits that go into
> the operation.

Please look up what "bijective" actually means:
<http://en.wikipedia.org/wiki/Bijection>

> folding based on XOR is an appropriate way to collapse the string and
> yet retain entropy.

Correct, but the reason for that is not being bijective.

>> If the observations are not independent, you cannot just assume that
>> the estimate is off by a certain factor. It means that the estimate
>> is not applicable _at all_.
>
> Well, I am not so sure here.

So, what is the factor that you are saying is large enough?

>>> The RNG has the following safety margins where it is more conservative than
>>> measurements indicate:
>>
>> You _cannot_ measure entropy by looking at the output. How would you
>> differentiate between /dev/random and /dev/urandom?
>
> I think regarding the independence you can very well draw the conclusion
> based on the output, because any dependencies will ultimately form a
> pattern.

The output of a pure PRNG (such as /dev/urandom without entropy) is
dependent on the internal state, but there are no patterns detectable
from the output alone.

> In addition, we need to keep in mind that the independence claim is
> relative

No, independence is a property of the process that generates the
samples.

> If you have an output string that has no detectable patterns, i.e. looks
> like white noise, you can only assume that the observations are
> independent of each other. If there are patterns, you know that there
> are dependencies.
>
> That statement may *not* apply any more if you look at the internals of
> the RNG. In such a case, you may have even strong dependencies.
>
> The best example in this matter are deterministic RNGs with a
> cryptographic output function. When you see the output string, you only
> see white noise. As you cannot detect a pattern, you have to assume that
> the string provides independent observations. At least you, looking
> from the outside, cannot draw any conclusions about the future bit
> stream from observing some output bit stream. Yet, when you know the
> state of the RNG, the entire future output bit stream has no
> independence.

You cannot restrict the analysis to observations from the outside;
there are many people who can know about the CPU's internal structure.

> Coming back to the jitter RNG: I applied a large array of statistical
> tests and all of them concluded that the output is white noise, as
> explained in the documentation. That means that, when looking from the
> outside, there are no dependencies visible. Now you may say that this
> conclusion does not mean that there are no dependencies -- but I reply
> again: if there were any, none of the array of statistical tests could
> detect them. So, how would an attacker detect patterns?

An attacker would not try to detect patterns; he would apply knowledge
of the internals.

Statistical tests are useful only for detecting the absence of entropy,
not for the opposite.


All your arguments about the output of the CPU jitter RNG also apply to
the output of /dev/urandom. So are you saying that it would be a good
idea to loop the output of /dev/urandom back into /dev/random?


Regards,
Clemens

2013-11-14 18:02:11

by Stephan Müller

[permalink] [raw]
Subject: Re: [PATCH] CPU Jitter RNG: inclusion into kernel crypto API and /dev/random

Am Donnerstag, 14. November 2013, 11:51:03 schrieb Clemens Ladisch:

Hi Clemens,

>Stephan Mueller wrote:
>> Am Mittwoch, 13. November 2013, 12:51:44 schrieb Clemens Ladisch:
>>> (And any setting that increases accesses to main memory is likely to
>>> introduce more entropy due to clock drift between the processor and
>>> the memory bus. Or where do you assume the entropy comes from?)
>>
>> You nailed it. When I disable the caches using the CR0 setting, I get
>> a massive increase in variations and thus entropy.
>
>Now this would be an opportunity to use the number of main memory
>accesses to estimate entropy. (And when you're running out of the
>cache, i.e., deterministically, is there any entropy?)
>

I seem to have found the root cause with my bare metal tester, but I am
as yet unable to make sense of it.

I will report back with more analyses.


>An attacker would not try to detect patterns; he would apply knowledge
>of the internals.

I do not buy that argument, because if an attacker can detect or deduce
the internals of the CPU, he surely can detect the state of the
input_pool or the other entropy pools behind /dev/random. And then,
/dev/random is not so entropic any more for that attacker.
>
>Statistical tests are useful only for detecting the absence of entropy,
>not for the opposite.

Again, I fully agree. But it is equally important to understand that
entropy is relative. And all I want and all I care about is that an
attacker has only the knowledge and ability to make measurements that
are less precise than 1 bit.

Ciao
Stephan

2013-11-14 18:31:19

by Clemens Ladisch

[permalink] [raw]
Subject: Re: [PATCH] CPU Jitter RNG: inclusion into kernel crypto API and /dev/random

Stephan Mueller wrote:
> Am Donnerstag, 14. November 2013, 11:51:03 schrieb Clemens Ladisch:
>> An attacker would not try to detect patterns; he would apply knowledge
>> of the internals.
>
> I do not buy that argument, because if an attacker can detect or deduce
> the internals of the CPU, he surely can detect the state of the
> input_pool or the other entropy pools behind /dev/random.

With "internals", I do not mean the actual state of the CPU, but the
behaviour of all the CPU's execution engines.

An Intel engineer might know how to affect the CPU so that the CPU
jitter code measures a deterministic pattern, but he will not know the
contents of my memory.

>> Statistical tests are useful only for detecting the absence of entropy,
>> not for the opposite.
>
> Again, I fully agree. But it is equally important to understand that
> entropy is relative.

In cryptography, we care about absolute entropy, i.e., _nobody_ must be
able to predict the RNG output, not even any CPU engineer.


Regards,
Clemens

2013-11-14 18:34:18

by Stephan Müller

[permalink] [raw]
Subject: Re: [PATCH] CPU Jitter RNG: inclusion into kernel crypto API and /dev/random

Am Donnerstag, 14. November 2013, 19:30:22 schrieb Clemens Ladisch:

Hi Clemens,

>Stephan Mueller wrote:
>> Am Donnerstag, 14. November 2013, 11:51:03 schrieb Clemens Ladisch:
>>> An attacker would not try to detect patterns; he would apply
>>> knowledge
>>> of the internals.
>>
>> I do not buy that argument, because if an attacker can detect or
>> deduce the internals of the CPU, he surely can detect the state of
>> the input_pool or the other entropy pools behind /dev/random.
>
>With "internals", I do not mean the actual state of the CPU, but the
>behaviour of all the CPU's execution engines.
>
>An Intel engineer might know how to affect the CPU so that the CPU
>jitter code measures a deterministic pattern, but he will not know the
>contents of my memory.

Here I agree fully.
>
>>> Statistical tests are useful only for detecting the absence of
>>> entropy, not for the opposite.
>>
>> Again, I fully agree. But it is equally important to understand that
>> entropy is relative.
>
>In cryptography, we care about absolute entropy, i.e., _nobody_ must be
>able to predict the RNG output, not even any CPU engineer.

With your clarification above, I agree here fully.

And now my task is to verify the root cause which I seem to have found.

Let me do my homework.

Ciao
Stephan