Subject: Safety of early boot init of /dev/random seed

Hello,

We are trying to enhance the Debian support for /dev/random seeding at early
boot, and we need some expert help to do it right. Maybe some of you could
give us some enlightenment on a few issues?

Apologies in advance if I got the list of Linux kernel maintainers wrong. I
have also copied LKML just in case.

A bit of context: Debian tries to initialize /dev/random by restoring the
pool size and giving it some seed material (through a write to /dev/random)
from saved state stored in /var.

Since we store the seed data in /var, that means we only feed it to
/dev/random relatively late in the boot sequence, after remote filesystems
are available. Thus, anything that needs random numbers earlier than that
point will run with whatever the kernel managed to harness without any sort
of userspace help (which is probably not much, especially on platforms that
clear RAM contents at reboot, or after a cold boot).

We take care of regenerating the stored seed data as soon as we use it, to
minimize the possibility of seed reuse. This means that we write the old
seed data to /dev/random, and immediately copy poolsize bytes from
/dev/urandom to the seed data file.

The seed data file is also regenerated prior to shutdown.
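For illustration, the restore-then-regenerate sequence described above can be
sketched as follows. This is a minimal sketch, not the actual Debian
initscript: the fixed 512-byte size is an illustrative assumption, and the
device paths are parameters only so the logic can be exercised against plain
files.

```python
import os

def restore_and_regenerate(seed_file, pool_dev="/dev/random",
                           rnd_dev="/dev/urandom", size=512):
    """Feed the saved seed into the kernel pool, then immediately
    overwrite the seed file with fresh bytes, so that the same seed
    is never fed twice (even if a later boot step fails)."""
    if os.path.exists(seed_file):
        # Writing to /dev/random mixes the data into the pool
        # without crediting any entropy to it.
        with open(seed_file, "rb") as seed, open(pool_dev, "wb") as pool:
            pool.write(seed.read())
    # Regenerate the seed file right away from the kernel's PRNG.
    with open(rnd_dev, "rb") as rnd, open(seed_file, "wb") as out:
        out.write(rnd.read(size))
```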

We would like to clarify some points, so as to know how safe they are in
the face of certain error modes, and also whether some of what we do is
necessary at all. Unfortunately, real answers require more intimate
knowledge of the theory behind Linux's random pools than we have in the
Debian initscripts team.

Here are our questions:

1. How much data of unknown quality can we feed the random pool at boot
before it causes damage (i.e. what is the threshold at which we violate the
"you are not going to be any worse than you were before" rule)?

2. How dangerous is it to feed the pool with stale seed data on the next
boot (i.e. in a failure mode where we do not regenerate the seed file)?

3. What is the optimal size of the seed data, based on the pool size?

4. How dangerous is it to have functions that need randomness (like
encrypted network and partitions, possibly encrypted swap with an
ephemeral key) run BEFORE the random seed is initialized?

5. Is there an optimal size for the pool? Does the quality of the randomness
one extracts from the pool increase or decrease with pool size?

Basically, we need these answers to find our way regarding the following
decisions:

a) Is it better to seed the pool as early as possible and risk a larger time
window for problem (2) above, rather than the current behaviour, where we
have a large time window in which (4) above happens?

b) Is it worth the effort to base the seed file on the size of the pool,
instead of just using a constant size? If a constant size is better,
which size would that be? 512 bytes? 4096 bytes? 16384 bytes?

c) What is the maximum seed file size we can allow (maybe based on the size
of the pool) to try to avoid problem (1) above?

We would be very grateful if you could help us find good answers to the
questions above.

--
"One disk to rule them all, One disk to find them. One disk to bring
them all and in the darkness grind them. In the Land of Redmond
where the shadows lie." -- The Silicon Valley Tarot
Henrique Holschuh


Subject: Re: [Pkg-sysvinit-devel] Bug#587665: Safety of early boot init of /dev/random seed

(adding Petter Reinholdtsen to CC, stupid MUA...)

On Sat, 03 Jul 2010, Henrique de Moraes Holschuh wrote:
> Hello,
>
> We are trying to enhance the Debian support for /dev/random seeding at early
> boot, and we need some expert help to do it right. Maybe some of you could
> give us some enlightenment on a few issues?
>
> Apologies in advance if I got the list of Linux kernel maintainers wrong. I
> have also copied LKML just in case.
>
> A bit of context: Debian tries to initialize /dev/random, by restoring the
> pool size and giving it some seed material (through a write to /dev/random)
> from saved state stored in /var.
>
> Since we store the seed data in /var, that means we only feed it to
> /dev/random relatively late in the boot sequence, after remote filesystems
> are available. Thus, anything that needs random numbers earlier than that
> point will run with whatever the kernel managed to harness without any sort
> of userspace help (which is probably not much, especially on platforms that
> clear RAM contents at reboot, or after a cold boot).
>
> We take care of regenerating the stored seed data as soon as we use it, in
> order to avoid as much as possible the possibility of reuse of seed data.
> This means that we write the old seed data to /dev/random, and immediately
> copy poolsize bytes from /dev/urandom to the seed data file.
>
> The seed data file is also regenerated prior to shutdown.
>
> We would like to clarify some points, so as to know how safe they are in
> the face of certain error modes, and also whether some of what we do is
> necessary at all. Unfortunately, real answers require more intimate
> knowledge of the theory behind Linux' random pools than we have in the
> Debian initscripts team.
>
> Here are our questions:
>
> 1. How much data of unknown quality can we feed the random pool at boot
> before it causes damage (i.e. what is the threshold at which we violate the
> "you are not going to be any worse than you were before" rule)?
>
> 2. How dangerous is it to feed the pool with stale seed data on the next
> boot (i.e. in a failure mode where we do not regenerate the seed file)?
>
> 3. What is the optimal size of the seed data, based on the pool size?
>
> 4. How dangerous is it to have functions that need randomness (like
> encrypted network and partitions, possibly encrypted swap with an
> ephemeral key) run BEFORE the random seed is initialized?
>
> 5. Is there an optimal size for the pool? Does the quality of the randomness
> one extracts from the pool increase or decrease with pool size?
>
> Basically, we need these answers to find our way regarding the following
> decisions:
>
> a) Is it better to seed the pool as early as possible and risk a larger time
> window for problem (2) above, instead of the current behaviour where we
> have a large time window where (4) above happens?
>
> b) Is it worth the effort to base the seed file on the size of the pool,
> instead of just using a constant size? If a constant size is better,
> which size would that be? 512 bytes? 4096 bytes? 16384 bytes?
>
> c) What is the maximum seed file size we can allow (maybe based on size of
> the pool) to try to avoid problem (1) above ?
>
> We would be very grateful if you could help us find good answers to the
> questions above.

--
"One disk to rule them all, One disk to find them. One disk to bring
them all and in the darkness grind them. In the Land of Redmond
where the shadows lie." -- The Silicon Valley Tarot
Henrique Holschuh

2010-07-05 18:40:38

by Matt Mackall

Subject: Re: [Pkg-sysvinit-devel] Bug#587665: Safety of early boot init of /dev/random seed

On Sat, 2010-07-03 at 13:08 -0300, Henrique de Moraes Holschuh wrote:
> (adding Petter Reinholdtsen to CC, stupid MUA...)
>
> On Sat, 03 Jul 2010, Henrique de Moraes Holschuh wrote:
> > Hello,
> >
> > We are trying to enhance the Debian support for /dev/random seeding at early
> > boot, and we need some expert help to do it right. Maybe some of you could
> > give us some enlightenment on a few issues?
> >
> > Apologies in advance if I got the list of Linux kernel maintainers wrong. I
> > have also copied LKML just in case.
> >
> > A bit of context: Debian tries to initialize /dev/random, by restoring the
> > pool size and giving it some seed material (through a write to /dev/random)
> > from saved state stored in /var.
> >
> > Since we store the seed data in /var, that means we only feed it to
> > /dev/random relatively late in the boot sequence, after remote filesystems
> > are available. Thus, anything that needs random numbers earlier than that
> > point will run with whatever the kernel managed to harness without any sort
> > of userspace help (which is probably not much, especially on platforms that
> > clear RAM contents at reboot, or after a cold boot).
> >
> > We take care of regenerating the stored seed data as soon as we use it, in
> > order to avoid as much as possible the possibility of reuse of seed data.
> > This means that we write the old seed data to /dev/random, and immediately
> > copy poolsize bytes from /dev/urandom to the seed data file.
> >
> > The seed data file is also regenerated prior to shutdown.
> >
> > We would like to clarify some points, so as to know how safe they are on
> > face of certain error modes, and also whether some of what we do is
> > necessary at all. Unfortunately, real answers require more intimate
> > knowledge of the theory behind Linux' random pools than we have in the
> > Debian initscripts team.
> >
> > Here are our questions:
> >
> > 1. How much data of unknown quality can we feed the random pool at boot
> > before it causes damage (i.e. what is the threshold at which we violate the
> > "you are not going to be any worse than you were before" rule)?

There is no limit. The mixing operations are computationally reversible,
which guarantees that no unknown degrees of freedom are clobbered when
mixing known data.

> > 2. How dangerous is it to feed the pool with stale seed data on the next
> > boot (i.e. in a failure mode where we do not regenerate the seed file)?

Not at all.

> > 3. What is the optimal size of the seed data, based on the pool size?

1:1.

> > 4. How dangerous is it to have functions that need randomness (like
> > encrypted network and partitions, possibly encrypted swap with an
> > ephemeral key) run BEFORE the random seed is initialized?

Depends on the platform. For instance, if you've got an unattended boot
off a Live CD on a machine with a predictable clock, you may get
duplicate outputs.

> > 5. Is there an optimal size for the pool? Does the quality of the randomness
> > one extracts from the pool increase or decrease with pool size?

Don't bother fiddling with the pool size.

> > Basically, we need these answers to find our way regarding the following
> > decisions:
> >
> > a) Is it better to seed the pool as early as possible and risk a larger time
> > window for problem (2) above, instead of the current behaviour where we
> > have a large time window where (4) above happens?

Earlier is better.

> > b) Is it worth the effort to base the seed file on the size of the pool,
> > instead of just using a constant size? If a constant size is better,
> > which size would that be? 512 bytes? 4096 bytes? 16384 bytes?

512 bytes is plenty.

> > c) What is the maximum seed file size we can allow (maybe based on size of
> > the pool) to try to avoid problem (1) above ?

Anything larger than a sector is simply wasting CPU time, but is
otherwise harmless.

--
Mathematics is the supreme nostalgia of our time.

Subject: Re: [Pkg-sysvinit-devel] Bug#587665: Safety of early boot init of /dev/random seed

On Mon, 05 Jul 2010, Matt Mackall wrote:
> > > Here are our questions:
> > >
> > > 1. How much data of unknown quality can we feed the random pool at boot
> > > before it causes damage (i.e. what is the threshold at which we violate the
> > > "you are not going to be any worse than you were before" rule)?
>
> There is no limit. The mixing operations are computationally reversible,
> which guarantees that no unknown degrees of freedom are clobbered when
> mixing known data.

Good. So, whatever we do, we are never worse off than we were before we did
it, at least by design.

> > > 2. How dangerous is it to feed the pool with stale seed data on the next
> > > boot (i.e. in a failure mode where we do not regenerate the seed file)?
>
> Not at all.
>
> > > 3. What is the optimal size of the seed data, based on the pool size?
>
> 1:1.

We shall try to keep it at 1:1, then.

> > > 4. How dangerous is it to have functions that need randomness (like
> > > encrypted network and partitions, possibly encrypted swap with an
> > > ephemeral key) run BEFORE the random seed is initialized?
>
> Depends on the platform. For instance, if you've got an unattended boot
> off a Live CD on a machine with a predictable clock, you may get
> duplicate outputs.

I.e. it is somewhat dangerous, and we should try to avoid it by design, so
we should try to init it as early as possible. Very well.

> > > 5. Is there an optimal size for the pool? Does the quality of the randomness
> > > one extracts from the pool increase or decrease with pool size?
>
> Don't bother fiddling with the pool size.

We don't, but local admins often do, probably in an attempt to better handle
bursts of entropy drainage. So, we do want to properly support non-standard
pool sizes in Debian if we can.

> > > Basically, we need these answers to find our way regarding the following
> > > decisions:
> > >
> > > a) Is it better to seed the pool as early as possible and risk a larger time
> > > window for problem (2) above, instead of the current behaviour where we
> > > have a large time window where (4) above happens?
>
> Earlier is better.
>
> > > b) Is it worth the effort to base the seed file on the size of the pool,
> > > instead of just using a constant size? If a constant size is better,
> > > which size would that be? 512 bytes? 4096 bytes? 16384 bytes?
>
> 512 bytes is plenty.
>
> > > c) What is the maximum seed file size we can allow (maybe based on size of
> > > the pool) to try to avoid problem (1) above ?
>
> Anything larger than a sector is simply wasting CPU time, but is
> otherwise harmless.

Well, a filesystem block is usually 1024 bytes, and a sector is 4096 bytes
nowadays... :-)

--
"One disk to rule them all, One disk to find them. One disk to bring
them all and in the darkness grind them. In the Land of Redmond
where the shadows lie." -- The Silicon Valley Tarot
Henrique Holschuh

2010-07-16 02:41:27

by Matt Mackall

Subject: Re: [Pkg-sysvinit-devel] Bug#587665: Safety of early boot init of /dev/random seed

On Thu, 2010-07-15 at 20:33 -0300, Henrique de Moraes Holschuh wrote:
> On Mon, 05 Jul 2010, Matt Mackall wrote:
> > > > Here are our questions:
> > > >
> > > > 1. How much data of unknown quality can we feed the random pool at boot
> > > > before it causes damage (i.e. what is the threshold at which we violate the
> > > > "you are not going to be any worse than you were before" rule)?
> >
> > There is no limit. The mixing operations are computationally reversible,
> > which guarantees that no unknown degrees of freedom are clobbered when
> > mixing known data.
>
> Good. So, whatever we do, we are never worse off than we were before we did
> it, at least by design.
>
> > > > 2. How dangerous is it to feed the pool with stale seed data on the next
> > > > boot (i.e. in a failure mode where we do not regenerate the seed file)?
> >
> > Not at all.
> >
> > > > 3. What is the optimal size of the seed data, based on the pool size?
> >
> > 1:1.
>
> We shall try to keep it at 1:1, then.
>
> > > > 4. How dangerous is it to have functions that need randomness (like
> > > > encrypted network and partitions, possibly encrypted swap with an
> > > > ephemeral key) run BEFORE the random seed is initialized?
> >
> > Depends on the platform. For instance, if you've got an unattended boot
> > off a Live CD on a machine with a predictable clock, you may get
> > duplicate outputs.
>
> I.e. it is somewhat dangerous, and we should try to avoid it by design, so
> we should try to init it as early as possible. Very well.
>
> > > > 5. Is there an optimal size for the pool? Does the quality of the randomness
> > > > one extracts from the pool increase or decrease with pool size?
> >
> > Don't bother fiddling with the pool size.
>
> We don't, but local admins often do, probably in an attempt to better handle
> bursts of entropy drainage. So, we do want to properly support non-standard
> pool sizes in Debian if we can.

Unless they're manually patching their kernel, they probably aren't
succeeding. The pool resize ioctl was disabled ages ago. But there's
really nothing to support here: even the largest polynomial in the
source is only 2048 bits, or 256 bytes.

> > > > Basically, we need these answers to find our way regarding the following
> > > > decisions:
> > > >
> > > > a) Is it better to seed the pool as early as possible and risk a larger time
> > > > window for problem (2) above, instead of the current behaviour where we
> > > > have a large time window where (4) above happens?
> >
> > Earlier is better.
> >
> > > > b) Is it worth the effort to base the seed file on the size of the pool,
> > > > instead of just using a constant size? If a constant size is better,
> > > > which size would that be? 512 bytes? 4096 bytes? 16384 bytes?
> >
> > 512 bytes is plenty.
> >
> > > > c) What is the maximum seed file size we can allow (maybe based on size of
> > > > the pool) to try to avoid problem (1) above ?
> >
> > Anything larger than a sector is simply wasting CPU time, but is
> > otherwise harmless.
>
> Well, a filesystem block is usually 1024 bytes, and a sector is 4096 bytes
> nowadays... :-)
>


--
Mathematics is the supreme nostalgia of our time.

Subject: Re: [Pkg-sysvinit-devel] Bug#587665: Safety of early boot init of /dev/random seed

On Thu, 15 Jul 2010, Matt Mackall wrote:
> > > Don't bother fiddling with the pool size.
> >
> > We don't, but local admins often do, probably in an attempt to better handle
> > bursts of entropy drainage. So, we do want to properly support non-standard
> > pool sizes in Debian if we can.
>
> Unless they're manually patching their kernel, they probably aren't
> succeeding. The pool resize ioctl was disabled ages ago. But there's
> really nothing to support here: even the largest polynomial in the
> source is only 2048 bits, or 256 bytes.

Well,

cat /proc/sys/kernel/random/poolsize
4096

And that is stock mainline 2.6.32.16 on amd64, AFAIK...

--
"One disk to rule them all, One disk to find them. One disk to bring
them all and in the darkness grind them. In the Land of Redmond
where the shadows lie." -- The Silicon Valley Tarot
Henrique Holschuh

2010-08-01 22:52:59

by Christoph Anton Mitterer

Subject: Re: [Pkg-sysvinit-devel] Bug#587665: Safety of early boot init of /dev/random seed

Hey Matt...

May I ask you a follow-up question on that? It is, however, not so
much Debian-init-related, I guess.


On Mon, 2010-07-05 at 13:40 -0500, Matt Mackall wrote:
> > > 1. How much data of unknown quality can we feed the random pool at boot
> > > before it causes damage (i.e. what is the threshold at which we violate the
> > > "you are not going to be any worse than you were before" rule)?
>
> There is no limit. The mixing operations are computationally reversible,
> which guarantees that no unknown degrees of freedom are clobbered when
> mixing known data.
>
> > 2. How dangerous is it to feed the pool with stale seed data on the next
> > boot (i.e. in a failure mode where we do not regenerate the seed file)?
>
> Not at all.

Are the above two statements also true for possibly "evil" random data?


I mean, the seed file (as in Debian) is already from the kernel's PRNG,
right? So it shouldn't contain evil, specially crafted data intended to
weaken the PRNG.

Working with a Grid CA for the LHC, we're always interested in nice
tokens like:
http://www.entropykey.co.uk/

Unfortunately, it's never really clear how good their contribution would
actually be... and the paranoid among us could even believe that mighty
government organisations have hacked such devices in order to harm our
crypto ;)


Thanks,
Chris.


Subject: Re: [Pkg-sysvinit-devel] Bug#587665: Safety of early boot init of /dev/random seed

On Mon, 02 Aug 2010, Christoph Anton Mitterer wrote:
> > > > 2. How dangerous is it to feed the pool with stale seed data on the next
> > > > boot (i.e. in a failure mode where we do not regenerate the seed file)?
> >
> > Not at all.
>
> Are the above two statements also true for possibly "evil" random data?

Yes. I think you could consider that seeding with evil data does as much
damage as not seeding at all.

Unless there is a big bad bug somewhere, in which case we'd very much like
to know about it ;-)

> Working with a Gird-CA for the LHC - we're always interested in nice
> tokens like:
> http://www.entropykey.co.uk/
>
> Unfortunately it's never really clear how well their contribution would
> actually be.... and the paranoid below us could even believe, that
> mighty government organisations have such devices hacked in order to
> harm our crypto ;)

Well, if you overestimate the entropy that thing will output, it might cause
harm. If it has a self-sabotage device that is intelligent enough not to
fail the tests done by the application that feeds entropy to the kernel, it
might cause harm. The list goes on and on...

--
"One disk to rule them all, One disk to find them. One disk to bring
them all and in the darkness grind them. In the Land of Redmond
where the shadows lie." -- The Silicon Valley Tarot
Henrique Holschuh

2010-08-02 04:52:23

by Matt Mackall

Subject: Re: [Pkg-sysvinit-devel] Bug#587665: Safety of early boot init of /dev/random seed

On Mon, 2010-08-02 at 00:52 +0200, Christoph Anton Mitterer wrote:
> Hey Matt...
>
> May I ask you a follow-up question on that,... which is however not so
> much Debian-init-related, I guess.
>
>
> On Mon, 2010-07-05 at 13:40 -0500, Matt Mackall wrote:
> > > > 1. How much data of unknown quality can we feed the random pool at boot
> > > > before it causes damage (i.e. what is the threshold at which we violate the
> > > > "you are not going to be any worse than you were before" rule)?
> >
> > There is no limit. The mixing operations are computationally reversible,
> > which guarantees that no unknown degrees of freedom are clobbered when
> > mixing known data.
> >
> > > > 2. How dangerous is it to feed the pool with stale seed data on the next
> > > > boot (i.e. in a failure mode where we do not regenerate the seed file)?
> >
> > Not at all.
>
> Are the above two statements also true for possibly "evil" random data?

Yes. Mixing in known values will not cause the contents of the pool to
become 'more known'. This is what I mean about reversible mixing
(without getting too technical): you can mix in a billion known values,
then mathematically reverse the billion mixing operations to return to
the original unknown state. Which means that the state after a billion
operations has just as much unknown-ness as it did when it started.

Here's the simplest version: consider that you've got a single unknown
bit X. Then you "mix" in Y1...Y9999 with X'=X^Y (reversible with
X=X'^Y). Because you don't know anything about X beforehand, no number
or pattern of bits Yn is going to improve your guess of the final value
of X - after each operation, it's still exactly as unguessable as
before.
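The XOR example above can be checked directly in a few lines. This is a toy
demonstration of the reversibility argument only, not the kernel's actual
mixing function; the attacker-chosen values are arbitrary.

```python
import secrets

# An unknown (to the attacker) starting state with 32 bits of freedom.
unknown = secrets.randbits(32)

# Mix in thousands of attacker-chosen, fully known values via XOR.
known_values = [(i * 2654435761) % 2**32 for i in range(10000)]
state = unknown
for y in known_values:
    state ^= y  # X' = X ^ Y

# Each mixing step is reversible (X = X' ^ Y), so undoing them all...
recovered = state
for y in reversed(known_values):
    recovered ^= y

# ...returns exactly the original unknown state: mixing known data
# destroyed none of the state's unknown-ness.
assert recovered == unknown
```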

Crucially, though, if you start with a _known_ value and mix in
_unknowns_ (what you're usually trying to do), the resulting state's
unknown-ness increases. With a good mixing function (and ours is pretty
decent), repeated addition of unknown values rapidly saturates the
unknown-ness of the pool.

--
Mathematics is the supreme nostalgia of our time.