2011-06-13 22:07:35

by Jarod Wilson

[permalink] [raw]
Subject: [PATCH 0/5] Feed entropy pool via high-resolution clocksources

Many server systems are seriously lacking in sources of entropy,
as we typically only feed the entropy pool by way of input layer
events, a few NIC driver interrupts and disk activity. A non-busy
server can easily become entropy-starved. We can mitigate this
somewhat by periodically mixing in entropy data based on the
delta between multiple high-resolution clocksource reads, per:

https://www.osadl.org/Analysis-of-inherent-randomness-of-the-L.rtlws11-developers-okech.0.html

Additionally, NIST already approves of similar implementations, so
this should be usable in high-securtiy deployments requiring a
fair chunk of available entropy data for frequent use of /dev/random.

http://csrc.nist.gov/groups/STM/cmvp/documents/140-1/140sp/140sp750.pdf
(section 6.1 mentions a clock-based seed source).

Yes, thus far, I've only bothered with x86-based clocksources, since that
is what I can test most easily. If this patch series isn't deemed totally
insane, adding support for other clocksources should be trivial.

Also note that unless you explicitly build and load the clock-entropy
driver, there should be little or no change whatsoever to the way anything
behaves right now, its purely opt-in. There's probably room for some
improvement here, and I'm kind of outside my comfort area, but hey, it
seems to work pretty well here in my own testing, so here it is...

Jarod Wilson (5):
random: add new clocksource entropy interface
clocksource: add support for entropy-generation function
hpet: wire up entropy generation function
tsc: wire up entropy generation function
misc: add clocksource-based entropy generation driver

arch/x86/kernel/hpet.c | 18 ++++++
arch/x86/kernel/tsc.c | 18 ++++++
drivers/char/random.c | 28 ++++++++++
drivers/misc/Kconfig | 14 +++++
drivers/misc/Makefile | 1 +
drivers/misc/clock-entropy.c | 122 ++++++++++++++++++++++++++++++++++++++++++
include/linux/clocksource.h | 6 ++
include/linux/random.h | 1 +
kernel/time/clocksource.c | 33 +++++++++++
9 files changed, 241 insertions(+), 0 deletions(-)
create mode 100644 drivers/misc/clock-entropy.c

CC: Matt Mackall <[email protected]>
CC: "Venkatesh Pallipadi (Venki)" <[email protected]>
CC: Thomas Gleixner <[email protected]>
CC: Ingo Molnar <[email protected]>
CC: John Stultz <[email protected]>
CC: Herbert Xu <[email protected]>
CC: "David S. Miller" <[email protected]>


2011-06-13 22:07:36

by Jarod Wilson

[permalink] [raw]
Subject: [PATCH 5/5] misc: add clocksource-based entropy generation driver

This is a fairly simple driver that just starts up a kernel thread that
periodically calls the active clocksource's entropy-gathering function,
if it has one. The default interval of 100us between polls doesn't show
any measurable impact to cpu usage on a core 2 duo test rig here, and
ensures the entropy pool is constantly being fed, even on a system with
no input device, no network traffic and no disk activity.

CC: Matt Mackall <[email protected]>
CC: "Venkatesh Pallipadi (Venki)" <[email protected]>
CC: Thomas Gleixner <[email protected]>
CC: Ingo Molnar <[email protected]>
CC: John Stultz <[email protected]>
CC: Herbert Xu <[email protected]>
CC: "David S. Miller" <[email protected]>
Signed-off-by: Jarod Wilson <[email protected]>
---
drivers/misc/Kconfig | 14 +++++
drivers/misc/Makefile | 1 +
drivers/misc/clock-entropy.c | 122 ++++++++++++++++++++++++++++++++++++++++++
3 files changed, 137 insertions(+), 0 deletions(-)
create mode 100644 drivers/misc/clock-entropy.c

diff --git a/drivers/misc/Kconfig b/drivers/misc/Kconfig
index 4e349cd..b997f6a 100644
--- a/drivers/misc/Kconfig
+++ b/drivers/misc/Kconfig
@@ -490,6 +490,20 @@ config PCH_PHUB
To compile this driver as a module, choose M here: the module will
be called pch_phub.

+config CLOCK_ENTROPY
+ tristate "Clocksource-based Entropy Generator"
+ help
+ This is a driver that simply starts up a kernel thread that calls
+ clocksource helper functions to periodically poll the clocksource,
+ calculate a delta from the prior poll, and then feed the delta
+ value along to the random number generator entropy pool. This can
+ be useful for server systems that are otherwise starved for entropy,
+ as there are very few sources of entropy generation, particularly
+ on a non-busy server system.
+
+ To compile this driver as a module, choose M here. The module will
+ be called clock-entropy.
+
source "drivers/misc/c2port/Kconfig"
source "drivers/misc/eeprom/Kconfig"
source "drivers/misc/cb710/Kconfig"
diff --git a/drivers/misc/Makefile b/drivers/misc/Makefile
index 5f03172..a659ad2 100644
--- a/drivers/misc/Makefile
+++ b/drivers/misc/Makefile
@@ -46,3 +46,4 @@ obj-y += ti-st/
obj-$(CONFIG_AB8500_PWM) += ab8500-pwm.o
obj-y += lis3lv02d/
obj-y += carma/
+obj-$(CONFIG_CLOCK_ENTROPY) += clock-entropy.o
diff --git a/drivers/misc/clock-entropy.c b/drivers/misc/clock-entropy.c
new file mode 100644
index 0000000..56d65ac
--- /dev/null
+++ b/drivers/misc/clock-entropy.c
@@ -0,0 +1,122 @@
+/*
+ * Clocksource-based entropy generator
+ * Copyright (c) 2011 Jarod Wilson <[email protected]>
+ *
+ * Many server systems are seriously lacking in sources of entropy,
+ * as we typically only feed the entropy pool by way of input layer
+ * events, a few NIC driver interrupts and disk activity. A non-busy
+ * server can easily become entropy-starved. We can mitigate this
+ * somewhat by periodically mixing in entropy data based on the
+ * delta between multiple high-resolution clocksource reads, per:
+ *
+ * https://www.osadl.org/Analysis-of-inherent-randomness-of-the-L.rtlws11-developers-okech.0.html
+ *
+ * Additionally, NIST already approves of similar implementations, so
+ * this should be usable in high-securtiy deployments requiring a
+ * fair chunk of available entropy data for frequent use of /dev/random.
+ * ---
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+ */
+
+#include <linux/module.h>
+#include <linux/kernel.h>
+#include <linux/sched.h>
+#include <linux/errno.h>
+#include <linux/kthread.h>
+#include <linux/clocksource.h>
+
+/* module parameters */
+static int ce_debug;
+static int interval = 100; /* microseconds */
+
+#define ce_dbg(fmt, args...) \
+ do { \
+ if (ce_debug) \
+ printk(KERN_DEBUG KBUILD_MODNAME ": " fmt, \
+ ## args); \
+ } while (0)
+
+/* Periodic clocksource polling thread data */
+static struct task_struct *ceg_task;
+
+/**
+ * If the current clocksource doesn't have an entropy-generation function
+ * wired up, this thread will do absolutely nothing productive, but we
+ * leave it running, in the event clocksources are changed to one of the
+ * ones that can provide entropy. We've already verified there's one that
+ * can by way of clocksource_entropy_available at init time.
+ */
+static int ceg_thread(void *arg)
+{
+ signed long timeout = usecs_to_jiffies(interval);
+
+ ce_dbg("polling thread started\n");
+
+ while (!kthread_should_stop()) {
+ set_current_state(TASK_INTERRUPTIBLE);
+
+ schedule_timeout(timeout);
+ if (kthread_should_stop())
+ break;
+
+ clocksource_add_entropy();
+ }
+
+ ce_dbg("polling thread ended\n");
+
+ return 0;
+}
+
+static int __init ceg_init(void)
+{
+ if (!clocksource_entropy_available()) {
+ pr_err(KBUILD_MODNAME ": no clocksource entropy available\n");
+ return -EINVAL;
+ }
+
+ ceg_task = kthread_run(ceg_thread, NULL, "kclock-entropy-gen");
+ if (IS_ERR(ceg_task)) {
+ pr_err(KBUILD_MODNAME ": failed to initialize kthread\n");
+ return PTR_ERR(ceg_task);
+ }
+
+ ce_dbg("Clocksource Entropy Generator initialized\n");
+
+ return 0;
+}
+
+static void __exit ceg_exit(void)
+{
+ if (!IS_ERR_OR_NULL(ceg_task)) {
+ kthread_stop(ceg_task);
+ ceg_task = NULL;
+ }
+
+ ce_dbg("Clocksource Entropy Generator unloaded\n");
+}
+
+module_init(ceg_init);
+module_exit(ceg_exit);
+
+MODULE_DESCRIPTION("Clocksource-based entropy generator");
+MODULE_AUTHOR("Jarod Wilson <[email protected]>");
+MODULE_LICENSE("GPL");
+
+module_param(ce_debug, bool, 0644);
+MODULE_PARM_DESC(ce_debug, "Enable debugging messages");
+
+module_param(interval, int, 0644);
+MODULE_PARM_DESC(interval, "Clocksource sampling interval, in microseconds (default: 100us)");
--
1.7.1

2011-06-13 22:07:39

by Jarod Wilson

[permalink] [raw]
Subject: [PATCH 3/5] hpet: wire up entropy generation function

HPET is high enough resolution that we can use its low-order byte to
stir new data into the random number generator entropy pool.

CC: Matt Mackall <[email protected]>
CC: "Venkatesh Pallipadi (Venki)" <[email protected]>
CC: Thomas Gleixner <[email protected]>
CC: Ingo Molnar <[email protected]>
CC: John Stultz <[email protected]>
CC: Herbert Xu <[email protected]>
CC: "David S. Miller" <[email protected]>
Signed-off-by: Jarod Wilson <[email protected]>
---
arch/x86/kernel/hpet.c | 18 ++++++++++++++++++
1 files changed, 18 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kernel/hpet.c b/arch/x86/kernel/hpet.c
index 6781765..cf23090 100644
--- a/arch/x86/kernel/hpet.c
+++ b/arch/x86/kernel/hpet.c
@@ -10,6 +10,7 @@
#include <linux/cpu.h>
#include <linux/pm.h>
#include <linux/io.h>
+#include <linux/random.h>

#include <asm/fixmap.h>
#include <asm/i8253.h>
@@ -63,6 +64,22 @@ static inline void hpet_writel(unsigned int d, unsigned int a)
writel(d, hpet_virt_address + a);
}

+static void hpet_add_entropy(void)
+{
+ static unsigned long last;
+ unsigned long counter;
+ int delta;
+
+ counter = hpet_readl(HPET_COUNTER);
+ delta = (int)(counter - last);
+ last = counter;
+
+ if (delta == counter)
+ return;
+
+ add_clocksource_randomness(delta);
+}
+
#ifdef CONFIG_X86_64
#include <asm/pgtable.h>
#endif
@@ -752,6 +769,7 @@ static struct clocksource clocksource_hpet = {
.mask = HPET_MASK,
.flags = CLOCK_SOURCE_IS_CONTINUOUS,
.resume = hpet_resume_counter,
+ .entropy = hpet_add_entropy,
#ifdef CONFIG_X86_64
.vread = vread_hpet,
#endif
--
1.7.1

2011-06-13 22:07:46

by Jarod Wilson

[permalink] [raw]
Subject: [PATCH 2/5] clocksource: add support for entropy-generation function

Add a new function pointer to struct clocksource that can optionally be
filled in by clocksources deemed to be high enough resolution to feed
the random number generator entropy pool.

CC: Matt Mackall <[email protected]>
CC: "Venkatesh Pallipadi (Venki)" <[email protected]>
CC: Thomas Gleixner <[email protected]>
CC: Ingo Molnar <[email protected]>
CC: John Stultz <[email protected]>
CC: Herbert Xu <[email protected]>
CC: "David S. Miller" <[email protected]>
Signed-off-by: Jarod Wilson <[email protected]>
---
include/linux/clocksource.h | 6 ++++++
kernel/time/clocksource.c | 33 +++++++++++++++++++++++++++++++++
2 files changed, 39 insertions(+), 0 deletions(-)

diff --git a/include/linux/clocksource.h b/include/linux/clocksource.h
index d4646b4..c74bcf2 100644
--- a/include/linux/clocksource.h
+++ b/include/linux/clocksource.h
@@ -156,6 +156,8 @@ extern u64 timecounter_cyc2time(struct timecounter *tc,
* @vread: vsyscall based read
* @suspend: suspend function for the clocksource, if necessary
* @resume: resume function for the clocksource, if necessary
+ * @entropy: random entropy pool addition function (optional, and
+ * requires a fairly high-resolution clocksource)
*/
struct clocksource {
/*
@@ -184,6 +186,7 @@ struct clocksource {
unsigned long flags;
void (*suspend)(struct clocksource *cs);
void (*resume)(struct clocksource *cs);
+ void (*entropy)(struct clocksource *cs);

#ifdef CONFIG_CLOCKSOURCE_WATCHDOG
/* Watchdog related data, used by the framework */
@@ -279,6 +282,9 @@ extern void clocksource_resume(void);
extern struct clocksource * __init __weak clocksource_default_clock(void);
extern void clocksource_mark_unstable(struct clocksource *cs);

+extern bool clocksource_entropy_available(void);
+extern void clocksource_add_entropy(void);
+
extern void
clocks_calc_mult_shift(u32 *mult, u32 *shift, u32 from, u32 to, u32 minsec);

diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
index 1c95fd6..71006ef 100644
--- a/kernel/time/clocksource.c
+++ b/kernel/time/clocksource.c
@@ -745,6 +745,39 @@ void clocksource_unregister(struct clocksource *cs)
}
EXPORT_SYMBOL(clocksource_unregister);

+/**
+ * Do we have at least one clocksource that can generate entropy?
+ */
+bool clocksource_entropy_available(void)
+{
+ struct clocksource *src;
+ bool entropy_possible = false;
+
+ mutex_lock(&clocksource_mutex);
+ list_for_each_entry(src, &clocksource_list, list) {
+ if (src->entropy) {
+ entropy_possible = true;
+ break;
+ }
+ }
+ mutex_unlock(&clocksource_mutex);
+
+ return entropy_possible;
+}
+EXPORT_SYMBOL(clocksource_entropy_available);
+
+/**
+ * Call the clocksource's entropy-generation function, if set
+ */
+void clocksource_add_entropy(void)
+{
+ if (!curr_clocksource->entropy)
+ return;
+
+ curr_clocksource->entropy(curr_clocksource);
+}
+EXPORT_SYMBOL(clocksource_add_entropy);
+
#ifdef CONFIG_SYSFS
/**
* sysfs_show_current_clocksources - sysfs interface for current clocksource
--
1.7.1

2011-06-13 22:07:43

by Jarod Wilson

[permalink] [raw]
Subject: [PATCH 1/5] random: add new clocksource entropy interface

This is a new interface for adding entropy data to the random number
generator. The low-order byte of a delta between successive clocksource
reads is mixed into the pool, with one bit per bytes of data mixed in
credited to the entropy pool.

CC: Matt Mackall <[email protected]>
CC: "Venkatesh Pallipadi (Venki)" <[email protected]>
CC: Thomas Gleixner <[email protected]>
CC: Ingo Molnar <[email protected]>
CC: John Stultz <[email protected]>
CC: Herbert Xu <[email protected]>
CC: "David S. Miller" <[email protected]>
Signed-off-by: Jarod Wilson <[email protected]>
---
drivers/char/random.c | 28 ++++++++++++++++++++++++++++
include/linux/random.h | 1 +
2 files changed, 29 insertions(+), 0 deletions(-)

diff --git a/drivers/char/random.c b/drivers/char/random.c
index d4ddeba..03626c3 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -129,6 +129,7 @@
* unsigned int value);
* void add_interrupt_randomness(int irq);
* void add_disk_randomness(struct gendisk *disk);
+ * void add_clocksource_randomness(int delta);
*
* add_input_randomness() uses the input layer interrupt timing, as well as
* the event type information from the hardware.
@@ -147,6 +148,12 @@
* seek times do not make for good sources of entropy, as their seek
* times are usually fairly consistent.
*
+ * add_clocksource_randomness() uses time deltas between period reads
+ * of high-precision clocksources. The Linux kernel scheduler has no
+ * absolute guarantees of execution time, its best-effort, and we can
+ * be certain there will be entirely random variation in the actual
+ * deltas, at least at the nanosecond level for high-precision timers.
+ *
* All of these routines try to estimate how many bits of randomness a
* particular randomness source. They do this by keeping track of the
* first and second order deltas of the event timings.
@@ -722,6 +729,27 @@ void add_disk_randomness(struct gendisk *disk)
}
#endif

+void add_clocksource_randomness(int clock_delta)
+{
+ /* only mix in the low byte */
+ u8 mix = clock_delta & 0xff;
+
+ DEBUG_ENT("clock event %u\n", mix);
+
+ preempt_disable();
+ if (input_pool.entropy_count > trickle_thresh &&
+ (__get_cpu_var(trickle_count)++ & 0xfff))
+ goto out;
+
+ mix_pool_bytes(&input_pool, &mix, sizeof(mix));
+ /* Only credit one bit per byte to be conservative */
+ credit_entropy_bits(&input_pool, sizeof(mix));
+
+out:
+ preempt_enable();
+}
+EXPORT_SYMBOL_GPL(add_clocksource_randomness);
+
/*********************************************************************
*
* Entropy extraction routines
diff --git a/include/linux/random.h b/include/linux/random.h
index fb7ab9d..9e303dd 100644
--- a/include/linux/random.h
+++ b/include/linux/random.h
@@ -53,6 +53,7 @@ extern void rand_initialize_irq(int irq);
extern void add_input_randomness(unsigned int type, unsigned int code,
unsigned int value);
extern void add_interrupt_randomness(int irq);
+extern void add_clocksource_randomness(int delta);

extern void get_random_bytes(void *buf, int nbytes);
void generate_random_uuid(unsigned char uuid_out[16]);
--
1.7.1

2011-06-13 22:07:44

by Jarod Wilson

[permalink] [raw]
Subject: [PATCH 4/5] tsc: wire up entropy generation function

TSC is high enough resolution that we can use its low-order byte to
stir new data into the random number generator entropy pool.

CC: Matt Mackall <[email protected]>
CC: "Venkatesh Pallipadi (Venki)" <[email protected]>
CC: Thomas Gleixner <[email protected]>
CC: Ingo Molnar <[email protected]>
CC: John Stultz <[email protected]>
CC: Herbert Xu <[email protected]>
CC: "David S. Miller" <[email protected]>
Signed-off-by: Jarod Wilson <[email protected]>
---
arch/x86/kernel/tsc.c | 18 ++++++++++++++++++
1 files changed, 18 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
index 6cc6922..d206ec3 100644
--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -10,6 +10,7 @@
#include <linux/clocksource.h>
#include <linux/percpu.h>
#include <linux/timex.h>
+#include <linux/random.h>

#include <asm/hpet.h>
#include <asm/timer.h>
@@ -768,11 +769,28 @@ static void resume_tsc(struct clocksource *cs)
clocksource_tsc.cycle_last = 0;
}

+static void tsc_add_entropy(void)
+{
+ static u64 last;
+ u64 counter;
+ int delta;
+
+ rdtscll(counter);
+ delta = (int)(counter - last);
+ last = counter;
+
+ if (delta == counter)
+ return;
+
+ add_clocksource_randomness(delta);
+}
+
static struct clocksource clocksource_tsc = {
.name = "tsc",
.rating = 300,
.read = read_tsc,
.resume = resume_tsc,
+ .entropy = tsc_add_entropy,
.mask = CLOCKSOURCE_MASK(64),
.flags = CLOCK_SOURCE_IS_CONTINUOUS |
CLOCK_SOURCE_MUST_VERIFY,
--
1.7.1

2011-06-13 22:27:30

by Venkatesh Pallipadi

[permalink] [raw]
Subject: Re: [PATCH 4/5] tsc: wire up entropy generation function

On Mon, Jun 13, 2011 at 3:06 PM, Jarod Wilson <[email protected]> wrote:
> TSC is high enough resolution that we can use its low-order byte to
> stir new data into the random number generator entropy pool.

From what I vaguely remember from years past, rdtsc, especially last
few bits of it are not very good as random number source. As they are
based on lower bus frequency and a multiplier. May be things have
changed these days. Adding Peter and Suresh for comments.

Thanks,
Venki

>
> CC: Matt Mackall <[email protected]>
> CC: "Venkatesh Pallipadi (Venki)" <[email protected]>
> CC: Thomas Gleixner <[email protected]>
> CC: Ingo Molnar <[email protected]>
> CC: John Stultz <[email protected]>
> CC: Herbert Xu <[email protected]>
> CC: "David S. Miller" <[email protected]>
> Signed-off-by: Jarod Wilson <[email protected]>
> ---
> ?arch/x86/kernel/tsc.c | ? 18 ++++++++++++++++++
> ?1 files changed, 18 insertions(+), 0 deletions(-)
>
> diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
> index 6cc6922..d206ec3 100644
> --- a/arch/x86/kernel/tsc.c
> +++ b/arch/x86/kernel/tsc.c
> @@ -10,6 +10,7 @@
> ?#include <linux/clocksource.h>
> ?#include <linux/percpu.h>
> ?#include <linux/timex.h>
> +#include <linux/random.h>
>
> ?#include <asm/hpet.h>
> ?#include <asm/timer.h>
> @@ -768,11 +769,28 @@ static void resume_tsc(struct clocksource *cs)
> ? ? ? ?clocksource_tsc.cycle_last = 0;
> ?}
>
> +static void tsc_add_entropy(void)
> +{
> + ? ? ? static u64 last;
> + ? ? ? u64 counter;
> + ? ? ? int delta;
> +
> + ? ? ? rdtscll(counter);
> + ? ? ? delta = (int)(counter - last);
> + ? ? ? last = counter;
> +
> + ? ? ? if (delta == counter)
> + ? ? ? ? ? ? ? return;
> +
> + ? ? ? add_clocksource_randomness(delta);
> +}
> +
> ?static struct clocksource clocksource_tsc = {
> ? ? ? ?.name ? ? ? ? ? ? ? ? ? = "tsc",
> ? ? ? ?.rating ? ? ? ? ? ? ? ? = 300,
> ? ? ? ?.read ? ? ? ? ? ? ? ? ? = read_tsc,
> ? ? ? ?.resume ? ? ? ? ? ? ? ? = resume_tsc,
> + ? ? ? .entropy ? ? ? ? ? ? ? ?= tsc_add_entropy,
> ? ? ? ?.mask ? ? ? ? ? ? ? ? ? = CLOCKSOURCE_MASK(64),
> ? ? ? ?.flags ? ? ? ? ? ? ? ? ?= CLOCK_SOURCE_IS_CONTINUOUS |
> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?CLOCK_SOURCE_MUST_VERIFY,
> --
> 1.7.1
>
>

2011-06-13 22:35:33

by Jarod Wilson

[permalink] [raw]
Subject: Re: [PATCH 4/5] tsc: wire up entropy generation function

Venkatesh Pallipadi wrote:
> On Mon, Jun 13, 2011 at 3:06 PM, Jarod Wilson<[email protected]> wrote:
>> TSC is high enough resolution that we can use its low-order byte to
>> stir new data into the random number generator entropy pool.
>
> From what I vaguely remember from years past, rdtsc, especially last
> few bits of it are not very good as random number source. As they are
> based on lower bus frequency and a multiplier. May be things have
> changed these days. Adding Peter and Suresh for comments.

Ah, that would definitely be good to know. I *have* enabled debug spew
on my primary test rig though, and the randomness appears to be quite
good in the low byte using tsc as the primary clocksource. Its a ~3+
year old core 2 duo 2.67GHz system though, so things could easily be
better or worse with more current systems, and I can't say that I've
tried it out exhaustively with the cpu at full-bore, which could affect
things (system is mostly idle, so acpi-cpufreq had the cpu dialed back).

--
Jarod Wilson
[email protected]

2011-06-13 22:36:54

by H. Peter Anvin

[permalink] [raw]
Subject: Re: [PATCH 4/5] tsc: wire up entropy generation function

On 06/13/2011 03:27 PM, Venkatesh Pallipadi wrote:
> On Mon, Jun 13, 2011 at 3:06 PM, Jarod Wilson <[email protected]> wrote:
>> TSC is high enough resolution that we can use its low-order byte to
>> stir new data into the random number generator entropy pool.
>
> From what I vaguely remember from years past, rdtsc, especially last
> few bits of it are not very good as random number source. As they are
> based on lower bus frequency and a multiplier. May be things have
> changed these days. Adding Peter and Suresh for comments.

This is correct; at the very least I would multiply the low 32 bits of
the TSC with a 32-bit prime number before mixing.

However, the big issue with this is that it's recursive... what causes
this to be invoked... probably an interrupt, which is going to have been
invoked by a timer, quite possible the TSC deadline timer. Oops.

-hpa

2011-06-13 22:39:10

by john stultz

[permalink] [raw]
Subject: Re: [PATCH 0/5] Feed entropy pool via high-resolution clocksources

On Mon, 2011-06-13 at 18:06 -0400, Jarod Wilson wrote:
> Many server systems are seriously lacking in sources of entropy,
> as we typically only feed the entropy pool by way of input layer
> events, a few NIC driver interrupts and disk activity. A non-busy
> server can easily become entropy-starved. We can mitigate this
> somewhat by periodically mixing in entropy data based on the
> delta between multiple high-resolution clocksource reads, per:
>
> https://www.osadl.org/Analysis-of-inherent-randomness-of-the-L.rtlws11-developers-okech.0.html
>
> Additionally, NIST already approves of similar implementations, so
> this should be usable in high-securtiy deployments requiring a
> fair chunk of available entropy data for frequent use of /dev/random.
>
> http://csrc.nist.gov/groups/STM/cmvp/documents/140-1/140sp/140sp750.pdf
> (section 6.1 mentions a clock-based seed source).
>
> Yes, thus far, I've only bothered with x86-based clocksources, since that
> is what I can test most easily. If this patch series isn't deemed totally
> insane, adding support for other clocksources should be trivial.
>
> Also note that unless you explicitly build and load the clock-entropy
> driver, there should be little or no change whatsoever to the way anything
> behaves right now, its purely opt-in. There's probably room for some
> improvement here, and I'm kind of outside my comfort area, but hey, it
> seems to work pretty well here in my own testing, so here it is...
>
> Jarod Wilson (5):
> random: add new clocksource entropy interface
> clocksource: add support for entropy-generation function
> hpet: wire up entropy generation function
> tsc: wire up entropy generation function
> misc: add clocksource-based entropy generation driver

So this is interesting work, but I have a few questions.

1) hpet_add_entropy() and tsc_add_entropy() are basically identical. It
seems there should be a generic bit of code that uses the clocksource's
read() function and does the same calculation.

2) If the .entropy() functions really aren't clocksource specific, it
doesn't really seem like this functionality belongs in the clocksource
layer. Instead it seems it should be a layer above, that makes use of
the clocksource code in a generic fashion.

3) Why are you making use of the curr_clocksource? The timekeeping code
has very strict rules about what makes a clocksource usable for
timekeeping. I suspect entropy generation has different requirements,
and thus shouldn't lean on the curr_clocksource value (ie: TSC may be
unsynced or halts in idle, and thus unusable for time, but likely still
ok for entropy).

4) Further on that point, there's a likely bug: If an entropy supporting
clocksource is available, clocksource_entropy_available() will return
true, but if the one curr_clocksource is not the one that supports
entropy, clocksource_add_entropy() won't do anything. I suspect the
add_entropy function needs to scan for a entropy flagged clocksource and
use it instead.

5) Does the entropy calculation need to be flagged clocksource by
clocksource? Or could a generic metric be made (ie: frequency?) in order
to enable such functionality on all capable clocksources automatically?


thanks
-john

2011-06-13 23:55:38

by Kent Borg

[permalink] [raw]
Subject: Re: [PATCH 4/5] tsc: wire up entropy generation function

Venkatesh Pallipadi wrote:
> On Mon, Jun 13, 2011 at 3:06 PM, Jarod Wilson <[email protected]> wrote:
>
>> TSC is high enough resolution that we can use its low-order byte to
>> stir new data into the random number generator entropy pool.
>
> From what I vaguely remember from years past, rdtsc, especially last
> few bits of it are not very good as random number source.

It doesn't have to be "random" (a very high bar), it just has to be
somewhat non-deterministic to contain entropy (a low-bar). The idea
isn't to use the TSC as a random number, only to use it for mixing the
pool. If somehow the TSC were *completely* deterministic, we would only
fail to add entropy, any entropy already in the pool would still be just
as good as before, no harm done. (Like a perfect shuffle of an already
randomly arranged deck of cards.)

Okay, not "no harm", a ~little~ harm. There are two problems with using
the TSC if it doesn't have much entropy:

1. It wastes computrons. Okay, have efficient code, don't run it too
often.

2. Entropy estimates might be too high. Okay, but entropy estimation
is a horrible
problem in the best of well-defined circumstances. Some (Schneier
I think) say
it isn't worth attempting. Difficulty in estimating entropy is a
bad excuse for
starving the system of entropy. Better to credit no entropy than
to collect
no entropy. In a server, entropy sources should not be discarded.

> As they are
> based on lower bus frequency and a multiplier.

Based on a lower bus frequency, but multiplied up in the CPU. Why?
Because synchronous clock distribution is very difficult at GHz speeds.
Heck, even inside the CPU, clock distribution is not a trivial matter.
It is impossible for anyone at a distance to know the LSB of the TSC at
any given moment because the very concept of a "given moment" is ill
defined at any distance.

And how is the low-frequency clock multiplied up? With analog circuitry
(a PLL, right?), and analog is a source of indeterminism. There is
going to be a little jitter in there. Differences from chip-to-chip,
with slight power supply changes, with temperature, with time.


Disclaimer: I have not looked at the patch carefully, it might be shit.
But the idea of letting us add some entropy is excellent!


-kb, the Kent who has been paid money for building a defensible RNG.

2011-06-14 00:39:50

by Kent Borg

[permalink] [raw]
Subject: Re: [PATCH 4/5] tsc: wire up entropy generation function

H. Peter Anvin wrote:
> However, the big issue with this is that it's recursive... what
causes this to
> be invoked... probably an interrupt, which is going to have been
> invoked by a timer, quite possible the TSC deadline timer. Oops.

I was assuming that drivers, responding to an interrupt from some
external event, would want to make this call.

Such as a network driver. The canonical example is an entropy-starved
server that doesn't see much light at all, let alone a local user with a
mouse. But such a server does see plenty of packets. Packets that go
through various delays before getting to the ethernet port.

And even if an outside observer could get past the physical security
that isolates such a poor starved server and tap at the ethernet port,
an ethernet port is still many centimeters, a lot of gates, and even
some firmware away from the IRQ pin, and that pin itself is some
distance from the TSC. Knowing that LSB is extremely hard. Knowing
exactly when the LSB is going to be sampled is extremely hard. Knowing
both presents very real theoretical hurdles. And those hurdles get much
higher with every meter of distance.

> at the very least I would multiply the low 32 bits of
> the TSC with a 32-bit prime number before mixing.

Two points:

1. Why look at the high-order bits? How are they going to have values
that cannot be inferred? If you are looking for entropy, the low-order
bits is where they're going keep it. (Think Willie Sutton.) If looking
at the low-byte is cheaper, then let's be cheap. If, however, if
looking at more bytes *is* as cheap, then there is no harm. But in any
case let's keep the code lean enough that it can be called from an
interrupt service routine.

2. Don't confuse collecting entropy with how to mix the pool. Whether
or not to multiply by a prime number is the domain of how to mix the
pool. Possibly we need a patch on that, too, you might have a point
there, but that is a different question from whether we should be
allowed a function to use the TSC to mix the pool. Different issues.


-kb, the Kent who insists that it is better to credit no entropy and be
safe, than to collect no entropy and be certain.

2011-06-14 01:48:04

by H. Peter Anvin

[permalink] [raw]
Subject: Re: [PATCH 4/5] tsc: wire up entropy generation function

On 06/13/2011 05:39 PM, Kent Borg wrote:
> I was assuming that drivers, responding to an interrupt from some
> external event, would want to make this call.
> Such as a network driver.

Those already are doing this.

> Two points:
>
> 1. Why look at the high-order bits? How are they going to have values
> that cannot be inferred? If you are looking for entropy, the low-order
> bits is where they're going keep it. (Think Willie Sutton.) If looking
> at the low-byte is cheaper, then let's be cheap. If, however, if
> looking at more bytes *is* as cheap, then there is no harm. But in any
> case let's keep the code lean enough that it can be called from an
> interrupt service routine.

Consider the case where the TSC is running in steps of 64 and you're
using the low 6 bits.

-hpa

--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel. I don't speak on their behalf.

2011-06-14 12:39:42

by Kent Borg

[permalink] [raw]
Subject: Re: [PATCH 4/5] tsc: wire up entropy generation function

H. Peter Anvin wrote:
> Those already are doing this.

They used to via IRQF_SAMPLE_RANDOM, but these are being removed
(according to Documentation/feature-removal-schedule.txt). In 2.6.39 I
can only find 10 remaining instances, out of many more network drivers.

The alternative is supposed to be calls to add_*_randomness(), but that
family of functions seems to only have one exported symbol:
add_input_randomness(), which looks like is called only for things
dealing with human input.

So network entropy is being eradicated, and nothing is being done to
replace it.

This is crazy.

> Consider the case where the TSC is running in steps of 64 and you're
> using the low 6 bits.

It does that? I thought it was a "time stamp counter", that is
incremented. You know, incremented by one. (Why would someone have it
jump by 64? Wiring it up on the cheap side of the clock multiplier?
What chips do that?)


Maybe this patch is not sensible. If it is only going to be called on
timer, then there is the feedback issue that will badly limit the
entropy, and if a TSC is sometimes only jumping in big steps, then the
available entropy is being masked.


Let me fall back to a different (but related) topic: eradicating
IRQF_SAMPLE_RANDOM is stupid. The only argument in favor seems to be
that the entropy estimation might be too high. But entropy estimation
is never going to be accurate. Better to credit no entropy than to
collect no entropy.


-kb

2011-06-14 14:02:52

by Jarod Wilson

[permalink] [raw]
Subject: Re: [PATCH 4/5] tsc: wire up entropy generation function

H. Peter Anvin wrote:
> On 06/13/2011 05:39 PM, Kent Borg wrote:
>> I was assuming that drivers, responding to an interrupt from some
>> external event, would want to make this call.
>> Such as a network driver.
>
> Those already are doing this.
>
>> Two points:
>>
>> 1. Why look at the high-order bits? How are they going to have values
>> that cannot be inferred? If you are looking for entropy, the low-order
>> bits is where they're going keep it. (Think Willie Sutton.) If looking
>> at the low-byte is cheaper, then let's be cheap. If, however, if
>> looking at more bytes *is* as cheap, then there is no harm. But in any
>> case let's keep the code lean enough that it can be called from an
>> interrupt service routine.
>
> Consider the case where the TSC is running in steps of 64 and you're
> using the low 6 bits.

What if we mix the low byte into the pool, but only increase the entropy
estimate if a comparison with the prior read actually shows a change?

Another thought is to perhaps mix the counter value with a value read
from ansi_cprng, which is similar to a tactic referred to in NIST's
140sp750.pdf.

Peter, as long as I've got your attention here... Patch 3 in the set
adds essentially the same sort of support using the hpet instead of the
tsc. Is that less contentious, as far as precision in the low byte(s) to
provide entropy data?

My initial local implementation was actually hpet-only, before reworking
things to try to be more generic.

--
Jarod Wilson
[email protected]

2011-06-14 14:25:42

by Jarod Wilson

[permalink] [raw]
Subject: Re: [PATCH 0/5] Feed entropy pool via high-resolution clocksources

john stultz wrote:
> On Mon, 2011-06-13 at 18:06 -0400, Jarod Wilson wrote:
>> Many server systems are seriously lacking in sources of entropy,
>> as we typically only feed the entropy pool by way of input layer
>> events, a few NIC driver interrupts and disk activity. A non-busy
>> server can easily become entropy-starved. We can mitigate this
>> somewhat by periodically mixing in entropy data based on the
>> delta between multiple high-resolution clocksource reads, per:
>>
>> https://www.osadl.org/Analysis-of-inherent-randomness-of-the-L.rtlws11-developers-okech.0.html
>>
>> Additionally, NIST already approves of similar implementations, so
>> this should be usable in high-securtiy deployments requiring a
>> fair chunk of available entropy data for frequent use of /dev/random.
>>
>> http://csrc.nist.gov/groups/STM/cmvp/documents/140-1/140sp/140sp750.pdf
>> (section 6.1 mentions a clock-based seed source).
>>
>> Yes, thus far, I've only bothered with x86-based clocksources, since that
>> is what I can test most easily. If this patch series isn't deemed totally
>> insane, adding support for other clocksources should be trivial.
>>
>> Also note that unless you explicitly build and load the clock-entropy
>> driver, there should be little or no change whatsoever to the way anything
>> behaves right now, its purely opt-in. There's probably room for some
>> improvement here, and I'm kind of outside my comfort area, but hey, it
>> seems to work pretty well here in my own testing, so here it is...
>>
>> Jarod Wilson (5):
>> random: add new clocksource entropy interface
>> clocksource: add support for entropy-generation function
>> hpet: wire up entropy generation function
>> tsc: wire up entropy generation function
>> misc: add clocksource-based entropy generation driver
>
> So this is interesting work, but I have a few questions.
>
> 1) hpet_add_entropy() and tsc_add_entropy() are basically identical. It
> seems there should be a generic bit of code that uses the clocksource's
> read() function and does the same calculation.
>
> 2) If the .entropy() functions really aren't clocksource specific, it
> doesn't really seem like this functionality belongs in the clocksource
> layer. Instead it seems it should be a layer above, that makes use of
> the clocksource code in a generic fashion.

Wasn't sure at first if there might be a need for more differentiation
in the entropy-gathering functions, but yeah, at least with these two,
just using the clocksource read function from a common entropy-gathering
function does look to make more sense.


> 3) Why are you making use of the curr_clocksource? The timekeeping code
> has very strict rules about what makes a clocksource usable for
> timekeeping. I suspect entropy generation has different requirements,
> and thus shouldn't lean on the curr_clocksource value (ie: TSC may be
> unsynced or halts in idle, and thus unusable for time, but likely still
> ok for entropy).
>
> 4) Further on that point, there's a likely bug: If an entropy supporting
> clocksource is available, clocksource_entropy_available() will return
> true, but if the one curr_clocksource is not the one that supports
> entropy, clocksource_add_entropy() won't do anything. I suspect the
> add_entropy function needs to scan for a entropy flagged clocksource and
> use it instead.

Yeah, there are definitely different requirements. Using
curr_clocksource was sort of the easy way out, I guess. :) So yeah, its
entirely possible for someone to set a non-entropy-generating
clocksource as their active one, and still have an entropy-generating
one that is available -- I was actually running tsc as my primary
clocksource and using the hpet for entropy in my initial internal
implementation here.

Each clocksource currently has a rating for how good a timekeeping
clocksource it is, would it make sense to have a second rating that
indicates how good an entropy source it is, and use that to determine
which clocksource would best serve for entropy generation?


> 5) Does the entropy calculation need to be flagged clocksource by
> clocksource? Or could a generic metric be made (ie: frequency?) in order
> to enable such functionality on all capable clocksources automatically?

I honestly haven't a clue here. I'm not sure if there are possibly
clocksources that have a relatively high frequency, but lack precision
in their lowest bits, nor am I sure (yet?) how to figure out what the
frequency and/or precision of a given clocksource actually is. I guess I
assume its non-trivial, since we have to have the 'rating' value to
choose primary timekeeping clocksource.

My initial thinking is that it would be best to have this purely opt-in,
for clocksources that have been evaluated and deemed acceptable for this
use.

So I'll see what I can come up with in the way of removing the
per-clocksource entropy functions. Doing so adds a new complication, by
way of removing what I was keying off of to decide if the clocksource
should be providing entropy, but perhaps I can try working in an entropy
quality rating at the same time.

--
Jarod Wilson
[email protected]

2011-06-14 14:41:18

by Matt Mackall

[permalink] [raw]
Subject: Re: [PATCH 0/5] Feed entropy pool via high-resolution clocksources

On Mon, 2011-06-13 at 18:06 -0400, Jarod Wilson wrote:
> Many server systems are seriously lacking in sources of entropy,
> as we typically only feed the entropy pool by way of input layer
> events, a few NIC driver interrupts and disk activity. A non-busy
> server can easily become entropy-starved. We can mitigate this
> somewhat by periodically mixing in entropy data based on the
> delta between multiple high-resolution clocksource reads, per:
>
> https://www.osadl.org/Analysis-of-inherent-randomness-of-the-L.rtlws11-developers-okech.0.html
>
> Additionally, NIST already approves of similar implementations, so
> this should be usable in high-securtiy deployments requiring a
> fair chunk of available entropy data for frequent use of /dev/random.

So, mixed feelings here:

Yes: it's a great idea to regularly mix other data into the pool. More
samples are always better for RNG quality.

Maybe: the current RNG is not really designed with high-bandwidth
entropy sources in mind, so this might introduce non-negligible overhead
in systems with, for instance, huge numbers of CPUs.

No: it's not a great idea to _credit_ the entropy count with this data.
Someone watching the TSC or HPET from userspace can guess when samples
are added by watching for drop-outs in their sampling (ie classic timing
attack).

(I see you do credit only 1 bit per byte: that's fairly conservative,
true, but it must be _perfectly conservative_ for the theoretical
requirements of /dev/random to be met. These requirements are in fact
known to be unfulfillable in practice(!), but that doesn't mean we
should introduce more users of entropy accounting. Instead, it means
that entropy accounting is broken and needs to be removed.)

> http://csrc.nist.gov/groups/STM/cmvp/documents/140-1/140sp/140sp750.pdf
> (section 6.1 mentions a clock-based seed source).
>
> Yes, thus far, I've only bothered with x86-based clocksources, since that
> is what I can test most easily. If this patch series isn't deemed totally
> insane, adding support for other clocksources should be trivial.
>
> Also note that unless you explicitly build and load the clock-entropy
> driver, there should be little or no change whatsoever to the way anything
> behaves right now, its purely opt-in. There's probably room for some
> improvement here, and I'm kind of outside my comfort area, but hey, it
> seems to work pretty well here in my own testing, so here it is...
>
> Jarod Wilson (5):
> random: add new clocksource entropy interface
> clocksource: add support for entropy-generation function
> hpet: wire up entropy generation function
> tsc: wire up entropy generation function
> misc: add clocksource-based entropy generation driver
>
> arch/x86/kernel/hpet.c | 18 ++++++
> arch/x86/kernel/tsc.c | 18 ++++++
> drivers/char/random.c | 28 ++++++++++
> drivers/misc/Kconfig | 14 +++++
> drivers/misc/Makefile | 1 +
> drivers/misc/clock-entropy.c | 122 ++++++++++++++++++++++++++++++++++++++++++
> include/linux/clocksource.h | 6 ++
> include/linux/random.h | 1 +
> kernel/time/clocksource.c | 33 +++++++++++
> 9 files changed, 241 insertions(+), 0 deletions(-)
> create mode 100644 drivers/misc/clock-entropy.c
>
> CC: Matt Mackall <[email protected]>
> CC: "Venkatesh Pallipadi (Venki)" <[email protected]>
> CC: Thomas Gleixner <[email protected]>
> CC: Ingo Molnar <[email protected]>
> CC: John Stultz <[email protected]>
> CC: Herbert Xu <[email protected]>
> CC: "David S. Miller" <[email protected]>


--
Mathematics is the supreme nostalgia of our time.

2011-06-14 14:41:18

by Matt Mackall

[permalink] [raw]
Subject: Re: [PATCH 4/5] tsc: wire up entropy generation function

On Tue, 2011-06-14 at 08:39 -0400, Kent Borg wrote:
> H. Peter Anvin wrote:
> > Those already are doing this.
>
> They used to via IRQF_SAMPLE_RANDOM, but these are being removed
> (according to Documentation/feature-removal-schedule.txt). In 2.6.39 I
> can only find 10 remaining instances, out of many more network drivers.

That's down from three. Oh wait, that's up.

> So network entropy is being eradicated, and nothing is being done to
> replace it.

Nothing is being done is a more accurate summary of the situation.

--
Mathematics is the supreme nostalgia of our time.

2011-06-14 14:41:18

by Matt Mackall

[permalink] [raw]
Subject: Re: [PATCH 4/5] tsc: wire up entropy generation function

On Mon, 2011-06-13 at 15:36 -0700, H. Peter Anvin wrote:
> On 06/13/2011 03:27 PM, Venkatesh Pallipadi wrote:
> > On Mon, Jun 13, 2011 at 3:06 PM, Jarod Wilson <[email protected]> wrote:
> >> TSC is high enough resolution that we can use its low-order byte to
> >> stir new data into the random number generator entropy pool.
> >
> > From what I vaguely remember from years past, rdtsc, especially last
> > few bits of it are not very good as random number source. As they are
> > based on lower bus frequency and a multiplier. May be things have
> > changed these days. Adding Peter and Suresh for comments.

Well the issue is that samples are going to be roughly aligned to some
multiple of the bus frequency. If an interrupt occurs on bus cycle X,
this code will be hit at roughly TSC cycle X*M+d.

> This is correct; at the very least I would multiply the low 32 bits of
> the TSC with a 32-bit prime number before mixing.

This multiply doesn't actually accomplish anything. Mixing known values
into the pool is harmless because the mixing function is fully
reversable. ie:

unmix(sample, mix(sample, pool)) = pool

If you didn't know the contents of pool before, you can't guess it
after.

The danger comes if you (a) already know pool and (b) can narrow sample
to a tractable set of values. Then you can calculate a set of possible
pool' values and compare the resulting output to confirm the actual
state of pool' (aka state extension attack). Sticking a constant
multiplier on sample doesn't affect this attack at all.

> However, the big issue with this is that it's recursive... what causes
> this to be invoked... probably an interrupt, which is going to have been
> invoked by a timer, quite possible the TSC deadline timer. Oops.

..and the sampling function already mixes in a TSC timestamp with the
provided timestamp.

--
Mathematics is the supreme nostalgia of our time.

2011-06-14 15:19:01

by Jarod Wilson

[permalink] [raw]
Subject: Re: [PATCH 0/5] Feed entropy pool via high-resolution clocksources

Matt Mackall wrote:
> On Mon, 2011-06-13 at 18:06 -0400, Jarod Wilson wrote:
>> Many server systems are seriously lacking in sources of entropy,
>> as we typically only feed the entropy pool by way of input layer
>> events, a few NIC driver interrupts and disk activity. A non-busy
>> server can easily become entropy-starved. We can mitigate this
>> somewhat by periodically mixing in entropy data based on the
>> delta between multiple high-resolution clocksource reads, per:
>>
>> https://www.osadl.org/Analysis-of-inherent-randomness-of-the-L.rtlws11-developers-okech.0.html
>>
>> Additionally, NIST already approves of similar implementations, so
>> this should be usable in high-securtiy deployments requiring a
>> fair chunk of available entropy data for frequent use of /dev/random.
>
> So, mixed feelings here:
>
> Yes: it's a great idea to regularly mix other data into the pool. More
> samples are always better for RNG quality.
>
> Maybe: the current RNG is not really designed with high-bandwidth
> entropy sources in mind, so this might introduce non-negligible overhead
> in systems with, for instance, huge numbers of CPUs.

The current implementation is opt-in, and single-threaded, so at least
currently, I don't think there should be any significant issues. But
yeah, there's nothing currently in the implementation preventing a
variant that is per-cpu, which could certainly lead to some scalability
issues.

> No: it's not a great idea to _credit_ the entropy count with this data.
> Someone watching the TSC or HPET from userspace can guess when samples
> are added by watching for drop-outs in their sampling (ie classic timing
> attack).

I'm admittedly a bit of a novice in this area... Why does it matter if
someone watching knows more or less when a sample is added? It doesn't
really reveal anything about the sample itself, if we're using a
high-granularity counter value's low bits -- round-trip to userspace has
all sorts of inherent timing jitter, so determining the low-order bits
the kernel got by monitoring from userspace should be more or less
impossible. And the pool is constantly changing, making it a less static
target on an otherwise mostly idle system.

> (I see you do credit only 1 bit per byte: that's fairly conservative,
> true, but it must be _perfectly conservative_ for the theoretical
> requirements of /dev/random to be met. These requirements are in fact
> known to be unfulfillable in practice(!), but that doesn't mean we
> should introduce more users of entropy accounting. Instead, it means
> that entropy accounting is broken and needs to be removed.)

Hrm. The government seems to have a different opinion. Various certs
have requirements for some sort of entropy accounting and minimum
estimated entropy guarantees. We can certainly be even more conservative
than 1 bit per byte, but yeah, I don't really have a good answer for
perfectly conservative, and I don't know what might result (on the
government cert front) from removing entropy accounting altogether...

Any thoughts on the idea of mixing clocksource bits with reads from
ansi_cprng? We could mix in more bytes while still only crediting one
bit, and periodically reseed ansi_cprng from the clocksource, or
something along those lines... This may be entirely orthogonal to the
timing attack issue you're talking about though. :)

--
Jarod Wilson
[email protected]

2011-06-14 15:23:15

by Jarod Wilson

[permalink] [raw]
Subject: Re: [PATCH 0/5] Feed entropy pool via high-resolution clocksources

Jarod Wilson wrote:
> Matt Mackall wrote:
>> On Mon, 2011-06-13 at 18:06 -0400, Jarod Wilson wrote:
>>> Many server systems are seriously lacking in sources of entropy,
>>> as we typically only feed the entropy pool by way of input layer
>>> events, a few NIC driver interrupts and disk activity. A non-busy
>>> server can easily become entropy-starved. We can mitigate this
>>> somewhat by periodically mixing in entropy data based on the
>>> delta between multiple high-resolution clocksource reads, per:
>>>
>>> https://www.osadl.org/Analysis-of-inherent-randomness-of-the-L.rtlws11-developers-okech.0.html
>>>
>>>
>>> Additionally, NIST already approves of similar implementations, so
>>> this should be usable in high-securtiy deployments requiring a
>>> fair chunk of available entropy data for frequent use of /dev/random.
>>
>> So, mixed feelings here:
>>
>> Yes: it's a great idea to regularly mix other data into the pool. More
>> samples are always better for RNG quality.
>>
>> Maybe: the current RNG is not really designed with high-bandwidth
>> entropy sources in mind, so this might introduce non-negligible overhead
>> in systems with, for instance, huge numbers of CPUs.
>
> The current implementation is opt-in, and single-threaded, so at least
> currently, I don't think there should be any significant issues.

I stand corrected. Hadn't considered the possible issues with doing a
regular preempt_disable() and __get_cpu_var() on a system with tons of
cpus. (I'm still not sure exactly what the issues would be, but I think
I see the potential for issues of some sort.)

--
Jarod Wilson
[email protected]

2011-06-14 17:13:52

by Matt Mackall

[permalink] [raw]
Subject: Re: [PATCH 0/5] Feed entropy pool via high-resolution clocksources

On Tue, 2011-06-14 at 11:18 -0400, Jarod Wilson wrote:
> Matt Mackall wrote:
> > On Mon, 2011-06-13 at 18:06 -0400, Jarod Wilson wrote:
> >> Many server systems are seriously lacking in sources of entropy,
> >> as we typically only feed the entropy pool by way of input layer
> >> events, a few NIC driver interrupts and disk activity. A non-busy
> >> server can easily become entropy-starved. We can mitigate this
> >> somewhat by periodically mixing in entropy data based on the
> >> delta between multiple high-resolution clocksource reads, per:
> >>
> >> https://www.osadl.org/Analysis-of-inherent-randomness-of-the-L.rtlws11-developers-okech.0.html
> >>
> >> Additionally, NIST already approves of similar implementations, so
> >> this should be usable in high-securtiy deployments requiring a
> >> fair chunk of available entropy data for frequent use of /dev/random.
> >
> > So, mixed feelings here:
> >
> > Yes: it's a great idea to regularly mix other data into the pool. More
> > samples are always better for RNG quality.
> >
> > Maybe: the current RNG is not really designed with high-bandwidth
> > entropy sources in mind, so this might introduce non-negligible overhead
> > in systems with, for instance, huge numbers of CPUs.
>
> The current implementation is opt-in, and single-threaded, so at least
> currently, I don't think there should be any significant issues. But
> yeah, there's nothing currently in the implementation preventing a
> variant that is per-cpu, which could certainly lead to some scalability
> issues.

The pool itself is single-threaded. On large-ish machines (100+ CPUs),
we've seen contention rise to 60% or more. Hence the addition of the
trickle threshold. But I can see that breaking down with a lot more
writers.

> > No: it's not a great idea to _credit_ the entropy count with this data.
> > Someone watching the TSC or HPET from userspace can guess when samples
> > are added by watching for drop-outs in their sampling (ie classic timing
> > attack).
>
> I'm admittedly a bit of a novice in this area... Why does it matter if
> someone watching knows more or less when a sample is added? It doesn't
> really reveal anything about the sample itself, if we're using a
> high-granularity counter value's low bits -- round-trip to userspace has
> all sorts of inherent timing jitter, so determining the low-order bits
> the kernel got by monitoring from userspace should be more or less
> impossible. And the pool is constantly changing, making it a less static
> target on an otherwise mostly idle system.

I recommend you do some Google searches for "ssl timing attack" and "aes
timing attack" to get a feel for the kind of seemingly impossible things
that can be done and thereby recalibrate your scale of the impossible.

> > (I see you do credit only 1 bit per byte: that's fairly conservative,
> > true, but it must be _perfectly conservative_ for the theoretical
> > requirements of /dev/random to be met. These requirements are in fact
> > known to be unfulfillable in practice(!), but that doesn't mean we
> > should introduce more users of entropy accounting. Instead, it means
> > that entropy accounting is broken and needs to be removed.)
>
> Hrm. The government seems to have a different opinion. Various certs
> have requirements for some sort of entropy accounting and minimum
> estimated entropy guarantees. We can certainly be even more conservative
> than 1 bit per byte, but yeah, I don't really have a good answer for
> perfectly conservative, and I don't know what might result (on the
> government cert front) from removing entropy accounting altogether...

Well, the deal with accounting is this: if you earn $.90 and spend $1.00
every day, you'll eventually go broke, even if your
rounded-to-the-nearest-dollar accounting tells you you're solidly in the
black.

The only distinction between /dev/random and urandom is that we claim
that /dev/random is always solidly in the black. But as we don't have a
firm theoretical basis for making our accounting estimates on the input
side, the whole accounting thing kind of breaks down into a kind of
busted rate-limiter.

We'd do better counting a raw number of samples per source, and then
claiming that we've reached a 'full' state when we reach a certain
'diversity x depth' score. And then assuring we have a lot of diversity
and depth going into the pool.


> Any thoughts on the idea of mixing clocksource bits with reads from
> ansi_cprng?

Useless. The definition of entropy here can be thought of as 'log(volume
of state space that can't be observed by an attacker)'. There is nothing
we can do algorithmically to a sample that will increase that volume, we
can only shrink it! And any function that is not completely reversible
(aka 1:1) will in fact shrink that volume, so you have to be very
careful here. The mixing primitives are already quite solid, no need to
layer on more potentially faulty ones.


--
Mathematics is the supreme nostalgia of our time.

2011-06-14 17:48:38

by Kent Borg

[permalink] [raw]
Subject: Re: [PATCH 4/5] tsc: wire up entropy generation function

Matt Mackall wrote:
> Kent Borg wrote:
>>So network entropy is being eradicated, and nothing is being done to
>>replace it.
>
>Nothing is being done is a more accurate summary of the situation.


So the feature-removal-schedule.txt entry about IRQF_SAMPLE_RANDOM is
obsolete?

(Then the trend from three network drivers to ten network drivers should
be accelerated and restore we should restore the deleted instances of
IRQF_SAMPLE_RANDOM??)


-kb, the Kent who sees network adapter interrupt timing as a usually
good source of entropy.

2011-06-14 18:00:36

by Matt Mackall

[permalink] [raw]
Subject: Re: [PATCH 4/5] tsc: wire up entropy generation function

On Tue, 2011-06-14 at 13:48 -0400, Kent Borg wrote:
> Matt Mackall wrote:
> > Kent Borg wrote:
> >>So network entropy is being eradicated, and nothing is being done to
> >>replace it.
> >
> >Nothing is being done is a more accurate summary of the situation.
>
>
> So the feature-removal-schedule.txt entry about IRQF_SAMPLE_RANDOM is
> obsolete?
>
> (Then the trend from three network drivers to ten network drivers should
> be accelerated and restore we should restore the deleted instances of
> IRQF_SAMPLE_RANDOM??)
>
>
> -kb, the Kent who sees network adapter interrupt timing as a usually
> good source of entropy.

It's a great source of potential entropy, a bad source of guaranteed
entropy. The current RNG tries to do accounting on the latter.
Accounting on the former is extremely suspect.

--
Mathematics is the supreme nostalgia of our time.

2011-06-14 18:12:00

by H. Peter Anvin

[permalink] [raw]
Subject: Re: [PATCH 4/5] tsc: wire up entropy generation function

On 06/13/2011 04:10 PM, Matt Mackall wrote:
>
> Well the issue is that samples are going to be roughly aligned to some
> multiple of the bus frequency. If an interrupt occurs on bus cycle X,
> this code will be hit at roughly TSC cycle X*M+d.
>
>> This is correct; at the very least I would multiply the low 32 bits of
>> the TSC with a 32-bit prime number before mixing.
>
> This multiply doesn't actually accomplish anything. Mixing known values
> into the pool is harmless because the mixing function is fully
> reversable. ie:
>
> unmix(sample, mix(sample, pool)) = pool
>
> If you didn't know the contents of pool before, you can't guess it
> after.
>
> The danger comes if you (a) already know pool and (b) can narrow sample
> to a tractable set of values. Then you can calculate a set of possible
> pool' values and compare the resulting output to confirm the actual
> state of pool' (aka state extension attack). Sticking a constant
> multiplier on sample doesn't affect this attack at all.
>

It only matters if you actually truncate it to LSBs.

>> However, the big issue with this is that it's recursive... what causes
>> this to be invoked... probably an interrupt, which is going to have been
>> invoked by a timer, quite possible the TSC deadline timer. Oops.
>
> ..and the sampling function already mixes in a TSC timestamp with the
> provided timestamp.

Right.

-hpa

2011-06-14 20:04:08

by Kent Borg

[permalink] [raw]
Subject: Re: [PATCH 4/5] tsc: wire up entropy generation function

Matt Mackall wrote:
> [network adapters are] a great source of potential entropy, a bad
> source of guaranteed entropy. The current RNG tries to do
> accounting on the latter. Accounting on the former is extremely
> suspect.

So we need a patch that:

- Deletes the IRQF_SAMPLE_RANDOM mention in feature-removal-schedule.txt,

- Restores instances of IRQF_SAMPLE_RANDOM in drivers, and

- Changes the credit_entropy_bits() to credit less entropy*.

* The code seems to only handle integer values of entropy. Maybe
when crediting, choose between 1 and 0 credits.

Then once that kernel makes it into the field, a bunch of entropy
starved machines will no longer be entropy starved. A few machines that
are run by people who worship an entropy estimate will still have to
install mice and users, explicit RNGs, etc., but entropy will flow.

Make sense?


-kb

2011-06-14 20:17:46

by Jarod Wilson

[permalink] [raw]
Subject: Re: [PATCH 0/5] Feed entropy pool via high-resolution clocksources

Matt Mackall wrote:
> On Tue, 2011-06-14 at 11:18 -0400, Jarod Wilson wrote:
>> Matt Mackall wrote:
...
>>> No: it's not a great idea to _credit_ the entropy count with this data.
>>> Someone watching the TSC or HPET from userspace can guess when samples
>>> are added by watching for drop-outs in their sampling (ie classic timing
>>> attack).
>> I'm admittedly a bit of a novice in this area... Why does it matter if
>> someone watching knows more or less when a sample is added? It doesn't
>> really reveal anything about the sample itself, if we're using a
>> high-granularity counter value's low bits -- round-trip to userspace has
>> all sorts of inherent timing jitter, so determining the low-order bits
>> the kernel got by monitoring from userspace should be more or less
>> impossible. And the pool is constantly changing, making it a less static
>> target on an otherwise mostly idle system.
>
> I recommend you do some Google searches for "ssl timing attack" and "aes
> timing attack" to get a feel for the kind of seemingly impossible things
> that can be done and thereby recalibrate your scale of the impossible.

Hm. These are attempting to reveal a static key though. We're talking
about trying to reveal the exact value of the counter when it was read
by the kernel. And trying to do so repeatedly, several times per second.
And this can't be done without getting some form of local system access,
so far as I know. And the act of trying to monitor and calculate deltas
should serve to introduce even more potential randomness into the actual
clock read deltas.

This code is largely spurned on by someone here at Red Hat who I
probably should have had in the cc list to begin with, Steve Grubb, who
pointed to slides 23-25 and the chart in slide 30 of this doc...

https://www.osadl.org/fileadmin/dam/presentations/RTLWS11/okech-inherent-randomness.pdf

...as the primary arguments for why this is a good source of entropy.


>>> (I see you do credit only 1 bit per byte: that's fairly conservative,
>>> true, but it must be _perfectly conservative_ for the theoretical
>>> requirements of /dev/random to be met. These requirements are in fact
>>> known to be unfulfillable in practice(!), but that doesn't mean we
>>> should introduce more users of entropy accounting. Instead, it means
>>> that entropy accounting is broken and needs to be removed.)
>> Hrm. The government seems to have a different opinion. Various certs
>> have requirements for some sort of entropy accounting and minimum
>> estimated entropy guarantees. We can certainly be even more conservative
>> than 1 bit per byte, but yeah, I don't really have a good answer for
>> perfectly conservative, and I don't know what might result (on the
>> government cert front) from removing entropy accounting altogether...
>
> Well, the deal with accounting is this: if you earn $.90 and spend $1.00
> every day, you'll eventually go broke, even if your
> rounded-to-the-nearest-dollar accounting tells you you're solidly in the
> black.
>
> The only distinction between /dev/random and urandom is that we claim
> that /dev/random is always solidly in the black. But as we don't have a
> firm theoretical basis for making our accounting estimates on the input
> side, the whole accounting thing kind of breaks down into a kind of
> busted rate-limiter.

Well, they *are* understood to be estimates, and /dev/random does block
when we've spent everything we've (estimated we've) got, and at least
circa 2.6.18 in RHEL5.4, NIST was satisfied that /dev/random's
estimation was "good enough" by way of some statistical analysis done on
data dumped out of it. What if we could show through statistical
analysis that our entropy estimation is still good enough even with
clock data mixed in? (Ignoring the potential of timing attacks for the
moment).


> We'd do better counting a raw number of samples per source, and then
> claiming that we've reached a 'full' state when we reach a certain
> 'diversity x depth' score. And then assuring we have a lot of diversity
> and depth going into the pool.

Hrm. I presume NIST and friends would still need some way to translate
that into estimated bits of entropy for the purposes of having a common
metric with other systems, but I guess we could feel better about the
entropy if we had some sort of guarantee that no more than x% came from
a single entropy source -- such as, say, no more than 25% of your
entropy bits are from a clocksource. But then we may end up right back
where we are right now -- a blocking entropy-starved /dev/random on a
server system that has no other significant sources generating entropy.

--
Jarod Wilson
[email protected]

2011-06-14 21:04:23

by Matt Mackall

[permalink] [raw]
Subject: Re: [PATCH 4/5] tsc: wire up entropy generation function

On Tue, 2011-06-14 at 16:04 -0400, Kent Borg wrote:
> Matt Mackall wrote:
> > [network adapters are] a great source of potential entropy, a bad
> > source of guaranteed entropy. The current RNG tries to do
> > accounting on the latter. Accounting on the former is extremely
> > suspect.
>
> So we need a patch that:
>
> - Deletes the IRQF_SAMPLE_RANDOM mention in feature-removal-schedule.txt,
>
> - Restores instances of IRQF_SAMPLE_RANDOM in drivers, and
>
> - Changes the credit_entropy_bits() to credit less entropy*.
>
> * The code seems to only handle integer values of entropy. Maybe
> when crediting, choose between 1 and 0 credits.

No, that's not what I'm saying here. I'm saying we need to rethink
entropy accounting entirely.

An alternate plan is:

- add an add_network_randomness hook in the network stack that feeds
samples with 0 entropy credits (strengthen /dev/urandom)

- add similar sources elsewhere in the kernel

- remove the entropy accounting framework so that /dev/random is the
same as /dev/urandom


I currently don't think either of these is quite right for lots of
interesting systems and the answer is somewhere in the middle.


Let me describle one extremely common pathological case: Wifi routers.
Many of these have effectively no storage and don't preserve pool data
across boots or a battery-backed RTC so they come up in a 'known' pool
state. Many of them also don't have a high-resolution time source or
cycle counter available. So guessing the timestamp on a network packet
is just about trivial. State extension attacks here are probably quite
easy (though I don't know that anyone's tried it).

So I think what we want to instead do is simply count samples until
we've got enough unleaked internal state that state extension becomes
hopeless. If we assume an attacker can guess each sample with 99%
accuracy (.08 bits per sample), pool state extension becomes intractable
(> 2^64 guesses) at a batch size on the order of 1000 samples. If we
have inputs from a variety of sources (network, keyboard, mouse, etc),
this number is probably quite a bit smaller.

This is a lot like the existing catastrophic reseed logic, except that
the effective multiplier is much larger and probably implies another
pool in the chain.

--
Mathematics is the supreme nostalgia of our time.

2011-06-14 21:45:07

by Matt Mackall

[permalink] [raw]
Subject: Re: [PATCH 0/5] Feed entropy pool via high-resolution clocksources

On Tue, 2011-06-14 at 16:17 -0400, Jarod Wilson wrote:
> Matt Mackall wrote:
> > On Tue, 2011-06-14 at 11:18 -0400, Jarod Wilson wrote:
> >> Matt Mackall wrote:
> ...
> >>> No: it's not a great idea to _credit_ the entropy count with this data.
> >>> Someone watching the TSC or HPET from userspace can guess when samples
> >>> are added by watching for drop-outs in their sampling (ie classic timing
> >>> attack).
> >> I'm admittedly a bit of a novice in this area... Why does it matter if
> >> someone watching knows more or less when a sample is added? It doesn't
> >> really reveal anything about the sample itself, if we're using a
> >> high-granularity counter value's low bits -- round-trip to userspace has
> >> all sorts of inherent timing jitter, so determining the low-order bits
> >> the kernel got by monitoring from userspace should be more or less
> >> impossible. And the pool is constantly changing, making it a less static
> >> target on an otherwise mostly idle system.
> >
> > I recommend you do some Google searches for "ssl timing attack" and "aes
> > timing attack" to get a feel for the kind of seemingly impossible things
> > that can be done and thereby recalibrate your scale of the impossible.
>
> Hm. These are attempting to reveal a static key though. We're talking
> about trying to reveal the exact value of the counter when it was read
> by the kernel. And trying to do so repeatedly, several times per second.

I read this as "I am not yet properly recalibrated".

Yes, it's hard. Hard != impractical.

> And this can't be done without getting some form of local system access,

Ok, now Google "remote timing attack".

> This code is largely spurned on by someone here at Red Hat who I
> probably should have had in the cc list to begin with, Steve Grubb, who
> pointed to slides 23-25 and the chart in slide 30 of this doc...
>
> https://www.osadl.org/fileadmin/dam/presentations/RTLWS11/okech-inherent-randomness.pdf
>
> ...as the primary arguments for why this is a good source of entropy.

..on a sixth-generation desktop CPU with a cycle-accurate counter.

Welcome to the real world, where that's now a tiny minority of deployed
systems.

But that's not even the point. Entropy accounting here is about
providing a theoretical level of security above "cryptographically
strong". As the source says:

"Even if it is possible to analyze SHA in some clever way, as long as
the amount of data returned from the generator is less than the inherent
entropy in the pool, the output data is totally unpredictable."

This is the goal of the code as it exists. And that goal depends on
consistent _underestimates_ and accurate accounting.


Look, I understand what I'm trying to say here is very confusing, so
please make an effort to understand all the pieces together:

- the driver is designed for -perfect- security as described above
- the usual assumptions about observability of network samples and other
timestamps ARE FALSE on COMMON NON-PC HARDWARE
- thus network sampling is incompatible with the CURRENT design
- nonetheless, the current design of entropy accounting is not actually
meeting its goals in practice
- thus we need an alternative to entropy accounting
- that alternative WILL be compatible with sampling insecure sources

> Well, they *are* understood to be estimates, and /dev/random does block
> when we've spent everything we've (estimated we've) got, and at least
> circa 2.6.18 in RHEL5.4, NIST was satisfied that /dev/random's
> estimation was "good enough" by way of some statistical analysis done on
> data dumped out of it.

A modern stream cipher output on a -single- sample will also pass that
test. These sorts of statistical tests can never be more than a basic
sanity check because they only look at output streams. A real attacker
tries to observe inputs and correlate their observations with outputs.

--
Mathematics is the supreme nostalgia of our time.

2011-06-14 22:52:25

by Jarod Wilson

[permalink] [raw]
Subject: Re: [PATCH 0/5] Feed entropy pool via high-resolution clocksources

Matt Mackall wrote:
> On Tue, 2011-06-14 at 16:17 -0400, Jarod Wilson wrote:
>> Matt Mackall wrote:
>>> On Tue, 2011-06-14 at 11:18 -0400, Jarod Wilson wrote:
>>>> Matt Mackall wrote:
>> ...
>>>>> No: it's not a great idea to _credit_ the entropy count with this data.
>>>>> Someone watching the TSC or HPET from userspace can guess when samples
>>>>> are added by watching for drop-outs in their sampling (ie classic timing
>>>>> attack).
>>>> I'm admittedly a bit of a novice in this area... Why does it matter if
>>>> someone watching knows more or less when a sample is added? It doesn't
>>>> really reveal anything about the sample itself, if we're using a
>>>> high-granularity counter value's low bits -- round-trip to userspace has
>>>> all sorts of inherent timing jitter, so determining the low-order bits
>>>> the kernel got by monitoring from userspace should be more or less
>>>> impossible. And the pool is constantly changing, making it a less static
>>>> target on an otherwise mostly idle system.
>>> I recommend you do some Google searches for "ssl timing attack" and "aes
>>> timing attack" to get a feel for the kind of seemingly impossible things
>>> that can be done and thereby recalibrate your scale of the impossible.
>> Hm. These are attempting to reveal a static key though. We're talking
>> about trying to reveal the exact value of the counter when it was read
>> by the kernel. And trying to do so repeatedly, several times per second.
>
> I read this as "I am not yet properly recalibrated".

Probably not. :)

> Yes, it's hard. Hard != impractical.
>
>> And this can't be done without getting some form of local system access,
>
> Ok, now Google "remote timing attack".

The stuff I'm reading seems to require that the data you're trying to
discern is somehow exposed over the network, which so far as I know, the
entropy input pool isn't, but you obviously know this stuff WAY better
than I do, so I'll stop trying. ;)

>> This code is largely spurned on by someone here at Red Hat who I
>> probably should have had in the cc list to begin with, Steve Grubb, who
>> pointed to slides 23-25 and the chart in slide 30 of this doc...
>>
>> https://www.osadl.org/fileadmin/dam/presentations/RTLWS11/okech-inherent-randomness.pdf
>>
>> ...as the primary arguments for why this is a good source of entropy.
>
> ..on a sixth-generation desktop CPU with a cycle-accurate counter.
>
> Welcome to the real world, where that's now a tiny minority of deployed
> systems.

Sure, but that's part of why only the hpet and tsc clocksources were
wired up in this patchset.

> But that's not even the point. Entropy accounting here is about
> providing a theoretical level of security above "cryptographically
> strong". As the source says:
>
> "Even if it is possible to analyze SHA in some clever way, as long as
> the amount of data returned from the generator is less than the inherent
> entropy in the pool, the output data is totally unpredictable."
>
> This is the goal of the code as it exists. And that goal depends on
> consistent _underestimates_ and accurate accounting.

Okay, so as you noted, I was only crediting one bit of entropy per byte
mixed in. Would there be some higher mixed-to-credited ratio that might
be sufficient to meet the goal?

> Look, I understand what I'm trying to say here is very confusing, so
> please make an effort to understand all the pieces together:
>
> - the driver is designed for -perfect- security as described above
> - the usual assumptions about observability of network samples and other
> timestamps ARE FALSE on COMMON NON-PC HARDWARE
> - thus network sampling is incompatible with the CURRENT design
> - nonetheless, the current design of entropy accounting is not actually
> meeting its goals in practice

Heh, I guess that answers my question already...

> - thus we need an alternative to entropy accounting
> - that alternative WILL be compatible with sampling insecure sources

Okay. So I admit to really only considering and/or caring about x86
hardware, which doesn't seem to have helped my cause. But you do seem to
be saying that clocksource-based sampling *will* be compatible with the
new alternative, correct? And is said alternative something on the
relatively near-term radar?


--
Jarod Wilson
[email protected]

2011-06-14 23:27:05

by Matt Mackall

[permalink] [raw]
Subject: Re: [PATCH 0/5] Feed entropy pool via high-resolution clocksources

On Tue, 2011-06-14 at 18:51 -0400, Jarod Wilson wrote:
> Matt Mackall wrote:
> > On Tue, 2011-06-14 at 16:17 -0400, Jarod Wilson wrote:
> >> Matt Mackall wrote:
> >>> On Tue, 2011-06-14 at 11:18 -0400, Jarod Wilson wrote:
> >>>> Matt Mackall wrote:
> >> ...
> >>>>> No: it's not a great idea to _credit_ the entropy count with this data.
> >>>>> Someone watching the TSC or HPET from userspace can guess when samples
> >>>>> are added by watching for drop-outs in their sampling (ie classic timing
> >>>>> attack).
> >>>> I'm admittedly a bit of a novice in this area... Why does it matter if
> >>>> someone watching knows more or less when a sample is added? It doesn't
> >>>> really reveal anything about the sample itself, if we're using a
> >>>> high-granularity counter value's low bits -- round-trip to userspace has
> >>>> all sorts of inherent timing jitter, so determining the low-order bits
> >>>> the kernel got by monitoring from userspace should be more or less
> >>>> impossible. And the pool is constantly changing, making it a less static
> >>>> target on an otherwise mostly idle system.
> >>> I recommend you do some Google searches for "ssl timing attack" and "aes
> >>> timing attack" to get a feel for the kind of seemingly impossible things
> >>> that can be done and thereby recalibrate your scale of the impossible.
> >> Hm. These are attempting to reveal a static key though. We're talking
> >> about trying to reveal the exact value of the counter when it was read
> >> by the kernel. And trying to do so repeatedly, several times per second.
> >
> > I read this as "I am not yet properly recalibrated".
>
> Probably not. :)
>
> > Yes, it's hard. Hard != impractical.
> >
> >> And this can't be done without getting some form of local system access,
> >
> > Ok, now Google "remote timing attack".
>
> The stuff I'm reading seems to require that the data you're trying to
> discern is somehow exposed over the network, which so far as I know, the
> entropy input pool isn't, but you obviously know this stuff WAY better
> than I do, so I'll stop trying. ;)
>
> >> This code is largely spurned on by someone here at Red Hat who I
> >> probably should have had in the cc list to begin with, Steve Grubb, who
> >> pointed to slides 23-25 and the chart in slide 30 of this doc...
> >>
> >> https://www.osadl.org/fileadmin/dam/presentations/RTLWS11/okech-inherent-randomness.pdf
> >>
> >> ...as the primary arguments for why this is a good source of entropy.
> >
> > ..on a sixth-generation desktop CPU with a cycle-accurate counter.
> >
> > Welcome to the real world, where that's now a tiny minority of deployed
> > systems.
>
> Sure, but that's part of why only the hpet and tsc clocksources were
> wired up in this patchset.

> > But that's not even the point. Entropy accounting here is about
> > providing a theoretical level of security above "cryptographically
> > strong". As the source says:
> >
> > "Even if it is possible to analyze SHA in some clever way, as long as
> > the amount of data returned from the generator is less than the inherent
> > entropy in the pool, the output data is totally unpredictable."
> >
> > This is the goal of the code as it exists. And that goal depends on
> > consistent _underestimates_ and accurate accounting.
>
> Okay, so as you noted, I was only crediting one bit of entropy per byte
> mixed in. Would there be some higher mixed-to-credited ratio that might
> be sufficient to meet the goal?

As I've mentioned elsewhere, I think something around .08 bits per
timestamp is probably a good target. That's the entropy content of a
coin-flip that is biased to flip heads 99 times out of 100. But even
that isn't good enough in the face of a 100Hz clock source.

And obviously the current system doesn't handle fractional bits at all.

> > Look, I understand what I'm trying to say here is very confusing, so
> > please make an effort to understand all the pieces together:
> >
> > - the driver is designed for -perfect- security as described above
> > - the usual assumptions about observability of network samples and other
> > timestamps ARE FALSE on COMMON NON-PC HARDWARE
> > - thus network sampling is incompatible with the CURRENT design
> > - nonetheless, the current design of entropy accounting is not actually
> > meeting its goals in practice
>
> Heh, I guess that answers my question already...
>
> > - thus we need an alternative to entropy accounting
> > - that alternative WILL be compatible with sampling insecure sources
>
> Okay. So I admit to really only considering and/or caring about x86
> hardware, which doesn't seem to have helped my cause. But you do seem to
> be saying that clocksource-based sampling *will* be compatible with the
> new alternative, correct? And is said alternative something on the
> relatively near-term radar?

Various people have offered to spend some time fixing this; I haven't
had time to look at it for a while.

--
Mathematics is the supreme nostalgia of our time.

2011-06-15 14:50:24

by Jarod Wilson

[permalink] [raw]
Subject: Re: [PATCH 0/5] Feed entropy pool via high-resolution clocksources

Matt Mackall wrote:
> On Tue, 2011-06-14 at 18:51 -0400, Jarod Wilson wrote:
>> Matt Mackall wrote:
...
>>> But that's not even the point. Entropy accounting here is about
>>> providing a theoretical level of security above "cryptographically
>>> strong". As the source says:
>>>
>>> "Even if it is possible to analyze SHA in some clever way, as long as
>>> the amount of data returned from the generator is less than the inherent
>>> entropy in the pool, the output data is totally unpredictable."
>>>
>>> This is the goal of the code as it exists. And that goal depends on
>>> consistent _underestimates_ and accurate accounting.
>> Okay, so as you noted, I was only crediting one bit of entropy per byte
>> mixed in. Would there be some higher mixed-to-credited ratio that might
>> be sufficient to meet the goal?
>
> As I've mentioned elsewhere, I think something around .08 bits per
> timestamp is probably a good target. That's the entropy content of a
> coin-flip that is biased to flip heads 99 times out of 100. But even
> that isn't good enough in the face of a 100Hz clock source.
>
> And obviously the current system doesn't handle fractional bits at all.

What if only one bit every n samples were credited? So 1/n bits per
timestamp, effectively, and for an n of 100, that would yield .01 bits
per timestamp. Something like this:

void add_clocksource_randomness(int clock_delta)
{
static int samples;
/* only mix in the low byte */
u8 mix = clock_delta & 0xff;

DEBUG_ENT("clock event %u\n", mix);

preempt_disable();
if (input_pool.entropy_count > trickle_thresh &&
(__get_cpu_var(trickle_count)++ & 0xfff))
goto out;

mix_pool_bytes(&input_pool, &mix, sizeof(mix));
samples++;
/* Only credit one bit per 100 samples to be conservative */
if (samples == 100) {
credit_entropy_bits(&input_pool, sizeof(mix));
samples = 0;
}

out:
preempt_enable();
}


Additionally, this function would NOT be exported, it would only be
utilized by a new clocksource entropy contribution function in
kernel/time/clocksource.c. Locally, I've made most of the changes as
discussed with John, so clocksources now have an entropy rating rather
than an entropy function, and if the rating is not high enough, the
clocksource won't be able to add entropy -- all clocksources default to
a rating of 0, only hpet and tsc have been marked otherwise.

Additionally, hpet has a higher rating than tsc, so it'll be preferred
over tsc, even if tsc is the system timer clocksource. This code will
effectively do absolutely nothing if not running on x86 PC hardware with
an hpet or tsc (and it seems maybe tsc shouldn't even be considered, so
perhaps this should be hpet-only).

One further thought: what if reads of both the hpet and tsc were mixed
together to form the sample value actually fed into the entropy pool?
This of course assumes the system has both available, and that the tsc
is actually fine-grained enough to be usable, but maybe it strengthens
the randomness of the sample value at least somewhat?

This could also be marked as experimental or dangerous or what have you,
so that its a kernel builder's conscious decision to enable clock-based
entropy contributions.

(If I appear to be grasping at straws here, well, I probably am.) ;)

>>> Look, I understand what I'm trying to say here is very confusing, so
>>> please make an effort to understand all the pieces together:
>>>
>>> - the driver is designed for -perfect- security as described above
>>> - the usual assumptions about observability of network samples and other
>>> timestamps ARE FALSE on COMMON NON-PC HARDWARE
>>> - thus network sampling is incompatible with the CURRENT design
>>> - nonetheless, the current design of entropy accounting is not actually
>>> meeting its goals in practice
>> Heh, I guess that answers my question already...
>>
>>> - thus we need an alternative to entropy accounting
>>> - that alternative WILL be compatible with sampling insecure sources
>> Okay. So I admit to really only considering and/or caring about x86
>> hardware, which doesn't seem to have helped my cause. But you do seem to
>> be saying that clocksource-based sampling *will* be compatible with the
>> new alternative, correct? And is said alternative something on the
>> relatively near-term radar?
>
> Various people have offered to spend some time fixing this; I haven't
> had time to look at it for a while.

Okay, I know how that goes. So not likely to come to fruition in the
immediate near-term. I'd offer to spend some time working on it, but I
don't think I'm qualified. :)

--
Jarod Wilson
[email protected]

2011-06-15 20:06:48

by Matt Mackall

[permalink] [raw]
Subject: Re: [PATCH 0/5] Feed entropy pool via high-resolution clocksources

On Wed, 2011-06-15 at 10:49 -0400, Jarod Wilson wrote:
> Matt Mackall wrote:
> > On Tue, 2011-06-14 at 18:51 -0400, Jarod Wilson wrote:
> >> Matt Mackall wrote:
> ...
> >>> But that's not even the point. Entropy accounting here is about
> >>> providing a theoretical level of security above "cryptographically
> >>> strong". As the source says:
> >>>
> >>> "Even if it is possible to analyze SHA in some clever way, as long as
> >>> the amount of data returned from the generator is less than the inherent
> >>> entropy in the pool, the output data is totally unpredictable."
> >>>
> >>> This is the goal of the code as it exists. And that goal depends on
> >>> consistent _underestimates_ and accurate accounting.
> >> Okay, so as you noted, I was only crediting one bit of entropy per byte
> >> mixed in. Would there be some higher mixed-to-credited ratio that might
> >> be sufficient to meet the goal?
> >
> > As I've mentioned elsewhere, I think something around .08 bits per
> > timestamp is probably a good target. That's the entropy content of a
> > coin-flip that is biased to flip heads 99 times out of 100. But even
> > that isn't good enough in the face of a 100Hz clock source.
> >
> > And obviously the current system doesn't handle fractional bits at all.
>
> What if only one bit every n samples were credited? So 1/n bits per
> timestamp, effectively, and for an n of 100, that would yield .01 bits
> per timestamp. Something like this:

Something like that would "work", sure. But it's a hack/abuse -relative
to the current framework-. I'm reluctant to just pile on the hacks on
the current system, as that just means getting it coherent is that much
further away.

The current system says things like "I've gotten 20 samples at intervals
that look vaguely random based on their timestamps, I'm calling that 64
bits of entropy. That's enough to reseed!" But what it doesn't know is
that those all came from the local network from an attacker with access
to a better clock than us. Or that they all came from an HPET, but the
attacker was directly driving its firing. Or that they came from a
wireless mouse, and the attacker has an RF snooper. So that in the end,
it's only 20 bits of entropy and the attacker can brute-force its way
through the state space. (Yes, this is obviously paranoid, because
that's a ground rule.)

A better framework would say something like "I don't actually pretend to
know how to 'measure' entropy, but I've got 1000 samples batched from 4
different subsystems (clock, scheduler, network, block I/O), an attacker
is going to have a very hard time monitoring/predicting all of those,
and even if it's got 99% accuracy per sample on all sources, it's still
got > 2^64 work to guess the current state. Let's refresh our pool and
call it full". See?

Here's a sketch. Each subsystem does something like:

add_rng_sample(RNG_NETWORK, some_packet_data);

And that code does something like:

pool = per_cpu(sample_pool);
timestamp = sched_clock();
mix(pool, MAKESAMPLE(sample_data, source, timestamp), sizeof(rng_sample));

pool.sample_count++;
if (!(source & pool.source_mask)) {
/* haven't seen this source since last reseed */
pool.source_mask |= source;
pool.source_count++;

/* Do we have enough sample depth and diversity in our per-cpu pool? */
if (pool.sample_count[pool.source_count] > threshold[pool.source_count]) {
/* yes, reseed the main pool */
reseed(input_pool, pool, reseed_entropy);
/* "empty" our sample pool */
pool.sample_count = pool.source_count = pool.source_mask = 0;
}

--
Mathematics is the supreme nostalgia of our time.

2011-06-17 18:52:13

by Jarod Wilson

[permalink] [raw]
Subject: Re: [PATCH 0/5] Feed entropy pool via high-resolution clocksources

Matt Mackall wrote:
> On Wed, 2011-06-15 at 10:49 -0400, Jarod Wilson wrote:
>> Matt Mackall wrote:
>>> On Tue, 2011-06-14 at 18:51 -0400, Jarod Wilson wrote:
>>>> Matt Mackall wrote:
>> ...
>>>>> But that's not even the point. Entropy accounting here is about
>>>>> providing a theoretical level of security above "cryptographically
>>>>> strong". As the source says:
>>>>>
>>>>> "Even if it is possible to analyze SHA in some clever way, as long as
>>>>> the amount of data returned from the generator is less than the inherent
>>>>> entropy in the pool, the output data is totally unpredictable."
>>>>>
>>>>> This is the goal of the code as it exists. And that goal depends on
>>>>> consistent _underestimates_ and accurate accounting.
>>>> Okay, so as you noted, I was only crediting one bit of entropy per byte
>>>> mixed in. Would there be some higher mixed-to-credited ratio that might
>>>> be sufficient to meet the goal?
>>> As I've mentioned elsewhere, I think something around .08 bits per
>>> timestamp is probably a good target. That's the entropy content of a
>>> coin-flip that is biased to flip heads 99 times out of 100. But even
>>> that isn't good enough in the face of a 100Hz clock source.
>>>
>>> And obviously the current system doesn't handle fractional bits at all.
>> What if only one bit every n samples were credited? So 1/n bits per
>> timestamp, effectively, and for an n of 100, that would yield .01 bits
>> per timestamp. Something like this:
>
> Something like that would "work", sure. But it's a hack/abuse -relative
> to the current framework-. I'm reluctant to just pile on the hacks on
> the current system, as that just means getting it coherent is that much
> further away.

Something I should have probably made clearer was that we were looking
for a semi-quick improvement for a particular use case that involves a
certain 2.6.32-based kernel... For the short term, we're looking at some
userspace alternatives, such as a Linux implementation of an entropy
seed approach BSI approves of. Reference doc here, but in German:

https://www.bsi.bund.de/ContentBSI/Publikationen/TechnischeRichtlinien/tr02102/index_htm.html

Longer-term, we're having some discussions related to the revamped
framework you've laid out. I'll table this clocksource-based entropy
contribution code for now.

Also, thank you for putting up with me. ;)

--
Jarod Wilson
[email protected]m

2011-06-17 19:29:54

by Neil Horman

[permalink] [raw]
Subject: Re: [PATCH 0/5] Feed entropy pool via high-resolution clocksources

On Fri, Jun 17, 2011 at 02:51:31PM -0400, Jarod Wilson wrote:
> Matt Mackall wrote:
> >On Wed, 2011-06-15 at 10:49 -0400, Jarod Wilson wrote:
> >>Matt Mackall wrote:
> >>>On Tue, 2011-06-14 at 18:51 -0400, Jarod Wilson wrote:
> >>>>Matt Mackall wrote:
> >>...
> >>>>>But that's not even the point. Entropy accounting here is about
> >>>>>providing a theoretical level of security above "cryptographically
> >>>>>strong". As the source says:
> >>>>>
> >>>>>"Even if it is possible to analyze SHA in some clever way, as long as
> >>>>>the amount of data returned from the generator is less than the inherent
> >>>>>entropy in the pool, the output data is totally unpredictable."
> >>>>>
> >>>>>This is the goal of the code as it exists. And that goal depends on
> >>>>>consistent _underestimates_ and accurate accounting.
> >>>>Okay, so as you noted, I was only crediting one bit of entropy per byte
> >>>>mixed in. Would there be some higher mixed-to-credited ratio that might
> >>>>be sufficient to meet the goal?
> >>>As I've mentioned elsewhere, I think something around .08 bits per
> >>>timestamp is probably a good target. That's the entropy content of a
> >>>coin-flip that is biased to flip heads 99 times out of 100. But even
> >>>that isn't good enough in the face of a 100Hz clock source.
> >>>
> >>>And obviously the current system doesn't handle fractional bits at all.
> >>What if only one bit every n samples were credited? So 1/n bits per
> >>timestamp, effectively, and for an n of 100, that would yield .01 bits
> >>per timestamp. Something like this:
> >
> >Something like that would "work", sure. But it's a hack/abuse -relative
> >to the current framework-. I'm reluctant to just pile on the hacks on
> >the current system, as that just means getting it coherent is that much
> >further away.
>
> Something I should have probably made clearer was that we were
> looking for a semi-quick improvement for a particular use case that
> involves a certain 2.6.32-based kernel... For the short term, we're
> looking at some userspace alternatives, such as a Linux
> implementation of an entropy seed approach BSI approves of.
> Reference doc here, but in German:
>
> https://www.bsi.bund.de/ContentBSI/Publikationen/TechnischeRichtlinien/tr02102/index_htm.html
>
> Longer-term, we're having some discussions related to the revamped
> framework you've laid out. I'll table this clocksource-based entropy
> contribution code for now.
>
> Also, thank you for putting up with me. ;)
>
Not to prolong this conversation, but Matt, since you seem talkative about it,
has any consideration been given to using a periodically reseeded DRNG as an
input to the entropy pool? some of the papers that Jarod referenced in this
thread suggest that using the tsc entropy as a reseed factor to a drng can
provide a good non-predictable source of entropy.

Thoughts?
Neil

> --
> Jarod Wilson
> [email protected]
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>

2011-06-17 19:48:54

by hpas

[permalink] [raw]
Subject: Re: [PATCH 0/5] Feed entropy pool via high-resolution clocksources

On 06/14/2011 04:12 PM, Matt Mackall wrote:
>
> Various people have offered to spend some time fixing this; I haven't
> had time to look at it for a while.
>

So on my (long...) list of things to do for a while is enablement of
RDRAND, which is a new instruction in Ivy Bridge disclosed in the latest
few revisions of the AVX spec and is now in the SDM (the functional
description is in vol 1, section 7.3.18 of the May 2011 edition.)

Fenghua Yu did an initial enabling patch, but we have had to make some
changes.

>From the SDM:

"The random numbers that are returned by the RDRAND instruction are
supplied by a cryptographically secure Random Number Generator that
employs a hardware DRBG (Digital Random Bit Generator, also known as a
Pseudo Random Number Generator) seeded by a hardware NRBG
(Nondeterministic Random Bit Generator, also known as a TRNG or True
Random Number generator).

In order for the hardware design to meet its security goals, the random
number generator continuously tests itself and the random data it is
generating. Runtime failures in the random number generator circuitry or
statistically anomalous data occurring by chance will be detected by the
self test hardware and flag the resulting data as being bad. In such
extremely rare cases, the RDRAND instruction will return no data instead
of bad data."

Additionally, there is a software enabling guide containing a *lot* more
detail at:

http://tinyurl.com/6x6dmd2/

This basically means it is similar in behavior to our /dev/urandom in
that it may gracefully degrade to a PRNG for short stretches of time,
but will not degrade to a PRNG for an extended time, nor will it produce
guessable data, ever; instead it will "fail shut."

The one use case that it is cryptographically insufficient for is to
seed a new PRNG, which probably means it is unsuitable for being fed
as-is into /dev/random.

-hpa

2011-06-17 20:28:12

by Matt Mackall

[permalink] [raw]
Subject: Re: [PATCH 0/5] Feed entropy pool via high-resolution clocksources

On Fri, 2011-06-17 at 12:48 -0700, [email protected] wrote:
> On 06/14/2011 04:12 PM, Matt Mackall wrote:
> >
> > Various people have offered to spend some time fixing this; I haven't
> > had time to look at it for a while.
> >
>
> So on my (long...) list of things to do for a while is enablement of
> RDRAND, which is a new instruction in Ivy Bridge disclosed in the latest
> few revisions of the AVX spec and is now in the SDM (the functional
> description is in vol 1, section 7.3.18 of the May 2011 edition.)
>
> Fenghua Yu did an initial enabling patch, but we have had to make some
> changes.
>
> >From the SDM:
>
> "The random numbers that are returned by the RDRAND instruction are
> supplied by a cryptographically secure Random Number Generator that
> employs a hardware DRBG (Digital Random Bit Generator, also known as a
> Pseudo Random Number Generator) seeded by a hardware NRBG
> (Nondeterministic Random Bit Generator, also known as a TRNG or True
> Random Number generator).
>
> In order for the hardware design to meet its security goals, the random
> number generator continuously tests itself and the random data it is
> generating. Runtime failures in the random number generator circuitry or
> statistically anomalous data occurring by chance will be detected by the
> self test hardware and flag the resulting data as being bad. In such
> extremely rare cases, the RDRAND instruction will return no data instead
> of bad data."
>
> Additionally, there is a software enabling guide containing a *lot* more
> detail at:
>
> http://tinyurl.com/6x6dmd2/
>
> This basically means it is similar in behavior to our /dev/urandom in
> that it may gracefully degrade to a PRNG for short stretches of time,
> but will not degrade to a PRNG for an extended time, nor will it produce
> guessable data, ever; instead it will "fail shut."
>
> The one use case that it is cryptographically insufficient for is to
> seed a new PRNG, which probably means it is unsuitable for being fed
> as-is into /dev/random.

The thing to understand about the input side of /dev/random is that it's
COMPLETELY immune to untrusted data. So there's absolutely no harm in
sending it data of questionable entropy so long as you don't tell it to
account it. And, of course, if it DOES contain entropy, it makes things
better.

Think of it this way: I have a coin in my pocket. You, the attacker,
tell me to flip it. You can do that any number of times and not improve
your guess about the coin's state over your initial guess. This is what
it means to have a reversible mixing function: no number of iterations
reduces the degrees of freedom of the pool.

--
Mathematics is the supreme nostalgia of our time.

2011-06-17 20:46:45

by Matt Mackall

[permalink] [raw]
Subject: Re: [PATCH 0/5] Feed entropy pool via high-resolution clocksources

On Fri, 2011-06-17 at 15:29 -0400, Neil Horman wrote:
> On Fri, Jun 17, 2011 at 02:51:31PM -0400, Jarod Wilson wrote:
> > Matt Mackall wrote:
> > >On Wed, 2011-06-15 at 10:49 -0400, Jarod Wilson wrote:
> > >>Matt Mackall wrote:
> > >>>On Tue, 2011-06-14 at 18:51 -0400, Jarod Wilson wrote:
> > >>>>Matt Mackall wrote:
> > >>...
> > >>>>>But that's not even the point. Entropy accounting here is about
> > >>>>>providing a theoretical level of security above "cryptographically
> > >>>>>strong". As the source says:
> > >>>>>
> > >>>>>"Even if it is possible to analyze SHA in some clever way, as long as
> > >>>>>the amount of data returned from the generator is less than the inherent
> > >>>>>entropy in the pool, the output data is totally unpredictable."
> > >>>>>
> > >>>>>This is the goal of the code as it exists. And that goal depends on
> > >>>>>consistent _underestimates_ and accurate accounting.
> > >>>>Okay, so as you noted, I was only crediting one bit of entropy per byte
> > >>>>mixed in. Would there be some higher mixed-to-credited ratio that might
> > >>>>be sufficient to meet the goal?
> > >>>As I've mentioned elsewhere, I think something around .08 bits per
> > >>>timestamp is probably a good target. That's the entropy content of a
> > >>>coin-flip that is biased to flip heads 99 times out of 100. But even
> > >>>that isn't good enough in the face of a 100Hz clock source.
> > >>>
> > >>>And obviously the current system doesn't handle fractional bits at all.
> > >>What if only one bit every n samples were credited? So 1/n bits per
> > >>timestamp, effectively, and for an n of 100, that would yield .01 bits
> > >>per timestamp. Something like this:
> > >
> > >Something like that would "work", sure. But it's a hack/abuse -relative
> > >to the current framework-. I'm reluctant to just pile on the hacks on
> > >the current system, as that just means getting it coherent is that much
> > >further away.
> >
> > Something I should have probably made clearer was that we were
> > looking for a semi-quick improvement for a particular use case that
> > involves a certain 2.6.32-based kernel... For the short term, we're
> > looking at some userspace alternatives, such as a Linux
> > implementation of an entropy seed approach BSI approves of.
> > Reference doc here, but in German:
> >
> > https://www.bsi.bund.de/ContentBSI/Publikationen/TechnischeRichtlinien/tr02102/index_htm.html
> >
> > Longer-term, we're having some discussions related to the revamped
> > framework you've laid out. I'll table this clocksource-based entropy
> > contribution code for now.
> >
> > Also, thank you for putting up with me. ;)
> >
> Not to prolong this conversation, but Matt, since you seem talkative about it,
> has any consideration been given to using a periodically reseeded DRNG as an
> input to the entropy pool? some of the papers that Jarod referenced in this
> thread suggest that using the tsc entropy as a reseed factor to a drng can
> provide a good non-predictable source of entropy.

Yep. But just about everything we're talking about in this thread has to
come after a redesign of the RNG's input side, which at present is
simply no good for weak sources.

--
Mathematics is the supreme nostalgia of our time.

2011-06-17 20:52:34

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [PATCH 2/5] clocksource: add support for entropy-generation function

On Mon, 13 Jun 2011, Jarod Wilson wrote:
> Add a new function pointer to struct clocksource that can optionally be
> filled in by clocksources deemed to be high enough resolution to feed
> the random number generator entropy pool.

Uurrg.

> + * @entropy: random entropy pool addition function (optional, and
> + * requires a fairly high-resolution clocksource)

Why do you want to do that ? Looking at the implementations of TSC and
HPET it's the same code. We really do not want to add that all over
the place. We can make that a property flag and the entropy function
common to the core code.

> +/**
> + * Do we have at least one clocksource that can generate entropy?
> + */
> +bool clocksource_entropy_available(void)
> +{
> + struct clocksource *src;
> + bool entropy_possible = false;
> +
> + mutex_lock(&clocksource_mutex);
> + list_for_each_entry(src, &clocksource_list, list) {
> + if (src->entropy) {
> + entropy_possible = true;
> + break;
> + }
> + }
> + mutex_unlock(&clocksource_mutex);
> +
> + return entropy_possible;
> +}
> +EXPORT_SYMBOL(clocksource_entropy_available);

That should be evaluated when clocksources are registered not at some
random point in time, which might return total nonsense as it's not
guaranteed that the clocksource which has the entropy property is
going to end up as the current clocksource.

> +/**
> + * Call the clocksource's entropy-generation function, if set
> + */
> +void clocksource_add_entropy(void)
> +{
> + if (!curr_clocksource->entropy)
> + return;

Why restricting it to the current clocksource? We can use the best
registered one for this which has the entropy property set.

> + curr_clocksource->entropy(curr_clocksource);
> +}
> +EXPORT_SYMBOL(clocksource_add_entropy);
> +
> #ifdef CONFIG_SYSFS
> /**
> * sysfs_show_current_clocksources - sysfs interface for current clocksource
> --
> 1.7.1
>
>

2011-06-17 20:58:49

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [PATCH 4/5] tsc: wire up entropy generation function



On Mon, 13 Jun 2011, Venkatesh Pallipadi wrote:

> On Mon, Jun 13, 2011 at 3:06 PM, Jarod Wilson <[email protected]> wrote:
> > TSC is high enough resolution that we can use its low-order byte to
> > stir new data into the random number generator entropy pool.
>
> >From what I vaguely remember from years past, rdtsc, especially last
> few bits of it are not very good as random number source. As they are
> based on lower bus frequency and a multiplier. May be things have
> changed these days. Adding Peter and Suresh for comments.

Recommended reading:

https://www.osadl.org/fileadmin/dam/rtlws/12/Okech.pdf

Thanks,

tglx

2011-06-17 21:01:57

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [PATCH 5/5] misc: add clocksource-based entropy generation driver

On Mon, 13 Jun 2011, Jarod Wilson wrote:
> This is a fairly simple driver that just starts up a kernel thread that
> periodically calls the active clocksource's entropy-gathering function,
> if it has one. The default interval of 100us between polls doesn't show
> any measurable impact to cpu usage on a core 2 duo test rig here, and

Of course it does not show up.

> + schedule_timeout(timeout);

That will line you up with the tick nicely. So in fact you run that
code with HZ frequency.

Thanks,

tglx

2011-06-17 21:20:16

by Jarod Wilson

[permalink] [raw]
Subject: Re: [PATCH 2/5] clocksource: add support for entropy-generation function

Thomas Gleixner wrote:
> On Mon, 13 Jun 2011, Jarod Wilson wrote:
>> Add a new function pointer to struct clocksource that can optionally be
>> filled in by clocksources deemed to be high enough resolution to feed
>> the random number generator entropy pool.
>
> Uurrg.
>
>> + * @entropy: random entropy pool addition function (optional, and
>> + * requires a fairly high-resolution clocksource)
>
> Why do you want to do that ? Looking at the implementations of TSC and
> HPET it's the same code. We really do not want to add that all over
> the place. We can make that a property flag and the entropy function
> common to the core code.
>
>> +/**
>> + * Do we have at least one clocksource that can generate entropy?
>> + */
>> +bool clocksource_entropy_available(void)
>> +{
>> + struct clocksource *src;
>> + bool entropy_possible = false;
>> +
>> + mutex_lock(&clocksource_mutex);
>> + list_for_each_entry(src,&clocksource_list, list) {
>> + if (src->entropy) {
>> + entropy_possible = true;
>> + break;
>> + }
>> + }
>> + mutex_unlock(&clocksource_mutex);
>> +
>> + return entropy_possible;
>> +}
>> +EXPORT_SYMBOL(clocksource_entropy_available);
>
> That should be evaluated when clocksources are registered not at some
> random point in time, which might return total nonsense as it's not
> guaranteed that the clocksource which has the entropy property is
> going to end up as the current clocksource.
>
>> +/**
>> + * Call the clocksource's entropy-generation function, if set
>> + */
>> +void clocksource_add_entropy(void)
>> +{
>> + if (!curr_clocksource->entropy)
>> + return;
>
> Why restricting it to the current clocksource? We can use the best
> registered one for this which has the entropy property set.

Yeah, John had some similar suggestions, and locally, I've got a
modified version that instead of adding entropy functions for each
clocksource, adds an entropy rating, then the highest rated entropy
clocksource gets used, but a common clocksource entropy function that
calls the chosen clocksource's read function. The entropy clocksource
function gets picked by way of another function similar to
clocksource_select, called in all the same places.

However, since none of this is going to be viable until the random code
is significantly restructured, I haven't bothered with posting any of
the updates I've made. I can toss the delta out there if anyone really
wants to take a look at it, but otherwise, I'll just sit on it until
said restructuring happens.

--
Jarod Wilson
[email protected]

2011-06-18 22:44:48

by H. Peter Anvin

[permalink] [raw]
Subject: Re: [PATCH 0/5] Feed entropy pool via high-resolution clocksources

On 06/17/2011 01:28 PM, Matt Mackall wrote:
>>
>> The one use case that it is cryptographically insufficient for is to
>> seed a new PRNG, which probably means it is unsuitable for being fed
>> as-is into /dev/random.
>
> The thing to understand about the input side of /dev/random is that it's
> COMPLETELY immune to untrusted data. So there's absolutely no harm in
> sending it data of questionable entropy so long as you don't tell it to
> account it. And, of course, if it DOES contain entropy, it makes things
> better.
>
> Think of it this way: I have a coin in my pocket. You, the attacker,
> tell me to flip it. You can do that any number of times and not improve
> your guess about the coin's state over your initial guess. This is what
> it means to have a reversible mixing function: no number of iterations
> reduces the degrees of freedom of the pool.
>

What I meant is that it is unsuitable to *bypass the pool* for
/dev/random. I think we can -- and almost certainly should -- use
RDRAND on the input side; we just have to figure out the accounting.

However, RDRAND is high enough quality (and high enough bandwidth) that
it should be not just possible but desirable to completely bypass the
pool system for /dev/urandom users and pull straight from the RDRAND
instruction. I don't actually know what the exact numbers look like,
but the stall conditions being looked at are of the order of "every core
in the socket trying to execute RDRAND at the same time."

-hpa

--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel. I don't speak on their behalf.

2011-06-19 13:39:05

by Neil Horman

[permalink] [raw]
Subject: Re: [PATCH 0/5] Feed entropy pool via high-resolution clocksources

On Sat, Jun 18, 2011 at 03:40:50PM -0700, H. Peter Anvin wrote:
> On 06/17/2011 01:28 PM, Matt Mackall wrote:
> >>
> >> The one use case that it is cryptographically insufficient for is to
> >> seed a new PRNG, which probably means it is unsuitable for being fed
> >> as-is into /dev/random.
> >
> > The thing to understand about the input side of /dev/random is that it's
> > COMPLETELY immune to untrusted data. So there's absolutely no harm in
> > sending it data of questionable entropy so long as you don't tell it to
> > account it. And, of course, if it DOES contain entropy, it makes things
> > better.
> >
> > Think of it this way: I have a coin in my pocket. You, the attacker,
> > tell me to flip it. You can do that any number of times and not improve
> > your guess about the coin's state over your initial guess. This is what
> > it means to have a reversible mixing function: no number of iterations
> > reduces the degrees of freedom of the pool.
> >
>
> What I meant is that it is unsuitable to *bypass the pool* for
> /dev/random. I think we can -- and almost certainly should -- use
> RDRAND on the input side; we just have to figure out the accounting.
>
>From your description of the instruction, this sounds almost identical to me as
what I suggested in using an instance of the cprng module in the kernel as the
drng coupled with using the tsc entropy as the trng input to reseed it. We
could use rnrand when possible on the input to the entropy pool, and the
software cprng when that wasn't available.

> However, RDRAND is high enough quality (and high enough bandwidth) that
> it should be not just possible but desirable to completely bypass the
> pool system for /dev/urandom users and pull straight from the RDRAND
> instruction. I don't actually know what the exact numbers look like,
> but the stall conditions being looked at are of the order of "every core
> in the socket trying to execute RDRAND at the same time."
>
It sounds to me like, if its desireous to bypass the entropy pool, then we
should bypass the /dev/random path altogether. Why not write a hwrng driver
that can export access to the rdrand instruction via a misc device.

Neil

> -hpa
>
> --
> H. Peter Anvin, Intel Open Source Technology Center
> I work for Intel. I don't speak on their behalf.
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>

2011-06-19 15:08:27

by Herbert Xu

[permalink] [raw]
Subject: Re: [PATCH 0/5] Feed entropy pool via high-resolution clocksources

On Sun, Jun 19, 2011 at 09:38:43AM -0400, Neil Horman wrote:
>
> It sounds to me like, if its desireous to bypass the entropy pool, then we
> should bypass the /dev/random path altogether. Why not write a hwrng driver
> that can export access to the rdrand instruction via a misc device.

I presume the rdrand instruction can be used from user-space
directly.

Cheers,
--
Email: Herbert Xu <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

2011-06-20 00:01:41

by H. Peter Anvin

[permalink] [raw]
Subject: Re: [PATCH 0/5] Feed entropy pool via high-resolution clocksources

On 06/19/2011 08:07 AM, Herbert Xu wrote:
> On Sun, Jun 19, 2011 at 09:38:43AM -0400, Neil Horman wrote:
>>
>> It sounds to me like, if its desireous to bypass the entropy pool, then we
>> should bypass the /dev/random path altogether. Why not write a hwrng driver
>> that can export access to the rdrand instruction via a misc device.
>
> I presume the rdrand instruction can be used from user-space
> directly.
>

Yes, it can.

Again, RDRAND is not suitable for /dev/random (as opposed to
/dev/urandom users.) /dev/urandom is used both by user space (and here
the only reason to hook it up to /dev/urandom is compatibility with
existing userspace; we are working separately to enabling user space
users like OpenSSL to use RDRAND directly) and by kernel users via the
internal APIs.

/dev/random as far as I can tell is only ever fed to userspace, however,
the guarantees that it is at least supposed to give are very, very
strict. RDRAND do not fulfill those criteria, but we should be able to
use it as part of its implementation.

-hpa

--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel. I don't speak on their behalf.