This is very much an RFC patch, or maybe even an RFG -- request for
grumbles. This topic has come up a million times, and usually doesn't go
anywhere. This time I thought I'd bring it up with a slightly narrower
focus. Before you read further, realize that I do not intend to merge
this without there being an appropriate amount of consensus for it and
discussion about it.
Ever since Linus' 50ee7529ec45 ("random: try to actively add entropy
rather than passively wait for it"), the RNG does a haveged-style jitter
dance around the scheduler, in order to produce entropy (and credit it)
for the case when we're stuck in wait_for_random_bytes(). However you
feel about the Linus Jitter Dance is beside the point: it's been there
for three years and usually gets the RNG initialized in a second or so.
As a matter of fact, this is what happens currently when people use
getrandom(2).
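For reference, a minimal userspace sketch of that behaviour. The helper name fill_random() is made up for illustration; getrandom(2) itself and its flags are the real interface. With flags=0 the call blocks until the RNG is initialized; with GRND_NONBLOCK it fails with EAGAIN instead.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/random.h>
#include <sys/types.h>

/*
 * Hypothetical helper: fill buf with len random bytes via getrandom(2).
 * flags=0 blocks until the RNG is initialized; GRND_NONBLOCK makes the
 * call fail with EAGAIN instead of blocking.
 * Returns 0 on success or a negative errno value on failure.
 */
int fill_random(void *buf, size_t len, unsigned int flags)
{
	unsigned char *p = buf;

	while (len > 0) {
		ssize_t n = getrandom(p, len, flags);

		if (n < 0) {
			if (errno == EINTR)
				continue;	/* interrupted by a signal, retry */
			return -errno;
		}
		p += n;
		len -= (size_t)n;
	}
	return 0;
}
```

A caller that must not stall early in boot would pass GRND_NONBLOCK and handle -EAGAIN itself; everyone else can pass 0 and simply wait.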
So, given that the kernel has grown this mechanism for seeding itself
from nothing, and that this procedure happens pretty fast, maybe there's
no point any longer in having /dev/urandom give insecure bytes. In the
past we didn't want the boot process to deadlock, which was
understandable. But now, in the worst case, a second goes by, and the
problem is resolved. It seems like maybe we're finally at a point when
we can get rid of the infamous "urandom read hole".
Maybe. And this is why this is a request for grumbles patch: the Linus
Jitter Dance relies on random_get_entropy() returning a cycle counter
value. The first lines of try_to_generate_entropy() are:
	stack.now = random_get_entropy();
	/* Slow counter - or none. Don't even bother */
	if (stack.now == random_get_entropy())
		return;
So it would appear that what seemed initially like a panacea does not in
fact work everywhere. Where doesn't it work?
On every platform, random_get_entropy() is connected to get_cycles(),
except for three: m68k, MIPS, and RISC-V.
On m68k, it looks like this:
	if (mach_random_get_entropy)
		return mach_random_get_entropy();
	return 0;
And mach_random_get_entropy seems to be set in amiga/config.c only.
On MIPS, it looks like this:
	if (can_use_mips_counter(prid))
		return read_c0_count();
	else if (likely(imp != PRID_IMP_R6000 && imp != PRID_IMP_R6000A))
		return read_c0_random();
	else
		return 0;
So it seems like we're okay except for R6000 and R6000A.
Finally on RISC-V, it looks like this:
	if (unlikely(clint_time_val == NULL))
		return 0;
	return get_cycles();
Where clint_time_val is eventually filled in later in boot with
clint_timer_init_dt(). So I assume that's a case where it _eventually_
works, which is probably good enough for our purposes.
I think what this adds up to is that this change would positively affect
everybody, except for _possibly_ negatively affecting poorly configured
non-Amiga m68k systems and the MIPS R6000 and R6000A. Does that analysis
seem correct to folks reading, or did I miss something?
Are there other cases where the cycle counter does exist but is simply
too slow? Perhaps some computer historians can chime in here.
If my general analysis is correct, are these ancient platforms really
worth holding this back? I halfway expect to receive a few thrown
tomatoes, an angry fist, and a "get off my lawn!", and if that's _all_ I
hear, I'll take a hint and we can forget I ever proposed this. As
mentioned, I do not intend to merge this unless there's broad consensus
about it. But on the off chance that people feel differently, perhaps
the Linus Jitter Dance is finally the solution to years of /dev/urandom
kvetching.
Cc: Paul Walmsley <[email protected]>
Cc: Palmer Dabbelt <[email protected]>
Cc: Albert Ou <[email protected]>
Cc: [email protected]
Cc: Geert Uytterhoeven <[email protected]>
Cc: [email protected]
Cc: Thomas Bogendoerfer <[email protected]>
Cc: [email protected]
Cc: Dominik Brodowski <[email protected]>
Cc: Eric Biggers <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Lennart Poettering <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: Theodore Ts'o <[email protected]>
Signed-off-by: Jason A. Donenfeld <[email protected]>
---
drivers/char/mem.c | 2 +-
drivers/char/random.c | 68 +++++++++----------------------------
include/uapi/linux/random.h | 2 +-
3 files changed, 18 insertions(+), 54 deletions(-)
diff --git a/drivers/char/mem.c b/drivers/char/mem.c
index cc296f0823bd..9f586025dbe6 100644
--- a/drivers/char/mem.c
+++ b/drivers/char/mem.c
@@ -707,7 +707,7 @@ static const struct memdev {
[5] = { "zero", 0666, &zero_fops, FMODE_NOWAIT },
[7] = { "full", 0666, &full_fops, 0 },
[8] = { "random", 0666, &random_fops, 0 },
- [9] = { "urandom", 0666, &urandom_fops, 0 },
+ [9] = { "urandom", 0666, &random_fops, 0 },
#ifdef CONFIG_PRINTK
[11] = { "kmsg", 0644, &kmsg_fops, 0 },
#endif
diff --git a/drivers/char/random.c b/drivers/char/random.c
index c564f795f68c..868334ea0ce3 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -88,8 +88,6 @@ static LIST_HEAD(random_ready_list);
/* Control how we warn userspace. */
static struct ratelimit_state unseeded_warning =
RATELIMIT_STATE_INIT("warn_unseeded_randomness", HZ, 3);
-static struct ratelimit_state urandom_warning =
- RATELIMIT_STATE_INIT("warn_urandom_randomness", HZ, 3);
static int ratelimit_disable __read_mostly;
module_param_named(ratelimit_disable, ratelimit_disable, int, 0644);
MODULE_PARM_DESC(ratelimit_disable, "Disable random ratelimit suppression");
@@ -321,11 +319,6 @@ static void crng_reseed(void)
unseeded_warning.missed);
unseeded_warning.missed = 0;
}
- if (urandom_warning.missed) {
- pr_notice("%d urandom warning(s) missed due to ratelimiting\n",
- urandom_warning.missed);
- urandom_warning.missed = 0;
- }
}
}
@@ -978,10 +971,8 @@ int __init rand_initialize(void)
pr_notice("crng init done (trusting CPU's manufacturer)\n");
}
- if (ratelimit_disable) {
- urandom_warning.interval = 0;
+ if (ratelimit_disable)
unseeded_warning.interval = 0;
- }
return 0;
}
@@ -1363,20 +1354,17 @@ static void try_to_generate_entropy(void)
* getrandom(2) is the primary modern interface into the RNG and should
* be used in preference to anything else.
*
- * Reading from /dev/random has the same functionality as calling
- * getrandom(2) with flags=0. In earlier versions, however, it had
- * vastly different semantics and should therefore be avoided, to
- * prevent backwards compatibility issues.
- *
- * Reading from /dev/urandom has the same functionality as calling
- * getrandom(2) with flags=GRND_INSECURE. Because it does not block
- * waiting for the RNG to be ready, it should not be used.
+ * Reading from /dev/random and /dev/urandom both have the same effect as
+ * calling getrandom(2) with flags=0. In earlier versions, however,
+ * they each had vastly different semantics and should therefore be
+ * avoided to prevent backwards compatibility issues.
*
* Writing to either /dev/random or /dev/urandom adds entropy to
* the input pool but does not credit it.
*
- * Polling on /dev/random indicates when the RNG is initialized, on
- * the read side, and when it wants new entropy, on the write side.
+ * Polling on /dev/random or /dev/urandom indicates when the RNG
+ * is initialized, on the read side, and when it wants new entropy,
+ * on the write side.
*
* Both /dev/random and /dev/urandom have the same set of ioctls for
* adding entropy, getting the entropy count, zeroing the count, and
@@ -1387,6 +1375,8 @@ static void try_to_generate_entropy(void)
SYSCALL_DEFINE3(getrandom, char __user *, buf, size_t, count, unsigned int,
flags)
{
+ int ret;
+
if (flags & ~(GRND_NONBLOCK | GRND_RANDOM | GRND_INSECURE))
return -EINVAL;
@@ -1400,15 +1390,13 @@ SYSCALL_DEFINE3(getrandom, char __user *, buf, size_t, count, unsigned int,
if (count > INT_MAX)
count = INT_MAX;
- if (!(flags & GRND_INSECURE) && !crng_ready()) {
- int ret;
+ if ((flags & GRND_NONBLOCK) && !crng_ready())
+ return -EAGAIN;
+
+ ret = wait_for_random_bytes();
+ if (ret != 0)
+ return ret;
- if (flags & GRND_NONBLOCK)
- return -EAGAIN;
- ret = wait_for_random_bytes();
- if (unlikely(ret))
- return ret;
- }
return get_random_bytes_user(buf, count);
}
@@ -1461,21 +1449,6 @@ static ssize_t random_write(struct file *file, const char __user *buffer,
return (ssize_t)count;
}
-static ssize_t urandom_read(struct file *file, char __user *buf, size_t nbytes,
- loff_t *ppos)
-{
- static int maxwarn = 10;
-
- if (!crng_ready() && maxwarn > 0) {
- maxwarn--;
- if (__ratelimit(&urandom_warning))
- pr_notice("%s: uninitialized urandom read (%zd bytes read)\n",
- current->comm, nbytes);
- }
-
- return get_random_bytes_user(buf, nbytes);
-}
-
static ssize_t random_read(struct file *file, char __user *buf, size_t nbytes,
loff_t *ppos)
{
@@ -1562,15 +1535,6 @@ const struct file_operations random_fops = {
.llseek = noop_llseek,
};
-const struct file_operations urandom_fops = {
- .read = urandom_read,
- .write = random_write,
- .unlocked_ioctl = random_ioctl,
- .compat_ioctl = compat_ptr_ioctl,
- .fasync = random_fasync,
- .llseek = noop_llseek,
-};
-
/********************************************************************
*
diff --git a/include/uapi/linux/random.h b/include/uapi/linux/random.h
index dcc1b3e6106f..9ec1703f45ad 100644
--- a/include/uapi/linux/random.h
+++ b/include/uapi/linux/random.h
@@ -49,7 +49,7 @@ struct rand_pool_info {
*
* GRND_NONBLOCK Don't block and return EAGAIN instead
* GRND_RANDOM No effect
- * GRND_INSECURE Return non-cryptographic random bytes
+ * GRND_INSECURE No effect
*/
#define GRND_NONBLOCK 0x0001
#define GRND_RANDOM 0x0002
--
2.35.0
On 2/11/2022 16:07, Jason A. Donenfeld wrote:
> This is very much an RFC patch, or maybe even an RFG -- request for
> grumbles. This topic has come up a million times, and usually doesn't go
> anywhere. This time I thought I'd bring it up with a slightly narrower
> focus. Before you read further, realize that I do not intend to merge
> this without there being an appropriate amount of consensus for it and
> discussion about it.
>
> Ever since Linus' 50ee7529ec45 ("random: try to actively add entropy
> rather than passively wait for it"), the RNG does a haveged-style jitter
> dance around the scheduler, in order to produce entropy (and credit it)
> for the case when we're stuck in wait_for_random_bytes(). How ever you
> feel about the Linus Jitter Dance is beside the point: it's been there
> for three years and usually gets the RNG initialized in a second or so.
>
> As a matter of fact, this is what happens currently when people use
> getrandom(2).
>
> So, given that the kernel has grown this mechanism for seeding itself
> from nothing, and that this procedure happens pretty fast, maybe there's
> no point any longer in having /dev/urandom give insecure bytes. In the
> past we didn't want the boot process to deadlock, which was
> understandable. But now, in the worst case, a second goes by, and the
> problem is resolved. It seems like maybe we're finally at a point when
> we can get rid of the infamous "urandom read hole".
>
> Maybe. And this is why this is a request for grumbles patch: the Linus
> Jitter Dance relies on random_get_entropy() returning a cycle counter
> value. The first lines of try_to_generate_entropy() are:
>
> stack.now = random_get_entropy();
> /* Slow counter - or none. Don't even bother */
> if (stack.now == random_get_entropy())
> return;
>
> So it would appear that what seemed initially like a panacea does not in
> fact work everywhere. Where doesn't it work?
>
> On every platform, random_get_entropy() is connected to get_cycles(),
> except for three: m68k, MIPS, and RISC-V.
>
[snip]
> On MIPS, it looks like this:
>
> if (can_use_mips_counter(prid))
> return read_c0_count();
> else if (likely(imp != PRID_IMP_R6000 && imp != PRID_IMP_R6000A))
> return read_c0_random();
> else
> return 0;
>
> So it seems like we're okay except for R6000 and R6000A.
The R6000/R6000A CPU only ever existed in systems in the late 1980's that
were fairly large, and I don't think there is a complete, working unit out
there that can actually boot up, let alone boot a Linux kernel. This check
was probably added as a mental exercise following a processor manual or such.
The old linux-mips wiki even says this:
https://www.linux-mips.org/wiki/R6000
"""
The R6000 is an ECL implementation of the MIPS architecture which was
produced by Bipolar Integrated Technology. The R6000 microprocessor did
introduce the MIPS II instruction set. Its TLB and cache architecture are
different from all other members of the MIPS family. The R6000 did not
deliver the promised performance benefits, and although it saw some use in
Control Data machines, it quickly disappeared from the mainstream market.
"""
A quick grep of a recent kernel tree shows this one conditional as the
only user, plus the two defines:
# grep -r "PRID_IMP_R6000" *
arch/mips/include/asm/cpu.h:70:#define PRID_IMP_R6000 0x0300 /* Same as R3000A */
arch/mips/include/asm/cpu.h:72:#define PRID_IMP_R6000A 0x0600
arch/mips/include/asm/timex.h:94: else if (likely(imp != PRID_IMP_R6000 && imp != PRID_IMP_R6000A))
I'd say it's better to remove the check and simplify the conditional to
eliminate this corner case. Maybe keep the #defines around for
documentation, but even that may not be necessary for CPUs that likely don't
exist anymore.
>
> I think what this adds up to is that this change would positively affect
> everybody, except for _possibly_ negatively affecting poorly configured
> non-Amiga m68k systems and the MIPS R6000 and R6000A. Does that analysis
> seem correct to folks reading, or did I miss something?
>
> Are there other cases where the cycle counter does exist but is simply
> too slow? Perhaps some computer historians can chime in here.
>
> If my general analysis is correct, are these ancient platforms really
> worth holding this back? I halfway expect to receive a few thrown
> tomatoes, an angry fist, and a "get off my lawn!", and if that's _all_ I
> hear, I'll take a hint and we can forget I ever proposed this. As
> mentioned, I do not intend to merge this unless there's broad consensus
> about it. But on the off chance that people feel differently, perhaps
> the Linus Jitter Dance is finally the solution to years of /dev/urandom
> kvetching.
--
Joshua Kinard
Gentoo/MIPS
[email protected]
rsa6144/5C63F4E3F5C6C943 2015-04-27
177C 1972 1FB8 F254 BAD0 3E72 5C63 F4E3 F5C6 C943
"The past tempts us, the present confuses us, the future frightens us. And
our lives slip away, moment by moment, lost in that vast, terrible in-between."
--Emperor Turhan, Centauri Republic
On Fri, Feb 11, 2022, at 1:07 PM, Jason A. Donenfeld wrote:
> This is very much an RFC patch, or maybe even an RFG -- request for
> grumbles. This topic has come up a million times, and usually doesn't go
> anywhere. This time I thought I'd bring it up with a slightly narrower
> focus. Before you read further, realize that I do not intend to merge
> this without there being an appropriate amount of consensus for it and
> discussion about it.
>
> Ever since Linus' 50ee7529ec45 ("random: try to actively add entropy
> rather than passively wait for it"), the RNG does a haveged-style jitter
> dance around the scheduler, in order to produce entropy (and credit it)
> for the case when we're stuck in wait_for_random_bytes(). How ever you
> feel about the Linus Jitter Dance is beside the point: it's been there
> for three years and usually gets the RNG initialized in a second or so.
I dislike this patch for a reason that has nothing to do with security. Somewhere there’s a Linux machine that boots straight to Nethack in a glorious 50ms. If Nethack gets 256 bits of amazing entropy from /dev/urandom, then the machine’s owner has to play for real. If it repeats the same game on occasion, the owner can be disappointed or amused. If it gets a weak seed that can be brute forced, then the owner can have fun brute forcing it.
If, on the other hand, it waits 750ms for enough jitter entropy to be perfect, it’s a complete fail. No one wants to wait 750ms to play Nethack.
Replace Nethack with something with a backup camera or a lightbulb, both of which have regulations related to startup time, and there may be a real problem. Keep in mind that some language runtimes randomize their hash table seeds at startup, possibly using /dev/urandom. This patch may break actual, correct, working code.
On Fr, 11.02.22 22:07, Jason A. Donenfeld ([email protected]) wrote:
> So, given that the kernel has grown this mechanism for seeding itself
> from nothing, and that this procedure happens pretty fast, maybe there's
> no point any longer in having /dev/urandom give insecure bytes. In the
> past we didn't want the boot process to deadlock, which was
> understandable. But now, in the worst case, a second goes by, and the
> problem is resolved. It seems like maybe we're finally at a point when
> we can get rid of the infamous "urandom read hole".
So, systemd uses (potentially half-initialized) /dev/urandom for
seeding its hash tables. For that it's kinda OK if the random values
have low entropy initially, as we'll automatically reseed when too
many hash collisions happen, and then use a newer (and thus hopefully
better) seed, again acquired through /dev/urandom. i.e. if the seeds
are initially not good enough to thwart hash collision attacks, once
the hash tables are actually attacked we'll replace the seeds with
something better. For that all we need is that the random pool
eventually gets better, that's all.
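Roughly, the pattern is the following. This is an illustrative sketch only, not systemd's actual code: the struct, the function names, the threshold of 8 and the toy FNV-style keyed hash are all made up for the example (a real table would use a proper keyed hash such as SipHash).

```c
#include <fcntl.h>
#include <stdint.h>
#include <unistd.h>

#define REHASH_THRESHOLD 8	/* hypothetical chain-length limit */

struct seeded_hash {
	uint64_t seed;
	unsigned int reseeds;
};

/* Fetch a fresh seed from /dev/urandom. Returns 0 on success, -1 on error. */
int seeded_hash_reseed(struct seeded_hash *h)
{
	uint64_t seed;
	ssize_t n;
	int fd = open("/dev/urandom", O_RDONLY);

	if (fd < 0)
		return -1;
	n = read(fd, &seed, sizeof(seed));
	close(fd);
	if (n != (ssize_t)sizeof(seed))
		return -1;
	h->seed = seed;
	h->reseeds++;
	return 0;
}

/* Toy keyed hash: FNV-1a with the seed folded into the initial state. */
uint64_t seeded_hash_key(const struct seeded_hash *h, const char *key)
{
	uint64_t x = 0xcbf29ce484222325ULL ^ h->seed;

	while (*key) {
		x ^= (unsigned char)*key++;
		x *= 0x100000001b3ULL;
	}
	return x;
}

/*
 * Called with the chain length seen on an insert. If it exceeds the
 * threshold, assume the seed was weak or is under attack and fetch a
 * new one. Returns 1 if reseeded (the caller must then rehash every
 * entry), 0 if nothing changed, -1 on error.
 */
int seeded_hash_note_chain(struct seeded_hash *h, unsigned int chain_len)
{
	if (chain_len <= REHASH_THRESHOLD)
		return 0;
	return seeded_hash_reseed(h) == 0 ? 1 : -1;
}
```

The point is that the first seed only has to be available quickly; its quality matters only once someone actually manages to pile up collisions, and by then the pool has had time to improve.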
So for that usecase /dev/urandom behaving the way it so far does is
kinda nice. We need lots of hash tables, from earliest initialization
on, hence the ability to get some seed there reasonably fast is really
good, even if its entropy is initially not as high as we'd want. It's
a good middle ground for us to be able to boot up quickly and not
having to block until the entropy pool is fully initialized, but still
thwart hash table collision attacks.
If you make /dev/urandom block for initialization then this would mean
systemd and its components would start waiting for initialization
(simply because we need hash tables all over the place), i.e. you'd
effectively add a second to the boot process of each affected system.
What about AT_RANDOM and /proc/sys/kernel/random/uuid btw, do you
intend to block for that too? If you block for the former it doesn't
really matter what systemd does I guess, given that you already have
to delay invoking PID 1 until you get a good AT_RANDOM.
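For context, this is how userspace typically picks up AT_RANDOM. A small sketch: the helper name copy_at_random() is hypothetical, while getauxval(3) and AT_RANDOM (the address of 16 random bytes the kernel places in the auxiliary vector at exec time) are the real interface.

```c
#include <string.h>
#include <sys/auxv.h>

#define AT_RANDOM_LEN 16	/* the kernel supplies 16 bytes */

/*
 * Hypothetical helper: copy the 16 bytes the kernel handed this
 * process at exec time. They are generated once per exec, so their
 * quality depends on the state of the RNG at that moment.
 * Returns 0 on success, -1 if AT_RANDOM is not present.
 */
int copy_at_random(unsigned char out[AT_RANDOM_LEN])
{
	const void *p = (const void *)getauxval(AT_RANDOM);

	if (!p)
		return -1;
	memcpy(out, p, AT_RANDOM_LEN);
	return 0;
}
```

No read or syscall into the RNG happens here at all, so any blocking for AT_RANDOM would have to take place inside execve() itself.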
Lennart
--
Lennart Poettering, Berlin
Hi Joshua,
Thanks a lot for the historical background.
On Sun, Feb 13, 2022 at 12:06 AM Joshua Kinard <[email protected]> wrote:
> The R6000/R6000A CPU only ever existed in systems in the late 1980's that
> were fairly large, and I don't think there is a complete, working unit out
> there that can actually boot up, let alone boot a Linux kernel.
So from what you've written, it sounds like MIPS is actually not a problem here.
So the only systems we're actually talking about without a good cycle
counter are non-Amiga m68k? If so, that'd be a pretty terrific
finding. It'd mean that this idea can move forward, and we only need
to worry about some m68k museum pieces with misconfigured
userspaces...
Jason
From: Geert Uytterhoeven
> Sent: 14 February 2022 14:26
...
> I'm afraid you missed one important detail. You wrote:
>
> > On every platform, random_get_entropy() is connected to get_cycles(),
> > except for three: m68k, MIPS, and RISC-V.
>
> The default implementation in include/asm-generic/timex.h is:
>
> static inline cycles_t get_cycles(void)
> {
> return 0;
> }
>
> Several architectures do not implement get_cycles(), or implement it
> with a variant that's very similar or identical to the generic version.
Add nios2 and old x86 to the list (I think rdtsc is a Pentium instruction).
I can't see it in my 386 book, and I don't think the 486 added it.
I'm not sure if/when sparc added one.
I don't remember it being there in the late 1980s.
nios2 (soft CPU on Altera/Intel FPGA) is annoying.
There is a 'read control register' instruction and plenty of spare ones.
But you can't define your own, and none of them is a clock counter.
You can add one as the result of a custom instruction.
(Even the same custom instruction that does byteswap.)
David