Subject: Re: [PATCH RFC] random: getrandom(2): don't block on non-initialized entropy pool

(resending without HTML this time, sorry for the duplicate)
14.09.2019 17:25, Ahmed S. Darwish wrote:
> getrandom() has been created as a new and more secure interface for
> pseudorandom data requests. Unlike /dev/urandom, it unconditionally
> blocks until the entropy pool has been properly initialized.
>
> While getrandom() has no guaranteed upper bound for its waiting time,
> user-space has been abusing it by issuing the syscall, from shared
> libraries no less, during the main system boot sequence.
>
> Thus, on certain setups where there is no hwrng (embedded), or the
> hwrng is not trusted by some users (intel RDRAND), or sometimes it's
> just broken (amd RDRAND), the system boot can be *reliably* blocked.
>
> The issue is further exaggerated by recent file-system optimizations,
> e.g. b03755ad6f33 (ext4: make __ext4_get_inode_loc plug), which
> merges directory lookup code inode table IO, and thus minimizes the
> number of disk interrupts and entropy during boot. After that commit,
> a blocked boot can be reliably reproduced on a Thinkpad E480 laptop
> with standard ArchLinux user-space.
>
> Thus, don't trust user-space on calling getrandom() from the right
> context. Just never block, and return -EINVAL if entropy is not yet
> available.
>
> Link: https://lkml.kernel.org/r/CAHk-=wjyH910+JRBdZf_Y9G54c1M=LBF8NKXB6vJcm9XjLnRfg@mail.gmail.com
> Link: https://lkml.kernel.org/r/20190912034421.GA2085@darwi-home-pc
> Link: https://lkml.kernel.org/r/[email protected]
> Link: https://lkml.kernel.org/r/[email protected]

Let me reword the commit message for a hopefully better historical
perspective.

===
getrandom() has been created as a new and more secure interface for
pseudorandom data requests. It attempted to solve two problems, as
compared to /dev/{u,}random: the need to open a file descriptor (which
can fail) and the possibility of getting not-so-random data from an
incompletely initialized entropy pool. It has succeeded in the first
improvement, but failed horribly in the second one: it blocks until the
entropy pool has been properly initialized if called without
GRND_NONBLOCK, and fails with -EAGAIN if called with it, while neither
of these behaviors is suitable for the early boot stage.

The issue is further exacerbated by recent file-system optimizations,
e.g. b03755ad6f33 (ext4: make __ext4_get_inode_loc plug), which merges
the inode table IO issued by directory lookup code, and thus reduces the
number of disk interrupts, and the entropy they provide, during boot.
After that commit, a blocked boot can be reliably reproduced on a
Thinkpad E480 laptop with standard ArchLinux user-space.

Thus, on certain setups where there is no hwrng (embedded systems or
non-KVM virtual machines), or the hwrng is not trusted by some users
(intel RDRAND), or sometimes it's just broken (amd RDRAND), the system
boot can be *reliably* blocked. It can therefore be argued that there is
no way to use getrandom() on Linux correctly, especially from shared
libraries: GRND_NONBLOCK has to be used, and a fallback to some other
interface like /dev/urandom is required, thus making the net result no
better than just using /dev/urandom unconditionally.

While getrandom() has no guaranteed upper bound for its waiting time,
user-space has been using it incorrectly by issuing the syscall, from
shared libraries no less, during the main system boot sequence, without
GRND_NONBLOCK.

We can't trust user-space to call getrandom() from the right context.
Therefore, just never block, and return -EINVAL (with some entropy still
in the buffer) if the requested amount of entropy is not yet available.

Link: https://github.com/openbsd/src/commit/edb2eeb7da8494998d0073f8aaeb8478cee5e00b
Link: https://lkml.kernel.org/r/CAHk-=wjyH910+JRBdZf_Y9G54c1M=LBF8NKXB6vJcm9XjLnRfg@mail.gmail.com
Link: https://lkml.kernel.org/r/20190912034421.GA2085@darwi-home-pc
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
===

That said, I have an issue with the -EINVAL return code here: it is also
returned in cases where the parameters passed are genuinely not
understood by the kernel, and no entropy has been written to the buffer.
Therefore, the caller has to assume that the call has failed, waste all
the bytes in the buffer, and try some fallback strategy. Can we think of
some other error code?

The other part of me thinks that triggering a fallback, by returning an
error code, is never the right thing to do. If the "uninitialized" state
exists at all, applications and libraries have to care (and I would
expect their authors who don't pass GRND_RANDOM to just fall back to
/dev/urandom). Therefore, we are back to square one, except that the
fallback code in the application is something that is only rarely
exercised, and thus has higher chances to accumulate bugs. Because the
only expected/reasonable fallback is to read from /dev/urandom, the
whole result looks like shifting the responsibility/blame without
achieving anything useful. As the issue is not really solvable, just
give the application not-so-random data, as /dev/urandom does, without
any indication - this would at least keep the benefit of not needing a
file descriptor. It is simply not possible to do anything better without
eliminating the userspace-visible "uninitialized" crng state, e.g. with
the help of entropy input from the boot loader, or a config or command
line option to trust the in-kernel jitter entropy.

>
> Suggested-by: Linus Torvalds <[email protected]>
> Signed-off-by: Ahmed S. Darwish <[email protected]>
> ---
>
> Notes:
> This feels very risky at the very end of -rc8, so only sending
> this as an RFC. The system of course reliably boots with this,
> and the log, as expected, powerfully warns all callers:
>
> $ dmesg | grep random
> [0.236472] random: get_random_bytes called from start_kernel+0x30f/0x4d7 with crng_init=0
> [0.680263] random: fast init done
> [2.500346] random: lvm: uninitialized urandom read (4 bytes read)
> [2.595125] random: systemd-random-: invalid getrandom request (512 bytes): crng not ready
> [2.595126] random: systemd-random-: uninitialized urandom read (512 bytes read)
> [3.427699] random: dbus-daemon: uninitialized urandom read (12 bytes read)
> [3.979425] urandom_read: 1 callbacks suppressed
> [3.979426] random: polkitd: uninitialized urandom read (8 bytes read)
> [3.979726] random: polkitd: uninitialized urandom read (8 bytes read)
> [3.979752] random: polkitd: uninitialized urandom read (8 bytes read)
> [4.473398] random: gnome-session-b: invalid getrandom request (16 bytes): crng not ready
> [4.473404] random: gnome-session-b: invalid getrandom request (16 bytes): crng not ready
> [4.473409] random: gnome-session-b: invalid getrandom request (16 bytes): crng not ready
> [5.265636] random: crng init done
> [5.265649] random: 3 urandom warning(s) missed due to ratelimiting
> [5.265652] random: 1 getrandom warning(s) missed due to ratelimiting
>
> drivers/char/random.c | 21 ++++++++++++++++-----
> 1 file changed, 16 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/char/random.c b/drivers/char/random.c
> index 4a50ee2c230d..309dc5ddf370 100644
> --- a/drivers/char/random.c
> +++ b/drivers/char/random.c
> @@ -511,6 +511,8 @@ static struct ratelimit_state unseeded_warning =
>  	RATELIMIT_STATE_INIT("warn_unseeded_randomness", HZ, 3);
>  static struct ratelimit_state urandom_warning =
>  	RATELIMIT_STATE_INIT("warn_urandom_randomness", HZ, 3);
> +static struct ratelimit_state getrandom_warning =
> +	RATELIMIT_STATE_INIT("warn_getrandom_notavail", HZ, 3);
>
>  static int ratelimit_disable __read_mostly;
>
> @@ -1053,6 +1055,12 @@ static void crng_reseed(struct crng_state *crng, struct entropy_store *r)
>  				  urandom_warning.missed);
>  			urandom_warning.missed = 0;
>  		}
> +		if (getrandom_warning.missed) {
> +			pr_notice("random: %d getrandom warning(s) missed "
> +				  "due to ratelimiting\n",
> +				  getrandom_warning.missed);
> +			getrandom_warning.missed = 0;
> +		}
>  	}
>  }
>
> @@ -1915,6 +1923,7 @@ int __init rand_initialize(void)
>  	crng_global_init_time = jiffies;
>  	if (ratelimit_disable) {
>  		urandom_warning.interval = 0;
> +		getrandom_warning.interval = 0;
>  		unseeded_warning.interval = 0;
>  	}
>  	return 0;
> @@ -2138,8 +2147,6 @@ const struct file_operations urandom_fops = {
>  SYSCALL_DEFINE3(getrandom, char __user *, buf, size_t, count,
>  		unsigned int, flags)
>  {
> -	int ret;
> -
>  	if (flags & ~(GRND_NONBLOCK|GRND_RANDOM))
>  		return -EINVAL;
>
> @@ -2152,9 +2159,13 @@ SYSCALL_DEFINE3(getrandom, char __user *, buf, size_t, count,
>  	if (!crng_ready()) {
>  		if (flags & GRND_NONBLOCK)
>  			return -EAGAIN;
> -		ret = wait_for_random_bytes();
> -		if (unlikely(ret))
> -			return ret;
> +
> +		if (__ratelimit(&getrandom_warning))
> +			pr_notice("random: %s: invalid getrandom request "
> +				  "(%zd bytes): crng not ready",
> +				  current->comm, count);
> +
> +		return -EINVAL;
>  	}
>  	return urandom_read(NULL, buf, count, NULL);
>  }
> --
> 2.23.0
>

--
Alexander E. Patrakov

2019-09-14 19:57:53

by Ahmed S. Darwish

Subject: Re: Linux 5.3-rc8

On Thu, Sep 12, 2019 at 12:34:45PM +0100, Linus Torvalds wrote:
> On Thu, Sep 12, 2019 at 9:25 AM Theodore Y. Ts'o <[email protected]> wrote:
> >
> > Hmm, one thought might be GRND_FAILSAFE, which will wait up to two
> > minutes before returning "best efforts" randomness and issuing a huge
> > massive warning if it is triggered?
>
> Yeah, based on (by now) _years_ of experience with people mis-using
> "get me random numbers", I think the sense of a new flag needs to be
> "yeah, I'm willing to wait for it".
>
> Because most people just don't want to wait for it, and most people
> don't think about it, and we need to make the default be for that
> "don't think about it" crowd, with the people who ask for randomness
> sources for a secure key having to very clearly and very explicitly
> say "Yes, I understand that this can take minutes and can only be done
> long after boot".
>
> Even then people will screw that up because they copy code, or some
> less than gifted rodent writes a library and decides "my library is so
> important that I need that waiting sooper-sekrit-secure random
> number", and then people use that broken library by mistake without
> realizing that it's not going to be reliable at boot time.
>
> An alternative might be to make getrandom() just return an error
> instead of waiting. Sure, fill the buffer with "as random as we can"
> stuff, but then return -EINVAL because you called us too early.
>

ACK, that's probably _the_ most sensible approach. Only caveat is
the slight change in user-space API semantics though...

For example, this breaks the just released systemd-random-seed(8)
as it _explicitly_ requests blocking behavior from getrandom() here:

  => src/random-seed/random-seed.c:
  /*
   * Let's make this whole job asynchronous, i.e. let's make
   * ourselves a barrier for proper initialization of the
   * random pool.
   */
  k = getrandom(buf, buf_size, GRND_NONBLOCK);
  if (k < 0 && errno == EAGAIN && synchronous) {
          log_notice("Kernel entropy pool is not initialized yet, "
                     "waiting until it is.");

          k = getrandom(buf, buf_size, 0); /* retry synchronously */
  }
  if (k < 0) {
          log_debug_errno(errno, "Failed to read random data with "
                          "getrandom(), falling back to "
                          "/dev/urandom: %m");
  } else if ((size_t) k < buf_size) {
          log_debug("Short read from getrandom(), falling back to "
                    "/dev/urandom: %m");
  } else {
          getrandom_worked = true;
  }

Nonetheless, a slightly broken systemd-random-seed, that was just
released only 11 days ago (v243), is honestly much better than a
*non-booting system*...

I've sent an RFC patch at [1].

To handle the systemd case, I'll add the discussed "yeah, I'm
willing to wait for it" flag (GRND_BLOCK) in v2.

If this whole approach is going to be merged, and the slight ABI
breakage is to be tolerated (hmmmmm?), I wonder how systemd-random-seed
will handle the semantics change without doing ugly kernel version
checks...

thanks,

[1] https://lkml.kernel.org/r/20190914122500.GA1425@darwi-home-pc

--
darwi
http://darwish.chasingpointers.com

2019-09-14 21:20:55

by Theodore Ts'o

Subject: Re: Linux 5.3-rc8

On Sat, Sep 14, 2019 at 11:25:09AM +0200, Ahmed S. Darwish wrote:
> Unfortunately, it only made the early fast init faster, but didn't fix
> the normal crng init blockage :-(

Yeah, I see why; the original goal was to do the fast init so that
using /dev/urandom, even before we were fully initialized, wouldn't be
deadly. But then we still wanted 128 bits of estimated entropy the
old fashioned way before we declare the CRNG initialized.

There are a bunch of things that I think I want to do long-term, such
as make CONFIG_RANDOM_TRUST_CPU the default, trying to get random
entropy from the bootloader, etc. But none of this is something we
should do in a hurry, especially this close before 5.4 drops. So I
think I want to fix things this way, which is a bit of a hack, but I
think it's better than simply reverting commit b03755ad6f33.

Ahmed, Linus, what do you think?

- Ted

From f1a111bff3b996258410e51a3760fc39bbd7058f Mon Sep 17 00:00:00 2001
From: Theodore Ts'o <[email protected]>
Date: Sat, 14 Sep 2019 12:21:39 -0400
Subject: [PATCH] ext4: don't plug in __ext4_get_inode_loc if the CRNG is not
initialized

Unfortunately commit b03755ad6f33 ("ext4: make __ext4_get_inode_loc
plug") is so effective that on some systems, where RDRAND is not
trusted and the GNOME display manager is using getrandom(2) to get
randomness for the MIT Magic Cookie (which isn't really secure, so using
getrandom(2) is a bit of a waste) in early boot on an Arch system, it is
causing the boot to hang.

Since this is causing problems, although arguably this is userspace's
fault, let's not do it if the CRNG is not yet initialized. This is
better than trying to tweak the random number generator right before
5.4 is released (I'm afraid we'll accidentally make it _too_ weak),
and it's also better than simply completely reverting b03755ad6f33.

We're effectively reverting it while the RNG is not yet initialized,
to slow down the boot and make it less efficient, just to work around
broken init setups.

Fixes: b03755ad6f33 ("ext4: make __ext4_get_inode_loc plug")
Signed-off-by: Theodore Ts'o <[email protected]>
---
fs/ext4/inode.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 4e271b509af1..41ad93f11b6d 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -4534,6 +4534,7 @@ static int __ext4_get_inode_loc(struct inode *inode,
 	struct buffer_head	*bh;
 	struct super_block	*sb = inode->i_sb;
 	ext4_fsblk_t		block;
+	int			be_inefficient = !rng_is_initialized();
 	struct blk_plug		plug;
 	int			inodes_per_block, inode_offset;

@@ -4541,7 +4542,6 @@ static int __ext4_get_inode_loc(struct inode *inode,
 	if (inode->i_ino < EXT4_ROOT_INO ||
 	    inode->i_ino > le32_to_cpu(EXT4_SB(sb)->s_es->s_inodes_count))
 		return -EFSCORRUPTED;
-
 	iloc->block_group = (inode->i_ino - 1) / EXT4_INODES_PER_GROUP(sb);
 	gdp = ext4_get_group_desc(sb, iloc->block_group, NULL);
 	if (!gdp)
@@ -4623,7 +4623,8 @@ static int __ext4_get_inode_loc(struct inode *inode,
 		 * If we need to do any I/O, try to pre-readahead extra
 		 * blocks from the inode table.
 		 */
-		blk_start_plug(&plug);
+		if (likely(!be_inefficient))
+			blk_start_plug(&plug);
 		if (EXT4_SB(sb)->s_inode_readahead_blks) {
 			ext4_fsblk_t b, end, table;
 			unsigned num;
@@ -4654,7 +4655,8 @@ static int __ext4_get_inode_loc(struct inode *inode,
 		get_bh(bh);
 		bh->b_end_io = end_buffer_read_sync;
 		submit_bh(REQ_OP_READ, REQ_META | REQ_PRIO, bh);
-		blk_finish_plug(&plug);
+		if (likely(!be_inefficient))
+			blk_finish_plug(&plug);
 		wait_on_buffer(bh);
 		if (!buffer_uptodate(bh)) {
 			EXT4_ERROR_INODE_BLOCK(inode, block,
--
2.23.0

2019-09-14 21:23:29

by Linus Torvalds

Subject: Re: Linux 5.3-rc8

On Sat, Sep 14, 2019 at 8:02 AM Ahmed S. Darwish <[email protected]> wrote:
>
> On Thu, Sep 12, 2019 at 12:34:45PM +0100, Linus Torvalds wrote:
> >
> > An alternative might be to make getrandom() just return an error
> > instead of waiting. Sure, fill the buffer with "as random as we can"
> > stuff, but then return -EINVAL because you called us too early.
>
> ACK, that's probably _the_ most sensible approach. Only caveat is
> the slight change in user-space API semantics though...
>
> For example, this breaks the just released systemd-random-seed(8)
> as it _explicitly_ requests blocking behavior from getrandom() here:
>

Actually, I would argue that the "don't ever block, instead fill
buffer and return error instead" fixes this broken case.

>   => src/random-seed/random-seed.c:
>   /*
>    * Let's make this whole job asynchronous, i.e. let's make
>    * ourselves a barrier for proper initialization of the
>    * random pool.
>    */
>   k = getrandom(buf, buf_size, GRND_NONBLOCK);
>   if (k < 0 && errno == EAGAIN && synchronous) {
>           log_notice("Kernel entropy pool is not initialized yet, "
>                      "waiting until it is.");
>
>           k = getrandom(buf, buf_size, 0); /* retry synchronously */
>   }

Yeah, the above is yet another example of completely broken garbage.

You can't just wait and block at boot. That is simply 100%
unacceptable, and always has been, exactly because that may
potentially mean waiting forever since you didn't do anything that
actually is likely to add any entropy.

>   if (k < 0) {
>           log_debug_errno(errno, "Failed to read random data with "
>                           "getrandom(), falling back to "
>                           "/dev/urandom: %m");

At least it gets a log message.

So I think the right thing to do is to just make getrandom() return
-EINVAL, and refuse to block.

As mentioned, this has already historically been a huge issue on
embedded devices, and with disks turning not just to NVMe but to
actual polling nvdimm/xpoint/flash, the amount of true "entropy"
randomness we can give at boot is very questionable.

We can (and will) continue to do a best-effort thing (including very
much using rdrand and friends), but the whole "wait for entropy"
simply *must* stop.

> I've sent an RFC patch at [1].
>
> [1] https://lkml.kernel.org/r/20190914122500.GA1425@darwi-home-pc

Looks reasonable to me. Except I'd just make it simpler and make it a
big WARN_ON_ONCE(), which is a lot harder to miss than pr_notice().
Make it clear that it is a *bug* if user space thinks it should wait
at boot time.

Also, we might even want to just fill the buffer and return 0 at that
point, to make sure that even more broken user space doesn't then try
to sleep manually and turn it into a "I'll wait myself" loop.

Linus

2019-09-14 21:27:20

On Sat, Sep 14, 2019 at 11:56 PM Lennart Poettering
<[email protected]> wrote:
>
> I am not expecting the kernel to guarantee entropy. I am just expecting
> the kernel to not give me garbage knowingly. It's OK if it gives me
> garbage unknowingly, but I have a problem if it gives me trash all the
> time.

So realistically, we never actually give you *garbage*.

It's just that we try very hard to actually give you some entropy
guarantees, and that we can't always do in a timely manner -
particularly if you don't help.

But on a PC, we can _almost_ guarantee entropy. Even with a golden
image, we do mix in:

- timestamp counter on every device interrupt (but "device interrupt"
doesn't include things like the local CPU timer, so it really needs
device activity)

- random boot and BIOS memory (dmi tables, the EFI RNG entry, etc)

- various device state (things like MAC addresses when registering
network devices, USB device numbers, etc)

- and obviously any CPU rdrand data

and note the "mix in" part - it's all designed so that you don't trust
any of this for randomness on its own, but very much hopefully it
means that almost *any* differences in boot environment will add a
fair amount of unpredictable behavior.

But also note the "on a PC" part.

Also note that as far as the kernel is concerned, none of the above
counts as "entropy" for us, except to a very small degree the device
interrupt timing thing. But you need hundreds of interrupts for that
to be considered really sufficient.

And that's why things broke. It turns out that making ext4 be more
efficient at boot caused fewer disk interrupts, and now we weren't
convinced we had sufficient entropy. And the systemd boot thing just
*stopped*, waiting for entropy to magically appear, which it never will
if the machine is idle and not doing anything.

So do we give you "garbage" in getrandom()? We try really really hard
not to, but it's exactly the "can we _guarantee_ that it has entropy"
that ends up being the problem.

So if some silly early boot process comes along, and asks for "true
randomness", and just blocks for it without doing anything else,
that's broken from a kernel perspective.

In practice, the only situation we have had really big problems with
not giving "garbage" isn't actually the "golden distro image" case you
talk about. It's the "embedded device golden _system_ image" case,
where the image isn't just the distribution, but the full bootloader
state.

Some cheap embedded MIPS CPU without even a timestamp counter, with
identical flash contents for millions of devices, and doing a "on
first boot, generate a long-term key" without even connecting to the
network first.

That's the thing Ted was pointing at:

https://factorable.net/weakkeys12.extended.pdf

so yes, it can be "garbage", but it can be garbage only if you really
really do things entirely wrong.

But basically, you should never *ever* try to generate some long-lived
key and then just wait for it without doing anything else. The
"without doing anything else" is key here.

But every time we've had a blocking interface, that's exactly what
somebody has done. Which is why I consider that long blocking thing to
be completely unacceptable. There is no reason to believe that the
wait will ever end, partly exactly because we don't consider timer
interrupts to add any timer randomness. So if you are just waiting,
nothing necessarily ever happens.

Linus

2019-09-15 19:36:29

20.09.2019 02:47, Linus Torvalds wrote:
> On Thu, Sep 19, 2019 at 1:45 PM Alexander E. Patrakov
> <[email protected]> wrote:
>>
>> This already resembles in-kernel haveged (except that it doesn't credit
>> entropy), and Willy Tarreau said "collect the small entropy where it is,
>> period" today. So, too many people touched upon the topic in one day,
>> and therefore I'll bite.
>
> I'm one of the people who aren't entirely convinced by the jitter
> entropy - I definitely believe it exists, I just am not necessarily
> convinced about the actual entropy calculations.
>
> So while I do think we should take things like the cycle counter into
> account just because I think it's a useful way to force some noise,
> I am *not* a huge fan of the jitter entropy driver either, because of
> the whole "I'm not convinced about the amount of entropy".
>
> The whole "third order time difference" thing would make sense if the
> time difference was some kind of smooth function - which it is at a
> macro level.
>
> But at a micro level, I could easily see the time difference having
> some very simple pattern - say that your cycle counter isn't really
> cycle-granular, and the load takes 5.33 "cycles" and you see a time
> difference pattern of (5, 5, 6, 5, 5, 6, ...). No real entropy at all
> there, it is 100% reliable.
>
> At a macro level, that's a very smooth curve, and you'd say "ok, time
> difference is 5.3333 (repeating)". But that's not what the jitter
> entropy code does. It just does differences of differences.
>
> And that completely non-random pattern has a first-order difference of
> 0, 1, 1, 0, 1, 1.. and a second order of 1, 0, 1, 1, 0, and so on
> forever. So the "jitter entropy" logic will assign that completely
> repeatable thing entropy, because the delta difference doesn't ever go
> away.
>
> Maybe I misread it.

You didn't. Let me generalize and rephrase the part of the concern that
I agree with, in my own words:

The same code is used in cryptoapi rng, and also a userspace version
exists. These two have been tested by the author via the "dieharder"
tool (see the message for commit d9d67c87), so we know that on his
machine it actually produces good-quality random bits. However, the
in-kernel self-test is much, much weaker, and would not catch the
situation when someone's machine is deterministic in a way that you
describe, or something similar.

OTOH, I thought that at least part of the real entropy, if it exists,
comes from the interference of the CPU's memory accesses with the
refresh cycles that are clocked from an independent oscillator. That's
why (in order to catch more of them before declaring the crng
initialized) I have set the quality to the minimum possible that is
guaranteed to be distinct from zero according to the fixed-point math in
hwrng_fillfn() in drivers/char/hw_random/core.c.

>
> We used to (we still do, but we used to too) do that same third-order
> delta difference ourselves for the interrupt timing entropy estimation
> in add_timer_randomness(). But I think it's more valid with something
> that likely has more noise (interrupt timing really _should_ be
> noisy). It's not clear that the jitterentropy load really has all that
> much noise.
>
> That said, I'm _also_ not a fan of the user mode models - they happen
> too late anyway for some users, and as you say, it leaves us open to
> random (heh) user mode distribution choices that may be more or less
> broken.
>
> I would perhaps be willing to just put my foot down, and say "ok,
> we'll solve the 'getrandom(0)' issue by just saying that if that
> blocks too much, we'll do the jitter entropy thing".
>
> Making absolutely nobody happy, but working in practice. And maybe
> encouraging the people who don't like jitter entropy to use
> GRND_SECURE instead.

I think this approach makes sense. For those who don't believe in jitter
entropy, it changes really nothing (except a one-time delay) compared to
Ahmed's first patch that makes getrandom(0) equivalent to /dev/urandom,
and nobody has so far proposed anything better that doesn't break
existing systems. And for those who do believe in jitter entropy, this
makes the situation as good as in OpenBSD.

--
Alexander E. Patrakov

Attachments:

smime.p7s (3.96 kB)
S/MIME cryptographic signature

2019-09-20 14:33:26

On 9/10/19 12:21 AM, Ahmed S. Darwish wrote:

> Can this even be considered a user-space breakage? I'm honestly not
> sure. On my modern RDRAND-capable x86, just running rng-tools rngd(8)
> early-on fixes the problem. I'm not sure about the status of older
> CPUs though.

Tangent: I asked aloud on Twitter last night if anyone had exploited
Rowhammer-like effects to generate entropy...and sure enough, the usual
suspects have: https://arxiv.org/pdf/1808.04286.pdf

While this requires low level access to a memory controller, it's
perhaps an example of something a platform designer could look at as a
source to introduce boot-time entropy for e.g. EFI_RNG_PROTOCOL even on
an existing platform without dedicated hardware for the purpose.

Just a thought.

Jon.

2019-10-03 21:35:25

by Jon Masters
