2017-06-02 15:00:03

by Jason A. Donenfeld

[permalink] [raw]
Subject: get_random_bytes returns bad randomness before seeding is complete

Hi folks,

This email is about an issue with get_random_bytes(), the CSPRNG used
inside the kernel for generating keys and nonces and whatnot. However,
I will begin with an aside:

/dev/urandom will return bad randomness before its seeded, rather than
blocking, and despite years and years of discussion and bike shedding,
this behavior hasn't changed, and the vast majority of cryptographers
and security experts remain displeased. It _should_ block until its
been seeded for the first time, just like the new getrandom() syscall,
and this important bug fix (which is not a real api break) should be
backported to all stable kernels. (Userspace doesn't even have a
reliable way of determining whether or not it's been seeded!) Yes,
yes, you have arguments for why you're keeping this pathological, but
you're still wrong, and this api is still a bug. Forcing userspace
architectures to work around your bug, as your arguments usually go,
is disquieting. Anyway, that's not what this email is about, but given
that as a backdrop, here's a different-but-related, and perhaps more
petulant, issue...

The problem this email intends to address is this:

void get_random_bytes(void *buf, int nbytes)
{
#if DEBUG_RANDOM_BOOT > 0
if (!crng_ready())
printk(KERN_NOTICE "random: %pF get_random_bytes called "
"with crng_init = %d\n", (void *) _RET_IP_, crng_init);
#endif
...
extract_crng(buf);

Or, on older kernels:

void get_random_bytes(void *buf, int nbytes)
{
#if DEBUG_RANDOM_BOOT > 0
if (unlikely(nonblocking_pool.initialized == 0))
printk(KERN_NOTICE "random: %pF get_random_bytes called "
"with %d bits of entropy available\n",
(void *) _RET_IP_,
nonblocking_pool.entropy_total);
#endif
trace_get_random_bytes(nbytes, _RET_IP_);
extract_entropy(&nonblocking_pool, buf, nbytes, 0, 0);
}

As the printk implies, it's possible for get_random_bytes() to be
called before it's seeded. Why was that helpful printk put behind an
ifdef? I suspect because people would see the lines in their dmesg and
write emails like this one to the list.

I anticipate an argument coming from the never-block /dev/urandom
cartel that goes something along the lines of, "get_random_bytes() is
only called in paths initiated by a syscall; therefore, userspace must
ensure /dev/urandom, which is the same pool as get_random_bytes, has
been properly seeded, before making any syscalls that might lead to
get_random_bytes() being called."

If you've already given up on the general initiative of trying to urge
them to make /dev/urandom block until seeded, then this might seem
like a reasonable argument. If /dev/urandom is broken, it doesn't
matter if get_random_bytes() is broken too, if the required
work-around for one is the same as for the other.

But the premise is flawed. get_random_bytes() can be called before any
syscalls are made, before userspace even has an opportunity to ensure
/dev/urandom is seeded. That's what that printk in the ifdef is all
about. Bad news bears.

Grepping through the source tree for get_random_bytes, I just opened a
few files at random to see what the deal was. Here's one:

drivers/net/ieee802154/at86rf230.c:
static int at86rf230_probe(struct spi_device *spi)
{
...
get_random_bytes(csma_seed, ARRAY_SIZE(csma_seed));

No clue what this driver is for or if it actually needs good
randomness, but using get_random_bytes in a hardware probe function
can happen prior to userspace's ability to seed.

arch/s390/crypto/prng.c seems to call get_random_bytes on module load,
which can happen pre-userspace, for seeding some kind of RNG of its
own.

net/ipv6/addrconf.c:
static void __ipv6_regen_rndid(struct inet6_dev *idev)
{
regen:
get_random_bytes(idev->rndid, sizeof(idev->rndid));

This is in the networking stack when bringing up an interface and
assigning randomly assigned v6 addresses. While you might argue that
userspace is required for networking, remember that the kernel
actually has the ability to initiate networking all on its own, before
userspace; the kernel even has its own dhcp client! Yowza. In fact, on
that note:

net/ipv4/ipconfig.c:
static int __init ic_open_devs(void)
{
...
get_random_bytes(&d->xid, sizeof(__be32));

And so on and so on and so on. If you grep the source as I did, you'll
find there's no shortage of head-scratchers. "Hmmm," you ask yourself,
"could this be called before userspace has ensured that /dev/urandom
is seeded? do we actually _need_ good randomness here, or does it not
matter?" I guess you could try to reason about each and every one of
them -- you might even have those same questions about the silly
examples I pasted above -- but that one-by-one methodology seems
excessively fragile and arduous.

There must have been a developer who thought about this at least once
before, because random.c does have a callback notifier mechanism for
drivers to learn when they can safely call get_random_bytes --
add_random_ready_callback. However, it's only used by ONE driver --
the drbg in crypto/. It's pretty clunky to use, and there's no
reasonable to way replace every single usage of get_random_bytes with
complicated callback infrastructure.

Instead, get_random_bytes simply needs to always return good
randomness. But this is complicated to do inside the kernel. We can't
simply have it block until seeded, because we might not be in process
context and therefore cannot sleep, and introducing a spin for
something that could take a while is untenable too.

There might be some hope, however. Recent kernels now have a very fast
urandom seeding, which moves the seed-ready-point to be much early
during boot. This is still not early enough to just "do nothing", but
early enough that there's room for a good solution:

For builtin modules, we defer all __init calls to after seeding of
drivers that use get_random_bytes. That is to say, rather than using
add_random_ready_callback one by one in an ad-hoc fashion with every
call site that needs good randomness, we just defer loading those
entire modules until an add_random_ready_callback callback. We might
need to explicitly opt-in to this -- by introducing an
`after_urandom_init(..)`, for example, to replace module_init in these
cases -- or perhaps there's a way to automatically detect and defer.
For external modules, it's much easier; we simply have
request_module() block until after seeding is complete. This function
is already a blocking one, so that's not an issue. And it'd ensure
that things like virtual network drivers that are dynamically loaded
and use get_random_bytes, such as vxlan or wireguard, simply block on
`ip link add ...`, which would be the desired behavior.

Alternatively, I'm open to other solutions people might come up with.

Thoughts?

Jason


2017-06-02 15:53:44

by Jason A. Donenfeld

[permalink] [raw]
Subject: Re: get_random_bytes returns bad randomness before seeding is complete

(Meanwhile...)

In my own code, I'm currently playing with a workaround that looks like this:

--- a/src/main.c
+++ b/src/main.c

+#include <linux/completion.h>
+#include <linux/random.h>

+struct rng_initializer {
+ struct completion done;
+ struct random_ready_callback cb;
+};
+static void rng_initialized_callback(struct random_ready_callback *cb)
+{
+ complete(&container_of(cb, struct rng_initializer, cb)->done);
+}
+
static int __init mod_init(void)
{
int ret;
+ struct rng_initializer rng = {
+ .done = COMPLETION_INITIALIZER(rng.done),
+ .cb = { .owner = THIS_MODULE, .func = rng_initialized_callback }
+ };
+
+ ret = add_random_ready_callback(&rng.cb);
+ if (!ret)
+ wait_for_completion(&rng.done);
+ else if (ret != -EALREADY)
+ return ret;

do_things_with_get_random_bytes_maybe();

Depending on the situation, however, I could imagine that
wait_for_completion never returning, if its blocking activity that
contributes to the seed actually being available, if this is called
from a compiled-in module, so I find this a bit sub-optimal...

2017-06-02 16:48:29

by Jason A. Donenfeld

[permalink] [raw]
Subject: Re: get_random_bytes returns bad randomness before seeding is complete

Further investigations: if the whack-a-mole approach is desirable,
perhaps many of those get_random_bytes calls should be converted to
get_blocking_random_bytes. In that case, this commit, which removed
this helpful API, should be reverted:

commit c2719503f5e1e6213d716bb078bdad01e28ebcbf
Author: Herbert Xu <[email protected]>
Date: Tue Jun 9 18:19:42 2015 +0800

random: Remove kernel blocking API

This patch removes the kernel blocking API as it has been completely
replaced by the callback API.

Signed-off-by: Herbert Xu <[email protected]>

diff --git a/drivers/char/random.c b/drivers/char/random.c
index a1576ed1d88e..d0da5d852d41 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -1265,18 +1265,6 @@ void get_random_bytes(void *buf, int nbytes)
EXPORT_SYMBOL(get_random_bytes);

/*
- * Equivalent function to get_random_bytes with the difference that this
- * function blocks the request until the nonblocking_pool is initialized.
- */
-void get_blocking_random_bytes(void *buf, int nbytes)
-{
- if (unlikely(nonblocking_pool.initialized == 0))
- wait_event(urandom_init_wait, nonblocking_pool.initialized);
- extract_entropy(&nonblocking_pool, buf, nbytes, 0, 0);
-}
-EXPORT_SYMBOL(get_blocking_random_bytes);
-
-/*

2017-06-02 17:26:25

by Theodore Ts'o

[permalink] [raw]
Subject: Re: get_random_bytes returns bad randomness before seeding is complete

On Fri, Jun 02, 2017 at 04:59:56PM +0200, Jason A. Donenfeld wrote:
> /dev/urandom will return bad randomness before its seeded, rather than
> blocking, and despite years and years of discussion and bike shedding,
> this behavior hasn't changed, and the vast majority of cryptographers
> and security experts remain displeased. It _should_ block until its
> been seeded for the first time, just like the new getrandom() syscall,
> and this important bug fix (which is not a real api break) should be
> backported to all stable kernels. (Userspace doesn't even have a
> reliable way of determining whether or not it's been seeded!) Yes,
> yes, you have arguments for why you're keeping this pathological, but
> you're still wrong, and this api is still a bug. Forcing userspace
> architectures to work around your bug, as your arguments usually go,
> is disquieting.

I tried making /dev/urandom block. The zero day kernel testing within
a few hours told us that Ubuntu LTS at the time and OpenWrt would fail
to boot. And when Python made a change to call getrandom(2) with the
default blocking semantic instead of using /dev/urandom, some
distributions using systemd stopped booting.

So if you're a security focused individual who is kvetching that we're
asking to force userspace to change to fix "a bug", you need to
understand that making the change you're suggesting will *also*
require making changes to userspace. And if you want to go try to
convince Linus Torvalds that it's OK to make a backwards-incompatible
changes that break Ubuntu LTS and OpenWRT by causing them to fail to
boot --- be my guest.

And if we're breaking distributions from them booting when their use
of /dev/urandom is not security critical, I suspect Linus is not going
to be impressed that you are breaking those users for no beneft. (For
example, Python's use of getrandom(2) was to prevent denial of service
attacks when Python is used as a fast-CGI web server script, and it
was utterly pointless from security perspective to block when the
python script was creating on-demand systemd unit files, where the DOS
threat was completely irrelevant.)

> As the printk implies, it's possible for get_random_bytes() to be
> called before it's seeded. Why was that helpful printk put behind an
> ifdef? I suspect because people would see the lines in their dmesg and
> write emails like this one to the list.

The #ifdef and the printk was there from the very beginning.

commit 392a546dc8368d1745f9891ef3f8f7c380de8650
Author: Theodore Ts'o <[email protected]>
Date: Sun Nov 3 18:24:08 2013 -0500

random: add debugging code to detect early use of get_random_bytes()

Signed-off-by: "Theodore Ts'o" <[email protected]>

It was four years ago, so I don't remember the exact circumstances,
but if I recall the issue was not wanting to spam the dmesg. Also, at
the time, we hadn't yet created the asynchronous interfaces that allow
kernel code to do the right thing, even if it was massively
inconvenient.

At this point, I think we should be completely open to making it be a
config option, and if it looks like for common configs we're not
spamming dmesg, we could even it make it be the default. We just also
need to give driver writers some easy-to-understand receipes to fix
their drivers to do the right thing. If we do that first, it's much
more likely they will actually fix their kernel code, instead of just
turning the warning off.

> drivers/net/ieee802154/at86rf230.c:
> static int at86rf230_probe(struct spi_device *spi)
> {
> ...
> get_random_bytes(csma_seed, ARRAY_SIZE(csma_seed));
>
> No clue what this driver is for or if it actually needs good
> randomness, but using get_random_bytes in a hardware probe function
> can happen prior to userspace's ability to seed.

What, you mean as a computer scientist you didn't immediately
understand that csma doesn't refer to CSMA/CD --- or "Carrier-sense
multiple access with collision avoidance"? For shame! :-)

This is basically the exponential backoff which used in ethernet
networks, and the *real* answer is that they should be using
prandom_u32().

> arch/s390/crypto/prng.c seems to call get_random_bytes on module load,
> which can happen pre-userspace, for seeding some kind of RNG of its
> own.

It appears to be accessing a hardware cryptographic processor
function, and seems to do its own entropy gathering relying on stack
garbage as well as using get_random_bytes(). By default it is defined
as a module, so I suspect in most situations it's called
post-userspace, but agreed that it is not guaranteed to be the case.
I think in practice most IBM customers will be safe because they tend
to use distro kernels that will almost certainly be building it as a
module, but your point is well taken.

> And so on and so on and so on. If you grep the source as I did, you'll
> find there's no shortage of head-scratchers. "Hmmm," you ask yourself,
> "could this be called before userspace has ensured that /dev/urandom
> is seeded? do we actually _need_ good randomness here, or does it not
> matter?" I guess you could try to reason about each and every one of
> them -- you might even have those same questions about the silly
> examples I pasted above -- but that one-by-one methodology seems
> excessively fragile and arduous.

This is a fair point.

> There must have been a developer who thought about this at least once
> before, because random.c does have a callback notifier mechanism for
> drivers to learn when they can safely call get_random_bytes --
> add_random_ready_callback. However, it's only used by ONE driver --
> the drbg in crypto/. It's pretty clunky to use, and there's no
> reasonable to way replace every single usage of get_random_bytes with
> complicated callback infrastructure.

That developer was Herbert Xu, and he added it about two years ago.
See commit 205a525c3342.

> There might be some hope, however. Recent kernels now have a very fast
> urandom seeding, which moves the seed-ready-point to be much early
> during boot. This is still not early enough to just "do nothing", but
> early enough that there's room for a good solution:
>
> For builtin modules, we defer all __init calls to after seeding of
> drivers that use get_random_bytes. That is to say, rather than using
> add_random_ready_callback one by one in an ad-hoc fashion with every
> call site that needs good randomness, we just defer loading those
> entire modules until an add_random_ready_callback callback. We might
> need to explicitly opt-in to this -- by introducing an
> `after_urandom_init(..)`, for example, to replace module_init in these
> cases -- or perhaps there's a way to automatically detect and defer.
> For external modules, it's much easier; we simply have
> request_module() block until after seeding is complete. This function
> is already a blocking one, so that's not an issue. And it'd ensure
> that things like virtual network drivers that are dynamically loaded
> and use get_random_bytes, such as vxlan or wireguard, simply block on
> `ip link add ...`, which would be the desired behavior.

So the problem with doing this by default, for all modules, is that on
those platforms which don't have good hardware support for seeding the
non-blocking pool quickly, if we defer all modules, we will also be
deferring the means by which we get the entropy needed to seed the
non-blocking pool. So for example, if you have a bunch of networking
drivers that are using get_random_bytes() for exponential backoff,
when they *should* be using prandom_u32(), if we don't fix *that* bug,
simply deferring the module load will also defer the entropy input
from the networking interrupts from the networking card.

So while I agree that auditing all calls to get_random_bytes() is
fragile from a security robustness perspective, and I certainly agree
that it is onerous, some amount of it is still going to have to be
done. And for things like the S390 PRNG (which is supposed to be
generating cryptorgaphically secure random numbers), probably we
should just fix it to use the add_random_ready_callback() interface
since even if in practice most users will be safe, we shouldn't be
depending on it.

Adding a patch to make DEBUG_RANDOM_BOOT a Kconfig option also is a
really good first step, for someone who wants to take this on as a
project.

Cheers,

- Ted


2017-06-02 17:41:15

by Daniel Micay

[permalink] [raw]
Subject: Re: [kernel-hardening] Re: get_random_bytes returns bad randomness before seeding is complete

On Fri, 2017-06-02 at 17:53 +0200, Jason A. Donenfeld wrote:
> (Meanwhile...)
>
> In my own code, I'm currently playing with a workaround that looks
> like this:
>
> --- a/src/main.c
> +++ b/src/main.c
>
> +#include <linux/completion.h>
> +#include <linux/random.h>
>
> +struct rng_initializer {
> + struct completion done;
> + struct random_ready_callback cb;
> +};
> +static void rng_initialized_callback(struct random_ready_callback
> *cb)
> +{
> + complete(&container_of(cb, struct rng_initializer, cb)->done);
> +}
> +
> static int __init mod_init(void)
> {
> int ret;
> + struct rng_initializer rng = {
> + .done = COMPLETION_INITIALIZER(rng.done),
> + .cb = { .owner = THIS_MODULE, .func =
> rng_initialized_callback }
> + };
> +
> + ret = add_random_ready_callback(&rng.cb);
> + if (!ret)
> + wait_for_completion(&rng.done);
> + else if (ret != -EALREADY)
> + return ret;
>
> do_things_with_get_random_bytes_maybe();
>
> Depending on the situation, however, I could imagine that
> wait_for_completion never returning, if its blocking activity that
> contributes to the seed actually being available, if this is called
> from a compiled-in module, so I find this a bit sub-optimal...

One of the early uses is initializing the stack canary value for SSP in
very early boot. If that blocks, it's going to be blocking nearly
anything else from happening.

On x86, that's only the initial canary since the per-task canaries end
up being used, but elsewhere at least without SMP disabled or changes to
GCC that's all there is so the entropy matters.

2017-06-02 17:44:12

by Jason A. Donenfeld

[permalink] [raw]
Subject: Re: get_random_bytes returns bad randomness before seeding is complete

On Fri, Jun 2, 2017 at 7:26 PM, Theodore Ts'o <[email protected]> wrote:
> I tried making /dev/urandom block.
> So if you're a security focused individual who is kvetching
> And if we're breaking

Yes yes, bla bla, predictable response. I don't care. Your API is
still broken. Excuses excuses. Yes, somebody needs to do the work in
the end, maybe that person can be me, maybe you, maybe somebody else.
Regardless, we can take this to a different thread if you'd like to
bikeshed about that particular point for another five millennia. But
that isn't the main topic of this thread, so I'm not going to get
sucked into that blackhole.

> the time, we hadn't yet created the asynchronous interfaces that allow
> kernel code to do the right thing, even if it was massively
> inconvenient.

While we're on the topic of that, you might consider adding a simple
synchronous interface. I realize that the get_blocking_random_bytes
attempt was aborted as soon as it began, because of issues of
cancelability, but you could just expose the usual array of wait,
wait_interruptable, wait_killable, etc, or just make that wait object
and condition non-static so others can use it as needed. Having to
wrap the current asynchronous API like this kludge is kind of a
downer:

https://git.zx2c4.com/WireGuard/diff/src/config.c?id=c77145a8248dfc49e307eae7d7edd5fdca8d5559

> At this point, I think we should be completely open to making it be a
> config option, and if it looks like for common configs we're not
> spamming dmesg, we could even it make it be the default. We just also
> need to give driver writers some easy-to-understand receipes to fix
> their drivers to do the right thing. If we do that first, it's much
> more likely they will actually fix their kernel code, instead of just
> turning the warning off.

That's a good point. In my initial email I looked down on the
whack-a-mole approach, because nobody is ever going to analyze all of
those cases. But... if dmesgs are going wild for some people, we'll
have upset users doing the hard work for us. So maybe it's worthwhile
to turn that warning on by default.


> This is basically the exponential backoff which used in ethernet
> networks, and the *real* answer is that they should be using
> prandom_u32().
> I think in practice most IBM customers will be safe because they tend

No, what it means is that the particularities of individual examples I
picked at random don't matter. Are we really going to take the time to
audit each and every callsite to see "do they need actually random
data? can this be called pre-userspace?" I mentioned this in my
initial email. As I said there, I think analyzing all the cases one by
one is fragile, and more will pop up, and that's not really the right
way to approach this. And furthermore, as alluded to above, even
fixing clearly-broken places means using that hard-to-use asynchronous
API, which adds even more potentially buggy TCB to these drivers and
all the rest. Not a good strategy.

Seeing as you took the time to actually respond to the
_particularities_ of each individual random example I picked could
indicate that you've missed this point prior.

>> And so on and so on and so on. If you grep the source as I did, you'll
>> find there's no shortage of head-scratchers. "Hmmm," you ask yourself,
>> "could this be called before userspace has ensured that /dev/urandom
>> is seeded? do we actually _need_ good randomness here, or does it not
>> matter?" I guess you could try to reason about each and every one of
>> them -- you might even have those same questions about the silly
>> examples I pasted above -- but that one-by-one methodology seems
>> excessively fragile and arduous.
>
> This is a fair point.

Glad we're on the same page about that.


> That developer was Herbert Xu, and he added it about two years ago.
> See commit 205a525c3342.

Right, it was him and Stephan (CCd). They initially started by adding
get_blocking_random_bytes, but then replaced this with the
asynchronous one, because they realized it could block forever. As I
said above, though, I still think a blocking API would be useful,
perhaps just with more granularity for the way in which it blocks.

> So the problem with doing this by default, for all modules, is that on
> those platforms which don't have good hardware support for seeding the
> non-blocking pool quickly, if we defer all modules, we will also be
> deferring the means by which we get the entropy needed to seed the
> non-blocking pool. So for example, if you have a bunch of networking
> drivers that are using get_random_bytes() for exponential backoff,
> when they *should* be using prandom_u32(), if we don't fix *that* bug,
> simply deferring the module load will also defer the entropy input
> from the networking interrupts from the networking card.

"We shouldn't fix legitimate cryptographic bugs, because the rest of
our codebase is too buggy to support anything reasonable."
Sounds like it'd be worthwhile to begin fixing things in this fashion, then.

> So while I agree that auditing all calls to get_random_bytes() is
> fragile from a security robustness perspective, and I certainly agree
> that it is onerous, some amount of it is still going to have to be
> done.

In fixing those bugs where it's clearly not necessary, I guess you're
right then.

> And for things like the S390 PRNG (which is supposed to be
> generating cryptorgaphically secure random numbers), probably we
> should just fix it to use the add_random_ready_callback() interface
> since even if in practice most users will be safe, we shouldn't be
> depending on it.

Please don't think of this example as THE instance where it matters. I
seriously just opened a few files from my grep at random. That this
one wound up being a particularly relevant one is just chance. Other
dragons lurk below, beware.

> Adding a patch to make DEBUG_RANDOM_BOOT a Kconfig option also is a
> really good first step, for someone who wants to take this on as a
> project.

What would you think of just removing the #ifdef completely?

Regards,
Jason

2017-06-02 17:46:49

by Jason A. Donenfeld

[permalink] [raw]
Subject: Re: [kernel-hardening] Re: get_random_bytes returns bad randomness before seeding is complete

On Fri, Jun 2, 2017 at 7:41 PM, Daniel Micay <[email protected]> wrote:
> One of the early uses is initializing the stack canary value for SSP in
> very early boot. If that blocks, it's going to be blocking nearly
> anything else from happening.
>
> On x86, that's only the initial canary since the per-task canaries end
> up being used, but elsewhere at least without SMP disabled or changes to
> GCC that's all there is so the entropy matters.

If this is the case, then we simply need a function called
get_random_bytes_but_potentially_crappy_ones_because_we_are_desperate_for_anything(),
which would respond with a weaker guarantee than that
get_random_bytes(), which the documentation says always returns
cryptographically secure numbers.

2017-06-02 18:58:29

by Kees Cook

[permalink] [raw]
Subject: Re: [kernel-hardening] Re: get_random_bytes returns bad randomness before seeding is complete

On Fri, Jun 2, 2017 at 10:41 AM, Daniel Micay <[email protected]> wrote:
> On Fri, 2017-06-02 at 17:53 +0200, Jason A. Donenfeld wrote:
>> (Meanwhile...)
>>
>> In my own code, I'm currently playing with a workaround that looks
>> like this:
>>
>> --- a/src/main.c
>> +++ b/src/main.c
>>
>> +#include <linux/completion.h>
>> +#include <linux/random.h>
>>
>> +struct rng_initializer {
>> + struct completion done;
>> + struct random_ready_callback cb;
>> +};
>> +static void rng_initialized_callback(struct random_ready_callback
>> *cb)
>> +{
>> + complete(&container_of(cb, struct rng_initializer, cb)->done);
>> +}
>> +
>> static int __init mod_init(void)
>> {
>> int ret;
>> + struct rng_initializer rng = {
>> + .done = COMPLETION_INITIALIZER(rng.done),
>> + .cb = { .owner = THIS_MODULE, .func =
>> rng_initialized_callback }
>> + };
>> +
>> + ret = add_random_ready_callback(&rng.cb);
>> + if (!ret)
>> + wait_for_completion(&rng.done);
>> + else if (ret != -EALREADY)
>> + return ret;
>>
>> do_things_with_get_random_bytes_maybe();
>>
>> Depending on the situation, however, I could imagine that
>> wait_for_completion never returning, if its blocking activity that
>> contributes to the seed actually being available, if this is called
>> from a compiled-in module, so I find this a bit sub-optimal...
>
> One of the early uses is initializing the stack canary value for SSP in
> very early boot. If that blocks, it's going to be blocking nearly
> anything else from happening.
>
> On x86, that's only the initial canary since the per-task canaries end
> up being used, but elsewhere at least without SMP disabled or changes to
> GCC that's all there is so the entropy matters.

And just to note, building with GCC_PLUGIN_LATENT_ENTROPY, while it
(correctly) doesn't credit entropy to the pool, should at least make
the pool less deterministic between boots.

-Kees

--
Kees Cook
Pixel Security

2017-06-02 19:07:41

by Theodore Ts'o

[permalink] [raw]
Subject: Re: get_random_bytes returns bad randomness before seeding is complete

On Fri, Jun 02, 2017 at 07:44:04PM +0200, Jason A. Donenfeld wrote:
> On Fri, Jun 2, 2017 at 7:26 PM, Theodore Ts'o <[email protected]> wrote:
> > I tried making /dev/urandom block.
> > So if you're a security focused individual who is kvetching
> > And if we're breaking
>
> Yes yes, bla bla, predictable response. I don't care. Your API is
> still broken. Excuses excuses. Yes, somebody needs to do the work in
> the end, maybe that person can be me, maybe you, maybe somebody else.

It's not _my_ API, it's *our* API --- that is the Linux kernel
community's. And part of the rules of this community is that we very
much don't break backwards compatibility, unless there is a really
good reason, where Linus gets to decide if it's a really good reason.
So if you care a lot about this issue, then you need to do the work to
make the change, and part of it is showing, to a high degree of
certainty, that it won't break backwards compatibility.

Because if you don't, and you flout community norms, and users get
broken, and they complain, and you tell them to suck it, then Linus
will pull out is patented clue stick, and tell you that you have in
fact flouted community norms, correct you publically, and then revert
your change.

If you are using the word *you*, and speaking as an outside to the
community, they you can kvetch all you like. But you're an outsider,
and don't have to listen to you. But if you want to make a positive
difference here, and you're passionate about it --- this is you would
need to do. That being said, we're all volunteers, so if you don't
want to bother, that's fine. But then don't be surprised if we don't
take your complaints seriously.

> While we're on the topic of that, you might consider adding a simple
> synchronous interface.

There's that word "you" again....

> I realize that the get_blocking_random_bytes
> attempt was aborted as soon as it began, because of issues of
> cancelability, but you could just expose the usual array of wait,
> wait_interruptable, wait_killable, etc, or just make that wait object
> and condition non-static so others can use it as needed. Having to
> wrap the current asynchronous API like this kludge is kind of a
> downer:

This is open source --- want to send patches? It sounds like it's a
workable, good idea.

> No, what it means is that the particularities of individual examples I
> picked at random don't matter. Are we really going to take the time to
> audit each and every callsite to see "do they need actually random
> data? can this be called pre-userspace?" I mentioned this in my
> initial email. As I said there, I think analyzing all the cases one by
> one is fragile, and more will pop up, and that's not really the right
> way to approach this. And furthermore, as alluded to above, even
> fixing clearly-broken places means using that hard-to-use asynchronous
> API, which adds even more potentially buggy TCB to these drivers and
> all the rest. Not a good strategy.
>
> Seeing as you took the time to actually respond to the
> _particularities_ of each individual random example I picked could
> indicate that you've missed this point prior.

...or that I disagree with your prior point. I think you're being
lazy, and trying to make it someone else's problem and standing on the
side lines and complaining, as opposed to trying to help solve the
problem.

No, of course we can't audit all of the code, but it's probably a good
idea to take a random sample, and to analyze them, so we can get a
sense of what the issues are. And then maybe we can find a way to
quickly find a class of users that can be easily fixed by using
prandom_u32() (for example).

Or maybe we can then help figure out what percentage of the callsites
can be fixed with a synchronous interface, and fix some number of them
just to demonstrate that the synchronous interface does work well.

> Right, it was him and Stephan (CCd). They initially started by adding
> get_blocking_random_bytes, but then replaced this with the
> asynchronous one, because they realized it could block forever. As I
> said above, though, I still think a blocking API would be useful,
> perhaps just with more granularity for the way in which it blocks.

It depends on where it's being used. If it's part of module load,
especially if it's one that's done automatically, having something
that blocks forever might not be all that useful. Especially if it
blocks device drivers from being albe to be initialized enough to
actually supply entropy to the whole system.

Or maybe (in the case of stack canaries), the answer is we should
start with crappy random numbers, but then once the random number
generator has been initialized, we can use the callback to get
cryptographically secure random number generators, and then we need to
figure out how to phase out use of the old crappy random numbers and
substitute in the exclusive use of the good random numbers. Because
saying that we'll just simply not allow any processes to start until
we have good random numbers, which means we can't load the kernel
modules, and we're running on an architecture which doesn't have
RDRAND or even a high-resolution clock, may mean that we're in a world
of hurt.

And simply saying *your* system is buggy, or *your* system is
fundamentally broken, isn't particularly helpful. Yes, *we* have Linux
routers using MIPS processors which don't have much in the way of
entropy gathering facilities or true random number generation. Simply
bricking them so we can say, "yay, our system is no longer buggy", is
not acceptable. So the question is whether you're going to help make
things incrementally better, or just sit on the sidelines and kvetch.

And if your answer is just, "blah blah di blah blah", don't be
surprised if others respond to you in exactly the same way.
Specifically, by saying to you (in your words), "I don't care".

> > Adding a patch to make DEBUG_RANDOM_BOOT a Kconfig option also is a
> > really good first step, for someone who wants to take this on as a
> > project.
>
> What would you think of just removing the #ifdef completely?

I think making it a Kconfig option which defaults to true is the
better approach. At the very least let's make sure that on a range of
"standard x86 developer machines", we're not spamming dmesg. If we
are, simply turning it on and standing on principle, "we're the
cryptographers and we get to decide what is right and holy", and if
lots of people start complaining about how it makes their machine
usuable, that's exactly the same kind of arrogance which caused kernel
developers to become incensed by systemd developers when they spammed
dmesg and made kernel developers' systems unusuable. Would you be
upset if systemd developers did it unto you? Then maybe you shouldn't
do it unto others....

- Ted

2017-06-02 23:58:35

by Jason A. Donenfeld

[permalink] [raw]
Subject: Re: get_random_bytes returns bad randomness before seeding is complete

Hi Ted,

Based on the tone of your last email, before I respond to your
individual points, I think it's worth noting that the intent of this
thread is to get a sampling of opinions of the issue of
get_random_bytes, so that I can write a patch that fixes this issue
(or a series of issues) using some strategy that we agree is a good
one. I've already implemented prototypes of a few different ideas, so
that I can discuss them competently here (a personal thing, I tend to
learn best by "doing"), and so with list feedback, I'll know in which
direction I should go for refinement eventually leading to a [PATCH]
submission. In case it wasn't evident from my last patch series for
drivers/char/random.c, despite not being corporately backed, I really
have a fun time working on this part of the kernel in my freetime.
With that said...

On Fri, Jun 2, 2017 at 9:07 PM, Theodore Ts'o <[email protected]> wrote:
> It's not _my_ API, it's *our* API
> much don't break backwards compatibility,
> broken, and they complain, and you tell them to suck it,
> If you are using the word *you*, and speaking as an outside to the
> community, they you can kvetch all you like. But you're an outsider,
> take your complaints seriously.

As I wrote earlier, if you want to bike-shed the technical aspects and
political aspects of fixing /dev/urandom, we can do that in another
thread. It's obviously a complex topic worthy of discussion, and I'd
like that discussion to be technically worthy of the issues at hand,
some of which you've raised here. So please, start a new topic for
that, and I'll happily follow up, all with the intent of writing a
patch series and running compatibility tests and whatnot; as I said
before, I like working on this stuff. However, the focus of this
thread is get_random_bytes, so please try to stay focused, lest this
get derailed.

So, you wrote:

> And if your answer is just, "blah blah di blah blah", don't be
> surprised if others respond to you in exactly the same way.
> Specifically, by saying to you (in your words), "I don't care".

I really don't care about bikeshedding this with you here, now, in
this thread. I care about solving the get_random_bytes situation.

>> While we're on the topic of that, you might consider adding a simple
>> synchronous interface.
>
> There's that word "you" again....

Yea, whatever, Ted. Can I rephrase? "As the maintainer of
drivers/char/random.c, you might consider adding a synchronous
interface so that the consumers of the file you maintain can benefit
from it; alternatively, and perhaps easier for you, express your
support for such an idea, and I'll gladly write the patch." Better? I
ask you for the benefit of the doubt that this is what I had in mind
when I wrote "you" -- that I'm seeking approval of the idea before
moving my code past prototype stage. You're the boss, so whether I
write the patch or you write the patch or someone else writes the
patch, it really _is_ you whose consideration matters the most.

> This is open source --- want to send patches? It sounds like it's a
> workable, good idea.

Awesome, happy you like it. Yes, absolutely, I'll start cleaning
things up and will send a series.

Do you have any opinions on the issue of what the most flexible API
would be? There are a lot of varieties of wait_event. The default one
is uninterruptable, but as Stephan and Herbert saw when they were
working on this was that it could be dangerous in certain
circumstances not to allow interruption. So probably we'd want an
interruptable variety. But then maybe we want to support the other
wait_event_* modes too, like killable, timeout, hrtimeout, io, and so
on. There's a huge list. These are all implemented as macros in
wait.h, and I don't really want to have a different get_random_bytes_*
function that corresponds to _each_ of these wait_event_* functions,
since that'd be overwhelming. So I was thinking maybe a simple flag
and a timeout param could work:

int get_random_bytes_seeded(u8 *buf, size_t len, bool
is_interruptable, unsigned long timeout);

If timeout==0, the timeout is infinite. If in_interruptible is true,
we use wait_event_interruptible_*, otherwise we use wait_event_*, and
the ret value is the usual return value from wait_event_*.

Does that seem like a good simplification that will cover most use
cases? We could of course add exceeding complexity and choice, but I
was thinking this would cover most use cases, at least to begin.

What do you think?

> ...or that I disagree with your prior point. I think you're being
> lazy, and trying to make it someone else's problem and standing on the
> side lines and complaining, as opposed to trying to help solve the
> problem.

Um, no, what an offensive insinuation. I'm trying to solve the
problem. I'd like your honest feedback, since your maintainer, before
I start fixing this.

> No, of course we can't audit all of the code, but it's probably a good
> idea to take a random sample, and to analyze them, so we can get a
> sense of what the issues are.

That's fair. In that case, please don't take my sample of 4 as a good
one. I'll put some time into doing a more extensive analysis.

> And then maybe we can find a way to
> quickly find a class of users that can be easily fixed by using
> prandom_u32() (for example).

Yea, the easy cases like this might be easy to just extensively audit
for (it's the more subtle ones I'm concerned about). I'm sure there
will be some drivers where I'm unsure, in which case I'll leave it as
is, but for others I can see.

There are ~180 call sites to examine.

> Or maybe we can then help figure out what percentage of the callsites
> can be fixed with a synchronous interface, and fix some number of them
> just to demonstrate that the synchronous interface does work well.

I was planning on doing this to at least a couple callsites in my
upcoming patch series that adds the synchronous interface. It seems
like the right way to see if the API is good or not.

> It depends on where it's being used. If it's part of module load,
> especially if it's one that's done automatically, having something
> that blocks forever might not be all that useful. Especially if it
> blocks device drivers from being albe to be initialized enough to
> actually supply entropy to the whole system.

Right. In my initial email I proposed a distinction:

- External module loading is usually in a different process context,
so multiple modules can be attempted to be loaded at the same time. In
this case, it is probably safe to block in request_module.
- Built-in modules are loaded ± linearly, from init/main.c, so these
really can't block each other. For this, I was thinking of introducing
an rnginit section, to add to the list of things like lateinit. The
rnginit drivers would be loaded _after_ the RNG has initialized, so
that they don't get blocked.

This solution would be instead of introducing that synchronous API
mentioned above. I generally like it better, since it means fewer
error prone changes, and it kind of fixes the problem in a more
systemic way.

The point you raise against it is that there might be issues where a
driver needs randomness to start, yet that driver contributes, perhaps
solely, to randomness. However, I don't think this is an inherent
problem with this solution. I think, rather, that in cases like this,
it's the driver itself that needs to be fixed, so as not to have this
catch-22. So, indeed, we'd have to hunt down any of these cases.

>> What would you think of just removing the #ifdef completely?
>
> I think making it a Kconfig option which defaults to true is the
> better approach. At the very least let's make sure that on a range of
> "standard x86 developer machines", we're not spamming dmesg. If we
> are, simply turning it on and standing on principle, "we're the
> cryptographers and we get to decide what is right and holy", and if
> lots of people start complaining about how it makes their machine
> usuable, that's exactly the same kind of arrogance which caused kernel
> developers to become incensed by systemd developers when they spammed
> dmesg and made kernel developers' systems unusuable. Would you be
> upset if systemd developers did it unto you? Then maybe you shouldn't
> do it unto others....

Again, I wasn't thinking about this in such an extreme polarized
fashion like how you present it here. I was just thinking that since
it's a bug when the pool is read from while unseeded -- either they
should be using prandom or should be deferring the call, as we
discussed -- that it'd just be easiest to always print, so we can
always see when this bug occurs. I wasn't thinking that this would
result in "arrogant log spam" or something. But anyway, if you prefer
it to be a Kconfig option, that's no problem with me, and I'll roll a
patch for that and submit it here.

Regards,
Jason

2017-06-03 00:20:29

by Sandy Harris

[permalink] [raw]
Subject: Re: [kernel-hardening] Re: get_random_bytes returns bad randomness before seeding is complete

The only sensible & general solution for the initialisation problem
that I have seen is John Denker's.
http://www.av8n.com/computer/htm/secure-random.htm#sec-boot-image

If I read that right, it would require only minor kernel changes &
none to the API Ted & others are worrying about. It would be secure
except against an enemy who can read your kernel image or interfere
with your install process. Assuming permissions are set sensibly, that
means an enemy who already has root & such an enemy has lots of much
easier ways to break things, so we need not worry about that case.

The difficulty is that it would require significant changes to
installation scripts. Still, since it is a general solution to a real
problem, it might be better to implement that rather than work on the
other suggestions in the thread.

2017-06-03 02:32:24

by Jason A. Donenfeld

[permalink] [raw]
Subject: [PATCH RFC 1/3] random: add synchronous API for the urandom pool

This enables users of get_random_{bytes,u32,u64,int,long} to wait until
the pool is ready before using this function, in case they actually want
to have reliable randomness.

Signed-off-by: Jason A. Donenfeld <[email protected]>
---
drivers/char/random.c | 46 ++++++++++++++++++++++++++++++++++++++++++++--
include/linux/random.h | 1 +
2 files changed, 45 insertions(+), 2 deletions(-)

diff --git a/drivers/char/random.c b/drivers/char/random.c
index 0ab024918907..bee7b1349bcb 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -1466,7 +1466,10 @@ static ssize_t extract_entropy_user(struct entropy_store *r, void __user *buf,
* number of good random numbers, suitable for key generation, seeding
* TCP sequence numbers, etc. It does not rely on the hardware random
* number generator. For random bytes direct from the hardware RNG
- * (when available), use get_random_bytes_arch().
+ * (when available), use get_random_bytes_arch(). In order to ensure
+ * that the randomness provided by this function is okay, the function
+ * wait_for_random_bytes() should be called and return 0 at least once
+ * at any point prior.
*/
void get_random_bytes(void *buf, int nbytes)
{
@@ -1496,6 +1499,42 @@ void get_random_bytes(void *buf, int nbytes)
EXPORT_SYMBOL(get_random_bytes);

/*
+ * Wait for the urandom pool to be seeded and thus guaranteed to supply
+ * cryptographically secure random numbers. This applies to: the /dev/urandom
+ * device, the get_random_bytes function, and the get_random_{u32,u64,int,long}
+ * family of functions. Using any of these functions without first calling
+ * this function forfeits the guarantee of security.
+ *
+ * If is_interruptable is true, then this uses wait_event_interruptable.
+ * Otherwise the calling process will not respond to signals. If timeout is
+ * 0, then, unless interrupted, this function will wait potentially forever.
+ * Otherwise, timeout is a measure in jiffies.
+ *
+ * Returns: 0 if the urandom pool has been seeded.
+ * -ERESTARTSYS if the function was interrupted or timed out.
+ * Other return values of the wait_event_* functions.
+ */
+int wait_for_random_bytes(bool is_interruptable, unsigned long timeout)
+{
+ if (likely(crng_ready()))
+ return 0;
+ if (is_interruptable && timeout)
+ return wait_event_interruptible_timeout(crng_init_wait, crng_ready(), timeout);
+ if (is_interruptable && !timeout)
+ return wait_event_interruptible(crng_init_wait, crng_ready());
+ if (!is_interruptable && timeout)
+ return wait_event_timeout(crng_init_wait, crng_ready(), timeout);
+ if (!is_interruptable && !timeout) {
+ wait_event(crng_init_wait, crng_ready());
+ return 0;
+ }
+
+ BUG(); /* This BUG() should be compiled out as unreachable code, but just in case... */
+ return -EINVAL;
+}
+EXPORT_SYMBOL(wait_for_random_bytes);
+
+/*
* Add a callback function that will be invoked when the nonblocking
* pool is initialised.
*
@@ -2023,7 +2062,10 @@ struct batched_entropy {
/*
* Get a random word for internal kernel use only. The quality of the random
* number is either as good as RDRAND or as good as /dev/urandom, with the
- * goal of being quite fast and not depleting entropy.
+ * goal of being quite fast and not depleting entropy. In order to ensure
+ * that the randomness provided by this function is okay, the function
+ * wait_for_random_bytes() should be called and return 0 at least once
+ * at any point prior.
*/
static DEFINE_PER_CPU(struct batched_entropy, batched_entropy_u64);
u64 get_random_u64(void)
diff --git a/include/linux/random.h b/include/linux/random.h
index ed5c3838780d..20dd73418bd5 100644
--- a/include/linux/random.h
+++ b/include/linux/random.h
@@ -34,6 +34,7 @@ extern void add_input_randomness(unsigned int type, unsigned int code,
extern void add_interrupt_randomness(int irq, int irq_flags) __latent_entropy;

extern void get_random_bytes(void *buf, int nbytes);
+extern int wait_for_random_bytes(bool is_interruptable, unsigned long timeout);
extern int add_random_ready_callback(struct random_ready_callback *rdy);
extern void del_random_ready_callback(struct random_ready_callback *rdy);
extern void get_random_bytes_arch(void *buf, int nbytes);
--
2.13.0

2017-06-03 02:32:19

by Jason A. Donenfeld

[permalink] [raw]
Subject: [PATCH RFC 0/3] get_random_bytes seed blocking

Per the other thread on this mailing list, here's an initial
stab at what we discussed -- adding a blocking API for the RNG,
and adding a default-on dmesg Kconfig value for when things go
wrong.

Let me know what you think of this general implementation strategy,
and if you like it, I'll move forward with polish and with integrating
it into a fix for a few currently buggy get_random_bytes use cases.

Jason A. Donenfeld (3):
random: add synchronous API for the urandom pool
random: add get_random_{bytes,u32,u64,int,long}_wait family
random: warn when kernel uses unseeded randomness

drivers/char/random.c | 49 +++++++++++++++++++++++++++++++++++++++++++++----
include/linux/random.h | 31 +++++++++++++++++++++++++++++++
lib/Kconfig.debug | 15 +++++++++++++++
3 files changed, 91 insertions(+), 4 deletions(-)

--
2.13.0

2017-06-03 02:32:33

by Jason A. Donenfeld

[permalink] [raw]
Subject: [PATCH RFC 3/3] random: warn when kernel uses unseeded randomness

This enables an important dmesg notification about when drivers have
used the crng without it being seeded first. Prior, these errors would
occur silently, and so there hasn't been a great way of diagnosing these
types of bugs for obscure setups. By adding this as a config option, we
can leave it on by default, so that we learn where these issues happen,
in the field, will still allowing some people to turn it off, if they
really know what they're doing and do not want the log entries.

Signed-off-by: Jason A. Donenfeld <[email protected]>
---
drivers/char/random.c | 3 +--
lib/Kconfig.debug | 15 +++++++++++++++
2 files changed, 16 insertions(+), 2 deletions(-)

diff --git a/drivers/char/random.c b/drivers/char/random.c
index bee7b1349bcb..f64844383d86 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -285,7 +285,6 @@
#define SEC_XFER_SIZE 512
#define EXTRACT_SIZE 10

-#define DEBUG_RANDOM_BOOT 0

#define LONGS(x) (((x) + sizeof(unsigned long) - 1)/sizeof(unsigned long))

@@ -1475,7 +1474,7 @@ void get_random_bytes(void *buf, int nbytes)
{
__u8 tmp[CHACHA20_BLOCK_SIZE];

-#if DEBUG_RANDOM_BOOT > 0
+#ifdef CONFIG_WARN_UNSEEDED_RANDOM
if (!crng_ready())
printk(KERN_NOTICE "random: %pF get_random_bytes called "
"with crng_init = %d\n", (void *) _RET_IP_, crng_init);
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index e4587ebe52c7..fd5e67bcd46c 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -1209,6 +1209,21 @@ config STACKTRACE
It is also used by various kernel debugging features that require
stack trace generation.

+config WARN_UNSEEDED_RANDOM
+ bool "Warn when kernel uses unseeded randomness"
+ default y
+ help
+ Some parts of the kernel contain bugs relating to their use of
+ cryptographically secure random numbers before it's actually possible
+ to generate those numbers securely. This setting ensures that these
+ flaws don't go unnoticed, by enabling a message, should this ever
+ occur. This will allow people with obscure setups to know when things
+ are going wrong, so that they might contact developers about fixing
+ it.
+
+ Say Y here, unless you simply do not care about using unseeded
+ randomness and do not want a potential warning message in your logs.
+
config DEBUG_KOBJECT
bool "kobject debugging"
depends on DEBUG_KERNEL
--
2.13.0

2017-06-03 02:32:50

by Jason A. Donenfeld

[permalink] [raw]
Subject: [PATCH RFC 2/3] random: add get_random_{bytes,u32,u64,int,long}_wait family

These functions are simple convience wrappers that call
wait_for_random_bytes before calling the respective get_random_*
function.

Signed-off-by: Jason A. Donenfeld <[email protected]>
---
include/linux/random.h | 30 ++++++++++++++++++++++++++++++
1 file changed, 30 insertions(+)

diff --git a/include/linux/random.h b/include/linux/random.h
index 20dd73418bd5..6a19da815ff1 100644
--- a/include/linux/random.h
+++ b/include/linux/random.h
@@ -58,6 +58,36 @@ static inline unsigned long get_random_long(void)
#endif
}

+/* Calls wait_for_random_bytes(is_interruptable, timeout) and then
+ * calls get_random_bytes(buf, nbytes). Returns the result of the
+ * call to wait_for_random_bytes.
+ */
+static inline int get_random_bytes_wait(void *buf, int nbytes,
+ bool is_interruptable, unsigned long timeout)
+{
+ int ret = wait_for_random_bytes(is_interruptable, timeout);
+ if (unlikely(ret))
+ return ret;
+ get_random_bytes(buf, nbytes);
+ return 0;
+}
+
+#define declare_get_random_var_wait(var) \
+ static inline int get_random_ ## var ## _wait(var *out, \
+ bool is_interruptable, unsigned long timeout) { \
+ int ret = wait_for_random_bytes(is_interruptable, timeout); \
+ if (unlikely(ret)) \
+ return ret; \
+ *out = get_random_ ## var(); \
+ return 0; \
+ }
+declare_get_random_var_wait(u32)
+declare_get_random_var_wait(u64)
+declare_get_random_var_wait(int)
+declare_get_random_var_wait(long)
+#undef declare_get_random_var
+
+
unsigned long randomize_page(unsigned long start, unsigned long range);

u32 prandom_u32(void);
--
2.13.0

2017-06-03 05:04:41

by Theodore Ts'o

[permalink] [raw]
Subject: Re: get_random_bytes returns bad randomness before seeding is complete

On Sat, Jun 03, 2017 at 01:58:25AM +0200, Jason A. Donenfeld wrote:
> Hi Ted,
>
> Based on the tone of your last email, before I respond to your
> individual points....

May I gently point out that *your* tone that started this whole thread
has been pretty terrible? Quoting your your first message:

"Yes, yes, you have arguments for why you're keeping this
pathological, but you're still wrong, and this api is still a bug."

This kind of "my shit doesn't stink, but yours does", is not
particularly useful if you are trying to have a constructive
discussion. If you think the /dev/urandom question wasn't appropriate
to discuss in this thread, then why the *hell* did you bring it up
*first*?

The reason why I keep harping on this is because I'm concerned about
an absolutist attitude towards technical design, where the good is the
enemy of the perfect, and where bricking machines in the pursuit of
cryptographic perfection is considered laudable. Yes, if you apply
thermite to the CPU and the hard drives, the system will be secure.
But it also won't be terribly useful.

And to me, the phrase "you're still wrong" doesn't seem to admit any
concessions towards acknowledging that there might be another side to
the security versus usability debate. And I'd ***really*** rather not
waste my time trying to work with someone who doesn't care about
usability, because my focus is in practical engineering, which means
if no one uses the system, or can use the system, then I consider it a
failure, even if it is 100% secure. And this is true no matter
whether we're talking about /dev/urandom or some enhancement to
get_random_bytes().

> Do you have any opinions on the issue of what the most flexible API
> would be? There are a lot of varieties of wait_event. The default one
> is uninterruptable, but as Stephan and Herbert saw when they were
> working on this was that it could be dangerous in certain
> circumstances not to allow interruption. So probably we'd want an
> interruptable variety. But then maybe we want to support the other
> wait_event_* modes too, like killable, timeout, hrtimeout, io, and so
> on. There's a huge list. These are all implemented as macros in
> wait.h, and I don't really want to have a different get_random_bytes_*
> function that corresponds to _each_ of these wait_event_* functions,
> since that'd be overwhelming.

We're going to have to look at a representative sample of the call
sites to figure this out. The simple case is where the call site is
only run in response to a userspace system call. There, blocking
makes perfect sense. I'm just not sure there are many callers of
get_random_ bytes() where this is the case.

There are definitely quite a few callsites of get_random_bytes() where
the caller is in the middle of an interrupt service routine, or is
holding spinlock. There, we are *not* allowed to block. So any kind
of blocking interface isn't going to be allowed; the async callback is
the only thing which is guaranteed to work, in fact. Mercifully, many
of these are cases where prandom_u32() makes sense.

> So I was thinking maybe a simple flag and a timeout param could work:

When would a timeout be useful? If you are using get_random_bytes()
for security reasons, does the security reason go away after 15
seconds? Or even 30 seconds?

> > Or maybe we can then help figure out what percentage of the callsites
> > can be fixed with a synchronous interface, and fix some number of them
> > just to demonstrate that the synchronous interface does work well.
>
> I was planning on doing this to at least a couple callsites in my
> upcoming patch series that adds the synchronous interface. It seems
> like the right way to see if the API is good or not.

This is absolutely the right approach. See my above comments about
why there may not be that many places where a synchronous interface
will work out. And I'm going to be rather surprised if there are
places where having a timeout parameter will be helpful. But I could
be wrong; let's see if we can find cases where we (a) need secure
bytes, (b) are allowed to block, and (c) where a timeout would make
sense.

> Right. In my initial email I proposed a distinction:
>
> - External module loading is usually in a different process context,
> so multiple modules can be attempted to be loaded at the same time. In
> this case, it is probably safe to block in request_module.
> - Built-in modules are loaded ? linearly, from init/main.c, so these
> really can't block each other. For this, I was thinking of introducing
> an rnginit section, to add to the list of things like lateinit. The
> rnginit drivers would be loaded _after_ the RNG has initialized, so
> that they don't get blocked.

... except that we already have an order in which drivers are added,
so moving a driver to the rnginit section might mean that some kernel
module that depends on that driver being loaded also needs to be moved
after rnginit. We already have the following initcall levels:

static char *initcall_level_names[] __initdata = {
"early",
"core",
"postcore",
"arch",
"subsys",
"fs",
"device",
"late",
};

If we move a caller of get_random_bytes() from say, "fs" to after
"rnginit", that may break things.

Also, it is possible that we may have architectures, without
fine-grained clocks, where we don't initialize the rng until after
userspace as sharted running. So it's not clear adding a rnginit
section makes sense. Even if we put it as late as possible --- say,
after "late", what do we do if don't have the CRNG fully
negotiated after the last of the "late" drivers have been run?

- Ted

2017-06-03 12:30:46

by Jason A. Donenfeld

[permalink] [raw]
Subject: Re: get_random_bytes returns bad randomness before seeding is complete

On Sat, Jun 3, 2017 at 7:04 AM, Theodore Ts'o <[email protected]> wrote:
> has been pretty terrible?
> This kind of "my shit doesn't stink, but yours does", is not
> The reason why I keep harping on this is because I'm concerned about
> an absolutist attitude towards technical design, where the good is the

Moving past that, did you see the [PATCH RCF 0/3] series I posted
yesterday? Would be helpful to have your feedback on that approach and
implementation strategy. Since it seems like you're preferring
cleaning up things individually, rather than the systemic rnginit
solution I initially proposed, I moved forward with implementing an
RFC-version of that. I'm pretty sure so quickly compromising and going
with what I perceived you thought was best is a strong indication that
there isn't an, "absolutist attitude towards technical design".
However, if you do somehow find evidence of that kind of claim in my
[PATCH] set, please do bring it up, and I'll try to adjust to be more
pleasing.

> We're going to have to look at a representative sample of the call
> sites to figure this out. The simple case is where the call site is
> only run in response to a userspace system call. There, blocking
> makes perfect sense. I'm just not sure there are many callers of
> get_random_ bytes() where this is the case.

In the patch series I sent earlier, the reason I split things into
wait_for_random_bytes, which just blocks until the pool is ready, and
then the convenience combiner of get_random_bytes_wait, which calls
wait_for_random_bytes and then get_random_bytes, is because I was
thinking there might be a few places where we can't actually sleep
during the get_random_bytes call, due to in_interrupt() or whatever,
but that there's some process-context area that's _always_ called
before get_random_bytes, like a userspace configuration API or an
ioctl, so we could simply put a call to wait_for_random_bytes, and
then be sure that all calls to get_random_bytes after that are safe.

I guess I'll see in practice if this is actually a useful way of doing
it, once I dig in and start modifying representative call sites.

> When would a timeout be useful? If you are using get_random_bytes()
> for security reasons, does the security reason go away after 15
> seconds? Or even 30 seconds?

I was thinking that returning to userspace with -ETIMEDOUT or
something might be more desirable in some odd situations (which ones?)
than just waiting for a signal and responding with
-EINTR/-ERESTARTSYS. That might turn out to be not true, in which
case I guess I won't add that API, as you suggested.

> Also, it is possible that we may have architectures, without
> fine-grained clocks, where we don't initialize the rng until after
> userspace as sharted running. So it's not clear adding a rnginit
> section makes sense. Even if we put it as late as possible --- say,
> after "late", what do we do if don't have the CRNG fully
> negotiated after the last of the "late" drivers have been run?

My idea was that it would be eventually inserted on the callback from
add_random_ready_callback. You're right that this would not be okay
for things like filesystems, but maybe it'd be appropriate for things
like crypto/rng.c? Or, perhaps the blocking API on configuration-time
would be better, anyway, for things like that. You seem wary of this
approach, so I'm going to roll with your suggestions above and see how
they work out. It it pans out great, if not, maybe we'll revisit this
down the road once I have a better picture of what the call sites are
like.

Jason

2017-06-03 21:45:47

by Sandy Harris

[permalink] [raw]
Subject: Re: get_random_bytes returns bad randomness before seeding is complete

Stephan's driver, the HAVEGE system & several others purport to
extract entropy from a series of timer calls. Probably the best
analysis is in the Mcguire et al. paper at
https://static.lwn.net/images/conf/rtlws11/random-hardware.pdf & the
simplest code in my user-space driver at
https://github.com/sandy-harris/maxwell The only kernel-space code I
know of is Stephan's.

If the claim that such calls give entropy is accepted (which I think
it should be) then if we get one bit per call, need 100 or so bits &
space the calls 100 ns apart, loading up a decent chunk of startup
entropy takes about 10,000 ns or 10 microseconds which looks like an
acceptable delay. Can we just do that very early in the boot process?

Of course this will fail on systems with no high-res timer. Are there
still some of those? It might be done in about 1000 times as long on a
system that lacks the realtime library's nanosecond timer but has the
Posix standard microsecond timer, implying a delay time in the
milliseconds. Would that be acceptable in those cases?

2017-06-03 22:54:42

by Jeffrey Walton

[permalink] [raw]
Subject: Re: get_random_bytes returns bad randomness before seeding is complete

On Sat, Jun 3, 2017 at 5:45 PM, Sandy Harris <[email protected]> wrote:
> ...
> Of course this will fail on systems with no high-res timer. Are there
> still some of those? It might be done in about 1000 times as long on a
> system that lacks the realtime library's nanosecond timer but has the
> Posix standard microsecond timer, implying a delay time in the
> milliseconds. Would that be acceptable in those cases?

A significant portion of the use cases should include mobile devices.
Device sales outnumbered desktop and server sales several years ago.

Many devices are sensor rich. Even the low-end ones come with
accelorometers for gaming. A typical one has 3 or 4 sensors, and
higher-end ones have 7 or 8 sensors. An Evo 4G has 7 of them.

There's no wanting for entropy in many of the use cases. The thing
that is lacking seems to be taking advantage of it.

Jeff

2017-06-03 23:55:57

by Daniel Micay

[permalink] [raw]
Subject: Re: [kernel-hardening] Re: get_random_bytes returns bad randomness before seeding is complete

On 3 June 2017 at 18:54, Jeffrey Walton <[email protected]> wrote:
> On Sat, Jun 3, 2017 at 5:45 PM, Sandy Harris <[email protected]> wrote:
>> ...
>> Of course this will fail on systems with no high-res timer. Are there
>> still some of those? It might be done in about 1000 times as long on a
>> system that lacks the realtime library's nanosecond timer but has the
>> Posix standard microsecond timer, implying a delay time in the
>> milliseconds. Would that be acceptable in those cases?
>
> A significant portion of the use cases should include mobile devices.
> Device sales outnumbered desktop and server sales several years ago.
>
> Many devices are sensor rich. Even the low-end ones come with
> accelorometers for gaming. A typical one has 3 or 4 sensors, and
> higher-end ones have 7 or 8 sensors. An Evo 4G has 7 of them.
>
> There's no wanting for entropy in many of the use cases. The thing
> that is lacking seems to be taking advantage of it.
>
> Jeff

Hardware random number generator support is also standard on even
low-end mobile devices. The Linux kernel now knows to feed some of the
entropy from those hardware random generators into the kernel CSPRNG
when the driver is initialized but that doesn't happen until fairly
late in the kernel's boot process. The sensors present the same issue.
They aren't available when the kernel starts needing entropy for
features like SSP and KASLR or other early boot uses, unlike
RDRAND/RDSEED on modern x86_64 CPUs.

For userspace, Android's init system blocks until a certain amount of
entropy is obtained from one for the kernel CSPRNG. It's possible for
there to be no hwrandom but I think that's very rare now since the
standard SoCs used everywhere have it available. The device vendor
would probably need to go out of the way to break it. Android also
regularly saves a persistent random seed and restores it on boot. It
also mixes in entropy from the hardware generator regularly since the
kernel didn't know how to do that before, just like it didn't know how
to grab any initial entropy from the hardware generator.

I don't think it's worth worrying too much about mobile. Slimmer
embedded devices that probably don't even save / restore a seed in
many cases or generate keys on first boot before that helps are the
real issue. At least if you're not focused on KASLR and other early
probabilistic kernel exploit mitigations where there's a lack of a way
to get entropy in early boot right now unless the bootloader helps.

2017-06-04 05:48:31

by Stephan Müller

[permalink] [raw]
Subject: Re: get_random_bytes returns bad randomness before seeding is complete

Am Freitag, 2. Juni 2017, 16:59:56 CEST schrieb Jason A. Donenfeld:

Hi Jason,

>
> Alternatively, I'm open to other solutions people might come up with.

How about stirring in some data from the Jitter RNG that we have in the kernel
already and that is used for the DRBG in case get_random_bytes has
insufficient entropy? Yes, two kernel developers said that this RNG is
useless, where in fact a lot of hardware and even crypto folks say that this
approach has merits.

In any case, it cannot destroy the (not present) entropy at boot time anyway.
Thus, take some 32, 48 or 64 bytes from it right at the start of the kernel,
and we should be better (from the view point of quite some folks) or not worse
off (view point of two developers here).

As this RNG does not depend on any in-kernel facility, it is always available
at any time.

PS: I could revive a patch adding this to random.c that I sent long ago if
desired.

Ciao
Stephan

2017-06-04 05:54:17

by Jeffrey Walton

[permalink] [raw]
Subject: Re: get_random_bytes returns bad randomness before seeding is complete

On Sun, Jun 4, 2017 at 1:48 AM, Stephan Müller <[email protected]> wrote:
> Am Freitag, 2. Juni 2017, 16:59:56 CEST schrieb Jason A. Donenfeld:
>
>> Alternatively, I'm open to other solutions people might come up with.
>
> How about stirring in some data from the Jitter RNG that we have in the kernel
> already and that is used for the DRBG in case get_random_bytes has
> insufficient entropy? Yes, two kernel developers said that this RNG is
> useless, where in fact a lot of hardware and even crypto folks say that this
> approach has merits.

Almost anything has to be better than (1) silent failures, and (2)
draining the little entropy available when the generators are starting
and trying to become operational.

The [negative] use case for (2) is systemd. See, for example,
https://github.com/systemd/systemd/issues/4167.

Jeff

2017-06-04 05:55:35

by Stephan Müller

[permalink] [raw]
Subject: Re: get_random_bytes returns bad randomness before seeding is complete

Am Sonntag, 4. Juni 2017, 00:54:39 CEST schrieb Jeffrey Walton:

Hi Jeffrey,

> On Sat, Jun 3, 2017 at 5:45 PM, Sandy Harris <[email protected]> wrote:
> > ...
> > Of course this will fail on systems with no high-res timer. Are there
> > still some of those? It might be done in about 1000 times as long on a
> > system that lacks the realtime library's nanosecond timer but has the
> > Posix standard microsecond timer, implying a delay time in the
> > milliseconds. Would that be acceptable in those cases?
>
> A significant portion of the use cases should include mobile devices.
> Device sales outnumbered desktop and server sales several years ago.
>
> Many devices are sensor rich. Even the low-end ones come with
> accelorometers for gaming. A typical one has 3 or 4 sensors, and
> higher-end ones have 7 or 8 sensors. An Evo 4G has 7 of them.
>

I think those devices are covered with the kernels 4.8+. That kernel uses
solely interrupts as noise source for the first stage we talk about here.

Not having done any particular measurements with the latest kernels on mobile
devices, but based on my experience with my LRNG assessment, I could fathom
that mobile devices have a fully seeded ChaCha20 DRNG before user space
starts.

Just to give an illustration: I have a Lenovo T540 which receives more than
256 interrupts before late_initcall. On all system with a high-res timer, each
interrupt will give more than one bit of entropy. Conversely, on my MacBook
Pro 2015, at late_initcall the kernel received less than 100 interrupts. In a
KVM guest with very little devices, I also have some 100 interrupts before
late_initcall. These measurements are taken with the same kernel and same
kernel configs.

Ciao
Stephan

2017-06-04 06:23:45

by Stephan Müller

[permalink] [raw]
Subject: Re: get_random_bytes returns bad randomness before seeding is complete

Am Freitag, 2. Juni 2017, 16:59:56 CEST schrieb Jason A. Donenfeld:

Hi Jason,

> Alternatively, I'm open to other solutions people might come up with.

One addition, there is an issue (I would call it a bug) in random.c before 4.8
where the nonblocking_pool is not reseeded during early boot even though
entropy may be available. That issue aggravates early boot time entropy issues
for user and kernel land.

I have not heard about accepting or rejecting it, so I am wondering how
patches go into random.c at all.

[1] https://patchwork.kernel.org/patch/9620431/

Ciao
Stephan