2022-05-11 14:44:28

by Simo Sorce

Subject: Re: [PATCH 2/2] random: add fork_event sysctl for polling VM forks

Hi Jason,

On Wed, 2022-05-11 at 03:18 +0200, Jason A. Donenfeld wrote:
> My proposal here is made with nonce reuse in mind, for things like
> session keys that use sequential nonces.

Although this makes sense, the problem is that changing applications to
do the right thing based on which situation they are in will never be
done right or soon enough. So I would focus on a solution that makes
the CSPRNGs in crypto libraries safe.

> A different issue is random nonces. For these, it seems like a call to
> getrandom() for each nonce is probably the best bet. But it sounds like
> you're interested in a userspace RNG, akin to OpenBSD's arc4random(3). I
> hope you saw these threads:
>
> - https://lore.kernel.org/lkml/[email protected]/
> - https://lore.kernel.org/lkml/[email protected]/
> - https://lore.kernel.org/lkml/CAHmME9qHGSF8w3DoyCP+ud_N0MAJ5_8zsUWx=rxQB1mFnGcu9w@mail.gmail.com/

4c does sound like a decent solution. It is semantically identical to
an epoch vmgenid: all the library needs to do is create such an mmap
region, store a nonzero value in it, and verify that the value is still
nonzero after computing the next random value but before returning it
to the caller.
This reduces the race to a very small window: the machine is frozen
right after the random value is returned to the caller but before it is
used. Hopefully that just means the two machines make parallel
computations that yield the exact same value, so no catastrophic
consequence arises. (There is the odd case where two random values are
sought, the split happens between the two retrievals, and this has bad
consequences; I think we can ignore that.)
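
As a rough illustration of the check described above, here is a minimal
sketch in C. It rests on loud assumptions: no VM-fork wiping primitive
exists for userspace today, so the sketch uses Linux's MADV_WIPEONFORK
(which zeroes the page on process fork(), not on VM clone) purely as a
stand-in for the "4c" region. The names toy_rng, toy_rng_init,
toy_rng_next and toy_reseed are hypothetical, and the generator is a
splitmix64 step, not a real CSPRNG.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/random.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical library-side RNG state, for illustration only. */
struct toy_rng {
    volatile uint32_t *marker; /* lives in the wipe-on-fork page */
    uint64_t state;            /* toy generator state, NOT a real CSPRNG */
    unsigned reseeds;          /* how many times we reseeded */
};

/* Pull a fresh seed from the kernel and re-arm the marker. */
static int toy_reseed(struct toy_rng *r)
{
    uint64_t seed;

    if (getrandom(&seed, sizeof(seed), 0) != (ssize_t)sizeof(seed))
        return -1;
    r->state = seed;
    r->reseeds++;
    *r->marker = 1; /* any nonzero value works */
    return 0;
}

/* Map one page, ask for it to be zeroed on fork(), and seed the RNG. */
int toy_rng_init(struct toy_rng *r)
{
    size_t pagesz = (size_t)sysconf(_SC_PAGESIZE);
    void *p = mmap(NULL, pagesz, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    if (p == MAP_FAILED)
        return -1;
#ifdef MADV_WIPEONFORK
    /* Stand-in only: this wipes on process fork, not on VM fork. */
    if (madvise(p, pagesz, MADV_WIPEONFORK) != 0) {
        munmap(p, pagesz);
        return -1;
    }
#else
    munmap(p, pagesz);
    return -1;
#endif
    r->marker = p;
    r->reseeds = 0;
    return toy_reseed(r);
}

/* Compute the next value, then check the marker BEFORE returning it. */
int toy_rng_next(struct toy_rng *r, uint64_t *out)
{
    for (;;) {
        uint64_t z = (r->state += 0x9E3779B97F4A7C15ULL);

        z = (z ^ (z >> 30)) * 0xBF58476D1CE4E5B9ULL;
        z = (z ^ (z >> 27)) * 0x94D049BB133111EBULL;
        z ^= z >> 31;

        if (*r->marker != 0) { /* page not wiped: state is still ours */
            *out = z;
            return 0;
        }
        /* Page was wiped: discard z, reseed, and compute again. */
        if (toy_reseed(r) != 0)
            return -1;
    }
}
```

The remaining race is the one mentioned above: a clone taken after
toy_rng_next() returns but before the caller uses the value is not
caught by this check.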

> Each one of those touches on vDSO things quite a bit. Basically, the
> motivation for doing that is for making userspace RNGs safe and
> promoting their use with a variety of kernel enhancements to make that
> easy. And IF we are to ship a vDSO RNG, then certainly this vmgenid
> business should be exposed that way, over and above other mechanisms.
> It'd make the most sense...IF we're going to ship a vDSO RNG.
>
> So the question really is: should we ship a vDSO RNG? I could work on
> designing that right. But I'm a little bit skeptical generally of the
> whole userspace RNG concept. By and large they always turn out to be
> less safe and more complex than the kernel one. So if we're to go that
> way, I'd like to understand what the strongest arguments for it are.

I am not entirely sure how a vDSO RNG would work. I think exposing the
epoch (or whatever indicator) is enough: crypto libraries have pretty
good PRNGs, and what they require is simply a good source of entropy
for the initial seeding plus this safety mechanism to avoid state
duplication on machine cloning.
All the decent libraries already support detecting process forks.

Simo.

--
Simo Sorce
RHEL Crypto Team
Red Hat, Inc

2022-05-11 15:03:51

by Jason A. Donenfeld

Subject: Re: [PATCH 2/2] random: add fork_event sysctl for polling VM forks

Hi Simo,

On Wed, May 11, 2022 at 08:59:07AM -0400, Simo Sorce wrote:
> Hi Jason,
>
> On Wed, 2022-05-11 at 03:18 +0200, Jason A. Donenfeld wrote:
> > My proposal here is made with nonce reuse in mind, for things like
> > session keys that use sequential nonces.
>
> Although this makes sense the problem is that changing applications to
> do the right thing based on which situation they are in will never be
> done right or soon enough. So I would focus on a solution that makes
> the CSPRNGs in crypto libraries safe.

Please don't dismiss this. I realize you have your one single use case
in mind, but there are others, and the distinction you gave for why we
should dismiss the others to focus on yours doesn't really make any
sense. Here's why:

In my email I pointed out two places where VM forks impact crypto in bad
ways:

- Session keys, wrt nonce reuse.

- Random nonces, wrt nonce reuse.

There are other problems that arise from VM forks too. But these stand
out because they are both quite catastrophic, whether it's duplicated
ECDSA random nonces, or whether it's the same session key used with the
same sequential counter to encrypt different plaintexts with something
like AES-GCM or ChaCha20Poly1305. These are both very, very bad things.
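
To make the second case concrete, here is a small sketch in C of why
reusing a (key, nonce) pair is so damaging for stream-style encryption.
It is an illustration under stated assumptions, not real cryptography:
toy_keystream_byte is a made-up mixing function standing in for the
keystream of a real cipher, and all names are hypothetical. The only
property it shares with AES-GCM or ChaCha20Poly1305 is the relevant
one: the keystream depends only on the key, the nonce, and the
position.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy keystream: deterministic in (key, nonce, position). NOT a cipher. */
uint8_t toy_keystream_byte(uint64_t key, uint64_t nonce, size_t i)
{
    uint64_t x = key ^ (nonce * 0x9E3779B97F4A7C15ULL) ^ (uint64_t)i;

    x ^= x >> 33;
    x *= 0xFF51AFD7ED558CCDULL;
    x ^= x >> 33;
    return (uint8_t)x;
}

/* XOR the input with the keystream, as a counter-mode cipher would. */
void toy_encrypt(uint64_t key, uint64_t nonce,
                 const uint8_t *in, uint8_t *out, size_t len)
{
    for (size_t i = 0; i < len; i++)
        out[i] = in[i] ^ toy_keystream_byte(key, nonce, i);
}

/*
 * Two clones of one VM encrypt different plaintexts with the same key
 * and the same sequential nonce. Returns 1 if XORing the ciphertexts
 * yields exactly the XOR of the plaintexts (the keystream cancels).
 */
int reuse_leaks_plaintext_xor(void)
{
    const uint8_t p1[8] = { 'p', 'a', 'y', ' ', 'a', 'l', 'i', 'c' };
    const uint8_t p2[8] = { 'p', 'a', 'y', ' ', 'b', 'o', 'b', '!' };
    uint8_t c1[8], c2[8];
    const uint64_t key = 0x0123456789ABCDEFULL;
    const uint64_t nonce = 7; /* same counter value in both clones */

    toy_encrypt(key, nonce, p1, c1, sizeof(p1));
    toy_encrypt(key, nonce, p2, c2, sizeof(p2));

    for (size_t i = 0; i < sizeof(p1); i++)
        if ((uint8_t)(c1[i] ^ c2[i]) != (uint8_t)(p1[i] ^ p2[i]))
            return 0;
    return 1;
}
```

An observer who sees both ciphertexts learns the XOR of the two
plaintexts without ever touching the key. With real AEADs the damage
goes further than this sketch shows, since authentication can also
break under nonce reuse.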

And both things happen in:

- Libraries: crypto lib random number generators (e.g. OpenSSL), crypto
lib session keys (e.g. any TLS library).

- Applications: application level random number generators (e.g.
Bitcoin Core *facepalm*), application level session keys (e.g.
OpenSSH).

So I don't think the "library vs application" distinction is really
meaningful here. Rather, things kind of fall apart all over the place
for a variety of reasons on VM fork.

> > - https://lore.kernel.org/lkml/[email protected]/
> > - https://lore.kernel.org/lkml/[email protected]/
> > - https://lore.kernel.org/lkml/CAHmME9qHGSF8w3DoyCP+ud_N0MAJ5_8zsUWx=rxQB1mFnGcu9w@mail.gmail.com/
>
> 4c does sound like a decent solution, it is semantically identical to

It does, yeah, but realistically it's never going to happen. I don't
think there's a near- or medium-term chance of changing hypervisor
semantics again. That means for 4-like solutions, there's 4a and 4b.

By the way, that email of mine has an inaccuracy in it. I complain about
being in irq context, but it turns out not to be the case; we're inside
of a kthread during the notification, which means we have a lot more
options on what we can do.

If 4 is the solution that appeals to you most, do you want to try your
hand at an RFC patch for it? I don't yet know if that's the best
direction to take, but the devil is kind of in the details, so it might
be interesting to see how it pans out.

Jason

2022-05-11 15:25:43

by Alexander Graf

Subject: Re: [PATCH 2/2] random: add fork_event sysctl for polling VM forks

Hi Simo,

On 11.05.22 14:59, Simo Sorce wrote:
> Hi Jason,
>
> On Wed, 2022-05-11 at 03:18 +0200, Jason A. Donenfeld wrote:
>> My proposal here is made with nonce reuse in mind, for things like
>> session keys that use sequential nonces.
> Although this makes sense the problem is that changing applications to
> do the right thing based on which situation they are in will never be
> done right or soon enough. So I would focus on a solution that makes
> the CSPRNGs in crypto libraries safe.


I think we intrinsically have two sets of applications: ones that want
an event-based notification and don't care about its raciness, and ones
that want an atomic way to determine the epoch. Userspace RNGs are
naturally in the second category.
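
A minimal sketch in C of what the second category might look like from
the consumer side, under loud assumptions: the epoch word here is just
a C11 atomic that the caller passes in, standing in for whatever
mapping or interface would eventually expose the VM generation; no such
interface is implied to exist. The names session and
session_next_nonce are hypothetical.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical per-session state that uses sequential nonces. */
struct session {
    uint64_t seen_epoch; /* epoch observed when the key was derived */
    uint64_t next_nonce; /* sequential nonce counter */
    unsigned rekeys;     /* number of rekey events, for illustration */
};

/*
 * Return the nonce to use for the next message. If the epoch moved
 * since the key was derived, treat the session key as compromised by
 * duplication: restart the counter and count a rekey. A real
 * implementation would derive a fresh key at that point.
 */
uint64_t session_next_nonce(struct session *s, _Atomic uint64_t *epoch)
{
    uint64_t now = atomic_load_explicit(epoch, memory_order_acquire);

    if (now != s->seen_epoch) {
        s->seen_epoch = now;
        s->next_nonce = 0;
        s->rekeys++;
    }
    return s->next_nonce++;
}
```

The check is a single load on the hot path, which is what makes it
usable per operation. It still leaves a window between the load and the
actual use of the nonce.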


>
>> A different issue is random nonces. For these, it seems like a call to
>> getrandom() for each nonce is probably the best bet. But it sounds like
>> you're interested in a userspace RNG, akin to OpenBSD's arc4random(3). I
>> hope you saw these threads:
>>
>> - https://lore.kernel.org/lkml/[email protected]/
>> - https://lore.kernel.org/lkml/[email protected]/
>> - https://lore.kernel.org/lkml/CAHmME9qHGSF8w3DoyCP+ud_N0MAJ5_8zsUWx=rxQB1mFnGcu9w@mail.gmail.com/
> 4c does sound like a decent solution, it is semantically identical to
> an epoch vmgenid, all the library needs to do is to create such a mmap
> region, stick a value on it, verify it is not zero after computing the
> next random value but before returning it to the caller.
> This reduces the race to a very small window when the machine is frozen
> right after the random value is returned to the caller but before it is
> used, but hopefully this just means that the two machines will just
> make parallel computations that yield the exact same value, so no
> catastrophic consequence will arise (there is the odd case where two
> random values are sought and the split happens between the two are
> retrieved and this has bad consequences, I think we can ignore that).


The problem with wiping memory on clone is that it means you must keep
these special wipe-on-clone pages always present and pinned in memory,
and can no longer swap them out, compact them, or move them for memory
hotplug.

We started the journey with a WIPEONSUSPEND flag and eventually
abandoned it because it seemed clunky. Happy to reopen it if we believe
it's a viable path:

  https://lwn.net/Articles/825230/


Alex