2019-09-17 20:29:39

by Martin Steigerwald

Subject: Re: Linux 5.3-rc8

Linus Torvalds - 17.09.19, 20:01:23 CEST:
> > We can make boot hang in "sane", discoverable way.
>
> That is certainly a huge advantage, yes. Right now I suspect that what
> has happened is that this has probably been going on as some
> low-level background noise for a while, and people either figured it
> out and switched away from gdm (example: Christoph), or more likely
> some unexplained boot problems that people just didn't chase down. So
> it took basically a random happenstance to make this a kernel issue.
>
> But "easily discoverable" would be good.

Well I meanwhile remembered how it was with sddm:

Without CPU assistance (RDRAND) or haveged or any other source of
entropy, sddm would simply not appear and I'd see the tty1 login. Then
I started to type something and after a while sddm popped up. If I did
not type anything, it easily took at least half a minute until it appeared.

Actually I used my system like this for quite a while, because I did not feel
comfortable with haveged and RDRAND.

AFAIR this was when this Debian system still ran with Systemd. What the Debian
maintainer for sddm did was this:

sddm (0.18.0-1) unstable; urgency=medium
[…]
[ Maximiliano Curia ]
* Workaround entropy starvation by recommending haveged
* Release to unstable

-- Maximiliano Curia […] Sun, 22 Jul 2018 13:26:44 +0200

With Sysvinit I still have neither haveged nor RDRAND enabled, but the
behavior changed a bit. crng init still takes a while:

% zgrep -h "crng init" /var/log/kern.log*
Sep 16 09:06:23 merkaba kernel: [ 16.910096][ C3] random: crng init done
Sep 8 14:08:39 merkaba kernel: [ 16.682014][ C2] random: crng init done
Sep 9 09:16:43 merkaba kernel: [ 46.084188][ C2] random: crng init done
Sep 11 10:52:37 merkaba kernel: [ 47.209825][ C3] random: crng init done
Sep 12 08:32:08 merkaba kernel: [ 76.624375][ C3] random: crng init done
Sep 12 20:07:29 merkaba kernel: [ 10.726349][ C2] random: crng init done
Sep 8 10:02:42 merkaba kernel: [ 37.391577][ C2] random: crng init done
Aug 26 09:23:51 merkaba kernel: [ 40.555337][ C3] random: crng init done
Aug 28 09:45:28 merkaba kernel: [ 39.446847][ C1] random: crng init done
Aug 20 10:14:59 merkaba kernel: [ 12.242467][ C1] random: crng init done

and there might be a slight delay before sddm appears, before tty has been
initialized. I am not completely sure whether it is related to sddm or
something else. But AFAIR delays have been in the range of a maximum of
5-10 seconds, so I did not bother to check more closely.

Note this is on a ThinkPad T520 which is a PC. And if I read the above kernel log
excerpts right, it can still take up to 76 seconds for crng to be initialized with
entropy. Would be interesting to see other people's numbers there.
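
A minimal sketch, assuming Python 3 and kern.log lines in the format shown above, of how such numbers could be collected and compared. The helper name `crng_init_times` and the regular expression are illustrative assumptions; the three sample lines are copied from the excerpt above.

```python
import re

# Matches the kernel timestamp in brackets on a "crng init done" line.
# Anything between the timestamp and the message (e.g. "[ C3]") is skipped.
CRNG_DONE = re.compile(r"\[\s*(\d+\.\d+)\].*random: crng init done")


def crng_init_times(lines):
    """Return the seconds-since-boot value of every 'crng init done' line."""
    times = []
    for line in lines:
        match = CRNG_DONE.search(line)
        if match:
            times.append(float(match.group(1)))
    return times


# Sample lines taken from the log excerpt above.
sample = [
    "Sep 16 09:06:23 merkaba kernel: [ 16.910096][ C3] random: crng init done",
    "Sep 12 08:32:08 merkaba kernel: [ 76.624375][ C3] random: crng init done",
    "Sep 12 20:07:29 merkaba kernel: [ 10.726349][ C2] random: crng init done",
]

times = crng_init_times(sample)
print("fastest:", min(times), "slowest:", max(times))
```

For real logs the lines would come from the zgrep output instead of the hard-coded sample.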

There might be a different ordering with Sysvinit and it may still be sddm.
But I have never seen a delay of 76 seconds AFAIR… so something else
might be different or I just did not notice the delay. Sometimes I switch
on the laptop and do something else to come back in a minute or so.

I don't have any kernel logs old enough to see whether crng init
times have been different with Systemd due to asking for randomness for
UUID/hashmaps.

Thanks,
--
Martin



2019-09-17 20:53:12

by Ahmed S. Darwish

Subject: Re: Linux 5.3-rc8

On Tue, Sep 17, 2019 at 10:28:47PM +0200, Martin Steigerwald wrote:
[...]
>
> I don't have any kernel logs old enough to see whether crng init
> times have been different with Systemd due to asking for randomness for
> UUID/hashmaps.
>

Please stop claiming this. It has been pointed out to you, __multiple
times__, that this makes no difference. For example:

https://lkml.kernel.org/r/[email protected]

No. getrandom(2) uses the new CRNG, which is either initialized,
or it's not ... So to the extent that systemd has made systems
boot faster, you could call that systemd's "fault".

You've claimed this like 3 times before in this thread already, and
multiple people replied with the same response. If you don't get the
paragraph above, then please don't continue replying further on this
thread.

thanks,

--
Ahmed Darwish
http://darwish.chasingpointers.com

2019-09-18 00:36:17

by Martin Steigerwald

Subject: Re: Linux 5.3-rc8

Ahmed S. Darwish - 17.09.19, 22:52:34 CEST:
> On Tue, Sep 17, 2019 at 10:28:47PM +0200, Martin Steigerwald wrote:
> [...]
>
> > I don't have any kernel logs old enough to see whether crng
> > init times have been different with Systemd due to asking for
> > randomness for UUID/hashmaps.
>
> Please stop claiming this. It has been pointed out to you, __multiple
> times__, that this makes no difference. For example:
>
> https://lkml.kernel.org/r/[email protected]
>
> No. getrandom(2) uses the new CRNG, which is either initialized,
> or it's not ... So to the extent that systemd has made systems
> boot faster, you could call that systemd's "fault".
>
> You've claimed this like 3 times before in this thread already, and
> multiple people replied with the same response. If you don't get the
> paragraph above, then please don't continue replying further on this
> thread.

First off, this mail you referenced has not been an answer to a mail of
mine. It does not have my mail address in Cc. So no, it has not been
pointed out directly to me in that mail.

Secondly: Pardon me, but I do not see how asking for entropy early at
boot times or not doing so has *no effect* on the available entropy¹. And
I do not see the above mail actually saying this. To my knowledge
Sysvinit does not need entropy for itself². The above mail merely talks
about the blocking on boot. And whether systemd-random-seed would drain
entropy, not whether hashmaps/UUID do. And also not the effect that
asking for entropy early has on the available entropy and on the
*initial* initialization time of the new CRNG. However I did not claim
that Systemd would block booting. *Not at all*.

Thirdly: I disagree with the tone you use in your mail. And for that
alone I feel it may be better for me to let go of this discussion.

My understanding of entropy always has been that only a certain amount
of it can be produced in a certain amount of time. If that is wrong…
please by all means, please teach me, how it would be.

However I am not even claiming anything. All I wrote above is that I do
not have any measurements. But I'd expect that the more entropy is asked
for early during boot, the longer the initial initialization of the new
CRNG will take. And if someone else relies on this initialization, that
something else would block for a longer time.

I got that the new crng won't block anymore after that.

[1] https://github.com/systemd/systemd/issues/4167

(I know that it is still with /dev/urandom, so if it is using RDRAND now,
this may indeed be different, but would it then deplete entropy the CPU
has available and that by default is fed into the Linux crng as well
(even without trusting it completely)?)

[2] According to

https://daniel-lange.com/archives/152-Openssh-taking-minutes-to-become-available,-booting-takes-half-an-hour-...-because-your-server-waits-for-a-few-bytes-of-randomness.html

sysvinit does not contain a single line of code about entropy or random
numbers.

Daniel even updated his blog post with a hint to this discussion.

Thanks,
--
Martin


2019-09-18 00:36:59

by Matthew Garrett

Subject: Re: Linux 5.3-rc8

On Tue, Sep 17, 2019 at 11:38:33PM +0200, Martin Steigerwald wrote:

> My understanding of entropy always has been that only a certain amount
> of it can be produced in a certain amount of time. If that is wrong…
> please by all means, please teach me, how it would be.

getrandom() will never "consume entropy" in a way that will block any
users of getrandom(). If you don't have enough collected entropy to seed
the rng, getrandom() will block. If you do, getrandom() will generate as
many numbers as you ask it to, even if no more entropy is ever collected
by the system. So it doesn't matter how many clients you have calling
getrandom() in the boot process - either there'll be enough entropy
available to satisfy all of them, or there'll be too little to satisfy
any of them.
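
A minimal sketch of this "seeded or not seeded" behavior, assuming Python 3.6+ on Linux, where `os.getrandom()` wraps getrandom(2). With `os.GRND_NONBLOCK` the call raises `BlockingIOError` instead of blocking while the CRNG is still unseeded. The helper name `try_getrandom` is illustrative only.

```python
import os


def try_getrandom(nbytes=16):
    """Return nbytes from getrandom(2), or None if the CRNG is not seeded yet.

    GRND_NONBLOCK turns the "would block" case into BlockingIOError, so the
    caller can see which of the two states the system is in.
    """
    try:
        return os.getrandom(nbytes, os.GRND_NONBLOCK)
    except BlockingIOError:
        return None


# Once the CRNG is seeded, repeated calls keep succeeding; earlier callers
# do not use anything up for later ones.
results = [try_getrandom() for _ in range(4)]
for chunk in results:
    print("not seeded yet" if chunk is None else chunk.hex())
```

On a system that has finished booting, every call is expected to return data; the `None` branch only matters very early during boot.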

--
Matthew Garrett | [email protected]

2019-09-18 00:39:01

by Martin Steigerwald

Subject: Re: Linux 5.3-rc8

Matthew Garrett - 17.09.19, 23:52:00 CEST:
> On Tue, Sep 17, 2019 at 11:38:33PM +0200, Martin Steigerwald wrote:
> > My understanding of entropy always has been that only a certain
> > amount of it can be produced in a certain amount of time. If that
> > is wrong… please by all means, please teach me, how it would be.
>
> getrandom() will never "consume entropy" in a way that will block any
> users of getrandom(). If you don't have enough collected entropy to
> seed the rng, getrandom() will block. If you do, getrandom() will
> generate as many numbers as you ask it to, even if no more entropy is
> ever collected by the system. So it doesn't matter how many clients
> you have calling getrandom() in the boot process - either there'll be
> enough entropy available to satisfy all of them, or there'll be too
> little to satisfy any of them.

Right, but then Systemd would not use getrandom() for initial hashmap/
UUID stuff since it

1) would block boot very early then, which is not desirable and

2) it does not need strong random numbers anyway.

At least that is how I understood Lennart's comments on the Systemd bug
report I referenced.

AFAIK hashmap/UUID stuff uses *some* entropy *before* crng has been
seeded with entropy and all I wondered was whether this using *some*
entropy *before* crng has been seeded – by /dev/urandom initially, but
now as far as I got with RDRAND if available – will delay the process of
gathering the entropy necessary to seed crng… if that is the case then
anything that uses crng during or soon after boot, like gdm, sddm,
OpenSSH ssh-keygen will be blocked for a longer time until the initial
seeding of crng has been done.

Of course if hashmap/UUID stuff does not use any entropy that would be
required for the *initial* seeding of crng, then… that would not be the
case. But from what I understood, it does.

And yes, for "systemd-random-seed" it is true that it does not drain
entropy for getrandom, because it writes the seed to disk *after* crng has
been initialized, i.e. at a time where getrandom would never block again
as long as the system is running.

If I am still completely misunderstanding something there, then it may
be better to go to sleep. Which I will do now anyway.

Or I may just not be very good at explaining what I mean.

--
Martin


2019-09-18 00:48:29

by Linus Torvalds

Subject: Re: Linux 5.3-rc8

On Tue, Sep 17, 2019 at 2:52 PM Matthew Garrett <[email protected]> wrote:
>
> getrandom() will never "consume entropy" in a way that will block any
> users of getrandom().

Yes, this is true for any common and sane use.

And by that I just mean that we do have GRND_RANDOM, which currently
does exactly that entropy consumption.

But it only consumes it for other GRND_RANDOM users - of which there
are approximately zero, because nobody wants that rat's nest.

Linus

2019-09-18 13:43:33

by Lennart Poettering

Subject: Re: Linux 5.3-rc8

On Di, 17.09.19 23:38, Martin Steigerwald ([email protected]) wrote:

> (I know that it is still with /dev/urandom, so if it is using RDRAND now,
> this may indeed be different, but would it then deplete entropy the CPU
> has available and that by default is fed into the Linux crng as well
> (even without trusting it completely)?)

Neither RDRAND nor /dev/urandom know a concept of "depleting
entropy". That concept does not exist for them. It does exist for
/dev/random, but only crazy people use that. systemd does not.

Lennart

--
Lennart Poettering, Berlin

2019-09-18 13:55:04

by Lennart Poettering

Subject: Re: Linux 5.3-rc8

On Mi, 18.09.19 00:10, Martin Steigerwald ([email protected]) wrote:

> > getrandom() will never "consume entropy" in a way that will block any
> > users of getrandom(). If you don't have enough collected entropy to
> > seed the rng, getrandom() will block. If you do, getrandom() will
> > generate as many numbers as you ask it to, even if no more entropy is
> > ever collected by the system. So it doesn't matter how many clients
> > you have calling getrandom() in the boot process - either there'll be
> > enough entropy available to satisfy all of them, or there'll be too
> > little to satisfy any of them.
>
> Right, but then Systemd would not use getrandom() for initial hashmap/
> UUID stuff since it

Actually things are more complex. In systemd there are four classes of
random values we need:

1. High "cryptographic" quality. There are very few needs for this in
systemd, as we do very little in this area. It's basically only
used for generating salt values for hashed passwords, in the
systemd-firstboot component, which can be used to set the root
pw. systemd uses synchronous getrandom() for this. It does not use
RDRAND for this.

2. High "non-cryptographic" quality. This is used for example for
generating type 4 uuids, i.e. uuids that are supposed to be globally
unique, but aren't key material. We use RDRAND for this if
available, falling back to synchronous getrandom(). Type 4 UUIDs
are frequently needed by systemd, as we assign a uuid to each
service invocation implicitly, so that people can match logging
data and such to a specific instance and runtime of a service.

3. Medium quality. This is used for seeding hash tables. These may be
crap initially, but should not be guessable in the long
run. /dev/urandom would be perfect for this, but the mentioned log
message sucks, hence we use RDRAND for this if available, and fall
back to /dev/urandom if that isn't available, accepting the log
message.

4. Crap quality. There are only a few uses of this, where rand_r()
is OK.

Of these four cases, the first two might block boot. Because the first
case is not common you won't see blocking that often though for
them. The second case is very common, but since we use RDRAND you
won't see it on any recent Intel machines.

Or to say this all differently: the hash table seeding and the uuid
case are two distinct cases in systemd, and I am sure they should be.
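
As an illustration of the second class only, a minimal sketch of building a type 4 UUID from 16 random bytes, assuming Python 3. This is not systemd's implementation: `os.urandom()` stands in for the RDRAND-then-getrandom() source described above, and the helper name `random_uuid4` is hypothetical.

```python
import os
import uuid


def random_uuid4(random_bytes=None):
    """Build a type 4 (random) UUID from 16 bytes of randomness.

    os.urandom() is used as a stand-in source here; passing version=4 makes
    uuid.UUID overwrite the version and variant bits as RFC 4122 requires.
    """
    raw = os.urandom(16) if random_bytes is None else random_bytes
    return uuid.UUID(bytes=raw, version=4)


invocation_id = random_uuid4()
print(invocation_id, "version", invocation_id.version)

# With all-zero input only the fixed version and variant bits remain set.
fixed = random_uuid4(bytes(16))
print(fixed)
```

Only 122 of the 128 bits come from the random source; the remaining six are the fixed version and variant fields.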

Lennart

--
Lennart Poettering, Berlin

2019-09-19 07:42:47

by Martin Steigerwald

Subject: Re: Linux 5.3-rc8

Dear Lennart.

Lennart Poettering - 18.09.19, 15:53:25 CEST:
> On Mi, 18.09.19 00:10, Martin Steigerwald ([email protected]) wrote:
> > > getrandom() will never "consume entropy" in a way that will block any
> > > users of getrandom(). If you don't have enough collected entropy to
> > > seed the rng, getrandom() will block. If you do, getrandom() will
> > > generate as many numbers as you ask it to, even if no more entropy is
> > > ever collected by the system. So it doesn't matter how many clients
> > > you have calling getrandom() in the boot process - either there'll be
> > > enough entropy available to satisfy all of them, or there'll be too
> > > little to satisfy any of them.
> >
> > Right, but then Systemd would not use getrandom() for initial
> > hashmap/ UUID stuff since it
>
> Actually things are more complex. In systemd there are four classes of
> random values we need:
>
> 1. High "cryptographic" quality. There are very few needs for this in
[…]
> 2. High "non-cryptographic" quality. This is used for example for
[…]
> 3. Medium quality. This is used for seeding hash tables. These may be
[…]
> 4. Crap quality. There are only a few uses of this, where rand_r()
> is OK.
>
> Of these four cases, the first two might block boot. Because the first
> case is not common you won't see blocking that often though for
> them. The second case is very common, but since we use RDRAND you
> won't see it on any recent Intel machines.
>
> Or to say this all differently: the hash table seeding and the uuid
> case are two distinct cases in systemd, and I am sure they should be.

Thank you very much for your summary of uses of random numbers in
Systemd and also for your other mail that "neither RDRAND nor
/dev/urandom know a concept of "depleting entropy"". I thought they would
deplete entropy needed for the initial seeding of crng.

Thank you also for taking part in this discussion, even if someone put
your mail address on carbon copy without asking.

I do not claim I understand enough of this random number stuff. But I
feel it's important that kernel and userspace developers actually talk
with each other about a sane approach for it. And I believe that the
complexity involved is part of the issue. I feel an API for obtaining
random numbers with different quality levels needs to be much, much, much
simpler to use *properly*.

I felt a bit overwhelmed by the discussion (and by what else is
happening in my life, just having come back from holding a Linux
performance workshop in front of about two dozen people), so I intend to
step back from it.

If one of my mails actually helped to encourage or facilitate kernel
space and user space developers talking with each other about a sane
approach to random numbers, then I may have used my soft skills in a way
that brings some benefit. For the technical aspects certainly people are
taking part in this discussion who are much much deeper into the
intricacies of entropy in Linux and computers in general, so I just hope
for a good outcome.

Best,
--
Martin