2022-07-28 10:37:54

by Szabolcs Nagy

Subject: Re: [PATCH v6] arc4random: simplify design for better safety

The 07/26/2022 21:58, Jason A. Donenfeld via Libc-alpha wrote:
> Rather than buffering 16 MiB of entropy in userspace (by way of
> chacha20), simply call getrandom() every time.
>
> This approach is doubtlessly slower, for now, but trying to prematurely
> optimize arc4random appears to be leading toward all sorts of nasty
> properties and gotchas. Instead, this patch takes a much more
> conservative approach. The interface is added as a basic loop wrapper
> around getrandom(), and then later, the kernel and libc can work
> together on optimizing that.
>
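
For readers following along, here is a minimal sketch of what "a basic loop wrapper around getrandom()" could look like. It is not the code from the patch: the helper name fill_random is hypothetical, and retrying on EINTR is an assumption about reasonable error handling rather than something stated in the commit message.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/random.h>
#include <sys/types.h>

/* Hypothetical helper (not the glibc implementation): fill BUF with N
   random bytes by calling getrandom(2) until all bytes are delivered.
   Returns 0 on success, -1 on an unrecoverable error.  */
static int
fill_random (void *buf, size_t n)
{
  unsigned char *p = buf;

  while (n > 0)
    {
      /* flags == 0: block until the kernel RNG is initialized.  */
      ssize_t r = getrandom (p, n, 0);
      if (r < 0)
        {
          if (errno == EINTR)
            continue;           /* Interrupted by a signal: retry.  */
          return -1;            /* E.g. ENOSYS on kernels without it.  */
        }
      p += r;                   /* A short read is possible; advance.  */
      n -= (size_t) r;
    }
  return 0;
}
```

With no userspace buffer, every call costs at least one system call, which is the trade-off the commit message accepts.
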
> This prevents numerous issues in which userspace is unaware of when it
> really must throw away its buffer, since we avoid buffering
> altogether. Future improvements may include userspace learning more from
> the kernel about when to do that, which might make these sorts of
> chacha20-based optimizations more possible. The current heuristic of 16
> MiB is meaningless garbage that doesn't correspond to anything the
> kernel might know about. So for now, let's just do something
> conservative that we know is correct and won't lead to cryptographic
> issues for users of this function.
>
> This patch might be considered along the lines of, "optimization is the
> root of all evil," in that the much more complex implementation it
> replaces moves too fast without considering security implications,
> whereas the incremental approach done here is a much safer way of going
> about things. Once this lands, we can take our time in optimizing this
> properly using new interplay between the kernel and userspace.
>
> getrandom(0) is used, since that's the one that ensures the bytes
> returned are cryptographically secure. But on systems without it, we
> fall back to using /dev/urandom. This is unfortunate because it means
> opening a file descriptor, but there's not much of a choice. Secondly,
> as part of the fallback, in order to get more or less the same
> properties of getrandom(0), we poll on /dev/random, and if the poll
> succeeds at least once, then we assume the RNG is initialized. This is a
> rough approximation, as the ancient "non-blocking pool" initialized
> after the "blocking pool", not before, and it may not port back to all
> ancient kernels, though it does to all kernels supported by glibc
> (≥3.2), so generally it's the best approximation we can do.
>
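
A rough sketch of the fallback described above, assuming a system where getrandom() is unavailable: wait until poll() reports /dev/random as readable, then read the bytes from /dev/urandom. The function names wait_for_seed and urandom_fill are hypothetical, and unlike the "at least once" wording in the commit message this sketch polls on every call instead of remembering that the RNG was already seen as initialized.

```c
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

#ifndef O_CLOEXEC
# define O_CLOEXEC 0            /* Portability guard for strict modes.  */
#endif

/* Hypothetical helper: block until /dev/random is readable, used as a
   rough signal that the kernel RNG has been seeded.  */
static int
wait_for_seed (void)
{
  struct pollfd pfd = { .events = POLLIN };
  int r;

  pfd.fd = open ("/dev/random", O_RDONLY | O_CLOEXEC);
  if (pfd.fd < 0)
    return -1;
  do
    r = poll (&pfd, 1, -1);     /* No timeout: wait as long as needed.  */
  while (r < 0 && errno == EINTR);
  close (pfd.fd);
  return (r == 1 && (pfd.revents & POLLIN)) ? 0 : -1;
}

/* Hypothetical helper: fill BUF with N bytes read from /dev/urandom
   once the seed check has passed.  Returns 0 on success, -1 on error.  */
static int
urandom_fill (void *buf, size_t n)
{
  unsigned char *p = buf;
  int fd;

  if (wait_for_seed () < 0)
    return -1;
  fd = open ("/dev/urandom", O_RDONLY | O_CLOEXEC);
  if (fd < 0)
    return -1;
  while (n > 0)
    {
      ssize_t r = read (fd, p, n);
      if (r < 0 && errno == EINTR)
        continue;               /* Interrupted by a signal: retry.  */
      if (r <= 0)
        {
          close (fd);
          return -1;
        }
      p += r;
      n -= (size_t) r;
    }
  close (fd);
  return 0;
}
```

The two open() calls are the file-descriptor cost the commit message calls unfortunate; a process that has run out of descriptors cannot use this path at all.
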
> The motivation for including arc4random, in the first place, is to have
> source-level compatibility with existing code. That means this patch
> doesn't attempt to litigate the interface itself. It does, however,
> choose a conservative approach for implementing it.
>
> Cc: Adhemerval Zanella Netto <[email protected]>
> Cc: Florian Weimer <[email protected]>
> Cc: Cristian Rodríguez <[email protected]>
> Cc: Paul Eggert <[email protected]>
> Cc: Mark Harris <[email protected]>
> Cc: Eric Biggers <[email protected]>
> Cc: [email protected]
> Signed-off-by: Jason A. Donenfeld <[email protected]>

fyi, after this patch i see

FAIL: stdlib/tst-arc4random-thread

with

$ cat stdlib/tst-arc4random-thread.out
info: arc4random: minimum of 1750000 blob results expected
info: arc4random: 1750777 blob results observed
info: arc4random_buf: minimum of 1750000 blob results expected
info: arc4random_buf: 1750000 blob results observed
info: arc4random_uniform: minimum of 1750000 blob results expected
Timed out: killed the child process
Termination time: 2022-07-27T14:41:33.766791947
Last write to standard output: 2022-07-27T14:41:22.522497854

on an arm and aarch64 builder.

running it manually it takes >30s to complete.

> ---
> LICENSES | 23 -
> NEWS | 4 +-
> include/stdlib.h | 3 -
> manual/math.texi | 13 +-
> stdlib/Makefile | 2 -
> stdlib/arc4random.c | 196 ++----
> stdlib/arc4random.h | 48 --
> stdlib/chacha20.c | 191 ------
> stdlib/tst-arc4random-chacha20.c | 167 -----
> sysdeps/aarch64/Makefile | 4 -
> sysdeps/aarch64/chacha20-aarch64.S | 314 ----------
> sysdeps/aarch64/chacha20_arch.h | 40 --
> sysdeps/generic/chacha20_arch.h | 24 -
> sysdeps/generic/not-cancel.h | 3 +
> sysdeps/generic/tls-internal-struct.h | 1 -
> sysdeps/generic/tls-internal.c | 10 -
> sysdeps/mach/hurd/_Fork.c | 2 -
> sysdeps/mach/hurd/not-cancel.h | 4 +
> sysdeps/nptl/_Fork.c | 2 -
> .../powerpc/powerpc64/be/multiarch/Makefile | 4 -
> .../powerpc64/be/multiarch/chacha20-ppc.c | 1 -
> .../powerpc64/be/multiarch/chacha20_arch.h | 42 --
> sysdeps/powerpc/powerpc64/power8/Makefile | 5 -
> .../powerpc/powerpc64/power8/chacha20-ppc.c | 256 --------
> .../powerpc/powerpc64/power8/chacha20_arch.h | 37 --
> sysdeps/s390/s390-64/Makefile | 6 -
> sysdeps/s390/s390-64/chacha20-s390x.S | 573 ------------------
> sysdeps/s390/s390-64/chacha20_arch.h | 45 --
> sysdeps/unix/sysv/linux/not-cancel.h | 8 +-
> sysdeps/unix/sysv/linux/tls-internal.c | 10 -
> sysdeps/unix/sysv/linux/tls-internal.h | 1 -
> sysdeps/x86_64/Makefile | 7 -
> sysdeps/x86_64/chacha20-amd64-avx2.S | 328 ----------
> sysdeps/x86_64/chacha20-amd64-sse2.S | 311 ----------
> sysdeps/x86_64/chacha20_arch.h | 55 --
> 35 files changed, 64 insertions(+), 2676 deletions(-)
> delete mode 100644 stdlib/arc4random.h
> delete mode 100644 stdlib/chacha20.c
> delete mode 100644 stdlib/tst-arc4random-chacha20.c
> delete mode 100644 sysdeps/aarch64/chacha20-aarch64.S
> delete mode 100644 sysdeps/aarch64/chacha20_arch.h
> delete mode 100644 sysdeps/generic/chacha20_arch.h
> delete mode 100644 sysdeps/powerpc/powerpc64/be/multiarch/Makefile
> delete mode 100644 sysdeps/powerpc/powerpc64/be/multiarch/chacha20-ppc.c
> delete mode 100644 sysdeps/powerpc/powerpc64/be/multiarch/chacha20_arch.h
> delete mode 100644 sysdeps/powerpc/powerpc64/power8/chacha20-ppc.c
> delete mode 100644 sysdeps/powerpc/powerpc64/power8/chacha20_arch.h
> delete mode 100644 sysdeps/s390/s390-64/chacha20-s390x.S
> delete mode 100644 sysdeps/s390/s390-64/chacha20_arch.h
> delete mode 100644 sysdeps/x86_64/chacha20-amd64-avx2.S
> delete mode 100644 sysdeps/x86_64/chacha20-amd64-sse2.S
> delete mode 100644 sysdeps/x86_64/chacha20_arch.h


2022-07-28 10:44:15

by Szabolcs Nagy

Subject: Re: [PATCH v6] arc4random: simplify design for better safety

The 07/28/2022 11:29, Szabolcs Nagy via Libc-alpha wrote:
> The 07/26/2022 21:58, Jason A. Donenfeld via Libc-alpha wrote:
...
>
> fyi, after this patch i see
>
> FAIL: stdlib/tst-arc4random-thread
>
> with
>
> $ cat stdlib/tst-arc4random-thread.out
> info: arc4random: minimum of 1750000 blob results expected
> info: arc4random: 1750777 blob results observed
> info: arc4random_buf: minimum of 1750000 blob results expected
> info: arc4random_buf: 1750000 blob results observed
> info: arc4random_uniform: minimum of 1750000 blob results expected
> Timed out: killed the child process
> Termination time: 2022-07-27T14:41:33.766791947
> Last write to standard output: 2022-07-27T14:41:22.522497854
>
> on an arm and aarch64 builder.
>
> running it manually it takes >30s to complete.

note that before the patch it was <5s on the same machine.

2022-07-28 11:05:21

by Adhemerval Zanella

Subject: Re: [PATCH v6] arc4random: simplify design for better safety

On Thu, Jul 28, 2022 at 7:37 AM Szabolcs Nagy via Libc-alpha
<[email protected]> wrote:
>
> The 07/28/2022 11:29, Szabolcs Nagy via Libc-alpha wrote:
> > The 07/26/2022 21:58, Jason A. Donenfeld via Libc-alpha wrote:
> ...
> >
> > fyi, after this patch i see
> >
> > FAIL: stdlib/tst-arc4random-thread
> >
> > with
> >
> > $ cat stdlib/tst-arc4random-thread.out
> > info: arc4random: minimum of 1750000 blob results expected
> > info: arc4random: 1750777 blob results observed
> > info: arc4random_buf: minimum of 1750000 blob results expected
> > info: arc4random_buf: 1750000 blob results observed
> > info: arc4random_uniform: minimum of 1750000 blob results expected
> > Timed out: killed the child process
> > Termination time: 2022-07-27T14:41:33.766791947
> > Last write to standard output: 2022-07-27T14:41:22.522497854
> >
> > on an arm and aarch64 builder.
> >
> > running it manually it takes >30s to complete.
>
> note that before the patch it was <5s on the same machine.

Yep, we need to tune down the internal test parameters [1].

[1] https://patchwork.sourceware.org/project/glibc/patch/[email protected]/