Subject: Re: [PATCH 1/3] getrandom.2: new manpage

Hi Heinrich,

On Fri, Oct 3, 2014 at 2:15 AM, Heinrich Schuchardt <[email protected]> wrote:
> Kernel 3.17 introduces a new system call getrandom(2).
>
> The man page in this patch is based on the commit message by
> Theodore Ts'o <[email protected]> and suggestion by
> Michael Kerrisk <[email protected]>.

No word from Ted so far...

I've added LKML to CC, because I think it's worth getting wider review
of the page at this point.

I've done some further editing on the page, and pushed the current
version in public branch:
http://git.kernel.org/cgit/docs/man-pages/man-pages.git/log/?h=draft_getrandom

Could you please review the current version, appended below. Note that
there are a couple of FIXMEs there. Also, could you pay special
attention to the changes in these commits, to make sure I have not
injected any errors:

pick 0ef180e getrandom.2: Add a sentence to clarify the default behavior...
pick 62342ef getrandom.2: Reword GRND_NONBLOCK description
pick 0c90d3d getrandom.2: Reword GRND_RANDOM description

Cheers,

Michael

.\" Copyright (C) 2014, Theodore Ts'o <[email protected]>
.\" Copyright (C) 2014, Heinrich Schuchardt <[email protected]>
.\"
.\" %%%LICENSE_START(VERBATIM)
.\" Permission is granted to make and distribute verbatim copies of this
.\" manual provided the copyright notice and this permission notice are
.\" preserved on all copies.
.\"
.\" Permission is granted to copy and distribute modified versions of
.\" this manual under the conditions for verbatim copying, provided that
.\" the entire resulting derived work is distributed under the terms of
.\" a permission notice identical to this one.
.\"
.\" Since the Linux kernel and libraries are constantly changing, this
.\" manual page may be incorrect or out-of-date. The author(s) assume.
.\" no responsibility for errors or omissions, or for damages resulting.
.\" from the use of the information contained herein. The author(s) may.
.\" not have taken the same level of care in the production of this.
.\" manual, which is licensed free of charge, as they might when working.
.\" professionally.
.\"
.\" Formatted or processed versions of this manual, if unaccompanied by
.\" the source, must acknowledge the copyright and authors of this work.
.\" %%%LICENSE_END

.TH GETRANDOM 2 2014-10-03 "Linux" "Linux Programmer's Manual"
.SH NAME
getrandom \- obtain a series of random bytes
.SH SYNOPSIS
.B #include <linux/random.h>
.sp
.BI "int getrandom(void *"buf ", size_t " buflen ", unsigned int " flags );
.SH DESCRIPTION
The system call
.BR getrandom ()
fills the buffer pointed to by
.I buf
with up to
.I buflen
random bytes.
These can be used to seed user-space random number generators
or for other cryptographic purposes.
.PP
.BR getrandom ()
relies on entropy gathered from device drivers and other sources of
environmental noise.
Unnecessarily reading large quantities of data will have a negative impact
on other users of the
.I /dev/random
and
.I /dev/urandom
devices.
Therefore
.BR getrandom ()
should not be used for Monte Carlo simulations or other
programs/algorithms which are doing probabilistic sampling.

.\" FIXME is the following paragraph correct?
By default,
.BR getrandom ()
draws entropy from the
.IR /dev/urandom
pool, and, if that pool has been initialized and
.IR buflen
is less than or equal to 256 (see NOTES, below),
then the call never blocks when drawing from that pool
and always returns the number of bytes requested in
.IR buflen .
This behavior can be changed via the
.I flags
argument.

The
.I flags
argument is a bit mask that can contain zero or more of the following values
ORed together:
.TP
.B GRND_RANDOM
If this bit is set, then random bytes are drawn from the
.I /dev/random
pool instead of the
.I /dev/urandom
pool.
The
.I /dev/random
pool is limited based on the entropy that can be obtained from environmental
noise.
If the number of available bytes in
.I /dev/random
is less than requested in
.IR buflen ,
the call returns just the available random bytes.
If no random byte is available, the response will depend on the
presence of
.B GRND_NONBLOCK
in the
.I flags
argument.
.TP
.B GRND_NONBLOCK
By default, if there is no random byte available at all,
.BR getrandom ()
blocks until data is available.
If the
.B GRND_NONBLOCK
flag is set, then
.BR getrandom ()
instead immediately returns -1 with
.I errno
set to
.BR EAGAIN .
.SH RETURN VALUE
On success,
.BR getrandom ()
returns the number of bytes that were copied to the buffer
.IR buf .
This may be less than the number of bytes requested via
.I buflen
if insufficient entropy was present in the
.IR /dev/random
pool, or if the system call was interrupted by a signal.
.PP
On error, -1 is returned, and
.I errno
is set appropriately.
.SH ERRORS
.TP
.B EINVAL
An invalid flag was specified in
.IR flags .
.TP
.B EFAULT
The address referred to by
.I buf
is outside the accessible address space.
.TP
.B EAGAIN
The requested entropy was not available, and
.BR getrandom ()
would have blocked if the
.B GRND_NONBLOCK
flag was not set.
.TP
.B EINTR
While blocked waiting for entropy, the call was interrupted by a signal
handler; see the description of how interrupted
.BR read (2)
calls on "slow" devices are handled with and without the
.B SA_RESTART
flag in the
.BR signal (7)
man page.
.SH VERSIONS
.BR getrandom ()
was introduced in version 3.17 of the Linux kernel.
.SH CONFORMING TO
This system call is Linux-specific.
.SH NOTES
.SS Interruption by a signal handler
.\" FIXME Here, I think there needs to be an opening paragraph that describes
.\" the cases where getrandom() can block. This should cover the cases with
.\" GRND_RANDOM and without GRND_RANDOM. Reading the existing page, I am
.\" still not completely confident that I know what the cases are.
The reaction of
.BR getrandom ()
in case of an interruption of a blocking call by a signal
when reading from
.I /dev/urandom
.RB ( GRND_RANDOM
is not set)
depends on the initialization state of the entropy buffer
and on the request size
.IR buflen .
If the entropy is not yet initialized or the request size is large
.RI ( buflen "\ >\ 256),"
.B EINTR
will be returned.
If the entropy pool has been initialized and the request size is small
.RI ( buflen "\ <=\ 256),"
.BR getrandom ()
will not return
.BR EINTR .
Instead, it will return all of the bytes that have been requested.
.PP
When reading from
.I /dev/random
.RB ( GRND_RANDOM
is set)
these guarantees do
.I not
apply.
.PP
Calling
.BR getrandom ()
to read
.I /dev/urandom
for small values (<=\ 256) of
.I buflen
is the preferred mode of usage.
.PP
The special treatment of small values of
.I buflen
was designed for compatibility with
OpenBSD's
.BR getentropy ()
system call.
.PP
The user of
.BR getrandom ()
.I must
always check the return value,
to determine whether either an error occurred
or fewer bytes than requested were returned.
In the case where
.B GRND_RANDOM
is not specified and
.I buflen
is less than or equal to 256,
a return of fewer bytes than requested should never happen,
but the careful user-space code should check for this anyway!
.SS Choice of random device
Unless you are doing long-term key generation (and perhaps not even
then), you probably shouldn't be using
.B GRND_RANDOM.
The cryptographic algorithms used for
.I /dev/urandom
are quite conservative, and so should be sufficient for all purposes.
The disadvantage of
.B GRND_RANDOM
is that it can block.
Furthermore, dealing with partially fulfilled
.BR getrandom ()
requests increases code complexity.
.SS Emulating OpenBSD's getentropy()
The
.BR getentropy ()
system call in OpenBSD can be emulated using the following
function:

.in +4n
.nf
int
getentropy(void *buf, size_t buflen)
{
int ret;

if (buflen > 256)
goto failure;
ret = getrandom(buf, buflen, 0);
if (ret < 0)
return ret;
if (ret == buflen)
return 0;
failure:
errno = EIO;
return -1;
}
.fi
.in
.SH SEE ALSO
.BR random (4),
.BR urandom (4)


2014-11-11 16:23:33

by Heinrich Schuchardt

[permalink] [raw]
Subject: [PATCH] getrandom.2: treatment of interrupts

|| pick 0ef180e getrandom.2: Add a sentence to clarify the default behavior...
|| pick 62342ef getrandom.2: Reword GRND_NONBLOCK description
|| pick 0c90d3d getrandom.2: Reword GRND_RANDOM description
These patches look ok.

|| FIXME is the following paragraph correct?
Paragraph is correct. FIXME removed.

The appended patch removed two FIXMEs.
The description of the treatment of interrupts is updated.

|| FIXME Here, I think there needs to be an opening paragraph ...
The description of the treatment of interupts has been reworked.
FIXME removed.

Signed-off-by: Heinrich Schuchardt <[email protected]>
---
man2/getrandom.2 | 21 ++++++++-------------
1 file changed, 8 insertions(+), 13 deletions(-)

diff --git a/man2/getrandom.2 b/man2/getrandom.2
index 59ebbe0..5bf57b7 100644
--- a/man2/getrandom.2
+++ b/man2/getrandom.2
@@ -55,7 +55,6 @@ Therefore
should not be used for Monte Carlo simulations or other
programs/algorithms which are doing probabilistic sampling.

-.\" FIXME is the following paragraph correct?
By default,
.BR getrandom ()
draws entropy from the
@@ -157,10 +156,14 @@ was introduced in version 3.17 of the Linux kernel.
This system call is Linux-specific.
.SH NOTES
.SS Interruption by a signal handler
-.\" FIXME Here, I think there needs to be an opening paragraph that describes
-.\" the cases where getrandom() can block. This should cover the cases with
-.\" GRND_RANDOM and without GRND_RANDOM. Reading the existing page, I am
-.\" still not completely confident that I know what the cases are.
+If a blocking call is interrupted by a signal
+when reading from
+.I /dev/random
+.RB ( GRND_RANDOM
+is set),
+.B EINTR
+will be returned.
+.PP
The reaction of
.BR getrandom ()
in case of an interruption of a blocking call by a signal
@@ -182,14 +185,6 @@ will not return
.BR EINTR .
Instead, it will return all of the bytes that have been requested.
.PP
-When reading from
-.I /dev/random
-.RB ( GRND_RANDOM
-is set)
-these guarantees do
-.I not
-apply.
-.PP
Calling
.BR getrandom ()
to read
--
2.1.1

Subject: Re: [PATCH] getrandom.2: treatment of interrupts

Hi Heinrich,

On Tue, Nov 11, 2014 at 5:19 PM, Heinrich Schuchardt <[email protected]> wrote:
> || pick 0ef180e getrandom.2: Add a sentence to clarify the default behavior...
> || pick 62342ef getrandom.2: Reword GRND_NONBLOCK description
> || pick 0c90d3d getrandom.2: Reword GRND_RANDOM description
> These patches look ok.
>
> || FIXME is the following paragraph correct?
> Paragraph is correct. FIXME removed.
>
> The appended patch removed two FIXMEs.
> The description of the treatment of interrupts is updated.
>
> || FIXME Here, I think there needs to be an opening paragraph ...
> The description of the treatment of interupts has been reworked.
> FIXME removed.
>
> Signed-off-by: Heinrich Schuchardt <[email protected]>
> ---
> man2/getrandom.2 | 21 ++++++++-------------
> 1 file changed, 8 insertions(+), 13 deletions(-)
>
> diff --git a/man2/getrandom.2 b/man2/getrandom.2
> index 59ebbe0..5bf57b7 100644
> --- a/man2/getrandom.2
> +++ b/man2/getrandom.2
> @@ -55,7 +55,6 @@ Therefore
> should not be used for Monte Carlo simulations or other
> programs/algorithms which are doing probabilistic sampling.
>
> -.\" FIXME is the following paragraph correct?
> By default,
> .BR getrandom ()
> draws entropy from the
> @@ -157,10 +156,14 @@ was introduced in version 3.17 of the Linux kernel.
> This system call is Linux-specific.
> .SH NOTES
> .SS Interruption by a signal handler
> -.\" FIXME Here, I think there needs to be an opening paragraph that describes
> -.\" the cases where getrandom() can block. This should cover the cases with
> -.\" GRND_RANDOM and without GRND_RANDOM. Reading the existing page, I am
> -.\" still not completely confident that I know what the cases are.
> +If a blocking call is interrupted by a signal
> +when reading from
> +.I /dev/random
> +.RB ( GRND_RANDOM
> +is set),
> +.B EINTR
> +will be returned.
> +.PP

What I was meaning with my FIXME was: please describe the various
cases where getrandom() can block, both with GRND_RANDOM and without
GRND_RANDOM. As I read the man page, I am not sure that I understand
which the cases are.

Thanks,

Michael



> The reaction of
> .BR getrandom ()
> in case of an interruption of a blocking call by a signal
> @@ -182,14 +185,6 @@ will not return
> .BR EINTR .
> Instead, it will return all of the bytes that have been requested.
> .PP
> -When reading from
> -.I /dev/random
> -.RB ( GRND_RANDOM
> -is set)
> -these guarantees do
> -.I not
> -apply.
> -.PP
> Calling
> .BR getrandom ()
> to read
> --
> 2.1.1
>



--
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/

2014-11-22 11:36:35

by Heinrich Schuchardt

[permalink] [raw]
Subject: getrandom.2: treatment of interrupts

Hello Theodore,

I created the test program below.

While running it I issued
kill -SIGUSR1 <pid>
and
kill -SIGUSR2 <pid>

What I found was rather strange.

No matter whether specifying GRND_NONBLOCK or not, signals do not
interrupt the execution of getrandom() while reading from the
/dev/urandom pool.

Only after getrandom has finished signals are handled.

I would have expected getrandom() to react to interrupts immediately and
to return whatever number of random bytes have been collected before the
interrupt.

A system call not reacting to interrupts for several seconds looks like
a bug to me.

Tested on Linux 3.18.0-rc4 mips64.

Best regards

Heinrich Schuchardt



#define _GNU_SOURCE
#include <errno.h>
#include <linux/unistd.h>
#include <linux/random.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/syscall.h>
#include <sys/types.h>

#if _MIPS_SIM == _MIPS_SIM_ABI32
#define __NR_getrandom (__NR_Linux + 353)
#endif /* _MIPS_SIM == _MIPS_SIM_ABI32 */

#if _MIPS_SIM == _MIPS_SIM_ABI64
#define __NR_getrandom (__NR_Linux + 313)
#endif /* _MIPS_SIM == _MIPS_SIM_ABI64 */

#if _MIPS_SIM == _MIPS_SIM_NABI32
#define __NR_getrandom (__NR_Linux + 317)
#endif /* _MIPS_SIM == _MIPS_SIM_NABI32 */

#ifdef __i386__
#define __NR_getrandom (355)
#endif /* __i386__ */

#ifdef __x86_64__
#define __NR_getrandom (318)
#endif /* __x86_64__ */

#define SYS_getrandom __NR_getrandom

#define GRND_NONBLOCK 0x0001
#define GRND_RANDOM 0x0002

#define BUFLEN 0x12345678

int getrandom(void *buf, size_t buflen, unsigned int flags)
{
return syscall(SYS_getrandom, buf, buflen, flags);
}

/**
* Handles signal.
*
* @param sig signal
*/


int do_print = 0;

static void hdl(int sig)
{
if (sig == SIGUSR1) {
fprintf(stderr, "Main received SIGUSR1\n");
do_print = 1;
}
}

int
main(int argc, char *argv[])
{

char *buf;
size_t buflen = BUFLEN;
int ret;
pid_t pid;
// action to take when signal occurs
struct sigaction act;
// signal mask
sigset_t blockset;


pid = getpid();

printf("PID = %u\n", pid);

printf("__NR_getrandom = %u\n", __NR_getrandom);

// Set handler for SIGUSR1
act.sa_handler = hdl;
sigemptyset(&act.sa_mask);
act.sa_flags = 0;
if (sigaction(SIGUSR1, &act, NULL)) {
perror("sigaction");
exit(EXIT_FAILURE);
}

buf = (char *) malloc(buflen);
if (buf == NULL) {
perror("malloc");
exit(EXIT_FAILURE);
}

buf = (char *) malloc(buflen);

for (;;) {
ret = getrandom(buf, buflen, GRND_NONBLOCK);
if (ret == -1) {
fprintf(stderr, "errno = %d\n", errno);
perror("getrandom");
exit(EXIT_FAILURE);
}
if (do_print) {
do_print = 0;
printf("ret = %d\n", ret);
}

}
printf("ret = %d\n", ret);

free(buf);

return EXIT_SUCCESS;
}

2014-11-29 09:13:30

by Heinrich Schuchardt

[permalink] [raw]
Subject: [PATCH 1/1] urandom: handle signals immediately

Without the patch device /dev/urandom only considers signals when a
rescheduling of the thread is requested. This may imply that
signals will not be handled for time intervals in excess of 30s.

With the patch signals are handled in every round copying 10 bytes if more
than 256 bytes have been collected. This 256 byte limit ensures the
guarantee given by system call getrandom().

With the patch rescheduling may occur even when reading less than 257 bytes.
This restores the logic before the introduction of the getrandom() system
call. getrandom() provides no guarantee concerning rescheduling.

An example code for testing is provided in
https://lkml.org/lkml/2014/11/22/41

Signed-off-by: Heinrich Schuchardt <[email protected]>
---
drivers/char/random.c | 15 +++++++--------
1 file changed, 7 insertions(+), 8 deletions(-)

diff --git a/drivers/char/random.c b/drivers/char/random.c
index 04645c0..75e96c1 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -1189,21 +1189,20 @@ static ssize_t extract_entropy_user(struct entropy_store *r, void __user *buf,
{
ssize_t ret = 0, i;
__u8 tmp[EXTRACT_SIZE];
- int large_request = (nbytes > 256);

trace_extract_entropy_user(r->name, nbytes, ENTROPY_BITS(r), _RET_IP_);
xfer_secondary_pool(r, nbytes);
nbytes = account(r, nbytes, 0, 0);

while (nbytes) {
- if (large_request && need_resched()) {
- if (signal_pending(current)) {
- if (ret == 0)
- ret = -ERESTARTSYS;
- break;
- }
+ /*
+ * getrandom must not be interrupted by a signal while
+ * reading up to 256 bytes.
+ */
+ if (signal_pending(current) && ret > 256)
+ break;
+ if (need_resched())
schedule();
- }

extract_buf(r, tmp);
i = min_t(int, nbytes, EXTRACT_SIZE);
--
2.1.3

2014-12-19 16:57:48

by Theodore Ts'o

[permalink] [raw]
Subject: Re: [PATCH 1/1] urandom: handle signals immediately

On Sat, Nov 29, 2014 at 10:12:29AM +0100, Heinrich Schuchardt wrote:
> Without the patch device /dev/urandom only considers signals when a
> rescheduling of the thread is requested. This may imply that
> signals will not be handled for time intervals in excess of 30s.

Sorry, I didn't see your e-mail for a while; it got lost in my inbox
due to my being travelling for Thanksgiving weeksend.

I'm not sure where you are getting 30 seconds from, but you're right
that it would be better to check signal_pending() on each loop. That
being said, your patch isn't right.

> + /*
> + * getrandom must not be interrupted by a signal while
> + * reading up to 256 bytes.
> + */
> + if (signal_pending(current) && ret > 256)
> + break;
> + if (need_resched())
> schedule();
> - }

This means that we can reschedule even for small requests, and that's
no good; getrandom *must* be atomic. You also need to return
-ERESTARTSYS if we get interrupted with no bytes. So this needs to be
something like this:

if (ret > 256) {
if (signal_pending(current)) {
if (ret == 0)
ret = -ERESTARTSYS;
break;
}
if (need_resched())
schedule();
}

Cheers,

- Ted

2014-12-19 19:06:38

by Heinrich Schuchardt

[permalink] [raw]
Subject: Re: [PATCH 1/1] urandom: handle signals immediately

Hello Ted,

thanks for the review.

On 19.12.2014 17:57, Theodore Ts'o wrote:
> On Sat, Nov 29, 2014 at 10:12:29AM +0100, Heinrich Schuchardt wrote:
>...
> I'm not sure where you are getting 30 seconds from,

The example code is in https://lkml.org/lkml/2014/11/22/41 .
Running it on an EdgeRouter Lite which has a Cavium Octeon MIPS
processor, it took more than 30s to get a response for an interrupt
(after creating of 0x12345678 = 305,419,896 pseudo random bytes).

> but you're right
> that it would be better to check signal_pending() on each loop. That
> being said, your patch isn't right.
>
>> + /*
>> + * getrandom must not be interrupted by a signal while
>> + * reading up to 256 bytes.
>> + */
>> + if (signal_pending(current) && ret > 256)
>> + break;
>> + if (need_resched())
>> schedule();
>> - }
>
> This means that we can reschedule even for small requests, and that's
> no good; getrandom *must* be atomic.

https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=c6e9d6f38894798696f23c8084ca7edbf16ee895
does not indicate this as a necessary property.

Could you, please, explain why getrandom must be atomic.
I would like to add a comment line with the next version of my patch.

> You also need to return
> -ERESTARTSYS if we get interrupted with no bytes. So this needs to be
> something like this:
>
> if (ret > 256) {
> if (signal_pending(current)) {
> if (ret == 0)

This line will never be reached because ret > 256.

The idea in my patch was, that we will not react to an interrupt during
the generation of the first 256 bytes. Hence there is not need to return
ERESTARTSYS. Instead we will always return a positive number except if
the entropy buffer is not yet intialized.

This just removes the inconsistency that ERESTARTSYS may arise for calls
for more than 256 bytes, but not for calls up to 256 bytes.

Do you see any need to react to an interrupt before copying the first byte?

> ret = -ERESTARTSYS;
> break;
> }
> if (need_resched())
> schedule();
> }

Best regards

Heinrich Schuchardt