The getrandom(2) system call was requested by the LibreSSL Portable
developers. It is analoguous to the getentropy(2) system call in
OpenBSD.
The rationale of this system call is to provide resiliance against
file descriptor exhaustion attacks, where the attacker consumes all
available file descriptors, forcing the use of the fallback code where
/dev/[u]random is not available. Since the fallback code is often not
well-tested, it is better to eliminate this potential failure mode
entirely.
The other feature provided by this new system call is the ability to
request randomness from the /dev/urandom entropy pool, but to block
until at least 128 bits of entropy has been accumulated in the
/dev/urandom entropy pool. Historically, the emphasis in the
/dev/urandom development has been to ensure that urandom pool is
initialized as quickly as possible after system boot, and preferably
before the init scripts start execution.
This is because changing /dev/urandom reads to block represents an
interface change that could potentially break userspace which is not
acceptable. In practice, on most x86 desktop and server systems, in
general the entropy pool can be initialized before it is needed (and
in modern kernels, we will printk a warning message if not). However,
on an embedded system, this may not be the case. And so with this new
interface, we can provide the functionality of blocking until the
urandom pool has been initialized. Any userspace program which uses
this new functionality must take care to assure that if it is used
during the boot process, that it will not cause the init scripts or
other portions of the system startup to hang indefinitely.
SYNOPSIS
#include <linux/random.h>
int getrandom(void *buf, size_t buflen, unsigned int flags);
DESCRIPTION
The system call getrandom() fills the buffer pointed to by buf
with up to buflen random bytes which can be used to seed user
space random number generators (i.e., DRBG's) or for other
cryptographic uses. It should not be used for Monte Carlo
simulations or other programs/algorithms which are doing
probabilistic sampling.
If the GRND_RANDOM flags bit is set, then draw from the
/dev/random pool instead of the /dev/urandom pool. The
/dev/random pool is limited based on the entropy that can be
obtained from environmental noise, so if there is insufficient
entropy, the requested number of bytes may not be returned.
If there is no entropy available at all, getrandom(2) will
either block, or return an error with errno set to EAGAIN if
the GRND_NONBLOCK bit is set in flags.
If the GRND_RANDOM bit is not set, then the /dev/urandom pool
will be used. Unlike using read(2) to fetch data from
/dev/urandom, if the urandom pool has not been sufficiently
initialized, getrandom(2) will block or return -1 with the
errno set to EGAIN if the GRND_NONBLOCK bit is set in flags.
The getentropy(2) system call in OpenBSD can be emulated using
the following function:
int getentropy(void *buf, size_t buflen)
{
int ret;
if (buflen > 256)
goto failure;
ret = getrandom(buf, buflen, 0);
if (ret < 0)
return ret;
if (ret == buflen)
return 0;
failure:
errno = EIO;
return -1;
}
RETURN VALUE
On success, the number of bytes that was filled in the buf is
returned. This may not be all the bytes requested by the
caller via buflen if insufficient entropy was present in the
/dev/random pool, or if the system call was interrupted by a
signal.
On error, -1 is returned, and errno is set appropriately.
ERRORS
EINVAL An invalid flag was passed to getrandom(2)
EFAULT buf is outside the accessible address space.
EAGAIN The requested entropy was not available, and the
getentropy(2) would have blocked if GRND_BLOCK flag
was set.
EINTR While blocked waiting for entropy, the call was
interrupted by a signal handler; see the description
of how interrupted read(2) calls on "slow" devices
are handled with and without the SA_RESTART flag
in the signal(7) man page.
NOTES
For small requests (buflen <= 256) getrandom(2) will not
return EINTR when reading from the urandom pool once the
entropy pool has been initialized, and it will return all of
the bytes that have been requested. This is the recommended
way to use getrandom(2), and is designed for compatibility
with OpenBSD's getentropy() system call.
However, if you are using GRND_RANDOM, then getrandom(2) may
block until the entropy accounting determines that sufficient
environmental noise has been gathered such that getrandom(2)
will be operating as a NRBG instead of a DRBG for those people
who are working in the NIST SP 800-90 regime. Since it may
block for a long time, these guarantees do *not* apply. The
user may want to interrupt a hanging process using a signal,
so blocking until all of the requested bytes are returned
would be unfriendly.
For this reason, the user of getrandom(2) MUST always check
the return value, in case it returns some error, or if fewer
bytes than requested was returned. In the case of
!GRND_RANDOM and small request, the latter should never
happen, but the careful userspace code (and all crypto code
should be careful) should check for this anyway!
Signed-off-by: Theodore Ts'o <[email protected]>
Reviewed-by: Zach Brown <[email protected]>
---
Changelog:
v4: Use a wait queue and wait_event() instead of a completion handler
More fine tuning of the commit description and man page
v3: Eliminate potential short reads and EINTR returns when requesting
small amounts of entropy from urandom (once the urandom pool is
initialized). See the NOTES section in the suggested man page for
a more in-depth discussion of the issues involved.
v2: Statically declare the completion structure
Check for and reject unknown flags
arch/x86/syscalls/syscall_32.tbl | 1 +
arch/x86/syscalls/syscall_64.tbl | 1 +
drivers/char/random.c | 42 ++++++++++++++++++++++++++++++++++++---
include/linux/syscalls.h | 3 +++
include/uapi/asm-generic/unistd.h | 4 +++-
include/uapi/linux/random.h | 9 +++++++++
6 files changed, 56 insertions(+), 4 deletions(-)
diff --git a/arch/x86/syscalls/syscall_32.tbl b/arch/x86/syscalls/syscall_32.tbl
index d6b8679..f484e39 100644
--- a/arch/x86/syscalls/syscall_32.tbl
+++ b/arch/x86/syscalls/syscall_32.tbl
@@ -360,3 +360,4 @@
351 i386 sched_setattr sys_sched_setattr
352 i386 sched_getattr sys_sched_getattr
353 i386 renameat2 sys_renameat2
+354 i386 getrandom sys_getrandom
diff --git a/arch/x86/syscalls/syscall_64.tbl b/arch/x86/syscalls/syscall_64.tbl
index ec255a1..6705032 100644
--- a/arch/x86/syscalls/syscall_64.tbl
+++ b/arch/x86/syscalls/syscall_64.tbl
@@ -323,6 +323,7 @@
314 common sched_setattr sys_sched_setattr
315 common sched_getattr sys_sched_getattr
316 common renameat2 sys_renameat2
+317 common getrandom sys_getrandom
#
# x32-specific system call numbers start at 512 to avoid cache impact
diff --git a/drivers/char/random.c b/drivers/char/random.c
index aa22fe5..c5335a2 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -258,6 +258,8 @@
#include <linux/kmemcheck.h>
#include <linux/workqueue.h>
#include <linux/irq.h>
+#include <linux/syscalls.h>
+#include <linux/completion.h>
#include <asm/processor.h>
#include <asm/uaccess.h>
@@ -404,6 +406,7 @@ static struct poolinfo {
*/
static DECLARE_WAIT_QUEUE_HEAD(random_read_wait);
static DECLARE_WAIT_QUEUE_HEAD(random_write_wait);
+static DECLARE_WAIT_QUEUE_HEAD(urandom_init_wait);
static struct fasync_struct *fasync;
/**********************************************************************
@@ -657,6 +660,7 @@ retry:
r->entropy_total = 0;
if (r == &nonblocking_pool) {
prandom_reseed_late();
+ wake_up_interruptible(&urandom_init_wait);
pr_notice("random: %s pool is initialized\n", r->name);
}
}
@@ -1174,13 +1178,14 @@ static ssize_t extract_entropy_user(struct entropy_store *r, void __user *buf,
{
ssize_t ret = 0, i;
__u8 tmp[EXTRACT_SIZE];
+ int large_request = (nbytes > 256);
trace_extract_entropy_user(r->name, nbytes, ENTROPY_BITS(r), _RET_IP_);
xfer_secondary_pool(r, nbytes);
nbytes = account(r, nbytes, 0, 0);
while (nbytes) {
- if (need_resched()) {
+ if (large_request && need_resched()) {
if (signal_pending(current)) {
if (ret == 0)
ret = -ERESTARTSYS;
@@ -1355,7 +1360,7 @@ static int arch_random_refill(void)
}
static ssize_t
-random_read(struct file *file, char __user *buf, size_t nbytes, loff_t *ppos)
+_random_read(int nonblock, char __user *buf, size_t nbytes)
{
ssize_t n;
@@ -1379,7 +1384,7 @@ random_read(struct file *file, char __user *buf, size_t nbytes, loff_t *ppos)
if (arch_random_refill())
continue;
- if (file->f_flags & O_NONBLOCK)
+ if (nonblock)
return -EAGAIN;
wait_event_interruptible(random_read_wait,
@@ -1391,6 +1396,12 @@ random_read(struct file *file, char __user *buf, size_t nbytes, loff_t *ppos)
}
static ssize_t
+random_read(struct file *file, char __user *buf, size_t nbytes, loff_t *ppos)
+{
+ return _random_read(file->f_flags & O_NONBLOCK, buf, nbytes);
+}
+
+static ssize_t
urandom_read(struct file *file, char __user *buf, size_t nbytes, loff_t *ppos)
{
int ret;
@@ -1533,6 +1544,31 @@ const struct file_operations urandom_fops = {
.llseek = noop_llseek,
};
+SYSCALL_DEFINE3(getrandom, char __user *, buf, size_t, count,
+ unsigned int, flags)
+{
+ int r;
+
+ if (flags & ~(GRND_NONBLOCK|GRND_RANDOM))
+ return -EINVAL;
+
+ if (count > INT_MAX)
+ count = INT_MAX;
+
+ if (flags & GRND_RANDOM)
+ return _random_read(flags & GRND_NONBLOCK, buf, count);
+
+ if (unlikely(nonblocking_pool.initialized == 0)) {
+ if (flags & GRND_NONBLOCK)
+ return -EAGAIN;
+ wait_event_interruptible(urandom_init_wait,
+ nonblocking_pool.initialized);
+ if (signal_pending(current))
+ return -ERESTARTSYS;
+ }
+ return urandom_read(NULL, buf, count, NULL);
+}
+
/***************************************************************
* Random UUID interface
*
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index b0881a0..cd82f72f 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -866,4 +866,7 @@ asmlinkage long sys_process_vm_writev(pid_t pid,
asmlinkage long sys_kcmp(pid_t pid1, pid_t pid2, int type,
unsigned long idx1, unsigned long idx2);
asmlinkage long sys_finit_module(int fd, const char __user *uargs, int flags);
+asmlinkage long sys_getrandom(char __user * buf, size_t count,
+ unsigned int flags);
+
#endif
diff --git a/include/uapi/asm-generic/unistd.h b/include/uapi/asm-generic/unistd.h
index 3336406..2926b1d 100644
--- a/include/uapi/asm-generic/unistd.h
+++ b/include/uapi/asm-generic/unistd.h
@@ -699,9 +699,11 @@ __SYSCALL(__NR_sched_setattr, sys_sched_setattr)
__SYSCALL(__NR_sched_getattr, sys_sched_getattr)
#define __NR_renameat2 276
__SYSCALL(__NR_renameat2, sys_renameat2)
+#define __NR_getrandom 277
+__SYSCALL(__NR_getrandom, sys_getrandom)
#undef __NR_syscalls
-#define __NR_syscalls 277
+#define __NR_syscalls 278
/*
* All syscalls below here should go away really,
diff --git a/include/uapi/linux/random.h b/include/uapi/linux/random.h
index fff3528..3f93d16 100644
--- a/include/uapi/linux/random.h
+++ b/include/uapi/linux/random.h
@@ -40,4 +40,13 @@ struct rand_pool_info {
__u32 buf[0];
};
+/*
+ * Flags for getrandom(2)
+ *
+ * GRND_NONBLOCK Don't block and return EAGAIN instead
+ * GRND_RANDOM Use the /dev/random pool instead of /dev/urandom
+ */
+#define GRND_NONBLOCK 0x0001
+#define GRND_RANDOM 0x0002
+
#endif /* _UAPI_LINUX_RANDOM_H */
--
2.0.0
Hi,
On Fri, 18 Jul 2014, Theodore Ts'o wrote:
[...]
> If the GRND_RANDOM bit is not set, then the /dev/urandom pool
> will be used. Unlike using read(2) to fetch data from
> /dev/urandom, if the urandom pool has not been sufficiently
> initialized, getrandom(2) will block or return -1 with the
> errno set to EGAIN if the GRND_NONBLOCK bit is set in flags.
^^^^^
Small typo: this should be EAGAIN.
[...]
Regards
Till Smejkal
On Mon, Jul 21, 2014 at 10:21:26PM +0200, Till Smejkal wrote:
> Hi,
>
> On Fri, 18 Jul 2014, Theodore Ts'o wrote:
> [...]
> > If the GRND_RANDOM bit is not set, then the /dev/urandom pool
> > will be used. Unlike using read(2) to fetch data from
> > /dev/urandom, if the urandom pool has not been sufficiently
> > initialized, getrandom(2) will block or return -1 with the
> > errno set to EGAIN if the GRND_NONBLOCK bit is set in flags.
> ^^^^^
> Small typo: this should be EAGAIN.
Thanks for pointing this out! Will fix.
- Ted
> EAGAIN The requested entropy was not available, and the
> getentropy(2) would have blocked if GRND_BLOCK flag
> was set.
I think either "and the call to getentropy(2)" or "and getentropy(2)" here.
Greetings,
Eike
Hi!
> The rationale of this system call is to provide resiliance against
> file descriptor exhaustion attacks, where the attacker consumes all
> available file descriptors, forcing the use of the fallback code where
> /dev/[u]random is not available. Since the fallback code is often not
> well-tested, it is better to eliminate this potential failure mode
> entirely.
I'm not sure I understand the rationale; if someone can eat all your
file descriptors, he can make you stop working. So you can just stop
working when you can't open /dev/urandom, no?
Fallback code is probably very bad idea to use...
> The other feature provided by this new system call is the ability to
> request randomness from the /dev/urandom entropy pool, but to block
> until at least 128 bits of entropy has been accumulated in the
> /dev/urandom entropy pool. Historically, the emphasis in the
> /dev/urandom development has been to ensure that urandom pool is
> initialized as quickly as possible after system boot, and preferably
> before the init scripts start execution.
Sounds like ioctl() for /dev/urandom for this behaviour would be nice?
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
Pavel. I have bit 'ol enterprise daemon running with established file
descriptors serving thousands of connections
which periodically require entropy. Now I run out of descriptors. I
can't establish new connections. but I should
now halt all the other ones that require entropy? I should raise
SIGKILL on my process serving these thousands
of connetions? I don't think so.
On Wed, Jul 30, 2014 at 6:26 AM, Pavel Machek <[email protected]> wrote:
> Hi!
>
>> The rationale of this system call is to provide resiliance against
>> file descriptor exhaustion attacks, where the attacker consumes all
>> available file descriptors, forcing the use of the fallback code where
>> /dev/[u]random is not available. Since the fallback code is often not
>> well-tested, it is better to eliminate this potential failure mode
>> entirely.
>
> I'm not sure I understand the rationale; if someone can eat all your
> file descriptors, he can make you stop working. So you can just stop
> working when you can't open /dev/urandom, no?
>
> Fallback code is probably very bad idea to use...
>
>> The other feature provided by this new system call is the ability to
>> request randomness from the /dev/urandom entropy pool, but to block
>> until at least 128 bits of entropy has been accumulated in the
>> /dev/urandom entropy pool. Historically, the emphasis in the
>> /dev/urandom development has been to ensure that urandom pool is
>> initialized as quickly as possible after system boot, and preferably
>> before the init scripts start execution.
>
> Sounds like ioctl() for /dev/urandom for this behaviour would be nice?
>
> Pavel
> --
> (english) http://www.livejournal.com/~pavelmachek
> (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
On Mit, 2014-07-30 at 07:56 -0600, Bob Beck wrote:
> Pavel. I have bit 'ol enterprise daemon running with established file
> descriptors serving thousands of connections
> which periodically require entropy. Now I run out of descriptors. I
> can't establish new connections. but I should
> now halt all the other ones that require entropy? I should raise
> SIGKILL on my process serving these thousands
> of connetions? I don't think so.
If that long-running daemon periodically needs something from a device,
one would better keep the fd for that open the whole time. Saves some
CPU cycles and latency too BTW.
Bernd
--
"I dislike type abstraction if it has no real reason. And saving
on typing is not a good reason - if your typing speed is the main
issue when you're coding, you're doing something seriously wrong."
- Linus Torvalds
On Wed 2014-07-30 16:40:52, Bernd Petrovitsch wrote:
> On Mit, 2014-07-30 at 07:56 -0600, Bob Beck wrote:
> > Pavel. I have bit 'ol enterprise daemon running with established file
> > descriptors serving thousands of connections
> > which periodically require entropy. Now I run out of descriptors. I
> > can't establish new connections. but I should
> > now halt all the other ones that require entropy? I should raise
> > SIGKILL on my process serving these thousands
> > of connetions? I don't think so.
>
> If that long-running daemon periodically needs something from a device,
> one would better keep the fd for that open the whole time. Saves some
> CPU cycles and latency too BTW.
Agreed.
On the other hand, keeping a fd open is quite tricky for a
library. But better solution might be to make that easier.
open( , O_IM_A_LIBRARY_GIVE_ME_ONE_OF_THREE_RESERVED_FDS) might be one
solution. Actually, one reserved fd should be enough.
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
On Don, 2014-07-31 at 00:18 +0200, Pavel Machek wrote:
> On Wed 2014-07-30 16:40:52, Bernd Petrovitsch wrote:
> > On Mit, 2014-07-30 at 07:56 -0600, Bob Beck wrote:
> > > Pavel. I have bit 'ol enterprise daemon running with established file
> > > descriptors serving thousands of connections
> > > which periodically require entropy. Now I run out of descriptors. I
> > > can't establish new connections. but I should
> > > now halt all the other ones that require entropy? I should raise
> > > SIGKILL on my process serving these thousands
> > > of connetions? I don't think so.
> >
> > If that long-running daemon periodically needs something from a device,
> > one would better keep the fd for that open the whole time. Saves some
> > CPU cycles and latency too BTW.
>
> Agreed.
>
> On the other hand, keeping a fd open is quite tricky for a
> library. But better solution might be to make that easier.
Yes, in a (full-fledged, standalone) library seems at least tricky (also
referring to some off-list mails here: think about fork() - which could
be inside system() or popen() or similar).
But as part of the *application* (where one has control over fork()
etc.), this should be somewhat less risky. Yes, that doesn't really help
libssl;-)
Hehe, we (Unix!) have (had) gettimeofday(), time() and similar sys-calls
since ages and no one proposed to make devices for them and get rid of
the system-calls.
> open( , O_IM_A_LIBRARY_GIVE_ME_ONE_OF_THREE_RESERVED_FDS) might be one
> solution. Actually, one reserved fd should be enough.
Well, this can also be DoSed and the proposal aims to make that
impossible (and where does this reserved count against? process-limits,
kernel-wide limit?).
Bernd
--
"I dislike type abstraction if it has no real reason. And saving
on typing is not a good reason - if your typing speed is the main
issue when you're coding, you're doing something seriously wrong."
- Linus Torvalds