The unlocked function is needed for following work.
No API change.
(Minor functional change while messing with code: all 64 bits of
the cycles counter is used. No API change.)
---
drivers/char/random.c | 27 +++++++++++++++++----------
1 files changed, 17 insertions(+), 10 deletions(-)
diff --git a/drivers/char/random.c b/drivers/char/random.c
index 4bcc4f2..fdbf7b6 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -1621,22 +1621,29 @@ EXPORT_SYMBOL(secure_dccp_sequence_number);
/*
* Get a random word for internal kernel use only. Similar to urandom but
* with the goal of minimal entropy pool depletion. As a result, the random
- * value is not cryptographically secure but for several uses the cost of
- * depleting entropy is too high
+ * value is not strongly cryptographically secure, but for several uses the
+ * cost of depleting entropy is too high.
*/
DEFINE_PER_CPU(__u32 [4], get_random_int_hash);
-unsigned int get_random_int(void)
+static u32 __get_random_int(u32 *hash)
{
- struct keydata *keyptr;
- __u32 *hash = get_cpu_var(get_random_int_hash);
- int ret;
+ struct keydata const *keyptr = get_keyptr();
+ cycles_t c = get_cycles();
- keyptr = get_keyptr();
- hash[0] += current->pid + jiffies + get_cycles();
+ /* Throw in some extra seed material */
+ hash[0] += (__u32)c;
+ hash[1] += (__u32)(c>>16>>16); /* Safe if cycles_it is 32 bits. */
+ hash[2] += current->pid + jiffies;
- ret = half_md4_transform(hash, keyptr->secret);
- put_cpu_var(get_random_int_hash);
+ return half_md4_transform(hash, keyptr->secret);
+}
+unsigned int get_random_int(void)
+{
+ u32 *hash = get_cpu_var(get_random_int_hash);
+ u32 ret = __get_random_int(hash);
+
+ put_cpu_var(get_random_int_hash);
return ret;
}
--
1.7.4.1
On Sun, 2011-03-13 at 20:57 -0400, George Spelvin wrote:
> The unlocked function is needed for following work.
> No API change.
As I mentioned last time this code was discussed, we're already one
crypto-savvy attacker away from this code becoming a security hole.
We really need to give it a serious rethink before we make it look
anything like a general-use API.
And you've got it backwards here: __ should be the unlocked, dangerous
version. But the locked version already has a __ because it's already
dangerous.
--
Mathematics is the supreme nostalgia of our time.
Thank you very much for your review!
> I've spent a while thinking about this over the past few weeks, and I
> really don't think it's productive to try to randomize the allocators.
> It provides negligible defense and just makes life harder for kernel
> hackers.
>
> (And you definitely can't randomize SLOB like this.)
I'm not sure, either. I *do* think it actually prevents an attacker
reliably allocating two consecutive kernel objects, but I expect that
most buffer overrun attacks can just allocate lots of taget objects and
figure out which one got smashed.
It's mostly for benchmarking and discussion.
>> The unlocked function is needed for following work.
>> No API change.
> As I mentioned last time this code was discussed, we're already one
> crypto-savvy attacker away from this code becoming a security hole.
> We really need to give it a serious rethink before we make it look
> anything like a general-use API.
If you like, and don't mind a few more bytes of per-cpu data, I'll
happily replace the whole dubious thing with a cryptographically secure
high-speed PRNG. I'm thinking ChaCha/12, as Salsa20 was selected by
eSTREAM and ChaCha is generally agreed to be stronger. (It's had more
review as the basis of the BLAKE hash function, a SHA-3 finalist.)
I've got some working SSE2 code for it, too. Invoking it should be
conditional on the amount requested; there's no point context-switching
the FPU for one iteration.
I can also add a (configurable) /dev/frandom interface for it.
> And you've got it backwards here: __ should be the unlocked, dangerous
> version. But the locked version already has a __ because it's already
> dangerous.
I don't understand. The old version did *not* have a __, and I added
__ in front of the dangerous unlocked version. If, on re-reading it,
you still think I did something wrong, can you please explain in more
detail?
>> This is a function for generating random numbers modulo small
>> integers, with uniform distribution and parsimonious use of seed
>> material.
> This actually looks pretty reasonable, ignoring the scary API foundation
> it's built on. But as popular as rand() % m constructs are with
> programmers, it's better to design things so as to avoid the modulus
> entirely. We've done pretty well at that so far, so I'd rather not have
> such a thing in the kernel.
I was thinking of using it to implement randomize_range(), I just didn't
want to be too intrusive, and I'd need to extend the code to handle 64-bit
address spaces.
If you'd like, I can do that. (Actually, looking at it, there are
only three callers and the range is always 0x02000000. And the
use of PAGE_ALIGN is wrong; it should round down rather than up.)
On Mon, 2011-03-14 at 21:58 -0400, George Spelvin wrote:
>> For sysfs files that map a boolean to a flags bit.
> This one's actually pretty nice.
The old code just annoyed me; I couldn't stand to cut & paste one
more time.
I can probably do better; I can extend the slab_sttribute structure to
include the bit mask, have the slab_attr_show and slab_attr_store dispatch
functions pass the attribute pointer to the ->show and ->store functions,
and do away with all the per-bit functions.
> You should really try to put all the uncontroversial bits of a series
> first.
Is that really a more important principle than putting related changes
together? I get the idea, but thought it made more sense to put
all the slub.c changes together.
On Wed, Mar 16, 2011 at 6:24 AM, George Spelvin <[email protected]> wrote:
>> You should really try to put all the uncontroversial bits of a series
>> first.
>
> Is that really a more important principle than putting related changes
> together? ?I get the idea, but thought it made more sense to put
> all the slub.c changes together.
It it is more important because we might end up merging only the
non-controversial bits - at least for now.
On Wed, 2011-03-16 at 00:24 -0400, George Spelvin wrote:
> Thank you very much for your review!
>
> > I've spent a while thinking about this over the past few weeks, and I
> > really don't think it's productive to try to randomize the allocators.
> > It provides negligible defense and just makes life harder for kernel
> > hackers.
> >
> > (And you definitely can't randomize SLOB like this.)
>
> I'm not sure, either. I *do* think it actually prevents an attacker
> reliably allocating two consecutive kernel objects, but I expect that
> most buffer overrun attacks can just allocate lots of taget objects and
> figure out which one got smashed.
>
> It's mostly for benchmarking and discussion.
>
>
> >> The unlocked function is needed for following work.
> >> No API change.
>
> > As I mentioned last time this code was discussed, we're already one
> > crypto-savvy attacker away from this code becoming a security hole.
> > We really need to give it a serious rethink before we make it look
> > anything like a general-use API.
>
> If you like, and don't mind a few more bytes of per-cpu data, I'll
> happily replace the whole dubious thing with a cryptographically secure
> high-speed PRNG. I'm thinking ChaCha/12, as Salsa20 was selected by
> eSTREAM and ChaCha is generally agreed to be stronger. (It's had more
> review as the basis of the BLAKE hash function, a SHA-3 finalist.)
Yes, let's do this. ChaCha looks like a fine candidate.
> I've got some working SSE2 code for it, too. Invoking it should be
> conditional on the amount requested; there's no point context-switching
> the FPU for one iteration.
>
> I can also add a (configurable) /dev/frandom interface for it.
I'd rather not add an frandom until after we get rid of the
random/urandom dichotomy.
> > And you've got it backwards here: __ should be the unlocked, dangerous
> > version. But the locked version already has a __ because it's already
> > dangerous.
>
> I don't understand. The old version did *not* have a __, and I added
> __ in front of the dangerous unlocked version. If, on re-reading it,
> you still think I did something wrong, can you please explain in more
> detail?
I probably misread your patch, sorry.
>
> >> This is a function for generating random numbers modulo small
> >> integers, with uniform distribution and parsimonious use of seed
> >> material.
>
> > This actually looks pretty reasonable, ignoring the scary API foundation
> > it's built on. But as popular as rand() % m constructs are with
> > programmers, it's better to design things so as to avoid the modulus
> > entirely. We've done pretty well at that so far, so I'd rather not have
> > such a thing in the kernel.
>
> I was thinking of using it to implement randomize_range(), I just didn't
> want to be too intrusive, and I'd need to extend the code to handle 64-bit
> address spaces.
>
> If you'd like, I can do that. (Actually, looking at it, there are
> only three callers and the range is always 0x02000000. And the
> use of PAGE_ALIGN is wrong; it should round down rather than up.)
> On Mon, 2011-03-14 at 21:58 -0400, George Spelvin wrote:
>
>
> >> For sysfs files that map a boolean to a flags bit.
>
> > This one's actually pretty nice.
>
> The old code just annoyed me; I couldn't stand to cut & paste one
> more time.
>
> I can probably do better; I can extend the slab_sttribute structure to
> include the bit mask, have the slab_attr_show and slab_attr_store dispatch
> functions pass the attribute pointer to the ->show and ->store functions,
> and do away with all the per-bit functions.
>
> > You should really try to put all the uncontroversial bits of a series
> > first.
>
> Is that really a more important principle than putting related changes
> together? I get the idea, but thought it made more sense to put
> all the slub.c changes together.
Think of it as a way of making forward progress. You should explicitly
call out 'hey, these bits are cleanups you should just merge' so they
don't get lost in the debate. Then the next time around, you have that
many fewer patches.
--
Mathematics is the supreme nostalgia of our time.
>From [email protected] Wed Mar 16 14:24:09 2011
X-Virus-Scanned: Debian amavisd-new at waste.org
Subject: Re: [PATCH 2/8] drivers/char/random: Split out __get_random_int
From: Matt Mackall <[email protected]>
To: George Spelvin <[email protected]>
Cc: [email protected], [email protected],
[email protected], [email protected]
In-Reply-To: <[email protected]>
References: <[email protected]>
Content-Type: text/plain; charset="UTF-8"
Date: Wed, 16 Mar 2011 09:24:05 -0500
Mime-Version: 1.0
X-Mailer: Evolution 2.32.2
Content-Transfer-Encoding: 7bit
On Wed, 2011-03-16, Mat Mackall wrote:
> On Wed, 2011-03-16 at 00:24 -0400, George Spelvin wrote:
>> If you like, and don't mind a few more bytes of per-cpu data, I'll
>> happily replace the whole dubious thing with a cryptographically secure
>> high-speed PRNG. I'm thinking ChaCha/12, as Salsa20 was selected by
>> eSTREAM and ChaCha is generally agreed to be stronger. (It's had more
>> review as the basis of the BLAKE hash function, a SHA-3 finalist.)
> Yes, let's do this. ChaCha looks like a fine candidate.
Just to confirm, it'll have basically the same structure as the
current code: a global secret key, re-seeded every 300 seconds,
with per-CPU state for generation. I'll generate 16 words at a time,
and use them until they're exhausted or the global secret changes.
ChaCha uses a 256-bit (8-word) key. It obviously shouldn't be shared
with the weaker half-MD4 operation. Should I generate both from the
pool directly, or only take 8 words and use ChaCha to generate the
half-MD4 key? Cryptographically, either is fine; it's a matter of code
simplicity vs. economical use of entropy. Do you have a preference?
(I slightly prefer #2.)
> I'd rather not add an frandom until after we get rid of the
> random/urandom dichotomy.
Can you explain? I think Ted's idea of the split was a good idea.
It does require user education, but it's important user education.
(I'm talking API level; I understand the internal plumbing is a bit
of a mess.)
> Think of it as a way of making forward progress. You should explicitly
> call out 'hey, these bits are cleanups you should just merge' so they
> don't get lost in the debate. Then the next time around, you have that
> many fewer patches.
True enough. I'll submit the cleanups separately. Appended is another
cleanup I'm thinking of. Okay with you? (If so, I'll post it separately
for wider review.)
>From c7a878c143c7e63d2540785b76db54b2e8cf6d38 Mon Sep 17 00:00:00 2001
From: George Spelvin <[email protected]>
Date: Wed, 16 Mar 2011 11:42:52 -0400
Subject: [PATCH 1/9] drivers/char/random: Eliminate randomize_range().
This is only called in three places, each of which is trivially
replaced by a call to get_random_int() followed by a bit mask.
(It's to randomize the start of the brk() range by 0..32M bytes,
0..8K pages, which is 13 bits of entropy.)
There is a slight behaviour change, as randomize_range() used PAGE_ALIGN()
which rounds up, but it appears that rounding down was the intention.
---
arch/arm/kernel/process.c | 3 +--
arch/x86/kernel/process.c | 3 +--
arch/x86/kernel/sys_x86_64.c | 7 ++-----
drivers/char/random.c | 19 -------------------
include/linux/random.h | 1 -
5 files changed, 4 insertions(+), 29 deletions(-)
diff --git a/arch/arm/kernel/process.c b/arch/arm/kernel/process.c
index 94bbedb..ffb7c87 100644
--- a/arch/arm/kernel/process.c
+++ b/arch/arm/kernel/process.c
@@ -479,8 +479,7 @@ unsigned long get_wchan(struct task_struct *p)
unsigned long arch_randomize_brk(struct mm_struct *mm)
{
- unsigned long range_end = mm->brk + 0x02000000;
- return randomize_range(mm->brk, range_end, 0) ? : mm->brk;
+ return mm->brk + (get_random_int() & 0x01ffffff & PAGE_MASK);
}
#ifdef CONFIG_MMU
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index ff45541..dcec1a1 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -677,7 +677,6 @@ unsigned long arch_align_stack(unsigned long sp)
unsigned long arch_randomize_brk(struct mm_struct *mm)
{
- unsigned long range_end = mm->brk + 0x02000000;
- return randomize_range(mm->brk, range_end, 0) ? : mm->brk;
+ return mm->brk + (get_random_int() & 0x01ffffff & PAGE_MASK);
}
diff --git a/arch/x86/kernel/sys_x86_64.c b/arch/x86/kernel/sys_x86_64.c
index ff14a50..0f874f7 100644
--- a/arch/x86/kernel/sys_x86_64.c
+++ b/arch/x86/kernel/sys_x86_64.c
@@ -46,11 +46,8 @@ static void find_start_end(unsigned long flags, unsigned long *begin,
of playground for now. -AK */
*begin = 0x40000000;
*end = 0x80000000;
- if (current->flags & PF_RANDOMIZE) {
- new_begin = randomize_range(*begin, *begin + 0x02000000, 0);
- if (new_begin)
- *begin = new_begin;
- }
+ if (current->flags & PF_RANDOMIZE)
+ *begin += (get_random_int() & 0x01ffffff & PAGE_MASK);
} else {
*begin = TASK_UNMAPPED_BASE;
*end = TASK_SIZE;
diff --git a/drivers/char/random.c b/drivers/char/random.c
index 72a4fcb..cea9ddc 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -1639,22 +1639,3 @@ unsigned int get_random_int(void)
return ret;
}
-
-/*
- * randomize_range() returns a start address such that
- *
- * [...... <range> .....]
- * start end
- *
- * a <range> with size "len" starting at the return value is inside in the
- * area defined by [start, end], but is otherwise randomized.
- */
-unsigned long
-randomize_range(unsigned long start, unsigned long end, unsigned long len)
-{
- unsigned long range = end - len - start;
-
- if (end <= start + len)
- return 0;
- return PAGE_ALIGN(get_random_int() % range + start);
-}
diff --git a/include/linux/random.h b/include/linux/random.h
index fb7ab9d..0e17434 100644
--- a/include/linux/random.h
+++ b/include/linux/random.h
@@ -73,7 +73,6 @@ extern const struct file_operations random_fops, urandom_fops;
#endif
unsigned int get_random_int(void);
-unsigned long randomize_range(unsigned long start, unsigned long end, unsigned long len);
u32 random32(void);
void srandom32(u32 seed);
--
1.7.4.1
On Wed, 2011-03-16 at 14:35 -0400, George Spelvin wrote:
> >From [email protected] Wed Mar 16 14:24:09 2011
> X-Virus-Scanned: Debian amavisd-new at waste.org
> Subject: Re: [PATCH 2/8] drivers/char/random: Split out __get_random_int
> From: Matt Mackall <[email protected]>
> To: George Spelvin <[email protected]>
> Cc: [email protected], [email protected],
> [email protected], [email protected]
> In-Reply-To: <[email protected]>
> References: <[email protected]>
> Content-Type: text/plain; charset="UTF-8"
> Date: Wed, 16 Mar 2011 09:24:05 -0500
> Mime-Version: 1.0
> X-Mailer: Evolution 2.32.2
> Content-Transfer-Encoding: 7bit
>
> On Wed, 2011-03-16, Mat Mackall wrote:
> > On Wed, 2011-03-16 at 00:24 -0400, George Spelvin wrote:
> >> If you like, and don't mind a few more bytes of per-cpu data, I'll
> >> happily replace the whole dubious thing with a cryptographically secure
> >> high-speed PRNG. I'm thinking ChaCha/12, as Salsa20 was selected by
> >> eSTREAM and ChaCha is generally agreed to be stronger. (It's had more
> >> review as the basis of the BLAKE hash function, a SHA-3 finalist.)
>
> > Yes, let's do this. ChaCha looks like a fine candidate.
>
> Just to confirm, it'll have basically the same structure as the
> current code: a global secret key, re-seeded every 300 seconds,
> with per-CPU state for generation.
> I'll generate 16 words at a time,
> and use them until they're exhausted or the global secret changes.
>
> ChaCha uses a 256-bit (8-word) key. It obviously shouldn't be shared
> with the weaker half-MD4 operation. Should I generate both from the
> pool directly, or only take 8 words and use ChaCha to generate the
> half-MD4 key?
Ideally, I'd like to banish the syncookie code back to networking. I
actually had patches that did this, but they collided in flight with
Arjan's address-space randomization patches.
So decoupling the lightweight RNG from the half-MD4 crap should be the
goal. A future goal would be replacing half-MD4 (and its IPv6 sibling)
with a sensible alternative.
> > I'd rather not add an frandom until after we get rid of the
> > random/urandom dichotomy.
>
> Can you explain? I think Ted's idea of the split was a good idea.
I had a long talk with Ted about two years ago (and separately with
Bruce Schneier) and we all agreed that the entropy accounting was a neat
idea in theory, but broken in practice. For entropy accounting to
actually work (and thus make random stronger than urandom), we MUST
strictly underestimate input entropy. But in general, we have no model
of entropy sources that actually lets us meet this claim (and in fact we
know that most of the sources don't!).
And it hurts us because (a) it keeps us from sampling sources that may
have significant entropy in practice but not in theory and (b) confuses
the hell out of everyone.
We'd be better off with significantly more lighter-weight sampling from
less ideal sources and ditching entropy counting. In other words, we can
trade some ideal but unrealizable level of security for more actual
depth.
There's a bit of a chicken and egg problem here though: it's hard to
introduce a lot of new sampling (eg per-interrupt!) due to locking
overhead associated with entropy counting. A really lightweight approach
needs to be nearly exclusively cpu-local, but we need a design that can
still do cross-cpu catastrophic reseeding. And I've been pretty busy
with another little project, so this is all a bit stalled.
> >From c7a878c143c7e63d2540785b76db54b2e8cf6d38 Mon Sep 17 00:00:00 2001
> From: George Spelvin <[email protected]>
> Date: Wed, 16 Mar 2011 11:42:52 -0400
> Subject: [PATCH 1/9] drivers/char/random: Eliminate randomize_range().
>
> This is only called in three places, each of which is trivially
> replaced by a call to get_random_int() followed by a bit mask.
> (It's to randomize the start of the brk() range by 0..32M bytes,
> 0..8K pages, which is 13 bits of entropy.)
>
> There is a slight behaviour change, as randomize_range() used PAGE_ALIGN()
> which rounds up, but it appears that rounding down was the intention.
Can't say I really like this. Your rolled-in bugfix is a perfect
demonstration of why we don't want to inline this: it's too easy to do
wrong. Ideally we'd have a nice ASR-friendly helper function using a
mask somewhere in mm/ that gets all the tricky arch pointer size and
alignment issues right, rather than a generic-looking one using modulus
(in a sketchy way) in random.h
> ---
> arch/arm/kernel/process.c | 3 +--
> arch/x86/kernel/process.c | 3 +--
> arch/x86/kernel/sys_x86_64.c | 7 ++-----
> drivers/char/random.c | 19 -------------------
> include/linux/random.h | 1 -
> 5 files changed, 4 insertions(+), 29 deletions(-)
>
> diff --git a/arch/arm/kernel/process.c b/arch/arm/kernel/process.c
> index 94bbedb..ffb7c87 100644
> --- a/arch/arm/kernel/process.c
> +++ b/arch/arm/kernel/process.c
> @@ -479,8 +479,7 @@ unsigned long get_wchan(struct task_struct *p)
>
> unsigned long arch_randomize_brk(struct mm_struct *mm)
> {
> - unsigned long range_end = mm->brk + 0x02000000;
> - return randomize_range(mm->brk, range_end, 0) ? : mm->brk;
> + return mm->brk + (get_random_int() & 0x01ffffff & PAGE_MASK);
> }
>
> #ifdef CONFIG_MMU
> diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
> index ff45541..dcec1a1 100644
> --- a/arch/x86/kernel/process.c
> +++ b/arch/x86/kernel/process.c
> @@ -677,7 +677,6 @@ unsigned long arch_align_stack(unsigned long sp)
>
> unsigned long arch_randomize_brk(struct mm_struct *mm)
> {
> - unsigned long range_end = mm->brk + 0x02000000;
> - return randomize_range(mm->brk, range_end, 0) ? : mm->brk;
> + return mm->brk + (get_random_int() & 0x01ffffff & PAGE_MASK);
> }
>
> diff --git a/arch/x86/kernel/sys_x86_64.c b/arch/x86/kernel/sys_x86_64.c
> index ff14a50..0f874f7 100644
> --- a/arch/x86/kernel/sys_x86_64.c
> +++ b/arch/x86/kernel/sys_x86_64.c
> @@ -46,11 +46,8 @@ static void find_start_end(unsigned long flags, unsigned long *begin,
> of playground for now. -AK */
> *begin = 0x40000000;
> *end = 0x80000000;
> - if (current->flags & PF_RANDOMIZE) {
> - new_begin = randomize_range(*begin, *begin + 0x02000000, 0);
> - if (new_begin)
> - *begin = new_begin;
> - }
> + if (current->flags & PF_RANDOMIZE)
> + *begin += (get_random_int() & 0x01ffffff & PAGE_MASK);
> } else {
> *begin = TASK_UNMAPPED_BASE;
> *end = TASK_SIZE;
> diff --git a/drivers/char/random.c b/drivers/char/random.c
> index 72a4fcb..cea9ddc 100644
> --- a/drivers/char/random.c
> +++ b/drivers/char/random.c
> @@ -1639,22 +1639,3 @@ unsigned int get_random_int(void)
>
> return ret;
> }
> -
> -/*
> - * randomize_range() returns a start address such that
> - *
> - * [...... <range> .....]
> - * start end
> - *
> - * a <range> with size "len" starting at the return value is inside in the
> - * area defined by [start, end], but is otherwise randomized.
> - */
> -unsigned long
> -randomize_range(unsigned long start, unsigned long end, unsigned long len)
> -{
> - unsigned long range = end - len - start;
> -
> - if (end <= start + len)
> - return 0;
> - return PAGE_ALIGN(get_random_int() % range + start);
> -}
> diff --git a/include/linux/random.h b/include/linux/random.h
> index fb7ab9d..0e17434 100644
> --- a/include/linux/random.h
> +++ b/include/linux/random.h
> @@ -73,7 +73,6 @@ extern const struct file_operations random_fops, urandom_fops;
> #endif
>
> unsigned int get_random_int(void);
> -unsigned long randomize_range(unsigned long start, unsigned long end, unsigned long len);
>
> u32 random32(void);
> void srandom32(u32 seed);
--
Mathematics is the supreme nostalgia of our time.