This is the cleanup of the latest series of prandom_u32 experiments,
which consist of using SipHash instead of Tausworthe to produce the
random numbers used by the network stack. The changes to the files were
kept minimal,
and the controversial commit that used to take noise from the fast_pool
(f227e3ec3b5c) was reverted. Instead, a dedicated "net_rand_noise" per_cpu
variable is fed from various sources of activities (networking, scheduling)
to perturb the SipHash state using fast, non-trivially predictable data,
instead of keeping it fully deterministic. The goal is essentially to make
any occasional memory leakage or brute-force attempt useless.
The resulting code was verified to be very slightly faster on x86_64 than
what it was with the controversial commit above, though this remains barely
above measurement noise. It was only build-tested on arm & arm64.
George Spelvin (1):
random32: make prandom_u32() output unpredictable
Willy Tarreau (1):
random32: add noise from network and scheduling activity
drivers/char/random.c | 1 -
include/linux/prandom.h | 55 ++++-
kernel/time/timer.c | 9 +-
lib/random32.c | 438 ++++++++++++++++++++++++----------------
net/core/dev.c | 4 +
5 files changed, 326 insertions(+), 181 deletions(-)
Cc: George Spelvin <[email protected]>
Cc: Amit Klein <[email protected]>
Cc: Eric Dumazet <[email protected]>
Cc: "Jason A. Donenfeld" <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: [email protected]
Cc: Florian Westphal <[email protected]>
Cc: Marc Plumb <[email protected]>
Cc: Sedat Dilek <[email protected]>
--
2.28.0
With the removal of the interrupt perturbations in the previous random32
change (random32: make prandom_u32() output unpredictable), the PRNG
has become 100% deterministic again. While SipHash is expected to be
way more robust against brute force than the previous Tausworthe LFSR,
there is still the risk that whoever gains even temporary access to
the PRNG's internal state can predict all subsequent draws until
the next reseed (roughly every minute). This may happen through a side
channel attack or any data leak.
This patch restores the spirit of commit f227e3ec3b5c ("random32: update
the net random state on interrupt and activity") in that it will perturb
the internal PRNG's state using externally collected noise, except that
it will not pick that noise from the random pool's bits nor upon
interrupt, but will rather combine a few elements along the Tx path
that are collectively hard to predict, such as dev, skb and txq
pointers, packet length and jiffies values. These are combined
using a single round of SipHash into a single long variable that is
mixed with the net_rand_state upon each invocation.
The operation was inlined because it produces very small and efficient
code, typically 3 xor, 2 add and 2 rol. The performance was measured
to be the same as (even very slightly better than) before the switch to
SipHash; on a 6-core 12-thread Core i7-8700K equipped with a 40G NIC
(i40e), the connection rate dropped from 556k/s to 555k/s while the
SYN cookie rate grew from 5.38 Mpps to 5.45 Mpps.
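To make the mixing step described above concrete, here is a rough userspace sketch of the idea (not the kernel code itself: the per-CPU variable is modeled as a plain global, and the round uses the standard 64-bit SipHash rotation constants):

```c
#include <stdint.h>

#define ROL64(x, b) ((uint64_t)(((x) << (b)) | ((x) >> (64 - (b)))))

/* One SipHash round, the same shape as PRND_SIPROUND in the patch
 * (64-bit case, standard SipHash rotation constants). */
#define SIPROUND(v0, v1, v2, v3) do {                                   \
        v0 += v1; v1 = ROL64(v1, 13); v1 ^= v0; v0 = ROL64(v0, 32);     \
        v2 += v3; v3 = ROL64(v3, 16); v3 ^= v2;                         \
        v0 += v3; v3 = ROL64(v3, 21); v3 ^= v0;                         \
        v2 += v1; v1 = ROL64(v1, 17); v1 ^= v2; v2 = ROL64(v2, 32);     \
} while (0)

/* Stand-in for the per-CPU net_rand_noise variable. */
static uint64_t net_rand_noise;

/* Mirrors the shape of prandom_u32_add_noise(): fold four
 * hard-to-predict words (pointers, lengths, jiffies...) through one
 * round, chaining the previous noise into the first word, and keep
 * only the last lane as the new noise value. */
static void add_noise(uint64_t a, uint64_t b, uint64_t c, uint64_t d)
{
        a ^= net_rand_noise;
        SIPROUND(a, b, c, d);
        net_rand_noise = d;
}
```

A call site would pass something like `add_noise((uint64_t)skb, (uint64_t)dev, (uint64_t)txq, jiffies)`; because each call chains the previous noise word, the resulting value depends on the whole history of calls, not just the last set of inputs.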
Link: https://lore.kernel.org/netdev/[email protected]/
Cc: George Spelvin <[email protected]>
Cc: Amit Klein <[email protected]>
Cc: Eric Dumazet <[email protected]>
Cc: "Jason A. Donenfeld" <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: [email protected]
Cc: Florian Westphal <[email protected]>
Cc: Marc Plumb <[email protected]>
Tested-by: Sedat Dilek <[email protected]>
Signed-off-by: Willy Tarreau <[email protected]>
---
include/linux/prandom.h | 19 +++++++++++++++++++
kernel/time/timer.c | 2 ++
lib/random32.c | 5 +++++
net/core/dev.c | 4 ++++
4 files changed, 30 insertions(+)
diff --git a/include/linux/prandom.h b/include/linux/prandom.h
index cc1e71334e53..aa7de3432e0f 100644
--- a/include/linux/prandom.h
+++ b/include/linux/prandom.h
@@ -16,6 +16,12 @@ void prandom_bytes(void *buf, size_t nbytes);
void prandom_seed(u32 seed);
void prandom_reseed_late(void);
+DECLARE_PER_CPU(unsigned long, net_rand_noise);
+
+#define PRANDOM_ADD_NOISE(a, b, c, d) \
+ prandom_u32_add_noise((unsigned long)(a), (unsigned long)(b), \
+ (unsigned long)(c), (unsigned long)(d))
+
#if BITS_PER_LONG == 64
/*
* The core SipHash round function. Each line can be executed in
@@ -50,6 +56,18 @@ void prandom_reseed_late(void);
#error Unsupported BITS_PER_LONG
#endif
+static inline void prandom_u32_add_noise(unsigned long a, unsigned long b,
+ unsigned long c, unsigned long d)
+{
+ /*
+ * This is not used cryptographically; it's just
+ * a convenient 4-word hash function. (3 xor, 2 add, 2 rol)
+ */
+ a ^= __this_cpu_read(net_rand_noise);
+ PRND_SIPROUND(a, b, c, d);
+ __this_cpu_write(net_rand_noise, d);
+}
+
struct rnd_state {
__u32 s1, s2, s3, s4;
};
@@ -99,6 +117,7 @@ static inline void prandom_seed_state(struct rnd_state *state, u64 seed)
state->s2 = __seed(i, 8U);
state->s3 = __seed(i, 16U);
state->s4 = __seed(i, 128U);
+ PRANDOM_ADD_NOISE(state, i, 0, 0);
}
/* Pseudo random number generator from numerical recipes. */
diff --git a/kernel/time/timer.c b/kernel/time/timer.c
index 401fcb9d7388..bebcf2fc1226 100644
--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -1704,6 +1704,8 @@ void update_process_times(int user_tick)
{
struct task_struct *p = current;
+ PRANDOM_ADD_NOISE(jiffies, user_tick, p, 0);
+
/* Note: this timer irq context must be accounted for as well. */
account_process_tick(p, user_tick);
run_local_timers();
diff --git a/lib/random32.c b/lib/random32.c
index 00fa925a4487..38db382a8cf5 100644
--- a/lib/random32.c
+++ b/lib/random32.c
@@ -324,6 +324,8 @@ struct siprand_state {
};
static DEFINE_PER_CPU(struct siprand_state, net_rand_state) __latent_entropy;
+DEFINE_PER_CPU(unsigned long, net_rand_noise);
+EXPORT_PER_CPU_SYMBOL(net_rand_noise);
/*
* This is the core CPRNG function. As "pseudorandom", this is not used
@@ -347,9 +349,12 @@ static DEFINE_PER_CPU(struct siprand_state, net_rand_state) __latent_entropy;
static inline u32 siprand_u32(struct siprand_state *s)
{
unsigned long v0 = s->v0, v1 = s->v1, v2 = s->v2, v3 = s->v3;
+ unsigned long n = __this_cpu_read(net_rand_noise);
+ v3 ^= n;
PRND_SIPROUND(v0, v1, v2, v3);
PRND_SIPROUND(v0, v1, v2, v3);
+ v0 ^= n;
s->v0 = v0; s->v1 = v1; s->v2 = v2; s->v3 = v3;
return v1 + v3;
}
diff --git a/net/core/dev.c b/net/core/dev.c
index b9c6f31ae96e..e075f7e0785a 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -144,6 +144,7 @@
#include <linux/indirect_call_wrapper.h>
#include <net/devlink.h>
#include <linux/pm_runtime.h>
+#include <linux/prandom.h>
#include "net-sysfs.h"
@@ -3557,6 +3558,7 @@ static int xmit_one(struct sk_buff *skb, struct net_device *dev,
dev_queue_xmit_nit(skb, dev);
len = skb->len;
+ PRANDOM_ADD_NOISE(skb, dev, txq, len + jiffies);
trace_net_dev_start_xmit(skb, dev);
rc = netdev_start_xmit(skb, dev, txq, more);
trace_net_dev_xmit(skb, rc, dev, len);
@@ -4129,6 +4131,7 @@ static int __dev_queue_xmit(struct sk_buff *skb, struct net_device *sb_dev)
if (!skb)
goto out;
+ PRANDOM_ADD_NOISE(skb, dev, txq, jiffies);
HARD_TX_LOCK(dev, txq, cpu);
if (!netif_xmit_stopped(txq)) {
@@ -4194,6 +4197,7 @@ int dev_direct_xmit(struct sk_buff *skb, u16 queue_id)
skb_set_queue_mapping(skb, queue_id);
txq = skb_get_tx_queue(dev, skb);
+ PRANDOM_ADD_NOISE(skb, dev, txq, jiffies);
local_bh_disable();
--
2.28.0
On 8/31/20 11:43 PM, Willy Tarreau wrote:
> With the removal of the interrupt perturbations in previous random32
> change (random32: make prandom_u32() output unpredictable), the PRNG
> has become 100% deterministic again. While SipHash is expected to be
> way more robust against brute force than the previous Tausworthe LFSR,
> there's still the risk that whoever has even one temporary access to
> the PRNG's internal state is able to predict all subsequent draws till
> the next reseed (roughly every minute). This may happen through a side
> channel attack or any data leak.
>
> This patch restores the spirit of commit f227e3ec3b5c ("random32: update
> the net random state on interrupt and activity") in that it will perturb
> the internal PRNG's state using externally collected noise, except that
> it will not pick that noise from the random pool's bits nor upon
> interrupt, but will rather combine a few elements along the Tx path
> that are collectively hard to predict, such as dev, skb and txq
> pointers, packet length and jiffies values. These ones are combined
> using a single round of SipHash into a single long variable that is
> mixed with the net_rand_state upon each invocation.
>
> The operation was inlined because it produces very small and efficient
> code, typically 3 xor, 2 add and 2 rol. The performance was measured
> to be the same (even very slightly better) than before the switch to
> SipHash; on a 6-core 12-thread Core i7-8700k equipped with a 40G NIC
> (i40e), the connection rate dropped from 556k/s to 555k/s while the
> SYN cookie rate grew from 5.38 Mpps to 5.45 Mpps.
>
> diff --git a/net/core/dev.c b/net/core/dev.c
> index b9c6f31ae96e..e075f7e0785a 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -144,6 +144,7 @@
> #include <linux/indirect_call_wrapper.h>
> #include <net/devlink.h>
> #include <linux/pm_runtime.h>
> +#include <linux/prandom.h>
>
> #include "net-sysfs.h"
>
> @@ -3557,6 +3558,7 @@ static int xmit_one(struct sk_buff *skb, struct net_device *dev,
> dev_queue_xmit_nit(skb, dev);
>
> len = skb->len;
> + PRANDOM_ADD_NOISE(skb, dev, txq, len + jiffies);
> trace_net_dev_start_xmit(skb, dev);
> rc = netdev_start_xmit(skb, dev, txq, more);
> trace_net_dev_xmit(skb, rc, dev, len);
> @@ -4129,6 +4131,7 @@ static int __dev_queue_xmit(struct sk_buff *skb, struct net_device *sb_dev)
> if (!skb)
> goto out;
>
> + PRANDOM_ADD_NOISE(skb, dev, txq, jiffies);
> HARD_TX_LOCK(dev, txq, cpu);
>
> if (!netif_xmit_stopped(txq)) {
> @@ -4194,6 +4197,7 @@ int dev_direct_xmit(struct sk_buff *skb, u16 queue_id)
>
> skb_set_queue_mapping(skb, queue_id);
> txq = skb_get_tx_queue(dev, skb);
> + PRANDOM_ADD_NOISE(skb, dev, txq, jiffies);
>
> local_bh_disable();
>
>
Hi Willy
There is not much entropy here really:
1) dev & txq are mostly constant on a typical host (at least the kind of hosts
that are targeted by Amit Klein and others in their attacks).
2) len is also known by the attacker, attacking an idle host.
3) skb are also allocations from the slab cache, which tends to recycle the
same pointers (on idle hosts).
4) jiffies might be incremented only every 4 ms (if HZ=250).
Maybe we could feed the percpu prandom noise with samples of ns-resolution
timestamps, lazily cached from ktime_get() or similar functions.
This would use one instruction on x86 to update the cache, and would provide
more generic noise.
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 4c47f388a83f17860fdafa3229bba0cc605ec25a..a3e026cbbb6e8c5499ed780e57de5fa09bc010b6 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -751,7 +751,7 @@ ktime_t ktime_get(void)
{
struct timekeeper *tk = &tk_core.timekeeper;
unsigned int seq;
- ktime_t base;
+ ktime_t res, base;
u64 nsecs;
WARN_ON(timekeeping_suspended);
@@ -763,7 +763,9 @@ ktime_t ktime_get(void)
} while (read_seqcount_retry(&tk_core.seq, seq));
- return ktime_add_ns(base, nsecs);
+ res = ktime_add_ns(base, nsecs);
+ __this_cpu_add(prandom_noise, (unsigned long)ktime_to_ns(res));
+ return res;
}
EXPORT_SYMBOL_GPL(ktime_get);
Hi Eric,
On Tue, Sep 01, 2020 at 12:24:38PM +0200, Eric Dumazet wrote:
> There is not much entropy here really :
>
> 1) dev & txq are mostly constant on a typical host (at least the kind of hosts that are targeted by
> Amit Klein and others in their attacks).
>
> 2) len is also known by the attacker, attacking an idle host.
>
> 3) skb are also allocations from slab cache, which tend to recycle always the same pointers (on idle hosts)
>
>
> 4) jiffies might be incremented every 4 ms (if HZ=250)
I know. The point is essentially that someone "remote", or with rare access
to the host's memory (e.g. in a VM sharing L1 on the same CPU, with some
CPU vulnerabilities), cannot synchronize with the PRNG and easily stay
synchronized forever. Otherwise I totally agree that these are pretty
weak. But in my opinion they are sufficient to turn a 100% success into
way less. I try not to forget that we're just trying to make a ~15-bit
port require ~2^14 attempts on average. Oh and by the way the number of
calls also counts here.
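This desynchronization argument can be illustrated with a small standalone sketch (hypothetical userspace code using the standard 64-bit SipHash round, structured like siprand_u32() in the patch): two generators starting from the same leaked state stay in lockstep until a single unknown noise word is mixed into one of them, after which their outputs diverge.

```c
#include <stdint.h>

#define ROL64(x, b) ((uint64_t)(((x) << (b)) | ((x) >> (64 - (b)))))

/* Standard 64-bit SipHash round (same shape as PRND_SIPROUND). */
#define SIPROUND(v0, v1, v2, v3) do {                                   \
        v0 += v1; v1 = ROL64(v1, 13); v1 ^= v0; v0 = ROL64(v0, 32);     \
        v2 += v3; v3 = ROL64(v3, 16); v3 ^= v2;                         \
        v0 += v3; v3 = ROL64(v3, 21); v3 ^= v0;                         \
        v2 += v1; v1 = ROL64(v1, 17); v1 ^= v2; v2 = ROL64(v2, 32);     \
} while (0)

struct sipstate { uint64_t v0, v1, v2, v3; };

/* Structured like siprand_u32() in the patch: the noise word is XORed
 * into v3 before the two rounds and into v0 afterwards. */
static uint64_t sip_next(struct sipstate *s, uint64_t noise)
{
        uint64_t v0 = s->v0, v1 = s->v1, v2 = s->v2, v3 = s->v3;

        v3 ^= noise;
        SIPROUND(v0, v1, v2, v3);
        SIPROUND(v0, v1, v2, v3);
        v0 ^= noise;
        s->v0 = v0; s->v1 = v1; s->v2 = v2; s->v3 = v3;
        return v1 + v3;
}
```

An attacker who captured the state once can replay every draw as long as noise stays at zero; a single noise word he did not observe leaves him permanently out of sync.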
> Maybe we could feed percpu prandom noise with samples of ns resolution timestamps,
> lazily cached from ktime_get() or similar functions.
>
> This would use one instruction on x86 to update the cache, with maybe more generic noise.
Sure! I think the principle here makes it easy to extend to various
places, and the more the better. Maybe we'll actually find that there
are plenty of sources of randomness that were not considered secure enough
to feed /dev/random while being perfectly fine for such use cases.
> diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
> index 4c47f388a83f17860fdafa3229bba0cc605ec25a..a3e026cbbb6e8c5499ed780e57de5fa09bc010b6 100644
> --- a/kernel/time/timekeeping.c
> +++ b/kernel/time/timekeeping.c
> @@ -751,7 +751,7 @@ ktime_t ktime_get(void)
> {
> struct timekeeper *tk = &tk_core.timekeeper;
> unsigned int seq;
> - ktime_t base;
> + ktime_t res, base;
> u64 nsecs;
>
> WARN_ON(timekeeping_suspended);
> @@ -763,7 +763,9 @@ ktime_t ktime_get(void)
>
> } while (read_seqcount_retry(&tk_core.seq, seq));
>
> - return ktime_add_ns(base, nsecs);
> + res = ktime_add_ns(base, nsecs);
> + __this_cpu_add(prandom_noise, (unsigned long)ktime_to_ns(res));
> + return res;
> }
> EXPORT_SYMBOL_GPL(ktime_get);
Actually it could even be nice to combine it with __builtin_return_address(0)
given the large number of callers this one has! But I generally agree with
your proposal.
Thanks,
Willy
On Tue, Sep 1, 2020 at 8:43 AM Willy Tarreau <[email protected]> wrote:
>
> This is the cleanup of the latest series of prandom_u32 experimentations
> consisting in using SipHash instead of Tausworthe to produce the randoms
> used by the network stack. The changes to the files were kept minimal,
> and the controversial commit that used to take noise from the fast_pool
> (f227e3ec3b5c) was reverted. Instead, a dedicated "net_rand_noise" per_cpu
> variable is fed from various sources of activities (networking, scheduling)
> to perturb the SipHash state using fast, non-trivially predictable data,
> instead of keeping it fully deterministic. The goal is essentially to make
> any occasional memory leakage or brute-force attempt useless.
>
> The resulting code was verified to be very slightly faster on x86_64 than
> what it was with the controversial commit above, though this remains barely
> above measurement noise. It was only build-tested on arm & arm64.
>
> George Spelvin (1):
> random32: make prandom_u32() output unpredictable
>
> Willy Tarreau (1):
> random32: add noise from network and scheduling activity
>
> drivers/char/random.c | 1 -
> include/linux/prandom.h | 55 ++++-
> kernel/time/timer.c | 9 +-
> lib/random32.c | 438 ++++++++++++++++++++++++----------------
> net/core/dev.c | 4 +
> 5 files changed, 326 insertions(+), 181 deletions(-)
>
> Cc: George Spelvin <[email protected]>
> Cc: Amit Klein <[email protected]>
> Cc: Eric Dumazet <[email protected]>
> Cc: "Jason A. Donenfeld" <[email protected]>
> Cc: Andy Lutomirski <[email protected]>
> Cc: Kees Cook <[email protected]>
> Cc: Thomas Gleixner <[email protected]>
> Cc: Peter Zijlstra <[email protected]>
> Cc: Linus Torvalds <[email protected]>
> Cc: [email protected]
> Cc: Florian Westphal <[email protected]>
> Cc: Marc Plumb <[email protected]>
> Cc: Sedat Dilek <[email protected]>
>
I have tested with the patchset from [1].
( Later I saw, you dropped "WIP: tcp: reuse incoming skb hash in
tcp_conn_request()". )
- Sedat -
[1] https://git.kernel.org/pub/scm/linux/kernel/git/wtarreau/prandom.git/log/?h=20200901-siphash-noise
> --
> 2.28.0
On Tue, Sep 01, 2020 at 04:41:13PM +0200, Sedat Dilek wrote:
> I have tested with the patchset from [1].
> ( Later I saw, you dropped "WIP: tcp: reuse incoming skb hash in
> tcp_conn_request()". )
Yes, because it's a bit out of the scope of this series and makes sense
even without these patches, so I assume Eric will take care of it
separately.
Willy
On Tue, Sep 1, 2020 at 4:55 PM Willy Tarreau <[email protected]> wrote:
>
> On Tue, Sep 01, 2020 at 04:41:13PM +0200, Sedat Dilek wrote:
> > I have tested with the patchset from [1].
> > ( Later I saw, you dropped "WIP: tcp: reuse incoming skb hash in
> > tcp_conn_request()". )
>
> Yes, because it's a bit out of the scope of this series and makes sense
> even without these patches, so I assume Eric will take care of it
> separately.
Yes, I am still pondering on this one and really there is no hurry.