Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S965126AbcDYVN7 (ORCPT ); Mon, 25 Apr 2016 17:13:59 -0400 Received: from mail-ob0-f175.google.com ([209.85.214.175]:35874 "EHLO mail-ob0-f175.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S964869AbcDYVN5 (ORCPT ); Mon, 25 Apr 2016 17:13:57 -0400 MIME-Version: 1.0 In-Reply-To: <20160425141046.d14466272ea246dd0374ea43@linux-foundation.org> References: <1461616763-60246-1-git-send-email-thgarnie@google.com> <20160425141046.d14466272ea246dd0374ea43@linux-foundation.org> Date: Mon, 25 Apr 2016 14:13:56 -0700 Message-ID: Subject: Re: [PATCH v2] mm: SLAB freelist randomization From: Thomas Garnier To: Andrew Morton Cc: Christoph Lameter , Pekka Enberg , David Rientjes , Joonsoo Kim , Kees Cook , Greg Thelen , Laura Abbott , kernel-hardening@lists.openwall.com, LKML , Linux-MM Content-Type: text/plain; charset=UTF-8 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 5197 Lines: 153 On Mon, Apr 25, 2016 at 2:10 PM, Andrew Morton wrote: > On Mon, 25 Apr 2016 13:39:23 -0700 Thomas Garnier wrote: > >> Provides an optional config (CONFIG_FREELIST_RANDOM) to randomize the >> SLAB freelist. The list is randomized during initialization of a new set >> of pages. The order on different freelist sizes is pre-computed at boot >> for performance. Each kmem_cache has its own randomized freelist except >> early on boot where global lists are used. This security feature reduces >> the predictability of the kernel SLAB allocator against heap overflows >> rendering attacks much less stable. >> >> For example this attack against SLUB (also applicable against SLAB) >> would be affected: >> https://jon.oberheide.org/blog/2010/09/10/linux-kernel-can-slub-overflow/ >> >> Also, since v4.6 the freelist was moved at the end of the SLAB. It means >> a controllable heap is opened to new attacks not yet publicly discussed. >> A kernel heap overflow can be transformed to multiple use-after-free. >> This feature makes this type of attack harder too. >> >> To generate entropy, we use get_random_bytes_arch because 0 bits of >> entropy is available in the boot stage. In the worse case this function >> will fallback to the get_random_bytes sub API. We also generate a shift >> random number to shift pre-computed freelist for each new set of pages. >> >> The config option name is not specific to the SLAB as this approach will >> be extended to other allocators like SLUB. >> >> Performance results highlighted no major changes: >> >> slab_test 1 run on boot. Difference only seen on the 2048 size test >> being the worse case scenario covered by freelist randomization. New >> slab pages are constantly being created on the 10000 allocations. >> Variance should be mainly due to getting new pages every few >> allocations. >> >> ... >> >> --- a/include/linux/slab_def.h >> +++ b/include/linux/slab_def.h >> @@ -80,6 +80,10 @@ struct kmem_cache { >> struct kasan_cache kasan_info; >> #endif >> >> +#ifdef CONFIG_FREELIST_RANDOM > > CONFIG_FREELIST_RANDOM bugs me a bit - "freelist" is so vague. > CONFIG_SLAB_FREELIST_RANDOM would be better. I mean, what Kconfig > identifier could be used for implementing randomisation in > slub/slob/etc once CONFIG_FREELIST_RANDOM is used up? > >> + void *random_seq; >> +#endif >> + >> struct kmem_cache_node *node[MAX_NUMNODES]; >> }; >> >> diff --git a/init/Kconfig b/init/Kconfig >> index 0c66640..73453d0 100644 >> --- a/init/Kconfig >> +++ b/init/Kconfig >> @@ -1742,6 +1742,15 @@ config SLOB >> >> endchoice >> >> +config FREELIST_RANDOM >> + default n >> + depends on SLAB >> + bool "SLAB freelist randomization" >> + help >> + Randomizes the freelist order used on creating new SLABs. This >> + security feature reduces the predictability of the kernel slab >> + allocator against heap overflows. >> + >> config SLUB_CPU_PARTIAL >> default y >> depends on SLUB && SMP >> diff --git a/mm/slab.c b/mm/slab.c >> index b82ee6b..89eb617 100644 >> --- a/mm/slab.c >> +++ b/mm/slab.c >> @@ -116,6 +116,7 @@ >> #include >> #include >> #include >> +#include >> >> #include >> >> @@ -1230,6 +1231,100 @@ static void __init set_up_node(struct kmem_cache *cachep, int index) >> } >> } >> >> +#ifdef CONFIG_FREELIST_RANDOM >> +static void freelist_randomize(struct rnd_state *state, freelist_idx_t *list, >> + size_t count) >> +{ >> + size_t i; >> + unsigned int rand; >> + >> + for (i = 0; i < count; i++) >> + list[i] = i; >> + >> + /* Fisher-Yates shuffle */ >> + for (i = count - 1; i > 0; i--) { >> + rand = prandom_u32_state(state); >> + rand %= (i + 1); >> + swap(list[i], list[rand]); >> + } >> +} >> + >> +/* Create a random sequence per cache */ >> +static void cache_random_seq_create(struct kmem_cache *cachep) >> +{ >> + unsigned int seed, count = cachep->num; >> + struct rnd_state state; >> + >> + if (count < 2) >> + return; >> + >> + cachep->random_seq = kcalloc(count, sizeof(freelist_idx_t), GFP_KERNEL); >> + BUG_ON(cachep->random_seq == NULL); On your previous email. (trying to stay in one thread). I added a comment on this version to explain that we need best entropy at this boot stage. > > Yikes, that's a bit rude. Is there no way of recovering from this? If > the answer to that is really really "no" then I guess we should put a > __GFP_NOFAIL in there. Add a comment explaining why (apologetically - > __GFP_NOFAIL is unpopular!) and remove the now-unneeded BUG_ON. > > We can always use the static. I will update on next iteration to remove the BUG_ON. >> + /* Get best entropy at this stage */ >> + get_random_bytes_arch(&seed, sizeof(seed)); > > See concerns in other email - isn't this a no-op if CONFIG_ARCH_RANDOM=n? > > >> + prandom_seed_state(&state, seed); >> + >> + freelist_randomize(&state, cachep->random_seq, count); >> +} >> + >