Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752704AbeADKbI (ORCPT + 1 other); Thu, 4 Jan 2018 05:31:08 -0500 Received: from mx1.redhat.com ([209.132.183.28]:35106 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752159AbeADKbG (ORCPT ); Thu, 4 Jan 2018 05:31:06 -0500 Date: Thu, 4 Jan 2018 18:30:57 +0800 From: Baoquan He To: Chao Fan Cc: linux-kernel@vger.kernel.org, x86@kernel.org, hpa@zytor.com, tglx@linutronix.de, mingo@redhat.com, keescook@chromium.org, yasu.isimatu@gmail.com, indou.takao@jp.fujitsu.com, caoj.fnst@cn.fujitsu.com, douly.fnst@cn.fujitsu.com, lcapitulino@redhat.com Subject: Re: [PATCH v5 1/4] kaslr: add immovable_mem=nn[KMG]@ss[KMG] to specify extracting memory Message-ID: <20180104103057.GC7235@x1> References: <20180104080219.23893-1-fanc.fnst@cn.fujitsu.com> <20180104080219.23893-2-fanc.fnst@cn.fujitsu.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20180104080219.23893-2-fanc.fnst@cn.fujitsu.com> User-Agent: Mutt/1.7.0 (2016-08-17) X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.38]); Thu, 04 Jan 2018 10:31:06 +0000 (UTC) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Return-Path: On 01/04/18 at 04:02pm, Chao Fan wrote: > In current code, kaslr may choose the memory region in movable > nodes to extract kernel, which will make the nodes can't be hot-removed. > To solve it, we can specify the memory region in immovable node. > Create immovable_mem to store the regions in immovable_mem, where should > be chosen by kaslr. > > Also change the "handle_mem_memmap" to "handle_mem_filter", since > it will not only handle memmap parameter now. > Since "immovable_mem=" only works with "movable_node", so "immovable_mem=" > doesn't work alone. If specify "movable_node" without "immovable_mem=", > disable KASLR. > > Multiple regions can be specified, comma delimited. > Considering the usage of memory, only support for 4 regions. > 4 regions contains 2 nodes at least, enough for kernel to extract. > > Signed-off-by: Chao Fan Hi Chao, Thanks for your effort on this issue. Luiz told me they met a hugetlb issue when kaslr enabled on kvm guest. Please check the below bug information. There's only one available position which hugepage can use to allocate. In this case, if we have a generic parameter to tell kernel where we can randomize into, this hugepage issue can be solved. We can restrict kernel to randomize beyond [0x40000000, 0x7fffffff]. Not sure if your immovable_mem=nn[KMG]@ss[KMG] can be adjusted to do this. I am hesitating on whether we should change this or not. Hi maintainers, Kees, 1) Let's keep Chao's current code, just ask Luiz to use nokaslr to work around the hugepage allocation failure; 2) Change immovable_mem=nn[KMG]@ss[KMG] to be a generic parameter, people can use it to restrict kernel to places they want. Which one is better? Or any other idea or suggestion? ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Luiz said on kvm guest, they add the following to the kernel command-line: default_hugepagesz=1G hugepagesz=1G hugepages=1 Boot the guest and check number of 1GB pages reserved: grep HugePages_Total /proc/meminfo When booting with "nokaslr" HugePages_Total is always 1. When booting without "nokaslr" sometimes HugePages_Total is zero (that is, reserving the 1GB page fails). And 20 reboots, 6 failures to mount single 1G hugepage. I reproduced on kvm guest with Luiz's help, and found it's because there's only one available position for 1G hugepage allocation, [0x40000000, 0x7fffffff]. That's why they saw 1/4 possibility of failure. If kernel randomized to [0x0, 0x3fffffff], [0x80000000, 0xbffdffff], or [0x100000000, 0x13fffffff], hugepage can always succeed to allocate. dmesg output snippet of kvm guest: [ +0.000000] e820: BIOS-provided physical RAM map: [ +0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009fbff] usable [ +0.000000] BIOS-e820: [mem 0x000000000009fc00-0x000000000009ffff] reserved [ +0.000000] BIOS-e820: [mem 0x00000000000f0000-0x00000000000fffff] reserved [ +0.000000] BIOS-e820: [mem 0x0000000000100000-0x00000000bffdffff] usable [ +0.000000] BIOS-e820: [mem 0x00000000bffe0000-0x00000000bfffffff] reserved [ +0.000000] BIOS-e820: [mem 0x00000000feffc000-0x00000000feffffff] reserved [ +0.000000] BIOS-e820: [mem 0x00000000fffc0000-0x00000000ffffffff] reserved [ +0.000000] BIOS-e820: [mem 0x0000000100000000-0x000000013fffffff] usable Thanks Baoquan > --- > arch/x86/boot/compressed/kaslr.c | 112 +++++++++++++++++++++++++++++++++++++-- > 1 file changed, 109 insertions(+), 3 deletions(-) > > diff --git a/arch/x86/boot/compressed/kaslr.c b/arch/x86/boot/compressed/kaslr.c > index 8199a6187251..60e5aa28b510 100644 > --- a/arch/x86/boot/compressed/kaslr.c > +++ b/arch/x86/boot/compressed/kaslr.c > @@ -108,6 +108,19 @@ enum mem_avoid_index { > > static struct mem_vector mem_avoid[MEM_AVOID_MAX]; > > +#ifdef CONFIG_MEMORY_HOTPLUG > +/* Only supporting at most 4 immovable memory regions with kaslr */ > +#define MAX_IMMOVABLE_MEM 4 > + > +static bool lack_immovable_mem; > + > +/* Store the memory regions in immovable node */ > +static struct mem_vector immovable_mem[MAX_IMMOVABLE_MEM]; > + > +/* The immovable regions user specify, not more than 4 */ > +static int num_immovable_region; > +#endif > + > static bool mem_overlaps(struct mem_vector *one, struct mem_vector *two) > { > /* Item one is entirely before item two. */ > @@ -206,15 +219,97 @@ static void mem_avoid_memmap(char *str) > memmap_too_large = true; > } > > -static int handle_mem_memmap(void) > +#ifdef CONFIG_MEMORY_HOTPLUG > +static int parse_immovable_mem(char *p, > + unsigned long long *start, > + unsigned long long *size) > +{ > + char *oldp; > + > + if (!p) > + return -EINVAL; > + > + oldp = p; > + *size = memparse(p, &p); > + if (p == oldp) > + return -EINVAL; > + > + switch (*p) { > + case '@': > + *start = memparse(p + 1, &p); > + return 0; > + default: > + /* > + * If w/o offset, only size specified, immovable_mem=nn[KMG] > + * has the same behaviour as immovable_mem=nn[KMG]@0. It means > + * the region starts from 0. > + */ > + *start = 0; > + return 0; > + } > + > + return -EINVAL; > +} > + > +static void parse_immovable_mem_regions(char *str) > +{ > + static int i; > + > + while (str && (i < MAX_IMMOVABLE_MEM)) { > + int rc; > + unsigned long long start, size; > + char *k = strchr(str, ','); > + > + if (k) > + *k++ = 0; > + > + rc = parse_immovable_mem(str, &start, &size); > + if (rc < 0) > + break; > + str = k; > + > + immovable_mem[i].start = start; > + immovable_mem[i].size = size; > + i++; > + } > + num_immovable_region = i; > +} > +#else > +static inline void parse_immovable_mem_regions(char *str) > +{ > +} > +#endif > + > +static int handle_mem_filter(void) > { > char *args = (char *)get_cmd_line_ptr(); > size_t len = strlen((char *)args); > + bool enable_movable_node = false; > char *tmp_cmdline; > char *param, *val; > u64 mem_size; > > - if (!strstr(args, "memmap=") && !strstr(args, "mem=")) > +#ifdef CONFIG_MEMORY_HOTPLUG > + if (strstr(args, "movable_node")) { > + /* > + * Confirm "movable_node" specified, otherwise > + * "immovable_mem=" doesn't work. > + */ > + enable_movable_node = true; > + > + /* > + * If only specify "movable_node" without "immovable_mem=", > + * disable KASLR. > + */ > + if (!strstr(args, "immovable_mem=")) { > + lack_immovable_mem = true; > + return 0; > + } > + } > +#endif > + > + if (!strstr(args, "memmap=") && !strstr(args, "mem=") && > + !enable_movable_node) > return 0; > > tmp_cmdline = malloc(len + 1); > @@ -239,6 +334,9 @@ static int handle_mem_memmap(void) > > if (!strcmp(param, "memmap")) { > mem_avoid_memmap(val); > + } else if (!strcmp(param, "immovable_mem=") && > + enable_movable_node) { > + parse_immovable_mem_regions(val); > } else if (!strcmp(param, "mem")) { > char *p = val; > > @@ -378,7 +476,7 @@ static void mem_avoid_init(unsigned long input, unsigned long input_size, > /* We don't need to set a mapping for setup_data. */ > > /* Mark the memmap regions we need to avoid */ > - handle_mem_memmap(); > + handle_mem_filter(); > > #ifdef CONFIG_X86_VERBOSE_BOOTUP > /* Make sure video RAM can be used. */ > @@ -673,6 +771,14 @@ static unsigned long find_random_phys_addr(unsigned long minimum, > return 0; > } > > +#ifdef CONFIG_MEMORY_HOTPLUG > + /* Check if specify "movable_node" without "immovable_mem=". */ > + if (lack_immovable_mem) { > + debug_putstr("Fail KASLR when movable_node specified without immovable_mem=.\n"); > + return 0; > + } > +#endif > + > /* Make sure minimum is aligned. */ > minimum = ALIGN(minimum, CONFIG_PHYSICAL_ALIGN); > > -- > 2.14.3 > > >