Subject: Re: [PATCH] x86/boot/KASLR: Extend movable_node option for KASLR
To: YASUAKI ISHIMATSU, Baoquan He
CC: Chao Fan
From: Dou Liyang
Date: Wed, 9 Aug 2017 22:44:55 +0800
In-Reply-To: <0bd0153e-225d-a0a3-2b9c-f85082bb9477@gmail.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Hi YASUAKI,

[...]

>>>>>>
>>>>>> We boot up the kernel with 4 nodes:
>>>>>>
>>>>>> node 0 size: 1024 MB immovable
>>>>>> node 1 size: 1024 MB movable
>>>>>> node 2 size: 1024 MB movable
>>>>>> node 3 size: 1024 MB movable
>>>>>>
>>>>>> If we use "mem=1024M" on the command line, we can use only 1G of
>>>>>> memory. But actually, we should normally have 4G.
>>>>>
>>>>> So do you have an assumption about the order of immovable nodes and
>>>>> movable nodes? E.g. in your example of nodes above, the immovable nodes
>>>>> have to be at the lowest addresses. Is this required by the current
>>>>> hot-plug memory code?
>>>>>
>>>>
>>>> Wow! So great. It seems this is required by the hot-plug memory code.
>>>>
>>>> Yesterday, I tested the patch in QEMU with 4 nodes, and each time I
>>>> used a different node as the immovable node. But no matter which node
>>>> I used, the immovable nodes always had the lowest address.
>>>>
>>>> I am not familiar with memory. I am investigating this, and I am going
>>>> to apply for a physical machine with movable nodes to check. :)
>>>
>>
>> Cc YASUAKI ISHIMATSU
>>
>> Could you give us some help?
>>
>>> Great, thanks for your effort. I asked because this question confuses me,
>>> and I know FJ has focused on the memory hot-plug implementation and
>>> continues working on it, so it must be easier for you to consult your
>>> co-workers who have worked on this. For a normal kernel, it seems that
>>> the normal zone has to be on an immovable node, namely node0. But what if
>>> people modified the bootloader to locate the kernel onto the last node
>>> and configured the EFI firmware to make the last node un-hot-pluggable?
>>> I believe both of these can be done. Is this allowed? Does memory
>>> hot-plug have a requirement about the order of immovable nodes? And how
>>> many immovable nodes can we have? I have slides that FJ published, but
>>> didn't find info about these.
>
> I read your patch, and I think what Baoquan wrote is right. The patch takes
> care of only your server. As he wrote, if a server wants to make the last
> node immovable, the patch cannot handle such a configuration.
>

Thanks for your review. That is reasonable; I will keep it in mind.

But I am not sure: when we boot up a system with the following 4 nodes,
does the BIOS (ACPI firmware) map the immovable node RAM from the lowest
address first?

node 0 size: 1024 MB immovable
node 1 size: 1024 MB movable
node 2 size: 1024 MB movable
node 3 size: 1024 MB immovable

The order of the physical RAM maps may be node 0, 3, 1, 2.

Thanks,
dou.

> Thanks,
> Yasuaki Ishimatsu
>
>>>
>>
>> Thanks,
>> dou.
>>
>>>>
>>>>>>
>>>>>> The above is also one reason for not using 'mem=' directly. The
>>>>>> other reasons are:
>>>>>>
>>>>>> 1). Each kernel option has its own role; we'd better not misuse them.
>>>>>> 2). movable_node is used as a boot-time switch to make nodes movable
>>>>>> or not, so it should consider every situation, such as KASLR.
>>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>> dou.
>>>>>>
>>>>>>>>
>>>>>>>> On Thu, Aug 03, 2017 at 08:17:21PM +0800, Dou Liyang wrote:
>>>>>>>>> movable_node is a boot-time switch to make hot-pluggable memory
>>>>>>>>> NUMA nodes movable. This option is based on the assumption
>>>>>>>>> that any node in which the kernel resides is defined as
>>>>>>>>> un-hotpluggable. Linux allocates memory near the kernel image
>>>>>>>>> to try its best to keep the kernel away from hotpluggable memory
>>>>>>>>> in the same NUMA node, so other nodes can be movable.
>>>>>>>>>
>>>>>>>>> But KASLR doesn't know which node is un-hotpluggable: all
>>>>>>>>> hotpluggable memory ranges are recorded in the ACPI SRAT table,
>>>>>>>>> but SRAT is not parsed at that stage. So KASLR may randomize the
>>>>>>>>> kernel into a movable node, which will then be immovable.
>>>>>>>>>
>>>>>>>>> Extend the movable_node option to restrict the kernel to be
>>>>>>>>> randomized in immovable nodes by adding a parameter. This
>>>>>>>>> parameter sets up the boundaries between the movable nodes and
>>>>>>>>> immovable nodes.
>>>>>
>>>>> And here you mention boundaries, meaning not only one boundary, so how
>>>>> do you handle the case where movable nodes and immovable nodes are
>>>>> placed alternately?
>>>>>
>>>>> I mean, are you sure the current hot-plug memory code requires that the
>>>>> immovable node be the first node and that there is only one immovable
>>>>> node, or that there are several immovable nodes but they are the first
>>>>> few nodes?
>>>>>
>>>>> If yes, then this patch looks good to me, and I would like to ack it.
>>>>>
>>>>> Thanks
>>>>> Baoquan
>>>>>
>>>>>>>>>
>>>>>>>>> Reported-by: Chao Fan
>>>>>>>>> Signed-off-by: Dou Liyang
>>>>>>>>> ---
>>>>>>>>>  Documentation/admin-guide/kernel-parameters.txt | 11 +++++++++--
>>>>>>>>>  arch/x86/boot/compressed/kaslr.c                | 19 ++++++++++++++++---
>>>>>>>>>  2 files changed, 25 insertions(+), 5 deletions(-)
>>>>>>>>>
>>>>>>>>> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
>>>>>>>>> index d9c171c..44c7e33 100644
>>>>>>>>> --- a/Documentation/admin-guide/kernel-parameters.txt
>>>>>>>>> +++ b/Documentation/admin-guide/kernel-parameters.txt
>>>>>>>>> @@ -2305,7 +2305,8 @@
>>>>>>>>>  	mousedev.yres=	[MOUSE] Vertical screen resolution, used for devices
>>>>>>>>>  			reporting absolute coordinates, such as tablets
>>>>>>>>>
>>>>>>>>> -	movablecore=nn[KMG]	[KNL,X86,IA-64,PPC] This parameter
>>>>>>>>> +	movablecore=nn[KMG]
>>>>>>>>> +			[KNL,X86,IA-64,PPC] This parameter
>>>>>>>>>  			is similar to kernelcore except it specifies the
>>>>>>>>>  			amount of memory used for migratable allocations.
>>>>>>>>>  			If both kernelcore and movablecore is specified,
>>>>>>>>> @@ -2315,12 +2316,18 @@
>>>>>>>>>  			that the amount of memory usable for all allocations
>>>>>>>>>  			is not too small.
>>>>>>>>>
>>>>>>>>> -	movable_node	[KNL] Boot-time switch to make hotplugable memory
>>>>>>>>> +	movable_node	[KNL] Boot-time switch to make hot-pluggable memory
>>>>>>>>>  			NUMA nodes to be movable. This means that the memory
>>>>>>>>>  			of such nodes will be usable only for movable
>>>>>>>>>  			allocations which rules out almost all kernel
>>>>>>>>>  			allocations. Use with caution!
>>>>>>>>>
>>>>>>>>> +	movable_node=nn[KMG]
>>>>>>>>> +			[KNL] Extend movable_node to work well with KASLR. This
>>>>>>>>> +			parameter is the boundaries between the movable nodes
>>>>>>>>> +			and immovable nodes, the memory which exceeds it will
>>>>>>>>> +			be regarded as hot-pluggable.
>>>>>>>>> +
>>>>>>>>>  	MTD_Partition=	[MTD]
>>>>>>>>>  			Format: ,,,
>>>>>>>>>
>>>>>>>>> diff --git a/arch/x86/boot/compressed/kaslr.c b/arch/x86/boot/compressed/kaslr.c
>>>>>>>>> index 91f27ab..7e2351b 100644
>>>>>>>>> --- a/arch/x86/boot/compressed/kaslr.c
>>>>>>>>> +++ b/arch/x86/boot/compressed/kaslr.c
>>>>>>>>> @@ -89,7 +89,10 @@ struct mem_vector {
>>>>>>>>>  static bool memmap_too_large;
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> -/* Store memory limit specified by "mem=nn[KMG]" or "memmap=nn[KMG]" */
>>>>>>>>> +/*
>>>>>>>>> + * Store memory limit specified by the following situations:
>>>>>>>>> + * "mem=nn[KMG]" or "memmap=nn[KMG]" or "movable_node=nn[KMG]"
>>>>>>>>> + */
>>>>>>>>>  unsigned long long mem_limit = ULLONG_MAX;
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> @@ -212,7 +215,8 @@ static int handle_mem_memmap(void)
>>>>>>>>>  	char *param, *val;
>>>>>>>>>  	u64 mem_size;
>>>>>>>>>
>>>>>>>>> -	if (!strstr(args, "memmap=") && !strstr(args, "mem="))
>>>>>>>>> +	if (!strstr(args, "memmap=") && !strstr(args, "mem=") &&
>>>>>>>>> +	    !strstr(args, "movable_node="))
>>>>>>>>>  		return 0;
>>>>>>>>>
>>>>>>>>>  	tmp_cmdline = malloc(len + 1);
>>>>>>>>> @@ -247,7 +251,16 @@ static int handle_mem_memmap(void)
>>>>>>>>>  				free(tmp_cmdline);
>>>>>>>>>  				return -EINVAL;
>>>>>>>>>  			}
>>>>>>>>> -			mem_limit = mem_size;
>>>>>>>>> +			mem_limit = mem_limit > mem_size ? mem_size : mem_limit;
>>>>>>>>> +		} else if (!strcmp(param, "movable_node")) {
>>>>>>>>> +			char *p = val;
>>>>>>>>> +
>>>>>>>>> +			mem_size = memparse(p, &p);
>>>>>>>>> +			if (mem_size == 0) {
>>>>>>>>> +				free(tmp_cmdline);
>>>>>>>>> +				return -EINVAL;
>>>>>>>>> +			}
>>>>>>>>> +			mem_limit = mem_limit > mem_size ? mem_size : mem_limit;
>>>>>>>>>  		}
>>>>>>>>>  	}
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> 2.5.5
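
To make the single-boundary question in this thread concrete, here is a small C sketch. It is only an illustration under assumptions: struct node_range, limit_avoids_movable() and immovable_above_limit() are hypothetical names that do not exist in the kernel, and the node layouts fed to them would be the hand-written examples from this thread, not ranges read from the ACPI SRAT table. It checks whether one limit, such as the value given to the proposed movable_node=nn[KMG], keeps every address below it out of hot-pluggable memory, and how much immovable memory that limit leaves unused.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one NUMA node's physical memory range. */
struct node_range {
	uint64_t start;     /* physical start address in bytes */
	uint64_t size;      /* size in bytes */
	bool hotpluggable;  /* true if the node is movable */
};

/*
 * Return true if no hot-pluggable range overlaps [0, limit), i.e. placing
 * the kernel anywhere below limit cannot land it in a movable node.
 */
static bool limit_avoids_movable(const struct node_range *nodes, size_t n,
				 uint64_t limit)
{
	assert(nodes != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (nodes[i].hotpluggable && nodes[i].start < limit)
			return false;
	}
	return true;
}

/*
 * Return the number of bytes of immovable memory at or above limit.
 * This memory would be safe for the kernel, but a single limit excludes it.
 */
static uint64_t immovable_above_limit(const struct node_range *nodes,
				      size_t n, uint64_t limit)
{
	uint64_t excluded = 0;

	assert(nodes != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		uint64_t end = nodes[i].start + nodes[i].size;
		uint64_t from;

		if (nodes[i].hotpluggable || end <= limit)
			continue;
		/* Count only the part of the range that lies above limit. */
		from = nodes[i].start > limit ? nodes[i].start : limit;
		excluded += end - from;
	}
	return excluded;
}
```

If the four 1G nodes above are mapped in node order (0, 1, 2, 3), a 1G limit stays clear of movable memory but excludes the 1G of immovable memory on node 3. If the firmware maps them as 0, 3, 1, 2, a 2G limit covers both immovable nodes and excludes nothing. A layout where movable and immovable nodes alternate is the case a single limit cannot describe fully.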