movable_node is a boot-time switch that makes hot-pluggable memory
NUMA nodes movable. The option assumes that any node in which the
kernel resides is un-hotpluggable. Linux allocates memory near the
kernel image to keep the kernel away from hot-pluggable memory in
the same NUMA node, so the other nodes can be movable.
However, KASLR doesn't know which nodes are un-hotpluggable: all
hot-pluggable memory ranges are recorded in the ACPI SRAT table, and
SRAT is not parsed at that point. So KASLR may place the kernel in a
movable node, which then becomes immovable.
Extend the movable_node option with a parameter that restricts KASLR
to immovable nodes. This parameter sets up the boundaries between
the movable nodes and the immovable nodes.
Reported-by: Chao Fan <[email protected]>
Signed-off-by: Dou Liyang <[email protected]>
---
Documentation/admin-guide/kernel-parameters.txt | 11 +++++++++--
arch/x86/boot/compressed/kaslr.c | 19 ++++++++++++++++---
2 files changed, 25 insertions(+), 5 deletions(-)
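For reviewers, a minimal user-space sketch of the limit handling this
patch introduces. It is an illustration under stated assumptions, not
the code in kaslr.c: parse_size() is a hypothetical stand-in for the
kernel's memparse(), and clamp_limit() is an illustrative name for the
"keep the smaller value" update that the patch applies to mem_limit.

```c
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical stand-in for the kernel's memparse(): parse a number
 * followed by an optional K, M or G suffix and return the size in bytes.
 */
static uint64_t parse_size(const char *s)
{
	char *end;
	uint64_t v = strtoull(s, &end, 0);

	switch (*end) {
	case 'G': case 'g':
		v <<= 10;
		/* fall through */
	case 'M': case 'm':
		v <<= 10;
		/* fall through */
	case 'K': case 'k':
		v <<= 10;
		break;
	default:
		break;
	}
	return v;
}

/*
 * Illustrative helper: keep the smallest limit seen so far, so that
 * several limiting options on one command line combine to the
 * tightest bound instead of the last one winning.
 */
static uint64_t clamp_limit(uint64_t cur, uint64_t size)
{
	return cur > size ? size : cur;
}
```

Starting from UINT64_MAX (the sketch's equivalent of ULLONG_MAX), feeding
"mem=4G" and then "movable_node=1G" through clamp_limit() leaves a 1G
limit regardless of the order in which the options appear.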
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index d9c171c..44c7e33 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -2305,7 +2305,8 @@
mousedev.yres= [MOUSE] Vertical screen resolution, used for devices
reporting absolute coordinates, such as tablets
- movablecore=nn[KMG] [KNL,X86,IA-64,PPC] This parameter
+ movablecore=nn[KMG]
+ [KNL,X86,IA-64,PPC] This parameter
is similar to kernelcore except it specifies the
amount of memory used for migratable allocations.
If both kernelcore and movablecore is specified,
@@ -2315,12 +2316,18 @@
that the amount of memory usable for all allocations
is not too small.
- movable_node [KNL] Boot-time switch to make hotplugable memory
+ movable_node [KNL] Boot-time switch to make hot-pluggable memory
NUMA nodes to be movable. This means that the memory
of such nodes will be usable only for movable
allocations which rules out almost all kernel
allocations. Use with caution!
+ movable_node=nn[KMG]
+ [KNL] Extend movable_node to work well with KASLR. This
+ parameter is the boundaries between the movable nodes
+ and immovable nodes, the memory which exceeds it will
+ be regarded as hot-pluggable.
+
MTD_Partition= [MTD]
Format: <name>,<region-number>,<size>,<offset>
diff --git a/arch/x86/boot/compressed/kaslr.c b/arch/x86/boot/compressed/kaslr.c
index 91f27ab..7e2351b 100644
--- a/arch/x86/boot/compressed/kaslr.c
+++ b/arch/x86/boot/compressed/kaslr.c
@@ -89,7 +89,10 @@ struct mem_vector {
static bool memmap_too_large;
-/* Store memory limit specified by "mem=nn[KMG]" or "memmap=nn[KMG]" */
+/*
+ * Store memory limit specified by the following situations:
+ * "mem=nn[KMG]" or "memmap=nn[KMG]" or "movable_node=nn[KMG]"
+ */
unsigned long long mem_limit = ULLONG_MAX;
@@ -212,7 +215,8 @@ static int handle_mem_memmap(void)
char *param, *val;
u64 mem_size;
- if (!strstr(args, "memmap=") && !strstr(args, "mem="))
+ if (!strstr(args, "memmap=") && !strstr(args, "mem=") &&
+ !strstr(args, "movable_node="))
return 0;
tmp_cmdline = malloc(len + 1);
@@ -247,7 +251,16 @@ static int handle_mem_memmap(void)
free(tmp_cmdline);
return -EINVAL;
}
- mem_limit = mem_size;
+ mem_limit = mem_limit > mem_size ? mem_size : mem_limit;
+ } else if (!strcmp(param, "movable_node")) {
+ char *p = val;
+
+ mem_size = memparse(p, &p);
+ if (mem_size == 0) {
+ free(tmp_cmdline);
+ return -EINVAL;
+ }
+ mem_limit = mem_limit > mem_size ? mem_size : mem_limit;
}
}
--
2.5.5
It's almost another "mem=".
Thanks,
Chao Fan
On Thu, Aug 3, 2017 at 5:17 AM, Dou Liyang <[email protected]> wrote:
> movable_node is a boot-time switch to make hot-pluggable memory
> NUMA nodes to be movable. This option is based on an assumption
> that any node which the kernel resides in is defined as
> un-hotpluggable. Linux can allocates memory near the kernel image
> to try the best to keep the kernel away from hotpluggable memory
> in the same NUMA node. So other nodes can be movable.
>
> But, KASLR doesn't know which node is un-hotpluggable, the all
> hotpluggable memory ranges is recorded in ACPI SRAT table, SRAT
> is not parsed. So, KASLR may randomize the kernel in a movable
> node which will be immovable.
>
> Extend movable_node option to restrict kernel to be randomized in
> immovable nodes by adding a parameter. this parameter sets up
> the boundaries between the movable nodes and immovable nodes.
>
> Reported-by: Chao Fan <[email protected]>
> Signed-off-by: Dou Liyang <[email protected]>
This seems reasonable to me. Thanks for the fix!
Reviewed-by: Kees Cook <[email protected]>
--
Kees Cook
Pixel Security
On 08/03/17 at 08:24pm, Chao Fan wrote:
> It's almost another "mem=".
Then why not use 'mem=' directly?
Hi Chao, Baoquan,
At 08/04/2017 07:49 AM, Baoquan He wrote:
> On 08/03/17 at 08:24pm, Chao Fan wrote:
>> It's almost another "mem=".
>
No, it is different.
See Documentation/kernel-parameters:
"mem=" will force usage of a specific amount of memory and kernel will
not see the whole system memory.
But "movable_node=" will not do that.
> Then why not using 'mem=' directly?
>
Before answering this question, let's first discuss why users want to
replace "mem=" with "movable_node" when they want to support NUMA node
hot-plug.
I guess the real reason is this:
When booting the system, we should have the whole memory, not just
the un-hotpluggable memory that "mem=" restricts us to, e.g.:
we boot the kernel with 4 nodes:
node 0 size: 1024 MB immovable
node 1 size: 1024 MB movable
node 2 size: 1024 MB movable
node 3 size: 1024 MB movable
If we use "mem=1024M" on the command line, we can only use 1G of memory.
But normally we should have 4G.
The above is also one reason for not using 'mem=' directly. The other
reasons are as follows:
1). each kernel option has its own role, we'd better misuse them.
2). movable_node is used as a boot-time switch to make nodes movable
or not, so it should take every situation into account, such as KASLR.
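The difference between the two options in the 4-node example can be
sketched as a simplified model. This is not kernel code: usable_bytes()
and kaslr_limit() are hypothetical names, a value of 0 stands for
"option not given", and the model assumes, as stated in this mail, that
"movable_node=" only narrows the range KASLR chooses from while "mem="
also caps the memory the kernel sees.

```c
#include <stdint.h>

#define MB (1024ULL * 1024ULL)

/*
 * Memory the running kernel can use. In this model only "mem="
 * (mem_opt, 0 if absent) reduces it.
 */
static uint64_t usable_bytes(uint64_t total, uint64_t mem_opt)
{
	return (mem_opt && mem_opt < total) ? mem_opt : total;
}

/*
 * Upper bound for KASLR placement. Both "mem=" and "movable_node="
 * (movable_node_opt, 0 if absent) can lower it; the smaller one wins.
 */
static uint64_t kaslr_limit(uint64_t total, uint64_t mem_opt,
			    uint64_t movable_node_opt)
{
	uint64_t limit = usable_bytes(total, mem_opt);

	if (movable_node_opt && movable_node_opt < limit)
		limit = movable_node_opt;
	return limit;
}
```

With 4096 MB installed, "mem=1024M" gives 1024 MB usable and a 1024 MB
KASLR bound, while "movable_node=1024M" keeps all 4096 MB usable and
still bounds KASLR at 1024 MB.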
Thanks,
dou.
Hi Kees,
At 08/04/2017 06:40 AM, Kees Cook wrote:
> On Thu, Aug 3, 2017 at 5:17 AM, Dou Liyang <[email protected]> wrote:
>> movable_node is a boot-time switch to make hot-pluggable memory
>> NUMA nodes to be movable. This option is based on an assumption
>> that any node which the kernel resides in is defined as
>> un-hotpluggable. Linux can allocates memory near the kernel image
>> to try the best to keep the kernel away from hotpluggable memory
>> in the same NUMA node. So other nodes can be movable.
>>
>> But, KASLR doesn't know which node is un-hotpluggable, the all
>> hotpluggable memory ranges is recorded in ACPI SRAT table, SRAT
>> is not parsed. So, KASLR may randomize the kernel in a movable
>> node which will be immovable.
>>
>> Extend movable_node option to restrict kernel to be randomized in
>> immovable nodes by adding a parameter. this parameter sets up
>> the boundaries between the movable nodes and immovable nodes.
>>
>> Reported-by: Chao Fan <[email protected]>
>> Signed-off-by: Dou Liyang <[email protected]>
>
> This seems reasonable to me. Thanks for the fix!
>
It's my pleasure!
> Reviewed-by: Kees Cook <[email protected]>
>
Thanks for reviewing.
Thanks,
dou.
On 08/04/17 at 09:37am, Dou Liyang wrote:
> Hi Chao,Baoquan
>
> At 08/04/2017 07:49 AM, Baoquan He wrote:
> > On 08/03/17 at 08:24pm, Chao Fan wrote:
> > > It's almost another "mem=".
> >
>
> No, it is different.
>
> See Documentation/kernel-parameters:
>
> "mem=" will force usage of a specific amount of memory and kernel will
> not see the whole system memory.
>
> But "movable_node=" will not do that.
>
>
> > Then why not using 'mem=' directly?
> >
>
> Before answer this question, let's first discuss why the users want to
> replace "mem=" with "movable_node" when they hope to support NUMA node
> hot-plug.
>
> I guess the real reason is that:
>
> When booting up the system, We should have the whole memory not just
> the un-hotpluggable memory which restrict by "mem=", eg:
>
> we boot up kernel with 4 node:
>
> node 0 size: 1024 MB immovable
> node 1 size: 1024 MB movable
> node 2 size: 1024 MB movable
> node 3 size: 1024 MB movable
>
> If we use "mem=1024M" in the command line, we just can use 1G memory.
> But actually, we should have 4G normally.
So are you making an assumption about the order of immovable and movable
nodes? E.g., in your example above, the immovable nodes have to be at the
lowest addresses. Is this required by the current memory hot-plug code?
>
> Above is also one reason for why not using 'mem=' directly. Following
> is other reasons:
>
> 1). each kernel option has its own role, we'd better misuse them.
> 2). movable_node is used as a boot-time switch to make nodes movable
> or not, it should consider any situations, such as KASLR.
>
>
> Thanks,
> dou.
>
> > >
> > > On Thu, Aug 03, 2017 at 08:17:21PM +0800, Dou Liyang wrote:
> > > > movable_node is a boot-time switch to make hot-pluggable memory
> > > > NUMA nodes to be movable. This option is based on an assumption
> > > > that any node which the kernel resides in is defined as
> > > > un-hotpluggable. Linux can allocates memory near the kernel image
> > > > to try the best to keep the kernel away from hotpluggable memory
> > > > in the same NUMA node. So other nodes can be movable.
> > > >
> > > > But, KASLR doesn't know which node is un-hotpluggable, the all
> > > > hotpluggable memory ranges is recorded in ACPI SRAT table, SRAT
> > > > is not parsed. So, KASLR may randomize the kernel in a movable
> > > > node which will be immovable.
> > > >
> > > > Extend movable_node option to restrict kernel to be randomized in
> > > > immovable nodes by adding a parameter. this parameter sets up
> > > > the boundaries between the movable nodes and immovable nodes.
And here you mention boundaries, which means more than one boundary, so
how do you handle the case where movable nodes and immovable nodes
alternate?
I mean, are you sure the current memory hot-plug code requires that the
immovable node be the first node and that there be only one immovable
node, or, if there are several immovable nodes, that they be the first
few nodes?
If yes, then this patch looks good to me, and I would like to ack it.
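To make the concern concrete, here is a small model of a single boundary
applied to a node layout. It is only a sketch: struct node_range and both
function names are hypothetical, the layout in the note below is made up,
and the ranges are assumed to be sorted by start address and
non-overlapping.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define GB (1ULL << 30)

/* Hypothetical description of one node's memory range. */
struct node_range {
	uint64_t start;
	uint64_t size;
	bool hotpluggable;
};

/*
 * Largest single boundary that keeps the kernel out of hot-pluggable
 * memory: the start of the lowest hot-pluggable range.
 */
static uint64_t safe_boundary(const struct node_range *r, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (r[i].hotpluggable)
			return r[i].start;
	return UINT64_MAX;	/* nothing is hot-pluggable */
}

/*
 * Immovable bytes that lie at or above the boundary. A single limit
 * cannot offer them to KASLR without also exposing movable memory.
 */
static uint64_t immovable_lost(const struct node_range *r, size_t n,
			       uint64_t boundary)
{
	uint64_t lost = 0;

	for (size_t i = 0; i < n; i++)
		if (!r[i].hotpluggable && r[i].start >= boundary)
			lost += r[i].size;
	return lost;
}
```

For an alternating layout (immovable 0-1G, movable 1G-2G, immovable
2G-3G, movable 3G-4G), safe_boundary() yields 1G and immovable_lost()
reports 1G of immovable memory that a single boundary has to give up.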
Thanks
Baoquan
On Fri, Aug 04, 2017 at 09:37:14AM +0800, Dou Liyang wrote:
>Hi Chao,Baoquan
>
>At 08/04/2017 07:49 AM, Baoquan He wrote:
>> On 08/03/17 at 08:24pm, Chao Fan wrote:
>> > It's almost another "mem=".
>>
>
>No, it is different.
>
>See Documentation/kernel-parameters:
>
>"mem=" will force usage of a specific amount of memory and kernel will
>not see the whole system memory.
>
>But "movable_node=" will not do that.
>
>
>> Then why not using 'mem=' directly?
>>
>
>Before answer this question, let's first discuss why the users want to
>replace "mem=" with "movable_node" when they hope to support NUMA node
>hot-plug.
>
>I guess the real reason is that:
>
>When booting up the system, We should have the whole memory not just
>the un-hotpluggable memory which restrict by "mem=", eg:
>
>we boot up kernel with 4 node:
>
>node 0 size: 1024 MB immovable
>node 1 size: 1024 MB movable
>node 2 size: 1024 MB movable
>node 3 size: 1024 MB movable
>
>If we use "mem=1024M" in the command line, we just can use 1G memory.
>But actually, we should have 4G normally.
>
>Above is also one reason for why not using 'mem=' directly. Following
>is other reasons:
>
Hi Dou,
>1). each kernel option has its own role, we'd better misuse them.
I guess you mean "we'd better not misuse them"
>2). movable_node is used as a boot-time switch to make nodes movable
>or not, it should consider any situations, such as KASLR.
Yes. As I understand it, you will keep both "movable_node" and
"movable_node=nn[KMG]" as kernel options, right?
If so, what will happen when only "movable_node" is specified,
without "movable_node=nn[KMG]"?
If I have misunderstood, please let me know.
Thanks,
Chao Fan
>
>
>Thanks,
> dou.
>
Hi Baoquan,
At 08/04/2017 10:00 AM, Baoquan He wrote:
> On 08/04/17 at 09:37am, Dou Liyang wrote:
>> Hi Chao,Baoquan
>>
>> At 08/04/2017 07:49 AM, Baoquan He wrote:
>>> On 08/03/17 at 08:24pm, Chao Fan wrote:
>>>> It's almost another "mem=".
>>>
>>
>> No, it is different.
>>
>> See Documentation/kernel-parameters:
>>
>> "mem=" will force usage of a specific amount of memory and kernel will
>> not see the whole system memory.
>>
>> But "movable_node=" will not do that.
>>
>>
>>> Then why not using 'mem=' directly?
>>>
>>
>> Before answer this question, let's first discuss why the users want to
>> replace "mem=" with "movable_node" when they hope to support NUMA node
>> hot-plug.
>>
>> I guess the real reason is that:
>>
>> When booting up the system, We should have the whole memory not just
>> the un-hotpluggable memory which restrict by "mem=", eg:
>>
>> we boot up kernel with 4 node:
>>
>> node 0 size: 1024 MB immovable
>> node 1 size: 1024 MB movable
>> node 2 size: 1024 MB movable
>> node 3 size: 1024 MB movable
>>
>> If we use "mem=1024M" in the command line, we just can use 1G memory.
>> But actually, we should have 4G normally.
>
> So do you have assumption on the order of immovable nodes and movable
> nodes? E.g above your example of nodes, immovable nodes have to be the
> lowest address. Is this required by the current hot-plug memory code?
>
Wow, great! It seems this is required by the hot-plug memory code.
Yesterday I tested the patch in QEMU with 4 nodes, each time choosing a
different node as the immovable node. No matter which node I chose,
the immovable node always had the lowest address.
I am not familiar with the memory code. I am investigating this, and I am
going to apply for a physical machine with movable nodes to check. :)
Thanks,
dou.
>>
>> Above is also one reason for why not using 'mem=' directly. Following
>> is other reasons:
>>
>> 1). each kernel option has its own role, we'd better misuse them.
>> 2). movable_node is used as a boot-time switch to make nodes movable
>> or not, it should consider any situations, such as KASLR.
>>
>>
>> Thanks,
>> dou.
>>
>>>>
>>>> On Thu, Aug 03, 2017 at 08:17:21PM +0800, Dou Liyang wrote:
>>>>> movable_node is a boot-time switch to make hot-pluggable memory
>>>>> NUMA nodes to be movable. This option is based on an assumption
>>>>> that any node which the kernel resides in is defined as
>>>>> un-hotpluggable. Linux can allocates memory near the kernel image
>>>>> to try the best to keep the kernel away from hotpluggable memory
>>>>> in the same NUMA node. So other nodes can be movable.
>>>>>
>>>>> But, KASLR doesn't know which node is un-hotpluggable, the all
>>>>> hotpluggable memory ranges is recorded in ACPI SRAT table, SRAT
>>>>> is not parsed. So, KASLR may randomize the kernel in a movable
>>>>> node which will be immovable.
>>>>>
>>>>> Extend movable_node option to restrict kernel to be randomized in
>>>>> immovable nodes by adding a parameter. this parameter sets up
>>>>> the boundaries between the movable nodes and immovable nodes.
>
> And here you mentioned boundaries, means not only one boundary, so how
> do you handle the case movable nodes and immovable nodes alternate to be
> placed?
>
> I mean, are you sure the current hot-plug memory code require immovable
> node has to be the first node and there's only one immovable node or
> there are several immovable node but they are the first few nodes?
>
> If yes, then this patch looks good to me, I would like to ack it.
>
> Thanks
> Baoquan
>
Hi Chao,
At 08/04/2017 10:01 AM, Chao Fan wrote:
> On Fri, Aug 04, 2017 at 09:37:14AM +0800, Dou Liyang wrote:
>> Hi Chao,Baoquan
>>
>> At 08/04/2017 07:49 AM, Baoquan He wrote:
>>> On 08/03/17 at 08:24pm, Chao Fan wrote:
>>>> It's almost another "mem=".
>>>
>>
>> No, it is different.
>>
>> See Documentation/kernel-parameters:
>>
>> "mem=" will force usage of a specific amount of memory and kernel will
>> not see the whole system memory.
>>
>> But "movable_node=" will not do that.
>>
>>
>>> Then why not using 'mem=' directly?
>>>
>>
>> Before answer this question, let's first discuss why the users want to
>> replace "mem=" with "movable_node" when they hope to support NUMA node
>> hot-plug.
>>
>> I guess the real reason is that:
>>
>> When booting up the system, We should have the whole memory not just
>> the un-hotpluggable memory which restrict by "mem=", eg:
>>
>> we boot up kernel with 4 node:
>>
>> node 0 size: 1024 MB immovable
>> node 1 size: 1024 MB movable
>> node 2 size: 1024 MB movable
>> node 3 size: 1024 MB movable
>>
>> If we use "mem=1024M" in the command line, we just can use 1G memory.
>> But actually, we should have 4G normally.
>>
>> Above is also one reason for why not using 'mem=' directly. Following
>> is other reasons:
>>
> Hi Dou,
>
>> 1). each kernel option has its own role, we'd better misuse them.
>
> I guess you mean "we'd better not misuse them"
>
Oops, yes, thanks!
>> 2). movable_node is used as a boot-time switch to make nodes movable
>> or not, it should consider any situations, such as KASLR.
>
> Yes, then in my understanding, as for this issue, you will leave both
> "movable_node" and "movable_node=nn[KMG]" in kernel option, right?
Yes, both.
> If so, then what will happen when only "movable_node" specified
> without "movable_node=nn[KMG]"?
If the system does not support KASLR and has movable nodes, people can
use movable_node directly or use movable_node=nn[KMG], but the
value "nn" will be useless.
If the system supports both KASLR and movable nodes, please use
movable_node=nn[KMG] instead of movable_node.
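
As a rough illustration of the behaviour described above, here is a
user-space sketch (not the kaslr.c code itself). The names parse_size()
and clamp_limit() are hypothetical; parse_size() only stands in for the
kernel's memparse() and handles just the K/M/G suffixes. It shows that a
bare "movable_node" leaves the limit untouched, while "mem=" and
"movable_node=" each lower it to the smaller value, as the patch does
with mem_limit.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for memparse(): a number plus an optional K/M/G suffix. */
static uint64_t parse_size(const char *s)
{
	char *end;
	uint64_t value = strtoull(s, &end, 0);
	unsigned int shift = 0;

	switch (*end) {
	case 'K': case 'k': shift = 10; break;
	case 'M': case 'm': shift = 20; break;
	case 'G': case 'g': shift = 30; break;
	default: break;
	}
	return value << shift;
}

/* True when the first len bytes of word are exactly the parameter name. */
static int name_is(const char *word, size_t len, const char *name)
{
	return strlen(name) == len && strncmp(word, name, len) == 0;
}

/*
 * Return the limit after looking at one command-line word.
 * Only "mem=nn" and "movable_node=nn" lower the limit; a bare
 * "movable_node" and any unrelated parameter leave it unchanged.
 */
static uint64_t clamp_limit(uint64_t limit, const char *word)
{
	const char *eq;
	size_t len;
	uint64_t size;

	assert(word != NULL);
	eq = strchr(word, '=');
	if (!eq)
		return limit;		/* bare switch: no boundary given */

	len = (size_t)(eq - word);
	if (!name_is(word, len, "mem") && !name_is(word, len, "movable_node"))
		return limit;

	size = parse_size(eq + 1);
	if (size == 0)
		return limit;		/* the patch returns -EINVAL here */

	return size < limit ? size : limit;
}
```

Starting from UINT64_MAX, clamp_limit(limit, "movable_node") keeps the
limit as it is, and clamp_limit(limit, "movable_node=1G") lowers it to
1 GiB. The sketch models only the limit seen by KASLR; it does not model
the other effect of "mem=", which also restricts the memory the kernel
uses.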
Thanks,
dou.
> If I misunderstand it, please let me know.
>
> Thanks,
> Chao Fan
>
>>
>>
>> Thanks,
>> dou.
>>
On 08/04/17 at 10:42am, Dou Liyang wrote:
> Hi Baoquan,
>
> At 08/04/2017 10:00 AM, Baoquan He wrote:
> > On 08/04/17 at 09:37am, Dou Liyang wrote:
> > > Hi Chao,Baoquan
> > >
> > > At 08/04/2017 07:49 AM, Baoquan He wrote:
> > > > On 08/03/17 at 08:24pm, Chao Fan wrote:
> > > > > It's almost another "mem=".
> > > >
> > >
> > > No, it is different.
> > >
> > > See Documentation/kernel-parameters:
> > >
> > > "mem=" will force usage of a specific amount of memory and kernel will
> > > not see the whole system memory.
> > >
> > > But "movable_node=" will not do that.
> > >
> > >
> > > > Then why not using 'mem=' directly?
> > > >
> > >
> > > Before answer this question, let's first discuss why the users want to
> > > replace "mem=" with "movable_node" when they hope to support NUMA node
> > > hot-plug.
> > >
> > > I guess the real reason is that:
> > >
> > > When booting up the system, We should have the whole memory not just
> > > the un-hotpluggable memory which restrict by "mem=", eg:
> > >
> > > we boot up kernel with 4 node:
> > >
> > > node 0 size: 1024 MB immovable
> > > node 1 size: 1024 MB movable
> > > node 2 size: 1024 MB movable
> > > node 3 size: 1024 MB movable
> > >
> > > If we use "mem=1024M" in the command line, we just can use 1G memory.
> > > But actually, we should have 4G normally.
> >
> > So do you have assumption on the order of immovable nodes and movable
> > nodes? E.g above your example of nodes, immovable nodes have to be the
> > lowest address. Is this required by the current hot-plug memory code?
> >
>
> Wow! So great, It seems this is required by the hot-plug memory code.
>
> yesterday, I tested the patch in Qemu with 4 node and each time I
> used different node as immovable node. But no matter what node I used,
> the immovable nodes always had the lowest address.
>
> I am not familiar with memory, I am investigating this and I am going
> to apply for a physical machine with movable nodes to check. :)
Great, thanks for your effort. I asked because this question confuses me,
and I know FJ has focused on the memory hot-plug implementation and
continues to work on it, so it must be easier for you to consult your
co-workers who have worked on this. For a normal kernel, it seems that
the normal zone has to be on an immovable node, namely node 0. But what if
people modify the bootloader to place the kernel on the last node and
configure the EFI firmware to make the last node un-hotpluggable? I believe
both of these can be done. Is this allowed? Does memory hot-plug have a
requirement on the order of immovable nodes? And how many immovable
nodes can we have? I have some slides that FJ published, but I didn't find
any information about these points in them.
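
To make the concern concrete, here is a small user-space sketch. It is
only an illustration under assumptions: struct node_range and
single_boundary() are hypothetical names, not kernel code. It checks
whether a node layout can be described by one "memory above this address
is hot-pluggable" boundary, which is all a single movable_node=nn[KMG]
value can express.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one NUMA node's physical memory range. */
struct node_range {
	uint64_t start;
	uint64_t size;
	bool hotpluggable;
};

/*
 * nodes[] must be sorted by ascending start address.
 * Returns true, and stores the end of the leading immovable run in
 * *boundary, when every immovable range lies below every hot-pluggable
 * one. Returns false when an immovable range follows a hot-pluggable
 * range, since one boundary cannot describe that layout.
 */
static bool single_boundary(const struct node_range *nodes, size_t count,
			    uint64_t *boundary)
{
	bool seen_hotpluggable = false;
	uint64_t end = 0;
	size_t i;

	assert(nodes != NULL && boundary != NULL);

	for (i = 0; i < count; i++) {
		if (nodes[i].hotpluggable)
			seen_hotpluggable = true;
		else if (seen_hotpluggable)
			return false;	/* immovable node above a movable one */
		else
			end = nodes[i].start + nodes[i].size;
	}

	*boundary = end;
	return true;
}
```

With the four 1024 MB nodes from the example and node 0 immovable, the
boundary is 1 GiB. If instead only the last node is immovable, the
function returns false, which is the case asked about above.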
>
> > >
> > > Above is also one reason for why not using 'mem=' directly. Following
> > > is other reasons:
> > >
> > > 1). each kernel option has its own role, we'd better misuse them.
> > > 2). movable_node is used as a boot-time switch to make nodes movable
> > > or not, it should consider any situations, such as KASLR.
> > >
> > >
> > > Thanks,
> > > dou.
> > >
> > > > >
> > > > > On Thu, Aug 03, 2017 at 08:17:21PM +0800, Dou Liyang wrote:
> > > > > > movable_node is a boot-time switch to make hot-pluggable memory
> > > > > > NUMA nodes to be movable. This option is based on an assumption
> > > > > > that any node which the kernel resides in is defined as
> > > > > > un-hotpluggable. Linux can allocates memory near the kernel image
> > > > > > to try the best to keep the kernel away from hotpluggable memory
> > > > > > in the same NUMA node. So other nodes can be movable.
> > > > > >
> > > > > > But, KASLR doesn't know which node is un-hotpluggable, the all
> > > > > > hotpluggable memory ranges is recorded in ACPI SRAT table, SRAT
> > > > > > is not parsed. So, KASLR may randomize the kernel in a movable
> > > > > > node which will be immovable.
> > > > > >
> > > > > > Extend movable_node option to restrict kernel to be randomized in
> > > > > > immovable nodes by adding a parameter. this parameter sets up
> > > > > > the boundaries between the movable nodes and immovable nodes.
> >
> > And here you mentioned boundaries, means not only one boundary, so how
> > do you handle the case movable nodes and immovable nodes alternate to be
> > placed?
> >
> > I mean, are you sure the current hot-plug memory code require immovable
> > node has to be the first node and there's only one immovable node or
> > there are several immovable node but they are the first few nodes?
> >
> > If yes, then this patch looks good to me, I would like to ack it.
> >
> > Thanks
> > Baoquan
> >
On Fri, Aug 04, 2017 at 10:52:45AM +0800, Dou Liyang wrote:
>Hi chao
>
>At 08/04/2017 10:01 AM, Chao Fan wrote:
>> On Fri, Aug 04, 2017 at 09:37:14AM +0800, Dou Liyang wrote:
>> > Hi Chao,Baoquan
>> >
>> > At 08/04/2017 07:49 AM, Baoquan He wrote:
>> > > On 08/03/17 at 08:24pm, Chao Fan wrote:
>> > > > It's almost another "mem=".
>> > >
>> >
>> > No, it is different.
>> >
>> > See Documentation/kernel-parameters:
>> >
>> > "mem=" will force usage of a specific amount of memory and kernel will
>> > not see the whole system memory.
>> >
>> > But "movable_node=" will not do that.
>> >
>> >
>> > > Then why not using 'mem=' directly?
>> > >
>> >
>> > Before answer this question, let's first discuss why the users want to
>> > replace "mem=" with "movable_node" when they hope to support NUMA node
>> > hot-plug.
>> >
>> > I guess the real reason is that:
>> >
>> > When booting up the system, We should have the whole memory not just
>> > the un-hotpluggable memory which restrict by "mem=", eg:
>> >
>> > we boot up kernel with 4 node:
>> >
>> > node 0 size: 1024 MB immovable
>> > node 1 size: 1024 MB movable
>> > node 2 size: 1024 MB movable
>> > node 3 size: 1024 MB movable
>> >
>> > If we use "mem=1024M" in the command line, we just can use 1G memory.
>> > But actually, we should have 4G normally.
>> >
>> > Above is also one reason for why not using 'mem=' directly. Following
>> > is other reasons:
>> >
>> Hi Dou,
>>
>> > 1). each kernel option has its own role, we'd better misuse them.
>>
>> I guess you mean "we'd better not misuse them"
>>
>
>Oops, yes, thanks!
>
>> > 2). movable_node is used as a boot-time switch to make nodes movable
>> > or not, it should consider any situations, such as KASLR.
>>
>> Yes, then in my understanding, as for this issue, you will leave both
>> "movable_node" and "movable_node=nn[KMG]" in kernel option, right?
>
>Yes, Both.
>
>> If so, then what will happen when only "movable_node" specified
>> without "movable_node=nn[KMG]"?
>
>If the system does not support KASLR and has movable node, people can
>use movable_node directly or use movable_node=nn[KMG], but the
>parameter "nn" will useless.
>
>If the system supports both KASLR and movable node, please use
>movable_node=nn[KMG] instead of movable_node.
Hi Dou,
Many thanks for your explanation. I got it: with both KASLR and
the movable_node feature in use, you suggest that users specify
"movable_node=nn[KMG]" rather than "movable_node".
Thanks,
Chao Fan
>
>Thanks,
> dou.
>
>> If I misunderstand it, please let me know.
>>
>> Thanks,
>> Chao Fan
>>
>> >
>> >
>> > Thanks,
>> > dou.
>> >
>> > > >
>> > > > On Thu, Aug 03, 2017 at 08:17:21PM +0800, Dou Liyang wrote:
>> > > > > movable_node is a boot-time switch to make hot-pluggable memory
>> > > > > NUMA nodes to be movable. This option is based on an assumption
>> > > > > that any node which the kernel resides in is defined as
>> > > > > un-hotpluggable. Linux can allocates memory near the kernel image
>> > > > > to try the best to keep the kernel away from hotpluggable memory
>> > > > > in the same NUMA node. So other nodes can be movable.
>> > > > >
>> > > > > But, KASLR doesn't know which node is un-hotpluggable, the all
>> > > > > hotpluggable memory ranges is recorded in ACPI SRAT table, SRAT
>> > > > > is not parsed. So, KASLR may randomize the kernel in a movable
>> > > > > node which will be immovable.
>> > > > >
>> > > > > Extend movable_node option to restrict kernel to be randomized in
>> > > > > immovable nodes by adding a parameter. this parameter sets up
>> > > > > the boundaries between the movable nodes and immovable nodes.
>> > > > >
>> > > > > Reported-by: Chao Fan <[email protected]>
>> > > > > Signed-off-by: Dou Liyang <[email protected]>
>> > > > > ---
>> > > > > Documentation/admin-guide/kernel-parameters.txt | 11 +++++++++--
>> > > > > arch/x86/boot/compressed/kaslr.c | 19 ++++++++++++++++---
>> > > > > 2 files changed, 25 insertions(+), 5 deletions(-)
>> > > > >
>> > > > > diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
>> > > > > index d9c171c..44c7e33 100644
>> > > > > --- a/Documentation/admin-guide/kernel-parameters.txt
>> > > > > +++ b/Documentation/admin-guide/kernel-parameters.txt
>> > > > > @@ -2305,7 +2305,8 @@
>> > > > > mousedev.yres= [MOUSE] Vertical screen resolution, used for devices
>> > > > > reporting absolute coordinates, such as tablets
>> > > > >
>> > > > > - movablecore=nn[KMG] [KNL,X86,IA-64,PPC] This parameter
>> > > > > + movablecore=nn[KMG]
>> > > > > + [KNL,X86,IA-64,PPC] This parameter
>> > > > > is similar to kernelcore except it specifies the
>> > > > > amount of memory used for migratable allocations.
>> > > > > If both kernelcore and movablecore is specified,
>> > > > > @@ -2315,12 +2316,18 @@
>> > > > > that the amount of memory usable for all allocations
>> > > > > is not too small.
>> > > > >
>> > > > > - movable_node [KNL] Boot-time switch to make hotplugable memory
>> > > > > + movable_node [KNL] Boot-time switch to make hot-pluggable memory
>> > > > > NUMA nodes to be movable. This means that the memory
>> > > > > of such nodes will be usable only for movable
>> > > > > allocations which rules out almost all kernel
>> > > > > allocations. Use with caution!
>> > > > >
>> > > > > + movable_node=nn[KMG]
>> > > > > + [KNL] Extend movable_node to work well with KASLR. This
>> > > > > + parameter is the boundaries between the movable nodes
>> > > > > + and immovable nodes, the memory which exceeds it will
>> > > > > + be regarded as hot-pluggable.
>> > > > > +
>> > > > > MTD_Partition= [MTD]
>> > > > > Format: <name>,<region-number>,<size>,<offset>
>> > > > >
>> > > > > diff --git a/arch/x86/boot/compressed/kaslr.c b/arch/x86/boot/compressed/kaslr.c
>> > > > > index 91f27ab..7e2351b 100644
>> > > > > --- a/arch/x86/boot/compressed/kaslr.c
>> > > > > +++ b/arch/x86/boot/compressed/kaslr.c
>> > > > > @@ -89,7 +89,10 @@ struct mem_vector {
>> > > > > static bool memmap_too_large;
>> > > > >
>> > > > >
>> > > > > -/* Store memory limit specified by "mem=nn[KMG]" or "memmap=nn[KMG]" */
>> > > > > +/*
>> > > > > + * Store memory limit specified by the following situations:
>> > > > > + * "mem=nn[KMG]" or "memmap=nn[KMG]" or "movable_node=nn[KMG]"
>> > > > > + */
>> > > > > unsigned long long mem_limit = ULLONG_MAX;
>> > > > >
>> > > > >
>> > > > > @@ -212,7 +215,8 @@ static int handle_mem_memmap(void)
>> > > > > char *param, *val;
>> > > > > u64 mem_size;
>> > > > >
>> > > > > - if (!strstr(args, "memmap=") && !strstr(args, "mem="))
>> > > > > + if (!strstr(args, "memmap=") && !strstr(args, "mem=") &&
>> > > > > + !strstr(args, "movable_node="))
>> > > > > return 0;
>> > > > >
>> > > > > tmp_cmdline = malloc(len + 1);
>> > > > > @@ -247,7 +251,16 @@ static int handle_mem_memmap(void)
>> > > > > free(tmp_cmdline);
>> > > > > return -EINVAL;
>> > > > > }
>> > > > > - mem_limit = mem_size;
>> > > > > + mem_limit = mem_limit > mem_size ? mem_size : mem_limit;
>> > > > > + } else if (!strcmp(param, "movable_node")) {
>> > > > > + char *p = val;
>> > > > > +
>> > > > > + mem_size = memparse(p, &p);
>> > > > > + if (mem_size == 0) {
>> > > > > + free(tmp_cmdline);
>> > > > > + return -EINVAL;
>> > > > > + }
>> > > > > + mem_limit = mem_limit > mem_size ? mem_size : mem_limit;
>> > > > > }
>> > > > > }
>> > > > >
>> > > > > --
>> > > > > 2.5.5
>> > > > >
>> > > >
>> > > >
>> > >
>> > >
>> > >
>>
At 08/04/2017 10:55 AM, Baoquan He wrote:
> On 08/04/17 at 10:42am, Dou Liyang wrote:
>> Hi Baoquan,
>>
>> At 08/04/2017 10:00 AM, Baoquan He wrote:
>>> On 08/04/17 at 09:37am, Dou Liyang wrote:
>>>> Hi Chao,Baoquan
>>>>
>>>> At 08/04/2017 07:49 AM, Baoquan He wrote:
>>>>> On 08/03/17 at 08:24pm, Chao Fan wrote:
>>>>>> It's almost another "mem=".
>>>>>
>>>>
>>>> No, it is different.
>>>>
>>>> See Documentation/kernel-parameters:
>>>>
>>>> "mem=" will force usage of a specific amount of memory and kernel will
>>>> not see the whole system memory.
>>>>
>>>> But "movable_node=" will not do that.
>>>>
>>>>
>>>>> Then why not using 'mem=' directly?
>>>>>
>>>>
>>>> Before answer this question, let's first discuss why the users want to
>>>> replace "mem=" with "movable_node" when they hope to support NUMA node
>>>> hot-plug.
>>>>
>>>> I guess the real reason is that:
>>>>
>>>> When booting up the system, We should have the whole memory not just
>>>> the un-hotpluggable memory which restrict by "mem=", eg:
>>>>
>>>> we boot up kernel with 4 node:
>>>>
>>>> node 0 size: 1024 MB immovable
>>>> node 1 size: 1024 MB movable
>>>> node 2 size: 1024 MB movable
>>>> node 3 size: 1024 MB movable
>>>>
>>>> If we use "mem=1024M" in the command line, we just can use 1G memory.
>>>> But actually, we should have 4G normally.
>>>
>>> So do you have assumption on the order of immovable nodes and movable
>>> nodes? E.g above your example of nodes, immovable nodes have to be the
>>> lowest address. Is this required by the current hot-plug memory code?
>>>
>>
>> Wow! So great, It seems this is required by the hot-plug memory code.
>>
>> yesterday, I tested the patch in Qemu with 4 node and each time I
>> used different node as immovable node. But no matter what node I used,
>> the immovable nodes always had the lowest address.
>>
>> I am not familiar with memory, I am investigating this and I am going
>> to apply for a physical machine with movable nodes to check. :)
>
Cc: YASUAKI ISHIMATSU
Could you give us some help?
> Great, thanks for your effort. I asked because this question confuses me
> and I know FJ ever focusd on the memory hot-plug implementation and
> continue working on that, it must be easier for you to consult your
> co-workers who ever worked on this. For normal kernel, seems it has
> to be that normal zone is on immovable node, namely node0. But what if
> people modified bootloader to locate kernel onto the last node and
> configure efi firmware to make the last node un-hot-plugable? I believe
> both of these can be done. Is this allowed? memory hot-plug has a
> requirement about the order of immovable node? And how many immovable
> nodes can we have? I have an slides FJ published, didn't find info about
> these.
>
Thanks,
dou.
>>
>>>>
>>>> Above is also one reason for why not using 'mem=' directly. Following
>>>> is other reasons:
>>>>
>>>> 1). each kernel option has its own role, we'd better misuse them.
>>>> 2). movable_node is used as a boot-time switch to make nodes movable
>>>> or not, it should consider any situations, such as KASLR.
>>>>
>>>>
>>>> Thanks,
>>>> dou.
>>>>
>>>>>>
>>>>>> On Thu, Aug 03, 2017 at 08:17:21PM +0800, Dou Liyang wrote:
>>>>>>> movable_node is a boot-time switch to make hot-pluggable memory
>>>>>>> NUMA nodes to be movable. This option is based on an assumption
>>>>>>> that any node which the kernel resides in is defined as
>>>>>>> un-hotpluggable. Linux can allocates memory near the kernel image
>>>>>>> to try the best to keep the kernel away from hotpluggable memory
>>>>>>> in the same NUMA node. So other nodes can be movable.
>>>>>>>
>>>>>>> But, KASLR doesn't know which node is un-hotpluggable, the all
>>>>>>> hotpluggable memory ranges is recorded in ACPI SRAT table, SRAT
>>>>>>> is not parsed. So, KASLR may randomize the kernel in a movable
>>>>>>> node which will be immovable.
>>>>>>>
>>>>>>> Extend movable_node option to restrict kernel to be randomized in
>>>>>>> immovable nodes by adding a parameter. this parameter sets up
>>>>>>> the boundaries between the movable nodes and immovable nodes.
>>>
>>> And here you mentioned boundaries, means not only one boundary, so how
>>> do you handle the case movable nodes and immovable nodes alternate to be
>>> placed?
>>>
>>> I mean, are you sure the current hot-plug memory code require immovable
>>> node has to be the first node and there's only one immovable node or
>>> there are several immovable node but they are the first few nodes?
>>>
>>> If yes, then this patch looks good to me, I would like to ack it.
>>>
>>> Thanks
>>> Baoquan
>>>
Hi Dou and Baoquan,
On 08/03/2017 11:28 PM, Dou Liyang wrote:
>
>
> At 08/04/2017 10:55 AM, Baoquan He wrote:
>> On 08/04/17 at 10:42am, Dou Liyang wrote:
>>> Hi Baoquan,
>>>
>>> At 08/04/2017 10:00 AM, Baoquan He wrote:
>>>> On 08/04/17 at 09:37am, Dou Liyang wrote:
>>>>> Hi Chao,Baoquan
>>>>>
>>>>> At 08/04/2017 07:49 AM, Baoquan He wrote:
>>>>>> On 08/03/17 at 08:24pm, Chao Fan wrote:
>>>>>>> It's almost another "mem=".
>>>>>>
>>>>>
>>>>> No, it is different.
>>>>>
>>>>> See Documentation/kernel-parameters:
>>>>>
>>>>> "mem=" will force usage of a specific amount of memory and kernel will
>>>>> not see the whole system memory.
>>>>>
>>>>> But "movable_node=" will not do that.
>>>>>
>>>>>
>>>>>> Then why not using 'mem=' directly?
>>>>>>
>>>>>
>>>>> Before answer this question, let's first discuss why the users want to
>>>>> replace "mem=" with "movable_node" when they hope to support NUMA node
>>>>> hot-plug.
>>>>>
>>>>> I guess the real reason is that:
>>>>>
>>>>> When booting up the system, We should have the whole memory not just
>>>>> the un-hotpluggable memory which restrict by "mem=", eg:
>>>>>
>>>>> we boot up kernel with 4 node:
>>>>>
>>>>> node 0 size: 1024 MB immovable
>>>>> node 1 size: 1024 MB movable
>>>>> node 2 size: 1024 MB movable
>>>>> node 3 size: 1024 MB movable
>>>>>
>>>>> If we use "mem=1024M" in the command line, we just can use 1G memory.
>>>>> But actually, we should have 4G normally.
>>>>
>>>> So do you have assumption on the order of immovable nodes and movable
>>>> nodes? E.g above your example of nodes, immovable nodes have to be the
>>>> lowest address. Is this required by the current hot-plug memory code?
>>>>
>>>
>>> Wow! So great, It seems this is required by the hot-plug memory code.
>>>
>>> yesterday, I tested the patch in Qemu with 4 node and each time I
>>> used different node as immovable node. But no matter what node I used,
>>> the immovable nodes always had the lowest address.
>>>
>>> I am not familiar with memory, I am investigating this and I am going
>>> to apply for a physical machine with movable nodes to check. :)
>>
>
> Cc YASUAKI ISHIMATSU
>
> could you give us some help!
>
>> Great, thanks for your effort. I asked because this question confuses me
>> and I know FJ ever focusd on the memory hot-plug implementation and
>> continue working on that, it must be easier for you to consult your
>> co-workers who ever worked on this. For normal kernel, seems it has
>> to be that normal zone is on immovable node, namely node0. But what if
>> people modified bootloader to locate kernel onto the last node and
>> configure efi firmware to make the last node un-hot-plugable? I believe
>> both of these can be done. Is this allowed? memory hot-plug has a
>> requirement about the order of immovable node? And how many immovable
>> nodes can we have? I have an slides FJ published, didn't find info about
>> these.
I read your patch, and I think what Baoquan wrote is right. The patch only
takes care of your server's configuration. As he wrote, if a server places the
immovable node on the last node, the patch cannot handle that configuration.
Thanks,
Yasuaki Ishimatsu
>>
>
> Thanks,
> dou.
>
>>>
>>>>>
>>>>> Above is also one reason for why not using 'mem=' directly. Following
>>>>> is other reasons:
>>>>>
>>>>> 1). each kernel option has its own role, we'd better misuse them.
>>>>> 2). movable_node is used as a boot-time switch to make nodes movable
>>>>> or not, it should consider any situations, such as KASLR.
>>>>>
>>>>>
>>>>> Thanks,
>>>>> dou.
>>>>>
>>>>>>>
>>>>>>> On Thu, Aug 03, 2017 at 08:17:21PM +0800, Dou Liyang wrote:
>>>>>>>> movable_node is a boot-time switch to make hot-pluggable memory
>>>>>>>> NUMA nodes to be movable. This option is based on an assumption
>>>>>>>> that any node which the kernel resides in is defined as
>>>>>>>> un-hotpluggable. Linux can allocates memory near the kernel image
>>>>>>>> to try the best to keep the kernel away from hotpluggable memory
>>>>>>>> in the same NUMA node. So other nodes can be movable.
>>>>>>>>
>>>>>>>> But, KASLR doesn't know which node is un-hotpluggable, the all
>>>>>>>> hotpluggable memory ranges is recorded in ACPI SRAT table, SRAT
>>>>>>>> is not parsed. So, KASLR may randomize the kernel in a movable
>>>>>>>> node which will be immovable.
>>>>>>>>
>>>>>>>> Extend movable_node option to restrict kernel to be randomized in
>>>>>>>> immovable nodes by adding a parameter. this parameter sets up
>>>>>>>> the boundaries between the movable nodes and immovable nodes.
>>>>
>>>> And here you mentioned boundaries, means not only one boundary, so how
>>>> do you handle the case movable nodes and immovable nodes alternate to be
>>>> placed?
>>>>
>>>> I mean, are you sure the current hot-plug memory code require immovable
>>>> node has to be the first node and there's only one immovable node or
>>>> there are several immovable node but they are the first few nodes?
>>>>
>>>> If yes, then this patch looks good to me, I would like to ack it.
>>>>
>>>> Thanks
>>>> Baoquan
>>>>
Hi YASUAKI,
[...]
>>>>>>
>>>>>> we boot up kernel with 4 node:
>>>>>>
>>>>>> node 0 size: 1024 MB immovable
>>>>>> node 1 size: 1024 MB movable
>>>>>> node 2 size: 1024 MB movable
>>>>>> node 3 size: 1024 MB movable
>>>>>>
>>>>>> If we use "mem=1024M" in the command line, we just can use 1G memory.
>>>>>> But actually, we should have 4G normally.
>>>>>
>>>>> So do you have assumption on the order of immovable nodes and movable
>>>>> nodes? E.g above your example of nodes, immovable nodes have to be the
>>>>> lowest address. Is this required by the current hot-plug memory code?
>>>>>
>>>>
>>>> Wow! So great, It seems this is required by the hot-plug memory code.
>>>>
>>>> yesterday, I tested the patch in Qemu with 4 node and each time I
>>>> used different node as immovable node. But no matter what node I used,
>>>> the immovable nodes always had the lowest address.
>>>>
>>>> I am not familiar with memory, I am investigating this and I am going
>>>> to apply for a physical machine with movable nodes to check. :)
>>>
>>
>> Cc YASUAKI ISHIMATSU
>>
>> could you give us some help!
>>
>>> Great, thanks for your effort. I asked because this question confuses me
>>> and I know FJ ever focusd on the memory hot-plug implementation and
>>> continue working on that, it must be easier for you to consult your
>>> co-workers who ever worked on this. For normal kernel, seems it has
>>> to be that normal zone is on immovable node, namely node0. But what if
>>> people modified bootloader to locate kernel onto the last node and
>>> configure efi firmware to make the last node un-hot-plugable? I believe
>>> both of these can be done. Is this allowed? memory hot-plug has a
>>> requirement about the order of immovable node? And how many immovable
>>> nodes can we have? I have an slides FJ published, didn't find info about
>>> these.
>
> I read your patch. And I think what Baoquan wrote is right. The patch does
> care of only your server. As he wrote, if a server wants to build immovable
> node onto last node, the patch cannot handle such configuration.
>
Thanks for your review. That is reasonable, and I will keep it in mind.
But I am not sure about one thing: when we boot a system with the following 4
nodes, does the BIOS (ACPI firmware) map the RAM of the immovable nodes from
the lowest address first?
node 0 size: 1024 MB immovable
node 1 size: 1024 MB movable
node 2 size: 1024 MB movable
node 3 size: 1024 MB immovable
The order in the physical RAM map may be node 0, 3, 1, 2.
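To make the question concrete, here is a minimal sketch in C. It is not taken
from the patch. It assumes a hypothetical struct node_map that lists the nodes
in physical address order, and a hypothetical helper single_boundary() that
returns the one value "movable_node=nn[KMG]" could carry, or 0 when immovable
RAM sits above movable RAM and no single value fits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one node as it appears in the physical map. */
struct node_map {
	uint64_t start;   /* first byte of the node's RAM */
	uint64_t size;    /* size in bytes */
	int movable;      /* non-zero if the node is hot-pluggable */
};

/*
 * Return the single boundary that separates immovable RAM (below) from
 * movable RAM (above). Return 0 if there is no immovable node at all, or
 * if an immovable node is mapped above a movable one, because then one
 * boundary cannot describe the layout.
 * The array must be sorted by start address.
 */
static uint64_t single_boundary(const struct node_map *map, size_t n)
{
	uint64_t boundary = 0;
	int seen_movable = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		/* Precondition: entries are in ascending address order. */
		assert(i == 0 || map[i - 1].start <= map[i].start);

		if (map[i].movable) {
			seen_movable = 1;
		} else {
			if (seen_movable)
				return 0; /* immovable RAM above movable RAM */
			boundary = map[i].start + map[i].size;
		}
	}
	return boundary;
}
```

With 1024 MB nodes mapped in the order 0, 3, 1, 2, the helper yields 2048 MB.
With the same nodes mapped in the order 0, 1, 2, 3, it yields 0.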
Thanks,
dou.
> Thanks,
> Yasuaki Ishimatsu
>
>>>
>>
>> Thanks,
>> dou.
>>
>>>>
>>>>>>
>>>>>> Above is also one reason for why not using 'mem=' directly. Following
>>>>>> is other reasons:
>>>>>>
>>>>>> 1). each kernel option has its own role, we'd better misuse them.
>>>>>> 2). movable_node is used as a boot-time switch to make nodes movable
>>>>>> or not, it should consider any situations, such as KASLR.
>>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>> dou.
>>>>>>
>>>>>>>>
>>>>>>>> On Thu, Aug 03, 2017 at 08:17:21PM +0800, Dou Liyang wrote:
>>>>>>>>> movable_node is a boot-time switch to make hot-pluggable memory
>>>>>>>>> NUMA nodes to be movable. This option is based on an assumption
>>>>>>>>> that any node which the kernel resides in is defined as
>>>>>>>>> un-hotpluggable. Linux can allocates memory near the kernel image
>>>>>>>>> to try the best to keep the kernel away from hotpluggable memory
>>>>>>>>> in the same NUMA node. So other nodes can be movable.
>>>>>>>>>
>>>>>>>>> But, KASLR doesn't know which node is un-hotpluggable, the all
>>>>>>>>> hotpluggable memory ranges is recorded in ACPI SRAT table, SRAT
>>>>>>>>> is not parsed. So, KASLR may randomize the kernel in a movable
>>>>>>>>> node which will be immovable.
>>>>>>>>>
>>>>>>>>> Extend movable_node option to restrict kernel to be randomized in
>>>>>>>>> immovable nodes by adding a parameter. this parameter sets up
>>>>>>>>> the boundaries between the movable nodes and immovable nodes.
>>>>>
>>>>> And here you mentioned boundaries, means not only one boundary, so how
>>>>> do you handle the case movable nodes and immovable nodes alternate to be
>>>>> placed?
>>>>>
>>>>> I mean, are you sure the current hot-plug memory code require immovable
>>>>> node has to be the first node and there's only one immovable node or
>>>>> there are several immovable node but they are the first few nodes?
>>>>>
>>>>> If yes, then this patch looks good to me, I would like to ack it.
>>>>>
>>>>> Thanks
>>>>> Baoquan
>>>>>
On 08/09/2017 10:44 AM, Dou Liyang wrote:
>
> Hi YASUAKI,
>
> [...]
>>>>>>>
>>>>>>> we boot up kernel with 4 node:
>>>>>>>
>>>>>>> node 0 size: 1024 MB immovable
>>>>>>> node 1 size: 1024 MB movable
>>>>>>> node 2 size: 1024 MB movable
>>>>>>> node 3 size: 1024 MB movable
>>>>>>>
>>>>>>> If we use "mem=1024M" in the command line, we just can use 1G memory.
>>>>>>> But actually, we should have 4G normally.
>>>>>>
>>>>>> So do you have assumption on the order of immovable nodes and movable
>>>>>> nodes? E.g above your example of nodes, immovable nodes have to be the
>>>>>> lowest address. Is this required by the current hot-plug memory code?
>>>>>>
>>>>>
>>>>> Wow! So great, It seems this is required by the hot-plug memory code.
>>>>>
>>>>> yesterday, I tested the patch in Qemu with 4 node and each time I
>>>>> used different node as immovable node. But no matter what node I used,
>>>>> the immovable nodes always had the lowest address.
>>>>>
>>>>> I am not familiar with memory, I am investigating this and I am going
>>>>> to apply for a physical machine with movable nodes to check. :)
>>>>
>>>
>>> Cc YASUAKI ISHIMATSU
>>>
>>> could you give us some help!
>>>
>>>> Great, thanks for your effort. I asked because this question confuses me
>>>> and I know FJ ever focusd on the memory hot-plug implementation and
>>>> continue working on that, it must be easier for you to consult your
>>>> co-workers who ever worked on this. For normal kernel, seems it has
>>>> to be that normal zone is on immovable node, namely node0. But what if
>>>> people modified bootloader to locate kernel onto the last node and
>>>> configure efi firmware to make the last node un-hot-plugable? I believe
>>>> both of these can be done. Is this allowed? memory hot-plug has a
>>>> requirement about the order of immovable node? And how many immovable
>>>> nodes can we have? I have an slides FJ published, didn't find info about
>>>> these.
>>
>> I read your patch. And I think what Baoquan wrote is right. The patch does
>> care of only your server. As he wrote, if a server wants to build immovable
>> node onto last node, the patch cannot handle such configuration.
>>
>
> Thanks for your reviewing. it is reasonable. I will keep in my mind.
>
> But, I am not sure that when we boot up a system with the following 4
> nodes, does the BOIS(ACPI firmware) map the immovable node RAM from the
> lowest address first?
>
> node 0 size: 1024 MB immovable
> node 1 size: 1024 MB movable
> node 2 size: 1024 MB movable
> node 3 size: 1024 MB immovable
>
> the order of the physical RAM maps may be node 0, 3, 1, 2.
It depends on the SRAT table. If the system boots with movable_node, the kernel
checks the hot-pluggable bit of each memory affinity structure in the SRAT table.
If the bit is set, the memory will be movable. If it is not set, the memory will
be immovable.
If the memory affinity structures in the SRAT table are defined as follows, the
system sets up the configuration you mentioned.
PXM: start : end : hot pluggable bit
0:0x00000000000:0x0ffffffffff: disable
1:0x10000000000:0x1ffffffffff: enable
2:0x20000000000:0x2ffffffffff: enable
3:0x30000000000:0x3ffffffffff: disable
We are not sure whether such a server exists. But no specification says that
immovable nodes have to be placed from the lowest address. So the kernel should
take care of such an SRAT table.
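The check described above can be sketched in C as follows. This is only an
illustration, not kernel code: struct mem_affinity, struct range and
collect_immovable() are hypothetical names, and the two flag macros are local
definitions that mirror bit 0 (enabled) and bit 1 (hot pluggable) of the SRAT
memory affinity structure. Entries are assumed to be sorted by base address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local names for two flag bits of the SRAT memory affinity structure. */
#define MEM_ENABLED       (1u << 0)
#define MEM_HOT_PLUGGABLE (1u << 1)

/* Hypothetical, simplified view of one memory affinity entry. */
struct mem_affinity {
	uint32_t pxm;     /* proximity domain */
	uint64_t base;    /* first byte of the range */
	uint64_t length;  /* size of the range in bytes */
	uint32_t flags;   /* MEM_* bits */
};

/* Half-open address range [start, end). */
struct range {
	uint64_t start;
	uint64_t end;
};

/*
 * Collect the immovable ranges (enabled, hot-pluggable bit clear) into
 * out[], merging entries that are directly adjacent. Returns the number
 * of ranges written, at most max.
 */
static size_t collect_immovable(const struct mem_affinity *e, size_t n,
				struct range *out, size_t max)
{
	size_t count = 0;
	size_t i;

	assert(out != NULL || max == 0);

	for (i = 0; i < n; i++) {
		if (!(e[i].flags & MEM_ENABLED))
			continue; /* entry is not in use */
		if (e[i].flags & MEM_HOT_PLUGGABLE)
			continue; /* movable memory */

		/* Extend the previous range if this entry follows it directly. */
		if (count > 0 && out[count - 1].end == e[i].base) {
			out[count - 1].end += e[i].length;
			continue;
		}
		if (count == max)
			break; /* no room for another range */

		out[count].start = e[i].base;
		out[count].end = e[i].base + e[i].length;
		count++;
	}
	return count;
}
```

For four enabled 1 TiB entries where only PXM 1 and PXM 2 have the hot-pluggable
bit set, this yields two separate immovable ranges, which a single boundary
value cannot express.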
Thanks,
Yasuaki Ishimatsu
>
>
> Thanks,
>
> dou,
>
>> Thanks,
>> Yasuaki Ishimatsu
>>
>>>>
>>>
>>> Thanks,
>>> dou.
>>>
>>>>>
>>>>>>>
>>>>>>> Above is also one reason for why not using 'mem=' directly. Following
>>>>>>> is other reasons:
>>>>>>>
>>>>>>> 1). each kernel option has its own role, we'd better misuse them.
>>>>>>> 2). movable_node is used as a boot-time switch to make nodes movable
>>>>>>> or not, it should consider any situations, such as KASLR.
>>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>> dou.
>>>>>>>
>>>>>>>>>
>>>>>>>>> On Thu, Aug 03, 2017 at 08:17:21PM +0800, Dou Liyang wrote:
>>>>>>>>>> movable_node is a boot-time switch to make hot-pluggable memory
>>>>>>>>>> NUMA nodes to be movable. This option is based on an assumption
>>>>>>>>>> that any node which the kernel resides in is defined as
>>>>>>>>>> un-hotpluggable. Linux can allocates memory near the kernel image
>>>>>>>>>> to try the best to keep the kernel away from hotpluggable memory
>>>>>>>>>> in the same NUMA node. So other nodes can be movable.
>>>>>>>>>>
>>>>>>>>>> But, KASLR doesn't know which node is un-hotpluggable, the all
>>>>>>>>>> hotpluggable memory ranges is recorded in ACPI SRAT table, SRAT
>>>>>>>>>> is not parsed. So, KASLR may randomize the kernel in a movable
>>>>>>>>>> node which will be immovable.
>>>>>>>>>>
>>>>>>>>>> Extend movable_node option to restrict kernel to be randomized in
>>>>>>>>>> immovable nodes by adding a parameter. this parameter sets up
>>>>>>>>>> the boundaries between the movable nodes and immovable nodes.
>>>>>>
>>>>>> And here you mention boundaries, meaning more than one boundary. So how
>>>>>> do you handle the case where movable nodes and immovable nodes are
>>>>>> placed alternately?
>>>>>>
>>>>>> I mean, are you sure the current memory hot-plug code requires that the
>>>>>> immovable node be the first node and that there is only one, or, if
>>>>>> there are several immovable nodes, that they be the first few nodes?
>>>>>>
>>>>>> If yes, then this patch looks good to me, and I would like to ack it.
>>>>>>
>>>>>> Thanks
>>>>>> Baoquan
>>>>>>
>>>>>>>>>>
>>>>>>>>>> Reported-by: Chao Fan <[email protected]>
>>>>>>>>>> Signed-off-by: Dou Liyang <[email protected]>
>>>>>>>>>> ---
>>>>>>>>>> Documentation/admin-guide/kernel-parameters.txt | 11 +++++++++--
>>>>>>>>>> arch/x86/boot/compressed/kaslr.c | 19 ++++++++++++++++---
>>>>>>>>>> 2 files changed, 25 insertions(+), 5 deletions(-)
>>>>>>>>>>
>>>>>>>>>> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
>>>>>>>>>> index d9c171c..44c7e33 100644
>>>>>>>>>> --- a/Documentation/admin-guide/kernel-parameters.txt
>>>>>>>>>> +++ b/Documentation/admin-guide/kernel-parameters.txt
>>>>>>>>>> @@ -2305,7 +2305,8 @@
>>>>>>>>>> mousedev.yres= [MOUSE] Vertical screen resolution, used for devices
>>>>>>>>>> reporting absolute coordinates, such as tablets
>>>>>>>>>>
>>>>>>>>>> - movablecore=nn[KMG] [KNL,X86,IA-64,PPC] This parameter
>>>>>>>>>> + movablecore=nn[KMG]
>>>>>>>>>> + [KNL,X86,IA-64,PPC] This parameter
>>>>>>>>>> is similar to kernelcore except it specifies the
>>>>>>>>>> amount of memory used for migratable allocations.
>>>>>>>>>> If both kernelcore and movablecore is specified,
>>>>>>>>>> @@ -2315,12 +2316,18 @@
>>>>>>>>>> that the amount of memory usable for all allocations
>>>>>>>>>> is not too small.
>>>>>>>>>>
>>>>>>>>>> - movable_node [KNL] Boot-time switch to make hotplugable memory
>>>>>>>>>> + movable_node [KNL] Boot-time switch to make hot-pluggable memory
>>>>>>>>>> NUMA nodes to be movable. This means that the memory
>>>>>>>>>> of such nodes will be usable only for movable
>>>>>>>>>> allocations which rules out almost all kernel
>>>>>>>>>> allocations. Use with caution!
>>>>>>>>>>
>>>>>>>>>> + movable_node=nn[KMG]
>>>>>>>>>> + [KNL] Extend movable_node to work well with KASLR. This
>>>>>>>>>> + parameter is the boundaries between the movable nodes
>>>>>>>>>> + and immovable nodes, the memory which exceeds it will
>>>>>>>>>> + be regarded as hot-pluggable.
>>>>>>>>>> +
>>>>>>>>>> MTD_Partition= [MTD]
>>>>>>>>>> Format: <name>,<region-number>,<size>,<offset>
>>>>>>>>>>
>>>>>>>>>> diff --git a/arch/x86/boot/compressed/kaslr.c b/arch/x86/boot/compressed/kaslr.c
>>>>>>>>>> index 91f27ab..7e2351b 100644
>>>>>>>>>> --- a/arch/x86/boot/compressed/kaslr.c
>>>>>>>>>> +++ b/arch/x86/boot/compressed/kaslr.c
>>>>>>>>>> @@ -89,7 +89,10 @@ struct mem_vector {
>>>>>>>>>> static bool memmap_too_large;
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> -/* Store memory limit specified by "mem=nn[KMG]" or "memmap=nn[KMG]" */
>>>>>>>>>> +/*
>>>>>>>>>> + * Store memory limit specified by the following situations:
>>>>>>>>>> + * "mem=nn[KMG]" or "memmap=nn[KMG]" or "movable_node=nn[KMG]"
>>>>>>>>>> + */
>>>>>>>>>> unsigned long long mem_limit = ULLONG_MAX;
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> @@ -212,7 +215,8 @@ static int handle_mem_memmap(void)
>>>>>>>>>> char *param, *val;
>>>>>>>>>> u64 mem_size;
>>>>>>>>>>
>>>>>>>>>> - if (!strstr(args, "memmap=") && !strstr(args, "mem="))
>>>>>>>>>> + if (!strstr(args, "memmap=") && !strstr(args, "mem=") &&
>>>>>>>>>> + !strstr(args, "movable_node="))
>>>>>>>>>> return 0;
>>>>>>>>>>
>>>>>>>>>> tmp_cmdline = malloc(len + 1);
>>>>>>>>>> @@ -247,7 +251,16 @@ static int handle_mem_memmap(void)
>>>>>>>>>> free(tmp_cmdline);
>>>>>>>>>> return -EINVAL;
>>>>>>>>>> }
>>>>>>>>>> - mem_limit = mem_size;
>>>>>>>>>> + mem_limit = mem_limit > mem_size ? mem_size : mem_limit;
>>>>>>>>>> + } else if (!strcmp(param, "movable_node")) {
>>>>>>>>>> + char *p = val;
>>>>>>>>>> +
>>>>>>>>>> + mem_size = memparse(p, &p);
>>>>>>>>>> + if (mem_size == 0) {
>>>>>>>>>> + free(tmp_cmdline);
>>>>>>>>>> + return -EINVAL;
>>>>>>>>>> + }
>>>>>>>>>> + mem_limit = mem_limit > mem_size ? mem_size : mem_limit;
>>>>>>>>>> }
>>>>>>>>>> }
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> 2.5.5
>>>>>>>>>>
Hi, YASUAKI
At 08/10/2017 12:55 AM, YASUAKI ISHIMATSU wrote:
>
>
> On 08/09/2017 10:44 AM, Dou Liyang wrote:
>>
>> Hi YASUAKI,
>>
>> [...]
>>>>>>>>
>>>>>>>> We boot the kernel with 4 nodes:
>>>>>>>>
>>>>>>>> node 0 size: 1024 MB immovable
>>>>>>>> node 1 size: 1024 MB movable
>>>>>>>> node 2 size: 1024 MB movable
>>>>>>>> node 3 size: 1024 MB movable
>>>>>>>>
>>>>>>>> If we use "mem=1024M" on the command line, we can use only 1G of memory,
>>>>>>>> but normally we should have 4G.
>>>>>>>
>>>>>>> So do you have an assumption about the order of immovable and movable
>>>>>>> nodes? E.g. in your example above, the immovable node has to be at the
>>>>>>> lowest address. Is this required by the current memory hot-plug code?
>>>>>>>
>>>>>>
>>>>>> Wow, great! It seems this is required by the memory hot-plug code.
>>>>>>
>>>>>> Yesterday I tested the patch in QEMU with 4 nodes, and each time I
>>>>>> used a different node as the immovable node. But no matter which node I
>>>>>> used, the immovable node always had the lowest address.
>>>>>>
>>>>>> I am not familiar with memory management. I am investigating this, and I
>>>>>> am going to apply for a physical machine with movable nodes to check. :)
>>>>>
>>>>
>>>> Cc YASUAKI ISHIMATSU
>>>>
>>>> Could you give us some help?
>>>>
>>>>> Great, thanks for your effort. I asked because this question confuses me,
>>>>> and I know FJ has focused on the memory hot-plug implementation and
>>>>> continues working on it, so it must be easier for you to consult your
>>>>> co-workers who have worked on this. For a normal kernel, it seems that
>>>>> the normal zone has to be on an immovable node, namely node0. But what if
>>>>> people modified the bootloader to locate the kernel on the last node and
>>>>> configured the EFI firmware to make the last node un-hot-pluggable? I
>>>>> believe both of these can be done. Is this allowed? Does memory hot-plug
>>>>> have a requirement about the order of immovable nodes? And how many
>>>>> immovable nodes can we have? I have slides that FJ published, but didn't
>>>>> find info about these.
>>>
>>> I read your patch, and I think what Baoquan wrote is right. The patch takes
>>> care of only your server. As he wrote, if a server wants to put an immovable
>>> node on the last node, the patch cannot handle such a configuration.
>>>
>>
>> Thanks for reviewing. That is reasonable; I will keep it in mind.
>>
>> But I am not sure about one thing: when we boot a system with the following 4
>> nodes, does the BIOS (ACPI firmware) map the immovable nodes' RAM from the
>> lowest address first?
>>
>> node 0 size: 1024 MB immovable
>> node 1 size: 1024 MB movable
>> node 2 size: 1024 MB movable
>> node 3 size: 1024 MB immovable
>>
>> The order of the physical RAM map may be node 0, 3, 1, 2.
>
>
> It depends on the SRAT table. If the system boots with movable_node, the kernel
> checks the hot-pluggable bit of each memory affinity structure in the SRAT. If
> the bit is set, the memory is movable; if not, it is immovable.
>
> If the memory affinity structures in the SRAT are defined as follows, the system
> ends up with the configuration you mentioned:
>
> PXM : start         : end           : hot-pluggable bit
> 0   : 0x00000000000 : 0x0ffffffffff : disabled
> 1   : 0x10000000000 : 0x1ffffffffff : enabled
> 2   : 0x20000000000 : 0x2ffffffffff : enabled
> 3   : 0x30000000000 : 0x3ffffffffff : disabled
>
> We are not sure such a server exists. But no specification says that immovable
> nodes have to be placed at the lowest addresses, so the kernel should handle
> such an SRAT table.
>
Yes, this patch didn't consider this situation.

It's related to the ACPI tables. As far as I know, when the ACPI firmware
generates the local APIC entries in the MADT, it lists the enabled CPUs
first and then the disabled ones (which will be hot-plugged). I don't know
whether the same strategy is used in the SRAT or not.

I will check the order in which the memory affinity structures are generated
in the ACPI SRAT, and then modify this patch.
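
If that order turns out not to be guaranteed, one possible direction is a
check that does not depend on it. The following is only a user-space sketch
in C, not what this patch does and not code from kaslr.c: struct mem_range,
example_ranges and slot_is_immovable() are hypothetical names, and the ranges
are the assumed contiguous 1 TiB layout from the discussion above. It accepts
a candidate kernel slot only if the slot lies entirely inside one range whose
hot-pluggable bit is clear:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of one memory range and its SRAT flag. */
struct mem_range {
	uint64_t start;		/* first byte of the range */
	uint64_t end;		/* last byte of the range, inclusive */
	bool hotpluggable;	/* hot-pluggable bit */
};

/* Assumed layout: immovable ranges first and last, movable in between. */
static const struct mem_range example_ranges[] = {
	{ 0x00000000000ULL, 0x0ffffffffffULL, false },
	{ 0x10000000000ULL, 0x1ffffffffffULL, true  },
	{ 0x20000000000ULL, 0x2ffffffffffULL, true  },
	{ 0x30000000000ULL, 0x3ffffffffffULL, false },
};

/*
 * True if the slot [addr, addr + size) fits completely inside a single
 * range that is not hot-pluggable. Empty or wrapping slots are rejected.
 */
static bool slot_is_immovable(const struct mem_range *r, size_t n,
			      uint64_t addr, uint64_t size)
{
	uint64_t last;

	assert(r != NULL);
	if (size == 0 || addr > UINT64_MAX - (size - 1))
		return false;
	last = addr + (size - 1);

	for (size_t i = 0; i < n; i++) {
		if (!r[i].hotpluggable &&
		    addr >= r[i].start && last <= r[i].end)
			return true;
	}
	return false;
}
```

With the assumed ranges, a slot at 0x30000000000 is accepted even though it is
above the movable ranges, while a slot at 0x10000000000 is rejected.
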
Thanks,
dou.
> Thanks,
> Yasuaki Ishimatsu
>
>>
>>
>> Thanks,
>>
>> dou,
>>
>>> Thanks,
>>> Yasuaki Ishimatsu
>>>
>>>>>
>>>>
>>>> Thanks,
>>>> dou.
>>>>
>>>>>>
>>>>>>>>
>>>>>>>> The above is also one reason for not using 'mem=' directly. The other
>>>>>>>> reasons are:
>>>>>>>>
>>>>>>>> 1). Each kernel option has its own role; we'd better not misuse them.
>>>>>>>> 2). movable_node is used as a boot-time switch to make nodes movable
>>>>>>>> or not, so it should handle every situation, including KASLR.
>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> dou.
>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Thu, Aug 03, 2017 at 08:17:21PM +0800, Dou Liyang wrote:
>>>>>>>>>>> movable_node is a boot-time switch to make hot-pluggable memory
>>>>>>>>>>> NUMA nodes to be movable. This option is based on an assumption
>>>>>>>>>>> that any node which the kernel resides in is defined as
>>>>>>>>>>> un-hotpluggable. Linux can allocates memory near the kernel image
>>>>>>>>>>> to try the best to keep the kernel away from hotpluggable memory
>>>>>>>>>>> in the same NUMA node. So other nodes can be movable.
>>>>>>>>>>>
>>>>>>>>>>> But, KASLR doesn't know which node is un-hotpluggable, the all
>>>>>>>>>>> hotpluggable memory ranges is recorded in ACPI SRAT table, SRAT
>>>>>>>>>>> is not parsed. So, KASLR may randomize the kernel in a movable
>>>>>>>>>>> node which will be immovable.
>>>>>>>>>>>
>>>>>>>>>>> Extend movable_node option to restrict kernel to be randomized in
>>>>>>>>>>> immovable nodes by adding a parameter. this parameter sets up
>>>>>>>>>>> the boundaries between the movable nodes and immovable nodes.
>>>>>>>
>>>>>>> And here you mention boundaries, meaning more than one boundary. So how
>>>>>>> do you handle the case where movable nodes and immovable nodes are
>>>>>>> placed alternately?
>>>>>>>
>>>>>>> I mean, are you sure the current memory hot-plug code requires that the
>>>>>>> immovable node be the first node and that there is only one, or, if
>>>>>>> there are several immovable nodes, that they be the first few nodes?
>>>>>>>
>>>>>>> If yes, then this patch looks good to me, and I would like to ack it.
>>>>>>>
>>>>>>> Thanks
>>>>>>> Baoquan
>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Reported-by: Chao Fan <[email protected]>
>>>>>>>>>>> Signed-off-by: Dou Liyang <[email protected]>
>>>>>>>>>>> ---
>>>>>>>>>>> Documentation/admin-guide/kernel-parameters.txt | 11 +++++++++--
>>>>>>>>>>> arch/x86/boot/compressed/kaslr.c | 19 ++++++++++++++++---
>>>>>>>>>>> 2 files changed, 25 insertions(+), 5 deletions(-)
>>>>>>>>>>>
>>>>>>>>>>> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
>>>>>>>>>>> index d9c171c..44c7e33 100644
>>>>>>>>>>> --- a/Documentation/admin-guide/kernel-parameters.txt
>>>>>>>>>>> +++ b/Documentation/admin-guide/kernel-parameters.txt
>>>>>>>>>>> @@ -2305,7 +2305,8 @@
>>>>>>>>>>> mousedev.yres= [MOUSE] Vertical screen resolution, used for devices
>>>>>>>>>>> reporting absolute coordinates, such as tablets
>>>>>>>>>>>
>>>>>>>>>>> - movablecore=nn[KMG] [KNL,X86,IA-64,PPC] This parameter
>>>>>>>>>>> + movablecore=nn[KMG]
>>>>>>>>>>> + [KNL,X86,IA-64,PPC] This parameter
>>>>>>>>>>> is similar to kernelcore except it specifies the
>>>>>>>>>>> amount of memory used for migratable allocations.
>>>>>>>>>>> If both kernelcore and movablecore is specified,
>>>>>>>>>>> @@ -2315,12 +2316,18 @@
>>>>>>>>>>> that the amount of memory usable for all allocations
>>>>>>>>>>> is not too small.
>>>>>>>>>>>
>>>>>>>>>>> - movable_node [KNL] Boot-time switch to make hotplugable memory
>>>>>>>>>>> + movable_node [KNL] Boot-time switch to make hot-pluggable memory
>>>>>>>>>>> NUMA nodes to be movable. This means that the memory
>>>>>>>>>>> of such nodes will be usable only for movable
>>>>>>>>>>> allocations which rules out almost all kernel
>>>>>>>>>>> allocations. Use with caution!
>>>>>>>>>>>
>>>>>>>>>>> + movable_node=nn[KMG]
>>>>>>>>>>> + [KNL] Extend movable_node to work well with KASLR. This
>>>>>>>>>>> + parameter is the boundaries between the movable nodes
>>>>>>>>>>> + and immovable nodes, the memory which exceeds it will
>>>>>>>>>>> + be regarded as hot-pluggable.
>>>>>>>>>>> +
>>>>>>>>>>> MTD_Partition= [MTD]
>>>>>>>>>>> Format: <name>,<region-number>,<size>,<offset>
>>>>>>>>>>>
>>>>>>>>>>> diff --git a/arch/x86/boot/compressed/kaslr.c b/arch/x86/boot/compressed/kaslr.c
>>>>>>>>>>> index 91f27ab..7e2351b 100644
>>>>>>>>>>> --- a/arch/x86/boot/compressed/kaslr.c
>>>>>>>>>>> +++ b/arch/x86/boot/compressed/kaslr.c
>>>>>>>>>>> @@ -89,7 +89,10 @@ struct mem_vector {
>>>>>>>>>>> static bool memmap_too_large;
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> -/* Store memory limit specified by "mem=nn[KMG]" or "memmap=nn[KMG]" */
>>>>>>>>>>> +/*
>>>>>>>>>>> + * Store memory limit specified by the following situations:
>>>>>>>>>>> + * "mem=nn[KMG]" or "memmap=nn[KMG]" or "movable_node=nn[KMG]"
>>>>>>>>>>> + */
>>>>>>>>>>> unsigned long long mem_limit = ULLONG_MAX;
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> @@ -212,7 +215,8 @@ static int handle_mem_memmap(void)
>>>>>>>>>>> char *param, *val;
>>>>>>>>>>> u64 mem_size;
>>>>>>>>>>>
>>>>>>>>>>> - if (!strstr(args, "memmap=") && !strstr(args, "mem="))
>>>>>>>>>>> + if (!strstr(args, "memmap=") && !strstr(args, "mem=") &&
>>>>>>>>>>> + !strstr(args, "movable_node="))
>>>>>>>>>>> return 0;
>>>>>>>>>>>
>>>>>>>>>>> tmp_cmdline = malloc(len + 1);
>>>>>>>>>>> @@ -247,7 +251,16 @@ static int handle_mem_memmap(void)
>>>>>>>>>>> free(tmp_cmdline);
>>>>>>>>>>> return -EINVAL;
>>>>>>>>>>> }
>>>>>>>>>>> - mem_limit = mem_size;
>>>>>>>>>>> + mem_limit = mem_limit > mem_size ? mem_size : mem_limit;
>>>>>>>>>>> + } else if (!strcmp(param, "movable_node")) {
>>>>>>>>>>> + char *p = val;
>>>>>>>>>>> +
>>>>>>>>>>> + mem_size = memparse(p, &p);
>>>>>>>>>>> + if (mem_size == 0) {
>>>>>>>>>>> + free(tmp_cmdline);
>>>>>>>>>>> + return -EINVAL;
>>>>>>>>>>> + }
>>>>>>>>>>> + mem_limit = mem_limit > mem_size ? mem_size : mem_limit;
>>>>>>>>>>> }
>>>>>>>>>>> }
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> 2.5.5
>>>>>>>>>>>