2010-11-17 04:45:56

by Zheng, Shaohui

Subject: [8/8,v3] NUMA Hotplug Emulator: documentation

From: Shaohui Zheng <[email protected]>

Add a text file Documentation/x86/x86_64/numa_hotplug_emulator.txt
to explain the usage of the hotplug emulator.

Signed-off-by: Haicheng Li <[email protected]>
Signed-off-by: Shaohui Zheng <[email protected]>
---
Index: linux-hpe4/Documentation/x86/x86_64/numa_hotplug_emulator.txt
===================================================================
--- /dev/null 1970-01-01 00:00:00.000000000 +0000
+++ linux-hpe4/Documentation/x86/x86_64/numa_hotplug_emulator.txt 2010-11-17 09:01:10.342836513 +0800
@@ -0,0 +1,92 @@
+NUMA Hotplug Emulator for x86
+---------------------------------------------------
+
+NUMA hotplug emulator is able to emulate NUMA Node Hotplug
+thru a pure software way. It intends to help people easily debug
+and test node/cpu/memory hotplug related stuff on a
+none-numa-hotplug-support machine, even a UMA machine and virtual
+environment.
+
+1) Node hotplug emulation:
+
+The emulator firstly hides RAM via E820 table, and then it can
+fake offlined nodes with the hidden RAM.
+
+After system bootup, user is able to hotplug-add these offlined
+nodes, which is just similar to a real hotplug hardware behavior.
+
+Using boot option "numa=hide=N*size" to fake offlined nodes:
+ - N is the number of hidden nodes
+ - size is the memory size (in MB) per hidden node.
+
+There is a sysfs entry "probe" under /sys/devices/system/node/ for user
+to hotplug the fake offlined nodes:
+
+ - to show all fake offlined nodes:
+ $ cat /sys/devices/system/node/probe
+
+ - to hotadd a fake offlined node, e.g. nodeid is N:
+ $ echo N > /sys/devices/system/node/probe
+
+2) CPU hotplug emulation:
+
+The emulator reserve CPUs throu grub parameter, the reserved CPUs can be
+hot-add/hot-remove in software method, it emulates the process of physical
+cpu hotplug.
+
+When hotplug a CPU with emulator, we are using a logical CPU to emulate the CPU
+socket hotplug process. For the CPU supported SMT, some logical CPUs are in the
+same socket, but it may located in different NUMA node after we have emulator.
+We put the logical CPU into a fake CPU socket, and assign it an unique
+phys_proc_id. For the fake socket, we put one logical CPU in only.
+
+ - to hide CPUs
+ - Using boot option "maxcpus=N" hide CPUs
+ N is the number of initialize CPUs
+ - Using boot option "cpu_hpe=on" to enable cpu hotplug emulation
+ when cpu_hpe is enabled, the rest CPUs will not be initialized
+
+ - to hot-add CPU to node
+ $ echo nid > cpu/probe
+
+ - to hot-remove CPU
+ $ echo nid > cpu/release
+
+3) Memory hotplug emulation:
+
+The emulator reserve memory before OS booting, the reserved memory region
+is remove from e820 table, and they can be hot-added via the probe interface,
+this interface was extend to support add memory to the specified node, It
+maintains backwards compatibility.
+
+The difficulty of Memory Release is well-known, we have no plan for it until now.
+
+ - reserve memory throu grub parameter
+ mem=1024m
+
+ - add a memory section to node 3
+ $ echo 0x40000000,3 > memory/probe
+ OR
+ $ echo 1024m,3 > memory/probe
+ OR
+ $ echo "physical_address=0x40000000 numa_node=3" > memory/probe
+
+4) Script for hotplug testing
+
+These scripts provides convenience when we hot-add memory/cpu in batch.
+
+- Online all memory sections:
+for m in /sys/devices/system/memory/memory*;
+do
+ echo online > $m/state;
+done
+
+- CPU Online:
+for c in /sys/devices/system/cpu/cpu*;
+do
+ echo 1 > $c/online;
+done
+
+- Haicheng Li <[email protected]>
+- Shaohui Zheng <[email protected]>
+ Nov 2010
Index: linux-hpe4/Documentation/x86/x86_64/boot-options.txt
===================================================================
--- linux-hpe4.orig/Documentation/x86/x86_64/boot-options.txt 2010-11-17 10:01:37.093461435 +0800
+++ linux-hpe4/Documentation/x86/x86_64/boot-options.txt 2010-11-17 10:03:10.881043878 +0800
@@ -173,6 +173,13 @@
numa=fake=<N>
If given as an integer, fills all system RAM with N fake nodes
interleaved over physical nodes.
+ numa=hide=N*size1[,size2,...]
+ Give an string seperated by comma, each sub string stands for a serie nodes.
+ system will reserve an area to create hide numa nodes for them.
+
+ for example: numa=hide=2*512,256
+ system will reserve (2*512 + 256) M for 3 hide nodes. 2 nodes with 512M memory,
+ and 1 node with 256 memory

ACPI

@@ -316,3 +323,8 @@
Do not use GB pages for kernel direct mappings.
gbpages
Use GB pages for kernel direct mappings.
+ cpu_hpe=on/off
+ Enable/disable cpu hotplug emulation with software method. when cpu_hpe=on,
+ sysfs provides probe/release interface to hot add/remove cpu dynamically.
+ this option is disabled in default.
+

--
Thanks & Regards,
Shaohui
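
The node hot-add flow described in the patch above can be sketched as a small shell helper (a sketch only: the sysfs paths follow the patch, and the NODE_SYSFS/MEM_SYSFS overrides are hypothetical knobs added here so the flow can be dry-run on an unpatched machine):

```shell
# Sketch of the node hot-add flow from the patch: list the fake offlined
# nodes, probe one, then online the memory sections it brings with it.
# NODE_SYSFS / MEM_SYSFS are overrides for dry runs, not part of the patch.
hotadd_node() {
    local nid=$1
    local node_sysfs=${NODE_SYSFS:-/sys/devices/system/node}
    local mem_sysfs=${MEM_SYSFS:-/sys/devices/system/memory}

    # show all fake offlined nodes
    cat "$node_sysfs/probe"

    # hot-add the fake offlined node
    echo "$nid" > "$node_sysfs/probe"

    # online every memory section, including the newly added ones
    for m in "$mem_sysfs"/memory*; do
        if [ -d "$m" ]; then
            echo online > "$m/state"
        fi
    done
}
```

For example, `hotadd_node 1` on a kernel booted with `numa=hide=1*512` would probe hidden node 1 and online its memory.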


2010-11-17 23:08:13

by Randy Dunlap

Subject: Re: [8/8,v3] NUMA Hotplug Emulator: documentation

On Wed, 17 Nov 2010 10:08:07 +0800 [email protected] wrote:

> From: Shaohui Zheng <[email protected]>
>
> add a text file Documentation/x86/x86_64/numa_hotplug_emulator.txt
> to explain the usage for the hotplug emulator.
>
> Signed-off-by: Haicheng Li <[email protected]>
> Signed-off-by: Shaohui Zheng <[email protected]>
> ---
> Index: linux-hpe4/Documentation/x86/x86_64/numa_hotplug_emulator.txt
> ===================================================================
> --- /dev/null 1970-01-01 00:00:00.000000000 +0000
> +++ linux-hpe4/Documentation/x86/x86_64/numa_hotplug_emulator.txt 2010-11-17 09:01:10.342836513 +0800
> @@ -0,0 +1,92 @@
> +NUMA Hotplug Emulator for x86

(I'm only looking at the documentation file.)

Is this only for x86_64? If so, please change the line above (for x86).
If not, then don't put this file into the /x86_64/ sub-directory.


> +---------------------------------------------------
> +
> +NUMA hotplug emulator is able to emulate NUMA Node Hotplug
> +thru a pure software way. It intends to help people easily debug
> +and test node/cpu/memory hotplug related stuff on a

CPU

> +none-numa-hotplug-support machine, even a UMA machine and virtual

non-NUMA-hotplug-support machine,

> +environment.
> +
> +1) Node hotplug emulation:
> +
> +The emulator firstly hides RAM via E820 table, and then it can
> +fake offlined nodes with the hidden RAM.
> +
> +After system bootup, user is able to hotplug-add these offlined
> +nodes, which is just similar to a real hotplug hardware behavior.
> +
> +Using boot option "numa=hide=N*size" to fake offlined nodes:
> + - N is the number of hidden nodes
> + - size is the memory size (in MB) per hidden node.
> +
> +There is a sysfs entry "probe" under /sys/devices/system/node/ for user
> +to hotplug the fake offlined nodes:
> +
> + - to show all fake offlined nodes:
> + $ cat /sys/devices/system/node/probe
> +
> + - to hotadd a fake offlined node, e.g. nodeid is N:
> + $ echo N > /sys/devices/system/node/probe
> +
> +2) CPU hotplug emulation:
> +
> +The emulator reserve CPUs throu grub parameter, the reserved CPUs can be

thru a kernel boot parameter;
(hopefully any boot loader will work, not just grub)

> +hot-add/hot-remove in software method, it emulates the process of physical
> +cpu hotplug.

CPU

> +
> +When hotplug a CPU with emulator, we are using a logical CPU to emulate the CPU

hotplugging

> +socket hotplug process. For the CPU supported SMT, some logical CPUs are in the
> +same socket, but it may located in different NUMA node after we have emulator.
> +We put the logical CPU into a fake CPU socket, and assign it an unique

a unique

> +phys_proc_id. For the fake socket, we put one logical CPU in only.
> +
> + - to hide CPUs
> + - Using boot option "maxcpus=N" hide CPUs
> + N is the number of initialize CPUs

N is the number of CPUs to initialize; the rest will be hidden.

> + - Using boot option "cpu_hpe=on" to enable cpu hotplug emulation

CPU

> + when cpu_hpe is enabled, the rest CPUs will not be initialized

rest of the CPUs

> +
> + - to hot-add CPU to node
> + $ echo nid > cpu/probe
> +
> + - to hot-remove CPU
> + $ echo nid > cpu/release
> +
> +3) Memory hotplug emulation:
> +
> +The emulator reserve memory before OS booting, the reserved memory region

reserves memory before the OS boots; the reserved

> +is remove from e820 table, and they can be hot-added via the probe interface,

removed interface.

> +this interface was extend to support add memory to the specified node, It

This interface was extended to support adding memory to the specified node. It

> +maintains backwards compatibility.
> +
> +The difficulty of Memory Release is well-known, we have no plan for it until now.
> +
> + - reserve memory throu grub parameter

thru a kernel boot parameter

> + mem=1024m
> +
> + - add a memory section to node 3
> + $ echo 0x40000000,3 > memory/probe
> + OR
> + $ echo 1024m,3 > memory/probe
> + OR
> + $ echo "physical_address=0x40000000 numa_node=3" > memory/probe
> +
> +4) Script for hotplug testing
> +
> +These scripts provides convenience when we hot-add memory/cpu in batch.
> +
> +- Online all memory sections:
> +for m in /sys/devices/system/memory/memory*;
> +do
> + echo online > $m/state;
> +done
> +
> +- CPU Online:
> +for c in /sys/devices/system/cpu/cpu*;
> +do
> + echo 1 > $c/online;
> +done
> +
> +- Haicheng Li <[email protected]>
> +- Shaohui Zheng <[email protected]>
> + Nov 2010
> Index: linux-hpe4/Documentation/x86/x86_64/boot-options.txt
> ===================================================================
> --- linux-hpe4.orig/Documentation/x86/x86_64/boot-options.txt 2010-11-17 10:01:37.093461435 +0800
> +++ linux-hpe4/Documentation/x86/x86_64/boot-options.txt 2010-11-17 10:03:10.881043878 +0800
> @@ -173,6 +173,13 @@
> numa=fake=<N>
> If given as an integer, fills all system RAM with N fake nodes
> interleaved over physical nodes.
> + numa=hide=N*size1[,size2,...]
> + Give an string seperated by comma, each sub string stands for a serie nodes.

Give a string separated by commas; each substring stands for a node size.
??


> + system will reserve an area to create hide numa nodes for them.

System will reserve an area to create or hide NUMA nodes.

> +
> + for example: numa=hide=2*512,256
> + system will reserve (2*512 + 256) M for 3 hide nodes. 2 nodes with 512M memory,

MB for 3 hidden nodes: 2 nodes with
512 MB memory and 1 node with 256 MB memory

> + and 1 node with 256 memory
>
> ACPI
>
> @@ -316,3 +323,8 @@
> Do not use GB pages for kernel direct mappings.
> gbpages
> Use GB pages for kernel direct mappings.
> + cpu_hpe=on/off
> + Enable/disable cpu hotplug emulation with software method. when cpu_hpe=on,

CPU method. When

> + sysfs provides probe/release interface to hot add/remove cpu dynamically.

CPUs

> + this option is disabled in default.

This by default.



---
~Randy
*** Remember to use Documentation/SubmitChecklist when testing your code ***

2010-11-18 03:52:25

by Zheng, Shaohui

Subject: Re: [8/8,v3] NUMA Hotplug Emulator: documentation

On Wed, Nov 17, 2010 at 03:06:59PM -0800, Randy Dunlap wrote:
> On Wed, 17 Nov 2010 10:08:07 +0800 [email protected] wrote:
>
> > From: Shaohui Zheng <[email protected]>
> >
> > add a text file Documentation/x86/x86_64/numa_hotplug_emulator.txt
> > to explain the usage for the hotplug emulator.
> >
> > Signed-off-by: Haicheng Li <[email protected]>
> > Signed-off-by: Shaohui Zheng <[email protected]>
> > ---
> > Index: linux-hpe4/Documentation/x86/x86_64/numa_hotplug_emulator.txt
> > ===================================================================
> > --- /dev/null 1970-01-01 00:00:00.000000000 +0000
> > +++ linux-hpe4/Documentation/x86/x86_64/numa_hotplug_emulator.txt 2010-11-17 09:01:10.342836513 +0800
> > @@ -0,0 +1,92 @@
> > +NUMA Hotplug Emulator for x86
>
> (I'm only looking at the documentation file.)
>
> Is this only for x86_64? if so, please change the line above (for x86).
> If not, then don't put this file into the /x86_64/ sub-directory.

There is only a small amount of x86_64-specific code in the patch series, so it should
work for both x86_64 and i386. Currently cpu/memory hotplug works stably against the
x86_64 kernel, but it still has many issues on i386, so we cannot test the
emulator on an i386 kernel. I'd prefer to keep the document for x86_64 only.

> ---
> ~Randy
> *** Remember to use Documentation/SubmitChecklist when testing your code ***
I will check the documentation once again; thanks to Randy for the careful review.

--
Thanks & Regards,
Shaohui

2010-11-21 15:00:44

by Cong Wang

Subject: Re: [8/8,v3] NUMA Hotplug Emulator: documentation

On Wed, Nov 17, 2010 at 10:08:07AM +0800, [email protected] wrote:
>+2) CPU hotplug emulation:
>+
>+The emulator reserve CPUs throu grub parameter, the reserved CPUs can be
>+hot-add/hot-remove in software method, it emulates the process of physical
>+cpu hotplug.
>+
>+When hotplug a CPU with emulator, we are using a logical CPU to emulate the CPU
>+socket hotplug process. For the CPU supported SMT, some logical CPUs are in the
>+same socket, but it may located in different NUMA node after we have emulator.
>+We put the logical CPU into a fake CPU socket, and assign it an unique
>+phys_proc_id. For the fake socket, we put one logical CPU in only.
>+
>+ - to hide CPUs
>+ - Using boot option "maxcpus=N" hide CPUs
>+ N is the number of initialize CPUs
>+ - Using boot option "cpu_hpe=on" to enable cpu hotplug emulation
>+ when cpu_hpe is enabled, the rest CPUs will not be initialized
>+
>+ - to hot-add CPU to node
>+ $ echo nid > cpu/probe
>+
>+ - to hot-remove CPU
>+ $ echo nid > cpu/release
>+

Again, we already have software CPU hotplug,
i.e. /sys/devices/system/cpu/cpuX/online.

You need to pick up another name for this.

From your documentation above, it looks like you are trying
to move one CPU between nodes?

>+ cpu_hpe=on/off
>+ Enable/disable cpu hotplug emulation with software method. when cpu_hpe=on,
>+ sysfs provides probe/release interface to hot add/remove cpu dynamically.
>+ this option is disabled in default.
>+

Why not just a CONFIG? IOW, why do we need to make another boot
parameter for this?

2010-11-21 15:17:21

by Li, Haicheng

Subject: RE: [8/8,v3] NUMA Hotplug Emulator: documentation

Américo Wang wrote:
> On Wed, Nov 17, 2010 at 10:08:07AM +0800, [email protected]
> wrote:
>> +2) CPU hotplug emulation:
>> +
>> +The emulator reserve CPUs throu grub parameter, the reserved CPUs
>> can be +hot-add/hot-remove in software method, it emulates the
>> process of physical +cpu hotplug. +
>> +When hotplug a CPU with emulator, we are using a logical CPU to
>> emulate the CPU +socket hotplug process. For the CPU supported SMT,
>> some logical CPUs are in the +same socket, but it may located in
>> different NUMA node after we have emulator. +We put the logical CPU
>> into a fake CPU socket, and assign it an unique +phys_proc_id. For
>> the fake socket, we put one logical CPU in only. + + - to hide CPUs
>> + - Using boot option "maxcpus=N" hide CPUs
>> + N is the number of initialize CPUs
>> + - Using boot option "cpu_hpe=on" to enable cpu hotplug emulation
>> + when cpu_hpe is enabled, the rest CPUs will not be
>> initialized + + - to hot-add CPU to node
>> + $ echo nid > cpu/probe
>> +
>> + - to hot-remove CPU
>> + $ echo nid > cpu/release
>> +
>
> Again, we already have software CPU hotplug,
> i.e. /sys/devices/system/cpu/cpuX/online.

The online attribute here is just for logical CPU online/offline. What we're achieving here is to emulate physical CPU hot-add.


-haicheng-

2010-11-22 00:55:38

by Zheng, Shaohui

Subject: Re: [8/8,v3] NUMA Hotplug Emulator: documentation

On Sun, Nov 21, 2010 at 11:03:45PM +0800, Américo Wang wrote:
> On Wed, Nov 17, 2010 at 10:08:07AM +0800, [email protected] wrote:
> >+2) CPU hotplug emulation:
> >+
> >+The emulator reserve CPUs throu grub parameter, the reserved CPUs can be
> >+hot-add/hot-remove in software method, it emulates the process of physical
> >+cpu hotplug.
> >+
> >+When hotplug a CPU with emulator, we are using a logical CPU to emulate the CPU
> >+socket hotplug process. For the CPU supported SMT, some logical CPUs are in the
> >+same socket, but it may located in different NUMA node after we have emulator.
> >+We put the logical CPU into a fake CPU socket, and assign it an unique
> >+phys_proc_id. For the fake socket, we put one logical CPU in only.
> >+
> >+ - to hide CPUs
> >+ - Using boot option "maxcpus=N" hide CPUs
> >+ N is the number of initialize CPUs
> >+ - Using boot option "cpu_hpe=on" to enable cpu hotplug emulation
> >+ when cpu_hpe is enabled, the rest CPUs will not be initialized
> >+
> >+ - to hot-add CPU to node
> >+ $ echo nid > cpu/probe
> >+
> >+ - to hot-remove CPU
> >+ $ echo nid > cpu/release
> >+
>
> Again, we already have software CPU hotplug,
> i.e. /sys/devices/system/cpu/cpuX/online.
That is CPU online/offline in the current kernel, not physical CPU hot-add or hot-remove.
The emulator is a tool to emulate the process of physical CPU hotplug.
>
> You need to pick up another name for this.
>
> >From your documentation above, it looks like you are trying
> to move one CPU between nodes?
Yes, you are correct. With the cpu probe/release interface, you can hot-remove a
CPU from a node and hot-add it to another node.
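
That move can be sketched as a shell helper (a sketch only: the cpu/probe and cpu/release paths follow the patch, it requires a kernel booted with cpu_hpe=on, and the CPU_SYSFS override is a hypothetical knob added for dry runs):

```shell
# Sketch of moving an emulated CPU between NUMA nodes using the
# probe/release interfaces described in the patch.
move_cpu_node() {
    local from=$1
    local to=$2
    local cpu_sysfs=${CPU_SYSFS:-/sys/devices/system/cpu}

    # hot-remove the emulated CPU from its current node ...
    echo "$from" > "$cpu_sysfs/release"
    # ... then hot-add it to the target node
    echo "$to" > "$cpu_sysfs/probe"
}
```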
>
> >+ cpu_hpe=on/off
> >+ Enable/disable cpu hotplug emulation with software method. when cpu_hpe=on,
> >+ sysfs provides probe/release interface to hot add/remove cpu dynamically.
> >+ this option is disabled in default.
> >+
>
> Why not just a CONFIG? IOW, why do we need to make another boot
> parameter for this?
Only developers or QA will use the emulator; we did not want to change the
default behavior for common users who do not care about the hotplug emulator, so we
use a kernel parameter as a switch. Common users are not even aware of the
emulator's existence.


--
Thanks & Regards,
Shaohui

2010-11-22 16:01:11

by Cong Wang

Subject: Re: [8/8,v3] NUMA Hotplug Emulator: documentation

On Mon, Nov 22, 2010 at 07:33:51AM +0800, Shaohui Zheng wrote:
>On Sun, Nov 21, 2010 at 11:03:45PM +0800, Américo Wang wrote:
>>
>> >From your documentation above, it looks like you are trying
>> to move one CPU between nodes?
>Yes, you are correct. With cpu probe/release interface, you can hot-remove a
>CPU from a node, and hot-add it to another node.


Can I also move the CPU to another node _after_ it is hot-added?
Or do I have to hot-remove it first and then hot-add it again?

>>
>> >+ cpu_hpe=on/off
>> >+ Enable/disable cpu hotplug emulation with software method. when cpu_hpe=on,
>> >+ sysfs provides probe/release interface to hot add/remove cpu dynamically.
>> >+ this option is disabled in default.
>> >+
>>
>> Why not just a CONFIG? IOW, why do we need to make another boot
>> parameter for this?
>Only the developer or QA will use the emulator, we did not want to change the
>default action for common user who does not care the hotplug emulator, so we
>use a kernel parameter as a switch. The common user is not aware the existence
>of the emulator.
>

I think it is also useful to other Linux users, e.g. after I
boot with "maxcpus=1", I can still bring the remaining 3 CPUs
back without a reboot.
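
That use case can be sketched as follows (a sketch only: it assumes a kernel booted with "maxcpus=1 cpu_hpe=on", the node id and CPU count are arbitrary, and the CPU_SYSFS override is a hypothetical knob added for dry runs):

```shell
# Sketch: probe the reserved CPUs back onto a node after a maxcpus=1
# boot, then online every logical CPU that exposes an online attribute.
bring_cpus_back() {
    local nid=$1
    local count=$2
    local cpu_sysfs=${CPU_SYSFS:-/sys/devices/system/cpu}

    local i=0
    while [ "$i" -lt "$count" ]; do
        # hot-add one reserved CPU to node $nid
        echo "$nid" > "$cpu_sysfs/probe"
        i=$((i + 1))
    done

    # online each logical CPU
    for c in "$cpu_sysfs"/cpu[0-9]*; do
        if [ -f "$c/online" ]; then
            echo 1 > "$c/online"
        fi
    done
}
```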

Thanks.

--
Live like a child, think like the god.

2010-11-23 00:44:53

by Zheng, Shaohui

Subject: Re: [8/8,v3] NUMA Hotplug Emulator: documentation

On Tue, Nov 23, 2010 at 12:04:12AM +0800, Américo Wang wrote:
> On Mon, Nov 22, 2010 at 07:33:51AM +0800, Shaohui Zheng wrote:
> >On Sun, Nov 21, 2010 at 11:03:45PM +0800, Américo Wang wrote:
> >>
> >> >From your documentation above, it looks like you are trying
> >> to move one CPU between nodes?
> >Yes, you are correct. With cpu probe/release interface, you can hot-remove a
> >CPU from a node, and hot-add it to another node.
>
>
> Can I also move the CPU to another node _after_ it is hot-added?
> Or I have to hot-remove it first and then hot-add it again?
Of course you can: hot-remove it via the cpu/release interface, and then hot-add
it via the cpu/probe interface.

With the cpu probe/release interface, we can design stress test cases that
hot-add/hot-remove CPUs by script.
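
Such a stress test might look like this (a sketch only: the node id and iteration count are arbitrary, and the CPU_SYSFS override is a hypothetical knob added so the loop can be dry-run):

```shell
# Sketch of a stress test: repeatedly hot-add and hot-remove an emulated
# CPU through the probe/release interfaces described in the patch.
cpu_hotplug_stress() {
    local nid=$1
    local iterations=$2
    local cpu_sysfs=${CPU_SYSFS:-/sys/devices/system/cpu}

    local i=0
    while [ "$i" -lt "$iterations" ]; do
        echo "$nid" > "$cpu_sysfs/probe"     # hot-add a CPU to node $nid
        echo "$nid" > "$cpu_sysfs/release"   # hot-remove it again
        i=$((i + 1))
    done
}
```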
>
> >>
> >> >+ cpu_hpe=on/off
> >> >+ Enable/disable cpu hotplug emulation with software method. when cpu_hpe=on,
> >> >+ sysfs provides probe/release interface to hot add/remove cpu dynamically.
> >> >+ this option is disabled in default.
> >> >+
> >>
> >> Why not just a CONFIG? IOW, why do we need to make another boot
> >> parameter for this?
> >Only the developer or QA will use the emulator, we did not want to change the
> >default action for common user who does not care the hotplug emulator, so we
> >use a kernel parameter as a switch. The common user is not aware the existence
> >of the emulator.
> >
>
> I think it is also useful to other Linux users, e.g. after I
> boot with "maxcpus=1", I can still bring the rest 3 CPU's
> back without reboot.
You understand it very well. The probe/release interface is already implemented
on ppc, but the feature was missing on x86, so we implemented it in this patch series.

>
> Thanks.
>
> --
> Live like a child, think like the god.
>

--
Thanks & Regards,
Shaohui