2016-10-24 19:59:21

by Reza Arbab

Subject: [PATCH v5 0/3] powerpc/mm: movable hotplug memory nodes

These changes enable the dynamic creation of movable nodes on power.

On x86, the ACPI SRAT memory affinity structure can mark memory as
hotpluggable, which allows the kernel to create movable nodes at
boot.

While power has no analog of this SRAT information, we can still create
a movable memory node, post boot, by hotplugging all of the node's
memory into ZONE_MOVABLE.
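
For illustration only, here is a minimal userspace sketch of how an
administrator might request that a hotplugged memory block be onlined
into ZONE_MOVABLE through the sysfs memory hotplug interface. It is not
part of this patchset. The helper names are hypothetical; the sysfs path
layout and the "online_movable" state string are the ones described in
the kernel's memory hotplug documentation. Whether the kernel accepts
the request depends on the kernel version and the memory layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Build the sysfs path of the "state" file for memory block `block`.
 * Returns 0 on success, -1 if the buffer is too small.
 * (Hypothetical helper, for illustration only.)
 */
int memory_block_state_path(char *buf, size_t len, unsigned int block)
{
	int n;

	assert(buf != NULL);
	n = snprintf(buf, len, "/sys/devices/system/memory/memory%u/state",
		     block);
	if (n < 0 || (size_t)n >= len)
		return -1;
	return 0;
}

/*
 * Write "online_movable" to the given state file. On a real system
 * this asks the kernel to online the block into ZONE_MOVABLE.
 * Returns 0 on success, -1 on any I/O error.
 */
int request_online_movable(const char *state_path)
{
	static const char val[] = "online_movable";
	FILE *f;
	int ok;

	assert(state_path != NULL);
	f = fopen(state_path, "w");
	if (!f)
		return -1;

	ok = fwrite(val, 1, strlen(val), f) == strlen(val);
	/* sysfs reports a rejected write when the buffer is flushed. */
	if (fclose(f) != 0)
		ok = 0;
	return ok ? 0 : -1;
}
```

Calling request_online_movable() on the state file of every block in a
node is, roughly, what "hotplugging all of the node's memory into
ZONE_MOVABLE" amounts to from userspace.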

In v1, this patchset introduced a new dt compatible id to explicitly
create a memoryless node at boot. Here, things have been simplified to
be applicable regardless of the status of node hotplug on power. We
still intend to enable hotadding a pgdat, but that's now untangled as a
separate topic.

v5:
* Drop the patches which recognize the "status" property of dt memory
nodes. Firmware can set the size of "linux,usable-memory" to zero instead.

v4:
* http://lkml.kernel.org/r/[email protected]

* Rename of_fdt_is_available() to of_fdt_device_is_available().
Rename of_flat_dt_is_available() to of_flat_dt_device_is_available().

* Instead of restoring top-down allocation, ensure it never goes
bottom-up in the first place, by making movable_node arch-specific.

* Use MEMORY_HOTPLUG instead of PPC64 in the mm/Kconfig patch.

v3:
* http://lkml.kernel.org/r/[email protected]

* Use Rob Herring's suggestions to improve the node availability check.

* More verbose commit log in the patch enabling CONFIG_MOVABLE_NODE.

* Add a patch to restore top-down allocation the way x86 does.

v2:
* http://lkml.kernel.org/r/[email protected]

* Use the "status" property of standard dt memory nodes instead of
introducing a new "ibm,hotplug-aperture" compatible id.

* Remove the patch which explicitly creates a memoryless node. This set
no longer has any bearing on whether the pgdat is created at boot or
at the time of memory addition.

v1:
* http://lkml.kernel.org/r/[email protected]

Reza Arbab (3):
powerpc/mm: allow memory hotplug into a memoryless node
mm: make processing of movable_node arch-specific
mm: enable CONFIG_MOVABLE_NODE on non-x86 arches

arch/powerpc/mm/numa.c | 13 +------------
arch/x86/mm/numa.c | 35 ++++++++++++++++++++++++++++++++++-
mm/Kconfig | 2 +-
mm/memory_hotplug.c | 31 -------------------------------
4 files changed, 36 insertions(+), 45 deletions(-)

--
1.8.3.1


2016-10-24 19:59:23

by Reza Arbab

Subject: [PATCH v5 2/3] mm: make processing of movable_node arch-specific

Currently, CONFIG_MOVABLE_NODE depends on X86_64. In preparation for
enabling it on other arches, we need to factor an x86-specific detail
out of the generic mm code.

Specifically, as documented in kernel-parameters.txt, the use of
"movable_node" should remain restricted to x86:

movable_node [KNL,X86] Boot-time switch to enable the effects
of CONFIG_MOVABLE_NODE=y. See mm/Kconfig for details.

This option tells x86 to find movable nodes identified by the ACPI SRAT.
On other arches, it would have no benefit, only the undesired side
effect of setting bottom-up memblock allocation.

Since #ifdef CONFIG_MOVABLE_NODE will no longer be enough to restrict
this option to x86, move its early_param handler to an arch-specific
compilation unit instead.
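
To picture the side effect described above, here is a toy userspace
model of allocation direction. It is only a sketch and not the memblock
implementation: struct toy_range, toy_set_bottom_up() and toy_alloc()
are hypothetical names, and the model ignores alignment, reservations
and everything else memblock really handles. It assumes the free ranges
are sorted by base address and that 0 is never a valid result.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One free physical range in the toy model. */
struct toy_range {
	uint64_t base;
	uint64_t size;
};

/* Hypothetical stand-in for memblock's allocation direction flag. */
static bool toy_bottom_up;

void toy_set_bottom_up(bool enable)
{
	toy_bottom_up = enable;
}

/*
 * Pick an address for `size` bytes out of `ranges` (sorted by base).
 * Bottom-up: start of the lowest range that fits.
 * Top-down:  end of the highest range that fits.
 * Returns 0 when nothing fits.
 */
uint64_t toy_alloc(const struct toy_range *ranges, size_t n, uint64_t size)
{
	size_t i;

	assert(ranges != NULL && size > 0);

	if (toy_bottom_up) {
		for (i = 0; i < n; i++)
			if (ranges[i].size >= size)
				return ranges[i].base;
	} else {
		for (i = n; i-- > 0; )
			if (ranges[i].size >= size)
				return ranges[i].base + ranges[i].size - size;
	}
	return 0;
}
```

In this model, enabling bottom-up moves every early allocation from the
top of memory to the bottom. That is useful on x86, where it keeps
allocations near the kernel image until the SRAT is parsed, but it is a
pure change in behaviour on an arch that has no SRAT to parse.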

Signed-off-by: Reza Arbab <[email protected]>
Reviewed-by: Aneesh Kumar K.V <[email protected]>
Acked-by: Balbir Singh <[email protected]>
---
arch/x86/mm/numa.c | 35 ++++++++++++++++++++++++++++++++++-
mm/memory_hotplug.c | 31 -------------------------------
2 files changed, 34 insertions(+), 32 deletions(-)

diff --git a/arch/x86/mm/numa.c b/arch/x86/mm/numa.c
index 3f35b48..37584ba 100644
--- a/arch/x86/mm/numa.c
+++ b/arch/x86/mm/numa.c
@@ -886,6 +886,38 @@ const struct cpumask *cpumask_of_node(int node)
#endif /* !CONFIG_DEBUG_PER_CPU_MAPS */

#ifdef CONFIG_MEMORY_HOTPLUG
+
+static int __init cmdline_parse_movable_node(char *p)
+{
+#ifdef CONFIG_MOVABLE_NODE
+ /*
+ * Memory used by the kernel cannot be hot-removed because Linux
+ * cannot migrate the kernel pages. When memory hotplug is
+ * enabled, we should prevent memblock from allocating memory
+ * for the kernel.
+ *
+ * ACPI SRAT records all hotpluggable memory ranges. But before
+ * SRAT is parsed, we don't know about it.
+ *
+ * The kernel image is loaded into memory at very early time. We
+ * cannot prevent this anyway. So on NUMA system, we set any
+ * node the kernel resides in as un-hotpluggable.
+ *
+ * Since on modern servers, one node could have double-digit
+ * gigabytes memory, we can assume the memory around the kernel
+ * image is also un-hotpluggable. So before SRAT is parsed, just
+ * allocate memory near the kernel image to try the best to keep
+ * the kernel away from hotpluggable memory.
+ */
+ memblock_set_bottom_up(true);
+ movable_node_enabled = true;
+#else
+ pr_warn("movable_node option not supported\n");
+#endif
+ return 0;
+}
+early_param("movable_node", cmdline_parse_movable_node);
+
int memory_add_physaddr_to_nid(u64 start)
{
struct numa_meminfo *mi = &numa_meminfo;
@@ -898,4 +930,5 @@ int memory_add_physaddr_to_nid(u64 start)
return nid;
}
EXPORT_SYMBOL_GPL(memory_add_physaddr_to_nid);
-#endif
+
+#endif /* CONFIG_MEMORY_HOTPLUG */
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 9629273..9931e7e 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1738,37 +1738,6 @@ static bool can_offline_normal(struct zone *zone, unsigned long nr_pages)
}
#endif /* CONFIG_MOVABLE_NODE */

-static int __init cmdline_parse_movable_node(char *p)
-{
-#ifdef CONFIG_MOVABLE_NODE
- /*
- * Memory used by the kernel cannot be hot-removed because Linux
- * cannot migrate the kernel pages. When memory hotplug is
- * enabled, we should prevent memblock from allocating memory
- * for the kernel.
- *
- * ACPI SRAT records all hotpluggable memory ranges. But before
- * SRAT is parsed, we don't know about it.
- *
- * The kernel image is loaded into memory at very early time. We
- * cannot prevent this anyway. So on NUMA system, we set any
- * node the kernel resides in as un-hotpluggable.
- *
- * Since on modern servers, one node could have double-digit
- * gigabytes memory, we can assume the memory around the kernel
- * image is also un-hotpluggable. So before SRAT is parsed, just
- * allocate memory near the kernel image to try the best to keep
- * the kernel away from hotpluggable memory.
- */
- memblock_set_bottom_up(true);
- movable_node_enabled = true;
-#else
- pr_warn("movable_node option not supported\n");
-#endif
- return 0;
-}
-early_param("movable_node", cmdline_parse_movable_node);
-
/* check which state of node_states will be changed when offline memory */
static void node_states_check_changes_offline(unsigned long nr_pages,
struct zone *zone, struct memory_notify *arg)
--
1.8.3.1

2016-10-24 19:59:18

by Reza Arbab

Subject: [PATCH v5 3/3] mm: enable CONFIG_MOVABLE_NODE on non-x86 arches

To support movable memory nodes (CONFIG_MOVABLE_NODE), at least one of
the following must be true:

1. We're on x86. This arch has the capability to identify movable nodes
at boot by parsing the ACPI SRAT, if the movable_node option is used.

2. Our config supports memory hotplug, which means that a movable node
can be created by hotplugging all of its memory into ZONE_MOVABLE.

Fix the Kconfig definition of CONFIG_MOVABLE_NODE, which currently
recognizes (1), but not (2).
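
As a worked example of the dependency change, the Kconfig "depends on"
lines before and after this patch can be written as boolean
expressions. This is only an illustrative sketch: struct toy_config and
both function names are hypothetical, and the fields simply mirror the
symbols named in the config MOVABLE_NODE entry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical mirror of the symbols MOVABLE_NODE depends on. */
struct toy_config {
	bool have_memblock;
	bool no_bootmem;
	bool x86_64;
	bool memory_hotplug;
	bool numa;
};

/* Dependencies before this patch: X86_64 is mandatory. */
bool movable_node_selectable_old(const struct toy_config *c)
{
	assert(c != NULL);
	return c->have_memblock && c->no_bootmem && c->x86_64 && c->numa;
}

/* Dependencies after this patch: X86_64 || MEMORY_HOTPLUG. */
bool movable_node_selectable_new(const struct toy_config *c)
{
	assert(c != NULL);
	return c->have_memblock && c->no_bootmem &&
	       (c->x86_64 || c->memory_hotplug) && c->numa;
}
```

For a non-x86 config with HAVE_MEMBLOCK, NO_BOOTMEM, NUMA and
MEMORY_HOTPLUG all set, the old expression is false and the new one is
true. For an x86_64 config, both expressions give the same answer.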

Signed-off-by: Reza Arbab <[email protected]>
Reviewed-by: Aneesh Kumar K.V <[email protected]>
Acked-by: Balbir Singh <[email protected]>
---
mm/Kconfig | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/Kconfig b/mm/Kconfig
index be0ee11..5d0818f 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -153,7 +153,7 @@ config MOVABLE_NODE
bool "Enable to assign a node which has only movable memory"
depends on HAVE_MEMBLOCK
depends on NO_BOOTMEM
- depends on X86_64
+ depends on X86_64 || MEMORY_HOTPLUG
depends on NUMA
default n
help
--
1.8.3.1

2016-10-24 19:59:56

by Reza Arbab

Subject: [PATCH v5 1/3] powerpc/mm: allow memory hotplug into a memoryless node

Remove the check which prevents us from hotplugging into an empty node.

The original commit b226e4621245 ("[PATCH] powerpc: don't add memory to
empty node/zone") states that this check was intended to be a temporary
measure. It is a workaround for an oops which no longer occurs.
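
The behavioural change can be pictured with a small userspace model of
the tail of hot_add_scn_to_nid(). This is a sketch under stated
assumptions, not the kernel code: struct toy_node and both function
names are hypothetical, the model starts after nid has already been
resolved to an online node, and the BUG_ON() is represented by a -1
return value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the parts of a pgdat the check looked at. */
struct toy_node {
	bool online;
	unsigned long spanned_pages;
};

/*
 * Old behaviour: refuse to return a memoryless node. Fall back to the
 * first online node that spans any pages; -1 models BUG_ON(!found).
 */
int toy_pick_nid_old(const struct toy_node *nodes, size_t n, int nid)
{
	size_t i;

	assert(nodes != NULL && nid >= 0 && (size_t)nid < n);

	if (nodes[nid].spanned_pages)
		return nid;

	for (i = 0; i < n; i++)
		if (nodes[i].online && nodes[i].spanned_pages)
			return (int)i;

	return -1;
}

/* New behaviour: the resolved node is used even if it has no memory. */
int toy_pick_nid_new(const struct toy_node *nodes, size_t n, int nid)
{
	assert(nodes != NULL && nid >= 0 && (size_t)nid < n);
	return nid;
}
```

With node 0 populated and node 1 online but memoryless, the old model
redirects an add aimed at node 1 to node 0, while the new model lets
the memory land in node 1, which is what a movable node created by
hotplug requires.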

Signed-off-by: Reza Arbab <[email protected]>
Reviewed-by: Aneesh Kumar K.V <[email protected]>
Acked-by: Balbir Singh <[email protected]>
Cc: Nathan Fontenot <[email protected]>
Cc: Bharata B Rao <[email protected]>
---
arch/powerpc/mm/numa.c | 13 +------------
1 file changed, 1 insertion(+), 12 deletions(-)

diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c
index a51c188..0cb6bd8 100644
--- a/arch/powerpc/mm/numa.c
+++ b/arch/powerpc/mm/numa.c
@@ -1085,7 +1085,7 @@ static int hot_add_node_scn_to_nid(unsigned long scn_addr)
int hot_add_scn_to_nid(unsigned long scn_addr)
{
struct device_node *memory = NULL;
- int nid, found = 0;
+ int nid;

if (!numa_enabled || (min_common_depth < 0))
return first_online_node;
@@ -1101,17 +1101,6 @@ int hot_add_scn_to_nid(unsigned long scn_addr)
if (nid < 0 || !node_online(nid))
nid = first_online_node;

- if (NODE_DATA(nid)->node_spanned_pages)
- return nid;
-
- for_each_online_node(nid) {
- if (NODE_DATA(nid)->node_spanned_pages) {
- found = 1;
- break;
- }
- }
-
- BUG_ON(!found);
return nid;
}

--
1.8.3.1