2018-06-01 12:55:08

by Oscar Salvador

[permalink] [raw]
Subject: [PATCH 0/4] Small cleanup for memoryhotplug

From: Oscar Salvador <[email protected]>


Hi,

I wanted to give it a try and do a small cleanup in the memhotplug's code.
A lot more could be done, but I wanted to start somewhere.
I tried to unify/remove duplicated code.

The following is what this patchset does:

1) add_memory_resource() has code to allocate a node in case it was offline.
Since try_online_node has some code for that as well, I just made add_memory_resource() to
use that so we can remove duplicated code..
This is better explained in patch 1/4.

2) register_mem_sect_under_node() will be called only from link_mem_sections()

3) Get rid of link_mem_sections() in favour of walk_memory_range() with a callback to
register_mem_sect_under_node()

4) Drop unnecessary checks from register_mem_sect_under_node()


I have done some tests and I could not see anything broken because of
this patchset.

Oscar Salvador (4):
mm/memory_hotplug: Make add_memory_resource use __try_online_node
mm/memory_hotplug: Call register_mem_sect_under_node
mm/memory_hotplug: Get rid of link_mem_sections
mm/memory_hotplug: Drop unnecessary checks from
register_mem_sect_under_node

drivers/base/memory.c | 2 -
drivers/base/node.c | 52 +++++---------------------
include/linux/node.h | 21 +++++------
mm/memory_hotplug.c | 101 ++++++++++++++++++++++++++------------------------
4 files changed, 71 insertions(+), 105 deletions(-)

--
2.13.6



2018-06-01 12:55:21

by Oscar Salvador

[permalink] [raw]
Subject: [PATCH 2/4] mm/memory_hotplug: Call register_mem_sect_under_node

From: Oscar Salvador <[email protected]>

When hotpluging memory, it is possible that two calls are being made
to register_mem_sect_under_node().
One comes from __add_section()->hotplug_memory_register()
and the other from add_memory_resource()->link_mem_sections() if
we had to register a new node.

In case we had to register a new node, hotplug_memory_register()
will only handle/allocate the memory_block's since
register_mem_sect_under_node() will return right away because the
node it is not online yet.

I think it is better if we leave hotplug_memory_register() to
handle/allocate only memory_block's and make link_mem_sections()
to call register_mem_sect_under_node().

So this patch removes the call to register_mem_sect_under_node()
from hotplug_memory_register(), and moves the call to link_mem_sections()
out of the condition, so it will always be called.
In this way we only have one place where the memory sections
are registered.

Signed-off-by: Oscar Salvador <[email protected]>
---
drivers/base/memory.c | 2 --
mm/memory_hotplug.c | 40 ++++++++++++++++++----------------------
2 files changed, 18 insertions(+), 24 deletions(-)

diff --git a/drivers/base/memory.c b/drivers/base/memory.c
index f5e560188a18..c8a1cb0b6136 100644
--- a/drivers/base/memory.c
+++ b/drivers/base/memory.c
@@ -736,8 +736,6 @@ int hotplug_memory_register(int nid, struct mem_section *section)
mem->section_count++;
}

- if (mem->section_count == sections_per_block)
- ret = register_mem_sect_under_node(mem, nid, false);
out:
mutex_unlock(&mem_sysfs_mutex);
return ret;
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 29a5fc89bdb1..f84ef96175ab 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1118,6 +1118,7 @@ int __ref add_memory_resource(int nid, struct resource *res, bool online)
u64 start, size;
bool new_node;
int ret;
+ unsigned long start_pfn, nr_pages;

start = res->start;
size = resource_size(res);
@@ -1147,34 +1148,21 @@ int __ref add_memory_resource(int nid, struct resource *res, bool online)
if (ret < 0)
goto error;

- /* we online node here. we can't roll back from here. */
- node_set_online(nid);
-
if (new_node) {
- unsigned long start_pfn = start >> PAGE_SHIFT;
- unsigned long nr_pages = size >> PAGE_SHIFT;
-
+ /* we online node here. we can't roll back from here. */
+ node_set_online(nid);
ret = __register_one_node(nid);
if (ret)
goto register_fail;
-
- /*
- * link memory sections under this node. This is already
- * done when creatig memory section in register_new_memory
- * but that depends to have the node registered so offline
- * nodes have to go through register_node.
- * TODO clean up this mess.
- */
- ret = link_mem_sections(nid, start_pfn, nr_pages, false);
-register_fail:
- /*
- * If sysfs file of new node can't create, cpu on the node
- * can't be hot-added. There is no rollback way now.
- * So, check by BUG_ON() to catch it reluctantly..
- */
- BUG_ON(ret);
}

+ /* link memory sections under this node.*/
+ start_pfn = start >> PAGE_SHIFT;
+ nr_pages = size >> PAGE_SHIFT;
+ ret = link_mem_sections(nid, start_pfn, nr_pages, false);
+ if (ret)
+ goto register_fail;
+
/* create new memmap entry */
firmware_map_add_hotplug(start, start + size, "System RAM");

@@ -1185,6 +1173,14 @@ int __ref add_memory_resource(int nid, struct resource *res, bool online)

goto out;

+register_fail:
+ /*
+ * If sysfs file of new node can't create, cpu on the node
+ * can't be hot-added. There is no rollback way now.
+ * So, check by BUG_ON() to catch it reluctantly..
+ */
+ BUG_ON(ret);
+
error:
/* rollback pgdat allocation and others */
if (new_node)
--
2.13.6


2018-06-01 12:56:17

by Oscar Salvador

[permalink] [raw]
Subject: [PATCH 1/4] mm/memory_hotplug: Make add_memory_resource use __try_online_node

From: Oscar Salvador <[email protected]>

add_memory_resource() contains code to allocate a new node in case
it is necessary.
Since try_online_node() also hast some code for this purpose,
let us make use of that and remove duplicate code.

This introduces __try_online_node(), which is called by add_memory_resource()
and try_online_node().
__try_online_node() has two new parameters, start_addr of the node,
and if the node should be onlined and registered right away.
This is always wanted if we are calling from do_cpu_up(), but not
when we are calling from memhotplug code.
Nothing changes from the point of view of the users of try_online_node(),
since try_online_node passes start_addr=0 and online_node=true to
__try_online_node().

Signed-off-by: Oscar Salvador <[email protected]>
---
mm/memory_hotplug.c | 61 +++++++++++++++++++++++++++++------------------------
1 file changed, 34 insertions(+), 27 deletions(-)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 7deb49f69e27..29a5fc89bdb1 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1034,8 +1034,10 @@ static pg_data_t __ref *hotadd_new_pgdat(int nid, u64 start)
return pgdat;
}

-static void rollback_node_hotadd(int nid, pg_data_t *pgdat)
+static void rollback_node_hotadd(int nid)
{
+ pg_data_t *pgdat = NODE_DATA(nid);
+
arch_refresh_nodedata(nid, NULL);
free_percpu(pgdat->per_cpu_nodestats);
arch_free_nodedata(pgdat);
@@ -1046,28 +1048,43 @@ static void rollback_node_hotadd(int nid, pg_data_t *pgdat)
/**
* try_online_node - online a node if offlined
* @nid: the node ID
- *
+ * @start: start addr of the node
+ * @set_node_online: Whether we want to online the node
* called by cpu_up() to online a node without onlined memory.
*/
-int try_online_node(int nid)
+static int __try_online_node(int nid, u64 start, bool set_node_online)
{
- pg_data_t *pgdat;
- int ret;
+ pg_data_t *pgdat;
+ int ret = 1;

if (node_online(nid))
return 0;

- mem_hotplug_begin();
- pgdat = hotadd_new_pgdat(nid, 0);
+ pgdat = hotadd_new_pgdat(nid, start);
if (!pgdat) {
pr_err("Cannot online node %d due to NULL pgdat\n", nid);
ret = -ENOMEM;
goto out;
}
- node_set_online(nid);
- ret = register_one_node(nid);
- BUG_ON(ret);
+
+ if (set_node_online) {
+ node_set_online(nid);
+ ret = register_one_node(nid);
+ BUG_ON(ret);
+ }
out:
+ return ret;
+}
+
+/*
+ * Users of this function always want to online/register the node
+ */
+int try_online_node(int nid)
+{
+ int ret;
+
+ mem_hotplug_begin();
+ ret = __try_online_node (nid, 0, true);
mem_hotplug_done();
return ret;
}
@@ -1099,8 +1116,6 @@ static int online_memory_block(struct memory_block *mem, void *arg)
int __ref add_memory_resource(int nid, struct resource *res, bool online)
{
u64 start, size;
- pg_data_t *pgdat = NULL;
- bool new_pgdat;
bool new_node;
int ret;

@@ -1111,11 +1126,6 @@ int __ref add_memory_resource(int nid, struct resource *res, bool online)
if (ret)
return ret;

- { /* Stupid hack to suppress address-never-null warning */
- void *p = NODE_DATA(nid);
- new_pgdat = !p;
- }
-
mem_hotplug_begin();

/*
@@ -1126,17 +1136,14 @@ int __ref add_memory_resource(int nid, struct resource *res, bool online)
*/
memblock_add_node(start, size, nid);

- new_node = !node_online(nid);
- if (new_node) {
- pgdat = hotadd_new_pgdat(nid, start);
- ret = -ENOMEM;
- if (!pgdat)
- goto error;
- }
+ ret = __try_online_node (nid, start, false);
+ new_node = !!(ret > 0);
+ if (ret < 0)
+ goto error;
+

/* call arch's memory hotadd */
ret = arch_add_memory(nid, start, size, NULL, true);
-
if (ret < 0)
goto error;

@@ -1180,8 +1187,8 @@ int __ref add_memory_resource(int nid, struct resource *res, bool online)

error:
/* rollback pgdat allocation and others */
- if (new_pgdat && pgdat)
- rollback_node_hotadd(nid, pgdat);
+ if (new_node)
+ rollback_node_hotadd(nid);
memblock_remove(start, size);

out:
--
2.13.6


2018-06-01 12:56:19

by Oscar Salvador

[permalink] [raw]
Subject: [PATCH 4/4] mm/memory_hotplug: Drop unnecessary checks from register_mem_sect_under_node

From: Oscar Salvador <[email protected]>

Callers of register_mem_sect_under_node() are always passing a valid
memory_block (not NULL), so we can safely drop the check for NULL.

In the same way, register_mem_sect_under_node() is only called in case
the node is online, so we can safely remove that check as well.

Signed-off-by: Oscar Salvador <[email protected]>
---
drivers/base/node.c | 5 -----
1 file changed, 5 deletions(-)

diff --git a/drivers/base/node.c b/drivers/base/node.c
index 248c712e8de5..681be04351bc 100644
--- a/drivers/base/node.c
+++ b/drivers/base/node.c
@@ -415,12 +415,7 @@ int register_mem_sect_under_node(struct memory_block *mem_blk, int nid,
int ret;
unsigned long pfn, sect_start_pfn, sect_end_pfn;

- if (!mem_blk)
- return -EFAULT;
-
mem_blk->nid = nid;
- if (!node_online(nid))
- return 0;

sect_start_pfn = section_nr_to_pfn(mem_blk->start_section_nr);
sect_end_pfn = section_nr_to_pfn(mem_blk->end_section_nr);
--
2.13.6


2018-06-01 12:56:59

by Oscar Salvador

[permalink] [raw]
Subject: [PATCH 3/4] mm/memory_hotplug: Get rid of link_mem_sections

From: Oscar Salvador <[email protected]>

link_mem_sections() and walk_memory_range() share most of the code,
so we can use walk_memory_range() with a callback to register_mem_sect_under_node()
instead of using link_mem_sections().

To control whether the node id must be check, two new functions has been added:

register_mem_sect_under_node_nocheck_node()
and
register_mem_sect_under_node_check_node()

They both call register_mem_sect_under_node_check() with
the parameter check_nid set to true or false.

Signed-off-by: Oscar Salvador <[email protected]>
---
drivers/base/node.c | 47 ++++++++++-------------------------------------
include/linux/node.h | 21 +++++++++------------
mm/memory_hotplug.c | 8 ++++----
3 files changed, 23 insertions(+), 53 deletions(-)

diff --git a/drivers/base/node.c b/drivers/base/node.c
index a5e821d09656..248c712e8de5 100644
--- a/drivers/base/node.c
+++ b/drivers/base/node.c
@@ -398,6 +398,16 @@ static int __ref get_nid_for_pfn(unsigned long pfn)
return pfn_to_nid(pfn);
}

+int register_mem_sect_under_node_check_node(struct memory_block *mem_blk, void *nid)
+{
+ return register_mem_sect_under_node (mem_blk, *(int *)nid, true);
+}
+
+int register_mem_sect_under_node_nocheck_node(struct memory_block *mem_blk, void *nid)
+{
+ return register_mem_sect_under_node (mem_blk, *(int *)nid, false);
+}
+
/* register memory section under specified node if it spans that node */
int register_mem_sect_under_node(struct memory_block *mem_blk, int nid,
bool check_nid)
@@ -490,43 +500,6 @@ int unregister_mem_sect_under_nodes(struct memory_block *mem_blk,
return 0;
}

-int link_mem_sections(int nid, unsigned long start_pfn, unsigned long nr_pages,
- bool check_nid)
-{
- unsigned long end_pfn = start_pfn + nr_pages;
- unsigned long pfn;
- struct memory_block *mem_blk = NULL;
- int err = 0;
-
- for (pfn = start_pfn; pfn < end_pfn; pfn += PAGES_PER_SECTION) {
- unsigned long section_nr = pfn_to_section_nr(pfn);
- struct mem_section *mem_sect;
- int ret;
-
- if (!present_section_nr(section_nr))
- continue;
- mem_sect = __nr_to_section(section_nr);
-
- /* same memblock ? */
- if (mem_blk)
- if ((section_nr >= mem_blk->start_section_nr) &&
- (section_nr <= mem_blk->end_section_nr))
- continue;
-
- mem_blk = find_memory_block_hinted(mem_sect, mem_blk);
-
- ret = register_mem_sect_under_node(mem_blk, nid, check_nid);
- if (!err)
- err = ret;
-
- /* discard ref obtained in find_memory_block() */
- }
-
- if (mem_blk)
- kobject_put(&mem_blk->dev.kobj);
- return err;
-}
-
#ifdef CONFIG_HUGETLBFS
/*
* Handle per node hstate attribute [un]registration on transistions
diff --git a/include/linux/node.h b/include/linux/node.h
index 6d336e38d155..1158bea9be52 100644
--- a/include/linux/node.h
+++ b/include/linux/node.h
@@ -31,19 +31,11 @@ struct memory_block;
extern struct node *node_devices[];
typedef void (*node_registration_func_t)(struct node *);

-#if defined(CONFIG_MEMORY_HOTPLUG_SPARSE) && defined(CONFIG_NUMA)
-extern int link_mem_sections(int nid, unsigned long start_pfn,
- unsigned long nr_pages, bool check_nid);
-#else
-static inline int link_mem_sections(int nid, unsigned long start_pfn,
- unsigned long nr_pages, bool check_nid)
-{
- return 0;
-}
-#endif
-
extern void unregister_node(struct node *node);
#ifdef CONFIG_NUMA
+#if defined(CONFIG_MEMORY_HOTPLUG_SPARSE)
+extern int register_mem_sect_under_node_check_node(struct memory_block *mem_blk, void *nid);
+#endif
/* Core of the node registration - only memory hotplug should use this */
extern int __register_one_node(int nid);

@@ -54,12 +46,17 @@ static inline int register_one_node(int nid)

if (node_online(nid)) {
struct pglist_data *pgdat = NODE_DATA(nid);
+ unsigned long start = pgdat->node_start_pfn;
+ unsigned long size = pgdat->node_spanned_pages;

error = __register_one_node(nid);
if (error)
return error;
/* link memory sections under this node */
- error = link_mem_sections(nid, pgdat->node_start_pfn, pgdat->node_spanned_pages, true);
+#if defined(CONFIG_MEMORY_HOTPLUG_SPARSE)
+ error = walk_memory_range(PFN_DOWN(start), PFN_UP(start + size - 1),
+ (void *)&nid, register_mem_sect_under_node_check_node);
+#endif
}

return error;
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index f84ef96175ab..ac21dc506b84 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -40,6 +40,8 @@

#include "internal.h"

+extern int register_mem_sect_under_node_nocheck_node(struct memory_block *mem_blk, void *nid);
+
/*
* online_page_callback contains pointer to current page onlining function.
* Initially it is generic_online_page(). If it is required it could be
@@ -1118,7 +1120,6 @@ int __ref add_memory_resource(int nid, struct resource *res, bool online)
u64 start, size;
bool new_node;
int ret;
- unsigned long start_pfn, nr_pages;

start = res->start;
size = resource_size(res);
@@ -1157,9 +1158,8 @@ int __ref add_memory_resource(int nid, struct resource *res, bool online)
}

/* link memory sections under this node.*/
- start_pfn = start >> PAGE_SHIFT;
- nr_pages = size >> PAGE_SHIFT;
- ret = link_mem_sections(nid, start_pfn, nr_pages, false);
+ ret = walk_memory_range(PFN_DOWN(start), PFN_UP(start + size - 1),
+ (void *)&nid, register_mem_sect_under_node_nocheck_node);
if (ret)
goto register_fail;

--
2.13.6


2018-06-07 10:48:56

by Jonathan Cameron

[permalink] [raw]
Subject: Re: [PATCH 0/4] Small cleanup for memoryhotplug

On Fri, 1 Jun 2018 14:53:17 +0200
<[email protected]> wrote:

> From: Oscar Salvador <[email protected]>
>
>
> Hi,
>
> I wanted to give it a try and do a small cleanup in the memhotplug's code.
> A lot more could be done, but I wanted to start somewhere.
> I tried to unify/remove duplicated code.
>
> The following is what this patchset does:
>
> 1) add_memory_resource() has code to allocate a node in case it was offline.
> Since try_online_node has some code for that as well, I just made add_memory_resource() to
> use that so we can remove duplicated code..
> This is better explained in patch 1/4.
>
> 2) register_mem_sect_under_node() will be called only from link_mem_sections()
>
> 3) Get rid of link_mem_sections() in favour of walk_memory_range() with a callback to
> register_mem_sect_under_node()
>
> 4) Drop unnecessary checks from register_mem_sect_under_node()
>
>
> I have done some tests and I could not see anything broken because of
> this patchset.
Works fine with the patch set for arm64 I'm intermittently working on.
Or at least I don't need to make any additional changes on top of what I currently
have!

Tested-by: Jonathan Cameron <[email protected]>

Thanks,

Jonathan
>
> Oscar Salvador (4):
> mm/memory_hotplug: Make add_memory_resource use __try_online_node
> mm/memory_hotplug: Call register_mem_sect_under_node
> mm/memory_hotplug: Get rid of link_mem_sections
> mm/memory_hotplug: Drop unnecessary checks from
> register_mem_sect_under_node
>
> drivers/base/memory.c | 2 -
> drivers/base/node.c | 52 +++++---------------------
> include/linux/node.h | 21 +++++------
> mm/memory_hotplug.c | 101 ++++++++++++++++++++++++++------------------------
> 4 files changed, 71 insertions(+), 105 deletions(-)
>



2018-06-07 10:50:53

by Jonathan Cameron

[permalink] [raw]
Subject: Re: [PATCH 1/4] mm/memory_hotplug: Make add_memory_resource use __try_online_node

On Fri, 1 Jun 2018 14:53:18 +0200
<[email protected]> wrote:

> From: Oscar Salvador <[email protected]>
>
> add_memory_resource() contains code to allocate a new node in case
> it is necessary.
> Since try_online_node() also hast some code for this purpose,
> let us make use of that and remove duplicate code.
>
> This introduces __try_online_node(), which is called by add_memory_resource()
> and try_online_node().
> __try_online_node() has two new parameters, start_addr of the node,
> and if the node should be onlined and registered right away.
> This is always wanted if we are calling from do_cpu_up(), but not
> when we are calling from memhotplug code.
> Nothing changes from the point of view of the users of try_online_node(),
> since try_online_node passes start_addr=0 and online_node=true to
> __try_online_node().
>

Trivial whitespace issue inline...

> Signed-off-by: Oscar Salvador <[email protected]>
> ---
> mm/memory_hotplug.c | 61 +++++++++++++++++++++++++++++------------------------
> 1 file changed, 34 insertions(+), 27 deletions(-)
>
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index 7deb49f69e27..29a5fc89bdb1 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -1034,8 +1034,10 @@ static pg_data_t __ref *hotadd_new_pgdat(int nid, u64 start)
> return pgdat;
> }
>
> -static void rollback_node_hotadd(int nid, pg_data_t *pgdat)
> +static void rollback_node_hotadd(int nid)
> {
> + pg_data_t *pgdat = NODE_DATA(nid);
> +
> arch_refresh_nodedata(nid, NULL);
> free_percpu(pgdat->per_cpu_nodestats);
> arch_free_nodedata(pgdat);
> @@ -1046,28 +1048,43 @@ static void rollback_node_hotadd(int nid, pg_data_t *pgdat)
> /**
> * try_online_node - online a node if offlined
> * @nid: the node ID
> - *
> + * @start: start addr of the node
> + * @set_node_online: Whether we want to online the node
> * called by cpu_up() to online a node without onlined memory.
> */
> -int try_online_node(int nid)
> +static int __try_online_node(int nid, u64 start, bool set_node_online)
> {
> - pg_data_t *pgdat;
> - int ret;
> + pg_data_t *pgdat;
> + int ret = 1;
>
> if (node_online(nid))
> return 0;
>
> - mem_hotplug_begin();
> - pgdat = hotadd_new_pgdat(nid, 0);
> + pgdat = hotadd_new_pgdat(nid, start);
> if (!pgdat) {
> pr_err("Cannot online node %d due to NULL pgdat\n", nid);
> ret = -ENOMEM;
> goto out;
> }
> - node_set_online(nid);
> - ret = register_one_node(nid);
> - BUG_ON(ret);
> +
> + if (set_node_online) {
> + node_set_online(nid);
> + ret = register_one_node(nid);
> + BUG_ON(ret);
> + }
> out:
> + return ret;
> +}
> +
> +/*
> + * Users of this function always want to online/register the node
> + */
> +int try_online_node(int nid)
> +{
> + int ret;
> +
> + mem_hotplug_begin();
> + ret = __try_online_node (nid, 0, true);
> mem_hotplug_done();
> return ret;
> }
> @@ -1099,8 +1116,6 @@ static int online_memory_block(struct memory_block *mem, void *arg)
> int __ref add_memory_resource(int nid, struct resource *res, bool online)
> {
> u64 start, size;
> - pg_data_t *pgdat = NULL;
> - bool new_pgdat;
> bool new_node;
> int ret;
>
> @@ -1111,11 +1126,6 @@ int __ref add_memory_resource(int nid, struct resource *res, bool online)
> if (ret)
> return ret;
>
> - { /* Stupid hack to suppress address-never-null warning */
> - void *p = NODE_DATA(nid);
> - new_pgdat = !p;
> - }
> -
> mem_hotplug_begin();
>
> /*
> @@ -1126,17 +1136,14 @@ int __ref add_memory_resource(int nid, struct resource *res, bool online)
> */
> memblock_add_node(start, size, nid);
>
> - new_node = !node_online(nid);
> - if (new_node) {
> - pgdat = hotadd_new_pgdat(nid, start);
> - ret = -ENOMEM;
> - if (!pgdat)
> - goto error;
> - }
> + ret = __try_online_node (nid, start, false);

space before (


> + new_node = !!(ret > 0);
> + if (ret < 0)
> + goto error;
> +
>
> /* call arch's memory hotadd */
> ret = arch_add_memory(nid, start, size, NULL, true);
> -
> if (ret < 0)
> goto error;
>
> @@ -1180,8 +1187,8 @@ int __ref add_memory_resource(int nid, struct resource *res, bool online)
>
> error:
> /* rollback pgdat allocation and others */
> - if (new_pgdat && pgdat)
> - rollback_node_hotadd(nid, pgdat);
> + if (new_node)
> + rollback_node_hotadd(nid);
> memblock_remove(start, size);
>
> out:



2018-06-07 16:27:29

by Oscar Salvador

[permalink] [raw]
Subject: Re: [PATCH 0/4] Small cleanup for memoryhotplug

On Thu, Jun 07, 2018 at 11:42:45AM +0100, Jonathan Cameron wrote:
> On Fri, 1 Jun 2018 14:53:17 +0200
> <[email protected]> wrote:
>
> > From: Oscar Salvador <[email protected]>
> >
> >
> > Hi,
> >
> > I wanted to give it a try and do a small cleanup in the memhotplug's code.
> > A lot more could be done, but I wanted to start somewhere.
> > I tried to unify/remove duplicated code.
> >
> > The following is what this patchset does:
> >
> > 1) add_memory_resource() has code to allocate a node in case it was offline.
> > Since try_online_node has some code for that as well, I just made add_memory_resource() to
> > use that so we can remove duplicated code..
> > This is better explained in patch 1/4.
> >
> > 2) register_mem_sect_under_node() will be called only from link_mem_sections()
> >
> > 3) Get rid of link_mem_sections() in favour of walk_memory_range() with a callback to
> > register_mem_sect_under_node()
> >
> > 4) Drop unnecessary checks from register_mem_sect_under_node()
> >
> >
> > I have done some tests and I could not see anything broken because of
> > this patchset.
> Works fine with the patch set for arm64 I'm intermittently working on.
> Or at least I don't need to make any additional changes on top of what I currently
> have!
>
> Tested-by: Jonathan Cameron <[email protected]>

Thanks for having tested it Jonathan ;-)!

>
> Thanks,
>
> Jonathan
> >
> > Oscar Salvador (4):
> > mm/memory_hotplug: Make add_memory_resource use __try_online_node
> > mm/memory_hotplug: Call register_mem_sect_under_node
> > mm/memory_hotplug: Get rid of link_mem_sections
> > mm/memory_hotplug: Drop unnecessary checks from
> > register_mem_sect_under_node
> >
> > drivers/base/memory.c | 2 -
> > drivers/base/node.c | 52 +++++---------------------
> > include/linux/node.h | 21 +++++------
> > mm/memory_hotplug.c | 101 ++++++++++++++++++++++++++------------------------
> > 4 files changed, 71 insertions(+), 105 deletions(-)
> >
>
>

2018-06-18 07:14:50

by Oscar Salvador

[permalink] [raw]
Subject: Re: [PATCH 0/4] Small cleanup for memoryhotplug

On Fri, Jun 01, 2018 at 02:53:17PM +0200, [email protected] wrote:
> From: Oscar Salvador <[email protected]>
>
>
> Hi,
>
> I wanted to give it a try and do a small cleanup in the memhotplug's code.
> A lot more could be done, but I wanted to start somewhere.
> I tried to unify/remove duplicated code.
>
> The following is what this patchset does:
>
> 1) add_memory_resource() has code to allocate a node in case it was offline.
> Since try_online_node has some code for that as well, I just made add_memory_resource() to
> use that so we can remove duplicated code..
> This is better explained in patch 1/4.
>
> 2) register_mem_sect_under_node() will be called only from link_mem_sections()
>
> 3) Get rid of link_mem_sections() in favour of walk_memory_range() with a callback to
> register_mem_sect_under_node()
>
> 4) Drop unnecessary checks from register_mem_sect_under_node()
>
>
> I have done some tests and I could not see anything broken because of
> this patchset.
>
> Oscar Salvador (4):
> mm/memory_hotplug: Make add_memory_resource use __try_online_node
> mm/memory_hotplug: Call register_mem_sect_under_node
> mm/memory_hotplug: Get rid of link_mem_sections
> mm/memory_hotplug: Drop unnecessary checks from
> register_mem_sect_under_node
>
> drivers/base/memory.c | 2 -
> drivers/base/node.c | 52 +++++---------------------
> include/linux/node.h | 21 +++++------
> mm/memory_hotplug.c | 101 ++++++++++++++++++++++++++------------------------
> 4 files changed, 71 insertions(+), 105 deletions(-)
>
> --
> 2.13.6
>

Hi,

maybe this can get reviewed now that the merge window is closed.

Thanks

Best Regards
Oscar Salvador



2018-06-20 22:19:38

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH 1/4] mm/memory_hotplug: Make add_memory_resource use __try_online_node

On Fri, 1 Jun 2018 14:53:18 +0200 [email protected] wrote:

> From: Oscar Salvador <[email protected]>
>
> add_memory_resource() contains code to allocate a new node in case
> it is necessary.
> Since try_online_node() also hast some code for this purpose,
> let us make use of that and remove duplicate code.
>
> This introduces __try_online_node(), which is called by add_memory_resource()
> and try_online_node().
> __try_online_node() has two new parameters, start_addr of the node,
> and if the node should be onlined and registered right away.
> This is always wanted if we are calling from do_cpu_up(), but not
> when we are calling from memhotplug code.
> Nothing changes from the point of view of the users of try_online_node(),
> since try_online_node passes start_addr=0 and online_node=true to
> __try_online_node().
>
> ...
>
> @@ -1126,17 +1136,14 @@ int __ref add_memory_resource(int nid, struct resource *res, bool online)
> */
> memblock_add_node(start, size, nid);
>
> - new_node = !node_online(nid);
> - if (new_node) {
> - pgdat = hotadd_new_pgdat(nid, start);
> - ret = -ENOMEM;
> - if (!pgdat)
> - goto error;
> - }
> + ret = __try_online_node (nid, start, false);
> + new_node = !!(ret > 0);

I don't think __try_online_node() will ever return a value greater than
zero. I assume what was meant was

new_node = !!(ret >= 0);

which may as well be

new_node = (ret >= 0);

since both sides have bool type.

The fact that testing didn't detect this is worrisome....

> + if (ret < 0)
> + goto error;
> +
>
> /* call arch's memory hotadd */
> ret = arch_add_memory(nid, start, size, NULL, true);
> -
> if (ret < 0)
> goto error;
>
>
> ...
>

2018-06-21 01:43:07

by Pavel Tatashin

[permalink] [raw]
Subject: Re: [PATCH 1/4] mm/memory_hotplug: Make add_memory_resource use __try_online_node

> I don't think __try_online_node() will ever return a value greater than
> zero. I assume what was meant was

Hi Andrew and Oscar,

Actually, the new __try_online_node() returns:
1 -> a new node was allocated
0 -> node is already online
-error -> an error encountered.

The function simply missing the return comment at the beginning.

Oscar, please check it via ./scripts/checkpatch.pl

Add comment explaining the return values.

And change:
ret = __try_online_node (nid, start, false);
new_node = !!(ret > 0);
if (ret < 0)
goto error;
To:
ret = __try_online_node (nid, start, false);
if (ret < 0)
goto error;
new_node = ret;

Other than that the patch looks good to me, it simplifies the code.
So, if the above is addressed:

Reviewed-by: Pavel Tatashin <[email protected]>

Thank you,
Pavel

>
> new_node = !!(ret >= 0);
>
> which may as well be
>
> new_node = (ret >= 0);
>
> since both sides have bool type.
>
> The fact that testing didn't detect this is worrisome....
>
> > + if (ret < 0)
> > + goto error;
> > +
> >
> > /* call arch's memory hotadd */
> > ret = arch_add_memory(nid, start, size, NULL, true);
> > -
> > if (ret < 0)
> > goto error;
> >
> >
> > ...
> >
>

2018-06-21 02:04:46

by Pavel Tatashin

[permalink] [raw]
Subject: Re: [PATCH 2/4] mm/memory_hotplug: Call register_mem_sect_under_node

On Fri, Jun 1, 2018 at 8:54 AM <[email protected]> wrote:
>
> From: Oscar Salvador <[email protected]>
>
> When hotpluging memory, it is possible that two calls are being made
> to register_mem_sect_under_node().
> One comes from __add_section()->hotplug_memory_register()
> and the other from add_memory_resource()->link_mem_sections() if
> we had to register a new node.
>
> In case we had to register a new node, hotplug_memory_register()
> will only handle/allocate the memory_block's since
> register_mem_sect_under_node() will return right away because the
> node it is not online yet.

Indeed.

>
> I think it is better if we leave hotplug_memory_register() to
> handle/allocate only memory_block's and make link_mem_sections()
> to call register_mem_sect_under_node().

Agree, this makes the code simpler.

Please remove:
> +register_fail:
> + /*
> + * If sysfs file of new node can't create, cpu on the node
> + * can't be hot-added. There is no rollback way now.
> + * So, check by BUG_ON() to catch it reluctantly..
> + */
> + BUG_ON(ret);

Merge the above comment with:
> + /* we online node here. we can't roll back from here. */

And replace all:
> + if (ret)
> + goto register_fail;

With:
BUG_ON(ret);

With the above addressed:

Reviewed-by: Pavel Tatashin <[email protected]>

2018-06-21 02:36:53

by Pavel Tatashin

[permalink] [raw]
Subject: Re: [PATCH 3/4] mm/memory_hotplug: Get rid of link_mem_sections

On Fri, Jun 1, 2018 at 8:54 AM <[email protected]> wrote:
>
> From: Oscar Salvador <[email protected]>
>
> link_mem_sections() and walk_memory_range() share most of the code,
> so we can use walk_memory_range() with a callback to register_mem_sect_under_node()
> instead of using link_mem_sections().

Yes, their logic is indeed identical, so it is good to replace some
code with walk_memory_range().

>
> To control whether the node id must be check, two new functions has been added:
>
> register_mem_sect_under_node_nocheck_node()
> and
> register_mem_sect_under_node_check_node()

I do not like this, please see if my suggestion is better:

1. Revert all the changes outside of link_mem_sections()
2. Remove check_nid argument from register_mem_sect_under_node
and link_mem_sections.
3. In register_mem_sect_under_node
Replace:

if (check_nid) {
}

With:
if (system_state == SYSTEM_BOOTING) {
}

4. Change register_mem_sect_under_node() prototype to match callback
of walk_memory_range()
5. Call walk_memory_range(... register_mem_sect_under_node ...) from
link_mem_sections

Thank you,
Pavel

2018-06-21 02:39:34

by Pavel Tatashin

[permalink] [raw]
Subject: Re: [PATCH 4/4] mm/memory_hotplug: Drop unnecessary checks from register_mem_sect_under_node

On Fri, Jun 1, 2018 at 8:54 AM <[email protected]> wrote:
>
> From: Oscar Salvador <[email protected]>
>
> Callers of register_mem_sect_under_node() are always passing a valid
> memory_block (not NULL), so we can safely drop the check for NULL.
>
> In the same way, register_mem_sect_under_node() is only called in case
> the node is online, so we can safely remove that check as well.
>
> Signed-off-by: Oscar Salvador <[email protected]>

Reviewed-by: Pavel Tatashin <[email protected]>

> ---
> drivers/base/node.c | 5 -----
> 1 file changed, 5 deletions(-)
>
> diff --git a/drivers/base/node.c b/drivers/base/node.c
> index 248c712e8de5..681be04351bc 100644
> --- a/drivers/base/node.c
> +++ b/drivers/base/node.c
> @@ -415,12 +415,7 @@ int register_mem_sect_under_node(struct memory_block *mem_blk, int nid,
> int ret;
> unsigned long pfn, sect_start_pfn, sect_end_pfn;
>
> - if (!mem_blk)
> - return -EFAULT;
> -
> mem_blk->nid = nid;
> - if (!node_online(nid))
> - return 0;
>
> sect_start_pfn = section_nr_to_pfn(mem_blk->start_section_nr);
> sect_end_pfn = section_nr_to_pfn(mem_blk->end_section_nr);
> --
> 2.13.6
>

2018-06-21 07:34:32

by Oscar Salvador

[permalink] [raw]
Subject: Re: [PATCH 1/4] mm/memory_hotplug: Make add_memory_resource use __try_online_node

On Wed, Jun 20, 2018 at 09:41:35PM -0400, Pavel Tatashin wrote:
> > I don't think __try_online_node() will ever return a value greater than
> > zero. I assume what was meant was
>
> Hi Andrew and Oscar,
>
> Actually, the new __try_online_node() returns:
> 1 -> a new node was allocated
> 0 -> node is already online
> -error -> an error encountered.
>
> The function simply missing the return comment at the beginning.
>
> Oscar, please check it via ./scripts/checkpatch.pl
>
> Add comment explaining the return values.
>
> And change:
> ret = __try_online_node (nid, start, false);
> new_node = !!(ret > 0);
> if (ret < 0)
> goto error;
> To:
> ret = __try_online_node (nid, start, false);
> if (ret < 0)
> goto error;
> new_node = ret;
>
> Other than that the patch looks good to me, it simplifies the code.
> So, if the above is addressed:
>
> Reviewed-by: Pavel Tatashin <[email protected]>

Hi Pavel,

I'll do so, thanks!

>
> Thank you,
> Pavel
>
> >
> > new_node = !!(ret >= 0);
> >
> > which may as well be
> >
> > new_node = (ret >= 0);
> >
> > since both sides have bool type.
> >
> > The fact that testing didn't detect this is worrisome....
> >
> > > + if (ret < 0)
> > > + goto error;
> > > +
> > >
> > > /* call arch's memory hotadd */
> > > ret = arch_add_memory(nid, start, size, NULL, true);
> > > -
> > > if (ret < 0)
> > > goto error;
> > >
> > >
> > > ...
> > >
> >
>

--
Oscar Salvador
SUSE L3

2018-06-21 08:34:44

by Michal Hocko

[permalink] [raw]
Subject: Re: [PATCH 0/4] Small cleanup for memoryhotplug

[Cc Reza Arbab - I remember he was able to hit some bugs in memblock
registration code when I was reworking that area previously]

On Fri 01-06-18 14:53:17, [email protected] wrote:
> From: Oscar Salvador <[email protected]>
>
>
> Hi,
>
> I wanted to give it a try and do a small cleanup in the memhotplug's code.
> A lot more could be done, but I wanted to start somewhere.
> I tried to unify/remove duplicated code.
>
> The following is what this patchset does:
>
> 1) add_memory_resource() has code to allocate a node in case it was offline.
> Since try_online_node has some code for that as well, I just made add_memory_resource() to
> use that so we can remove duplicated code..
> This is better explained in patch 1/4.
>
> 2) register_mem_sect_under_node() will be called only from link_mem_sections()
>
> 3) Get rid of link_mem_sections() in favour of walk_memory_range() with a callback to
> register_mem_sect_under_node()
>
> 4) Drop unnecessary checks from register_mem_sect_under_node()
>
>
> I have done some tests and I could not see anything broken because of
> this patchset.
>
> Oscar Salvador (4):
> mm/memory_hotplug: Make add_memory_resource use __try_online_node
> mm/memory_hotplug: Call register_mem_sect_under_node
> mm/memory_hotplug: Get rid of link_mem_sections
> mm/memory_hotplug: Drop unnecessary checks from
> register_mem_sect_under_node
>
> drivers/base/memory.c | 2 -
> drivers/base/node.c | 52 +++++---------------------
> include/linux/node.h | 21 +++++------
> mm/memory_hotplug.c | 101 ++++++++++++++++++++++++++------------------------
> 4 files changed, 71 insertions(+), 105 deletions(-)
>
> --
> 2.13.6
>

--
Michal Hocko
SUSE Labs

2018-06-25 17:06:03

by Jonathan Cameron

[permalink] [raw]
Subject: Re: [PATCH 3/4] mm/memory_hotplug: Get rid of link_mem_sections

On Fri, 1 Jun 2018 14:53:20 +0200
<[email protected]> wrote:

> From: Oscar Salvador <[email protected]>
>
> link_mem_sections() and walk_memory_range() share most of the code,
> so we can use walk_memory_range() with a callback to register_mem_sect_under_node()
> instead of using link_mem_sections().
>
> To control whether the node id must be check, two new functions has been added:
>
> register_mem_sect_under_node_nocheck_node()
> and
> register_mem_sect_under_node_check_node()
>
> They both call register_mem_sect_under_node_check() with
> the parameter check_nid set to true or false.
>
> Signed-off-by: Oscar Salvador <[email protected]>
> ---
> drivers/base/node.c | 47 ++++++++++-------------------------------------
> include/linux/node.h | 21 +++++++++------------
> mm/memory_hotplug.c | 8 ++++----
> 3 files changed, 23 insertions(+), 53 deletions(-)
>
> diff --git a/drivers/base/node.c b/drivers/base/node.c
> index a5e821d09656..248c712e8de5 100644
> --- a/drivers/base/node.c
> +++ b/drivers/base/node.c
> @@ -398,6 +398,16 @@ static int __ref get_nid_for_pfn(unsigned long pfn)
> return pfn_to_nid(pfn);
> }
>
> +int register_mem_sect_under_node_check_node(struct memory_block *mem_blk, void *nid)
> +{
> + return register_mem_sect_under_node (mem_blk, *(int *)nid, true);
> +}
> +
> +int register_mem_sect_under_node_nocheck_node(struct memory_block *mem_blk, void *nid)
> +{
> + return register_mem_sect_under_node (mem_blk, *(int *)nid, false);
> +}
> +
> /* register memory section under specified node if it spans that node */
> int register_mem_sect_under_node(struct memory_block *mem_blk, int nid,
> bool check_nid)
> @@ -490,43 +500,6 @@ int unregister_mem_sect_under_nodes(struct memory_block *mem_blk,
> return 0;
> }
>
> -int link_mem_sections(int nid, unsigned long start_pfn, unsigned long nr_pages,
> - bool check_nid)
> -{
> - unsigned long end_pfn = start_pfn + nr_pages;
> - unsigned long pfn;
> - struct memory_block *mem_blk = NULL;
> - int err = 0;
> -
> - for (pfn = start_pfn; pfn < end_pfn; pfn += PAGES_PER_SECTION) {
> - unsigned long section_nr = pfn_to_section_nr(pfn);
> - struct mem_section *mem_sect;
> - int ret;
> -
> - if (!present_section_nr(section_nr))
> - continue;
> - mem_sect = __nr_to_section(section_nr);
> -
> - /* same memblock ? */
> - if (mem_blk)
> - if ((section_nr >= mem_blk->start_section_nr) &&
> - (section_nr <= mem_blk->end_section_nr))
> - continue;
> -
> - mem_blk = find_memory_block_hinted(mem_sect, mem_blk);
> -
> - ret = register_mem_sect_under_node(mem_blk, nid, check_nid);
> - if (!err)
> - err = ret;
> -
> - /* discard ref obtained in find_memory_block() */
> - }
> -
> - if (mem_blk)
> - kobject_put(&mem_blk->dev.kobj);
> - return err;
> -}
> -
> #ifdef CONFIG_HUGETLBFS
> /*
> * Handle per node hstate attribute [un]registration on transistions
> diff --git a/include/linux/node.h b/include/linux/node.h
> index 6d336e38d155..1158bea9be52 100644
> --- a/include/linux/node.h
> +++ b/include/linux/node.h
> @@ -31,19 +31,11 @@ struct memory_block;
> extern struct node *node_devices[];
> typedef void (*node_registration_func_t)(struct node *);
>
> -#if defined(CONFIG_MEMORY_HOTPLUG_SPARSE) && defined(CONFIG_NUMA)
> -extern int link_mem_sections(int nid, unsigned long start_pfn,
> - unsigned long nr_pages, bool check_nid);
> -#else
> -static inline int link_mem_sections(int nid, unsigned long start_pfn,
> - unsigned long nr_pages, bool check_nid)
> -{
> - return 0;
> -}
> -#endif
> -
> extern void unregister_node(struct node *node);
> #ifdef CONFIG_NUMA
> +#if defined(CONFIG_MEMORY_HOTPLUG_SPARSE)
> +extern int register_mem_sect_under_node_check_node(struct memory_block *mem_blk, void *nid);
> +#endif
> /* Core of the node registration - only memory hotplug should use this */
> extern int __register_one_node(int nid);
>
> @@ -54,12 +46,17 @@ static inline int register_one_node(int nid)
>
> if (node_online(nid)) {
> struct pglist_data *pgdat = NODE_DATA(nid);
> + unsigned long start = pgdat->node_start_pfn;
> + unsigned long size = pgdat->node_spanned_pages;
>
> error = __register_one_node(nid);
> if (error)
> return error;
> /* link memory sections under this node */
> - error = link_mem_sections(nid, pgdat->node_start_pfn, pgdat->node_spanned_pages, true);
> +#if defined(CONFIG_MEMORY_HOTPLUG_SPARSE)
> + error = walk_memory_range(PFN_DOWN(start), PFN_UP(start + size - 1),
> + (void *)&nid, register_mem_sect_under_node_check_node);
> +#endif
Apologies, my previous testing was clearly garbage.

Looks like we take the node pfns then shift them again. Result on my system is we only get as far as pfn 22
which is still in the first memory block so rest of them are never successfully added.
Replacing with

error = walk_memory_range(start, start + size - 1, ...

works much better and lets me test Lorenzo's patch which is what I was really trying to do today.

Sorry again for the incorrect previous tested-by.

Thanks,

Jonathan


> }
>
> return error;
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index f84ef96175ab..ac21dc506b84 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -40,6 +40,8 @@
>
> #include "internal.h"
>
> +extern int register_mem_sect_under_node_nocheck_node(struct memory_block *mem_blk, void *nid);
> +
> /*
> * online_page_callback contains pointer to current page onlining function.
> * Initially it is generic_online_page(). If it is required it could be
> @@ -1118,7 +1120,6 @@ int __ref add_memory_resource(int nid, struct resource *res, bool online)
> u64 start, size;
> bool new_node;
> int ret;
> - unsigned long start_pfn, nr_pages;
>
> start = res->start;
> size = resource_size(res);
> @@ -1157,9 +1158,8 @@ int __ref add_memory_resource(int nid, struct resource *res, bool online)
> }
>
> /* link memory sections under this node.*/
> - start_pfn = start >> PAGE_SHIFT;
> - nr_pages = size >> PAGE_SHIFT;
> - ret = link_mem_sections(nid, start_pfn, nr_pages, false);
> + ret = walk_memory_range(PFN_DOWN(start), PFN_UP(start + size - 1),
> + (void *)&nid, register_mem_sect_under_node_nocheck_node);
> if (ret)
> goto register_fail;
>



2018-06-25 17:37:09

by Oscar Salvador

[permalink] [raw]
Subject: Re: [PATCH 3/4] mm/memory_hotplug: Get rid of link_mem_sections

On Mon, Jun 25, 2018 at 06:04:36PM +0100, Jonathan Cameron wrote:
> On Fri, 1 Jun 2018 14:53:20 +0200
> <[email protected]> wrote:
>
> > From: Oscar Salvador <[email protected]>
> >
> > link_mem_sections() and walk_memory_range() share most of the code,
> > so we can use walk_memory_range() with a callback to register_mem_sect_under_node()
> > instead of using link_mem_sections().
> >
> > To control whether the node id must be check, two new functions has been added:
> >
> > register_mem_sect_under_node_nocheck_node()
> > and
> > register_mem_sect_under_node_check_node()
> >
> > They both call register_mem_sect_under_node_check() with
> > the parameter check_nid set to true or false.
> >
> > Signed-off-by: Oscar Salvador <[email protected]>
> > ---
> > drivers/base/node.c | 47 ++++++++++-------------------------------------
> > include/linux/node.h | 21 +++++++++------------
> > mm/memory_hotplug.c | 8 ++++----
> > 3 files changed, 23 insertions(+), 53 deletions(-)
> >
> > diff --git a/drivers/base/node.c b/drivers/base/node.c
> > index a5e821d09656..248c712e8de5 100644
> > --- a/drivers/base/node.c
> > +++ b/drivers/base/node.c
> > @@ -398,6 +398,16 @@ static int __ref get_nid_for_pfn(unsigned long pfn)
> > return pfn_to_nid(pfn);
> > }
> >
> > +int register_mem_sect_under_node_check_node(struct memory_block *mem_blk, void *nid)
> > +{
> > + return register_mem_sect_under_node (mem_blk, *(int *)nid, true);
> > +}
> > +
> > +int register_mem_sect_under_node_nocheck_node(struct memory_block *mem_blk, void *nid)
> > +{
> > + return register_mem_sect_under_node (mem_blk, *(int *)nid, false);
> > +}
> > +
> > /* register memory section under specified node if it spans that node */
> > int register_mem_sect_under_node(struct memory_block *mem_blk, int nid,
> > bool check_nid)
> > @@ -490,43 +500,6 @@ int unregister_mem_sect_under_nodes(struct memory_block *mem_blk,
> > return 0;
> > }
> >
> > -int link_mem_sections(int nid, unsigned long start_pfn, unsigned long nr_pages,
> > - bool check_nid)
> > -{
> > - unsigned long end_pfn = start_pfn + nr_pages;
> > - unsigned long pfn;
> > - struct memory_block *mem_blk = NULL;
> > - int err = 0;
> > -
> > - for (pfn = start_pfn; pfn < end_pfn; pfn += PAGES_PER_SECTION) {
> > - unsigned long section_nr = pfn_to_section_nr(pfn);
> > - struct mem_section *mem_sect;
> > - int ret;
> > -
> > - if (!present_section_nr(section_nr))
> > - continue;
> > - mem_sect = __nr_to_section(section_nr);
> > -
> > - /* same memblock ? */
> > - if (mem_blk)
> > - if ((section_nr >= mem_blk->start_section_nr) &&
> > - (section_nr <= mem_blk->end_section_nr))
> > - continue;
> > -
> > - mem_blk = find_memory_block_hinted(mem_sect, mem_blk);
> > -
> > - ret = register_mem_sect_under_node(mem_blk, nid, check_nid);
> > - if (!err)
> > - err = ret;
> > -
> > - /* discard ref obtained in find_memory_block() */
> > - }
> > -
> > - if (mem_blk)
> > - kobject_put(&mem_blk->dev.kobj);
> > - return err;
> > -}
> > -
> > #ifdef CONFIG_HUGETLBFS
> > /*
> > * Handle per node hstate attribute [un]registration on transistions
> > diff --git a/include/linux/node.h b/include/linux/node.h
> > index 6d336e38d155..1158bea9be52 100644
> > --- a/include/linux/node.h
> > +++ b/include/linux/node.h
> > @@ -31,19 +31,11 @@ struct memory_block;
> > extern struct node *node_devices[];
> > typedef void (*node_registration_func_t)(struct node *);
> >
> > -#if defined(CONFIG_MEMORY_HOTPLUG_SPARSE) && defined(CONFIG_NUMA)
> > -extern int link_mem_sections(int nid, unsigned long start_pfn,
> > - unsigned long nr_pages, bool check_nid);
> > -#else
> > -static inline int link_mem_sections(int nid, unsigned long start_pfn,
> > - unsigned long nr_pages, bool check_nid)
> > -{
> > - return 0;
> > -}
> > -#endif
> > -
> > extern void unregister_node(struct node *node);
> > #ifdef CONFIG_NUMA
> > +#if defined(CONFIG_MEMORY_HOTPLUG_SPARSE)
> > +extern int register_mem_sect_under_node_check_node(struct memory_block *mem_blk, void *nid);
> > +#endif
> > /* Core of the node registration - only memory hotplug should use this */
> > extern int __register_one_node(int nid);
> >
> > @@ -54,12 +46,17 @@ static inline int register_one_node(int nid)
> >
> > if (node_online(nid)) {
> > struct pglist_data *pgdat = NODE_DATA(nid);
> > + unsigned long start = pgdat->node_start_pfn;
> > + unsigned long size = pgdat->node_spanned_pages;
> >
> > error = __register_one_node(nid);
> > if (error)
> > return error;
> > /* link memory sections under this node */
> > - error = link_mem_sections(nid, pgdat->node_start_pfn, pgdat->node_spanned_pages, true);
> > +#if defined(CONFIG_MEMORY_HOTPLUG_SPARSE)
> > + error = walk_memory_range(PFN_DOWN(start), PFN_UP(start + size - 1),
> > + (void *)&nid, register_mem_sect_under_node_check_node);
> > +#endif
> Apologies, my previous testing was clearly garbage.
>
> Looks like we take the node pfns then shift them again. Result on my system is we only get as far as pfn 22
> which is still in the first memory block so rest of them are never successfully added.
> Replacing with
>
> error = walk_memory_range(start, start + size - 1, ...
>
> works much better and lets me test Lorenzo's patch which is what I was really trying to do today.
>
> Sorry again for the incorrect previous tested-by.

Hi Jonathan,

this was fixed in v2.
Please, see: <[email protected]>

Thanks

Oscar Salvador

>
> Thanks,
>
> Jonathan
>
>
> > }
> >
> > return error;
> > diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> > index f84ef96175ab..ac21dc506b84 100644
> > --- a/mm/memory_hotplug.c
> > +++ b/mm/memory_hotplug.c
> > @@ -40,6 +40,8 @@
> >
> > #include "internal.h"
> >
> > +extern int register_mem_sect_under_node_nocheck_node(struct memory_block *mem_blk, void *nid);
> > +
> > /*
> > * online_page_callback contains pointer to current page onlining function.
> > * Initially it is generic_online_page(). If it is required it could be
> > @@ -1118,7 +1120,6 @@ int __ref add_memory_resource(int nid, struct resource *res, bool online)
> > u64 start, size;
> > bool new_node;
> > int ret;
> > - unsigned long start_pfn, nr_pages;
> >
> > start = res->start;
> > size = resource_size(res);
> > @@ -1157,9 +1158,8 @@ int __ref add_memory_resource(int nid, struct resource *res, bool online)
> > }
> >
> > /* link memory sections under this node.*/
> > - start_pfn = start >> PAGE_SHIFT;
> > - nr_pages = size >> PAGE_SHIFT;
> > - ret = link_mem_sections(nid, start_pfn, nr_pages, false);
> > + ret = walk_memory_range(PFN_DOWN(start), PFN_UP(start + size - 1),
> > + (void *)&nid, register_mem_sect_under_node_nocheck_node);
> > if (ret)
> > goto register_fail;
> >
>
>

--
Oscar Salvador
SUSE L3