2018-08-22 09:36:14

by Oscar Salvador

[permalink] [raw]
Subject: [RFC PATCH 0/5] Clean up node_states_check_changes_online/offline

From: Oscar Salvador <[email protected]>

This patchset clean ups node_states_check_changes_online/offline
functions together with node_states_set/clear_node functions.

The main reason behind this patchset is that currently, these
functions are suboptimal and confusing.

For example, they contain wrong statements like:

if (N_MEMORY == N_NORMAL_MEMORY)
if (N_MEMORY =! N_NORMAL_MEMORY)
if (N_MEMORY != N_HIGH_MEMORY)
if (N_MEMORY == N_HIGH_MEMORY)

At least, I could not find anywhere where N_NORMAL_MEMORY gets
assigned to N_MEMORY, or the other way around.
Neither for the N_HIGH_MEMORY case.

My rough guess is that all that was meant to compare
N_NORMAL_MEMORY to N_HIGH_MEMORY, to see if we were on
CONFIG_HIGHMEM systems.

This went unnoticed because the if statements never got triggered,
so they were always silent.
For instance, let us take a look at node_states_clear_node

...
if ((N_MEMORY != N_NORMAL_MEMORY) &&
(arg->status_change_nid_high >= 0))
node_clear_state(node, N_HIGH_MEMORY);

if ((N_MEMORY != N_HIGH_MEMORY) &&
(arg->status_change_nid >= 0))
node_clear_state(node, N_MEMORY);
...

Since N_MEMORY will never be equal to neither N_HIGH_MEMORY nor
N_NORMAL_MEMORY, this justs proceeds normally.

Another case is node_states_check_changes_offline:

...
zone_last = ZONE_HIGHMEM;
if (N_MEMORY == N_HIGH_MEMORY)
zone_last = ZONE_MOVABLE;
...

Since N_MEMORY will never be equal to N_HIGH_MEMORY, zone_last will
never be set to ZONE_MOVABLE.
But this is fine as the code works without that.

After I found all this, I tried to re-write the code in a more
understandable way, and I got rid of these confusing parts
on the way.

Another reason for this patchset is that there are some functions that are
called unconditionally when they should only be called under certain
conditions.

That is the case for:

- node_states_set_node()->node_set_state(node, N_MEMORY)

* node_states_set_node() gets called whenever we online pages,
so we end up calling node_set_state(node, N_MEMORY) everytime.
To avoid this, we should check if the node is already in node_state[N_MEMORY].

- node_states_set_node()->node_set_state(node, N_HIGH_MEMORY)

* On !CONFIG_HIGH_MEMORY, N_HIGH_MEMORY == N_NORMAL_MEMORY,
but the current code sets:
status_change_nid_high = status_change_nid_normal
This means that we will call node_set_state(node, N_NORMAL_MEMORY) twice.
The fix here is to set status_change_nid_normal = -1 on such systems,
so we skip the second call.


I tried it out on x86_64 so far and everything worked.
But I would like to get feedback on this since I could be
missing something.

Oscar Salvador (5):
mm/memory_hotplug: Spare unnecessary calls to node_set_state
mm/memory_hotplug: Avoid node_set/clear_state(N_HIGH_MEMORY) when
!CONFIG_HIGHMEM
mm/memory_hotplug: Simplify node_states_check_changes_online
mm/memory_hotplug: Tidy up node_states_clear_node
mm/memory_hotplug: Simplify node_states_check_changes_offline

mm/memory_hotplug.c | 146 +++++++++++++++++++++-------------------------------
1 file changed, 60 insertions(+), 86 deletions(-)

--
2.13.6



2018-08-22 09:33:59

by Oscar Salvador

[permalink] [raw]
Subject: [RFC PATCH 5/5] mm/memory_hotplug: Simplify node_states_check_changes_offline

From: Oscar Salvador <[email protected]>

This patch tries to simplify node_states_check_changes_offline
and make the code more understandable by:

- Removing the if (N_MEMORY == N_NORMAL_MEMORY) wrong statement
- Removing the if (N_MEMORY == N_HIGH_MEMORY) wrong statement
- Re-structure the code a bit
- Removing confusing comments

Signed-off-by: Oscar Salvador <[email protected]>
---
mm/memory_hotplug.c | 81 ++++++++++++++++++++++-------------------------------
1 file changed, 33 insertions(+), 48 deletions(-)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 006a7b817724..b45bc681e6db 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1487,51 +1487,40 @@ static void node_states_check_changes_offline(unsigned long nr_pages,
enum zone_type zt, zone_last = ZONE_NORMAL;

/*
- * If we have HIGHMEM or movable node, node_states[N_NORMAL_MEMORY]
- * contains nodes which have zones of 0...ZONE_NORMAL,
- * set zone_last to ZONE_NORMAL.
- *
- * If we don't have HIGHMEM nor movable node,
- * node_states[N_NORMAL_MEMORY] contains nodes which have zones of
- * 0...ZONE_MOVABLE, set zone_last to ZONE_MOVABLE.
+ * If the current zone is whithin (0..ZONE_NORMAL],
+ * check if the amount of pages that are going to be
+ * offlined is above or equal to the sum of the present
+ * pages of these zones.
+ * If that happens, we need to take this node out of
+ * node_state[N_NORMAL_MEMORY]
*/
- if (N_MEMORY == N_NORMAL_MEMORY)
- zone_last = ZONE_MOVABLE;
+ if (zone_idx(zone) <= zone_last) {
+ for (zt = 0; zt <= zone_last; zt++)
+ present_pages += pgdat->node_zones[zt].present_pages;

- /*
- * check whether node_states[N_NORMAL_MEMORY] will be changed.
- * If the memory to be offline is in a zone of 0...zone_last,
- * and it is the last present memory, 0...zone_last will
- * become empty after offline , thus we can determind we will
- * need to clear the node from node_states[N_NORMAL_MEMORY].
- */
- for (zt = 0; zt <= zone_last; zt++)
- present_pages += pgdat->node_zones[zt].present_pages;
- if (zone_idx(zone) <= zone_last && nr_pages >= present_pages)
- arg->status_change_nid_normal = zone_to_nid(zone);
- else
- arg->status_change_nid_normal = -1;
+ if (nr_pages >= present_pages)
+ arg->status_change_nid_normal = zone_to_nid(zone);
+ else
+ arg->status_change_nid_normal = -1;
+ }

#ifdef CONFIG_HIGHMEM
/*
- * If we have movable node, node_states[N_HIGH_MEMORY]
- * contains nodes which have zones of 0...ZONE_HIGHMEM,
- * set zone_last to ZONE_HIGHMEM.
- *
- * If we don't have movable node, node_states[N_NORMAL_MEMORY]
- * contains nodes which have zones of 0...ZONE_MOVABLE,
- * set zone_last to ZONE_MOVABLE.
+ * If the current zone is whithin (0..ZONE_HIGHMEM], check if
+ * the amount of pages that are going to be offlined is above
+ * or equal to the sum of the present pages of these zones.
+ * If that happens, we need to take this node out of
+ * node_state[N_HIGH_MEMORY]
*/
- zone_last = ZONE_HIGHMEM;
- if (N_MEMORY == N_HIGH_MEMORY)
- zone_last = ZONE_MOVABLE;
-
- for (; zt <= zone_last; zt++)
+ if (zone_idx(zone) <= ZONE_HIGHMEM) {
+ zt = ZONE_HIGHMEM;
present_pages += pgdat->node_zones[zt].present_pages;
- if (zone_idx(zone) <= zone_last && nr_pages >= present_pages)
- arg->status_change_nid_high = zone_to_nid(zone);
- else
- arg->status_change_nid_high = -1;
+
+ if (nr_pages >= present_pages)
+ arg->status_change_nid_high = zone_to_nid(zone);
+ else
+ arg->status_change_nid_high = -1;
+ }
#else
/*
* When !CONFIG_HIGHMEM, N_HIGH_MEMORY equals N_NORMAL_MEMORY
@@ -1541,18 +1530,14 @@ static void node_states_check_changes_offline(unsigned long nr_pages,
#endif

/*
- * node_states[N_HIGH_MEMORY] contains nodes which have 0...ZONE_MOVABLE
+ * Count pages from ZONE_MOVABLE as well.
+ * If the amount of pages that are going to be offlined is above
+ * or equal the sum of the present pages of all zones, we need
+ * to remove this node from node_state[N_MEMORY]
*/
- zone_last = ZONE_MOVABLE;
+ zt = ZONE_MOVABLE;
+ present_pages += pgdat->node_zones[zt].present_pages;

- /*
- * check whether node_states[N_HIGH_MEMORY] will be changed
- * If we try to offline the last present @nr_pages from the node,
- * we can determind we will need to clear the node from
- * node_states[N_HIGH_MEMORY].
- */
- for (; zt <= zone_last; zt++)
- present_pages += pgdat->node_zones[zt].present_pages;
if (nr_pages >= present_pages)
arg->status_change_nid = zone_to_nid(zone);
else
--
2.13.6


2018-08-22 09:34:06

by Oscar Salvador

[permalink] [raw]
Subject: [RFC PATCH 4/5] mm/memory_hotplug: Tidy up node_states_clear_node

From: Oscar Salvador <[email protected]>

node_states_clear has the following if statements:

if ((N_MEMORY != N_NORMAL_MEMORY) &&
(arg->status_change_nid_high >= 0))
...

if ((N_MEMORY != N_HIGH_MEMORY) &&
(arg->status_change_nid >= 0))
...

N_MEMORY can never be equal to neither N_NORMAL_MEMORY nor
N_HIGH_MEMORY.
This is wrong, so let us get rid of it.

Signed-off-by: Oscar Salvador <[email protected]>
---
mm/memory_hotplug.c | 6 ++----
1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 0f2cf6941224..006a7b817724 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1564,12 +1564,10 @@ static void node_states_clear_node(int node, struct memory_notify *arg)
if (arg->status_change_nid_normal >= 0)
node_clear_state(node, N_NORMAL_MEMORY);

- if ((N_MEMORY != N_NORMAL_MEMORY) &&
- (arg->status_change_nid_high >= 0))
+ if (arg->status_change_nid_high >= 0)
node_clear_state(node, N_HIGH_MEMORY);

- if ((N_MEMORY != N_HIGH_MEMORY) &&
- (arg->status_change_nid >= 0))
+ if (arg->status_change_nid >= 0)
node_clear_state(node, N_MEMORY);
}

--
2.13.6


2018-08-22 09:34:13

by Oscar Salvador

[permalink] [raw]
Subject: [RFC PATCH 3/5] mm/memory_hotplug: Simplify node_states_check_changes_online

From: Oscar Salvador <[email protected]>

While looking at node_states_check_changes_online, I saw some
confusing things I am not sure how it was supposed to work.

Right after entering the function, we find this:

if (N_MEMORY == N_NORMAL_MEMORY)
zone_last = ZONE_MOVABLE;

This, unless I am missing something really obvious, is wrong.
N_MEMORY cannot really be equal to N_NORMAL_MEMORY.
My guess is that this wanted to be something like:

if (N_NORMAL_MEMORY == N_HIGH_MEMORY)

to check if we have CONFIG_HIGHMEM.

Later on, in the CONFIG_HICHMEM block, we have:

if (N_MEMORY == N_HIGH_MEMORY)
zone_last = ZONE_MOVABLE;

This is also wrong, and will never be evaluated to true.

The thing is that besides this, the function can be simplified a bit.

- If the zone is whithin (0..ZONE_NORMAL], we need to set the node
for node_state[N_NORMAL_MEMORY]
- If we have CONFIG_HIGHMEM, and the zone is within (0..ZONE_NORMAL],
we need to set the node for node_state[N_HIGH_MEMORY], as
N_HIGH_MEMORY stands for regular or high memory.
- Finally, we set the node for node_states[N_MEMORY].
ZONE_MOVABLE ends up there.

Signed-off-by: Oscar Salvador <[email protected]>
---
mm/memory_hotplug.c | 44 +++++++++++++-------------------------------
1 file changed, 13 insertions(+), 31 deletions(-)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 1cfd0b5a9cc7..0f2cf6941224 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -680,46 +680,28 @@ static void node_states_check_changes_online(unsigned long nr_pages,
struct zone *zone, struct memory_notify *arg)
{
int nid = zone_to_nid(zone);
- enum zone_type zone_last = ZONE_NORMAL;

/*
- * If we have HIGHMEM or movable node, node_states[N_NORMAL_MEMORY]
- * contains nodes which have zones of 0...ZONE_NORMAL,
- * set zone_last to ZONE_NORMAL.
- *
- * If we don't have HIGHMEM nor movable node,
- * node_states[N_NORMAL_MEMORY] contains nodes which have zones of
- * 0...ZONE_MOVABLE, set zone_last to ZONE_MOVABLE.
- */
- if (N_MEMORY == N_NORMAL_MEMORY)
- zone_last = ZONE_MOVABLE;
-
- /*
- * if the memory to be online is in a zone of 0...zone_last, and
- * the zones of 0...zone_last don't have memory before online, we will
- * need to set the node to node_states[N_NORMAL_MEMORY] after
- * the memory is online.
+ * node_states[N_NORMAL_MEMORY] contains nodes which have
+ * zones from (0..ZONE_NORMAL]
+ * We can start checking if the current zone is in that range
+ * and if so, if the node needs to be set to node_states[N_NORMAL_MEMORY]
+ * after memory is online.
*/
- if (zone_idx(zone) <= zone_last && !node_state(nid, N_NORMAL_MEMORY))
+ if (zone_idx(zone) <= ZONE_NORMAL && !node_state(nid, N_NORMAL_MEMORY))
arg->status_change_nid_normal = nid;
else
arg->status_change_nid_normal = -1;

#ifdef CONFIG_HIGHMEM
/*
- * If we have movable node, node_states[N_HIGH_MEMORY]
- * contains nodes which have zones of 0...ZONE_HIGHMEM,
- * set zone_last to ZONE_HIGHMEM.
- *
- * If we don't have movable node, node_states[N_NORMAL_MEMORY]
- * contains nodes which have zones of 0...ZONE_MOVABLE,
- * set zone_last to ZONE_MOVABLE.
+ * The current zone cannot be ZONE_HIGHMEM, as zone_for_pfn_range
+ * can only return (0..ZONE_NORMAL] or ZONE_MOVABLE.
+ * N_HIGH_MEMORY stands for regular or high memory, so if the zone
+ * is within the range (0..ZONE_NORMAL], we have to set the node
+ * for N_HIGH_MEMORY as well.
*/
- zone_last = ZONE_HIGHMEM;
- if (N_MEMORY == N_HIGH_MEMORY)
- zone_last = ZONE_MOVABLE;
-
- if (zone_idx(zone) <= zone_last && !node_state(nid, N_HIGH_MEMORY))
+ if (zone_idx(zone) < ZONE_HIGHMEM && !node_state(nid, N_HIGH_MEMORY))
arg->status_change_nid_high = nid;
else
arg->status_change_nid_high = -1;
@@ -732,7 +714,7 @@ static void node_states_check_changes_online(unsigned long nr_pages,
#endif

/*
- * if the node don't have memory befor online, we will need to
+ * if the node don't have memory before online, we will need to
* set the node to node_states[N_MEMORY] after the memory
* is online.
*/
--
2.13.6


2018-08-22 09:34:21

by Oscar Salvador

[permalink] [raw]
Subject: [RFC PATCH 2/5] mm/memory_hotplug: Avoid node_set/clear_state(N_HIGH_MEMORY) when !CONFIG_HIGHMEM

From: Oscar Salvador <[email protected]>

Currently, when !CONFIG_HIGHMEM, status_change_nid_high is being set
to status_change_nid_normal, but on such systems, N_HIGH_MEMORY equals
N_NORMAL_MEMORY.
That means that if status_change_nid_normal is not -1,
we will perform two calls to node_set_state for the same memory type.

Set status_change_nid_high to -1 for !CONFIG_HIGHMEM, so we skip the
double call in node_states_set_node.

The same goes for node_clear_state.

Signed-off-by: Oscar Salvador <[email protected]>
---
mm/memory_hotplug.c | 12 ++++++++++--
1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 4a89915e1467..1cfd0b5a9cc7 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -724,7 +724,11 @@ static void node_states_check_changes_online(unsigned long nr_pages,
else
arg->status_change_nid_high = -1;
#else
- arg->status_change_nid_high = arg->status_change_nid_normal;
+ /*
+ * When !CONFIG_HIGHMEM, N_HIGH_MEMORY equals N_NORMAL_MEMORY
+ * so setting the node for N_NORMAL_MEMORY is enough.
+ */
+ arg->status_change_nid_high = -1;
#endif

/*
@@ -1547,7 +1551,11 @@ static void node_states_check_changes_offline(unsigned long nr_pages,
else
arg->status_change_nid_high = -1;
#else
- arg->status_change_nid_high = arg->status_change_nid_normal;
+ /*
+ * When !CONFIG_HIGHMEM, N_HIGH_MEMORY equals N_NORMAL_MEMORY
+ * so clearing the node for N_NORMAL_MEMORY is enough.
+ */
+ arg->status_change_nid_high = -1;
#endif

/*
--
2.13.6


2018-08-22 09:35:26

by Oscar Salvador

[permalink] [raw]
Subject: [RFC PATCH 1/5] mm/memory_hotplug: Spare unnecessary calls to node_set_state

From: Oscar Salvador <[email protected]>

In node_states_check_changes_online, we check if the node will
have to be set for any of the N_*_MEMORY states after the pages
have been onlined.

Later on, we perform the activation in node_states_set_node.
Currently, in node_states_set_node we set the node to N_MEMORY
unconditionally.
This means that we will call node_set_state for N_MEMORY every time
pages go online, but we only need to do it if the node has not yet been
set for N_MEMORY.

Signed-off-by: Oscar Salvador <[email protected]>
---
mm/memory_hotplug.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 60b67f09956e..4a89915e1467 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -746,7 +746,8 @@ static void node_states_set_node(int node, struct memory_notify *arg)
if (arg->status_change_nid_high >= 0)
node_set_state(node, N_HIGH_MEMORY);

- node_set_state(node, N_MEMORY);
+ if (arg->status_change_nid >= 0)
+ node_set_state(node, N_MEMORY);
}

static void __meminit resize_zone_range(struct zone *zone, unsigned long start_pfn,
--
2.13.6


2018-08-23 15:58:22

by Oscar Salvador

[permalink] [raw]
Subject: Re: [RFC PATCH 5/5] mm/memory_hotplug: Simplify node_states_check_changes_offline

On Wed, Aug 22, 2018 at 11:32:26AM +0200, Oscar Salvador wrote:
> From: Oscar Salvador <[email protected]>
>
> This patch tries to simplify node_states_check_changes_offline
> and make the code more understandable by:
>
> - Removing the if (N_MEMORY == N_NORMAL_MEMORY) wrong statement
> - Removing the if (N_MEMORY == N_HIGH_MEMORY) wrong statement
> - Re-structure the code a bit
> - Removing confusing comments
>
> Signed-off-by: Oscar Salvador <[email protected]>

I realized I made a mistake here.
I was not counting the present pages correctly.
I will send a new version after the merge-windows gets closed.

Sorry for the noise

For the sake of clarity, the patch should have been:


---

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 006a7b817724..bca11da4e11f 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1487,23 +1487,12 @@ static void node_states_check_changes_offline(unsigned long nr_pages,
enum zone_type zt, zone_last = ZONE_NORMAL;

/*
- * If we have HIGHMEM or movable node, node_states[N_NORMAL_MEMORY]
- * contains nodes which have zones of 0...ZONE_NORMAL,
- * set zone_last to ZONE_NORMAL.
- *
- * If we don't have HIGHMEM nor movable node,
- * node_states[N_NORMAL_MEMORY] contains nodes which have zones of
- * 0...ZONE_MOVABLE, set zone_last to ZONE_MOVABLE.
- */
- if (N_MEMORY == N_NORMAL_MEMORY)
- zone_last = ZONE_MOVABLE;
-
- /*
- * check whether node_states[N_NORMAL_MEMORY] will be changed.
- * If the memory to be offline is in a zone of 0...zone_last,
- * and it is the last present memory, 0...zone_last will
- * become empty after offline , thus we can determind we will
- * need to clear the node from node_states[N_NORMAL_MEMORY].
+ * If the current zone is whithin (0..ZONE_NORMAL],
+ * check if the amount of pages that are going to be
+ * offlined is above or equal to the sum of the present
+ * pages of these zones.
+ * If that happens, we need to take this node out of
+ * node_state[N_NORMAL_MEMORY]
*/
for (zt = 0; zt <= zone_last; zt++)
present_pages += pgdat->node_zones[zt].present_pages;
@@ -1514,21 +1503,15 @@ static void node_states_check_changes_offline(unsigned long nr_pages,

#ifdef CONFIG_HIGHMEM
/*
- * If we have movable node, node_states[N_HIGH_MEMORY]
- * contains nodes which have zones of 0...ZONE_HIGHMEM,
- * set zone_last to ZONE_HIGHMEM.
- *
- * If we don't have movable node, node_states[N_NORMAL_MEMORY]
- * contains nodes which have zones of 0...ZONE_MOVABLE,
- * set zone_last to ZONE_MOVABLE.
+ * If the current zone is whithin (0..ZONE_HIGHMEM], check if
+ * the amount of pages that are going to be offlined is above
+ * or equal to the sum of the present pages of these zones.
+ * If that happens, we need to take this node out of
+ * node_state[N_HIGH_MEMORY]
*/
- zone_last = ZONE_HIGHMEM;
- if (N_MEMORY == N_HIGH_MEMORY)
- zone_last = ZONE_MOVABLE;
-
- for (; zt <= zone_last; zt++)
- present_pages += pgdat->node_zones[zt].present_pages;
- if (zone_idx(zone) <= zone_last && nr_pages >= present_pages)
+ zt = ZONE_HIGHMEM;
+ present_pages += pgdat->node_zones[zt].present_pages;
+ if (zone_idx(zone) <= zt && nr_pages >= present_pages)
arg->status_change_nid_high = zone_to_nid(zone);
else
arg->status_change_nid_high = -1;
@@ -1541,18 +1524,14 @@ static void node_states_check_changes_offline(unsigned long nr_pages,
#endif

/*
- * node_states[N_HIGH_MEMORY] contains nodes which have 0...ZONE_MOVABLE
+ * Count pages from ZONE_MOVABLE as well.
+ * If the amount of pages that are going to be offlined is above
+ * or equal the sum of the present pages of all zones, we need
+ * to remove this node from node_state[N_MEMORY]
*/
- zone_last = ZONE_MOVABLE;
+ zt = ZONE_MOVABLE;
+ present_pages += pgdat->node_zones[zt].present_pages;

- /*
- * check whether node_states[N_HIGH_MEMORY] will be changed
- * If we try to offline the last present @nr_pages from the node,
- * we can determind we will need to clear the node from
- * node_states[N_HIGH_MEMORY].
- */
- for (; zt <= zone_last; zt++)
- present_pages += pgdat->node_zones[zt].present_pages;
if (nr_pages >= present_pages)
arg->status_change_nid = zone_to_nid(zone);
else


--
Oscar Salvador
SUSE L3