2015-11-04 08:23:52

by Changsheng Liu

Subject: [PATCH V8] mm: memory hot-add: hot-added memory can not be added to movable zone by default

When CONFIG_MOVABLE_NODE is enabled and memory is hot-added,
should_add_memory_movable() returns 0 because all zones, including
ZONE_MOVABLE, are empty, so the hot-added memory is assigned to
ZONE_NORMAL. We then need a udev rule to online the memory automatically:
SUBSYSTEM=="memory", ACTION=="add", ATTR{state}=="offline",
ATTR{state}="online_movable"
A memory block onlined as movable by udev must be adjacent to ZONE_MOVABLE.
However, memory section events are delivered to udev asynchronously,
so there is no guarantee that the block udev onlines is adjacent to
ZONE_MOVABLE, and onlining the memory can fail.
We want the whole node to be added to ZONE_MOVABLE by default.

So change should_add_memory_movable(): if CONFIG_MOVABLE_NODE is enabled,
the kernel is booted with the movable_node option, and either ZONE_NORMAL
is empty or the start pfn of the hot-added memory is at or beyond the end
of ZONE_NORMAL, it always returns 1, so the whole node is added to
ZONE_MOVABLE by default.
To assign the node to ZONE_NORMAL instead, run:
"echo online_kernel > /sys/devices/system/memory/memoryXXX/state"

Signed-off-by: liuchangsheng <[email protected]>
Signed-off-by: Xiaofeng Yan <[email protected]>
Tested-by: Dongdong Fan <[email protected]>
Reviewed-by: <[email protected]>
Cc: Wang Nan <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Yinghai Lu <[email protected]>
Cc: Tang Chen <[email protected]>
Cc: Yasuaki Ishimatsu <[email protected]>
Cc: Toshi Kani <[email protected]>
Cc: Xishi Qiu <[email protected]>
---
mm/memory_hotplug.c | 7 +++++++
1 file changed, 7 insertions(+)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index aa992e2..8617b9f 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1201,6 +1201,9 @@ static int check_hotplug_memory_range(u64 start, u64 size)
/*
* If movable zone has already been setup, newly added memory should be check.
* If its address is higher than movable zone, it should be added as movable.
+ * And if the system boots with movable_node, CONFIG_MOVABLE_NODE is set and
+ * the added memory does not overlap the zone before ZONE_MOVABLE,
+ * the memory is added as movable.
* Without this check, movable zone may overlap with other zone.
*/
static int should_add_memory_movable(int nid, u64 start, u64 size)
@@ -1208,6 +1211,10 @@ static int should_add_memory_movable(int nid, u64 start, u64 size)
unsigned long start_pfn = start >> PAGE_SHIFT;
pg_data_t *pgdat = NODE_DATA(nid);
struct zone *movable_zone = pgdat->node_zones + ZONE_MOVABLE;
+ struct zone *pre_zone = pgdat->node_zones + (ZONE_MOVABLE - 1);
+
+ if (movable_node_is_enabled() && (zone_end_pfn(pre_zone) <= start_pfn))
+ return 1;

if (zone_is_empty(movable_zone))
return 0;
--
1.8.3.1
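
The decision made by the patched function can be sketched as a small
standalone model. This is only an illustration, not the kernel code:
struct zone_model and the *_model helpers are hypothetical stand-ins for
struct zone, zone_end_pfn() and zone_is_empty(), the movable_node flag
stands in for movable_node_is_enabled(), and the final start-pfn comparison
is an assumption based on the existing comment ("If its address is higher
than movable zone, it should be added as movable").

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the kernel's struct zone. */
struct zone_model {
	unsigned long start_pfn;     /* first page frame of the zone */
	unsigned long spanned_pages; /* 0 means the zone is empty */
};

static unsigned long zone_end_pfn_model(const struct zone_model *z)
{
	/* One past the last page frame; equals start_pfn for an empty zone. */
	return z->start_pfn + z->spanned_pages;
}

static bool zone_is_empty_model(const struct zone_model *z)
{
	return z->spanned_pages == 0;
}

/* Returns 1 if the hot-added range should go to the movable zone. */
static int should_add_movable_model(bool movable_node,
				    const struct zone_model *pre_zone,
				    const struct zone_model *movable_zone,
				    unsigned long start_pfn)
{
	/* Check added by the patch: range starts at or beyond pre_zone's end. */
	if (movable_node && zone_end_pfn_model(pre_zone) <= start_pfn)
		return 1;

	/* Existing behaviour: an empty movable zone means "not movable". */
	if (zone_is_empty_model(movable_zone))
		return 0;

	/* Assumed from the comment: ranges above the movable zone start are movable. */
	if (movable_zone->start_pfn <= start_pfn)
		return 1;

	return 0;
}
```

In this model, a freshly hot-added node (both zones empty) yields 1 when
movable_node is set and 0 otherwise, which is the default the changelog
describes.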


2015-11-04 10:21:07

by Xishi Qiu

Subject: Re: [PATCH V8] mm: memory hot-add: hot-added memory can not be added to movable zone by default

On 2015/11/4 16:23, liuchangsheng wrote:

> After the user config CONFIG_MOVABLE_NODE,
> When the memory is hot added, should_add_memory_movable() return 0
> because all zones including ZONE_MOVABLE are empty,
> so the memory that was hot added will be assigned to ZONE_NORMAL,
> and we need using the udev rules to online the memory automatically:
> SUBSYSTEM=="memory", ACTION=="add", ATTR{state}=="offline",
> ATTR{state}="online_movable"
> The memory block onlined by udev must be adjacent to ZONE_MOVABLE.
> The events of memory section are notified to udev asynchronously,

Hi Yasuaki,

If udev onlines memory in descending order, like 3->2->1->0, it will
succeed, but we notify udev in ascending order, like 0->1->2->3,
so the udev rules cannot online memory as movable, right?

> so it can not ensure that the memory block onlined by udev is
> adjacent to ZONE_MOVABLE.So it can't ensure memory online always success.
> But we want the whole node to be added to ZONE_MOVABLE by default.
>
> So we change should_add_memory_movable(): if the user config
> CONFIG_MOVABLE_NODE and movable_node kernel option
> and the ZONE_NORMAL is empty or the pfn of the hot-added memory
> is after the end of the ZONE_NORMAL it will always return 1
> and then the whole node will be added to ZONE_MOVABLE by default.
> If we want the node to be assigned to ZONE_NORMAL,
> we can do it as follows:
> "echo online_kernel > /sys/devices/system/memory/memoryXXX/state"
>

The order should be like 0->1->2->3, right? 3->2->1->0 will fail.

> Signed-off-by: liuchangsheng <[email protected]>
> Signed-off-by: Xiaofeng Yan <[email protected]>
> Tested-by: Dongdong Fan <[email protected]>
> Reviewed-by: <[email protected]>
> Cc: Wang Nan <[email protected]>
> Cc: Dave Hansen <[email protected]>
> Cc: Yinghai Lu <[email protected]>
> Cc: Tang Chen <[email protected]>
> Cc: Yasuaki Ishimatsu <[email protected]>
> Cc: Toshi Kani <[email protected]>
> Cc: Xishi Qiu <[email protected]>
> ---
> mm/memory_hotplug.c | 7 +++++++
> 1 file changed, 7 insertions(+)
>
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index aa992e2..8617b9f 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -1201,6 +1201,9 @@ static int check_hotplug_memory_range(u64 start, u64 size)
> /*
> * If movable zone has already been setup, newly added memory should be check.
> * If its address is higher than movable zone, it should be added as movable.
> + * And if system boots up with movable_node and config CONFIG_MOVABLE_NOD and
> + * added memory does not overlap the zone before MOVABLE_ZONE,
> + * the memory is added as movable.
> * Without this check, movable zone may overlap with other zone.
> */
> static int should_add_memory_movable(int nid, u64 start, u64 size)
> @@ -1208,6 +1211,10 @@ static int should_add_memory_movable(int nid, u64 start, u64 size)
> unsigned long start_pfn = start >> PAGE_SHIFT;
> pg_data_t *pgdat = NODE_DATA(nid);
> struct zone *movable_zone = pgdat->node_zones + ZONE_MOVABLE;
> + struct zone *pre_zone = pgdat->node_zones + (ZONE_MOVABLE - 1);
> +
> + if (movable_node_is_enabled() && (zone_end_pfn(pre_zone) <= start_pfn))
> + return 1;
>

Looks good to me.

How about adding a comment in mm/Kconfig?

Thanks,
Xishi Qiu

> if (zone_is_empty(movable_zone))
> return 0;


2015-11-04 16:13:02

by YASUAKI ISHIMATSU

Subject: Re: [PATCH V8] mm: memory hot-add: hot-added memory can not be added to movable zone by default


On Wed, 4 Nov 2015 18:20:14 +0800
Xishi Qiu <[email protected]> wrote:

> On 2015/11/4 16:23, liuchangsheng wrote:
>
> > After the user config CONFIG_MOVABLE_NODE,
> > When the memory is hot added, should_add_memory_movable() return 0
> > because all zones including ZONE_MOVABLE are empty,
> > so the memory that was hot added will be assigned to ZONE_NORMAL,
> > and we need using the udev rules to online the memory automatically:
> > SUBSYSTEM=="memory", ACTION=="add", ATTR{state}=="offline",
> > ATTR{state}="online_movable"
> > The memory block onlined by udev must be adjacent to ZONE_MOVABLE.
> > The events of memory section are notified to udev asynchronously,
>
> Hi Yasuaki,
>

> If udev onlines memory in descending order, like 3->2->1->0, it will
> success, but we notifiy to udev in ascending order, like 0->1->2->3,
> so the udev rules cannot online memory as movable, right?

right.

>
> > so it can not ensure that the memory block onlined by udev is
> > adjacent to ZONE_MOVABLE.So it can't ensure memory online always success.
> > But we want the whole node to be added to ZONE_MOVABLE by default.
> >
> > So we change should_add_memory_movable(): if the user config
> > CONFIG_MOVABLE_NODE and movable_node kernel option
> > and the ZONE_NORMAL is empty or the pfn of the hot-added memory
> > is after the end of the ZONE_NORMAL it will always return 1
> > and then the whole node will be added to ZONE_MOVABLE by default.
> > If we want the node to be assigned to ZONE_NORMAL,
> > we can do it as follows:
> > "echo online_kernel > /sys/devices/system/memory/memoryXXX/state"
> >
>

> The order should like 0->1->2->3, right? 3->2->1->0 will be failed.

right.

Thanks,
Yasuaki Ishimatsu

>
> > Signed-off-by: liuchangsheng <[email protected]>
> > Signed-off-by: Xiaofeng Yan <[email protected]>
> > Tested-by: Dongdong Fan <[email protected]>
> > Reviewed-by: <[email protected]>
> > Cc: Wang Nan <[email protected]>
> > Cc: Dave Hansen <[email protected]>
> > Cc: Yinghai Lu <[email protected]>
> > Cc: Tang Chen <[email protected]>
> > Cc: Yasuaki Ishimatsu <[email protected]>
> > Cc: Toshi Kani <[email protected]>
> > Cc: Xishi Qiu <[email protected]>
> > ---
> > mm/memory_hotplug.c | 7 +++++++
> > 1 file changed, 7 insertions(+)
> >
> > diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> > index aa992e2..8617b9f 100644
> > --- a/mm/memory_hotplug.c
> > +++ b/mm/memory_hotplug.c
> > @@ -1201,6 +1201,9 @@ static int check_hotplug_memory_range(u64 start, u64 size)
> > /*
> > * If movable zone has already been setup, newly added memory should be check.
> > * If its address is higher than movable zone, it should be added as movable.
> > + * And if system boots up with movable_node and config CONFIG_MOVABLE_NOD and
> > + * added memory does not overlap the zone before MOVABLE_ZONE,
> > + * the memory is added as movable.
> > * Without this check, movable zone may overlap with other zone.
> > */
> > static int should_add_memory_movable(int nid, u64 start, u64 size)
> > @@ -1208,6 +1211,10 @@ static int should_add_memory_movable(int nid, u64 start, u64 size)
> > unsigned long start_pfn = start >> PAGE_SHIFT;
> > pg_data_t *pgdat = NODE_DATA(nid);
> > struct zone *movable_zone = pgdat->node_zones + ZONE_MOVABLE;
> > + struct zone *pre_zone = pgdat->node_zones + (ZONE_MOVABLE - 1);
> > +
> > + if (movable_node_is_enabled() && (zone_end_pfn(pre_zone) <= start_pfn))
> > + return 1;
> >
>
> Looks good to me.
>
> How about add some comment in mm/Kconfig?
>
> Thanks,
> Xishi Qiu
>
> > if (zone_is_empty(movable_zone))
> > return 0;
>
>
>
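
The ordering constraint confirmed above can be illustrated with a toy
model. It assumes, as a simplification of what the thread describes, that
all hot-added blocks start out in ZONE_NORMAL and that an online_movable
request succeeds only for the block that currently sits at the end of
ZONE_NORMAL, i.e. the one adjacent to ZONE_MOVABLE. The function name and
the block numbering are hypothetical and not taken from the kernel.

```c
#include <stddef.h>

/*
 * Toy model: blocks 0..n-1 all start in ZONE_NORMAL, which spans
 * [0, normal_end). Requests arrive in the given order. Returns how
 * many blocks were successfully onlined as movable.
 */
static size_t online_as_movable(const size_t *order, size_t n)
{
	size_t normal_end = n;
	size_t onlined = 0;

	for (size_t i = 0; i < n; i++) {
		size_t block = order[i];

		/* Only the last block of ZONE_NORMAL borders ZONE_MOVABLE. */
		if (normal_end > 0 && block == normal_end - 1) {
			normal_end--; /* block moves to ZONE_MOVABLE */
			onlined++;
		}
		/* Any other block is rejected in this model. */
	}
	return onlined;
}
```

With four blocks, the descending order 3, 2, 1, 0 onlines all of them in
this model, while the ascending order 0, 1, 2, 3 onlines only block 3.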

2015-11-04 16:18:15

by YASUAKI ISHIMATSU

Subject: Re: [PATCH V8] mm: memory hot-add: hot-added memory can not be added to movable zone by default


On Wed, 4 Nov 2015 03:23:35 -0500
liuchangsheng <[email protected]> wrote:

> After the user config CONFIG_MOVABLE_NODE,
> When the memory is hot added, should_add_memory_movable() return 0
> because all zones including ZONE_MOVABLE are empty,
> so the memory that was hot added will be assigned to ZONE_NORMAL,
> and we need using the udev rules to online the memory automatically:
> SUBSYSTEM=="memory", ACTION=="add", ATTR{state}=="offline",
> ATTR{state}="online_movable"
> The memory block onlined by udev must be adjacent to ZONE_MOVABLE.
> The events of memory section are notified to udev asynchronously,
> so it can not ensure that the memory block onlined by udev is
> adjacent to ZONE_MOVABLE.So it can't ensure memory online always success.
> But we want the whole node to be added to ZONE_MOVABLE by default.
>
> So we change should_add_memory_movable(): if the user config
> CONFIG_MOVABLE_NODE and movable_node kernel option
> and the ZONE_NORMAL is empty or the pfn of the hot-added memory
> is after the end of the ZONE_NORMAL it will always return 1
> and then the whole node will be added to ZONE_MOVABLE by default.
> If we want the node to be assigned to ZONE_NORMAL,
> we can do it as follows:
> "echo online_kernel > /sys/devices/system/memory/memoryXXX/state"
>
> Signed-off-by: liuchangsheng <[email protected]>
> Signed-off-by: Xiaofeng Yan <[email protected]>
> Tested-by: Dongdong Fan <[email protected]>
> Reviewed-by: <[email protected]>
> Cc: Wang Nan <[email protected]>
> Cc: Dave Hansen <[email protected]>
> Cc: Yinghai Lu <[email protected]>
> Cc: Tang Chen <[email protected]>
> Cc: Yasuaki Ishimatsu <[email protected]>
> Cc: Toshi Kani <[email protected]>
> Cc: Xishi Qiu <[email protected]>
> ---

Looks good to me.

Reviewed-by: Yasuaki Ishimatsu <[email protected]>

Thanks,
Yasuaki Ishimatsu

> mm/memory_hotplug.c | 7 +++++++
> 1 file changed, 7 insertions(+)
>
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index aa992e2..8617b9f 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -1201,6 +1201,9 @@ static int check_hotplug_memory_range(u64 start, u64 size)
> /*
> * If movable zone has already been setup, newly added memory should be check.
> * If its address is higher than movable zone, it should be added as movable.
> + * And if system boots up with movable_node and config CONFIG_MOVABLE_NOD and
> + * added memory does not overlap the zone before MOVABLE_ZONE,
> + * the memory is added as movable.
> * Without this check, movable zone may overlap with other zone.
> */
> static int should_add_memory_movable(int nid, u64 start, u64 size)
> @@ -1208,6 +1211,10 @@ static int should_add_memory_movable(int nid, u64 start, u64 size)
> unsigned long start_pfn = start >> PAGE_SHIFT;
> pg_data_t *pgdat = NODE_DATA(nid);
> struct zone *movable_zone = pgdat->node_zones + ZONE_MOVABLE;
> + struct zone *pre_zone = pgdat->node_zones + (ZONE_MOVABLE - 1);
> +
> + if (movable_node_is_enabled() && (zone_end_pfn(pre_zone) <= start_pfn))
> + return 1;
>
> if (zone_is_empty(movable_zone))
> return 0;
> --
> 1.8.3.1
>

2015-11-04 16:33:10

by Dave Hansen

Subject: Re: [PATCH V8] mm: memory hot-add: hot-added memory can not be added to movable zone by default

On 11/04/2015 12:23 AM, liuchangsheng wrote:
> After the user config CONFIG_MOVABLE_NODE,
> When the memory is hot added, should_add_memory_movable() return 0
> because all zones including ZONE_MOVABLE are empty,
> so the memory that was hot added will be assigned to ZONE_NORMAL,
> and we need using the udev rules to online the memory automatically:
> SUBSYSTEM=="memory", ACTION=="add", ATTR{state}=="offline",
> ATTR{state}="online_movable"
> The memory block onlined by udev must be adjacent to ZONE_MOVABLE.
> The events of memory section are notified to udev asynchronously,
> so it can not ensure that the memory block onlined by udev is
> adjacent to ZONE_MOVABLE.So it can't ensure memory online always success.
> But we want the whole node to be added to ZONE_MOVABLE by default.

I'm still a bit confused about the whole scenario here.

Is the core problem:
1. We add memory in a new node and that node can not be made entirely
movable?
or
2. We add memory to an existing zone that has some non-movable memory
and we want the new memory to be movable?

> @@ -1201,6 +1201,9 @@ static int check_hotplug_memory_range(u64 start, u64 size)
> /*
> * If movable zone has already been setup, newly added memory should be check.
> * If its address is higher than movable zone, it should be added as movable.
> + * And if system boots up with movable_node and config CONFIG_MOVABLE_NOD and
> + * added memory does not overlap the zone before MOVABLE_ZONE,
> + * the memory is added as movable.
> * Without this check, movable zone may overlap with other zone.
> */

This comment is describing what the code does, but is rather sparse on
why. This scenario is pretty convoluted and I can barely make sense of
why it is doing this today while looking at the whole changelog, much
less in a few years when the original changelog will be harder to come by.

Also please put the comment next to the new if() statement. It's really
hard to match the comment to the code the way you have it now.

> static int should_add_memory_movable(int nid, u64 start, u64 size)
> @@ -1208,6 +1211,10 @@ static int should_add_memory_movable(int nid, u64 start, u64 size)
> unsigned long start_pfn = start >> PAGE_SHIFT;
> pg_data_t *pgdat = NODE_DATA(nid);
> struct zone *movable_zone = pgdat->node_zones + ZONE_MOVABLE;
> + struct zone *pre_zone = pgdat->node_zones + (ZONE_MOVABLE - 1);
> +
> + if (movable_node_is_enabled() && (zone_end_pfn(pre_zone) <= start_pfn))
> + return 1;
>
> if (zone_is_empty(movable_zone))
> return 0;
>

2015-11-05 03:11:20

by Changsheng Liu

Subject: Re: [PATCH V8] mm: memory hot-add: hot-added memory can not be added to movable zone by default



On 2015/11/5 0:31, Dave Hansen wrote:
> On 11/04/2015 12:23 AM, liuchangsheng wrote:
>> After the user config CONFIG_MOVABLE_NODE,
>> When the memory is hot added, should_add_memory_movable() return 0
>> because all zones including ZONE_MOVABLE are empty,
>> so the memory that was hot added will be assigned to ZONE_NORMAL,
>> and we need using the udev rules to online the memory automatically:
>> SUBSYSTEM=="memory", ACTION=="add", ATTR{state}=="offline",
>> ATTR{state}="online_movable"
>> The memory block onlined by udev must be adjacent to ZONE_MOVABLE.
>> The events of memory section are notified to udev asynchronously,
>> so it can not ensure that the memory block onlined by udev is
>> adjacent to ZONE_MOVABLE.So it can't ensure memory online always success.
>> But we want the whole node to be added to ZONE_MOVABLE by default.
> I'm still a bit confused about the whole scenario here.
>
> Is the core problem:
> 1. We add memory in a new node and that node can not be made entirely
> movable?
As far as I know, the system cannot ensure that the node can be made
entirely movable if the memory of the node is assigned to ZONE_NORMAL.
> or
> 2. We add memory to an existing zone that has some non-movable memory
> and we want the new memory to be movable?
It will work if we want to make the memory movable by using
movable_node.
>
>> @@ -1201,6 +1201,9 @@ static int check_hotplug_memory_range(u64 start, u64 size)
>> /*
>> * If movable zone has already been setup, newly added memory should be check.
>> * If its address is higher than movable zone, it should be added as movable.
>> + * And if system boots up with movable_node and config CONFIG_MOVABLE_NOD and
>> + * added memory does not overlap the zone before MOVABLE_ZONE,
>> + * the memory is added as movable.
>> * Without this check, movable zone may overlap with other zone.
>> */
> This comment is describing what the code does, but is rather sparse on
> why. This scenario is pretty convoluted and I can barely make sense of
> why it is doing this today while looking at the whole changelog, much
> less in a few years when the original changelog will be harder to come by.
>
> Also please put the comment next to the new if() statement. It's really
> hard to match the comment to the code the way you have it now.
>
>> static int should_add_memory_movable(int nid, u64 start, u64 size)
>> @@ -1208,6 +1211,10 @@ static int should_add_memory_movable(int nid, u64 start, u64 size)
>> unsigned long start_pfn = start >> PAGE_SHIFT;
>> pg_data_t *pgdat = NODE_DATA(nid);
>> struct zone *movable_zone = pgdat->node_zones + ZONE_MOVABLE;
>> + struct zone *pre_zone = pgdat->node_zones + (ZONE_MOVABLE - 1);
>> +
>> + if (movable_node_is_enabled() && (zone_end_pfn(pre_zone) <= start_pfn))
>> + return 1;
>>
>> if (zone_is_empty(movable_zone))
>> return 0;
>>
> .
>