2023-10-19 10:45:48

by Qi Zheng

[permalink] [raw]
Subject: [PATCH v3 0/2] handle memoryless nodes more appropriately

Hi all,

Currently, in the process of initialization or offline memory, memoryless
nodes will still be built into the fallback list of itself or other nodes.

This is not what we expected, so this patch series removes memoryless
nodes from the fallback list entirely.

This series is based on the next-20231018.

Comments and suggestions are welcome.

Thanks,
Qi

Changlog in v2 -> v3:
- add a comment in [PATCH v2 2/2] (suggested by David Hildenbrand)
- collect Acked-bys

Changlog in v1 -> v2:
- modify the commit message in [PATCH 1/2], mention that it can also fix the
specific crash. (suggested by Ingo Molnar)
- rebase onto the next-20231018

Qi Zheng (2):
mm: page_alloc: skip memoryless nodes entirely
mm: memory_hotplug: drop memoryless node from fallback lists

mm/memory_hotplug.c | 6 +++++-
mm/page_alloc.c | 7 +++++--
2 files changed, 10 insertions(+), 3 deletions(-)

--
2.30.2


2023-10-19 10:46:07

by Qi Zheng

[permalink] [raw]
Subject: [PATCH v3 2/2] mm: memory_hotplug: drop memoryless node from fallback lists

In offline_pages(), if a node becomes memoryless, we
will clear its N_MEMORY state by calling node_states_clear_node().
But we do this after rebuilding the zonelists by calling
build_all_zonelists(), which will cause this memoryless node to
still be in the fallback list of other nodes. This will incur
some runtime overhead.

To drop memoryless node from fallback lists in this case, just
call node_states_clear_node() before calling build_all_zonelists().

Signed-off-by: Qi Zheng <[email protected]>
Acked-by: David Hildenbrand <[email protected]>
---
mm/memory_hotplug.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index d4a364fdaf8f..f019f7d6272c 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -2036,12 +2036,16 @@ int __ref offline_pages(unsigned long start_pfn, unsigned long nr_pages,
/* reinitialise watermarks and update pcp limits */
init_per_zone_wmark_min();

+ /*
+ * Make sure to mark the node as memory-less before rebuilding the zone
+ * list. Otherwise this node would still appear in the fallback lists.
+ */
+ node_states_clear_node(node, &arg);
if (!populated_zone(zone)) {
zone_pcp_reset(zone);
build_all_zonelists(NULL);
}

- node_states_clear_node(node, &arg);
if (arg.status_change_nid >= 0) {
kcompactd_stop(node);
kswapd_stop(node);
--
2.30.2

2023-10-20 07:07:26

by Huang, Ying

[permalink] [raw]
Subject: Re: [PATCH v3 2/2] mm: memory_hotplug: drop memoryless node from fallback lists

Qi Zheng <[email protected]> writes:

> In offline_pages(), if a node becomes memoryless, we
> will clear its N_MEMORY state by calling node_states_clear_node().
> But we do this after rebuilding the zonelists by calling
> build_all_zonelists(), which will cause this memoryless node to
> still be in the fallback list of other nodes.

For fallback list, do you mean pgdat->node_zonelists[]? If so, in

build_all_zonelists
__build_all_zonelists
build_zonelists
build_zonelists_in_node_order
build_zonerefs_node

populated_zone() will be checked before adding zone into zonelist.

So, IIUC, we will not try to allocate from the memory less node.

--
Best Regards,
Huang, Ying


> This will incur
> some runtime overhead.
>
> To drop memoryless node from fallback lists in this case, just
> call node_states_clear_node() before calling build_all_zonelists().
>
> Signed-off-by: Qi Zheng <[email protected]>
> Acked-by: David Hildenbrand <[email protected]>

[snip]

--
Best Regards,
Huang, Ying

2023-10-20 07:37:13

by Qi Zheng

[permalink] [raw]
Subject: Re: [PATCH v3 2/2] mm: memory_hotplug: drop memoryless node from fallback lists

Hi Ying,

On 2023/10/20 15:05, Huang, Ying wrote:
> Qi Zheng <[email protected]> writes:
>
>> In offline_pages(), if a node becomes memoryless, we
>> will clear its N_MEMORY state by calling node_states_clear_node().
>> But we do this after rebuilding the zonelists by calling
>> build_all_zonelists(), which will cause this memoryless node to
>> still be in the fallback list of other nodes.
>
> For fallback list, do you mean pgdat->node_zonelists[]? If so, in
>
> build_all_zonelists
> __build_all_zonelists
> build_zonelists
> build_zonelists_in_node_order
> build_zonerefs_node
>
> populated_zone() will be checked before adding zone into zonelist.
>
> So, IIUC, we will not try to allocate from the memory less node.

Normally yes, but if it is the weird topology mentioned in [1], it's
possible to allocate memory from it, it is a memoryless node, but it
also has memory.

In addition to the above case, I think it's reasonable to remove
memory less node from node_order[] in advance. In this way it will
not to be traversed in build_zonelists_in_node_order().

[1].
https://lore.kernel.org/all/[email protected]/

Thanks,
Qi


>
> --
> Best Regards,
> Huang, Ying
>
>
>> This will incur
>> some runtime overhead.
>>
>> To drop memoryless node from fallback lists in this case, just
>> call node_states_clear_node() before calling build_all_zonelists().
>>
>> Signed-off-by: Qi Zheng <[email protected]>
>> Acked-by: David Hildenbrand <[email protected]>
>
> [snip]
>
> --
> Best Regards,
> Huang, Ying

2023-10-20 08:32:29

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH v3 2/2] mm: memory_hotplug: drop memoryless node from fallback lists


* Qi Zheng <[email protected]> wrote:

> In offline_pages(), if a node becomes memoryless, we
> will clear its N_MEMORY state by calling node_states_clear_node().
> But we do this after rebuilding the zonelists by calling
> build_all_zonelists(), which will cause this memoryless node to
> still be in the fallback list of other nodes. This will incur
> some runtime overhead.
>
> To drop memoryless node from fallback lists in this case, just
> call node_states_clear_node() before calling build_all_zonelists().

s/memoryless node
/memoryless nodes

>
> Signed-off-by: Qi Zheng <[email protected]>
> Acked-by: David Hildenbrand <[email protected]>
> ---
> mm/memory_hotplug.c | 6 +++++-
> 1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index d4a364fdaf8f..f019f7d6272c 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -2036,12 +2036,16 @@ int __ref offline_pages(unsigned long start_pfn, unsigned long nr_pages,
> /* reinitialise watermarks and update pcp limits */
> init_per_zone_wmark_min();
>
> + /*
> + * Make sure to mark the node as memory-less before rebuilding the zone
> + * list. Otherwise this node would still appear in the fallback lists.
> + */
> + node_states_clear_node(node, &arg);

Acked-by: Ingo Molnar <[email protected]>

Thanks,

Ingo

2023-10-20 09:12:03

by Qi Zheng

[permalink] [raw]
Subject: Re: [PATCH v3 2/2] mm: memory_hotplug: drop memoryless node from fallback lists

Hi Ingo,

On 2023/10/20 16:32, Ingo Molnar wrote:
>
> * Qi Zheng <[email protected]> wrote:
>
>> In offline_pages(), if a node becomes memoryless, we
>> will clear its N_MEMORY state by calling node_states_clear_node().
>> But we do this after rebuilding the zonelists by calling
>> build_all_zonelists(), which will cause this memoryless node to
>> still be in the fallback list of other nodes. This will incur
>> some runtime overhead.
>>
>> To drop memoryless node from fallback lists in this case, just
>> call node_states_clear_node() before calling build_all_zonelists().
>
> s/memoryless node
> /memoryless nodes

Will do.

>
>>
>> Signed-off-by: Qi Zheng <[email protected]>
>> Acked-by: David Hildenbrand <[email protected]>
>> ---
>> mm/memory_hotplug.c | 6 +++++-
>> 1 file changed, 5 insertions(+), 1 deletion(-)
>>
>> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
>> index d4a364fdaf8f..f019f7d6272c 100644
>> --- a/mm/memory_hotplug.c
>> +++ b/mm/memory_hotplug.c
>> @@ -2036,12 +2036,16 @@ int __ref offline_pages(unsigned long start_pfn, unsigned long nr_pages,
>> /* reinitialise watermarks and update pcp limits */
>> init_per_zone_wmark_min();
>>
>> + /*
>> + * Make sure to mark the node as memory-less before rebuilding the zone
>> + * list. Otherwise this node would still appear in the fallback lists.
>> + */
>> + node_states_clear_node(node, &arg);
>
> Acked-by: Ingo Molnar <[email protected]>

Thanks.

>
> Thanks,
>
> Ingo

2023-10-23 01:20:45

by Huang, Ying

[permalink] [raw]
Subject: Re: [PATCH v3 2/2] mm: memory_hotplug: drop memoryless node from fallback lists

Qi Zheng <[email protected]> writes:

> Hi Ying,
>
> On 2023/10/20 15:05, Huang, Ying wrote:
>> Qi Zheng <[email protected]> writes:
>>
>>> In offline_pages(), if a node becomes memoryless, we
>>> will clear its N_MEMORY state by calling node_states_clear_node().
>>> But we do this after rebuilding the zonelists by calling
>>> build_all_zonelists(), which will cause this memoryless node to
>>> still be in the fallback list of other nodes.
>> For fallback list, do you mean pgdat->node_zonelists[]? If so, in
>> build_all_zonelists
>> __build_all_zonelists
>> build_zonelists
>> build_zonelists_in_node_order
>> build_zonerefs_node
>> populated_zone() will be checked before adding zone into zonelist.
>> So, IIUC, we will not try to allocate from the memory less node.
>
> Normally yes, but if it is the weird topology mentioned in [1], it's
> possible to allocate memory from it, it is a memoryless node, but it
> also has memory.
>
> In addition to the above case, I think it's reasonable to remove
> memory less node from node_order[] in advance. In this way it will
> not to be traversed in build_zonelists_in_node_order().
>
> [1]. https://lore.kernel.org/all/[email protected]/

Got it! Thank you for information. I think that it may be good to
include this in the patch description to avoid potential confusing in
the future.

--
Best Regards,
Huang, Ying

> Thanks,
> Qi
>
>
>> --
>> Best Regards,
>> Huang, Ying
>>
>>> This will incur
>>> some runtime overhead.
>>>
>>> To drop memoryless node from fallback lists in this case, just
>>> call node_states_clear_node() before calling build_all_zonelists().
>>>
>>> Signed-off-by: Qi Zheng <[email protected]>
>>> Acked-by: David Hildenbrand <[email protected]>
>> [snip]
>> --
>> Best Regards,
>> Huang, Ying

2023-10-23 02:54:04

by Qi Zheng

[permalink] [raw]
Subject: Re: [PATCH v3 2/2] mm: memory_hotplug: drop memoryless node from fallback lists

Hi Ying,

On 2023/10/23 09:18, Huang, Ying wrote:
> Qi Zheng <[email protected]> writes:
>
>> Hi Ying,
>>
>> On 2023/10/20 15:05, Huang, Ying wrote:
>>> Qi Zheng <[email protected]> writes:
>>>
>>>> In offline_pages(), if a node becomes memoryless, we
>>>> will clear its N_MEMORY state by calling node_states_clear_node().
>>>> But we do this after rebuilding the zonelists by calling
>>>> build_all_zonelists(), which will cause this memoryless node to
>>>> still be in the fallback list of other nodes.
>>> For fallback list, do you mean pgdat->node_zonelists[]? If so, in
>>> build_all_zonelists
>>> __build_all_zonelists
>>> build_zonelists
>>> build_zonelists_in_node_order
>>> build_zonerefs_node
>>> populated_zone() will be checked before adding zone into zonelist.
>>> So, IIUC, we will not try to allocate from the memory less node.
>>
>> Normally yes, but if it is the weird topology mentioned in [1], it's
>> possible to allocate memory from it, it is a memoryless node, but it
>> also has memory.
>>
>> In addition to the above case, I think it's reasonable to remove
>> memory less node from node_order[] in advance. In this way it will
>> not to be traversed in build_zonelists_in_node_order().
>>
>> [1]. https://lore.kernel.org/all/[email protected]/
>
> Got it! Thank you for information. I think that it may be good to
> include this in the patch description to avoid potential confusing in
> the future.

OK, maybe the commit message can be changed to the following:

```
In offline_pages(), if a node becomes memoryless, we
will clear its N_MEMORY state by calling node_states_clear_node().
But we do this after rebuilding the zonelists by calling
build_all_zonelists(), which will cause this memoryless node to
still be in the fallback nodes (node_order[]) of other nodes.

To drop memoryless nodes from fallback nodes in this case, just
call node_states_clear_node() before calling build_all_zonelists().

In this way, we will not try to allocate pages from memoryless
node0, then the panic mentioned in [1] will also be fixed. Even though
this problem has been solved by dropping the NODE_MIN_SIZE constrain
in x86 [2], it would be better to fix it in the core MM as well.

[1].
https://lore.kernel.org/all/[email protected]/
[2]. https://lore.kernel.org/all/[email protected]/

```

Thanks,
Qi

>
> --
> Best Regards,
> Huang, Ying
>
>> Thanks,
>> Qi
>>
>>
>>> --
>>> Best Regards,
>>> Huang, Ying
>>>
>>>> This will incur
>>>> some runtime overhead.
>>>>
>>>> To drop memoryless node from fallback lists in this case, just
>>>> call node_states_clear_node() before calling build_all_zonelists().
>>>>
>>>> Signed-off-by: Qi Zheng <[email protected]>
>>>> Acked-by: David Hildenbrand <[email protected]>
>>> [snip]
>>> --
>>> Best Regards,
>>> Huang, Ying

2023-10-23 03:13:12

by Huang, Ying

[permalink] [raw]
Subject: Re: [PATCH v3 2/2] mm: memory_hotplug: drop memoryless node from fallback lists

Qi Zheng <[email protected]> writes:

> Hi Ying,
>
> On 2023/10/23 09:18, Huang, Ying wrote:
>> Qi Zheng <[email protected]> writes:
>>
>>> Hi Ying,
>>>
>>> On 2023/10/20 15:05, Huang, Ying wrote:
>>>> Qi Zheng <[email protected]> writes:
>>>>
>>>>> In offline_pages(), if a node becomes memoryless, we
>>>>> will clear its N_MEMORY state by calling node_states_clear_node().
>>>>> But we do this after rebuilding the zonelists by calling
>>>>> build_all_zonelists(), which will cause this memoryless node to
>>>>> still be in the fallback list of other nodes.
>>>> For fallback list, do you mean pgdat->node_zonelists[]? If so, in
>>>> build_all_zonelists
>>>> __build_all_zonelists
>>>> build_zonelists
>>>> build_zonelists_in_node_order
>>>> build_zonerefs_node
>>>> populated_zone() will be checked before adding zone into zonelist.
>>>> So, IIUC, we will not try to allocate from the memory less node.
>>>
>>> Normally yes, but if it is the weird topology mentioned in [1], it's
>>> possible to allocate memory from it, it is a memoryless node, but it
>>> also has memory.
>>>
>>> In addition to the above case, I think it's reasonable to remove
>>> memory less node from node_order[] in advance. In this way it will
>>> not to be traversed in build_zonelists_in_node_order().
>>>
>>> [1]. https://lore.kernel.org/all/[email protected]/
>> Got it! Thank you for information. I think that it may be good to
>> include this in the patch description to avoid potential confusing in
>> the future.
>
> OK, maybe the commit message can be changed to the following:
>
> ```
> In offline_pages(), if a node becomes memoryless, we
> will clear its N_MEMORY state by calling node_states_clear_node().
> But we do this after rebuilding the zonelists by calling
> build_all_zonelists(), which will cause this memoryless node to
> still be in the fallback nodes (node_order[]) of other nodes.
>
> To drop memoryless nodes from fallback nodes in this case, just
> call node_states_clear_node() before calling build_all_zonelists().
>
> In this way, we will not try to allocate pages from memoryless
> node0, then the panic mentioned in [1] will also be fixed. Even though
> this problem has been solved by dropping the NODE_MIN_SIZE constrain
> in x86 [2], it would be better to fix it in the core MM as well.
>
> [1]. https://lore.kernel.org/all/[email protected]/
> [2]. https://lore.kernel.org/all/[email protected]/
>
> ```

This is helpful. Thanks!

--
Best Regards,
Huang, Ying

> Thanks,
> Qi
>
>> --
>> Best Regards,
>> Huang, Ying
>>
>>> Thanks,
>>> Qi
>>>
>>>
>>>> --
>>>> Best Regards,
>>>> Huang, Ying
>>>>
>>>>> This will incur
>>>>> some runtime overhead.
>>>>>
>>>>> To drop memoryless node from fallback lists in this case, just
>>>>> call node_states_clear_node() before calling build_all_zonelists().
>>>>>
>>>>> Signed-off-by: Qi Zheng <[email protected]>
>>>>> Acked-by: David Hildenbrand <[email protected]>
>>>> [snip]
>>>> --
>>>> Best Regards,
>>>> Huang, Ying

2023-10-23 03:17:46

by Qi Zheng

[permalink] [raw]
Subject: Re: [PATCH v3 2/2] mm: memory_hotplug: drop memoryless node from fallback lists



On 2023/10/23 11:10, Huang, Ying wrote:
> Qi Zheng <[email protected]> writes:
>
>> Hi Ying,
>>
>> On 2023/10/23 09:18, Huang, Ying wrote:
>>> Qi Zheng <[email protected]> writes:
>>>
>>>> Hi Ying,
>>>>
>>>> On 2023/10/20 15:05, Huang, Ying wrote:
>>>>> Qi Zheng <[email protected]> writes:
>>>>>
>>>>>> In offline_pages(), if a node becomes memoryless, we
>>>>>> will clear its N_MEMORY state by calling node_states_clear_node().
>>>>>> But we do this after rebuilding the zonelists by calling
>>>>>> build_all_zonelists(), which will cause this memoryless node to
>>>>>> still be in the fallback list of other nodes.
>>>>> For fallback list, do you mean pgdat->node_zonelists[]? If so, in
>>>>> build_all_zonelists
>>>>> __build_all_zonelists
>>>>> build_zonelists
>>>>> build_zonelists_in_node_order
>>>>> build_zonerefs_node
>>>>> populated_zone() will be checked before adding zone into zonelist.
>>>>> So, IIUC, we will not try to allocate from the memory less node.
>>>>
>>>> Normally yes, but if it is the weird topology mentioned in [1], it's
>>>> possible to allocate memory from it, it is a memoryless node, but it
>>>> also has memory.
>>>>
>>>> In addition to the above case, I think it's reasonable to remove
>>>> memory less node from node_order[] in advance. In this way it will
>>>> not to be traversed in build_zonelists_in_node_order().
>>>>
>>>> [1]. https://lore.kernel.org/all/[email protected]/
>>> Got it! Thank you for information. I think that it may be good to
>>> include this in the patch description to avoid potential confusing in
>>> the future.
>>
>> OK, maybe the commit message can be changed to the following:
>>
>> ```
>> In offline_pages(), if a node becomes memoryless, we
>> will clear its N_MEMORY state by calling node_states_clear_node().
>> But we do this after rebuilding the zonelists by calling
>> build_all_zonelists(), which will cause this memoryless node to
>> still be in the fallback nodes (node_order[]) of other nodes.
>>
>> To drop memoryless nodes from fallback nodes in this case, just
>> call node_states_clear_node() before calling build_all_zonelists().
>>
>> In this way, we will not try to allocate pages from memoryless
>> node0, then the panic mentioned in [1] will also be fixed. Even though
>> this problem has been solved by dropping the NODE_MIN_SIZE constrain
>> in x86 [2], it would be better to fix it in the core MM as well.
>>
>> [1]. https://lore.kernel.org/all/[email protected]/
>> [2]. https://lore.kernel.org/all/[email protected]/
>>
>> ```

Hi Andrew, can you help modify the commit message to this? :)

Thanks,
Qi

>
> This is helpful. Thanks!
>
> --
> Best Regards,
> Huang, Ying
>
>> Thanks,
>> Qi
>>
>>> --
>>> Best Regards,
>>> Huang, Ying
>>>
>>>> Thanks,
>>>> Qi
>>>>
>>>>
>>>>> --
>>>>> Best Regards,
>>>>> Huang, Ying
>>>>>
>>>>>> This will incur
>>>>>> some runtime overhead.
>>>>>>
>>>>>> To drop memoryless node from fallback lists in this case, just
>>>>>> call node_states_clear_node() before calling build_all_zonelists().
>>>>>>
>>>>>> Signed-off-by: Qi Zheng <[email protected]>
>>>>>> Acked-by: David Hildenbrand <[email protected]>
>>>>> [snip]
>>>>> --
>>>>> Best Regards,
>>>>> Huang, Ying

2023-10-23 18:19:57

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH v3 2/2] mm: memory_hotplug: drop memoryless node from fallback lists

On Mon, 23 Oct 2023 11:17:03 +0800 Qi Zheng <[email protected]> wrote:

> Hi Andrew, can you help modify the commit message to this? :)

Done, thanks.