2020-04-16 10:56:30

by David Hildenbrand

[permalink] [raw]
Subject: [PATCH RFC 2/2] mm/memory_hotplug: handle memblocks only with CONFIG_ARCH_KEEP_MEMBLOCK

The comment in add_memory_resource() is stale: hotadd_new_pgdat() will
no longer call get_pfn_range_for_nid(), as a hotadded pgdat will simply
span no pages at all, until memory is moved to the zone/node via
move_pfn_range_to_zone() - e.g., when onlining memory blocks.

The only archs that care about memblocks for hotplugged memory (either
for iterating over all system RAM or testing for memory validity) are
arm64, s390x, and powerpc - due to CONFIG_ARCH_KEEP_MEMBLOCK. Without
CONFIG_ARCH_KEEP_MEMBLOCK, we can simply stop messing with memblocks.

For s390x, it seems to be fairly easy to avoid CONFIG_ARCH_KEEP_MEMBLOCK.
arm64 could rework most code (esp., pfn_valid(), valid_phys_addr_range()
and kexec_file_load()) to not require memblocks for hotplugged
memory. E.g., as hotplugged memory has no holes and can be identified
using !early_section(), arm64's variant of pfn_valid() could be reworked
fairly easily to not require memblocks for hotadded memory. powerpc might
be more involed.

Cc: Andrew Morton <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Baoquan He <[email protected]>
Cc: Oscar Salvador <[email protected]>
Cc: Pankaj Gupta <[email protected]>
Cc: Mike Rapoport <[email protected]>
Cc: Anshuman Khandual <[email protected]>
Signed-off-by: David Hildenbrand <[email protected]>
---
mm/Kconfig | 3 +++
mm/memory_hotplug.c | 13 +++++++------
2 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/mm/Kconfig b/mm/Kconfig
index c1acc34c1c35..a063fd9cdff4 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -136,6 +136,9 @@ config HAVE_FAST_GUP
depends on MMU
bool

+# Don't discard allocated memory used to track "memory" and "reserved" memblocks
+# after early boot, so it can still be used to test for validity of memory.
+# Also, memblocks are updated with memory hot(un)plug.
config ARCH_KEEP_MEMBLOCK
bool

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 9b15ce465be2..104285ee9ae8 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1020,13 +1020,9 @@ int __ref add_memory_resource(int nid, struct resource *res)

mem_hotplug_begin();

- /*
- * Add new range to memblock so that when hotadd_new_pgdat() is called
- * to allocate new pgdat, get_pfn_range_for_nid() will be able to find
- * this new range and calculate total pages correctly. The range will
- * be removed at hot-remove time.
- */
+#ifdef CONFIG_ARCH_KEEP_MEMBLOCK
memblock_add_node(start, size, nid);
+#endif

ret = __try_online_node(nid, false);
if (ret < 0)
@@ -1075,7 +1071,9 @@ int __ref add_memory_resource(int nid, struct resource *res)
/* rollback pgdat allocation and others */
if (new_node)
rollback_node_hotadd(nid);
+#ifdef CONFIG_ARCH_KEEP_MEMBLOCK
memblock_remove(start, size);
+#endif
mem_hotplug_done();
return ret;
}
@@ -1751,8 +1749,11 @@ static int __ref try_remove_memory(int nid, u64 start, u64 size)
mem_hotplug_begin();

arch_remove_memory(nid, start, size, NULL);
+
+#ifdef CONFIG_ARCH_KEEP_MEMBLOCK
memblock_free(start, size);
memblock_remove(start, size);
+#endif
__release_memory_resource(start, size);

try_offline_node(nid);
--
2.25.1


2020-04-16 20:40:57

by Mike Rapoport

[permalink] [raw]
Subject: Re: [PATCH RFC 2/2] mm/memory_hotplug: handle memblocks only with CONFIG_ARCH_KEEP_MEMBLOCK

On Thu, Apr 16, 2020 at 12:47:07PM +0200, David Hildenbrand wrote:
> The comment in add_memory_resource() is stale: hotadd_new_pgdat() will
> no longer call get_pfn_range_for_nid(), as a hotadded pgdat will simply
> span no pages at all, until memory is moved to the zone/node via
> move_pfn_range_to_zone() - e.g., when onlining memory blocks.
>
> The only archs that care about memblocks for hotplugged memory (either
> for iterating over all system RAM or testing for memory validity) are
> arm64, s390x, and powerpc - due to CONFIG_ARCH_KEEP_MEMBLOCK. Without
> CONFIG_ARCH_KEEP_MEMBLOCK, we can simply stop messing with memblocks.
>
> For s390x, it seems to be fairly easy to avoid CONFIG_ARCH_KEEP_MEMBLOCK.
> arm64 could rework most code (esp., pfn_valid(), valid_phys_addr_range()
> and kexec_file_load()) to not require memblocks for hotplugged
> memory. E.g., as hotplugged memory has no holes and can be identified
> using !early_section(), arm64's variant of pfn_valid() could be reworked
> fairly easily to not require memblocks for hotadded memory. powerpc might
> be more involed.
>
> Cc: Andrew Morton <[email protected]>
> Cc: Michal Hocko <[email protected]>
> Cc: Baoquan He <[email protected]>
> Cc: Oscar Salvador <[email protected]>
> Cc: Pankaj Gupta <[email protected]>
> Cc: Mike Rapoport <[email protected]>
> Cc: Anshuman Khandual <[email protected]>
> Signed-off-by: David Hildenbrand <[email protected]>

Acked-by: Mike Rapoport <[email protected]>

> ---
> mm/Kconfig | 3 +++
> mm/memory_hotplug.c | 13 +++++++------
> 2 files changed, 10 insertions(+), 6 deletions(-)
>
> diff --git a/mm/Kconfig b/mm/Kconfig
> index c1acc34c1c35..a063fd9cdff4 100644
> --- a/mm/Kconfig
> +++ b/mm/Kconfig
> @@ -136,6 +136,9 @@ config HAVE_FAST_GUP
> depends on MMU
> bool
>
> +# Don't discard allocated memory used to track "memory" and "reserved" memblocks
> +# after early boot, so it can still be used to test for validity of memory.
> +# Also, memblocks are updated with memory hot(un)plug.
> config ARCH_KEEP_MEMBLOCK
> bool
>
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index 9b15ce465be2..104285ee9ae8 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -1020,13 +1020,9 @@ int __ref add_memory_resource(int nid, struct resource *res)
>
> mem_hotplug_begin();
>
> - /*
> - * Add new range to memblock so that when hotadd_new_pgdat() is called
> - * to allocate new pgdat, get_pfn_range_for_nid() will be able to find
> - * this new range and calculate total pages correctly. The range will
> - * be removed at hot-remove time.
> - */
> +#ifdef CONFIG_ARCH_KEEP_MEMBLOCK
> memblock_add_node(start, size, nid);
> +#endif
>
> ret = __try_online_node(nid, false);
> if (ret < 0)
> @@ -1075,7 +1071,9 @@ int __ref add_memory_resource(int nid, struct resource *res)
> /* rollback pgdat allocation and others */
> if (new_node)
> rollback_node_hotadd(nid);
> +#ifdef CONFIG_ARCH_KEEP_MEMBLOCK
> memblock_remove(start, size);
> +#endif
> mem_hotplug_done();
> return ret;
> }
> @@ -1751,8 +1749,11 @@ static int __ref try_remove_memory(int nid, u64 start, u64 size)
> mem_hotplug_begin();
>
> arch_remove_memory(nid, start, size, NULL);
> +
> +#ifdef CONFIG_ARCH_KEEP_MEMBLOCK
> memblock_free(start, size);
> memblock_remove(start, size);
> +#endif
> __release_memory_resource(start, size);
>
> try_offline_node(nid);
> --
> 2.25.1
>

--
Sincerely yours,
Mike.

2020-04-21 12:41:19

by Michal Hocko

[permalink] [raw]
Subject: Re: [PATCH RFC 2/2] mm/memory_hotplug: handle memblocks only with CONFIG_ARCH_KEEP_MEMBLOCK

On Thu 16-04-20 12:47:07, David Hildenbrand wrote:
> The comment in add_memory_resource() is stale: hotadd_new_pgdat() will
> no longer call get_pfn_range_for_nid(), as a hotadded pgdat will simply
> span no pages at all, until memory is moved to the zone/node via
> move_pfn_range_to_zone() - e.g., when onlining memory blocks.
>
> The only archs that care about memblocks for hotplugged memory (either
> for iterating over all system RAM or testing for memory validity) are
> arm64, s390x, and powerpc - due to CONFIG_ARCH_KEEP_MEMBLOCK. Without
> CONFIG_ARCH_KEEP_MEMBLOCK, we can simply stop messing with memblocks.

OK, makes sense to me.

> For s390x, it seems to be fairly easy to avoid CONFIG_ARCH_KEEP_MEMBLOCK.
> arm64 could rework most code (esp., pfn_valid(), valid_phys_addr_range()
> and kexec_file_load()) to not require memblocks for hotplugged
> memory. E.g., as hotplugged memory has no holes and can be identified
> using !early_section(), arm64's variant of pfn_valid() could be reworked
> fairly easily to not require memblocks for hotadded memory. powerpc might
> be more involed.

I haven't checked these architectures but is the information really
useful for this patch?

> Cc: Andrew Morton <[email protected]>
> Cc: Michal Hocko <[email protected]>
> Cc: Baoquan He <[email protected]>
> Cc: Oscar Salvador <[email protected]>
> Cc: Pankaj Gupta <[email protected]>
> Cc: Mike Rapoport <[email protected]>
> Cc: Anshuman Khandual <[email protected]>
> Signed-off-by: David Hildenbrand <[email protected]>

Acked-by: Michal Hocko <[email protected]>

with a minor nit

> - /*
> - * Add new range to memblock so that when hotadd_new_pgdat() is called
> - * to allocate new pgdat, get_pfn_range_for_nid() will be able to find
> - * this new range and calculate total pages correctly. The range will
> - * be removed at hot-remove time.
> - */
> +#ifdef CONFIG_ARCH_KEEP_MEMBLOCK

if (IS_ENABLED(CONFIG_ARCH_KEEP_MEMBLOCK)

would be slightly nicer. This should work for all the ifedefs in this
patch.

> memblock_add_node(start, size, nid);
> +#endif
>
> ret = __try_online_node(nid, false);
> if (ret < 0)
> @@ -1075,7 +1071,9 @@ int __ref add_memory_resource(int nid, struct resource *res)
> /* rollback pgdat allocation and others */
> if (new_node)
> rollback_node_hotadd(nid);
> +#ifdef CONFIG_ARCH_KEEP_MEMBLOCK
> memblock_remove(start, size);
> +#endif
> mem_hotplug_done();
> return ret;
> }
> @@ -1751,8 +1749,11 @@ static int __ref try_remove_memory(int nid, u64 start, u64 size)
> mem_hotplug_begin();
>
> arch_remove_memory(nid, start, size, NULL);
> +
> +#ifdef CONFIG_ARCH_KEEP_MEMBLOCK
> memblock_free(start, size);
> memblock_remove(start, size);
> +#endif
> __release_memory_resource(start, size);
>
> try_offline_node(nid);
> --
> 2.25.1

--
Michal Hocko
SUSE Labs

2020-04-21 12:43:32

by David Hildenbrand

[permalink] [raw]
Subject: Re: [PATCH RFC 2/2] mm/memory_hotplug: handle memblocks only with CONFIG_ARCH_KEEP_MEMBLOCK

>> For s390x, it seems to be fairly easy to avoid CONFIG_ARCH_KEEP_MEMBLOCK.
>> arm64 could rework most code (esp., pfn_valid(), valid_phys_addr_range()
>> and kexec_file_load()) to not require memblocks for hotplugged
>> memory. E.g., as hotplugged memory has no holes and can be identified
>> using !early_section(), arm64's variant of pfn_valid() could be reworked
>> fairly easily to not require memblocks for hotadded memory. powerpc might
>> be more involed.
>
> I haven't checked these architectures but is the information really
> useful for this patch?

It might of interest for Arm64 people (cced below). I sent out s390x
patches to change that.

I can drop that part.

>
>> Cc: Andrew Morton <[email protected]>
>> Cc: Michal Hocko <[email protected]>
>> Cc: Baoquan He <[email protected]>
>> Cc: Oscar Salvador <[email protected]>
>> Cc: Pankaj Gupta <[email protected]>
>> Cc: Mike Rapoport <[email protected]>
>> Cc: Anshuman Khandual <[email protected]>
>> Signed-off-by: David Hildenbrand <[email protected]>
>
> Acked-by: Michal Hocko <[email protected]>
>
> with a minor nit
>
>> - /*
>> - * Add new range to memblock so that when hotadd_new_pgdat() is called
>> - * to allocate new pgdat, get_pfn_range_for_nid() will be able to find
>> - * this new range and calculate total pages correctly. The range will
>> - * be removed at hot-remove time.
>> - */
>> +#ifdef CONFIG_ARCH_KEEP_MEMBLOCK
>
> if (IS_ENABLED(CONFIG_ARCH_KEEP_MEMBLOCK)
>
> would be slightly nicer. This should work for all the ifedefs in this
> patch.

Yeah, will change if it compiles.


--
Thanks,

David / dhildenb