2018-03-21 08:11:40

by Jia He

Subject: [PATCH RFC 0/4] optimize memblock_next_valid_pfn() and early_pfn_valid()

Commit b92df1de5d28 ("mm: page_alloc: skip over regions of invalid pfns
where possible") tried to optimize the loop in memmap_init_zone(). But
there is still some room for improvement.

Patch 1 optimizes memblock_next_valid_pfn().
Patches 2~4 optimize early_pfn_valid(); the change is split into several
parts because it crosses subsystems.
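
The common idea in both optimizations is shown below as a self-contained
sketch (the helper names and region values are illustrative only, not the
kernel code; the regions loosely mirror the memory map attached further
down): cache the index of the memblock region that matched last time, and
only fall back to the binary search when the pfn steps outside that region.

#include <stdbool.h>

/* Simplified stand-in for memblock's sorted, non-overlapping regions. */
struct region { unsigned long start_pfn, end_pfn; };

static const struct region regions[] = {
	{ 0x200, 0x220 }, { 0x820, 0x3080 }, { 0x10800, 0x17ff0 },
};
static const int nr_regions = sizeof(regions) / sizeof(regions[0]);

/* Plain binary search; returns the region index, or -1 for a hole. */
static int region_search(unsigned long pfn)
{
	int left = 0, right = nr_regions - 1;

	while (left <= right) {
		int mid = (left + right) / 2;

		if (pfn < regions[mid].start_pfn)
			right = mid - 1;
		else if (pfn >= regions[mid].end_pfn)
			left = mid + 1;
		else
			return mid;
	}
	return -1;
}

/*
 * The trick: when walking pfns sequentially, the region that matched
 * last time almost always matches again, so check it before searching.
 */
static bool pfn_valid_cached(unsigned long pfn, int *last_idx)
{
	if (*last_idx >= 0 && pfn >= regions[*last_idx].start_pfn &&
	    pfn < regions[*last_idx].end_pfn)
		return true;	/* fast path: no binary search needed */

	*last_idx = region_search(pfn);
	return *last_idx >= 0;
}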

I tested the pfn stepping process in memmap_init(); it is the same as
before this series. As for the performance improvement: with this set
applied, the time overhead of memmap_init() is reduced from 41313 us to
24345 us on my ARMv8a server (QDF2400 with 96GB of memory).

Attached is the memblock region information from my server.
[ 86.956758] Zone ranges:
[ 86.959452] DMA [mem 0x0000000000200000-0x00000000ffffffff]
[ 86.966041] Normal [mem 0x0000000100000000-0x00000017ffffffff]
[ 86.972631] Movable zone start for each node
[ 86.977179] Early memory node ranges
[ 86.980985] node 0: [mem 0x0000000000200000-0x000000000021ffff]
[ 86.987666] node 0: [mem 0x0000000000820000-0x000000000307ffff]
[ 86.994348] node 0: [mem 0x0000000003080000-0x000000000308ffff]
[ 87.001029] node 0: [mem 0x0000000003090000-0x00000000031fffff]
[ 87.007710] node 0: [mem 0x0000000003200000-0x00000000033fffff]
[ 87.014392] node 0: [mem 0x0000000003410000-0x000000000563ffff]
[ 87.021073] node 0: [mem 0x0000000005640000-0x000000000567ffff]
[ 87.027754] node 0: [mem 0x0000000005680000-0x00000000056dffff]
[ 87.034435] node 0: [mem 0x00000000056e0000-0x00000000086fffff]
[ 87.041117] node 0: [mem 0x0000000008700000-0x000000000871ffff]
[ 87.047798] node 0: [mem 0x0000000008720000-0x000000000894ffff]
[ 87.054479] node 0: [mem 0x0000000008950000-0x0000000008baffff]
[ 87.061161] node 0: [mem 0x0000000008bb0000-0x0000000008bcffff]
[ 87.067842] node 0: [mem 0x0000000008bd0000-0x0000000008c4ffff]
[ 87.074524] node 0: [mem 0x0000000008c50000-0x0000000008e2ffff]
[ 87.081205] node 0: [mem 0x0000000008e30000-0x0000000008e4ffff]
[ 87.087886] node 0: [mem 0x0000000008e50000-0x0000000008fcffff]
[ 87.094568] node 0: [mem 0x0000000008fd0000-0x000000000910ffff]
[ 87.101249] node 0: [mem 0x0000000009110000-0x00000000092effff]
[ 87.107930] node 0: [mem 0x00000000092f0000-0x000000000930ffff]
[ 87.114612] node 0: [mem 0x0000000009310000-0x000000000963ffff]
[ 87.121293] node 0: [mem 0x0000000009640000-0x000000000e61ffff]
[ 87.127975] node 0: [mem 0x000000000e620000-0x000000000e64ffff]
[ 87.134657] node 0: [mem 0x000000000e650000-0x000000000fffffff]
[ 87.141338] node 0: [mem 0x0000000010800000-0x0000000017feffff]
[ 87.148019] node 0: [mem 0x000000001c000000-0x000000001c00ffff]
[ 87.154701] node 0: [mem 0x000000001c010000-0x000000001c7fffff]
[ 87.161383] node 0: [mem 0x000000001c810000-0x000000007efbffff]
[ 87.168064] node 0: [mem 0x000000007efc0000-0x000000007efdffff]
[ 87.174746] node 0: [mem 0x000000007efe0000-0x000000007efeffff]
[ 87.181427] node 0: [mem 0x000000007eff0000-0x000000007effffff]
[ 87.188108] node 0: [mem 0x000000007f000000-0x00000017ffffffff]
[ 87.194791] Initmem setup node 0 [mem 0x0000000000200000-0x00000017ffffffff]

Without this patchset:
[ 117.106153] Initmem setup node 0 [mem 0x0000000000200000-0x00000017ffffffff]
[ 117.113677] before memmap_init
[ 117.118195] after memmap_init
>>> memmap_init takes 4518 us
[ 117.121446] before memmap_init
[ 117.154992] after memmap_init
>>> memmap_init takes 33546 us
[ 117.158241] before memmap_init
[ 117.161490] after memmap_init
>>> memmap_init takes 3249 us
>>> totally takes 41313 us

With this patchset:
[ 87.194791] Initmem setup node 0 [mem 0x0000000000200000-0x00000017ffffffff]
[ 87.202314] before memmap_init
[ 87.206164] after memmap_init
>>> memmap_init takes 3850 us
[ 87.209416] before memmap_init
[ 87.226662] after memmap_init
>>> memmap_init takes 17246 us
[ 87.229911] before memmap_init
[ 87.233160] after memmap_init
>>> memmap_init takes 3249 us
>>> totally takes 24345 us
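
For reference, timing lines like "before/after memmap_init" above can be
produced with a debug hack along the following lines in
free_area_init_core() (a sketch only: the real debug patch is not shown
here, and the use of local_clock() is an assumption about how the
timestamps were taken).

	u64 t0, t1;

	pr_info("before memmap_init\n");
	t0 = local_clock();		/* ns since boot */
	memmap_init(size, nid, j, zone_start_pfn);
	t1 = local_clock();
	pr_info("after memmap_init\n");
	pr_info(">>> memmap_init takes %llu us\n",
		(t1 - t0) / NSEC_PER_USEC);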

Jia He (4):
mm: page_alloc: reduce unnecessary binary search in memblock_next_valid_pfn()
mm/memblock: introduce memblock_search_pfn_regions()
arm64: introduce pfn_valid_region()
mm: page_alloc: reduce unnecessary binary search in early_pfn_valid()

 arch/arm64/include/asm/page.h    |  3 ++-
 arch/arm64/mm/init.c             | 19 ++++++++++++++++++-
 arch/x86/include/asm/mmzone_32.h |  2 +-
 include/linux/memblock.h         |  3 ++-
 include/linux/mmzone.h           | 12 +++++++++---
 mm/memblock.c                    | 35 +++++++++++++++++++++++++++++++----
 mm/page_alloc.c                  |  5 +++--
 7 files changed, 66 insertions(+), 13 deletions(-)

--
2.7.4



2018-03-21 08:12:27

by Jia He

Subject: [PATCH 2/4] mm/memblock: introduce memblock_search_pfn_regions()

This API is a preparation for further optimizing early_pfn_valid().

Signed-off-by: Jia He <[email protected]>
---
 include/linux/memblock.h |  2 ++
 mm/memblock.c            | 12 ++++++++++++
 2 files changed, 14 insertions(+)

diff --git a/include/linux/memblock.h b/include/linux/memblock.h
index 9471db4..5f46956 100644
--- a/include/linux/memblock.h
+++ b/include/linux/memblock.h
@@ -203,6 +203,8 @@ void __next_mem_pfn_range(int *idx, int nid, unsigned long *out_start_pfn,
 	     i >= 0; __next_mem_pfn_range(&i, nid, p_start, p_end, p_nid))
 #endif /* CONFIG_HAVE_MEMBLOCK_NODE_MAP */
 
+int memblock_search_pfn_regions(unsigned long pfn);
+
 unsigned long memblock_next_valid_pfn(unsigned long pfn, int *last_idx);
 /**
  * for_each_free_mem_range - iterate through free memblock areas
diff --git a/mm/memblock.c b/mm/memblock.c
index a9e8da4..f50fe5b 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -1659,6 +1659,18 @@ static int __init_memblock memblock_search(struct memblock_type *type, phys_addr
 	return -1;
 }
 
+/* search memblock with the input pfn, return the region idx */
+int __init_memblock memblock_search_pfn_regions(unsigned long pfn)
+{
+	struct memblock_type *type = &memblock.memory;
+	int mid = memblock_search(type, PFN_PHYS(pfn));
+
+	if (mid == -1)
+		return -1;
+
+	return mid;
+}
+
 bool __init memblock_is_reserved(phys_addr_t addr)
 {
 	return memblock_search(&memblock.reserved, addr) != -1;
--
2.7.4


2018-03-21 08:12:45

by Jia He

Subject: [PATCH 4/4] mm: page_alloc: reduce unnecessary binary search in early_pfn_valid()

Commit b92df1de5d28 ("mm: page_alloc: skip over regions of invalid pfns
where possible") optimized the loop in memmap_init_zone(). But there is
still some room for improvement. E.g. in early_pfn_valid(), we can record
the index of the last returned memblock region and check whether pfn+1 is
still in the same region.

Currently it only improves the performance on arm64 and has no impact on
other arches.

Signed-off-by: Jia He <[email protected]>
---
 arch/x86/include/asm/mmzone_32.h |  2 +-
 include/linux/mmzone.h           | 12 +++++++++---
 mm/page_alloc.c                  |  2 +-
 3 files changed, 11 insertions(+), 5 deletions(-)

diff --git a/arch/x86/include/asm/mmzone_32.h b/arch/x86/include/asm/mmzone_32.h
index 73d8dd1..329d3ba 100644
--- a/arch/x86/include/asm/mmzone_32.h
+++ b/arch/x86/include/asm/mmzone_32.h
@@ -49,7 +49,7 @@ static inline int pfn_valid(int pfn)
 	return 0;
 }
 
-#define early_pfn_valid(pfn)	pfn_valid((pfn))
+#define early_pfn_valid(pfn, last_region_idx)	pfn_valid((pfn))
 
 #endif /* CONFIG_DISCONTIGMEM */
 
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index d797716..3a686af 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -1267,9 +1267,15 @@ static inline int pfn_present(unsigned long pfn)
 })
 #else
 #define pfn_to_nid(pfn)		(0)
-#endif
+#endif /*CONFIG_NUMA*/
+
+#ifdef CONFIG_HAVE_ARCH_PFN_VALID
+#define early_pfn_valid(pfn, last_region_idx) \
+	pfn_valid_region(pfn, last_region_idx)
+#else
+#define early_pfn_valid(pfn, last_region_idx)	pfn_valid(pfn)
+#endif /*CONFIG_HAVE_ARCH_PFN_VALID*/
 
-#define early_pfn_valid(pfn)	pfn_valid(pfn)
 void sparse_init(void);
 #else
 #define sparse_init()	do {} while (0)
@@ -1288,7 +1294,7 @@ struct mminit_pfnnid_cache {
 };
 
 #ifndef early_pfn_valid
-#define early_pfn_valid(pfn)	(1)
+#define early_pfn_valid(pfn, last_region_idx)	(1)
 #endif
 
 void memory_present(int nid, unsigned long start, unsigned long end);
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index f28c62c..215dc92 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5481,7 +5481,7 @@ void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone,
 		if (context != MEMMAP_EARLY)
 			goto not_early;
 
-		if (!early_pfn_valid(pfn)) {
+		if (!early_pfn_valid(pfn, &idx)) {
 #ifdef CONFIG_HAVE_MEMBLOCK
 			/*
 			 * Skip to the pfn preceding the next valid one (or
--
2.7.4


2018-03-21 08:12:46

by Jia He

Subject: [PATCH 3/4] arm64: introduce pfn_valid_region()

This is a preparation for further optimizing early_pfn_valid()
on arm64.

Signed-off-by: Jia He <[email protected]>
---
 arch/arm64/include/asm/page.h |  3 ++-
 arch/arm64/mm/init.c          | 19 ++++++++++++++++++-
 2 files changed, 20 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/include/asm/page.h b/arch/arm64/include/asm/page.h
index 60d02c8..da2cba3 100644
--- a/arch/arm64/include/asm/page.h
+++ b/arch/arm64/include/asm/page.h
@@ -38,7 +38,8 @@ extern void clear_page(void *to);
 typedef struct page *pgtable_t;
 
 #ifdef CONFIG_HAVE_ARCH_PFN_VALID
-extern int pfn_valid(unsigned long);
+extern int pfn_valid(unsigned long pfn);
+extern int pfn_valid_region(unsigned long pfn, int *last_idx);
 #endif
 
 #include <asm/memory.h>
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 00e7b90..1d9842e 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -290,7 +290,24 @@ int pfn_valid(unsigned long pfn)
 	return memblock_is_map_memory(pfn << PAGE_SHIFT);
 }
 EXPORT_SYMBOL(pfn_valid);
-#endif
+
+int pfn_valid_region(unsigned long pfn, int *last_idx)
+{
+	struct memblock_type *type = &memblock.memory;
+
+	if (*last_idx != -1 && pfn < PFN_DOWN(type->regions[*last_idx].base
+				+ type->regions[*last_idx].size))
+		return !memblock_is_nomap(&memblock.memory.regions[*last_idx]);
+
+	*last_idx = memblock_search_pfn_regions(pfn);
+
+	if (*last_idx == -1)
+		return false;
+
+	return !memblock_is_nomap(&memblock.memory.regions[*last_idx]);
+}
+EXPORT_SYMBOL(pfn_valid_region);
+#endif /*CONFIG_HAVE_ARCH_PFN_VALID*/
 
 #ifndef CONFIG_SPARSEMEM
 static void __init arm64_memory_present(void)
--
2.7.4


2018-03-21 08:13:01

by Jia He

Subject: [PATCH 1/4] mm: page_alloc: reduce unnecessary binary search in memblock_next_valid_pfn()

Commit b92df1de5d28 ("mm: page_alloc: skip over regions of invalid pfns
where possible") optimized the loop in memmap_init_zone(). But there is
still some room for improvement. E.g. if pfn and pfn+1 are in the same
memblock region, we can simply pfn++ instead of doing the binary search
in memblock_next_valid_pfn.

Signed-off-by: Jia He <[email protected]>
---
 include/linux/memblock.h |  3 +--
 mm/memblock.c            | 23 +++++++++++++++++++----
 mm/page_alloc.c          |  3 ++-
 3 files changed, 22 insertions(+), 7 deletions(-)

diff --git a/include/linux/memblock.h b/include/linux/memblock.h
index b7aa3ff..9471db4 100644
--- a/include/linux/memblock.h
+++ b/include/linux/memblock.h
@@ -203,8 +203,7 @@ void __next_mem_pfn_range(int *idx, int nid, unsigned long *out_start_pfn,
 	     i >= 0; __next_mem_pfn_range(&i, nid, p_start, p_end, p_nid))
 #endif /* CONFIG_HAVE_MEMBLOCK_NODE_MAP */
 
-unsigned long memblock_next_valid_pfn(unsigned long pfn);
-
+unsigned long memblock_next_valid_pfn(unsigned long pfn, int *last_idx);
 /**
  * for_each_free_mem_range - iterate through free memblock areas
  * @i: u64 used as loop variable
diff --git a/mm/memblock.c b/mm/memblock.c
index c87924d..a9e8da4 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -1133,13 +1133,26 @@ int __init_memblock memblock_set_node(phys_addr_t base, phys_addr_t size,
 }
 #endif /* CONFIG_HAVE_MEMBLOCK_NODE_MAP */
 
-unsigned long __init_memblock memblock_next_valid_pfn(unsigned long pfn)
+unsigned long __init_memblock memblock_next_valid_pfn(unsigned long pfn,
+						      int *last_idx)
 {
 	struct memblock_type *type = &memblock.memory;
 	unsigned int right = type->cnt;
 	unsigned int mid, left = 0;
+	unsigned long start_pfn, end_pfn;
 	phys_addr_t addr = PFN_PHYS(++pfn);
 
+	/* fast path, return pfn+1 if next pfn is in the same region */
+	if (*last_idx != -1) {
+		start_pfn = PFN_DOWN(type->regions[*last_idx].base);
+		end_pfn = PFN_DOWN(type->regions[*last_idx].base +
+				   type->regions[*last_idx].size);
+
+		if (pfn < end_pfn && pfn > start_pfn)
+			return pfn;
+	}
+
+	/* slow path, do the binary search */
 	do {
 		mid = (right + left) / 2;
 
@@ -1149,15 +1162,17 @@ unsigned long __init_memblock memblock_next_valid_pfn(unsigned long pfn)
 				  type->regions[mid].size))
 			left = mid + 1;
 		else {
-			/* addr is within the region, so pfn is valid */
+			*last_idx = mid;
 			return pfn;
 		}
 	} while (left < right);
 
 	if (right == type->cnt)
 		return -1UL;
-	else
-		return PHYS_PFN(type->regions[right].base);
+
+	*last_idx = right;
+
+	return PHYS_PFN(type->regions[*last_idx].base);
 }
 
 static phys_addr_t __init memblock_alloc_range_nid(phys_addr_t size,
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 3899209..f28c62c 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5456,6 +5456,7 @@ void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone,
 	unsigned long end_pfn = start_pfn + size;
 	pg_data_t *pgdat = NODE_DATA(nid);
 	unsigned long pfn;
+	int idx = -1;
 	unsigned long nr_initialised = 0;
 	struct page *page;
 #ifdef CONFIG_HAVE_MEMBLOCK_NODE_MAP
@@ -5487,7 +5488,7 @@ void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone,
 			 * end_pfn), such that we hit a valid pfn (or end_pfn)
 			 * on our next iteration of the loop.
 			 */
-			pfn = memblock_next_valid_pfn(pfn) - 1;
+			pfn = memblock_next_valid_pfn(pfn, &idx) - 1;
 #endif
 			continue;
 		}
--
2.7.4


2018-03-21 10:16:13

by Daniel Vacek

Subject: Re: [PATCH 1/4] mm: page_alloc: reduce unnecessary binary search in memblock_next_valid_pfn()

On Wed, Mar 21, 2018 at 9:09 AM, Jia He <[email protected]> wrote:
> Commit b92df1de5d28 ("mm: page_alloc: skip over regions of invalid pfns
> where possible") optimized the loop in memmap_init_zone(). But there is
> still some room for improvement. E.g. if pfn and pfn+1 are in the same
> memblock region, we can simply pfn++ instead of doing the binary search
> in memblock_next_valid_pfn.

There is a revert-mm-page_alloc-skip-over-regions-of-invalid-pfns-where-possible.patch
in -mm reverting b92df1de5d289c0b as it is fundamentally wrong by
design causing system panics on some machines with rare but still
valid mappings. Basically it skips valid pfns which are outside of
usable memory ranges (outside of memblock memory regions).

So I guess if you want to optimize boot-up time, you will have to come
up with a different solution in memmap_init_zone, most likely one based
on mem sections instead of memblock. Sorry.
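
One possible shape of that, as a sketch only (assuming sparsemem; the
helper name is made up, but pfn_to_section_nr(), present_section_nr()
and section_nr_to_pfn() are the generic sparsemem primitives). Sections,
unlike memblock regions, are the granularity at which a memmap is
guaranteed to exist, so skipping whole absent sections cannot jump over
valid pfns:

static unsigned long next_present_section_pfn(unsigned long pfn)
{
	unsigned long nr = pfn_to_section_nr(pfn);

	/* skip whole absent sections; present ones are walked pfn by pfn */
	while (nr < NR_MEM_SECTIONS && !present_section_nr(nr))
		nr++;

	return section_nr_to_pfn(nr);
}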

--nX

> Signed-off-by: Jia He <[email protected]>
> ---
> include/linux/memblock.h | 3 +--
> mm/memblock.c | 23 +++++++++++++++++++----
> mm/page_alloc.c | 3 ++-
> 3 files changed, 22 insertions(+), 7 deletions(-)
>
> diff --git a/include/linux/memblock.h b/include/linux/memblock.h
> index b7aa3ff..9471db4 100644
> --- a/include/linux/memblock.h
> +++ b/include/linux/memblock.h
> @@ -203,8 +203,7 @@ void __next_mem_pfn_range(int *idx, int nid, unsigned long *out_start_pfn,
> i >= 0; __next_mem_pfn_range(&i, nid, p_start, p_end, p_nid))
> #endif /* CONFIG_HAVE_MEMBLOCK_NODE_MAP */
>
> -unsigned long memblock_next_valid_pfn(unsigned long pfn);
> -
> +unsigned long memblock_next_valid_pfn(unsigned long pfn, int *last_idx);
> /**
> * for_each_free_mem_range - iterate through free memblock areas
> * @i: u64 used as loop variable
> diff --git a/mm/memblock.c b/mm/memblock.c
> index c87924d..a9e8da4 100644
> --- a/mm/memblock.c
> +++ b/mm/memblock.c
> @@ -1133,13 +1133,26 @@ int __init_memblock memblock_set_node(phys_addr_t base, phys_addr_t size,
> }
> #endif /* CONFIG_HAVE_MEMBLOCK_NODE_MAP */
>
> -unsigned long __init_memblock memblock_next_valid_pfn(unsigned long pfn)
> +unsigned long __init_memblock memblock_next_valid_pfn(unsigned long pfn,
> + int *last_idx)
> {
> struct memblock_type *type = &memblock.memory;
> unsigned int right = type->cnt;
> unsigned int mid, left = 0;
> + unsigned long start_pfn, end_pfn;
> phys_addr_t addr = PFN_PHYS(++pfn);
>
> + /* fast path, return pfn+1 if next pfn is in the same region */
> + if (*last_idx != -1) {
> + start_pfn = PFN_DOWN(type->regions[*last_idx].base);
> + end_pfn = PFN_DOWN(type->regions[*last_idx].base +
> + type->regions[*last_idx].size);
> +
> + if (pfn < end_pfn && pfn > start_pfn)
> + return pfn;
> + }
> +
> + /* slow path, do the binary search */
> do {
> mid = (right + left) / 2;
>
> @@ -1149,15 +1162,17 @@ unsigned long __init_memblock memblock_next_valid_pfn(unsigned long pfn)
> type->regions[mid].size))
> left = mid + 1;
> else {
> - /* addr is within the region, so pfn is valid */
> + *last_idx = mid;
> return pfn;
> }
> } while (left < right);
>
> if (right == type->cnt)
> return -1UL;
> - else
> - return PHYS_PFN(type->regions[right].base);
> +
> + *last_idx = right;
> +
> + return PHYS_PFN(type->regions[*last_idx].base);
> }
>
> static phys_addr_t __init memblock_alloc_range_nid(phys_addr_t size,
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 3899209..f28c62c 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -5456,6 +5456,7 @@ void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone,
> unsigned long end_pfn = start_pfn + size;
> pg_data_t *pgdat = NODE_DATA(nid);
> unsigned long pfn;
> + int idx = -1;
> unsigned long nr_initialised = 0;
> struct page *page;
> #ifdef CONFIG_HAVE_MEMBLOCK_NODE_MAP
> @@ -5487,7 +5488,7 @@ void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone,
> * end_pfn), such that we hit a valid pfn (or end_pfn)
> * on our next iteration of the loop.
> */
> - pfn = memblock_next_valid_pfn(pfn) - 1;
> + pfn = memblock_next_valid_pfn(pfn, &idx) - 1;
> #endif
> continue;
> }
> --
> 2.7.4
>

2018-03-21 10:16:32

by Daniel Vacek

Subject: Re: [PATCH 2/4] mm/memblock: introduce memblock_search_pfn_regions()

On Wed, Mar 21, 2018 at 9:09 AM, Jia He <[email protected]> wrote:
> This API is a preparation for further optimizing early_pfn_valid().
>
> Signed-off-by: Jia He <[email protected]>
> ---
> include/linux/memblock.h | 2 ++
> mm/memblock.c | 12 ++++++++++++
> 2 files changed, 14 insertions(+)
>
> diff --git a/include/linux/memblock.h b/include/linux/memblock.h
> index 9471db4..5f46956 100644
> --- a/include/linux/memblock.h
> +++ b/include/linux/memblock.h
> @@ -203,6 +203,8 @@ void __next_mem_pfn_range(int *idx, int nid, unsigned long *out_start_pfn,
> i >= 0; __next_mem_pfn_range(&i, nid, p_start, p_end, p_nid))
> #endif /* CONFIG_HAVE_MEMBLOCK_NODE_MAP */
>
> +int memblock_search_pfn_regions(unsigned long pfn);
> +
> unsigned long memblock_next_valid_pfn(unsigned long pfn, int *last_idx);
> /**
> * for_each_free_mem_range - iterate through free memblock areas
> diff --git a/mm/memblock.c b/mm/memblock.c
> index a9e8da4..f50fe5b 100644
> --- a/mm/memblock.c
> +++ b/mm/memblock.c
> @@ -1659,6 +1659,18 @@ static int __init_memblock memblock_search(struct memblock_type *type, phys_addr
> return -1;
> }
>
> +/* search memblock with the input pfn, return the region idx */
> +int __init_memblock memblock_search_pfn_regions(unsigned long pfn)
> +{
> + struct memblock_type *type = &memblock.memory;
> + int mid = memblock_search(type, PFN_PHYS(pfn));
> +
> + if (mid == -1)
> + return -1;

Why this?

> + return mid;
> +}
> +
> bool __init memblock_is_reserved(phys_addr_t addr)
> {
> return memblock_search(&memblock.reserved, addr) != -1;
> --
> 2.7.4
>

2018-03-21 12:05:53

by Jia He

Subject: Re: [PATCH 2/4] mm/memblock: introduce memblock_search_pfn_regions()

Hi Daniel,

Thanks for the review.


On 3/21/2018 6:14 PM, Daniel Vacek Wrote:
> On Wed, Mar 21, 2018 at 9:09 AM, Jia He <[email protected]> wrote:
>> This API is a preparation for further optimizing early_pfn_valid().
>>
>> Signed-off-by: Jia He <[email protected]>
>> ---
>> include/linux/memblock.h | 2 ++
>> mm/memblock.c | 12 ++++++++++++
>> 2 files changed, 14 insertions(+)
>>
>> diff --git a/include/linux/memblock.h b/include/linux/memblock.h
>> index 9471db4..5f46956 100644
>> --- a/include/linux/memblock.h
>> +++ b/include/linux/memblock.h
>> @@ -203,6 +203,8 @@ void __next_mem_pfn_range(int *idx, int nid, unsigned long *out_start_pfn,
>> i >= 0; __next_mem_pfn_range(&i, nid, p_start, p_end, p_nid))
>> #endif /* CONFIG_HAVE_MEMBLOCK_NODE_MAP */
>>
>> +int memblock_search_pfn_regions(unsigned long pfn);
>> +
>> unsigned long memblock_next_valid_pfn(unsigned long pfn, int *last_idx);
>> /**
>> * for_each_free_mem_range - iterate through free memblock areas
>> diff --git a/mm/memblock.c b/mm/memblock.c
>> index a9e8da4..f50fe5b 100644
>> --- a/mm/memblock.c
>> +++ b/mm/memblock.c
>> @@ -1659,6 +1659,18 @@ static int __init_memblock memblock_search(struct memblock_type *type, phys_addr
>> return -1;
>> }
>>
>> +/* search memblock with the input pfn, return the region idx */
>> +int __init_memblock memblock_search_pfn_regions(unsigned long pfn)
>> +{
>> + struct memblock_type *type = &memblock.memory;
>> + int mid = memblock_search(type, PFN_PHYS(pfn));
>> +
>> + if (mid == -1)
>> + return -1;
> Why this?
Yes, it is redundant and can be removed.
Thanks

Cheers,
Jia

2018-03-21 12:30:27

by Jia He

Subject: Re: [PATCH 1/4] mm: page_alloc: reduce unnecessary binary search in memblock_next_valid_pfn()



On 3/21/2018 6:14 PM, Daniel Vacek Wrote:
> On Wed, Mar 21, 2018 at 9:09 AM, Jia He <[email protected]> wrote:
>> Commit b92df1de5d28 ("mm: page_alloc: skip over regions of invalid pfns
>> where possible") optimized the loop in memmap_init_zone(). But there is
>> still some room for improvement. E.g. if pfn and pfn+1 are in the same
>> memblock region, we can simply pfn++ instead of doing the binary search
>> in memblock_next_valid_pfn.
> There is a revert-mm-page_alloc-skip-over-regions-of-invalid-pfns-where-possible.patch
> in -mm reverting b92df1de5d289c0b as it is fundamentally wrong by
> design causing system panics on some machines with rare but still
> valid mappings. Basically it skips valid pfns which are outside of
> usable memory ranges (outside of memblock memory regions).
Thanks for the information.
Quoting from your patch description:

> But given some specific memory mapping on x86_64 (or more generally
> theoretically anywhere but on arm with CONFIG_HAVE_ARCH_PFN_VALID)
> the implementation also skips valid pfns which is plain wrong and
> causes 'kernel BUG at mm/page_alloc.c:1389!'

Do you think memblock_next_valid_pfn can remain (i.e. not be reverted)
on arm64 with CONFIG_HAVE_ARCH_PFN_VALID? arm64 can benefit from this
optimization.

Cheers,
Jia

2018-03-21 15:07:18

by Daniel Vacek

Subject: Re: [PATCH 1/4] mm: page_alloc: reduce unnecessary binary search in memblock_next_valid_pfn()

On Wed, Mar 21, 2018 at 1:28 PM, Jia He <[email protected]> wrote:
>
> On 3/21/2018 6:14 PM, Daniel Vacek Wrote:
>>
>> On Wed, Mar 21, 2018 at 9:09 AM, Jia He <[email protected]> wrote:
>>>
>>> Commit b92df1de5d28 ("mm: page_alloc: skip over regions of invalid pfns
>>> where possible") optimized the loop in memmap_init_zone(). But there is
>>> still some room for improvement. E.g. if pfn and pfn+1 are in the same
>>> memblock region, we can simply pfn++ instead of doing the binary search
>>> in memblock_next_valid_pfn.
>>
>> There is a
>> revert-mm-page_alloc-skip-over-regions-of-invalid-pfns-where-possible.patch
>> in -mm reverting b92df1de5d289c0b as it is fundamentally wrong by
>> design causing system panics on some machines with rare but still
>> valid mappings. Basically it skips valid pfns which are outside of
>> usable memory ranges (outside of memblock memory regions).
>
> Thanks for the information.
> Quoting from your patch description:
>> But given some specific memory mapping on x86_64 (or more generally
>> theoretically anywhere but on arm with CONFIG_HAVE_ARCH_PFN_VALID)
>> the implementation also skips valid pfns which is plain wrong and
>> causes 'kernel BUG at mm/page_alloc.c:1389!'
>
> Do you think memblock_next_valid_pfn can remain (i.e. not be reverted)
> on arm64 with CONFIG_HAVE_ARCH_PFN_VALID? arm64 can benefit from this
> optimization.

I guess this is a question for the maintainers. I am really not sure
about arm(64), but if this function is correct at least for arm(64) with
the arch-specific pfn_valid(), which is likely, then I'd say it should
be moved somewhere under arch/arm{,64}/mm/ (init.c maybe?) and #ifdef'ed
properly.
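
Something along these lines in memmap_init_zone(), for instance (just a
sketch, not a tested patch; the exact guard is my guess at "#ifdef'ed
properly"):

		if (!early_pfn_valid(pfn)) {
/*
 * Sketch: only arches where pfn_valid() is memblock-based (arm/arm64
 * with CONFIG_HAVE_ARCH_PFN_VALID) may safely skip over memblock holes.
 */
#if defined(CONFIG_HAVE_MEMBLOCK) && defined(CONFIG_HAVE_ARCH_PFN_VALID)
			pfn = memblock_next_valid_pfn(pfn) - 1;
#endif
			continue;
		}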

Ard?

> Cheers,
> Jia

2018-03-22 12:55:06

by Eugeniu Rosca

Subject: Re: [PATCH 1/4] mm: page_alloc: reduce unnecessary binary search in memblock_next_valid_pfn()

On Wed, Mar 21, 2018 at 08:28:18PM +0800, Jia He wrote:
>
>
> On 3/21/2018 6:14 PM, Daniel Vacek Wrote:
> >On Wed, Mar 21, 2018 at 9:09 AM, Jia He <[email protected]> wrote:
> >>Commit b92df1de5d28 ("mm: page_alloc: skip over regions of invalid pfns
> >>where possible") optimized the loop in memmap_init_zone(). But there is
> >>still some room for improvement. E.g. if pfn and pfn+1 are in the same
> >>memblock region, we can simply pfn++ instead of doing the binary search
> >>in memblock_next_valid_pfn.
> >There is a revert-mm-page_alloc-skip-over-regions-of-invalid-pfns-where-possible.patch
> >in -mm reverting b92df1de5d289c0b as it is fundamentally wrong by
> >design causing system panics on some machines with rare but still
> >valid mappings. Basically it skips valid pfns which are outside of
> >usable memory ranges (outside of memblock memory regions).
> Thanks for the information.
> Quoting from your patch description:
> > But given some specific memory mapping on x86_64 (or more generally
> > theoretically anywhere but on arm with CONFIG_HAVE_ARCH_PFN_VALID)
> > the implementation also skips valid pfns which is plain wrong and
> > causes 'kernel BUG at mm/page_alloc.c:1389!'
>
> Do you think memblock_next_valid_pfn can remain (i.e. not be reverted)
> on arm64 with CONFIG_HAVE_ARCH_PFN_VALID? arm64 can benefit from this
> optimization.

I confirm that the boot time of the R-Car H3 arm64 platform greatly
benefits from v4.11-rc1 commit b92df1de5d28 ("mm: page_alloc: skip over
regions of invalid pfns where possible"). The startup improvement is
roughly ~140ms, which will be lost if the mentioned commit is reverted.

For more details on my measurements, please see linux-next commit
283f1645e236 ("mm: page_alloc: skip over regions of invalid pfns on
UMA").

Whichever way you decide to go forward (reimplement/fix b92df1de5d28
or create an <arch>_next_valid_pfn), I am willing to participate in
testing your proposals on R-Car SoCs. TIA.

Thanks,
Eugeniu.