Commit b92df1de5d28 ("mm: page_alloc: skip over regions of invalid pfns
where possible") tried to optimize the loop in memmap_init_zone(), but
there is still some room for improvement.

Patch 1 retains memblock_next_valid_pfn() on arm and arm64
Patch 2 optimizes memblock_next_valid_pfn()
Patches 3~5 optimize early_pfn_valid()

I tested the pfn loop in memmap_init(); it behaves the same as before.
As for the performance improvement, with this set applied the time
overhead of memmap_init() is reduced from 41313 us to 24345 us on my
armv8a server (QDF2400 with 96G memory).

The memblock region information of my server is attached below.
[ 86.956758] Zone ranges:
[ 86.959452] DMA [mem 0x0000000000200000-0x00000000ffffffff]
[ 86.966041] Normal [mem 0x0000000100000000-0x00000017ffffffff]
[ 86.972631] Movable zone start for each node
[ 86.977179] Early memory node ranges
[ 86.980985] node 0: [mem 0x0000000000200000-0x000000000021ffff]
[ 86.987666] node 0: [mem 0x0000000000820000-0x000000000307ffff]
[ 86.994348] node 0: [mem 0x0000000003080000-0x000000000308ffff]
[ 87.001029] node 0: [mem 0x0000000003090000-0x00000000031fffff]
[ 87.007710] node 0: [mem 0x0000000003200000-0x00000000033fffff]
[ 87.014392] node 0: [mem 0x0000000003410000-0x000000000563ffff]
[ 87.021073] node 0: [mem 0x0000000005640000-0x000000000567ffff]
[ 87.027754] node 0: [mem 0x0000000005680000-0x00000000056dffff]
[ 87.034435] node 0: [mem 0x00000000056e0000-0x00000000086fffff]
[ 87.041117] node 0: [mem 0x0000000008700000-0x000000000871ffff]
[ 87.047798] node 0: [mem 0x0000000008720000-0x000000000894ffff]
[ 87.054479] node 0: [mem 0x0000000008950000-0x0000000008baffff]
[ 87.061161] node 0: [mem 0x0000000008bb0000-0x0000000008bcffff]
[ 87.067842] node 0: [mem 0x0000000008bd0000-0x0000000008c4ffff]
[ 87.074524] node 0: [mem 0x0000000008c50000-0x0000000008e2ffff]
[ 87.081205] node 0: [mem 0x0000000008e30000-0x0000000008e4ffff]
[ 87.087886] node 0: [mem 0x0000000008e50000-0x0000000008fcffff]
[ 87.094568] node 0: [mem 0x0000000008fd0000-0x000000000910ffff]
[ 87.101249] node 0: [mem 0x0000000009110000-0x00000000092effff]
[ 87.107930] node 0: [mem 0x00000000092f0000-0x000000000930ffff]
[ 87.114612] node 0: [mem 0x0000000009310000-0x000000000963ffff]
[ 87.121293] node 0: [mem 0x0000000009640000-0x000000000e61ffff]
[ 87.127975] node 0: [mem 0x000000000e620000-0x000000000e64ffff]
[ 87.134657] node 0: [mem 0x000000000e650000-0x000000000fffffff]
[ 87.141338] node 0: [mem 0x0000000010800000-0x0000000017feffff]
[ 87.148019] node 0: [mem 0x000000001c000000-0x000000001c00ffff]
[ 87.154701] node 0: [mem 0x000000001c010000-0x000000001c7fffff]
[ 87.161383] node 0: [mem 0x000000001c810000-0x000000007efbffff]
[ 87.168064] node 0: [mem 0x000000007efc0000-0x000000007efdffff]
[ 87.174746] node 0: [mem 0x000000007efe0000-0x000000007efeffff]
[ 87.181427] node 0: [mem 0x000000007eff0000-0x000000007effffff]
[ 87.188108] node 0: [mem 0x000000007f000000-0x00000017ffffffff]
[ 87.194791] Initmem setup node 0 [mem 0x0000000000200000-0x00000017ffffffff]
Without this patchset:
[ 117.106153] Initmem setup node 0 [mem 0x0000000000200000-0x00000017ffffffff]
[ 117.113677] before memmap_init
[ 117.118195] after memmap_init
>>> memmap_init takes 4518 us
[ 117.121446] before memmap_init
[ 117.154992] after memmap_init
>>> memmap_init takes 33546 us
[ 117.158241] before memmap_init
[ 117.161490] after memmap_init
>>> memmap_init takes 3249 us
>>> totally takes 41313 us
With this patchset:
[ 87.194791] Initmem setup node 0 [mem 0x0000000000200000-0x00000017ffffffff]
[ 87.202314] before memmap_init
[ 87.206164] after memmap_init
>>> memmap_init takes 3850 us
[ 87.209416] before memmap_init
[ 87.226662] after memmap_init
>>> memmap_init takes 17246 us
[ 87.229911] before memmap_init
[ 87.233160] after memmap_init
>>> memmap_init takes 3249 us
>>> totally takes 24345 us
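
As a cross-check of the totals above, the per-call times can be summed and compared. This is only a small userspace sketch; the helper names sketch_total_us() and sketch_reduction_permille() are made up for illustration and are not part of the patchset.

```c
#include <assert.h>
#include <stddef.h>

/* Sum the per-call times (in microseconds) printed in the logs. */
static unsigned long sketch_total_us(const unsigned long *us, size_t n)
{
	unsigned long total = 0;
	size_t i;

	assert(us != NULL || n == 0);
	for (i = 0; i < n; i++)
		total += us[i];
	return total;
}

/* Reduction from 'before' to 'after' in tenths of a percent, rounded down. */
static unsigned long sketch_reduction_permille(unsigned long before,
					       unsigned long after)
{
	assert(before > 0 && after <= before);
	return (before - after) * 1000UL / before;
}
```

Fed with the three memmap_init() times from each log, this gives 41313 us and 24345 us, a reduction of roughly 41%. Almost all of it comes from the second, largest call (33546 us down to 17246 us).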
Changelog:
V4: - refine patches as suggested by Daniel Vacek and Wei Yang
    - optimize on arm in addition to arm64
V3: - fix 2 issues reported by kbuild test robot
V2: - rebase to latest mmotm
    - retain memblock_next_valid_pfn() on arm64
    - refine memblock_search_pfn_regions() and pfn_valid_region()
Jia He (5):
mm: page_alloc: remain memblock_next_valid_pfn() on arm and arm64
arm: arm64: page_alloc: reduce unnecessary binary search in
memblock_next_valid_pfn()
mm/memblock: introduce memblock_search_pfn_regions()
arm64: introduce pfn_valid_region()
mm: page_alloc: reduce unnecessary binary search in early_pfn_valid()
arch/arm/include/asm/page.h | 4 ++-
arch/arm/mm/init.c | 71 ++++++++++++++++++++++++++++++++++++++++++-
arch/arm64/include/asm/page.h | 4 ++-
arch/arm64/mm/init.c | 71 ++++++++++++++++++++++++++++++++++++++++++-
include/linux/memblock.h | 2 ++
include/linux/mmzone.h | 7 ++++-
mm/memblock.c | 9 ++++++
mm/page_alloc.c | 14 ++++++++-
8 files changed, 176 insertions(+), 6 deletions(-)
--
2.7.4
Commit b92df1de5d28 ("mm: page_alloc: skip over regions of invalid pfns
where possible") optimized the loop in memmap_init_zone(). But it could
cause a panic, so Daniel Vacek reverted it later.

But as suggested by Daniel Vacek, it is fine to use memblock to skip
gaps and find the next valid frame with CONFIG_HAVE_ARCH_PFN_VALID.

On arm and arm64, memblock is used by default. But the generic version
of pfn_valid() is based on mem sections, and memblock_next_valid_pfn()
does not always return the next valid pfn for it: it can skip further,
so that some valid frames are skipped (as if they were invalid). That
is why the kernel was eventually crashing on some !arm machines.

And as verified by Eugeniu Rosca, arm can benefit from commit
b92df1de5d28. So retain memblock_next_valid_pfn() on arm{,64} and move
the related code to the arm{,64} arch directories.
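
For readers who want to try the lookup outside the kernel, the binary search can be modelled in userspace roughly as below. This is a sketch under assumptions, not the kernel code: struct sketch_region stands in for struct memblock_region, SKETCH_PAGE_SHIFT assumes 4 KiB pages, SKETCH_NO_PFN stands in for -1UL, and the regions are assumed to be sorted and non-overlapping.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12		/* assumption: 4 KiB pages */
#define SKETCH_NO_PFN UINT64_MAX	/* stands in for -1UL */

struct sketch_region {
	uint64_t base;	/* physical start address */
	uint64_t size;	/* length in bytes */
};

/*
 * Return pfn + 1 if it lies inside a region, otherwise the first pfn of
 * the next region above it, or SKETCH_NO_PFN if there is none.
 */
static uint64_t sketch_next_valid_pfn(const struct sketch_region *r,
				      size_t cnt, uint64_t pfn)
{
	size_t left = 0, right = cnt;
	uint64_t addr;

	assert(r != NULL || cnt == 0);
	pfn++;
	addr = pfn << SKETCH_PAGE_SHIFT;

	while (left < right) {
		size_t mid = left + (right - left) / 2;

		if (addr < r[mid].base)
			right = mid;
		else if (addr >= r[mid].base + r[mid].size)
			left = mid + 1;
		else
			return pfn;	/* addr is within the region */
	}

	/* right now indexes the first region that starts above addr */
	if (right == cnt)
		return SKETCH_NO_PFN;
	return r[right].base >> SKETCH_PAGE_SHIFT;
}
```

The sketch uses a while loop rather than do/while so that an empty region table is handled without touching r[0].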
Suggested-by: Daniel Vacek <[email protected]>
Signed-off-by: Jia He <[email protected]>
---
arch/arm/mm/init.c | 31 ++++++++++++++++++++++++++++++-
arch/arm64/mm/init.c | 31 ++++++++++++++++++++++++++++++-
mm/page_alloc.c | 13 ++++++++++++-
3 files changed, 72 insertions(+), 3 deletions(-)
diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c
index a1f11a7..0fb85ca 100644
--- a/arch/arm/mm/init.c
+++ b/arch/arm/mm/init.c
@@ -198,7 +198,36 @@ int pfn_valid(unsigned long pfn)
return memblock_is_map_memory(__pfn_to_phys(pfn));
}
EXPORT_SYMBOL(pfn_valid);
-#endif
+
+/* HAVE_MEMBLOCK is always enabled on arm */
+unsigned long __init_memblock memblock_next_valid_pfn(unsigned long pfn)
+{
+ struct memblock_type *type = &memblock.memory;
+ unsigned int right = type->cnt;
+ unsigned int mid, left = 0;
+ phys_addr_t addr = PFN_PHYS(++pfn);
+
+ do {
+ mid = (right + left) / 2;
+
+ if (addr < type->regions[mid].base)
+ right = mid;
+ else if (addr >= (type->regions[mid].base +
+ type->regions[mid].size))
+ left = mid + 1;
+ else {
+ /* addr is within the region, so pfn is valid */
+ return pfn;
+ }
+ } while (left < right);
+
+ if (right == type->cnt)
+ return -1UL;
+ else
+ return PHYS_PFN(type->regions[right].base);
+}
+EXPORT_SYMBOL(memblock_next_valid_pfn);
+#endif /*CONFIG_HAVE_ARCH_PFN_VALID*/
#ifndef CONFIG_SPARSEMEM
static void __init arm_memory_present(void)
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 00e7b90..13e43ff 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -290,7 +290,36 @@ int pfn_valid(unsigned long pfn)
return memblock_is_map_memory(pfn << PAGE_SHIFT);
}
EXPORT_SYMBOL(pfn_valid);
-#endif
+
+/* HAVE_MEMBLOCK is always enabled on arm64 */
+unsigned long __init_memblock memblock_next_valid_pfn(unsigned long pfn)
+{
+ struct memblock_type *type = &memblock.memory;
+ unsigned int right = type->cnt;
+ unsigned int mid, left = 0;
+ phys_addr_t addr = PFN_PHYS(++pfn);
+
+ do {
+ mid = (right + left) / 2;
+
+ if (addr < type->regions[mid].base)
+ right = mid;
+ else if (addr >= (type->regions[mid].base +
+ type->regions[mid].size))
+ left = mid + 1;
+ else {
+ /* addr is within the region, so pfn is valid */
+ return pfn;
+ }
+ } while (left < right);
+
+ if (right == type->cnt)
+ return -1UL;
+ else
+ return PHYS_PFN(type->regions[right].base);
+}
+EXPORT_SYMBOL(memblock_next_valid_pfn);
+#endif /*CONFIG_HAVE_ARCH_PFN_VALID*/
#ifndef CONFIG_SPARSEMEM
static void __init arm64_memory_present(void)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index c19f5ac..8a92df7 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5452,6 +5452,15 @@ void __ref build_all_zonelists(pg_data_t *pgdat)
* up by free_all_bootmem() once the early boot process is
* done. Non-atomic initialization, single-pass.
*/
+#if (defined CONFIG_HAVE_MEMBLOCK) && (defined CONFIG_HAVE_ARCH_PFN_VALID)
+extern unsigned long memblock_next_valid_pfn(unsigned long pfn);
+#define skip_to_last_invalid_pfn(pfn) (memblock_next_valid_pfn(pfn) - 1)
+#endif
+
+#ifndef skip_to_last_invalid_pfn
+#define skip_to_last_invalid_pfn(pfn) (pfn)
+#endif
+
void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone,
unsigned long start_pfn, enum memmap_context context,
struct vmem_altmap *altmap)
@@ -5483,8 +5492,10 @@ void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone,
if (context != MEMMAP_EARLY)
goto not_early;
- if (!early_pfn_valid(pfn))
+ if (!early_pfn_valid(pfn)) {
+ pfn = skip_to_last_invalid_pfn(pfn);
continue;
+ }
if (!early_pfn_in_nid(pfn, nid))
continue;
if (!update_defer_init(pgdat, pfn, end_pfn, &nr_initialised))
--
2.7.4
Commit b92df1de5d28 ("mm: page_alloc: skip over regions of invalid pfns
where possible") optimized the loop in memmap_init_zone(). But there is
still some room for improvement. E.g. if pfn and pfn+1 are in the same
memblock region, we can simply do pfn++ instead of the binary search
in memblock_next_valid_pfn().
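
The effect of the fast path can be seen in a userspace model such as the one below. It is only a sketch under assumptions: struct sketch_region, SKETCH_PAGE_SHIFT (4 KiB pages) and SKETCH_NO_PFN are stand-ins, and the sketch_slow_lookups counter exists purely to show how often the binary search runs; none of these names are in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12		/* assumption: 4 KiB pages */
#define SKETCH_NO_PFN UINT64_MAX	/* stands in for -1UL */

struct sketch_region {
	uint64_t base;	/* physical start address */
	uint64_t size;	/* length in bytes */
};

/* Illustration only: counts how often the binary search is taken. */
static unsigned long sketch_slow_lookups;

static uint64_t sketch_next_valid_pfn(const struct sketch_region *r, int cnt,
				      uint64_t pfn, int *last_idx)
{
	int left = 0, right = cnt;
	uint64_t addr;

	assert(last_idx != NULL && *last_idx >= -1 && *last_idx < cnt);
	pfn++;
	addr = pfn << SKETCH_PAGE_SHIFT;

	/* fast path: pfn + 1 is still in the region found last time */
	if (*last_idx != -1) {
		const struct sketch_region *c = &r[*last_idx];

		if (addr >= c->base && addr < c->base + c->size)
			return pfn;
	}

	/* slow path: binary search, remembering the region we land in */
	sketch_slow_lookups++;
	while (left < right) {
		int mid = left + (right - left) / 2;

		if (addr < r[mid].base) {
			right = mid;
		} else if (addr >= r[mid].base + r[mid].size) {
			left = mid + 1;
		} else {
			*last_idx = mid;
			return pfn;
		}
	}

	if (right == cnt)
		return SKETCH_NO_PFN;

	*last_idx = right;
	return r[right].base >> SKETCH_PAGE_SHIFT;
}
```

Walking a region page by page then costs one search when the region is entered and a range check for every page after that.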
Signed-off-by: Jia He <[email protected]>
---
arch/arm/include/asm/page.h | 1 +
arch/arm/mm/init.c | 31 ++++++++++++++++++++++++-------
arch/arm64/include/asm/page.h | 1 +
arch/arm64/mm/init.c | 31 ++++++++++++++++++++++++-------
mm/page_alloc.c | 5 +++--
5 files changed, 53 insertions(+), 16 deletions(-)
diff --git a/arch/arm/include/asm/page.h b/arch/arm/include/asm/page.h
index 4355f0e..7a0404f 100644
--- a/arch/arm/include/asm/page.h
+++ b/arch/arm/include/asm/page.h
@@ -157,6 +157,7 @@ extern void copy_page(void *to, const void *from);
typedef struct page *pgtable_t;
#ifdef CONFIG_HAVE_ARCH_PFN_VALID
+extern int early_region_idx;
extern int pfn_valid(unsigned long);
#endif
diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c
index 0fb85ca..7779804 100644
--- a/arch/arm/mm/init.c
+++ b/arch/arm/mm/init.c
@@ -193,6 +193,8 @@ static void __init zone_sizes_init(unsigned long min, unsigned long max_low,
}
#ifdef CONFIG_HAVE_ARCH_PFN_VALID
+int early_region_idx __meminitdata = -1;
+
int pfn_valid(unsigned long pfn)
{
return memblock_is_map_memory(__pfn_to_phys(pfn));
@@ -200,31 +202,46 @@ int pfn_valid(unsigned long pfn)
EXPORT_SYMBOL(pfn_valid);
/* HAVE_MEMBLOCK is always enabled on arm */
-unsigned long __init_memblock memblock_next_valid_pfn(unsigned long pfn)
+unsigned long __init_memblock memblock_next_valid_pfn(unsigned long pfn,
+ int *last_idx)
{
struct memblock_type *type = &memblock.memory;
+ struct memblock_region *regions = type->regions;
unsigned int right = type->cnt;
unsigned int mid, left = 0;
+ unsigned long start_pfn, end_pfn;
phys_addr_t addr = PFN_PHYS(++pfn);
+ /* fast path, return pfn+1 if next pfn is in the same region */
+ if (*last_idx != -1) {
+ start_pfn = PFN_DOWN(regions[*last_idx].base);
+ end_pfn = PFN_DOWN(regions[*last_idx].base +
+ regions[*last_idx].size);
+
+ if (pfn >= start_pfn && pfn < end_pfn)
+ return pfn;
+ }
+
+ /* slow path, do the binary searching */
do {
mid = (right + left) / 2;
- if (addr < type->regions[mid].base)
+ if (addr < regions[mid].base)
right = mid;
- else if (addr >= (type->regions[mid].base +
- type->regions[mid].size))
+ else if (addr >= (regions[mid].base + regions[mid].size))
left = mid + 1;
else {
- /* addr is within the region, so pfn is valid */
+ *last_idx = mid;
return pfn;
}
} while (left < right);
if (right == type->cnt)
return -1UL;
- else
- return PHYS_PFN(type->regions[right].base);
+
+ *last_idx = right;
+
+ return PHYS_PFN(regions[*last_idx].base);
}
EXPORT_SYMBOL(memblock_next_valid_pfn);
#endif /*CONFIG_HAVE_ARCH_PFN_VALID*/
diff --git a/arch/arm64/include/asm/page.h b/arch/arm64/include/asm/page.h
index 60d02c8..84b503a 100644
--- a/arch/arm64/include/asm/page.h
+++ b/arch/arm64/include/asm/page.h
@@ -38,6 +38,7 @@ extern void clear_page(void *to);
typedef struct page *pgtable_t;
#ifdef CONFIG_HAVE_ARCH_PFN_VALID
+extern int early_region_idx;
extern int pfn_valid(unsigned long);
#endif
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 13e43ff..cd9b473 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -285,6 +285,8 @@ static void __init zone_sizes_init(unsigned long min, unsigned long max)
#endif /* CONFIG_NUMA */
#ifdef CONFIG_HAVE_ARCH_PFN_VALID
+int early_region_idx __meminitdata = -1;
+
int pfn_valid(unsigned long pfn)
{
return memblock_is_map_memory(pfn << PAGE_SHIFT);
@@ -292,31 +294,46 @@ int pfn_valid(unsigned long pfn)
EXPORT_SYMBOL(pfn_valid);
/* HAVE_MEMBLOCK is always enabled on arm64 */
-unsigned long __init_memblock memblock_next_valid_pfn(unsigned long pfn)
+unsigned long __init_memblock memblock_next_valid_pfn(unsigned long pfn,
+ int *last_idx)
{
struct memblock_type *type = &memblock.memory;
+ struct memblock_region *regions = type->regions;
unsigned int right = type->cnt;
unsigned int mid, left = 0;
+ unsigned long start_pfn, end_pfn;
phys_addr_t addr = PFN_PHYS(++pfn);
+ /* fast path, return pfn+1 if next pfn is in the same region */
+ if (*last_idx != -1) {
+ start_pfn = PFN_DOWN(regions[*last_idx].base);
+ end_pfn = PFN_DOWN(regions[*last_idx].base +
+ regions[*last_idx].size);
+
+ if (pfn >= start_pfn && pfn < end_pfn)
+ return pfn;
+ }
+
+ /* slow path, do the binary searching */
do {
mid = (right + left) / 2;
- if (addr < type->regions[mid].base)
+ if (addr < regions[mid].base)
right = mid;
- else if (addr >= (type->regions[mid].base +
- type->regions[mid].size))
+ else if (addr >= (regions[mid].base + regions[mid].size))
left = mid + 1;
else {
- /* addr is within the region, so pfn is valid */
+ *last_idx = mid;
return pfn;
}
} while (left < right);
if (right == type->cnt)
return -1UL;
- else
- return PHYS_PFN(type->regions[right].base);
+
+ *last_idx = right;
+
+ return PHYS_PFN(regions[*last_idx].base);
}
EXPORT_SYMBOL(memblock_next_valid_pfn);
#endif /*CONFIG_HAVE_ARCH_PFN_VALID*/
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 8a92df7..f99b513 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5453,8 +5453,9 @@ void __ref build_all_zonelists(pg_data_t *pgdat)
* done. Non-atomic initialization, single-pass.
*/
#if (defined CONFIG_HAVE_MEMBLOCK) && (defined CONFIG_HAVE_ARCH_PFN_VALID)
-extern unsigned long memblock_next_valid_pfn(unsigned long pfn);
-#define skip_to_last_invalid_pfn(pfn) (memblock_next_valid_pfn(pfn) - 1)
+extern unsigned long memblock_next_valid_pfn(unsigned long pfn, int *last_idx);
+#define skip_to_last_invalid_pfn(pfn) \
+ (memblock_next_valid_pfn(pfn, &early_region_idx) - 1)
#endif
#ifndef skip_to_last_invalid_pfn
--
2.7.4
This API is preparation for further optimizing early_pfn_valid().
Signed-off-by: Jia He <[email protected]>
---
include/linux/memblock.h | 2 ++
mm/memblock.c | 9 +++++++++
2 files changed, 11 insertions(+)
diff --git a/include/linux/memblock.h b/include/linux/memblock.h
index 0257aee..a0127b3 100644
--- a/include/linux/memblock.h
+++ b/include/linux/memblock.h
@@ -203,6 +203,8 @@ void __next_mem_pfn_range(int *idx, int nid, unsigned long *out_start_pfn,
i >= 0; __next_mem_pfn_range(&i, nid, p_start, p_end, p_nid))
#endif /* CONFIG_HAVE_MEMBLOCK_NODE_MAP */
+int memblock_search_pfn_regions(unsigned long pfn);
+
/**
* for_each_free_mem_range - iterate through free memblock areas
* @i: u64 used as loop variable
diff --git a/mm/memblock.c b/mm/memblock.c
index ba7c878..0f4004c 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -1617,6 +1617,15 @@ static int __init_memblock memblock_search(struct memblock_type *type, phys_addr
return -1;
}
+/* search memblock with the input pfn, return the region idx */
+int __init_memblock memblock_search_pfn_regions(unsigned long pfn)
+{
+ struct memblock_type *type = &memblock.memory;
+ int mid = memblock_search(type, PFN_PHYS(pfn));
+
+ return mid;
+}
+
bool __init memblock_is_reserved(phys_addr_t addr)
{
return memblock_search(&memblock.reserved, addr) != -1;
--
2.7.4
This is preparation for further optimizing early_pfn_valid()
on arm and arm64.
Signed-off-by: Jia He <[email protected]>
---
arch/arm/include/asm/page.h | 3 ++-
arch/arm/mm/init.c | 23 +++++++++++++++++++++++
arch/arm64/include/asm/page.h | 3 ++-
arch/arm64/mm/init.c | 23 +++++++++++++++++++++++
4 files changed, 50 insertions(+), 2 deletions(-)
diff --git a/arch/arm/include/asm/page.h b/arch/arm/include/asm/page.h
index 7a0404f..559b414 100644
--- a/arch/arm/include/asm/page.h
+++ b/arch/arm/include/asm/page.h
@@ -158,7 +158,8 @@ typedef struct page *pgtable_t;
#ifdef CONFIG_HAVE_ARCH_PFN_VALID
extern int early_region_idx;
-extern int pfn_valid(unsigned long);
+extern int pfn_valid(unsigned long pfn);
+extern int pfn_valid_region(unsigned long pfn, int *last_idx);
#endif
#include <asm/memory.h>
diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c
index 7779804..11f9b82 100644
--- a/arch/arm/mm/init.c
+++ b/arch/arm/mm/init.c
@@ -201,6 +201,29 @@ int pfn_valid(unsigned long pfn)
}
EXPORT_SYMBOL(pfn_valid);
+int pfn_valid_region(unsigned long pfn, int *last_idx)
+{
+ unsigned long start_pfn, end_pfn;
+ struct memblock_type *type = &memblock.memory;
+ struct memblock_region *regions = type->regions;
+
+ if (*last_idx != -1) {
+ start_pfn = PFN_DOWN(regions[*last_idx].base);
+ end_pfn = PFN_DOWN(regions[*last_idx].base +
+ regions[*last_idx].size);
+
+ if (pfn >= start_pfn && pfn < end_pfn)
+ return !memblock_is_nomap(&regions[*last_idx]);
+ }
+
+ *last_idx = memblock_search_pfn_regions(pfn);
+ if (*last_idx == -1)
+ return false;
+
+ return !memblock_is_nomap(&regions[*last_idx]);
+}
+EXPORT_SYMBOL(pfn_valid_region);
+
/* HAVE_MEMBLOCK is always enabled on arm */
unsigned long __init_memblock memblock_next_valid_pfn(unsigned long pfn,
int *last_idx)
diff --git a/arch/arm64/include/asm/page.h b/arch/arm64/include/asm/page.h
index 84b503a..27892d5 100644
--- a/arch/arm64/include/asm/page.h
+++ b/arch/arm64/include/asm/page.h
@@ -39,7 +39,8 @@ typedef struct page *pgtable_t;
#ifdef CONFIG_HAVE_ARCH_PFN_VALID
extern int early_region_idx;
-extern int pfn_valid(unsigned long);
+extern int pfn_valid(unsigned long pfn);
+extern int pfn_valid_region(unsigned long pfn, int *last_idx);
#endif
#include <asm/memory.h>
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index cd9b473..6dedd77 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -293,6 +293,29 @@ int pfn_valid(unsigned long pfn)
}
EXPORT_SYMBOL(pfn_valid);
+int pfn_valid_region(unsigned long pfn, int *last_idx)
+{
+ unsigned long start_pfn, end_pfn;
+ struct memblock_type *type = &memblock.memory;
+ struct memblock_region *regions = type->regions;
+
+ if (*last_idx != -1) {
+ start_pfn = PFN_DOWN(regions[*last_idx].base);
+ end_pfn = PFN_DOWN(regions[*last_idx].base +
+ regions[*last_idx].size);
+
+ if (pfn >= start_pfn && pfn < end_pfn)
+ return !memblock_is_nomap(&regions[*last_idx]);
+ }
+
+ *last_idx = memblock_search_pfn_regions(pfn);
+ if (*last_idx == -1)
+ return false;
+
+ return !memblock_is_nomap(&regions[*last_idx]);
+}
+EXPORT_SYMBOL(pfn_valid_region);
+
/* HAVE_MEMBLOCK is always enabled on arm64 */
unsigned long __init_memblock memblock_next_valid_pfn(unsigned long pfn,
int *last_idx)
--
2.7.4
Commit b92df1de5d28 ("mm: page_alloc: skip over regions of invalid pfns
where possible") optimized the loop in memmap_init_zone(). But there is
still some room for improvement. E.g. in early_pfn_valid(), if pfn and
pfn+1 are in the same memblock region, we can record the last returned
memblock region index and check whether pfn+1 is still in the same region.

Currently it only improves the performance on arm and arm64 and has no
impact on other arches.
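
The cached validity check can be modelled in userspace roughly as follows. This is a sketch under assumptions rather than the kernel code: struct sketch_region and SKETCH_NOMAP stand in for struct memblock_region and MEMBLOCK_NOMAP, SKETCH_PAGE_SHIFT assumes 4 KiB pages, and sketch_search() plays the role of memblock_search_pfn_regions().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12	/* assumption: 4 KiB pages */
#define SKETCH_NOMAP 0x1u	/* stand-in for MEMBLOCK_NOMAP */

struct sketch_region {
	uint64_t base;		/* physical start address */
	uint64_t size;		/* length in bytes */
	unsigned int flags;
};

/* Binary search: index of the region containing addr, or -1. */
static int sketch_search(const struct sketch_region *r, int cnt, uint64_t addr)
{
	int left = 0, right = cnt;

	while (left < right) {
		int mid = left + (right - left) / 2;

		if (addr < r[mid].base)
			right = mid;
		else if (addr >= r[mid].base + r[mid].size)
			left = mid + 1;
		else
			return mid;
	}
	return -1;
}

static bool sketch_pfn_valid_region(const struct sketch_region *r, int cnt,
				    uint64_t pfn, int *last_idx)
{
	uint64_t addr = pfn << SKETCH_PAGE_SHIFT;

	assert(last_idx != NULL && *last_idx >= -1 && *last_idx < cnt);

	/* same region as the previous call: no search needed */
	if (*last_idx != -1) {
		const struct sketch_region *c = &r[*last_idx];

		if (addr >= c->base && addr < c->base + c->size)
			return !(c->flags & SKETCH_NOMAP);
	}

	*last_idx = sketch_search(r, cnt, addr);
	if (*last_idx == -1)
		return false;

	return !(r[*last_idx].flags & SKETCH_NOMAP);
}
```

As in the patch, a pfn in a nomap region is reported as invalid even though the region itself is found and cached.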
Signed-off-by: Jia He <[email protected]>
---
include/linux/mmzone.h | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index d797716..b68d22c 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -1267,9 +1267,14 @@ static inline int pfn_present(unsigned long pfn)
})
#else
#define pfn_to_nid(pfn) (0)
-#endif
+#endif /*CONFIG_NUMA*/
+#ifdef CONFIG_HAVE_ARCH_PFN_VALID
+#define early_pfn_valid(pfn) pfn_valid_region(pfn, &early_region_idx)
+#else
#define early_pfn_valid(pfn) pfn_valid(pfn)
+#endif /*CONFIG_HAVE_ARCH_PFN_VALID*/
+
void sparse_init(void);
#else
#define sparse_init() do {} while (0)
--
2.7.4
On Fri, Mar 30, 2018 at 10:15 AM, Jia He <[email protected]> wrote:
> Commit b92df1de5d28 ("mm: page_alloc: skip over regions of invalid pfns
> where possible") optimized the loop in memmap_init_zone(). But it could
> cause a panic, so Daniel Vacek reverted it later.
>
> But as suggested by Daniel Vacek, it is fine to use memblock to skip
> gaps and find the next valid frame with CONFIG_HAVE_ARCH_PFN_VALID.
>
> On arm and arm64, memblock is used by default. But the generic version
> of pfn_valid() is based on mem sections, and memblock_next_valid_pfn()
> does not always return the next valid pfn for it: it can skip further,
> so that some valid frames are skipped (as if they were invalid). That
> is why the kernel was eventually crashing on some !arm machines.
>
> And as verified by Eugeniu Rosca, arm can benefit from commit
> b92df1de5d28. So retain memblock_next_valid_pfn() on arm{,64} and move
> the related code to the arm{,64} arch directories.
>
> Suggested-by: Daniel Vacek <[email protected]>
> Signed-off-by: Jia He <[email protected]>
> ---
> arch/arm/mm/init.c | 31 ++++++++++++++++++++++++++++++-
> arch/arm64/mm/init.c | 31 ++++++++++++++++++++++++++++++-
> mm/page_alloc.c | 13 ++++++++++++-
> 3 files changed, 72 insertions(+), 3 deletions(-)
>
> diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c
> index a1f11a7..0fb85ca 100644
> --- a/arch/arm/mm/init.c
> +++ b/arch/arm/mm/init.c
> @@ -198,7 +198,36 @@ int pfn_valid(unsigned long pfn)
> return memblock_is_map_memory(__pfn_to_phys(pfn));
> }
> EXPORT_SYMBOL(pfn_valid);
> -#endif
> +
> +/* HAVE_MEMBLOCK is always enabled on arm */
> +unsigned long __init_memblock memblock_next_valid_pfn(unsigned long pfn)
> +{
> + struct memblock_type *type = &memblock.memory;
> + unsigned int right = type->cnt;
> + unsigned int mid, left = 0;
> + phys_addr_t addr = PFN_PHYS(++pfn);
> +
> + do {
> + mid = (right + left) / 2;
> +
> + if (addr < type->regions[mid].base)
> + right = mid;
> + else if (addr >= (type->regions[mid].base +
> + type->regions[mid].size))
> + left = mid + 1;
> + else {
> + /* addr is within the region, so pfn is valid */
> + return pfn;
> + }
> + } while (left < right);
> +
> + if (right == type->cnt)
> + return -1UL;
> + else
> + return PHYS_PFN(type->regions[right].base);
> +}
> +EXPORT_SYMBOL(memblock_next_valid_pfn);
> +#endif /*CONFIG_HAVE_ARCH_PFN_VALID*/
>
> #ifndef CONFIG_SPARSEMEM
> static void __init arm_memory_present(void)
> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
> index 00e7b90..13e43ff 100644
> --- a/arch/arm64/mm/init.c
> +++ b/arch/arm64/mm/init.c
> @@ -290,7 +290,36 @@ int pfn_valid(unsigned long pfn)
> return memblock_is_map_memory(pfn << PAGE_SHIFT);
> }
> EXPORT_SYMBOL(pfn_valid);
> -#endif
> +
> +/* HAVE_MEMBLOCK is always enabled on arm64 */
> +unsigned long __init_memblock memblock_next_valid_pfn(unsigned long pfn)
> +{
> + struct memblock_type *type = &memblock.memory;
> + unsigned int right = type->cnt;
> + unsigned int mid, left = 0;
> + phys_addr_t addr = PFN_PHYS(++pfn);
> +
> + do {
> + mid = (right + left) / 2;
> +
> + if (addr < type->regions[mid].base)
> + right = mid;
> + else if (addr >= (type->regions[mid].base +
> + type->regions[mid].size))
> + left = mid + 1;
> + else {
> + /* addr is within the region, so pfn is valid */
> + return pfn;
> + }
> + } while (left < right);
> +
> + if (right == type->cnt)
> + return -1UL;
> + else
> + return PHYS_PFN(type->regions[right].base);
> +}
> +EXPORT_SYMBOL(memblock_next_valid_pfn);
> +#endif /*CONFIG_HAVE_ARCH_PFN_VALID*/
>
> #ifndef CONFIG_SPARSEMEM
> static void __init arm64_memory_present(void)
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index c19f5ac..8a92df7 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -5452,6 +5452,15 @@ void __ref build_all_zonelists(pg_data_t *pgdat)
> * up by free_all_bootmem() once the early boot process is
> * done. Non-atomic initialization, single-pass.
> */
> +#if (defined CONFIG_HAVE_MEMBLOCK) && (defined CONFIG_HAVE_ARCH_PFN_VALID)
> +extern unsigned long memblock_next_valid_pfn(unsigned long pfn);
> +#define skip_to_last_invalid_pfn(pfn) (memblock_next_valid_pfn(pfn) - 1)
> +#endif
> +
This should go to arch/arm{,64}/include/asm/page.h.
> +#ifndef skip_to_last_invalid_pfn
> +#define skip_to_last_invalid_pfn(pfn) (pfn)
> +#endif
And this to include/linux/mmzone.h. Something like this?
diff --git a/arch/arm/include/asm/page.h b/arch/arm/include/asm/page.h
index 4355f0ec44d6..489875cf3889 100644
--- a/arch/arm/include/asm/page.h
+++ b/arch/arm/include/asm/page.h
@@ -158,6 +158,8 @@ extern void __cpu_copy_user_highpage(struct page *to, struct page *from,
#ifdef CONFIG_HAVE_ARCH_PFN_VALID
extern int pfn_valid(unsigned long);
+extern unsigned long memblock_next_valid_pfn(unsigned long pfn);
+#define skip_to_last_invalid_pfn(pfn) (memblock_next_valid_pfn(pfn) - 1)
#endif
#include <asm/memory.h>
diff --git a/arch/arm64/include/asm/page.h b/arch/arm64/include/asm/page.h
index 60d02c81a3a2..e57d3f2e2dbd 100644
--- a/arch/arm64/include/asm/page.h
+++ b/arch/arm64/include/asm/page.h
@@ -39,6 +39,8 @@ extern void __cpu_copy_user_page(void *to, const void *from,
#ifdef CONFIG_HAVE_ARCH_PFN_VALID
extern int pfn_valid(unsigned long);
+extern unsigned long memblock_next_valid_pfn(unsigned long pfn);
+#define skip_to_last_invalid_pfn(pfn) (memblock_next_valid_pfn(pfn) - 1)
#endif
#include <asm/memory.h>
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 356a814e7c8e..40d51bab6fc0 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -1222,6 +1222,7 @@ static inline struct mem_section *__pfn_to_section(unsigned long pfn)
extern int __highest_present_section_nr;
#ifndef CONFIG_HAVE_ARCH_PFN_VALID
+#define skip_to_last_invalid_pfn(pfn) (pfn)
static inline int pfn_valid(unsigned long pfn)
{
if (pfn_to_section_nr(pfn) >= NR_MEM_SECTIONS)
--nX
> void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone,
> unsigned long start_pfn, enum memmap_context context,
> struct vmem_altmap *altmap)
> @@ -5483,8 +5492,10 @@ void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone,
> if (context != MEMMAP_EARLY)
> goto not_early;
>
> - if (!early_pfn_valid(pfn))
> + if (!early_pfn_valid(pfn)) {
> + pfn = skip_to_last_invalid_pfn(pfn);
> continue;
> + }
> if (!early_pfn_in_nid(pfn, nid))
> continue;
> if (!update_defer_init(pgdat, pfn, end_pfn, &nr_initialised))
> --
> 2.7.4
>
On Fri, Mar 30, 2018 at 10:15 AM, Jia He <[email protected]> wrote:
> This is preparation for further optimizing early_pfn_valid()
> on arm and arm64.
>
> Signed-off-by: Jia He <[email protected]>
> ---
> arch/arm/include/asm/page.h | 3 ++-
> arch/arm/mm/init.c | 23 +++++++++++++++++++++++
> arch/arm64/include/asm/page.h | 3 ++-
> arch/arm64/mm/init.c | 23 +++++++++++++++++++++++
> 4 files changed, 50 insertions(+), 2 deletions(-)
>
> diff --git a/arch/arm/include/asm/page.h b/arch/arm/include/asm/page.h
> index 7a0404f..559b414 100644
> --- a/arch/arm/include/asm/page.h
> +++ b/arch/arm/include/asm/page.h
> @@ -158,7 +158,8 @@ typedef struct page *pgtable_t;
>
> #ifdef CONFIG_HAVE_ARCH_PFN_VALID
> extern int early_region_idx;
> -extern int pfn_valid(unsigned long);
> +extern int pfn_valid(unsigned long pfn);
> +extern int pfn_valid_region(unsigned long pfn, int *last_idx);
> #endif
>
> #include <asm/memory.h>
> diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c
> index 7779804..11f9b82 100644
> --- a/arch/arm/mm/init.c
> +++ b/arch/arm/mm/init.c
> @@ -201,6 +201,29 @@ int pfn_valid(unsigned long pfn)
> }
> EXPORT_SYMBOL(pfn_valid);
>
> +int pfn_valid_region(unsigned long pfn, int *last_idx)
> +{
> + unsigned long start_pfn, end_pfn;
> + struct memblock_type *type = &memblock.memory;
> + struct memblock_region *regions = type->regions;
> +
> + if (*last_idx != -1) {
> + start_pfn = PFN_DOWN(regions[*last_idx].base);
> + end_pfn = PFN_DOWN(regions[*last_idx].base +
> + regions[*last_idx].size);
> +
> + if (pfn >= start_pfn && pfn < end_pfn)
> + return !memblock_is_nomap(&regions[*last_idx]);
> + }
> +
> + *last_idx = memblock_search_pfn_regions(pfn);
> + if (*last_idx == -1)
> + return false;
> +
> + return !memblock_is_nomap(&regions[*last_idx]);
> +}
> +EXPORT_SYMBOL(pfn_valid_region);
> +
> /* HAVE_MEMBLOCK is always enabled on arm */
> unsigned long __init_memblock memblock_next_valid_pfn(unsigned long pfn,
> int *last_idx)
Since you have both functions in the same file, can you make the
early_region_idx global static here and get rid of the arguments,
perhaps?
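
To illustrate the shape I have in mind, here is a rough userspace sketch. It is not meant as the actual kernel change: the sketch_* names and the two-entry region table are made up, regions are given directly as pfn ranges, nomap handling is left out, and a linear scan replaces the binary search to keep it short. The point is only that both helpers share one file-local index and take no pointer argument.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_NO_PFN UINT64_MAX	/* stands in for -1UL */

struct sketch_region {
	uint64_t start_pfn;	/* first pfn in the region */
	uint64_t end_pfn;	/* one past the last pfn */
};

/* Hypothetical stand-in for memblock.memory: sorted, non-overlapping. */
static const struct sketch_region sketch_regions[] = {
	{ 1, 5 },
	{ 16, 18 },
};
#define SKETCH_CNT ((int)(sizeof(sketch_regions) / sizeof(sketch_regions[0])))

/* File-local cache shared by both helpers below. */
static int sketch_region_idx = -1;

static bool sketch_in_region(int idx, uint64_t pfn)
{
	assert(idx >= 0 && idx < SKETCH_CNT);
	return pfn >= sketch_regions[idx].start_pfn &&
	       pfn < sketch_regions[idx].end_pfn;
}

static bool sketch_pfn_valid_region(uint64_t pfn)
{
	int i;

	if (sketch_region_idx != -1 && sketch_in_region(sketch_region_idx, pfn))
		return true;

	/* linear scan for brevity; the patch uses a binary search */
	for (i = 0; i < SKETCH_CNT; i++) {
		if (sketch_in_region(i, pfn)) {
			sketch_region_idx = i;
			return true;
		}
	}
	return false;
}

static uint64_t sketch_next_valid_pfn(uint64_t pfn)
{
	int i;

	pfn++;
	if (sketch_pfn_valid_region(pfn))
		return pfn;

	/* pfn is in a gap: jump to the start of the next region, if any */
	for (i = 0; i < SKETCH_CNT; i++) {
		if (sketch_regions[i].start_pfn > pfn) {
			sketch_region_idx = i;
			return sketch_regions[i].start_pfn;
		}
	}
	return SKETCH_NO_PFN;
}
```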
> diff --git a/arch/arm64/include/asm/page.h b/arch/arm64/include/asm/page.h
> index 84b503a..27892d5 100644
> --- a/arch/arm64/include/asm/page.h
> +++ b/arch/arm64/include/asm/page.h
> @@ -39,7 +39,8 @@ typedef struct page *pgtable_t;
>
> #ifdef CONFIG_HAVE_ARCH_PFN_VALID
> extern int early_region_idx;
> -extern int pfn_valid(unsigned long);
> +extern int pfn_valid(unsigned long pfn);
> +extern int pfn_valid_region(unsigned long pfn, int *last_idx);
> #endif
>
> #include <asm/memory.h>
> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
> index cd9b473..6dedd77 100644
> --- a/arch/arm64/mm/init.c
> +++ b/arch/arm64/mm/init.c
> @@ -293,6 +293,29 @@ int pfn_valid(unsigned long pfn)
> }
> EXPORT_SYMBOL(pfn_valid);
>
> +int pfn_valid_region(unsigned long pfn, int *last_idx)
> +{
> + unsigned long start_pfn, end_pfn;
> + struct memblock_type *type = &memblock.memory;
> + struct memblock_region *regions = type->regions;
> +
> + if (*last_idx != -1) {
> + start_pfn = PFN_DOWN(regions[*last_idx].base);
> + end_pfn = PFN_DOWN(regions[*last_idx].base +
> + regions[*last_idx].size);
> +
> + if (pfn >= start_pfn && pfn < end_pfn)
> + return !memblock_is_nomap(&regions[*last_idx]);
> + }
> +
> + *last_idx = memblock_search_pfn_regions(pfn);
> + if (*last_idx == -1)
> + return false;
> +
> + return !memblock_is_nomap(&regions[*last_idx]);
> +}
> +EXPORT_SYMBOL(pfn_valid_region);
> +
> /* HAVE_MEMBLOCK is always enabled on arm64 */
> unsigned long __init_memblock memblock_next_valid_pfn(unsigned long pfn,
> int *last_idx)
Ditto.
--nX
> --
> 2.7.4
>