Commit b92df1de5d28 ("mm: page_alloc: skip over regions of invalid pfns
where possible") tried to optimize the loop in memmap_init_zone(). But
there is still some room for improvement.
Patch 1 introduces a new config option to make the code more generic.
Patch 2 retains memblock_next_valid_pfn() on arm and arm64.
Patch 3 optimizes memblock_next_valid_pfn().
Patch 4~6 optimize early_pfn_valid().
As for the performance improvement: with this series, the total time
spent in memmap_init() drops from 41313 us to 24389 us on my
armv8a server (QDF2400 with 96G memory).
Without this patchset:
[ 117.113677] before memmap_init
[ 117.118195] after memmap_init
>>> memmap_init takes 4518 us
[ 117.121446] before memmap_init
[ 117.154992] after memmap_init
>>> memmap_init takes 33546 us
[ 117.158241] before memmap_init
[ 117.161490] after memmap_init
>>> memmap_init takes 3249 us
>>> totally takes 41313 us
With this patchset:
[ 123.222962] before memmap_init
[ 123.226819] after memmap_init
>>> memmap_init takes 3857 us
[ 123.230070] before memmap_init
[ 123.247354] after memmap_init
>>> memmap_init takes 17284 us
[ 123.250604] before memmap_init
[ 123.253852] after memmap_init
>>> memmap_init takes 3248 us
>>> totally takes 24389 us
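For anyone who wants to double-check the totals, the per-call figures in the two logs add up as claimed: 41313 us before and 24389 us after, a saving of 16924 us, or just under 41%. Below is a small userspace sketch of that arithmetic. It is not part of the series, and the helper names are made up for illustration.

```c
#include <stddef.h>

/* Per-call memmap_init() durations in microseconds, copied from the logs. */
static const unsigned long before_us[] = { 4518, 33546, 3249 };
static const unsigned long after_us[]  = { 3857, 17284, 3248 };

/* Sum a list of per-call durations (hypothetical helper). */
static unsigned long total_us(const unsigned long *calls, size_t n)
{
    unsigned long sum = 0;

    for (size_t i = 0; i < n; i++)
        sum += calls[i];
    return sum;
}

/* Saving as a whole-number percentage of the original total (rounds down). */
static unsigned long saving_percent(unsigned long before, unsigned long after)
{
    return (before - after) * 100 / before;
}
```

Because of the integer division, saving_percent(41313, 24389) rounds the 40.97% saving down to 40.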
The memblock region information on my server is attached below.
[ 86.956758] Zone ranges:
[ 86.959452] DMA [mem 0x0000000000200000-0x00000000ffffffff]
[ 86.966041] Normal [mem 0x0000000100000000-0x00000017ffffffff]
[ 86.972631] Movable zone start for each node
[ 86.977179] Early memory node ranges
[ 86.980985] node 0: [mem 0x0000000000200000-0x000000000021ffff]
[ 86.987666] node 0: [mem 0x0000000000820000-0x000000000307ffff]
[ 86.994348] node 0: [mem 0x0000000003080000-0x000000000308ffff]
[ 87.001029] node 0: [mem 0x0000000003090000-0x00000000031fffff]
[ 87.007710] node 0: [mem 0x0000000003200000-0x00000000033fffff]
[ 87.014392] node 0: [mem 0x0000000003410000-0x000000000563ffff]
[ 87.021073] node 0: [mem 0x0000000005640000-0x000000000567ffff]
[ 87.027754] node 0: [mem 0x0000000005680000-0x00000000056dffff]
[ 87.034435] node 0: [mem 0x00000000056e0000-0x00000000086fffff]
[ 87.041117] node 0: [mem 0x0000000008700000-0x000000000871ffff]
[ 87.047798] node 0: [mem 0x0000000008720000-0x000000000894ffff]
[ 87.054479] node 0: [mem 0x0000000008950000-0x0000000008baffff]
[ 87.061161] node 0: [mem 0x0000000008bb0000-0x0000000008bcffff]
[ 87.067842] node 0: [mem 0x0000000008bd0000-0x0000000008c4ffff]
[ 87.074524] node 0: [mem 0x0000000008c50000-0x0000000008e2ffff]
[ 87.081205] node 0: [mem 0x0000000008e30000-0x0000000008e4ffff]
[ 87.087886] node 0: [mem 0x0000000008e50000-0x0000000008fcffff]
[ 87.094568] node 0: [mem 0x0000000008fd0000-0x000000000910ffff]
[ 87.101249] node 0: [mem 0x0000000009110000-0x00000000092effff]
[ 87.107930] node 0: [mem 0x00000000092f0000-0x000000000930ffff]
[ 87.114612] node 0: [mem 0x0000000009310000-0x000000000963ffff]
[ 87.121293] node 0: [mem 0x0000000009640000-0x000000000e61ffff]
[ 87.127975] node 0: [mem 0x000000000e620000-0x000000000e64ffff]
[ 87.134657] node 0: [mem 0x000000000e650000-0x000000000fffffff]
[ 87.141338] node 0: [mem 0x0000000010800000-0x0000000017feffff]
[ 87.148019] node 0: [mem 0x000000001c000000-0x000000001c00ffff]
[ 87.154701] node 0: [mem 0x000000001c010000-0x000000001c7fffff]
[ 87.161383] node 0: [mem 0x000000001c810000-0x000000007efbffff]
[ 87.168064] node 0: [mem 0x000000007efc0000-0x000000007efdffff]
[ 87.174746] node 0: [mem 0x000000007efe0000-0x000000007efeffff]
[ 87.181427] node 0: [mem 0x000000007eff0000-0x000000007effffff]
[ 87.188108] node 0: [mem 0x000000007f000000-0x00000017ffffffff]
[ 87.194791] Initmem setup node 0 [mem 0x0000000000200000-0x00000017ffffffff]
Changelog:
V8: - introduce new config and move generic code to early_pfn.h
- optimize memblock_next_valid_pfn as suggested by Matthew Wilcox
V7: - fix i386 compilation error, refine the commit description
V6: - simplify the code, move arm/arm64 common code to one file
- refine patches as suggested by Daniel Vacek and Ard Biesheuvel
V5: - further refinement as suggested by Daniel Vacek; make the
arm/arm64 code more arch-specific
V4: - refine patches as suggested by Daniel Vacek and Wei Yang
- optimize on arm in addition to arm64
V3: - fix 2 issues reported by kbuild test robot
V2: - rebase to mmotm latest
- retain memblock_next_valid_pfn on arm64
- refine memblock_search_pfn_regions and pfn_valid_region
Jia He (6):
arm: arm64: introduce CONFIG_HAVE_MEMBLOCK_PFN_VALID
mm: page_alloc: remain memblock_next_valid_pfn() on arm/arm64
arm: arm64: page_alloc: reduce unnecessary binary search in
memblock_next_valid_pfn()
mm/memblock: introduce memblock_search_pfn_regions()
arm: arm64: introduce pfn_valid_region()
mm: page_alloc: reduce unnecessary binary search in early_pfn_valid()
arch/arm/Kconfig | 4 +++
arch/arm/mm/init.c | 1 +
arch/arm64/Kconfig | 4 +++
arch/arm64/mm/init.c | 1 +
include/linux/early_pfn.h | 79 +++++++++++++++++++++++++++++++++++++++++++++++
include/linux/memblock.h | 2 ++
include/linux/mmzone.h | 18 ++++++++++-
mm/Kconfig | 3 ++
mm/memblock.c | 9 ++++++
mm/page_alloc.c | 5 ++-
10 files changed, 124 insertions(+), 2 deletions(-)
create mode 100644 include/linux/early_pfn.h
--
2.7.4
Introduce CONFIG_HAVE_MEMBLOCK_PFN_VALID as a config option so that
memblock_next_valid_pfn() can be moved to a generic code file.
arm and arm64 can benefit from this boot-time improvement.
Signed-off-by: Jia He <[email protected]>
---
arch/arm/Kconfig | 4 ++++
arch/arm64/Kconfig | 4 ++++
mm/Kconfig | 3 +++
3 files changed, 11 insertions(+)
diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 51c8df5..77bc1bb 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -1637,6 +1637,10 @@ config ARCH_SELECT_MEMORY_MODEL
config HAVE_ARCH_PFN_VALID
def_bool ARCH_HAS_HOLES_MEMORYMODEL || !SPARSEMEM
+config HAVE_MEMBLOCK_PFN_VALID
+ def_bool y
+ depends on HAVE_ARCH_PFN_VALID
+
config HAVE_GENERIC_GUP
def_bool y
depends on ARM_LPAE
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index c9a7e9e..f374203 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -747,6 +747,10 @@ config ARCH_SELECT_MEMORY_MODEL
config HAVE_ARCH_PFN_VALID
def_bool ARCH_HAS_HOLES_MEMORYMODEL || !SPARSEMEM
+config HAVE_MEMBLOCK_PFN_VALID
+ def_bool y
+ depends on HAVE_ARCH_PFN_VALID
+
config HW_PERF_EVENTS
def_bool y
depends on ARM_PMU
diff --git a/mm/Kconfig b/mm/Kconfig
index c782e8f..c53ac38 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -137,6 +137,9 @@ config HAVE_MEMBLOCK_NODE_MAP
config HAVE_MEMBLOCK_PHYS_MAP
bool
+config HAVE_MEMBLOCK_PFN_VALID
+ bool
+
config HAVE_GENERIC_GUP
bool
--
2.7.4
Commit b92df1de5d28 ("mm: page_alloc: skip over regions of invalid pfns
where possible") optimized the loop in memmap_init_zone(), but it could
cause a panic, so Daniel Vacek later reverted it.
But as suggested by Daniel Vacek, it is fine to use memblock to skip
gaps and find the next valid frame with CONFIG_HAVE_ARCH_PFN_VALID.
On arm and arm64, memblock is used by default. But the generic version of
pfn_valid() is based on mem sections, and memblock_next_valid_pfn() does
not always return the next valid pfn; it can skip further, so that some
valid frames are skipped (as if they were invalid). That is why the
kernel eventually crashed on some !arm machines.
And as verified by Eugeniu Rosca, arm can benefit from commit
b92df1de5d28. So retain memblock_next_valid_pfn() on arm/arm64 and
move the related code to one file, include/linux/early_pfn.h.
Suggested-by: Daniel Vacek <[email protected]>
Signed-off-by: Jia He <[email protected]>
---
arch/arm/mm/init.c | 1 +
arch/arm64/mm/init.c | 1 +
include/linux/early_pfn.h | 34 ++++++++++++++++++++++++++++++++++
include/linux/mmzone.h | 11 +++++++++++
mm/page_alloc.c | 5 ++++-
5 files changed, 51 insertions(+), 1 deletion(-)
create mode 100644 include/linux/early_pfn.h
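Not part of the patch: a minimal userspace sketch of the idea, for readers who want to try the gap skipping outside the kernel. It assumes a sorted, non-overlapping region table kept in page-frame units. The names pfn_region, next_valid_pfn_model() and walk_iterations() are made up for illustration and are not kernel API.

```c
#include <stddef.h>

/* Hypothetical stand-in for a memblock region, in page-frame units. */
struct pfn_region {
    unsigned long start;    /* first pfn of the region */
    unsigned long count;    /* number of pfns in the region */
};

#define NO_VALID_PFN (~0UL)

/*
 * Return pfn + 1 if it lies inside a region, otherwise the first pfn of the
 * next region above it, or NO_VALID_PFN if there is none.
 */
static unsigned long next_valid_pfn_model(const struct pfn_region *r,
                                          size_t cnt, unsigned long pfn)
{
    size_t left = 0, right = cnt;

    pfn++;
    while (left < right) {
        size_t mid = left + (right - left) / 2;

        if (pfn < r[mid].start)
            right = mid;
        else if (pfn >= r[mid].start + r[mid].count)
            left = mid + 1;
        else
            return pfn;     /* pfn + 1 is inside region mid */
    }
    /* left == right: index of the first region that starts above pfn. */
    return right == cnt ? NO_VALID_PFN : r[right].start;
}

/* Linear validity check, kept simple on purpose. */
static int pfn_is_valid_model(const struct pfn_region *r, size_t cnt,
                              unsigned long pfn)
{
    for (size_t i = 0; i < cnt; i++)
        if (pfn >= r[i].start && pfn - r[i].start < r[i].count)
            return 1;
    return 0;
}

/* Count loop iterations needed to walk [start, end) while skipping gaps. */
static unsigned long walk_iterations(const struct pfn_region *r, size_t cnt,
                                     unsigned long start, unsigned long end)
{
    unsigned long iterations = 0;

    for (unsigned long pfn = start; pfn < end; pfn++) {
        iterations++;
        if (!pfn_is_valid_model(r, cnt, pfn)) {
            unsigned long next = next_valid_pfn_model(r, cnt, pfn);

            if (next == NO_VALID_PFN)
                break;
            pfn = next - 1; /* the loop's pfn++ lands on next */
        }
    }
    return iterations;
}
```

With two regions covering pfns 2-3 and 10-12, walking the range [0, 20) takes 8 iterations in this model instead of 20, because each gap costs a single iteration.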
diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c
index a1f11a7..de225a2 100644
--- a/arch/arm/mm/init.c
+++ b/arch/arm/mm/init.c
@@ -25,6 +25,7 @@
#include <linux/dma-contiguous.h>
#include <linux/sizes.h>
#include <linux/stop_machine.h>
+#include <linux/early_pfn.h>
#include <asm/cp15.h>
#include <asm/mach-types.h>
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 00e7b90..913c327 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -40,6 +40,7 @@
#include <linux/mm.h>
#include <linux/kexec.h>
#include <linux/crash_dump.h>
+#include <linux/early_pfn.h>
#include <asm/boot.h>
#include <asm/fixmap.h>
diff --git a/include/linux/early_pfn.h b/include/linux/early_pfn.h
new file mode 100644
index 0000000..1b001c7
--- /dev/null
+++ b/include/linux/early_pfn.h
@@ -0,0 +1,34 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright (C) 2018 HXT-semitech Corp. */
+#ifndef __EARLY_PFN_H
+#define __EARLY_PFN_H
+#ifdef CONFIG_HAVE_MEMBLOCK_PFN_VALID
+ulong __init_memblock memblock_next_valid_pfn(ulong pfn)
+{
+ struct memblock_type *type = &memblock.memory;
+ unsigned int right = type->cnt;
+ unsigned int mid, left = 0;
+ phys_addr_t addr = PFN_PHYS(++pfn);
+
+ do {
+ mid = (right + left) / 2;
+
+ if (addr < type->regions[mid].base)
+ right = mid;
+ else if (addr >= (type->regions[mid].base +
+ type->regions[mid].size))
+ left = mid + 1;
+ else {
+ /* addr is within the region, so pfn is valid */
+ return pfn;
+ }
+ } while (left < right);
+
+ if (right == type->cnt)
+ return -1UL;
+ else
+ return PHYS_PFN(type->regions[right].base);
+}
+EXPORT_SYMBOL(memblock_next_valid_pfn);
+#endif /*CONFIG_HAVE_MEMBLOCK_PFN_VALID*/
+#endif /*__EARLY_PFN_H*/
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index d797716..c40297d 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -1245,6 +1245,8 @@ static inline int pfn_valid(unsigned long pfn)
return 0;
return valid_section(__nr_to_section(pfn_to_section_nr(pfn)));
}
+
+#define next_valid_pfn(pfn) (pfn + 1)
#endif
static inline int pfn_present(unsigned long pfn)
@@ -1270,6 +1272,10 @@ static inline int pfn_present(unsigned long pfn)
#endif
#define early_pfn_valid(pfn) pfn_valid(pfn)
+#ifdef CONFIG_HAVE_MEMBLOCK_PFN_VALID
+extern ulong memblock_next_valid_pfn(ulong pfn);
+#define next_valid_pfn(pfn) memblock_next_valid_pfn(pfn)
+#endif
void sparse_init(void);
#else
#define sparse_init() do {} while (0)
@@ -1291,6 +1297,11 @@ struct mminit_pfnnid_cache {
#define early_pfn_valid(pfn) (1)
#endif
+/* fallback to default definitions */
+#ifndef next_valid_pfn
+#define next_valid_pfn(pfn) (pfn + 1)
+#endif
+
void memory_present(int nid, unsigned long start, unsigned long end);
unsigned long __init node_memmap_size_bytes(int, unsigned long, unsigned long);
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index c19f5ac..bab8e1a 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5483,8 +5483,11 @@ void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone,
if (context != MEMMAP_EARLY)
goto not_early;
- if (!early_pfn_valid(pfn))
+ if (!early_pfn_valid(pfn)) {
+ pfn = next_valid_pfn(pfn) - 1;
continue;
+ }
+
if (!early_pfn_in_nid(pfn, nid))
continue;
if (!update_defer_init(pgdat, pfn, end_pfn, &nr_initialised))
--
2.7.4
Commit b92df1de5d28 ("mm: page_alloc: skip over regions of invalid pfns
where possible") optimized the loop in memmap_init_zone(). But there is
still some room for improvement. E.g. if pfn and pfn+1 are in the same
memblock region, we can simply do pfn++ instead of doing the binary search
in memblock_next_valid_pfn(). Furthermore, if the pfn is in a *gap* between
two memory regions, skip to the next region directly if possible.
Signed-off-by: Jia He <[email protected]>
---
include/linux/early_pfn.h | 37 +++++++++++++++++++++++++++++--------
1 file changed, 29 insertions(+), 8 deletions(-)
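Not part of the patch: a userspace sketch of the two fast paths described above, under the same assumptions as a sorted, non-overlapping region table in page-frame units. The names pfn_cursor and cursor_next_valid_pfn() are hypothetical. Unlike the diff below, the sketch checks that a following region exists before reading it and only advances the cached index when the gap fast path actually hits; both are choices of this sketch, not of the patch.

```c
#include <stddef.h>

struct pfn_region {
    unsigned long start;    /* first pfn of the region */
    unsigned long count;    /* number of pfns in the region */
};

#define NO_VALID_PFN (~0UL)

/* Hypothetical cursor that remembers the last region hit; -1 means none. */
struct pfn_cursor {
    const struct pfn_region *r;
    size_t cnt;
    long idx;
    unsigned long searches; /* number of binary searches performed */
};

static unsigned long cursor_next_valid_pfn(struct pfn_cursor *c,
                                           unsigned long pfn)
{
    size_t left = 0, right = c->cnt;

    pfn++;
    if (c->idx >= 0) {
        const struct pfn_region *cur = &c->r[c->idx];
        unsigned long end = cur->start + cur->count;

        /* Fast path 1: pfn + 1 is still inside the cached region. */
        if (pfn >= cur->start && pfn < end)
            return pfn;

        /* Fast path 2: pfn + 1 is in the gap right after that region. */
        if (pfn >= end && (size_t)c->idx + 1 < c->cnt &&
            pfn <= c->r[c->idx + 1].start) {
            c->idx++;
            return c->r[c->idx].start;
        }
    }

    /* Slow path: binary search, remembering the region we land in. */
    c->searches++;
    while (left < right) {
        size_t mid = left + (right - left) / 2;

        if (pfn < c->r[mid].start) {
            right = mid;
        } else if (pfn >= c->r[mid].start + c->r[mid].count) {
            left = mid + 1;
        } else {
            c->idx = (long)mid;
            return pfn;
        }
    }
    if (right == c->cnt)
        return NO_VALID_PFN;

    c->idx = (long)right;
    return c->r[right].start;
}
```

The searches counter is there only to make the effect visible: once the cursor is primed, stepping through a region or across a single gap does not trigger a new binary search.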
diff --git a/include/linux/early_pfn.h b/include/linux/early_pfn.h
index 1b001c7..f9e40c3 100644
--- a/include/linux/early_pfn.h
+++ b/include/linux/early_pfn.h
@@ -3,31 +3,52 @@
#ifndef __EARLY_PFN_H
#define __EARLY_PFN_H
#ifdef CONFIG_HAVE_MEMBLOCK_PFN_VALID
+static int early_region_idx __init_memblock = -1;
ulong __init_memblock memblock_next_valid_pfn(ulong pfn)
{
struct memblock_type *type = &memblock.memory;
- unsigned int right = type->cnt;
- unsigned int mid, left = 0;
+ struct memblock_region *regions = type->regions;
+ uint right = type->cnt;
+ uint mid, left = 0;
+ ulong start_pfn, end_pfn, next_start_pfn;
phys_addr_t addr = PFN_PHYS(++pfn);
+ /* fast path, return pfn+1 if next pfn is in the same region */
+ if (early_region_idx != -1) {
+ start_pfn = PFN_DOWN(regions[early_region_idx].base);
+ end_pfn = PFN_DOWN(regions[early_region_idx].base +
+ regions[early_region_idx].size);
+
+ if (pfn >= start_pfn && pfn < end_pfn)
+ return pfn;
+
+ early_region_idx++;
+ next_start_pfn = PFN_DOWN(regions[early_region_idx].base);
+
+ if (pfn >= end_pfn && pfn <= next_start_pfn)
+ return next_start_pfn;
+ }
+
+ /* slow path, do the binary searching */
do {
mid = (right + left) / 2;
- if (addr < type->regions[mid].base)
+ if (addr < regions[mid].base)
right = mid;
- else if (addr >= (type->regions[mid].base +
- type->regions[mid].size))
+ else if (addr >= (regions[mid].base + regions[mid].size))
left = mid + 1;
else {
- /* addr is within the region, so pfn is valid */
+ early_region_idx = mid;
return pfn;
}
} while (left < right);
if (right == type->cnt)
return -1UL;
- else
- return PHYS_PFN(type->regions[right].base);
+
+ early_region_idx = right;
+
+ return PHYS_PFN(regions[early_region_idx].base);
}
EXPORT_SYMBOL(memblock_next_valid_pfn);
#endif /*CONFIG_HAVE_MEMBLOCK_PFN_VALID*/
--
2.7.4
This API finds the memory region index of the input pfn. With this
helper, we can improve the loop in early_pfn_valid() by recording the last
region index. If the current pfn and the last pfn are in the same memory
region, we needn't do another binary search because the
result of memblock_is_nomap() is the same for the whole memory region.
Signed-off-by: Jia He <[email protected]>
---
include/linux/memblock.h | 2 ++
mm/memblock.c | 9 +++++++++
2 files changed, 11 insertions(+)
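Not part of the patch: a userspace sketch of the lookup this helper performs, i.e. convert the pfn to a physical address and binary-search the region table for the index that contains it, returning -1 on a miss. The 4 KiB page size, the phys_region type and both function names are assumptions made for the sketch.

```c
#include <stddef.h>
#include <stdint.h>

#define MODEL_PAGE_SHIFT 12     /* assumption: 4 KiB pages */

/* Hypothetical stand-in for a memblock region, in bytes. */
struct phys_region {
    uint64_t base;
    uint64_t size;
};

/* Return the index of the region containing addr, or -1 if none does. */
static int search_region_idx(const struct phys_region *r, size_t cnt,
                             uint64_t addr)
{
    size_t left = 0, right = cnt;

    while (left < right) {
        size_t mid = left + (right - left) / 2;

        if (addr < r[mid].base)
            right = mid;
        else if (addr >= r[mid].base + r[mid].size)
            left = mid + 1;
        else
            return (int)mid;
    }
    return -1;
}

/* pfn-based wrapper: shift the pfn up to a physical address first. */
static int search_pfn_region_idx(const struct phys_region *r, size_t cnt,
                                 unsigned long pfn)
{
    return search_region_idx(r, cnt, (uint64_t)pfn << MODEL_PAGE_SHIFT);
}
```

Using the first two ranges from the cover letter's region dump (0x200000-0x21ffff and 0x820000-0x307ffff), pfn 0x21f maps to index 0, pfn 0x220 falls in the hole and yields -1, and pfn 0x820 maps to index 1.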
diff --git a/include/linux/memblock.h b/include/linux/memblock.h
index 0257aee..a0127b3 100644
--- a/include/linux/memblock.h
+++ b/include/linux/memblock.h
@@ -203,6 +203,8 @@ void __next_mem_pfn_range(int *idx, int nid, unsigned long *out_start_pfn,
i >= 0; __next_mem_pfn_range(&i, nid, p_start, p_end, p_nid))
#endif /* CONFIG_HAVE_MEMBLOCK_NODE_MAP */
+int memblock_search_pfn_regions(unsigned long pfn);
+
/**
* for_each_free_mem_range - iterate through free memblock areas
* @i: u64 used as loop variable
diff --git a/mm/memblock.c b/mm/memblock.c
index ba7c878..0f4004c 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -1617,6 +1617,15 @@ static int __init_memblock memblock_search(struct memblock_type *type, phys_addr
return -1;
}
+/* search memblock with the input pfn, return the region idx */
+int __init_memblock memblock_search_pfn_regions(unsigned long pfn)
+{
+ struct memblock_type *type = &memblock.memory;
+ int mid = memblock_search(type, PFN_PHYS(pfn));
+
+ return mid;
+}
+
bool __init memblock_is_reserved(phys_addr_t addr)
{
return memblock_search(&memblock.reserved, addr) != -1;
--
2.7.4
Commit b92df1de5d28 ("mm: page_alloc: skip over regions of invalid pfns
where possible") optimized the loop in memmap_init_zone(). But there is
still some room for improvement. E.g. in early_pfn_valid(), we can record
the last returned memblock region. If the current pfn and the last pfn are
in the same memory region, we needn't do another binary search because
memblock_is_nomap() gives the same result for the whole memory region.
Signed-off-by: Jia He <[email protected]>
---
include/linux/early_pfn.h | 24 ++++++++++++++++++++++++
1 file changed, 24 insertions(+)
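Not part of the patch: a userspace sketch of the caching idea, assuming a sorted region table in page-frame units where each region carries a nomap flag. The names flagged_region, valid_cache and pfn_valid_region_model() are hypothetical. The cache is passed in explicitly here instead of living in a file-scope variable, which is a simplification of the sketch.

```c
#include <stddef.h>

/* Hypothetical region with a nomap flag, in page-frame units. */
struct flagged_region {
    unsigned long start;
    unsigned long count;
    int nomap;
};

/* Last region index that matched (-1 if none) plus a search counter. */
struct valid_cache {
    long idx;
    unsigned long searches;
};

static int pfn_valid_region_model(const struct flagged_region *r, size_t cnt,
                                  struct valid_cache *c, unsigned long pfn)
{
    size_t left = 0, right = cnt;

    /* Same region as last time: the nomap answer cannot have changed. */
    if (c->idx >= 0) {
        const struct flagged_region *cur = &r[c->idx];

        if (pfn >= cur->start && pfn - cur->start < cur->count)
            return !cur->nomap;
    }

    /* Otherwise fall back to a binary search and refresh the cache. */
    c->searches++;
    c->idx = -1;
    while (left < right) {
        size_t mid = left + (right - left) / 2;

        if (pfn < r[mid].start) {
            right = mid;
        } else if (pfn >= r[mid].start + r[mid].count) {
            left = mid + 1;
        } else {
            c->idx = (long)mid;
            return !r[mid].nomap;
        }
    }
    return 0;   /* pfn is in a hole */
}
```

Walking pfns 0 to 11 over three regions (0-3 mapped, 4-5 nomap, 8-11 mapped) reports 8 valid pfns with 5 binary searches in this model, where an uncached lookup would search 12 times.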
diff --git a/include/linux/early_pfn.h b/include/linux/early_pfn.h
index f9e40c3..9609391 100644
--- a/include/linux/early_pfn.h
+++ b/include/linux/early_pfn.h
@@ -51,5 +51,29 @@ ulong __init_memblock memblock_next_valid_pfn(ulong pfn)
return PHYS_PFN(regions[early_region_idx].base);
}
EXPORT_SYMBOL(memblock_next_valid_pfn);
+
+int pfn_valid_region(ulong pfn)
+{
+ ulong start_pfn, end_pfn;
+ struct memblock_type *type = &memblock.memory;
+ struct memblock_region *regions = type->regions;
+
+ if (early_region_idx != -1) {
+ start_pfn = PFN_DOWN(regions[early_region_idx].base);
+ end_pfn = PFN_DOWN(regions[early_region_idx].base +
+ regions[early_region_idx].size);
+
+ if (pfn >= start_pfn && pfn < end_pfn)
+ return !memblock_is_nomap(
+ &regions[early_region_idx]);
+ }
+
+ early_region_idx = memblock_search_pfn_regions(pfn);
+ if (early_region_idx == -1)
+ return false;
+
+ return !memblock_is_nomap(&regions[early_region_idx]);
+}
+EXPORT_SYMBOL(pfn_valid_region);
#endif /*CONFIG_HAVE_MEMBLOCK_PFN_VALID*/
#endif /*__EARLY_PFN_H*/
--
2.7.4
Commit b92df1de5d28 ("mm: page_alloc: skip over regions of invalid pfns
where possible") optimized the loop in memmap_init_zone(). But there is
still some room for improvement. E.g. in early_pfn_valid(), if pfn and
pfn+1 are in the same memblock region, we can record the last returned
memblock region index and check whether pfn+1 is still in the same
region.
Currently this only improves performance on arm/arm64 and has no
impact on other arches.
As for the performance improvement: with this series, the total time
spent in memmap_init() drops from 41313 us to 24389 us on my
armv8a server (QDF2400 with 96G memory).
Signed-off-by: Jia He <[email protected]>
---
include/linux/mmzone.h | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index c40297d..426db40 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -1271,11 +1271,16 @@ static inline int pfn_present(unsigned long pfn)
#define pfn_to_nid(pfn) (0)
#endif
-#define early_pfn_valid(pfn) pfn_valid(pfn)
#ifdef CONFIG_HAVE_MEMBLOCK_PFN_VALID
extern ulong memblock_next_valid_pfn(ulong pfn);
#define next_valid_pfn(pfn) memblock_next_valid_pfn(pfn)
-#endif
+
+extern int pfn_valid_region(ulong pfn);
+#define early_pfn_valid(pfn) pfn_valid_region(pfn)
+#else
+#define early_pfn_valid(pfn) pfn_valid(pfn)
+#endif /*CONFIG_HAVE_MEMBLOCK_PFN_VALID*/
+
void sparse_init(void);
#else
#define sparse_init() do {} while (0)
--
2.7.4
Ping
Sorry if I am a little bit verbose, but this series does indeed speed up
arm64 boot time.
--
Cheers,
Jia
On 4/11/2018 3:21 PM, Jia He Wrote:
> Commit b92df1de5d28 ("mm: page_alloc: skip over regions of invalid pfns
> where possible") tried to optimize the loop in memmap_init_zone(). But
> there is still some room for improvement.
On Fri, May 4, 2018 at 4:45 AM, Jia He <[email protected]> wrote:
> Ping
>
> Sorry if I am a little bit verbose, but it can speedup the arm64 booting
> time indeed.
I'm wondering, wouldn't simply enabling config
DEFERRED_STRUCT_PAGE_INIT provide an even better speed-up? If that is the
case then it seems like this series is not needed at all, right?
I am not sure why this config is optional. It looks like it could be
enabled by default or even unconditionally, considering that with
commit c9e97a1997fb ("mm: initialize pages on demand during boot") the
deferred code is statically disabled after all the pages are
initialized.
--nX
>
> --
> Cheers,
> Jia
>
>
> On 4/11/2018 3:21 PM, Jia He Wrote:
>>
>> Commit b92df1de5d28 ("mm: page_alloc: skip over regions of invalid pfns
>> where possible") tried to optimize the loop in memmap_init_zone(). But
>> there is still some room for improvement.
>
> I'm wondering, wouldn't simply enabling config
> DEFERRED_STRUCT_PAGE_INIT provide an even better speed-up? If that is the
> case then it seems like this series is not needed at all, right?
> I am not sure why this config is optional. It looks like it could be
> enabled by default or even unconditionally, considering that with
> commit c9e97a1997fb ("mm: initialize pages on demand during boot") the
> deferred code is statically disabled after all the pages are
> initialized.
Hi Daniel,
Currently, deferred struct pages are initialized in parallel only on NUMA machines. I would like to make a change to use all the available CPUs even on single-socket systems, but that is not there yet. So, I believe Jia's performance improvements are still relevant.
Thank you,
Pavel
On Fri, May 4, 2018 at 6:53 PM, Pavel Tatashin
<[email protected]> wrote:
>> I'm wondering, wouldn't simply enabling config
>> DEFERRED_STRUCT_PAGE_INIT provide an even better speed-up? If that is the
>> case then it seems like this series is not needed at all, right?
>> I am not sure why this config is optional. It looks like it could be
>> enabled by default or even unconditionally, considering that with
>> commit c9e97a1997fb ("mm: initialize pages on demand during boot") the
>> deferred code is statically disabled after all the pages are
>> initialized.
>
> Hi Daniel,
>
> Currently, deferred struct pages are initialized in parallel only on NUMA machines. I would like to make a change to use all the available CPUs even on single-socket systems, but that is not there yet. So, I believe Jia's performance improvements are still relevant.
Ahaa, I thought it also works on UP or single node systems. I didn't
study the code closely. Sorry about the noise. And thank you, Pavel.
You're right.
--nX
> Thank you,
> Pavel
On 5/5/2018 12:53 AM, Pavel Tatashin Wrote:
>> I'm wondering, wouldn't simply enabling config
>> DEFERRED_STRUCT_PAGE_INIT provide an even better speed-up? If that is the
>> case then it seems like this series is not needed at all, right?
>> I am not sure why this config is optional. It looks like it could be
>> enabled by default or even unconditionally, considering that with
>> commit c9e97a1997fb ("mm: initialize pages on demand during boot") the
>> deferred code is statically disabled after all the pages are
>> initialized.
> Hi Daniel,
>
> Currently, deferred struct pages are initialized in parallel only on NUMA machines. I would like to make a change to use all the available CPUs even on single-socket systems, but that is not there yet. So, I believe Jia's performance improvements are still relevant.
Thanks for the information. I checked the config on my armv8a server:
DEFERRED_STRUCT_PAGE_INIT has not been enabled yet, and my server is
single socket.
Cheers.
Jia