2020-11-01 17:06:53

by Mike Rapoport

[permalink] [raw]
Subject: [PATCH v2 00/13] arch, mm: deprecate DISCONTIGMEM

From: Mike Rapoport <[email protected]>

Hi,

It's been a while since DISCONTIGMEM is generally considered deprecated,
but it is still used by four architectures. This set replaces DISCONTIGMEM
with a different way to handle holes in the memory map and marks
DISCONTIGMEM configuration as BROKEN in Kconfigs of these architectures with
the intention to completely remove it in several releases.

While for 64-bit alpha and ia64 the switch to SPARSEMEM is quite obvious
and was a matter of moving some bits around, for smaller 32-bit arc and
m68k SPARSEMEM is not necessarily the best thing to do.

On 32-bit machines SPARSEMEM would require large sections to make section
index fit in the page flags, but larger sections mean that more memory is
wasted for unused memory map.

Besides, pfn_to_page() and page_to_pfn() become less efficient, at least on
arc.

So I've decided to generalize arm's approach for freeing of unused parts of
the memory map with FLATMEM and enable it for both arc and m68k. The
details are in the description of patches 10 (arc) and 13 (m68k).

v2 changes:
- remove additional stale '#ifdef CONFIG_SINGLE_MEMORY_CHUNK' on m68k
- fix bisectability on m68k

v1:
https://lore.kernel.org/lkml/[email protected]

Mike Rapoport (13):
alpha: switch from DISCONTIGMEM to SPARSEMEM
ia64: remove custom __early_pfn_to_nid()
ia64: remove 'ifdef CONFIG_ZONE_DMA32' statements
ia64: discontig: paging_init(): remove local max_pfn calculation
ia64: split virtual map initialization out of paging_init()
ia64: forbid using VIRTUAL_MEM_MAP with FLATMEM
ia64: make SPARSEMEM default and disable DISCONTIGMEM
arm: remove CONFIG_ARCH_HAS_HOLES_MEMORYMODEL
arm, arm64: move free_unused_memmap() to generic mm
arc: use FLATMEM with freeing of unused memory map instead of
DISCONTIGMEM
m68k/mm: make node data and node setup depend on CONFIG_DISCONTIGMEM
m68k/mm: enable use of generic memory_model.h for !DISCONTIGMEM
m68k: deprecate DISCONTIGMEM

Documentation/vm/memory-model.rst | 3 +-
arch/Kconfig | 3 ++
arch/alpha/Kconfig | 8 +++
arch/alpha/include/asm/mmzone.h | 14 +----
arch/alpha/include/asm/page.h | 7 +--
arch/alpha/include/asm/pgtable.h | 12 ++---
arch/alpha/include/asm/sparsemem.h | 18 +++++++
arch/alpha/kernel/setup.c | 1 +
arch/arc/Kconfig | 3 +-
arch/arc/include/asm/page.h | 20 ++++++--
arch/arc/mm/init.c | 29 ++++++++---
arch/arm/Kconfig | 10 +---
arch/arm/mach-bcm/Kconfig | 1 -
arch/arm/mach-davinci/Kconfig | 1 -
arch/arm/mach-exynos/Kconfig | 1 -
arch/arm/mach-highbank/Kconfig | 1 -
arch/arm/mach-omap2/Kconfig | 1 -
arch/arm/mach-s5pv210/Kconfig | 1 -
arch/arm/mach-tango/Kconfig | 1 -
arch/arm/mm/init.c | 78 ----------------------------
arch/arm64/Kconfig | 4 +-
arch/arm64/mm/init.c | 68 ------------------------
arch/ia64/Kconfig | 11 ++--
arch/ia64/include/asm/meminit.h | 2 -
arch/ia64/mm/contig.c | 58 ++++++++++-----------
arch/ia64/mm/discontig.c | 44 ++++++++--------
arch/ia64/mm/init.c | 14 -----
arch/ia64/mm/numa.c | 30 -----------
arch/m68k/Kconfig.cpu | 31 +++++++++--
arch/m68k/include/asm/page.h | 2 +
arch/m68k/include/asm/page_mm.h | 7 ++-
arch/m68k/include/asm/virtconvert.h | 5 --
arch/m68k/mm/init.c | 8 +--
fs/proc/kcore.c | 2 -
include/linux/mm.h | 3 --
include/linux/mmzone.h | 42 ---------------
mm/memblock.c | 80 +++++++++++++++++++++++++++++
mm/mmzone.c | 14 -----
mm/page_alloc.c | 16 ++++--
mm/vmstat.c | 4 --
40 files changed, 270 insertions(+), 388 deletions(-)
create mode 100644 arch/alpha/include/asm/sparsemem.h

base-commit: 3650b228f83adda7e5ee532e2b90429c03f7b9ec
--
2.28.0


2020-11-01 17:07:01

by Mike Rapoport

[permalink] [raw]
Subject: [PATCH v2 01/13] alpha: switch from DISCONTIGMEM to SPARSEMEM

From: Mike Rapoport <[email protected]>

Enable SPARSEMEM support on alpha and deprecate DISCONTIGMEM.

The required changes are mostly around moving duplicated definitions of
page access and address conversion macros to a common place and making sure
they are available for all memory models.

The DISCONTINGMEM support is marked as BROKEN an will be removed in a
couple of releases.

Signed-off-by: Mike Rapoport <[email protected]>
---
arch/alpha/Kconfig | 8 ++++++++
arch/alpha/include/asm/mmzone.h | 14 ++------------
arch/alpha/include/asm/page.h | 7 ++++---
arch/alpha/include/asm/pgtable.h | 12 +++++-------
arch/alpha/include/asm/sparsemem.h | 18 ++++++++++++++++++
arch/alpha/kernel/setup.c | 1 +
6 files changed, 38 insertions(+), 22 deletions(-)
create mode 100644 arch/alpha/include/asm/sparsemem.h

diff --git a/arch/alpha/Kconfig b/arch/alpha/Kconfig
index d6e9fc7a7b19..aedf5c296f13 100644
--- a/arch/alpha/Kconfig
+++ b/arch/alpha/Kconfig
@@ -40,6 +40,7 @@ config ALPHA
select CPU_NO_EFFICIENT_FFS if !ALPHA_EV67
select MMU_GATHER_NO_RANGE
select SET_FS
+ select SPARSEMEM_EXTREME if SPARSEMEM
help
The Alpha is a 64-bit general-purpose processor designed and
marketed by the Digital Equipment Corporation of blessed memory,
@@ -551,12 +552,19 @@ config NR_CPUS

config ARCH_DISCONTIGMEM_ENABLE
bool "Discontiguous Memory Support"
+ depends on BROKEN
help
Say Y to support efficient handling of discontiguous physical memory,
for architectures which are either NUMA (Non-Uniform Memory Access)
or have huge holes in the physical address space for other reasons.
See <file:Documentation/vm/numa.rst> for more.

+config ARCH_SPARSEMEM_ENABLE
+ bool "Sparse Memory Support"
+ help
+ Say Y to support efficient handling of discontiguous physical memory,
+ for systems that have huge holes in the physical address space.
+
config NUMA
bool "NUMA Support (EXPERIMENTAL)"
depends on DISCONTIGMEM && BROKEN
diff --git a/arch/alpha/include/asm/mmzone.h b/arch/alpha/include/asm/mmzone.h
index 9b521c857436..86644604d977 100644
--- a/arch/alpha/include/asm/mmzone.h
+++ b/arch/alpha/include/asm/mmzone.h
@@ -6,6 +6,8 @@
#ifndef _ASM_MMZONE_H_
#define _ASM_MMZONE_H_

+#ifdef CONFIG_DISCONTIGMEM
+
#include <asm/smp.h>

/*
@@ -45,8 +47,6 @@ PLAT_NODE_DATA_LOCALNR(unsigned long p, int n)
}
#endif

-#ifdef CONFIG_DISCONTIGMEM
-
/*
* Following are macros that each numa implementation must define.
*/
@@ -68,11 +68,6 @@ PLAT_NODE_DATA_LOCALNR(unsigned long p, int n)
/* XXX: FIXME -- nyc */
#define kern_addr_valid(kaddr) (0)

-#define virt_to_page(kaddr) pfn_to_page(__pa(kaddr) >> PAGE_SHIFT)
-
-#define pmd_page(pmd) (pfn_to_page(pmd_val(pmd) >> 32))
-#define pte_pfn(pte) (pte_val(pte) >> 32)
-
#define mk_pte(page, pgprot) \
({ \
pte_t pte; \
@@ -95,16 +90,11 @@ PLAT_NODE_DATA_LOCALNR(unsigned long p, int n)
__xx; \
})

-#define page_to_pa(page) \
- (page_to_pfn(page) << PAGE_SHIFT)
-
#define pfn_to_nid(pfn) pa_to_nid(((u64)(pfn) << PAGE_SHIFT))
#define pfn_valid(pfn) \
(((pfn) - node_start_pfn(pfn_to_nid(pfn))) < \
node_spanned_pages(pfn_to_nid(pfn))) \

-#define virt_addr_valid(kaddr) pfn_valid((__pa(kaddr) >> PAGE_SHIFT))
-
#endif /* CONFIG_DISCONTIGMEM */

#endif /* _ASM_MMZONE_H_ */
diff --git a/arch/alpha/include/asm/page.h b/arch/alpha/include/asm/page.h
index e241bd88880f..268f99b4602b 100644
--- a/arch/alpha/include/asm/page.h
+++ b/arch/alpha/include/asm/page.h
@@ -83,12 +83,13 @@ typedef struct page *pgtable_t;

#define __pa(x) ((unsigned long) (x) - PAGE_OFFSET)
#define __va(x) ((void *)((unsigned long) (x) + PAGE_OFFSET))
-#ifndef CONFIG_DISCONTIGMEM
+
#define virt_to_page(kaddr) pfn_to_page(__pa(kaddr) >> PAGE_SHIFT)
+#define virt_addr_valid(kaddr) pfn_valid((__pa(kaddr) >> PAGE_SHIFT))

+#ifdef CONFIG_FLATMEM
#define pfn_valid(pfn) ((pfn) < max_mapnr)
-#define virt_addr_valid(kaddr) pfn_valid(__pa(kaddr) >> PAGE_SHIFT)
-#endif /* CONFIG_DISCONTIGMEM */
+#endif /* CONFIG_FLATMEM */

#include <asm-generic/memory_model.h>
#include <asm-generic/getorder.h>
diff --git a/arch/alpha/include/asm/pgtable.h b/arch/alpha/include/asm/pgtable.h
index 660b14ce1317..8d856c62e22a 100644
--- a/arch/alpha/include/asm/pgtable.h
+++ b/arch/alpha/include/asm/pgtable.h
@@ -203,10 +203,10 @@ extern unsigned long __zero_page(void);
* Conversion functions: convert a page and protection to a page entry,
* and a page entry and page directory to the page they refer to.
*/
-#ifndef CONFIG_DISCONTIGMEM
-#define page_to_pa(page) (((page) - mem_map) << PAGE_SHIFT)
-
+#define page_to_pa(page) (page_to_pfn(page) << PAGE_SHIFT)
#define pte_pfn(pte) (pte_val(pte) >> 32)
+
+#ifndef CONFIG_DISCONTIGMEM
#define pte_page(pte) pfn_to_page(pte_pfn(pte))
#define mk_pte(page, pgprot) \
({ \
@@ -236,10 +236,8 @@ pmd_page_vaddr(pmd_t pmd)
return ((pmd_val(pmd) & _PFN_MASK) >> (32-PAGE_SHIFT)) + PAGE_OFFSET;
}

-#ifndef CONFIG_DISCONTIGMEM
-#define pmd_page(pmd) (mem_map + ((pmd_val(pmd) & _PFN_MASK) >> 32))
-#define pud_page(pud) (mem_map + ((pud_val(pud) & _PFN_MASK) >> 32))
-#endif
+#define pmd_page(pmd) (pfn_to_page(pmd_val(pmd) >> 32))
+#define pud_page(pud) (pfn_to_page(pud_val(pud) >> 32))

extern inline unsigned long pud_page_vaddr(pud_t pgd)
{ return PAGE_OFFSET + ((pud_val(pgd) & _PFN_MASK) >> (32-PAGE_SHIFT)); }
diff --git a/arch/alpha/include/asm/sparsemem.h b/arch/alpha/include/asm/sparsemem.h
new file mode 100644
index 000000000000..a0820fd2d4b1
--- /dev/null
+++ b/arch/alpha/include/asm/sparsemem.h
@@ -0,0 +1,18 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_ALPHA_SPARSEMEM_H
+#define _ASM_ALPHA_SPARSEMEM_H
+
+#ifdef CONFIG_SPARSEMEM
+
+#define SECTION_SIZE_BITS 27
+
+/*
+ * According to "Alpha Architecture Reference Manual" physical
+ * addresses are at most 48 bits.
+ * https://download.majix.org/dec/alpha_arch_ref.pdf
+ */
+#define MAX_PHYSMEM_BITS 48
+
+#endif /* CONFIG_SPARSEMEM */
+
+#endif /* _ASM_ALPHA_SPARSEMEM_H */
diff --git a/arch/alpha/kernel/setup.c b/arch/alpha/kernel/setup.c
index 916e42d74a86..03dda3beb3bd 100644
--- a/arch/alpha/kernel/setup.c
+++ b/arch/alpha/kernel/setup.c
@@ -648,6 +648,7 @@ setup_arch(char **cmdline_p)
/* Find our memory. */
setup_memory(kernel_end);
memblock_set_bottom_up(true);
+ sparse_init();

/* First guess at cpu cache sizes. Do this before init_arch. */
determine_cpu_caches(cpu->type);
--
2.28.0

2020-11-01 17:07:08

by Mike Rapoport

[permalink] [raw]
Subject: [PATCH v2 02/13] ia64: remove custom __early_pfn_to_nid()

From: Mike Rapoport <[email protected]>

The ia64 implementation of __early_pfn_to_nid() essentially relies on the
same data as the generic implementation.

The correspondence between memory ranges and nodes is set in memblock
during early memory initialization in register_active_ranges() function.

The initialization of sparsemem that requires early_pfn_to_nid() happens
later and it can use the memblock information like the other architectures.

Signed-off-by: Mike Rapoport <[email protected]>
---
arch/ia64/Kconfig | 3 ---
arch/ia64/mm/numa.c | 30 ------------------------------
include/linux/mm.h | 3 ---
include/linux/mmzone.h | 11 -----------
mm/page_alloc.c | 16 ++++++++++++----
5 files changed, 12 insertions(+), 51 deletions(-)

diff --git a/arch/ia64/Kconfig b/arch/ia64/Kconfig
index 39b25a5a591b..12aae706cb27 100644
--- a/arch/ia64/Kconfig
+++ b/arch/ia64/Kconfig
@@ -342,9 +342,6 @@ config HOLES_IN_ZONE
bool
default y if VIRTUAL_MEM_MAP

-config HAVE_ARCH_EARLY_PFN_TO_NID
- def_bool NUMA && SPARSEMEM
-
config HAVE_ARCH_NODEDATA_EXTENSION
def_bool y
depends on NUMA
diff --git a/arch/ia64/mm/numa.c b/arch/ia64/mm/numa.c
index f34964271101..46b6e5f3a40f 100644
--- a/arch/ia64/mm/numa.c
+++ b/arch/ia64/mm/numa.c
@@ -58,36 +58,6 @@ paddr_to_nid(unsigned long paddr)
EXPORT_SYMBOL(paddr_to_nid);

#if defined(CONFIG_SPARSEMEM) && defined(CONFIG_NUMA)
-/*
- * Because of holes evaluate on section limits.
- * If the section of memory exists, then return the node where the section
- * resides. Otherwise return node 0 as the default. This is used by
- * SPARSEMEM to allocate the SPARSEMEM sectionmap on the NUMA node where
- * the section resides.
- */
-int __meminit __early_pfn_to_nid(unsigned long pfn,
- struct mminit_pfnnid_cache *state)
-{
- int i, section = pfn >> PFN_SECTION_SHIFT, ssec, esec;
-
- if (section >= state->last_start && section < state->last_end)
- return state->last_nid;
-
- for (i = 0; i < num_node_memblks; i++) {
- ssec = node_memblk[i].start_paddr >> PA_SECTION_SHIFT;
- esec = (node_memblk[i].start_paddr + node_memblk[i].size +
- ((1L << PA_SECTION_SHIFT) - 1)) >> PA_SECTION_SHIFT;
- if (section >= ssec && section < esec) {
- state->last_start = ssec;
- state->last_end = esec;
- state->last_nid = node_memblk[i].nid;
- return node_memblk[i].nid;
- }
- }
-
- return -1;
-}
-
void numa_clear_node(int cpu)
{
unmap_cpu_from_node(cpu, NUMA_NO_NODE);
diff --git a/include/linux/mm.h b/include/linux/mm.h
index ef360fe70aaf..ac51b07b9021 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2433,9 +2433,6 @@ static inline int early_pfn_to_nid(unsigned long pfn)
#else
/* please see mm/page_alloc.c */
extern int __meminit early_pfn_to_nid(unsigned long pfn);
-/* there is a per-arch backend function. */
-extern int __meminit __early_pfn_to_nid(unsigned long pfn,
- struct mminit_pfnnid_cache *state);
#endif

extern void set_dma_reserve(unsigned long new_dma_reserve);
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index fb3bf696c05e..876600a6e891 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -1428,17 +1428,6 @@ void sparse_init(void);
#define subsection_map_init(_pfn, _nr_pages) do {} while (0)
#endif /* CONFIG_SPARSEMEM */

-/*
- * During memory init memblocks map pfns to nids. The search is expensive and
- * this caches recent lookups. The implementation of __early_pfn_to_nid
- * may treat start/end as pfns or sections.
- */
-struct mminit_pfnnid_cache {
- unsigned long last_start;
- unsigned long last_end;
- int last_nid;
-};
-
/*
* If it is possible to have holes within a MAX_ORDER_NR_PAGES, then we
* need to check pfn validity within that MAX_ORDER_NR_PAGES block.
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 23f5066bd4a5..1fdbf8da77af 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1558,14 +1558,23 @@ void __free_pages_core(struct page *page, unsigned int order)

#ifdef CONFIG_NEED_MULTIPLE_NODES

-static struct mminit_pfnnid_cache early_pfnnid_cache __meminitdata;
+/*
+ * During memory init memblocks map pfns to nids. The search is expensive and
+ * this caches recent lookups. The implementation of __early_pfn_to_nid
+ * treats start/end as pfns.
+ */
+struct mminit_pfnnid_cache {
+ unsigned long last_start;
+ unsigned long last_end;
+ int last_nid;
+};

-#ifndef CONFIG_HAVE_ARCH_EARLY_PFN_TO_NID
+static struct mminit_pfnnid_cache early_pfnnid_cache __meminitdata;

/*
* Required by SPARSEMEM. Given a PFN, return what node the PFN is on.
*/
-int __meminit __early_pfn_to_nid(unsigned long pfn,
+static int __meminit __early_pfn_to_nid(unsigned long pfn,
struct mminit_pfnnid_cache *state)
{
unsigned long start_pfn, end_pfn;
@@ -1583,7 +1592,6 @@ int __meminit __early_pfn_to_nid(unsigned long pfn,

return nid;
}
-#endif /* CONFIG_HAVE_ARCH_EARLY_PFN_TO_NID */

int __meminit early_pfn_to_nid(unsigned long pfn)
{
--
2.28.0

2020-11-01 17:07:21

by Mike Rapoport

[permalink] [raw]
Subject: [PATCH v2 03/13] ia64: remove 'ifdef CONFIG_ZONE_DMA32' statements

From: Mike Rapoport <[email protected]>

After the removal of SN2 platform (commit cf07cb1ff4ea ("ia64: remove
support for the SGI SN2 platform") IA-64 always has ZONE_DMA32 and there is
no point to guard code with this configuration option.

Remove ifdefery associated with CONFIG_ZONE_DMA32

Signed-off-by: Mike Rapoport <[email protected]>
---
arch/ia64/mm/contig.c | 2 --
arch/ia64/mm/discontig.c | 2 --
2 files changed, 4 deletions(-)

diff --git a/arch/ia64/mm/contig.c b/arch/ia64/mm/contig.c
index e30e360beef8..2491aaeca90c 100644
--- a/arch/ia64/mm/contig.c
+++ b/arch/ia64/mm/contig.c
@@ -177,10 +177,8 @@ paging_init (void)
unsigned long max_zone_pfns[MAX_NR_ZONES];

memset(max_zone_pfns, 0, sizeof(max_zone_pfns));
-#ifdef CONFIG_ZONE_DMA32
max_dma = virt_to_phys((void *) MAX_DMA_ADDRESS) >> PAGE_SHIFT;
max_zone_pfns[ZONE_DMA32] = max_dma;
-#endif
max_zone_pfns[ZONE_NORMAL] = max_low_pfn;

#ifdef CONFIG_VIRTUAL_MEM_MAP
diff --git a/arch/ia64/mm/discontig.c b/arch/ia64/mm/discontig.c
index dbe829fc5298..d255596f52c6 100644
--- a/arch/ia64/mm/discontig.c
+++ b/arch/ia64/mm/discontig.c
@@ -621,9 +621,7 @@ void __init paging_init(void)
}

memset(max_zone_pfns, 0, sizeof(max_zone_pfns));
-#ifdef CONFIG_ZONE_DMA32
max_zone_pfns[ZONE_DMA32] = max_dma;
-#endif
max_zone_pfns[ZONE_NORMAL] = max_pfn;
free_area_init(max_zone_pfns);

--
2.28.0

2020-11-01 17:07:23

by Mike Rapoport

[permalink] [raw]
Subject: [PATCH v2 04/13] ia64: discontig: paging_init(): remove local max_pfn calculation

From: Mike Rapoport <[email protected]>

The maximal PFN in the system is calculated during find_memory() time and
it is stored at max_low_pfn then.

Use this value in paging_init() and remove the redundant detection of
max_pfn in that function.

Signed-off-by: Mike Rapoport <[email protected]>
---
arch/ia64/mm/discontig.c | 5 +----
1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/arch/ia64/mm/discontig.c b/arch/ia64/mm/discontig.c
index d255596f52c6..f41dcf75887b 100644
--- a/arch/ia64/mm/discontig.c
+++ b/arch/ia64/mm/discontig.c
@@ -594,7 +594,6 @@ void __init paging_init(void)
{
unsigned long max_dma;
unsigned long pfn_offset = 0;
- unsigned long max_pfn = 0;
int node;
unsigned long max_zone_pfns[MAX_NR_ZONES];

@@ -616,13 +615,11 @@ void __init paging_init(void)
#ifdef CONFIG_VIRTUAL_MEM_MAP
NODE_DATA(node)->node_mem_map = vmem_map + pfn_offset;
#endif
- if (mem_data[node].max_pfn > max_pfn)
- max_pfn = mem_data[node].max_pfn;
}

memset(max_zone_pfns, 0, sizeof(max_zone_pfns));
max_zone_pfns[ZONE_DMA32] = max_dma;
- max_zone_pfns[ZONE_NORMAL] = max_pfn;
+ max_zone_pfns[ZONE_NORMAL] = max_low_pfn;
free_area_init(max_zone_pfns);

zero_page_memmap_ptr = virt_to_page(ia64_imva(empty_zero_page));
--
2.28.0

2020-11-01 17:08:01

by Mike Rapoport

[permalink] [raw]
Subject: [PATCH v2 07/13] ia64: make SPARSEMEM default and disable DISCONTIGMEM

From: Mike Rapoport <[email protected]>

SPARSEMEM memory model suitable for systems with large holes in their
phyiscal memory layout. With SPARSEMEM_VMEMMAP enabled it provides
pfn_to_page() and page_to_pfn() as fast as FLATMEM.

Make it the default memory model for IA-64 and disable DISCONTIGMEM which
is considered obsolete for quite some time.

Signed-off-by: Mike Rapoport <[email protected]>
---
arch/ia64/Kconfig | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/ia64/Kconfig b/arch/ia64/Kconfig
index 83de0273d474..6e67d6110249 100644
--- a/arch/ia64/Kconfig
+++ b/arch/ia64/Kconfig
@@ -288,6 +288,7 @@ config ARCH_SELECT_MEMORY_MODEL

config ARCH_DISCONTIGMEM_ENABLE
def_bool y
+ depends on BROKEN
help
Say Y to support efficient handling of discontiguous physical memory,
for architectures which are either NUMA (Non-Uniform Memory Access)
@@ -299,12 +300,11 @@ config ARCH_FLATMEM_ENABLE

config ARCH_SPARSEMEM_ENABLE
def_bool y
- depends on ARCH_DISCONTIGMEM_ENABLE
select SPARSEMEM_VMEMMAP_ENABLE

-config ARCH_DISCONTIGMEM_DEFAULT
+config ARCH_SPARSEMEM_DEFAULT
def_bool y
- depends on ARCH_DISCONTIGMEM_ENABLE
+ depends on ARCH_SPARSEMEM_ENABLE

config NUMA
bool "NUMA support"
--
2.28.0

2020-11-01 17:08:03

by Mike Rapoport

[permalink] [raw]
Subject: [PATCH v2 08/13] arm: remove CONFIG_ARCH_HAS_HOLES_MEMORYMODEL

From: Mike Rapoport <[email protected]>

ARM is the only architecture that defines CONFIG_ARCH_HAS_HOLES_MEMORYMODEL
which in turn enables memmap_valid_within() function that is intended to
verify existence of struct page associated with a pfn when there are holes
in the memory map.

However, the ARCH_HAS_HOLES_MEMORYMODEL also enables HAVE_ARCH_PFN_VALID
and arch-specific pfn_valid() implementation that also deals with the holes
in the memory map.

The only two users of memmap_valid_within() call this function after
a call to pfn_valid() so the memmap_valid_within() check becomes redundant.

Remove CONFIG_ARCH_HAS_HOLES_MEMORYMODEL and memmap_valid_within() and rely
entirely on ARM's implementation of pfn_valid() that is now enabled
unconditionally.

Signed-off-by: Mike Rapoport <[email protected]>
---
Documentation/vm/memory-model.rst | 3 +--
arch/arm/Kconfig | 8 ++------
arch/arm/mach-bcm/Kconfig | 1 -
arch/arm/mach-davinci/Kconfig | 1 -
arch/arm/mach-exynos/Kconfig | 1 -
arch/arm/mach-highbank/Kconfig | 1 -
arch/arm/mach-omap2/Kconfig | 1 -
arch/arm/mach-s5pv210/Kconfig | 1 -
arch/arm/mach-tango/Kconfig | 1 -
fs/proc/kcore.c | 2 --
include/linux/mmzone.h | 31 -------------------------------
mm/mmzone.c | 14 --------------
mm/vmstat.c | 4 ----
13 files changed, 3 insertions(+), 66 deletions(-)

diff --git a/Documentation/vm/memory-model.rst b/Documentation/vm/memory-model.rst
index 9daadf9faba1..ce398a7dc6cd 100644
--- a/Documentation/vm/memory-model.rst
+++ b/Documentation/vm/memory-model.rst
@@ -51,8 +51,7 @@ call :c:func:`free_area_init` function. Yet, the mappings array is not
usable until the call to :c:func:`memblock_free_all` that hands all the
memory to the page allocator.

-If an architecture enables `CONFIG_ARCH_HAS_HOLES_MEMORYMODEL` option,
-it may free parts of the `mem_map` array that do not cover the
+An architecture may free parts of the `mem_map` array that do not cover the
actual physical pages. In such case, the architecture specific
:c:func:`pfn_valid` implementation should take the holes in the
`mem_map` into account.
diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index fe2f17eb2b50..83adc46c1e67 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -25,7 +25,7 @@ config ARM
select ARCH_HAS_TICK_BROADCAST if GENERIC_CLOCKEVENTS_BROADCAST
select ARCH_HAVE_CUSTOM_GPIO_H
select ARCH_HAS_GCOV_PROFILE_ALL
- select ARCH_KEEP_MEMBLOCK if HAVE_ARCH_PFN_VALID || KEXEC
+ select ARCH_KEEP_MEMBLOCK
select ARCH_MIGHT_HAVE_PC_PARPORT
select ARCH_NO_SG_CHAIN if !ARM_HAS_SG_CHAIN
select ARCH_OPTIONAL_KERNEL_RWX if ARCH_HAS_STRICT_KERNEL_RWX
@@ -519,7 +519,6 @@ config ARCH_S3C24XX
config ARCH_OMAP1
bool "TI OMAP1"
depends on MMU
- select ARCH_HAS_HOLES_MEMORYMODEL
select ARCH_OMAP
select CLKDEV_LOOKUP
select CLKSRC_MMIO
@@ -1479,9 +1478,6 @@ config OABI_COMPAT
UNPREDICTABLE (in fact it can be predicted that it won't work
at all). If in doubt say N.

-config ARCH_HAS_HOLES_MEMORYMODEL
- bool
-
config ARCH_SELECT_MEMORY_MODEL
bool

@@ -1493,7 +1489,7 @@ config ARCH_SPARSEMEM_ENABLE
select SPARSEMEM_STATIC if SPARSEMEM

config HAVE_ARCH_PFN_VALID
- def_bool ARCH_HAS_HOLES_MEMORYMODEL || !SPARSEMEM
+ def_bool y

config HIGHMEM
bool "High Memory Support"
diff --git a/arch/arm/mach-bcm/Kconfig b/arch/arm/mach-bcm/Kconfig
index ae790908fc74..9b594ae98153 100644
--- a/arch/arm/mach-bcm/Kconfig
+++ b/arch/arm/mach-bcm/Kconfig
@@ -211,7 +211,6 @@ config ARCH_BRCMSTB
select BCM7038_L1_IRQ
select BRCMSTB_L2_IRQ
select BCM7120_L2_IRQ
- select ARCH_HAS_HOLES_MEMORYMODEL
select ZONE_DMA if ARM_LPAE
select SOC_BRCMSTB
select SOC_BUS
diff --git a/arch/arm/mach-davinci/Kconfig b/arch/arm/mach-davinci/Kconfig
index f56ff8c24043..de11030748d0 100644
--- a/arch/arm/mach-davinci/Kconfig
+++ b/arch/arm/mach-davinci/Kconfig
@@ -5,7 +5,6 @@ menuconfig ARCH_DAVINCI
depends on ARCH_MULTI_V5
select DAVINCI_TIMER
select ZONE_DMA
- select ARCH_HAS_HOLES_MEMORYMODEL
select PM_GENERIC_DOMAINS if PM
select PM_GENERIC_DOMAINS_OF if PM && OF
select REGMAP_MMIO
diff --git a/arch/arm/mach-exynos/Kconfig b/arch/arm/mach-exynos/Kconfig
index d2d249706ebb..56d272967fc0 100644
--- a/arch/arm/mach-exynos/Kconfig
+++ b/arch/arm/mach-exynos/Kconfig
@@ -8,7 +8,6 @@
menuconfig ARCH_EXYNOS
bool "Samsung Exynos"
depends on ARCH_MULTI_V7
- select ARCH_HAS_HOLES_MEMORYMODEL
select ARCH_SUPPORTS_BIG_ENDIAN
select ARM_AMBA
select ARM_GIC
diff --git a/arch/arm/mach-highbank/Kconfig b/arch/arm/mach-highbank/Kconfig
index 1bc68913d62c..9de38ce8124f 100644
--- a/arch/arm/mach-highbank/Kconfig
+++ b/arch/arm/mach-highbank/Kconfig
@@ -2,7 +2,6 @@
config ARCH_HIGHBANK
bool "Calxeda ECX-1000/2000 (Highbank/Midway)"
depends on ARCH_MULTI_V7
- select ARCH_HAS_HOLES_MEMORYMODEL
select ARCH_SUPPORTS_BIG_ENDIAN
select ARM_AMBA
select ARM_ERRATA_764369 if SMP
diff --git a/arch/arm/mach-omap2/Kconfig b/arch/arm/mach-omap2/Kconfig
index 3ee7bdff86b2..89fe1572c142 100644
--- a/arch/arm/mach-omap2/Kconfig
+++ b/arch/arm/mach-omap2/Kconfig
@@ -94,7 +94,6 @@ config SOC_DRA7XX
config ARCH_OMAP2PLUS
bool
select ARCH_HAS_BANDGAP
- select ARCH_HAS_HOLES_MEMORYMODEL
select ARCH_HAS_RESET_CONTROLLER
select ARCH_OMAP
select CLKSRC_MMIO
diff --git a/arch/arm/mach-s5pv210/Kconfig b/arch/arm/mach-s5pv210/Kconfig
index 95d4e8284866..d644b45bc29d 100644
--- a/arch/arm/mach-s5pv210/Kconfig
+++ b/arch/arm/mach-s5pv210/Kconfig
@@ -8,7 +8,6 @@
config ARCH_S5PV210
bool "Samsung S5PV210/S5PC110"
depends on ARCH_MULTI_V7
- select ARCH_HAS_HOLES_MEMORYMODEL
select ARM_VIC
select CLKSRC_SAMSUNG_PWM
select COMMON_CLK_SAMSUNG
diff --git a/arch/arm/mach-tango/Kconfig b/arch/arm/mach-tango/Kconfig
index 25b2fd434861..a9eeda36aeb1 100644
--- a/arch/arm/mach-tango/Kconfig
+++ b/arch/arm/mach-tango/Kconfig
@@ -3,7 +3,6 @@ config ARCH_TANGO
bool "Sigma Designs Tango4 (SMP87xx)"
depends on ARCH_MULTI_V7
# Cortex-A9 MPCore r3p0, PL310 r3p2
- select ARCH_HAS_HOLES_MEMORYMODEL
select ARM_ERRATA_754322
select ARM_ERRATA_764369 if SMP
select ARM_ERRATA_775420
diff --git a/fs/proc/kcore.c b/fs/proc/kcore.c
index e502414b3556..4d2e64e9016c 100644
--- a/fs/proc/kcore.c
+++ b/fs/proc/kcore.c
@@ -193,8 +193,6 @@ kclist_add_private(unsigned long pfn, unsigned long nr_pages, void *arg)
return 1;

p = pfn_to_page(pfn);
- if (!memmap_valid_within(pfn, p, page_zone(p)))
- return 1;

ent = kmalloc(sizeof(*ent), GFP_KERNEL);
if (!ent)
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 876600a6e891..7385871768d4 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -1440,37 +1440,6 @@ void sparse_init(void);
#define pfn_valid_within(pfn) (1)
#endif

-#ifdef CONFIG_ARCH_HAS_HOLES_MEMORYMODEL
-/*
- * pfn_valid() is meant to be able to tell if a given PFN has valid memmap
- * associated with it or not. This means that a struct page exists for this
- * pfn. The caller cannot assume the page is fully initialized in general.
- * Hotplugable pages might not have been onlined yet. pfn_to_online_page()
- * will ensure the struct page is fully online and initialized. Special pages
- * (e.g. ZONE_DEVICE) are never onlined and should be treated accordingly.
- *
- * In FLATMEM, it is expected that holes always have valid memmap as long as
- * there is valid PFNs either side of the hole. In SPARSEMEM, it is assumed
- * that a valid section has a memmap for the entire section.
- *
- * However, an ARM, and maybe other embedded architectures in the future
- * free memmap backing holes to save memory on the assumption the memmap is
- * never used. The page_zone linkages are then broken even though pfn_valid()
- * returns true. A walker of the full memmap must then do this additional
- * check to ensure the memmap they are looking at is sane by making sure
- * the zone and PFN linkages are still valid. This is expensive, but walkers
- * of the full memmap are extremely rare.
- */
-bool memmap_valid_within(unsigned long pfn,
- struct page *page, struct zone *zone);
-#else
-static inline bool memmap_valid_within(unsigned long pfn,
- struct page *page, struct zone *zone)
-{
- return true;
-}
-#endif /* CONFIG_ARCH_HAS_HOLES_MEMORYMODEL */
-
#endif /* !__GENERATING_BOUNDS.H */
#endif /* !__ASSEMBLY__ */
#endif /* _LINUX_MMZONE_H */
diff --git a/mm/mmzone.c b/mm/mmzone.c
index 4686fdc23bb9..f337831affc2 100644
--- a/mm/mmzone.c
+++ b/mm/mmzone.c
@@ -72,20 +72,6 @@ struct zoneref *__next_zones_zonelist(struct zoneref *z,
return z;
}

-#ifdef CONFIG_ARCH_HAS_HOLES_MEMORYMODEL
-bool memmap_valid_within(unsigned long pfn,
- struct page *page, struct zone *zone)
-{
- if (page_to_pfn(page) != pfn)
- return false;
-
- if (page_zone(page) != zone)
- return false;
-
- return true;
-}
-#endif /* CONFIG_ARCH_HAS_HOLES_MEMORYMODEL */
-
void lruvec_init(struct lruvec *lruvec)
{
enum lru_list lru;
diff --git a/mm/vmstat.c b/mm/vmstat.c
index 698bc0bc18d1..e292e63afebf 100644
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -1503,10 +1503,6 @@ static void pagetypeinfo_showblockcount_print(struct seq_file *m,
if (!page)
continue;

- /* Watch for unexpected holes punched in the memmap */
- if (!memmap_valid_within(pfn, page, zone))
- continue;
-
if (page_zone(page) != zone)
continue;

--
2.28.0

2020-11-01 17:08:11

by Mike Rapoport

[permalink] [raw]
Subject: [PATCH v2 09/13] arm, arm64: move free_unused_memmap() to generic mm

From: Mike Rapoport <[email protected]>

ARM and ARM64 free unused parts of the memory map just before the
initialization of the page allocator. To allow holes in the memory map both
architectures overload pfn_valid() and define HAVE_ARCH_PFN_VALID.

Allowing holes in the memory map for FLATMEM may be useful for small
machines, such as ARC and m68k and will enable those architectures to cease
using DISCONTIGMEM and still support more than one memory bank.

Move the functions that free unused memory map to generic mm and enable
them in case HAVE_ARCH_PFN_VALID=y.

Signed-off-by: Mike Rapoport <[email protected]>
---
arch/Kconfig | 3 ++
arch/arm/Kconfig | 4 +--
arch/arm/mm/init.c | 78 ------------------------------------------
arch/arm64/Kconfig | 4 +--
arch/arm64/mm/init.c | 68 -------------------------------------
mm/memblock.c | 80 ++++++++++++++++++++++++++++++++++++++++++++
6 files changed, 85 insertions(+), 152 deletions(-)

diff --git a/arch/Kconfig b/arch/Kconfig
index 56b6ccc0e32d..d715da18a8a9 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -1028,6 +1028,9 @@ config HAVE_STATIC_CALL_INLINE
bool
depends on HAVE_STATIC_CALL

+config HAVE_ARCH_PFN_VALID
+ bool
+
source "kernel/gcov/Kconfig"

source "scripts/gcc-plugins/Kconfig"
diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 83adc46c1e67..495d42c5ecec 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -68,6 +68,7 @@ config ARM
select HAVE_ARCH_JUMP_LABEL if !XIP_KERNEL && !CPU_ENDIAN_BE32 && MMU
select HAVE_ARCH_KGDB if !CPU_ENDIAN_BE32 && MMU
select HAVE_ARCH_MMAP_RND_BITS if MMU
+ select HAVE_ARCH_PFN_VALID
select HAVE_ARCH_SECCOMP
select HAVE_ARCH_SECCOMP_FILTER if AEABI && !OABI_COMPAT
select HAVE_ARCH_THREAD_STRUCT_WHITELIST
@@ -1488,9 +1489,6 @@ config ARCH_SPARSEMEM_ENABLE
bool
select SPARSEMEM_STATIC if SPARSEMEM

-config HAVE_ARCH_PFN_VALID
- def_bool y
-
config HIGHMEM
bool "High Memory Support"
depends on MMU
diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c
index d57112a276f5..a04ac5ea7641 100644
--- a/arch/arm/mm/init.c
+++ b/arch/arm/mm/init.c
@@ -267,83 +267,6 @@ static inline void poison_init_mem(void *s, size_t count)
*p++ = 0xe7fddef0;
}

-static inline void __init
-free_memmap(unsigned long start_pfn, unsigned long end_pfn)
-{
- struct page *start_pg, *end_pg;
- phys_addr_t pg, pgend;
-
- /*
- * Convert start_pfn/end_pfn to a struct page pointer.
- */
- start_pg = pfn_to_page(start_pfn - 1) + 1;
- end_pg = pfn_to_page(end_pfn - 1) + 1;
-
- /*
- * Convert to physical addresses, and
- * round start upwards and end downwards.
- */
- pg = PAGE_ALIGN(__pa(start_pg));
- pgend = __pa(end_pg) & PAGE_MASK;
-
- /*
- * If there are free pages between these,
- * free the section of the memmap array.
- */
- if (pg < pgend)
- memblock_free_early(pg, pgend - pg);
-}
-
-/*
- * The mem_map array can get very big. Free the unused area of the memory map.
- */
-static void __init free_unused_memmap(void)
-{
- unsigned long start, end, prev_end = 0;
- int i;
-
- /*
- * This relies on each bank being in address order.
- * The banks are sorted previously in bootmem_init().
- */
- for_each_mem_pfn_range(i, MAX_NUMNODES, &start, &end, NULL) {
-#ifdef CONFIG_SPARSEMEM
- /*
- * Take care not to free memmap entries that don't exist
- * due to SPARSEMEM sections which aren't present.
- */
- start = min(start,
- ALIGN(prev_end, PAGES_PER_SECTION));
-#else
- /*
- * Align down here since the VM subsystem insists that the
- * memmap entries are valid from the bank start aligned to
- * MAX_ORDER_NR_PAGES.
- */
- start = round_down(start, MAX_ORDER_NR_PAGES);
-#endif
- /*
- * If we had a previous bank, and there is a space
- * between the current bank and the previous, free it.
- */
- if (prev_end && prev_end < start)
- free_memmap(prev_end, start);
-
- /*
- * Align up here since the VM subsystem insists that the
- * memmap entries are valid from the bank end aligned to
- * MAX_ORDER_NR_PAGES.
- */
- prev_end = ALIGN(end, MAX_ORDER_NR_PAGES);
- }
-
-#ifdef CONFIG_SPARSEMEM
- if (!IS_ALIGNED(prev_end, PAGES_PER_SECTION))
- free_memmap(prev_end,
- ALIGN(prev_end, PAGES_PER_SECTION));
-#endif
-}
-
static void __init free_highpages(void)
{
#ifdef CONFIG_HIGHMEM
@@ -385,7 +308,6 @@ void __init mem_init(void)
set_max_mapnr(pfn_to_page(max_pfn) - mem_map);

/* this will put all unused low memory onto the freelists */
- free_unused_memmap();
memblock_free_all();

#ifdef CONFIG_SA1111
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index f858c352f72a..b7e6a0c09d12 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -138,6 +138,7 @@ config ARM64
select HAVE_ARCH_KGDB
select HAVE_ARCH_MMAP_RND_BITS
select HAVE_ARCH_MMAP_RND_COMPAT_BITS if COMPAT
+ select HAVE_ARCH_PFN_VALID
select HAVE_ARCH_PREL32_RELOCATIONS
select HAVE_ARCH_SECCOMP_FILTER
select HAVE_ARCH_STACKLEAK
@@ -1021,9 +1022,6 @@ config ARCH_SELECT_MEMORY_MODEL
config ARCH_FLATMEM_ENABLE
def_bool !NUMA

-config HAVE_ARCH_PFN_VALID
- def_bool y
-
config HW_PERF_EVENTS
def_bool y
depends on ARM_PMU
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 095540667f0f..3d8328277bec 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -430,71 +430,6 @@ void __init bootmem_init(void)
memblock_dump_all();
}

-#ifndef CONFIG_SPARSEMEM_VMEMMAP
-static inline void free_memmap(unsigned long start_pfn, unsigned long end_pfn)
-{
- struct page *start_pg, *end_pg;
- unsigned long pg, pgend;
-
- /*
- * Convert start_pfn/end_pfn to a struct page pointer.
- */
- start_pg = pfn_to_page(start_pfn - 1) + 1;
- end_pg = pfn_to_page(end_pfn - 1) + 1;
-
- /*
- * Convert to physical addresses, and round start upwards and end
- * downwards.
- */
- pg = (unsigned long)PAGE_ALIGN(__pa(start_pg));
- pgend = (unsigned long)__pa(end_pg) & PAGE_MASK;
-
- /*
- * If there are free pages between these, free the section of the
- * memmap array.
- */
- if (pg < pgend)
- memblock_free(pg, pgend - pg);
-}
-
-/*
- * The mem_map array can get very big. Free the unused area of the memory map.
- */
-static void __init free_unused_memmap(void)
-{
- unsigned long start, end, prev_end = 0;
- int i;
-
- for_each_mem_pfn_range(i, MAX_NUMNODES, &start, &end, NULL) {
-#ifdef CONFIG_SPARSEMEM
- /*
- * Take care not to free memmap entries that don't exist due
- * to SPARSEMEM sections which aren't present.
- */
- start = min(start, ALIGN(prev_end, PAGES_PER_SECTION));
-#endif
- /*
- * If we had a previous bank, and there is a space between the
- * current bank and the previous, free it.
- */
- if (prev_end && prev_end < start)
- free_memmap(prev_end, start);
-
- /*
- * Align up here since the VM subsystem insists that the
- * memmap entries are valid from the bank end aligned to
- * MAX_ORDER_NR_PAGES.
- */
- prev_end = ALIGN(end, MAX_ORDER_NR_PAGES);
- }
-
-#ifdef CONFIG_SPARSEMEM
- if (!IS_ALIGNED(prev_end, PAGES_PER_SECTION))
- free_memmap(prev_end, ALIGN(prev_end, PAGES_PER_SECTION));
-#endif
-}
-#endif /* !CONFIG_SPARSEMEM_VMEMMAP */
-
/*
* mem_init() marks the free areas in the mem_map and tells us how much memory
* is free. This is done after various parts of the system have claimed their
@@ -510,9 +445,6 @@ void __init mem_init(void)

set_max_mapnr(max_pfn - PHYS_PFN_OFFSET);

-#ifndef CONFIG_SPARSEMEM_VMEMMAP
- free_unused_memmap();
-#endif
/* this will put all unused low memory onto the freelists */
memblock_free_all();

diff --git a/mm/memblock.c b/mm/memblock.c
index b68ee86788af..049df4163a97 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -1926,6 +1926,85 @@ static int __init early_memblock(char *p)
}
early_param("memblock", early_memblock);

+static void __init free_memmap(unsigned long start_pfn, unsigned long end_pfn)
+{
+ struct page *start_pg, *end_pg;
+ phys_addr_t pg, pgend;
+
+ /*
+ * Convert start_pfn/end_pfn to a struct page pointer.
+ */
+ start_pg = pfn_to_page(start_pfn - 1) + 1;
+ end_pg = pfn_to_page(end_pfn - 1) + 1;
+
+ /*
+ * Convert to physical addresses, and round start upwards and end
+ * downwards.
+ */
+ pg = PAGE_ALIGN(__pa(start_pg));
+ pgend = __pa(end_pg) & PAGE_MASK;
+
+ /*
+ * If there are free pages between these, free the section of the
+ * memmap array.
+ */
+ if (pg < pgend)
+ memblock_free(pg, pgend - pg);
+}
+
+/*
+ * The mem_map array can get very big. Free the unused area of the memory map.
+ */
+static void __init free_unused_memmap(void)
+{
+ unsigned long start, end, prev_end = 0;
+ int i;
+
+ if (!IS_ENABLED(CONFIG_HAVE_ARCH_PFN_VALID) ||
+ IS_ENABLED(CONFIG_SPARSEMEM_VMEMMAP))
+ return;
+
+ /*
+ * This relies on each bank being in address order.
+ * The banks are sorted previously in bootmem_init().
+ */
+ for_each_mem_pfn_range(i, MAX_NUMNODES, &start, &end, NULL) {
+#ifdef CONFIG_SPARSEMEM
+ /*
+ * Take care not to free memmap entries that don't exist
+ * due to SPARSEMEM sections which aren't present.
+ */
+ start = min(start, ALIGN(prev_end, PAGES_PER_SECTION));
+#else
+ /*
+ * Align down here since the VM subsystem insists that the
+ * memmap entries are valid from the bank start aligned to
+ * MAX_ORDER_NR_PAGES.
+ */
+ start = round_down(start, MAX_ORDER_NR_PAGES);
+#endif
+
+ /*
+ * If we had a previous bank, and there is a space
+ * between the current bank and the previous, free it.
+ */
+ if (prev_end && prev_end < start)
+ free_memmap(prev_end, start);
+
+ /*
+ * Align up here since the VM subsystem insists that the
+ * memmap entries are valid from the bank end aligned to
+ * MAX_ORDER_NR_PAGES.
+ */
+ prev_end = ALIGN(end, MAX_ORDER_NR_PAGES);
+ }
+
+#ifdef CONFIG_SPARSEMEM
+ if (!IS_ALIGNED(prev_end, PAGES_PER_SECTION))
+ free_memmap(prev_end, ALIGN(prev_end, PAGES_PER_SECTION));
+#endif
+}
+
static void __init __free_pages_memory(unsigned long start, unsigned long end)
{
int order;
@@ -2012,6 +2091,7 @@ unsigned long __init memblock_free_all(void)
{
unsigned long pages;

+ free_unused_memmap();
reset_all_zones_managed_pages();

pages = free_low_memory_core_early();
--
2.28.0

2020-11-01 17:08:43

by Mike Rapoport

[permalink] [raw]
Subject: [PATCH v2 11/13] m68k/mm: make node data and node setup depend on CONFIG_DISCONTIGMEM

From: Mike Rapoport <[email protected]>

The pg_data_t node structures and their initialization currently depends on
!CONFIG_SINGLE_MEMORY_CHUNK. Since they are required only for DISCONTIGMEM
make this dependency explicit and replace usage of
CONFIG_SINGLE_MEMORY_CHUNK with CONFIG_DISCONTIGMEM where appropriate.

The CONFIG_SINGLE_MEMORY_CHUNK was implicitly disabled on the ColdFire MMU
variant, although it always presumed a single memory bank. As there is no
actual need for DISCONTIGMEM in this case, make sure that ColdFire MMU
systems set CONFIG_SINGLE_MEMORY_CHUNK to 'y'.

Signed-off-by: Mike Rapoport <[email protected]>
---
arch/m68k/Kconfig.cpu | 4 ++--
arch/m68k/include/asm/page_mm.h | 2 +-
arch/m68k/include/asm/virtconvert.h | 2 +-
arch/m68k/mm/init.c | 4 ++--
4 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/arch/m68k/Kconfig.cpu b/arch/m68k/Kconfig.cpu
index 694c4fca9f5d..e8ad721e52f6 100644
--- a/arch/m68k/Kconfig.cpu
+++ b/arch/m68k/Kconfig.cpu
@@ -373,7 +373,7 @@ config RMW_INSNS
config SINGLE_MEMORY_CHUNK
bool "Use one physical chunk of memory only" if ADVANCED && !SUN3
depends on MMU
- default y if SUN3
+ default y if SUN3 || MMU_COLDFIRE
select NEED_MULTIPLE_NODES
help
Ignore all but the first contiguous chunk of physical memory for VM
@@ -406,7 +406,7 @@ config M68K_L2_CACHE
config NODES_SHIFT
int
default "3"
- depends on !SINGLE_MEMORY_CHUNK
+ depends on DISCONTIGMEM

config CPU_HAS_NO_BITFIELDS
bool
diff --git a/arch/m68k/include/asm/page_mm.h b/arch/m68k/include/asm/page_mm.h
index e6b75992192b..0e794051d3bb 100644
--- a/arch/m68k/include/asm/page_mm.h
+++ b/arch/m68k/include/asm/page_mm.h
@@ -126,7 +126,7 @@ static inline void *__va(unsigned long x)

extern int m68k_virt_to_node_shift;

-#ifdef CONFIG_SINGLE_MEMORY_CHUNK
+#ifndef CONFIG_DISCONTIGMEM
#define __virt_to_node(addr) (&pg_data_map[0])
#else
extern struct pglist_data *pg_data_table[];
diff --git a/arch/m68k/include/asm/virtconvert.h b/arch/m68k/include/asm/virtconvert.h
index dfe43083b579..eb9eb5cb23a6 100644
--- a/arch/m68k/include/asm/virtconvert.h
+++ b/arch/m68k/include/asm/virtconvert.h
@@ -29,7 +29,7 @@ static inline void *phys_to_virt(unsigned long address)
}

/* Permanent address of a page. */
-#if defined(CONFIG_MMU) && defined(CONFIG_SINGLE_MEMORY_CHUNK)
+#if defined(CONFIG_MMU) && !defined(CONFIG_DISCONTIGMEM)
#define page_to_phys(page) \
__pa(PAGE_OFFSET + (((page) - pg_data_map[0].node_mem_map) << PAGE_SHIFT))
#else
diff --git a/arch/m68k/mm/init.c b/arch/m68k/mm/init.c
index 53040857a9ed..4b46ceace3d3 100644
--- a/arch/m68k/mm/init.c
+++ b/arch/m68k/mm/init.c
@@ -47,14 +47,14 @@ EXPORT_SYMBOL(pg_data_map);

int m68k_virt_to_node_shift;

-#ifndef CONFIG_SINGLE_MEMORY_CHUNK
+#ifdef CONFIG_DISCONTIGMEM
pg_data_t *pg_data_table[65];
EXPORT_SYMBOL(pg_data_table);
#endif

void __init m68k_setup_node(int node)
{
-#ifndef CONFIG_SINGLE_MEMORY_CHUNK
+#ifdef CONFIG_DISCONTIGMEM
struct m68k_mem_info *info = m68k_memory + node;
int i, end;

--
2.28.0

2020-11-01 17:09:31

by Mike Rapoport

[permalink] [raw]
Subject: [PATCH v2 05/13] ia64: split virtual map initialization out of paging_init()

From: Mike Rapoport <[email protected]>

For both FLATMEM and DISCONTIGMEM/SPARSEMEM the virtual map initialization
is spread over paging_init() for no good reason.

Split out the bits related to virtual map initialization to a helper
functions, one for FLATMEM and another for !FLATMEM configurations.

Signed-off-by: Mike Rapoport <[email protected]>
---
arch/ia64/mm/contig.c | 34 ++++++++++++++++++++--------------
arch/ia64/mm/discontig.c | 37 ++++++++++++++++++++-----------------
2 files changed, 40 insertions(+), 31 deletions(-)

diff --git a/arch/ia64/mm/contig.c b/arch/ia64/mm/contig.c
index 2491aaeca90c..ba81d8cb0059 100644
--- a/arch/ia64/mm/contig.c
+++ b/arch/ia64/mm/contig.c
@@ -166,21 +166,8 @@ find_memory (void)
alloc_per_cpu_data();
}

-/*
- * Set up the page tables.
- */
-
-void __init
-paging_init (void)
+static void __init virtual_map_init(void)
{
- unsigned long max_dma;
- unsigned long max_zone_pfns[MAX_NR_ZONES];
-
- memset(max_zone_pfns, 0, sizeof(max_zone_pfns));
- max_dma = virt_to_phys((void *) MAX_DMA_ADDRESS) >> PAGE_SHIFT;
- max_zone_pfns[ZONE_DMA32] = max_dma;
- max_zone_pfns[ZONE_NORMAL] = max_low_pfn;
-
#ifdef CONFIG_VIRTUAL_MEM_MAP
efi_memmap_walk(find_largest_hole, (u64 *)&max_gap);
if (max_gap < LARGE_GAP) {
@@ -206,6 +193,25 @@ paging_init (void)
printk("Virtual mem_map starts at 0x%p\n", mem_map);
}
#endif /* !CONFIG_VIRTUAL_MEM_MAP */
+}
+
+/*
+ * Set up the page tables.
+ */
+
+void __init
+paging_init (void)
+{
+ unsigned long max_dma;
+ unsigned long max_zone_pfns[MAX_NR_ZONES];
+
+ memset(max_zone_pfns, 0, sizeof(max_zone_pfns));
+ max_dma = virt_to_phys((void *) MAX_DMA_ADDRESS) >> PAGE_SHIFT;
+ max_zone_pfns[ZONE_DMA32] = max_dma;
+ max_zone_pfns[ZONE_NORMAL] = max_low_pfn;
+
+ virtual_map_init();
+
free_area_init(max_zone_pfns);
zero_page_memmap_ptr = virt_to_page(ia64_imva(empty_zero_page));
}
diff --git a/arch/ia64/mm/discontig.c b/arch/ia64/mm/discontig.c
index f41dcf75887b..c7311131156e 100644
--- a/arch/ia64/mm/discontig.c
+++ b/arch/ia64/mm/discontig.c
@@ -584,6 +584,25 @@ void call_pernode_memory(unsigned long start, unsigned long len, void *arg)
}
}

+static void __init virtual_map_init(void)
+{
+#ifdef CONFIG_VIRTUAL_MEM_MAP
+ int node;
+
+ VMALLOC_END -= PAGE_ALIGN(ALIGN(max_low_pfn, MAX_ORDER_NR_PAGES) *
+ sizeof(struct page));
+ vmem_map = (struct page *) VMALLOC_END;
+ efi_memmap_walk(create_mem_map_page_table, NULL);
+ printk("Virtual mem_map starts at 0x%p\n", vmem_map);
+
+ for_each_online_node(node) {
+ unsigned long pfn_offset = mem_data[node].min_pfn;
+
+ NODE_DATA(node)->node_mem_map = vmem_map + pfn_offset;
+ }
+#endif
+}
+
/**
* paging_init - setup page tables
*
@@ -593,29 +612,13 @@ void call_pernode_memory(unsigned long start, unsigned long len, void *arg)
void __init paging_init(void)
{
unsigned long max_dma;
- unsigned long pfn_offset = 0;
- int node;
unsigned long max_zone_pfns[MAX_NR_ZONES];

max_dma = virt_to_phys((void *) MAX_DMA_ADDRESS) >> PAGE_SHIFT;

sparse_init();

-#ifdef CONFIG_VIRTUAL_MEM_MAP
- VMALLOC_END -= PAGE_ALIGN(ALIGN(max_low_pfn, MAX_ORDER_NR_PAGES) *
- sizeof(struct page));
- vmem_map = (struct page *) VMALLOC_END;
- efi_memmap_walk(create_mem_map_page_table, NULL);
- printk("Virtual mem_map starts at 0x%p\n", vmem_map);
-#endif
-
- for_each_online_node(node) {
- pfn_offset = mem_data[node].min_pfn;
-
-#ifdef CONFIG_VIRTUAL_MEM_MAP
- NODE_DATA(node)->node_mem_map = vmem_map + pfn_offset;
-#endif
- }
+ virtual_map_init();

memset(max_zone_pfns, 0, sizeof(max_zone_pfns));
max_zone_pfns[ZONE_DMA32] = max_dma;
--
2.28.0

2020-11-01 17:09:47

by Mike Rapoport

[permalink] [raw]
Subject: [PATCH v2 06/13] ia64: forbid using VIRTUAL_MEM_MAP with FLATMEM

From: Mike Rapoport <[email protected]>

Virtual memory map was intended to avoid wasting memory on the memory map
on systems with large holes in the physical memory layout. Long ago it been
superseded first by DISCONTIGMEM and then by SPARSEMEM. Moreover,
SPARSEMEM_VMEMMAP provide the same functionality in much more portable way.

As the first step to removing the VIRTUAL_MEM_MAP forbid it's usage with
FLATMEM and panic on systems with large holes in the physical memory
layout that try to run FLATMEM kernels.

Signed-off-by: Mike Rapoport <[email protected]>
---
arch/ia64/Kconfig | 2 +-
arch/ia64/include/asm/meminit.h | 2 --
arch/ia64/mm/contig.c | 48 +++++++++++++++------------------
arch/ia64/mm/init.c | 14 ----------
4 files changed, 22 insertions(+), 44 deletions(-)

diff --git a/arch/ia64/Kconfig b/arch/ia64/Kconfig
index 12aae706cb27..83de0273d474 100644
--- a/arch/ia64/Kconfig
+++ b/arch/ia64/Kconfig
@@ -329,7 +329,7 @@ config NODES_SHIFT
# VIRTUAL_MEM_MAP has been retained for historical reasons.
config VIRTUAL_MEM_MAP
bool "Virtual mem map"
- depends on !SPARSEMEM
+ depends on !SPARSEMEM && !FLATMEM
default y
help
Say Y to compile the kernel with support for a virtual mem map.
diff --git a/arch/ia64/include/asm/meminit.h b/arch/ia64/include/asm/meminit.h
index 092f1c91b36c..e789c0818edb 100644
--- a/arch/ia64/include/asm/meminit.h
+++ b/arch/ia64/include/asm/meminit.h
@@ -59,10 +59,8 @@ extern int reserve_elfcorehdr(u64 *start, u64 *end);
extern int register_active_ranges(u64 start, u64 len, int nid);

#ifdef CONFIG_VIRTUAL_MEM_MAP
-# define LARGE_GAP 0x40000000 /* Use virtual mem map if hole is > than this */
extern unsigned long VMALLOC_END;
extern struct page *vmem_map;
- extern int find_largest_hole(u64 start, u64 end, void *arg);
extern int create_mem_map_page_table(u64 start, u64 end, void *arg);
extern int vmemmap_find_next_valid_pfn(int, int);
#else
diff --git a/arch/ia64/mm/contig.c b/arch/ia64/mm/contig.c
index ba81d8cb0059..bfc4ecd0a2ab 100644
--- a/arch/ia64/mm/contig.c
+++ b/arch/ia64/mm/contig.c
@@ -19,15 +19,12 @@
#include <linux/mm.h>
#include <linux/nmi.h>
#include <linux/swap.h>
+#include <linux/sizes.h>

#include <asm/meminit.h>
#include <asm/sections.h>
#include <asm/mca.h>

-#ifdef CONFIG_VIRTUAL_MEM_MAP
-static unsigned long max_gap;
-#endif
-
/* physical address where the bootmem map is located */
unsigned long bootmap_start;

@@ -166,33 +163,30 @@ find_memory (void)
alloc_per_cpu_data();
}

-static void __init virtual_map_init(void)
+static int __init find_largest_hole(u64 start, u64 end, void *arg)
{
-#ifdef CONFIG_VIRTUAL_MEM_MAP
- efi_memmap_walk(find_largest_hole, (u64 *)&max_gap);
- if (max_gap < LARGE_GAP) {
- vmem_map = (struct page *) 0;
- } else {
- unsigned long map_size;
+ u64 *max_gap = arg;

- /* allocate virtual_mem_map */
+ static u64 last_end = PAGE_OFFSET;

- map_size = PAGE_ALIGN(ALIGN(max_low_pfn, MAX_ORDER_NR_PAGES) *
- sizeof(struct page));
- VMALLOC_END -= map_size;
- vmem_map = (struct page *) VMALLOC_END;
- efi_memmap_walk(create_mem_map_page_table, NULL);
+ /* NOTE: this algorithm assumes efi memmap table is ordered */

- /*
- * alloc_node_mem_map makes an adjustment for mem_map
- * which isn't compatible with vmem_map.
- */
- NODE_DATA(0)->node_mem_map = vmem_map +
- find_min_pfn_with_active_regions();
+ if (*max_gap < (start - last_end))
+ *max_gap = start - last_end;
+ last_end = end;
+ return 0;
+}

- printk("Virtual mem_map starts at 0x%p\n", mem_map);
- }
-#endif /* !CONFIG_VIRTUAL_MEM_MAP */
+static void __init verify_gap_absence(void)
+{
+ unsigned long max_gap;
+
+ /* Forbid FLATMEM if hole is > than 1G */
+ efi_memmap_walk(find_largest_hole, (u64 *)&max_gap);
+ if (max_gap >= SZ_1G)
+ panic("Cannot use FLATMEM with %ldMB hole\n"
+ "Please switch over to SPARSEMEM\n",
+ (max_gap >> 20));
}

/*
@@ -210,7 +204,7 @@ paging_init (void)
max_zone_pfns[ZONE_DMA32] = max_dma;
max_zone_pfns[ZONE_NORMAL] = max_low_pfn;

- virtual_map_init();
+ verify_gap_absence();

free_area_init(max_zone_pfns);
zero_page_memmap_ptr = virt_to_page(ia64_imva(empty_zero_page));
diff --git a/arch/ia64/mm/init.c b/arch/ia64/mm/init.c
index ef12e097f318..9b5acf8fb092 100644
--- a/arch/ia64/mm/init.c
+++ b/arch/ia64/mm/init.c
@@ -574,20 +574,6 @@ ia64_pfn_valid (unsigned long pfn)
}
EXPORT_SYMBOL(ia64_pfn_valid);

-int __init find_largest_hole(u64 start, u64 end, void *arg)
-{
- u64 *max_gap = arg;
-
- static u64 last_end = PAGE_OFFSET;
-
- /* NOTE: this algorithm assumes efi memmap table is ordered */
-
- if (*max_gap < (start - last_end))
- *max_gap = start - last_end;
- last_end = end;
- return 0;
-}
-
#endif /* CONFIG_VIRTUAL_MEM_MAP */

int __init register_active_ranges(u64 start, u64 len, int nid)
--
2.28.0

2020-11-01 17:10:16

by Mike Rapoport

[permalink] [raw]
Subject: [PATCH v2 10/13] arc: use FLATMEM with freeing of unused memory map instead of DISCONTIGMEM

From: Mike Rapoport <[email protected]>

Currently ARC uses DISCONTIGMEM to cope with sparse physical memory address
space on systems with 2 memory banks. While DISCONTIGMEM avoids wasting
memory on unpopulated memory map, it adds both memory and CPU overhead
relatively to FLATMEM. Moreover, DISCONTINGMEM is generally considered
deprecated.

The obvious replacement for DISCONTIGMEM would be SPARSEMEM, but it is also
less efficient than FLATMEM in pfn_to_page() and page_to_pfn() conversions.
Besides it requires tuning of SECTION_SIZE which is not trivial for
possible ARC memory configuration.

Since the memory map for both banks is always allocated from the "lowmem"
bank, it is possible to use FLATMEM for two-bank configuration and simply
free the unused hole in the memory map. All is required for that is to
provide ARC-specific pfn_valid() that will take into account actual
physical memory configuration and define HAVE_ARCH_PFN_VALID.

The resulting kernel image configured with defconfig + HIGHMEM=y is
smaller:

$ size a/vmlinux b/vmlinux
text data bss dec hex filename
4673503 1245456 279756 6198715 5e95bb a/vmlinux
4658706 1246864 279756 6185326 5e616e b/vmlinux

$ ./scripts/bloat-o-meter a/vmlinux b/vmlinux
add/remove: 28/30 grow/shrink: 42/399 up/down: 10986/-29025 (-18039)
...
Total: Before=4709315, After=4691276, chg -0.38%

Booting nSIM with haps_ns.dts results in the following memory usage
reports:

a:
Memory: 1559104K/1572864K available (3531K kernel code, 595K rwdata, 752K rodata, 136K init, 275K bss, 13760K reserved, 0K cma-reserved, 1048576K highmem)

b:
Memory: 1559112K/1572864K available (3519K kernel code, 594K rwdata, 752K rodata, 136K init, 280K bss, 13752K reserved, 0K cma-reserved, 1048576K highmem)

Signed-off-by: Mike Rapoport <[email protected]>
---
arch/arc/Kconfig | 3 ++-
arch/arc/include/asm/page.h | 20 +++++++++++++++++---
arch/arc/mm/init.c | 29 ++++++++++++++++++++++-------
3 files changed, 41 insertions(+), 11 deletions(-)

diff --git a/arch/arc/Kconfig b/arch/arc/Kconfig
index 0a89cc9def65..c874f8ab0341 100644
--- a/arch/arc/Kconfig
+++ b/arch/arc/Kconfig
@@ -67,6 +67,7 @@ config GENERIC_CSUM

config ARCH_DISCONTIGMEM_ENABLE
def_bool n
+ depends on BROKEN

config ARCH_FLATMEM_ENABLE
def_bool y
@@ -506,7 +507,7 @@ config LINUX_RAM_BASE

config HIGHMEM
bool "High Memory Support"
- select ARCH_DISCONTIGMEM_ENABLE
+ select HAVE_ARCH_PFN_VALID
help
With ARC 2G:2G address split, only upper 2G is directly addressable by
kernel. Enable this to potentially allow access to rest of 2G and PAE
diff --git a/arch/arc/include/asm/page.h b/arch/arc/include/asm/page.h
index b0dfed0f12be..23e41e890eda 100644
--- a/arch/arc/include/asm/page.h
+++ b/arch/arc/include/asm/page.h
@@ -82,11 +82,25 @@ typedef pte_t * pgtable_t;
*/
#define virt_to_pfn(kaddr) (__pa(kaddr) >> PAGE_SHIFT)

-#define ARCH_PFN_OFFSET virt_to_pfn(CONFIG_LINUX_RAM_BASE)
+/*
+ * When HIGHMEM is enabled we have holes in the memory map so we need
+ * pfn_valid() that takes into account the actual extents of the physical
+ * memory
+ */
+#ifdef CONFIG_HIGHMEM
+
+extern unsigned long arch_pfn_offset;
+#define ARCH_PFN_OFFSET arch_pfn_offset
+
+extern int pfn_valid(unsigned long pfn);
+#define pfn_valid pfn_valid

-#ifdef CONFIG_FLATMEM
+#else /* CONFIG_HIGHMEM */
+
+#define ARCH_PFN_OFFSET virt_to_pfn(CONFIG_LINUX_RAM_BASE)
#define pfn_valid(pfn) (((pfn) - ARCH_PFN_OFFSET) < max_mapnr)
-#endif
+
+#endif /* CONFIG_HIGHMEM */

/*
* __pa, __va, virt_to_page (ALERT: deprecated, don't use them)
diff --git a/arch/arc/mm/init.c b/arch/arc/mm/init.c
index 3a35b82a718e..ce07e697916c 100644
--- a/arch/arc/mm/init.c
+++ b/arch/arc/mm/init.c
@@ -28,6 +28,8 @@ static unsigned long low_mem_sz;
static unsigned long min_high_pfn, max_high_pfn;
static phys_addr_t high_mem_start;
static phys_addr_t high_mem_sz;
+unsigned long arch_pfn_offset;
+EXPORT_SYMBOL(arch_pfn_offset);
#endif

#ifdef CONFIG_DISCONTIGMEM
@@ -98,16 +100,11 @@ void __init setup_arch_memory(void)
init_mm.brk = (unsigned long)_end;

/* first page of system - kernel .vector starts here */
- min_low_pfn = ARCH_PFN_OFFSET;
+ min_low_pfn = virt_to_pfn(CONFIG_LINUX_RAM_BASE);

/* Last usable page of low mem */
max_low_pfn = max_pfn = PFN_DOWN(low_mem_start + low_mem_sz);

-#ifdef CONFIG_FLATMEM
- /* pfn_valid() uses this */
- max_mapnr = max_low_pfn - min_low_pfn;
-#endif
-
/*------------- bootmem allocator setup -----------------------*/

/*
@@ -153,7 +150,9 @@ void __init setup_arch_memory(void)
* DISCONTIGMEM in turns requires multiple nodes. node 0 above is
* populated with normal memory zone while node 1 only has highmem
*/
+#ifdef CONFIG_DISCONTIGMEM
node_set_online(1);
+#endif

min_high_pfn = PFN_DOWN(high_mem_start);
max_high_pfn = PFN_DOWN(high_mem_start + high_mem_sz);
@@ -161,8 +160,15 @@ void __init setup_arch_memory(void)
max_zone_pfn[ZONE_HIGHMEM] = min_low_pfn;

high_memory = (void *)(min_high_pfn << PAGE_SHIFT);
+
+ arch_pfn_offset = min(min_low_pfn, min_high_pfn);
kmap_init();
-#endif
+
+#else /* CONFIG_HIGHMEM */
+ /* pfn_valid() uses this when FLATMEM=y and HIGHMEM=n */
+ max_mapnr = max_low_pfn - min_low_pfn;
+
+#endif /* CONFIG_HIGHMEM */

free_area_init(max_zone_pfn);
}
@@ -190,3 +196,12 @@ void __init mem_init(void)
highmem_init();
mem_init_print_info(NULL);
}
+
+#ifdef CONFIG_HIGHMEM
+int pfn_valid(unsigned long pfn)
+{
+ return (pfn >= min_high_pfn && pfn <= max_high_pfn) ||
+ (pfn >= min_low_pfn && pfn <= max_low_pfn);
+}
+EXPORT_SYMBOL(pfn_valid);
+#endif
--
2.28.0

2020-11-01 17:10:42

by Mike Rapoport

[permalink] [raw]
Subject: [PATCH v2 13/13] m68k: deprecate DISCONTIGMEM

From: Mike Rapoport <[email protected]>

DISCONTIGMEM was intended to provide more efficient support for systems
with holes in their physical address space that FLATMEM did.

Yet, it's overhead in terms of the memory consumption seems to overweight
the savings on the unused memory map.

For a ARAnyM system with 16 MBytes of FastRAM configured, the memory usage
reported after page allocator initialization is

Memory: 23828K/30720K available (3206K kernel code, 535K rwdata, 936K rodata, 768K init, 193K bss, 6892K reserved, 0K cma-reserved)

and with DISCONTIGMEM disabled and with relatively large hole in the memory
map it is:

Memory: 23864K/30720K available (3197K kernel code, 516K rwdata, 936K rodata, 764K init, 179K bss, 6856K reserved, 0K cma-reserved)

Moreover, since m68k already has custom pfn_valid() it is possible to
define HAVE_ARCH_PFN_VALID to enable freeing of unused memory map. The
minimal size of a hole that can be freed should not be less than
MAX_ORDER_NR_PAGES so to achieve more substantial memory savings let m68k
also define custom FORCE_MAX_ZONEORDER.

With FORCE_MAX_ZONEORDER set to 9 memory usage becomes:

Memory: 23880K/30720K available (3197K kernel code, 516K rwdata, 936K rodata, 764K init, 179K bss, 6840K reserved, 0K cma-reserved)

Signed-off-by: Mike Rapoport <[email protected]>
---
arch/m68k/Kconfig.cpu | 26 +++++++++++++++++++++++++-
1 file changed, 25 insertions(+), 1 deletion(-)

diff --git a/arch/m68k/Kconfig.cpu b/arch/m68k/Kconfig.cpu
index b8884af365ae..3e70fb7a8d83 100644
--- a/arch/m68k/Kconfig.cpu
+++ b/arch/m68k/Kconfig.cpu
@@ -20,6 +20,7 @@ choice

config M68KCLASSIC
bool "Classic M68K CPU family support"
+ select HAVE_ARCH_PFN_VALID

config COLDFIRE
bool "Coldfire CPU family support"
@@ -377,11 +378,34 @@ config SINGLE_MEMORY_CHUNK
help
Ignore all but the first contiguous chunk of physical memory for VM
purposes. This will save a few bytes kernel size and may speed up
- some operations. Say N if not sure.
+ some operations.
+ When this option os set to N, you may want to lower "Maximum zone
+ order" to save memory that could be wasted for unused memory map.
+ Say N if not sure.

config ARCH_DISCONTIGMEM_ENABLE
+ depends on BROKEN
def_bool MMU && !SINGLE_MEMORY_CHUNK

+config FORCE_MAX_ZONEORDER
+ int "Maximum zone order" if ADVANCED
+ depends on !SINGLE_MEMORY_CHUNK
+ default "11"
+ help
+ The kernel memory allocator divides physically contiguous memory
+ blocks into "zones", where each zone is a power of two number of
+ pages. This option selects the largest power of two that the kernel
+ keeps in the memory allocator. If you need to allocate very large
+ blocks of physically contiguous memory, then you may need to
+ increase this value.
+
+ For systems that have holes in their physical address space this
+ value also defines the minimal size of the hole that allows
+ freeing unused memory map.
+
+ This config option is actually maximum order plus one. For example,
+ a value of 11 means that the largest free memory block is 2^10 pages.
+
config 060_WRITETHROUGH
bool "Use write-through caching for 68060 supervisor accesses"
depends on ADVANCED && M68060
--
2.28.0

2020-11-01 17:10:55

by Mike Rapoport

[permalink] [raw]
Subject: [PATCH v2 12/13] m68k/mm: enable use of generic memory_model.h for !DISCONTIGMEM

From: Mike Rapoport <[email protected]>

The pg_data_map and pg_data_table arrays as well as page_to_pfn() and
pfn_to_page() are required only for DISCONTIGMEM. Other memory models can
use the generic definitions in asm-generic/memory_model.h.

Signed-off-by: Mike Rapoport <[email protected]>
---
arch/m68k/Kconfig.cpu | 1 -
arch/m68k/include/asm/page.h | 2 ++
arch/m68k/include/asm/page_mm.h | 5 +++++
arch/m68k/include/asm/virtconvert.h | 5 -----
arch/m68k/mm/init.c | 6 +++---
5 files changed, 10 insertions(+), 9 deletions(-)

diff --git a/arch/m68k/Kconfig.cpu b/arch/m68k/Kconfig.cpu
index e8ad721e52f6..b8884af365ae 100644
--- a/arch/m68k/Kconfig.cpu
+++ b/arch/m68k/Kconfig.cpu
@@ -374,7 +374,6 @@ config SINGLE_MEMORY_CHUNK
bool "Use one physical chunk of memory only" if ADVANCED && !SUN3
depends on MMU
default y if SUN3 || MMU_COLDFIRE
- select NEED_MULTIPLE_NODES
help
Ignore all but the first contiguous chunk of physical memory for VM
purposes. This will save a few bytes kernel size and may speed up
diff --git a/arch/m68k/include/asm/page.h b/arch/m68k/include/asm/page.h
index 2614a1206f2f..6116d7094292 100644
--- a/arch/m68k/include/asm/page.h
+++ b/arch/m68k/include/asm/page.h
@@ -62,8 +62,10 @@ extern unsigned long _ramend;
#include <asm/page_no.h>
#endif

+#ifdef CONFIG_DISCONTIGMEM
#define __phys_to_pfn(paddr) ((unsigned long)((paddr) >> PAGE_SHIFT))
#define __pfn_to_phys(pfn) PFN_PHYS(pfn)
+#endif

#include <asm-generic/getorder.h>

diff --git a/arch/m68k/include/asm/page_mm.h b/arch/m68k/include/asm/page_mm.h
index 0e794051d3bb..7f5912af2a52 100644
--- a/arch/m68k/include/asm/page_mm.h
+++ b/arch/m68k/include/asm/page_mm.h
@@ -153,6 +153,7 @@ static inline __attribute_const__ int __virt_to_node_shift(void)
pfn_to_virt(page_to_pfn(page)); \
})

+#ifdef CONFIG_DISCONTIGMEM
#define pfn_to_page(pfn) ({ \
unsigned long __pfn = (pfn); \
struct pglist_data *pgdat; \
@@ -165,6 +166,10 @@ static inline __attribute_const__ int __virt_to_node_shift(void)
pgdat = &pg_data_map[page_to_nid(__p)]; \
((__p) - pgdat->node_mem_map) + pgdat->node_start_pfn; \
})
+#else
+#define ARCH_PFN_OFFSET (m68k_memory[0].addr)
+#include <asm-generic/memory_model.h>
+#endif

#define virt_addr_valid(kaddr) ((void *)(kaddr) >= (void *)PAGE_OFFSET && (void *)(kaddr) < high_memory)
#define pfn_valid(pfn) virt_addr_valid(pfn_to_virt(pfn))
diff --git a/arch/m68k/include/asm/virtconvert.h b/arch/m68k/include/asm/virtconvert.h
index eb9eb5cb23a6..ca91b32dc6ef 100644
--- a/arch/m68k/include/asm/virtconvert.h
+++ b/arch/m68k/include/asm/virtconvert.h
@@ -29,12 +29,7 @@ static inline void *phys_to_virt(unsigned long address)
}

/* Permanent address of a page. */
-#if defined(CONFIG_MMU) && !defined(CONFIG_DISCONTIGMEM)
-#define page_to_phys(page) \
- __pa(PAGE_OFFSET + (((page) - pg_data_map[0].node_mem_map) << PAGE_SHIFT))
-#else
#define page_to_phys(page) (page_to_pfn(page) << PAGE_SHIFT)
-#endif

/*
* IO bus memory addresses are 1:1 with the physical address,
diff --git a/arch/m68k/mm/init.c b/arch/m68k/mm/init.c
index 4b46ceace3d3..14c1e541451c 100644
--- a/arch/m68k/mm/init.c
+++ b/arch/m68k/mm/init.c
@@ -42,12 +42,12 @@ EXPORT_SYMBOL(empty_zero_page);

#ifdef CONFIG_MMU

-pg_data_t pg_data_map[MAX_NUMNODES];
-EXPORT_SYMBOL(pg_data_map);
-
int m68k_virt_to_node_shift;

#ifdef CONFIG_DISCONTIGMEM
+pg_data_t pg_data_map[MAX_NUMNODES];
+EXPORT_SYMBOL(pg_data_map);
+
pg_data_t *pg_data_table[65];
EXPORT_SYMBOL(pg_data_table);
#endif
--
2.28.0

2020-11-02 09:46:01

by David Hildenbrand

[permalink] [raw]
Subject: Re: [PATCH v2 08/13] arm: remove CONFIG_ARCH_HAS_HOLES_MEMORYMODEL

On 01.11.20 18:04, Mike Rapoport wrote:
> From: Mike Rapoport <[email protected]>
>
> ARM is the only architecture that defines CONFIG_ARCH_HAS_HOLES_MEMORYMODEL
> which in turn enables memmap_valid_within() function that is intended to
> verify existence of struct page associated with a pfn when there are holes
> in the memory map.
>
> However, the ARCH_HAS_HOLES_MEMORYMODEL also enables HAVE_ARCH_PFN_VALID
> and arch-specific pfn_valid() implementation that also deals with the holes
> in the memory map.
>
> The only two users of memmap_valid_within() call this function after
> a call to pfn_valid() so the memmap_valid_within() check becomes redundant.
>
> Remove CONFIG_ARCH_HAS_HOLES_MEMORYMODEL and memmap_valid_within() and rely
> entirely on ARM's implementation of pfn_valid() that is now enabled
> unconditionally.
>
> Signed-off-by: Mike Rapoport <[email protected]>
> ---
> Documentation/vm/memory-model.rst | 3 +--
> arch/arm/Kconfig | 8 ++------
> arch/arm/mach-bcm/Kconfig | 1 -
> arch/arm/mach-davinci/Kconfig | 1 -
> arch/arm/mach-exynos/Kconfig | 1 -
> arch/arm/mach-highbank/Kconfig | 1 -
> arch/arm/mach-omap2/Kconfig | 1 -
> arch/arm/mach-s5pv210/Kconfig | 1 -
> arch/arm/mach-tango/Kconfig | 1 -
> fs/proc/kcore.c | 2 --
> include/linux/mmzone.h | 31 -------------------------------
> mm/mmzone.c | 14 --------------
> mm/vmstat.c | 4 ----
> 13 files changed, 3 insertions(+), 66 deletions(-)
>
> diff --git a/Documentation/vm/memory-model.rst b/Documentation/vm/memory-model.rst
> index 9daadf9faba1..ce398a7dc6cd 100644
> --- a/Documentation/vm/memory-model.rst
> +++ b/Documentation/vm/memory-model.rst
> @@ -51,8 +51,7 @@ call :c:func:`free_area_init` function. Yet, the mappings array is not
> usable until the call to :c:func:`memblock_free_all` that hands all the
> memory to the page allocator.
>
> -If an architecture enables `CONFIG_ARCH_HAS_HOLES_MEMORYMODEL` option,
> -it may free parts of the `mem_map` array that do not cover the
> +An architecture may free parts of the `mem_map` array that do not cover the
> actual physical pages. In such case, the architecture specific
> :c:func:`pfn_valid` implementation should take the holes in the
> `mem_map` into account.
> diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
> index fe2f17eb2b50..83adc46c1e67 100644
> --- a/arch/arm/Kconfig
> +++ b/arch/arm/Kconfig
> @@ -25,7 +25,7 @@ config ARM
> select ARCH_HAS_TICK_BROADCAST if GENERIC_CLOCKEVENTS_BROADCAST
> select ARCH_HAVE_CUSTOM_GPIO_H
> select ARCH_HAS_GCOV_PROFILE_ALL
> - select ARCH_KEEP_MEMBLOCK if HAVE_ARCH_PFN_VALID || KEXEC
> + select ARCH_KEEP_MEMBLOCK
> select ARCH_MIGHT_HAVE_PC_PARPORT
> select ARCH_NO_SG_CHAIN if !ARM_HAS_SG_CHAIN
> select ARCH_OPTIONAL_KERNEL_RWX if ARCH_HAS_STRICT_KERNEL_RWX
> @@ -519,7 +519,6 @@ config ARCH_S3C24XX
> config ARCH_OMAP1
> bool "TI OMAP1"
> depends on MMU
> - select ARCH_HAS_HOLES_MEMORYMODEL
> select ARCH_OMAP
> select CLKDEV_LOOKUP
> select CLKSRC_MMIO
> @@ -1479,9 +1478,6 @@ config OABI_COMPAT
> UNPREDICTABLE (in fact it can be predicted that it won't work
> at all). If in doubt say N.
>
> -config ARCH_HAS_HOLES_MEMORYMODEL
> - bool
> -
> config ARCH_SELECT_MEMORY_MODEL
> bool
>
> @@ -1493,7 +1489,7 @@ config ARCH_SPARSEMEM_ENABLE
> select SPARSEMEM_STATIC if SPARSEMEM
>
> config HAVE_ARCH_PFN_VALID
> - def_bool ARCH_HAS_HOLES_MEMORYMODEL || !SPARSEMEM
> + def_bool y
>
> config HIGHMEM
> bool "High Memory Support"
> diff --git a/arch/arm/mach-bcm/Kconfig b/arch/arm/mach-bcm/Kconfig
> index ae790908fc74..9b594ae98153 100644
> --- a/arch/arm/mach-bcm/Kconfig
> +++ b/arch/arm/mach-bcm/Kconfig
> @@ -211,7 +211,6 @@ config ARCH_BRCMSTB
> select BCM7038_L1_IRQ
> select BRCMSTB_L2_IRQ
> select BCM7120_L2_IRQ
> - select ARCH_HAS_HOLES_MEMORYMODEL
> select ZONE_DMA if ARM_LPAE
> select SOC_BRCMSTB
> select SOC_BUS
> diff --git a/arch/arm/mach-davinci/Kconfig b/arch/arm/mach-davinci/Kconfig
> index f56ff8c24043..de11030748d0 100644
> --- a/arch/arm/mach-davinci/Kconfig
> +++ b/arch/arm/mach-davinci/Kconfig
> @@ -5,7 +5,6 @@ menuconfig ARCH_DAVINCI
> depends on ARCH_MULTI_V5
> select DAVINCI_TIMER
> select ZONE_DMA
> - select ARCH_HAS_HOLES_MEMORYMODEL
> select PM_GENERIC_DOMAINS if PM
> select PM_GENERIC_DOMAINS_OF if PM && OF
> select REGMAP_MMIO
> diff --git a/arch/arm/mach-exynos/Kconfig b/arch/arm/mach-exynos/Kconfig
> index d2d249706ebb..56d272967fc0 100644
> --- a/arch/arm/mach-exynos/Kconfig
> +++ b/arch/arm/mach-exynos/Kconfig
> @@ -8,7 +8,6 @@
> menuconfig ARCH_EXYNOS
> bool "Samsung Exynos"
> depends on ARCH_MULTI_V7
> - select ARCH_HAS_HOLES_MEMORYMODEL
> select ARCH_SUPPORTS_BIG_ENDIAN
> select ARM_AMBA
> select ARM_GIC
> diff --git a/arch/arm/mach-highbank/Kconfig b/arch/arm/mach-highbank/Kconfig
> index 1bc68913d62c..9de38ce8124f 100644
> --- a/arch/arm/mach-highbank/Kconfig
> +++ b/arch/arm/mach-highbank/Kconfig
> @@ -2,7 +2,6 @@
> config ARCH_HIGHBANK
> bool "Calxeda ECX-1000/2000 (Highbank/Midway)"
> depends on ARCH_MULTI_V7
> - select ARCH_HAS_HOLES_MEMORYMODEL
> select ARCH_SUPPORTS_BIG_ENDIAN
> select ARM_AMBA
> select ARM_ERRATA_764369 if SMP
> diff --git a/arch/arm/mach-omap2/Kconfig b/arch/arm/mach-omap2/Kconfig
> index 3ee7bdff86b2..89fe1572c142 100644
> --- a/arch/arm/mach-omap2/Kconfig
> +++ b/arch/arm/mach-omap2/Kconfig
> @@ -94,7 +94,6 @@ config SOC_DRA7XX
> config ARCH_OMAP2PLUS
> bool
> select ARCH_HAS_BANDGAP
> - select ARCH_HAS_HOLES_MEMORYMODEL
> select ARCH_HAS_RESET_CONTROLLER
> select ARCH_OMAP
> select CLKSRC_MMIO
> diff --git a/arch/arm/mach-s5pv210/Kconfig b/arch/arm/mach-s5pv210/Kconfig
> index 95d4e8284866..d644b45bc29d 100644
> --- a/arch/arm/mach-s5pv210/Kconfig
> +++ b/arch/arm/mach-s5pv210/Kconfig
> @@ -8,7 +8,6 @@
> config ARCH_S5PV210
> bool "Samsung S5PV210/S5PC110"
> depends on ARCH_MULTI_V7
> - select ARCH_HAS_HOLES_MEMORYMODEL
> select ARM_VIC
> select CLKSRC_SAMSUNG_PWM
> select COMMON_CLK_SAMSUNG
> diff --git a/arch/arm/mach-tango/Kconfig b/arch/arm/mach-tango/Kconfig
> index 25b2fd434861..a9eeda36aeb1 100644
> --- a/arch/arm/mach-tango/Kconfig
> +++ b/arch/arm/mach-tango/Kconfig
> @@ -3,7 +3,6 @@ config ARCH_TANGO
> bool "Sigma Designs Tango4 (SMP87xx)"
> depends on ARCH_MULTI_V7
> # Cortex-A9 MPCore r3p0, PL310 r3p2
> - select ARCH_HAS_HOLES_MEMORYMODEL
> select ARM_ERRATA_754322
> select ARM_ERRATA_764369 if SMP
> select ARM_ERRATA_775420
> diff --git a/fs/proc/kcore.c b/fs/proc/kcore.c
> index e502414b3556..4d2e64e9016c 100644
> --- a/fs/proc/kcore.c
> +++ b/fs/proc/kcore.c
> @@ -193,8 +193,6 @@ kclist_add_private(unsigned long pfn, unsigned long nr_pages, void *arg)
> return 1;
>
> p = pfn_to_page(pfn);
> - if (!memmap_valid_within(pfn, p, page_zone(p)))
> - return 1;
>
> ent = kmalloc(sizeof(*ent), GFP_KERNEL);
> if (!ent)
> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> index 876600a6e891..7385871768d4 100644
> --- a/include/linux/mmzone.h
> +++ b/include/linux/mmzone.h
> @@ -1440,37 +1440,6 @@ void sparse_init(void);
> #define pfn_valid_within(pfn) (1)
> #endif
>
> -#ifdef CONFIG_ARCH_HAS_HOLES_MEMORYMODEL
> -/*
> - * pfn_valid() is meant to be able to tell if a given PFN has valid memmap
> - * associated with it or not. This means that a struct page exists for this
> - * pfn. The caller cannot assume the page is fully initialized in general.
> - * Hotplugable pages might not have been onlined yet. pfn_to_online_page()
> - * will ensure the struct page is fully online and initialized. Special pages
> - * (e.g. ZONE_DEVICE) are never onlined and should be treated accordingly.
> - *
> - * In FLATMEM, it is expected that holes always have valid memmap as long as
> - * there is valid PFNs either side of the hole. In SPARSEMEM, it is assumed
> - * that a valid section has a memmap for the entire section.
> - *
> - * However, an ARM, and maybe other embedded architectures in the future
> - * free memmap backing holes to save memory on the assumption the memmap is
> - * never used. The page_zone linkages are then broken even though pfn_valid()
> - * returns true. A walker of the full memmap must then do this additional
> - * check to ensure the memmap they are looking at is sane by making sure
> - * the zone and PFN linkages are still valid. This is expensive, but walkers
> - * of the full memmap are extremely rare.
> - */
> -bool memmap_valid_within(unsigned long pfn,
> - struct page *page, struct zone *zone);
> -#else
> -static inline bool memmap_valid_within(unsigned long pfn,
> - struct page *page, struct zone *zone)
> -{
> - return true;
> -}
> -#endif /* CONFIG_ARCH_HAS_HOLES_MEMORYMODEL */
> -
> #endif /* !__GENERATING_BOUNDS.H */
> #endif /* !__ASSEMBLY__ */
> #endif /* _LINUX_MMZONE_H */
> diff --git a/mm/mmzone.c b/mm/mmzone.c
> index 4686fdc23bb9..f337831affc2 100644
> --- a/mm/mmzone.c
> +++ b/mm/mmzone.c
> @@ -72,20 +72,6 @@ struct zoneref *__next_zones_zonelist(struct zoneref *z,
> return z;
> }
>
> -#ifdef CONFIG_ARCH_HAS_HOLES_MEMORYMODEL
> -bool memmap_valid_within(unsigned long pfn,
> - struct page *page, struct zone *zone)
> -{
> - if (page_to_pfn(page) != pfn)
> - return false;
> -
> - if (page_zone(page) != zone)
> - return false;
> -
> - return true;
> -}
> -#endif /* CONFIG_ARCH_HAS_HOLES_MEMORYMODEL */
> -
> void lruvec_init(struct lruvec *lruvec)
> {
> enum lru_list lru;
> diff --git a/mm/vmstat.c b/mm/vmstat.c
> index 698bc0bc18d1..e292e63afebf 100644
> --- a/mm/vmstat.c
> +++ b/mm/vmstat.c
> @@ -1503,10 +1503,6 @@ static void pagetypeinfo_showblockcount_print(struct seq_file *m,
> if (!page)
> continue;
>
> - /* Watch for unexpected holes punched in the memmap */
> - if (!memmap_valid_within(pfn, page, zone))
> - continue;
> -

Right, pfn_to_online_page() -> pfn_valid() / pfn_valid_within() should
handle that.

Acked-by: David Hildenbrand <[email protected]>

--
Thanks,

David / dhildenb

2020-11-14 12:17:19

by Catalin Marinas

[permalink] [raw]
Subject: Re: [PATCH v2 09/13] arm, arm64: move free_unused_memmap() to generic mm

On Sun, Nov 01, 2020 at 07:04:50PM +0200, Mike Rapoport wrote:
> From: Mike Rapoport <[email protected]>
>
> ARM and ARM64 free unused parts of the memory map just before the
> initialization of the page allocator. To allow holes in the memory map both
> architectures overload pfn_valid() and define HAVE_ARCH_PFN_VALID.
>
> Allowing holes in the memory map for FLATMEM may be useful for small
> machines, such as ARC and m68k and will enable those architectures to cease
> using DISCONTIGMEM and still support more than one memory bank.
>
> Move the functions that free unused memory map to generic mm and enable
> them in case HAVE_ARCH_PFN_VALID=y.
>
> Signed-off-by: Mike Rapoport <[email protected]>

For arm64:

Acked-by: Catalin Marinas <[email protected]>

Subject: Re: [PATCH v2 00/13] arch, mm: deprecate DISCONTIGMEM

Hi!

On 11/1/20 6:04 PM, Mike Rapoport wrote:
> It's been a while since DISCONTIGMEM is generally considered deprecated,
> but it is still used by four architectures. This set replaces DISCONTIGMEM
> with a different way to handle holes in the memory map and marks
> DISCONTIGMEM configuration as BROKEN in Kconfigs of these architectures with
> the intention to completely remove it in several releases.
>
> While for 64-bit alpha and ia64 the switch to SPARSEMEM is quite obvious
> and was a matter of moving some bits around, for smaller 32-bit arc and
> m68k SPARSEMEM is not necessarily the best thing to do.
>
> On 32-bit machines SPARSEMEM would require large sections to make section
> index fit in the page flags, but larger sections mean that more memory is
> wasted for unused memory map.
>
> Besides, pfn_to_page() and page_to_pfn() become less efficient, at least on
> arc.
>
> So I've decided to generalize arm's approach for freeing of unused parts of
> the memory map with FLATMEM and enable it for both arc and m68k. The
> details are in the description of patches 10 (arc) and 13 (m68k).

Apologies for the late reply. Is this still relevant for testing?

I have already successfully tested v1 of the patch set, shall I test v2?

Adrian

--
.''`. John Paul Adrian Glaubitz
: :' : Debian Developer - [email protected]
`. `' Freie Universitaet Berlin - [email protected]
`- GPG: 62FF 8A75 84E0 2956 9546 0006 7426 3B37 F5B5 F913

2020-11-17 06:27:38

by Mike Rapoport

[permalink] [raw]
Subject: Re: [PATCH v2 00/13] arch, mm: deprecate DISCONTIGMEM

Hi Adrian,

On Tue, Nov 17, 2020 at 06:24:51AM +0100, John Paul Adrian Glaubitz wrote:
> Hi!
>
> On 11/1/20 6:04 PM, Mike Rapoport wrote:
> > It's been a while since DISCONTIGMEM is generally considered deprecated,
> > but it is still used by four architectures. This set replaces DISCONTIGMEM
> > with a different way to handle holes in the memory map and marks
> > DISCONTIGMEM configuration as BROKEN in Kconfigs of these architectures with
> > the intention to completely remove it in several releases.
> >
> > While for 64-bit alpha and ia64 the switch to SPARSEMEM is quite obvious
> > and was a matter of moving some bits around, for smaller 32-bit arc and
> > m68k SPARSEMEM is not necessarily the best thing to do.
> >
> > On 32-bit machines SPARSEMEM would require large sections to make section
> > index fit in the page flags, but larger sections mean that more memory is
> > wasted for unused memory map.
> >
> > Besides, pfn_to_page() and page_to_pfn() become less efficient, at least on
> > arc.
> >
> > So I've decided to generalize arm's approach for freeing of unused parts of
> > the memory map with FLATMEM and enable it for both arc and m68k. The
> > details are in the description of patches 10 (arc) and 13 (m68k).
>
> Apologies for the late reply. Is this still relevant for testing?
>
> I have already successfully tested v1 of the patch set, shall I test v2?

There were minor differences only for m68k between the versions. I've
verified them on ARAnyM but if you have a real machine a run there would
be nice.

> Adrian
>
> --
> .''`. John Paul Adrian Glaubitz
> : :' : Debian Developer - [email protected]
> `. `' Freie Universitaet Berlin - [email protected]
> `- GPG: 62FF 8A75 84E0 2956 9546 0006 7426 3B37 F5B5 F913
>
>

--
Sincerely yours,
Mike.

2020-11-17 06:43:07

by Vineet Gupta

[permalink] [raw]
Subject: Re: [PATCH v2 10/13] arc: use FLATMEM with freeing of unused memory map instead of DISCONTIGMEM

Hi Mike,

On 11/1/20 9:04 AM, Mike Rapoport wrote:
> From: Mike Rapoport <[email protected]>
>
> Currently ARC uses DISCONTIGMEM to cope with sparse physical memory address
> space on systems with 2 memory banks. While DISCONTIGMEM avoids wasting
> memory on unpopulated memory map, it adds both memory and CPU overhead
> relatively to FLATMEM. Moreover, DISCONTINGMEM is generally considered
> deprecated.
>
> The obvious replacement for DISCONTIGMEM would be SPARSEMEM, but it is also
> less efficient than FLATMEM in pfn_to_page() and page_to_pfn() conversions.
> Besides it requires tuning of SECTION_SIZE which is not trivial for
> possible ARC memory configuration.
>
> Since the memory map for both banks is always allocated from the "lowmem"
> bank, it is possible to use FLATMEM for two-bank configuration and simply
> free the unused hole in the memory map. All is required for that is to
> provide ARC-specific pfn_valid() that will take into account actual
> physical memory configuration and define HAVE_ARCH_PFN_VALID.
>
> The resulting kernel image configured with defconfig + HIGHMEM=y is
> smaller:
>
> $ size a/vmlinux b/vmlinux
> text data bss dec hex filename
> 4673503 1245456 279756 6198715 5e95bb a/vmlinux
> 4658706 1246864 279756 6185326 5e616e b/vmlinux
>
> $ ./scripts/bloat-o-meter a/vmlinux b/vmlinux
> add/remove: 28/30 grow/shrink: 42/399 up/down: 10986/-29025 (-18039)
> ...
> Total: Before=4709315, After=4691276, chg -0.38%
>
> Booting nSIM with haps_ns.dts results in the following memory usage
> reports:
>
> a:
> Memory: 1559104K/1572864K available (3531K kernel code, 595K rwdata, 752K rodata, 136K init, 275K bss, 13760K reserved, 0K cma-reserved, 1048576K highmem)
>
> b:
> Memory: 1559112K/1572864K available (3519K kernel code, 594K rwdata, 752K rodata, 136K init, 280K bss, 13752K reserved, 0K cma-reserved, 1048576K highmem)
>
> Signed-off-by: Mike Rapoport <[email protected]>

Sorry this fell through the cracks. Do you have a branch I can checkout
and do a quick test.

Thx,
-Vineet

> ---
> arch/arc/Kconfig | 3 ++-
> arch/arc/include/asm/page.h | 20 +++++++++++++++++---
> arch/arc/mm/init.c | 29 ++++++++++++++++++++++-------
> 3 files changed, 41 insertions(+), 11 deletions(-)
>
> diff --git a/arch/arc/Kconfig b/arch/arc/Kconfig
> index 0a89cc9def65..c874f8ab0341 100644
> --- a/arch/arc/Kconfig
> +++ b/arch/arc/Kconfig
> @@ -67,6 +67,7 @@ config GENERIC_CSUM
>
> config ARCH_DISCONTIGMEM_ENABLE
> def_bool n
> + depends on BROKEN
>
> config ARCH_FLATMEM_ENABLE
> def_bool y
> @@ -506,7 +507,7 @@ config LINUX_RAM_BASE
>
> config HIGHMEM
> bool "High Memory Support"
> - select ARCH_DISCONTIGMEM_ENABLE
> + select HAVE_ARCH_PFN_VALID
> help
> With ARC 2G:2G address split, only upper 2G is directly addressable by
> kernel. Enable this to potentially allow access to rest of 2G and PAE
> diff --git a/arch/arc/include/asm/page.h b/arch/arc/include/asm/page.h
> index b0dfed0f12be..23e41e890eda 100644
> --- a/arch/arc/include/asm/page.h
> +++ b/arch/arc/include/asm/page.h
> @@ -82,11 +82,25 @@ typedef pte_t * pgtable_t;
> */
> #define virt_to_pfn(kaddr) (__pa(kaddr) >> PAGE_SHIFT)
>
> -#define ARCH_PFN_OFFSET virt_to_pfn(CONFIG_LINUX_RAM_BASE)
> +/*
> + * When HIGHMEM is enabled we have holes in the memory map so we need
> + * pfn_valid() that takes into account the actual extents of the physical
> + * memory
> + */
> +#ifdef CONFIG_HIGHMEM
> +
> +extern unsigned long arch_pfn_offset;
> +#define ARCH_PFN_OFFSET arch_pfn_offset
> +
> +extern int pfn_valid(unsigned long pfn);
> +#define pfn_valid pfn_valid
>
> -#ifdef CONFIG_FLATMEM
> +#else /* CONFIG_HIGHMEM */
> +
> +#define ARCH_PFN_OFFSET virt_to_pfn(CONFIG_LINUX_RAM_BASE)
> #define pfn_valid(pfn) (((pfn) - ARCH_PFN_OFFSET) < max_mapnr)
> -#endif
> +
> +#endif /* CONFIG_HIGHMEM */
>
> /*
> * __pa, __va, virt_to_page (ALERT: deprecated, don't use them)
> diff --git a/arch/arc/mm/init.c b/arch/arc/mm/init.c
> index 3a35b82a718e..ce07e697916c 100644
> --- a/arch/arc/mm/init.c
> +++ b/arch/arc/mm/init.c
> @@ -28,6 +28,8 @@ static unsigned long low_mem_sz;
> static unsigned long min_high_pfn, max_high_pfn;
> static phys_addr_t high_mem_start;
> static phys_addr_t high_mem_sz;
> +unsigned long arch_pfn_offset;
> +EXPORT_SYMBOL(arch_pfn_offset);
> #endif
>
> #ifdef CONFIG_DISCONTIGMEM
> @@ -98,16 +100,11 @@ void __init setup_arch_memory(void)
> init_mm.brk = (unsigned long)_end;
>
> /* first page of system - kernel .vector starts here */
> - min_low_pfn = ARCH_PFN_OFFSET;
> + min_low_pfn = virt_to_pfn(CONFIG_LINUX_RAM_BASE);
>
> /* Last usable page of low mem */
> max_low_pfn = max_pfn = PFN_DOWN(low_mem_start + low_mem_sz);
>
> -#ifdef CONFIG_FLATMEM
> - /* pfn_valid() uses this */
> - max_mapnr = max_low_pfn - min_low_pfn;
> -#endif
> -
> /*------------- bootmem allocator setup -----------------------*/
>
> /*
> @@ -153,7 +150,9 @@ void __init setup_arch_memory(void)
> * DISCONTIGMEM in turns requires multiple nodes. node 0 above is
> * populated with normal memory zone while node 1 only has highmem
> */
> +#ifdef CONFIG_DISCONTIGMEM
> node_set_online(1);
> +#endif
>
> min_high_pfn = PFN_DOWN(high_mem_start);
> max_high_pfn = PFN_DOWN(high_mem_start + high_mem_sz);
> @@ -161,8 +160,15 @@ void __init setup_arch_memory(void)
> max_zone_pfn[ZONE_HIGHMEM] = min_low_pfn;
>
> high_memory = (void *)(min_high_pfn << PAGE_SHIFT);
> +
> + arch_pfn_offset = min(min_low_pfn, min_high_pfn);
> kmap_init();
> -#endif
> +
> +#else /* CONFIG_HIGHMEM */
> + /* pfn_valid() uses this when FLATMEM=y and HIGHMEM=n */
> + max_mapnr = max_low_pfn - min_low_pfn;
> +
> +#endif /* CONFIG_HIGHMEM */
>
> free_area_init(max_zone_pfn);
> }
> @@ -190,3 +196,12 @@ void __init mem_init(void)
> highmem_init();
> mem_init_print_info(NULL);
> }
> +
> +#ifdef CONFIG_HIGHMEM
> +int pfn_valid(unsigned long pfn)
> +{
> + return (pfn >= min_high_pfn && pfn <= max_high_pfn) ||
> + (pfn >= min_low_pfn && pfn <= max_low_pfn);
> +}
> +EXPORT_SYMBOL(pfn_valid);
> +#endif

2020-11-17 06:59:56

by Mike Rapoport

[permalink] [raw]
Subject: Re: [PATCH v2 10/13] arc: use FLATMEM with freeing of unused memory map instead of DISCONTIGMEM

On Tue, Nov 17, 2020 at 06:40:16AM +0000, Vineet Gupta wrote:
> Hi Mike,
>
> On 11/1/20 9:04 AM, Mike Rapoport wrote:
> > From: Mike Rapoport <[email protected]>
> >
> > Currently ARC uses DISCONTIGMEM to cope with sparse physical memory address
> > space on systems with 2 memory banks. While DISCONTIGMEM avoids wasting
> > memory on unpopulated memory map, it adds both memory and CPU overhead
> > relatively to FLATMEM. Moreover, DISCONTINGMEM is generally considered
> > deprecated.
> >
> > The obvious replacement for DISCONTIGMEM would be SPARSEMEM, but it is also
> > less efficient than FLATMEM in pfn_to_page() and page_to_pfn() conversions.
> > Besides it requires tuning of SECTION_SIZE which is not trivial for
> > possible ARC memory configuration.
> >
> > Since the memory map for both banks is always allocated from the "lowmem"
> > bank, it is possible to use FLATMEM for two-bank configuration and simply
> > free the unused hole in the memory map. All is required for that is to
> > provide ARC-specific pfn_valid() that will take into account actual
> > physical memory configuration and define HAVE_ARCH_PFN_VALID.
> >
> > The resulting kernel image configured with defconfig + HIGHMEM=y is
> > smaller:
> >
> > $ size a/vmlinux b/vmlinux
> > text data bss dec hex filename
> > 4673503 1245456 279756 6198715 5e95bb a/vmlinux
> > 4658706 1246864 279756 6185326 5e616e b/vmlinux
> >
> > $ ./scripts/bloat-o-meter a/vmlinux b/vmlinux
> > add/remove: 28/30 grow/shrink: 42/399 up/down: 10986/-29025 (-18039)
> > ...
> > Total: Before=4709315, After=4691276, chg -0.38%
> >
> > Booting nSIM with haps_ns.dts results in the following memory usage
> > reports:
> >
> > a:
> > Memory: 1559104K/1572864K available (3531K kernel code, 595K rwdata, 752K rodata, 136K init, 275K bss, 13760K reserved, 0K cma-reserved, 1048576K highmem)
> >
> > b:
> > Memory: 1559112K/1572864K available (3519K kernel code, 594K rwdata, 752K rodata, 136K init, 280K bss, 13752K reserved, 0K cma-reserved, 1048576K highmem)
> >
> > Signed-off-by: Mike Rapoport <[email protected]>
>
> Sorry this fell through the cracks. Do you have a branch I can checkout
> and do a quick test.

It's in mmotm and in my tree:
https://git.kernel.org/pub/scm/linux/kernel/git/rppt/linux.git memory-models/rm-discontig/v0

> Thx,
> -Vineet
>
> > ---
> > arch/arc/Kconfig | 3 ++-
> > arch/arc/include/asm/page.h | 20 +++++++++++++++++---
> > arch/arc/mm/init.c | 29 ++++++++++++++++++++++-------
> > 3 files changed, 41 insertions(+), 11 deletions(-)
> >
> > diff --git a/arch/arc/Kconfig b/arch/arc/Kconfig
> > index 0a89cc9def65..c874f8ab0341 100644
> > --- a/arch/arc/Kconfig
> > +++ b/arch/arc/Kconfig
> > @@ -67,6 +67,7 @@ config GENERIC_CSUM
> >
> > config ARCH_DISCONTIGMEM_ENABLE
> > def_bool n
> > + depends on BROKEN
> >
> > config ARCH_FLATMEM_ENABLE
> > def_bool y
> > @@ -506,7 +507,7 @@ config LINUX_RAM_BASE
> >
> > config HIGHMEM
> > bool "High Memory Support"
> > - select ARCH_DISCONTIGMEM_ENABLE
> > + select HAVE_ARCH_PFN_VALID
> > help
> > With ARC 2G:2G address split, only upper 2G is directly addressable by
> > kernel. Enable this to potentially allow access to rest of 2G and PAE
> > diff --git a/arch/arc/include/asm/page.h b/arch/arc/include/asm/page.h
> > index b0dfed0f12be..23e41e890eda 100644
> > --- a/arch/arc/include/asm/page.h
> > +++ b/arch/arc/include/asm/page.h
> > @@ -82,11 +82,25 @@ typedef pte_t * pgtable_t;
> > */
> > #define virt_to_pfn(kaddr) (__pa(kaddr) >> PAGE_SHIFT)
> >
> > -#define ARCH_PFN_OFFSET virt_to_pfn(CONFIG_LINUX_RAM_BASE)
> > +/*
> > + * When HIGHMEM is enabled we have holes in the memory map so we need
> > + * pfn_valid() that takes into account the actual extents of the physical
> > + * memory
> > + */
> > +#ifdef CONFIG_HIGHMEM
> > +
> > +extern unsigned long arch_pfn_offset;
> > +#define ARCH_PFN_OFFSET arch_pfn_offset
> > +
> > +extern int pfn_valid(unsigned long pfn);
> > +#define pfn_valid pfn_valid
> >
> > -#ifdef CONFIG_FLATMEM
> > +#else /* CONFIG_HIGHMEM */
> > +
> > +#define ARCH_PFN_OFFSET virt_to_pfn(CONFIG_LINUX_RAM_BASE)
> > #define pfn_valid(pfn) (((pfn) - ARCH_PFN_OFFSET) < max_mapnr)
> > -#endif
> > +
> > +#endif /* CONFIG_HIGHMEM */
> >
> > /*
> > * __pa, __va, virt_to_page (ALERT: deprecated, don't use them)
> > diff --git a/arch/arc/mm/init.c b/arch/arc/mm/init.c
> > index 3a35b82a718e..ce07e697916c 100644
> > --- a/arch/arc/mm/init.c
> > +++ b/arch/arc/mm/init.c
> > @@ -28,6 +28,8 @@ static unsigned long low_mem_sz;
> > static unsigned long min_high_pfn, max_high_pfn;
> > static phys_addr_t high_mem_start;
> > static phys_addr_t high_mem_sz;
> > +unsigned long arch_pfn_offset;
> > +EXPORT_SYMBOL(arch_pfn_offset);
> > #endif
> >
> > #ifdef CONFIG_DISCONTIGMEM
> > @@ -98,16 +100,11 @@ void __init setup_arch_memory(void)
> > init_mm.brk = (unsigned long)_end;
> >
> > /* first page of system - kernel .vector starts here */
> > - min_low_pfn = ARCH_PFN_OFFSET;
> > + min_low_pfn = virt_to_pfn(CONFIG_LINUX_RAM_BASE);
> >
> > /* Last usable page of low mem */
> > max_low_pfn = max_pfn = PFN_DOWN(low_mem_start + low_mem_sz);
> >
> > -#ifdef CONFIG_FLATMEM
> > - /* pfn_valid() uses this */
> > - max_mapnr = max_low_pfn - min_low_pfn;
> > -#endif
> > -
> > /*------------- bootmem allocator setup -----------------------*/
> >
> > /*
> > @@ -153,7 +150,9 @@ void __init setup_arch_memory(void)
> > * DISCONTIGMEM in turns requires multiple nodes. node 0 above is
> > * populated with normal memory zone while node 1 only has highmem
> > */
> > +#ifdef CONFIG_DISCONTIGMEM
> > node_set_online(1);
> > +#endif
> >
> > min_high_pfn = PFN_DOWN(high_mem_start);
> > max_high_pfn = PFN_DOWN(high_mem_start + high_mem_sz);
> > @@ -161,8 +160,15 @@ void __init setup_arch_memory(void)
> > max_zone_pfn[ZONE_HIGHMEM] = min_low_pfn;
> >
> > high_memory = (void *)(min_high_pfn << PAGE_SHIFT);
> > +
> > + arch_pfn_offset = min(min_low_pfn, min_high_pfn);
> > kmap_init();
> > -#endif
> > +
> > +#else /* CONFIG_HIGHMEM */
> > + /* pfn_valid() uses this when FLATMEM=y and HIGHMEM=n */
> > + max_mapnr = max_low_pfn - min_low_pfn;
> > +
> > +#endif /* CONFIG_HIGHMEM */
> >
> > free_area_init(max_zone_pfn);
> > }
> > @@ -190,3 +196,12 @@ void __init mem_init(void)
> > highmem_init();
> > mem_init_print_info(NULL);
> > }
> > +
> > +#ifdef CONFIG_HIGHMEM
> > +int pfn_valid(unsigned long pfn)
> > +{
> > + return (pfn >= min_high_pfn && pfn <= max_high_pfn) ||
> > + (pfn >= min_low_pfn && pfn <= max_low_pfn);
> > +}
> > +EXPORT_SYMBOL(pfn_valid);
> > +#endif
>

--
Sincerely yours,
Mike.

Subject: Re: [PATCH v2 00/13] arch, mm: deprecate DISCONTIGMEM

Hi Mike!

On 11/17/20 7:23 AM, Mike Rapoport wrote:
> There were minor differences only for m68k between the versions. I've
> verified them on ARAnyM but if you have a real machine a run there would
> be nice.

I do have a real machine (Amiga 68060) but it's currently not set up (but it
will be in the near future). So I'm not sure if I can test the change within
a short time frame.

I will certainly report back when I run into issues on real hardware.

Thanks,
Adrian

--
.''`. John Paul Adrian Glaubitz
: :' : Debian Developer - [email protected]
`. `' Freie Universitaet Berlin - [email protected]
`- GPG: 62FF 8A75 84E0 2956 9546 0006 7426 3B37 F5B5 F913

2020-11-17 08:17:42

by Mike Rapoport

[permalink] [raw]
Subject: Re: [PATCH v2 00/13] arch, mm: deprecate DISCONTIGMEM

Hi Adrian,

On Tue, Nov 17, 2020 at 09:07:25AM +0100, John Paul Adrian Glaubitz wrote:
> Hi Mike!
>
> On 11/17/20 7:23 AM, Mike Rapoport wrote:
> > There were minor differences only for m68k between the versions. I've
> > verified them on ARAnyM but if you have a real machine a run there would
> > be nice.
>
> I do have a real machine (Amiga 68060) but it's currently not set up (but it
> will be in the near future). So I'm not sure if I can test the change within
> a short time frame.
>
> I will certainly report back when I run into issues on real hardware.

I hope there won't be any :)

--
Sincerely yours,
Mike.

Subject: Re: [PATCH v2 00/13] arch, mm: deprecate DISCONTIGMEM

Hi Mike!

On 11/17/20 7:23 AM, Mike Rapoport wrote:
>> Apologies for the late reply. Is this still relevant for testing?
>>
>> I have already successfully tested v1 of the patch set, shall I test v2?
>
> There were minor differences only for m68k between the versions. I've
> verified them on ARAnyM but if you have a real machine a run there would
> be nice.

I have just built a fresh kernel from the tip of Linus' tree and it boots
fine on my RX-2600:

root@glendronach:~# uname -a
Linux glendronach 5.10.0-rc6 #6 SMP Tue Dec 1 04:52:49 CET 2020 ia64 GNU/Linux
root@glendronach:~#

No issues observed so far. Looking at the git log, it seems these changes haven't
been merged for 5.10 yet. I assume they will be coming with 5.11?

Adrian

--
.''`. John Paul Adrian Glaubitz
: :' : Debian Developer - [email protected]
`. `' Freie Universitaet Berlin - [email protected]
`- GPG: 62FF 8A75 84E0 2956 9546 0006 7426 3B37 F5B5 F913