2021-05-11 10:06:46

by Mike Rapoport

[permalink] [raw]
Subject: [PATCH v4 0/4] arm64: drop pfn_valid_within() and simplify pfn_valid()

From: Mike Rapoport <[email protected]>

Hi,

These patches aim to remove CONFIG_HOLES_IN_ZONE and essentially hardwire
pfn_valid_within() to 1.

The idea is to mark NOMAP pages as reserved in the memory map and restore
the intended semantics of pfn_valid() to designate availability of struct
page for a pfn.

With this the core mm will be able to cope with the fact that it cannot use
NOMAP pages and the holes created by NOMAP ranges within MAX_ORDER blocks
will be treated correctly even without the need for pfn_valid_within.

The patches are boot tested on qemu-system-aarch64.

I beleive it would be best to route these via mmotm tree.

v4:
* rebase on v5.13-rc1

v3: Link: https://lore.kernel.org/lkml/[email protected]
* Fix minor issues found by Anshuman
* Freshen up the declaration of pfn_valid() to make it consistent with
pfn_is_map_memory()
* Add more Acked-by and Reviewed-by tags, thanks Anshuman and David

v2: Link: https://lore.kernel.org/lkml/[email protected]
* Add check for PFN overflow in pfn_is_map_memory()
* Add Acked-by and Reviewed-by tags, thanks David.

v1: Link: https://lore.kernel.org/lkml/[email protected]
* Add comment about the semantics of pfn_valid() as Anshuman suggested
* Extend comments about MEMBLOCK_NOMAP, per Anshuman
* Use pfn_is_map_memory() name for the exported wrapper for
memblock_is_map_memory(). It is still local to arch/arm64 in the end
because of header dependency issues.

rfc: Link: https://lore.kernel.org/lkml/[email protected]

Mike Rapoport (4):
include/linux/mmzone.h: add documentation for pfn_valid()
memblock: update initialization of reserved pages
arm64: decouple check whether pfn is in linear map from pfn_valid()
arm64: drop pfn_valid_within() and simplify pfn_valid()

arch/arm64/Kconfig | 3 ---
arch/arm64/include/asm/memory.h | 2 +-
arch/arm64/include/asm/page.h | 3 ++-
arch/arm64/kvm/mmu.c | 2 +-
arch/arm64/mm/init.c | 14 +++++++++++++-
arch/arm64/mm/ioremap.c | 4 ++--
arch/arm64/mm/mmu.c | 2 +-
include/linux/memblock.h | 4 +++-
include/linux/mmzone.h | 11 +++++++++++
mm/memblock.c | 28 ++++++++++++++++++++++++++--
10 files changed, 60 insertions(+), 13 deletions(-)


base-commit: 6efb943b8616ec53a5e444193dccf1af9ad627b5
--
2.28.0


2021-05-11 10:06:49

by Mike Rapoport

[permalink] [raw]
Subject: [PATCH v4 1/4] include/linux/mmzone.h: add documentation for pfn_valid()

From: Mike Rapoport <[email protected]>

Add comment describing the semantics of pfn_valid() that clarifies that
pfn_valid() only checks for availability of a memory map entry (i.e. struct
page) for a PFN rather than availability of usable memory backing that PFN.

The most "generic" version of pfn_valid() used by the configurations with
SPARSEMEM enabled resides in include/linux/mmzone.h so this is the most
suitable place for documentation about semantics of pfn_valid().

Suggested-by: Anshuman Khandual <[email protected]>
Signed-off-by: Mike Rapoport <[email protected]>
Reviewed-by: Anshuman Khandual <[email protected]>
---
include/linux/mmzone.h | 11 +++++++++++
1 file changed, 11 insertions(+)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 0d53eba1c383..e5945ca24df7 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -1427,6 +1427,17 @@ static inline int pfn_section_valid(struct mem_section *ms, unsigned long pfn)
#endif

#ifndef CONFIG_HAVE_ARCH_PFN_VALID
+/**
+ * pfn_valid - check if there is a valid memory map entry for a PFN
+ * @pfn: the page frame number to check
+ *
+ * Check if there is a valid memory map entry aka struct page for the @pfn.
+ * Note, that availability of the memory map entry does not imply that
+ * there is actual usable memory at that @pfn. The struct page may
+ * represent a hole or an unusable page frame.
+ *
+ * Return: 1 for PFNs that have memory map entries and 0 otherwise
+ */
static inline int pfn_valid(unsigned long pfn)
{
struct mem_section *ms;
--
2.28.0

2021-05-11 10:06:58

by Mike Rapoport

[permalink] [raw]
Subject: [PATCH v4 2/4] memblock: update initialization of reserved pages

From: Mike Rapoport <[email protected]>

The struct pages representing a reserved memory region are initialized
using reserve_bootmem_range() function. This function is called for each
reserved region just before the memory is freed from memblock to the buddy
page allocator.

The struct pages for MEMBLOCK_NOMAP regions are kept with the default
values set by the memory map initialization which makes it necessary to
have a special treatment for such pages in pfn_valid() and
pfn_valid_within().

Split out initialization of the reserved pages to a function with a
meaningful name and treat the MEMBLOCK_NOMAP regions the same way as the
reserved regions and mark struct pages for the NOMAP regions as
PageReserved.

Signed-off-by: Mike Rapoport <[email protected]>
Reviewed-by: David Hildenbrand <[email protected]>
Reviewed-by: Anshuman Khandual <[email protected]>
---
include/linux/memblock.h | 4 +++-
mm/memblock.c | 28 ++++++++++++++++++++++++++--
2 files changed, 29 insertions(+), 3 deletions(-)

diff --git a/include/linux/memblock.h b/include/linux/memblock.h
index 5984fff3f175..1b4c97c151ae 100644
--- a/include/linux/memblock.h
+++ b/include/linux/memblock.h
@@ -30,7 +30,9 @@ extern unsigned long long max_possible_pfn;
* @MEMBLOCK_NONE: no special request
* @MEMBLOCK_HOTPLUG: hotpluggable region
* @MEMBLOCK_MIRROR: mirrored region
- * @MEMBLOCK_NOMAP: don't add to kernel direct mapping
+ * @MEMBLOCK_NOMAP: don't add to kernel direct mapping and treat as
+ * reserved in the memory map; refer to memblock_mark_nomap() description
+ * for further details
*/
enum memblock_flags {
MEMBLOCK_NONE = 0x0, /* No special request */
diff --git a/mm/memblock.c b/mm/memblock.c
index afaefa8fc6ab..3abf2c3fea7f 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -906,6 +906,11 @@ int __init_memblock memblock_mark_mirror(phys_addr_t base, phys_addr_t size)
* @base: the base phys addr of the region
* @size: the size of the region
*
+ * The memory regions marked with %MEMBLOCK_NOMAP will not be added to the
+ * direct mapping of the physical memory. These regions will still be
+ * covered by the memory map. The struct page representing NOMAP memory
+ * frames in the memory map will be PageReserved()
+ *
* Return: 0 on success, -errno on failure.
*/
int __init_memblock memblock_mark_nomap(phys_addr_t base, phys_addr_t size)
@@ -2002,6 +2007,26 @@ static unsigned long __init __free_memory_core(phys_addr_t start,
return end_pfn - start_pfn;
}

+static void __init memmap_init_reserved_pages(void)
+{
+ struct memblock_region *region;
+ phys_addr_t start, end;
+ u64 i;
+
+ /* initialize struct pages for the reserved regions */
+ for_each_reserved_mem_range(i, &start, &end)
+ reserve_bootmem_region(start, end);
+
+ /* and also treat struct pages for the NOMAP regions as PageReserved */
+ for_each_mem_region(region) {
+ if (memblock_is_nomap(region)) {
+ start = region->base;
+ end = start + region->size;
+ reserve_bootmem_region(start, end);
+ }
+ }
+}
+
static unsigned long __init free_low_memory_core_early(void)
{
unsigned long count = 0;
@@ -2010,8 +2035,7 @@ static unsigned long __init free_low_memory_core_early(void)

memblock_clear_hotplug(0, -1);

- for_each_reserved_mem_range(i, &start, &end)
- reserve_bootmem_region(start, end);
+ memmap_init_reserved_pages();

/*
* We need to use NUMA_NO_NODE instead of NODE_DATA(0)->node_id
--
2.28.0

2021-05-11 10:07:22

by Mike Rapoport

[permalink] [raw]
Subject: [PATCH v4 3/4] arm64: decouple check whether pfn is in linear map from pfn_valid()

From: Mike Rapoport <[email protected]>

The intended semantics of pfn_valid() is to verify whether there is a
struct page for the pfn in question and nothing else.

Yet, on arm64 it is used to distinguish memory areas that are mapped in the
linear map vs those that require ioremap() to access them.

Introduce a dedicated pfn_is_map_memory() wrapper for
memblock_is_map_memory() to perform such check and use it where
appropriate.

Using a wrapper allows to avoid cyclic include dependencies.

While here also update style of pfn_valid() so that both pfn_valid() and
pfn_is_map_memory() declarations will be consistent.

Signed-off-by: Mike Rapoport <[email protected]>
Acked-by: David Hildenbrand <[email protected]>
---
arch/arm64/include/asm/memory.h | 2 +-
arch/arm64/include/asm/page.h | 3 ++-
arch/arm64/kvm/mmu.c | 2 +-
arch/arm64/mm/init.c | 12 ++++++++++++
arch/arm64/mm/ioremap.c | 4 ++--
arch/arm64/mm/mmu.c | 2 +-
6 files changed, 19 insertions(+), 6 deletions(-)

diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
index 87b90dc27a43..9027b7e16c4c 100644
--- a/arch/arm64/include/asm/memory.h
+++ b/arch/arm64/include/asm/memory.h
@@ -369,7 +369,7 @@ static inline void *phys_to_virt(phys_addr_t x)

#define virt_addr_valid(addr) ({ \
__typeof__(addr) __addr = __tag_reset(addr); \
- __is_lm_address(__addr) && pfn_valid(virt_to_pfn(__addr)); \
+ __is_lm_address(__addr) && pfn_is_map_memory(virt_to_pfn(__addr)); \
})

void dump_mem_limit(void);
diff --git a/arch/arm64/include/asm/page.h b/arch/arm64/include/asm/page.h
index 012cffc574e8..75ddfe671393 100644
--- a/arch/arm64/include/asm/page.h
+++ b/arch/arm64/include/asm/page.h
@@ -37,7 +37,8 @@ void copy_highpage(struct page *to, struct page *from);

typedef struct page *pgtable_t;

-extern int pfn_valid(unsigned long);
+int pfn_valid(unsigned long pfn);
+int pfn_is_map_memory(unsigned long pfn);

#include <asm/memory.h>

diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index c5d1f3c87dbd..470070073085 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -85,7 +85,7 @@ void kvm_flush_remote_tlbs(struct kvm *kvm)

static bool kvm_is_device_pfn(unsigned long pfn)
{
- return !pfn_valid(pfn);
+ return !pfn_is_map_memory(pfn);
}

static void *stage2_memcache_zalloc_page(void *arg)
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 16a2b2b1c54d..798f74f501d5 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -255,6 +255,18 @@ int pfn_valid(unsigned long pfn)
}
EXPORT_SYMBOL(pfn_valid);

+int pfn_is_map_memory(unsigned long pfn)
+{
+ phys_addr_t addr = PFN_PHYS(pfn);
+
+ /* avoid false positives for bogus PFNs, see comment in pfn_valid() */
+ if (PHYS_PFN(addr) != pfn)
+ return 0;
+
+ return memblock_is_map_memory(addr);
+}
+EXPORT_SYMBOL(pfn_is_map_memory);
+
static phys_addr_t memory_limit = PHYS_ADDR_MAX;

/*
diff --git a/arch/arm64/mm/ioremap.c b/arch/arm64/mm/ioremap.c
index b5e83c46b23e..b7c81dacabf0 100644
--- a/arch/arm64/mm/ioremap.c
+++ b/arch/arm64/mm/ioremap.c
@@ -43,7 +43,7 @@ static void __iomem *__ioremap_caller(phys_addr_t phys_addr, size_t size,
/*
* Don't allow RAM to be mapped.
*/
- if (WARN_ON(pfn_valid(__phys_to_pfn(phys_addr))))
+ if (WARN_ON(pfn_is_map_memory(__phys_to_pfn(phys_addr))))
return NULL;

area = get_vm_area_caller(size, VM_IOREMAP, caller);
@@ -84,7 +84,7 @@ EXPORT_SYMBOL(iounmap);
void __iomem *ioremap_cache(phys_addr_t phys_addr, size_t size)
{
/* For normal memory we already have a cacheable mapping. */
- if (pfn_valid(__phys_to_pfn(phys_addr)))
+ if (pfn_is_map_memory(__phys_to_pfn(phys_addr)))
return (void __iomem *)__phys_to_virt(phys_addr);

return __ioremap_caller(phys_addr, size, __pgprot(PROT_NORMAL),
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index 6dd9369e3ea0..ab5914cebd3c 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -82,7 +82,7 @@ void set_swapper_pgd(pgd_t *pgdp, pgd_t pgd)
pgprot_t phys_mem_access_prot(struct file *file, unsigned long pfn,
unsigned long size, pgprot_t vma_prot)
{
- if (!pfn_valid(pfn))
+ if (!pfn_is_map_memory(pfn))
return pgprot_noncached(vma_prot);
else if (file->f_flags & O_SYNC)
return pgprot_writecombine(vma_prot);
--
2.28.0

2021-05-11 10:08:50

by Mike Rapoport

[permalink] [raw]
Subject: [PATCH v4 4/4] arm64: drop pfn_valid_within() and simplify pfn_valid()

From: Mike Rapoport <[email protected]>

The arm64's version of pfn_valid() differs from the generic because of two
reasons:

* Parts of the memory map are freed during boot. This makes it necessary to
verify that there is actual physical memory that corresponds to a pfn
which is done by querying memblock.

* There are NOMAP memory regions. These regions are not mapped in the
linear map and until the previous commit the struct pages representing
these areas had default values.

As the consequence of absence of the special treatment of NOMAP regions in
the memory map it was necessary to use memblock_is_map_memory() in
pfn_valid() and to have pfn_valid_within() aliased to pfn_valid() so that
generic mm functionality would not treat a NOMAP page as a normal page.

Since the NOMAP regions are now marked as PageReserved(), pfn walkers and
the rest of core mm will treat them as unusable memory and thus
pfn_valid_within() is no longer required at all and can be disabled by
removing CONFIG_HOLES_IN_ZONE on arm64.

pfn_valid() can be slightly simplified by replacing
memblock_is_map_memory() with memblock_is_memory().

Signed-off-by: Mike Rapoport <[email protected]>
Acked-by: David Hildenbrand <[email protected]>
---
arch/arm64/Kconfig | 3 ---
arch/arm64/mm/init.c | 2 +-
2 files changed, 1 insertion(+), 4 deletions(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 9f1d8566bbf9..d7dc8698cf8e 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -1052,9 +1052,6 @@ config NEED_PER_CPU_EMBED_FIRST_CHUNK
def_bool y
depends on NUMA

-config HOLES_IN_ZONE
- def_bool y
-
source "kernel/Kconfig.hz"

config ARCH_SPARSEMEM_ENABLE
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 798f74f501d5..fb07218da2c0 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -251,7 +251,7 @@ int pfn_valid(unsigned long pfn)
if (!early_section(ms))
return pfn_section_valid(ms, pfn);

- return memblock_is_map_memory(addr);
+ return memblock_is_memory(addr);
}
EXPORT_SYMBOL(pfn_valid);

--
2.28.0

2021-05-11 10:25:34

by Ard Biesheuvel

[permalink] [raw]
Subject: Re: [PATCH v4 2/4] memblock: update initialization of reserved pages

On Tue, 11 May 2021 at 12:06, Mike Rapoport <[email protected]> wrote:
>
> From: Mike Rapoport <[email protected]>
>
> The struct pages representing a reserved memory region are initialized
> using reserve_bootmem_range() function. This function is called for each
> reserved region just before the memory is freed from memblock to the buddy
> page allocator.
>
> The struct pages for MEMBLOCK_NOMAP regions are kept with the default
> values set by the memory map initialization which makes it necessary to
> have a special treatment for such pages in pfn_valid() and
> pfn_valid_within().
>
> Split out initialization of the reserved pages to a function with a
> meaningful name and treat the MEMBLOCK_NOMAP regions the same way as the
> reserved regions and mark struct pages for the NOMAP regions as
> PageReserved.
>
> Signed-off-by: Mike Rapoport <[email protected]>
> Reviewed-by: David Hildenbrand <[email protected]>
> Reviewed-by: Anshuman Khandual <[email protected]>

Acked-by: Ard Biesheuvel <[email protected]>

> ---
> include/linux/memblock.h | 4 +++-
> mm/memblock.c | 28 ++++++++++++++++++++++++++--
> 2 files changed, 29 insertions(+), 3 deletions(-)
>
> diff --git a/include/linux/memblock.h b/include/linux/memblock.h
> index 5984fff3f175..1b4c97c151ae 100644
> --- a/include/linux/memblock.h
> +++ b/include/linux/memblock.h
> @@ -30,7 +30,9 @@ extern unsigned long long max_possible_pfn;
> * @MEMBLOCK_NONE: no special request
> * @MEMBLOCK_HOTPLUG: hotpluggable region
> * @MEMBLOCK_MIRROR: mirrored region
> - * @MEMBLOCK_NOMAP: don't add to kernel direct mapping
> + * @MEMBLOCK_NOMAP: don't add to kernel direct mapping and treat as
> + * reserved in the memory map; refer to memblock_mark_nomap() description
> + * for further details
> */
> enum memblock_flags {
> MEMBLOCK_NONE = 0x0, /* No special request */
> diff --git a/mm/memblock.c b/mm/memblock.c
> index afaefa8fc6ab..3abf2c3fea7f 100644
> --- a/mm/memblock.c
> +++ b/mm/memblock.c
> @@ -906,6 +906,11 @@ int __init_memblock memblock_mark_mirror(phys_addr_t base, phys_addr_t size)
> * @base: the base phys addr of the region
> * @size: the size of the region
> *
> + * The memory regions marked with %MEMBLOCK_NOMAP will not be added to the
> + * direct mapping of the physical memory. These regions will still be
> + * covered by the memory map. The struct page representing NOMAP memory
> + * frames in the memory map will be PageReserved()
> + *
> * Return: 0 on success, -errno on failure.
> */
> int __init_memblock memblock_mark_nomap(phys_addr_t base, phys_addr_t size)
> @@ -2002,6 +2007,26 @@ static unsigned long __init __free_memory_core(phys_addr_t start,
> return end_pfn - start_pfn;
> }
>
> +static void __init memmap_init_reserved_pages(void)
> +{
> + struct memblock_region *region;
> + phys_addr_t start, end;
> + u64 i;
> +
> + /* initialize struct pages for the reserved regions */
> + for_each_reserved_mem_range(i, &start, &end)
> + reserve_bootmem_region(start, end);
> +
> + /* and also treat struct pages for the NOMAP regions as PageReserved */
> + for_each_mem_region(region) {
> + if (memblock_is_nomap(region)) {
> + start = region->base;
> + end = start + region->size;
> + reserve_bootmem_region(start, end);
> + }
> + }
> +}
> +
> static unsigned long __init free_low_memory_core_early(void)
> {
> unsigned long count = 0;
> @@ -2010,8 +2035,7 @@ static unsigned long __init free_low_memory_core_early(void)
>
> memblock_clear_hotplug(0, -1);
>
> - for_each_reserved_mem_range(i, &start, &end)
> - reserve_bootmem_region(start, end);
> + memmap_init_reserved_pages();
>
> /*
> * We need to use NUMA_NO_NODE instead of NODE_DATA(0)->node_id
> --
> 2.28.0
>

2021-05-11 10:26:26

by Ard Biesheuvel

[permalink] [raw]
Subject: Re: [PATCH v4 1/4] include/linux/mmzone.h: add documentation for pfn_valid()

On Tue, 11 May 2021 at 12:06, Mike Rapoport <[email protected]> wrote:
>
> From: Mike Rapoport <[email protected]>
>
> Add comment describing the semantics of pfn_valid() that clarifies that
> pfn_valid() only checks for availability of a memory map entry (i.e. struct
> page) for a PFN rather than availability of usable memory backing that PFN.
>
> The most "generic" version of pfn_valid() used by the configurations with
> SPARSEMEM enabled resides in include/linux/mmzone.h so this is the most
> suitable place for documentation about semantics of pfn_valid().
>
> Suggested-by: Anshuman Khandual <[email protected]>
> Signed-off-by: Mike Rapoport <[email protected]>
> Reviewed-by: Anshuman Khandual <[email protected]>

Acked-by: Ard Biesheuvel <[email protected]>

> ---
> include/linux/mmzone.h | 11 +++++++++++
> 1 file changed, 11 insertions(+)
>
> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> index 0d53eba1c383..e5945ca24df7 100644
> --- a/include/linux/mmzone.h
> +++ b/include/linux/mmzone.h
> @@ -1427,6 +1427,17 @@ static inline int pfn_section_valid(struct mem_section *ms, unsigned long pfn)
> #endif
>
> #ifndef CONFIG_HAVE_ARCH_PFN_VALID
> +/**
> + * pfn_valid - check if there is a valid memory map entry for a PFN
> + * @pfn: the page frame number to check
> + *
> + * Check if there is a valid memory map entry aka struct page for the @pfn.
> + * Note, that availability of the memory map entry does not imply that
> + * there is actual usable memory at that @pfn. The struct page may
> + * represent a hole or an unusable page frame.
> + *
> + * Return: 1 for PFNs that have memory map entries and 0 otherwise
> + */
> static inline int pfn_valid(unsigned long pfn)
> {
> struct mem_section *ms;
> --
> 2.28.0
>

2021-05-11 10:26:41

by Ard Biesheuvel

[permalink] [raw]
Subject: Re: [PATCH v4 3/4] arm64: decouple check whether pfn is in linear map from pfn_valid()

On Tue, 11 May 2021 at 12:06, Mike Rapoport <[email protected]> wrote:
>
> From: Mike Rapoport <[email protected]>
>
> The intended semantics of pfn_valid() is to verify whether there is a
> struct page for the pfn in question and nothing else.
>
> Yet, on arm64 it is used to distinguish memory areas that are mapped in the
> linear map vs those that require ioremap() to access them.
>
> Introduce a dedicated pfn_is_map_memory() wrapper for
> memblock_is_map_memory() to perform such check and use it where
> appropriate.
>
> Using a wrapper allows to avoid cyclic include dependencies.
>
> While here also update style of pfn_valid() so that both pfn_valid() and
> pfn_is_map_memory() declarations will be consistent.
>
> Signed-off-by: Mike Rapoport <[email protected]>
> Acked-by: David Hildenbrand <[email protected]>

Acked-by: Ard Biesheuvel <[email protected]>

> ---
> arch/arm64/include/asm/memory.h | 2 +-
> arch/arm64/include/asm/page.h | 3 ++-
> arch/arm64/kvm/mmu.c | 2 +-
> arch/arm64/mm/init.c | 12 ++++++++++++
> arch/arm64/mm/ioremap.c | 4 ++--
> arch/arm64/mm/mmu.c | 2 +-
> 6 files changed, 19 insertions(+), 6 deletions(-)
>
> diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
> index 87b90dc27a43..9027b7e16c4c 100644
> --- a/arch/arm64/include/asm/memory.h
> +++ b/arch/arm64/include/asm/memory.h
> @@ -369,7 +369,7 @@ static inline void *phys_to_virt(phys_addr_t x)
>
> #define virt_addr_valid(addr) ({ \
> __typeof__(addr) __addr = __tag_reset(addr); \
> - __is_lm_address(__addr) && pfn_valid(virt_to_pfn(__addr)); \
> + __is_lm_address(__addr) && pfn_is_map_memory(virt_to_pfn(__addr)); \
> })
>
> void dump_mem_limit(void);
> diff --git a/arch/arm64/include/asm/page.h b/arch/arm64/include/asm/page.h
> index 012cffc574e8..75ddfe671393 100644
> --- a/arch/arm64/include/asm/page.h
> +++ b/arch/arm64/include/asm/page.h
> @@ -37,7 +37,8 @@ void copy_highpage(struct page *to, struct page *from);
>
> typedef struct page *pgtable_t;
>
> -extern int pfn_valid(unsigned long);
> +int pfn_valid(unsigned long pfn);
> +int pfn_is_map_memory(unsigned long pfn);
>
> #include <asm/memory.h>
>
> diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
> index c5d1f3c87dbd..470070073085 100644
> --- a/arch/arm64/kvm/mmu.c
> +++ b/arch/arm64/kvm/mmu.c
> @@ -85,7 +85,7 @@ void kvm_flush_remote_tlbs(struct kvm *kvm)
>
> static bool kvm_is_device_pfn(unsigned long pfn)
> {
> - return !pfn_valid(pfn);
> + return !pfn_is_map_memory(pfn);
> }
>
> static void *stage2_memcache_zalloc_page(void *arg)
> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
> index 16a2b2b1c54d..798f74f501d5 100644
> --- a/arch/arm64/mm/init.c
> +++ b/arch/arm64/mm/init.c
> @@ -255,6 +255,18 @@ int pfn_valid(unsigned long pfn)
> }
> EXPORT_SYMBOL(pfn_valid);
>
> +int pfn_is_map_memory(unsigned long pfn)
> +{
> + phys_addr_t addr = PFN_PHYS(pfn);
> +
> + /* avoid false positives for bogus PFNs, see comment in pfn_valid() */
> + if (PHYS_PFN(addr) != pfn)
> + return 0;
> +
> + return memblock_is_map_memory(addr);
> +}
> +EXPORT_SYMBOL(pfn_is_map_memory);
> +
> static phys_addr_t memory_limit = PHYS_ADDR_MAX;
>
> /*
> diff --git a/arch/arm64/mm/ioremap.c b/arch/arm64/mm/ioremap.c
> index b5e83c46b23e..b7c81dacabf0 100644
> --- a/arch/arm64/mm/ioremap.c
> +++ b/arch/arm64/mm/ioremap.c
> @@ -43,7 +43,7 @@ static void __iomem *__ioremap_caller(phys_addr_t phys_addr, size_t size,
> /*
> * Don't allow RAM to be mapped.
> */
> - if (WARN_ON(pfn_valid(__phys_to_pfn(phys_addr))))
> + if (WARN_ON(pfn_is_map_memory(__phys_to_pfn(phys_addr))))
> return NULL;
>
> area = get_vm_area_caller(size, VM_IOREMAP, caller);
> @@ -84,7 +84,7 @@ EXPORT_SYMBOL(iounmap);
> void __iomem *ioremap_cache(phys_addr_t phys_addr, size_t size)
> {
> /* For normal memory we already have a cacheable mapping. */
> - if (pfn_valid(__phys_to_pfn(phys_addr)))
> + if (pfn_is_map_memory(__phys_to_pfn(phys_addr)))
> return (void __iomem *)__phys_to_virt(phys_addr);
>
> return __ioremap_caller(phys_addr, size, __pgprot(PROT_NORMAL),
> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
> index 6dd9369e3ea0..ab5914cebd3c 100644
> --- a/arch/arm64/mm/mmu.c
> +++ b/arch/arm64/mm/mmu.c
> @@ -82,7 +82,7 @@ void set_swapper_pgd(pgd_t *pgdp, pgd_t pgd)
> pgprot_t phys_mem_access_prot(struct file *file, unsigned long pfn,
> unsigned long size, pgprot_t vma_prot)
> {
> - if (!pfn_valid(pfn))
> + if (!pfn_is_map_memory(pfn))
> return pgprot_noncached(vma_prot);
> else if (file->f_flags & O_SYNC)
> return pgprot_writecombine(vma_prot);
> --
> 2.28.0
>

2021-05-11 10:29:25

by Ard Biesheuvel

[permalink] [raw]
Subject: Re: [PATCH v4 4/4] arm64: drop pfn_valid_within() and simplify pfn_valid()

On Tue, 11 May 2021 at 12:06, Mike Rapoport <[email protected]> wrote:
>
> From: Mike Rapoport <[email protected]>
>
> The arm64's version of pfn_valid() differs from the generic because of two
> reasons:
>
> * Parts of the memory map are freed during boot. This makes it necessary to
> verify that there is actual physical memory that corresponds to a pfn
> which is done by querying memblock.
>
> * There are NOMAP memory regions. These regions are not mapped in the
> linear map and until the previous commit the struct pages representing
> these areas had default values.
>
> As the consequence of absence of the special treatment of NOMAP regions in
> the memory map it was necessary to use memblock_is_map_memory() in
> pfn_valid() and to have pfn_valid_within() aliased to pfn_valid() so that
> generic mm functionality would not treat a NOMAP page as a normal page.
>
> Since the NOMAP regions are now marked as PageReserved(), pfn walkers and
> the rest of core mm will treat them as unusable memory and thus
> pfn_valid_within() is no longer required at all and can be disabled by
> removing CONFIG_HOLES_IN_ZONE on arm64.
>
> pfn_valid() can be slightly simplified by replacing
> memblock_is_map_memory() with memblock_is_memory().
>
> Signed-off-by: Mike Rapoport <[email protected]>
> Acked-by: David Hildenbrand <[email protected]>

Acked-by: Ard Biesheuvel <[email protected]>

... and many thanks for cleaning this up.


> ---
> arch/arm64/Kconfig | 3 ---
> arch/arm64/mm/init.c | 2 +-
> 2 files changed, 1 insertion(+), 4 deletions(-)
>
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index 9f1d8566bbf9..d7dc8698cf8e 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -1052,9 +1052,6 @@ config NEED_PER_CPU_EMBED_FIRST_CHUNK
> def_bool y
> depends on NUMA
>
> -config HOLES_IN_ZONE
> - def_bool y
> -
> source "kernel/Kconfig.hz"
>
> config ARCH_SPARSEMEM_ENABLE
> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
> index 798f74f501d5..fb07218da2c0 100644
> --- a/arch/arm64/mm/init.c
> +++ b/arch/arm64/mm/init.c
> @@ -251,7 +251,7 @@ int pfn_valid(unsigned long pfn)
> if (!early_section(ms))
> return pfn_section_valid(ms, pfn);
>
> - return memblock_is_map_memory(addr);
> + return memblock_is_memory(addr);
> }
> EXPORT_SYMBOL(pfn_valid);
>
> --
> 2.28.0
>

2021-05-11 23:41:15

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH v4 4/4] arm64: drop pfn_valid_within() and simplify pfn_valid()

On Tue, 11 May 2021 13:05:50 +0300 Mike Rapoport <[email protected]> wrote:

> From: Mike Rapoport <[email protected]>
>
> The arm64's version of pfn_valid() differs from the generic because of two
> reasons:
>
> * Parts of the memory map are freed during boot. This makes it necessary to
> verify that there is actual physical memory that corresponds to a pfn
> which is done by querying memblock.
>
> * There are NOMAP memory regions. These regions are not mapped in the
> linear map and until the previous commit the struct pages representing
> these areas had default values.
>
> As the consequence of absence of the special treatment of NOMAP regions in
> the memory map it was necessary to use memblock_is_map_memory() in
> pfn_valid() and to have pfn_valid_within() aliased to pfn_valid() so that
> generic mm functionality would not treat a NOMAP page as a normal page.
>
> Since the NOMAP regions are now marked as PageReserved(), pfn walkers and
> the rest of core mm will treat them as unusable memory and thus
> pfn_valid_within() is no longer required at all and can be disabled by
> removing CONFIG_HOLES_IN_ZONE on arm64.
>
> pfn_valid() can be slightly simplified by replacing
> memblock_is_map_memory() with memblock_is_memory().
>
> ...
>
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -1052,9 +1052,6 @@ config NEED_PER_CPU_EMBED_FIRST_CHUNK
> def_bool y
> depends on NUMA
>
> -config HOLES_IN_ZONE
> - def_bool y
> -
> source "kernel/Kconfig.hz"
>
> config ARCH_SPARSEMEM_ENABLE

https://lkml.kernel.org/r/[email protected]
already did this, so I simply dropped that hunk? And I don't think the
changelog needs updating for this?


> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
> index 798f74f501d5..fb07218da2c0 100644
> --- a/arch/arm64/mm/init.c
> +++ b/arch/arm64/mm/init.c
> @@ -251,7 +251,7 @@ int pfn_valid(unsigned long pfn)
> if (!early_section(ms))
> return pfn_section_valid(ms, pfn);
>
> - return memblock_is_map_memory(addr);
> + return memblock_is_memory(addr);
> }
> EXPORT_SYMBOL(pfn_valid);

2021-05-12 03:14:40

by Kefeng Wang

[permalink] [raw]
Subject: Re: [PATCH v4 0/4] arm64: drop pfn_valid_within() and simplify pfn_valid()



On 2021/5/11 18:05, Mike Rapoport wrote:
> From: Mike Rapoport <[email protected]>
>
> Hi,
>
> These patches aim to remove CONFIG_HOLES_IN_ZONE and essentially hardwire
> pfn_valid_within() to 1.
>
> The idea is to mark NOMAP pages as reserved in the memory map and restore
> the intended semantics of pfn_valid() to designate availability of struct
> page for a pfn.
>
> With this the core mm will be able to cope with the fact that it cannot use
> NOMAP pages and the holes created by NOMAP ranges within MAX_ORDER blocks
> will be treated correctly even without the need for pfn_valid_within.
>
> The patches are boot tested on qemu-system-aarch64.
>
> I beleive it would be best to route these via mmotm tree.

Reviewed-by: Kefeng Wang <[email protected]>

2021-05-12 05:33:18

by Mike Rapoport

[permalink] [raw]
Subject: Re: [PATCH v4 4/4] arm64: drop pfn_valid_within() and simplify pfn_valid()

On Tue, May 11, 2021 at 04:40:01PM -0700, Andrew Morton wrote:
> On Tue, 11 May 2021 13:05:50 +0300 Mike Rapoport <[email protected]> wrote:
>
> > From: Mike Rapoport <[email protected]>
> >
> > The arm64's version of pfn_valid() differs from the generic because of two
> > reasons:
> >
> > * Parts of the memory map are freed during boot. This makes it necessary to
> > verify that there is actual physical memory that corresponds to a pfn
> > which is done by querying memblock.
> >
> > * There are NOMAP memory regions. These regions are not mapped in the
> > linear map and until the previous commit the struct pages representing
> > these areas had default values.
> >
> > As the consequence of absence of the special treatment of NOMAP regions in
> > the memory map it was necessary to use memblock_is_map_memory() in
> > pfn_valid() and to have pfn_valid_within() aliased to pfn_valid() so that
> > generic mm functionality would not treat a NOMAP page as a normal page.
> >
> > Since the NOMAP regions are now marked as PageReserved(), pfn walkers and
> > the rest of core mm will treat them as unusable memory and thus
> > pfn_valid_within() is no longer required at all and can be disabled by
> > removing CONFIG_HOLES_IN_ZONE on arm64.
> >
> > pfn_valid() can be slightly simplified by replacing
> > memblock_is_map_memory() with memblock_is_memory().
> >
> > ...
> >
> > --- a/arch/arm64/Kconfig
> > +++ b/arch/arm64/Kconfig
> > @@ -1052,9 +1052,6 @@ config NEED_PER_CPU_EMBED_FIRST_CHUNK
> > def_bool y
> > depends on NUMA
> >
> > -config HOLES_IN_ZONE
> > - def_bool y
> > -
> > source "kernel/Kconfig.hz"
> >
> > config ARCH_SPARSEMEM_ENABLE
>
> https://lkml.kernel.org/r/[email protected]
> already did this, so I simply dropped that hunk?
> And I don't think the changelog needs updating for this?

We need another hunk instead (below)

> And I don't think the changelog needs updating for this?

maybe "s/disabled by removing CONFIG_HOLES_IN_ZONE/disabled/", but does not
seem that important to me.

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 3d6c7436a2fa..d7dc8698cf8e 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -201,7 +201,6 @@ config ARM64
select HAVE_KPROBES
select HAVE_KRETPROBES
select HAVE_GENERIC_VDSO
- select HOLES_IN_ZONE
select IOMMU_DMA if IOMMU_SUPPORT
select IRQ_DOMAIN
select IRQ_FORCED_THREADING

2021-05-12 07:01:53

by Ard Biesheuvel

[permalink] [raw]
Subject: Re: [PATCH v4 0/4] arm64: drop pfn_valid_within() and simplify pfn_valid()

On Tue, 11 May 2021 at 12:05, Mike Rapoport <[email protected]> wrote:
>
> From: Mike Rapoport <[email protected]>
>
> Hi,
>
> These patches aim to remove CONFIG_HOLES_IN_ZONE and essentially hardwire
> pfn_valid_within() to 1.
>
> The idea is to mark NOMAP pages as reserved in the memory map and restore
> the intended semantics of pfn_valid() to designate availability of struct
> page for a pfn.
>
> With this the core mm will be able to cope with the fact that it cannot use
> NOMAP pages and the holes created by NOMAP ranges within MAX_ORDER blocks
> will be treated correctly even without the need for pfn_valid_within.
>
> The patches are boot tested on qemu-system-aarch64.
>

Did you use EFI boot when testing this? The memory map is much more
fragmented in that case, so this would be a good data point.


> I beleive it would be best to route these via mmotm tree.
>
> v4:
> * rebase on v5.13-rc1
>
> v3: Link: https://lore.kernel.org/lkml/[email protected]
> * Fix minor issues found by Anshuman
> * Freshen up the declaration of pfn_valid() to make it consistent with
> pfn_is_map_memory()
> * Add more Acked-by and Reviewed-by tags, thanks Anshuman and David
>
> v2: Link: https://lore.kernel.org/lkml/[email protected]
> * Add check for PFN overflow in pfn_is_map_memory()
> * Add Acked-by and Reviewed-by tags, thanks David.
>
> v1: Link: https://lore.kernel.org/lkml/[email protected]
> * Add comment about the semantics of pfn_valid() as Anshuman suggested
> * Extend comments about MEMBLOCK_NOMAP, per Anshuman
> * Use pfn_is_map_memory() name for the exported wrapper for
> memblock_is_map_memory(). It is still local to arch/arm64 in the end
> because of header dependency issues.
>
> rfc: Link: https://lore.kernel.org/lkml/[email protected]
>
> Mike Rapoport (4):
> include/linux/mmzone.h: add documentation for pfn_valid()
> memblock: update initialization of reserved pages
> arm64: decouple check whether pfn is in linear map from pfn_valid()
> arm64: drop pfn_valid_within() and simplify pfn_valid()
>
> arch/arm64/Kconfig | 3 ---
> arch/arm64/include/asm/memory.h | 2 +-
> arch/arm64/include/asm/page.h | 3 ++-
> arch/arm64/kvm/mmu.c | 2 +-
> arch/arm64/mm/init.c | 14 +++++++++++++-
> arch/arm64/mm/ioremap.c | 4 ++--
> arch/arm64/mm/mmu.c | 2 +-
> include/linux/memblock.h | 4 +++-
> include/linux/mmzone.h | 11 +++++++++++
> mm/memblock.c | 28 ++++++++++++++++++++++++++--
> 10 files changed, 60 insertions(+), 13 deletions(-)
>
>
> base-commit: 6efb943b8616ec53a5e444193dccf1af9ad627b5
> --
> 2.28.0
>

2021-05-12 07:35:07

by Mike Rapoport

[permalink] [raw]
Subject: Re: [PATCH v4 0/4] arm64: drop pfn_valid_within() and simplify pfn_valid()

On Wed, May 12, 2021 at 09:00:02AM +0200, Ard Biesheuvel wrote:
> On Tue, 11 May 2021 at 12:05, Mike Rapoport <[email protected]> wrote:
> >
> > From: Mike Rapoport <[email protected]>
> >
> > Hi,
> >
> > These patches aim to remove CONFIG_HOLES_IN_ZONE and essentially hardwire
> > pfn_valid_within() to 1.
> >
> > The idea is to mark NOMAP pages as reserved in the memory map and restore
> > the intended semantics of pfn_valid() to designate availability of struct
> > page for a pfn.
> >
> > With this the core mm will be able to cope with the fact that it cannot use
> > NOMAP pages and the holes created by NOMAP ranges within MAX_ORDER blocks
> > will be treated correctly even without the need for pfn_valid_within.
> >
> > The patches are boot tested on qemu-system-aarch64.
> >
>
> Did you use EFI boot when testing this? The memory map is much more
> fragmented in that case, so this would be a good data point.

Right, something like this:

[ 0.000000] Early memory node ranges
[ 0.000000] node 0: [mem 0x0000000040000000-0x00000000ffffbfff]
[ 0.000000] node 0: [mem 0x00000000ffffc000-0x00000000ffffffff]
[ 0.000000] node 0: [mem 0x0000000100000000-0x00000004386fffff]
[ 0.000000] node 0: [mem 0x0000000438700000-0x000000043899ffff]
[ 0.000000] node 0: [mem 0x00000004389a0000-0x00000004389bffff]
[ 0.000000] node 0: [mem 0x00000004389c0000-0x0000000438b5ffff]
[ 0.000000] node 0: [mem 0x0000000438b60000-0x000000043be3ffff]
[ 0.000000] node 0: [mem 0x000000043be40000-0x000000043becffff]
[ 0.000000] node 0: [mem 0x000000043bed0000-0x000000043bedffff]
[ 0.000000] node 0: [mem 0x000000043bee0000-0x000000043bffffff]
[ 0.000000] node 0: [mem 0x000000043c000000-0x000000043fffffff]

This is a pity really, because I don't see a fundamental reason for those
tiny holes all over the place.

I know that EFI/ACPI mandates "IO style" memory access for those regions,
but I fail to get why...

> > I beleive it would be best to route these via mmotm tree.
> >
> > v4:
> > * rebase on v5.13-rc1
> >
> > v3: Link: https://lore.kernel.org/lkml/[email protected]
> > * Fix minor issues found by Anshuman
> > * Freshen up the declaration of pfn_valid() to make it consistent with
> > pfn_is_map_memory()
> > * Add more Acked-by and Reviewed-by tags, thanks Anshuman and David
> >
> > v2: Link: https://lore.kernel.org/lkml/[email protected]
> > * Add check for PFN overflow in pfn_is_map_memory()
> > * Add Acked-by and Reviewed-by tags, thanks David.
> >
> > v1: Link: https://lore.kernel.org/lkml/[email protected]
> > * Add comment about the semantics of pfn_valid() as Anshuman suggested
> > * Extend comments about MEMBLOCK_NOMAP, per Anshuman
> > * Use pfn_is_map_memory() name for the exported wrapper for
> > memblock_is_map_memory(). It is still local to arch/arm64 in the end
> > because of header dependency issues.
> >
> > rfc: Link: https://lore.kernel.org/lkml/[email protected]
> >
> > Mike Rapoport (4):
> > include/linux/mmzone.h: add documentation for pfn_valid()
> > memblock: update initialization of reserved pages
> > arm64: decouple check whether pfn is in linear map from pfn_valid()
> > arm64: drop pfn_valid_within() and simplify pfn_valid()
> >
> > arch/arm64/Kconfig | 3 ---
> > arch/arm64/include/asm/memory.h | 2 +-
> > arch/arm64/include/asm/page.h | 3 ++-
> > arch/arm64/kvm/mmu.c | 2 +-
> > arch/arm64/mm/init.c | 14 +++++++++++++-
> > arch/arm64/mm/ioremap.c | 4 ++--
> > arch/arm64/mm/mmu.c | 2 +-
> > include/linux/memblock.h | 4 +++-
> > include/linux/mmzone.h | 11 +++++++++++
> > mm/memblock.c | 28 ++++++++++++++++++++++++++--
> > 10 files changed, 60 insertions(+), 13 deletions(-)
> >
> >
> > base-commit: 6efb943b8616ec53a5e444193dccf1af9ad627b5
> > --
> > 2.28.0
> >

--
Sincerely yours,
Mike.

2021-05-12 08:00:33

by Ard Biesheuvel

[permalink] [raw]
Subject: Re: [PATCH v4 0/4] arm64: drop pfn_valid_within() and simplify pfn_valid()

On Wed, 12 May 2021 at 09:34, Mike Rapoport <[email protected]> wrote:
>
> On Wed, May 12, 2021 at 09:00:02AM +0200, Ard Biesheuvel wrote:
> > On Tue, 11 May 2021 at 12:05, Mike Rapoport <[email protected]> wrote:
> > >
> > > From: Mike Rapoport <[email protected]>
> > >
> > > Hi,
> > >
> > > These patches aim to remove CONFIG_HOLES_IN_ZONE and essentially hardwire
> > > pfn_valid_within() to 1.
> > >
> > > The idea is to mark NOMAP pages as reserved in the memory map and restore
> > > the intended semantics of pfn_valid() to designate availability of struct
> > > page for a pfn.
> > >
> > > With this the core mm will be able to cope with the fact that it cannot use
> > > NOMAP pages and the holes created by NOMAP ranges within MAX_ORDER blocks
> > > will be treated correctly even without the need for pfn_valid_within.
> > >
> > > The patches are boot tested on qemu-system-aarch64.
> > >
> >
> > Did you use EFI boot when testing this? The memory map is much more
> > fragmented in that case, so this would be a good data point.
>
> Right, something like this:
>

Yes, although it is not always that bad.

> [ 0.000000] Early memory node ranges
> [ 0.000000] node 0: [mem 0x0000000040000000-0x00000000ffffbfff]
> [ 0.000000] node 0: [mem 0x00000000ffffc000-0x00000000ffffffff]

This is allocated below 4 GB by the firmware, for reasons that are
only valid on x86 (where some of the early boot chain is IA32 only)

> [ 0.000000] node 0: [mem 0x0000000100000000-0x00000004386fffff]
> [ 0.000000] node 0: [mem 0x0000000438700000-0x000000043899ffff]
> [ 0.000000] node 0: [mem 0x00000004389a0000-0x00000004389bffff]
> [ 0.000000] node 0: [mem 0x00000004389c0000-0x0000000438b5ffff]
> [ 0.000000] node 0: [mem 0x0000000438b60000-0x000000043be3ffff]
> [ 0.000000] node 0: [mem 0x000000043be40000-0x000000043becffff]
> [ 0.000000] node 0: [mem 0x000000043bed0000-0x000000043bedffff]
> [ 0.000000] node 0: [mem 0x000000043bee0000-0x000000043bffffff]
> [ 0.000000] node 0: [mem 0x000000043c000000-0x000000043fffffff]
>
> This is a pity really, because I don't see a fundamental reason for those
> tiny holes all over the place.
>

There is a config option in the firmware build that allows these
regions to be preallocated using larger windows, which greatly reduces
the fragmentation.
> I know that EFI/ACPI mandates "IO style" memory access for those regions,
> but I fail to get why...
>

Not sure what you mean by 'IO style memory access'.



> > > I beleive it would be best to route these via mmotm tree.
> > >
> > > v4:
> > > * rebase on v5.13-rc1
> > >
> > > v3: Link: https://lore.kernel.org/lkml/[email protected]
> > > * Fix minor issues found by Anshuman
> > > * Freshen up the declaration of pfn_valid() to make it consistent with
> > > pfn_is_map_memory()
> > > * Add more Acked-by and Reviewed-by tags, thanks Anshuman and David
> > >
> > > v2: Link: https://lore.kernel.org/lkml/[email protected]
> > > * Add check for PFN overflow in pfn_is_map_memory()
> > > * Add Acked-by and Reviewed-by tags, thanks David.
> > >
> > > v1: Link: https://lore.kernel.org/lkml/[email protected]
> > > * Add comment about the semantics of pfn_valid() as Anshuman suggested
> > > * Extend comments about MEMBLOCK_NOMAP, per Anshuman
> > > * Use pfn_is_map_memory() name for the exported wrapper for
> > > memblock_is_map_memory(). It is still local to arch/arm64 in the end
> > > because of header dependency issues.
> > >
> > > rfc: Link: https://lore.kernel.org/lkml/[email protected]
> > >
> > > Mike Rapoport (4):
> > > include/linux/mmzone.h: add documentation for pfn_valid()
> > > memblock: update initialization of reserved pages
> > > arm64: decouple check whether pfn is in linear map from pfn_valid()
> > > arm64: drop pfn_valid_within() and simplify pfn_valid()
> > >
> > > arch/arm64/Kconfig | 3 ---
> > > arch/arm64/include/asm/memory.h | 2 +-
> > > arch/arm64/include/asm/page.h | 3 ++-
> > > arch/arm64/kvm/mmu.c | 2 +-
> > > arch/arm64/mm/init.c | 14 +++++++++++++-
> > > arch/arm64/mm/ioremap.c | 4 ++--
> > > arch/arm64/mm/mmu.c | 2 +-
> > > include/linux/memblock.h | 4 +++-
> > > include/linux/mmzone.h | 11 +++++++++++
> > > mm/memblock.c | 28 ++++++++++++++++++++++++++--
> > > 10 files changed, 60 insertions(+), 13 deletions(-)
> > >
> > >
> > > base-commit: 6efb943b8616ec53a5e444193dccf1af9ad627b5
> > > --
> > > 2.28.0
> > >
>
> --
> Sincerely yours,
> Mike.

2021-05-12 08:33:35

by Mike Rapoport

[permalink] [raw]
Subject: Re: [PATCH v4 0/4] arm64: drop pfn_valid_within() and simplify pfn_valid()

On Wed, May 12, 2021 at 09:59:33AM +0200, Ard Biesheuvel wrote:
> On Wed, 12 May 2021 at 09:34, Mike Rapoport <[email protected]> wrote:
> >
> > On Wed, May 12, 2021 at 09:00:02AM +0200, Ard Biesheuvel wrote:
> > > On Tue, 11 May 2021 at 12:05, Mike Rapoport <[email protected]> wrote:
> > > >
> > > > From: Mike Rapoport <[email protected]>
> > > >
> > > > Hi,
> > > >
> > > > These patches aim to remove CONFIG_HOLES_IN_ZONE and essentially hardwire
> > > > pfn_valid_within() to 1.
> > > >
> > > > The idea is to mark NOMAP pages as reserved in the memory map and restore
> > > > the intended semantics of pfn_valid() to designate availability of struct
> > > > page for a pfn.
> > > >
> > > > With this the core mm will be able to cope with the fact that it cannot use
> > > > NOMAP pages and the holes created by NOMAP ranges within MAX_ORDER blocks
> > > > will be treated correctly even without the need for pfn_valid_within.
> > > >
> > > > The patches are boot tested on qemu-system-aarch64.
> > > >
> > >
> > > Did you use EFI boot when testing this? The memory map is much more
> > > fragmented in that case, so this would be a good data point.
> >
> > Right, something like this:
> >
>
> Yes, although it is not always that bad.
>
> > [ 0.000000] Early memory node ranges
> > [ 0.000000] node 0: [mem 0x0000000040000000-0x00000000ffffbfff]
> > [ 0.000000] node 0: [mem 0x00000000ffffc000-0x00000000ffffffff]
>
> This is allocated below 4 GB by the firmware, for reasons that are
> only valid on x86 (where some of the early boot chain is IA32 only)
>
> > [ 0.000000] node 0: [mem 0x0000000100000000-0x00000004386fffff]
> > [ 0.000000] node 0: [mem 0x0000000438700000-0x000000043899ffff]
> > [ 0.000000] node 0: [mem 0x00000004389a0000-0x00000004389bffff]
> > [ 0.000000] node 0: [mem 0x00000004389c0000-0x0000000438b5ffff]
> > [ 0.000000] node 0: [mem 0x0000000438b60000-0x000000043be3ffff]
> > [ 0.000000] node 0: [mem 0x000000043be40000-0x000000043becffff]
> > [ 0.000000] node 0: [mem 0x000000043bed0000-0x000000043bedffff]
> > [ 0.000000] node 0: [mem 0x000000043bee0000-0x000000043bffffff]
> > [ 0.000000] node 0: [mem 0x000000043c000000-0x000000043fffffff]
> >
> > This is a pity really, because I don't see a fundamental reason for those
> > tiny holes all over the place.
> >
>
> There is a config option in the firmware build that allows these
> regions to be preallocated using larger windows, which greatly reduces
> the fragmentation.
> > I know that EFI/ACPI mandates "IO style" memory access for those regions,
> > but I fail to get why...
> >
>
> Not sure what you mean by 'IO style memory access'.

Well, my understanding is that the memory reserved by the firmware cannot
be mapped in the linear map because it might require different caching
modes (e.g like IO) and arm64 cannot tolerate aliased mappings with
different caching.
But what evades me is *why* these areas cannot be accessed as normal RAM.

> > > > I beleive it would be best to route these via mmotm tree.
> > > >
> > > > v4:
> > > > * rebase on v5.13-rc1
> > > >
> > > > v3: Link: https://lore.kernel.org/lkml/[email protected]
> > > > * Fix minor issues found by Anshuman
> > > > * Freshen up the declaration of pfn_valid() to make it consistent with
> > > > pfn_is_map_memory()
> > > > * Add more Acked-by and Reviewed-by tags, thanks Anshuman and David
> > > >
> > > > v2: Link: https://lore.kernel.org/lkml/[email protected]
> > > > * Add check for PFN overflow in pfn_is_map_memory()
> > > > * Add Acked-by and Reviewed-by tags, thanks David.
> > > >
> > > > v1: Link: https://lore.kernel.org/lkml/[email protected]
> > > > * Add comment about the semantics of pfn_valid() as Anshuman suggested
> > > > * Extend comments about MEMBLOCK_NOMAP, per Anshuman
> > > > * Use pfn_is_map_memory() name for the exported wrapper for
> > > > memblock_is_map_memory(). It is still local to arch/arm64 in the end
> > > > because of header dependency issues.
> > > >
> > > > rfc: Link: https://lore.kernel.org/lkml/[email protected]
> > > >
> > > > Mike Rapoport (4):
> > > > include/linux/mmzone.h: add documentation for pfn_valid()
> > > > memblock: update initialization of reserved pages
> > > > arm64: decouple check whether pfn is in linear map from pfn_valid()
> > > > arm64: drop pfn_valid_within() and simplify pfn_valid()
> > > >
> > > > arch/arm64/Kconfig | 3 ---
> > > > arch/arm64/include/asm/memory.h | 2 +-
> > > > arch/arm64/include/asm/page.h | 3 ++-
> > > > arch/arm64/kvm/mmu.c | 2 +-
> > > > arch/arm64/mm/init.c | 14 +++++++++++++-
> > > > arch/arm64/mm/ioremap.c | 4 ++--
> > > > arch/arm64/mm/mmu.c | 2 +-
> > > > include/linux/memblock.h | 4 +++-
> > > > include/linux/mmzone.h | 11 +++++++++++
> > > > mm/memblock.c | 28 ++++++++++++++++++++++++++--
> > > > 10 files changed, 60 insertions(+), 13 deletions(-)
> > > >
> > > >
> > > > base-commit: 6efb943b8616ec53a5e444193dccf1af9ad627b5
> > > > --
> > > > 2.28.0
> > > >
> >
> > --
> > Sincerely yours,
> > Mike.

--
Sincerely yours,
Mike.