2021-03-09 21:42:53

by Oscar Salvador

Subject: [PATCH v6 0/4] Cleanup and fixups for vmemmap handling

Hi,

this series contains cleanups to remove dead code that handles
unaligned cases for 4K and 1GB pages (patch#1 and patch#2) when
removing the vmemmap range, and a fix (patch#3) to handle the case
when two vmemmap ranges intersect the same PMD.

More details can be found in the respective changelogs.

v5 -> v6:
- Fix some compilation errors when !CONFIG_MEMORY_HOTPLUG
(Reported by Zi Yan)
- Collect Acked-by from Dave

v4 -> v5:
- Rebase on top of 5.12-rc2
- Addressed feedback from Dave
- Split previous patch#3 into core-changes (current patch#3) and
the optimization (current patch#4)
- Better document what unused_pmd_start is and how its optimization works
- Added Acked-by for patch#1

v3 -> v4:
- Rebase on top of 5.12-rc1 as Andrew suggested
- Added last Reviewed-by for the last patch

v2 -> v3:
- Make sure we do not clear the PUD entry in case
we are not removing the whole range.
- Add Reviewed-by

v1 -> v2:
- Remove dead code in remove_pud_table as well
- Addressed feedback from David
- Place the vmemmap functions that take care of unaligned PMDs
within CONFIG_SPARSEMEM_VMEMMAP


Oscar Salvador (4):
x86/vmemmap: Drop handling of 4K unaligned vmemmap range
x86/vmemmap: Drop handling of 1GB vmemmap ranges
x86/vmemmap: Handle unpopulated sub-pmd ranges
x86/vmemmap: Optimize for consecutive sections in partial populated
PMDs

arch/x86/mm/init_64.c | 203 +++++++++++++++++++++++++++++++-------------------
1 file changed, 128 insertions(+), 75 deletions(-)

--
2.16.3


2021-03-09 21:42:53

by Oscar Salvador

Subject: [PATCH v6 1/4] x86/vmemmap: Drop handling of 4K unaligned vmemmap range

remove_pte_table() is prepared to handle the case where either the
start or the end of the range is not PAGE aligned.
This cannot actually happen:

__populate_section_memmap() enforces the range to be PMD aligned,
so as long as the size of struct page remains a multiple of 8,
the vmemmap range will be aligned to PAGE_SIZE.

Drop the dead code and place a VM_BUG_ON in vmemmap_{populate,free}
to catch nasty cases.
Note that the VM_BUG_ON is placed there because vmemmap_{populate,free}
are the gates of all the page-table removing and freeing logic.
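
For illustration, the alignment argument boils down to the following
arithmetic (a sketch only; 128MB sections and 4K base pages are the
usual x86-64 values, the actual enforcement lives in
__populate_section_memmap()):

    memmap chunk size = PAGES_PER_SECTION * sizeof(struct page)
                      = 32768 * (8 * k)
                      = 4096 * (64 * k)

so as long as sizeof(struct page) stays a multiple of 8, the start and
end of the vmemmap range handed down here stay multiples of PAGE_SIZE.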

Signed-off-by: Oscar Salvador <[email protected]>
Suggested-by: David Hildenbrand <[email protected]>
Reviewed-by: David Hildenbrand <[email protected]>
Acked-by: Dave Hansen <[email protected]>
---
arch/x86/mm/init_64.c | 48 +++++++++++++-----------------------------------
1 file changed, 13 insertions(+), 35 deletions(-)

diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index b5a3fa4033d3..b0e1d215c83e 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -962,7 +962,6 @@ remove_pte_table(pte_t *pte_start, unsigned long addr, unsigned long end,
{
unsigned long next, pages = 0;
pte_t *pte;
- void *page_addr;
phys_addr_t phys_addr;

pte = pte_start + pte_index(addr);
@@ -983,42 +982,15 @@ remove_pte_table(pte_t *pte_start, unsigned long addr, unsigned long end,
if (phys_addr < (phys_addr_t)0x40000000)
return;

- if (PAGE_ALIGNED(addr) && PAGE_ALIGNED(next)) {
- /*
- * Do not free direct mapping pages since they were
- * freed when offlining, or simplely not in use.
- */
- if (!direct)
- free_pagetable(pte_page(*pte), 0);
-
- spin_lock(&init_mm.page_table_lock);
- pte_clear(&init_mm, addr, pte);
- spin_unlock(&init_mm.page_table_lock);
+ if (!direct)
+ free_pagetable(pte_page(*pte), 0);

- /* For non-direct mapping, pages means nothing. */
- pages++;
- } else {
- /*
- * If we are here, we are freeing vmemmap pages since
- * direct mapped memory ranges to be freed are aligned.
- *
- * If we are not removing the whole page, it means
- * other page structs in this page are being used and
- * we canot remove them. So fill the unused page_structs
- * with 0xFD, and remove the page when it is wholly
- * filled with 0xFD.
- */
- memset((void *)addr, PAGE_INUSE, next - addr);
-
- page_addr = page_address(pte_page(*pte));
- if (!memchr_inv(page_addr, PAGE_INUSE, PAGE_SIZE)) {
- free_pagetable(pte_page(*pte), 0);
+ spin_lock(&init_mm.page_table_lock);
+ pte_clear(&init_mm, addr, pte);
+ spin_unlock(&init_mm.page_table_lock);

- spin_lock(&init_mm.page_table_lock);
- pte_clear(&init_mm, addr, pte);
- spin_unlock(&init_mm.page_table_lock);
- }
- }
+ /* For non-direct mapping, pages means nothing. */
+ pages++;
}

/* Call free_pte_table() in remove_pmd_table(). */
@@ -1197,6 +1169,9 @@ remove_pagetable(unsigned long start, unsigned long end, bool direct,
void __ref vmemmap_free(unsigned long start, unsigned long end,
struct vmem_altmap *altmap)
{
+ VM_BUG_ON(!IS_ALIGNED(start, PAGE_SIZE));
+ VM_BUG_ON(!IS_ALIGNED(end, PAGE_SIZE));
+
remove_pagetable(start, end, false, altmap);
}

@@ -1556,6 +1531,9 @@ int __meminit vmemmap_populate(unsigned long start, unsigned long end, int node,
{
int err;

+ VM_BUG_ON(!IS_ALIGNED(start, PAGE_SIZE));
+ VM_BUG_ON(!IS_ALIGNED(end, PAGE_SIZE));
+
if (end - start < PAGES_PER_SECTION * sizeof(struct page))
err = vmemmap_populate_basepages(start, end, node, NULL);
else if (boot_cpu_has(X86_FEATURE_PSE))
--
2.16.3

2021-03-09 21:43:16

by Oscar Salvador

Subject: [PATCH v6 3/4] x86/vmemmap: Handle unpopulated sub-pmd ranges

When sizeof(struct page) is not a power of 2, sections do not span
a PMD anymore and so when populating them some parts of the PMD will
remain unused.
Because of this, PMDs will be left behind when depopulating sections
since remove_pmd_table() thinks that those unused parts are still in
use.

Fix this by marking the unused parts with PAGE_UNUSED, so memchr_inv()
will do the right thing and will let us free the PMD when the last user
of it is gone.
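
As a worked example of the problem (the sizes are illustrative only):
if sizeof(struct page) were 56, the memmap of a 128MB section would
take

    32768 * 56 = 1835008 bytes (1.75 MB)

so a 2MB PMD mapping the vmemmap no longer matches section boundaries
and partly covers the memmap of a neighbouring section, which may never
be populated. Without marking that part PAGE_UNUSED at populate time,
memchr_inv() in the removal path finds bytes that are not PAGE_UNUSED
there, concludes the PMD is still in use, and never frees it.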

This patch is based on a similar patch by David Hildenbrand:

https://lore.kernel.org/linux-mm/[email protected]/

Signed-off-by: Oscar Salvador <[email protected]>
Reviewed-by: David Hildenbrand <[email protected]>
Acked-by: Dave Hansen <[email protected]>
---
arch/x86/mm/init_64.c | 65 +++++++++++++++++++++++++++++++++++++++++----------
1 file changed, 53 insertions(+), 12 deletions(-)

diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index 9ecb3c488ac8..d93b36856ed3 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -826,6 +826,51 @@ void __init paging_init(void)
zone_sizes_init();
}

+#ifdef CONFIG_SPARSEMEM_VMEMMAP
+#define PAGE_UNUSED 0xFD
+
+/* Returns true if the PMD is completely unused and thus it can be freed */
+static bool __meminit vmemmap_pmd_is_unused(unsigned long addr, unsigned long end)
+{
+ unsigned long start = ALIGN_DOWN(addr, PMD_SIZE);
+
+ memset((void *)addr, PAGE_UNUSED, end - addr);
+
+ return !memchr_inv((void *)start, PAGE_UNUSED, PMD_SIZE);
+}
+
+static void __meminit vmemmap_use_sub_pmd(unsigned long start)
+{
+ /*
+ * As we expect to add in the same granularity as we remove, it's
+ * sufficient to mark only some piece used to block the memmap page from
+ * getting removed when removing some other adjacent memmap (just in
+ * case the first memmap never gets initialized e.g., because the memory
+ * block never gets onlined).
+ */
+ memset((void *)start, 0, sizeof(struct page));
+}
+
+static void __meminit vmemmap_use_new_sub_pmd(unsigned long start, unsigned long end)
+{
+ /*
+ * Could be our memmap page is filled with PAGE_UNUSED already from a
+ * previous remove. Make sure to reset it.
+ */
+ vmemmap_use_sub_pmd(start);
+
+ /*
+ * Mark with PAGE_UNUSED the unused parts of the new memmap range
+ */
+ if (!IS_ALIGNED(start, PMD_SIZE))
+ memset((void *)start, PAGE_UNUSED,
+ start - ALIGN_DOWN(start, PMD_SIZE));
+ if (!IS_ALIGNED(end, PMD_SIZE))
+ memset((void *)end, PAGE_UNUSED,
+ ALIGN(end, PMD_SIZE) - end);
+}
+#endif
+
/*
* Memory hotplug specific functions
*/
@@ -871,8 +916,6 @@ int arch_add_memory(int nid, u64 start, u64 size,
return add_pages(nid, start_pfn, nr_pages, params);
}

-#define PAGE_INUSE 0xFD
-
static void __meminit free_pagetable(struct page *page, int order)
{
unsigned long magic;
@@ -1006,7 +1049,6 @@ remove_pmd_table(pmd_t *pmd_start, unsigned long addr, unsigned long end,
unsigned long next, pages = 0;
pte_t *pte_base;
pmd_t *pmd;
- void *page_addr;

pmd = pmd_start + pmd_index(addr);
for (; addr < end; addr = next, pmd++) {
@@ -1026,20 +1068,13 @@ remove_pmd_table(pmd_t *pmd_start, unsigned long addr, unsigned long end,
pmd_clear(pmd);
spin_unlock(&init_mm.page_table_lock);
pages++;
- } else {
- /* If here, we are freeing vmemmap pages. */
- memset((void *)addr, PAGE_INUSE, next - addr);
-
- page_addr = page_address(pmd_page(*pmd));
- if (!memchr_inv(page_addr, PAGE_INUSE,
- PMD_SIZE)) {
+ } else if (IS_ENABLED(CONFIG_SPARSEMEM_VMEMMAP) &&
+ vmemmap_pmd_is_unused(addr, next)) {
free_hugepage_table(pmd_page(*pmd),
altmap);
-
spin_lock(&init_mm.page_table_lock);
pmd_clear(pmd);
spin_unlock(&init_mm.page_table_lock);
- }
}

continue;
@@ -1492,11 +1527,17 @@ static int __meminit vmemmap_populate_hugepages(unsigned long start,

addr_end = addr + PMD_SIZE;
p_end = p + PMD_SIZE;
+
+ if (!IS_ALIGNED(addr, PMD_SIZE) ||
+ !IS_ALIGNED(next, PMD_SIZE))
+ vmemmap_use_new_sub_pmd(addr, next);
+
continue;
} else if (altmap)
return -ENOMEM; /* no fallback */
} else if (pmd_large(*pmd)) {
vmemmap_verify((pte_t *)pmd, node, addr, next);
+ vmemmap_use_sub_pmd(addr);
continue;
}
if (vmemmap_populate_basepages(addr, next, node, NULL))
--
2.16.3

2021-03-09 21:43:49

by Oscar Salvador

Subject: [PATCH v6 4/4] x86/vmemmap: Optimize for consecutive sections in partial populated PMDs

We can optimize the case where we are adding consecutive sections, so
that no memset(PAGE_UNUSED) is needed.
To detect that case, let us keep track of where the unused range of the
previously added memory range begins, so we can compare it with the
start of the range to be added.
If they are equal, we know the sections are added consecutively.

For that purpose, let us introduce 'unused_pmd_start', which always
holds the beginning of the unused memory range.

In case a section does not contiguously follow the previous one, we
know we can memset [unused_pmd_start, next PMD boundary) with
PAGE_UNUSED.
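
A short walk-through of the consecutive case (addresses purely
illustrative):

    populate section A, whose memmap ends mid-PMD at address E
        -> remember unused_pmd_start = E, skip the memset
    populate section B, whose memmap starts exactly at E
        -> start == unused_pmd_start: the "unused" tail was never
           really unused, so just move unused_pmd_start to B's end
           (or clear it if B ends on a PMD boundary), no memset needed
    populate anything that does not start at unused_pmd_start
        -> vmemmap_flush_unused_pmd() first fills
           [unused_pmd_start, PMD boundary) with PAGE_UNUSED,
           then the new range is handled as before

This mirrors what vmemmap_use_sub_pmd()/vmemmap_flush_unused_pmd() do
in the hunks below.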

This patch is based on a similar patch by David Hildenbrand:

https://lore.kernel.org/linux-mm/[email protected]/

Signed-off-by: Oscar Salvador <[email protected]>
Acked-by: Dave Hansen <[email protected]>
---
arch/x86/mm/init_64.c | 65 +++++++++++++++++++++++++++++++++++++++++++++++----
1 file changed, 60 insertions(+), 5 deletions(-)

diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index d93b36856ed3..13187a3debe9 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -829,17 +829,42 @@ void __init paging_init(void)
#ifdef CONFIG_SPARSEMEM_VMEMMAP
#define PAGE_UNUSED 0xFD

+/*
+ * The unused vmemmap range, which was not yet memset(PAGE_UNUSED), ranges
+ * from unused_pmd_start to next PMD_SIZE boundary.
+ */
+static unsigned long unused_pmd_start __meminitdata;
+
+static void __meminit vmemmap_flush_unused_pmd(void)
+{
+ if (!unused_pmd_start)
+ return;
+ /*
+ * Clears (unused_pmd_start, PMD_END]
+ */
+ memset((void *)unused_pmd_start, PAGE_UNUSED,
+ ALIGN(unused_pmd_start, PMD_SIZE) - unused_pmd_start);
+ unused_pmd_start = 0;
+}
+
+#ifdef CONFIG_MEMORY_HOTPLUG
/* Returns true if the PMD is completely unused and thus it can be freed */
static bool __meminit vmemmap_pmd_is_unused(unsigned long addr, unsigned long end)
{
unsigned long start = ALIGN_DOWN(addr, PMD_SIZE);

+ /*
+ * Flush the unused range cache to ensure that memchr_inv() will work
+ * for the whole range.
+ */
+ vmemmap_flush_unused_pmd();
memset((void *)addr, PAGE_UNUSED, end - addr);

return !memchr_inv((void *)start, PAGE_UNUSED, PMD_SIZE);
}
+#endif

-static void __meminit vmemmap_use_sub_pmd(unsigned long start)
+static void __meminit __vmemmap_use_sub_pmd(unsigned long start)
{
/*
* As we expect to add in the same granularity as we remove, it's
@@ -851,13 +876,38 @@ static void __meminit vmemmap_use_sub_pmd(unsigned long start)
memset((void *)start, 0, sizeof(struct page));
}

+static void __meminit vmemmap_use_sub_pmd(unsigned long start, unsigned long end)
+{
+ /*
+ * We only optimize if the new used range directly follows the
+ * previously unused range (esp., when populating consecutive sections).
+ */
+ if (unused_pmd_start == start) {
+ if (likely(IS_ALIGNED(end, PMD_SIZE)))
+ unused_pmd_start = 0;
+ else
+ unused_pmd_start = end;
+ return;
+ }
+
+ /*
+ * If the range does not contiguously follows previous one, make sure
+ * to mark the unused range of the previous one so it can be removed.
+ */
+ vmemmap_flush_unused_pmd();
+ __vmemmap_use_sub_pmd(start);
+}
+
+
static void __meminit vmemmap_use_new_sub_pmd(unsigned long start, unsigned long end)
{
+ vmemmap_flush_unused_pmd();
+
/*
* Could be our memmap page is filled with PAGE_UNUSED already from a
* previous remove. Make sure to reset it.
*/
- vmemmap_use_sub_pmd(start);
+ __vmemmap_use_sub_pmd(start);

/*
* Mark with PAGE_UNUSED the unused parts of the new memmap range
@@ -865,9 +915,14 @@ static void __meminit vmemmap_use_new_sub_pmd(unsigned long start, unsigned long
if (!IS_ALIGNED(start, PMD_SIZE))
memset((void *)start, PAGE_UNUSED,
start - ALIGN_DOWN(start, PMD_SIZE));
+
+ /*
+ * We want to avoid memset(PAGE_UNUSED) when populating the vmemmap of
+ * consecutive sections. Remember for the last added PMD where the
+ * unused range begins.
+ */
if (!IS_ALIGNED(end, PMD_SIZE))
- memset((void *)end, PAGE_UNUSED,
- ALIGN(end, PMD_SIZE) - end);
+ unused_pmd_start = end;
}
#endif

@@ -1537,7 +1592,7 @@ static int __meminit vmemmap_populate_hugepages(unsigned long start,
return -ENOMEM; /* no fallback */
} else if (pmd_large(*pmd)) {
vmemmap_verify((pte_t *)pmd, node, addr, next);
- vmemmap_use_sub_pmd(addr);
+ vmemmap_use_sub_pmd(addr, next);
continue;
}
if (vmemmap_populate_basepages(addr, next, node, NULL))
--
2.16.3

2021-03-09 21:44:50

by Oscar Salvador

Subject: [PATCH v6 2/4] x86/vmemmap: Drop handling of 1GB vmemmap ranges

There is no code to allocate 1GB pages when mapping the vmemmap range,
as this might waste some memory and requires more complexity, which is
not really worth it.
Drop the dead code for both the aligned and unaligned cases and leave
only the direct map handling.

Signed-off-by: Oscar Salvador <[email protected]>
Suggested-by: David Hildenbrand <[email protected]>
Reviewed-by: David Hildenbrand <[email protected]>
Acked-by: Dave Hansen <[email protected]>
---
arch/x86/mm/init_64.c | 35 +++++++----------------------------
1 file changed, 7 insertions(+), 28 deletions(-)

diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index b0e1d215c83e..9ecb3c488ac8 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -1062,7 +1062,6 @@ remove_pud_table(pud_t *pud_start, unsigned long addr, unsigned long end,
unsigned long next, pages = 0;
pmd_t *pmd_base;
pud_t *pud;
- void *page_addr;

pud = pud_start + pud_index(addr);
for (; addr < end; addr = next, pud++) {
@@ -1071,33 +1070,13 @@ remove_pud_table(pud_t *pud_start, unsigned long addr, unsigned long end,
if (!pud_present(*pud))
continue;

- if (pud_large(*pud)) {
- if (IS_ALIGNED(addr, PUD_SIZE) &&
- IS_ALIGNED(next, PUD_SIZE)) {
- if (!direct)
- free_pagetable(pud_page(*pud),
- get_order(PUD_SIZE));
-
- spin_lock(&init_mm.page_table_lock);
- pud_clear(pud);
- spin_unlock(&init_mm.page_table_lock);
- pages++;
- } else {
- /* If here, we are freeing vmemmap pages. */
- memset((void *)addr, PAGE_INUSE, next - addr);
-
- page_addr = page_address(pud_page(*pud));
- if (!memchr_inv(page_addr, PAGE_INUSE,
- PUD_SIZE)) {
- free_pagetable(pud_page(*pud),
- get_order(PUD_SIZE));
-
- spin_lock(&init_mm.page_table_lock);
- pud_clear(pud);
- spin_unlock(&init_mm.page_table_lock);
- }
- }
-
+ if (pud_large(*pud) &&
+ IS_ALIGNED(addr, PUD_SIZE) &&
+ IS_ALIGNED(next, PUD_SIZE)) {
+ spin_lock(&init_mm.page_table_lock);
+ pud_clear(pud);
+ spin_unlock(&init_mm.page_table_lock);
+ pages++;
continue;
}

--
2.16.3

2021-03-10 06:12:11

by Naresh Kamboju

Subject: Re: [PATCH v6 3/4] x86/vmemmap: Handle unpopulated sub-pmd ranges

Hi Oscar,

On Wed, 10 Mar 2021 at 03:11, Oscar Salvador <[email protected]> wrote:
>
> When sizeof(struct page) is not a power of 2, sections do not span
> a PMD anymore and so when populating them some parts of the PMD will
> remain unused.
> Because of this, PMDs will be left behind when depopulating sections
> since remove_pmd_table() thinks that those unused parts are still in
> use.
>
> Fix this by marking the unused parts with PAGE_UNUSED, so memchr_inv()
> will do the right thing and will let us free the PMD when the last user
> of it is gone.
>
> This patch is based on a similar patch by David Hildenbrand:
>
> https://lore.kernel.org/linux-mm/[email protected]/
>
> Signed-off-by: Oscar Salvador <[email protected]>
> Reviewed-by: David Hildenbrand <[email protected]>
> Acked-by: Dave Hansen <[email protected]>
> ---
> arch/x86/mm/init_64.c | 65 +++++++++++++++++++++++++++++++++++++++++----------
> 1 file changed, 53 insertions(+), 12 deletions(-)
>
> diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
> index 9ecb3c488ac8..d93b36856ed3 100644
> --- a/arch/x86/mm/init_64.c
> +++ b/arch/x86/mm/init_64.c
> @@ -826,6 +826,51 @@ void __init paging_init(void)
> zone_sizes_init();
> }
>
> +#ifdef CONFIG_SPARSEMEM_VMEMMAP
> +#define PAGE_UNUSED 0xFD
> +
> +/* Returns true if the PMD is completely unused and thus it can be freed */
> +static bool __meminit vmemmap_pmd_is_unused(unsigned long addr, unsigned long end)
> +{
> + unsigned long start = ALIGN_DOWN(addr, PMD_SIZE);
> +
> + memset((void *)addr, PAGE_UNUSED, end - addr);
> +
> + return !memchr_inv((void *)start, PAGE_UNUSED, PMD_SIZE);
> +}
> +
> +static void __meminit vmemmap_use_sub_pmd(unsigned long start)
> +{
> + /*
> + * As we expect to add in the same granularity as we remove, it's
> + * sufficient to mark only some piece used to block the memmap page from
> + * getting removed when removing some other adjacent memmap (just in
> + * case the first memmap never gets initialized e.g., because the memory
> + * block never gets onlined).
> + */
> + memset((void *)start, 0, sizeof(struct page));
> +}
> +
> +static void __meminit vmemmap_use_new_sub_pmd(unsigned long start, unsigned long end)
> +{
> + /*
> + * Could be our memmap page is filled with PAGE_UNUSED already from a
> + * previous remove. Make sure to reset it.
> + */
> + vmemmap_use_sub_pmd(start);
> +
> + /*
> + * Mark with PAGE_UNUSED the unused parts of the new memmap range
> + */
> + if (!IS_ALIGNED(start, PMD_SIZE))
> + memset((void *)start, PAGE_UNUSED,
> + start - ALIGN_DOWN(start, PMD_SIZE));
> + if (!IS_ALIGNED(end, PMD_SIZE))
> + memset((void *)end, PAGE_UNUSED,
> + ALIGN(end, PMD_SIZE) - end);
> +}
> +#endif
> +
> /*
> * Memory hotplug specific functions
> */
> @@ -871,8 +916,6 @@ int arch_add_memory(int nid, u64 start, u64 size,
> return add_pages(nid, start_pfn, nr_pages, params);
> }
>
> -#define PAGE_INUSE 0xFD
> -
> static void __meminit free_pagetable(struct page *page, int order)
> {
> unsigned long magic;
> @@ -1006,7 +1049,6 @@ remove_pmd_table(pmd_t *pmd_start, unsigned long addr, unsigned long end,
> unsigned long next, pages = 0;
> pte_t *pte_base;
> pmd_t *pmd;
> - void *page_addr;
>
> pmd = pmd_start + pmd_index(addr);
> for (; addr < end; addr = next, pmd++) {
> @@ -1026,20 +1068,13 @@ remove_pmd_table(pmd_t *pmd_start, unsigned long addr, unsigned long end,
> pmd_clear(pmd);
> spin_unlock(&init_mm.page_table_lock);
> pages++;
> - } else {
> - /* If here, we are freeing vmemmap pages. */
> - memset((void *)addr, PAGE_INUSE, next - addr);
> -
> - page_addr = page_address(pmd_page(*pmd));
> - if (!memchr_inv(page_addr, PAGE_INUSE,
> - PMD_SIZE)) {
> + } else if (IS_ENABLED(CONFIG_SPARSEMEM_VMEMMAP) &&
> + vmemmap_pmd_is_unused(addr, next)) {
> free_hugepage_table(pmd_page(*pmd),
> altmap);
> -
> spin_lock(&init_mm.page_table_lock);
> pmd_clear(pmd);
> spin_unlock(&init_mm.page_table_lock);
> - }
> }
>
> continue;
> @@ -1492,11 +1527,17 @@ static int __meminit vmemmap_populate_hugepages(unsigned long start,
>
> addr_end = addr + PMD_SIZE;
> p_end = p + PMD_SIZE;
> +
> + if (!IS_ALIGNED(addr, PMD_SIZE) ||
> + !IS_ALIGNED(next, PMD_SIZE))
> + vmemmap_use_new_sub_pmd(addr, next);
> +
> continue;
> } else if (altmap)
> return -ENOMEM; /* no fallback */
> } else if (pmd_large(*pmd)) {
> vmemmap_verify((pte_t *)pmd, node, addr, next);
> + vmemmap_use_sub_pmd(addr);

While building the linux-next 20210310 tag for the x86_64 architecture with
clang-12 and gcc-9, the following warnings / errors were noticed.

arch/x86/mm/init_64.c:1585:6: error: implicit declaration of function
'vmemmap_use_new_sub_pmd' [-Werror,-Wimplicit-function-declaration]
vmemmap_use_new_sub_pmd(addr, next);
^
arch/x86/mm/init_64.c:1591:4: error: implicit declaration of function
'vmemmap_use_sub_pmd' [-Werror,-Wimplicit-function-declaration]
vmemmap_use_sub_pmd(addr, next);
^
2 errors generated.
make[3]: *** [scripts/Makefile.build:271: arch/x86/mm/init_64.o] Error 1

Reported-by: Naresh Kamboju <[email protected]>

Steps to reproduce:
-------------------
# TuxMake is a command line tool and Python library that provides
# portable and repeatable Linux kernel builds across a variety of
# architectures, toolchains, kernel configurations, and make targets.
#
# TuxMake supports the concept of runtimes.
# See https://docs.tuxmake.org/runtimes/, for that to work it requires
# that you install podman or docker on your system.
#
# To install tuxmake on your system globally:
# sudo pip3 install -U tuxmake
#
# See https://docs.tuxmake.org/ for complete documentation.


tuxmake --runtime podman --target-arch x86_64 --toolchain clang-12
--kconfig defconfig --kconfig-add
https://builds.tuxbuild.com/1pYCPt4WlgSfSdv1BULm6ABINeJ/config


Build pipeline error link,
https://gitlab.com/Linaro/lkft/mirrors/next/linux-next/-/jobs/1085496613#L428

--
Linaro LKFT
https://lkft.linaro.org


Attachments:
x86_64_next-20210310.config (133.91 kB)

2021-03-10 07:26:27

by Oscar Salvador

Subject: Re: [PATCH v6 3/4] x86/vmemmap: Handle unpopulated sub-pmd ranges

On Wed, Mar 10, 2021 at 11:37:53AM +0530, Naresh Kamboju wrote:
> Hi Oscar,

Hi Naresh,

> While building the linux-next 20210310 tag for the x86_64 architecture with
> clang-12 and gcc-9, the following warnings / errors were noticed.
>
> arch/x86/mm/init_64.c:1585:6: error: implicit declaration of function
> 'vmemmap_use_new_sub_pmd' [-Werror,-Wimplicit-function-declaration]
> vmemmap_use_new_sub_pmd(addr, next);
> ^
> arch/x86/mm/init_64.c:1591:4: error: implicit declaration of function
> 'vmemmap_use_sub_pmd' [-Werror,-Wimplicit-function-declaration]
> vmemmap_use_sub_pmd(addr, next);
> ^
> 2 errors generated.
> make[3]: *** [scripts/Makefile.build:271: arch/x86/mm/init_64.o] Error 1
>
> Reported-by: Naresh Kamboju <[email protected]>

Yes, this was also reported by Zi Yan here [1].
Looking into your .config, it seems to be the same issue, as you have
CONFIG_SPARSEMEM_VMEMMAP but !CONFIG_MEMORY_HOTPLUG.

This version fixes those compilation errors.

Thanks for reporting it anyway!

[1] https://lore.kernel.org/linux-mm/[email protected]/T/#ma566ff437ff4bf8fcc5f80f62cd0cc8761edd12d

> Steps to reproduce:
> -------------------
> # TuxMake is a command line tool and Python library that provides
> # portable and repeatable Linux kernel builds across a variety of
> # architectures, toolchains, kernel configurations, and make targets.
> #
> # TuxMake supports the concept of runtimes.
> # See https://docs.tuxmake.org/runtimes/, for that to work it requires
> # that you install podman or docker on your system.
> #
> # To install tuxmake on your system globally:
> # sudo pip3 install -U tuxmake
> #
> # See https://docs.tuxmake.org/ for complete documentation.
>
>
> tuxmake --runtime podman --target-arch x86_64 --toolchain clang-12
> --kconfig defconfig --kconfig-add
> https://builds.tuxbuild.com/1pYCPt4WlgSfSdv1BULm6ABINeJ/config
>
>
> Build pipeline error link,
> https://gitlab.com/Linaro/lkft/mirrors/next/linux-next/-/jobs/1085496613#L428
>
> --
> Linaro LKFT
> https://lkft.linaro.org



--
Oscar Salvador
SUSE L3

2021-03-11 21:31:15

by Dave Hansen

Subject: Re: [PATCH v6 3/4] x86/vmemmap: Handle unpopulated sub-pmd ranges

On 3/9/21 1:40 PM, Oscar Salvador wrote:
> +static void __meminit vmemmap_use_new_sub_pmd(unsigned long start, unsigned long end)
> +{
> + /*
> + * Could be our memmap page is filled with PAGE_UNUSED already from a
> + * previous remove. Make sure to reset it.
> + */
> + vmemmap_use_sub_pmd(start);
> +
> + /*
> + * Mark with PAGE_UNUSED the unused parts of the new memmap range
> + */
> + if (!IS_ALIGNED(start, PMD_SIZE))
> + memset((void *)start, PAGE_UNUSED,
> + start - ALIGN_DOWN(start, PMD_SIZE));
> + if (!IS_ALIGNED(end, PMD_SIZE))
> + memset((void *)end, PAGE_UNUSED,
> + ALIGN(end, PMD_SIZE) - end);
> +}
> +#endif

This is apparently under both CONFIG_SPARSEMEM_VMEMMAP and
CONFIG_MEMORY_HOTPLUG #ifdefs. It errors out at compile-time with this
config: https://sr71.net/~dave/intel/config-mmotm-20210311

> linux.git/arch/x86/mm/init_64.c: In function 'vmemmap_populate_hugepages':
> linux.git/arch/x86/mm/init_64.c:1585:6: error: implicit declaration of function 'vmemmap_use_new_sub_pmd' [-Werror=implicit-function-declaration]
> vmemmap_use_new_sub_pmd(addr, next);
> ^~~~~~~~~~~~~~~~~~~~~~~
> /home/davehans/linux.git/arch/x86/mm/init_64.c:1591:4: error: implicit declaration of function 'vmemmap_use_sub_pmd' [-Werror=implicit-function-declaration]
> vmemmap_use_sub_pmd(addr, next);
> ^~~~~~~~~~~~~~~~~~~

I didn't see a quick fix other than #ifdef'ing the call sites, which is
pretty ugly.

2021-03-11 21:50:10

by Oscar Salvador

Subject: Re: [PATCH v6 3/4] x86/vmemmap: Handle unpopulated sub-pmd ranges

On Thu, Mar 11, 2021 at 01:29:39PM -0800, Dave Hansen wrote:
> On 3/9/21 1:40 PM, Oscar Salvador wrote:
> > +static void __meminit vmemmap_use_new_sub_pmd(unsigned long start, unsigned long end)
> > +{
> > + /*
> > + * Could be our memmap page is filled with PAGE_UNUSED already from a
> > + * previous remove. Make sure to reset it.
> > + */
> > + vmemmap_use_sub_pmd(start);
> > +
> > + /*
> > + * Mark with PAGE_UNUSED the unused parts of the new memmap range
> > + */
> > + if (!IS_ALIGNED(start, PMD_SIZE))
> > + memset((void *)start, PAGE_UNUSED,
> > + start - ALIGN_DOWN(start, PMD_SIZE));
> > + if (!IS_ALIGNED(end, PMD_SIZE))
> > + memset((void *)end, PAGE_UNUSED,
> > + ALIGN(end, PMD_SIZE) - end);
> > +}
> > +#endif
>
> This is apparently under both CONFIG_SPARSEMEM_VMEMMAP and
> CONFIG_MEMORY_HOTPLUG #ifdefs. It errors out at compile-time with this
> config: https://sr71.net/~dave/intel/config-mmotm-20210311

It seems that mmotm still has v5.
v6 (this one) fixed that up. I basically moved the code out of
MEMORY_HOTPLUG #ifdefs.

I could not reproduce your error on v6.
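
For reference, the #ifdef layout after this series looks roughly like
this (a sketch of how the v6 hunks end up, not the literal code):

    #ifdef CONFIG_SPARSEMEM_VMEMMAP
    #define PAGE_UNUSED 0xFD
    /* unused_pmd_start, vmemmap_flush_unused_pmd(),
     * __vmemmap_use_sub_pmd(), vmemmap_use_sub_pmd() and
     * vmemmap_use_new_sub_pmd() live here, so the populate side
     * builds even without memory hotplug */
    #ifdef CONFIG_MEMORY_HOTPLUG
    /* vmemmap_pmd_is_unused() only: it is called from the removal
     * path, which only exists when hotplug is enabled */
    #endif
    #endif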


--
Oscar Salvador
SUSE L3

2021-03-30 02:34:54

by Matthew Wilcox

Subject: Re: [PATCH v6 3/4] x86/vmemmap: Handle unpopulated sub-pmd ranges

On Thu, Mar 11, 2021 at 10:48:34PM +0100, Oscar Salvador wrote:
> On Thu, Mar 11, 2021 at 01:29:39PM -0800, Dave Hansen wrote:
> > On 3/9/21 1:40 PM, Oscar Salvador wrote:
> > > +static void __meminit vmemmap_use_new_sub_pmd(unsigned long start, unsigned long end)
> > > +{
> > > + /*
> > > + * Could be our memmap page is filled with PAGE_UNUSED already from a
> > > + * previous remove. Make sure to reset it.
> > > + */
> > > + vmemmap_use_sub_pmd(start);
> > > +
> > > + /*
> > > + * Mark with PAGE_UNUSED the unused parts of the new memmap range
> > > + */
> > > + if (!IS_ALIGNED(start, PMD_SIZE))
> > > + memset((void *)start, PAGE_UNUSED,
> > > + start - ALIGN_DOWN(start, PMD_SIZE));
> > > + if (!IS_ALIGNED(end, PMD_SIZE))
> > > + memset((void *)end, PAGE_UNUSED,
> > > + ALIGN(end, PMD_SIZE) - end);
> > > +}
> > > +#endif
> >
> > This is apparently under both CONFIG_SPARSEMEM_VMEMMAP and
> > CONFIG_MEMORY_HOTPLUG #ifdefs. It errors out at compile-time with this
> > config: https://sr71.net/~dave/intel/config-mmotm-20210311
>
> It seems that mmotm still has v5.
> v6 (this one) fixed that up. I basically moved the code out of
> MEMORY_HOTPLUG #ifdefs.
>
> I could not reproduce your error on v6.

I can reproduce this with next-20210329.

.config attached.


Attachments:
config-for-osal.xz (24.12 kB)

2021-04-02 18:01:24

by Oscar Salvador

Subject: Re: [PATCH v6 3/4] x86/vmemmap: Handle unpopulated sub-pmd ranges

On Tue, Mar 30, 2021 at 03:29:50AM +0100, Matthew Wilcox wrote:
> I can reproduce this with next-20210329.
>
> .config attached.

Hi Matthew,

Thanks for the report. I tried to reproduce this with the attached
config on next-20210401 and I had no luck.


You still see it on that one?

Thanks

--
Oscar Salvador
SUSE L3