2021-07-20 12:37:35

by Will Deacon

Subject: [PATCH 0/2] Fix arm64 boot regression in 5.14

Hi folks,

Jonathan reports [1] that commit c742199a014d ("mm/pgtable: add stubs
for {pmd/pub}_{set/clear}_huge") breaks the boot on arm64 when huge
mappings are used to map the kernel linear map but the VA size is
configured such that PUDs are folded. This is because the non-functional
pud_set_huge() stub is used to create the linear map, which results in
1GB holes and a fatal data abort when the kernel attempts to access them.
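
For reference, the offending stubs have the following shape (lifted
from the include/linux/pgtable.h hunk that patch 2/2 reverts; the
comments are added here for illustration):

/*
 * With the PUD level folded, c742199a014d replaces the architecture's
 * pud_set_huge() with a static inline that unconditionally reports
 * failure:
 */
#ifdef __PAGETABLE_PUD_FOLDED
static inline int pud_set_huge(pud_t *pud, phys_addr_t addr, pgprot_t prot)
{
	return 0;
}
static inline int pud_clear_huge(pud_t *pud)
{
	return 0;
}
#endif
/*
 * arm64's alloc_init_pud() calls pud_set_huge() to install 1GB block
 * entries for the linear map, so with the stub in place it silently
 * installs nothing, leaving the holes described above.
 */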

Digging further into the issue, it also transpired that huge-vmap is
silently disabled in these configurations as well [2], despite working
correctly in 5.13. The latter issue causes the pgtable selftests to
scream due to a failing consistency check [3].

Rather than leave mainline in a terminally broken state for arm64 while
we figure this out, revert the offending commit to get things working
again. Unfortunately, reverting the change in isolation causes a build
breakage for 32-bit PowerPC 8xx machines which recently started relying
on the problematic stubs to support pte-level huge-vmap entries [4].
Since Christophe is away at the moment, this series first reverts the
PowerPC 8xx change in order to avoid breaking the build.

I would really like this to land for -rc3 and I can take these via the
arm64 fixes queue if the PowerPC folks are alright with them.

Cheers,

Will

[1] https://lore.kernel.org/r/[email protected]
[2] https://lore.kernel.org/r/20210719104918.GA6440@willie-the-truck
[3] https://lore.kernel.org/r/CAMuHMdXShORDox-xxaeUfDW3wx2PeggFSqhVSHVZNKCGK-y_vQ@mail.gmail.com/
[4] https://lore.kernel.org/r/8b972f1c03fb6bd59953035f0a3e4d26659de4f8.1620795204.git.christophe.leroy@csgroup.eu/

Cc: Ard Biesheuvel <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Christophe Leroy <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Jonathan Marek <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Nicholas Piggin <[email protected]>
Cc: Mike Rapoport <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Geert Uytterhoeven <[email protected]>
Cc: Marc Zyngier <[email protected]>
Cc: [email protected]
Cc: [email protected]

--->8

Jonathan Marek (1):
Revert "mm/pgtable: add stubs for {pmd/pub}_{set/clear}_huge"

Will Deacon (1):
Revert "powerpc/8xx: add support for huge pages on VMAP and VMALLOC"

arch/arm64/mm/mmu.c | 20 ++++-----
arch/powerpc/Kconfig | 2 +-
arch/powerpc/include/asm/nohash/32/mmu-8xx.h | 43 --------------------
arch/x86/mm/pgtable.c | 34 +++++++---------
include/linux/pgtable.h | 26 +-----------
5 files changed, 25 insertions(+), 100 deletions(-)

--
2.32.0.402.g57bb445576-goog


2021-07-20 12:37:54

by Will Deacon

Subject: [PATCH 2/2] Revert "mm/pgtable: add stubs for {pmd/pub}_{set/clear}_huge"

From: Jonathan Marek <[email protected]>

This reverts commit c742199a014de23ee92055c2473d91fe5561ffdf.

c742199a014d ("mm/pgtable: add stubs for {pmd/pub}_{set/clear}_huge")
breaks arm64 in at least two ways for configurations where PUD or PMD
folding occur:

1. We no longer install huge-vmap mappings and silently fall back to
page-granular entries, despite being able to install block entries
at what is effectively the PGD level.

2. If the linear map is backed with block mappings, these will now
silently fail to be created in alloc_init_pud(), causing a panic
early during boot.

The pgtable selftests caught this, although a fix has not been
forthcoming and Christophe is AWOL at the moment, so just revert the
change for now to get a working -rc3 on which we can queue patches for
5.15.
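
To illustrate failure mode 1, here is a simplified sketch of the
huge-vmap path (the names follow mm/vmalloc.c, but the size and
alignment checks are elided and this is not the verbatim code):

static int vmap_try_huge_pud(pud_t *pud, unsigned long addr,
			     unsigned long end, phys_addr_t phys_addr,
			     pgprot_t prot, unsigned int max_page_shift)
{
	/* size/alignment checks elided for brevity */
	return pud_set_huge(pud, phys_addr, prot);
}
/*
 * When pud_set_huge() is the folded return-0 stub, this always fails
 * and the caller quietly falls back to page-granular PTEs, which is
 * what the pgtable selftests flagged.
 */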

Cc: Ard Biesheuvel <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Christophe Leroy <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Nicholas Piggin <[email protected]>
Cc: Mike Rapoport <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Geert Uytterhoeven <[email protected]>
Cc: Marc Zyngier <[email protected]>
Link: https://lore.kernel.org/linux-arm-kernel/CAMuHMdXShORDox-xxaeUfDW3wx2PeggFSqhVSHVZNKCGK-y_vQ@mail.gmail.com/
Link: https://lore.kernel.org/r/[email protected]
Fixes: c742199a014d ("mm/pgtable: add stubs for {pmd/pub}_{set/clear}_huge")
Signed-off-by: Jonathan Marek <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
---
arch/arm64/mm/mmu.c | 20 ++++++++------------
arch/x86/mm/pgtable.c | 34 +++++++++++++++-------------------
include/linux/pgtable.h | 26 +-------------------------
3 files changed, 24 insertions(+), 56 deletions(-)

diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index d74586508448..9ff0de1b2b93 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -1339,7 +1339,6 @@ void *__init fixmap_remap_fdt(phys_addr_t dt_phys, int *size, pgprot_t prot)
return dt_virt;
}

-#if CONFIG_PGTABLE_LEVELS > 3
int pud_set_huge(pud_t *pudp, phys_addr_t phys, pgprot_t prot)
{
pud_t new_pud = pfn_pud(__phys_to_pfn(phys), mk_pud_sect_prot(prot));
@@ -1354,16 +1353,6 @@ int pud_set_huge(pud_t *pudp, phys_addr_t phys, pgprot_t prot)
return 1;
}

-int pud_clear_huge(pud_t *pudp)
-{
- if (!pud_sect(READ_ONCE(*pudp)))
- return 0;
- pud_clear(pudp);
- return 1;
-}
-#endif
-
-#if CONFIG_PGTABLE_LEVELS > 2
int pmd_set_huge(pmd_t *pmdp, phys_addr_t phys, pgprot_t prot)
{
pmd_t new_pmd = pfn_pmd(__phys_to_pfn(phys), mk_pmd_sect_prot(prot));
@@ -1378,6 +1367,14 @@ int pmd_set_huge(pmd_t *pmdp, phys_addr_t phys, pgprot_t prot)
return 1;
}

+int pud_clear_huge(pud_t *pudp)
+{
+ if (!pud_sect(READ_ONCE(*pudp)))
+ return 0;
+ pud_clear(pudp);
+ return 1;
+}
+
int pmd_clear_huge(pmd_t *pmdp)
{
if (!pmd_sect(READ_ONCE(*pmdp)))
@@ -1385,7 +1382,6 @@ int pmd_clear_huge(pmd_t *pmdp)
pmd_clear(pmdp);
return 1;
}
-#endif

int pmd_free_pte_page(pmd_t *pmdp, unsigned long addr)
{
diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c
index 3364fe62b903..3481b35cb4ec 100644
--- a/arch/x86/mm/pgtable.c
+++ b/arch/x86/mm/pgtable.c
@@ -682,7 +682,6 @@ int p4d_clear_huge(p4d_t *p4d)
}
#endif

-#if CONFIG_PGTABLE_LEVELS > 3
/**
* pud_set_huge - setup kernel PUD mapping
*
@@ -721,23 +720,6 @@ int pud_set_huge(pud_t *pud, phys_addr_t addr, pgprot_t prot)
return 1;
}

-/**
- * pud_clear_huge - clear kernel PUD mapping when it is set
- *
- * Returns 1 on success and 0 on failure (no PUD map is found).
- */
-int pud_clear_huge(pud_t *pud)
-{
- if (pud_large(*pud)) {
- pud_clear(pud);
- return 1;
- }
-
- return 0;
-}
-#endif
-
-#if CONFIG_PGTABLE_LEVELS > 2
/**
* pmd_set_huge - setup kernel PMD mapping
*
@@ -768,6 +750,21 @@ int pmd_set_huge(pmd_t *pmd, phys_addr_t addr, pgprot_t prot)
return 1;
}

+/**
+ * pud_clear_huge - clear kernel PUD mapping when it is set
+ *
+ * Returns 1 on success and 0 on failure (no PUD map is found).
+ */
+int pud_clear_huge(pud_t *pud)
+{
+ if (pud_large(*pud)) {
+ pud_clear(pud);
+ return 1;
+ }
+
+ return 0;
+}
+
/**
* pmd_clear_huge - clear kernel PMD mapping when it is set
*
@@ -782,7 +779,6 @@ int pmd_clear_huge(pmd_t *pmd)

return 0;
}
-#endif

#ifdef CONFIG_X86_64
/**
diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
index d147480cdefc..e24d2c992b11 100644
--- a/include/linux/pgtable.h
+++ b/include/linux/pgtable.h
@@ -1397,34 +1397,10 @@ static inline int p4d_clear_huge(p4d_t *p4d)
}
#endif /* !__PAGETABLE_P4D_FOLDED */

-#ifndef __PAGETABLE_PUD_FOLDED
int pud_set_huge(pud_t *pud, phys_addr_t addr, pgprot_t prot);
-int pud_clear_huge(pud_t *pud);
-#else
-static inline int pud_set_huge(pud_t *pud, phys_addr_t addr, pgprot_t prot)
-{
- return 0;
-}
-static inline int pud_clear_huge(pud_t *pud)
-{
- return 0;
-}
-#endif /* !__PAGETABLE_PUD_FOLDED */
-
-#ifndef __PAGETABLE_PMD_FOLDED
int pmd_set_huge(pmd_t *pmd, phys_addr_t addr, pgprot_t prot);
+int pud_clear_huge(pud_t *pud);
int pmd_clear_huge(pmd_t *pmd);
-#else
-static inline int pmd_set_huge(pmd_t *pmd, phys_addr_t addr, pgprot_t prot)
-{
- return 0;
-}
-static inline int pmd_clear_huge(pmd_t *pmd)
-{
- return 0;
-}
-#endif /* !__PAGETABLE_PMD_FOLDED */
-
int p4d_free_pud_page(p4d_t *p4d, unsigned long addr);
int pud_free_pmd_page(pud_t *pud, unsigned long addr);
int pmd_free_pte_page(pmd_t *pmd, unsigned long addr);
--
2.32.0.402.g57bb445576-goog

2021-07-20 12:39:45

by Will Deacon

Subject: [PATCH 1/2] Revert "powerpc/8xx: add support for huge pages on VMAP and VMALLOC"

This reverts commit a6a8f7c4aa7eb50304b5c4e68eccd24313f3a785.

Commit c742199a014d ("mm/pgtable: add stubs for
{pmd/pub}_{set/clear}_huge") breaks the boot for arm64 when block
mappings are used to create the linear map, as this relies on a working
implementation of pXd_set_huge() even if the corresponding page-table
levels have been folded.

Although the problematic patch reverts cleanly, doing so breaks the
build for 32-bit PowerPC 8xx machines, which rely on the default
function definitions when the corresponding page-table levels are
folded:

| powerpc64-linux-ld: mm/vmalloc.o: in function `vunmap_pud_range':
| linux/mm/vmalloc.c:362: undefined reference to `pud_clear_huge'
| powerpc64-linux-ld: mm/vmalloc.o: in function `vunmap_pmd_range':
| linux/mm/vmalloc.c:337: undefined reference to `pmd_clear_huge'
| powerpc64-linux-ld: mm/vmalloc.o: in function `vunmap_pud_range':
| linux/mm/vmalloc.c:362: undefined reference to `pud_clear_huge'
| powerpc64-linux-ld: mm/vmalloc.o: in function `vunmap_pmd_range':
| linux/mm/vmalloc.c:337: undefined reference to `pmd_clear_huge'
| make: *** [Makefile:1177: vmlinux] Error 1
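
The undefined references come from the vunmap path which, once an
architecture selects HAVE_ARCH_HUGE_VMAP, calls the pXd_clear_huge()
helpers unconditionally. A simplified sketch (not the verbatim
mm/vmalloc.c code):

static void vunmap_pmd_range(pud_t *pud, unsigned long addr,
			     unsigned long end)
{
	pmd_t *pmd = pmd_offset(pud, addr);
	unsigned long next;

	do {
		next = pmd_addr_end(addr, end);
		/* undefined reference on 8xx after a plain revert: */
		if (pmd_clear_huge(pmd))
			continue;
		/* otherwise tear down the PTE level (elided) */
	} while (pmd++, addr = next, addr != end);
}

Since 8xx folds these levels but still selects HAVE_ARCH_HUGE_VMAP, it
needs out-of-line definitions once the generic stubs go away.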

Although Christophe has kindly offered to look into the arm64 breakage,
he's on holiday for another 10 days and there isn't an obvious fix on
the arm64 side which allows us to continue using huge-vmap for affected
configurations.

In the interest of quickly getting things back to a working state as
they were in 5.13, revert the huge-vmap changes for PowerPC 8xx prior to
reverting the change which breaks arm64. We can then work on this
together for 5.15 once Christophe is back.

Cc: Ard Biesheuvel <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Christophe Leroy <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Nicholas Piggin <[email protected]>
Cc: Mike Rapoport <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Geert Uytterhoeven <[email protected]>
Cc: Marc Zyngier <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Will Deacon <[email protected]>
---
arch/powerpc/Kconfig | 2 +-
arch/powerpc/include/asm/nohash/32/mmu-8xx.h | 43 --------------------
2 files changed, 1 insertion(+), 44 deletions(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index d01e3401581d..5fc19ac62cb9 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -189,7 +189,7 @@ config PPC
select GENERIC_VDSO_TIME_NS
select HAVE_ARCH_AUDITSYSCALL
select HAVE_ARCH_HUGE_VMALLOC if HAVE_ARCH_HUGE_VMAP
- select HAVE_ARCH_HUGE_VMAP if PPC_RADIX_MMU || PPC_8xx
+ select HAVE_ARCH_HUGE_VMAP if PPC_BOOK3S_64 && PPC_RADIX_MMU
select HAVE_ARCH_JUMP_LABEL
select HAVE_ARCH_JUMP_LABEL_RELATIVE
select HAVE_ARCH_KASAN if PPC32 && PPC_PAGE_SHIFT <= 14
diff --git a/arch/powerpc/include/asm/nohash/32/mmu-8xx.h b/arch/powerpc/include/asm/nohash/32/mmu-8xx.h
index 997cec973406..6e4faa0a9b35 100644
--- a/arch/powerpc/include/asm/nohash/32/mmu-8xx.h
+++ b/arch/powerpc/include/asm/nohash/32/mmu-8xx.h
@@ -178,7 +178,6 @@
#ifndef __ASSEMBLY__

#include <linux/mmdebug.h>
-#include <linux/sizes.h>

void mmu_pin_tlb(unsigned long top, bool readonly);

@@ -226,48 +225,6 @@ static inline unsigned int mmu_psize_to_shift(unsigned int mmu_psize)
BUG();
}

-static inline bool arch_vmap_try_size(unsigned long addr, unsigned long end, u64 pfn,
- unsigned int max_page_shift, unsigned long size)
-{
- if (end - addr < size)
- return false;
-
- if ((1UL << max_page_shift) < size)
- return false;
-
- if (!IS_ALIGNED(addr, size))
- return false;
-
- if (!IS_ALIGNED(PFN_PHYS(pfn), size))
- return false;
-
- return true;
-}
-
-static inline unsigned long arch_vmap_pte_range_map_size(unsigned long addr, unsigned long end,
- u64 pfn, unsigned int max_page_shift)
-{
- if (arch_vmap_try_size(addr, end, pfn, max_page_shift, SZ_512K))
- return SZ_512K;
- if (PAGE_SIZE == SZ_16K)
- return SZ_16K;
- if (arch_vmap_try_size(addr, end, pfn, max_page_shift, SZ_16K))
- return SZ_16K;
- return PAGE_SIZE;
-}
-#define arch_vmap_pte_range_map_size arch_vmap_pte_range_map_size
-
-static inline int arch_vmap_pte_supported_shift(unsigned long size)
-{
- if (size >= SZ_512K)
- return 19;
- else if (size >= SZ_16K)
- return 14;
- else
- return PAGE_SHIFT;
-}
-#define arch_vmap_pte_supported_shift arch_vmap_pte_supported_shift
-
/* patch sites */
extern s32 patch__itlbmiss_exit_1, patch__dtlbmiss_exit_1;
extern s32 patch__itlbmiss_perf, patch__dtlbmiss_perf;
--
2.32.0.402.g57bb445576-goog

2021-07-20 12:41:37

by Ard Biesheuvel

Subject: Re: [PATCH 0/2] Fix arm64 boot regression in 5.14

On Tue, 20 Jul 2021 at 14:35, Will Deacon <[email protected]> wrote:
>
> Hi folks,
>
> Jonathan reports [1] that commit c742199a014d ("mm/pgtable: add stubs
> for {pmd/pub}_{set/clear}_huge") breaks the boot on arm64 when huge
> mappings are used to map the kernel linear map but the VA size is
> configured such that PUDs are folded. This is because the non-functional
> pud_set_huge() stub is used to create the linear map, which results in
> 1GB holes and a fatal data abort when the kernel attempts to access them.
>
> Digging further into the issue, it also transpired that huge-vmap is
> silently disabled in these configurations as well [2], despite working
> correctly in 5.13. The latter issue causes the pgtable selftests to
> scream due to a failing consistency check [3].
>
> Rather than leave mainline in a terminally broken state for arm64 while
> we figure this out, revert the offending commit to get things working
> again. Unfortunately, reverting the change in isolation causes a build
> breakage for 32-bit PowerPC 8xx machines which recently started relying
> on the problematic stubs to support pte-level huge-vmap entries [4].
> Since Christophe is away at the moment, this series first reverts the
> PowerPC 8xx change in order to avoid breaking the build.
>
> I would really like this to land for -rc3 and I can take these via the
> arm64 fixes queue if the PowerPC folks are alright with them.
>
> Cheers,
>
> Will
>
> [1] https://lore.kernel.org/r/[email protected]
> [2] https://lore.kernel.org/r/20210719104918.GA6440@willie-the-truck
> [3] https://lore.kernel.org/r/CAMuHMdXShORDox-xxaeUfDW3wx2PeggFSqhVSHVZNKCGK-y_vQ@mail.gmail.com/
> [4] https://lore.kernel.org/r/8b972f1c03fb6bd59953035f0a3e4d26659de4f8.1620795204.git.christophe.leroy@csgroup.eu/
>
> Cc: Ard Biesheuvel <[email protected]>
> Cc: Michael Ellerman <[email protected]>
> Cc: Thomas Gleixner <[email protected]>
> Cc: Benjamin Herrenschmidt <[email protected]>
> Cc: Christophe Leroy <[email protected]>
> Cc: Paul Mackerras <[email protected]>
> Cc: Jonathan Marek <[email protected]>
> Cc: Catalin Marinas <[email protected]>
> Cc: Andrew Morton <[email protected]>
> Cc: Nicholas Piggin <[email protected]>
> Cc: Mike Rapoport <[email protected]>
> Cc: Mark Rutland <[email protected]>
> Cc: Geert Uytterhoeven <[email protected]>
> Cc: Marc Zyngier <[email protected]>
> Cc: [email protected]
> Cc: [email protected]
>
> --->8
>
> Jonathan Marek (1):
> Revert "mm/pgtable: add stubs for {pmd/pub}_{set/clear}_huge"
>
> Will Deacon (1):
> Revert "powerpc/8xx: add support for huge pages on VMAP and VMALLOC"
>

Reviewed-by: Ard Biesheuvel <[email protected]>


> arch/arm64/mm/mmu.c | 20 ++++-----
> arch/powerpc/Kconfig | 2 +-
> arch/powerpc/include/asm/nohash/32/mmu-8xx.h | 43 --------------------
> arch/x86/mm/pgtable.c | 34 +++++++---------
> include/linux/pgtable.h | 26 +-----------
> 5 files changed, 25 insertions(+), 100 deletions(-)
>
> --
> 2.32.0.402.g57bb445576-goog
>

2021-07-20 12:55:27

by Marc Zyngier

Subject: Re: [PATCH 0/2] Fix arm64 boot regression in 5.14

On 2021-07-20 13:35, Will Deacon wrote:
> Hi folks,
>
> Jonathan reports [1] that commit c742199a014d ("mm/pgtable: add stubs
> for {pmd/pub}_{set/clear}_huge") breaks the boot on arm64 when huge
> mappings are used to map the kernel linear map but the VA size is
> configured such that PUDs are folded. This is because the non-functional
> pud_set_huge() stub is used to create the linear map, which results in
> 1GB holes and a fatal data abort when the kernel attempts to access them.
>
> Digging further into the issue, it also transpired that huge-vmap is
> silently disabled in these configurations as well [2], despite working
> correctly in 5.13. The latter issue causes the pgtable selftests to
> scream due to a failing consistency check [3].
>
> Rather than leave mainline in a terminally broken state for arm64 while
> we figure this out, revert the offending commit to get things working
> again. Unfortunately, reverting the change in isolation causes a build
> breakage for 32-bit PowerPC 8xx machines which recently started relying
> on the problematic stubs to support pte-level huge-vmap entries [4].
> Since Christophe is away at the moment, this series first reverts the
> PowerPC 8xx change in order to avoid breaking the build.
>
> I would really like this to land for -rc3 and I can take these via the
> arm64 fixes queue if the PowerPC folks are alright with them.
>
> Cheers,
>
> Will
>
> [1] https://lore.kernel.org/r/[email protected]
> [2] https://lore.kernel.org/r/20210719104918.GA6440@willie-the-truck
> [3] https://lore.kernel.org/r/CAMuHMdXShORDox-xxaeUfDW3wx2PeggFSqhVSHVZNKCGK-y_vQ@mail.gmail.com/
> [4] https://lore.kernel.org/r/8b972f1c03fb6bd59953035f0a3e4d26659de4f8.1620795204.git.christophe.leroy@csgroup.eu/
>
> Cc: Ard Biesheuvel <[email protected]>
> Cc: Michael Ellerman <[email protected]>
> Cc: Thomas Gleixner <[email protected]>
> Cc: Benjamin Herrenschmidt <[email protected]>
> Cc: Christophe Leroy <[email protected]>
> Cc: Paul Mackerras <[email protected]>
> Cc: Jonathan Marek <[email protected]>
> Cc: Catalin Marinas <[email protected]>
> Cc: Andrew Morton <[email protected]>
> Cc: Nicholas Piggin <[email protected]>
> Cc: Mike Rapoport <[email protected]>
> Cc: Mark Rutland <[email protected]>
> Cc: Geert Uytterhoeven <[email protected]>
> Cc: Marc Zyngier <[email protected]>
> Cc: [email protected]
> Cc: [email protected]
>
> --->8
>
> Jonathan Marek (1):
> Revert "mm/pgtable: add stubs for {pmd/pub}_{set/clear}_huge"
>
> Will Deacon (1):
> Revert "powerpc/8xx: add support for huge pages on VMAP and VMALLOC"
>
> arch/arm64/mm/mmu.c | 20 ++++-----
> arch/powerpc/Kconfig | 2 +-
> arch/powerpc/include/asm/nohash/32/mmu-8xx.h | 43 --------------------
> arch/x86/mm/pgtable.c | 34 +++++++---------
> include/linux/pgtable.h | 26 +-----------
> 5 files changed, 25 insertions(+), 100 deletions(-)

Acked-by: Marc Zyngier <[email protected]>

M.
--
Jazz is not dead. It just smells funny...

2021-07-20 15:17:23

by Christophe Leroy

Subject: Re: [PATCH 1/2] Revert "powerpc/8xx: add support for huge pages on VMAP and VMALLOC"

Will Deacon <[email protected]> wrote:

> This reverts commit a6a8f7c4aa7eb50304b5c4e68eccd24313f3a785.
>
> Commit c742199a014d ("mm/pgtable: add stubs for
> {pmd/pub}_{set/clear}_huge") breaks the boot for arm64 when block
> mappings are used to create the linear map, as this relies on a working
> implementation of pXd_set_huge() even if the corresponding page-table
> levels have been folded.
>
> Although the problematic patch reverts cleanly, doing so breaks the
> build for 32-bit PowerPC 8xx machines, which rely on the default
> function definitions when the corresponding page-table levels are
> folded:
>
> | powerpc64-linux-ld: mm/vmalloc.o: in function `vunmap_pud_range':
> | linux/mm/vmalloc.c:362: undefined reference to `pud_clear_huge'
> | powerpc64-linux-ld: mm/vmalloc.o: in function `vunmap_pmd_range':
> | linux/mm/vmalloc.c:337: undefined reference to `pmd_clear_huge'
> | powerpc64-linux-ld: mm/vmalloc.o: in function `vunmap_pud_range':
> | linux/mm/vmalloc.c:362: undefined reference to `pud_clear_huge'
> | powerpc64-linux-ld: mm/vmalloc.o: in function `vunmap_pmd_range':
> | linux/mm/vmalloc.c:337: undefined reference to `pmd_clear_huge'
> | make: *** [Makefile:1177: vmlinux] Error 1
>
> Although Christophe has kindly offered to look into the arm64 breakage,
> he's on holiday for another 10 days and there isn't an obvious fix on
> the arm64 side which allows us to continue using huge-vmap for affected
> configurations.
>
> In the interest of quickly getting things back to a working state as
> they were in 5.13, revert the huge-vmap changes for PowerPC 8xx prior to
> reverting the change which breaks arm64. We can then work on this
> together for 5.15 once Christophe is back.

Instead of reverting this long-awaited functionality, could you please
just add the following two functions in arch/powerpc/mm/nohash/8xx.c:

int pud_clear_huge(pud_t *pud)
{
return 0;
}

int pmd_clear_huge(pmd_t *pmd)
{
return 0;
}
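
(For context: 8xx creates its huge vmap mappings at the PTE level via
arch_vmap_pte_range_map_size(), so there is never a huge PMD or PUD
entry for these helpers to tear down, and returning 0 unconditionally
is always correct there.)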

Thank you
Christophe

>
> Cc: Ard Biesheuvel <[email protected]>
> Cc: Michael Ellerman <[email protected]>
> Cc: Benjamin Herrenschmidt <[email protected]>
> Cc: Christophe Leroy <[email protected]>
> Cc: Paul Mackerras <[email protected]>
> Cc: Catalin Marinas <[email protected]>
> Cc: Andrew Morton <[email protected]>
> Cc: Nicholas Piggin <[email protected]>
> Cc: Mike Rapoport <[email protected]>
> Cc: Mark Rutland <[email protected]>
> Cc: Geert Uytterhoeven <[email protected]>
> Cc: Marc Zyngier <[email protected]>
> Link: https://lore.kernel.org/r/[email protected]
> Signed-off-by: Will Deacon <[email protected]>
> ---
> arch/powerpc/Kconfig | 2 +-
> arch/powerpc/include/asm/nohash/32/mmu-8xx.h | 43 --------------------
> 2 files changed, 1 insertion(+), 44 deletions(-)
>
> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> index d01e3401581d..5fc19ac62cb9 100644
> --- a/arch/powerpc/Kconfig
> +++ b/arch/powerpc/Kconfig
> @@ -189,7 +189,7 @@ config PPC
> select GENERIC_VDSO_TIME_NS
> select HAVE_ARCH_AUDITSYSCALL
> select HAVE_ARCH_HUGE_VMALLOC if HAVE_ARCH_HUGE_VMAP
> - select HAVE_ARCH_HUGE_VMAP if PPC_RADIX_MMU || PPC_8xx
> + select HAVE_ARCH_HUGE_VMAP if PPC_BOOK3S_64 && PPC_RADIX_MMU
> select HAVE_ARCH_JUMP_LABEL
> select HAVE_ARCH_JUMP_LABEL_RELATIVE
> select HAVE_ARCH_KASAN if PPC32 && PPC_PAGE_SHIFT <= 14
> diff --git a/arch/powerpc/include/asm/nohash/32/mmu-8xx.h b/arch/powerpc/include/asm/nohash/32/mmu-8xx.h
> index 997cec973406..6e4faa0a9b35 100644
> --- a/arch/powerpc/include/asm/nohash/32/mmu-8xx.h
> +++ b/arch/powerpc/include/asm/nohash/32/mmu-8xx.h
> @@ -178,7 +178,6 @@
> #ifndef __ASSEMBLY__
>
> #include <linux/mmdebug.h>
> -#include <linux/sizes.h>
>
> void mmu_pin_tlb(unsigned long top, bool readonly);
>
> @@ -226,48 +225,6 @@ static inline unsigned int mmu_psize_to_shift(unsigned int mmu_psize)
> BUG();
> }
>
> -static inline bool arch_vmap_try_size(unsigned long addr, unsigned long end, u64 pfn,
> - unsigned int max_page_shift, unsigned long size)
> -{
> - if (end - addr < size)
> - return false;
> -
> - if ((1UL << max_page_shift) < size)
> - return false;
> -
> - if (!IS_ALIGNED(addr, size))
> - return false;
> -
> - if (!IS_ALIGNED(PFN_PHYS(pfn), size))
> - return false;
> -
> - return true;
> -}
> -
> -static inline unsigned long arch_vmap_pte_range_map_size(unsigned long addr, unsigned long end,
> - u64 pfn, unsigned int max_page_shift)
> -{
> - if (arch_vmap_try_size(addr, end, pfn, max_page_shift, SZ_512K))
> - return SZ_512K;
> - if (PAGE_SIZE == SZ_16K)
> - return SZ_16K;
> - if (arch_vmap_try_size(addr, end, pfn, max_page_shift, SZ_16K))
> - return SZ_16K;
> - return PAGE_SIZE;
> -}
> -#define arch_vmap_pte_range_map_size arch_vmap_pte_range_map_size
> -
> -static inline int arch_vmap_pte_supported_shift(unsigned long size)
> -{
> - if (size >= SZ_512K)
> - return 19;
> - else if (size >= SZ_16K)
> - return 14;
> - else
> - return PAGE_SHIFT;
> -}
> -#define arch_vmap_pte_supported_shift arch_vmap_pte_supported_shift
> -
> /* patch sites */
> extern s32 patch__itlbmiss_exit_1, patch__dtlbmiss_exit_1;
> extern s32 patch__itlbmiss_perf, patch__dtlbmiss_perf;
> --
> 2.32.0.402.g57bb445576-goog


2021-07-20 18:28:06

by Christophe Leroy

Subject: Re: [PATCH 0/2] Fix arm64 boot regression in 5.14

Will Deacon <[email protected]> wrote:

> Hi folks,
>
> Jonathan reports [1] that commit c742199a014d ("mm/pgtable: add stubs
> for {pmd/pub}_{set/clear}_huge") breaks the boot on arm64 when huge
> mappings are used to map the kernel linear map but the VA size is
> configured such that PUDs are folded. This is because the non-functional
> pud_set_huge() stub is used to create the linear map, which results in
> 1GB holes and a fatal data abort when the kernel attempts to access them.
>
> Digging further into the issue, it also transpired that huge-vmap is
> silently disabled in these configurations as well [2], despite working
> correctly in 5.13. The latter issue causes the pgtable selftests to
> scream due to a failing consistency check [3].
>
> Rather than leave mainline in a terminally broken state for arm64 while
> we figure this out, revert the offending commit to get things working
> again. Unfortunately, reverting the change in isolation causes a build
> breakage for 32-bit PowerPC 8xx machines which recently started relying
> on the problematic stubs to support pte-level huge-vmap entries [4].
> Since Christophe is away at the moment, this series first reverts the
> PowerPC 8xx change in order to avoid breaking the build.
>
> I would really like this to land for -rc3 and I can take these via the
> arm64 fixes queue if the PowerPC folks are alright with them.
>

If you can drop patch 1, change patch 2 to add the following two
functions in arch/powerpc/mm/nohash/8xx.c:

int pud_clear_huge(pud_t *pud)
{
return 0;
}

int pmd_clear_huge(pmd_t *pmd)
{
return 0;
}

Then feel free to take it via ARM fixes with my acked-by as maintainer
of PPC8XX.

Christophe


> Cheers,
>
> Will
>
> [1] https://lore.kernel.org/r/[email protected]
> [2] https://lore.kernel.org/r/20210719104918.GA6440@willie-the-truck
> [3] https://lore.kernel.org/r/CAMuHMdXShORDox-xxaeUfDW3wx2PeggFSqhVSHVZNKCGK-y_vQ@mail.gmail.com/
> [4] https://lore.kernel.org/r/8b972f1c03fb6bd59953035f0a3e4d26659de4f8.1620795204.git.christophe.leroy@csgroup.eu/
>
> Cc: Ard Biesheuvel <[email protected]>
> Cc: Michael Ellerman <[email protected]>
> Cc: Thomas Gleixner <[email protected]>
> Cc: Benjamin Herrenschmidt <[email protected]>
> Cc: Christophe Leroy <[email protected]>
> Cc: Paul Mackerras <[email protected]>
> Cc: Jonathan Marek <[email protected]>
> Cc: Catalin Marinas <[email protected]>
> Cc: Andrew Morton <[email protected]>
> Cc: Nicholas Piggin <[email protected]>
> Cc: Mike Rapoport <[email protected]>
> Cc: Mark Rutland <[email protected]>
> Cc: Geert Uytterhoeven <[email protected]>
> Cc: Marc Zyngier <[email protected]>
> Cc: [email protected]
> Cc: [email protected]
>
> --->8
>
> Jonathan Marek (1):
> Revert "mm/pgtable: add stubs for {pmd/pub}_{set/clear}_huge"
>
> Will Deacon (1):
> Revert "powerpc/8xx: add support for huge pages on VMAP and VMALLOC"
>
> arch/arm64/mm/mmu.c | 20 ++++-----
> arch/powerpc/Kconfig | 2 +-
> arch/powerpc/include/asm/nohash/32/mmu-8xx.h | 43 --------------------
> arch/x86/mm/pgtable.c | 34 +++++++---------
> include/linux/pgtable.h | 26 +-----------
> 5 files changed, 25 insertions(+), 100 deletions(-)
>
> --
> 2.32.0.402.g57bb445576-goog


2021-07-21 07:06:53

by Michael Ellerman

Subject: [PATCH v2] Revert "mm/pgtable: add stubs for {pmd/pub}_{set/clear}_huge"

From: Jonathan Marek <[email protected]>

This reverts commit c742199a014de23ee92055c2473d91fe5561ffdf.

c742199a014d ("mm/pgtable: add stubs for {pmd/pub}_{set/clear}_huge")
breaks arm64 in at least two ways for configurations where PUD or PMD
folding occur:

1. We no longer install huge-vmap mappings and silently fall back to
page-granular entries, despite being able to install block entries
at what is effectively the PGD level.

2. If the linear map is backed with block mappings, these will now
silently fail to be created in alloc_init_pud(), causing a panic
early during boot.

The pgtable selftests caught this, although a fix has not been
forthcoming and Christophe is AWOL at the moment, so just revert the
change for now to get a working -rc3 on which we can queue patches for
5.15.

A simple revert breaks the build for 32-bit PowerPC 8xx machines, which
rely on the default function definitions when the corresponding
page-table levels are folded, since commit a6a8f7c4aa7e ("powerpc/8xx:
add support for huge pages on VMAP and VMALLOC"), e.g.:

powerpc64-linux-ld: mm/vmalloc.o: in function `vunmap_pud_range':
linux/mm/vmalloc.c:362: undefined reference to `pud_clear_huge'

To avoid that, add stubs for pud_clear_huge() and pmd_clear_huge() in
arch/powerpc/mm/nohash/8xx.c as suggested by Christophe.

Cc: Christophe Leroy <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Nicholas Piggin <[email protected]>
Cc: Mike Rapoport <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Geert Uytterhoeven <[email protected]>
Fixes: c742199a014d ("mm/pgtable: add stubs for {pmd/pub}_{set/clear}_huge")
Signed-off-by: Jonathan Marek <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
Reviewed-by: Ard Biesheuvel <[email protected]>
Acked-by: Marc Zyngier <[email protected]>
[mpe: Fold in 8xx.c changes from Christophe and mention in change log]
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/linux-arm-kernel/CAMuHMdXShORDox-xxaeUfDW3wx2PeggFSqhVSHVZNKCGK-y_vQ@mail.gmail.com/
Link: https://lore.kernel.org/r/[email protected]
---
arch/arm64/mm/mmu.c | 20 ++++++++------------
arch/powerpc/mm/nohash/8xx.c | 10 ++++++++++
arch/x86/mm/pgtable.c | 34 +++++++++++++++-------------------
include/linux/pgtable.h | 26 +-------------------------
4 files changed, 34 insertions(+), 56 deletions(-)

v2: Fold in suggestion from Christophe to add stubs for 8xx.

I kept the reviewed-by/acked-by tags from arm64 folks as the patch is
unchanged as far as arm64 is concerned.

Please take this via the arm64 tree.

cheers


diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index d74586508448..9ff0de1b2b93 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -1339,7 +1339,6 @@ void *__init fixmap_remap_fdt(phys_addr_t dt_phys, int *size, pgprot_t prot)
return dt_virt;
}

-#if CONFIG_PGTABLE_LEVELS > 3
int pud_set_huge(pud_t *pudp, phys_addr_t phys, pgprot_t prot)
{
pud_t new_pud = pfn_pud(__phys_to_pfn(phys), mk_pud_sect_prot(prot));
@@ -1354,16 +1353,6 @@ int pud_set_huge(pud_t *pudp, phys_addr_t phys, pgprot_t prot)
return 1;
}

-int pud_clear_huge(pud_t *pudp)
-{
- if (!pud_sect(READ_ONCE(*pudp)))
- return 0;
- pud_clear(pudp);
- return 1;
-}
-#endif
-
-#if CONFIG_PGTABLE_LEVELS > 2
int pmd_set_huge(pmd_t *pmdp, phys_addr_t phys, pgprot_t prot)
{
pmd_t new_pmd = pfn_pmd(__phys_to_pfn(phys), mk_pmd_sect_prot(prot));
@@ -1378,6 +1367,14 @@ int pmd_set_huge(pmd_t *pmdp, phys_addr_t phys, pgprot_t prot)
return 1;
}

+int pud_clear_huge(pud_t *pudp)
+{
+ if (!pud_sect(READ_ONCE(*pudp)))
+ return 0;
+ pud_clear(pudp);
+ return 1;
+}
+
int pmd_clear_huge(pmd_t *pmdp)
{
if (!pmd_sect(READ_ONCE(*pmdp)))
@@ -1385,7 +1382,6 @@ int pmd_clear_huge(pmd_t *pmdp)
pmd_clear(pmdp);
return 1;
}
-#endif

int pmd_free_pte_page(pmd_t *pmdp, unsigned long addr)
{
diff --git a/arch/powerpc/mm/nohash/8xx.c b/arch/powerpc/mm/nohash/8xx.c
index 60780e089118..0df9fe29dd56 100644
--- a/arch/powerpc/mm/nohash/8xx.c
+++ b/arch/powerpc/mm/nohash/8xx.c
@@ -240,3 +240,13 @@ void __init setup_kuap(bool disabled)
mtspr(SPRN_MD_AP, MD_APG_KUAP);
}
#endif
+
+int pud_clear_huge(pud_t *pud)
+{
+ return 0;
+}
+
+int pmd_clear_huge(pmd_t *pmd)
+{
+ return 0;
+}
diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c
index 3364fe62b903..3481b35cb4ec 100644
--- a/arch/x86/mm/pgtable.c
+++ b/arch/x86/mm/pgtable.c
@@ -682,7 +682,6 @@ int p4d_clear_huge(p4d_t *p4d)
}
#endif

-#if CONFIG_PGTABLE_LEVELS > 3
/**
* pud_set_huge - setup kernel PUD mapping
*
@@ -721,23 +720,6 @@ int pud_set_huge(pud_t *pud, phys_addr_t addr, pgprot_t prot)
return 1;
}

-/**
- * pud_clear_huge - clear kernel PUD mapping when it is set
- *
- * Returns 1 on success and 0 on failure (no PUD map is found).
- */
-int pud_clear_huge(pud_t *pud)
-{
- if (pud_large(*pud)) {
- pud_clear(pud);
- return 1;
- }
-
- return 0;
-}
-#endif
-
-#if CONFIG_PGTABLE_LEVELS > 2
/**
* pmd_set_huge - setup kernel PMD mapping
*
@@ -768,6 +750,21 @@ int pmd_set_huge(pmd_t *pmd, phys_addr_t addr, pgprot_t prot)
return 1;
}

+/**
+ * pud_clear_huge - clear kernel PUD mapping when it is set
+ *
+ * Returns 1 on success and 0 on failure (no PUD map is found).
+ */
+int pud_clear_huge(pud_t *pud)
+{
+ if (pud_large(*pud)) {
+ pud_clear(pud);
+ return 1;
+ }
+
+ return 0;
+}
+
/**
* pmd_clear_huge - clear kernel PMD mapping when it is set
*
@@ -782,7 +779,6 @@ int pmd_clear_huge(pmd_t *pmd)

return 0;
}
-#endif

#ifdef CONFIG_X86_64
/**
diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
index d147480cdefc..e24d2c992b11 100644
--- a/include/linux/pgtable.h
+++ b/include/linux/pgtable.h
@@ -1397,34 +1397,10 @@ static inline int p4d_clear_huge(p4d_t *p4d)
}
#endif /* !__PAGETABLE_P4D_FOLDED */

-#ifndef __PAGETABLE_PUD_FOLDED
int pud_set_huge(pud_t *pud, phys_addr_t addr, pgprot_t prot);
-int pud_clear_huge(pud_t *pud);
-#else
-static inline int pud_set_huge(pud_t *pud, phys_addr_t addr, pgprot_t prot)
-{
- return 0;
-}
-static inline int pud_clear_huge(pud_t *pud)
-{
- return 0;
-}
-#endif /* !__PAGETABLE_PUD_FOLDED */
-
-#ifndef __PAGETABLE_PMD_FOLDED
int pmd_set_huge(pmd_t *pmd, phys_addr_t addr, pgprot_t prot);
+int pud_clear_huge(pud_t *pud);
int pmd_clear_huge(pmd_t *pmd);
-#else
-static inline int pmd_set_huge(pmd_t *pmd, phys_addr_t addr, pgprot_t prot)
-{
- return 0;
-}
-static inline int pmd_clear_huge(pmd_t *pmd)
-{
- return 0;
-}
-#endif /* !__PAGETABLE_PMD_FOLDED */
-
int p4d_free_pud_page(p4d_t *p4d, unsigned long addr);
int pud_free_pmd_page(pud_t *pud, unsigned long addr);
int pmd_free_pte_page(pmd_t *pmd, unsigned long addr);

--
2.25.1

2021-07-21 10:54:57

by Will Deacon

Subject: Re: [PATCH v2] Revert "mm/pgtable: add stubs for {pmd/pub}_{set/clear}_huge"

On Wed, 21 Jul 2021 17:02:13 +1000, Michael Ellerman wrote:
> This reverts commit c742199a014de23ee92055c2473d91fe5561ffdf.
>
> c742199a014d ("mm/pgtable: add stubs for {pmd/pub}_{set/clear}_huge")
> breaks arm64 in at least two ways for configurations where PUD or PMD
> folding occur:
>
> 1. We no longer install huge-vmap mappings and silently fall back to
> page-granular entries, despite being able to install block entries
> at what is effectively the PGD level.
>
> [...]

Thank you Michael! I owe you a beer next time I see you, if we don't go
extinct before then.

Applied to arm64 (for-next/fixes), thanks!

[1/1] Revert "mm/pgtable: add stubs for {pmd/pub}_{set/clear}_huge"
https://git.kernel.org/arm64/c/d8a719059b9d

Cheers,
--
Will

https://fixes.arm64.dev
https://next.arm64.dev
https://will.arm64.dev

2021-07-23 12:08:38

by Michael Ellerman

Subject: Re: [PATCH v2] Revert "mm/pgtable: add stubs for {pmd/pub}_{set/clear}_huge"

Will Deacon <[email protected]> writes:
> On Wed, 21 Jul 2021 17:02:13 +1000, Michael Ellerman wrote:
>> This reverts commit c742199a014de23ee92055c2473d91fe5561ffdf.
>>
>> c742199a014d ("mm/pgtable: add stubs for {pmd/pub}_{set/clear}_huge")
>> breaks arm64 in at least two ways for configurations where PUD or PMD
>> folding occur:
>>
>> 1. We no longer install huge-vmap mappings and silently fall back to
>> page-granular entries, despite being able to install block entries
>> at what is effectively the PGD level.
>>
>> [...]
>
> Thank you Michael! I owe you a beer next time I see you, if we don't go
> extinct before then.

No worries, thanks to Christophe for identifying the solution while on
vacation!

Beers seem a long way off, but hopefully one day :)

cheers