2024-02-27 20:52:09

by Alexandre Ghiti

[permalink] [raw]
Subject: [PATCH -fixes 0/2] NAPOT Fixes

This contains 2 fixes for NAPOT: patch 1 disables the use of NAPOT
mapping for vmalloc/vmap and patch 2 implements pte_leaf_size() to
report NAPOT size.

Alexandre Ghiti (2):
Revert "riscv: mm: support Svnapot in huge vmap"
riscv: Fix pte_leaf_size() for NAPOT

arch/riscv/include/asm/pgtable.h | 4 +++
arch/riscv/include/asm/vmalloc.h | 61 +-------------------------------
2 files changed, 5 insertions(+), 60 deletions(-)

--
2.39.2



2024-02-27 20:54:00

by Alexandre Ghiti

[permalink] [raw]
Subject: [PATCH -fixes 2/2] riscv: Fix pte_leaf_size() for NAPOT

pte_leaf_size() must be reimplemented to add support for NAPOT mappings.

Fixes: 82a1a1f3bfb6 ("riscv: mm: support Svnapot in hugetlb page")
Signed-off-by: Alexandre Ghiti <[email protected]>
---
arch/riscv/include/asm/pgtable.h | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h
index 0c94260b5d0c..89f5f1bd6e46 100644
--- a/arch/riscv/include/asm/pgtable.h
+++ b/arch/riscv/include/asm/pgtable.h
@@ -439,6 +439,10 @@ static inline pte_t pte_mkhuge(pte_t pte)
return pte;
}

+#define pte_leaf_size(pte) (pte_napot(pte) ? \
+ napot_cont_size(napot_cont_order(pte)) :\
+ PAGE_SIZE)
+
#ifdef CONFIG_NUMA_BALANCING
/*
* See the comment in include/asm-generic/pgtable.h
--
2.39.2


2024-02-27 21:38:19

by Alexandre Ghiti

[permalink] [raw]
Subject: [PATCH -fixes 1/2] Revert "riscv: mm: support Svnapot in huge vmap"

This reverts commit ce173474cf19fe7fbe8f0fc74e3c81ec9c3d9807.

We cannot correctly deal with NAPOT mappings in vmalloc/vmap because if
some part of a NAPOT mapping is unmapped, the remaining mapping is not
updated accordingly. For example:

ptr = vmalloc_huge(64 * 1024, GFP_KERNEL);
vunmap_range((unsigned long)(ptr + PAGE_SIZE),
(unsigned long)(ptr + 64 * 1024));

leads to the following kernel page table dump:

0xffff8f8000ef0000-0xffff8f8000ef1000 0x00000001033c0000 4K PTE N .. .. D A G . . W R V

Meaning the first entry which was not unmapped still has the N bit set,
which, if accessed first and cached in the TLB, could allow access to the
unmapped range.

That's because the logic to break the NAPOT mapping does not exist and
likely won't. Indeed, to break a NAPOT mapping, we first have to clear
the whole mapping, flush the TLB and then set the new mapping ("break-
before-make" equivalent). That works fine in userspace since we can handle
any pagefault occurring on the remaining mapping but we can't handle a kernel
pagefault on such mapping.

So fix this by reverting the commit that introduced the vmap/vmalloc
support.

Fixes: ce173474cf19 ("riscv: mm: support Svnapot in huge vmap")
Signed-off-by: Alexandre Ghiti <[email protected]>
---
arch/riscv/include/asm/vmalloc.h | 61 +-------------------------------
1 file changed, 1 insertion(+), 60 deletions(-)

diff --git a/arch/riscv/include/asm/vmalloc.h b/arch/riscv/include/asm/vmalloc.h
index 924d01b56c9a..51f6dfe19745 100644
--- a/arch/riscv/include/asm/vmalloc.h
+++ b/arch/riscv/include/asm/vmalloc.h
@@ -19,65 +19,6 @@ static inline bool arch_vmap_pmd_supported(pgprot_t prot)
return true;
}

-#ifdef CONFIG_RISCV_ISA_SVNAPOT
-#include <linux/pgtable.h>
+#endif

-#define arch_vmap_pte_range_map_size arch_vmap_pte_range_map_size
-static inline unsigned long arch_vmap_pte_range_map_size(unsigned long addr, unsigned long end,
- u64 pfn, unsigned int max_page_shift)
-{
- unsigned long map_size = PAGE_SIZE;
- unsigned long size, order;
-
- if (!has_svnapot())
- return map_size;
-
- for_each_napot_order_rev(order) {
- if (napot_cont_shift(order) > max_page_shift)
- continue;
-
- size = napot_cont_size(order);
- if (end - addr < size)
- continue;
-
- if (!IS_ALIGNED(addr, size))
- continue;
-
- if (!IS_ALIGNED(PFN_PHYS(pfn), size))
- continue;
-
- map_size = size;
- break;
- }
-
- return map_size;
-}
-
-#define arch_vmap_pte_supported_shift arch_vmap_pte_supported_shift
-static inline int arch_vmap_pte_supported_shift(unsigned long size)
-{
- int shift = PAGE_SHIFT;
- unsigned long order;
-
- if (!has_svnapot())
- return shift;
-
- WARN_ON_ONCE(size >= PMD_SIZE);
-
- for_each_napot_order_rev(order) {
- if (napot_cont_size(order) > size)
- continue;
-
- if (!IS_ALIGNED(size, napot_cont_size(order)))
- continue;
-
- shift = napot_cont_shift(order);
- break;
- }
-
- return shift;
-}
-
-#endif /* CONFIG_RISCV_ISA_SVNAPOT */
-#endif /* CONFIG_HAVE_ARCH_HUGE_VMAP */
#endif /* _ASM_RISCV_VMALLOC_H */
--
2.39.2


Subject: Re: [PATCH -fixes 0/2] NAPOT Fixes

Hello:

This series was applied to riscv/linux.git (fixes)
by Palmer Dabbelt <[email protected]>:

On Tue, 27 Feb 2024 21:50:14 +0100 you wrote:
> This contains 2 fixes for NAPOT: patch 1 disables the use of NAPOT
> mapping for vmalloc/vmap and patch 2 implements pte_leaf_size() to
> report NAPOT size.
>
> Alexandre Ghiti (2):
> Revert "riscv: mm: support Svnapot in huge vmap"
> riscv: Fix pte_leaf_size() for NAPOT
>
> [...]

Here is the summary with links:
- [-fixes,1/2] Revert "riscv: mm: support Svnapot in huge vmap"
https://git.kernel.org/riscv/c/16ab4646c905
- [-fixes,2/2] riscv: Fix pte_leaf_size() for NAPOT
https://git.kernel.org/riscv/c/e0fe5ab4192c

You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html