2024-03-18 20:04:23

by Peter Xu

Subject: [PATCH v2 00/14] mm/treewide: Remove pXd_huge() API

From: Peter Xu <[email protected]>

[based on akpm/mm-unstable commit b66d4391d8fe, March 18th]

v2:
- Add a patch to cleanup ARM's pmd_thp_or_huge [Christophe]
- Enhance commit message for PowerPC patch on hugepd [Christophe]

v1: https://lore.kernel.org/r/[email protected]

In previous work [1], we removed the arch-specific pXd_large() API. This
patchset further removes the hugetlb pXd_huge() API.

Hugetlb mappings were never special compared with other huge mappings in
how they are created in the pgtables. Having a standalone API just to
detect such pgtable entries is redundant, especially now that the
pXd_leaf() API set is available regardless of CONFIG_HUGETLB_PAGE.
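
As a reference, pXd_leaf() always has generic fallbacks in
include/linux/pgtable.h (roughly as below, one per level), so it can be
used on any arch and at any level even when huge mappings aren't
supported there:

	#ifndef pud_leaf
	#define pud_leaf(x)	false
	#endif
	#ifndef pmd_leaf
	#define pmd_leaf(x)	false
	#endif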

While looking into this, a few issues were also exposed showing that we
don't have a clear definition of the *_huge() API variants. This patchset
starts by cleaning up those issues, then replaces all *_huge() users with
*_leaf(), and finally drops all *_huge() code.

On x86/sparc, swap entries are reported as "true" by pXd_huge(), while
all the other archs report them as "false". This part is handled in
patches 1-5; I suspect patch 1 can be seen as a bug fix, but I'll leave
that to the hmm experts to decide.

Besides that, three archs (arm, arm64, powerpc) have slightly different
definitions for the *_huge() vs. *_leaf() variants. I tackled them
separately so that it'll be easier for the arch experts to chime in when
necessary. This part is done in patches 6-9.

The final patches 10-14 do the rest of the removal. Since *_leaf() will
be the ultimate API going forward, and there has been quite some
confusion on how the *_huge() APIs should be defined, provide a rich
comment for the *_leaf() API set to define it properly and avoid future
misuse. Hopefully that will also help new archs to start supporting huge
mappings while avoiding the known traps (e.g. swap entries, or PROT_NONE
entry checks).

The whole series is lightly tested on x86 and arm64.

[1] https://lore.kernel.org/r/[email protected]

Peter Xu (14):
mm/hmm: Process pud swap entry without pud_huge()
mm/gup: Cache p4d in follow_p4d_mask()
mm/gup: Check p4d presence before going on
mm/x86: Change pXd_huge() behavior to exclude swap entries
mm/sparc: Change pXd_huge() behavior to exclude swap entries
mm/arm: Use macros to define pmd/pud helpers
mm/arm: Redefine pmd_huge() with pmd_leaf()
mm/arm64: Merge pXd_huge() and pXd_leaf() definitions
mm/powerpc: Redefine pXd_huge() with pXd_leaf()
mm/gup: Merge pXd huge mapping checks
mm/treewide: Replace pXd_huge() with pXd_leaf()
mm/treewide: Remove pXd_huge()
mm/arm: Remove pmd_thp_or_huge()
mm: Document pXd_leaf() API

arch/arm/include/asm/pgtable-2level.h | 5 ++-
arch/arm/include/asm/pgtable-3level-hwdef.h | 1 +
arch/arm/include/asm/pgtable-3level.h | 5 ++-
arch/arm/lib/uaccess_with_memcpy.c | 4 +--
arch/arm/mm/Makefile | 1 -
arch/arm/mm/hugetlbpage.c | 34 -------------------
arch/arm64/include/asm/pgtable.h | 6 ++--
arch/arm64/mm/hugetlbpage.c | 18 ++--------
arch/loongarch/mm/hugetlbpage.c | 12 +------
arch/mips/include/asm/pgtable-32.h | 2 +-
arch/mips/include/asm/pgtable-64.h | 2 +-
arch/mips/mm/hugetlbpage.c | 10 ------
arch/mips/mm/tlb-r4k.c | 2 +-
arch/parisc/mm/hugetlbpage.c | 11 ------
.../include/asm/book3s/64/pgtable-4k.h | 20 -----------
.../include/asm/book3s/64/pgtable-64k.h | 25 --------------
arch/powerpc/include/asm/book3s/64/pgtable.h | 27 +++++++--------
arch/powerpc/include/asm/nohash/pgtable.h | 10 ------
arch/powerpc/mm/pgtable_64.c | 6 ++--
arch/riscv/mm/hugetlbpage.c | 10 ------
arch/s390/mm/hugetlbpage.c | 10 ------
arch/sh/mm/hugetlbpage.c | 10 ------
arch/sparc/mm/hugetlbpage.c | 12 -------
arch/x86/mm/hugetlbpage.c | 26 --------------
arch/x86/mm/pgtable.c | 4 +--
include/linux/hugetlb.h | 24 -------------
include/linux/pgtable.h | 24 ++++++++++---
mm/gup.c | 24 ++++++-------
mm/hmm.c | 9 ++---
mm/memory.c | 2 +-
30 files changed, 68 insertions(+), 288 deletions(-)
delete mode 100644 arch/arm/mm/hugetlbpage.c

--
2.44.0



2024-03-18 20:04:25

by Peter Xu

Subject: [PATCH v2 01/14] mm/hmm: Process pud swap entry without pud_huge()

From: Peter Xu <[email protected]>

Swap pud entries do not always return true for pud_huge() on all archs.
x86 and sparc (so far) allow it, but all the rest do not report a swap
entry as pud_huge(). So it's not safe to rely on pud_huge() to catch swap
entries. Check for swap entries before calling pud_huge(), which is
always safe.

This is the only place in the kernel that (IMHO, wrongly) relies on
pud_huge() to return true on pud swap entries. The plan is to clean up
pXd_huge() so that it only reports non-swap mappings on all archs.

Cc: Alistair Popple <[email protected]>
Reviewed-by: Jason Gunthorpe <[email protected]>
Signed-off-by: Peter Xu <[email protected]>
---
mm/hmm.c | 7 +------
1 file changed, 1 insertion(+), 6 deletions(-)

diff --git a/mm/hmm.c b/mm/hmm.c
index 277ddcab4947..c95b9ec5d95f 100644
--- a/mm/hmm.c
+++ b/mm/hmm.c
@@ -424,7 +424,7 @@ static int hmm_vma_walk_pud(pud_t *pudp, unsigned long start, unsigned long end,
walk->action = ACTION_CONTINUE;

pud = READ_ONCE(*pudp);
- if (pud_none(pud)) {
+ if (!pud_present(pud)) {
spin_unlock(ptl);
return hmm_vma_walk_hole(start, end, -1, walk);
}
@@ -435,11 +435,6 @@ static int hmm_vma_walk_pud(pud_t *pudp, unsigned long start, unsigned long end,
unsigned long *hmm_pfns;
unsigned long cpu_flags;

- if (!pud_present(pud)) {
- spin_unlock(ptl);
- return hmm_vma_walk_hole(start, end, -1, walk);
- }
-
i = (addr - range->start) >> PAGE_SHIFT;
npages = (end - addr) >> PAGE_SHIFT;
hmm_pfns = &range->hmm_pfns[i];
--
2.44.0


2024-03-18 20:04:40

by Peter Xu

Subject: [PATCH v2 02/14] mm/gup: Cache p4d in follow_p4d_mask()

From: Peter Xu <[email protected]>

Add a variable to cache p4d in follow_p4d_mask(). It's good practice to
make sure all the following checks operate on a consistent view of the
entry.

Signed-off-by: Peter Xu <[email protected]>
---
mm/gup.c | 13 +++++++------
1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/mm/gup.c b/mm/gup.c
index df83182ec72d..69a777f4fc5c 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -772,16 +772,17 @@ static struct page *follow_p4d_mask(struct vm_area_struct *vma,
unsigned int flags,
struct follow_page_context *ctx)
{
- p4d_t *p4d;
+ p4d_t *p4dp, p4d;

- p4d = p4d_offset(pgdp, address);
- if (p4d_none(*p4d))
+ p4dp = p4d_offset(pgdp, address);
+ p4d = READ_ONCE(*p4dp);
+ if (p4d_none(p4d))
return no_page_table(vma, flags);
- BUILD_BUG_ON(p4d_huge(*p4d));
- if (unlikely(p4d_bad(*p4d)))
+ BUILD_BUG_ON(p4d_huge(p4d));
+ if (unlikely(p4d_bad(p4d)))
return no_page_table(vma, flags);

- return follow_pud_mask(vma, address, p4d, flags, ctx);
+ return follow_pud_mask(vma, address, p4dp, flags, ctx);
}

/**
--
2.44.0


2024-03-18 20:05:06

by Peter Xu

Subject: [PATCH v2 03/14] mm/gup: Check p4d presence before going on

From: Peter Xu <[email protected]>

Currently there should be no p4d swap entries, so it may not matter
much; however, this helps us rule out swap entries in the pXd_huge() API,
which includes p4d_huge(). The p4d_present() checks make it 100% clear
that we won't rely on p4d_huge() for swap entries.

Signed-off-by: Peter Xu <[email protected]>
---
mm/gup.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/gup.c b/mm/gup.c
index 69a777f4fc5c..802987281b2f 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -776,7 +776,7 @@ static struct page *follow_p4d_mask(struct vm_area_struct *vma,

p4dp = p4d_offset(pgdp, address);
p4d = READ_ONCE(*p4dp);
- if (p4d_none(p4d))
+ if (!p4d_present(p4d))
return no_page_table(vma, flags);
BUILD_BUG_ON(p4d_huge(p4d));
if (unlikely(p4d_bad(p4d)))
@@ -3069,7 +3069,7 @@ static int gup_p4d_range(pgd_t *pgdp, pgd_t pgd, unsigned long addr, unsigned lo
p4d_t p4d = READ_ONCE(*p4dp);

next = p4d_addr_end(addr, end);
- if (p4d_none(p4d))
+ if (!p4d_present(p4d))
return 0;
BUILD_BUG_ON(p4d_huge(p4d));
if (unlikely(is_hugepd(__hugepd(p4d_val(p4d))))) {
--
2.44.0


2024-03-18 20:05:18

by Peter Xu

Subject: [PATCH v2 05/14] mm/sparc: Change pXd_huge() behavior to exclude swap entries

From: Peter Xu <[email protected]>

Please refer to the previous patch on the reasoning for x86. Now sparc is
the only architecture that will allow swap entries to be reported as
pXd_huge(). After this patch, all architectures should forbid swap entries
in pXd_huge().

Cc: David S. Miller <[email protected]>
Cc: Andreas Larsson <[email protected]>
Cc: [email protected]
Signed-off-by: Peter Xu <[email protected]>
---
arch/sparc/mm/hugetlbpage.c | 6 ++----
1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/arch/sparc/mm/hugetlbpage.c b/arch/sparc/mm/hugetlbpage.c
index b432500c13a5..d31c2cec35c9 100644
--- a/arch/sparc/mm/hugetlbpage.c
+++ b/arch/sparc/mm/hugetlbpage.c
@@ -409,14 +409,12 @@ pte_t huge_ptep_get_and_clear(struct mm_struct *mm, unsigned long addr,

int pmd_huge(pmd_t pmd)
{
- return !pmd_none(pmd) &&
- (pmd_val(pmd) & (_PAGE_VALID|_PAGE_PMD_HUGE)) != _PAGE_VALID;
+ return pmd_leaf(pmd);;
}

int pud_huge(pud_t pud)
{
- return !pud_none(pud) &&
- (pud_val(pud) & (_PAGE_VALID|_PAGE_PUD_HUGE)) != _PAGE_VALID;
+ return pud_leaf(pud);
}

static void hugetlb_free_pte_range(struct mmu_gather *tlb, pmd_t *pmd,
--
2.44.0


2024-03-18 20:05:32

by Peter Xu

Subject: [PATCH v2 06/14] mm/arm: Use macros to define pmd/pud helpers

From: Peter Xu <[email protected]>

It's already confusing that ARM's 2-level vs. 3-level pgtables define
the SECT bit differently on pmds/puds. Always use a macro, which is much
clearer.

Cc: Russell King <[email protected]>
Cc: Shawn Guo <[email protected]>
Cc: Krzysztof Kozlowski <[email protected]>
Cc: Bjorn Andersson <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: Konrad Dybcio <[email protected]>
Cc: Fabio Estevam <[email protected]>
Cc: [email protected]
Signed-off-by: Peter Xu <[email protected]>
---
arch/arm/include/asm/pgtable-2level.h | 4 ++--
arch/arm/include/asm/pgtable-3level-hwdef.h | 1 +
arch/arm/include/asm/pgtable-3level.h | 4 ++--
3 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/arch/arm/include/asm/pgtable-2level.h b/arch/arm/include/asm/pgtable-2level.h
index b0a262566eb9..4245c2e74720 100644
--- a/arch/arm/include/asm/pgtable-2level.h
+++ b/arch/arm/include/asm/pgtable-2level.h
@@ -213,8 +213,8 @@ static inline pmd_t *pmd_offset(pud_t *pud, unsigned long addr)

#define pmd_pfn(pmd) (__phys_to_pfn(pmd_val(pmd) & PHYS_MASK))

-#define pmd_leaf(pmd) (pmd_val(pmd) & 2)
-#define pmd_bad(pmd) (pmd_val(pmd) & 2)
+#define pmd_leaf(pmd) (pmd_val(pmd) & PMD_TYPE_SECT)
+#define pmd_bad(pmd) pmd_leaf(pmd)
#define pmd_present(pmd) (pmd_val(pmd))

#define copy_pmd(pmdpd,pmdps) \
diff --git a/arch/arm/include/asm/pgtable-3level-hwdef.h b/arch/arm/include/asm/pgtable-3level-hwdef.h
index 2f35b4eddaa8..e7b666cf0060 100644
--- a/arch/arm/include/asm/pgtable-3level-hwdef.h
+++ b/arch/arm/include/asm/pgtable-3level-hwdef.h
@@ -14,6 +14,7 @@
* + Level 1/2 descriptor
* - common
*/
+#define PUD_TABLE_BIT (_AT(pmdval_t, 1) << 1)
#define PMD_TYPE_MASK (_AT(pmdval_t, 3) << 0)
#define PMD_TYPE_FAULT (_AT(pmdval_t, 0) << 0)
#define PMD_TYPE_TABLE (_AT(pmdval_t, 3) << 0)
diff --git a/arch/arm/include/asm/pgtable-3level.h b/arch/arm/include/asm/pgtable-3level.h
index 4b1d9eb3908a..e7aecbef75c9 100644
--- a/arch/arm/include/asm/pgtable-3level.h
+++ b/arch/arm/include/asm/pgtable-3level.h
@@ -112,7 +112,7 @@
#ifndef __ASSEMBLY__

#define pud_none(pud) (!pud_val(pud))
-#define pud_bad(pud) (!(pud_val(pud) & 2))
+#define pud_bad(pud) (!(pud_val(pud) & PUD_TABLE_BIT))
#define pud_present(pud) (pud_val(pud))
#define pmd_table(pmd) ((pmd_val(pmd) & PMD_TYPE_MASK) == \
PMD_TYPE_TABLE)
@@ -137,7 +137,7 @@ static inline pmd_t *pud_pgtable(pud_t pud)
return __va(pud_val(pud) & PHYS_MASK & (s32)PAGE_MASK);
}

-#define pmd_bad(pmd) (!(pmd_val(pmd) & 2))
+#define pmd_bad(pmd) (!(pmd_val(pmd) & PMD_TABLE_BIT))

#define copy_pmd(pmdpd,pmdps) \
do { \
--
2.44.0


2024-03-18 20:06:21

by Peter Xu

Subject: [PATCH v2 08/14] mm/arm64: Merge pXd_huge() and pXd_leaf() definitions

From: Peter Xu <[email protected]>

Unlike most archs, aarch64 defines pXd_huge() and pXd_leaf() slightly
differently. Redefine pXd_huge() with pXd_leaf().

There used to be two traps in the old aarch64 definitions of these APIs
that I found while reading the surrounding code:

(1) 4797ec2dc83a ("arm64: fix pud_huge() for 2-level pagetables")
(2) 23bc8f69f0ec ("arm64: mm: fix p?d_leaf()")

Defining pXd_huge() with the current pXd_leaf() makes sure (2) isn't a
problem (regarding the PROT_NONE checks). To make sure it also works for
(1), move the __PAGETABLE_PMD_FOLDED check over to pud_leaf(), making it
constantly return "false" for 2-level pgtables, which looks even safer
and now covers both cases.

Cc: Muchun Song <[email protected]>
Cc: Mark Salter <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: [email protected]
Signed-off-by: Peter Xu <[email protected]>
---
arch/arm64/include/asm/pgtable.h | 4 ++++
arch/arm64/mm/hugetlbpage.c | 8 ++------
2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 401087e8a43d..14d24c357c7a 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -704,7 +704,11 @@ static inline unsigned long pmd_page_vaddr(pmd_t pmd)
#define pud_none(pud) (!pud_val(pud))
#define pud_bad(pud) (!pud_table(pud))
#define pud_present(pud) pte_present(pud_pte(pud))
+#ifndef __PAGETABLE_PMD_FOLDED
#define pud_leaf(pud) (pud_present(pud) && !pud_table(pud))
+#else
+#define pud_leaf(pud) false
+#endif
#define pud_valid(pud) pte_valid(pud_pte(pud))
#define pud_user(pud) pte_user(pud_pte(pud))
#define pud_user_exec(pud) pte_user_exec(pud_pte(pud))
diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c
index 0f0e10bb0a95..1234bbaef5bf 100644
--- a/arch/arm64/mm/hugetlbpage.c
+++ b/arch/arm64/mm/hugetlbpage.c
@@ -81,16 +81,12 @@ bool arch_hugetlb_migration_supported(struct hstate *h)

int pmd_huge(pmd_t pmd)
{
- return pmd_val(pmd) && !(pmd_val(pmd) & PMD_TABLE_BIT);
+ return pmd_leaf(pmd);
}

int pud_huge(pud_t pud)
{
-#ifndef __PAGETABLE_PMD_FOLDED
- return pud_val(pud) && !(pud_val(pud) & PUD_TABLE_BIT);
-#else
- return 0;
-#endif
+ return pud_leaf(pud);
}

static int find_num_contig(struct mm_struct *mm, unsigned long addr,
--
2.44.0


2024-03-18 20:06:26

by Peter Xu

Subject: [PATCH v2 09/14] mm/powerpc: Redefine pXd_huge() with pXd_leaf()

From: Peter Xu <[email protected]>

PowerPC book3s 4K mostly has the same definition for both, except that
pXd_huge() constantly returns 0 for hash MMUs. As Michael Ellerman
pointed out [1], it is safe to check _PAGE_PTE on hash MMUs, as the bit
will never be set there, so pXd_huge() will keep returning false.

For reference, __p[mu]d_mkhuge() will trigger a BUG_ON when trying to
create such huge mappings on 4K hash MMUs. Meanwhile, the major powerpc
hugetlb pgtable walker __find_linux_pte() already uses pXd_leaf() to
check for leaf hugetlb mappings.

The goal should be to have one API, pXd_leaf(), to detect all kinds of
huge mappings (hugepd is still special in this case, though). AFAICT we
need to use the pXd_leaf() impl (rather than pXd_huge()'s) to make sure
that e.g. THPs on hash MMUs will also return true.

This helps to simplify a follow up patch to drop pXd_huge() treewide.

NOTE: the *_leaf() definitions need to be moved before the inclusion of
asm/book3s/64/pgtable-4k.h, which defines pXd_huge() with them.

[1] https://lore.kernel.org/r/[email protected]

Cc: Michael Ellerman <[email protected]>
Cc: Nicholas Piggin <[email protected]>
Cc: Christophe Leroy <[email protected]>
Cc: "Aneesh Kumar K.V" <[email protected]>
Cc: "Naveen N. Rao" <[email protected]>
Cc: [email protected]
Signed-off-by: Peter Xu <[email protected]>
---
.../include/asm/book3s/64/pgtable-4k.h | 14 ++--------
arch/powerpc/include/asm/book3s/64/pgtable.h | 27 +++++++++----------
2 files changed, 14 insertions(+), 27 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/pgtable-4k.h b/arch/powerpc/include/asm/book3s/64/pgtable-4k.h
index 48f21820afe2..92545981bb49 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable-4k.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable-4k.h
@@ -8,22 +8,12 @@
#ifdef CONFIG_HUGETLB_PAGE
static inline int pmd_huge(pmd_t pmd)
{
- /*
- * leaf pte for huge page
- */
- if (radix_enabled())
- return !!(pmd_raw(pmd) & cpu_to_be64(_PAGE_PTE));
- return 0;
+ return pmd_leaf(pmd);
}

static inline int pud_huge(pud_t pud)
{
- /*
- * leaf pte for huge page
- */
- if (radix_enabled())
- return !!(pud_raw(pud) & cpu_to_be64(_PAGE_PTE));
- return 0;
+ return pud_leaf(pud);
}

/*
diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
index df66dce8306f..fd7180fded75 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -262,6 +262,18 @@ extern unsigned long __kernel_io_end;

extern struct page *vmemmap;
extern unsigned long pci_io_base;
+
+#define pmd_leaf pmd_leaf
+static inline bool pmd_leaf(pmd_t pmd)
+{
+ return !!(pmd_raw(pmd) & cpu_to_be64(_PAGE_PTE));
+}
+
+#define pud_leaf pud_leaf
+static inline bool pud_leaf(pud_t pud)
+{
+ return !!(pud_raw(pud) & cpu_to_be64(_PAGE_PTE));
+}
#endif /* __ASSEMBLY__ */

#include <asm/book3s/64/hash.h>
@@ -1436,20 +1448,5 @@ static inline bool is_pte_rw_upgrade(unsigned long old_val, unsigned long new_va
return false;
}

-/*
- * Like pmd_huge(), but works regardless of config options
- */
-#define pmd_leaf pmd_leaf
-static inline bool pmd_leaf(pmd_t pmd)
-{
- return !!(pmd_raw(pmd) & cpu_to_be64(_PAGE_PTE));
-}
-
-#define pud_leaf pud_leaf
-static inline bool pud_leaf(pud_t pud)
-{
- return !!(pud_raw(pud) & cpu_to_be64(_PAGE_PTE));
-}
-
#endif /* __ASSEMBLY__ */
#endif /* _ASM_POWERPC_BOOK3S_64_PGTABLE_H_ */
--
2.44.0


2024-03-18 20:06:45

by Peter Xu

Subject: [PATCH v2 10/14] mm/gup: Merge pXd huge mapping checks

From: Peter Xu <[email protected]>

Huge mapping checks in GUP are slightly redundant and can be simplified.

pXd_huge() is now the same as pXd_leaf(). pmd_trans_huge() and
pXd_devmap() should both imply pXd_leaf(). Time to merge them into one.
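
As a quick sketch of the invariant being relied on here (illustration
only, not part of the patch):

	/* Any THP or devmap pmd must also be a leaf */
	VM_WARN_ON(pmd_trans_huge(pmd) && !pmd_leaf(pmd));
	VM_WARN_ON(pmd_devmap(pmd) && !pmd_leaf(pmd));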

Reviewed-by: Jason Gunthorpe <[email protected]>
Signed-off-by: Peter Xu <[email protected]>
---
mm/gup.c | 7 +++----
1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/mm/gup.c b/mm/gup.c
index 802987281b2f..e2415e9789bc 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -3005,8 +3005,7 @@ static int gup_pmd_range(pud_t *pudp, pud_t pud, unsigned long addr, unsigned lo
if (!pmd_present(pmd))
return 0;

- if (unlikely(pmd_trans_huge(pmd) || pmd_huge(pmd) ||
- pmd_devmap(pmd))) {
+ if (unlikely(pmd_leaf(pmd))) {
/* See gup_pte_range() */
if (pmd_protnone(pmd))
return 0;
@@ -3043,7 +3042,7 @@ static int gup_pud_range(p4d_t *p4dp, p4d_t p4d, unsigned long addr, unsigned lo
next = pud_addr_end(addr, end);
if (unlikely(!pud_present(pud)))
return 0;
- if (unlikely(pud_huge(pud) || pud_devmap(pud))) {
+ if (unlikely(pud_leaf(pud))) {
if (!gup_huge_pud(pud, pudp, addr, next, flags,
pages, nr))
return 0;
@@ -3096,7 +3095,7 @@ static void gup_pgd_range(unsigned long addr, unsigned long end,
next = pgd_addr_end(addr, end);
if (pgd_none(pgd))
return;
- if (unlikely(pgd_huge(pgd))) {
+ if (unlikely(pgd_leaf(pgd))) {
if (!gup_huge_pgd(pgd, pgdp, addr, next, flags,
pages, nr))
return;
--
2.44.0


2024-03-18 20:07:05

by Peter Xu

Subject: [PATCH v2 11/14] mm/treewide: Replace pXd_huge() with pXd_leaf()

From: Peter Xu <[email protected]>

Now that we're sure all pXd_huge() definitions are the same as
pXd_leaf(), reuse the latter. Luckily, pXd_huge() isn't widely used.

Signed-off-by: Peter Xu <[email protected]>
---
arch/arm/include/asm/pgtable-3level.h | 2 +-
arch/arm64/include/asm/pgtable.h | 2 +-
arch/arm64/mm/hugetlbpage.c | 4 ++--
arch/loongarch/mm/hugetlbpage.c | 2 +-
arch/mips/mm/tlb-r4k.c | 2 +-
arch/powerpc/mm/pgtable_64.c | 6 +++---
arch/x86/mm/pgtable.c | 4 ++--
mm/gup.c | 4 ++--
mm/hmm.c | 2 +-
mm/memory.c | 2 +-
10 files changed, 15 insertions(+), 15 deletions(-)

diff --git a/arch/arm/include/asm/pgtable-3level.h b/arch/arm/include/asm/pgtable-3level.h
index e7aecbef75c9..9e3c44f0aea2 100644
--- a/arch/arm/include/asm/pgtable-3level.h
+++ b/arch/arm/include/asm/pgtable-3level.h
@@ -190,7 +190,7 @@ static inline pte_t pte_mkspecial(pte_t pte)
#define pmd_dirty(pmd) (pmd_isset((pmd), L_PMD_SECT_DIRTY))

#define pmd_hugewillfault(pmd) (!pmd_young(pmd) || !pmd_write(pmd))
-#define pmd_thp_or_huge(pmd) (pmd_huge(pmd) || pmd_trans_huge(pmd))
+#define pmd_thp_or_huge(pmd) (pmd_leaf(pmd) || pmd_trans_huge(pmd))

#ifdef CONFIG_TRANSPARENT_HUGEPAGE
#define pmd_trans_huge(pmd) (pmd_val(pmd) && !pmd_table(pmd))
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 14d24c357c7a..c4efa47fed5f 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -512,7 +512,7 @@ static inline pmd_t pmd_mkinvalid(pmd_t pmd)
return pmd;
}

-#define pmd_thp_or_huge(pmd) (pmd_huge(pmd) || pmd_trans_huge(pmd))
+#define pmd_thp_or_huge(pmd) (pmd_leaf(pmd) || pmd_trans_huge(pmd))

#define pmd_write(pmd) pte_write(pmd_pte(pmd))

diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c
index 1234bbaef5bf..f494fc31201f 100644
--- a/arch/arm64/mm/hugetlbpage.c
+++ b/arch/arm64/mm/hugetlbpage.c
@@ -321,7 +321,7 @@ pte_t *huge_pte_offset(struct mm_struct *mm,
if (sz != PUD_SIZE && pud_none(pud))
return NULL;
/* hugepage or swap? */
- if (pud_huge(pud) || !pud_present(pud))
+ if (pud_leaf(pud) || !pud_present(pud))
return (pte_t *)pudp;
/* table; check the next level */

@@ -333,7 +333,7 @@ pte_t *huge_pte_offset(struct mm_struct *mm,
if (!(sz == PMD_SIZE || sz == CONT_PMD_SIZE) &&
pmd_none(pmd))
return NULL;
- if (pmd_huge(pmd) || !pmd_present(pmd))
+ if (pmd_leaf(pmd) || !pmd_present(pmd))
return (pte_t *)pmdp;

if (sz == CONT_PTE_SIZE)
diff --git a/arch/loongarch/mm/hugetlbpage.c b/arch/loongarch/mm/hugetlbpage.c
index 1e76fcb83093..a4e78e74aa21 100644
--- a/arch/loongarch/mm/hugetlbpage.c
+++ b/arch/loongarch/mm/hugetlbpage.c
@@ -64,7 +64,7 @@ uint64_t pmd_to_entrylo(unsigned long pmd_val)
{
uint64_t val;
/* PMD as PTE. Must be huge page */
- if (!pmd_huge(__pmd(pmd_val)))
+ if (!pmd_leaf(__pmd(pmd_val)))
panic("%s", __func__);

val = pmd_val ^ _PAGE_HUGE;
diff --git a/arch/mips/mm/tlb-r4k.c b/arch/mips/mm/tlb-r4k.c
index 4106084e57d7..76f3b9c0a9f0 100644
--- a/arch/mips/mm/tlb-r4k.c
+++ b/arch/mips/mm/tlb-r4k.c
@@ -326,7 +326,7 @@ void __update_tlb(struct vm_area_struct * vma, unsigned long address, pte_t pte)
idx = read_c0_index();
#ifdef CONFIG_MIPS_HUGE_TLB_SUPPORT
/* this could be a huge page */
- if (pmd_huge(*pmdp)) {
+ if (pmd_leaf(*pmdp)) {
unsigned long lo;
write_c0_pagemask(PM_HUGE_MASK);
ptep = (pte_t *)pmdp;
diff --git a/arch/powerpc/mm/pgtable_64.c b/arch/powerpc/mm/pgtable_64.c
index 9b99113cb51a..6621cfc3baf8 100644
--- a/arch/powerpc/mm/pgtable_64.c
+++ b/arch/powerpc/mm/pgtable_64.c
@@ -102,7 +102,7 @@ struct page *p4d_page(p4d_t p4d)
{
if (p4d_leaf(p4d)) {
if (!IS_ENABLED(CONFIG_HAVE_ARCH_HUGE_VMAP))
- VM_WARN_ON(!p4d_huge(p4d));
+ VM_WARN_ON(!p4d_leaf(p4d));
return pte_page(p4d_pte(p4d));
}
return virt_to_page(p4d_pgtable(p4d));
@@ -113,7 +113,7 @@ struct page *pud_page(pud_t pud)
{
if (pud_leaf(pud)) {
if (!IS_ENABLED(CONFIG_HAVE_ARCH_HUGE_VMAP))
- VM_WARN_ON(!pud_huge(pud));
+ VM_WARN_ON(!pud_leaf(pud));
return pte_page(pud_pte(pud));
}
return virt_to_page(pud_pgtable(pud));
@@ -132,7 +132,7 @@ struct page *pmd_page(pmd_t pmd)
* enabled so these checks can't be used.
*/
if (!IS_ENABLED(CONFIG_HAVE_ARCH_HUGE_VMAP))
- VM_WARN_ON(!(pmd_leaf(pmd) || pmd_huge(pmd)));
+ VM_WARN_ON(!pmd_leaf(pmd));
return pte_page(pmd_pte(pmd));
}
return virt_to_page(pmd_page_vaddr(pmd));
diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c
index ff690ddc2334..d74f0814e086 100644
--- a/arch/x86/mm/pgtable.c
+++ b/arch/x86/mm/pgtable.c
@@ -731,7 +731,7 @@ int pud_set_huge(pud_t *pud, phys_addr_t addr, pgprot_t prot)
return 0;

/* Bail out if we are we on a populated non-leaf entry: */
- if (pud_present(*pud) && !pud_huge(*pud))
+ if (pud_present(*pud) && !pud_leaf(*pud))
return 0;

set_pte((pte_t *)pud, pfn_pte(
@@ -760,7 +760,7 @@ int pmd_set_huge(pmd_t *pmd, phys_addr_t addr, pgprot_t prot)
}

/* Bail out if we are we on a populated non-leaf entry: */
- if (pmd_present(*pmd) && !pmd_huge(*pmd))
+ if (pmd_present(*pmd) && !pmd_leaf(*pmd))
return 0;

set_pte((pte_t *)pmd, pfn_pte(
diff --git a/mm/gup.c b/mm/gup.c
index e2415e9789bc..8e04a04ef138 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -778,7 +778,7 @@ static struct page *follow_p4d_mask(struct vm_area_struct *vma,
p4d = READ_ONCE(*p4dp);
if (!p4d_present(p4d))
return no_page_table(vma, flags);
- BUILD_BUG_ON(p4d_huge(p4d));
+ BUILD_BUG_ON(p4d_leaf(p4d));
if (unlikely(p4d_bad(p4d)))
return no_page_table(vma, flags);

@@ -3070,7 +3070,7 @@ static int gup_p4d_range(pgd_t *pgdp, pgd_t pgd, unsigned long addr, unsigned lo
next = p4d_addr_end(addr, end);
if (!p4d_present(p4d))
return 0;
- BUILD_BUG_ON(p4d_huge(p4d));
+ BUILD_BUG_ON(p4d_leaf(p4d));
if (unlikely(is_hugepd(__hugepd(p4d_val(p4d))))) {
if (!gup_huge_pd(__hugepd(p4d_val(p4d)), addr,
P4D_SHIFT, next, flags, pages, nr))
diff --git a/mm/hmm.c b/mm/hmm.c
index c95b9ec5d95f..93aebd9cc130 100644
--- a/mm/hmm.c
+++ b/mm/hmm.c
@@ -429,7 +429,7 @@ static int hmm_vma_walk_pud(pud_t *pudp, unsigned long start, unsigned long end,
return hmm_vma_walk_hole(start, end, -1, walk);
}

- if (pud_huge(pud) && pud_devmap(pud)) {
+ if (pud_leaf(pud) && pud_devmap(pud)) {
unsigned long i, npages, pfn;
unsigned int required_fault;
unsigned long *hmm_pfns;
diff --git a/mm/memory.c b/mm/memory.c
index 904f70b99498..baee777dcd2d 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2765,7 +2765,7 @@ static int apply_to_pmd_range(struct mm_struct *mm, pud_t *pud,
unsigned long next;
int err = 0;

- BUG_ON(pud_huge(*pud));
+ BUG_ON(pud_leaf(*pud));

if (create) {
pmd = pmd_alloc_track(mm, pud, addr, mask);
--
2.44.0


2024-03-18 20:07:23

by Peter Xu

Subject: [PATCH v2 13/14] mm/arm: Remove pmd_thp_or_huge()

From: Peter Xu <[email protected]>

ARM/ARM64 used to define pmd_thp_or_huge(). Now this macro is completely
redundant. Remove it and use pmd_leaf().

Cc: Mark Salter <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Russell King <[email protected]>
Cc: Shawn Guo <[email protected]>
Cc: Krzysztof Kozlowski <[email protected]>
Cc: Bjorn Andersson <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: Konrad Dybcio <[email protected]>
Cc: Fabio Estevam <[email protected]>
Cc: [email protected]
Signed-off-by: Peter Xu <[email protected]>
---
arch/arm/include/asm/pgtable-2level.h | 1 -
arch/arm/include/asm/pgtable-3level.h | 1 -
arch/arm/lib/uaccess_with_memcpy.c | 4 ++--
arch/arm64/include/asm/pgtable.h | 2 --
4 files changed, 2 insertions(+), 6 deletions(-)

diff --git a/arch/arm/include/asm/pgtable-2level.h b/arch/arm/include/asm/pgtable-2level.h
index 4245c2e74720..6b5392e20f41 100644
--- a/arch/arm/include/asm/pgtable-2level.h
+++ b/arch/arm/include/asm/pgtable-2level.h
@@ -241,7 +241,6 @@ static inline pmd_t *pmd_offset(pud_t *pud, unsigned long addr)
* define empty stubs for use by pin_page_for_write.
*/
#define pmd_hugewillfault(pmd) (0)
-#define pmd_thp_or_huge(pmd) (0)

#endif /* __ASSEMBLY__ */

diff --git a/arch/arm/include/asm/pgtable-3level.h b/arch/arm/include/asm/pgtable-3level.h
index 9e3c44f0aea2..fa5939eb9864 100644
--- a/arch/arm/include/asm/pgtable-3level.h
+++ b/arch/arm/include/asm/pgtable-3level.h
@@ -190,7 +190,6 @@ static inline pte_t pte_mkspecial(pte_t pte)
#define pmd_dirty(pmd) (pmd_isset((pmd), L_PMD_SECT_DIRTY))

#define pmd_hugewillfault(pmd) (!pmd_young(pmd) || !pmd_write(pmd))
-#define pmd_thp_or_huge(pmd) (pmd_leaf(pmd) || pmd_trans_huge(pmd))

#ifdef CONFIG_TRANSPARENT_HUGEPAGE
#define pmd_trans_huge(pmd) (pmd_val(pmd) && !pmd_table(pmd))
diff --git a/arch/arm/lib/uaccess_with_memcpy.c b/arch/arm/lib/uaccess_with_memcpy.c
index 2f6163f05e93..c0ac7796d775 100644
--- a/arch/arm/lib/uaccess_with_memcpy.c
+++ b/arch/arm/lib/uaccess_with_memcpy.c
@@ -56,10 +56,10 @@ pin_page_for_write(const void __user *_addr, pte_t **ptep, spinlock_t **ptlp)
* to see that it's still huge and whether or not we will
* need to fault on write.
*/
- if (unlikely(pmd_thp_or_huge(*pmd))) {
+ if (unlikely(pmd_leaf(*pmd))) {
ptl = &current->mm->page_table_lock;
spin_lock(ptl);
- if (unlikely(!pmd_thp_or_huge(*pmd)
+ if (unlikely(!pmd_leaf(*pmd)
|| pmd_hugewillfault(*pmd))) {
spin_unlock(ptl);
return 0;
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index c4efa47fed5f..a580068a6564 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -512,8 +512,6 @@ static inline pmd_t pmd_mkinvalid(pmd_t pmd)
return pmd;
}

-#define pmd_thp_or_huge(pmd) (pmd_leaf(pmd) || pmd_trans_huge(pmd))
-
#define pmd_write(pmd) pte_write(pmd_pte(pmd))

#define pmd_mkhuge(pmd) (__pmd(pmd_val(pmd) & ~PMD_TABLE_BIT))
--
2.44.0


2024-03-18 20:07:36

by Peter Xu

Subject: [PATCH v2 12/14] mm/treewide: Remove pXd_huge()

From: Peter Xu <[email protected]>

This API is not used anymore; drop it for the whole tree.

Signed-off-by: Peter Xu <[email protected]>
---
arch/arm/mm/Makefile | 1 -
arch/arm/mm/hugetlbpage.c | 29 -------------------
arch/arm64/mm/hugetlbpage.c | 10 -------
arch/loongarch/mm/hugetlbpage.c | 10 -------
arch/mips/include/asm/pgtable-32.h | 2 +-
arch/mips/include/asm/pgtable-64.h | 2 +-
arch/mips/mm/hugetlbpage.c | 10 -------
arch/parisc/mm/hugetlbpage.c | 11 -------
.../include/asm/book3s/64/pgtable-4k.h | 10 -------
.../include/asm/book3s/64/pgtable-64k.h | 25 ----------------
arch/powerpc/include/asm/nohash/pgtable.h | 10 -------
arch/riscv/mm/hugetlbpage.c | 10 -------
arch/s390/mm/hugetlbpage.c | 10 -------
arch/sh/mm/hugetlbpage.c | 10 -------
arch/sparc/mm/hugetlbpage.c | 10 -------
arch/x86/mm/hugetlbpage.c | 16 ----------
include/linux/hugetlb.h | 24 ---------------
17 files changed, 2 insertions(+), 198 deletions(-)
delete mode 100644 arch/arm/mm/hugetlbpage.c

diff --git a/arch/arm/mm/Makefile b/arch/arm/mm/Makefile
index 71b858c9b10c..1779e12db085 100644
--- a/arch/arm/mm/Makefile
+++ b/arch/arm/mm/Makefile
@@ -21,7 +21,6 @@ KASAN_SANITIZE_physaddr.o := n
obj-$(CONFIG_DEBUG_VIRTUAL) += physaddr.o

obj-$(CONFIG_ALIGNMENT_TRAP) += alignment.o
-obj-$(CONFIG_HUGETLB_PAGE) += hugetlbpage.o
obj-$(CONFIG_ARM_PV_FIXUP) += pv-fixup-asm.o

obj-$(CONFIG_CPU_ABRT_NOMMU) += abort-nommu.o
diff --git a/arch/arm/mm/hugetlbpage.c b/arch/arm/mm/hugetlbpage.c
deleted file mode 100644
index c2fa643f6bb5..000000000000
--- a/arch/arm/mm/hugetlbpage.c
+++ /dev/null
@@ -1,29 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0-only
-/*
- * arch/arm/mm/hugetlbpage.c
- *
- * Copyright (C) 2012 ARM Ltd.
- *
- * Based on arch/x86/include/asm/hugetlb.h and Bill Carson's patches
- */
-
-#include <linux/init.h>
-#include <linux/fs.h>
-#include <linux/mm.h>
-#include <linux/hugetlb.h>
-#include <linux/pagemap.h>
-#include <linux/err.h>
-#include <linux/sysctl.h>
-#include <asm/mman.h>
-#include <asm/tlb.h>
-#include <asm/tlbflush.h>
-
-int pud_huge(pud_t pud)
-{
- return 0;
-}
-
-int pmd_huge(pmd_t pmd)
-{
- return pmd_leaf(pmd);
-}
diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c
index f494fc31201f..ca58210d6c07 100644
--- a/arch/arm64/mm/hugetlbpage.c
+++ b/arch/arm64/mm/hugetlbpage.c
@@ -79,16 +79,6 @@ bool arch_hugetlb_migration_supported(struct hstate *h)
}
#endif

-int pmd_huge(pmd_t pmd)
-{
- return pmd_leaf(pmd);
-}
-
-int pud_huge(pud_t pud)
-{
- return pud_leaf(pud);
-}
-
static int find_num_contig(struct mm_struct *mm, unsigned long addr,
pte_t *ptep, size_t *pgsize)
{
diff --git a/arch/loongarch/mm/hugetlbpage.c b/arch/loongarch/mm/hugetlbpage.c
index a4e78e74aa21..12222c56cb59 100644
--- a/arch/loongarch/mm/hugetlbpage.c
+++ b/arch/loongarch/mm/hugetlbpage.c
@@ -50,16 +50,6 @@ pte_t *huge_pte_offset(struct mm_struct *mm, unsigned long addr,
return (pte_t *) pmd;
}

-int pmd_huge(pmd_t pmd)
-{
- return (pmd_val(pmd) & _PAGE_HUGE) != 0;
-}
-
-int pud_huge(pud_t pud)
-{
- return (pud_val(pud) & _PAGE_HUGE) != 0;
-}
-
uint64_t pmd_to_entrylo(unsigned long pmd_val)
{
uint64_t val;
diff --git a/arch/mips/include/asm/pgtable-32.h b/arch/mips/include/asm/pgtable-32.h
index 0e196650f4f4..92b7591aac2a 100644
--- a/arch/mips/include/asm/pgtable-32.h
+++ b/arch/mips/include/asm/pgtable-32.h
@@ -129,7 +129,7 @@ static inline int pmd_none(pmd_t pmd)
static inline int pmd_bad(pmd_t pmd)
{
#ifdef CONFIG_MIPS_HUGE_TLB_SUPPORT
- /* pmd_huge(pmd) but inline */
+ /* pmd_leaf(pmd) but inline */
if (unlikely(pmd_val(pmd) & _PAGE_HUGE))
return 0;
#endif
diff --git a/arch/mips/include/asm/pgtable-64.h b/arch/mips/include/asm/pgtable-64.h
index 20ca48c1b606..7c28510b3768 100644
--- a/arch/mips/include/asm/pgtable-64.h
+++ b/arch/mips/include/asm/pgtable-64.h
@@ -245,7 +245,7 @@ static inline int pmd_none(pmd_t pmd)
static inline int pmd_bad(pmd_t pmd)
{
#ifdef CONFIG_MIPS_HUGE_TLB_SUPPORT
- /* pmd_huge(pmd) but inline */
+ /* pmd_leaf(pmd) but inline */
if (unlikely(pmd_val(pmd) & _PAGE_HUGE))
return 0;
#endif
diff --git a/arch/mips/mm/hugetlbpage.c b/arch/mips/mm/hugetlbpage.c
index 7eaff5b07873..0b9e15555b59 100644
--- a/arch/mips/mm/hugetlbpage.c
+++ b/arch/mips/mm/hugetlbpage.c
@@ -57,13 +57,3 @@ pte_t *huge_pte_offset(struct mm_struct *mm, unsigned long addr,
}
return (pte_t *) pmd;
}
-
-int pmd_huge(pmd_t pmd)
-{
- return (pmd_val(pmd) & _PAGE_HUGE) != 0;
-}
-
-int pud_huge(pud_t pud)
-{
- return (pud_val(pud) & _PAGE_HUGE) != 0;
-}
diff --git a/arch/parisc/mm/hugetlbpage.c b/arch/parisc/mm/hugetlbpage.c
index a9f7e21f6656..0356199bd9e7 100644
--- a/arch/parisc/mm/hugetlbpage.c
+++ b/arch/parisc/mm/hugetlbpage.c
@@ -180,14 +180,3 @@ int huge_ptep_set_access_flags(struct vm_area_struct *vma,
}
return changed;
}
-
-
-int pmd_huge(pmd_t pmd)
-{
- return 0;
-}
-
-int pud_huge(pud_t pud)
-{
- return 0;
-}
diff --git a/arch/powerpc/include/asm/book3s/64/pgtable-4k.h b/arch/powerpc/include/asm/book3s/64/pgtable-4k.h
index 92545981bb49..baf934578c3a 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable-4k.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable-4k.h
@@ -6,16 +6,6 @@
*/
#ifndef __ASSEMBLY__
#ifdef CONFIG_HUGETLB_PAGE
-static inline int pmd_huge(pmd_t pmd)
-{
- return pmd_leaf(pmd);
-}
-
-static inline int pud_huge(pud_t pud)
-{
- return pud_leaf(pud);
-}
-
/*
* With radix , we have hugepage ptes in the pud and pmd entries. We don't
* need to setup hugepage directory for them. Our pte and page directory format
diff --git a/arch/powerpc/include/asm/book3s/64/pgtable-64k.h b/arch/powerpc/include/asm/book3s/64/pgtable-64k.h
index 2fce3498b000..579a7153857f 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable-64k.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable-64k.h
@@ -4,31 +4,6 @@

#ifndef __ASSEMBLY__
#ifdef CONFIG_HUGETLB_PAGE
-/*
- * We have PGD_INDEX_SIZ = 12 and PTE_INDEX_SIZE = 8, so that we can have
- * 16GB hugepage pte in PGD and 16MB hugepage pte at PMD;
- *
- * Defined in such a way that we can optimize away code block at build time
- * if CONFIG_HUGETLB_PAGE=n.
- *
- * returns true for pmd migration entries, THP, devmap, hugetlb
- * But compile time dependent on CONFIG_HUGETLB_PAGE
- */
-static inline int pmd_huge(pmd_t pmd)
-{
- /*
- * leaf pte for huge page
- */
- return !!(pmd_raw(pmd) & cpu_to_be64(_PAGE_PTE));
-}
-
-static inline int pud_huge(pud_t pud)
-{
- /*
- * leaf pte for huge page
- */
- return !!(pud_raw(pud) & cpu_to_be64(_PAGE_PTE));
-}

/*
* With 64k page size, we have hugepage ptes in the pgd and pmd entries. We don't
diff --git a/arch/powerpc/include/asm/nohash/pgtable.h b/arch/powerpc/include/asm/nohash/pgtable.h
index 427db14292c9..f5f39d4f03c8 100644
--- a/arch/powerpc/include/asm/nohash/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/pgtable.h
@@ -351,16 +351,6 @@ static inline int hugepd_ok(hugepd_t hpd)
#endif
}

-static inline int pmd_huge(pmd_t pmd)
-{
- return 0;
-}
-
-static inline int pud_huge(pud_t pud)
-{
- return 0;
-}
-
#define is_hugepd(hpd) (hugepd_ok(hpd))
#endif

diff --git a/arch/riscv/mm/hugetlbpage.c b/arch/riscv/mm/hugetlbpage.c
index 29c7606414d2..9a4bc4bd2a01 100644
--- a/arch/riscv/mm/hugetlbpage.c
+++ b/arch/riscv/mm/hugetlbpage.c
@@ -399,16 +399,6 @@ static bool is_napot_size(unsigned long size)

#endif /*CONFIG_RISCV_ISA_SVNAPOT*/

-int pud_huge(pud_t pud)
-{
- return pud_leaf(pud);
-}
-
-int pmd_huge(pmd_t pmd)
-{
- return pmd_leaf(pmd);
-}
-
static bool __hugetlb_valid_size(unsigned long size)
{
if (size == HPAGE_SIZE)
diff --git a/arch/s390/mm/hugetlbpage.c b/arch/s390/mm/hugetlbpage.c
index c2e8242bd15d..ca43b6fce71c 100644
--- a/arch/s390/mm/hugetlbpage.c
+++ b/arch/s390/mm/hugetlbpage.c
@@ -233,16 +233,6 @@ pte_t *huge_pte_offset(struct mm_struct *mm,
return (pte_t *) pmdp;
}

-int pmd_huge(pmd_t pmd)
-{
- return pmd_leaf(pmd);
-}
-
-int pud_huge(pud_t pud)
-{
- return pud_leaf(pud);
-}
-
bool __init arch_hugetlb_valid_size(unsigned long size)
{
if (MACHINE_HAS_EDAT1 && size == PMD_SIZE)
diff --git a/arch/sh/mm/hugetlbpage.c b/arch/sh/mm/hugetlbpage.c
index 6cb0ad73dbb9..ff209b55285a 100644
--- a/arch/sh/mm/hugetlbpage.c
+++ b/arch/sh/mm/hugetlbpage.c
@@ -70,13 +70,3 @@ pte_t *huge_pte_offset(struct mm_struct *mm,

return pte;
}
-
-int pmd_huge(pmd_t pmd)
-{
- return 0;
-}
-
-int pud_huge(pud_t pud)
-{
- return 0;
-}
diff --git a/arch/sparc/mm/hugetlbpage.c b/arch/sparc/mm/hugetlbpage.c
index d31c2cec35c9..8ed5bdf95d25 100644
--- a/arch/sparc/mm/hugetlbpage.c
+++ b/arch/sparc/mm/hugetlbpage.c
@@ -407,16 +407,6 @@ pte_t huge_ptep_get_and_clear(struct mm_struct *mm, unsigned long addr,
return entry;
}

-int pmd_huge(pmd_t pmd)
-{
- return pmd_leaf(pmd);;
-}
-
-int pud_huge(pud_t pud)
-{
- return pud_leaf(pud);
-}
-
static void hugetlb_free_pte_range(struct mmu_gather *tlb, pmd_t *pmd,
unsigned long addr)
{
diff --git a/arch/x86/mm/hugetlbpage.c b/arch/x86/mm/hugetlbpage.c
index 8362953a24ce..dab6db288e5b 100644
--- a/arch/x86/mm/hugetlbpage.c
+++ b/arch/x86/mm/hugetlbpage.c
@@ -19,22 +19,6 @@
#include <asm/tlbflush.h>
#include <asm/elf.h>

-/*
- * pmd_huge() returns 1 if @pmd is hugetlb related entry.
- */
-int pmd_huge(pmd_t pmd)
-{
- return pmd_leaf(pmd);
-}
-
-/*
- * pud_huge() returns 1 if @pud is hugetlb related entry.
- */
-int pud_huge(pud_t pud)
-{
- return pud_leaf(pud);
-}
-
#ifdef CONFIG_HUGETLB_PAGE
static unsigned long hugetlb_get_unmapped_area_bottomup(struct file *file,
unsigned long addr, unsigned long len,
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 77b30a8c6076..300de33c6fde 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -272,13 +272,9 @@ void hugetlb_vma_unlock_write(struct vm_area_struct *vma);
int hugetlb_vma_trylock_write(struct vm_area_struct *vma);
void hugetlb_vma_assert_locked(struct vm_area_struct *vma);
void hugetlb_vma_lock_release(struct kref *kref);
-
-int pmd_huge(pmd_t pmd);
-int pud_huge(pud_t pud);
long hugetlb_change_protection(struct vm_area_struct *vma,
unsigned long address, unsigned long end, pgprot_t newprot,
unsigned long cp_flags);
-
bool is_hugetlb_entry_migration(pte_t pte);
bool is_hugetlb_entry_hwpoisoned(pte_t pte);
void hugetlb_unshare_all_pmds(struct vm_area_struct *vma);
@@ -399,16 +395,6 @@ static inline void hugetlb_vma_assert_locked(struct vm_area_struct *vma)
{
}

-static inline int pmd_huge(pmd_t pmd)
-{
- return 0;
-}
-
-static inline int pud_huge(pud_t pud)
-{
- return 0;
-}
-
static inline int is_hugepage_only_range(struct mm_struct *mm,
unsigned long addr, unsigned long len)
{
@@ -493,16 +479,6 @@ static inline vm_fault_t hugetlb_fault(struct mm_struct *mm,
static inline void hugetlb_unshare_all_pmds(struct vm_area_struct *vma) { }

#endif /* !CONFIG_HUGETLB_PAGE */
-/*
- * hugepages at page global directory. If arch support
- * hugepages at pgd level, they need to define this.
- */
-#ifndef pgd_huge
-#define pgd_huge(x) 0
-#endif
-#ifndef p4d_huge
-#define p4d_huge(x) 0
-#endif

#ifndef pgd_write
static inline int pgd_write(pgd_t pgd)
--
2.44.0


2024-03-18 20:07:52

by Peter Xu

Subject: [PATCH v2 14/14] mm: Document pXd_leaf() API

From: Peter Xu <[email protected]>

There's one small section on this already, but since we're going to
remove pXd_huge(), that comment may soon become obsolete.

Rewrite that section with more information, so that hopefully the API is
crystal clear on what it implies.
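
As an example of the intended calling convention (a sketch only; the
handle_*() helpers are hypothetical):

	pmd_t pmd = READ_ONCE(*pmdp);

	if (!pmd_present(pmd))
		/* none entry, or a swap entry (migration, hwpoison, ...) */
		return handle_non_present(pmd);
	if (pmd_leaf(pmd))
		/* huge mapping: THP, devmap, or hugetlb */
		return handle_leaf(pmd);
	/* otherwise it points to a pgtable page; walk down to pte level */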

Reviewed-by: Jason Gunthorpe <[email protected]>
Signed-off-by: Peter Xu <[email protected]>
---
include/linux/pgtable.h | 24 +++++++++++++++++++-----
1 file changed, 19 insertions(+), 5 deletions(-)

diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
index 85fc7554cd52..6b0d222a7fad 100644
--- a/include/linux/pgtable.h
+++ b/include/linux/pgtable.h
@@ -1770,11 +1770,25 @@ typedef unsigned int pgtbl_mod_mask;
#endif

/*
- * p?d_leaf() - true if this entry is a final mapping to a physical address.
- * This differs from p?d_huge() by the fact that they are always available (if
- * the architecture supports large pages at the appropriate level) even
- * if CONFIG_HUGETLB_PAGE is not defined.
- * Only meaningful when called on a valid entry.
+ * pXd_leaf() is the API to check whether a pgtable entry is a huge page
+ * mapping. It should work globally across all archs, without any
+ * dependency on CONFIG_* options. For architectures that do not support
+ * huge mappings on specific levels, below fallbacks will be used.
+ *
+ * A leaf pgtable entry should always imply the following:
+ *
+ * - It is a "present" entry. IOW, before using this API, please check it
+ * with pXd_present() first. NOTE: it may not always mean the "present
+ * bit" is set. For example, PROT_NONE entries are always "present".
+ *
+ * - It should _never_ be a swap entry of any type. Above "present" check
+ * should have guarded this, but let's be crystal clear on this.
+ *
+ * - It should contain a huge PFN, which points to a huge page larger than
+ * PAGE_SIZE of the platform. The PFN format isn't important here.
+ *
+ * - It should cover all kinds of huge mappings (e.g., pXd_trans_huge(),
+ * pXd_devmap(), or hugetlb mappings).
*/
#ifndef pgd_leaf
#define pgd_leaf(x) false
--
2.44.0


2024-03-18 20:08:38

by Peter Xu

Subject: [PATCH v2 04/14] mm/x86: Change pXd_huge() behavior to exclude swap entries

From: Peter Xu <[email protected]>

This patch partly reverts below commits:

3a194f3f8ad0 ("mm/hugetlb: make pud_huge() and follow_huge_pud() aware of non-present pud entry")
cbef8478bee5 ("mm/hugetlb: pmd_huge() returns true for non-present hugepage")

Right now, the pXd_huge() definition is unclear across the kernel. We
have two groups that treat swap entries differently:

- x86/sparc: Allow pXd_huge() to accept swap entries
- all the rest: Don't allow pXd_huge() to accept swap entries

This is confusing. The sparc helpers seem to have been added in 2016,
after x86's (2015), so sparc may simply have followed the trend. x86
introduced such swap handling in 2015 to resolve hugetlb swap entries hit
in GUP, but now GUP guards swap entries with !pXd_present() at all
levels, so we should be safe.

We should define this API properly, one way or another, rather than
keeping it defined differently across archs.

Gut feeling tells me that pXd_huge() shouldn't include swap entries, and
it turns out I am not the only one thinking so: the question was raised
by Ville Syrjälä when the current x86 pmd_huge() was proposed:

https://lore.kernel.org/all/[email protected]/

I might also be missing something obvious, but why is it even necessary
to treat PRESENT==0+PSE==0 as a huge entry?
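
To spell that out with the existing x86 definition (removed below): a
swap entry has both _PAGE_PRESENT and _PAGE_PSE cleared, yet the old
check still reports it as huge:

	int pmd_huge(pmd_t pmd)
	{
		return !pmd_none(pmd) &&
			(pmd_val(pmd) & (_PAGE_PRESENT|_PAGE_PSE)) != _PAGE_PRESENT;
	}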

It was also questioned when Jason Gunthorpe reviewed the other patchset
on swap entry handling:

https://lore.kernel.org/all/[email protected]/

Revert its meaning back to the original. This shouldn't cause any
functional change, as we should already have explicit !pXd_present()
guards everywhere.

Note that I also dropped the "#if CONFIG_PGTABLE_LEVELS > 2"; it was
probably there because things were breaking when 3a194f3f8ad0 was
proposed, according to the report here:

https://lore.kernel.org/all/[email protected]/

Now we shouldn't need that.

Instead of reverting to the raw _PAGE_PSE check, leverage pXd_leaf().

Cc: Naoya Horiguchi <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: [email protected]
Signed-off-by: Peter Xu <[email protected]>
---
arch/x86/mm/hugetlbpage.c | 18 ++++--------------
1 file changed, 4 insertions(+), 14 deletions(-)

diff --git a/arch/x86/mm/hugetlbpage.c b/arch/x86/mm/hugetlbpage.c
index 5804bbae4f01..8362953a24ce 100644
--- a/arch/x86/mm/hugetlbpage.c
+++ b/arch/x86/mm/hugetlbpage.c
@@ -20,29 +20,19 @@
#include <asm/elf.h>

/*
- * pmd_huge() returns 1 if @pmd is hugetlb related entry, that is normal
- * hugetlb entry or non-present (migration or hwpoisoned) hugetlb entry.
- * Otherwise, returns 0.
+ * pmd_huge() returns 1 if @pmd is hugetlb related entry.
*/
int pmd_huge(pmd_t pmd)
{
- return !pmd_none(pmd) &&
- (pmd_val(pmd) & (_PAGE_PRESENT|_PAGE_PSE)) != _PAGE_PRESENT;
+ return pmd_leaf(pmd);
}

/*
- * pud_huge() returns 1 if @pud is hugetlb related entry, that is normal
- * hugetlb entry or non-present (migration or hwpoisoned) hugetlb entry.
- * Otherwise, returns 0.
+ * pud_huge() returns 1 if @pud is hugetlb related entry.
*/
int pud_huge(pud_t pud)
{
-#if CONFIG_PGTABLE_LEVELS > 2
- return !pud_none(pud) &&
- (pud_val(pud) & (_PAGE_PRESENT|_PAGE_PSE)) != _PAGE_PRESENT;
-#else
- return 0;
-#endif
+ return pud_leaf(pud);
}

#ifdef CONFIG_HUGETLB_PAGE
--
2.44.0


2024-03-18 20:08:53

by Peter Xu

Subject: [PATCH v2 07/14] mm/arm: Redefine pmd_huge() with pmd_leaf()

From: Peter Xu <[email protected]>

Most of the archs already define these two APIs the same way. ARM is more
complicated in two aspects:

- pXd_huge() always checks against !PXD_TABLE_BIT, while pXd_leaf()
always checks against PXD_TYPE_SECT.

- The SECT/TABLE bits are defined differently on 2-level vs. 3-level ARM
pgtables, which makes the whole thing even harder to follow.

Luckily, the second complication should be hidden by the pmd_leaf()
implementations in the 2-level vs. 3-level headers. Invoke pmd_leaf()
directly from pmd_huge() to remove the first complication. This prepares
for dropping the pXd_huge() API globally.
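
For reference, the two flavours of checks being unified are roughly (the
pmd_leaf() line is the 2-level definition from the previous patch; the
pmd_huge() body is what this patch replaces):

	#define pmd_leaf(pmd)	(pmd_val(pmd) & PMD_TYPE_SECT)

	int pmd_huge(pmd_t pmd)
	{
		return pmd_val(pmd) && !(pmd_val(pmd) & PMD_TABLE_BIT);
	}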

While at it, drop the comment at the top of the file - it's outdated.

Cc: Russell King <[email protected]>
Cc: Shawn Guo <[email protected]>
Cc: Krzysztof Kozlowski <[email protected]>
Cc: Bjorn Andersson <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: Konrad Dybcio <[email protected]>
Cc: Fabio Estevam <[email protected]>
Cc: [email protected]
Signed-off-by: Peter Xu <[email protected]>
---
arch/arm/mm/hugetlbpage.c | 7 +------
1 file changed, 1 insertion(+), 6 deletions(-)

diff --git a/arch/arm/mm/hugetlbpage.c b/arch/arm/mm/hugetlbpage.c
index dd7a0277c5c0..c2fa643f6bb5 100644
--- a/arch/arm/mm/hugetlbpage.c
+++ b/arch/arm/mm/hugetlbpage.c
@@ -18,11 +18,6 @@
#include <asm/tlb.h>
#include <asm/tlbflush.h>

-/*
- * On ARM, huge pages are backed by pmd's rather than pte's, so we do a lot
- * of type casting from pmd_t * to pte_t *.
- */
-
int pud_huge(pud_t pud)
{
return 0;
@@ -30,5 +25,5 @@ int pud_huge(pud_t pud)

int pmd_huge(pmd_t pmd)
{
- return pmd_val(pmd) && !(pmd_val(pmd) & PMD_TABLE_BIT);
+ return pmd_leaf(pmd);
}
--
2.44.0


2024-03-19 04:26:44

by Muchun Song

Subject: Re: [PATCH v2 05/14] mm/sparc: Change pXd_huge() behavior to exclude swap entries



> On Mar 19, 2024, at 04:03, [email protected] wrote:
>
> From: Peter Xu <[email protected]>
>
> Please refer to the previous patch on the reasoning for x86. Now sparc is
> the only architecture that will allow swap entries to be reported as
> pXd_huge(). After this patch, all architectures should forbid swap entries
> in pXd_huge().
>
> Cc: David S. Miller <[email protected]>
> Cc: Andreas Larsson <[email protected]>
> Cc: [email protected]
> Signed-off-by: Peter Xu <[email protected]>
> ---
> arch/sparc/mm/hugetlbpage.c | 6 ++----
> 1 file changed, 2 insertions(+), 4 deletions(-)
>
> diff --git a/arch/sparc/mm/hugetlbpage.c b/arch/sparc/mm/hugetlbpage.c
> index b432500c13a5..d31c2cec35c9 100644
> --- a/arch/sparc/mm/hugetlbpage.c
> +++ b/arch/sparc/mm/hugetlbpage.c
> @@ -409,14 +409,12 @@ pte_t huge_ptep_get_and_clear(struct mm_struct *mm, unsigned long addr,
>
> int pmd_huge(pmd_t pmd)
> {
> - return !pmd_none(pmd) &&
> - (pmd_val(pmd) & (_PAGE_VALID|_PAGE_PMD_HUGE)) != _PAGE_VALID;
> + return pmd_leaf(pmd);;

There is a redundant semicolon at the end.

Thanks.

> }
>
> int pud_huge(pud_t pud)
> {
> - return !pud_none(pud) &&
> - (pud_val(pud) & (_PAGE_VALID|_PAGE_PUD_HUGE)) != _PAGE_VALID;
> + return pud_leaf(pud);
> }
>
> static void hugetlb_free_pte_range(struct mmu_gather *tlb, pmd_t *pmd,
> --
> 2.44.0
>


2024-03-19 14:11:44

by Peter Xu

Subject: Re: [PATCH v2 05/14] mm/sparc: Change pXd_huge() behavior to exclude swap entries

On Tue, Mar 19, 2024 at 12:25:39PM +0800, Muchun Song wrote:
> > @@ -409,14 +409,12 @@ pte_t huge_ptep_get_and_clear(struct mm_struct *mm, unsigned long addr,
> >
> > int pmd_huge(pmd_t pmd)
> > {
> > - return !pmd_none(pmd) &&
> > - (pmd_val(pmd) & (_PAGE_VALID|_PAGE_PMD_HUGE)) != _PAGE_VALID;
> > + return pmd_leaf(pmd);;
>
> There is a redundant semicolon in the end.

Will touch it up, thanks. PS: This will be dropped as a whole in patch 12.

--
Peter Xu