From: Mike Rapoport <[email protected]>
Hi,
These patches convert several architectures to use page table folding and
remove __ARCH_HAS_5LEVEL_HACK along with include/asm-generic/5level-fixup.h.
The changes are mostly mechanical: pgd accessors are replaced with p4d
ones and page table traversals gain an explicit walk of the p4d level.
All the patches were sent separately to the respective arch lists and
maintainers, hence the "v2" prefix.
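For reference, the conversions all follow the same pattern; here is a
minimal sketch of a page table walk before and after (declarations and
error checks omitted):

	/* before: with the 5level hack, pud_offset() took a pgd_t * */
	pgd = pgd_offset(mm, addr);
	pud = pud_offset(pgd, addr);
	pmd = pmd_offset(pud, addr);
	pte = pte_offset_kernel(pmd, addr);

	/* after: the p4d level is walked explicitly; where p4d is
	 * folded, p4d_offset() simply returns its pgd argument and
	 * the generated code is unchanged */
	pgd = pgd_offset(mm, addr);
	p4d = p4d_offset(pgd, addr);
	pud = pud_offset(p4d, addr);
	pmd = pmd_offset(pud, addr);
	pte = pte_offset_kernel(pmd, addr);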
Geert Uytterhoeven (1):
sh: fault: Modernize printing of kernel messages
Mike Rapoport (12):
arm/arm64: add support for folded p4d page tables
h8300: remove usage of __ARCH_USE_5LEVEL_HACK
hexagon: remove __ARCH_USE_5LEVEL_HACK
ia64: add support for folded p4d page tables
nios2: add support for folded p4d page tables
openrisc: add support for folded p4d page tables
powerpc: add support for folded p4d page tables
sh: drop __pXd_offset() macros that duplicate pXd_index() ones
sh: add support for folded p4d page tables
unicore32: remove __ARCH_USE_5LEVEL_HACK
asm-generic: remove pgtable-nop4d-hack.h
mm: remove __ARCH_HAS_5LEVEL_HACK and include/asm-generic/5level-fixup.h
arch/arm/include/asm/kvm_mmu.h | 5 +-
arch/arm/include/asm/pgtable.h | 1 -
arch/arm/include/asm/stage2_pgtable.h | 15 +-
arch/arm/lib/uaccess_with_memcpy.c | 9 +-
arch/arm/mach-sa1100/assabet.c | 2 +-
arch/arm/mm/dump.c | 29 ++-
arch/arm/mm/fault-armv.c | 7 +-
arch/arm/mm/fault.c | 28 ++-
arch/arm/mm/idmap.c | 3 +-
arch/arm/mm/init.c | 2 +-
arch/arm/mm/ioremap.c | 12 +-
arch/arm/mm/mm.h | 2 +-
arch/arm/mm/mmu.c | 35 ++-
arch/arm/mm/pgd.c | 40 +++-
arch/arm64/include/asm/kvm_mmu.h | 10 +-
arch/arm64/include/asm/pgalloc.h | 10 +-
arch/arm64/include/asm/pgtable-types.h | 5 +-
arch/arm64/include/asm/pgtable.h | 37 ++--
arch/arm64/include/asm/stage2_pgtable.h | 48 +++-
arch/arm64/kernel/hibernate.c | 44 +++-
arch/arm64/mm/fault.c | 9 +-
arch/arm64/mm/hugetlbpage.c | 15 +-
arch/arm64/mm/kasan_init.c | 26 ++-
arch/arm64/mm/mmu.c | 52 +++--
arch/arm64/mm/pageattr.c | 7 +-
arch/h8300/include/asm/pgtable.h | 1 -
arch/hexagon/include/asm/fixmap.h | 4 +-
arch/hexagon/include/asm/pgtable.h | 1 -
arch/ia64/include/asm/pgalloc.h | 4 +-
arch/ia64/include/asm/pgtable.h | 17 +-
arch/ia64/mm/fault.c | 7 +-
arch/ia64/mm/hugetlbpage.c | 18 +-
arch/ia64/mm/init.c | 28 ++-
arch/nios2/include/asm/pgtable.h | 3 +-
arch/nios2/mm/fault.c | 9 +-
arch/nios2/mm/ioremap.c | 6 +-
arch/openrisc/include/asm/pgtable.h | 1 -
arch/openrisc/mm/fault.c | 10 +-
arch/openrisc/mm/init.c | 4 +-
arch/powerpc/include/asm/book3s/32/pgtable.h | 1 -
arch/powerpc/include/asm/book3s/64/hash.h | 4 +-
arch/powerpc/include/asm/book3s/64/pgalloc.h | 4 +-
arch/powerpc/include/asm/book3s/64/pgtable.h | 58 +++--
arch/powerpc/include/asm/book3s/64/radix.h | 6 +-
arch/powerpc/include/asm/nohash/32/pgtable.h | 1 -
arch/powerpc/include/asm/nohash/64/pgalloc.h | 2 +-
.../include/asm/nohash/64/pgtable-4k.h | 32 +--
arch/powerpc/include/asm/nohash/64/pgtable.h | 6 +-
arch/powerpc/include/asm/pgtable.h | 8 +
arch/powerpc/kvm/book3s_64_mmu_radix.c | 59 ++++-
arch/powerpc/lib/code-patching.c | 7 +-
arch/powerpc/mm/book3s32/mmu.c | 2 +-
arch/powerpc/mm/book3s32/tlb.c | 4 +-
arch/powerpc/mm/book3s64/hash_pgtable.c | 4 +-
arch/powerpc/mm/book3s64/radix_pgtable.c | 19 +-
arch/powerpc/mm/book3s64/subpage_prot.c | 6 +-
arch/powerpc/mm/hugetlbpage.c | 28 ++-
arch/powerpc/mm/kasan/kasan_init_32.c | 8 +-
arch/powerpc/mm/mem.c | 4 +-
arch/powerpc/mm/nohash/40x.c | 4 +-
arch/powerpc/mm/nohash/book3e_pgtable.c | 15 +-
arch/powerpc/mm/pgtable.c | 25 ++-
arch/powerpc/mm/pgtable_32.c | 28 ++-
arch/powerpc/mm/pgtable_64.c | 10 +-
arch/powerpc/mm/ptdump/hashpagetable.c | 20 +-
arch/powerpc/mm/ptdump/ptdump.c | 22 +-
arch/powerpc/xmon/xmon.c | 17 +-
arch/sh/include/asm/pgtable-2level.h | 1 -
arch/sh/include/asm/pgtable-3level.h | 1 -
arch/sh/include/asm/pgtable_32.h | 5 +-
arch/sh/include/asm/pgtable_64.h | 5 +-
arch/sh/kernel/io_trapped.c | 7 +-
arch/sh/mm/cache-sh4.c | 4 +-
arch/sh/mm/cache-sh5.c | 7 +-
arch/sh/mm/fault.c | 65 ++++--
arch/sh/mm/hugetlbpage.c | 28 ++-
arch/sh/mm/init.c | 15 +-
arch/sh/mm/kmap.c | 2 +-
arch/sh/mm/tlbex_32.c | 6 +-
arch/sh/mm/tlbex_64.c | 7 +-
arch/unicore32/include/asm/pgtable.h | 1 -
arch/unicore32/kernel/hibernate.c | 4 +-
include/asm-generic/5level-fixup.h | 58 -----
include/asm-generic/pgtable-nop4d-hack.h | 64 ------
include/asm-generic/pgtable-nopud.h | 4 -
include/linux/mm.h | 6 -
mm/kasan/init.c | 11 -
mm/memory.c | 8 -
virt/kvm/arm/mmu.c | 209 +++++++++++++++---
89 files changed, 988 insertions(+), 500 deletions(-)
delete mode 100644 include/asm-generic/5level-fixup.h
delete mode 100644 include/asm-generic/pgtable-nop4d-hack.h
--
2.24.0
From: Mike Rapoport <[email protected]>
Implement primitives necessary for the 4th level folding, add walks of
the p4d level where appropriate, replace 5level-fixup.h with
pgtable-nop4d.h, and remove __ARCH_USE_5LEVEL_HACK.
Since arm and arm64 share the KVM memory management code, convert both
architectures at once to avoid breaking the build in the middle of the
series.
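The extra walks are free on configurations where the p4d level is
folded; an abridged sketch (not the verbatim header) of what
<asm-generic/pgtable-nop4d.h> provides:

	typedef struct { pgd_t pgd; } p4d_t;

	#define PTRS_PER_P4D	1

	/* the single p4d entry aliases the pgd entry... */
	static inline p4d_t *p4d_offset(pgd_t *pgd, unsigned long address)
	{
		return (p4d_t *)pgd;
	}

	/* ...so the pgd level itself degenerates to a no-op */
	static inline int pgd_none(pgd_t pgd)		{ return 0; }
	static inline int pgd_bad(pgd_t pgd)		{ return 0; }
	static inline int pgd_present(pgd_t pgd)	{ return 1; }
	static inline void pgd_clear(pgd_t *pgd)	{ }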
Signed-off-by: Mike Rapoport <[email protected]>
---
arch/arm/include/asm/kvm_mmu.h | 5 +-
arch/arm/include/asm/pgtable.h | 1 -
arch/arm/include/asm/stage2_pgtable.h | 15 +-
arch/arm/lib/uaccess_with_memcpy.c | 9 +-
arch/arm/mach-sa1100/assabet.c | 2 +-
arch/arm/mm/dump.c | 29 +++-
arch/arm/mm/fault-armv.c | 7 +-
arch/arm/mm/fault.c | 28 +++-
arch/arm/mm/idmap.c | 3 +-
arch/arm/mm/init.c | 2 +-
arch/arm/mm/ioremap.c | 12 +-
arch/arm/mm/mm.h | 2 +-
arch/arm/mm/mmu.c | 35 +++-
arch/arm/mm/pgd.c | 40 ++++-
arch/arm64/include/asm/kvm_mmu.h | 10 +-
arch/arm64/include/asm/pgalloc.h | 10 +-
arch/arm64/include/asm/pgtable-types.h | 5 +-
arch/arm64/include/asm/pgtable.h | 37 +++--
arch/arm64/include/asm/stage2_pgtable.h | 48 ++++--
arch/arm64/kernel/hibernate.c | 44 ++++-
arch/arm64/mm/fault.c | 9 +-
arch/arm64/mm/hugetlbpage.c | 15 +-
arch/arm64/mm/kasan_init.c | 26 ++-
arch/arm64/mm/mmu.c | 52 ++++--
arch/arm64/mm/pageattr.c | 7 +-
virt/kvm/arm/mmu.c | 209 ++++++++++++++++++++----
26 files changed, 522 insertions(+), 140 deletions(-)
diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
index 0d84d50bf9ba..8c511bb99e4c 100644
--- a/arch/arm/include/asm/kvm_mmu.h
+++ b/arch/arm/include/asm/kvm_mmu.h
@@ -68,7 +68,8 @@ void kvm_clear_hyp_idmap(void);
#define kvm_mk_pmd(ptep) __pmd(__pa(ptep) | PMD_TYPE_TABLE)
#define kvm_mk_pud(pmdp) __pud(__pa(pmdp) | PMD_TYPE_TABLE)
-#define kvm_mk_pgd(pudp) ({ BUILD_BUG(); 0; })
+#define kvm_mk_p4d(pudp) ({ BUILD_BUG(); __p4d(0); })
+#define kvm_mk_pgd(p4dp) ({ BUILD_BUG(); 0; })
#define kvm_pfn_pte(pfn, prot) pfn_pte(pfn, prot)
#define kvm_pfn_pmd(pfn, prot) pfn_pmd(pfn, prot)
@@ -194,10 +195,12 @@ static inline bool kvm_page_empty(void *ptr)
#define kvm_pte_table_empty(kvm, ptep) kvm_page_empty(ptep)
#define kvm_pmd_table_empty(kvm, pmdp) kvm_page_empty(pmdp)
#define kvm_pud_table_empty(kvm, pudp) false
+#define kvm_p4d_table_empty(kvm, p4dp) false
#define hyp_pte_table_empty(ptep) kvm_page_empty(ptep)
#define hyp_pmd_table_empty(pmdp) kvm_page_empty(pmdp)
#define hyp_pud_table_empty(pudp) false
+#define hyp_p4d_table_empty(p4dp) false
struct kvm;
diff --git a/arch/arm/include/asm/pgtable.h b/arch/arm/include/asm/pgtable.h
index eabcb48a7840..9e3464842dfc 100644
--- a/arch/arm/include/asm/pgtable.h
+++ b/arch/arm/include/asm/pgtable.h
@@ -17,7 +17,6 @@
#else
-#define __ARCH_USE_5LEVEL_HACK
#include <asm-generic/pgtable-nopud.h>
#include <asm/memory.h>
#include <asm/pgtable-hwdef.h>
diff --git a/arch/arm/include/asm/stage2_pgtable.h b/arch/arm/include/asm/stage2_pgtable.h
index aaceec7855ec..7ed66e216a5e 100644
--- a/arch/arm/include/asm/stage2_pgtable.h
+++ b/arch/arm/include/asm/stage2_pgtable.h
@@ -19,8 +19,17 @@
#define stage2_pgd_none(kvm, pgd) pgd_none(pgd)
#define stage2_pgd_clear(kvm, pgd) pgd_clear(pgd)
#define stage2_pgd_present(kvm, pgd) pgd_present(pgd)
-#define stage2_pgd_populate(kvm, pgd, pud) pgd_populate(NULL, pgd, pud)
-#define stage2_pud_offset(kvm, pgd, address) pud_offset(pgd, address)
+#define stage2_pgd_populate(kvm, pgd, p4d) pgd_populate(NULL, pgd, p4d)
+
+#define stage2_p4d_offset(kvm, pgd, address) p4d_offset(pgd, address)
+#define stage2_p4d_free(kvm, p4d) do { } while (0)
+
+#define stage2_p4d_none(kvm, p4d) p4d_none(p4d)
+#define stage2_p4d_clear(kvm, p4d) p4d_clear(p4d)
+#define stage2_p4d_present(kvm, p4d) p4d_present(p4d)
+#define stage2_p4d_populate(kvm, p4d, pud) p4d_populate(NULL, p4d, pud)
+
+#define stage2_pud_offset(kvm, p4d, address) pud_offset(p4d, address)
#define stage2_pud_free(kvm, pud) do { } while (0)
#define stage2_pud_none(kvm, pud) pud_none(pud)
@@ -41,6 +50,7 @@ stage2_pgd_addr_end(struct kvm *kvm, phys_addr_t addr, phys_addr_t end)
return (boundary - 1 < end - 1) ? boundary : end;
}
+#define stage2_p4d_addr_end(kvm, addr, end) (end)
#define stage2_pud_addr_end(kvm, addr, end) (end)
static inline phys_addr_t
@@ -56,6 +66,7 @@ stage2_pmd_addr_end(struct kvm *kvm, phys_addr_t addr, phys_addr_t end)
#define stage2_pte_table_empty(kvm, ptep) kvm_page_empty(ptep)
#define stage2_pmd_table_empty(kvm, pmdp) kvm_page_empty(pmdp)
#define stage2_pud_table_empty(kvm, pudp) false
+#define stage2_p4d_table_empty(kvm, p4dp) false
static inline bool kvm_stage2_has_pud(struct kvm *kvm)
{
diff --git a/arch/arm/lib/uaccess_with_memcpy.c b/arch/arm/lib/uaccess_with_memcpy.c
index c9450982a155..cabf1119c256 100644
--- a/arch/arm/lib/uaccess_with_memcpy.c
+++ b/arch/arm/lib/uaccess_with_memcpy.c
@@ -24,6 +24,7 @@ pin_page_for_write(const void __user *_addr, pte_t **ptep, spinlock_t **ptlp)
{
unsigned long addr = (unsigned long)_addr;
pgd_t *pgd;
+ p4d_t *p4d;
pmd_t *pmd;
pte_t *pte;
pud_t *pud;
@@ -33,7 +34,11 @@ pin_page_for_write(const void __user *_addr, pte_t **ptep, spinlock_t **ptlp)
if (unlikely(pgd_none(*pgd) || pgd_bad(*pgd)))
return 0;
- pud = pud_offset(pgd, addr);
+ p4d = p4d_offset(pgd, addr);
+ if (unlikely(p4d_none(*p4d) || p4d_bad(*p4d)))
+ return 0;
+
+ pud = pud_offset(p4d, addr);
if (unlikely(pud_none(*pud) || pud_bad(*pud)))
return 0;
@@ -154,7 +159,7 @@ arm_copy_to_user(void __user *to, const void *from, unsigned long n)
}
return n;
}
-
+
static unsigned long noinline
__clear_user_memset(void __user *addr, unsigned long n)
{
diff --git a/arch/arm/mach-sa1100/assabet.c b/arch/arm/mach-sa1100/assabet.c
index d96a101e5504..0631a7b02678 100644
--- a/arch/arm/mach-sa1100/assabet.c
+++ b/arch/arm/mach-sa1100/assabet.c
@@ -633,7 +633,7 @@ static void __init map_sa1100_gpio_regs( void )
int prot = PMD_TYPE_SECT | PMD_SECT_AP_WRITE | PMD_DOMAIN(DOMAIN_IO);
pmd_t *pmd;
- pmd = pmd_offset(pud_offset(pgd_offset_k(virt), virt), virt);
+ pmd = pmd_offset(pud_offset(p4d_offset(pgd_offset_k(virt), virt), virt), virt);
*pmd = __pmd(phys | prot);
flush_pmd_entry(pmd);
}
diff --git a/arch/arm/mm/dump.c b/arch/arm/mm/dump.c
index 7d6291f23251..677549d6854c 100644
--- a/arch/arm/mm/dump.c
+++ b/arch/arm/mm/dump.c
@@ -207,6 +207,7 @@ struct pg_level {
static struct pg_level pg_level[] = {
{
}, { /* pgd */
+ }, { /* p4d */
}, { /* pud */
}, { /* pmd */
.bits = section_bits,
@@ -308,7 +309,7 @@ static void walk_pte(struct pg_state *st, pmd_t *pmd, unsigned long start,
for (i = 0; i < PTRS_PER_PTE; i++, pte++) {
addr = start + i * PAGE_SIZE;
- note_page(st, addr, 4, pte_val(*pte), domain);
+ note_page(st, addr, 5, pte_val(*pte), domain);
}
}
@@ -350,14 +351,14 @@ static void walk_pmd(struct pg_state *st, pud_t *pud, unsigned long start)
addr += SECTION_SIZE;
pmd++;
domain = get_domain_name(pmd);
- note_page(st, addr, 3, pmd_val(*pmd), domain);
+ note_page(st, addr, 4, pmd_val(*pmd), domain);
}
}
}
-static void walk_pud(struct pg_state *st, pgd_t *pgd, unsigned long start)
+static void walk_pud(struct pg_state *st, p4d_t *p4d, unsigned long start)
{
- pud_t *pud = pud_offset(pgd, 0);
+ pud_t *pud = pud_offset(p4d, 0);
unsigned long addr;
unsigned i;
@@ -366,7 +367,23 @@ static void walk_pud(struct pg_state *st, pgd_t *pgd, unsigned long start)
if (!pud_none(*pud)) {
walk_pmd(st, pud, addr);
} else {
- note_page(st, addr, 2, pud_val(*pud), NULL);
+ note_page(st, addr, 3, pud_val(*pud), NULL);
+ }
+ }
+}
+
+static void walk_p4d(struct pg_state *st, pgd_t *pgd, unsigned long start)
+{
+ p4d_t *p4d = p4d_offset(pgd, 0);
+ unsigned long addr;
+ unsigned i;
+
+ for (i = 0; i < PTRS_PER_P4D; i++, p4d++) {
+ addr = start + i * P4D_SIZE;
+ if (!p4d_none(*p4d)) {
+ walk_pud(st, p4d, addr);
+ } else {
+ note_page(st, addr, 2, p4d_val(*p4d), NULL);
}
}
}
@@ -381,7 +398,7 @@ static void walk_pgd(struct pg_state *st, struct mm_struct *mm,
for (i = 0; i < PTRS_PER_PGD; i++, pgd++) {
addr = start + i * PGDIR_SIZE;
if (!pgd_none(*pgd)) {
- walk_pud(st, pgd, addr);
+ walk_p4d(st, pgd, addr);
} else {
note_page(st, addr, 1, pgd_val(*pgd), NULL);
}
diff --git a/arch/arm/mm/fault-armv.c b/arch/arm/mm/fault-armv.c
index ae857f41f68d..489aaafa6ebd 100644
--- a/arch/arm/mm/fault-armv.c
+++ b/arch/arm/mm/fault-armv.c
@@ -91,6 +91,7 @@ static int adjust_pte(struct vm_area_struct *vma, unsigned long address,
{
spinlock_t *ptl;
pgd_t *pgd;
+ p4d_t *p4d;
pud_t *pud;
pmd_t *pmd;
pte_t *pte;
@@ -100,7 +101,11 @@ static int adjust_pte(struct vm_area_struct *vma, unsigned long address,
if (pgd_none_or_clear_bad(pgd))
return 0;
- pud = pud_offset(pgd, address);
+ p4d = p4d_offset(pgd, address);
+ if (p4d_none_or_clear_bad(p4d))
+ return 0;
+
+ pud = pud_offset(p4d, address);
if (pud_none_or_clear_bad(pud))
return 0;
diff --git a/arch/arm/mm/fault.c b/arch/arm/mm/fault.c
index bd0f4821f7e1..c2bd35a822e3 100644
--- a/arch/arm/mm/fault.c
+++ b/arch/arm/mm/fault.c
@@ -43,6 +43,7 @@ void show_pte(const char *lvl, struct mm_struct *mm, unsigned long addr)
printk("%s[%08lx] *pgd=%08llx", lvl, addr, (long long)pgd_val(*pgd));
do {
+ p4d_t *p4d;
pud_t *pud;
pmd_t *pmd;
pte_t *pte;
@@ -55,7 +56,19 @@ void show_pte(const char *lvl, struct mm_struct *mm, unsigned long addr)
break;
}
- pud = pud_offset(pgd, addr);
+ p4d = p4d_offset(pgd, addr);
+ if (PTRS_PER_P4D != 1)
+ pr_cont(", *p4d=%08llx", (long long)p4d_val(*p4d));
+
+ if (p4d_none(*p4d))
+ break;
+
+ if (p4d_bad(*p4d)) {
+ pr_cont("(bad)");
+ break;
+ }
+
+ pud = pud_offset(p4d, addr);
if (PTRS_PER_PUD != 1)
pr_cont(", *pud=%08llx", (long long)pud_val(*pud));
@@ -408,6 +421,7 @@ do_translation_fault(unsigned long addr, unsigned int fsr,
{
unsigned int index;
pgd_t *pgd, *pgd_k;
+ p4d_t *p4d, *p4d_k;
pud_t *pud, *pud_k;
pmd_t *pmd, *pmd_k;
@@ -427,8 +441,16 @@ do_translation_fault(unsigned long addr, unsigned int fsr,
if (!pgd_present(*pgd))
set_pgd(pgd, *pgd_k);
- pud = pud_offset(pgd, addr);
- pud_k = pud_offset(pgd_k, addr);
+ p4d = p4d_offset(pgd, addr);
+ p4d_k = p4d_offset(pgd_k, addr);
+
+ if (p4d_none(*p4d_k))
+ goto bad_area;
+ if (!p4d_present(*p4d))
+ set_p4d(p4d, *p4d_k);
+
+ pud = pud_offset(p4d, addr);
+ pud_k = pud_offset(p4d_k, addr);
if (pud_none(*pud_k))
goto bad_area;
diff --git a/arch/arm/mm/idmap.c b/arch/arm/mm/idmap.c
index a033f6134a64..cd54411ef1b8 100644
--- a/arch/arm/mm/idmap.c
+++ b/arch/arm/mm/idmap.c
@@ -68,7 +68,8 @@ static void idmap_add_pmd(pud_t *pud, unsigned long addr, unsigned long end,
static void idmap_add_pud(pgd_t *pgd, unsigned long addr, unsigned long end,
unsigned long prot)
{
- pud_t *pud = pud_offset(pgd, addr);
+ p4d_t *p4d = p4d_offset(pgd, addr);
+ pud_t *pud = pud_offset(p4d, addr);
unsigned long next;
do {
diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c
index 054be44d1cdb..963b5284d284 100644
--- a/arch/arm/mm/init.c
+++ b/arch/arm/mm/init.c
@@ -571,7 +571,7 @@ static inline void section_update(unsigned long addr, pmdval_t mask,
{
pmd_t *pmd;
- pmd = pmd_offset(pud_offset(pgd_offset(mm, addr), addr), addr);
+ pmd = pmd_off_k(addr);
#ifdef CONFIG_ARM_LPAE
pmd[0] = __pmd((pmd_val(pmd[0]) & mask) | prot);
diff --git a/arch/arm/mm/ioremap.c b/arch/arm/mm/ioremap.c
index 72286f9a4d30..75529d76d28c 100644
--- a/arch/arm/mm/ioremap.c
+++ b/arch/arm/mm/ioremap.c
@@ -142,12 +142,14 @@ static void unmap_area_sections(unsigned long virt, unsigned long size)
{
unsigned long addr = virt, end = virt + (size & ~(SZ_1M - 1));
pgd_t *pgd;
+ p4d_t *p4d;
pud_t *pud;
pmd_t *pmdp;
flush_cache_vunmap(addr, end);
pgd = pgd_offset_k(addr);
- pud = pud_offset(pgd, addr);
+ p4d = p4d_offset(pgd, addr);
+ pud = pud_offset(p4d, addr);
pmdp = pmd_offset(pud, addr);
do {
pmd_t pmd = *pmdp;
@@ -190,6 +192,7 @@ remap_area_sections(unsigned long virt, unsigned long pfn,
{
unsigned long addr = virt, end = virt + size;
pgd_t *pgd;
+ p4d_t *p4d;
pud_t *pud;
pmd_t *pmd;
@@ -200,7 +203,8 @@ remap_area_sections(unsigned long virt, unsigned long pfn,
unmap_area_sections(virt, size);
pgd = pgd_offset_k(addr);
- pud = pud_offset(pgd, addr);
+ p4d = p4d_offset(pgd, addr);
+ pud = pud_offset(p4d, addr);
pmd = pmd_offset(pud, addr);
do {
pmd[0] = __pmd(__pfn_to_phys(pfn) | type->prot_sect);
@@ -222,6 +226,7 @@ remap_area_supersections(unsigned long virt, unsigned long pfn,
{
unsigned long addr = virt, end = virt + size;
pgd_t *pgd;
+ p4d_t *p4d;
pud_t *pud;
pmd_t *pmd;
@@ -232,7 +237,8 @@ remap_area_supersections(unsigned long virt, unsigned long pfn,
unmap_area_sections(virt, size);
pgd = pgd_offset_k(virt);
- pud = pud_offset(pgd, addr);
+ p4d = p4d_offset(pgd, addr);
+ pud = pud_offset(p4d, addr);
pmd = pmd_offset(pud, addr);
do {
unsigned long super_pmd_val, i;
diff --git a/arch/arm/mm/mm.h b/arch/arm/mm/mm.h
index 88c121ac14b3..4f1f72b75890 100644
--- a/arch/arm/mm/mm.h
+++ b/arch/arm/mm/mm.h
@@ -38,7 +38,7 @@ static inline pte_t get_top_pte(unsigned long va)
static inline pmd_t *pmd_off_k(unsigned long virt)
{
- return pmd_offset(pud_offset(pgd_offset_k(virt), virt), virt);
+ return pmd_offset(pud_offset(p4d_offset(pgd_offset_k(virt), virt), virt), virt);
}
struct mem_type {
diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
index 5d0d0f86e790..afd97342b634 100644
--- a/arch/arm/mm/mmu.c
+++ b/arch/arm/mm/mmu.c
@@ -375,7 +375,8 @@ static pte_t *pte_offset_late_fixmap(pmd_t *dir, unsigned long addr)
static inline pmd_t * __init fixmap_pmd(unsigned long addr)
{
pgd_t *pgd = pgd_offset_k(addr);
- pud_t *pud = pud_offset(pgd, addr);
+ p4d_t *p4d = p4d_offset(pgd, addr);
+ pud_t *pud = pud_offset(p4d, addr);
pmd_t *pmd = pmd_offset(pud, addr);
return pmd;
@@ -827,12 +828,12 @@ static void __init alloc_init_pmd(pud_t *pud, unsigned long addr,
} while (pmd++, addr = next, addr != end);
}
-static void __init alloc_init_pud(pgd_t *pgd, unsigned long addr,
+static void __init alloc_init_pud(p4d_t *p4d, unsigned long addr,
unsigned long end, phys_addr_t phys,
const struct mem_type *type,
void *(*alloc)(unsigned long sz), bool ng)
{
- pud_t *pud = pud_offset(pgd, addr);
+ pud_t *pud = pud_offset(p4d, addr);
unsigned long next;
do {
@@ -842,6 +843,21 @@ static void __init alloc_init_pud(pgd_t *pgd, unsigned long addr,
} while (pud++, addr = next, addr != end);
}
+static void __init alloc_init_p4d(pgd_t *pgd, unsigned long addr,
+ unsigned long end, phys_addr_t phys,
+ const struct mem_type *type,
+ void *(*alloc)(unsigned long sz), bool ng)
+{
+ p4d_t *p4d = p4d_offset(pgd, addr);
+ unsigned long next;
+
+ do {
+ next = p4d_addr_end(addr, end);
+ alloc_init_pud(p4d, addr, next, phys, type, alloc, ng);
+ phys += next - addr;
+ } while (p4d++, addr = next, addr != end);
+}
+
#ifndef CONFIG_ARM_LPAE
static void __init create_36bit_mapping(struct mm_struct *mm,
struct map_desc *md,
@@ -889,7 +905,8 @@ static void __init create_36bit_mapping(struct mm_struct *mm,
pgd = pgd_offset(mm, addr);
end = addr + length;
do {
- pud_t *pud = pud_offset(pgd, addr);
+ p4d_t *p4d = p4d_offset(pgd, addr);
+ pud_t *pud = pud_offset(p4d, addr);
pmd_t *pmd = pmd_offset(pud, addr);
int i;
@@ -940,7 +957,7 @@ static void __init __create_mapping(struct mm_struct *mm, struct map_desc *md,
do {
unsigned long next = pgd_addr_end(addr, end);
- alloc_init_pud(pgd, addr, next, phys, type, alloc, ng);
+ alloc_init_p4d(pgd, addr, next, phys, type, alloc, ng);
phys += next - addr;
addr = next;
@@ -976,7 +993,13 @@ void __init create_mapping_late(struct mm_struct *mm, struct map_desc *md,
bool ng)
{
#ifdef CONFIG_ARM_LPAE
- pud_t *pud = pud_alloc(mm, pgd_offset(mm, md->virtual), md->virtual);
+ p4d_t *p4d;
+ pud_t *pud;
+
+ p4d = p4d_alloc(mm, pgd_offset(mm, md->virtual), md->virtual);
+ if (WARN_ON(!p4d))
+ return;
+ pud = pud_alloc(mm, p4d, md->virtual);
if (WARN_ON(!pud))
return;
pmd_alloc(mm, pud, 0);
diff --git a/arch/arm/mm/pgd.c b/arch/arm/mm/pgd.c
index 478bd2c6aa50..c5e1b27046a8 100644
--- a/arch/arm/mm/pgd.c
+++ b/arch/arm/mm/pgd.c
@@ -30,6 +30,7 @@
pgd_t *pgd_alloc(struct mm_struct *mm)
{
pgd_t *new_pgd, *init_pgd;
+ p4d_t *new_p4d, *init_p4d;
pud_t *new_pud, *init_pud;
pmd_t *new_pmd, *init_pmd;
pte_t *new_pte, *init_pte;
@@ -53,8 +54,12 @@ pgd_t *pgd_alloc(struct mm_struct *mm)
/*
* Allocate PMD table for modules and pkmap mappings.
*/
- new_pud = pud_alloc(mm, new_pgd + pgd_index(MODULES_VADDR),
+ new_p4d = p4d_alloc(mm, new_pgd + pgd_index(MODULES_VADDR),
MODULES_VADDR);
+ if (!new_p4d)
+ goto no_p4d;
+
+ new_pud = pud_alloc(mm, new_p4d, MODULES_VADDR);
if (!new_pud)
goto no_pud;
@@ -69,7 +74,11 @@ pgd_t *pgd_alloc(struct mm_struct *mm)
* contains the machine vectors. The vectors are always high
* with LPAE.
*/
- new_pud = pud_alloc(mm, new_pgd, 0);
+ new_p4d = p4d_alloc(mm, new_pgd, 0);
+ if (!new_p4d)
+ goto no_p4d;
+
+ new_pud = pud_alloc(mm, new_p4d, 0);
if (!new_pud)
goto no_pud;
@@ -91,7 +100,8 @@ pgd_t *pgd_alloc(struct mm_struct *mm)
pmd_val(*new_pmd) |= PMD_DOMAIN(DOMAIN_VECTORS);
#endif
- init_pud = pud_offset(init_pgd, 0);
+ init_p4d = p4d_offset(init_pgd, 0);
+ init_pud = pud_offset(init_p4d, 0);
init_pmd = pmd_offset(init_pud, 0);
init_pte = pte_offset_map(init_pmd, 0);
set_pte_ext(new_pte + 0, init_pte[0], 0);
@@ -108,6 +118,8 @@ pgd_t *pgd_alloc(struct mm_struct *mm)
no_pmd:
pud_free(mm, new_pud);
no_pud:
+ p4d_free(mm, new_p4d);
+no_p4d:
__pgd_free(new_pgd);
no_pgd:
return NULL;
@@ -116,6 +128,7 @@ pgd_t *pgd_alloc(struct mm_struct *mm)
void pgd_free(struct mm_struct *mm, pgd_t *pgd_base)
{
pgd_t *pgd;
+ p4d_t *p4d;
pud_t *pud;
pmd_t *pmd;
pgtable_t pte;
@@ -127,7 +140,11 @@ void pgd_free(struct mm_struct *mm, pgd_t *pgd_base)
if (pgd_none_or_clear_bad(pgd))
goto no_pgd;
- pud = pud_offset(pgd, 0);
+ p4d = p4d_offset(pgd, 0);
+ if (p4d_none_or_clear_bad(p4d))
+ goto no_p4d;
+
+ pud = pud_offset(p4d, 0);
if (pud_none_or_clear_bad(pud))
goto no_pud;
@@ -144,8 +161,11 @@ void pgd_free(struct mm_struct *mm, pgd_t *pgd_base)
pmd_free(mm, pmd);
mm_dec_nr_pmds(mm);
no_pud:
- pgd_clear(pgd);
+ p4d_clear(p4d);
pud_free(mm, pud);
+no_p4d:
+ pgd_clear(pgd);
+ p4d_free(mm, p4d);
no_pgd:
#ifdef CONFIG_ARM_LPAE
/*
@@ -156,15 +176,21 @@ void pgd_free(struct mm_struct *mm, pgd_t *pgd_base)
continue;
if (pgd_val(*pgd) & L_PGD_SWAPPER)
continue;
- pud = pud_offset(pgd, 0);
+ p4d = p4d_offset(pgd, 0);
+ if (p4d_none_or_clear_bad(p4d))
+ continue;
+ pud = pud_offset(p4d, 0);
if (pud_none_or_clear_bad(pud))
continue;
pmd = pmd_offset(pud, 0);
pud_clear(pud);
pmd_free(mm, pmd);
mm_dec_nr_pmds(mm);
- pgd_clear(pgd);
+ p4d_clear(p4d);
pud_free(mm, pud);
+ mm_dec_nr_puds(mm);
+ pgd_clear(pgd);
+ p4d_free(mm, p4d);
}
#endif
__pgd_free(pgd_base);
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index 53d846f1bfe7..1f9bf19ac553 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -172,8 +172,8 @@ void kvm_clear_hyp_idmap(void);
__pmd(__phys_to_pmd_val(__pa(ptep)) | PMD_TYPE_TABLE)
#define kvm_mk_pud(pmdp) \
__pud(__phys_to_pud_val(__pa(pmdp)) | PMD_TYPE_TABLE)
-#define kvm_mk_pgd(pudp) \
- __pgd(__phys_to_pgd_val(__pa(pudp)) | PUD_TYPE_TABLE)
+#define kvm_mk_p4d(pmdp) \
+ __p4d(__phys_to_p4d_val(__pa(pmdp)) | PUD_TYPE_TABLE)
#define kvm_set_pud(pudp, pud) set_pud(pudp, pud)
@@ -299,6 +299,12 @@ static inline bool kvm_s2pud_young(pud_t pud)
#define hyp_pud_table_empty(pudp) kvm_page_empty(pudp)
#endif
+#ifdef __PAGETABLE_P4D_FOLDED
+#define hyp_p4d_table_empty(p4dp) (0)
+#else
+#define hyp_p4d_table_empty(p4dp) kvm_page_empty(p4dp)
+#endif
+
struct kvm;
#define kvm_flush_dcache_to_poc(a,l) __flush_dcache_area((a), (l))
diff --git a/arch/arm64/include/asm/pgalloc.h b/arch/arm64/include/asm/pgalloc.h
index 172d76fa0245..58e93583ddb6 100644
--- a/arch/arm64/include/asm/pgalloc.h
+++ b/arch/arm64/include/asm/pgalloc.h
@@ -73,17 +73,17 @@ static inline void pud_free(struct mm_struct *mm, pud_t *pudp)
free_page((unsigned long)pudp);
}
-static inline void __pgd_populate(pgd_t *pgdp, phys_addr_t pudp, pgdval_t prot)
+static inline void __p4d_populate(p4d_t *p4dp, phys_addr_t pudp, p4dval_t prot)
{
- set_pgd(pgdp, __pgd(__phys_to_pgd_val(pudp) | prot));
+ set_p4d(p4dp, __p4d(__phys_to_p4d_val(pudp) | prot));
}
-static inline void pgd_populate(struct mm_struct *mm, pgd_t *pgdp, pud_t *pudp)
+static inline void p4d_populate(struct mm_struct *mm, p4d_t *p4dp, pud_t *pudp)
{
- __pgd_populate(pgdp, __pa(pudp), PUD_TYPE_TABLE);
+ __p4d_populate(p4dp, __pa(pudp), PUD_TYPE_TABLE);
}
#else
-static inline void __pgd_populate(pgd_t *pgdp, phys_addr_t pudp, pgdval_t prot)
+static inline void __p4d_populate(p4d_t *p4dp, phys_addr_t pudp, p4dval_t prot)
{
BUILD_BUG();
}
diff --git a/arch/arm64/include/asm/pgtable-types.h b/arch/arm64/include/asm/pgtable-types.h
index acb0751a6606..b8f158ae2527 100644
--- a/arch/arm64/include/asm/pgtable-types.h
+++ b/arch/arm64/include/asm/pgtable-types.h
@@ -14,6 +14,7 @@
typedef u64 pteval_t;
typedef u64 pmdval_t;
typedef u64 pudval_t;
+typedef u64 p4dval_t;
typedef u64 pgdval_t;
/*
@@ -44,13 +45,11 @@ typedef struct { pteval_t pgprot; } pgprot_t;
#define __pgprot(x) ((pgprot_t) { (x) } )
#if CONFIG_PGTABLE_LEVELS == 2
-#define __ARCH_USE_5LEVEL_HACK
#include <asm-generic/pgtable-nopmd.h>
#elif CONFIG_PGTABLE_LEVELS == 3
-#define __ARCH_USE_5LEVEL_HACK
#include <asm-generic/pgtable-nopud.h>
#elif CONFIG_PGTABLE_LEVELS == 4
-#include <asm-generic/5level-fixup.h>
+#include <asm-generic/pgtable-nop4d.h>
#endif
#endif /* __ASM_PGTABLE_TYPES_H */
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 538c85e62f86..c23c5a4e6dc6 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -298,6 +298,11 @@ static inline pte_t pgd_pte(pgd_t pgd)
return __pte(pgd_val(pgd));
}
+static inline pte_t p4d_pte(p4d_t p4d)
+{
+ return __pte(p4d_val(p4d));
+}
+
static inline pte_t pud_pte(pud_t pud)
{
return __pte(pud_val(pud));
@@ -401,6 +406,9 @@ static inline pmd_t pmd_mkdevmap(pmd_t pmd)
#define set_pmd_at(mm, addr, pmdp, pmd) set_pte_at(mm, addr, (pte_t *)pmdp, pmd_pte(pmd))
+#define __p4d_to_phys(p4d) __pte_to_phys(p4d_pte(p4d))
+#define __phys_to_p4d_val(phys) __phys_to_pte_val(phys)
+
#define __pgd_to_phys(pgd) __pte_to_phys(pgd_pte(pgd))
#define __phys_to_pgd_val(phys) __phys_to_pte_val(phys)
@@ -588,49 +596,50 @@ static inline phys_addr_t pud_page_paddr(pud_t pud)
#define pud_ERROR(pud) __pud_error(__FILE__, __LINE__, pud_val(pud))
-#define pgd_none(pgd) (!pgd_val(pgd))
-#define pgd_bad(pgd) (!(pgd_val(pgd) & 2))
-#define pgd_present(pgd) (pgd_val(pgd))
+#define p4d_none(p4d) (!p4d_val(p4d))
+#define p4d_bad(p4d) (!(p4d_val(p4d) & 2))
+#define p4d_present(p4d) (p4d_val(p4d))
-static inline void set_pgd(pgd_t *pgdp, pgd_t pgd)
+static inline void set_p4d(p4d_t *p4dp, p4d_t p4d)
{
- if (in_swapper_pgdir(pgdp)) {
- set_swapper_pgd(pgdp, pgd);
+ if (in_swapper_pgdir(p4dp)) {
+ set_swapper_pgd((pgd_t *)p4dp, __pgd(p4d_val(p4d)));
return;
}
- WRITE_ONCE(*pgdp, pgd);
+ WRITE_ONCE(*p4dp, p4d);
dsb(ishst);
isb();
}
-static inline void pgd_clear(pgd_t *pgdp)
+static inline void p4d_clear(p4d_t *p4dp)
{
- set_pgd(pgdp, __pgd(0));
+ set_p4d(p4dp, __p4d(0));
}
-static inline phys_addr_t pgd_page_paddr(pgd_t pgd)
+static inline phys_addr_t p4d_page_paddr(p4d_t p4d)
{
- return __pgd_to_phys(pgd);
+ return __p4d_to_phys(p4d);
}
/* Find an entry in the first-level page table. */
#define pud_index(addr) (((addr) >> PUD_SHIFT) & (PTRS_PER_PUD - 1))
-#define pud_offset_phys(dir, addr) (pgd_page_paddr(READ_ONCE(*(dir))) + pud_index(addr) * sizeof(pud_t))
+#define pud_offset_phys(dir, addr) (p4d_page_paddr(READ_ONCE(*(dir))) + pud_index(addr) * sizeof(pud_t))
#define pud_offset(dir, addr) ((pud_t *)__va(pud_offset_phys((dir), (addr))))
#define pud_set_fixmap(addr) ((pud_t *)set_fixmap_offset(FIX_PUD, addr))
-#define pud_set_fixmap_offset(pgd, addr) pud_set_fixmap(pud_offset_phys(pgd, addr))
+#define pud_set_fixmap_offset(p4d, addr) pud_set_fixmap(pud_offset_phys(p4d, addr))
#define pud_clear_fixmap() clear_fixmap(FIX_PUD)
-#define pgd_page(pgd) pfn_to_page(__phys_to_pfn(__pgd_to_phys(pgd)))
+#define p4d_page(p4d) pfn_to_page(__phys_to_pfn(__p4d_to_phys(p4d)))
/* use ONLY for statically allocated translation tables */
#define pud_offset_kimg(dir,addr) ((pud_t *)__phys_to_kimg(pud_offset_phys((dir), (addr))))
#else
+#define p4d_page_paddr(p4d) ({ BUILD_BUG(); 0;})
#define pgd_page_paddr(pgd) ({ BUILD_BUG(); 0;})
/* Match pud_offset folding in <asm/generic/pgtable-nopud.h> */
diff --git a/arch/arm64/include/asm/stage2_pgtable.h b/arch/arm64/include/asm/stage2_pgtable.h
index 326aac658b9d..9a364aeae5fb 100644
--- a/arch/arm64/include/asm/stage2_pgtable.h
+++ b/arch/arm64/include/asm/stage2_pgtable.h
@@ -68,41 +68,67 @@ static inline bool kvm_stage2_has_pud(struct kvm *kvm)
#define S2_PUD_SIZE (1UL << S2_PUD_SHIFT)
#define S2_PUD_MASK (~(S2_PUD_SIZE - 1))
-static inline bool stage2_pgd_none(struct kvm *kvm, pgd_t pgd)
+#define stage2_pgd_none(kvm, pgd) pgd_none(pgd)
+#define stage2_pgd_clear(kvm, pgd) pgd_clear(pgd)
+#define stage2_pgd_present(kvm, pgd) pgd_present(pgd)
+#define stage2_pgd_populate(kvm, pgd, p4d) pgd_populate(NULL, pgd, p4d)
+
+static inline p4d_t *stage2_p4d_offset(struct kvm *kvm,
+ pgd_t *pgd, unsigned long address)
+{
+ return p4d_offset(pgd, address);
+}
+
+static inline void stage2_p4d_free(struct kvm *kvm, p4d_t *p4d)
+{
+}
+
+static inline bool stage2_p4d_table_empty(struct kvm *kvm, p4d_t *p4dp)
+{
+ return false;
+}
+
+static inline phys_addr_t stage2_p4d_addr_end(struct kvm *kvm,
+ phys_addr_t addr, phys_addr_t end)
+{
+ return end;
+}
+
+static inline bool stage2_p4d_none(struct kvm *kvm, p4d_t p4d)
{
if (kvm_stage2_has_pud(kvm))
- return pgd_none(pgd);
+ return p4d_none(p4d);
else
return 0;
}
-static inline void stage2_pgd_clear(struct kvm *kvm, pgd_t *pgdp)
+static inline void stage2_p4d_clear(struct kvm *kvm, p4d_t *p4dp)
{
if (kvm_stage2_has_pud(kvm))
- pgd_clear(pgdp);
+ p4d_clear(p4dp);
}
-static inline bool stage2_pgd_present(struct kvm *kvm, pgd_t pgd)
+static inline bool stage2_p4d_present(struct kvm *kvm, p4d_t p4d)
{
if (kvm_stage2_has_pud(kvm))
- return pgd_present(pgd);
+ return p4d_present(p4d);
else
return 1;
}
-static inline void stage2_pgd_populate(struct kvm *kvm, pgd_t *pgd, pud_t *pud)
+static inline void stage2_p4d_populate(struct kvm *kvm, p4d_t *p4d, pud_t *pud)
{
if (kvm_stage2_has_pud(kvm))
- pgd_populate(NULL, pgd, pud);
+ p4d_populate(NULL, p4d, pud);
}
static inline pud_t *stage2_pud_offset(struct kvm *kvm,
- pgd_t *pgd, unsigned long address)
+ p4d_t *p4d, unsigned long address)
{
if (kvm_stage2_has_pud(kvm))
- return pud_offset(pgd, address);
+ return pud_offset(p4d, address);
else
- return (pud_t *)pgd;
+ return (pud_t *)p4d;
}
static inline void stage2_pud_free(struct kvm *kvm, pud_t *pud)
diff --git a/arch/arm64/kernel/hibernate.c b/arch/arm64/kernel/hibernate.c
index 590963c9c609..a370b1afeae0 100644
--- a/arch/arm64/kernel/hibernate.c
+++ b/arch/arm64/kernel/hibernate.c
@@ -187,6 +187,7 @@ static int trans_pgd_map_page(pgd_t *trans_pgd, void *page,
pgprot_t pgprot)
{
pgd_t *pgdp;
+ p4d_t *p4dp;
pud_t *pudp;
pmd_t *pmdp;
pte_t *ptep;
@@ -199,7 +200,15 @@ static int trans_pgd_map_page(pgd_t *trans_pgd, void *page,
pgd_populate(&init_mm, pgdp, pudp);
}
- pudp = pud_offset(pgdp, dst_addr);
+ p4dp = p4d_offset(pgdp, dst_addr);
+ if (p4d_none(READ_ONCE(*p4dp))) {
+ pudp = (void *)get_safe_page(GFP_ATOMIC);
+ if (!pudp)
+ return -ENOMEM;
+ p4d_populate(&init_mm, p4dp, pudp);
+ }
+
+ pudp = pud_offset(p4dp, dst_addr);
if (pud_none(READ_ONCE(*pudp))) {
pmdp = (void *)get_safe_page(GFP_ATOMIC);
if (!pmdp)
@@ -422,7 +431,7 @@ static int copy_pmd(pud_t *dst_pudp, pud_t *src_pudp, unsigned long start,
return 0;
}
-static int copy_pud(pgd_t *dst_pgdp, pgd_t *src_pgdp, unsigned long start,
+static int copy_pud(p4d_t *dst_p4dp, p4d_t *src_p4dp, unsigned long start,
unsigned long end)
{
pud_t *dst_pudp;
@@ -430,15 +439,15 @@ static int copy_pud(pgd_t *dst_pgdp, pgd_t *src_pgdp, unsigned long start,
unsigned long next;
unsigned long addr = start;
- if (pgd_none(READ_ONCE(*dst_pgdp))) {
+ if (p4d_none(READ_ONCE(*dst_p4dp))) {
dst_pudp = (pud_t *)get_safe_page(GFP_ATOMIC);
if (!dst_pudp)
return -ENOMEM;
- pgd_populate(&init_mm, dst_pgdp, dst_pudp);
+ p4d_populate(&init_mm, dst_p4dp, dst_pudp);
}
- dst_pudp = pud_offset(dst_pgdp, start);
+ dst_pudp = pud_offset(dst_p4dp, start);
- src_pudp = pud_offset(src_pgdp, start);
+ src_pudp = pud_offset(src_p4dp, start);
do {
pud_t pud = READ_ONCE(*src_pudp);
@@ -457,6 +466,27 @@ static int copy_pud(pgd_t *dst_pgdp, pgd_t *src_pgdp, unsigned long start,
return 0;
}
+static int copy_p4d(pgd_t *dst_pgdp, pgd_t *src_pgdp, unsigned long start,
+ unsigned long end)
+{
+ p4d_t *dst_p4dp;
+ p4d_t *src_p4dp;
+ unsigned long next;
+ unsigned long addr = start;
+
+ dst_p4dp = p4d_offset(dst_pgdp, start);
+ src_p4dp = p4d_offset(src_pgdp, start);
+ do {
+ next = p4d_addr_end(addr, end);
+ if (p4d_none(READ_ONCE(*src_p4dp)))
+ continue;
+ if (copy_pud(dst_p4dp, src_p4dp, addr, next))
+ return -ENOMEM;
+ } while (dst_p4dp++, src_p4dp++, addr = next, addr != end);
+
+ return 0;
+}
+
static int copy_page_tables(pgd_t *dst_pgdp, unsigned long start,
unsigned long end)
{
@@ -469,7 +499,7 @@ static int copy_page_tables(pgd_t *dst_pgdp, unsigned long start,
next = pgd_addr_end(addr, end);
if (pgd_none(READ_ONCE(*src_pgdp)))
continue;
- if (copy_pud(dst_pgdp, src_pgdp, addr, next))
+ if (copy_p4d(dst_pgdp, src_pgdp, addr, next))
return -ENOMEM;
} while (dst_pgdp++, src_pgdp++, addr = next, addr != end);
diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
index 85566d32958f..fa6e7960f7d1 100644
--- a/arch/arm64/mm/fault.c
+++ b/arch/arm64/mm/fault.c
@@ -145,6 +145,7 @@ static void show_pte(unsigned long addr)
pr_alert("[%016lx] pgd=%016llx", addr, pgd_val(pgd));
do {
+ p4d_t *p4dp, p4d;
pud_t *pudp, pud;
pmd_t *pmdp, pmd;
pte_t *ptep, pte;
@@ -152,7 +153,13 @@ static void show_pte(unsigned long addr)
if (pgd_none(pgd) || pgd_bad(pgd))
break;
- pudp = pud_offset(pgdp, addr);
+ p4dp = p4d_offset(pgdp, addr);
+ p4d = READ_ONCE(*p4dp);
+ pr_cont(", p4d=%016llx", p4d_val(p4d));
+ if (p4d_none(p4d) || p4d_bad(p4d))
+ break;
+
+ pudp = pud_offset(p4dp, addr);
pud = READ_ONCE(*pudp);
pr_cont(", pud=%016llx", pud_val(pud));
if (pud_none(pud) || pud_bad(pud))
diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c
index bbeb6a5a6ba6..b8a9f26f3790 100644
--- a/arch/arm64/mm/hugetlbpage.c
+++ b/arch/arm64/mm/hugetlbpage.c
@@ -67,11 +67,13 @@ static int find_num_contig(struct mm_struct *mm, unsigned long addr,
pte_t *ptep, size_t *pgsize)
{
pgd_t *pgdp = pgd_offset(mm, addr);
+ p4d_t *p4dp;
pud_t *pudp;
pmd_t *pmdp;
*pgsize = PAGE_SIZE;
- pudp = pud_offset(pgdp, addr);
+ p4dp = p4d_offset(pgdp, addr);
+ pudp = pud_offset(p4dp, addr);
pmdp = pmd_offset(pudp, addr);
if ((pte_t *)pmdp == ptep) {
*pgsize = PMD_SIZE;
@@ -217,12 +219,14 @@ pte_t *huge_pte_alloc(struct mm_struct *mm,
unsigned long addr, unsigned long sz)
{
pgd_t *pgdp;
+ p4d_t *p4dp;
pud_t *pudp;
pmd_t *pmdp;
pte_t *ptep = NULL;
pgdp = pgd_offset(mm, addr);
- pudp = pud_alloc(mm, pgdp, addr);
+ p4dp = p4d_offset(pgdp, addr);
+ pudp = pud_alloc(mm, p4dp, addr);
if (!pudp)
return NULL;
@@ -259,6 +263,7 @@ pte_t *huge_pte_offset(struct mm_struct *mm,
unsigned long addr, unsigned long sz)
{
pgd_t *pgdp;
+ p4d_t *p4dp;
pud_t *pudp, pud;
pmd_t *pmdp, pmd;
@@ -266,7 +271,11 @@ pte_t *huge_pte_offset(struct mm_struct *mm,
if (!pgd_present(READ_ONCE(*pgdp)))
return NULL;
- pudp = pud_offset(pgdp, addr);
+ p4dp = p4d_offset(pgdp, addr);
+ if (!p4d_present(READ_ONCE(*p4dp)))
+ return NULL;
+
+ pudp = pud_offset(p4dp, addr);
pud = READ_ONCE(*pudp);
if (sz != PUD_SIZE && pud_none(pud))
return NULL;
diff --git a/arch/arm64/mm/kasan_init.c b/arch/arm64/mm/kasan_init.c
index f87a32484ea8..2339811f317b 100644
--- a/arch/arm64/mm/kasan_init.c
+++ b/arch/arm64/mm/kasan_init.c
@@ -84,17 +84,17 @@ static pmd_t *__init kasan_pmd_offset(pud_t *pudp, unsigned long addr, int node,
return early ? pmd_offset_kimg(pudp, addr) : pmd_offset(pudp, addr);
}
-static pud_t *__init kasan_pud_offset(pgd_t *pgdp, unsigned long addr, int node,
+static pud_t *__init kasan_pud_offset(p4d_t *p4dp, unsigned long addr, int node,
bool early)
{
- if (pgd_none(READ_ONCE(*pgdp))) {
+ if (p4d_none(READ_ONCE(*p4dp))) {
phys_addr_t pud_phys = early ?
__pa_symbol(kasan_early_shadow_pud)
: kasan_alloc_zeroed_page(node);
- __pgd_populate(pgdp, pud_phys, PMD_TYPE_TABLE);
+ __p4d_populate(p4dp, pud_phys, PMD_TYPE_TABLE);
}
- return early ? pud_offset_kimg(pgdp, addr) : pud_offset(pgdp, addr);
+ return early ? pud_offset_kimg(p4dp, addr) : pud_offset(p4dp, addr);
}
static void __init kasan_pte_populate(pmd_t *pmdp, unsigned long addr,
@@ -126,11 +126,11 @@ static void __init kasan_pmd_populate(pud_t *pudp, unsigned long addr,
} while (pmdp++, addr = next, addr != end && pmd_none(READ_ONCE(*pmdp)));
}
-static void __init kasan_pud_populate(pgd_t *pgdp, unsigned long addr,
+static void __init kasan_pud_populate(p4d_t *p4dp, unsigned long addr,
unsigned long end, int node, bool early)
{
unsigned long next;
- pud_t *pudp = kasan_pud_offset(pgdp, addr, node, early);
+ pud_t *pudp = kasan_pud_offset(p4dp, addr, node, early);
do {
next = pud_addr_end(addr, end);
@@ -138,6 +138,18 @@ static void __init kasan_pud_populate(pgd_t *pgdp, unsigned long addr,
} while (pudp++, addr = next, addr != end && pud_none(READ_ONCE(*pudp)));
}
+static void __init kasan_p4d_populate(pgd_t *pgdp, unsigned long addr,
+ unsigned long end, int node, bool early)
+{
+ unsigned long next;
+ p4d_t *p4dp = p4d_offset(pgdp, addr);
+
+ do {
+ next = p4d_addr_end(addr, end);
+ kasan_pud_populate(p4dp, addr, next, node, early);
+ } while (p4dp++, addr = next, addr != end);
+}
+
static void __init kasan_pgd_populate(unsigned long addr, unsigned long end,
int node, bool early)
{
@@ -147,7 +159,7 @@ static void __init kasan_pgd_populate(unsigned long addr, unsigned long end,
pgdp = pgd_offset_k(addr);
do {
next = pgd_addr_end(addr, end);
- kasan_pud_populate(pgdp, addr, next, node, early);
+ kasan_p4d_populate(pgdp, addr, next, node, early);
} while (pgdp++, addr = next, addr != end);
}
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index 128f70852bf3..ad4be3e8e0c1 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -289,18 +289,19 @@ static void alloc_init_pud(pgd_t *pgdp, unsigned long addr, unsigned long end,
{
unsigned long next;
pud_t *pudp;
- pgd_t pgd = READ_ONCE(*pgdp);
+ p4d_t *p4dp = p4d_offset(pgdp, addr);
+ p4d_t p4d = READ_ONCE(*p4dp);
- if (pgd_none(pgd)) {
+ if (p4d_none(p4d)) {
phys_addr_t pud_phys;
BUG_ON(!pgtable_alloc);
pud_phys = pgtable_alloc(PUD_SHIFT);
- __pgd_populate(pgdp, pud_phys, PUD_TYPE_TABLE);
- pgd = READ_ONCE(*pgdp);
+ __p4d_populate(p4dp, pud_phys, PUD_TYPE_TABLE);
+ p4d = READ_ONCE(*p4dp);
}
- BUG_ON(pgd_bad(pgd));
+ BUG_ON(p4d_bad(p4d));
- pudp = pud_set_fixmap_offset(pgdp, addr);
+ pudp = pud_set_fixmap_offset(p4dp, addr);
do {
pud_t old_pud = READ_ONCE(*pudp);
@@ -647,6 +648,7 @@ static void __init map_kernel(pgd_t *pgdp)
READ_ONCE(*pgd_offset_k(FIXADDR_START)));
} else if (CONFIG_PGTABLE_LEVELS > 3) {
pgd_t *bm_pgdp;
+ p4d_t *bm_p4dp;
pud_t *bm_pudp;
/*
* The fixmap shares its top level pgd entry with the kernel
@@ -656,7 +658,8 @@ static void __init map_kernel(pgd_t *pgdp)
*/
BUG_ON(!IS_ENABLED(CONFIG_ARM64_16K_PAGES));
bm_pgdp = pgd_offset_raw(pgdp, FIXADDR_START);
- bm_pudp = pud_set_fixmap_offset(bm_pgdp, FIXADDR_START);
+ bm_p4dp = p4d_offset(bm_pgdp, FIXADDR_START);
+ bm_pudp = pud_set_fixmap_offset(bm_p4dp, FIXADDR_START);
pud_populate(&init_mm, bm_pudp, lm_alias(bm_pmd));
pud_clear_fixmap();
} else {
@@ -690,6 +693,7 @@ void __init paging_init(void)
int kern_addr_valid(unsigned long addr)
{
pgd_t *pgdp;
+ p4d_t *p4dp;
pud_t *pudp, pud;
pmd_t *pmdp, pmd;
pte_t *ptep, pte;
@@ -701,7 +705,11 @@ int kern_addr_valid(unsigned long addr)
if (pgd_none(READ_ONCE(*pgdp)))
return 0;
- pudp = pud_offset(pgdp, addr);
+ p4dp = p4d_offset(pgdp, addr);
+ if (p4d_none(READ_ONCE(*p4dp)))
+ return 0;
+
+ pudp = pud_offset(p4dp, addr);
pud = READ_ONCE(*pudp);
if (pud_none(pud))
return 0;
@@ -738,6 +746,7 @@ int __meminit vmemmap_populate(unsigned long start, unsigned long end, int node,
unsigned long addr = start;
unsigned long next;
pgd_t *pgdp;
+ p4d_t *p4dp;
pud_t *pudp;
pmd_t *pmdp;
@@ -748,7 +757,11 @@ int __meminit vmemmap_populate(unsigned long start, unsigned long end, int node,
if (!pgdp)
return -ENOMEM;
- pudp = vmemmap_pud_populate(pgdp, addr, node);
+ p4dp = vmemmap_p4d_populate(pgdp, addr, node);
+ if (!p4dp)
+ return -ENOMEM;
+
+ pudp = vmemmap_pud_populate(p4dp, addr, node);
if (!pudp)
return -ENOMEM;
@@ -777,11 +790,12 @@ void vmemmap_free(unsigned long start, unsigned long end,
static inline pud_t * fixmap_pud(unsigned long addr)
{
pgd_t *pgdp = pgd_offset_k(addr);
- pgd_t pgd = READ_ONCE(*pgdp);
+ p4d_t *p4dp = p4d_offset(pgdp, addr);
+ p4d_t p4d = READ_ONCE(*p4dp);
- BUG_ON(pgd_none(pgd) || pgd_bad(pgd));
+ BUG_ON(p4d_none(p4d) || p4d_bad(p4d));
- return pud_offset_kimg(pgdp, addr);
+ return pud_offset_kimg(p4dp, addr);
}
static inline pmd_t * fixmap_pmd(unsigned long addr)
@@ -807,25 +821,27 @@ static inline pte_t * fixmap_pte(unsigned long addr)
*/
void __init early_fixmap_init(void)
{
- pgd_t *pgdp, pgd;
+ pgd_t *pgdp;
+ p4d_t *p4dp, p4d;
pud_t *pudp;
pmd_t *pmdp;
unsigned long addr = FIXADDR_START;
pgdp = pgd_offset_k(addr);
- pgd = READ_ONCE(*pgdp);
+ p4dp = p4d_offset(pgdp, addr);
+ p4d = READ_ONCE(*p4dp);
if (CONFIG_PGTABLE_LEVELS > 3 &&
- !(pgd_none(pgd) || pgd_page_paddr(pgd) == __pa_symbol(bm_pud))) {
+ !(p4d_none(p4d) || p4d_page_paddr(p4d) == __pa_symbol(bm_pud))) {
/*
* We only end up here if the kernel mapping and the fixmap
* share the top level pgd entry, which should only happen on
* 16k/4 levels configurations.
*/
BUG_ON(!IS_ENABLED(CONFIG_ARM64_16K_PAGES));
- pudp = pud_offset_kimg(pgdp, addr);
+ pudp = pud_offset_kimg(p4dp, addr);
} else {
- if (pgd_none(pgd))
- __pgd_populate(pgdp, __pa_symbol(bm_pud), PUD_TYPE_TABLE);
+ if (p4d_none(p4d))
+ __p4d_populate(p4dp, __pa_symbol(bm_pud), PUD_TYPE_TABLE);
pudp = fixmap_pud(addr);
}
if (pud_none(READ_ONCE(*pudp)))
diff --git a/arch/arm64/mm/pageattr.c b/arch/arm64/mm/pageattr.c
index 250c49008d73..5a310991ff73 100644
--- a/arch/arm64/mm/pageattr.c
+++ b/arch/arm64/mm/pageattr.c
@@ -198,6 +198,7 @@ void __kernel_map_pages(struct page *page, int numpages, int enable)
bool kernel_page_present(struct page *page)
{
pgd_t *pgdp;
+ p4d_t *p4dp;
pud_t *pudp, pud;
pmd_t *pmdp, pmd;
pte_t *ptep;
@@ -210,7 +211,11 @@ bool kernel_page_present(struct page *page)
if (pgd_none(READ_ONCE(*pgdp)))
return false;
- pudp = pud_offset(pgdp, addr);
+ p4dp = p4d_offset(pgdp, addr);
+ if (p4d_none(READ_ONCE(*p4dp)))
+ return false;
+
+ pudp = pud_offset(p4dp, addr);
pud = READ_ONCE(*pudp);
if (pud_none(pud))
return false;
diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index 19c961ac4e3c..3d250fa3d2b9 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -158,13 +158,22 @@ static void *mmu_memory_cache_alloc(struct kvm_mmu_memory_cache *mc)
static void clear_stage2_pgd_entry(struct kvm *kvm, pgd_t *pgd, phys_addr_t addr)
{
- pud_t *pud_table __maybe_unused = stage2_pud_offset(kvm, pgd, 0UL);
+ p4d_t *p4d_table __maybe_unused = stage2_p4d_offset(kvm, pgd, 0UL);
stage2_pgd_clear(kvm, pgd);
kvm_tlb_flush_vmid_ipa(kvm, addr);
- stage2_pud_free(kvm, pud_table);
+ stage2_p4d_free(kvm, p4d_table);
put_page(virt_to_page(pgd));
}
+static void clear_stage2_p4d_entry(struct kvm *kvm, p4d_t *p4d, phys_addr_t addr)
+{
+ pud_t *pud_table __maybe_unused = stage2_pud_offset(kvm, p4d, 0);
+ stage2_p4d_clear(kvm, p4d);
+ kvm_tlb_flush_vmid_ipa(kvm, addr);
+ stage2_pud_free(kvm, pud_table);
+ put_page(virt_to_page(p4d));
+}
+
static void clear_stage2_pud_entry(struct kvm *kvm, pud_t *pud, phys_addr_t addr)
{
pmd_t *pmd_table __maybe_unused = stage2_pmd_offset(kvm, pud, 0);
@@ -208,12 +217,20 @@ static inline void kvm_pud_populate(pud_t *pudp, pmd_t *pmdp)
dsb(ishst);
}
-static inline void kvm_pgd_populate(pgd_t *pgdp, pud_t *pudp)
+static inline void kvm_p4d_populate(p4d_t *p4dp, pud_t *pudp)
{
- WRITE_ONCE(*pgdp, kvm_mk_pgd(pudp));
+ WRITE_ONCE(*p4dp, kvm_mk_p4d(pudp));
dsb(ishst);
}
+static inline void kvm_pgd_populate(pgd_t *pgdp, p4d_t *p4dp)
+{
+#ifndef __PAGETABLE_P4D_FOLDED
+ WRITE_ONCE(*pgdp, kvm_mk_pgd(p4dp));
+ dsb(ishst);
+#endif
+}
+
/*
* Unmapping vs dcache management:
*
@@ -293,13 +310,13 @@ static void unmap_stage2_pmds(struct kvm *kvm, pud_t *pud,
clear_stage2_pud_entry(kvm, pud, start_addr);
}
-static void unmap_stage2_puds(struct kvm *kvm, pgd_t *pgd,
+static void unmap_stage2_puds(struct kvm *kvm, p4d_t *p4d,
phys_addr_t addr, phys_addr_t end)
{
phys_addr_t next, start_addr = addr;
pud_t *pud, *start_pud;
- start_pud = pud = stage2_pud_offset(kvm, pgd, addr);
+ start_pud = pud = stage2_pud_offset(kvm, p4d, addr);
do {
next = stage2_pud_addr_end(kvm, addr, end);
if (!stage2_pud_none(kvm, *pud)) {
@@ -317,6 +334,23 @@ static void unmap_stage2_puds(struct kvm *kvm, pgd_t *pgd,
} while (pud++, addr = next, addr != end);
if (stage2_pud_table_empty(kvm, start_pud))
+ clear_stage2_p4d_entry(kvm, p4d, start_addr);
+}
+
+static void unmap_stage2_p4ds(struct kvm *kvm, pgd_t *pgd,
+ phys_addr_t addr, phys_addr_t end)
+{
+ phys_addr_t next, start_addr = addr;
+ p4d_t *p4d, *start_p4d;
+
+ start_p4d = p4d = stage2_p4d_offset(kvm, pgd, addr);
+ do {
+ next = stage2_p4d_addr_end(kvm, addr, end);
+ if (!stage2_p4d_none(kvm, *p4d))
+ unmap_stage2_puds(kvm, p4d, addr, next);
+ } while (p4d++, addr = next, addr != end);
+
+ if (stage2_p4d_table_empty(kvm, start_p4d))
clear_stage2_pgd_entry(kvm, pgd, start_addr);
}
@@ -351,7 +385,7 @@ static void unmap_stage2_range(struct kvm *kvm, phys_addr_t start, u64 size)
break;
next = stage2_pgd_addr_end(kvm, addr, end);
if (!stage2_pgd_none(kvm, *pgd))
- unmap_stage2_puds(kvm, pgd, addr, next);
+ unmap_stage2_p4ds(kvm, pgd, addr, next);
/*
* If the range is too large, release the kvm->mmu_lock
* to prevent starvation and lockup detector warnings.
@@ -391,13 +425,13 @@ static void stage2_flush_pmds(struct kvm *kvm, pud_t *pud,
} while (pmd++, addr = next, addr != end);
}
-static void stage2_flush_puds(struct kvm *kvm, pgd_t *pgd,
+static void stage2_flush_puds(struct kvm *kvm, p4d_t *p4d,
phys_addr_t addr, phys_addr_t end)
{
pud_t *pud;
phys_addr_t next;
- pud = stage2_pud_offset(kvm, pgd, addr);
+ pud = stage2_pud_offset(kvm, p4d, addr);
do {
next = stage2_pud_addr_end(kvm, addr, end);
if (!stage2_pud_none(kvm, *pud)) {
@@ -409,6 +443,20 @@ static void stage2_flush_puds(struct kvm *kvm, pgd_t *pgd,
} while (pud++, addr = next, addr != end);
}
+static void stage2_flush_p4ds(struct kvm *kvm, pgd_t *pgd,
+ phys_addr_t addr, phys_addr_t end)
+{
+ p4d_t *p4d;
+ phys_addr_t next;
+
+ p4d = stage2_p4d_offset(kvm, pgd, addr);
+ do {
+ next = stage2_p4d_addr_end(kvm, addr, end);
+ if (!stage2_p4d_none(kvm, *p4d))
+ stage2_flush_puds(kvm, p4d, addr, next);
+ } while (p4d++, addr = next, addr != end);
+}
+
static void stage2_flush_memslot(struct kvm *kvm,
struct kvm_memory_slot *memslot)
{
@@ -421,7 +469,7 @@ static void stage2_flush_memslot(struct kvm *kvm,
do {
next = stage2_pgd_addr_end(kvm, addr, end);
if (!stage2_pgd_none(kvm, *pgd))
- stage2_flush_puds(kvm, pgd, addr, next);
+ stage2_flush_p4ds(kvm, pgd, addr, next);
} while (pgd++, addr = next, addr != end);
}
@@ -451,12 +499,21 @@ static void stage2_flush_vm(struct kvm *kvm)
static void clear_hyp_pgd_entry(pgd_t *pgd)
{
- pud_t *pud_table __maybe_unused = pud_offset(pgd, 0UL);
+ p4d_t *p4d_table __maybe_unused = p4d_offset(pgd, 0UL);
pgd_clear(pgd);
- pud_free(NULL, pud_table);
+ p4d_free(NULL, p4d_table);
put_page(virt_to_page(pgd));
}
+static void clear_hyp_p4d_entry(p4d_t *p4d)
+{
+ pud_t *pud_table __maybe_unused = pud_offset(p4d, 0);
+ VM_BUG_ON(p4d_huge(*p4d));
+ p4d_clear(p4d);
+ pud_free(NULL, pud_table);
+ put_page(virt_to_page(p4d));
+}
+
static void clear_hyp_pud_entry(pud_t *pud)
{
pmd_t *pmd_table __maybe_unused = pmd_offset(pud, 0);
@@ -508,12 +565,12 @@ static void unmap_hyp_pmds(pud_t *pud, phys_addr_t addr, phys_addr_t end)
clear_hyp_pud_entry(pud);
}
-static void unmap_hyp_puds(pgd_t *pgd, phys_addr_t addr, phys_addr_t end)
+static void unmap_hyp_puds(p4d_t *p4d, phys_addr_t addr, phys_addr_t end)
{
phys_addr_t next;
pud_t *pud, *start_pud;
- start_pud = pud = pud_offset(pgd, addr);
+ start_pud = pud = pud_offset(p4d, addr);
do {
next = pud_addr_end(addr, end);
/* Hyp doesn't use huge puds */
@@ -522,6 +579,23 @@ static void unmap_hyp_puds(pgd_t *pgd, phys_addr_t addr, phys_addr_t end)
} while (pud++, addr = next, addr != end);
if (hyp_pud_table_empty(start_pud))
+ clear_hyp_p4d_entry(p4d);
+}
+
+static void unmap_hyp_p4ds(pgd_t *pgd, phys_addr_t addr, phys_addr_t end)
+{
+ phys_addr_t next;
+ p4d_t *p4d, *start_p4d;
+
+ start_p4d = p4d = p4d_offset(pgd, addr);
+ do {
+ next = p4d_addr_end(addr, end);
+ /* Hyp doesn't use huge p4ds */
+ if (!p4d_none(*p4d))
+ unmap_hyp_puds(p4d, addr, next);
+ } while (p4d++, addr = next, addr != end);
+
+ if (hyp_p4d_table_empty(start_p4d))
clear_hyp_pgd_entry(pgd);
}
@@ -545,7 +619,7 @@ static void __unmap_hyp_range(pgd_t *pgdp, unsigned long ptrs_per_pgd,
do {
next = pgd_addr_end(addr, end);
if (!pgd_none(*pgd))
- unmap_hyp_puds(pgd, addr, next);
+ unmap_hyp_p4ds(pgd, addr, next);
} while (pgd++, addr = next, addr != end);
}
@@ -655,7 +729,7 @@ static int create_hyp_pmd_mappings(pud_t *pud, unsigned long start,
return 0;
}
-static int create_hyp_pud_mappings(pgd_t *pgd, unsigned long start,
+static int create_hyp_pud_mappings(p4d_t *p4d, unsigned long start,
unsigned long end, unsigned long pfn,
pgprot_t prot)
{
@@ -666,7 +740,7 @@ static int create_hyp_pud_mappings(pgd_t *pgd, unsigned long start,
addr = start;
do {
- pud = pud_offset(pgd, addr);
+ pud = pud_offset(p4d, addr);
if (pud_none_or_clear_bad(pud)) {
pmd = pmd_alloc_one(NULL, addr);
@@ -688,12 +762,45 @@ static int create_hyp_pud_mappings(pgd_t *pgd, unsigned long start,
return 0;
}
+static int create_hyp_p4d_mappings(pgd_t *pgd, unsigned long start,
+ unsigned long end, unsigned long pfn,
+ pgprot_t prot)
+{
+ p4d_t *p4d;
+ pud_t *pud;
+ unsigned long addr, next;
+ int ret;
+
+ addr = start;
+ do {
+ p4d = p4d_offset(pgd, addr);
+
+ if (p4d_none(*p4d)) {
+ pud = pud_alloc_one(NULL, addr);
+ if (!pud) {
+ kvm_err("Cannot allocate Hyp pud\n");
+ return -ENOMEM;
+ }
+ kvm_p4d_populate(p4d, pud);
+ get_page(virt_to_page(p4d));
+ }
+
+ next = p4d_addr_end(addr, end);
+ ret = create_hyp_pud_mappings(p4d, addr, next, pfn, prot);
+ if (ret)
+ return ret;
+ pfn += (next - addr) >> PAGE_SHIFT;
+ } while (addr = next, addr != end);
+
+ return 0;
+}
+
static int __create_hyp_mappings(pgd_t *pgdp, unsigned long ptrs_per_pgd,
unsigned long start, unsigned long end,
unsigned long pfn, pgprot_t prot)
{
pgd_t *pgd;
- pud_t *pud;
+ p4d_t *p4d;
unsigned long addr, next;
int err = 0;
@@ -704,18 +811,18 @@ static int __create_hyp_mappings(pgd_t *pgdp, unsigned long ptrs_per_pgd,
pgd = pgdp + kvm_pgd_index(addr, ptrs_per_pgd);
if (pgd_none(*pgd)) {
- pud = pud_alloc_one(NULL, addr);
- if (!pud) {
- kvm_err("Cannot allocate Hyp pud\n");
+ p4d = p4d_alloc_one(NULL, addr);
+ if (!p4d) {
+ kvm_err("Cannot allocate Hyp p4d\n");
err = -ENOMEM;
goto out;
}
- kvm_pgd_populate(pgd, pud);
+ kvm_pgd_populate(pgd, p4d);
get_page(virt_to_page(pgd));
}
next = pgd_addr_end(addr, end);
- err = create_hyp_pud_mappings(pgd, addr, next, pfn, prot);
+ err = create_hyp_p4d_mappings(pgd, addr, next, pfn, prot);
if (err)
goto out;
pfn += (next - addr) >> PAGE_SHIFT;
@@ -1012,22 +1119,40 @@ void kvm_free_stage2_pgd(struct kvm *kvm)
free_pages_exact(pgd, stage2_pgd_size(kvm));
}
-static pud_t *stage2_get_pud(struct kvm *kvm, struct kvm_mmu_memory_cache *cache,
+static p4d_t *stage2_get_p4d(struct kvm *kvm, struct kvm_mmu_memory_cache *cache,
phys_addr_t addr)
{
pgd_t *pgd;
- pud_t *pud;
+ p4d_t *p4d;
pgd = kvm->arch.pgd + stage2_pgd_index(kvm, addr);
if (stage2_pgd_none(kvm, *pgd)) {
if (!cache)
return NULL;
- pud = mmu_memory_cache_alloc(cache);
- stage2_pgd_populate(kvm, pgd, pud);
+ p4d = mmu_memory_cache_alloc(cache);
+ stage2_pgd_populate(kvm, pgd, p4d);
get_page(virt_to_page(pgd));
}
- return stage2_pud_offset(kvm, pgd, addr);
+ return stage2_p4d_offset(kvm, pgd, addr);
+}
+
+static pud_t *stage2_get_pud(struct kvm *kvm, struct kvm_mmu_memory_cache *cache,
+ phys_addr_t addr)
+{
+ p4d_t *p4d;
+ pud_t *pud;
+
+ p4d = stage2_get_p4d(kvm, cache, addr);
+ if (stage2_p4d_none(kvm, *p4d)) {
+ if (!cache)
+ return NULL;
+ pud = mmu_memory_cache_alloc(cache);
+ stage2_p4d_populate(kvm, p4d, pud);
+ get_page(virt_to_page(p4d));
+ }
+
+ return stage2_pud_offset(kvm, p4d, addr);
}
static pmd_t *stage2_get_pmd(struct kvm *kvm, struct kvm_mmu_memory_cache *cache,
@@ -1461,18 +1586,18 @@ static void stage2_wp_pmds(struct kvm *kvm, pud_t *pud,
}
/**
- * stage2_wp_puds - write protect PGD range
+ * stage2_wp_puds - write protect P4D range
* @pgd: pointer to pgd entry
* @addr: range start address
* @end: range end address
*/
-static void stage2_wp_puds(struct kvm *kvm, pgd_t *pgd,
+static void stage2_wp_puds(struct kvm *kvm, p4d_t *p4d,
phys_addr_t addr, phys_addr_t end)
{
pud_t *pud;
phys_addr_t next;
- pud = stage2_pud_offset(kvm, pgd, addr);
+ pud = stage2_pud_offset(kvm, p4d, addr);
do {
next = stage2_pud_addr_end(kvm, addr, end);
if (!stage2_pud_none(kvm, *pud)) {
@@ -1486,6 +1611,26 @@ static void stage2_wp_puds(struct kvm *kvm, pgd_t *pgd,
} while (pud++, addr = next, addr != end);
}
+/**
+ * stage2_wp_p4ds - write protect PGD range
+ * @pgd: pointer to pgd entry
+ * @addr: range start address
+ * @end: range end address
+ */
+static void stage2_wp_p4ds(struct kvm *kvm, pgd_t *pgd,
+ phys_addr_t addr, phys_addr_t end)
+{
+ p4d_t *p4d;
+ phys_addr_t next;
+
+ p4d = stage2_p4d_offset(kvm, pgd, addr);
+ do {
+ next = stage2_p4d_addr_end(kvm, addr, end);
+ if (!stage2_p4d_none(kvm, *p4d))
+ stage2_wp_puds(kvm, p4d, addr, next);
+ } while (p4d++, addr = next, addr != end);
+}
+
/**
* stage2_wp_range() - write protect stage2 memory region range
* @kvm: The KVM pointer
@@ -1513,7 +1658,7 @@ static void stage2_wp_range(struct kvm *kvm, phys_addr_t addr, phys_addr_t end)
break;
next = stage2_pgd_addr_end(kvm, addr, end);
if (stage2_pgd_present(kvm, *pgd))
- stage2_wp_puds(kvm, pgd, addr, next);
+ stage2_wp_p4ds(kvm, pgd, addr, next);
} while (pgd++, addr = next, addr != end);
}
--
2.24.0
From: Mike Rapoport <[email protected]>
h8300 is a nommu architecture and does not require fixups for the upper
layers of the page tables: they are already handled by the generic
nommu implementation.
Remove the definition of __ARCH_USE_5LEVEL_HACK from
arch/h8300/include/asm/pgtable.h.
Signed-off-by: Mike Rapoport <[email protected]>
---
arch/h8300/include/asm/pgtable.h | 1 -
1 file changed, 1 deletion(-)
diff --git a/arch/h8300/include/asm/pgtable.h b/arch/h8300/include/asm/pgtable.h
index 4d00152fab58..f00828720dc4 100644
--- a/arch/h8300/include/asm/pgtable.h
+++ b/arch/h8300/include/asm/pgtable.h
@@ -1,7 +1,6 @@
/* SPDX-License-Identifier: GPL-2.0 */
#ifndef _H8300_PGTABLE_H
#define _H8300_PGTABLE_H
-#define __ARCH_USE_5LEVEL_HACK
#include <asm-generic/pgtable-nopud.h>
#include <asm-generic/pgtable.h>
extern void paging_init(void);
--
2.24.0
From: Mike Rapoport <[email protected]>
The hexagon architecture has two-level page tables, so most of the page
table folding is already implemented in asm-generic/pgtable-nopmd.h.
Fix up the only place in arch/hexagon that open-codes a page table walk
so that it steps through the p4d level, and remove
__ARCH_USE_5LEVEL_HACK.
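On a two-level architecture every intermediate offset is an identity
step, so the longer chain in kmap_get_fixmap_pte() costs nothing at
runtime; conceptually (a sketch with the macro expanded into
statements):

	pgd_t *pgd = pgd_offset_k(vaddr);
	p4d_t *p4d = p4d_offset(pgd, vaddr);	/* folded: (p4d_t *)pgd */
	pud_t *pud = pud_offset(p4d, vaddr);	/* folded: (pud_t *)p4d */
	pmd_t *pmd = pmd_offset(pud, vaddr);	/* folded: (pmd_t *)pud */
	pte_t *pte = pte_offset_kernel(pmd, vaddr);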
Signed-off-by: Mike Rapoport <[email protected]>
---
arch/hexagon/include/asm/fixmap.h | 4 ++--
arch/hexagon/include/asm/pgtable.h | 1 -
2 files changed, 2 insertions(+), 3 deletions(-)
diff --git a/arch/hexagon/include/asm/fixmap.h b/arch/hexagon/include/asm/fixmap.h
index 933dac167504..97b1b062e750 100644
--- a/arch/hexagon/include/asm/fixmap.h
+++ b/arch/hexagon/include/asm/fixmap.h
@@ -16,7 +16,7 @@
#include <asm-generic/fixmap.h>
#define kmap_get_fixmap_pte(vaddr) \
- pte_offset_kernel(pmd_offset(pud_offset(pgd_offset_k(vaddr), \
- (vaddr)), (vaddr)), (vaddr))
+ pte_offset_kernel(pmd_offset(pud_offset(p4d_offset(pgd_offset_k(vaddr), \
+ (vaddr)), (vaddr)), (vaddr)), (vaddr))
#endif
diff --git a/arch/hexagon/include/asm/pgtable.h b/arch/hexagon/include/asm/pgtable.h
index 2fec20ad939e..83b544936eed 100644
--- a/arch/hexagon/include/asm/pgtable.h
+++ b/arch/hexagon/include/asm/pgtable.h
@@ -12,7 +12,6 @@
* Page table definitions for Qualcomm Hexagon processor.
*/
#include <asm/page.h>
-#define __ARCH_USE_5LEVEL_HACK
#include <asm-generic/pgtable-nopmd.h>
/* A handy thing to have if one has the RAM. Declared in head.S */
--
2.24.0
From: Mike Rapoport <[email protected]>
Implement primitives necessary for the 4th level folding, add walks of
the p4d level where appropriate, remove usage of __ARCH_USE_5LEVEL_HACK
and replace 5level-fixup.h with pgtable-nop4d.h.
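The pattern for every converted walker is the same: insert a p4d step
between pgd and pud. A minimal sketch of the resulting lookup order (a
hypothetical helper for illustration, not part of this patch):

	static pte_t *lookup_pte(struct mm_struct *mm, unsigned long addr)
	{
		pgd_t *pgd = pgd_offset(mm, addr);
		p4d_t *p4d;
		pud_t *pud;
		pmd_t *pmd;

		if (pgd_none(*pgd) || pgd_bad(*pgd))
			return NULL;
		p4d = p4d_offset(pgd, addr);		/* the new step */
		if (p4d_none(*p4d) || p4d_bad(*p4d))
			return NULL;
		pud = pud_offset(p4d, addr);		/* now takes a p4d */
		if (pud_none(*pud) || pud_bad(*pud))
			return NULL;
		pmd = pmd_offset(pud, addr);
		if (pmd_none(*pmd) || pmd_bad(*pmd))
			return NULL;
		return pte_offset_kernel(pmd, addr);
	}

On ia64 the p4d level is folded, so the added checks compile to constants
and the generated code does not change.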
Signed-off-by: Mike Rapoport <[email protected]>
---
arch/ia64/include/asm/pgalloc.h | 4 ++--
arch/ia64/include/asm/pgtable.h | 17 ++++++++---------
arch/ia64/mm/fault.c | 7 ++++++-
arch/ia64/mm/hugetlbpage.c | 18 ++++++++++++------
arch/ia64/mm/init.c | 28 ++++++++++++++++++++++++----
5 files changed, 52 insertions(+), 22 deletions(-)
diff --git a/arch/ia64/include/asm/pgalloc.h b/arch/ia64/include/asm/pgalloc.h
index f4c491044882..2a3050345099 100644
--- a/arch/ia64/include/asm/pgalloc.h
+++ b/arch/ia64/include/asm/pgalloc.h
@@ -36,9 +36,9 @@ static inline void pgd_free(struct mm_struct *mm, pgd_t *pgd)
#if CONFIG_PGTABLE_LEVELS == 4
static inline void
-pgd_populate(struct mm_struct *mm, pgd_t * pgd_entry, pud_t * pud)
+p4d_populate(struct mm_struct *mm, p4d_t * p4d_entry, pud_t * pud)
{
- pgd_val(*pgd_entry) = __pa(pud);
+ p4d_val(*p4d_entry) = __pa(pud);
}
static inline pud_t *pud_alloc_one(struct mm_struct *mm, unsigned long addr)
diff --git a/arch/ia64/include/asm/pgtable.h b/arch/ia64/include/asm/pgtable.h
index d602e7c622db..c87f789bc914 100644
--- a/arch/ia64/include/asm/pgtable.h
+++ b/arch/ia64/include/asm/pgtable.h
@@ -283,12 +283,12 @@ extern unsigned long VMALLOC_END;
#define pud_page(pud) virt_to_page((pud_val(pud) + PAGE_OFFSET))
#if CONFIG_PGTABLE_LEVELS == 4
-#define pgd_none(pgd) (!pgd_val(pgd))
-#define pgd_bad(pgd) (!ia64_phys_addr_valid(pgd_val(pgd)))
-#define pgd_present(pgd) (pgd_val(pgd) != 0UL)
-#define pgd_clear(pgdp) (pgd_val(*(pgdp)) = 0UL)
-#define pgd_page_vaddr(pgd) ((unsigned long) __va(pgd_val(pgd) & _PFN_MASK))
-#define pgd_page(pgd) virt_to_page((pgd_val(pgd) + PAGE_OFFSET))
+#define p4d_none(p4d) (!p4d_val(p4d))
+#define p4d_bad(p4d) (!ia64_phys_addr_valid(p4d_val(p4d)))
+#define p4d_present(p4d) (p4d_val(p4d) != 0UL)
+#define p4d_clear(p4dp) (p4d_val(*(p4dp)) = 0UL)
+#define p4d_page_vaddr(p4d) ((unsigned long) __va(p4d_val(p4d) & _PFN_MASK))
+#define p4d_page(p4d) virt_to_page((p4d_val(p4d) + PAGE_OFFSET))
#endif
/*
@@ -388,7 +388,7 @@ pgd_offset (const struct mm_struct *mm, unsigned long address)
#if CONFIG_PGTABLE_LEVELS == 4
/* Find an entry in the second-level page table.. */
#define pud_offset(dir,addr) \
- ((pud_t *) pgd_page_vaddr(*(dir)) + (((addr) >> PUD_SHIFT) & (PTRS_PER_PUD - 1)))
+ ((pud_t *) p4d_page_vaddr(*(dir)) + (((addr) >> PUD_SHIFT) & (PTRS_PER_PUD - 1)))
#endif
/* Find an entry in the third-level page table.. */
@@ -582,10 +582,9 @@ extern struct page *zero_page_memmap_ptr;
#if CONFIG_PGTABLE_LEVELS == 3
-#define __ARCH_USE_5LEVEL_HACK
#include <asm-generic/pgtable-nopud.h>
#endif
-#include <asm-generic/5level-fixup.h>
+#include <asm-generic/pgtable-nop4d.h>
#include <asm-generic/pgtable.h>
#endif /* _ASM_IA64_PGTABLE_H */
diff --git a/arch/ia64/mm/fault.c b/arch/ia64/mm/fault.c
index c2f299fe9e04..ec994135cb74 100644
--- a/arch/ia64/mm/fault.c
+++ b/arch/ia64/mm/fault.c
@@ -29,6 +29,7 @@ static int
mapped_kernel_page_is_present (unsigned long address)
{
pgd_t *pgd;
+ p4d_t *p4d;
pud_t *pud;
pmd_t *pmd;
pte_t *ptep, pte;
@@ -37,7 +38,11 @@ mapped_kernel_page_is_present (unsigned long address)
if (pgd_none(*pgd) || pgd_bad(*pgd))
return 0;
- pud = pud_offset(pgd, address);
+ p4d = p4d_offset(pgd, address);
+ if (p4d_none(*p4d) || p4d_bad(*p4d))
+ return 0;
+
+ pud = pud_offset(p4d, address);
if (pud_none(*pud) || pud_bad(*pud))
return 0;
diff --git a/arch/ia64/mm/hugetlbpage.c b/arch/ia64/mm/hugetlbpage.c
index d16e419fd712..32352a73df0c 100644
--- a/arch/ia64/mm/hugetlbpage.c
+++ b/arch/ia64/mm/hugetlbpage.c
@@ -30,12 +30,14 @@ huge_pte_alloc(struct mm_struct *mm, unsigned long addr, unsigned long sz)
{
unsigned long taddr = htlbpage_to_page(addr);
pgd_t *pgd;
+ p4d_t *p4d;
pud_t *pud;
pmd_t *pmd;
pte_t *pte = NULL;
pgd = pgd_offset(mm, taddr);
- pud = pud_alloc(mm, pgd, taddr);
+ p4d = p4d_offset(pgd, taddr);
+ pud = pud_alloc(mm, p4d, taddr);
if (pud) {
pmd = pmd_alloc(mm, pud, taddr);
if (pmd)
@@ -49,17 +51,21 @@ huge_pte_offset (struct mm_struct *mm, unsigned long addr, unsigned long sz)
{
unsigned long taddr = htlbpage_to_page(addr);
pgd_t *pgd;
+ p4d_t *p4d;
pud_t *pud;
pmd_t *pmd;
pte_t *pte = NULL;
pgd = pgd_offset(mm, taddr);
if (pgd_present(*pgd)) {
- pud = pud_offset(pgd, taddr);
- if (pud_present(*pud)) {
- pmd = pmd_offset(pud, taddr);
- if (pmd_present(*pmd))
- pte = pte_offset_map(pmd, taddr);
+		p4d = p4d_offset(pgd, taddr);
+ if (p4d_present(*p4d)) {
+ pud = pud_offset(p4d, taddr);
+ if (pud_present(*pud)) {
+ pmd = pmd_offset(pud, taddr);
+ if (pmd_present(*pmd))
+ pte = pte_offset_map(pmd, taddr);
+ }
}
}
diff --git a/arch/ia64/mm/init.c b/arch/ia64/mm/init.c
index b01d68a2d5d9..4808f58220ac 100644
--- a/arch/ia64/mm/init.c
+++ b/arch/ia64/mm/init.c
@@ -208,6 +208,7 @@ static struct page * __init
put_kernel_page (struct page *page, unsigned long address, pgprot_t pgprot)
{
pgd_t *pgd;
+ p4d_t *p4d;
pud_t *pud;
pmd_t *pmd;
pte_t *pte;
@@ -215,7 +216,10 @@ put_kernel_page (struct page *page, unsigned long address, pgprot_t pgprot)
pgd = pgd_offset_k(address); /* note: this is NOT pgd_offset()! */
{
- pud = pud_alloc(&init_mm, pgd, address);
+ p4d = p4d_alloc(&init_mm, pgd, address);
+ if (!p4d)
+ goto out;
+ pud = pud_alloc(&init_mm, p4d, address);
if (!pud)
goto out;
pmd = pmd_alloc(&init_mm, pud, address);
@@ -382,6 +386,7 @@ int vmemmap_find_next_valid_pfn(int node, int i)
do {
pgd_t *pgd;
+ p4d_t *p4d;
pud_t *pud;
pmd_t *pmd;
pte_t *pte;
@@ -392,7 +397,13 @@ int vmemmap_find_next_valid_pfn(int node, int i)
continue;
}
- pud = pud_offset(pgd, end_address);
+ p4d = p4d_offset(pgd, end_address);
+ if (p4d_none(*p4d)) {
+ end_address += P4D_SIZE;
+ continue;
+ }
+
+ pud = pud_offset(p4d, end_address);
if (pud_none(*pud)) {
end_address += PUD_SIZE;
continue;
@@ -430,6 +441,7 @@ int __init create_mem_map_page_table(u64 start, u64 end, void *arg)
struct page *map_start, *map_end;
int node;
pgd_t *pgd;
+ p4d_t *p4d;
pud_t *pud;
pmd_t *pmd;
pte_t *pte;
@@ -444,12 +456,20 @@ int __init create_mem_map_page_table(u64 start, u64 end, void *arg)
for (address = start_page; address < end_page; address += PAGE_SIZE) {
pgd = pgd_offset_k(address);
if (pgd_none(*pgd)) {
+ p4d = memblock_alloc_node(PAGE_SIZE, PAGE_SIZE, node);
+ if (!p4d)
+ goto err_alloc;
+ pgd_populate(&init_mm, pgd, p4d);
+ }
+ p4d = p4d_offset(pgd, address);
+
+ if (p4d_none(*p4d)) {
pud = memblock_alloc_node(PAGE_SIZE, PAGE_SIZE, node);
if (!pud)
goto err_alloc;
- pgd_populate(&init_mm, pgd, pud);
+ p4d_populate(&init_mm, p4d, pud);
}
- pud = pud_offset(pgd, address);
+ pud = pud_offset(p4d, address);
if (pud_none(*pud)) {
pmd = memblock_alloc_node(PAGE_SIZE, PAGE_SIZE, node);
--
2.24.0
From: Mike Rapoport <[email protected]>
Implement primitives necessary for the 4th level folding, add walks of
the p4d level where appropriate and remove usage of __ARCH_USE_5LEVEL_HACK.
Signed-off-by: Mike Rapoport <[email protected]>
---
arch/openrisc/include/asm/pgtable.h | 1 -
arch/openrisc/mm/fault.c | 10 ++++++++--
arch/openrisc/mm/init.c | 4 +++-
3 files changed, 11 insertions(+), 4 deletions(-)
diff --git a/arch/openrisc/include/asm/pgtable.h b/arch/openrisc/include/asm/pgtable.h
index 248d22d8faa7..c072943fc721 100644
--- a/arch/openrisc/include/asm/pgtable.h
+++ b/arch/openrisc/include/asm/pgtable.h
@@ -21,7 +21,6 @@
#ifndef __ASM_OPENRISC_PGTABLE_H
#define __ASM_OPENRISC_PGTABLE_H
-#define __ARCH_USE_5LEVEL_HACK
#include <asm-generic/pgtable-nopmd.h>
#ifndef __ASSEMBLY__
diff --git a/arch/openrisc/mm/fault.c b/arch/openrisc/mm/fault.c
index 5d4d3a9691d0..44aa04545de3 100644
--- a/arch/openrisc/mm/fault.c
+++ b/arch/openrisc/mm/fault.c
@@ -296,6 +296,7 @@ asmlinkage void do_page_fault(struct pt_regs *regs, unsigned long address,
int offset = pgd_index(address);
pgd_t *pgd, *pgd_k;
+ p4d_t *p4d, *p4d_k;
pud_t *pud, *pud_k;
pmd_t *pmd, *pmd_k;
pte_t *pte_k;
@@ -322,8 +323,13 @@ asmlinkage void do_page_fault(struct pt_regs *regs, unsigned long address,
* it exists.
*/
- pud = pud_offset(pgd, address);
- pud_k = pud_offset(pgd_k, address);
+ p4d = p4d_offset(pgd, address);
+ p4d_k = p4d_offset(pgd_k, address);
+ if (!p4d_present(*p4d_k))
+ goto no_context;
+
+ pud = pud_offset(p4d, address);
+ pud_k = pud_offset(p4d_k, address);
if (!pud_present(*pud_k))
goto no_context;
diff --git a/arch/openrisc/mm/init.c b/arch/openrisc/mm/init.c
index 1f87b524db78..2536aeae0975 100644
--- a/arch/openrisc/mm/init.c
+++ b/arch/openrisc/mm/init.c
@@ -71,6 +71,7 @@ static void __init map_ram(void)
unsigned long v, p, e;
pgprot_t prot;
pgd_t *pge;
+ p4d_t *p4e;
pud_t *pue;
pmd_t *pme;
pte_t *pte;
@@ -90,7 +91,8 @@ static void __init map_ram(void)
while (p < e) {
int j;
- pue = pud_offset(pge, v);
+ p4e = p4d_offset(pge, v);
+ pue = pud_offset(p4e, v);
pme = pmd_offset(pue, v);
if ((u32) pue != (u32) pge || (u32) pme != (u32) pge) {
--
2.24.0
From: Geert Uytterhoeven <[email protected]>
- Convert from printk() to pr_*(),
- Add missing continuations,
- Use "%llx" to format u64,
- Join multiple prints in show_fault_oops() into a single print (see the
sketch below).
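For illustration, the continuation idiom used throughout looks roughly
like this (a sketch, not a hunk from this patch):

	/* pr_alert() starts a new record; pr_cont() appends to it
	 * without emitting a new log-level prefix. */
	pr_alert("[%08lx] *pgd=%0*llx", addr,
		 (u32)(sizeof(*pgd) * 2), (u64)pgd_val(*pgd));
	if (pgd_bad(*pgd))
		pr_cont("(bad)");
	pr_cont("\n");

A bare printk() between the fragments would start a new message, which is
why every intermediate print becomes pr_cont().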
Signed-off-by: Geert Uytterhoeven <[email protected]>
Signed-off-by: Mike Rapoport <[email protected]>
---
arch/sh/mm/fault.c | 39 ++++++++++++++++++---------------------
1 file changed, 18 insertions(+), 21 deletions(-)
diff --git a/arch/sh/mm/fault.c b/arch/sh/mm/fault.c
index 5f51456f4fc7..a2b0275413e8 100644
--- a/arch/sh/mm/fault.c
+++ b/arch/sh/mm/fault.c
@@ -47,10 +47,10 @@ static void show_pte(struct mm_struct *mm, unsigned long addr)
pgd = swapper_pg_dir;
}
- printk(KERN_ALERT "pgd = %p\n", pgd);
+ pr_alert("pgd = %p\n", pgd);
pgd += pgd_index(addr);
- printk(KERN_ALERT "[%08lx] *pgd=%0*Lx", addr,
- (u32)(sizeof(*pgd) * 2), (u64)pgd_val(*pgd));
+ pr_alert("[%08lx] *pgd=%0*llx", addr, (u32)(sizeof(*pgd) * 2),
+ (u64)pgd_val(*pgd));
do {
pud_t *pud;
@@ -61,33 +61,33 @@ static void show_pte(struct mm_struct *mm, unsigned long addr)
break;
if (pgd_bad(*pgd)) {
- printk("(bad)");
+ pr_cont("(bad)");
break;
}
pud = pud_offset(pgd, addr);
if (PTRS_PER_PUD != 1)
- printk(", *pud=%0*Lx", (u32)(sizeof(*pud) * 2),
- (u64)pud_val(*pud));
+ pr_cont(", *pud=%0*llx", (u32)(sizeof(*pud) * 2),
+ (u64)pud_val(*pud));
if (pud_none(*pud))
break;
if (pud_bad(*pud)) {
- printk("(bad)");
+ pr_cont("(bad)");
break;
}
pmd = pmd_offset(pud, addr);
if (PTRS_PER_PMD != 1)
- printk(", *pmd=%0*Lx", (u32)(sizeof(*pmd) * 2),
- (u64)pmd_val(*pmd));
+ pr_cont(", *pmd=%0*llx", (u32)(sizeof(*pmd) * 2),
+ (u64)pmd_val(*pmd));
if (pmd_none(*pmd))
break;
if (pmd_bad(*pmd)) {
- printk("(bad)");
+ pr_cont("(bad)");
break;
}
@@ -96,11 +96,11 @@ static void show_pte(struct mm_struct *mm, unsigned long addr)
break;
pte = pte_offset_kernel(pmd, addr);
- printk(", *pte=%0*Lx", (u32)(sizeof(*pte) * 2),
- (u64)pte_val(*pte));
+ pr_cont(", *pte=%0*llx", (u32)(sizeof(*pte) * 2),
+ (u64)pte_val(*pte));
} while (0);
- printk("\n");
+ pr_cont("\n");
}
static inline pmd_t *vmalloc_sync_one(pgd_t *pgd, unsigned long address)
@@ -188,14 +188,11 @@ show_fault_oops(struct pt_regs *regs, unsigned long address)
if (!oops_may_print())
return;
- printk(KERN_ALERT "BUG: unable to handle kernel ");
- if (address < PAGE_SIZE)
- printk(KERN_CONT "NULL pointer dereference");
- else
- printk(KERN_CONT "paging request");
-
- printk(KERN_CONT " at %08lx\n", address);
- printk(KERN_ALERT "PC:");
+ pr_alert("BUG: unable to handle kernel %s at %08lx\n",
+ address < PAGE_SIZE ? "NULL pointer dereference"
+ : "paging request",
+ address);
+ pr_alert("PC:");
printk_address(regs->pc, 1);
show_pte(NULL, address);
--
2.24.0
From: Mike Rapoport <[email protected]>
The __pXd_offset() macros are identical to the pXd_index() macros and there
is no point in keeping both of them. All architectures define and use
pXd_index(), so let's keep only those to make sh consistent with the rest
of the kernel.
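Both spellings expand to the same index computation. From the 32-bit
header touched below:

	#define pgd_index(address)	(((address) >> PGDIR_SHIFT) & (PTRS_PER_PGD-1))
	#define __pgd_offset(address)	pgd_index(address)

so a caller such as

	pgd = pgd_base + __pgd_offset(vaddr);

can switch to pgd_index(vaddr) with no change in behavior.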
Signed-off-by: Mike Rapoport <[email protected]>
---
arch/sh/include/asm/pgtable_32.h | 5 ++---
arch/sh/include/asm/pgtable_64.h | 5 ++---
arch/sh/mm/init.c | 6 +++---
3 files changed, 7 insertions(+), 9 deletions(-)
diff --git a/arch/sh/include/asm/pgtable_32.h b/arch/sh/include/asm/pgtable_32.h
index 29274f0e428e..4acce5f2cbf9 100644
--- a/arch/sh/include/asm/pgtable_32.h
+++ b/arch/sh/include/asm/pgtable_32.h
@@ -407,13 +407,12 @@ static inline pte_t pte_modify(pte_t pte, pgprot_t newprot)
/* to find an entry in a page-table-directory. */
#define pgd_index(address) (((address) >> PGDIR_SHIFT) & (PTRS_PER_PGD-1))
#define pgd_offset(mm, address) ((mm)->pgd + pgd_index(address))
-#define __pgd_offset(address) pgd_index(address)
/* to find an entry in a kernel page-table-directory */
#define pgd_offset_k(address) pgd_offset(&init_mm, address)
-#define __pud_offset(address) (((address) >> PUD_SHIFT) & (PTRS_PER_PUD-1))
-#define __pmd_offset(address) (((address) >> PMD_SHIFT) & (PTRS_PER_PMD-1))
+#define pud_index(address) (((address) >> PUD_SHIFT) & (PTRS_PER_PUD-1))
+#define pmd_index(address) (((address) >> PMD_SHIFT) & (PTRS_PER_PMD-1))
/* Find an entry in the third-level page table.. */
#define pte_index(address) ((address >> PAGE_SHIFT) & (PTRS_PER_PTE - 1))
diff --git a/arch/sh/include/asm/pgtable_64.h b/arch/sh/include/asm/pgtable_64.h
index 1778bc5971e7..27cc282ec6c0 100644
--- a/arch/sh/include/asm/pgtable_64.h
+++ b/arch/sh/include/asm/pgtable_64.h
@@ -46,14 +46,13 @@ static __inline__ void set_pte(pte_t *pteptr, pte_t pteval)
/* To find an entry in a generic PGD. */
#define pgd_index(address) (((address) >> PGDIR_SHIFT) & (PTRS_PER_PGD-1))
-#define __pgd_offset(address) pgd_index(address)
#define pgd_offset(mm, address) ((mm)->pgd+pgd_index(address))
/* To find an entry in a kernel PGD. */
#define pgd_offset_k(address) pgd_offset(&init_mm, address)
-#define __pud_offset(address) (((address) >> PUD_SHIFT) & (PTRS_PER_PUD-1))
-#define __pmd_offset(address) (((address) >> PMD_SHIFT) & (PTRS_PER_PMD-1))
+#define pud_index(address) (((address) >> PUD_SHIFT) & (PTRS_PER_PUD-1))
+/* #define pmd_index(address) (((address) >> PMD_SHIFT) & (PTRS_PER_PMD-1)) */
/*
* PMD level access routines. Same notes as above.
diff --git a/arch/sh/mm/init.c b/arch/sh/mm/init.c
index d1b1ff2be17a..4bab79baee75 100644
--- a/arch/sh/mm/init.c
+++ b/arch/sh/mm/init.c
@@ -172,9 +172,9 @@ void __init page_table_range_init(unsigned long start, unsigned long end,
unsigned long vaddr;
vaddr = start;
- i = __pgd_offset(vaddr);
- j = __pud_offset(vaddr);
- k = __pmd_offset(vaddr);
+ i = pgd_index(vaddr);
+ j = pud_index(vaddr);
+ k = pmd_index(vaddr);
pgd = pgd_base + i;
for ( ; (i < PTRS_PER_PGD) && (vaddr != end); pgd++, i++) {
--
2.24.0
From: Mike Rapoport <[email protected]>
Implement primitives necessary for the 4th level folding, add walks of
the p4d level where appropriate and remove usage of __ARCH_USE_5LEVEL_HACK.
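The ioremap change below leans on the generic p4d_alloc() from
include/linux/mm.h, which reads (abridged):

	static inline p4d_t *p4d_alloc(struct mm_struct *mm, pgd_t *pgd,
				       unsigned long address)
	{
		return (unlikely(pgd_none(*pgd)) && __p4d_alloc(mm, pgd, address)) ?
			NULL : p4d_offset(pgd, address);
	}

With the p4d level folded, pgd_none() is a constant 0, so this never
allocates and reduces to p4d_offset(); the new NULL check in
remap_area_pages() is cheap rather than a real new failure path.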
Signed-off-by: Mike Rapoport <[email protected]>
---
arch/nios2/include/asm/pgtable.h | 3 +--
arch/nios2/mm/fault.c | 9 +++++++--
arch/nios2/mm/ioremap.c | 6 +++++-
3 files changed, 13 insertions(+), 5 deletions(-)
diff --git a/arch/nios2/include/asm/pgtable.h b/arch/nios2/include/asm/pgtable.h
index 99985d8b7166..54305aa09b74 100644
--- a/arch/nios2/include/asm/pgtable.h
+++ b/arch/nios2/include/asm/pgtable.h
@@ -22,7 +22,6 @@
#include <asm/tlbflush.h>
#include <asm/pgtable-bits.h>
-#define __ARCH_USE_5LEVEL_HACK
#include <asm-generic/pgtable-nopmd.h>
#define FIRST_USER_ADDRESS 0UL
@@ -100,7 +99,7 @@ extern pte_t invalid_pte_table[PAGE_SIZE/sizeof(pte_t)];
*/
static inline void set_pmd(pmd_t *pmdptr, pmd_t pmdval)
{
- pmdptr->pud.pgd.pgd = pmdval.pud.pgd.pgd;
+ *pmdptr = pmdval;
}
/* to find an entry in a page-table-directory */
diff --git a/arch/nios2/mm/fault.c b/arch/nios2/mm/fault.c
index 6a2e716b959f..d3da995665c3 100644
--- a/arch/nios2/mm/fault.c
+++ b/arch/nios2/mm/fault.c
@@ -245,6 +245,7 @@ asmlinkage void do_page_fault(struct pt_regs *regs, unsigned long cause,
*/
int offset = pgd_index(address);
pgd_t *pgd, *pgd_k;
+ p4d_t *p4d, *p4d_k;
pud_t *pud, *pud_k;
pmd_t *pmd, *pmd_k;
pte_t *pte_k;
@@ -256,8 +257,12 @@ asmlinkage void do_page_fault(struct pt_regs *regs, unsigned long cause,
goto no_context;
set_pgd(pgd, *pgd_k);
- pud = pud_offset(pgd, address);
- pud_k = pud_offset(pgd_k, address);
+ p4d = p4d_offset(pgd, address);
+ p4d_k = p4d_offset(pgd_k, address);
+ if (!p4d_present(*p4d_k))
+ goto no_context;
+ pud = pud_offset(p4d, address);
+ pud_k = pud_offset(p4d_k, address);
if (!pud_present(*pud_k))
goto no_context;
pmd = pmd_offset(pud, address);
diff --git a/arch/nios2/mm/ioremap.c b/arch/nios2/mm/ioremap.c
index 819bdfcc2e71..fe821efb9a99 100644
--- a/arch/nios2/mm/ioremap.c
+++ b/arch/nios2/mm/ioremap.c
@@ -86,11 +86,15 @@ static int remap_area_pages(unsigned long address, unsigned long phys_addr,
if (address >= end)
BUG();
do {
+ p4d_t *p4d;
pud_t *pud;
pmd_t *pmd;
error = -ENOMEM;
- pud = pud_alloc(&init_mm, dir, address);
+ p4d = p4d_alloc(&init_mm, dir, address);
+ if (!p4d)
+ break;
+ pud = pud_alloc(&init_mm, p4d, address);
if (!pud)
break;
pmd = pmd_alloc(&init_mm, pud, address);
--
2.24.0
From: Mike Rapoport <[email protected]>
The unicore32 architecture has two-level page tables and uses
asm-generic/pgtable-nopmd.h along with explicit casts from pud_t to pgd_t
for page table folding.
Add p4d walk in the only place that actually unfolds the pud level and
remove __ARCH_USE_5LEVEL_HACK.
Signed-off-by: Mike Rapoport <[email protected]>
---
arch/unicore32/include/asm/pgtable.h | 1 -
arch/unicore32/kernel/hibernate.c | 4 +++-
2 files changed, 3 insertions(+), 2 deletions(-)
diff --git a/arch/unicore32/include/asm/pgtable.h b/arch/unicore32/include/asm/pgtable.h
index c8f7ba12f309..82030c32fc05 100644
--- a/arch/unicore32/include/asm/pgtable.h
+++ b/arch/unicore32/include/asm/pgtable.h
@@ -9,7 +9,6 @@
#ifndef __UNICORE_PGTABLE_H__
#define __UNICORE_PGTABLE_H__
-#define __ARCH_USE_5LEVEL_HACK
#include <asm-generic/pgtable-nopmd.h>
#include <asm/cpu-single.h>
diff --git a/arch/unicore32/kernel/hibernate.c b/arch/unicore32/kernel/hibernate.c
index f3812245cc00..ccad051a79b6 100644
--- a/arch/unicore32/kernel/hibernate.c
+++ b/arch/unicore32/kernel/hibernate.c
@@ -33,9 +33,11 @@ struct swsusp_arch_regs swsusp_arch_regs_cpu0;
static pmd_t *resume_one_md_table_init(pgd_t *pgd)
{
pud_t *pud;
+ p4d_t *p4d;
pmd_t *pmd_table;
- pud = pud_offset(pgd, 0);
+ p4d = p4d_offset(pgd, 0);
+ pud = pud_offset(p4d, 0);
pmd_table = pmd_offset(pud, 0);
return pmd_table;
--
2.24.0
From: Mike Rapoport <[email protected]>
No architecture defines __ARCH_USE_5LEVEL_HACK and therefore
pgtable-nop4d-hack.h will never actually be included.
Remove it.
Signed-off-by: Mike Rapoport <[email protected]>
---
include/asm-generic/pgtable-nop4d-hack.h | 64 ------------------------
include/asm-generic/pgtable-nopud.h | 4 --
2 files changed, 68 deletions(-)
delete mode 100644 include/asm-generic/pgtable-nop4d-hack.h
diff --git a/include/asm-generic/pgtable-nop4d-hack.h b/include/asm-generic/pgtable-nop4d-hack.h
deleted file mode 100644
index 829bdb0d6327..000000000000
--- a/include/asm-generic/pgtable-nop4d-hack.h
+++ /dev/null
@@ -1,64 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-#ifndef _PGTABLE_NOP4D_HACK_H
-#define _PGTABLE_NOP4D_HACK_H
-
-#ifndef __ASSEMBLY__
-#include <asm-generic/5level-fixup.h>
-
-#define __PAGETABLE_PUD_FOLDED 1
-
-/*
- * Having the pud type consist of a pgd gets the size right, and allows
- * us to conceptually access the pgd entry that this pud is folded into
- * without casting.
- */
-typedef struct { pgd_t pgd; } pud_t;
-
-#define PUD_SHIFT PGDIR_SHIFT
-#define PTRS_PER_PUD 1
-#define PUD_SIZE (1UL << PUD_SHIFT)
-#define PUD_MASK (~(PUD_SIZE-1))
-
-/*
- * The "pgd_xxx()" functions here are trivial for a folded two-level
- * setup: the pud is never bad, and a pud always exists (as it's folded
- * into the pgd entry)
- */
-static inline int pgd_none(pgd_t pgd) { return 0; }
-static inline int pgd_bad(pgd_t pgd) { return 0; }
-static inline int pgd_present(pgd_t pgd) { return 1; }
-static inline void pgd_clear(pgd_t *pgd) { }
-#define pud_ERROR(pud) (pgd_ERROR((pud).pgd))
-
-#define pgd_populate(mm, pgd, pud) do { } while (0)
-#define pgd_populate_safe(mm, pgd, pud) do { } while (0)
-/*
- * (puds are folded into pgds so this doesn't get actually called,
- * but the define is needed for a generic inline function.)
- */
-#define set_pgd(pgdptr, pgdval) set_pud((pud_t *)(pgdptr), (pud_t) { pgdval })
-
-static inline pud_t *pud_offset(pgd_t *pgd, unsigned long address)
-{
- return (pud_t *)pgd;
-}
-
-#define pud_val(x) (pgd_val((x).pgd))
-#define __pud(x) ((pud_t) { __pgd(x) })
-
-#define pgd_page(pgd) (pud_page((pud_t){ pgd }))
-#define pgd_page_vaddr(pgd) (pud_page_vaddr((pud_t){ pgd }))
-
-/*
- * allocating and freeing a pud is trivial: the 1-entry pud is
- * inside the pgd, so has no extra memory associated with it.
- */
-#define pud_alloc_one(mm, address) NULL
-#define pud_free(mm, x) do { } while (0)
-#define __pud_free_tlb(tlb, x, a) do { } while (0)
-
-#undef pud_addr_end
-#define pud_addr_end(addr, end) (end)
-
-#endif /* __ASSEMBLY__ */
-#endif /* _PGTABLE_NOP4D_HACK_H */
diff --git a/include/asm-generic/pgtable-nopud.h b/include/asm-generic/pgtable-nopud.h
index d3776cb494c0..ad05c1684bfc 100644
--- a/include/asm-generic/pgtable-nopud.h
+++ b/include/asm-generic/pgtable-nopud.h
@@ -4,9 +4,6 @@
#ifndef __ASSEMBLY__
-#ifdef __ARCH_USE_5LEVEL_HACK
-#include <asm-generic/pgtable-nop4d-hack.h>
-#else
#include <asm-generic/pgtable-nop4d.h>
#define __PAGETABLE_PUD_FOLDED 1
@@ -65,5 +62,4 @@ static inline pud_t *pud_offset(p4d_t *p4d, unsigned long address)
#define pud_addr_end(addr, end) (end)
#endif /* __ASSEMBLY__ */
-#endif /* !__ARCH_USE_5LEVEL_HACK */
#endif /* _PGTABLE_NOPUD_H */
--
2.24.0
From: Mike Rapoport <[email protected]>
There are no architectures that use include/asm-generic/5level-fixup.h,
therefore it can be removed along with the __ARCH_HAS_5LEVEL_HACK define
and the code that depends on it.
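For context, the difference between the two schemes in sketch form: the
fixup header aliased the types outright,

	#define p4d_t			pgd_t
	#define p4d_offset(pgd, start)	(pgd)

while pgtable-nop4d.h wraps the pgd entry in a distinct type,

	typedef struct { pgd_t pgd; } p4d_t;

so generic code can walk pgd -> p4d -> pud with full type checking and
without per-architecture #ifdefs; that is what lets the hack go away.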
Signed-off-by: Mike Rapoport <[email protected]>
---
include/asm-generic/5level-fixup.h | 58 ------------------------------
include/linux/mm.h | 6 ----
mm/kasan/init.c | 11 ------
mm/memory.c | 8 -----
4 files changed, 83 deletions(-)
delete mode 100644 include/asm-generic/5level-fixup.h
diff --git a/include/asm-generic/5level-fixup.h b/include/asm-generic/5level-fixup.h
deleted file mode 100644
index 4c74b1c1d13b..000000000000
--- a/include/asm-generic/5level-fixup.h
+++ /dev/null
@@ -1,58 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-#ifndef _5LEVEL_FIXUP_H
-#define _5LEVEL_FIXUP_H
-
-#define __ARCH_HAS_5LEVEL_HACK
-#define __PAGETABLE_P4D_FOLDED 1
-
-#define P4D_SHIFT PGDIR_SHIFT
-#define P4D_SIZE PGDIR_SIZE
-#define P4D_MASK PGDIR_MASK
-#define MAX_PTRS_PER_P4D 1
-#define PTRS_PER_P4D 1
-
-#define p4d_t pgd_t
-
-#define pud_alloc(mm, p4d, address) \
- ((unlikely(pgd_none(*(p4d))) && __pud_alloc(mm, p4d, address)) ? \
- NULL : pud_offset(p4d, address))
-
-#define p4d_alloc(mm, pgd, address) (pgd)
-#define p4d_offset(pgd, start) (pgd)
-
-#ifndef __ASSEMBLY__
-static inline int p4d_none(p4d_t p4d)
-{
- return 0;
-}
-
-static inline int p4d_bad(p4d_t p4d)
-{
- return 0;
-}
-
-static inline int p4d_present(p4d_t p4d)
-{
- return 1;
-}
-#endif
-
-#define p4d_ERROR(p4d) do { } while (0)
-#define p4d_clear(p4d) pgd_clear(p4d)
-#define p4d_val(p4d) pgd_val(p4d)
-#define p4d_populate(mm, p4d, pud) pgd_populate(mm, p4d, pud)
-#define p4d_populate_safe(mm, p4d, pud) pgd_populate(mm, p4d, pud)
-#define p4d_page(p4d) pgd_page(p4d)
-#define p4d_page_vaddr(p4d) pgd_page_vaddr(p4d)
-
-#define __p4d(x) __pgd(x)
-#define set_p4d(p4dp, p4d) set_pgd(p4dp, p4d)
-
-#undef p4d_free_tlb
-#define p4d_free_tlb(tlb, x, addr) do { } while (0)
-#define p4d_free(mm, x) do { } while (0)
-
-#undef p4d_addr_end
-#define p4d_addr_end(addr, end) (end)
-
-#endif
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 52269e56c514..69fb46e1d91b 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1841,11 +1841,6 @@ int __pte_alloc_kernel(pmd_t *pmd);
#if defined(CONFIG_MMU)
-/*
- * The following ifdef needed to get the 5level-fixup.h header to work.
- * Remove it when 5level-fixup.h has been removed.
- */
-#ifndef __ARCH_HAS_5LEVEL_HACK
static inline p4d_t *p4d_alloc(struct mm_struct *mm, pgd_t *pgd,
unsigned long address)
{
@@ -1859,7 +1854,6 @@ static inline pud_t *pud_alloc(struct mm_struct *mm, p4d_t *p4d,
return (unlikely(p4d_none(*p4d)) && __pud_alloc(mm, p4d, address)) ?
NULL : pud_offset(p4d, address);
}
-#endif /* !__ARCH_HAS_5LEVEL_HACK */
static inline pmd_t *pmd_alloc(struct mm_struct *mm, pud_t *pud, unsigned long address)
{
diff --git a/mm/kasan/init.c b/mm/kasan/init.c
index ce45c491ebcd..fe6be0be1f76 100644
--- a/mm/kasan/init.c
+++ b/mm/kasan/init.c
@@ -250,20 +250,9 @@ int __ref kasan_populate_early_shadow(const void *shadow_start,
* 3,2 - level page tables where we don't have
* puds,pmds, so pgd_populate(), pud_populate()
* is noops.
- *
- * The ifndef is required to avoid build breakage.
- *
- * With 5level-fixup.h, pgd_populate() is not nop and
- * we reference kasan_early_shadow_p4d. It's not defined
- * unless 5-level paging enabled.
- *
- * The ifndef can be dropped once all KASAN-enabled
- * architectures will switch to pgtable-nop4d.h.
*/
-#ifndef __ARCH_HAS_5LEVEL_HACK
pgd_populate(&init_mm, pgd,
lm_alias(kasan_early_shadow_p4d));
-#endif
p4d = p4d_offset(pgd, addr);
p4d_populate(&init_mm, p4d,
lm_alias(kasan_early_shadow_pud));
diff --git a/mm/memory.c b/mm/memory.c
index 0bccc622e482..10cc147db1b8 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -4252,19 +4252,11 @@ int __pud_alloc(struct mm_struct *mm, p4d_t *p4d, unsigned long address)
smp_wmb(); /* See comment in __pte_alloc */
spin_lock(&mm->page_table_lock);
-#ifndef __ARCH_HAS_5LEVEL_HACK
if (!p4d_present(*p4d)) {
mm_inc_nr_puds(mm);
p4d_populate(mm, p4d, new);
} else /* Another has populated it */
pud_free(mm, new);
-#else
- if (!pgd_present(*p4d)) {
- mm_inc_nr_puds(mm);
- pgd_populate(mm, p4d, new);
- } else /* Another has populated it */
- pud_free(mm, new);
-#endif /* __ARCH_HAS_5LEVEL_HACK */
spin_unlock(&mm->page_table_lock);
return 0;
}
--
2.24.0
From: Mike Rapoport <[email protected]>
Implement primitives necessary for the 4th level folding, add walks of
the p4d level where appropriate and replace 5level-fixup.h with
pgtable-nop4d.h.
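The new powerpc primitives are thin wrappers around the existing pgd
representation: with the folded p4d_t being a one-member struct (roughly,
from pgtable-nop4d.h),

	typedef struct { pgd_t pgd; } p4d_t;

the accessors simply forward to their pgd counterparts, e.g. from the
book3s/64 hunk below:

	static inline __be64 p4d_raw(p4d_t x)
	{
		return pgd_raw(x.pgd);
	}

and each caller gains one p4d_offset() step between pgd_offset() and
pud_offset().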
Signed-off-by: Mike Rapoport <[email protected]>
Tested-by: Christophe Leroy <[email protected]> # 8xx and 83xx
---
arch/powerpc/include/asm/book3s/32/pgtable.h | 1 -
arch/powerpc/include/asm/book3s/64/hash.h | 4 +-
arch/powerpc/include/asm/book3s/64/pgalloc.h | 4 +-
arch/powerpc/include/asm/book3s/64/pgtable.h | 58 ++++++++++--------
arch/powerpc/include/asm/book3s/64/radix.h | 6 +-
arch/powerpc/include/asm/nohash/32/pgtable.h | 1 -
arch/powerpc/include/asm/nohash/64/pgalloc.h | 2 +-
.../include/asm/nohash/64/pgtable-4k.h | 32 +++++-----
arch/powerpc/include/asm/nohash/64/pgtable.h | 6 +-
arch/powerpc/include/asm/pgtable.h | 8 +++
arch/powerpc/kvm/book3s_64_mmu_radix.c | 59 ++++++++++++++++---
arch/powerpc/lib/code-patching.c | 7 ++-
arch/powerpc/mm/book3s32/mmu.c | 2 +-
arch/powerpc/mm/book3s32/tlb.c | 4 +-
arch/powerpc/mm/book3s64/hash_pgtable.c | 4 +-
arch/powerpc/mm/book3s64/radix_pgtable.c | 19 ++++--
arch/powerpc/mm/book3s64/subpage_prot.c | 6 +-
arch/powerpc/mm/hugetlbpage.c | 28 +++++----
arch/powerpc/mm/kasan/kasan_init_32.c | 8 +--
arch/powerpc/mm/mem.c | 4 +-
arch/powerpc/mm/nohash/40x.c | 4 +-
arch/powerpc/mm/nohash/book3e_pgtable.c | 15 +++--
arch/powerpc/mm/pgtable.c | 25 +++++++-
arch/powerpc/mm/pgtable_32.c | 28 +++++----
arch/powerpc/mm/pgtable_64.c | 10 ++--
arch/powerpc/mm/ptdump/hashpagetable.c | 20 ++++++-
arch/powerpc/mm/ptdump/ptdump.c | 22 ++++++-
arch/powerpc/xmon/xmon.c | 17 +++++-
28 files changed, 284 insertions(+), 120 deletions(-)
diff --git a/arch/powerpc/include/asm/book3s/32/pgtable.h b/arch/powerpc/include/asm/book3s/32/pgtable.h
index 5b39c11e884a..39ec11371be0 100644
--- a/arch/powerpc/include/asm/book3s/32/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/32/pgtable.h
@@ -2,7 +2,6 @@
#ifndef _ASM_POWERPC_BOOK3S_32_PGTABLE_H
#define _ASM_POWERPC_BOOK3S_32_PGTABLE_H
-#define __ARCH_USE_5LEVEL_HACK
#include <asm-generic/pgtable-nopmd.h>
#include <asm/book3s/32/hash.h>
diff --git a/arch/powerpc/include/asm/book3s/64/hash.h b/arch/powerpc/include/asm/book3s/64/hash.h
index 2781ebf6add4..876d1528c2cf 100644
--- a/arch/powerpc/include/asm/book3s/64/hash.h
+++ b/arch/powerpc/include/asm/book3s/64/hash.h
@@ -134,9 +134,9 @@ static inline int get_region_id(unsigned long ea)
#define hash__pmd_bad(pmd) (pmd_val(pmd) & H_PMD_BAD_BITS)
#define hash__pud_bad(pud) (pud_val(pud) & H_PUD_BAD_BITS)
-static inline int hash__pgd_bad(pgd_t pgd)
+static inline int hash__p4d_bad(p4d_t p4d)
{
- return (pgd_val(pgd) == 0);
+ return (p4d_val(p4d) == 0);
}
#ifdef CONFIG_STRICT_KERNEL_RWX
extern void hash__mark_rodata_ro(void);
diff --git a/arch/powerpc/include/asm/book3s/64/pgalloc.h b/arch/powerpc/include/asm/book3s/64/pgalloc.h
index a41e91bd0580..69c5b051734f 100644
--- a/arch/powerpc/include/asm/book3s/64/pgalloc.h
+++ b/arch/powerpc/include/asm/book3s/64/pgalloc.h
@@ -85,9 +85,9 @@ static inline void pgd_free(struct mm_struct *mm, pgd_t *pgd)
kmem_cache_free(PGT_CACHE(PGD_INDEX_SIZE), pgd);
}
-static inline void pgd_populate(struct mm_struct *mm, pgd_t *pgd, pud_t *pud)
+static inline void p4d_populate(struct mm_struct *mm, p4d_t *p4d, pud_t *pud)
 {
-	*pgd = __pgd(__pgtable_ptr_val(pud) | PGD_VAL_BITS);
+	*p4d = __p4d(__pgtable_ptr_val(pud) | PGD_VAL_BITS);
}
static inline pud_t *pud_alloc_one(struct mm_struct *mm, unsigned long addr)
diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
index 201a69e6a355..ddddbafff0ab 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -2,7 +2,7 @@
#ifndef _ASM_POWERPC_BOOK3S_64_PGTABLE_H_
#define _ASM_POWERPC_BOOK3S_64_PGTABLE_H_
-#include <asm-generic/5level-fixup.h>
+#include <asm-generic/pgtable-nop4d.h>
#ifndef __ASSEMBLY__
#include <linux/mmdebug.h>
@@ -251,7 +251,7 @@ extern unsigned long __pmd_frag_size_shift;
/* Bits to mask out from a PUD to get to the PMD page */
#define PUD_MASKED_BITS 0xc0000000000000ffUL
/* Bits to mask out from a PGD to get to the PUD page */
-#define PGD_MASKED_BITS 0xc0000000000000ffUL
+#define P4D_MASKED_BITS 0xc0000000000000ffUL
/*
* Used as an indicator for rcu callback functions
@@ -949,54 +949,60 @@ static inline bool pud_access_permitted(pud_t pud, bool write)
return pte_access_permitted(pud_pte(pud), write);
}
-#define pgd_write(pgd) pte_write(pgd_pte(pgd))
+#define __p4d_raw(x) ((p4d_t) { __pgd_raw(x) })
+static inline __be64 p4d_raw(p4d_t x)
+{
+ return pgd_raw(x.pgd);
+}
+
+#define p4d_write(p4d) pte_write(p4d_pte(p4d))
-static inline void pgd_clear(pgd_t *pgdp)
+static inline void p4d_clear(p4d_t *p4dp)
{
- *pgdp = __pgd(0);
+ *p4dp = __p4d(0);
}
-static inline int pgd_none(pgd_t pgd)
+static inline int p4d_none(p4d_t p4d)
{
- return !pgd_raw(pgd);
+ return !p4d_raw(p4d);
}
-static inline int pgd_present(pgd_t pgd)
+static inline int p4d_present(p4d_t p4d)
{
- return !!(pgd_raw(pgd) & cpu_to_be64(_PAGE_PRESENT));
+ return !!(p4d_raw(p4d) & cpu_to_be64(_PAGE_PRESENT));
}
-static inline pte_t pgd_pte(pgd_t pgd)
+static inline pte_t p4d_pte(p4d_t p4d)
{
- return __pte_raw(pgd_raw(pgd));
+ return __pte_raw(p4d_raw(p4d));
}
-static inline pgd_t pte_pgd(pte_t pte)
+static inline p4d_t pte_p4d(pte_t pte)
{
- return __pgd_raw(pte_raw(pte));
+ return __p4d_raw(pte_raw(pte));
}
-static inline int pgd_bad(pgd_t pgd)
+static inline int p4d_bad(p4d_t p4d)
{
if (radix_enabled())
- return radix__pgd_bad(pgd);
- return hash__pgd_bad(pgd);
+ return radix__p4d_bad(p4d);
+ return hash__p4d_bad(p4d);
}
-#define pgd_access_permitted pgd_access_permitted
-static inline bool pgd_access_permitted(pgd_t pgd, bool write)
+#define p4d_access_permitted p4d_access_permitted
+static inline bool p4d_access_permitted(p4d_t p4d, bool write)
{
- return pte_access_permitted(pgd_pte(pgd), write);
+ return pte_access_permitted(p4d_pte(p4d), write);
}
-extern struct page *pgd_page(pgd_t pgd);
+extern struct page *p4d_page(p4d_t p4d);
/* Pointers in the page table tree are physical addresses */
#define __pgtable_ptr_val(ptr) __pa(ptr)
#define pmd_page_vaddr(pmd) __va(pmd_val(pmd) & ~PMD_MASKED_BITS)
#define pud_page_vaddr(pud) __va(pud_val(pud) & ~PUD_MASKED_BITS)
-#define pgd_page_vaddr(pgd) __va(pgd_val(pgd) & ~PGD_MASKED_BITS)
+#define p4d_page_vaddr(p4d) __va(p4d_val(p4d) & ~P4D_MASKED_BITS)
#define pgd_index(address) (((address) >> (PGDIR_SHIFT)) & (PTRS_PER_PGD - 1))
#define pud_index(address) (((address) >> (PUD_SHIFT)) & (PTRS_PER_PUD - 1))
@@ -1010,8 +1016,8 @@ extern struct page *pgd_page(pgd_t pgd);
#define pgd_offset(mm, address) ((mm)->pgd + pgd_index(address))
-#define pud_offset(pgdp, addr) \
- (((pud_t *) pgd_page_vaddr(*(pgdp))) + pud_index(addr))
+#define pud_offset(p4dp, addr) \
+ (((pud_t *) p4d_page_vaddr(*(p4dp))) + pud_index(addr))
#define pmd_offset(pudp,addr) \
(((pmd_t *) pud_page_vaddr(*(pudp))) + pmd_index(addr))
#define pte_offset_kernel(dir,addr) \
@@ -1368,6 +1374,12 @@ static inline bool pud_is_leaf(pud_t pud)
return !!(pud_raw(pud) & cpu_to_be64(_PAGE_PTE));
}
+#define p4d_is_leaf p4d_is_leaf
+static inline bool p4d_is_leaf(p4d_t p4d)
+{
+ return !!(p4d_raw(p4d) & cpu_to_be64(_PAGE_PTE));
+}
+
#define pgd_is_leaf pgd_is_leaf
#define pgd_leaf pgd_is_leaf
static inline bool pgd_is_leaf(pgd_t pgd)
diff --git a/arch/powerpc/include/asm/book3s/64/radix.h b/arch/powerpc/include/asm/book3s/64/radix.h
index d97db3ad9aae..9bca2ac64220 100644
--- a/arch/powerpc/include/asm/book3s/64/radix.h
+++ b/arch/powerpc/include/asm/book3s/64/radix.h
@@ -30,7 +30,7 @@
/* Don't have anything in the reserved bits and leaf bits */
#define RADIX_PMD_BAD_BITS 0x60000000000000e0UL
#define RADIX_PUD_BAD_BITS 0x60000000000000e0UL
-#define RADIX_PGD_BAD_BITS 0x60000000000000e0UL
+#define RADIX_P4D_BAD_BITS 0x60000000000000e0UL
#define RADIX_PMD_SHIFT (PAGE_SHIFT + RADIX_PTE_INDEX_SIZE)
#define RADIX_PUD_SHIFT (RADIX_PMD_SHIFT + RADIX_PMD_INDEX_SIZE)
@@ -227,9 +227,9 @@ static inline int radix__pud_bad(pud_t pud)
}
-static inline int radix__pgd_bad(pgd_t pgd)
+static inline int radix__p4d_bad(p4d_t p4d)
{
- return !!(pgd_val(pgd) & RADIX_PGD_BAD_BITS);
+ return !!(p4d_val(p4d) & RADIX_P4D_BAD_BITS);
}
#ifdef CONFIG_TRANSPARENT_HUGEPAGE
diff --git a/arch/powerpc/include/asm/nohash/32/pgtable.h b/arch/powerpc/include/asm/nohash/32/pgtable.h
index 60c4d829152e..d4c2c4259fa3 100644
--- a/arch/powerpc/include/asm/nohash/32/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/32/pgtable.h
@@ -2,7 +2,6 @@
#ifndef _ASM_POWERPC_NOHASH_32_PGTABLE_H
#define _ASM_POWERPC_NOHASH_32_PGTABLE_H
-#define __ARCH_USE_5LEVEL_HACK
#include <asm-generic/pgtable-nopmd.h>
#ifndef __ASSEMBLY__
diff --git a/arch/powerpc/include/asm/nohash/64/pgalloc.h b/arch/powerpc/include/asm/nohash/64/pgalloc.h
index b9534a793293..668aee6017e7 100644
--- a/arch/powerpc/include/asm/nohash/64/pgalloc.h
+++ b/arch/powerpc/include/asm/nohash/64/pgalloc.h
@@ -15,7 +15,7 @@ struct vmemmap_backing {
};
extern struct vmemmap_backing *vmemmap_list;
-#define pgd_populate(MM, PGD, PUD) pgd_set(PGD, (unsigned long)PUD)
+#define p4d_populate(MM, P4D, PUD) p4d_set(P4D, (unsigned long)PUD)
static inline pud_t *pud_alloc_one(struct mm_struct *mm, unsigned long addr)
{
diff --git a/arch/powerpc/include/asm/nohash/64/pgtable-4k.h b/arch/powerpc/include/asm/nohash/64/pgtable-4k.h
index c40ec32b8194..81b1c54e3cf1 100644
--- a/arch/powerpc/include/asm/nohash/64/pgtable-4k.h
+++ b/arch/powerpc/include/asm/nohash/64/pgtable-4k.h
@@ -2,7 +2,7 @@
#ifndef _ASM_POWERPC_NOHASH_64_PGTABLE_4K_H
#define _ASM_POWERPC_NOHASH_64_PGTABLE_4K_H
-#include <asm-generic/5level-fixup.h>
+#include <asm-generic/pgtable-nop4d.h>
/*
* Entries per page directory level. The PTE level must use a 64b record
@@ -45,41 +45,41 @@
#define PMD_MASKED_BITS 0
/* Bits to mask out from a PUD to get to the PMD page */
#define PUD_MASKED_BITS 0
-/* Bits to mask out from a PGD to get to the PUD page */
-#define PGD_MASKED_BITS 0
+/* Bits to mask out from a P4D to get to the PUD page */
+#define P4D_MASKED_BITS 0
/*
* 4-level page tables related bits
*/
-#define pgd_none(pgd) (!pgd_val(pgd))
-#define pgd_bad(pgd) (pgd_val(pgd) == 0)
-#define pgd_present(pgd) (pgd_val(pgd) != 0)
-#define pgd_page_vaddr(pgd) (pgd_val(pgd) & ~PGD_MASKED_BITS)
+#define p4d_none(p4d) (!p4d_val(p4d))
+#define p4d_bad(p4d) (p4d_val(p4d) == 0)
+#define p4d_present(p4d) (p4d_val(p4d) != 0)
+#define p4d_page_vaddr(p4d) (p4d_val(p4d) & ~P4D_MASKED_BITS)
#ifndef __ASSEMBLY__
-static inline void pgd_clear(pgd_t *pgdp)
+static inline void p4d_clear(p4d_t *p4dp)
{
- *pgdp = __pgd(0);
+ *p4dp = __p4d(0);
}
-static inline pte_t pgd_pte(pgd_t pgd)
+static inline pte_t p4d_pte(p4d_t p4d)
{
- return __pte(pgd_val(pgd));
+ return __pte(p4d_val(p4d));
}
-static inline pgd_t pte_pgd(pte_t pte)
+static inline p4d_t pte_p4d(pte_t pte)
{
- return __pgd(pte_val(pte));
+ return __p4d(pte_val(pte));
}
-extern struct page *pgd_page(pgd_t pgd);
+extern struct page *p4d_page(p4d_t p4d);
#endif /* !__ASSEMBLY__ */
-#define pud_offset(pgdp, addr) \
- (((pud_t *) pgd_page_vaddr(*(pgdp))) + \
+#define pud_offset(p4dp, addr) \
+ (((pud_t *) p4d_page_vaddr(*(p4dp))) + \
(((addr) >> PUD_SHIFT) & (PTRS_PER_PUD - 1)))
#define pud_ERROR(e) \
diff --git a/arch/powerpc/include/asm/nohash/64/pgtable.h b/arch/powerpc/include/asm/nohash/64/pgtable.h
index 9a33b8bd842d..b360f262b9c6 100644
--- a/arch/powerpc/include/asm/nohash/64/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/64/pgtable.h
@@ -175,11 +175,11 @@ static inline pud_t pte_pud(pte_t pte)
return __pud(pte_val(pte));
}
#define pud_write(pud) pte_write(pud_pte(pud))
-#define pgd_write(pgd) pte_write(pgd_pte(pgd))
+#define p4d_write(p4d)		pte_write(p4d_pte(p4d))
-static inline void pgd_set(pgd_t *pgdp, unsigned long val)
+static inline void p4d_set(p4d_t *p4dp, unsigned long val)
{
- *pgdp = __pgd(val);
+ *p4dp = __p4d(val);
}
/*
diff --git a/arch/powerpc/include/asm/pgtable.h b/arch/powerpc/include/asm/pgtable.h
index 8cc543ed114c..0a05fddd7881 100644
--- a/arch/powerpc/include/asm/pgtable.h
+++ b/arch/powerpc/include/asm/pgtable.h
@@ -139,6 +139,14 @@ static inline bool pud_is_leaf(pud_t pud)
}
#endif
+#ifndef p4d_is_leaf
+#define p4d_is_leaf p4d_is_leaf
+static inline bool p4d_is_leaf(p4d_t p4d)
+{
+ return false;
+}
+#endif
+
#ifndef pgd_is_leaf
#define pgd_is_leaf pgd_is_leaf
static inline bool pgd_is_leaf(pgd_t pgd)
diff --git a/arch/powerpc/kvm/book3s_64_mmu_radix.c b/arch/powerpc/kvm/book3s_64_mmu_radix.c
index 803940d79b73..5aacfa0b27ef 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_radix.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_radix.c
@@ -494,17 +494,39 @@ static void kvmppc_unmap_free_pud(struct kvm *kvm, pud_t *pud,
pud_free(kvm->mm, pud);
}
+static void kvmppc_unmap_free_p4d(struct kvm *kvm, p4d_t *p4d,
+ unsigned int lpid)
+{
+ unsigned long iu;
+ p4d_t *p = p4d;
+
+ for (iu = 0; iu < PTRS_PER_P4D; ++iu, ++p) {
+ if (!p4d_present(*p))
+ continue;
+ if (p4d_is_leaf(*p)) {
+ p4d_clear(p);
+ } else {
+ pud_t *pud;
+
+ pud = pud_offset(p, 0);
+ kvmppc_unmap_free_pud(kvm, pud, lpid);
+ p4d_clear(p);
+ }
+ }
+ p4d_free(kvm->mm, p4d);
+}
+
void kvmppc_free_pgtable_radix(struct kvm *kvm, pgd_t *pgd, unsigned int lpid)
{
unsigned long ig;
for (ig = 0; ig < PTRS_PER_PGD; ++ig, ++pgd) {
- pud_t *pud;
+ p4d_t *p4d;
if (!pgd_present(*pgd))
continue;
- pud = pud_offset(pgd, 0);
- kvmppc_unmap_free_pud(kvm, pud, lpid);
+ p4d = p4d_offset(pgd, 0);
+ kvmppc_unmap_free_p4d(kvm, p4d, lpid);
pgd_clear(pgd);
}
}
@@ -566,6 +588,7 @@ int kvmppc_create_pte(struct kvm *kvm, pgd_t *pgtable, pte_t pte,
unsigned long *rmapp, struct rmap_nested **n_rmap)
{
pgd_t *pgd;
+ p4d_t *p4d, *new_p4d = NULL;
pud_t *pud, *new_pud = NULL;
pmd_t *pmd, *new_pmd = NULL;
pte_t *ptep, *new_ptep = NULL;
@@ -573,9 +596,15 @@ int kvmppc_create_pte(struct kvm *kvm, pgd_t *pgtable, pte_t pte,
/* Traverse the guest's 2nd-level tree, allocate new levels needed */
pgd = pgtable + pgd_index(gpa);
- pud = NULL;
+ p4d = NULL;
if (pgd_present(*pgd))
- pud = pud_offset(pgd, gpa);
+ p4d = p4d_offset(pgd, gpa);
+ else
+ new_p4d = p4d_alloc_one(kvm->mm, gpa);
+
+ pud = NULL;
+	if (p4d && p4d_present(*p4d))
+ pud = pud_offset(p4d, gpa);
else
new_pud = pud_alloc_one(kvm->mm, gpa);
@@ -597,12 +626,18 @@ int kvmppc_create_pte(struct kvm *kvm, pgd_t *pgtable, pte_t pte,
/* Now traverse again under the lock and change the tree */
ret = -ENOMEM;
if (pgd_none(*pgd)) {
+ if (!new_p4d)
+ goto out_unlock;
+ pgd_populate(kvm->mm, pgd, new_p4d);
+ new_p4d = NULL;
+ }
+	p4d = p4d_offset(pgd, gpa);
+	if (p4d_none(*p4d)) {
if (!new_pud)
goto out_unlock;
- pgd_populate(kvm->mm, pgd, new_pud);
+ p4d_populate(kvm->mm, p4d, new_pud);
new_pud = NULL;
}
- pud = pud_offset(pgd, gpa);
+ pud = pud_offset(p4d, gpa);
if (pud_is_leaf(*pud)) {
unsigned long hgpa = gpa & PUD_MASK;
@@ -1220,6 +1255,7 @@ static ssize_t debugfs_radix_read(struct file *file, char __user *buf,
pgd_t *pgt;
struct kvm_nested_guest *nested;
pgd_t pgd, *pgdp;
+ p4d_t p4d, *p4dp;
pud_t pud, *pudp;
pmd_t pmd, *pmdp;
pte_t *ptep;
@@ -1298,7 +1334,14 @@ static ssize_t debugfs_radix_read(struct file *file, char __user *buf,
continue;
}
- pudp = pud_offset(&pgd, gpa);
+ p4dp = p4d_offset(&pgd, gpa);
+ p4d = READ_ONCE(*p4dp);
+ if (!(p4d_val(p4d) & _PAGE_PRESENT)) {
+ gpa = (gpa & P4D_MASK) + P4D_SIZE;
+ continue;
+ }
+
+ pudp = pud_offset(&p4d, gpa);
pud = READ_ONCE(*pudp);
if (!(pud_val(pud) & _PAGE_PRESENT)) {
gpa = (gpa & PUD_MASK) + PUD_SIZE;
diff --git a/arch/powerpc/lib/code-patching.c b/arch/powerpc/lib/code-patching.c
index 3345f039a876..7a59f6863cec 100644
--- a/arch/powerpc/lib/code-patching.c
+++ b/arch/powerpc/lib/code-patching.c
@@ -107,13 +107,18 @@ static inline int unmap_patch_area(unsigned long addr)
pte_t *ptep;
pmd_t *pmdp;
pud_t *pudp;
+ p4d_t *p4dp;
pgd_t *pgdp;
pgdp = pgd_offset_k(addr);
if (unlikely(!pgdp))
return -EINVAL;
- pudp = pud_offset(pgdp, addr);
+ p4dp = p4d_offset(pgdp, addr);
+ if (unlikely(!p4dp))
+ return -EINVAL;
+
+ pudp = pud_offset(p4dp, addr);
if (unlikely(!pudp))
return -EINVAL;
diff --git a/arch/powerpc/mm/book3s32/mmu.c b/arch/powerpc/mm/book3s32/mmu.c
index 0a1c65a2c565..b2fc3e71165c 100644
--- a/arch/powerpc/mm/book3s32/mmu.c
+++ b/arch/powerpc/mm/book3s32/mmu.c
@@ -312,7 +312,7 @@ void hash_preload(struct mm_struct *mm, unsigned long ea)
if (!Hash)
return;
- pmd = pmd_offset(pud_offset(pgd_offset(mm, ea), ea), ea);
+ pmd = pmd_offset(pud_offset(p4d_offset(pgd_offset(mm, ea), ea), ea), ea);
if (!pmd_none(*pmd))
add_hash_page(mm->context.id, ea, pmd_val(*pmd));
}
diff --git a/arch/powerpc/mm/book3s32/tlb.c b/arch/powerpc/mm/book3s32/tlb.c
index 2fcd321040ff..175bc33b41b7 100644
--- a/arch/powerpc/mm/book3s32/tlb.c
+++ b/arch/powerpc/mm/book3s32/tlb.c
@@ -87,7 +87,7 @@ static void flush_range(struct mm_struct *mm, unsigned long start,
if (start >= end)
return;
end = (end - 1) | ~PAGE_MASK;
- pmd = pmd_offset(pud_offset(pgd_offset(mm, start), start), start);
+ pmd = pmd_offset(pud_offset(p4d_offset(pgd_offset(mm, start), start), start), start);
for (;;) {
pmd_end = ((start + PGDIR_SIZE) & PGDIR_MASK) - 1;
if (pmd_end > end)
@@ -145,7 +145,7 @@ void flush_tlb_page(struct vm_area_struct *vma, unsigned long vmaddr)
return;
}
mm = (vmaddr < TASK_SIZE)? vma->vm_mm: &init_mm;
- pmd = pmd_offset(pud_offset(pgd_offset(mm, vmaddr), vmaddr), vmaddr);
+ pmd = pmd_offset(pud_offset(p4d_offset(pgd_offset(mm, vmaddr), vmaddr), vmaddr), vmaddr);
if (!pmd_none(*pmd))
flush_hash_pages(mm->context.id, vmaddr, pmd_val(*pmd), 1);
}
diff --git a/arch/powerpc/mm/book3s64/hash_pgtable.c b/arch/powerpc/mm/book3s64/hash_pgtable.c
index 64733b9cb20a..9cd15937e88a 100644
--- a/arch/powerpc/mm/book3s64/hash_pgtable.c
+++ b/arch/powerpc/mm/book3s64/hash_pgtable.c
@@ -148,6 +148,7 @@ void hash__vmemmap_remove_mapping(unsigned long start,
int hash__map_kernel_page(unsigned long ea, unsigned long pa, pgprot_t prot)
{
pgd_t *pgdp;
+ p4d_t *p4dp;
pud_t *pudp;
pmd_t *pmdp;
pte_t *ptep;
@@ -155,7 +156,8 @@ int hash__map_kernel_page(unsigned long ea, unsigned long pa, pgprot_t prot)
BUILD_BUG_ON(TASK_SIZE_USER64 > H_PGTABLE_RANGE);
if (slab_is_available()) {
pgdp = pgd_offset_k(ea);
- pudp = pud_alloc(&init_mm, pgdp, ea);
+ p4dp = p4d_offset(pgdp, ea);
+ pudp = pud_alloc(&init_mm, p4dp, ea);
if (!pudp)
return -ENOMEM;
pmdp = pmd_alloc(&init_mm, pudp, ea);
diff --git a/arch/powerpc/mm/book3s64/radix_pgtable.c b/arch/powerpc/mm/book3s64/radix_pgtable.c
index dd1bea45325c..11762556fe4d 100644
--- a/arch/powerpc/mm/book3s64/radix_pgtable.c
+++ b/arch/powerpc/mm/book3s64/radix_pgtable.c
@@ -64,17 +64,24 @@ static int early_map_kernel_page(unsigned long ea, unsigned long pa,
{
unsigned long pfn = pa >> PAGE_SHIFT;
pgd_t *pgdp;
+ p4d_t *p4dp;
pud_t *pudp;
pmd_t *pmdp;
pte_t *ptep;
pgdp = pgd_offset_k(ea);
if (pgd_none(*pgdp)) {
+ p4dp = early_alloc_pgtable(PGD_TABLE_SIZE, nid,
+ region_start, region_end);
+ pgd_populate(&init_mm, pgdp, p4dp);
+ }
+ p4dp = p4d_offset(pgdp, ea);
+ if (p4d_none(*p4dp)) {
pudp = early_alloc_pgtable(PUD_TABLE_SIZE, nid,
region_start, region_end);
- pgd_populate(&init_mm, pgdp, pudp);
+ p4d_populate(&init_mm, p4dp, pudp);
}
- pudp = pud_offset(pgdp, ea);
+ pudp = pud_offset(p4dp, ea);
if (map_page_size == PUD_SIZE) {
ptep = (pte_t *)pudp;
goto set_the_pte;
@@ -114,6 +121,7 @@ static int __map_kernel_page(unsigned long ea, unsigned long pa,
{
unsigned long pfn = pa >> PAGE_SHIFT;
pgd_t *pgdp;
+ p4d_t *p4dp;
pud_t *pudp;
pmd_t *pmdp;
pte_t *ptep;
@@ -136,7 +144,8 @@ static int __map_kernel_page(unsigned long ea, unsigned long pa,
* boot.
*/
pgdp = pgd_offset_k(ea);
- pudp = pud_alloc(&init_mm, pgdp, ea);
+ p4dp = p4d_offset(pgdp, ea);
+ pudp = pud_alloc(&init_mm, p4dp, ea);
if (!pudp)
return -ENOMEM;
if (map_page_size == PUD_SIZE) {
@@ -173,6 +182,7 @@ void radix__change_memory_range(unsigned long start, unsigned long end,
{
unsigned long idx;
pgd_t *pgdp;
+ p4d_t *p4dp;
pud_t *pudp;
pmd_t *pmdp;
pte_t *ptep;
@@ -185,7 +195,8 @@ void radix__change_memory_range(unsigned long start, unsigned long end,
for (idx = start; idx < end; idx += PAGE_SIZE) {
pgdp = pgd_offset_k(idx);
- pudp = pud_alloc(&init_mm, pgdp, idx);
+ p4dp = p4d_offset(pgdp, idx);
+ pudp = pud_alloc(&init_mm, p4dp, idx);
if (!pudp)
continue;
if (pud_is_leaf(*pudp)) {
diff --git a/arch/powerpc/mm/book3s64/subpage_prot.c b/arch/powerpc/mm/book3s64/subpage_prot.c
index 2ef24a53f4c9..27daeed1a141 100644
--- a/arch/powerpc/mm/book3s64/subpage_prot.c
+++ b/arch/powerpc/mm/book3s64/subpage_prot.c
@@ -54,6 +54,7 @@ static void hpte_flush_range(struct mm_struct *mm, unsigned long addr,
int npages)
{
pgd_t *pgd;
+ p4d_t *p4d;
pud_t *pud;
pmd_t *pmd;
pte_t *pte;
@@ -62,7 +63,10 @@ static void hpte_flush_range(struct mm_struct *mm, unsigned long addr,
pgd = pgd_offset(mm, addr);
if (pgd_none(*pgd))
return;
- pud = pud_offset(pgd, addr);
+ p4d = p4d_offset(pgd, addr);
+ if (p4d_none(*p4d))
+ return;
+ pud = pud_offset(p4d, addr);
if (pud_none(*pud))
return;
pmd = pmd_offset(pud, addr);
diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c
index 73d4873fc7f8..43d463f20fc3 100644
--- a/arch/powerpc/mm/hugetlbpage.c
+++ b/arch/powerpc/mm/hugetlbpage.c
@@ -112,6 +112,7 @@ static int __hugepte_alloc(struct mm_struct *mm, hugepd_t *hpdp,
pte_t *huge_pte_alloc(struct mm_struct *mm, unsigned long addr, unsigned long sz)
{
pgd_t *pg;
+ p4d_t *p4;
pud_t *pu;
pmd_t *pm;
hugepd_t *hpdp = NULL;
@@ -121,20 +122,21 @@ pte_t *huge_pte_alloc(struct mm_struct *mm, unsigned long addr, unsigned long sz
addr &= ~(sz-1);
pg = pgd_offset(mm, addr);
+ p4 = p4d_offset(pg, addr);
#ifdef CONFIG_PPC_BOOK3S_64
if (pshift == PGDIR_SHIFT)
/* 16GB huge page */
- return (pte_t *) pg;
+ return (pte_t *) p4;
else if (pshift > PUD_SHIFT) {
/*
* We need to use hugepd table
*/
ptl = &mm->page_table_lock;
- hpdp = (hugepd_t *)pg;
+ hpdp = (hugepd_t *)p4;
} else {
pdshift = PUD_SHIFT;
- pu = pud_alloc(mm, pg, addr);
+ pu = pud_alloc(mm, p4, addr);
if (!pu)
return NULL;
if (pshift == PUD_SHIFT)
@@ -159,10 +161,10 @@ pte_t *huge_pte_alloc(struct mm_struct *mm, unsigned long addr, unsigned long sz
#else
if (pshift >= PGDIR_SHIFT) {
ptl = &mm->page_table_lock;
- hpdp = (hugepd_t *)pg;
+ hpdp = (hugepd_t *)p4;
} else {
pdshift = PUD_SHIFT;
- pu = pud_alloc(mm, pg, addr);
+ pu = pud_alloc(mm, p4, addr);
if (!pu)
return NULL;
if (pshift >= PUD_SHIFT) {
@@ -384,7 +386,7 @@ static void hugetlb_free_pmd_range(struct mmu_gather *tlb, pud_t *pud,
mm_dec_nr_pmds(tlb->mm);
}
-static void hugetlb_free_pud_range(struct mmu_gather *tlb, pgd_t *pgd,
+static void hugetlb_free_pud_range(struct mmu_gather *tlb, p4d_t *p4d,
unsigned long addr, unsigned long end,
unsigned long floor, unsigned long ceiling)
{
@@ -394,7 +396,7 @@ static void hugetlb_free_pud_range(struct mmu_gather *tlb, pgd_t *pgd,
start = addr;
do {
- pud = pud_offset(pgd, addr);
+ pud = pud_offset(p4d, addr);
next = pud_addr_end(addr, end);
if (!is_hugepd(__hugepd(pud_val(*pud)))) {
if (pud_none_or_clear_bad(pud))
@@ -429,8 +431,8 @@ static void hugetlb_free_pud_range(struct mmu_gather *tlb, pgd_t *pgd,
if (end - 1 > ceiling - 1)
return;
- pud = pud_offset(pgd, start);
- pgd_clear(pgd);
+ pud = pud_offset(p4d, start);
+ p4d_clear(p4d);
pud_free_tlb(tlb, pud, start);
mm_dec_nr_puds(tlb->mm);
}
@@ -443,6 +445,7 @@ void hugetlb_free_pgd_range(struct mmu_gather *tlb,
unsigned long floor, unsigned long ceiling)
{
pgd_t *pgd;
+ p4d_t *p4d;
unsigned long next;
/*
@@ -465,10 +468,11 @@ void hugetlb_free_pgd_range(struct mmu_gather *tlb,
do {
next = pgd_addr_end(addr, end);
pgd = pgd_offset(tlb->mm, addr);
+ p4d = p4d_offset(pgd, addr);
if (!is_hugepd(__hugepd(pgd_val(*pgd)))) {
- if (pgd_none_or_clear_bad(pgd))
+ if (p4d_none_or_clear_bad(p4d))
continue;
- hugetlb_free_pud_range(tlb, pgd, addr, next, floor, ceiling);
+ hugetlb_free_pud_range(tlb, p4d, addr, next, floor, ceiling);
} else {
unsigned long more;
/*
@@ -481,7 +485,7 @@ void hugetlb_free_pgd_range(struct mmu_gather *tlb,
if (more > next)
next = more;
- free_hugepd_range(tlb, (hugepd_t *)pgd, PGDIR_SHIFT,
+ free_hugepd_range(tlb, (hugepd_t *)p4d, PGDIR_SHIFT,
addr, next, floor, ceiling);
}
} while (addr = next, addr != end);
diff --git a/arch/powerpc/mm/kasan/kasan_init_32.c b/arch/powerpc/mm/kasan/kasan_init_32.c
index 16dd95bd0749..eed3f1ae3b90 100644
--- a/arch/powerpc/mm/kasan/kasan_init_32.c
+++ b/arch/powerpc/mm/kasan/kasan_init_32.c
@@ -36,7 +36,7 @@ static int __init kasan_init_shadow_page_tables(unsigned long k_start, unsigned
unsigned long k_cur, k_next;
pte_t *new = NULL;
- pmd = pmd_offset(pud_offset(pgd_offset_k(k_start), k_start), k_start);
+ pmd = pmd_offset(pud_offset(p4d_offset(pgd_offset_k(k_start), k_start), k_start), k_start);
for (k_cur = k_start; k_cur != k_end; k_cur = k_next, pmd++) {
k_next = pgd_addr_end(k_cur, k_end);
@@ -78,7 +78,7 @@ static int __init kasan_init_region(void *start, size_t size)
block = memblock_alloc(k_end - k_start, PAGE_SIZE);
for (k_cur = k_start & PAGE_MASK; k_cur < k_end; k_cur += PAGE_SIZE) {
- pmd_t *pmd = pmd_offset(pud_offset(pgd_offset_k(k_cur), k_cur), k_cur);
+ pmd_t *pmd = pmd_offset(pud_offset(p4d_offset(pgd_offset_k(k_cur), k_cur), k_cur), k_cur);
void *va = block + k_cur - k_start;
pte_t pte = pfn_pte(PHYS_PFN(__pa(va)), PAGE_KERNEL);
@@ -102,7 +102,7 @@ static void __init kasan_remap_early_shadow_ro(void)
kasan_populate_pte(kasan_early_shadow_pte, prot);
for (k_cur = k_start & PAGE_MASK; k_cur < k_end; k_cur += PAGE_SIZE) {
- pmd_t *pmd = pmd_offset(pud_offset(pgd_offset_k(k_cur), k_cur), k_cur);
+ pmd_t *pmd = pmd_offset(pud_offset(p4d_offset(pgd_offset_k(k_cur), k_cur), k_cur), k_cur);
pte_t *ptep = pte_offset_kernel(pmd, k_cur);
if ((pte_val(*ptep) & PTE_RPN_MASK) != pa)
@@ -202,7 +202,7 @@ void __init kasan_early_init(void)
unsigned long addr = KASAN_SHADOW_START;
unsigned long end = KASAN_SHADOW_END;
unsigned long next;
- pmd_t *pmd = pmd_offset(pud_offset(pgd_offset_k(addr), addr), addr);
+ pmd_t *pmd = pmd_offset(pud_offset(p4d_offset(pgd_offset_k(addr), addr), addr), addr);
BUILD_BUG_ON(KASAN_SHADOW_START & ~PGDIR_MASK);
diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c
index ef7b1119b2e2..8262b384dcf3 100644
--- a/arch/powerpc/mm/mem.c
+++ b/arch/powerpc/mm/mem.c
@@ -69,8 +69,8 @@ EXPORT_SYMBOL(kmap_prot);
static inline pte_t *virt_to_kpte(unsigned long vaddr)
{
- return pte_offset_kernel(pmd_offset(pud_offset(pgd_offset_k(vaddr),
- vaddr), vaddr), vaddr);
+ return pte_offset_kernel(pmd_offset(pud_offset(p4d_offset(pgd_offset_k(vaddr),
+ vaddr), vaddr), vaddr), vaddr);
}
#endif
diff --git a/arch/powerpc/mm/nohash/40x.c b/arch/powerpc/mm/nohash/40x.c
index f348104eb461..7aaf7155e350 100644
--- a/arch/powerpc/mm/nohash/40x.c
+++ b/arch/powerpc/mm/nohash/40x.c
@@ -104,7 +104,7 @@ unsigned long __init mmu_mapin_ram(unsigned long base, unsigned long top)
pmd_t *pmdp;
unsigned long val = p | _PMD_SIZE_16M | _PAGE_EXEC | _PAGE_HWWRITE;
- pmdp = pmd_offset(pud_offset(pgd_offset_k(v), v), v);
+ pmdp = pmd_offset(pud_offset(p4d_offset(pgd_offset_k(v), v), v), v);
*pmdp++ = __pmd(val);
*pmdp++ = __pmd(val);
*pmdp++ = __pmd(val);
@@ -119,7 +119,7 @@ unsigned long __init mmu_mapin_ram(unsigned long base, unsigned long top)
pmd_t *pmdp;
unsigned long val = p | _PMD_SIZE_4M | _PAGE_EXEC | _PAGE_HWWRITE;
- pmdp = pmd_offset(pud_offset(pgd_offset_k(v), v), v);
+ pmdp = pmd_offset(pud_offset(p4d_offset(pgd_offset_k(v), v), v), v);
*pmdp = __pmd(val);
v += LARGE_PAGE_SIZE_4M;
diff --git a/arch/powerpc/mm/nohash/book3e_pgtable.c b/arch/powerpc/mm/nohash/book3e_pgtable.c
index 4637fdd469cf..a62d59a928be 100644
--- a/arch/powerpc/mm/nohash/book3e_pgtable.c
+++ b/arch/powerpc/mm/nohash/book3e_pgtable.c
@@ -73,6 +73,7 @@ static void __init *early_alloc_pgtable(unsigned long size)
int __ref map_kernel_page(unsigned long ea, unsigned long pa, pgprot_t prot)
{
pgd_t *pgdp;
+ p4d_t *p4dp;
pud_t *pudp;
pmd_t *pmdp;
pte_t *ptep;
@@ -80,7 +81,10 @@ int __ref map_kernel_page(unsigned long ea, unsigned long pa, pgprot_t prot)
BUILD_BUG_ON(TASK_SIZE_USER64 > PGTABLE_RANGE);
if (slab_is_available()) {
pgdp = pgd_offset_k(ea);
- pudp = pud_alloc(&init_mm, pgdp, ea);
+ p4dp = p4d_alloc(&init_mm, pgdp, ea);
+ if (!p4dp)
+ return -ENOMEM;
+ pudp = pud_alloc(&init_mm, p4dp, ea);
if (!pudp)
return -ENOMEM;
pmdp = pmd_alloc(&init_mm, pudp, ea);
@@ -91,13 +95,16 @@ int __ref map_kernel_page(unsigned long ea, unsigned long pa, pgprot_t prot)
return -ENOMEM;
} else {
pgdp = pgd_offset_k(ea);
-#ifndef __PAGETABLE_PUD_FOLDED
if (pgd_none(*pgdp)) {
pudp = early_alloc_pgtable(PUD_TABLE_SIZE);
pgd_populate(&init_mm, pgdp, pudp);
}
-#endif /* !__PAGETABLE_PUD_FOLDED */
- pudp = pud_offset(pgdp, ea);
+ p4dp = p4d_offset(pgdp, ea);
+ if (p4d_none(*p4dp)) {
+			pudp = early_alloc_pgtable(PUD_TABLE_SIZE);
+			p4d_populate(&init_mm, p4dp, pudp);
+ }
+ pudp = pud_offset(p4dp, ea);
if (pud_none(*pudp)) {
pmdp = early_alloc_pgtable(PMD_TABLE_SIZE);
pud_populate(&init_mm, pudp, pmdp);
diff --git a/arch/powerpc/mm/pgtable.c b/arch/powerpc/mm/pgtable.c
index e3759b69f81b..dca6a72da26a 100644
--- a/arch/powerpc/mm/pgtable.c
+++ b/arch/powerpc/mm/pgtable.c
@@ -265,6 +265,7 @@ int huge_ptep_set_access_flags(struct vm_area_struct *vma,
void assert_pte_locked(struct mm_struct *mm, unsigned long addr)
{
pgd_t *pgd;
+ p4d_t *p4d;
pud_t *pud;
pmd_t *pmd;
@@ -272,7 +273,9 @@ void assert_pte_locked(struct mm_struct *mm, unsigned long addr)
return;
pgd = mm->pgd + pgd_index(addr);
BUG_ON(pgd_none(*pgd));
- pud = pud_offset(pgd, addr);
+ p4d = p4d_offset(pgd, addr);
+ BUG_ON(p4d_none(*p4d));
+ pud = pud_offset(p4d, addr);
BUG_ON(pud_none(*pud));
pmd = pmd_offset(pud, addr);
/*
@@ -313,6 +316,7 @@ pte_t *__find_linux_pte(pgd_t *pgdir, unsigned long ea,
bool *is_thp, unsigned *hpage_shift)
{
pgd_t pgd, *pgdp;
+ p4d_t p4d, *p4dp;
pud_t pud, *pudp;
pmd_t pmd, *pmdp;
pte_t *ret_pte;
@@ -346,13 +350,30 @@ pte_t *__find_linux_pte(pgd_t *pgdir, unsigned long ea,
goto out_huge;
}
+ pdshift = P4D_SHIFT;
+ p4dp = p4d_offset(&pgd, ea);
+ p4d = READ_ONCE(*p4dp);
+
+ if (p4d_none(p4d))
+ return NULL;
+
+ if (p4d_is_leaf(p4d)) {
+ ret_pte = (pte_t *)p4dp;
+ goto out;
+ }
+
+ if (is_hugepd(__hugepd(p4d_val(p4d)))) {
+ hpdp = (hugepd_t *)&p4d;
+ goto out_huge;
+ }
+
/*
* Even if we end up with an unmap, the pgtable will not
* be freed, because we do an rcu free and here we are
* irq disabled
*/
pdshift = PUD_SHIFT;
- pudp = pud_offset(&pgd, ea);
+ pudp = pud_offset(&p4d, ea);
pud = READ_ONCE(*pudp);
if (pud_none(pud))
diff --git a/arch/powerpc/mm/pgtable_32.c b/arch/powerpc/mm/pgtable_32.c
index 5fb90edd865e..ad217e5e039f 100644
--- a/arch/powerpc/mm/pgtable_32.c
+++ b/arch/powerpc/mm/pgtable_32.c
@@ -63,7 +63,7 @@ int __ref map_kernel_page(unsigned long va, phys_addr_t pa, pgprot_t prot)
int err = -ENOMEM;
/* Use upper 10 bits of VA to index the first level map */
- pd = pmd_offset(pud_offset(pgd_offset_k(va), va), va);
+ pd = pmd_offset(pud_offset(p4d_offset(pgd_offset_k(va), va), va), va);
/* Use middle 10 bits of VA to index the second-level map */
if (likely(slab_is_available()))
pg = pte_alloc_kernel(pd, va);
@@ -130,6 +130,7 @@ static int
get_pteptr(struct mm_struct *mm, unsigned long addr, pte_t **ptep, pmd_t **pmdp)
{
pgd_t *pgd;
+ p4d_t *p4d;
pud_t *pud;
pmd_t *pmd;
pte_t *pte;
@@ -137,17 +138,20 @@ get_pteptr(struct mm_struct *mm, unsigned long addr, pte_t **ptep, pmd_t **pmdp)
pgd = pgd_offset(mm, addr & PAGE_MASK);
if (pgd) {
- pud = pud_offset(pgd, addr & PAGE_MASK);
- if (pud && pud_present(*pud)) {
- pmd = pmd_offset(pud, addr & PAGE_MASK);
- if (pmd_present(*pmd)) {
- pte = pte_offset_map(pmd, addr & PAGE_MASK);
- if (pte) {
- retval = 1;
- *ptep = pte;
- if (pmdp)
- *pmdp = pmd;
- /* XXX caller needs to do pte_unmap, yuck */
+ p4d = p4d_offset(pgd, addr & PAGE_MASK);
+ if (p4d && p4d_present(*p4d)) {
+ pud = pud_offset(p4d, addr & PAGE_MASK);
+ if (pud && pud_present(*pud)) {
+ pmd = pmd_offset(pud, addr & PAGE_MASK);
+ if (pmd_present(*pmd)) {
+ pte = pte_offset_map(pmd, addr & PAGE_MASK);
+ if (pte) {
+ retval = 1;
+ *ptep = pte;
+ if (pmdp)
+ *pmdp = pmd;
+ /* XXX caller needs to do pte_unmap, yuck */
+ }
}
}
}
diff --git a/arch/powerpc/mm/pgtable_64.c b/arch/powerpc/mm/pgtable_64.c
index e78832dce7bb..1f86a88fd4bb 100644
--- a/arch/powerpc/mm/pgtable_64.c
+++ b/arch/powerpc/mm/pgtable_64.c
@@ -101,13 +101,13 @@ EXPORT_SYMBOL(__pte_frag_size_shift);
#ifndef __PAGETABLE_PUD_FOLDED
/* 4 level page table */
-struct page *pgd_page(pgd_t pgd)
+struct page *p4d_page(p4d_t p4d)
{
- if (pgd_is_leaf(pgd)) {
- VM_WARN_ON(!pgd_huge(pgd));
- return pte_page(pgd_pte(pgd));
+ if (p4d_is_leaf(p4d)) {
+ VM_WARN_ON(!p4d_huge(p4d));
+ return pte_page(p4d_pte(p4d));
}
- return virt_to_page(pgd_page_vaddr(pgd));
+ return virt_to_page(p4d_page_vaddr(p4d));
}
#endif
diff --git a/arch/powerpc/mm/ptdump/hashpagetable.c b/arch/powerpc/mm/ptdump/hashpagetable.c
index a07278027c6f..ac360ad865a8 100644
--- a/arch/powerpc/mm/ptdump/hashpagetable.c
+++ b/arch/powerpc/mm/ptdump/hashpagetable.c
@@ -417,9 +417,9 @@ static void walk_pmd(struct pg_state *st, pud_t *pud, unsigned long start)
}
}
-static void walk_pud(struct pg_state *st, pgd_t *pgd, unsigned long start)
+static void walk_pud(struct pg_state *st, p4d_t *p4d, unsigned long start)
{
- pud_t *pud = pud_offset(pgd, 0);
+ pud_t *pud = pud_offset(p4d, 0);
unsigned long addr;
unsigned int i;
@@ -431,6 +431,20 @@ static void walk_pud(struct pg_state *st, pgd_t *pgd, unsigned long start)
}
}
+static void walk_p4d(struct pg_state *st, pgd_t *pgd, unsigned long start)
+{
+ p4d_t *p4d = p4d_offset(pgd, 0);
+ unsigned long addr;
+ unsigned int i;
+
+ for (i = 0; i < PTRS_PER_P4D; i++, p4d++) {
+ addr = start + i * P4D_SIZE;
+ if (!p4d_none(*p4d))
+ /* p4d exists */
+ walk_pud(st, p4d, addr);
+ }
+}
+
static void walk_pagetables(struct pg_state *st)
{
pgd_t *pgd = pgd_offset_k(0UL);
@@ -445,7 +459,7 @@ static void walk_pagetables(struct pg_state *st)
addr = KERN_VIRT_START + i * PGDIR_SIZE;
if (!pgd_none(*pgd))
/* pgd exists */
- walk_pud(st, pgd, addr);
+ walk_p4d(st, pgd, addr);
}
}
diff --git a/arch/powerpc/mm/ptdump/ptdump.c b/arch/powerpc/mm/ptdump/ptdump.c
index 206156255247..7bd4b81d5b5d 100644
--- a/arch/powerpc/mm/ptdump/ptdump.c
+++ b/arch/powerpc/mm/ptdump/ptdump.c
@@ -277,9 +277,9 @@ static void walk_pmd(struct pg_state *st, pud_t *pud, unsigned long start)
}
}
-static void walk_pud(struct pg_state *st, pgd_t *pgd, unsigned long start)
+static void walk_pud(struct pg_state *st, p4d_t *p4d, unsigned long start)
{
- pud_t *pud = pud_offset(pgd, 0);
+ pud_t *pud = pud_offset(p4d, 0);
unsigned long addr;
unsigned int i;
@@ -293,6 +293,22 @@ static void walk_pud(struct pg_state *st, pgd_t *pgd, unsigned long start)
}
}
+static void walk_p4d(struct pg_state *st, pgd_t *pgd, unsigned long start)
+{
+ p4d_t *p4d = p4d_offset(pgd, 0);
+ unsigned long addr;
+ unsigned int i;
+
+ for (i = 0; i < PTRS_PER_P4D; i++, p4d++) {
+ addr = start + i * P4D_SIZE;
+ if (!p4d_none(*p4d) && !p4d_is_leaf(*p4d))
+ /* p4d exists */
+ walk_pud(st, p4d, addr);
+ else
+ note_page(st, addr, 2, p4d_val(*p4d));
+ }
+}
+
static void walk_pagetables(struct pg_state *st)
{
unsigned int i;
@@ -306,7 +322,7 @@ static void walk_pagetables(struct pg_state *st)
for (i = pgd_index(addr); i < PTRS_PER_PGD; i++, pgd++, addr += PGDIR_SIZE) {
if (!pgd_none(*pgd) && !pgd_is_leaf(*pgd))
/* pgd exists */
- walk_pud(st, pgd, addr);
+ walk_p4d(st, pgd, addr);
else
note_page(st, addr, 1, pgd_val(*pgd));
}
diff --git a/arch/powerpc/xmon/xmon.c b/arch/powerpc/xmon/xmon.c
index e8c84d265602..c7bd1145b268 100644
--- a/arch/powerpc/xmon/xmon.c
+++ b/arch/powerpc/xmon/xmon.c
@@ -3130,6 +3130,7 @@ static void show_pte(unsigned long addr)
struct task_struct *tsk = NULL;
struct mm_struct *mm;
pgd_t *pgdp, *pgdir;
+ p4d_t *p4dp;
pud_t *pudp;
pmd_t *pmdp;
pte_t *ptep;
@@ -3174,7 +3175,21 @@ static void show_pte(unsigned long addr)
}
printf("pgdp @ 0x%px = 0x%016lx\n", pgdp, pgd_val(*pgdp));
- pudp = pud_offset(pgdp, addr);
+ p4dp = p4d_offset(pgdp, addr);
+
+ if (p4d_none(*p4dp)) {
+ printf("No valid P4D\n");
+ return;
+ }
+
+ if (p4d_is_leaf(*p4dp)) {
+ format_pte(p4dp, p4d_val(*p4dp));
+ return;
+ }
+
+ printf("p4dp @ 0x%px = 0x%016lx\n", p4dp, p4d_val(*p4dp));
+
+ pudp = pud_offset(p4dp, addr);
if (pud_none(*pudp)) {
printf("No valid PUD\n");
--
2.24.0
From: Mike Rapoport <[email protected]>
Implement primitives necessary for the 4th level folding, add walks of p4d
level where appropriate and remove usage of __ARCH_USE_5LEVEL_HACK.
Signed-off-by: Mike Rapoport <[email protected]>
---
arch/sh/include/asm/pgtable-2level.h | 1 -
arch/sh/include/asm/pgtable-3level.h | 1 -
arch/sh/kernel/io_trapped.c | 7 ++++++-
arch/sh/mm/cache-sh4.c | 4 +++-
arch/sh/mm/cache-sh5.c | 7 ++++++-
arch/sh/mm/fault.c | 26 +++++++++++++++++++++++---
arch/sh/mm/hugetlbpage.c | 28 ++++++++++++++++++----------
arch/sh/mm/init.c | 9 ++++++++-
arch/sh/mm/kmap.c | 2 +-
arch/sh/mm/tlbex_32.c | 6 +++++-
arch/sh/mm/tlbex_64.c | 7 ++++++-
11 files changed, 76 insertions(+), 22 deletions(-)
diff --git a/arch/sh/include/asm/pgtable-2level.h b/arch/sh/include/asm/pgtable-2level.h
index bf1eb51c3ee5..08bff93927ff 100644
--- a/arch/sh/include/asm/pgtable-2level.h
+++ b/arch/sh/include/asm/pgtable-2level.h
@@ -2,7 +2,6 @@
#ifndef __ASM_SH_PGTABLE_2LEVEL_H
#define __ASM_SH_PGTABLE_2LEVEL_H
-#define __ARCH_USE_5LEVEL_HACK
#include <asm-generic/pgtable-nopmd.h>
/*
diff --git a/arch/sh/include/asm/pgtable-3level.h b/arch/sh/include/asm/pgtable-3level.h
index 779260b721ca..0f80097e5c9c 100644
--- a/arch/sh/include/asm/pgtable-3level.h
+++ b/arch/sh/include/asm/pgtable-3level.h
@@ -2,7 +2,6 @@
#ifndef __ASM_SH_PGTABLE_3LEVEL_H
#define __ASM_SH_PGTABLE_3LEVEL_H
-#define __ARCH_USE_5LEVEL_HACK
#include <asm-generic/pgtable-nopud.h>
/*
diff --git a/arch/sh/kernel/io_trapped.c b/arch/sh/kernel/io_trapped.c
index 60c828a2b8a2..037aab2708b7 100644
--- a/arch/sh/kernel/io_trapped.c
+++ b/arch/sh/kernel/io_trapped.c
@@ -136,6 +136,7 @@ EXPORT_SYMBOL_GPL(match_trapped_io_handler);
static struct trapped_io *lookup_tiop(unsigned long address)
{
pgd_t *pgd_k;
+ p4d_t *p4d_k;
pud_t *pud_k;
pmd_t *pmd_k;
pte_t *pte_k;
@@ -145,7 +146,11 @@ static struct trapped_io *lookup_tiop(unsigned long address)
if (!pgd_present(*pgd_k))
return NULL;
- pud_k = pud_offset(pgd_k, address);
+ p4d_k = p4d_offset(pgd_k, address);
+ if (!p4d_present(*p4d_k))
+ return NULL;
+
+ pud_k = pud_offset(p4d_k, address);
if (!pud_present(*pud_k))
return NULL;
diff --git a/arch/sh/mm/cache-sh4.c b/arch/sh/mm/cache-sh4.c
index eee911422cf9..45943bcb7042 100644
--- a/arch/sh/mm/cache-sh4.c
+++ b/arch/sh/mm/cache-sh4.c
@@ -209,6 +209,7 @@ static void sh4_flush_cache_page(void *args)
unsigned long address, pfn, phys;
int map_coherent = 0;
pgd_t *pgd;
+ p4d_t *p4d;
pud_t *pud;
pmd_t *pmd;
pte_t *pte;
@@ -224,7 +225,8 @@ static void sh4_flush_cache_page(void *args)
return;
pgd = pgd_offset(vma->vm_mm, address);
- pud = pud_offset(pgd, address);
+ p4d = p4d_offset(pgd, address);
+ pud = pud_offset(p4d, address);
pmd = pmd_offset(pud, address);
pte = pte_offset_kernel(pmd, address);
diff --git a/arch/sh/mm/cache-sh5.c b/arch/sh/mm/cache-sh5.c
index 445b5e69b73c..442a77cc2957 100644
--- a/arch/sh/mm/cache-sh5.c
+++ b/arch/sh/mm/cache-sh5.c
@@ -383,6 +383,7 @@ static void sh64_dcache_purge_user_pages(struct mm_struct *mm,
unsigned long addr, unsigned long end)
{
pgd_t *pgd;
+ p4d_t *p4d;
pud_t *pud;
pmd_t *pmd;
pte_t *pte;
@@ -397,7 +398,11 @@ static void sh64_dcache_purge_user_pages(struct mm_struct *mm,
if (pgd_bad(*pgd))
return;
- pud = pud_offset(pgd, addr);
+ p4d = p4d_offset(pgd, addr);
+ if (p4d_none(*p4d) || p4d_bad(*p4d))
+ return;
+
+ pud = pud_offset(p4d, addr);
if (pud_none(*pud) || pud_bad(*pud))
return;
diff --git a/arch/sh/mm/fault.c b/arch/sh/mm/fault.c
index a2b0275413e8..ebd30003fd06 100644
--- a/arch/sh/mm/fault.c
+++ b/arch/sh/mm/fault.c
@@ -53,6 +53,7 @@ static void show_pte(struct mm_struct *mm, unsigned long addr)
(u64)pgd_val(*pgd));
do {
+ p4d_t *p4d;
pud_t *pud;
pmd_t *pmd;
pte_t *pte;
@@ -65,7 +66,20 @@ static void show_pte(struct mm_struct *mm, unsigned long addr)
break;
}
- pud = pud_offset(pgd, addr);
+ p4d = p4d_offset(pgd, addr);
+ if (PTRS_PER_P4D != 1)
+ pr_cont(", *p4d=%0*Lx", (u32)(sizeof(*p4d) * 2),
+ (u64)p4d_val(*p4d));
+
+ if (p4d_none(*p4d))
+ break;
+
+ if (p4d_bad(*p4d)) {
+ pr_cont("(bad)");
+ break;
+ }
+
+ pud = pud_offset(p4d, addr);
if (PTRS_PER_PUD != 1)
pr_cont(", *pud=%0*llx", (u32)(sizeof(*pud) * 2),
(u64)pud_val(*pud));
@@ -107,6 +121,7 @@ static inline pmd_t *vmalloc_sync_one(pgd_t *pgd, unsigned long address)
{
unsigned index = pgd_index(address);
pgd_t *pgd_k;
+ p4d_t *p4d, *p4d_k;
pud_t *pud, *pud_k;
pmd_t *pmd, *pmd_k;
@@ -116,8 +131,13 @@ static inline pmd_t *vmalloc_sync_one(pgd_t *pgd, unsigned long address)
if (!pgd_present(*pgd_k))
return NULL;
- pud = pud_offset(pgd, address);
- pud_k = pud_offset(pgd_k, address);
+ p4d = p4d_offset(pgd, address);
+ p4d_k = p4d_offset(pgd_k, address);
+ if (!p4d_present(*p4d_k))
+ return NULL;
+
+ pud = pud_offset(p4d, address);
+ pud_k = pud_offset(p4d_k, address);
if (!pud_present(*pud_k))
return NULL;
diff --git a/arch/sh/mm/hugetlbpage.c b/arch/sh/mm/hugetlbpage.c
index 960deb1f24a1..acd5652a0de3 100644
--- a/arch/sh/mm/hugetlbpage.c
+++ b/arch/sh/mm/hugetlbpage.c
@@ -26,17 +26,21 @@ pte_t *huge_pte_alloc(struct mm_struct *mm,
unsigned long addr, unsigned long sz)
{
pgd_t *pgd;
+ p4d_t *p4d;
pud_t *pud;
pmd_t *pmd;
pte_t *pte = NULL;
pgd = pgd_offset(mm, addr);
if (pgd) {
- pud = pud_alloc(mm, pgd, addr);
- if (pud) {
- pmd = pmd_alloc(mm, pud, addr);
- if (pmd)
- pte = pte_alloc_map(mm, pmd, addr);
+ p4d = p4d_alloc(mm, pgd, addr);
+ if (p4d) {
+ pud = pud_alloc(mm, p4d, addr);
+ if (pud) {
+ pmd = pmd_alloc(mm, pud, addr);
+ if (pmd)
+ pte = pte_alloc_map(mm, pmd, addr);
+ }
}
}
@@ -47,17 +51,21 @@ pte_t *huge_pte_offset(struct mm_struct *mm,
unsigned long addr, unsigned long sz)
{
pgd_t *pgd;
+ p4d_t *p4d;
pud_t *pud;
pmd_t *pmd;
pte_t *pte = NULL;
pgd = pgd_offset(mm, addr);
if (pgd) {
- pud = pud_offset(pgd, addr);
- if (pud) {
- pmd = pmd_offset(pud, addr);
- if (pmd)
- pte = pte_offset_map(pmd, addr);
+ p4d = p4d_offset(pgd, addr);
+ if (p4d) {
+ pud = pud_offset(p4d, addr);
+ if (pud) {
+ pmd = pmd_offset(pud, addr);
+ if (pmd)
+ pte = pte_offset_map(pmd, addr);
+ }
}
}
diff --git a/arch/sh/mm/init.c b/arch/sh/mm/init.c
index 4bab79baee75..594203530d43 100644
--- a/arch/sh/mm/init.c
+++ b/arch/sh/mm/init.c
@@ -45,6 +45,7 @@ void __init __weak plat_mem_setup(void)
static pte_t *__get_pte_phys(unsigned long addr)
{
pgd_t *pgd;
+ p4d_t *p4d;
pud_t *pud;
pmd_t *pmd;
@@ -54,7 +55,13 @@ static pte_t *__get_pte_phys(unsigned long addr)
return NULL;
}
- pud = pud_alloc(NULL, pgd, addr);
+ p4d = p4d_alloc(NULL, pgd, addr);
+ if (unlikely(!p4d)) {
+ p4d_ERROR(*p4d);
+ return NULL;
+ }
+
+ pud = pud_alloc(NULL, p4d, addr);
if (unlikely(!pud)) {
pud_ERROR(*pud);
return NULL;
diff --git a/arch/sh/mm/kmap.c b/arch/sh/mm/kmap.c
index 9e6b38b03cf7..0e7039137f5a 100644
--- a/arch/sh/mm/kmap.c
+++ b/arch/sh/mm/kmap.c
@@ -15,7 +15,7 @@
#include <asm/cacheflush.h>
#define kmap_get_fixmap_pte(vaddr) \
- pte_offset_kernel(pmd_offset(pud_offset(pgd_offset_k(vaddr), (vaddr)), (vaddr)), (vaddr))
+ pte_offset_kernel(pmd_offset(pud_offset(p4d_offset(pgd_offset_k(vaddr), (vaddr)), (vaddr)), (vaddr)), vaddr)
static pte_t *kmap_coherent_pte;
diff --git a/arch/sh/mm/tlbex_32.c b/arch/sh/mm/tlbex_32.c
index 382262dc0c4b..1c53868632ee 100644
--- a/arch/sh/mm/tlbex_32.c
+++ b/arch/sh/mm/tlbex_32.c
@@ -23,6 +23,7 @@ handle_tlbmiss(struct pt_regs *regs, unsigned long error_code,
unsigned long address)
{
pgd_t *pgd;
+ p4d_t *p4d;
pud_t *pud;
pmd_t *pmd;
pte_t *pte;
@@ -42,7 +43,10 @@ handle_tlbmiss(struct pt_regs *regs, unsigned long error_code,
pgd = pgd_offset(current->mm, address);
}
- pud = pud_offset(pgd, address);
+ p4d = p4d_offset(pgd, address);
+ if (p4d_none_or_clear_bad(p4d))
+ return 1;
+ pud = pud_offset(p4d, address);
if (pud_none_or_clear_bad(pud))
return 1;
pmd = pmd_offset(pud, address);
diff --git a/arch/sh/mm/tlbex_64.c b/arch/sh/mm/tlbex_64.c
index 8ff966dd0c74..0d015f7556fa 100644
--- a/arch/sh/mm/tlbex_64.c
+++ b/arch/sh/mm/tlbex_64.c
@@ -44,6 +44,7 @@ static int handle_tlbmiss(unsigned long long protection_flags,
unsigned long address)
{
pgd_t *pgd;
+ p4d_t *p4d;
pud_t *pud;
pmd_t *pmd;
pte_t *pte;
@@ -58,7 +59,11 @@ static int handle_tlbmiss(unsigned long long protection_flags,
pgd = pgd_offset(current->mm, address);
}
- pud = pud_offset(pgd, address);
+ p4d = p4d_offset(pgd, address);
+ if (p4d_none(*p4d) || !p4d_present(*p4d))
+ return 1;
+
+ pud = pud_offset(p4d, address);
if (pud_none(*pud) || !pud_present(*pud))
return 1;
--
2.24.0
On Sun, Feb 16, 2020 at 10:18:30AM +0200, Mike Rapoport wrote:
> From: Mike Rapoport <[email protected]>
>
> Hi,
>
> These patches convert several architectures to use page table folding and
> remove __ARCH_HAS_5LEVEL_HACK along with include/asm-generic/5level-fixup.h.
>
> The changes are mostly about mechanical replacement of pgd accessors with p4d
> ones and the addition of higher levels to page table traversals.
>
> All the patches were sent separately to the respective arch lists and
> maintainers hence the "v2" prefix.
You fail to explain why this change, which adds 488 additional lines of
code, is desirable.
--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up
On 16/02/2020 at 09:18, Mike Rapoport wrote:
> From: Mike Rapoport <[email protected]>
>
> Implement primitives necessary for the 4th level folding, add walks of p4d
> level where appropriate and replace 5level-fixup.h with pgtable-nop4d.h.
I don't think it is worth adding all these additional walks of the p4d;
this patch could be limited to changes like:
- pud = pud_offset(pgd, gpa);
+ pud = pud_offset(p4d_offset(pgd, gpa), gpa);
The additional walks should be added through another patch the day
powerpc needs them.
See below for more comments.
>
> Signed-off-by: Mike Rapoport <[email protected]>
> Tested-by: Christophe Leroy <[email protected]> # 8xx and 83xx
> ---
> arch/powerpc/include/asm/book3s/32/pgtable.h | 1 -
> arch/powerpc/include/asm/book3s/64/hash.h | 4 +-
> arch/powerpc/include/asm/book3s/64/pgalloc.h | 4 +-
> arch/powerpc/include/asm/book3s/64/pgtable.h | 58 ++++++++++--------
> arch/powerpc/include/asm/book3s/64/radix.h | 6 +-
> arch/powerpc/include/asm/nohash/32/pgtable.h | 1 -
> arch/powerpc/include/asm/nohash/64/pgalloc.h | 2 +-
> .../include/asm/nohash/64/pgtable-4k.h | 32 +++++-----
> arch/powerpc/include/asm/nohash/64/pgtable.h | 6 +-
> arch/powerpc/include/asm/pgtable.h | 8 +++
> arch/powerpc/kvm/book3s_64_mmu_radix.c | 59 ++++++++++++++++---
> arch/powerpc/lib/code-patching.c | 7 ++-
> arch/powerpc/mm/book3s32/mmu.c | 2 +-
> arch/powerpc/mm/book3s32/tlb.c | 4 +-
> arch/powerpc/mm/book3s64/hash_pgtable.c | 4 +-
> arch/powerpc/mm/book3s64/radix_pgtable.c | 19 ++++--
> arch/powerpc/mm/book3s64/subpage_prot.c | 6 +-
> arch/powerpc/mm/hugetlbpage.c | 28 +++++----
> arch/powerpc/mm/kasan/kasan_init_32.c | 8 +--
> arch/powerpc/mm/mem.c | 4 +-
> arch/powerpc/mm/nohash/40x.c | 4 +-
> arch/powerpc/mm/nohash/book3e_pgtable.c | 15 +++--
> arch/powerpc/mm/pgtable.c | 25 +++++++-
> arch/powerpc/mm/pgtable_32.c | 28 +++++----
> arch/powerpc/mm/pgtable_64.c | 10 ++--
> arch/powerpc/mm/ptdump/hashpagetable.c | 20 ++++++-
> arch/powerpc/mm/ptdump/ptdump.c | 22 ++++++-
> arch/powerpc/xmon/xmon.c | 17 +++++-
> 28 files changed, 284 insertions(+), 120 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/book3s/32/pgtable.h b/arch/powerpc/include/asm/book3s/32/pgtable.h
> index 5b39c11e884a..39ec11371be0 100644
> --- a/arch/powerpc/include/asm/book3s/32/pgtable.h
> +++ b/arch/powerpc/include/asm/book3s/32/pgtable.h
> @@ -2,7 +2,6 @@
> #ifndef _ASM_POWERPC_BOOK3S_32_PGTABLE_H
> #define _ASM_POWERPC_BOOK3S_32_PGTABLE_H
>
> -#define __ARCH_USE_5LEVEL_HACK
> #include <asm-generic/pgtable-nopmd.h>
>
> #include <asm/book3s/32/hash.h>
> diff --git a/arch/powerpc/include/asm/book3s/64/hash.h b/arch/powerpc/include/asm/book3s/64/hash.h
> index 2781ebf6add4..876d1528c2cf 100644
> --- a/arch/powerpc/include/asm/book3s/64/hash.h
> +++ b/arch/powerpc/include/asm/book3s/64/hash.h
> @@ -134,9 +134,9 @@ static inline int get_region_id(unsigned long ea)
>
> #define hash__pmd_bad(pmd) (pmd_val(pmd) & H_PMD_BAD_BITS)
> #define hash__pud_bad(pud) (pud_val(pud) & H_PUD_BAD_BITS)
> -static inline int hash__pgd_bad(pgd_t pgd)
> +static inline int hash__p4d_bad(p4d_t p4d)
> {
> - return (pgd_val(pgd) == 0);
> + return (p4d_val(p4d) == 0);
> }
> #ifdef CONFIG_STRICT_KERNEL_RWX
> extern void hash__mark_rodata_ro(void);
> diff --git a/arch/powerpc/include/asm/book3s/64/pgalloc.h b/arch/powerpc/include/asm/book3s/64/pgalloc.h
> index a41e91bd0580..69c5b051734f 100644
> --- a/arch/powerpc/include/asm/book3s/64/pgalloc.h
> +++ b/arch/powerpc/include/asm/book3s/64/pgalloc.h
> @@ -85,9 +85,9 @@ static inline void pgd_free(struct mm_struct *mm, pgd_t *pgd)
> kmem_cache_free(PGT_CACHE(PGD_INDEX_SIZE), pgd);
> }
>
> -static inline void pgd_populate(struct mm_struct *mm, pgd_t *pgd, pud_t *pud)
> +static inline void p4d_populate(struct mm_struct *mm, p4d_t *pgd, pud_t *pud)
> {
> - *pgd = __pgd(__pgtable_ptr_val(pud) | PGD_VAL_BITS);
> + *pgd = __p4d(__pgtable_ptr_val(pud) | PGD_VAL_BITS);
> }
>
> static inline pud_t *pud_alloc_one(struct mm_struct *mm, unsigned long addr)
> diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
> index 201a69e6a355..ddddbafff0ab 100644
> --- a/arch/powerpc/include/asm/book3s/64/pgtable.h
> +++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
> @@ -2,7 +2,7 @@
> #ifndef _ASM_POWERPC_BOOK3S_64_PGTABLE_H_
> #define _ASM_POWERPC_BOOK3S_64_PGTABLE_H_
>
> -#include <asm-generic/5level-fixup.h>
> +#include <asm-generic/pgtable-nop4d.h>
>
> #ifndef __ASSEMBLY__
> #include <linux/mmdebug.h>
> @@ -251,7 +251,7 @@ extern unsigned long __pmd_frag_size_shift;
> /* Bits to mask out from a PUD to get to the PMD page */
> #define PUD_MASKED_BITS 0xc0000000000000ffUL
> /* Bits to mask out from a PGD to get to the PUD page */
> -#define PGD_MASKED_BITS 0xc0000000000000ffUL
> +#define P4D_MASKED_BITS 0xc0000000000000ffUL
>
> /*
> * Used as an indicator for rcu callback functions
> @@ -949,54 +949,60 @@ static inline bool pud_access_permitted(pud_t pud, bool write)
> return pte_access_permitted(pud_pte(pud), write);
> }
>
> -#define pgd_write(pgd) pte_write(pgd_pte(pgd))
> +#define __p4d_raw(x) ((p4d_t) { __pgd_raw(x) })
> +static inline __be64 p4d_raw(p4d_t x)
> +{
> + return pgd_raw(x.pgd);
> +}
> +
Shouldn't this be defined in asm/pgtable-be-types.h, just like the other
__pxx_raw()?
> +#define p4d_write(p4d) pte_write(p4d_pte(p4d))
>
> -static inline void pgd_clear(pgd_t *pgdp)
> +static inline void p4d_clear(p4d_t *p4dp)
> {
> - *pgdp = __pgd(0);
> + *p4dp = __p4d(0);
> }
>
> -static inline int pgd_none(pgd_t pgd)
> +static inline int p4d_none(p4d_t p4d)
> {
> - return !pgd_raw(pgd);
> + return !p4d_raw(p4d);
> }
>
> -static inline int pgd_present(pgd_t pgd)
> +static inline int p4d_present(p4d_t p4d)
> {
> - return !!(pgd_raw(pgd) & cpu_to_be64(_PAGE_PRESENT));
> + return !!(p4d_raw(p4d) & cpu_to_be64(_PAGE_PRESENT));
> }
>
> -static inline pte_t pgd_pte(pgd_t pgd)
> +static inline pte_t p4d_pte(p4d_t p4d)
> {
> - return __pte_raw(pgd_raw(pgd));
> + return __pte_raw(p4d_raw(p4d));
> }
>
> -static inline pgd_t pte_pgd(pte_t pte)
> +static inline p4d_t pte_p4d(pte_t pte)
> {
> - return __pgd_raw(pte_raw(pte));
> + return __p4d_raw(pte_raw(pte));
> }
>
> -static inline int pgd_bad(pgd_t pgd)
> +static inline int p4d_bad(p4d_t p4d)
> {
> if (radix_enabled())
> - return radix__pgd_bad(pgd);
> - return hash__pgd_bad(pgd);
> + return radix__p4d_bad(p4d);
> + return hash__p4d_bad(p4d);
> }
>
> -#define pgd_access_permitted pgd_access_permitted
> -static inline bool pgd_access_permitted(pgd_t pgd, bool write)
> +#define p4d_access_permitted p4d_access_permitted
> +static inline bool p4d_access_permitted(p4d_t p4d, bool write)
> {
> - return pte_access_permitted(pgd_pte(pgd), write);
> + return pte_access_permitted(p4d_pte(p4d), write);
> }
>
> -extern struct page *pgd_page(pgd_t pgd);
> +extern struct page *p4d_page(p4d_t p4d);
>
> /* Pointers in the page table tree are physical addresses */
> #define __pgtable_ptr_val(ptr) __pa(ptr)
>
> #define pmd_page_vaddr(pmd) __va(pmd_val(pmd) & ~PMD_MASKED_BITS)
> #define pud_page_vaddr(pud) __va(pud_val(pud) & ~PUD_MASKED_BITS)
> -#define pgd_page_vaddr(pgd) __va(pgd_val(pgd) & ~PGD_MASKED_BITS)
> +#define p4d_page_vaddr(p4d) __va(p4d_val(p4d) & ~P4D_MASKED_BITS)
>
> #define pgd_index(address) (((address) >> (PGDIR_SHIFT)) & (PTRS_PER_PGD - 1))
> #define pud_index(address) (((address) >> (PUD_SHIFT)) & (PTRS_PER_PUD - 1))
> @@ -1010,8 +1016,8 @@ extern struct page *pgd_page(pgd_t pgd);
>
> #define pgd_offset(mm, address) ((mm)->pgd + pgd_index(address))
>
> -#define pud_offset(pgdp, addr) \
> - (((pud_t *) pgd_page_vaddr(*(pgdp))) + pud_index(addr))
> +#define pud_offset(p4dp, addr) \
> + (((pud_t *) p4d_page_vaddr(*(p4dp))) + pud_index(addr))
> #define pmd_offset(pudp,addr) \
> (((pmd_t *) pud_page_vaddr(*(pudp))) + pmd_index(addr))
> #define pte_offset_kernel(dir,addr) \
> @@ -1368,6 +1374,12 @@ static inline bool pud_is_leaf(pud_t pud)
> return !!(pud_raw(pud) & cpu_to_be64(_PAGE_PTE));
> }
>
> +#define p4d_is_leaf p4d_is_leaf
> +static inline bool p4d_is_leaf(p4d_t p4d)
> +{
> + return !!(p4d_raw(p4d) & cpu_to_be64(_PAGE_PTE));
> +}
> +
> #define pgd_is_leaf pgd_is_leaf
> #define pgd_leaf pgd_is_leaf
> static inline bool pgd_is_leaf(pgd_t pgd)
[...]
> diff --git a/arch/powerpc/include/asm/pgtable.h b/arch/powerpc/include/asm/pgtable.h
> index 8cc543ed114c..0a05fddd7881 100644
> --- a/arch/powerpc/include/asm/pgtable.h
> +++ b/arch/powerpc/include/asm/pgtable.h
> @@ -139,6 +139,14 @@ static inline bool pud_is_leaf(pud_t pud)
> }
> #endif
>
> +#ifndef p4d_is_leaf
> +#define p4d_is_leaf p4d_is_leaf
> +static inline bool p4d_is_leaf(p4d_t p4d)
> +{
> + return false;
> +}
> +#endif
> +
> #ifndef pgd_is_leaf
> #define pgd_is_leaf pgd_is_leaf
> static inline bool pgd_is_leaf(pgd_t pgd)
> diff --git a/arch/powerpc/kvm/book3s_64_mmu_radix.c b/arch/powerpc/kvm/book3s_64_mmu_radix.c
> index 803940d79b73..5aacfa0b27ef 100644
> --- a/arch/powerpc/kvm/book3s_64_mmu_radix.c
> +++ b/arch/powerpc/kvm/book3s_64_mmu_radix.c
> @@ -494,17 +494,39 @@ static void kvmppc_unmap_free_pud(struct kvm *kvm, pud_t *pud,
> pud_free(kvm->mm, pud);
> }
>
> +static void kvmppc_unmap_free_p4d(struct kvm *kvm, p4d_t *p4d,
> + unsigned int lpid)
> +{
> + unsigned long iu;
> + p4d_t *p = p4d;
> +
> + for (iu = 0; iu < PTRS_PER_P4D; ++iu, ++p) {
> + if (!p4d_present(*p))
> + continue;
> + if (p4d_is_leaf(*p)) {
> + p4d_clear(p);
> + } else {
> + pud_t *pud;
> +
> + pud = pud_offset(p, 0);
> + kvmppc_unmap_free_pud(kvm, pud, lpid);
> + p4d_clear(p);
> + }
> + }
> + p4d_free(kvm->mm, p4d);
> +}
> +
> void kvmppc_free_pgtable_radix(struct kvm *kvm, pgd_t *pgd, unsigned int lpid)
> {
> unsigned long ig;
>
> for (ig = 0; ig < PTRS_PER_PGD; ++ig, ++pgd) {
> - pud_t *pud;
> + p4d_t *p4d;
>
> if (!pgd_present(*pgd))
> continue;
> - pud = pud_offset(pgd, 0);
> - kvmppc_unmap_free_pud(kvm, pud, lpid);
> + p4d = p4d_offset(pgd, 0);
> + kvmppc_unmap_free_p4d(kvm, p4d, lpid);
> pgd_clear(pgd);
> }
> }
> @@ -566,6 +588,7 @@ int kvmppc_create_pte(struct kvm *kvm, pgd_t *pgtable, pte_t pte,
> unsigned long *rmapp, struct rmap_nested **n_rmap)
> {
> pgd_t *pgd;
> + p4d_t *p4d, *new_p4d = NULL;
> pud_t *pud, *new_pud = NULL;
> pmd_t *pmd, *new_pmd = NULL;
> pte_t *ptep, *new_ptep = NULL;
> @@ -573,9 +596,15 @@ int kvmppc_create_pte(struct kvm *kvm, pgd_t *pgtable, pte_t pte,
>
> /* Traverse the guest's 2nd-level tree, allocate new levels needed */
> pgd = pgtable + pgd_index(gpa);
> - pud = NULL;
> + p4d = NULL;
> if (pgd_present(*pgd))
> - pud = pud_offset(pgd, gpa);
> + p4d = p4d_offset(pgd, gpa);
> + else
> + new_p4d = p4d_alloc_one(kvm->mm, gpa);
> +
> + pud = NULL;
> + if (p4d_present(*p4d))
> + pud = pud_offset(p4d, gpa);
Is it worth adding all this new code?
My understanding is that the series' objective is to get rid of
__ARCH_HAS_5LEVEL_HACK, not to add support for 5 levels on an
architecture that does not need it (at least for now).
If we want to add support for 5 levels, it can be done later in another
patch.
Here I think your change could be limited to:
- pud = pud_offset(pgd, gpa);
+ pud = pud_offset(p4d_offset(pgd, gpa), gpa);
> else
> new_pud = pud_alloc_one(kvm->mm, gpa);
>
> @@ -597,12 +626,18 @@ int kvmppc_create_pte(struct kvm *kvm, pgd_t *pgtable, pte_t pte,
> /* Now traverse again under the lock and change the tree */
> ret = -ENOMEM;
> if (pgd_none(*pgd)) {
> + if (!new_p4d)
> + goto out_unlock;
> + pgd_populate(kvm->mm, pgd, new_p4d);
> + new_p4d = NULL;
> + }
> + if (p4d_none(*p4d)) {
> if (!new_pud)
> goto out_unlock;
> - pgd_populate(kvm->mm, pgd, new_pud);
> + p4d_populate(kvm->mm, p4d, new_pud);
> new_pud = NULL;
> }
> - pud = pud_offset(pgd, gpa);
> + pud = pud_offset(p4d, gpa);
> if (pud_is_leaf(*pud)) {
> unsigned long hgpa = gpa & PUD_MASK;
>
> @@ -1220,6 +1255,7 @@ static ssize_t debugfs_radix_read(struct file *file, char __user *buf,
> pgd_t *pgt;
> struct kvm_nested_guest *nested;
> pgd_t pgd, *pgdp;
> + p4d_t p4d, *p4dp;
> pud_t pud, *pudp;
> pmd_t pmd, *pmdp;
> pte_t *ptep;
> @@ -1298,7 +1334,14 @@ static ssize_t debugfs_radix_read(struct file *file, char __user *buf,
> continue;
> }
>
> - pudp = pud_offset(&pgd, gpa);
> + p4dp = p4d_offset(&pgd, gpa);
> + p4d = READ_ONCE(*p4dp);
> + if (!(p4d_val(p4d) & _PAGE_PRESENT)) {
> + gpa = (gpa & P4D_MASK) + P4D_SIZE;
> + continue;
> + }
> +
> + pudp = pud_offset(&p4d, gpa);
Same comment here: you are forcing a useless read with READ_ONCE().
Your change could be limited to
- pudp = pud_offset(&pgd, gpa);
+ pudp = pud_offset(p4d_offset(&pgd, gpa), gpa);
This comment applies to many other places.
> pud = READ_ONCE(*pudp);
> if (!(pud_val(pud) & _PAGE_PRESENT)) {
> gpa = (gpa & PUD_MASK) + PUD_SIZE;
> diff --git a/arch/powerpc/lib/code-patching.c b/arch/powerpc/lib/code-patching.c
> index 3345f039a876..7a59f6863cec 100644
> --- a/arch/powerpc/lib/code-patching.c
> +++ b/arch/powerpc/lib/code-patching.c
> @@ -107,13 +107,18 @@ static inline int unmap_patch_area(unsigned long addr)
> pte_t *ptep;
> pmd_t *pmdp;
> pud_t *pudp;
> + p4d_t *p4dp;
> pgd_t *pgdp;
>
> pgdp = pgd_offset_k(addr);
> if (unlikely(!pgdp))
> return -EINVAL;
>
> - pudp = pud_offset(pgdp, addr);
> + p4dp = p4d_offset(pgdp, addr);
> + if (unlikely(!p4dp))
> + return -EINVAL;
> +
> + pudp = pud_offset(p4dp, addr);
> if (unlikely(!pudp))
> return -EINVAL;
>
> diff --git a/arch/powerpc/mm/book3s32/mmu.c b/arch/powerpc/mm/book3s32/mmu.c
> index 0a1c65a2c565..b2fc3e71165c 100644
> --- a/arch/powerpc/mm/book3s32/mmu.c
> +++ b/arch/powerpc/mm/book3s32/mmu.c
> @@ -312,7 +312,7 @@ void hash_preload(struct mm_struct *mm, unsigned long ea)
>
> if (!Hash)
> return;
> - pmd = pmd_offset(pud_offset(pgd_offset(mm, ea), ea), ea);
> + pmd = pmd_offset(pud_offset(p4d_offset(pgd_offset(mm, ea), ea), ea), ea);
If we continue like this, in ten years this line is going to be many
kilometers long.
I think the above would be worth a generic helper.
> if (!pmd_none(*pmd))
> add_hash_page(mm->context.id, ea, pmd_val(*pmd));
> }
> diff --git a/arch/powerpc/mm/book3s32/tlb.c b/arch/powerpc/mm/book3s32/tlb.c
> index 2fcd321040ff..175bc33b41b7 100644
> --- a/arch/powerpc/mm/book3s32/tlb.c
> +++ b/arch/powerpc/mm/book3s32/tlb.c
> @@ -87,7 +87,7 @@ static void flush_range(struct mm_struct *mm, unsigned long start,
> if (start >= end)
> return;
> end = (end - 1) | ~PAGE_MASK;
> - pmd = pmd_offset(pud_offset(pgd_offset(mm, start), start), start);
> + pmd = pmd_offset(pud_offset(p4d_offset(pgd_offset(mm, start), start), start), start);
> for (;;) {
> pmd_end = ((start + PGDIR_SIZE) & PGDIR_MASK) - 1;
> if (pmd_end > end)
> @@ -145,7 +145,7 @@ void flush_tlb_page(struct vm_area_struct *vma, unsigned long vmaddr)
> return;
> }
> mm = (vmaddr < TASK_SIZE)? vma->vm_mm: &init_mm;
> - pmd = pmd_offset(pud_offset(pgd_offset(mm, vmaddr), vmaddr), vmaddr);
> + pmd = pmd_offset(pud_offset(p4d_offset(pgd_offset(mm, vmaddr), vmaddr), vmaddr), vmaddr);
> if (!pmd_none(*pmd))
> flush_hash_pages(mm->context.id, vmaddr, pmd_val(*pmd), 1);
> }
> diff --git a/arch/powerpc/mm/book3s64/hash_pgtable.c b/arch/powerpc/mm/book3s64/hash_pgtable.c
> index 64733b9cb20a..9cd15937e88a 100644
> --- a/arch/powerpc/mm/book3s64/hash_pgtable.c
> +++ b/arch/powerpc/mm/book3s64/hash_pgtable.c
> @@ -148,6 +148,7 @@ void hash__vmemmap_remove_mapping(unsigned long start,
> int hash__map_kernel_page(unsigned long ea, unsigned long pa, pgprot_t prot)
> {
> pgd_t *pgdp;
> + p4d_t *p4dp;
> pud_t *pudp;
> pmd_t *pmdp;
> pte_t *ptep;
> @@ -155,7 +156,8 @@ int hash__map_kernel_page(unsigned long ea, unsigned long pa, pgprot_t prot)
> BUILD_BUG_ON(TASK_SIZE_USER64 > H_PGTABLE_RANGE);
> if (slab_is_available()) {
> pgdp = pgd_offset_k(ea);
> - pudp = pud_alloc(&init_mm, pgdp, ea);
> + p4dp = p4d_offset(pgdp, ea);
> + pudp = pud_alloc(&init_mm, p4dp, ea);
Could be a single line, without a new var.
- pudp = pud_alloc(&init_mm, pgdp, ea);
+ pudp = pud_alloc(&init_mm, p4d_offset(pgdp, ea), ea);
Same kind of comments as already done apply to the rest.
Christophe
On 16/02/2020 at 09:22, Russell King - ARM Linux admin wrote:
> On Sun, Feb 16, 2020 at 10:18:30AM +0200, Mike Rapoport wrote:
>> From: Mike Rapoport <[email protected]>
>>
>> Hi,
>>
>> These patches convert several architectures to use page table folding and
>> remove __ARCH_HAS_5LEVEL_HACK along with include/asm-generic/5level-fixup.h.
>>
>> The changes are mostly about mechanical replacement of pgd accessors with p4d
>> ones and the addition of higher levels to page table traversals.
>>
>> All the patches were sent separately to the respective arch lists and
>> maintainers hence the "v2" prefix.
>
> You fail to explain why this change, which adds 488 additional lines of
> code, is desirable.
>
The purpose of the series, i.e. dropping a HACK, is worth it.
However, looking at the powerpc patch I have the feeling that this series
goes beyond its purpose.
The number of additional lines could be greatly reduced, I think, if we
limit the patches to the strict minimum, i.e. just do things like below
instead of adding lots of handling of useless levels.
Instead of doing things like:
- pud = NULL;
+ p4d = NULL;
if (pgd_present(*pgd))
- pud = pud_offset(pgd, gpa);
+ p4d = p4d_offset(pgd, gpa);
+ else
+ new_p4d = p4d_alloc_one(kvm->mm, gpa);
+
+ pud = NULL;
+ if (p4d_present(*p4d))
+ pud = pud_offset(p4d, gpa);
else
new_pud = pud_alloc_one(kvm->mm, gpa);
It could be limited to:
if (pgd_present(*pgd))
- pud = pud_offset(pgd, gpa);
+ pud = pud_offset(p4d_offset(pgd, gpa), gpa);
else
new_pud = pud_alloc_one(kvm->mm, gpa);
Christophe
On Sun, Feb 16, 2020 at 08:22:30AM +0000, Russell King - ARM Linux admin wrote:
> On Sun, Feb 16, 2020 at 10:18:30AM +0200, Mike Rapoport wrote:
> > From: Mike Rapoport <[email protected]>
> >
> > Hi,
> >
> > These patches convert several architectures to use page table folding and
> > remove __ARCH_HAS_5LEVEL_HACK along with include/asm-generic/5level-fixup.h.
> >
> > The changes are mostly about mechanical replacement of pgd accessors with p4d
> > ones and the addition of higher levels to page table traversals.
> >
> > All the patches were sent separately to the respective arch lists and
> > maintainers hence the "v2" prefix.
>
> You fail to explain why this change, which adds 488 additional lines of
> code, is desirable.
Right, I should have been more explicit about it.
As Christophe mentioned in his reply, removing 'HACK' and 'fixup' is an
improvement.
Another thing is that when all architectures behave the same, it opens
opportunities for cleaning up the repeated definitions of page table
manipulation primitives.
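As an illustration of the kind of cleanup this enables: the powerpc patch
in this thread adds a fallback like the one below to
arch/powerpc/include/asm/pgtable.h; once every architecture folds the p4d
level the same way, such a default could hypothetically move into a single
generic header instead of being repeated per architecture (a sketch, not
part of this series):

/*
 * Sketch only: a default for architectures that have no leaf entries
 * at the p4d level; today each arch carries its own copy.
 */
#ifndef p4d_is_leaf
#define p4d_is_leaf p4d_is_leaf
static inline bool p4d_is_leaf(p4d_t p4d)
{
	return false;
}
#endif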
> --
> RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
> FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
> According to speedtest.net: 11.9Mbps down 500kbps up
--
Sincerely yours,
Mike.
On Sun, Feb 16, 2020 at 11:41:07AM +0100, Christophe Leroy wrote:
>
>
> On 16/02/2020 at 09:18, Mike Rapoport wrote:
> > From: Mike Rapoport <[email protected]>
> >
> > Implement primitives necessary for the 4th level folding, add walks of p4d
> > level where appropriate and replace 5level-fixup.h with pgtable-nop4d.h.
>
> I don't think it is worth adding all these additional walks of the p4d; this
> patch could be limited to changes like:
>
> - pud = pud_offset(pgd, gpa);
> + pud = pud_offset(p4d_offset(pgd, gpa), gpa);
>
> The additional walks should be added through another patch the day powerpc
> needs them.
Ok, I'll update the patch to reduce walking the p4d.
> See below for more comments.
>
> >
> > Signed-off-by: Mike Rapoport <[email protected]>
> > Tested-by: Christophe Leroy <[email protected]> # 8xx and 83xx
> > ---
...
> > diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
> > index 201a69e6a355..ddddbafff0ab 100644
> > --- a/arch/powerpc/include/asm/book3s/64/pgtable.h
> > +++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
> > @@ -2,7 +2,7 @@
> > #ifndef _ASM_POWERPC_BOOK3S_64_PGTABLE_H_
> > #define _ASM_POWERPC_BOOK3S_64_PGTABLE_H_
> > -#include <asm-generic/5level-fixup.h>
> > +#include <asm-generic/pgtable-nop4d.h>
> > #ifndef __ASSEMBLY__
> > #include <linux/mmdebug.h>
> > @@ -251,7 +251,7 @@ extern unsigned long __pmd_frag_size_shift;
> > /* Bits to mask out from a PUD to get to the PMD page */
> > #define PUD_MASKED_BITS 0xc0000000000000ffUL
> > /* Bits to mask out from a PGD to get to the PUD page */
> > -#define PGD_MASKED_BITS 0xc0000000000000ffUL
> > +#define P4D_MASKED_BITS 0xc0000000000000ffUL
> > /*
> > * Used as an indicator for rcu callback functions
> > @@ -949,54 +949,60 @@ static inline bool pud_access_permitted(pud_t pud, bool write)
> > return pte_access_permitted(pud_pte(pud), write);
> > }
> > -#define pgd_write(pgd) pte_write(pgd_pte(pgd))
> > +#define __p4d_raw(x) ((p4d_t) { __pgd_raw(x) })
> > +static inline __be64 p4d_raw(p4d_t x)
> > +{
> > + return pgd_raw(x.pgd);
> > +}
> > +
>
> Shouldn't this be defined in asm/pgtable-be-types.h, just like the other
> __pxx_raw()?
Ideally yes, but this creates weird header file dependencies and untangling
them would generate way too much churn.
> > +#define p4d_write(p4d) pte_write(p4d_pte(p4d))
> > -static inline void pgd_clear(pgd_t *pgdp)
> > +static inline void p4d_clear(p4d_t *p4dp)
> > {
> > - *pgdp = __pgd(0);
> > + *p4dp = __p4d(0);
> > }
...
> > @@ -573,9 +596,15 @@ int kvmppc_create_pte(struct kvm *kvm, pgd_t *pgtable, pte_t pte,
> > /* Traverse the guest's 2nd-level tree, allocate new levels needed */
> > pgd = pgtable + pgd_index(gpa);
> > - pud = NULL;
> > + p4d = NULL;
> > if (pgd_present(*pgd))
> > - pud = pud_offset(pgd, gpa);
> > + p4d = p4d_offset(pgd, gpa);
> > + else
> > + new_p4d = p4d_alloc_one(kvm->mm, gpa);
> > +
> > + pud = NULL;
> > + if (p4d_present(*p4d))
> > + pud = pud_offset(p4d, gpa);
>
> Is it worth adding all this new code?
>
> My understanding is that the series' objective is to get rid of
> __ARCH_HAS_5LEVEL_HACK, not to add support for 5 levels on an architecture
> that does not need it (at least for now).
> If we want to add support for 5 levels, it can be done later in another
> patch.
>
> Here I think your change could be limited to:
>
> - pud = pud_offset(pgd, gpa);
> + pud = pud_offset(p4d_offset(pgd, gpa), gpa);
This won't work. Without __ARCH_USE_5LEVEL_HACK defined, pgd_present() is
hardwired to 1 and the actual check for the top level is performed with
p4d_present(). The 'else' clause that allocates the p4d will never be taken
and could be removed, but I prefer to keep it for consistency.
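For reference, a simplified sketch of the folding done by
<asm-generic/pgtable-nop4d.h> (trimmed from the generic header): the p4d
level collapses onto the pgd, the pgd-level predicates become constants,
and the real top-level checks move down to the p4d accessors:

/* Simplified from include/asm-generic/pgtable-nop4d.h */
typedef struct { pgd_t pgd; } p4d_t;

static inline int pgd_none(pgd_t pgd)		{ return 0; }
static inline int pgd_bad(pgd_t pgd)		{ return 0; }
static inline int pgd_present(pgd_t pgd)	{ return 1; }

static inline p4d_t *p4d_offset(pgd_t *pgd, unsigned long address)
{
	/* With the level folded, the p4d entry is the pgd entry itself. */
	return (p4d_t *)pgd;
}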
> > else
> > new_pud = pud_alloc_one(kvm->mm, gpa);
> > @@ -597,12 +626,18 @@ int kvmppc_create_pte(struct kvm *kvm, pgd_t *pgtable, pte_t pte,
> > /* Now traverse again under the lock and change the tree */
> > ret = -ENOMEM;
> > if (pgd_none(*pgd)) {
> > + if (!new_p4d)
> > + goto out_unlock;
> > + pgd_populate(kvm->mm, pgd, new_p4d);
> > + new_p4d = NULL;
> > + }
> > + if (p4d_none(*p4d)) {
> > if (!new_pud)
> > goto out_unlock;
> > - pgd_populate(kvm->mm, pgd, new_pud);
> > + p4d_populate(kvm->mm, p4d, new_pud);
> > new_pud = NULL;
> > }
> > - pud = pud_offset(pgd, gpa);
> > + pud = pud_offset(p4d, gpa);
> > if (pud_is_leaf(*pud)) {
> > unsigned long hgpa = gpa & PUD_MASK;
> > @@ -1220,6 +1255,7 @@ static ssize_t debugfs_radix_read(struct file *file, char __user *buf,
> > pgd_t *pgt;
> > struct kvm_nested_guest *nested;
> > pgd_t pgd, *pgdp;
> > + p4d_t p4d, *p4dp;
> > pud_t pud, *pudp;
> > pmd_t pmd, *pmdp;
> > pte_t *ptep;
> > @@ -1298,7 +1334,14 @@ static ssize_t debugfs_radix_read(struct file *file, char __user *buf,
> > continue;
> > }
> > - pudp = pud_offset(&pgd, gpa);
> > + p4dp = p4d_offset(&pgd, gpa);
> > + p4d = READ_ONCE(*p4dp);
> > + if (!(p4d_val(p4d) & _PAGE_PRESENT)) {
> > + gpa = (gpa & P4D_MASK) + P4D_SIZE;
> > + continue;
> > + }
> > +
> > + pudp = pud_offset(&p4d, gpa);
>
> Same comment here: you are forcing a useless read with READ_ONCE().
>
> Your change could be limited to
>
> - pudp = pud_offset(&pgd, gpa);
> + pudp = pud_offset(p4d_offset(&pgd, gpa), gpa);
Here again the actual check must be done against the p4d rather than the
pgd. We could skip READ_ONCE() for the pgd, but since this is a debugfs
method I don't think the saving is more important than code consistency.
> This comment applies to many other places.
I'll make another pass to see where we can take the shortcut and use
pudp = pud_offset(p4d_offset(...))
> > pud = READ_ONCE(*pudp);
> > if (!(pud_val(pud) & _PAGE_PRESENT)) {
> > gpa = (gpa & PUD_MASK) + PUD_SIZE;
> > diff --git a/arch/powerpc/lib/code-patching.c b/arch/powerpc/lib/code-patching.c
> > index 3345f039a876..7a59f6863cec 100644
> > --- a/arch/powerpc/lib/code-patching.c
> > +++ b/arch/powerpc/lib/code-patching.c
> > @@ -107,13 +107,18 @@ static inline int unmap_patch_area(unsigned long addr)
> > pte_t *ptep;
> > pmd_t *pmdp;
> > pud_t *pudp;
> > + p4d_t *p4dp;
> > pgd_t *pgdp;
> > pgdp = pgd_offset_k(addr);
> > if (unlikely(!pgdp))
> > return -EINVAL;
> > - pudp = pud_offset(pgdp, addr);
> > + p4dp = p4d_offset(pgdp, addr);
> > + if (unlikely(!p4dp))
> > + return -EINVAL;
> > +
> > + pudp = pud_offset(p4dp, addr);
> > if (unlikely(!pudp))
> > return -EINVAL;
> > diff --git a/arch/powerpc/mm/book3s32/mmu.c b/arch/powerpc/mm/book3s32/mmu.c
> > index 0a1c65a2c565..b2fc3e71165c 100644
> > --- a/arch/powerpc/mm/book3s32/mmu.c
> > +++ b/arch/powerpc/mm/book3s32/mmu.c
> > @@ -312,7 +312,7 @@ void hash_preload(struct mm_struct *mm, unsigned long ea)
> > if (!Hash)
> > return;
> > - pmd = pmd_offset(pud_offset(pgd_offset(mm, ea), ea), ea);
> > + pmd = pmd_offset(pud_offset(p4d_offset(pgd_offset(mm, ea), ea), ea), ea);
>
> If we continue like this, in ten years this line is going to be many
> kilometers long.
>
> I think the above would be worth a generic helper.
Agreed. My plan was to first unify all the architectures and then start
introducing generic helpers, e.g. pmd_offset_mm().
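A minimal sketch of what such a helper might look like, using the
pmd_offset_mm() name mentioned above (hypothetical at the time of this
thread, shown only to illustrate the shape of the cleanup):

/* Hypothetical helper: walk from an mm down to the PMD entry for ea. */
static inline pmd_t *pmd_offset_mm(struct mm_struct *mm, unsigned long ea)
{
	pgd_t *pgd = pgd_offset(mm, ea);
	p4d_t *p4d = p4d_offset(pgd, ea);
	pud_t *pud = pud_offset(p4d, ea);

	return pmd_offset(pud, ea);
}

With that, the hash_preload() line quoted above would reduce to
pmd = pmd_offset_mm(mm, ea);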
> > if (!pmd_none(*pmd))
> > add_hash_page(mm->context.id, ea, pmd_val(*pmd));
> > }
> > diff --git a/arch/powerpc/mm/book3s32/tlb.c b/arch/powerpc/mm/book3s32/tlb.c
> > index 2fcd321040ff..175bc33b41b7 100644
> > --- a/arch/powerpc/mm/book3s32/tlb.c
> > +++ b/arch/powerpc/mm/book3s32/tlb.c
> > @@ -87,7 +87,7 @@ static void flush_range(struct mm_struct *mm, unsigned long start,
> > if (start >= end)
> > return;
> > end = (end - 1) | ~PAGE_MASK;
> > - pmd = pmd_offset(pud_offset(pgd_offset(mm, start), start), start);
> > + pmd = pmd_offset(pud_offset(p4d_offset(pgd_offset(mm, start), start), start), start);
> > for (;;) {
> > pmd_end = ((start + PGDIR_SIZE) & PGDIR_MASK) - 1;
> > if (pmd_end > end)
> > @@ -145,7 +145,7 @@ void flush_tlb_page(struct vm_area_struct *vma, unsigned long vmaddr)
> > return;
> > }
> > mm = (vmaddr < TASK_SIZE)? vma->vm_mm: &init_mm;
> > - pmd = pmd_offset(pud_offset(pgd_offset(mm, vmaddr), vmaddr), vmaddr);
> > + pmd = pmd_offset(pud_offset(p4d_offset(pgd_offset(mm, vmaddr), vmaddr), vmaddr), vmaddr);
> > if (!pmd_none(*pmd))
> > flush_hash_pages(mm->context.id, vmaddr, pmd_val(*pmd), 1);
> > }
> > diff --git a/arch/powerpc/mm/book3s64/hash_pgtable.c b/arch/powerpc/mm/book3s64/hash_pgtable.c
> > index 64733b9cb20a..9cd15937e88a 100644
> > --- a/arch/powerpc/mm/book3s64/hash_pgtable.c
> > +++ b/arch/powerpc/mm/book3s64/hash_pgtable.c
> > @@ -148,6 +148,7 @@ void hash__vmemmap_remove_mapping(unsigned long start,
> > int hash__map_kernel_page(unsigned long ea, unsigned long pa, pgprot_t prot)
> > {
> > pgd_t *pgdp;
> > + p4d_t *p4dp;
> > pud_t *pudp;
> > pmd_t *pmdp;
> > pte_t *ptep;
> > @@ -155,7 +156,8 @@ int hash__map_kernel_page(unsigned long ea, unsigned long pa, pgprot_t prot)
> > BUILD_BUG_ON(TASK_SIZE_USER64 > H_PGTABLE_RANGE);
> > if (slab_is_available()) {
> > pgdp = pgd_offset_k(ea);
> > - pudp = pud_alloc(&init_mm, pgdp, ea);
> > + p4dp = p4d_offset(pgdp, ea);
> > + pudp = pud_alloc(&init_mm, p4dp, ea);
>
> Could be a single line, without a new var.
>
> - pudp = pud_alloc(&init_mm, pgdp, ea);
> + pudp = pud_alloc(&init_mm, p4d_offset(pgdp, ea), ea);
>
>
> Same kind of comments as already done apply to the rest.
>
> Christophe
--
Sincerely yours,
Mike.
On 16/02/2020 at 09:18, Mike Rapoport wrote:
> From: Mike Rapoport <[email protected]>
>
> Implement primitives necessary for the 4th level folding, add walks of p4d
> level where appropriate and replace 5level-fixup.h with pgtable-nop4d.h.
>
> Signed-off-by: Mike Rapoport <[email protected]>
> Tested-by: Christophe Leroy <[email protected]> # 8xx and 83xx
> ---
> arch/powerpc/include/asm/book3s/32/pgtable.h | 1 -
> arch/powerpc/include/asm/book3s/64/hash.h | 4 +-
> arch/powerpc/include/asm/book3s/64/pgalloc.h | 4 +-
> arch/powerpc/include/asm/book3s/64/pgtable.h | 58 ++++++++++--------
> arch/powerpc/include/asm/book3s/64/radix.h | 6 +-
> arch/powerpc/include/asm/nohash/32/pgtable.h | 1 -
> arch/powerpc/include/asm/nohash/64/pgalloc.h | 2 +-
> .../include/asm/nohash/64/pgtable-4k.h | 32 +++++-----
> arch/powerpc/include/asm/nohash/64/pgtable.h | 6 +-
> arch/powerpc/include/asm/pgtable.h | 8 +++
> arch/powerpc/kvm/book3s_64_mmu_radix.c | 59 ++++++++++++++++---
> arch/powerpc/lib/code-patching.c | 7 ++-
> arch/powerpc/mm/book3s32/mmu.c | 2 +-
> arch/powerpc/mm/book3s32/tlb.c | 4 +-
> arch/powerpc/mm/book3s64/hash_pgtable.c | 4 +-
> arch/powerpc/mm/book3s64/radix_pgtable.c | 19 ++++--
> arch/powerpc/mm/book3s64/subpage_prot.c | 6 +-
> arch/powerpc/mm/hugetlbpage.c | 28 +++++----
> arch/powerpc/mm/kasan/kasan_init_32.c | 8 +--
> arch/powerpc/mm/mem.c | 4 +-
> arch/powerpc/mm/nohash/40x.c | 4 +-
> arch/powerpc/mm/nohash/book3e_pgtable.c | 15 +++--
> arch/powerpc/mm/pgtable.c | 25 +++++++-
> arch/powerpc/mm/pgtable_32.c | 28 +++++----
> arch/powerpc/mm/pgtable_64.c | 10 ++--
> arch/powerpc/mm/ptdump/hashpagetable.c | 20 ++++++-
> arch/powerpc/mm/ptdump/ptdump.c | 22 ++++++-
> arch/powerpc/xmon/xmon.c | 17 +++++-
> 28 files changed, 284 insertions(+), 120 deletions(-)
>
> diff --git a/arch/powerpc/mm/ptdump/ptdump.c b/arch/powerpc/mm/ptdump/ptdump.c
> index 206156255247..7bd4b81d5b5d 100644
> --- a/arch/powerpc/mm/ptdump/ptdump.c
> +++ b/arch/powerpc/mm/ptdump/ptdump.c
> @@ -277,9 +277,9 @@ static void walk_pmd(struct pg_state *st, pud_t *pud, unsigned long start)
> }
> }
>
> -static void walk_pud(struct pg_state *st, pgd_t *pgd, unsigned long start)
> +static void walk_pud(struct pg_state *st, p4d_t *p4d, unsigned long start)
> {
> - pud_t *pud = pud_offset(pgd, 0);
> + pud_t *pud = pud_offset(p4d, 0);
> unsigned long addr;
> unsigned int i;
>
> @@ -293,6 +293,22 @@ static void walk_pud(struct pg_state *st, pgd_t *pgd, unsigned long start)
> }
> }
>
> +static void walk_p4d(struct pg_state *st, pgd_t *pgd, unsigned long start)
> +{
> + p4d_t *p4d = p4d_offset(pgd, 0);
> + unsigned long addr;
> + unsigned int i;
> +
> + for (i = 0; i < PTRS_PER_P4D; i++, p4d++) {
> + addr = start + i * P4D_SIZE;
> + if (!p4d_none(*p4d) && !p4d_is_leaf(*p4d))
> + /* p4d exists */
> + walk_pud(st, p4d, addr);
> + else
> + note_page(st, addr, 2, p4d_val(*p4d));
Level 2 is already used by walk_pud().
I think you have to increment the level used in walk_pud(), walk_pmd()
and walk_pte().
> + }
> +}
> +
> static void walk_pagetables(struct pg_state *st)
> {
> unsigned int i;
> @@ -306,7 +322,7 @@ static void walk_pagetables(struct pg_state *st)
> for (i = pgd_index(addr); i < PTRS_PER_PGD; i++, pgd++, addr += PGDIR_SIZE) {
> if (!pgd_none(*pgd) && !pgd_is_leaf(*pgd))
> /* pgd exists */
> - walk_pud(st, pgd, addr);
> + walk_p4d(st, pgd, addr);
> else
> note_page(st, addr, 1, pgd_val(*pgd));
> }
Christophe
On Wed, Feb 19, 2020 at 01:07:55PM +0100, Christophe Leroy wrote:
>
> On 16/02/2020 at 09:18, Mike Rapoport wrote:
> > diff --git a/arch/powerpc/mm/ptdump/ptdump.c b/arch/powerpc/mm/ptdump/ptdump.c
> > index 206156255247..7bd4b81d5b5d 100644
> > --- a/arch/powerpc/mm/ptdump/ptdump.c
> > +++ b/arch/powerpc/mm/ptdump/ptdump.c
> > @@ -277,9 +277,9 @@ static void walk_pmd(struct pg_state *st, pud_t *pud, unsigned long start)
> > }
> > }
> > -static void walk_pud(struct pg_state *st, pgd_t *pgd, unsigned long start)
> > +static void walk_pud(struct pg_state *st, p4d_t *p4d, unsigned long start)
> > {
> > - pud_t *pud = pud_offset(pgd, 0);
> > + pud_t *pud = pud_offset(p4d, 0);
> > unsigned long addr;
> > unsigned int i;
> > @@ -293,6 +293,22 @@ static void walk_pud(struct pg_state *st, pgd_t *pgd, unsigned long start)
> > }
> > }
> > +static void walk_p4d(struct pg_state *st, pgd_t *pgd, unsigned long start)
> > +{
> > + p4d_t *p4d = p4d_offset(pgd, 0);
> > + unsigned long addr;
> > + unsigned int i;
> > +
> > + for (i = 0; i < PTRS_PER_P4D; i++, p4d++) {
> > + addr = start + i * P4D_SIZE;
> > + if (!p4d_none(*p4d) && !p4d_is_leaf(*p4d))
> > + /* p4d exists */
> > + walk_pud(st, p4d, addr);
> > + else
> > + note_page(st, addr, 2, p4d_val(*p4d));
>
> Level 2 is already used by walk_pud().
>
> I think you have to increment the level used in walk_pud(), walk_pmd()
> and walk_pte().
Thanks for catching this!
I'll fix the numbers in the next version.
> > + }
> > +}
> > +
> > static void walk_pagetables(struct pg_state *st)
> > {
> > unsigned int i;
> > @@ -306,7 +322,7 @@ static void walk_pagetables(struct pg_state *st)
> > for (i = pgd_index(addr); i < PTRS_PER_PGD; i++, pgd++, addr += PGDIR_SIZE) {
> > if (!pgd_none(*pgd) && !pgd_is_leaf(*pgd))
> > /* pgd exists */
> > - walk_pud(st, pgd, addr);
> > + walk_p4d(st, pgd, addr);
> > else
> > note_page(st, addr, 1, pgd_val(*pgd));
> > }
>
> Christophe
--
Sincerely yours,
Mike.
On Tue, Feb 18, 2020 at 12:54:40PM +0200, Mike Rapoport wrote:
> On Sun, Feb 16, 2020 at 11:41:07AM +0100, Christophe Leroy wrote:
> >
> >
> > On 16/02/2020 at 09:18, Mike Rapoport wrote:
> > > From: Mike Rapoport <[email protected]>
> > >
> > > Implement primitives necessary for the 4th level folding, add walks of p4d
> > > level where appropriate and replace 5level-fixup.h with pgtable-nop4d.h.
> >
> > I don't think it is worth adding all these additional walks of the p4d; this
> > patch could be limited to changes like:
> >
> > - pud = pud_offset(pgd, gpa);
> > + pud = pud_offset(p4d_offset(pgd, gpa), gpa);
> >
> > The additional walks should be added through another patch the day powerpc
> > needs them.
>
> Ok, I'll update the patch to reduce walking the p4d.
Here's what I have with more direct accesses from pgd to pud.
From 6c59a86ce8394fb6100e9b6ced2e346981fb0ce9 Mon Sep 17 00:00:00 2001
From: Mike Rapoport <[email protected]>
Date: Sun, 24 Nov 2019 15:38:00 +0200
Subject: [PATCH v3] powerpc: add support for folded p4d page tables
Implement primitives necessary for the 4th level folding, add walks of p4d
level where appropriate and replace 5level-fixup.h with pgtable-nop4d.h.
Signed-off-by: Mike Rapoport <[email protected]>
Tested-by: Christophe Leroy <[email protected]> # 8xx and 83xx
---
v3:
* reduce amount of added p4d walks
* kill pgtable_32::get_pteptr and traverse page table in
pgtable_32::__change_page_attr_noflush
arch/powerpc/include/asm/book3s/32/pgtable.h | 1 -
arch/powerpc/include/asm/book3s/64/hash.h | 4 +-
arch/powerpc/include/asm/book3s/64/pgalloc.h | 4 +-
arch/powerpc/include/asm/book3s/64/pgtable.h | 60 ++++++++++---------
arch/powerpc/include/asm/book3s/64/radix.h | 6 +-
arch/powerpc/include/asm/nohash/32/pgtable.h | 1 -
arch/powerpc/include/asm/nohash/64/pgalloc.h | 2 +-
.../include/asm/nohash/64/pgtable-4k.h | 32 +++++-----
arch/powerpc/include/asm/nohash/64/pgtable.h | 6 +-
arch/powerpc/include/asm/pgtable.h | 6 +-
arch/powerpc/kvm/book3s_64_mmu_radix.c | 30 ++++++----
arch/powerpc/lib/code-patching.c | 7 ++-
arch/powerpc/mm/book3s32/mmu.c | 2 +-
arch/powerpc/mm/book3s32/tlb.c | 4 +-
arch/powerpc/mm/book3s64/hash_pgtable.c | 4 +-
arch/powerpc/mm/book3s64/radix_pgtable.c | 26 +++++---
arch/powerpc/mm/book3s64/subpage_prot.c | 6 +-
arch/powerpc/mm/hugetlbpage.c | 28 +++++----
arch/powerpc/mm/kasan/kasan_init_32.c | 8 +--
arch/powerpc/mm/mem.c | 4 +-
arch/powerpc/mm/nohash/40x.c | 4 +-
arch/powerpc/mm/nohash/book3e_pgtable.c | 15 ++---
arch/powerpc/mm/pgtable.c | 30 ++++++----
arch/powerpc/mm/pgtable_32.c | 45 +++-----------
arch/powerpc/mm/pgtable_64.c | 10 ++--
arch/powerpc/mm/ptdump/hashpagetable.c | 20 ++++++-
arch/powerpc/mm/ptdump/ptdump.c | 14 +++--
arch/powerpc/xmon/xmon.c | 18 +++---
28 files changed, 213 insertions(+), 184 deletions(-)
diff --git a/arch/powerpc/include/asm/book3s/32/pgtable.h b/arch/powerpc/include/asm/book3s/32/pgtable.h
index 5b39c11e884a..39ec11371be0 100644
--- a/arch/powerpc/include/asm/book3s/32/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/32/pgtable.h
@@ -2,7 +2,6 @@
#ifndef _ASM_POWERPC_BOOK3S_32_PGTABLE_H
#define _ASM_POWERPC_BOOK3S_32_PGTABLE_H
-#define __ARCH_USE_5LEVEL_HACK
#include <asm-generic/pgtable-nopmd.h>
#include <asm/book3s/32/hash.h>
diff --git a/arch/powerpc/include/asm/book3s/64/hash.h b/arch/powerpc/include/asm/book3s/64/hash.h
index 2781ebf6add4..876d1528c2cf 100644
--- a/arch/powerpc/include/asm/book3s/64/hash.h
+++ b/arch/powerpc/include/asm/book3s/64/hash.h
@@ -134,9 +134,9 @@ static inline int get_region_id(unsigned long ea)
#define hash__pmd_bad(pmd) (pmd_val(pmd) & H_PMD_BAD_BITS)
#define hash__pud_bad(pud) (pud_val(pud) & H_PUD_BAD_BITS)
-static inline int hash__pgd_bad(pgd_t pgd)
+static inline int hash__p4d_bad(p4d_t p4d)
{
- return (pgd_val(pgd) == 0);
+ return (p4d_val(p4d) == 0);
}
#ifdef CONFIG_STRICT_KERNEL_RWX
extern void hash__mark_rodata_ro(void);
diff --git a/arch/powerpc/include/asm/book3s/64/pgalloc.h b/arch/powerpc/include/asm/book3s/64/pgalloc.h
index a41e91bd0580..69c5b051734f 100644
--- a/arch/powerpc/include/asm/book3s/64/pgalloc.h
+++ b/arch/powerpc/include/asm/book3s/64/pgalloc.h
@@ -85,9 +85,9 @@ static inline void pgd_free(struct mm_struct *mm, pgd_t *pgd)
kmem_cache_free(PGT_CACHE(PGD_INDEX_SIZE), pgd);
}
-static inline void pgd_populate(struct mm_struct *mm, pgd_t *pgd, pud_t *pud)
+static inline void p4d_populate(struct mm_struct *mm, p4d_t *pgd, pud_t *pud)
{
- *pgd = __pgd(__pgtable_ptr_val(pud) | PGD_VAL_BITS);
+ *pgd = __p4d(__pgtable_ptr_val(pud) | PGD_VAL_BITS);
}
static inline pud_t *pud_alloc_one(struct mm_struct *mm, unsigned long addr)
diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
index 201a69e6a355..fa60e8594b9f 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -2,7 +2,7 @@
#ifndef _ASM_POWERPC_BOOK3S_64_PGTABLE_H_
#define _ASM_POWERPC_BOOK3S_64_PGTABLE_H_
-#include <asm-generic/5level-fixup.h>
+#include <asm-generic/pgtable-nop4d.h>
#ifndef __ASSEMBLY__
#include <linux/mmdebug.h>
@@ -251,7 +251,7 @@ extern unsigned long __pmd_frag_size_shift;
/* Bits to mask out from a PUD to get to the PMD page */
#define PUD_MASKED_BITS 0xc0000000000000ffUL
/* Bits to mask out from a PGD to get to the PUD page */
-#define PGD_MASKED_BITS 0xc0000000000000ffUL
+#define P4D_MASKED_BITS 0xc0000000000000ffUL
/*
* Used as an indicator for rcu callback functions
@@ -949,54 +949,60 @@ static inline bool pud_access_permitted(pud_t pud, bool write)
return pte_access_permitted(pud_pte(pud), write);
}
-#define pgd_write(pgd) pte_write(pgd_pte(pgd))
+#define __p4d_raw(x) ((p4d_t) { __pgd_raw(x) })
+static inline __be64 p4d_raw(p4d_t x)
+{
+ return pgd_raw(x.pgd);
+}
+
+#define p4d_write(p4d) pte_write(p4d_pte(p4d))
-static inline void pgd_clear(pgd_t *pgdp)
+static inline void p4d_clear(p4d_t *p4dp)
{
- *pgdp = __pgd(0);
+ *p4dp = __p4d(0);
}
-static inline int pgd_none(pgd_t pgd)
+static inline int p4d_none(p4d_t p4d)
{
- return !pgd_raw(pgd);
+ return !p4d_raw(p4d);
}
-static inline int pgd_present(pgd_t pgd)
+static inline int p4d_present(p4d_t p4d)
{
- return !!(pgd_raw(pgd) & cpu_to_be64(_PAGE_PRESENT));
+ return !!(p4d_raw(p4d) & cpu_to_be64(_PAGE_PRESENT));
}
-static inline pte_t pgd_pte(pgd_t pgd)
+static inline pte_t p4d_pte(p4d_t p4d)
{
- return __pte_raw(pgd_raw(pgd));
+ return __pte_raw(p4d_raw(p4d));
}
-static inline pgd_t pte_pgd(pte_t pte)
+static inline p4d_t pte_p4d(pte_t pte)
{
- return __pgd_raw(pte_raw(pte));
+ return __p4d_raw(pte_raw(pte));
}
-static inline int pgd_bad(pgd_t pgd)
+static inline int p4d_bad(p4d_t p4d)
{
if (radix_enabled())
- return radix__pgd_bad(pgd);
- return hash__pgd_bad(pgd);
+ return radix__p4d_bad(p4d);
+ return hash__p4d_bad(p4d);
}
-#define pgd_access_permitted pgd_access_permitted
-static inline bool pgd_access_permitted(pgd_t pgd, bool write)
+#define p4d_access_permitted p4d_access_permitted
+static inline bool p4d_access_permitted(p4d_t p4d, bool write)
{
- return pte_access_permitted(pgd_pte(pgd), write);
+ return pte_access_permitted(p4d_pte(p4d), write);
}
-extern struct page *pgd_page(pgd_t pgd);
+extern struct page *p4d_page(p4d_t p4d);
/* Pointers in the page table tree are physical addresses */
#define __pgtable_ptr_val(ptr) __pa(ptr)
#define pmd_page_vaddr(pmd) __va(pmd_val(pmd) & ~PMD_MASKED_BITS)
#define pud_page_vaddr(pud) __va(pud_val(pud) & ~PUD_MASKED_BITS)
-#define pgd_page_vaddr(pgd) __va(pgd_val(pgd) & ~PGD_MASKED_BITS)
+#define p4d_page_vaddr(p4d) __va(p4d_val(p4d) & ~P4D_MASKED_BITS)
#define pgd_index(address) (((address) >> (PGDIR_SHIFT)) & (PTRS_PER_PGD - 1))
#define pud_index(address) (((address) >> (PUD_SHIFT)) & (PTRS_PER_PUD - 1))
@@ -1010,8 +1016,8 @@ extern struct page *pgd_page(pgd_t pgd);
#define pgd_offset(mm, address) ((mm)->pgd + pgd_index(address))
-#define pud_offset(pgdp, addr) \
- (((pud_t *) pgd_page_vaddr(*(pgdp))) + pud_index(addr))
+#define pud_offset(p4dp, addr) \
+ (((pud_t *) p4d_page_vaddr(*(p4dp))) + pud_index(addr))
#define pmd_offset(pudp,addr) \
(((pmd_t *) pud_page_vaddr(*(pudp))) + pmd_index(addr))
#define pte_offset_kernel(dir,addr) \
@@ -1368,11 +1374,11 @@ static inline bool pud_is_leaf(pud_t pud)
return !!(pud_raw(pud) & cpu_to_be64(_PAGE_PTE));
}
-#define pgd_is_leaf pgd_is_leaf
-#define pgd_leaf pgd_is_leaf
-static inline bool pgd_is_leaf(pgd_t pgd)
+#define p4d_is_leaf p4d_is_leaf
+#define p4d_leaf p4d_is_leaf
+static inline bool p4d_is_leaf(p4d_t p4d)
{
- return !!(pgd_raw(pgd) & cpu_to_be64(_PAGE_PTE));
+ return !!(p4d_raw(p4d) & cpu_to_be64(_PAGE_PTE));
}
#endif /* __ASSEMBLY__ */
diff --git a/arch/powerpc/include/asm/book3s/64/radix.h b/arch/powerpc/include/asm/book3s/64/radix.h
index d97db3ad9aae..9bca2ac64220 100644
--- a/arch/powerpc/include/asm/book3s/64/radix.h
+++ b/arch/powerpc/include/asm/book3s/64/radix.h
@@ -30,7 +30,7 @@
/* Don't have anything in the reserved bits and leaf bits */
#define RADIX_PMD_BAD_BITS 0x60000000000000e0UL
#define RADIX_PUD_BAD_BITS 0x60000000000000e0UL
-#define RADIX_PGD_BAD_BITS 0x60000000000000e0UL
+#define RADIX_P4D_BAD_BITS 0x60000000000000e0UL
#define RADIX_PMD_SHIFT (PAGE_SHIFT + RADIX_PTE_INDEX_SIZE)
#define RADIX_PUD_SHIFT (RADIX_PMD_SHIFT + RADIX_PMD_INDEX_SIZE)
@@ -227,9 +227,9 @@ static inline int radix__pud_bad(pud_t pud)
}
-static inline int radix__pgd_bad(pgd_t pgd)
+static inline int radix__p4d_bad(p4d_t p4d)
{
- return !!(pgd_val(pgd) & RADIX_PGD_BAD_BITS);
+ return !!(p4d_val(p4d) & RADIX_P4D_BAD_BITS);
}
#ifdef CONFIG_TRANSPARENT_HUGEPAGE
diff --git a/arch/powerpc/include/asm/nohash/32/pgtable.h b/arch/powerpc/include/asm/nohash/32/pgtable.h
index 60c4d829152e..d4c2c4259fa3 100644
--- a/arch/powerpc/include/asm/nohash/32/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/32/pgtable.h
@@ -2,7 +2,6 @@
#ifndef _ASM_POWERPC_NOHASH_32_PGTABLE_H
#define _ASM_POWERPC_NOHASH_32_PGTABLE_H
-#define __ARCH_USE_5LEVEL_HACK
#include <asm-generic/pgtable-nopmd.h>
#ifndef __ASSEMBLY__
diff --git a/arch/powerpc/include/asm/nohash/64/pgalloc.h b/arch/powerpc/include/asm/nohash/64/pgalloc.h
index b9534a793293..668aee6017e7 100644
--- a/arch/powerpc/include/asm/nohash/64/pgalloc.h
+++ b/arch/powerpc/include/asm/nohash/64/pgalloc.h
@@ -15,7 +15,7 @@ struct vmemmap_backing {
};
extern struct vmemmap_backing *vmemmap_list;
-#define pgd_populate(MM, PGD, PUD) pgd_set(PGD, (unsigned long)PUD)
+#define p4d_populate(MM, P4D, PUD) p4d_set(P4D, (unsigned long)PUD)
static inline pud_t *pud_alloc_one(struct mm_struct *mm, unsigned long addr)
{
diff --git a/arch/powerpc/include/asm/nohash/64/pgtable-4k.h b/arch/powerpc/include/asm/nohash/64/pgtable-4k.h
index c40ec32b8194..81b1c54e3cf1 100644
--- a/arch/powerpc/include/asm/nohash/64/pgtable-4k.h
+++ b/arch/powerpc/include/asm/nohash/64/pgtable-4k.h
@@ -2,7 +2,7 @@
#ifndef _ASM_POWERPC_NOHASH_64_PGTABLE_4K_H
#define _ASM_POWERPC_NOHASH_64_PGTABLE_4K_H
-#include <asm-generic/5level-fixup.h>
+#include <asm-generic/pgtable-nop4d.h>
/*
* Entries per page directory level. The PTE level must use a 64b record
@@ -45,41 +45,41 @@
#define PMD_MASKED_BITS 0
/* Bits to mask out from a PUD to get to the PMD page */
#define PUD_MASKED_BITS 0
-/* Bits to mask out from a PGD to get to the PUD page */
-#define PGD_MASKED_BITS 0
+/* Bits to mask out from a P4D to get to the PUD page */
+#define P4D_MASKED_BITS 0
/*
* 4-level page tables related bits
*/
-#define pgd_none(pgd) (!pgd_val(pgd))
-#define pgd_bad(pgd) (pgd_val(pgd) == 0)
-#define pgd_present(pgd) (pgd_val(pgd) != 0)
-#define pgd_page_vaddr(pgd) (pgd_val(pgd) & ~PGD_MASKED_BITS)
+#define p4d_none(p4d) (!p4d_val(p4d))
+#define p4d_bad(p4d) (p4d_val(p4d) == 0)
+#define p4d_present(p4d) (p4d_val(p4d) != 0)
+#define p4d_page_vaddr(p4d) (p4d_val(p4d) & ~P4D_MASKED_BITS)
#ifndef __ASSEMBLY__
-static inline void pgd_clear(pgd_t *pgdp)
+static inline void p4d_clear(p4d_t *p4dp)
{
- *pgdp = __pgd(0);
+ *p4dp = __p4d(0);
}
-static inline pte_t pgd_pte(pgd_t pgd)
+static inline pte_t p4d_pte(p4d_t p4d)
{
- return __pte(pgd_val(pgd));
+ return __pte(p4d_val(p4d));
}
-static inline pgd_t pte_pgd(pte_t pte)
+static inline p4d_t pte_p4d(pte_t pte)
{
- return __pgd(pte_val(pte));
+ return __p4d(pte_val(pte));
}
-extern struct page *pgd_page(pgd_t pgd);
+extern struct page *p4d_page(p4d_t p4d);
#endif /* !__ASSEMBLY__ */
-#define pud_offset(pgdp, addr) \
- (((pud_t *) pgd_page_vaddr(*(pgdp))) + \
+#define pud_offset(p4dp, addr) \
+ (((pud_t *) p4d_page_vaddr(*(p4dp))) + \
(((addr) >> PUD_SHIFT) & (PTRS_PER_PUD - 1)))
#define pud_ERROR(e) \
diff --git a/arch/powerpc/include/asm/nohash/64/pgtable.h b/arch/powerpc/include/asm/nohash/64/pgtable.h
index 9a33b8bd842d..b360f262b9c6 100644
--- a/arch/powerpc/include/asm/nohash/64/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/64/pgtable.h
@@ -175,11 +175,11 @@ static inline pud_t pte_pud(pte_t pte)
return __pud(pte_val(pte));
}
#define pud_write(pud) pte_write(pud_pte(pud))
-#define pgd_write(pgd) pte_write(pgd_pte(pgd))
+#define p4d_write(p4d) pte_write(p4d_pte(p4d))
-static inline void pgd_set(pgd_t *pgdp, unsigned long val)
+static inline void p4d_set(p4d_t *p4dp, unsigned long val)
{
- *pgdp = __pgd(val);
+ *p4dp = __p4d(val);
}
/*
diff --git a/arch/powerpc/include/asm/pgtable.h b/arch/powerpc/include/asm/pgtable.h
index 8cc543ed114c..05205d7a7b4a 100644
--- a/arch/powerpc/include/asm/pgtable.h
+++ b/arch/powerpc/include/asm/pgtable.h
@@ -139,9 +139,9 @@ static inline bool pud_is_leaf(pud_t pud)
}
#endif
-#ifndef pgd_is_leaf
-#define pgd_is_leaf pgd_is_leaf
-static inline bool pgd_is_leaf(pgd_t pgd)
+#ifndef p4d_is_leaf
+#define p4d_is_leaf p4d_is_leaf
+static inline bool p4d_is_leaf(p4d_t p4d)
{
return false;
}
diff --git a/arch/powerpc/kvm/book3s_64_mmu_radix.c b/arch/powerpc/kvm/book3s_64_mmu_radix.c
index 803940d79b73..beb694285100 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_radix.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_radix.c
@@ -499,13 +499,14 @@ void kvmppc_free_pgtable_radix(struct kvm *kvm, pgd_t *pgd, unsigned int lpid)
unsigned long ig;
for (ig = 0; ig < PTRS_PER_PGD; ++ig, ++pgd) {
+ p4d_t *p4d = p4d_offset(pgd, 0);
pud_t *pud;
- if (!pgd_present(*pgd))
+ if (!p4d_present(*p4d))
continue;
- pud = pud_offset(pgd, 0);
+ pud = pud_offset(p4d, 0);
kvmppc_unmap_free_pud(kvm, pud, lpid);
- pgd_clear(pgd);
+ p4d_clear(p4d);
}
}
@@ -566,6 +567,7 @@ int kvmppc_create_pte(struct kvm *kvm, pgd_t *pgtable, pte_t pte,
unsigned long *rmapp, struct rmap_nested **n_rmap)
{
pgd_t *pgd;
+ p4d_t *p4d;
pud_t *pud, *new_pud = NULL;
pmd_t *pmd, *new_pmd = NULL;
pte_t *ptep, *new_ptep = NULL;
@@ -573,9 +575,11 @@ int kvmppc_create_pte(struct kvm *kvm, pgd_t *pgtable, pte_t pte,
/* Traverse the guest's 2nd-level tree, allocate new levels needed */
pgd = pgtable + pgd_index(gpa);
+ p4d = p4d_offset(pgd, gpa);
+
pud = NULL;
- if (pgd_present(*pgd))
- pud = pud_offset(pgd, gpa);
+ if (p4d_present(*p4d))
+ pud = pud_offset(p4d, gpa);
else
new_pud = pud_alloc_one(kvm->mm, gpa);
@@ -596,13 +600,13 @@ int kvmppc_create_pte(struct kvm *kvm, pgd_t *pgtable, pte_t pte,
/* Now traverse again under the lock and change the tree */
ret = -ENOMEM;
- if (pgd_none(*pgd)) {
+ if (p4d_none(*p4d)) {
if (!new_pud)
goto out_unlock;
- pgd_populate(kvm->mm, pgd, new_pud);
+ p4d_populate(kvm->mm, p4d, new_pud);
new_pud = NULL;
}
- pud = pud_offset(pgd, gpa);
+ pud = pud_offset(p4d, gpa);
if (pud_is_leaf(*pud)) {
unsigned long hgpa = gpa & PUD_MASK;
@@ -1220,6 +1224,7 @@ static ssize_t debugfs_radix_read(struct file *file, char __user *buf,
pgd_t *pgt;
struct kvm_nested_guest *nested;
pgd_t pgd, *pgdp;
+ p4d_t p4d, *p4dp;
pud_t pud, *pudp;
pmd_t pmd, *pmdp;
pte_t *ptep;
@@ -1292,13 +1297,14 @@ static ssize_t debugfs_radix_read(struct file *file, char __user *buf,
}
pgdp = pgt + pgd_index(gpa);
- pgd = READ_ONCE(*pgdp);
- if (!(pgd_val(pgd) & _PAGE_PRESENT)) {
- gpa = (gpa & PGDIR_MASK) + PGDIR_SIZE;
+ p4dp = p4d_offset(pgdp, gpa);
+ p4d = READ_ONCE(*p4dp);
+ if (!(p4d_val(p4d) & _PAGE_PRESENT)) {
+ gpa = (gpa & P4D_MASK) + P4D_SIZE;
continue;
}
- pudp = pud_offset(&pgd, gpa);
+ pudp = pud_offset(&p4d, gpa);
pud = READ_ONCE(*pudp);
if (!(pud_val(pud) & _PAGE_PRESENT)) {
gpa = (gpa & PUD_MASK) + PUD_SIZE;
diff --git a/arch/powerpc/lib/code-patching.c b/arch/powerpc/lib/code-patching.c
index 3345f039a876..7a59f6863cec 100644
--- a/arch/powerpc/lib/code-patching.c
+++ b/arch/powerpc/lib/code-patching.c
@@ -107,13 +107,18 @@ static inline int unmap_patch_area(unsigned long addr)
pte_t *ptep;
pmd_t *pmdp;
pud_t *pudp;
+ p4d_t *p4dp;
pgd_t *pgdp;
pgdp = pgd_offset_k(addr);
if (unlikely(!pgdp))
return -EINVAL;
- pudp = pud_offset(pgdp, addr);
+ p4dp = p4d_offset(pgdp, addr);
+ if (unlikely(!p4dp))
+ return -EINVAL;
+
+ pudp = pud_offset(p4dp, addr);
if (unlikely(!pudp))
return -EINVAL;
diff --git a/arch/powerpc/mm/book3s32/mmu.c b/arch/powerpc/mm/book3s32/mmu.c
index f888cbb109b9..edef17c97206 100644
--- a/arch/powerpc/mm/book3s32/mmu.c
+++ b/arch/powerpc/mm/book3s32/mmu.c
@@ -312,7 +312,7 @@ void hash_preload(struct mm_struct *mm, unsigned long ea)
if (!Hash)
return;
- pmd = pmd_offset(pud_offset(pgd_offset(mm, ea), ea), ea);
+ pmd = pmd_offset(pud_offset(p4d_offset(pgd_offset(mm, ea), ea), ea), ea);
if (!pmd_none(*pmd))
add_hash_page(mm->context.id, ea, pmd_val(*pmd));
}
diff --git a/arch/powerpc/mm/book3s32/tlb.c b/arch/powerpc/mm/book3s32/tlb.c
index 2fcd321040ff..175bc33b41b7 100644
--- a/arch/powerpc/mm/book3s32/tlb.c
+++ b/arch/powerpc/mm/book3s32/tlb.c
@@ -87,7 +87,7 @@ static void flush_range(struct mm_struct *mm, unsigned long start,
if (start >= end)
return;
end = (end - 1) | ~PAGE_MASK;
- pmd = pmd_offset(pud_offset(pgd_offset(mm, start), start), start);
+ pmd = pmd_offset(pud_offset(p4d_offset(pgd_offset(mm, start), start), start), start);
for (;;) {
pmd_end = ((start + PGDIR_SIZE) & PGDIR_MASK) - 1;
if (pmd_end > end)
@@ -145,7 +145,7 @@ void flush_tlb_page(struct vm_area_struct *vma, unsigned long vmaddr)
return;
}
mm = (vmaddr < TASK_SIZE)? vma->vm_mm: &init_mm;
- pmd = pmd_offset(pud_offset(pgd_offset(mm, vmaddr), vmaddr), vmaddr);
+ pmd = pmd_offset(pud_offset(p4d_offset(pgd_offset(mm, vmaddr), vmaddr), vmaddr), vmaddr);
if (!pmd_none(*pmd))
flush_hash_pages(mm->context.id, vmaddr, pmd_val(*pmd), 1);
}
diff --git a/arch/powerpc/mm/book3s64/hash_pgtable.c b/arch/powerpc/mm/book3s64/hash_pgtable.c
index 64733b9cb20a..9cd15937e88a 100644
--- a/arch/powerpc/mm/book3s64/hash_pgtable.c
+++ b/arch/powerpc/mm/book3s64/hash_pgtable.c
@@ -148,6 +148,7 @@ void hash__vmemmap_remove_mapping(unsigned long start,
int hash__map_kernel_page(unsigned long ea, unsigned long pa, pgprot_t prot)
{
pgd_t *pgdp;
+ p4d_t *p4dp;
pud_t *pudp;
pmd_t *pmdp;
pte_t *ptep;
@@ -155,7 +156,8 @@ int hash__map_kernel_page(unsigned long ea, unsigned long pa, pgprot_t prot)
BUILD_BUG_ON(TASK_SIZE_USER64 > H_PGTABLE_RANGE);
if (slab_is_available()) {
pgdp = pgd_offset_k(ea);
- pudp = pud_alloc(&init_mm, pgdp, ea);
+ p4dp = p4d_offset(pgdp, ea);
+ pudp = pud_alloc(&init_mm, p4dp, ea);
if (!pudp)
return -ENOMEM;
pmdp = pmd_alloc(&init_mm, pudp, ea);
diff --git a/arch/powerpc/mm/book3s64/radix_pgtable.c b/arch/powerpc/mm/book3s64/radix_pgtable.c
index dd1bea45325c..fc3d0b0460b0 100644
--- a/arch/powerpc/mm/book3s64/radix_pgtable.c
+++ b/arch/powerpc/mm/book3s64/radix_pgtable.c
@@ -64,17 +64,19 @@ static int early_map_kernel_page(unsigned long ea, unsigned long pa,
{
unsigned long pfn = pa >> PAGE_SHIFT;
pgd_t *pgdp;
+ p4d_t *p4dp;
pud_t *pudp;
pmd_t *pmdp;
pte_t *ptep;
pgdp = pgd_offset_k(ea);
- if (pgd_none(*pgdp)) {
+ p4dp = p4d_offset(pgdp, ea);
+ if (p4d_none(*p4dp)) {
pudp = early_alloc_pgtable(PUD_TABLE_SIZE, nid,
region_start, region_end);
- pgd_populate(&init_mm, pgdp, pudp);
+ p4d_populate(&init_mm, p4dp, pudp);
}
- pudp = pud_offset(pgdp, ea);
+ pudp = pud_offset(p4dp, ea);
if (map_page_size == PUD_SIZE) {
ptep = (pte_t *)pudp;
goto set_the_pte;
@@ -114,6 +116,7 @@ static int __map_kernel_page(unsigned long ea, unsigned long pa,
{
unsigned long pfn = pa >> PAGE_SHIFT;
pgd_t *pgdp;
+ p4d_t *p4dp;
pud_t *pudp;
pmd_t *pmdp;
pte_t *ptep;
@@ -136,7 +139,8 @@ static int __map_kernel_page(unsigned long ea, unsigned long pa,
* boot.
*/
pgdp = pgd_offset_k(ea);
- pudp = pud_alloc(&init_mm, pgdp, ea);
+ p4dp = p4d_offset(pgdp, ea);
+ pudp = pud_alloc(&init_mm, p4dp, ea);
if (!pudp)
return -ENOMEM;
if (map_page_size == PUD_SIZE) {
@@ -173,6 +177,7 @@ void radix__change_memory_range(unsigned long start, unsigned long end,
{
unsigned long idx;
pgd_t *pgdp;
+ p4d_t *p4dp;
pud_t *pudp;
pmd_t *pmdp;
pte_t *ptep;
@@ -185,7 +190,8 @@ void radix__change_memory_range(unsigned long start, unsigned long end,
for (idx = start; idx < end; idx += PAGE_SIZE) {
pgdp = pgd_offset_k(idx);
- pudp = pud_alloc(&init_mm, pgdp, idx);
+ p4dp = p4d_offset(pgdp, idx);
+ pudp = pud_alloc(&init_mm, p4dp, idx);
if (!pudp)
continue;
if (pud_is_leaf(*pudp)) {
@@ -847,6 +853,7 @@ static void __meminit remove_pagetable(unsigned long start, unsigned long end)
unsigned long addr, next;
pud_t *pud_base;
pgd_t *pgd;
+ p4d_t *p4d;
spin_lock(&init_mm.page_table_lock);
@@ -854,15 +861,16 @@ static void __meminit remove_pagetable(unsigned long start, unsigned long end)
next = pgd_addr_end(addr, end);
pgd = pgd_offset_k(addr);
- if (!pgd_present(*pgd))
+ p4d = p4d_offset(pgd, addr);
+ if (!p4d_present(*p4d))
continue;
- if (pgd_is_leaf(*pgd)) {
- split_kernel_mapping(addr, end, PGDIR_SIZE, (pte_t *)pgd);
+ if (p4d_is_leaf(*p4d)) {
+ split_kernel_mapping(addr, end, P4D_SIZE, (pte_t *)p4d);
continue;
}
- pud_base = (pud_t *)pgd_page_vaddr(*pgd);
+ pud_base = (pud_t *)p4d_page_vaddr(*p4d);
remove_pud_table(pud_base, addr, next);
}
diff --git a/arch/powerpc/mm/book3s64/subpage_prot.c b/arch/powerpc/mm/book3s64/subpage_prot.c
index 2ef24a53f4c9..25a0c044bd93 100644
--- a/arch/powerpc/mm/book3s64/subpage_prot.c
+++ b/arch/powerpc/mm/book3s64/subpage_prot.c
@@ -54,15 +54,17 @@ static void hpte_flush_range(struct mm_struct *mm, unsigned long addr,
int npages)
{
pgd_t *pgd;
+ p4d_t *p4d;
pud_t *pud;
pmd_t *pmd;
pte_t *pte;
spinlock_t *ptl;
pgd = pgd_offset(mm, addr);
- if (pgd_none(*pgd))
+ p4d = p4d_offset(pgd, addr);
+ if (p4d_none(*p4d))
return;
- pud = pud_offset(pgd, addr);
+ pud = pud_offset(p4d, addr);
if (pud_none(*pud))
return;
pmd = pmd_offset(pud, addr);
diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c
index 33b3461d91e8..54f5994d4cbb 100644
--- a/arch/powerpc/mm/hugetlbpage.c
+++ b/arch/powerpc/mm/hugetlbpage.c
@@ -119,6 +119,7 @@ static int __hugepte_alloc(struct mm_struct *mm, hugepd_t *hpdp,
pte_t *huge_pte_alloc(struct mm_struct *mm, unsigned long addr, unsigned long sz)
{
pgd_t *pg;
+ p4d_t *p4;
pud_t *pu;
pmd_t *pm;
hugepd_t *hpdp = NULL;
@@ -128,20 +129,21 @@ pte_t *huge_pte_alloc(struct mm_struct *mm, unsigned long addr, unsigned long sz
addr &= ~(sz-1);
pg = pgd_offset(mm, addr);
+ p4 = p4d_offset(pg, addr);
#ifdef CONFIG_PPC_BOOK3S_64
if (pshift == PGDIR_SHIFT)
/* 16GB huge page */
- return (pte_t *) pg;
+ return (pte_t *) p4;
else if (pshift > PUD_SHIFT) {
/*
* We need to use hugepd table
*/
ptl = &mm->page_table_lock;
- hpdp = (hugepd_t *)pg;
+ hpdp = (hugepd_t *)p4;
} else {
pdshift = PUD_SHIFT;
- pu = pud_alloc(mm, pg, addr);
+ pu = pud_alloc(mm, p4, addr);
if (!pu)
return NULL;
if (pshift == PUD_SHIFT)
@@ -166,10 +168,10 @@ pte_t *huge_pte_alloc(struct mm_struct *mm, unsigned long addr, unsigned long sz
#else
if (pshift >= PGDIR_SHIFT) {
ptl = &mm->page_table_lock;
- hpdp = (hugepd_t *)pg;
+ hpdp = (hugepd_t *)p4;
} else {
pdshift = PUD_SHIFT;
- pu = pud_alloc(mm, pg, addr);
+ pu = pud_alloc(mm, p4, addr);
if (!pu)
return NULL;
if (pshift >= PUD_SHIFT) {
@@ -390,7 +392,7 @@ static void hugetlb_free_pmd_range(struct mmu_gather *tlb, pud_t *pud,
mm_dec_nr_pmds(tlb->mm);
}
-static void hugetlb_free_pud_range(struct mmu_gather *tlb, pgd_t *pgd,
+static void hugetlb_free_pud_range(struct mmu_gather *tlb, p4d_t *p4d,
unsigned long addr, unsigned long end,
unsigned long floor, unsigned long ceiling)
{
@@ -400,7 +402,7 @@ static void hugetlb_free_pud_range(struct mmu_gather *tlb, pgd_t *pgd,
start = addr;
do {
- pud = pud_offset(pgd, addr);
+ pud = pud_offset(p4d, addr);
next = pud_addr_end(addr, end);
if (!is_hugepd(__hugepd(pud_val(*pud)))) {
if (pud_none_or_clear_bad(pud))
@@ -435,8 +437,8 @@ static void hugetlb_free_pud_range(struct mmu_gather *tlb, pgd_t *pgd,
if (end - 1 > ceiling - 1)
return;
- pud = pud_offset(pgd, start);
- pgd_clear(pgd);
+ pud = pud_offset(p4d, start);
+ p4d_clear(p4d);
pud_free_tlb(tlb, pud, start);
mm_dec_nr_puds(tlb->mm);
}
@@ -449,6 +451,7 @@ void hugetlb_free_pgd_range(struct mmu_gather *tlb,
unsigned long floor, unsigned long ceiling)
{
pgd_t *pgd;
+ p4d_t *p4d;
unsigned long next;
/*
@@ -471,10 +474,11 @@ void hugetlb_free_pgd_range(struct mmu_gather *tlb,
do {
next = pgd_addr_end(addr, end);
pgd = pgd_offset(tlb->mm, addr);
+ p4d = p4d_offset(pgd, addr);
if (!is_hugepd(__hugepd(pgd_val(*pgd)))) {
- if (pgd_none_or_clear_bad(pgd))
+ if (p4d_none_or_clear_bad(p4d))
continue;
- hugetlb_free_pud_range(tlb, pgd, addr, next, floor, ceiling);
+ hugetlb_free_pud_range(tlb, p4d, addr, next, floor, ceiling);
} else {
unsigned long more;
/*
@@ -487,7 +491,7 @@ void hugetlb_free_pgd_range(struct mmu_gather *tlb,
if (more > next)
next = more;
- free_hugepd_range(tlb, (hugepd_t *)pgd, PGDIR_SHIFT,
+ free_hugepd_range(tlb, (hugepd_t *)p4d, PGDIR_SHIFT,
addr, next, floor, ceiling);
}
} while (addr = next, addr != end);
diff --git a/arch/powerpc/mm/kasan/kasan_init_32.c b/arch/powerpc/mm/kasan/kasan_init_32.c
index db5664dde5ff..88e2e16380b5 100644
--- a/arch/powerpc/mm/kasan/kasan_init_32.c
+++ b/arch/powerpc/mm/kasan/kasan_init_32.c
@@ -36,7 +36,7 @@ static int __init kasan_init_shadow_page_tables(unsigned long k_start, unsigned
unsigned long k_cur, k_next;
pte_t *new = NULL;
- pmd = pmd_offset(pud_offset(pgd_offset_k(k_start), k_start), k_start);
+ pmd = pmd_offset(pud_offset(p4d_offset(pgd_offset_k(k_start), k_start), k_start), k_start);
for (k_cur = k_start; k_cur != k_end; k_cur = k_next, pmd++) {
k_next = pgd_addr_end(k_cur, k_end);
@@ -78,7 +78,7 @@ static int __init kasan_init_region(void *start, size_t size)
block = memblock_alloc(k_end - k_start, PAGE_SIZE);
for (k_cur = k_start & PAGE_MASK; k_cur < k_end; k_cur += PAGE_SIZE) {
- pmd_t *pmd = pmd_offset(pud_offset(pgd_offset_k(k_cur), k_cur), k_cur);
+ pmd_t *pmd = pmd_offset(pud_offset(p4d_offset(pgd_offset_k(k_cur), k_cur), k_cur), k_cur);
void *va = block + k_cur - k_start;
pte_t pte = pfn_pte(PHYS_PFN(__pa(va)), PAGE_KERNEL);
@@ -102,7 +102,7 @@ static void __init kasan_remap_early_shadow_ro(void)
kasan_populate_pte(kasan_early_shadow_pte, prot);
for (k_cur = k_start & PAGE_MASK; k_cur < k_end; k_cur += PAGE_SIZE) {
- pmd_t *pmd = pmd_offset(pud_offset(pgd_offset_k(k_cur), k_cur), k_cur);
+ pmd_t *pmd = pmd_offset(pud_offset(p4d_offset(pgd_offset_k(k_cur), k_cur), k_cur), k_cur);
pte_t *ptep = pte_offset_kernel(pmd, k_cur);
if ((pte_val(*ptep) & PTE_RPN_MASK) != pa)
@@ -201,7 +201,7 @@ void __init kasan_early_init(void)
unsigned long addr = KASAN_SHADOW_START;
unsigned long end = KASAN_SHADOW_END;
unsigned long next;
- pmd_t *pmd = pmd_offset(pud_offset(pgd_offset_k(addr), addr), addr);
+ pmd_t *pmd = pmd_offset(pud_offset(p4d_offset(pgd_offset_k(addr), addr), addr), addr);
BUILD_BUG_ON(KASAN_SHADOW_START & ~PGDIR_MASK);
diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c
index ef7b1119b2e2..8262b384dcf3 100644
--- a/arch/powerpc/mm/mem.c
+++ b/arch/powerpc/mm/mem.c
@@ -69,8 +69,8 @@ EXPORT_SYMBOL(kmap_prot);
static inline pte_t *virt_to_kpte(unsigned long vaddr)
{
- return pte_offset_kernel(pmd_offset(pud_offset(pgd_offset_k(vaddr),
- vaddr), vaddr), vaddr);
+ return pte_offset_kernel(pmd_offset(pud_offset(p4d_offset(pgd_offset_k(vaddr),
+ vaddr), vaddr), vaddr), vaddr);
}
#endif
diff --git a/arch/powerpc/mm/nohash/40x.c b/arch/powerpc/mm/nohash/40x.c
index f348104eb461..7aaf7155e350 100644
--- a/arch/powerpc/mm/nohash/40x.c
+++ b/arch/powerpc/mm/nohash/40x.c
@@ -104,7 +104,7 @@ unsigned long __init mmu_mapin_ram(unsigned long base, unsigned long top)
pmd_t *pmdp;
unsigned long val = p | _PMD_SIZE_16M | _PAGE_EXEC | _PAGE_HWWRITE;
- pmdp = pmd_offset(pud_offset(pgd_offset_k(v), v), v);
+ pmdp = pmd_offset(pud_offset(p4d_offset(pgd_offset_k(v), v), v), v);
*pmdp++ = __pmd(val);
*pmdp++ = __pmd(val);
*pmdp++ = __pmd(val);
@@ -119,7 +119,7 @@ unsigned long __init mmu_mapin_ram(unsigned long base, unsigned long top)
pmd_t *pmdp;
unsigned long val = p | _PMD_SIZE_4M | _PAGE_EXEC | _PAGE_HWWRITE;
- pmdp = pmd_offset(pud_offset(pgd_offset_k(v), v), v);
+ pmdp = pmd_offset(pud_offset(p4d_offset(pgd_offset_k(v), v), v), v);
*pmdp = __pmd(val);
v += LARGE_PAGE_SIZE_4M;
diff --git a/arch/powerpc/mm/nohash/book3e_pgtable.c b/arch/powerpc/mm/nohash/book3e_pgtable.c
index 4637fdd469cf..77884e24281d 100644
--- a/arch/powerpc/mm/nohash/book3e_pgtable.c
+++ b/arch/powerpc/mm/nohash/book3e_pgtable.c
@@ -73,6 +73,7 @@ static void __init *early_alloc_pgtable(unsigned long size)
int __ref map_kernel_page(unsigned long ea, unsigned long pa, pgprot_t prot)
{
pgd_t *pgdp;
+ p4d_t *p4dp;
pud_t *pudp;
pmd_t *pmdp;
pte_t *ptep;
@@ -80,7 +81,8 @@ int __ref map_kernel_page(unsigned long ea, unsigned long pa, pgprot_t prot)
BUILD_BUG_ON(TASK_SIZE_USER64 > PGTABLE_RANGE);
if (slab_is_available()) {
pgdp = pgd_offset_k(ea);
- pudp = pud_alloc(&init_mm, pgdp, ea);
+ p4dp = p4d_offset(pgdp, ea);
+ pudp = pud_alloc(&init_mm, p4dp, ea);
if (!pudp)
return -ENOMEM;
pmdp = pmd_alloc(&init_mm, pudp, ea);
@@ -91,13 +93,12 @@ int __ref map_kernel_page(unsigned long ea, unsigned long pa, pgprot_t prot)
return -ENOMEM;
} else {
pgdp = pgd_offset_k(ea);
-#ifndef __PAGETABLE_PUD_FOLDED
- if (pgd_none(*pgdp)) {
- pudp = early_alloc_pgtable(PUD_TABLE_SIZE);
- pgd_populate(&init_mm, pgdp, pudp);
+ p4dp = p4d_offset(pgdp, ea);
+ if (p4d_none(*p4dp)) {
+ pudp = early_alloc_pgtable(PUD_TABLE_SIZE);
+ p4d_populate(&init_mm, p4dp, pudp);
}
-#endif /* !__PAGETABLE_PUD_FOLDED */
- pudp = pud_offset(pgdp, ea);
+ pudp = pud_offset(p4dp, ea);
if (pud_none(*pudp)) {
pmdp = early_alloc_pgtable(PMD_TABLE_SIZE);
pud_populate(&init_mm, pudp, pmdp);
diff --git a/arch/powerpc/mm/pgtable.c b/arch/powerpc/mm/pgtable.c
index e3759b69f81b..c2499271f6c1 100644
--- a/arch/powerpc/mm/pgtable.c
+++ b/arch/powerpc/mm/pgtable.c
@@ -265,6 +265,7 @@ int huge_ptep_set_access_flags(struct vm_area_struct *vma,
void assert_pte_locked(struct mm_struct *mm, unsigned long addr)
{
pgd_t *pgd;
+ p4d_t *p4d;
pud_t *pud;
pmd_t *pmd;
@@ -272,7 +273,9 @@ void assert_pte_locked(struct mm_struct *mm, unsigned long addr)
return;
pgd = mm->pgd + pgd_index(addr);
BUG_ON(pgd_none(*pgd));
- pud = pud_offset(pgd, addr);
+ p4d = p4d_offset(pgd, addr);
+ BUG_ON(p4d_none(*p4d));
+ pud = pud_offset(p4d, addr);
BUG_ON(pud_none(*pud));
pmd = pmd_offset(pud, addr);
/*
@@ -312,12 +315,13 @@ EXPORT_SYMBOL_GPL(vmalloc_to_phys);
pte_t *__find_linux_pte(pgd_t *pgdir, unsigned long ea,
bool *is_thp, unsigned *hpage_shift)
{
- pgd_t pgd, *pgdp;
+ pgd_t *pgdp;
+ p4d_t p4d, *p4dp;
pud_t pud, *pudp;
pmd_t pmd, *pmdp;
pte_t *ret_pte;
hugepd_t *hpdp = NULL;
- unsigned pdshift = PGDIR_SHIFT;
+ unsigned pdshift;
if (hpage_shift)
*hpage_shift = 0;
@@ -325,24 +329,28 @@ pte_t *__find_linux_pte(pgd_t *pgdir, unsigned long ea,
if (is_thp)
*is_thp = false;
- pgdp = pgdir + pgd_index(ea);
- pgd = READ_ONCE(*pgdp);
/*
* Always operate on the local stack value. This make sure the
* value don't get updated by a parallel THP split/collapse,
* page fault or a page unmap. The return pte_t * is still not
* stable. So should be checked there for above conditions.
+ * Top level is an exception because it is folded into p4d.
*/
- if (pgd_none(pgd))
+ pgdp = pgdir + pgd_index(ea);
+ p4dp = p4d_offset(pgdp, ea);
+ p4d = READ_ONCE(*p4dp);
+ pdshift = P4D_SHIFT;
+
+ if (p4d_none(p4d))
return NULL;
- if (pgd_is_leaf(pgd)) {
- ret_pte = (pte_t *)pgdp;
+ if (p4d_is_leaf(p4d)) {
+ ret_pte = (pte_t *)p4dp;
goto out;
}
- if (is_hugepd(__hugepd(pgd_val(pgd)))) {
- hpdp = (hugepd_t *)&pgd;
+ if (is_hugepd(__hugepd(p4d_val(p4d)))) {
+ hpdp = (hugepd_t *)&p4d;
goto out_huge;
}
@@ -352,7 +360,7 @@ pte_t *__find_linux_pte(pgd_t *pgdir, unsigned long ea,
* irq disabled
*/
pdshift = PUD_SHIFT;
- pudp = pud_offset(&pgd, ea);
+ pudp = pud_offset(&p4d, ea);
pud = READ_ONCE(*pudp);
if (pud_none(pud))
diff --git a/arch/powerpc/mm/pgtable_32.c b/arch/powerpc/mm/pgtable_32.c
index 5fb90edd865e..5774d4bc94d0 100644
--- a/arch/powerpc/mm/pgtable_32.c
+++ b/arch/powerpc/mm/pgtable_32.c
@@ -63,7 +63,7 @@ int __ref map_kernel_page(unsigned long va, phys_addr_t pa, pgprot_t prot)
int err = -ENOMEM;
/* Use upper 10 bits of VA to index the first level map */
- pd = pmd_offset(pud_offset(pgd_offset_k(va), va), va);
+ pd = pmd_offset(pud_offset(p4d_offset(pgd_offset_k(va), va), va), va);
/* Use middle 10 bits of VA to index the second-level map */
if (likely(slab_is_available()))
pg = pte_alloc_kernel(pd, va);
@@ -121,53 +121,24 @@ void __init mapin_ram(void)
}
}
-/* Scan the real Linux page tables and return a PTE pointer for
- * a virtual address in a context.
- * Returns true (1) if PTE was found, zero otherwise. The pointer to
- * the PTE pointer is unmodified if PTE is not found.
- */
-static int
-get_pteptr(struct mm_struct *mm, unsigned long addr, pte_t **ptep, pmd_t **pmdp)
-{
- pgd_t *pgd;
- pud_t *pud;
- pmd_t *pmd;
- pte_t *pte;
- int retval = 0;
-
- pgd = pgd_offset(mm, addr & PAGE_MASK);
- if (pgd) {
- pud = pud_offset(pgd, addr & PAGE_MASK);
- if (pud && pud_present(*pud)) {
- pmd = pmd_offset(pud, addr & PAGE_MASK);
- if (pmd_present(*pmd)) {
- pte = pte_offset_map(pmd, addr & PAGE_MASK);
- if (pte) {
- retval = 1;
- *ptep = pte;
- if (pmdp)
- *pmdp = pmd;
- /* XXX caller needs to do pte_unmap, yuck */
- }
- }
- }
- }
- return(retval);
-}
-
static int __change_page_attr_noflush(struct page *page, pgprot_t prot)
{
pte_t *kpte;
pmd_t *kpmd;
- unsigned long address;
+ unsigned long address, va;
BUG_ON(PageHighMem(page));
address = (unsigned long)page_address(page);
+ va = address & PAGE_MASK;
if (v_block_mapped(address))
return 0;
- if (!get_pteptr(&init_mm, address, &kpte, &kpmd))
+
+ kpmd = pmd_offset(pud_offset(p4d_offset(pgd_offset_k(va), va), va), va);
+ if (!pmd_present(*kpmd))
return -EINVAL;
+
+ kpte = pte_offset_map(kpmd, va);
__set_pte_at(&init_mm, address, kpte, mk_pte(page, prot), 0);
pte_unmap(kpte);
diff --git a/arch/powerpc/mm/pgtable_64.c b/arch/powerpc/mm/pgtable_64.c
index e78832dce7bb..1f86a88fd4bb 100644
--- a/arch/powerpc/mm/pgtable_64.c
+++ b/arch/powerpc/mm/pgtable_64.c
@@ -101,13 +101,13 @@ EXPORT_SYMBOL(__pte_frag_size_shift);
#ifndef __PAGETABLE_PUD_FOLDED
/* 4 level page table */
-struct page *pgd_page(pgd_t pgd)
+struct page *p4d_page(p4d_t p4d)
{
- if (pgd_is_leaf(pgd)) {
- VM_WARN_ON(!pgd_huge(pgd));
- return pte_page(pgd_pte(pgd));
+ if (p4d_is_leaf(p4d)) {
+ VM_WARN_ON(!p4d_huge(p4d));
+ return pte_page(p4d_pte(p4d));
}
- return virt_to_page(pgd_page_vaddr(pgd));
+ return virt_to_page(p4d_page_vaddr(p4d));
}
#endif
diff --git a/arch/powerpc/mm/ptdump/hashpagetable.c b/arch/powerpc/mm/ptdump/hashpagetable.c
index a07278027c6f..ac360ad865a8 100644
--- a/arch/powerpc/mm/ptdump/hashpagetable.c
+++ b/arch/powerpc/mm/ptdump/hashpagetable.c
@@ -417,9 +417,9 @@ static void walk_pmd(struct pg_state *st, pud_t *pud, unsigned long start)
}
}
-static void walk_pud(struct pg_state *st, pgd_t *pgd, unsigned long start)
+static void walk_pud(struct pg_state *st, p4d_t *p4d, unsigned long start)
{
- pud_t *pud = pud_offset(pgd, 0);
+ pud_t *pud = pud_offset(p4d, 0);
unsigned long addr;
unsigned int i;
@@ -431,6 +431,20 @@ static void walk_pud(struct pg_state *st, pgd_t *pgd, unsigned long start)
}
}
+static void walk_p4d(struct pg_state *st, pgd_t *pgd, unsigned long start)
+{
+ p4d_t *p4d = p4d_offset(pgd, 0);
+ unsigned long addr;
+ unsigned int i;
+
+ for (i = 0; i < PTRS_PER_P4D; i++, p4d++) {
+ addr = start + i * P4D_SIZE;
+ if (!p4d_none(*p4d))
+ /* p4d exists */
+ walk_pud(st, p4d, addr);
+ }
+}
+
static void walk_pagetables(struct pg_state *st)
{
pgd_t *pgd = pgd_offset_k(0UL);
@@ -445,7 +459,7 @@ static void walk_pagetables(struct pg_state *st)
addr = KERN_VIRT_START + i * PGDIR_SIZE;
if (!pgd_none(*pgd))
/* pgd exists */
- walk_pud(st, pgd, addr);
+ walk_p4d(st, pgd, addr);
}
}
diff --git a/arch/powerpc/mm/ptdump/ptdump.c b/arch/powerpc/mm/ptdump/ptdump.c
index 206156255247..9d6256b61df3 100644
--- a/arch/powerpc/mm/ptdump/ptdump.c
+++ b/arch/powerpc/mm/ptdump/ptdump.c
@@ -277,9 +277,9 @@ static void walk_pmd(struct pg_state *st, pud_t *pud, unsigned long start)
}
}
-static void walk_pud(struct pg_state *st, pgd_t *pgd, unsigned long start)
+static void walk_pud(struct pg_state *st, p4d_t *p4d, unsigned long start)
{
- pud_t *pud = pud_offset(pgd, 0);
+ pud_t *pud = pud_offset(p4d, 0);
unsigned long addr;
unsigned int i;
@@ -304,11 +304,13 @@ static void walk_pagetables(struct pg_state *st)
* the hash pagetable.
*/
for (i = pgd_index(addr); i < PTRS_PER_PGD; i++, pgd++, addr += PGDIR_SIZE) {
- if (!pgd_none(*pgd) && !pgd_is_leaf(*pgd))
- /* pgd exists */
- walk_pud(st, pgd, addr);
+ p4d_t *p4d = p4d_offset(pgd, 0);
+
+ if (!p4d_none(*p4d) && !p4d_is_leaf(*p4d))
+ /* p4d exists */
+ walk_pud(st, p4d, addr);
else
- note_page(st, addr, 1, pgd_val(*pgd));
+ note_page(st, addr, 1, p4d_val(*p4d));
}
}
diff --git a/arch/powerpc/xmon/xmon.c b/arch/powerpc/xmon/xmon.c
index 0ec9640335bb..3e29128c58cc 100644
--- a/arch/powerpc/xmon/xmon.c
+++ b/arch/powerpc/xmon/xmon.c
@@ -3130,6 +3130,7 @@ static void show_pte(unsigned long addr)
struct task_struct *tsk = NULL;
struct mm_struct *mm;
pgd_t *pgdp, *pgdir;
+ p4d_t *p4dp;
pud_t *pudp;
pmd_t *pmdp;
pte_t *ptep;
@@ -3161,20 +3162,21 @@ static void show_pte(unsigned long addr)
pgdir = pgd_offset(mm, 0);
}
- if (pgd_none(*pgdp)) {
- printf("no linux page table for address\n");
+ p4dp = p4d_offset(pgdp, addr);
+
+ if (p4d_none(*p4dp)) {
+ printf("No valid P4D\n");
return;
}
- printf("pgd @ 0x%px\n", pgdir);
-
- if (pgd_is_leaf(*pgdp)) {
- format_pte(pgdp, pgd_val(*pgdp));
+ if (p4d_is_leaf(*p4dp)) {
+ format_pte(p4dp, p4d_val(*p4dp));
return;
}
- printf("pgdp @ 0x%px = 0x%016lx\n", pgdp, pgd_val(*pgdp));
- pudp = pud_offset(pgdp, addr);
+ printf("p4dp @ 0x%px = 0x%016lx\n", p4dp, p4d_val(*p4dp));
+
+ pudp = pud_offset(p4dp, addr);
if (pud_none(*pudp)) {
printf("No valid PUD\n");
--
2.24.0
On 26/02/2020 at 10:13, Mike Rapoport wrote:
> On Tue, Feb 18, 2020 at 12:54:40PM +0200, Mike Rapoport wrote:
>> On Sun, Feb 16, 2020 at 11:41:07AM +0100, Christophe Leroy wrote:
>>>
>>>
>>> On 16/02/2020 at 09:18, Mike Rapoport wrote:
>>>> From: Mike Rapoport <[email protected]>
>>>>
>>>> Implement primitives necessary for the 4th level folding, add walks of p4d
>>>> level where appropriate and replace 5level-fixup.h with pgtable-nop4d.h.
>>>
>>> I don't think it is worth adding all these additional walks of p4d; this
>>> patch could be limited to changes like:
>>>
>>> - pud = pud_offset(pgd, gpa);
>>> + pud = pud_offset(p4d_offset(pgd, gpa), gpa);
>>>
>>> The additional walks should be added in a separate patch the day powerpc
>>> needs them.
>>
>> Ok, I'll update the patch to reduce walking the p4d.
>
> Here's what I have with more direct accesses from pgd to pud.
I went through it quickly. This looks promising.
Do we need the walk_p4d() in arch/powerpc/mm/ptdump/hashpagetable.c?
Can't we just do:
@@ -445,7 +459,7 @@ static void walk_pagetables(struct pg_state *st)
addr = KERN_VIRT_START + i * PGDIR_SIZE;
if (!pgd_none(*pgd))
/* pgd exists */
- walk_pud(st, pgd, addr);
+ walk_pud(st, p4d_offset(pgd, addr), addr);
}
}
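
For context: with <asm-generic/pgtable-nop4d.h>, which this series switches
powerpc to, the p4d level is folded into the pgd, so p4d_offset() is
essentially a no-op cast and the extra call in the walk costs nothing. A
rough sketch of the generic helper, for illustration only:

static inline p4d_t *p4d_offset(pgd_t *pgd, unsigned long address)
{
	/* folded level: a p4d entry is the pgd entry seen through a wrapper type */
	return (p4d_t *)pgd;
}

With that, walk_pud(st, p4d_offset(pgd, addr), addr) compiles down to the
same access as walk_pud(st, pgd, addr) did before.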
Also, I think the removal of get_pteptr() should be a separate patch.
You could include my patches in your series. See
https://patchwork.ozlabs.org/project/linuxppc-dev/list/?series=152102
Christophe
>
> From 6c59a86ce8394fb6100e9b6ced2e346981fb0ce9 Mon Sep 17 00:00:00 2001
> From: Mike Rapoport <[email protected]>
> Date: Sun, 24 Nov 2019 15:38:00 +0200
> Subject: [PATCH v3] powerpc: add support for folded p4d page tables
>
> Implement primitives necessary for the 4th level folding, add walks of p4d
> level where appropriate and replace 5level-fixup.h with pgtable-nop4d.h.
>
> Signed-off-by: Mike Rapoport <[email protected]>
> Tested-by: Christophe Leroy <[email protected]> # 8xx and 83xx
> ---
> v3:
> * reduce amount of added p4d walks
> * kill pgtable_32::get_pteptr and traverse page table in
> pgtable_32::__change_page_attr_noflush
>
>
> arch/powerpc/include/asm/book3s/32/pgtable.h | 1 -
> arch/powerpc/include/asm/book3s/64/hash.h | 4 +-
> arch/powerpc/include/asm/book3s/64/pgalloc.h | 4 +-
> arch/powerpc/include/asm/book3s/64/pgtable.h | 60 ++++++++++---------
> arch/powerpc/include/asm/book3s/64/radix.h | 6 +-
> arch/powerpc/include/asm/nohash/32/pgtable.h | 1 -
> arch/powerpc/include/asm/nohash/64/pgalloc.h | 2 +-
> .../include/asm/nohash/64/pgtable-4k.h | 32 +++++-----
> arch/powerpc/include/asm/nohash/64/pgtable.h | 6 +-
> arch/powerpc/include/asm/pgtable.h | 6 +-
> arch/powerpc/kvm/book3s_64_mmu_radix.c | 30 ++++++----
> arch/powerpc/lib/code-patching.c | 7 ++-
> arch/powerpc/mm/book3s32/mmu.c | 2 +-
> arch/powerpc/mm/book3s32/tlb.c | 4 +-
> arch/powerpc/mm/book3s64/hash_pgtable.c | 4 +-
> arch/powerpc/mm/book3s64/radix_pgtable.c | 26 +++++---
> arch/powerpc/mm/book3s64/subpage_prot.c | 6 +-
> arch/powerpc/mm/hugetlbpage.c | 28 +++++----
> arch/powerpc/mm/kasan/kasan_init_32.c | 8 +--
> arch/powerpc/mm/mem.c | 4 +-
> arch/powerpc/mm/nohash/40x.c | 4 +-
> arch/powerpc/mm/nohash/book3e_pgtable.c | 15 ++---
> arch/powerpc/mm/pgtable.c | 30 ++++++----
> arch/powerpc/mm/pgtable_32.c | 45 +++-----------
> arch/powerpc/mm/pgtable_64.c | 10 ++--
> arch/powerpc/mm/ptdump/hashpagetable.c | 20 ++++++-
> arch/powerpc/mm/ptdump/ptdump.c | 14 +++--
> arch/powerpc/xmon/xmon.c | 18 +++---
> 28 files changed, 213 insertions(+), 184 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/book3s/32/pgtable.h b/arch/powerpc/include/asm/book3s/32/pgtable.h
> index 5b39c11e884a..39ec11371be0 100644
> --- a/arch/powerpc/include/asm/book3s/32/pgtable.h
> +++ b/arch/powerpc/include/asm/book3s/32/pgtable.h
> @@ -2,7 +2,6 @@
> #ifndef _ASM_POWERPC_BOOK3S_32_PGTABLE_H
> #define _ASM_POWERPC_BOOK3S_32_PGTABLE_H
>
> -#define __ARCH_USE_5LEVEL_HACK
> #include <asm-generic/pgtable-nopmd.h>
>
> #include <asm/book3s/32/hash.h>
> diff --git a/arch/powerpc/include/asm/book3s/64/hash.h b/arch/powerpc/include/asm/book3s/64/hash.h
> index 2781ebf6add4..876d1528c2cf 100644
> --- a/arch/powerpc/include/asm/book3s/64/hash.h
> +++ b/arch/powerpc/include/asm/book3s/64/hash.h
> @@ -134,9 +134,9 @@ static inline int get_region_id(unsigned long ea)
>
> #define hash__pmd_bad(pmd) (pmd_val(pmd) & H_PMD_BAD_BITS)
> #define hash__pud_bad(pud) (pud_val(pud) & H_PUD_BAD_BITS)
> -static inline int hash__pgd_bad(pgd_t pgd)
> +static inline int hash__p4d_bad(p4d_t p4d)
> {
> - return (pgd_val(pgd) == 0);
> + return (p4d_val(p4d) == 0);
> }
> #ifdef CONFIG_STRICT_KERNEL_RWX
> extern void hash__mark_rodata_ro(void);
> diff --git a/arch/powerpc/include/asm/book3s/64/pgalloc.h b/arch/powerpc/include/asm/book3s/64/pgalloc.h
> index a41e91bd0580..69c5b051734f 100644
> --- a/arch/powerpc/include/asm/book3s/64/pgalloc.h
> +++ b/arch/powerpc/include/asm/book3s/64/pgalloc.h
> @@ -85,9 +85,9 @@ static inline void pgd_free(struct mm_struct *mm, pgd_t *pgd)
> kmem_cache_free(PGT_CACHE(PGD_INDEX_SIZE), pgd);
> }
>
> -static inline void pgd_populate(struct mm_struct *mm, pgd_t *pgd, pud_t *pud)
> +static inline void p4d_populate(struct mm_struct *mm, p4d_t *pgd, pud_t *pud)
> {
> - *pgd = __pgd(__pgtable_ptr_val(pud) | PGD_VAL_BITS);
> + *pgd = __p4d(__pgtable_ptr_val(pud) | PGD_VAL_BITS);
> }
>
> static inline pud_t *pud_alloc_one(struct mm_struct *mm, unsigned long addr)
> diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
> index 201a69e6a355..fa60e8594b9f 100644
> --- a/arch/powerpc/include/asm/book3s/64/pgtable.h
> +++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
> @@ -2,7 +2,7 @@
> #ifndef _ASM_POWERPC_BOOK3S_64_PGTABLE_H_
> #define _ASM_POWERPC_BOOK3S_64_PGTABLE_H_
>
> -#include <asm-generic/5level-fixup.h>
> +#include <asm-generic/pgtable-nop4d.h>
>
> #ifndef __ASSEMBLY__
> #include <linux/mmdebug.h>
> @@ -251,7 +251,7 @@ extern unsigned long __pmd_frag_size_shift;
> /* Bits to mask out from a PUD to get to the PMD page */
> #define PUD_MASKED_BITS 0xc0000000000000ffUL
> /* Bits to mask out from a PGD to get to the PUD page */
> -#define PGD_MASKED_BITS 0xc0000000000000ffUL
> +#define P4D_MASKED_BITS 0xc0000000000000ffUL
>
> /*
> * Used as an indicator for rcu callback functions
> @@ -949,54 +949,60 @@ static inline bool pud_access_permitted(pud_t pud, bool write)
> return pte_access_permitted(pud_pte(pud), write);
> }
>
> -#define pgd_write(pgd) pte_write(pgd_pte(pgd))
> +#define __p4d_raw(x) ((p4d_t) { __pgd_raw(x) })
> +static inline __be64 p4d_raw(p4d_t x)
> +{
> + return pgd_raw(x.pgd);
> +}
> +
> +#define p4d_write(p4d) pte_write(p4d_pte(p4d))
>
> -static inline void pgd_clear(pgd_t *pgdp)
> +static inline void p4d_clear(p4d_t *p4dp)
> {
> - *pgdp = __pgd(0);
> + *p4dp = __p4d(0);
> }
>
> -static inline int pgd_none(pgd_t pgd)
> +static inline int p4d_none(p4d_t p4d)
> {
> - return !pgd_raw(pgd);
> + return !p4d_raw(p4d);
> }
>
> -static inline int pgd_present(pgd_t pgd)
> +static inline int p4d_present(p4d_t p4d)
> {
> - return !!(pgd_raw(pgd) & cpu_to_be64(_PAGE_PRESENT));
> + return !!(p4d_raw(p4d) & cpu_to_be64(_PAGE_PRESENT));
> }
>
> -static inline pte_t pgd_pte(pgd_t pgd)
> +static inline pte_t p4d_pte(p4d_t p4d)
> {
> - return __pte_raw(pgd_raw(pgd));
> + return __pte_raw(p4d_raw(p4d));
> }
>
> -static inline pgd_t pte_pgd(pte_t pte)
> +static inline p4d_t pte_p4d(pte_t pte)
> {
> - return __pgd_raw(pte_raw(pte));
> + return __p4d_raw(pte_raw(pte));
> }
>
> -static inline int pgd_bad(pgd_t pgd)
> +static inline int p4d_bad(p4d_t p4d)
> {
> if (radix_enabled())
> - return radix__pgd_bad(pgd);
> - return hash__pgd_bad(pgd);
> + return radix__p4d_bad(p4d);
> + return hash__p4d_bad(p4d);
> }
>
> -#define pgd_access_permitted pgd_access_permitted
> -static inline bool pgd_access_permitted(pgd_t pgd, bool write)
> +#define p4d_access_permitted p4d_access_permitted
> +static inline bool p4d_access_permitted(p4d_t p4d, bool write)
> {
> - return pte_access_permitted(pgd_pte(pgd), write);
> + return pte_access_permitted(p4d_pte(p4d), write);
> }
>
> -extern struct page *pgd_page(pgd_t pgd);
> +extern struct page *p4d_page(p4d_t p4d);
>
> /* Pointers in the page table tree are physical addresses */
> #define __pgtable_ptr_val(ptr) __pa(ptr)
>
> #define pmd_page_vaddr(pmd) __va(pmd_val(pmd) & ~PMD_MASKED_BITS)
> #define pud_page_vaddr(pud) __va(pud_val(pud) & ~PUD_MASKED_BITS)
> -#define pgd_page_vaddr(pgd) __va(pgd_val(pgd) & ~PGD_MASKED_BITS)
> +#define p4d_page_vaddr(p4d) __va(p4d_val(p4d) & ~P4D_MASKED_BITS)
>
> #define pgd_index(address) (((address) >> (PGDIR_SHIFT)) & (PTRS_PER_PGD - 1))
> #define pud_index(address) (((address) >> (PUD_SHIFT)) & (PTRS_PER_PUD - 1))
> @@ -1010,8 +1016,8 @@ extern struct page *pgd_page(pgd_t pgd);
>
> #define pgd_offset(mm, address) ((mm)->pgd + pgd_index(address))
>
> -#define pud_offset(pgdp, addr) \
> - (((pud_t *) pgd_page_vaddr(*(pgdp))) + pud_index(addr))
> +#define pud_offset(p4dp, addr) \
> + (((pud_t *) p4d_page_vaddr(*(p4dp))) + pud_index(addr))
> #define pmd_offset(pudp,addr) \
> (((pmd_t *) pud_page_vaddr(*(pudp))) + pmd_index(addr))
> #define pte_offset_kernel(dir,addr) \
> @@ -1368,11 +1374,11 @@ static inline bool pud_is_leaf(pud_t pud)
> return !!(pud_raw(pud) & cpu_to_be64(_PAGE_PTE));
> }
>
> -#define pgd_is_leaf pgd_is_leaf
> -#define pgd_leaf pgd_is_leaf
> -static inline bool pgd_is_leaf(pgd_t pgd)
> +#define p4d_is_leaf p4d_is_leaf
> +#define p4d_leaf p4d_is_leaf
> +static inline bool p4d_is_leaf(p4d_t p4d)
> {
> - return !!(pgd_raw(pgd) & cpu_to_be64(_PAGE_PTE));
> + return !!(p4d_raw(p4d) & cpu_to_be64(_PAGE_PTE));
> }
>
> #endif /* __ASSEMBLY__ */
> diff --git a/arch/powerpc/include/asm/book3s/64/radix.h b/arch/powerpc/include/asm/book3s/64/radix.h
> index d97db3ad9aae..9bca2ac64220 100644
> --- a/arch/powerpc/include/asm/book3s/64/radix.h
> +++ b/arch/powerpc/include/asm/book3s/64/radix.h
> @@ -30,7 +30,7 @@
> /* Don't have anything in the reserved bits and leaf bits */
> #define RADIX_PMD_BAD_BITS 0x60000000000000e0UL
> #define RADIX_PUD_BAD_BITS 0x60000000000000e0UL
> -#define RADIX_PGD_BAD_BITS 0x60000000000000e0UL
> +#define RADIX_P4D_BAD_BITS 0x60000000000000e0UL
>
> #define RADIX_PMD_SHIFT (PAGE_SHIFT + RADIX_PTE_INDEX_SIZE)
> #define RADIX_PUD_SHIFT (RADIX_PMD_SHIFT + RADIX_PMD_INDEX_SIZE)
> @@ -227,9 +227,9 @@ static inline int radix__pud_bad(pud_t pud)
> }
>
>
> -static inline int radix__pgd_bad(pgd_t pgd)
> +static inline int radix__p4d_bad(p4d_t p4d)
> {
> - return !!(pgd_val(pgd) & RADIX_PGD_BAD_BITS);
> + return !!(p4d_val(p4d) & RADIX_P4D_BAD_BITS);
> }
>
> #ifdef CONFIG_TRANSPARENT_HUGEPAGE
> diff --git a/arch/powerpc/include/asm/nohash/32/pgtable.h b/arch/powerpc/include/asm/nohash/32/pgtable.h
> index 60c4d829152e..d4c2c4259fa3 100644
> --- a/arch/powerpc/include/asm/nohash/32/pgtable.h
> +++ b/arch/powerpc/include/asm/nohash/32/pgtable.h
> @@ -2,7 +2,6 @@
> #ifndef _ASM_POWERPC_NOHASH_32_PGTABLE_H
> #define _ASM_POWERPC_NOHASH_32_PGTABLE_H
>
> -#define __ARCH_USE_5LEVEL_HACK
> #include <asm-generic/pgtable-nopmd.h>
>
> #ifndef __ASSEMBLY__
> diff --git a/arch/powerpc/include/asm/nohash/64/pgalloc.h b/arch/powerpc/include/asm/nohash/64/pgalloc.h
> index b9534a793293..668aee6017e7 100644
> --- a/arch/powerpc/include/asm/nohash/64/pgalloc.h
> +++ b/arch/powerpc/include/asm/nohash/64/pgalloc.h
> @@ -15,7 +15,7 @@ struct vmemmap_backing {
> };
> extern struct vmemmap_backing *vmemmap_list;
>
> -#define pgd_populate(MM, PGD, PUD) pgd_set(PGD, (unsigned long)PUD)
> +#define p4d_populate(MM, P4D, PUD) p4d_set(P4D, (unsigned long)PUD)
>
> static inline pud_t *pud_alloc_one(struct mm_struct *mm, unsigned long addr)
> {
> diff --git a/arch/powerpc/include/asm/nohash/64/pgtable-4k.h b/arch/powerpc/include/asm/nohash/64/pgtable-4k.h
> index c40ec32b8194..81b1c54e3cf1 100644
> --- a/arch/powerpc/include/asm/nohash/64/pgtable-4k.h
> +++ b/arch/powerpc/include/asm/nohash/64/pgtable-4k.h
> @@ -2,7 +2,7 @@
> #ifndef _ASM_POWERPC_NOHASH_64_PGTABLE_4K_H
> #define _ASM_POWERPC_NOHASH_64_PGTABLE_4K_H
>
> -#include <asm-generic/5level-fixup.h>
> +#include <asm-generic/pgtable-nop4d.h>
>
> /*
> * Entries per page directory level. The PTE level must use a 64b record
> @@ -45,41 +45,41 @@
> #define PMD_MASKED_BITS 0
> /* Bits to mask out from a PUD to get to the PMD page */
> #define PUD_MASKED_BITS 0
> -/* Bits to mask out from a PGD to get to the PUD page */
> -#define PGD_MASKED_BITS 0
> +/* Bits to mask out from a P4D to get to the PUD page */
> +#define P4D_MASKED_BITS 0
>
>
> /*
> * 4-level page tables related bits
> */
>
> -#define pgd_none(pgd) (!pgd_val(pgd))
> -#define pgd_bad(pgd) (pgd_val(pgd) == 0)
> -#define pgd_present(pgd) (pgd_val(pgd) != 0)
> -#define pgd_page_vaddr(pgd) (pgd_val(pgd) & ~PGD_MASKED_BITS)
> +#define p4d_none(p4d) (!p4d_val(p4d))
> +#define p4d_bad(p4d) (p4d_val(p4d) == 0)
> +#define p4d_present(p4d) (p4d_val(p4d) != 0)
> +#define p4d_page_vaddr(p4d) (p4d_val(p4d) & ~P4D_MASKED_BITS)
>
> #ifndef __ASSEMBLY__
>
> -static inline void pgd_clear(pgd_t *pgdp)
> +static inline void p4d_clear(p4d_t *p4dp)
> {
> - *pgdp = __pgd(0);
> + *p4dp = __p4d(0);
> }
>
> -static inline pte_t pgd_pte(pgd_t pgd)
> +static inline pte_t p4d_pte(p4d_t p4d)
> {
> - return __pte(pgd_val(pgd));
> + return __pte(p4d_val(p4d));
> }
>
> -static inline pgd_t pte_pgd(pte_t pte)
> +static inline p4d_t pte_p4d(pte_t pte)
> {
> - return __pgd(pte_val(pte));
> + return __p4d(pte_val(pte));
> }
> -extern struct page *pgd_page(pgd_t pgd);
> +extern struct page *p4d_page(p4d_t p4d);
>
> #endif /* !__ASSEMBLY__ */
>
> -#define pud_offset(pgdp, addr) \
> - (((pud_t *) pgd_page_vaddr(*(pgdp))) + \
> +#define pud_offset(p4dp, addr) \
> + (((pud_t *) p4d_page_vaddr(*(p4dp))) + \
> (((addr) >> PUD_SHIFT) & (PTRS_PER_PUD - 1)))
>
> #define pud_ERROR(e) \
> diff --git a/arch/powerpc/include/asm/nohash/64/pgtable.h b/arch/powerpc/include/asm/nohash/64/pgtable.h
> index 9a33b8bd842d..b360f262b9c6 100644
> --- a/arch/powerpc/include/asm/nohash/64/pgtable.h
> +++ b/arch/powerpc/include/asm/nohash/64/pgtable.h
> @@ -175,11 +175,11 @@ static inline pud_t pte_pud(pte_t pte)
> return __pud(pte_val(pte));
> }
> #define pud_write(pud) pte_write(pud_pte(pud))
> -#define pgd_write(pgd) pte_write(pgd_pte(pgd))
> +#define p4d_write(p4d) pte_write(p4d_pte(p4d))
>
> -static inline void pgd_set(pgd_t *pgdp, unsigned long val)
> +static inline void p4d_set(p4d_t *p4dp, unsigned long val)
> {
> - *pgdp = __pgd(val);
> + *p4dp = __p4d(val);
> }
>
> /*
> diff --git a/arch/powerpc/include/asm/pgtable.h b/arch/powerpc/include/asm/pgtable.h
> index 8cc543ed114c..05205d7a7b4a 100644
> --- a/arch/powerpc/include/asm/pgtable.h
> +++ b/arch/powerpc/include/asm/pgtable.h
> @@ -139,9 +139,9 @@ static inline bool pud_is_leaf(pud_t pud)
> }
> #endif
>
> -#ifndef pgd_is_leaf
> -#define pgd_is_leaf pgd_is_leaf
> -static inline bool pgd_is_leaf(pgd_t pgd)
> +#ifndef p4d_is_leaf
> +#define p4d_is_leaf p4d_is_leaf
> +static inline bool p4d_is_leaf(p4d_t p4d)
> {
> return false;
> }
> diff --git a/arch/powerpc/kvm/book3s_64_mmu_radix.c b/arch/powerpc/kvm/book3s_64_mmu_radix.c
> index 803940d79b73..beb694285100 100644
> --- a/arch/powerpc/kvm/book3s_64_mmu_radix.c
> +++ b/arch/powerpc/kvm/book3s_64_mmu_radix.c
> @@ -499,13 +499,14 @@ void kvmppc_free_pgtable_radix(struct kvm *kvm, pgd_t *pgd, unsigned int lpid)
> unsigned long ig;
>
> for (ig = 0; ig < PTRS_PER_PGD; ++ig, ++pgd) {
> + p4d_t *p4d = p4d_offset(pgd, 0);
> pud_t *pud;
>
> - if (!pgd_present(*pgd))
> + if (!p4d_present(*p4d))
> continue;
> - pud = pud_offset(pgd, 0);
> + pud = pud_offset(p4d, 0);
> kvmppc_unmap_free_pud(kvm, pud, lpid);
> - pgd_clear(pgd);
> + p4d_clear(p4d);
> }
> }
>
> @@ -566,6 +567,7 @@ int kvmppc_create_pte(struct kvm *kvm, pgd_t *pgtable, pte_t pte,
> unsigned long *rmapp, struct rmap_nested **n_rmap)
> {
> pgd_t *pgd;
> + p4d_t *p4d;
> pud_t *pud, *new_pud = NULL;
> pmd_t *pmd, *new_pmd = NULL;
> pte_t *ptep, *new_ptep = NULL;
> @@ -573,9 +575,11 @@ int kvmppc_create_pte(struct kvm *kvm, pgd_t *pgtable, pte_t pte,
>
> /* Traverse the guest's 2nd-level tree, allocate new levels needed */
> pgd = pgtable + pgd_index(gpa);
> + p4d = p4d_offset(pgd, gpa);
> +
> pud = NULL;
> - if (pgd_present(*pgd))
> - pud = pud_offset(pgd, gpa);
> + if (p4d_present(*p4d))
> + pud = pud_offset(p4d, gpa);
> else
> new_pud = pud_alloc_one(kvm->mm, gpa);
>
> @@ -596,13 +600,13 @@ int kvmppc_create_pte(struct kvm *kvm, pgd_t *pgtable, pte_t pte,
>
> /* Now traverse again under the lock and change the tree */
> ret = -ENOMEM;
> - if (pgd_none(*pgd)) {
> + if (p4d_none(*p4d)) {
> if (!new_pud)
> goto out_unlock;
> - pgd_populate(kvm->mm, pgd, new_pud);
> + p4d_populate(kvm->mm, p4d, new_pud);
> new_pud = NULL;
> }
> - pud = pud_offset(pgd, gpa);
> + pud = pud_offset(p4d, gpa);
> if (pud_is_leaf(*pud)) {
> unsigned long hgpa = gpa & PUD_MASK;
>
> @@ -1220,6 +1224,7 @@ static ssize_t debugfs_radix_read(struct file *file, char __user *buf,
> pgd_t *pgt;
> struct kvm_nested_guest *nested;
> pgd_t pgd, *pgdp;
> + p4d_t p4d, *p4dp;
> pud_t pud, *pudp;
> pmd_t pmd, *pmdp;
> pte_t *ptep;
> @@ -1292,13 +1297,14 @@ static ssize_t debugfs_radix_read(struct file *file, char __user *buf,
> }
>
> pgdp = pgt + pgd_index(gpa);
> - pgd = READ_ONCE(*pgdp);
> - if (!(pgd_val(pgd) & _PAGE_PRESENT)) {
> - gpa = (gpa & PGDIR_MASK) + PGDIR_SIZE;
> + p4dp = p4d_offset(pgdp, gpa);
> + p4d = READ_ONCE(*p4dp);
> + if (!(p4d_val(p4d) & _PAGE_PRESENT)) {
> + gpa = (gpa & P4D_MASK) + P4D_SIZE;
> continue;
> }
>
> - pudp = pud_offset(&pgd, gpa);
> + pudp = pud_offset(&p4d, gpa);
> pud = READ_ONCE(*pudp);
> if (!(pud_val(pud) & _PAGE_PRESENT)) {
> gpa = (gpa & PUD_MASK) + PUD_SIZE;
> diff --git a/arch/powerpc/lib/code-patching.c b/arch/powerpc/lib/code-patching.c
> index 3345f039a876..7a59f6863cec 100644
> --- a/arch/powerpc/lib/code-patching.c
> +++ b/arch/powerpc/lib/code-patching.c
> @@ -107,13 +107,18 @@ static inline int unmap_patch_area(unsigned long addr)
> pte_t *ptep;
> pmd_t *pmdp;
> pud_t *pudp;
> + p4d_t *p4dp;
> pgd_t *pgdp;
>
> pgdp = pgd_offset_k(addr);
> if (unlikely(!pgdp))
> return -EINVAL;
>
> - pudp = pud_offset(pgdp, addr);
> + p4dp = p4d_offset(pgdp, addr);
> + if (unlikely(!p4dp))
> + return -EINVAL;
> +
> + pudp = pud_offset(p4dp, addr);
> if (unlikely(!pudp))
> return -EINVAL;
>
> diff --git a/arch/powerpc/mm/book3s32/mmu.c b/arch/powerpc/mm/book3s32/mmu.c
> index f888cbb109b9..edef17c97206 100644
> --- a/arch/powerpc/mm/book3s32/mmu.c
> +++ b/arch/powerpc/mm/book3s32/mmu.c
> @@ -312,7 +312,7 @@ void hash_preload(struct mm_struct *mm, unsigned long ea)
>
> if (!Hash)
> return;
> - pmd = pmd_offset(pud_offset(pgd_offset(mm, ea), ea), ea);
> + pmd = pmd_offset(pud_offset(p4d_offset(pgd_offset(mm, ea), ea), ea), ea);
> if (!pmd_none(*pmd))
> add_hash_page(mm->context.id, ea, pmd_val(*pmd));
> }
> diff --git a/arch/powerpc/mm/book3s32/tlb.c b/arch/powerpc/mm/book3s32/tlb.c
> index 2fcd321040ff..175bc33b41b7 100644
> --- a/arch/powerpc/mm/book3s32/tlb.c
> +++ b/arch/powerpc/mm/book3s32/tlb.c
> @@ -87,7 +87,7 @@ static void flush_range(struct mm_struct *mm, unsigned long start,
> if (start >= end)
> return;
> end = (end - 1) | ~PAGE_MASK;
> - pmd = pmd_offset(pud_offset(pgd_offset(mm, start), start), start);
> + pmd = pmd_offset(pud_offset(p4d_offset(pgd_offset(mm, start), start), start), start);
> for (;;) {
> pmd_end = ((start + PGDIR_SIZE) & PGDIR_MASK) - 1;
> if (pmd_end > end)
> @@ -145,7 +145,7 @@ void flush_tlb_page(struct vm_area_struct *vma, unsigned long vmaddr)
> return;
> }
> mm = (vmaddr < TASK_SIZE)? vma->vm_mm: &init_mm;
> - pmd = pmd_offset(pud_offset(pgd_offset(mm, vmaddr), vmaddr), vmaddr);
> + pmd = pmd_offset(pud_offset(p4d_offset(pgd_offset(mm, vmaddr), vmaddr), vmaddr), vmaddr);
> if (!pmd_none(*pmd))
> flush_hash_pages(mm->context.id, vmaddr, pmd_val(*pmd), 1);
> }
> diff --git a/arch/powerpc/mm/book3s64/hash_pgtable.c b/arch/powerpc/mm/book3s64/hash_pgtable.c
> index 64733b9cb20a..9cd15937e88a 100644
> --- a/arch/powerpc/mm/book3s64/hash_pgtable.c
> +++ b/arch/powerpc/mm/book3s64/hash_pgtable.c
> @@ -148,6 +148,7 @@ void hash__vmemmap_remove_mapping(unsigned long start,
> int hash__map_kernel_page(unsigned long ea, unsigned long pa, pgprot_t prot)
> {
> pgd_t *pgdp;
> + p4d_t *p4dp;
> pud_t *pudp;
> pmd_t *pmdp;
> pte_t *ptep;
> @@ -155,7 +156,8 @@ int hash__map_kernel_page(unsigned long ea, unsigned long pa, pgprot_t prot)
> BUILD_BUG_ON(TASK_SIZE_USER64 > H_PGTABLE_RANGE);
> if (slab_is_available()) {
> pgdp = pgd_offset_k(ea);
> - pudp = pud_alloc(&init_mm, pgdp, ea);
> + p4dp = p4d_offset(pgdp, ea);
> + pudp = pud_alloc(&init_mm, p4dp, ea);
> if (!pudp)
> return -ENOMEM;
> pmdp = pmd_alloc(&init_mm, pudp, ea);
> diff --git a/arch/powerpc/mm/book3s64/radix_pgtable.c b/arch/powerpc/mm/book3s64/radix_pgtable.c
> index dd1bea45325c..fc3d0b0460b0 100644
> --- a/arch/powerpc/mm/book3s64/radix_pgtable.c
> +++ b/arch/powerpc/mm/book3s64/radix_pgtable.c
> @@ -64,17 +64,19 @@ static int early_map_kernel_page(unsigned long ea, unsigned long pa,
> {
> unsigned long pfn = pa >> PAGE_SHIFT;
> pgd_t *pgdp;
> + p4d_t *p4dp;
> pud_t *pudp;
> pmd_t *pmdp;
> pte_t *ptep;
>
> pgdp = pgd_offset_k(ea);
> - if (pgd_none(*pgdp)) {
> + p4dp = p4d_offset(pgdp, ea);
> + if (p4d_none(*p4dp)) {
> pudp = early_alloc_pgtable(PUD_TABLE_SIZE, nid,
> region_start, region_end);
> - pgd_populate(&init_mm, pgdp, pudp);
> + p4d_populate(&init_mm, p4dp, pudp);
> }
> - pudp = pud_offset(pgdp, ea);
> + pudp = pud_offset(p4dp, ea);
> if (map_page_size == PUD_SIZE) {
> ptep = (pte_t *)pudp;
> goto set_the_pte;
> @@ -114,6 +116,7 @@ static int __map_kernel_page(unsigned long ea, unsigned long pa,
> {
> unsigned long pfn = pa >> PAGE_SHIFT;
> pgd_t *pgdp;
> + p4d_t *p4dp;
> pud_t *pudp;
> pmd_t *pmdp;
> pte_t *ptep;
> @@ -136,7 +139,8 @@ static int __map_kernel_page(unsigned long ea, unsigned long pa,
> * boot.
> */
> pgdp = pgd_offset_k(ea);
> - pudp = pud_alloc(&init_mm, pgdp, ea);
> + p4dp = p4d_offset(pgdp, ea);
> + pudp = pud_alloc(&init_mm, p4dp, ea);
> if (!pudp)
> return -ENOMEM;
> if (map_page_size == PUD_SIZE) {
> @@ -173,6 +177,7 @@ void radix__change_memory_range(unsigned long start, unsigned long end,
> {
> unsigned long idx;
> pgd_t *pgdp;
> + p4d_t *p4dp;
> pud_t *pudp;
> pmd_t *pmdp;
> pte_t *ptep;
> @@ -185,7 +190,8 @@ void radix__change_memory_range(unsigned long start, unsigned long end,
>
> for (idx = start; idx < end; idx += PAGE_SIZE) {
> pgdp = pgd_offset_k(idx);
> - pudp = pud_alloc(&init_mm, pgdp, idx);
> + p4dp = p4d_offset(pgdp, idx);
> + pudp = pud_alloc(&init_mm, p4dp, idx);
> if (!pudp)
> continue;
> if (pud_is_leaf(*pudp)) {
> @@ -847,6 +853,7 @@ static void __meminit remove_pagetable(unsigned long start, unsigned long end)
> unsigned long addr, next;
> pud_t *pud_base;
> pgd_t *pgd;
> + p4d_t *p4d;
>
> spin_lock(&init_mm.page_table_lock);
>
> @@ -854,15 +861,16 @@ static void __meminit remove_pagetable(unsigned long start, unsigned long end)
> next = pgd_addr_end(addr, end);
>
> pgd = pgd_offset_k(addr);
> - if (!pgd_present(*pgd))
> + p4d = p4d_offset(pgd, addr);
> + if (!p4d_present(*p4d))
> continue;
>
> - if (pgd_is_leaf(*pgd)) {
> - split_kernel_mapping(addr, end, PGDIR_SIZE, (pte_t *)pgd);
> + if (p4d_is_leaf(*p4d)) {
> + split_kernel_mapping(addr, end, P4D_SIZE, (pte_t *)p4d);
> continue;
> }
>
> - pud_base = (pud_t *)pgd_page_vaddr(*pgd);
> + pud_base = (pud_t *)p4d_page_vaddr(*p4d);
> remove_pud_table(pud_base, addr, next);
> }
>
> diff --git a/arch/powerpc/mm/book3s64/subpage_prot.c b/arch/powerpc/mm/book3s64/subpage_prot.c
> index 2ef24a53f4c9..25a0c044bd93 100644
> --- a/arch/powerpc/mm/book3s64/subpage_prot.c
> +++ b/arch/powerpc/mm/book3s64/subpage_prot.c
> @@ -54,15 +54,17 @@ static void hpte_flush_range(struct mm_struct *mm, unsigned long addr,
> int npages)
> {
> pgd_t *pgd;
> + p4d_t *p4d;
> pud_t *pud;
> pmd_t *pmd;
> pte_t *pte;
> spinlock_t *ptl;
>
> pgd = pgd_offset(mm, addr);
> - if (pgd_none(*pgd))
> + p4d = p4d_offset(pgd, addr);
> + if (p4d_none(*p4d))
> return;
> - pud = pud_offset(pgd, addr);
> + pud = pud_offset(p4d, addr);
> if (pud_none(*pud))
> return;
> pmd = pmd_offset(pud, addr);
> diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c
> index 33b3461d91e8..54f5994d4cbb 100644
> --- a/arch/powerpc/mm/hugetlbpage.c
> +++ b/arch/powerpc/mm/hugetlbpage.c
> @@ -119,6 +119,7 @@ static int __hugepte_alloc(struct mm_struct *mm, hugepd_t *hpdp,
> pte_t *huge_pte_alloc(struct mm_struct *mm, unsigned long addr, unsigned long sz)
> {
> pgd_t *pg;
> + p4d_t *p4;
> pud_t *pu;
> pmd_t *pm;
> hugepd_t *hpdp = NULL;
> @@ -128,20 +129,21 @@ pte_t *huge_pte_alloc(struct mm_struct *mm, unsigned long addr, unsigned long sz
>
> addr &= ~(sz-1);
> pg = pgd_offset(mm, addr);
> + p4 = p4d_offset(pg, addr);
>
> #ifdef CONFIG_PPC_BOOK3S_64
> if (pshift == PGDIR_SHIFT)
> /* 16GB huge page */
> - return (pte_t *) pg;
> + return (pte_t *) p4;
> else if (pshift > PUD_SHIFT) {
> /*
> * We need to use hugepd table
> */
> ptl = &mm->page_table_lock;
> - hpdp = (hugepd_t *)pg;
> + hpdp = (hugepd_t *)p4;
> } else {
> pdshift = PUD_SHIFT;
> - pu = pud_alloc(mm, pg, addr);
> + pu = pud_alloc(mm, p4, addr);
> if (!pu)
> return NULL;
> if (pshift == PUD_SHIFT)
> @@ -166,10 +168,10 @@ pte_t *huge_pte_alloc(struct mm_struct *mm, unsigned long addr, unsigned long sz
> #else
> if (pshift >= PGDIR_SHIFT) {
> ptl = &mm->page_table_lock;
> - hpdp = (hugepd_t *)pg;
> + hpdp = (hugepd_t *)p4;
> } else {
> pdshift = PUD_SHIFT;
> - pu = pud_alloc(mm, pg, addr);
> + pu = pud_alloc(mm, p4, addr);
> if (!pu)
> return NULL;
> if (pshift >= PUD_SHIFT) {
> @@ -390,7 +392,7 @@ static void hugetlb_free_pmd_range(struct mmu_gather *tlb, pud_t *pud,
> mm_dec_nr_pmds(tlb->mm);
> }
>
> -static void hugetlb_free_pud_range(struct mmu_gather *tlb, pgd_t *pgd,
> +static void hugetlb_free_pud_range(struct mmu_gather *tlb, p4d_t *p4d,
> unsigned long addr, unsigned long end,
> unsigned long floor, unsigned long ceiling)
> {
> @@ -400,7 +402,7 @@ static void hugetlb_free_pud_range(struct mmu_gather *tlb, pgd_t *pgd,
>
> start = addr;
> do {
> - pud = pud_offset(pgd, addr);
> + pud = pud_offset(p4d, addr);
> next = pud_addr_end(addr, end);
> if (!is_hugepd(__hugepd(pud_val(*pud)))) {
> if (pud_none_or_clear_bad(pud))
> @@ -435,8 +437,8 @@ static void hugetlb_free_pud_range(struct mmu_gather *tlb, pgd_t *pgd,
> if (end - 1 > ceiling - 1)
> return;
>
> - pud = pud_offset(pgd, start);
> - pgd_clear(pgd);
> + pud = pud_offset(p4d, start);
> + p4d_clear(p4d);
> pud_free_tlb(tlb, pud, start);
> mm_dec_nr_puds(tlb->mm);
> }
> @@ -449,6 +451,7 @@ void hugetlb_free_pgd_range(struct mmu_gather *tlb,
> unsigned long floor, unsigned long ceiling)
> {
> pgd_t *pgd;
> + p4d_t *p4d;
> unsigned long next;
>
> /*
> @@ -471,10 +474,11 @@ void hugetlb_free_pgd_range(struct mmu_gather *tlb,
> do {
> next = pgd_addr_end(addr, end);
> pgd = pgd_offset(tlb->mm, addr);
> + p4d = p4d_offset(pgd, addr);
> if (!is_hugepd(__hugepd(pgd_val(*pgd)))) {
> - if (pgd_none_or_clear_bad(pgd))
> + if (p4d_none_or_clear_bad(p4d))
> continue;
> - hugetlb_free_pud_range(tlb, pgd, addr, next, floor, ceiling);
> + hugetlb_free_pud_range(tlb, p4d, addr, next, floor, ceiling);
> } else {
> unsigned long more;
> /*
> @@ -487,7 +491,7 @@ void hugetlb_free_pgd_range(struct mmu_gather *tlb,
> if (more > next)
> next = more;
>
> - free_hugepd_range(tlb, (hugepd_t *)pgd, PGDIR_SHIFT,
> + free_hugepd_range(tlb, (hugepd_t *)p4d, PGDIR_SHIFT,
> addr, next, floor, ceiling);
> }
> } while (addr = next, addr != end);
> diff --git a/arch/powerpc/mm/kasan/kasan_init_32.c b/arch/powerpc/mm/kasan/kasan_init_32.c
> index db5664dde5ff..88e2e16380b5 100644
> --- a/arch/powerpc/mm/kasan/kasan_init_32.c
> +++ b/arch/powerpc/mm/kasan/kasan_init_32.c
> @@ -36,7 +36,7 @@ static int __init kasan_init_shadow_page_tables(unsigned long k_start, unsigned
> unsigned long k_cur, k_next;
> pte_t *new = NULL;
>
> - pmd = pmd_offset(pud_offset(pgd_offset_k(k_start), k_start), k_start);
> + pmd = pmd_offset(pud_offset(p4d_offset(pgd_offset_k(k_start), k_start), k_start), k_start);
>
> for (k_cur = k_start; k_cur != k_end; k_cur = k_next, pmd++) {
> k_next = pgd_addr_end(k_cur, k_end);
> @@ -78,7 +78,7 @@ static int __init kasan_init_region(void *start, size_t size)
> block = memblock_alloc(k_end - k_start, PAGE_SIZE);
>
> for (k_cur = k_start & PAGE_MASK; k_cur < k_end; k_cur += PAGE_SIZE) {
> - pmd_t *pmd = pmd_offset(pud_offset(pgd_offset_k(k_cur), k_cur), k_cur);
> + pmd_t *pmd = pmd_offset(pud_offset(p4d_offset(pgd_offset_k(k_cur), k_cur), k_cur), k_cur);
> void *va = block + k_cur - k_start;
> pte_t pte = pfn_pte(PHYS_PFN(__pa(va)), PAGE_KERNEL);
>
> @@ -102,7 +102,7 @@ static void __init kasan_remap_early_shadow_ro(void)
> kasan_populate_pte(kasan_early_shadow_pte, prot);
>
> for (k_cur = k_start & PAGE_MASK; k_cur < k_end; k_cur += PAGE_SIZE) {
> - pmd_t *pmd = pmd_offset(pud_offset(pgd_offset_k(k_cur), k_cur), k_cur);
> + pmd_t *pmd = pmd_offset(pud_offset(p4d_offset(pgd_offset_k(k_cur), k_cur), k_cur), k_cur);
> pte_t *ptep = pte_offset_kernel(pmd, k_cur);
>
> if ((pte_val(*ptep) & PTE_RPN_MASK) != pa)
> @@ -201,7 +201,7 @@ void __init kasan_early_init(void)
> unsigned long addr = KASAN_SHADOW_START;
> unsigned long end = KASAN_SHADOW_END;
> unsigned long next;
> - pmd_t *pmd = pmd_offset(pud_offset(pgd_offset_k(addr), addr), addr);
> + pmd_t *pmd = pmd_offset(pud_offset(p4d_offset(pgd_offset_k(addr), addr), addr), addr);
>
> BUILD_BUG_ON(KASAN_SHADOW_START & ~PGDIR_MASK);
>
> diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c
> index ef7b1119b2e2..8262b384dcf3 100644
> --- a/arch/powerpc/mm/mem.c
> +++ b/arch/powerpc/mm/mem.c
> @@ -69,8 +69,8 @@ EXPORT_SYMBOL(kmap_prot);
>
> static inline pte_t *virt_to_kpte(unsigned long vaddr)
> {
> - return pte_offset_kernel(pmd_offset(pud_offset(pgd_offset_k(vaddr),
> - vaddr), vaddr), vaddr);
> + return pte_offset_kernel(pmd_offset(pud_offset(p4d_offset(pgd_offset_k(vaddr),
> + vaddr), vaddr), vaddr), vaddr);
> }
> #endif
>
> diff --git a/arch/powerpc/mm/nohash/40x.c b/arch/powerpc/mm/nohash/40x.c
> index f348104eb461..7aaf7155e350 100644
> --- a/arch/powerpc/mm/nohash/40x.c
> +++ b/arch/powerpc/mm/nohash/40x.c
> @@ -104,7 +104,7 @@ unsigned long __init mmu_mapin_ram(unsigned long base, unsigned long top)
> pmd_t *pmdp;
> unsigned long val = p | _PMD_SIZE_16M | _PAGE_EXEC | _PAGE_HWWRITE;
>
> - pmdp = pmd_offset(pud_offset(pgd_offset_k(v), v), v);
> + pmdp = pmd_offset(pud_offset(p4d_offset(pgd_offset_k(v), v), v), v);
> *pmdp++ = __pmd(val);
> *pmdp++ = __pmd(val);
> *pmdp++ = __pmd(val);
> @@ -119,7 +119,7 @@ unsigned long __init mmu_mapin_ram(unsigned long base, unsigned long top)
> pmd_t *pmdp;
> unsigned long val = p | _PMD_SIZE_4M | _PAGE_EXEC | _PAGE_HWWRITE;
>
> - pmdp = pmd_offset(pud_offset(pgd_offset_k(v), v), v);
> + pmdp = pmd_offset(pud_offset(p4d_offset(pgd_offset_k(v), v), v), v);
> *pmdp = __pmd(val);
>
> v += LARGE_PAGE_SIZE_4M;
> diff --git a/arch/powerpc/mm/nohash/book3e_pgtable.c b/arch/powerpc/mm/nohash/book3e_pgtable.c
> index 4637fdd469cf..77884e24281d 100644
> --- a/arch/powerpc/mm/nohash/book3e_pgtable.c
> +++ b/arch/powerpc/mm/nohash/book3e_pgtable.c
> @@ -73,6 +73,7 @@ static void __init *early_alloc_pgtable(unsigned long size)
> int __ref map_kernel_page(unsigned long ea, unsigned long pa, pgprot_t prot)
> {
> pgd_t *pgdp;
> + p4d_t *p4dp;
> pud_t *pudp;
> pmd_t *pmdp;
> pte_t *ptep;
> @@ -80,7 +81,8 @@ int __ref map_kernel_page(unsigned long ea, unsigned long pa, pgprot_t prot)
> BUILD_BUG_ON(TASK_SIZE_USER64 > PGTABLE_RANGE);
> if (slab_is_available()) {
> pgdp = pgd_offset_k(ea);
> - pudp = pud_alloc(&init_mm, pgdp, ea);
> + p4dp = p4d_offset(pgdp, ea);
> + pudp = pud_alloc(&init_mm, p4dp, ea);
> if (!pudp)
> return -ENOMEM;
> pmdp = pmd_alloc(&init_mm, pudp, ea);
> @@ -91,13 +93,12 @@ int __ref map_kernel_page(unsigned long ea, unsigned long pa, pgprot_t prot)
> return -ENOMEM;
> } else {
> pgdp = pgd_offset_k(ea);
> -#ifndef __PAGETABLE_PUD_FOLDED
> - if (pgd_none(*pgdp)) {
> - pudp = early_alloc_pgtable(PUD_TABLE_SIZE);
> - pgd_populate(&init_mm, pgdp, pudp);
> + p4dp = p4d_offset(pgdp, ea);
> + if (p4d_none(*p4dp)) {
> + pmdp = early_alloc_pgtable(PMD_TABLE_SIZE);
> + p4d_populate(&init_mm, p4dp, pmdp);
> }
> -#endif /* !__PAGETABLE_PUD_FOLDED */
> - pudp = pud_offset(pgdp, ea);
> + pudp = pud_offset(p4dp, ea);
> if (pud_none(*pudp)) {
> pmdp = early_alloc_pgtable(PMD_TABLE_SIZE);
> pud_populate(&init_mm, pudp, pmdp);
> diff --git a/arch/powerpc/mm/pgtable.c b/arch/powerpc/mm/pgtable.c
> index e3759b69f81b..c2499271f6c1 100644
> --- a/arch/powerpc/mm/pgtable.c
> +++ b/arch/powerpc/mm/pgtable.c
> @@ -265,6 +265,7 @@ int huge_ptep_set_access_flags(struct vm_area_struct *vma,
> void assert_pte_locked(struct mm_struct *mm, unsigned long addr)
> {
> pgd_t *pgd;
> + p4d_t *p4d;
> pud_t *pud;
> pmd_t *pmd;
>
> @@ -272,7 +273,9 @@ void assert_pte_locked(struct mm_struct *mm, unsigned long addr)
> return;
> pgd = mm->pgd + pgd_index(addr);
> BUG_ON(pgd_none(*pgd));
> - pud = pud_offset(pgd, addr);
> + p4d = p4d_offset(pgd, addr);
> + BUG_ON(p4d_none(*p4d));
> + pud = pud_offset(p4d, addr);
> BUG_ON(pud_none(*pud));
> pmd = pmd_offset(pud, addr);
> /*
> @@ -312,12 +315,13 @@ EXPORT_SYMBOL_GPL(vmalloc_to_phys);
> pte_t *__find_linux_pte(pgd_t *pgdir, unsigned long ea,
> bool *is_thp, unsigned *hpage_shift)
> {
> - pgd_t pgd, *pgdp;
> + pgd_t *pgdp;
> + p4d_t p4d, *p4dp;
> pud_t pud, *pudp;
> pmd_t pmd, *pmdp;
> pte_t *ret_pte;
> hugepd_t *hpdp = NULL;
> - unsigned pdshift = PGDIR_SHIFT;
> + unsigned pdshift;
>
> if (hpage_shift)
> *hpage_shift = 0;
> @@ -325,24 +329,28 @@ pte_t *__find_linux_pte(pgd_t *pgdir, unsigned long ea,
> if (is_thp)
> *is_thp = false;
>
> - pgdp = pgdir + pgd_index(ea);
> - pgd = READ_ONCE(*pgdp);
> /*
> * Always operate on the local stack value. This make sure the
> * value don't get updated by a parallel THP split/collapse,
> * page fault or a page unmap. The return pte_t * is still not
> * stable. So should be checked there for above conditions.
> + * Top level is an exception because it is folded into p4d.
> */
> - if (pgd_none(pgd))
> + pgdp = pgdir + pgd_index(ea);
> + p4dp = p4d_offset(pgdp, ea);
> + p4d = READ_ONCE(*p4dp);
> + pdshift = P4D_SHIFT;
> +
> + if (p4d_none(p4d))
> return NULL;
>
> - if (pgd_is_leaf(pgd)) {
> - ret_pte = (pte_t *)pgdp;
> + if (p4d_is_leaf(p4d)) {
> + ret_pte = (pte_t *)p4dp;
> goto out;
> }
>
> - if (is_hugepd(__hugepd(pgd_val(pgd)))) {
> - hpdp = (hugepd_t *)&pgd;
> + if (is_hugepd(__hugepd(p4d_val(p4d)))) {
> + hpdp = (hugepd_t *)&p4d;
> goto out_huge;
> }
>
> @@ -352,7 +360,7 @@ pte_t *__find_linux_pte(pgd_t *pgdir, unsigned long ea,
> * irq disabled
> */
> pdshift = PUD_SHIFT;
> - pudp = pud_offset(&pgd, ea);
> + pudp = pud_offset(&p4d, ea);
> pud = READ_ONCE(*pudp);
>
> if (pud_none(pud))
> diff --git a/arch/powerpc/mm/pgtable_32.c b/arch/powerpc/mm/pgtable_32.c
> index 5fb90edd865e..5774d4bc94d0 100644
> --- a/arch/powerpc/mm/pgtable_32.c
> +++ b/arch/powerpc/mm/pgtable_32.c
> @@ -63,7 +63,7 @@ int __ref map_kernel_page(unsigned long va, phys_addr_t pa, pgprot_t prot)
> int err = -ENOMEM;
>
> /* Use upper 10 bits of VA to index the first level map */
> - pd = pmd_offset(pud_offset(pgd_offset_k(va), va), va);
> + pd = pmd_offset(pud_offset(p4d_offset(pgd_offset_k(va), va), va), va);
> /* Use middle 10 bits of VA to index the second-level map */
> if (likely(slab_is_available()))
> pg = pte_alloc_kernel(pd, va);
> @@ -121,53 +121,24 @@ void __init mapin_ram(void)
> }
> }
>
> -/* Scan the real Linux page tables and return a PTE pointer for
> - * a virtual address in a context.
> - * Returns true (1) if PTE was found, zero otherwise. The pointer to
> - * the PTE pointer is unmodified if PTE is not found.
> - */
> -static int
> -get_pteptr(struct mm_struct *mm, unsigned long addr, pte_t **ptep, pmd_t **pmdp)
> -{
> - pgd_t *pgd;
> - pud_t *pud;
> - pmd_t *pmd;
> - pte_t *pte;
> - int retval = 0;
> -
> - pgd = pgd_offset(mm, addr & PAGE_MASK);
> - if (pgd) {
> - pud = pud_offset(pgd, addr & PAGE_MASK);
> - if (pud && pud_present(*pud)) {
> - pmd = pmd_offset(pud, addr & PAGE_MASK);
> - if (pmd_present(*pmd)) {
> - pte = pte_offset_map(pmd, addr & PAGE_MASK);
> - if (pte) {
> - retval = 1;
> - *ptep = pte;
> - if (pmdp)
> - *pmdp = pmd;
> - /* XXX caller needs to do pte_unmap, yuck */
> - }
> - }
> - }
> - }
> - return(retval);
> -}
> -
> static int __change_page_attr_noflush(struct page *page, pgprot_t prot)
> {
> pte_t *kpte;
> pmd_t *kpmd;
> - unsigned long address;
> + unsigned long address, va;
>
> BUG_ON(PageHighMem(page));
> address = (unsigned long)page_address(page);
> + va = address & PAGE_MASK;
>
> if (v_block_mapped(address))
> return 0;
> - if (!get_pteptr(&init_mm, address, &kpte, &kpmd))
> +
> + kpmd = pmd_offset(pud_offset(p4d_offset(pgd_offset_k(va), va), va), va);
> + if (!pmd_present(*kpmd))
> return -EINVAL;
> +
> + kpte = pte_offset_map(kpmd, va);
> __set_pte_at(&init_mm, address, kpte, mk_pte(page, prot), 0);
> pte_unmap(kpte);
>
> diff --git a/arch/powerpc/mm/pgtable_64.c b/arch/powerpc/mm/pgtable_64.c
> index e78832dce7bb..1f86a88fd4bb 100644
> --- a/arch/powerpc/mm/pgtable_64.c
> +++ b/arch/powerpc/mm/pgtable_64.c
> @@ -101,13 +101,13 @@ EXPORT_SYMBOL(__pte_frag_size_shift);
>
> #ifndef __PAGETABLE_PUD_FOLDED
> /* 4 level page table */
> -struct page *pgd_page(pgd_t pgd)
> +struct page *p4d_page(p4d_t p4d)
> {
> - if (pgd_is_leaf(pgd)) {
> - VM_WARN_ON(!pgd_huge(pgd));
> - return pte_page(pgd_pte(pgd));
> + if (p4d_is_leaf(p4d)) {
> + VM_WARN_ON(!p4d_huge(p4d));
> + return pte_page(p4d_pte(p4d));
> }
> - return virt_to_page(pgd_page_vaddr(pgd));
> + return virt_to_page(p4d_page_vaddr(p4d));
> }
> #endif
>
> diff --git a/arch/powerpc/mm/ptdump/hashpagetable.c b/arch/powerpc/mm/ptdump/hashpagetable.c
> index a07278027c6f..ac360ad865a8 100644
> --- a/arch/powerpc/mm/ptdump/hashpagetable.c
> +++ b/arch/powerpc/mm/ptdump/hashpagetable.c
> @@ -417,9 +417,9 @@ static void walk_pmd(struct pg_state *st, pud_t *pud, unsigned long start)
> }
> }
>
> -static void walk_pud(struct pg_state *st, pgd_t *pgd, unsigned long start)
> +static void walk_pud(struct pg_state *st, p4d_t *p4d, unsigned long start)
> {
> - pud_t *pud = pud_offset(pgd, 0);
> + pud_t *pud = pud_offset(p4d, 0);
> unsigned long addr;
> unsigned int i;
>
> @@ -431,6 +431,20 @@ static void walk_pud(struct pg_state *st, pgd_t *pgd, unsigned long start)
> }
> }
>
> +static void walk_p4d(struct pg_state *st, pgd_t *pgd, unsigned long start)
> +{
> + p4d_t *p4d = p4d_offset(pgd, 0);
> + unsigned long addr;
> + unsigned int i;
> +
> + for (i = 0; i < PTRS_PER_P4D; i++, p4d++) {
> + addr = start + i * P4D_SIZE;
> + if (!p4d_none(*p4d))
> + /* p4d exists */
> + walk_pud(st, p4d, addr);
> + }
> +}
> +
> static void walk_pagetables(struct pg_state *st)
> {
> pgd_t *pgd = pgd_offset_k(0UL);
> @@ -445,7 +459,7 @@ static void walk_pagetables(struct pg_state *st)
> addr = KERN_VIRT_START + i * PGDIR_SIZE;
> if (!pgd_none(*pgd))
> /* pgd exists */
> - walk_pud(st, pgd, addr);
> + walk_p4d(st, pgd, addr);
> }
> }
>
> diff --git a/arch/powerpc/mm/ptdump/ptdump.c b/arch/powerpc/mm/ptdump/ptdump.c
> index 206156255247..9d6256b61df3 100644
> --- a/arch/powerpc/mm/ptdump/ptdump.c
> +++ b/arch/powerpc/mm/ptdump/ptdump.c
> @@ -277,9 +277,9 @@ static void walk_pmd(struct pg_state *st, pud_t *pud, unsigned long start)
> }
> }
>
> -static void walk_pud(struct pg_state *st, pgd_t *pgd, unsigned long start)
> +static void walk_pud(struct pg_state *st, p4d_t *p4d, unsigned long start)
> {
> - pud_t *pud = pud_offset(pgd, 0);
> + pud_t *pud = pud_offset(p4d, 0);
> unsigned long addr;
> unsigned int i;
>
> @@ -304,11 +304,13 @@ static void walk_pagetables(struct pg_state *st)
> * the hash pagetable.
> */
> for (i = pgd_index(addr); i < PTRS_PER_PGD; i++, pgd++, addr += PGDIR_SIZE) {
> - if (!pgd_none(*pgd) && !pgd_is_leaf(*pgd))
> - /* pgd exists */
> - walk_pud(st, pgd, addr);
> + p4d_t *p4d = p4d_offset(pgd, 0);
> +
> + if (!p4d_none(*p4d) && !p4d_is_leaf(*p4d))
> + /* p4d exists */
> + walk_pud(st, p4d, addr);
> else
> - note_page(st, addr, 1, pgd_val(*pgd));
> + note_page(st, addr, 1, p4d_val(*p4d));
> }
> }
>
> diff --git a/arch/powerpc/xmon/xmon.c b/arch/powerpc/xmon/xmon.c
> index 0ec9640335bb..3e29128c58cc 100644
> --- a/arch/powerpc/xmon/xmon.c
> +++ b/arch/powerpc/xmon/xmon.c
> @@ -3130,6 +3130,7 @@ static void show_pte(unsigned long addr)
> struct task_struct *tsk = NULL;
> struct mm_struct *mm;
> pgd_t *pgdp, *pgdir;
> + p4d_t *p4dp;
> pud_t *pudp;
> pmd_t *pmdp;
> pte_t *ptep;
> @@ -3161,20 +3162,21 @@ static void show_pte(unsigned long addr)
> pgdir = pgd_offset(mm, 0);
> }
>
> - if (pgd_none(*pgdp)) {
> - printf("no linux page table for address\n");
> + p4dp = p4d_offset(pgdp, addr);
> +
> + if (p4d_none(*p4dp)) {
> + printf("No valid P4D\n");
> return;
> }
>
> - printf("pgd @ 0x%px\n", pgdir);
> -
> - if (pgd_is_leaf(*pgdp)) {
> - format_pte(pgdp, pgd_val(*pgdp));
> + if (p4d_is_leaf(*p4dp)) {
> + format_pte(p4dp, p4d_val(*p4dp));
> return;
> }
> - printf("pgdp @ 0x%px = 0x%016lx\n", pgdp, pgd_val(*pgdp));
>
> - pudp = pud_offset(pgdp, addr);
> + printf("p4dp @ 0x%px = 0x%016lx\n", p4dp, p4d_val(*p4dp));
> +
> + pudp = pud_offset(p4dp, addr);
>
> if (pud_none(*pudp)) {
> printf("No valid PUD\n");
>
On Wed, Feb 26, 2020 at 10:46:13AM +0100, Christophe Leroy wrote:
>
>
> On 26/02/2020 at 10:13, Mike Rapoport wrote:
> > On Tue, Feb 18, 2020 at 12:54:40PM +0200, Mike Rapoport wrote:
> > > On Sun, Feb 16, 2020 at 11:41:07AM +0100, Christophe Leroy wrote:
> > > >
> > > >
> > > > > On 16/02/2020 at 09:18, Mike Rapoport wrote:
> > > > > From: Mike Rapoport <[email protected]>
> > > > >
> > > > > Implement primitives necessary for the 4th level folding, add walks of p4d
> > > > > level where appropriate and replace 5level-fixup.h with pgtable-nop4d.h.
> > > >
> > > > I don't think it is worth adding all these additional walks of p4d; this
> > > > patch could be limited to changes like:
> > > >
> > > > - pud = pud_offset(pgd, gpa);
> > > > + pud = pud_offset(p4d_offset(pgd, gpa), gpa);
> > > >
> > > > The additional walks should be added through another patch the day powerpc
> > > > needs them.
> > >
> > > Ok, I'll update the patch to reduce walking the p4d.
> >
> > Here's what I have with more direct accesses from pgd to pud.
>
> I went through it quickly. This looks promising.
>
> Do we need the walk_p4d() in arch/powerpc/mm/ptdump/hashpagetable.c?
> Can't we just do
>
> @@ -445,7 +459,7 @@ static void walk_pagetables(struct pg_state *st)
> addr = KERN_VIRT_START + i * PGDIR_SIZE;
> if (!pgd_none(*pgd))
> /* pgd exists */
> - walk_pud(st, pgd, addr);
> + walk_pud(st, p4d_offset(pgd, addr), addr);
We can do:

	addr = KERN_VIRT_START + i * PGDIR_SIZE;
	p4d = p4d_offset(pgd, addr);
	if (!p4d_none(*p4d))
		walk_pud(st, p4d, addr);

But I don't think this is really essential. Again, we are trading off code
consistency against line count, and I don't think the line count is that
important.
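
For reference, here is an untested sketch of how the whole of
walk_pagetables() in ptdump/hashpagetable.c would read with the p4d step
folded into the loop rather than split out into a walk_p4d() helper. It
assumes walk_pud() keeps the p4d_t * argument from the patch above:

	static void walk_pagetables(struct pg_state *st)
	{
		pgd_t *pgd = pgd_offset_k(0UL);
		p4d_t *p4d;
		unsigned long addr;
		unsigned int i;

		/*
		 * Traverse kernel space. With the top level folded, each
		 * pgd entry corresponds 1:1 to a p4d entry.
		 */
		for (i = 0; i < PTRS_PER_PGD; i++, pgd++) {
			addr = KERN_VIRT_START + i * PGDIR_SIZE;
			p4d = p4d_offset(pgd, addr);
			if (!p4d_none(*p4d))
				/* p4d exists */
				walk_pud(st, p4d, addr);
		}
	}

With p4d folded, p4d_offset() is a no-op cast, so this should compile to
the same code as the walk_p4d() version; it is purely a readability choice.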
> }
> }
>
>
>
> Also, I think the removal of get_pteptr() should be a patch by itself.
>
> You could include my patches in your series. See
> https://patchwork.ozlabs.org/project/linuxppc-dev/list/?series=152102
>
> Christophe
>
>
>
> >
> > From 6c59a86ce8394fb6100e9b6ced2e346981fb0ce9 Mon Sep 17 00:00:00 2001
> > From: Mike Rapoport <[email protected]>
> > Date: Sun, 24 Nov 2019 15:38:00 +0200
> > Subject: [PATCH v3] powerpc: add support for folded p4d page tables
> >
> > Implement primitives necessary for the 4th level folding, add walks of p4d
> > level where appropriate and replace 5level-fixup.h with pgtable-nop4d.h.
> >
> > Signed-off-by: Mike Rapoport <[email protected]>
> > Tested-by: Christophe Leroy <[email protected]> # 8xx and 83xx
> > ---
> > v3:
> > * reduce amount of added p4d walks
> > * kill pgtable_32::get_pteptr and traverse page table in
> > pgtable_32::__change_page_attr_noflush
> >
> >
> > arch/powerpc/include/asm/book3s/32/pgtable.h | 1 -
> > arch/powerpc/include/asm/book3s/64/hash.h | 4 +-
> > arch/powerpc/include/asm/book3s/64/pgalloc.h | 4 +-
> > arch/powerpc/include/asm/book3s/64/pgtable.h | 60 ++++++++++---------
> > arch/powerpc/include/asm/book3s/64/radix.h | 6 +-
> > arch/powerpc/include/asm/nohash/32/pgtable.h | 1 -
> > arch/powerpc/include/asm/nohash/64/pgalloc.h | 2 +-
> > .../include/asm/nohash/64/pgtable-4k.h | 32 +++++-----
> > arch/powerpc/include/asm/nohash/64/pgtable.h | 6 +-
> > arch/powerpc/include/asm/pgtable.h | 6 +-
> > arch/powerpc/kvm/book3s_64_mmu_radix.c | 30 ++++++----
> > arch/powerpc/lib/code-patching.c | 7 ++-
> > arch/powerpc/mm/book3s32/mmu.c | 2 +-
> > arch/powerpc/mm/book3s32/tlb.c | 4 +-
> > arch/powerpc/mm/book3s64/hash_pgtable.c | 4 +-
> > arch/powerpc/mm/book3s64/radix_pgtable.c | 26 +++++---
> > arch/powerpc/mm/book3s64/subpage_prot.c | 6 +-
> > arch/powerpc/mm/hugetlbpage.c | 28 +++++----
> > arch/powerpc/mm/kasan/kasan_init_32.c | 8 +--
> > arch/powerpc/mm/mem.c | 4 +-
> > arch/powerpc/mm/nohash/40x.c | 4 +-
> > arch/powerpc/mm/nohash/book3e_pgtable.c | 15 ++---
> > arch/powerpc/mm/pgtable.c | 30 ++++++----
> > arch/powerpc/mm/pgtable_32.c | 45 +++-----------
> > arch/powerpc/mm/pgtable_64.c | 10 ++--
> > arch/powerpc/mm/ptdump/hashpagetable.c | 20 ++++++-
> > arch/powerpc/mm/ptdump/ptdump.c | 14 +++--
> > arch/powerpc/xmon/xmon.c | 18 +++---
> > 28 files changed, 213 insertions(+), 184 deletions(-)
> >
> > diff --git a/arch/powerpc/include/asm/book3s/32/pgtable.h b/arch/powerpc/include/asm/book3s/32/pgtable.h
> > index 5b39c11e884a..39ec11371be0 100644
> > --- a/arch/powerpc/include/asm/book3s/32/pgtable.h
> > +++ b/arch/powerpc/include/asm/book3s/32/pgtable.h
> > @@ -2,7 +2,6 @@
> > #ifndef _ASM_POWERPC_BOOK3S_32_PGTABLE_H
> > #define _ASM_POWERPC_BOOK3S_32_PGTABLE_H
> > -#define __ARCH_USE_5LEVEL_HACK
> > #include <asm-generic/pgtable-nopmd.h>
> > #include <asm/book3s/32/hash.h>
> > diff --git a/arch/powerpc/include/asm/book3s/64/hash.h b/arch/powerpc/include/asm/book3s/64/hash.h
> > index 2781ebf6add4..876d1528c2cf 100644
> > --- a/arch/powerpc/include/asm/book3s/64/hash.h
> > +++ b/arch/powerpc/include/asm/book3s/64/hash.h
> > @@ -134,9 +134,9 @@ static inline int get_region_id(unsigned long ea)
> > #define hash__pmd_bad(pmd) (pmd_val(pmd) & H_PMD_BAD_BITS)
> > #define hash__pud_bad(pud) (pud_val(pud) & H_PUD_BAD_BITS)
> > -static inline int hash__pgd_bad(pgd_t pgd)
> > +static inline int hash__p4d_bad(p4d_t p4d)
> > {
> > - return (pgd_val(pgd) == 0);
> > + return (p4d_val(p4d) == 0);
> > }
> > #ifdef CONFIG_STRICT_KERNEL_RWX
> > extern void hash__mark_rodata_ro(void);
> > diff --git a/arch/powerpc/include/asm/book3s/64/pgalloc.h b/arch/powerpc/include/asm/book3s/64/pgalloc.h
> > index a41e91bd0580..69c5b051734f 100644
> > --- a/arch/powerpc/include/asm/book3s/64/pgalloc.h
> > +++ b/arch/powerpc/include/asm/book3s/64/pgalloc.h
> > @@ -85,9 +85,9 @@ static inline void pgd_free(struct mm_struct *mm, pgd_t *pgd)
> > kmem_cache_free(PGT_CACHE(PGD_INDEX_SIZE), pgd);
> > }
> > -static inline void pgd_populate(struct mm_struct *mm, pgd_t *pgd, pud_t *pud)
> > +static inline void p4d_populate(struct mm_struct *mm, p4d_t *pgd, pud_t *pud)
> > {
> > - *pgd = __pgd(__pgtable_ptr_val(pud) | PGD_VAL_BITS);
> > + *pgd = __p4d(__pgtable_ptr_val(pud) | PGD_VAL_BITS);
> > }
> > static inline pud_t *pud_alloc_one(struct mm_struct *mm, unsigned long addr)
> > diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
> > index 201a69e6a355..fa60e8594b9f 100644
> > --- a/arch/powerpc/include/asm/book3s/64/pgtable.h
> > +++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
> > @@ -2,7 +2,7 @@
> > #ifndef _ASM_POWERPC_BOOK3S_64_PGTABLE_H_
> > #define _ASM_POWERPC_BOOK3S_64_PGTABLE_H_
> > -#include <asm-generic/5level-fixup.h>
> > +#include <asm-generic/pgtable-nop4d.h>
> > #ifndef __ASSEMBLY__
> > #include <linux/mmdebug.h>
> > @@ -251,7 +251,7 @@ extern unsigned long __pmd_frag_size_shift;
> > /* Bits to mask out from a PUD to get to the PMD page */
> > #define PUD_MASKED_BITS 0xc0000000000000ffUL
> > /* Bits to mask out from a PGD to get to the PUD page */
> > -#define PGD_MASKED_BITS 0xc0000000000000ffUL
> > +#define P4D_MASKED_BITS 0xc0000000000000ffUL
> > /*
> > * Used as an indicator for rcu callback functions
> > @@ -949,54 +949,60 @@ static inline bool pud_access_permitted(pud_t pud, bool write)
> > return pte_access_permitted(pud_pte(pud), write);
> > }
> > -#define pgd_write(pgd) pte_write(pgd_pte(pgd))
> > +#define __p4d_raw(x) ((p4d_t) { __pgd_raw(x) })
> > +static inline __be64 p4d_raw(p4d_t x)
> > +{
> > + return pgd_raw(x.pgd);
> > +}
> > +
> > +#define p4d_write(p4d) pte_write(p4d_pte(p4d))
> > -static inline void pgd_clear(pgd_t *pgdp)
> > +static inline void p4d_clear(p4d_t *p4dp)
> > {
> > - *pgdp = __pgd(0);
> > + *p4dp = __p4d(0);
> > }
> > -static inline int pgd_none(pgd_t pgd)
> > +static inline int p4d_none(p4d_t p4d)
> > {
> > - return !pgd_raw(pgd);
> > + return !p4d_raw(p4d);
> > }
> > -static inline int pgd_present(pgd_t pgd)
> > +static inline int p4d_present(p4d_t p4d)
> > {
> > - return !!(pgd_raw(pgd) & cpu_to_be64(_PAGE_PRESENT));
> > + return !!(p4d_raw(p4d) & cpu_to_be64(_PAGE_PRESENT));
> > }
> > -static inline pte_t pgd_pte(pgd_t pgd)
> > +static inline pte_t p4d_pte(p4d_t p4d)
> > {
> > - return __pte_raw(pgd_raw(pgd));
> > + return __pte_raw(p4d_raw(p4d));
> > }
> > -static inline pgd_t pte_pgd(pte_t pte)
> > +static inline p4d_t pte_p4d(pte_t pte)
> > {
> > - return __pgd_raw(pte_raw(pte));
> > + return __p4d_raw(pte_raw(pte));
> > }
> > -static inline int pgd_bad(pgd_t pgd)
> > +static inline int p4d_bad(p4d_t p4d)
> > {
> > if (radix_enabled())
> > - return radix__pgd_bad(pgd);
> > - return hash__pgd_bad(pgd);
> > + return radix__p4d_bad(p4d);
> > + return hash__p4d_bad(p4d);
> > }
> > -#define pgd_access_permitted pgd_access_permitted
> > -static inline bool pgd_access_permitted(pgd_t pgd, bool write)
> > +#define p4d_access_permitted p4d_access_permitted
> > +static inline bool p4d_access_permitted(p4d_t p4d, bool write)
> > {
> > - return pte_access_permitted(pgd_pte(pgd), write);
> > + return pte_access_permitted(p4d_pte(p4d), write);
> > }
> > -extern struct page *pgd_page(pgd_t pgd);
> > +extern struct page *p4d_page(p4d_t p4d);
> > /* Pointers in the page table tree are physical addresses */
> > #define __pgtable_ptr_val(ptr) __pa(ptr)
> > #define pmd_page_vaddr(pmd) __va(pmd_val(pmd) & ~PMD_MASKED_BITS)
> > #define pud_page_vaddr(pud) __va(pud_val(pud) & ~PUD_MASKED_BITS)
> > -#define pgd_page_vaddr(pgd) __va(pgd_val(pgd) & ~PGD_MASKED_BITS)
> > +#define p4d_page_vaddr(p4d) __va(p4d_val(p4d) & ~P4D_MASKED_BITS)
> > #define pgd_index(address) (((address) >> (PGDIR_SHIFT)) & (PTRS_PER_PGD - 1))
> > #define pud_index(address) (((address) >> (PUD_SHIFT)) & (PTRS_PER_PUD - 1))
> > @@ -1010,8 +1016,8 @@ extern struct page *pgd_page(pgd_t pgd);
> > #define pgd_offset(mm, address) ((mm)->pgd + pgd_index(address))
> > -#define pud_offset(pgdp, addr) \
> > - (((pud_t *) pgd_page_vaddr(*(pgdp))) + pud_index(addr))
> > +#define pud_offset(p4dp, addr) \
> > + (((pud_t *) p4d_page_vaddr(*(p4dp))) + pud_index(addr))
> > #define pmd_offset(pudp,addr) \
> > (((pmd_t *) pud_page_vaddr(*(pudp))) + pmd_index(addr))
> > #define pte_offset_kernel(dir,addr) \
> > @@ -1368,11 +1374,11 @@ static inline bool pud_is_leaf(pud_t pud)
> > return !!(pud_raw(pud) & cpu_to_be64(_PAGE_PTE));
> > }
> > -#define pgd_is_leaf pgd_is_leaf
> > -#define pgd_leaf pgd_is_leaf
> > -static inline bool pgd_is_leaf(pgd_t pgd)
> > +#define p4d_is_leaf p4d_is_leaf
> > +#define p4d_leaf p4d_is_leaf
> > +static inline bool p4d_is_leaf(p4d_t p4d)
> > {
> > - return !!(pgd_raw(pgd) & cpu_to_be64(_PAGE_PTE));
> > + return !!(p4d_raw(p4d) & cpu_to_be64(_PAGE_PTE));
> > }
> > #endif /* __ASSEMBLY__ */
> > diff --git a/arch/powerpc/include/asm/book3s/64/radix.h b/arch/powerpc/include/asm/book3s/64/radix.h
> > index d97db3ad9aae..9bca2ac64220 100644
> > --- a/arch/powerpc/include/asm/book3s/64/radix.h
> > +++ b/arch/powerpc/include/asm/book3s/64/radix.h
> > @@ -30,7 +30,7 @@
> > /* Don't have anything in the reserved bits and leaf bits */
> > #define RADIX_PMD_BAD_BITS 0x60000000000000e0UL
> > #define RADIX_PUD_BAD_BITS 0x60000000000000e0UL
> > -#define RADIX_PGD_BAD_BITS 0x60000000000000e0UL
> > +#define RADIX_P4D_BAD_BITS 0x60000000000000e0UL
> > #define RADIX_PMD_SHIFT (PAGE_SHIFT + RADIX_PTE_INDEX_SIZE)
> > #define RADIX_PUD_SHIFT (RADIX_PMD_SHIFT + RADIX_PMD_INDEX_SIZE)
> > @@ -227,9 +227,9 @@ static inline int radix__pud_bad(pud_t pud)
> > }
> > -static inline int radix__pgd_bad(pgd_t pgd)
> > +static inline int radix__p4d_bad(p4d_t p4d)
> > {
> > - return !!(pgd_val(pgd) & RADIX_PGD_BAD_BITS);
> > + return !!(p4d_val(p4d) & RADIX_P4D_BAD_BITS);
> > }
> > #ifdef CONFIG_TRANSPARENT_HUGEPAGE
> > diff --git a/arch/powerpc/include/asm/nohash/32/pgtable.h b/arch/powerpc/include/asm/nohash/32/pgtable.h
> > index 60c4d829152e..d4c2c4259fa3 100644
> > --- a/arch/powerpc/include/asm/nohash/32/pgtable.h
> > +++ b/arch/powerpc/include/asm/nohash/32/pgtable.h
> > @@ -2,7 +2,6 @@
> > #ifndef _ASM_POWERPC_NOHASH_32_PGTABLE_H
> > #define _ASM_POWERPC_NOHASH_32_PGTABLE_H
> > -#define __ARCH_USE_5LEVEL_HACK
> > #include <asm-generic/pgtable-nopmd.h>
> > #ifndef __ASSEMBLY__
> > diff --git a/arch/powerpc/include/asm/nohash/64/pgalloc.h b/arch/powerpc/include/asm/nohash/64/pgalloc.h
> > index b9534a793293..668aee6017e7 100644
> > --- a/arch/powerpc/include/asm/nohash/64/pgalloc.h
> > +++ b/arch/powerpc/include/asm/nohash/64/pgalloc.h
> > @@ -15,7 +15,7 @@ struct vmemmap_backing {
> > };
> > extern struct vmemmap_backing *vmemmap_list;
> > -#define pgd_populate(MM, PGD, PUD) pgd_set(PGD, (unsigned long)PUD)
> > +#define p4d_populate(MM, P4D, PUD) p4d_set(P4D, (unsigned long)PUD)
> > static inline pud_t *pud_alloc_one(struct mm_struct *mm, unsigned long addr)
> > {
> > diff --git a/arch/powerpc/include/asm/nohash/64/pgtable-4k.h b/arch/powerpc/include/asm/nohash/64/pgtable-4k.h
> > index c40ec32b8194..81b1c54e3cf1 100644
> > --- a/arch/powerpc/include/asm/nohash/64/pgtable-4k.h
> > +++ b/arch/powerpc/include/asm/nohash/64/pgtable-4k.h
> > @@ -2,7 +2,7 @@
> > #ifndef _ASM_POWERPC_NOHASH_64_PGTABLE_4K_H
> > #define _ASM_POWERPC_NOHASH_64_PGTABLE_4K_H
> > -#include <asm-generic/5level-fixup.h>
> > +#include <asm-generic/pgtable-nop4d.h>
> > /*
> > * Entries per page directory level. The PTE level must use a 64b record
> > @@ -45,41 +45,41 @@
> > #define PMD_MASKED_BITS 0
> > /* Bits to mask out from a PUD to get to the PMD page */
> > #define PUD_MASKED_BITS 0
> > -/* Bits to mask out from a PGD to get to the PUD page */
> > -#define PGD_MASKED_BITS 0
> > +/* Bits to mask out from a P4D to get to the PUD page */
> > +#define P4D_MASKED_BITS 0
> > /*
> > * 4-level page tables related bits
> > */
> > -#define pgd_none(pgd) (!pgd_val(pgd))
> > -#define pgd_bad(pgd) (pgd_val(pgd) == 0)
> > -#define pgd_present(pgd) (pgd_val(pgd) != 0)
> > -#define pgd_page_vaddr(pgd) (pgd_val(pgd) & ~PGD_MASKED_BITS)
> > +#define p4d_none(p4d) (!p4d_val(p4d))
> > +#define p4d_bad(p4d) (p4d_val(p4d) == 0)
> > +#define p4d_present(p4d) (p4d_val(p4d) != 0)
> > +#define p4d_page_vaddr(p4d) (p4d_val(p4d) & ~P4D_MASKED_BITS)
> > #ifndef __ASSEMBLY__
> > -static inline void pgd_clear(pgd_t *pgdp)
> > +static inline void p4d_clear(p4d_t *p4dp)
> > {
> > - *pgdp = __pgd(0);
> > + *p4dp = __p4d(0);
> > }
> > -static inline pte_t pgd_pte(pgd_t pgd)
> > +static inline pte_t p4d_pte(p4d_t p4d)
> > {
> > - return __pte(pgd_val(pgd));
> > + return __pte(p4d_val(p4d));
> > }
> > -static inline pgd_t pte_pgd(pte_t pte)
> > +static inline p4d_t pte_p4d(pte_t pte)
> > {
> > - return __pgd(pte_val(pte));
> > + return __p4d(pte_val(pte));
> > }
> > -extern struct page *pgd_page(pgd_t pgd);
> > +extern struct page *p4d_page(p4d_t p4d);
> > #endif /* !__ASSEMBLY__ */
> > -#define pud_offset(pgdp, addr) \
> > - (((pud_t *) pgd_page_vaddr(*(pgdp))) + \
> > +#define pud_offset(p4dp, addr) \
> > + (((pud_t *) p4d_page_vaddr(*(p4dp))) + \
> > (((addr) >> PUD_SHIFT) & (PTRS_PER_PUD - 1)))
> > #define pud_ERROR(e) \
> > diff --git a/arch/powerpc/include/asm/nohash/64/pgtable.h b/arch/powerpc/include/asm/nohash/64/pgtable.h
> > index 9a33b8bd842d..b360f262b9c6 100644
> > --- a/arch/powerpc/include/asm/nohash/64/pgtable.h
> > +++ b/arch/powerpc/include/asm/nohash/64/pgtable.h
> > @@ -175,11 +175,11 @@ static inline pud_t pte_pud(pte_t pte)
> > return __pud(pte_val(pte));
> > }
> > #define pud_write(pud) pte_write(pud_pte(pud))
> > -#define pgd_write(pgd) pte_write(pgd_pte(pgd))
> > +#define p4d_write(p4d) pte_write(p4d_pte(p4d))
> > -static inline void pgd_set(pgd_t *pgdp, unsigned long val)
> > +static inline void p4d_set(p4d_t *p4dp, unsigned long val)
> > {
> > - *pgdp = __pgd(val);
> > + *p4dp = __p4d(val);
> > }
> > /*
> > diff --git a/arch/powerpc/include/asm/pgtable.h b/arch/powerpc/include/asm/pgtable.h
> > index 8cc543ed114c..05205d7a7b4a 100644
> > --- a/arch/powerpc/include/asm/pgtable.h
> > +++ b/arch/powerpc/include/asm/pgtable.h
> > @@ -139,9 +139,9 @@ static inline bool pud_is_leaf(pud_t pud)
> > }
> > #endif
> > -#ifndef pgd_is_leaf
> > -#define pgd_is_leaf pgd_is_leaf
> > -static inline bool pgd_is_leaf(pgd_t pgd)
> > +#ifndef p4d_is_leaf
> > +#define p4d_is_leaf p4d_is_leaf
> > +static inline bool p4d_is_leaf(p4d_t p4d)
> > {
> > return false;
> > }
> > diff --git a/arch/powerpc/kvm/book3s_64_mmu_radix.c b/arch/powerpc/kvm/book3s_64_mmu_radix.c
> > index 803940d79b73..beb694285100 100644
> > --- a/arch/powerpc/kvm/book3s_64_mmu_radix.c
> > +++ b/arch/powerpc/kvm/book3s_64_mmu_radix.c
> > @@ -499,13 +499,14 @@ void kvmppc_free_pgtable_radix(struct kvm *kvm, pgd_t *pgd, unsigned int lpid)
> > unsigned long ig;
> > for (ig = 0; ig < PTRS_PER_PGD; ++ig, ++pgd) {
> > + p4d_t *p4d = p4d_offset(pgd, 0);
> > pud_t *pud;
> > - if (!pgd_present(*pgd))
> > + if (!p4d_present(*p4d))
> > continue;
> > - pud = pud_offset(pgd, 0);
> > + pud = pud_offset(p4d, 0);
> > kvmppc_unmap_free_pud(kvm, pud, lpid);
> > - pgd_clear(pgd);
> > + p4d_clear(p4d);
> > }
> > }
> > @@ -566,6 +567,7 @@ int kvmppc_create_pte(struct kvm *kvm, pgd_t *pgtable, pte_t pte,
> > unsigned long *rmapp, struct rmap_nested **n_rmap)
> > {
> > pgd_t *pgd;
> > + p4d_t *p4d;
> > pud_t *pud, *new_pud = NULL;
> > pmd_t *pmd, *new_pmd = NULL;
> > pte_t *ptep, *new_ptep = NULL;
> > @@ -573,9 +575,11 @@ int kvmppc_create_pte(struct kvm *kvm, pgd_t *pgtable, pte_t pte,
> > /* Traverse the guest's 2nd-level tree, allocate new levels needed */
> > pgd = pgtable + pgd_index(gpa);
> > + p4d = p4d_offset(pgd, gpa);
> > +
> > pud = NULL;
> > - if (pgd_present(*pgd))
> > - pud = pud_offset(pgd, gpa);
> > + if (p4d_present(*p4d))
> > + pud = pud_offset(p4d, gpa);
> > else
> > new_pud = pud_alloc_one(kvm->mm, gpa);
> > @@ -596,13 +600,13 @@ int kvmppc_create_pte(struct kvm *kvm, pgd_t *pgtable, pte_t pte,
> > /* Now traverse again under the lock and change the tree */
> > ret = -ENOMEM;
> > - if (pgd_none(*pgd)) {
> > + if (p4d_none(*p4d)) {
> > if (!new_pud)
> > goto out_unlock;
> > - pgd_populate(kvm->mm, pgd, new_pud);
> > + p4d_populate(kvm->mm, p4d, new_pud);
> > new_pud = NULL;
> > }
> > - pud = pud_offset(pgd, gpa);
> > + pud = pud_offset(p4d, gpa);
> > if (pud_is_leaf(*pud)) {
> > unsigned long hgpa = gpa & PUD_MASK;
> > @@ -1220,6 +1224,7 @@ static ssize_t debugfs_radix_read(struct file *file, char __user *buf,
> > pgd_t *pgt;
> > struct kvm_nested_guest *nested;
> > pgd_t pgd, *pgdp;
> > + p4d_t p4d, *p4dp;
> > pud_t pud, *pudp;
> > pmd_t pmd, *pmdp;
> > pte_t *ptep;
> > @@ -1292,13 +1297,14 @@ static ssize_t debugfs_radix_read(struct file *file, char __user *buf,
> > }
> > pgdp = pgt + pgd_index(gpa);
> > - pgd = READ_ONCE(*pgdp);
> > - if (!(pgd_val(pgd) & _PAGE_PRESENT)) {
> > - gpa = (gpa & PGDIR_MASK) + PGDIR_SIZE;
> > + p4dp = p4d_offset(pgdp, gpa);
> > + p4d = READ_ONCE(*p4dp);
> > + if (!(p4d_val(p4d) & _PAGE_PRESENT)) {
> > + gpa = (gpa & P4D_MASK) + P4D_SIZE;
> > continue;
> > }
> > - pudp = pud_offset(&pgd, gpa);
> > + pudp = pud_offset(&p4d, gpa);
> > pud = READ_ONCE(*pudp);
> > if (!(pud_val(pud) & _PAGE_PRESENT)) {
> > gpa = (gpa & PUD_MASK) + PUD_SIZE;
> > diff --git a/arch/powerpc/lib/code-patching.c b/arch/powerpc/lib/code-patching.c
> > index 3345f039a876..7a59f6863cec 100644
> > --- a/arch/powerpc/lib/code-patching.c
> > +++ b/arch/powerpc/lib/code-patching.c
> > @@ -107,13 +107,18 @@ static inline int unmap_patch_area(unsigned long addr)
> > pte_t *ptep;
> > pmd_t *pmdp;
> > pud_t *pudp;
> > + p4d_t *p4dp;
> > pgd_t *pgdp;
> > pgdp = pgd_offset_k(addr);
> > if (unlikely(!pgdp))
> > return -EINVAL;
> > - pudp = pud_offset(pgdp, addr);
> > + p4dp = p4d_offset(pgdp, addr);
> > + if (unlikely(!p4dp))
> > + return -EINVAL;
> > +
> > + pudp = pud_offset(p4dp, addr);
> > if (unlikely(!pudp))
> > return -EINVAL;
> > diff --git a/arch/powerpc/mm/book3s32/mmu.c b/arch/powerpc/mm/book3s32/mmu.c
> > index f888cbb109b9..edef17c97206 100644
> > --- a/arch/powerpc/mm/book3s32/mmu.c
> > +++ b/arch/powerpc/mm/book3s32/mmu.c
> > @@ -312,7 +312,7 @@ void hash_preload(struct mm_struct *mm, unsigned long ea)
> > if (!Hash)
> > return;
> > - pmd = pmd_offset(pud_offset(pgd_offset(mm, ea), ea), ea);
> > + pmd = pmd_offset(pud_offset(p4d_offset(pgd_offset(mm, ea), ea), ea), ea);
> > if (!pmd_none(*pmd))
> > add_hash_page(mm->context.id, ea, pmd_val(*pmd));
> > }
> > diff --git a/arch/powerpc/mm/book3s32/tlb.c b/arch/powerpc/mm/book3s32/tlb.c
> > index 2fcd321040ff..175bc33b41b7 100644
> > --- a/arch/powerpc/mm/book3s32/tlb.c
> > +++ b/arch/powerpc/mm/book3s32/tlb.c
> > @@ -87,7 +87,7 @@ static void flush_range(struct mm_struct *mm, unsigned long start,
> > if (start >= end)
> > return;
> > end = (end - 1) | ~PAGE_MASK;
> > - pmd = pmd_offset(pud_offset(pgd_offset(mm, start), start), start);
> > + pmd = pmd_offset(pud_offset(p4d_offset(pgd_offset(mm, start), start), start), start);
> > for (;;) {
> > pmd_end = ((start + PGDIR_SIZE) & PGDIR_MASK) - 1;
> > if (pmd_end > end)
> > @@ -145,7 +145,7 @@ void flush_tlb_page(struct vm_area_struct *vma, unsigned long vmaddr)
> > return;
> > }
> > mm = (vmaddr < TASK_SIZE)? vma->vm_mm: &init_mm;
> > - pmd = pmd_offset(pud_offset(pgd_offset(mm, vmaddr), vmaddr), vmaddr);
> > + pmd = pmd_offset(pud_offset(p4d_offset(pgd_offset(mm, vmaddr), vmaddr), vmaddr), vmaddr);
> > if (!pmd_none(*pmd))
> > flush_hash_pages(mm->context.id, vmaddr, pmd_val(*pmd), 1);
> > }
> > diff --git a/arch/powerpc/mm/book3s64/hash_pgtable.c b/arch/powerpc/mm/book3s64/hash_pgtable.c
> > index 64733b9cb20a..9cd15937e88a 100644
> > --- a/arch/powerpc/mm/book3s64/hash_pgtable.c
> > +++ b/arch/powerpc/mm/book3s64/hash_pgtable.c
> > @@ -148,6 +148,7 @@ void hash__vmemmap_remove_mapping(unsigned long start,
> > int hash__map_kernel_page(unsigned long ea, unsigned long pa, pgprot_t prot)
> > {
> > pgd_t *pgdp;
> > + p4d_t *p4dp;
> > pud_t *pudp;
> > pmd_t *pmdp;
> > pte_t *ptep;
> > @@ -155,7 +156,8 @@ int hash__map_kernel_page(unsigned long ea, unsigned long pa, pgprot_t prot)
> > BUILD_BUG_ON(TASK_SIZE_USER64 > H_PGTABLE_RANGE);
> > if (slab_is_available()) {
> > pgdp = pgd_offset_k(ea);
> > - pudp = pud_alloc(&init_mm, pgdp, ea);
> > + p4dp = p4d_offset(pgdp, ea);
> > + pudp = pud_alloc(&init_mm, p4dp, ea);
> > if (!pudp)
> > return -ENOMEM;
> > pmdp = pmd_alloc(&init_mm, pudp, ea);
> > diff --git a/arch/powerpc/mm/book3s64/radix_pgtable.c b/arch/powerpc/mm/book3s64/radix_pgtable.c
> > index dd1bea45325c..fc3d0b0460b0 100644
> > --- a/arch/powerpc/mm/book3s64/radix_pgtable.c
> > +++ b/arch/powerpc/mm/book3s64/radix_pgtable.c
> > @@ -64,17 +64,19 @@ static int early_map_kernel_page(unsigned long ea, unsigned long pa,
> > {
> > unsigned long pfn = pa >> PAGE_SHIFT;
> > pgd_t *pgdp;
> > + p4d_t *p4dp;
> > pud_t *pudp;
> > pmd_t *pmdp;
> > pte_t *ptep;
> > pgdp = pgd_offset_k(ea);
> > - if (pgd_none(*pgdp)) {
> > + p4dp = p4d_offset(pgdp, ea);
> > + if (p4d_none(*p4dp)) {
> > pudp = early_alloc_pgtable(PUD_TABLE_SIZE, nid,
> > region_start, region_end);
> > - pgd_populate(&init_mm, pgdp, pudp);
> > + p4d_populate(&init_mm, p4dp, pudp);
> > }
> > - pudp = pud_offset(pgdp, ea);
> > + pudp = pud_offset(p4dp, ea);
> > if (map_page_size == PUD_SIZE) {
> > ptep = (pte_t *)pudp;
> > goto set_the_pte;
> > @@ -114,6 +116,7 @@ static int __map_kernel_page(unsigned long ea, unsigned long pa,
> > {
> > unsigned long pfn = pa >> PAGE_SHIFT;
> > pgd_t *pgdp;
> > + p4d_t *p4dp;
> > pud_t *pudp;
> > pmd_t *pmdp;
> > pte_t *ptep;
> > @@ -136,7 +139,8 @@ static int __map_kernel_page(unsigned long ea, unsigned long pa,
> > * boot.
> > */
> > pgdp = pgd_offset_k(ea);
> > - pudp = pud_alloc(&init_mm, pgdp, ea);
> > + p4dp = p4d_offset(pgdp, ea);
> > + pudp = pud_alloc(&init_mm, p4dp, ea);
> > if (!pudp)
> > return -ENOMEM;
> > if (map_page_size == PUD_SIZE) {
> > @@ -173,6 +177,7 @@ void radix__change_memory_range(unsigned long start, unsigned long end,
> > {
> > unsigned long idx;
> > pgd_t *pgdp;
> > + p4d_t *p4dp;
> > pud_t *pudp;
> > pmd_t *pmdp;
> > pte_t *ptep;
> > @@ -185,7 +190,8 @@ void radix__change_memory_range(unsigned long start, unsigned long end,
> > for (idx = start; idx < end; idx += PAGE_SIZE) {
> > pgdp = pgd_offset_k(idx);
> > - pudp = pud_alloc(&init_mm, pgdp, idx);
> > + p4dp = p4d_offset(pgdp, idx);
> > + pudp = pud_alloc(&init_mm, p4dp, idx);
> > if (!pudp)
> > continue;
> > if (pud_is_leaf(*pudp)) {
> > @@ -847,6 +853,7 @@ static void __meminit remove_pagetable(unsigned long start, unsigned long end)
> > unsigned long addr, next;
> > pud_t *pud_base;
> > pgd_t *pgd;
> > + p4d_t *p4d;
> > spin_lock(&init_mm.page_table_lock);
> > @@ -854,15 +861,16 @@ static void __meminit remove_pagetable(unsigned long start, unsigned long end)
> > next = pgd_addr_end(addr, end);
> > pgd = pgd_offset_k(addr);
> > - if (!pgd_present(*pgd))
> > + p4d = p4d_offset(pgd, addr);
> > + if (!p4d_present(*p4d))
> > continue;
> > - if (pgd_is_leaf(*pgd)) {
> > - split_kernel_mapping(addr, end, PGDIR_SIZE, (pte_t *)pgd);
> > + if (p4d_is_leaf(*p4d)) {
> > + split_kernel_mapping(addr, end, P4D_SIZE, (pte_t *)p4d);
> > continue;
> > }
> > - pud_base = (pud_t *)pgd_page_vaddr(*pgd);
> > + pud_base = (pud_t *)p4d_page_vaddr(*p4d);
> > remove_pud_table(pud_base, addr, next);
> > }
> > diff --git a/arch/powerpc/mm/book3s64/subpage_prot.c b/arch/powerpc/mm/book3s64/subpage_prot.c
> > index 2ef24a53f4c9..25a0c044bd93 100644
> > --- a/arch/powerpc/mm/book3s64/subpage_prot.c
> > +++ b/arch/powerpc/mm/book3s64/subpage_prot.c
> > @@ -54,15 +54,17 @@ static void hpte_flush_range(struct mm_struct *mm, unsigned long addr,
> > int npages)
> > {
> > pgd_t *pgd;
> > + p4d_t *p4d;
> > pud_t *pud;
> > pmd_t *pmd;
> > pte_t *pte;
> > spinlock_t *ptl;
> > pgd = pgd_offset(mm, addr);
> > - if (pgd_none(*pgd))
> > + p4d = p4d_offset(pgd, addr);
> > + if (p4d_none(*p4d))
> > return;
> > - pud = pud_offset(pgd, addr);
> > + pud = pud_offset(p4d, addr);
> > if (pud_none(*pud))
> > return;
> > pmd = pmd_offset(pud, addr);
> > diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c
> > index 33b3461d91e8..54f5994d4cbb 100644
> > --- a/arch/powerpc/mm/hugetlbpage.c
> > +++ b/arch/powerpc/mm/hugetlbpage.c
> > @@ -119,6 +119,7 @@ static int __hugepte_alloc(struct mm_struct *mm, hugepd_t *hpdp,
> > pte_t *huge_pte_alloc(struct mm_struct *mm, unsigned long addr, unsigned long sz)
> > {
> > pgd_t *pg;
> > + p4d_t *p4;
> > pud_t *pu;
> > pmd_t *pm;
> > hugepd_t *hpdp = NULL;
> > @@ -128,20 +129,21 @@ pte_t *huge_pte_alloc(struct mm_struct *mm, unsigned long addr, unsigned long sz
> > addr &= ~(sz-1);
> > pg = pgd_offset(mm, addr);
> > + p4 = p4d_offset(pg, addr);
> > #ifdef CONFIG_PPC_BOOK3S_64
> > if (pshift == PGDIR_SHIFT)
> > /* 16GB huge page */
> > - return (pte_t *) pg;
> > + return (pte_t *) p4;
> > else if (pshift > PUD_SHIFT) {
> > /*
> > * We need to use hugepd table
> > */
> > ptl = &mm->page_table_lock;
> > - hpdp = (hugepd_t *)pg;
> > + hpdp = (hugepd_t *)p4;
> > } else {
> > pdshift = PUD_SHIFT;
> > - pu = pud_alloc(mm, pg, addr);
> > + pu = pud_alloc(mm, p4, addr);
> > if (!pu)
> > return NULL;
> > if (pshift == PUD_SHIFT)
> > @@ -166,10 +168,10 @@ pte_t *huge_pte_alloc(struct mm_struct *mm, unsigned long addr, unsigned long sz
> > #else
> > if (pshift >= PGDIR_SHIFT) {
> > ptl = &mm->page_table_lock;
> > - hpdp = (hugepd_t *)pg;
> > + hpdp = (hugepd_t *)p4;
> > } else {
> > pdshift = PUD_SHIFT;
> > - pu = pud_alloc(mm, pg, addr);
> > + pu = pud_alloc(mm, p4, addr);
> > if (!pu)
> > return NULL;
> > if (pshift >= PUD_SHIFT) {
> > @@ -390,7 +392,7 @@ static void hugetlb_free_pmd_range(struct mmu_gather *tlb, pud_t *pud,
> > mm_dec_nr_pmds(tlb->mm);
> > }
> > -static void hugetlb_free_pud_range(struct mmu_gather *tlb, pgd_t *pgd,
> > +static void hugetlb_free_pud_range(struct mmu_gather *tlb, p4d_t *p4d,
> > unsigned long addr, unsigned long end,
> > unsigned long floor, unsigned long ceiling)
> > {
> > @@ -400,7 +402,7 @@ static void hugetlb_free_pud_range(struct mmu_gather *tlb, pgd_t *pgd,
> > start = addr;
> > do {
> > - pud = pud_offset(pgd, addr);
> > + pud = pud_offset(p4d, addr);
> > next = pud_addr_end(addr, end);
> > if (!is_hugepd(__hugepd(pud_val(*pud)))) {
> > if (pud_none_or_clear_bad(pud))
> > @@ -435,8 +437,8 @@ static void hugetlb_free_pud_range(struct mmu_gather *tlb, pgd_t *pgd,
> > if (end - 1 > ceiling - 1)
> > return;
> > - pud = pud_offset(pgd, start);
> > - pgd_clear(pgd);
> > + pud = pud_offset(p4d, start);
> > + p4d_clear(p4d);
> > pud_free_tlb(tlb, pud, start);
> > mm_dec_nr_puds(tlb->mm);
> > }
> > @@ -449,6 +451,7 @@ void hugetlb_free_pgd_range(struct mmu_gather *tlb,
> > unsigned long floor, unsigned long ceiling)
> > {
> > pgd_t *pgd;
> > + p4d_t *p4d;
> > unsigned long next;
> > /*
> > @@ -471,10 +474,11 @@ void hugetlb_free_pgd_range(struct mmu_gather *tlb,
> > do {
> > next = pgd_addr_end(addr, end);
> > pgd = pgd_offset(tlb->mm, addr);
> > + p4d = p4d_offset(pgd, addr);
> > if (!is_hugepd(__hugepd(pgd_val(*pgd)))) {
> > - if (pgd_none_or_clear_bad(pgd))
> > + if (p4d_none_or_clear_bad(p4d))
> > continue;
> > - hugetlb_free_pud_range(tlb, pgd, addr, next, floor, ceiling);
> > + hugetlb_free_pud_range(tlb, p4d, addr, next, floor, ceiling);
> > } else {
> > unsigned long more;
> > /*
> > @@ -487,7 +491,7 @@ void hugetlb_free_pgd_range(struct mmu_gather *tlb,
> > if (more > next)
> > next = more;
> > - free_hugepd_range(tlb, (hugepd_t *)pgd, PGDIR_SHIFT,
> > + free_hugepd_range(tlb, (hugepd_t *)p4d, PGDIR_SHIFT,
> > addr, next, floor, ceiling);
> > }
> > } while (addr = next, addr != end);
> > diff --git a/arch/powerpc/mm/kasan/kasan_init_32.c b/arch/powerpc/mm/kasan/kasan_init_32.c
> > index db5664dde5ff..88e2e16380b5 100644
> > --- a/arch/powerpc/mm/kasan/kasan_init_32.c
> > +++ b/arch/powerpc/mm/kasan/kasan_init_32.c
> > @@ -36,7 +36,7 @@ static int __init kasan_init_shadow_page_tables(unsigned long k_start, unsigned
> > unsigned long k_cur, k_next;
> > pte_t *new = NULL;
> > - pmd = pmd_offset(pud_offset(pgd_offset_k(k_start), k_start), k_start);
> > + pmd = pmd_offset(pud_offset(p4d_offset(pgd_offset_k(k_start), k_start), k_start), k_start);
> > for (k_cur = k_start; k_cur != k_end; k_cur = k_next, pmd++) {
> > k_next = pgd_addr_end(k_cur, k_end);
> > @@ -78,7 +78,7 @@ static int __init kasan_init_region(void *start, size_t size)
> > block = memblock_alloc(k_end - k_start, PAGE_SIZE);
> > for (k_cur = k_start & PAGE_MASK; k_cur < k_end; k_cur += PAGE_SIZE) {
> > - pmd_t *pmd = pmd_offset(pud_offset(pgd_offset_k(k_cur), k_cur), k_cur);
> > + pmd_t *pmd = pmd_offset(pud_offset(p4d_offset(pgd_offset_k(k_cur), k_cur), k_cur), k_cur);
> > void *va = block + k_cur - k_start;
> > pte_t pte = pfn_pte(PHYS_PFN(__pa(va)), PAGE_KERNEL);
> > @@ -102,7 +102,7 @@ static void __init kasan_remap_early_shadow_ro(void)
> > kasan_populate_pte(kasan_early_shadow_pte, prot);
> > for (k_cur = k_start & PAGE_MASK; k_cur < k_end; k_cur += PAGE_SIZE) {
> > - pmd_t *pmd = pmd_offset(pud_offset(pgd_offset_k(k_cur), k_cur), k_cur);
> > + pmd_t *pmd = pmd_offset(pud_offset(p4d_offset(pgd_offset_k(k_cur), k_cur), k_cur), k_cur);
> > pte_t *ptep = pte_offset_kernel(pmd, k_cur);
> > if ((pte_val(*ptep) & PTE_RPN_MASK) != pa)
> > @@ -201,7 +201,7 @@ void __init kasan_early_init(void)
> > unsigned long addr = KASAN_SHADOW_START;
> > unsigned long end = KASAN_SHADOW_END;
> > unsigned long next;
> > - pmd_t *pmd = pmd_offset(pud_offset(pgd_offset_k(addr), addr), addr);
> > + pmd_t *pmd = pmd_offset(pud_offset(p4d_offset(pgd_offset_k(addr), addr), addr), addr);
> > BUILD_BUG_ON(KASAN_SHADOW_START & ~PGDIR_MASK);
> > diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c
> > index ef7b1119b2e2..8262b384dcf3 100644
> > --- a/arch/powerpc/mm/mem.c
> > +++ b/arch/powerpc/mm/mem.c
> > @@ -69,8 +69,8 @@ EXPORT_SYMBOL(kmap_prot);
> > static inline pte_t *virt_to_kpte(unsigned long vaddr)
> > {
> > - return pte_offset_kernel(pmd_offset(pud_offset(pgd_offset_k(vaddr),
> > - vaddr), vaddr), vaddr);
> > + return pte_offset_kernel(pmd_offset(pud_offset(p4d_offset(pgd_offset_k(vaddr),
> > + vaddr), vaddr), vaddr), vaddr);
> > }
> > #endif
> > diff --git a/arch/powerpc/mm/nohash/40x.c b/arch/powerpc/mm/nohash/40x.c
> > index f348104eb461..7aaf7155e350 100644
> > --- a/arch/powerpc/mm/nohash/40x.c
> > +++ b/arch/powerpc/mm/nohash/40x.c
> > @@ -104,7 +104,7 @@ unsigned long __init mmu_mapin_ram(unsigned long base, unsigned long top)
> > pmd_t *pmdp;
> > unsigned long val = p | _PMD_SIZE_16M | _PAGE_EXEC | _PAGE_HWWRITE;
> > - pmdp = pmd_offset(pud_offset(pgd_offset_k(v), v), v);
> > + pmdp = pmd_offset(pud_offset(p4d_offset(pgd_offset_k(v), v), v), v);
> > *pmdp++ = __pmd(val);
> > *pmdp++ = __pmd(val);
> > *pmdp++ = __pmd(val);
> > @@ -119,7 +119,7 @@ unsigned long __init mmu_mapin_ram(unsigned long base, unsigned long top)
> > pmd_t *pmdp;
> > unsigned long val = p | _PMD_SIZE_4M | _PAGE_EXEC | _PAGE_HWWRITE;
> > - pmdp = pmd_offset(pud_offset(pgd_offset_k(v), v), v);
> > + pmdp = pmd_offset(pud_offset(p4d_offset(pgd_offset_k(v), v), v), v);
> > *pmdp = __pmd(val);
> > v += LARGE_PAGE_SIZE_4M;
> > diff --git a/arch/powerpc/mm/nohash/book3e_pgtable.c b/arch/powerpc/mm/nohash/book3e_pgtable.c
> > index 4637fdd469cf..77884e24281d 100644
> > --- a/arch/powerpc/mm/nohash/book3e_pgtable.c
> > +++ b/arch/powerpc/mm/nohash/book3e_pgtable.c
> > @@ -73,6 +73,7 @@ static void __init *early_alloc_pgtable(unsigned long size)
> > int __ref map_kernel_page(unsigned long ea, unsigned long pa, pgprot_t prot)
> > {
> > pgd_t *pgdp;
> > + p4d_t *p4dp;
> > pud_t *pudp;
> > pmd_t *pmdp;
> > pte_t *ptep;
> > @@ -80,7 +81,8 @@ int __ref map_kernel_page(unsigned long ea, unsigned long pa, pgprot_t prot)
> > BUILD_BUG_ON(TASK_SIZE_USER64 > PGTABLE_RANGE);
> > if (slab_is_available()) {
> > pgdp = pgd_offset_k(ea);
> > - pudp = pud_alloc(&init_mm, pgdp, ea);
> > + p4dp = p4d_offset(pgdp, ea);
> > + pudp = pud_alloc(&init_mm, p4dp, ea);
> > if (!pudp)
> > return -ENOMEM;
> > pmdp = pmd_alloc(&init_mm, pudp, ea);
> > @@ -91,13 +93,12 @@ int __ref map_kernel_page(unsigned long ea, unsigned long pa, pgprot_t prot)
> > return -ENOMEM;
> > } else {
> > pgdp = pgd_offset_k(ea);
> > -#ifndef __PAGETABLE_PUD_FOLDED
> > - if (pgd_none(*pgdp)) {
> > - pudp = early_alloc_pgtable(PUD_TABLE_SIZE);
> > - pgd_populate(&init_mm, pgdp, pudp);
> > + p4dp = p4d_offset(pgdp, ea);
> > + if (p4d_none(*p4dp)) {
> > + pmdp = early_alloc_pgtable(PMD_TABLE_SIZE);
> > + p4d_populate(&init_mm, p4dp, pmdp);
> > }
> > -#endif /* !__PAGETABLE_PUD_FOLDED */
> > - pudp = pud_offset(pgdp, ea);
> > + pudp = pud_offset(p4dp, ea);
> > if (pud_none(*pudp)) {
> > pmdp = early_alloc_pgtable(PMD_TABLE_SIZE);
> > pud_populate(&init_mm, pudp, pmdp);
> > diff --git a/arch/powerpc/mm/pgtable.c b/arch/powerpc/mm/pgtable.c
> > index e3759b69f81b..c2499271f6c1 100644
> > --- a/arch/powerpc/mm/pgtable.c
> > +++ b/arch/powerpc/mm/pgtable.c
> > @@ -265,6 +265,7 @@ int huge_ptep_set_access_flags(struct vm_area_struct *vma,
> > void assert_pte_locked(struct mm_struct *mm, unsigned long addr)
> > {
> > pgd_t *pgd;
> > + p4d_t *p4d;
> > pud_t *pud;
> > pmd_t *pmd;
> > @@ -272,7 +273,9 @@ void assert_pte_locked(struct mm_struct *mm, unsigned long addr)
> > return;
> > pgd = mm->pgd + pgd_index(addr);
> > BUG_ON(pgd_none(*pgd));
> > - pud = pud_offset(pgd, addr);
> > + p4d = p4d_offset(pgd, addr);
> > + BUG_ON(p4d_none(*p4d));
> > + pud = pud_offset(p4d, addr);
> > BUG_ON(pud_none(*pud));
> > pmd = pmd_offset(pud, addr);
> > /*
> > @@ -312,12 +315,13 @@ EXPORT_SYMBOL_GPL(vmalloc_to_phys);
> > pte_t *__find_linux_pte(pgd_t *pgdir, unsigned long ea,
> > bool *is_thp, unsigned *hpage_shift)
> > {
> > - pgd_t pgd, *pgdp;
> > + pgd_t *pgdp;
> > + p4d_t p4d, *p4dp;
> > pud_t pud, *pudp;
> > pmd_t pmd, *pmdp;
> > pte_t *ret_pte;
> > hugepd_t *hpdp = NULL;
> > - unsigned pdshift = PGDIR_SHIFT;
> > + unsigned pdshift;
> > if (hpage_shift)
> > *hpage_shift = 0;
> > @@ -325,24 +329,28 @@ pte_t *__find_linux_pte(pgd_t *pgdir, unsigned long ea,
> > if (is_thp)
> > *is_thp = false;
> > - pgdp = pgdir + pgd_index(ea);
> > - pgd = READ_ONCE(*pgdp);
> > /*
> > * Always operate on the local stack value. This make sure the
> > * value don't get updated by a parallel THP split/collapse,
> > * page fault or a page unmap. The return pte_t * is still not
> > * stable. So should be checked there for above conditions.
> > + * Top level is an exception because it is folded into p4d.
> > */
> > - if (pgd_none(pgd))
> > + pgdp = pgdir + pgd_index(ea);
> > + p4dp = p4d_offset(pgdp, ea);
> > + p4d = READ_ONCE(*p4dp);
> > + pdshift = P4D_SHIFT;
> > +
> > + if (p4d_none(p4d))
> > return NULL;
> > - if (pgd_is_leaf(pgd)) {
> > - ret_pte = (pte_t *)pgdp;
> > + if (p4d_is_leaf(p4d)) {
> > + ret_pte = (pte_t *)p4dp;
> > goto out;
> > }
> > - if (is_hugepd(__hugepd(pgd_val(pgd)))) {
> > - hpdp = (hugepd_t *)&pgd;
> > + if (is_hugepd(__hugepd(p4d_val(p4d)))) {
> > + hpdp = (hugepd_t *)&p4d;
> > goto out_huge;
> > }
> > @@ -352,7 +360,7 @@ pte_t *__find_linux_pte(pgd_t *pgdir, unsigned long ea,
> > * irq disabled
> > */
> > pdshift = PUD_SHIFT;
> > - pudp = pud_offset(&pgd, ea);
> > + pudp = pud_offset(&p4d, ea);
> > pud = READ_ONCE(*pudp);
> > if (pud_none(pud))
> > diff --git a/arch/powerpc/mm/pgtable_32.c b/arch/powerpc/mm/pgtable_32.c
> > index 5fb90edd865e..5774d4bc94d0 100644
> > --- a/arch/powerpc/mm/pgtable_32.c
> > +++ b/arch/powerpc/mm/pgtable_32.c
> > @@ -63,7 +63,7 @@ int __ref map_kernel_page(unsigned long va, phys_addr_t pa, pgprot_t prot)
> > int err = -ENOMEM;
> > /* Use upper 10 bits of VA to index the first level map */
> > - pd = pmd_offset(pud_offset(pgd_offset_k(va), va), va);
> > + pd = pmd_offset(pud_offset(p4d_offset(pgd_offset_k(va), va), va), va);
> > /* Use middle 10 bits of VA to index the second-level map */
> > if (likely(slab_is_available()))
> > pg = pte_alloc_kernel(pd, va);
> > @@ -121,53 +121,24 @@ void __init mapin_ram(void)
> > }
> > }
> > -/* Scan the real Linux page tables and return a PTE pointer for
> > - * a virtual address in a context.
> > - * Returns true (1) if PTE was found, zero otherwise. The pointer to
> > - * the PTE pointer is unmodified if PTE is not found.
> > - */
> > -static int
> > -get_pteptr(struct mm_struct *mm, unsigned long addr, pte_t **ptep, pmd_t **pmdp)
> > -{
> > - pgd_t *pgd;
> > - pud_t *pud;
> > - pmd_t *pmd;
> > - pte_t *pte;
> > - int retval = 0;
> > -
> > - pgd = pgd_offset(mm, addr & PAGE_MASK);
> > - if (pgd) {
> > - pud = pud_offset(pgd, addr & PAGE_MASK);
> > - if (pud && pud_present(*pud)) {
> > - pmd = pmd_offset(pud, addr & PAGE_MASK);
> > - if (pmd_present(*pmd)) {
> > - pte = pte_offset_map(pmd, addr & PAGE_MASK);
> > - if (pte) {
> > - retval = 1;
> > - *ptep = pte;
> > - if (pmdp)
> > - *pmdp = pmd;
> > - /* XXX caller needs to do pte_unmap, yuck */
> > - }
> > - }
> > - }
> > - }
> > - return(retval);
> > -}
> > -
> > static int __change_page_attr_noflush(struct page *page, pgprot_t prot)
> > {
> > pte_t *kpte;
> > pmd_t *kpmd;
> > - unsigned long address;
> > + unsigned long address, va;
> > BUG_ON(PageHighMem(page));
> > address = (unsigned long)page_address(page);
> > + va = address & PAGE_MASK;
> > if (v_block_mapped(address))
> > return 0;
> > - if (!get_pteptr(&init_mm, address, &kpte, &kpmd))
> > +
> > + kpmd = pmd_offset(pud_offset(p4d_offset(pgd_offset_k(va), va), va), va);
> > + if (!pmd_present(*kpmd))
> > return -EINVAL;
> > +
> > + kpte = pte_offset_map(kpmd, va);
> > __set_pte_at(&init_mm, address, kpte, mk_pte(page, prot), 0);
> > pte_unmap(kpte);
> > diff --git a/arch/powerpc/mm/pgtable_64.c b/arch/powerpc/mm/pgtable_64.c
> > index e78832dce7bb..1f86a88fd4bb 100644
> > --- a/arch/powerpc/mm/pgtable_64.c
> > +++ b/arch/powerpc/mm/pgtable_64.c
> > @@ -101,13 +101,13 @@ EXPORT_SYMBOL(__pte_frag_size_shift);
> > #ifndef __PAGETABLE_PUD_FOLDED
> > /* 4 level page table */
> > -struct page *pgd_page(pgd_t pgd)
> > +struct page *p4d_page(p4d_t p4d)
> > {
> > - if (pgd_is_leaf(pgd)) {
> > - VM_WARN_ON(!pgd_huge(pgd));
> > - return pte_page(pgd_pte(pgd));
> > + if (p4d_is_leaf(p4d)) {
> > + VM_WARN_ON(!p4d_huge(p4d));
> > + return pte_page(p4d_pte(p4d));
> > }
> > - return virt_to_page(pgd_page_vaddr(pgd));
> > + return virt_to_page(p4d_page_vaddr(p4d));
> > }
> > #endif
> > diff --git a/arch/powerpc/mm/ptdump/hashpagetable.c b/arch/powerpc/mm/ptdump/hashpagetable.c
> > index a07278027c6f..ac360ad865a8 100644
> > --- a/arch/powerpc/mm/ptdump/hashpagetable.c
> > +++ b/arch/powerpc/mm/ptdump/hashpagetable.c
> > @@ -417,9 +417,9 @@ static void walk_pmd(struct pg_state *st, pud_t *pud, unsigned long start)
> > }
> > }
> > -static void walk_pud(struct pg_state *st, pgd_t *pgd, unsigned long start)
> > +static void walk_pud(struct pg_state *st, p4d_t *p4d, unsigned long start)
> > {
> > - pud_t *pud = pud_offset(pgd, 0);
> > + pud_t *pud = pud_offset(p4d, 0);
> > unsigned long addr;
> > unsigned int i;
> > @@ -431,6 +431,20 @@ static void walk_pud(struct pg_state *st, pgd_t *pgd, unsigned long start)
> > }
> > }
> > +static void walk_p4d(struct pg_state *st, pgd_t *pgd, unsigned long start)
> > +{
> > + p4d_t *p4d = p4d_offset(pgd, 0);
> > + unsigned long addr;
> > + unsigned int i;
> > +
> > + for (i = 0; i < PTRS_PER_P4D; i++, p4d++) {
> > + addr = start + i * P4D_SIZE;
> > + if (!p4d_none(*p4d))
> > + /* p4d exists */
> > + walk_pud(st, p4d, addr);
> > + }
> > +}
> > +
> > static void walk_pagetables(struct pg_state *st)
> > {
> > pgd_t *pgd = pgd_offset_k(0UL);
> > @@ -445,7 +459,7 @@ static void walk_pagetables(struct pg_state *st)
> > addr = KERN_VIRT_START + i * PGDIR_SIZE;
> > if (!pgd_none(*pgd))
> > /* pgd exists */
> > - walk_pud(st, pgd, addr);
> > + walk_p4d(st, pgd, addr);
> > }
> > }
> > diff --git a/arch/powerpc/mm/ptdump/ptdump.c b/arch/powerpc/mm/ptdump/ptdump.c
> > index 206156255247..9d6256b61df3 100644
> > --- a/arch/powerpc/mm/ptdump/ptdump.c
> > +++ b/arch/powerpc/mm/ptdump/ptdump.c
> > @@ -277,9 +277,9 @@ static void walk_pmd(struct pg_state *st, pud_t *pud, unsigned long start)
> > }
> > }
> > -static void walk_pud(struct pg_state *st, pgd_t *pgd, unsigned long start)
> > +static void walk_pud(struct pg_state *st, p4d_t *p4d, unsigned long start)
> > {
> > - pud_t *pud = pud_offset(pgd, 0);
> > + pud_t *pud = pud_offset(p4d, 0);
> > unsigned long addr;
> > unsigned int i;
> > @@ -304,11 +304,13 @@ static void walk_pagetables(struct pg_state *st)
> > * the hash pagetable.
> > */
> > for (i = pgd_index(addr); i < PTRS_PER_PGD; i++, pgd++, addr += PGDIR_SIZE) {
> > - if (!pgd_none(*pgd) && !pgd_is_leaf(*pgd))
> > - /* pgd exists */
> > - walk_pud(st, pgd, addr);
> > + p4d_t *p4d = p4d_offset(pgd, 0);
> > +
> > + if (!p4d_none(*p4d) && !p4d_is_leaf(*p4d))
> > + /* p4d exists */
> > + walk_pud(st, p4d, addr);
> > else
> > - note_page(st, addr, 1, pgd_val(*pgd));
> > + note_page(st, addr, 1, p4d_val(*p4d));
> > }
> > }
> > diff --git a/arch/powerpc/xmon/xmon.c b/arch/powerpc/xmon/xmon.c
> > index 0ec9640335bb..3e29128c58cc 100644
> > --- a/arch/powerpc/xmon/xmon.c
> > +++ b/arch/powerpc/xmon/xmon.c
> > @@ -3130,6 +3130,7 @@ static void show_pte(unsigned long addr)
> > struct task_struct *tsk = NULL;
> > struct mm_struct *mm;
> > pgd_t *pgdp, *pgdir;
> > + p4d_t *p4dp;
> > pud_t *pudp;
> > pmd_t *pmdp;
> > pte_t *ptep;
> > @@ -3161,20 +3162,21 @@ static void show_pte(unsigned long addr)
> > pgdir = pgd_offset(mm, 0);
> > }
> > - if (pgd_none(*pgdp)) {
> > - printf("no linux page table for address\n");
> > + p4dp = p4d_offset(pgdp, addr);
> > +
> > + if (p4d_none(*p4dp)) {
> > + printf("No valid P4D\n");
> > return;
> > }
> > - printf("pgd @ 0x%px\n", pgdir);
> > -
> > - if (pgd_is_leaf(*pgdp)) {
> > - format_pte(pgdp, pgd_val(*pgdp));
> > + if (p4d_is_leaf(*p4dp)) {
> > + format_pte(p4dp, p4d_val(*p4dp));
> > return;
> > }
> > - printf("pgdp @ 0x%px = 0x%016lx\n", pgdp, pgd_val(*pgdp));
> > - pudp = pud_offset(pgdp, addr);
> > + printf("p4dp @ 0x%px = 0x%016lx\n", p4dp, p4d_val(*p4dp));
> > +
> > + pudp = pud_offset(p4dp, addr);
> > if (pud_none(*pudp)) {
> > printf("No valid PUD\n");
> >
>
>
--
Sincerely yours,
Mike.
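
For context on the __change_page_attr_noflush() hunk quoted above: the patch
replaces the open-coded get_pteptr() helper with a direct walk through all five
levels. A minimal sketch of that walk pattern, assuming the generic page-table
accessors of that era; the helper name find_kernel_pte() is illustrative, not
part of the patch:

#include <linux/mm.h>
#include <asm/pgtable.h>

/*
 * Walk the kernel page tables down to the PTE for @addr.
 * Returns a mapped PTE pointer (caller must pte_unmap()) or NULL
 * if any intermediate level is missing.
 */
static pte_t *find_kernel_pte(unsigned long addr)
{
	pgd_t *pgd = pgd_offset_k(addr);	/* init_mm PGD entry */
	p4d_t *p4d;
	pud_t *pud;
	pmd_t *pmd;

	if (pgd_none(*pgd))
		return NULL;
	p4d = p4d_offset(pgd, addr);	/* no-op when the p4d level is folded */
	if (p4d_none(*p4d))
		return NULL;
	pud = pud_offset(p4d, addr);
	if (pud_none(*pud))
		return NULL;
	pmd = pmd_offset(pud, addr);
	if (!pmd_present(*pmd))
		return NULL;
	return pte_offset_map(pmd, addr);
}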
On 26/02/2020 at 11:56, Mike Rapoport wrote:
> On Wed, Feb 26, 2020 at 10:46:13AM +0100, Christophe Leroy wrote:
>>
>>
>> On 26/02/2020 at 10:13, Mike Rapoport wrote:
>>> On Tue, Feb 18, 2020 at 12:54:40PM +0200, Mike Rapoport wrote:
>>>> On Sun, Feb 16, 2020 at 11:41:07AM +0100, Christophe Leroy wrote:
>>>>>
>>>>>
>>>>> On 16/02/2020 at 09:18, Mike Rapoport wrote:
>>>>>> From: Mike Rapoport <[email protected]>
>>>>>>
>>>>>> Implement the primitives necessary for folding of the 4th level, add p4d-level
>>>>>> walks where appropriate, and replace 5level-fixup.h with pgtable-nop4d.h.
>>>>>
>>>>> I don't think it is worth adding all these additional p4d walks; this
>>>>> patch could be limited to changes like:
>>>>>
>>>>> - pud = pud_offset(pgd, gpa);
>>>>> + pud = pud_offset(p4d_offset(pgd, gpa), gpa);
>>>>>
>>>>> The additional walks should be added in another patch the day powerpc
>>>>> needs them.
>>>>
>>>> Ok, I'll update the patch to reduce walking the p4d.
>>>
>>> Here's what I have with more direct accesses from pgd to pud.
>>
>> I went through it quickly. This looks promising.
>>
>> Do we need the walk_p4d() in arch/powerpc/mm/ptdump/hashpagetable.c?
>> Can't we just do:
>>
>> @@ -445,7 +459,7 @@ static void walk_pagetables(struct pg_state *st)
>> addr = KERN_VIRT_START + i * PGDIR_SIZE;
>> if (!pgd_none(*pgd))
>> /* pgd exists */
>> - walk_pud(st, pgd, addr);
>> + walk_pud(st, p4d_offset(pgd, addr), addr);
>
> We can do
>
> addr = KERN_VIRT_START + i * PGDIR_SIZE;
> p4d = p4d_offset(pgd, addr);
> if (!p4d_none(*p4d))
> walk_pud()
>
> But I don't think this is really essential. Again, we are trading off code
> consistency vs line count. I don't think line count is that important.
Ok.
Christophe
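
Spelled out, the variant Mike sketches above looks roughly like this (a sketch
only; it reuses the walk_pagetables() loop from the hashpagetable.c hunk earlier
in the thread, and walk_pud()'s arguments follow the updated signature in the
diff):

	for (i = 0; i < PTRS_PER_PGD; i++, pgd++) {
		p4d_t *p4d;

		addr = KERN_VIRT_START + i * PGDIR_SIZE;
		p4d = p4d_offset(pgd, addr);
		if (!p4d_none(*p4d))
			/* p4d exists */
			walk_pud(st, p4d, addr);
	}

This keeps walk_pagetables() free of a dedicated walk_p4d() helper while still
reading the p4d entry through the proper accessor, which is the consistency vs.
line-count trade-off discussed above.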