2021-04-19 10:50:36

by Christophe Leroy

Subject: [PATCH v2 0/4] Convert powerpc to GENERIC_PTDUMP

This series converts powerpc to generic PTDUMP.

For that, we first need to add missing hugepd support
to pagewalk and ptdump.
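
For reference, the generic core drives the walk roughly like this (a
simplified sketch based on mm/ptdump.c around v5.12; details may
differ):

static const struct mm_walk_ops ptdump_ops = {
	.pgd_entry	= ptdump_pgd_entry,
	.p4d_entry	= ptdump_p4d_entry,
	.pud_entry	= ptdump_pud_entry,
	.pmd_entry	= ptdump_pmd_entry,
	.pte_entry	= ptdump_pte_entry,
	.pte_hole	= ptdump_hole,
};

void ptdump_walk_pgd(struct ptdump_state *st, struct mm_struct *mm, pgd_t *pgd)
{
	const struct ptdump_range *range = st->range;

	mmap_read_lock(mm);
	while (range->start != range->end) {
		walk_page_range_novma(mm, range->start, range->end,
				      &ptdump_ops, pgd, st);
		range++;
	}
	mmap_read_unlock(mm);

	/* Flush out the last page: level -1 means "no entry" */
	st->note_page(st, 0, -1, 0);
}

That final level -1 flush is why patch 4 initialises the powerpc
pg_state level to -1 rather than 0.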

v2:
- Reworked the pagewalk modification to add locking and check ops->pte_entry
- Modified powerpc early IO mapping to have gaps between mappings
- Removed the logic that checked for contiguous physical memory
- Removed the artificial level calculation in ptdump_pte_entry(); level 4 is fine for all.
- Removed page_size argument to note_page()

Christophe Leroy (4):
mm: pagewalk: Fix walk for hugepage tables
powerpc/mm: Leave a gap between early allocated IO areas
powerpc/mm: Properly coalesce pages in ptdump
powerpc/mm: Convert powerpc to GENERIC_PTDUMP

arch/powerpc/Kconfig | 2 +
arch/powerpc/Kconfig.debug | 30 -----
arch/powerpc/mm/Makefile | 2 +-
arch/powerpc/mm/ioremap_32.c | 4 +-
arch/powerpc/mm/ioremap_64.c | 2 +-
arch/powerpc/mm/mmu_decl.h | 2 +-
arch/powerpc/mm/ptdump/8xx.c | 6 +-
arch/powerpc/mm/ptdump/Makefile | 9 +-
arch/powerpc/mm/ptdump/book3s64.c | 6 +-
arch/powerpc/mm/ptdump/ptdump.c | 187 ++++++++----------------------
arch/powerpc/mm/ptdump/shared.c | 6 +-
mm/pagewalk.c | 58 ++++++++-
12 files changed, 127 insertions(+), 187 deletions(-)

--
2.25.0


2021-04-19 13:02:11

by Christophe Leroy

Subject: [PATCH v2 3/4] powerpc/mm: Properly coalesce pages in ptdump

Commit aaa229529244 ("powerpc/mm: Add physical address to Linux page
table dump") changed range coalescing to only combine ranges that are
both virtually and physically contiguous, in order to avoid erroneous
combination of unrelated mappings in IOREMAP space.
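
In pseudo-C (a sketch only, with flush_range() standing in for the
dump-and-restart logic), the combining test in note_page() went from:

	if (flag != st->current_flags || level != st->level ||
	    addr >= st->marker[1].start_address)
		flush_range(st, addr);

to:

	if (flag != st->current_flags || level != st->level ||
	    addr >= st->marker[1].start_address ||
	    pa != st->last_pa + st->page_size)
		flush_range(st, addr);

(The exact condition being removed below is slightly more involved,
because of a same-page special case later added for KASAN.)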

But in the VMALLOC space, mappings almost never have contiguous
physical pages, so the commit mentioned above leads to dumping one
line per page for vmalloc mappings.

Given that vmalloc always leaves a gap between two areas, two
mappings are never dumped as a single combined range even when they
have the exact same flags: the guard gap shows up as a hole between
them. The only space that may have encountered such an issue was the
early IOREMAP space, which does not use the vmalloc engine; but the
previous patch added gaps between early IO mappings, so it is not an
issue anymore.

That commit also created some difficulties with KASAN mappings, see
commit cabe8138b23c ("powerpc: dump as a single line areas mapping a
single physical page."), and with huge pages, see
commit b00ff6d8c1c3 ("powerpc/ptdump: Properly handle non standard
page size").

So, almost revert commit aaa229529244 to properly coalesce pages
mapped with the same flags, as before; only keep the display of the
first physical address of the range, as it can be useful, especially
for IO mappings.

This brings powerpc back in line with other architectures and
simplifies the conversion to GENERIC_PTDUMP.

With the patch:

---[ kasan shadow mem start ]---
0xf8000000-0xf8ffffff 0x07000000 16M huge rw present dirty accessed
0xf9000000-0xf91fffff 0x01434000 2M r present accessed
0xf9200000-0xf95affff 0x02104000 3776K rw present dirty accessed
0xfef5c000-0xfeffffff 0x01434000 656K r present accessed
---[ kasan shadow mem end ]---

Before:

---[ kasan shadow mem start ]---
0xf8000000-0xf8ffffff 0x07000000 16M huge rw present dirty accessed
0xf9000000-0xf91fffff 0x01434000 16K r present accessed
0xf9200000-0xf9203fff 0x02104000 16K rw present dirty accessed
0xf9204000-0xf9207fff 0x0213c000 16K rw present dirty accessed
0xf9208000-0xf920bfff 0x02174000 16K rw present dirty accessed
0xf920c000-0xf920ffff 0x02188000 16K rw present dirty accessed
0xf9210000-0xf9213fff 0x021dc000 16K rw present dirty accessed
0xf9214000-0xf9217fff 0x02220000 16K rw present dirty accessed
0xf9218000-0xf921bfff 0x023c0000 16K rw present dirty accessed
0xf921c000-0xf921ffff 0x023d4000 16K rw present dirty accessed
0xf9220000-0xf9227fff 0x023ec000 32K rw present dirty accessed
...
0xf93b8000-0xf93e3fff 0x02614000 176K rw present dirty accessed
0xf93e4000-0xf94c3fff 0x027c0000 896K rw present dirty accessed
0xf94c4000-0xf94c7fff 0x0236c000 16K rw present dirty accessed
0xf94c8000-0xf94cbfff 0x041f0000 16K rw present dirty accessed
0xf94cc000-0xf94cffff 0x029c0000 16K rw present dirty accessed
0xf94d0000-0xf94d3fff 0x041ec000 16K rw present dirty accessed
0xf94d4000-0xf94d7fff 0x0407c000 16K rw present dirty accessed
0xf94d8000-0xf94f7fff 0x041c0000 128K rw present dirty accessed
...
0xf95ac000-0xf95affff 0x042b0000 16K rw present dirty accessed
0xfef5c000-0xfeffffff 0x01434000 16K r present accessed
---[ kasan shadow mem end ]---

Signed-off-by: Christophe Leroy <[email protected]>
Cc: Oliver O'Halloran <[email protected]>
---
arch/powerpc/mm/ptdump/ptdump.c | 22 +++-------------------
1 file changed, 3 insertions(+), 19 deletions(-)

diff --git a/arch/powerpc/mm/ptdump/ptdump.c b/arch/powerpc/mm/ptdump/ptdump.c
index aca354fb670b..5062c58b1e5b 100644
--- a/arch/powerpc/mm/ptdump/ptdump.c
+++ b/arch/powerpc/mm/ptdump/ptdump.c
@@ -58,8 +58,6 @@ struct pg_state {
const struct addr_marker *marker;
unsigned long start_address;
unsigned long start_pa;
- unsigned long last_pa;
- unsigned long page_size;
unsigned int level;
u64 current_flags;
bool check_wx;
@@ -163,8 +161,6 @@ static void dump_flag_info(struct pg_state *st, const struct flag_info

static void dump_addr(struct pg_state *st, unsigned long addr)
{
- unsigned long delta;
-
#ifdef CONFIG_PPC64
#define REG "0x%016lx"
#else
@@ -172,14 +168,8 @@ static void dump_addr(struct pg_state *st, unsigned long addr)
#endif

pt_dump_seq_printf(st->seq, REG "-" REG " ", st->start_address, addr - 1);
- if (st->start_pa == st->last_pa && st->start_address + st->page_size != addr) {
- pt_dump_seq_printf(st->seq, "[" REG "]", st->start_pa);
- delta = st->page_size >> 10;
- } else {
- pt_dump_seq_printf(st->seq, " " REG " ", st->start_pa);
- delta = (addr - st->start_address) >> 10;
- }
- pt_dump_size(st->seq, delta);
+ pt_dump_seq_printf(st->seq, " " REG " ", st->start_pa);
+ pt_dump_size(st->seq, (addr - st->start_address) >> 10);
}

static void note_prot_wx(struct pg_state *st, unsigned long addr)
@@ -208,7 +198,6 @@ static void note_page_update_state(struct pg_state *st, unsigned long addr,
st->current_flags = flag;
st->start_address = addr;
st->start_pa = pa;
- st->page_size = page_size;

while (addr >= st->marker[1].start_address) {
st->marker++;
@@ -220,7 +209,6 @@ static void note_page(struct pg_state *st, unsigned long addr,
unsigned int level, u64 val, unsigned long page_size)
{
u64 flag = val & pg_level[level].mask;
- u64 pa = val & PTE_RPN_MASK;

/* At first no level is set */
if (!st->level) {
@@ -232,12 +220,9 @@ static void note_page(struct pg_state *st, unsigned long addr,
* - we change levels in the tree.
* - the address is in a different section of memory and is thus
* used for a different purpose, regardless of the flags.
- * - the pa of this page is not adjacent to the last inspected page
*/
} else if (flag != st->current_flags || level != st->level ||
- addr >= st->marker[1].start_address ||
- (pa != st->last_pa + st->page_size &&
- (pa != st->start_pa || st->start_pa != st->last_pa))) {
+ addr >= st->marker[1].start_address) {

/* Check the PTE flags */
if (st->current_flags) {
@@ -259,7 +244,6 @@ static void note_page(struct pg_state *st, unsigned long addr,
*/
note_page_update_state(st, addr, level, val, page_size);
}
- st->last_pa = pa;
}

static void walk_pte(struct pg_state *st, pmd_t *pmd, unsigned long start)
--
2.25.0

2021-04-19 13:02:13

by Christophe Leroy

Subject: [PATCH v2 2/4] powerpc/mm: Leave a gap between early allocated IO areas

The vmalloc system leaves a gap between allocated areas, which helps
catch overflows.

Do the same for IO areas which are allocated with early_ioremap_range()
until slab_is_available().
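
For example, on the 32-bit side where ioremap_bot grows downward
(hypothetical addresses, 4k pages, two successive 8k allocations):

	ioremap_bot = 0xff000000
	alloc A: maps 0xfeffd000-0xfeffefff, guard page at 0xfefff000
	alloc B: maps 0xfeffa000-0xfeffbfff, guard page at 0xfeffc000

An access overflowing the end of B now faults in the unmapped guard
page instead of silently landing in A.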

Signed-off-by: Christophe Leroy <[email protected]>
---
arch/powerpc/mm/ioremap_32.c | 4 ++--
arch/powerpc/mm/ioremap_64.c | 2 +-
2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/mm/ioremap_32.c b/arch/powerpc/mm/ioremap_32.c
index 743e11384dea..9d13143b8be4 100644
--- a/arch/powerpc/mm/ioremap_32.c
+++ b/arch/powerpc/mm/ioremap_32.c
@@ -70,10 +70,10 @@ __ioremap_caller(phys_addr_t addr, unsigned long size, pgprot_t prot, void *call
*/
pr_warn("ioremap() called early from %pS. Use early_ioremap() instead\n", caller);

- err = early_ioremap_range(ioremap_bot - size, p, size, prot);
+ err = early_ioremap_range(ioremap_bot - size - PAGE_SIZE, p, size, prot);
if (err)
return NULL;
- ioremap_bot -= size;
+ ioremap_bot -= size + PAGE_SIZE;

return (void __iomem *)ioremap_bot + offset;
}
diff --git a/arch/powerpc/mm/ioremap_64.c b/arch/powerpc/mm/ioremap_64.c
index ba5cbb0d66bd..3acece00b33e 100644
--- a/arch/powerpc/mm/ioremap_64.c
+++ b/arch/powerpc/mm/ioremap_64.c
@@ -38,7 +38,7 @@ void __iomem *__ioremap_caller(phys_addr_t addr, unsigned long size,
return NULL;

ret = (void __iomem *)ioremap_bot + offset;
- ioremap_bot += size;
+ ioremap_bot += size + PAGE_SIZE;

return ret;
}
--
2.25.0

2021-04-19 13:02:15

by Christophe Leroy

Subject: [PATCH v2 4/4] powerpc/mm: Convert powerpc to GENERIC_PTDUMP

This patch converts powerpc to the generic PTDUMP implementation.
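
For reference, the generic interface being adopted looks roughly like
this (simplified from include/linux/ptdump.h around v5.12; the u64 val
type matches the note_page() used in the hunks below):

struct ptdump_range {
	unsigned long start;
	unsigned long end;
};

struct ptdump_state {
	/* level is 0:PGD to 4:PTE, or -1 if unknown */
	void (*note_page)(struct ptdump_state *st, unsigned long addr,
			  int level, u64 val);
	const struct ptdump_range *range;
};

void ptdump_walk_pgd(struct ptdump_state *st, struct mm_struct *mm, pgd_t *pgd);

The powerpc pg_state embeds the ptdump_state and recovers itself with
container_of() in the note_page() callback, as the ptdump.c hunks
below show.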

Signed-off-by: Christophe Leroy <[email protected]>
---
arch/powerpc/Kconfig | 2 +
arch/powerpc/Kconfig.debug | 30 ------
arch/powerpc/mm/Makefile | 2 +-
arch/powerpc/mm/mmu_decl.h | 2 +-
arch/powerpc/mm/ptdump/8xx.c | 6 +-
arch/powerpc/mm/ptdump/Makefile | 9 +-
arch/powerpc/mm/ptdump/book3s64.c | 6 +-
arch/powerpc/mm/ptdump/ptdump.c | 165 ++++++++----------------------
arch/powerpc/mm/ptdump/shared.c | 6 +-
9 files changed, 68 insertions(+), 160 deletions(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 475d77a6ebbe..40259437a28f 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -120,6 +120,7 @@ config PPC
select ARCH_32BIT_OFF_T if PPC32
select ARCH_HAS_DEBUG_VIRTUAL
select ARCH_HAS_DEBUG_VM_PGTABLE
+ select ARCH_HAS_DEBUG_WX if STRICT_KERNEL_RWX
select ARCH_HAS_DEVMEM_IS_ALLOWED
select ARCH_HAS_ELF_RANDOMIZE
select ARCH_HAS_FORTIFY_SOURCE
@@ -177,6 +178,7 @@ config PPC
select GENERIC_IRQ_SHOW
select GENERIC_IRQ_SHOW_LEVEL
select GENERIC_PCI_IOMAP if PCI
+ select GENERIC_PTDUMP
select GENERIC_SMP_IDLE_THREAD
select GENERIC_STRNCPY_FROM_USER
select GENERIC_STRNLEN_USER
diff --git a/arch/powerpc/Kconfig.debug b/arch/powerpc/Kconfig.debug
index 6342f9da4545..05b1180ea502 100644
--- a/arch/powerpc/Kconfig.debug
+++ b/arch/powerpc/Kconfig.debug
@@ -360,36 +360,6 @@ config FAIL_IOMMU

If you are unsure, say N.

-config PPC_PTDUMP
- bool "Export kernel pagetable layout to userspace via debugfs"
- depends on DEBUG_KERNEL && DEBUG_FS
- help
- This option exports the state of the kernel pagetables to a
- debugfs file. This is only useful for kernel developers who are
- working in architecture specific areas of the kernel - probably
- not a good idea to enable this feature in a production kernel.
-
- If you are unsure, say N.
-
-config PPC_DEBUG_WX
- bool "Warn on W+X mappings at boot"
- depends on PPC_PTDUMP && STRICT_KERNEL_RWX
- help
- Generate a warning if any W+X mappings are found at boot.
-
- This is useful for discovering cases where the kernel is leaving
- W+X mappings after applying NX, as such mappings are a security risk.
-
- Note that even if the check fails, your kernel is possibly
- still fine, as W+X mappings are not a security hole in
- themselves, what they do is that they make the exploitation
- of other unfixed kernel bugs easier.
-
- There is no runtime or memory usage effect of this option
- once the kernel has booted up - it's a one time check.
-
- If in doubt, say "Y".
-
config PPC_FAST_ENDIAN_SWITCH
bool "Deprecated fast endian-switch syscall"
depends on DEBUG_KERNEL && PPC_BOOK3S_64
diff --git a/arch/powerpc/mm/Makefile b/arch/powerpc/mm/Makefile
index c3df3a8501d4..c90d58aaebe2 100644
--- a/arch/powerpc/mm/Makefile
+++ b/arch/powerpc/mm/Makefile
@@ -18,5 +18,5 @@ obj-$(CONFIG_PPC_MM_SLICES) += slice.o
obj-$(CONFIG_HUGETLB_PAGE) += hugetlbpage.o
obj-$(CONFIG_NOT_COHERENT_CACHE) += dma-noncoherent.o
obj-$(CONFIG_PPC_COPRO_BASE) += copro_fault.o
-obj-$(CONFIG_PPC_PTDUMP) += ptdump/
+obj-$(CONFIG_PTDUMP_CORE) += ptdump/
obj-$(CONFIG_KASAN) += kasan/
diff --git a/arch/powerpc/mm/mmu_decl.h b/arch/powerpc/mm/mmu_decl.h
index 7dac910c0b21..dd1cabc2ea0f 100644
--- a/arch/powerpc/mm/mmu_decl.h
+++ b/arch/powerpc/mm/mmu_decl.h
@@ -180,7 +180,7 @@ static inline void mmu_mark_rodata_ro(void) { }
void __init mmu_mapin_immr(void);
#endif

-#ifdef CONFIG_PPC_DEBUG_WX
+#ifdef CONFIG_DEBUG_WX
void ptdump_check_wx(void);
#else
static inline void ptdump_check_wx(void) { }
diff --git a/arch/powerpc/mm/ptdump/8xx.c b/arch/powerpc/mm/ptdump/8xx.c
index 86da2a669680..fac932eb8f9a 100644
--- a/arch/powerpc/mm/ptdump/8xx.c
+++ b/arch/powerpc/mm/ptdump/8xx.c
@@ -75,8 +75,10 @@ static const struct flag_info flag_array[] = {
};

struct pgtable_level pg_level[5] = {
- {
- }, { /* pgd */
+ { /* pgd */
+ .flag = flag_array,
+ .num = ARRAY_SIZE(flag_array),
+ }, { /* p4d */
.flag = flag_array,
.num = ARRAY_SIZE(flag_array),
}, { /* pud */
diff --git a/arch/powerpc/mm/ptdump/Makefile b/arch/powerpc/mm/ptdump/Makefile
index 712762be3cb1..4050cbb55acf 100644
--- a/arch/powerpc/mm/ptdump/Makefile
+++ b/arch/powerpc/mm/ptdump/Makefile
@@ -5,5 +5,10 @@ obj-y += ptdump.o
obj-$(CONFIG_4xx) += shared.o
obj-$(CONFIG_PPC_8xx) += 8xx.o
obj-$(CONFIG_PPC_BOOK3E_MMU) += shared.o
-obj-$(CONFIG_PPC_BOOK3S_32) += shared.o bats.o segment_regs.o
-obj-$(CONFIG_PPC_BOOK3S_64) += book3s64.o hashpagetable.o
+obj-$(CONFIG_PPC_BOOK3S_32) += shared.o
+obj-$(CONFIG_PPC_BOOK3S_64) += book3s64.o
+
+ifdef CONFIG_PTDUMP_DEBUGFS
+obj-$(CONFIG_PPC_BOOK3S_32) += bats.o segment_regs.o
+obj-$(CONFIG_PPC_BOOK3S_64) += hashpagetable.o
+endif
diff --git a/arch/powerpc/mm/ptdump/book3s64.c b/arch/powerpc/mm/ptdump/book3s64.c
index 14f73868db66..5ad92d9dc5d1 100644
--- a/arch/powerpc/mm/ptdump/book3s64.c
+++ b/arch/powerpc/mm/ptdump/book3s64.c
@@ -103,8 +103,10 @@ static const struct flag_info flag_array[] = {
};

struct pgtable_level pg_level[5] = {
- {
- }, { /* pgd */
+ { /* pgd */
+ .flag = flag_array,
+ .num = ARRAY_SIZE(flag_array),
+ }, { /* p4d */
.flag = flag_array,
.num = ARRAY_SIZE(flag_array),
}, { /* pud */
diff --git a/arch/powerpc/mm/ptdump/ptdump.c b/arch/powerpc/mm/ptdump/ptdump.c
index 5062c58b1e5b..57d1270689c6 100644
--- a/arch/powerpc/mm/ptdump/ptdump.c
+++ b/arch/powerpc/mm/ptdump/ptdump.c
@@ -16,6 +16,7 @@
#include <linux/io.h>
#include <linux/mm.h>
#include <linux/highmem.h>
+#include <linux/ptdump.h>
#include <linux/sched.h>
#include <linux/seq_file.h>
#include <asm/fixmap.h>
@@ -54,11 +55,12 @@
*
*/
struct pg_state {
+ struct ptdump_state ptdump;
struct seq_file *seq;
const struct addr_marker *marker;
unsigned long start_address;
unsigned long start_pa;
- unsigned int level;
+ int level;
u64 current_flags;
bool check_wx;
unsigned long wx_pages;
@@ -189,9 +191,9 @@ static void note_prot_wx(struct pg_state *st, unsigned long addr)
}

static void note_page_update_state(struct pg_state *st, unsigned long addr,
- unsigned int level, u64 val, unsigned long page_size)
+ unsigned int level, u64 val)
{
- u64 flag = val & pg_level[level].mask;
+ u64 flag = level >= 0 ? val & pg_level[level].mask : 0;
u64 pa = val & PTE_RPN_MASK;

st->level = level;
@@ -205,15 +207,15 @@ static void note_page_update_state(struct pg_state *st, unsigned long addr,
}
}

-static void note_page(struct pg_state *st, unsigned long addr,
- unsigned int level, u64 val, unsigned long page_size)
+static void note_page(struct ptdump_state *pt_st, unsigned long addr, int level, u64 val)
{
- u64 flag = val & pg_level[level].mask;
+ u64 flag = level >= 0 ? val & pg_level[level].mask : 0;
+ struct pg_state *st = container_of(pt_st, struct pg_state, ptdump);

/* At first no level is set */
- if (!st->level) {
+ if (st->level == -1) {
pt_dump_seq_printf(st->seq, "---[ %s ]---\n", st->marker->name);
- note_page_update_state(st, addr, level, val, page_size);
+ note_page_update_state(st, addr, level, val);
/*
* Dump the section of virtual memory when:
* - the PTE flags from one entry to the next differs.
@@ -242,95 +244,7 @@ static void note_page(struct pg_state *st, unsigned long addr,
* Address indicates we have passed the end of the
* current section of virtual memory
*/
- note_page_update_state(st, addr, level, val, page_size);
- }
-}
-
-static void walk_pte(struct pg_state *st, pmd_t *pmd, unsigned long start)
-{
- pte_t *pte = pte_offset_kernel(pmd, 0);
- unsigned long addr;
- unsigned int i;
-
- for (i = 0; i < PTRS_PER_PTE; i++, pte++) {
- addr = start + i * PAGE_SIZE;
- note_page(st, addr, 4, pte_val(*pte), PAGE_SIZE);
-
- }
-}
-
-static void walk_hugepd(struct pg_state *st, hugepd_t *phpd, unsigned long start,
- int pdshift, int level)
-{
-#ifdef CONFIG_ARCH_HAS_HUGEPD
- unsigned int i;
- int shift = hugepd_shift(*phpd);
- int ptrs_per_hpd = pdshift - shift > 0 ? 1 << (pdshift - shift) : 1;
-
- if (start & ((1 << shift) - 1))
- return;
-
- for (i = 0; i < ptrs_per_hpd; i++) {
- unsigned long addr = start + (i << shift);
- pte_t *pte = hugepte_offset(*phpd, addr, pdshift);
-
- note_page(st, addr, level + 1, pte_val(*pte), 1 << shift);
- }
-#endif
-}
-
-static void walk_pmd(struct pg_state *st, pud_t *pud, unsigned long start)
-{
- pmd_t *pmd = pmd_offset(pud, 0);
- unsigned long addr;
- unsigned int i;
-
- for (i = 0; i < PTRS_PER_PMD; i++, pmd++) {
- addr = start + i * PMD_SIZE;
- if (!pmd_none(*pmd) && !pmd_is_leaf(*pmd))
- /* pmd exists */
- walk_pte(st, pmd, addr);
- else
- note_page(st, addr, 3, pmd_val(*pmd), PMD_SIZE);
- }
-}
-
-static void walk_pud(struct pg_state *st, p4d_t *p4d, unsigned long start)
-{
- pud_t *pud = pud_offset(p4d, 0);
- unsigned long addr;
- unsigned int i;
-
- for (i = 0; i < PTRS_PER_PUD; i++, pud++) {
- addr = start + i * PUD_SIZE;
- if (!pud_none(*pud) && !pud_is_leaf(*pud))
- /* pud exists */
- walk_pmd(st, pud, addr);
- else
- note_page(st, addr, 2, pud_val(*pud), PUD_SIZE);
- }
-}
-
-static void walk_pagetables(struct pg_state *st)
-{
- unsigned int i;
- unsigned long addr = st->start_address & PGDIR_MASK;
- pgd_t *pgd = pgd_offset_k(addr);
-
- /*
- * Traverse the linux pagetable structure and dump pages that are in
- * the hash pagetable.
- */
- for (i = pgd_index(addr); i < PTRS_PER_PGD; i++, pgd++, addr += PGDIR_SIZE) {
- p4d_t *p4d = p4d_offset(pgd, 0);
-
- if (p4d_none(*p4d) || p4d_is_leaf(*p4d))
- note_page(st, addr, 1, p4d_val(*p4d), PGDIR_SIZE);
- else if (is_hugepd(__hugepd(p4d_val(*p4d))))
- walk_hugepd(st, (hugepd_t *)p4d, addr, PGDIR_SHIFT, 1);
- else
- /* p4d exists */
- walk_pud(st, p4d, addr);
+ note_page_update_state(st, addr, level, val);
}
}

@@ -383,32 +297,29 @@ static int ptdump_show(struct seq_file *m, void *v)
struct pg_state st = {
.seq = m,
.marker = address_markers,
- .start_address = IS_ENABLED(CONFIG_PPC64) ? PAGE_OFFSET : TASK_SIZE,
+ .level = -1,
+ .ptdump = {
+ .note_page = note_page,
+ .range = (struct ptdump_range[]){
+ {TASK_SIZE, ~0UL},
+ {0, 0}
+ }
+ }
};

#ifdef CONFIG_PPC64
if (!radix_enabled())
- st.start_address = KERN_VIRT_START;
+ st.ptdump.range.start = KERN_VIRT_START;
+ else
+ st.ptdump.range.start = PAGE_OFFSET;
#endif

/* Traverse kernel page tables */
- walk_pagetables(&st);
- note_page(&st, 0, 0, 0, 0);
+ ptdump_walk_pgd(&st.ptdump, &init_mm, NULL);
return 0;
}

-
-static int ptdump_open(struct inode *inode, struct file *file)
-{
- return single_open(file, ptdump_show, NULL);
-}
-
-static const struct file_operations ptdump_fops = {
- .open = ptdump_open,
- .read = seq_read,
- .llseek = seq_lseek,
- .release = single_release,
-};
+DEFINE_SHOW_ATTRIBUTE(ptdump);

static void build_pgtable_complete_mask(void)
{
@@ -420,22 +331,34 @@ static void build_pgtable_complete_mask(void)
pg_level[i].mask |= pg_level[i].flag[j].mask;
}

-#ifdef CONFIG_PPC_DEBUG_WX
+#ifdef CONFIG_DEBUG_WX
void ptdump_check_wx(void)
{
struct pg_state st = {
.seq = NULL,
- .marker = address_markers,
+ .marker = (struct addr_marker[]) {
+ { 0, NULL},
+ { -1, NULL},
+ },
+ .level = -1,
.check_wx = true,
- .start_address = IS_ENABLED(CONFIG_PPC64) ? PAGE_OFFSET : TASK_SIZE,
+ .ptdump = {
+ .note_page = note_page,
+ .range = (struct ptdump_range[]){
+ {TASK_SIZE, ~0UL},
+ {0, 0}
+ }
+ }
};

#ifdef CONFIG_PPC64
if (!radix_enabled())
- st.start_address = KERN_VIRT_START;
+ st.ptdump.range.start = KERN_VIRT_START;
+ else
+ st.ptdump.range.start = PAGE_OFFSET;
#endif

- walk_pagetables(&st);
+ ptdump_walk_pgd(&st.ptdump, &init_mm, NULL);

if (st.wx_pages)
pr_warn("Checked W+X mappings: FAILED, %lu W+X pages found\n",
@@ -449,8 +372,10 @@ static int ptdump_init(void)
{
populate_markers();
build_pgtable_complete_mask();
- debugfs_create_file("kernel_page_tables", 0400, NULL, NULL,
- &ptdump_fops);
+
+ if (IS_ENABLED(CONFIG_PTDUMP_DEBUGFS))
+ debugfs_create_file("kernel_page_tables", 0400, NULL, NULL, &ptdump_fops);
+
return 0;
}
device_initcall(ptdump_init);
diff --git a/arch/powerpc/mm/ptdump/shared.c b/arch/powerpc/mm/ptdump/shared.c
index c005fe041c18..03607ab90c66 100644
--- a/arch/powerpc/mm/ptdump/shared.c
+++ b/arch/powerpc/mm/ptdump/shared.c
@@ -68,8 +68,10 @@ static const struct flag_info flag_array[] = {
};

struct pgtable_level pg_level[5] = {
- {
- }, { /* pgd */
+ { /* pgd */
+ .flag = flag_array,
+ .num = ARRAY_SIZE(flag_array),
+ }, { /* p4d */
.flag = flag_array,
.num = ARRAY_SIZE(flag_array),
}, { /* pud */
--
2.25.0

2021-06-26 10:41:32

by Michael Ellerman

Subject: Re: [PATCH v2 0/4] Convert powerpc to GENERIC_PTDUMP

On Mon, 19 Apr 2021 10:47:24 +0000 (UTC), Christophe Leroy wrote:
> This series converts powerpc to generic PTDUMP.
>
> For that, we first need to add missing hugepd support
> to pagewalk and ptdump.
>
> v2:
> - Reworked the pagewalk modification to add locking and check ops->pte_entry
> - Modified powerpc early IO mapping to have gaps between mappings
> - Removed the logic that checked for contiguous physical memory
> - Removed the artificial level calculation in ptdump_pte_entry(); level 4 is fine for all.
> - Removed page_size argument to note_page()
>
> [...]

Patches 2 and 3 applied to powerpc/next.

[2/4] powerpc/mm: Leave a gap between early allocated IO areas
https://git.kernel.org/powerpc/c/57307f1b6edd781fba2bf9f7ec5f4d17a881ea54
[3/4] powerpc/mm: Properly coalesce pages in ptdump
https://git.kernel.org/powerpc/c/6ca6512c716afd6e37281372c4c35aa6afd71d10

cheers