2019-09-12 06:05:44

by Anshuman Khandual

[permalink] [raw]
Subject: [PATCH V2 0/2] mm/debug: Add tests for architecture exported page table helpers

This series adds validation tests for the page table helpers exported by
architectures. The second patch in the series adds basic transformation
tests at various levels of the page table. Before that, the first patch
exports the gigantic page allocation function from HugeTLB.

This test was originally suggested by Catalin during the earlier arm64 THP
migration RFC discussion. Going forward, it can include more specific tests
for various generic MM features like THP, HugeTLB etc, as well as platform
specific tests.

https://lore.kernel.org/linux-mm/[email protected]/

Testing:

Successfully build and boot tested on both arm64 and x86 platforms, without
any test failure. Only build tested on some other platforms.

I would really appreciate it if folks could help validate this test on
other platforms and report back any problems. All suggestions, comments and
inputs are welcome. Thank you.

Changes in V2:

- Fixed small typo error in MODULE_DESCRIPTION()
- Fixed m68k build problems for lvalue concerns in pmd_xxx_tests()
- Fixed dynamic page table level folding problems on x86 as per Kirill
- Fixed second pointers during pxx_populate_tests() per Kirill and Gerald
- Allocate and free pte table with pte_alloc_one/pte_free per Kirill
- Modified pxx_clear_tests() to accommodate s390 lower 12 bits situation
- Changed RANDOM_NZVALUE value from 0xbe to 0xff
- Changed allocation, usage, free sequence for saved_ptep
- Renamed VMA_FLAGS as VMFLAGS
- Implemented a new method for random vaddr generation
- Implemented some other cleanups
- Dropped extern reference to mm_alloc()
- Created and exported new alloc_gigantic_page_order()
- Dropped the custom allocator and used new alloc_gigantic_page_order()

Changes in V1:

https://lore.kernel.org/linux-mm/[email protected]/

- Added fallback mechanism for PMD aligned memory allocation failure

Changes in RFC V2:

https://lore.kernel.org/linux-mm/[email protected]/T/#u

- Moved test module and its config from lib/ to mm/
- Renamed config TEST_ARCH_PGTABLE as DEBUG_ARCH_PGTABLE_TEST
- Renamed file from test_arch_pgtable.c to arch_pgtable_test.c
- Added relevant MODULE_DESCRIPTION() and MODULE_AUTHOR() details
- Dropped loadable module config option
- Basic tests now use memory blocks with required size and alignment
- PUD aligned memory block gets allocated with alloc_contig_range()
- If PUD aligned memory could not be allocated it falls back on PMD aligned
memory block from page allocator and pud_* tests are skipped
- Clear and populate tests now operate on real in memory page table entries
- Dummy mm_struct gets allocated with mm_alloc()
- Dummy page table entries get allocated with [pud|pmd|pte]_alloc_[map]()
- Simplified [p4d|pgd]_basic_tests(), now has random values in the entries

Original RFC V1:

https://lore.kernel.org/linux-mm/[email protected]/

Cc: Andrew Morton <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Mike Rapoport <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Dan Williams <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Mark Brown <[email protected]>
Cc: Steven Price <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Masahiro Yamada <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Tetsuo Handa <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Sri Krishna chowdary <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Russell King - ARM Linux <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Martin Schwidefsky <[email protected]>
Cc: Heiko Carstens <[email protected]>
Cc: "David S. Miller" <[email protected]>
Cc: Vineet Gupta <[email protected]>
Cc: James Hogan <[email protected]>
Cc: Paul Burton <[email protected]>
Cc: Ralf Baechle <[email protected]>
Cc: Kirill A. Shutemov <[email protected]>
Cc: Gerald Schaefer <[email protected]>
Cc: Christophe Leroy <[email protected]>
Cc: Mike Kravetz <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]

Anshuman Khandual (2):
mm/hugetlb: Make alloc_gigantic_page() available for general use
mm/pgtable/debug: Add test validating architecture page table helpers

arch/x86/include/asm/pgtable_64_types.h | 2 +
include/linux/hugetlb.h | 9 +
mm/Kconfig.debug | 14 +
mm/Makefile | 1 +
mm/arch_pgtable_test.c | 429 ++++++++++++++++++++++++
mm/hugetlb.c | 24 +-
6 files changed, 477 insertions(+), 2 deletions(-)
create mode 100644 mm/arch_pgtable_test.c

--
2.20.1


2019-09-12 06:07:03

by Anshuman Khandual

[permalink] [raw]
Subject: [PATCH V2 2/2] mm/pgtable/debug: Add test validating architecture page table helpers

This adds a test module which validates architecture page table helpers and
accessors for compliance with generic MM semantics expectations. This will
help various architectures in validating changes to existing page table
helpers or additions of new ones.

The test page table and the memory pages backing its entries at various
levels are all allocated from system memory with the required alignments.
If memory pages with the required size and alignment cannot be allocated,
then all dependent individual tests are skipped.
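The allocation fallback described here can be sketched in userspace. The
sizes below are illustrative placeholders (the real PUD_SIZE/PMD_SIZE are
platform dependent), and aligned_alloc() merely stands in for the kernel's
alloc_gigantic_page_order()/alloc_pages() calls used by the actual test:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Illustrative sizes only; real values depend on the architecture. */
#define TOY_PAGE_SIZE  4096UL
#define TOY_PMD_SIZE   (2UL << 20)   /* 2MB */
#define TOY_PUD_SIZE   (1UL << 30)   /* 1GB */

static int pud_aligned, pmd_aligned;

static void *alloc_mapped_block(void)
{
	void *block;

	/* First try a PUD-sized, PUD-aligned block so every test can run. */
	block = aligned_alloc(TOY_PUD_SIZE, TOY_PUD_SIZE);
	if (block) {
		pud_aligned = 1;
		pmd_aligned = 1;
		return block;
	}

	/* PUD allocation failed: fall back to a PMD-aligned block and
	 * skip the pud_* tests. */
	block = aligned_alloc(TOY_PMD_SIZE, TOY_PMD_SIZE);
	if (block) {
		pmd_aligned = 1;
		return block;
	}

	/* Last resort: a single page; only PTE-level tests remain valid. */
	return aligned_alloc(TOY_PAGE_SIZE, TOY_PAGE_SIZE);
}
```

In the kernel test the same cascade is implemented in alloc_mapped_page(),
with the pud_aligned/pmd_aligned flags later consulted by the individual
test functions.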

Cc: Andrew Morton <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Mike Rapoport <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Dan Williams <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Mark Brown <[email protected]>
Cc: Steven Price <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Masahiro Yamada <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Tetsuo Handa <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Sri Krishna chowdary <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Russell King - ARM Linux <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Martin Schwidefsky <[email protected]>
Cc: Heiko Carstens <[email protected]>
Cc: "David S. Miller" <[email protected]>
Cc: Vineet Gupta <[email protected]>
Cc: James Hogan <[email protected]>
Cc: Paul Burton <[email protected]>
Cc: Ralf Baechle <[email protected]>
Cc: Kirill A. Shutemov <[email protected]>
Cc: Gerald Schaefer <[email protected]>
Cc: Christophe Leroy <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]

Suggested-by: Catalin Marinas <[email protected]>
Signed-off-by: Anshuman Khandual <[email protected]>
---
arch/x86/include/asm/pgtable_64_types.h | 2 +
mm/Kconfig.debug | 14 +
mm/Makefile | 1 +
mm/arch_pgtable_test.c | 429 ++++++++++++++++++++++++
4 files changed, 446 insertions(+)
create mode 100644 mm/arch_pgtable_test.c

diff --git a/arch/x86/include/asm/pgtable_64_types.h b/arch/x86/include/asm/pgtable_64_types.h
index 52e5f5f2240d..b882792a3999 100644
--- a/arch/x86/include/asm/pgtable_64_types.h
+++ b/arch/x86/include/asm/pgtable_64_types.h
@@ -40,6 +40,8 @@ static inline bool pgtable_l5_enabled(void)
#define pgtable_l5_enabled() 0
#endif /* CONFIG_X86_5LEVEL */

+#define mm_p4d_folded(mm) (!pgtable_l5_enabled())
+
extern unsigned int pgdir_shift;
extern unsigned int ptrs_per_p4d;

diff --git a/mm/Kconfig.debug b/mm/Kconfig.debug
index 327b3ebf23bf..ce9c397f7b07 100644
--- a/mm/Kconfig.debug
+++ b/mm/Kconfig.debug
@@ -117,3 +117,17 @@ config DEBUG_RODATA_TEST
depends on STRICT_KERNEL_RWX
---help---
This option enables a testcase for the setting rodata read-only.
+
+config DEBUG_ARCH_PGTABLE_TEST
+ bool "Test arch page table helpers for semantics compliance"
+ depends on MMU
+ depends on DEBUG_KERNEL
+ help
+ This option provides a test which can be used to validate
+ architecture page table helper functions on various platforms,
+ verifying that they comply with expected generic MM semantics.
+ This will help architecture code ensure that any changes to
+ these helpers, or new additions, still conform to the expected
+ generic MM semantics.
+
+ If unsure, say N.
diff --git a/mm/Makefile b/mm/Makefile
index d996846697ef..bb572c5aa8c5 100644
--- a/mm/Makefile
+++ b/mm/Makefile
@@ -86,6 +86,7 @@ obj-$(CONFIG_HWPOISON_INJECT) += hwpoison-inject.o
obj-$(CONFIG_DEBUG_KMEMLEAK) += kmemleak.o
obj-$(CONFIG_DEBUG_KMEMLEAK_TEST) += kmemleak-test.o
obj-$(CONFIG_DEBUG_RODATA_TEST) += rodata_test.o
+obj-$(CONFIG_DEBUG_ARCH_PGTABLE_TEST) += arch_pgtable_test.o
obj-$(CONFIG_PAGE_OWNER) += page_owner.o
obj-$(CONFIG_CLEANCACHE) += cleancache.o
obj-$(CONFIG_MEMORY_ISOLATION) += page_isolation.o
diff --git a/mm/arch_pgtable_test.c b/mm/arch_pgtable_test.c
new file mode 100644
index 000000000000..8b4a92756ad8
--- /dev/null
+++ b/mm/arch_pgtable_test.c
@@ -0,0 +1,429 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * This kernel module validates architecture page table helpers &
+ * accessors and helps in verifying their continued compliance with
+ * generic MM semantics.
+ *
+ * Copyright (C) 2019 ARM Ltd.
+ *
+ * Author: Anshuman Khandual <[email protected]>
+ */
+#define pr_fmt(fmt) "arch_pgtable_test: %s " fmt, __func__
+
+#include <linux/gfp.h>
+#include <linux/hugetlb.h>
+#include <linux/kernel.h>
+#include <linux/mm.h>
+#include <linux/mman.h>
+#include <linux/mm_types.h>
+#include <linux/module.h>
+#include <linux/pfn_t.h>
+#include <linux/printk.h>
+#include <linux/random.h>
+#include <linux/spinlock.h>
+#include <linux/swap.h>
+#include <linux/swapops.h>
+#include <linux/sched/mm.h>
+#include <asm/pgalloc.h>
+#include <asm/pgtable.h>
+
+/*
+ * Basic operations
+ *
+ * mkold(entry) = An old and not a young entry
+ * mkyoung(entry) = A young and not an old entry
+ * mkdirty(entry) = A dirty and not a clean entry
+ * mkclean(entry) = A clean and not a dirty entry
+ * mkwrite(entry) = A write and not a write protected entry
+ * wrprotect(entry) = A write protected and not a write entry
+ * pxx_bad(entry) = A mapped and non-table entry
+ * pxx_same(entry1, entry2) = Both entries hold the exact same value
+ */
+#define VMFLAGS (VM_READ|VM_WRITE|VM_EXEC)
+
+/*
+ * On s390, the lower 12 bits are used to identify a given page table
+ * entry type and for other arch specific requirements. But these bits
+ * might affect the ability to clear entries with pxx_clear(). So while
+ * loading up the entries, skip the lower 12 bits to accommodate s390.
+ * This does not affect any other platform.
+ */
+#define RANDOM_ORVALUE (0xfffffffffffff000UL)
+#define RANDOM_NZVALUE (0xff)
+
+static bool pud_aligned;
+static bool pmd_aligned;
+
+static void pte_basic_tests(struct page *page, pgprot_t prot)
+{
+ pte_t pte = mk_pte(page, prot);
+
+ WARN_ON(!pte_same(pte, pte));
+ WARN_ON(!pte_young(pte_mkyoung(pte)));
+ WARN_ON(!pte_dirty(pte_mkdirty(pte)));
+ WARN_ON(!pte_write(pte_mkwrite(pte)));
+ WARN_ON(pte_young(pte_mkold(pte)));
+ WARN_ON(pte_dirty(pte_mkclean(pte)));
+ WARN_ON(pte_write(pte_wrprotect(pte)));
+}
+
+#ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE
+static void pmd_basic_tests(struct page *page, pgprot_t prot)
+{
+ pmd_t pmd;
+
+ /*
+ * Memory block here must be PMD_SIZE aligned. Abort this
+ * test in case we could not allocate such a memory block.
+ */
+ if (!pmd_aligned) {
+ pr_warn("Could not proceed with PMD tests\n");
+ return;
+ }
+
+ pmd = mk_pmd(page, prot);
+ WARN_ON(!pmd_same(pmd, pmd));
+ WARN_ON(!pmd_young(pmd_mkyoung(pmd)));
+ WARN_ON(!pmd_dirty(pmd_mkdirty(pmd)));
+ WARN_ON(!pmd_write(pmd_mkwrite(pmd)));
+ WARN_ON(pmd_young(pmd_mkold(pmd)));
+ WARN_ON(pmd_dirty(pmd_mkclean(pmd)));
+ WARN_ON(pmd_write(pmd_wrprotect(pmd)));
+ /*
+ * A huge page does not point to next level page table
+ * entry. Hence this must qualify as pmd_bad().
+ */
+ WARN_ON(!pmd_bad(pmd_mkhuge(pmd)));
+}
+#else
+static void pmd_basic_tests(struct page *page, pgprot_t prot) { }
+#endif
+
+#ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD
+static void pud_basic_tests(struct page *page, pgprot_t prot)
+{
+ pud_t pud;
+
+ /*
+ * Memory block here must be PUD_SIZE aligned. Abort this
+ * test in case we could not allocate such a memory block.
+ */
+ if (!pud_aligned) {
+ pr_warn("Could not proceed with PUD tests\n");
+ return;
+ }
+
+ pud = pfn_pud(page_to_pfn(page), prot);
+ WARN_ON(!pud_same(pud, pud));
+ WARN_ON(!pud_young(pud_mkyoung(pud)));
+ WARN_ON(!pud_write(pud_mkwrite(pud)));
+ WARN_ON(pud_write(pud_wrprotect(pud)));
+ WARN_ON(pud_young(pud_mkold(pud)));
+
+#if !defined(__PAGETABLE_PMD_FOLDED) && !defined(__ARCH_HAS_4LEVEL_HACK)
+ /*
+ * A huge page does not point to next level page table
+ * entry. Hence this must qualify as pud_bad().
+ */
+ WARN_ON(!pud_bad(pud_mkhuge(pud)));
+#endif
+}
+#else
+static void pud_basic_tests(struct page *page, pgprot_t prot) { }
+#endif
+
+static void p4d_basic_tests(struct page *page, pgprot_t prot)
+{
+ p4d_t p4d;
+
+ memset(&p4d, RANDOM_NZVALUE, sizeof(p4d_t));
+ WARN_ON(!p4d_same(p4d, p4d));
+}
+
+static void pgd_basic_tests(struct page *page, pgprot_t prot)
+{
+ pgd_t pgd;
+
+ memset(&pgd, RANDOM_NZVALUE, sizeof(pgd_t));
+ WARN_ON(!pgd_same(pgd, pgd));
+}
+
+#if !defined(__PAGETABLE_PMD_FOLDED) && !defined(__ARCH_HAS_4LEVEL_HACK)
+static void pud_clear_tests(pud_t *pudp)
+{
+ pud_t pud = READ_ONCE(*pudp);
+
+ pud = __pud(pud_val(pud) | RANDOM_ORVALUE);
+ WRITE_ONCE(*pudp, pud);
+ pud_clear(pudp);
+ pud = READ_ONCE(*pudp);
+ WARN_ON(!pud_none(pud));
+}
+
+static void pud_populate_tests(struct mm_struct *mm, pud_t *pudp, pmd_t *pmdp)
+{
+ pud_t pud;
+
+ /*
+ * This entry points to next level page table page.
+ * Hence this must not qualify as pud_bad().
+ */
+ pmd_clear(pmdp);
+ pud_clear(pudp);
+ pud_populate(mm, pudp, pmdp);
+ pud = READ_ONCE(*pudp);
+ WARN_ON(pud_bad(pud));
+}
+#else
+static void pud_clear_tests(pud_t *pudp) { }
+static void pud_populate_tests(struct mm_struct *mm, pud_t *pudp, pmd_t *pmdp)
+{
+}
+#endif
+
+#if !defined(__PAGETABLE_PUD_FOLDED) && !defined(__ARCH_HAS_5LEVEL_HACK)
+static void p4d_clear_tests(p4d_t *p4dp)
+{
+ p4d_t p4d = READ_ONCE(*p4dp);
+
+ p4d = __p4d(p4d_val(p4d) | RANDOM_ORVALUE);
+ WRITE_ONCE(*p4dp, p4d);
+ p4d_clear(p4dp);
+ p4d = READ_ONCE(*p4dp);
+ WARN_ON(!p4d_none(p4d));
+}
+
+static void p4d_populate_tests(struct mm_struct *mm, p4d_t *p4dp, pud_t *pudp)
+{
+ p4d_t p4d;
+
+ /*
+ * This entry points to next level page table page.
+ * Hence this must not qualify as p4d_bad().
+ */
+ pud_clear(pudp);
+ p4d_clear(p4dp);
+ p4d_populate(mm, p4dp, pudp);
+ p4d = READ_ONCE(*p4dp);
+ WARN_ON(p4d_bad(p4d));
+}
+#else
+static void p4d_clear_tests(p4d_t *p4dp) { }
+static void p4d_populate_tests(struct mm_struct *mm, p4d_t *p4dp, pud_t *pudp)
+{
+}
+#endif
+
+#ifndef __ARCH_HAS_5LEVEL_HACK
+static void pgd_clear_tests(struct mm_struct *mm, pgd_t *pgdp)
+{
+ pgd_t pgd = READ_ONCE(*pgdp);
+
+ if (mm_p4d_folded(mm))
+ return;
+
+ pgd = __pgd(pgd_val(pgd) | RANDOM_ORVALUE);
+ WRITE_ONCE(*pgdp, pgd);
+ pgd_clear(pgdp);
+ pgd = READ_ONCE(*pgdp);
+ WARN_ON(!pgd_none(pgd));
+}
+
+static void pgd_populate_tests(struct mm_struct *mm, pgd_t *pgdp, p4d_t *p4dp)
+{
+ pgd_t pgd;
+
+ if (mm_p4d_folded(mm))
+ return;
+
+ /*
+ * This entry points to next level page table page.
+ * Hence this must not qualify as pgd_bad().
+ */
+ p4d_clear(p4dp);
+ pgd_clear(pgdp);
+ pgd_populate(mm, pgdp, p4dp);
+ pgd = READ_ONCE(*pgdp);
+ WARN_ON(pgd_bad(pgd));
+}
+#else
+static void pgd_clear_tests(struct mm_struct *mm, pgd_t *pgdp) { }
+static void pgd_populate_tests(struct mm_struct *mm, pgd_t *pgdp, p4d_t *p4dp)
+{
+}
+#endif
+
+static void pte_clear_tests(struct mm_struct *mm, pte_t *ptep)
+{
+ pte_t pte = READ_ONCE(*ptep);
+
+ pte = __pte(pte_val(pte) | RANDOM_ORVALUE);
+ WRITE_ONCE(*ptep, pte);
+ pte_clear(mm, 0, ptep);
+ pte = READ_ONCE(*ptep);
+ WARN_ON(!pte_none(pte));
+}
+
+static void pmd_clear_tests(pmd_t *pmdp)
+{
+ pmd_t pmd = READ_ONCE(*pmdp);
+
+ pmd = __pmd(pmd_val(pmd) | RANDOM_ORVALUE);
+ WRITE_ONCE(*pmdp, pmd);
+ pmd_clear(pmdp);
+ pmd = READ_ONCE(*pmdp);
+ WARN_ON(!pmd_none(pmd));
+}
+
+static void pmd_populate_tests(struct mm_struct *mm, pmd_t *pmdp,
+ pgtable_t pgtable)
+{
+ pmd_t pmd;
+
+ /*
+ * This entry points to next level page table page.
+ * Hence this must not qualify as pmd_bad().
+ */
+ pmd_clear(pmdp);
+ pmd_populate(mm, pmdp, pgtable);
+ pmd = READ_ONCE(*pmdp);
+ WARN_ON(pmd_bad(pmd));
+}
+
+static struct page *alloc_mapped_page(void)
+{
+ struct page *page;
+ gfp_t gfp_mask = GFP_KERNEL | __GFP_ZERO;
+
+ page = alloc_gigantic_page_order(get_order(PUD_SIZE), gfp_mask,
+ first_memory_node, &node_states[N_MEMORY]);
+ if (page) {
+ pud_aligned = true;
+ pmd_aligned = true;
+ return page;
+ }
+
+ page = alloc_pages(gfp_mask, get_order(PMD_SIZE));
+ if (page) {
+ pmd_aligned = true;
+ return page;
+ }
+ return alloc_page(gfp_mask);
+}
+
+static void free_mapped_page(struct page *page)
+{
+ if (pud_aligned) {
+ unsigned long pfn = page_to_pfn(page);
+
+ free_contig_range(pfn, 1ULL << get_order(PUD_SIZE));
+ return;
+ }
+
+ if (pmd_aligned) {
+ int order = get_order(PMD_SIZE);
+
+ free_pages((unsigned long)page_address(page), order);
+ return;
+ }
+ free_page((unsigned long)page_address(page));
+}
+
+static unsigned long get_random_vaddr(void)
+{
+ unsigned long random_vaddr, random_pages, total_user_pages;
+
+ total_user_pages = (TASK_SIZE - FIRST_USER_ADDRESS) / PAGE_SIZE;
+
+ random_pages = get_random_long() % total_user_pages;
+ random_vaddr = FIRST_USER_ADDRESS + random_pages * PAGE_SIZE;
+
+ WARN_ON(random_vaddr > TASK_SIZE);
+ WARN_ON(random_vaddr < FIRST_USER_ADDRESS);
+ return random_vaddr;
+}
+
+static int __init arch_pgtable_tests_init(void)
+{
+ struct mm_struct *mm;
+ struct page *page;
+ pgd_t *pgdp;
+ p4d_t *p4dp, *saved_p4dp;
+ pud_t *pudp, *saved_pudp;
+ pmd_t *pmdp, *saved_pmdp, pmd;
+ pte_t *ptep;
+ pgtable_t saved_ptep;
+ pgprot_t prot;
+ unsigned long vaddr;
+
+ prot = vm_get_page_prot(VMFLAGS);
+ vaddr = get_random_vaddr();
+ mm = mm_alloc();
+ if (!mm) {
+ pr_err("mm_struct allocation failed\n");
+ return 1;
+ }
+
+ page = alloc_mapped_page();
+ if (!page) {
+ pr_err("memory allocation failed\n");
+ return 1;
+ }
+
+ pgdp = pgd_offset(mm, vaddr);
+ p4dp = p4d_alloc(mm, pgdp, vaddr);
+ pudp = pud_alloc(mm, p4dp, vaddr);
+ pmdp = pmd_alloc(mm, pudp, vaddr);
+ ptep = pte_alloc_map(mm, pmdp, vaddr);
+
+ /*
+ * Save all the page table page addresses as the page table
+ * entries will be used for testing with random or garbage
+ * values. These saved addresses will be used for freeing
+ * page table pages.
+ */
+ pmd = READ_ONCE(*pmdp);
+ saved_p4dp = p4d_offset(pgdp, 0UL);
+ saved_pudp = pud_offset(p4dp, 0UL);
+ saved_pmdp = pmd_offset(pudp, 0UL);
+ saved_ptep = pmd_pgtable(pmd);
+
+ pte_basic_tests(page, prot);
+ pmd_basic_tests(page, prot);
+ pud_basic_tests(page, prot);
+ p4d_basic_tests(page, prot);
+ pgd_basic_tests(page, prot);
+
+ pte_clear_tests(mm, ptep);
+ pmd_clear_tests(pmdp);
+ pud_clear_tests(pudp);
+ p4d_clear_tests(p4dp);
+ pgd_clear_tests(mm, pgdp);
+
+ pmd_populate_tests(mm, pmdp, saved_ptep);
+ pud_populate_tests(mm, pudp, saved_pmdp);
+ p4d_populate_tests(mm, p4dp, saved_pudp);
+ pgd_populate_tests(mm, pgdp, saved_p4dp);
+
+ p4d_free(mm, saved_p4dp);
+ pud_free(mm, saved_pudp);
+ pmd_free(mm, saved_pmdp);
+ pte_free(mm, saved_ptep);
+
+ mm_dec_nr_puds(mm);
+ mm_dec_nr_pmds(mm);
+ mm_dec_nr_ptes(mm);
+ __mmdrop(mm);
+
+ free_mapped_page(page);
+ return 0;
+}
+
+static void __exit arch_pgtable_tests_exit(void) { }
+
+module_init(arch_pgtable_tests_init);
+module_exit(arch_pgtable_tests_exit);
+
+MODULE_LICENSE("GPL v2");
+MODULE_AUTHOR("Anshuman Khandual <[email protected]>");
+MODULE_DESCRIPTION("Test architecture page table helpers");
--
2.20.1

2019-09-12 06:31:25

by Anshuman Khandual

[permalink] [raw]
Subject: [PATCH V2 1/2] mm/hugetlb: Make alloc_gigantic_page() available for general use

alloc_gigantic_page() implements an allocation method that scans various
zones looking for a large contiguous memory block which could not have been
allocated through the buddy allocator. A subsequent patch which tests arch
page table helpers needs such a method to allocate a PUD_SIZE sized memory
block. In the future such a method might have other use cases as well. So
alloc_gigantic_page() has been split, carving out the actual memory
allocation into a new alloc_gigantic_page_order() which is made available
for general use.
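The split described above follows a common refactoring shape: move the real
logic into an order-based function and reduce the old hstate-based entry
point to a thin wrapper. The sketch below is a userspace emulation with
invented toy_* names; calloc() merely stands in for the real
alloc_contig_range() zone scan:

```c
#include <assert.h>
#include <stdlib.h>

/* Minimal stand-in for struct hstate: only the huge page order matters
 * for this illustration. */
struct toy_hstate {
	unsigned int order;
};

/* Carries the actual allocation logic, parameterized by order alone,
 * so non-HugeTLB callers can use it too. */
static void *toy_alloc_gigantic_page_order(unsigned int order)
{
	/* Model "allocate 2^order pages" with a zeroed heap block;
	 * 64 bytes stands in for a page here. */
	return calloc(1UL << order, 64);
}

/* The old entry point becomes a wrapper that just extracts the order
 * from the hstate, mirroring the patch. */
static void *toy_alloc_gigantic_page(struct toy_hstate *h)
{
	return toy_alloc_gigantic_page_order(h->order);
}
```

The benefit is that existing HugeTLB callers are unchanged while the new
order-based function can serve callers (like the page table test) that have
no hstate at hand.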

Cc: Mike Kravetz <[email protected]>
Cc: [email protected]
Signed-off-by: Anshuman Khandual <[email protected]>
---
Should we move alloc_gigantic_page_order() to page_alloc.c, with its
declarations in include/linux/gfp.h, instead? It is still very much HugeTLB
specific.

include/linux/hugetlb.h | 9 +++++++++
mm/hugetlb.c | 24 ++++++++++++++++++++++--
2 files changed, 31 insertions(+), 2 deletions(-)

diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 53fc34f930d0..cc50d5ad4885 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -299,6 +299,9 @@ static inline bool is_file_hugepages(struct file *file)
}


+struct page *
+alloc_gigantic_page_order(unsigned int order, gfp_t gfp_mask,
+ int nid, nodemask_t *nodemask);
#else /* !CONFIG_HUGETLBFS */

#define is_file_hugepages(file) false
@@ -310,6 +313,12 @@ hugetlb_file_setup(const char *name, size_t size, vm_flags_t acctflag,
return ERR_PTR(-ENOSYS);
}

+static inline struct page *
+alloc_gigantic_page_order(unsigned int order, gfp_t gfp_mask,
+ int nid, nodemask_t *nodemask)
+{
+ return NULL;
+}
#endif /* !CONFIG_HUGETLBFS */

#ifdef HAVE_ARCH_HUGETLB_UNMAPPED_AREA
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index ef37c85423a5..3fb81252f52b 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1112,10 +1112,9 @@ static bool zone_spans_last_pfn(const struct zone *zone,
return zone_spans_pfn(zone, last_pfn);
}

-static struct page *alloc_gigantic_page(struct hstate *h, gfp_t gfp_mask,
+struct page *alloc_gigantic_page_order(unsigned int order, gfp_t gfp_mask,
int nid, nodemask_t *nodemask)
{
- unsigned int order = huge_page_order(h);
unsigned long nr_pages = 1 << order;
unsigned long ret, pfn, flags;
struct zonelist *zonelist;
@@ -1151,6 +1150,14 @@ static struct page *alloc_gigantic_page(struct hstate *h, gfp_t gfp_mask,
return NULL;
}

+static struct page *alloc_gigantic_page(struct hstate *h, gfp_t gfp_mask,
+ int nid, nodemask_t *nodemask)
+{
+ unsigned int order = huge_page_order(h);
+
+ return alloc_gigantic_page_order(order, gfp_mask, nid, nodemask);
+}
+
static void prep_new_huge_page(struct hstate *h, struct page *page, int nid);
static void prep_compound_gigantic_page(struct page *page, unsigned int order);
#else /* !CONFIG_CONTIG_ALLOC */
@@ -1159,6 +1166,12 @@ static struct page *alloc_gigantic_page(struct hstate *h, gfp_t gfp_mask,
{
return NULL;
}
+
+struct page *alloc_gigantic_page_order(unsigned int order, gfp_t gfp_mask,
+ int nid, nodemask_t *nodemask)
+{
+ return NULL;
+}
#endif /* CONFIG_CONTIG_ALLOC */

#else /* !CONFIG_ARCH_HAS_GIGANTIC_PAGE */
@@ -1167,6 +1180,13 @@ static struct page *alloc_gigantic_page(struct hstate *h, gfp_t gfp_mask,
{
return NULL;
}
+
+struct page *alloc_gigantic_page_order(unsigned int order, gfp_t gfp_mask,
+ int nid, nodemask_t *nodemask)
+{
+ return NULL;
+}
+
static inline void free_gigantic_page(struct page *page, unsigned int order) { }
static inline void destroy_compound_gigantic_page(struct page *page,
unsigned int order) { }
--
2.20.1

2019-09-12 11:05:41

by Kirill A. Shutemov

[permalink] [raw]
Subject: Re: [PATCH V2 2/2] mm/pgtable/debug: Add test validating architecture page table helpers

On Thu, Sep 12, 2019 at 11:32:53AM +0530, Anshuman Khandual wrote:
> +MODULE_LICENSE("GPL v2");
> +MODULE_AUTHOR("Anshuman Khandual <[email protected]>");
> +MODULE_DESCRIPTION("Test architecture page table helpers");

It's not a module. Why?

BTW, I think we should make all code here __init (or its variants) so it
can be discarded after boot. It has no use after that.

--
Kirill A. Shutemov

2019-09-12 12:11:58

by Anshuman Khandual

[permalink] [raw]
Subject: Re: [PATCH V2 2/2] mm/pgtable/debug: Add test validating architecture page table helpers



On 09/12/2019 04:30 PM, Kirill A. Shutemov wrote:
> On Thu, Sep 12, 2019 at 11:32:53AM +0530, Anshuman Khandual wrote:
>> +MODULE_LICENSE("GPL v2");
>> +MODULE_AUTHOR("Anshuman Khandual <[email protected]>");
>> +MODULE_DESCRIPTION("Test architecture page table helpers");
>
> It's not a module. Why?

Not any more. Nothing in particular, just that module_init() code gets
executed after the page allocator is initialized, which is needed here. But
I guess that is probably not a great way to get this test started.

>
> BTW, I think we should make all code here __init (or its variants) so it
> can be discarded after boot. It has no use after that.

Sounds good, will change. I will mark all these functions as __init and
will trigger the test with late_initcall().

2019-09-12 15:40:55

by Christophe Leroy

[permalink] [raw]
Subject: Re: [PATCH V2 2/2] mm/pgtable/debug: Add test validating architecture page table helpers



On 12/09/2019 at 17:00, Christophe Leroy wrote:
>
>
> On 09/12/2019 06:02 AM, Anshuman Khandual wrote:
>> This adds a test module which will validate architecture page table
>> helpers
>> and accessors regarding compliance with generic MM semantics
>> expectations.
>> This will help various architectures in validating changes to the
>> existing
>> page table helpers or addition of new ones.
>>
>> Test page table and memory pages creating it's entries at various
>> level are
>> all allocated from system memory with required alignments. If memory
>> pages
>> with required size and alignment could not be allocated, then all
>> depending
>> individual tests are skipped.
>
> Build failure on powerpc book3s/32. This is because asm/highmem.h is
> missing. It can't be included from asm/book3s/32/pgtable.h because that
> creates a circular dependency, so it has to be included from
> mm/arch_pgtable_test.c

In fact it is <linux/highmem.h> that needs to be added; including
<asm/highmem.h> directly causes a build failure at link time.

Christophe

>
>
>
>   CC      mm/arch_pgtable_test.o
> In file included from ./arch/powerpc/include/asm/book3s/pgtable.h:8:0,
>                  from ./arch/powerpc/include/asm/pgtable.h:18,
>                  from ./include/linux/mm.h:99,
>                  from ./arch/powerpc/include/asm/io.h:29,
>                  from ./include/linux/io.h:13,
>                  from ./include/linux/irq.h:20,
>                  from ./arch/powerpc/include/asm/hardirq.h:6,
>                  from ./include/linux/hardirq.h:9,
>                  from ./include/linux/interrupt.h:11,
>                  from ./include/linux/kernel_stat.h:9,
>                  from ./include/linux/cgroup.h:26,
>                  from ./include/linux/hugetlb.h:9,
>                  from mm/arch_pgtable_test.c:14:
> mm/arch_pgtable_test.c: In function 'arch_pgtable_tests_init':
> ./arch/powerpc/include/asm/book3s/32/pgtable.h:365:13: error: implicit
> declaration of function 'kmap_atomic'
> [-Werror=implicit-function-declaration]
>   ((pte_t *)(kmap_atomic(pmd_page(*(dir))) + \
>              ^
> ./include/linux/mm.h:2008:31: note: in expansion of macro 'pte_offset_map'
>   (pte_alloc(mm, pmd) ? NULL : pte_offset_map(pmd, address))
>                                ^
> mm/arch_pgtable_test.c:377:9: note: in expansion of macro 'pte_alloc_map'
>   ptep = pte_alloc_map(mm, pmdp, vaddr);
>          ^
> cc1: some warnings being treated as errors
> make[2]: *** [mm/arch_pgtable_test.o] Error 1
>
>
> Christophe
>
>
>>
>> Cc: Andrew Morton <[email protected]>
>> Cc: Vlastimil Babka <[email protected]>
>> Cc: Greg Kroah-Hartman <[email protected]>
>> Cc: Thomas Gleixner <[email protected]>
>> Cc: Mike Rapoport <[email protected]>
>> Cc: Jason Gunthorpe <[email protected]>
>> Cc: Dan Williams <[email protected]>
>> Cc: Peter Zijlstra <[email protected]>
>> Cc: Michal Hocko <[email protected]>
>> Cc: Mark Rutland <[email protected]>
>> Cc: Mark Brown <[email protected]>
>> Cc: Steven Price <[email protected]>
>> Cc: Ard Biesheuvel <[email protected]>
>> Cc: Masahiro Yamada <[email protected]>
>> Cc: Kees Cook <[email protected]>
>> Cc: Tetsuo Handa <[email protected]>
>> Cc: Matthew Wilcox <[email protected]>
>> Cc: Sri Krishna chowdary <[email protected]>
>> Cc: Dave Hansen <[email protected]>
>> Cc: Russell King - ARM Linux <[email protected]>
>> Cc: Michael Ellerman <[email protected]>
>> Cc: Paul Mackerras <[email protected]>
>> Cc: Martin Schwidefsky <[email protected]>
>> Cc: Heiko Carstens <[email protected]>
>> Cc: "David S. Miller" <[email protected]>
>> Cc: Vineet Gupta <[email protected]>
>> Cc: James Hogan <[email protected]>
>> Cc: Paul Burton <[email protected]>
>> Cc: Ralf Baechle <[email protected]>
>> Cc: Kirill A. Shutemov <[email protected]>
>> Cc: Gerald Schaefer <[email protected]>
>> Cc: Christophe Leroy <[email protected]>
>> Cc: [email protected]
>> Cc: [email protected]
>> Cc: [email protected]
>> Cc: [email protected]
>> Cc: [email protected]
>> Cc: [email protected]
>> Cc: [email protected]
>> Cc: [email protected]
>> Cc: [email protected]
>> Cc: [email protected]
>>
>> Suggested-by: Catalin Marinas <[email protected]>
>> Signed-off-by: Anshuman Khandual <[email protected]>
>> ---
>>   arch/x86/include/asm/pgtable_64_types.h |   2 +
>>   mm/Kconfig.debug                        |  14 +
>>   mm/Makefile                             |   1 +
>>   mm/arch_pgtable_test.c                  | 429 ++++++++++++++++++++++++
>>   4 files changed, 446 insertions(+)
>>   create mode 100644 mm/arch_pgtable_test.c
>>
>> diff --git a/arch/x86/include/asm/pgtable_64_types.h
>> b/arch/x86/include/asm/pgtable_64_types.h
>> index 52e5f5f2240d..b882792a3999 100644
>> --- a/arch/x86/include/asm/pgtable_64_types.h
>> +++ b/arch/x86/include/asm/pgtable_64_types.h
>> @@ -40,6 +40,8 @@ static inline bool pgtable_l5_enabled(void)
>>   #define pgtable_l5_enabled() 0
>>   #endif /* CONFIG_X86_5LEVEL */
>> +#define mm_p4d_folded(mm) (!pgtable_l5_enabled())
>> +
>>   extern unsigned int pgdir_shift;
>>   extern unsigned int ptrs_per_p4d;
>> diff --git a/mm/Kconfig.debug b/mm/Kconfig.debug
>> index 327b3ebf23bf..ce9c397f7b07 100644
>> --- a/mm/Kconfig.debug
>> +++ b/mm/Kconfig.debug
>> @@ -117,3 +117,17 @@ config DEBUG_RODATA_TEST
>>       depends on STRICT_KERNEL_RWX
>>       ---help---
>>         This option enables a testcase for the setting rodata read-only.
>> +
>> +config DEBUG_ARCH_PGTABLE_TEST
>> +    bool "Test arch page table helpers for semantics compliance"
>> +    depends on MMU
>> +    depends on DEBUG_KERNEL
>> +    help
>> +      This options provides a kernel module which can be used to test
>> +      architecture page table helper functions on various platform in
>> +      verifying if they comply with expected generic MM semantics. This
>> +      will help architectures code in making sure that any changes or
>> +      new additions of these helpers will still conform to generic MM
>> +      expected semantics.
>> +
>> +      If unsure, say N.
>> diff --git a/mm/Makefile b/mm/Makefile
>> index d996846697ef..bb572c5aa8c5 100644
>> --- a/mm/Makefile
>> +++ b/mm/Makefile
>> @@ -86,6 +86,7 @@ obj-$(CONFIG_HWPOISON_INJECT) += hwpoison-inject.o
>>   obj-$(CONFIG_DEBUG_KMEMLEAK) += kmemleak.o
>>   obj-$(CONFIG_DEBUG_KMEMLEAK_TEST) += kmemleak-test.o
>>   obj-$(CONFIG_DEBUG_RODATA_TEST) += rodata_test.o
>> +obj-$(CONFIG_DEBUG_ARCH_PGTABLE_TEST) += arch_pgtable_test.o
>>   obj-$(CONFIG_PAGE_OWNER) += page_owner.o
>>   obj-$(CONFIG_CLEANCACHE) += cleancache.o
>>   obj-$(CONFIG_MEMORY_ISOLATION) += page_isolation.o
>> diff --git a/mm/arch_pgtable_test.c b/mm/arch_pgtable_test.c
>> new file mode 100644
>> index 000000000000..8b4a92756ad8
>> --- /dev/null
>> +++ b/mm/arch_pgtable_test.c
>> @@ -0,0 +1,429 @@
>> +// SPDX-License-Identifier: GPL-2.0-only
>> +/*
>> + * This kernel module validates architecture page table helpers &
>> + * accessors and helps in verifying their continued compliance with
>> + * generic MM semantics.
>> + *
>> + * Copyright (C) 2019 ARM Ltd.
>> + *
>> + * Author: Anshuman Khandual <[email protected]>
>> + */
>> +#define pr_fmt(fmt) "arch_pgtable_test: %s " fmt, __func__
>> +
>> +#include <linux/gfp.h>
>> +#include <linux/hugetlb.h>
>> +#include <linux/kernel.h>
>> +#include <linux/mm.h>
>> +#include <linux/mman.h>
>> +#include <linux/mm_types.h>
>> +#include <linux/module.h>
>> +#include <linux/pfn_t.h>
>> +#include <linux/printk.h>
>> +#include <linux/random.h>
>> +#include <linux/spinlock.h>
>> +#include <linux/swap.h>
>> +#include <linux/swapops.h>
>> +#include <linux/sched/mm.h>
>> +#include <asm/pgalloc.h>
>> +#include <asm/pgtable.h>
>> +
>> +/*
>> + * Basic operations
>> + *
>> + * mkold(entry)             = An old and not a young entry
>> + * mkyoung(entry)           = A young and not an old entry
>> + * mkdirty(entry)           = A dirty and not a clean entry
>> + * mkclean(entry)           = A clean and not a dirty entry
>> + * mkwrite(entry)           = A write and not a write protected entry
>> + * wrprotect(entry)         = A write protected and not a write entry
>> + * pxx_bad(entry)           = A mapped and non-table entry
>> + * pxx_same(entry1, entry2) = Both entries hold the exact same value
>> + */
>> +#define VMFLAGS    (VM_READ|VM_WRITE|VM_EXEC)
>> +
>> +/*
>> + * On the s390 platform, the lower 12 bits are used to identify a given
>> + * page table entry type and for other arch specific requirements. But
>> + * these bits might affect the ability to clear entries with pxx_clear().
>> + * So while loading up the entries, skip all lower 12 bits in order to
>> + * accommodate the s390 platform. This does not affect any other platform.
>> + */
>> +#define RANDOM_ORVALUE    (0xfffffffffffff000UL)
>> +#define RANDOM_NZVALUE    (0xff)
>> +
>> +static bool pud_aligned;
>> +static bool pmd_aligned;
>> +
>> +static void pte_basic_tests(struct page *page, pgprot_t prot)
>> +{
>> +    pte_t pte = mk_pte(page, prot);
>> +
>> +    WARN_ON(!pte_same(pte, pte));
>> +    WARN_ON(!pte_young(pte_mkyoung(pte)));
>> +    WARN_ON(!pte_dirty(pte_mkdirty(pte)));
>> +    WARN_ON(!pte_write(pte_mkwrite(pte)));
>> +    WARN_ON(pte_young(pte_mkold(pte)));
>> +    WARN_ON(pte_dirty(pte_mkclean(pte)));
>> +    WARN_ON(pte_write(pte_wrprotect(pte)));
>> +}
>> +
>> +#ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE
>> +static void pmd_basic_tests(struct page *page, pgprot_t prot)
>> +{
>> +    pmd_t pmd;
>> +
>> +    /*
>> +     * Memory block here must be PMD_SIZE aligned. Abort this
>> +     * test in case we could not allocate such a memory block.
>> +     */
>> +    if (!pmd_aligned) {
>> +        pr_warn("Could not proceed with PMD tests\n");
>> +        return;
>> +    }
>> +
>> +    pmd = mk_pmd(page, prot);
>> +    WARN_ON(!pmd_same(pmd, pmd));
>> +    WARN_ON(!pmd_young(pmd_mkyoung(pmd)));
>> +    WARN_ON(!pmd_dirty(pmd_mkdirty(pmd)));
>> +    WARN_ON(!pmd_write(pmd_mkwrite(pmd)));
>> +    WARN_ON(pmd_young(pmd_mkold(pmd)));
>> +    WARN_ON(pmd_dirty(pmd_mkclean(pmd)));
>> +    WARN_ON(pmd_write(pmd_wrprotect(pmd)));
>> +    /*
>> +     * A huge page does not point to next level page table
>> +     * entry. Hence this must qualify as pmd_bad().
>> +     */
>> +    WARN_ON(!pmd_bad(pmd_mkhuge(pmd)));
>> +}
>> +#else
>> +static void pmd_basic_tests(struct page *page, pgprot_t prot) { }
>> +#endif
>> +
>> +#ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD
>> +static void pud_basic_tests(struct page *page, pgprot_t prot)
>> +{
>> +    pud_t pud;
>> +
>> +    /*
>> +     * Memory block here must be PUD_SIZE aligned. Abort this
>> +     * test in case we could not allocate such a memory block.
>> +     */
>> +    if (!pud_aligned) {
>> +        pr_warn("Could not proceed with PUD tests\n");
>> +        return;
>> +    }
>> +
>> +    pud = pfn_pud(page_to_pfn(page), prot);
>> +    WARN_ON(!pud_same(pud, pud));
>> +    WARN_ON(!pud_young(pud_mkyoung(pud)));
>> +    WARN_ON(!pud_write(pud_mkwrite(pud)));
>> +    WARN_ON(pud_write(pud_wrprotect(pud)));
>> +    WARN_ON(pud_young(pud_mkold(pud)));
>> +
>> +#if !defined(__PAGETABLE_PMD_FOLDED) && !defined(__ARCH_HAS_4LEVEL_HACK)
>> +    /*
>> +     * A huge page does not point to next level page table
>> +     * entry. Hence this must qualify as pud_bad().
>> +     */
>> +    WARN_ON(!pud_bad(pud_mkhuge(pud)));
>> +#endif
>> +}
>> +#else
>> +static void pud_basic_tests(struct page *page, pgprot_t prot) { }
>> +#endif
>> +
>> +static void p4d_basic_tests(struct page *page, pgprot_t prot)
>> +{
>> +    p4d_t p4d;
>> +
>> +    memset(&p4d, RANDOM_NZVALUE, sizeof(p4d_t));
>> +    WARN_ON(!p4d_same(p4d, p4d));
>> +}
>> +
>> +static void pgd_basic_tests(struct page *page, pgprot_t prot)
>> +{
>> +    pgd_t pgd;
>> +
>> +    memset(&pgd, RANDOM_NZVALUE, sizeof(pgd_t));
>> +    WARN_ON(!pgd_same(pgd, pgd));
>> +}
>> +
>> +#if !defined(__PAGETABLE_PMD_FOLDED) && !defined(__ARCH_HAS_4LEVEL_HACK)
>> +static void pud_clear_tests(pud_t *pudp)
>> +{
>> +    pud_t pud = READ_ONCE(*pudp);
>> +
>> +    pud = __pud(pud_val(pud) | RANDOM_ORVALUE);
>> +    WRITE_ONCE(*pudp, pud);
>> +    pud_clear(pudp);
>> +    pud = READ_ONCE(*pudp);
>> +    WARN_ON(!pud_none(pud));
>> +}
>> +
>> +static void pud_populate_tests(struct mm_struct *mm, pud_t *pudp, pmd_t *pmdp)
>> +{
>> +    pud_t pud;
>> +
>> +    /*
>> +     * This entry points to next level page table page.
>> +     * Hence this must not qualify as pud_bad().
>> +     */
>> +    pmd_clear(pmdp);
>> +    pud_clear(pudp);
>> +    pud_populate(mm, pudp, pmdp);
>> +    pud = READ_ONCE(*pudp);
>> +    WARN_ON(pud_bad(pud));
>> +}
>> +#else
>> +static void pud_clear_tests(pud_t *pudp) { }
>> +static void pud_populate_tests(struct mm_struct *mm, pud_t *pudp, pmd_t *pmdp)
>> +{
>> +}
>> +#endif
>> +
>> +#if !defined(__PAGETABLE_PUD_FOLDED) && !defined(__ARCH_HAS_5LEVEL_HACK)
>> +static void p4d_clear_tests(p4d_t *p4dp)
>> +{
>> +    p4d_t p4d = READ_ONCE(*p4dp);
>> +
>> +    p4d = __p4d(p4d_val(p4d) | RANDOM_ORVALUE);
>> +    WRITE_ONCE(*p4dp, p4d);
>> +    p4d_clear(p4dp);
>> +    p4d = READ_ONCE(*p4dp);
>> +    WARN_ON(!p4d_none(p4d));
>> +}
>> +
>> +static void p4d_populate_tests(struct mm_struct *mm, p4d_t *p4dp, pud_t *pudp)
>> +{
>> +    p4d_t p4d;
>> +
>> +    /*
>> +     * This entry points to next level page table page.
>> +     * Hence this must not qualify as p4d_bad().
>> +     */
>> +    pud_clear(pudp);
>> +    p4d_clear(p4dp);
>> +    p4d_populate(mm, p4dp, pudp);
>> +    p4d = READ_ONCE(*p4dp);
>> +    WARN_ON(p4d_bad(p4d));
>> +}
>> +#else
>> +static void p4d_clear_tests(p4d_t *p4dp) { }
>> +static void p4d_populate_tests(struct mm_struct *mm, p4d_t *p4dp, pud_t *pudp)
>> +{
>> +}
>> +#endif
>> +
>> +#ifndef __ARCH_HAS_5LEVEL_HACK
>> +static void pgd_clear_tests(struct mm_struct *mm, pgd_t *pgdp)
>> +{
>> +    pgd_t pgd = READ_ONCE(*pgdp);
>> +
>> +    if (mm_p4d_folded(mm))
>> +        return;
>> +
>> +    pgd = __pgd(pgd_val(pgd) | RANDOM_ORVALUE);
>> +    WRITE_ONCE(*pgdp, pgd);
>> +    pgd_clear(pgdp);
>> +    pgd = READ_ONCE(*pgdp);
>> +    WARN_ON(!pgd_none(pgd));
>> +}
>> +
>> +static void pgd_populate_tests(struct mm_struct *mm, pgd_t *pgdp, p4d_t *p4dp)
>> +{
>> +    pgd_t pgd;
>> +
>> +    if (mm_p4d_folded(mm))
>> +        return;
>> +
>> +    /*
>> +     * This entry points to next level page table page.
>> +     * Hence this must not qualify as pgd_bad().
>> +     */
>> +    p4d_clear(p4dp);
>> +    pgd_clear(pgdp);
>> +    pgd_populate(mm, pgdp, p4dp);
>> +    pgd = READ_ONCE(*pgdp);
>> +    WARN_ON(pgd_bad(pgd));
>> +}
>> +#else
>> +static void pgd_clear_tests(struct mm_struct *mm, pgd_t *pgdp) { }
>> +static void pgd_populate_tests(struct mm_struct *mm, pgd_t *pgdp, p4d_t *p4dp)
>> +{
>> +}
>> +#endif
>> +
>> +static void pte_clear_tests(struct mm_struct *mm, pte_t *ptep)
>> +{
>> +    pte_t pte = READ_ONCE(*ptep);
>> +
>> +    pte = __pte(pte_val(pte) | RANDOM_ORVALUE);
>> +    WRITE_ONCE(*ptep, pte);
>> +    pte_clear(mm, 0, ptep);
>> +    pte = READ_ONCE(*ptep);
>> +    WARN_ON(!pte_none(pte));
>> +}
>> +
>> +static void pmd_clear_tests(pmd_t *pmdp)
>> +{
>> +    pmd_t pmd = READ_ONCE(*pmdp);
>> +
>> +    pmd = __pmd(pmd_val(pmd) | RANDOM_ORVALUE);
>> +    WRITE_ONCE(*pmdp, pmd);
>> +    pmd_clear(pmdp);
>> +    pmd = READ_ONCE(*pmdp);
>> +    WARN_ON(!pmd_none(pmd));
>> +}
>> +
>> +static void pmd_populate_tests(struct mm_struct *mm, pmd_t *pmdp,
>> +                   pgtable_t pgtable)
>> +{
>> +    pmd_t pmd;
>> +
>> +    /*
>> +     * This entry points to next level page table page.
>> +     * Hence this must not qualify as pmd_bad().
>> +     */
>> +    pmd_clear(pmdp);
>> +    pmd_populate(mm, pmdp, pgtable);
>> +    pmd = READ_ONCE(*pmdp);
>> +    WARN_ON(pmd_bad(pmd));
>> +}
>> +
>> +static struct page *alloc_mapped_page(void)
>> +{
>> +    struct page *page;
>> +    gfp_t gfp_mask = GFP_KERNEL | __GFP_ZERO;
>> +
>> +    page = alloc_gigantic_page_order(get_order(PUD_SIZE), gfp_mask,
>> +                first_memory_node, &node_states[N_MEMORY]);
>> +    if (page) {
>> +        pud_aligned = true;
>> +        pmd_aligned = true;
>> +        return page;
>> +    }
>> +
>> +    page = alloc_pages(gfp_mask, get_order(PMD_SIZE));
>> +    if (page) {
>> +        pmd_aligned = true;
>> +        return page;
>> +    }
>> +    return alloc_page(gfp_mask);
>> +}
>> +
>> +static void free_mapped_page(struct page *page)
>> +{
>> +    if (pud_aligned) {
>> +        unsigned long pfn = page_to_pfn(page);
>> +
>> +        free_contig_range(pfn, 1ULL << get_order(PUD_SIZE));
>> +        return;
>> +    }
>> +
>> +    if (pmd_aligned) {
>> +        int order = get_order(PMD_SIZE);
>> +
>> +        free_pages((unsigned long)page_address(page), order);
>> +        return;
>> +    }
>> +    free_page((unsigned long)page_address(page));
>> +}
>> +
>> +static unsigned long get_random_vaddr(void)
>> +{
>> +    unsigned long random_vaddr, random_pages, total_user_pages;
>> +
>> +    total_user_pages = (TASK_SIZE - FIRST_USER_ADDRESS) / PAGE_SIZE;
>> +
>> +    random_pages = get_random_long() % total_user_pages;
>> +    random_vaddr = FIRST_USER_ADDRESS + random_pages * PAGE_SIZE;
>> +
>> +    WARN_ON(random_vaddr > TASK_SIZE);
>> +    WARN_ON(random_vaddr < FIRST_USER_ADDRESS);
>> +    return random_vaddr;
>> +}
>> +
>> +static int __init arch_pgtable_tests_init(void)
>> +{
>> +    struct mm_struct *mm;
>> +    struct page *page;
>> +    pgd_t *pgdp;
>> +    p4d_t *p4dp, *saved_p4dp;
>> +    pud_t *pudp, *saved_pudp;
>> +    pmd_t *pmdp, *saved_pmdp, pmd;
>> +    pte_t *ptep;
>> +    pgtable_t saved_ptep;
>> +    pgprot_t prot;
>> +    unsigned long vaddr;
>> +
>> +    prot = vm_get_page_prot(VMFLAGS);
>> +    vaddr = get_random_vaddr();
>> +    mm = mm_alloc();
>> +    if (!mm) {
>> +        pr_err("mm_struct allocation failed\n");
>> +        return 1;
>> +    }
>> +
>> +    page = alloc_mapped_page();
>> +    if (!page) {
>> +        pr_err("memory allocation failed\n");
>> +        return 1;
>> +    }
>> +
>> +    pgdp = pgd_offset(mm, vaddr);
>> +    p4dp = p4d_alloc(mm, pgdp, vaddr);
>> +    pudp = pud_alloc(mm, p4dp, vaddr);
>> +    pmdp = pmd_alloc(mm, pudp, vaddr);
>> +    ptep = pte_alloc_map(mm, pmdp, vaddr);
>> +
>> +    /*
>> +     * Save all the page table page addresses as the page table
>> +     * entries will be used for testing with random or garbage
>> +     * values. These saved addresses will be used for freeing
>> +     * page table pages.
>> +     */
>> +    pmd = READ_ONCE(*pmdp);
>> +    saved_p4dp = p4d_offset(pgdp, 0UL);
>> +    saved_pudp = pud_offset(p4dp, 0UL);
>> +    saved_pmdp = pmd_offset(pudp, 0UL);
>> +    saved_ptep = pmd_pgtable(pmd);
>> +
>> +    pte_basic_tests(page, prot);
>> +    pmd_basic_tests(page, prot);
>> +    pud_basic_tests(page, prot);
>> +    p4d_basic_tests(page, prot);
>> +    pgd_basic_tests(page, prot);
>> +
>> +    pte_clear_tests(mm, ptep);
>> +    pmd_clear_tests(pmdp);
>> +    pud_clear_tests(pudp);
>> +    p4d_clear_tests(p4dp);
>> +    pgd_clear_tests(mm, pgdp);
>> +
>> +    pmd_populate_tests(mm, pmdp, saved_ptep);
>> +    pud_populate_tests(mm, pudp, saved_pmdp);
>> +    p4d_populate_tests(mm, p4dp, saved_pudp);
>> +    pgd_populate_tests(mm, pgdp, saved_p4dp);
>> +
>> +    p4d_free(mm, saved_p4dp);
>> +    pud_free(mm, saved_pudp);
>> +    pmd_free(mm, saved_pmdp);
>> +    pte_free(mm, saved_ptep);
>> +
>> +    mm_dec_nr_puds(mm);
>> +    mm_dec_nr_pmds(mm);
>> +    mm_dec_nr_ptes(mm);
>> +    __mmdrop(mm);
>> +
>> +    free_mapped_page(page);
>> +    return 0;
>> +}
>> +
>> +static void __exit arch_pgtable_tests_exit(void) { }
>> +
>> +module_init(arch_pgtable_tests_init);
>> +module_exit(arch_pgtable_tests_exit);
>> +
>> +MODULE_LICENSE("GPL v2");
>> +MODULE_AUTHOR("Anshuman Khandual <[email protected]>");
>> +MODULE_DESCRIPTION("Test architecture page table helpers");
>>

2019-09-12 17:55:51

by Christophe Leroy

[permalink] [raw]
Subject: Re: [PATCH V2 0/2] mm/debug: Add tests for architecture exported page table helpers

Hi,

I didn't get patch 1 of this series, and it is not on linuxppc-dev
patchwork either. Can you resend?

Thanks
Christophe

On 12/09/2019 at 08:02, Anshuman Khandual wrote:
> This series adds a test validation for architecture exported page table
> helpers. Patch in the series adds basic transformation tests at various
> levels of the page table. Before that it exports gigantic page allocation
> function from HugeTLB.
>
> This test was originally suggested by Catalin during arm64 THP migration
> RFC discussion earlier. Going forward it can include more specific tests
> with respect to various generic MM functions like THP, HugeTLB etc and
> platform specific tests.
>
> https://lore.kernel.org/linux-mm/[email protected]/
>
> Testing:
>
> Successfully build and boot tested on both arm64 and x86 platforms without
> any test failing. Only build tested on some other platforms.
>
> But I would really appreciate if folks can help validate this test on other
> platforms and report back problems. All suggestions, comments and inputs
> welcome. Thank you.
>
> Changes in V2:
>
> - Fixed small typo error in MODULE_DESCRIPTION()
> - Fixed m68k build problems for lvalue concerns in pmd_xxx_tests()
> - Fixed dynamic page table level folding problems on x86 as per Kirill
> - Fixed second pointers during pxx_populate_tests() per Kirill and Gerald
> - Allocate and free pte table with pte_alloc_one/pte_free per Kirill
> - Modified pxx_clear_tests() to accommodate s390 lower 12 bits situation
> - Changed RANDOM_NZVALUE value from 0xbe to 0xff
> - Changed allocation, usage, free sequence for saved_ptep
> - Renamed VMA_FLAGS as VMFLAGS
> - Implemented a new method for random vaddr generation
> - Implemented some other cleanups
> - Dropped extern reference to mm_alloc()
> - Created and exported new alloc_gigantic_page_order()
> - Dropped the custom allocator and used new alloc_gigantic_page_order()
>
> Changes in V1:
>
> https://lore.kernel.org/linux-mm/[email protected]/
>
> - Added fallback mechanism for PMD aligned memory allocation failure
>
> Changes in RFC V2:
>
> https://lore.kernel.org/linux-mm/[email protected]/T/#u
>
> - Moved test module and it's config from lib/ to mm/
> - Renamed config TEST_ARCH_PGTABLE as DEBUG_ARCH_PGTABLE_TEST
> - Renamed file from test_arch_pgtable.c to arch_pgtable_test.c
> - Added relevant MODULE_DESCRIPTION() and MODULE_AUTHOR() details
> - Dropped loadable module config option
> - Basic tests now use memory blocks with required size and alignment
> - PUD aligned memory block gets allocated with alloc_contig_range()
> - If PUD aligned memory could not be allocated it falls back on PMD aligned
> memory block from page allocator and pud_* tests are skipped
> - Clear and populate tests now operate on real in memory page table entries
> - Dummy mm_struct gets allocated with mm_alloc()
> - Dummy page table entries get allocated with [pud|pmd|pte]_alloc_[map]()
> - Simplified [p4d|pgd]_basic_tests(), now has random values in the entries
>
> Original RFC V1:
>
> https://lore.kernel.org/linux-mm/[email protected]/
>
> Cc: Andrew Morton <[email protected]>
> Cc: Vlastimil Babka <[email protected]>
> Cc: Greg Kroah-Hartman <[email protected]>
> Cc: Thomas Gleixner <[email protected]>
> Cc: Mike Rapoport <[email protected]>
> Cc: Jason Gunthorpe <[email protected]>
> Cc: Dan Williams <[email protected]>
> Cc: Peter Zijlstra <[email protected]>
> Cc: Michal Hocko <[email protected]>
> Cc: Mark Rutland <[email protected]>
> Cc: Mark Brown <[email protected]>
> Cc: Steven Price <[email protected]>
> Cc: Ard Biesheuvel <[email protected]>
> Cc: Masahiro Yamada <[email protected]>
> Cc: Kees Cook <[email protected]>
> Cc: Tetsuo Handa <[email protected]>
> Cc: Matthew Wilcox <[email protected]>
> Cc: Sri Krishna chowdary <[email protected]>
> Cc: Dave Hansen <[email protected]>
> Cc: Russell King - ARM Linux <[email protected]>
> Cc: Michael Ellerman <[email protected]>
> Cc: Paul Mackerras <[email protected]>
> Cc: Martin Schwidefsky <[email protected]>
> Cc: Heiko Carstens <[email protected]>
> Cc: "David S. Miller" <[email protected]>
> Cc: Vineet Gupta <[email protected]>
> Cc: James Hogan <[email protected]>
> Cc: Paul Burton <[email protected]>
> Cc: Ralf Baechle <[email protected]>
> Cc: Kirill A. Shutemov <[email protected]>
> Cc: Gerald Schaefer <[email protected]>
> Cc: Christophe Leroy <[email protected]>
> Cc: Mike Kravetz <[email protected]>
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
>
> Anshuman Khandual (2):
> mm/hugetlb: Make alloc_gigantic_page() available for general use
> mm/pgtable/debug: Add test validating architecture page table helpers
>
> arch/x86/include/asm/pgtable_64_types.h | 2 +
> include/linux/hugetlb.h | 9 +
> mm/Kconfig.debug | 14 +
> mm/Makefile | 1 +
> mm/arch_pgtable_test.c | 429 ++++++++++++++++++++++++
> mm/hugetlb.c | 24 +-
> 6 files changed, 477 insertions(+), 2 deletions(-)
> create mode 100644 mm/arch_pgtable_test.c
>

2019-09-12 19:06:57

by Christophe Leroy

[permalink] [raw]
Subject: Re: [PATCH V2 2/2] mm/pgtable/debug: Add test validating architecture page table helpers



On 12/09/2019 at 08:02, Anshuman Khandual wrote:
> This adds a test module which will validate architecture page table helpers
> and accessors regarding compliance with generic MM semantics expectations.
> This will help various architectures in validating changes to the existing
> page table helpers or addition of new ones.
>
> Test page table and memory pages creating it's entries at various level are
> all allocated from system memory with required alignments. If memory pages
> with required size and alignment could not be allocated, then all depending
> individual tests are skipped.
>

[...]

>
> Suggested-by: Catalin Marinas <[email protected]>
> Signed-off-by: Anshuman Khandual <[email protected]>
> ---
> arch/x86/include/asm/pgtable_64_types.h | 2 +
> mm/Kconfig.debug | 14 +
> mm/Makefile | 1 +
> mm/arch_pgtable_test.c | 429 ++++++++++++++++++++++++
> 4 files changed, 446 insertions(+)
> create mode 100644 mm/arch_pgtable_test.c
>
> diff --git a/arch/x86/include/asm/pgtable_64_types.h b/arch/x86/include/asm/pgtable_64_types.h
> index 52e5f5f2240d..b882792a3999 100644
> --- a/arch/x86/include/asm/pgtable_64_types.h
> +++ b/arch/x86/include/asm/pgtable_64_types.h
> @@ -40,6 +40,8 @@ static inline bool pgtable_l5_enabled(void)
> #define pgtable_l5_enabled() 0
> #endif /* CONFIG_X86_5LEVEL */
>
> +#define mm_p4d_folded(mm) (!pgtable_l5_enabled())
> +

This is specific to x86, should go in a separate patch.

> extern unsigned int pgdir_shift;
> extern unsigned int ptrs_per_p4d;
>
> diff --git a/mm/Kconfig.debug b/mm/Kconfig.debug
> index 327b3ebf23bf..ce9c397f7b07 100644
> --- a/mm/Kconfig.debug
> +++ b/mm/Kconfig.debug
> @@ -117,3 +117,17 @@ config DEBUG_RODATA_TEST
> depends on STRICT_KERNEL_RWX
> ---help---
> This option enables a testcase for the setting rodata read-only.
> +
> +config DEBUG_ARCH_PGTABLE_TEST
> + bool "Test arch page table helpers for semantics compliance"
> + depends on MMU
> + depends on DEBUG_KERNEL
> + help
> + This option provides a kernel module which can be used to test
> + architecture page table helper functions on various platforms,
> + verifying that they comply with expected generic MM semantics. This
> + will help architecture code ensure that any changes or new
> + additions of these helpers still conform to expected generic MM
> + semantics.
> +
> + If unsure, say N.
> diff --git a/mm/Makefile b/mm/Makefile
> index d996846697ef..bb572c5aa8c5 100644
> --- a/mm/Makefile
> +++ b/mm/Makefile
> @@ -86,6 +86,7 @@ obj-$(CONFIG_HWPOISON_INJECT) += hwpoison-inject.o
> obj-$(CONFIG_DEBUG_KMEMLEAK) += kmemleak.o
> obj-$(CONFIG_DEBUG_KMEMLEAK_TEST) += kmemleak-test.o
> obj-$(CONFIG_DEBUG_RODATA_TEST) += rodata_test.o
> +obj-$(CONFIG_DEBUG_ARCH_PGTABLE_TEST) += arch_pgtable_test.o
> obj-$(CONFIG_PAGE_OWNER) += page_owner.o
> obj-$(CONFIG_CLEANCACHE) += cleancache.o
> obj-$(CONFIG_MEMORY_ISOLATION) += page_isolation.o
> diff --git a/mm/arch_pgtable_test.c b/mm/arch_pgtable_test.c
> new file mode 100644
> index 000000000000..8b4a92756ad8
> --- /dev/null
> +++ b/mm/arch_pgtable_test.c
> @@ -0,0 +1,429 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * This kernel module validates architecture page table helpers &
> + * accessors and helps in verifying their continued compliance with
> + * generic MM semantics.
> + *
> + * Copyright (C) 2019 ARM Ltd.
> + *
> + * Author: Anshuman Khandual <[email protected]>
> + */
> +#define pr_fmt(fmt) "arch_pgtable_test: %s " fmt, __func__
> +
> +#include <linux/gfp.h>
> +#include <linux/hugetlb.h>
> +#include <linux/kernel.h>
> +#include <linux/mm.h>
> +#include <linux/mman.h>
> +#include <linux/mm_types.h>
> +#include <linux/module.h>
> +#include <linux/pfn_t.h>
> +#include <linux/printk.h>
> +#include <linux/random.h>
> +#include <linux/spinlock.h>
> +#include <linux/swap.h>
> +#include <linux/swapops.h>
> +#include <linux/sched/mm.h>

Add <linux/highmem.h> (see other mails, build failure on ppc book3s/32)

> +#include <asm/pgalloc.h>
> +#include <asm/pgtable.h>
> +
> +/*
> + * Basic operations
> + *
> + * mkold(entry) = An old and not a young entry
> + * mkyoung(entry) = A young and not an old entry
> + * mkdirty(entry) = A dirty and not a clean entry
> + * mkclean(entry) = A clean and not a dirty entry
> + * mkwrite(entry) = A write and not a write protected entry
> + * wrprotect(entry) = A write protected and not a write entry
> + * pxx_bad(entry) = A mapped and non-table entry
> + * pxx_same(entry1, entry2) = Both entries hold the exact same value
> + */
> +#define VMFLAGS (VM_READ|VM_WRITE|VM_EXEC)
> +
> +/*
> + * On s390 platform, the lower 12 bits are used to identify given page table
> + * entry type and for other arch specific requirements. But these bits might
> + * affect the ability to clear entries with pxx_clear(). So while loading up
> + * the entries skip all lower 12 bits in order to accommodate s390 platform.
> + * This does not affect any other platform.
> + */
> +#define RANDOM_ORVALUE (0xfffffffffffff000UL)
> +#define RANDOM_NZVALUE (0xff)
> +
> +static bool pud_aligned;
> +static bool pmd_aligned;
> +
> +static void pte_basic_tests(struct page *page, pgprot_t prot)
> +{
> + pte_t pte = mk_pte(page, prot);
> +
> + WARN_ON(!pte_same(pte, pte));
> + WARN_ON(!pte_young(pte_mkyoung(pte)));
> + WARN_ON(!pte_dirty(pte_mkdirty(pte)));
> + WARN_ON(!pte_write(pte_mkwrite(pte)));
> + WARN_ON(pte_young(pte_mkold(pte)));
> + WARN_ON(pte_dirty(pte_mkclean(pte)));
> + WARN_ON(pte_write(pte_wrprotect(pte)));
> +}
> +
> +#ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE
> +static void pmd_basic_tests(struct page *page, pgprot_t prot)
> +{
> + pmd_t pmd;
> +
> + /*
> + * Memory block here must be PMD_SIZE aligned. Abort this
> + * test in case we could not allocate such a memory block.
> + */
> + if (!pmd_aligned) {
> + pr_warn("Could not proceed with PMD tests\n");
> + return;
> + }
> +
> + pmd = mk_pmd(page, prot);
> + WARN_ON(!pmd_same(pmd, pmd));
> + WARN_ON(!pmd_young(pmd_mkyoung(pmd)));
> + WARN_ON(!pmd_dirty(pmd_mkdirty(pmd)));
> + WARN_ON(!pmd_write(pmd_mkwrite(pmd)));
> + WARN_ON(pmd_young(pmd_mkold(pmd)));
> + WARN_ON(pmd_dirty(pmd_mkclean(pmd)));
> + WARN_ON(pmd_write(pmd_wrprotect(pmd)));
> + /*
> + * A huge page does not point to next level page table
> + * entry. Hence this must qualify as pmd_bad().
> + */
> + WARN_ON(!pmd_bad(pmd_mkhuge(pmd)));
> +}
> +#else
> +static void pmd_basic_tests(struct page *page, pgprot_t prot) { }
> +#endif
> +
> +#ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD
> +static void pud_basic_tests(struct page *page, pgprot_t prot)
> +{
> + pud_t pud;
> +
> + /*
> + * Memory block here must be PUD_SIZE aligned. Abort this
> + * test in case we could not allocate such a memory block.
> + */
> + if (!pud_aligned) {
> + pr_warn("Could not proceed with PUD tests\n");
> + return;
> + }
> +
> + pud = pfn_pud(page_to_pfn(page), prot);
> + WARN_ON(!pud_same(pud, pud));
> + WARN_ON(!pud_young(pud_mkyoung(pud)));
> + WARN_ON(!pud_write(pud_mkwrite(pud)));
> + WARN_ON(pud_write(pud_wrprotect(pud)));
> + WARN_ON(pud_young(pud_mkold(pud)));
> +
> +#if !defined(__PAGETABLE_PMD_FOLDED) && !defined(__ARCH_HAS_4LEVEL_HACK)
> + /*
> + * A huge page does not point to next level page table
> + * entry. Hence this must qualify as pud_bad().
> + */
> + WARN_ON(!pud_bad(pud_mkhuge(pud)));
> +#endif
> +}
> +#else
> +static void pud_basic_tests(struct page *page, pgprot_t prot) { }
> +#endif
> +
> +static void p4d_basic_tests(struct page *page, pgprot_t prot)
> +{
> + p4d_t p4d;
> +
> + memset(&p4d, RANDOM_NZVALUE, sizeof(p4d_t));
> + WARN_ON(!p4d_same(p4d, p4d));
> +}
> +
> +static void pgd_basic_tests(struct page *page, pgprot_t prot)
> +{
> + pgd_t pgd;
> +
> + memset(&pgd, RANDOM_NZVALUE, sizeof(pgd_t));
> + WARN_ON(!pgd_same(pgd, pgd));
> +}
> +
> +#if !defined(__PAGETABLE_PMD_FOLDED) && !defined(__ARCH_HAS_4LEVEL_HACK)

#ifdefs have to be avoided as much as possible, see below

> +static void pud_clear_tests(pud_t *pudp)
> +{
> + pud_t pud = READ_ONCE(*pudp);
if (mm_pmd_folded() || __is_defined(__ARCH_HAS_4LEVEL_HACK))
return;

> +
> + pud = __pud(pud_val(pud) | RANDOM_ORVALUE);
> + WRITE_ONCE(*pudp, pud);
> + pud_clear(pudp);
> + pud = READ_ONCE(*pudp);
> + WARN_ON(!pud_none(pud));
> +}
> +
> +static void pud_populate_tests(struct mm_struct *mm, pud_t *pudp, pmd_t *pmdp)
> +{
> + pud_t pud;
> +
if (mm_pmd_folded() || __is_defined(__ARCH_HAS_4LEVEL_HACK))
return;
> + /*
> + * This entry points to next level page table page.
> + * Hence this must not qualify as pud_bad().
> + */
> + pmd_clear(pmdp);
> + pud_clear(pudp);
> + pud_populate(mm, pudp, pmdp);
> + pud = READ_ONCE(*pudp);
> + WARN_ON(pud_bad(pud));
> +}
> +#else

Then the else branch goes away.

> +static void pud_clear_tests(pud_t *pudp) { }
> +static void pud_populate_tests(struct mm_struct *mm, pud_t *pudp, pmd_t *pmdp)
> +{
> +}
> +#endif
> +
> +#if !defined(__PAGETABLE_PUD_FOLDED) && !defined(__ARCH_HAS_5LEVEL_HACK)

The same can be done here.

> +static void p4d_clear_tests(p4d_t *p4dp)
> +{
> + p4d_t p4d = READ_ONCE(*p4dp);
> +
> + p4d = __p4d(p4d_val(p4d) | RANDOM_ORVALUE);
> + WRITE_ONCE(*p4dp, p4d);
> + p4d_clear(p4dp);
> + p4d = READ_ONCE(*p4dp);
> + WARN_ON(!p4d_none(p4d));
> +}
> +
> +static void p4d_populate_tests(struct mm_struct *mm, p4d_t *p4dp, pud_t *pudp)
> +{
> + p4d_t p4d;
> +
> + /*
> + * This entry points to next level page table page.
> + * Hence this must not qualify as p4d_bad().
> + */
> + pud_clear(pudp);
> + p4d_clear(p4dp);
> + p4d_populate(mm, p4dp, pudp);
> + p4d = READ_ONCE(*p4dp);
> + WARN_ON(p4d_bad(p4d));
> +}
> +#else
> +static void p4d_clear_tests(p4d_t *p4dp) { }
> +static void p4d_populate_tests(struct mm_struct *mm, p4d_t *p4dp, pud_t *pudp)
> +{
> +}
> +#endif
> +
> +#ifndef __ARCH_HAS_5LEVEL_HACK

And the same here (you already did part of it with testing mm_p4d_folded(mm))

> +static void pgd_clear_tests(struct mm_struct *mm, pgd_t *pgdp)
> +{
> + pgd_t pgd = READ_ONCE(*pgdp);
> +
> + if (mm_p4d_folded(mm))
> + return;
> +
> + pgd = __pgd(pgd_val(pgd) | RANDOM_ORVALUE);
> + WRITE_ONCE(*pgdp, pgd);
> + pgd_clear(pgdp);
> + pgd = READ_ONCE(*pgdp);
> + WARN_ON(!pgd_none(pgd));
> +}
> +
> +static void pgd_populate_tests(struct mm_struct *mm, pgd_t *pgdp, p4d_t *p4dp)
> +{
> + pgd_t pgd;
> +
> + if (mm_p4d_folded(mm))
> + return;
> +
> + /*
> + * This entry points to next level page table page.
> + * Hence this must not qualify as pgd_bad().
> + */
> + p4d_clear(p4dp);
> + pgd_clear(pgdp);
> + pgd_populate(mm, pgdp, p4dp);
> + pgd = READ_ONCE(*pgdp);
> + WARN_ON(pgd_bad(pgd));
> +}
> +#else
> +static void pgd_clear_tests(struct mm_struct *mm, pgd_t *pgdp) { }
> +static void pgd_populate_tests(struct mm_struct *mm, pgd_t *pgdp, p4d_t *p4dp)
> +{
> +}
> +#endif
> +
> +static void pte_clear_tests(struct mm_struct *mm, pte_t *ptep)
> +{
> + pte_t pte = READ_ONCE(*ptep);
> +
> + pte = __pte(pte_val(pte) | RANDOM_ORVALUE);
> + WRITE_ONCE(*ptep, pte);
> + pte_clear(mm, 0, ptep);
> + pte = READ_ONCE(*ptep);
> + WARN_ON(!pte_none(pte));
> +}
> +
> +static void pmd_clear_tests(pmd_t *pmdp)
> +{
> + pmd_t pmd = READ_ONCE(*pmdp);
> +
> + pmd = __pmd(pmd_val(pmd) | RANDOM_ORVALUE);
> + WRITE_ONCE(*pmdp, pmd);
> + pmd_clear(pmdp);
> + pmd = READ_ONCE(*pmdp);
> + WARN_ON(!pmd_none(pmd));
> +}
> +
> +static void pmd_populate_tests(struct mm_struct *mm, pmd_t *pmdp,
> + pgtable_t pgtable)
> +{
> + pmd_t pmd;
> +
> + /*
> + * This entry points to next level page table page.
> + * Hence this must not qualify as pmd_bad().
> + */
> + pmd_clear(pmdp);
> + pmd_populate(mm, pmdp, pgtable);
> + pmd = READ_ONCE(*pmdp);
> + WARN_ON(pmd_bad(pmd));
> +}
> +
> +static struct page *alloc_mapped_page(void)
> +{
> + struct page *page;
> + gfp_t gfp_mask = GFP_KERNEL | __GFP_ZERO;
> +
> + page = alloc_gigantic_page_order(get_order(PUD_SIZE), gfp_mask,
> + first_memory_node, &node_states[N_MEMORY]);
> + if (page) {
> + pud_aligned = true;
> + pmd_aligned = true;
> + return page;
> + }
> +
> + page = alloc_pages(gfp_mask, get_order(PMD_SIZE));
> + if (page) {
> + pmd_aligned = true;
> + return page;
> + }
> + return alloc_page(gfp_mask);
> +}
> +
> +static void free_mapped_page(struct page *page)
> +{
> + if (pud_aligned) {
> + unsigned long pfn = page_to_pfn(page);
> +
> + free_contig_range(pfn, 1ULL << get_order(PUD_SIZE));
> + return;
> + }
> +
> + if (pmd_aligned) {
> + int order = get_order(PMD_SIZE);
> +
> + free_pages((unsigned long)page_address(page), order);
> + return;
> + }
> + free_page((unsigned long)page_address(page));
> +}
> +
> +static unsigned long get_random_vaddr(void)
> +{
> + unsigned long random_vaddr, random_pages, total_user_pages;
> +
> + total_user_pages = (TASK_SIZE - FIRST_USER_ADDRESS) / PAGE_SIZE;
> +
> + random_pages = get_random_long() % total_user_pages;
> + random_vaddr = FIRST_USER_ADDRESS + random_pages * PAGE_SIZE;
> +
> + WARN_ON(random_vaddr > TASK_SIZE);
> + WARN_ON(random_vaddr < FIRST_USER_ADDRESS);
> + return random_vaddr;
> +}
> +
> +static int __init arch_pgtable_tests_init(void)
> +{
> + struct mm_struct *mm;
> + struct page *page;
> + pgd_t *pgdp;
> + p4d_t *p4dp, *saved_p4dp;
> + pud_t *pudp, *saved_pudp;
> + pmd_t *pmdp, *saved_pmdp, pmd;
> + pte_t *ptep;
> + pgtable_t saved_ptep;
> + pgprot_t prot;
> + unsigned long vaddr;
> +
> + prot = vm_get_page_prot(VMFLAGS);
> + vaddr = get_random_vaddr();
> + mm = mm_alloc();
> + if (!mm) {
> + pr_err("mm_struct allocation failed\n");
> + return 1;
> + }
> +
> + page = alloc_mapped_page();
> + if (!page) {
> + pr_err("memory allocation failed\n");
> + return 1;
> + }
> +
> + pgdp = pgd_offset(mm, vaddr);
> + p4dp = p4d_alloc(mm, pgdp, vaddr);
> + pudp = pud_alloc(mm, p4dp, vaddr);
> + pmdp = pmd_alloc(mm, pudp, vaddr);
> + ptep = pte_alloc_map(mm, pmdp, vaddr);
> +
> + /*
> + * Save all the page table page addresses as the page table
> + * entries will be used for testing with random or garbage
> + * values. These saved addresses will be used for freeing
> + * page table pages.
> + */
> + pmd = READ_ONCE(*pmdp);
> + saved_p4dp = p4d_offset(pgdp, 0UL);
> + saved_pudp = pud_offset(p4dp, 0UL);
> + saved_pmdp = pmd_offset(pudp, 0UL);
> + saved_ptep = pmd_pgtable(pmd);
> +
> + pte_basic_tests(page, prot);
> + pmd_basic_tests(page, prot);
> + pud_basic_tests(page, prot);
> + p4d_basic_tests(page, prot);
> + pgd_basic_tests(page, prot);
> +
> + pte_clear_tests(mm, ptep);
> + pmd_clear_tests(pmdp);
> + pud_clear_tests(pudp);
> + p4d_clear_tests(p4dp);
> + pgd_clear_tests(mm, pgdp);
> +
> + pmd_populate_tests(mm, pmdp, saved_ptep);
> + pud_populate_tests(mm, pudp, saved_pmdp);
> + p4d_populate_tests(mm, p4dp, saved_pudp);
> + pgd_populate_tests(mm, pgdp, saved_p4dp);
> +
> + p4d_free(mm, saved_p4dp);
> + pud_free(mm, saved_pudp);
> + pmd_free(mm, saved_pmdp);
> + pte_free(mm, saved_ptep);
> +
> + mm_dec_nr_puds(mm);
> + mm_dec_nr_pmds(mm);
> + mm_dec_nr_ptes(mm);
> + __mmdrop(mm);
> +
> + free_mapped_page(page);
> + return 0;

Is there any benefit in keeping the module loaded once the tests are
done? Shouldn't the load fail instead?

> +}
> +
> +static void __exit arch_pgtable_tests_exit(void) { }

Is this function really needed?

> +
> +module_init(arch_pgtable_tests_init);
> +module_exit(arch_pgtable_tests_exit);
> +
> +MODULE_LICENSE("GPL v2");
> +MODULE_AUTHOR("Anshuman Khandual <[email protected]>");
> +MODULE_DESCRIPTION("Test architecture page table helpers");
>

Christophe

2019-09-12 20:38:22

by Christophe Leroy

[permalink] [raw]
Subject: Re: [PATCH V2 2/2] mm/pgtable/debug: Add test validating architecture page table helpers



On 09/12/2019 06:02 AM, Anshuman Khandual wrote:
> This adds a test module which will validate architecture page table helpers
> and accessors regarding compliance with generic MM semantics expectations.
> This will help various architectures in validating changes to the existing
> page table helpers or addition of new ones.
>
> The test page table and the memory pages used to create its entries at
> various levels are all allocated from system memory with the required
> alignments. If memory pages with the required size and alignment cannot
> be allocated, then all depending individual tests are skipped.

Build failure on powerpc book3s/32. This is because asm/highmem.h is
missing. It can't be included from asm/book3s/32/pgtable.h because it
creates circular dependency. So it has to be included from
mm/arch_pgtable_test.c



CC mm/arch_pgtable_test.o
In file included from ./arch/powerpc/include/asm/book3s/pgtable.h:8:0,
from ./arch/powerpc/include/asm/pgtable.h:18,
from ./include/linux/mm.h:99,
from ./arch/powerpc/include/asm/io.h:29,
from ./include/linux/io.h:13,
from ./include/linux/irq.h:20,
from ./arch/powerpc/include/asm/hardirq.h:6,
from ./include/linux/hardirq.h:9,
from ./include/linux/interrupt.h:11,
from ./include/linux/kernel_stat.h:9,
from ./include/linux/cgroup.h:26,
from ./include/linux/hugetlb.h:9,
from mm/arch_pgtable_test.c:14:
mm/arch_pgtable_test.c: In function 'arch_pgtable_tests_init':
./arch/powerpc/include/asm/book3s/32/pgtable.h:365:13: error: implicit
declaration of function 'kmap_atomic'
[-Werror=implicit-function-declaration]
((pte_t *)(kmap_atomic(pmd_page(*(dir))) + \
^
./include/linux/mm.h:2008:31: note: in expansion of macro 'pte_offset_map'
(pte_alloc(mm, pmd) ? NULL : pte_offset_map(pmd, address))
^
mm/arch_pgtable_test.c:377:9: note: in expansion of macro 'pte_alloc_map'
ptep = pte_alloc_map(mm, pmdp, vaddr);
^
cc1: some warnings being treated as errors
make[2]: *** [mm/arch_pgtable_test.o] Error 1


Christophe


>
> Cc: Andrew Morton <[email protected]>
> Cc: Vlastimil Babka <[email protected]>
> Cc: Greg Kroah-Hartman <[email protected]>
> Cc: Thomas Gleixner <[email protected]>
> Cc: Mike Rapoport <[email protected]>
> Cc: Jason Gunthorpe <[email protected]>
> Cc: Dan Williams <[email protected]>
> Cc: Peter Zijlstra <[email protected]>
> Cc: Michal Hocko <[email protected]>
> Cc: Mark Rutland <[email protected]>
> Cc: Mark Brown <[email protected]>
> Cc: Steven Price <[email protected]>
> Cc: Ard Biesheuvel <[email protected]>
> Cc: Masahiro Yamada <[email protected]>
> Cc: Kees Cook <[email protected]>
> Cc: Tetsuo Handa <[email protected]>
> Cc: Matthew Wilcox <[email protected]>
> Cc: Sri Krishna chowdary <[email protected]>
> Cc: Dave Hansen <[email protected]>
> Cc: Russell King - ARM Linux <[email protected]>
> Cc: Michael Ellerman <[email protected]>
> Cc: Paul Mackerras <[email protected]>
> Cc: Martin Schwidefsky <[email protected]>
> Cc: Heiko Carstens <[email protected]>
> Cc: "David S. Miller" <[email protected]>
> Cc: Vineet Gupta <[email protected]>
> Cc: James Hogan <[email protected]>
> Cc: Paul Burton <[email protected]>
> Cc: Ralf Baechle <[email protected]>
> Cc: Kirill A. Shutemov <[email protected]>
> Cc: Gerald Schaefer <[email protected]>
> Cc: Christophe Leroy <[email protected]>
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
>
> Suggested-by: Catalin Marinas <[email protected]>
> Signed-off-by: Anshuman Khandual <[email protected]>
> ---
> arch/x86/include/asm/pgtable_64_types.h | 2 +
> mm/Kconfig.debug | 14 +
> mm/Makefile | 1 +
> mm/arch_pgtable_test.c | 429 ++++++++++++++++++++++++
> 4 files changed, 446 insertions(+)
> create mode 100644 mm/arch_pgtable_test.c
>
> diff --git a/arch/x86/include/asm/pgtable_64_types.h b/arch/x86/include/asm/pgtable_64_types.h
> index 52e5f5f2240d..b882792a3999 100644
> --- a/arch/x86/include/asm/pgtable_64_types.h
> +++ b/arch/x86/include/asm/pgtable_64_types.h
> @@ -40,6 +40,8 @@ static inline bool pgtable_l5_enabled(void)
> #define pgtable_l5_enabled() 0
> #endif /* CONFIG_X86_5LEVEL */
>
> +#define mm_p4d_folded(mm) (!pgtable_l5_enabled())
> +
> extern unsigned int pgdir_shift;
> extern unsigned int ptrs_per_p4d;
>
> diff --git a/mm/Kconfig.debug b/mm/Kconfig.debug
> index 327b3ebf23bf..ce9c397f7b07 100644
> --- a/mm/Kconfig.debug
> +++ b/mm/Kconfig.debug
> @@ -117,3 +117,17 @@ config DEBUG_RODATA_TEST
> depends on STRICT_KERNEL_RWX
> ---help---
> This option enables a testcase for the setting rodata read-only.
> +
> +config DEBUG_ARCH_PGTABLE_TEST
> + bool "Test arch page table helpers for semantics compliance"
> + depends on MMU
> + depends on DEBUG_KERNEL
> + help
> + This option provides a kernel module which can be used to test
> + architecture page table helper functions on various platforms,
> + verifying that they comply with the expected generic MM semantics.
> + This will help architecture code make sure that any changes to
> + these helpers, or new additions, still conform to the expected
> + generic MM semantics.
> +
> + If unsure, say N.
> diff --git a/mm/Makefile b/mm/Makefile
> index d996846697ef..bb572c5aa8c5 100644
> --- a/mm/Makefile
> +++ b/mm/Makefile
> @@ -86,6 +86,7 @@ obj-$(CONFIG_HWPOISON_INJECT) += hwpoison-inject.o
> obj-$(CONFIG_DEBUG_KMEMLEAK) += kmemleak.o
> obj-$(CONFIG_DEBUG_KMEMLEAK_TEST) += kmemleak-test.o
> obj-$(CONFIG_DEBUG_RODATA_TEST) += rodata_test.o
> +obj-$(CONFIG_DEBUG_ARCH_PGTABLE_TEST) += arch_pgtable_test.o
> obj-$(CONFIG_PAGE_OWNER) += page_owner.o
> obj-$(CONFIG_CLEANCACHE) += cleancache.o
> obj-$(CONFIG_MEMORY_ISOLATION) += page_isolation.o
> diff --git a/mm/arch_pgtable_test.c b/mm/arch_pgtable_test.c
> new file mode 100644
> index 000000000000..8b4a92756ad8
> --- /dev/null
> +++ b/mm/arch_pgtable_test.c
> @@ -0,0 +1,429 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * This kernel module validates architecture page table helpers &
> + * accessors and helps in verifying their continued compliance with
> + * generic MM semantics.
> + *
> + * Copyright (C) 2019 ARM Ltd.
> + *
> + * Author: Anshuman Khandual <[email protected]>
> + */
> +#define pr_fmt(fmt) "arch_pgtable_test: %s " fmt, __func__
> +
> +#include <linux/gfp.h>
> +#include <linux/hugetlb.h>
> +#include <linux/kernel.h>
> +#include <linux/mm.h>
> +#include <linux/mman.h>
> +#include <linux/mm_types.h>
> +#include <linux/module.h>
> +#include <linux/pfn_t.h>
> +#include <linux/printk.h>
> +#include <linux/random.h>
> +#include <linux/spinlock.h>
> +#include <linux/swap.h>
> +#include <linux/swapops.h>
> +#include <linux/sched/mm.h>
> +#include <asm/pgalloc.h>
> +#include <asm/pgtable.h>
> +
> +/*
> + * Basic operations
> + *
> + * mkold(entry) = An old and not a young entry
> + * mkyoung(entry) = A young and not an old entry
> + * mkdirty(entry) = A dirty and not a clean entry
> + * mkclean(entry) = A clean and not a dirty entry
> + * mkwrite(entry) = A write and not a write protected entry
> + * wrprotect(entry) = A write protected and not a write entry
> + * pxx_bad(entry) = A mapped and non-table entry
> + * pxx_same(entry1, entry2) = Both entries hold the exact same value
> + */
> +#define VMFLAGS (VM_READ|VM_WRITE|VM_EXEC)
> +
> +/*
> + * On the s390 platform, the lower 12 bits are used to identify the page
> + * table entry type and for other arch specific requirements, and they can
> + * affect the ability to clear entries with pxx_clear(). So while loading
> + * up the entries, skip the lower 12 bits to accommodate the s390 platform.
> + * This does not affect any other platform.
> + */
> +#define RANDOM_ORVALUE (0xfffffffffffff000UL)
> +#define RANDOM_NZVALUE (0xff)
> +
> +static bool pud_aligned;
> +static bool pmd_aligned;
> +
> +static void pte_basic_tests(struct page *page, pgprot_t prot)
> +{
> + pte_t pte = mk_pte(page, prot);
> +
> + WARN_ON(!pte_same(pte, pte));
> + WARN_ON(!pte_young(pte_mkyoung(pte)));
> + WARN_ON(!pte_dirty(pte_mkdirty(pte)));
> + WARN_ON(!pte_write(pte_mkwrite(pte)));
> + WARN_ON(pte_young(pte_mkold(pte)));
> + WARN_ON(pte_dirty(pte_mkclean(pte)));
> + WARN_ON(pte_write(pte_wrprotect(pte)));
> +}
> +
> +#ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE
> +static void pmd_basic_tests(struct page *page, pgprot_t prot)
> +{
> + pmd_t pmd;
> +
> + /*
> + * Memory block here must be PMD_SIZE aligned. Abort this
> + * test in case we could not allocate such a memory block.
> + */
> + if (!pmd_aligned) {
> + pr_warn("Could not proceed with PMD tests\n");
> + return;
> + }
> +
> + pmd = mk_pmd(page, prot);
> + WARN_ON(!pmd_same(pmd, pmd));
> + WARN_ON(!pmd_young(pmd_mkyoung(pmd)));
> + WARN_ON(!pmd_dirty(pmd_mkdirty(pmd)));
> + WARN_ON(!pmd_write(pmd_mkwrite(pmd)));
> + WARN_ON(pmd_young(pmd_mkold(pmd)));
> + WARN_ON(pmd_dirty(pmd_mkclean(pmd)));
> + WARN_ON(pmd_write(pmd_wrprotect(pmd)));
> + /*
> + * A huge page does not point to next level page table
> + * entry. Hence this must qualify as pmd_bad().
> + */
> + WARN_ON(!pmd_bad(pmd_mkhuge(pmd)));
> +}
> +#else
> +static void pmd_basic_tests(struct page *page, pgprot_t prot) { }
> +#endif
> +
> +#ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD
> +static void pud_basic_tests(struct page *page, pgprot_t prot)
> +{
> + pud_t pud;
> +
> + /*
> + * Memory block here must be PUD_SIZE aligned. Abort this
> + * test in case we could not allocate such a memory block.
> + */
> + if (!pud_aligned) {
> + pr_warn("Could not proceed with PUD tests\n");
> + return;
> + }
> +
> + pud = pfn_pud(page_to_pfn(page), prot);
> + WARN_ON(!pud_same(pud, pud));
> + WARN_ON(!pud_young(pud_mkyoung(pud)));
> + WARN_ON(!pud_write(pud_mkwrite(pud)));
> + WARN_ON(pud_write(pud_wrprotect(pud)));
> + WARN_ON(pud_young(pud_mkold(pud)));
> +
> +#if !defined(__PAGETABLE_PMD_FOLDED) && !defined(__ARCH_HAS_4LEVEL_HACK)
> + /*
> + * A huge page does not point to next level page table
> + * entry. Hence this must qualify as pud_bad().
> + */
> + WARN_ON(!pud_bad(pud_mkhuge(pud)));
> +#endif
> +}
> +#else
> +static void pud_basic_tests(struct page *page, pgprot_t prot) { }
> +#endif
> +
> +static void p4d_basic_tests(struct page *page, pgprot_t prot)
> +{
> + p4d_t p4d;
> +
> + memset(&p4d, RANDOM_NZVALUE, sizeof(p4d_t));
> + WARN_ON(!p4d_same(p4d, p4d));
> +}
> +
> +static void pgd_basic_tests(struct page *page, pgprot_t prot)
> +{
> + pgd_t pgd;
> +
> + memset(&pgd, RANDOM_NZVALUE, sizeof(pgd_t));
> + WARN_ON(!pgd_same(pgd, pgd));
> +}
> +
> +#if !defined(__PAGETABLE_PMD_FOLDED) && !defined(__ARCH_HAS_4LEVEL_HACK)
> +static void pud_clear_tests(pud_t *pudp)
> +{
> + pud_t pud = READ_ONCE(*pudp);
> +
> + pud = __pud(pud_val(pud) | RANDOM_ORVALUE);
> + WRITE_ONCE(*pudp, pud);
> + pud_clear(pudp);
> + pud = READ_ONCE(*pudp);
> + WARN_ON(!pud_none(pud));
> +}
> +
> +static void pud_populate_tests(struct mm_struct *mm, pud_t *pudp, pmd_t *pmdp)
> +{
> + pud_t pud;
> +
> + /*
> + * This entry points to next level page table page.
> + * Hence this must not qualify as pud_bad().
> + */
> + pmd_clear(pmdp);
> + pud_clear(pudp);
> + pud_populate(mm, pudp, pmdp);
> + pud = READ_ONCE(*pudp);
> + WARN_ON(pud_bad(pud));
> +}
> +#else
> +static void pud_clear_tests(pud_t *pudp) { }
> +static void pud_populate_tests(struct mm_struct *mm, pud_t *pudp, pmd_t *pmdp)
> +{
> +}
> +#endif
> +
> +#if !defined(__PAGETABLE_PUD_FOLDED) && !defined(__ARCH_HAS_5LEVEL_HACK)
> +static void p4d_clear_tests(p4d_t *p4dp)
> +{
> + p4d_t p4d = READ_ONCE(*p4dp);
> +
> + p4d = __p4d(p4d_val(p4d) | RANDOM_ORVALUE);
> + WRITE_ONCE(*p4dp, p4d);
> + p4d_clear(p4dp);
> + p4d = READ_ONCE(*p4dp);
> + WARN_ON(!p4d_none(p4d));
> +}
> +
> +static void p4d_populate_tests(struct mm_struct *mm, p4d_t *p4dp, pud_t *pudp)
> +{
> + p4d_t p4d;
> +
> + /*
> + * This entry points to next level page table page.
> + * Hence this must not qualify as p4d_bad().
> + */
> + pud_clear(pudp);
> + p4d_clear(p4dp);
> + p4d_populate(mm, p4dp, pudp);
> + p4d = READ_ONCE(*p4dp);
> + WARN_ON(p4d_bad(p4d));
> +}
> +#else
> +static void p4d_clear_tests(p4d_t *p4dp) { }
> +static void p4d_populate_tests(struct mm_struct *mm, p4d_t *p4dp, pud_t *pudp)
> +{
> +}
> +#endif
> +
> +#ifndef __ARCH_HAS_5LEVEL_HACK
> +static void pgd_clear_tests(struct mm_struct *mm, pgd_t *pgdp)
> +{
> + pgd_t pgd = READ_ONCE(*pgdp);
> +
> + if (mm_p4d_folded(mm))
> + return;
> +
> + pgd = __pgd(pgd_val(pgd) | RANDOM_ORVALUE);
> + WRITE_ONCE(*pgdp, pgd);
> + pgd_clear(pgdp);
> + pgd = READ_ONCE(*pgdp);
> + WARN_ON(!pgd_none(pgd));
> +}
> +
> +static void pgd_populate_tests(struct mm_struct *mm, pgd_t *pgdp, p4d_t *p4dp)
> +{
> + pgd_t pgd;
> +
> + if (mm_p4d_folded(mm))
> + return;
> +
> + /*
> + * This entry points to next level page table page.
> + * Hence this must not qualify as pgd_bad().
> + */
> + p4d_clear(p4dp);
> + pgd_clear(pgdp);
> + pgd_populate(mm, pgdp, p4dp);
> + pgd = READ_ONCE(*pgdp);
> + WARN_ON(pgd_bad(pgd));
> +}
> +#else
> +static void pgd_clear_tests(struct mm_struct *mm, pgd_t *pgdp) { }
> +static void pgd_populate_tests(struct mm_struct *mm, pgd_t *pgdp, p4d_t *p4dp)
> +{
> +}
> +#endif
> +
> +static void pte_clear_tests(struct mm_struct *mm, pte_t *ptep)
> +{
> + pte_t pte = READ_ONCE(*ptep);
> +
> + pte = __pte(pte_val(pte) | RANDOM_ORVALUE);
> + WRITE_ONCE(*ptep, pte);
> + pte_clear(mm, 0, ptep);
> + pte = READ_ONCE(*ptep);
> + WARN_ON(!pte_none(pte));
> +}
> +
> +static void pmd_clear_tests(pmd_t *pmdp)
> +{
> + pmd_t pmd = READ_ONCE(*pmdp);
> +
> + pmd = __pmd(pmd_val(pmd) | RANDOM_ORVALUE);
> + WRITE_ONCE(*pmdp, pmd);
> + pmd_clear(pmdp);
> + pmd = READ_ONCE(*pmdp);
> + WARN_ON(!pmd_none(pmd));
> +}
> +
> +static void pmd_populate_tests(struct mm_struct *mm, pmd_t *pmdp,
> + pgtable_t pgtable)
> +{
> + pmd_t pmd;
> +
> + /*
> + * This entry points to next level page table page.
> + * Hence this must not qualify as pmd_bad().
> + */
> + pmd_clear(pmdp);
> + pmd_populate(mm, pmdp, pgtable);
> + pmd = READ_ONCE(*pmdp);
> + WARN_ON(pmd_bad(pmd));
> +}
> +
> +static struct page *alloc_mapped_page(void)
> +{
> + struct page *page;
> + gfp_t gfp_mask = GFP_KERNEL | __GFP_ZERO;
> +
> + page = alloc_gigantic_page_order(get_order(PUD_SIZE), gfp_mask,
> + first_memory_node, &node_states[N_MEMORY]);
> + if (page) {
> + pud_aligned = true;
> + pmd_aligned = true;
> + return page;
> + }
> +
> + page = alloc_pages(gfp_mask, get_order(PMD_SIZE));
> + if (page) {
> + pmd_aligned = true;
> + return page;
> + }
> + return alloc_page(gfp_mask);
> +}
> +
> +static void free_mapped_page(struct page *page)
> +{
> + if (pud_aligned) {
> + unsigned long pfn = page_to_pfn(page);
> +
> + free_contig_range(pfn, 1ULL << get_order(PUD_SIZE));
> + return;
> + }
> +
> + if (pmd_aligned) {
> + int order = get_order(PMD_SIZE);
> +
> + free_pages((unsigned long)page_address(page), order);
> + return;
> + }
> + free_page((unsigned long)page_address(page));
> +}
> +
> +static unsigned long get_random_vaddr(void)
> +{
> + unsigned long random_vaddr, random_pages, total_user_pages;
> +
> + total_user_pages = (TASK_SIZE - FIRST_USER_ADDRESS) / PAGE_SIZE;
> +
> + random_pages = get_random_long() % total_user_pages;
> + random_vaddr = FIRST_USER_ADDRESS + random_pages * PAGE_SIZE;
> +
> + WARN_ON(random_vaddr > TASK_SIZE);
> + WARN_ON(random_vaddr < FIRST_USER_ADDRESS);
> + return random_vaddr;
> +}
> +
> +static int __init arch_pgtable_tests_init(void)
> +{
> + struct mm_struct *mm;
> + struct page *page;
> + pgd_t *pgdp;
> + p4d_t *p4dp, *saved_p4dp;
> + pud_t *pudp, *saved_pudp;
> + pmd_t *pmdp, *saved_pmdp, pmd;
> + pte_t *ptep;
> + pgtable_t saved_ptep;
> + pgprot_t prot;
> + unsigned long vaddr;
> +
> + prot = vm_get_page_prot(VMFLAGS);
> + vaddr = get_random_vaddr();
> + mm = mm_alloc();
> + if (!mm) {
> + pr_err("mm_struct allocation failed\n");
> + return 1;
> + }
> +
> + page = alloc_mapped_page();
> + if (!page) {
> + pr_err("memory allocation failed\n");
> + return 1;
> + }
> +
> + pgdp = pgd_offset(mm, vaddr);
> + p4dp = p4d_alloc(mm, pgdp, vaddr);
> + pudp = pud_alloc(mm, p4dp, vaddr);
> + pmdp = pmd_alloc(mm, pudp, vaddr);
> + ptep = pte_alloc_map(mm, pmdp, vaddr);
> +
> + /*
> + * Save all the page table page addresses as the page table
> + * entries will be used for testing with random or garbage
> + * values. These saved addresses will be used for freeing
> + * page table pages.
> + */
> + pmd = READ_ONCE(*pmdp);
> + saved_p4dp = p4d_offset(pgdp, 0UL);
> + saved_pudp = pud_offset(p4dp, 0UL);
> + saved_pmdp = pmd_offset(pudp, 0UL);
> + saved_ptep = pmd_pgtable(pmd);
> +
> + pte_basic_tests(page, prot);
> + pmd_basic_tests(page, prot);
> + pud_basic_tests(page, prot);
> + p4d_basic_tests(page, prot);
> + pgd_basic_tests(page, prot);
> +
> + pte_clear_tests(mm, ptep);
> + pmd_clear_tests(pmdp);
> + pud_clear_tests(pudp);
> + p4d_clear_tests(p4dp);
> + pgd_clear_tests(mm, pgdp);
> +
> + pmd_populate_tests(mm, pmdp, saved_ptep);
> + pud_populate_tests(mm, pudp, saved_pmdp);
> + p4d_populate_tests(mm, p4dp, saved_pudp);
> + pgd_populate_tests(mm, pgdp, saved_p4dp);
> +
> + p4d_free(mm, saved_p4dp);
> + pud_free(mm, saved_pudp);
> + pmd_free(mm, saved_pmdp);
> + pte_free(mm, saved_ptep);
> +
> + mm_dec_nr_puds(mm);
> + mm_dec_nr_pmds(mm);
> + mm_dec_nr_ptes(mm);
> + __mmdrop(mm);
> +
> + free_mapped_page(page);
> + return 0;
> +}
> +
> +static void __exit arch_pgtable_tests_exit(void) { }
> +
> +module_init(arch_pgtable_tests_init);
> +module_exit(arch_pgtable_tests_exit);
> +
> +MODULE_LICENSE("GPL v2");
> +MODULE_AUTHOR("Anshuman Khandual <[email protected]>");
> +MODULE_DESCRIPTION("Test architecture page table helpers");
>

2019-09-12 21:33:44

by Christophe Leroy

[permalink] [raw]
Subject: Re: [PATCH V2 2/2] mm/pgtable/debug: Add test validating architecture page table helpers



Le 12/09/2019 à 17:36, Christophe Leroy a écrit :
>
>
> Le 12/09/2019 à 17:00, Christophe Leroy a écrit :
>>
>>
>> On 09/12/2019 06:02 AM, Anshuman Khandual wrote:
>>> This adds a test module which will validate architecture page table
>>> helpers
>>> and accessors regarding compliance with generic MM semantics
>>> expectations.
>>> This will help various architectures in validating changes to the
>>> existing
>>> page table helpers or addition of new ones.
>>>
>>> Test page table and memory pages creating it's entries at various
>>> level are
>>> all allocated from system memory with required alignments. If memory
>>> pages
>>> with required size and alignment could not be allocated, then all
>>> depending
>>> individual tests are skipped.
>>
>> Build failure on powerpc book3s/32. This is because asm/highmem.h is
>> missing. It can't be included from asm/book3s/32/pgtable.h because it
>> creates circular dependency. So it has to be included from
>> mm/arch_pgtable_test.c
>
> In fact it is <linux/highmem.h> that needs to be added; including
> <asm/highmem.h> directly provokes a build failure at link time.
>

I get the following failure,

[ 0.704685] ------------[ cut here ]------------
[ 0.709239] initcall arch_pgtable_tests_init+0x0/0x228 returned with
preemption imbalance
[ 0.717539] WARNING: CPU: 0 PID: 1 at init/main.c:952
do_one_initcall+0x18c/0x1d4
[ 0.724922] CPU: 0 PID: 1 Comm: swapper Not tainted
5.3.0-rc7-s3k-dev-00880-g28fd02a838e5-dirty #2307
[ 0.734070] NIP: c070e674 LR: c070e674 CTR: c001292c
[ 0.739084] REGS: df4a5dd0 TRAP: 0700 Not tainted
(5.3.0-rc7-s3k-dev-00880-g28fd02a838e5-dirty)
[ 0.747975] MSR: 00029032 <EE,ME,IR,DR,RI> CR: 28000222 XER: 00000000
[ 0.754628]
[ 0.754628] GPR00: c070e674 df4a5e88 df4a0000 0000004e 0000000a
00000000 000000ca 38207265
[ 0.754628] GPR08: 00001032 00000800 00000000 00000000 22000422
00000000 c0004a7c 00000000
[ 0.754628] GPR16: 00000000 00000000 00000000 00000000 00000000
c0810000 c0800000 c0816f30
[ 0.754628] GPR24: c070dc20 c074702c 00000006 0000009c 00000000
c0724494 c074e140 00000000
[ 0.789339] NIP [c070e674] do_one_initcall+0x18c/0x1d4
[ 0.794435] LR [c070e674] do_one_initcall+0x18c/0x1d4
[ 0.799437] Call Trace:
[ 0.801867] [df4a5e88] [c070e674] do_one_initcall+0x18c/0x1d4
(unreliable)
[ 0.808694] [df4a5ee8] [c070e8c0] kernel_init_freeable+0x204/0x2dc
[ 0.814830] [df4a5f28] [c0004a94] kernel_init+0x18/0x110
[ 0.820107] [df4a5f38] [c00122ac] ret_from_kernel_thread+0x14/0x1c
[ 0.826220] Instruction dump:
[ 0.829161] 4beb1069 7d2000a6 61298000 7d200124 89210008 2f890000
41be0048 3c60c06a
[ 0.836849] 38a10008 7fa4eb78 3863cacc 4b915115 <0fe00000> 4800002c
81220070 712a0004
[ 0.844723] ---[ end trace 969d686308d40b33 ]---

Then starting init fails:

[ 3.894074] Run /init as init process
[ 3.898403] Failed to execute /init (error -14)
[ 3.903009] Run /sbin/init as init process
[ 3.907172] Run /etc/init as init process
[ 3.911251] Run /bin/init as init process
[ 3.915513] Run /bin/sh as init process
[ 3.919471] Starting init: /bin/sh exists but couldn't execute it
(error -14)
[ 3.926732] Kernel panic - not syncing: No working init found. Try
passing init= option to kernel. See Linux
Documentation/admin-guide/init.rst for guidance.
[ 3.940864] CPU: 0 PID: 1 Comm: init Tainted: G W
5.3.0-rc7-s3k-dev-00880-g28fd02a838e5-dirty #2307
[ 3.951165] Call Trace:
[ 3.953617] [df4a5ec8] [c002392c] panic+0x12c/0x320 (unreliable)
[ 3.959621] [df4a5f28] [c0004b8c] rootfs_mount+0x0/0x2c
[ 3.964849] [df4a5f38] [c00122ac] ret_from_kernel_thread+0x14/0x1c


Christophe

2019-09-13 06:26:52

by Christophe Leroy

[permalink] [raw]
Subject: [PATCH] mm/pgtable/debug: Fix test validating architecture page table helpers

Fix build failure on powerpc.

Fix preemption imbalance.

Signed-off-by: Christophe Leroy <[email protected]>
---
mm/arch_pgtable_test.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/mm/arch_pgtable_test.c b/mm/arch_pgtable_test.c
index 8b4a92756ad8..f2b3c9ec35fa 100644
--- a/mm/arch_pgtable_test.c
+++ b/mm/arch_pgtable_test.c
@@ -24,6 +24,7 @@
#include <linux/swap.h>
#include <linux/swapops.h>
#include <linux/sched/mm.h>
+#include <linux/highmem.h>
#include <asm/pgalloc.h>
#include <asm/pgtable.h>

@@ -400,6 +401,8 @@ static int __init arch_pgtable_tests_init(void)
p4d_clear_tests(p4dp);
pgd_clear_tests(mm, pgdp);

+ pte_unmap(ptep);
+
pmd_populate_tests(mm, pmdp, saved_ptep);
pud_populate_tests(mm, pudp, saved_pmdp);
p4d_populate_tests(mm, p4dp, saved_pudp);
--
2.13.3

2019-09-13 06:27:48

by Anshuman Khandual

[permalink] [raw]
Subject: Re: [PATCH V2 0/2] mm/debug: Add tests for architecture exported page table helpers



On 09/12/2019 08:12 PM, Christophe Leroy wrote:
> Hi,
>
> I didn't get patch 1 of this series, and it is not on linuxppc-dev patchwork either. Can you resend ?

It's there on the linux-mm patchwork and copied to [email protected]
as well. The CC list for the first patch was different from the second one's.

https://patchwork.kernel.org/patch/11142317/

Let me know if you can not find it either on MM or LKML list.

- Anshuman

2019-09-13 06:34:42

by Christophe Leroy

[permalink] [raw]
Subject: Re: [PATCH V2 2/2] mm/pgtable/debug: Add test validating architecture page table helpers



Le 12/09/2019 à 17:52, Christophe Leroy a écrit :
>
>
> Le 12/09/2019 à 17:36, Christophe Leroy a écrit :
>>
>>
>> Le 12/09/2019 à 17:00, Christophe Leroy a écrit :
>>>
>>>
>>> On 09/12/2019 06:02 AM, Anshuman Khandual wrote:
>>>> This adds a test module which will validate architecture page table
>>>> helpers
>>>> and accessors regarding compliance with generic MM semantics
>>>> expectations.
>>>> This will help various architectures in validating changes to the
>>>> existing
>>>> page table helpers or addition of new ones.
>>>>
>>>> Test page table and memory pages creating it's entries at various
>>>> level are
>>>> all allocated from system memory with required alignments. If memory
>>>> pages
>>>> with required size and alignment could not be allocated, then all
>>>> depending
>>>> individual tests are skipped.
>>>
>>> Build failure on powerpc book3s/32. This is because asm/highmem.h is
>>> missing. It can't be included from asm/book3s/32/pgtable.h because it
>>> creates circular dependency. So it has to be included from
>>> mm/arch_pgtable_test.c
>>
>> In fact it is <linux/highmem.h> that needs to be added, adding
>> <asm/highmem.h> directly provokes build failure at link time.
>>
>
> I get the following failure,
>
> [    0.704685] ------------[ cut here ]------------
> [    0.709239] initcall arch_pgtable_tests_init+0x0/0x228 returned with
> preemption imbalance

preempt_disable() is called from kmap_atomic(), which is called from
pte_alloc_map() via pte_offset_map().

pte_unmap() has to be called to release the mapped pte and re-enable
preemption.

Christophe



2019-09-13 06:41:12

by Christophe Leroy

[permalink] [raw]
Subject: Re: [PATCH V2 0/2] mm/debug: Add tests for architecture exported page table helpers



Le 13/09/2019 à 08:24, Anshuman Khandual a écrit :
>
>
> On 09/12/2019 08:12 PM, Christophe Leroy wrote:
>> Hi,
>>
>> I didn't get patch 1 of this series, and it is not on linuxppc-dev patchwork either. Can you resend ?
>
> Its there on linux-mm patchwork and copied on [email protected]
> as well. The CC list for the first patch was different than the second one.
>
> https://patchwork.kernel.org/patch/11142317/
>
> Let me know if you can not find it either on MM or LKML list.
>

I finally found it on the linux-mm archive, thanks. See my other mails and
my fix patch.

Christophe

2019-09-13 07:02:26

by Anshuman Khandual

[permalink] [raw]
Subject: Re: [PATCH] mm/pgtable/debug: Fix test validating architecture page table helpers

On 09/13/2019 11:53 AM, Christophe Leroy wrote:
> Fix build failure on powerpc.
>
> Fix preemption imbalance.
>
> Signed-off-by: Christophe Leroy <[email protected]>
> ---
> mm/arch_pgtable_test.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/mm/arch_pgtable_test.c b/mm/arch_pgtable_test.c
> index 8b4a92756ad8..f2b3c9ec35fa 100644
> --- a/mm/arch_pgtable_test.c
> +++ b/mm/arch_pgtable_test.c
> @@ -24,6 +24,7 @@
> #include <linux/swap.h>
> #include <linux/swapops.h>
> #include <linux/sched/mm.h>
> +#include <linux/highmem.h>

This is okay.

> #include <asm/pgalloc.h>
> #include <asm/pgtable.h>
>
> @@ -400,6 +401,8 @@ static int __init arch_pgtable_tests_init(void)
> p4d_clear_tests(p4dp);
> pgd_clear_tests(mm, pgdp);
>
> + pte_unmap(ptep);
> +

So the preemption imbalance comes via the pte_alloc_map() path, i.e.

pte_alloc_map() -> pte_offset_map() -> kmap_atomic()

Is this very much powerpc 32 specific, or will it be applicable to all
platforms which use kmap_XXX() to map high memory?

2019-09-13 07:14:52

by Christophe Leroy

[permalink] [raw]
Subject: Re: [PATCH] mm/pgtable/debug: Fix test validating architecture page table helpers



Le 13/09/2019 à 09:03, Christophe Leroy a écrit :
>
>
> Le 13/09/2019 à 08:58, Anshuman Khandual a écrit :
>> On 09/13/2019 11:53 AM, Christophe Leroy wrote:
>>> Fix build failure on powerpc.
>>>
>>> Fix preemption imbalance.
>>>
>>> Signed-off-by: Christophe Leroy <[email protected]>
>>> ---
>>>   mm/arch_pgtable_test.c | 3 +++
>>>   1 file changed, 3 insertions(+)
>>>
>>> diff --git a/mm/arch_pgtable_test.c b/mm/arch_pgtable_test.c
>>> index 8b4a92756ad8..f2b3c9ec35fa 100644
>>> --- a/mm/arch_pgtable_test.c
>>> +++ b/mm/arch_pgtable_test.c
>>> @@ -24,6 +24,7 @@
>>>   #include <linux/swap.h>
>>>   #include <linux/swapops.h>
>>>   #include <linux/sched/mm.h>
>>> +#include <linux/highmem.h>
>>
>> This is okay.
>>
>>>   #include <asm/pgalloc.h>
>>>   #include <asm/pgtable.h>
>>> @@ -400,6 +401,8 @@ static int __init arch_pgtable_tests_init(void)
>>>       p4d_clear_tests(p4dp);
>>>       pgd_clear_tests(mm, pgdp);
>>> +    pte_unmap(ptep);
>>> +
>>
>> Now the preemption imbalance via pte_alloc_map() path i.e
>>
>> pte_alloc_map() -> pte_offset_map() -> kmap_atomic()
>>
>> Is not this very much powerpc 32 specific or this will be applicable
>> for all platform which uses kmap_XXX() to map high memory ?
>>
>
> See
> https://elixir.bootlin.com/linux/v5.3-rc8/source/include/linux/highmem.h#L91
>
>
> I think it applies at least to all arches using the generic implementation.
>
> Applies also to arm:
> https://elixir.bootlin.com/linux/v5.3-rc8/source/arch/arm/mm/highmem.c#L52
>
> Applies also to mips:
> https://elixir.bootlin.com/linux/v5.3-rc8/source/arch/mips/mm/highmem.c#L47
>
> Same on sparc:
> https://elixir.bootlin.com/linux/v5.3-rc8/source/arch/sparc/mm/highmem.c#L52
>
>
> Same on x86:
> https://elixir.bootlin.com/linux/v5.3-rc8/source/arch/x86/mm/highmem_32.c#L34
>
>
> I have not checked others, but I guess it is like that for all.
>


Seems like I answered too quickly. All kmap_atomic() do
preempt_disable(), but not all pte_alloc_map() call kmap_atomic().

However, for instance ARM does:

https://elixir.bootlin.com/linux/v5.3-rc8/source/arch/arm/include/asm/pgtable.h#L200

And X86 as well:

https://elixir.bootlin.com/linux/v5.3-rc8/source/arch/x86/include/asm/pgtable_32.h#L51

Microblaze also:

https://elixir.bootlin.com/linux/v5.3-rc8/source/arch/microblaze/include/asm/pgtable.h#L495

Christophe

2019-09-13 08:46:57

by Anshuman Khandual

[permalink] [raw]
Subject: Re: [PATCH] mm/pgtable/debug: Fix test validating architecture page table helpers



On 09/13/2019 12:41 PM, Christophe Leroy wrote:
>
>
> On 13/09/2019 at 09:03, Christophe Leroy wrote:
>>
>>
>> On 13/09/2019 at 08:58, Anshuman Khandual wrote:
>>> On 09/13/2019 11:53 AM, Christophe Leroy wrote:
>>>> Fix build failure on powerpc.
>>>>
>>>> Fix preemption imbalance.
>>>>
>>>> Signed-off-by: Christophe Leroy <[email protected]>
>>>> ---
>>>>   mm/arch_pgtable_test.c | 3 +++
>>>>   1 file changed, 3 insertions(+)
>>>>
>>>> diff --git a/mm/arch_pgtable_test.c b/mm/arch_pgtable_test.c
>>>> index 8b4a92756ad8..f2b3c9ec35fa 100644
>>>> --- a/mm/arch_pgtable_test.c
>>>> +++ b/mm/arch_pgtable_test.c
>>>> @@ -24,6 +24,7 @@
>>>>   #include <linux/swap.h>
>>>>   #include <linux/swapops.h>
>>>>   #include <linux/sched/mm.h>
>>>> +#include <linux/highmem.h>
>>>
>>> This is okay.
>>>
>>>>   #include <asm/pgalloc.h>
>>>>   #include <asm/pgtable.h>
>>>> @@ -400,6 +401,8 @@ static int __init arch_pgtable_tests_init(void)
>>>>       p4d_clear_tests(p4dp);
>>>>       pgd_clear_tests(mm, pgdp);
>>>> +    pte_unmap(ptep);
>>>> +
>>>
>>> Now the preemption imbalance via pte_alloc_map() path i.e
>>>
>>> pte_alloc_map() -> pte_offset_map() -> kmap_atomic()
>>>
>>> Is not this very much powerpc 32 specific or this will be applicable
>>> for all platform which uses kmap_XXX() to map high memory ?
>>>
>>
>> See https://elixir.bootlin.com/linux/v5.3-rc8/source/include/linux/highmem.h#L91
>>
>> I think it applies at least to all arches using the generic implementation.
>>
>> Applies also to arm:
>> https://elixir.bootlin.com/linux/v5.3-rc8/source/arch/arm/mm/highmem.c#L52
>>
>> Applies also to mips:
>> https://elixir.bootlin.com/linux/v5.3-rc8/source/arch/mips/mm/highmem.c#L47
>>
>> Same on sparc:
>> https://elixir.bootlin.com/linux/v5.3-rc8/source/arch/sparc/mm/highmem.c#L52
>>
>> Same on x86:
>> https://elixir.bootlin.com/linux/v5.3-rc8/source/arch/x86/mm/highmem_32.c#L34
>>
>> I have not checked others, but I guess it is like that for all.
>>
>
>
> Seems like I answered too quickly. All kmap_atomic() do preempt_disable(), but not all pte_alloc_map() call kmap_atomic().
>
> However, for instance ARM does:
>
> https://elixir.bootlin.com/linux/v5.3-rc8/source/arch/arm/include/asm/pgtable.h#L200
>
> And X86 as well:
>
> https://elixir.bootlin.com/linux/v5.3-rc8/source/arch/x86/include/asm/pgtable_32.h#L51
>
> Microblaze also:
>
> https://elixir.bootlin.com/linux/v5.3-rc8/source/arch/microblaze/include/asm/pgtable.h#L495

All the above platforms check out to be using k[un]map_atomic(). I am wondering
whether any of the intermediate levels will have similar problems on any of these
32-bit platforms, or on any other platforms which might be using the generic
k[un]map_atomic(). There can be many permutations here.

p4dp = p4d_alloc(mm, pgdp, vaddr);
pudp = pud_alloc(mm, p4dp, vaddr);
pmdp = pmd_alloc(mm, pudp, vaddr);

Otherwise pte_alloc_map()/pte_unmap() looks good enough, which will at least
take care of a known failure.

2019-09-13 08:53:58

by Kirill A. Shutemov

[permalink] [raw]
Subject: Re: [PATCH] mm/pgtable/debug: Fix test validating architecture page table helpers

On Fri, Sep 13, 2019 at 02:12:45PM +0530, Anshuman Khandual wrote:
>
>
> On 09/13/2019 12:41 PM, Christophe Leroy wrote:
> >
> >
> > On 13/09/2019 at 09:03, Christophe Leroy wrote:
> >>
> >>
> >> On 13/09/2019 at 08:58, Anshuman Khandual wrote:
> >>> On 09/13/2019 11:53 AM, Christophe Leroy wrote:
> >>>> Fix build failure on powerpc.
> >>>>
> >>>> Fix preemption imbalance.
> >>>>
> >>>> Signed-off-by: Christophe Leroy <[email protected]>
> >>>> ---
> >>>>   mm/arch_pgtable_test.c | 3 +++
> >>>>   1 file changed, 3 insertions(+)
> >>>>
> >>>> diff --git a/mm/arch_pgtable_test.c b/mm/arch_pgtable_test.c
> >>>> index 8b4a92756ad8..f2b3c9ec35fa 100644
> >>>> --- a/mm/arch_pgtable_test.c
> >>>> +++ b/mm/arch_pgtable_test.c
> >>>> @@ -24,6 +24,7 @@
> >>>>   #include <linux/swap.h>
> >>>>   #include <linux/swapops.h>
> >>>>   #include <linux/sched/mm.h>
> >>>> +#include <linux/highmem.h>
> >>>
> >>> This is okay.
> >>>
> >>>>   #include <asm/pgalloc.h>
> >>>>   #include <asm/pgtable.h>
> >>>> @@ -400,6 +401,8 @@ static int __init arch_pgtable_tests_init(void)
> >>>>       p4d_clear_tests(p4dp);
> >>>>       pgd_clear_tests(mm, pgdp);
> >>>> +    pte_unmap(ptep);
> >>>> +
> >>>> +
> >>>
> >>> Now the preemption imbalance via pte_alloc_map() path i.e
> >>>
> >>> pte_alloc_map() -> pte_offset_map() -> kmap_atomic()
> >>>
> >>> Is not this very much powerpc 32 specific or this will be applicable
> >>> for all platform which uses kmap_XXX() to map high memory ?
> >>>
> >>
> >> See https://elixir.bootlin.com/linux/v5.3-rc8/source/include/linux/highmem.h#L91
> >>
> >> I think it applies at least to all arches using the generic implementation.
> >>
> >> Applies also to arm:
> >> https://elixir.bootlin.com/linux/v5.3-rc8/source/arch/arm/mm/highmem.c#L52
> >>
> >> Applies also to mips:
> >> https://elixir.bootlin.com/linux/v5.3-rc8/source/arch/mips/mm/highmem.c#L47
> >>
> >> Same on sparc:
> >> https://elixir.bootlin.com/linux/v5.3-rc8/source/arch/sparc/mm/highmem.c#L52
> >>
> >> Same on x86:
> >> https://elixir.bootlin.com/linux/v5.3-rc8/source/arch/x86/mm/highmem_32.c#L34
> >>
> >> I have not checked others, but I guess it is like that for all.
> >>
> >
> >
> > Seems like I answered too quickly. All kmap_atomic() do preempt_disable(), but not all pte_alloc_map() call kmap_atomic().
> >
> > However, for instance ARM does:
> >
> > https://elixir.bootlin.com/linux/v5.3-rc8/source/arch/arm/include/asm/pgtable.h#L200
> >
> > And X86 as well:
> >
> > https://elixir.bootlin.com/linux/v5.3-rc8/source/arch/x86/include/asm/pgtable_32.h#L51
> >
> > Microblaze also:
> >
> > https://elixir.bootlin.com/linux/v5.3-rc8/source/arch/microblaze/include/asm/pgtable.h#L495
>
> All the above platforms checks out to be using k[un]map_atomic(). I am wondering whether
> any of the intermediate levels will have similar problems on any these 32 bit platforms
> or any other platforms which might be using generic k[un]map_atomic().

No. The kernel only allocates pte page tables from highmem. All other page
tables are always visible in the kernel address space.

--
Kirill A. Shutemov

2019-09-13 09:04:25

by Anshuman Khandual

[permalink] [raw]
Subject: Re: [PATCH V2 2/2] mm/pgtable/debug: Add test validating architecture page table helpers


On 09/12/2019 10:44 PM, Christophe Leroy wrote:
>
>
> On 12/09/2019 at 08:02, Anshuman Khandual wrote:
>> This adds a test module which will validate architecture page table helpers
>> and accessors regarding compliance with generic MM semantics expectations.
>> This will help various architectures in validating changes to the existing
>> page table helpers or addition of new ones.
>>
>> Test page table and memory pages creating it's entries at various level are
>> all allocated from system memory with required alignments. If memory pages
>> with required size and alignment could not be allocated, then all depending
>> individual tests are skipped.
>>
>
> [...]
>
>>
>> Suggested-by: Catalin Marinas <[email protected]>
>> Signed-off-by: Anshuman Khandual <[email protected]>
>> ---
>>   arch/x86/include/asm/pgtable_64_types.h |   2 +
>>   mm/Kconfig.debug                        |  14 +
>>   mm/Makefile                             |   1 +
>>   mm/arch_pgtable_test.c                  | 429 ++++++++++++++++++++++++
>>   4 files changed, 446 insertions(+)
>>   create mode 100644 mm/arch_pgtable_test.c
>>
>> diff --git a/arch/x86/include/asm/pgtable_64_types.h b/arch/x86/include/asm/pgtable_64_types.h
>> index 52e5f5f2240d..b882792a3999 100644
>> --- a/arch/x86/include/asm/pgtable_64_types.h
>> +++ b/arch/x86/include/asm/pgtable_64_types.h
>> @@ -40,6 +40,8 @@ static inline bool pgtable_l5_enabled(void)
>>   #define pgtable_l5_enabled() 0
>>   #endif /* CONFIG_X86_5LEVEL */
>>   +#define mm_p4d_folded(mm) (!pgtable_l5_enabled())
>> +
>
> This is specific to x86, should go in a separate patch.

Thought about it, but it's just a single line. Kirill suggested this in the
previous version. There is a generic fallback definition but s390 has its
own. This change overrides the generic one for x86, probably as a fix or as
an improvement. Kirill should be able to help classify it, in which case it
can be a separate patch.

>
>>   extern unsigned int pgdir_shift;
>>   extern unsigned int ptrs_per_p4d;
>>   diff --git a/mm/Kconfig.debug b/mm/Kconfig.debug
>> index 327b3ebf23bf..ce9c397f7b07 100644
>> --- a/mm/Kconfig.debug
>> +++ b/mm/Kconfig.debug
>> @@ -117,3 +117,17 @@ config DEBUG_RODATA_TEST
>>       depends on STRICT_KERNEL_RWX
>>       ---help---
>>         This option enables a testcase for the setting rodata read-only.
>> +
>> +config DEBUG_ARCH_PGTABLE_TEST
>> +    bool "Test arch page table helpers for semantics compliance"
>> +    depends on MMU
>> +    depends on DEBUG_KERNEL
>> +    help
>> +      This options provides a kernel module which can be used to test
>> +      architecture page table helper functions on various platform in
>> +      verifying if they comply with expected generic MM semantics. This
>> +      will help architectures code in making sure that any changes or
>> +      new additions of these helpers will still conform to generic MM
>> +      expected semantics.
>> +
>> +      If unsure, say N.
>> diff --git a/mm/Makefile b/mm/Makefile
>> index d996846697ef..bb572c5aa8c5 100644
>> --- a/mm/Makefile
>> +++ b/mm/Makefile
>> @@ -86,6 +86,7 @@ obj-$(CONFIG_HWPOISON_INJECT) += hwpoison-inject.o
>>   obj-$(CONFIG_DEBUG_KMEMLEAK) += kmemleak.o
>>   obj-$(CONFIG_DEBUG_KMEMLEAK_TEST) += kmemleak-test.o
>>   obj-$(CONFIG_DEBUG_RODATA_TEST) += rodata_test.o
>> +obj-$(CONFIG_DEBUG_ARCH_PGTABLE_TEST) += arch_pgtable_test.o
>>   obj-$(CONFIG_PAGE_OWNER) += page_owner.o
>>   obj-$(CONFIG_CLEANCACHE) += cleancache.o
>>   obj-$(CONFIG_MEMORY_ISOLATION) += page_isolation.o
>> diff --git a/mm/arch_pgtable_test.c b/mm/arch_pgtable_test.c
>> new file mode 100644
>> index 000000000000..8b4a92756ad8
>> --- /dev/null
>> +++ b/mm/arch_pgtable_test.c
>> @@ -0,0 +1,429 @@
>> +// SPDX-License-Identifier: GPL-2.0-only
>> +/*
>> + * This kernel module validates architecture page table helpers &
>> + * accessors and helps in verifying their continued compliance with
>> + * generic MM semantics.
>> + *
>> + * Copyright (C) 2019 ARM Ltd.
>> + *
>> + * Author: Anshuman Khandual <[email protected]>
>> + */
>> +#define pr_fmt(fmt) "arch_pgtable_test: %s " fmt, __func__
>> +
>> +#include <linux/gfp.h>
>> +#include <linux/hugetlb.h>
>> +#include <linux/kernel.h>
>> +#include <linux/mm.h>
>> +#include <linux/mman.h>
>> +#include <linux/mm_types.h>
>> +#include <linux/module.h>
>> +#include <linux/pfn_t.h>
>> +#include <linux/printk.h>
>> +#include <linux/random.h>
>> +#include <linux/spinlock.h>
>> +#include <linux/swap.h>
>> +#include <linux/swapops.h>
>> +#include <linux/sched/mm.h>
>
> Add <linux/highmem.h> (see other mails, build failure on ppc book3s/32)

Okay.

>
>> +#include <asm/pgalloc.h>
>> +#include <asm/pgtable.h>
>> +
>> +/*
>> + * Basic operations
>> + *
>> + * mkold(entry)            = An old and not a young entry
>> + * mkyoung(entry)        = A young and not an old entry
>> + * mkdirty(entry)        = A dirty and not a clean entry
>> + * mkclean(entry)        = A clean and not a dirty entry
>> + * mkwrite(entry)        = A write and not a write protected entry
>> + * wrprotect(entry)        = A write protected and not a write entry
>> + * pxx_bad(entry)        = A mapped and non-table entry
>> + * pxx_same(entry1, entry2)    = Both entries hold the exact same value
>> + */
>> +#define VMFLAGS    (VM_READ|VM_WRITE|VM_EXEC)
>> +
>> +/*
>> + * On s390 platform, the lower 12 bits are used to identify given page table
>> + * entry type and for other arch specific requirements. But these bits might
>> + * affect the ability to clear entries with pxx_clear(). So while loading up
>> + * the entries skip all lower 12 bits in order to accommodate s390 platform.
>> + * It does not have affect any other platform.
>> + */
>> +#define RANDOM_ORVALUE    (0xfffffffffffff000UL)
>> +#define RANDOM_NZVALUE    (0xff)
>> +
>> +static bool pud_aligned;
>> +static bool pmd_aligned;
>> +
>> +static void pte_basic_tests(struct page *page, pgprot_t prot)
>> +{
>> +    pte_t pte = mk_pte(page, prot);
>> +
>> +    WARN_ON(!pte_same(pte, pte));
>> +    WARN_ON(!pte_young(pte_mkyoung(pte)));
>> +    WARN_ON(!pte_dirty(pte_mkdirty(pte)));
>> +    WARN_ON(!pte_write(pte_mkwrite(pte)));
>> +    WARN_ON(pte_young(pte_mkold(pte)));
>> +    WARN_ON(pte_dirty(pte_mkclean(pte)));
>> +    WARN_ON(pte_write(pte_wrprotect(pte)));
>> +}
>> +
>> +#ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE
>> +static void pmd_basic_tests(struct page *page, pgprot_t prot)
>> +{
>> +    pmd_t pmd;
>> +
>> +    /*
>> +     * Memory block here must be PMD_SIZE aligned. Abort this
>> +     * test in case we could not allocate such a memory block.
>> +     */
>> +    if (!pmd_aligned) {
>> +        pr_warn("Could not proceed with PMD tests\n");
>> +        return;
>> +    }
>> +
>> +    pmd = mk_pmd(page, prot);
>> +    WARN_ON(!pmd_same(pmd, pmd));
>> +    WARN_ON(!pmd_young(pmd_mkyoung(pmd)));
>> +    WARN_ON(!pmd_dirty(pmd_mkdirty(pmd)));
>> +    WARN_ON(!pmd_write(pmd_mkwrite(pmd)));
>> +    WARN_ON(pmd_young(pmd_mkold(pmd)));
>> +    WARN_ON(pmd_dirty(pmd_mkclean(pmd)));
>> +    WARN_ON(pmd_write(pmd_wrprotect(pmd)));
>> +    /*
>> +     * A huge page does not point to next level page table
>> +     * entry. Hence this must qualify as pmd_bad().
>> +     */
>> +    WARN_ON(!pmd_bad(pmd_mkhuge(pmd)));
>> +}
>> +#else
>> +static void pmd_basic_tests(struct page *page, pgprot_t prot) { }
>> +#endif
>> +
>> +#ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD
>> +static void pud_basic_tests(struct page *page, pgprot_t prot)
>> +{
>> +    pud_t pud;
>> +
>> +    /*
>> +     * Memory block here must be PUD_SIZE aligned. Abort this
>> +     * test in case we could not allocate such a memory block.
>> +     */
>> +    if (!pud_aligned) {
>> +        pr_warn("Could not proceed with PUD tests\n");
>> +        return;
>> +    }
>> +
>> +    pud = pfn_pud(page_to_pfn(page), prot);
>> +    WARN_ON(!pud_same(pud, pud));
>> +    WARN_ON(!pud_young(pud_mkyoung(pud)));
>> +    WARN_ON(!pud_write(pud_mkwrite(pud)));
>> +    WARN_ON(pud_write(pud_wrprotect(pud)));
>> +    WARN_ON(pud_young(pud_mkold(pud)));
>> +
>> +#if !defined(__PAGETABLE_PMD_FOLDED) && !defined(__ARCH_HAS_4LEVEL_HACK)
>> +    /*
>> +     * A huge page does not point to next level page table
>> +     * entry. Hence this must qualify as pud_bad().
>> +     */
>> +    WARN_ON(!pud_bad(pud_mkhuge(pud)));
>> +#endif
>> +}
>> +#else
>> +static void pud_basic_tests(struct page *page, pgprot_t prot) { }
>> +#endif
>> +
>> +static void p4d_basic_tests(struct page *page, pgprot_t prot)
>> +{
>> +    p4d_t p4d;
>> +
>> +    memset(&p4d, RANDOM_NZVALUE, sizeof(p4d_t));
>> +    WARN_ON(!p4d_same(p4d, p4d));
>> +}
>> +
>> +static void pgd_basic_tests(struct page *page, pgprot_t prot)
>> +{
>> +    pgd_t pgd;
>> +
>> +    memset(&pgd, RANDOM_NZVALUE, sizeof(pgd_t));
>> +    WARN_ON(!pgd_same(pgd, pgd));
>> +}
>> +
>> +#if !defined(__PAGETABLE_PMD_FOLDED) && !defined(__ARCH_HAS_4LEVEL_HACK)
>
> #ifdefs have to be avoided as much as possible, see below

Yeah, but it has been a bit difficult to avoid all these #ifdefs because of the
availability (or lack thereof) of all these pgtable helpers in various config
combinations on all platforms.

>
>> +static void pud_clear_tests(pud_t *pudp)
>> +{
>> +    pud_t pud = READ_ONCE(*pudp);
>     if (mm_pmd_folded() || __is_defined(__ARCH_HAS_4LEVEL_HACK))
>         return;
>
>> +
>> +    pud = __pud(pud_val(pud) | RANDOM_ORVALUE);
>> +    WRITE_ONCE(*pudp, pud);
>> +    pud_clear(pudp);
>> +    pud = READ_ONCE(*pudp);
>> +    WARN_ON(!pud_none(pud));
>> +}
>> +
>> +static void pud_populate_tests(struct mm_struct *mm, pud_t *pudp, pmd_t *pmdp)
>> +{
>> +    pud_t pud;
>> +
>     if (mm_pmd_folded() || __is_defined(__ARCH_HAS_4LEVEL_HACK))
>         return;
>> +    /*
>> +     * This entry points to next level page table page.
>> +     * Hence this must not qualify as pud_bad().
>> +     */
>> +    pmd_clear(pmdp);
>> +    pud_clear(pudp);
>> +    pud_populate(mm, pudp, pmdp);
>> +    pud = READ_ONCE(*pudp);
>> +    WARN_ON(pud_bad(pud));
>> +}
>> +#else
>
> Then the else branch goes away.
>
>> +static void pud_clear_tests(pud_t *pudp) { }
>> +static void pud_populate_tests(struct mm_struct *mm, pud_t *pudp, pmd_t *pmdp)
>> +{
>> +}
>> +#endif
>> +
>> +#if !defined(__PAGETABLE_PUD_FOLDED) && !defined(__ARCH_HAS_5LEVEL_HACK)
>
> The same can be done here.

IIRC it is not only the page table helpers; there are also data types (pxx_t)
which were not present on various configs, and these wrappers help prevent build
failures. Anyway, I will try and see if this can be improved further. But
meanwhile, if you have some suggestions, please do let me know.

>
>> +static void p4d_clear_tests(p4d_t *p4dp)
>> +{
>> +    p4d_t p4d = READ_ONCE(*p4dp);
>> +
>> +    p4d = __p4d(p4d_val(p4d) | RANDOM_ORVALUE);
>> +    WRITE_ONCE(*p4dp, p4d);
>> +    p4d_clear(p4dp);
>> +    p4d = READ_ONCE(*p4dp);
>> +    WARN_ON(!p4d_none(p4d));
>> +}
>> +
>> +static void p4d_populate_tests(struct mm_struct *mm, p4d_t *p4dp, pud_t *pudp)
>> +{
>> +    p4d_t p4d;
>> +
>> +    /*
>> +     * This entry points to next level page table page.
>> +     * Hence this must not qualify as p4d_bad().
>> +     */
>> +    pud_clear(pudp);
>> +    p4d_clear(p4dp);
>> +    p4d_populate(mm, p4dp, pudp);
>> +    p4d = READ_ONCE(*p4dp);
>> +    WARN_ON(p4d_bad(p4d));
>> +}
>> +#else
>> +static void p4d_clear_tests(p4d_t *p4dp) { }
>> +static void p4d_populate_tests(struct mm_struct *mm, p4d_t *p4dp, pud_t *pudp)
>> +{
>> +}
>> +#endif
>> +
>> +#ifndef __ARCH_HAS_5LEVEL_HACK
>
> And the same here (you already did part of it with testing mm_p4d_folded(mm)

But it was not capturing all the build combinations which would break
otherwise, e.g. some configs on arm64 were failing to build.

>
>> +static void pgd_clear_tests(struct mm_struct *mm, pgd_t *pgdp)
>> +{
>> +    pgd_t pgd = READ_ONCE(*pgdp);
>> +
>> +    if (mm_p4d_folded(mm))
>> +        return;
>> +
>> +    pgd = __pgd(pgd_val(pgd) | RANDOM_ORVALUE);
>> +    WRITE_ONCE(*pgdp, pgd);
>> +    pgd_clear(pgdp);
>> +    pgd = READ_ONCE(*pgdp);
>> +    WARN_ON(!pgd_none(pgd));
>> +}
>> +
>> +static void pgd_populate_tests(struct mm_struct *mm, pgd_t *pgdp, p4d_t *p4dp)
>> +{
>> +    pgd_t pgd;
>> +
>> +    if (mm_p4d_folded(mm))
>> +        return;
>> +
>> +    /*
>> +     * This entry points to next level page table page.
>> +     * Hence this must not qualify as pgd_bad().
>> +     */
>> +    p4d_clear(p4dp);
>> +    pgd_clear(pgdp);
>> +    pgd_populate(mm, pgdp, p4dp);
>> +    pgd = READ_ONCE(*pgdp);
>> +    WARN_ON(pgd_bad(pgd));
>> +}
>> +#else
>> +static void pgd_clear_tests(struct mm_struct *mm, pgd_t *pgdp) { }
>> +static void pgd_populate_tests(struct mm_struct *mm, pgd_t *pgdp, p4d_t *p4dp)
>> +{
>> +}
>> +#endif
>> +
>> +static void pte_clear_tests(struct mm_struct *mm, pte_t *ptep)
>> +{
>> +    pte_t pte = READ_ONCE(*ptep);
>> +
>> +    pte = __pte(pte_val(pte) | RANDOM_ORVALUE);
>> +    WRITE_ONCE(*ptep, pte);
>> +    pte_clear(mm, 0, ptep);
>> +    pte = READ_ONCE(*ptep);
>> +    WARN_ON(!pte_none(pte));
>> +}
>> +
>> +static void pmd_clear_tests(pmd_t *pmdp)
>> +{
>> +    pmd_t pmd = READ_ONCE(*pmdp);
>> +
>> +    pmd = __pmd(pmd_val(pmd) | RANDOM_ORVALUE);
>> +    WRITE_ONCE(*pmdp, pmd);
>> +    pmd_clear(pmdp);
>> +    pmd = READ_ONCE(*pmdp);
>> +    WARN_ON(!pmd_none(pmd));
>> +}
>> +
>> +static void pmd_populate_tests(struct mm_struct *mm, pmd_t *pmdp,
>> +                   pgtable_t pgtable)
>> +{
>> +    pmd_t pmd;
>> +
>> +    /*
>> +     * This entry points to next level page table page.
>> +     * Hence this must not qualify as pmd_bad().
>> +     */
>> +    pmd_clear(pmdp);
>> +    pmd_populate(mm, pmdp, pgtable);
>> +    pmd = READ_ONCE(*pmdp);
>> +    WARN_ON(pmd_bad(pmd));
>> +}
>> +
>> +static struct page *alloc_mapped_page(void)
>> +{
>> +    struct page *page;
>> +    gfp_t gfp_mask = GFP_KERNEL | __GFP_ZERO;
>> +
>> +    page = alloc_gigantic_page_order(get_order(PUD_SIZE), gfp_mask,
>> +                first_memory_node, &node_states[N_MEMORY]);
>> +    if (page) {
>> +        pud_aligned = true;
>> +        pmd_aligned = true;
>> +        return page;
>> +    }
>> +
>> +    page = alloc_pages(gfp_mask, get_order(PMD_SIZE));
>> +    if (page) {
>> +        pmd_aligned = true;
>> +        return page;
>> +    }
>> +    return alloc_page(gfp_mask);
>> +}
>> +
>> +static void free_mapped_page(struct page *page)
>> +{
>> +    if (pud_aligned) {
>> +        unsigned long pfn = page_to_pfn(page);
>> +
>> +        free_contig_range(pfn, 1ULL << get_order(PUD_SIZE));
>> +        return;
>> +    }
>> +
>> +    if (pmd_aligned) {
>> +        int order = get_order(PMD_SIZE);
>> +
>> +        free_pages((unsigned long)page_address(page), order);
>> +        return;
>> +    }
>> +    free_page((unsigned long)page_address(page));
>> +}
>> +
>> +static unsigned long get_random_vaddr(void)
>> +{
>> +    unsigned long random_vaddr, random_pages, total_user_pages;
>> +
>> +    total_user_pages = (TASK_SIZE - FIRST_USER_ADDRESS) / PAGE_SIZE;
>> +
>> +    random_pages = get_random_long() % total_user_pages;
>> +    random_vaddr = FIRST_USER_ADDRESS + random_pages * PAGE_SIZE;
>> +
>> +    WARN_ON(random_vaddr > TASK_SIZE);
>> +    WARN_ON(random_vaddr < FIRST_USER_ADDRESS);
>> +    return random_vaddr;
>> +}
>> +
>> +static int __init arch_pgtable_tests_init(void)
>> +{
>> +    struct mm_struct *mm;
>> +    struct page *page;
>> +    pgd_t *pgdp;
>> +    p4d_t *p4dp, *saved_p4dp;
>> +    pud_t *pudp, *saved_pudp;
>> +    pmd_t *pmdp, *saved_pmdp, pmd;
>> +    pte_t *ptep;
>> +    pgtable_t saved_ptep;
>> +    pgprot_t prot;
>> +    unsigned long vaddr;
>> +
>> +    prot = vm_get_page_prot(VMFLAGS);
>> +    vaddr = get_random_vaddr();
>> +    mm = mm_alloc();
>> +    if (!mm) {
>> +        pr_err("mm_struct allocation failed\n");
>> +        return 1;
>> +    }
>> +
>> +    page = alloc_mapped_page();
>> +    if (!page) {
>> +        pr_err("memory allocation failed\n");
>> +        return 1;
>> +    }
>> +
>> +    pgdp = pgd_offset(mm, vaddr);
>> +    p4dp = p4d_alloc(mm, pgdp, vaddr);
>> +    pudp = pud_alloc(mm, p4dp, vaddr);
>> +    pmdp = pmd_alloc(mm, pudp, vaddr);
>> +    ptep = pte_alloc_map(mm, pmdp, vaddr);
>> +
>> +    /*
>> +     * Save all the page table page addresses as the page table
>> +     * entries will be used for testing with random or garbage
>> +     * values. These saved addresses will be used for freeing
>> +     * page table pages.
>> +     */
>> +    pmd = READ_ONCE(*pmdp);
>> +    saved_p4dp = p4d_offset(pgdp, 0UL);
>> +    saved_pudp = pud_offset(p4dp, 0UL);
>> +    saved_pmdp = pmd_offset(pudp, 0UL);
>> +    saved_ptep = pmd_pgtable(pmd);
>> +
>> +    pte_basic_tests(page, prot);
>> +    pmd_basic_tests(page, prot);
>> +    pud_basic_tests(page, prot);
>> +    p4d_basic_tests(page, prot);
>> +    pgd_basic_tests(page, prot);
>> +
>> +    pte_clear_tests(mm, ptep);
>> +    pmd_clear_tests(pmdp);
>> +    pud_clear_tests(pudp);
>> +    p4d_clear_tests(p4dp);
>> +    pgd_clear_tests(mm, pgdp);
>> +
>> +    pmd_populate_tests(mm, pmdp, saved_ptep);
>> +    pud_populate_tests(mm, pudp, saved_pmdp);
>> +    p4d_populate_tests(mm, p4dp, saved_pudp);
>> +    pgd_populate_tests(mm, pgdp, saved_p4dp);
>> +
>> +    p4d_free(mm, saved_p4dp);
>> +    pud_free(mm, saved_pudp);
>> +    pmd_free(mm, saved_pmdp);
>> +    pte_free(mm, saved_ptep);
>> +
>> +    mm_dec_nr_puds(mm);
>> +    mm_dec_nr_pmds(mm);
>> +    mm_dec_nr_ptes(mm);
>> +    __mmdrop(mm);
>> +
>> +    free_mapped_page(page);
>> +    return 0;
>
> Is there any benefit in keeping the module loaded once the tests are done ? Shouldn't the load fail instead ?

Will change this to a late_init() sequence with all functions marked
__init, as suggested by Kirill on the other thread.

>
>> +}
>> +
>> +static void __exit arch_pgtable_tests_exit(void) { }
>
> Is this function really needed ?

This will be gone as well.

>
>> +
>> +module_init(arch_pgtable_tests_init);
>> +module_exit(arch_pgtable_tests_exit);
>> +
>> +MODULE_LICENSE("GPL v2");
>> +MODULE_AUTHOR("Anshuman Khandual <[email protected]>");
>> +MODULE_DESCRIPTION("Test architecture page table helpers");
>>
>
> Christophe
>

2019-09-13 09:19:05

by Kirill A. Shutemov

[permalink] [raw]
Subject: Re: [PATCH V2 2/2] mm/pgtable/debug: Add test validating architecture page table helpers

On Fri, Sep 13, 2019 at 02:32:04PM +0530, Anshuman Khandual wrote:
>
> On 09/12/2019 10:44 PM, Christophe Leroy wrote:
> >
> >
> > On 12/09/2019 at 08:02, Anshuman Khandual wrote:
> >> This adds a test module which will validate architecture page table helpers
> >> and accessors regarding compliance with generic MM semantics expectations.
> >> This will help various architectures in validating changes to the existing
> >> page table helpers or addition of new ones.
> >>
> >> Test page table and memory pages creating it's entries at various level are
> >> all allocated from system memory with required alignments. If memory pages
> >> with required size and alignment could not be allocated, then all depending
> >> individual tests are skipped.
> >>
> >
> > [...]
> >
> >>
> >> Suggested-by: Catalin Marinas <[email protected]>
> >> Signed-off-by: Anshuman Khandual <[email protected]>
> >> ---
> >>   arch/x86/include/asm/pgtable_64_types.h |   2 +
> >>   mm/Kconfig.debug                        |  14 +
> >>   mm/Makefile                             |   1 +
> >>   mm/arch_pgtable_test.c                  | 429 ++++++++++++++++++++++++
> >>   4 files changed, 446 insertions(+)
> >>   create mode 100644 mm/arch_pgtable_test.c
> >>
> >> diff --git a/arch/x86/include/asm/pgtable_64_types.h b/arch/x86/include/asm/pgtable_64_types.h
> >> index 52e5f5f2240d..b882792a3999 100644
> >> --- a/arch/x86/include/asm/pgtable_64_types.h
> >> +++ b/arch/x86/include/asm/pgtable_64_types.h
> >> @@ -40,6 +40,8 @@ static inline bool pgtable_l5_enabled(void)
> >>   #define pgtable_l5_enabled() 0
> >>   #endif /* CONFIG_X86_5LEVEL */
> >>   +#define mm_p4d_folded(mm) (!pgtable_l5_enabled())
> >> +
> >
> > This is specific to x86, should go in a separate patch.
>
> Thought about it but its just a single line. Kirill suggested this in the
> previous version. There is a generic fallback definition but s390 has it's
> own. This change overrides the generic one for x86 probably as a fix or as
> an improvement. Kirill should be able to help classify it in which case it
> can be a separate patch.

I don't think it worth a separate patch.

--
Kirill A. Shutemov

2019-09-13 10:13:36

by Christophe Leroy

[permalink] [raw]
Subject: Re: [PATCH] mm/pgtable/debug: Fix test validating architecture page table helpers



On 13/09/2019 at 08:58, Anshuman Khandual wrote:
> On 09/13/2019 11:53 AM, Christophe Leroy wrote:
>> Fix build failure on powerpc.
>>
>> Fix preemption imbalance.
>>
>> Signed-off-by: Christophe Leroy <[email protected]>
>> ---
>> mm/arch_pgtable_test.c | 3 +++
>> 1 file changed, 3 insertions(+)
>>
>> diff --git a/mm/arch_pgtable_test.c b/mm/arch_pgtable_test.c
>> index 8b4a92756ad8..f2b3c9ec35fa 100644
>> --- a/mm/arch_pgtable_test.c
>> +++ b/mm/arch_pgtable_test.c
>> @@ -24,6 +24,7 @@
>> #include <linux/swap.h>
>> #include <linux/swapops.h>
>> #include <linux/sched/mm.h>
>> +#include <linux/highmem.h>
>
> This is okay.
>
>> #include <asm/pgalloc.h>
>> #include <asm/pgtable.h>
>>
>> @@ -400,6 +401,8 @@ static int __init arch_pgtable_tests_init(void)
>> p4d_clear_tests(p4dp);
>> pgd_clear_tests(mm, pgdp);
>>
>> + pte_unmap(ptep);
>> +
>
> Now the preemption imbalance via pte_alloc_map() path i.e
>
> pte_alloc_map() -> pte_offset_map() -> kmap_atomic()
>
> Is not this very much powerpc 32 specific or this will be applicable
> for all platform which uses kmap_XXX() to map high memory ?
>

See
https://elixir.bootlin.com/linux/v5.3-rc8/source/include/linux/highmem.h#L91

I think it applies at least to all arches using the generic implementation.

Applies also to arm:
https://elixir.bootlin.com/linux/v5.3-rc8/source/arch/arm/mm/highmem.c#L52

Applies also to mips:
https://elixir.bootlin.com/linux/v5.3-rc8/source/arch/mips/mm/highmem.c#L47

Same on sparc:
https://elixir.bootlin.com/linux/v5.3-rc8/source/arch/sparc/mm/highmem.c#L52

Same on x86:
https://elixir.bootlin.com/linux/v5.3-rc8/source/arch/x86/mm/highmem_32.c#L34

I have not checked others, but I guess it is like that for all.

Christophe

2019-09-13 14:35:47

by Christophe Leroy

[permalink] [raw]
Subject: Re: [PATCH V2 2/2] mm/pgtable/debug: Add test validating architecture page table helpers



On 13/09/2019 at 11:02, Anshuman Khandual wrote:
>
>>> +#if !defined(__PAGETABLE_PMD_FOLDED) && !defined(__ARCH_HAS_4LEVEL_HACK)
>>
>> #ifdefs have to be avoided as much as possible, see below
>
> Yeah but it has been bit difficult to avoid all these $ifdef because of the
> availability (or lack of it) for all these pgtable helpers in various config
> combinations on all platforms.

As far as I can see these pgtable helpers should exist everywhere at
least via asm-generic/ files.

Can you spot a particular config which fails ?

>
>>

[...]

>>> +#if !defined(__PAGETABLE_PUD_FOLDED) && !defined(__ARCH_HAS_5LEVEL_HACK)
>>
>> The same can be done here.
>
> IIRC it is not only the page table helpers; there are data types (pxx_t)
> which were not present on various configs, and these wrappers help prevent
> build failures. Anyway, I will try and see if this can be improved further.
> Meanwhile, if you have any suggestions, please do let me know.

pgd_t and pmd_t are everywhere I guess,
then pud_t and p4d_t have fallbacks in asm-generic files.

So it shouldn't be an issue. Maybe if a couple of arches miss them, the
best would be to fix those arches, since that's the purpose of your
test suite, isn't it ?


Christophe

2019-09-18 06:11:31

by Anshuman Khandual

Subject: Re: [PATCH V2 2/2] mm/pgtable/debug: Add test validating architecture page table helpers



On 09/13/2019 03:31 PM, Christophe Leroy wrote:
>
>
> Le 13/09/2019 à 11:02, Anshuman Khandual a écrit :
>>
>>>> +#if !defined(__PAGETABLE_PMD_FOLDED) && !defined(__ARCH_HAS_4LEVEL_HACK)
>>>
>>> #ifdefs have to be avoided as much as possible, see below
>>
>> Yeah but it has been bit difficult to avoid all these $ifdef because of the
>> availability (or lack of it) for all these pgtable helpers in various config
>> combinations on all platforms.
>
> As far as I can see these pgtable helpers should exist everywhere at least via asm-generic/ files.

But they might not actually do the right thing.

>
> Can you spot a particular config which fails ?

Let's consider the following example (after removing the #ifdefs around it)
which builds successfully but fails the intended test. This is with an
arm64 config with 4K page size and 39-bit VA space, which ends up with a
3-level page table arrangement.

static void __init p4d_clear_tests(p4d_t *p4dp)
{
	p4d_t p4d = READ_ONCE(*p4dp);

	p4d = __p4d(p4d_val(p4d) | RANDOM_ORVALUE);
	WRITE_ONCE(*p4dp, p4d);
	p4d_clear(p4dp);
	p4d = READ_ONCE(*p4dp);
	WARN_ON(!p4d_none(p4d));
}

The following test hits an error at WARN_ON(!p4d_none(p4d))

[ 16.757333] ------------[ cut here ]------------
[ 16.758019] WARNING: CPU: 11 PID: 1 at mm/arch_pgtable_test.c:187 arch_pgtable_tests_init+0x24c/0x474
[ 16.759455] Modules linked in:
[ 16.759952] CPU: 11 PID: 1 Comm: swapper/0 Not tainted 5.3.0-next-20190916-00005-g61c218153bb8-dirty #222
[ 16.761449] Hardware name: linux,dummy-virt (DT)
[ 16.762185] pstate: 00400005 (nzcv daif +PAN -UAO)
[ 16.762964] pc : arch_pgtable_tests_init+0x24c/0x474
[ 16.763750] lr : arch_pgtable_tests_init+0x174/0x474
[ 16.764534] sp : ffffffc011d7bd50
[ 16.765065] x29: ffffffc011d7bd50 x28: ffffffff1756bac0
[ 16.765908] x27: ffffff85ddaf3000 x26: 00000000000002e8
[ 16.766767] x25: ffffffc0111ce000 x24: ffffff85ddaf32e8
[ 16.767606] x23: ffffff85ddaef278 x22: 00000045cc844000
[ 16.768445] x21: 000000065daef003 x20: ffffffff17540000
[ 16.769283] x19: ffffff85ddb60000 x18: 0000000000000014
[ 16.770122] x17: 00000000980426bb x16: 00000000698594c6
[ 16.770976] x15: 0000000066e25a88 x14: 0000000000000000
[ 16.771813] x13: ffffffff17540000 x12: 000000000000000a
[ 16.772651] x11: ffffff85fcfd0a40 x10: 0000000000000001
[ 16.773488] x9 : 0000000000000008 x8 : ffffffc01143ab26
[ 16.774336] x7 : 0000000000000000 x6 : 0000000000000000
[ 16.775180] x5 : 0000000000000000 x4 : 0000000000000000
[ 16.776018] x3 : ffffffff1756bbe8 x2 : 000000065daeb003
[ 16.776856] x1 : 000000000065daeb x0 : fffffffffffff000
[ 16.777693] Call trace:
[ 16.778092] arch_pgtable_tests_init+0x24c/0x474
[ 16.778843] do_one_initcall+0x74/0x1b0
[ 16.779458] kernel_init_freeable+0x1cc/0x290
[ 16.780151] kernel_init+0x10/0x100
[ 16.780710] ret_from_fork+0x10/0x18
[ 16.781282] ---[ end trace 042e6c40c0a3b038 ]---

On arm64 (4K page size|39 bits VA|3 level page table)

#elif CONFIG_PGTABLE_LEVELS == 3 /* Applicable here */
#define __ARCH_USE_5LEVEL_HACK
#include <asm-generic/pgtable-nopud.h>

Which pulls in

#include <asm-generic/pgtable-nop4d-hack.h>

which pulls in

#include <asm-generic/5level-fixup.h>

which defines

static inline int p4d_none(p4d_t p4d)
{
	return 0;
}

which will invariably trigger WARN_ON(!p4d_none(p4d)).

Similarly for the next test, p4d_populate_tests(), which will always appear
successful because p4d_bad() invariably returns zero.

static inline int p4d_bad(p4d_t p4d)
{
	return 0;
}

static void __init p4d_populate_tests(struct mm_struct *mm, p4d_t *p4dp,
				      pud_t *pudp)
{
	p4d_t p4d;

	/*
	 * This entry points to next level page table page.
	 * Hence this must not qualify as p4d_bad().
	 */
	pud_clear(pudp);
	p4d_clear(p4dp);
	p4d_populate(mm, p4dp, pudp);
	p4d = READ_ONCE(*p4dp);
	WARN_ON(p4d_bad(p4d));
}

We should not run these tests on this config because they are not
applicable and will invariably produce the same result.

>
>>
>>>
>
> [...]
>
>>>> +#if !defined(__PAGETABLE_PUD_FOLDED) && !defined(__ARCH_HAS_5LEVEL_HACK)
>>>
>>> The same can be done here.
>>
>> IIRC not only the page table helpers but there are data types (pxx_t) which
>> were not present on various configs and these wrappers help prevent build
>> failures. Any ways will try and see if this can be improved further. But
>> meanwhile if you have some suggestions, please do let me know.
>
> pgt_t and pmd_t are everywhere I guess.
> then pud_t and p4d_t have fallbacks in asm-generic files.

Let's take another example where it fails to compile. On arm64 with 16K
page size, 48-bit VA and a 4-level page table arrangement, pgd_populate()
in the following test does not have the required signature.

static void pgd_populate_tests(struct mm_struct *mm, pgd_t *pgdp, p4d_t *p4dp)
{
	pgd_t pgd;

	if (mm_p4d_folded(mm))
		return;

	/*
	 * This entry points to next level page table page.
	 * Hence this must not qualify as pgd_bad().
	 */
	p4d_clear(p4dp);
	pgd_clear(pgdp);
	pgd_populate(mm, pgdp, p4dp);
	pgd = READ_ONCE(*pgdp);
	WARN_ON(pgd_bad(pgd));
}

mm/arch_pgtable_test.c: In function ‘pgd_populate_tests’:
mm/arch_pgtable_test.c:254:25: error: passing argument 3 of ‘pgd_populate’ from incompatible pointer type [-Werror=incompatible-pointer-types]
pgd_populate(mm, pgdp, p4dp);
^~~~
In file included from mm/arch_pgtable_test.c:27:0:
./arch/arm64/include/asm/pgalloc.h:81:20: note: expected ‘pud_t * {aka struct <anonymous> *}’ but argument is of type ‘pgd_t * {aka struct <anonymous> *}’
static inline void pgd_populate(struct mm_struct *mm, pgd_t *pgdp, pud_t *pudp)

The build failure is because p4d_t * maps to pgd_t *, but the applicable
pgd_populate() (it does not fall back on the generic one) expects a pud_t *.

Except for archs which have 5-level page tables, pgd_populate() always
accepts a lower level page table pointer as the last argument, as they
don't have that many levels.

arch/x86/include/asm/pgalloc.h:static inline void pgd_populate(struct mm_struct *mm, pgd_t *pgd, p4d_t *p4d)
arch/s390/include/asm/pgalloc.h:static inline void pgd_populate(struct mm_struct *mm, pgd_t *pgd, p4d_t *p4d)

But others

arch/arm64/include/asm/pgalloc.h:static inline void pgd_populate(struct mm_struct *mm, pgd_t *pgdp, pud_t *pudp)
arch/m68k/include/asm/motorola_pgalloc.h:static inline void pgd_populate(struct mm_struct *mm, pgd_t *pgd, pmd_t *pmd)
arch/mips/include/asm/pgalloc.h:static inline void pgd_populate(struct mm_struct *mm, pgd_t *pgd, pud_t *pud)
arch/powerpc/include/asm/book3s/64/pgalloc.h:static inline void pgd_populate(struct mm_struct *mm, pgd_t *pgd, pud_t *pud)

I remember going through all these combinations before arriving at the
current state of #ifdef exclusions. Probably, to solve this, all platforms
would have to define pxx_populate() helpers as if they supported 5-level
page tables.

>
> So it shouldn't be an issue. Maybe if a couple of arches miss them, the best would be to fix the arches, since that's the purpose of your testsuite isn't it ?

The run time failures explained previously are because of the folding, which
needs to be protected against, as those tests are not even applicable. The
compile time failures are because pxx_populate() signatures are platform
specific, depending on how many page table levels they really support.

2019-09-18 09:16:33

by Anshuman Khandual

Subject: Re: [PATCH] mm/pgtable/debug: Fix test validating architecture page table helpers



On 09/13/2019 11:53 AM, Christophe Leroy wrote:
> Fix build failure on powerpc.
>
> Fix preemption imbalance.
>
> Signed-off-by: Christophe Leroy <[email protected]>
> ---
> mm/arch_pgtable_test.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/mm/arch_pgtable_test.c b/mm/arch_pgtable_test.c
> index 8b4a92756ad8..f2b3c9ec35fa 100644
> --- a/mm/arch_pgtable_test.c
> +++ b/mm/arch_pgtable_test.c
> @@ -24,6 +24,7 @@
> #include <linux/swap.h>
> #include <linux/swapops.h>
> #include <linux/sched/mm.h>
> +#include <linux/highmem.h>
> #include <asm/pgalloc.h>
> #include <asm/pgtable.h>
>
> @@ -400,6 +401,8 @@ static int __init arch_pgtable_tests_init(void)
> p4d_clear_tests(p4dp);
> pgd_clear_tests(mm, pgdp);
>
> + pte_unmap(ptep);
> +
> pmd_populate_tests(mm, pmdp, saved_ptep);
> pud_populate_tests(mm, pudp, saved_pmdp);
> p4d_populate_tests(mm, p4dp, saved_pudp);
>

Hello Christophe,

I am planning to fold this fix into the current patch and retain your
Signed-off-by. Are you okay with it ?

- Anshuman

2019-09-18 17:37:22

by Christophe Leroy

Subject: Re: [PATCH V2 2/2] mm/pgtable/debug: Add test validating architecture page table helpers



Le 18/09/2019 à 07:04, Anshuman Khandual a écrit :
>
>
> On 09/13/2019 03:31 PM, Christophe Leroy wrote:
>>
>>
>> Le 13/09/2019 à 11:02, Anshuman Khandual a écrit :
>>>
>>>>> +#if !defined(__PAGETABLE_PMD_FOLDED) && !defined(__ARCH_HAS_4LEVEL_HACK)
>>>>
>>>> #ifdefs have to be avoided as much as possible, see below
>>>
>>> Yeah but it has been bit difficult to avoid all these $ifdef because of the
>>> availability (or lack of it) for all these pgtable helpers in various config
>>> combinations on all platforms.
>>
>> As far as I can see these pgtable helpers should exist everywhere at least via asm-generic/ files.
>
> But they might not actually do the right thing.
>
>>
>> Can you spot a particular config which fails ?
>
> Lets consider the following example (after removing the $ifdefs around it)
> which though builds successfully but fails to pass the intended test. This
> is with arm64 config 4K pages sizes with 39 bits VA space which ends up
> with a 3 level page table arrangement.
>
> static void __init p4d_clear_tests(p4d_t *p4dp)
> {
> p4d_t p4d = READ_ONCE(*p4dp);

My suggestion was not to completely drop the #ifdef but to do as you
did in pgd_clear_tests() for instance, i.e. to add the following test at
the top of the function:

	if (mm_pud_folded(mm) || is_defined(__ARCH_HAS_5LEVEL_HACK))
		return;

>
> p4d = __p4d(p4d_val(p4d) | RANDOM_ORVALUE);
> WRITE_ONCE(*p4dp, p4d);
> p4d_clear(p4dp);
> p4d = READ_ONCE(*p4dp);
> WARN_ON(!p4d_none(p4d));
> }
>
> The following test hits an error at WARN_ON(!p4d_none(p4d))
>
> [ 16.757333] ------------[ cut here ]------------
> [ 16.758019] WARNING: CPU: 11 PID: 1 at mm/arch_pgtable_test.c:187 arch_pgtable_tests_init+0x24c/0x474
> [ 16.759455] Modules linked in:
> [ 16.759952] CPU: 11 PID: 1 Comm: swapper/0 Not tainted 5.3.0-next-20190916-00005-g61c218153bb8-dirty #222
> [ 16.761449] Hardware name: linux,dummy-virt (DT)
> [ 16.762185] pstate: 00400005 (nzcv daif +PAN -UAO)
> [ 16.762964] pc : arch_pgtable_tests_init+0x24c/0x474
> [ 16.763750] lr : arch_pgtable_tests_init+0x174/0x474
> [ 16.764534] sp : ffffffc011d7bd50
> [ 16.765065] x29: ffffffc011d7bd50 x28: ffffffff1756bac0
> [ 16.765908] x27: ffffff85ddaf3000 x26: 00000000000002e8
> [ 16.766767] x25: ffffffc0111ce000 x24: ffffff85ddaf32e8
> [ 16.767606] x23: ffffff85ddaef278 x22: 00000045cc844000
> [ 16.768445] x21: 000000065daef003 x20: ffffffff17540000
> [ 16.769283] x19: ffffff85ddb60000 x18: 0000000000000014
> [ 16.770122] x17: 00000000980426bb x16: 00000000698594c6
> [ 16.770976] x15: 0000000066e25a88 x14: 0000000000000000
> [ 16.771813] x13: ffffffff17540000 x12: 000000000000000a
> [ 16.772651] x11: ffffff85fcfd0a40 x10: 0000000000000001
> [ 16.773488] x9 : 0000000000000008 x8 : ffffffc01143ab26
> [ 16.774336] x7 : 0000000000000000 x6 : 0000000000000000
> [ 16.775180] x5 : 0000000000000000 x4 : 0000000000000000
> [ 16.776018] x3 : ffffffff1756bbe8 x2 : 000000065daeb003
> [ 16.776856] x1 : 000000000065daeb x0 : fffffffffffff000
> [ 16.777693] Call trace:
> [ 16.778092] arch_pgtable_tests_init+0x24c/0x474
> [ 16.778843] do_one_initcall+0x74/0x1b0
> [ 16.779458] kernel_init_freeable+0x1cc/0x290
> [ 16.780151] kernel_init+0x10/0x100
> [ 16.780710] ret_from_fork+0x10/0x18
> [ 16.781282] ---[ end trace 042e6c40c0a3b038 ]---
>
> On arm64 (4K page size|39 bits VA|3 level page table)
>
> #elif CONFIG_PGTABLE_LEVELS == 3 /* Applicable here */
> #define __ARCH_USE_5LEVEL_HACK
> #include <asm-generic/pgtable-nopud.h>
>
> Which pulls in
>
> #include <asm-generic/pgtable-nop4d-hack.h>
>
> which pulls in
>
> #include <asm-generic/5level-fixup.h>
>
> which defines
>
> static inline int p4d_none(p4d_t p4d)
> {
> return 0;
> }
>
> which will invariably trigger WARN_ON(!p4d_none(p4d)).
>
> Similarly for next test p4d_populate_tests() which will always be
> successful because p4d_bad() invariably returns negative.
>
> static inline int p4d_bad(p4d_t p4d)
> {
> return 0;
> }
>
> static void __init p4d_populate_tests(struct mm_struct *mm, p4d_t *p4dp,
> pud_t *pudp)
> {
> p4d_t p4d;
>
> /*
> * This entry points to next level page table page.
> * Hence this must not qualify as p4d_bad().
> */
> pud_clear(pudp);
> p4d_clear(p4dp);
> p4d_populate(mm, p4dp, pudp);
> p4d = READ_ONCE(*p4dp);
> WARN_ON(p4d_bad(p4d));
> }
>
> We should not run these tests for the above config because they are
> not applicable and will invariably produce same result.
>
>>
>>>
>>>>
>>
>> [...]
>>
>>>>> +#if !defined(__PAGETABLE_PUD_FOLDED) && !defined(__ARCH_HAS_5LEVEL_HACK)
>>>>
>>>> The same can be done here.
>>>
>>> IIRC not only the page table helpers but there are data types (pxx_t) which
>>> were not present on various configs and these wrappers help prevent build
>>> failures. Any ways will try and see if this can be improved further. But
>>> meanwhile if you have some suggestions, please do let me know.
>>
>> pgt_t and pmd_t are everywhere I guess.
>> then pud_t and p4d_t have fallbacks in asm-generic files.
>
> Lets take another example where it fails to compile. On arm64 with 16K
> page size, 48 bits VA, 4 level page table arrangement in the following
> test, pgd_populate() does not have the required signature.
>
> static void pgd_populate_tests(struct mm_struct *mm, pgd_t *pgdp, p4d_t *p4dp)
> {
> pgd_t pgd;
>
> if (mm_p4d_folded(mm))
> return;
>
> /*
> * This entry points to next level page table page.
> * Hence this must not qualify as pgd_bad().
> */
> p4d_clear(p4dp);
> pgd_clear(pgdp);
> pgd_populate(mm, pgdp, p4dp);
> pgd = READ_ONCE(*pgdp);
> WARN_ON(pgd_bad(pgd));
> }
>
> mm/arch_pgtable_test.c: In function ‘pgd_populate_tests’:
> mm/arch_pgtable_test.c:254:25: error: passing argument 3 of ‘pgd_populate’ from incompatible pointer type [-Werror=incompatible-pointer-types]
> pgd_populate(mm, pgdp, p4dp);
> ^~~~
> In file included from mm/arch_pgtable_test.c:27:0:
> ./arch/arm64/include/asm/pgalloc.h:81:20: note: expected ‘pud_t * {aka struct <anonymous> *}’ but argument is of type ‘pgd_t * {aka struct <anonymous> *}’
> static inline void pgd_populate(struct mm_struct *mm, pgd_t *pgdp, pud_t *pudp)
>
> The build failure is because p4d_t * maps to pgd_t * but the applicable
> (it does not fallback on generic ones) pgd_populate() expects a pud_t *.
>
> Except for archs which have 5 level page able, pgd_populate() always accepts
> lower level page table pointers as the last argument as they dont have that
> many levels.
>
> arch/x86/include/asm/pgalloc.h:static inline void pgd_populate(struct mm_struct *mm, pgd_t *pgd, p4d_t *p4d)
> arch/s390/include/asm/pgalloc.h:static inline void pgd_populate(struct mm_struct *mm, pgd_t *pgd, p4d_t *p4d)
>
> But others
>
> arch/arm64/include/asm/pgalloc.h:static inline void pgd_populate(struct mm_struct *mm, pgd_t *pgdp, pud_t *pudp)
> arch/m68k/include/asm/motorola_pgalloc.h:static inline void pgd_populate(struct mm_struct *mm, pgd_t *pgd, pmd_t *pmd)
> arch/mips/include/asm/pgalloc.h:static inline void pgd_populate(struct mm_struct *mm, pgd_t *pgd, pud_t *pud)
> arch/powerpc/include/asm/book3s/64/pgalloc.h:static inline void pgd_populate(struct mm_struct *mm, pgd_t *pgd, pud_t *pud)
>
> I remember going through all these combinations before arriving at the
> current state of #ifdef exclusions. Probably, to solved this all platforms
> have to define pxx_populate() helpers assuming they support 5 level page
> table.
>
>>
>> So it shouldn't be an issue. Maybe if a couple of arches miss them, the best would be to fix the arches, since that's the purpose of your testsuite isn't it ?
>
> The run time failures as explained previously is because of the folding which
> needs to be protected as they are not even applicable. The compile time
> failures are because pxx_populate() signatures are platform specific depending
> on how many page table levels they really support.
>

So IIUC, the compile-time problem is around __ARCH_HAS_5LEVEL_HACK. For
all #if !defined(__PAGETABLE_PXX_FOLDED), something equivalent to the
following should do the trick.

	if (mm_pxx_folded())
		return;


For the __ARCH_HAS_5LEVEL_HACK stuff, I think we should be able to
group all impacted functions inside a single
#ifdef __ARCH_HAS_5LEVEL_HACK block.

Christophe

2019-09-18 18:25:07

by Gerald Schaefer

Subject: Re: [PATCH V2 2/2] mm/pgtable/debug: Add test validating architecture page table helpers

On Wed, 18 Sep 2019 18:26:03 +0200
Christophe Leroy <[email protected]> wrote:

[..]
> My suggestion was not to completely drop the #ifdef but to do like you
> did in pgd_clear_tests() for instance, ie to add the following test on
> top of the function:
>
> if (mm_pud_folded(mm) || is_defined(__ARCH_HAS_5LEVEL_HACK))
> return;
>

Ah, very nice, this would also fix the remaining issues for s390. Since
we have dynamic page table folding, neither __PAGETABLE_PXX_FOLDED nor
__ARCH_HAS_XLEVEL_HACK is defined, but mm_pxx_folded() will work.

mm_alloc() returns with a 3-level page table by default on s390, so we
will run into issues in p4d_clear/populate_tests(), and also at the end
with p4d/pud_free() (double free).

So, adding the mm_pud_folded() check to p4d_clear/populate_tests(),
and also adding mm_p4d/pud_folded() checks at the end before calling
p4d/pud_free(), would make it all work on s390.

BTW, regarding p4d/pud_free(), I'm not sure if we should rather check
the folding inside our s390 functions, similar to how we do it for
p4d/pud_free_tlb(), instead of relying on not being called for folded
p4d/pud. So far, I see no problem with this behavior, all callers of
p4d/pud_free() should be fine because of our folding check within
p4d/pud_present/none(). But that doesn't mean that it is correct not
to check for the folding inside p4d/pud_free(). At least, with this
test module we do now have a caller of p4d/pud_free() on potentially
folded entries, so instead of adding pxx_folded() checks to this
test module, we could add them to our p4d/pud_free() functions.
Any thoughts on this?

Regards,
Gerald

2019-09-19 06:02:54

by Christophe Leroy

Subject: Re: [PATCH V2 2/2] mm/pgtable/debug: Add test validating architecture page table helpers



Le 19/09/2019 à 06:56, Anshuman Khandual a écrit :
>
>
> On 09/18/2019 09:56 PM, Christophe Leroy wrote:
>>
>>
>> Le 18/09/2019 à 07:04, Anshuman Khandual a écrit :
>>>
>>>
>>> On 09/13/2019 03:31 PM, Christophe Leroy wrote:
>>>>
>>>>
>>>> Le 13/09/2019 à 11:02, Anshuman Khandual a écrit :
>>>>>
>>>>>>> +#if !defined(__PAGETABLE_PMD_FOLDED) && !defined(__ARCH_HAS_4LEVEL_HACK)
>>>>>>
>>>>>> #ifdefs have to be avoided as much as possible, see below
>>>>>
>>>>> Yeah but it has been bit difficult to avoid all these $ifdef because of the
>>>>> availability (or lack of it) for all these pgtable helpers in various config
>>>>> combinations on all platforms.
>>>>
>>>> As far as I can see these pgtable helpers should exist everywhere at least via asm-generic/ files.
>>>
>>> But they might not actually do the right thing.
>>>
>>>>
>>>> Can you spot a particular config which fails ?
>>>
>>> Lets consider the following example (after removing the $ifdefs around it)
>>> which though builds successfully but fails to pass the intended test. This
>>> is with arm64 config 4K pages sizes with 39 bits VA space which ends up
>>> with a 3 level page table arrangement.
>>>
>>> static void __init p4d_clear_tests(p4d_t *p4dp)
>>> {
>>>          p4d_t p4d = READ_ONCE(*p4dp);
>>
>> My suggestion was not to completely drop the #ifdef but to do like you did in pgd_clear_tests() for instance, ie to add the following test on top of the function:
>>
>>     if (mm_pud_folded(mm) || is_defined(__ARCH_HAS_5LEVEL_HACK))
>>         return;
>>
>
> Sometimes this does not really work. On some platforms, the combination of
> __PAGETABLE_PUD_FOLDED and __ARCH_HAS_5LEVEL_HACK decides whether
> helpers such as __pud() or __pgd() are even available for that platform.
> Ideally this should have been handled through generic fallbacks in
> include/*/, but I guess there are either bugs on the platform or it has not
> been changed to adopt the 5-level page table framework with the required
> folding macros etc.

Yes. As I suggested below, it's most likely better to retain the
#ifdef __ARCH_HAS_5LEVEL_HACK but replace the #ifdef
__PAGETABLE_PUD_FOLDED with a runtime test of mm_pud_folded(mm).

As pointed out by Gerald, some arches don't have __PAGETABLE_PUD_FOLDED
because they decide dynamically whether they fold the level or not, but
they do have mm_pud_folded(mm).

>
>>>
>>>          p4d = __p4d(p4d_val(p4d) | RANDOM_ORVALUE);
>>>          WRITE_ONCE(*p4dp, p4d);
>>>          p4d_clear(p4dp);
>>>          p4d = READ_ONCE(*p4dp);
>>>          WARN_ON(!p4d_none(p4d));
>>> }
>>>
>>> The following test hits an error at WARN_ON(!p4d_none(p4d))
>>>
>>> [   16.757333] ------------[ cut here ]------------
>>> [   16.758019] WARNING: CPU: 11 PID: 1 at mm/arch_pgtable_test.c:187 arch_pgtable_tests_init+0x24c/0x474

[...]

>>> [   16.781282] ---[ end trace 042e6c40c0a3b038 ]---
>>>
>>> On arm64 (4K page size|39 bits VA|3 level page table)
>>>
>>> #elif CONFIG_PGTABLE_LEVELS == 3    /* Applicable here */
>>> #define __ARCH_USE_5LEVEL_HACK
>>> #include <asm-generic/pgtable-nopud.h>
>>>
>>> Which pulls in
>>>
>>> #include <asm-generic/pgtable-nop4d-hack.h>
>>>
>>> which pulls in
>>>
>>> #include <asm-generic/5level-fixup.h>
>>>
>>> which defines
>>>
>>> static inline int p4d_none(p4d_t p4d)
>>> {
>>>          return 0;
>>> }
>>>
>>> which will invariably trigger WARN_ON(!p4d_none(p4d)).
>>>
>>> Similarly for next test p4d_populate_tests() which will always be
>>> successful because p4d_bad() invariably returns negative.
>>>
>>> static inline int p4d_bad(p4d_t p4d)
>>> {
>>>          return 0;
>>> }
>>>
>>> static void __init p4d_populate_tests(struct mm_struct *mm, p4d_t *p4dp,
>>>                                        pud_t *pudp)
>>> {
>>>          p4d_t p4d;
>>>
>>>          /*
>>>           * This entry points to next level page table page.
>>>           * Hence this must not qualify as p4d_bad().
>>>           */
>>>          pud_clear(pudp);
>>>          p4d_clear(p4dp);
>>>          p4d_populate(mm, p4dp, pudp);
>>>          p4d = READ_ONCE(*p4dp);
>>>          WARN_ON(p4d_bad(p4d));
>>> }
>>>
>>> We should not run these tests for the above config because they are
>>> not applicable and will invariably produce same result.
>>>

[...]

>>>>
>>>> So it shouldn't be an issue. Maybe if a couple of arches miss them, the best would be to fix the arches, since that's the purpose of your testsuite isn't it ?
>>>
>>> The run time failures as explained previously is because of the folding which
>>> needs to be protected as they are not even applicable. The compile time
>>> failures are because pxx_populate() signatures are platform specific depending
>>> on how many page table levels they really support.
>>>
>>
>> So IIUC, the compiletime problem is around __ARCH_HAS_5LEVEL_HACK. For all #if !defined(__PAGETABLE_PXX_FOLDED), something equivalent to the following should make the trick.
>>
>>     if (mm_pxx_folded())
>>         return;
>>
>>
>> For the __ARCH_HAS_5LEVEL_HACK stuff, I think we should be able to regroup all impacted functions inside a single #ifdef __ARCH_HAS_5LEVEL_HACK
>
> I was wondering if it would be better to
>
> 1) Minimize all #ifdefs in the code which might fail on some platforms
> 2) Restrict the proposed test module to platforms where it builds and runs
> 3) Enable other platforms afterwards, after fixing their build problems or other requirements

I understand that __ARCH_HAS_5LEVEL_HACK is a HACK, as its name
suggests, so you can't expect all platforms to go for a HACK. I think
you can keep a single #ifdef __ARCH_HAS_5LEVEL_HACK / #else / #endif and
put all relevant tests inside it.

For things like __PAGETABLE_PXX_FOLDED dependencies, I still think that
they can all be replaced by a runtime test of mm_pxx_folded().

Can you try that and see what problem remains ?

>
> Would that be a better approach instead ?
>

Based on the above, that might be the approach to take, yes.

Christophe

2019-09-19 06:05:29

by Christophe Leroy

Subject: Re: [PATCH] mm/pgtable/debug: Fix test validating architecture page table helpers



Le 18/09/2019 à 09:32, Anshuman Khandual a écrit :
>
>
> On 09/13/2019 11:53 AM, Christophe Leroy wrote:
>> Fix build failure on powerpc.
>>
>> Fix preemption imbalance.
>>
>> Signed-off-by: Christophe Leroy <[email protected]>
>> ---
>> mm/arch_pgtable_test.c | 3 +++
>> 1 file changed, 3 insertions(+)
>>
>> diff --git a/mm/arch_pgtable_test.c b/mm/arch_pgtable_test.c
>> index 8b4a92756ad8..f2b3c9ec35fa 100644
>> --- a/mm/arch_pgtable_test.c
>> +++ b/mm/arch_pgtable_test.c
>> @@ -24,6 +24,7 @@
>> #include <linux/swap.h>
>> #include <linux/swapops.h>
>> #include <linux/sched/mm.h>
>> +#include <linux/highmem.h>
>> #include <asm/pgalloc.h>
>> #include <asm/pgtable.h>
>>
>> @@ -400,6 +401,8 @@ static int __init arch_pgtable_tests_init(void)
>> p4d_clear_tests(p4dp);
>> pgd_clear_tests(mm, pgdp);
>>
>> + pte_unmap(ptep);
>> +
>> pmd_populate_tests(mm, pmdp, saved_ptep);
>> pud_populate_tests(mm, pudp, saved_pmdp);
>> p4d_populate_tests(mm, p4dp, saved_pudp);
>>
>
> Hello Christophe,
>
> I am planning to fold this fix into the current patch and retain your
> Signed-off-by. Are you okay with it ?
>

No problem, do whatever is convenient for you. You can keep the
signed-off-by, or use tested-by: as I tested it on PPC32.

Christophe

2019-09-19 06:56:39

by Anshuman Khandual

Subject: Re: [PATCH V2 2/2] mm/pgtable/debug: Add test validating architecture page table helpers



On 09/18/2019 09:56 PM, Christophe Leroy wrote:
>
>
> Le 18/09/2019 à 07:04, Anshuman Khandual a écrit :
>>
>>
>> On 09/13/2019 03:31 PM, Christophe Leroy wrote:
>>>
>>>
>>> Le 13/09/2019 à 11:02, Anshuman Khandual a écrit :
>>>>
>>>>>> +#if !defined(__PAGETABLE_PMD_FOLDED) && !defined(__ARCH_HAS_4LEVEL_HACK)
>>>>>
>>>>> #ifdefs have to be avoided as much as possible, see below
>>>>
>>>> Yeah but it has been bit difficult to avoid all these $ifdef because of the
>>>> availability (or lack of it) for all these pgtable helpers in various config
>>>> combinations on all platforms.
>>>
>>> As far as I can see these pgtable helpers should exist everywhere at least via asm-generic/ files.
>>
>> But they might not actually do the right thing.
>>
>>>
>>> Can you spot a particular config which fails ?
>>
>> Lets consider the following example (after removing the $ifdefs around it)
>> which though builds successfully but fails to pass the intended test. This
>> is with arm64 config 4K pages sizes with 39 bits VA space which ends up
>> with a 3 level page table arrangement.
>>
>> static void __init p4d_clear_tests(p4d_t *p4dp)
>> {
>>          p4d_t p4d = READ_ONCE(*p4dp);
>
> My suggestion was not to completely drop the #ifdef but to do like you did in pgd_clear_tests() for instance, ie to add the following test on top of the function:
>
>     if (mm_pud_folded(mm) || is_defined(__ARCH_HAS_5LEVEL_HACK))
>         return;
>

Sometimes this does not really work. On some platforms, the combination of
__PAGETABLE_PUD_FOLDED and __ARCH_HAS_5LEVEL_HACK decides whether
helpers such as __pud() or __pgd() are even available for that platform.
Ideally this should have been handled through generic fallbacks in
include/*/, but I guess there are either bugs on the platform or it has not
been changed to adopt the 5-level page table framework with the required
folding macros etc.

>>
>>          p4d = __p4d(p4d_val(p4d) | RANDOM_ORVALUE);
>>          WRITE_ONCE(*p4dp, p4d);
>>          p4d_clear(p4dp);
>>          p4d = READ_ONCE(*p4dp);
>>          WARN_ON(!p4d_none(p4d));
>> }
>>
>> The following test hits an error at WARN_ON(!p4d_none(p4d))
>>
>> [   16.757333] ------------[ cut here ]------------
>> [   16.758019] WARNING: CPU: 11 PID: 1 at mm/arch_pgtable_test.c:187 arch_pgtable_tests_init+0x24c/0x474
>> [   16.759455] Modules linked in:
>> [   16.759952] CPU: 11 PID: 1 Comm: swapper/0 Not tainted 5.3.0-next-20190916-00005-g61c218153bb8-dirty #222
>> [   16.761449] Hardware name: linux,dummy-virt (DT)
>> [   16.762185] pstate: 00400005 (nzcv daif +PAN -UAO)
>> [   16.762964] pc : arch_pgtable_tests_init+0x24c/0x474
>> [   16.763750] lr : arch_pgtable_tests_init+0x174/0x474
>> [   16.764534] sp : ffffffc011d7bd50
>> [   16.765065] x29: ffffffc011d7bd50 x28: ffffffff1756bac0
>> [   16.765908] x27: ffffff85ddaf3000 x26: 00000000000002e8
>> [   16.766767] x25: ffffffc0111ce000 x24: ffffff85ddaf32e8
>> [   16.767606] x23: ffffff85ddaef278 x22: 00000045cc844000
>> [   16.768445] x21: 000000065daef003 x20: ffffffff17540000
>> [   16.769283] x19: ffffff85ddb60000 x18: 0000000000000014
>> [   16.770122] x17: 00000000980426bb x16: 00000000698594c6
>> [   16.770976] x15: 0000000066e25a88 x14: 0000000000000000
>> [   16.771813] x13: ffffffff17540000 x12: 000000000000000a
>> [   16.772651] x11: ffffff85fcfd0a40 x10: 0000000000000001
>> [   16.773488] x9 : 0000000000000008 x8 : ffffffc01143ab26
>> [   16.774336] x7 : 0000000000000000 x6 : 0000000000000000
>> [   16.775180] x5 : 0000000000000000 x4 : 0000000000000000
>> [   16.776018] x3 : ffffffff1756bbe8 x2 : 000000065daeb003
>> [   16.776856] x1 : 000000000065daeb x0 : fffffffffffff000
>> [   16.777693] Call trace:
>> [   16.778092]  arch_pgtable_tests_init+0x24c/0x474
>> [   16.778843]  do_one_initcall+0x74/0x1b0
>> [   16.779458]  kernel_init_freeable+0x1cc/0x290
>> [   16.780151]  kernel_init+0x10/0x100
>> [   16.780710]  ret_from_fork+0x10/0x18
>> [   16.781282] ---[ end trace 042e6c40c0a3b038 ]---
>>
>> On arm64 (4K page size|39 bits VA|3 level page table)
>>
>> #elif CONFIG_PGTABLE_LEVELS == 3    /* Applicable here */
>> #define __ARCH_USE_5LEVEL_HACK
>> #include <asm-generic/pgtable-nopud.h>
>>
>> Which pulls in
>>
>> #include <asm-generic/pgtable-nop4d-hack.h>
>>
>> which pulls in
>>
>> #include <asm-generic/5level-fixup.h>
>>
>> which defines
>>
>> static inline int p4d_none(p4d_t p4d)
>> {
>>          return 0;
>> }
>>
>> which will invariably trigger WARN_ON(!p4d_none(p4d)).
>>
>> Similarly for next test p4d_populate_tests() which will always be
>> successful because p4d_bad() invariably returns negative.
>>
>> static inline int p4d_bad(p4d_t p4d)
>> {
>>          return 0;
>> }
>>
>> static void __init p4d_populate_tests(struct mm_struct *mm, p4d_t *p4dp,
>>                                        pud_t *pudp)
>> {
>>          p4d_t p4d;
>>
>>          /*
>>           * This entry points to next level page table page.
>>           * Hence this must not qualify as p4d_bad().
>>           */
>>          pud_clear(pudp);
>>          p4d_clear(p4dp);
>>          p4d_populate(mm, p4dp, pudp);
>>          p4d = READ_ONCE(*p4dp);
>>          WARN_ON(p4d_bad(p4d));
>> }
>>
>> We should not run these tests for the above config because they are
>> not applicable and will invariably produce same result.
>>
>>>
>>>>
>>>>>
>>>
>>> [...]
>>>
>>>>>> +#if !defined(__PAGETABLE_PUD_FOLDED) && !defined(__ARCH_HAS_5LEVEL_HACK)
>>>>>
>>>>> The same can be done here.
>>>>
>>>> IIRC it is not only the page table helpers; there are data types (pxx_t)
>>>> which are not present on various configs, and these wrappers help prevent
>>>> build failures. Anyway, I will try and see if this can be improved further.
>>>> But meanwhile, if you have some suggestions, please do let me know.
>>>
>>> pgd_t and pmd_t are everywhere I guess.
>>> Then pud_t and p4d_t have fallbacks in asm-generic files.
>>
>> Let's take another example where it fails to compile. On arm64 with a 16K
>> page size, 48-bit VA, 4-level page table arrangement, pgd_populate() does
>> not have the required signature for the following test.
>>
>> static void pgd_populate_tests(struct mm_struct *mm, pgd_t *pgdp, p4d_t *p4dp)
>> {
>>          pgd_t pgd;
>>
>>          if (mm_p4d_folded(mm))
>>                  return;
>>
>>         /*
>>           * This entry points to next level page table page.
>>           * Hence this must not qualify as pgd_bad().
>>           */
>>          p4d_clear(p4dp);
>>          pgd_clear(pgdp);
>>          pgd_populate(mm, pgdp, p4dp);
>>          pgd = READ_ONCE(*pgdp);
>>          WARN_ON(pgd_bad(pgd));
>> }
>>
>> mm/arch_pgtable_test.c: In function ‘pgd_populate_tests’:
>> mm/arch_pgtable_test.c:254:25: error: passing argument 3 of ‘pgd_populate’ from incompatible pointer type [-Werror=incompatible-pointer-types]
>>    pgd_populate(mm, pgdp, p4dp);
>>                           ^~~~
>> In file included from mm/arch_pgtable_test.c:27:0:
>> ./arch/arm64/include/asm/pgalloc.h:81:20: note: expected ‘pud_t * {aka struct <anonymous> *}’ but argument is of type ‘pgd_t * {aka struct <anonymous> *}’
>>   static inline void pgd_populate(struct mm_struct *mm, pgd_t *pgdp, pud_t *pudp)
>>
>> The build failure is because p4d_t * maps to pgd_t * but the applicable
>> pgd_populate() (it does not fall back on the generic one) expects a pud_t *.
>>
>> Except for the archs which have 5-level page tables, pgd_populate() always
>> accepts a lower-level page table pointer as the last argument, as they
>> don't have that many levels.
>>
>> arch/x86/include/asm/pgalloc.h:static inline void pgd_populate(struct mm_struct *mm, pgd_t *pgd, p4d_t *p4d)
>> arch/s390/include/asm/pgalloc.h:static inline void pgd_populate(struct mm_struct *mm, pgd_t *pgd, p4d_t *p4d)
>>
>> But others
>>
>> arch/arm64/include/asm/pgalloc.h:static inline void pgd_populate(struct mm_struct *mm, pgd_t *pgdp, pud_t *pudp)
>> arch/m68k/include/asm/motorola_pgalloc.h:static inline void pgd_populate(struct mm_struct *mm, pgd_t *pgd, pmd_t *pmd)
>> arch/mips/include/asm/pgalloc.h:static inline void pgd_populate(struct mm_struct *mm, pgd_t *pgd, pud_t *pud)
>> arch/powerpc/include/asm/book3s/64/pgalloc.h:static inline void pgd_populate(struct mm_struct *mm, pgd_t *pgd, pud_t *pud)
>>
>> I remember going through all these combinations before arriving at the
>> current state of #ifdef exclusions. Probably, to solve this, all platforms
>> would have to define pxx_populate() helpers as if they supported 5-level
>> page tables.
>>
>>>
>>> So it shouldn't be an issue. Maybe if a couple of arches miss them, the best would be to fix the arches, since that's the purpose of your test suite, isn't it?
>>
>> The run time failures as explained previously is because of the folding which
>> needs to be protected as they are not even applicable. The compile time
>> failures are because pxx_populate() signatures are platform specific depending
>> on how many page table levels they really support.
>>
>
> So IIUC, the compile-time problem is around __ARCH_HAS_5LEVEL_HACK. For all #if !defined(__PAGETABLE_PXX_FOLDED), something equivalent to the following should do the trick.
>
>     if (mm_pxx_folded())
>         return;
>
>
> For the __ARCH_HAS_5LEVEL_HACK stuff, I think we should be able to regroup all impacted functions inside a single #ifdef __ARCH_HAS_5LEVEL_HACK

I was wondering if it would be better to

1) Minimize the #ifdefs in the code that might fail on some platforms
2) Restrict the proposed test module to platforms where it builds and runs
3) Enable other platforms later, after fixing their build problems or other requirements

Would that be a better approach instead?

2019-09-20 22:58:38

by Anshuman Khandual

[permalink] [raw]
Subject: Re: [PATCH V2 2/2] mm/pgtable/debug: Add test validating architecture page table helpers



On 09/18/2019 11:52 PM, Gerald Schaefer wrote:
> On Wed, 18 Sep 2019 18:26:03 +0200
> Christophe Leroy <[email protected]> wrote:
>
> [..]
>> My suggestion was not to completely drop the #ifdef but to do like you
>> did in pgd_clear_tests() for instance, ie to add the following test on
>> top of the function:
>>
>> if (mm_pud_folded(mm) || is_defined(__ARCH_HAS_5LEVEL_HACK))
>> return;
>>
>
> Ah, very nice, this would also fix the remaining issues for s390. Since
> we have dynamic page table folding, neither __PAGETABLE_PXX_FOLDED nor
> __ARCH_HAS_XLEVEL_HACK is defined, but mm_pxx_folded() will work.

As Christophe mentioned earlier on the other thread, we will convert
all __PAGETABLE_PXX_FOLDED checks to mm_pxx_folded(), but it looks like
the ARCH_HAS_[4 and 5]LEVEL_HACK macros will still be around. I will
respin the series with all the agreed-upon changes first, and we can
then discuss the pending issues from there.

>
> mm_alloc() returns with a 3-level page table by default on s390, so we
> will run into issues in p4d_clear/populate_tests(), and also at the end
> with p4d/pud_free() (double free).
>
> So, adding the mm_pud_folded() check to p4d_clear/populate_tests(),
> and also adding mm_p4d/pud_folded() checks at the end before calling
> p4d/pud_free(), would make it all work on s390.

At least the p4d_clear/populate_tests() tests will be taken care of.

>
> BTW, regarding p4d/pud_free(), I'm not sure if we should rather check
> the folding inside our s390 functions, similar to how we do it for
> p4d/pud_free_tlb(), instead of relying on not being called for folded
> p4d/pud. So far, I see no problem with this behavior, all callers of
> p4d/pud_free() should be fine because of our folding check within
> p4d/pud_present/none(). But that doesn't mean that it is correct not
> to check for the folding inside p4d/pud_free(). At least, with this
> test module we do now have a caller of p4d/pud_free() on potentially
> folded entries, so instead of adding pxx_folded() checks to this
> test module, we could add them to our p4d/pud_free() functions.
> Any thoughts on this?

Agreed, it seems better to do the check inside the p4d/pud_free() functions.