Received: by 2002:a25:1506:0:0:0:0:0 with SMTP id 6csp3856718ybv; Mon, 10 Feb 2020 07:38:47 -0800 (PST) X-Google-Smtp-Source: APXvYqwII0mHcZNV1iltbQs/EcCFqtILdeR5GfhYLEfV5ozzgScIBTED+7gev/1xO/hnqBLkfquY X-Received: by 2002:aca:4e02:: with SMTP id c2mr1226299oib.142.1581349127015; Mon, 10 Feb 2020 07:38:47 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1581349127; cv=none; d=google.com; s=arc-20160816; b=izY6k2b7fqRu3xzjNr4VwaV5D6dj+8/nx3h8iZU9eTFyKtK/wEtcxsHB9AGcs7E8FZ 8JKm1IcMgdHtHx6oOYdJi50oaVgqmoX9wJY5RstsB51oo21KEzN8s1mbmjE8AyGuhg+B GNZkp2qk8ZY6AcmDs4eR5mRBsCzyQ1r4DOy/wIdHSNlJ8PjnM7hUCw4ijyMsxNIb1yb8 HHieorpOs+cciLxgFJQubutriLIkv5LZ20Wv8NThXLOZifNyjdK/mg7B5Cm2Wxu2aVtP YOnzHCsVRShAGO/ZFPQ9++MYkLB+iwXEt+bcfzAJ216jr7263CY15iZ/b+Dmjp4IfPuQ 4YWg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:in-reply-to:content-disposition :mime-version:references:message-id:subject:cc:to:from:date; bh=dyRf2XqxxxFrI/l127+WgbdhdjkZswwqsmHfKTdITkQ=; b=FgIo0Ds5bhbXHUMVUUMmQDBvIaBcqnbdSgVv2oXlil20+32kLpBuKIX8hmzBcn2nSG P+Wtmhk+Bm5s6XTXwaLFLWPl5jU5XRqEcy1c1Gn3Mkem2qoE10MuVH7Zx5td0+FT59fj HTqmmGmTRcJOUyZApy3KzKLL8R5acDmRsUtpPit0vOSqqNnfflF02M7xlZjFFhyTDavo UOTy/0KQtGYXhZJN8FdNJgUjdxpzIKhYJVp0FcmXSIsphLGXV11HAntSpsZEEIbkPI3Y wKHmhXZw8lUAoRolmfhcevF5oDoKxR2i+iwlW8xdzDbrTFszmSPiAQqRWqT7mxuUaWpK 2OKA== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id l9si353258oti.229.2020.02.10.07.38.34; Mon, 10 Feb 2020 07:38:47 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727752AbgBJPha (ORCPT + 99 others); Mon, 10 Feb 2020 10:37:30 -0500 Received: from foss.arm.com ([217.140.110.172]:35344 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726816AbgBJPha (ORCPT ); Mon, 10 Feb 2020 10:37:30 -0500 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 7F75B1FB; Mon, 10 Feb 2020 07:37:28 -0800 (PST) Received: from E121110.arm.com (C02TF0J2HF1T.cambridge.arm.com [10.1.26.172]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 1A4B13F68E; Mon, 10 Feb 2020 07:37:18 -0800 (PST) Date: Mon, 10 Feb 2020 15:37:16 +0000 From: Catalin Marinas To: Anshuman Khandual Cc: linux-mm@kvack.org, Mark Rutland , linux-ia64@vger.kernel.org, linux-sh@vger.kernel.org, Peter Zijlstra , James Hogan , Tetsuo Handa , Heiko Carstens , Michal Hocko , Dave Hansen , Paul Mackerras , sparclinux@vger.kernel.org, Thomas Gleixner , linux-s390@vger.kernel.org, Michael Ellerman , x86@kernel.org, Russell King - ARM Linux , Matthew Wilcox , Steven Price , Jason Gunthorpe , Gerald Schaefer , linux-snps-arc@lists.infradead.org, linux-arm-kernel@lists.infradead.org, Ingo Molnar , Kees Cook , Masahiro Yamada , Mark Brown , "Kirill A . Shutemov" , Dan Williams , Vlastimil Babka , Christophe Leroy , Sri Krishna chowdary , Ard Biesheuvel , Greg Kroah-Hartman , linux-mips@vger.kernel.org, Ralf Baechle , linux-kernel@vger.kernel.org, Paul Burton , Mike Rapoport , Vineet Gupta , Martin Schwidefsky , Andrew Morton , linuxppc-dev@lists.ozlabs.org, "David S. Miller" Subject: Re: [PATCH V12] mm/debug: Add tests validating architecture page table helpers Message-ID: <20200210153716.GB9283@E121110.arm.com> References: <1580174873-18117-1-git-send-email-anshuman.khandual@arm.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1580174873-18117-1-git-send-email-anshuman.khandual@arm.com> Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue, Jan 28, 2020 at 06:57:53AM +0530, Anshuman Khandual wrote: > This gets build and run when CONFIG_DEBUG_VM_PGTABLE is selected along with > CONFIG_VM_DEBUG. Architectures willing to subscribe this test also need to > select CONFIG_ARCH_HAS_DEBUG_VM_PGTABLE which for now is limited to x86 and > arm64. Going forward, other architectures too can enable this after fixing > build or runtime problems (if any) with their page table helpers. It may be worth posting the next version to linux-arch to reach out to other arch maintainers. Also I've seen that you posted a v13 but it hasn't reached linux-arm-kernel (likely held in moderation because of the large amount of addresses cc'ed) and I don't normally follow LKML. I'm not cc'ed to this patch either (which is fine as long as you post to a list that I read). Since I started the reply on v12 about a week ago, I'll follow up here. When you post a v14, please trim the people on cc only to those strictly necessary (e.g. arch maintainers, linux-mm, linux-arch and lkml). > diff --git a/Documentation/features/debug/debug-vm-pgtable/arch-support.txt b/Documentation/features/debug/debug-vm-pgtable/arch-support.txt > new file mode 100644 > index 000000000000..f3f8111edbe3 > --- /dev/null > +++ b/Documentation/features/debug/debug-vm-pgtable/arch-support.txt > @@ -0,0 +1,35 @@ > +# > +# Feature name: debug-vm-pgtable > +# Kconfig: ARCH_HAS_DEBUG_VM_PGTABLE > +# description: arch supports pgtable tests for semantics compliance > +# > + ----------------------- > + | arch |status| > + ----------------------- > + | alpha: | TODO | > + | arc: | ok | > + | arm: | TODO | I'm sure you can find some arm32 hardware around (or a VM) to give this a try ;). > diff --git a/arch/x86/include/asm/pgtable_64.h b/arch/x86/include/asm/pgtable_64.h > index 0b6c4042942a..fb0e76d254b3 100644 > --- a/arch/x86/include/asm/pgtable_64.h > +++ b/arch/x86/include/asm/pgtable_64.h [...] > @@ -1197,6 +1197,7 @@ static noinline void __init kernel_init_freeable(void) > sched_init_smp(); > > page_alloc_init_late(); > + debug_vm_pgtable(); > /* Initialize page ext after all struct pages are initialized. */ > page_ext_init(); I guess you could even make debug_vm_pgtable() an early_initcall(). I don't have a strong opinion either way. > diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c > new file mode 100644 > index 000000000000..0f37f32d15f1 > --- /dev/null > +++ b/mm/debug_vm_pgtable.c > @@ -0,0 +1,388 @@ [...] > +/* > + * Basic operations > + * > + * mkold(entry) = An old and not a young entry > + * mkyoung(entry) = A young and not an old entry > + * mkdirty(entry) = A dirty and not a clean entry > + * mkclean(entry) = A clean and not a dirty entry > + * mkwrite(entry) = A write and not a write protected entry > + * wrprotect(entry) = A write protected and not a write entry > + * pxx_bad(entry) = A mapped and non-table entry > + * pxx_same(entry1, entry2) = Both entries hold the exact same value > + */ > +#define VMFLAGS (VM_READ|VM_WRITE|VM_EXEC) > + > +/* > + * On s390 platform, the lower 12 bits are used to identify given page table > + * entry type and for other arch specific requirements. But these bits might > + * affect the ability to clear entries with pxx_clear(). So while loading up > + * the entries skip all lower 12 bits in order to accommodate s390 platform. > + * It does not have affect any other platform. > + */ > +#define RANDOM_ORVALUE (0xfffffffffffff000UL) I'd suggest you generate this mask with something like GENMASK(BITS_PER_LONG, PAGE_SHIFT). > +#define RANDOM_NZVALUE (0xff) > + > +static void __init pte_basic_tests(unsigned long pfn, pgprot_t prot) > +{ > + pte_t pte = pfn_pte(pfn, prot); > + > + WARN_ON(!pte_same(pte, pte)); > + WARN_ON(!pte_young(pte_mkyoung(pte))); > + WARN_ON(!pte_dirty(pte_mkdirty(pte))); > + WARN_ON(!pte_write(pte_mkwrite(pte))); > + WARN_ON(pte_young(pte_mkold(pte))); > + WARN_ON(pte_dirty(pte_mkclean(pte))); > + WARN_ON(pte_write(pte_wrprotect(pte))); Given that you start with rwx permissions set, some of these ops would not have any effect. For example, on arm64 at least, mkwrite clears a bit already cleared here. You could try with multiple rwx combinations values (e.g. all set and all cleared) or maybe something like below: WARN_ON(!pte_write(pte_mkwrite(pte_wrprotect(pte)))); You could also try something like this: WARN_ON(!pte_same(pte_wrprotect(pte), pte_wrprotect(pte_mkwrite(pte)))); though the above approach may not work for arm64 ptep_set_wrprotect() on a dirty pte (if you extend these tests later). > +} > + > +#ifdef CONFIG_TRANSPARENT_HUGEPAGE > +static void __init pmd_basic_tests(unsigned long pfn, pgprot_t prot) > +{ > + pmd_t pmd = pfn_pmd(pfn, prot); > + > + WARN_ON(!pmd_same(pmd, pmd)); > + WARN_ON(!pmd_young(pmd_mkyoung(pmd))); > + WARN_ON(!pmd_dirty(pmd_mkdirty(pmd))); > + WARN_ON(!pmd_write(pmd_mkwrite(pmd))); > + WARN_ON(pmd_young(pmd_mkold(pmd))); > + WARN_ON(pmd_dirty(pmd_mkclean(pmd))); > + WARN_ON(pmd_write(pmd_wrprotect(pmd))); > + /* > + * A huge page does not point to next level page table > + * entry. Hence this must qualify as pmd_bad(). > + */ > + WARN_ON(!pmd_bad(pmd_mkhuge(pmd))); > +} > + > +#ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD > +static void __init pud_basic_tests(unsigned long pfn, pgprot_t prot) > +{ > + pud_t pud = pfn_pud(pfn, prot); > + > + WARN_ON(!pud_same(pud, pud)); > + WARN_ON(!pud_young(pud_mkyoung(pud))); > + WARN_ON(!pud_write(pud_mkwrite(pud))); > + WARN_ON(pud_write(pud_wrprotect(pud))); > + WARN_ON(pud_young(pud_mkold(pud))); > + > + if (mm_pmd_folded(mm) || __is_defined(ARCH_HAS_4LEVEL_HACK)) > + return; > + > + /* > + * A huge page does not point to next level page table > + * entry. Hence this must qualify as pud_bad(). > + */ > + WARN_ON(!pud_bad(pud_mkhuge(pud))); > +} > +#else > +static void __init pud_basic_tests(unsigned long pfn, pgprot_t prot) { } > +#endif > +#else > +static void __init pmd_basic_tests(unsigned long pfn, pgprot_t prot) { } > +static void __init pud_basic_tests(unsigned long pfn, pgprot_t prot) { } > +#endif > + > +static void __init p4d_basic_tests(unsigned long pfn, pgprot_t prot) > +{ > + p4d_t p4d; > + > + memset(&p4d, RANDOM_NZVALUE, sizeof(p4d_t)); > + WARN_ON(!p4d_same(p4d, p4d)); > +} > + > +static void __init pgd_basic_tests(unsigned long pfn, pgprot_t prot) > +{ > + pgd_t pgd; > + > + memset(&pgd, RANDOM_NZVALUE, sizeof(pgd_t)); > + WARN_ON(!pgd_same(pgd, pgd)); > +} > + > +#ifndef __ARCH_HAS_4LEVEL_HACK This macro doesn't exist in the kernel anymore (it's a 5LEVEL now). But can you not use the __PAGETABLE_PUD_FOLDED instead? > +static void __init pud_clear_tests(struct mm_struct *mm, pud_t *pudp) > +{ > + pud_t pud = READ_ONCE(*pudp); > + > + if (mm_pmd_folded(mm)) > + return; > + > + pud = __pud(pud_val(pud) | RANDOM_ORVALUE); > + WRITE_ONCE(*pudp, pud); > + pud_clear(pudp); > + pud = READ_ONCE(*pudp); > + WARN_ON(!pud_none(pud)); > +} > + > +static void __init pud_populate_tests(struct mm_struct *mm, pud_t *pudp, > + pmd_t *pmdp) > +{ > + pud_t pud; > + > + if (mm_pmd_folded(mm)) > + return; > + /* > + * This entry points to next level page table page. > + * Hence this must not qualify as pud_bad(). > + */ > + pmd_clear(pmdp); > + pud_clear(pudp); > + pud_populate(mm, pudp, pmdp); > + pud = READ_ONCE(*pudp); > + WARN_ON(pud_bad(pud)); > +} > +#else > +static void __init pud_clear_tests(struct mm_struct *mm, pud_t *pudp) { } > +static void __init pud_populate_tests(struct mm_struct *mm, pud_t *pudp, > + pmd_t *pmdp) > +{ > +} > +#endif > + > +#ifndef __ARCH_HAS_5LEVEL_HACK Could you use __PAGETABLE_P4D_FOLDED instead? > +static void __init p4d_clear_tests(struct mm_struct *mm, p4d_t *p4dp) > +{ > + p4d_t p4d = READ_ONCE(*p4dp); > + > + if (mm_pud_folded(mm)) > + return; > + > + p4d = __p4d(p4d_val(p4d) | RANDOM_ORVALUE); > + WRITE_ONCE(*p4dp, p4d); > + p4d_clear(p4dp); > + p4d = READ_ONCE(*p4dp); > + WARN_ON(!p4d_none(p4d)); > +} Otherwise the patch looks fine. As per the comment on v13, make sure you don't break the build on any architecture, so this could either be an opt-in or patch those architectures before this patch is applied. Thanks. -- Catalin