Date: Wed, 28 Jul 2021 09:53:26 +0200
Message-ID: <20210728095326.Horde.k1npSPaQKh2i7W3XoBsdiQ3@messagerie.c-s.fr>
From: Christophe Leroy
To: Gavin Shan
Cc: shan.gavin@gmail.com, chuhu@redhat.com, akpm@linux-foundation.org, will@kernel.org, catalin.marinas@arm.com, cai@lca.pw, aneesh.kumar@linux.ibm.com, gerald.schaefer@linux.ibm.com, anshuman.khandual@arm.com, linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH v4 12/12] mm/debug_vm_pgtable: Fix corrupted page flag
References: <20210727061401.592616-1-gshan@redhat.com> <20210727061401.592616-13-gshan@redhat.com>
In-Reply-To: <20210727061401.592616-13-gshan@redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Gavin Shan wrote:

> In page table entry modifying tests, set_xxx_at() are used to populate
> the page table entries. On ARM64, PG_arch_1 (PG_dcache_clean) flag is
> set to the target page flag if execution permission is given. The logic
> exists since commit 4f04d8f00545 ("arm64: MMU definitions"). The page
> flag is kept when the page is free'd to buddy's free area list. However,
> it will trigger page checking failure when it's pulled from the buddy's
> free area list, as the following warning messages indicate.
>
>    BUG: Bad page state in process memhog  pfn:08000
>    page:0000000015c0a628 refcount:0 mapcount:0 \
>         mapping:0000000000000000 index:0x1 pfn:0x8000
>    flags: 0x7ffff8000000800(arch_1|node=0|zone=0|lastcpupid=0xfffff)
>    raw: 07ffff8000000800 dead000000000100 dead000000000122 0000000000000000
>    raw: 0000000000000001 0000000000000000 00000000ffffffff 0000000000000000
>    page dumped because: PAGE_FLAGS_CHECK_AT_PREP flag(s) set
>
> This fixes the issue by clearing PG_arch_1 through flush_dcache_page()
> after set_xxx_at() is called. For architectures other than ARM64, the
> unexpected overhead of cache flushing is acceptable.
>
> Signed-off-by: Gavin Shan

Maybe a Fixes: tag would be good to have.

And would it be possible to have this fix as the first patch of the series,
so that it can be applied to stable without applying the whole series?

Christophe

> ---
>  mm/debug_vm_pgtable.c | 55 +++++++++++++++++++++++++++++++++++++++----
>  1 file changed, 51 insertions(+), 4 deletions(-)
>
> diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
> index 162ff6329f7b..d2c2d23e542e 100644
> --- a/mm/debug_vm_pgtable.c
> +++ b/mm/debug_vm_pgtable.c
> @@ -29,6 +29,8 @@
>  #include
>  #include
>  #include
> +
> +#include
>  #include
>  #include
>
> @@ -119,19 +121,28 @@ static void __init pte_basic_tests(struct pgtable_debug_args *args, int idx)
>
>  static void __init pte_advanced_tests(struct pgtable_debug_args *args)
>  {
> +	struct page *page;
>  	pte_t pte;
>
>  	/*
>  	 * Architectures optimize set_pte_at by avoiding TLB flush.
>  	 * This requires set_pte_at to be not used to update an
>  	 * existing pte entry. Clear pte before we do set_pte_at
> +	 *
> +	 * flush_dcache_page() is called after set_pte_at() to clear
> +	 * PG_arch_1 for the page on ARM64. The page flag isn't cleared
> +	 * when it's released and page allocation check will fail when
> +	 * the page is allocated again. For architectures other than ARM64,
> +	 * the unexpected overhead of cache flushing is acceptable.
>  	 */
> -	if (args->pte_pfn == ULONG_MAX)
> +	page = (args->pte_pfn != ULONG_MAX) ? pfn_to_page(args->pte_pfn) : NULL;
> +	if (!page)
>  		return;
>
>  	pr_debug("Validating PTE advanced\n");
>  	pte = pfn_pte(args->pte_pfn, args->page_prot);
>  	set_pte_at(args->mm, args->vaddr, args->ptep, pte);
> +	flush_dcache_page(page);
>  	ptep_set_wrprotect(args->mm, args->vaddr, args->ptep);
>  	pte = ptep_get(args->ptep);
>  	WARN_ON(pte_write(pte));
> @@ -143,6 +154,7 @@ static void __init pte_advanced_tests(struct pgtable_debug_args *args)
>  	pte = pte_wrprotect(pte);
>  	pte = pte_mkclean(pte);
>  	set_pte_at(args->mm, args->vaddr, args->ptep, pte);
> +	flush_dcache_page(page);
>  	pte = pte_mkwrite(pte);
>  	pte = pte_mkdirty(pte);
>  	ptep_set_access_flags(args->vma, args->vaddr, args->ptep, pte, 1);
> @@ -155,6 +167,7 @@ static void __init pte_advanced_tests(struct pgtable_debug_args *args)
>  	pte = pfn_pte(args->pte_pfn, args->page_prot);
>  	pte = pte_mkyoung(pte);
>  	set_pte_at(args->mm, args->vaddr, args->ptep, pte);
> +	flush_dcache_page(page);
>  	ptep_test_and_clear_young(args->vma, args->vaddr, args->ptep);
>  	pte = ptep_get(args->ptep);
>  	WARN_ON(pte_young(pte));
> @@ -213,15 +226,24 @@ static void __init pmd_basic_tests(struct pgtable_debug_args *args, int idx)
>
>  static void __init pmd_advanced_tests(struct pgtable_debug_args *args)
>  {
> +	struct page *page;
>  	pmd_t pmd;
>  	unsigned long vaddr = args->vaddr;
>
>  	if (!has_transparent_hugepage())
>  		return;
>
> -	if (args->pmd_pfn == ULONG_MAX)
> +	page = (args->pmd_pfn != ULONG_MAX) ? pfn_to_page(args->pmd_pfn) : NULL;
> +	if (!page)
>  		return;
>
> +	/*
> +	 * flush_dcache_page() is called after set_pmd_at() to clear
> +	 * PG_arch_1 for the page on ARM64. The page flag isn't cleared
> +	 * when it's released and page allocation check will fail when
> +	 * the page is allocated again. For architectures other than ARM64,
> +	 * the unexpected overhead of cache flushing is acceptable.
> +	 */
>  	pr_debug("Validating PMD advanced\n");
>  	/* Align the address wrt HPAGE_PMD_SIZE */
>  	vaddr &= HPAGE_PMD_MASK;
> @@ -230,6 +252,7 @@ static void __init pmd_advanced_tests(struct pgtable_debug_args *args)
>
>  	pmd = pfn_pmd(args->pmd_pfn, args->page_prot);
>  	set_pmd_at(args->mm, vaddr, args->pmdp, pmd);
> +	flush_dcache_page(page);
>  	pmdp_set_wrprotect(args->mm, vaddr, args->pmdp);
>  	pmd = READ_ONCE(*args->pmdp);
>  	WARN_ON(pmd_write(pmd));
> @@ -241,6 +264,7 @@ static void __init pmd_advanced_tests(struct pgtable_debug_args *args)
>  	pmd = pmd_wrprotect(pmd);
>  	pmd = pmd_mkclean(pmd);
>  	set_pmd_at(args->mm, vaddr, args->pmdp, pmd);
> +	flush_dcache_page(page);
>  	pmd = pmd_mkwrite(pmd);
>  	pmd = pmd_mkdirty(pmd);
>  	pmdp_set_access_flags(args->vma, vaddr, args->pmdp, pmd, 1);
> @@ -253,6 +277,7 @@ static void __init pmd_advanced_tests(struct pgtable_debug_args *args)
>  	pmd = pmd_mkhuge(pfn_pmd(args->pmd_pfn, args->page_prot));
>  	pmd = pmd_mkyoung(pmd);
>  	set_pmd_at(args->mm, vaddr, args->pmdp, pmd);
> +	flush_dcache_page(page);
>  	pmdp_test_and_clear_young(args->vma, vaddr, args->pmdp);
>  	pmd = READ_ONCE(*args->pmdp);
>  	WARN_ON(pmd_young(pmd));
> @@ -339,21 +364,31 @@ static void __init pud_basic_tests(struct pgtable_debug_args *args, int idx)
>
>  static void __init pud_advanced_tests(struct pgtable_debug_args *args)
>  {
> +	struct page *page;
>  	unsigned long vaddr = args->vaddr;
>  	pud_t pud;
>
>  	if (!has_transparent_hugepage())
>  		return;
>
> -	if (args->pud_pfn == ULONG_MAX)
> +	page = (args->pud_pfn != ULONG_MAX) ? pfn_to_page(args->pud_pfn) : NULL;
> +	if (!page)
>  		return;
>
> +	/*
> +	 * flush_dcache_page() is called after set_pud_at() to clear
> +	 * PG_arch_1 for the page on ARM64. The page flag isn't cleared
> +	 * when it's released and page allocation check will fail when
> +	 * the page is allocated again. For architectures other than ARM64,
> +	 * the unexpected overhead of cache flushing is acceptable.
> +	 */
>  	pr_debug("Validating PUD advanced\n");
>  	/* Align the address wrt HPAGE_PUD_SIZE */
>  	vaddr &= HPAGE_PUD_MASK;
>
>  	pud = pfn_pud(args->pud_pfn, args->page_prot);
>  	set_pud_at(args->mm, vaddr, args->pudp, pud);
> +	flush_dcache_page(page);
>  	pudp_set_wrprotect(args->mm, vaddr, args->pudp);
>  	pud = READ_ONCE(*args->pudp);
>  	WARN_ON(pud_write(pud));
> @@ -367,6 +402,7 @@ static void __init pud_advanced_tests(struct pgtable_debug_args *args)
>  	pud = pud_wrprotect(pud);
>  	pud = pud_mkclean(pud);
>  	set_pud_at(args->mm, vaddr, args->pudp, pud);
> +	flush_dcache_page(page);
>  	pud = pud_mkwrite(pud);
>  	pud = pud_mkdirty(pud);
>  	pudp_set_access_flags(args->vma, vaddr, args->pudp, pud, 1);
> @@ -382,6 +418,7 @@ static void __init pud_advanced_tests(struct pgtable_debug_args *args)
>  	pud = pfn_pud(args->pud_pfn, args->page_prot);
>  	pud = pud_mkyoung(pud);
>  	set_pud_at(args->mm, vaddr, args->pudp, pud);
> +	flush_dcache_page(page);
>  	pudp_test_and_clear_young(args->vma, vaddr, args->pudp);
>  	pud = READ_ONCE(*args->pudp);
>  	WARN_ON(pud_young(pud));
> @@ -594,16 +631,26 @@ static void __init pgd_populate_tests(struct pgtable_debug_args *args) { }
>
>  static void __init pte_clear_tests(struct pgtable_debug_args *args)
>  {
> +	struct page *page;
>  	pte_t pte = pfn_pte(args->pte_pfn, args->page_prot);
>
> -	if (args->pte_pfn == ULONG_MAX)
> +	page = (args->pte_pfn != ULONG_MAX) ? pfn_to_page(args->pte_pfn) : NULL;
> +	if (!page)
>  		return;
>
> +	/*
> +	 * flush_dcache_page() is called after set_pte_at() to clear
> +	 * PG_arch_1 for the page on ARM64. The page flag isn't cleared
> +	 * when it's released and page allocation check will fail when
> +	 * the page is allocated again. For architectures other than ARM64,
> +	 * the unexpected overhead of cache flushing is acceptable.
> +	 */
>  	pr_debug("Validating PTE clear\n");
>  #ifndef CONFIG_RISCV
>  	pte = __pte(pte_val(pte) | RANDOM_ORVALUE);
>  #endif
>  	set_pte_at(args->mm, args->vaddr, args->ptep, pte);
> +	flush_dcache_page(page);
>  	barrier();
>  	pte_clear(args->mm, args->vaddr, args->ptep);
>  	pte = ptep_get(args->ptep);
> --
> 2.23.0
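
For anyone following the thread, the failure mode described in the quoted commit message can be sketched as a small user-space model. This is only an illustration under stated assumptions, not kernel code: struct toy_page, TOY_PG_ARCH_1 and all toy_* functions are hypothetical stand-ins, and the bit position is simply taken from the 0x800 that the warning decodes as arch_1 (the real position depends on the kernel configuration).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for PG_arch_1 (PG_dcache_clean on ARM64).
 * Bit 11 mirrors the 0x800 shown as arch_1 in the quoted warning.
 */
#define TOY_PG_ARCH_1 (1UL << 11)

/* Hypothetical, heavily reduced model of struct page. */
struct toy_page {
	unsigned long flags;
};

/*
 * Models the ARM64 side effect from the commit message: populating an
 * entry with execution permission sets the flag on the target page.
 */
static void toy_set_pte_at(struct toy_page *page, bool exec)
{
	assert(page != NULL);
	if (exec)
		page->flags |= TOY_PG_ARCH_1;
}

/* Models the effect the patch relies on: the flush clears the flag. */
static void toy_flush_dcache_page(struct toy_page *page)
{
	assert(page != NULL);
	page->flags &= ~TOY_PG_ARCH_1;
}

/* Models the allocation-time check: a stale flag means "bad page". */
static bool toy_check_new_page(const struct toy_page *page)
{
	assert(page != NULL);
	return (page->flags & TOY_PG_ARCH_1) == 0;
}

/*
 * One test step followed by free and re-allocation. The free path is
 * modelled as a no-op on flags, since the flag is kept when the page
 * goes back to the free list. Returns true if the page passes the check.
 */
static bool toy_run_test(bool flush_after_set)
{
	struct toy_page page = { .flags = 0 };

	toy_set_pte_at(&page, true);
	if (flush_after_set)
		toy_flush_dcache_page(&page);

	return toy_check_new_page(&page);
}
```

In this model, toy_run_test(false) corresponds to the behaviour before the patch (the stale flag trips the check) and toy_run_test(true) to the behaviour with flush_dcache_page() added after each set_xxx_at() call.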