From: Muchun Song <songmuchun@bytedance.com>
To: dan.j.williams@intel.com, willy@infradead.org, jack@suse.cz,
    viro@zeniv.linux.org.uk, akpm@linux-foundation.org, apopple@nvidia.com,
    shy828301@gmail.com, rcampbell@nvidia.com, hughd@google.com,
    xiyuyang19@fudan.edu.cn, kirill.shutemov@linux.intel.com,
    zwisler@kernel.org, hch@infradead.org
Cc: linux-fsdevel@vger.kernel.org, nvdimm@lists.linux.dev,
    linux-kernel@vger.kernel.org, linux-mm@kvack.org,
    duanxiongchun@bytedance.com, smuchun@gmail.com,
    Muchun Song <songmuchun@bytedance.com>
Subject: [PATCH v4 3/6] mm: rmap: introduce pfn_mkclean_range() to clean PTEs
Date: Wed, 2 Mar 2022 16:27:15 +0800
Message-Id: <20220302082718.32268-4-songmuchun@bytedance.com>
X-Mailer: git-send-email 2.32.0 (Apple Git-132)
In-Reply-To: <20220302082718.32268-1-songmuchun@bytedance.com>
References: <20220302082718.32268-1-songmuchun@bytedance.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit

page_mkclean_one() is supposed to be used with a pfn that has an associated
struct page, but not all pfns (e.g. DAX) have one.  Introduce a new function,
pfn_mkclean_range(), to clean the PTEs (including PMDs) mapped with a range of
pfns that have no struct page associated with them.  This helper will be used
by the DAX code in the next patch to make pfns clean.

Signed-off-by: Muchun Song <songmuchun@bytedance.com>
---
 include/linux/rmap.h |  3 +++
 mm/internal.h        | 26 +++++++++++++--------
 mm/rmap.c            | 65 +++++++++++++++++++++++++++++++++++++++++++---------
 3 files changed, 74 insertions(+), 20 deletions(-)
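For context only (not part of this patch): a rough sketch of how a DAX-side
caller might drive the new helper, walking every shared mapping of a file
range through the address_space's i_mmap interval tree and letting
pfn_mkclean_range() write-protect and clean the PTEs/PMDs in each vma.  The
function name dax_range_mkclean and its argument layout are hypothetical; the
authoritative user is the one introduced by the next patch in this series.

/*
 * Hypothetical illustration (needs <linux/fs.h>, <linux/mm.h>,
 * <linux/rmap.h>, <linux/sched.h>): clean the pfn range
 * [pfn, pfn + nr_pages) mapped at file offset pgoff in every shared
 * mapping of @mapping.  Non-shared vmas are rejected inside
 * pfn_mkclean_range() itself via invalid_mkclean_vma(), so no explicit
 * VM_SHARED check is needed here.
 */
static void dax_range_mkclean(struct address_space *mapping, pgoff_t pgoff,
                              unsigned long pfn, unsigned long nr_pages)
{
        struct vm_area_struct *vma;
        pgoff_t end = pgoff + nr_pages - 1;

        i_mmap_lock_read(mapping);
        vma_interval_tree_foreach(vma, &mapping->i_mmap, pgoff, end) {
                /* Write-protect and clean all PTEs/PMDs of this vma. */
                pfn_mkclean_range(pfn, nr_pages, pgoff, vma);
                cond_resched();
        }
        i_mmap_unlock_read(mapping);
}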
diff --git a/include/linux/rmap.h b/include/linux/rmap.h
index b58ddb8b2220..a6ec0d3e40c1 100644
--- a/include/linux/rmap.h
+++ b/include/linux/rmap.h
@@ -263,6 +263,9 @@ unsigned long page_address_in_vma(struct page *, struct vm_area_struct *);
  */
 int folio_mkclean(struct folio *);
 
+int pfn_mkclean_range(unsigned long pfn, unsigned long nr_pages, pgoff_t pgoff,
+                      struct vm_area_struct *vma);
+
 void remove_migration_ptes(struct folio *src, struct folio *dst, bool locked);
 
 /*
diff --git a/mm/internal.h b/mm/internal.h
index f45292dc4ef5..ff873944749f 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -516,26 +516,22 @@ void mlock_page_drain(int cpu);
 extern pmd_t maybe_pmd_mkwrite(pmd_t pmd, struct vm_area_struct *vma);
 
 /*
- * At what user virtual address is page expected in vma?
- * Returns -EFAULT if all of the page is outside the range of vma.
- * If page is a compound head, the entire compound page is considered.
+ * Return the start of user virtual address at the specific offset within
+ * a vma.
  */
 static inline unsigned long
-vma_address(struct page *page, struct vm_area_struct *vma)
+vma_pgoff_address(pgoff_t pgoff, unsigned long nr_pages,
+                  struct vm_area_struct *vma)
 {
-        pgoff_t pgoff;
         unsigned long address;
 
-        VM_BUG_ON_PAGE(PageKsm(page), page);    /* KSM page->index unusable */
-        pgoff = page_to_pgoff(page);
         if (pgoff >= vma->vm_pgoff) {
                 address = vma->vm_start +
                         ((pgoff - vma->vm_pgoff) << PAGE_SHIFT);
                 /* Check for address beyond vma (or wrapped through 0?) */
                 if (address < vma->vm_start || address >= vma->vm_end)
                         address = -EFAULT;
-        } else if (PageHead(page) &&
-                   pgoff + compound_nr(page) - 1 >= vma->vm_pgoff) {
+        } else if (pgoff + nr_pages - 1 >= vma->vm_pgoff) {
                 /* Test above avoids possibility of wrap to 0 on 32-bit */
                 address = vma->vm_start;
         } else {
@@ -545,6 +541,18 @@ vma_address(struct page *page, struct vm_area_struct *vma)
 }
 
 /*
+ * Return the start of user virtual address of a page within a vma.
+ * Returns -EFAULT if all of the page is outside the range of vma.
+ * If page is a compound head, the entire compound page is considered.
+ */
+static inline unsigned long
+vma_address(struct page *page, struct vm_area_struct *vma)
+{
+        VM_BUG_ON_PAGE(PageKsm(page), page);    /* KSM page->index unusable */
+        return vma_pgoff_address(page_to_pgoff(page), compound_nr(page), vma);
+}
+
+/*
  * Then at what user virtual address will none of the range be found in vma?
  * Assumes that vma_address() already returned a good starting address.
  */
diff --git a/mm/rmap.c b/mm/rmap.c
index 723682ddb9e8..ad5cf0e45a73 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -929,12 +929,12 @@ int folio_referenced(struct folio *folio, int is_locked,
         return pra.referenced;
 }
 
-static bool page_mkclean_one(struct folio *folio, struct vm_area_struct *vma,
-                             unsigned long address, void *arg)
+static int page_vma_mkclean_one(struct page_vma_mapped_walk *pvmw)
 {
-        DEFINE_FOLIO_VMA_WALK(pvmw, folio, vma, address, PVMW_SYNC);
+        int cleaned = 0;
+        struct vm_area_struct *vma = pvmw->vma;
         struct mmu_notifier_range range;
-        int *cleaned = arg;
+        unsigned long address = pvmw->address;
 
         /*
          * We have to assume the worse case ie pmd for invalidation. Note that
@@ -942,16 +942,16 @@ static bool page_mkclean_one(struct folio *folio, struct vm_area_struct *vma,
          */
         mmu_notifier_range_init(&range, MMU_NOTIFY_PROTECTION_PAGE, 0, vma,
                                 vma->vm_mm, address,
-                                vma_address_end(&pvmw));
+                                vma_address_end(pvmw));
         mmu_notifier_invalidate_range_start(&range);
 
-        while (page_vma_mapped_walk(&pvmw)) {
+        while (page_vma_mapped_walk(pvmw)) {
                 int ret = 0;
 
-                address = pvmw.address;
-                if (pvmw.pte) {
+                address = pvmw->address;
+                if (pvmw->pte) {
                         pte_t entry;
-                        pte_t *pte = pvmw.pte;
+                        pte_t *pte = pvmw->pte;
 
                         if (!pte_dirty(*pte) && !pte_write(*pte))
                                 continue;
@@ -964,7 +964,7 @@ static bool page_mkclean_one(struct folio *folio, struct vm_area_struct *vma,
                         ret = 1;
                 } else {
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
-                        pmd_t *pmd = pvmw.pmd;
+                        pmd_t *pmd = pvmw->pmd;
                         pmd_t entry;
 
                         if (!pmd_dirty(*pmd) && !pmd_write(*pmd))
@@ -991,11 +991,22 @@ static bool page_mkclean_one(struct folio *folio, struct vm_area_struct *vma,
                  * See Documentation/vm/mmu_notifier.rst
                  */
                 if (ret)
-                        (*cleaned)++;
+                        cleaned++;
         }
 
         mmu_notifier_invalidate_range_end(&range);
 
+        return cleaned;
+}
+
+static bool page_mkclean_one(struct folio *folio, struct vm_area_struct *vma,
+                             unsigned long address, void *arg)
+{
+        DEFINE_FOLIO_VMA_WALK(pvmw, folio, vma, address, PVMW_SYNC);
+        int *cleaned = arg;
+
+        *cleaned += page_vma_mkclean_one(&pvmw);
+
         return true;
 }
 
@@ -1033,6 +1044,38 @@ int folio_mkclean(struct folio *folio)
 EXPORT_SYMBOL_GPL(folio_mkclean);
 
 /**
+ * pfn_mkclean_range - Cleans the PTEs (including PMDs) mapped with range of
+ *                     [@pfn, @pfn + @nr_pages) at the specific offset (@pgoff)
+ *                     within the @vma of shared mappings. And since clean PTEs
+ *                     should also be readonly, write protects them too.
+ * @pfn: start pfn.
+ * @nr_pages: number of physically contiguous pages starting with @pfn.
+ * @pgoff: page offset that the @pfn is mapped with.
+ * @vma: vma that the @pfn is mapped within.
+ *
+ * Returns the number of cleaned PTEs (including PMDs).
+ */
+int pfn_mkclean_range(unsigned long pfn, unsigned long nr_pages, pgoff_t pgoff,
+                      struct vm_area_struct *vma)
+{
+        struct page_vma_mapped_walk pvmw = {
+                .pfn = pfn,
+                .nr_pages = nr_pages,
+                .pgoff = pgoff,
+                .vma = vma,
+                .flags = PVMW_SYNC,
+        };
+
+        if (invalid_mkclean_vma(vma, NULL))
+                return 0;
+
+        pvmw.address = vma_pgoff_address(pgoff, nr_pages, vma);
+        VM_BUG_ON_VMA(pvmw.address == -EFAULT, vma);
+
+        return page_vma_mkclean_one(&pvmw);
+}
+
+/**
  * page_move_anon_rmap - move a page to our anon_vma
  * @page: the page to move to our anon_vma
  * @vma: the vma the page belongs to
-- 
2.11.0