From: Sasha Levin
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: Hugh Dickins, "Kirill A . Shutemov", Yang Shi, Alistair Popple,
	Jan Kara, Jue Wang, "Matthew Wilcox (Oracle)", Miaohe Lin,
	Minchan Kim, Naoya Horiguchi, Oscar Salvador, Peter Xu,
	Ralph Campbell, Shakeel Butt, Wang Yugui, Zi Yan, Andrew Morton,
	Linus Torvalds, Greg Kroah-Hartman
Subject: [PATCH 5.12 084/110] mm/thp: make is_huge_zero_pmd() safe and quicker
Date: Mon, 28 Jun 2021 10:18:02 -0400
Message-Id: <20210628141828.31757-85-sashal@kernel.org>
X-Mailer: git-send-email 2.30.2
In-Reply-To: <20210628141828.31757-1-sashal@kernel.org>
References: <20210628141828.31757-1-sashal@kernel.org>
MIME-Version: 1.0
X-KernelTest-Patch: http://kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.12.14-rc1.gz
X-KernelTest-Tree: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
X-KernelTest-Branch: linux-5.12.y
X-KernelTest-Patches: git://git.kernel.org/pub/scm/linux/kernel/git/stable/stable-queue.git
X-KernelTest-Version: 5.12.14-rc1
X-KernelTest-Deadline: 2021-06-30T14:18+00:00
X-stable: review
X-Patchwork-Hint: Ignore
Content-Transfer-Encoding: 8bit
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

From: Hugh Dickins

commit 3b77e8c8cde581dadab9a0f1543a347e24315f11 upstream.

Most callers of is_huge_zero_pmd() supply a pmd already verified
present; but a few (notably zap_huge_pmd()) do not - it might be a pmd
migration entry, in which the pfn is encoded differently from a present
pmd: which might pass the is_huge_zero_pmd() test (though not on x86,
since L1TF forced us to protect against that); or perhaps even crash in
pmd_page() applied to a swap-like entry.

Make it safe by adding pmd_present() check into is_huge_zero_pmd()
itself; and make it quicker by saving huge_zero_pfn, so that
is_huge_zero_pmd() will not need to do that pmd_page() lookup each time.

__split_huge_pmd_locked() checked pmd_trans_huge() before: that worked,
but is unnecessary now that is_huge_zero_pmd() checks present.

Link: https://lkml.kernel.org/r/21ea9ca-a1f5-8b90-5e88-95fb1c49bbfa@google.com
Fixes: e71769ae5260 ("mm: enable thp migration for shmem thp")
Signed-off-by: Hugh Dickins
Acked-by: Kirill A. Shutemov
Reviewed-by: Yang Shi
Cc: Alistair Popple
Cc: Jan Kara
Cc: Jue Wang
Cc: "Matthew Wilcox (Oracle)"
Cc: Miaohe Lin
Cc: Minchan Kim
Cc: Naoya Horiguchi
Cc: Oscar Salvador
Cc: Peter Xu
Cc: Ralph Campbell
Cc: Shakeel Butt
Cc: Wang Yugui
Cc: Zi Yan
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
Signed-off-by: Greg Kroah-Hartman
---
 include/linux/huge_mm.h | 8 +++++++-
 mm/huge_memory.c        | 5 ++++-
 2 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index ba973efcd369..6686a0baa91d 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -289,6 +289,7 @@ struct page *follow_devmap_pud(struct vm_area_struct *vma, unsigned long addr,
 vm_fault_t do_huge_pmd_numa_page(struct vm_fault *vmf, pmd_t orig_pmd);
 
 extern struct page *huge_zero_page;
+extern unsigned long huge_zero_pfn;
 
 static inline bool is_huge_zero_page(struct page *page)
 {
@@ -297,7 +298,7 @@ static inline bool is_huge_zero_page(struct page *page)
 
 static inline bool is_huge_zero_pmd(pmd_t pmd)
 {
-	return is_huge_zero_page(pmd_page(pmd));
+	return READ_ONCE(huge_zero_pfn) == pmd_pfn(pmd) && pmd_present(pmd);
 }
 
 static inline bool is_huge_zero_pud(pud_t pud)
@@ -443,6 +444,11 @@ static inline bool is_huge_zero_page(struct page *page)
 	return false;
 }
 
+static inline bool is_huge_zero_pmd(pmd_t pmd)
+{
+	return false;
+}
+
 static inline bool is_huge_zero_pud(pud_t pud)
 {
 	return false;
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index cd37a0829881..e1ad01e68aa3 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -61,6 +61,7 @@ static struct shrinker deferred_split_shrinker;
 
 static atomic_t huge_zero_refcount;
 struct page *huge_zero_page __read_mostly;
+unsigned long huge_zero_pfn __read_mostly = ~0UL;
 
 bool transparent_hugepage_enabled(struct vm_area_struct *vma)
 {
@@ -97,6 +98,7 @@ retry:
 		__free_pages(zero_page, compound_order(zero_page));
 		goto retry;
 	}
+	WRITE_ONCE(huge_zero_pfn, page_to_pfn(zero_page));
 
 	/* We take additional reference here. It will be put back by shrinker */
 	atomic_set(&huge_zero_refcount, 2);
@@ -146,6 +148,7 @@ static unsigned long shrink_huge_zero_page_scan(struct shrinker *shrink,
 	if (atomic_cmpxchg(&huge_zero_refcount, 1, 0) == 1) {
 		struct page *zero_page = xchg(&huge_zero_page, NULL);
 		BUG_ON(zero_page == NULL);
+		WRITE_ONCE(huge_zero_pfn, ~0UL);
 		__free_pages(zero_page, compound_order(zero_page));
 		return HPAGE_PMD_NR;
 	}
@@ -2073,7 +2076,7 @@ static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd,
 		return;
 	}
 
-	if (pmd_trans_huge(*pmd) && is_huge_zero_pmd(*pmd)) {
+	if (is_huge_zero_pmd(*pmd)) {
 		/*
 		 * FIXME: Do we want to invalidate secondary mmu by calling
 		 * mmu_notifier_invalidate_range() see comments below inside
-- 
2.30.2
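
For readers following the reasoning in the changelog, here is a minimal
user-space sketch of the check it describes: compare a cached pfn and
also require the entry to be present. Everything named model_* is a
hypothetical stand-in invented for this illustration, including the
entry layout (bit 0 as the present bit, the remaining bits as the pfn
field). Real pmd encodings are architecture-specific, and a real
migration entry encodes its pfn differently rather than merely clearing
a bit. READ_ONCE()/WRITE_ONCE() are left out on the assumption that the
sketch runs single-threaded.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a pmd entry: bit 0 = present, bits 1.. = pfn field. */
typedef struct { uint64_t val; } model_pmd_t;

#define MODEL_PRESENT   1ULL
#define MODEL_PFN_SHIFT 1
#define MODEL_NO_PFN    (~0UL)  /* sentinel: no huge zero page allocated */

/* Cached pfn of the (modelled) huge zero page, as huge_zero_pfn in the patch. */
unsigned long model_huge_zero_pfn = MODEL_NO_PFN;

/* Build a model entry; present == false stands in for a swap-like entry. */
model_pmd_t model_mk_pmd(unsigned long pfn, bool present)
{
	model_pmd_t pmd;

	pmd.val = ((uint64_t)pfn << MODEL_PFN_SHIFT) |
		  (present ? MODEL_PRESENT : 0);
	return pmd;
}

bool model_pmd_present(model_pmd_t pmd)
{
	return (pmd.val & MODEL_PRESENT) != 0;
}

unsigned long model_pmd_pfn(model_pmd_t pmd)
{
	return (unsigned long)(pmd.val >> MODEL_PFN_SHIFT);
}

/* Same shape as the patched helper: pfn comparison plus a present check. */
bool model_is_huge_zero_pmd(model_pmd_t pmd)
{
	return model_huge_zero_pfn == model_pmd_pfn(pmd) &&
	       model_pmd_present(pmd);
}

/* Models the store done once the huge zero page has been installed. */
void model_set_huge_zero_pfn(unsigned long pfn)
{
	assert(pfn != MODEL_NO_PFN);
	model_huge_zero_pfn = pfn;
}

/* Models the reset done when the shrinker frees the huge zero page. */
void model_clear_huge_zero_pfn(void)
{
	model_huge_zero_pfn = MODEL_NO_PFN;
}
```

In this model, an entry whose pfn field happens to equal the cached
value but whose present bit is clear is rejected, which is the "safe"
half of the change. The "quicker" half is that the test is a plain
integer comparison with no page lookup. Initialising the cache to the
sentinel means the test also fails before the huge zero page exists and
again after it has been freed.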