From: Qi Zheng <zhengqi.arch@bytedance.com>
To: akpm@linux-foundation.org, tglx@linutronix.de, hannes@cmpxchg.org, mhocko@kernel.org, vdavydov.dev@gmail.com, kirill.shutemov@linux.intel.com, mika.penttila@nextfour.com, david@redhat.com
Cc: linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, songmuchun@bytedance.com, Qi Zheng <zhengqi.arch@bytedance.com>
Subject: [PATCH v2 2/9] mm: remove redundant smp_wmb()
Date: Thu, 19 Aug 2021 11:18:51 +0800
Message-Id: <20210819031858.98043-3-zhengqi.arch@bytedance.com>
In-Reply-To: <20210819031858.98043-1-zhengqi.arch@bytedance.com>
References: <20210819031858.98043-1-zhengqi.arch@bytedance.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit

The smp_wmb() in __pte_alloc() ensures that all pte setup is visible
before the pte is made visible to other CPUs by being put into the page
tables.
We only need this barrier when the pte is actually populated, so move it
into pmd_install(). __pte_alloc_kernel(), __p4d_alloc(), __pud_alloc()
and __pmd_alloc() are similar cases, so defer their smp_wmb() to the
point where the entry is actually populated as well.

We can likewise defer the smp_wmb() for a preallocated pte to the place
where the pmd entry is actually populated with it. There are two users
of preallocated ptes: the filemap fault path via finish_fault(), and
THP. The former no longer needs its own smp_wmb() because pmd_install()
now issues one. Fortunately, the latter also needs no extra smp_wmb()
because there is already an smp_wmb() before the new pte is populated
when THP uses a preallocated pte to split a huge pmd.

Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
Reviewed-by: Muchun Song <songmuchun@bytedance.com>
---
 mm/memory.c         | 47 ++++++++++++++++++++---------------------
 mm/sparse-vmemmap.c |  2 +-
 2 files changed, 21 insertions(+), 28 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index ef7b1762e996..9c7534187454 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -439,6 +439,20 @@ void pmd_install(struct mm_struct *mm, pmd_t *pmd, pgtable_t *pte)
 
 	if (likely(pmd_none(*pmd))) {	/* Has another populated it ? */
 		mm_inc_nr_ptes(mm);
+		/*
+		 * Ensure all pte setup (eg. pte page lock and page clearing) are
+		 * visible before the pte is made visible to other CPUs by being
+		 * put into page tables.
+		 *
+		 * The other side of the story is the pointer chasing in the page
+		 * table walking code (when walking the page table without locking;
+		 * ie. most of the time). Fortunately, these data accesses consist
+		 * of a chain of data-dependent loads, meaning most CPUs (alpha
+		 * being the notable exception) will already guarantee loads are
+		 * seen in-order. See the alpha page table accessors for the
+		 * smp_rmb() barriers in page table walking code.
+		 */
+		smp_wmb(); /* Could be smp_wmb__xxx(before|after)_spin_lock */
 		pmd_populate(mm, pmd, *pte);
 		*pte = NULL;
 	}
@@ -451,21 +465,6 @@ int __pte_alloc(struct mm_struct *mm, pmd_t *pmd)
 	if (!new)
 		return -ENOMEM;
 
-	/*
-	 * Ensure all pte setup (eg. pte page lock and page clearing) are
-	 * visible before the pte is made visible to other CPUs by being
-	 * put into page tables.
-	 *
-	 * The other side of the story is the pointer chasing in the page
-	 * table walking code (when walking the page table without locking;
-	 * ie. most of the time). Fortunately, these data accesses consist
-	 * of a chain of data-dependent loads, meaning most CPUs (alpha
-	 * being the notable exception) will already guarantee loads are
-	 * seen in-order. See the alpha page table accessors for the
-	 * smp_rmb() barriers in page table walking code.
-	 */
-	smp_wmb(); /* Could be smp_wmb__xxx(before|after)_spin_lock */
-
 	pmd_install(mm, pmd, &new);
 	if (new)
 		pte_free(mm, new);
@@ -478,10 +477,9 @@ int __pte_alloc_kernel(pmd_t *pmd)
 	if (!new)
 		return -ENOMEM;
 
-	smp_wmb(); /* See comment in __pte_alloc */
-
 	spin_lock(&init_mm.page_table_lock);
 	if (likely(pmd_none(*pmd))) {	/* Has another populated it ? */
+		smp_wmb(); /* See comment in pmd_install() */
 		pmd_populate_kernel(&init_mm, pmd, new);
 		new = NULL;
 	}
@@ -3857,7 +3855,6 @@ static vm_fault_t __do_fault(struct vm_fault *vmf)
 		vmf->prealloc_pte = pte_alloc_one(vma->vm_mm);
 		if (!vmf->prealloc_pte)
 			return VM_FAULT_OOM;
-		smp_wmb(); /* See comment in __pte_alloc() */
 	}
 
 	ret = vma->vm_ops->fault(vmf);
@@ -3919,7 +3916,6 @@ vm_fault_t do_set_pmd(struct vm_fault *vmf, struct page *page)
 		vmf->prealloc_pte = pte_alloc_one(vma->vm_mm);
 		if (!vmf->prealloc_pte)
 			return VM_FAULT_OOM;
-		smp_wmb(); /* See comment in __pte_alloc() */
 	}
 
 	vmf->ptl = pmd_lock(vma->vm_mm, vmf->pmd);
@@ -4144,7 +4140,6 @@ static vm_fault_t do_fault_around(struct vm_fault *vmf)
 		vmf->prealloc_pte = pte_alloc_one(vmf->vma->vm_mm);
 		if (!vmf->prealloc_pte)
 			return VM_FAULT_OOM;
-		smp_wmb(); /* See comment in __pte_alloc() */
 	}
 
 	return vmf->vma->vm_ops->map_pages(vmf, start_pgoff, end_pgoff);
@@ -4819,13 +4814,13 @@ int __p4d_alloc(struct mm_struct *mm, pgd_t *pgd, unsigned long address)
 	if (!new)
 		return -ENOMEM;
 
-	smp_wmb(); /* See comment in __pte_alloc */
-
 	spin_lock(&mm->page_table_lock);
 	if (pgd_present(*pgd))		/* Another has populated it */
 		p4d_free(mm, new);
-	else
+	else {
+		smp_wmb(); /* See comment in pmd_install() */
 		pgd_populate(mm, pgd, new);
+	}
 	spin_unlock(&mm->page_table_lock);
 	return 0;
 }
@@ -4842,11 +4837,10 @@ int __pud_alloc(struct mm_struct *mm, p4d_t *p4d, unsigned long address)
 	if (!new)
 		return -ENOMEM;
 
-	smp_wmb(); /* See comment in __pte_alloc */
-
 	spin_lock(&mm->page_table_lock);
 	if (!p4d_present(*p4d)) {
 		mm_inc_nr_puds(mm);
+		smp_wmb(); /* See comment in pmd_install() */
 		p4d_populate(mm, p4d, new);
 	} else	/* Another has populated it */
 		pud_free(mm, new);
@@ -4867,11 +4861,10 @@ int __pmd_alloc(struct mm_struct *mm, pud_t *pud, unsigned long address)
 	if (!new)
 		return -ENOMEM;
 
-	smp_wmb(); /* See comment in __pte_alloc */
-
 	ptl = pud_lock(mm, pud);
 	if (!pud_present(*pud)) {
 		mm_inc_nr_pmds(mm);
+		smp_wmb(); /* See comment in pmd_install() */
 		pud_populate(mm, pud, new);
 	} else	/* Another has populated it */
 		pmd_free(mm, new);
diff --git a/mm/sparse-vmemmap.c b/mm/sparse-vmemmap.c
index bdce883f9286..db6df27c852a 100644
--- a/mm/sparse-vmemmap.c
+++ b/mm/sparse-vmemmap.c
@@ -76,7 +76,7 @@ static int split_vmemmap_huge_pmd(pmd_t *pmd, unsigned long start,
 		set_pte_at(&init_mm, addr, pte, entry);
 	}
 
-	/* Make pte visible before pmd. See comment in __pte_alloc(). */
+	/* Make pte visible before pmd. See comment in pmd_install(). */
 	smp_wmb();
 	pmd_populate_kernel(&init_mm, pmd, pgtable);
 
-- 
2.11.0
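
P.S. For readers less familiar with the barrier pairing the patch relies on, below is a minimal userspace sketch of the same publish/consume pattern. It is not kernel code: C11 fences stand in for smp_wmb()/smp_rmb(), and all names (fake_pte_page, published, run_demo) are made up for illustration. The writer fully initializes the "pte page" before publishing the pointer, mirroring the smp_wmb() that pmd_install() now issues immediately before pmd_populate(); the reader's acquire fence plays the role of the dependent-load ordering on the page-table-walk side.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdint.h>
#include <stdlib.h>

/* Stand-in for a pte page whose contents must be visible before the
 * page itself is linked into the "page table" (the published pointer). */
struct fake_pte_page {
	int initialized;
};

static _Atomic(struct fake_pte_page *) published;

static void *writer(void *arg)
{
	struct fake_pte_page *p = malloc(sizeof(*p));

	p->initialized = 42;                       /* all "pte setup" first */
	atomic_thread_fence(memory_order_release); /* analogue of smp_wmb() */
	atomic_store_explicit(&published, p, memory_order_relaxed);
	return NULL;
}

static void *reader(void *arg)
{
	struct fake_pte_page *p;

	/* Spin until the pointer is published (walking the "page table"). */
	while (!(p = atomic_load_explicit(&published, memory_order_relaxed)))
		;
	atomic_thread_fence(memory_order_acquire); /* analogue of smp_rmb() */
	return (void *)(intptr_t)p->initialized;   /* setup must be visible */
}

/* Run writer and reader concurrently; return what the reader observed. */
static int run_demo(void)
{
	pthread_t w, r;
	void *seen;

	pthread_create(&r, NULL, reader, NULL);
	pthread_create(&w, NULL, writer, NULL);
	pthread_join(w, NULL);
	pthread_join(r, &seen);
	return (int)(intptr_t)seen;
}
```

Compile with -pthread. On a weakly ordered CPU, dropping the release fence can let the reader observe an uninitialized value through the freshly published pointer, which is exactly the failure mode the smp_wmb() before pmd_populate() prevents.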