From: Qi Zheng <zhengqi.arch@bytedance.com>
To: akpm@linux-foundation.org, tglx@linutronix.de, hannes@cmpxchg.org,
	mhocko@kernel.org, vdavydov.dev@gmail.com, kirill.shutemov@linux.intel.com,
	mika.penttila@nextfour.com, david@redhat.com, vbabka@suse.cz
Cc: linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	songmuchun@bytedance.com, Qi Zheng <zhengqi.arch@bytedance.com>
Subject: [PATCH v2 2/2] mm: remove redundant smp_wmb()
Date: Tue, 31 Aug 2021 21:21:11 +0800
Message-Id: <20210831132111.85437-3-zhengqi.arch@bytedance.com>
X-Mailer: git-send-email 2.24.3 (Apple Git-128)
In-Reply-To: <20210831132111.85437-1-zhengqi.arch@bytedance.com>
References: <20210831132111.85437-1-zhengqi.arch@bytedance.com>

The smp_wmb() in __pte_alloc() is used to ensure that all pte setup is
visible before the pte is made visible to other CPUs by being put into
page tables. We only need this when the pte is actually populated, so
move it to pmd_install(). __pte_alloc_kernel(), __p4d_alloc(),
__pud_alloc() and __pmd_alloc() are similar cases.

We can also defer the smp_wmb() to the place where the pmd entry is
actually populated with the preallocated pte. There are two kinds of
users of the preallocated pte: one is filemap & finish_fault(), the
other is THP. The former does not need another smp_wmb() because
pmd_install() already issues one. Fortunately, the latter does not need
another smp_wmb() either, because there is already an smp_wmb() before
the new ptes are populated when THP uses a preallocated pte to split a
huge pmd.

Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
Reviewed-by: Muchun Song <songmuchun@bytedance.com>
Acked-by: David Hildenbrand <david@redhat.com>
---
 mm/memory.c         | 52 +++++++++++++++++++++++-----------------------------
 mm/sparse-vmemmap.c |  2 +-
 2 files changed, 24 insertions(+), 30 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index ef7b1762e996..658d8df9c70f 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -439,6 +439,20 @@ void pmd_install(struct mm_struct *mm, pmd_t *pmd, pgtable_t *pte)
 
         if (likely(pmd_none(*pmd))) {   /* Has another populated it ? */
                 mm_inc_nr_ptes(mm);
+                /*
+                 * Ensure all pte setup (eg. pte page lock and page clearing) are
+                 * visible before the pte is made visible to other CPUs by being
+                 * put into page tables.
+                 *
+                 * The other side of the story is the pointer chasing in the page
+                 * table walking code (when walking the page table without locking;
+                 * ie. most of the time). Fortunately, these data accesses consist
+                 * of a chain of data-dependent loads, meaning most CPUs (alpha
+                 * being the notable exception) will already guarantee loads are
+                 * seen in-order. See the alpha page table accessors for the
+                 * smp_rmb() barriers in page table walking code.
+                 */
+                smp_wmb(); /* Could be smp_wmb__xxx(before|after)_spin_lock */
                 pmd_populate(mm, pmd, *pte);
                 *pte = NULL;
         }
@@ -451,21 +465,6 @@ int __pte_alloc(struct mm_struct *mm, pmd_t *pmd)
         if (!new)
                 return -ENOMEM;
 
-        /*
-         * Ensure all pte setup (eg. pte page lock and page clearing) are
-         * visible before the pte is made visible to other CPUs by being
-         * put into page tables.
-         *
-         * The other side of the story is the pointer chasing in the page
-         * table walking code (when walking the page table without locking;
-         * ie. most of the time). Fortunately, these data accesses consist
-         * of a chain of data-dependent loads, meaning most CPUs (alpha
-         * being the notable exception) will already guarantee loads are
-         * seen in-order. See the alpha page table accessors for the
-         * smp_rmb() barriers in page table walking code.
-         */
-        smp_wmb(); /* Could be smp_wmb__xxx(before|after)_spin_lock */
-
         pmd_install(mm, pmd, &new);
         if (new)
                 pte_free(mm, new);
@@ -478,10 +477,9 @@ int __pte_alloc_kernel(pmd_t *pmd)
         if (!new)
                 return -ENOMEM;
 
-        smp_wmb(); /* See comment in __pte_alloc */
-
         spin_lock(&init_mm.page_table_lock);
         if (likely(pmd_none(*pmd))) {   /* Has another populated it ? */
+                smp_wmb(); /* See comment in pmd_install() */
                 pmd_populate_kernel(&init_mm, pmd, new);
                 new = NULL;
         }
@@ -3857,7 +3855,6 @@ static vm_fault_t __do_fault(struct vm_fault *vmf)
                 vmf->prealloc_pte = pte_alloc_one(vma->vm_mm);
                 if (!vmf->prealloc_pte)
                         return VM_FAULT_OOM;
-                smp_wmb(); /* See comment in __pte_alloc() */
         }
 
         ret = vma->vm_ops->fault(vmf);
@@ -3919,7 +3916,6 @@ vm_fault_t do_set_pmd(struct vm_fault *vmf, struct page *page)
                 vmf->prealloc_pte = pte_alloc_one(vma->vm_mm);
                 if (!vmf->prealloc_pte)
                         return VM_FAULT_OOM;
-                smp_wmb(); /* See comment in __pte_alloc() */
         }
 
         vmf->ptl = pmd_lock(vma->vm_mm, vmf->pmd);
@@ -4144,7 +4140,6 @@ static vm_fault_t do_fault_around(struct vm_fault *vmf)
                 vmf->prealloc_pte = pte_alloc_one(vmf->vma->vm_mm);
                 if (!vmf->prealloc_pte)
                         return VM_FAULT_OOM;
-                smp_wmb(); /* See comment in __pte_alloc() */
         }
 
         return vmf->vma->vm_ops->map_pages(vmf, start_pgoff, end_pgoff);
@@ -4819,13 +4814,13 @@ int __p4d_alloc(struct mm_struct *mm, pgd_t *pgd, unsigned long address)
         if (!new)
                 return -ENOMEM;
 
-        smp_wmb(); /* See comment in __pte_alloc */
-
         spin_lock(&mm->page_table_lock);
-        if (pgd_present(*pgd))          /* Another has populated it */
+        if (pgd_present(*pgd)) {        /* Another has populated it */
                 p4d_free(mm, new);
-        else
+        } else {
+                smp_wmb(); /* See comment in pmd_install() */
                 pgd_populate(mm, pgd, new);
+        }
         spin_unlock(&mm->page_table_lock);
         return 0;
 }
@@ -4842,11 +4837,10 @@ int __pud_alloc(struct mm_struct *mm, p4d_t *p4d, unsigned long address)
         if (!new)
                 return -ENOMEM;
 
-        smp_wmb(); /* See comment in __pte_alloc */
-
         spin_lock(&mm->page_table_lock);
         if (!p4d_present(*p4d)) {
                 mm_inc_nr_puds(mm);
+                smp_wmb(); /* See comment in pmd_install() */
                 p4d_populate(mm, p4d, new);
         } else  /* Another has populated it */
                 pud_free(mm, new);
@@ -4867,14 +4861,14 @@ int __pmd_alloc(struct mm_struct *mm, pud_t *pud, unsigned long address)
         if (!new)
                 return -ENOMEM;
 
-        smp_wmb(); /* See comment in __pte_alloc */
-
         ptl = pud_lock(mm, pud);
         if (!pud_present(*pud)) {
                 mm_inc_nr_pmds(mm);
+                smp_wmb(); /* See comment in pmd_install() */
                 pud_populate(mm, pud, new);
-        } else  /* Another has populated it */
+        } else {        /* Another has populated it */
                 pmd_free(mm, new);
+        }
         spin_unlock(ptl);
         return 0;
 }
diff --git a/mm/sparse-vmemmap.c b/mm/sparse-vmemmap.c
index bdce883f9286..db6df27c852a 100644
--- a/mm/sparse-vmemmap.c
+++ b/mm/sparse-vmemmap.c
@@ -76,7 +76,7 @@ static int split_vmemmap_huge_pmd(pmd_t *pmd, unsigned long start,
                 set_pte_at(&init_mm, addr, pte, entry);
         }
 
-        /* Make pte visible before pmd. See comment in __pte_alloc(). */
+        /* Make pte visible before pmd. See comment in pmd_install(). */
         smp_wmb();
         pmd_populate_kernel(&init_mm, pmd, pgtable);
 
-- 
2.11.0
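[Illustration, not part of the patch] The ordering the commit message relies on can be
sketched in standalone userspace C11: the writer fully initialises a table, issues a
release fence in the role of smp_wmb(), and only then publishes the pointer (the role
of pmd_populate() in pmd_install()); the reader reaches the entries only through the
published pointer, like a lockless page table walk. This is a minimal sketch under
invented names (pte_table, pmd_slot, writer, reader); the acquire load stands in for
the data-dependency ordering most CPUs give for free (Alpha excepted, which needs
smp_rmb() in its page table accessors).

/* Sketch only: userspace analogue of "publish a page table after it is set up". */
#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>
#include <stdlib.h>

#define NR_ENTRIES 8

struct pte_table {                              /* stand-in for a fresh pte page */
        int entries[NR_ENTRIES];
};

static _Atomic(struct pte_table *) pmd_slot;    /* stand-in for the pmd entry */

static void *writer(void *arg)
{
        struct pte_table *new = malloc(sizeof(*new));

        if (!new)
                return arg;
        for (int i = 0; i < NR_ENTRIES; i++)    /* "pte setup": fill the table */
                new->entries[i] = i + 100;

        /* Role of smp_wmb() in pmd_install(): order the setup above ... */
        atomic_thread_fence(memory_order_release);
        /* ... before the table is published (role of pmd_populate()). */
        atomic_store_explicit(&pmd_slot, new, memory_order_relaxed);
        return arg;
}

static void *reader(void *arg)
{
        struct pte_table *t;

        /* Lockless "walk": spin until the slot is populated. */
        do {
                /*
                 * acquire here stands in for the ordering the kernel gets
                 * from the data-dependent loads below on most CPUs.
                 */
                t = atomic_load_explicit(&pmd_slot, memory_order_acquire);
        } while (!t);

        for (int i = 0; i < NR_ENTRIES; i++)
                printf("entry[%d] = %d\n", i, t->entries[i]);
        return arg;
}

int main(void)
{
        pthread_t w, r;

        pthread_create(&r, NULL, reader, NULL);
        pthread_create(&w, NULL, writer, NULL);
        pthread_join(w, NULL);
        pthread_join(r, NULL);
        return 0;
}

Build with, for example, "cc -std=c11 -pthread sketch.c". Without the release fence the
reader could observe the published pointer but stale entries, which is exactly what the
smp_wmb() before each *_populate() call prevents.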