Date: Mon, 22 Apr 2019 16:18:57 -0400
From: Jerome Glisse
To: Laurent Dufour
Cc: akpm@linux-foundation.org, mhocko@kernel.org, peterz@infradead.org,
    kirill@shutemov.name, ak@linux.intel.com, dave@stgolabs.net, jack@suse.cz,
    Matthew Wilcox, aneesh.kumar@linux.ibm.com, benh@kernel.crashing.org,
    mpe@ellerman.id.au, paulus@samba.org, Thomas Gleixner, Ingo Molnar,
    hpa@zytor.com, Will Deacon, Sergey Senozhatsky,
    sergey.senozhatsky.work@gmail.com, Andrea Arcangeli, Alexei Starovoitov,
    kemi.wang@intel.com, Daniel Jordan, David Rientjes, Ganesh Mahendran,
    Minchan Kim, Punit Agrawal, vinayak menon, Yang Shi, zhong jiang,
    Haiyan Song, Balbir Singh, sj38.park@gmail.com, Michel Lespinasse,
    Mike Rapoport, linux-kernel@vger.kernel.org, linux-mm@kvack.org,
    haren@linux.vnet.ibm.com, npiggin@gmail.com, paulmck@linux.vnet.ibm.com,
    Tim Chen, linuxppc-dev@lists.ozlabs.org, x86@kernel.org
Subject: Re: [PATCH v12 17/31] mm: introduce __page_add_new_anon_rmap()
Message-ID: <20190422201857.GH14666@redhat.com>
References: <20190416134522.17540-1-ldufour@linux.ibm.com>
 <20190416134522.17540-18-ldufour@linux.ibm.com>
In-Reply-To: <20190416134522.17540-18-ldufour@linux.ibm.com>

On Tue, Apr 16, 2019 at 03:45:08PM +0200, Laurent Dufour wrote:
> When dealing with the speculative page fault handler, we may race with a
> VMA being split or merged. In that case the vma->vm_start and vma->vm_end
> fields may not match the address the page fault is occurring at.
> 
> This can only happen when the VMA is split, but in that case the anon_vma
> pointer of the new VMA will be the same as the original one, because in
> __split_vma() the new->anon_vma is set to src->anon_vma when *new = *vma.
> 
> So even if the VMA boundaries are not correct, the anon_vma pointer is
> still valid.
> 
> If the VMA has been merged, then the VMA it has been merged into must
> have the same anon_vma pointer, otherwise the merge could not have been
> done.
> 
> So in all cases we know that the anon_vma is valid: before starting the
> speculative page fault we checked that the anon_vma pointer is valid for
> this VMA, and the presence of an anon_vma means that an anonymous page
> has been mapped there at some point. Before the VMA is torn down, the
> page table lock has to be grabbed to clear the PTEs, and the anon_vma
> field is checked again once the PTE is locked.
> 
> This patch introduces a new __page_add_new_anon_rmap() service which
> does not check the VMA boundaries, and a new inline wrapper which does
> the check.
> 
> When called from a page fault handler that is not speculative, there is
> a guarantee that vm_start and vm_end match the faulting address, so the
> check is useless there. In the context of the speculative page fault
> handler the check may be wrong, but the anon_vma is still valid as
> explained above.
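
The pivotal step in the argument above is the structure assignment in
__split_vma(). Here is a toy userspace sketch of that pointer-copy
semantics (ordinary C, not the kernel code; struct vma below is a
stripped-down stand-in for struct vm_area_struct):

#include <assert.h>
#include <stdio.h>

struct anon_vma { int placeholder; };

struct vma {
	unsigned long vm_start, vm_end;
	struct anon_vma *anon_vma;
};

int main(void)
{
	struct anon_vma av;
	struct vma vma = { 0x1000, 0x9000, &av };
	struct vma new;

	new = vma;		/* what "*new = *vma" does in __split_vma() */
	new.vm_end = 0x5000;	/* the boundaries change on a split ...     */
	vma.vm_start = 0x5000;

	/* ... but both halves still point at the same anon_vma */
	assert(new.anon_vma == vma.anon_vma);
	printf("anon_vma shared at %p\n", (void *)new.anon_vma);
	return 0;
}

So a fault racing with a split may read stale vm_start/vm_end, yet the
anon_vma it reads is the same one it would have seen without the race.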
> 
> Signed-off-by: Laurent Dufour

Reviewed-by: Jérôme Glisse

> ---
>  include/linux/rmap.h | 12 ++++++++++--
>  mm/memory.c          |  8 ++++----
>  mm/rmap.c            |  5 ++---
>  3 files changed, 16 insertions(+), 9 deletions(-)
> 
> diff --git a/include/linux/rmap.h b/include/linux/rmap.h
> index 988d176472df..a5d282573093 100644
> --- a/include/linux/rmap.h
> +++ b/include/linux/rmap.h
> @@ -174,8 +174,16 @@ void page_add_anon_rmap(struct page *, struct vm_area_struct *,
>  		unsigned long, bool);
>  void do_page_add_anon_rmap(struct page *, struct vm_area_struct *,
>  		unsigned long, int);
> -void page_add_new_anon_rmap(struct page *, struct vm_area_struct *,
> -		unsigned long, bool);
> +void __page_add_new_anon_rmap(struct page *, struct vm_area_struct *,
> +		unsigned long, bool);
> +static inline void page_add_new_anon_rmap(struct page *page,
> +		struct vm_area_struct *vma,
> +		unsigned long address, bool compound)
> +{
> +	VM_BUG_ON_VMA(address < vma->vm_start || address >= vma->vm_end, vma);
> +	__page_add_new_anon_rmap(page, vma, address, compound);
> +}
> +
>  void page_add_file_rmap(struct page *, bool);
>  void page_remove_rmap(struct page *, bool);
> 
> diff --git a/mm/memory.c b/mm/memory.c
> index be93f2c8ebe0..46f877b6abea 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -2347,7 +2347,7 @@ static vm_fault_t wp_page_copy(struct vm_fault *vmf)
>  	 * thread doing COW.
>  	 */
>  	ptep_clear_flush_notify(vma, vmf->address, vmf->pte);
> -	page_add_new_anon_rmap(new_page, vma, vmf->address, false);
> +	__page_add_new_anon_rmap(new_page, vma, vmf->address, false);
>  	mem_cgroup_commit_charge(new_page, memcg, false, false);
>  	__lru_cache_add_active_or_unevictable(new_page, vmf->vma_flags);
>  	/*
> @@ -2897,7 +2897,7 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
> 
>  	/* ksm created a completely new copy */
>  	if (unlikely(page != swapcache && swapcache)) {
> -		page_add_new_anon_rmap(page, vma, vmf->address, false);
> +		__page_add_new_anon_rmap(page, vma, vmf->address, false);
>  		mem_cgroup_commit_charge(page, memcg, false, false);
>  		__lru_cache_add_active_or_unevictable(page, vmf->vma_flags);
>  	} else {
> @@ -3049,7 +3049,7 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf)
>  	}
> 
>  	inc_mm_counter_fast(vma->vm_mm, MM_ANONPAGES);
> -	page_add_new_anon_rmap(page, vma, vmf->address, false);
> +	__page_add_new_anon_rmap(page, vma, vmf->address, false);
>  	mem_cgroup_commit_charge(page, memcg, false, false);
>  	__lru_cache_add_active_or_unevictable(page, vmf->vma_flags);
>  setpte:
> @@ -3328,7 +3328,7 @@ vm_fault_t alloc_set_pte(struct vm_fault *vmf, struct mem_cgroup *memcg,
>  	/* copy-on-write page */
>  	if (write && !(vmf->vma_flags & VM_SHARED)) {
>  		inc_mm_counter_fast(vma->vm_mm, MM_ANONPAGES);
> -		page_add_new_anon_rmap(page, vma, vmf->address, false);
> +		__page_add_new_anon_rmap(page, vma, vmf->address, false);
>  		mem_cgroup_commit_charge(page, memcg, false, false);
>  		__lru_cache_add_active_or_unevictable(page, vmf->vma_flags);
>  	} else {
> diff --git a/mm/rmap.c b/mm/rmap.c
> index e5dfe2ae6b0d..2148e8ce6e34 100644
> --- a/mm/rmap.c
> +++ b/mm/rmap.c
> @@ -1140,7 +1140,7 @@ void do_page_add_anon_rmap(struct page *page,
>  }
> 
>  /**
> - * page_add_new_anon_rmap - add pte mapping to a new anonymous page
> + * __page_add_new_anon_rmap - add pte mapping to a new anonymous page
>   * @page:	the page to add the mapping to
>   * @vma:	the vm area in which the mapping is added
>   * @address:	the user virtual address mapped
> @@ -1150,12 +1150,11 @@ void do_page_add_anon_rmap(struct page *page,
>   * This means the inc-and-test can be bypassed.
>   * Page does not have to be locked.
>   */
> -void page_add_new_anon_rmap(struct page *page,
> +void __page_add_new_anon_rmap(struct page *page,
>  	struct vm_area_struct *vma, unsigned long address, bool compound)
>  {
>  	int nr = compound ? hpage_nr_pages(page) : 1;
> 
> -	VM_BUG_ON_VMA(address < vma->vm_start || address >= vma->vm_end, vma);
>  	__SetPageSwapBacked(page);
>  	if (compound) {
>  		VM_BUG_ON_PAGE(!PageTransHuge(page), page);
> -- 
> 2.21.0
> 
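
To make the intended division of labor concrete, a minimal caller sketch
(hypothetical code: the "speculative" flag and the surrounding fault
handler are illustrative only, just the two rmap calls come from this
patch):

	if (speculative) {
		/*
		 * The VMA may have been split or merged under us, so
		 * vm_start/vm_end can be stale and the boundary
		 * VM_BUG_ON_VMA() would be wrong; the anon_vma is
		 * still valid, as argued above.
		 */
		__page_add_new_anon_rmap(page, vma, address, false);
	} else {
		/* Regular fault: address is guaranteed in the VMA. */
		page_add_new_anon_rmap(page, vma, address, false);
	}

This keeps the debugging check for every caller that can rely on stable
VMA boundaries, while the speculative path opts out of it explicitly.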