Date: Mon, 23 Apr 2018 15:31:57 +0900
From: Minchan Kim
To: Laurent Dufour
Cc: akpm@linux-foundation.org, mhocko@kernel.org, peterz@infradead.org,
	kirill@shutemov.name, ak@linux.intel.com, dave@stgolabs.net,
	jack@suse.cz, Matthew Wilcox, benh@kernel.crashing.org,
	mpe@ellerman.id.au, paulus@samba.org, Thomas Gleixner, Ingo Molnar,
	hpa@zytor.com, Will Deacon, Sergey Senozhatsky, Andrea Arcangeli,
	Alexei Starovoitov, kemi.wang@intel.com,
	sergey.senozhatsky.work@gmail.com, Daniel Jordan, David Rientjes,
	Jerome Glisse, Ganesh Mahendran, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org, haren@linux.vnet.ibm.com,
	khandual@linux.vnet.ibm.com, npiggin@gmail.com,
	bsingharora@gmail.com, paulmck@linux.vnet.ibm.com, Tim Chen,
	linuxppc-dev@lists.ozlabs.org, x86@kernel.org
Subject: Re: [PATCH v10 06/25] mm: make pte_unmap_same compatible with SPF
Message-ID: <20180423063157.GB114098@rodete-desktop-imager.corp.google.com>
References: <1523975611-15978-1-git-send-email-ldufour@linux.vnet.ibm.com>
	<1523975611-15978-7-git-send-email-ldufour@linux.vnet.ibm.com>
In-Reply-To: <1523975611-15978-7-git-send-email-ldufour@linux.vnet.ibm.com>

On Tue, Apr 17, 2018 at 04:33:12PM +0200, Laurent Dufour wrote:
> pte_unmap_same() is making the assumption that the page table are still
> around because the mmap_sem is held.
> This is no more the case when running a speculative page fault and
> additional check must be made to ensure that the final page table are still
> there.
>
> This is now done by calling pte_spinlock() to check for the VMA's
> consistency while locking for the page tables.
>
> This is requiring passing a vm_fault structure to pte_unmap_same() which is
> containing all the needed parameters.
>
> As pte_spinlock() may fail in the case of a speculative page fault, if the
> VMA has been touched in our back, pte_unmap_same() should now return 3
> cases :
> 1. pte are the same (0)
> 2. pte are different (VM_FAULT_PTNOTSAME)
> 3. a VMA's changes has been detected (VM_FAULT_RETRY)
>
> The case 2 is handled by the introduction of a new VM_FAULT flag named
> VM_FAULT_PTNOTSAME which is then trapped in cow_user_page().

I don't see such logic in this patch.
Maybe you introduce it later? If so, please comment on it.
Or just return 0 in case 2 without introducing VM_FAULT_PTNOTSAME.

> If VM_FAULT_RETRY is returned, it is passed up to the callers to retry the
> page fault while holding the mmap_sem.
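
For readers following the thread, the retry rule quoted just above can be
pictured with a small userspace model. This is only a sketch under stated
assumptions, not kernel code: struct model_ctx, the model_* functions and the
MODEL_FAULT_RETRY value are hypothetical stand-ins, and lock_depth merely
imitates holding mmap_sem.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in value for this model only; not taken from kernel headers. */
enum { MODEL_FAULT_RETRY = 0x0400 };

/* Hypothetical state recording how the fault was attempted. */
struct model_ctx {
	bool vma_changed;	/* a concurrent VMA change defeats speculation */
	int speculative_tries;
	int locked_tries;
	int lock_depth;		/* imitates holding mmap_sem for read */
};

/* Speculative attempt: gives up with RETRY if the VMA changed under us. */
int model_speculative_fault(struct model_ctx *c)
{
	c->speculative_tries++;
	return c->vma_changed ? MODEL_FAULT_RETRY : 0;
}

/* Classic attempt: must run with the lock held, so it cannot see RETRY. */
int model_locked_fault(struct model_ctx *c)
{
	assert(c->lock_depth > 0);
	c->locked_tries++;
	return 0;
}

/* Caller: try speculatively first, fall back to the locked path on RETRY. */
int model_handle_fault(struct model_ctx *c)
{
	int ret = model_speculative_fault(c);

	if (ret != MODEL_FAULT_RETRY)
		return ret;

	c->lock_depth++;		/* stands in for taking mmap_sem */
	ret = model_locked_fault(c);
	c->lock_depth--;		/* stands in for releasing it */
	return ret;
}
```

With vma_changed clear, only the speculative attempt runs. With it set, the
caller sees the retry code and repeats the fault once on the locked path.
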
>
> Acked-by: David Rientjes
> Signed-off-by: Laurent Dufour
> ---
>  include/linux/mm.h |  1 +
>  mm/memory.c        | 39 ++++++++++++++++++++++++++++-----------
>  2 files changed, 29 insertions(+), 11 deletions(-)
>
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 4d1aff80669c..714da99d77a3 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -1208,6 +1208,7 @@ static inline void clear_page_pfmemalloc(struct page *page)
>  #define VM_FAULT_NEEDDSYNC  0x2000	/* ->fault did not modify page tables
>  					 * and needs fsync() to complete (for
>  					 * synchronous page faults in DAX) */
> +#define VM_FAULT_PTNOTSAME 0x4000	/* Page table entries have changed */
>  
>  #define VM_FAULT_ERROR	(VM_FAULT_OOM | VM_FAULT_SIGBUS | VM_FAULT_SIGSEGV | \
>  			 VM_FAULT_HWPOISON | VM_FAULT_HWPOISON_LARGE | \
> diff --git a/mm/memory.c b/mm/memory.c
> index 0b9a51f80e0e..f86efcb8e268 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -2309,21 +2309,29 @@ static inline bool pte_map_lock(struct vm_fault *vmf)
>   * parts, do_swap_page must check under lock before unmapping the pte and
>   * proceeding (but do_wp_page is only called after already making such a check;
>   * and do_anonymous_page can safely check later on).
> + *
> + * pte_unmap_same() returns:
> + *	0			if the PTE are the same
> + *	VM_FAULT_PTNOTSAME	if the PTE are different
> + *	VM_FAULT_RETRY		if the VMA has changed in our back during
> + *				a speculative page fault handling.
>   */
> -static inline int pte_unmap_same(struct mm_struct *mm, pmd_t *pmd,
> -				pte_t *page_table, pte_t orig_pte)
> +static inline int pte_unmap_same(struct vm_fault *vmf)
>  {
> -	int same = 1;
> +	int ret = 0;
> +
>  #if defined(CONFIG_SMP) || defined(CONFIG_PREEMPT)
>  	if (sizeof(pte_t) > sizeof(unsigned long)) {
> -		spinlock_t *ptl = pte_lockptr(mm, pmd);
> -		spin_lock(ptl);
> -		same = pte_same(*page_table, orig_pte);
> -		spin_unlock(ptl);
> +		if (pte_spinlock(vmf)) {
> +			if (!pte_same(*vmf->pte, vmf->orig_pte))
> +				ret = VM_FAULT_PTNOTSAME;
> +			spin_unlock(vmf->ptl);
> +		} else
> +			ret = VM_FAULT_RETRY;
>  	}
>  #endif
> -	pte_unmap(page_table);
> -	return same;
> +	pte_unmap(vmf->pte);
> +	return ret;
>  }
>  
>  static inline void cow_user_page(struct page *dst, struct page *src, unsigned long va, struct vm_area_struct *vma)
> @@ -2912,10 +2920,19 @@ int do_swap_page(struct vm_fault *vmf)
>  	pte_t pte;
>  	int locked;
>  	int exclusive = 0;
> -	int ret = 0;
> +	int ret;
>  
> -	if (!pte_unmap_same(vma->vm_mm, vmf->pmd, vmf->pte, vmf->orig_pte))
> +	ret = pte_unmap_same(vmf);
> +	if (ret) {
> +		/*
> +		 * If pte != orig_pte, this means another thread did the
> +		 * swap operation in our back.
> +		 * So nothing else to do.
> +		 */
> +		if (ret == VM_FAULT_PTNOTSAME)
> +			ret = 0;
>  		goto out;
> +	}
>  
>  	entry = pte_to_swp_entry(vmf->orig_pte);
>  	if (unlikely(non_swap_entry(entry))) {
> -- 
> 2.7.4
>
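
To make the three-way contract in the quoted hunks easier to check, here is a
minimal userspace model of it. It is a sketch under stated assumptions, not
the kernel implementation: struct model_fault, the model_* functions and the
MODEL_* values are hypothetical stand-ins, vma_changed imitates pte_spinlock()
failing, and the CONFIG_SMP/CONFIG_PREEMPT and sizeof(pte_t) guards are
ignored.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in flag values for this model only; not taken from kernel headers. */
enum {
	MODEL_FAULT_RETRY	= 0x0400,
	MODEL_FAULT_PTNOTSAME	= 0x4000,
};

/* Hypothetical stand-in for struct vm_fault: only what the model needs. */
struct model_fault {
	unsigned long pte;	/* current page table entry */
	unsigned long orig_pte;	/* entry sampled when the fault started */
	bool vma_changed;	/* imitates pte_spinlock() failing */
};

/* Mirrors the three return cases listed for the patched pte_unmap_same(). */
int model_pte_unmap_same(const struct model_fault *f)
{
	if (f->vma_changed)
		return MODEL_FAULT_RETRY;	/* case 3: VMA changed */
	if (f->pte != f->orig_pte)
		return MODEL_FAULT_PTNOTSAME;	/* case 2: PTE differs */
	return 0;				/* case 1: PTE unchanged */
}

/* Mirrors the early exit added at the top of do_swap_page(). */
int model_do_swap_page(const struct model_fault *f, bool *did_swap_in)
{
	int ret;

	assert(f && did_swap_in);
	*did_swap_in = false;

	ret = model_pte_unmap_same(f);
	if (ret) {
		/* Another thread already did the swap-in: report success. */
		if (ret == MODEL_FAULT_PTNOTSAME)
			ret = 0;
		return ret;
	}

	*did_swap_in = true;	/* the real function would carry on here */
	return 0;
}
```

In this model a changed PTE and an unchanged PTE both end with 0, but only the
unchanged case goes on to do the swap-in work. That difference is why
pte_unmap_same() itself cannot simply fold case 2 into 0 unless the caller has
some other way to know it must stop.
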