From: Laurent Dufour
To: paulmck@linux.vnet.ibm.com, peterz@infradead.org,
	akpm@linux-foundation.org, kirill@shutemov.name, ak@linux.intel.com,
	mhocko@kernel.org, dave@stgolabs.net, jack@suse.cz, Matthew Wilcox,
	benh@kernel.crashing.org, mpe@ellerman.id.au, paulus@samba.org,
	Thomas Gleixner, Ingo Molnar, hpa@zytor.com, Will Deacon,
	Sergey Senozhatsky, Andrea Arcangeli, Alexei Starovoitov,
	kemi.wang@intel.com, sergey.senozhatsky.work@gmail.com, Daniel Jordan
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	haren@linux.vnet.ibm.com, khandual@linux.vnet.ibm.com,
	npiggin@gmail.com, bsingharora@gmail.com, Tim Chen,
	linuxppc-dev@lists.ozlabs.org, x86@kernel.org
Subject: [PATCH v7 05/24] mm: Prepare for FAULT_FLAG_SPECULATIVE
Date: Tue, 6 Feb 2018 17:49:51 +0100
X-Mailer: git-send-email 2.7.4
In-Reply-To: <1517935810-31177-1-git-send-email-ldufour@linux.vnet.ibm.com>
References: <1517935810-31177-1-git-send-email-ldufour@linux.vnet.ibm.com>
Message-Id: <1517935810-31177-6-git-send-email-ldufour@linux.vnet.ibm.com>
X-Mailing-List: linux-kernel@vger.kernel.org

From: Peter Zijlstra

When speculating faults (without holding mmap_sem) we need to validate
that the vma against which we loaded pages is still valid when we're
ready to install the new PTE.

Therefore, replace the pte_offset_map_lock() calls that (re)take the
PTL with pte_map_lock() which can fail in case we find the VMA changed
since we started the fault.

Signed-off-by: Peter Zijlstra (Intel)

[Port to 4.12 kernel]
[Remove the comment about the fault_env structure which has been
 implemented as the vm_fault structure in the kernel]
Signed-off-by: Laurent Dufour
---
 include/linux/mm.h |  1 +
 mm/memory.c        | 56 ++++++++++++++++++++++++++++++++++++++----------------
 2 files changed, 41 insertions(+), 16 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 47c06fd20f6a..51d950cac772 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -302,6 +302,7 @@ extern pgprot_t protection_map[16];
 #define FAULT_FLAG_USER		0x40	/* The fault originated in userspace */
 #define FAULT_FLAG_REMOTE	0x80	/* faulting for non current tsk/mm */
 #define FAULT_FLAG_INSTRUCTION  0x100	/* The fault was during an instruction fetch */
+#define FAULT_FLAG_SPECULATIVE	0x200	/* Speculative fault, not holding mmap_sem */
 
 #define FAULT_FLAG_TRACE \
 	{ FAULT_FLAG_WRITE,		"WRITE" }, \
diff --git a/mm/memory.c b/mm/memory.c
index 32b9eb77d95c..bb058527525a 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2452,6 +2452,13 @@ static inline void wp_page_reuse(struct vm_fault *vmf)
 	pte_unmap_unlock(vmf->pte, vmf->ptl);
 }
 
+static bool pte_map_lock(struct vm_fault *vmf)
+{
+	vmf->pte = pte_offset_map_lock(vmf->vma->vm_mm, vmf->pmd,
+				       vmf->address, &vmf->ptl);
+	return true;
+}
+
 /*
  * Handle the case of a page which we actually need to copy to a new page.
  *
@@ -2479,6 +2486,7 @@ static int wp_page_copy(struct vm_fault *vmf)
 	const unsigned long mmun_start = vmf->address & PAGE_MASK;
 	const unsigned long mmun_end = mmun_start + PAGE_SIZE;
 	struct mem_cgroup *memcg;
+	int ret = VM_FAULT_OOM;
 
 	if (unlikely(anon_vma_prepare(vma)))
 		goto oom;
@@ -2506,7 +2514,11 @@ static int wp_page_copy(struct vm_fault *vmf)
 	/*
 	 * Re-check the pte - we dropped the lock
 	 */
-	vmf->pte = pte_offset_map_lock(mm, vmf->pmd, vmf->address, &vmf->ptl);
+	if (!pte_map_lock(vmf)) {
+		mem_cgroup_cancel_charge(new_page, memcg, false);
+		ret = VM_FAULT_RETRY;
+		goto oom_free_new;
+	}
 	if (likely(pte_same(*vmf->pte, vmf->orig_pte))) {
 		if (old_page) {
 			if (!PageAnon(old_page)) {
@@ -2598,7 +2610,7 @@ static int wp_page_copy(struct vm_fault *vmf)
 oom:
 	if (old_page)
 		put_page(old_page);
-	return VM_FAULT_OOM;
+	return ret;
 }
 
 /**
@@ -2619,8 +2631,8 @@ static int wp_page_copy(struct vm_fault *vmf)
 int finish_mkwrite_fault(struct vm_fault *vmf)
 {
 	WARN_ON_ONCE(!(vmf->vma->vm_flags & VM_SHARED));
-	vmf->pte = pte_offset_map_lock(vmf->vma->vm_mm, vmf->pmd, vmf->address,
-				       &vmf->ptl);
+	if (!pte_map_lock(vmf))
+		return VM_FAULT_RETRY;
 	/*
 	 * We might have raced with another page fault while we released the
 	 * pte_offset_map_lock.
@@ -2738,8 +2750,11 @@ static int do_wp_page(struct vm_fault *vmf)
 			get_page(vmf->page);
 			pte_unmap_unlock(vmf->pte, vmf->ptl);
 			lock_page(vmf->page);
-			vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd,
-					vmf->address, &vmf->ptl);
+			if (!pte_map_lock(vmf)) {
+				unlock_page(vmf->page);
+				put_page(vmf->page);
+				return VM_FAULT_RETRY;
+			}
 			if (!pte_same(*vmf->pte, vmf->orig_pte)) {
 				unlock_page(vmf->page);
 				pte_unmap_unlock(vmf->pte, vmf->ptl);
@@ -2967,8 +2982,10 @@ int do_swap_page(struct vm_fault *vmf)
 			 * Back out if somebody else faulted in this pte
 			 * while we released the pte lock.
 			 */
-			vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd,
-					vmf->address, &vmf->ptl);
+			if (!pte_map_lock(vmf)) {
+				delayacct_clear_flag(DELAYACCT_PF_SWAPIN);
+				return VM_FAULT_RETRY;
+			}
 			if (likely(pte_same(*vmf->pte, vmf->orig_pte)))
 				ret = VM_FAULT_OOM;
 			delayacct_clear_flag(DELAYACCT_PF_SWAPIN);
@@ -3024,8 +3041,11 @@ int do_swap_page(struct vm_fault *vmf)
 	/*
 	 * Back out if somebody else already faulted in this pte.
 	 */
-	vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd, vmf->address,
-			&vmf->ptl);
+	if (!pte_map_lock(vmf)) {
+		ret = VM_FAULT_RETRY;
+		mem_cgroup_cancel_charge(page, memcg, false);
+		goto out_page;
+	}
 	if (unlikely(!pte_same(*vmf->pte, vmf->orig_pte)))
 		goto out_nomap;
 
@@ -3154,8 +3174,8 @@ static int do_anonymous_page(struct vm_fault *vmf)
 			!mm_forbids_zeropage(vma->vm_mm)) {
 		entry = pte_mkspecial(pfn_pte(my_zero_pfn(vmf->address),
 						vma->vm_page_prot));
-		vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd,
-				vmf->address, &vmf->ptl);
+		if (!pte_map_lock(vmf))
+			return VM_FAULT_RETRY;
 		if (!pte_none(*vmf->pte))
 			goto unlock;
 		ret = check_stable_address_space(vma->vm_mm);
@@ -3190,8 +3210,11 @@ static int do_anonymous_page(struct vm_fault *vmf)
 	if (vma->vm_flags & VM_WRITE)
 		entry = pte_mkwrite(pte_mkdirty(entry));
 
-	vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd, vmf->address,
-			&vmf->ptl);
+	if (!pte_map_lock(vmf)) {
+		mem_cgroup_cancel_charge(page, memcg, false);
+		put_page(page);
+		return VM_FAULT_RETRY;
+	}
 	if (!pte_none(*vmf->pte))
 		goto release;
 
@@ -3315,8 +3338,9 @@ static int pte_alloc_one_map(struct vm_fault *vmf)
 	 * pte_none() under vmf->ptl protection when we return to
 	 * alloc_set_pte().
 	 */
-	vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd, vmf->address,
-			&vmf->ptl);
+	if (!pte_map_lock(vmf))
+		return VM_FAULT_RETRY;
+
 	return 0;
 }
 
-- 
2.7.4
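
Illustration only, not part of the patch: in this patch pte_map_lock()
always returns true, so the new VM_FAULT_RETRY paths are never taken yet.
The userspace sketch below shows one way the failing case described in
the changelog could look once the helper is allowed to fail. Everything
prefixed with model_ or MODEL_ is hypothetical, including the per-VMA
change counter "seq" and the snapshot taken at fault start; only the
FAULT_FLAG_SPECULATIVE value comes from the patch. It is a sketch under
those assumptions, not the kernel implementation.

```c
#include <assert.h>
#include <stdbool.h>

#define FAULT_FLAG_SPECULATIVE	0x200	/* value from the patch */
#define MODEL_FAULT_RETRY	1	/* hypothetical stand-in for VM_FAULT_RETRY */

/* Hypothetical stand-in for a VMA: just a counter bumped on every change. */
struct model_vma {
	unsigned int seq;
};

/* Hypothetical stand-in for struct vm_fault. */
struct model_fault {
	struct model_vma *vma;
	unsigned int flags;
	unsigned int seq_snapshot;	/* vma->seq sampled when the fault started */
	bool ptl_held;			/* models holding the page table lock */
};

/*
 * Model of a pte_map_lock() that can fail: a speculative fault refuses
 * to take the lock if the VMA changed since the snapshot was taken.
 * A non-speculative fault always succeeds, as in the patch.
 */
static bool model_pte_map_lock(struct model_fault *vmf)
{
	if ((vmf->flags & FAULT_FLAG_SPECULATIVE) &&
	    vmf->vma->seq != vmf->seq_snapshot)
		return false;

	vmf->ptl_held = true;
	return true;
}

/*
 * Caller pattern used throughout the patch: undo any earlier work
 * (nothing to undo in this model) and report a retry when the lock
 * cannot be taken; otherwise install the PTE under the lock.
 */
static int model_install_pte(struct model_fault *vmf)
{
	if (!model_pte_map_lock(vmf))
		return MODEL_FAULT_RETRY;

	assert(vmf->ptl_held);		/* the PTE would be installed here */
	vmf->ptl_held = false;		/* models pte_unmap_unlock() */
	return 0;
}
```

With a fault whose snapshot matches the VMA counter, model_install_pte()
returns 0. After the counter is bumped, the same speculative fault gets
MODEL_FAULT_RETRY, while a fault without FAULT_FLAG_SPECULATIVE still
succeeds.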