In-Reply-To: <1523975611-15978-25-git-send-email-ldufour@linux.vnet.ibm.com>
References: <1523975611-15978-1-git-send-email-ldufour@linux.vnet.ibm.com>
	<1523975611-15978-25-git-send-email-ldufour@linux.vnet.ibm.com>
From: Punit Agrawal
Date: Mon, 30 Apr 2018 19:43:13 +0100
Subject: Re: [PATCH v10 24/25] x86/mm: add speculative pagefault handling
To: Laurent Dufour
Cc: akpm@linux-foundation.org, mhocko@kernel.org, peterz@infradead.org,
	kirill@shutemov.name, ak@linux.intel.com, dave@stgolabs.net, jack@suse.cz,
	Matthew Wilcox, benh@kernel.crashing.org, mpe@ellerman.id.au,
	paulus@samba.org, Thomas Gleixner, Ingo Molnar, hpa@zytor.com,
	Will Deacon, Sergey Senozhatsky, Andrea Arcangeli, Alexei Starovoitov,
	kemi.wang@intel.com, sergey.senozhatsky.work@gmail.com, Daniel Jordan,
	David Rientjes, Jerome Glisse, Ganesh Mahendran,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	haren@linux.vnet.ibm.com, khandual@linux.vnet.ibm.com, npiggin@gmail.com,
	bsingharora@gmail.com, paulmck@linux.vnet.ibm.com, Tim Chen,
	linuxppc-dev@lists.ozlabs.org, x86@kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

Hi Laurent,

I am looking to add support for speculative page fault handling to
arm64 (effectively porting this patch) and had a few questions.
Apologies if I've missed an obvious explanation for my queries. I'm
jumping in a bit late to the discussion.

On Tue, Apr 17, 2018 at 3:33 PM, Laurent Dufour wrote:
> From: Peter Zijlstra
>
> Try a speculative fault before acquiring mmap_sem, if it returns with
> VM_FAULT_RETRY continue with the mmap_sem acquisition and do the
> traditional fault.
>
> Signed-off-by: Peter Zijlstra (Intel)
>
> [Clearing of FAULT_FLAG_ALLOW_RETRY is now done in
>  handle_speculative_fault()]
> [Retry with usual fault path in the case VM_ERROR is returned by
>  handle_speculative_fault(). This allows signal to be delivered]
> [Don't build SPF call if !CONFIG_SPECULATIVE_PAGE_FAULT]
> [Try speculative fault path only for multi threaded processes]
> [Try reuse to the VMA fetch during the speculative path in case of retry]
> [Call reuse_spf_or_find_vma()]
> [Handle memory protection key fault]
> Signed-off-by: Laurent Dufour
> ---
>  arch/x86/mm/fault.c | 42 ++++++++++++++++++++++++++++++++++++++----
>  1 file changed, 38 insertions(+), 4 deletions(-)
>
> diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
> index 73bd8c95ac71..59f778386df5 100644
> --- a/arch/x86/mm/fault.c
> +++ b/arch/x86/mm/fault.c
> @@ -1220,7 +1220,7 @@ __do_page_fault(struct pt_regs *regs, unsigned long error_code,
>  	struct mm_struct *mm;
>  	int fault, major = 0;
>  	unsigned int flags = FAULT_FLAG_ALLOW_RETRY | FAULT_FLAG_KILLABLE;
> -	u32 pkey;
> +	u32 pkey, *pt_pkey = &pkey;
>
>  	tsk = current;
>  	mm = tsk->mm;
> @@ -1310,6 +1310,30 @@ __do_page_fault(struct pt_regs *regs, unsigned long error_code,
>  		flags |= FAULT_FLAG_INSTRUCTION;
>
>  	/*
> +	 * Do not try speculative page fault for kernel's pages and if
> +	 * the fault was due to protection keys since it can't be resolved.
> +	 */
> +	if (IS_ENABLED(CONFIG_SPECULATIVE_PAGE_FAULT) &&
> +	    !(error_code & X86_PF_PK)) {

You can simplify this condition by dropping the IS_ENABLED() check as
you already provide an alternate implementation of
handle_speculative_fault() when CONFIG_SPECULATIVE_PAGE_FAULT is not
defined.

> +		fault = handle_speculative_fault(mm, address, flags, &vma);
> +		if (fault != VM_FAULT_RETRY) {
> +			perf_sw_event(PERF_COUNT_SW_SPF, 1, regs, address);
> +			/*
> +			 * Do not advertise for the pkey value since we don't
> +			 * know it.
> +			 * This is not a matter as we checked for X86_PF_PK
> +			 * earlier, so we should not handle pkey fault here,
> +			 * but to be sure that mm_fault_error() callees will
> +			 * not try to use it, we invalidate the pointer.
> +			 */
> +			pt_pkey = NULL;
> +			goto done;
> +		}
> +	} else {
> +		vma = NULL;
> +	}

The else part can be dropped if vma is initialised to NULL when it is
declared at the top of the function.

> +
> +	/*
>  	 * When running in the kernel we expect faults to occur only to
>  	 * addresses in user space. All other faults represent errors in
>  	 * the kernel and should generate an OOPS. Unfortunately, in the
> @@ -1342,7 +1366,8 @@ __do_page_fault(struct pt_regs *regs, unsigned long error_code,
>  		might_sleep();
>  	}
>
> -	vma = find_vma(mm, address);
> +	if (!vma || !can_reuse_spf_vma(vma, address))
> +		vma = find_vma(mm, address);

Is there a measurable benefit from reusing the vma?

Dropping the vma reference unconditionally after speculative page
fault handling gets rid of the implicit state when "vma != NULL"
(increased ref-count). I found it a bit confusing to follow.
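
To make the three points above concrete, here is a rough, untested
userspace sketch of the shape I have in mind: no IS_ENABLED() check,
vma initialised to NULL at declaration, and the reference dropped as
soon as the speculative attempt asks for a retry. The struct layouts,
the mock handle_speculative_fault() (which just pins the vma and
returns VM_FAULT_RETRY), the put_vma() and find_vma() helpers, the
fault_path() wrapper and the constant values are all stand-ins made up
for illustration. They are not the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-ins for kernel types; fields are illustrative only. */
struct vm_area_struct { int ref_count; };
struct mm_struct { struct vm_area_struct *only_vma; };

#define VM_FAULT_RETRY	0x0400		/* illustrative value */
#define X86_PF_PK	(1UL << 5)	/* illustrative value */

/* Mock speculative handler: pins the vma, then asks for a retry. */
static int handle_speculative_fault(struct mm_struct *mm, unsigned long address,
				    unsigned int flags,
				    struct vm_area_struct **vma)
{
	(void)address;
	(void)flags;
	*vma = mm->only_vma;
	(*vma)->ref_count++;
	return VM_FAULT_RETRY;
}

/* Hypothetical helper that drops the reference taken above. */
static void put_vma(struct vm_area_struct *vma)
{
	vma->ref_count--;
}

/* Mock lookup: this toy mm has exactly one vma. */
static struct vm_area_struct *find_vma(struct mm_struct *mm,
				       unsigned long address)
{
	(void)address;
	return mm->only_vma;
}

/*
 * Returns the vma the traditional path would operate on, or NULL if the
 * fault was resolved speculatively ("goto done" in the patch).
 */
static struct vm_area_struct *fault_path(struct mm_struct *mm,
					 unsigned long address,
					 unsigned long error_code)
{
	struct vm_area_struct *vma = NULL;	/* no else branch needed */
	unsigned int flags = 0;

	assert(mm != NULL);

	if (!(error_code & X86_PF_PK)) {
		int fault = handle_speculative_fault(mm, address, flags, &vma);

		if (fault != VM_FAULT_RETRY)
			return NULL;

		/* Drop the reference right away: no implicit state left. */
		if (vma) {
			put_vma(vma);
			vma = NULL;
		}
	}

	/* Traditional path: always look the vma up again. */
	return find_vma(mm, address);
}
```

With this shape the later "vma = NULL" before the retry would not be
needed either, at the cost of one extra find_vma() whenever the
speculative path bails out. Whether that cost matters is the question
above.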
>  	if (unlikely(!vma)) {
>  		bad_area(regs, error_code, address);
>  		return;
> @@ -1409,8 +1434,15 @@ __do_page_fault(struct pt_regs *regs, unsigned long error_code,
>  		if (flags & FAULT_FLAG_ALLOW_RETRY) {
>  			flags &= ~FAULT_FLAG_ALLOW_RETRY;
>  			flags |= FAULT_FLAG_TRIED;
> -			if (!fatal_signal_pending(tsk))
> +			if (!fatal_signal_pending(tsk)) {
> +				/*
> +				 * Do not try to reuse this vma and fetch it
> +				 * again since we will release the mmap_sem.
> +				 */
> +				if (IS_ENABLED(CONFIG_SPECULATIVE_PAGE_FAULT))
> +					vma = NULL;

Regardless of the above comment, can the vma be reset here
unconditionally?

Thanks,
Punit