From: Will Deacon
To: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org, linux-arm-kernel@lists.infradead.org,
	Will Deacon, Catalin Marinas, Jan Kara, Minchan Kim,
	Andrew Morton, "Kirill A . Shutemov", Linus Torvalds,
	Vinayak Menon, Hugh Dickins, kernel-team@android.com
Subject: [PATCH v2 2/3] mm: Allow architectures to request 'old' entries when prefaulting
Date: Fri, 8 Jan 2021 17:15:16 +0000
Message-Id: <20210108171517.5290-3-will@kernel.org>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20210108171517.5290-1-will@kernel.org>
References: <20210108171517.5290-1-will@kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

Commit 5c0a85fad949 ("mm: make faultaround produce old ptes") changed
the "faultaround" behaviour to initialise prefaulted PTEs as 'old',
since this avoids vmscan wrongly assuming that they are hot, despite
having never been explicitly accessed by userspace. The change has been
shown to benefit numerous arm64 micro-architectures (with hardware
access flag) running Android, where both application launch latency and
direct reclaim time are significantly reduced.

Unfortunately, commit 315d09bf30c2 ("Revert "mm: make faultaround
produce old ptes"") reverted the change due to it being identified as
the cause of a ~6% regression in unixbench on x86. Experiments on a
variety of recent arm64 micro-architectures indicate that unixbench is
not regressed by the original commit there; it instead yields a 0-1%
performance improvement.

Since one size does not fit all for the initial state of prefaulted
PTEs, introduce arch_wants_old_prefaulted_pte(), which allows an
architecture to opt in to 'old' prefaulted PTEs at runtime based on
whatever criteria it may have.

Cc: Jan Kara
Cc: Minchan Kim
Cc: Andrew Morton
Cc: Kirill A. Shutemov
Cc: Linus Torvalds
Reported-by: Vinayak Menon
Signed-off-by: Will Deacon
---
 include/linux/mm.h |  5 ++++-
 mm/filemap.c       | 12 ++++++++----
 mm/memory.c        | 20 +++++++++++++++++++-
 3 files changed, 31 insertions(+), 6 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 801dd99f733c..873e410d6238 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -434,6 +434,7 @@ extern pgprot_t protection_map[16];
  * @FAULT_FLAG_REMOTE: The fault is not for current task/mm.
  * @FAULT_FLAG_INSTRUCTION: The fault was during an instruction fetch.
  * @FAULT_FLAG_INTERRUPTIBLE: The fault can be interrupted by non-fatal signals.
+ * @FAULT_FLAG_PREFAULT: Fault was a prefault.
  *
  * About @FAULT_FLAG_ALLOW_RETRY and @FAULT_FLAG_TRIED: we can specify
  * whether we would allow page faults to retry by specifying these two
@@ -464,6 +465,7 @@ extern pgprot_t protection_map[16];
 #define FAULT_FLAG_REMOTE			0x80
 #define FAULT_FLAG_INSTRUCTION			0x100
 #define FAULT_FLAG_INTERRUPTIBLE		0x200
+#define FAULT_FLAG_PREFAULT			0x400
 
 /*
  * The default fault flags that should be used by most of the
@@ -501,7 +503,8 @@ static inline bool fault_flag_allow_retry_first(unsigned int flags)
 	{ FAULT_FLAG_USER,		"USER" }, \
 	{ FAULT_FLAG_REMOTE,		"REMOTE" }, \
 	{ FAULT_FLAG_INSTRUCTION,	"INSTRUCTION" }, \
-	{ FAULT_FLAG_INTERRUPTIBLE,	"INTERRUPTIBLE" }
+	{ FAULT_FLAG_INTERRUPTIBLE,	"INTERRUPTIBLE" }, \
+	{ FAULT_FLAG_PREFAULT,		"PREFAULT" }
 
 /*
  * vm_fault is filled by the pagefault handler and passed to the vma's
diff --git a/mm/filemap.c b/mm/filemap.c
index c1f2dc89b8a7..0fb9d1714797 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -3051,14 +3051,18 @@ vm_fault_t filemap_map_pages(struct vm_fault *vmf,
 		if (!pte_none(*vmf->pte))
 			goto unlock;
 
+		/* We're about to handle the fault */
+		if (vmf->address == address) {
+			vmf->flags &= ~FAULT_FLAG_PREFAULT;
+			ret = VM_FAULT_NOPAGE;
+		} else {
+			vmf->flags |= FAULT_FLAG_PREFAULT;
+		}
+
 		do_set_pte(vmf, page);
 		/* no need to invalidate: a not-present page won't be cached */
 		update_mmu_cache(vma, vmf->address, vmf->pte);
 		unlock_page(head);
-
-		/* The fault is handled */
-		if (vmf->address == address)
-			ret = VM_FAULT_NOPAGE;
 		continue;
 unlock:
 		unlock_page(head);
diff --git a/mm/memory.c b/mm/memory.c
index 3e2fc2950ad7..f0e7c589ca9d 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -134,6 +134,18 @@ static inline bool arch_faults_on_old_pte(void)
 }
 #endif
 
+#ifndef arch_wants_old_prefaulted_pte
+static inline bool arch_wants_old_prefaulted_pte(void)
+{
+	/*
+	 * Transitioning a PTE from 'old' to 'young' can be expensive on
+	 * some architectures, even if it's performed in hardware. By
+	 * default, "false" means prefaulted entries will be 'young'.
+	 */
+	return false;
+}
+#endif
+
 static int __init disable_randmaps(char *s)
 {
 	randomize_va_space = 0;
@@ -3725,11 +3737,17 @@ void do_set_pte(struct vm_fault *vmf, struct page *page)
 {
 	struct vm_area_struct *vma = vmf->vma;
 	bool write = vmf->flags & FAULT_FLAG_WRITE;
+	bool prefault = vmf->flags & FAULT_FLAG_PREFAULT;
 	pte_t entry;
 
 	flush_icache_page(vma, page);
 	entry = mk_pte(page, vma->vm_page_prot);
-	entry = pte_sw_mkyoung(entry);
+
+	if (prefault && arch_wants_old_prefaulted_pte())
+		entry = pte_mkold(entry);
+	else
+		entry = pte_sw_mkyoung(entry);
+
 	if (write)
 		entry = maybe_mkwrite(pte_mkdirty(entry), vma);
 	/* copy-on-write page */
-- 
2.29.2.729.g45daf8777d-goog
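
For readers who want to poke at the young/old decision outside the kernel,
the following is a minimal userspace sketch of the logic in the hunks above.
It is not kernel code: struct model_pte, the MODEL_FLAG_* values, the
model_* helpers and the model_hw_access_flag toggle are all hypothetical
names invented for illustration. Only the #ifndef fallback pattern and the
shape of the prefault check are taken from the patch, and how a real
architecture implements its override is an assumption here.

```c
#include <stdbool.h>

/* Hypothetical flag values for this model; not the kernel's definitions. */
#define MODEL_FLAG_WRITE	0x01u
#define MODEL_FLAG_PREFAULT	0x400u

/* Stand-in for a PTE: only the two bits this patch touches. */
struct model_pte {
	bool young;
	bool dirty;
};

/*
 * Hypothetical architecture override. Defining the macro before the
 * generic fallback below is what makes the #ifndef skip the default.
 * The runtime toggle is an assumption standing in for "whatever
 * criteria the architecture may have".
 */
static bool model_hw_access_flag = true;

static inline bool model_arch_wants_old(void)
{
	return model_hw_access_flag;
}
#define arch_wants_old_prefaulted_pte model_arch_wants_old

/* Generic fallback, same shape as in the patch: default is 'young'. */
#ifndef arch_wants_old_prefaulted_pte
static inline bool arch_wants_old_prefaulted_pte(void)
{
	return false;
}
#endif

/* Models the filemap_map_pages() hunk: only the faulting address is not a prefault. */
static unsigned int model_flags_for(unsigned int flags,
				    unsigned long fault_addr,
				    unsigned long addr)
{
	if (fault_addr == addr)
		return flags & ~MODEL_FLAG_PREFAULT;
	return flags | MODEL_FLAG_PREFAULT;
}

/* Models the do_set_pte() hunk: prefaulted entries may start out 'old'. */
static struct model_pte model_do_set_pte(unsigned int flags)
{
	struct model_pte entry = { .young = false, .dirty = false };
	bool write = flags & MODEL_FLAG_WRITE;
	bool prefault = flags & MODEL_FLAG_PREFAULT;

	if (prefault && arch_wants_old_prefaulted_pte())
		entry.young = false;
	else
		entry.young = true;

	if (write)
		entry.dirty = true;

	return entry;
}
```

In this model, the entry for the faulting address is always 'young', while
neighbouring entries come out 'old' only when the override returns true.
Deleting the #define line makes the fallback apply, which restores the
behaviour from before the patch.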