Date: Mon, 29 Oct 2018 08:57:50 +0000
From: Mark Rutland
To: Ashish Mhetre
Cc: linux-arm-kernel@lists.infradead.org, linux-tegra@vger.kernel.org, avanbrunt@nvidia.com, linux-kernel@vger.kernel.org, vdumpa@nvidia.com
Subject: Re: [PATCH V2] arm64: Don't flush tlb while clearing the accessed bit
Message-ID: <20181029085738.kcjiwf5p6rdqeb6j@salmiak>
References: <1540794599-30922-1-git-send-email-amhetre@nvidia.com>
In-Reply-To: <1540794599-30922-1-git-send-email-amhetre@nvidia.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Oct 29, 2018 at 11:59:59AM
+0530, Ashish Mhetre wrote:
> From: Alex Van Brunt
>
> Accessed bit is used to age a page and in generic implementation there is
> flush_tlb while clearing the accessed bit.
> Flushing a TLB is overhead on ARM64 as access flag faults don't get
> translation table entries cached into TLB's. Flushing TLB is not necessary
> for this. Clearing the accessed bit without flushing TLB doesn't cause data
> corruption on ARM64. [It may cause incorrect page aging but chances of that
> should be relatively low.]
> In our case with this patch, speed of reading from fast NVMe/SSD through
> PCIe got improved by 10% ~ 15% and writing got improved by 20% ~ 40%.
> So for performance optimisation don't flush TLB when clearing the accessed
> bit on ARM64.
> x86 made the same optimization even though their TLB invalidate is much
> faster as it doesn't broadcast to other CPUs.

Please specifically refer to commit:

  b13b1d2d8692b437 ("x86/mm: In the PTE swapout page reclaim case clear the accessed bit instead of flushing the TLB")

... so that it's easy for people to track down the relevant x86 change.

>
> Signed-off-by: Alex Van Brunt
> Signed-off-by: Ashish Mhetre
> ---
> v2: Added comments about why flushing is not needed while clearing accessed bit
>
>  arch/arm64/include/asm/pgtable.h | 16 ++++++++++++++++
>  1 file changed, 16 insertions(+)
>
> diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
> index 2ab2031..33e1940 100644
> --- a/arch/arm64/include/asm/pgtable.h
> +++ b/arch/arm64/include/asm/pgtable.h
> @@ -652,6 +652,22 @@ static inline int ptep_test_and_clear_young(struct vm_area_struct *vma,
>  	return __ptep_test_and_clear_young(ptep);
>  }
>
> +#define __HAVE_ARCH_PTEP_CLEAR_YOUNG_FLUSH
> +static inline int ptep_clear_flush_young(struct vm_area_struct *vma,
> +					 unsigned long address, pte_t *ptep)
> +{
> +	/*
> +	 * Flushing a TLB is overhead on ARM64 as access flag faults don't get
> +	 * translation table entries cached into TLB's. Flushing TLB is not
> +	 * necessary for this. Clearing the accessed bit without flushing TLB
> +	 * doesn't cause data corruption on ARM64. [ It may cause incorrect page
> +	 * aging but chances of this should be comparatively low. ]
> +	 * So as a performance optimization don't flush the TLB when clearing
> +	 * the accessed bit.
> +	 */

Can we just copy the x86 comment from commit b13b1d2d8692b437?

Thanks,
Mark.

> +	return ptep_test_and_clear_young(vma, address, ptep);
> +}
> +
>  #ifdef CONFIG_TRANSPARENT_HUGEPAGE
>  #define __HAVE_ARCH_PMDP_TEST_AND_CLEAR_YOUNG
>  static inline int pmdp_test_and_clear_young(struct vm_area_struct *vma,
> --
> 2.7.4
>
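
For readers following the thread, the difference between the generic behaviour described in the commit message and the no-flush variant in the quoted hunk can be sketched in userspace C. This is only a rough model under stated assumptions, not kernel code: every model_* name is hypothetical, a PTE is modelled as a bare 64-bit word, the accessed flag is assumed to sit at bit 10, and a counter stands in for a real TLB invalidate.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model: accessed flag assumed at bit 10 of a 64-bit PTE. */
#define MODEL_PTE_AF (UINT64_C(1) << 10)

typedef _Atomic uint64_t model_pte_t;

/* Stand-in for TLB invalidations, so the two variants can be compared. */
static unsigned int model_tlb_flushes;

/*
 * Atomically clear the accessed flag and report whether it was set.
 * Returns 1 if the entry was "young" (flag set), 0 otherwise.
 */
static int model_test_and_clear_young(model_pte_t *ptep)
{
	uint64_t old;

	assert(ptep != NULL);
	old = atomic_fetch_and(ptep, ~MODEL_PTE_AF);
	return (old & MODEL_PTE_AF) != 0;
}

/* Generic-style variant: invalidate whenever the entry was young. */
static int model_clear_flush_young_generic(model_pte_t *ptep)
{
	int young = model_test_and_clear_young(ptep);

	if (young)
		model_tlb_flushes++;	/* models the flush being avoided */
	return young;
}

/* Variant along the lines of the quoted hunk: same clear, no flush. */
static int model_clear_flush_young_noflush(model_pte_t *ptep)
{
	return model_test_and_clear_young(ptep);
}
```

Both variants return the same "young" result and leave the other PTE bits untouched; only the flush counter differs. That matches the patch's claim that skipping the flush affects page-aging accuracy rather than correctness.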