From: Alexander Van Brunt
To: Will Deacon, Ashish Mhetre
CC: "mark.rutland@arm.com", "linux-arm-kernel@lists.infradead.org", "linux-tegra@vger.kernel.org", Sachin Nikam, "linux-kernel@vger.kernel.org"
Subject: Re: [PATCH V3] arm64: Don't flush tlb while clearing the accessed bit
Date: Mon, 29 Oct 2018 15:13:08 +0000
References: <1540805158-618-1-git-send-email-amhetre@nvidia.com>,<20181029105515.GD14127@arm.com>
In-Reply-To: <20181029105515.GD14127@arm.com>
X-Mailing-List: linux-kernel@vger.kernel.org

> If we roll a TLB invalidation routine without the trailing DSB, what sort of
> performance does that get you?

We have been doing our testing on our Carmel CPUs. Carmel will effectively
ignore a TLB invalidate that doesn't have a DSB (until the invalidate buffer
overflows). So, I expect the performance to be the same as with no TLB
invalidate, but not represent the performance of other ARMv8 CPUs.

From: Will Deacon
Sent: Monday, October 29, 2018 3:55 AM
To: Ashish Mhetre
Cc: mark.rutland@arm.com; linux-arm-kernel@lists.infradead.org; linux-tegra@vger.kernel.org; Alexander Van Brunt; Sachin Nikam; linux-kernel@vger.kernel.org
Subject: Re: [PATCH V3] arm64: Don't flush tlb while clearing the accessed bit

On Mon, Oct 29, 2018 at 02:55:58PM +0530, Ashish Mhetre wrote:
> From: Alex Van Brunt
>
> Accessed bit is used to age a page and in generic implementation there is
> flush_tlb while clearing the accessed bit.
> Flushing a TLB is overhead on ARM64 as access flag faults don't get
> translation table entries cached into TLB's. Flushing TLB is not necessary
> for this. Clearing the accessed bit without flushing TLB doesn't cause data
> corruption on ARM64.
> In our case with this patch, speed of reading from fast NVMe/SSD through
> PCIe got improved by 10% ~ 15% and writing got improved by 20% ~ 40%.
> So for performance optimisation don't flush TLB when clearing the accessed
> bit on ARM64.
> x86 made the same optimization even though their TLB invalidate is much
> faster as it doesn't broadcast to other CPUs.

Ok, but they may end up using IPIs so lets avoid these vague performance
claims in the log unless they're backed up with numbers.

> Please refer to:
> 'commit b13b1d2d8692 ("x86/mm: In the PTE swapout page reclaim case clear
> the accessed bit instead of flushing the TLB")'
>
> Signed-off-by: Alex Van Brunt
> Signed-off-by: Ashish Mhetre
> ---
>  arch/arm64/include/asm/pgtable.h | 20 ++++++++++++++++++++
>  1 file changed, 20 insertions(+)
>
> diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
> index 2ab2031..080d842 100644
> --- a/arch/arm64/include/asm/pgtable.h
> +++ b/arch/arm64/include/asm/pgtable.h
> @@ -652,6 +652,26 @@ static inline int ptep_test_and_clear_young(struct vm_area_struct *vma,
>         return __ptep_test_and_clear_young(ptep);
>  }
>
> +#define __HAVE_ARCH_PTEP_CLEAR_YOUNG_FLUSH
> +static inline int ptep_clear_flush_young(struct vm_area_struct *vma,
> +                                      unsigned long address, pte_t *ptep)
> +{
> +     /*
> +      * On ARM64 CPUs, clearing the accessed bit without a TLB flush
> +      * doesn't cause data corruption. [ It could cause incorrect
> +      * page aging and the (mistaken) reclaim of hot pages, but the
> +      * chance of that should be relatively low. ]
> +      *
> +      * So as a performance optimization don't flush the TLB when
> +      * clearing the accessed bit, it will eventually be flushed by
> +      * a context switch or a VM operation anyway. [ In the rare
> +      * event of it not getting flushed for a long time the delay
> +      * shouldn't really matter because there's no real memory
> +      * pressure for swapout to react to. ]

This is blindly copied from x86 and isn't true for us: we don't invalidate
the TLB on context switch. That means our window for keeping the stale
entries around is potentially much bigger and might not be a great idea.

If we roll a TLB invalidation routine without the trailing DSB, what sort of
performance does that get you?

Will
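
The page-aging concern discussed in this thread can be illustrated with a toy
user-space model. This is only a sketch under loud assumptions: the toy_*
names are hypothetical and are not kernel code, the model has a single page
and a single TLB slot, and it assumes that a TLB hit never updates the
accessed bit while a miss sets it and caches the translation. It does not
model DSB, broadcast invalidation, or Carmel's invalidate buffer.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model (hypothetical): one page-table entry and one TLB slot. */
struct toy_pte { bool young; };   /* stands in for the accessed bit */
struct toy_tlb { bool valid; };   /* true while a translation is cached */

/*
 * A memory access. A TLB hit never touches the page table; a miss walks
 * it, sets the accessed bit and caches the translation.
 */
static void toy_access(struct toy_pte *pte, struct toy_tlb *tlb)
{
	if (tlb->valid)
		return;
	pte->young = true;
	tlb->valid = true;
}

/*
 * Clear the accessed bit, optionally invalidating the cached translation.
 * Returns whether the page was young before the call.
 */
static bool toy_clear_young(struct toy_pte *pte, struct toy_tlb *tlb, bool flush)
{
	bool was_young = pte->young;

	pte->young = false;
	if (flush)
		tlb->valid = false;
	return was_young;
}

/* Touch the page, age it, touch it again; report whether it looks young. */
static bool toy_young_after_reaccess(bool flush)
{
	struct toy_pte pte = { .young = false };
	struct toy_tlb tlb = { .valid = false };

	toy_access(&pte, &tlb);            /* first touch: miss, marks the page young */
	assert(pte.young);
	toy_clear_young(&pte, &tlb, flush);
	toy_access(&pte, &tlb);            /* second touch: hit if the entry is stale */
	return pte.young;
}
```

In this model toy_young_after_reaccess(true) stands for the flushing path and
toy_young_after_reaccess(false) for the path without a flush. Only the former
observes the second access, so the page keeps looking cold for as long as the
stale entry stays cached.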