Date: Wed, 23 Aug 2017 10:58:42 -0600
From: Tycho Andersen
To: Mark Rutland
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	kernel-hardening@lists.openwall.com, Marco Benatto,
	Juerg Haefliger
Subject: Re: [kernel-hardening] [PATCH v5 04/10] arm64: Add __flush_tlb_one()
Message-ID: <20170823165842.k5lbxom45avvd7g2@smitten>
References: <20170809200755.11234-1-tycho@docker.com>
 <20170809200755.11234-5-tycho@docker.com>
 <20170812112603.GB16374@remoulade>
 <20170814163536.6njceqc3dip5lrlu@smitten>
 <20170814165047.GB23428@leverpostej>
In-Reply-To: <20170814165047.GB23428@leverpostej>

Hi Mark,

On Mon, Aug 14, 2017 at 05:50:47PM +0100, Mark Rutland wrote:
> That said, is there any reason not to use flush_tlb_kernel_range()
> directly?

So it turns out that there is a difference between __flush_tlb_one() and
flush_tlb_kernel_range() on x86: flush_tlb_kernel_range() flushes the
TLBs on all CPUs via on_each_cpu(), whereas __flush_tlb_one() only
flushes the local TLB (which I think is enough here).
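To make the distinction concrete, here's a simplified sketch (in C
syntax, not the actual kernel source, and not meant to compile on its
own) of roughly what the two x86 paths do; do_flush_tlb_all() here is an
illustrative helper name:

```c
/*
 * Simplified, non-compilable sketch of the x86 difference.
 *
 * __flush_tlb_one(): invalidates one page translation in the
 * *local* CPU's TLB only -- a single INVLPG instruction, no IPIs.
 */
static inline void __flush_tlb_one(unsigned long addr)
{
	asm volatile("invlpg (%0)" :: "r" (addr) : "memory");
}

/*
 * flush_tlb_kernel_range(): sends an IPI to every online CPU via
 * on_each_cpu() and flushes there as well, which is where the extra
 * cost in the kernbench numbers below comes from.
 */
static void do_flush_tlb_all(void *info)	/* illustrative helper */
{
	__flush_tlb_all();	/* flush on the CPU handling the IPI */
}

void flush_tlb_kernel_range(unsigned long start, unsigned long end)
{
	on_each_cpu(do_flush_tlb_all, NULL, 1);	/* wait for completion */
}
```

So the broadcast version pays for an IPI round-trip and a flush on every
CPU, while the local version is a single instruction.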
As you might expect, this is quite a performance hit (at least under
kvm); I ran a little kernbench:

# __flush_tlb_one
Wed Aug 23 15:47:33 UTC 2017
4.13.0-rc5+
Average Half load -j 2 Run (std deviation):
Elapsed Time 50.3233 (1.82716)
User Time 87.1233 (1.26871)
System Time 15.36 (0.500899)
Percent CPU 203.667 (4.04145)
Context Switches 7350.33 (1339.65)
Sleeps 16008.3 (980.362)

Average Optimal load -j 4 Run (std deviation):
Elapsed Time 27.4267 (0.215019)
User Time 88.6983 (1.91501)
System Time 13.1933 (2.39488)
Percent CPU 286.333 (90.6083)
Context Switches 11393 (4509.14)
Sleeps 15764.7 (698.048)

# flush_tlb_kernel_range()
Wed Aug 23 16:00:03 UTC 2017
4.13.0-rc5+
Average Half load -j 2 Run (std deviation):
Elapsed Time 86.57 (1.06099)
User Time 103.25 (1.85475)
System Time 75.4433 (0.415852)
Percent CPU 205.667 (3.21455)
Context Switches 9363.33 (1361.57)
Sleeps 14703.3 (1439.12)

Average Optimal load -j 4 Run (std deviation):
Elapsed Time 51.27 (0.615873)
User Time 110.328 (7.93884)
System Time 74.06 (1.55788)
Percent CPU 288 (90.2197)
Context Switches 16557.5 (7930.01)
Sleeps 14774.7 (921.746)

So, I think we need to keep something like __flush_tlb_one() around.
I'll call it flush_one_local_tlb() for now, and will cc x86@ on the
next version to see if they have any insight.

Cheers,

Tycho