From: Nadav Amit
To: Peter Zijlstra, Borislav Petkov
CC: Andy Lutomirski, Ingo Molnar, Thomas Gleixner, Nadav Amit, Dave Hansen
Subject: [PATCH v2] x86/mm/tlb: Remove flush_tlb_info from the stack
Date: Thu, 25 Apr 2019 11:08:28 -0700
Message-ID: <20190425180828.24959-1-namit@vmware.com>
X-Mailer: git-send-email 2.19.1
X-Mailing-List: linux-kernel@vger.kernel.org

Move flush_tlb_info variables off the stack.
This allows flush_tlb_info to be aligned to a cache line, avoiding
potentially unnecessary cache-line movements. It also gives the
variables a fixed virtual-to-physical translation, which reduces TLB
misses.

Use a per-CPU struct for flush_tlb_mm_range() and
flush_tlb_kernel_range(). Add debug assertions to ensure there are no
nested TLB flushes that might overwrite the per-CPU data. For
arch_tlbbatch_flush(), use a const struct.

Results from running a microbenchmark that performs 10^6 MADV_DONTNEED
operations and touches a page, while 3 additional threads run a
busy-wait loop (5 runs, PTI and retpolines turned off):

                base    off-stack
                ----    ---------
avg (usec/op)   1.629   1.570 (-3%)
stddev          0.014   0.009

Cc: Peter Zijlstra
Cc: Andy Lutomirski
Cc: Dave Hansen
Cc: Borislav Petkov
Cc: Thomas Gleixner
Signed-off-by: Nadav Amit

---

v1->v2:
- Initialize all flush_tlb_info fields [Andy]
---
 arch/x86/mm/tlb.c | 100 ++++++++++++++++++++++++++++++++++------------
 1 file changed, 74 insertions(+), 26 deletions(-)

diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index 487b8474c01c..aac191eb2b90 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -634,7 +634,7 @@ static void flush_tlb_func_common(const struct flush_tlb_info *f,
 	this_cpu_write(cpu_tlbstate.ctxs[loaded_mm_asid].tlb_gen, mm_tlb_gen);
 }
 
-static void flush_tlb_func_local(void *info, enum tlb_flush_reason reason)
+static void flush_tlb_func_local(const void *info, enum tlb_flush_reason reason)
 {
 	const struct flush_tlb_info *f = info;
 
@@ -722,43 +722,81 @@ void native_flush_tlb_others(const struct cpumask *cpumask,
  */
 unsigned long tlb_single_page_flush_ceiling __read_mostly = 33;
 
+static DEFINE_PER_CPU_SHARED_ALIGNED(struct flush_tlb_info, flush_tlb_info);
+
+#ifdef CONFIG_DEBUG_VM
+static DEFINE_PER_CPU(unsigned int, flush_tlb_info_idx);
+#endif
+
+static inline struct flush_tlb_info *get_flush_tlb_info(struct mm_struct *mm,
+			unsigned long start, unsigned long end,
+			unsigned int stride_shift, bool freed_tables,
+			u64 new_tlb_gen)
+{
+	struct flush_tlb_info *info = this_cpu_ptr(&flush_tlb_info);
+
+#ifdef CONFIG_DEBUG_VM
+	/*
+	 * Ensure that the following code is non-reentrant and flush_tlb_info
+	 * is not overwritten. This means no TLB flushing is initiated by
+	 * interrupt handlers and machine-check exception handlers.
+	 */
+	BUG_ON(this_cpu_inc_return(flush_tlb_info_idx) != 1);
+#endif
+
+	info->start = start;
+	info->end = end;
+	info->mm = mm;
+	info->stride_shift = stride_shift;
+	info->freed_tables = freed_tables;
+	info->new_tlb_gen = new_tlb_gen;
+
+	return info;
+}
+
+static inline void put_flush_tlb_info(void)
+{
+#ifdef CONFIG_DEBUG_VM
+	/* Complete reentrancy prevention checks */
+	barrier();
+	this_cpu_dec(flush_tlb_info_idx);
+#endif
+}
+
 void flush_tlb_mm_range(struct mm_struct *mm, unsigned long start,
 				unsigned long end, unsigned int stride_shift,
 				bool freed_tables)
 {
+	struct flush_tlb_info *info;
+	u64 new_tlb_gen;
 	int cpu;
 
-	struct flush_tlb_info info = {
-		.mm = mm,
-		.stride_shift = stride_shift,
-		.freed_tables = freed_tables,
-	};
-
 	cpu = get_cpu();
 
-	/* This is also a barrier that synchronizes with switch_mm(). */
-	info.new_tlb_gen = inc_mm_tlb_gen(mm);
-
 	/* Should we flush just the requested range? */
-	if ((end != TLB_FLUSH_ALL) &&
-	    ((end - start) >> stride_shift) <= tlb_single_page_flush_ceiling) {
-		info.start = start;
-		info.end = end;
-	} else {
-		info.start = 0UL;
-		info.end = TLB_FLUSH_ALL;
+	if ((end == TLB_FLUSH_ALL) ||
+	    ((end - start) >> stride_shift) > tlb_single_page_flush_ceiling) {
+		start = 0UL;
+		end = TLB_FLUSH_ALL;
 	}
 
+	/* This is also a barrier that synchronizes with switch_mm(). */
+	new_tlb_gen = inc_mm_tlb_gen(mm);
+
+	info = get_flush_tlb_info(mm, start, end, stride_shift, freed_tables,
+				  new_tlb_gen);
+
 	if (mm == this_cpu_read(cpu_tlbstate.loaded_mm)) {
-		VM_WARN_ON(irqs_disabled());
+		lockdep_assert_irqs_enabled();
 		local_irq_disable();
-		flush_tlb_func_local(&info, TLB_LOCAL_MM_SHOOTDOWN);
+		flush_tlb_func_local(info, TLB_LOCAL_MM_SHOOTDOWN);
 		local_irq_enable();
 	}
 
 	if (cpumask_any_but(mm_cpumask(mm), cpu) < nr_cpu_ids)
-		flush_tlb_others(mm_cpumask(mm), &info);
+		flush_tlb_others(mm_cpumask(mm), info);
 
+	put_flush_tlb_info();
 	put_cpu();
 }
 
@@ -787,22 +825,32 @@ static void do_kernel_range_flush(void *info)
 
 void flush_tlb_kernel_range(unsigned long start, unsigned long end)
 {
-
 	/* Balance as user space task's flush, a bit conservative */
 	if (end == TLB_FLUSH_ALL ||
 	    (end - start) > tlb_single_page_flush_ceiling << PAGE_SHIFT) {
 		on_each_cpu(do_flush_tlb_all, NULL, 1);
 	} else {
-		struct flush_tlb_info info;
-		info.start = start;
-		info.end = end;
-		on_each_cpu(do_kernel_range_flush, &info, 1);
+		struct flush_tlb_info *info;
+
+		preempt_disable();
+
+		info = get_flush_tlb_info(NULL, start, end, 0, false, 0);
+
+		info = this_cpu_ptr(&flush_tlb_info);
+		info->start = start;
+		info->end = end;
+
+		on_each_cpu(do_kernel_range_flush, info, 1);
+
+		put_flush_tlb_info();
+
+		preempt_enable();
 	}
 }
 
 void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch)
 {
-	struct flush_tlb_info info = {
+	static const struct flush_tlb_info info = {
 		.mm = NULL,
 		.start = 0UL,
 		.end = TLB_FLUSH_ALL,
-- 
2.19.1