From: Nadav Amit
To: Peter Zijlstra, Borislav Petkov
CC: Andy Lutomirski, Ingo Molnar, Thomas Gleixner, Nadav Amit, Dave Hansen
Subject: [PATCH] x86/mm/tlb: Remove flush_tlb_info from the stack
Date: Mon, 22 Apr 2019 23:57:06 -0700
Message-ID: <20190423065706.15430-1-namit@vmware.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Remove flush_tlb_info variables from the stack.
This allows flush_tlb_info to be aligned to a cache line, avoiding
potentially unnecessary cache-line movements. It also gives the
variables a fixed virtual-to-physical translation, which reduces TLB
misses.

Use a per-CPU struct for flush_tlb_mm_range() and
flush_tlb_kernel_range(). Add debug assertions to ensure there are no
nested TLB flushes that might overwrite the per-CPU data. For
arch_tlbbatch_flush(), use a const struct.

Results when running a microbenchmark that performs 10^6 MADV_DONTNEED
operations and touches a page, while 3 additional threads run a
busy-wait loop (5 runs):

			base		off-stack
			----		---------
avg (per operation)	1.629		1.580 (-3%)
stddev			0.007		0.012

Cc: Peter Zijlstra
Cc: Andy Lutomirski
Cc: Dave Hansen
Cc: Borislav Petkov
Cc: Thomas Gleixner
Signed-off-by: Nadav Amit
---
 arch/x86/mm/tlb.c | 75 ++++++++++++++++++++++++++++++++++-------------
 1 file changed, 54 insertions(+), 21 deletions(-)

diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index 487b8474c01c..c4ac66dfb34e 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -634,7 +634,7 @@ static void flush_tlb_func_common(const struct flush_tlb_info *f,
 	this_cpu_write(cpu_tlbstate.ctxs[loaded_mm_asid].tlb_gen, mm_tlb_gen);
 }
 
-static void flush_tlb_func_local(void *info, enum tlb_flush_reason reason)
+static void flush_tlb_func_local(const void *info, enum tlb_flush_reason reason)
 {
 	const struct flush_tlb_info *f = info;
 
@@ -722,43 +722,62 @@ void native_flush_tlb_others(const struct cpumask *cpumask,
  */
 unsigned long tlb_single_page_flush_ceiling __read_mostly = 33;
 
+
+static DEFINE_PER_CPU_SHARED_ALIGNED(struct flush_tlb_info, flush_tlb_info);
+
+#ifdef CONFIG_DEBUG_VM
+static DEFINE_PER_CPU(unsigned int, flush_tlb_info_idx);
+#endif
+
 void flush_tlb_mm_range(struct mm_struct *mm, unsigned long start,
 				unsigned long end, unsigned int stride_shift,
 				bool freed_tables)
 {
+	struct flush_tlb_info *info;
 	int cpu;
 
-	struct flush_tlb_info info = {
-		.mm = mm,
-		.stride_shift = stride_shift,
-		.freed_tables = freed_tables,
-	};
-
 	cpu = get_cpu();
+	info = this_cpu_ptr(&flush_tlb_info);
+
+	/*
+	 * Ensure that the following code is non-reentrant and flush_tlb_info
+	 * is not overwritten. This means no TLB flushing is initiated by
+	 * interrupt handlers and machine-check exception handlers.
+	 */
+#ifdef CONFIG_DEBUG_VM
+	BUG_ON(this_cpu_inc_return(flush_tlb_info_idx) != 1);
+#endif
 
 	/* This is also a barrier that synchronizes with switch_mm(). */
-	info.new_tlb_gen = inc_mm_tlb_gen(mm);
+	info->new_tlb_gen = inc_mm_tlb_gen(mm);
+	info->mm = mm;
+	info->stride_shift = stride_shift;
+	info->freed_tables = freed_tables;
 
 	/* Should we flush just the requested range? */
 	if ((end != TLB_FLUSH_ALL) &&
 	    ((end - start) >> stride_shift) <= tlb_single_page_flush_ceiling) {
-		info.start = start;
-		info.end = end;
+		info->start = start;
+		info->end = end;
 	} else {
-		info.start = 0UL;
-		info.end = TLB_FLUSH_ALL;
+		info->start = 0UL;
+		info->end = TLB_FLUSH_ALL;
 	}
 
 	if (mm == this_cpu_read(cpu_tlbstate.loaded_mm)) {
-		VM_WARN_ON(irqs_disabled());
+		lockdep_assert_irqs_enabled();
 		local_irq_disable();
-		flush_tlb_func_local(&info, TLB_LOCAL_MM_SHOOTDOWN);
+		flush_tlb_func_local(info, TLB_LOCAL_MM_SHOOTDOWN);
 		local_irq_enable();
 	}
 
 	if (cpumask_any_but(mm_cpumask(mm), cpu) < nr_cpu_ids)
-		flush_tlb_others(mm_cpumask(mm), &info);
+		flush_tlb_others(mm_cpumask(mm), info);
 
+#ifdef CONFIG_DEBUG_VM
+	barrier();
+	this_cpu_dec(flush_tlb_info_idx);
+#endif
 	put_cpu();
 }
 
@@ -787,22 +806,36 @@ static void do_kernel_range_flush(void *info)
 
 void flush_tlb_kernel_range(unsigned long start, unsigned long end)
 {
-
 	/* Balance as user space task's flush, a bit conservative */
 	if (end == TLB_FLUSH_ALL ||
 	    (end - start) > tlb_single_page_flush_ceiling << PAGE_SHIFT) {
 		on_each_cpu(do_flush_tlb_all, NULL, 1);
 	} else {
-		struct flush_tlb_info info;
-		info.start = start;
-		info.end = end;
-		on_each_cpu(do_kernel_range_flush, &info, 1);
+		struct flush_tlb_info *info;
+
+		preempt_disable();
+
+#ifdef CONFIG_DEBUG_VM
+		BUG_ON(this_cpu_inc_return(flush_tlb_info_idx) != 1);
+#endif
+
+		info = this_cpu_ptr(&flush_tlb_info);
+		info->start = start;
+		info->end = end;
+
+		on_each_cpu(do_kernel_range_flush, info, 1);
+
+#ifdef CONFIG_DEBUG_VM
+		barrier();
+		this_cpu_dec(flush_tlb_info_idx);
+#endif
+		preempt_enable();
 	}
 }
 
 void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch)
 {
-	struct flush_tlb_info info = {
+	static const struct flush_tlb_info info = {
 		.mm = NULL,
 		.start = 0UL,
 		.end = TLB_FLUSH_ALL,
-- 
2.19.1
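
For anyone who wants to try the pattern outside the kernel, here is a
minimal userspace sketch. It is not part of the patch and only
approximates it: a C11 thread-local, cache-line-aligned slot stands in
for the per-CPU flush_tlb_info, and a thread-local counter stands in for
flush_tlb_info_idx. The names flush_info, get_flush_info(),
put_flush_info(), example_flush_end(), CACHE_LINE_SIZE, FLUSH_ALL and
FLUSH_CEILING are hypothetical, and the 64-byte cache line is an
assumption.

```c
#include <assert.h>
#include <stdbool.h>

#define CACHE_LINE_SIZE	64	/* assumed cache-line size */
#define FLUSH_ALL	(~0UL)	/* stand-in for TLB_FLUSH_ALL */
#define FLUSH_CEILING	33UL	/* stand-in for tlb_single_page_flush_ceiling */

/* Hypothetical, simplified stand-in for struct flush_tlb_info. */
struct flush_info {
	unsigned long start;
	unsigned long end;
	unsigned int stride_shift;
	bool freed_tables;
};

/* One aligned slot per thread, playing the role of the per-CPU variable. */
static _Thread_local _Alignas(CACHE_LINE_SIZE) struct flush_info flush_info_slot;

/* Nesting counter, playing the role of flush_tlb_info_idx. */
static _Thread_local unsigned int flush_info_idx;

/* Claim the slot and fill it in; asserts that no flush is already in flight. */
static struct flush_info *get_flush_info(unsigned long start, unsigned long end,
					 unsigned int stride_shift,
					 bool freed_tables)
{
	struct flush_info *info = &flush_info_slot;

	flush_info_idx++;
	assert(flush_info_idx == 1);	/* a nested user would overwrite the slot */

	/* Flush just the requested range only if it is small enough. */
	if (end != FLUSH_ALL &&
	    ((end - start) >> stride_shift) <= FLUSH_CEILING) {
		info->start = start;
		info->end = end;
	} else {
		info->start = 0UL;
		info->end = FLUSH_ALL;
	}
	info->stride_shift = stride_shift;
	info->freed_tables = freed_tables;
	return info;
}

/* Release the slot so that the next flush on this thread may claim it. */
static void put_flush_info(void)
{
	assert(flush_info_idx == 1);
	flush_info_idx--;
}

/* Example caller: returns the end address that would actually be flushed. */
static unsigned long example_flush_end(unsigned long start, unsigned long end)
{
	struct flush_info *info = get_flush_info(start, end, 12, false);
	unsigned long chosen_end = info->end;

	put_flush_info();
	return chosen_end;
}
```

With a stride shift of 12, example_flush_end(0x1000, 0x3000) keeps the
requested range, while example_flush_end(0, 0x100000) covers 256 strides
and falls back to FLUSH_ALL. Calling get_flush_info() a second time
before put_flush_info() trips the assertion, which is the nesting case
the CONFIG_DEBUG_VM checks in the patch are meant to catch.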