From: Takao Indoh
To: Jonathan Corbet, Catalin Marinas, Will Deacon
CC: QI Fuli, Takao Indoh
Subject: [PATCH 2/2] arm64: tlb: Add boot parameter to disable TLB flush within the same inner shareable domain
Date: Mon, 17 Jun 2019 23:32:55 +0900
Message-ID: <20190617143255.10462-3-indou.takao@jp.fujitsu.com>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20190617143255.10462-1-indou.takao@jp.fujitsu.com>
References: <20190617143255.10462-1-indou.takao@jp.fujitsu.com>

From: Takao Indoh

This patch adds a new boot parameter, 'disable_tlbflush_is', which
disables TLB flushes across the whole inner shareable domain, for
performance tuning.

Without this parameter, flush_tlb_mm() invalidates TLB entries with
__tlbi(aside1is, asid). That instruction makes every CPU in the same
inner shareable domain check whether it holds TLB entries for the given
ASID. This causes performance noise, especially in large-scale HPC
environments with more than a thousand nodes connected by a low-latency
interconnect.

When the new parameter is specified, TLB entries are instead invalidated
with __tlbi(aside1, asid), executed only on the CPUs in mm_cpumask(mm).
The TLB flush is therefore performed on the minimal set of CPUs and the
performance problem does not occur.

Signed-off-by: QI Fuli
Signed-off-by: Takao Indoh
---
 .../admin-guide/kernel-parameters.txt |   4 +
 arch/arm64/include/asm/tlbflush.h     |  61 ++-----
 arch/arm64/kernel/Makefile            |   2 +-
 arch/arm64/kernel/tlbflush.c          | 155 ++++++++++++++++++
 4 files changed, 172 insertions(+), 50 deletions(-)
 create mode 100644 arch/arm64/kernel/tlbflush.c
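
A note for reviewers, not part of the commit: the sketch below is a
rough userspace model of the trade-off described above, showing how many
CPUs each strategy disturbs for a process that only ever ran on two
CPUs. NR_CPUS, the mm_cpumask[] array and both helper functions are
invented for illustration; they stand in for the real mm_cpumask(mm),
the broadcast "tlbi aside1is" and the on_each_cpu_mask() IPI path.

/*
 * Rough userspace model only -- NOT kernel code. All names below are
 * stand-ins invented for illustration.
 */
#include <stdbool.h>
#include <stdio.h>

#define NR_CPUS 64

/* CPUs this mm has ever been scheduled on (models mm_cpumask(mm)). */
static bool mm_cpumask[NR_CPUS];

/*
 * Broadcast flush: every PE in the inner shareable domain performs the
 * ASID lookup, whether or not it ever ran the mm.
 */
static int cpus_disturbed_broadcast(void)
{
	return NR_CPUS;
}

/*
 * IPI-based flush: only the CPUs in the cpumask take an interrupt and
 * run a local, non-shareable invalidation.
 */
static int cpus_disturbed_ipi(void)
{
	int cpu, n = 0;

	for (cpu = 0; cpu < NR_CPUS; cpu++)
		if (mm_cpumask[cpu])
			n++;
	return n;
}

int main(void)
{
	/* Suppose the process only ever ran on CPUs 3 and 17. */
	mm_cpumask[3] = mm_cpumask[17] = true;

	printf("broadcast TLBI disturbs %d CPUs\n", cpus_disturbed_broadcast());
	printf("IPI-based flush disturbs %d CPUs\n", cpus_disturbed_ipi());
	return 0;
}

For this example it prints 64 versus 2, which is the gap the boot
parameter is meant to exploit on thousand-node systems.
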
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 138f6664b2e2..a693eea34e48 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -848,6 +848,10 @@
 	disable=	[IPV6]
 			See Documentation/networking/ipv6.txt.
 
+	disable_tlbflush_is
+			[ARM64] Disable using TLB instructions to flush
+			all PEs within the same inner shareable domain.
+
 	hardened_usercopy=
 			[KNL] Under CONFIG_HARDENED_USERCOPY, whether
 			hardening is enabled for this boot. Hardened
diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
index dff8f9ea5754..ba2b3fd0b63c 100644
--- a/arch/arm64/include/asm/tlbflush.h
+++ b/arch/arm64/include/asm/tlbflush.h
@@ -139,6 +139,13 @@
  * on top of these routines, since that is our interface to the mmu_gather
  * API as used by munmap() and friends.
  */
+
+void flush_tlb_mm(struct mm_struct *mm);
+void flush_tlb_page_nosync(struct vm_area_struct *vma,
+			   unsigned long uaddr);
+void __flush_tlb_range(struct vm_area_struct *vma, unsigned long start,
+		       unsigned long end, unsigned long stride, bool last_level);
+
 static inline void local_flush_tlb_all(void)
 {
 	dsb(nshst);
@@ -155,24 +162,14 @@ static inline void flush_tlb_all(void)
 	isb();
 }
 
-static inline void flush_tlb_mm(struct mm_struct *mm)
+static inline void local_flush_tlb_mm(struct mm_struct *mm)
 {
 	unsigned long asid = __TLBI_VADDR(0, ASID(mm));
 
-	dsb(ishst);
-	__tlbi(aside1is, asid);
-	__tlbi_user(aside1is, asid);
-	dsb(ish);
-}
-
-static inline void flush_tlb_page_nosync(struct vm_area_struct *vma,
-					 unsigned long uaddr)
-{
-	unsigned long addr = __TLBI_VADDR(uaddr, ASID(vma->vm_mm));
-
-	dsb(ishst);
-	__tlbi(vale1is, addr);
-	__tlbi_user(vale1is, addr);
+	dsb(nshst);
+	__tlbi(aside1, asid);
+	__tlbi_user(aside1, asid);
+	dsb(nsh);
 }
 
 static inline void flush_tlb_page(struct vm_area_struct *vma,
@@ -188,40 +185,6 @@ static inline void flush_tlb_page(struct vm_area_struct *vma,
  */
 #define MAX_TLBI_OPS	PTRS_PER_PTE
 
-static inline void __flush_tlb_range(struct vm_area_struct *vma,
-				     unsigned long start, unsigned long end,
-				     unsigned long stride, bool last_level)
-{
-	unsigned long asid = ASID(vma->vm_mm);
-	unsigned long addr;
-
-	start = round_down(start, stride);
-	end = round_up(end, stride);
-
-	if ((end - start) >= (MAX_TLBI_OPS * stride)) {
-		flush_tlb_mm(vma->vm_mm);
-		return;
-	}
-
-	/* Convert the stride into units of 4k */
-	stride >>= 12;
-
-	start = __TLBI_VADDR(start, asid);
-	end = __TLBI_VADDR(end, asid);
-
-	dsb(ishst);
-	for (addr = start; addr < end; addr += stride) {
-		if (last_level) {
-			__tlbi(vale1is, addr);
-			__tlbi_user(vale1is, addr);
-		} else {
-			__tlbi(vae1is, addr);
-			__tlbi_user(vae1is, addr);
-		}
-	}
-	dsb(ish);
-}
-
 static inline void flush_tlb_range(struct vm_area_struct *vma,
 				   unsigned long start, unsigned long end)
 {
diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index 9e7dcb2c31c7..266c9a57b081 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -19,7 +19,7 @@ obj-y			:= debug-monitors.o entry.o irq.o fpsimd.o	\
 			   return_address.o cpuinfo.o cpu_errata.o	\
 			   cpufeature.o alternative.o cacheinfo.o	\
 			   smp.o smp_spin_table.o topology.o smccc-call.o	\
-			   syscall.o
+			   syscall.o tlbflush.o
 
 extra-$(CONFIG_EFI)			:= efi-entry.o
 
diff --git a/arch/arm64/kernel/tlbflush.c b/arch/arm64/kernel/tlbflush.c
new file mode 100644
index 000000000000..52c9a237759a
--- /dev/null
+++ b/arch/arm64/kernel/tlbflush.c
@@ -0,0 +1,155 @@
+// SPDX-License-Identifier: GPL-2.0
+// Copyright (C) 2019 FUJITSU LIMITED
+
+#include <linux/smp.h>
+#include <asm/tlbflush.h>
+
+struct tlb_args {
+	struct vm_area_struct *ta_vma;
+	unsigned long ta_start;
+	unsigned long ta_end;
+	unsigned long ta_stride;
+	bool ta_last_level;
+};
+
+int disable_tlbflush_is;
+
+static int __init disable_tlbflush_is_setup(char *str)
+{
+	disable_tlbflush_is = 1;
+
+	return 0;
+}
+__setup("disable_tlbflush_is", disable_tlbflush_is_setup);
+
+static inline void __flush_tlb_mm(struct mm_struct *mm)
+{
+	unsigned long asid = __TLBI_VADDR(0, ASID(mm));
+
+	dsb(ishst);
+	__tlbi(aside1is, asid);
+	__tlbi_user(aside1is, asid);
+	dsb(ish);
+}
+
+static inline void ipi_flush_tlb_mm(void *arg)
+{
+	struct mm_struct *mm = arg;
+
+	local_flush_tlb_mm(mm);
+}
+
+void flush_tlb_mm(struct mm_struct *mm)
+{
+	if (disable_tlbflush_is)
+		on_each_cpu_mask(mm_cpumask(mm), ipi_flush_tlb_mm,
+				 (void *)mm, true);
+	else
+		__flush_tlb_mm(mm);
+}
+
+static inline void __flush_tlb_page_nosync(unsigned long addr)
+{
+	dsb(ishst);
+	__tlbi(vale1is, addr);
+	__tlbi_user(vale1is, addr);
+}
+
+static inline void __local_flush_tlb_page_nosync(unsigned long addr)
+{
+	dsb(nshst);
+	__tlbi(vale1, addr);
+	__tlbi_user(vale1, addr);
+	dsb(nsh);
+}
+
+static inline void ipi_flush_tlb_page_nosync(void *arg)
+{
+	unsigned long addr = *(unsigned long *)arg;
+
+	__local_flush_tlb_page_nosync(addr);
+}
+
+void flush_tlb_page_nosync(struct vm_area_struct *vma, unsigned long uaddr)
+{
+	unsigned long addr = __TLBI_VADDR(uaddr, ASID(vma->vm_mm));
+
+	if (disable_tlbflush_is)
+		on_each_cpu_mask(mm_cpumask(vma->vm_mm),
+				 ipi_flush_tlb_page_nosync, &addr, true);
+	else
+		__flush_tlb_page_nosync(addr);
+}
+
+static inline void ___flush_tlb_range(unsigned long start, unsigned long end,
+				      unsigned long stride, bool last_level)
+{
+	unsigned long addr;
+
+	dsb(ishst);
+	for (addr = start; addr < end; addr += stride) {
+		if (last_level) {
+			__tlbi(vale1is, addr);
+			__tlbi_user(vale1is, addr);
+		} else {
+			__tlbi(vae1is, addr);
+			__tlbi_user(vae1is, addr);
+		}
+	}
+	dsb(ish);
+}
+
+static inline void __local_flush_tlb_range(unsigned long addr, bool last_level)
+{
+	dsb(nshst);
+	if (last_level) {
+		__tlbi(vale1, addr);
+		__tlbi_user(vale1, addr);
+	} else {
+		__tlbi(vae1, addr);
+		__tlbi_user(vae1, addr);
+	}
+	dsb(nsh);
+}
+
+static inline void ipi_flush_tlb_range(void *arg)
+{
+	struct tlb_args *ta = (struct tlb_args *)arg;
+	unsigned long addr;
+
+	for (addr = ta->ta_start; addr < ta->ta_end; addr += ta->ta_stride)
+		__local_flush_tlb_range(addr, ta->ta_last_level);
+}
+
+void __flush_tlb_range(struct vm_area_struct *vma, unsigned long start,
+		       unsigned long end, unsigned long stride, bool last_level)
+{
+	unsigned long asid = ASID(vma->vm_mm);
+
+	start = round_down(start, stride);
+	end = round_up(end, stride);
+
+	if ((end - start) >= (MAX_TLBI_OPS * stride)) {
+		flush_tlb_mm(vma->vm_mm);
+		return;
+	}
+
+	/* Convert the stride into units of 4k */
+	stride >>= 12;
+
+	start = __TLBI_VADDR(start, asid);
+	end = __TLBI_VADDR(end, asid);
+
+	if (disable_tlbflush_is) {
+		struct tlb_args ta = {
+			.ta_start	= start,
+			.ta_end		= end,
+			.ta_stride	= stride,
+			.ta_last_level	= last_level,
+		};
+
+		on_each_cpu_mask(mm_cpumask(vma->vm_mm), ipi_flush_tlb_range,
+				 &ta, true);
+	} else
+		___flush_tlb_range(start, end, stride, last_level);
+}
-- 
2.20.1