From: Shanker Donthineni
To: Will Deacon, Robin Murphy, Mark Rutland, linux-kernel, linux-arm-kernel, Catalin Marinas, kvmarm
Cc: Marc Zyngier, Vikram Sethi, Philip Elcan, Shanker Donthineni
Subject: [PATCH v7] arm64: Add support for new control bits CTR_EL0.DIC and CTR_EL0.IDC
Date: Wed, 7 Mar 2018 09:00:08 -0600
Message-Id: <1520434808-29703-1-git-send-email-shankerd@codeaurora.org>

The DCache clean & ICache invalidation requirements for instruction to
data coherence are discoverable through new fields in CTR_EL0. Two
control bits, DIC and IDC, are defined for this purpose. Software does
not need to perform point of unification cache maintenance operations
on systems where the CPU caches are transparent.

This patch optimizes the three functions __flush_cache_user_range(),
__clean_dcache_area_pou() and invalidate_icache_range() if the hardware
reports CTR_EL0.IDC and/or CTR_EL0.DIC. It skips the two instructions
'DC CVAU' and 'IC IVAU', and the associated loop logic, in order to
avoid the unnecessary overhead.

CTR_EL0.DIC: Instruction cache invalidation requirements for
 instruction to data coherence. The meaning of bit[29]:
  0: Instruction cache invalidation to the point of unification
     is required for instruction to data coherence.
  1: Instruction cache invalidation to the point of unification is
     not required for instruction to data coherence.

CTR_EL0.IDC: Data cache clean requirements for instruction to data
 coherence. The meaning of bit[28]:
  0: Data cache clean to the point of unification is required for
     instruction to data coherence, unless CLIDR_EL1.LoC == 0b000
     or (CLIDR_EL1.LoUIS == 0b000 && CLIDR_EL1.LoUU == 0b000).
  1: Data cache clean to the point of unification is not required
     for instruction to data coherence.

Co-authored-by: Philip Elcan
Signed-off-by: Shanker Donthineni
---
Changes since v6:
  -Both I-Cache and D-Cache changes are symmetric as Will suggested.
  -Remove Kconfig option.
  -Patch __flush_icache_all().

Changes since v5:
  -Addressed Mark's review comments.

Changes since v4:
  -Moved patching ARM64_HAS_CACHE_DIC inside invalidate_icache_by_line
  -Removed 'dsb ishst' for ARM64_HAS_CACHE_DIC as Mark suggested.

Changes since v3:
  -Added preprocessor guard CONFIG_xxx to code snippets in cache.S
  -Changed barrier attributes from ISH to ISHST.

Changes since v2:
  -Included barriers, DSB/ISB with DIC set, and DSB with IDC set.
  -Single Kconfig option.

Changes since v1:
  -Reworded commit text.
  -Used the alternatives framework as Catalin suggested.
  -Rebased on top of https://patchwork.kernel.org/patch/10227927/

 arch/arm64/include/asm/cache.h      |  4 ++++
 arch/arm64/include/asm/cacheflush.h |  7 +++++--
 arch/arm64/include/asm/cpucaps.h    |  4 +++-
 arch/arm64/kernel/cpufeature.c      | 36 ++++++++++++++++++++++++++++++------
 arch/arm64/mm/cache.S               | 21 ++++++++++++++++++++-
 5 files changed, 62 insertions(+), 10 deletions(-)

diff --git a/arch/arm64/include/asm/cache.h b/arch/arm64/include/asm/cache.h
index ea9bb4e..9bbffc7 100644
--- a/arch/arm64/include/asm/cache.h
+++ b/arch/arm64/include/asm/cache.h
@@ -20,8 +20,12 @@
 
 #define CTR_L1IP_SHIFT		14
 #define CTR_L1IP_MASK		3
+#define CTR_DMINLINE_SHIFT	16
+#define CTR_ERG_SHIFT		20
 #define CTR_CWG_SHIFT		24
 #define CTR_CWG_MASK		15
+#define CTR_IDC_SHIFT		28
+#define CTR_DIC_SHIFT		29
 
 #define CTR_L1IP(ctr)		(((ctr) >> CTR_L1IP_SHIFT) & CTR_L1IP_MASK)
 
diff --git a/arch/arm64/include/asm/cacheflush.h b/arch/arm64/include/asm/cacheflush.h
index bef9f41..d51bde1 100644
--- a/arch/arm64/include/asm/cacheflush.h
+++ b/arch/arm64/include/asm/cacheflush.h
@@ -133,8 +133,11 @@ extern void copy_to_user_page(struct vm_area_struct *, struct page *,
 
 static inline void __flush_icache_all(void)
 {
-	asm("ic	ialluis");
-	dsb(ish);
+	/* Instruction cache invalidation is not required for I/D coherence? */
+	if (!cpus_have_const_cap(ARM64_HAS_CACHE_DIC)) {
+		asm("ic	ialluis");
+		dsb(ish);
+	}
 }
 
 #define flush_dcache_mmap_lock(mapping) \
diff --git a/arch/arm64/include/asm/cpucaps.h b/arch/arm64/include/asm/cpucaps.h
index bb26382..8dd42ae 100644
--- a/arch/arm64/include/asm/cpucaps.h
+++ b/arch/arm64/include/asm/cpucaps.h
@@ -45,7 +45,9 @@
 #define ARM64_HARDEN_BRANCH_PREDICTOR		24
 #define ARM64_HARDEN_BP_POST_GUEST_EXIT		25
 #define ARM64_HAS_RAS_EXTN			26
+#define ARM64_HAS_CACHE_IDC			27
+#define ARM64_HAS_CACHE_DIC			28
 
-#define ARM64_NCAPS				27
+#define ARM64_NCAPS				29
 
 #endif /* __ASM_CPUCAPS_H */
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index 2985a06..9f39e9c 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -199,12 +199,12 @@ static int __init register_cpu_hwcaps_dumper(void)
 };
 
 static const struct arm64_ftr_bits ftr_ctr[] = {
-	ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_EXACT, 31, 1, 1),	/* RES1 */
-	ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, 29, 1, 1),	/* DIC */
-	ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, 28, 1, 1),	/* IDC */
-	ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_HIGHER_SAFE, 24, 4, 0),	/* CWG */
-	ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_HIGHER_SAFE, 20, 4, 0),	/* ERG */
-	ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, 16, 4, 1),	/* DminLine */
+	ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_EXACT, 31, 1, 1), /* RES1 */
+	ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, CTR_DIC_SHIFT, 1, 1),
+	ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, CTR_IDC_SHIFT, 1, 1),
+	ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_HIGHER_SAFE, CTR_CWG_SHIFT, 4, 0),
+	ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_HIGHER_SAFE, CTR_ERG_SHIFT, 4, 0),
+	ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, CTR_DMINLINE_SHIFT, 4, 1),
 	/*
 	 * Linux can handle differing I-cache policies. Userspace JITs will
 	 * make use of *minLine.
@@ -852,6 +852,18 @@ static bool has_no_fpsimd(const struct arm64_cpu_capabilities *entry, int __unus
 					ID_AA64PFR0_FP_SHIFT) < 0;
 }
 
+static bool has_cache_idc(const struct arm64_cpu_capabilities *entry,
+			  int __unused)
+{
+	return read_sanitised_ftr_reg(SYS_CTR_EL0) & BIT(CTR_IDC_SHIFT);
+}
+
+static bool has_cache_dic(const struct arm64_cpu_capabilities *entry,
+			  int __unused)
+{
+	return read_sanitised_ftr_reg(SYS_CTR_EL0) & BIT(CTR_DIC_SHIFT);
+}
+
 #ifdef CONFIG_UNMAP_KERNEL_AT_EL0
 static int __kpti_forced; /* 0: not forced, >0: forced on, <0: forced off */
 
@@ -1088,6 +1100,18 @@ static int cpu_copy_el2regs(void *__unused)
 		.enable = cpu_clear_disr,
 	},
 #endif /* CONFIG_ARM64_RAS_EXTN */
+	{
+		.desc = "Data cache clean to the PoU not required for I/D coherence",
+		.capability = ARM64_HAS_CACHE_IDC,
+		.def_scope = SCOPE_SYSTEM,
+		.matches = has_cache_idc,
+	},
+	{
+		.desc = "Instruction cache invalidation not required for I/D coherence",
+		.capability = ARM64_HAS_CACHE_DIC,
+		.def_scope = SCOPE_SYSTEM,
+		.matches = has_cache_dic,
+	},
 	{},
 };
 
diff --git a/arch/arm64/mm/cache.S b/arch/arm64/mm/cache.S
index 758bde7..303dfcc 100644
--- a/arch/arm64/mm/cache.S
+++ b/arch/arm64/mm/cache.S
@@ -50,6 +50,10 @@ ENTRY(flush_icache_range)
  */
 ENTRY(__flush_cache_user_range)
 	uaccess_ttbr0_enable x2, x3, x4
+alternative_if ARM64_HAS_CACHE_IDC
+	dsb	ishst
+	b	7f
+alternative_else_nop_endif
 	dcache_line_size x2, x3
 	sub	x3, x2, #1
 	bic	x4, x0, x3
@@ -60,8 +64,13 @@ user_alt 9f, "dc cvau, x4", "dc civac, x4", ARM64_WORKAROUND_CLEAN_CACHE
 	b.lo	1b
 	dsb	ish
 
+7:
+alternative_if ARM64_HAS_CACHE_DIC
+	isb
+	b	8f
+alternative_else_nop_endif
 	invalidate_icache_by_line x0, x1, x2, x3, 9f
-	mov	x0, #0
+8:	mov	x0, #0
 1:
 	uaccess_ttbr0_disable x1, x2
 	ret
@@ -80,6 +89,12 @@ ENDPROC(__flush_cache_user_range)
  *	- end     - virtual end address of region
  */
 ENTRY(invalidate_icache_range)
+alternative_if ARM64_HAS_CACHE_DIC
+	mov	x0, xzr
+	isb
+	ret
+alternative_else_nop_endif
+
 	uaccess_ttbr0_enable x2, x3, x4
 
 	invalidate_icache_by_line x0, x1, x2, x3, 2f
@@ -116,6 +131,10 @@ ENDPIPROC(__flush_dcache_area)
  *	- size    - size in question
  */
 ENTRY(__clean_dcache_area_pou)
+alternative_if ARM64_HAS_CACHE_IDC
+	dsb	ishst
+	ret
+alternative_else_nop_endif
 	dcache_by_line_op cvau, ish, x0, x1, x2, x3
 	ret
 ENDPROC(__clean_dcache_area_pou)
-- 
Qualcomm Datacenter Technologies, Inc. on behalf of the Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.