From: guoren@kernel.org
To: julien.grall@arm.com, arnd@arndb.de, linux-kernel@vger.kernel.org
Cc: linux-csky@vger.kernel.org, Guo Ren, Catalin Marinas
Subject: [PATCH] arm64: asid: Optimize cache_flush for SMT
Date: Mon, 24 Jun 2019 00:04:29 +0800
Message-Id: <1561305869-18872-1-git-send-email-guoren@kernel.org>
X-Mailer: git-send-email 2.7.4
X-Mailing-List: linux-kernel@vger.kernel.org

From: Guo Ren

On an SMT+SMP system, the hardware threads of one core may share the
same TLB. Assume the hardware threads are numbered like this:

 | 0 1 2 3 | 4 5 6 7 | 8 9 a b | c d e f |
    core1     core2     core3     core4

The current algorithm is correct for SMT+SMP, but it issues duplicate
local_tlb_flush calls, because a local_tlb_flush on one hardware thread
also flushes the entries of the other hardware threads that share the
same core TLB.

So once one hardware thread has flushed, we can clear the flush_pending
bits of all hardware threads on that core in the bitmap, which avoids
the redundant local_tlb_flush calls for SMT.

C-SKY cores don't support SMT, so this patch brings no benefit for
C-SKY.

Signed-off-by: Guo Ren
Cc: Catalin Marinas
Cc: Julien Grall
---
 arch/csky/include/asm/asid.h |  4 ++++
 arch/csky/mm/asid.c          | 11 ++++++++++-
 arch/csky/mm/context.c       |  2 +-
 3 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/arch/csky/include/asm/asid.h b/arch/csky/include/asm/asid.h
index ac08b0f..f654492 100644
--- a/arch/csky/include/asm/asid.h
+++ b/arch/csky/include/asm/asid.h
@@ -23,6 +23,9 @@ struct asid_info
 	unsigned int		ctxt_shift;
 	/* Callback to locally flush the context. */
 	void			(*flush_cpu_ctxt_cb)(void);
+	/* To reduce duplicate tlb_flush for SMT */
+	unsigned int		harts_per_core;
+	unsigned int		harts_per_core_mask;
 };
 
 #define NUM_ASIDS(info)			(1UL << ((info)->bits))
@@ -73,6 +76,7 @@ static inline void asid_check_context(struct asid_info *info,
 
 int asid_allocator_init(struct asid_info *info,
 			u32 bits, unsigned int asid_per_ctxt,
+			unsigned int harts_per_core,
 			void (*flush_cpu_ctxt_cb)(void));
 
 #endif
diff --git a/arch/csky/mm/asid.c b/arch/csky/mm/asid.c
index b2e9147..50a983e 100644
--- a/arch/csky/mm/asid.c
+++ b/arch/csky/mm/asid.c
@@ -148,8 +148,13 @@ void asid_new_context(struct asid_info *info, atomic64_t *pasid,
 		atomic64_set(pasid, asid);
 	}
 
-	if (cpumask_test_and_clear_cpu(cpu, &info->flush_pending))
+	if (cpumask_test_cpu(cpu, &info->flush_pending)) {
+		unsigned int i;
+		unsigned int harts_base = cpu & info->harts_per_core_mask;
 		info->flush_cpu_ctxt_cb();
+		for (i = 0; i < info->harts_per_core; i++)
+			cpumask_clear_cpu(harts_base + i, &info->flush_pending);
+	}
 
 	atomic64_set(&active_asid(info, cpu), asid);
 	cpumask_set_cpu(cpu, mm_cpumask(mm));
@@ -162,15 +167,19 @@ void asid_new_context(struct asid_info *info, atomic64_t *pasid,
  * @info: Pointer to the asid allocator structure
  * @bits: Number of ASIDs available
  * @asid_per_ctxt: Number of ASIDs to allocate per-context. ASIDs are
  * allocated contiguously for a given context. This value should be a power of
  * 2.
+ * @harts_per_core: Number of hardware threads per core; must be a power of 2.
  */
 int asid_allocator_init(struct asid_info *info,
 			u32 bits, unsigned int asid_per_ctxt,
+			unsigned int harts_per_core,
 			void (*flush_cpu_ctxt_cb)(void))
 {
 	info->bits = bits;
 	info->ctxt_shift = ilog2(asid_per_ctxt);
+	info->harts_per_core = harts_per_core;
+	info->harts_per_core_mask = ~((1 << ilog2(harts_per_core)) - 1);
 	info->flush_cpu_ctxt_cb = flush_cpu_ctxt_cb;
 	/*
 	 * Expect allocation after rollover to fail if we don't have at least
diff --git a/arch/csky/mm/context.c b/arch/csky/mm/context.c
index 0d95bdd..b58523b 100644
--- a/arch/csky/mm/context.c
+++ b/arch/csky/mm/context.c
@@ -30,7 +30,7 @@ static int asids_init(void)
 {
 	BUG_ON(((1 << CONFIG_CPU_ASID_BITS) - 1) <= num_possible_cpus());
 
-	if (asid_allocator_init(&asid_info, CONFIG_CPU_ASID_BITS, 1,
+	if (asid_allocator_init(&asid_info, CONFIG_CPU_ASID_BITS, 1, 1,
 				asid_flush_cpu_ctxt))
 		panic("Unable to initialize ASID allocator for %lu ASIDs\n",
 		      NUM_ASIDS(&asid_info));
-- 
2.7.4
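
For illustration only, here is a minimal userspace sketch of the idea in this patch: round the hardware thread number down to the first thread of its core, then clear the pending bit of every thread on that core after a single flush. It is not kernel code. The names hart_mask_t, core_base_mask() and take_pending_flush() are hypothetical, a plain 64-bit word stands in for the kernel cpumask, and it assumes harts_per_core is a power of two and that all thread numbers are below 64.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a cpumask: one bit per hardware thread (< 64). */
typedef uint64_t hart_mask_t;

/*
 * Mask that rounds a hardware thread number down to the first thread of
 * its core. Assumes harts_per_core is a non-zero power of two.
 */
unsigned int core_base_mask(unsigned int harts_per_core)
{
	assert(harts_per_core != 0 &&
	       (harts_per_core & (harts_per_core - 1)) == 0);
	return ~(harts_per_core - 1);
}

/*
 * If 'cpu' has a flush pending, clear the pending bit of every hardware
 * thread on the same core and return 1 (the caller should flush once).
 * Return 0 if a sibling thread already flushed on behalf of 'cpu'.
 */
int take_pending_flush(hart_mask_t *pending, unsigned int cpu,
		       unsigned int harts_per_core)
{
	unsigned int base = cpu & core_base_mask(harts_per_core);
	unsigned int i;

	if (!(*pending & ((hart_mask_t)1 << cpu)))
		return 0;

	for (i = 0; i < harts_per_core; i++)
		*pending &= ~((hart_mask_t)1 << (base + i));

	return 1;
}
```

With the 16-thread numbering from the commit message and every bit pending, a call for thread 5 with harts_per_core = 4 clears bits 4 to 7, so a later call for thread 6 returns 0 and no second flush is needed on that core.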