From: Charlie Jenkins
Date: Wed, 27 Dec 2023 09:38:01 -0800
Subject: [PATCH v14 2/5] riscv: Add static key for misaligned accesses
X-Mailing-List: linux-kernel@vger.kernel.org
Message-Id: <20231227-optimize_checksum-v14-2-ddfd48016566@rivosinc.com>
References: <20231227-optimize_checksum-v14-0-ddfd48016566@rivosinc.com>
In-Reply-To: <20231227-optimize_checksum-v14-0-ddfd48016566@rivosinc.com>
To: Charlie Jenkins, Palmer Dabbelt, Conor Dooley, Samuel Holland,
    David Laight, Xiao Wang, Evan Green, Guo Ren,
    linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org,
    linux-arch@vger.kernel.org
Cc: Paul Walmsley, Albert Ou, Arnd Bergmann
X-Mailer: b4 0.12.3

Support static branches depending on the value of misaligned accesses.
This will be used by a later patch in the series. All online cpus must
be considered "fast" for this static branch to be flipped.

Signed-off-by: Charlie Jenkins
---
 arch/riscv/include/asm/cpufeature.h |  2 +
 arch/riscv/kernel/cpufeature.c      | 89 +++++++++++++++++++++++++++++++++++--
 2 files changed, 87 insertions(+), 4 deletions(-)

diff --git a/arch/riscv/include/asm/cpufeature.h b/arch/riscv/include/asm/cpufeature.h
index a418c3112cd6..7b129e5e2f07 100644
--- a/arch/riscv/include/asm/cpufeature.h
+++ b/arch/riscv/include/asm/cpufeature.h
@@ -133,4 +133,6 @@ static __always_inline bool riscv_cpu_has_extension_unlikely(int cpu, const unsi
 	return __riscv_isa_extension_available(hart_isa[cpu].isa, ext);
 }
 
+DECLARE_STATIC_KEY_FALSE(fast_misaligned_access_speed_key);
+
 #endif
diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c
index b3785ffc1570..dfd716b93565 100644
--- a/arch/riscv/kernel/cpufeature.c
+++ b/arch/riscv/kernel/cpufeature.c
@@ -8,8 +8,10 @@
 
 #include
 #include
+#include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -44,6 +46,8 @@ struct riscv_isainfo hart_isa[NR_CPUS];
 /* Performance information */
 DEFINE_PER_CPU(long, misaligned_access_speed);
 
+static cpumask_t fast_misaligned_access;
+
 /**
  * riscv_isa_extension_base() - Get base extension word
  *
@@ -643,6 +647,16 @@ static int check_unaligned_access(void *param)
 		(speed == RISCV_HWPROBE_MISALIGNED_FAST) ? "fast" : "slow");
 
 	per_cpu(misaligned_access_speed, cpu) = speed;
+
+	/*
+	 * Set the value of fast_misaligned_access of a CPU. These operations
+	 * are atomic to avoid race conditions.
+	 */
+	if (speed == RISCV_HWPROBE_MISALIGNED_FAST)
+		cpumask_set_cpu(cpu, &fast_misaligned_access);
+	else
+		cpumask_clear_cpu(cpu, &fast_misaligned_access);
+
 	return 0;
 }
 
@@ -655,13 +669,70 @@ static void check_unaligned_access_nonboot_cpu(void *param)
 		check_unaligned_access(pages[cpu]);
 }
 
+DEFINE_STATIC_KEY_FALSE(fast_misaligned_access_speed_key);
+
+static int exclude_set_unaligned_access_static_branches(int cpu)
+{
+	/*
+	 * Same as set_unaligned_access_static_branches, except excludes the
+	 * given CPU from the result. When a CPU is hotplugged into an offline
+	 * state, this function is called before the CPU is set to offline in
+	 * the cpumask, and thus the CPU needs to be explicitly excluded.
+	 */
+
+	cpumask_t online_fast_misaligned_access;
+
+	cpumask_and(&online_fast_misaligned_access, &fast_misaligned_access, cpu_online_mask);
+	cpumask_clear_cpu(cpu, &online_fast_misaligned_access);
+
+	if (cpumask_weight(&online_fast_misaligned_access) == (num_online_cpus() - 1))
+		static_branch_enable_cpuslocked(&fast_misaligned_access_speed_key);
+	else
+		static_branch_disable_cpuslocked(&fast_misaligned_access_speed_key);
+
+	return 0;
+}
+
+static int set_unaligned_access_static_branches(void)
+{
+	/*
+	 * This will be called after check_unaligned_access_all_cpus so the
+	 * result of unaligned access speed for all CPUs will be available.
+	 *
+	 * To avoid the number of online cpus changing between reading
+	 * cpu_online_mask and calling num_online_cpus, cpus_read_lock must be
+	 * held before calling this function.
+	 */
+	cpumask_t online_fast_misaligned_access;
+
+	cpumask_and(&online_fast_misaligned_access, &fast_misaligned_access, cpu_online_mask);
+
+	if (cpumask_weight(&online_fast_misaligned_access) == num_online_cpus())
+		static_branch_enable_cpuslocked(&fast_misaligned_access_speed_key);
+	else
+		static_branch_disable_cpuslocked(&fast_misaligned_access_speed_key);
+
+	return 0;
+}
+
+static int lock_and_set_unaligned_access_static_branch(void)
+{
+	cpus_read_lock();
+	set_unaligned_access_static_branches();
+	cpus_read_unlock();
+
+	return 0;
+}
+
+arch_initcall_sync(lock_and_set_unaligned_access_static_branch);
+
 static int riscv_online_cpu(unsigned int cpu)
 {
 	static struct page *buf;
 
 	/* We are already set since the last check */
 	if (per_cpu(misaligned_access_speed, cpu) != RISCV_HWPROBE_MISALIGNED_UNKNOWN)
-		return 0;
+		goto exit;
 
 	buf = alloc_pages(GFP_KERNEL, MISALIGNED_BUFFER_ORDER);
 	if (!buf) {
@@ -671,7 +742,14 @@ static int riscv_online_cpu(unsigned int cpu)
 
 	check_unaligned_access(buf);
 	__free_pages(buf, MISALIGNED_BUFFER_ORDER);
-	return 0;
+
+exit:
+	return set_unaligned_access_static_branches();
+}
+
+static int riscv_offline_cpu(unsigned int cpu)
+{
+	return exclude_set_unaligned_access_static_branches(cpu);
 }
 
 /* Measure unaligned access on all CPUs present at boot in parallel. */
@@ -705,9 +783,12 @@ static int check_unaligned_access_all_cpus(void)
 	/* Check core 0. */
 	smp_call_on_cpu(0, check_unaligned_access, bufs[0], true);
 
-	/* Setup hotplug callback for any new CPUs that come online. */
+	/*
+	 * Setup hotplug callbacks for any new CPUs that come online or go
+	 * offline.
+	 */
 	cpuhp_setup_state_nocalls(CPUHP_AP_ONLINE_DYN, "riscv:online",
-				  riscv_online_cpu, NULL);
+				  riscv_online_cpu, riscv_offline_cpu);
 
 out:
 	unaligned_emulation_finish();
-- 
2.43.0
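
For readers who want to check the enable rule outside the kernel, the decision made by set_unaligned_access_static_branches() and exclude_set_unaligned_access_static_branches() can be modeled in plain userspace C. This is only a sketch under stated assumptions: model_cpumask and every model_* function are hypothetical names that are not part of the patch, a 64-bit integer stands in for cpumask_t, and the returned bool stands in for the state of fast_misaligned_access_speed_key. No locking or hotplug behaviour is modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for cpumask_t: one bit per CPU, at most 64 CPUs. */
typedef uint64_t model_cpumask;

/* Count the set bits, playing the role of cpumask_weight(). */
static int model_weight(model_cpumask mask)
{
	int n = 0;

	while (mask) {
		mask &= mask - 1;	/* clear the lowest set bit */
		n++;
	}
	return n;
}

/* Record a probe result, as check_unaligned_access() does in the patch. */
static model_cpumask model_set_speed(model_cpumask fast, unsigned int cpu,
				     bool is_fast)
{
	assert(cpu < 64);
	if (is_fast)
		return fast | (UINT64_C(1) << cpu);
	return fast & ~(UINT64_C(1) << cpu);
}

/*
 * Online path: the key is enabled only when every online CPU is also
 * marked fast. CPUs that are fast but offline do not matter.
 */
static bool model_key_enabled(model_cpumask fast, model_cpumask online)
{
	return model_weight(fast & online) == model_weight(online);
}

/*
 * Offline path: the departing CPU is still set in the online mask, so it
 * is cleared from the fast-and-online set and the expected count is
 * reduced by one.
 */
static bool model_key_enabled_excluding(model_cpumask fast,
					model_cpumask online,
					unsigned int cpu)
{
	model_cpumask online_fast = fast & online;

	assert(cpu < 64);
	online_fast &= ~(UINT64_C(1) << cpu);
	return model_weight(online_fast) == model_weight(online) - 1;
}
```

With four CPUs online (mask 0xF) and only CPUs 0-2 marked fast (mask 0x7), model_key_enabled() returns false. Taking the slow CPU 3 offline makes model_key_enabled_excluding() return true, while taking fast CPU 0 offline leaves it false because slow CPU 3 is still online.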