From: Beata Michalska
To: linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org, ionela.voinescu@arm.com, vanshikonda@os.amperecomputing.com
Cc: sudeep.holla@arm.com, will@kernel.org, catalin.marinas@arm.com, vincent.guittot@linaro.org, sumitg@nvidia.com, yang@os.amperecomputing.com, lihuisong@huawei.com
Subject: [PATCH v4 2/4] arm64: Provide an AMU-based version of arch_freq_get_on_cpu
Date: Fri, 5 Apr 2024 14:33:17 +0100
Message-Id: <20240405133319.859813-3-beata.michalska@arm.com>
In-Reply-To: <20240405133319.859813-1-beata.michalska@arm.com>
References: <20240405133319.859813-1-beata.michalska@arm.com>

With the Frequency Invariance Engine (FIE) already wired up with the
sched tick and making use of the relevant AMU counters (core counter and
constant counter), the current frequency for a given CPU can be obtained
from the frequency scale factor, which reflects the average CPU
frequency over the last tick period.

The solution is partially based on the APERF/MPERF implementation of
arch_freq_get_on_cpu.

Suggested-by: Ionela Voinescu
Signed-off-by: Beata Michalska
---
 arch/arm64/kernel/topology.c | 112 +++++++++++++++++++++++++++++++----
 1 file changed, 102 insertions(+), 10 deletions(-)

diff --git a/arch/arm64/kernel/topology.c b/arch/arm64/kernel/topology.c
index 1a2c72f3e7f8..b03fe8617721 100644
--- a/arch/arm64/kernel/topology.c
+++ b/arch/arm64/kernel/topology.c
@@ -17,6 +17,7 @@
 #include
 #include
 #include
+#include
 
 #include
 #include
@@ -88,18 +89,28 @@ int __init parse_acpi_topology(void)
  * initialized.
  */
 static DEFINE_PER_CPU_READ_MOSTLY(unsigned long, arch_max_freq_scale) = 1UL << (2 * SCHED_CAPACITY_SHIFT);
-static DEFINE_PER_CPU(u64, arch_const_cycles_prev);
-static DEFINE_PER_CPU(u64, arch_core_cycles_prev);
 static cpumask_var_t amu_fie_cpus;
 
+struct amu_cntr_sample {
+	u64		arch_const_cycles_prev;
+	u64		arch_core_cycles_prev;
+	unsigned long	last_update;
+};
+
+static DEFINE_PER_CPU_SHARED_ALIGNED(struct amu_cntr_sample, cpu_amu_samples);
+
 void update_freq_counters_refs(void)
 {
-	this_cpu_write(arch_core_cycles_prev, read_corecnt());
-	this_cpu_write(arch_const_cycles_prev, read_constcnt());
+	struct amu_cntr_sample *amu_sample = this_cpu_ptr(&cpu_amu_samples);
+
+	amu_sample->arch_core_cycles_prev = read_corecnt();
+	amu_sample->arch_const_cycles_prev = read_constcnt();
 }
 
 static inline bool freq_counters_valid(int cpu)
 {
+	struct amu_cntr_sample *amu_sample = per_cpu_ptr(&cpu_amu_samples, cpu);
+
 	if ((cpu >= nr_cpu_ids) || !cpumask_test_cpu(cpu, cpu_present_mask))
 		return false;
 
@@ -108,8 +119,8 @@ static inline bool freq_counters_valid(int cpu)
 		return false;
 	}
 
-	if (unlikely(!per_cpu(arch_const_cycles_prev, cpu) ||
-		     !per_cpu(arch_core_cycles_prev, cpu))) {
+	if (unlikely(!amu_sample->arch_const_cycles_prev ||
+		     !amu_sample->arch_core_cycles_prev)) {
 		pr_debug("CPU%d: cycle counters are not enabled.\n", cpu);
 		return false;
 	}
@@ -152,17 +163,22 @@ void freq_inv_set_max_ratio(int cpu, u64 max_rate)
 
 static void amu_scale_freq_tick(void)
 {
+	struct amu_cntr_sample *amu_sample = this_cpu_ptr(&cpu_amu_samples);
 	u64 prev_core_cnt, prev_const_cnt;
 	u64 core_cnt, const_cnt, scale;
 
-	prev_const_cnt = this_cpu_read(arch_const_cycles_prev);
-	prev_core_cnt = this_cpu_read(arch_core_cycles_prev);
+	prev_const_cnt = amu_sample->arch_const_cycles_prev;
+	prev_core_cnt = amu_sample->arch_core_cycles_prev;
 
 	update_freq_counters_refs();
 
-	const_cnt = this_cpu_read(arch_const_cycles_prev);
-	core_cnt = this_cpu_read(arch_core_cycles_prev);
+	const_cnt = amu_sample->arch_const_cycles_prev;
+	core_cnt = amu_sample->arch_core_cycles_prev;
 
+	/*
+	 * This should not happen unless the AMUs have been reset and the
+	 * counter values have not been restored - unlikely
+	 */
 	if (unlikely(core_cnt <= prev_core_cnt ||
 		     const_cnt <= prev_const_cnt))
 		return;
@@ -182,6 +198,8 @@ static void amu_scale_freq_tick(void)
 
 	scale = min_t(unsigned long, scale, SCHED_CAPACITY_SCALE);
 	this_cpu_write(arch_freq_scale, (unsigned long)scale);
+
+	amu_sample->last_update = jiffies;
 }
 
 static struct scale_freq_data amu_sfd = {
@@ -189,6 +207,80 @@ static struct scale_freq_data amu_sfd = {
 	.set_freq_scale = amu_scale_freq_tick,
 };
 
+#define AMU_SAMPLE_EXP_MS	20
+
+unsigned int arch_freq_get_on_cpu(int cpu)
+{
+	struct amu_cntr_sample *amu_sample;
+	cpumask_var_t ref_cpumask = NULL;
+	unsigned long last_update;
+	unsigned int freq;
+	u64 scale;
+
+	if (!cpumask_test_cpu(cpu, amu_fie_cpus) || !arch_scale_freq_ref(cpu))
+		return 0;
+retry:
+	amu_sample = per_cpu_ptr(&cpu_amu_samples, cpu);
+
+	last_update = amu_sample->last_update;
+
+	/*
+	 * For those CPUs that are in full dynticks mode,
+	 * and those that have not seen tick for a while
+	 * try an alternative source for the counters (and thus freq scale),
+	 * if available, for given policy:
+	 * this boils down to identifying an active cpu within the same freq
+	 * domain, if any.
+	 */
+	if (!housekeeping_cpu(cpu, HK_TYPE_TICK) ||
+	    time_is_before_jiffies(last_update + msecs_to_jiffies(AMU_SAMPLE_EXP_MS))) {
+		struct cpufreq_policy *policy = cpufreq_cpu_get(cpu);
+		int ref_cpu = cpu;
+
+		if (!policy_is_shared(policy)) {
+			cpufreq_cpu_put(policy);
+			return 0;
+		}
+
+		if (!ref_cpumask) {
+			if (!zalloc_cpumask_var(&ref_cpumask, GFP_KERNEL)) {
+				cpufreq_cpu_put(policy);
+				return 0;
+			}
+
+			cpumask_copy(ref_cpumask, policy->cpus);
+		}
+
+		cpufreq_cpu_put(policy);
+
+		do {
+			cpumask_clear_cpu(ref_cpu, ref_cpumask);
+			ref_cpu = cpumask_first(ref_cpumask);
+
+		} while (ref_cpu < nr_cpu_ids && idle_cpu(ref_cpu));
+
+		if (ref_cpu >= nr_cpu_ids) {
+			/* No alternative to pull info from */
+			free_cpumask_var(ref_cpumask);
+			return 0;
+		}
+		cpu = ref_cpu;
+		goto retry;
+	}
+	/*
+	 * Reversed computation to the one used to determine
+	 * the arch_freq_scale value
+	 * (see amu_scale_freq_tick for details)
+	 */
+	scale = arch_scale_freq_capacity(cpu);
+	freq = scale * arch_scale_freq_ref(cpu);
+	freq >>= SCHED_CAPACITY_SHIFT;
+
+	free_cpumask_var(ref_cpumask);
+
+	return freq;
+}
+
 static void amu_fie_setup(const struct cpumask *cpus)
 {
 	int cpu;
-- 
2.25.1
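
For illustration only (not part of the patch), the reversed computation at the end of arch_freq_get_on_cpu can be sketched in user space as follows. The helper name scale_to_freq_khz is hypothetical, SCHED_CAPACITY_SHIFT is assumed to be 10 as in the kernel, and the reference frequency is assumed to be expressed in kHz. Unlike the patch, the sketch keeps the product in a 64-bit variable until after the shift.

```c
#include <stdint.h>

/* Assumption: mirrors the kernel's fixed-point shift (1024 == full scale). */
#define SCHED_CAPACITY_SHIFT	10

/*
 * Hypothetical helper: turn a frequency scale factor (0..1024) back into
 * a frequency, given the reference (maximum) frequency in kHz.
 */
static unsigned int scale_to_freq_khz(uint64_t scale, unsigned int ref_khz)
{
	/* Multiply in 64 bits so the intermediate product cannot wrap. */
	uint64_t freq = scale * ref_khz;

	/* Undo the fixed-point scaling. */
	return (unsigned int)(freq >> SCHED_CAPACITY_SHIFT);
}
```

With a 2 GHz reference (2000000 kHz), a scale factor of 512 maps to 1000000 kHz and a scale factor of 1024 maps back to the reference itself.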