Date: Thu, 14 Mar 2024 00:46:19 +0100
From: Beata Michalska
To: Ionela Voinescu
Cc: linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
    vanshikonda@os.amperecomputing.com, sudeep.holla@arm.com, will@kernel.org,
    catalin.marinas@arm.com, vincent.guittot@linaro.org, sumitg@nvidia.com,
    yang@os.amperecomputing.com, lihuisong@huawei.com
Subject: Re: [PATCH v3 2/3] arm64: Provide an AMU-based version of arch_freq_get_on_cpu
References: <20240312083431.3239989-1-beata.michalska@arm.com>
 <20240312083431.3239989-3-beata.michalska@arm.com>
On Wed, Mar 13, 2024 at 12:20:16PM +0000, Ionela Voinescu wrote:
> Hi Beata,
>
> Thank you for the patches!
>
High time for those!

> On Tuesday 12 Mar 2024 at 08:34:30 (+0000), Beata Michalska wrote:
> > With the Frequency Invariance Engine (FIE) being already wired up with
> > sched tick and making use of relevant (core counter and constant
> > counter) AMU counters, getting the current frequency for a given CPU
> > on supported platforms can be achieved by utilizing the frequency scale
> > factor which reflects an average CPU frequency for the last tick period
> > length.
> >
> > The solution is partially based on APERF/MPERF implementation of
> > arch_freq_get_on_cpu.
> >
> > Suggested-by: Ionela Voinescu
> > Signed-off-by: Beata Michalska
> > ---
> >  arch/arm64/kernel/topology.c | 103 +++++++++++++++++++++++++++++++----
> >  1 file changed, 92 insertions(+), 11 deletions(-)
> >
> > diff --git a/arch/arm64/kernel/topology.c b/arch/arm64/kernel/topology.c
> > index 1a2c72f3e7f8..42cb19c31719 100644
> > --- a/arch/arm64/kernel/topology.c
> > +++ b/arch/arm64/kernel/topology.c
> > @@ -17,6 +17,8 @@
> >  #include
> >  #include
> >  #include
> > +#include
> > +#include
> >
> >  #include
> >  #include
> > @@ -88,18 +90,31 @@ int __init parse_acpi_topology(void)
> >   * initialized.
> >   */
> >  static DEFINE_PER_CPU_READ_MOSTLY(unsigned long, arch_max_freq_scale) = 1UL << (2 * SCHED_CAPACITY_SHIFT);
> > -static DEFINE_PER_CPU(u64, arch_const_cycles_prev);
> > -static DEFINE_PER_CPU(u64, arch_core_cycles_prev);
> >  static cpumask_var_t amu_fie_cpus;
> >
> > +struct amu_cntr_sample {
> > +	u64		arch_const_cycles_prev;
> > +	u64		arch_core_cycles_prev;
> > +	unsigned long	last_update;
> > +	seqcount_t	seq;
> > +};
> > +
> > +static DEFINE_PER_CPU_SHARED_ALIGNED(struct amu_cntr_sample, cpu_amu_samples) = {
> > +	.seq = SEQCNT_ZERO(cpu_amu_samples.seq)
> > +};
> > +
> >  void update_freq_counters_refs(void)
> >  {
> > -	this_cpu_write(arch_core_cycles_prev, read_corecnt());
> > -	this_cpu_write(arch_const_cycles_prev, read_constcnt());
> > +	struct amu_cntr_sample *amu_sample = this_cpu_ptr(&cpu_amu_samples);
> > +
> > +	amu_sample->arch_core_cycles_prev = read_corecnt();
> > +	amu_sample->arch_const_cycles_prev = read_constcnt();
> >  }
> >
> >  static inline bool freq_counters_valid(int cpu)
> >  {
> > +	struct amu_cntr_sample *amu_sample = per_cpu_ptr(&cpu_amu_samples, cpu);
> > +
> >  	if ((cpu >= nr_cpu_ids) || !cpumask_test_cpu(cpu, cpu_present_mask))
> >  		return false;
> >
> > @@ -108,8 +123,8 @@ static inline bool freq_counters_valid(int cpu)
> >  		return false;
> >  	}
> >
> > -	if (unlikely(!per_cpu(arch_const_cycles_prev, cpu) ||
> > -		     !per_cpu(arch_core_cycles_prev, cpu))) {
> > +	if (unlikely(!amu_sample->arch_const_cycles_prev ||
> > +		     !amu_sample->arch_core_cycles_prev)) {
> >  		pr_debug("CPU%d: cycle counters are not enabled.\n", cpu);
> >  		return false;
> >  	}
> > @@ -152,20 +167,27 @@ void freq_inv_set_max_ratio(int cpu, u64 max_rate)
> >
> >  static void amu_scale_freq_tick(void)
> >  {
> > +	struct amu_cntr_sample *amu_sample = this_cpu_ptr(&cpu_amu_samples);
> >  	u64 prev_core_cnt, prev_const_cnt;
> >  	u64 core_cnt, const_cnt, scale;
> >
> > -	prev_const_cnt = this_cpu_read(arch_const_cycles_prev);
> > -	prev_core_cnt = this_cpu_read(arch_core_cycles_prev);
> > +	prev_const_cnt = amu_sample->arch_const_cycles_prev;
> > +	prev_core_cnt = amu_sample->arch_core_cycles_prev;
> > +
> > +	write_seqcount_begin(&amu_sample->seq);
>
> The critical section here does not need to be this extensive, right?
>
> The arch_freq_get_on_cpu() function only uses the frequency scale factor
> and the last_update value, so this need only be placed above
> "this_cpu_write(arch_freq_scale,..", if I'm not missing anything.

You're not missing anything. The write-side critical section could span only
those two, but having it extended gives the readers a chance to get in on the
update, and as those are not really performance sensitive I thought it might
be a good option, especially if we can save the cycles of not needing to poke
the cpufreq driver. Furthermore, if the critical section is to span only those
two, then it does not really change much and can be dropped.
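Just to illustrate (a rough sketch only, not what v3 does): narrowing the write
side would boil down to publishing only the two values arch_freq_get_on_cpu()
consumes, with the counter sampling and the scale computation left where they
are in the tick handler:

	/* tail of amu_scale_freq_tick(); counters and scale computed as before */
	scale = min_t(unsigned long, scale, SCHED_CAPACITY_SCALE);

	/* publish only what arch_freq_get_on_cpu() reads */
	write_seqcount_begin(&amu_sample->seq);
	this_cpu_write(arch_freq_scale, (unsigned long)scale);
	amu_sample->last_update = jiffies;
	write_seqcount_end(&amu_sample->seq);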
> >
> >  	update_freq_counters_refs();
> >
> > -	const_cnt = this_cpu_read(arch_const_cycles_prev);
> > -	core_cnt = this_cpu_read(arch_core_cycles_prev);
> > +	const_cnt = amu_sample->arch_const_cycles_prev;
> > +	core_cnt = amu_sample->arch_core_cycles_prev;
> >
> > +	/*
> > +	 * This should not happen unless the AMUs have been reset and the
> > +	 * counter values have not been restored - unlikely
> > +	 */
> >  	if (unlikely(core_cnt <= prev_core_cnt ||
> >  		     const_cnt <= prev_const_cnt))
> > -		return;
> > +		goto leave;
> >
> >  	/*
> >  	 * /\core    arch_max_freq_scale
> > @@ -182,6 +204,10 @@ static void amu_scale_freq_tick(void)
> >
> >  	scale = min_t(unsigned long, scale, SCHED_CAPACITY_SCALE);
> >  	this_cpu_write(arch_freq_scale, (unsigned long)scale);
> > +
> > +	amu_sample->last_update = jiffies;
> > +leave:
> > +	write_seqcount_end(&amu_sample->seq);
> >  }
> >
> >  static struct scale_freq_data amu_sfd = {
> > @@ -189,6 +215,61 @@ static struct scale_freq_data amu_sfd = {
> >  	.set_freq_scale = amu_scale_freq_tick,
> >  };
> >
> > +#define AMU_SAMPLE_EXP_MS	20
> > +
> > +unsigned int arch_freq_get_on_cpu(int cpu)
> > +{
> > +	struct amu_cntr_sample *amu_sample;
> > +	unsigned long last_update;
> > +	unsigned int seq;
> > +	unsigned int freq;
> > +	u64 scale;
> > +
> > +	if (!cpumask_test_cpu(cpu, amu_fie_cpus) || !arch_scale_freq_ref(cpu))
> > +		return 0;
> > +
> > +retry:
> > +	amu_sample = per_cpu_ptr(&cpu_amu_samples, cpu);
> > +
> > +	do {
> > +		seq = raw_read_seqcount_begin(&amu_sample->seq);
> > +		last_update = amu_sample->last_update;
> > +	} while (read_seqcount_retry(&amu_sample->seq, seq));
>
> Related to the point above, this retry loop should also contain
> "scale = arch_scale_freq_capacity(cpu)", otherwise there's not much point
> for synchronisation, as far as I can tell.

I'm not entirely sure why we would need to include the scale factor within the
read critical section. The aim here is to make sure we see the update if one
is ongoing, and that the update to the timestamp is observed along with the
one to the scale factor, which is what write_seqcount_end will guarantee
(although the latter is not a hard sell, as the update happens with interrupts
disabled). If we later fetch a newer scale factor, that's perfectly fine; we
do not want to see a stale one. Again, I can drop the seqcount (which is
slightly abused in this case, I must admit) at the cost of potentially missing
some updates.
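For completeness, the variant you are describing would look roughly like the
sketch below, pairing last_update with the scale factor it was published with
(the later freq computation would then reuse that scale instead of re-reading
it):

	do {
		seq = raw_read_seqcount_begin(&amu_sample->seq);
		last_update = amu_sample->last_update;
		scale = arch_scale_freq_capacity(cpu);
	} while (read_seqcount_retry(&amu_sample->seq, seq));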
>
> For x86, arch_freq_get_on_cpu() uses the counter deltas and it would be
> bad if values from different ticks were used. But here the only benefit
> of synchronisation is to make sure that we're using the scale factor
> computed at the last update time. For us, even skipping the
> synchronisation logic would still be acceptable, as we'd be ensuring that
> there was a tick in the past 20ms and we'd always use the most recent
> value of the frequency scale factor.

How would we ensure there was a tick in the last 20ms?

>
> Hope it helps,
It does, thank you.

--
BR
Beata

> Ionela.
>
> > +
> > +	/*
> > +	 * For those CPUs that are in full dynticks mode,
> > +	 * and those that have not seen tick for a while
> > +	 * try an alternative source for the counters (and thus freq scale),
> > +	 * if available for given policy
> > +	 */
> > +	if (time_is_before_jiffies(last_update + msecs_to_jiffies(AMU_SAMPLE_EXP_MS))) {
> > +		struct cpufreq_policy *policy = cpufreq_cpu_get(cpu);
> > +		int ref_cpu = nr_cpu_ids;
> > +
> > +		if (cpumask_intersects(housekeeping_cpumask(HK_TYPE_TICK),
> > +				       policy->cpus))
> > +			ref_cpu = cpumask_nth_and(cpu, policy->cpus,
> > +						  housekeeping_cpumask(HK_TYPE_TICK));
> > +
> > +		cpufreq_cpu_put(policy);
> > +		if (ref_cpu >= nr_cpu_ids || ref_cpu == cpu)
> > +			/* No alternative to pull info from */
> > +			return 0;
> > +		cpu = ref_cpu;
> > +		goto retry;
> > +	}
> > +	/*
> > +	 * Reversed computation to the one used to determine
> > +	 * the arch_freq_scale value
> > +	 * (see amu_scale_freq_tick for details)
> > +	 */
> > +	scale = arch_scale_freq_capacity(cpu);
> > +	freq = scale * arch_scale_freq_ref(cpu);
> > +	freq >>= SCHED_CAPACITY_SHIFT;
> > +
> > +	return freq;
> > +}
> > +
> >  static void amu_fie_setup(const struct cpumask *cpus)
> >  {
> >  	int cpu;
> > --
> > 2.25.1
> >
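P.S. In case the "reversed computation" part above is not obvious at first
glance: with SCHED_CAPACITY_SHIFT being 10 and purely made-up numbers, a CPU
whose arch_scale_freq_ref() is 2000000 (kHz) and whose last tick produced an
arch_freq_scale of 512 would report

	freq = (512 * 2000000) >> 10;	/* ~1000000 kHz, i.e. half the reference */

which is just the tick-time scale factor applied back to the reference
frequency.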