Received: by 2002:a05:7412:b995:b0:f9:9502:5bb8 with SMTP id it21csp7596573rdb; Thu, 4 Jan 2024 01:13:02 -0800 (PST) X-Google-Smtp-Source: AGHT+IHdpcsaRI9iVLCYPgoiRorBjHa7njjvft9DJjYnTAsSWmVZpC/m1l/iNLIAUs8M5z8cH8lm X-Received: by 2002:a05:6830:12d6:b0:6dc:421:e1b6 with SMTP id a22-20020a05683012d600b006dc0421e1b6mr335707otq.31.1704359582028; Thu, 04 Jan 2024 01:13:02 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1704359581; cv=none; d=google.com; s=arc-20160816; b=hEOMuZNvRVKb2Sv9GZ+62yZYHIeZVFkTyfkQIckEzzVvMstqLzs7nK8zqj9ssrUsY8 FR24gs+Txes84lBZ04beLTksHgSqhInVA3YyC8l8B0Oy5Hu7LyPbDOOZxYnpmHY9Eq0D mXf+FP2YV2VSpIIQEOnMjxEqN8ARURwB+eZmGc7mP5m1LhZAw409tKZHHCp2Js1ALT1z cECK7uXe3YXpcaRpwKyJvuKMyiI+W1+zCG2RlvrXZJEFyxnzyV9CwdFXbYTrZ/ZNSROr Z6/MZaG8ZgtH47MO54/9Fzk/716xL+Jut1hdDzyRIVFE2XeCznTNgZd7t3p3st8L7nE8 0/og== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=content-transfer-encoding:in-reply-to:from:references:cc:to:subject :user-agent:mime-version:list-unsubscribe:list-subscribe:list-id :precedence:date:message-id; bh=ogdwZdhXH/lW+bnNb8ZNoX0Y2sp8KqpJqEA1aVbom/c=; fh=lawjYCWxekO/BwxfCQO/RVKw+Jyx8M1hN2+FSRmyZeQ=; b=vaKVT9C4q7mczUsMoweDAl9o5z7PWP84k33yW0WaBxq4/gD/hggd9aP4Oe2vhMUni3 gQE+2AAK8HvK8yT2lhFPOw+yX/x9SS/AKIuHANdH/4h3vRBRtHaFMftnr+x513JrtMuX Dq/axTsC5iLYoMW4a8QXY0744id8HGaALxxBi12UZPq4h6V+C1i7rJPVmlYFNoCmcmIx qes8tqzvmmHj1aJq2R0Z2KKKGZWuViaBnEJAaiMO5N88oRAnTb8tmsFyRmGgmvThnXxP gAMAZKdkPN9sJNl6A5sgpyztTLIoJFBsovBWDaVUcNqOhO4kMdYMduGed4D6YbIEcRyJ 7gPQ== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel+bounces-16424-linux.lists.archive=gmail.com@vger.kernel.org designates 2604:1380:40f1:3f00::1 as permitted sender) smtp.mailfrom="linux-kernel+bounces-16424-linux.lists.archive=gmail.com@vger.kernel.org"; dmarc=fail (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=huawei.com Return-Path: Received: from sy.mirrors.kernel.org (sy.mirrors.kernel.org. [2604:1380:40f1:3f00::1]) by mx.google.com with ESMTPS id m17-20020a62f211000000b006d9974f0caesi20051198pfh.54.2024.01.04.01.13.01 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 04 Jan 2024 01:13:01 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel+bounces-16424-linux.lists.archive=gmail.com@vger.kernel.org designates 2604:1380:40f1:3f00::1 as permitted sender) client-ip=2604:1380:40f1:3f00::1; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel+bounces-16424-linux.lists.archive=gmail.com@vger.kernel.org designates 2604:1380:40f1:3f00::1 as permitted sender) smtp.mailfrom="linux-kernel+bounces-16424-linux.lists.archive=gmail.com@vger.kernel.org"; dmarc=fail (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=huawei.com Received: from smtp.subspace.kernel.org (wormhole.subspace.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by sy.mirrors.kernel.org (Postfix) with ESMTPS id 06A65B244B9 for ; Thu, 4 Jan 2024 09:09:38 +0000 (UTC) Received: from localhost.localdomain (localhost.localdomain [127.0.0.1]) by smtp.subspace.kernel.org (Postfix) with ESMTP id E4C82210FD; Thu, 4 Jan 2024 09:07:36 +0000 (UTC) X-Original-To: linux-kernel@vger.kernel.org Received: from szxga05-in.huawei.com (szxga05-in.huawei.com [45.249.212.191]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 66E8E20B1B; Thu, 4 Jan 2024 09:07:31 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=huawei.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=huawei.com Received: from mail.maildlp.com (unknown [172.19.163.17]) by szxga05-in.huawei.com (SkyGuard) with ESMTP id 4T5LKj3M5Rz1g21f; Thu, 4 Jan 2024 17:06:01 +0800 (CST) Received: from kwepemm600004.china.huawei.com (unknown [7.193.23.242]) by mail.maildlp.com (Postfix) with ESMTPS id E09BD1A0172; Thu, 4 Jan 2024 17:07:27 +0800 (CST) Received: from [10.67.121.59] (10.67.121.59) by kwepemm600004.china.huawei.com (7.193.23.242) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.35; Thu, 4 Jan 2024 17:07:27 +0800 Message-ID: <482ffcce-1224-2bf8-cf07-cfc4633897a8@huawei.com> Date: Thu, 4 Jan 2024 17:07:26 +0800 Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101 Thunderbird/91.2.0 Subject: Re: [PATCH] cpufreq: CPPC: Resolve the large frequency discrepancy from cpuinfo_cur_freq To: "Rafael J. Wysocki" CC: , , , "linux-acpi@vger.kernel.org" , , , , , , , , , References: <20231212072617.14756-1-lihuisong@huawei.com> <486f8563-42b7-a049-97a2-bc0b553926aa@huawei.com> <26ba9fa9-7871-27c3-0de5-62f61071dacd@huawei.com> From: "lihuisong (C)" In-Reply-To: Content-Type: text/plain; charset="UTF-8"; format=flowed Content-Transfer-Encoding: 8bit X-ClientProxiedBy: dggems706-chm.china.huawei.com (10.3.19.183) To kwepemm600004.china.huawei.com (7.193.23.242) 在 2024/1/3 18:59, Rafael J. Wysocki 写道: > On Mon, Dec 18, 2023 at 3:15 AM lihuisong (C) wrote: >> >> 在 2023/12/15 10:41, lihuisong (C) 写道: >>> Hi Rafael, >>> >>> Thanks for your review.???? >>> >>> 在 2023/12/15 3:31, Rafael J. Wysocki 写道: >>>> On Tue, Dec 12, 2023 at 8:26 AM Huisong Li wrote: >>>>> Many developers found that the cpu current frequency is greater than >>>>> the maximum frequency of the platform, please see [1], [2] and [3]. >>>>> >>>>> In the scenarios with high memory access pressure, the patch [1] has >>>>> proved the significant latency of cpc_read() which is used to obtain >>>>> delivered and reference performance counter cause an absurd frequency. >>>>> The sampling interval for this counters is very critical and is >>>>> expected >>>>> to be equal. However, the different latency of cpc_read() has a direct >>>>> impact on their sampling interval. >>>>> >>>>> This patch adds a interface, cpc_read_arch_counters_on_cpu, to read >>>>> delivered and reference performance counter together. According to my >>>>> test[4], the discrepancy of cpu current frequency in the scenarios with >>>>> high memory access pressure is lower than 0.2% by stress-ng >>>>> application. >>>>> >>>>> [1] >>>>> https://lore.kernel.org/all/20231025093847.3740104-4-zengheng4@huawei.com/ >>>>> [2] >>>>> https://lore.kernel.org/all/20230328193846.8757-1-yang@os.amperecomputing.com/ >>>>> [3] >>>>> https://lore.kernel.org/all/20230418113459.12860-7-sumitg@nvidia.com/ >>>>> >>>>> [4] My local test: >>>>> The testing platform enable SMT and include 128 logical CPU in total, >>>>> and CPU base frequency is 2.7GHz. Reading "cpuinfo_cur_freq" for each >>>>> physical core on platform during the high memory access pressure from >>>>> stress-ng, and the output is as follows: >>>>> 0: 2699133 2: 2699942 4: 2698189 6: 2704347 >>>>> 8: 2704009 10: 2696277 12: 2702016 14: 2701388 >>>>> 16: 2700358 18: 2696741 20: 2700091 22: 2700122 >>>>> 24: 2701713 26: 2702025 28: 2699816 30: 2700121 >>>>> 32: 2700000 34: 2699788 36: 2698884 38: 2699109 >>>>> 40: 2704494 42: 2698350 44: 2699997 46: 2701023 >>>>> 48: 2703448 50: 2699501 52: 2700000 54: 2699999 >>>>> 56: 2702645 58: 2696923 60: 2697718 62: 2700547 >>>>> 64: 2700313 66: 2700000 68: 2699904 70: 2699259 >>>>> 72: 2699511 74: 2700644 76: 2702201 78: 2700000 >>>>> 80: 2700776 82: 2700364 84: 2702674 86: 2700255 >>>>> 88: 2699886 90: 2700359 92: 2699662 94: 2696188 >>>>> 96: 2705454 98: 2699260 100: 2701097 102: 2699630 >>>>> 104: 2700463 106: 2698408 108: 2697766 110: 2701181 >>>>> 112: 2699166 114: 2701804 116: 2701907 118: 2701973 >>>>> 120: 2699584 122: 2700474 124: 2700768 126: 2701963 >>>>> >>>>> Signed-off-by: Huisong Li >>>> First off, please Cc ACPI-related patches to linux-acpi. >>> got it. >>> >>> +linux-acpi@vger.kernel.org >>> >>>>> --- >>>>> arch/arm64/kernel/topology.c | 43 >>>>> ++++++++++++++++++++++++++++++++++-- >>>>> drivers/acpi/cppc_acpi.c | 22 +++++++++++++++--- >>>>> include/acpi/cppc_acpi.h | 5 +++++ >>>>> 3 files changed, 65 insertions(+), 5 deletions(-) >>>>> >>>>> diff --git a/arch/arm64/kernel/topology.c >>>>> b/arch/arm64/kernel/topology.c >>>>> index 7d37e458e2f5..c3122154d738 100644 >>>>> --- a/arch/arm64/kernel/topology.c >>>>> +++ b/arch/arm64/kernel/topology.c >>>>> @@ -299,6 +299,11 @@ core_initcall(init_amu_fie); >>>>> #ifdef CONFIG_ACPI_CPPC_LIB >>>>> #include >>>>> >>>>> +struct amu_counters { >>>>> + u64 corecnt; >>>>> + u64 constcnt; >>>>> +}; >>>>> + >>>>> static void cpu_read_corecnt(void *val) >>>>> { >>>>> /* >>>>> @@ -322,8 +327,27 @@ static void cpu_read_constcnt(void *val) >>>>> 0UL : read_constcnt(); >>>>> } >>>>> >>>>> +static void cpu_read_amu_counters(void *data) >>>>> +{ >>>>> + struct amu_counters *cnt = (struct amu_counters *)data; >>>>> + >>>>> + /* >>>>> + * The running time of the this_cpu_has_cap() might have a >>>>> couple of >>>>> + * microseconds and is significantly increased to tens of >>>>> microseconds. >>>>> + * But AMU core and constant counter need to be read togeter >>>>> without any >>>>> + * time interval to reduce the calculation discrepancy using >>>>> this counters. >>>>> + */ >>>>> + if (this_cpu_has_cap(ARM64_WORKAROUND_2457168)) { >>>>> + cnt->corecnt = read_corecnt(); >>>> This statement is present in both branches, so can it be moved before >>>> the if ()? >>> Yes. >>> Do you mean adding a blank line before if()? >> Sorry, I misunderstood you. >> The statement "cnt->corecnt = read_corecnt();" cannot be moved before >> the if(). >> The AMU core and constant counter need to be read togeter without any >> time interval as described in code comments. >> The this_cpu_has_cap() is time-consuming. >> That is why I don't use the cpu_read_constcnt() to read constant counter. > So define something like > > static inline void amu_read_counters(struct amu_counters *cnt, bool > read_constcnt) > { > cnt->corecnt = read_corecnt(); > if (read_constcnt) > cnt->constcnt = read_constcnt(); > else > cnt->constcnt = 0; > } > >>>>> + cnt->constcnt = 0; >>>>> + } else { >>>>> + cnt->corecnt = read_corecnt(); >>>>> + cnt->constcnt = read_constcnt(); >>>>> + } > and use it like this: > > amu_read_counters(cnt, !this_cpu_has_cap(ARM64_WORKAROUND_2457168); > > It should work as expected AFAICS. Yes it would work well. The new cpc_read_arch_counters_on_cpu() uses the counters_read_on_cpu() to read core counters on specified cpu. smp_call_function_single() called by counters_read_on_cpu() just support to pass one argument for "smp_call_func_t func". So we cannot define the function as you suggested. what do you think? > >>>>> +} >>>>> + >>>>> static inline >>>>> -int counters_read_on_cpu(int cpu, smp_call_func_t func, u64 *val) >>>>> +int counters_read_on_cpu(int cpu, smp_call_func_t func, void *data) >>>>> { >>>>> /* >>>>> * Abort call on counterless CPU or when interrupts are >>>>> @@ -335,7 +359,7 @@ int counters_read_on_cpu(int cpu, >>>>> smp_call_func_t func, u64 *val) >>>>> if (WARN_ON_ONCE(irqs_disabled())) >>>>> return -EPERM; >>>>> >>>>> - smp_call_function_single(cpu, func, val, 1); >>>>> + smp_call_function_single(cpu, func, data, 1); >>>>> >>>>> return 0; >>>>> } >>>>> @@ -364,6 +388,21 @@ bool cpc_ffh_supported(void) >>>>> return true; >>>>> } >>>>> >>>>> +int cpc_read_arch_counters_on_cpu(int cpu, u64 *delivered, u64 >>>>> *reference) >>>>> +{ >>>>> + struct amu_counters cnts = {0}; >>>>> + int ret; >>>>> + >>>>> + ret = counters_read_on_cpu(cpu, cpu_read_amu_counters, &cnts); >>>>> + if (ret) >>>>> + return ret; >>>>> + >>>>> + *delivered = cnts.corecnt; >>>>> + *reference = cnts.constcnt; >>>>> + >>>>> + return 0; >>>>> +} >>>>> + >>>>> int cpc_read_ffh(int cpu, struct cpc_reg *reg, u64 *val) >>>>> { >>>>> int ret = -EOPNOTSUPP; >>>>> diff --git a/drivers/acpi/cppc_acpi.c b/drivers/acpi/cppc_acpi.c >>>>> index 7ff269a78c20..f303fabd7cfe 100644 >>>>> --- a/drivers/acpi/cppc_acpi.c >>>>> +++ b/drivers/acpi/cppc_acpi.c >>>>> @@ -1299,6 +1299,11 @@ bool cppc_perf_ctrs_in_pcc(void) >>>>> } >>>>> EXPORT_SYMBOL_GPL(cppc_perf_ctrs_in_pcc); >>>>> >>>>> +int __weak cpc_read_arch_counters_on_cpu(int cpu, u64 *delivered, >>>>> u64 *reference) >>>>> +{ >>>>> + return 0; >>>>> +} >>>>> + >>>>> /** >>>>> * cppc_get_perf_ctrs - Read a CPU's performance feedback counters. >>>>> * @cpunum: CPU from which to read counters. >>>>> @@ -1313,7 +1318,8 @@ int cppc_get_perf_ctrs(int cpunum, struct >>>>> cppc_perf_fb_ctrs *perf_fb_ctrs) >>>>> *ref_perf_reg, *ctr_wrap_reg; >>>>> int pcc_ss_id = per_cpu(cpu_pcc_subspace_idx, cpunum); >>>>> struct cppc_pcc_data *pcc_ss_data = NULL; >>>>> - u64 delivered, reference, ref_perf, ctr_wrap_time; >>>>> + u64 delivered = 0, reference = 0; >>>>> + u64 ref_perf, ctr_wrap_time; >>>>> int ret = 0, regs_in_pcc = 0; >>>>> >>>>> if (!cpc_desc) { >>>>> @@ -1350,8 +1356,18 @@ int cppc_get_perf_ctrs(int cpunum, struct >>>>> cppc_perf_fb_ctrs *perf_fb_ctrs) >>>>> } >>>>> } >>>>> >>>>> - cpc_read(cpunum, delivered_reg, &delivered); >>>>> - cpc_read(cpunum, reference_reg, &reference); >>>>> + if (cpc_ffh_supported()) { >>>>> + ret = cpc_read_arch_counters_on_cpu(cpunum, >>>>> &delivered, &reference); >>>>> + if (ret) { >>>>> + pr_debug("read arch counters failed, >>>>> ret=%d.\n", ret); >>>>> + ret = 0; >>>>> + } >>>>> + } >>>> The above is surely not applicable to every platform using CPPC. Also >>> cpc_ffh_supported is aimed to control only the platform supported FFH >>> to enter. >>> cpc_read_arch_counters_on_cpu is also needed to implemented by each >>> platform according to their require. > Well, exactly. > >>> Here just implement this interface for arm64. > So on x86 cpc_ffh_supported() returns true and > cpc_read_arch_counters_on_cpu() will do nothing, so it will always > fall back to using cpc_read(). That is not particularly > straightforward IMV. Understand you. But we have to do like this to be compatible with all platforms. I am thinking twice about cpc_ffh_supported(). And I found that calling cpc_ffh_supported() here is redundant for ARM. Because counters_read_on_cpu() has verified if this CPU support AMU counter. What do you say we remove cpc_ffh_supported() here and directly call this new architecture interface? > >>>> it looks like in the ARM64_WORKAROUND_2457168 enabled case it is just >>>> pointless overhead, because "reference" is always going to be 0 here >>>> then. >>> Right, it is always going to be 0 here for the >>> ARM64_WORKAROUND_2457168 enabled case . >>> But ARM64_WORKAROUND_2457168 is a macro releated to ARM. >>> It seems that it is not appropriate for this macro to appear this >>> common place for all platform, right? >>> >>>> Please clean that up. >>>> >>>>> + if (!delivered || !reference) { >>>>> + cpc_read(cpunum, delivered_reg, &delivered); >>>>> + cpc_read(cpunum, reference_reg, &reference); >>>>> + } >>>>> + >>>>> cpc_read(cpunum, ref_perf_reg, &ref_perf); >>>>> >>>>> /* >>>>> diff --git a/include/acpi/cppc_acpi.h b/include/acpi/cppc_acpi.h >>>>> index 6126c977ece0..07d4fd82d499 100644 >>>>> --- a/include/acpi/cppc_acpi.h >>>>> +++ b/include/acpi/cppc_acpi.h >>>>> @@ -152,6 +152,7 @@ extern bool cpc_ffh_supported(void); >>>>> extern bool cpc_supported_by_cpu(void); >>>>> extern int cpc_read_ffh(int cpunum, struct cpc_reg *reg, u64 *val); >>>>> extern int cpc_write_ffh(int cpunum, struct cpc_reg *reg, u64 val); >>>>> +extern int cpc_read_arch_counters_on_cpu(int cpu, u64 *delivered, >>>>> u64 *reference); >>>>> extern int cppc_get_epp_perf(int cpunum, u64 *epp_perf); >>>>> extern int cppc_set_epp_perf(int cpu, struct cppc_perf_ctrls >>>>> *perf_ctrls, bool enable); >>>>> extern int cppc_get_auto_sel_caps(int cpunum, struct >>>>> cppc_perf_caps *perf_caps); >>>>> @@ -209,6 +210,10 @@ static inline int cpc_write_ffh(int cpunum, >>>>> struct cpc_reg *reg, u64 val) >>>>> { >>>>> return -ENOTSUPP; >>>>> } >>>>> +static inline int cpc_read_arch_counters_on_cpu(int cpu, u64 >>>>> *delivered, u64 *reference) >>>>> +{ >>>>> + return -EOPNOTSUPP; >>>>> +} >>>>> static inline int cppc_set_epp_perf(int cpu, struct >>>>> cppc_perf_ctrls *perf_ctrls, bool enable) >>>>> { >>>>> return -ENOTSUPP; >>>>> -- >>>> . >>> _______________________________________________ >>> linux-arm-kernel mailing list >>> linux-arm-kernel@lists.infradead.org >>> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel > .